While working on an MVP for one of our projects (a React Native-based app), we were asked to add a "Reconnect" modal dialog with a "Retry" button that would be shown when a request times out, preventing the user from interacting with an empty UI. The major issue with our initial implementation was that the button simply reloaded the critical content of the app (some index pages), with no regard for the specific request that had failed or for the current state of the app.
After the initial release, we decided to improve this behavior and provide a mechanism to retry only the requests that failed because of a timeout. The idea was to recover the state of the app transparently on a successful reconnection, without altering the logic of the services that use our custom HTTP client. The first limitation we found was that we were bound to the Fetch API, which at the time of writing does not, among other things, support failing on "hung" connections, i.e., connections that never finish. So we wrapped the original promise returned by fetch in a second one that gives us control over how and when to resolve it.
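The wrapping idea can be sketched roughly as follows. This is a minimal illustration, not our production client: the names `fetchWithTimeout` and `TimeoutError`, the 10-second default, and the injectable `fetchImpl` parameter (there only so the sketch can run without a network) are all assumptions made for the example.

```javascript
// Hypothetical error type used to tell timeouts apart from other failures.
class TimeoutError extends Error {
  constructor(message) {
    super(message);
    this.name = 'TimeoutError';
  }
}

// Wraps a fetch-like call in a second promise that we settle ourselves.
// `fetchImpl` defaults to the global fetch; it is a parameter only for testing.
function fetchWithTimeout(url, options = {}, timeoutMs = 10000, fetchImpl = fetch) {
  return new Promise((resolve, reject) => {
    let settled = false;

    // If nothing arrives within `timeoutMs`, give up waiting on the request.
    const timer = setTimeout(() => {
      if (settled) return;
      settled = true;
      reject(new TimeoutError(`Request to ${url} timed out after ${timeoutMs} ms`));
    }, timeoutMs);

    fetchImpl(url, options).then(
      (response) => {
        if (settled) return; // the timeout already fired; ignore the late response
        settled = true;
        clearTimeout(timer);
        resolve(response);
      },
      (error) => {
        if (settled) return;
        settled = true;
        clearTimeout(timer);
        reject(error);
      }
    );
  });
}
```

Note that the underlying request is not cancelled here; the wrapper only stops waiting for it.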
After this improvement, the next challenge was to keep track of the timed-out requests and retry them upon reconnection. Simply rejecting the promise on a timeout was not an option for us, because upper-level logic reacting to the failure would be executed immediately, and that was not what we wanted: a timeout is not necessarily a failure in an endpoint but more like an unknown state. If the promise were rejected immediately on a timeout, it would be too late to retry and restore the app state after a successful retry. With that approach we would not be able to implement a transparent retry strategy; we needed a way to resolve the original promise returned to the upper service level, even after several retries if desired, so that execution would continue normally once the connection was restored.
After debating different approaches internally, our team agreed that a reasonable solution was to queue callbacks to our fetch function itself. We modified our wrapped promise so that, in the event of an operation timeout, it injects an anonymous function that retries the same method and propagates the outcome by resolving or rejecting the original promise accordingly.
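A rough sketch of this queuing approach is shown here. `fetchWithTimeoutAndQueue` and `showErrorCallback` are the names used in this post, but the signature, the `retryQueue` array, the `config` object, and the injectable `fetchImpl` are assumptions made for illustration and may differ from our actual code.

```javascript
// Callbacks for requests that timed out and are waiting for a manual retry.
const retryQueue = [];

// `config` is a hypothetical options bag; `fetchImpl` is injectable for testing only.
function fetchWithTimeoutAndQueue(url, options = {}, config = {}) {
  const {
    timeoutMs = 10000,
    showErrorCallback = () => {},
    fetchImpl = fetch,
  } = config;

  return new Promise((resolve, reject) => {
    let settled = false;

    const timer = setTimeout(() => {
      if (settled) return;
      settled = true;
      // Timeout: do NOT reject. Queue an anonymous function that re-runs the
      // same request and forwards its outcome to this (still pending) promise.
      retryQueue.push(() =>
        fetchWithTimeoutAndQueue(url, options, config).then(resolve, reject)
      );
      // May be called once per timed-out request, so it must be idempotent.
      showErrorCallback();
    }, timeoutMs);

    fetchImpl(url, options).then(
      (response) => {
        if (settled) return;
        settled = true;
        clearTimeout(timer);
        resolve(response); // success: callers continue as usual
      },
      (error) => {
        if (settled) return;
        settled = true;
        clearTimeout(timer);
        reject(error); // non-timeout failure: callers handle it as usual
      }
    );
  });
}
```

Because the queued callback closes over the original `resolve` and `reject`, the caller's promise simply stays pending across any number of retries and settles only when a retry finally succeeds or fails for a reason other than a timeout.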
To summarize, we have different scenarios and outcomes:
If case 3 happens one or more times, showErrorCallback will show the "Reconnect" message to the user. IMPORTANT: this function must be idempotent. When the user taps the "Retry" button, the code that executes the retry (retryFailedRequests) will run and all the queued requests will be retried. Every retry calls fetchWithTimeoutAndQueue, and the result of that promise propagates up the chain until it reaches the top promise which, once resolved, executes the real code.
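The retry step and an idempotent error callback might look roughly like this. Only the names `retryFailedRequests` and `showErrorCallback` come from this post; the `retryQueue` array, the `reconnectVisible` flag, and the `onRetryPressed` handler are hypothetical stand-ins for the real queue, modal state, and button handler.

```javascript
// Hypothetical shared state: queued retry callbacks and the modal's visibility.
const retryQueue = [];
let reconnectVisible = false;

// Idempotent: calling it several times in a row has the same effect as calling it once.
function showErrorCallback() {
  if (reconnectVisible) return;
  reconnectVisible = true;
  // A real app would display the "Reconnect" modal here.
}

// Runs every queued retry and returns how many were started.
function retryFailedRequests() {
  // Empty the queue before running anything: a retry that times out again
  // pushes a fresh callback, which must wait for the next tap on "Retry".
  const pending = retryQueue.splice(0, retryQueue.length);
  pending.forEach((retry) => retry());
  return pending.length;
}

// Hypothetical handler for the "Retry" button.
function onRetryPressed() {
  reconnectVisible = false; // allow the modal to be shown again on a new timeout
  return retryFailedRequests();
}
```

Taking a snapshot of the queue before iterating is a deliberate choice in this sketch: it avoids looping forever when the connection is still down and each retry immediately re-queues itself.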
Now, whenever the app loses its connection, it is effectively paused until a successful request is made. Then, each service that has attempted a request continues with its normal execution.
This is exactly the behavior we were looking for, so we should be good to go, right? Well, almost. Since we are queuing multiple requests, there is a good chance that some of them are identical. If these requests are idempotent, we can add some simple sanitization so that repeated ones are filtered out, and even put a size limit on the queue itself. Our final implementation included these and other improvements.
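One possible way to sketch the deduplication and the size limit is to key the queue by request. Everything here is an assumption for illustration: the `requestKey` and `enqueueRetry` helpers, the choice of method, URL, and string body as the identity of a request, and the limit of 20 are not taken from our final implementation.

```javascript
// Hypothetical upper bound on the number of queued retries.
const MAX_QUEUE_SIZE = 20;

// A Map keeps insertion order, so retries can still run in arrival order.
const retryQueue = new Map();

// Two requests are treated as identical when method, URL, and string body match.
function requestKey(url, options = {}) {
  const method = (options.method || 'GET').toUpperCase();
  const body = typeof options.body === 'string' ? options.body : '';
  return `${method} ${url} ${body}`;
}

// Returns true if the retry was queued, false if it was a duplicate or the
// queue is full. The caller still has to decide what to do with the promise
// of a request that was not queued (for example, reject it).
function enqueueRetry(url, options, retry) {
  const key = requestKey(url, options);
  if (retryQueue.has(key)) return false;
  if (retryQueue.size >= MAX_QUEUE_SIZE) return false;
  retryQueue.set(key, retry);
  return true;
}
```

Filtering like this is only safe for idempotent requests, since it assumes that running one of several identical requests is equivalent to running all of them.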