Root cause analysis and follow-up for 20 05 4245:
On 18 May at 12:01 CET, our monitoring alerted us to a problem at our NL3 hosting location. We found that we could not connect to several databases.
The SaaS Operations team traced the problems to a single host and asked our supplier to fail this host over to a secondary one. This action restored availability: a check at 12:30 showed that the databases were responding.
Two technical improvements have been set in motion so that databases fail over more quickly. A third improvement focuses on internal communication: we aim to streamline it so that we can solve problems faster and give quicker updates on the status of a problem.