RESOLVED: SaaS disruption NL4C01

Incident Report for TOPdesk SaaS Status page

Postmortem

Root Cause Analysis (RCA) Report

Summary:
On the morning of September 6th, 2024, at approximately 09:05 CEST, we began receiving reports from our customers stating that their TOPdesk environments were unavailable or unreachable. An initial check by our Support department indicated that these reports were all originating from customer environments hosted in the NL4 datacenter, specifically within container NL4C01.

Timeline of Events:
At 09:10 CEST, monitoring showed that the SQL server was having issues.
By 09:15 CEST, the issue was noticed by our Operations team.
By 09:40 CEST, the Operations team confirmed that restarting the secondary SQL node would resolve the issue.
By 09:52 CEST, it was confirmed that all systems should be operational again.

Resolution:
Our engineers determined that the reported issues were caused by a misbehaving host machine. To mitigate the impact on our customers, all environments within container NL4C01 were failed over to backup infrastructure at approximately 09:40 CEST.

Future Preventive Measures:
Going forward, we will continue to respond promptly to similar incidents so that disruptions are mitigated quickly.
In addition, we will review and improve our monitoring systems and escalation procedures so that such issues are detected and resolved faster.

Conclusion:
The incident was caused by issues with a host machine in the NL4 datacenter, specifically within container NL4C01. Swift actions were taken to fail over to backup infrastructure and resolve the issue.

Posted Oct 17, 2024 - 13:54 CEST

Resolved

Our engineers have identified the cause and a fix has been implemented.

We will proceed to evaluate this issue internally. Upon its completion, a Root Cause Analysis (RCA) will be posted on our status page for your reference.

If you continue to experience any issues, kindly reach out to our support team for assistance. We appreciate your patience and understanding in this matter.
Posted Sep 06, 2024 - 10:00 CEST

Investigating

We are currently experiencing problems on the NL4C01 hosting location. As a result, your TOPdesk environment may not be available.

We are aware of the problem and are working on a solution.

Our apologies for the inconvenience. At the time of writing, we are not able to give you an estimate of when your environment will be available.
We aim to update this status page every 30 minutes until the issue has been resolved.

E-mail updates will be sent when the issue has been resolved. You can subscribe on the status page (https://status.topdesk.com) for additional updates.

To inform TOPdesk that you are affected by this issue, please visit https://my.topdesk.com/tas/public/ssp/ and refer to incident TDR24 09 2108.
Posted Sep 06, 2024 - 09:48 CEST
This incident affected: NL4 SaaS hosting location.