RESOLVED: UK2 degraded performance
Incident Report for TOPdesk SaaS Status page
Postmortem

Incident Summary

On October 4, 2024, at around 09:45 CEST, we received the first report of slowness from customers using the TOPdesk SaaS environment hosted in the UK2 data center. By early afternoon, multiple incidents had been reported, prompting an internal investigation. Our initial findings ruled out our own new version releases as the cause. Monitoring revealed that the upload folders for specific containers were unreachable, and logs confirmed failures in file upload and download operations. To keep our customers informed, we published an update on our status page. An internal status update from Microsoft Azure indicated a NetApp issue, and shortly after, the upload folders became reachable again, restoring file upload functionality. The incident was resolved by 17:41 CEST. The root cause was identified as an Azure network outage affecting Azure NetApp Files in the UK South region.

Root Cause Analysis

Microsoft Azure identified the root cause as a networking issue impacting a backend component that Azure NetApp Files operations rely on. This caused access problems for a subset of customers in the UK South region.

Follow-up actions

Our cloud engineers will review the current infrastructure to find ways to make it more robust and prevent similar issues in the future.

Conclusion

We will continue our ongoing collaboration with Microsoft to ensure quick detection and resolution in case such disruptions happen again.

Posted Oct 25, 2024 - 13:26 CEST

Resolved
We are pleased to inform you that our hosting provider, Microsoft Azure, has successfully resolved the intermittent access issues with NetApp Files in the UK South region. Their latest status update confirms that the underlying backend networking issue has been fully addressed.

Our monitoring also indicates that the issue with longer loading times has been resolved, and the system has shown consistent stability for the past hour.

We apologize for any inconvenience this may have caused and appreciate your patience and understanding.

The major incident will now be closed, and we will conduct an internal evaluation. Once this evaluation is complete, a detailed root cause analysis will be published.
Posted Oct 04, 2024 - 17:41 CEST
Update
Our hosting provider, Microsoft Azure, is experiencing intermittent access issues with NetApp Files in the UK South region. They have identified a backend networking issue as the cause and are actively working on a solution.

We will provide updates as we receive more information. 

Thank you for your patience and understanding.
Posted Oct 04, 2024 - 16:52 CEST
Update
Since this morning, an increasing number of customers hosted at our UK2 location have reported performance and availability issues. Our investigative team, which includes members from Development, SaaS Support, Technical Support, and SaaS Ops, is treating this as a high-priority issue and is actively working on identifying the root cause.

We have ruled out several potential causes and have determined that this is likely a networking issue, possibly originating from outside the TOPdesk network.

In addition, some customers are experiencing difficulties uploading files in their TOPdesk environments. Our team is aware of this and is in contact with Microsoft Azure to resolve the issue.

At this time, we are unable to provide an estimated timeline for resolution. However, we are committed to keeping you informed and will provide updates as soon as we have new information.

We sincerely apologize for the inconvenience and appreciate your patience.
Posted Oct 04, 2024 - 15:52 CEST
Investigating
We are currently experiencing issues at the UK2 hosting location. As a result, you may experience longer loading times in your TOPdesk SaaS environment.

We are aware of the problem and are working on a solution.

Our apologies for the inconvenience. At the time of writing, we cannot give you an estimate of when your environment will be available. We aim to update this status page every 30 minutes until the issue has been resolved.

E-mail updates will be sent when the issue has been resolved. You can subscribe on the status page (https://status.topdesk.com) for additional updates.

To inform TOPdesk that you are affected by this issue, please visit https://my.topdesk.com/tas/public/ssp/ and refer to incident TDR24 10 1813.
Posted Oct 04, 2024 - 15:14 CEST
This incident affected: UK2 SaaS hosting location.