Postmortem -
Read details
Jan 12, 17:12 EST
Resolved -
Our team has mitigated the incident. All systems are operational and customer impact has ceased. We are continuing our root cause analysis to prevent recurrence.
Jan 12, 13:50 EST
Update -
We have not observed any abnormalities since the system stabilized. We will close this incident after a further 45 minutes of monitoring. Root cause analysis will continue, and findings will be shared separately. Thank you for your patience.
Jan 12, 13:05 EST
Update -
The system has stabilized and no issues are currently being observed. Our team continues to work on identifying the root cause and will provide an update once we have more information.
Jan 12, 12:48 EST
Update -
We are continuing to monitor the system.
Jan 12, 12:35 EST
Update -
All API endpoints are operating normally. We are continuing to investigate the root cause and will share findings once available. Thank you for your patience.
Jan 12, 12:20 EST
Update -
Our investigation continues to make progress. We have analyzed system behavior during the affected periods and have narrowed our focus to connection management, which we believe may be contributing to the intermittent issues. We are encouraged that most impacted requests are ultimately completing successfully. The team remains fully engaged and is working toward a resolution. We appreciate your patience and will keep you informed as we learn more.
Jan 12, 12:08 EST
Update -
Our investigation is progressing. We have identified several contributing factors and are actively analyzing traffic patterns, system behavior, and recent changes to determine the root cause. The team is working diligently toward a full resolution and will continue to provide updates.
Jan 12, 11:55 EST
Update -
We continue to see intermittent timeouts on an order-handling service, occasionally impacting orders and account/position lookups—though most requests are completing successfully. Our engineers are actively isolating affected pods, capturing diagnostic data, and restarting them to restore stability. We are also investigating database behavior and traffic patterns to identify the underlying root cause.
Jan 12, 11:45 EST
Update -
The system has largely stabilized and the majority of requests are completing successfully. We are still observing intermittent timeouts (~3-5% of requests) affecting position updates. Our engineering team is actively investigating the root cause related to high-volume data requests and working toward a full resolution. We will continue to provide updates as we make progress.
Jan 12, 11:22 EST
Update -
We are continuing to address the issue. Clients may still experience intermittent 5xx errors.
Jan 12, 11:02 EST
Update -
We are continuing to address the issue. Clients may still experience intermittent 5xx errors.
Jan 12, 10:50 EST
Update -
We have identified another increase in 5xx errors. We are actively addressing it and will provide further updates. Clients may still experience intermittent timeouts on order placement while we work to resolve the issue.
Jan 12, 10:33 EST
Update -
We are no longer observing 500 errors across our API endpoints. All services are now accessible, and we are continuing to monitor for any further issues.
Jan 12, 10:20 EST
Monitoring -
The team has identified and addressed the underlying issue, and our APIs are recovering. We are continuing to monitor performance and will provide further updates.
Jan 12, 10:14 EST
Identified -
Our team is actively addressing the issue. Clients may still experience intermittent timeouts on order placement while we work toward a fix.
Jan 12, 10:08 EST
Investigating -
We are currently investigating elevated error rates across multiple API endpoints. Some requests may fail with HTTP 500 responses. Our engineering team is actively working to identify and resolve the root cause.
Jan 12, 09:58 EST