Intermittent delays in API responses for orders and JNLC

Incident Report for Alpaca

Postmortem

Service update — Order processing & account sync

January 20, 2026

What happened (short): A background database process created by a price-update job held a transaction open for an extended period. That blocked the database from cleaning up a very busy internal table, which caused backlogs in our worker processes. As a result, some orders and account updates were slower or intermittently timed out.
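
For illustration, here is a minimal sketch of this failure mode, not our actual job code: it assumes a PostgreSQL-style database accessed through the psycopg2 driver, and the table, symbol, and connection string are hypothetical. A worker that begins a transaction and then performs slow, unrelated work before committing keeps that transaction open the whole time, which prevents the database from reclaiming space in busy tables.

    import time
    import psycopg2

    conn = psycopg2.connect("dbname=example")  # placeholder connection string

    with conn.cursor() as cur:
        # The first statement implicitly opens a transaction (autocommit is off).
        cur.execute("SELECT price FROM prices WHERE symbol = %s", ("AAPL",))  # hypothetical table
        price = cur.fetchone()
        # Slow, non-database work happens here while the transaction stays open
        # ("idle in transaction"), blocking the database from cleaning up dead
        # rows for as long as the transaction remains open.
        time.sleep(3600)
    conn.commit()  # cleanup can only proceed once the transaction finally ends

Keeping transactions short, or committing before any slow external work, avoids holding the database back in this way.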

Impact:

  • Slower order responses and occasional timeouts for a subset of customers.
  • Delays in account balance refreshes and transaction visibility.
  • No data was lost — all funds, transactions and records remained secure.

What we did:

  • Stopped the offending background work, killed the long-running transaction, and manually cleaned the affected table.
  • Performed a rolling restart of worker and order-processing services to clear stuck processes.
  • Restored normal processing and worked through the backlog; services were fully healthy again later that morning, once all pending work had been processed.

Why this won’t happen again (actions):

Immediate

  • Added transaction timeouts for database access at the application level.
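
For a rough sense of what this looks like in practice, here is a minimal sketch assuming a PostgreSQL-compatible database and the psycopg2 driver; the connection string and timeout values are placeholders, not our production settings.

    import psycopg2

    conn = psycopg2.connect(
        "dbname=example",  # placeholder connection string
        # Cancel any single statement after 5 seconds, and terminate any session
        # that sits idle inside an open transaction for more than 60 seconds.
        options="-c statement_timeout=5000 -c idle_in_transaction_session_timeout=60000",
    )

With limits like these, a statement or idle transaction that runs too long is stopped by the database itself instead of quietly blocking cleanup.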

Short/medium term

  • Paused a non-critical job that had contributed to the resource contention; it will be re-enabled only once the underlying issue is resolved.
  • Improve alerting/handling for long-open transactions and failed database cleanups so we can act earlier.
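
As a hedged sketch of what such a check could look like, assuming a PostgreSQL-style database (the thresholds and connection string below are illustrative only):

    import psycopg2

    MAX_XACT_AGE_MIN = 15    # illustrative thresholds, not our production values
    MAX_DEAD_ROWS = 500_000

    with psycopg2.connect("dbname=example") as conn, conn.cursor() as cur:
        # Flag transactions that have been open longer than the threshold.
        cur.execute("""
            SELECT pid, now() - xact_start AS open_for
            FROM pg_stat_activity
            WHERE xact_start IS NOT NULL
              AND now() - xact_start > make_interval(mins => %s)
        """, (MAX_XACT_AGE_MIN,))
        for pid, open_for in cur.fetchall():
            print(f"WARN: transaction on backend pid {pid} open for {open_for}")

        # Flag tables accumulating dead rows, a sign that cleanup is falling behind.
        cur.execute("""
            SELECT relname, n_dead_tup, last_autovacuum
            FROM pg_stat_user_tables
            WHERE n_dead_tup > %s
        """, (MAX_DEAD_ROWS,))
        for table, dead_rows, last_cleanup in cur.fetchall():
            print(f"WARN: {table} has {dead_rows} dead rows (last cleanup: {last_cleanup})")

In production, a check like this would feed an alerting system rather than print to a console.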

Long term

  • Review and tune database maintenance settings so write-heavy tables are protected.
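
As one hedged example of the kind of tuning involved, assuming a PostgreSQL-style database with per-table cleanup settings (the table name and values are hypothetical):

    import psycopg2

    with psycopg2.connect("dbname=example") as conn, conn.cursor() as cur:
        # Trigger automatic cleanup after roughly 2% of the table's rows are dead,
        # and let each cleanup pass do more work on this hypothetical hot table.
        cur.execute("""
            ALTER TABLE journal_events
            SET (autovacuum_vacuum_scale_factor = 0.02,
                 autovacuum_vacuum_cost_limit = 1000)
        """)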

Our commitment: We apologize for the disruption. We’ve taken steps to fix the immediate problem and are implementing monitoring and architectural changes to reduce the chance of recurrence. If you’d like additional detail or a technical appendix, we’re happy to share it.

Posted Jan 20, 2026 - 16:42 EST

Resolved

We are no longer observing any issues across our APIs. This incident is now marked as resolved. A root cause analysis will be published by tomorrow.
Posted Jan 20, 2026 - 05:26 EST

Update

Errors are very low and limited to brief blips during restarts; we will keep monitoring closely before marking this incident as resolved.
Posted Jan 20, 2026 - 05:20 EST

Update

After restarting the affected backend services, the pending account/journal work has cleared. Error rates and timeouts are now very low, and requests are succeeding again. We are continuing to monitor closely for any further issues.
Posted Jan 20, 2026 - 05:11 EST

Update

Overall performance has improved for all affected endpoints. We will keep monitoring closely before marking this incident as resolved.
Posted Jan 20, 2026 - 05:01 EST

Monitoring

Error rates and latency are improving after our remedial actions; we are continuing to monitor for any further issues.
Posted Jan 20, 2026 - 04:51 EST

Investigating

Our team has identified an issue with one of our services serving stale data because it is failing to refresh, and we are working on a fix.
Posted Jan 20, 2026 - 04:46 EST

Identified

Core databases and replication appear healthy; impact is limited to specific journal/account operations and is intermittent.
Engineering is actively working to stabilize the affected service and reduce latency; we will provide another update as we learn more.
Posted Jan 20, 2026 - 04:43 EST

Update

Accounts are taking longer than expected to fully load, which is occasionally causing some users to see an error (a timeout) when they try to check their journals or orders. We are actively working to quickly finish loading these accounts and stabilize the system.
Posted Jan 20, 2026 - 04:32 EST

Investigating

API responses for orders and JNLC are being delayed intermittently.
Posted Jan 20, 2026 - 04:20 EST
This incident affected: Broker API (broker.trading.accounts.account_id.orders.get, broker.journals.get).