Resolved
The root cause appeared to be an issue with an upstream dependency (database connection pooler), which we have replaced.
We will be writing a full post-mortem with more details.
Resolved
We are no longer seeing issues.
Monitoring
DB upgrades have completed, and we have seen error rates go down. Continuing to monitor.
Investigating
We are upgrading DB resources, which may result in brief downtime.
Investigating
We are repairing the DB, which may result in brief downtime.
Investigating
We're observing a high rate of errors on the production control plane, which is causing errors on braintrust.dev and in data plane operations.