Processing Delays
Incident Report for Redox Engine
Postmortem

Root cause

Some production connections across Redox that were not yet live had been paused for an extended period of time. When these connections were unpaused, the Redox engine tried and failed to ingest a very old transmission. This caused a backlog of transmissions that affected a small subset of customers.

Impact on customers

A handful of customers on a particular partition were affected by message processing delays.

What Happened?

At approximately 4:56 PM CT, we observed an increase in message processing time and began investigating the issue. By 5:41 PM CT, we had implemented a fix that increased the partition size, and we observed queue depth decreasing, which resolved the issue.

Learnings / Follow-ups

Redox has committed to creating and implementing alerting for this specific scenario and the areas it affected.
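
As a rough illustration only, the kind of alerting described above could check each partition for a growing queue or an unusually old pending transmission. The sketch below is not Redox's implementation: the `PartitionStats` type, the `partitions_to_alert` function, and both thresholds are hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class PartitionStats:
    # Hypothetical snapshot of one partition's queue; not a real Redox type.
    name: str
    queue_depth: int
    oldest_message_at: datetime


def partitions_to_alert(stats, now, max_depth=1000, max_age=timedelta(minutes=15)):
    """Return (partition name, reasons) for each partition that breaches a threshold.

    The default thresholds are placeholder assumptions, not values from the incident.
    """
    alerts = []
    for s in stats:
        reasons = []
        if s.queue_depth > max_depth:
            reasons.append("queue depth")
        if now - s.oldest_message_at > max_age:
            reasons.append("oldest message age")
        if reasons:
            alerts.append((s.name, reasons))
    return alerts


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    sample = [
        PartitionStats("healthy", 10, now - timedelta(minutes=1)),
        PartitionStats("backed-up", 5000, now - timedelta(days=30)),
    ]
    for name, reasons in partitions_to_alert(sample, now):
        print(f"ALERT {name}: {', '.join(reasons)}")
```

Checking the age of the oldest pending transmission alongside queue depth is a deliberate choice here. A long-paused connection that is unpaused can surface a very old transmission before queue depth alone looks abnormal.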

Posted Nov 30, 2021 - 12:49 CST

Resolved
This incident has been resolved.
Posted Nov 18, 2021 - 16:52 CST
Monitoring
A fix has been implemented and we are monitoring the results.
Posted Nov 18, 2021 - 11:32 CST
Update
We are continuing to work on a fix for this issue.
Posted Nov 18, 2021 - 10:59 CST
Update
We are continuing to work on a fix for this issue.
Posted Nov 18, 2021 - 10:58 CST
Update
We are continuing to work on a fix for this issue.
Posted Nov 18, 2021 - 10:56 CST
Identified
The issue has been identified and a fix is being implemented.
Posted Nov 18, 2021 - 10:53 CST
This incident affected: Engine Core.