MLLP Inbound Message Processing Degradation
Incident Report for Redox Engine
Postmortem

Root cause

On Monday 10/4, changes deployed between 2:41PM CT and 7:00PM CT as part of ongoing infrastructure scalability work inadvertently increased the volume of DNS lookups made by our services.

What Happened?

The increased lookup volume degraded DNS performance, which in turn degraded the application services that depend on it. We reverted some of the changes on Monday (10/4) evening, which immediately improved performance. As message volume increased on Tuesday 10/5, we experienced a more severe degradation. Reverting all of the changes deployed between 2:41PM CT and 7:00PM CT on Monday 10/4 restored DNS lookup performance and returned message processing to normal speed.

Impact on customers

Customers with high-volume MLLP feeds saw messages queue up on the HCO side because HTTP requests requiring DNS resolution sporadically incurred added latency on the order of 5-10 seconds.
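To illustrate why this added latency surfaced as queueing on high-volume feeds, the back-of-the-envelope sketch below assumes messages on an MLLP connection are acknowledged one at a time and that each message's processing pays the added DNS delay. The arrival rate is a hypothetical figure, and treating every message as delayed is a simplification of the sporadic latency described above; only the 5-10 second range comes from this report.

```python
# Rough model of backlog growth on a sequential, ACK-driven MLLP feed.
# ARRIVAL_RATE_PER_MINUTE is hypothetical; ADDED_LATENCY_SECONDS is the
# mid-range of the 5-10 second DNS delay observed during the incident.
ARRIVAL_RATE_PER_MINUTE = 120
ADDED_LATENCY_SECONDS = 7.0

def backlog_after(minutes: float) -> float:
    """Messages queued on the sending (HCO) side after the given duration,
    assuming one message is processed at a time and each one now takes
    ADDED_LATENCY_SECONDS longer than usual."""
    arrivals = ARRIVAL_RATE_PER_MINUTE * minutes
    # With sequential processing, throughput is capped at roughly one
    # message per ADDED_LATENCY_SECONDS (normal processing time ignored).
    processed = (minutes * 60.0) / ADDED_LATENCY_SECONDS
    return max(arrivals - processed, 0.0)

if __name__ == "__main__":
    for m in (10, 30, 60):
        print(f"After {m} min: ~{backlog_after(m):.0f} messages queued")
```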

Learnings / Follow-ups

Internal teams are evaluating technology that will improve our testing and alerting on network degradation.
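As a rough illustration of the kind of check under evaluation, the sketch below times DNS resolution for a set of hosts and flags lookups that exceed a threshold. The hostnames, threshold, and print-based alerting are placeholders for illustration only; they do not describe Redox's actual tooling.

```python
import socket
import time

# Hypothetical probe targets and threshold; real monitoring would feed an
# alerting pipeline rather than printing.
PROBE_HOSTS = ["api.example.internal", "queue.example.internal"]
ALERT_THRESHOLD_SECONDS = 1.0

def probe_dns_latency(host: str) -> float:
    """Time a single DNS resolution for the given host."""
    start = time.monotonic()
    socket.getaddrinfo(host, 443)
    return time.monotonic() - start

def check_all_hosts() -> None:
    for host in PROBE_HOSTS:
        try:
            latency = probe_dns_latency(host)
        except socket.gaierror as exc:
            print(f"ALERT: DNS resolution failed for {host}: {exc}")
            continue
        if latency > ALERT_THRESHOLD_SECONDS:
            print(f"ALERT: DNS resolution for {host} took {latency:.2f}s")
        else:
            print(f"OK: {host} resolved in {latency:.3f}s")

if __name__ == "__main__":
    check_all_hosts()
```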

Posted Oct 15, 2021 - 15:05 CDT

Resolved
This incident has been resolved.
Posted Oct 05, 2021 - 16:45 CDT
Monitoring
A fix has been implemented and we are monitoring the results.
Posted Oct 05, 2021 - 12:13 CDT
Identified
The issue has been identified and a fix is being implemented.
Posted Oct 05, 2021 - 11:50 CDT
Update
We are continuing to investigate this issue.
Posted Oct 05, 2021 - 11:50 CDT
Update
In addition to MLLP Routing, we are experiencing delays within our dashboard.
Posted Oct 05, 2021 - 10:40 CDT
Update
We are continuing to investigate this issue.
Posted Oct 05, 2021 - 10:37 CDT
Investigating
We are currently experiencing an issue with our inbound MLLP processing. We are actively investigating and should have an update shortly. If you have any questions please contact support@redoxengine.com.
Posted Oct 05, 2021 - 10:34 CDT
This incident affected: Dashboard and Engine Core.