Big Mess Up

Soooo, we messed up. A large marathon used our email system to send out pre-race information emails. One of the nice features of our CRM-driven email is that you can put “tags” into an email and they populate from the database automatically. Pretty cool. Except when there is a miscommunication, two of the tags do not work, and everyone gets the same address and an anticipated finish time of 5:30!

So, yes, we are human and make mistakes. Unfortunately, mistakes in code get repeated very quickly: in this case, to over 28,000 people in a couple of minutes.

The email went out at 9:30 last night. Fortunately, the customer was checking email responses when people started reporting the error. She notified us and our team worked on a resolution. After some diagnosis, workarounds and testing, we were able to get the corrected email out around 12:30 AM. Special thanks to the customer and to Andrew, Eric and Natalie for staying up late and getting this fixed!

We always report errors publicly on our blog. The next few paragraphs get into the gory details, and they are complicated because it was a two-step issue. The first piece was that a couple of months ago we upgraded to a new version of the SendGrid API. This new version placed some new restrictions on us when sending very large sets of emails. The basic issue was memory: a mail merge of, say, 28,000 emails with a dozen different parameters, each one (like expected finish time) replaced automatically in every email, consumed a lot of it. To optimize this, we took advantage of the fact that some of the tags are really constants, like race start time or packet pickup information. So we filled in those tags once with a constant value rather than replacing them separately in each of the 28,000 emails being created.
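To make the optimization concrete, here is a minimal sketch in Python. It is an illustration only, not the actual production code: the function name, the %Race_Start_Time% and %First_Name% tags, and the data shapes are all assumptions made up for this example. The idea is to substitute race-level tags once in the template, and do per-recipient replacement only for the tags that actually vary.

```python
# Hypothetical sketch of the constant-tag optimization.
# Tag names and data shapes are illustrative assumptions.
RACE_LEVEL_TAGS = {"%Race_Start_Time%", "%Packet_Pickup_Information%"}


def render_all(template, race_values, recipients):
    """Render one email body per recipient.

    race_values: dict mapping race-level tags to their single value.
    recipients:  list of dicts mapping per-recipient tags to values.
    """
    # Step 1: replace race-level tags once, before the per-recipient loop.
    for tag in RACE_LEVEL_TAGS:
        if tag in race_values:
            template = template.replace(tag, race_values[tag])

    # Step 2: replace whatever tags each recipient supplies.
    emails = []
    for person in recipients:
        body = template
        for tag, value in person.items():
            body = body.replace(tag, value)
        emails.append(body)
    return emails
```

With 28,000 recipients, step 1 runs once instead of 28,000 times, which is where the saving comes from. The catch is that it assumes a race-level tag never carries per-recipient data.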

Where this caused a problem is that this marathon has been using Custom Email Lists to send emails since last year (and we recommended this path). The advantage of a custom list is that it can carry information our standard email tags do not cover, like the answer to a custom question. By exporting a CRM report and importing it into the email system, mapping fields like expected finish time to an unused custom tag like %Packet_Pickup_Information%, they could send out their emails. Pretty cool little workaround, except that some of the tags they used were race-level tags, which our optimization now converted into a constant. The result was that everyone had an expected finish time of 5:30.
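The collision can be reproduced in a few lines of Python. This is a sketch under assumptions, not the real system: the function names, the sample finish times, and the idea that the constant value happened to be "5:30" are all invented for illustration. The second function shows one possible kind of guard, which is not necessarily the fix that is going in.

```python
# Hypothetical reproduction of the collision; names and data are illustrative.
RACE_LEVEL_TAGS = {"%Packet_Pickup_Information%"}


def render_with_constants(template, race_values, row):
    # Race-level tags are filled with the constant first...
    for tag in RACE_LEVEL_TAGS:
        template = template.replace(tag, race_values.get(tag, ""))
    # ...so by now the tag is gone and the per-recipient value is never used.
    for tag, value in row.items():
        template = template.replace(tag, value)
    return template


def colliding_tags(custom_rows):
    """Return custom-list tags that would be overwritten by a constant."""
    used = set()
    for row in custom_rows:
        used.update(row)  # collect the tag names (dict keys) in use
    return sorted(used & RACE_LEVEL_TAGS)


template = "Your expected finish time is %Packet_Pickup_Information%."
race_values = {"%Packet_Pickup_Information%": "5:30"}  # assumed constant
rows = [
    {"%Packet_Pickup_Information%": "3:45"},  # runner A's real finish time
    {"%Packet_Pickup_Information%": "4:10"},  # runner B's real finish time
]

emails = [render_with_constants(template, race_values, r) for r in rows]
collisions = colliding_tags(rows)
```

Both emails come out saying 5:30, and colliding_tags flags the reused tag. A check like this, run before sending, would at least warn that a custom list is mapped onto a tag the system treats as a constant.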

So a little optimization that happened in development met a little customization that happened in the field, and together they caused this issue. It is no one’s fault specifically, but the question is how we fix this. Well, we have some software fixes going in, and this event will cause us to document changes better. We will also be more aware of the risks when creating little workarounds.

We do not beat people up when mistakes happen. We all make mistakes; we admit them, analyze them and learn from them. Learning is one of our Guiding Principles for a reason.
