Part of Bob’s continued ramblings.
I’ve been asked a number of times in the past couple of days, including by my daughters (who know I am a Swiftie), whether we could have handled the load that Ticketmaster could not. The simple answer is: kind of.
Our biggest limitation is seating charts: we do not do them. Letting customers pick specific seats places a huge burden on Ticketmaster, which has to assign specific seats up front. The way we would handle it is to sell a seat level – Bronze, Silver, or Gold – and then assign the specific seats afterward.
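A minimal sketch of that model (the tier names, seat labels, and pool sizes here are made up for illustration): during the on-sale rush, orders only record a tier; specific seats come out of each tier's pool afterward.

```python
# Hypothetical seat pools per tier; a real chart would come from the venue.
seat_pools = {
    "Gold":   [f"G{n}" for n in range(1, 101)],
    "Silver": [f"S{n}" for n in range(1, 201)],
    "Bronze": [f"B{n}" for n in range(1, 301)],
}

orders = []  # (order_id, tier) pairs captured during the rush

def buy(order_id, tier):
    """During the rush we only record the tier -- no seat-map lookups."""
    orders.append((order_id, tier))

def assign_seats():
    """After the rush, hand out specific seats from each tier's pool."""
    assignments = {}
    for order_id, tier in orders:
        assignments[order_id] = seat_pools[tier].pop(0)
    return assignments

buy(1, "Gold")
buy(2, "Bronze")
print(assign_seats())  # prints {1: 'G1', 2: 'B1'}
```

The point is that the hot path (`buy`) never touches the seat map, so the expensive per-seat bookkeeping happens after the traffic spike has passed.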
Could We Do It?
Yes, given the above caveat of seating.
It would be kind of hacky if we had to do it tomorrow, but with a week or two to prepare we most certainly could have done it very well. Heck, back in 2012, when we designed our AWS-based scalable infrastructure, we did 50,000 transactions in 7 minutes (assume 2 tickets per transaction and that extrapolates to 2 Million tickets in under 3 hours) – and it has only improved since then, as we continue to invest every year in upgrading our infrastructure and code.
We run our infrastructure so we can do 2,000 tickets a minute (actually 2,000 transactions, so more likely 4,000 tickets – but we will go conservative and say 1 ticket per transaction). That is 120,000 tickets per hour. Ticketmaster sold 2 Million tickets in the first 24 hours, and those 2 Million tickets would have taken us nearly 17 hours to process.
What we do not have is the infrastructure to handle that peak load of incoming requests at opening. Ticketmaster was probably handling 200,000 page requests per second when they opened (figure 2 Million people/bots every 10 seconds – they averaged roughly 40,000 per second over the course of the day, based on the reported 3.5 Billion requests). We have handled some attacks of a couple thousand requests per second just fine, but even 40,000 per second would probably overwhelm our normal infrastructure.
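For anyone who wants to check my back-of-the-envelope math, the figures in the last two paragraphs work out like this:

```python
# Our conservative throughput: 2,000 tickets per minute.
tickets_per_hour = 2_000 * 60
print(tickets_per_hour)        # 120000

# Hours to process Ticketmaster's first-day 2 Million tickets at that rate.
hours_for_2m = 2_000_000 / tickets_per_hour
print(round(hours_for_2m, 1))  # 16.7

# Estimated opening peak: 2 Million people/bots hitting every 10 seconds.
peak_rps = 2_000_000 / 10
print(int(peak_rps))           # 200000

# Day-long average from the reported 3.5 Billion requests.
avg_rps = 3_500_000_000 / 86_400  # 86,400 seconds in a day
print(round(avg_rps))          # 40509
```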
Quick Infrastructure Expansion
We have quick upgrades we can execute in less than 10 minutes that would multiply our capacity by at least 4X. Here is an example page from our admin dashboard that allows us to upgrade our AWS EC2 instances to a higher level of capacity:
We can also start new servers on demand at each tier. For example, the web server tier:
The bottleneck would be upgrading our database servers – which is why I say only a 4X upgrade.
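For readers unfamiliar with what those dashboard buttons do under the hood, they boil down to standard AWS calls along these lines (the instance IDs, AMI ID, and instance types below are placeholders, not our actual configuration):

```shell
# Resize an existing EC2 instance to a larger type (requires a brief stop).
aws ec2 stop-instances --instance-ids i-0123456789abcdef0
aws ec2 wait instance-stopped --instance-ids i-0123456789abcdef0
aws ec2 modify-instance-attribute --instance-id i-0123456789abcdef0 \
    --instance-type "{\"Value\": \"m5.4xlarge\"}"
aws ec2 start-instances --instance-ids i-0123456789abcdef0

# Launch additional web-tier servers from an existing image.
aws ec2 run-instances --image-id ami-0123456789abcdef0 \
    --instance-type m5.xlarge --count 4
```

The resize path is what makes databases the bottleneck: the stop/start cycle means a brief outage per database server, whereas stateless web servers can simply be launched in parallel behind the load balancer.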
But still, with a 4X upgrade, we could probably handle the sale of 2 Million tickets in 4 hours. So not too bad.
Give us a Day
If you gave us a day to get ready, we would upgrade our database servers to larger instances (we can go to at least 16-32X our current infrastructure). We would also do some quick tricks on the front end to handle the high load.
So we could probably sell those 2 Million in 1 hour, at least theoretically.
Give us a Couple of Weeks
There are two paths we would pursue.
First, we would do some major reworking of our front end to handle the high volume of requests and put in place better queueing for web requests so we could handle the 200,000+ requests per second.
Second, we would put in place a better process for Ticketmaster’s “Verified Fan” filtering and queue management. We would probably extend our “Reserved Entry” capability and/or Lottery functionality to make things fair for ticket buyers. This would involve a private URL/domain that either spaced out the buyers or entered them into a lottery, with winners picked at random and notified.
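The lottery piece is simple in principle. A sketch (the fan counts and IDs are made up; the seed parameter is just there to make draws reproducible for auditing):

```python
import random

def run_lottery(verified_fans, winners_count, seed=None):
    """Pick winners uniformly at random from the verified-fan pool."""
    rng = random.Random(seed)
    return rng.sample(verified_fans, winners_count)

fans = [f"fan-{n}" for n in range(500_000)]
winners = run_lottery(fans, winners_count=100_000, seed=42)
print(len(winners))  # 100000
# Winners would then be notified with a private purchase URL and a time window.
```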
Ticketmaster Infrastructure, Process and Queues
Infrastructure – Of course, demand this big would be difficult for anyone to handle. Google does almost 10 Billion searches per day, and Ticketmaster received something like 3.5 Billion requests that day. Hopefully Ticketmaster is not running on their own servers and uses AWS like we do (and like Netflix does when Stranger Things debuts). They are probably also burdened with a bunch of old systems and some amount of technical debt that holds them back from making fast changes like we are able to make.
Process – The demand spike at opening was simply overwhelming. We understand the PR value of a big opening, but this PR is kind of bad… So this is where process comes in – there needed to be a way to spread the load. We have two mechanisms that could have been used to spread that demand. First, a lottery type of system. Second, a reserved entry type of system where invitations and admittance are set up in waves (100,000 Verified Fans each hour, or something like that).
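The wave idea can be made concrete with a simple schedule (the 100,000-per-hour figure is the example above; the fan count and dates are illustrative):

```python
from datetime import datetime, timedelta

def wave_schedule(fan_ids, wave_size, opening):
    """Split verified fans into hourly admission waves."""
    waves = []
    for i in range(0, len(fan_ids), wave_size):
        waves.append({
            "admit_at": opening + timedelta(hours=len(waves)),
            "fans": fan_ids[i:i + wave_size],
        })
    return waves

fans = list(range(350_000))
waves = wave_schedule(fans, wave_size=100_000,
                      opening=datetime(2022, 11, 15, 10, 0))
print(len(waves), len(waves[-1]["fans"]))  # 4 50000
```

Each fan gets a known admission hour instead of everyone hammering the site at once, so the peak load is capped at one wave's worth of buyers.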
Queues – Actually, two comments on queues. First, the technical one. Queues are amazing for designing complex, high-volume systems, and it is clear that Ticketmaster’s queue technology leaves something to be desired. Second, putting specific seat selection in the purchase path imposes unnecessary burdens on the computing infrastructure. It means there is a queue to start, and then a complex set of queues for tracking all the active purchases in process. This is all driven by giving individuals specific seat selection capabilities, as well as historical processes that likely could be changed.
Kind of. Certainly on a transaction and volume basis. But specific seat picking is tough…