An email from a customer yesterday spurred me to write this Founder’s Corner post about Continuous Improvement:
Hi Bob, my name is Tom Gifford and I am the Race Series Director for the East Shore YMCA. We are in Harrisburg, PA. I have been using RunSignUp since 2015 and I wanted to send a note about how impressive it is with what your company has built. It is always great to see how your company never stops looking at ways to offer new features.
Tom Gifford, East Shore YMCA
The concept of continuous improvement is foundational to our company. It is cool that our customers recognize this and value it. It is probably worthwhile to give some background on why we view this as a core part of our company and how we try to achieve it in various ways.
Continuous Improvement Influences
A number of things have influenced the high priority I give this concept. The first was my experience at Bucknell as a runner. Our Coach, Art Gulden, was a believer in the Lydiard style of training – meaning lots of long slow distance. We ran 5 miles in the morning and 10 miles in the afternoon most of the year (I averaged 4,400 miles per year after tapers, injuries and end of season low mileage for a few weeks). There is a theory about expanding the oxygen flow to and consumption by mitochondria. But I basically think that if you do that much work consistently, you will get a lot better – and we did.
The other big influence around Continuous Improvement has been my involvement with CloudBees and Kohsuke Kawaguchi. Kohsuke invented the Hudson, later Jenkins, open source project to do “Continuous Integration”. Basically automating the many tasks that a developer needs to do to release software. His creation is used by hundreds of thousands of organizations. His creation, along with Git, led to the DevOps revolution that allows software to be released very often.
The final influence on my belief in Continuous Improvement is our own company. Bryan Jenkins once coined the phrase “We are a 10 year overnight success”.
Continuous Improvement in Software
One of the hallmarks of our company is the fact that we do over 2,000 releases of our software each year. We’ve actually spent a lot of time on the process of releasing software. With PCI requirements, and our own focus on quality, every line of code that is released is reviewed by another person, and released by one of our more senior people. But we have put in place systems to manage this well:
- GitHub – this is our code repository that allows our developers to share code, branch off to work on their own projects and merge back in to the master branch, which is what we deploy multiple times each day.
- Code Review Process – we built our own workflow process and system that allows a developer to ask for code to be reviewed as a draft or as a final release. It tracks each and every change that is made, and records who the reviewers and approvers were for accountability, but also to make the process very smooth.
- Code Checkers – we use a couple of tools, primarily ones that sit in each developer’s editor on their laptop, to make sure the code complies with our coding standards and to catch errors early.
- Unit Testing – we have expanded our use of unit testing over the past couple of years. This allows us to run automated unit tests to make sure we did not break something when we change and add code.
- Test Environments on Laptops – we have close to the entire infrastructure we deploy to packaged up in a Docker image so developers can have a full test environment right on their laptop as they develop.
- Test Environments on AWS – we have 9 test servers that we can deploy different branches of the code to so that other people can test new functionality. For example, we just deployed some great new features for our Voucher system and had Vacation Races do testing on a test environment to make sure we got good customer feedback.
- Fast Deployment Scripts – we have automated procedures that make it exceptionally easy to deploy code.
- Adaptable Production Environment – we can deploy a new version of the codebase literally between user clicks in their browsers. For example, we made a major improvement to the Registration Dashboard page on Monday and our users just saw it the next time they clicked on the dashboard. There is no downtime needed. We can also extend the database dynamically, allowing us to add tables or fields as we add new functionality without taking the database offline. In fact, we have not had to take the system offline since 2015, when we did it for 8 minutes for a major database migration. And this is a complex database with 2,000 tables, shards, read replicas, etc.
- Fast Rollback – we experienced this in 2021, when a release we did caused major system errors. We were able to roll back to the previous version, and the site was operational again after a 4 minute outage.
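The Adaptable Production Environment and Fast Rollback items above can be illustrated with one common pattern: keep each release in its own directory and switch a `current` symlink atomically. This is a minimal sketch in Python for illustration only, not a description of the actual deployment scripts. The `ReleaseSwitcher` class, the `current` link name and the release directory names are all assumptions made up for the example.

```python
import os
from pathlib import Path


class ReleaseSwitcher:
    """Hypothetical helper: tracks which release a `current` symlink points at."""

    def __init__(self, releases_dir):
        self.releases_dir = Path(releases_dir)
        self.history = []  # names of activated releases, oldest first

    def _point_current_at(self, name):
        if not (self.releases_dir / name).is_dir():
            raise FileNotFoundError(f"unknown release: {name}")
        tmp_link = self.releases_dir / "current.tmp"
        if tmp_link.is_symlink():
            tmp_link.unlink()  # clear a leftover from an interrupted switch
        os.symlink(name, tmp_link)  # relative link, resolved inside releases_dir
        # Renaming over the old link is atomic on POSIX filesystems, so any
        # single lookup of `current` resolves to the old release or the new one.
        os.replace(tmp_link, self.releases_dir / "current")

    def deploy(self, name):
        """Activate a release and remember it for later rollback."""
        self._point_current_at(name)
        self.history.append(name)

    def rollback(self):
        """Re-activate the release that was live before the latest deploy."""
        if len(self.history) < 2:
            raise RuntimeError("no earlier release to roll back to")
        previous = self.history[-2]
        self._point_current_at(previous)
        self.history.pop()
        return previous

    def active(self):
        """Return the name of the release that `current` points at."""
        return os.readlink(self.releases_dir / "current")
```

With this layout, a rollback is just another link switch to a directory that is already on disk, which is one reason recovery in setups like this can take minutes rather than hours.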
We have some big plans to improve this process this year. We will be using GitHub Actions to drive more automated testing, and will be expanding our test suite.
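For readers unfamiliar with unit testing, here is a minimal sketch of what an automated test looks like. It is written in Python purely for illustration. The `apply_voucher` function and its rules are hypothetical and are not taken from the platform's code.

```python
import unittest


def apply_voucher(price_cents, voucher_cents):
    """Hypothetical helper: price after a voucher, never below zero."""
    if price_cents < 0 or voucher_cents < 0:
        raise ValueError("amounts must be non-negative")
    return max(price_cents - voucher_cents, 0)


class ApplyVoucherTest(unittest.TestCase):
    def test_partial_voucher_reduces_price(self):
        self.assertEqual(apply_voucher(5000, 1500), 3500)

    def test_voucher_larger_than_price_floors_at_zero(self):
        self.assertEqual(apply_voucher(2000, 5000), 0)

    def test_negative_amount_is_rejected(self):
        with self.assertRaises(ValueError):
            apply_voucher(-1, 100)
```

Saved as a file, the tests can be run with `python -m unittest`. A CI service such as GitHub Actions can run the same command on every push, so a change that breaks one of these expectations is caught before release.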
Releases the Past Two Days
To give an idea of the pace of releases, here is a list of the 31 releases over the past two days (I get an email each morning where I review all of the releases for the day that I need to approve as part of our PCI process):
- Jeff Kiesel – Race Dashboard: Analytics dropdown update
- Darren Wamboldt – hide slideshow dots on mobile
- Jeff Kiesel – updated icon font and logos
- Darren Wamboldt – Tippy in horizontal scroller fix
- Stephen Sigwart – Fix Adyen chargeback error
- Michael Lindeboom – Tickets Event with Donation Journal Categorization
- Stephen Sigwart – Remove GiveSignup approval checks
- Stephen Sigwart – Ticket journal calculation update
- Jenn Levas – Update Facebook tracking pixels to use new generic tracking pixels setup
- Andrew Burke – Correct facebook pixel capitalization
- Andrew Burke – Fix additional Facebook Pixel code capitalization
- Alyssa Stone – New Milestones & Badges for Races (Group 1)
- Darren Wamboldt – UX updates to the sent email list page
- Ryan Snell – Fix default state for milestones and badges
- Darren Wamboldt – web builder UI updates; event description update; event info updates
- Darren Wambolt – Added basic info URL. Fixed detailsNote and detailsImgUrl
- Darren Wambolt – Added detailsLink for event description
- Darren Wambolt – create new About component
- Darren Wambolt – use Vue Slots to render in templates
- Stephen Sigwart – Fix duplicate key database exceptions
- Jeff Kohart – Fix SMS messages being sent too many times
- Philip Quinn – Removed a few options for estimated donations amount
- Stephen Sigwart – Add voucher to deferred registrations report
- Matt Morrisson – allow null filename
- Stephen Sigwart – Director form UX cleanup
- Philip Quinn – Made nonprofit question available to all domains in ticket wizard
- Stephen Sigwart – Waitlist fix
- Jeff Kohart – Stop email daemon errors due to template race
- Stephen Sigwart – Composer updates
- Stephen Sigwart – Allow admin to issue WorldPay refunds
- Michael Lindeboom – Add details to csv/xlsx export and fix phpcs flagged issues
I won’t bore you with descriptions of each of these items, but they are all examples of small amounts of incremental progress. And we get those slightly better capabilities out to our customers each and every day. We get feedback and can see them in real operation across the thousands of events that process transactions each day on our platform. You will see a couple of examples, like the release by Jenn Levas with fixes by Andrew Burke right after. Andrew is the one who deployed the release. One of the challenges with Facebook is that you cannot really test until it is deployed. Andrew was able to quickly find a couple of little tweaks that needed to be made and had those out in literally minutes.
Continuous Improvement on Infrastructure
We also make continuous improvements on our infrastructure. You can see a couple of small examples in the list above, such as the workaround we had to build for an error Adyen had when processing chargebacks.
On a larger basis, we do regular monthly updates to all of our servers to make sure they all have the latest security patches. We run about 50 servers, and while we have done a fair amount of automation for that monthly task, it still requires work. We are lucky to have Kristian Decker join us to focus on DevOps and continue to improve that process. He and Stephen just did that yesterday (in addition to all of those releases mentioned above!).
Also yesterday, we did our annual review of our AWS servers and renewed 41 servers, upgrading 26 of them to the latest hardware available. This expanded our capacity by about 20% while keeping costs the same.
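One way to read the numbers above: if capacity grows by about 20% while total cost stays flat, the cost per unit of capacity falls by roughly one sixth. The short sketch below works through that arithmetic. The function name is illustrative, and the 20% figure is the approximate value quoted above.

```python
def cost_per_unit_change(capacity_gain, cost_change=0.0):
    """Relative change in cost per unit of capacity (negative means cheaper)."""
    return (1 + cost_change) / (1 + capacity_gain) - 1


# About 20% more capacity at the same total cost.
change = cost_per_unit_change(0.20)
print(f"{change:.1%}")  # prints "-16.7%"
```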
We have a laundry list of infrastructure items to keep improving, on top of the continual daily maintenance and vigilance that security requires. The important concept is not to treat infrastructure as static, but as a valuable asset that needs continual improvement.
Continuous Improvement for People
One of our Guiding Principles is LEARN. We think that learning is continual and try to set ourselves up for this in multiple ways.
Developer Friday Share – Ryan Snell started this up about 2 years ago. At the time we had a bunch of great new developers. They all had talent, but all of them needed experience. Ryan and I talked about how, if we helped them learn more, they could be a powerful force to drive our product forward in the future. Each week on Friday at 10 they get together and about 5-7 of them will share in depth either something clever they did that week so others can learn, or a challenge they are having so they can get help figuring out the best approach. This exposes all of them to various parts of the system, lets them see coding patterns that might help them in future projects, and creates a learning environment. Ryan and the team have set this up to be a “no judgement” zone, so no one feels embarrassed about asking stupid questions since there are none.
Onboarding New Sales and Support People – This is another area we have made a lot of progress on. I remember 8 years ago when Natallie Young joined us as an Account Manager, I told her to write down how she learned stuff as she started so it could be the beginning of an onboarding document. We now have a structured program that lasts about 6-8 weeks before a new person actually takes responsibility for doing things at the company. We have a complex system that everyone needs to learn, and a mature set of business and system processes that we follow. I see new people start with an overwhelming desire to make an impact on their first day. I remember telling Stephanie Davern, our VP of Sales for GiveSignup, that she was not allowed to do ANYTHING for the first 60 days. She just needed to learn. It mostly worked 🙂
Keeping Up – Our best people like Matt Sinclair set an example for others to continuously learn. We come out with a LOT of new features, and it is tough to keep up. It takes a real investment of time and effort by all of our employees to understand the new features as they come out.
Our company is designed to continuously improve. It is kind of like compounding interest that keeps getting larger and larger as time passes – Bryan’s 10 year overnight success…
This consistent effort on a day to day basis is one of the things that makes us different and hopefully better.