Before we kicked off our Reinvent initiative (quick refresher on Reinvent: transforming our core monolithic application into a set of smaller, nimbler microservices running on AWS), we had already built something that resembled a Continuous Integration (CI) framework; it was Jenkins-based, and running on AWS because we didn’t want to build CI capacity within our co-los. Let’s call this Zuora’s legacy CI framework (Legacy CI).
However, with the explosion of tens of teams owning the destinies (design, build, deploy, monitor) of tens of microservices, we were presented with the perfect opportunity to revisit our Legacy CI and build something that would scale, and address the needs of our new microservice world.
Enter Homer. Besides being an author who has multiple New York Times best sellers, legend has it that he also likes donuts. Homer is also now the code name for our latest, greatest, fully automated CI framework/pipeline, which is used by all microservice teams.
Homer allows teams to use a centralized portal (UI) to provision and manage Jenkins clusters; namely, unique pairs of Jenkins masters and corresponding slaves, as well as automated integration between the two.
Homer is implemented on AWS and supports a wide range of features that developers have asked for. While we are in the process of putting together a more detailed blog post describing Homer, we’d like to share one of the more nifty things Homer does: It provides CI environments on the cheap! It does so by provisioning CI environments using AWS Spot instances.
What is Spot?
Spot instances are spare EC2 instance capacity that AWS makes available for customers to bid on in a marketplace. In some cases, one can save up to 90% of the on-demand hourly price of an EC2 instance by procuring a Spot instance in the Spot market.
The main caveat, however, is that your Spot instance can be interrupted (i.e. terminated) for a number of reasons, for example when your bid price falls below the Spot market price or when AWS needs the EC2 capacity back. There are ways to mitigate these terminations (including our solution described below), and, judging by recent announcements from AWS, there appears to be a continuous rollout of improvements that make Spot instances more attractive for workloads that can tolerate terminations.
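To make the price-based interruption rule and the savings claim concrete, here is a minimal sketch. The prices are made up for illustration, the function names are hypothetical, and the model deliberately ignores interruptions caused by AWS reclaiming capacity.

```python
from typing import Optional, Sequence


def first_interruption(bid_price: float, market_prices: Sequence[float]) -> Optional[int]:
    """Return the index of the first period in which the Spot market price
    rises above the bid, or None if the bid holds for the whole series."""
    for period, market_price in enumerate(market_prices):
        if market_price > bid_price:
            return period
    return None


def savings_percent(on_demand_price: float, spot_price: float) -> float:
    """Percentage saved by paying spot_price instead of on_demand_price."""
    return (1 - spot_price / on_demand_price) * 100


# Hypothetical hourly prices in USD, for illustration only.
hourly_market = [0.031, 0.029, 0.035, 0.052, 0.030]

print(first_interruption(0.04, hourly_market))  # bid is exceeded in period 3
print(first_interruption(0.06, hourly_market))  # None: this bid is never exceeded
print(round(savings_percent(0.10, 0.03), 1))    # 70.0
```

A higher bid lowers the odds of a price-based interruption, at the cost of a higher ceiling on what you might pay.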
FWIW - Spot instances are a great way to perform Chaos Monkey testing on your service :-)
Homer and Spot Fleet FTW!
How does all of this tie back into Homer?
In addition to allowing end-users to specify on-demand EC2 instances for their CI environments, Homer leverages the free EC2 Spot Plugin for Jenkins to allow end users to launch CI environments that run on top of Spot Fleets.
With Spot Fleets, Homer launches the number of Spot Instances that are required to meet the target capacity for the microservice team’s needs. Spot Fleet also attempts to maintain the target capacity if Spot Instances are interrupted due to a change in Spot prices or available capacity.
Internal end-users are also able to use Homer to customize their CI environment's Spot Fleet configuration. This means teams can customize their Spot Fleet instance types and sizes, bid price, and availability zones to meet their specific needs.
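As a rough illustration of what such a customization boils down to, the sketch below assembles a Spot Fleet request from the knobs mentioned above. This is not Homer's actual code: the helper name, the ARN, the AMI ID and the values are placeholders. The dictionary keys follow the shape of the EC2 Spot Fleet request configuration, so the result could be passed to boto3's `request_spot_fleet(SpotFleetRequestConfig=...)`; the sketch itself makes no AWS call.

```python
from typing import Any, Dict, List


def build_spot_fleet_config(
    fleet_role_arn: str,
    image_id: str,
    instance_types: List[str],
    availability_zones: List[str],
    bid_price: str,
    target_capacity: int,
) -> Dict[str, Any]:
    """Build a Spot Fleet request config with one launch specification
    per (instance type, availability zone) combination."""
    launch_specs = [
        {
            "ImageId": image_id,
            "InstanceType": instance_type,
            "Placement": {"AvailabilityZone": zone},
        }
        for instance_type in instance_types
        for zone in availability_zones
    ]
    return {
        "IamFleetRole": fleet_role_arn,
        "SpotPrice": bid_price,          # maximum hourly bid, as a string
        "TargetCapacity": target_capacity,
        "Type": "maintain",              # replace interrupted instances
        "LaunchSpecifications": launch_specs,
    }


# Placeholder values, for illustration only.
config = build_spot_fleet_config(
    fleet_role_arn="arn:aws:iam::123456789012:role/example-spot-fleet-role",
    image_id="ami-00000000",
    instance_types=["m4.large", "c4.large"],
    availability_zones=["us-west-2a", "us-west-2b"],
    bid_price="0.05",
    target_capacity=4,
)
print(len(config["LaunchSpecifications"]))  # 4 launch specifications
```

Offering several instance types and zones gives the fleet more pools to draw from when one of them becomes expensive or runs short on capacity.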
Today we have about 17 different CI environments provisioned through Homer, with an 80/20 spread between Spot and On Demand instances being used. And as teams become more comfortable using Spot, we expect the ratio to favor a larger percentage of Spot instances in our Jenkins farm fleets. Hence a zero dollar CI Framework. OK, close to a zero dollar CI framework.
ps - all of the On Demand instances are also running on the cheap, being discounted by Reserved Instance (RI) coupons.
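For a feel of how an 80/20 Spot/On Demand split plus RI discounts affects the bill, here is a back-of-the-envelope sketch. Only the 80/20 split comes from our numbers above; the instance count, list price and both discount rates are invented for illustration, and the function name is hypothetical.

```python
def blended_hourly_cost(
    instances: int,
    spot_share: float,
    on_demand_price: float,
    spot_discount: float,
    ri_discount: float,
) -> float:
    """Hourly cost of a fleet where spot_share of the instances run on Spot
    and the rest run On Demand at an RI-discounted rate.
    Discounts are fractions of the On Demand list price (0.7 means 70% off)."""
    spot_instances = instances * spot_share
    on_demand_instances = instances - spot_instances
    spot_cost = spot_instances * on_demand_price * (1 - spot_discount)
    on_demand_cost = on_demand_instances * on_demand_price * (1 - ri_discount)
    return spot_cost + on_demand_cost


# Hypothetical inputs: 10 instances, $0.10/hour list price,
# 70% Spot discount, 40% RI discount.
full_price = 10 * 0.10
blended = blended_hourly_cost(10, 0.8, 0.10, 0.7, 0.4)
print(round(blended, 2))                           # 0.36
print(round((1 - blended / full_price) * 100, 1))  # 64.0 (percent saved)
```

Under these assumed rates the fleet costs roughly a third of list price, and the figure drops further as the Spot share grows.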
Over the coming week, Zuora will be performing a maintenance operation in our Production Copy Environments.
During the operation, environments will be migrated to a more modern computing platform. Maintenance will be conducted outside of business hours for customers, based on their geographical location (e.g. US, EMEA). We anticipate minimal interruption to Production Copy Environments during the maintenance operation.
Thank you for your patience as we improve our services.
We have started the maintenance operation but it will take longer than originally estimated.
We expect the maintenance operation to complete around 2:30pm PST, and for service degradations on real-time sync to last about an hour after that, until 3:30pm PST.
We plan on performing a maintenance operation in our Performance Test Environment on Monday, February 8, 2016 at 12pm PST.
A subset of customers using SFDC Real Time Sync in the environment will experience a service degradation as their real-time operations fall back to batched sync mode.
We expect the service degradation to last no more than an hour, ending at 1pm PST.