Immutable AWS Deployment Pipeline
For impatient readers, there’s a diagram below that shows the whole deployment pipeline. I would still suggest reading the post for a deeper understanding.
Many organizations make the mistake of not leveraging the Amazon Machine Image (AMI) for AWS deployment. The most common deployment strategy is to provision new nodes from top to bottom as the nodes are being launched. Provisioning just-in-time can lead to slow and brittle deployment cycles. Running system updates, downloading packages and setting up configurations can take a very long time. What’s worse is that this time is wasted for every machine you provision in AWS. I have seen machines take more than 30 minutes to become useful. If anything goes wrong during provisioning, the machine will not function as expected, which leads to brittle deploys. One way to solve these problems is to deploy via AMI.
The AMI deployment strategy has been perceived as an unmanageable manual process, and the bigger issue is how to update a running system. Those concerns are valid, but it doesn’t have to be that way. Bundling software into an Amazon Machine Image (AMI) is by far the most reliable and fastest way to deploy applications on Amazon Web Services (AWS). The unmanageable manual process can be eliminated with automation. If the system needs to be updated, a new AMI can be built and deployed side-by-side with the old nodes, receiving only a portion of the traffic. Once the new nodes have been proven to function correctly, the old nodes are decommissioned. This is typically referred to as a Canary Deployment, and it also removes the need for any downtime during deployments.
The AMI is considered immutable because once it is built, its configuration is not changed (at least not by human intervention). To release the next version of the software, a new AMI is built from a clean base, not from the previous version. In this post, I will provide a high-level description of all the components necessary to build an “Immutable AMI Deployment Pipeline”. What follows is one way to build this deployment pipeline; there are probably many ways to achieve the same outcome.
The Setup:
Source Control
In the context of a deployment pipeline, source control provides a way for developers to transmit code to a known location so the software can be built by the packager (the next step). The most important decision here is which branch of the software the deployment pipeline will build from.
Packager
This step pulls the bits from source control and packages up all of the software automatically. The easiest way is to inject a custom tool at the end of your Continuous Integration (CI) runs. I recommend using your distribution's package type to package up the software. I usually like to use fpm (https://github.com/jordansissel/fpm) to build my packages. It is very flexible in terms of what it can build and is easy to get started with. All the heavy lifting of getting the application running should be done at this step. For example, if the application requires an upstart file, this tool should be able to construct one on the fly. Dependency management is another common task. If the app requires nginx, the tool should be able to "include" it. This can be as easy as depending on another package, or it can be more complicated, as in running a Chef script. The most important part of this step is that it versions the package properly so it is clear what has been deployed. Depending on the complexity of the software, this step might need some kind of metadata configuration file in order to glue it all together. I typically use a YAML file that is versioned with the software to provide the hints.
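As a rough illustration of the packaging step, here is a minimal Python sketch that turns a metadata description into an fpm invocation. The metadata keys (name, version, type, depends), the build number and the build/ directory are assumptions made up for this example; in practice the dictionary would be loaded from the YAML file that is versioned with the software. The sketch only builds the command line, it does not run fpm.

```python
import shlex


def fpm_command(meta, build_number, source_dir):
    """Build an fpm argument list from package metadata (hypothetical schema)."""
    cmd = [
        "fpm",
        "-s", "dir",                       # source type: a plain directory
        "-t", meta.get("type", "rpm"),     # target package type
        "-n", meta["name"],                # package name
        "-v", meta["version"],             # software version
        "--iteration", str(build_number),  # CI build number distinguishes rebuilds
        "-C", source_dir,                  # change to this directory before packaging
    ]
    for dep in meta.get("depends", []):
        cmd += ["-d", dep]                 # declare a dependency on another package
    cmd.append(".")                        # package everything under source_dir
    return cmd


# Assumed metadata; normally read from a file checked in next to the code.
meta = {"name": "myapp", "version": "1.4.2", "depends": ["nginx"]}
print(shlex.join(fpm_command(meta, build_number=37, source_dir="build/")))
```

Putting the CI build number in the package iteration is one way to keep two builds of the same version distinguishable, so it stays clear exactly what has been deployed.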
Artifact Repository
Once the artifact has been built, it needs to be stored in a location where it can be retrieved for installation. The repository also serves as a catalog of all the software that has been built and released. If you are using RPM as your package type, it makes sense to store the packages in a yum repository. However, it can be as simple as a file server.
AMI Provisioner
The AMI provisioner is a tool that provisions an instance, installs the necessary software, then creates an AMI at the end of the run. I typically use Chef solo (or the like) to provision the instance to a point where the target software package can be installed on top of it. For example, if you are running a Java application, Java will be installed via Chef before the target package. Once all the software has been installed, an AMI needs to be created. This can be done by using the AWS SDK. It is also possible to use an open source AMI creation tool such as Aminator (https://github.com/Netflix/aminator) if you choose not to roll your own. At the end of the run, the tool should have created an AMI whose name and version clearly identify the software it contains.
This tool should create the AMI in the development AWS account, then grant the production account access to it. I will talk more about this in the next step.
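The create-and-share part of the provisioner could look something like the Python sketch below. It assumes the ec2 argument behaves like a boto3 EC2 client and uses its create_image, image_available waiter and modify_image_attribute calls. The naming scheme, the function names and the account ID shown in the usage note are hypothetical, not something prescribed by this post.

```python
def ami_name(app, version, build):
    """Encode application, version and build in the AMI name (assumed scheme)."""
    return f"{app}-{version}-{build}"


def bake_and_share(ec2, instance_id, app, version, build, prod_account_id):
    """Create an AMI from a provisioned instance and share it with production.

    `ec2` is expected to behave like a boto3 EC2 client created with
    development-account credentials.
    """
    image = ec2.create_image(
        InstanceId=instance_id,
        Name=ami_name(app, version, build),
        Description=f"{app} {version} (build {build})",
    )
    image_id = image["ImageId"]

    # Wait until the image is usable before changing its permissions.
    ec2.get_waiter("image_available").wait(ImageIds=[image_id])

    # Grant launch permission to the production account.
    ec2.modify_image_attribute(
        ImageId=image_id,
        LaunchPermission={"Add": [{"UserId": prod_account_id}]},
    )
    return image_id
```

With boto3 installed, a call might look like bake_and_share(boto3.client("ec2"), "i-0abc...", "myapp", "1.4.2", 37, "111122223333"), where the instance ID and account ID are placeholders. Passing the client in as a parameter keeps the function easy to exercise with a stub.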
AWS EC2 Environments
Before the software can be released to production it should be tested. A separate AWS account is recommended in order to provide isolation from production nodes. The previous step should have built an AMI that is available to two AWS accounts (production and development). Depending on the organization structure, the developers might only have access to the development AWS account since you might not want everyone to be able to mess around in production. This typically applies to larger organizations with a separate team for the production environment. Regardless of your organization structure, two AWS accounts should be used for complete isolation. This way, developers have complete freedom to experiment in the development account.
Deployment Orchestrator
We need a tool to launch the AMI in the development and production accounts. I also recommend launching all your services inside an Auto Scaling Group (ASG) even if you don’t plan to scale up and down. There are many cases where nodes might be terminated unexpectedly. Using an ASG ensures that any terminated instances get replaced automatically.
Hopefully your software is designed to distribute the load across multiple nodes. Once the AMI has been deployed inside an ASG, the new software can easily be scaled up and the old software scaled down. This will be another tool that interfaces with the AWS SDK to create the ASG in an automated fashion. The tool will also properly name and tag the ASG so it is clear what software has been deployed. If CloudWatch (or some other alerting system) is used, you should set up the proper alerting at this step.
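To make the scale-up and scale-down idea concrete, here is a small Python sketch of how an orchestrator might plan the capacity shift between the old ASG and the new one. It deliberately leaves out the AWS calls. The naming and tagging scheme, the function names and the default canary and step sizes are assumptions for illustration only.

```python
def asg_identity(app, version, ami_id):
    """Name and tags that record what an ASG is running (hypothetical scheme)."""
    name = f"{app}-{version}"
    tags = {"app": app, "version": version, "ami": ami_id}
    return name, tags


def shift_plan(total, canary=1, step=2):
    """Return (new_nodes, old_nodes) capacity pairs for a canary rollout.

    The first pair adds the canary nodes next to the full old fleet. Each
    later pair grows the new ASG by `step` and shrinks the old one so the
    combined capacity stays at `total`, ending with the old ASG at zero.
    """
    if total < 1 or step < 1 or not 1 <= canary <= total:
        raise ValueError("need total >= 1, step >= 1 and 1 <= canary <= total")
    plan = [(canary, total)]
    new = canary
    while new < total:
        new = min(total, new + step)
        plan.append((new, total - new))
    if plan[-1][1] != 0:
        plan.append((total, 0))  # canary already covered the fleet; drain old
    return plan


name, tags = asg_identity("myapp", "1.4.2", "ami-0123456789abcdef0")
for new_nodes, old_nodes in shift_plan(total=4):
    print(f"{name}: new={new_nodes} old={old_nodes}")
```

A real tool would check the health of the new nodes between steps and stop, or scale the old ASG back up, if the canary misbehaves.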
It is also common to inject some environment-specific configuration in this step. AWS provides an easy way, called userdata, to run arbitrary scripts when an instance boots. If you have a lot of environments or need to change the configuration at runtime, this might not be the best option. It is better for the application to have a way to retrieve its configuration dynamically. This can be done by calling some external service before the application is fully booted. The application then monitors the external service in order to pick up new configuration values. A configuration service is most likely not required to get the deployment pipeline going.
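One possible shape for the dynamic configuration approach is sketched below in Python. The DynamicConfig class and its fetch callable are hypothetical; fetch stands in for whatever call retrieves the current values from the external service. The first fetch happens before the application finishes booting, and a background thread then polls for changes.

```python
import threading


class DynamicConfig:
    """Holds configuration fetched from an external source and keeps it fresh."""

    def __init__(self, fetch, interval=30.0):
        self._fetch = fetch              # callable returning a dict of settings
        self._interval = interval        # seconds between polls
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._values = dict(fetch())     # load once before the app is fully booted

    def get(self, key, default=None):
        with self._lock:
            return self._values.get(key, default)

    def refresh(self):
        """Fetch again; return True if any value changed."""
        new = dict(self._fetch())
        with self._lock:
            changed = new != self._values
            self._values = new
        return changed

    def watch(self):
        """Poll in a background thread until stop() is called."""
        def loop():
            while not self._stop.wait(self._interval):
                try:
                    self.refresh()
                except Exception:
                    pass                 # keep the last known good values
        thread = threading.Thread(target=loop, daemon=True)
        thread.start()
        return thread

    def stop(self):
        self._stop.set()


# Stand-in for the external service: a dict that can change underneath us.
source = {"db_host": "db.dev.internal"}
config = DynamicConfig(lambda: source)
source["db_host"] = "db.prod.internal"
config.refresh()
print(config.get("db_host"))
```

Keeping the last known good values when a poll fails means a brief outage of the configuration service does not take the application down with it.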
If you do not wish to build something custom, Asgard (https://github.com/Netflix/asgard) may be used at this step. Asgard may be too opinionated since it was designed to deploy Netflix services, but it is worth checking out as a possible solution.
The Complete Pipeline
Once all the pieces are in place and glued together you have a complete pipeline. Any developer (or anyone, if you have a fancy UI) should be able to deploy a version of the software to production with minimal effort and risk. In order for new software to get out of the door, new nodes must be provisioned. This provides an easy way to roll back to the previous version and avoids any manual cruft that might have accumulated over time. Any manual manipulation of the nodes will be wiped out in the following release. The only real way for those changes to stick is to include them as part of the pipeline. This is a high-level overview of how a deployment pipeline can be constructed. Some level of engineering is required, but it will enable you to deliver value to your end users quickly and in a robust fashion.
If you have any suggestions or questions, feel free to drop me an email: [email protected]
If you are interested in building a deployment pipeline like this for your organization please email: [email protected]. We can help.
Written by Aaron Feng
Aaron Feng is the founder of forty9ten LLC. He is a passionate Software Engineer with a special interest in cloud based infrastructure and DevOps. He has organized various tech groups since 2007, but is most well known for Philly Lambda. He is currently organizing DockerATL. Twitter: @aaronfeng Email: [email protected]














