Deploying a Large-Scale Infrastructure with Amazon AWS
Things are bound to fail in large-scale deployments, whatever the configuration of the infrastructure, and even more so when you are running virtual servers “in the cloud,” outside your sphere of influence. So you must be prepared for things to fail. Why is this a good thing? It forces you to think ahead about failure scenarios and to design your IT infrastructure in a way that minimizes single points of failure.
Overall, I have used, implemented, architected and managed multiple global enterprise-grade cloud systems, and I have been impressed with the reliability of Amazon’s EC2. Like many other professionals, I did not always know what to expect, but I was pleasantly surprised. Very rarely does an EC2 instance fail completely. As a matter of fact, I cannot recall ever seeing a complete failure. I have seen a few instances in a deteriorated state. Using monitoring systems, I have generally received a heads-up via email or SMS, allowing me enough time to mitigate the risk proactively rather than reactively.
There is absolutely nothing wrong with expecting things to fail at any time. I see this as a primary differentiator separating the inexperienced from seasoned leaders. This mentality requires you to implement systems more cautiously, with full redundancy, scalability, emergency fail-over and, most importantly, fully tested backup and recovery solutions. When architecting and planning, consider the following:
Fully Automate Infrastructure Deployments
Consider for a moment that you might be dealing with tens or hundreds of virtual server instances that need to scale up and down, either on demand or, better yet, fully automatically. Server instances are not the only thing affected by this. You also need to consider additional infrastructure components such as load balancers, storage, configurations and application settings.
This can be achieved using custom scripts that automatically build each and every server instance. Starting with either a custom-created image (AMI) or a standard “vanilla” core image (AMI), you can roll out instances that contain just enough bootstrap code to run the custom configuration scripts. Services like RightScale or Scalr help this process tremendously. Simply by specifying what “role” an instance belongs to (for example ‘web-server’ or ‘master-db-server’ or ‘slave-db-server’), you make it run the script that belongs to that specific role. In this script you can install everything the instance needs to do its job, including prerequisite packages and the actual application with all its necessary configuration files.
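To make the role idea concrete, here is a minimal sketch of how bootstrap code might map a role name to an ordered list of setup steps. The step descriptions and the shared COMMON_STEPS list are my own assumptions for illustration; a real script would run package installs and write configuration files instead of returning strings.

```python
# Hypothetical mapping from role name to setup steps. The role names come
# from the article; the step descriptions are illustrative assumptions.
ROLE_STEPS = {
    "web-server": [
        "install httpd",
        "deploy application code",
        "write vhost config",
    ],
    "master-db-server": [
        "install mysql-server",
        "write my.cnf (master)",
        "create replication user",
    ],
    "slave-db-server": [
        "install mysql-server",
        "write my.cnf (slave)",
        "start replication",
    ],
}

# Steps every instance runs regardless of role (also an assumption).
COMMON_STEPS = ["install monitoring agent", "register instance in inventory"]


def build_plan(role):
    """Return the ordered list of bootstrap steps for the given role."""
    if role not in ROLE_STEPS:
        raise ValueError(f"unknown role: {role!r}")
    return ROLE_STEPS[role] + COMMON_STEPS


if __name__ == "__main__":
    for step in build_plan("web-server"):
        print(step)
```

Failing loudly on an unknown role is deliberate: an instance that boots with a mistyped role should stop rather than come up half-configured.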
Fully automating your deployments truly enables you to scale, whether to a defined limit or ‘infinitely’. To take advantage of full automation, the application infrastructure needs to be designed in a way that allows this type of scaling. Regardless, having the necessary building blocks for automatically deploying any type of server that you need is invaluable.
When scaling, it is highly recommended that you keep information about your deployed instances in a database. This lets you write tools that inspect the database and generate the necessary configuration files (such as role configuration files) and other text files such as DNS zone files. This database becomes the one true source of information about the infrastructure.
Speaking of DNS, specifically in the context of Amazon EC2, it’s worth rolling out your own internal DNS servers, with zones that aren’t even registered publicly, but for which your internal DNS servers are authoritative. Then all communication within the EC2 cloud can happen via internal DNS names, as opposed to IP addresses. Trust me, your tired brain will thank you. This would be very hard to achieve though if you were to manually edit BIND zone files. One approach is to automatically generate those files from the master database mentioned above.
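As a rough sketch of generating a zone file from the inventory database, the snippet below uses an in-memory SQLite table as a stand-in for the master database. The table layout, the zone name prod.internal, the host names, the IP addresses and the SOA timer values are all assumptions made up for the example.

```python
import sqlite3


def demo_inventory():
    """Build a throwaway in-memory inventory (hypothetical schema and data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE instances ("
        "hostname TEXT PRIMARY KEY, role TEXT, private_ip TEXT)"
    )
    conn.executemany(
        "INSERT INTO instances VALUES (?, ?, ?)",
        [
            ("ns1", "dns-server", "10.0.0.2"),
            ("web01", "web-server", "10.0.0.11"),
            ("web02", "web-server", "10.0.0.12"),
            ("db01", "master-db-server", "10.0.0.21"),
        ],
    )
    return conn


def render_zone(conn, zone, serial):
    """Render a BIND-style zone file with one A record per instance."""
    rows = conn.execute(
        "SELECT hostname, private_ip FROM instances ORDER BY hostname"
    ).fetchall()
    lines = [
        f"$ORIGIN {zone}.",
        "$TTL 300",
        # SOA fields: serial, refresh, retry, expire, minimum (example values).
        f"@ IN SOA ns1.{zone}. admin.{zone}. ( {serial} 3600 600 86400 300 )",
        f"@ IN NS ns1.{zone}.",
    ]
    for hostname, private_ip in rows:
        lines.append(f"{hostname} IN A {private_ip}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_zone(demo_inventory(), "prod.internal", 2024010101))
```

The serial number is passed in rather than computed because it has to increase on every regeneration for secondary DNS servers to pick up the change; how you track it is up to your tooling.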
While on the subject of fully automated deployments, one goal worth serious consideration is removing the need to ssh into any production server. When you deploy the server through automated scripts, the application gets installed automatically and monitoring agents get set up automatically, so there should really be no need to do anything manually on the server itself. If you want to make a change, you can create an additional script and push it to production. If the server misbehaves or gets out of line with the other servers, you simply terminate that server instance and launch another one. Since you have everything automated, it’s one command line for terminating the instance and another one for deploying a brand-new replacement. Simplicity at its finest!
Design Your Infrastructure With Automated Horizontal Scaling
Generally speaking, there are two ways to scale an infrastructure: vertically, by deploying your application on more powerful servers, and horizontally, by increasing the number of servers that support your application. For ‘infinite’ scaling in a cloud computing environment, you need to design your system infrastructure so that it scales horizontally. Otherwise you’re bound to hit the limits of individual servers, which you will find very hard to get past. Horizontal scaling also helps eliminate single points of failure and increases the availability and redundancy of the infrastructure.
A few ideas to consider when deploying a Web site with a database back-end so that it uses multiple tiers, with each tier being able to scale horizontally:
Deploy multiple Web servers behind one or more load balancers. This is pretty standard these days, and this tier is the easiest to scale. However, you also want to maximize the work done by each Web server, so you need to find the sweet spot of that particular type of server in terms of httpd processes it can handle. Too few processes and you’re wasting CPU/RAM on the server, too many and you’re overloading the server. You also need to be fully aware of the fact that each EC2 instance costs you money. It can become so easy to launch a new instance that you don’t necessarily think of getting the most out of the existing instances. Avoid the mistake of mis-managing or failing to monitor your resources unless you want to have a possible sticker shock when you get the bill from Amazon at the end of the month.
Implement Load Balancing. Amazon now offers load balancers at a very reasonable rate. The notion of Elastic Load Balancing, as recently brought to public attention by Amazon’s offering of the capability, is nothing new. The basic concept is pure Infrastructure 2.0, and the functionality offered via the API has been available on several application delivery controllers for many years. Elastic Load Balancing automatically distributes incoming application traffic across multiple Amazon EC2 instances. It enables you to achieve even greater fault tolerance in your applications, seamlessly providing the amount of load balancing capacity needed in response to incoming application traffic. Elastic Load Balancing detects unhealthy instances within a pool and automatically reroutes traffic to healthy instances until the unhealthy instances have been restored. Customers can enable Elastic Load Balancing within a single Availability Zone or across multiple zones for even more consistent application performance.
Deploy several database servers. If you’re using MySQL for example, you can set up a master DB server for writes, and multiple slave DB servers for reads. The slave DBs can sit behind a load balancer. In this scenario, you’re limited by the capacity of the single master DB server. One thing you can do is to use sharding techniques, meaning you can partition the database into multiple instances that each handle writes for a subset of your application domain. Another thing you can do is to write to local databases deployed on the Web servers, either in memory or on disk, and then periodically write to the master DB server (of course, this assumes that you don’t need that data right away; this technique is useful when you have to generate statistics or reports periodically for example).
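One simple way to partition writes is to hash a key and pick a shard from the result. The sketch below assumes the application domain is split by user ID and that there are four shards; the shard names are hypothetical, and hash-modulo routing is just one of several possible sharding schemes.

```python
import hashlib

# Hypothetical shard names; each would be its own master DB server.
SHARDS = ["db-shard-0", "db-shard-1", "db-shard-2", "db-shard-3"]


def shard_for(user_id, shards=SHARDS):
    """Pick the shard that handles writes for this user ID."""
    # Use a stable hash so every Web server routes the same key the same way.
    digest = hashlib.sha256(str(user_id).encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


if __name__ == "__main__":
    for uid in (1, 2, 3, "alice"):
        print(uid, "->", shard_for(uid))
```

Keep in mind that with plain modulo routing, changing the number of shards remaps most keys, so it pays to plan the shard count ahead or to use a lookup table or consistent hashing instead.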
Another way of dealing with databases is to not use them, or at least to avoid the overhead of making a database call each time you need something from the database. A common technique for this is to use memcache. Your application needs to be aware of memcache, but this is easy to implement in all of the popular programming languages. Once implemented, you can have your Web servers first check a value in memcache, and only if it’s not there have them hit the database. The more memory you give to the memcached process, the better off you are.
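The check-the-cache-first pattern described above can be sketched in a few lines. To keep the example self-contained, FakeCache is a tiny in-memory stand-in for a memcached client (only get and set), and fetch_user and load_from_db are hypothetical names; a real client library would replace FakeCache and would usually add an expiry time.

```python
class FakeCache:
    """In-memory stand-in for a memcached client (get/set only)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        # Return None on a miss, as memcached-style clients typically do.
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


def fetch_user(user_id, cache, load_from_db):
    """Return the user record, hitting the database only on a cache miss."""
    key = f"user:{user_id}"
    value = cache.get(key)
    if value is None:
        value = load_from_db(user_id)  # the expensive call we want to avoid
        cache.set(key, value)
    return value


if __name__ == "__main__":
    def slow_db_lookup(user_id):
        print("database hit for", user_id)
        return {"id": user_id, "name": "example"}

    cache = FakeCache()
    fetch_user(7, cache, slow_db_lookup)  # miss: goes to the database
    fetch_user(7, cache, slow_db_lookup)  # hit: served from the cache
```

Note that this sketch treats a stored None as a miss, and it never invalidates entries; when the underlying row changes, the application has to delete or overwrite the cached value.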
Clearly Define and Establish Measurable Goals
The most common reason for scaling an Internet infrastructure is to handle increased Web traffic. However, you need to keep in mind the quality of the user experience, which means keeping the response time of the pages you serve under a limit that meets, and hopefully surpasses, the user’s expectations. It is extremely useful to have a very simple script that measures the response time of certain pages and graphs it on a dashboard-type administration page. As you deploy more and more servers to keep up with the demands of increased traffic, always keep an eye on your end goal (i.e. keeping response time/latency under N milliseconds, where N will vary depending on your application). If you see spikes in the latency chart, take the proactive approach and act accordingly.
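A very simple measurement script along these lines might look like the sketch below. It uses only the Python standard library; the function names, the nearest-rank percentile method and the 200 ms budget in the usage note are my own choices for illustration, and the graphing side is left out.

```python
import math
import time
import urllib.request


def measure_ms(url, timeout=10.0):
    """Fetch a page once and return the elapsed time in milliseconds."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read()  # include the body download in the measurement
    return (time.perf_counter() - start) * 1000.0


def percentile(samples_ms, pct):
    """Nearest-rank percentile of a non-empty list of samples."""
    ordered = sorted(samples_ms)
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[rank - 1]


def over_budget(samples_ms, budget_ms):
    """Return the samples that exceeded the latency budget (N milliseconds)."""
    return [sample for sample in samples_ms if sample > budget_ms]
```

In practice you would call measure_ms on a schedule for a handful of key pages, store the samples, and alert when percentile(samples, 95) rises above your budget (for example 200 ms). Watching a high percentile rather than the average makes the spikes visible instead of smoothing them away.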