Performance Tuning of Web Apps
There are thousands of blogs and books on performance optimization. Yet, I thought it might be appropriate to put in some of my learning into a blog. So here it goes, a starter guide to performance tuning of web apps.
Simple Page Specific Optimizations
There are a lot of simple optimizations that can be done at the page level. The simple optimizations are
Load fewer resources (such as images, css, js) or put all of them into a single js or cs file.
Optimize image sizes, if possible use sprites.
Load only the JS or CSS necessary
Use tools like YSLOW to test page performance and optimize
Cache all static resources (more on this later).
Don't serve static assets through your application server.
Do not use Rails, Play or any other app sever to serve static assets. Not even Apache. Instead use Nginx or other reverse proxies which can handle this with ease, cache them (on client browsers) and Gzip them whenever possible.
Nginx code to serve static assets given below:
location ~* ^.+\.(jpg|jpeg|gif|png|ico|css|zip|js|mov|html)$ {
//If the file is not available, route the request to your app server configuration
if (!-f $request_filename) {
proxy_pass http://yourdynamicserver;
Compress your files using Gzip in your response.
Often, reverse proxies like nginx offer simple GZip compression where data is sent in a compressed format. Most modern browsers support this and it can be enabled using simple configuration, example for Nginx is given below
# output compression saves bandwidth
gzip_proxied expired no-cache no-store private auth;
gzip_types text/plain; #application/xml text/html text/css text/javascript;
HTTP Limitation 2-6 connections per domain:
HTTP 1.1 spec limits to about 2 connections per domain which means simultaneous requests are handled in a FIFO queue per domain. One way to tune this (disclaimer: I have not specifically tried this) is to have different requests coming from different domain within your page.
http://www.stevesouders.com/blog/2008/03/20/roundup-on-parallel-connections/
Dynamic apps are mostly connected to databases. Or are they?
Most people who start building simple applications start building dynamic database connected applications. My suggestion: don't build Dynamic Pages unless you have to: Let’s consider a few use cases. You have a news website. Guess how often news needs to be updated?. Not very often. If so, your app should be a CMS that outputs HTML files that are served statically. These caches could be used or even other kinds of caching could be used to speed up your web app.
A cache is typically a temporary data storage area on a disk or in Memory RAM.
Caches can be used to speed up web applications.
A brief Overview of Memcached:
Memcached is a distributed in memory store for storing and retrieving data really fast. Read more about memcached here: http://www.slideshare.net/azifali/memcached-presentation-5729628
HTTP Caching also sometimes referred as page is the process of putting a cache in front of your application server and caching pieces of your application. There are caches like Varnish, Squid available. For a detailed article on HTTP caching, visit the article: http://blog.octo.com/en/http-caching-with-nginx-and-memcached/. I will however talk about some form of page caching below using file system and using memcached.
The key point to consider in building scalable web apps is to minimize computing power used per request.
Caching And Serving Files using the File System:
The easiest thing to do in situations where the content is the same for all users is to cache the entire page and deliver it via the file system.
Caching And Serving Files using the Nginx and Memcached:
You can also use the above pattern to store data into memcached and force nginx to directly connect to memcached and serve the files. The result will be much faster than fetching files from a file system. Remember, the ‘HTML Generator’ is a component that you will need to write. This component will fetch results and store it into Memcached with the file name as a key. We can then modify the nginx configuration to look into memcached first if a request comes in. If the file is not available into memcached, then the request could be passed to the file system or the underlying application server.
location ~* \.(html)$ {
access_log off;
expires max;
add_header Last-Modified "Thu, 26 Mar 2000 17:35:45 GMT";
set $memcached_key $uri;
memcached_pass 127.0.0.1:11211;
error_page 404 = /fetch;
}
location /fetch {
internal;
access_log off;
expires max;
add_header Last-Modified "Thu, 26 Mar 2000 17:35:45 GMT";
proxy_pass http://backend;
break;
}
Assuming that there are scenarios where the entire page cannot be cached for all users. In such scenarios, it is better to apply caching to cache a specific data set, a specific part of the page, a recomputed result and so forth. This cache could be specific for each user or a common piece that could be used for all users.
A cache is typically an in-memory system such as memcached.
Example: Loading some dataset into memory using nodeJS code:
var client = new memcache.Client(11211, '127.0.0.1');
connection.query("SELECT business_name,id from business", function (error, rows, fields) {
client.set(rows[i].id,rows[i].business_name ) function(error, result){
A partial segment could be a table that is repeatedly shown to the users. This could be generated and stored into memcached for use anytime later..
Example: Creating a list of links and storing it into memcached using Javascript / NodeJS:
news=news+items[i]+”</br>”;
memcachedvariable.set(“news_links”,news); //set news links into memcached.
Database Performance Tuning
For a detailed guide on MySQL performance mistakes, please read: http://www.slideshare.net/techdude/how-to-kill-mysql-performance.
Note: Some tips from this slideshow have been compiled into these notes.
Databases are generally harder to tune and the simple reason is because databases can be tuned differently for different use cases. For example a high volume read database tends needs different optimization than a high write database.
There are no exact sets of things to do in tuning a database, but I will try to list some of the most basic things that you would do in order to ensure that the database is up and running well.
Ensure that you don't have any joins while fetching data in a web app. If you are looking up data from a table or various tables, ensure that the data is available in a single data structure (de-normalize if necessary). When running joins, ensure that there is enough Join buffers and other query buffer space available (more on this in the variables section).
Fetch Limited Datasets: Don't write queries that look like this, " select * from entitites" and try to reduce the amount of items displayed in your code. You're still forcing the database to return a full dataset. Write queries which have some limits on resultsets fetched.."Select entity_name, description from entities limit 10 order by id asc" is more better.
Use the right storage engine. For most parts MyIsam works but there are different storage engines available for different purposes. Use what is right for your application use case. If your application needs high read performance, MyIsam engine would be suitable. On the contrary if you’re looking for high write performance, then Innodb is recommended. If you're using MySQL MyIsam as the engine for storing data and have the key cached enabled, watch out for frequent updates as this invalidates the Key cache which would make reads significantly slower. Also watch out for write lock contention.
For Both Innodb and MyIsam (both which are commonly used engines) , ensure that your buffer variables have enough memory allocated. The default configuration that ships with MySQL won't do and many people make the mistake of assuming that this is well taken care of. And memory does not refer to a single variable but a range of variables that can be configured at setup (Server Variables) and those which can be monitored at run time (Status Variables).
Use Explain statement to understand and optimize your queries. Generally using an ORM layer abstracts queries against the database, if so, look out for the slow query log with the list of queries that are slowing down your app.
Have enough indexes on the right columns: Having too few or too many indexes are both a problem and can lead to performance issues.
Ensure that your database data-types are of the correct datatype. Ensure that the size is absolutely minimal: This goes a long way in helping optimize the amount of storage required and the speed of queries. For example, do you need to have a char[255] when you could store it in char[2]?.
Please note: Most of these notes are from MySQL Help with some explanation where necessary
innodb_flush_log_at_trx_commit:
et this to 0 if you have a high write environment else set it to one. By setting it to 0 you're asking innodb to write values to the innodb log and then to the file once a second instead of it happening on a per transaction basis.
innodb_additional_mem_pool_size:
The size in bytes of a memory pool InnoDB uses to store data dictionary information and other internal data structures. The more tables you have in your application, the more memory you need to allocate here. If InnoDB runs out of memory in this pool, it starts to allocate memory from the operating system and writes warning messages to the MySQL error log. The default value is 1MB.
The size in bytes of the memory buffer InnoDB uses to cache data and indexes of its tables. The default value is 8MB. The larger you set this value, the less disk I/O is needed to access data in tables.
innodb_commit_concurrency:
The number of threads that can commit at the same time. A value of 0 (the default) permits any number of transactions to commit simultaneously
Enable a seperate file per table using this variable.
innodb_lock_wait_timeout:
The timeout in seconds an InnoDB transaction may wait for a row lock before giving up. The default value is 50 seconds. A transaction that tries to access a row that is locked by another InnoDB transaction will hang for at most this many seconds before issuing the following error:
transaction-isolation:
InnoDB supports each of the transaction isolation levels described here using different locking strategies. You can enforce a high degree of consistency with the default REPEATABLE READ level, for operations on crucial data where ACID compliance is important. Otherwise READ COMMITTED works for most use cases.
For locking reads (SELECT with FOR UPDATE or LOCK IN SHARE MODE), InnoDB locks only index records, not the gaps before them, and thus permits the free insertion of new records next to locked records
The size in bytes of each log file in a log group. The default value is 5MB. The larger the value, the less checkpoint flush activity is needed in the buffer pool, saving disk I/O. But larger log files also mean that recovery is slower in case of a crash.
Most importantly ensure that queries are well tested and enough indexes are available using the "Explain" command.
innodb_thread_concurrency:
InnoDB tries to keep the number of operating system threads concurrently inside InnoDB less than or equal to the limit given by this variable. Once the number of threads reaches this limit, additional threads are placed into a wait state within a FIFO queue for execution. Threads waiting for locks are not counted in the number of concurrently executing threads.
This variable decides how the innodb data is written to the disk. Recommend O_DIRECT except in the case of SAN based storage.
innodb_lock_wait_timeout:
Default value is 50 seconds. If you want your app to respond faster in case of write locks, set this value lower. Note, that this means that data consistency issues will occur.
Key status variables to watch out for:
There are a number of variables that one needs to watch out while running your MySQL Databases. These are called STATUS variables. Here are a few important STATUS variables to watch out for:
The number of internal on-disk temporary tables created by the server while executing statements.
If an internal temporary table is created initially as an in-memory table but becomes too large, MySQL automatically converts it to an on-disk table. The maximum size for in-memory temporary tables is the minimum of the tmp_table_size and max_heap_table_size values. If Created_tmp_disk_tables is large, you may want to increase the tmp_table_size or max_heap_table_size values. value to lessen the likelihood that internal temporary tables in memory will be converted to on-disk tables.
The number of requests to read a row based on a fixed position. This value is high if you are doing a lot of queries that require sorting of the result. You probably have a lot of queries that require MySQL to scan entire tables or you have joins that do not use keys properly.
The number of requests to read the next row in the data file. This value is high if you are doing a lot of table scans. Generally this suggests that your tables are not properly indexed or that your queries are not written to take advantage of the indexes you have.
Innodb_buffer_pool_read_ahead_rnd:
The number of “random” read-aheads initiated by InnoDB. This happens when a query scans a large portion of a table but in random order.
The total time spent in acquiring row locks, in milliseconds.
Innodb_row_lock_time_avg:
The average time to acquire a row lock, in milliseconds. If this value is high it means your queries are waiting and database needs optimization.
The number of query cache hits. If this is high, it is actually good…if you’re reading same data frequently and if this value is low, then check if the query cache is enabled or if the queries are getting written to the cache.
The number of queries added to the query cache. If this value is high and increasing frequently, cache invalidation is high. If so, try to optimize your queries to not to write to the query cache by setting query_cache=0 in the MySQL Configuration file which is loaded at startup or to 2 to enable query cache only for queries that begin with that begin with SELECT SQL_CACHE.
Write asynchronous, non-blocking code wherever possible: As with all apps, a user is typically held up when an IO happens and people don't realize how many place IO blocks can happen.
I don’t think this is a complete list of how web page optimization is done and I believe that there are more modern techniques in using asynchronous approaches to write high performing web apps.
However, I hope that this list has been a good starter guide to a small startup to help them get started with their optimization off the ground. If you have comments or questions, please write to me on twitter a @azifali or send me an email to asif.ali [at] outlook.com.