No matter the site, the traffic, the scope or the content, a tip of the hat should always go to the scale it needs to support. That’s not to say any effort should go into making it the most scalable site or webapp in the universe, but a little thought is certainly allowed. Of course, I’ve said plenty of times before and will continue to say: don’t fall into the trap of premature optimization. It’s dumb. In most cases, solving scalability issues once they actually exist is a trade-off most people are willing to make, as it will (generally) save time, money, sanity and morale in the long run. Regardless, I also believe there are simple steps everyone setting up a site can take. Not as a premature optimization, but as a logical, default method of installing your web tier so it’s optimized out of the box. When it’s a matter of combining proven pieces of technology and enabling their features to prevent needless overhead, it’s no longer premature, it’s simply expected.
Cache Statically, Proxy Dynamically
This is the first thing I always do when setting up a new web server: install my two preferred HTTP servers, nginx and Apache. The trick is to understand the strengths of both so that, combined, you get the best performance. I’m not going to get into every detail of each server, but here’s the high-level gist: nginx is great for static content, reverse proxying, caching and being a CPU-friendly, light HTTP server. Apache, on the other hand, is great at building dynamic content with just about any language through one extension or another. Apache could do everything, since its feature set covers everything nginx has to offer, but the downfall is that it has become the Jack of All Trades, Master of None. nginx concentrates on being a fast front-end server for old-school web content, which leaves Apache as the leader for dynamic content. How true that is depends heavily on the language being used, but it is generally the case with the ones I seem to use most.
The simple steps of setting this up are as follows. This assumes some of the following environment settings:
- Your document root is at /var/www/site/htdocs.
- Your static content (JS, CSS, media, html/text files, etc.) is located under /var/www/site/htdocs/static.
- With a few exceptions, we will assume everything else is dynamic content.
First, install nginx and Apache, then stop both so they can be reconfigured:
~$] sudo /etc/init.d/apache2 stop
~$] sudo /etc/init.d/nginx stop
Next, move the Apache install to a subordinate port:
~$] # choose your preferred port number below (this uses 8081 for no good reason)
~$] echo "Listen 127.0.0.1:8081" | sudo tee /etc/apache2/ports.conf
~$] # if your virtual hosts name port 80 explicitly, update them to match
Configure nginx to proxy, serve and cache static content
Alter the http block of the default configuration (typically /etc/nginx/nginx.conf):
http {
    include /etc/nginx/mime.types;
    default_type application/octet-stream;
    log_format main '$remote_addr - $remote_user [$time_local] $request '
                    '"$status" $body_bytes_sent "$http_referer" '
                    '"$http_user_agent" "$http_x_forwarded_for"';
    access_log /var/log/nginx/access.log;
    sendfile on;
    keepalive_timeout 65;
    tcp_nodelay on;
    gzip on;
    gzip_proxied any;
    gzip_types text/plain text/html text/css text/javascript;
    gzip_buffers 16 8k;
    server {
        listen 80;
        gzip on;
        # Catch-all prefix match: anything not matched by a more
        # specific location below is dynamic and goes to Apache
        location / {
            # Keep the port number below the same as the one you
            # defined earlier
            proxy_pass http://127.0.0.1:8081;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-Host $host;
            proxy_set_header Host $host;
        }
        # Static content, served by nginx straight from the document root
        location /static/ {
            root /var/www/site/htdocs/;
            expires 1d;
        }
        location /favicon.ico {
            root /var/www/site/htdocs/;
            expires 1d;
        }
    }
}
Now at a high level, this has accomplished:
- Moving apache to a different port.
- Serving static content from nginx, with an expiry header that lets browsers cache it for a day before asking again.
- Serving the favicon.ico from nginx.
- Proxying everything else to the apache server.
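To make the split concrete, here is a small Python sketch of how requests get divided under the layout assumed earlier (static content under /static, plus the favicon). It is only an illustration of the idea, not how nginx is implemented: `route`, `STATIC_PREFIXES` and `BACKEND` are made-up names, and the one-day `max-age` mirrors what `expires 1d` asks browsers to do.

```python
# Hypothetical illustration of the static/dynamic split; not nginx's real matching code.
STATIC_PREFIXES = ("/static/", "/favicon.ico")  # assumed static locations
BACKEND = "http://127.0.0.1:8081"               # the Apache port chosen earlier
ONE_DAY = 24 * 60 * 60                          # "expires 1d" expressed in seconds


def route(path):
    """Return (handler, extra_headers) for a request path."""
    if path.startswith(STATIC_PREFIXES):
        # Served straight from disk by the front end, cacheable for a day.
        return "static", {"Cache-Control": f"max-age={ONE_DAY}"}
    # Everything else is handed to the dynamic back end.
    return f"proxy {BACKEND}", {}


for p in ("/static/site.css", "/favicon.ico", "/index.php"):
    print(p, "->", route(p))
```

The point is simply that the cheap decision (is this a file on disk?) happens in the light front end, and only the requests that need real work ever reach Apache.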
Compress Your Output
The simplest, most widely adopted mechanism for settling any server down is to compress the output before you send it. Yes, this will take CPU cycles, but it will help prevent slow connections from dogging your site. Say, for example, John Doe is wading through your site at a blistering 2400 bits per second. It will take him six years to download an image, and for that entire time he is holding a socket and a process/thread on your server. Your server is therefore forced to open a new socket/process/thread to handle the next connection, which may also be slow. And the cycle continues. Of course, this battle is partly won by installing and configuring nginx, which handles this type of activity much better than Apache. Yet the waste is still needless, and compression is so extremely simple to set up that I would consider it a mistake not to. So simple, in fact, that if you followed the directions from earlier, it’s already done. You may or may not have caught this little snippet in the nginx conf mod:
gzip on;
gzip_proxied any;
gzip_types text/plain text/html text/css text/javascript;
gzip_buffers 16 8k;
It is that simple. Those few lines enable gzip compression, set up a generous buffer to reduce overhead, and tell nginx it is allowed to compress proxied content of the types listed. Good for you for following directions, you win.
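If you want a feel for what compression buys a slow visitor, here is a rough Python sketch. Everything in it is an assumption for illustration: the repetitive CSS is made up (real files compress less dramatically), the link speed is a hypothetical slow modem, and the timing ignores protocol overhead entirely.

```python
import gzip

# Made-up, repetitive CSS standing in for a real stylesheet.
css = ("body { margin: 0; padding: 0; font-family: sans-serif; }\n" * 200).encode("utf-8")

compressed = gzip.compress(css)

# Assumed link speed for a very slow visitor, in bits per second.
LINK_BITS_PER_SECOND = 2400


def seconds_on_wire(payload):
    """Rough transfer time: 8 bits per byte, no protocol overhead."""
    return len(payload) * 8 / LINK_BITS_PER_SECOND


print(f"raw:  {len(css)} bytes, {seconds_on_wire(css):.1f}s")
print(f"gzip: {len(compressed)} bytes, {seconds_on_wire(compressed):.1f}s")
```

Every second shaved off the transfer is a second sooner that the socket goes back into the pool.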
Minimize and Merge your Static Code Easily
This may be a shameless plug for one of my freely distributed modules, Apache2::Response::FileMerge (aka File Merge), but it’s worth noting, as it’s still one of the only Apache extensions I can find that will let you compress and merge multiple JS and CSS files into a single file. I’ve written about this here before, so I will keep it brief this time around. Basically, the concept is this: the fewer hits to your server, the less it has to do. Win. However, there are some details worth mentioning. I will start by assuming you have already installed and configured the module, the instructions for which are detailed here.
The next step is to disable the front-end (nginx) serving and caching of any JS/CSS files you want auto-merged. Simple. All you have to do is move the JS/CSS you want processed by the Apache server out of the /static directory and into another. Let’s say we move it to the /merge directory. Boom. Done. Now, simply alter the configuration you found here to reflect the same (at least in terms of the document root and underlying code includes). Presto, you’re done.
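The idea behind merging, stripped to its bones, can be sketched in a few lines of Python. To be clear, this is not how Apache2::Response::FileMerge works internally; `merge_files` is a hypothetical helper, and the file names and contents are made up purely to show several files becoming one response body.

```python
import tempfile
from pathlib import Path


def merge_files(paths):
    """Concatenate text files in order, one per line group, into a single payload."""
    return "\n".join(Path(p).read_text(encoding="utf-8").strip() for p in paths) + "\n"


# Demo with throwaway files; the names and contents are made up.
with tempfile.TemporaryDirectory() as tmp:
    a = Path(tmp, "reset.css")
    b = Path(tmp, "layout.css")
    a.write_text("body { margin: 0; }\n", encoding="utf-8")
    b.write_text("#main { width: 960px; }\n", encoding="utf-8")
    merged = merge_files([a, b])

print(merged)
```

Two stylesheets, one request. Multiply that by every JS and CSS include on a page and the saved round trips add up.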
To Conclude
Again, as I have said and as I will continue to say, premature optimization is a sword wielded by the devil himself. It’s costly in all regards, and the assumptions you make about predetermined bottlenecks will likely be wrong anyway. However, there are some options, such as the ones I have noted here, that are so simple to introduce they are no longer optimizations but simple configurations that should be your automatic default. They’re no longer optimizations when they’re expectations.
On a side note, I mention a few technologies in this post. I hope you come away from this with the concept of what is happening rather than a dire need to convert technologies. You can replace nginx with just about anything, even a scaled-down, simple install of Apache if you wanted to. The concept is that you have the experts do what they are best at. In this case we have nginx, the resident expert on proxies and static content serving, doing just that. There are plenty of alternatives. On the same note, nobody said you have to use Apache at all, nor the file merge Apache extension I call out. The idea is very simple: reduce the stress on your server with simple installation and configuration techniques and make them an expectation of all your installs, big or small. You will find you will be happy with the speed and agility your server has, all while adding only a few minutes to any first-time configuration and setup.
Have fun.