Senior Drupal Architect, Article found on Lullabot
Varnish is a very popular software package that can dramatically accelerate the work of serving HTTP pages. Varnish caches fully-rendered responses to HTTP requests and serves them without the delay of building content from scratch. Because it's so much more efficient than building a page with Apache and Drupal, Lullabot regularly deploys Varnish when a site needs to handle high levels of anonymous traffic. In many situations, installing Varnish on the same machine as MySQL and Apache can help squeeze more performance out of a single box. However in most of our deployments, we're working with "high availability" setups, where dedicated servers handle different functions and redundant backup servers are on hand for every piece of the infrastructure. That means two database servers, two web servers, two Varnish servers, and so on.
This article is all about configuring Varnish optimally for these high-availability setups, in which multiple dedicated back-end servers are protected from heavy traffic by Varnish serving cached content to the outside world. We also have some neat tricks for server maintenance, optimizing your cached content, and configuring Varnish to act as a fail-safe even if all of your back-end servers go down.
Basic Varnish Configuration
Varnish usually has three locations of configuration. The boot script, the system-wide configuration, and the VCL file that does most of the work.
The first script that starts up Varnish is usually located with the rest of your system startup scripts at/etc/init.d/varnish. This file rarely needs adjustments, but it can be interesting to read or help you locate further configuration (since the startup script is responsible for calling the next file).
The second file is usually located at/etc/sysconfig/varnish (on CentOS and RedHat machines) or /etc/default/varnish (on Ubuntu). This file defines global configuration for Varnish such as which port it should run on and where it should store its cache. Typically it contains 5 different ways of writing the same thing. It doesn't matter which option you use: just be sure to change your storage backend. The default is usually "file", which stores cached information on disk. Be absolutely sure to change this to "malloc", which stores information in memory! If you don't have enough memory in your box for a decent sized cache (say a few gigabytes), consider a memory upgrade.
Here's a configuration that we have running on a very popular site with a large number of images and pages being cached (this is using the "Option 2" in the /etc/sysconfig/varnish file):
The last line is the most important to set up. In this case we're allocating 3GB of memory for Varnish's dedicated use. Also note the paths used in this file that reference which VCL file you will use. It's probably best to stick with whatever file path your distribution uses. The lines above for <Number of CPU cores>, be sure to replace with actual server information. You can get information about the number of processors in your machine by runninggrep processor /proc/cpuinfo, which will return a line for each processor core you have available.
The VCL file is the main location for configuring Varnish and it's where we'll be doing the majority of our changes. It's important to note that Varnish includes a large set of defaults that are always automatically appended to the rules that you have specified. Unless you force a particular command like "pipe", "pass", or "lookup", the defaults will be run. Varnish includes an entirely commented-out default.vcl file that is for reference.
We'll be going through each of the sections individually down below, but for ease of reading here's a complete copy of the VCL file that we're currently using, and a copy of the defaults that Varnish will automatically append to our own rules.
Updated 4/9/2012: I've added an updated VCL and examples here for Varnish 3, which differs slightly from the Varnish 2 configuration.
- Lullabot's default.vcl for multiple web servers (for Varnish 2.1.x)
- Lullabot's default.vcl for multiple web servers (for Varnish 3.x)
- View the set of defaults (as of Varnish 2.1.3)
Now let's get started walking through the most interesting stuff, the VCL!
Health Checks and Directors
A more recent feature of Varnish (2.x and higher) has been the addition of "directors" to send traffic to any number of web servers. These directors can do regular health checks on each web server to see if the server is still running smoothly. If Varnish receives a request for an asset that it hasn't yet cached, it will only pass on the request to a "healthy" server.
Our VCL file is typically set up to handle both HTTP and HTTPS traffic, so we need to define a list of web servers by their IP address twice; once for port 80 which provides normal pages and again for port 443 for secure connections. If you've set up Apache on a different port, reference that port here.
The last part of the configuration above is part of ourvcl_recv sub-routine. It's what defines which set of servers will be used based on the port by which Varnish received the traffic.
It's a good idea then to set up a reasonable health check on each web server to make sure that server is ready to deliver traffic. We use a file located directly in the root of the web server called "status.php" to check if the web server is healthy. This file does a few checks including:
- Bootstrapping Drupal
- Connecting to the Master database
- Connecting to the Slave database (if any)
- Connecting to Memcache (if in use)
- Checking that the files directory is accessible
If any of these checks fail, the file throws a 500 server error and Varnish will take the web server out of rotation. The status.php file is also extremely useful for intentionally taking a web server out of rotation. Simply move the status.php file to a new location (like status-temp.php) and Varnish will automatically remove the server from rotation while the server itself stays up so that it may be serviced independently of the other web servers. This approach is common when performing upgrades or installations of new software on the web servers.
While our status.php script is fairly universal, it may require some tweaking for your own purposes. If you have additional services that are required for a server to function properly, adding additional checks to the script will ensure Varnish doesn't cache bad data from a broken server.
Caching Even if Apache Goes Down
Even in an environment where everything has a redundant backup, it's possible for the entire site to go "down" due to any number of causes. A programming error, a database connection failure, or just plain excessive amounts of traffic. In such scenarios, the most likely outcome is that Apache will be overloaded and begin rejecting requests. In those situations, Varnish can save your bacon with the Grace period. Apache gives Varnish an expiration date for each piece of content it serves. Varnish automatically discards outdated content and retrieves a fresh copy when it hits the expiration time. However, if the web server is down it's impossible to retrieve the fresh copy. "Grace" is a setting that allows Varnish to serve up cached copies of the page even after the expiration period if Apache is down. Varnish will continue to serve up the outdated cached copies it has until Apache becomes available again.
To enable Grace, you just need to specify the setting invcl_recv and in vcl_fetch:
Both of these settings can be the same, but the setting in vcl_fetch must be longer than the setting in vcl_recv. Think of the vcl_fetch grace setting as "the maximum time Varnish should keep an object". The setting in vcl_recv on the other hand defines when Varnish should use a stale object if it has one.
Just remember: while the powers of grace are awesome, Varnish can only serve up a page that it has already received a request for and cached. This can be a problem when you're dealing with authenticated users, who are usually served customized versions of pages that are difficult to cache. If you're serving uncached pages to authenticated users and all of your web servers die, the last thing you want is to present them with error messages. Instead, wouldn't it be great if Varnish could "fall back" to the anonymous pages that it does have cached until the web servers came back? Fortunately, it can -- and doing this is remarkably easy! Just add this extra bit of code into the vcl_recv sub-routine:
Varnish sets a property req.backend.health if any web server is available. If all web servers go down, this flag becomes FALSE. Varnish will strip the cookie that indicates a logged-in user from incoming request, and attempt to retrieve an anonymous version of the page. As soon as one server becomes healthy again, Varnish will quit stripping the cookie from incoming requests and pass them along to Apache as normal.
Making Varnish Pass to Apache for Uncached Content
Often when configuring Varnish to work with an application like Drupal, you'll have some pages that should absolutely never be cached. In those scenarios, you can easily tell Varnish to not cache those URLs by returning a "pass" statement.
Varnish will still act as an intermediary between requests from the outside world and your web server, but the "pass" command ensures that it will always retrieve a fresh copy of the page.
In some situations, though, you do need Varnish to give the outside world a direct connection to Apache. Why is it necessary? By default, Varnish will always respond to page requests with an explicitly specified "content-length". This information allows web browsers to display progress indicators to users, but some types of files don't have predictable lengths. Streaming audio and video, and any files that are being generated on the server and downloaded in real-time, are of unknown size, and Varnish can't provide the content-length information. This is often encountered on Drupal sites when using the Backup and Migrate module, which creates a SQL dump of the database and sends it directly to the web browser of the user who requested the backup.
To keep Varnish working in these situations, it must be instructed to "pipe" those special request types directly to Apache.
Finally, while we're discussing paths that need exceptions, Varnish is also a good place to restrict access to specific URLs. Because the VCL file is so flexible, it can be a good place to lock down paths that should never be seen by the outside world.
Up at the very top of our VCL file, we have a line that defines an access control list of IP addresses. These addresses are considered to be "internal" to our environment. A common example is restricting public access to Drupal's cron.php file so that only local web servers can trigger expensive tasks like search indexing. Since the local web servers all have IP addresses that begins with "192.10.", they are granted access while all others receive an access denied message
At the top of the default.vcl file:
An then inside of vcl_recv:
Optimizing Varnish's Cache
First, let's dissect a very popular but misguided configuration that's made the rounds on the internet. When describing how to serve cached content to users with cookies, a number of sources recommended this solution:
Generally, this is not a useful approach unless you're serving up the same page to a single user repeatedly. It will recognize the unique cookie that Drupal gives to every logged in user, and use it to keep cached content for one user from being displayed to another. However, this approach is a waste: Drupal explicitly returns a Cache-Control header for all authenticated users that prevents Varnish from caching their content:
In other words, don't use this approach unless you have an explicit reason to cache authenticated pages. In most situations, this approach will add overhead without caching anything. In the worst case scenario, it will fill up your cache with entries for each authenticated user and push out more valuable anonymous pages that can be reused for thousands of visitors.
Although the HTML page itself is likely to change for logged in users, there's no reason that Varnish can't serve up cached versions of the support assets. Taking this into consideration, it's a good idea to discard any cookies that are sent by the browser when requesting such files:
There are multiple approaches to handling this problem, and most administrators start by trying to build a "blacklist" of cookies to strip out from the request, leaving only the ones in which they have interest. This usually results in a configuration file that look something like this:
Instead, we use an "inclusion" list, where all cookies but a few will be automatically stripped from the request. This logic is a lot more verbose, but it is definitely a more sustainable solution:
Once you've tamed cookies, there's one other "enemy" of caching you need to plan for: the "Accept-Encoding" header sent by different browsers. Each browser sends information to the server about what kind of caching mechanisms it supports. All modern browsers now support "gzip" compression, but they all inform the server that they support it in different ways. For example, the header of modern browsers will report Accept-Encoding the following ways:
Firefox, IE: gzip, deflate
Opera: deflate, gzip, x-gzip, identity, *;q=0
In addition to the headers sent by the browser, Varnish must also pay attention to the headers sent by Apache, which usually include lines like this:
This means that Varnish will store a different cache for every version of the "Accept-Encoding" header it receives from different browsers! That means you'll be maintaining separate cached copies of your web site for each different browser, and in some cases for different versions of the same browser. This is a huge wast of space, since every browser actually supports "gzip", but just reports it differently. To prevent this confusion, we include this segment in vcl_recv:
Varnish is an amazing and incredibly efficient tool for serving up common resources from your site to end-users. Besides simply making your site faster, it also can add additional redundancy to your setup by acting as a full backup if the web servers fail. In order to make Varnish both serve as an effective backup and efficient caching layer, it needs to clean up incoming headers from the browser, strip down cookies, and consolidate the "Accept-Encoding" header.
After such extensive explanations it's easy to get overwhelmed, but the good news is that the VCL file provided here can quickly be deployed to almost any Drupal site and start working immediately. For most sites no further customization is needed, and sites that need to tweak it will have a good head start towards huge reductions in server load. On our sites, Varnish is usually able to handle about 85% of the traffic without ever touching the web servers. Even during peak times with hundreds of thousands of requests coming in per hour, Varnish can hum along at less than 5% CPU usage of an average 4-core server. Instead of scaling out your web servers horizontally, adding a few Varnish machines in front of them can save a huge amount of processing and speed up your site at the same time.
If you haven't already, grab the actual default.vcl file that we use on our sites and read it through start to finish. Now with all of the individual pieces explained in-depth above, we hope you can use it as a starting point for your own VCL configuration. Happy caching!