Varnish HTTP Cache
Varnish Cache
Varnish is a state-of-the-art, high-performance HTTP accelerator that relies on the advanced kernel features available in Linux 2.6. Varnish supports Edge Side Includes (ESI), which allow you to break your pages into smaller pieces and cache those pieces independently.
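As a rough sketch of how ESI processing might be switched on, assuming VCL 4.0 syntax (Varnish 4 or later) and a default.vcl that already declares a backend, the fragment below asks Varnish to parse HTML responses for ESI tags. The Content-Type check is an illustrative choice.

```vcl
sub vcl_backend_response {
    # Illustrative rule: only parse HTML responses for ESI tags.
    if (beresp.http.Content-Type ~ "text/html") {
        # Tell Varnish to look for <esi:include> tags in this response
        # and fetch (and cache) each included fragment separately.
        set beresp.do_esi = true;
    }
}
```

A page served by the backend could then contain a tag such as <esi:include src="/fragments/header"/> (a hypothetical path), and the header fragment would be cached with its own lifetime, independently of the page around it.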
In "high availability" setups, dedicated servers handle different functions and redundant backup servers are on hand for every piece of the infrastructure. That means two database servers, two web servers, two Varnish servers, and so on.
Configure Varnish on Linux
Edit httpd.conf
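Change Listen <hostIP>:80 to Listen 127.0.0.1:80, so that Apache answers only on the loopback address and Varnish can take over port 80 on the public IP.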
Edit /etc/varnish/default.vcl
change
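A typical edit at this point is the backend definition, which tells Varnish where the web server lives. The following is a minimal sketch, assuming VCL 4.0 syntax (Varnish 4 or later) and the Apache address from the httpd.conf step above (127.0.0.1:80). Adjust both values if your backend listens elsewhere.

```vcl
vcl 4.0;

# Assumed backend: Apache listening on the loopback address,
# matching "Listen 127.0.0.1:80" in httpd.conf.
backend default {
    .host = "127.0.0.1";
    .port = "80";
}
```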
Default start up
# vi /etc/varnish/varnish.params or
# vi /etc/sysconfig/varnish
varnishd -a :80 -b localhost:8080 -T localhost:6082 -s file,/var/cache/varnish.cache,256M
varnishd -a 10.35.42.66:80 -b localhost:80 -T localhost:6082 -s file,/var/cache/varnish.cache,256M
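-a gives the IP:port Varnish should listen on, -b gives the IP:port of the backend web server, -T gives the address of the management interface, and -s selects the storage backend (here a 256 MB cache file).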
To stop Varnish
/etc/rc.d/init.d/varnish stop
To see Varnish statistics
# varnishstat
VCL Basics
When Varnish processes a request, it starts by parsing the request itself, separating the request method from headers, verifying that it’s a valid HTTP request and so on. When this basic parsing has completed, the very first policy decisions can be made: Should Varnish even attempt to find this resource in the cache? This decision is left to VCL, more specifically the vcl_recv method.
If you do not provide any vcl_recv function, the default VCL function for vcl_recv is executed. But even if you do specify your own vcl_recv function, the default is still present. Whether it is executed or not depends on whether your own VCL code terminates that specific state or not.
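The difference between terminating and not terminating vcl_recv can be sketched as follows, assuming VCL 4.0 syntax and a default.vcl that already declares a backend. The /admin path is a hypothetical example.

```vcl
sub vcl_recv {
    # Hypothetical rule: never serve the admin area from the cache.
    if (req.url ~ "^/admin") {
        # return ends vcl_recv right here, so the default
        # vcl_recv is NOT executed for these requests.
        return (pass);
    }
    # No return on this path: once this subroutine finishes,
    # the default vcl_recv runs and makes the final decision.
}
```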
It is strongly advised to let the default VCL run whenever possible. It is designed with safety in mind, which often means it’ll handle any flaws in your VCL in a reasonable manner. It may not cache as much, but often it’s better to not cache some content instead of delivering the wrong content to the wrong user.
The basic syntax of VCL is inspired mainly by C and Perl.
VCL is all about policy. By providing a state machine which you can hook into, VCL allows you to affect the handling of any single request almost anywhere in the execution chain.
Whenever you are working on VCL, you should think about what the exact line you are writing has to do. The best VCL is built by having many independent sections that don’t interfere with each other more than they have to.
This is made easier by the fact that VCL also has a default - which is always present. If you just need to modify one little thing in vcl_recv, you can do just that. You don’t have to copy the default VCL, because it will be executed after your own - assuming you don’t have any return statements.
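As an illustration of modifying one little thing, the fragment below normalises the Host header and leaves everything else to the default. It is a sketch assuming VCL 4.0 syntax and an existing backend declaration. The www-stripping rule is a hypothetical example.

```vcl
sub vcl_recv {
    # Hypothetical tweak: treat "www.example.com" and "example.com"
    # as the same site so they share cache entries.
    set req.http.Host = regsub(req.http.Host, "^www\.", "");

    # Deliberately no return statement: the default vcl_recv
    # still runs after this and applies its usual safety checks.
}
```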
Scaling PHP applications with Varnish
Using Varnish with Drupal
- Drupal serves content to authorised, logged-in users and to anonymous users on the same URL, so the cache has to tell the two apart.
- Drupal 7 supports the basic cache-control headers which are necessary to use Varnish as a cache for full pages. Pressflow backports this functionality to version 6.x for those who need it.
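One way to keep logged-in and anonymous traffic apart on the same URL is to look at the session cookie. The sketch below assumes VCL 4.0 syntax, an existing backend declaration, and Drupal 7's default session cookie naming (SESS followed by a hash, or SSESS over HTTPS). Verify the cookie name against your own site before relying on it.

```vcl
sub vcl_recv {
    # Assumption: a Drupal 7 session cookie marks a logged-in user.
    if (req.http.Cookie ~ "(^|;\s*)S?SESS[a-z0-9]+=") {
        # Logged-in users always get a fresh page from Drupal.
        return (pass);
    }

    # Anonymous visitor: drop the remaining cookies so the
    # default vcl_recv is willing to look the page up in the cache.
    unset req.http.Cookie;
}
```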
Drupal Module Varnish HTTP Accelerator Integration
When setting up Varnish using the default configuration in front of your application (in my case: a PHP app), you’ll probably notice that Varnish caches nothing.
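A common cause, though only an assumption about any particular setup, is cookies: the default VCL bypasses the cache for every request that carries a Cookie header, and a PHP application that starts a session sends a session cookie to every visitor. A minimal sketch of one remedy, assuming VCL 4.0 syntax, an existing backend declaration, and a hypothetical list of static file extensions:

```vcl
sub vcl_recv {
    # Hypothetical extension list: static assets do not depend
    # on the session, so their cookies can be ignored.
    if (req.url ~ "\.(css|js|png|gif|jpe?g|ico)(\?.*)?$") {
        unset req.http.Cookie;
    }
    # No return: the default vcl_recv runs next and can now
    # cache these requests, while pages with cookies still pass.
}
```

Running varnishstat before and after such a change and comparing the cache hit and miss counters shows whether it had any effect.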
An example setup of Varnish, Apache and MediaWiki on a single server.