This article explains how you can cache your web site contents with Apache’s mod_cache on Ubuntu 10.04. If you have a high-traffic dynamic web site that generates lots of database queries on each request, you can decrease the server load dramatically by caching your content for a few minutes or more (that depends on how often you update your content).
I do not issue any guarantee that this will work for you!
1 Preliminary Note
I’m assuming that you have a working Apache2 setup (Apache 2.2.x – prior to that version, mod_cache is considered experimental) from the Ubuntu repositories – the Apache version in the Ubuntu 10.04 repositories is 2.2.14 so you should be good to go.
I’m using the document root /var/www here for my test vhost – you must adjust this if your document root differs.
2 Enabling mod_cache
mod_cache has two submodules that manage the cache storage, mod_disk_cache (for storing contents on the hard drive) and mod_mem_cache (for storing contents in memory which is faster than disk caching). Decide which one you want to use and continue either with chapter 2.1 (mod_disk_cache) or 2.2 (mod_mem_cache).
2.1 mod_disk_cache
The mod_disk_cache configuration is stored in /etc/apache2/mods-available/disk_cache.conf, so let’s open that file in a text editor.
Make sure you uncomment the CacheEnable disk / line, so that the minimal configuration looks as follows:
<IfModule mod_disk_cache.c>
# cache cleaning is done by htcacheclean, which can be configured in
# /etc/default/apache2
#
# For further information, see the comments in that file,
# /usr/share/doc/apache2.2-common/README.Debian, and the htcacheclean(8)
# man page.

# This path must be the same as the one in /etc/default/apache2
CacheRoot /var/cache/apache2/mod_disk_cache

# This will also cache local documents. It usually makes more sense to
# put this into the configuration for just one virtual host.
CacheEnable disk /

CacheDirLevels 5
CacheDirLength 3
</IfModule>
You can find explanations for these configuration options and further configuration options on http://httpd.apache.org/docs/2.2/mod/mod_disk_cache.html.
Now we can enable mod_cache and mod_disk_cache and restart Apache:
a2enmod cache
a2enmod disk_cache
/etc/init.d/apache2 restart
To make sure that our cache directory /var/cache/apache2/mod_disk_cache doesn’t fill up over time, we have to clean it with the htcacheclean command. That command is part of the apache2-utils package which we install as follows:
aptitude install apache2-utils
Afterwards, we can start htcacheclean as a daemon like this:
htcacheclean -d30 -n -t -p /var/cache/apache2/mod_disk_cache -l 100M -i
This will clean our cache directory every 30 minutes and make sure that it will not get bigger than 100MB. To learn more about htcacheclean, take a look at the htcacheclean(8) man page (man htcacheclean).
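To check whether the cache directory really stays below the limit, a small script along the following lines could be used. This is only a sketch: the function names cache_usage_kb and cache_within_limit and the CACHE_DIR variable are made up for this example, and du counts disk blocks, so its figure may differ somewhat from the size htcacheclean works with.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Apache or htcacheclean.

# Print the disk usage of a directory in kilobytes.
cache_usage_kb() {
    du -sk "$1" | awk '{ print $1 }'
}

# Succeed if the directory uses no more than the given number of kilobytes.
cache_within_limit() {
    dir=$1
    limit_kb=$2
    [ "$(cache_usage_kb "$dir")" -le "$limit_kb" ]
}

# CACHE_DIR defaults to the CacheRoot from disk_cache.conf;
# 102400 KB corresponds to the 100M passed to htcacheclean with -l.
CACHE_DIR=${CACHE_DIR:-/var/cache/apache2/mod_disk_cache}
if [ -d "$CACHE_DIR" ]; then
    if cache_within_limit "$CACHE_DIR" 102400; then
        echo "cache size OK: $(cache_usage_kb "$CACHE_DIR") KB"
    else
        echo "cache exceeds limit: $(cache_usage_kb "$CACHE_DIR") KB"
    fi
fi
```

If the script reports that the limit is exceeded for a longer time, htcacheclean is probably not running.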
Of course, you don’t want to start htcacheclean manually each time you reboot the server – therefore we edit /etc/rc.local…
… and add the following line to it, right before the exit 0 line:
[...]
/usr/sbin/htcacheclean -d30 -n -t -p /var/cache/apache2/mod_disk_cache -l 100M -i
[...]
This will start htcacheclean automatically each time you start the server.
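If you prefer to script this step instead of editing the file by hand, the following sketch shows one possible way. The helper insert_before_exit and the RC_LOCAL variable are hypothetical names chosen for this example; it assumes that the file contains a line consisting of exit 0, and it is a good idea to try it on a copy of /etc/rc.local first.

```shell
#!/bin/sh
# Hypothetical helper: insert a line before the "exit 0" line of a file,
# unless exactly that line is already present.
insert_before_exit() {
    file=$1
    line=$2
    # Nothing to do if the line already exists.
    if grep -qxF -- "$line" "$file"; then
        return 0
    fi
    tmp=$(mktemp) || return 1
    if awk -v line="$line" '
        !done && /^exit 0[ \t]*$/ { print line; done = 1 }
        { print }
        END { if (!done) print line }
    ' "$file" > "$tmp"; then
        # Overwrite in place so that the permissions of the file are kept.
        cat "$tmp" > "$file"
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return $status
}

HTCACHECLEAN_LINE='/usr/sbin/htcacheclean -d30 -n -t -p /var/cache/apache2/mod_disk_cache -l 100M -i'

# Only runs when RC_LOCAL is set, e.g. RC_LOCAL=/etc/rc.local
if [ -n "${RC_LOCAL:-}" ]; then
    insert_before_exit "$RC_LOCAL" "$HTCACHECLEAN_LINE"
fi
```

Because the helper checks for the line first, running it twice does not add a second copy.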
2.2 mod_mem_cache
The mod_mem_cache configuration is located in /etc/apache2/mods-available/mem_cache.conf:
<IfModule mod_mem_cache.c>
CacheEnable mem /
MCacheSize 4096
MCacheMaxObjectCount 100
MCacheMinObjectSize 1
MCacheMaxObjectSize 2048
</IfModule>
This is the default configuration – if you like you can modify it. A list of configuration directives for mod_mem_cache is available here: http://httpd.apache.org/docs/2.2/mod/mod_mem_cache.html
Now let’s enable mod_cache and mod_mem_cache and restart Apache as follows:
a2enmod cache
a2enmod mem_cache
/etc/init.d/apache2 restart
That’s it already! With mod_mem_cache, you don’t have to clean up any cache directories.
3 Testing
Unfortunately mod_cache doesn’t provide any logging functionality, which is bad if you want to know whether caching is working. Therefore I create a small PHP test file, /var/www/cachetest.php, that sends out HTTP headers telling mod_cache to cache the file for 300 seconds, and that simply prints the current timestamp:
<?php
header("Cache-Control: must-revalidate, max-age=300");
header("Vary: Accept-Encoding");
echo time()."<br>";
?>
Now call that file in a browser – it should display the current time stamp. Then click in the browser’s address bar and press ENTER so that the page gets loaded again (don’t press F5 or the reload button – this will always fetch a fresh copy from the server instead of the cache!) – if all goes well, you should still see the old, cached timestamp. If you wait 300 seconds, you should get a fresh copy from the server instead of the cache.
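The same check can be done from the command line, which avoids any confusion with the browser’s own cache. The following is a minimal sketch, assuming that curl is installed and that mod_cache adds an Age header to responses it serves from the cache; the function name age_from_headers and the CHECK_URL variable are made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: read HTTP response headers from stdin and print
# the value of the first Age header, or nothing if there is none.
age_from_headers() {
    tr -d '\r' | awk -F': *' 'tolower($1) == "age" && !seen { print $2; seen = 1 }'
}

# Only runs when CHECK_URL is set, e.g.
# CHECK_URL=http://localhost/cachetest.php
if [ -n "${CHECK_URL:-}" ] && command -v curl >/dev/null 2>&1; then
    # -D - dumps the response headers to stdout, -o /dev/null drops the body.
    age=$(curl -s -o /dev/null -D - "$CHECK_URL" | age_from_headers)
    if [ -n "$age" ]; then
        echo "probably served from the cache (Age: $age seconds)"
    else
        echo "no Age header, probably a fresh copy"
    fi
fi
```

Run it twice within a few seconds: if caching works, the second run should report an Age value.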
4 HTTP Headers
Caching doesn’t work out-of-the-box – you must modify your web application so that caching can work (it is possible that your web application already supports caching – please consult the documentation of your application to find out). mod_cache will cache web pages only if the HTTP headers sent out by your web application tell it to do so.
Here are some examples of headers that tell mod_cache not to cache:
- Expires headers with a date in the past: “Expires: Sun, 19 Nov 1978 05:00:00 GMT”
- Certain Cache-Control headers: “Cache-Control: no-store, no-cache, must-revalidate” or “Cache-Control: must-revalidate, max-age=0”
- Set-Cookie headers: a page will not be cached if a cookie is set.
So if you want mod_cache to cache your pages, modify your application to not send out such headers.
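To find out whether your application sends such headers, you can inspect a dump of its response headers. The sketch below covers only the Set-Cookie and Cache-Control cases from the list above (an Expires date in the past is not checked); the function name cache_blockers and the CHECK_URL variable are made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: read HTTP response headers from stdin and print one
# line for each header that, according to the list above, prevents caching.
cache_blockers() {
    tr -d '\r' | awk '
        { lower = tolower($0) }
        lower ~ /^set-cookie:/ {
            print "cookie is set: " $0
        }
        lower ~ /^cache-control:/ && lower ~ /no-store|no-cache|max-age=0([^0-9]|$)/ {
            print "Cache-Control forbids caching: " $0
        }
    '
}

# Only runs when CHECK_URL is set, e.g. CHECK_URL=http://localhost/index.php
if [ -n "${CHECK_URL:-}" ] && command -v curl >/dev/null 2>&1; then
    curl -s -o /dev/null -D - "$CHECK_URL" | cache_blockers
fi
```

No output means that none of the checked headers was found, which does not yet prove that the page will be cached.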
If you want mod_cache to cache your pages, you can set an Expires header with a date in the future, but the recommended way is to use max-age:
“Cache-Control: must-revalidate, max-age=300”
This tells mod_cache to cache the page for 300 seconds (max-age) – unfortunately mod_cache doesn’t know the s-maxage option (see http://www.mnot.net/cache_docs/#CACHE-CONTROL), that’s why we must use the max-age option (which also tells your browser to cache – please keep this in mind if you get unexpected results!). If mod_cache knew the s-maxage option, we could use “Cache-Control: must-revalidate, max-age=0, s-maxage=300” which would tell mod_cache, but not the browser, to cache the page.
Of course, this header is useless if you send out one of the non-caching headers (Expires in the past, Set-Cookie, etc.) from above at the same time!
Another very important header for caching is this one:
“Vary: Accept-Encoding”
This makes mod_cache keep two copies of each cached page, one compressed (gzip) and one uncompressed so that it can deliver the right version depending on the capabilities of the user-agent/browser. Some user-agents don’t understand gzip compression, so they should get the uncompressed version.
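You can observe this behaviour by requesting the same page once with and once without gzip support and comparing the Content-Encoding header of the two responses. This is only a sketch: it assumes that curl is installed and that compression is actually enabled on your server, and the function name header_value and the CHECK_URL variable are made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: read HTTP response headers from stdin and print the
# value of the first header with the given name (case-insensitive).
header_value() {
    tr -d '\r' | awk -v name="$1" '
        BEGIN { name = tolower(name) }
        {
            pos = index($0, ":")
            if (pos > 0 && !seen && tolower(substr($0, 1, pos - 1)) == name) {
                value = substr($0, pos + 1)
                sub(/^[ \t]+/, "", value)
                print value
                seen = 1
            }
        }
    '
}

# Only runs when CHECK_URL is set, e.g.
# CHECK_URL=http://localhost/cachetest.php
if [ -n "${CHECK_URL:-}" ] && command -v curl >/dev/null 2>&1; then
    # First request announces gzip support, second one does not.
    with=$(curl -s -o /dev/null -D - -H 'Accept-Encoding: gzip' "$CHECK_URL" | header_value Content-Encoding)
    without=$(curl -s -o /dev/null -D - "$CHECK_URL" | header_value Content-Encoding)
    echo "with gzip support:    Content-Encoding='$with'"
    echo "without gzip support: Content-Encoding='$without'"
fi
```

If the second request reported gzip as well, a client without gzip support would receive the wrong copy.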
So here’s the summary: use the following two headers if you want mod_cache to cache:
“Cache-Control: must-revalidate, max-age=300”
“Vary: Accept-Encoding”
and make sure that no Expires header with a date in the past, no cookies, etc. are sent.
If your application is written in PHP, you can use PHP’s header() function to send out HTTP headers, e.g. like this:
header("Cache-Control: must-revalidate, max-age=300");
header("Vary: Accept-Encoding");
This page is a must-read if you want to learn more about HTTP headers and caching: http://www.mnot.net/cache_docs/
5 Links
- Apache: http://httpd.apache.org/
- mod_cache: http://httpd.apache.org/docs/2.2/mod/mod_cache.html
- mod_disk_cache: http://httpd.apache.org/docs/2.2/mod/mod_disk_cache.html
- mod_mem_cache: http://httpd.apache.org/docs/2.2/mod/mod_mem_cache.html
- Apache Caching Guide: http://httpd.apache.org/docs/2.2/caching.html
- Caching tutorial: http://www.mnot.net/cache_docs/
- Ubuntu: http://www.ubuntu.com/