Configuring Squid as an Accelerator for Zope
This HowTo explains how to configure Squid as a caching http accelerator for Zope. In this configuration Squid looks to the outside world like an ordinary http server; however, it obtains its page content from your Zope http server.
Some elements described in this HowTo can be omitted from a simpler configuration. However, simple configurations have a way of growing into bigger ones, so the example given here includes elements that every Squid installation will need sooner or later.
To make best use of your caching server you will need to set specific caching headers in your Zope application. See another HowTo for more information on doing this manually. Alternatively, Zope 2.3 has limited built-in support for setting caching headers.
Some people have reported success with getting Apache to perform caching, using ProxyPass. For detailed instructions see http://www.zope.org/Members/rbeer/caching. I prefer Squid since it is easier to configure and uses fewer system resources. It also provides more detailed cache-specific logs, and reports detailed statistics while it is running, through its web-based Cache Manager CGI.
For the full details see Squid's documentation on its http-accelerator mode.
You must use Squid version 2.3.STABLE4 or later. Earlier versions do not support everything described in this HowTo.
Specify the IP addresses on which squid should listen.
Specify the https ports on which squid should listen. You will need to provide one port for each SSL key and certificate. Note that https support requires Squid 2.5.
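As a sketch, the listening directives in squid.conf might look like the following; the address, ports, and certificate paths are illustrative assumptions, not values from this HowTo:

```
# Listen for plain http on the public address
http_port 203.0.113.10:80

# Squid 2.5: one https_port line per SSL key and certificate pair
https_port 203.0.113.10:443 cert=/usr/local/squid/etc/cert.pem key=/usr/local/squid/etc/key.pem
```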
The default configuration prevents squid from caching any URL containing cgi-bin or ?. This is inappropriate for a Zope accelerator, so you probably want to remove this no_cache line from the default configuration file.
This is inappropriate for the same reason as no_cache. You probably want to remove the default hierarchy_stoplist line.
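For reference, the default lines in question look like this in the stock squid.conf, and all three should go:

```
# Default lines to delete for a Zope accelerator:
acl QUERY urlpath_regex cgi-bin \?
no_cache deny QUERY
hierarchy_stoplist cgi-bin ?
```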
Increase this number if you want to allow uploads larger than 1M
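Assuming the limit in question is the request_body_max_size directive (whose default is 1 MB), a sketch raising it to 10 MB:

```
# Allow uploads (request bodies) of up to 10 MB
request_body_max_size 10 MB
```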
All front-end proxy solutions involve a problem of mapping incoming URLs into requests to backend servers. Squid solves this by allowing you to provide an external program or script that translates external URLs into internal URLs. The content of this script is explained below, and you must enter the name of the script as the value of the redirect_program directive.
(For comparison, Apache's ProxyPass uses regular-expression based rules in its configuration file. I prefer the Squid solution, mainly because I can easily test the redirector script outside of Squid. However, it is a little more intimidating if you are not familiar with writing text-processing scripts.)
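A sketch of the relevant squid.conf lines, assuming the script lives at /usr/local/squid/bin/zope_redirect.py (the path is an illustrative assumption):

```
# Run this script to rewrite incoming URLs
redirect_program /usr/local/squid/bin/zope_redirect.py
# Number of redirector processes to keep running
redirect_children 5
```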
Squid's default access rules prevent http access except from the listed IP addresses. The easiest change is to adjust the http_access rules so that requests from anywhere are allowed, since an accelerator must serve the whole internet.
Such a configuration is secure, but a single error in your redirector script could turn your squid into an open relay. For extra depth of security you may want to add an acl that restricts requests to the sites you actually accelerate.
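A sketch of such rules, assuming www.example.com is the only accelerated site (http_access is checked against the original request, before the redirector runs):

```
# Only answer requests for the sites we accelerate
acl accel_sites dstdomain www.example.com
http_access allow accel_sites
http_access deny all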
All that remains is to tell squid about your backend servers. This HowTo only deals with Zope, although it is possible to use other web servers as backends.
Squid provides two different ways to make an http request to a backend server. The first option is the most traditional: the backend server is an origin server, and squid makes an ordinary http request to it.
To implement this your redirector script must translate a URL such as http://www.example.com/a/b/c into http://backend-zope.dmz.example.com:8080/VirtualHostBase/http/www.example.com:80/a/b/c
Note that the host name in the output URL is the backend host, and probably should be inaccessible from the public internet. The trailing /a/b/c has been copied from the incoming URL, and the VirtualHostBase/http/www.example.com:80 segment is some VirtualHostMonster magic that lets Zope reconstruct the URL used by the original requester (in case it has to use similar URLs in its pages). www.zope.org contains full documentation on installing and using VirtualHostMonster.
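The translation above can be sketched as a small redirector script. The backend host name is taken from the example; the Squid redirector protocol (one request per line on stdin, one rewritten URL per line on stdout, flushed immediately) is summarized in the comments:

```python
#!/usr/bin/env python
# Sketch of a Squid redirector for the origin-server setup (option 1).
import sys

# Backend host:port, as assumed in the example above.
BACKEND = "backend-zope.dmz.example.com:8080"

def rewrite(url):
    """Translate a public URL into a VirtualHostMonster backend URL."""
    if not url.startswith("http://"):
        return url  # pass anything unexpected through unchanged
    rest = url[len("http://"):]
    host, _, path = rest.partition("/")
    if ":" in host:
        host, port = host.split(":", 1)
    else:
        port = "80"
    return "http://%s/VirtualHostBase/http/%s:%s/%s" % (BACKEND, host, port, path)

if __name__ == "__main__":
    # Squid sends one line per request: "URL client-ip/fqdn ident method".
    for line in sys.stdin:
        url = line.split()[0]
        sys.stdout.write(rewrite(url) + "\n")
        sys.stdout.flush()  # Squid waits for each reply; do not buffer
```

Because the script is a plain stdin/stdout filter, you can test it by hand outside of Squid, e.g. by piping it a few sample request lines.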
Treating the Zope backend as an origin server, as described in option 1, is by far the easiest to set up, but not the most effective. I believe this second option is better, although it is a little unconventional.
Squid can also make http requests to other caches, and Zope can understand these requests too. Squid contains some sophisticated logic for managing connections to a pool of other caches, and these features prove useful for managing a pool of backend Zope servers as well.
To implement this solution your redirector script must output a URL where the hostname part is a keyword describing a pool of backend servers, such as http://backendpool/VirtualHostBase/http/www.example.com:80/a/b/c. Note that the hostname part of the URL is not a real host; it is a keyword that will be used in squid's configuration.
In addition you must configure squid with the backend Zope servers as peers, and configure its access control rules so that all requests for that 'host' keyword are directed to those peers. If you have two Zope instances serving redundant copies of the same virtual host, then squid.conf needs to contain lines such as:
cache_peer backendzope1.dmz.example.com parent 8080 8080 no-digest no-netdb-exchange round-robin
cache_peer backendzope2.dmz.example.com parent 8080 8080 no-digest no-netdb-exchange round-robin
acl in_backendpool dstdomain backendpool
cache_peer_access backendzope1.dmz.example.com allow in_backendpool
cache_peer_access backendzope1.dmz.example.com deny all
cache_peer_access backendzope2.dmz.example.com allow in_backendpool
cache_peer_access backendzope2.dmz.example.com deny all
never_direct allow all
The never_direct line ensures that Squid does not try to resolve the backendpool 'host' keyword as if it were a real host name and connect to it directly if all the peers are down. You may need a more sophisticated never_direct acl if you have some backend servers that are not presented as peers.
The configuration above assumes that the two backend Zope servers are providing http and ICP on port 8080.
To use ICP you will need to enable ICP support in your backend Zope servers.
This really needs some examples.
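One squid-side example: if a backend cannot answer ICP queries at all, you can mark it with the standard no-query cache_peer option and an ICP port of 0, so Squid never probes it; a sketch for one peer:

```
# This peer is used without ICP probes (ICP port 0, no-query)
cache_peer backendzope1.dmz.example.com parent 8080 0 no-query no-digest no-netdb-exchange round-robin
```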
You should ensure that your redirector script will only output URLs targeting your backend servers. If your script can output URLs to arbitrary hosts then your accelerator is effectively an open http proxy.
Thanks to Jim Washington for pointing out the security concerns for the redirector script.
Thanks to Robert Collins for the more robust configuration.
Thanks to Wankyu Choi for pointing out the typos in the examples.