How to use Edge Side Includes (ESI) with Varnish in Plone

published Mar 01, 2010 08:50   by admin ( last modified Mar 01, 2010 08:50 )

With ESI it is possible to cache the more or less static parts of Plone's pages in Varnish, and "fold in" the dynamic stuff per request. This can in some use cases increase performance massively.

Putting Varnish in front of Plone is a common way to speed things up, and can lead to insane speed increases. However, Plone and Zope are dynamic and tightly coupled frameworks, and it is a bit of a shame to have them serve out essentially static pages. Even if you use something intelligent like CacheSetup (CacheFu), if you make a change that affects a portlet, e.g. "recent comments", you will need to invalidate all pages where that portlet is shown.

Varnish, however, supports caching only parts of a page, then fetching the dynamic parts and combining them with the static parts before serving the result to the visitor. Can Plone be used with it? Yes it can, with some hoop jumping.

The technology of combining pages on the fly is called Edge Side Includes (ESI) and Varnish supports part of the standard. Here is how it works in Varnish:

If you put in a special xml fragment (stanza) like this in your html output:

<esi:include src="/site/@@left_column"/>

...Varnish will fetch the contents of that URL (http://yoursite/site/@@left_column) and replace the stanza with what it finds there.
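To make the mechanism concrete, here is a minimal Python sketch of the substitution step. It is only a toy model of what an ESI processor does, not Varnish's actual implementation; the `expand_esi` function and the `fragments` dictionary standing in for the backend are made up for illustration.

```python
import re

# Matches a self-closing ESI include tag and captures its src attribute.
ESI_INCLUDE = re.compile(r'<esi:include\s+src="([^"]*)"\s*/>')

def expand_esi(html, fetch):
    """Replace each <esi:include src="..."/> with whatever fetch(src) returns."""
    return ESI_INCLUDE.sub(lambda m: fetch(m.group(1)), html)

# Hypothetical stand-in for the backend: maps URL paths to rendered fragments.
fragments = {"/site/@@left_column": "<div>routine portlet</div>"}

page = '<body><esi:include src="/site/@@left_column"/><p>All moves</p></body>'
result = expand_esi(page, fragments.__getitem__)
print(result)
```

The cached page keeps the placeholder, and only the fragment lookup happens per request.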

The 123-567.net hobby project

My current hobby project is a salsa dancing web site called 123-567.net. It allows visitors to put together their own salsa dancing routine from short fragments ("Moves"). Moves are connected through positions. It is currently running on our hobby server on the office ADSL connection.

In the right hand column there is a portlet that keeps track of each visitor's routine as it is being built up. No login is required but a captcha tries to keep the salsa dancing away from robots (who suck at dancing anyway, currently, although this may change).

There is one big honking page that lists all moves, and traverses a lot of Archetypes references to compute and display all possible moves from each move (and how many moves in turn can follow each such move). This page basically manages to wake up and exercise every single content object on the site, sometimes multiple times, and the page takes 3-5 seconds to render on a small server. The page cannot be made static, since the contents of the routine portlet will be different for each visitor that uses it.

There are multiple ways of optimizing this page while keeping the portlet dynamic, from RAM-caching with memoize (which brings down the render time of the page to 0.2 s), to using KSS, to using the catalog exclusively for calculating moves.

Memoize seemed like a good solution, but since there are also videos on the site, and I did not want to tie up a Plone process serving each one, I still needed some extra oomph. Varnish should be able to cache the videos and offload Zope's precious processes.

Making the portlet render with ESI

Instead of showing the portlet on the page, it should be enough to just show a placeholder for it:

<esi:include src="/site/@@left_column"/>

..and then construct a view by the name "@@left_column" and show the portlet in that view.

Varnish's behavior is controlled by a configuration file, normally called varnish.vcl. The configuration file basically deals with:

  • What it should do with requests coming in
  • What to do with responses going out

The part of the vcl file that deals with responses is a subroutine called vcl_fetch. Here is a simplified part of that code for 123-567.net, with esi enabled:

 

sub vcl_fetch {

    /* The portlet view differs per visitor: never cache it */
    if (req.url == "/site/@@left_column") {
        pass;
    }

    if (obj.http.Content-Type ~ "html") {
        esi;  /* Do ESI processing */
        unset obj.http.set-cookie;
        set obj.ttl = 24 h;
    }
}

      

The esi directive as framed above enables ESI parsing and caching of all HTML pages, but first the preceding if clause ensures that @@left_column is passed through without caching ("pass").
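The order of the two checks matters, so here is the same decision written out as a small Python function. This is only a model of the logic in the excerpt, not something Varnish runs; the `fetch_policy` name and the dictionary it returns are invented for the illustration.

```python
import re
from datetime import timedelta

def fetch_policy(url, content_type):
    """Model the two branches of the vcl_fetch excerpt as plain data."""
    if url == "/site/@@left_column":
        # "pass": hand the response through without caching it.
        return {"action": "pass", "esi": False, "ttl": None}
    if re.search("html", content_type):
        # HTML pages: parse ESI tags, drop Set-Cookie, cache for 24 hours.
        return {"action": "cache", "esi": True, "strip_cookie": True,
                "ttl": timedelta(hours=24)}
    # Anything else is left to whatever the rest of the configuration decides.
    return {"action": "default", "esi": False, "ttl": None}

portlet = fetch_policy("/site/@@left_column", "text/html; charset=utf-8")
page = fetch_policy("/site/moves", "text/html; charset=utf-8")
```

Note that the portlet view is itself HTML, so without the early exit it would be cached for a day like every other page.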

The @@left_column view needs to be constructed. Since the portlet is a proper portlet, I could also register its page template file as a viewlet, but a quicker and dirtier way (the "customer" of this project, i.e. me, is very tolerant of whims and shortcuts :-) is to just put the entire portlet renderer into the view.

<metal:block use-macro="here/global_defines/macros/defines"/>

<tal:block replace="structure provider:plone.leftcolumn" />

 

Come to think of it, that may not be quicker... Anyway, now the site renders the page statically and Varnish folds in the portlet. We are down to 50 ms rendering time from 5 s, according to Apache's ab tool (with an empty portlet).

The portlet is context free: it renders the same on all pages. It uses the Zope (not Plone) session machinery and is essentially independent of Plone, so it should render very quickly if put directly in the view instead of going through the provider stanza above. For portlets that need context, one could put a query string parameter in the ESI stanza, carrying the path of the calling page, and then use that to recreate the context on the receiving end.
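I have not needed this myself, but the idea of passing the context along could look roughly like the Python sketch below. The parameter name `context_path` and both helper functions are made up for the example, and the path "/site/moves/cross-body-lead" is just a sample value.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def esi_src_for(context_path, view="/site/@@left_column"):
    """Build the src for the ESI tag, carrying the calling page's path."""
    return view + "?" + urlencode({"context_path": context_path})

def context_path_from(src):
    """On the receiving end, recover the path from the query string."""
    query = parse_qs(urlsplit(src).query)
    return query["context_path"][0]

src = esi_src_for("/site/moves/cross-body-lead")
recovered = context_path_from(src)
print(src)
print(recovered)
```

The receiving view would still have to look up the object at that path, and should treat the value as untrusted input since anyone can request the view directly. Keep in mind that each distinct src is a separate URL as far as Varnish is concerned.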

But now the site is broken without Varnish...

However, if we view the site without Varnish the portlet will never materialize on the pages. There will just be a stupid ESI directive sitting there. I believe it is important that the site can be used without Varnish.

The obvious solution would be this:

<esi:include src="/site/@@left_column">
<!-- portlet goes here -->
</esi:include>

One would hope that Varnish would replace the entire ESI element. However, Varnish is not ZPT, and if you put in the above code, Varnish will replace only the start tag of the ESI element with the portlet, and happily let the rest of the code stay on the page. Viewed through Varnish you would now have two portlets on the page (of which one is stale).
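The effect is easy to reproduce in a toy Python model. This mimics only the behaviour described above (the start tag is swapped out, everything after it stays); it is not Varnish's parser, and the function name and sample markup are invented for the illustration.

```python
import re

# Only the opening <esi:include ...> tag is matched, mirroring the behaviour
# described above; the fallback body and closing tag are left untouched.
ESI_START_TAG = re.compile(r'<esi:include\s+src="([^"]*)"\s*/?>')

def expand_start_tag_only(html, fetch):
    return ESI_START_TAG.sub(lambda m: fetch(m.group(1)), html)

page = (
    '<esi:include src="/site/@@left_column">'
    '<div class="portlet">stale</div>'
    '</esi:include>'
)
fresh = '<div class="portlet">fresh</div>'
result = expand_start_tag_only(page, lambda src: fresh)
portlet_count = result.count('class="portlet"')
print(portlet_count)
```

Both the freshly fetched portlet and the one baked into the cached page end up in the output.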

So we need a way for the portlet to "know":

  • If it is behind Varnish
  • If Varnish has any intentions of caching it

It seems you can find this out from the Plone side! Analyzing the contents of the request variable shows that there is an HTTP_X_VARNISH environment variable which will only be present if Varnish is in front. That takes care of displaying the portlet when you are directly accessing Zope. But when behind Varnish, how do we know if we are on the page (output the ESI tag) or on the specially crafted view (output the portlet)? One way is to check the path. The specially crafted view has the string "left_column" in its id. Quickly and dirtily we can do it in the page template:

<tal:varnish define="behind_varnish python:context.REQUEST.get('HTTP_X_VARNISH', False);
                     not_cached_in_varnish python:'left_column' in context.REQUEST['URL0']">

<esi:include src="/site/@@left_column"
             tal:condition="python:behind_varnish and not not_cached_in_varnish"/>

<tal:portlet condition="python:not behind_varnish or not_cached_in_varnish">
[...]
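The two conditions are easier to check when written out as plain Python. The sketch below mirrors the expressions in the template; the `choose_output` function, the sample URLs and the header value "1234" are made up for the example.

```python
def choose_output(environ, url):
    """Return 'esi' to emit the placeholder tag, 'portlet' to render inline."""
    behind_varnish = bool(environ.get("HTTP_X_VARNISH", False))
    not_cached_in_varnish = "left_column" in url
    if behind_varnish and not not_cached_in_varnish:
        return "esi"
    return "portlet"

cases = [
    ({}, "http://yoursite/site/moves"),                        # direct to Zope
    ({"HTTP_X_VARNISH": "1234"}, "http://yoursite/site/moves"),
    ({"HTTP_X_VARNISH": "1234"}, "http://yoursite/site/@@left_column"),
]
outputs = [choose_output(environ, url) for environ, url in cases]
print(outputs)
```

Only an ordinary page requested through Varnish gets the ESI tag; in every other case the portlet is rendered directly.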

Installing varnish

You can do it with a buildout recipe.

Monitoring varnish

You can use varnishtop and varnishhist, which probably come along if you install Varnish centrally on your machine. However, to monitor the buildout-based Varnish we need to specify its name. The buildout recipe does not allow setting one, and I do not think it would work even if it were possible. Instead of a name, we give the name parameter ("-n") the path to where the buildout's Varnish .vsh directory is.

varnishtop -n absolutepath/to/buildout/parts/varnish-build/var/varnish/

..and then tab complete to the lonely dir in there.