How to use caching in Jahia

This document explains how and when to use the different caches in Jahia.

Introduction to caches

Jahia basically has 3 cache layers :

This document only presents the configuration options for the output cache. For more detailed information about caching in general, see the Jahia Performance Tuning Guide.

The output cache system has two subsystems: the HTML output cache and the WebDAV file cache.

The WebDAV file cache is a small file cache, detailed at the end of this document.

Jahia offers several HTML output cache implementations that cover different needs. Choosing the proper implementation depends largely on the expected load and on the volume of content hosted on the Jahia servers. The table below is meant to help you choose your cache implementation.

Usage                                   Full-page cache   Container cache   ESI cache
Large amounts of content                *                 X                 X
Lots of users                           -                 X                 X
Dynamic pages (AJAX, Portlets)          X                 X                 +
Immediate invalidation on update        X*                X                 -
All modes (live/edit/preview/compare)   X*                X                 -
Remote cache server                     -                 -                 X
Default cache                           -                 X                 -
* = the full-page cache can be set up to also use time-based expiration, in which case it will be able to handle more content, but will no longer expire upon content modification. Also note that this will deactivate EDIT mode caching, which will slow down page rendering.
+ = ESI is compatible with dynamic technologies, but not recommended as it will relay the requests. Also due to the implementation, the rendering of portlets will always have to go to the Jahia server, which reduces the usefulness of this server if portlets are used on high-traffic pages.

The table above shows which output cache implementation best suits each type of Jahia installation.

HTML Full-page Cache

What is the HTML cache

One of the first caches introduced in Jahia is the Full-page HTML Output cache. It stores the full rendered HTML of each page, for each user and for each mode.

Why do we need the HTML cache

The Full-page HTML Output cache is very efficient for a low number of pages and users, but when traffic grows it is less effective than the ESI cache or the HTML Container cache, as it is not suited to large web sites with a lot of different users.

How do we configure the HTML cache

I - In tomcat/webapps/jahia/WEB-INF/etc/config/jahia.properties

Set

outputCacheActivated=true

And set the other caches, outputContainerCacheActivated and esiCacheActivated, to false.
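As a rough illustration of the rule above (only one output cache switched on at a time), here is a minimal Java sketch that lists which of the three flags are set to true. The class OutputCacheFlags and its methods are hypothetical helpers written for this example, not part of Jahia; only the three property names come from jahia.properties.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper, not part of Jahia: lists which output caches are switched on.
class OutputCacheFlags {
    // Property names as they appear in jahia.properties.
    static final String[] FLAGS = {
        "outputCacheActivated",
        "outputContainerCacheActivated",
        "esiCacheActivated"
    };

    // Returns the names of all flags whose value is "true" (missing flags count as false).
    static List<String> activeCaches(Properties props) {
        List<String> active = new ArrayList<>();
        for (String flag : FLAGS) {
            if (Boolean.parseBoolean(props.getProperty(flag, "false").trim())) {
                active.add(flag);
            }
        }
        return active;
    }

    // Parses properties text; a real check would read the jahia.properties file instead.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        Properties props = parse(
            "outputCacheActivated=true\n"
            + "outputContainerCacheActivated=false\n"
            + "esiCacheActivated=false\n");
        List<String> active = activeCaches(props);
        if (active.size() == 1) {
            System.out.println("Active output cache: " + active.get(0));
        } else {
            System.out.println("Expected exactly one active cache, found: " + active);
        }
    }
}
```

Running such a check before restarting the server is one way to catch a configuration where two caches are left active by mistake.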

II - In tomcat/webapps/jahia/WEB-INF/etc/config/process-pipeline.xml

Comment out

<valveDescriptor>
<className>org.jahia.operations.valves.SkeletonAggregatorValve</className>
</valveDescriptor>

And

<valveDescriptor>
<className>org.jahia.operations.valves.SkeletonParseAndStoreValve</className>
</valveDescriptor>

Also uncomment

<valveDescriptor>
<className>org.jahia.operations.valves.CacheReadValve</className>
</valveDescriptor>

And

<valveDescriptor>
<className>org.jahia.operations.valves.CacheWriteValve</className>
</valveDescriptor>

Do the opposite to switch back to the container cache or the ESI cache.

III (optional) - Time-based expiration mode

If you wish to set up the Full-page HTML cache in expiration-only mode, so that it can handle more content, make the following modifications in the jahia.properties file:

Set

outputCacheExpirationOnly=true

And also set

outputCacheDefaultExpirationDelay=TIME_DELAY_IN_MILLISECONDS

where TIME_DELAY_IN_MILLISECONDS is the time delay after which a page will expire in the full-page cache.

Please note that activating this mode will disable page caching in EDIT mode, so this will have a performance impact in this mode.
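Since the delay is expressed in milliseconds, it is easy to get the number of zeros wrong. The following Java sketch derives the value from a readable duration and prints the two property lines. The class name ExpirationDelay and the ten-minute delay are assumptions chosen for illustration only.

```java
import java.time.Duration;

// Hypothetical helper, not part of Jahia: builds the expiration-only settings.
class ExpirationDelay {
    // Formats the two jahia.properties lines for expiration-only mode.
    static String expirationOnlySettings(Duration delay) {
        return "outputCacheExpirationOnly=true\n"
             + "outputCacheDefaultExpirationDelay=" + delay.toMillis() + "\n";
    }

    public static void main(String[] args) {
        // Ten minutes is an arbitrary example value, i.e. 600000 milliseconds.
        System.out.print(expirationOnlySettings(Duration.ofMinutes(10)));
    }
}
```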

Container cache

What is the container cache

Jahia has introduced a new HTML cache system that acts at the container level. The container cache replaces the HTML cache at the page level. Only containers are cached (i.e. fields, data), but not container lists. The main benefits of this new implementation are:

Why do we need the container cache

For medium to large web sites, we found the need for a cache that is more efficient than the classical HTML one. This solution is therefore better suited to web sites with a fair number of users and infrequent contributions.

How do we configure the container cache

Activating the container cache in Jahia is very easy :

I - In the jahia.properties file

Set

outputCacheActivated=false

and set

outputContainerCacheActivated=true

II - In tomcat/webapps/jahia/WEB-INF/etc/config/process-pipeline.xml

Uncomment

<valveDescriptor>
<className>org.jahia.operations.valves.SkeletonAggregatorValve</className>
</valveDescriptor>

And

<valveDescriptor>
<className>org.jahia.operations.valves.SkeletonParseAndStoreValve</className>
</valveDescriptor>

Also comment out

<valveDescriptor>
<className>org.jahia.operations.valves.CacheReadValve</className>
</valveDescriptor>

And

<valveDescriptor>
<className>org.jahia.operations.valves.CacheWriteValve</className>
</valveDescriptor>

In most cases, no development is needed to take full advantage of this cache. However, in some complex cases, the default behavior of the cache may need to be modified, and the template developer may have to change the cache parameters to obtain the correct behavior. We now take an in-depth look at how this cache works.

Integrating the container cache

I - Basic functioning

By default, the standard <content:container> tag caches all of its output and returns the cached HTML when it is reused.

The container cache can be disabled for one container, or for all containers of a list.

If the container needs to be reused in multiple places with different renderings, the cacheKey attribute can be used to set a unique key for each rendering.

A container that is not entirely cached can still cache some of its parts by using the <content:container-cache> tag.

The container cache is invalidated when one of its fields or sub-containers is modified. Other containers in the page remain cached.

II - Disabling the cache

To disable the container cache for all the containers in a container list, set the cache attribute to "off":

<content:containerList name="webappContainer" id="webappsContainerList"
parentContainerName="boxContainer">
<content:container cache="off">
...
</content:container>
</content:containerList>

III - Multiple rendering

Sometimes a container is used on different spots in the same page, and rendered in different ways. For example, in the faq template :

<content:container id="qaContainer">
<a href="#<bean:write name='qaContainer' property='id'/>"
class="bold"><content:textField name="question"/></a>
</content:container>
...
<content:container id="qaContainer">
<a name="<bean:write name='qaContainer' property='id'/>"
class="bold"><content:textField name="question"/></a><br/>
<content:textField name="answer"/>
<br/>&nbsp;
</content:container>

With the container cache activated, the second container tag will use the cache generated by the first one. This can be fixed by adding the cacheKey attribute:

<content:container id="qaContainer" cacheKey="question">
<a href="#<bean:write name='qaContainer' property='id'/>"
class="bold"><content:textField name="question"/></a>
</content:container>
...
<content:container id="qaContainer" cacheKey="answers">
<a name="<bean:write name='qaContainer' property='id'/>"
class="bold"><content:textField name="question"/></a><br/>
<content:textField name="answer"/>
<br/>&nbsp;
</content:container>

cacheKeyName, cacheKeyProperty and cacheKeyScope attributes can be used to get the actual value from a bean.
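To see why the cacheKey attribute matters, consider the following toy model in Java. It is only a sketch: the class CacheKeyDemo is hypothetical, and Jahia's real container cache is more involved than a HashMap keyed by container ID and cache key. It does show the effect described above, where two renderings of the same container collide unless each one has its own key.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Toy model only: entries are identified by the container ID plus an optional cacheKey.
class CacheKeyDemo {
    private final Map<String, String> cache = new HashMap<>();

    // Returns the cached HTML for (containerId, cacheKey), rendering it on a miss.
    String render(int containerId, String cacheKey, Supplier<String> renderer) {
        String key = containerId + "_" + (cacheKey == null ? "" : cacheKey);
        return cache.computeIfAbsent(key, k -> renderer.get());
    }

    public static void main(String[] args) {
        CacheKeyDemo noKeys = new CacheKeyDemo();
        System.out.println(noKeys.render(12, null, () -> "<a href='#12'>Question</a>"));
        // Same container, no cacheKey: the link markup cached above is returned again.
        System.out.println(noKeys.render(12, null, () -> "<a name='12'>Question</a> Answer"));

        CacheKeyDemo withKeys = new CacheKeyDemo();
        System.out.println(withKeys.render(12, "question", () -> "<a href='#12'>Question</a>"));
        // A distinct cacheKey gives the second rendering its own entry.
        System.out.println(withKeys.render(12, "answers", () -> "<a name='12'>Question</a> Answer"));
    }
}
```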

IV - Partial container caching

As mentioned before, the container cache can be disabled at the container level, but it is still possible to cache parts of the container. Use the following to take advantage of this feature:

<content:container id="topMenuContainer" cache="off">
<content:pageField valueId="topLink" name="topLink" id="topLinkField" />
<content:container-cache>
... display ...
</content:container-cache>
</content:container>

The container-cache tag can also use the cacheKey attributes.

V - Dependencies

A container cache is invalidated when one of its fields or one of its sub-container lists is modified. In some cases, you might need to add dependencies on other container lists (for example when displaying the sub pages):

<% ContainerTag.addContainerListDependency(pageContext, containerListID); %>

You can also add dependencies to any container :

<% ContainerTag.addContainerDependency(pageContext, containerId); %>
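The idea behind dependencies can be sketched with a small toy model in Java. Everything here is illustrative: DependencyCacheDemo and the string keys such as "containerList:42" are hypothetical and do not reflect Jahia's internal classes. The sketch only shows the principle that modifying an object flushes the entries that declared a dependency on it and leaves the others cached.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;

// Toy model only; the names below are illustrative and not Jahia classes.
class DependencyCacheDemo {
    private final Map<String, String> html = new HashMap<>();
    private final Map<String, Set<String>> dependencies = new HashMap<>();

    // Stores an entry together with the content object keys it depends on.
    void put(String cacheKey, String output, Set<String> dependsOn) {
        html.put(cacheKey, output);
        dependencies.put(cacheKey, new HashSet<>(dependsOn));
    }

    // Returns the cached HTML, or null if the entry is absent or was flushed.
    String get(String cacheKey) {
        return html.get(cacheKey);
    }

    // Flushes every entry that declared a dependency on the modified object.
    int invalidate(String modifiedObjectKey) {
        int flushed = 0;
        Iterator<Map.Entry<String, Set<String>>> it = dependencies.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Set<String>> entry = it.next();
            if (entry.getValue().contains(modifiedObjectKey)) {
                html.remove(entry.getKey());
                it.remove();
                flushed++;
            }
        }
        return flushed;
    }

    public static void main(String[] args) {
        DependencyCacheDemo cache = new DependencyCacheDemo();
        Set<String> menuDeps = new HashSet<>();
        menuDeps.add("container:10");
        menuDeps.add("containerList:42"); // extra dependency, e.g. the sub pages list
        cache.put("menu", "<ul>...</ul>", menuDeps);
        Set<String> footerDeps = new HashSet<>();
        footerDeps.add("container:11");
        cache.put("footer", "<p>...</p>", footerDeps);

        System.out.println("flushed: " + cache.invalidate("containerList:42"));
        System.out.println("menu cached: " + (cache.get("menu") != null));
        System.out.println("footer cached: " + (cache.get("footer") != null));
    }
}
```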

VI - Known issues

Containers with context-conditional rendering like the left menu.

Containers that do not display content, but set beans in the page like the siteSettings container.

VII - Debugging

The "containercache" parameter can be used in the URL for debugging:

– /containercache/off - disable the cache

– /containercache/offonce - refresh the cache entries

– /containercache/debug - shows debug information

These options act on all the containers on the page.

VIII - Container cache in scriptlets

For those who still need to use scriptlets, activating the container cache is not much harder than with tags. As an example, here is the display of a container list.
The comments in the code below detail how to build a cache entry from a container list and store it in the cache.

// call the service
final ContainerHTMLCache cacheInstance =
          ServicesRegistry.getInstance().getCacheService().getContainerHTMLCacheInstance();
final ProcessingContext context = jData.getProcessingContext();
// this is the cache key generation
// the key has to be unique for the content object (e.g. the container list name + the site key)
final String rawCacheKey =
          new StringBuffer("myContainerList").append("_site_").append(context.getSiteKey()).toString();
// to avoid AJAX issues, the cache key has to be modified to include additional information
// concerning advanced edit settings, using the method appendAESMode
final String cacheKey = ContainerHTMLCache.appendAESMode(context, rawCacheKey);
String htmlOutput = ""; // this will contain the generated HTML (initialized so it is defined on every path)
// get the entry from cache if it exists
final ContainerHTMLCacheEntry htmlCacheEntry =
          cacheInstance.getFromContainerCache(null, jData, cacheKey, false, 0, null, null);
String cacheParam = context.getParameter(ProcessingContext.CONTAINERCACHE_MODE_PARAMETER) ;
if (htmlCacheEntry != null && !"off".equals(cacheParam)) {
  htmlOutput = htmlCacheEntry.getBodyContent(); // HTML is cached
} else {
  // myContainerList is the list to be displayed (previously declared elsewhere)
  StringBuffer sbf = new StringBuffer(); // this will contain the HTML
  Set<ContentObjectKey> dependencies = new HashSet<ContentObjectKey>(); // this will contain all the dependencies
  if (myContainerList != null) {
    dependencies.add(new ContentContainerListKey(myContainerList.getID()));
    myContainers = myContainerList.getContainers();
    while (myContainers.hasMoreElements()) {
      JahiaContainer myContainer = (JahiaContainer) myContainers.nextElement();
      // dependencies have to be updated for each container the cache entry depends on
      // (e.g. any content object contained in the container list)
      dependencies.add(new ContentContainerKey(myContainer.getID()));
      /* take care of container display here, adding the html to the StringBuffer sbf (no direct output) */
    }
    htmlOutput = sbf.toString(); // HTML is put in the output String
    // this is how HTML is cached, using the service
    cacheInstance.writeToContainerCache(null, jData, htmlOutput, cacheKey, dependencies);
  }
}
// this is the actual output, where HTML is wrapped in a specific esi tag
// to build a skeleton corresponding to the current URL
// this skeleton will have references to the container cache fragments to retrieve them later
out.print("<esi:include src=\"" + context.getSiteURL(context.getSite(), context.getPageID(), false, true, true) +
          "?ctnid=0&cacheKey=" + cacheKey + "\">");
out.print(htmlOutput); // actual output
out.print("</esi:include>\n");

Please note that in order to benefit from updates and get better support, scriptlets should be replaced by tags.

IX - Changing the cache provider (for advanced users only)

Jahia supports two cache providers: the default one (DEFAULT_CACHE) and EHCache (EH_CACHE). To enable or disable either of them, modify the applicationContext-services.xml file in WEB-INF/etc/spring.

<bean id="JahiaCacheService" parent="proxyTemplate">
  <property name="target">
    <bean class="org.jahia.services.cache.CacheFactory" parent="jahiaServiceTemplate" factory-method="getInstance" >
      <property name="cacheProviders">
        <map>
          <entry>
            <key><value>DEFAULT_CACHE</value></key>
            <ref bean="org.jahia.services.cache.CacheProvider" />
          </entry>
          <entry>
            <key><value>EH_CACHE</value></key>
            <bean class="org.jahia.services.cache.ehcache.EhCacheProvider">
              <!-- This property allows to fix a limit for cache entries dependencies management,
                   if an entry have more than this value of dependencies then
                   when we flush this entry we will flush the whole cache-->
              <property name="groupsSizeLimit"><value>100</value></property>
            </bean>
          </entry>
        </map>
      </property>
      <property name="cacheProviderForCache">
        <map>
          <entry>
            <key><value>SkeletonCache</value></key>
            <value>DEFAULT_CACHE</value>
          </entry>
          <entry>
            <key><value>ContainerHTMLCache</value></key>
            <value>DEFAULT_CACHE</value>
          </entry>
          <entry>
            <key><value>LockAlreadyAcquiredMap</value></key>
            <value>EH_CACHE</value>
          </entry>
          <entry>
            <key><value>LockPrerequisitesResultMap</value></key>
            <value>EH_CACHE</value>
          </entry>
        </map>
      </property>
    </bean>
  </property>
</bean>

In the cacheProviderForCache map, choose another provider for the ContainerHTMLCache entry.

ESI

What is ESI (Cache proxy server)

The ESI server dynamically caches and assembles HTML fragments without having to regenerate them from the underlying Jahia application server and database. Coupled with AOP technology, it automatically detects any Jahia modifications on the authoring server and transparently flushes and manages your HTML fragments.

Why do we use ESI

ESI is used to further reduce response times and to support massive user loads. The main advantages of this server are improved performance, optimized memory use through sharing fragments among groups of users, automatic invalidation, and cache server clustering support.

How to install and setup an ESI server with Jahia

To install the ESI server, please proceed with the following steps :

I - Stand-Alone mode

  1. Install a Jahia instance on port 8080 (the default port when installing a new Jahia server).
  2. Once installed, make the following modifications to the jahia.properties (by simply pasting the following text snippet over the appropriate section in your jahia.properties). Make sure to replace YOUR_ESI_SERVER_IP where appropriate.
    [...]
    ######################################################################
    ### Output cache #####################################################
    ######################################################################
    # The output (HTML) cache may also be controlled in more detail with the
    # following parameters.
    outputCacheActivated = false
    # the following value is in milliseconds, set to -1 for no time expiration
    outputCacheDefaultExpirationDelay = -1

    ######################################################################
    ### ESI fragment-based Output cache #################################
    ######################################################################
    # Note that the above conventional cache must be deactivated
    # for ESI fragment caching to work
    esiCacheActivated = true
    # Login details required to send invalidations to the ESI server. Please
    # make sure these correspond to those declared in your ESI server's
    # WEB-INF\config\data.xml config file.
    esiCacheServerLogin = admin
    esiCacheServerPassword = password
    # Display an HTML fieldset box around fragment tags
    # useful for debugging purposes only :
    esiDisplayFragmentDelimiters = false
    # Used to display the referenced content IDs in each Template/Fragment
    # useful for debugging purposes only :
    esiDisplayContentIDs = false
    # To ensure cache coherence, should the cache of the remote ESI server
    # be cleared at boot time:
    esiCleanupCacheAtBootTime = true
    # It is recommended to set this setting to false in Cluster mode, otherwise if a cluster node
    # reboots it will flush the ESI server cache for all other nodes.
    # In order to do carry out esiCleanupCacheAtBootTime and for other invalidation purposes,
    # Jahia needs to know address(es) used to access one or more of the ESI server instances at boot time.
    # You can declare a ';' seperated list of IP/server combinations
    # e.g. :
    # esiServerIPs = 192.168.2.178;192.168.4.2;80.2.201.100
    # esiServerPorts = 8081;7079;80
    # which corresponds to addresses 192.168.2.178:8081, 192.168.4.2:7079 and 80.2.201.100:80
    esiServerIPs = YOUR_ESI_SERVER_IP
    esiServerPorts = 8081
    # The IP/Port of the ESI server where all SOAP invalidation messages are sent to
    # Similarly to the esiServerIPs/esiServerPorts notation, you can declare more than one.
    esiInvalidationIPs = YOUR_ESI_SERVER_IP
    esiInvalidationPorts = 6666
    # force use of cookies (instead of URL rewriting) to transmit user specific ESI cache key parameter
    esiUseCookieUserIdentifiers = true
    # encrypt the user specific identifiers (recommended in production environments)
    esiEncryptUserIdentifiers = true
    [...]

  3. Download the latest ESI server build at http://nightly.jahia.org/cacheserver/ and install it in \YOUR_ESI_DIR
  4. Edit the file \YOUR_ESI_DIR\tomcat\webapps\ROOT\WEB-INF\config\data.xml and change the address of
    your remote Jahia server in the following entry:
    
    <server className="net.sf.j2ep.servers.BaseServer"
    domainName="localhost:8080"
    isRewriting="false"
    id="myServer3"
    usingVirtualHost="true">
    
        <rule className="net.sf.j2ep.rules.AcceptEverythingRule"/>
    </server>
    
    
    so that it becomes:
    <server className="net.sf.j2ep.servers.BaseServer"
    domainName="YOUR_JAHIA_SERVER_IP:8080"
    isRewriting="false"
    id="myServer3"
    usingVirtualHost="true">
    
        <rule className="net.sf.j2ep.rules.AcceptEverythingRule"/>
    </server>
    
    

    For each site that you add (including the default one), if you decide to use hostname-based site identification via the "site server name" (e.g. www.mywebsite.com) setting in the "Manage Virtual Sites" administration dialog, you must add a HostRule server rule entry to your data.xml which therefore becomes:
    
    <server className="net.sf.j2ep.servers.BaseServer"
    domainName="www.mywebsite.com:8080"
    isRewriting="false"
    id="myServer3"
    usingVirtualHost="true">
    
       <rule className="net.sf.j2ep.rules.HostRule" hostname="www.mywebsite.com" port="8081" />
    </server>
    
    <server className="net.sf.j2ep.servers.BaseServer"
    domainName="YOUR_JAHIA_SERVER_IP:8080"
    isRewriting="false"
    id="myServer3"
    usingVirtualHost="true">
    
        <rule className="net.sf.j2ep.rules.AcceptEverythingRule"/>
    </server>
    

    You need to manually add an associated HostRule server entry in the data.xml for each new site (where hostname-based site identification is active) you create in Jahia. The order of the entries is significant, so always keep the bottom AcceptEverythingRule server entry as the last entry.

  5. To activate routing to multiple Jahia instances, you need to comment out this part in data.xml:
    
    <server className="net.sf.j2ep.servers.BaseServer"
    domainName="localhost:8080"
    isRewriting="false"
    id="myServer3"
    usingVirtualHost="true">
    
        <rule className="net.sf.j2ep.rules.AcceptEverythingRule"/>
    </server>
        

    and uncomment this :

        <!--cluster-server className="net.sf.j2ep.servers.RoundRobinCluster" isRewriting="false" id="myCluster1"
             usingVirtualHost="true" >
             <server domainName="192.168.2.129:8080"  />   (=second server)
             <server domainName="192.168.2.194:8080" />    (=first server)
             <rule className="net.sf.j2ep.rules.AcceptEverythingRule" description="catch all"/>
        </cluster-server-->
    
        
    However this feature has not been thoroughly tested.

    To obtain the same behavior with a hardware load-balancer (like F5), use the configuration described in step 4, and replace the "YOUR_JAHIA_SERVER_IP:8080" value with the address of the hardware load-balancer. The load-balancer will then reroute the requests to multiple Jahia servers. Make sure sticky sessions are enabled on the load-balancer.

  6. Run the ESI server with the following command (or use the "Start Jahia Cache Server" icon in your Windows Start menu):

    \YOUR_ESI_DIR\tomcat\bin\catalina start (Windows)

  7. Start the Jahia server with the following command :

    \YOUR_JAHIA_DIR\bin\jahia.bat (Windows)

    or

    /YOUR_JAHIA_DIR/bin/jahia.sh (Linux)

  8. You can now browse Jahia via the ESI server at http://YOUR_ESI_SERVER_IP:8081/jahia/Jahia. Note that Jahia is still accessible at http://YOUR_JAHIA_SERVER_IP:8080/jahia/Jahia.

  9. If you need debug information on all ESI-related processing, you can set all ESI-related classes to the "debug" level in \YOUR_JAHIA_DIR\tomcat\webapps\jahia\WEB-INF\etc\config\log4j.xml.

    You can also activate debug mode on the ESI server by replacing "info" values by "debug" in \YOUR_ESI_DIR\tomcat\webapps\ROOT\WEB-INF\log4j.xml. You will need to restart the ESI server for any changes to be taken into account.

  10. To access the administration center of the ESI server, please go to http://YOUR_ESI_SERVER_IP:8081/esiadmin/index.jsp or go to "Server and Cache Status" section in Jahia's administration menu. To get access to the admin center, you will be prompted for authentication details by Tomcat.

    For security purposes, the first time you boot the ESI server a random password is assigned to the user Jahia and stored in \YOUR_ESI_DIR\tomcat\conf\tomcat-users.xml. Because this login information has only just been added to tomcat-users.xml, it is not yet available to Tomcat, so you will not be able to access the admin center until you reboot the server at least once. Log in with the user Jahia and the password found in tomcat-users.xml. You can then edit tomcat-users.xml and change the username or password, as long as your user belongs to the esiadmin role.
    Example tomcat-users.xml:

    <?xml version='1.0' encoding='utf-8'?>
    <tomcat-users>
      <role rolename="esiadmin" description="role with rights to access the ESI server Admin
          Center"/>
      <role rolename="tomcat"/>
      <role rolename="role1"/>
      <role rolename="manager"/>
      <user username="tomcat" password="tomcat" roles="tomcat"/>
      <user username="role1" password="tomcat" roles="role1"/>
      <user username="both" password="tomcat" roles="tomcat,role1"/>
      <user username="Jahia" password="rv7vDr10eC" fullName="ESI Admin Center
          Administrator" roles="esiadmin"/>
    </tomcat-users>					   

Note: In this example config, we used port 8080 for the Jahia server and 8081 for the ESI server. However, you can change this to any suitable port combination by first changing Jahia's and ESI's \tomcat\conf\server.xml files.

To change the memory allocated to the ESI server, go to \YOUR_ESI_DIR\tomcat\bin\catalina.bat (or catalina.sh under Linux), and change the memory allocation options on line 40 (or line 45 in catalina.sh): "-Xms64m -Xmx512m -XX:MaxPermSize=256m".
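The pairing of esiServerIPs with esiServerPorts from step 2 (the first IP goes with the first port, and so on) can be sketched in Java as follows. The class EsiAddressPairs is a hypothetical helper and not Jahia code. The check that both lists have the same length is an assumption about what a sensible configuration looks like, not documented Jahia behavior.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Jahia: combines the two ';' separated lists.
class EsiAddressPairs {
    // Pairs the i-th IP with the i-th port and returns "ip:port" strings.
    static List<String> pair(String ips, String ports) {
        String[] ipList = ips.trim().split(";");
        String[] portList = ports.trim().split(";");
        if (ipList.length != portList.length) {
            // Assumption: every IP needs a matching port.
            throw new IllegalArgumentException(
                "esiServerIPs and esiServerPorts must have the same number of entries");
        }
        List<String> addresses = new ArrayList<>();
        for (int i = 0; i < ipList.length; i++) {
            int port = Integer.parseInt(portList[i].trim()); // rejects non-numeric ports
            addresses.add(ipList[i].trim() + ":" + port);
        }
        return addresses;
    }

    public static void main(String[] args) {
        // Values taken from the comment in the jahia.properties snippet.
        System.out.println(pair("192.168.2.178;192.168.4.2;80.2.201.100", "8081;7079;80"));
    }
}
```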

II - Cluster mode

On the ESI side, the configuration lets you choose whether to run ESI in a cluster, whether to propagate invalidation messages between nodes, and whether to use fail-over or load-balancing rules between your Jahia servers. If your ESI servers are not clustered, each one has to generate all the pages itself. If they are clustered, they share pages and fragments, which splits the work between the nodes (roughly in half for a two-node cluster).

To configure a cluster, use EHCACHE instead of REFERENCECACHE for "cacheProvider" and set "clusterEsiCache" to true in tomcat/webapps/ROOT/WEB-INF/config/data.xml on the ESI server.

The cluster itself is then configured in the file tomcat/webapps/ROOT/WEB-INF/classes/ehcache-cluster-esi.xml. In this file there are two elements to set for your cluster:

– First element: "cacheManagerPeerProviderFactory". In this node you need to list all your other ESI nodes, stating on which IP and port they are listening for each internal cache of ESI (there are five of them; see the comments inside the file to understand how to declare each server).

– Second element: "cacheManagerPeerListenerFactory". This node defines on which IP and port you are listening.

To relay the invalidation messages, set "relayInvalidationsToRemoteEsiServers" to true, then list in "remote-esi-servers" all the ESI servers you want to relay to. At the end of the file you will find the definitions of the redirection rules between ESI and Jahia. For example, to define fail-over rules between two Jahia servers and ESI:

<cluster-server className="net.sf.j2ep.servers.FailOverCluster"
isRewriting="false" id="myCluster2" usingVirtualHost="true">
<server domainName="192.168.2.129:8080" />
<server domainName="192.168.2.194:8080" />
<rule className="net.sf.j2ep.rules.AcceptEverythingRule" description="catch all"/>
</cluster-server>


In this case all ESI requests are forwarded to the first Jahia server defined, and if an error occurs while forwarding a request, ESI switches to another node. Refer to the documentation inside "WEB-INF/config/data.xml" for more configuration examples.
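The difference between the fail-over rule above and the round-robin rule mentioned for stand-alone mode can be illustrated with a short Java sketch. This is not the code of the FailOverCluster or RoundRobinCluster classes. ClusterSelectionDemo, its methods and the isUp predicate are hypothetical names that only mimic the selection behavior described in this document.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Toy model only: mimics how a target server could be chosen for a request.
class ClusterSelectionDemo {
    // Fail-over: always the first server that is reachable, in declaration order.
    static String failOver(List<String> servers, Predicate<String> isUp) {
        for (String server : servers) {
            if (isUp.test(server)) {
                return server;
            }
        }
        throw new IllegalStateException("no server available");
    }

    // Round-robin: requests rotate through the servers in turn.
    static String roundRobin(List<String> servers, int requestNumber) {
        return servers.get(Math.floorMod(requestNumber, servers.size()));
    }

    public static void main(String[] args) {
        List<String> servers = Arrays.asList("192.168.2.129:8080", "192.168.2.194:8080");
        // With every server up, fail-over keeps using the first one.
        System.out.println(failOver(servers, s -> true));
        // If the first server is down, the next one takes over.
        System.out.println(failOver(servers, s -> !s.startsWith("192.168.2.129")));
        // Round-robin alternates between the servers.
        for (int request = 0; request < 4; request++) {
            System.out.println(roundRobin(servers, request));
        }
    }
}
```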

WebDAV Files Cache

Each time a WebDAV file is requested by a browser from a Jahia server, it is streamed from the Slide content store. When using database storage, this can be slow and ties up a database connection. The goal of the cache is to keep small, frequently served binaries, in order to reduce the database load caused by blob streaming.

Configuration

A single parameter is added to the web.xml to configure the maximum size of the files to be cached :

        <servlet>
            <display-name>Slide DAV Server</display-name>
            <servlet-name>webdav</servlet-name>
        ...
            <init-param>
                <param-name>cache-threshold</param-name>
                <param-value>65536</param-value>
            </init-param>
        </servlet>
    

The value of the cache-threshold parameter is expressed in bytes. Here, only files smaller than 64 KB (65536 bytes) are cached.
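The effect of the threshold can be expressed as a one-line size check. The Java sketch below is illustrative only: WebdavCacheThreshold is a hypothetical class, and the strict "smaller than" comparison follows the wording above rather than the actual servlet code.

```java
// Hypothetical helper, not part of Jahia: decides whether a file fits the cache.
class WebdavCacheThreshold {
    // Returns true when a file is small enough to be kept in the WebDAV cache.
    // Assumption: a file exactly at the threshold is not cached ("smaller than").
    static boolean isCacheable(long fileSizeInBytes, long cacheThresholdInBytes) {
        return fileSizeInBytes < cacheThresholdInBytes;
    }

    public static void main(String[] args) {
        long threshold = 65536; // value of cache-threshold in web.xml
        System.out.println("20 KB icon cacheable: " + isCacheable(20 * 1024, threshold));
        System.out.println("2 MB PDF cacheable: " + isCacheable(2 * 1024 * 1024, threshold));
    }
}
```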

By default, the WebDAV cache uses the standard Jahia cache provider. If disk serialization is required, the cache can be configured to use EHCache. This is defined in the applicationContext-services.xml file:

        <entry>
            <key><value>webdavCache</value></key>
            <value>EH_CACHE</value>
        </entry>
    

Then EHCache has to be configured in the ehcache-jahia.xml file:

        <cache
            name="webdavCache"
            overflowToDisk="true"
            maxElementsInMemory="100"
            diskPersistent="true"
            eternal="true"/>
    

In a cluster, replication of this cache should be disabled to avoid high network traffic, whichever cache implementation is used.