Dan

Cache Tags/Surrogate Keys - Enterprise or Essential?

Caches store information so that future requests for the same data can be served more quickly. CDNs, like Peakhour, run caches on each of their POPs (Points of Presence). Essentially these caches are key/value stores, where the key can be a combination of several pieces of information, as outlined in our previous blog post on cache keys.
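As a rough illustration of the key/value idea, the Python sketch below builds a cache key from a few parts of the request URL and uses it to look up a stored response. The function name make_cache_key and the choice of host, path, and query string as key components are assumptions made for this example, not a description of how Peakhour or any other CDN actually builds its keys.

```python
from urllib.parse import urlsplit


def make_cache_key(url: str) -> str:
    """Combine parts of the request URL into one cache key (hypothetical scheme)."""
    parts = urlsplit(url)
    # Lower-case the host so "Example.com" and "example.com" share one entry.
    host = parts.netloc.lower()
    path = parts.path or "/"
    key = f"{host}{path}"
    if parts.query:
        key += f"?{parts.query}"
    return key


# The cache itself is just a mapping from key to stored response body.
cache: dict[str, str] = {}
cache[make_cache_key("https://Example.com/about-us/")] = "<html>About us</html>"

hit = cache.get(make_cache_key("https://example.com/about-us/"))  # same key: cache hit
miss = cache.get(make_cache_key("https://example.com/contact/"))  # never stored: cache miss
```

Real cache keys often include more than the URL, but the lookup works the same way: one key, one stored response.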

An established way of supercharging your website performance is full page caching, where a copy of a full page generated by a CMS is stored in a CDN. This can typically cut between 1 and 4 seconds off a page load for a cache hit compared with a cache miss.

Savvysupporter, main document load before caching: 2.07s
Savvysupporter, main document load after caching: 82ms!

A Simple Cache Example

When a website changes something, say a page or an image, it can instruct the CDN to flush the cache entry with the key for that resource. So far so good, it's pretty simple. Let's keep going with a simple example. Say we have a page:

/about-us/

If we're doing full page caching, it will get stored in the CDN with the key "/about-us/". If it ever gets changed, we can issue a flush using the key "/about-us/" and the CDN will fetch a new version. Great!
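The store-then-flush cycle can be sketched in a few lines of Python. This is a toy in-memory model, assuming a hypothetical SimpleCache class and an origin represented by a plain dictionary. It is not how a real CDN is implemented, but it shows why a flush is needed before a changed page becomes visible.

```python
from typing import Callable


class SimpleCache:
    """Toy full page cache keyed by URL path (illustrative only)."""

    def __init__(self, fetch_from_origin: Callable[[str], str]) -> None:
        self._fetch = fetch_from_origin
        self._store: dict[str, str] = {}

    def get(self, key: str) -> str:
        # On a miss, fetch the page from the origin and keep a copy.
        if key not in self._store:
            self._store[key] = self._fetch(key)
        return self._store[key]

    def flush(self, key: str) -> None:
        # Drop the stored copy so the next request goes back to the origin.
        self._store.pop(key, None)


# A dictionary stands in for the CMS that generates pages.
origin = {"/about-us/": "About us, version 1"}
cache = SimpleCache(lambda key: origin[key])

first = cache.get("/about-us/")               # miss: fetched and stored
origin["/about-us/"] = "About us, version 2"  # the page changes on the origin
stale = cache.get("/about-us/")               # hit: still the old copy
cache.flush("/about-us/")                     # flush using the page's key
fresh = cache.get("/about-us/")               # miss again: new version fetched
```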

A blog example

But what about a blog with an article at the URL "/caching/caching-explained/"? Our typical blog has categories, tags, and authors, so a link to our blog post, with a summary, potentially exists on MANY pages: perhaps the home page if it's a recent article, the category pages that the article belongs to, the author page, and so on.

When we update the article, it's not good enough to just flush using the key "/caching/caching-explained/". We also have to find all the other pages that it appears on so we can flush them too, as they have potentially changed as well. This means we have a fair bit of work to do: we have to issue database queries to find all the pages that our article appears on, gather them together in a list, and issue a flush for each of them!
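To make that work concrete, here is a minimal Python sketch of gathering the flush list for one article. The function urls_to_flush, the article fields, and the URL patterns such as "/category/.../" and "/author/.../" are all hypothetical. In a real CMS each of these lookups would typically be a database query rather than a dictionary access.

```python
def urls_to_flush(article: dict, recent_slugs: list[str]) -> list[str]:
    """Collect every page that may show a link or summary for the article."""
    urls = [article["url"]]  # the article page itself
    # Each listing page below would normally need its own database lookup.
    urls += [f"/category/{name}/" for name in article["categories"]]
    urls += [f"/tag/{name}/" for name in article["tags"]]
    urls.append(f"/author/{article['author']}/")
    # The home page only changes if the article is among the recent ones.
    if article["slug"] in recent_slugs:
        urls.append("/")
    return urls


article = {
    "slug": "caching-explained",
    "url": "/caching/caching-explained/",
    "categories": ["caching"],
    "tags": ["cdn", "performance"],
    "author": "dan",
}

urls = urls_to_flush(article, recent_slugs=["caching-explained"])
# One article update turns into six separate flushes in this small example.
```

Even this simplified version ignores paginated listings, feeds, and search results, which is part of why flushing by URL gets expensive.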

An eCommerce example

Another example would be an eCommerce store with lots of products and product categories. A particular product might appear on hundreds of pages with its price displayed. When you update that product's price, you want to make sure the whole site shows the correct price. You have two choices: do lots of work on the server to discover the pages the product is on and flush them, or flush everything. Neither is a good option. The first can slow your website to a crawl with database queries; the second does the same by forcing the entire cache to repopulate.

Enter Cache Tags

Cache tags, also known as surrogate keys, are a mechanism that adds an extra way of finding content in a cache. Unlike the primary cache key, cache tags aren't unique: many cached pages can carry the same tag, and one page can carry many tags.

A website utilises cache tags by returning them in an HTTP header in the response with the page. For example, Magento 2 uses the header X-Magento-Tags. An example looks like this:

x-magento-tags: cms_b_site_home_main_banner,store,cms_b,cms_b_site_homepage_bar,cms_p_47,cms_b_header_custom_notice,cms_b_porto_custom_block_for_header_home5,cms_b_site_header_social_links,cms_b_site_home_shopby_category,cms_b_site_home_shopby_brand,cat_c_p_2,cat_p_2508,cat_p,cat_p_2483,cat_p_2387,cat_p_2372,cat_p_1412,cat_p_1388,cat_p_2575,cat_p_2560,cat_p_2557,cat_p_2543,cat_p_2520,cat_p_1262,cat_p_2434,cat_p_2423,cat_p_1660,cat_p_1579,cat_p_1276,cat_p_1217,cms_b_site_footer_social_links,cms_b_site_footer_contact_us,cms_b_site_footer_popular_items,cms_b_site_footer_quick_links,cms_b_site_footer_information

Magento returns tags for page elements like the navigation, sidebar, and notices, as well as for product categories and products. Products have tags in the format cat_p_1234, where 1234 is the product ID in the database.
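As a small illustration of how such a header could be read, the Python sketch below splits a shortened version of the header value shown above into individual tags and picks out the product IDs. The helper names parse_tags and product_ids are hypothetical, and the regular expression assumes only the cat_p_<id> format described here, not Magento's full tag scheme.

```python
import re

# Matches product tags such as "cat_p_2508" but not the bare "cat_p" tag.
PRODUCT_TAG = re.compile(r"^cat_p_(\d+)$")


def parse_tags(header_value: str) -> list[str]:
    """Split a comma-separated tag header value into individual tags."""
    return [tag.strip() for tag in header_value.split(",") if tag.strip()]


def product_ids(tags: list[str]) -> list[int]:
    """Return the numeric IDs of all product tags, in header order."""
    ids = []
    for tag in tags:
        match = PRODUCT_TAG.match(tag)
        if match:
            ids.append(int(match.group(1)))
    return ids


# A shortened version of the example header value.
header = "cms_b_site_home_main_banner,store,cms_b,cms_p_47,cat_c_p_2,cat_p_2508,cat_p,cat_p_2483"

tags = parse_tags(header)
ids = product_ids(tags)  # only cat_p_2508 and cat_p_2483 match the product pattern
```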

When someone updates product 1234, a flush is issued for the tag cat_p_1234, and all pages that have that tag are flushed. Magento doesn't have to do any work trying to determine which pages the product might be on. The cache can efficiently find all such cached pages and invalidate them.

Cache tags in CMSs

As mentioned, Magento 2 uses a sophisticated cache tag strategy to maximise the performance of its full page cache. Other CMSs like Drupal 8/9/10 and TYPO3 also utilise cache tags. Peakhour adds cache tags to WordPress, PrestaShop, Magento 1, and OpenCart via our plugins to enable full page caching.

Cache Tag support amongst CDNs

If you're looking for maximum full page cache effectiveness for your website, especially if you're using a CMS with built-in cache tag support, you WANT cache tag support. Here's a table outlining support amongst major CDN providers.

CDN/Cache | Cache Tag Support | Custom Header
Peakhour | |
Cloudflare | Enterprise Plan Only |
Fastly | |
Self-hosted Varnish | |
CloudFront | |

Enterprise or essential?

In our view, cache tags are an essential feature to have in any CDN to maximise cache performance. They enable efficient, targeted cache invalidation to maximise hit rates and minimise work on the origin server. They shouldn't be walled off behind an Enterprise-level package.

© PEAKHOUR.IO PTY LTD 2024   ABN 76 619 930 826    All rights reserved.