Creating a Cache-aware HTTP/2 Server Push Mechanism

By Jeremy Wagner

If you’ve been reading at all about HTTP/2, then you’ve likely heard about server push. If not, here’s the gist of it: Server push lets you preemptively send an asset when the client requests another. To use it, you need an HTTP/2-capable web server, and then you just set a Link header for the asset you want to push like so:

Link: </css/styles.css>; rel=preload

If this rule is set as a response header for an HTML resource, say index.html, the server will transmit not only index.html, but also styles.css in reply. This eliminates the round trip the browser would otherwise spend requesting the stylesheet, meaning that the document can render faster, at least in this scenario where CSS is pushed. You can push whatever your little heart desires.

One issue with server push that some developers have speculated over is that it may not be cache-aware in all situations, depending on any number of factors. Browsers do have the capability to reject pushes, and some servers have their own mitigation mechanisms. For example, Apache’s mod_http2 module has the H2PushDiarySize directive which attempts to address this problem. H2O Server has a thing called “Cache-aware Server Push” that stores a fingerprint of the pushed assets in a cookie. This is great news, but only if you can actually use H2O Server, which may not be an option for you, depending on your application requirements.

If you’re using an HTTP/2 server that hasn’t solved this problem yet, don’t sweat it. You can easily solve this problem on your own with a little back end code.

A super basic cache-aware server push solution

Let’s say you have a website running on HTTP/2 and you’re pushing a couple assets, like a CSS file and a JavaScript file. Let’s also say that this content rarely changes, and these assets have a long max-age time in their Cache-Control header. If this describes your situation, then there’s this quick and dirty back end solution you can use:

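// Only push when the visitor doesn't have the cookie yet
if (!isset($_COOKIE["h2pushes"])) {
    $pushString = "Link: </css/styles.css>; rel=preload,";
    $pushString .= "</js/scripts.js>; rel=preload";
    header($pushString);
    // The expiry must be an absolute timestamp: now + 30 days (2,592,000 seconds)
    setcookie("h2pushes", "h2pushes", time() + 2592000, "/", ".myradwebsite.com", true);
}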

This PHP-centric example will check for the existence of a cookie named h2pushes. If the visitor is not a known user, the cookie check will predictably fail. When that happens, the appropriate Link headers will be created, and sent with the response using the header function. After the headers have been set, setcookie is used to create a cookie that will prevent potential redundant pushes should the user return. In this example, the expiry time for the cookie is 30 days (2,592,000 seconds). When the cookie expires (or is deleted), the process reoccurs.
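
One detail worth spelling out is the cookie lifetime: setcookie expects its expiry as an absolute Unix timestamp, not a duration, so the 30-day window has to be added to the current time. Here's a minimal sketch of that arithmetic. The helper name pushCookieOptions is hypothetical, the "/" path is an assumption, and the options-array form of setcookie shown in the usage comment requires PHP 7.3 or later:

```php
<?php
// Hypothetical helper: builds the options for the h2pushes cookie.
// $now is passed in so the arithmetic is easy to verify.
function pushCookieOptions(int $now, int $days = 30): array {
    $lifetime = $days * 24 * 60 * 60; // 30 days = 2,592,000 seconds

    return [
        "expires" => $now + $lifetime, // absolute Unix timestamp
        "path" => "/",                 // assumption: cookie applies site-wide
        "domain" => ".myradwebsite.com",
        "secure" => true,
    ];
}

// Usage (must run before any output is sent):
// setcookie("h2pushes", "h2pushes", pushCookieOptions(time()));
```

Keeping the options in one place also means the two code paths that set the cookie can't drift apart.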

This isn’t strictly “cache-aware” in the sense that the server knows for sure if the asset is cached on the client side, but the logic follows. The cookie is only set if the user has visited the page. By the time it’s set, assets have been pushed, and caching policies set by the Cache-Control header are in effect. This works great. Great, that is, until you have to change an asset.

A more flexible cache-aware server push solution

What if you run a website that uses server push, but assets change frequently? You want to ensure that redundant pushes don’t occur, but you also want to push assets if they have changed, or maybe you want to push additional assets later on. This requires a bit more code than our previous solution:

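function pushAssets() {
    // Fingerprint each asset so a change in its contents can be detected
    $pushes = array(
        "/css/styles.css" => substr(md5_file("/var/www/css/styles.css"), 0, 8),
        "/js/scripts.js" => substr(md5_file("/var/www/js/scripts.js"), 0, 8)
    );

    if (!isset($_COOKIE["h2pushes"])) {
        // New visitor: push everything and remember what was pushed
        $pushString = buildPushString($pushes);
        header($pushString);
        // The expiry must be an absolute timestamp: now + 30 days
        setcookie("h2pushes", json_encode($pushes), time() + 2592000, "/", ".myradwebsite.com", true);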
    } else {
        $serializedPushes = json_encode($pushes);

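        if ($serializedPushes !== $_COOKIE["h2pushes"]) {
            $oldPushes = json_decode($_COOKIE["h2pushes"], true);
            // The cookie is client-controlled: treat a malformed value as "nothing pushed yet"
            if (!is_array($oldPushes)) {
                $oldPushes = array();
            }
            // Keep only the assets that are new or whose fingerprint changed
            $diff = array_diff_assoc($pushes, $oldPushes);
            // Skip the header when nothing needs pushing (e.g., an asset was only removed)
            if (!empty($diff)) {
                $pushString = buildPushString($diff);
                header($pushString);
            }
            setcookie("h2pushes", json_encode($pushes), time() + 2592000, "/", ".myradwebsite.com", true);
        }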
    }
}

function buildPushString($pushes) {
    $links = array();

    // The array keys are the asset paths; the fingerprints aren't needed here
    foreach (array_keys($pushes) as $asset) {
        $links[] = "<" . $asset . ">; rel=preload";
    }

    // Joining with implode avoids a trailing comma after the last asset
    return "Link: " . implode(",", $links);
}

// Push those assets!
pushAssets();

Okay, so maybe it’s more than just a bit of code, but it’s still grokkable. We start by defining a function named pushAssets that will drive the cache-aware push behavior. This function begins by defining an array of assets we want to push. Because we want to re-push assets if they change, we need to fingerprint them for comparison later on. This is the same idea as versioning an asset with a query string: if you’re serving a file named styles.css and you change it, you’ll reference it with a new version (e.g., /css/styles.css?v=1) to ensure that the browser won’t serve a stale copy of it. In this case, we’re using the md5_file function to create a checksum of the asset based on its contents. Because an MD5 checksum is returned as a 32-character hexadecimal string, we use substr to shorten it to the first 8 characters. Whenever these assets change, the checksum will change, which means that assets will automatically be versioned.
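
To make the versioning idea concrete, here is a rough sketch of how the shortened checksum could be appended to an asset URL as a query string. The function names assetFingerprint and versionedUrl are hypothetical, and the document root is passed in as a parameter rather than assumed:

```php
<?php
// Hypothetical helper: first 8 hex characters of the file's MD5 checksum.
function assetFingerprint(string $filePath): string {
    $hash = md5_file($filePath);

    if ($hash === false) {
        throw new RuntimeException("Cannot read asset: " . $filePath);
    }

    return substr($hash, 0, 8);
}

// Hypothetical helper: turns "/css/styles.css" into "/css/styles.css?v=<fingerprint>".
function versionedUrl(string $docRoot, string $urlPath): string {
    return $urlPath . "?v=" . assetFingerprint($docRoot . $urlPath);
}
```

In a template, something like versionedUrl("/var/www", "/css/styles.css") could then supply the href for the stylesheet, so the reference changes whenever the file's contents do.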

Now for the main event: Like before, we’ll check for the presence of the h2pushes cookie. If it doesn’t exist, we’ll use the buildPushString helper function to build the Link header string from the assets we’ve specified in the $pushes array, and set the headers with the header function. Then we’ll create the cookie, but this time we’ll create a storable representation of the $pushes array with the json_encode function, and store that value in the cookie. We could use PHP’s serialize function for this instead, but calling unserialize on a value the client controls is a potentially serious security risk, so we should stick with something safer like json_encode.
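
Because cookie values come from the client, the decoded JSON is still untrusted input. A defensive sketch (the function name decodePushCookie is an assumption, not part of the code above) might keep only entries that look like a path mapped to an 8-character hex fingerprint:

```php
<?php
// Hypothetical helper: decode the h2pushes cookie, discarding anything malformed.
function decodePushCookie(string $raw): array {
    $decoded = json_decode($raw, true);

    if (!is_array($decoded)) {
        return []; // invalid JSON, or JSON that isn't an object/array
    }

    $clean = [];

    foreach ($decoded as $asset => $version) {
        // Keep only "path" => "8 hex characters" pairs
        if (is_string($asset) && is_string($version)
            && preg_match('/\A[0-9a-f]{8}\z/', $version) === 1) {
            $clean[$asset] = $version;
        }
    }

    return $clean;
}
```

A tampered or truncated cookie then simply looks like an empty push history, which at worst causes the assets to be pushed again.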

Now comes the interesting part: What to do with returning visitors. If it turns out that the visitor is returning and has an h2pushes cookie, we json_encode the $pushes array and compare the value of this JSON-encoded array to the one stored in the h2pushes cookie. If there’s no difference, we do nothing else and merrily go on our way. If there’s a discrepancy, though, we need to find out what has changed. To do this, we’ll use the json_decode function to convert the h2pushes cookie value back into an array, and use array_diff_assoc to find the differences between the $pushes array and the JSON-decoded $oldPushes array.
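
To see what array_diff_assoc returns in this situation, consider a small illustration. The fingerprint values and the extra.js asset below are invented for the example:

```php
<?php
// Current fingerprints computed on the server (invented values).
$pushes = [
    "/css/styles.css" => "9f2c1a7b", // changed since the last visit
    "/js/scripts.js" => "4d5e6f70",  // unchanged
    "/js/extra.js" => "0a1b2c3d",    // newly added to the push list
];

// What the visitor's h2pushes cookie recorded last time.
$oldPushes = [
    "/css/styles.css" => "11111111",
    "/js/scripts.js" => "4d5e6f70",
];

// Entries of $pushes whose key/value pair is absent from $oldPushes.
$diff = array_diff_assoc($pushes, $oldPushes);

// $diff now holds only /css/styles.css and /js/extra.js.
```

Note that the comparison is one-directional: an asset that was dropped from $pushes never shows up in the diff, so nothing is pushed on its account.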

With the differences returned from array_diff_assoc, we use the buildPushString helper function once again, this time to build a Link header containing only the resources that need to be pushed again. The headers are sent, and the cookie value is updated with the JSON-encoded contents of the $pushes array. Congratulations. You just learned how to create your own cache-aware server push mechanism!

Conclusion

With a bit of ingenuity, it’s not too difficult to push assets in a way that minimizes redundant pushes for repeat visitors. If you don’t have the luxury of being able to use a web server like H2O, this solution may work well enough for your purposes. It’s currently in use on my own website, and it seems to work pretty well. It’s very low maintenance, too. I can change assets on my site, and with the fingerprinting mechanism used, asset references update themselves, and pushes adapt to changes in assets without me having to do any extra work.

One thing to remember is that as browsers mature, they will likely become better at recognizing when they should reject pushes, and serve from the cache. If browsers fail at perfecting this behavior, HTTP/2 servers will likely implement some cache-aware pushing mechanism for the user much like H2O does. Until that day comes, however, this may be something for you to consider. While written in PHP, porting this code to another back-end language should be trivial.

Happy pushing!


Jeremy Wagner is the author of Web Performance in Action, an upcoming title from Manning Publications. Use coupon code csstripc to save 38% off it, or any other Manning book.

Check him out on Twitter: @malchata