Streaming in the Shadow Reader

What did we do?

We made The Shadow Reader faster again!

We created The Shadow Reader (https://amp.cards) to demonstrate how AMP pages can be used within a Progressive Web App (PWA) (read our announcement post for more context). The site folds real articles from The Guardian into an immersive news reader experience. It’s a demo, but it’s also a fully functional site, containing everything you need to embed AMPs inside a beautiful PWA.

Earlier this year, we enhanced the Shadow Reader to follow an AMP=>AMP/PWA pattern, which made initial article loads faster and made the entire app more SEO-friendly. Now, we’ve made the site even faster by adding DOM streaming. That means the browser can render content as it loads – it doesn’t have to wait until the full load!

The video below shows the new streaming Shadow Reader on the left and the old version on the right. To clearly see the difference streaming makes, I used Charles proxy to simulate a 56kbps connection. That is obviously far worse than the average connection, but the Shadow Reader was already quite fast! In AMPs, all JavaScript is asynchronous. And in many AMPs, including The Guardian’s, most of the HTML consists of CSS in the <head>. There isn’t that much <body> to stream! So it’s tough to spot the difference on a good connection, or even on a 3G connection simulated in DevTools.

It takes a little while to load the HTML in the <head>, so this video starts a few seconds in. Once we get to the <body>, the streaming works its magic: in my decidedly un-lab-like test environment, the main article text was entirely visible 11 seconds earlier with streaming than without. Interestingly, the article’s YouTube thumbnail showed up right after the text in the streaming version, whereas in the non-streaming version it showed up 67 seconds later.

 

How does this work?

AMP comes in a flavor called Shadow AMP, which allows an AMP to exist entirely within a shadow root. Shadow AMP lets you embed an AMP in another webpage. You create a shadow AMP like this:

const shadowDoc = AMP.attachShadowDoc(container, doc, url);

If you’ve read Jake Archibald’s post on the subject, you know that it’s possible to stream content into the DOM via a trick involving an <iframe>. What you may not know is that this technique has been used to stream AMP content into a shadow root!  You simply create the shadow AMP with a different method:

const shadowDoc = AMP.attachShadowDocAsStream(container, url); 

Here’s how you stream AMP content into a shadow root, in 4 steps:

  1. Create the streaming shadow AMP
  2. Access your content with fetch()
  3. Stream that content into the shadow AMP.
  4. When fetch() tells you it’s done, close the writer.


The four steps

1. Create the streaming shadow AMP. See above.

2. Access your content with fetch(). Open a fetch() to the AMP URL. fetch() returns a promise that resolves to a Response object, whose body is a stream. Usually you’d use a method that reads the stream to completion and hands you the result in one piece. For example, you might use response.text() or response.blob(), like this:

fetch('https://example.com/amp.html')
  .then(response => response.text())
  .then(text => console.log(text));

In our case, though, we’re going to use the response object as a stream. That’s easy to access via its body property, which exposes a ReadableStream.

fetch('https://example.com/amp.html')
  .then(response => response.body)
  .then(stream => {
    // read from the ReadableStream
  });

That ReadableStream has a getReader() method, which locks the stream and returns a ReadableStreamDefaultReader. The ReadableStreamDefaultReader exposes a read() method that provides access to chunks of HTML, as raw bytes, as they stream in, like this:

// read from the ReadableStream (inside an async function)
let reader = response.body.getReader();
// resolves to {done, value}, where value is a Uint8Array of bytes
let chunk = await reader.read();
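
A single read() only yields one chunk, so reading a whole response means calling it repeatedly. Here is a minimal sketch of that loop, assuming a runtime with ReadableStream and TextDecoder; readChunks and onChunk are hypothetical names used for illustration and are not part of the Shadow Reader:

```javascript
// Read a ReadableStream to completion, one chunk at a time.
// readChunks and onChunk are illustrative names (assumptions, not Shadow Reader code).
async function readChunks(stream, onChunk) {
  const reader = stream.getReader();   // locks the stream to this reader
  const decoder = new TextDecoder();   // decodes UTF-8 by default

  while (true) {
    // read() resolves to {done, value}; value is a Uint8Array until done is true
    const {done, value} = await reader.read();
    if (done) break;
    // {stream: true} keeps a multi-byte character intact when it straddles two chunks
    onChunk(decoder.decode(value, {stream: true}));
  }

  // flush any bytes the decoder was still holding back
  const tail = decoder.decode();
  if (tail) onChunk(tail);
}
```

It could be called as fetch(url).then(response => readChunks(response.body, text => console.log(text))).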


3. Stream that content into the shadow AMP. Meanwhile, the shadowDoc we’ve created contains a writer object, and that writer contains write and close methods that can be used for streaming. So you can stream content into the DOM like this:

shadowDoc.writer.write(html);

and close it like this:

shadowDoc.writer.close();

So once we have the stream reader and the shadowDoc streamer, we repeat the following steps:

  • get chunk from stream
  • decode chunk and write it into the DOM

until the stream reader tells us it’s done, which it does through the boolean done property of the object that read() resolves to.

One way to do this is through recursion. In the Shadow Reader, we used await. Here’s the gist of our streaming code:

    const shadowDoc = AMP.attachShadowDocAsStream(container, url);
    fetch(url).then(async response => {
      let reader = response.body.getReader();
      let decoder = new TextDecoder();
      while (true) {
        let chunk = await reader.read();
        if (chunk.done) {
          shadowDoc.writer.close();
          break;
        }
        let html = decoder.decode(
          chunk.value || new Uint8Array(),
          {stream: !chunk.done}
        );
        if (html) {
          shadowDoc.writer.write(html);
        }
      }
    });
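
For comparison, the recursive approach mentioned above might look roughly like the sketch below. It is not the Shadow Reader's code: pump is a hypothetical helper, and writer stands for any object with write() and close() methods, such as shadowDoc.writer.

```javascript
// Recursive alternative to the while/await loop.
// pump is a hypothetical helper; writer is any object with write(html) and close().
function pump(reader, decoder, writer) {
  return reader.read().then(({done, value}) => {
    if (done) {
      writer.close();
      return;
    }
    const html = decoder.decode(value, {stream: true});
    if (html) {
      writer.write(html);
    }
    // ask for the next chunk only after this one has been written
    return pump(reader, decoder, writer);
  });
}

// Possible use in the browser, with the names from the code above:
//   const shadowDoc = AMP.attachShadowDocAsStream(container, url);
//   fetch(url).then(response =>
//     pump(response.body.getReader(), new TextDecoder(), shadowDoc.writer));
```

Both versions do the same work. The loop with await mostly reads more linearly.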


4. When fetch() tells you it’s done, close the writer. Looks like we covered that in step 3 above. And Bob’s your uncle!

Wait, you say – what about browsers that don’t yet support streaming?  Fortunately, in such cases, a call to AMP.attachShadowDocAsStream() gracefully degrades to the same functionality as AMP.attachShadowDoc().

The Shadow Reader loads and prerenders the top three articles when the app loads. We don’t use streaming for these, and just to be on the safe side, we also don’t use streaming in browsers that don’t support it. You can conveniently compare the streaming and non-streaming approaches in the code by looking at load() and stream() in Article.js.
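
This post doesn't show how the Shadow Reader decides whether a browser supports streaming. One plausible way to check, offered purely as an assumption, is to test for the pieces the streaming code relies on. supportsStreaming and its scope parameter are hypothetical names:

```javascript
// Hypothetical feature check: true when the APIs used by the streaming code exist.
// scope defaults to the global object so the check can be exercised in isolation.
function supportsStreaming(scope = globalThis) {
  return typeof scope.ReadableStream === 'function' &&
    typeof scope.TextDecoder === 'function' &&
    typeof scope.Response === 'function' &&
    'body' in scope.Response.prototype;   // response.body exposes the stream
}

// Possible call site (how the real app chooses between them is not shown here):
//   supportsStreaming() ? article.stream() : article.load();
```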

 

Reality is complicated.

What would life be without a complexity or two?

Testing. If you’re like me, you’d like to see the streaming happen, chunk by chunk, by inserting a breakpoint into the streaming code. But this may not give the desired result. At the time I’m writing this, with breakpoints set in Chrome, the shadow root remains invisible until it’s closed. You can see streaming more easily if you use DevTools to throttle your connection to 2G. It works even better to use a proxy server that sends content in smaller chunks, or to test with an exceptionally large AMP: many AMPs contain little HTML, and the entire content may arrive in only 2 chunks or even 1.
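
If setting up a proxy is inconvenient, a rough substitute is to re-split the response inside the page before reading it. This is a different technique from the proxy described above, sketched under the assumption that TransformStream is available. rechunk, size and delayMs are hypothetical names:

```javascript
// Sketch: re-split a byte stream into chunks of at most `size` bytes,
// optionally pausing `delayMs` milliseconds before each one.
// rechunk, size and delayMs are illustrative names; size must be greater than 0.
function rechunk(stream, size, delayMs = 0) {
  return stream.pipeThrough(new TransformStream({
    async transform(chunk, controller) {
      for (let i = 0; i < chunk.length; i += size) {
        if (delayMs > 0) {
          await new Promise(resolve => setTimeout(resolve, delayMs));
        }
        // subarray() creates a view on the same bytes, without copying
        controller.enqueue(chunk.subarray(i, i + size));
      }
    }
  }));
}
```

In the streaming code, the reader would then come from rechunk(response.body, 512, 200).getReader() instead of response.body.getReader().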

Order of operations. Before streaming, the Shadow Reader showed a loading spinner, loaded the article, rendered it in a hidden div, then removed the spinner and showed the article. With streaming, we want to show the article immediately!  This meant we had to move around some of the animations. It also meant that any modification of the AMP had to occur immediately, not after it fully loaded. In particular:

Hiding unwanted bits of the AMP. When the Shadow Reader imports an AMP, it needs to remove the original article’s menu, header, and footer, since those are already part of the PWA, and it would be silly to have, say, two menus. The Shadow Reader used to create the whole article in the DOM, remove the unwanted bits, then display the sanitized article. With streaming, we instead have to make sure the unwanted elements are never shown in the first place. We could filter out unwanted elements as HTML came in, but this is difficult. Instead, we simply hide those elements with CSS.

Normally this is simple. AMP automatically applies an amp-shadow class to the shadow root, which allows you to easily style an AMP differently when it’s in a shadow root. Unfortunately, in our case, we don’t control the Guardian articles, so we can’t add such rules to their stylesheets ourselves. We solve this by looking for a chunk that contains <style amp-custom>, and then injecting our CSS right into that.
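
A minimal sketch of that kind of injection follows. It is not the Shadow Reader's actual implementation: injectCss and EXTRA_CSS are hypothetical names, and the selectors are placeholders rather than The Guardian's real class names.

```javascript
// Placeholder rules that hide parts of the article the PWA already provides.
// The selectors are invented for illustration.
const EXTRA_CSS = '.site-menu, .site-footer { display: none; }';

// Insert `css` right after the opening <style amp-custom> tag, if this chunk has one.
// injectCss is a hypothetical helper, not the Shadow Reader's injectCSS.
function injectCss(html, css = EXTRA_CSS) {
  // tolerate extra attributes and whitespace inside the opening tag;
  // a function replacer keeps any "$" in the CSS from being treated specially
  return html.replace(/<style\b[^>]*\bamp-custom\b[^>]*>/i, tag => tag + css);
}
```

Because the rules are placed at the start of the stylesheet, later rules of equal specificity in the article's own CSS would still win. Like any per-chunk check, this sketch misses an opening tag that is split between two chunks.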

This is useful anyway, because the Shadow Reader included a custom stylesheet in the AMP. Unfortunately, during streaming, a browser can’t be counted on to load and apply CSS as soon as it sees a <style> tag – it may wait until streaming is done. We solve this problem by injecting the custom stylesheet as well. Done!  Here’s the actual final streaming code:

    fetch(this.proxyUrl).then(async response => {
      let reader = response.body.getReader();
      let decoder = new TextDecoder();

      while (true) {
        let chunk = await reader.read();

        if (chunk.done) {
          shadowDoc.writer.close();
          break;
        }

        let html = decoder.decode(
          chunk.value || new Uint8Array(),
          {stream: !chunk.done}
        );

        // check each chunk of HTML to see if it contains <style amp-custom>. If so, add in some extra CSS.
        if (html) {
          html = shadowReader.backend.injectCSS(html);

          // when we've got the body, start the process of animating the card and showing the article,
          // placing the card before the article
          if (html.includes('<body')) {
            html = article.prependCardHtml(html);
            shadowDoc.writer.write(html);
            article.card.animate();
            article.show();

          } else {
            shadowDoc.writer.write(html);
          }
        }
      }
    });


Left as an exercise for the reader. At press time, I have not dealt with an edge case: this fails if <body is split between two chunks. Please submit a PR!
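
One possible way to handle that edge case, offered as an untested-in-production sketch rather than the project's fix, is to hold back any text at the end of a chunk that could be the beginning of <body and prepend it to the next chunk. createMarkerFinder, push and flush are hypothetical names:

```javascript
// Sketch: detect a marker such as '<body' even when it straddles two chunks.
// createMarkerFinder, push and flush are illustrative names (assumptions).
function createMarkerFinder(marker) {
  let held = '';     // possible start of the marker, carried over from the last chunk
  let seen = false;  // once the marker has been found, pass everything through

  return {
    // Returns {html, found}: the text that is safe to write now, and
    // whether the marker starts somewhere inside it.
    push(chunk) {
      if (seen) return {html: chunk, found: false};
      const text = held + chunk;
      held = '';
      if (text.includes(marker)) {
        seen = true;
        return {html: text, found: true};
      }
      // hold back the longest suffix of text that is a proper prefix of the marker
      for (let n = Math.min(marker.length - 1, text.length); n > 0; n--) {
        if (marker.startsWith(text.slice(-n))) {
          held = text.slice(-n);
          return {html: text.slice(0, -n), found: false};
        }
      }
      return {html: text, found: false};
    },
    // Call once when the stream is done, and write whatever comes back.
    flush() {
      const rest = held;
      held = '';
      return rest;
    }
  };
}
```

In the loop above, each decoded chunk would go through finder.push(html), the found flag would replace the html.includes('<body') test, and finder.flush() would be written just before shadowDoc.writer.close(). Note that push() can return an empty string, so the existing if (html) guard still matters. Like the original check, the match is case-sensitive.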

Posted by Ben Morss, Developer Advocate for AMP, at Google