Defer & Stream Pattern for (Federated) GraphQL APIs and API Gateways

Loading all data at once is not always the ideal solution. That's why we have the defer and stream directives.

Problem

Let's say you're using a GraphQL Client like Relay that allows you to define data dependencies per UI Component using Fragments. You might be building a complex nested user interface that loads the social media timeline of a user. Behind the scenes, this timeline requires a lot of data from different services. For example, there could be one service to load the user profile, another service to load the posts for the timeline, and a third service to load comments, likes and other interactions for each post.

If the response is delivered in one piece, we can't render the first pixel until the complete GraphQL Operation is resolved and the response has arrived at the client. Keep in mind that with a regular JSON response, the client can typically only start parsing once the response is entirely loaded, and the renderer can only start rendering the first pixel once the whole response is parsed into a JavaScript object.

The user experience would be terrible if we had to wait for all of this to happen before the user can see anything. If a single service is slow to answer, the whole UI is blocked until that slowest service has responded.

Imagine the user profile is available immediately because it can be loaded from a fast cache, the timeline resolves within 1s, and the comments take 5s to load.

Wouldn't it be nice if we could show the user profile immediately, then show the timeline after 1s, and update the user interface as the comments arrive one by one?
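To make these numbers concrete, here's a minimal TypeScript sketch that compares when each part of the page becomes visible. It assumes that all timings are measured from the start of the request and treats the comments as a single part for simplicity. The `Part` type and both functions are made up for illustration.

```typescript
// Hypothetical model: each UI part is ready a fixed time after the request starts.
type Part = { name: string; readyAtMs: number };

const parts: Part[] = [
  { name: "profile", readyAtMs: 0 },     // served from a fast cache
  { name: "timeline", readyAtMs: 1000 }, // resolves within 1s
  { name: "comments", readyAtMs: 5000 }, // takes 5s
];

// Single response: nothing is visible until the slowest part is ready.
function visibleAtSingleResponse(all: Part[]): Record<string, number> {
  const slowest = Math.max(...all.map((p) => p.readyAtMs));
  const result: Record<string, number> = {};
  for (const p of all) result[p.name] = slowest;
  return result;
}

// Incremental delivery: each part is visible as soon as it is ready.
function visibleAtIncremental(all: Part[]): Record<string, number> {
  const result: Record<string, number> = {};
  for (const p of all) result[p.name] = p.readyAtMs;
  return result;
}

const single = visibleAtSingleResponse(parts);
const incremental = visibleAtIncremental(parts);
for (const p of parts) {
  console.log(`${p.name}: single=${single[p.name]}ms, incremental=${incremental[p.name]}ms`);
}
```

In this model, the profile is stuck behind the 5s comments in the single-response case, while with incremental delivery it is visible right away.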

Solution

The defer and stream directives allow you to do exactly that. In addition to using Fragments to define per-component data dependencies, we can annotate a Fragment with the defer directive to tell the GraphQL Server or GraphQL API Gateway that we want to load it asynchronously, and annotate a list field with the stream directive to "stream" the items of that list one by one.

Let's get into a bit more detail on how the two directives work.

The defer GraphQL Directive to load Fragments asynchronously

Let's take an example GraphQL Operation without the defer directive to illustrate the problem:

query UserProfilePage {
  user(id: "me") {
    id
    name
    profilePicture {
      url
    }
    timeline {
      ...PostsFragment
    }
  }
}
 
fragment PostsFragment on Timeline {
  posts {
    id
    text
    ...CommentsFragment
  }
}
 
fragment CommentsFragment on Post {
  comments {
    id
    text
  }
}

This GraphQL Operation will load the user profile, the profile picture, and the timeline with all posts and comments. Although we're using Fragments, we have to wait until all Fragments are resolved before we can render the first pixel.

Now let's add the defer directive to the GraphQL Operation to improve the user experience:

query UserProfilePage {
  user(id: "me") {
    id
    name
    profilePicture {
      url
    }
    timeline {
      ...PostsFragment @defer
    }
  }
}
 
fragment PostsFragment on Timeline {
  posts {
    id
    text
    ...CommentsFragment
  }
}
 
fragment CommentsFragment on Post {
  comments {
    id
    text
  }
}

In this case, we've annotated the PostsFragment with the @defer directive. This allows the frontend to render the user profile and the profile picture immediately, and then render the timeline once it's loaded.
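On the client, a deferred Fragment typically shows up as a follow-up payload that has to be merged into the data already received. The TypeScript sketch below shows that merge step. The payload shape (`data` plus a `path`) is an assumption loosely modelled on the incremental delivery proposal for GraphQL; the exact wire format differs between server and client versions. `applyDeferredPatch` is a hypothetical helper, not part of Relay or any other library, and the sample values are made up.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };
type JsonObject = { [key: string]: Json };

// Assumed shape of a deferred payload: the fields to add, and where to add them.
interface DeferredPatch {
  path: (string | number)[];
  data: JsonObject;
}

// Walks `path` from the root and shallow-merges `patch.data` into the object found there.
function applyDeferredPatch(root: JsonObject, patch: DeferredPatch): void {
  let target: Json = root;
  for (const key of patch.path) {
    if (target === null || typeof target !== "object") {
      throw new Error(`Cannot descend into "${key}": parent is not an object or list`);
    }
    const next: Json | undefined = Array.isArray(target)
      ? target[key as number]
      : target[key as string];
    if (next === undefined) {
      throw new Error(`Path segment "${key}" does not exist`);
    }
    target = next;
  }
  if (target === null || typeof target !== "object" || Array.isArray(target)) {
    throw new Error("Deferred data can only be merged into an object");
  }
  for (const field of Object.keys(patch.data)) {
    const value = patch.data[field];
    if (value !== undefined) target[field] = value;
  }
}

// Initial payload: the profile fields are present, the deferred posts are not.
const response: JsonObject = {
  user: {
    id: "me",
    name: "Jane",
    profilePicture: { url: "https://example.com/me.png" },
    timeline: {},
  },
};

// Later payload carrying the fields selected by PostsFragment.
applyDeferredPatch(response, {
  path: ["user", "timeline"],
  data: { posts: [{ id: "p1", text: "Hello", comments: [] }] },
});

console.log(JSON.stringify(response, null, 2));
```

The UI can render from `response` before the patch arrives and re-render once the posts have been merged in.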

That said, we still have to wait until all comments are loaded before we can start rendering the timeline. Let's add the @stream directive to improve this further.

The stream GraphQL Directive to asynchronously load list items

Let's take the GraphQL Operation from the previous example and add the @stream directive to the comments field:

query UserProfilePage {
  user(id: "me") {
    id
    name
    profilePicture {
      url
    }
    timeline {
      ...PostsFragment @defer
    }
  }
}
 
fragment PostsFragment on Timeline {
  posts {
    id
    text
    ...CommentsFragment
  }
}
 
fragment CommentsFragment on Post {
  comments @stream(initialCount: 0) {
    id
    text
  }
}

We've applied the @stream directive to the comments field with an initialCount value of 0. This means the list of comments is initially returned empty, without the GraphQL Server waiting for any comments to be loaded. The comments themselves follow in later payloads.

As a result, the frontend can render the user profile immediately, then we get the full timeline, but without comments, and the comments get streamed in asynchronously one by one. That's a massive improvement in user experience compared to the initial solution.
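The effect of initialCount can be sketched as a function that splits a list into an initial part and a sequence of follow-up payloads. This is a simplified TypeScript model, not how any particular server implements @stream: the `StreamedItem` shape and the `splitForStream` helper are assumptions for illustration, and a real server would emit items as they become available instead of starting from a complete list.

```typescript
// Assumed shape of a follow-up payload: one list item and the position it belongs at.
interface StreamedItem<T> {
  path: (string | number)[];
  item: T;
}

interface StreamPlan<T> {
  initial: T[];                 // items sent with the payload that first contains the list
  followUps: StreamedItem<T>[]; // items sent afterwards, one payload per item
}

function splitForStream<T>(
  items: T[],
  listPath: (string | number)[],
  initialCount: number,
): StreamPlan<T> {
  if (initialCount < 0 || Math.floor(initialCount) !== initialCount) {
    throw new Error("initialCount must be a non-negative integer");
  }
  const initial = items.slice(0, initialCount);
  const followUps = items.slice(initialCount).map((item, offset) => ({
    path: [...listPath, initialCount + offset],
    item,
  }));
  return { initial, followUps };
}

// Made-up sample data for the comments of the first post.
const comments = [
  { id: "c1", text: "First!" },
  { id: "c2", text: "Nice post" },
];
const plan = splitForStream(comments, ["user", "timeline", "posts", 0, "comments"], 0);

// The client starts with the initial items and appends each streamed item as it arrives.
const rendered = [...plan.initial];
for (const followUp of plan.followUps) {
  rendered.push(followUp.item);
  console.log(`comments shown: ${rendered.length}`);
}
```

With an initialCount of 0, `plan.initial` is empty and every comment arrives as its own follow-up payload. A higher value trades a slower first payload for fewer UI updates.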

Considerations

Depending on how you're using the @stream and @defer directives, they might be highly beneficial for the user experience, or actually make it worse.

For example, if you're using a single monolithic GraphQL Server, and this server is capable of fetching the user profile, timeline, and comments with one single database operation, adding the directives might not lead to the expected benefits, because all of the data becomes available at the same time anyway.

On the other hand, when you're using a distributed system where some parts of your graph load faster, while others take a bit longer, the overall experience might improve when using @stream and @defer.

The challenge here lies in communicating, between API implementers and API users, which fields could benefit from being deferred or streamed. In other words, it would be good to have semantics like deferrable and streamable communicated in the schema.

If a user tries to defer a Fragment that is not deferrable, validation or a linter could then fail, or at least suggest not deferring it.
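Such a check could look roughly like the following TypeScript sketch. Everything here is hypothetical: GraphQL has no standard way to mark a type as deferrable, so the sketch assumes the list of deferrable types is supplied separately (for example, derived from a custom schema directive), and it works on a pre-extracted list of fragment spreads instead of a parsed document.

```typescript
// A fragment spread as a linter might see it after parsing the operation.
interface FragmentSpread {
  fragmentName: string;
  typeCondition: string; // the type after `on` in the fragment definition
  deferred: boolean;     // true if the spread carries @defer
}

// Returns one warning per @defer that targets a type not marked as deferrable.
function lintDefer(spreads: FragmentSpread[], deferrableTypes: string[]): string[] {
  return spreads
    .filter((s) => s.deferred && deferrableTypes.indexOf(s.typeCondition) === -1)
    .map(
      (s) =>
        `@defer on ...${s.fragmentName}: type ${s.typeCondition} is not marked as deferrable`,
    );
}

const warnings = lintDefer(
  [
    { fragmentName: "PostsFragment", typeCondition: "Timeline", deferred: true },
    { fragmentName: "CommentsFragment", typeCondition: "Post", deferred: true },
  ],
  ["Timeline"], // assumption for this example: only Timeline is marked as deferrable
);

for (const warning of warnings) {
  console.log(warning);
}
```

Whether such a finding should fail validation or only produce a suggestion is a policy decision for the API implementers.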

At the end of the day, you should be cautious in using the directives and as always, measure before you improve. Don't just apply a pattern and hope for magic to happen. First, understand why a certain GraphQL Operation is slow, then find the right tool to optimize it. Distributed tracing might help you to understand why an operation is slow.