GraphQL Federation Pattern

The GraphQL Federation Pattern is a way of building a GraphQL API so that multiple teams can work on different parts of it in isolation. One widely known implementation of this pattern is Apollo Federation, which also uses the term "Supergraph" to describe a combined GraphQL API of multiple "Subgraphs". Other implementations exist as well, and they all share the same idea: building a GraphQL API that is composed of multiple GraphQL APIs.

What problems does federated GraphQL solve?

Let's discuss the problems that the GraphQL Federation Pattern solves, what the trade-offs are, what the alternatives are, and what type of companies and teams can benefit from it.

First of all, Federation solves an organizational problem, not a technical one. It allows multiple teams to build partial APIs in isolation, while being able to combine them into a single unified API. For example, one team can work on the "Products" API, while another team works on the "Reviews" API, and a third team works on the "Users" API. All of these APIs can be combined into a single API, so the frontend developers don't have to work with multiple APIs.
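To make the idea of combining partial APIs more concrete, here is a minimal sketch in Python. It is not how any real Federation implementation works: real composition operates on GraphQL schema definitions (Apollo Federation, for instance, relies on directives such as `@key`), while this toy model reduces a schema to a mapping from type names to field names. The subgraph contents and the `compose` function are assumptions made up for illustration.

```python
from typing import Dict, Set

# Toy model: a schema is {type name: set of field names}.
Schema = Dict[str, Set[str]]

# Hypothetical Subgraphs, each owned by a different team.
products: Schema = {"Query": {"products"}, "Product": {"id", "name", "price"}}
reviews: Schema = {"Product": {"id", "reviews"}, "Review": {"id", "body", "author"}}
users: Schema = {"Query": {"me"}, "User": {"id", "name"}}


def compose(*subgraphs: Schema) -> Schema:
    """Merge Subgraphs by taking the union of fields for each type."""
    unified: Schema = {}
    for subgraph in subgraphs:
        for type_name, fields in subgraph.items():
            unified.setdefault(type_name, set()).update(fields)
    return unified


# The frontend only ever sees this single, unified schema.
supergraph = compose(products, reviews, users)
```

Note how the "Reviews" team adds a `reviews` field to the `Product` type without touching the "Products" team's code. That is the organizational benefit in miniature.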

Federation gives you a specification for how to build such a federated API, how to implement Subgraphs, how to combine them into a federated Graph / Supergraph, and how to implement a Gateway that can serve the federated Graph on top of the Subgraphs.
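The Gateway's core job can be illustrated with a small sketch: given the fields a client asks for, work out which Subgraph has to be asked for which part. The ownership table and the `plan` function below are hypothetical and heavily simplified. A real Gateway derives this information from the composed schema and also has to handle nesting, entity lookups and result merging, all of which are left out here.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical ownership table: (type, field) -> name of the owning Subgraph.
FIELD_OWNERS: Dict[Tuple[str, str], str] = {
    ("Query", "products"): "products",
    ("Product", "name"): "products",
    ("Product", "price"): "products",
    ("Product", "reviews"): "reviews",
    ("Review", "body"): "reviews",
}


def plan(selections: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group the requested (type, field) pairs by the Subgraph that owns them."""
    fetches: Dict[str, List[str]] = defaultdict(list)
    for type_name, field in selections:
        owner = FIELD_OWNERS.get((type_name, field))
        if owner is None:
            raise KeyError(f"no Subgraph owns {type_name}.{field}")
        fetches[owner].append(f"{type_name}.{field}")
    return dict(fetches)


# A client asks for products with their names and review bodies.
fetch_plan = plan([
    ("Query", "products"),
    ("Product", "name"),
    ("Product", "reviews"),
    ("Review", "body"),
])
```

In this example, the single client request is split into one fetch against the "products" Subgraph and one against the "reviews" Subgraph.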

Who benefits from federated GraphQL and what are the trade-offs?

As stated above, Federation allows multiple teams to work on different parts of an API in isolation, with an emphasis on the word "teams". If you're a single developer, a small team or even a group of teams that work closely together, you might still fare quite well with a single monolithic (GraphQL) API.

Federated GraphQL doesn't come for free. You need to coordinate schema changes across all participants of a federated Graph. You might want to have design guidelines for how to build Subgraphs, and naming conventions for types and fields. Someone needs to be responsible for the Gateway, maintain and operate it. You need schema composition and validation tools that can validate the Subgraphs and the federated Graph during continuous integration. You need to check during CI that changes to a Subgraph don't break the federated Graph. You want to track changes to the federated Graph and Subgraphs over time, and you need to make sure that you're not breaking any clients when you deploy a new version of a Subgraph and the resulting federated Graph.
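As a rough illustration of such a CI check, the following sketch compares the currently deployed federated schema with a proposed one and reports removed types and fields, which are the most obvious breaking changes for clients. The schema model (type names mapped to field names), the example data and the `breaking_changes` function are assumptions for illustration only. Real schema-check tooling also looks at argument changes, type changes, nullability and actual client usage.

```python
from typing import Dict, List, Set

# Toy model: a schema is {type name: set of field names}.
Schema = Dict[str, Set[str]]


def breaking_changes(old: Schema, new: Schema) -> List[str]:
    """List types and fields that exist in `old` but are missing from `new`."""
    problems: List[str] = []
    for type_name in sorted(old):
        if type_name not in new:
            problems.append(f"type removed: {type_name}")
            continue
        for field in sorted(old[type_name] - new[type_name]):
            problems.append(f"field removed: {type_name}.{field}")
    return problems


# Hypothetical schemas: what is deployed today vs. what a pull request proposes.
deployed: Schema = {
    "Query": {"products"},
    "Product": {"id", "name", "price"},
    "Review": {"id", "body"},
}
proposed: Schema = {
    "Query": {"products"},
    "Product": {"id", "name", "reviews"},
}

problems = breaking_changes(deployed, proposed)
```

A CI job could fail the build whenever `problems` is non-empty, while purely additive changes such as the new `Product.reviews` field pass through.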

Compared to a single monolithic API, Federation adds a lot of complexity. It's a trade-off after all. Aside from enabling your company to scale the development of a GraphQL API across multiple teams, there are other benefits to Federation that are inherited from Microservices in general:

  • A bug in one Subgraph doesn't bring down the entire API
  • You can scale Subgraphs independently
  • You can use different technologies for different Subgraphs
  • You can use different programming languages for different Subgraphs

Deciding whether to use Federation or not

If you're evaluating whether to use Federation or not, you can ask yourself if you're suffering from the problems that Federation solves.

  • Do you have multiple teams that try to build APIs together?
  • Are your teams stepping on each other's toes?
  • Would you like to use different technologies, frameworks and programming languages for different parts of your API?
  • Are you already invested in GraphQL and do your API consumers like it?
  • Do you have a platform team that can take care of the Gateway and help other teams with implementing CI checks for schema changes?
  • Do you really benefit from a single federated API, more than you would from multiple decoupled APIs?

You should be able to answer most of these questions with a strong "yes" if you want to adopt Federation successfully.

Alternatives to Federated GraphQL

Before you decide to use Federation, you might want to consider other alternatives.

You can have a lot of success with a single monolithic GraphQL API. With good tooling, processes and a good CI setup, you might be able to scale a single GraphQL API to a large number of developers.

If you still feel the need for a Microservice architecture, you can go that route without Federation or GraphQL at all. You can build multiple decoupled APIs using REST or gRPC, and use a Gateway to combine them into a unified API. Both REST and gRPC use HTTP as a transport layer with path-based routing, so the Gateway implementation is a lot simpler compared to a composed GraphQL API.
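To show why path-based routing keeps such a Gateway simple, here is a minimal sketch of the routing decision. The route table, the internal hostnames and the `resolve_upstream` function are hypothetical. A production Gateway would add authentication, timeouts, retries and so on, but the core decision is just a prefix lookup on the request path.

```python
from typing import Dict, Optional

# Hypothetical route table: path prefix -> base URL of the owning service.
ROUTES: Dict[str, str] = {
    "/products": "http://products.internal:8080",
    "/reviews": "http://reviews.internal:8080",
    "/users": "http://users.internal:8080",
}


def resolve_upstream(path: str, routes: Dict[str, str] = ROUTES) -> Optional[str]:
    """Return the upstream URL for `path`, or None if no route matches."""
    best: Optional[str] = None
    for prefix in routes:
        # Match whole path segments only, so "/productsx" does not hit "/products".
        if path == prefix or path.startswith(prefix + "/"):
            # Prefer the longest matching prefix.
            if best is None or len(prefix) > len(best):
                best = prefix
    if best is None:
        return None
    return routes[best] + path


upstream = resolve_upstream("/products/42")
```

Unlike a GraphQL Gateway, this router never has to look inside the request body or split one request across several services.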

When not to use Federation

As explained above, Federation comes with a lot of complexity and overhead. If you have a small team, or even multiple teams that work closely together, Federation might be a bigger burden than a benefit.

Furthermore, if you're not already invested in GraphQL, Federation might be too much to chew on at once. GraphQL is a powerful technology when used right, with clients like Relay and all the other great tooling from the GraphQL ecosystem. But GraphQL is not a silver bullet that solves all your problems, and neither is Federation. Your teams don't magically scale just because you build federated GraphQL APIs, which leads us to the next point.

How to adopt Federation successfully

If you decide to adopt Federation, how do you do it successfully?

Over the last couple of years, some best practices have emerged and learnings have been shared, e.g. from Netflix.

One important takeaway is that you should think of a federated Graph top down, not bottom up.

You might be under the impression that individual teams can build their Subgraphs in full isolation, which then get merged "upwards" into a Supergraph. This will not scale well from an organizational perspective because every team will build their Subgraphs in different ways, leading to a lot of inconsistencies in the resulting federated Graph.

You also cannot have a centralized team that builds the federated Graph or gate-keeps contributions from other teams. This team will quickly become a bottleneck for the rest of the organization.

Instead, you should think of a federated Graph as a product with shared ownership. Teams need to collaborate on the design of the federated Graph, and they need to agree on design guidelines and naming conventions. The implementation of the Subgraphs can still be done in isolation, and individual teams can own individual Subgraphs, but the federated Graph needs to be owned by all teams together.

You should consider having a dedicated infrastructure team that owns the Gateway and helps other teams with implementing CI checks and API Governance. This team can help with the tooling and processes around Federation, but they should not own the federated Graph.

You can achieve shared ownership of the federated Graph by building a group of lead engineers from all teams that govern the federated Graph. Their responsibility is to make quorum-based decisions on the design of the federated Graph without creating a bottleneck or a single point of failure. This group needs to establish the design guidelines and a process for how to change the federated Graph over time. With the help of the infrastructure team, they can implement this process.

Conclusion

When your organization is benefiting from GraphQL, and you are looking for a way to scale the implementation of your GraphQL API across multiple teams, Federation might be able to help you.

Federation is a tool, not a goal in itself. Facebook (Meta) has been using a monolithic GraphQL API for years, while Netflix successfully adopted Federation. What works for one company might not work for another, so you should take off your blinders and evaluate all options carefully.