Coursera, like many modern startups, began with a JSON API. This worked well when the data being requested was small, but as their course catalog grew, the client-side SPA needed to download 1MB of JSON just to display the home page courses. Coupled with changing features and a growing team, the original API design soon became a burden.
The team experimented with Scala microservices to handle the growing dataset and feature list. They tried designing “experience-based APIs” where different sections of the site could call different APIs to get the data they needed, but as you can imagine, this caused the APIs and the views to be tightly coupled and greatly increased the workload involved in implementing a new feature.
In 2014, they began to roll out their own homegrown API framework to address their needs. While it was more popular than the Scala/Play microservices, it too had issues with scalability and maintainability. The framework allowed clients to query the Coursera database with relational algebra operations, but what they got back was a flattened list that was difficult to work with.
When GraphQL was announced, Coursera immediately recognized that it could solve a lot of the problems they had been having and allow the clients to fetch data in a single round-trip. However, they also had over 1000 REST API endpoints, and migrating would be extremely difficult. To accommodate this and other concerns, they instead added a GraphQL proxy layer on top of their REST APIs.
The next challenge they faced was keeping the GraphQL schema in sync with the 1000+ resources. They chose to keep the REST APIs as the single source of truth and automate a process where the GraphQL server pings downstream services every 5 minutes to request the state of the schemas. Coursera combined this with a custom syntax developers could use to specify relations between resources they were requesting.
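To make the polling idea concrete, here is a minimal Python sketch. It is not Coursera's code: the `SchemaRegistry` class, the `fetchers` mapping, and the shape of the schema data are all assumptions made for illustration. It only shows the general pattern of treating downstream services as the source of truth and re-reading their schemas once a five-minute interval has passed.

```python
import time

REFRESH_INTERVAL_SECONDS = 5 * 60  # the five-minute interval mentioned above


class SchemaRegistry:
    """Keeps a merged view of resource schemas reported by downstream services."""

    def __init__(self, fetchers, clock=time.monotonic):
        # fetchers: hypothetical mapping of service name -> callable that
        # returns {resource_name: [field, ...]} for that service.
        self._fetchers = fetchers
        self._clock = clock
        self._last_refresh = None
        self._by_service = {}

    def refresh_if_stale(self):
        """Re-poll every service if the interval has elapsed. Returns True if polled."""
        now = self._clock()
        if (self._last_refresh is not None
                and now - self._last_refresh < REFRESH_INTERVAL_SECONDS):
            return False
        for service, fetch in self._fetchers.items():
            try:
                self._by_service[service] = fetch()
            except Exception:
                # A service that fails to answer keeps its last known schema.
                continue
        self._last_refresh = now
        return True

    def resources(self):
        """Merge the per-service schemas into one resource -> fields mapping."""
        merged = {}
        for schema in self._by_service.values():
            merged.update(schema)
        return merged
```

A server built this way would call `refresh_if_stale()` from a timer or before handling a request, then rebuild its GraphQL types from `resources()`. The clock is injected so the interval logic can be exercised without waiting.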
At this point, the team has been using GraphQL in production for almost a year and a half.
Shopify is one company that has actually taken the bold step of making their GraphQL API public. Although it’s still in beta, the GraphQL implementation offers a powerful way for third-party developers to integrate with the e-commerce platform while maximizing performance. They’ve made use of GraphiQL, an in-browser IDE, to offer a GraphQL API explorer that can be used to test and validate calls.
Internally, Shopify began their GraphQL journey while developing a new version of their mobile app in 2016. Their primary motivations, like many adopters of GraphQL, were reducing the number of round-trips to the server and customizing the data they received.
Since Shopify’s back end is a Rails monolith, they adopted GraphQL Ruby to implement it, and wrote some custom abstractions on top to suit their specific needs. One of the biggest issues they faced was the “n+1 problem”. They have written in great detail about the problem and their solution, but the basic premise is that a query for nested data makes a separate call to the DB for each top-level resource being requested: one query for the list, plus one more for each of its n items. To solve this, they wrote a batching solution as a Ruby gem.
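The difference between the naive and batched approaches can be sketched in a few lines. The example below is in Python rather than Ruby and is not Shopify's gem. The `FakeDB` class, the product and variant data, and both resolver functions are hypothetical, and the "database" only counts how many queries it receives.

```python
from collections import defaultdict

# Hypothetical sample data, for illustration only.
PRODUCTS = [{"id": 1, "title": "Mug"}, {"id": 2, "title": "Shirt"}, {"id": 3, "title": "Hat"}]
VARIANTS = [
    {"product_id": 1, "name": "Small"},
    {"product_id": 1, "name": "Large"},
    {"product_id": 2, "name": "Red"},
]


class FakeDB:
    """In-memory stand-in for a database that counts the queries it serves."""

    def __init__(self):
        self.queries = 0

    def all_products(self):
        self.queries += 1
        return list(PRODUCTS)

    def variants_for(self, product_id):
        self.queries += 1
        return [v for v in VARIANTS if v["product_id"] == product_id]

    def variants_for_many(self, product_ids):
        # One query covering every requested product (think WHERE ... IN).
        self.queries += 1
        wanted = set(product_ids)
        return [v for v in VARIANTS if v["product_id"] in wanted]


def resolve_naive(db):
    """One query for the list, then one more per product: n + 1 in total."""
    return [{**p, "variants": db.variants_for(p["id"])} for p in db.all_products()]


def resolve_batched(db):
    """Collect the ids first, fetch all variants at once, then regroup them."""
    products = db.all_products()
    grouped = defaultdict(list)
    for variant in db.variants_for_many([p["id"] for p in products]):
        grouped[variant["product_id"]].append(variant)
    return [{**p, "variants": grouped.get(p["id"], [])} for p in products]
```

Both functions return the same data. With three products, the naive version issues four queries and the batched version issues two, and that gap grows with the length of the list.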
Since third-party access has been top of mind while implementing a GraphQL API at Shopify, they’ve prioritized developer experience and use a bot to track breaking changes to the schema. The team has also established strict permissions across type definitions to prevent unwanted access to resources and set a timeout limit to handle extremely complex queries that could bog down their servers.
According to Nick Rockwell, CTO of The New York Times, the decision to adopt GraphQL was a pretty easy one. Initially, the engineers there wanted to avoid rewriting HTTP clients for their multitude of products and have “one place to add and retrieve data and one way to authenticate against it”. They also sought to generally simplify the applications for existing engineers and especially new hires.
After starting out on Relay, the team switched to Apollo due to its first-class support for server-side rendering, larger ecosystem and open source community, prefetching, and more. They’ve also made heavy use of Facebook’s DataLoader utility to handle batch requests and caching.
One of the major challenges The New York Times has faced with their GraphQL implementation is handling changes to the schema. In a perfect world, this would happen in one place and changes could be propagated down with some kind of automation. The current reality is that changes typically need to be made in several places: the GraphQL schema, the back end schema, and potentially the UI. As the engineering team moves forward with GraphQL, this is a pain point they hope to eliminate.
For more on The New York Times’ use of GraphQL and the rest of their stack, take a look at our recent interview with their CTO.
If you’re not familiar, Artsy is an online art marketplace with a killer engineering blog. They’re also very big on GraphQL. Just a few months ago, they posted an article stating that “the future looks a lot like GraphQL”, and in 5 years, “REST API design will be obsolete.”
Most non-legacy code for their web and mobile apps communicates with a GraphQL server built in Node.js and Express. This server provides a single API endpoint for the client and forwards requests to the core API.
For two years after introducing GraphQL, Artsy kept the implementation read-only because its main use was consolidating multiple APIs. More recently, they’ve begun to support mutations via Relay.
On the mobile side of things, GraphQL has allowed Artsy to build faster mobile apps and deliver a better in-app experience for their users. It’s also lessened their need for view models. Their decision to move from Swift to React Native was heavily influenced by the desire to use GraphQL.
If you’re interested in seeing what this looks like in practice, the Artsy team has open sourced their GraphQL server.
Two years ago, a tiny social media startup called Twitter started experimenting with GraphQL. They’ve rolled it out slowly across their services, starting with TweetDeck, then Twitter Lite, and most recently their Android and iOS apps.
One of the challenges they’ve faced while implementing GraphQL is monitoring. With a standard REST API, you have nice HTTP status codes that reflect the success of your request. However, with GraphQL, requests can be partially successful, and you almost always get back a 200 status code regardless of the data returned. To solve this, they track the number of exceptions per query.
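As a rough illustration of why the status code alone is not enough, the sketch below inspects the response body instead. A GraphQL response can carry both a `data` key and an `errors` list, so it classifies each response and counts errors per operation. The `ErrorTracker` class and the operation names are hypothetical, and this is only one plausible reading of "exceptions per query", not Twitter's actual tooling.

```python
from collections import Counter


def classify_response(body):
    """Classify a GraphQL response body as 'ok', 'partial', or 'failed'."""
    errors = body.get("errors") or []
    if not errors:
        return "ok"
    # Errors alongside non-null data mean only part of the query failed.
    return "partial" if body.get("data") is not None else "failed"


class ErrorTracker:
    """Counts errors per operation name; the HTTP status is never consulted."""

    def __init__(self):
        self.errors_per_operation = Counter()
        self.outcomes = Counter()

    def record(self, operation_name, body):
        outcome = classify_response(body)
        self.errors_per_operation[operation_name] += len(body.get("errors") or [])
        self.outcomes[(operation_name, outcome)] += 1
        return outcome
```

Feeding every response through `record` gives a per-operation error count that can be graphed or alerted on, even when every one of those responses arrived with a 200.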
Like Shopify, Twitter has also had to safeguard against extremely expensive queries. To do this, they assign a complexity score to each field and calculate the cost before execution. They also set a limit on the depth of the query.
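A cost and depth check of this kind might look roughly like the following Python sketch. The per-field costs, the limits, and the representation of a query as nested dictionaries (field name mapped to its sub-selection) are all assumptions for illustration. A real server would walk the parsed query document instead, and Twitter's actual scoring rules are not described here.

```python
# Hypothetical per-field costs; unknown fields fall back to DEFAULT_COST.
FIELD_COSTS = {"user": 1, "tweets": 10, "replies": 10, "author": 1, "text": 0, "name": 0}
DEFAULT_COST = 1


class QueryRejected(Exception):
    """Raised before execution when a query exceeds a configured limit."""


def query_cost(selection):
    """Sum the cost of every field in the selection, including nested fields."""
    return sum(
        FIELD_COSTS.get(name, DEFAULT_COST) + query_cost(children)
        for name, children in selection.items()
    )


def query_depth(selection):
    """Return the number of nesting levels in the selection."""
    if not selection:
        return 0
    return 1 + max(query_depth(children) for children in selection.values())


def check_query(selection, max_cost, max_depth):
    """Compute cost and depth up front and reject the query if either is too high."""
    cost, depth = query_cost(selection), query_depth(selection)
    if cost > max_cost:
        raise QueryRejected(f"cost {cost} exceeds limit {max_cost}")
    if depth > max_depth:
        raise QueryRejected(f"depth {depth} exceeds limit {max_depth}")
    return cost, depth


# A query for a user's name plus their tweets, each tweet's replies, and each reply's author.
EXAMPLE_QUERY = {
    "user": {
        "name": {},
        "tweets": {"text": {}, "replies": {"text": {}, "author": {"name": {}}}},
    }
}
```

Because both numbers are computed from the shape of the query alone, the check runs before any resolver does, so an expensive query is turned away without touching the back end.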
Beyond standard API calls, Twitter is also using GraphQL to allow clients to subscribe to events. The engineers there built a service that enables GraphQL clients to stream data flowing through their massive pub/sub framework. They made use of their existing streaming HTTP system, which allowed subscriptions to “topics”. The GraphQL implementation creates unique topics for each subscription requested by the client.
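The idea of one topic per subscription can be sketched with a tiny in-memory broker. Everything here is hypothetical: the `SubscriptionBroker` class, the topic naming scheme, and the event types are invented for illustration and say nothing about how Twitter's pub/sub or streaming systems are built.

```python
import itertools
from collections import deque


class SubscriptionBroker:
    """Gives every subscription its own topic and fans matching events out to it."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._topics = {}  # topic name -> (event type wanted, queue of pending payloads)

    def subscribe(self, event_type):
        """Register a subscription and return the unique topic created for it."""
        topic = f"subscription-{next(self._ids)}"
        self._topics[topic] = (event_type, deque())
        return topic

    def publish(self, event_type, payload):
        """Copy the payload onto every topic whose subscription wants this event type."""
        delivered = 0
        for wanted, queue in self._topics.values():
            if wanted == event_type:
                queue.append(payload)
                delivered += 1
        return delivered

    def drain(self, topic):
        """Return and clear everything queued on one topic."""
        _, queue = self._topics[topic]
        items = list(queue)
        queue.clear()
        return items
```

Two clients that subscribe to the same kind of event still receive separate topics, so each stream can be delivered, drained, and torn down independently of the other.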
Twitter, like many other companies, is generally excited about the new possibilities that GraphQL opens and the new ways of thinking it enables.
Yelp is another high-profile company that has launched a public GraphQL API. Exposing their data in this way made a lot of sense for them and probably would for many other companies. When you’re building an API for external users, you may have some ideas about how they’ll use it, but they may end up using the data in completely unexpected ways in practice.
The flexibility of GraphQL and the ability for third-party developers to decide what data they want back from an API request has made it a very attractive option for the Yelp crew.
Speaking of high-profile public GraphQL APIs, GitHub got to that party early. Their REST API has been a core piece of their product since they first started, and there are entire businesses built on top of it. So why mess with it?
Again, the core issue with their REST API was that it did not provide enough flexibility:
“Our responses simultaneously sent too much data and didn’t include data that consumers needed.”
Over 60% of the requests made to their database tier come through the REST API. The lack of flexibility, “bloated” responses, and the need to make several API calls to get the data needed were major pain points. So as the team was beginning work on v4 of their API, they started looking seriously at GraphQL.
GitHub engineers have likened their GraphQL adoption to switching from XML to JSON. They began the migration by testing the implementation of a small feature: emoji reactions on comments. After finding this to be very straightforward, they began porting over other features of the application.
Today, GitHub uses GraphQL on the front end and back end, and its engineers are consuming the same GraphQL API that third-party developers are. They’ve found numerous benefits for both consumers and maintainers of the API such as type-safety, code-generated documentation, and introspection.
As you might be able to tell from this post, we’re also big fans of GraphQL here at StackShare. We’ve been using GraphQL with GraphQL Ruby in production for several months now.
This week, we also released our very first open source contribution, GraphQL Cache. It provides a simpler way to cache resolved fields in GraphQL with Rails.
Want more of these sorts of posts? Sign up for StackShare Weekly to get this hotness in your inbox once a week!
This content was originally published here.