
Adopters of GraphQL are quick to call it game-changing, but is it a game worth playing? It has to add value above the standards already in place. Is it accomplishing that? How is it more valuable than whatever it is that it replaces (REST, whatever)?


I think I came to this conclusion the other day: GraphQL truly shines when the variation of ways that you want to access data is significantly more complex than the complexity of the stored data. In other words, if you want to look at 10 tables in twenty or forty or more different views (counting the times when a view is composed into another), then I believe using GraphQL will begin to show order of magnitude efficiency rewards.

If the number of views is only slightly more than the number of tables then the rewards are more a matter of taste. It will give you a nice typed interface to your data and an interactive query GUI. But other tools provide that as well.


> if you want to look at 10 tables in twenty or forty or more different views (counting the times when a view is composed into another), then I believe using GraphQL will begin to show order of magnitude efficiency rewards.

How will it show order of magnitude efficiency?!!

It means that you will have 20-40 views towards a database that only contains 10 tables. Which will inevitably result in highly inefficient queries towards said database. Especially if a view is composed into another.

There's no magic in GraphQL that suddenly makes that go away.


Imagine you have a cross-platform app with multiple teams. You have a lot of data-driven components like different feed types. You want to avoid situations where a change in the backend API needs to be synced with every team/app. You want to give app developers the power to define efficient queries.

Unless you develop multiple apps or a data-driven application, there is very little reason to use GraphQL. Personally, I am in favor of building two APIs: a public/third-party REST API, and JSON-RPC for the frontend. Getting REST right is difficult, and after you create your resources your frontend needs to work around incomplete/superfluous data.


This doesn't answer my question :) You still have 10 tables, and you still have 40 inefficient ad-hoc views into that database.

Why is it that every text, post, and comment about GraphQL focuses on the client side only and completely ignores any questions about the server side?


Because your source data will be something like 10 tables, but your frontend needs to access this data in different ways (views). There is no way to design your database to support all possible queries in an efficient way. With complex queries even caching becomes tricky.

  user(id = 6) -> with ('comments') -> with ('votes')
ORM - creates a complex query that hits three tables. There is no way to represent this as a proper REST service (/userNameWithCommentsVotes as a resource). You end up with REST resources that accept a lot of query params. Each endpoint will create coupling to both the data storage and the consumer (front-end).

  {
    user(id: 6) {
      name
      comments {
        votes {
          up
          down
        }
      }
    }
  }
GraphQL shifts data access towards the client. It will make multiple DB queries but it will be trivially cached using the Dataloader pattern. You hit the DB more with simple queries but you don't muddle your data schema with ad-hoc views/queries.
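
As a rough illustration of the Dataloader pattern mentioned here, the sketch below collects the keys requested during one tick and resolves them with a single batched lookup. It is a simplified stand-in, not the real `dataloader` package: `TinyLoader`, `USERS`, and `demoBatching` are hypothetical names, and the in-memory object stands in for a database.

```javascript
// Hypothetical in-memory "table"; a real app would query a database here.
const USERS = {
  1: { id: 1, name: "ann" },
  2: { id: 2, name: "bob" },
  6: { id: 6, name: "eve" },
};

// Minimal batching loader (an illustration, not the real `dataloader` API).
class TinyLoader {
  constructor(batchFn) {
    this.batchFn = batchFn;
    this.queue = []; // pending { key, resolve, reject } entries
  }

  load(key) {
    return new Promise((resolve, reject) => {
      this.queue.push({ key, resolve, reject });
      // Schedule one flush per tick, when the first key is enqueued.
      if (this.queue.length === 1) queueMicrotask(() => this.flush());
    });
  }

  async flush() {
    const pending = this.queue;
    this.queue = [];
    try {
      // One call for every key collected in this tick.
      const rows = await this.batchFn(pending.map((p) => p.key));
      pending.forEach((p, i) => p.resolve(rows[i]));
    } catch (err) {
      pending.forEach((p) => p.reject(err));
    }
  }
}

async function demoBatching() {
  let batchCalls = 0; // counts simulated database roundtrips
  const batchGetUsers = async (ids) => {
    batchCalls += 1;
    return ids.map((id) => USERS[id] ?? null); // rows in the same order as keys
  };

  const loader = new TinyLoader(batchGetUsers);
  const users = await Promise.all([loader.load(1), loader.load(2), loader.load(6)]);
  return { names: users.map((u) => u.name), batchCalls };
}

// Three loads in the same tick end up in one batched call.
demoBatching().then((r) => console.log(r));
```

Note that this only shows batching within one tick; whether anything is cached across requests is a separate question.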

People start with pretty REST and a normalized database, then frontend and third-party requirements demolish this into another "PayPal API". I am not saying that GraphQL is a silver bullet: you need to structure your frontend application graph as well, and complex conditional queries are difficult to express.


> ORM - creates a complex query that hits three tables.

1. ORM isn't a requirement for REST

2. A REST endpoint will execute a highly specialised query that will hit all the right database indices and will return just the dataset required in one roundtrip to the database.

And since it's a known quantity, it will benefit from: hot db indices and caches, intermediate caches, and even HTTP caches (because GET requests are both idempotent and cacheable, for example).

Meanwhile with GraphQL your server will have to execute what's essentially `SELECT <all>` three times (otherwise your "dataloader" won't be able to cache data for more complex queries) and do all the filtering and joins in-memory in-code.

> GraphQL shifts data access towards the client.

Hahahha wat? The client has no access to data. The only thing it does is send a query request to the server. The server will parse the query. The server will request data from the database (multiple times). The server will end up joining, filtering out, caching, figuring out proper auth access to, etc. etc. to data.

And only then will that data be returned to the client in the form that the client requested. The client has no access to data. The only thing that the client can do is send ad-hoc, potentially non-performant queries to the server.

> You hit the DB more with simple queries but you don't muddle your data schema with ad-hoc views/queries.

Yup. You only muddle your code with those queries (the code needs to find a way to compose/filter out/etc. etc. etc. the data for the ad-hoc queries from the client). And you only do multiple redundant and expensive trips to the database (what happens when the DB is sharded, and some data required for the query lies in a different shard? What happens when connections are slow/interrupted? etc. etc.)

> it will be trivially cached using the Dataloader pattern.

No it won't. Dataloader only caches some data during one request. On the next request you will do the same: expensive multiple roundtrips to the database.

Oh. By the way. Remember how you dismissed ORM? Well, your dataloaders and data resolvers (and whatever other new lingo GraphQL came up with) are nothing but a very limited and inefficient ORM.


> A REST endpoint will execute a highly specialised query that will hit all the right database indices and will return just the dataset required in one roundtrip to the database.

Each time you need to have a new highly specialized query you create a REST resource. That's why I prefer to be honest and just make an RPC call.

> Meanwhile with GraphQL your server will have to execute what's essentially `SELECT <all>` three times (otherwise your "dataloader" won't be able to cache data for more complex queries) and do all the filtering and joins in-memory in-code.

That's fine, because easy cache invalidation is worth it. You will find that complex REST endpoints end up caching the same data multiple times.

> Hahahha wat? The client has no access to data. The only thing it does is send a query request to the server. The server will parse the query. The server will request data from the database (multiple times). The server will end up joining, filtering out, caching, figuring out proper auth access to, etc. etc. to data.

I said data access, not data. The benefit of GraphQL materializes when you have multiple client applications that target the same backend data store. You describe your data schema and clients can figure out what data they need.

> Yup. You only muddle your code with those queries (the code needs to find a way to compose/filter out/etc. etc. etc. the data for the ad-hoc queries from the client). And you only do multiple redundant and expensive trips to the database (what happens when the DB is sharded, and some data required for the query lies in a different shard? What happens when connections are slow/interrupted? etc. etc.)

You have a cache. Having only one copy of a user in the cache is super important. GraphQL is not creating expensive queries. With GraphQL it is easier to shard your DB because you will have fewer joins.

> No it won't. Dataloader only caches some data during one request. On the next request you will do the same: expensive multiple roundtrips to the database.

Dataloader supports any cache backend. In production, you will use something like Redis. The whole point of Dataloader is to cache between requests. It supports cache invalidation as well.

> Oh. By the way. Remember how you dismissed ORM? Well, your dataloaders and data resolvers (and whatever other new lingo GraphQL came up with) are nothing but a very limited and inefficient ORM.

No, they are a query interface. They expose a DSL for accessing data. Mapping is something that can be done in Relay.


> Each time you need to have a new highly specialized query you create a REST resource.

I love how you meander. First you were complaining about ORM creating complex queries. When I countered with the simple fact that you don't need ORM, and "complex queries" are highly specialised efficient queries that take full advantage of DB capabilities, you immediately go off on a tangent talking about RPCs.

:-\

> That's fine, because easy cache invalidation is worth it.

No it's not fine. Because instead of retrieving a simple single highly optimised dataset in one go you do multiple inefficient roundtrips to the database.

> I said data access, not data.

access. /ˈaksɛs/ 2. obtain or retrieve (computer data or a file).

All data access happens on the server through inefficient database queries and in-memory juggling of data. Clients have no access to data, they send queries.

> The whole point of Dataloader is to cache between requests.

Dataloader (in its original form and specification) doesn't cache between requests.

> DataLoader provides a memoization cache for all loads which occur in a single request to your application.

> DataLoader caching does not replace Redis, Memcache, or any other shared application-level cache. DataLoader is first and foremost a data loading mechanism, and its cache only serves the purpose of not repeatedly loading the same data in the context of a single request to your Application.

If any other Dataloader implementation implements caching between requests, it is just reinventing the wheel.
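
The per-request memoization described in the documentation text above can be sketched roughly as follows. This is a simplified illustration under assumptions, not the real library: `makeRequestLoader`, `fetchUser`, and `handleRequest` are hypothetical names, and the `dbReads` counter stands in for database roundtrips.

```javascript
// Hypothetical lookup; the counter stands in for database roundtrips.
let dbReads = 0;
function fetchUser(id) {
  dbReads += 1;
  return { id, name: `user-${id}` };
}

// A memoizing loader in the spirit of the description above (not the real library).
function makeRequestLoader(fetchFn) {
  const cache = new Map(); // lives only as long as this loader object
  return {
    load(key) {
      if (!cache.has(key)) cache.set(key, fetchFn(key));
      return cache.get(key);
    },
  };
}

// Hypothetical request handler: a fresh loader per request, so nothing is shared.
function handleRequest() {
  const users = makeRequestLoader(fetchUser);
  users.load(6); // miss: reads the "database"
  users.load(6); // hit: served from the per-request cache
  return users.load(6).name;
}

handleRequest();
const readsAfterFirst = dbReads; // one read despite three loads in the request
handleRequest();
const readsAfterSecond = dbReads; // the second request reads again
console.log(readsAfterFirst, readsAfterSecond);
```

Within one request the repeated loads collapse into a single read, but the next request starts with an empty cache, which is the behaviour being argued about here.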

> No, they are a query interface. They expose a DSL for accessing data.

ORM is a DSL at its core. And that's what "dataloaders" and "resolvers" in essence are: ORMs.

To quote from Apollo:

In order to respond to queries, a schema needs to have resolve functions for all fields.

    const resolverMap = {
      Query: {
        author(obj, args, context, info) {
          return find(authors, { id: args.id });
        },
      },
      Author: {
        posts(author) {
          return filter(posts, { authorId: author.id });
        },
      },
    };
Oh look. A wild ORM appears! Oh look how it quickly devolves into multiple DB roundtrips for any non-trivial query.
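
To make the roundtrip concern concrete, here is a minimal sketch that counts lookups when resolvers shaped like the quoted ones run once per parent object. The `db` object, its `roundtrips` counter, and `resolveAuthorsWithPosts` are hypothetical stand-ins rather than Apollo APIs, and each `db` call is assumed to represent one database roundtrip.

```javascript
// Hypothetical in-memory data; each db.* call stands in for one database roundtrip.
const authors = [
  { id: 1, name: "ann" },
  { id: 2, name: "bob" },
  { id: 3, name: "eve" },
];
const posts = [
  { id: 10, authorId: 1, title: "a" },
  { id: 11, authorId: 1, title: "b" },
  { id: 12, authorId: 2, title: "c" },
];

const db = {
  roundtrips: 0,
  allAuthors() {
    this.roundtrips += 1;
    return authors;
  },
  postsByAuthor(authorId) {
    this.roundtrips += 1;
    return posts.filter((p) => p.authorId === authorId);
  },
};

// Same shape as the quoted resolver map: one function per field.
const resolverMap = {
  Query: { authors: () => db.allAuthors() },
  Author: { posts: (author) => db.postsByAuthor(author.id) },
};

// Naive execution of `{ authors { name posts { title } } }`:
// the Author.posts resolver runs once per author returned by the parent field.
function resolveAuthorsWithPosts() {
  return resolverMap.Query.authors().map((author) => ({
    name: author.name,
    posts: resolverMap.Author.posts(author).map((p) => ({ title: p.title })),
  }));
}

const result = resolveAuthorsWithPosts();
console.log(db.roundtrips); // 1 lookup for the authors plus 1 per author
```

Batching the per-author lookups (the Dataloader approach) can collapse them into one extra query per level, but that is still more roundtrips than a single joined query.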


Hmm, are you talking about performance here? I'm talking about developer efficiency and code complexity. Sorry, should have been more clear about that.


Of course we are talking about performance. When someone carelessly says "40 unforeseen composable views towards 10 tables" or (as a recent article on HN mentioned [1]) "You should find yourself being able to build a query that delves 4+ relations deep without much trouble", I find myself asking: at what cost?

There's no magic in this world. And yet, no one seems to address the question of the server. If you have 10 tables and 40 composable views, most of those views will end up as highly inefficient queries towards the database (possibly with multiple roundtrips).

And that's on top of many other concerns: https://news.ycombinator.com/item?id=17293337

[1] https://news.ycombinator.com/item?id=17269028


"we are talking about performance" - Well, I was not. My comment was not about performance.


Well, that's the problem, isn't it? You can always carelessly say "ah, it magically gives you magical capabilities of magically making 40 composable views that magically make everything shiny."

But then the hard questions come. It's no surprise that there are so few GraphQL resources, blogs, docs, posts, proponents that talk about performance on the server.


From reading around it _seems_ that caching in particular is left as an exercise for the developer, vs putting your REST API behind nginx for example when you need to scale out.



