
For me, ORMs are a sign one isn’t comfortable reading and writing SQL.

I think this happens because in so many projects, commercial and hobbyist, you write queries a few times and then forget about them. So it's totally understandable. I was there once.

I’m very comfortable reading and writing basic SQL now, and so I’d prefer to think in SQL when working with a database, and think about objects when I’m in a programming language.

I think you do yourself a disservice when you hide from something that you eventually have to face.



I am super happy and comfortable reading and writing SQL. Give me the slightest excuse and I'll write up a wicked complex query that does exactly what we need to solve a particular problem.

I simply also feel that having 100% type safety in your results (which you really can't do with SQL queries) is a huge bonus. Add in database portability for most of your operations (including between, say, PostgreSQL and MongoDB), and you end up with a huge software engineering win over writing custom SQL for every query.

And frankly? That's only the tip of the iceberg in the advantages of relying on an ORM.

The next step is to use something like FeathersJS or GraphQL that will prevent you from ever needing to create a basic CRUD API by hand.

Once your ORM knows the shape of the data, you can simply tell FeathersJS to serve that API, complete with POST/GET (one)/GET (find/search)/PATCH/UPDATE/DELETE, and poof, you're done except for any custom behavior or security that you want to add using hooks. Or similarly, you hook your ORM into your GraphQL adapter, and you get a great graph querying API with very little effort.

And in both cases it can be very type safe. The best query to maintain is the one that you never need to write by hand. The GraphQL code generator I'm currently using took 200 lines of clean schema definition and turned that into 20,000 lines of code that I never need to maintain or even think about.

ORMs are tools. If you ignore the power tools you have available, you're going to end up as obsolete as someone who insists on still building houses or cabinets using only hand tools. Yes, of course it can be done. But it takes 100-1000x longer, and the results are often not as good.


> ORMs are tools. If you ignore the power tools you have available, you're going to end up as obsolete as someone who insists on still building houses or cabinets using only hand tools. Yes, of course it can be done. But it takes 100-1000x longer, and the results are often not as good.

I'm very skeptical that there are any productivity benefits to ORMs. Note you can forego the ORM and still use a query builder to offload details of SQL syntax and even abstract over multiple SQL syntaxes for different databases. Type safety is also orthogonal to ORMs--you can have type safety without an ORM or you can have ORMs that don't provide type safety (these are very common).
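To make the "query builder without an ORM" point concrete, here is a minimal sketch (not any real library; the `Query` class and its API are invented for illustration) of a typed builder that abstracts SQL syntax and parameter placeholders without doing any object-relational mapping:

```typescript
// Minimal illustrative query builder: typed column names, parameterized
// values, no object mapping. Not a real library.
class Query<T> {
  private conditions: string[] = [];
  private params: unknown[] = [];

  constructor(private table: string, private columns: (keyof T & string)[]) {}

  // Each where() adds a positional placeholder ($1, $2, ...) and records
  // the value separately, so values are never interpolated into SQL text.
  where(column: keyof T & string, value: unknown): this {
    this.params.push(value);
    this.conditions.push(`${column} = $${this.params.length}`);
    return this;
  }

  toSQL(): { text: string; values: unknown[] } {
    const whereClause = this.conditions.length
      ? ` WHERE ${this.conditions.join(" AND ")}`
      : "";
    return {
      text: `SELECT ${this.columns.join(", ")} FROM ${this.table}${whereClause}`,
      values: this.params,
    };
  }
}

interface User { id: number; email: string; }

const q = new Query<User>("users", ["id", "email"]).where("email", "a@b.c");
console.log(q.toSQL().text);
// SELECT id, email FROM users WHERE email = $1
```

The compiler rejects a misspelled column name at the `where()` call, which is the "type safety without an ORM" point: the mapping from rows to objects is a separate concern from type-checked query construction.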

In my experience, any productivity that you might gain from an ORM is immediately swallowed by the time it takes to debug problems in the magical ORM layers (and then some). I could make an equally silly analogy that using an ORM is like using a bulldozer to build a house rather than carpentry power tools, but I won't because these analogies don't add any substance to the conversation.


I've been doing this as my $DAYJOB for the last five years.

Your examples/predicted catastrophes are straw men that I've never seen in reality myself--though I've heard reports of Rails/ActiveRecord having issues similar to what you describe. Maybe that's the real problem you're worried about? RoR sucks. We can agree on that.

In the code I've been working with, though? An ORM can have annoying limitations, but it doesn't suffer from weird magic. So in those cases, I write custom SQL to solve the exact issue the ORM doesn't support. Problem solved!

I'm confused by "use a query builder instead of an ORM," though. The ORMs I'm using are effectively query builders with knowledge of your object relationships. See for example Sequelize, TypeORM, or Prisma.io. Lighter query builders like knex.js don't seem to have enough information on the schema to give you type safety or auto-generated code (like FeathersJS or GraphQL resolver builders).

TypeORM and Prisma.io even allow you to use the Data Mapper pattern instead of the annoying ActiveRecord pattern. I suspect you're confusing the ActiveRecord pattern with "ORM". No "magical ORM layers" required.
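For readers unfamiliar with the distinction, here is a rough sketch of the two patterns. Neither class mirrors a specific library's API; the names and the in-memory `Map` "database" are purely illustrative:

```typescript
// ActiveRecord pattern: the domain object knows how to persist itself.
class ActiveUser {
  constructor(public id: number, public email: string) {}
  save(db: Map<number, ActiveUser>): void {
    db.set(this.id, this); // persistence logic lives on the model itself
  }
}

// Data Mapper pattern: the domain object is plain data; a separate
// mapper owns all persistence, keeping the model free of DB concerns.
class User {
  constructor(public id: number, public email: string) {}
}

class UserMapper {
  constructor(private db: Map<number, User>) {}
  save(user: User): void { this.db.set(user.id, user); }
  findById(id: number): User | undefined { return this.db.get(id); }
}

const legacy = new Map<number, ActiveUser>();
new ActiveUser(2, "x@y.z").save(legacy); // ActiveRecord style

const store = new Map<number, User>();
const mapper = new UserMapper(store);    // Data Mapper style
mapper.save(new User(1, "a@b.c"));
console.log(mapper.findById(1)?.email); // a@b.c
```

The practical upside of Data Mapper is that `User` stays a plain object you can construct, test, and serialize without touching a database.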

Don't confuse Rails/ActiveRecord garbage for a modern ORM.


We contributed some type definitions a couple years back to node-sql, which is a very lightweight query builder: https://github.com/brianc/node-sql


> I simply also feel that having 100% type safety in your results (which you really can't do with SQL queries) is a huge bonus.

It is possible to get type safety with SQL queries. Eg doobie[0], a purely functional JDBC layer for Scala.

[0]: https://tpolecat.github.io/doobie/


OK, that's awesome.

But type safety is only part of the equation.

Automatic CRUD or GraphQL code generation and/or other abstraction such that you never need to write basic CRUD code ever again? That should be a minimal software engineering best practice at this point, and yet we have tons of people insisting on writing every single query, either as a SQL query directly or using a query builder.

It's on the order of professional negligence to ever write complex code to handle the same situation over and over. CRUD is exactly that. Huge swaths of CRUD code are blatant DRY violations, and using an ORM or another kind of query builder that can interface with CRUD/GraphQL automation should be the minimal best practice we're all using.
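The DRY argument can be sketched in a few lines: write generic CRUD once and instantiate it per entity, rather than hand-writing the same five handlers for every table. This is an invented toy (in-memory storage, hypothetical names), not FeathersJS or any generator's actual output:

```typescript
// Generic CRUD written once; each entity gets an instance for free.
interface CrudService<T extends { id: number }> {
  create(item: T): T;
  get(id: number): T | undefined;
  find(): T[];
  patch(id: number, changes: Partial<T>): T | undefined;
  remove(id: number): boolean;
}

function makeCrud<T extends { id: number }>(): CrudService<T> {
  const rows = new Map<number, T>();
  return {
    create(item) { rows.set(item.id, item); return item; },
    get(id) { return rows.get(id); },
    find() { return [...rows.values()]; },
    patch(id, changes) {
      const row = rows.get(id);
      if (!row) return undefined;
      const updated = { ...row, ...changes };
      rows.set(id, updated);
      return updated;
    },
    remove(id) { return rows.delete(id); },
  };
}

// One line per entity instead of five hand-written handlers each.
interface Article { id: number; title: string; }
const articles = makeCrud<Article>();
articles.create({ id: 1, title: "Draft" });
articles.patch(1, { title: "Final" });
console.log(articles.get(1)?.title); // Final
```

Tools like FeathersJS or a GraphQL code generator do the equivalent against a real database, derived from the schema instead of a generic in-memory store.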


I have read through a bunch of your comments, and frankly it doesn't even sound like you are doing database queries for an application: it sounds like you were tasked with creating a very general-purpose API endpoint for application developers. To me, you don't even seem to have a use case for an "ORM", so I am extremely confused. It seems like you wanted a tool that lets you describe an intended API surface over some schema and have it automatically construct your backend for you. That sounds useful, and if it's a side effect of working with an ORM, that is really cool I guess, but it isn't what an ORM is about. I would imagine the client is then using a GraphQL adapter to access your data.

In contrast, when all of us who despise ORMs use that term, we are talking about a library or--worse--a framework designed to have the developer make queries directly using object-oriented data modeling. These layers then generate SQL that is pretty much guaranteed to be not just inefficient but ridiculously unscalable, in ways that you run into even in simple software.

It sounds like in your architecture you would have no use for what I would call an ORM: at best, maybe the application developer that is consuming your API would in turn (I would argue, incorrectly) choose to use an ORM to access your GraphQL... but they are not generating SQL and you are not making queries. So I guess I just feel like you and we are talking about entirely unrelated use cases?


Good question! To which I respond with another question: Why not both? ;)

Actual example code from my current project (with the actual object type anonymized by renaming it to "thing"):

    const result = await prisma.thingInstance.findMany({
        where: {
            thingTypeId: {
                in: thingTypes.map((t) => t.childThingTypeId),
            },
            thingInstanceChildren: {
                none: {
                    parentThingId: {
                        equals: args.parentThingId,
                    },
                },
            },
        },
        orderBy: {
            thingTypeId: "desc",
        },
        take: 120,
    });
Returns an array of ThingInstance objects. It can also return joined relations (giving you the "Object Relational" part of ORM), or filter on joins, or what have you.

It's because Prisma (in this case) is an ORM that fully understands the data architecture that you can layer a full automation/code generation/GraphQL server on top of it. Not all ORMs have this feature, but you mostly need to start with an ORM in order to bootstrap this functionality.

I guess you could do what you're describing by creating tons of highly specialized code given a schema without adding the ORM/query building features--but adding the layer that gives you a Data Mapper pattern ORM is a tiny amount of additional effort at that point.

I did mention elsewhere: Some people seem to think that the ActiveRecord pattern is the only "ORM" approach, but Data Mapper is another approach--and one that is usually referred to as another view strategy of an ORM. [1]

[1] https://culttt.com/2014/06/18/whats-difference-active-record...


I am also very happy reading, writing and understanding SQL. My own side-product depends on that ability: https://dwata.com. It uses reflection of the underlying SQL schema (via an ORM-based reflection library), builds a vector of the relations and schema state, and generates SQL on the fly.

But, would I recommend a team to start writing SQL by hand: No, absolutely not.


Why not?

If you're going to use a tool (be it SQL, NoSQL, docker, kubernetes), your team needs to have/build some expertise in it.

IMHO, using something like an ORM abstracts that layer. It also introduces a "black box" into your ecosystem.

Going by my opinion in the first paragraph above, you'd have to build some expertise in this ORM. Why spend that time when you can just spend time getting to know your data storage technology better?


Because "getting to know your data storage technology better" isn't the business goal. On the surface, that's a great goal to have, but taking an approach that requires you to do 100x as much work so you can learn SQL better isn't software engineering. It's over-billing.

I throw together 200 lines of schema in Prisma.io and it creates 20,000 lines of GraphQL handling code plus more ORM-specific code that allows me to create type-safe access to the database.
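For context, the schema language in question looks roughly like this. This is a hypothetical two-model fragment, not the project's actual schema:

```prisma
// Illustrative Prisma schema fragment (model and field names invented).
model Thing {
  id   Int    @id @default(autoincrement())
  name String
  tags Tag[]
}

model Tag {
  id      Int    @id @default(autoincrement())
  label   String
  thing   Thing  @relation(fields: [thingId], references: [id])
  thingId Int
}
```

From declarations like these, the generator derives the typed client, the relations, and (with the right plugins) the GraphQL resolvers, which is where the 100x line count expansion comes from.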

And "the database" right now is PostgreSQL, but one of the business requirements is that "the database" could also be SQL Server or MySQL.

Having learned the salient details of the ORM in ... hmm ... about three days? ... I feel that the advantages of getting a full GraphQL API that can be wired up to an arbitrary backend database for 99% of the queries vastly outweigh the fact that most of my database queries are entirely inside a "black box."

In six months I've encountered exactly one problem where the black box made my life more difficult. I think I wasted about an hour, maybe 90 minutes, tracking down why it didn't work as expected. Given that "writing custom SQL" for all of the queries that I've created would have required tens of thousands of lines of additional code plus system tests to verify that the queries were all working as expected? That's a profoundly huge win.

I'd still be writing SQL queries months from now if I had taken the approach you're suggesting. Maybe it's advice that's good for consulting firms that charge by the hour and love to find ways to make their developers work more hours, but in my book it's not even ethical to recommend.


I develop a backup program, HashBackup. It uses SQLite. I wrote all the SQL. Most of the SQL stmts are a few lines long.

Another backup program, Duplicati, uses SQLite. Sometimes they post SQL that is having problems, and one query will go on forever, like 50-100 lines of SQL. Duplicati is written in C# so I'm guessing (but don't know) that it uses some kind of ORM. Figuring out why a 100-line SQL statement was slow would be very unfun.

Not intending to dis Duplicati; just saying that black boxes are only good if they work 100% of the time, and my experience is, they don't. And any black box you can reverse engineer in an hour is not giving much productivity gain IMO.


Writing, what, a couple dozen (at most) SQL statements that are custom designed for a single application? Not really a big deal. Sure, write your SQLite queries by hand. Fine. Whatever; I wouldn't even say it's inefficient if it's that few.

Writing ~20 SQL queries for each of 20 different data structures (including various JOINs and search options) to create a CRUD API? And maintaining them each individually? And needing to make changes to the query every time the clients need to make a slightly different search request?

When I can literally write the 20 different schemas in one file, type a build command, and get a fully featured GraphQL API with full JOIN and custom boolean logic search capability in a few seconds?

I'm saving myself probably 40,000 lines of code, counting all of the data wrangling and test cases that I'd need to create to provide the minimal functionality that I would need--and the GraphQL code wouldn't need to be modified to change the search or order-by options of a particular query. And lines of code you don't write are lines of code you don't need to maintain. It's a huge win. Not even close.

And that said, the above is based on a real project I'm currently involved with, and that project does have six custom GraphQL queries that I wrote by hand--most of which also leveraging the ORM to some degree.

In exactly three cases, I wrote raw SQL queries that the ORM didn't support directly.

So I'm going to say no, the black boxes don't need to work 100% of the time. For me they're working 99% of the time, and I write the last 1% by hand, which is just fine. If I had run across a terrible query in that one corner case, I'd probably not try to dig inside the black box; instead I'd create a custom query that does exactly what I want and expose it for that bit of functionality. Done. [1]

As to the ORM used by Duplicati? Crappy ORMs exist. That doesn't mean all ORMs are bad. I can watch the exact queries (minus parameters) that my ORM writes out scroll by in the log. My own custom SQL queries tend to be more lines of code in practice, and none of the queries it has generated so far have been slow.

[1] Figuring out why a 100-line SQL statement is slow is generally not that hard: You put "EXPLAIN ANALYZE" in front of it, and it will point out where the time is going. That's a PostgreSQL feature, so you'd need to have a Postgres backend in order to use it. Good thing we're talking about using an ORM that could easily switch between SQLite and PostgreSQL! Optimize the schema and indices on PostgreSQL and then switch to SQLite for embedded work.
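The technique is just a prefix on the problem query; table and column names below are hypothetical:

```sql
-- Sketch: prefix the slow query with EXPLAIN ANALYZE (PostgreSQL) to get
-- the actual execution plan with per-node timings and row counts.
EXPLAIN ANALYZE
SELECT t.id, t.name
FROM thing t
JOIN tag g ON g.thing_id = t.id
WHERE g.label = 'slow-case'
ORDER BY t.id DESC
LIMIT 120;
```

The output shows estimated vs. actual rows per plan node, so a sequential scan where you expected an index scan, or a wildly wrong row estimate, stands out immediately.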


I don’t necessarily agree with the approach, but I completely agree with the thinking. Thank you for sharing.


By that logic, I should also know compilers, lexers, the kernel, and the whole pile of protocols that run the Internet in great detail. Where do you draw the line?

We need abstractions. Sure, some abstractions are leaky, and others are simply not good. But that is exactly why people should try to make better ones.


SQL is a high level language, more declarative and better at manipulating data than your project's programming language.

The analogue of a compiler implementation in this case would be the database implementation itself.


Damn right! I'm not comfortable writing or reading SQL, and I usually avoid it in my projects.

I know SQL. I just don't care enough or want to write it. For 99% of the stuff I do, it's just easier to grab an ORM.


Yep. It’s good that these options exist.



