I finally escaped Node (and you can too)

2021-03-21 | 10 min

I wrote my first Node program in 2013. (It was in CoffeeScript.)

In those days, there were three factors contributing to Node's ascendancy:

The first was "JavaScript everywhere." This originally meant "JavaScript on the front-end, JavaScript on the back-end," which always struck me as a weak rationale. (This has since evolved into "might is right" and JavaScript as a lingua franca, which is a bit stronger.)

The second was an attraction to what Node wasn't. There was a growing reactionary movement in opposition to the batteries-included monolithic frameworks of Ruby on Rails and ASP.NET. This was another weak argument, as it wasn't like Ruby's Terms of Service required one to use Rails (see Stripe).

The third was by far the most important. Node had a solid concurrency story, at a time when San Francisco's dominant framework (Ruby on Rails) did not. People knew JavaScript, and callbacks provided a simpler on-ramp to concurrency than many threaded models at the time.

It was Node's concurrency that fueled its wildfire adoption. But I remember even in those early days, something felt off.

We had a service that would inexplicably drop hundreds of requests for seemingly random, 1-2 second periods. In my first ever Node cave-dive, I would discover the reason: an uncaught exception killed the lone process on the server. During that intermittent 1-2 second period, there was simply nobody home.

We blamed ourselves for adopting new tech too early. But I'm not sure the story has changed much in the intervening years.

I found this throwaway line from the Deno v1 release remarkable:

A hello-world Deno HTTP server does about 25k requests per second with a max latency of 1.3 milliseconds. A comparable Node program does 34k requests per second with a rather erratic max latency between 2 and 300 milliseconds.

I'm reminded of a lesson I learned about data structures.

Data structures

I once worked with a buddy of mine, Sacha, on a software project. He was obsessed with data structures. We were in Saigon at the time, and packed up our laptops and traveled to the remote Cambodian island of Koh Chang. We were cut off from the world. Our beach bungalows did not have internet.

We'd spend the days working alone and meet at sunset for dinner at a nearby shack. On the first night, I wanted to talk about architecture and algorithms. All Sacha wanted to talk about was data structures.

By the second night, I had broken ground on a couple of the project's workflows. When Sacha arrived, it looked as if he'd barely slept. He told me he'd been up all night, taking long walks, brainstorming, drawing, experimenting. He had done some morning yoga, was a day into a fast -- and, finally, had some breakthroughs on the data structures.

I had to know: "Sacha. Why is it always about the data structures?" And he responded with a quote that echoed one I'd later hear attributed to Rob Pike:

Data dominates. If you've chosen the right data structures and organized things well, the algorithms will almost always be self-evident. Data structures, not algorithms, are central to programming.

From that moment forward, I would see this everywhere. If ever I felt like my program was getting too complicated or hard to read, it was almost always a data structure problem. Since then, every time I've been humbled by another programmer's code, it hasn't been by clever tricks or algorithms. It has always been by their ingenious insight into how their program's data ought to be organized.

This principle views data structures as foundational. If you have a solid foundation, the house will come with little effort. If the foundation is mud and sticks on top of a trash heap, your life as a builder is going to be complicated.

This principle applies to tools in a broader sense. You want to do the least work possible when swinging a sledgehammer, so you design it such that the hammer is a much heavier material than the handle. This gives you leverage. If you designed your sledgehammer in the inverse, you'd have to swing it harder every time you used it.

While concurrency was fundamental to Node's rapid adoption, Node's concurrency has always had foundational flaws. It's a sledgehammer, but an inverted one.

Callbacks were never optimal, and I can make that claim confidently because almost no one reaches for them in greenfield projects anymore.

We can say the same about Promises, as async/await was designed specifically to abstract them.
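To make the layering concrete, here is a minimal sketch of one operation written three ways. The names `getUserCb`, `getUser`, and the `nameVia*` helpers are hypothetical, invented for this illustration, and a timer stands in for real I/O.

```javascript
// Hypothetical callback-style API, standing in for any async I/O call.
function getUserCb(id, callback) {
  setTimeout(() => callback(null, { id, name: "Ada" }), 0);
}

// Floor one: callbacks. Errors and results both travel through `done`.
function nameViaCallback(id, done) {
  getUserCb(id, (err, user) => {
    if (err) return done(err);
    done(null, user.name);
  });
}

// Floor two: a Promise wrapped around the callback API.
function getUser(id) {
  return new Promise((resolve, reject) => {
    getUserCb(id, (err, user) => (err ? reject(err) : resolve(user)));
  });
}

function nameViaPromise(id) {
  return getUser(id).then((user) => user.name);
}

// Floor three: async/await over the very same Promise.
async function nameViaAwait(id) {
  const user = await getUser(id);
  return user.name;
}
```

Each floor reads better than the last, but each one is built on top of the floor beneath it rather than replacing it.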

But it's only a matter of time until the next floor of the house is built and async/await is abstracted. Its ergonomics are strange, even if you've grown used to them. I think a helpful perspective is the concept of red functions and blue functions. In JavaScript, red functions (asynchronous) can call blue functions (synchronous), but not the other way around. The two also have different invocation syntax. And when one red function is introduced, it bleeds through your codebase, causing many second- and third-degree functions to turn red as well.
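A small sketch of the bleed, assuming a hypothetical `loadRaw` function in place of real I/O (all function names here are made up for the example):

```javascript
// Blue (synchronous): returns a value directly.
function parsePage(raw) {
  return JSON.parse(raw);
}

// Red (asynchronous): returns a Promise. `loadRaw` is a hypothetical
// stand-in for any I/O call.
async function loadRaw(page) {
  return JSON.stringify({ page });
}

// Red can call blue without ceremony.
async function loadPage(page) {
  const raw = await loadRaw(page);
  return parsePage(raw);
}

// A blue function that calls red only ever holds a Promise, never the value.
function loadPageSync(page) {
  return loadPage(page);
}

// So the caller has to turn red as well to get at the result.
async function main() {
  const pending = loadPageSync(1);
  const value = await loadPage(1);
  return { isPromise: pending instanceof Promise, value };
}
```

Nothing in `loadPageSync` can unwrap the result; the only way out is to make it, and everything that calls it, asynchronous.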

Async/await, and the event loop broadly, is a strange paradigm. It is tough to explain to new programmers without a lot of hand-waving. And it sounds like the exact kind of algorithmic kludge a programmer would introduce if her underlying foundation had a few flaws.

Cognitive overhead

Humans think through abstract concepts by mapping them onto physical analogies. My brain was evolved to understand the rough quantity and color of berries in a distant bush. It was not evolved to craft software programs. But it turns out it can perform the task (relatively) well by using the same circuitry in my brain that was intended for surviving in the wild. In my head, my program resides in a 3-dimensional space, where functions in a file "over here" call functions in a file "over there." Reality is far from this, but the abstractions we've created, from filesystems to compilers to displays, make this possible.

When it comes to concurrency, nothing will map quite so elegantly for my human brain as something like a task, be it a process in Elixir or a goroutine in Go. The idea of a worker working on a task is just so damn easy for the human brain to reason about:

I want to request the first five pages of this API, simultaneously. I then want to bundle the results together and deliver them to the client. So, I'll just have five of my little buddies here go and each make a request, assigning to each of them the page they are to grab -- there we are. And now, I will just sit back and wait for each of them to report back with the results.

On the other hand, for me, the idea of a callback or a promise has always required a few extra CPU percentage points. It's like a photon hitting a semi-silvered mirror: the reality of your program splits into two universes. In one trajectory, control flow continues. In the other trajectory, at an indeterminate point in the future, a callback or promise handler is executed.
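The split is easy to see in a few lines. This is only an illustrative sketch; the `order` array and `run` function are made up to record which trajectory executes when.

```javascript
const order = [];

function run() {
  order.push("before");
  // The handler is registered now but runs later, once the current
  // synchronous code has finished.
  Promise.resolve("result").then((value) => {
    order.push(`handler got ${value}`);
  });
  // Control flow continues here first.
  order.push("after");
}

run();

// Wait one timer tick so the promise handler has had a chance to run,
// then expose the recorded order.
const done = new Promise((resolve) => setTimeout(() => resolve(order), 0));
```

The recorded order is "before", "after", and only then the handler, even though the handler appears in the middle of the source.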

Async/await is the attempt to fold the paradigm back to something that's easier to reason about. It makes your program "feel" more synchronous in certain areas. But it's an imperfect abstraction and an abstraction at the wrong layer of the stack.

Querying a database

In Node, let's say you want to query your database in a REPL. Here's what it looks like:

> const { Client } = require("pg");
> client = new Client(/* connection string */);
> client.query("select now()");
Promise { <pending> }

Something about this always felt so depressing. Rationally, I could justify it: nothing lost, nothing gained. If I wanted to board Node's async rocket to the moon, I had to accept inferior ergonomics in a situation like this. I get a promise, not a result, so I need to add additional logic to handle the promise and get a result.

And if only life were so simple:

> await client.query("select now()");
await client.query("select now()");
SyntaxError: await is only valid in async function

Alas, with begrudging acceptance, we all got used to "the new way of doing things here."
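The usual workaround, where top-level await is unavailable, is to wrap the call in an async function. A minimal sketch follows; the `client` here is a hypothetical stub so the snippet runs without a database, not the real pg client.

```javascript
// Hypothetical stand-in for a connected pg client.
const client = {
  query: async (sql) => ({ command: "SELECT", rows: [{ now: new Date(0) }], sql }),
};

// Wrap the call in an immediately invoked async function so `await` is legal.
const pending = (async () => {
  const res = await client.query("select now()");
  console.log(res.rows);
  return res.rows;
})();

// Note that the wrapper itself still evaluates to a Promise, not to the rows.
```

The ceremony buys a synchronous-looking line inside the wrapper, and nothing more.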

At this moment, thanks to the proliferation of async/await, I can no longer remember the API for a Promise instance. So I'll just regress all the way to a callback. Which is fortunately possible, because JavaScript's "no man left behind" principle ensures callbacks will be well-supported for my grandchildren:

> client.query('select now()', (err, res) => console.log(res))

Still no result. Five minutes of head scratching and banging ensues until I realize – for the love of von Neumann – I've forgotten to call client.connect(). If you don't call client.connect() before calling client.query(), the pg client will silently push the query to an internal queue. This would be more infuriating if it wasn't so understandable – remember the flawed foundation we're building on here.
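The behavior can be modeled with a toy client. To be clear, `QueueingClient` is a made-up class for illustration and not pg's actual implementation; it only mimics the "park the query until connected" behavior described above.

```javascript
// A toy client, NOT pg's real code: queries issued before connect()
// are parked in a queue instead of failing.
class QueueingClient {
  constructor() {
    this.connected = false;
    this.queue = [];
  }

  query(sql, callback) {
    if (!this.connected) {
      // Nothing runs yet, and nothing tells the caller so.
      this.queue.push({ sql, callback });
      return;
    }
    this.run(sql, callback);
  }

  connect() {
    this.connected = true;
    // Flush everything that piled up while disconnected.
    while (this.queue.length > 0) {
      const { sql, callback } = this.queue.shift();
      this.run(sql, callback);
    }
  }

  run(sql, callback) {
    // Pretend the database answered.
    callback(null, { command: "SELECT", sql });
  }
}

const results = [];
const client = new QueueingClient();
client.query("select now()", (err, res) => results.push(res));
const beforeConnect = results.length; // the callback has not fired yet
client.connect();
const afterConnect = results.length; // the queued query ran on connect
```

From the caller's side, a forgotten `connect()` and a slow query look exactly the same: silence.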

So, finally, I call connect() and then query() and I get a result (somewhere in there...):

> client.connect()
> client.query('select now()', (err, res) => console.log(res))
Result {
  command: 'SELECT',
  rowCount: 1,
  oid: null,
  rows: [ { now: 2021-03-20T19:32:42.621Z } ],
  fields: [
    Field {
      name: 'now',
      tableID: 0,
      columnID: 0,
      dataTypeID: 1184,
      dataTypeSize: 8,
      dataTypeModifier: -1,
      format: 'text'
    }
  ],
  _parsers: [ [Function: parseDate] ],
  _types: TypeOverrides {
    _types: {
      getTypeParser: [Function: getTypeParser],
      setTypeParser: [Function: setTypeParser],
      arrayParser: [Object],
      builtins: [Object]
    },
    text: {},
    binary: {}
  },
  RowCtor: null,
  rowAsArray: false
}

I'll never forget, after years of only writing Node programs, the moment I first made a SQL query in Elixir's REPL, iex.

I start a connection to a remote Postgres database:

iex()> opts = # connection opts
iex()> {:ok, conn} = Postgrex.start_link(opts)
{:ok, #PID<0.1330.0>}
iex()> ▊

I type in my query:

iex()> Postgrex.query(conn, "select now()")▊

I press the Enter key:

iex()> Postgrex.query(conn, "select now()")

And you know what the REPL does next? It hangs. For a tiny fraction of a second, the beautiful little bugger just hangs. I'll never forget, it looked just like this, for a few milliseconds:

iex()> Postgrex.query(conn, "select now()")

I very perceptibly gasped. And then, my result:

iex()> Postgrex.query(conn, "select now()")
{:ok,
 %Postgrex.Result{
   columns: ["now"],
   command: :select,
   connection_id: 88764,
   messages: [],
   num_rows: 1,
   rows: [[~U[2021-03-20 19:40:13.378111Z]]]
 }}
iex()> ▊

How can Elixir get away with this? An I/O operation like this is the precise location you want async, right? Did I turn off async in my REPL somehow?? Is Elixir not async???

No. Elixir can get away with this because it's built on top of Erlang/OTP. And Erlang/OTP got concurrency so right.

Concurrency - and the processes that support it - has been a part of OTP since the beginning. And not as a feature, but as a part of its raison d'être.

When I run Postgrex.start_link above, the function returns to me a pid, which I store in the variable conn. A pid is an address. In this case, Postgrex has started a process that manages the connection to my Postgres database. That process is running somewhere in the background. The pid is a pointer to that process.

When I run Postgrex.query(conn, statement), the first argument I pass to query/2 is the pid of the connection process. The REPL - the process I'm typing in - sends the statement as a message to the connection process.

The human metaphor of two friends passing messages back and forth to each other applies perfectly. Importantly for me the programmer, the REPL cheerily waits until it hears back from the connection process. The call to query/2 is a synchronous one. It's synchronous because it can be.

In fact, whenever an individual process does anything, it is always synchronous. On a local level, an Elixir/Erlang programmer is always thinking about synchronous, functional reductions. Including when sending and receiving messages to other processes. (And is all the while blissfully free of a red/blue function dichotomy.)

In Elixir and Erlang, concurrency doesn't happen at the function layer. It happens at the module layer. You can instantiate a module as a process, and now it's running concurrently with other processes. Each process maintains its own state. And can pass messages back and forth with other processes.
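For readers who only know JavaScript, the shape of the idea can be approximated with a toy. This is an analogy only, not how the BEAM implements processes, and `spawn`, `send`, and the counter are hypothetical names for the sketch. Note that the JavaScript caller still has to `await` each reply, whereas in Elixir that wait is a plain synchronous call.

```javascript
// A toy "process": private state plus a mailbox, handling one message
// at a time. An analogy in JavaScript, not a model of the BEAM.
function spawn(initialState, handle) {
  let state = initialState;
  const mailbox = [];
  let draining = false;

  function drain() {
    if (draining) return; // already working through the mailbox
    draining = true;
    while (mailbox.length > 0) {
      const { message, reply } = mailbox.shift();
      const [nextState, response] = handle(state, message);
      state = nextState;
      reply(response);
    }
    draining = false;
  }

  // The returned object plays the role of a pid: the only way to reach
  // the state is to send it a message.
  return {
    send(message) {
      return new Promise((reply) => {
        mailbox.push({ message, reply });
        drain();
      });
    },
  };
}

// A counter "process" whose state is a single number.
const counter = spawn(0, (count, message) => {
  if (message === "increment") return [count + 1, "ok"];
  if (message === "get") return [count, count];
  return [count, "unknown"];
});

const total = (async () => {
  await counter.send("increment");
  await counter.send("increment");
  return counter.send("get");
})();
```

The state lives inside the process, and everything outside talks to it through messages.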

What you end up with is a paradigm for concurrency that a human can easily map to and reason about. I always feel like I'm thinking about concurrency at the right layer of the stack.

In fact, for the Elixir/Erlang programmer, properly modeling your process modules is as important as properly modeling your data structures. I think this is why so many describe these languages as a "joy" to do concurrency with. The same rush you get when you have a breakthrough on how to model your data structure and delete 400 lines of complex business logic, you get when you have a breakthrough on your process organization.

Creation vs. Evolution

Consider the history of Elixir: first you take Erlang, which was invented by Joe Armstrong and team to solve the extraordinary concurrency challenges that telecommunications imposes. You battle test it for many years. Then you take José Valim and team, who come in and create Elixir with the sole goal of making it more ergonomic by borrowing heavily from one of the most ergonomic languages of all time.

Like, of course you end up with something special that's a joy to use.

Languages like Elixir and Ruby are an act of creation. Ruby, for example, reduces to a creator and designer (Matz). You can feel His embrace from first contact with the language. It is welcoming. It is warm. Things feel purposeful. They make sense. Ruby's principle of least surprise makes everything feel in order.

JavaScript is exactly the opposite. JavaScript is evolution. Node is full of surprises, at every turn, for every skill level. JavaScript will always find little ways to undermine you, to humiliate you. There is no one designer, just the cold force of natural selection. It is riddled with arcane evolutionary quirks. It is direct democracy, the people's language, for better and worse.

The history of JavaScript is complex and deeply human. And perhaps one day the story of collapse into a singularity. I can't wait to read the book.

But might ≠ right, and I'm happy I've finally broken free.