Thursday, June 18, 2015

A quick overview of what WebAssembly is and what it is not (yet)

There's been a lot of buzz today about WebAssembly, and that's completely understandable. A bytecode form of JavaScript has been on the minds of many web developers for a long time. But much of the online commentary seems to be based more on hopes than on released information. I hope to clear up some of that confusion. Note that I'm not involved in the project at all, so some of this information is likely incorrect; feel free to leave a comment if I've gotten something wrong.

WebAssembly isn't a spec yet. It's not even a draft spec. It's an idea and a proof-of-concept. So far, that's all that has been made public. But it has a lot of promise.

The WebAssembly roadmap essentially spells out three phases:

  1. A minimum viable product (MVP) that is roughly analogous to asm.js, specifically targeting C/C++ as the source language
  2. Additional features, such as threads, SIMD, and proper exception handling, all still with a focus on C/C++
  3. "Everything else", including features meant to support more languages. One such feature is access from WebAssembly code to objects on the garbage-collected JavaScript heap. This could enable WebAssembly code to access the DOM and web APIs (which would not be supported in either of the previous phases). This phase also contains support for things like large heaps, coroutines, tail-call optimization, memory-mapping (mmap) files, etc. If and when we get to this phase, I would expect it to be split further.

There will also be a polyfill, since browsers will not initially support WebAssembly. In fact, there is already a polyfill, but it's more of a proof-of-concept than an actual implementation. There is no WebAssembly spec yet; this polyfill is merely to test the viability of a binary-encoded AST.

Initially, it's not wrong to think of WebAssembly as binary-encoded asm.js, though that will change over time. It may eventually grow to the point that it could replace JavaScript, but in its first two incarnations, you will still need some JS code (to interact with the DOM and with web APIs). A likely use case is to compile low-level, algorithmic code (think image processing, compression, or encryption code) to WebAssembly, but to still write the bulk of your application in plain JS. If you're not familiar with asm.js, it might make sense to look into it; many of the limitations of that environment also appear to apply to the first incarnation of WebAssembly.
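To make that split concrete, here is a minimal sketch in TypeScript of what "a low-level kernel plus plain application code" can look like. It is written in the spirit of asm.js (integer coercions with `| 0`, all data in one pre-allocated heap), but it is not validated asm.js and it is not WebAssembly; `makeChecksumModule` and `checksum` are names I made up for the illustration.

```typescript
// Hypothetical kernel in the asm.js style: values are coerced to 32-bit
// integers with `| 0`, and all data lives in one pre-allocated heap buffer.
function makeChecksumModule(buffer: ArrayBuffer) {
  const bytes = new Uint8Array(buffer);

  // Sums `length` bytes starting at `offset`, wrapping to a signed 32-bit int.
  function checksum(offset: number, length: number): number {
    offset = offset | 0;
    length = length | 0;
    let sum = 0;
    for (let i = 0; (i | 0) < (length | 0); i = (i + 1) | 0) {
      sum = (sum + (bytes[(offset + i) | 0] | 0)) | 0;
    }
    return sum | 0;
  }

  return { checksum };
}

// Plain application code: it owns the data and calls into the kernel.
const heap = new ArrayBuffer(0x10000);
new Uint8Array(heap).set([1, 2, 3, 4], 0);
const kernel = makeChecksumModule(heap);
const total = kernel.checksum(0, 4); // 1 + 2 + 3 + 4 = 10
```

The kernel never touches the DOM or any web API; it only reads numbers out of the heap. That is the kind of code the first phase seems to be aimed at, with everything else left in ordinary JS.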

However, even at this early stage, WebAssembly describes some features that asm.js doesn't provide. In particular, it talks about 64-bit integer operations, which asm.js can't natively provide. The current polyfill POC doesn't seem to support them, and its authors may choose to have the polyfill always implement them as floating-point operations, but an actual runtime would need to provide proper support for int64 (and other sizes as well, like int8 and int16).
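The int64 problem is easy to demonstrate. JS numbers are doubles, so integers above 2^53 can't all be represented exactly. The sketch below shows the precision loss and one common workaround, carrying a 64-bit value as two 32-bit halves. This is only an illustration under my own assumptions: the `U64` type and `addU64` function are hypothetical, and nothing here reflects what the polyfill actually does.

```typescript
// A 64-bit unsigned integer as two 32-bit halves (hypothetical type).
interface U64 {
  hi: number; // upper 32 bits
  lo: number; // lower 32 bits
}

// Adds two U64 values, wrapping modulo 2^64.
function addU64(a: U64, b: U64): U64 {
  // The sum of two 32-bit values needs at most 33 bits, which a double holds exactly.
  const lo = (a.lo >>> 0) + (b.lo >>> 0);
  const carry = lo > 0xffffffff ? 1 : 0;
  const hi = ((a.hi >>> 0) + (b.hi >>> 0) + carry) >>> 0; // wraps modulo 2^32
  return { hi, lo: lo >>> 0 };
}

// With plain doubles, 2^53 + 1 rounds back down to 2^53.
const two53 = Math.pow(2, 53);
const asDouble = two53 + 1; // equal to two53: the low bit is lost

// With two halves, the same addition is exact: 2^53 is hi = 2^21, lo = 0.
const precise = addU64({ hi: 0x00200000, lo: 0 }, { hi: 0, lo: 1 });
// precise is { hi: 0x00200000, lo: 1 }
```

Emulation like this works, but every 64-bit operation turns into several 32-bit ones, which is presumably why native int64 support in a real runtime matters.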

WebAssembly only deals with the binary format and runtime environment; it doesn't provide any new APIs for dealing with the DOM or the network or anything like that. It may eventually provide alternatives to some existing web APIs (Web Workers would be an obvious one, once WebAssembly adds threading support).

The current plan for WebAssembly is not to use bytecode in the same way that the JVM or CLR do. Those VMs represent their bytecode as an instruction stream, similar to the instructions for a physical machine. WebAssembly, on the other hand, appears to be going more for a "serialized abstract syntax tree" form. The claim seems to be that this form can be compressed more efficiently than general-purpose compression routines alone can manage, and that doesn't seem too crazy. This distinction isn't important for web app developers, though it could mean that "disassembled" WebAssembly is easier to grok than disassembled JVM bytecode.
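Here is a toy illustration of the difference, in TypeScript. The expression type, the opcodes, and the byte layout are all invented for this sketch; none of it is the WebAssembly format (which, again, doesn't exist yet). The point is only the ordering: a serialized AST writes each node before its children, while a stack-machine instruction stream writes the operands first and the operation last.

```typescript
// Hypothetical toy expression tree; constants must fit in one byte here.
type Expr =
  | { kind: "const"; value: number }
  | { kind: "add"; left: Expr; right: Expr }
  | { kind: "mul"; left: Expr; right: Expr };

// Made-up opcodes for this sketch.
const OP_CONST = 0, OP_ADD = 1, OP_MUL = 2;

// Serialized AST (pre-order): the node's opcode, then its children in order.
function encodeAst(e: Expr, out: number[] = []): number[] {
  if (e.kind === "const") {
    out.push(OP_CONST, e.value & 0xff);
  } else {
    out.push(e.kind === "add" ? OP_ADD : OP_MUL);
    encodeAst(e.left, out);
    encodeAst(e.right, out);
  }
  return out;
}

// Stack-style instruction stream (post-order): operands first, then the operation.
function encodeStack(e: Expr, out: number[] = []): number[] {
  if (e.kind === "const") {
    out.push(OP_CONST, e.value & 0xff);
  } else {
    encodeStack(e.left, out);
    encodeStack(e.right, out);
    out.push(e.kind === "add" ? OP_ADD : OP_MUL);
  }
  return out;
}

// (2 + 3) * 4
const tree: Expr = {
  kind: "mul",
  left: {
    kind: "add",
    left: { kind: "const", value: 2 },
    right: { kind: "const", value: 3 },
  },
  right: { kind: "const", value: 4 },
};

const astBytes = encodeAst(tree);     // [2, 1, 0, 2, 0, 3, 0, 4]
const stackBytes = encodeStack(tree); // [0, 2, 0, 3, 1, 0, 4, 2]
```

Both encodings are the same size in this tiny case, so the sketch says nothing about the compression claim. What it does show is that the AST form keeps the nesting of the original expression visible in the byte order.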

WebAssembly will have defined binary AND text forms. So, while you won't be able to curl a WebAssembly file and immediately understand it, there will be tooling to convert from the binary form to the text form (which will surely be built into browsers eventually). It seems likely that the bytecode semantics will make concessions to retain some degree of human readability when converted to text form.
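As a rough idea of what binary-to-text tooling does, here is a small TypeScript sketch that turns a made-up byte layout for expressions into a parenthesized text form. Everything in it is an assumption for illustration: the opcodes (0 for a one-byte constant, 1 for add, 2 for mul), the node-before-children byte order, and the text syntax are mine, not WebAssembly's.

```typescript
// Made-up binary operators for a toy expression format (not real WebAssembly).
const OPS: { [code: number]: string } = { 1: "add", 2: "mul" };

// Decodes one expression starting at `pos`.
// Returns its text form and the position just past it.
function toText(bytes: number[], pos: number = 0): { text: string; next: number } {
  if (pos >= bytes.length) throw new Error("unexpected end of input");
  const op = bytes[pos];
  if (op === 0) {
    // Opcode 0 is a constant, followed by a one-byte value.
    if (pos + 1 >= bytes.length) throw new Error("unexpected end of input");
    return { text: "(const " + bytes[pos + 1] + ")", next: pos + 2 };
  }
  const name = OPS[op];
  if (name === undefined) throw new Error("unknown opcode " + op);
  // Binary operators are followed by their two operand expressions.
  const left = toText(bytes, pos + 1);
  const right = toText(bytes, left.next);
  return { text: "(" + name + " " + left.text + " " + right.text + ")", next: right.next };
}

const disassembled = toText([2, 1, 0, 2, 0, 3, 0, 4]).text;
// "(mul (add (const 2) (const 3)) (const 4))"
```

Because the bytes already mirror the tree, the converter is a few lines of recursion and the output reads like the original expression. That is the kind of readability the AST approach seems to be aiming for.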

I really like this quote from Peter Kasting in the comments on Ars. Like, seriously, if you have an Ars account, go give him some fake internet points.

Notably, the people working on WebAssembly are the PNaCl (Google) and asm.js (Mozilla) teams. In some sense this can be considered the followon effort to those projects, meant to combine the best attributes from each, and in a way that can be agreed on by all browser vendors.

This is exactly how the system is supposed to work: individual teams try to advance the state of the art, and eventually all those lessons learned are incorporated into a new and better system. See e.g. SPDY -> HTTP2. WebAssembly draws on both the past work and the experience of all those involved, and wouldn't be what it is without them.

I say this partly so the sorts of people who have bemoaned "non-standard" vendor efforts in the past may have reason to pause next time they feel the urge to do so. No one wants a balkanized world forever, but that vendor-specific effort may gain the sort of real-world experience necessary to come back and design a great cross-vendor solution afterwards. All four major browser vendors have taken flak for this sort of thing in the past, and in my mind often unfairly so.

All told, it looks like this is just the first, small step. I'm a little surprised that they went public so early. Either they really plan to develop it in the open, or the tech media got wind of it and blew it well out of proportion. Whatever the case, this looks like it will be a large undertaking. It's very promising that everybody seems to be at the table. And given that the browser vendors appear to want to actively develop their products, we might actually be able to use this within a couple of years.

What a time to be alive!

1 comment:

Nick Desaulniers said...

> I'm a little surprised that they went public so early. Either they really plan to develop it in the open, or the tech media got wind of it and blew it well out of proportion.

Romantic, but no. I've had on my calendar June 17th for at least a week or two. Everything was well coordinated.