Sunday, April 06, 2008

Message Passing

Anybody who has tried multithreaded programming in Java, C#, or almost any C-based language finds that it can quickly become nightmarish. Deadlocks, race conditions, lock management: sooner or later, all of these issues come up. In Java, all threads are able to share objects. As a result, the programmer is responsible for ensuring that those objects can't be put into an invalid state. Java provides per-object monitors to enable this. However, working with such monitors can be a nightmare. Using notify when you should have used notifyAll, forgetting to add the synchronized keyword to a method declaration, or incorrectly assuming that a class was thread-safe when it wasn't: these are all examples of simple mistakes that can cause your application to fail (perhaps only sometimes).
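To make the monitor discussion concrete, here is a minimal sketch of a counter shared between threads. The class name SharedCounter and the helper runDemo are made up for illustration. With synchronized on both methods the final count is exact; if the keyword is dropped from increment, the same program can silently lose updates, and only on some runs.

```java
public class SharedCounter {
    private int count = 0;

    // Without `synchronized`, count++ (read, add, write) can interleave
    // across threads and lose updates.
    public synchronized void increment() {
        count++;
    }

    // Reads go through the same monitor so they see the latest value.
    public synchronized int get() {
        return count;
    }

    // Hypothetical helper: starts `threads` threads that each call
    // increment() `perThread` times, then returns the final count.
    public static int runDemo(int threads, int perThread) throws InterruptedException {
        SharedCounter counter = new SharedCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join(); // wait for every worker before reading the result
        }
        return counter.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runDemo(4, 10_000)); // prints 40000
    }
}
```

Nothing in the compiler warns you if the keyword is missing, which is exactly what makes this kind of bug so easy to ship.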

Erlang, on the other hand, does not suffer from the same problems. Erlang processes (which are the Erlang unit of concurrency) do not share anything. As a result, the need for explicit synchronization goes away. Instead, Erlang processes send messages to each other. Each process has a mailbox, and each process can retrieve and process messages from its own mailbox. You can imagine that a message is copied when it is put into another process's mailbox, but since data in Erlang is immutable, the system is probably able to optimize the copy away in at least some cases.
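The mailbox idea can be approximated in Java as a rough sketch. This is not how Erlang is implemented, and Java cannot enforce share-nothing; the sketch relies on convention. It assumes a java.util.concurrent.BlockingQueue as the mailbox, an immutable Message class, and a "process" whose state is a local variable that only its own thread touches. The names MailboxDemo, Message, spawnAdder, and runDemo are all made up for illustration.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class MailboxDemo {
    // Immutable message: either "add this amount" (replyTo == null)
    // or "report your total on this reply mailbox and stop".
    static final class Message {
        final int amount;
        final BlockingQueue<Integer> replyTo;

        Message(int amount, BlockingQueue<Integer> replyTo) {
            this.amount = amount;
            this.replyTo = replyTo;
        }
    }

    // Hypothetical "process": `total` is private to this thread, so no
    // locks appear in user code. The queue synchronizes internally.
    static Thread spawnAdder(BlockingQueue<Message> mailbox) {
        Thread t = new Thread(() -> {
            int total = 0;
            try {
                while (true) {
                    Message m = mailbox.take(); // block until a message arrives
                    if (m.replyTo != null) {
                        m.replyTo.put(total);   // answer the request, then exit
                        return;
                    }
                    total += m.amount;
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        t.start();
        return t;
    }

    // Several sender threads each post `perSender` messages of amount 1.
    public static int runDemo(int senders, int perSender) throws InterruptedException {
        BlockingQueue<Message> mailbox = new LinkedBlockingQueue<>();
        Thread adder = spawnAdder(mailbox);
        Thread[] workers = new Thread[senders];
        for (int i = 0; i < senders; i++) {
            workers[i] = new Thread(() -> {
                try {
                    for (int j = 0; j < perSender; j++) {
                        mailbox.put(new Message(1, null));
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            w.join(); // all "add" messages are now queued
        }
        BlockingQueue<Integer> reply = new LinkedBlockingQueue<>();
        mailbox.put(new Message(0, reply)); // queued after every add
        int total = reply.take();
        adder.join();
        return total;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runDemo(4, 1_000)); // prints 4000
    }
}
```

The only shared objects are the queues themselves, and all the locking lives inside them rather than in the code that sends and receives.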

One of Java's founding principles was that manual memory management is complex, and thus easy to get wrong. Far simpler was to provide a system where the programmer didn't need to worry about such things. Erlang does for concurrency what garbage collection did for memory management - it removes that responsibility from the developer. Instead of asking every developer to handle any threading issues that might arise, it provides an abstraction that hides all of them. Behind the scenes, Erlang's mailbox system does whatever synchronization it needs to do in order to be completely thread safe.

Of course, Erlang's approach isn't perfect. Just as Java's garbage collection can still allow memory to leak (when you hold onto a reference longer than you need to), Erlang's system could still be susceptible to problems such as deadlocks: two processes can each sit waiting for a message from the other. However, such problems are less likely to occur, and should be more obviously solvable when they do.
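As a sketch of how a deadlock can still happen with message passing, the Java approximation below has two "processes" that each wait to receive a message before sending one, so neither ever sends. The names MutualWaitDemo, waitThenSend, and runDemo are made up for illustration, and a timed poll stands in for the receive so the demo terminates instead of hanging. (Erlang's own receive expression supports an after clause for a similar purpose.)

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class MutualWaitDemo {
    // Hypothetical "process" body: wait for a message, then send one to
    // the peer. Returns true if the wait timed out with nothing received.
    static boolean waitThenSend(BlockingQueue<String> inbox,
                                BlockingQueue<String> peer,
                                long timeoutMs) throws InterruptedException {
        String msg = inbox.poll(timeoutMs, TimeUnit.MILLISECONDS);
        if (msg == null) {
            return true; // nobody ever sent us anything
        }
        peer.put("reply");
        return false;
    }

    // Runs two such processes against each other and reports, per
    // process, whether it gave up waiting.
    public static boolean[] runDemo(long timeoutMs) throws InterruptedException {
        BlockingQueue<String> mailboxA = new LinkedBlockingQueue<>();
        BlockingQueue<String> mailboxB = new LinkedBlockingQueue<>();
        boolean[] timedOut = new boolean[2];

        Thread a = new Thread(() -> {
            try {
                timedOut[0] = waitThenSend(mailboxA, mailboxB, timeoutMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        Thread b = new Thread(() -> {
            try {
                timedOut[1] = waitThenSend(mailboxB, mailboxA, timeoutMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        a.start();
        b.start();
        a.join(); // join makes the workers' writes visible here
        b.join();
        return timedOut;
    }

    public static void main(String[] args) throws InterruptedException {
        boolean[] result = runDemo(200);
        System.out.println(result[0] + " " + result[1]); // prints true true
    }
}
```

The cause is visible in the message flow itself: each side receives before it sends. That is arguably easier to spot and fix than a lock-ordering bug spread across several classes.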

Erlang's approach is so radical, yet so simple, that it's definitely worth learning more about. I hope to see Java and C# projects adopt Erlang's philosophy when appropriate. In performance-critical applications, manual concurrency management might be the way to go. For most purposes, though, the safety provided by Erlang's approach outweighs any negative side effects.
