Monday, June 6, 2011

Points for Style: Synchronous I/O in JavaScript

Thanks to developments in the evolving Mozilla JavaScript standard, it is now possible to use I/O paradigms from 1982-era ROM BASIC in the browser. You can skip this tediously technical article and try it yourself (Firefox 2+).





Traditional JavaScript relies extensively on asynchronous I/O. If your program needs input, and that input isn't available right away, you have to structure your code around callbacks. For example:

function async_io() {
  do_first_part();
  addEventListener(when_input_is_ready, do_second_part);
  return "We aren't actually done til do_second_part() is called.";
}
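
For a version of that shape that actually runs, here is a minimal sketch. The makeInputSource() helper is a hypothetical stand-in for a real event source (it is not a browser API), which lets the "input arrives" moment be triggered by hand.

```javascript
// Hypothetical stand-in for a browser event source; not a real DOM API.
function makeInputSource() {
  const listeners = [];
  return {
    onInput(listener) { listeners.push(listener); },
    // Simulates the user finishing their input.
    provide(value) { listeners.splice(0).forEach((l) => l(value)); },
  };
}

const log = [];
const input = makeInputSource();

function asyncIo() {
  log.push("first part");
  // The second half of the work lives in a callback.
  input.onInput((value) => log.push("second part got " + value));
  return "not done yet";
}

const result = asyncIo();        // returns before the second part has run
const logAtReturn = log.slice(); // snapshot: only the first part so far
input.provide("42");             // only now does the callback fire
```

The function has long since returned by the time its second half runs, which is exactly the awkwardness the callback style forces on you.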

By contrast, synchronous (or "blocking") input mechanisms, such as INPUT in BASIC or scanf() in C's stdio, provide a simpler (and some might argue, classier) model for waiting on input:

function sync_io() {
  do_first_part();
  wait_for_input();
  do_second_part();
  return "We are now totally finished, thanks.";
}

An important feature of blocking I/O is that the program waits without using any CPU. We specifically eschew such techniques as "busy waiting":

function bad_wait_for_input_implementation() {
  while (typeof user_input === 'undefined') {} // gross
}

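At best, this will just peg your processor utilization at 100%; at worst, it will prevent whatever code is supposed to set user_input from ever running, so you will be stuck in an infinite loop. In a browser the worst case is the usual one, because event handlers run on the same thread as the loop and cannot start until it finishes.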
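
To see the starvation without hanging anything, here is a small sketch that spins for a bounded 50 ms instead of forever. The 50 ms limit and the variable names are assumptions made for the demonstration; setTimeout stands in for whatever callback would deliver the input.

```javascript
let userInput; // would be set by an input callback
setTimeout(() => { userInput = "hello"; }, 0);

// Spin for a bounded 50 ms rather than forever, so the demonstration terminates.
const deadline = Date.now() + 50;
let spins = 0;
while (typeof userInput === "undefined" && Date.now() < deadline) {
  spins++;
}

// The timer was due almost immediately, yet its callback has not run:
// it cannot start until this synchronous code returns to the event loop.
const inputArrivedWhileSpinning = typeof userInput !== "undefined";
```

Remove the deadline check and the loop never exits, because the only code that could end it is waiting for the loop to end first.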

Before JavaScript 1.7 arrived with Firefox 2 in late 2006, JavaScript lacked an obvious language facility to implement blocking for I/O. It was impossible to suspend execution in the middle of a function, idle, and resume where we left off. Since JavaScript 1.7, this can be achieved with the "yield" keyword.

Yield is like a special kind of return statement: it returns control to the caller, optionally returning a value. The difference is, the caller can now resume execution of our function right where it left off: immediately after the last yield. Here's an example:

function callee() {
  do_first_part();
  yield "I need input now, dag nab it!";
  do_second_part();
}

Notice that yield acts like an inside-out function call. In a normal function call, we start up a subroutine to do some work for us, and expect it to return control back to us when it finishes. In a yield, we are returning control back up the call chain to our own caller, which can later "magically" resume us at just the point we left off. It does so by invoking the next() method on a "control object" associated with our function. Heavy!
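
Here is a runnable sketch of that suspend-and-resume dance. One loud assumption: it uses the `function*` syntax that was later standardized in ECMAScript 2015 rather than the JavaScript 1.7 dialect shown above, so next() returns a { value, done } object instead of the bare yielded value. The log array is only there to record what ran when.

```javascript
const log = [];

function* callee() {
  log.push("first part");
  yield "I need input now, dag nab it!";
  log.push("second part");
}

const c = callee();              // nothing runs yet; we only get the control object
const first = c.next();          // runs up to the yield, then suspends
const logAfterFirst = log.slice();
const second = c.next();         // resumes right after the yield and runs to the end
```

After the first next(), only "first part" has been logged and first.value holds the yielded message; the second next() picks up exactly where the function left off.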

So what does our caller look like? It needs to be a sort of scheduler to sequentially invoke each part of callee(), interleaved with input handling. Here's a first attempt:

function caller_that_sucks(c) {
  c.next(); // do_first_part() is executed
  make_sure_input_is_ready();
  c.next(); // do_second_part() is executed
}

A good start, but there are two issues. First, we need to create c, the control object for callee(). In JavaScript this is called a Generator, because it was originally used for generating data sequences. Second, we never really solved the problem of how to stall execution until we have input; we just moved it up into the caller. Any attempt to implement make_sure_input_is_ready() will fall into the blocking problem we ran into before, trying to write wait_for_input(). Here's a better solution:

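function caller() {
  var c = callee(); // runs nothing yet; just creates the Generator
  var status = c.next(); // do_first_part is executed now
  if (status === I_NEED_INPUT_NOW_DAG_NAB_IT) {
    // Wrap the call so next() keeps c as its receiver; a bare c.next would lose it.
    addEventListener(input_event, function () {
      try {
        c.next(); // do_second_part is executed when input is ready
      } catch (e) {
        if (!(e instanceof StopIteration)) throw e; // a finished generator throws StopIteration
      }
    });
  }
}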

The overall win is that the complexity of event handling and asynchronous callbacks is moved out of our user code (callee), and into a general-purpose scheduler (caller). Plus, now it is trivial to translate C64 BASIC type-ins from COMPUTE!'s Gazette. Micro Adventure, anyone?
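
For the curious, here is one way such a general-purpose scheduler might look. It is a sketch under stated assumptions, not the code behind the demo: it uses the later standardized `function*` syntax, and run(), NEED_INPUT, and makeInputSource() are hypothetical names invented for illustration. The fake input source lets us deliver "input" by hand.

```javascript
const NEED_INPUT = "I need input now, dag nab it!";

// Hypothetical one-shot input source standing in for a browser event target.
function makeInputSource() {
  const waiting = [];
  return {
    once(listener) { waiting.push(listener); },
    provide(value) { waiting.splice(0).forEach((l) => l(value)); },
  };
}

// Scheduler: runs the generator until it asks for input, then idles
// until the input source calls back, and resumes where it left off.
function run(generatorFunction, input) {
  const c = generatorFunction();
  function step() {
    const { value, done } = c.next();
    if (done) return;
    if (value === NEED_INPUT) {
      input.once(step); // no busy waiting: we simply return to the event loop
    } else {
      step();           // any other yield: keep going
    }
  }
  step();
}

const log = [];
function* callee() {
  log.push("first part");
  yield NEED_INPUT;
  log.push("second part");
  yield NEED_INPUT;
  log.push("third part");
}

const input = makeInputSource();
run(callee, input);
const afterStart = log.slice(); // ["first part"]
input.provide("a");
const afterOne = log.slice();   // ["first part", "second part"]
input.provide("b");             // callee now runs to completion
```

Notice that callee() reads top to bottom like the sync_io() pseudocode, while every callback lives inside run().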

LINKS!

Note that yield only exists in JavaScript, the Mozilla extension of the ECMAScript standard. This means it isn't guaranteed to work in Chrome/Chromium, Safari, IE, etc. -- "V8 is an implementation of ECMAScript, not JavaScript".

You can start by trying the demo on ModernHacker.com. Read Mozilla's Generators and Iterators in JavaScript 1.7 for a more complete description of yield. JavaScript borrowed yield from Python; their version is described in a series of PEPs including Coroutines via Enhanced Generators. You can even pass values back into the yielding function when it resumes.
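
Passing values back in is what makes a BASIC-style INPUT possible, so here is a small sketch of it. Assumptions: it uses the later standardized `function*` syntax, where the value is handed to next(); JavaScript 1.7 spelled this send(). The greeter() function and its prompts are made up for illustration.

```javascript
function* greeter() {
  const name = yield "What is your name?"; // like INPUT N$ in BASIC
  const age = yield "How old are you, " + name + "?";
  return name + " is " + age;
}

const g = greeter();
const q1 = g.next().value;      // first next() runs up to the first yield
const q2 = g.next("Ada").value; // "Ada" becomes the value of the first yield
const result = g.next("36");    // "36" becomes the value of the second yield
```

Each yield hands a prompt up to the caller, and whatever the caller passes to the following next() shows up as the result of that yield expression.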

The JavaScript community has started to build concurrency libraries exploiting yield. Alex Graveley, all-around free software good guy and friend of the Modern Hacker, wrote Er.js to implement Erlang-style concurrency abstractions. It's based on Neil Mix's Thread.js proof-of-concept. Dave Herman of Mozilla Research maintains Task.JS, mentioned recently by Brendan Eich in his talk at NodeConf. Oni Labs (the software startup, not the steroid chemist) develops StratifiedJS, which they use in their server-side JavaScript product (taking the opposite approach from the asynchronous, callback-driven Node.js).

And for the last word in style, be sure to leave a Micro Adventure gamebook on your coffee table.
