2014.0804 permalink

Why I left Factual

Just kidding. I'm totally not leaving yet, and going strong two years after starting here (this is a new record for duration of continuous employment for me, by the way). If you're looking for a fun, mathy job in Clojure, I highly recommend working here. Feel free to email me if you're interested.

2013.0814 permalink

Bash is irrecoverably broken

Everybody knows that bash is a little weird. The quoting shenanigans with $@ and $*, the horked local scoping rules, the required whitespace around [[ and ]], and a handful of other oddities that make shell scripting feel like an adventure. And these were things I was happy to put up with in exchange for its convenient ubiquity and highly usable default settings.

That all changed this morning.

I was up early to work on bake, a build tool that integrates with the shell environment. Bake maintains a table of variable bindings and uses these repeatedly when solving the dependency graph, so I wrote up some quick benchmarks to test various strategies for storing and accessing it.

No problem, right? Bash has sparse arrays and global variables, and bash 4 introduced associative arrays. One way or another, we've got this covered. Ok, so let's do a quick check to make sure the sparse-array support doesn't do anything too crazy with access times.
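
A harness along these lines is enough to reproduce the effect (a sketch, not the exact script I used):

# Build a 65536-element sparse array, then time repeated lookups at one index.
xs=()
for ((i = 0; i < 65536; i++)); do xs[$i]=x; done

bench() {
  local i
  time for ((i = 0; i < 100000; i++)); do
    local x=${xs[$1]}       # the access being measured
  done
}

bench 1024
bench 16384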

# setup xs to have 65536 elements, each one character long
benchmarking local x=${xs[1024]}...
user    0m0.234s
user    0m0.237s
user    0m0.234s
user    0m0.241s
benchmarking local x=${xs[16384]}...
user    0m0.233s
user    0m0.240s
user    0m0.242s
user    0m0.240s

Looks good! It doesn't seem to matter what the index is. So however bash is doing arrays, it isn't doing anything crazy like a linear scan when we retrieve stuff. Right?

Arrays are linked lists

Wrong. Sparse arrays are implemented as linked lists in bash. To optimize the sequential access case, the array structure contains a stateful pointer to the last element accessed. We can see this behavior in a benchmark by requesting element 0 after element N:

benchmarking local x=${xs[1024]} y=${xs[0]}...
user  0m0.347s
user  0m0.353s
user  0m0.349s
user  0m0.350s
benchmarking local x=${xs[16384]} y=${xs[0]}...
user  0m1.175s
user  0m1.183s
user  0m1.189s
user  0m1.182s

A tragic lack of forethought on the part of the array implementer. But no matter; bash gives us at least two ways to access hash tables, so we can work around the problem that way.

Hash tables are linked lists

Bash does not use hashtables as of version 4.2, and you should not use a hashtable written by anyone who tells you otherwise.

Yes, you read that correctly. Every variable lookup, every associative array lookup, every alias lookup, and so on -- these are all linear-time scans through a linked list.

In the bash source, there's a set of functions called hash_create, hash_insert, etc, that are unsurprisingly called whenever bash maintains a mapping from strings to other stuff like functions, variables, aliases, and such. And also unsurprisingly, there's a function that hashes strings by multiplying each character by a big prime number. So one might reasonably conclude that all this talk of hashes is about setting up a hash table for averaged constant-time access.

That was probably the idea, anyway. But if you benchmark variable lookups, you'll observe a noticeable decline in performance as more of them are defined:

setting up 64 vars...
real  0m0.004s
user  0m0.004s
sys   0m0.000s
benchmarking local x=${xs_0} y=${xs_0}...
user  0m0.260s
user  0m0.259s
user  0m0.258s
user  0m0.259s
setting up 65536 vars...
real  0m7.484s
user  0m7.415s
sys   0m0.071s
benchmarking local x=${xs_0} y=${xs_0}...
user  0m2.905s
user  0m2.914s
user  0m2.927s
user  0m2.909s

Let's give bash the benefit of the doubt here and say that maybe some of the later variables created some hash collisions (despite all being named xs_i for densely-distributed integers i). It isn't quite fair to compare a nearly-empty hashtable with a full one.

setting up 524288 vars...
real  12m4.956s
user  12m2.650s
sys   0m2.090s
benchmarking local x=${xs_0} y=${xs_0}...
user  1m20.455s
user  1m19.054s
user  1m17.574s
user  1m12.624s

ZOMGWTF. Eight times as many variables, a whopping 100x as long to set them up, and nearly 30x as long to access. And we're not swapping to disk or anything; this is all happening well within memory.

Clearly this is no ordinary hashtable implementation. And indeed, here is the hash_insert function from the source (reformatted for brevity):

BUCKET_CONTENTS *
hash_insert (string, table, flags)
     char *string;
     HASH_TABLE *table;
     int flags;
{
  BUCKET_CONTENTS *item;
  int bucket;
  unsigned int hv;
  if (table == 0) table = hash_create (0);
  item = (flags & HASH_NOSRCH) ? (BUCKET_CONTENTS *)NULL
               : hash_search (string, table, 0);
  if (item == 0)
    {
      bucket = HASH_BUCKET (string, table, hv);
      item = (BUCKET_CONTENTS *)xmalloc (sizeof (BUCKET_CONTENTS));
      item->next = table->bucket_array[bucket];
      table->bucket_array[bucket] = item;
      item->data = NULL;
      item->key = string;
      item->khash = hv;
      item->times_found = 0;
      table->nentries++;
    }
  return (item);
}

This all looks quite innocuous at first glance, but if you've ever written a hashtable you will quickly realize that something crucial is missing: the bucket array is never resized. Bash does us the service of maintaining an exact count of the number of items stored in the table, but at no point does it resize the array to keep the average chain depth bounded.

Which, of course, is something of a problem as we start to add more entries. What it really means is that bash isn't using a proper hashtable at all; instead, it's just a fixed number of linked lists over which it will ultimately do linear scans. This explains why creating n variables takes O(n^2) time.
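
For contrast, a textbook grow step would look something like this (my sketch, not bash source; I'm assuming a nbuckets field alongside the nentries we saw above):

/* When the table gets too full, double the bucket count and rehash
   every chain into the new array. */
static void
hash_grow (table)
     HASH_TABLE *table;
{
  BUCKET_CONTENTS **old_array, *item, *next;
  int old_size, i, bucket;

  old_size  = table->nbuckets;
  old_array = table->bucket_array;

  table->nbuckets *= 2;
  table->bucket_array = (BUCKET_CONTENTS **)
    xmalloc (table->nbuckets * sizeof (BUCKET_CONTENTS *));
  for (i = 0; i < table->nbuckets; i++)
    table->bucket_array[i] = NULL;

  for (i = 0; i < old_size; i++)
    for (item = old_array[i]; item; item = next)
      {
        next = item->next;
        bucket = item->khash % table->nbuckets;
        item->next = table->bucket_array[bucket];
        table->bucket_array[bucket] = item;
      }
  free (old_array);
}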

Why this matters

You might be thinking, "look, it's a shell, who cares." I was tempted to think this too, and for most people it probably really doesn't matter. But the shell isn't just this thing you type commands into. It's your window into the system. It's a programming language that you're using to tell the computer what to do. And being able to extend your navigational context is enormously helpful. Bash, through its sloppy implementation, reduces the shell to just a command prompt and not much more.

It's possible to overcome most kinds of limitations in one way or another. And I'm generally willing to go to some absurd lengths to get something to work if it's sufficiently interesting. There are two kinds of problems, however, that force a hard stop. One is when you don't have enough computational expressiveness to do something, like trying to write Lisp in the C preprocessor. (You can do it with C++ template metaprogramming, but the C preprocessor isn't Turing-complete.)

The other is when you hit a performance barrier with no workaround. If you can allocate memory and get to it in constant time, you can probably come up with some creative solution to a performance problem; but this all stops when your language has no constant-time data access.

I guess it's time to switch to zsh, which does resize its hashtables:

if (++ht->ct >= ht->hsize * 2 && !ht->scan)
    expandhashtable(ht);

Or maybe this will be enough motivation to write a shell. But either way, today is my last day using bash.

2013.0713 permalink

Compressing the requirements

It's taken a while, but I've finally come up with a counterargument (by technological analogy) to my previously-held belief that a more powerful language would make me a more productive programmer. This is a hard kind of mindset to get out of, because the equation seems so obvious: productivity = (features delivered) / (development time). And if you can use a better language to decrease the development time for the same features, then you're more productive. Obviously.

So, with that said, here's a hand-wavy, ostensibly mathematical (but not really), don't-try-this-at-home analysis of software development economics.

Obviously the above equation is true, but it's capturing only the "coder" element of the software development supply chain. The full equation from the client's perspective would look more like this:

        problems solved       features
value = --------------- x ----------------
           features       development time

This new term, (problems solved) / (features), is really important. It makes the "features" term disappear altogether in the final analysis, but more importantly it adds a factor that can be optimized.

Here's the reason this matters so much. Most of the time, for most applications that people will pay you to write, a significant fraction of the features make little or no sense. For example, the password-reset mechanism on every website you sign in to. Here's the technological absurdity of it:

Technological purist: Ok, so I'm making a website, and the user needs a password to prove their identity. Got it.

Real world representative: Yep, but sometimes the user forgets their password.

T: Forgets?

R: Yep, all the time, in fact. So we need a way for them to recover it, or at least to create a new one.

T: Ok wait, but they don't also forget their identity, right? They just forget their password.

R: Exactly, so we need to have another secondary way for them to prove their identity.

T: Like a second password, since statistically, the odds that they'll forget two passwords are lower than the odds that they'll forget just one.

R: No, that's too much work. Let's ask them something about their history instead.

T: Aha, so it's a new password, but the user didn't have to make it up!

R: Um, sure, whatever. But let's call it a "security question" instead, since it's not a password (T: Yes it is.), and the user will need to use it only when they forget their password.

T: Ok, wait. If the user has this magic "security question" that they'll never forget, and they can get or create a password by using it, then why not just use that instead of a password in the first place?

R: Because someone else might know it.

T: So anyone who knows their "security question", which I still claim is a password (R: No it isn't.), can assert that they're the user, and we know up-front that this "security question" is insecure enough that we can't use it for account access in the first place.

R: Yes.

T: Explain to me how this is anything other than a horrible idea.

R: Amazon does it* and is worth billions. Q.E.D.

T: Well played.

* This is not true (or at least not anymore), though I remember having to enter security questions for a number of websites in the past.

Now, you might argue (as I used to) that with a powerful language like Lisp, you can just write macros and such, and compress these complicated-seeming real world problems into compact implementations. In some cases this works, but in my experience the closer you are to the user, the less reliably you can factor your code by reusing logic. I want to justify this point by analogy.

Suppose you're tasked with compressing a bunch of photos. They're all different angles and moments in time of The Fonz jumping the shark, so there's some opportunity to reuse information across multiple images. What's the best strategy?

Let's approach the problem top-down first. You've got these images that may have shared structure, but you'll get a lot of mileage out of something obvious like JPEG. It's lossy, but most users won't care much. From there, you might be able to try stacking the images up and encoding it as a movie, or maybe gluing them all together and encoding it as a large JPEG. Worst-case, maybe you encode them separately, then bundle it up as a .tar.xz to get some small compression.

So that's the top-down approach: probably effective, not very technologically inspiring. Luckily the bottom-up approach is a lot more fun. We know the images are different camera angles of the same event, so from a 3D point of view we have a lot of shared structure. Sounds like a photorealistic rendering problem. So let's write a raytracer and describe the physical shapes in the scene, setting the camera angle and position as the primary variant. We can then make minor model adjustments and encode the inevitable small differences from the actual pixels, resulting in a space-efficient lossless encoding.

Aside from not being able to encode any images in under a decade, I'm not actually that convinced that the bottom-up solution would result in much better compression. And if you found this was the case and wanted to improve the compression ratio, it would be more difficult than it is for a straightforward strategy like JPEG. The compression artifacts would probably be more visible because the model has more degrees of freedom and because it maps nonlinearly onto the screen.

JPEG and the raytracer encoding differ philosophically in a significant way. The whole idea behind JPEG is that these images are being viewed by humans; and because of what we know about human perception, some features are more important than others. Notice where JPEG optimizes: it removes detail, which is analogous to a software developer realizing that some feature doesn't even need to be written.

Contrast this with the raytracer strategy, which is about feature encoding efficiency. The focus isn't on removing stuff, but on studying the features that do exist and finding a way to share information between them. The user's perception is an afterthought; we're encoding what exists, not the way in which the user will see it.

Here's what I think is so interesting about this difference. JPEG is clever in that it carefully encodes the degrees of freedom created by optical perception, allowing it to use a simple cosine-transform and still get great compression ratios. The real optimization was in the design, not the implementation.

JPEG happens to encode each image independently using cosines, but you could imagine a scenario where the encoder could refer to pieces of other images like a kind of codebook. If it worked this way, its compression efficiency would be largely dependent on the availability of applicable codebook entries, which in turn would depend on how much it could change the original image (and this, again, is a matter of optimizing the (problems solved) / (features) term).

By analogy, the same is mostly true of languages and libraries. A language doesn't exist to compress every problem to its ideal structural truth, it exists as a medium for people to build a large common codebook in the form of shared libraries. The real world is just too noisy to be reliably compressible, at least where end-users are concerned. And even when it is structurally compressible, understandably few people are going to be interested in betting their success on that proposition.

2013.0327 permalink

Why Caterwaul sucks

In case you haven't heard of Caterwaul (you probably haven't), it's an experimental programming language I wrote that does some very awesome stuff. I've used it productively for over three years to write all kinds of different apps in a fraction of the time it would have taken in vanilla JS. Over these three years I've also written about two dozen extension modules for it, doing everything from the useful to the laughably misguided. This post is about the changes in mindset I've gone through since starting the project.

Caterwaul, like many other bad ideas, began as an opportunistic hack. What made Caterwaul different and interesting was that it was actually useful: it worked for my honeymoon-stage Lisp Enlightenment and gave me a way to write tools to enable better web UI development, which at the time I wanted to start a business around. It was wonderful, really. I built libraries like Modus that made working with the DOM as simple as getting and setting a jQuery val, and generating HTML as simple as writing quasi-HAML inside Javascript and returning the results from functions.

This was pre-Google, so I hadn't really worked with large-scale software before. In particular, I was used to the economies of startups, where reliability was a luxury that cost too much to be taken that seriously. Armed with that mentality and a dangerous amount of hubris, I set out to write things like Caterwaul with two goals. First, it must be awesome. Like, "that shouldn't be possible" awesome, stuff like introspective, self-replicating compilers that can drop live references from the compiling environment into the code. And defer parsing for certain chunks of stuff. Caterwaul is full of things like this, not because it was useful but because it seemed just barely possible.

Second, it must be useful to me. I didn't really care whether it was useful to anyone else because I was, at that point, waging a personal war against the widespread blub influence, compatibility with the unwashed masses be damned. There is some merit to this mode of thinking, but it isn't self-justifying like I thought it was. (More on this later.)

So that's where Caterwaul came from, and that's why it looks weird, emits only one error message that you will probably never encounter, fails through for dozens that you probably will, and is written in 192-column-wrapped Javascript squeezed onto as few lines as humanly possible. It's also why Caterwaul uses a horrendously hacked operator precedence parser to parse the entirety of the Javascript language. (To this day I'm still proud that I managed to get that code to work.)

That, from my perspective, is the somewhat obscure technological foundation from which Caterwaul arose. More interesting, I think, is the non-technical perspective of anyone faced with using Caterwaul in real life.

I've tried to come up with an analogy that conveys what I consider to be Caterwaul's main problem, and here's the best one I can think of so far. A product-eng meeting is about to start when one of the engineers arrives wearing no clothes. Suppose we look at the state of mind in the fraction of a second before anything is said.

Most of the people in the room are thinking variations of "tasteless," "offensive," "lawsuit," and "let's hope this doesn't make it to the press." The engineer is thinking, "I've never understood why the human form is so offensive. It's a precise, powerful machine whose engineering is second to none. And these people have forgotten what true engineering is. Besides, clothing impedes performance."

The modes of failure in this scenario aren't factual in nature. Nobody needs to be incorrect for everyone to be offended. And I personally think that from a rationalist perspective my fictional engineer makes a reasonable point, but what's really relevant is how it fits into the surrounding context.

Put differently, being iconoclastic isn't a crime or a sign of immaturity. But being reflexively iconoclastic certainly is.

My mistake wasn't writing Caterwaul. The reason Caterwaul sucks so much is that I thought it was something other people could use. It's a great piece of wall art, an ambitious venture into uncharted territory, and maybe a good read, but ultimately unfit for production.

I write software differently now. About a week ago I started Infuse, a library that takes the best ideas from Caterwaul and reframes them in Javascript idioms. This doesn't mean I'm giving up quixotic crusades towards impossible goals (you can take my self-modifying Perl when you pry it out of my cold dead hands). It's the opposite: some of the most real problems programmers face have to do with human, not computer, communication. Those are the problems that demand the most interesting and creative solutions, and that's what I've resolved to start focusing on.

That, and wearing clothes to meetings. Gotta start doing that.

2012.1201 permalink

Selling Go

A couple of days ago, Tibor posted a question on the Go user group that I thought was really interesting. Quoting directly from his message:

Many people just flip through the pages and code samples and say "alright I get it" and then either forget about it or start trolling right away. I am personally having a hard time having certain people try Go for real. I might do something wrong or maybe it just depends on the people.

Being a PL enthusiast, I started thinking about what about Go as a language might contribute to or prevent its adoption. Here's part of my reply:

Regarding motivating people to use Go, it occurs to me that Google might not be very good at selling languages. I could be overgeneralizing on that point, but as a Javascript developer the Dart synonym page looked fairly uninspiring: http://synonym.dartlang.org/ (and in some cases the page is incomplete or misleading; e.g. "const" in newer versions of Javascript). The reason it's uninspiring to me is that, coming at it from the perspective of Javascript, I see the introduction of a type system (which I'm not really a fan of, being a JS developer), some kind of worrying browser compatibility stuff, and a bunch of other stuff that in many cases can be solved with libraries. I also have to learn a whole new set of semantics and idioms; for instance, suppose I'm interfacing with JS code that gives me undefined ... how do I check for that case? I do get some nice features for it, like string interpolation, but I can get rid of most of the negative adoption costs by going with a thinner abstraction layer like CoffeeScript instead.

Another point is that Java has been the status quo for some time; people who enjoy adopting new languages have had lots of time to explore alternatives, and many of those alternatives aren't very Java-like (Ruby, Scala, Clojure, Haskell, etc). So using C/Java-style syntax in the Go language might work against its adoption, since early adopters who are used to non-Java stuff may assume that the syntactic similarity implies a conceptual similarity, or they may just not like C-style syntax. (Both apply in my case, and that's part of the reason it took me so long to see why Go might be useful. The other part is that I was just being closed-minded.)

Later, in response to my implicit comparison of Clojure and Go:

Clojure gives me a number of things that I really like, among which are syntax macros, dynamic typing, immutable data structures and excellent destructuring binds, transparent JVM interop, REPL-oriented development, non-OO multimethods, and a nice hybrid of familiar semantics from Lisp and Java. Now it also comes with some disadvantages, but since I already knew Java and Lisp the adoption cost was very low for me and Clojure is decisively better than either one for what I'm doing.

My honest question is, from my point of view (i.e. relative to Clojure) what does Go bring to the table? If I were writing a large application and collaborating with other developers, it would bring API stability (due to types), performance, and a unified way of doing inter-process communication. And if I were coming from Java only, it would be a clear win, since it makes a lot of things easier than they are in Java. But you're already assuming that I'm willing to adopt Go over Java, which would mean I'm a member of the demographic that is willing to learn a new language to solve a problem. Once that's the case, Go isn't competing with Java anymore; it's competing with Java, Ruby, Lisp, Scheme, OCaml, Haskell, Scala, Clojure, F#, D, CoffeeScript, Factor, Erlang, etc. Go has a natural advantage in that because it was developed by Google it probably has a lot of internal consistency and is likely to be well-engineered, but since it's a new language without a well-known killer library (e.g. Rails for Ruby, jQuery for Javascript, Cascalog for Clojure), people could be justifiably hesitant to absorb the cost of learning it.

Knowing just enough about programming language adoption trends to be dangerous, here are my thoughts about some of the factors governing Go's adoption (elaborating on my reply above). I'll start off with the parts of Go that I think are particularly cool:

  1. Multiple return values from functions. This is a great feature for functional programming because it means that you can compose and invert non-unary functions, and Go's standard library contains several functions that use this feature eloquently. (There's a short sketch of this and a few of the other items just after this list.)
  2. Anonymous functions and closure support.
  3. Goroutines, and the fact that they look just like function calls.
  4. No de-facto OOP.
  5. Channels. I don't know all of the details about how these work, but judging by the success of the actor model in Erlang and Scala, and in particular the idea of unshared memory and message passing, I think channels give Go a compelling edge when writing imperative multithreaded code.
  6. A sane static type system. Nothing over the top like Scala.
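
To make (1), (2), (3), and (5) concrete, here's a tiny sketch of my own (not from the Go site): a function returning two values, plus an anonymous function run as a goroutine and talking to its caller over a channel.

package main

import "fmt"

// divmod returns two values; callers destructure them directly.
func divmod(a, b int) (int, int) {
  return a / b, a % b
}

func main() {
  q, r := divmod(17, 5)
  fmt.Println(q, r) // 3 2

  // An anonymous function launched as a goroutine, communicating
  // with the caller over an unbuffered channel.
  ch := make(chan int)
  go func() { ch <- q * r }()
  fmt.Println(<-ch) // 6
}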

As an aside, throughout this post I talk as though Go has very few users. This is not the case. This article puts Go in the same ballpark as Scheme, Erlang, Prolog, Clojure, D, OCaml, Groovy, and R. I think the reason I see Go as being less widely used is that it feels like a general-purpose language, whereas the others don't -- and it's reasonable to assume that a general-purpose language will have a wider audience than a specialized one. So I do think it's valid to speculate a little about why Go hasn't seen more widespread adoption.

Here's how the language looks to me from the point of view of some stuff I've done.

Writing a low-level interpreted language

Whether I should or not, I really value C's ability to get right down to the metal. For instance, if I write code like this:

some_struct_type **stuff = ...;
stuff[i] = get_a_struct_pointer();

I can be really confident that somewhere in my machine code there's going to be a couple of register loads followed by an 8-scaled move to memory. And in particular, nothing else is going to happen; C doesn't insert write barriers, bounds checks, or anything else unless I tell it to.

What does Go do with this? I'm not entirely sure. Go is garbage-collected, which means it might put in a write barrier. It might also insert bounds checks, even though I might be absolutely certain that it doesn't need to.

More significantly, am I going to be able to JIT stuff from Go? In C I can mmap some memory, mprotect it to be executable, write some machine code, and run it. That generated code can interact with my struct values because I know where all the bits are.
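
For reference, here's a minimal sketch of that trick on Linux/x86-64 (error handling omitted; the generated code just returns a constant):

#include <stdint.h>
#include <string.h>
#include <sys/mman.h>

int main(void) {
  /* x86-64 machine code for: mov eax, 42; ret */
  uint8_t code[] = {0xb8, 0x2a, 0x00, 0x00, 0x00, 0xc3};

  /* Map a writable page, copy the code in, then flip it to executable. */
  void *mem = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  memcpy(mem, code, sizeof code);
  mprotect(mem, 4096, PROT_READ | PROT_EXEC);

  int (*fn)(void) = (int (*)(void)) mem;
  return fn();                  /* exits with status 42 */
}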

C also gives me a preprocessor that I can use to abstract away a lot of repetition. For example, I have some bytecodes that do vectorized math on 64-bit value cells. I could write each operation by hand, but instead I wrote a preprocessor macro that defines each operation four times, once for each integer size:

#define defbinop(name, name_prefix, type_prefix, op) \
  defbytecode(name_prefix##8##name) \
    type_prefix##8_t v1[8], v2[8]; \
    dpop(v1); dpop(v2); \
    if (exit_code) return exit_code; \
    v1[0] = v2[0] op v1[0]; v1[1] = v2[1] op v1[1]; \
    v1[2] = v2[2] op v1[2]; v1[3] = v2[3] op v1[3]; \
    v1[4] = v2[4] op v1[4]; v1[5] = v2[5] op v1[5]; \
    v1[6] = v2[6] op v1[6]; v1[7] = v2[7] op v1[7]; \
    dpush(v1); \
  end \
  defbytecode(name_prefix##16##name) \
    type_prefix##16_t v1[4], v2[4]; \
    dpop(v1); dpop(v2); \
    if (exit_code) return exit_code; \
    v1[0] = v2[0] op v1[0]; v1[1] = v2[1] op v1[1]; \
    v1[2] = v2[2] op v1[2]; v1[3] = v2[3] op v1[3]; \
    dpush(v1); \
  end \
  defbytecode(name_prefix##32##name) \
    type_prefix##32_t v1[2], v2[2]; \
    dpop(v1); dpop(v2); \
    if (exit_code) return exit_code; \
    v1[0] = v2[0] op v1[0]; v1[1] = v2[1] op v1[1]; \
    dpush(v1); \
  end \
  defbytecode(name_prefix##64##name) \
    type_prefix##64_t v1, v2; \
    dpop(&v1); dpop(&v2); \
    v1 = v2 op v1; \
    dpush(v1); \
  end
#define defubinop(name, op) defbinop(name, u, uint, op)
#define defsbinop(name, op) defbinop(name, i, int, op)
defubinop(add, +)
defubinop(sub, -)
defubinop(eq, ==)
defubinop(lt, <)  defsbinop(lt, <)
defubinop(le, <=) defsbinop(le, <=)
defubinop(mul, *) defsbinop(mul, *)
#undef defbinop

More superficially, I have a soft spot for C's anachronistic syntax oddities. Go feels a little undecided about whether to keep them or to go with something smoother like Ruby syntax.

Finally, Go carries with it a new set of default assumptions that need to be learned. What is the relative precedence of == and &? How are strings encoded? Can I construct a slice over varargs? (And if I can, what happens if I return that slice?) Do standard library functions return slices that might make large arrays ineligible for GC? If I write an anonymous function that doesn't close over anything, does it still allocate memory?

So those are my biggest areas of concern about Go as a language for an interpreter. From where I'm standing, it could provide some nice benefits like memory safety and GC, but only at the cost of significantly increasing the underlying complexity compared to something like C or C++.

Writing a web app

For a problem like this, low-level systems programming is (hopefully) out of the picture and you can just focus on the domain-specific code. At that point Go's managed runtime looks a lot more attractive, and its concurrency primitives seem perfectly suited to horizontal scaling. (Not to mention its integration with AppEngine.)

It might seem like Go has an advantage here, but there are a lot of competitive options. Java and Ruby each offer great framework support and the ability to quickly scale up an experienced development team. For corporate development, I don't see Go winning as an app development language until it gets a killer library that gives it an edge over Rails. (Actually, this isn't quite true. It can win for anything where maintainability matters, but at that point people will probably gravitate towards Java.)

What about solo developers pursuing hobby projects? In this case there probably isn't a budget and you can afford to take a risk to learn a new language.

Here's the problem with this situation, though. If you're even looking at changing languages (as opposed to libraries, for instance), then you're probably looking for something that gives you a lot of leverage. Learning a new language is a lot of work, and you probably want to get significant returns for your effort. In a case like this, you're more likely to end up with Ruby, Python, Perl, Smalltalk, IO, Clojure, Scala, Javascript (for Node), Erlang, etc.

Getting down to brass tacks, though, Go is competitive with all of the above languages in that it gives you lots of mechanisms to write functional programs. And furthermore, it's faster than all but Scala if you take the Shootout results at face value. I think Go has more technological merit than Ruby, Python, IO, and, I reluctantly admit, Javascript.

However, this technological merit and suitability for FP tasks is obscured somewhat by the presentation on the Go language website. The very first example you see is something that is structurally similar to "hello world" in C. A lot of the other examples on the site use Go's functional constructs in minor ways but have relatively low awesomeness per byte of source.

Compare this to CoffeeScript, whose website kicks off with some concise and compelling use cases that are appealing to Javascript programmers. Although a tiny fraction of the engineering achievement of Go, CoffeeScript's website makes its case much more clearly than Go's.

Incidentally, after a few hours of reading stuff on the Go website I found this webserver program (I've changed the formatting slightly):

package main
import (
  "fmt"
  "net/http"
)
func handler(w http.ResponseWriter, r *http.Request) {
  fmt.Fprintf(w, "Hi there, I love %s!", r.URL.Path[1:])
}
func main() {
  http.HandleFunc("/", handler)
  http.ListenAndServe(":8080", nil)
}

This is beautiful: simple and to the point. If this were on the front page, I imagine Go would have a lot more traction within the web development community. (An example very much like this sold me on node.js when it first came out -- mostly because there was no config-related boilerplate, a welcome change from JVM-based frameworks.)

Making enemies

I'm going to speculate wildly and suggest that Go needs to make some enemies. For example, here are some controversial statements I think are made by other non-mainstream languages, often on their websites:

  • Erlang: "if you want to distribute, don't share data"
  • Haskell: "we don't need no stinking side-effects"
  • Ruby: "we don't need no stinking main method"
  • CoffeeScript: "Javascript should be more like Ruby"
  • IO: "prototype OO and Lisp metaprogramming are good ideas, but S-expressions are kinda lame"
  • Scala: "you won't fall into the gaping chasm we've left between FP and OO, we promise" (disclaimer: I don't like Scala)
  • Clojure: "Lisp needs immutability and the JVM to write good concurrent software"
  • Caterwaul: "readability is overrated"
  • J: "we win at code golf, every. single. time."

I think if Go made a similarly strong statement, it would catch the attention of like-minded programmers, albeit at the cost of alienating others.

And that's the end of my rant. Let me know if I've made any factual errors, and as always, I'm not an expert and am speculating wildly. I'll be interested to see what happens with Go. It wouldn't surprise me if it became the next big thing in big data processing, for example. I think it's a language that is just waiting for the right killer library.

2012.1129 permalink

An apology

About six months ago I wrote a now-infamous post about why I left Google. I still stand behind most of what I wrote, but after thinking about it I need to address one point in particular: "Go failed to solve any significant problems."

I'm sorry I wrote this and I think it shows a significant lack of perspective. First, I don't know Go as a language, and I didn't know it when I bashed it, either. When I wrote the post I was still frustrated with the unimaginativeness of Dart, and Go, from what I had seen, looked like another Java-ification of an otherwise interesting language (Erlang).

Let's suppose for a moment that I was right about that much. Even in that case, I don't think my point had much merit because Erlang has a number of problems that make it difficult to use in practice. The two that I've heard about most commonly are obscure syntax and suboptimal performance. If Google managed to solve just these problems with Go, then they've changed the landscape of programming languages enough to make writing concurrent programs much more approachable to normal people (i.e. people who don't get excited about phrases like "process calculus").

Anyway, I shouldn't have written this about Go. Google has started a number of interesting compiler/language projects and the only one I actually know enough about to criticize is Dart (which I still think is a disaster, but that's another rant). All of the others I've looked at have been meritorious in some significant way, whether as a matter of design or, more commonly, of exceptionally thoughtful implementation.

2012.0819 permalink

Clojure metadata

I've finally gotten around to learning Clojure, and I have to say it's an impressive and very usable piece of technology. At first I was worried that it was a hack to get Common Lisp running on the JVM, but it's much more than that. In general I've found it to be predictable and easy to learn, but one of the counterintuitive parts of Clojure is the way it deals with metadata.

When I first saw the metadata mechanism, I thought, "oh cool, we can attach arbitrary maps to things", so I popped open the repl and tried to do exactly that. I used the ^meta value shorthand, since the documentation states that it is a reader macro that expands to with-meta.

user> (meta (with-meta 'a {:doc "hi"}))
{:doc "hi"}                   ; so far so good...
user> (meta ^{:doc "hi"} 'a)
nil                           ; huh?
user>

Clearly my understanding of metadata is a little off, but even more perplexing is this:

user> (meta (with-meta [] {:doc "hi"}))
{:doc "hi"}                   ; as expected...
user> (meta ^{:doc "hi"} [])
{:doc "hi"}                   ; wtf!
user>

So we can attach metadata to a vector but not a symbol?

Here's the missing piece. ^meta value is a reader macro, whereas with-meta is a regular function call. This is important. You'll notice that you can use metadata annotations inside of quoted values:

user> '(a ^:b c)
(a c)
user>

This is different from something like the @x reader macro (and, in fact, nearly every other reader macro supported by Clojure):

user> '(a @b c)
(a (clojure.core/deref b) c)
user>

The difference is that in the case of quotation, dereferencing, unquoting, etc, the reader generates a function call. In the case of metadata, the function call happens inside the reader! This explains why we started losing metadata on quoted symbols; here's what happened there:

;; the repl reads the input:
'(meta ^{:doc "hi"} 'a)               <- initial list to be read
`(meta ~(with-meta 'a {:doc "hi"}))   <- ^ reader macro
`(meta (quote a))                     <- metadata is on the (quote a) list
;; the repl has read the list; it evaluates the result:
(eval '(meta (quote a)))              <- metadata is still on (quote a)
((resolve meta) (eval (quote a)))     <- eval rule for function calls
(<meta> a)                            <- evaluate the argument

The metadata is lost when we evaluate (quote a), which is a list that contains metadata, into a, which is a symbol. This makes sense; metadata shouldn't persist across evaluation.

What about vectors? Here's the difference:

;; read:
'(meta ^{:doc "hi"} [])               <- initial value to be read
`(meta ~(with-meta [] {:doc "hi"}))   <- ^ reader macro
`(meta [])                            <- metadata is on the [] vector
;; evaluate:
(eval '(meta []))                     <- metadata is still on []
((resolve meta) (eval []))            <- eval rule
(<meta> [])                           <- [] evaluates to itself!

We don't lose the metadata here because the empty vector evaluates to itself and therefore keeps whatever metadata it had at read-time. There is some strangeness that comes about because of this:

user> ^{:doc "hi"} [1 2 3]
[1 2 3]
user> (meta *1)
{:doc "hi"}
user> [1 2 3]
[1 2 3]
user> (meta ^{:doc "hi"} *1)
nil
user>

It's the same problem we had with the quoted symbol from before. The reader hands us a symbol *1 with metadata already on it, but eval converts that metadata-laden symbol to the un-metadata-laden vector [1 2 3] before meta can see it.

The result of all of this behavior is that you are likely to see with-meta in evaluative contexts; that is, within functions and other normal expressions that will be processed by eval. Read-time metadata makes more sense in any quoted context, including in macro arguments (which Clojure hands to the macro unevaluated).
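
A quick illustration: def is a special form, so its symbol argument is never evaluated, and read-time metadata on the symbol survives long enough for def to copy it onto the var (this is how docstrings and ^:private work):

user> (def ^{:doc "hi"} x 42)
#'user/x
user> (:doc (meta #'x))
"hi"
user>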

This also means that the strange behavior of the reader macro is actually a strong feature: it allows metadata to be value-transparent even before the evaluation process. This results in nice properties like:

user> (= '(a b c) '(a ^{:doc "hi"} b c))
true
user>

2012.0620 permalink

Text-mode is awesome

I recently tweeted that (text : GUI) :: (FP : OOP), and that Vim was a WYSIWYG version of Ed. I've been thinking a lot about how UIs are structured, more computationally than visually, so I'm interested in analogies like these. This post is a rambling collection of thoughts around the philosophical differences between text and GUIs.

The obligatory anecdote

I downloaded Wikipedia the other day. Now I've got 8GB of bzipped XML that I need to unpack and index into articles so I can pull them up in less or vim. I wanted to see how the XML was structured, so I did this:

$ bzcat wikipedia.xml.bz2 | less

And magically, everything Just Worked (keep in mind that I'm using a two-year-old netbook with 1GB of memory). No swapping to disk. No significant delay at all. And, remarkably, no CPU usage until I scrolled or searched the document. I saw the first screenful of XML in well under a second.

UNIX/Linux users will, of course, not be at all surprised by this because the pipe has worked this way since the Middle Ages. But a large segment of the computing world doesn't have the ability to slice 8GB of data so elegantly. For example, suppose the Wikipedia data were stored in 34GB of HTML, similarly bzipped, and we wanted to view it in a browser. No problem, we'll just do this:

$ bunzip2 wikipedia.html.bz2 &      # Or worse, use WinZIP :)
$ google-chrome ./wikipedia.html

Aside from the obvious issue that Chrome is likely to load the file faster than bunzip2 can provide data, there's a bigger problem lurking here. Suppose you've loaded up the first screenful of HTML. Does Chrome wait before reading the rest of the document?

Of course not. It can't, because HTML has no locality properties at all. You could easily put a script like this at the bottom:

<script>document.body.innerHTML = ''</script>

The problem here is that the document has a mutable model with no spatial ordering of elements. The only way to accurately render the document is to load the whole thing into memory, run the scripts, and then render the subset of components that appear on the screen (which itself is no small task).

Put differently, computing any element's state as a function of time requires Turing-complete evaluation and a complete model of the entire document.

FP and OOP

FP generally emphasizes data transformations that are extrinsic to the data itself, and elements of data are generally immutable and unencapsulated. OOP tends towards the opposite: transformations are intrinsic, data is generally mutable, and state is hidden.

This difference is crucial to the function of shell pipes. Once bunzip2 writes a byte to stdout, it has no way to change it or impact the state of that byte. All interpretation is left to the next process.

The real power comes from the way ASCII is interpreted. Text streams have a spatial ordering that, aside from ANSI escape codes and backspace characters, maps directly onto the order of bytes in the input. This wouldn't be true if ASCII also defined a 'previous-line' character (I'll call it \p). For example:

foo\nbar -> foo
            bar
foo\nbar\pbif -> bif
                 bar

The problem with \p is that it would break the ordering. It would force interpreters like less to provide a mutable data model, which in turn would require them to load the entire input stream before displaying anything.

IO

Since bytes have no intrinsic data transformations or hidden state, they have no identity; the byte 65 (capital A in ASCII) is the same regardless of whether it comes from a file, the network, or, importantly, the keyboard. This last point is where things get interesting.

Suppose you're doing stuff on the command line and get tired of repeating yourself. Maybe you're sending a series of requests to a webserver:

$ curl http://server/request1
$ curl http://server/request2
...

It's insanely easy to store this process. You can just send your commands to a file rather than to a shell:

$ cat > commands
curl http://server/request1
curl http://server/request2
...
^D

Then you can replay them using a redirection:

$ bash < commands

Boring, right? We've seen all this already. Okay, so why can't we do the same thing with a web browser?

The obvious answer is that we could; it would just involve replaying a series of browser-generated events. But with sites doing A/B testing and what-not, those events wouldn't necessarily mean the same thing. A click could go to the wrong place if the layout changed even by a few pixels.

The real problem is that the events themselves contain implicit ties to the DOM state, which, as I mentioned, requires a Turing-complete simulation and is not fully initialized until all of the data has been read. Events are generated by an implementation-specific process, and because the model has hidden state it is unclear how much data would be required to emulate it.

By the way, you can always get to a DOM-style model for text. All you have to do is start piping things into ed. At that point you'll have an in-memory representation of the document so that you can use random access, mutation, and other features that aren't normally supported by the streaming pattern. The advantage of this is that you can then export the result back to a stream and continue using the same old tools.
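
For example, you can pipe a short command script into ed, let it mutate the in-memory buffer, and write the result back out (a sketch; notes.txt stands in for whatever you're editing):

$ printf '%s\n' 'g/old/s//new/g' w q | ed -s notes.txt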

Programming languages

Strangely enough, I think the philosophy behind FP is better represented by shell programming than it is by functional programming languages. FP languages still have a number of strange quirks, most notably that you can't generally serialize values even though technically every value is anonymous. The reasons for this tend to be arcane -- stuff like holding references to opaque structures within the language runtime -- but still relevant for real-world use.

The UNIX command-line toolset is remarkably analogous to functional programming in any case. For example:

grep 'pattern' file         -> filter f xs
head -n100 file             -> take 100 xs
tail -n100 file     sort of -> drop n xs

The similarity goes beyond commands; data structures are also modeled:

mkdir foo                   -> foo = {}
echo bif > foo/bar          -> foo['bar'] = ['bif']
echo baz >> foo/bar         -> foo['bar'].push('baz')

Conveniently, newlines give us seamless Perl-style lists with associative consing. That is, a scalar (a one-line file) is just a one-element version of an array (a many-line file). This means that files generalize a couple of different monads (Maybe and List are the ones that come to mind) and allow us to freely convert between them, since the types are erased by the time we use the data.

The image

Continuing the now off-topic tangent, I'd like to point out that an image of a hard disk is, as its name suggests, an image. And bash is, as its name sort of suggests, a shell, which is a kind of REPL.

Therefore, using a shell to modify a hard disk image when writing software is very similar to using a REPL to modify an image file, something you might do in Smalltalk or Lisp. A lot of developers are understandably leery of this paradigm because it's hard to see what has been defined, diffing is difficult, etc. In general I would argue that the UNIX image and REPL are much more powerful and functional than the REPLs offered by other languages -- that could be why we're willing to work with it but unwilling to use other image-based systems. But the analogy is still there, and the power of a self-modifying image is sufficient that nobody edits a disk image from the outside anymore.

Complexity

Last sub-point. UNIX command-line tools are small and have very few bugs; graphical programs are large and have many more. This echoes the claims made by FP advocates that functional code is simpler to write and maintain. The 200+ megabyte Haskell installation is not a great testament to this point, but the difference between Chrome (70MB) and less (151KB) could be, depending on how you interpret it.

2012.0611 permalink

How Modus works

Modus is a project I started a while ago and have largely left unmaintained. The idea was to let you use jQuery in higher-order ways on lists or other structures of components. The canonical example is something like name/email:

<div class='person' id='person1'>
  <input class='name'><input class='email'>
</div>

Given the above HTML, you could use Modus like this:

$('.person').modus('composite', {name: '.name', email: '.email'});
$('#person1').val()   // -> {name:  $('#person1 .name').val(),
                             email: $('#person1 .email').val()}

This morning I received an email from someone who is using Modus in production, asking for a line-by-line explanation of the source code. This is actually a great opportunity to explain some things about what Caterwaul is doing, so here goes:

First, Modus is written in Caterwaul, not Javascript. It ultimately gets compiled into Javascript, but modus.waul.sdoc is the human-readable source code and modus.js is the unreadable Caterwaul output. SDoc is a documentation format I wrote that basically auto-comments any paragraphs beginning with an uppercase letter or pipe symbol. waul, the Caterwaul precompiler, understands SDoc syntax if the file ends in .sdoc.

OK, so with that out of the way, here's the code (also on Github), snippet by snippet:

caterwaul.module('modus', 'js_all', function ($) {

This defines a Caterwaul module transformed under the js_all set of macros. js_all includes all of the client/server-agnostic JS macros, but not jquery[]. (We don't need jquery here because we don't create DOM nodes.)

There are two reasons we create a module. First, it's one of the forms that waul understands. Technically, waul is doing something that isn't possible: it's predicting the outcome of executing a bunch of Javascript and doing some work beforehand. The only reason this works is that it recognizes a few different toplevel forms and treats them specially. The result is that any precompilable Caterwaul module is generally defined in its own file and starts with something like the module definition above.

The second reason to define a module is that it will become a part of Caterwaul's replica if you end up calling .replicate(). Most people will never have a reason to do this.

$ = jQuery;

Obvious enough; we don't need the $ reference that Caterwaul gives us, so we replace it with jQuery. Notice that this gives us $ as a shorthand for jQuery inside the module, which wouldn't otherwise be true if the user has called noConflict.

var original_jquery_val = $.fn.val;
$.fn.val(args = arguments) = ...;

We're replacing jQuery's val method, so we need to first store the original and then redefine it. The second line uses Caterwaul's function definition syntax, which lets us bind local variables:

f(x) = x + 1          // -> f = function (x) {return x + 1}
f(x = 5) = x + 1      // -> f = function () {
                                  var x = 5;
                                  return x + 1;
                                }

We need to store arguments because Caterwaul generates sub-functions for certain things, including the -re macro. If in doubt, always store the arguments. Here's the actual function we're using to replace the original val:

$.fn.val(args = arguments) =
  this.data('modus') -re [it
    ? args.length ? it.setter.apply(this, args)
                  : it.getter.call(this)
    : original_jquery_val.apply(this, args)];

We're checking to see whether the current DOM node defines a modus data attribute; if it does, we use the getter and setter from that instead of the regular jQuery val. Otherwise we use the original jQuery val function that we stored earlier. The args.length stuff is similar to jQuery's logic to use val as both a getter and a setter:

myElement.val('foo')  // setter
myElement.val()       // getter
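
In plain Javascript, the replacement val works out to roughly this (a sketch of the intent, not the actual compiled output):

var original_jquery_val = $.fn.val;
$.fn.val = function () {
  var modus = this.data('modus');               // behavior attached by .modus()
  if (modus)
    return arguments.length ? modus.setter.apply(this, arguments)
                            : modus.getter.call(this);
  return original_jquery_val.apply(this, arguments);
};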

Modus takes care of this for you as a matter of convenience. The next bit of code is a little strange:

$.fn.modus(getter, setter) =
  getter.constructor === String
    ? use_named_combinator(this, arguments)
    : this.data('modus', {getter: getter, setter: setter}),
where [use_named_combinator(receiver, args) =
         $.modus[args[0]].apply(receiver,
                                Array.prototype.slice.call(args, 1))]

First I should point out that there are two modus objects. One is the method that is present on each jQuery collection; this is used to set up behavior for the elements in that collection. The other modus is a global object on jQuery; this stores predefined behaviors that can be used later on (list and composite are two such examples).

With that said, the $.fn.modus method checks to see whether you're specifying the getter/setter functions as functions or in some other format. If getter is a string, then it assumes you want to use a predefined behavior; otherwise it assumes you want to set the element's getter and setter directly. Predefined behaviors are expected to be defined as attributes on the global $.modus, which we'll see more of in a moment.

$.modus = capture [
  util = {},

Here's where we set up the global $.modus object that ultimately will contain some useful element behaviors. capture[] is a macro that lets you use function assignment syntax to initialize object key/value pairs:

capture [foo() = bar]   // -> {foo: function () {return bar}}

I don't remember what I was using util for. There are three behaviors that come with Modus out of the box:

Original val function

This is not a behavior; it's just a utility function.

val() = original_jquery_val.apply(this, arguments),

This is kind of subtle. Suppose you're writing a field that lower-cases its input using Modus. Here's how you might go about it at first:

$('.lowercase').modus(getter, setter)
-where [getter()  = this.val().toLowerCase(),
        setter(v) = this.val(v.toLowerCase())]

The obvious problem here, however, is that you'll infinite-loop; val is a call to Modus, which will call your getter and setter functions all over again. What you want to do instead is use the original val from jQuery, which you can get to by asking Modus for it:

$('.lowercase').modus(getter, setter)
-where [getter()  = this.modus('val').toLowerCase(),
        setter(v) = this.modus('val', v.toLowerCase())]

Delegate behavior

This is where the behaviors start, and unfortunately with something of a hack.

delegate(getter, setter) = capture [
  first() = this,
  val()   = arguments.length
              ? getter.apply(this, arguments) -then- this
              : setter.apply(this, arguments)]

This basically lets you assign one component's value to another one. You can imagine doing it for a single child:

<div class='person-container'>
  <div class='person'>
    ...
  </div>
</div>

Here, you'd hook up .person-container to use the val function for .person:

$('.person-container').each(function () {
  var person = $(this).find('.person').first().data('modus');
  $(this).modus(person.getter, person.setter);
});

I don't think I've ever used this behavior and can't remember why I wrote it in the first place.

List behavior

This is the first combinator that is both useful and interesting. It constructs an array from the children of an element, and allows you to modify the children of an element by setting a new array of their values. The new_element constructor is used to convert from values to elements (more on this below).

list(new_element) = this.modus(
  "+this.children() *[$(x).val()] /seq".qf,
  "this.empty() -se- _ *![new_element(x).val(x) /!it.append] /seq".qf)

This combinator is invoked when you write something like element.modus('list', f). As you can see, it is setting up a getter and setter that close over the function you give it. In detail, here's what each one does.

"+this.children() *[$(x).val()] /seq".qf

This is a closure that maps each child into its value. We start by grabbing the children of the Modus list, then using the prefix unary + to slice it into an Array instance (this is a seq[] macro feature). We then map (specified by *), and for every child element x, return $(x).val() -- thus retrieving the value. The result is bundled into an array that has the same ordering as the DOM nodes.

"this.empty() -se- _ *![new_element(x).val(x) /!it.append] /seq".qf

This is the setter. First, we blow away all children in the node, since we're about to receive a new list. We then iterate through each new value (*!) and construct an element for it. There are a couple of things to notice here. First, new_element receives a copy of the value up-front. This helps you handle cases where there are several different value types that need different display elements (value polymorphism). Second, we then immediately set the value on the new component. This is more of a convenience than anything else; it just lets you omit a call to .val() on the new element if you want to simplify the new_element function.

Note that the .qf modifier on strings just builds a function out of the string's contents:

"_ + 1".qf  // -> function (_) {return _ + 1}

The other Caterwaul macro used here is /!, which is a way to wrap stuff into a function call while reducing the number of nested parentheses:

x /!f       // -> f(x)
x /!o.f     // -> o.f(x)

Composite behavior

This lets you map arbitrary descendants of elements to object keys. Unlike the list behavior, the composite behavior doesn't create or destroy elements; it just modifies the values of elements that are already in the DOM.

composite(paths) = this.modus(
  "paths %v* [find(this, x).first().val()] /seq".qf
  "paths %k*![find(this, paths[x]).first().val(_[x])] /seq /then.this".qf),
where [find(container, path) =
         path.constructor === String
           ? container.filter(path) /~add/ container.find(path)
           : path.constructor === Function
             ? path(container)
             : raise [new Error('invalid modus path: #{path}')]]

Here, paths is an object that describes the name and location of each piece of the value. For example, consider a DOM container that looks like this:

<div class='github-project'>
  <input class='name'>
  <input class='language'>
  <input class='branch'>
</div>

We want an object that looks like {name, language, branch}. To get this, we just tell Modus which component each key corresponds to:

$('.github-project').modus('composite',
  {name:     'input.name',
   language: 'input.language',
   branch:   'input.branch'});
$('#my-project').val()      // -> {name:     ...,
                                   language: ...,
                                   branch:   ...}

Here's how the getter works:

"paths %v* [find(this, x).first().val()] /seq".qf

We're doing a value-only map (%v* in the caterwaul seq[] library) over the paths object. This means that the object keys will remain the same, but we'll transform the values based on a map function. In this case, the values start out being paths and we're converting them to element values, which we do by first finding the component and then invoking the val() function. Notice, by the way, that we're not using the original jQuery val; we're using the Modus val method. This allows you to nest Modus components.
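
As a rough plain-Javascript rendering of that getter (a sketch, not the actual Modus source; paths and find are the names from the code above):

function getter() {
  var self = this, result = {};
  for (var key in paths)                                   // keys are preserved...
    result[key] = find(self, paths[key]).first().val();    // ...values become component values
  return result;
}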

The setter is similar:

"paths %k*![find(this, paths[x]).first().val(_[x])] /seq /then.this".qf

We're still going through the values in paths, but we need to hang onto the keys (using a for-each-key operation indicated by %k*!) so that we can grab the values of the value object we're setting, which is bound to the _ variable by the .qf string modifier.
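
And the setter, again as a plain-Javascript sketch:

function setter(value) {
  var self = this;
  for (var key in paths)
    find(self, paths[key]).first().val(value[key]);   // push each key of the incoming object into its component
  return self;                                        // /then.this -- return the component for chaining
}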

The most interesting part of the composite behavior is the find sub-function:

find(container, path) =
  path.constructor === String
    ? container.filter(path) /~add/ container.find(path)
    : path.constructor === Function
      ? path(container)
      : raise [new Error(...)]

The goal here is to resolve a jQuery selector with respect to a container, but this is not as simple as you might expect. First, the user could specify a function instead of a string; that function is then called to return the sub-component. For example:

{name: function (container) {
         return container.find('.name');
       },
 language: '.language',
 ...}

Second, the container itself could be a member of a multi-element collection that contributes to the value. (The second case is pathological and I'm not sure why I thought it mattered, but I decided to add it anyway.) To deal with this, we combine the filter output with the find output to make sure that we've included the containers themselves.

There are two new Caterwaul macros used in this section. One is /~method/, which works like this:

object /~method/ value      // -> object.method(value)

The other is raise[], which throws an error. This is useful because the throw keyword is illegal within an expression.

And that's it! Modus is just a combination of these few behaviors.

2012.0610 permalink

The user is a thread

Any user-facing program is by its nature multithreaded. The semantics aren't the usual ones, however. Things that happen within 30ms or so are atomic, since that's below the user's reaction time. Anything else has the potential to create a race condition.

If this line of reasoning is valid, it would mean that applications with inconsistent timing are buggy by default. (I would argue that this is the case.) UI code would be subject to the same constraints as hard real-time code. Most applications certainly aren't written this way, and to some extent they can't be, considering the presence of virtual memory, asynchronous garbage collection, etc. But given the computational horsepower and otherwise compromised user experience of mobile devices, I think it's a promising area of focus.

2012.0601 permalink

An open-source life

Wouldn't it be cool to be able to publish the "source code" of your life? From the outside, someone else's life choices are about as opaque as someone else's web server code. Nobody really knows what kind of information goes in or out, or what the decision-making process looks like. All we see are the outputs of a black box.

The real challenge behind implementing something like this probably isn't the sharing component. Twitter is full of accurate autobiographical information. I think the real challenge is having a meaningful semantic structure behind it. This would enable searching for answers to life situations, or posing hypotheticals and simulating based on people similar to you.

Life may be too chaotic for this to have any chance of being useful for prediction, but even if this is the case it would be interesting to see it fail.

2012.0601 permalink

The market inefficiency factor

I was talking to a friend recently about the role of online advertising in market efficiency. On the one hand, it gives information to consumers and thus allows them to make more informed decisions. But there's also a competition for consumer attention (a scarce resource), and a lot of ads aren't designed to help consumers make rational decisions.

I thought about it some more and ended up coming to this generalization:

                    sales $
inefficiency = -----------------
               sales + product $
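
To make the ratio concrete with made-up numbers: a company that spends $30 on sales for every $70 it spends building the product would score 30 / (30 + 70) = 0.3.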

A completely inefficient market has no product at all; just sales. And a completely efficient market has no sales spending, just product development. We see occurrences of both. Currency exchange markets are almost completely efficient; I don't think anybody runs ads that say things like, "The all-new 2012 Euro. Better and more luxurious than before." They don't need to because the currency exchange market is largely automated, and money is so boring that emotional arguments for or against it are generally not useful.

On the other side, you have things like lottery tickets. This market is almost completely inefficient; a lottery ticket is a mathematically-expected financial loss for the bearer. But lotteries are surrounded by misleading advertising that appeals to people at an emotional level, omitting the only useful piece of information about the product itself. Whoever has the most emotionally compelling advertising, regardless of the odds of winning, will sell the most lottery tickets.

2012.0530 permalink

Why I left Google

This post is getting a lot more attention than I expected or hoped. I wrote it mostly for family and friends who know me well and who might be curious about it. So I need to address some stuff to avoid saying the wrong thing.

Responses to Hacker News comments

  • Replies to DannyBee's comments

    I think I've unintentionally misrepresented Google here. I'm not a lawyer or an expert, and I tried to play things safe and as much by the book as possible, since I didn't know what to expect. I did receive a reply about the project to the effect that it was in an area that Google was interested in, and it was a pleasant surprise when they allowed me to release the code and retain copyright. I'm not blaming anyone from the committee and I completely appreciate their efforts and reasonableness. The two-month complaint was a contrast to my previous job, where open-source software was easier to release. But I don't think Google was remotely unfair or unreasonable.

  • "No mention seems to be made of how long Spencer was at Google"

    I was there for a total of six months. A lot of people have said this is not enough time to get an accurate picture, and I think that's correct. A big part of why I left had to do with family events, and I was unsure about how well I would do considering my biases as a programmer.

  • "[he basically said] 'if you don't adopt functional programming you are a loser'"

    I'm biased about this. I don't think Google employs a bunch of losers. They're worth billions of dollars and I don't even own a house. My post isn't about what Google is or isn't, it's about the problems I faced based on my biases and perspective. I'm certainly not saying I'm right.

  • "LinkedIn shows a series of jobs, each lasting less than one year"

    My job history is choppy primarily because of choices I've made. I've been thinking about this a lot and haven't completely figured out why I've had so much trouble staying in one place. A lot of it comes down to perfectionism on my part, and the kind of impatient and relentless optimization that ultimately leads nowhere. For what it's worth, I've been in a professional career for three years since college, so I've got a lot to figure out. But that should also put this post into perspective; I'm not coming from a broadly informed viewpoint.

  • "the Scala game of integrating every single academic PL research feature of the last 20 years"

    I hate Scala with a passion, for what it's worth. I just wish there had been more advocacy for it.

  • "directly counter to California law [about open source projects]"

    Google isn't trying to be difficult here. The problem is that they do so much stuff that you're inevitably competing with something, or at least it's hard to rule that out. So to be safe, I ran every new project by the committee. They did a great job handling it, too. I just wish the latency had been a little lower. (First-world problems.)

  • "Google's broken hiring process"

    I think Google's hiring process is perhaps the most objective, well-considered, and effective I've ever seen. The stuff I'm complaining about is probably inevitable in a company this size, and like I stated a few times in the post itself, it really is an indictment of me, not them.

  • "noobs are under the impression that HR somehow impacts employees"

    Sorry, I should clarify. When I talk about HR, I'm talking about everything that wasn't related to coding. Vacation time and expense reimbursement, for example. The system was brilliant, easy to use, and efficient. I never felt like I was blocking on any kind of red tape.

  • "Sounds like someone is a bit difficult to please."

    Yes! I don't even like a lot of the technology I write for myself, and I consider myself to be one of the most xenophobic and picky technologists I know. This is not Google's fault by any means.

  • "Abstraction is discouraged" (I wrote this)

    This is probably misleading. I'm talking about abstraction in the general sense; things like metaprogramming, etc. I'm not talking about anything like simple code duplication, which Google is very sensible about. Avoiding clever abstraction is why their engineers rarely kill each other. But I missed having a job where I could write code like this, so it acted as a Con in my decision process.

Also, jrockway has provided an excellent summary from the other end. Seriously, read this if you want a thoughtful counterpoint.

Original post

I resigned from Google about a month ago and have been meaning to write about it. It's always hard to sound grateful when describing why something didn't work out, but hopefully this comes across as more of an analysis of my own flaws than an indictment of Google. On the whole, they are an awesome company and I thoroughly enjoyed my time there.

Also keep in mind that I only saw the part of the company I was working with. Google is a huge place and a lot of my observations probably don't generalize at all.

From a highly subjective and local point of view, here are some of the key factors that influenced my decision. List items are in no particular order.

Technological culture

This is by far the most important factor behind my decision about where to work, and it was ultimately the reason I ended up leaving. I think a lot of this has more to do with my biases than with any particular flaws in Google's technical process. Again, this is all from my point of view and isn't necessarily meant to be objective.

Pros:

  1. Total openness. I could read and contribute to the source code for almost any project.
  2. Well-engineered solutions to hard problems.
  3. Rigorous unit testing and attention to low-level code quality.
  4. Good developers had more influence than bad ones (Google employs some of each).
  5. Very reliable and solid infrastructure to help developers get their job done.
  6. Bugs and features were generally well-prioritized within the context of our project.
  7. 20% is a real thing, and there are lots of opportunities to pursue cool projects. Mine involved machine learning, for example.
  8. Basically unlimited computing resources if you needed them.
  9. Good technological choices were valued and encouraged.

Cons:

  1. Pathological love for Java and anything resembling Java.
  2. Little inclination to solve fundamental problems; most engineering effort spent on superficial implementation.
  3. Most engineers were not comfortable with {functional, concatenative, combinatory, logic, meta} programming.
  4. Therefore, most code was stateful and object-oriented, and no alternative design was seriously considered. [1]
  5. Coding standards and reviews prevented bad low-level decisions, but not bad high-level ones.
  6. Reviews preferred local simplicity over global simplicity; abstraction was discouraged.
  7. Internally-developed languages like Go and Dart failed to solve any significant problems. [2]
  8. UI programming was incredibly tedious due to GWT and its enterprise Java influence, and I as an engineer felt like I couldn't do anything to improve the situation. [3]
  9. Java was viewed as being "good enough"; alternatives like Scala and Clojure were not considered. [4]
  10. Insufficient code ownership for engineers to have a reason to write excellent code.
  11. Out-of-the-box thinking was often not useful.
  12. Productivity was graded without much regard to the amount of technological debt accrued. (Though to be fair, this is a hard problem.)
  13. The overhead of getting things done was often dominated by a complex API. Many internal APIs were excessively complex, in my opinion.
  14. High turnaround time for my own open-source projects. [5]
  15. Gratuitous boilerplate and complexity were widely accepted.

Notes:

  1. I didn't mind the fact that they had chosen OOP for nearly everything. There are good reasons to do this. What bothered me was that a lot of people making these decisions didn't seem to have enough knowledge of alternatives to make well-informed choices from a strictly technological point of view. This probably reflects the conservative nature of a large company, but things like this detract from Google actually feeling like a startup.
  2. This didn't impact me directly, but I felt like it was symptomatic of a larger problem within Google's technological power structure. It also felt like a compromise for the wrong reasons; Google was nominally willing to solve hard problems, but even with people like Guido van Rossum and Rob Pike, they ended up backing out of implementing well-established, if not mainstream, PL research.

    Update (June 4): There's been an interesting discussion going on in the Go Google group, and I should clarify what I meant here. I'm not saying that Go is a bad language to use, or that it's worse than all the others in every way, etc. What I'm saying is that it gets its roots in concurrency from Erlang, a language that, in my opinion, did a better job of addressing concerns like shared-nothing even if in a less-than-performant way. Erlang was also designed from the ground up to be highly concurrent; immutability and such are baked into the language at a low level to make this work. I don't know a lot about Go, but from what I've seen, it appears to be a version of Erlang that made it about halfway to C and didn't really decide to stick to a particular mode of thought (e.g. shared-nothing actor model, shared-memory multitasking).

    Incidentally, Michael Jones has brilliantly captured the idea I'm talking about. Google has the intellectual firepower to be going where no programmer has gone before, yet languages like Go remain solidly in well-understood and, for the most part, commonly-implemented territory. I have no doubt that this is a good strategic move on Google's part, and that Go solves a number of important problems that Google faces, but I was disappointed that Google hadn't taken a more experimental and research-focused approach.

  3. Not all teams have this problem; some write their frontends in Javascript directly, which I imagine makes things nicer. But even the Javascript coding standards are rigid enough to preclude a lot of interesting metaprogramming opportunities otherwise afforded by the language.
  4. This is a personal bias and point of arrogance on my part. Java is "good enough" if you are a MegaCorp who hires mostly inexperienced college graduates; but Java is, by many standards, not a powerful language and not something that makes sense as a mandate for solid, experienced developers. Scala and Clojure each have significant design flaws, in my opinion, and neither would have been a significantly better choice. My objection is not that Google didn't choose one of these, but rather that it's symptomatic of a technological inertia that prevents new ideas from being adopted.
  5. This deserves some explanation. Technically, Google owns everything you write while you work there, even if it's on your own time and with your own equipment. However, they have a committee that reviews things you write and assigns copyright to you provided that they don't conflict with something Google is working on. My experiences working with this committee were completely positive, but there was often a two-month lag before I got an official reply from them. This uncertainty bothered me a lot, since I wasn't sure whether my project could be legally released as open source.

Corporate culture

This wasn't nearly as important to me as the technological culture, but it did still matter. This list is also subjective and may not accurately represent Google as a whole.

Pros:

  1. Efficient, well-managed HR system with minimal bureaucracy and red tape.
  2. Excellent managers, tech leads, and delegation of responsibility.
  3. Excellent salary, benefits, corporate policies, etc. I cannot emphasize this enough.
  4. "Don't be evil" was taken seriously at all levels of the company.
  5. Engineers were given everything required to be happy and do their jobs well.
  6. Top of the line hardware, and easy access to other devices.
  7. Frequent tech talks by guest and internal speakers, and about interesting subjects.
  8. You could leave your desk without anyone assuming you were slacking off.
  9. Minimal corporate politics, at least from where I was standing.
  10. Google employees were generally cool people. I can't think of anyone I didn't like.
  11. Low-stress, low-pressure work environment.
  12. No negative feedback from management or my team lead. All criticism was constructive, and I really had to ask for it to get any. [1]
  13. Every week, upper management answers (sometimes critical) questions from employees.
  14. I felt at ease talking to my manager and team lead about anything that was bothering me.
  15. It was common practice to make snarky memes, often about Google, and send them to other employees. Management snarks back.
  16. Google as a company has a great sense of humor.
  17. Developers could set their own schedule, modulo meetings.
  18. Performance reviews, interviews, and other evaluations were very objective and fair. I never observed any bias of any sort.
  19. User experience was a high priority.

Cons:

  1. Google+. [2]
  2. Ubiquitous political emphasis on Google+ that sometimes compromised other engineering efforts.
  3. Products were not engineered with low-powered devices in mind. [3]
  4. Features outweighed the bugs that came with them. [4]
  5. Performance feedback was infrequent and vague.
  6. A lot of the maintenance I was doing produced a worse user experience, in my opinion. [5]
  7. "More wood behind fewer arrows."
  8. Much more formal than a startup (could also be a pro, but for me it was less fun).

Notes:

  1. This isn't my way of saying, "look how awesome I am." I totally made my share of screw-ups. I think Google management is legitimately very cool about this stuff across the board.
  2. I think Google+ is an effort that does not deserve the engineering minds at Google. This is mostly a personal bias. I see Google as solving legitimately difficult technological problems, not doing stupid things like cloning Facebook. Google, in my opinion, lost sight of what was important when they went down this rabbit hole.
  3. This is a natural by-product of the combination of (1) dogfooding, and (2) giving engineers top-notch hardware. I think both of these practices are fine, but the problem could be addressed by issuing each engineer an underpowered netbook in addition to the insanely powerful machines we were given.
  4. I'm referring to a number of things, but I feel like Google's tolerance for bugs has gone way up since the early days of GMail. They have implemented more features, but the overall software quality is lower now in my opinion.
  5. This was a major downer for me. I thought the original Google UIs were simple, clean, and fast; the new ones felt clunky and buggy, and I felt like the attention to detail that went into the originals was lacking in our redesigns. The decisions behind this were primarily corporate-driven, not engineer-driven.

Update

Some people read this post and concluded that Google would be a bad choice for them. I think that's a definite possibility, but my advice would still be to apply and take a job if you're offered one. There are two reasons I think this makes sense.

First, you could be wrong. You might really love it. And if you do, it's probably one of the top five jobs you'll ever have. Enough brilliant programmers love working at Google that I think it's naive to dismiss this possibility too quickly.

Second, I doubt you'll come out having learned nothing. I learned a lot at Google and left mostly because I wanted to take the time to pursue stuff closer to my interests.

2012.0527 permalink

Developer machines should suck

I've been using an underpowered netbook for software development for the past four years, and it's one of the best things I've ever decided to do. The main reason is that any side-effects like low performance or disk access become immediately visible. Google Chrome takes fifteen seconds to cold-start, for instance. I can run about eight tabs before the system begins swapping to disk. Most web pages are CPU-bound (!) when rendering. (A significant part of this is the Intel Atom's lack of out-of-order execution and small L2 cache, I imagine.)

Why would I advocate reducing developers to such a miserable existence? Because they're the ones who are responsible for creating that existence. It's possible to write fast software, but stuff that crawls on a netbook will appear fast on the latest 8-core Mac with 12GB of DDR3 memory and SSD storage. All of those side-effects, which will probably have a noticeable impact on 80% of the users of the software, become invisible.

2012.0522 permalink

Flat-mapping function arguments

I've been getting back into concatenative programming recently. One of the coolest things about it is the fact that you can erase values completely. Here's what I mean by that. Suppose you want to write a comment function that lets you treat a string (or list, or anything else) as a code comment. You can't do it in most languages. Each expression has a continuation, and each continuation takes a value. But you can easily do it in a stack language:

: comment drop ;              ( forth definition syntax )
"This is a comment!" comment

This works for an interesting reason. drop takes a value without returning anything. This is very different from C, which gives you the illusion of doing the same thing:

void comment(const char *x) {}
printf(comment("This isn't gonna work!"),
       "hello world\n");

Nothing comes out of a void function, but the infix syntax assumes that each expression returns a value of some sort. If it didn't, differentiating between unary negation and binary subtraction would require type information.

Intuitively, the difference seems like map vs flatmap over lists. The rule for map is that you need to transform each value; you can't just make one disappear. flatmap, on the other hand, can be used to encode every other list operation including map, filter, etc. It also provides some nice regularity. Here's the failure mode for applicative style:

(define (uncons c)
  (call/cc (lambda (return)
    (return (car c) (cdr c)))))
(cons (uncons (cons 3 4)))
;; error: too few arguments to [the outer] cons

However, this works just fine in concatenative style:

: uncons dup cdr swap car ;
3 4 cons uncons cons
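
To make the map/flatmap analogy concrete, here is a plain-Javascript sketch; flatmap is a hypothetical helper, not a library function:

// flatmap: each input may produce zero, one, or many outputs
function flatmap(xs, f) {
  var result = [];
  for (var i = 0; i < xs.length; ++i) result = result.concat(f(xs[i]));
  return result;
}

flatmap([1, 2, 3], function (x) {return [x * 2]})           // map:    [2, 4, 6]
flatmap([1, 2, 3], function (x) {return x > 1 ? [x] : []})  // filter: [2, 3]
flatmap([1, 2, 3], function (x) {return []})                // drop:   [] -- values can simply vanish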

2012.0513 permalink

Selection bias in new technologies

Each new thing that comes out seems to improve the state of technology. OOP "conquered complexity", FP "conquered buggy state", etc. Yet once enough people adopt a new paradigm, we start seeing the same old problems pop up again; today's OOP is just as unmaintainable as yesterday's GOTO-laden spaghetti code.

I think there's a selection bias that makes new technologies look artificially effective. Who adopts the latest unproven technologies? Probably enthusiasts, for the most part. Programmers who are inherently curious about things. And it should come as no surprise that, on the average, a curious programmer is probably more effective than a non-curious one.

The real test is whether the new technology holds up after it becomes mainstream.

2012.0508 permalink

The programming language quiz

I've just written an impeccable measuring tool for useless knowledge. The programming language quiz.

2012.0416 permalink

Goto isn't evil

Constructs which are easy to misuse are branded as evil, despite the fact that using them is generally a conscious decision. goto is the canonical example of this. Mutability is another, in FP circles. Neither of these is in itself evil, but they become harmful when misused by incompetent programmers. Similarly, OOP and immutability aren't universal solutions. Either of these can be misused by similarly incompetent programmers to form codebases that are unmaintainable or impractical.

By analogy, cars aren't evil despite the fact that careless misuse of them kills so many people. Nor are explosives, firearms, chainsaws, or any number of other things that end up hurting people. They may be treacherous and dangerous, but such is the world we live in sometimes. (Programmers are no exception; look at Javascript.)

Incompetence is the thing that's evil. Capable programmers make informed, responsible choices about how they implement things.

2012.0411 permalink

The box problem

It can be argued that Java and C++ have roughly comparable performance for CPU-bound tasks. Yet Java is justifiably considered to be a much slower language than C++. What gives?

Most programs aren't CPU-bound. They're IO-bound against memory. An L2 cache miss is often estimated at 100 clock cycles. This is comparable to two IDIV instructions on a fast 64-bit processor (page 28). This delay is so significant that processors can use HyperThreading to create virtual cores just so that they have something to do while waiting for data to load.

In Java, objects are represented and passed as references, while primitive values are stored and passed directly. There's an important consequence of type erasure here: a generic type parameter is always a reference type, so List<Double> stores reference doubles, also called "boxed" doubles. Java avoids this representational overhead in certain cases by providing primitive arrays, which store the values directly.

Boxed values are a problem for four reasons:

  1. Accesses to boxed values require an additional pointer dereference, which may miss the cache.
  2. Allocating boxed values takes up a lot of additional space and therefore requires the GC to run more often.
  3. Boxed values are additional entities that must be garbage-collected, so having many of them makes it slower to find the live set every time the GC runs. (Also, the GC will incur its own cache misses traversing them.)
  4. In Java, changing a boxed value often involves allocating a new one and garbage-collecting the old one; changing a primitive value can be done in a single instruction.

Compilers generally optimize imperative instruction sequences but not data representation. Boxed values are generally an artifact of language or VM limitations, primarily due to the weak type erasure semantics that most likely became mainstream with the introduction of object-oriented programming (they had been present in dynamically-typed languages for some time, however). Until compilers get better at optimizing the way data structures are implemented in memory, there will be a strong case for using low-level languages when performance matters.

2012.0403 permalink

Zero-risk consulting

I'm going to try something new. I have no idea how it will turn out. It occurs to me that there's probably a lot of risk associated with hiring people in various capacities: full-time employees sit at the high-risk end of the spectrum, descending to contractors from managed places like eLance. But with any of these mechanisms, there's enough overhead that you generally don't have a market for micro-work; that is, stuff that might benefit from a specialist but that only takes a few minutes.

So my idea as it stands is to become a zero-risk consultant. Let me know what you think.

2012.0401 permalink

Why microbenchmarks are misleading

The Great Programming Language Shootout contains a fairly prominent disclaimer that microbenchmarks are not an accurate measure of real-world performance. The first time I read this, I remember thinking: "Why not? What about these applications isn't 'real-world'?"

I think the problem isn't that microbenchmarks are somehow fake; they obviously solve very real problems. The real issue lies in the various processes used to write different kinds of software. For example, suppose you're writing a gzip encoder. There's probably a known-optimal solution with a low instruction count and good cache locality, and which is known to maximize performance across a wide range of architectures. And it's probably worth tuning the algorithm like this (maybe even per architecture) because gzip faces such a horizontal market.

On the other hand, consider something like Eclipse. It is written in one of the fastest languages in the world, yet it is one of the slowest pieces of software I have ever used. Nothing about what Eclipse is doing should take nearly as long as it does, but my guess is that the codebase is large and complex enough that effective optimization isn't realistic. Once code gets to be that large, complexity precludes good optimization. The language's runtime performance is made irrelevant by high-level problems like ineffective multithreading, heavy indirection, and swapping to disk.

In other words, microbenchmarks are useful only when the language runtime is the limiting factor. Most real-world applications are limited by concerns like maintainability, good style, sensible abstractions, short timelines, and developers who are not compiler experts. For these projects, the applicability of the language's paradigm to the problem will contribute more to good performance than the vast majority of low-level runtime optimizations.

2012.0327 permalink

Pure functions don't exist

State is evil. Such is what we are led to believe by functional programming devotees who justifiably value purity in code. And it continues: referential transparency is crucial, embrace pure functions, use immutable data structures, etc.

I don't have much of a problem with the philosophical values that motivate this. However, I think there is a particular brand of idiocy that has sprung up from the dogma. It is delusional to claim that a function has no side-effects, or that pure-functional programming languages as they are implemented today even allow you to write such functions. Here's an example in Haskell:

-- Program 1: A pure function.
-- Load this up in ghci and evaluate 'identity 5'.
identity = id
-- Program 2: A similarly pure function.
big_identity 0 = id
big_identity n = id . big_identity (n - 1)
identity = big_identity 100000000000000

The first program will quickly return 5. The second program will use up 100% of your CPU for a few seconds, then start swapping to disk, then die after it exhausts all of the available memory. (I had to reboot because I couldn't interrupt the process due to swapping.) On the face of it, this is patently ridiculous; any human will tell you that it doesn't matter how many times you compose the identity function onto itself, it remains the identity function. It's also not exactly obvious what this code is even doing; the identity function is defined not to do anything at all.

The real fallacy, I think, arises from the fact that time, memory usage, etc. are all side effects, and they can cause just as many bugs as excessive state-sharing in imperative code. At the end of the day, the best way to eliminate undesirable side effects is probably to have a deep knowledge of what your compiler is doing. Abstraction, purity, and correctness don't absolve the programmer from the responsibility of using a real CPU, real memory, and having real performance constraints.

Put differently, languages don't eliminate undesirable side effects. You can easily write pure-functional C if you want to. Programmers eliminate undesirable side effects by making high-level decisions about things that compilers are light-years away from understanding. Purity is a property of design, not implementation.

2012.0323 permalink

Solving the wrong problem?

Source code is represented as text. On the face of it, this is a strange format to use; it requires complex parsing algorithms, makes it easy to write nonsensical statements, and introduces a large layer of indirection when debugging. Yet I am unaware of any remotely palatable non-text programming environment.

This, of course, bugs me. I would love to see text be replaced by something closer to a program's true representation. Yet more than fifty years after the development of the first programming languages, we are still using text exclusively. Why?

My guess is that the proposed solutions don't get at the real problem with using text to represent code. For example, structural solutions that I've seen (and I'm no expert, so I may be overgeneralizing) have worse ergonomics than text, generally take up far more space to represent the same amount of information, and don't provide any particularly compelling advantage other than constraining the programmer to known-valid constructs.

What about a structural representation that did things differently? How about one which let the programmer move the code through invalid states, used less space than pure text, and had superior ergonomics? Given the power of editors like vim and emacs, this may not be possible. And arguably it has already been achieved by IDEs like Eclipse.

Put differently, maybe people like to communicate with computers like we communicate with other people: in writing. Just as we would never diagram our sentences as an alternative to speaking or writing them, maybe it's unrealistic to expect programmers to explicitly state the structure of their code.

2012.0319 permalink

The value of heuristics

Some problems don't have clean, elegant solutions. One example of this is object traversal in Javascript. I was working on a serialization library that crawled through a series of Javascript objects, serializing each one and encoding the object graph as a series of references that could later be reconstructed. In order to do this, you need to be able to reliably mark each object and be able to reverse-map objects to their IDs. And the catch is that Javascript only lets you use strings as object keys.

The hard but correct way to do this is to keep an array of [ID, object] pairs that you could then search each time you encountered an object. But this is time-consuming; O(n^2) overall. Better is to find some way to store the ID directly on each object. This can be done if you choose an extremely improbable name, like extremely_improbable_name_for_serialization_library. But this approach breaks down once you want to serialize the source tree of your serializer (if you ever did want to do such a thing), since the name is present verbatim.

I ended up generating a pseudorandom name using at least 128 bits of entropy. This isn't too hard; Javascript identifier characters (A-Z, a-z, 0-9, $, _) encode 6 bits each, so 22 characters contains 132 bits of information. I chose 128 because for a long time this was considered cryptographically secure; if you could guess someone's 128-bit key then you could see everything they were saying to each other. Pseudorandom data isn't cryptographically secure, but it is difficult enough to predict that it seemed like a reasonable choice.
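
Here's a sketch of the approach (the helper names are made up, and Math.random is only pseudorandom, which is the point above):

var chars = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789$_';   // 64 characters = 6 bits each
function improbable_key() {
  for (var s = '', i = 0; i < 22; ++i)             // 22 * 6 = 132 bits
    s += chars.charAt(Math.random() * 64 >>> 0);
  return s;
}

var id_key = improbable_key();
function mark(o, id) {o[id_key] = id}              // tag an object with its serialization ID
function id_of(o)    {return o[id_key]}            // reverse-map an object to its ID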

The beauty of this solution is that it isn't correct, but it always works. If you can put an upper bound on the degree of pathology of your problem domain, then you don't have to implement something that is universally functional; you just need to handle the most pathological case that comes your way.

2012.0315 permalink

Too little structure

Caterwaul's parser and syntax trees are much looser than the Javascript language specification. I did this mostly out of laziness; it seemed simpler to write an ad-hoc parser and stick to operator-precedence parsing as much as possible. It turns out that my laziness enabled a whole range of very cool stuff to be done. For example, caterwaul can parse expressions like this:

S[for (var L[i] = R[10]; R[i < 10]) S[{_body}]]

This means that you can annotate arbitrary subtrees, including ones in statement mode (!), with markers and then pattern-match against those markers and their contents. Caterwaul has used this for some time; the seq library uses it to descend through subtrees and parse out the sequence operators. And because the caterwaul parser handles all of the cases uniformly, markers can be seamlessly integrated into qs and qse forms.

The interesting thing about this is that I wouldn't have even thought of the marker approach had I been using a highly-structured parser and AST representation, and I would have worked harder to write the parser and AST in the first place. So I've got a new approach: Write APIs with too little structure and fix it later.

2012.0312 permalink

Type theory considered distracting

Assembly language is untyped. By the time you're writing assembly programs, all types have been erased into instruction patterns that happen to reflect the semantics of what you're doing. If you're in a higher-level language, those semantics could even vary at runtime; the fixity of in-memory representation is, from what I can tell, primarily a historical nod to low-level type-erased languages that depend on it being a compile-time invariant. (And compacting garbage collectors loosen this restriction somewhat by making pointers opaque, so we're slowly losing aspects of this codependency even in situations where it seems like it should be difficult.)

C inherits its type system from the constraints of assembly language: Types were not particularly used to guarantee correctness or to prove complex compile-time invariants; rather, they dictated the memory layout in a consistent way. And C has a desirable property because of this. All of its types can be erased at compile-time; there is no runtime type information. This isn't always the fastest way to do things because of the overhead associated with x86 calling conventions and indirect jumps, but the type system arguably leaves off where it makes sense -- before the program is run.

Fast forward to the development of the JVM Hotspot compiler, which includes an impressive array of optimizations including inline caching that deal specifically with compile-time unknowns whose runtime impact is generally mitigated by heuristic observation. There isn't anything wrong with heuristic optimization; generational GC, for instance, makes a tremendous amount of sense. But it's worth thinking a bit about the kinds of optimizations that didn't happen even when they would be relatively trivial to implement.

For example, loop invariant-analyzing compilers do not, to the best of my knowledge, hoist invariants across function boundaries, even when the function is provably monomorphic. This matters in languages like Javascript; for example:

var process = function (data, x) {
  return (data instanceof Array ? f1 : f2)(data, x);
};
for (var i = 0; i < xs.length; ++i)
  process(constant_array, xs[i]);

Here, process and constant_array are loop invariants, but even sophisticated compilers like V8 are unlikely to include an optimization step that inlines the loop body into the single expression f1(constant_array, xs[i]) even after executing it many times. However, if the dispatch were made implicit by using prototype methods, the inline cache would kick in and provide a significant benefit.
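
Here's a sketch of the implicit-dispatch version alluded to above; f1, f2, constant_array, and xs are from the example, and the wrapper types are made up:

// put the dispatch on prototypes so the call site stays monomorphic
var ArrayData = function (data) {this.data = data};
ArrayData.prototype.process = function (x) {return f1(this.data, x)};

var OtherData = function (data) {this.data = data};
OtherData.prototype.process = function (x) {return f2(this.data, x)};

var wrapped = new ArrayData(constant_array);
for (var i = 0; i < xs.length; ++i)
  wrapped.process(xs[i]);        // eligible for inline caching after a few iterations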

2012.0310 permalink

Becoming a dad

Adam Tipping was born two days ago at about 8 in the morning. He's our first, so Joyce and I went from a low-maintenance child-free existence to something totally different. I haven't thought much about technology since Adam was born, so for the next little while I'll post cute baby pics instead.

2012.0307 permalink

Developing without unit tests

About a year ago I removed caterwaul's unit tests. Now it has only a large functional test: can it bootstrap-compile itself? I think the change has been really good for the project, all things considered.

Tests are a time investment. You hope that the effort you spend maintaining tests is less than the amount of effort you would spend debugging/fixing regressions that would have occurred without them. Whether or not this is true depends on a few factors, a significant one of which is how quickly the project requirements change. Caterwaul's API doesn't change very often, but its internals change more often.

Unit tests are also protection against other developers. I deliberately wrote caterwaul to be difficult for other developers to modify so I wouldn't need to worry about this (and because writing software this way is just more fun IMO). I'm convinced that team development has some significant game theory problems, and unit tests mitigate these problems to some extent.

Finally, caterwaul has had some interesting philosophical changes since I removed unit tests. It has gotten simpler because I needed to keep it manageable. If I tried to write some ill-defined, complex library, something would break somewhere and it would take a lot of time to fix. The result is that the code is roughly uniform in its distribution of fragility * modification frequency.

2012.0302 permalink

A Caterwaul bug

Not all bugs are created equal. They range from the stupidly obvious mistakes of late-night coding to the intriguingly subtle nonlocal consequences of reasonable design decisions. In my experience, most bugs tend towards the obvious side of this scale. But sometimes I run into a legendary bit of pathology that is so convoluted or mysterious that it qualifies as a work of art. Yesterday I was fortunate enough to encounter one of these in Caterwaul.

I was a little worried about it because I had seen it happen at least once before and had no idea why. The symptom was that, for some input programs that made heavy use of syntax quotation, waul would die with output similar to the following:

node.js:201
      throw e; // process.nextTick error, or 'error' event on first tick
            ^
TypeError: Object , has no method 'reach'
    at eval at <anonymous> (waul-1.2:140:119)
    at [object Object].each (eval at <anonymous> (waul-1.2:140:119))
    at [object Object].reach (eval at <anonymous> (waul-1.2:140:119))
    at eval at <anonymous> (waul-1.2:140:119)
    at [object Object].each (eval at <anonymous> (waul-1.2:140:119))
    at [object Object].reach (eval at <anonymous> (waul-1.2:140:119))
    at eval at <anonymous> (waul-1.2:140:119)
    at [object Object].each (eval at <anonymous> (waul-1.2:140:119))
    at [object Object].reach (eval at <anonymous> (waul-1.2:140:119))
    at eval at <anonymous> (waul-1.2:140:119)

One of the problems with writing your own programming language is that you also have to write your own debugger. Right now Caterwaul has no debugger, so figuring out why this was happening was a real challenge. Fortunately, I had a hunch: I suspected that I was building a syntax tree with a string child instead of a syntax tree child. Caterwaul lets you use either as a convenience, but the push() method of syntax trees doesn't do that conversion. So I refactored push() to use the same logic as the constructor and figured that would take care of it.

It didn't.

Instead, I saw this in the generated output:

... .call(this, ..., , , , , , , , );

Node was understandably unhappy. Also strange was that my .qse literal expression refs had been replaced by variables called e1, e2, etc (normally they'd be called qse1, qse2, ...). At first I thought I had some macro that was re-expanding something and generating anonymous expression refs, so I poked around. To my surprise, no macro like this existed in the Caterwaul standard library. This meant that the bug was in the Caterwaul core itself.

I copied the failing program and started trimming it down until the problem went away. I found out that constructs like 'g[_x]'.qse /-f/ y would cause an e variable to be created. As soon as I got rid of the /-f/ y, the qse would behave normally. I then tried it out on the REPL and found the same strange behavior:

$ waul -c js_all
waul> '"foo".qse /-f/ bar'.qse.toString()
'f(qse_h_azCSgoP86KF16yaWYRruza)'             <- notice: no 'bar'!
waul> '"foo".qse / z /-f/ bar'.qse.toString()
'f(qse_j_azCSgoP86KF16yaWYRruza,z,bar)'       <- now it's there

At this point it was clear that it was some strange case of the infix function syntax. After some more poking around and random guesses, I found that calling flatten() on expression refs didn't work correctly. It turns out that tree coercion had a subtle bug in this function (it's written on one line in the Caterwaul source):

as: function (d) {
  return this.data === d
    ? this
    : new this.constructor(d).push(this);
}

This appeared to be doing the right thing: Either return the node or wrap it in a new d node and return that. It would be appropriately generic, too: new this.constructor would be closed over the node type.

And that's when everything made sense. Expression refs aren't instantiated like regular nodes. Most syntax trees take the "data", or their name, as the first argument. So if you said new caterwaul.syntax('foo'), you'd have a syntax tree representing the identifier foo. But the expression ref and closure ref constructors are front-loaded to take the value before the name. The name is entirely optional; normally Caterwaul will just use a gensym.

The result of this was that when flatten was called on an expression ref, the expression ref was parenting itself with another expression ref, not a regular syntax node. Expression refs don't have children, so this caused the original expression ref to be hidden and replaced by a nullary comma node. Expression refs serialize to their data, so the nullary comma would be serialized as just ',', hence the syntactically incorrect output.

Long story short, it turns out that as is an anomaly among tree transform methods in that it shouldn't respect type closure. The fix was to write it this way:

as: function (d) {
  return this.data === d
    ? this
    : new caterwaul_global.syntax(d).push(this);
}

If only all bugs could be this cool.

2012.0229 permalink

Performance as a side effect

Someone on Twitter posed a good question recently: "What is state?" (that is, statefulness vs statelessness in code). I thought about that for a while. It's a difficult thing to pin down because you could argue that any stored data is really state. My answer was that state is an implicit dependency on time.

If it's true, this answer has some strange ramifications. For example, performance is another variable that involves time. And I would argue that something with unpredictable performance is, in production scenarios, as dangerous as something that randomly emits side-effects. In the aggregate, it wouldn't surprise me if Java and C++ were actually the two languages that buy the most abstraction for the least total dependency on time.

2012.0226 permalink

Indirect jumps in GNU assembler

I'm not great at assembly-level programming, but I've been getting into it recently to implement a new programming language that I'll probably never finish. While working on it, I decided to use some continuation-passing style patterns to minimize the amount of stack manipulation that would need to happen. So I started out with some code like this:

movq continuation, %r12
jmp f
...
f:
...
jmp *%r12
...
continuation:

If you're familiar with AT&T assembler syntax, you probably won't be surprised to hear that this segfaulted and gdb showed %r12 as having a value way outside of the program's code space. After a lot of reading online, it turns out that you need to do it this way:

movq $continuation, %r12

Linguistically, this makes sense: jmp *%r12 jumps through the address stored in the register (a dereference of sorts), while the $ tells movq that you mean the label's address itself, as an immediate. Without the $, continuation is a memory operand, so movq loads whatever eight bytes happen to be sitting at that label rather than its address -- hence the garbage value in %r12.

2012.0222 permalink

The wrong tool

As a programmer who likes functional paradigms, I have a hard time accepting the fact that Java is so popular. But it is, and so much so that it's noteworthy. Great software is built in Java, C++, C, and other very non-functional languages with tons of mutable state and edge-triggering. (For example, my OS, window manager, editor, terminal emulator, and browser are all written in one of these languages.)

I think there are some interesting dynamics behind Java's popularity and empirical success (and Haskell's empirical non-success, despite being significantly more academically meritorious). What is it that Java has that Haskell doesn't? My best guess is that Haskell gives you the ability to create tools easily; this is the default way of developing. So in order to make progress, you are continuously defining abstractions. Put differently, you have the freedom, and the burden, of choosing your tools.

Java is simple. It gives you one mediocre abstraction that leads to slow, complex, bug-ridden software. Its model of objects is, in my opinion, the wrong tool for any job. However, and this is the genius of it, you don't have to think about your toolset. You just write stuff using the wrong tool, and slowly but surely you make progress without solving unnecessary problems.

Another way to look at it is that this is a reflection of the software process as a whole: Most failures occur in the requirements phase, not design or implementation. By making programmers think about their own requirements during the development phase, languages like Haskell create another point of failure. What interests me particularly is that this point of failure is evidently so large that the average corporate programmer (even at Google!) is better off using the wrong tools than they are taking a shot at inventing their own. Maybe the true measure of a programmer's usefulness is not their ability to write software, but their ability to know what needs to be written.

2012.0222 permalink

Perfection is irrecoverable

Haskell emulates something close to perfection on an imperfect platform. Rather than embracing the platform's idiosyncrasies and imperfections, it attempts to create a world where they don't exist. C, on the other hand, doesn't hide the fact that the platform is imperfect. It simply refines the world into a less imperfect one than the one you'd have writing raw assembly code.

In general, I don't think it's possible to cover up significant amounts of imperfection. It's possible to use the Church encoding for numbers, for instance, but we don't do it because the cost of introducing this abstraction is far too high. I would argue, too, that it's unnecessary. Purity is beautiful, but we're better off asymptotically approaching it than we are living in a world where it's the only option.

2012.0217 permalink

Syntax as a constraint

Someone at some point said something like "constraints breed creativity." It's an interesting thing to say considering that constraints also limit your options. But sometimes the burden of choice is much more significant than the flexibility it provides.

I'm running into this with Mulholland and its very complex operator precedence model. The language doesn't present enough of an opinion to guide library design, and as a result I'm not sure what to do with it. Caterwaul didn't have this problem because working inside Javascript's syntax was very limiting; there was rarely more than one way to do something. And it's one of the most useful (to me) things I've written. The necessary compromises haven't held it back much at all.

Put differently, syntax should be friendly both to library designers and to library users. Users, obviously, should be able to write what they mean and have that translated into something useful. But more subtly, designers should have a set of constructive constraints so they aren't making decisions in a vacuum.

2012.0215 permalink

Software reliability

When I was working for startups, it seemed like something was always burning down. The database would spontaneously become read-only, the middleware would spontaneously fail, or something similarly catastrophic would happen, and everyone would be in red-alert mode to try to fix the problem. It isn't hard to see why, either: we used brand-new technologies that weren't mature.

Big companies don't seem to do it this way, for a simple reason: they can't afford the risk. (Using immature technologies is a huge risk, by the way; when we used Tokyo Tyrant, for example, it would have silently started losing our data once we put more than 40GB into it.) Startups are lured into making risky choices by the promise of more rapid software development.

At the same time, though, I wonder how many startups fail because they jumped the gun and tried to be more technologically savvy (in the use-what's-cool way) than their competitors, only to end up with a technology Jenga tower that collapsed at some crucial moment (or worse, a development team whose performance was crippled by fighting fires). The ideal risk isn't blind, it's calculated.

2012.0209 permalink

Not invented here

I have a terrible case of NIH. This means I reimplement stuff that other people have written because I think I can do a better job (or based on other shaky reasoning). It does end somewhere; I haven't yet written an operating system, browser, terminal emulator, machine-code compiler, or even a replacement for jQuery. But I have written a couple of programming languages, data structure implementations, serialization formats, and other stuff that I could have gotten for free with open-source software.

Strangely enough, I'm not sure I want to outgrow this bad habit just yet. I've learned a lot by reinventing the wheel. It's also been productive in some cases; when I find a problem in my stack I can fix it quickly and move on.

2012.0207 permalink

Occam's Razor

In real life, true facts are often both counterintuitive and simplifying. Finding out that the earth is round is a significant mental leap considering that it looks flat from every angle. But it simplifies our perception of deeper questions like gravity, and it explains how airplanes can keep going west and end up where they started.

The same is true of great platforms and frameworks. Rather than obscuring things or creating noise, they use a less-than-obvious presentation of something that ends up making your world a simpler place.

2012.0205 permalink

Great abstractions

jQuery didn't take long to become the dominant Javascript library, and it's easy to see why. For someone who already knows DOM programming, it's trivial to learn and provides great ways to get stuff done with less work. jQuery also managed to achieve a kind of minimalism; you can't remove much without changing something fundamental. And for many purposes, it has replaced the DOM.

Another platform like this is the C programming language. It allows you to write assembly-level code far faster than you could in assembly, and it conveniently obscures the details of architecture/platform-specific instructions and ABIs. The vast majority of C programmers feel no need to look underneath the hood of GCC, the linker, the calling conventions, the standard C library, or any number of other complex pieces of C's infrastructure.

Each of these abstractions has an interesting property: people who use them rarely find the need to work around them. Put differently, these abstractions don't leak. Most abstractions, of course, aren't like this. X11 is a good example: all of the rendering could be done over its network protocol, but that would be dog-slow and nobody would use it, so implementations leak through the abstraction and use direct-memory rendering for local connections to get hardware acceleration.

Good abstractions vanish

CPU-bound programs written in Perl or Ruby usually take much longer to run than the same programs written in C. As such, there's a case to be made against, for example, writing ray-tracers or programming language interpreters in Perl. C, on the other hand, is not appreciably slower than assembly language for the vast majority of tasks. For most purposes, you'd never know the result wasn't written in assembly to begin with, except that C is so cool that nobody would do that. (Actually, this isn't true. C imposes a lot of structure on the resulting assembly that makes it easy to detect. But none of this structure impedes the program's functionality/performance very much.)

Like C, jQuery doesn't add much overhead. Last time I measured it, creating a Javascript object cost about 1% as much as creating an HTML element; although jQuery isn't erased per se, it is such a thin layer, and it's so aggressively optimized, that jQuery itself is rarely the cause of performance problems.
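
If you want to reproduce that kind of measurement yourself, something like the following rough sketch works in a browser console. The iteration count and property names are arbitrary choices of mine, not from the original measurement, and a modern JIT may partially optimize the object loop away, so treat the numbers as ballpark:

// rough micro-benchmark: plain object creation vs. DOM element creation
var n = 1000000, i, o, e;

var t0 = Date.now();
for (i = 0; i < n; i++) o = {tag: 'div', index: i};        // plain Javascript objects
var t1 = Date.now();
for (i = 0; i < n; i++) e = document.createElement('div');  // detached HTML elements
var t2 = Date.now();

console.log('objects:  ' + (t1 - t0) + 'ms');
console.log('elements: ' + (t2 - t1) + 'ms');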

Here's where things get slightly counterintuitive: both jQuery and C will generally result in a net performance increase over what you would realistically have written by hand, despite each adding some overhead over a perfect hand-coded solution. Not only do the abstractions vanish, but they result in a faster end product. And we use these tools in part because low performance is one of the most harmful side-effects that an application can have.

Put differently, asking the question "is using this abstraction making my application slow?" is just as problematic as asking "is my app failing because this abstraction has a bug?"

Good abstractions are culturally accessible

jQuery uses CSS syntax to select things, and while it did invent (as far as I know) its own accessor/mutator convention, this was so unsurprising and easy to use that things like Java's getter/setter pattern looked clunky and outdated by comparison. C allowed programmers to keep doing the kinds of cool stuff you could do in assembly (pointer typecasting, computed jumps), but made it easier and less error-prone to write programs with consistent structure.
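
For anyone who hasn't run into it, the convention is simple: the same method acts as a getter with no arguments and a setter with one. A minimal sketch (the selectors here are made up for illustration):

$('#title').text();                      // read the element's text
$('#title').text('hello world');         // set it

$('input.email').val();                  // read a form field's value
$('input.email').val('me@example.com');  // set it

// the Java-style equivalent would need separate getText()/setText(), getValue()/setValue() pairs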

Also, and just as importantly, jQuery didn't present some grand unified replacement for the DOM. Instead, it leveraged one of the DOM's core properties (the structured node hierarchy) and made it accessible in a reliable, cross-browser way. C didn't try to manage memory, create a new paradigm, or ignore the fact that you're ultimately writing machine language. Instead, it embraced these things and made them work consistently across every major platform (at least as far as the end-programmer is concerned).

The last, and perhaps most important, component of accessibility is ergonomics. C is much terser than assembly language for common use cases, just as jQuery is often much terser than direct DOM programming.
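
Here's the jQuery half of that claim in miniature; both snippets append a list item to a hypothetical #results list:

// direct DOM programming
var list = document.getElementById('results');
var item = document.createElement('li');
item.className = 'result';
item.appendChild(document.createTextNode('hello'));
list.appendChild(item);

// the same thing with jQuery
$('#results').append($('<li>').addClass('result').text('hello'));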

C and jQuery aren't the only technologies with great ergonomics, of course. The UNIX filesystem and shell are also so useful that there isn't a decisively better alternative. Ditto for text editors as IDE components, despite how non-textual most programs are. Whatever design flaws these systems have, their ease of use is so compelling that we don't want to switch away.

Great abstractions withstand adaptation

Adaptation comes in two ways. One is when the abstraction author changes something, often breaking code that uses the abstraction. This happened with Ruby; 1.9.x isn't backwards-compatible with 1.8.x, yielding workarounds like RVM. Perl and Javascript are examples of the opposite; they have preserved horrible design flaws throughout their evolution so that old programs would keep working without modification.

The other, and more interesting, form of adaptation is when people start misusing something. A great example of this is the C++ template system, which was probably in no way designed with the idea that people would be using it for general-purpose metaprogramming. The fact that this misuse is so reliable that it has become commonplace is a significant compliment to C++ templates. They have held up to unforeseen use where a lesser system would have broken down. The Web is perhaps the greatest example of misuse; what started as a simple linked document management system has ended up nearly replacing desktop applications.