Thursday, April 19, 2007

 

Zip means zip

Elegance isn't everything. I recently had a discussion on Reddit.com about this Haskell code snippet:

sum $ zipWith(*) a b

(It's calculating the dot product of vectors a and b. In other words, a[0]*b[0] + a[1]*b[1] + a[2]*b[2] + ...)

In short - I hate it.
I'm not talking about any intellectual or pragmatic objection - it just hits me with a gut-level "yuck", like a bowl of maggots. So what exactly is my problem with it? Here's my attempt to understand my reaction and explain it better.


Really, the only part I object to is the zipWith function. It's simply poor communication. A program's job, or a large chunk of it, is to explain your intent to other programmers, not to the compiler. It can be helpful to name an abstract concept like this, but it's important to keep the target in sight: you're doing this to make the program more readable. A bad abstraction is just an indirection: it makes the program less clear, not more.


How would you describe the dot product, in words? Perhaps something like...

"Multiply each number in A by the corresponding number in B, and total the results."

Does "zip" appear in there at all? Would anyone use that word to describe what the dot product is doing? "It zips two lists with a multiplication"?


Moreover, the visual metaphor of a "zip" is inappropriate. The teeth of a zip don't meet head on - they interlock, with each tooth held by the two teeth opposite. So the zipWith(*) operation ought to be multiplying each item by the two opposite:

a[0]*b[0] + a[0]*b[1] + a[1]*b[1] + a[1]*b[2] + a[2]*b[2] + ...


All right then, let's make this criticism more constructive. What should this say?

Let's look at what zipWith is actually doing. zipWith(*) evaluates to a function: one that multiplies the matching elements in two lists together. In other words, by applying zipWith we've taken a function (*) that multiplies numbers, and elevated it into one that multiplies lists together. List multiplication.

But we can't call it list(*). So how about this?

sum $ listwise(*) a b

Ok, it's not ideal; there are loads of different ways one could elevate a function to work "listwise", so the ignorant reader might wonder, what exactly is the above doing? Perhaps it's...

Multiplying a single value with a list?
a*b[0] + a*b[1] + a*b[2] + a*b[3] + ...

Multiplying each item in list a with each in list b?
a[0]*b[0] + a[0]*b[1] + a[0]*b[2] + ... a[1]*b[0] + a[1]*b[1] + ...

Or even something esoteric, like multiplying each item with the two opposite? :)
a[0]*b[0] + a[0]*b[1] + a[1]*b[1] + a[1]*b[2] + a[2]*b[2] + ...

There are plenty of other possible patterns, too. Any of these could justifiably be called "listwise multiplication" - the name isn't really giving the reader enough information.

So, we're looking for a more precise name. One that says "listwise, applying to pairs of elements with the same index". pairwise(*) is just as vague (when does multiplication not apply to a pair of numbers?); zipping(*) could work but has the same bad metaphor as before.

What other metaphors could work? Teeth clenching... fingertips touching... passengers moving from one train to another... aha! How about two ships shooting cannonballs at each other?
It's got the right metaphor (each cannonball hits the corresponding part of the opposite ship), and a snappy name that works as an adjective, and makes sense even if you don't make the mental leap to ships.
Ladies and gentlemen, I give you - "broadside".

sum $ broadside(*) a b

This is a function, also known as zipwith, which elevates a function into its "broadside form" - in this case, broadside multiplication. Multiplying two lists together along their broadest sides.

Like it? Hate it? I'd be interested to know what you think.

Comments:
Well I do disagree with your post, if only because the origin of the "zip" part in "zipWith" isn't the zipper, it's the zip function which is a staple of functional programming (I don't know where it comes from, but I do know that SML and Python also have it) which turns 2 (or more if the language handles varargs) lists into a single list of tuples.

In that context, zipWith means "transform these 2 lists in a single lists with the application of this function", so e.g. when zip would transform [a, b, c], [d, e, f] into [(a, d), (b, e), (c, f)] zipWith fun would transform it into [fun a d, fun b e, fun c f].

As a matter of fact, zipWith can be easily defined as mapping over a list generated by zip:

> \f a b -> map (uncurry f) $ zip a b

should work exactly as zipWith does (minor efficiency).

Now of course you can't understand why zipWith means if you don't know about `zip`, but it's very hard doing functional programming and not knowing about `zip`, I think.
 
Laurie, I am quite surprised and rather pleased to see you blogging again!
 
Interesting point. I know what the zip function does, but the conversion from one to the other hadn't occurred to me.

And thanks Jon. I do have some stuff about Flan in the works too; don't hold your breath though, I write very slowly.
 
This comment has been removed by a blog administrator.
 
This comment has been removed by a blog administrator.
 
Post a Comment

<< Home

This page is powered by Blogger. Isn't yours?