Where Are The Bytes?

Created
Sat, 12/06/2010 - 02:31
Updated
Sat, 12/06/2010 - 02:31

A few years ago, I was considering starting a Free Software project. I never did start that one, but I learned something valuable in the process. When I thought about starting this project, I did what I usually do: ask someone who knows more about the topic than I do. So I phoned my friend Loïc Dachary, who has started many Free Software projects, and asked him for advice.

Before I could even describe the idea, Loïc said: you don't have a URL? I was taken aback; I said: but I haven't started yet. He said: of course you have, you're talking to me about it, so you've started already. The most important thing you can tell me, he said, is Where are the bytes?

Loïc explained further: Most projects don't succeed. The hardest part about a software freedom project is carrying it far enough so it can survive even if its founders quit. Therefore, under Loïc's theory, the most important task at the project's start is to generate those bytes, in hopes those bytes find their way to a group of developers who will help keep the project alive.

But, what does he mean by “bytes”? He means, quite simply, that you have to core dump your thinking, your code, your plans, your ideas, just about everything on a public URL that everyone can take a look at. Push bytes. Push them out every time you generate a few. It's the only chance your software freedom project has.

The first goal of a software freedom project is to gain developers. No project can have long-term success without a diverse developer base. The problem is, the initial development work and project planning too often end up trapped in the heads of a few developers. It's human nature: How can I spend my time telling everyone about what I'm doing? If I do that, when will I actually do anything? Successful software freedom project leaders resist this human urge and do the seemingly counterintuitive thing: they dump their bytes on the public, even if it slows them down a bit.

This process is even more essential in the network age. If someone wants to find a program that does a job, the first tool is a search engine, used to find out whether someone else has done it yet. Your project's future depends completely on every such search helping developers find your bytes.

In early 2001, I asked Larry Wall, of all the projects he'd worked on, which was the hardest. His answer was quick: when I was developing the first version of perl5, Larry said, I felt like I had to code completely alone and just make it work by myself. Of course, Larry's a very talented guy who can make that happen: generate something by himself that everyone wanted to use. While I haven't asked him what he'd do in today's world if he were charged with a similar task, I can guess (especially given how public the Perl6 process has been) that he'd instead use the new network tools, such as DVCS, to push his bytes early and often and seek to get more developers involved early. [0]

Admittedly, most developers' first urge is to hide everything. We'll release it when it's ready, is often heard, or — even worse — Our core team works so well together; it'll just slow us down to make things public now. Truth is, this is a dangerous mixture of fear and narcissism — the very same drives that lead proprietary software developers to keep things proprietary.

Software freedom developers have the opportunity to actually get past the simple reality of software development: all code sucks, and usually isn't complete. Yet, it's still essential that the community see what's going on at every step, from the empty codebase and beyond. When a project is seen as active, that draws in developers and gives the project hope of success.

When I was in college, one of the teams in a software engineering class crashed and burned; their project failed hopelessly. This happened despite one of the team members spending long nights for about half the semester coding by himself, ignoring the other team members. In their final evaluation, the professor pointed out: Being a software developer isn't like being a fighter pilot. The student, missing the point, quipped: Yeah, I know, at least a fighter pilot has a wingman. Truth is, one person, or two people, or even a small team, isn't going to make a software freedom project succeed. It's only going to succeed when a large community bolsters it and prevents any single point of failure.

Nevertheless, most software freedom projects are going to fail. But, there is no shame in pushing out a bunch of bytes, encouraging people to take a look, and giving up later if it just doesn't make it. All of science works this way, and there's no reason computer science should be any different. Keeping your project private assures its failure; the only benefit is that you can hide that you even tried. As my graduate advisor told me when I was worried my thesis wasn't a success: a negative result can be just as compelling as a positive one. What's important is to make sure all results are published and available for public scrutiny.


When I started discussing this idea a few weeks ago, some argued that early GNU programs (the founding software of our community) were developed in private initially. This much is true, but just because GNU developers once operated that way doesn't mean it was the right way. We have the tools now to easily do development in public, so we should. In my view, today, it's not really in the spirit of software freedom until the project, including its design discussions, plans, and prototypes, is developed entirely in public. Code (regardless of its license) merely dumped over the wall at intervals deserves to be forked by a community committed to public development.


Update (2010-06-12): I completely forgot to mention The Risks of Distributed Version Control by Ben Collins-Sussman, which is five years old now but still useful. Ben is making a similar point to mine, and pointing out how some uses of DVCS can cause the effects that I'm encouraging developers to avoid. I think DVCS is like any tool: it can be used wrongly. The usage Ben warns about should be avoided, and DVCS, when used correctly, assists in the public software development process.


[0] Note that pushing code out to the public in the mid-1990s was substantially more arduous (from a technological perspective) than it is today. Those of you who don't remember shar archives may not realize that. :)