When I Hear “Fastest Growing” I Reach for XKCD

You may have heard of Christiane Amanpour. She’s the Chief International Correspondent for CNN. She probably has a dedicated room in her house to store all the journalism awards she’s won. Math awards though, she probably has none. Yesterday she opened her segment with a question:

“What is the fastest growing economy in the world? If you said China, you’d be wrong.”

And offered some insightful speculation:

Its staggering rate of more than 17%-a-year growth last year may explain why U.S. Secretary of State Hillary Clinton made a pilgrimage to Mongolia, along with Vice President Joe Biden and before him George W. Bush.

Wow, is Mongolia the up-and-coming economic powerhouse of the world? Not so fast. Its GDP is about $8B, roughly the same as Apple’s profits for one quarter (obviously that’s tiny for a country). China’s GDP is larger by three orders of magnitude, and China is also ahead in per-capita GDP. In other words, Mongolia plays a negligible role in the world economy. Small and poor, it can grow or shrink very fast. Its economy depends largely on a handful of minerals, so it has no diversification. It became a market economy in 1992. This is how it has performed since then (with China for reference):

China’s growth rate looks extremely stable in comparison, doesn’t it? We might as well be comparing Apple, Inc. with a fruit stand. This recent XKCD comic fits perfectly:

Fastest-Growing

Someone please get Christiane Amanpour a copy of Innumeracy, or A Mathematician Reads the Newspaper 🙂 [actually, I’m not joking]

 

Traditional VCs and First-time Entrepreneurs Are not Aligned

If you are an entrepreneur looking to raise your first round of funding ever, you surely must be curious about the different kinds of investors you will pitch. You may be particularly interested in VCs because they have the deepest pockets. There’s something you need to know: it’s very likely that a VC’s investment objectives will not be aligned with your personal goals. I’ll try to explain why.

A VC fund typically has a 10-year investment horizon. In the words of Fred Wilson:

VCs are professional money managers. We are provided capital to invest as long as we can return it to our investors with a strong return in a reasonable amount of time. A strong return is 3x cash on cash. A reasonable amount of time is ten years max.

It is known that for any portfolio of startups, the returns follow a power law. If a VC expects to turn 150M into 450M over ten years, one or maybe two investments must contribute hundreds of millions of dollars. At the time of a big exit the VC will usually own a relatively small stake in a company, so it must be worth billions of dollars. It follows that VCs want every single one of their investments to have that potential. If they believe that an investment will never return more than tens of millions, it’s just not worth their time from a financial perspective.
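To make the power-law arithmetic concrete, here is a toy portfolio in Ruby (every number is invented for illustration, not taken from any real fund):

```ruby
# Toy model of a power-law portfolio: a $150M fund makes 30 investments;
# one outlier returns the bulk of the target, a few exits are modest,
# and most return nothing at all.
fund_size = 150_000_000
outcomes  = [400_000_000] +            # the single big winner
            Array.new(4, 15_000_000) + # a few modest exits
            Array.new(25, 0)           # everything else

multiple = outcomes.sum.to_f / fund_size
puts format("%.2fx cash on cash", multiple)  # roughly the 3x target
```

Note that removing the single outlier drops the fund from ~3x to 0.4x, which is why every investment must at least have outlier potential.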

Now let’s talk about you, because you are more interesting to yourself. You are the founder and CEO of GulpMonger, three years out of college, fresh out of Y Combinator. Typical first-time SV entrepreneurs are not millionaires, so let’s say you have 100k in the bank (maybe you’ve made some money from stock options). For most people the utility of going from a net worth of 100k to a few million is huge. In many cases it means not having to work for a living. Paul Graham calls it “solving the money problem.” Also for most people, the difference in utility between 5M and 50M (or 500M) is not as significant.

I’ve spoken with dozens of first-time entrepreneurs in the past year alone. Pretty much all of them admitted that they would party like it’s 1999 if they could get a few million out of their startups. I’ve also seen quite a few such exits; because of the power-law distribution of returns, the odds of a modest exit are disproportionately higher than those of becoming a multi-billion-dollar company.

Let’s say you raise money from a VC because you are honestly open to all sorts of outcomes. You don’t know how big and how fast GulpMonger could grow. Given what you know today, you are willing to go as far as possible. The VC believes that GulpMonger has a high enough chance (say 3%) of becoming the next Dropbox or Twitter. The VC also believes that the odds that the company will take an early exit if it’s doing well are relatively low. This is crucial to a VC: if Facebook had sold to Yahoo for one billion dollars in 2006, the 10-year return for Meritech Capital Partners would look very different (and not in a good way).

Now let’s say that GulpMonger starts doing well, and attracts the interest of potential acquirers. Perhaps you get a serious offer, and you do the math: saying yes would make you a multimillionaire with 100% certainty (let’s leave the issue of vesting aside, as it may not change the order of magnitude depending on what you negotiate). You could also choose to go long, and hope to continue growing. Perhaps in two or three years the company would be worth ten times more, and the odds of that could be 30%. You guess that there’s also a 30% chance of being worth diddly-zippo-nil by then.

If you believe what I just stated, it would be irrational not to sell GulpMonger. On the other hand, things look very different for a VC. A 30% chance of 10x (and maybe a 3% chance of 100x) makes much more sense than a 100% chance of x. The rational thing to do is to oppose the sale. Of course, this only makes sense if opposing the sale doesn’t hurt the above chances.
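In raw expected dollars, going long actually wins, which is exactly why the VC pushes for it; what flips the founder’s decision is the concave utility of money. A toy calculation (every number is invented, with logarithmic utility as a stand-in for the founder’s preferences):

```ruby
# Hypothetical figures: the founder has $100k in the bank and a sure
# exit worth $5M on the table. Going long gives a 30% chance of a 10x
# outcome, a 30% chance of zero, and (say) a 40% chance of roughly the
# same exit later. Model the founder with log utility of net worth.
base     = 100_000.0
exit_now = 5_000_000.0

u_sell = Math.log(base + exit_now)
u_long = 0.3 * Math.log(base + 10 * exit_now) +
         0.3 * Math.log(base) +
         0.4 * Math.log(base + exit_now)

ev_long = 0.3 * 10 * exit_now + 0.4 * exit_now  # expected dollars: ~$17M

puts u_sell > u_long      # the founder prefers the sure exit
puts ev_long > exit_now   # the VC's expected-dollar math says go long
```

Same numbers, opposite conclusions: the founder maximizes utility, the VC maximizes expected dollars across a portfolio.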

At this point things can get ugly. If the CEO would rather sell than go long, he/she could get replaced. The VC may use a divide-and-conquer approach on the founders to block the sale and/or oust the CEO. There will be resentment one way or the other. It happens all the time, but in most cases people don’t hear about it.

What can you do to avoid the above scenario? Two things come to mind: one, decide what you want before you start playing the game. E.g. do you want VC money? If so, when? Two, educate yourself. If at all possible, meet with experienced entrepreneurs who are not invested in your outcome. Learn as much as you can from their experiences. You’ll still have to make it up as you go, because experience transfusions have not been invented yet (I’d like to hear that startup pitch). However, knowing is some percentage of the battle. I wouldn’t affirm it’s fifty, though.

If you found this post interesting, I recommend that you invest 90 minutes in watching Something Ventured, a documentary about the history of venture capital in Silicon Valley (hat tip to Elad Gil). Keep in mind that these guys are trying to look their best, so read between the lines 🙂

Final caveat: what I just said does not apply to ALL investors who call themselves VCs, and does apply to investors who prefer other labels. Your mileage may vary. Shop around. Void where prohibited. Offer valid for residents of Silicon Valley mostly. Safe travels.

PSA: The 32-Bit Version of MongoDB Is a Toy

One thing that wasn’t clear enough from my previous post: the reason I ran into the “silent error” issue was that I’d installed the 32-bit version of MongoDB. The 32-bit version of MongoDB is limited to about 2GB of data. In other words, it’s a toy. As many people pointed out, the Downloads page of MongoDB pretty much admits as much.

The problem is, why would anyone need to go to the Downloads page? I’d installed MongoDB on my Mac (64-bit) through Homebrew, and on 32-bit Ubuntu via APT:

~$ apt-cache search mongodb
mongodb - An object/document-oriented database (metapackage)
mongodb-clients - An object/document-oriented database (client apps)
mongodb-dev - An object/document-oriented database (development)
mongodb-server - An object/document-oriented database (server package)

How would you know from those descriptions that you’re installing a toy? If you release a toy, at the very least you should show a big warning when the service starts. Let’s see:

~$ sudo service mongodb start
mongodb start/running, process 7680

How about “Hey, you just started the 32-bit version of MongoDB, which is LIMITED TO TWO GIGABYTES (2GB).” Nah, it’s on the Downloads page. It reminds me of this scene from The Hitchhiker’s Guide to the Galaxy, which I quoted in the comments:

“There’s no point in acting surprised about it. All the planning charts and demolition orders have been on display at your local planning department in Alpha Centauri for 50 of your Earth years, so you’ve had plenty of time to lodge any formal complaint and it’s far too late to start making a fuss about it now. … What do you mean you’ve never been to Alpha Centauri? Oh, for heaven’s sake, mankind, it’s only four light years away, you know. I’m sorry, but if you can’t be bothered to take an interest in local affairs, that’s your own lookout. Energize the demolition beams.”

Anyway, I hope that I’ve mentioned the limitations of the 32-bit version of MongoDB enough times for people to find them through Google if needed.

For the record, I don’t hate MongoDB or think it sucks. I quite like it. I just think it’s still an immature technology. Hopefully it will become a viable alternative to time-tested databases. Maybe in ten years? 🙂

I’ll Give MongoDB Another Try. In Ten Years.

A few weeks ago I wrote a small app that fetches JSON documents from app.net’s API and draws a word cloud. At first I wasn’t keeping the content around after generating the images. Later I thought of other things I’d like to do with the documents, so I decided to start storing them.

I’d never used MongoDB, and I have little interest in the NoSQL hype (particularly for my own toy projects). However, it seemed like a good fit for what I wanted to do: store and query JSON documents without worrying about schemas. I followed the MongoDB Ruby tutorial, which shows you how simple it is to insert documents into Mongo:

doc = {"name" => "MongoDB", "type" => "database", "count" => 1,
       "info" => {"x" => 203, "y" => '102'}}
coll.insert(doc)

So, one gem install and two lines of code later I was happily inserting documents into a MongoDB server on my puny AWS Micro instance somewhere in Oregon. It worked just fine for all of three weeks.

Yesterday I decided to compute some stats, and I discovered that the most recent document was four days old. Hmmm. I checked my script that fetches and inserts documents; it was running and there were no errors in the log. MongoDB seemed fine too. What could be the problem? Long story short, this blog post from three years ago explains it:

32-bit MongoDB processes are limited to about 2 gb of data.  This has come as a surprise to a lot of people who are used to not having to worry about that.  The reason for this is that the MongoDB storage engine uses memory-mapped files for performance.

By not supporting more than 2gb on 32-bit, we’ve been able to keep our code much simpler and cleaner.  This greatly reduces the number of bugs, and reduces the time that we need to release a 1.0 product. The world is moving toward all 64-bit very quickly.  Right now there aren’t too many people for whom 64-bit is a problem, and in the long term, we think this will be a non-issue.

Sure enough, my database had reached 2GB in size and the inserts started failing silently. WTF zomg LOL zombie sandwiches!

This is a horrendous design flaw for a piece of software that calls itself a database. From the Zen of Python:

Errors should never pass silently.
    Unless explicitly silenced.

There is a post on Hacker News by someone who doesn’t like the Go language because you have to check errors in return values (that’s where I got the quote above). This case is worse because, as I just said, MongoDB is a database (or at least it plays one on the web). If you tell a database to store something and it doesn’t complain, you should be able to safely assume that it was stored. In fact, the Ruby tutorial never tells you to check any error codes. That’s what exceptions are for.

This gave me a nasty feeling about MongoDB. If something so elementary can be so wrong, what other problems could be lurking in there? I immediately switched to CouchDB (once again because it was pretty trivial), but if this were a serious project I’d be using Postgres. I’d spend the extra hour figuring out the right schema, or maybe I’d even try the new JSON support in Postgres 9.2.

Wait a second, maybe I should reconsider. After all, relational databases were not designed for the web. And MongoDB is Web Scale.

Slap me silly on Hacker News or maybe Reddit Programming  🙂

Search for Obama on Facebook and You Get Romney

Last night I saw this tweet:

Of course I went to try it myself. Here’s what I saw:

Facebook may be the $50B pound gorilla of social networks, but when it comes to search it’s just a toddler. I want to believe I know a thing or two about search by now, so let me explain why the above is terrible in a number of ways.

Search and advertising have one thing in common: they are specific instances of the Matchmaking Problem. As a channel for search results or ads, you are trying to find a match between two parties where each one has something the other wants:

  • In search, one party wants information. The other wants eyeballs.
  • In advertising, one party has money. The other has product to sell.

Why did Google get to dominate the world of online advertising? One word: relevance. If I search for something and the result is an ad, I may not perceive it as an ad if it’s exactly what I want. For example, I searched Amazon for “Macbook Air” just now. Would you say the results below are ads? Does the distinction even matter in this case?

As a broker between two parties, your job is to make transactions more efficient. If Amazon tried desperately to sell me a Dell machine whenever I search for a Mac, they would be wasting my time. Presumably they would be wasting Dell’s money too, if Dell paid Amazon for this service. This is bad for me and Dell in the short run, and for Amazon in the long run: they might lose me as a customer and/or alienate Dell and Apple.

So the question is: why did Facebook think it was a good idea to take Romney’s money and display his page as a sponsored result for the search Obama? I bet 99% of the people who type Obama DO NOT want to see Romney. I also think this is a waste of Romney’s money (which may be a good thing from my point of view, but this is irrelevant).

I’ll attribute this to Facebook’s inexperience with sponsored search results. This is certainly not good for Facebook’s user experience, but they must have thought the revenue made it worthwhile. If so, that’s shortsighted. An honest broker would have told Romney’s staff that buying the word Obama was a poor use of their money. I assume Facebook would want to make them a happy customer: Romney may not be winning this election, but there can be no doubt that the GOP will be spending truckloads of money on advertising for the foreseeable future.

 Hacker News discussion for this post.

Optimizing for the Right Thing (Yahoo and Ads)

Hilary Mason says that Yahoo is optimizing its mail interface for the wrong thing. The gist of her complaint is that Yahoo’s emphasis on making users click on ads creates a horrible experience for users.  The implication is that Yahoo should be optimizing for user experience.

I sympathize with the sentiment; I am annoyed by unwanted ads as much as the next nerd. However, Yahoo has chosen to be an advertising business. Asking them to optimize for user experience would be like asking NBC to broadcast shows without ads. Bad NBC trying to make me buy soap or Romney or Happy Fun Balls.

The question of what Yahoo should be optimizing for is not that simple. All ad-based web services face a conflict: on one hand, advertisers want the best possible return for their money. On the other, users want the best possible experience. The extremes would be:

  • show no ads whatsoever and focus on the most pristine user experience.
  • plaster the page with as many ads as you can sell, until the page becomes unusable.

Here’s an example of a site that is closer to the latter (GoDaddy is always a good whipping boy):

Obviously the extremes don’t work. If you are an ad-based business with no ads, you make zero revenues. On the other hand, if all you do is show ads then you have no user experience, so you will end up with zero users (and obviously, nada de dinero).

This is reminiscent of another famous economic problem: determining the tax rates that maximize government revenues. Tax rates of 0% and 100% yield no revenue (with 100% nobody would do any work). This concept is illustrated by the Laffer Curve.

Economists agree that there must be a tax structure that doesn’t force people to emigrate or stop working, while at the same time yielding near-maximal revenues. Of course, finding it is another matter.

To complicate things for Yahoo, it’s much easier to move to another site than it is to leave a country. Countries can make it hard for people to emigrate, so they can get away with unfair tax schemes for relatively long. This comes at the expense of other variables because there is no free lunch in our globalized economy, but I digress.
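The Laffer analogy for ad load can be sketched with a toy model (all numbers invented): revenue per page grows with the number of ads, but the audience shrinks as ads pile up, so both extremes earn nothing and the peak is somewhere in between.

```ruby
# Toy "Laffer curve" for ads: users abandon the site linearly as the ad
# count grows; revenue is proportional to (remaining users) x (ads).
def revenue(ads)
  users = [1_000_000 - 50_000 * ads, 0].max
  users * ads
end

best = (0..20).max_by { |ads| revenue(ads) }
puts best            # the peak sits in the middle, not at either extreme
puts revenue(0)      # no ads -> no revenue
puts revenue(20)     # all ads -> no users -> no revenue
```

The real curve for any given site is of course unknown; the point is only that it has this general shape.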

So, what should Yahoo optimize for? Clearly Yahoo’s demographic is not the same as those of Google, Twitter or Facebook. Only they know what amount of ads will make users leave their contacts behind and run for the Gmail border. Given how long Yahoo has been around, I have to assume they know their position on the “Laffer Curve” of web advertising very well. I suspect one problem for Yahoo is that their demographics are changing, and they haven’t completely re-optimized yet. Their revenues have been decreasing faster than their eyeballs.

Judging by Hilary’s post though, they are doing something right. Her grandparents didn’t seem too annoyed with the state of Yahoo Mail until she came to visit and showed them the true path. Luckily for Yahoo, there are not too many Hilary Masons on the planet 🙂

Discuss on Hacker News if you’d like.

Social Networks Implode Quickly

Friendster, Myspace, Bebo. They were huge not long ago. Do social networking sites tend to die faster than other types? My guess is that they do, for two main reasons:

1) Metcalfe’s law. A social network’s value increases faster than linearly as the number of users grows. Obviously it goes both ways. This is different from a content site: the content in Geocities or Tripod continued to be useful to individuals for a long time after authors stopped updating it.

2) “Coolness” factor. If your cool friends stop using Friendster or Myspace, you don’t want to use them either. It’s ok to lurk, but not to be seen doing anything. On the other hand, your Google searches may take you to content sites that nobody updates anymore. You can visit a Tripod page. No one needs to know.
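The first point is worth quantifying. If a network’s value tracks the number of possible connections, n(n−1)/2, then losing half the users wipes out roughly three quarters of the value:

```ruby
# Value proportional to the number of possible pairs of users
# (the usual back-of-the-envelope reading of Metcalfe's law).
def metcalfe_value(users)
  users * (users - 1) / 2
end

before = metcalfe_value(1_000_000)
after  = metcalfe_value(500_000)
puts 1.0 - after.to_f / before  # ~0.75: half the users, a quarter of the value
```

That superlinear decay is what turns a steady user exodus into an implosion.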

Compare the rise and fall of Friendster with the slow decay of Geocities during the same time frame:

Friendster

Geocities

Of course these two charts are far from scientific evidence. I spent an hour looking at traffic data (not just search trends) for a number of social networking sites. They all show the same quick decay when compared to content sites. You could argue that all these sites were killed off by the meteoric rise of Facebook, but that’s partially the point. If these sites had interesting content that was publicly accessible, we’d still be seeing residual traffic like we did with Geocities. Even today, Tripod’s Alexa ranking is 926 (compare with Friendster at 15k or Bebo at 4k).

What could this mean for Twitter and Facebook? For starters, Twitter’s content is ephemeral in nature. It doesn’t contain cultural artifacts with long-term value such as the lyrics to the Mr. Ed theme. This means that if users stop tweeting, traffic will likely implode. Also, the barrier to exiting Twitter is very low. Most of us don’t have that many followers, and we could easily “tweet” on Facebook, LinkedIn, app.net, etc. If Twitter disappeared tomorrow (or if I got kicked out), the three or four people who care could find me easily 🙂 What makes Twitter really valuable is the tiny fraction of people who have amassed a large audience. These people are more invested, and they tweet more often. If they became alienated and went elsewhere, so would their followers.

The story is a bit different for Facebook. It’s harder to leave because your friends and family are there. You could replace many of the people you follow on Twitter by a different set, and still get most of the same value. However, your friends and family are unique. Four years ago Cory Doctorow argued that your creepy ex-coworkers would kill Facebook. Obviously that hasn’t happened yet (here’s a Hacker News discussion about what they have been doing to prevent it). The main risk to Facebook is that it will become “uncool” like Friendster and Myspace before it. So far Facebook has managed to avoid that fate.

The main point of this post is that there are two kinds of companies: those that erode and lose market share gradually over time, and those that are prone to a quick collapse. Social networking sites seem to be the most unstable company type the world has seen.

TL;DR: Big social networks need to take advantage of the spotlight, and solidify their position to rely less on network effects. Otherwise, they are extremely risky investments.

Hacker News discussion of this post.

Startup idea: short, paid email

Here’s a half-baked thought inspired by a comment Marc Andreessen made on app.net:

The idea: a service where you can receive emails limited to a certain size (say, 500 characters). People would know that if Joe’s address is joe@short.please, he would only see the first 500 characters of whatever you send. If your email to Joe exceeds the length, you’d get a message back saying:

“Joe is using service X, which rejects emails longer than 500 characters. It also rejects mails with attachments or links. Your original message will not be delivered, but if you compose one that fits those requirements, he will receive it.”

Maybe Joe could tweak some of these requirements. He may want only 300-character emails, and he might accept links. No attachments, though.
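A sketch of that filtering rule (the limit, the link check, and all the names here are made up for illustration, not a spec):

```ruby
# Hypothetical server-side check for the short-email service: reject a
# message if it exceeds the recipient's length limit, carries
# attachments, or contains links the recipient doesn't accept.
LIMIT = 500

def bounce?(body, attachments: [], allow_links: false)
  return true if body.length > LIMIT
  return true unless attachments.empty?
  return true if !allow_links && body.match?(%r{https?://})
  false
end
```

A real implementation would live in the mail pipeline and generate the bounce message quoted above; the point is just how little logic the core rule needs.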

Taking it one step further, busy people like Marc could add a price to the mix: you cannot reach him unless you prepay an amount he picks (say, $20). It’s not that he cares about the money; this ensures you’re serious, and he could choose not to cash it after reading your email. He could also experiment with this price until he finds the right level. The “pay to email me” concept is not a new idea, by the way. I remember reading something Bill Gates wrote a long time ago to that effect, as a possible solution for spam.

Once again, this is a 30-second idea with no serious thought behind it. Don’t rip me a new one!

Discuss on Hacker News

Google Surveys: Know What You Are Asking

As I write this there is a post on the front page of Hacker News entitled 34.5% of US Internet Population not using Facebook/Twitter. It sounds like an interesting figure, but it’s not very meaningful. Let’s see why:

The survey the author ran looked like this:

The first problem: he is drawing indirect conclusions. He never asked people directly whether they use Facebook or Twitter. These were the possible answers:

  • Yes – Because it’s easy.
  • No – I don’t understand how it works.
  • Yes – But I hate it.
  • No – I’m scared of scams.
  • I’m not on Facebook/Twitter

There are many other problems with those answers. The survey is not about usage of Twitter or Facebook; it’s about people’s behavior when faced with a login button. Furthermore, the proposed answers use terms that evoke specific feelings in the audience (hate, scared), so respondents will be primed to answer in a biased way. Yes, Prime Minister explains it better than I can:

In other words, the proposed answers suffer the problem of Response Bias.

If I were to run a survey to draw conclusions about usage of Twitter, I would ask exactly one question, with a yes/no answer. For example:

Do you have a Twitter account? [Yes/No]

Of course, the only conclusion I could draw from that survey would be: X% of respondents say they have a Twitter account. I could not make any claims about daily usage, engagement, etc. Designing a poll to get useful answers is serious business. In the discussion linked below, user mallloc47 recommends this book: Asking Questions: The Definitive Guide to Questionnaire Design.

Moral of the story: Google has enough money already. Don’t give them your $150 unless you know you’re getting your money’s worth.

Discuss on Hacker News if you please.

First Month on app.net – Charts and Stats

I’ve been playing with app.net for a few weeks. It’s still a relatively small community: there are about 20k users, and a little over 300k posts so far. For those who haven’t checked out the service, a post is similar to a tweet, only up to 256 characters in length instead of 140. John C. Reilly and Will Ferrell should sign up, there’s so much room for activities!

Some interesting tidbits:

250 users have generated half of the posts. This is the core group of early fans/evangelists. I’m one of them, at #62 the last time I checked.
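For the curious, that figure is computed by sorting per-user post counts and walking down until you pass half the total. A sketch with stand-in numbers (not real app.net data):

```ruby
# Count how many of the most active users account for half of all posts.
counts = [120, 80, 40, 30, 10, 10, 5, 3, 1, 1].sort.reverse
total  = counts.sum

running = 0
top = 0
counts.each do |c|
  break if running >= total / 2.0
  running += c
  top += 1
end
puts top  # users needed to cover half of these posts
```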

The average post length is about 100 characters (for comparison, tweets average around 70).

A high percentage of appnet’s users are developers. Even taking that into account, there is a surprising number of applications that support the service. These are the top 20 in terms of posts:

171852 Alpha (the website)
22993 quickApp
16566 AppApp
14564 IFTTT
11865 Buffer
11043 moApp
10441 Mention
10304 Appetizer
7116 Hooha
6491 #PAN
5534 xtendr
5188 shrtmsg
4565 AppNet Rhino
3201 Felix
3073 Adian
2923 Appeio
1782 Spoonbill
1403 AppToDate
1303 Appnetizens
985 Succynct

The topics being discussed are still fairly homogeneous. Here is a word cloud from this morning, during the Kindle announcement:

If you’d like to keep an eye on ADN’s zeitgeist, I update it every hour: ADN word cloud.

Finally, here’s something that I found particularly interesting. You may have seen the following chart before; it’s the distribution of tweet lengths (in this case, taken from a sample of 10M tweets):

What’s happening here? Clearly many tweets have room to spare, but sometimes we just have too much to say for one tweet. As we get closer to 140, people start editing a tweet (e.g. preferring shorter synonyms, using abbreviations). The large number of tweets at exactly 140 may be the result of apps that truncate content. You would expect something similar to happen at appnet, except with 256 characters. Let’s see:

Observe the peak near 140. This is the result of cross-posting from Twitter, particularly with applications such as IFTTT. As a result, the first portion of the chart has a shape that’s reminiscent of the Twitter chart, and the whole chart is similar too except for that anomaly in the middle. The posts between 141 and 256 characters are “native”, and we see the same struggle to fit into 256 characters (only in this case I believe there is no automatic truncation yet). We get used to the extra space really quickly!
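For reference, the underlying computation behind both charts is just a histogram of post lengths, sketched here with a stand-in sample:

```ruby
# Bucket posts by character count; plotted, this gives the distribution
# charts above. The sample is made up for illustration.
posts = ["short post", "a" * 140, "a" * 140, "b" * 256]

histogram = Hash.new(0)
posts.each { |p| histogram[p.length] += 1 }
histogram.sort.each { |length, count| puts "#{length}: #{count}" }
```

Run over the real 10M-tweet sample, the 140 and 256 buckets are where the truncation and last-minute-editing effects show up.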

If you’d like to see any other appnet stats, let me know. If you are a subscriber, you can follow me there: @dbasch.

Continue the discussion on Hacker News.