Dogecoin and the Appeal of Small Numbers

Dogecoin is a unique phenomenon in the fascinating world of cryptocurrencies. It’s barely six weeks old, and as I write this post its network has more computing power than any other cryptocurrency except for Bitcoin. It made headlines this weekend when its community raised enough money to send the Jamaican bobsled team to the Sochi Winter Olympics.

From a technical standpoint, Dogecoin is essentially a branded clone of Litecoin (the second-largest cryptocurrency by total market value). Without a doubt, one of the most important factors contributing to Dogecoin’s popularity is its community. The Dogecoin subreddit has almost 40k users right now. The front page usually has a good mix of humor, goodwill, finance, and technology. Check it out if you haven’t already.

There’s another more subtle factor that I believe plays in Dogecoin’s favor: its tiny value. One DOGE is worth about $0.0015 right now. In other words, one dollar buys you about 600-700 DOGE. Contrast that with Bitcoin: $1 is about 0.001 BTC. This puts Bitcoin and Dogecoin in two completely different mental buckets for most people. One BTC is comparable to an ounce of gold. The press reinforces this idea, and many people view Bitcoin as a digital store of value. The daily transaction volume of BTC is about 0.2 percent of the total bitcoins in existence, which means that BTC does not circulate very much yet.

Contrast this with Dogecoin, for which the daily transaction volume is close to 15%. Where does that money go? Perhaps the most common usage of DOGE is to give online tips. Compare the activity of Reddit’s bitcointip and dogetipbot, and you’ll see the latter is much more active. What would you prefer as a tip, 100 DOGE or 0.00015 BTC? Both are almost meaningless in terms of monetary value, but receiving 100 units of a coin does feel better. It’s also easier to give tips; you don’t have to think much about tipping someone 10, 25 or 100 DOGE. With BTC you either have to choose a dollar amount, or be very careful with the number of zeroes.

The reason a DOGE is worth so little is the total supply of coins. The Bitcoin software has an embedded constant called MAX_MONEY. For Bitcoin it’s set to 21 million, which means that if Bitcoin takes over as a world currency it will be impossible for most people to ever own one. Litecoin is only slightly better, at 84 million. For DOGE, it’s one hundred billion (perhaps more, yet to be decided). This makes it unlikely that one DOGE will be worth $1 any time soon (or ever). It’s easy and fun to exchange $20 for 10k DOGE and give a fraction of them to strangers on the internet. Anyone can still mine hundreds of dogecoins per day with a desktop computer, and not feel very attached to them. Being a “slumdoge millionaire” is still affordable to many.
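To make the supply argument concrete, here is a rough sketch of how many coins each person would hold if the whole supply were split evenly. The supply figures are the ones quoted above; the world population of 7 billion is a round assumption that does not come from this post, and `coins-per-person` is just an illustrative name.

```clojure
;; Maximum coin supplies quoted in the text.
(def max-supply
  {:bitcoin  21000000
   :litecoin 84000000
   :dogecoin 100000000000})

;; ASSUMPTION: a round world population of 7 billion (not from the post).
(def world-population 7000000000)

(defn coins-per-person
  "Coins each person would hold if the whole supply were split evenly."
  [coin]
  (double (/ (max-supply coin) world-population)))

(coins-per-person :bitcoin)  ; => 0.003
(coins-per-person :litecoin) ; => 0.012
(coins-per-person :dogecoin) ; a little over 14
```

Under that assumption an even split leaves everyone with a small fraction of a bitcoin, but with more than a dozen whole dogecoins.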

In a world where people get a kick out of likes or retweets, Dogetips take it up a notch. A Dogetip is an upvote that you can use, internet karma points that are actually worth something. So fun, very value.

Image credit: /u/binxalot, this person deserves tips. Of course I accept them too 🙂

DHpZsQCDKq9WbqyqfetMcGq87pFZfkwLBh

The Harsh, Logical World of Cryptocurrencies

There’s a subtle point that escapes most people buying bitcoin these days: Bitcoins are not exactly something you own. Rather, they are something you know.

Let’s say you buy a bitcoin, and you transfer it to an address. The world now knows that there is one BTC there. This is public information. What the world doesn’t know is the key to use that money. Only the person who knows that key (you and only you, hopefully) can spend it. If you lose that key, nobody will ever be able to spend that bitcoin, and it will stay locked forever.

How is this possible? Why can’t we recover the key somehow? Because it’s like finding an infinitesimally small needle in an absurdly gigantic haystack. There is a very large number of possible bitcoin keys, comparable to the number of atoms in the universe (it’s a 78-digit number). Your bitcoin address is derived from that key, but not in a way that can be reversed. A very rough analogy would be: if you know John, you know his height is 5’10”. If I tell you that someone’s height is 5’10”, you have no idea who that person is.

This analogy doesn’t take us very far; if being 5’10” were the only requirement to spend John’s money, it would be easy to find someone with that height. Now imagine we are traveling around the universe and measuring the diameters of atoms. One day, and for no particular reason, they all expand into random sizes. Some remain tiny, others are as large as a cow, a planet or a galaxy. Your job is to find an atom whose diameter is exactly 176,891,292,523,293.23412 miles. Good luck.
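To put a number on that haystack, the sketch below counts the digits of 2^256 (an upper bound on the number of 256-bit keys; the usable range is marginally smaller, which does not change the scale) and estimates how long an exhaustive search would take. The guessing rate of one trillion keys per second is a purely hypothetical assumption.

```clojure
;; Upper bound on the number of 256-bit keys.
(def key-space (.pow (biginteger 2) 256))

;; Number of decimal digits in 2^256.
(count (str key-space)) ; => 78

;; ASSUMPTION: a hypothetical attacker testing one trillion keys per second.
(def guesses-per-second 1000000000000N)
(def seconds-per-year (* 60 60 24 365))

;; Whole years needed to try every key at that rate.
(def years-to-exhaust
  (quot (bigint key-space) (* guesses-per-second seconds-per-year)))

(count (str years-to-exhaust)) ; => 58, i.e. a 58-digit number of years
```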

When you install a bitcoin wallet on your computer it generates a unique, virtually unrepeatable key for you. This key is 256 random bits, and you can derive your address from it. The second you transfer significant money to that address, you have a problem. You need to make sure that:

a) You never lose your key.

b) Nobody else can ever know your key.
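As a rough illustration of the "256 random bits" part only, here is a minimal sketch using the JVM's `java.security.SecureRandom`. It is not what any particular wallet does: a real wallet also checks that the number falls in the valid key range and then derives the address, both of which are left out here. The function names are hypothetical.

```clojure
(import 'java.security.SecureRandom)

(defn random-key-bytes
  "32 bytes (256 bits) from the platform's secure random source."
  []
  (let [buf (byte-array 32)]
    (.nextBytes (SecureRandom.) buf)
    buf))

(defn bytes->hex
  "Renders a byte array as lowercase hexadecimal, two characters per byte."
  [bs]
  (apply str (map #(format "%02x" (bit-and % 0xff)) bs)))

;; 64 hex characters, different on every call.
(bytes->hex (random-key-bytes))
```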

You can see how this is difficult. If you wrote down your key, someone could find it and spend your money. If that happens, you wouldn’t know until you checked the blockchain transactions involving that address. Same thing if you backed it up. What if someone finds your backup drive? What if you upload it to Dropbox, and a disgruntled employee reads your files? On the other hand, you cannot afford to not back up your key. You wouldn’t want to be like the guy whose hard drive containing millions of dollars worth of BTC is buried somewhere in a dumpster.

So how do we solve this problem? Suppose you want to store a life-changing amount of bitcoin somewhere (why you’d want to do that is an interesting question). The best solution I’ve found so far involves two-factor authentication. You can use an algorithm like BIP38 to generate a passphrase-encrypted key, which you can then print and/or store somewhere offline. You may want to have a few paper or digital copies of this key in different locations (why not, it’s cheap). Still, there are issues with this approach:

– You have to know what you’re doing. For example, you have no reason to be online to generate a bitcoin key/address. Technically you don’t even need a computer; you could generate a bitcoin key by rolling a die 100 times, writing down the numbers (3, 6, 3, 2, 1 …) and performing some mathematical transformations with pen and paper. That would be a bit too paranoid, though. It’s easier (and less error prone) to use a live read-only Linux distribution on a computer that’s never been connected to the internet. You’d need to install a trusted, open-source program to generate a BIP38-encrypted paper wallet (you’d copy it from a portable drive, of course). There are several implementations out there. This is my favorite, although it requires a graphical interface. I’ve also written my own command-line paper wallet generator and BIP38 encrypt/decrypt utility (although I urge you to not trust my hasty implementations very much).

– If you use BIP38, now you have two things you can’t lose: your encrypted key and your password. Losing either now means that your money is gone forever. And if you can remember your password easily, it’s probably not good enough. Relevant XKCD:

Even though the BIP38 algorithm is pretty hard to brute-force, a determined adversary who possessed your encrypted key could try billions of combinations in a relatively small amount of time. Therefore, choosing a password for an encrypted key that controls millions of dollars worth of bitcoin is not a trivial matter.
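Going back to the die-rolling idea mentioned earlier, the "mathematical transformations" could be as simple as reading the rolls as digits of a base-6 number. The sketch below is one possible way to do that, not a standard: keeping only the low 256 bits is an assumption of mine for illustration, and since 100 rolls carry slightly more than 256 bits the result is not perfectly uniform. It also stops at the raw number and does not derive an address.

```clojure
(defn rolls->number
  "Reads die rolls (each 1-6) as the digits of a base-6 number."
  [rolls]
  (reduce (fn [acc roll] (+' (*' acc 6) (dec roll))) 0 rolls))

;; Bits of randomness in 100 fair rolls: 100 * log2(6).
(* 100 (/ (Math/log 6) (Math/log 2))) ; about 258.5

;; 2^256, built with auto-promoting multiplication.
(def two-256 (reduce *' (repeat 256 2)))

(defn rolls->key
  "Folds the rolls into a 256-bit number by keeping the low 256 bits.
   ASSUMPTION: this reduction is one simple choice, not a standard."
  [rolls]
  (mod (rolls->number rolls) two-256))

(rolls->number [6 6]) ; => 35, the largest two-digit base-6 number
```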

Are there any better alternatives? Probably not. Some people are partial to the idea of a Brain Wallet, which works as follows: you pick a very complicated passphrase that you can remember, and use a mathematical function to convert it into a 256-bit bitcoin key. Therefore, the “wallet” exists only in your brain. The problem with this approach is that you’re losing significant entropy by picking something you can remember, therefore making the job of an attacker easier (see this Reddit thread about a guy who used an “obscure poem in Afrikaans” as a passphrase).
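For illustration, here is a minimal sketch of the passphrase-to-key step, assuming the function is a single SHA-256 hash of the passphrase (a common choice for brain wallets, though the scheme could use any function). The name `sha256-hex` is hypothetical.

```clojure
(import 'java.security.MessageDigest)

(defn sha256-hex
  "SHA-256 of a UTF-8 string, as 64 lowercase hex characters."
  [^String s]
  (let [digest (.digest (MessageDigest/getInstance "SHA-256")
                        (.getBytes s "UTF-8"))]
    (apply str (map #(format "%02x" (bit-and % 0xff)) digest))))

;; The same passphrase always yields the same 256-bit value,
;; so anyone who guesses the passphrase has the key.
(sha256-hex "abc")
;; => "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"
```

The determinism is the whole point, and also the weakness: an attacker can hash likely passphrases in bulk and check each result against the public blockchain.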

Could we go the opposite way perhaps? Could we generate a random key and then come up with a mnemonic that encodes 256 bits of information in a way that most people can remember? I gave the problem a little bit of thought this morning and came up with a few ideas, none of which convince me fully:

– Memorize your key as a decimal number. It’s “only” 78 digits, and you could use the techniques described in Moonwalking with Einstein. I’m sure I could remember 78 digits for an hour or two. Next year? Forget it 🙂

– Use the 256 bits to generate a fictional character. For example, the first bit could be man / woman. The next six bits could be age (say 18 to 81). Favorite color, height, weight, country of origin, occupation, etc. I made a reasonably long list of attributes that I thought I might remember about someone I really cared about, and it was hard to get past 100 bits.

– Generate meaningful text, perhaps a poem or a story, using arbitrary rules, e.g.

Random word from [ A/The/My/Your/One/His/Her/Our ] -> 3 bits

[Color name picked from a list of 16] -> 4 bits

[Nationality from 128 countries] -> 7 bits

[Type of animal from the most memorable ones, maybe 6 bits]

For example, the first 20 bits might encode into “your blue Zimbabwean otter,” and that would be the start of a story generated by other rules that consumed the remaining 236 bits. It would be mandatory to lay out all your choices beforehand, and strictly adhere to what the 256 bits of your random key had dictated. The structure could yield something more memorable than the on-the-fly narratives “memory athletes” use for short-term recall. Even though it’s clearly possible to do this, I wouldn’t trust my memory to precisely remember such a story for years. It’s still a fun thought experiment.
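The bit-slicing step behind those rules can be sketched as follows. `split-fields` is a hypothetical helper that cuts a number into fields of the given widths (3 + 4 + 7 + 6 = 20 bits for the rules above) and returns one list index per field. Only the determiner list comes from the rules above; the other word lists are left out, and the sample number is made up for illustration.

```clojure
;; Field widths in bits: determiner, color, nationality, animal.
(def field-widths [3 4 7 6])

(def determiners ["A" "The" "My" "Your" "One" "His" "Her" "Our"])

(defn split-fields
  "Splits n, read as a (reduce + widths)-bit number, into one index
   per field, most significant field first."
  [n widths]
  (loop [n n, ws (reverse widths), out ()]
    (if-let [w (first ws)]
      (let [size (bit-shift-left 1 w)]
        ;; Peel the lowest field off n and prepend it to the result.
        (recur (quot n size) (rest ws) (conj out (mod n size))))
      (vec out))))

;; A made-up 20-bit value, standing in for the first 20 bits of a key.
(split-fields 440593 field-widths)                   ; => [3 5 100 17]
(nth determiners (first (split-fields 440593 field-widths))) ; => "Your"
```

The same function works on a full 256-bit number as long as the widths add up to 256, which is where laying out all the choices beforehand comes in.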

If there is a point to this post, it’s that using bitcoin to safely store large amounts of money long-term is still impractical for most people. There’s a reason bitcoin wallets are not called vaults. Unless you really know what you’re doing, don’t use them to store more money than you would carry in your pocket.

By the way, the key to the address linked in the second paragraph of this post is the number 1. You can see that people send tiny amounts of money to it frequently, perhaps to test out new software. Want to catch a digital dime once in a while? 🙂

If you found this post useful, tips are welcome.

1EmwBbfgH7BPMoCpcFzyzgAN9Ya7jm8L1Z

Hacker News discussion of this post.

Markov Chains in Clojure, Part 2 – Scaling Up

Last month I posted a very simple Markov Chain generator that takes one paragraph of text and produces gibberish. The one issue is that it doesn’t scale, for two reasons:

1) It reads all the text into memory at once. That’s fine if you’re processing a few paragraphs, but for large amounts of text it’s better to process one line at a time.

2) The word lists contain repeated words. These lists would take much less space if they had word counts instead. For example:

{"He" ("is" "is"), :start ("He" "She" "He" "She")}

Could be

{"He" {"is" 2}, :start {"He" 2, "She" 2}}

Obviously it’s not much of a gain for this simple example. If we were processing a book though, we might have hundreds of sentences that started with He or She. The second structure would be a couple of orders of magnitude smaller.

Let’s start with the second problem. Here’s an elegant code snippet to traverse the list of words and build the structure with counts from a Stack Overflow discussion:
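A minimal sketch of such a function, assuming the input is a flat sequence of words and markers. It follows the description here (pairs of consecutive words, counts built with `fnil`) and is not necessarily identical to the snippet from that discussion.

```clojure
(defn transform
  "Builds {word {next-word count}} from a sequence of words."
  [words]
  (reduce (fn [acc [w next-w]]
            ;; (fnil inc 0) treats a missing count as 0 before incrementing.
            (update-in acc [w next-w] (fnil inc 0)))
          {}
          (partition 2 1 words)))

(transform [:start "He" "is" :end :start "She" "is" :end])
;; => {:start {"He" 1, "She" 1}, "He" {"is" 1}, "is" {:end 2},
;;     :end {:start 1}, "She" {"is" 1}}
```

Without `fnil` the function would need an explicit check for words it has not seen yet. The reduction walks its input one pair at a time, so it can be fed a sequence that is read lazily.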

Notice the usefulness of fnil, it makes the code more concise. More importantly, the above is a lazy function; if we read the file lazily then we can process the input stream without reading it all into memory first. How would you do it?

Below is a snippet of code based on another Stack Overflow question. By the way, isn’t Stack Overflow awesome? There was a time when we had to code without it, or without internet access. Those were the bad old times 🙂
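A minimal sketch of the lazy reading step, assuming one sentence per non-blank line. The names `sentence->words`, `line-words` and `process-file` are hypothetical; `f` stands for any function that consumes a word sequence, such as one that builds the counts.

```clojure
(require '[clojure.java.io :as io]
         '[clojure.string :as str])

(defn sentence->words
  "Splits a line on whitespace and wraps it in :start / :end markers."
  [line]
  (concat [:start] (str/split (str/trim line) #"\s+") [:end]))

(defn line-words
  "Lazy sequence of marked-up words, produced one line at a time.
   ASSUMPTION: each non-blank line holds one sentence."
  [lines]
  (mapcat sentence->words (remove str/blank? lines)))

(defn process-file
  "Applies f to the file's words while the reader is still open.
   f must finish consuming the sequence before it returns."
  [path f]
  (with-open [rdr (io/reader path)]
    (f (line-words (line-seq rdr)))))

(line-words ["He is" "" "She was"])
;; => (:start "He" "is" :end :start "She" "was" :end)
```

The one thing to watch is that `f` has to do all its work before `with-open` closes the reader; a function built on `reduce` does.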

Notice that I sandwiched the words in a sentence between the :start and :end markers because I want to know what words are good starting and ending points for generated sentences. The transform function sees a long stream of words that could be a single sentence without these markers. As anecdotal evidence that this approach is more memory-efficient, running the JVM on my MacBook against a text file with 50k English words takes up 20 MB less than the original program.

Now that we have a suitable structure, how do we pick a word at random, with a probability proportional to the number of times it occurs after a given word? For example, if I have {"He" {"is" 15, "was" 5}} I want to pick "is" 3/4 of the time. This was an interview question that we asked frequently at Inktomi in the late 90s. The simplest solution is to give each word a slice of a range as wide as its count, pick a random number below the total of all the counts, and return the word whose slice contains that number.

In the above example, we have two slices:

0 - 14 -> "is"

15 - 19 -> "was"

so choosing a random number between 0 and 19 and then checking what slice it belongs to would yield “is” 75% of the time and “was” the remaining 25%. Here’s a nice implementation from Rich Hickey:
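As a minimal sketch of the slice technique (written for this example under the hypothetical name `weighted-pick`, and not necessarily the implementation referred to above), the function below walks the word counts and subtracts each slice from the random number until it lands inside one.

```clojure
(defn weighted-pick
  "Picks a word from a {word count} map with probability proportional
   to its count. r must be an integer in [0, total); when omitted it
   is chosen at random."
  ([counts]
   (weighted-pick counts (rand-int (reduce + (vals counts)))))
  ([counts r]
   (loop [[[word n] & more] (seq counts), r r]
     (if (< r n)
       word                     ; r falls inside this word's slice
       (recur more (- r n)))))) ; skip the slice and keep looking

;; Trying every possible r once reproduces the counts exactly.
(frequencies (map #(weighted-pick {"is" 15, "was" 5} %) (range 20)))
;; => {"is" 15, "was" 5}
```

Passing `r` explicitly makes the function easy to check; calling it with just the map gives the random behavior needed for generating text.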

And that’s pretty much it. The entire code is here. I ran it against this collection of Hacker News headlines from this post and generated a few of my own:

  • “Microsoft’s Decline of Employee #1, Leaving Github”
  • “Y Combinator’s First iOS 6 Countdown”
  • “Congress Is the HipHop VM”
  • “Welcome to Get From Russia in Javascript charts and Numpy”

Where do we go from here? We could change the code to pick only sentences of a certain length, or to make chains of n-grams (instead of single words) to create more plausible text. Have fun!