This Before That

This article is ostensibly about why the challenge space in an interactive zero knowledge proof has to be large. Understanding this rather obscure theoretical aspect of zero knowledge proofs is quite rewarding intellectually. I promise.

Let me start with a trivial question. How do you convince yourself that something happened before something else?

Here are some possible answers.

  1. You literally see Event-X happen. You wait for a bit. You then literally see Event-Y happen. You know that X happened before Y because you saw it yourself. You are convinced because you trust your memory of what you have seen.
  2. Sometimes, things are physically structured such that Y cannot happen before X. If I see you wearing socks and shoes, I know you could not have put on the shoes before the socks.
  3. Someone you trust tells you that Event-X happened before Event-Y and you believe them.

These are so intuitive that you don’t bother to reflect on this till someone specifically asks you this question. In fact, the 2nd example I gave above is used in many magic tricks – you expect a certain order because of obvious structure, but the magician circumvents that order to enthrall you with his magic. One such trick is where the magician pretends to cut an unpeeled banana with an imaginary knife, and then peels the banana to reveal a precise cut in the same location on the inner fruit. The magic works because it belies the natural order of events that you are used to.

Now that I have asked this seemingly trivial question on ordering, let’s formalize these three modes of what proves order: Witness, Structure, or Trusted Third Party.

  1. Witness: You personally witnessed something happen before something else.
  2. Structure: The structure of the two events is such that one cannot happen before the other. This could be due to physics, or the natural fixed dependency between two events, like birth and death.
  3. Trusted Third Party: A newspaper, a notary, an atomic clock which knows “real time”, etc.

Now that we have set the building blocks of our discussion, we can get into the main question. Let’s say Alice sees 2 events happen in succession. She wants to convince Bob that this is the case, but Bob wasn’t there with her at the same time. Alice is convinced because of mode #1: she was herself a witness to the order. Bob was not there at that time. So, mode #1 is ruled out. How does Alice convince Bob of order using mode #2 or mode #3?

Structure

If there is structure between these events, it’s self-evident to Bob that Alice is right. No real proof is required. But structure is not always as straightforward as birth/death, socks/shoes, peeling/cutting a fruit, etc. Bob has to verify the structure thoroughly to see if Alice is playing any tricks. For example, in the pre-cut banana magic trick, Bob must look for a tiny pin-sized hole in the banana peel to see if Alice the great magician inserted a needle in the banana and moved it around to cut the fruit before beginning the trick. This would tell Bob that the banana was pre-prepared, and the structure assumption doesn’t hold anymore. Structure works, but only if Bob can convince himself that the structure was not tampered with by Alice, and that there are no magic tricks or shoe-contortions that allow Alice to wear her socks after she has worn her shoes.

Trusted Third Party (TTP)

TTPs work, but there are just not enough of them around. We have newspapers, atomic clocks, notaries, and such, but they don’t cover every event that we are interested in. How does Alice get a TTP to notarize that Event-X happened before Event-Y?

Here’s one approach: Alice waits for a TTP to output a signed timestamp. The TTP doesn’t even know that Alice exists. A newspaper, for example, prints a copy on physical paper every day. This counts as a signed timestamp. Alice then combines this signed timestamp and Event-X as two inputs to Event-Y. Bob sees Event-Y, and is convinced that it happened after both Event-X and the printing of the newspaper. Bob is convinced because he sees that the timestamped newspaper is an input to Y, which means that both the newspaper and Event-X must have been available before Y. This is similar to how movie kidnappers prove that their hostage is alive at a certain time. They make a videotape of the hostage holding that day’s edition of a newspaper. In this example, the newspaper is the TTP; the hostage being alive is Event-X. The hostage holding the newspaper while being videotaped is equivalent to the timestamp and Event-X being given as inputs to Event-Y.

In fact, this approach is a slightly modified take on mode #2 (structure). Alice is using the newspaper in the video to convince Bob that the newspaper was printed first. Alice then made a video with the newspaper clearly visible in it, and these two events could have happened only in that order. The only two ways it could have happened in the opposite order are:

  • Alice predicted in advance what the newspaper frontpage would look like in the future. She then prints that newspaper herself, and uses it in the video. The newspaper is eventually published exactly like she had predicted, and Bob is now convinced. Alice is unlikely to be able to pull this off because she doesn’t know the future.
  • Alice is friends with the newspaper editor, and this editor will print a headline in the future that Alice has asked him to. That way, she can print her own copy of the newspaper with this headline, make the video and when the real newspaper is printed with this pre-determined headline, Bob can be fooled into thinking that the video was made after the newspaper was printed. But this scenario contradicts our assumption that the newspaper is a trusted third party, and cannot be manipulated by either Alice or Bob.
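The newspaper trick can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the headline string stands in for the TTP's signed timestamp, the names `make_event_y` and `bob_checks` are made up, and SHA-256 plays the role of the videotape that binds the two inputs together.

```python
import hashlib

def make_event_y(ttp_timestamp: bytes, event_x: bytes) -> str:
    """Event-Y takes both the TTP's timestamp and Event-X as inputs."""
    # Length-prefix the timestamp so the two inputs cannot be re-split differently.
    data = len(ttp_timestamp).to_bytes(4, "big") + ttp_timestamp + event_x
    return hashlib.sha256(data).hexdigest()

def bob_checks(event_y: str, published_timestamp: bytes, event_x: bytes) -> bool:
    """Bob recomputes Event-Y from the timestamp the TTP really published."""
    return make_event_y(published_timestamp, event_x) == event_y

# Hypothetical inputs, for illustration only.
headline = b"2021-06-01: hypothetical front-page headline"
event_x = b"the hostage is alive"
event_y = make_event_y(headline, event_x)

print(bob_checks(event_y, headline, event_x))           # real headline: matches
print(bob_checks(event_y, b"some other day", event_x))  # wrong headline: no match
```

Alice could only have produced `event_y` before the headline existed by guessing the headline in advance, which is the first of the two cheating scenarios above.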

Interactive Proofs

With this background, we now get into interactive proofs. It so happens that interactive proofs rely on Alice and Bob’s shared acceptance that Event-X happened before Event-Y because Alice and Bob together orchestrate these events and are both witnesses to them. In the “Where is Waldo” example from our previous article on Zero Knowledge Proofs, we saw that Alice does Event-X first (places a piece of paper on the picture, called a “commitment”). Bob later comes in and gives Alice a challenge (either open the paper and show the image inside, or cut a hole in the paper to reveal just Waldo through the paper). Alice then does what Bob asks of her (called the “response”), and Bob can verify it. Alice and Bob repeat this sequence of “commitment-challenge-response” a few times, randomly changing the commitment and challenge each time, and if it works each time, Bob knows that it’s overwhelmingly likely that Alice knows where Waldo is.

The catch is – only Bob can be convinced of this. If Bob goes to Carol and shows her this sequence of “commitment-challenge-response” triples, Carol doesn’t have to be convinced. Carol knows that Bob can create a similar sequence by doing the slightly different ordering of “challenge-commitment-response” (commitment and challenge are swapped) and trick her. In the Where is Waldo example, Bob makes up a valid commitment to both his possible challenges. Here’s how.

  • Challenge #1. Reveal that the picture under the paper is the original Where is Waldo image. If Bob wants this to be the challenge, the equivalent commitment is random. The paper can be anywhere on the image.
  • Challenge #2. Cut a hole in the paper and reveal Waldo. If Bob wants this to be the challenge, he makes a new picture with just one image of Waldo in the middle, covers it with the paper, and then cuts it open to reveal Waldo.

In both these cases, Bob doesn’t know where Waldo is in the original picture. So, without actually knowing where Waldo is, Bob can write down a sequence of “Commitment-Challenge-Response”, by generating the challenge first, and not the commitment.

Why Order Matters

When Alice and Bob were doing the interactive proof, Alice could not trick Bob this way, because Bob saw in person that Alice did the commitment first, and then had to respond to Bob’s challenge without changing the commitment. Bob knows the ordering of “commitment-challenge-response” because he was there, and saw it happen. But he can never convince Carol of this because Carol also knows that Bob can create such a sequence himself without knowing where Waldo is, by just changing the order of “commitment-challenge-response” to “challenge-commitment-response”. Even if Bob is honest, he cannot convince Carol that Alice proved to him that she knows where Waldo is. Carol can only be convinced if Alice and Carol do the interactive proof between just the two of them, so that Carol can be witness to the ordering of “commitment-challenge-response”.

Challenge Space

The Where is Waldo proof doesn’t work as a general interactive proof because Bob cannot use it to convince Carol that Alice knows the secret. This is because the sequence “commitment-challenge-response” relies on mode #1 order proof (witness). What if the “commitment-challenge-response” sequence has some structure that proves that the commitment happened before the challenge? That could convince Carol that the sequence was generated in the right order, and not the cheating-order. In the Where is Waldo proof, there is no obvious way of imposing structure on the ordering between the commitment and the challenge.

In other interactive proofs (like the Schnorr protocol), Bob’s challenge is just a large number. Alice has to do some arithmetic operations with her previous commitment and Bob’s challenge number to come up with a response that satisfies Bob. In these kinds of proofs, it is possible that we could impose some structure between Alice’s commitment and Bob’s challenge such that it’s obvious to Carol that Bob could not have created the challenge before Alice made her commitment.
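Here is a minimal sketch of a Schnorr-style “commitment-challenge-response”, along with the cheating “challenge-commitment-response” ordering that Carol worries about. The toy group (p = 23, q = 11, g = 2), the secret value, and the function names are my own assumptions for illustration; real parameters are hundreds of bits long.

```python
import secrets

# Toy group, for illustration only: g = 2 has prime order q = 11 modulo p = 23.
P, Q, G = 23, 11, 2

def verify(y, t, c, s):
    """The verifier's check: g^s == t * y^c (mod p)."""
    return pow(G, s, P) == (t * pow(y, c, P)) % P

def honest_transcript(x):
    """Commitment first, then challenge, then response. Needs the secret x."""
    r = secrets.randbelow(Q)      # Alice's one-time randomness
    t = pow(G, r, P)              # Alice's commitment
    c = secrets.randbelow(Q)      # Bob's challenge, picked after seeing t
    s = (r + c * x) % Q           # Alice's response
    return t, c, s

def simulated_transcript(y):
    """Challenge first, then commitment. Needs no secret at all."""
    c = secrets.randbelow(Q)      # pick the challenge up front
    s = secrets.randbelow(Q)      # pick any response
    # Solve the verification equation for the commitment: t = g^s * y^(-c).
    t = (pow(G, s, P) * pow(y, (Q - c) % Q, P)) % P
    return t, c, s

x = 7                # Alice's secret
y = pow(G, x, P)     # Alice's public value

print(verify(y, *honest_transcript(x)))     # passes the check
print(verify(y, *simulated_transcript(y)))  # also passes the check
```

Both transcripts pass the same check, so a written-down triple on its own tells Carol nothing about the order in which its parts were produced.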

One obvious example of such a structure is a secure hash function like SHA256, where Bob’s challenge is the hash digest of Alice’s commitment. Carol knows that Bob cannot first come up with a challenge and then make up Alice’s commitment, because the hash function cannot feasibly be inverted. In such a proof system, Bob always makes up his challenge by hashing Alice’s commitment. So, the sequence is “commitment-hash(commitment)-response”. The challenge is always hash(commitment). If Bob now takes a series of such triples to Carol, Carol is sure that these triples were generated honestly: either Alice and Bob ran an interactive proof, or Bob knows the secret and was able to prepare the proof himself. Either way, Carol can accept the proof. This is how digital signature schemes are constructed, where Bob can show Carol that Alice signed something, and Carol will believe that Bob is not lying (if Carol can independently confirm what Alice’s public key is).
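The “commitment-hash(commitment)-response” idea can be sketched as follows, again with a toy group of my own choosing (p = 23, q = 11, g = 2) and made-up function names. Real schemes also feed the message and public key into the hash; this sketch hashes only the commitment, as described above. The toy challenge space has just 11 values, which lets the last function show why the challenge space has to be large.

```python
import hashlib
import secrets

# Toy group, for illustration only: g = 2 has prime order q = 11 modulo p = 23.
P, Q, G = 23, 11, 2

def challenge_from(t):
    """The challenge is the hash of the commitment, reduced into the challenge space."""
    digest = hashlib.sha256(str(t).encode()).digest()
    return int.from_bytes(digest, "big") % Q

def prove(x):
    """Honest prover: commit, hash the commitment, respond."""
    r = secrets.randbelow(Q)
    t = pow(G, r, P)
    c = challenge_from(t)
    s = (r + c * x) % Q
    return t, c, s

def verify(y, t, c, s):
    """Carol's check: the challenge must be the hash, and the equation must hold."""
    return c == challenge_from(t) and pow(G, s, P) == (t * pow(y, c, P)) % P

def forge_without_secret(y):
    """Cheater: try (challenge, response) pairs until the hash happens to agree."""
    attempts = 0
    for c in range(Q):
        for s in range(Q):
            attempts += 1
            t = (pow(G, s, P) * pow(y, (Q - c) % Q, P)) % P
            if challenge_from(t) == c:
                return (t, c, s), attempts
    return None, attempts

x = 7
y = pow(G, x, P)

print(verify(y, *prove(x)))
forged, attempts = forge_without_secret(y)
print(verify(y, *forged), "forged after", attempts, "attempts")
```

With only 11 possible challenges, guessing the challenge first and hoping the hash agrees works within a handful of tries. With a challenge space of around 2^256 values, the same search is hopeless, and that is what lets Carol trust the ordering.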

So…..

We touched upon many concepts of Zero Knowledge Proofs in an informal way here. C-Simulatability in Sigma Protocols and the Fiat-Shamir heuristic, mostly. There are graduate-level textbooks on these topics, and I have barely scratched the surface here.

Trusting Trust

Ken Thompson, a Turing Award winner and an all-around genius, gave a seminal talk during his Turing Award acceptance called “Reflections on Trusting Trust.” In it, he shows how to sneak a Trojan horse into your application (in his case, the Unix operating system) while you compile the source code of the application using the C compiler.

Say the C compiler has malicious code in it that patiently waits for pieces of code that look like Unix’s source code, and the moment it knows that it is compiling this particular source code, it adds the Trojan horse code into the final Unix compiled binary. This Unix binary now has a Trojan horse and is no longer secure.

A security-conscious user would look at their build chain and would only use a C compiler binary that they have built themselves. Before building this C compiler binary, they would look at the C compiler code to check that such Trojan-horse-inserting malicious code doesn’t exist in their copy of the C compiler source code. They would then compile this clean C compiler source code to generate the C compiler binary before using that to compile the Unix binary. But how does one compile the C compiler binary? Using the C compiler, of course. This (other) C compiler binary could have malicious code that inserts malicious-code-inserting code. The malicious source code used to create this malicious C compiler binary was probably written and thrown away by Ken Thompson back in 1983. The compiled version of this malicious code will self-perpetuate forever. As long as it, or one of its descendants, is used somewhere along the way to build your C compiler, compiling clean-looking C compiler source code will always get you the malicious C compiler binary. The malicious C compiler binary can now corrupt the naive Unix application binary. The maliciousness lives on – with its source long gone.

We trust Bitcoin’s blockchain because the entire blockchain can be verified from the genesis block’s hash that is hardcoded in the Bitcoin source code. We can just read the source code, convince ourselves that it does what we want it to do, compile it, and …… oh fuck!

Can we trust that compiler not to add a Trojan horse into the Bitcoin binary? For example, the genesis block’s hash value (0x000000000019d6689c085ae165831e934ff763ae46a2a6c172b3f1b60a8ce26f) has to be hardcoded in Bitcoin’s source code and can always be “sniffed” for by the compiler. To make sure that the compiler has no such sniffing code, we can look at its source, and compile the compiler first – but where are we going to find a clean compiler for that? One that has never been touched by a program touched by Ken Thompson sometime in the past? One way out of this is to write a basic C compiler from scratch using assembly language and use that to bootstrap your C compiler. Who does that? [1]
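For the curious, the hardcoded genesis hash can be recomputed from the genesis block header. The header field values below are not from this article; they are the widely published genesis-block values, so treat them as assumptions and check them against Bitcoin's source. The function name `block_hash` is made up for this sketch. And of course, this check runs on an interpreter that you also have to trust.

```python
import hashlib
import struct

# Genesis block header fields (assumed from public Bitcoin references).
VERSION = 1
PREV_BLOCK = bytes(32)  # the genesis block has no parent
MERKLE_ROOT = bytes.fromhex(
    "4a5e1e4baab89f3a32518a88c31bc87f618f76673e2cc77ab2127b7afdeda33b"
)
TIMESTAMP = 1231006505
BITS = 0x1D00FFFF
NONCE = 2083236893

def block_hash(version, prev_block, merkle_root, timestamp, bits, nonce):
    """Double SHA-256 of the 80-byte header, shown in the usual byte-reversed hex."""
    header = (
        struct.pack("<I", version)
        + prev_block[::-1]     # hashes are stored byte-reversed in the header
        + merkle_root[::-1]
        + struct.pack("<III", timestamp, bits, nonce)
    )
    digest = hashlib.sha256(hashlib.sha256(header).digest()).digest()
    return digest[::-1].hex()

genesis = block_hash(VERSION, PREV_BLOCK, MERKLE_ROOT, TIMESTAMP, BITS, NONCE)
print("0x" + genesis)
```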

Trust is a tricky thing, and sometimes, it’s turtles all the way down – and turtles come in all shapes: libraries, compilers, hardware, network routers, random number generators, cryptographic magic numbers, cryptographic assumptions, and on and on and on…..

References
[1] Well, if someone is paranoid enough to do that – there is a new Bitcoin implementation called Mako that is implemented in pure C with no external dependencies, which can be compiled to a Bitcoin binary that perhaps has no Trojan horses. Or so we hope.

Asymmetric power, reversed

What is power, really?

Power comes about when someone has the ability to destroy someone else’s accumulated capital.

What is capital, then?

Capital

Capital comes about as a result of raw materials, labour, and time. A healthy body, stored grains, a house to live in, a bank account with money earned through a job, or any sort of owned property – all of that is capital. Capital can either be consumed directly by its owner, or can be exchanged for other things like the labour of others.

If Bob has accumulated capital like this, and Alice has the ability to either destroy this capital, or make its value go down to zero, or coercively move the ownership rights of the capital from Bob to a new owner, I claim that Alice has power over Bob. In some primitive societies, Bob is never allowed to even accumulate capital because Alice controls Bob’s labour and time as well. The capital Bob creates is directly accumulated into Alice’s “account” short-circuiting Bob’s “account” entirely. This is sometimes called slavery.

The modern power dynamic is mostly a result of Bob having accumulated capital to show for his investment of raw materials, labour, and time – none of which is physically recoverable after the capital has been accumulated. The only thing Bob has to show for his raw materials, labour, and time is the accumulated capital. If Alice has the ability to take this capital away from Bob, Bob is rightfully afraid of Alice’s power.

The key thing to note is that the power is quick and efficient to wield. If Alice’s use of her power takes her as long as it took Bob to accumulate capital, there is no point in using power. Alice could as well generate her own capital, which is more efficient than going through the extra step of waiting for Bob to generate his capital and then confiscating it. Power is meaningful only when it’s quickly and efficiently wieldable. We can look at a few examples.

  • A gun is powerful because the shooter can quickly and efficiently end the life of a victim. The victim, on the other hand, would have spent a lifetime of raw material, labour, and time to grow a healthy body.
  • Police have power because they can easily jail people, thereby nullifying the past and future labour/time of their “victims”.
  • Governments have power because they can take a share of their citizens’ accumulated capital in the form of taxes. Tax collection is inherently quicker and more efficient than generating capital.

There is an inherently asymmetric nature to power. Power is wielded quickly and efficiently on capital that is slow and inefficient to accumulate. It seems like this is almost natural. To counter this natural emergence of power in society, we have designed social mechanisms to keep these powers in check. Every once in a while, these mechanisms break, and we see humans resort to their more basic instincts.

Cryptography

Cryptography, through the magic of arithmetic and large numbers, reverses this very natural-seeming asymmetry of power. Every primitive in cryptography reverses this power dynamic: encryption, digital signatures, hash functions, zero knowledge proofs, multi-party computation, you name it. Each of these lets Bob quickly and efficiently secure information, and makes it very difficult for Alice to undermine that security. It’s almost surreal to watch it in action.

Let me give a simple example from the Elliptic Curve Digital Signature Algorithm (ECDSA), which secures ownership rights in Bitcoin. Bob randomly generates a private key, which is a very quick and efficient task for any computer, and somewhat quick and efficient even with paper, pencil, and a coin. The private key looks like this:

0xc2cdf0a8b0a83b35ace53f097b5e6e6a0a1f2d40535eff1cf434f52a43d59d8f

From this private key, Bob generates its corresponding public key, which is made of two numbers, and looks like this:

0x6fcc37ea5e9e09fec6c83e5fbd7a745e3eee81d16ebd861c9e66f55518c19798,

0x4e9f113c07f875691df8afc1029496fc4cb9509b39dcd38f251a83359cc8b4f7

The key thing to note is that the process of generating the public key from the private key is quick and efficient. In this case, a special well-known point on the curve (the generator) has to be added to itself the private key number of times to generate the public key. That sounds slow, but the same trick that makes exponents cheap to calculate (repeated squaring, which on an elliptic curve becomes repeated doubling) gets it done in a few hundred steps. Reversing this process, that is, taking the discrete logarithm of the public key to recover the private key, seems to be very hard, and there are no known easy ways of doing it.
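The quick direction can be sketched with the double-and-add method, the elliptic curve cousin of repeated squaring. The constants below are the standard published parameters of secp256k1, the curve Bitcoin uses; the function names are my own, and this is an unoptimized sketch that should never be used to handle real keys.

```python
# secp256k1 parameters (standard published values): y^2 = x^3 + 7 over the field of size P.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (
    0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
    0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8,
)

def on_curve(point):
    """Check the curve equation; None stands for the point at infinity."""
    if point is None:
        return True
    x, y = point
    return (y * y - x * x * x - 7) % P == 0

def point_add(p1, p2):
    """Add two curve points (handles doubling and the point at infinity)."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None                                   # p2 is the inverse of p1
    if p1 == p2:
        lam = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P          # tangent slope
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P         # chord slope
    x3 = (lam * lam - x1 - x2) % P
    y3 = (lam * (x1 - x3) - y1) % P
    return (x3, y3)

def scalar_mult(k, point):
    """Double-and-add: about 256 doublings instead of k individual additions."""
    result = None
    addend = point
    while k:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result

private_key = 0xC2CDF0A8B0A83B35ACE53F097B5E6E6A0A1F2D40535EFF1CF434F52A43D59D8F
public_key = scalar_mult(private_key, G)
print(hex(public_key[0]))
print(hex(public_key[1]))
```

The loop runs once per bit of the private key, so even a 256-bit key needs only a few hundred point operations. Going the other way, from `public_key` back to `private_key`, has no known shortcut of this kind.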

Another example is the famous RSA algorithm for encryption and digital signatures. Here, the private key is made up of two large prime numbers, and the public key is the multiplicative product of these two numbers. The two private key numbers look like:

p = 101565610013301240713207239558950144682174355406589305284428666903702505233009

q = 89468719188754548893545560595594841381237600305314352142924213312069293984003

The public key p*q = 9086945041514605868879747720094842530294507677354717409873592895614408619688608144774037743497197616416703125668941380866493349088794356554895149433555027

Given p and q, it’s quick and efficient to generate p*q, but given p*q, it’s not known how to quickly and efficiently calculate p and q.
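To get a feel for the asymmetry, here is a small sketch. Multiplying the two numbers above is instant, while the naive way of undoing a multiplication, trial division, is only practical for tiny numbers. The function name is made up, and trial division is just the simplest possible attack, not the best one known.

```python
p = 101565610013301240713207239558950144682174355406589305284428666903702505233009
q = 89468719188754548893545560595594841381237600305314352142924213312069293984003

def smallest_factor_pair(n):
    """Naive trial division: returns (d, n // d) for the smallest divisor d > 1."""
    if n % 2 == 0:
        return 2, n // 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return d, n // d
        d += 2
    return n, 1  # no divisor found, so n is prime

# The easy direction: one multiplication, done in microseconds.
n = p * q
print(n.bit_length(), "bit product")

# The hard direction, on a toy number only. Running this on n itself would need
# on the order of sqrt(n) steps, far beyond any computer.
print(smallest_factor_pair(101 * 103))
```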

As the numbers get larger, the degree of asymmetry between the creation of security and the breaking of it grows.

How does it all come together?

Just having some arithmetic be easy from one direction and hard from the other direction is not enough. These structures have to be useful to secure information. Wise cryptographers have come up with clever tricks to secure information using these arithmetic asymmetries. That these arithmetic asymmetries are opposite to the more naturally occurring power asymmetries in society is poetic justice of sorts. Mull over that for a second: the asymmetric nature of physical power where creation was expensive, but destruction was cheap – has been changed to a new order where creation is cheap, but destruction is expensive. All thanks to the quirks of how numbers work together. Just good old numbers.

To tie this to Bitcoin – we just have to understand that Bitcoin transforms capital to information. What used to be the best form of capital in the analog world (gold), is now digital. To prevent capture of analog capital (gold), owners used to build vaults and fortresses to make the oppressor’s power wielding inefficient. To prevent the capture of digital capital, all owners have to do is perform some arithmetic.

Bitcoin is Gold, but better.

“The more I learn about Bitcoin, the more I respect gold. The more I learn about gold, the more I appreciate Bitcoin.” (Nic Carter)

“One thing Bitcoin taught me is that goldbugs don’t truly understand why gold is valuable.” (Vijay Boyapati)

Bitcoin as Gold 2.0 is a great analogy. Both Bitcoin and gold are mined. In both, the miner spends an excessive amount of time and energy going through a large haystack looking for a shiny needle. With gold, the haystack is the entire Earth, and the needle is around 3,300 metric tons of gold per year (to put that in context, we extract around 2,500,000,000 metric tons of iron per year).

Other than having to be mined, natural laws of physics and geology have given gold many different properties:

  • Inertness: Gold doesn’t rust, corrode, or evaporate. It just stays as is.
  • Density: Gold is among the densest metals known to us. Its density is almost identical to that of the legendary tungsten (which is why tungsten is typically used to make fake gold) and about 1.7 times that of lead.
  • Mouldable: Gold can be broken down into small pieces and reassembled back without losing any other properties.
  • Scarcity: As we saw earlier, gold is not that readily available. But interestingly enough, it is not incredibly rare either and is distributed relatively evenly on Earth, albeit in small quantities – unlike say, platinum, which is rarer and denser than gold but also concentrated largely in South Africa. Gold has what appears to be “optimal scarcity.”

What appears as boring chemical and physical properties of a shiny metal, is in fact, a remarkable set of properties that is not common to physical things. We will soon find out – why these properties enable gold to solve a significant problem entirely unrelated to gold itself. First, a slight detour.

Capital and Labor

There have been libraries worth of books written about capital and labor, but I want to talk about just one thing: how do I save the fruits of my labor? If someone performs a task and wants to be compensated for it, what is the best means of payment? Curiously, this means of payment needs properties quite similar to gold.

  • Inertness: their compensations shouldn’t rust, corrode, or evaporate.
  • Density: their compensations should be economical to save, store, and carry. It shouldn’t need a container ship to carry the amount of labor I have expended to – say – buy a house.
  • Mouldable: Labor spent on one task should be compensated such that the person can utilize this for many tasks later. Or the compensation for the labor of many tasks should be aggregatable into one spendable unit later.
  • Scarcity: If whatever is being used for saving the fruits of my labor is available freely everywhere to everyone, my part of that compensation “pool” becomes infinitesimally small and meaningless.

Many units of such compensation accumulate capital. This capital is then spent on other labor. And so forth. This conversion of energy from labor to capital and then back to labor again allows humans to specialize and flourish. Peoples of past millennia figured out that to achieve this state, the means of payment had to be gold. I am simplifying the enormously profound topic of the emergence of money. Read Nick Szabo’s classic essay on the origins of money for more details.

Is Gold Good Enough?

Unfortunately for gold, physics also prevents it from being teleported across space. That’s a bug in gold’s very nature. If labor wants to be paid across timezones instantly, gold just doesn’t cut it. We have to introduce a trusted third party whose IOU serves this purpose. If we were the type who doesn’t trust third parties and wanted to use something natural, or something not controlled by anyone, we were out of luck. Till Bitcoin, which can be teleported.

Unfortunately for gold, it’s not possible for any individual to know how much gold exists on Earth. Above or below ground. There is some talk about gold being found on asteroids. What is the total supply of gold? We are out of luck. Till Bitcoin, whose supply is capped.

Unfortunately for gold, it’s hard to judge whether a piece of metal I have in my hand is gold or not. It could be gold-coated tungsten, for instance. We need specialized equipment to assay gold, and if an average person wants to make sure that what he received as payment is actually gold, he is out of luck. Till Bitcoin, which can be verified trivially.

All these gold-bugs are fixed with Bitcoin. Bitcoin also carries gold’s features of inertness, density, mouldability, and scarcity. How does it all work? It’s not hard to design a centralized digital system that has all these properties. Designing a decentralized system with these properties, and making sure that it doesn’t change at all – is the hard part. I wrote about this part in an earlier post: Bitcoin is Forever. This social barrier that lets Bitcoin have gold’s properties, and not its bugs – has to hold forever – for Bitcoin to succeed in replacing gold as a way to store the fruits of one’s labor. I have high hopes for this, but by no means am I certain.

Needles in the Haystack

Gold mining is expensive, time-consuming, and energy-intensive. Again, thanks to physics and geology. Also, gold mining has been going on for millennia and has been somewhat commoditized. We find parts of Earth with gold and deploy a somewhat well-understood piece of technology on it. On the other hand, Bitcoin mining is a bit trickier – but in my opinion, more elegant. Let’s see why Bitcoin mining is expensive and consumes energy. Rewinding ourselves all the way back to the idea of mining as finding a needle in a haystack, Bitcoin’s haystack is of the size 2^256, which is this astronomical number: 115792089237316195423570985008687907853269984665640564039457584007913129639936

The proverbial needles in this haystack currently live below a threshold of roughly 2^180 (the exact value shown here is slightly smaller than 2^180), which is a much smaller number than the size of the haystack: 1464008529111715998423770879212294388201901757724360704

Any number smaller than this number is a needle for us, and if we find it – we hit gold, so to speak. As it turns out, the only way to mine Bitcoin is to randomly sample numbers from the entire range of 0 to 2^256 and hope to get lucky with a number that is lower than 2^180. Well, hang on a second!! If I start with 1, 2, 3, and so forth – it seems like it should work. They are all less than 2^180. Unfortunately, it’s not that easy. Bitcoin’s protocol forces randomness on us as we go through the range of numbers from 0 to 2^256. Randomness ensures that if we pick any random number from this range, we have very high odds of picking something much larger than 2^180. So, we trudge along randomly sampling numbers, looking for gold, thereby proving to the protocol that we are doing work. With enough proof-of-work, we eventually hit gold.
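Here is a toy version of that search, as a sketch under my own assumptions. The guesses really are 0, 1, 2, 3, and so on, but each one is pushed through a double SHA-256 first, which is what forces the randomness. The target is set to 2^244 instead of roughly 2^180 so that a laptop strikes gold after a few thousand guesses; the block data, function name, and 8-byte nonce encoding are made up and do not match Bitcoin's real header layout.

```python
import hashlib

def mine(block_data: bytes, target: int, max_nonce: int = 1_000_000):
    """Try nonces 0, 1, 2, ... until the hashed value falls below the target."""
    for nonce in range(max_nonce):
        payload = block_data + nonce.to_bytes(8, "little")
        digest = hashlib.sha256(hashlib.sha256(payload).digest()).digest()
        value = int.from_bytes(digest, "big")   # a number between 0 and 2^256
        if value < target:
            return nonce, value
    return None  # gave up without finding a needle

# Toy target: one guess in 2^(256 - 244) = 4096 is a needle, on average.
toy_target = 2**244
result = mine(b"toy block", toy_target)
print(result)
```

Lowering the target by one power of two doubles the expected number of guesses, which is how the difficulty of the search is tuned.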

Can I do this on my desktop? What are my odds of striking gold? I wrote a quick program on my desktop that could make around 1,000,000 random guesses per second. That’s 10^6 or roughly 2^20. Bitcoin sets its mining window to 10 minutes, and in that time, my desktop can run through 2^29 random numbers. My target, though, as set by the Bitcoin protocol, is around 2^(256-180), or 2^76 numbers, before I hit gold (metaphorically). Even if my desktop has eight cores, I can get to 2^32 numbers. That’s still quite far away from 2^76. How do I achieve a performance improvement of 2^44, that is, 44 doublings or about 13 decimal orders of magnitude? Are there special tools for this? There are!!
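The arithmetic in this paragraph is easy to reproduce. This is just a sketch of the powers-of-two bookkeeping using the rounded figures above; the variable names are mine.

```python
import math

guesses_per_second = 1_000_000   # one desktop core, as measured above
window_seconds = 10 * 60         # Bitcoin's 10-minute mining window
cores = 8

one_core_log2 = math.log2(guesses_per_second * window_seconds)
all_cores_log2 = math.log2(guesses_per_second * window_seconds * cores)
needed_log2 = 256 - 180          # expected guesses needed, as a power of two

gap_doublings = needed_log2 - all_cores_log2
gap_decimal_orders = gap_doublings * math.log10(2)

print(f"one core:    2^{one_core_log2:.1f} guesses per window")
print(f"eight cores: 2^{all_cores_log2:.1f} guesses per window")
print(f"shortfall:   2^{gap_doublings:.1f}, about 10^{gap_decimal_orders:.1f}")
```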

The thing is – general-purpose computers are like human hands, good at many tasks, but not fast or efficient at any specific task. One can instead design tools to solve specific problems – like how we use a hammer to put a nail in the wall and not use our hands for that job. But these tools are pretty useless at solving other problems. A hammer is no good for cutting paper, for example. Human hands can do both, but not as efficiently. Let’s say we designed a hammer to do the Bitcoin mining function and make it as streamlined as can be designed. Can we get a 2^44 improvement over the equivalent of human hands? Somewhat.

ASICs, or Application Specific Integrated Circuits, are the best type of computers for Bitcoin mining. Among these, the very best can do 110 tera-guesses per second, or around 2^56 random guesses every 10 minutes. That’s a factor of 2^27 (about 8 decimal orders of magnitude) over a single core of my desktop. One such ASIC costs around $15,000 and is hyper-optimized just to run the Bitcoin random number generator and nothing else. If we assume the lifespan of hardware to be around two years, the amortized cost of this ASIC is $0.14 for 10 minutes of use. Before we forget, this ASIC has to be connected to a source of power, the cheapest of which costs around $0.06 per kilowatt-hour (kWh). The ASIC has a power rating of about 3.2 kW, making it cost $0.03 for 10 minutes of use. Going from 2^56 to the required rate of 2^76 is still a factor of around 2^20, or just more than a million such ASICs spread over a few datacenters, making the cost around $225,000 for 10 minutes of mining. The gold strike at the end of the 10 minutes is worth 6.25 Bitcoin, which translates to around $260,000 in today’s USD price, netting a profit of $35,000 every 10 minutes to the whole mining industry.
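The per-ASIC numbers can be reproduced from the inputs quoted in this paragraph. This is a sketch that uses only those rounded inputs, with variable names of my own, so its total lands in the same range as the industry-wide estimate above rather than matching it exactly.

```python
import math

asic_price_usd = 15_000
lifespan_years = 2
power_kw = 3.2
electricity_usd_per_kwh = 0.06
asic_guesses_per_second = 110e12      # 110 tera-guesses per second
window_seconds = 10 * 60

# Hardware cost spread evenly over every 10-minute window in the lifespan.
windows_in_lifespan = lifespan_years * 365 * 24 * 6
hardware_usd_per_window = asic_price_usd / windows_in_lifespan

# Energy used in one window (kW times hours) times the price per kWh.
energy_usd_per_window = power_kw * (10 / 60) * electricity_usd_per_kwh

# Guesses one ASIC makes per window, and how many ASICs reach 2^76 guesses.
asic_log2 = math.log2(asic_guesses_per_second * window_seconds)
asics_needed = 2**76 / (asic_guesses_per_second * window_seconds)
total_usd_per_window = asics_needed * (hardware_usd_per_window + energy_usd_per_window)

print(f"hardware: ${hardware_usd_per_window:.2f} per ASIC per window")
print(f"energy:   ${energy_usd_per_window:.2f} per ASIC per window")
print(f"one ASIC: 2^{asic_log2:.1f} guesses per window")
print(f"ASICs needed: {asics_needed:,.0f}")
print(f"total: ${total_usd_per_window:,.0f} per window")
```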

This is just a back-of-the-envelope estimation with more assumptions than I can list: transaction fee variance, location rent, software and hardware labor cost, hardware age/lifespan/generation, electricity price variance, regulatory uncertainty, etc. Bitcoin mining is almost as geographically distributed as gold mining, and it’s hard to get exact numbers.

On a side note, this entire mining process is random, which ensures that the mining reward at the end of each 10-minute period goes to a random miner, where each miner’s chance of winning is proportional to their share of the entire network’s computing power. This delicate dance of the 10-minute timespan, the proportionate randomness-based distribution of rewards, and other such details were covered in my previous post on Bitcoin’s secret sauce.

There are a few interesting consequences of this particular setup:

  • When Bitcoin is more valuable, older and less efficient hardware becomes viable. On the flip side, when Bitcoin’s value goes down, it’s no longer profitable for miners to run old hardware. Bitcoin’s value, as we know, is highly volatile – which makes old mining hardware almost impossible to throw out as e-waste, as we never know when it will become profitable to run again. Mining hardware, which guesses random numbers, is not the same as my mobile phone, which does become e-waste as application demands on older phone hardware get overwhelming over time.
  • The random number guessing program these hyper-specialized ASICs run is an open standard from before Bitcoin’s genesis – called the SHA2 standard. The specification of this program is well known, and long before Bitcoin was born, people had been optimizing its execution in both software and hardware. After Bitcoin got popular, another wave of optimizations happened, and some optimizations are happening even now. But there is a limit to these optimizations – both from software and hardware angles.
    • Software: the functional specification is well known, and we have only so many things we can do with for-loops, additions, and bitwise operations. The low-hanging fruits are all long gone.
    • Hardware: Hardware optimization will eventually hit physical limits. Transistors placed close enough to each other will start doing unexpected things that only quantum physicists can understand.
  • As optimizations end, miners will seek electricity at lower prices. There are many more ways of optimizing electricity production than optimizing the SHA256 function. Finding cheap sources of power or even creating new sources of cheap power is where the next optimization is. Note that cheap is not always green.

Some have argued that miners in Bitcoin serve the same purpose as gold miners. I think it’s not the same. Gold miners can all go away tomorrow, and gold will continue to be what it is. If all Bitcoin miners are gone tomorrow, transactions will stop. Bitcoin’s protocol pays its miners because they have to exist – forever. As we saw earlier, it’s a thin-margin business, where technology will almost certainly be commoditized – as the specifications are set in stone. At this thin margin, cheap electricity is their only competitive advantage. As their margins drop, hopefully 51% or more miners will function honestly, and transactions will keep flowing.

Will Bitcoin deprecate gold? I hope the answer is a no, but I am afraid the answer will be a yes.

Zero Knowledge

“Zero Knowledge”, contrary to what it sounds like, is actually quite interesting and fun. It might even be a solution to our long standing problem of validating the world’s transactions without a trusted third party or government or central bank. If you Google for the terms Zero Knowledge and Blockchains, you will be flooded with whitepapers, articles, explainers, investment advice, and everything in between.

What does Zero Knowledge (ZK) even mean? Let me start with a toy example, and then we can work our way up to world peace.

Say we both get the same newspaper and it has a Sudoku puzzle in the games page. I claim to you that I know the solution to this puzzle, but will not tell you what the solution is. Being the frenemy that you are, you won’t believe me, obviously. Can I prove it to you, beyond reasonable doubt, that I do know the solution to this Sudoku puzzle, without telling you what the solution is? More formally,

  • If I know a solution, I should be able to convince you of that. Without leaking any knowledge about the solution.
  • If I lie about knowing the solution, I should be caught – with overwhelming probability.

If both the above are possible, that would be a Zero Knowledge Proof of Knowledge of a Sudoku puzzle. There are some ingenious ways of doing this, which rely heavily on cryptographic primitives and protocol design. In fact, it’s possible to convince an audience that you know the solution for almost any puzzle without giving them any hint of the solution itself. Sudoku was just one example. You could prove that you know the solution to a crossword puzzle, or the Rubik’s Cube, or that you know a cycling route from New York to Seattle that’s exactly 5000km, or that you have paid your rent, or that your bank balance is more than $10,000 or any such statement really – without actually revealing the actual solution to the statement.

Imagine the power of such a system, where you could convince others that something is true, without revealing how it is true. In most real world systems, including financial systems, to prove something to someone, you have to reveal the actual facts of the matter – and thereby reveal more than you have to.

For example, getting a visa to any country requires you to provide your bank statement – just to prove that you can afford the trip. It should be possible to prove that you can afford the trip, without revealing any financial information. Also, the proof should be real – as in, if you cannot afford the trip, you shouldn’t be able to prove such a thing and fool the visa-issuing agency. We want both sides, the prover and the verifier, to win. Just with no leak of extra information. The best kind of privacy, if you will.

Where is Waldo?

First, I will give an example of what such a Zero Knowledge protocol looks like, to make you believe that it’s possible. Below is Waldo: Say Hi to him.

Waldo is somewhere in the amusement park image below. Can you find him? Don’t try too hard, it’s not worth it.

This “Where is Waldo” puzzle lends itself very well to a Zero Knowledge protocol. I can prove it to you that I know where Waldo is without revealing his actual location on the image. How do I do that? We run the following protocol between the two of us.

  1. You blindfold yourself. I keep a large white sheet of paper on top of the amusement park image, and ask you to remove your blindfold. You can give me one of two challenges. You should choose these challenges randomly.
    1. I should remove the sheet of paper and show you the amusement park image underneath.
    2. I should cut out a small hole in the white paper right above where Waldo is on the image. If I do this, I must know where Waldo is.
  2. Repeat step #1 till you are satisfied.

Why does this protocol work?

  1. If I know where Waldo is, I can easily answer challenge #2. That part is easy. It’s not so easy to figure out why challenge #1 is required.
  2. I could cheat by keeping some other image under the paper which has just many images of Waldo on it. How do you know that it’s actually the amusement park image and not some other image that I made up? Challenge #1 to the rescue. If you had asked me challenge #1, I had to remove the entire paper and show you that this was the amusement park image in question.
  3. Note that you cannot give me both the challenges at the same time, as that would tell you where Waldo is. Only one challenge per protocol round.

If we do this entire exercise just once, you could have asked me to answer challenge #2 and I could still cheat with a probability of 50%. If we do it twice successfully, I can still cheat with a probability of 25%. If we do it three times, it reduces to 12.5%. If we do it 10 times, and you picked your challenge randomly each time, I can cheat only with a probability of 0.1%. If we repeat this 20 times, the cheating probability drops to 0.0001%. And so forth, exponentially. Again, this only works if you pick your challenge randomly. If I know in advance that you will ask me the challenge sequence, of say, 122212121222111 – I can pass all challenges easily. The protocol works only if I am unable to guess your challenge sequence.
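The halving above is just (1/2)^n after n rounds. Below is a small Python sketch of that formula, plus a simulation of a cheater who has to bet on one challenge per round before seeing it. The function names and the simulation setup are my own illustration, not part of any real ZK library.

```python
import random
from fractions import Fraction


def cheat_probability(rounds: int) -> Fraction:
    """Chance that a cheating prover survives `rounds` random challenges."""
    return Fraction(1, 2) ** rounds


def simulate_cheater(rounds: int, trials: int, seed: int = 0) -> float:
    """Fraction of simulated runs in which a cheater passes every round.

    Assumption: in each round the cheater prepares for exactly one of the
    two challenges, and passes only if the verifier happens to pick it.
    """
    rng = random.Random(seed)
    survived = 0
    for _ in range(trials):
        passed_all = True
        for _ in range(rounds):
            prepared_for = rng.randint(1, 2)   # cheater's bet
            challenge = rng.randint(1, 2)      # verifier's random pick
            if prepared_for != challenge:
                passed_all = False
                break
        if passed_all:
            survived += 1
    return survived / trials


if __name__ == "__main__":
    for n in (1, 2, 3, 10, 20):
        print(n, f"{float(cheat_probability(n)) * 100:.6f}%")
    print("simulated, 3 rounds:", simulate_cheater(rounds=3, trials=20_000))
```

If the verifier’s picks were predictable, the `prepared_for` bet would always match the challenge, and the cheater would pass every time.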

Cryptographic researchers have proven that almost any statement can be proven in zero knowledge. Imagine that! Any statement! It’s one of the most celebrated results in theoretical computer science, all the way back from 1986. The concept of Zero Knowledge Proof itself was introduced in 1985, after the original paper was rejected in major scientific conferences in the prior years because of how absurd the idea sounded. It still sounds counter-intuitive, if you ask me.

One popularly used ZK-proof system, solving a very specific problem, is that of Digital Signatures. When you digitally sign a document, you are proving to the verifier that you know a secret key to your public key (which the verifier already knows, or is tied to your identity, or some such). For the longest time, general purpose ZK-systems, which could prove any statement, were just theoretical results – the actual proofs themselves can be quite unwieldy and inefficient. Theoretical work continued, but there were still no practical applications that needed these proofs to get smaller, or easier to understand, or even remotely workable. 25 years went by, and people were mostly happy with either revealing everything about something to prove it, or having a trusted third party (like a Bank Officer or Notary) signing a statement saying that something is true, without revealing the underlying details. Ho-hum.

Enter Bitcoin!

Bitcoin removed the trusted third party from financial transactions. Or at least, introduced the idea that it could be done with clever cryptography and protocol design. Researchers who were toiling away in obscure labs and universities were suddenly like: “Hey, there are these amazing theoretical cryptography results from decades ago, let’s use them”. These ideas suddenly seemed ripe for more R&D to make them practical. And boy did the researchers and engineers deliver! Here’s a short list of how Zero Knowledge pervades the cryptocurrency space.

  1. New cryptocurrencies: Zcash, Monero, Grin, Beam, Mina, etc.
    • Everything about a transaction is hidden. Who is paying. Who is the recipient. What is the amount. Everything is hidden. Crucially though, verifiers can verify that the transaction is valid, and no one is cheating anyone. Zero knowledge magic. Details differ, but this is the general idea.
    • Additionally, Zero Knowledge proofs can verify large numbers of transactions without needing to store all those transactions. So, these ZK-blockchains can be as small as a few KB. For comparison the Bitcoin blockchain is 350GB and growing. Ethereum’s blockchain is 1TB or 5TB (depending on whom you ask) and growing.
  2. Layer-2: ZK-Sync, StarkNet, etc. bring the benefits of ZK-proofs to legacy blockchains like Ethereum and increase throughput quite dramatically.
  3. Other Proofs: Exchanges can use ZK-proofs to convince their users that they are not doing fractional reserve or rehypothecation shenanigans, and in fact, do custody all their customer assets.

What next?

Some of these general purpose ZK-systems have quite advanced cryptography, and their security guarantees are proven sometimes under ideal settings. When I say security guarantees, what I mean is:

  • Can the prover cheat?
  • Can the verifier learn something by violating the zero knowledge principle?
  • Can we do the entire thing without relying on cryptographic assumptions?
  • Some systems rely on an initial ceremony where some trusted party has to do one-off computation. Can we remove such requirements?

Practical minded people say that this stuff is too advanced, or “moon-math” as they call it. These primitives will not make it to Bitcoin for a LONG LONG time, if at all. Bitcoin’s cryptography is from an even older generation, and has been vetted in traditional settings like e-commerce, national defense, etc. No moon-math for Bitcoin!

That doesn’t mean that Bitcoin won’t benefit from these new developments. Bitcoin has evolved to a place now where the core protocol itself won’t change that easily, but additional features have to be built on top, in other layers. ZK-proofs will reside on a secondary layer somewhere on top.

Ethereum, on the other hand, is more open to these ideas. ZK-proofs are making their way into Ethereum’s core-system slowly, but will definitely pervade Ethereum’s Layer-2 ecosystem quite thoroughly in the near future. Much faster than in Bitcoin, from what I can see. Newer blockchains will go all-in, and will be built around ZK-ideas, or will offer them as native operators or subroutines.

You have the entire spectrum of blockchain platforms – some boringly conservative, and just trying to be sound money. Some others on the bleeding edge of maths, offering true privacy through ZK-proofs and the like. I expect these to become more mainstream as privacy becomes non-negotiable. Currencies, smart contract platforms, exchanges, and every other financial intermediary will go maths-first!

Bitcoin’s secret sauce

Bitcoin’s secret sauce, and how it works, was on full display these last few weeks. Bitcoin was designed to work against the most powerful of adversaries, and boy – did the adversary show up!

China Ban

A few months ago, 45% to 75% of Bitcoin mining happened inside China. Then the Chinese government banned it.

There are anecdotal accounts from people on the ground who are seeing Bitcoin mining operations being shut down by law enforcement agents. And there are similar accounts from people on the ground elsewhere in the world, where containers full of mining hardware are being shipped to, lock, stock and barrel.

And then there is the Bitcoin blockchain – the source of absolute truth. I have a copy of the Bitcoin blockchain on my computer, and could actually run the numbers myself and see that the production of Bitcoin blocks slowed down dramatically. Here’s a plot of how long it took, on average, to find a block in each 2016-block period from 12-May-2014 to 18-July-2021.

Bitcoin blocks, on average, are supposed to be generated once every 600 seconds. But you can see the spike in this number on the graph towards the end, going all the way up to 832 seconds. This means that during that period, the total number of active miners went down dramatically, and that led to the inter-block average-gap increasing equally dramatically from 600 seconds to 832 seconds.

Putting the anecdotal and canonical sources of data together, we can be reasonably certain that the Chinese mining ban led to a global drop in Bitcoin mining.

Does it matter?

Not really. Miners come, miners go – Bitcoin chugs along. This is not an accident. This is by very careful design. Bitcoin targets a block production rate of 600 seconds per block. If Bitcoin’s design had been naïve, whenever its dollar value went up, more miners would enter the system to make more money, and blocks would arrive faster than 600 seconds. Similarly, if its value went down (or if governments kicked them out), miners would leave the system, and blocks would arrive much slower than 600 seconds. The block production rate on either side of 600 would persist, and reflect the total number of miners in the system.

But no, that’s not what happens. No matter how many miners are in the system, it always takes around 600 seconds to mine a block. This is done through the difficulty adjustment algorithm, also known as Satoshi’s stroke of genius.

Difficulty Adjustment a.k.a Bitcoin’s secret sauce

Before we get to the difficulty adjustment algorithm, we have to first understand why keeping the inter-block interval of 600 seconds is important. Bitcoin works because everyone can check whether their perceived ownership of their own Bitcoin is fact or fiction. To check this, you need access to Bitcoin’s data. Where is this data? How big is it? How do I access it? Bitcoin’s data is not held by some central custodian, or a bank. It’s held by everyone who is interested. It includes all transactions from the genesis block onwards – from January 2009. But storing everything with everyone sounds crazy – and to be honest, it is crazy. But the more you think about it, the more you realize that there are no other easier ways of doing self-validation, other than offloading the “do I control my money or not?” question to someone else – and trusting them. Bitcoin prefers the opposite: self-validation.

So, if we accept the crazy idea that everyone stores a copy of the blockchain, we have a fundamental tradeoff – the blockchain cannot get very big (by growing very fast). It also cannot stay static: new transactions need to be added every so often to facilitate economic activity. Currently, the blockchain is around 377 GB, and growing at around 50 GB per year. If it grows too fast, not everyone will be able to hold their own copy. If it doesn’t grow fast enough, there is not enough transaction space to accommodate the demand for transactions. Under these constraints, Satoshi decided that a 1MB block every 10 minutes is a good tradeoff. To keep this tradeoff constant, blocks cannot be generated slower or faster.

What happens if Bitcoin’s value skyrockets and everyone wants to be a miner? Remember that a miner who generates a new block gets to keep the newly minted Bitcoin that comes out of each block. So, if the value of Bitcoin goes up, expect more miners to materialize. To accommodate this, Satoshi designed a simple algorithm that makes mining harder or easier depending on how long it takes to generate the previous 2016 blocks.

The Bitcoin protocol contains a positive number called “difficulty”, whose value is currently 13,672,594,272,814. This number controls how hard or easy it is to mine a block. Let’s say the total time taken to mine the previous 2016 blocks was greater than 2016 times 600 seconds, by a factor of X. This difficulty number is then adjusted lower by the same factor X. If the time taken to mine the previous 2016 blocks was lower, the difficulty number is adjusted upwards – again by the factor X. That’s it.
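Here is the rule from the paragraph above as a minimal Python sketch. It implements only what is described here; the real Bitcoin implementation has additional details (for example, a bound on how far a single adjustment can move) that this sketch leaves out, and the names are mine.

```python
BLOCKS_PER_PERIOD = 2016          # blocks between two adjustments
TARGET_SECONDS_PER_BLOCK = 600    # the 10-minute goal


def adjust_difficulty(old_difficulty: float, actual_seconds: float) -> float:
    """Scale difficulty by how far the last 2016 blocks were off schedule.

    `actual_seconds` is the total time the previous 2016 blocks took.
    """
    expected_seconds = BLOCKS_PER_PERIOD * TARGET_SECONDS_PER_BLOCK
    factor_x = actual_seconds / expected_seconds   # the "X" from the text
    # Slower than expected (X > 1) lowers difficulty; faster (X < 1) raises it.
    return old_difficulty / factor_x


if __name__ == "__main__":
    # Blocks averaging 832 seconds, as in the spike discussed earlier.
    old = 19e12
    new = adjust_difficulty(old, BLOCKS_PER_PERIOD * 832)
    print(f"new difficulty is {new / old:.1%} of the old one")
```

With blocks averaging 832 seconds, the new difficulty is 600/832 of the old one, a drop of roughly 28%.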

As far as “algorithms” go, this is as simple as it gets. It’s middle school level arithmetic. Turns out that this is not simple at all and was never done before. Other than combining existing ideas from cryptography and distributed systems, Satoshi’s only novel contribution was this middle school level formula. The genius, as they say, is in the simplicity of it.

When these erstwhile Chinese miners turned down their mining hardware around end of June/beginning of July 2021, Bitcoin’s mining difficulty dropped from 19 trillion to 14 trillion, by around 5 trillion – which is around 28%. The reduced difficulty made it easier for the remaining online Bitcoin miners to start generating blocks every 10 minutes again. The next 2016 block average was 630 seconds. Voila!

As Bitcoin’s value increased from 0 to wherever it is today, miners have mostly entered the system – and have rarely left. Difficulty has almost always gone up – to accommodate this increase in value. So, how does this difficulty number actually make it easier or harder to mine a Bitcoin block?

The Proof of Work Function

Bitcoin, famously, relies on the “partial hash-preimage puzzle” to build its Proof of Work function. A lot of people argue that miners using tons of custom built hardware and scouting the earth for cheap electricity to solve this puzzle many many times over is a waste of resources. That comes down to whether we consider Bitcoin itself to be a waste of resources. That’s a debate for another time. But if we consider that Bitcoin has value – we have to take a moment to appreciate how difficult it is to design a function that has all the properties of Bitcoin’s Proof of Work function.

The proof of work function is:

SHA256(SHA256(block-data)) < target

That’s it. You double hash data from the block you want to generate, and check if that hash value is less than the target on the right hand side of the equation. If it’s not, you change the block data, and try again, and again, and again, and again…

For example, if I double hash make-believe block-data, say the string “Bitcoin forever!”, I get the number:

99399038078883646938846821706752581723151100264172406332358249387420489004987.

The current value of the target is:

1971823790658122626473078926498088015421759366553927680.

So, it doesn’t work. I need to keep trying the function again and again with different block-data to hit gold. The actual previous Bitcoin block’s hash was 888160945014446794317532755205888398236464272495427689, which is under the required target, and that miner struck gold – so to speak. If the difficulty number goes up, the mining target goes down, and finding block-data that double-hashes to a number lower than that target gets harder. It’s like tossing a 6-sided die and wanting to hit a number less than or equal to 1. It happens only once every 6 times. If difficulty were to reduce, the target would move to a number less than or equal to 2. That happens every 3 times – mining just got easier.
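To make the guess-and-check loop concrete, here is a toy Python sketch using the standard `hashlib` module. Several things in it are my assumptions for illustration: the block-data is a plain string with a counter appended (a real block has a structured header), the digest is read as a big-endian integer, and the target is a deliberately easy toy value. Its outputs therefore need not match the numbers quoted above.

```python
import hashlib
from typing import Optional


def double_sha256_as_int(data: bytes) -> int:
    """Hash the data twice with SHA-256 and read the result as a number."""
    digest = hashlib.sha256(hashlib.sha256(data).digest()).digest()
    # Assumption: interpret the 32-byte digest as a big-endian integer.
    return int.from_bytes(digest, "big")


def mine(block_data: bytes, target: int, max_tries: int = 10_000_000) -> Optional[int]:
    """Keep changing the data (via a counter) until the hash is under the target."""
    for nonce in range(max_tries):
        candidate = block_data + str(nonce).encode()   # hypothetical encoding
        if double_sha256_as_int(candidate) < target:
            return nonce
    return None


def verify(block_data: bytes, nonce: int, target: int) -> bool:
    """Checking a claimed solution takes a single evaluation."""
    return double_sha256_as_int(block_data + str(nonce).encode()) < target


if __name__ == "__main__":
    toy_target = 2 ** 244   # toy value: about 1 in 4096 guesses succeeds
    nonce = mine(b"Bitcoin forever!", toy_target)
    print("found nonce:", nonce, "valid:", verify(b"Bitcoin forever!", nonce, toy_target))
```

Lowering `toy_target` plays the role of raising the difficulty: the same loop simply has to run longer on average before it hits gold.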

Why go into the nitty gritty details of this function, with all the associated arithmetic and probability? I want to get into the 3 properties that this unique function has, which make it ideal for Bitcoin mining – and for resisting nation state attacks. Not every day do you see nation-states attacking simple computations like these, and… losing.

Parameterizability: The function provides a very fine degree of control over how much harder or easier we want the function evaluation to be. If you increase or decrease the difficulty number, the function becomes harder or easier to evaluate, respectively.

Memorylessness or Progress-freeness: Even if you have already run the function a million times, it still doesn’t give you any advantage over the next run. Each run of the function is what is called a Bernoulli trial – with the odds of hitting gold the same no matter how many times you have tried in the past. This makes sure that larger miners have no other advantage than just the larger chance of producing a block. If this property weren’t there, the largest miner would *always* win, even if they had just 0.0001% more power than the next largest miner.

The other incredible advantage of Memorylessness is that a miner can be turned off, put in a container, shipped elsewhere and plugged back in. The only loss the miner incurs is the Bitcoin that could have been mined in that interim time when the machine was turned off. Most physical objects being built, or even computations that are being performed on computers rely on previous data or “progress” that has been done, stored and retrieved, so that we can continue the process further. Shutting down something abruptly, without needing to store any state of progress, and starting elsewhere without any extraneous loss is not that common. This allows Bitcoin miners to be incredibly mobile and seek out the cheapest electricity wherever it exists. They are, in the true sense, plug-and-play.
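Memorylessness can be checked with exact arithmetic. The Python sketch below uses the dice odds from earlier (1 in 6) as a stand-in for the real, far smaller, mining odds; the function names are my own.

```python
from fractions import Fraction


def prob_all_miss(p: Fraction, tries: int) -> Fraction:
    """Probability that `tries` independent attempts all miss."""
    return (1 - p) ** tries


def prob_all_miss_given_past(p: Fraction, past: int, future: int) -> Fraction:
    """Probability of missing the next `future` tries, given `past` misses so far.

    Conditional probability: P(miss past + future) / P(miss past).
    """
    return prob_all_miss(p, past + future) / prob_all_miss(p, past)


if __name__ == "__main__":
    p = Fraction(1, 6)   # stand-in odds, as in the dice example
    fresh = prob_all_miss(p, 10)
    after_1000 = prob_all_miss_given_past(p, past=1000, future=10)
    print("same odds after 1000 failed tries:", fresh == after_1000)
```

The thousand earlier misses cancel out of the ratio exactly, which is why a miner that was switched off and shipped elsewhere loses nothing but the time it was off.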

Hard to compute, but easy to verify: To get the double-hash value which is under the target needs millions of trials of the function. But once someone finds it, the rest of us can verify it immediately with just a single iteration of the function. This, again, makes decentralization possible – where all of us can run the Bitcoin software on our computers and check that the miners are doing the right thing.

Replacing this function is not that easy. Most attempts have kept the general idea, and have tinkered with the specifics.

Conclusion

A nation state the size of China attacked Bitcoin where it’s supposed to hurt: Bitcoin mining. All they managed to get in return was a giant shrug of indifference from the protocol. Yet another instance of Bitcoin living up to its promise of being designed to last forever. This self-adjusting nature of Bitcoin – that makes it change itself based on market conditions, with no one central entity being in charge – separates it from all other forms of money. Fiat money always has a central planner. Bitcoin has a protocol.

Governance, Decentralized

Define Governance: the act or process of governing or overseeing the control and direction of something (such as a country or an organization).

In this article, I will focus on whether any organization can have decentralized governance, what that even means, and how it is related to cryptocurrencies. Let’s start with a very basic organization, and see whether it can be governed in a decentralized way.

What is an organization anyway?

Say some people want to pool their money and use it for charity. We have ourselves a rudimentary organization. During the organization’s inception, the founders make some bylaws – for example: for any charitable donation to happen, say 2/3rd of the remaining capital in the pool has to approve it. These bylaws are written down formally in a “human language” (the language being a “human language” is important). The organization will register itself with the government of that geographical area (let’s say, a country). In case disputes arise in the future, the courts of that country will interpret the bylaws of the organization, apply the relevant common laws of that country, and with the threat of force, ask the members of the organization to abide by the court’s judgment. We kind of get how this works.

I will call this “centralized governance”, because the dispute resolution is adjudicated by a centralized authority. In an ideal world, this centralized authority is fairly appointed by representatives of the people who were fairly elected by the people to carry out such appointments.

Enter Smart Contracts

If the bylaws were precisely written down in an unambiguous computer language, and deployed on a distributed computer that could not be stopped, or taken over by any single authority – we have a decentralized organization. Its governance is encoded in the program that was deployed on the distributed computer. Ideally, once deployed, the program cannot be changed, and can be arbitrarily run by anyone forever. Who are the members of this organization? Let’s say the program has a function that accepts money as input, and gives out an equivalent valued token – anyone who makes such a function call is a member of this organization, as they have a stake in the program. Do disputes arise in such an organization? No. To see why the answer is “no”, we have to understand that this system adheres to the maxim: “Code is Law”. The program does exactly what it was programmed to do – there is no randomness or discretion or uncertainty in the execution. This faithful execution of the program obsoletes the idea of dispute resolution.

Ethereum smart contracts are such programs. They are deployed and run on Ethereum, which is a distributed network of computers that ideally cannot be censored or stopped. Ethereum has a richer programming language, along with the notion of a smart contract having monetary deposits, and other arbitrary data. Using this setup, one can write a smart contract that represents the charitable organization that we saw earlier. In fact, back in 2016, when Ethereum was still in its infancy, exactly such an organization was deployed as a smart contract on it. It was called The DAO, or the decentralized autonomous organization. It could accept funds from anyone, and with token holders voting for projects, would fund these projects from the collective pool of funds. Venture capitalists thought that the DAO would disrupt the VC industry itself, and added their own funds into the pool. At its peak, the DAO had 14% of all of ETH pooled inside it (ETH is the native currency of the Ethereum system). I didn’t read the code of the DAO, and am not sure how a project got actual funding – was some ETH moved to the recipient’s address? How would the DAO verify that the recipient actually produced something of value, if that artifact was not native to the blockchain itself? In the cryptocurrency space, it’s important to ask these questions – as the answers are not obvious, and often hide red flags that indicate possible scams.

But as it turned out, this DAO program itself had a software bug, and that allowed a clever hacker to drain the uninvested funds into their own control. To “fix” this “hack”, people who had enough social clout in the Ethereum ecosystem managed to undo history, and start an alternate timeline where this hack never happened.

What?!?!

How does one undo history and make alternate timelines?

It’s the settlement assurances, stupid [1]

Let’s start with an example. Let’s say your credit card is stolen, and is used to buy strange things in strange lands. You call your credit card issuer and ask them to undo history, and start an alternate timeline where the theft never happened, and you have a clean slate of your own previous transactions and new transactions. Where did the thief’s transactions go? Turns out that they were never “settled”. In the traditional finance world, very very few transactions are actually “fully settled”. Transactions between countries, or between large banks, or those that are brokered by central banks are considered settled for good, and are truly irreversible. The rest of the world’s transactions can be reversed, if the right people are convinced.

In Ethereum, where code is supposed to be law – alternate timelines should not have been possible. The hacker took out the pooled funds from the DAO because the smart contract allowed that to happen. That’s the bylaws of the contract, and the hacker is playing by the rules. There shouldn’t be a discretionary voice that says “But that’s not the spirit of the law”. Smart contracts are only supposed to respect the letter of the law, and not the spirit of the law. Ethereum, in its early days at least, believed that the spirit of the law mattered more than the letter of the law, and allowed the DAO hack to be “bailed out”.

Ethereum is just one such “network computer” (blockchain, to keep up with the times) that runs such code-is-law smart contracts. There are other blockchains that claim to do the same, and have varying degrees of centralization that allows the powers-that-be to “bail out” certain contracts if shit hits the fan. On the other hand, Bitcoin doesn’t even allow such powerful smart contracts, and the rudimentary smart contracts that it does allow, have never been reversed because some people lost their money. I think it’s an important distinction that makes Bitcoin the most (if not the only) credible blockchain in existence, but that’s just me.

Governance, through code

Coming back to Ethereum smart contracts which act as decentralized autonomous organizations, how can governance rules be changed if all token holders agree to it? We now get into some of the more sophisticated governance models for smart contracts, which can all be coded into the initial smart contract itself. Here’s one popular model:

In our original charity smart contract, we had the initial bylaw that 2/3rds of the total pool had to approve every new donation. Let’s say we want to change this rule to have 3/4 instead of 2/3. While writing the initial smart contract, this particular constant (2/3) is delegated to a different smart contract that is deployed first, and the main smart contract calls this other smart contract to perform its actions. In software programming, this is either called “delegation” or “forwarding” or “a pimpl – pointer to an implementation”. The difference between a classic software program that does this, vs. a smart contract that does the same thing – is that in a smart contract with decentralized governance, the change in implementation of a functionality has to be voted on by token holders. This is how it looks:

  1. The initial smart contract is written in such a way that the following steps are supported.
  2. Someone (doesn’t matter who) codes a new piece of functionality and deploys it on the blockchain. For now, this is dead code, as no one is executing it. But everyone can see what it does.
  3. Someone (again, doesn’t matter who) makes a proposal in the original contract that they would want to call a vote for this new functionality from step (2) to replace the equivalent step in the original code.
  4. There is a timeline for token holders of the smart contract to vote for this proposal. Votes are tallied. The result is known.
  5. If the governance change is approved, there is an additional time window before it comes into effect. Token holders who are unhappy with this change can withdraw their capital from the pool by returning or burning the tokens.
  6. The governance change is effected by changing the smart contract implementation of this functionality from the original to the new.
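The delegation idea in the steps above can be sketched in plain Python (not in an actual smart contract language). Everything here is hypothetical and simplified: the class and function names are mine, votes are weighted by deposited stake, the vote to swap the rule is judged by the rule currently in force (the text does not say which threshold applies), and the waiting period and exit window from step 5 are left out.

```python
from fractions import Fraction
from typing import Callable, Dict, Iterable

# A "rule" answers: is this much approving stake enough, out of the total?
Rule = Callable[[int, int], bool]


def two_thirds_rule(approving: int, total: int) -> bool:
    return total > 0 and Fraction(approving, total) >= Fraction(2, 3)


def three_quarters_rule(approving: int, total: int) -> bool:
    return total > 0 and Fraction(approving, total) >= Fraction(3, 4)


class CharityPool:
    """Toy stand-in for the main contract; `rule` is the delegated part."""

    def __init__(self, rule: Rule = two_thirds_rule) -> None:
        self.rule = rule                      # pointer to the implementation
        self.balances: Dict[str, int] = {}

    def deposit(self, member: str, amount: int) -> None:
        self.balances[member] = self.balances.get(member, 0) + amount

    def _stake(self, voters: Iterable[str]) -> int:
        return sum(self.balances.get(v, 0) for v in set(voters))

    def approves_donation(self, voters: Iterable[str]) -> bool:
        """Ask the currently delegated rule whether the vote passes."""
        return self.rule(self._stake(voters), sum(self.balances.values()))

    def change_rule(self, new_rule: Rule, voters: Iterable[str]) -> bool:
        """Swap the implementation only if the vote passes (assumed: current rule)."""
        if self.approves_donation(voters):
            self.rule = new_rule
            return True
        return False


if __name__ == "__main__":
    pool = CharityPool()
    pool.deposit("alice", 50)
    pool.deposit("bob", 30)
    pool.deposit("carol", 20)
    print(pool.approves_donation(["alice", "carol"]))            # 70% under 2/3 rule
    print(pool.change_rule(three_quarters_rule, ["alice", "carol"]))
    print(pool.approves_donation(["alice", "carol"]))            # 70% under 3/4 rule
```

The main class never hard-codes the threshold; it only calls whatever rule it currently points to, which is what lets token holders replace it by vote.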

Many smart contracts on Ethereum have a so-called “governance token” that allows token holders to change the rules of the smart contract if enough such token holders vote for it.

  1. Uniswap, the popular decentralized exchange on Ethereum, has its own governance token UNI, which allows UNI holders to vote for governance changes like increasing or decreasing the fee taken by the protocol per exchange trade.
  2. Compound, a smart contract for credit issuance on Ethereum, has its own governance token COMP, which allows COMP holders to effect governance changes – like how they recently voted to change their price oracle.
  3. MakerDAO, the smart contract behind the stable coin DAI, has its own governance token MKR, which allows MKR holders to change the parameters of the DAI stablecoin, and how it maintains its 1:1 peg against the USD.

In my naïve unqualified opinion, these kinds of governance tokens can sometimes pass the Howey test, and could qualify as securities under some regulatory regime.

What’s in it for me?

Many tokens/coins are available to buy on many cryptocurrency exchanges.

  1. Some are native coins of their own blockchains – like BTC/ETH. Many of these native coins are centralized, issued to investors first, and dumped on the general public later.
  2. Some are ERC-20 tokens on the Ethereum blockchain. They represent governance rights on protocols, and thereby generate cash flow.
  3. Some are tokens on other blockchains. Most blockchains’ native currencies themselves are worth nothing. Tokens that are launched on these blockchains are even trickier.
  4. Some are even more complex tokens issued by smart contracts that govern other smart contracts.
  5. Some tokens are blatantly pointless, and are valuable just as collectibles: remember NFTs?

Some tokens have a point, but are still worth nothing.

Some tokens have a point, and might be worth something.

To keep life simple, one can just buy Bitcoin. If that’s too conservative (it’s not), maybe add ETH to the mix (don’t).

References
1 Read more here: https://medium.com/@nic__carter/its-the-settlement-assurances-stupid-5dcd1c3f4e41

Defi for the rest of us

DeFi stands for Decentralized Finance.

Decentralized: Ideally, any single entity should not be able to stop the process or program or system in question. It’s running on some unstoppable system where anyone can execute operations.

Finance: Savings, Loans, Exchanges, Margin Trading, Synthetic Assets (Equities, for example), Lotteries, Insurance, Collateralized Debt Obligations (why not?), and such.

Before the advent of Bitcoin/Ethereum, financial products were run on a computer that some entity controlled. This entity had a physical address, and could be visited by law enforcement or regulators or, more generally, those whom I call “men with guns”. Bitcoin/Ethereum run on so many computers that it’s not possible for men with guns to stop them. Smart contracts running on Ethereum are hard for men with guns to take physical control of, stop, or modify unilaterally. This is the decentralization that we are interested in. Because of this, we have “unstoppable programs”, at least in theory.

First, a simple example of where these “unstoppable programs” come in. Let’s say you want to buy some Ether. You could submit your KYC details to a centralized exchange like Coinbase or Kraken and get an account. You then wire-transfer some dollars to their bank account, with some routing instructions so that the money goes to your account. You wait for the dollars to show up in your dashboard, and then buy some Ether with it. You could let the Ether stay there (like how you let your money stay in a real world bank) or you could self-custody by transferring the Ether out to your own hardware wallet. Like you withdraw cash from a bank and self-custody under a mattress, for example.

Decentralized Exchanges

Given that you could be an “under-the-mattress” type of person, Coinbase could block your account. What then? Enter DEXes, or decentralized exchanges. Uniswap is one such DEX. It’s a set of smart contracts that run on the Ethereum network. The specific Uniswap smart contract that accepts USD and gives back Ether is located at the address 0xb4e16d0168e52d35cacd2c6185b44281ec28c9dc on the Ethereum blockchain’s “main-net”. Think of it as the unchanging IP address of the smart contract on the Internet. If you make a request to this smart contract with some USD, it returns some Ether to your address. Think of it as making a web-search request to Google.com with a query and getting back 10 blue links as the result. But to start this process, you need to have USD in a form that the smart contract can accept. Enter Stablecoins.

Stablecoins

Stablecoins are tokens that 1:1-track external fiat currencies like the US Dollar or Euro, or external (to the system in question) cryptocurrencies like Bitcoin. This token system is implemented as an ERC-20 token (which I explained in my post on NFTs). Take USDC for example, which is a stablecoin that tracks the US Dollar. Every token minted by the USDC smart contract can be redeemed for $1. How do you mint a USDC token? You create an account on Coinbase, you transfer USD to it, and you buy 1 USDC for 1 USD. This 1 USDC is an ERC-20 token that can be transferred from your Coinbase account to your computer, or some other contract, or exchanged on Uniswap for something else. The 1 USD you owned earlier is now on the Ethereum blockchain in the form of 1 USDC. To redeem this 1 USDC back to 1 USD, you transfer this USDC back to your Coinbase account, and sell it for 1 USD. Note again that there is no USD, ever, on the Ethereum blockchain. Ethereum does not know about USD at all. All it knows is USDC. Coinbase is your bridge from the real world to the ethereal world.

Coinbase is able to redeem USDC to USD because they have a traditional bank account somewhere that stores the USD that backs the USDC.
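Here is a minimal sketch, in Python, of the bookkeeping behind such an issuer-backed stablecoin. The class `ToyStablecoin` and its method names are made up for illustration and are not the actual USDC contract. The one invariant it tries to show is that tokens “on chain” always equal dollars in the issuer’s bank account.

```python
class ToyStablecoin:
    """Hypothetical issuer-backed token: 1 token on the ledger per 1 USD in the bank."""

    def __init__(self):
        self.bank_usd = 0     # off-chain: dollars sitting in the issuer's bank account
        self.balances = {}    # on-chain: token balances by address

    def total_supply(self):
        return sum(self.balances.values())

    def mint(self, address, usd):
        # The issuer receives USD off-chain and creates the same number of tokens.
        self.bank_usd += usd
        self.balances[address] = self.balances.get(address, 0) + usd

    def transfer(self, sender, receiver, amount):
        # Plain token transfer: no bank and no issuer permission involved.
        if self.balances.get(sender, 0) < amount:
            raise ValueError("insufficient balance")
        self.balances[sender] -= amount
        self.balances[receiver] = self.balances.get(receiver, 0) + amount

    def redeem(self, address, amount):
        # Tokens are burned and the same number of dollars leaves the bank.
        if self.balances.get(address, 0) < amount:
            raise ValueError("insufficient balance")
        self.balances[address] -= amount
        self.bank_usd -= amount
        return amount

coin = ToyStablecoin()
coin.mint("alice", 100)             # Alice wires 100 USD to the issuer
coin.transfer("alice", "bob", 40)   # Bob gets tokens without ever touching the issuer
coin.redeem("bob", 40)              # Bob cashes out 40 USD
```

Only `mint` and `redeem` touch the bank account. Everything in between is just the token moving between addresses.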

Coming back to our earlier use case: now that you have USDC on Ethereum, you can use the Uniswap contract to buy Ether with it, without going through Coinbase. But hey, we had to go to Coinbase to buy USDC. So, didn’t we just move the trusted third party from the exchange to the stablecoin issuer? We did. But do note that you can get USDC without going to Coinbase as well – it’s just an ERC-20 token that anyone can transfer to you on the Ethereum blockchain without permission from anyone else. And you can use this to exchange to any other token without anyone’s permission as well. If more and more of the economy “moves on chain”, the on and off ramps to fiat currencies like USD will become less important. But for now, someone, somewhere has to store 1 USD in a bank account to be able to generate the equivalent stablecoin “on chain”.

Automated Market Makers

So, how does Uniswap know the exchange rates for every token pair that it allows us to trade with? Each token-pair is run as a smart contract, where you can make function calls to swap one token for another. The smart contract also has a liquidity pool under its control which stores both the tokens in some ratio, and this ratio is used to infer the market price. The assumption is that if this ratio goes out of sync with the external market price, arbitrageurs will trade in the other direction to take tiny profits and revert the pool ratio back to reflect the external market price. Users with excess liquidity in any token can fund these liquidity pools and take a small cut of each trade that hits their liquidity pool. We now have a liquidity provider who can get some yield on their capital. Notice that this system of smart contracts is not relying on any external data to be ingested into the system. The exchange rate between tokens is entirely set by market dynamics.
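To make the “ratio is the price” idea concrete, here is a small Python sketch of a pool. It assumes the constant-product rule (reserve_a × reserve_b stays constant across a swap) and a 0.3% fee, which is how the early versions of Uniswap are usually described. The class and method names are mine, not Uniswap’s.

```python
from dataclasses import dataclass

@dataclass
class Pool:
    reserve_a: float
    reserve_b: float
    fee: float = 0.003   # assumed cut that stays in the pool for liquidity providers

    def price_of_a_in_b(self) -> float:
        # The pool's ratio is the only "price feed" it has.
        return self.reserve_b / self.reserve_a

    def swap_a_for_b(self, amount_a: float) -> float:
        # Keep reserve_a * reserve_b constant, applied to the input after the fee.
        k = self.reserve_a * self.reserve_b
        effective_in = amount_a * (1 - self.fee)
        new_reserve_b = k / (self.reserve_a + effective_in)
        amount_b_out = self.reserve_b - new_reserve_b
        self.reserve_a += amount_a      # the fee portion stays in the pool too
        self.reserve_b = new_reserve_b
        return amount_b_out

pool = Pool(reserve_a=1000.0, reserve_b=1000.0)
before = pool.price_of_a_in_b()
received = pool.swap_a_for_b(100.0)
after = pool.price_of_a_in_b()
# Selling A into the pool makes A cheaper: `after` is lower than `before`.
```

If the outside market still values A at the old price, the pool is now selling A at a discount, and that discount is exactly what the arbitrageur comes to collect.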

Let’s say you wanted to provide liquidity to the token pair ABC-XYZ on Uniswap, but you have neither token with you. On the other hand, you have more than enough Bitcoin that you want to HODL and not want to sell. Can we use this Bitcoin as collateral to get a loan of some ABC tokens that you can then use to fund the ABC-XYZ Uniswap pool? Enter DeFi loans.

Loans

In the traditional world of finance, Loans are given out to parties with good credit rating, and defaults are prevented/mitigated by a combination of social pressure of reputational damage, law enforcement, liquidation of other assets, or such. In the world of cryptocurrencies, the users have just one identity – a public key, which looks like this: 12cbQLTFMXRnSzktFkuoG3eHoMeFtpTu3S. How do you cause reputational damage to this public key? Traditional default protection ideas fail here. Most crypto-loans are, for that reason, over-collateralized. You want to borrow 100 tokens of ABC? You put up 150 ABC worth of Bitcoin as collateral, and then you take 100 ABC. As long as the smart contract can convince itself that the loan remains over-collateralized, you are good. If the value of Bitcoin goes down, you are expected to put up more collateral – or risk being liquidated.
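The check that the smart contract runs to “convince itself” can be sketched in a few lines of Python. The 1.5 minimum ratio comes from the example above. The function names, and the idea that the collateral price is quoted directly in units of the borrowed token, are simplifying assumptions.

```python
MIN_COLLATERAL_RATIO = 1.5   # from the example: 150 of collateral for 100 borrowed

def collateral_ratio(collateral_units, collateral_price, debt_units):
    """Collateral value divided by debt, with the price quoted in the borrowed token."""
    return collateral_units * collateral_price / debt_units

def can_be_liquidated(collateral_units, collateral_price, debt_units):
    return collateral_ratio(collateral_units, collateral_price, debt_units) < MIN_COLLATERAL_RATIO

# 3 units of collateral worth 50 ABC each back a loan of 100 ABC: ratio 1.5, safe.
healthy = can_be_liquidated(3, 50, 100)

# The collateral drops to 40 ABC per unit: ratio 1.2, open to liquidation.
underwater = can_be_liquidated(3, 40, 100)
```

Nothing in this check knows who the borrower is. The collateral does all the work that reputation does in traditional finance.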

Why would someone borrow an amount of X by pledging a collateral of 1.5X? Well, one obvious reason is that the borrowed token is more useful than the collateral token. It could be that the borrowed token is undervalued by the market vis-à-vis the collateral token. It could be that the borrower knows that the collateral token will tank in value the next day, and wants to willfully default on the loan. It’s all possible.

Hmm

What next? “TradFi” could get disrupted by “DeFi” because of how automated these smart contracts are, and how they can easily build on top of each other. Everything is an API, and APIs are open. On the other hand, men with guns could mess with the trusted third parties that, say, back stablecoins – and take down the whole system. Or DeFi could just keep running in its little corner of the general financial ecosystem, and everyone wins.

PS: Overheard on Twitter: Fish are swimming to DeFi in droves, and that’s attracting the sharks 🙂

So Doge

I will admit something first. Dogecoin is fun. Dogecoin makes you laugh out of sheer joy, despite yourself. Dogecoin sucks you down into a rabbit hole of memes, parodies, and all things not serious.

But is everything a joke? Obviously not. So, in that spirit – let’s get serious.

Bitcoin is an idea. A meme, if you will. Like how the original Doge meme is backed by a cute Shiba Inu dog, the Bitcoin meme is based on the idea of what money is. As we know, money is just a made up thing – a meme – which people ascribe value to. Money doesn’t have to be “backed by” anything. All you need is the collective belief of people in the meme of money. To take this comparison further, on the Doge side, the meme goes a bit deeper than just the dog. We have words like: “much”, “wow”, “so”, “amaze”, “many”, etc. that can enhance the context in which the Doge meme is being used. On the Bitcoin side, you have the mythical founder, dead simple cryptography, and a few other powerful ideas that go on to implement a glorified ledger of IOU’s. That ledger is considered legit because of the meme that Bitcoin is set in stone.

If Bitcoin itself is a meme, why not make a coin out of a literal meme? Enter Dogecoin.

Started off in 2013 as a joke, Dogecoin needed to work just like Bitcoin, but with a few tweaks. Why tweaks? Why not? It’s just a joke anyway. But sadly though, these weren’t “fun tweaks”. Like there is no Doge ASCII art in the transactions, or a “much wow” after every block of transactions. The tweaks were almost arbitrary technical departures from Bitcoin. Notably:

  • Changing the inter-block arrival rate (Bitcoin: 10 minutes, Dogecoin: 1 minute).
  • Proof of Work with the SCRYPT hashing algorithm in Dogecoin vs. SHA256 in Bitcoin.
  • Arbitrary rewards for block producers, but now changed to a fixed reward of 10000 Dogecoins per block (which are generated every minute).
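A quick back-of-the-envelope calculation in Python shows what the last two parameters imply for new supply. It uses only the numbers in the list above. The 365-day year is my simplification.

```python
BLOCK_REWARD = 10_000        # Dogecoins per block, from the list above
BLOCKS_PER_MINUTE = 1        # one block every minute, from the list above

blocks_per_day = BLOCKS_PER_MINUTE * 60 * 24
coins_per_day = blocks_per_day * BLOCK_REWARD
coins_per_year = coins_per_day * 365   # simplification: ignores leap years

print(f"{blocks_per_day} blocks/day, {coins_per_day:,} DOGE/day, {coins_per_year:,} DOGE/year")
```

That works out to 14.4 million new Dogecoins a day, or a bit over 5 billion a year, for as long as the fixed reward stays in place.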

Dogecoin works, in the sense that the jokes are funny, and if you choose to – you could use Dogecoin as money. If enough people choose to use it, it might very well thrive, not just survive. In 2021, enough people are buying it, holding it, talking about it, “meme-ing it”, and watching its value skyrocket in terms of USD. Because it’s funny, it’s an F.U to the traditional financial establishment, and perhaps even to the Bitcoin establishment (whatever that is).

But if everything about Dogecoin is warm and fuzzy, what gives?

Two things, specifically.

1. What makes a meme?

A meme implodes if what literally backs the meme fails to work. When I say “literal”, I mean the literal thing that backs the meme. Like in the case of Doge the meme, we want that Shiba Inu dog to have been a real dog (and not secretly a stuffed toy), and the meanings of English words like “much” and “wow” to not change. In the case of Dogecoin, the literal technology that underpins the meme has to work. Let’s say Dogecoin can be double-spent because of the quirky way it is mined, or let’s say users cannot audit the global supply and the ownership of their Dogecoin because they cannot run a full node, or let’s say Dogecoin’s governing rules change tomorrow… for the lulz. In fact, those tweaks that Dogecoin made over Bitcoin can be argued to be quite unsound. These and other technical artifacts can undermine the Dogecoin meme fundamentally.

Without being controversial, I can say that Dogecoin is orders of magnitude weaker than Bitcoin in these terms.

Why is that? That’s my second point.

2. Stronger meme

Bitcoin’s meme is serious, to the point of almost being noble. This has inspired serious people. Some of these people have worked hard to make small technical improvements over the surprisingly good initial design, make the code robust against bugs, have a small footprint, and keep running forever. Some others have looked hard at the theoretical aspects of Bitcoin to see why it works, and have almost convinced themselves that it works because it has to work. Some others have meme-ed the idea that Bitcoin’s rules cannot change at all, and have fought long and hard wars of attrition to keep it as it is. There are entire industries built around Bitcoin’s mission, and words like “mission” get used quite often.

On the other side, we have Elon Musk and Joe Weisenthal of Bloomberg who have meme-ed about Dogecoin. And they have meme-ed well. Like Elon putting a Dogecoin on the literal moon (whatta great meme). Joe has even joked that Dogecoin is a purer incarnation of what a cryptocurrency should be, without all the added serious baggage of Bitcoin. I argue the opposite. The serious nature of the Bitcoin meme is what makes it work, by getting the virtuous cycle of seriousness begetting robustness begetting soundness.

To meme Dogecoin into a phenomenon stronger than Bitcoin, it has to come from many fronts. Textbooks have to be written about it. Academic conferences dedicated to it should emerge. Universities should start teaching courses about it. CME has to create a futures market for it. Central Banks all over the world have to start aping it. Folks should be drilling holes into the Alps to create vaults that can store a piece of paper with a private key written on it. These and many more have to happen for a meme to emerge stronger. Also, critically, despite the memes, the thing has to not change, and keep its singular purpose.

Bitcoin, luckily, had many things go its way, which kick-started the virtuous cycle of meme-ing, and those memes attracting people who were good enough to improve the thing that underlies the memes. Dogecoin might get there as well, or might not.

Bitcoin cannot be secured by public-key cryptography

Bloomberg columnist Noah Smith wrote an article[1] about Bitcoin’s energy consumption. “Blogger” Nic Carter wrote a rebuttal to it. Noah Smith wrote a rebuttal to this rebuttal. Noah’s counter-rebuttal calls for its own rebuttal.

Eventually, we will see why the following tweet from Nic follows quite naturally.

“Proof-of-stake is just a fancy name for ‘exactly the same system that bitcoin was designed to be an alternative to’.”

tweet by Nic Carter.

Here are their arguments in a nutshell (paraphrased for brevity):

Noah: The more Bitcoin’s price goes up, the more resources it consumes.

Nic: Gold extraction also consumes energy.[2]

Noah: Extraction is not the same as Storage. Stores of value like Gold, Stocks, our homes, etc. are not that expensive to store/maintain, as opposed to extract/create/build. The cost of secure storage of these traditional stores-of-value does not go up linearly with their value. Bitcoin is an exception, whose “cost of secure storage” (mining) goes up linearly with its price. It’s not a very efficient storage technology.

Rebuttal

Now that the stage is set, we differentiate between the asymmetries of public-key cryptography and cryptographic hash functions. Stay with me here, this is super important.

In public-key cryptography, there is asymmetry between the public key and the private key. Creating both keys is quite easy. For encryption, the public key locks, the private key unlocks. For digital signatures, the private key signs, the public key verifies the signature. For encryption, one cannot decrypt without the private key. For digital signatures, one cannot forge a signature without the private key. For the purposes of this article, let’s call these phenomena keyed-asymmetry. A cryptographic hash function, on the other hand, has no notion of keys. You have some information – you hash it, and you get a random looking fixed length string on the other side. Finding information that hashes to a specific non-random output is next to impossible. There is no private key that lets you do this. The construction is just an algorithm, with no associated key-pair at all. To find a valid input that maps to a specific type of output, you need to try all possible inputs one at a time, for a long time – and hope to get lucky. Other than such brute forcing, there is no way around this asymmetry. Let’s call this keyless-asymmetry.
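To see keyed-asymmetry in action, here is a toy digital signature in Python. It is a sketch under loud assumptions: it uses textbook RSA with tiny primes (61 and 53), which is trivially breakable and is not what Bitcoin uses (Bitcoin uses ECDSA). It only shows the shape of the thing: whoever holds the private exponent `D` can sign, and everyone else can only verify.

```python
# Toy textbook RSA. The primes are absurdly small, for illustration only.
P, Q = 61, 53
N = P * Q                    # public modulus
PHI = (P - 1) * (Q - 1)
E = 17                       # public exponent: anyone may know this
D = pow(E, -1, PHI)          # private exponent (needs Python 3.8+): the "key"

def sign(message: int) -> int:
    """Only the holder of D can compute this. `message` must be below N."""
    return pow(message, D, N)

def verify(message: int, signature: int) -> bool:
    """Anyone can check a signature using just the public values E and N."""
    return pow(signature, E, N) == message

signature = sign(65)
genuine = verify(65, signature)    # the signed message checks out
tampered = verify(66, signature)   # a different message does not
```

With real key sizes, recovering `D` from `E` and `N` is believed to be infeasible. The flip side is the point of this article: anyone who gets hold of `D` gets in.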

Given that background, let’s talk about the costs of secure storage of traditional store-of-value assets that Noah alluded to in his counter-rebuttal. My contention is that these assets are secured by poor physical world implementations of keyed-asymmetry. For example, Fort Knox is secured with a building, vaults, security protocols, and armed guards with guns. It’s assumed that unauthorized access through a break-in is impossible. But if you have authorization from someone in charge, you could walk in and walk out with the gold. This is equivalent to securing something with keyed-asymmetry. The private-key gives you access. Without the private key, even a James Bond villain cannot break in. Note that if the system didn’t allow the idea of a private key, that gold would be lost forever. A private-key is essential to making the gold visible/verifiable/transferable. A public good is being secured with a private-key, where the key-holder is supposedly competent and incorruptible.

In the digital world, public-key cryptography implements keyed-asymmetry in an ideal way, where security and authorized access are cheap, but unauthorized access is impossibly expensive. Digital signatures even go as far as revealing what the asset is, but just prevent forgery/confiscation of the asset. Physical manifestations of keyed-asymmetry, like the locks and vaults of Fort Knox, or social constructs like police-protection for your home, or even paper-and-pen signatures, are not even close to being as asymmetric in their nature as public-key cryptography is. They are poor substitutes, but we will give them a pass because they are, well, physical, and human ingenuity has not yet been able to import number theoretic cryptographic primitives to meat-space.

The key thing to note with keyed-asymmetry is that it is keyed. Access to the private key gives access to the asset. If you want to build a public good that has to be stored securely, is publicly visible, and doesn’t allow private key access – to anyone – governments, powerful corporations, venture capitalists, selective stakeholders – you just cannot use keyed-asymmetry. Something keyless has to be deployed: cryptographic hash functions. Used cleverly, they can store the asset securely, keep the asset publicly visible, and more importantly, prevent easy access – because cryptographic hash functions are truly one way functions (unless P=NP, but let’s not get into that). This clever way of using cryptographic hash functions to achieve an immutable public ledger is what Satoshi Nakamoto invented with Bitcoin. Remember that with hash functions, given that the output has to start with (say) 20 zeroes, one cannot find the corresponding input easily. They have to necessarily brute force it, by spending energy. This spent energy is what keeps Bitcoin’s public good, the blockchain, immutable – and not some key-holder’s competence and incorruptibility.
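Here is what that brute forcing looks like, as a minimal Python sketch. It is a simplification of what Bitcoin does: it uses a single SHA-256 over some bytes plus a counter (a “nonce”), counts leading zero bits, and uses a low difficulty of 12 bits so that it finishes instantly. The function names are my own.

```python
import hashlib

def leading_zero_bits(digest: bytes) -> int:
    """Number of zero bits at the front of a hash output."""
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()

def hash_with_nonce(data: bytes, nonce: int) -> bytes:
    return hashlib.sha256(data + nonce.to_bytes(8, "big")).digest()

def mine(data: bytes, difficulty_bits: int):
    """Brute force: try nonces one by one until the hash has enough leading zeros."""
    nonce = 0
    while leading_zero_bits(hash_with_nonce(data, nonce)) < difficulty_bits:
        nonce += 1
    return nonce, nonce + 1      # the winning nonce and how many hashes it took

def verify(data: bytes, nonce: int, difficulty_bits: int) -> bool:
    """Checking the work takes exactly one hash, and no key of any kind."""
    return leading_zero_bits(hash_with_nonce(data, nonce)) >= difficulty_bits

nonce, attempts = mine(b"toy block", difficulty_bits=12)
print(f"found nonce {nonce} after {attempts} hashes")
print("valid:", verify(b"toy block", nonce, 12))
```

Every extra bit of difficulty doubles the expected number of hashes for the miner, while the verifier’s cost stays at one hash. Nobody holds a key that skips the loop.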

Note that it is incidental that Bitcoin also separately uses public-key cryptography to protect individual bitcoins.[3]

Back to the physical world: let’s say Fort Knox were transparent so that everyone could verify that there is gold inside. Now, everyone who needs to protect their purchasing power also wants to contribute their share in guarding Fort Knox so that even authorized entry is not possible. A physical world implementation of keyless-asymmetry. How could we do that? How much energy would that require? First of all, it’s not possible to contemplate such a physical system, but more importantly, even if you did contemplate such a system, it’s easy to see that it would consume an inordinate amount of public energy. I would be shocked if such a keyless-asymmetric security structure even existed in the physical world. Humans seem to have given up on that idea, and have come up with a trust based model where we go back to locks and vaults, but trust that key-holders are competent and incorruptible. Given this trust based model, Fort Knox like storage of gold is indeed cheaper[4] than Bitcoin’s expensive way of storing its equivalent of the gold.

Well, Satoshi didn’t go with the trust based model. Proof-of-work is a physical world realization of keyless-asymmetry. Bitcoin’s blockchain, being a public good – if it was secured using keyed-asymmetry – would have left us open to incompetence and corruption of key-holders. Bitcoin is, thankfully, secured by keyless-asymmetry. On the contrary, all physical goods (homes, paper documents, gold, country borders, etc.) and most digital goods (emails, bank ledgers, the Fed money printer, etc.) are secured by keyed-asymmetry. If a public good is secured with keyed-asymmetry, you should be worried. Key-holders have to be competent and incorruptible – forever.

Keyed-asymmetry in the digital world is, of course, public-key cryptography, and hence, Bitcoin cannot be secured by public-key cryptography.

Given this setting, why proof-of-stake does not work for Bitcoin is just a corollary.

References
1 Bloomberg paywall link.
2 Independently, Nic Carter has many rebuttals against Bitcoin’s energy consumption FUD. See here, here, and here.
3 Satoshi’s admission about the choice of the secp256k1 curve for Bitcoin’s implementation of ECDSA as “I didn’t find anything to recommend a curve type so I just… picked one.” is quite illuminating in that Satoshi probably didn’t care as much about what public-key cryptography was used in Bitcoin as long as it did its job while maintaining a small footprint.
4 It’s ironic that Fort Knox itself is probably quite expensive to maintain. But that’s just poor implementation, and perhaps a bit of security theater.