December 29, 2016. This is a mirror of the post at https://medium.com/@VitalikButerin/a-proof-of-stake-design-philosophy-506585978d51
Systems like Ethereum (as well as Bitcoin, NXT, and Bitshares) represent a new class of cryptoeconomic organism: decentralized, jurisdictionless entities that exist entirely in cyberspace, maintained by a combination of cryptography, economics, and social consensus. They are somewhat like BitTorrent, but not quite, because BitTorrent has no concept of state, a distinction that turns out to be crucially important. They are sometimes described as decentralized autonomous organizations, but they are not quite corporations either: you cannot hard fork Microsoft. They are somewhat like open-source software projects, but not entirely that either: while you can fork a blockchain, it is not as easy as forking OpenOffice.
These cryptoeconomic networks come in many styles (ASIC-based PoW, GPU-based PoW, naive PoS, delegated PoS, and soon Casper PoS), and each style inevitably comes with its own underlying philosophy. One well-known example is the maximalist view of proof of work, where "the" correct blockchain, singular, is defined as the chain that miners have burned the largest amount of economic capital to create. Originally a mere intra-protocol fork-choice rule, this mechanism has in many cases been elevated to a sacred principle; see my Twitter discussion with Chris DeRose for an example of someone seriously trying to defend the idea in its pure form, even in the face of hash-algorithm-changing protocol hard forks. Bitshares' delegated proof of stake presents another coherent philosophy, where everything once again flows from a single principle, though one that can be described even more simply: shareholders vote.
Each philosophy (Nakamoto consensus, social consensus, shareholder voting consensus) leads to its own set of conclusions and a value system that makes sense when viewed on its own terms, though each can certainly be criticized when compared against the others. Casper consensus has a philosophical foundation too, although it has not yet been succinctly articulated.
Vlad, Dominic, Jae, and I, among others, each have our own views on why proof of stake protocols exist and how to design them; here, I intend to explain where I personally am coming from.
I will list my observations and conclusions directly.
-
Cryptography is truly special in the 21st century because it is one of the very few fields where adversarial conflict continues to heavily favor the defender. Castles are far easier to destroy than to build, and islands are defensible but can still be attacked, but an average person's ECC keys are secure enough to resist even state-level actors. Cypherpunk philosophy is fundamentally about leveraging this precious asymmetry to create a world that better preserves individual autonomy, and cryptoeconomics is to some extent an extension of that philosophy, except this time protecting the safety and liveness of complex systems of coordination and collaboration, rather than just the integrity and confidentiality of private messages. Systems that consider themselves ideological heirs to the cypherpunk spirit should maintain this basic property: it should be far more costly to destroy or disrupt them than to use and maintain them.
-
The "cypherpunk spirit" is not just about idealism; making systems that are easier to defend than to attack is also simply sound engineering.
-
On medium to long timescales, humans are quite good at consensus. Even if an adversary had unlimited hashing power and launched a 51% attack on any major blockchain, reverting even a month of history, convincing the community that their chain is the legitimate one is much harder than simply out-hashing the main chain. They would need to subvert block explorers, every trusted member of the community, the New York Times, archive.org, and many other sources across the internet; all in all, convincing the world that the new attack chain is the one that came first, in the information-dense 21st century, is about as hard as convincing the world that the American moon landings never happened. In the long run, these social considerations are what ultimately protect any blockchain, whether or not the blockchain's community admits it (note that Bitcoin Core does acknowledge this primacy of the social layer).
-
However, a blockchain protected by social consensus alone would be far too inefficient and slow, and would make it too easy for disagreements to persist without end (despite the difficulty, such things have happened); hence, economic consensus plays an extremely important role in protecting liveness and safety properties in the short term.
-
Because proof of work security can only come from block rewards (in Dominic Williams' terms, it lacks two of the three Es), and incentives for miners can only come from the risk of losing their future block rewards, proof of work necessarily operates on a logic of massive power incentivized into existence by massive rewards. Recovering from a PoW attack is very difficult: the first time it happens, you can hard fork to change the PoW and render the attacker's ASICs useless, but the second time you no longer have that option, so the attacker can attack again and again. Hence the mining network must be so large that attacks are inconceivable: by making the network spend X every single day, attackers of size smaller than X are discouraged from ever appearing. I reject this logic because (i) it kills trees, and (ii) it fails to realize the cypherpunk spirit: the cost of attack and the cost of defense are at a 1:1 ratio, so there is no defender's advantage.
-
Proof of stake breaks this symmetry by relying not on rewards for security, but on penalties. Validators put capital at stake ("deposits"), and are compensated slightly for locking up that capital, maintaining nodes, and taking extra precautions to keep their private keys safe, but the bulk of the cost of reverting transactions comes from penalties hundreds or thousands of times larger than the rewards earned in the meantime. The "one-sentence philosophy" of proof of stake is thus not "security comes from burning energy," but rather "security comes from putting up economic value-at-loss." A given block or state has $X worth of security if you can prove that achieving an equal level of finalization for any conflicting block or state cannot be accomplished unless malicious nodes collude and incur $X worth of in-protocol penalties.
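To make the asymmetry concrete, here is a toy calculation of my own, with invented numbers that are not any protocol's actual parameters: in proof of work, outspending the honest miners costs roughly what the network spends on defense, while in a slashing-based proof of stake design, reverting a finalized block forfeits deposits that dwarf the rewards paid out in the meantime.

```python
# Toy illustration (not any real protocol) of the attack/defense asymmetry:
# in PoW the attacker's cost roughly matches ongoing defense spending, while
# in PoS the attacker forfeits deposits far larger than validator rewards.

def pow_attack_cost(daily_mining_spend_usd: float) -> float:
    """Outspending the honest miners costs about what they spend defending."""
    return daily_mining_spend_usd  # ~1:1 attack/defense ratio

def pos_attack_cost(total_deposits_usd: float, slashed_fraction: float) -> float:
    """Reverting finality requires slashable misbehavior by a large fraction
    of deposits (1/3 in many BFT-style designs)."""
    return total_deposits_usd * slashed_fraction

if __name__ == "__main__":
    # Numbers chosen only to show the ratio, not to describe any real chain.
    print(pow_attack_cost(daily_mining_spend_usd=2_000_000))                       # 2,000,000
    print(pos_attack_cost(total_deposits_usd=150_000_000, slashed_fraction=1/3))   # 50,000,000
```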
-
In theory, a majority collusion of validators could take over a proof of stake chain and begin acting maliciously. However, (i) through clever protocol design, their ability to earn extra profits through such manipulation can be limited as much as possible, and more importantly, (ii) if they try to prevent new validators from joining, or to execute a 51% attack, the community can simply coordinate a hard fork that deletes the offending validators' deposits. A successful attack might cost $50 million, but the process of cleaning up the aftermath would not be that much more onerous than the geth/parity consensus failure of November 25, 2016. Two days later, the blockchain and community were back on track; the attackers would be $50 million poorer, and the rest of the community would likely be richer, since the attack would cause the token's value to go up due to the ensuing supply crunch. That is the attack/defense asymmetry for you.
-
None of the above should be taken to mean that unscheduled hard forks will become a regular occurrence; if desired, the cost of a single 51% attack on proof of stake can certainly be set as high as the cost of a permanent 51% attack on proof of work, and the sheer cost and ineffectiveness of an attack should ensure that it is almost never attempted in practice.
-
Economics is not everything. Individual actors may be motivated by extra-protocol incentives, they may get hacked, they may get kidnapped, or they may simply get drunk one day and decide to wreck the blockchain, costs be damned. Moreover, on the bright side, individuals' moral forbearance and communication inefficiencies will often raise the cost of an attack well above the nominal value-at-loss defined by the protocol. This is an advantage we cannot rely on, but at the same time one we should not needlessly throw away.
-
Hence, the best protocols are those that work well under a variety of models and assumptions: economic rationality with coordinated choice, economic rationality with individual choice, simple fault tolerance, Byzantine fault tolerance (ideally both the adaptive and non-adaptive adversary variants), Ariely/Kahneman-inspired behavioral economic models ("we all cheat just a little"), and ideally any other model that is realistic and practical to reason about. It is important to have two layers of defense: economic incentives to discourage centralized cartels from acting antisocially, and anti-centralization incentives to discourage cartels from forming in the first place.
-
Consensus protocols that work as fast as possible carry risks and should be approached very carefully, because if the possibility of being very fast is tied to incentives to do so, the combination rewards very high, systemic-risk-inducing levels of network-level centralization (for example, all validators running from the same hosting provider). Consensus protocols that do not care too much about how fast a validator sends a message, as long as it arrives within some acceptably long time interval (e.g. 4-8 seconds, as we know empirically that latency in Ethereum is usually ~500 milliseconds to 1 second), do not have this concern. A possible middle ground is to create protocols that can work very quickly, but where mechanisms similar to Ethereum's uncle mechanism ensure that the marginal reward for a node increasing its network connectivity beyond some easily attainable point is fairly low.
From here, of course, there are many details and many ways to diverge on the details, but the above are the core principles on which my version of Casper is based. From here, we can certainly debate the tradeoffs between competing values. Do we give ETH a 1% annual issuance rate and accept a $50 million cost of forcing a remedial hard fork, or a zero annual issuance rate and a $5 million cost? When do we decrease a protocol's security under the economic model in exchange for increasing its security under a fault-tolerance model? Do we care more about a predictable level of security or a predictable level of issuance? These are all questions for another post, and the many ways of implementing the tradeoffs between these values are questions for yet more posts. But we will get to that.
Hard Forks, Soft Forks, Defaults, and Coercion
Quoting the description of a soft fork from the Bitcoin official website:
When new consensus rules are released, nodes that have not upgraded produce invalid blocks because they are unaware of the new consensus rules, resulting in temporary forks.
So, simply understood, soft forks are temporary: once the non-upgraded nodes upgrade to the latest rules, they return to the updated chain, and the fork can resolve back into a single chain. On a chain undergoing a soft fork, non-upgraded nodes can still validate blocks produced by upgraded nodes, and upgraded nodes can also validate blocks produced by non-upgraded nodes (so long as those blocks comply with the new rules). Let's illustrate this with an image.
Characteristics of soft forks:
-
Good compatibility: functionality from the previous version can still be used without upgrading.
-
The blockchain does not split, as shown in the image; there is only a distinction between new blocks and old blocks.
-
Non-upgraded nodes are tolerated long-term, with new and old blocks coexisting.
Hard Forks
Here, I also quote the English description from the Bitcoin official website:
A permanent divergence in the blockchain, commonly occurs when non-upgraded nodes can’t validate blocks created by upgraded nodes that follow newer consensus rules.
In other words, hard forks are permanent forks; once the fork occurs, the chains never reconverge. When the blockchain's rules are updated, non-upgraded nodes refuse to validate blocks generated by upgraded nodes, while upgraded nodes can validate blocks produced by non-upgraded nodes, resulting in two different chains. As the saying goes, "those whose paths differ do not make plans together": since we disagree, my old nodes do not accept your new blocks, so we each go our own way and leave each other alone. Let's use an image again.
Characteristics:
-
Not backward compatible: incompatible with previous versions, so upgrading is mandatory.
-
The blockchain splits into two chains, as shown in the image.
-
Requires nodes to agree to upgrade within a certain window; those who do not agree remain on the old chain.
II. Why Hard Forks Occur
The reason for soft or hard forks, as I personally understand it, is mainly that various proposals arise to address some issue, but because each side believes its own proposal is better, disagreements emerge, no one yields, and each side pursues its own proposal, resulting in a fork. The scaling debate, for instance, has produced a variety of proposals.
In the original Bitcoin system, block capacity was set at 1 MB, with a block confirmation time of 10 minutes, meaning the network could handle roughly 7 transactions per second. This processing capacity is quite low and easily leads to congestion. Some therefore proposed increasing the block size: some suggested 2 MB, others 20 MB, others unlimited capacity. Who should be listened to? With no one able to decide, each camp could only pursue its own version.
The best-known hard fork produced ETC and ETH. In July 2016, at block 1,920,000, the Ethereum development team modified the Ethereum software so as to forcibly move all funds of The DAO (a Decentralized Autonomous Organization) and its child DAOs to a designated refund contract address, thereby reclaiming the Ether controlled by the attacker. This produced two chains: ETC (the original chain) and ETH (the forked chain).
An important debate in the blockchain space is whether hard forks or soft forks are the preferred protocol upgrade mechanism. The fundamental difference is that soft forks change the rules by strictly restricting the set of valid transactions, so that nodes following the old rules still end up on the new chain (provided that the majority of miners/validators implement the fork), whereas hard forks allow previously invalid transactions and blocks to become valid, so users must upgrade their clients to stay on the forked chain. Hard forks also come in two subtypes: strictly expanding hard forks, where the new rules strictly expand the set of valid transactions (so the old rules are a soft fork relative to the new rules), and bilateral hard forks, where the two rule sets are mutually incompatible.
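This taxonomy is easy to state precisely if we treat each rule set as a validity predicate over blocks. The sketch below uses a single made-up rule (a block size limit) purely for illustration; the numbers and rules are hypothetical, not Bitcoin's or Ethereum's actual ones.

```python
# A minimal sketch of the fork taxonomy above, treating each rule set as a
# validity predicate over blocks. All rules and numbers are invented.

old_rules    = lambda block: block["size"] <= 1_000_000
soft_fork    = lambda block: old_rules(block) and block["size"] <= 500_000   # restricts the valid set
expanding_hf = lambda block: block["size"] <= 2_000_000                      # strictly expands it
bilateral_hf = lambda block: 1_000_000 < block["size"] <= 2_000_000          # incompatible with old rules

for size in (400_000, 900_000, 1_500_000):
    b = {"size": size}
    print(size, old_rules(b), soft_fork(b), expanding_hf(b), bilateral_hf(b))
# Every soft-fork-valid block is old-valid; every old-valid block is valid
# under the strictly expanding hard fork; this (extreme) bilateral rule set
# accepts only blocks that the old rules reject.
```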
Here is a Venn diagram to illustrate the types of forks:
The benefits of both are usually listed as follows.
-
Hard forks give developers more flexibility when upgrading the protocol, since they do not have to make sure the new rules "fit within" the old rules.
-
Soft forks are more convenient for users, as users can stay on the chain without needing to upgrade.
-
Soft forks are less likely to lead to chain splits.
-
Soft forks require agreement only from miners/validators (because even if users still follow the old rules, as long as the nodes making up the chain follow the new rules, only things valid under the new rules make it into the chain); hard forks require users to opt in.
In addition, a major criticism of hard forks is that they are "coercive." The coercion implied here is not physical force; it is coercion through network effects. That is, if the network changes its rules from A to B, then even if you personally prefer A, if most other users prefer B and switch to it, you must switch to B in order to stay on the same network as everyone else, even though you personally disagree with the change.
Supporters of hard forks are often mocked as attempting a "hostile takeover" of the network and "forcing" users to go along with them. Additionally, the risk of chain splits is often cited to paint hard forks as "unsafe."
It is my personal view that these criticisms are wrong, and in many cases completely backwards. This view is not specific to Ethereum, Bitcoin, or any other blockchain; it arises from the general properties of these systems and applies to all of them. Furthermore, the arguments below apply only to controversial changes, where at least one constituency (miners/validators or users) largely disagrees with the change; if a change is uncontroversial, it can generally be carried out safely regardless of the fork's format.
First, let's discuss the question of coercion. Hard forks and soft forks both change the protocol in ways that some users may not like; any protocol change that does not have 100% support will do so. Moreover, it is almost inevitable that at least some dissenters will value the network effects of staying with the larger group more than their own preferences about the protocol rules. Hence, in the network-effect sense, both types of fork are coercive.
However, there is an essential difference between the two: hard forks are opt-in, while soft forks allow users no way to "opt out." For a user to join a hard-forked chain, they must personally install the software package that implements the fork rules, and users who disagree with the rule change can, in theory, simply stay on the old chain; indeed, such an event has already happened.
This is true of both strictly expanding hard forks and bilateral hard forks. In the case of a soft fork, however, if the fork succeeds, there is no unforked chain left to stay on. Hence, soft forks are institutionally biased toward coercion rather than secession, while hard forks have the opposite bias. My own moral views lean toward favoring secession over coercion, though others may disagree (the most common argument is that network effects are really important and that it is crucial for "one coin to rule them all" [https://blog.ethereum.org/2014/11/20/bitcoin-maximalism-currency-platform-network-effects/], though more moderate versions of the argument also exist).
If I had to guess why, despite these arguments, soft forks are often billed as "less coercive" than hard forks, I would say it is because a hard fork feels like it "forces" users to install a software update, whereas with a soft fork users do not "have to" do anything at all. But this intuition is misguided: what matters is not whether individual users have to perform the trivial bureaucratic step of clicking a "download" button, but whether users are coerced into accepting changes to protocol rules they would rather not accept. By that standard, as argued above, both kinds of fork are ultimately coercive, and hard forks do a better job of preserving user freedom.
Now, let's look at controversial forks in particular, where the preferences of miners/validators conflict with those of users. There are three cases: (i) bilateral hard forks, (ii) strictly expanding hard forks, and (iii) so-called "user-activated soft forks" (UASF). A fourth category is a miner-activated soft fork without user consent; we will get to that later.
First, bilateral hard forks. In the best case, the situation is simple: the two coins trade on the market, and traders decide their relative value. From the ETC/ETH case, we have overwhelming evidence that miners are very likely to simply allocate their hashpower between the coins in proportion to price, so as to maximize profit, regardless of their ideological views.
Even if some miners profess an ideological lean toward one side, it is highly likely that enough miners will be willing to arbitrage any mismatch between the price ratio and the hashpower ratio, keeping the two in line. If a cartel of miners tries to form to boycott one chain, the incentive to defect is overwhelming.
There are two edge cases here. The first is the possibility that, because of an inefficient difficulty adjustment algorithm, the value of mining the coin falls along with its price while difficulty does not fall to compensate, making mining very unprofitable, with no miners willing to mine at a loss to keep the chain going until difficulty readjusts to equilibrium. Ethereum is not like this, but Bitcoin may well be. Hence, the minority chain may never get off the ground, and will simply die. Note that whether this is a good thing or not depends on one's view of coercion versus secession; as you can probably infer from what I wrote above, I personally think this kind of minority-chain-hostile difficulty adjustment algorithm is bad.
The second edge case is that if the gap is very large, the large chain can 51% attack the smaller chain. Even with ETH/ETC splitting at a 10:1 ratio, this has not happened; so it is certainly not inevitable. But if miners on the dominant chain prefer coercion over allowing secession and act on those values, it is always a possibility.
Next, let's look at strictly expanding hard forks. A SEHF has the property that the non-forked chain is valid under the forked rules, so if the forked chain's price is lower than the non-forked chain's, its hashpower will also be lower, and the non-forked chain will eventually be accepted as the longest chain under both the original clients' and the forked clients' rules; the forked chain would thus be "annihilated" [https://twitter.com/SatoshiLite/status/839673905627353088].
There is an argument that such a fork carries a strong inherent bias toward failure, since the possibility that the forked chain will be annihilated gets priced in, pushing the price down and making annihilation more likely... this argument seems quite strong to me, and so it is a good reason to make any controversial hard fork bilateral rather than strictly expanding.
Bitcoin Unlimited developers suggested solving this problem by manually making the hard fork bilateral, but a better option is to build the bilaterality in; for example, in Bitcoin's case, adding a rule that forbids some unused opcode, and then including a transaction containing that opcode on the non-forked chain, so that under the forked rules the non-forked chain is thereafter considered permanently invalid. In Ethereum's case, because of various details of how state computation works, nearly all hard forks are bilateral almost automatically. Other chains may have different properties depending on how they are built.
The last type of fork mentioned above is the user-activated soft fork. In a UASF, users activate the soft fork rules without bothering to get consensus from miners; miners may then be expected to join in for economic reasons. If many users do not go along with the UASF, there will be a coin split, leading to a scenario like that of a strictly expanding hard fork, except (and this is the really clever and devious part of the concept) the same "risk of annihilation" pressure that strongly disfavors the forked chain in a strictly expanding hard fork strongly favors the forked chain in a UASF. Even though a UASF is opt-in, it uses economic asymmetry to bias itself toward success (though the bias is not absolute: if a UASF is decidedly unpopular, it will not succeed, and will merely lead to a chain split).
However, UASFs are a dangerous game. Suppose, for example, that a project's developers want to carry out a UASF that converts an opcode which previously accepted all transactions into one that accepts only transactions complying with some cool new feature, but the change is politically or technically controversial and miners dislike it. The miners have a clever and devious response available: they can unilaterally implement a miner-activated soft fork under which all transactions using the UASF's new feature always fail.
Now, we have three sets of rules:
-
The original rules, under which opcode X is always valid.
-
The UASF rules, under which opcode X is valid only if the rest of the transaction complies with the new rules.
-
The miners' counter-fork rules, under which opcode X is always invalid.
Note that
(2) is a soft fork relative to (1), while (3) is a soft fork relative to (2).
Now, there is strong economic pressure supporting (3), so the soft fork fails to achieve its goal.
The conclusion is this: soft forks are a dangerous game, and they become even more dangerous if they are controversial and miners start fighting back. Strictly expanding hard forks are also a dangerous game. Miner-activated soft forks are coercive; user-activated soft forks are somewhat less coercive, though still quite coercive because of the economic pressure, and they have their dangers too. If you really want to make a controversial change, and have decided that the high social costs are worth it, do a clean bilateral hard fork, take some time to add proper replay protection, and let the market sort it out.
March 14, 2017
Defining the Problem
So what does a good token sale mechanism look like? One way to start is to look at the criticisms of existing sale models and derive a list of desired properties.
Let's do just that. Some natural properties include:
-
Certainty of valuation: if you participate in a sale, you should have certainty about at least a ceiling on the valuation (or, put another way, a floor on the fraction of tokens you are getting).
-
Certainty of participation: if you attempt to participate in a sale, you should generally be able to count on succeeding.
-
Capping the amount raised: to avoid being perceived as greedy (or to mitigate regulatory risk), the sale should have a cap on the amount of money collected.
-
No central banking: the token sale issuer should not end up holding so large a share of the tokens that they control the market.
-
Efficiency: the sale should not lead to substantial economic inefficiencies or deadweight losses.
Does this sound reasonable?
Well, here's the catch.
-
(1) and (2) cannot be satisfied simultaneously.
-
(3), (4), and (5) cannot all be satisfied simultaneously, at least without resorting to tricks.
These can be referred to as the “first token sale dilemma” and the “second token sale trilemma.”
The proof of the first dilemma is simple: suppose a sale guarantees users a $100 million valuation. Now suppose users try to put in $101 million. At least some of them will fail. The second trilemma follows from simple supply and demand. If you satisfy (4), you are selling all, or some fixed fraction, of the total token supply, so the valuation you sell at is proportional to the price you sell at. If you satisfy (3), you are putting a cap on the price. However, the equilibrium price at the quantity you are selling may exceed your price cap, in which case you get a shortage, which leads to either (i) the digital equivalent of standing in line for four hours at a popular restaurant, or (ii) the digital equivalent of ticket scalping; both are large deadweight losses, contradicting (5).
The first dilemma cannot be overcome; the tradeoff between certainty of valuation and certainty of participation is unavoidable, though when possible it seems better to choose certainty of participation over certainty of valuation. The closest we can come is to compromise on full participation in order to guarantee partial participation. This can be done with a proportional refund (for example, if $101 million of orders come in for $100 million of tokens, everyone gets a ~1% refund). We can also think of this mechanism as an uncapped sale in which part of the payment takes the form of locked-up capital rather than spending; from this angle, though, the lock-up requirement is itself an efficiency loss, so the mechanism fails to satisfy (5). And if ether holdings are poorly distributed, it arguably harms fairness by favoring wealthy stakeholders.
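As a concrete sketch of the proportional refund mechanism, using the article's $101 million / $100 million example (buyer names and amounts are made up):

```python
# A minimal sketch of a proportional-refund sale: if orders total $101M
# against a $100M cap, everyone keeps 100/101 of their order and gets the
# rest back.

def settle(orders: dict, cap: float):
    total = sum(orders.values())
    fill = min(1.0, cap / total)                          # fraction of each order accepted
    filled = {who: amt * fill for who, amt in orders.items()}
    refunds = {who: amt * (1 - fill) for who, amt in orders.items()}
    return filled, refunds

filled, refunds = settle({"alice": 60e6, "bob": 41e6}, cap=100e6)
print(filled["alice"], refunds["alice"])   # ~59.4M filled, ~0.59M refunded
```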
The second trilemma is hard to overcome, and many attempts to overcome it fail or backfire in unpredictable ways. For example, the Bancor sale considered capping the gas price of purchase transactions at 50 shannon (about 12x the normal gas price). However, this means the optimal strategy for a buyer is to set up many accounts and, from each account, send a transaction that triggers a contract, which then performs the purchase (the indirection prevents the buyer from accidentally buying more than they wanted, and reduces capital requirements). The more accounts a buyer sets up, the more they can potentially buy. Hence, in equilibrium, this could clog the Ethereum blockchain even more than BAT-style sales, where at least the $6,600 of transaction fees were spent on the sale itself rather than on what amounted to a denial-of-service attack on the network. Moreover, any kind of on-chain transaction racing severely harms fairness, because the cost of participating in the race is fixed while the reward is proportional to your capital, so the outcome disproportionately favors wealthy stakeholders.
Next Steps
There are three clever things you can do. First, you can run a reverse Dutch auction, like Gnosis, but with one change: instead of holding the unsold tokens, put them toward some kind of public good. Simple examples include: (i) an airdrop (redistributing them to all Ether holders); (ii) a donation to the Ethereum Foundation; (iii) donations to Parity, Brainbot, Smartpool, or other companies and individuals independently building infrastructure for the Ethereum ecosystem; (iv) some combination of the first three, perhaps in a ratio voted on by the token buyers.
Second, you can keep the unsold tokens but solve the "central banking" problem by committing to a fully automated plan for how they are spent. The reasoning here parallels why many economists are interested in rule-based monetary policy: even if a centralized institution has significant control over a powerful resource, much of the resulting political uncertainty can be eliminated if the institution credibly commits to following programmatic rules for how it uses that resource. For example, the unsold tokens can be handed to a market maker tasked with keeping the token's price stable.
Third, you can run a capped sale in which you limit the amount each person can buy. Doing this effectively requires a KYC process, but the nice thing is that a KYC entity can do the work once, verifying that each address represents a unique individual, and the resulting whitelist of user addresses can then be reused for every token sale, as well as for other applications that can benefit from this per-person model, such as Akasha's quadratic voting. There is still deadweight loss (inefficiency) here, because uninterested individuals will participate in sales simply because they know they can quickly flip the tokens on the market for a profit. But this is arguably not that bad: it creates a kind of crypto universal basic income, and if behavioral-economics assumptions like the endowment effect are even slightly true, it will also achieve the goal of distributing the tokens widely.
Is Single-Round Sales Good?
Back to the subject of "greed." I would say that, in principle, not many people object to a development team being able to spend $500 million on creating a great project that delivers $500 million of value. What people object to is (i) a brand-new, untested development team receiving $50 million all at once, and (ii) more importantly, the time lag between developers' rewards and token buyers' interests. In a single-round sale, the developers get only one chance to raise funds for building the project, and that chance is near the very start of the development process. There is no feedback mechanism by which a team first receives a small amount of funding to prove itself, and then receives more over time as it demonstrates that it is reliable and likely to succeed. At the time of the sale, there is comparatively little information about the quality of the development team, and once the sale is over, the developers' incentive to keep working is weaker than in a traditional company. "Greed" is not about receiving a lot of money; it is about wanting a large amount of funding without making the effort to prove you can spend it wisely.
If we want to strike at the heart of the problem, how do we solve it? I would say the answer is simple: move to mechanisms other than single-round sales.
I can suggest several examples for reference:
AngelShares, which ran its sale in 2014, selling a fixed percentage of all AGS every day for several months. Each day, people could contribute an unlimited amount to the crowdsale, and that day's AGS allocation would be split among all contributors. Essentially, this is like having hundreds of uncapped "micro-rounds" running over most of a year; I would argue the duration of such a sale could be stretched even further.
Mysterium, which held a little-noticed micro-sale six months before the big one.
Bancor, which recently agreed to put all funds raised into a market maker that will maintain price stability while keeping a price floor of 0.01 ETH. These funds cannot be withdrawn from the market maker for two years.
It is hard to see at first glance how Bancor's strategy relates to solving the time-lag problem, but an element of a solution is there. To see it, consider two scenarios. In the first, suppose the sale raises $30 million with a $10 million cap, but a year later everyone agrees the project is a flop. In this case, the price would try to fall below 0.01 ETH, the market maker would lose all of its funds defending the floor, and the team would be left with only $10 million to work with. In the second, suppose the sale raises $30 million with a $10 million cap, and two years later everyone is happy with the project. In this case, the market maker will never have been triggered, and the team effectively gets access to the entire $30 million.
A related proposal is Vlad Zamfir's "safe token sale mechanism." The concept is very broad and can be parameterized in many ways, but one parameterization is to sell coins at a ceiling price, set a price floor slightly below that ceiling, and then allow the gap between the two to widen over time, with development funds gradually released as long as the price sustains itself.
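Here is a minimal sketch of one such parameterization, with invented numbers for the ceiling, floor, and widening rate; it is only meant to show the shape of the mechanism, not Vlad's actual proposal.

```python
# A sketch of a ceiling/floor sale: tokens sell at a fixed ceiling price, a
# floor sits slightly below, and the gap widens over time. Funds covering
# floor * tokens_sold stay escrowed for buybacks; the widening gap is what
# gets released to the developers. All parameters are invented.

def ceiling(t_years: float) -> float:
    return 1.00                                   # sale price, held constant

def floor(t_years: float, widen_per_year: float = 0.15) -> float:
    return max(0.0, 0.95 - widen_per_year * t_years)

for t in (0.0, 1.0, 3.0):
    print(t, ceiling(t), round(floor(t), 2))      # gap grows from 0.05 to 0.50
```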
Arguably, none of the above three are sufficient; we want sales spread out over even more time, giving us more opportunity to see which development teams are most worth funding before handing them large amounts of capital. But even so, this seems like the most productive direction to explore.
Breaking Out of the Dilemma
Summing up the above analysis, we can see that while the dilemma and trilemma cannot be overcome head-on, they can be chipped away at through creative thinking, by overcoming weaker versions of the same problems. We can have certainty of participation by using time as a third dimension to arbitrage away the effects: if you miss round N of a sale, you can just wait for round N+1 a week later, when the price probably will not be that different.
We can have sales that are uncapped in aggregate but consist of multiple periods, with the sale within each period capped; this way, teams cannot ask for very large amounts of funding without first proving their ability to handle smaller rounds. We can sell only small portions of the token supply at a time, removing the associated political uncertainty by locking the remaining supply in a contract that commits to selling it automatically according to a pre-specified formula.
Here are some feasible mechanisms that follow these ideas:
Run a Gnosis-style reverse Dutch auction with a low cap (say, $1 million). If the auction sells less than 100% of the token supply, automatically put the remaining tokens into another auction with a cap 30% higher. Repeat until the entire token supply is sold.
Sell an unlimited number of tokens at a price of $X, putting 90% of the proceeds into a smart contract that guarantees a price floor of $0.9X, with the floor gradually declining toward zero, and the ceiling gradually rising toward infinity, over a five-year period.
Do the same as AngelShares, but stretch the schedule out over five years instead of a few months.
Run a Gnosis-style reverse Dutch auction; if the auction sells less than 100% of the token supply, put the remaining tokens into an automated market maker that maintains the token's price stability (if the price keeps rising, the market maker sells tokens, with some of the proceeds possibly going to the development team).
Put all tokens into a market maker immediately, with parameters and variables x (the minimum price), s (the fraction of all tokens sold so far), t (the time since the sale started), and T (the intended duration of the sale, say five years), which sells tokens at a price of k / (t/T - s) (this one is strange and may need more economic study; see the sketch below).
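To see how the pricing rule in that last mechanism behaves, here is a small sketch (k and the sample points are arbitrary choices of mine): when sales run ahead of schedule, s approaches t/T and the price blows up, throttling purchases back toward the schedule; when sales lag, the denominator grows and the price falls.

```python
# A sketch of the pricing rule price = k / (t/T - s), with arbitrary k.

def price(k: float, t: float, T: float, s: float) -> float:
    gap = t / T - s
    if gap <= 0:
        return float("inf")       # sold ahead of schedule: nothing available yet
    return k / gap

T = 5 * 365                       # intended sale duration in days (5 years)
for t, s in [(365, 0.10), (365, 0.19), (730, 0.20)]:
    print(t, s, price(k=1.0, t=t, T=T, s=s))
# Output: 10.0, then 100.0 (sales nearly caught up with the schedule,
# so the price spikes), then 5.0 (sales lag, so the price eases off).
```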
Note that other mechanisms should also be tried for solving other problems with token sales; for example, paying proceeds out to a multisig of curators who release funds only if milestones are met is one very interesting idea that deserves more exploration. However, the design space is highly multidimensional, and there are many more approaches worth trying.
Although the ideas behind the current Ethereum protocol have been fairly stable for two years, Ethereum did not spring into existence with its current conception fully formed. Before the blockchain launched, the protocol went through a number of significant evolutions and design decisions. The purpose of this article is to walk through the various evolutions the protocol underwent from inception to release; the countless work done on protocol implementations, such as Geth, cpp-ethereum, pyethereum, and EthereumJ, as well as the history of applications and businesses in the Ethereum ecosystem, is deliberately out of scope.
The history of Casper and sharding research is also out of scope. We could certainly publish more blog posts on the various ideas proposed and discarded by Vlad, Gavin, myself, and others, including "proof of proof of work," hub-and-spoke chains, hypercubes, shadow chains (arguably a precursor to Plasma), chain fibers, and various iterations of Casper, along with Vlad's rapidly evolving thinking on the incentives and properties of actors in consensus protocols, but that would be far too complex a story to tell in one article, so we will omit it for now.
Let's start with the earliest version of what would eventually become Ethereum, back when it wasn't even called Ethereum. When I visited Israel in October 2013, I spent quite a bit of time with the Mastercoin team, and even suggested a few features to them. After spending some time thinking about what they were doing, I sent the team a proposal to make their protocol more general, capable of supporting more types of contracts without adding an equally large and complex set of features:
https://web.archive.org/web/20150627031414/http://vbuterin.com/ultimate_scripting.html
Note that this is far from the later, much broader vision of Ethereum: it focused purely on the domain Mastercoin was already trying to cover, namely two-party contracts, where parties A and B would both put in funds and later receive funds according to some formula specified in the contract (for example, a bet might say "if X happens, give all the funds to A; otherwise give them all to B"). The scripting language was not Turing-complete.
The Mastercoin team was impressed, but they were not interested in dropping everything they were doing to move in this direction, and I became increasingly convinced that it was right to pursue the idea on its own. The second version appeared around December:
https://web.archive.org/web/20131219030753/http://vitalik.ca/ethereum.html
Here you can see the result of a substantial reorganization, largely the product of my hiking trip through San Francisco in November, when I realized that smart contracts could potentially be fully generalized. The scripting language was no longer just a way to describe the terms of a relationship between two parties; contracts were now fully fledged accounts in their own right, able to hold, send, and receive assets, and even to maintain permanent storage (at the time, permanent storage was called "memory," and the only temporary "memory" was the 256 registers). The language was also changed from a stack-based machine to a register-based machine on a whim; I had little justification for this, other than that it seemed more sophisticated.
Additionally, note that there was now a built-in fee mechanism:
At this point, Ether essentially was gas; after every computational step, the balance of the contract called by the transaction would drop a little, and if the contract ran out of funds, execution would halt. Note that under this "receiver pays" mechanism, the contract itself had to require senders to pay a fee to the contract, exiting immediately if the fee was absent; the protocol reserved 16 free execution steps for each contract, allowing it to reject non-paying transactions.
Up to this point, the Ethereum protocol was entirely my own creation. From here on, however, new participants began to join the fold. By far the most prominent on the protocol side was Gavin Wood, who contacted me in December 2013 with a message about my work:
“Hey Jeremy, great to see you interested in Ethereum…”
Gavin's initial contributions were twofold. First, you might notice that the contract calling model in the initial design was asynchronous: while contract A could create an "internal transaction" to contract B ("internal transaction" is Etherscan jargon; at first they were simply called "transactions," and later "message calls" or "calls"), the internal transaction's execution would not begin until the first transaction had fully finished executing. This meant transactions could not use internal transactions as a way to get information from other contracts; the only way to do that was the EXTRO opcode (somewhat like an SLOAD usable to read other contracts' storage), and this too was later removed, with support from Gavin and others.
When implementing my initial specification, Gavin naturally made internal transactions synchronous without even realizing the intent was different; that is, in Gavin's implementation, when one contract calls another, the internal transaction executes immediately, and once it finishes, the VM returns to the calling contract and proceeds with the next opcode. This approach seemed superior to both of us, so we decided to make it part of the specification.
Second, a discussion between us (during a walk around San Francisco, so the exact details will be forever lost to the winds of history, though perhaps one or two copies survive in the NSA's deep archives) led to a rework of the transaction fee model, moving from the "contract pays" approach to a "sender pays" approach, and switching to the "gas" architecture. Instead of each computational step immediately taking a little Ether, the transaction sender pays for and is allotted some "gas" (roughly, a counter of computational steps), and computational steps draw down that allowance. If a transaction runs out of gas, the gas is still forfeited, but the whole execution is reverted; this seemed to be the safest approach, since it eliminated an entire class of "partial execution" attacks that contracts previously had to worry about. When transaction execution finishes, the fee for any unused gas is refunded.
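Here is a minimal sketch of that "sender pays" gas model. This is not the real EVM, just the accounting logic described above: gas is bought up front, each step consumes some, running out of gas reverts all state changes while still forfeiting the gas, and leftover gas is refunded at the end.

```python
# A toy "sender pays" gas model (illustrative, not the actual EVM).

class OutOfGas(Exception):
    pass

def execute(state: dict, sender: str, gas_limit: int, gas_price: int, steps):
    state[sender] -= gas_limit * gas_price      # sender buys gas up front
    snapshot = dict(state)                      # checkpoint for reverting
    gas = gas_limit
    try:
        for cost, effect in steps:              # each step: (gas cost, state change)
            if gas < cost:
                raise OutOfGas()
            gas -= cost
            effect(state)
    except OutOfGas:
        state.clear(); state.update(snapshot)   # revert all execution...
        gas = 0                                 # ...but the gas is still forfeited
    state[sender] += gas * gas_price            # refund any unused gas

state = {"alice": 100_000, "bob": 0}
steps = [(5, lambda s: s.__setitem__("bob", s["bob"] + 1))] * 3
execute(state, "alice", gas_limit=1_000, gas_price=10, steps=steps)
print(state)   # alice ends up paying only for the 15 gas actually used
```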
Gavin can also be largely credited with a subtle shift in vision: from Ethereum as a platform for building programmable money, with blockchain-based contracts that hold digital assets and transfer them according to preset rules, to Ethereum as a general-purpose computing platform. This began as a subtle shift of emphasis and terminology, and the influence later strengthened with the growing emphasis on "Web 3," which cast Ethereum as one member of a suite of decentralized technologies, the other two being Whisper and Swarm.
In early 2014, there were also some changes suggested by others. After Andrew Miller and others proposed the idea, we ultimately returned to a stack-based architecture.
Charles Hoskinson suggested switching from Bitcoin's SHA256 to the newer SHA3 (or, more accurately, keccak256). After a period of controversy, debates with Gavin, Andrew, and others settled that the size of values on the stack would be limited to 32 bytes; the alternative under consideration, unlimited-size integers, had the problem that it was too hard to figure out how much to charge for additions, multiplications, and other operations.
Our first mining algorithm, dating to as early as January 2014, was a contraption called Dagger:
https://github.com/ethereum/wiki/blob/master/Dagger.md
Dagger was named after the "directed acyclic graph" (DAG), the mathematical structure used in the algorithm. The idea was that every N blocks, a new DAG would be pseudorandomly generated from a seed, the bottom layer of which would be a collection of nodes taking several gigabytes to store. Generating any single value in the DAG, however, would require computing only a few thousand entries. A "Dagger computation" involved grabbing some values from random positions in this bottom-level dataset and hashing them together. This meant there was a fast way to do a Dagger computation, with the data already in memory, and a slow but memory-free way: regenerating each needed DAG value from scratch.
The intent of this algorithm was to provide the same "memory hardness" as algorithms popular at the time, like Scrypt, while remaining light-client friendly. Miners would use the fast way, so their mining would be bounded by memory bandwidth (the theory being that commodity RAM is already heavily optimized, making it hard to optimize further with ASICs), while light clients could use the memory-free but slower version to verify. The fast way might take a few microseconds and the slow, memory-free way a few milliseconds, so verification remained quite feasible for light clients.
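A toy version of the idea follows. This is illustrative only, not the actual Dagger algorithm: each DAG value is the hash of a couple of pseudorandomly chosen earlier values, so a miner who memoizes the whole dataset computes values instantly, while a light client can recompute any single value from scratch at a modest cost.

```python
# Toy DAG dataset in the spirit of Dagger (not the real construction).
import hashlib

SEED = b"block-seed"

def h(*parts: bytes) -> bytes:
    x = hashlib.sha3_256()
    for p in parts:
        x.update(p)
    return x.digest()

def parents(i: int) -> list:
    """Two pseudorandomly chosen earlier indices (none for the root node)."""
    if i == 0:
        return []
    return [int.from_bytes(h(SEED, b"p", bytes([j]), i.to_bytes(8, "big")), "big") % i
            for j in range(2)]

def value(i: int, cache=None) -> bytes:
    if cache is not None and i in cache:
        return cache[i]
    v = h(SEED, i.to_bytes(8, "big"), *[value(p, cache) for p in parents(i)])
    if cache is not None:
        cache[i] = v
    return v

full_dataset = {}
for i in range(1024):                 # miner: build the whole dataset once (fast path)
    value(i, full_dataset)
print(value(500, full_dataset) == value(500))   # light client: recompute on demand
```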
The algorithm would change several more times over the course of Ethereum's development. The next idea we went through was "adaptive proof of work": here, the proof of work would involve executing randomly selected Ethereum contracts, and there was a clever reason to expect this to be ASIC-resistant: if an ASIC were developed, competing miners would have the incentive to create and publish many contracts that the ASIC was bad at executing. Since there is no such thing as an ASIC for general computation (that is just a CPU), we could use this adversarial incentive mechanism to make a proof of work that was essentially general computation.
The reason this failed is simple: long-range attacks. An attacker could start a chain from block 1, fill it with simple contracts they had built dedicated hardware for, and rapidly overtake the main chain. So... back to the drawing board.
The next algorithm was called random circuit, described in a Google Doc here, proposed by myself and Vlad Zamfir and analyzed by Matthew Wampler-Doty and others. The idea, again, was to simulate general computation inside the mining algorithm, this time by executing randomly generated circuits. There is no hard proof that something based on these principles cannot work, but the computer hardware experts we consulted in 2014 were fairly pessimistic about it. Matthew Wampler-Doty himself proposed a proof of work based on SAT solving, but this too was ultimately rejected.
Finally, we came full circle to an algorithm called "Dagger Hashimoto," sometimes abbreviated "Dashimoto." It borrowed heavily from Hashimoto, Thaddeus Dryja's proof of work that pioneered the notion of "I/O-bound proof of work," in which the dominant limit on mining speed is not hashes per second but megabytes of RAM accessed per second. It combined this with Dagger's concept of a light-client-friendly, pseudorandomly generated DAG dataset. After many rounds of tweaking by myself, Matthew, Tim, and others, the ideas finally converged into the algorithm we now call Ethash.
By the summer of 2014, the protocol had largely stabilized, the major exception being the proof of work algorithm, which did not reach the Ethash stage until around the beginning of 2015; a semi-formal specification existed in the form of Gavin's yellow paper.
In August 2014, I developed and introduced the uncle mechanism, which allows Ethereum's blockchain to have a shorter block time and higher capacity while mitigating centralization risks. It went in as part of PoC6.
Discussions with the Bitshares team led us to consider adding a heap as a first-class data structure, though we ultimately did not, due to lack of time; later security audits and denial-of-service attacks would show that doing this safely is actually much harder than we had thought at the time.
In September, Gavin and I planned out the next two major changes to the protocol design. First, alongside the state tree and transaction tree, every block would also contain a "receipt tree." The receipt tree would include hashes of the logs created by transactions, along with intermediate state roots. Logs would allow transactions to create "outputs" that are stored on the blockchain and accessible to light clients, but not accessible to future state computations. This could be used to let decentralized applications easily query for events, such as token transfers, purchases, the creation and filling of exchange orders, the start of auctions, and so on.
We considered other ideas too, such as making a Merkle tree out of a transaction's entire execution trace to allow proving anything about execution; logs were chosen as the compromise between simplicity and completeness.
The second idea was the concept of "precompiles," which solved the problem of allowing complex cryptographic computations to be usable in the EVM without having to pay EVM overhead. We also went through many more ambitious ideas about "native contracts": if miners had an optimized implementation of some contract, they could "vote" down that contract's gas price, so contracts that most miners could execute much more quickly would naturally have lower gas prices. But all of these ideas were rejected, because we could not find a cryptoeconomically safe way to implement them. An attacker could always create a contract that executes some trapdoored cryptographic operation, distribute the trapdoor to themselves and their friends so they could execute the contract much faster, then vote its gas price down and use it to DoS the network. Instead we opted for the much less ambitious approach of specifying a small number of precompiles in the protocol for common operations such as hashing and signature schemes.
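For concreteness, the set of precompiles that shipped in Frontier was tiny: four contracts at fixed low addresses (gas pricing rules omitted here).

```python
# The four precompiled contracts in Frontier, keyed by the address each
# occupies; calls to these addresses run native implementations rather
# than EVM bytecode.
FRONTIER_PRECOMPILES = {
    1: "ecrecover",   # secp256k1 public key recovery from a signature
    2: "sha256",
    3: "ripemd160",
    4: "identity",    # returns its input unchanged (a memcpy)
}
```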
Gavin was also a major early voice in developing the idea of protocol abstraction: moving many parts of the protocol, such as Ether balances, transaction signature algorithms, and nonces, into contracts within the protocol itself, with the theoretical end goal of reaching a point where the entire Ethereum protocol could be described as a function call into a virtual machine with some pre-initialized state. There was not enough time for these ideas to make it into the initial Frontier release, but the principles were expected to start being slowly integrated through some of the Constantinople changes, the Casper contract, and the sharding specification.
This was all implemented in PoC7; after PoC7, the protocol did not change much, except for some minor but in some cases important details discovered through security audits...
In early 2015 came the pre-launch security audits organized by Jutta Steiner and others, which included both software code audits and academic audits. The software audits were mainly of the C++ and Go implementations, led by Gavin Wood and Jeffrey Wilcke respectively, though there was also a smaller audit of my pyethereum implementation. Of the two academic audits, one was performed by Ittay Eyal (famous for "selfish mining"), and the other by Least Authority, with Andrew Miller and others. The Eyal audit led to a small protocol change: the total difficulty of a chain would not include uncles. The Least Authority audit focused more on smart contract and gas economics, as well as the Patricia tree, and led to several protocol changes. One small example: using sha3(addr) and sha3(key) as trie keys instead of the address and key directly, which made worst-case attacks on the trie harder.
This is a warning that may be a bit ahead of its time...
Another important thing we discussed was the gas limit voting mechanism. At the time, we were already concerned by the perceived lack of progress in the Bitcoin block size debate and wanted Ethereum to have a more flexible design that could adjust as needed. But the challenge was: what is the optimal limit? My initial idea was to make the limit dynamic, targeting 1.5x the long-term exponential moving average of actual gas usage, so that in the long run blocks would on average be 2/3 full. However, Andrew showed that this could be exploited: specifically, miners who wanted to raise the limit could simply include transactions in their own blocks that consume a large amount of gas but take very little time to process, thereby always creating full blocks at no cost to themselves. The security model, at least in the upward direction, was therefore equivalent to simply letting miners vote on the gas limit.
We did not manage to come up with a gas limit strategy that was unlikely to be gamed, so Andrew's recommended solution was to let miners vote explicitly on the gas limit, with the default voting strategy being the 1.5x moving average rule. The reasoning was that we were still far from knowing the right way to set a maximum gas limit, and the risk of any specific approach failing seemed greater than the risk of miners abusing their voting power. Hence, we might as well simply let miners vote on the gas limit and accept the risk of the limit going too high or too low, in exchange for the benefits of flexibility and of miners being able to work together to quickly raise or lower the limit as needed.
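Here is a sketch of the two strategies side by side. The EMA decay constant is my own arbitrary choice; the per-block adjustment bound of parent_limit / 1024 is the rule Ethereum adopted for miner voting.

```python
# Two gas limit strategies: the considered 1.5x exponential-moving-average
# target, and the adopted miner-voting rule where each block may move the
# limit by at most parent_limit // 1024 toward the miner's target.

def ema_target(prev_ema: float, gas_used: int, alpha: float = 0.01) -> float:
    ema = prev_ema * (1 - alpha) + gas_used * alpha
    return 1.5 * ema                       # aim for blocks ~2/3 full on average

def voted_limit(parent_limit: int, miner_target: int) -> int:
    step = parent_limit // 1024            # maximum per-block adjustment
    if miner_target > parent_limit:
        return min(miner_target, parent_limit + step)
    return max(miner_target, parent_limit - step)

print(voted_limit(8_000_000, miner_target=10_000_000))   # 8_007_812: one step up
```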
This increases the opportunity for market manipulation, because a manipulator no longer merely burns money fighting against a single equilibrium; they may actually succeed in pushing a given currency from one equilibrium to another, profiting from successfully "predicting" (that is, causing) the shift. It also means there is a great deal of path dependence, and established brands matter a lot; witness the epic battle over which fork of the Bitcoin blockchain gets to be called Bitcoin, for one particularly striking example.
Another, perhaps more important, conclusion is that the market cap of an appcoin depends crucially on the holding time H. If someone builds a highly efficient exchange that lets users buy an appcoin in real time and immediately use it in the application, with sellers likewise able to cash out immediately, then the market cap will plummet. If a currency is stable or has optimistic prospects, this may not matter, since users actually see no disadvantage in holding the token rather than holding something else (i.e., zero "effective cost"), but if the prospects start to turn sour, such a well-functioning exchange can accelerate the collapse.
You might think exchanges are inherently so inefficient (requiring users to create an account, log in, deposit coins, wait for 36 confirmations, trade, and log out) that this will never happen, but ultra-efficient exchanges are on the horizon. Here is a thread discussing designs for fully autonomous synchronous on-chain transactions that can convert token A into token B, and possibly even then use token B to do something, all within a single transaction. Many other platforms are under development as well.
All of this suggests that relying purely on the medium-of-exchange argument to support a token's value, attractive though it is for its apparent ability to print money out of thin air, is ultimately quite brittle. Thanks to user irrationality and the implicitly zero cost of holding tokens, protocols using this model may last for a while, but it is a model that always carries the unavoidable risk of collapsing at any moment.
So what are the alternatives? One simple alternative is the etherdelta approach, where the application simply collects fees in the interface. A common criticism is: can't someone fork the interface and take the fees out? A rebuttal is: someone can also fork the protocol and replace your protocol token with ETH, BTC, DOGE, or whatever else users prefer. One can make a more sophisticated argument that this is hard because the "pirate" version has to compete with the "official" version for network effects, but one can just as easily make an official paid client that refuses to interact with non-paying clients; this kind of network-effect-based enforcement is similar to how value-added taxes are typically enforced in Europe and elsewhere. Buyers on the official client will not interact with sellers on unofficial clients, and official-client sellers will not interact with unofficial-client buyers, so a large group of users would have to switch to the "pirate" client simultaneously to successfully dodge the fees. This is not perfectly robust, but it is certainly as good as the network effect behind a newly created protocol token.
If developers want upfront revenue to fund initial development, they can sell tokens and commit to using all fees collected to buy back and burn tokens; this would back the tokens with the expected future value of fees spent in the system. This design can be turned into a more direct utility token by requiring users to pay fees in the token, with the interface automatically purchasing tokens on an exchange if the user does not already hold any.
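As a toy sketch of the buy-and-burn mechanic (the class name and all numbers are hypothetical):

class BurnToken:
    def __init__(self, supply: float):
        self.supply = supply

    def process_fees(self, fee_revenue_usd: float, token_price_usd: float):
        # Use fee revenue to buy tokens on the market, then destroy them.
        bought = fee_revenue_usd / token_price_usd
        self.supply -= bought
        return bought

t = BurnToken(supply=1_000_000)
burned = t.process_fees(fee_revenue_usd=50_000, token_price_usd=2.0)
print(burned, t.supply)   # 25000.0 tokens burned; supply shrinks permanently

The burn is what turns fee revenue into support for the token's value: every unit of fees permanently shrinks the supply.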
Importantly, for a token to have a stable value, it is highly beneficial for the token supply to have sinks: places where tokens actually disappear, so the total token quantity decreases over time. This way, users pay a more transparent and explicit fee, rather than a highly variable and hard-to-calculate "effective fee," and there is also a more transparent and explicit way to figure out what the value of the protocol token should be.
November 19, 2022
Exchanges' attempts to use cryptographic proofs to show that they are not cheating their users go back a long way. In 2011, MtGox, then the largest Bitcoin exchange, proved that they held funds by sending a transaction that moved 424242 BTC to a pre-announced address. In 2013, discussions began on how to solve the other side of the problem: proving the total size of customers' deposits. If you prove that customers' deposits equal X ("proof of liabilities"), and prove ownership of the private keys of X coins ("proof of assets"), then you have a proof of solvency: you have proven the exchange has the funds to pay back all of its depositors.
The simplest way to prove deposits is to simply publish a list of (username, balance) pairs. Each user can check that their balance is included in the list, and anyone can check the full list to see that (i) every balance is non-negative, and (ii) the total sum is the claimed amount. Of course, this violates privacy, so we can change the scheme a little bit: publish a list of (hash(username, salt), balance) pairs, and send each user privately their salt value. A toy version of this variant is sketched below.
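For concreteness, here is a minimal sketch of the salted-hash list (all usernames, salts, and balances are made up):

import hashlib, os

def commitment(username: bytes, salt: bytes, balance: int):
    # Published entry: (salted hash of the username, balance in the clear)
    return (hashlib.sha256(username + salt).hexdigest(), balance)

salts = {u: os.urandom(16) for u in (b'alice', b'bob')}     # sent to each user privately
published = [commitment(b'alice', salts[b'alice'], 20),
             commitment(b'bob', salts[b'bob'], 50)]
assert all(bal >= 0 for _, bal in published)                # (i) non-negative balances
assert sum(bal for _, bal in published) == 70               # (ii) sum equals claimed total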
But even so, this still leaks balances and reveals patterns of balance changes. The desire to protect privacy brings us to the next invention: Merkle tree technology.
Merkle tree technology involves putting the table of customer balances into a Merkle sum tree. In a Merkle sum tree, each node is a (balance, hash) pair. The bottom-layer leaf nodes represent the balances and salted username hashes of individual customers. In each higher-layer node, the balance is the sum of the two balances below it, and the hash is the hash of the two nodes below it. Like a Merkle proof, a Merkle sum proof is a "branch" of the tree, consisting of the sister nodes along the path from a leaf to the root.
The exchange sends each user a Merkle sum proof of their balance, and the user then has a guarantee that their balance is correctly included as part of the total. A simple example implementation follows.
import hashlib

# Assumed helper definitions so the snippet runs standalone; the original
# post links to a fuller implementation, and these are a plausible sketch.
def hash(x):
    return hashlib.sha256(x).digest()

def userdata_to_leaf(username, salt, balance):
    # A leaf is a (salted username hash, balance) pair
    return (hash(salt + username), balance)

EMPTY_LEAF = (b'\x00' * 32, 0)

def get_next_power_of_2(x):
    return 1 if x <= 1 else 2 * get_next_power_of_2((x + 1) // 2)

def log2(x):
    # Integer log2; assumes x is a power of 2
    return x.bit_length() - 1

# The function for computing a parent node given two child nodes
def combine_tree_nodes(L, R):
    L_hash, L_balance = L
    R_hash, R_balance = R
    assert L_balance >= 0 and R_balance >= 0
    new_node_hash = hash(
        L_hash + L_balance.to_bytes(32, 'big') +
        R_hash + R_balance.to_bytes(32, 'big')
    )
    return (new_node_hash, L_balance + R_balance)

# Builds a full Merkle tree. Stored in flattened form where
# node i is the parent of nodes 2i and 2i+1
def build_merkle_sum_tree(user_table: "List[(username, salt, balance)]"):
    tree_size = get_next_power_of_2(len(user_table))
    tree = (
        [None] * tree_size +
        [userdata_to_leaf(*user) for user in user_table] +
        [EMPTY_LEAF for _ in range(tree_size - len(user_table))]
    )
    for i in range(tree_size - 1, 0, -1):
        tree[i] = combine_tree_nodes(tree[i*2], tree[i*2+1])
    return tree

# Root of a tree is stored at index 1 in the flattened form
def get_root(tree):
    return tree[1]

# Gets a proof for a node at a particular index
def get_proof(tree, index):
    branch_length = log2(len(tree)) - 1
    # ^ = bitwise xor, x ^ 1 = sister node of x
    index_in_tree = index + len(tree) // 2
    return [tree[(index_in_tree // 2**i) ^ 1] for i in range(branch_length)]

# Verifies a proof (duh)
def verify_proof(username, salt, balance, index, user_table_size, root, proof):
    leaf = userdata_to_leaf(username, salt, balance)
    # The branch must cover every level up to the root
    branch_length = log2(get_next_power_of_2(user_table_size))
    for i in range(branch_length):
        if index & (2**i):
            leaf = combine_tree_nodes(proof[i], leaf)
        else:
            leaf = combine_tree_nodes(leaf, proof[i])
    return leaf == root
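As a quick sanity check, here is a hypothetical end-to-end run of the functions above (the user data is made up):

users = [
    (b'alice', b'salt0', 20),
    (b'bob',   b'salt1', 50),
    (b'carl',  b'salt2', 10),
]
tree = build_merkle_sum_tree(users)
root = get_root(tree)
print("total liabilities:", root[1])            # 80
proof = get_proof(tree, 1)                      # branch for bob (index 1)
assert verify_proof(b'bob', b'salt1', 50, 1, len(users), root, proof)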
The privacy leakage in such a design is much lower than a completely public list, and can be further reduced by adjusting branches each time the root is published, but some privacy leakage still exists: Charlie learns that someone has a balance of 164 ETH, some two users' balances sum to 70 ETH, and so on. An attacker controlling many accounts still has the potential to learn a lot about the exchange's users.
An important subtlety of this scheme is negative balances: what if an exchange with 1390 ETH in customer balances but only 890 ETH in reserves tries to cover the gap by adding a -500 ETH balance under some fake account in the tree? It turns out that this possibility does not undermine the scheme, which is why we specifically need a Merkle sum tree rather than a regular Merkle tree. Suppose Henry is a fake account controlled by the exchange, and the exchange puts -500 ETH there.
Greta's proof verification would fail: the exchange would have to give her Henry's -500 ETH node as her sister node, and she would reject it as invalid. Eve's and Fred's verifications would also fail, because the intermediate node above Henry has a total of -230 ETH, making it invalid as well! To get away with the theft, the exchange would have to hope that nobody in the entire right half of the tree checks their balance proof.
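Using the code above, here is roughly what Greta's failed check looks like (the node contents are made up; her sister node is Henry's fake -500 ETH leaf):

henry_node = (hash(b'henry-fake'), -500)        # negative-balance sister node
dummy_root = (hash(b'whatever'), 890)           # exchange's claimed (hash, total)
try:
    # Greta is leaf index 6 in an 8-user tree; her branch would include Henry.
    # [henry_node] * 3 is a stand-in branch; verification aborts at the first step.
    verify_proof(b'greta', b'salt7', 70, 6, 8, dummy_root, [henry_node] * 3)
    print("proof accepted (should never happen)")
except AssertionError:
    print("Greta's client rejects the proof: negative balance in the branch")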
If the exchange can identify 500 ETH worth of users that they are confident will either not bother checking their proofs, or will not be believed when they complain that they never received a proof, they could get away with the theft. But then the exchange could just as well exclude those users from the tree and achieve the same effect. Hence, the Merkle tree technique is basically as good as a proof-of-liabilities scheme can be, if a proof of liabilities is all we are after. But its privacy properties are still not ideal. You can go somewhat further by using Merkle trees in cleverer ways, such as making every individual coin or satoshi a separate leaf, but ultimately more modern technology offers better ways to do it.
ZK-SNARKs are a powerful technology. ZK-SNARKs may be to cryptography what transformers are to AI: a general-purpose technology so powerful that it completely steamrolls a whole series of application-specific techniques, developed in the decades prior, for a whole range of problems. And so, of course, we can use ZK-SNARKs to greatly simplify and improve the privacy of proof-of-solvency protocols.
The simplest thing we can do is to put all users' deposits into a Merkle tree (or more simply, a KZG commitment) and use ZK-SNARKs to prove that all balances in the tree are non-negative and sum to some claimed value. If we add a layer of hashing for privacy, each user's Merkle branch (or KZG proof) will show nothing about other users' balances.
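To be clear about what the SNARK is being asked to establish, here is the claim written as a plain-Python checker (not an actual circuit; the 64-bit range bound is an assumed parameter). The range check matters because inside a finite field a "negative" number is indistinguishable from a huge positive one:

def liabilities_claim(balances, claimed_total, max_balance=2**64):
    # Each balance must be range-checked, and the sum must match the claim.
    return all(0 <= b < max_balance for b in balances) and sum(balances) == claimed_total

assert liabilities_claim([20, 50, 10], 80)
assert not liabilities_claim([580, -500], 80)  # covering a gap with a negative balance fails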
The simplest version of proof of assets is the protocol we saw above: to prove that you hold X coins, you simply move X coins around at some pre-agreed time, or in a transaction whose data field contains the words "these funds belong to Binance." To avoid paying transaction fees, you can sign an off-chain message instead; both Bitcoin and Ethereum have standards for off-chain signed messages.
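As a rough illustration of the off-chain variant, signing such an attestation with an Ethereum key might look like this (using the eth-account library; the message text and the freshly generated key are stand-ins for an exchange's real setup):

from eth_account import Account
from eth_account.messages import encode_defunct

acct = Account.create()                          # stand-in for an exchange key
msg = encode_defunct(text="These funds belong to Example Exchange")
sig = Account.sign_message(msg, private_key=acct.key)
# Anyone can check the signature against the claimed address:
assert Account.recover_message(msg, signature=sig.signature) == acct.address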
This simple asset proof technique has two practical problems:
-
Handling Cold Storage
-
Double-use of collateral
For security reasons, most exchanges keep the great majority of customer funds in "cold storage": on offline computers, where transactions need to be signed manually and carried over to the internet. Literal air gaps are common: a cold storage setup I once used for personal funds involved a permanently offline computer generating a QR code containing the signed transaction, which I would scan with my phone. Given the values at stake, the security protocols used by exchanges are crazier still, often involving multi-party computation between several devices to further reduce the chance that a hack of any single device leaks a key. Given such a setup, making even a single extra message to prove control of an address is a costly operation!
Exchanges have several avenues:
-
Retain some public long-term use addresses. The exchange generates several addresses, publishes a proof of ownership for each address once, and then reuses those addresses. This is by far the simplest option, although it adds some constraints on how to protect security and privacy.
-
Have many addresses, randomly prove a few. The exchange would have many addresses, perhaps even using each address only once and retiring it after a single transaction. In this case, the exchange might have a protocol where a few addresses are randomly selected from time to time and must be "opened" to prove ownership. Some exchanges already do something similar through an auditor, but in principle this technique could be turned into a fully automated procedure (see the sketch after this list).
-
More complex ZKP options. For example, an exchange could make all of its addresses 1-of-2 multisigs, where one key is different per address, and the other is a blinded version of some "grand" emergency backup key stored in some complicated but very high-security way, e.g., a 12-of-16 multisig. To preserve privacy and avoid revealing its entire set of addresses, the exchange could even run a zero-knowledge proof over the blockchain, proving the total balance of all on-chain addresses that have this format.
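A toy version of the random-sampling idea from the second option (the beacon value and addresses are hypothetical); using a public randomness beacon as the seed keeps the exchange from choosing which addresses get opened:

import hashlib

def sample_addresses_to_open(addresses, beacon: bytes, k: int):
    # Rank addresses by hash(beacon || address); the first k must be "opened".
    return sorted(addresses, key=lambda a: hashlib.sha256(beacon + a).digest())[:k]

addrs = [b'addr%d' % i for i in range(100)]
print(sample_addresses_to_open(addrs, b'beacon-2022-11-19', 5))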
The other major issue is preventing double-use of collateral. It would be easy for exchanges to shuttle collateral back and forth between each other to do proof of reserves, pretending to be solvent when they actually are not. Ideally, proof of solvency would be done in real time, with the proof updating after every block. If this is impractical, the next best thing is to coordinate a fixed schedule between the different exchanges, e.g., proving reserves at 1400 UTC every Tuesday.
The final question is: can you do proof of assets on fiat? Exchanges hold not just cryptocurrency but also fiat currency within the banking system. Here, the answer is: yes, but such a procedure would inevitably rely on "fiat" trust models: the bank itself can attest to balances, auditors can attest to balance sheets, and so on. Given that fiat is not cryptographically verifiable, this is the best that can be done within that framework, but it is still worth doing.
Another approach is to cleanly separate the entity that runs the exchange and deals with asset-backed stablecoins like USDC from the entity that handles the cash-in and cash-out flows between crypto and the traditional banking system (USDC itself). Because USDC's "liabilities" are just ERC20 tokens on-chain, proof of liabilities comes "for free," and only proof of assets is needed.
Suppose we want to go further: we do not want to merely prove that the exchange has funds to repay its users. Instead, we want to completely prevent exchanges from stealing user funds.
The first major attempt at this was plasma, a scaling scheme popular in the Ethereum research circles in 2017 and 2018. Plasma works by dividing balances into a set of independent “coins,” each assigned an index and located at a specific position in the Merkle tree of the Plasma block. Valid transfers of coins require placing transactions in the correct position in the tree, the root of which is published on-chain.
A simplified diagram of plasma. Coins are held in a smart contract that executes the rules of the plasma protocol upon withdrawal.
OmiseGo attempted to build a decentralized exchange on this protocol, but they have since pivoted to other ideas; for that matter, so has Plasma Group itself, which is now the optimistic EVM rollup project Optimism.
It is not worth treating the technical limitations of plasma (e.g., proving coin defragmentation) as a morality tale about the whole concept. Since the peak of plasma discourse in 2018, ZK-SNARKs have become far more viable for scaling-related use cases, and as we said above, ZK-SNARKs change everything.
A more modern version of the plasma idea is what Starkware calls a validium: essentially the same as a ZK-rollup, except with data stored off-chain. This construction could be used for many use cases: conceivably anything where a central server runs some code and proves that it is executing the code correctly. In a validium, the operator has no way to