Three Stages of Blockchain Development

Exploration and discovery of blockchain application scenarios in the next 3 to 5 years can be roughly divided into three stages:

● Information "blockchainization";

● Value "blockchainization";

● Scenario "blockchainization".

23.1 Information "blockchainization": solving the problem of information fragmentation. Blockchain technology can maintain and verify a public transaction ledger. The application of any technology usually starts at the points where it can most improve performance and efficiency. Most people already have their own transaction ledgers in Alipay or with their bank. Transfers between Alipay accounts or within the same bank are usually credited instantly, because only Alipay or that one bank has to verify them. Interbank transfers and cross-border payments, however, take a long time to confirm, mainly because the various banks and immigration registration authorities have to verify and cross-check repeatedly, consuming considerable human and material resources.

According to Fan Bin, General Manager of IBM Global Enterprise Consulting Services for the Greater China banking sector, many banks are preparing or have already implemented the "blockchainization" of transaction data, typically in the form of consortium chains, significantly improving the efficiency of cross-border and interbank remittances and eliminating redundant work. Essentially, a blockchainized public ledger lets the systems of different countries and banks interconnect and synchronize data in real time, simplifying account opening and making services more convenient for each user. It also gives banks a comprehensive view of each user's credit data, reducing business risk and enabling more precise credit services.

In addition to the banking system's bills and payments, various social problems caused by information fragmentation in other real-life scenarios can also be resolved through information "blockchainization".

There are many foreseeable application scenarios for value "blockchainization," which generally need to possess the following characteristics:

● The products or services exchanged by both parties can be digitized;

● Services/products are standardized, and the evaluation system is clear and traceable;

● Services are provided by individuals, and consumption is also by individuals;

● As more individuals gradually join, the value of the network increases.

A simple example is the shared mobility industry, where the service providers are mainly drivers (car owners), and the consumers of the service are passengers. The driver takes the passenger from point A to point B, and the passenger pays the driver a certain fee, completing the transaction. The distance from point A to point B can be calculated using GPS navigation, and the travel time can be recorded with a timestamp. The fare is automatically generated based on a publicly disclosed calculation method, and after the driver delivers the passenger to the destination, confirmation occurs, and the passenger's account automatically transfers the money to the driver's account. As more passengers and drivers join the blockchain network, it becomes easier for drivers to receive more suitable orders, and passengers find it easier to get rides.
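The ride described above can be sketched in a few lines. This is only an illustration: the fare rule (`BASE_FARE`, `PER_KM`, `PER_MINUTE`), the `Trip` record, and the in-memory `balances` dictionary are all assumptions standing in for a publicly disclosed pricing method and on-chain accounts.

```python
from dataclasses import dataclass

# Hypothetical public fare rule; all three rates are illustrative assumptions.
BASE_FARE = 10.0
PER_KM = 2.0
PER_MINUTE = 0.5


@dataclass
class Trip:
    distance_km: float  # distance from A to B, as measured by GPS
    start_ts: int       # pickup timestamp (Unix seconds)
    end_ts: int         # drop-off timestamp (Unix seconds)


def fare(trip: Trip) -> float:
    """Compute the fare from distance and elapsed time using the public rule."""
    minutes = (trip.end_ts - trip.start_ts) / 60
    return round(BASE_FARE + PER_KM * trip.distance_km + PER_MINUTE * minutes, 2)


def settle(balances: dict, passenger: str, driver: str,
           trip: Trip, confirmed: bool) -> float:
    """Move the fare from passenger to driver once arrival is confirmed."""
    if not confirmed:
        raise ValueError("trip has not been confirmed")
    amount = fare(trip)
    if balances[passenger] < amount:
        raise ValueError("insufficient balance")
    balances[passenger] -= amount
    balances[driver] += amount
    return amount
```

Because the rule is public and its inputs (distance, timestamps) are recorded, either party can recompute the fare and check the transfer without relying on a platform in the middle.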

The main entrepreneurial opportunities in this stage lie in "disintermediation" scenarios, such as blockchainized versions of Didi, Meituan, and Taobao. These centralized companies do not have significant technical barriers (the order distribution algorithms of Didi and Meituan, for example, are relatively simple and easy to replicate), so the real challenges lie in user acquisition and in the throughput of on-chain transactions. Solving them depends on educating the public about blockchain and on further development of the technology. Other industries with similar characteristics, such as P2P insurance, credit, gambling, prediction, and gaming, can also achieve "disintermediation" of transactions and solve trust issues through value "blockchainization" if they meet the four characteristics above.

From the perspective of the chain's carrying capacity and the characteristics of information, on-chain information that requires distributed verification by a large number of nodes generally needs to possess the following characteristics:

● High information value, such as Bitcoin transfers;

● Each piece of information is independent and does not interfere with each other;

The first point ensures that more nodes have an economic motive to participate in verification; massive amounts of low-value data can simply be stored on centralized servers without much of a problem. The second point implies a degree of filtering: packaging information into blocks and chaining them works well for independent information flows, but may not work well where pieces of information depend on one another.

In the future, there may be a trust level for information, quantified based on the randomness of distributed storage and verification nodes. It is foreseeable that in more future application scenarios, only a few pieces of information, such as rights confirmation and transaction records, will require a large number or even all nodes in the network to verify. Various application scenarios, such as articles, images, videos, and music in apps, are merely projections of on-chain string mappings to real life.

The objects of asset on-chain include ownership, usage rights, and income rights of assets. Table 27-3 compares the priority distribution of ownership, usage rights, and income rights of different assets on-chain. It was found that:

(1) The content of the same asset on-chain differs, and the priority will vary. For example, when the on-chain asset is usage rights (leasing) and future income rights, it becomes the highest priority asset (Category 1 asset). In contrast, if the on-chain asset is ownership, its demand for disintermediation is low, and its liquidity is low, making it a Category 11 asset.

(2) The priority of income rights on-chain is relatively high. Income rights have a high demand for disintermediation, are highly liquid, and are very suitable for on-chain assets. Therefore, attention should be given to the on-chain process after asset tokenization.

(3) The priority of ownership on-chain is relatively low. Apart from traditional financial assets like stocks, securities, and receivables, the ownership of other assets has a lower priority for on-chain compared to income rights and usage rights, mainly due to low asset liquidity and low demand for disintermediation.

(4) The second-hand trading and leasing markets are potential growth points for asset on-chain. Whether in the second-hand trading market or the leasing market, there is a high demand for disintermediation of transaction content, especially in the leasing market, where the liquidity of asset usage rights is high, placing it at the top of the asset on-chain priority list.

Table 27-3 Asset On-Chain Priority Distribution

image

image

As a distributed accounting technology, blockchain can effectively meet the needs of transparency, decentralization, and privacy protection in the asset circulation process. With the rise of the "blockchain craze," some people blindly "praise" blockchain technology for promotional or speculative purposes, believing that all assets need to be on-chain. In reality, not all assets are suitable for on-chain. To determine whether an asset is suitable for on-chain, two aspects need to be clarified:

(1) Whether the on-chain content is the usage rights, ownership, or income rights of the asset. The priority will vary depending on the content of the same asset on-chain, with income rights having a relatively high priority and ownership having a relatively low priority.

(2) When assessing the priority of asset on-chain, it is necessary to consider factors from the supply layer, operational layer, and demand layer. This article lists four quantifiable indicators—demand for disintermediation, asset value, asset liquidity, and operability—as standards for assessing the feasibility of "blockchain technology + asset management." Among them:

● The demand for disintermediation is the original driving force for asset on-chain. Not all assets need to be on-chain. Blockchain technology must address the "pain points" in the current asset management process. If the trust relationship between buyers and sellers can be guaranteed in the current asset trading process, and assets can achieve rapid circulation, then such assets do not need to apply blockchain technology.

● Asset value and liquidity are the foundation for asset on-chain. First, asset on-chain has requirements for asset value and liquidity. For low-value assets or assets sensitive to customer pricing, or assets with low transaction frequency or one-time transactions, the feasibility of applying blockchain technology is low.

● Operability is the catalyst for asset on-chain. Operability is not the determining factor for asset on-chain, but it affects the speed of asset on-chain.

In addition, this article also draws the following two conclusions:

(3) The most suitable assets for priority on-chain include: usage rights and income rights of land, houses, and buildings; usage rights and income rights of specialized equipment and machinery; usage rights and income rights of precious metals and jewelry; income rights of coal, oil, and natural gas; ownership of virtual property in social, community, and entertainment platforms; ownership of stocks and bonds; ownership of receivables; income rights of antiques and calligraphy; usage rights of patents and trademarks.

(4) The second-hand trading and leasing markets are potential growth points for asset on-chain. This is due to two reasons: firstly, the usage rights of these two types of assets have a high demand for disintermediation.

From the perspectives of target audience, token structure, consensus mechanism, issuance method, and distribution method, the current state of token development can be summarized as follows:

(1) In terms of quantity, payment currency, general platform, and industry application tokens are relatively balanced, presenting a "tripod" situation.

(2) In terms of token structure, single-layer tokens are dominant, with multi-layer tokens developing in parallel. Among them, payment currency tokens are all single-layer tokens, while there are five types of second-layer tokens and one type of third-layer token.

(3) Consensus mechanisms and issuance methods are more flexible. In terms of consensus mechanisms, it has gradually evolved from the proof-of-work mechanism (PoW) initially adopted by Bitcoin to proof-of-stake (PoS) and Byzantine fault tolerance (BFT), then to delegated proof-of-stake (DPoS) and delegated Byzantine fault tolerance (DBFT), and finally to hybrid consensus mechanisms that combine various consensus mechanisms. In terms of initial issuance methods, depending on project characteristics, community management, and technical promotion needs, in addition to classic mining releases, other means have been introduced to promote the early development of tokens, including pre-mining, ICO crowdfunding, venture capital, airdrops, and rewards.

(4) Incentive methods are diversified. Mining rewards and transaction fee node rewards still play an important role in the current token economic system, especially for payment currency tokens and general platform tokens. For application platform tokens, the incentive methods are generally personalized based on the content of the token project.

(5) Community governance is integrated into the construction of the token economic system. Especially for general platform and application tokens, community governance mechanisms play an important role in promoting the sustainable development of the token economy.

Secondly, this article summarizes the current development models of the token economy. The development path of the token economy system represented by Bitcoin can be divided into three categories: first, from a technical perspective, solving issues such as information concealment, transaction efficiency, and high energy consumption; second, expanding application scenarios to implement the token economy in more projects; third, reducing market disturbances, i.e., minimizing the impact of market factors (such as price fluctuations) on the blockchain community. Specifically, the development from a single-layer token structure to a multi-layer token structure separates the value attributes and management attributes of tokens, so that the value fluctuations of tokens do not affect the normal operation of the blockchain network.

At the same time, the future development direction of the token economy system is discussed:

(1) For tokens designed from the perspective of technical improvement, the focus is on solving the technical difficulties of blockchain networks, emphasizing technological innovation and the universality of application scenarios. Therefore, such tokens often choose a single-layer token structure, and payment currency tokens and underlying technology development platform tokens characterized by a single-layer token structure will continue to be the mainstream development of the token economy system.

(2) For tokens serving application scenarios, the aim is to pursue the application capabilities of blockchain technology, and thus personalized adjustments will be made based on the content of the scenarios. Such tokens may choose either a single-layer token structure or a multi-layer token structure.

(3) When the token economy system requires stability in the blockchain community and aims to reduce the impact of market speculation, price fluctuations, etc., a multi-layer token structure is generally chosen. The multi-layer token structure includes two parts: management tokens and value tokens. Token economy systems based on community management are suitable for multi-layer token structures. Currently, the application scenarios of multi-layer token structures are still relatively few, but their potential in the design of future token economy systems can be expected.

Finally, this article points out the "pain points" in the development and application process of the token economy:

(1) The development of blockchain technology is still immature, manifested in the imperfection of underlying blockchain technology and the immaturity of blockchain technology's operational management model.

(2) The landing process of token projects is difficult. In terms of market capitalization, payment currency tokens account for 63%, general platform tokens account for 27%, while application tokens account for less than 10%. Excluding content, entertainment, advertising, and IoT technology, non-financial tokens account for only 1%.

(3) The development environment of the token economy is chaotic, with high uncertainty in landing. This is reflected in two aspects: 1) Token founders only regard the token economy as a means of fundraising, without considering the landing issues of the token economy. 2) There is much uncertainty regarding whether the token economy can land and how long it will take.

Decentralized Storage

Another popular innovation is decentralized storage, which is an application innovation that uses distributed storage technology to store files in chunks across different storage nodes. Compared to centralized storage, it offers a higher level of privacy protection, lower storage costs, and more redundant data backup copies, effectively avoiding single points of failure.

In fact, the connection between decentralized storage and blockchain is not that close; blockchain mainly serves as an incentive billing mechanism above the storage layer. "Miners" with idle hard disk devices contribute space and record their contributions in the blockchain through a specially designed storage proof mechanism, focusing on dimensions such as contribution duration, space size, and effective space utilization rate. Contribution can be used to obtain equivalent token rewards; users with storage needs must pay tokens to acquire more data storage space.
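A minimal sketch of how such contribution accounting might work is shown below. The scoring formula (space × duration × utilization) and the proportional split of a fixed reward pool are assumptions made purely for illustration; real projects define their own storage proof mechanisms and reward rules.

```python
def contribution(space_gb: float, hours: float, utilization: float) -> float:
    """Score one miner's contribution (assumed formula, for illustration only).

    utilization is the fraction of contributed space that holds real data.
    """
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return space_gb * hours * utilization


def split_rewards(miners: dict, pool: float) -> dict:
    """Split a token reward pool among miners in proportion to their scores.

    miners maps a miner name to a (space_gb, hours, utilization) tuple.
    """
    scores = {name: contribution(*stats) for name, stats in miners.items()}
    total = sum(scores.values())
    if total == 0:
        return {name: 0.0 for name in scores}
    return {name: pool * score / total for name, score in scores.items()}
```

With two miners offering the same space for the same duration, the one whose space is fully utilized would receive twice the reward of the one at 50% utilization under this assumed rule.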

However, the development of decentralized storage currently faces several issues, mainly in three areas.

  • Firstly, speculation on its token price has almost never stopped, making it difficult for users with genuine storage needs to endure the cost uncertainty caused by price fluctuations.

  • Secondly, the space contributed by different miners is extremely dispersed. Although nodes can be set up worldwide, local network conditions and the mechanical quality of hard disks keep the overall performance of the decentralized network low, far below that of centralized storage, so it is only suitable for cold data and personal data storage.

  • Finally, with the influx of miners, the supply of storage space in most projects far exceeds actual demand, leading to oversupply. Where the data will come from is the urgent question; unless the data source problem is solved, there will be little room for further development.

Cross-Chain

The two types of innovations mentioned earlier belong to application innovations, while cross-chain belongs to technological innovation. Different blockchain networks are independent, functioning as isolated data islands. Cross-chain technology builds bridges between these islands, providing the possibility of data interoperability between different chains.

The initial cross-chain was a direct connection between two chains, while the current stage of cross-chain resembles a hub where interactions between chains are no longer direct but occur through a relay chain for information transfer. It can be said that the development of cross-chain technology is the foundation for other innovations. For example, the following diagram illustrates a cross-chain network of the Polkadot cross-chain protocol, connecting other public chains within the same ecosystem.

image

Through the simple analysis of the two major categories of public chain technology application innovations above, we can easily see the role of investment incentives in promoting public chains. This is understandable; there is no free lunch. Without innovation, there would be no speculative explosion, and ecological prosperity would be impossible. Compared to the previous stage of barbaric growth, purely speculative projects are gradually disappearing. DeFi is a derivative of traditional finance, while decentralized storage is an exploration of the sharing economy, and the introduction of new concepts makes it closer to people's daily lives.

Development of Consortium Chains

After understanding the recent innovations in the public chain sector, let's take a look at the development of the consortium chain sector. Over the past few years in the consortium chain sector, I have witnessed the practical application of blockchain technology in enterprise business activities: how traditional businesses can reduce costs and increase efficiency with its support, and how it can lower the threshold for cooperation between enterprises. Of course, I have also run into the barriers that currently hold blockchain technology back. Personally, I summarize the development of blockchain technology in enterprises into three stages.

Data Preservation

Blockchain technology has characteristics such as temporal continuity, immutability, and traceability, making it very suitable for data storage that requires evidence preservation. Currently, applications like product traceability, internet courts, and electronic certificates belong to this stage, with most applications storing only data proofs on the blockchain rather than original data.

This approach reflects both the data carrying capacity of blockchain itself and the needs of the business. From the network's perspective, blockchain's demand for storage and bandwidth increases exponentially as data volume grows, so it is necessary to control sensibly which data goes on-chain. I will return to this point at the end of the technical section. Applications in the data preservation stage only store data on the blockchain and retrieve it as evidence when necessary; most of the time, the data lies dormant. For these applications, blockchain serves as an extra technical guarantee rather than an indispensable component. Moreover, each party mostly operates independently, without interaction among multiple participants. In this stage, blockchain is therefore highly substitutable. Currently, most enterprise blockchain applications are still at this stage.
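Storing a data proof rather than the original data can be sketched as follows. SHA-256 is used here as the fingerprint function, and the `ProofLedger` class is a hypothetical in-memory stand-in for the chain; neither is prescribed by any particular platform.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return a fixed-length SHA-256 fingerprint of the original data."""
    return hashlib.sha256(data).hexdigest()


class ProofLedger:
    """Hypothetical stand-in for a chain that stores only data proofs."""

    def __init__(self) -> None:
        self._proofs: dict[str, str] = {}  # document id -> fingerprint

    def anchor(self, doc_id: str, data: bytes) -> str:
        """Record the fingerprint; the original data stays off-chain."""
        if doc_id in self._proofs:
            raise KeyError(f"{doc_id} is already anchored")
        digest = fingerprint(data)
        self._proofs[doc_id] = digest
        return digest

    def verify(self, doc_id: str, data: bytes) -> bool:
        """Check whether presented data matches the anchored proof."""
        return self._proofs.get(doc_id) == fingerprint(data)
```

When evidence is needed, the holder presents the original file and anyone can recompute its fingerprint and compare it with the anchored one. The chain itself only ever carries 64 hex characters per document, regardless of file size.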

Data Exchange

Traditional inter-enterprise cooperation involves the exchange of commercial data. Generally, the data exchange model involves the data provider or demander offering API interfaces in a pull or push manner. However, once mutual data needs arise and more than two enterprises are involved, the problem becomes more complicated. Blockchain technology can conveniently solve this issue by deploying blockchain nodes within each enterprise, allowing each enterprise to interact only with its own maintained nodes. The blockchain mechanism automatically synchronizes data to other participants, and if any node has new data on-chain, the blockchain's event notification mechanism automatically notifies the internal applications of each enterprise. In this stage, enterprise cooperation heavily relies on blockchain as a data exchange hub, so the substitutability of blockchain is relatively low. Some enterprises have already reached the data exchange stage, such as the combination of federated learning and blockchain technology.
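The interaction pattern described above can be sketched as a toy model. The `Node` class, its `submit` and `subscribe` methods, and the direct peer-to-peer copy are illustrative assumptions; a real consortium chain would replicate data through its consensus protocol rather than a simple loop.

```python
class Node:
    """Toy blockchain node maintained by one enterprise (illustrative only)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.peers: list["Node"] = []
        self.ledger: list[dict] = []
        self.listeners = []  # callbacks registered by internal applications

    def subscribe(self, callback) -> None:
        """Let an internal application be notified of new on-chain data."""
        self.listeners.append(callback)

    def submit(self, record: dict) -> None:
        """An enterprise writes only to its own node; peers get a copy."""
        self._apply(record)
        for peer in self.peers:
            peer._apply(record)

    def _apply(self, record: dict) -> None:
        self.ledger.append(record)
        for callback in self.listeners:
            callback(record)  # event notification to local applications


def connect(nodes: list[Node]) -> None:
    """Fully connect the nodes so each one knows all the others."""
    for node in nodes:
        node.peers = [peer for peer in nodes if peer is not node]
```

Each enterprise calls `submit` and `subscribe` only on its own node, yet every participant ends up with the same records. Adding a fourth enterprise means adding one node, not building new point-to-point API integrations with each existing partner.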

Value Transfer

One significance of Ethereum is that it provides a medium that anchors real-world value to the digital world, showing us that a shift from an information network to a value network is possible. In the field of enterprise blockchain, this significance is magnified many times over. What enterprises offer comes down to goods, services, and solutions. If blockchain technology can convert these products into assets that circulate within the value network, asset applications can extend well beyond their basic value.

In this stage, blockchain technology is the cornerstone of the value network and is irreplaceable. Of course, we are still far from such a value network, and I cannot accurately describe what the future will look like. Enterprise blockchain has a relatively short history and is, overall, still in its infancy. However, the state is very optimistic about its prospects and is guiding the development of blockchain across the national economy and society from the top level.

"Blockchain+" is still in its first stage of development, but there is already a trend toward further advancement. Of course, we should also recognize that while the value of blockchain technology has been gradually validated over the past decade, it is still at an early stage, and the technology itself is not yet mature. The shift in thinking it brings is also disruptive, and it will take more time for blockchain technology to fully integrate into our work and lives. If one day every innovation in public chains can empower the real economy, enterprises can let go of the layers of commercial barriers they have built, and the public chain and consortium chain sectors can merge and intersect again, that will be the golden age of blockchain technology, with the value network within reach.

Does this mean that blockchain can transmit real value? It rests mainly on three things: anchoring, rights confirmation, and transaction (reconfirmation of rights).

Looking at today's internet giants like Huawei, Ant Group, Baidu, and Tencent, it's clear that they are deeply engaged in consortium chains. Currently, the waning heat of the cryptocurrency market indicates the decline of public chains, while the rise of the chain sector essentially points to the rise of consortium chains. Now, consortium chains occupy the majority, while public chains are heavily suppressed.

Digital currency trading platforms, such as Binance and Huobi, belong to which circle? They belong to the public chain circle.

"The conversion of enterprise products into circulating assets within the value network." This statement is somewhat unclear. Why do enterprises need to convert their products or services into the value network?

One way to envision the value network: information circulating on today's internet may be correct or incorrect, so it is impossible to assign it a measurable value. The value network aims to carry information that has value. The products and services that enterprises provide are inherently valuable, so they can move into the value network. Doing so may also generate additional value, but this is uncertain and remains a conjecture.

Many people may think that blockchain is a novel technology; however, it is not. It is merely old wine in a new bottle, as it does not create new technology but combines several already mature technologies, representing a form of integrative innovation. When we first start learning about blockchain, the most important thing is to grasp its technical characteristics and understand its technological foundation.

The Technical Foundation of Blockchain

At the same time, it helps you master the three most important characteristics of blockchain technology.

Blockchain technology has developed to the point where, especially after Ethereum integrated the concept of smart contracts with blockchain technology, it has become a medium between reality and the network. It can be said that blockchain is a carrier of value and a new type of social production relationship.

How to understand this statement? My personal understanding is that based on blockchain technology, we can break down the barriers between the real world and the network world, virtualize material, and materialize value. The future internet will not circulate information but rather living values.

The future is promising, but it must start with the present. Let’s return to the real world for a moment. Recently, a picture circulated widely online, humorously dubbed "the physical manifestation of blockchain technology." This example illustrates the technical characteristics of blockchain perfectly.

This is the entrance of a community in Shenyang, Liaoning, where residents use multiple locks connected together to form a simple access control system. Whoever has a car adds a lock; each lock has an identifier, and community car owners only need to use their keys to open the corresponding lock to access the gate. This prevents outside vehicles from occupying community parking spaces, demonstrating that ingenuity often lies among the people.

How does this illustrate the characteristics of blockchain technology? First, we need to clarify what characteristics blockchain has. Generally, we consider three: decentralization, traceability, and immutability. With this example in hand, we can work through these concepts one by one.

Decentralization: In this community access control system, each lock represents a homeowner. They do not need a property management company to manage it uniformly; they only need to manage and maintain their locks to ensure the system operates normally. Each lock varies in size and cost, and homeowners may have multiple cars, but in this system, there are no status differences. Moreover, when a new homeowner joins or a homeowner moves out, they only need to add or remove the corresponding lock.

Have you noticed that this contains the concept of decentralization? Access control is no longer managed by a "third-party institution" like a property management company; each lock is part of the management. Similarly, the original intention of blockchain is to eliminate centralized third-party institutions (refer back to Lecture 1). The data and state of the entire network are jointly maintained by all nodes in the network, with no status differences, only differences in available resources. Moreover, the offline status of any node does not affect the operation of the system. You can try to deduce what kind of network model and storage model should be chosen to achieve decentralization. In a system without a central node, it can be said that every node is a center, each node can provide services externally, and can also request services from other nodes. This is the characteristic of a peer-to-peer network model, where nodes are both data producers and consumers for each other.

image

On the other hand, because the roles of nodes are equal, the data stored by each node should be consistent, independently maintaining a complete blockchain structure. Even if some nodes lose data, as long as there is one node intact, historical data will not be lost. This effectively avoids system crashes caused by single points of failure, making it far more reliable than traditional data disaster recovery models. Of course, decentralization is only an ideal state. At this stage, blockchain decentralization is essentially relative decentralization, which we can also call multi-centralization. When understanding concepts, we need to apply rational thinking and also learn to use emotional thinking to accept the gradual changes in the intermediate process.

image

Traceability: In the access control system, each lock records relevant information about the homeowner, binding it to the homeowner. This allows for accountability in exceptional situations, such as when a homeowner forgets to lock the door, allowing outside vehicles to enter the community, reflecting the system's traceability.

In the actual operation of blockchain, the principle of completing information traceability is similar but more complex. Let me explain.

For a single node, blockchain can be considered a time-sequenced database. Each operation on the system essentially stores corresponding data and logs in each node's database. Moreover, each piece of data is not stored discretely but is associated in chronological order, with new data deriving from a previously existing set of data.

Based on this association, if you want to trace the historical changes in data status, it is easy; you just need to look back sequentially, and you will definitely find the initial state of the data. All of this relies on blockchain storage technology, which mainly focuses on data structure and data relationships, including transaction and block data structures, the relationship between transactions and blocks, and the storage mode of block states, etc.
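The "look back sequentially" idea can be sketched with a simple linked record. The `Record` type and the `history` helper are hypothetical names used only for illustration; real chains link transactions and blocks through hashes rather than in-memory references.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Record:
    """One state of a piece of data, linked to the state it derives from."""
    value: str
    prev: Optional["Record"] = None


def history(latest: Record) -> list[str]:
    """Walk backwards from the latest state to the initial one."""
    states = []
    node: Optional[Record] = latest
    while node is not None:
        states.append(node.value)
        node = node.prev
    return states


created = Record("created")
shipped = Record("shipped", prev=created)
delivered = Record("delivered", prev=shipped)
```

Calling `history(delivered)` follows the `prev` links one step at a time, so the last element of the result is always the initial state of the data.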

Immutability: Another technical characteristic is immutability, a concept that beginners often misunderstand. It does not mean that data cannot be modified at all; it means that unauthorized modifications are not recognized.

Looking back at the example, we can see that the keys held by homeowners correspond one-to-one with the locks. Without a key, or with the wrong key, a vehicle cannot enter the community. If an outside vehicle wants to get in, its owner may try to impersonate a homeowner, duplicate a key, or add a lock, all of which constitute tampering. However, once the homeowners' committee discovers this, the problem is promptly corrected and the outside vehicle is removed, thus achieving immutability.

In blockchain, achieving immutability requires two technologies for assurance. In the previous example, the function of the "lock" is realized through cryptographic technology, while the role of the "homeowners' committee" is played by consensus algorithms. As mentioned earlier, the data in a single blockchain node is stored in chronological order, and the key to this chaining utilizes cryptographic hash algorithms. Hash algorithms can transform a piece of data into a fixed-length data fingerprint, and even slight changes in the data will result in a significantly different fingerprint. Through this method, we can integrate the data fingerprint from the previous time period with the data from the subsequent time period. This process continues, ensuring that the data in the later time period always contains the fingerprint of the data from the earlier time period, thus forming a chain of information linked by data fingerprints.

image

If a malicious actor attempts to modify the data from a specific time period, according to the principles of hash algorithms, the corresponding data fingerprint will change. Therefore, they must modify the data of every subsequent time period; otherwise, the data chain will break at the moment they modify it, losing its traceability.
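The chaining and tamper check described above can be sketched in a few lines. This is a minimal illustration, not any particular blockchain's format: the record layout (`prev`, `data`, `fp`) and the function names are assumptions made for the example, and SHA-256 stands in for whichever hash algorithm a real chain uses.

```python
import hashlib
from typing import Dict, List, Optional


def fingerprint(text: str) -> str:
    """Return a fixed-length SHA-256 fingerprint of a piece of data."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def build_chain(periods: List[str]) -> List[Dict[str, str]]:
    """Link each period's data to the fingerprint of the period before it."""
    chain: List[Dict[str, str]] = []
    prev = ""  # assumption: the first record has no predecessor
    for data in periods:
        record = {"prev": prev, "data": data, "fp": fingerprint(prev + data)}
        chain.append(record)
        prev = record["fp"]
    return chain


def first_broken_link(chain: List[Dict[str, str]]) -> Optional[int]:
    """Return the index of the first record that fails verification, if any."""
    prev = ""
    for i, record in enumerate(chain):
        expected = fingerprint(record["prev"] + record["data"])
        if record["prev"] != prev or record["fp"] != expected:
            return i
        prev = record["fp"]
    return None


chain = build_chain(["period 1 data", "period 2 data", "period 3 data"])
chain[1]["data"] = "forged data"  # tamper with the middle period
print(first_broken_link(chain))   # the chain breaks at the modified record

# Recomputing only that record's fingerprint just moves the break forward.
chain[1]["fp"] = fingerprint(chain[1]["prev"] + chain[1]["data"])
print(first_broken_link(chain))
```

To hide the change, the attacker would have to recompute every later record as well, which is exactly the cost the text describes.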

image

Now let us raise the stakes. Suppose a malicious actor rewrites all the data on their local node so that the local chain stays internally consistent. In this case, the consensus algorithm must step in to ensure that the data of the entire system remains unaltered. Consensus, in a distributed system, means maintaining data consistency. If data inconsistency occurs, most consensus algorithms follow the principle of the majority ruling the minority.

The data of every honest node in the blockchain network is consistent, whereas the malicious actor has tampered only with the single node they maintain. From the perspective of the entire network, the majority of nodes still hold the correct data, so the tampered copy is rejected.

Finally, I want to remind you not to look at only one side of a problem. Immutability is a two-sided (dialectical) characteristic. Majority rule also means that if a malicious actor controls the majority of node resources, tampering with the blockchain becomes possible. In a robust and sufficiently decentralized network, the cost of doing so is enormous, making success almost impossible. But if it does succeed, the roles are reversed: the remaining minority is then treated as the party disrupting the blockchain's consensus, as seen in Ethereum's hard forks.
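The "majority rules the minority" principle can be illustrated with a toy vote over the chain heads that nodes report. This is only a sketch of the idea: real consensus algorithms (proof of work, PBFT and others) are far more involved, and the function name and node labels below are hypothetical.

```python
from collections import Counter
from typing import Dict, Optional


def majority_head(heads_by_node: Dict[str, str]) -> Optional[str]:
    """Return the chain head reported by a strict majority of nodes, if any."""
    if not heads_by_node:
        return None
    head, votes = Counter(heads_by_node.values()).most_common(1)[0]
    return head if votes * 2 > len(heads_by_node) else None


# One tampered node out of five cannot change the network's view.
reports = {"n1": "abc", "n2": "abc", "n3": "abc", "n4": "abc", "n5": "forged"}
print(majority_head(reports))

# If the attacker controls three of the five nodes, the forged head wins.
reports.update({"n3": "forged", "n4": "forged"})
print(majority_head(reports))
```

The second call shows the other side of immutability: the rule protects the data only while honest nodes remain the majority.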

Blockchain is an important driving force for the new generation of information technology. It utilizes the integration of foundational technologies such as storage, cryptography, peer-to-peer networks, and consensus algorithms to provide decentralization, traceability, and immutability, which can solve trust and security issues on the internet, thus promoting the transformation of the internet from information transmission to value transmission.

You might find the previous interpretation somewhat formal. To help you understand, let me share a more straightforward version: one could say that storage is the bricks, cryptography is the steel bars, peer-to-peer networks are the concrete, and consensus algorithms are the blueprints. Based on these, when combined, they construct the intricate superstructure of blockchain. I want to emphasize that blockchain technology is not dogmatically following a textbook. Bitcoin is blockchain, Ethereum is also blockchain, and the technology itself does not prescribe a single path for implementation.

Blockchain is a distributed storage technology with a high degree of difficulty in tampering, and its main application is in digital currencies like Bitcoin.

Blockchain is a decentralized, traceable, and immutable information storage technology, a fusion innovation combining various technologies, primarily addressing trust and security issues on the internet.

Reading "Blockchain Revolution: How the Technology Behind Bitcoin Is Changing Money, Business, and the World" is recommended. If you are not very familiar with the technology, you can read "Blockchain: Blueprint for a New Economy."

Git can be seen as a close relative of blockchain. It shares the three properties below, although it has no consensus mechanism:

  • Decentralization: Each user has their own git repository locally and can pull and push between each other.

  • Traceability: Each git commit relies on the previous commit, allowing tracing back to the initial commit.

  • Immutability: Modifying a certain historical commit locally will change the hash values of subsequent commits.
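The Git comparison can be made concrete with a simplified model of how a commit ID is derived. Git hashes a commit object as `commit <size>\0<content>` (with SHA-1 by default), and the content names the parent commit; real commits also record a tree, an author, a committer, and dates, which this sketch leaves out, so the IDs it produces will not match real Git output.

```python
import hashlib
from typing import Optional


def commit_id(message: str, parent: Optional[str]) -> str:
    """Simplified commit hash: covers only the parent ID and the message."""
    body = ""
    if parent is not None:
        body += f"parent {parent}\n"
    body += f"\n{message}\n"
    raw = body.encode("utf-8")
    return hashlib.sha1(b"commit %d\0" % len(raw) + raw).hexdigest()


first = commit_id("initial commit", None)
second = commit_id("add feature", first)

# Rewriting the first commit changes its ID, and with it every descendant's ID.
rewritten_first = commit_id("initial commit (edited)", None)
rewritten_second = commit_id("add feature", rewritten_first)
print(second != rewritten_second)
```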

From an intuitive perspective, I have discussed the characteristics of blockchain technology using the example of "iron chains," which also introduces the technical foundation of blockchain. Starting today, I will take a few lectures to explain the core applications of each technology in blockchain, providing a comprehensive overview of the blockchain technology system.

In this lecture, I will take you deep into a single blockchain node, helping you understand how blockchain storage is designed. When we talk about storage design, we first think about how data is stored in the blockchain and which database to use, among other conventional content. However, in my view, these only scratch the surface of storage design. To truly grasp the key points of blockchain storage, we need to focus on three foundational concepts: transactions, blocks, and states. With these foundations, analyzing blockchain storage design will become second nature.

image

The first concept to understand is the transaction (Transaction), which is the smallest and most core knowledge point in blockchain. Since we usually start learning about blockchain from Bitcoin, we often understand transactions as transfers. However, this understanding is somewhat one-dimensional. In fact, the concept of transactions in blockchain has already been expanded.

From the perspective of behavior, a transaction is equivalent to an operation (Operation). When we submit a transaction to the blockchain network, we are essentially initiating an operation, and the specific content of the operation depends on the specific blockchain protocol. For example, in Ethereum, an operation might be executing a method in a smart contract. From the perspective of computer science, a blockchain transaction is also a transaction in the database sense, that is, an atomic unit of work; English uses the same word, Transaction, for both. A transaction is the smallest component of data in the blockchain network. Once a transaction is submitted, it can only end in one of two states, success or failure; it can never be half successful.

Although different blockchains have consistent definitions of transactions, their attribute fields may differ, but this does not prevent us from abstracting a general transaction attribute template. It is important to note that not all blockchains follow the rules depicted in the following diagram; this is merely for your understanding of the main attributes of transactions.

image

We can see that a transaction typically has eight attributes (the transaction hash itself counts as one). The From and To fields identify the initiator and the receiver of the transaction; a currency transfer, for example, naturally needs both a payer and a payee.

Three attributes relate to smart contracts: the smart contract identifier names the contract to be executed by this transaction, followed by the method to call and the parameter list passed to that method. Different methods take parameters of varying number and type, but here we represent them uniformly as a parameter list.

The timestamp field records when the transaction was constructed on the client side. The client sets this time on its own, but we need not worry about it drifting from standard time, because the blockchain network checks the timestamp when it receives the transaction. Transactions stamped too early or too late are rejected, which limits the room for fraud to some extent.

The final general field is the signature, normally produced by the account in the From field to prove to the network that this account really constructed the transaction and that nobody else forged it. The account owner signs the transaction with a private key that only they hold. This resembles the seals we use in daily life, except that a valid signature is practically impossible to forge unless the private key is stolen.

One point to note is that all transactions in the blockchain are basically initiated from outside the blockchain network. The blockchain network only receives transactions and does not produce transactions, nor does it make any modifications to transactions. In other words, once a transaction is constructed on the client side, it is solidified.

Therefore, we can use the hash value of the transaction content as the identifier for the transaction in the blockchain network, and this identifier is not included in the transaction's field content. How to understand this? You can think of it this way: an ID card can represent you, but the ID card is not a part of you.
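The general transaction template and its externally derived hash might look like the sketch below. The field names, the JSON serialization, and the choice of SHA-256 are all assumptions made for illustration; real chains define their own encodings, and the signature here is an opaque placeholder rather than a real cryptographic signature.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, replace
from typing import Tuple


@dataclass(frozen=True)  # frozen: once constructed, a transaction is "solidified"
class Transaction:
    sender: str               # the From field
    receiver: str             # the To field
    contract: str             # smart contract identifier
    method: str               # method to execute
    params: Tuple[str, ...]   # parameter list
    timestamp: int            # set by the client when the transaction is built
    signature: str            # placeholder; a real one comes from the sender's private key

    def tx_hash(self) -> str:
        """Identifier derived from the content; it is not a stored field."""
        payload = json.dumps(asdict(self), sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


tx = Transaction("alice", "bob", "token", "transfer", ("10",), 1700000000, "sig-placeholder")
print(tx.tx_hash())

# Any change to the content yields a different identifier.
later = replace(tx, timestamp=1700000001)
print(later.tx_hash() != tx.tx_hash())
```

Keeping the hash as a method rather than a field mirrors the ID card analogy: it identifies the transaction without being part of it.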

Additionally, you may have another question: if each transaction is constructed independently by its client, without any negotiation with other participants in the network, could two transactions end up with the same hash? Here we rely on the properties of hash algorithms. A duplicated hash means a hash collision has occurred. In the subsequent cryptography chapter, we will discuss the probability of hash collisions and how it depends on the hash algorithm used; for the algorithms used in practice, a collision is almost impossible.

Block: Having understood transactions, let's discuss what "container" to use to store these transaction data. In fact, this container is the block. You can understand the relationship between transactions and blocks as follows: transactions are like goods, while blocks are containers that can hold multiple transactions.

In the previous discussion on traceability, we mentioned that blockchain is a sequential integration of data associated over time, and each time period's data is referred to as a block (Block). A block refers to a data structure formed by packaging all (valid) transactions received by a node within a certain time frame. The term "valid" is used because some blockchain designs also include invalid transactions. Understanding the concept directly can be somewhat abstract, so we can refer to the block schematic diagram to understand the data structure of a block.

image

The design of a block may seem complex, but don't worry; we only need to clarify three key points to appreciate the essence of a block.

Block Structure: The first point we need to clarify is the structure of the block. From the diagram, we can see that a block is divided into a block header and a block body. The block header contains the basic attributes of the block, with four important attributes: the previous block hash for linking blocks, the transaction root hash for linking the block with transactions, the block height for marking the current block's position in the blockchain for easy location, and the timestamp for recording when the block was packaged. The block body contains only transactions, and the transactions are ordered chronologically, generally sorted by the timestamp field of the transactions.

Inter-Block Association

The second point to focus on is the association between blocks, which we have mentioned several times. Each block includes the hash of the previous block, and this field is the anchor that logically links the two blocks. The block hash, like the transaction hash, is an external attribute of the block and can only be computed after the block has been constructed. If we trace back from the current block step by step, we eventually reach the genesis block, which also has a previous block hash field, but its value is empty. For example, you can look up the previous hash of the genesis block in an Ethereum block explorer.

image
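The four header attributes and the walk back to the genesis block can be sketched as follows. The field names, the JSON-based hashing, the empty-string marker for "no previous block", and the `root-N` placeholder values are assumptions for the example, not any specific chain's format.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import Dict, List

EMPTY_HASH = ""  # assumption: the genesis block's previous hash is an empty value


@dataclass(frozen=True)
class BlockHeader:
    prev_hash: str   # links this block to the previous one
    tx_root: str     # links the block to its transactions
    height: int      # position of the block in the chain
    timestamp: int   # when the block was packaged

    def block_hash(self) -> str:
        """External attribute: computable only once the header is complete."""
        payload = json.dumps(asdict(self), sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def trace_to_genesis(tip: BlockHeader, index: Dict[str, BlockHeader]) -> List[int]:
    """Follow prev_hash links from the tip and return the heights visited."""
    heights = [tip.height]
    current = tip
    while current.prev_hash != EMPTY_HASH:
        current = index[current.prev_hash]
        heights.append(current.height)
    return heights


genesis = BlockHeader(EMPTY_HASH, "root-0", 0, 1700000000)
block1 = BlockHeader(genesis.block_hash(), "root-1", 1, 1700000010)
block2 = BlockHeader(block1.block_hash(), "root-2", 2, 1700000020)
index = {h.block_hash(): h for h in (genesis, block1, block2)}

print(trace_to_genesis(block2, index))
```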

Block and Transaction: The final point is the relationship between blocks and transactions. Although we previously used a metaphor to compare blocks to containers and transactions to goods, we did not clarify the underlying principles. Conceptually, this is relatively complex because it introduces an uncommon data structure: the Merkle tree. Let's first understand it.

A Merkle tree is a tree structure that generally has at least three layers: leaf nodes, intermediate nodes, and a single root node. The number of intermediate layers depends on the number of leaf nodes; the more leaf nodes there are, the deeper the Merkle tree becomes.

Its construction logic is as follows: adjacent leaf nodes undergo hash operations, and the resulting hash value becomes the parent node of these two leaf nodes. The same logic continues upward, ultimately leaving only two intermediate nodes in the second-to-last layer, which undergo a hash operation to obtain their parent node, which is the root node of the entire tree. Thus, the Merkle tree constructed from hash values is completed.
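The construction logic above translates almost directly into code. In this sketch the leaves are hex strings and SHA-256 is assumed as the hash; when a layer has an odd number of nodes, the last node is paired with itself, which is the convention Bitcoin uses but only one of several possible choices.

```python
import hashlib
from typing import List


def h(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def merkle_root(leaves: List[str]) -> str:
    """Hash adjacent pairs layer by layer until a single root remains."""
    if not leaves:
        raise ValueError("a Merkle tree needs at least one leaf")
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # pair the odd node out with itself
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


tx_hashes = [h(f"transaction {i}") for i in range(4)]
print(merkle_root(tx_hashes))
```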

Referring back to the previous block schematic, we can see that the transaction hashes contained in the block body can serve as the leaf nodes of the Merkle tree, and then the hash calculations proceed upward, ultimately yielding the root hash, which represents the transaction root hash of all transactions in the block body. This data will be recorded in the block header. At this point, you may have a question: why go through the trouble of introducing a Merkle tree? Why not simply mix all transactions together and take a single hash? We know that the result of hashing data can be used as a data fingerprint, which means that the hash can serve as a data verification mechanism.

If at this moment, one of the transactions in the block is tampered with by a malicious actor, and we designed the transaction root hash by simply taking a single hash of all transactions, then when the data verification fails, it would be challenging to identify the tampered transaction, especially when there are a large number of transactions.

However, if we use a Merkle tree, any change in a leaf node's hash propagates up through its parent nodes, layer by layer, until it reaches the root. The root therefore still commits to the hashes of all leaf nodes, but each branch of the tree can be checked separately. If verification fails, we can follow the mismatching hashes down from the root and quickly identify the erroneous branch. This makes data verification more flexible and avoids unnecessary work.

Through the above analysis of the three key points of block logic, we have clarified the design context of blocks. The reason blockchain is called a blockchain is, literally, due to the special data structure of blocks.
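To show how a Merkle tree narrows down a tampered transaction, the sketch below compares a suspect tree against a trusted copy and descends only into the branch whose hashes differ. It assumes exactly one leaf was altered, that both trees have the same number of leaves, and that a trusted copy (for example, from an honest peer) is available; the function names are hypothetical.

```python
import hashlib
from typing import List, Optional


def h(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def build_levels(leaves: List[str]) -> List[List[str]]:
    """Return every layer of the tree: levels[0] is the leaves, levels[-1] the root."""
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        cur = list(levels[-1])
        if len(cur) % 2 == 1:
            cur.append(cur[-1])  # pair the odd node out with itself
        levels.append([h(cur[i] + cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels


def find_tampered_leaf(trusted: List[List[str]], suspect: List[List[str]]) -> Optional[int]:
    """Walk down from the root, following the child whose hash differs."""
    if trusted[-1] == suspect[-1]:
        return None  # roots match: nothing was changed
    idx = 0
    for depth in range(len(trusted) - 2, -1, -1):
        left = 2 * idx
        idx = left if trusted[depth][left] != suspect[depth][left] else left + 1
    return idx


good = [h(f"tx {i}") for i in range(8)]
bad = list(good)
bad[5] = h("tx 5, amount changed")

print(find_tampered_leaf(build_levels(good), build_levels(bad)))
```

With eight leaves, the search inspects one hash per layer instead of re-checking every transaction, which is the saving the paragraph above refers to.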

State

Having discussed transactions and blocks, let's now understand a frequently overlooked concept: state (State). You may have never heard of it before, but its role is significant.

Every transaction executed in blockchain has an output, and the state is the accumulation of outputs after executing transactions. How to understand this? Here’s a simple example: 2 + 3 + (4 * 7) + (8 - 9 / 3) + 23 = 61. Each term being added on the left side of the "=" can be considered a transaction record, while the 61 on the right side is the accumulated state after executing the transactions.

Finite State Machine

We can observe that even if the result of the expression is lost, as long as we remember the expression, we can recalculate the corresponding value. This is the concept of a finite state machine: in a closed system, if the initial state is the same and the inputs that change the state are applied in the same order, the final result will always be the same. Blockchain records all transactions in chronological order, so even if the state is lost, we can easily rebuild it by executing the transactions again in order. Therefore, from a certain perspective, blockchain can also be seen as a finite state machine.
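The replay idea can be sketched with the arithmetic example from the text, treating each term as a transaction applied to a running total. The `("add", n)` and `("mul", n)` encoding is an assumption made for the illustration; the `mul` operation is included only to show that order matters.

```python
from functools import reduce
from typing import List, Tuple

Tx = Tuple[str, int]


def step(state: int, tx: Tx) -> int:
    """Apply one transaction to the current state and return the new state."""
    op, value = tx
    if op == "add":
        return state + value
    if op == "mul":
        return state * value
    raise ValueError(f"unknown operation: {op}")


def replay(txs: List[Tx], initial: int = 0) -> int:
    """Rebuild the state by executing every transaction in order."""
    return reduce(step, txs, initial)


# 2 + 3 + (4 * 7) + (8 - 9 / 3) + 23
ledger = [("add", 2), ("add", 3), ("add", 4 * 7), ("add", 8 - 9 // 3), ("add", 23)]
print(replay(ledger))  # 61, and the same on every replay

# Same transactions, different order, different final state.
print(replay([("add", 2), ("mul", 3)]), replay([("mul", 3), ("add", 2)]))
```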

Since we can replay the state, why retain it at all? Consider a hypothetical scenario: you are executing a transaction that requires a certain input, and this input is the output of a transaction in an earlier block.

If we do not retain the state, before executing this transaction, you would have to re-execute the related transactions, which may in turn relate to even earlier transactions, requiring you to trace back until you find the source. Thus, theoretically, it is possible not to retain the state, but this would require bearing the corresponding consequences of such a design.

Differences and Similarities with Databases: If you still find the concept of state difficult to understand, we can also explain it by comparing it with databases. When we interact with a database through CRUD operations, the records we insert, update, or delete in the database represent the state, while each statement you execute is a transaction. In other words, if you export all SQL statements from creating a database to creating tables, inserting data, updating data, and deleting data, you can replicate an identical database elsewhere.
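The database analogy can be tried out with Python's built-in `sqlite3` module: the list of SQL statements plays the role of the transaction history, and the resulting table contents are the state. The table and account names are invented for the example.

```python
import sqlite3
from typing import List, Tuple

# The "transaction history": every statement ever executed, in order.
statements = [
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)",
    "INSERT INTO accounts VALUES ('alice', 10)",
    "INSERT INTO accounts VALUES ('bob', 0)",
    "INSERT INTO accounts VALUES ('carol', 0)",
    "UPDATE accounts SET balance = balance - 3 WHERE name = 'alice'",
    "UPDATE accounts SET balance = balance + 3 WHERE name = 'bob'",
    "DELETE FROM accounts WHERE name = 'carol'",
]


def replay(log: List[str]) -> List[Tuple[str, int]]:
    """Execute the log against a fresh in-memory database and return its state."""
    conn = sqlite3.connect(":memory:")
    try:
        for statement in log:
            conn.execute(statement)
        conn.commit()
        return conn.execute(
            "SELECT name, balance FROM accounts ORDER BY name"
        ).fetchall()
    finally:
        conn.close()


# Two independent replays of the same history produce identical state.
print(replay(statements) == replay(statements))
print(replay(statements))
```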

Both blockchain and databases preserve historical operation records and state data sets. However, databases focus more on state, while blockchain primarily records historical blocks, with state as a secondary concern. One lives in the present, while the other reminisces about the past. The logic of the two is not fundamentally different; they simply emphasize different aspects.

State Models

Having understood the design concept of state, you may wonder how state is represented in blockchain. Depending on the positioning of the blockchain, we can roughly categorize the design of state models into three types.

One is the UTXO model, led by Bitcoin and focused on digital currency; UTXO stands for Unspent Transaction Output. In this model, each transaction consumes N transaction inputs and generates M transaction outputs (N and M need not be equal). Each input must be an unspent output of some earlier transaction. Once the current transaction succeeds, the outputs it consumed count as spent and can never be used as inputs again. The UTXO model can therefore track the flow of digital currency: a transaction's inputs show where the currency came from, while its outputs show where the currency is going.
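A toy version of the UTXO bookkeeping is sketched below. It tracks only which outputs are unspent and who owns them; signature and ownership checks, fees, and scripts are deliberately left out, and the `(tx_id, index)` key layout and function name are assumptions for the example.

```python
from typing import Dict, List, Tuple

OutPoint = Tuple[str, int]   # (transaction id, output index)
Output = Tuple[str, int]     # (owner, amount)


def apply_tx(
    utxos: Dict[OutPoint, Output],
    tx_id: str,
    inputs: List[OutPoint],
    outputs: List[Output],
) -> Dict[OutPoint, Output]:
    """Spend the referenced outputs and add the new ones; return the new UTXO set."""
    if len(set(inputs)) != len(inputs) or any(op not in utxos for op in inputs):
        raise ValueError("an input is missing or already spent")
    spent = sum(utxos[op][1] for op in inputs)
    created = sum(amount for _, amount in outputs)
    if created > spent:
        raise ValueError("outputs exceed inputs")
    remaining = {k: v for k, v in utxos.items() if k not in inputs}
    for i, out in enumerate(outputs):
        remaining[(tx_id, i)] = out
    return remaining


utxos = {("tx0", 0): ("alice", 10)}
# Alice pays Bob 7 and sends the change of 3 back to herself.
utxos = apply_tx(utxos, "tx1", [("tx0", 0)], [("bob", 7), ("alice", 3)])
print(utxos)
```

Trying to spend `("tx0", 0)` a second time raises `ValueError`, because that output is no longer in the unspent set.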

Another is the account model adopted by the Ethereum blockchain, which represents changes in account balances through addition and subtraction. Each transaction execution achieves dynamic balance among different accounts. For example, if you transfer 1 unit of currency, your account balance decreases by 1, while mine increases by 1. This model aligns more closely with our daily understanding. Additionally, the account model supports the storage of custom data beyond just representing balances, allowing for the derivation of smart contract data storage on top of basic accounts.
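For comparison, the account model reduces to addition and subtraction on a balance table. This is a minimal sketch with hypothetical names; it ignores nonces, fees, and contract storage, all of which real account-based chains add on top.

```python
from typing import Dict


def transfer(balances: Dict[str, int], sender: str, receiver: str, amount: int) -> Dict[str, int]:
    """Move `amount` from sender to receiver; fail without changing anything."""
    if amount <= 0 or balances.get(sender, 0) < amount:
        raise ValueError("invalid amount or insufficient balance")
    updated = dict(balances)
    updated[sender] -= amount
    updated[receiver] = updated.get(receiver, 0) + amount
    return updated


balances = {"you": 5, "me": 0}
balances = transfer(balances, "you", "me", 1)
print(balances)  # your balance goes down by 1, mine goes up by 1
```

Because the checks run before any balance is touched, a failed transfer leaves the state exactly as it was, matching the all-or-nothing behavior of transactions described earlier.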

The final category is a general model that further advances on the account model, lacking built-in state attributes and allowing for the storage of any custom data. This model is widely applied in consortium chains. The positioning of consortium chains is to support enterprise-level applications, and the types and models of enterprise businesses are unpredictable, making it impossible to predefine a state model in the design. Since it is difficult to cater to everyone's needs, the design of the state is left to enterprise application developers, allowing for custom states while the chain itself only provides a general data interface.
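A general state model can be pictured as a plain key-value interface where the application, not the chain, decides what the values mean. The class and method names below are hypothetical and the JSON encoding is an assumption; consortium-chain platforms expose comparable get and put style interfaces, each with its own details.

```python
import json
from typing import Any, Dict, Optional


class WorldState:
    """Generic state store: the chain offers keys and values, nothing more."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def put(self, key: str, value: Any) -> None:
        # Values are serialized so that any application-defined shape fits.
        self._data[key] = json.dumps(value, sort_keys=True)

    def get(self, key: str) -> Optional[Any]:
        raw = self._data.get(key)
        return None if raw is None else json.loads(raw)

    def delete(self, key: str) -> None:
        self._data.pop(key, None)


state = WorldState()
# The application decides the schema; here, an invoice record.
state.put("invoice:1001", {"buyer": "ACME", "amount": 2500, "paid": False})
print(state.get("invoice:1001"))
```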

As the design and application scenarios of the three state models show, there is no single correct choice of state model; any model that fits its application scenario is a good model.

Summary

In this lecture, we primarily delved into a single blockchain node, focusing on understanding the key points of blockchain storage, mainly discussing the data structures of transactions and blocks, as well as the blockchain state model. I did not emphasize specific design solutions because no matter how novel or innovative a blockchain platform is, its most fundamental design cannot escape these three points.

The design of blockchain storage is not fixed; as long as you truly understand transactions, blocks, and states, you will become the next blockchain storage architect.

For developers, state is more important. The block is like a framework, while the state is the data structure and algorithm needed for specific business designs.
