├── README.md
├── en
    ├── architecture.png
    ├── ate.png
    ├── cita-network.png
    ├── cita-parallel.png
    ├── router-and-views.png
    └── technical-whitepaper.md
└── zh
    └── technical-whitepaper.md


/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/citahub/cita-whitepaper/27700f560d26bc75056b153597dbd39e2fb872e7/README.md
--------------------------------------------------------------------------------
/en/architecture.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/citahub/cita-whitepaper/27700f560d26bc75056b153597dbd39e2fb872e7/en/architecture.png
--------------------------------------------------------------------------------
/en/ate.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/citahub/cita-whitepaper/27700f560d26bc75056b153597dbd39e2fb872e7/en/ate.png
--------------------------------------------------------------------------------
/en/cita-network.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/citahub/cita-whitepaper/27700f560d26bc75056b153597dbd39e2fb872e7/en/cita-network.png
--------------------------------------------------------------------------------
/en/cita-parallel.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/citahub/cita-whitepaper/27700f560d26bc75056b153597dbd39e2fb872e7/en/cita-parallel.png
--------------------------------------------------------------------------------
/en/router-and-views.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/citahub/cita-whitepaper/27700f560d26bc75056b153597dbd39e2fb872e7/en/router-and-views.png
--------------------------------------------------------------------------------
/en/technical-whitepaper.md:
--------------------------------------------------------------------------------
# CITA Technical Whitepaper

Jan Xie

Version 1.0

## Introduction

Known for proving the concept and introducing the world to blockchain technology, Bitcoin creatively solves the distributed consensus problem in a permissionless setting and reveals a whole new architecture based on distributed computing and cryptography. Ethereum brings in a more general-purpose computation layer that demonstrates the extraordinary potential of smart contract technology. In the context of permissioned blockchains, we ask an interesting question: what if we consider this new architecture in a permissioned environment, where nodes maintain their identities?

Permissioned blockchains can satisfy the performance requirements of enterprise-level applications. Identity management and permission control are the key components that make the system functional. Only nodes that satisfy permission requirements are allowed to join the network and communicate with other nodes, while in a permissionless blockchain nodes join and leave freely. Permission control prevents Sybil attacks and makes traditional consensus algorithms applicable, which greatly increases the throughput and reduces the latency of transaction processing.

Node configuration and network status vary notably on open networks. In order to lower the entry barrier and avoid centralization, public blockchain protocols have to accommodate the weakest node, which limits the design space. Nodes in permissioned networks can use better hardware and are more closely aligned with each other. An appropriate architecture should take advantage of this in order to achieve further performance improvements.

In a partitioned system, it’s known that we cannot have both availability and consistency at the same time.
Public blockchains usually favor availability, since it is much harder to recover from emergencies in a highly decentralized governance model. In enterprise blockchain applications, efficient off-chain collaboration and governance make faster responses possible, so users can favor consistency within the system.

As the number of users of blockchain applications increases over time, the system has to scale to provide more and more transaction processing and storage capacity. Scalability while maintaining security is a must-have in the context of blockchain technology. Although we do not see this in public chain infrastructures yet, permissioned blockchains can already offer a viable solution to this conundrum.

Data stored on the blockchain is public to all nodes. Enterprise blockchain applications ask for privacy protection, and solutions based on pseudonyms or temporary transaction keys are not enough to satisfy their requirements. Moreover, privacy solutions using bleeding-edge cryptography are not yet mature or fast enough for real-world use. We believe what enterprises need is an imperfect but practical solution, together with a modularized blockchain architecture into which future privacy solutions can easily plug in.

Business logic is extremely complicated in enterprise applications; a single design can hardly meet all the requirements of a given deployment. To maximize the efficiency gains brought by blockchain technology, blockchain node software must be customizable so that it can adapt to different deployment and integration environments.

Independent blockchain networks are constantly emerging, and even more will appear in the future. These chains will begin to communicate with each other, forming a network of blockchain networks. All blockchains should therefore prepare for cross-chain communication in order to amplify the value of the applications running on them.


Given our analysis of this environment and a thorough understanding of the technology, we created CITA, an enterprise-oriented blockchain framework that supports smart contracts. CITA provides a stable, efficient, flexible and future-proof platform for enterprise-level blockchain applications.

## Microservices

When designing enterprise-level applications, scalability is a key concern, and it is one of the marquee problems of blockchain technology today. No matter how many nodes there are in a blockchain network, its capacity is limited to that of a single node. To increase system capacity we have two options:

1. compromise on global transaction validation while preserving security, i.e. sharding or cross-chain;
2. enhance the capability of every single node, i.e. use powerful and expensive servers (scale up).

CITA adopts a microservice architecture to boost each (logical) node’s performance. As shown in Fig. 1, a logical node is composed of a group of loosely coupled microservices, and those services communicate with each other through a message bus. In CITA, a "node" is a logical concept which may be a single server (with a group of services running on it) or a cluster of servers.

With a microservice architecture, CITA can be easily scaled. Node administrators can increase system capacity under high load simply by adding more PC servers. The administrator can even use dedicated servers to provide services for hot-spot accounts. We call this **Internal Sharding**.

There’s no special hardware requirement other than a common PC. Transactions can be routed to multiple servers, and each server only needs to process a fraction of the load. Together they can handle enterprise-level workloads. In certain scenarios, different logical nodes run different groups of microservices to provide a variety of services to the system if necessary, e.g.
validator vs. storage nodes.

Customization and integration are easy with CITA’s microservice architecture. Microservices are loosely coupled and communicate only via messages. Hence users can replace microservices/components in CITA with implementations in any programming language, as long as the replacement implements the standard internal APIs to parse, process and return messages. External systems can also connect directly to the message bus in order to read internal messages at runtime, for easy and deep integration.

![Fig 1a. CITA Microservice Architecture](../en/cita-network.png)

![Fig 1b. CITA Microservice Architecture](../en/cita-parallel.png)

## Consensus

Blockchain nodes reach a consistent transaction history via consensus algorithms. The consensus service selects valid transactions to build a global total/partial order log, which is the input for later execution. Transactions are stored in an immutable history built on authenticated data structures; the data produced after the executor processes them is what we call a ‘view’. A ledger maintaining account balances is a view, for example.

Different blockchains differ on whether a view requires consensus. For instance, UTXO sets are not pinned into Bitcoin blocks, while the hash of the ‘world state’, which consists of all accounts, is included in an Ethereum/Fabric block. Consensus on the view can help identify bugs in transaction execution. Including hashes of the view in blocks also facilitates view data exchange between nodes, which benefits light clients and cross-chain protocols. Therefore we have designed CITA’s block data structure to support consensus on the view.

Because a blockchain is a service shared by multiple participants, transactions sent by users must be included in the blockchain within a certain time frame; we call this property censorship resistance.
Being unable to execute a transaction can lead to great loss in many financial scenarios, e.g. a trader who fails to deposit more cash or sell some assets in time on a margin call. Because validator nodes are able to pick and order transactions, they must be rotated regularly to make sure that no specific transaction is blocked by a certain node for too long. CITA applies a proactive round-robin rotation strategy by default to ensure censorship resistance, and random rotation is available as an extension.

For each new block generated by the rotating leader, CITA runs CITA-BFT consensus by default. CITA-BFT is a high-performance consensus algorithm designed for blockchains. Building on achievements in the field of distributed systems such as [PBFT][1] and [Tendermint][2], CITA-BFT has been deeply modified and optimized for the network structure and data structures of enterprise-level blockchains. While guaranteeing security (it tolerates Byzantine nodes as long as they do not exceed 1/3 of the total number of nodes), CITA-BFT achieves extremely high throughput.

CITA-BFT can be replaced by a more appropriate consensus algorithm if necessary. The alternate consensus can be written in any language as long as it implements the consensus microservice interface. Note that consensus algorithms are hard to abstract perfectly since they are usually associated with special network/storage requirements, so replacing one may also require modifications to other services.

## Execution

### Asynchronous Transaction Execution (ATE)

A blockchain node’s functionalities include peer-to-peer networking, consensus, transaction execution and authenticated data storage. Nodes reach consensus on transaction order and execute transactions one by one in the determined order. Under deterministic execution, all nodes eventually reach a consistent state, that is, they store the same data in their local databases.

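
The relationship between an agreed order and a consistent state can be illustrated with a small sketch. Everything in it is an assumption made for illustration and is not taken from CITA: the transfer-style transaction format, the names `apply_transfer`, `replay` and `state_digest`, and the use of a SHA-256 digest over JSON-encoded balances to compare node states.

```python
import hashlib
import json


def apply_transfer(state, tx):
    """Apply one transfer deterministically; an invalid transfer leaves the state unchanged."""
    sender, receiver, amount = tx["from"], tx["to"], tx["amount"]
    if amount <= 0 or state.get(sender, 0) < amount:
        return state  # rejected in the same way on every node
    new_state = dict(state)
    new_state[sender] = new_state[sender] - amount
    new_state[receiver] = new_state.get(receiver, 0) + amount
    return new_state


def replay(genesis, ordered_txs):
    """Execute transactions one by one in the order fixed by consensus."""
    state = dict(genesis)
    for tx in ordered_txs:
        state = apply_transfer(state, tx)
    return state


def state_digest(state):
    """Hash a canonical encoding of the state so that nodes can compare results cheaply."""
    encoded = json.dumps(state, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


genesis = {"alice": 10, "bob": 0}
ordered_txs = [
    {"from": "alice", "to": "bob", "amount": 4},
    {"from": "bob", "to": "carol", "amount": 3},
    {"from": "carol", "to": "alice", "amount": 5},  # insufficient balance, skipped
]

# Two nodes replaying the same ordered log end up with identical state.
node_a = replay(genesis, ordered_txs)
node_b = replay(genesis, ordered_txs)

# The same transactions in a different order produce a different state.
reordered = replay(genesis, list(reversed(ordered_txs)))
```

Here `node_a` and `node_b` hold the same balances and therefore the same digest, while `reordered` differs, which is why nodes must agree on the order before executing.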

Consensus and transaction execution are highly coupled in today’s blockchain projects. As a result, transaction execution becomes a bottleneck for consensus performance. In CITA, consensus and transaction execution are decoupled into separate microservices. The consensus service is only responsible for transaction ordering, which can finish independently before transaction execution, so the latter can run asynchronously. ATE enables better consensus performance as well as elastic transaction processing: system load can be distributed over many blocks in a time range (Fig. 2).

However, because execution is asynchronous, only limited validation (e.g. signature verification) can be applied to transactions during consensus. The consensus output may therefore include invalid transactions, which will be fed to the executor. This problem can be addressed by the quota control mechanisms and garbage cleaning tools provided by CITA.

![Fig 2. Asynchronous Transaction Execution](ate.png)

### Executor

Applications care more about views than about transaction history. Transaction executors take ordered transactions as input and update the associated view during processing. Different executors will generate different views even from the same transaction history. CITA supports the executors listed below by default:

1. NOOP: The simplest executor, which does nothing.
2. Native: Feeds the transaction as input data to a contract written in native code.
3. EVM: A thin wrapper around the Ethereum Virtual Machine; supports the creation and invocation of Ethereum smart contracts.
4. Private: Private transaction execution.
5. Hybrid: A combinator for executors.

View state is read and written by an executor during transaction processing. Different view states may use different data models, the most common being the UTXO and Account models.
In the UTXO model, UTXOs constitute a view of the ledger where each transaction creates new UTXOs out of consumed UTXOs. In the Account model, accounts constitute a view of world state where a transaction may read/write multiple accounts in execution.

The UTXO model introduces a ledger invariant: the total number of accounting units must stay the same before and after transaction execution. This builds business logic into the model at the cost of generality. Splitting account balances into multiple UTXOs benefits parallelization to a certain extent, but it compromises versatility for complex business and brings in the complexity of splitting/combining UTXOs. The Account model is simpler and more efficient for general tasks. Moreover, metadata such as that used for authentication and authorization can be associated with accounts naturally in enterprise applications. CITA supports the Account model by default, while users can build their own view state models such as UTXO.

### Quota

Transactions are replicated, stored and executed on multiple nodes. The resources of nodes are limited and shared by all users, so nodes will be overloaded and lose responsiveness if too much work is submitted to the system. Blockchains with smart contract support must therefore find a way to restrict resource use. In CITA we call the unit of resource a ‘quota’, and the issuance and consumption mechanisms are called quota management. Quota issuance and consumption strategies are configurable by users with quota management permissions.

How quota is consumed depends on the executor used in transaction processing. For example, the NOOP executor consumes quota by transaction size; the Native executor consumes quota as the real-time clock ticks; and the EVM has a built-in fine-grained gas mechanism that charges by opcode complexity. In CITA we set quota limits for blocks/views and for users in order to limit the resource usage of individual blocks.

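
The executor-specific charging and the block and user limits described above can be sketched as follows. This is only an illustrative model under stated assumptions, not CITA's quota implementation: the transaction fields, the `QuotaLedger` class, the `OPCODE_COST` table (whose numbers are made up and are not Ethereum's gas schedule), and the choice to pass elapsed ticks as an input for the Native executor are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical per-opcode costs; these are NOT Ethereum's real gas values.
OPCODE_COST = {"ADD": 3, "MUL": 5, "SSTORE": 100}


def noop_cost(tx):
    """NOOP executor: charge by transaction size in bytes."""
    return len(tx["data"])


def native_cost(tx):
    """Native executor: charge by elapsed clock ticks (supplied as input in this sketch)."""
    return tx["elapsed_ticks"]


def evm_cost(tx):
    """EVM executor: charge by summing a cost for each executed opcode."""
    return sum(OPCODE_COST[op] for op in tx["opcodes"])


COST_BY_EXECUTOR = {"noop": noop_cost, "native": native_cost, "evm": evm_cost}


@dataclass
class QuotaLedger:
    """Tracks quota used within one block, against a block limit and a per-user limit."""
    block_limit: int
    user_limit: int
    block_used: int = 0
    user_used: dict = field(default_factory=dict)

    def try_charge(self, tx):
        """Charge the transaction if both limits allow it; return whether it was accepted."""
        cost = COST_BY_EXECUTOR[tx["executor"]](tx)
        used = self.user_used.get(tx["sender"], 0)
        if self.block_used + cost > self.block_limit:
            return False
        if used + cost > self.user_limit:
            return False
        self.block_used += cost
        self.user_used[tx["sender"]] = used + cost
        return True


ledger = QuotaLedger(block_limit=150, user_limit=120)
txs = [
    {"sender": "alice", "executor": "noop", "data": b"notarize-me"},
    {"sender": "bob", "executor": "evm", "opcodes": ["ADD", "MUL", "SSTORE"]},
    {"sender": "alice", "executor": "native", "elapsed_ticks": 40},
    {"sender": "carol", "executor": "native", "elapsed_ticks": 30},
]
accepted = [ledger.try_charge(tx) for tx in txs]
```

In this run the third transaction is rejected because it would push the block over its limit, while the smaller fourth one still fits. Treating the per-user limit as a per-block budget is a simplification chosen for the sketch.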

In contrast to public blockchains, permissioned blockchains usually don’t issue tokens to provide on-chain consensus incentives, so the platform needs an alternative way to issue quota in order to replenish what users consume. Quota issuance is very flexible in CITA: a simple periodic-recovery strategy is supported by default, and customized strategies can also be configured if necessary.

### View

A blockchain is an Online Transaction Processing (OLTP) system: users broadcast their transactions, and nodes execute the transactions upon receipt. There are alternative approaches to transaction processing in blockchain systems, each with different pros and cons. Nevertheless, most blockchain systems pick one of them, and are thus inflexible and have difficulty meeting the needs of varied scenarios. Through what we call transaction tunnels, CITA supports multiple transaction processing strategies and basic parallelization.

One can set up multiple independent views when configuring CITA. Each view has its own executor and state data model, and registers its executor with the transaction router. After being ordered by the consensus service, transactions are dispatched via the router to the corresponding executors. The transaction sets processed in different views may or may not intersect.

With proper view configuration, CITA supports a wide range of application scenarios. A view with a NOOP executor is economical for notary businesses. A view with a Native executor and the Account model fits applications with stable business logic best. The EVM combined with the Account model is adequate for businesses with frequently changing requirements.

CITA runs independent transaction execution services for different views, which is possible because their state data is independent. In a CITA network with multiple views, processing capacity is nearly proportional to the number of views.

![Fig 3.
Transaction Router and Views](router-and-views.png)

### Privacy

Replicated execution of smart contracts and privacy are contradictory in nature. Execution/verification requires that all validator nodes can read the data within a transaction. However, to protect privacy, irrelevant validator nodes must not be allowed to see that data.

Privacy protection based on pseudonyms conceals the sender and receiver of a transaction to some extent. However, with the help of data analysis, one can still acquire user information from a transaction. In a solution based on one-time transaction keys, transactions are still decrypted before execution, so data is hidden only from ordinary users, not from validator nodes. This is not quite practical since validators are also competitors in most enterprise applications.

Some advances in cryptography, like zero-knowledge proofs or fully homomorphic encryption, help us process a transaction without revealing its data. However, such technologies still face their own bottlenecks in practicality and maturity.

CITA features partial execution to protect the privacy of its users. Before a private transaction is submitted, the transaction data is encrypted. The encrypted transaction is then sent to the relevant nodes through a peer-to-peer private transport connection, while its hash value gets packed into the block. Private transactions are stored and executed only on relevant nodes, so their data is never exposed to irrelevant nodes.

## Authentication and Authorization

There are generally two kinds of roles in a blockchain network: nodes and users. Nodes are service providers, and users are consumers of the shared computing and storage service.

CITA provides a standard interface for node authentication, and imposes very strict restrictions on nodes joining.
Connections from a node that fails to authenticate itself will be dropped even if it is in the same network as other nodes.

Enterprise applications often already rely on centralized user authentication services based on LDAP or PKI, and CITA provides standard interfaces to facilitate integration with such services too.

In an authentication solution based on public-key cryptography, the result could be disastrous if a user loses their private key. CITA has a sophisticated identity management solution. When users lose their keys, or when their keys need to be updated, administrators with key update permissions can replace the old key with a new one at the user’s request.

CITA also features role-based access control for enterprise-level applications. The resources that users can operate on are divided into fine-grained components with various levels of authority. Users can define roles to organize user access control and manage resource accessibility, allowing enterprises to configure a CITA deployment to match their organizational structure. Updates to permissions and roles are all stored on the blockchain for future auditing.

## Governance

Blockchain is a tool used to reflect **human consensus**. In normal cases, blockchain facilitates automated coordination. In abnormal cases, erroneous view data may occur, and corrections can be made based on the immutable transaction history. As a distributed system of equivalent peers, a blockchain has no central node in its networking topology, but its governance may be carried out through either decentralized or centralized processes. Following the principle that **transaction history must be immutable** while views are mutable, CITA supports various governance structures and view amendment capabilities.

There is a superadmin role in CITA. Thanks to the flexible design of the authentication service, the superadmin role may have any authentication logic.
In a centralized governance method, the superadmin can be controlled by a single core member. In a multi-center governance method, core members can constitute a committee to manage the superadmin together.

A centralized governance role/committee is able to agree on a resolution via some off-chain channel, which allows the system to recover rapidly when emergencies happen. When problems like mistaken operations, software errors or hardware errors happen, the system enters an emergency status. We define two types of emergency status: Transaction Recoverable and Message Recoverable.

A system with erroneous view data caused by mistaken transactions or buggy smart contracts is in a transaction recoverable status, since nodes can still process transactions. A superadmin can create an amend transaction for a fast fix. Nodes include such amend transactions in blocks as well, so that every view update operation is stored as evidence for auditing.

When a system is in a message recoverable status, nodes are unable to execute transactions and the consensus service is at a standstill, though the peer-to-peer network is still functional. A superadmin can broadcast a special message with the administrator tool provided by CITA. Upon receipt of such a message, nodes verify the sender's identity and execute it directly without consensus.

## Summary

For enterprise-level blockchain applications, we propose CITA, a blockchain framework with smart contract support. In CITA, the functionalities of a blockchain node are decoupled into microservices, including consensus, transaction execution, peer-to-peer networking, quota, authentication and authorization. The microservices coordinate via a message bus. CITA is designed to be a highly extensible and future-proof general blockchain framework, where users can configure and customize services on demand.

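
[1]:http://pmg.csail.mit.edu/papers/osdi99.pdf
[2]:https://tendermint.com/docs/introduction/introduction.html#consensus-overview

--------------------------------------------------------------------------------
/zh/technical-whitepaper.md:
--------------------------------------------------------------------------------
# CITA Technical Whitepaper

Jan Xie

Version 1.0

## Overview

As the origin of blockchain, Bitcoin creatively solved the distributed consensus problem on an open network and presented the world with a brand-new technical architecture based on distributed systems and cryptography. Its successor Ethereum integrated general-purpose computation into this architecture and showed the extraordinary potential of smart contracts. Permissioned blockchains raise an interesting question: what do we get if we re-evaluate this new architecture in a permissioned network where nodes have identities?

Permissioned blockchains can satisfy the performance requirements of enterprise-level applications well. Identity management and permission control are their core components. In an open network nodes join and leave freely; in a permissioned network, by contrast, only nodes that have been granted permission can join the network and exchange messages of verifiable origin with other nodes. This admission mechanism rules out Sybil attacks and makes traditional consensus algorithms usable, bringing a qualitative leap in transaction processing latency and throughput.

On an open network, node configurations and network conditions vary enormously. To lower the entry barrier and maximize decentralization, public blockchains have to be designed for the lowest-standard node configuration and deployment environment, which severely limits the design space. Nodes in a permissioned network perform better and are configured more uniformly, and an appropriate architecture should exploit this to further increase the system's processing capacity.

In a distributed system, availability and consistency cannot both be had. Because their technology and governance are highly decentralized, public blockchains lack efficient means of coordination and intervention when emergencies arise. To work around this weakness, public blockchains are designed to put availability first, sacrificing consistency guarantees under network partition. In enterprise-level applications users have better coordination mechanisms, intervention when the system is unavailable is comparatively efficient, and consistency requirements are stronger, which differs considerably from the design preference of public blockchains.

As the number of users of blockchain applications grows, a blockchain must scale horizontally to support ever larger transaction processing and storage needs. Horizontal scalability that leaves system security unchanged is a must-have property of blockchains. Although we have not yet seen a public blockchain achieve this, permissioned blockchains can already give a different answer.

Data on a blockchain is visible to all consensus nodes, and pseudonym-based privacy schemes cannot fully satisfy the needs of enterprise-level applications. On the other hand, the security of the various cryptographic privacy schemes has not yet been fully verified, and their performance is still some distance from practical. We need a privacy scheme that is imperfect but usable today, and a modular blockchain architecture that can easily incorporate future privacy technologies.

Business logic in enterprise scenarios is complicated; a single general-purpose design can only satisfy the minimum requirements and can hardly unlock an application's full potential. To maximize the efficiency gains brought by blockchain technology, blockchain software must be customizable so that it can adapt to different deployment and integration environments.

As blockchain technology spreads, independent blockchain networks will inevitably keep emerging and will interact with each other, forming a network of blockchain networks. A sound blockchain design needs to provide a foundation for cross-chain communication so that the applications running on it can generate greater value in the future.

Based on these ideas we designed CITA, a blockchain framework for enterprise-level applications with smart contract support. CITA provides a stable, efficient, flexible and future-proof platform for enterprise-level blockchain applications.

## Microservices

Horizontal scalability is key to the success of enterprise-level applications, and it is exactly the most prominent problem of existing blockchain technology. No matter how many nodes a blockchain network has, the processing capacity of the whole network only equals that of a single node. There are only two options for raising the capacity of the whole network:

1. give up global transaction validation while preserving security, i.e. sharding or cross-chain;
2. increase the processing capacity of a single node, i.e.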
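use powerful but expensive dedicated servers (scale up).

CITA instead uses a microservice architecture to increase the processing capacity of a single (logical) node. With this architecture (see Fig. 1), a node is decomposed by function into a group of loosely coupled microservices that communicate over a message bus. In CITA a "node" is a logical concept: it may be one server (running a group of microservices) or a cluster of servers.

Based on the microservice architecture, CITA scales horizontally with ease: when system load rises, a node's capacity can be increased by adding servers. Hot-spot accounts can even be served by dedicated servers. We call this way of scaling **Internal Sharding**.

CITA nodes have low hardware requirements. Transaction processing can be spread over multiple ordinary PC servers, so enterprise-level scenarios can be handled without special hardware. Where node roles are diverse, different nodes can also run different combinations of microservices to fulfil different roles.

Business optimization and deep system integration are both easy in CITA. Microservices communicate through messages and are loosely coupled. As long as it can parse and return the relevant messages, a service written in any language can replace a component of the node. External systems can also connect directly to the message bus to receive the node's runtime messages in real time, making deep integration easy.

![Fig 1a. CITA Microservice Architecture](../en/cita-network.png)

![Fig 1b. CITA Microservice Architecture](../en/cita-parallel.png)

## Consensus Service

Blockchain nodes form a consistent transaction history through a consensus algorithm. The consensus service selects valid transactions and arranges them in a global total or partial order, providing the basis for transaction processing. Transactions are condensed into a tamper-proof history through authenticated data structures; the data formed after they are processed by an executor is what we call a view. A ledger recording users' account balances is one kind of view.

Different blockchain designs take different positions on whether views need consensus. The UTXO set is not reflected in Bitcoin blocks, whereas a digest of the "world state" formed by the set of accounts is recorded in Ethereum/Fabric blocks. Reaching consensus on view data helps detect problems in transaction processing; fixing a digest of the view in the block helps nodes exchange view data and is an important foundation for light-client verification and cross-chain protocols. CITA's block data structure is designed to accommodate consensus on views.

For a service shared by multiple parties, guaranteeing that transactions sent by users are processed within a certain time is an important design goal, which we call censorship resistance. In enterprise-level scenarios, failing to process a user's transaction in time may cause the user great loss, for example a forced liquidation because margin could not be added to a smart contract within the required time. Because block-producing nodes have the power to select and order transactions, they must rotate according to some rule so that no single node can exclude particular transactions for long. CITA uses a proactive rotation strategy for block producers to meet the censorship-resistance requirement. The default round-robin rotation meets the needs of typical applications, and random rotation is provided as an extension module.

After a new block is produced, CITA runs consensus with the CITA-BFT algorithm by default. CITA-BFT is a high-performance consensus algorithm designed specifically for blockchains. Building broadly on the latest results of distributed systems research, [PBFT][1] and [Tendermint][2], CITA-BFT has been deeply reworked and optimized for the network structure and data structures of enterprise-level blockchains, and achieves extremely high throughput while guaranteeing security (it tolerates Byzantine nodes not exceeding 1/3 of the total number of nodes).

CITA-BFT can easily be replaced with any more suitable consensus algorithm; as long as it implements the standard consensus service interface, the replacement can be written in any language. Note that replacing a consensus algorithm often touches networking, storage and other aspects and is hard to abstract perfectly, so it may require not only replacing the consensus service but also customizing other microservices accordingly.

## Transaction Processing Service

### Asynchronous Transaction Execution (ATE)

The main responsibilities of a blockchain node are peer-to-peer networking, consensus, transaction processing and data storage. Through the consensus algorithm, nodes reach global consensus on the order of transactions and then process the transactions one by one in the agreed order. As long as processing is deterministic, all nodes eventually reach a consistent state and produce the same local data.

In current blockchain designs consensus and transaction processing are tightly coupled, and consensus performance is affected by transaction processing capacity. CITA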
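decouples consensus and transaction processing into independent microservices: the consensus service is only responsible for ordering transactions and does not care about their content, and the transaction processing service is only responsible for processing the ordered transactions. The consensus process can then finish before transaction processing, and the transaction processing service can run asynchronously. Asynchronous transaction execution not only gives CITA better consensus performance but also more elastic transaction processing: the transaction load can be spread more evenly over a period of time (see Fig. 2).

Because transactions are processed asynchronously, only limited checks, such as signature verification, can be performed on them before consensus. Invalid transactions may pass through consensus into the transaction processing service and produce a certain amount of garbage data. Where necessary, this problem can be addressed with CITA's transaction quota mechanism and garbage cleaning techniques.

![Fig 2. Asynchronous Transaction Execution](../en/ate.png)

### Executor

Applications care more about views than about the transaction list. An executor takes ordered transactions as input and updates the corresponding view as it processes them. Even for the same transaction list, different executors can produce different views. CITA supports the following executors by default:

1. NOOP: The simplest executor; it does nothing with the transactions passed in.
2. Native: Native execution; transactions are handed to native code through a standard interface.
3. EVM: A wrapper around the Ethereum Virtual Machine that can handle the deployment and invocation of lightweight Ethereum smart contracts.
4. Private: Private transaction processing; see Private Transactions.
5. Hybrid: A hybrid executor that can combine executors into a new executor.

View state is what an executor reads and writes during execution. Different view state models use different basic data units; the two common ones are UTXO and accounts. In the UTXO model, UTXOs make up the ledger view, and each transaction creates new UTXOs while destroying old ones; in the account model, accounts make up the world state view, and a transaction may read and write multiple accounts during processing.

The UTXO model contains the constraint that the number of accounting units is unchanged before and after a transaction, which builds in business logic and gives up some generality. Storing account state discretely in multiple UTXOs yields a limited gain in parallelism but also brings the complexity of splitting and merging UTXOs. The account model is simpler and more efficient for general tasks. Enterprise-level applications often need authentication and authorization, and the data these services depend on can be associated naturally with accounts. CITA supports the account model by default. Users can define other state models, including UTXO.

### Quota

Transactions are replicated to multiple nodes for execution and storage. A node's computing resources (including CPU, disk space and bandwidth) are limited and shared by all users; if a user unintentionally or deliberately submits an overly heavy task, nodes become overloaded and unresponsive. A blockchain that supports smart contracts therefore needs a proper mechanism to limit resource use. In CITA we call the measure of resources the computation quota, and the corresponding issuance and consumption mechanisms quota management. Both consumption and issuance strategies can be set by users with the appropriate permission.

Different executors have different quota consumption mechanisms. For example, the NOOP executor computes consumption by transaction data size, the Native executor computes consumption as the real-world clock ticks, and the EVM comes with a fine-grained gas mechanism that computes consumption by instruction complexity. In CITA we can set quota consumption limits for (the views in) a block or for users, thereby controlling the resource consumption of a single block.

Unlike public blockchains, permissioned blockchains often do not need to issue tokens to provide on-chain consensus incentives, so we need an alternative mechanism for issuing computation quota to replenish what users consume. Quota issuance in CITA is very flexible: simple strategies including periodic recovery are supported by default, and complex strategies can be customized as needed.

### View

A blockchain is an online transaction processing (OLTP) system: users broadcast transactions, and nodes receive them and process them as input. There are several approaches to blockchain transaction processing, each with pros and cons, but most current blockchain systems use a single fixed approach and can hardly meet the different needs of different scenarios. Through transaction tunnels, CITA supports multiple execution strategies and basic parallel processing.

When configuring a CITA blockchain network, users can set up multiple views that are independent of each other. Each view can be assigned its own transaction executor and state storage model, and registers its executor with the transaction router. After being ordered by the consensus service, transactions are dispatched by the transaction router to the different executors (see Fig. 3). The transaction subsets processed by different views may or may not intersect.

With flexible view configuration, CITA can support a wide variety of application scenarios. For example, a view configured with the NOOP executor supports data notarization well and avoids unnecessary execution cost; the Native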
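executor combined with the account model suits scenarios with relatively fixed business logic and delivers good performance; the EVM executor combined with the account model suits scenarios where business logic changes frequently.

Because state storage is independent, CITA supports processing different views in parallel with independent transaction processing services. In a CITA blockchain network configured with multiple views, the system's processing capacity is almost proportional to the number of views.

![Fig 3. Transaction Router and Views](../en/router-and-views.png)

### Private Transactions

Smart contract execution and privacy protection are fundamentally at odds. Executing and verifying smart contracts requires that all consensus nodes can read the data in a transaction, while privacy protection requires that irrelevant consensus nodes cannot see it.

Pseudonym-based privacy protection can only hide the initiator and receiver of a transaction to a certain extent; information about the parties can still be obtained through data analysis. In schemes that encrypt transactions with temporary private keys, consensus nodes still need to decrypt a transaction in order to execute it, so the transaction has no privacy with respect to consensus nodes. In many application scenarios consensus nodes may be competitors of each other, which rules such schemes out.

The latest cryptographic techniques, such as zero-knowledge proofs and homomorphic encryption, can help us execute transactions without knowing the transaction data. But these techniques are not mature: their performance is hardly practical and their security remains to be tested by time.

CITA 1.0 implements a practical privacy scheme through partial execution of transactions. After a private transaction is submitted, it is first encrypted locally; the encrypted transaction is delivered through a peer-to-peer private transaction transport protocol to the nodes holding the decryption key, while the transaction hash is packed into the blockchain. Private transaction data is stored only on the relevant nodes holding the decryption key, which decrypt and then execute the transaction. Transaction data is never sent to irrelevant nodes, so it cannot leak to them.

## Authentication and Authorization Service

Participants in a blockchain fall into two categories, nodes and users. Nodes are providers of the blockchain service and users are its consumers.

CITA provides a standard interface for node authentication and controls node admission more strictly. If a node fails authentication, CITA nodes refuse to establish a communication session with it even if it can reach them at the network layer, which avoids information leakage.

An enterprise environment may already have a centralized user authentication service, such as LDAP or a PKI certificate system. CITA provides a standard interface for user authentication that integrates easily with the authentication services already present in the enterprise.

In authentication schemes based on asymmetric cryptography, loss of a user's private key is a hard problem. CITA's authentication service supports more sophisticated identity management strategies: when a user's private key is lost or is due for periodic renewal, an operator with key update permission can replace the old key with a new one at the user's request.

CITA implements role-based access control to meet the needs of enterprise-level applications. CITA divides the resources users can operate on at a fine granularity and defines permissions for them, and it allows users to define custom roles. Through roles, users can conveniently organize users and manage resource permissions, so that permission assignment matches the enterprise's organizational structure precisely. Permission and role data, together with their change history, are stored on the blockchain for future auditing.

## System Governance

Blockchain is a tool that reflects **human consensus**. Under normal conditions a blockchain enables automated collaboration; under abnormal conditions erroneous view data may be produced, and corrective decisions can then be made on the basis of the tamper-proof transaction history. As a distributed system of peer nodes, a blockchain has no central point in its technical architecture, while at the governance level it may have no center, multiple centers or even a single center. CITA takes **immutable transaction history** as a design principle and supports various governance structures as well as view amendment.

In CITA users can set up a superadmin role. Thanks to the flexible design of the authentication service, the superadmin role can have arbitrary authentication logic. Under single-center governance the role can be controlled by a single core user; under multi-center governance, core users can form a committee-like governing body that jointly controls the superadmin role (for example through multi-signature).

A centralized governance role can reach a resolution for concerted action through off-chain channels, strengthening the system's ability to respond to emergencies. When problems such as operational mistakes, software errors or hardware errors occur, the system may enter an emergency status. We divide emergency statuses into two kinds: Transaction Recoverable and Message Recoverable.

When erroneous transactions or buggy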
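smart contracts have generated erroneous view data but nodes can still process transactions, the system is in the transaction recoverable emergency status. In this case the superadmin can construct an amendment transaction to respond quickly. When processing an amendment transaction, nodes likewise pack it into a block first and then execute it, so all amendment transactions are recorded in history, which supports operational audits.

When the message recoverable emergency status occurs, nodes can no longer process and pack transactions normally and the consensus service stalls, but the peer-to-peer network still works. The superadmin can then construct and broadcast a special message with the administrator tool provided by CITA; after receiving the message and verifying the sender's identity, nodes process it directly without consensus.

## Summary

To meet the needs of enterprise-level applications, we propose CITA, a blockchain framework with smart contract support. CITA decouples the necessary functions of a blockchain node into microservices; components such as consensus, transaction processing, the peer-to-peer network protocol, and authentication and authorization cooperate by exchanging information over a message bus to serve applications. CITA's design takes full account of generality and future extension: by configuring and customizing the relevant services, CITA can meet the full range of needs of enterprise users.

[1]:http://pmg.csail.mit.edu/papers/osdi99.pdf
[2]:https://tendermint.com/docs/introduction/introduction.html#consensus-overview

--------------------------------------------------------------------------------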