Difference between a Blockchain and a Database
Companies these days are left with a new question to answer: “do we use a blockchain or a more traditional database?”
There are many differences between the two, and I will try to cover as many of them as possible below. It starts with a difference in architecture, or more accurately, in how the technologies are orchestrated.
A traditional database set up is mostly a client-server type of network architecture.
That is, a client (the user) with the correct permissions on their user account (log-in credentials, access rights, roles, etc.) can change entries stored on a centralised server (whether physically centralised or centralised in a distributed cloud).
Whenever the “master copy” changes, every user receives an updated copy the next time they access the database from their PC, tablet or smartphone; only the “master copy” is the valid one, however. Control of the database remains with its administrators, so access and permissions are maintained by a central authority.
A blockchain works quite differently: each participant maintains, calculates, updates and validates new entries into the database. All participants (nodes) work together to ensure they reach the same conclusions (consensus), providing built-in security for the network.
The consequences of this fundamental difference are that blockchains are well-suited as a system of record for certain functions, while a centralised database is entirely appropriate for other functions.
Various Factors to Consider
Disintermediation / Decentralised Control & Trust
When we talk about trust we are not suggesting that everyone has malicious intent, but pointing to the simple fact that the entities in a transaction don’t necessarily know each other and have yet to establish trust in the relationship. It is like using your bank card or Visa card to make a payment online: doing so implies that you trust the institution that issued the card and the merchant you are buying from, and that trust can often be misplaced.
Almost all blockchains allow different parties that do not necessarily trust each other to share information without requiring a central administrator. Transactions are processed by the network of users (nodes) acting as a consensus mechanism so that everyone is creating the same shared system of record (more or less simultaneously, depending on the blockchain).
In a traditional model, however, the trust lies with the database administrator, and for the most part administrators have earned the trust placed in them (by not being malicious); often the trust is really in the organisation itself, since we, more often than not, don’t know the administrator. Banks, for example, do not generally steal the money they record in their private databases.
There can absolutely be a logical reason to want centralised control; it can itself be a business model, a specialty, a reason for being: look at Amazon Cloud and Azure Cloud as examples. People trust the entities behind the technology (Amazon and Microsoft) and so put their trust in the networks they control.
However, this also means that those in control, such as banks, must spend billions of dollars keeping their centrally held databases safe from alteration by hackers or anyone else (including the administrators) who might wish to profit from them, whether by stealing money, stealing data or simply altering data. If the central administrators we trust in this regard, or the institutions behind them, fail, then we lose.
Why is there value in this disintermediation? Because even though a database is just bits and bytes, it is also a tangible thing. The contents of a database are stored in the memory and disk of a particular computer system, and anybody with sufficient access to that system can destroy or corrupt the data within. As a result, the moment you entrust your data to a regular database, you also become dependent on the human organization in which that database resides.
A blockchain, by contrast, provides a database that is publicly verifiable, built on integrity and transparency:
- Integrity because every user can be sure that the data they are retrieving is uncorrupted and unaltered since the moment it was recorded.
- Transparency because every user can verify how the blockchain has been appended over time.
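The integrity property comes from hash-linking: each block commits to the previous block’s hash, so any edit to history breaks every link after it. Below is a minimal Python sketch of this idea; the names `block_hash`, `build_chain` and `verify` are illustrative, not any real client’s API.

```python
import hashlib
import json

def block_hash(index, data, prev_hash):
    """Hash a block's contents together with the previous block's hash."""
    payload = json.dumps({"index": index, "data": data, "prev": prev_hash},
                         sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

def build_chain(entries):
    """Append each entry as a block linked to its predecessor's hash."""
    chain, prev = [], "0" * 64
    for i, data in enumerate(entries):
        h = block_hash(i, data, prev)
        chain.append({"index": i, "data": data, "prev": prev, "hash": h})
        prev = h
    return chain

def verify(chain):
    """Recompute every hash; an altered block breaks all links after it."""
    prev = "0" * 64
    for block in chain:
        if block["prev"] != prev or \
           block["hash"] != block_hash(block["index"], block["data"], prev):
            return False
        prev = block["hash"]
    return True

chain = build_chain(["alice->bob:10", "bob->carol:4"])
assert verify(chain)               # untampered chain checks out
chain[0]["data"] = "alice->bob:99"
assert not verify(chain)           # any historical edit is detectable
```

Because every user can recompute these hashes independently, no central party has to be trusted to vouch for the data.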
The question to ask here is whether our trust is rightly placed, and whether or not trust is required between the various transacting parties. One of the underlying principles of blockchain is to remove the need for trust as much as possible from the equation.
The world is filled with organizations which have earned this trust – governments and banks (mostly), universities, trade associations, and even private companies like Google and Facebook. In most cases, especially in the developed world, these work extremely well. I believe my vote has always been counted, and no bank has ever stolen my money (except for charging me exorbitant transaction fees). So what’s the problem? If an organization controls an important database, it also needs a bunch of people and processes in place to prevent that database being tampered with. People need to be hired, processes need to be designed, and all this takes a great deal of time and money.
So blockchain offers a way to replace these organizations with a distributed database, locked down by clever cryptography. Like so much that has come before, it leverages the ever-increasing capacity of computer systems to provide a new way of replacing humans with code. And once it’s been written and debugged, code tends to be an awful lot cheaper. Two further building blocks, read and write operations and the validating and writing of transactions, are key to achieving this decentralised control.
CRUD (Database) vs Read & Write Operations (Blockchain)
In a traditional database, a client can perform four functions on data: Create, Read, Update, and Delete (collectively known as the CRUD commands).
A blockchain is designed as an append-only structure: a user can only add more data, in the form of additional blocks. All previous data is permanently stored and cannot be altered. The only operations associated with blockchains are therefore:
- Read Operations: these query and retrieve data from the blockchain
- Write Operations: these add more data onto the blockchain
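The contrast with CRUD can be captured in a few lines. The class below is a toy illustration (not a real blockchain client): it exposes only read and write, and refuses the update and delete operations a traditional database would allow.

```python
class AppendOnlyLedger:
    """Toy store supporting only the two blockchain operations:
    write (append) and read (query). No update, no delete."""

    def __init__(self):
        self._blocks = []

    def write(self, data):
        """Append a new entry and return its position in the chain."""
        self._blocks.append(data)
        return len(self._blocks) - 1

    def read(self, index):
        """Query an existing entry by position."""
        return self._blocks[index]

    def __setitem__(self, index, value):
        raise TypeError("blockchain entries cannot be updated")

    def __delitem__(self, index):
        raise TypeError("blockchain entries cannot be deleted")

ledger = AppendOnlyLedger()
ledger.write("genesis")
ledger.write("tx-1")
assert ledger.read(0) == "genesis"
```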
Validating and Writing On The Blockchain
The blockchain allows for two functions: validation of a transaction, and writing of a new transaction.
A transaction is an operation that changes the state of data that lives on the blockchain. While past entries on the blockchain must always remain the same, a new entry can change the state of the data in the past entries. For example, if the blockchain has recorded that my Bitcoin wallet has 1 million BTC, that figure is permanently stored in the blockchain.
When I spend 200,000 BTC, that transaction is recorded onto the blockchain, bringing my balance to 800,000 BTC. However, since the blockchain can only be appended, my pre-transaction balance of 1 million BTC also remains on the blockchain permanently, for those who care to look. This is why the blockchain is often referred to as an immutable and distributed ledger.
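The balance example above amounts to replaying the full history to derive the current state. One hedge: Bitcoin itself tracks unspent transaction outputs (UTXOs) rather than account balances, but the account-style sketch below shows the append-only idea with the same figures.

```python
def balance(transactions, wallet):
    """Derive the current balance by replaying the transaction history.
    Every past entry stays on the ledger; only the derived state changes."""
    total = 0
    for tx in transactions:
        if tx["to"] == wallet:
            total += tx["amount"]
        if tx["from"] == wallet:
            total -= tx["amount"]
    return total

history = [
    {"from": "coinbase", "to": "me", "amount": 1_000_000},   # the 1 million BTC entry
    {"from": "me", "to": "merchant", "amount": 200_000},     # the later spend
]
assert balance(history, "me") == 800_000
```

Both entries remain in `history` forever; the 800,000 figure is never written over the old one, it is computed from the two appended records.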
Historical vs Real-time
Most central databases keep information that is up-to-date at a particular moment in time; they provide a snapshot of that moment, but not the real-time information that is still being processed as the snapshot is taken.
Blockchain databases, on the other hand, keep the information that is relevant now as well as everything that came before. Blockchain technology creates a chain that carries its own history: it grows as an ever-expanding archive of itself while still providing a real-time portrait. Thanks to cryptography and Merkle trees, the historical information becomes immutable; the only real way to change a blockchain is to add a new transaction that offsets a previous one, and this can only be done with the consent of the parties involved. (The Ethereum hard fork was an instance where the network reverted to an older state to cancel out a hack of the system; that may have been necessary, but it undermined this principle of blockchain.)
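The Merkle trees mentioned above summarise a block’s transactions in a single root hash: leaves are hashed pairwise upward until one hash remains, so changing any transaction changes the root. A short sketch (function names are illustrative; the duplication of the last node on odd levels follows Bitcoin’s convention):

```python
import hashlib

def sha256(b):
    return hashlib.sha256(b).digest()

def merkle_root(leaves):
    """Pairwise-hash the leaves upward until one root hash remains.
    Any altered leaf yields a different root, making history tamper-evident."""
    level = [sha256(leaf.encode()) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:              # duplicate last node on odd levels
            level.append(level[-1])
        level = [sha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0].hex()

root = merkle_root(["tx-a", "tx-b", "tx-c"])
assert root != merkle_root(["tx-a", "tx-B", "tx-c"])  # one altered tx, new root
```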
Yes, in all blockchain technologies it is theoretically possible to change past data, but anyone trying to do so would need such an overwhelming majority over the rest of the network that it becomes practically impossible and/or insanely expensive. It is exactly this expense of compromising or changing the database that has led people to call a blockchain immutable.
Database Or Blockchain Performance
The initial blockchains, which can be used as systems of record and are ideal as transaction platforms, are considered slow as databases compared with the digital transaction technology used by Visa and PayPal today. In recent months, however, a number of blockchains have emerged that can handle much higher transaction volumes without resorting to alternative paths or off-chain networks.
Having said that, the nature of blockchain technology requires that some speed be sacrificed. The way distributed networks are employed in blockchain technology means that nodes do not share and compound processing power; rather, each independently services the network and then compares the result of its work with the rest of the network until there is consensus that something has happened. Traditional databases, on the other hand, have been around for decades and have seen their performance increase in line with Moore’s law.
Moore’s law refers to an observation made by Intel co-founder Gordon Moore in 1965. He noticed that the number of transistors per square inch on integrated circuits had doubled every year since their invention, and predicted that this trend would continue into the foreseeable future. Although the pace has slowed, the number of transistors per square inch has since doubled approximately every 18 months, which is used as the current definition of Moore’s law.
Blockchain Transaction Comparison
Do take note that these are claimed speeds in some cases, not measured speeds. IOTA and Raiblocks are still in testing in this sense.
While the numbers above clearly indicate that blockchain technologies are catching up with centralised databases in transaction volume and throughput capacity, this will always remain a catch-up situation, both because of Moore’s law and because of the nature of blockchains themselves. When processing a transaction, a blockchain has to do everything a regular database does, plus three additional steps:
- Signature verification. Every blockchain transaction must be digitally signed using a public-private cryptography scheme such as ECDSA. This is necessary because transactions propagate between nodes in a peer-to-peer fashion, so their source cannot otherwise be proven. The generation and verification of these signatures is computationally complex and constitutes the primary bottleneck in blockchain products. By contrast, in centralized databases, once a connection has been established, there is no need to individually verify every request that comes over it.
- Consensus mechanisms. In a distributed database such as a blockchain, effort must be expended in ensuring that nodes in the network reach consensus. Depending on the consensus mechanism used, this might involve significant back-and-forth communication and/or dealing with forks and their consequent rollbacks. While it’s true that centralized databases must also contend with conflicting and aborted transactions, these are far less likely where transactions are queued and processed in a single location.
- Redundancy. This isn’t about the performance of an individual node, but the total amount of computation that a blockchain requires. Whereas centralized databases process transactions once (or twice), in a blockchain they must be processed independently by every node in the network. So lots more work is being done for the same end result. However, in the new generation of blockchains, this is now handled in the client.
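To make the signature-verification step concrete: real chains use ECDSA, which is not in Python’s standard library, so the sketch below substitutes textbook RSA with deliberately tiny, insecure parameters. It is purely an illustration of the idea that each transaction is signed with a private key and every node re-verifies it with the public key; never use these primes or this scheme in practice.

```python
import hashlib

# Toy textbook-RSA keypair with tiny primes -- illustration only.
p, q = 61, 53
n = p * q                            # public modulus (3233)
e = 17                               # public exponent
d = pow(e, -1, (p - 1) * (q - 1))    # private exponent (Python 3.8+)

def tx_digest(tx):
    """Reduce the transaction to a small integer digest (mod n, toy-sized)."""
    return int.from_bytes(hashlib.sha256(tx.encode()).digest(), "big") % n

def sign(tx):
    """Sign with the private key; only the key holder can do this."""
    return pow(tx_digest(tx), d, n)

def verify(tx, signature):
    """Any node can check the signature with the public key alone."""
    return pow(signature, e, n) == tx_digest(tx)

sig = sign("alice->bob:10")
assert verify("alice->bob:10", sig)      # valid signature accepted
```

Every node repeats this `verify` call for every transaction it relays, which is exactly the per-request work a centralised database avoids once a trusted connection is open.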
While this puts blockchains at a slight disadvantage, you can see from the confirmation time that the difference is becoming negligible.
Bitcoin, as an example, is a write-uncontrolled, read-uncontrolled database; the same applies to most cryptocurrencies and blockchains to date (with the exception of consortium and private blockchains, where control sits with the consortium or private entity respectively). This means that anyone can write a new block into the chain and anyone can read a block in the chain.
Also, a permissioned blockchain, like a central database, can be write-controlled and read-controlled, meaning that the network or protocol can be set up so only permissioned participants can write into the database or read the database.
Hiding information on the blockchain requires a lot of cryptography and a related computational burden (in the case of Proof-of-Work consensus algorithms) for the nodes in the network. This is the case for Bitcoin’s blockchain and most of the earlier blockchain adoptions; however, newer systems (Tangle, Hashgraph, RaiBlocks, etc.) have developed ways to avoid this issue.
If confidentiality is the only goal or deciding factor in whether to use a blockchain, and trust is not a factor at all, then blockchain databases offer no advantage over a centralised database.
To be fair, many strategies are available on the blockchain for mitigating the confidentiality problem. These range from simple ideas like transacting under multiple blockchain addresses, to advanced cryptographic techniques such as confidential transactions and zero-knowledge proofs (now in production). Nonetheless, the more information you want to hide on a blockchain, the heavier a computational burden you pay to generate and verify transactions. And no matter how these techniques develop, they will never beat the simple and straightforward method of hiding data completely.
Robustness / Endurance
A large benefit of blockchain-powered databases is extreme fault tolerance, which stems from their built-in redundancy. Every node processes every transaction, so no individual node is crucial to the database as a whole. Similarly, nodes connect to each other in a dense peer-to-peer fashion, so many communication links can fail before things grind to a halt. The blockchain ensures that nodes which went down can always catch up on transactions they missed.
So while it’s true that regular databases offer many techniques for replication, blockchains take this to a whole new level. For a start, no configuration is required – simply connect some blockchain nodes together, and they automatically keep themselves in sync. In addition, nodes can be freely added or removed from a network, without any preparation or consequences. Lastly, external users can send their transactions to any node, or to multiple nodes simultaneously, and these transactions propagate automatically and seamlessly to everyone else.
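The catch-up behaviour described above can be sketched in a few lines: a node that was offline simply copies the blocks it missed from any peer. In a real network it would also verify each fetched block’s hashes and signatures rather than trusting the peer; the function name here is illustrative.

```python
def sync(node_chain, peer_chain):
    """Catch a lagging node up by copying the blocks it missed from a peer.
    Returns the number of blocks fetched."""
    missed = peer_chain[len(node_chain):]
    node_chain.extend(missed)
    return len(missed)

peer = ["genesis", "b1", "b2", "b3"]
node = ["genesis", "b1"]        # this node went down after block 1
assert sync(node, peer) == 2    # two missed blocks fetched
assert node == peer             # back in sync with the network
```

Because any peer can serve this request, no single node is a point of failure for recovery.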
This robustness transforms the economics of database availability. With regular databases, high availability is achieved through a combination of expensive infrastructure and disaster recovery. A primary database runs on high-end hardware which is monitored closely for problems, with transactions replicated to a backup system in a different physical location. If the primary database fails (e.g. due to a power cut or catastrophic hardware failure), activity is automatically moved over to the backup, which becomes the new primary. Once the failed system is fixed, it’s lined up to act as the new backup if and when necessary. While all this is doable, it’s expensive and notoriously difficult to get right.
Instead, imagine 10 blockchain nodes running in different parts of the world, all on commodity hardware. These nodes would be densely connected to each other, sharing transactions on a peer-to-peer basis and using a blockchain to ensure consensus.
End users generating the transactions connect to (say) 5 of these nodes, so it doesn’t matter if a few communication links go down. And if one or two nodes fail completely on any given day, nobody feels a thing, because there are still more than enough copies to go round. As it happens, this combination of low-cost systems and high redundancy is exactly how Google built its search engine so cheaply. Blockchains can do the same thing for databases.
So when it comes to a long-term decision on using a blockchain, the question to ask is this: What’s more important for my use case? Disintermediation and robustness? Or confidentiality and performance?
Obviously your company’s or your own individual concerns are left out of the comparison above, and the factors in favour of a centralised database are gradually being addressed by next-generation (third) blockchain technology. As such, blockchains have strong use cases where disintermediation and robustness matter more than confidentiality (which cryptography addresses) and performance (soon to be a non-issue).
Read the original article on Steemit