Digital signatures are the foundation of online sovereignty. The advent of public key cryptography in 1976 paved the way for the creation of a global means of communication - the Internet - and an entirely new form of money, Bitcoin. Although the fundamental properties of public key cryptography have not changed much since then, today there are dozens of different open source digital signature schemes available to cryptographers.
When Satoshi Nakamoto began working on Bitcoin, one of the key decisions was which signature scheme to choose for an open and public financial system. The requirements were clear: the algorithm had to be widely used, well understood, reasonably secure, lightweight and, most importantly, open source. Of all the options available at the time, he chose the one that met these criteria best: the Elliptic Curve Digital Signature Algorithm, or ECDSA.
At the time, native support for ECDSA was provided in OpenSSL, an open-source encryption toolkit developed by veteran cypherpunks to improve the privacy of online communications. Compared to other popular schemes, ECDSA has lower computational requirements and shorter keys, both useful properties for digital money. At the same time, it provides security comparable to schemes such as RSA: for example, a 256-bit ECDSA key offers roughly the same level of security as a 3072-bit RSA key.
Thanks to the hard work Pieter Wuille and his colleagues have done on an optimized implementation of the secp256k1 elliptic curve, Bitcoin's ECDSA has become even faster and more efficient. However, ECDSA still has some shortcomings that may serve as sufficient grounds for its complete replacement. After several years of research and experimentation, a new signature scheme has been proposed to improve the privacy and efficiency of Bitcoin transactions: the Schnorr digital signature scheme.
In this article, I will outline the many implementations of Schnorr signatures and the benefits of these signatures. Next, we'll talk about MuSig, a new multisig standard that can serve as the basis for introducing new Bitcoin technologies such as Taproot. Finally, I'll explain how a full implementation of Schnorr signatures could break the heuristics used in blockchain analytics and at the same time help develop a strong fee market on Bitcoin's base layer.
The history of Schnorr signatures
While the Schnorr digital signature scheme has many advantages over ECDSA, it is certainly not new. It was invented back in the 1980s by Claus-Peter Schnorr, a German cryptographer and academic who was at that time a professor and researcher at the University of Frankfurt. His signature scheme built on the work of David Chaum, Taher ElGamal, Amos Fiat and Adi Shamir. However, before publishing the new scheme, Schnorr filed numerous patents that prevented its direct use for many years.
Interestingly, DSA, the predecessor of ECDSA, was a hybrid of the ElGamal and Schnorr schemes, created solely to circumvent Schnorr's patents. In fact, just two months after the US patent was issued to Schnorr, the author of DSA, the US National Institute of Standards and Technology (NIST), also filed a patent for its solution. After this, Schnorr became even more active in defending his patents and responded directly to his critics on the Coderpunks mailing list (an offshoot of the original Cypherpunks mailing list). His replies can still be read online, as can an internal NIST memo describing the patent issues.
In 2008, almost two decades after the introduction of Schnorr's signature scheme, Claus Schnorr's patent expired. Coincidentally, 2008 was also the year that Satoshi Nakamoto introduced Bitcoin to the world. Although Schnorr signatures were usable at this point, they were not yet standardized or widely used. This is probably why Satoshi chose ECDSA. And although cryptographers and mathematicians often criticize this algorithm, ECDSA is still quite widely used today, and at that time it was the safer option for Bitcoin.
Digital Signature Protocol
The Schnorr algorithm can also be used as a digital signature protocol for a message M. The same key pair is used (private key w and public key y, together with the public parameters p, q and g), but a one-way hash function H is added.
Signature generation
- Preliminary processing. Peggy chooses a random number r less than q and calculates x = g^r \bmod p. This is the precomputation stage. The same public and private keys can be used to sign different messages, but r must be chosen anew for each message.
- Peggy concatenates the message M and x and hashes the result to obtain the first signature component: S_1 = H(M | g^r \bmod p).
- Peggy calculates the second signature component, which is computed modulo q: S_2 = r + wS_1 \bmod q.
- Peggy sends Victor the message M and the signature components S_1, S_2.
Signature verification
- Victor calculates X = g^{S_2}y^{S_1} \bmod p (or X = g^{S_2}y^{-S_1} \bmod p if the public key is defined as y = g^{w} \bmod p rather than y = g^{-w} \bmod p).
- Victor checks that H(M|X)=S_1. If so, then he considers the signature to be correct.
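As a short check of why this works, assume the public key is defined as y = g^{-w} \bmod p (the convention implied by the first verification formula) and that g has order q modulo p. Substituting S_2 = r + wS_1 \bmod q gives:

\[
X = g^{S_2} y^{S_1} = g^{r + wS_1} \cdot g^{-wS_1} = g^{r} \pmod p
\]

so H(M | X) = H(M | g^r \bmod p) = S_1. With y = g^{w} \bmod p, the same cancellation happens in X = g^{S_2} y^{-S_1} \bmod p.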
Efficiency
The main computation in signature generation happens in the preprocessing stage and in calculating wS_1 \bmod q, where the numbers w and S_1 are on the order of 140 bits and the parameter r is 72 bits. This last multiplication is negligible compared to the modular multiplications in the RSA scheme.
Signature verification consists mainly of the calculation X = g^{S_2}y^{S_1}, which can be done on average in 1.5l + 0.25t calculations modulo p, where l = [log_2q] is the length of q in bits.
A shorter signature allows you to reduce the number of operations for signature generation and verification: in the Schnorr scheme O(\log_2q\log_2^2p), and in the ElGamal scheme O(\log^3p).
Example
Key generation:
- q = 103 and p = 2267. Moreover, p = 22q + 1.
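- Select f = 2, which is an element of Z_{2267}^*. Then \frac{p-1}{q} = 22 and g = 2^{22} \bmod 2267 = 354.
- Peggy chooses the private key w = 30; then y = g^w \bmod p = 354^{30} \bmod 2267 = 1206. With this choice of y, verification uses the form X = g^{S_2}y^{-S_1} \bmod p.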
- Peggy's private key is 30 and her public key is (103,2267,354,1206).
Message signature:
- Peggy needs to sign the message M=1000.
- Peggy chooses r = 11 and calculates g^r = 354^{11} = 630 mod 2267.
- The message is 1000, and concatenating it with 630 gives 1000630. Let's also assume that hashing this value produces the digest H(1000630) = 200. This means S_1 = 200.
- Peggy calculates S_2 = r + wS_1 mod q = 11 + 30*200 mod 103 = 11 + 26 = 37.
- Peggy sends Victor M=1000, S_1 =200 and S_2 = 37.
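The example stops before verification, so here is a minimal Python sketch that replays it and adds Victor's check. It assumes the y = g^w convention (which is what yields y = 1206), and `toy_hash` is a hypothetical stand-in that returns the digest value 200 assumed in the example and falls back to SHA-256 reduced modulo q for any other input. The parameters are far too small for real use.

```python
import hashlib

# Toy parameters from the example above (not secure).
p, q, g = 2267, 103, 354
w = 30                  # Peggy's private key
y = pow(g, w, p)        # public key under the y = g^w convention

def toy_hash(message, x):
    """Hypothetical stand-in for H(M | x)."""
    if (message, x) == (1000, 630):
        return 200      # digest value assumed in the example
    digest = hashlib.sha256(f"{message}{x}".encode()).digest()
    return int.from_bytes(digest, "big") % q

def sign(message, r):
    x = pow(g, r, p)                # commitment g^r mod p
    s1 = toy_hash(message, x)       # first signature component
    s2 = (r + w * s1) % q           # second signature component, modulo q
    return s1, s2

def verify(message, s1, s2):
    y_inv = pow(y, p - 2, p)        # inverse of y modulo the prime p (Fermat)
    x = (pow(g, s2, p) * pow(y_inv, s1, p)) % p   # X = g^{S_2} * y^{-S_1} mod p
    return toy_hash(message, x) == s1

s1, s2 = sign(1000, r=11)
print(y, s1, s2, verify(1000, s1, s2))
```

Victor recovers X = 630, the same value Peggy hashed, so the signature is accepted; changing S_2 by one makes the check fail.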
Schnorr signatures in Bitcoin
Let's fast forward another decade to today. The Schnorr signature scheme now looks much less esoteric, and its standardized implementations, such as ed25519, are becoming a popular option for some altcoins. Informal talk about a potential implementation of Schnorr signatures on the Bitcoin network dates back to a 2014 BitcoinTalk forum thread, but the proposal was only formalized after several years of research and experimentation, when Pieter Wuille wrote the Schnorr BIP (Bitcoin Improvement Proposal). This draft proposal describes the specifications and technical aspects of a potential implementation of Schnorr signatures that would have the following advantages over ECDSA:
- Proof of Security:
The security of Schnorr signatures is easily proven, assuming a sufficiently random hash function (the random oracle model) and sufficient hardness of the elliptic curve discrete logarithm problem (ECDLP). No such proof exists for ECDSA.
- Non-malleability:
ECDSA signatures are inherently malleable, which allows a third party without access to the private key to alter an existing valid signature into another valid signature for the same key and message. This issue was discussed in BIP62. In comparison, Schnorr signatures are provably non-malleable.
- Linearity:
Schnorr signatures have the remarkable property that multiple parties can jointly create a signature that is valid for the sum of their public keys. This can serve as a building block for various higher-level designs that improve efficiency and privacy, such as multisigs and other smart contracts.
The security proofs of Schnorr signatures, as well as the guarantee of their non-malleability, give them clear advantages over ECDSA. These two advantages alone can serve as sufficient grounds for a soft fork. But a particularly impressive property of Schnorr signatures is their linearity. Specifically, it allows multiple signers of a multisignature transaction to combine their public keys into one aggregated key representing the entire group, a property called key aggregation.
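To make the linearity property concrete, here is a minimal Python sketch in the toy group from the worked example earlier (p = 2267, q = 103, g = 354). It is an illustration under stated assumptions, not Bitcoin's scheme: the group is multiplicative, so the "sum" of public keys becomes a product; signatures are written as (R, s) with s = r + e·w; and the keys, nonces, message and `challenge` function are all hypothetical. Naive aggregation like this is open to rogue-key attacks, which is one of the problems MuSig is designed to address.

```python
import hashlib

# Toy group from the worked example earlier in the article (not secure).
p, q, g = 2267, 103, 354

def challenge(R, Y, message):
    """Hash the nonce commitment, public key and message to a scalar modulo q."""
    data = R.to_bytes(2, "big") + Y.to_bytes(2, "big") + message
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % q

def verify(Y, message, R, s):
    # Accept if g^s == R * Y^e (mod p), using the y = g^w convention.
    e = challenge(R, Y, message)
    return pow(g, s, p) == (R * pow(Y, e, p)) % p

# Hypothetical private keys (w) and per-signature nonces (r) for Alice and Bob.
w_a, r_a = 30, 11
w_b, r_b = 57, 42
Y_a, Y_b = pow(g, w_a, p), pow(g, w_b, p)

# Aggregated public key and aggregated nonce commitment.
Y_agg = (Y_a * Y_b) % p
R_agg = (pow(g, r_a, p) * pow(g, r_b, p)) % p

message = b"pay Carol"
e = challenge(R_agg, Y_agg, message)
s_a = (r_a + e * w_a) % q       # Alice's partial signature
s_b = (r_b + e * w_b) % q       # Bob's partial signature
s_agg = (s_a + s_b) % q         # partial signatures simply add up

print(verify(Y_agg, message, R_agg, s_agg))
```

A verifier sees only one key and one signature, and runs the same check it would run for a single signer.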
While the ability to combine keys into one may sound trivial, the benefits of doing so should not be underestimated. Since ECDSA does not natively support multisigs, Bitcoin had to implement them through a standardized smart contract (yes, Bitcoin also has smart contracts) called Pay-to-Script-Hash (P2SH). It allows users to add spending conditions, called encumbrances, to specify how funds can be spent, for example "unlock the balance only if the message is signed by Bob and Alice."
The first problem with P2SH is that it requires knowledge of the public keys of all participants in the multisignature, which is not an efficient system. Aggregating these keys would optimize verification, since the network would only need to verify one key rather than several. This also implies a smaller footprint on the blockchain, lower transaction costs, and improved network throughput.
The second problem with P2SH is that it offers very little privacy guarantee. As stated in BIP13, P2SH transactions require that addresses begin with the number 3. This allows blockchain analysts to not only recognize all P2SH transactions on the network, but also accurately determine the addresses participating in the multisig:
Blockchain analyst: “Definitely multi-signature.” - Not good.
In the example above, the network will know (1) that a multisig transaction exists, (2) how many addresses are participating in the multisig, and (3) who exactly signed the transaction. This is bad for privacy and for operational security, especially for use cases like 2FA.
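As an aside on the address prefix mentioned above: the leading "3" follows from Base58Check encoding with the P2SH version byte 0x05. The sketch below is a minimal illustration; the 20-byte payload is a hypothetical placeholder (a real P2SH address uses the HASH160 of the redeem script, that is RIPEMD-160 of SHA-256), and `base58check` is a helper written for this example.

```python
import hashlib

ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def base58check(version, payload):
    """Encode version byte + payload + 4-byte double-SHA-256 checksum in Base58."""
    data = bytes([version]) + payload
    checksum = hashlib.sha256(hashlib.sha256(data).digest()).digest()[:4]
    raw = data + checksum
    num = int.from_bytes(raw, "big")
    encoded = ""
    while num > 0:
        num, rem = divmod(num, 58)
        encoded = ALPHABET[rem] + encoded
    # Each leading zero byte is represented by the character '1'.
    leading_zeros = len(raw) - len(raw.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded

# Placeholder 20-byte script hash (assumption: truncated SHA-256 instead of HASH160).
script_hash = hashlib.sha256(b"hypothetical redeem script").digest()[:20]

p2sh_address = base58check(0x05, script_hash)    # P2SH version byte
p2pkh_style = base58check(0x00, script_hash)     # P2PKH version byte, for contrast
print(p2sh_address[0], p2pkh_style[0])
```

Any 20-byte payload with version 0x05 encodes to a string starting with "3", which is why such outputs are trivially recognizable on the blockchain.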
Key aggregation, on the other hand, preserves the anonymity of the multisignature participants and does not compromise operational security by exposing the keys needed to unlock the balance. Most importantly, key aggregation makes multi-signature transactions indistinguishable from regular transactions:
Blockchain Analyst: “It could be multi-signature... It’s impossible to say for sure...” – Okay now.
The first iteration of Schnorr signatures in Bitcoin will eliminate the OP_CHECKSIG and OP_CHECKMULTISIG family of opcodes currently used with ECDSA in favor of a new class of opcodes called OP_CHECKDLS. Without going into too much detail, DLS stands for Discrete Log Signature, and it allows signatures to be verified more efficiently and with fewer opcodes.
Back in early 2018, Gregory Maxwell, Andrew Poelstra, Yannick Seurin, and Pieter Wuille published a white paper on a new Schnorr signature-based multisignature scheme called MuSig. Since this publication, they have worked hard to translate the proposed multisignature scheme into usable code.
One of the most interesting things about MuSig in the context of key aggregation is the ability to create private smart contracts outside the blockchain. Essentially, MuSig allows multisig participants to apply encumbrances to aggregated off-chain keys without disclosing those terms and completely separate from Bitcoin's consensus rules.
In December 2018, Anthony Towns became the first Bitcoin Core developer to prepare a semi-formalized proposal for enabling Schnorr signatures, which was presented on the Bitcoin developer mailing list. I expect conversations about a potential soft fork to increase in the coming months.
To summarize, the first iteration of MuSig in Bitcoin will have support for key aggregation, which can immediately (1) improve the privacy of multisigs, (2) improve the efficiency of transaction verification, (3) improve security by eliminating the problems inherent in ECDSA, and (4) provide the ability to integrate smart contracts such as Taproot, which we will discuss in the next section.
And this is just the beginning.
What are Schnorr signatures? What is Taproot?
Taproot offers its own version of a Merkle tree called a script tree. Participants can choose to spend using:
- a public key, with a regular signature;
- a script.
The first option is the default spending path, where single-party and multi-party public keys are indistinguishable.
In the second case, hidden scripts are not revealed until the spend is made. The different scripts can be organized into a Merkle tree, and the output can be spent by revealing one of the scripts.
If we spend a transaction using the primary spend script, we simply provide a Merkle proof that consists of the primary spend script and the hash of the alternate spend script - this is enough to prove that the primary spend script is contained in the script tree.
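The two-script case described above can be sketched in Python as follows. This is a simplified illustration modelled on the tagged-hash structure of BIP341 (TapLeaf and TapBranch hashes, with the two children of a branch sorted before hashing); the script bytes are hypothetical placeholders rather than real Bitcoin Script, and only scripts shorter than 253 bytes are handled.

```python
import hashlib

def tagged_hash(tag, data):
    """SHA-256 with a domain-separating tag prefix, as used by Taproot."""
    tag_hash = hashlib.sha256(tag.encode()).digest()
    return hashlib.sha256(tag_hash + tag_hash + data).digest()

def leaf_hash(script, leaf_version=0xC0):
    # Leaf = version byte + script length + script (single-byte length only).
    assert len(script) < 253
    return tagged_hash("TapLeaf", bytes([leaf_version, len(script)]) + script)

def branch_hash(a, b):
    # Children are sorted so the proof does not need left/right markers.
    left, right = sorted((a, b))
    return tagged_hash("TapBranch", left + right)

def verify_proof(script, sibling_hashes, expected_root):
    """Recompute the root from a revealed script and its sibling hashes."""
    node = leaf_hash(script)
    for sibling in sibling_hashes:
        node = branch_hash(node, sibling)
    return node == expected_root

# Hypothetical placeholder scripts.
primary = b"primary spend script"
alternate = b"alternate spend script"

root = branch_hash(leaf_hash(primary), leaf_hash(alternate))

# The proof: the primary script itself plus the hash of the alternate script.
proof = [leaf_hash(alternate)]
print(verify_proof(primary, proof, root))
```

The alternate script itself never appears in the proof, only its hash, so its contents stay hidden.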
Taproot uses the MAST structure to hide the conditions behind the Merkle root. The Merkle root itself is hidden in this scenario and allows direct spending through the key. Only a single key is sent to the blockchain - no one sees that there are additional conditions.
In combination with Schnorr signatures, the MAST structure is hidden thanks to Taproot outputs. At the top of the Merkle tree there is an option to publish a single public key and signature. As a result, P2PKH and P2SH transactions look identical.
An illustration can be seen in the closure of the Lightning channel.
Lightning channels are variations of 2-of-2 multisig. Instead of closing a channel using a cumbersome script, Schnorr allows signatures to be combined and represented as a single public key and Taproot signature. When both parties agree, the result looks as if someone spent this output using a regular signature, sending to two addresses. An observer will not be able to determine that this is a Lightning channel.
TapBranch is a script tree (TapTree) for closing a Lightning channel
To hide the MAST structure, the TapBranch hash in the diagram above is hashed with the aggregated public key (thanks to the Schnorr scheme, Alice and Bob can add their public keys to create an internal Taproot key).
The resulting hash is used as a private key, from which another public key, the tweak, is derived. Key tweaking, also known as key pair hiding, commits to scripts 1 and 2.
The tweak public key is then added to Taproot's internal key to create the Taproot output key. The process is illustrated below:
As stated, there are two ways to spend. The default spending path is when Alice and Bob agree to close the Lightning channel, and the Taproot output key ensures that the transaction appears to be a standard P2PKH transaction. In other scenarios, the script used is revealed once the coins are spent, while all other options remain hidden.
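The key tweak described above can be sketched in Python on the secp256k1 curve. This is a simplified illustration, not a BIP341-compliant implementation: it ignores the x-only and even-y key conventions, aggregates Alice's and Bob's keys by plain addition rather than MuSig, and uses hypothetical private keys and a placeholder Merkle root.

```python
import hashlib

# secp256k1 parameters: field prime, group order, generator point.
FIELD = 2**256 - 2**32 - 977
ORDER = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def point_add(a, b):
    """Add two curve points; None represents the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    if a[0] == b[0] and a[1] != b[1]:
        return None
    if a == b:
        lam = 3 * a[0] * a[0] * pow(2 * a[1], FIELD - 2, FIELD)
    else:
        lam = (b[1] - a[1]) * pow(b[0] - a[0], FIELD - 2, FIELD)
    lam %= FIELD
    x = (lam * lam - a[0] - b[0]) % FIELD
    return (x, (lam * (a[0] - x) - a[1]) % FIELD)

def point_mul(point, k):
    """Double-and-add scalar multiplication."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, point)
        point = point_add(point, point)
        k >>= 1
    return result

def tagged_hash(tag, data):
    tag_hash = hashlib.sha256(tag.encode()).digest()
    return hashlib.sha256(tag_hash + tag_hash + data).digest()

def taproot_tweak(internal_key, merkle_root):
    """Return (t, output_key) where output_key = internal_key + t*G."""
    data = internal_key[0].to_bytes(32, "big") + merkle_root
    t = int.from_bytes(tagged_hash("TapTweak", data), "big")
    return t, point_add(internal_key, point_mul(G, t))

# Hypothetical private keys for Alice and Bob.
d_alice, d_bob = 0x1111, 0x2222
internal_key = point_add(point_mul(G, d_alice), point_mul(G, d_bob))

# Placeholder for the script tree's Merkle root (the TapBranch hash).
merkle_root = hashlib.sha256(b"hypothetical script tree root").digest()

t, output_key = taproot_tweak(internal_key, merkle_root)

# Alice and Bob can still sign for the output key: its private key is d_alice + d_bob + t.
tweaked_private = (d_alice + d_bob + t) % ORDER
print(point_mul(G, tweaked_private) == output_key)
```

Because the tweak is derived from a hash of both the internal key and the Merkle root, the output key commits to the script tree while looking like any other public key on the blockchain.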
In the example above, if Alice and Bob agree to close the channel cooperatively, they can aggregate their public keys into a single key and add their partial Schnorr signatures together into a single signature.
Both parties provide partial signatures using their private keys, and closing the Lightning channel looks like a direct payment to the public key.
If the closure is not cooperative, only the script actually used is revealed. Verifiers will be able to determine that the threshold public key has been tweaked with the Merkle root. However, all other options and scripts will remain hidden.
The diagram above shows that the script tree offers a new recovery option for regaining access to bitcoins. Taproot provides a recovery path for lost coins (for users with updated wallets). With an ordinary single-key output, coins whose key is lost are lost forever. If the user loses the private key and their funds are held in a Taproot output, there can be another path through which the coins can be claimed (for example, a 3-of-5 set of backup keys held by the user's relatives).
Taproot improves the privacy, efficiency, and flexibility of Bitcoin scripts by allowing developers to write complex scripts while minimizing impact on the blockchain.
Complex transactions yield significant fee savings, since data-intensive scripts no longer have to pay fees higher than those of a standard Pay-to-Public-Key-Hash transaction. The more complex the transaction, the greater the efficiency gain.
Because Taproot allows complex transactions with just a single signature, the number of bytes used for aggregated keys and signatures does not change depending on the number of signers. When using Pay-to-Witness-Script-Hash (P2WSH) multisignature, each additional public key adds 8.5 vbytes, and each additional signature adds approximately 18.25 vbytes.
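One plausible way to arrive at these per-key and per-signature figures is the segwit discount, under which four bytes of witness data count as one virtual byte. The sketch below assumes a 33-byte compressed public key and a 72-byte signature (DER encoding plus sighash byte), each preceded by one length or push byte; these sizes are assumptions for illustration, and real signatures vary slightly in length.

```python
WITNESS_SCALE = 4  # four witness bytes are charged as one virtual byte

def witness_vbytes(witness_bytes):
    """Convert a number of witness bytes to virtual bytes."""
    return witness_bytes / WITNESS_SCALE

pubkey_item = 1 + 33        # push byte + compressed public key
signature_item = 1 + 72     # length byte + assumed signature size

per_pubkey = witness_vbytes(pubkey_item)
per_signature = witness_vbytes(signature_item)
print(per_pubkey, per_signature)

def multisig_overhead(required, total):
    """Extra vbytes for keys and signatures in a required-of-total P2WSH multisig."""
    return total * per_pubkey + required * per_signature

print(multisig_overhead(2, 3))
```

With key aggregation, the same policy is represented by one key and one signature regardless of how many signers take part.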
From a privacy perspective, Taproot allows you to minimize information about the spending conditions for the transaction output, which is disclosed in the blockchain. With Taproot, most apps can use a key-based spending path that is privacy protected.
While Schnorr's scheme allows multi-signature transactions to appear as if they were regular Pay-to-Public-Key-Hash transactions, Taproot expands the range of transactions that can be given this appearance (making Pay-to-Public-Key-Hash and Pay-to-Script-Hash indistinguishable).