DLT Interoperability and More ⛓️#29— Blockchain Privacy and Regulatory Compliance: Towards a Practical Equilibrium⛓️
In this series, we analyze papers on blockchain and interoperability.
This edition covers a survey paper on privacy-preserving regulatory compliance: the ability of users to prove their funds originate from legitimate sources without disclosing the entire transaction graph.
➡️ Title: Blockchain Privacy and Regulatory Compliance: Towards a Practical Equilibrium
➡️ Authors: Vitalik Buterin, Jacob Illum, Matthias Nadler, Fabian Schär, Ameen Soleimani
➡️ Paper source: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4563364
➡️ Motivation:
If no one cares about on-chain privacy, sooner or later people will resort to centralized alternatives to hide their traces. First, let’s introduce some context on why privacy is important, especially considering the interoperability trends.
Blockchain has reached a tipping point. Recently, Arbitrum became the first rollup to surpass Ethereum in daily activity. A varied landscape of Layer 1 solutions is competing for user transactions. Users compete for block space within each ecosystem, creating a fee market that incentivizes different actors to reorder transactions and extract value. With all this diversity, connecting centralized infrastructure (financial infrastructure, databases) with decentralized infrastructure (L1, L2, L3) requires a huge effort in interoperability and integration engineering. As we mentioned in our report on the state of blockchain interoperability, rollup research overlaps heavily with interoperability research, since the on-ramp is done via a native bridge, so there is an inherent need to study system and data orchestration techniques that enable seamless cooperation.
These trends have been compounding over the last couple of years, yielding an ecosystem of ecosystems that is more and more interconnected and intertwined. It is as if globalization within Web3 is about to be completed: first slowly, with a few dozen solutions (and barely any standards) back in 2021, and then suddenly, with a true “Cambrian” explosion of blockchains and interoperability mechanisms.
The industry's focus also changed: when I started working in blockchain in 2017, my first focus was (perhaps strangely) permissioned blockchains. There was something truly interesting about increasing the robustness of data, and sharing accountability in a network of identified peers that caught my attention. The idea of governments and traditionally centralized entities moving towards decentralization and technical accountability seemed too good to be true. Justicechain was the output of that research, where we decentralized the logging mechanism of the court information systems of the Portuguese government.
However, permissioned and private blockchain technology didn’t gain much momentum because the regulatory risk was too high. Perhaps cooperating is also more political and technically harder than expected. Or perhaps decentralizing a business invariably reduces your slice of the cake (although, arguably, in the long run, there will be a bigger cake). As enterprises learn from the permissionless world and are motivated by the chance to be thought leaders, the momentum for permissioned infrastructure is rising again. Much infrastructure and many services are now being built for enterprises. See, for example, Blockdaemon’s service offerings on institutional staking, nodes, and wallets, or our work on standardizing blockchain interoperability protocols at the IETF. Besides, research on hybrid blockchains (see, for example, an open-source flagship interoperability project called Hyperledger Cacti) balances crypto-economic incentives with the privacy requirements that organizations often have (see the Carbon Accounting group within Hyperledger). Some of the centralization vectors under discussion bring the stability, efficiency, and security that ordinary people need to be onboarded into crypto. Nobody can benefit from blockchain's immense advantages if their wallet is drained when they participate in a suspicious ETH giveaway. So there needs to be centralization up to a point, balanced by the natural market forces that decentralize the market. Keeping this delicate balance between centralization and decentralization in the Web3 space has implications for the balance between privacy and accountability. This aligns very well with the paper we are going to discuss today.
This paper’s main contribution is protecting the legitimate Web3 user, whose funds come from legitimate sources. It is a step towards complying with a heterogeneous set of regulations in a privacy-preserving way. This is essential for enterprises and individuals that must ensure their actions comply with the law. Imagine, for example, that a legitimate user receives proceeds from a bridge hack, which wouldn’t be too improbable, given that more than $2B has been lost to hackers in the last two years. How can a user prevent this and keep their traces clean? Surely, after all the on-chain analytics research and development (where a very relevant player is Chainalysis), identifying such situations is becoming easier. We defined the notion of a cross-chain model for bridge protection, where an operator defines the business rules that bridges should abide by and monitors deviations from those rules in real time.
Blockchain Privacy and Regulatory Compliance: Towards a Practical Equilibrium sits at the intersection of policy-making and technology, with consequences for the interoperability and privacy space.
➡️ Contributions:
- The authors present a smart contract privacy-enhancing protocol.
➡️ Analysis:
Indeed, privacy is a tricky topic, especially for public blockchains. If the privacy of transactions is paramount, a simple solution is to use a centralized service or a private blockchain. Private blockchains are less decentralized than public blockchains but more decentralized than centralized services due to their immutable audit trail. Thus, they could be appropriate depending on the specific use case. However, it is easy to argue that, for most popular applications, public blockchains are the preferred option. A way to provide transaction privacy without sacrificing decentralization is at the application level, in other words, by implementing a protocol as a smart contract. While on-chain privacy services like Tornado Cash (a mixer [1]) are somewhat effective, it is generally difficult for their users to disassociate themselves from illegal activity. Moreover, analysis tools are becoming increasingly sophisticated at undoing the acquired privacy [2].
Users might want to prove that 1) their funds came from a certain source (membership proof) or 2) their funds did not come from a certain source (exclusion proof). This mechanism is based on Merkle proofs and zero-knowledge proofs. There is a lot of literature on these topics, so I will not repeat it; however, I can recommend two sources: our paper on trustless blockchain interoperability using zero-knowledge and Merkle proofs (Section II gives high-level background; note that we explain SNARKs and zkSNARKs, terms the industry uses interchangeably with zk proofs), and the paper we are studying. In short, zk proofs allow us to prove that a coin transfer is legitimate without disclosing the sender or receiver.
“The core idea of Privacy Pools is this: instead of merely zero-knowledge-proving that their withdrawal is linked to some previously-made deposit, a user proves membership in a more restrictive association set.” This is done by providing two zero-knowledge proofs: one showing a transfer is valid, and another showing the transfer is within the association set (the funds are coming from a certain range of addresses).
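To make the two membership claims above concrete, here is a minimal sketch in Python. It is not the paper's protocol: there is no zero-knowledge layer, so the deposit being proven is revealed in the clear, and the hash function, tree layout, and names such as `merkle_proof` and `association_set` are assumptions chosen for illustration. It only shows that the same deposit can be proven to belong to two Merkle trees, the full pool and a more restrictive association set, and that an exclusion claim can be expressed as membership in a set that leaves the unwanted deposits out.

```python
import hashlib


def h(data: bytes) -> bytes:
    """SHA-256, standing in for whatever hash a real deployment uses."""
    return hashlib.sha256(data).digest()


def build_levels(leaves: list[bytes]) -> list[list[bytes]]:
    """Return every tree level, from the hashed leaves up to the root."""
    level = [h(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level = level + [level[-1]]  # pair the last node with itself
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def merkle_proof(leaves: list[bytes], index: int):
    """Return (root, proof); proof is a list of (sibling_hash, sibling_is_right)."""
    levels = build_levels(leaves)
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling >= len(level):
            sibling = index  # odd-sized level: the node was paired with itself
        proof.append((level[sibling], index % 2 == 0))
        index //= 2
    return levels[-1][0], proof


def verify(leaf: bytes, proof, root: bytes) -> bool:
    """Recompute the path from the leaf and compare it with the root."""
    node = h(leaf)
    for sibling, sibling_is_right in proof:
        node = h(node + sibling) if sibling_is_right else h(sibling + node)
    return node == root


# Hypothetical pool: every deposit ever made, one of which is flagged.
deposits = [b"deposit-A", b"deposit-B", b"deposit-C", b"deposit-D", b"deposit-E"]
flagged = {b"deposit-C"}
association_set = [d for d in deposits if d not in flagged]

my_deposit = b"deposit-D"
pool_root, pool_proof = merkle_proof(deposits, deposits.index(my_deposit))
assoc_root, assoc_proof = merkle_proof(
    association_set, association_set.index(my_deposit)
)

in_pool = verify(my_deposit, pool_proof, pool_root)
in_association_set = verify(my_deposit, assoc_proof, assoc_root)

# The flagged deposit is in the pool but cannot be tied to the association root.
_, bad_proof = merkle_proof(deposits, deposits.index(b"deposit-C"))
bad_in_association_set = verify(b"deposit-C", bad_proof, assoc_root)
```

In this sketch `in_pool` and `in_association_set` both hold for the honest deposit, while `bad_in_association_set` does not. In the actual proposal these checks would be carried out inside a zero-knowledge proof, so that an observer learns only that they passed, not which deposit was used.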
💪 Strong points:
- A timely proposal in an age where privacy matters more than ever. The implications of monitoring crime carried out through cross-chain technologies still need to be studied.
🤞 Suggestions for improvement:
- It would be interesting to see, in more detail, how the authors envision real-time AI-based scoring. For blockchain monitoring and security, formal methods, model-based security, and similar techniques are more reliable than probabilistic methods, such as the ones based on LLMs or classical machine learning algorithms (because they imply false positives and false negatives).
- “Last but not least, a malicious ASP could choose to compile the proposed association sets in a way that allows them to maximize the extractable information or inflate the perceived anonymity by adding deposits for which the corresponding withdrawals are known.” These are future research directions that need to be taken into account during the specification of the protocol.
🔥 Points of interest:
- “Add with delay, exclude bad actors: any deposit is automatically added to the association set after a fixed period (e.g., seven days), but if the system detects that a given deposit is connected to known bad behavior (e.g., large-scale thefts, or addresses on a government-published sanctions list), the deposit is never added.” Real-time monitoring of blockchains is a hard task. It implies monitoring the mempools with low latency, which means the monitor needs a geographically distributed node network that can catch transaction proposals ahead of time. Simulations need to be run to predict the state of the blockchain when transactions will be applied, taking into account that transactions can be reordered and dropped. This whole process takes a few seconds and is distributed among thousands of machines. Therefore, good actors benefit from a delay that gives them time to run algorithms that infer whether, for example, funds are coming from illicit sources. The following paragraphs reinforce this idea:
- “Suppose that Alice sends a coin to Bob; that is, she makes an internal send that (perhaps partially) consumes a coin ID owned by Alice, and creates a new coin ID with parameters provided by Bob. Bob then wants to immediately spend the coin, sending it to Carl, and he would prefer his spending transaction to be private as well. Here we have our challenge: inclusion delays. In many of the configurations we proposed above, ASPs would not be willing to immediately add Bob’s new coin to their association set, because they need to watch for the possibility that the source of funds is not Alice, but instead someone who just stole the funds from Alice’s wallet. The inclusion delay gives Alice time to report the incident, or third parties time to detect it.”
“Another similar use case would be: “Alice” is a DeFi protocol, and Bob wants to withdraw funds from the DeFi protocol and immediately use those funds to privately pay Carl. This scenario has one fewer human being, but is otherwise structurally very similar.”
- “If there is a perfect consensus on which funds are “good” and which are “bad”, the system will lead to a simple separating equilibrium. All users with “good” assets have strong incentives and the ability to prove their membership in a “good”-only association set. Bad actors, on the other hand, will not be able to provide that proof. They could still deposit “bad” funds into the pool, but it would not provide them any benefits. Everyone could easily identify that the funds have been withdrawn from a privacy-enhancing protocol and see that the withdrawal references an association set that includes deposits from questionable sources. More importantly, the “bad” funds would not taint the “good” funds. When funds from legitimate deposits are withdrawn, their owner can simply exclude all known “bad” deposits from their association set. In cases where there is no global consensus, and the conclusion on whether funds are perceived as “good” or “bad” depends on the societal perspective or the jurisdiction, association sets could differ significantly.” — this paves the way for marketplaces of association sets that are based on reputation.
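The “add with delay, exclude bad actors” policy quoted above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the `Deposit` fields, the `build_association_set` helper, and the example addresses are hypothetical, and “connected to known bad behavior” is reduced to a lookup of the source address in a flagged set, whereas a real ASP would rely on far richer analysis.

```python
from dataclasses import dataclass

# The quote's "e.g., seven days"; the exact value is an assumption.
INCLUSION_DELAY_SECONDS = 7 * 24 * 60 * 60


@dataclass(frozen=True)
class Deposit:
    deposit_id: str
    source_address: str
    timestamp: int  # Unix seconds


def build_association_set(
    deposits: list[Deposit], flagged_addresses: set[str], now: int
) -> list[str]:
    """Keep deposits that have aged past the delay and are not flagged."""
    return [
        d.deposit_id
        for d in deposits
        if now - d.timestamp >= INCLUSION_DELAY_SECONDS
        and d.source_address not in flagged_addresses
    ]


DAY = 24 * 60 * 60
now = 1_700_000_000  # arbitrary reference time for the example

deposits = [
    Deposit("d1", "0xalice", now - 10 * DAY),  # old enough, clean source
    Deposit("d2", "0xthief", now - 10 * DAY),  # old enough, flagged source
    Deposit("d3", "0xbob", now - 1 * DAY),     # clean, still inside the delay
]
flagged = {"0xthief"}

set_today = build_association_set(deposits, flagged, now)
set_next_week = build_association_set(deposits, flagged, now + 7 * DAY)
```

Here `set_today` contains only `d1`: `d3` is still inside the delay window and `d2` is never added. A week later, `set_next_week` also admits `d3`, while `d2` stays out. The delay is what gives Alice, or a third party, time to get an address into the flagged set before the deposit is admitted.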
🚀 What are the implications for our work?
- It is good to see the industry thinking about on-chain privacy-preserving mechanisms. In an increasingly interconnected Web3 universe, privacy-preserving tools for bridging will need to be developed. At the same time, it is important to protect the legitimate user who wants to use these services and to hold bad actors accountable. Our research focuses on mapping the landscape of security and privacy of interoperability mechanisms, which we believe can contribute to the vision of Buterin et al.’s paper.