A few months back Martin Swende wrote an article on how a sophisticated profit maximizer can frontrun on Bancor exchanges. (Blockchain Frontrunning) Although he refers directly to Bancor for the purposes of his research, the important takeaway in my opinion has less to do with Bancor, and more to do with the Ethereum protocol at large. Before getting into that, let’s take a moment to summarize Swende’s analysis and discuss its implications.
He correctly states that market orders are the problem. That is, market orders are inherently vulnerable to frontrunning behavior. To motivate our understanding, let’s take a brief look at how market buy orders work: There are two or more relevant parties, the buyer and seller(s). The buyer enters into a market order contract, and in doing so they concretize their intent to purchase a given number of shares of an agreed upon security at the current market price. Fulfilling the order involves finding the cheapest available asks and buying them out until the order is complete. Since there are a finite number of shares offered at a given ask price, it’s important to note that the current market price is not necessarily fixed. (In most securities markets, variations in market price caused by a single market order are negligible except for extremely large orders.)
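To make the order-filling mechanics concrete, here is a minimal Python sketch of a market buy walking up the asks. The order book, the prices (in integer cents), and the function name are all made up for illustration; real exchanges have far more machinery.

```python
from typing import List, Tuple

def fill_market_buy(asks: List[Tuple[int, int]], quantity: int) -> Tuple[int, int]:
    """Fill a market buy against (price_cents, size) asks.

    Returns (total cost in cents, highest price paid in cents).
    """
    remaining = quantity
    total_cost = 0
    last_price = 0
    for price, size in sorted(asks):  # cheapest asks are consumed first
        if remaining == 0:
            break
        taken = min(size, remaining)
        total_cost += taken * price
        last_price = price
        remaining -= taken
    if remaining > 0:
        raise ValueError("not enough liquidity to fill the order")
    return total_cost, last_price

# Hypothetical order book: 100 shares at $10.00, 200 at $10.05, 500 at $10.20.
asks = [(1000, 100), (1005, 200), (1020, 500)]

small_cost, small_last = fill_market_buy(asks, 50)    # stays at the best ask
large_cost, large_last = fill_market_buy(asks, 600)   # eats through three price levels
```

The small order never leaves the best ask, while the large one pushes the marginal price from $10.00 to $10.20. That price movement is exactly what a frontrunner is trying to get ahead of.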
Now we can extend our intuitions about market orders directly to Swende’s test dummy, the Bancor protocol. From the get-go, we should take note of several glaring issues present in the protocol architecture. First off, Bancor markets see modest (at best) market capitalizations and trading volume. This is an issue because volume shocks can sway the price dramatically. Second, orderbooks and trades are all handled on chain. Since all trading behavior is handled on chain and is out there for any and all to see, any would-be frontrunner has an easy time aggregating data on incoming transactions. Lastly, price adjustments are handled algorithmically, meaning our would-be attacker can perfectly predict exchange rate changes from incoming orders. So if a technically savvy individual or lucky miner sought to exploit a given market order in Bancor, they’d adjust the expected transaction ordering through various means, gain a favorable rate for the exchange, and capture any profits motivated by the resulting price increase from the original order. The frontrunner basically abuses the pricing algorithm to perfectly predict exchange rate adjustments that would result from incoming market orders.
Swende’s analysis is spot on, but it doesn’t cover Ethereum frontrunning in its entirety. That being said, it still serves as a wonderful starting point. In the remainder of this article I extend his observation and explain why frontrunning is a systematic, rather than Bancor-specific, vulnerability. In the first few sections I outline some preliminary concepts that might help less familiar readers understand what’s going on. After that, I discuss what might be called a denial-of-service (DoS) frontrunning attack, whereby an attacker spams the Ethereum blockchain with pointless computation to keep real transactions from being processed and thereby secure an economically favorable position. This attack vector extends what was at first a market-order-specific vulnerability to a systematic vulnerability for any application built on top of Ethereum that relies on internal state for the adjudication of exchange rates.
Starting With the Basics (Feel free to skip ahead)
How exactly does one frontrun a trade?
Frontrunning as it is defined on the NASDAQ website is the act of entering into an equity trade, options or futures contracts with advance knowledge of a block transaction that will influence the price of the underlying security to capitalize on the trade. This practice is expressly forbidden by the SEC. Traders are not allowed to act on nonpublic information to trade ahead of customers lacking that knowledge. (NASDAQ) To frontrun a trade, in the traditional sense, is to capitalize on information asymmetries pertaining to the future price of a security. (Note that the vast majority of crypto exchanges remain unregulated, so crypto frontrunning is a grey area)
Let’s suppose I’m a stock broker and I just got a call from one of my clients. He’s convinced that the 2018 Ford Mustang will outsell all other cars globally in the next quarter and wants me to order ten million shares of the Ford Motor Company. For a brief moment I feel as though I should advise him against his impulsive behavior and deny his request, but since I’m a profit maximizer I remember that I don’t feel and decide against such silly moral objections. My rational choice is to frontrun my client’s order. So that’s what I do.
The scenario plays out as follows: Suppose the current market price of Ford is $X. Let’s also suppose I have enough money to buy 10,000 of my own shares at price $X. I enter a market contract to make the trade. This initial trade is likely to be less than sufficient in magnitude to cause any meaningful change in the current market price, so we can assume that the price remains at approximately $X per share. After my order is filled, I proceed to enter my client’s order for a whopping 10,000,000 shares of Ford. His order fills the majority of the existing ask prices on the orderbook, and drives the market price up to $X + $Z where Z is some positive value. Once his order has been filled in its entirety, I offload my own 10,000 shares and capture that sweet sweet profit. This might not be the most elegant and profitable way to execute this transaction, but you get the idea.
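The arithmetic behind the broker’s profit is simple enough to write down. The sketch below uses placeholder values for $X and $Z (the scenario above leaves them unspecified) and ignores commissions and the small price impact of the broker’s own trades.

```python
def frontrun_profit_cents(own_shares: int, price_before_cents: int, price_after_cents: int) -> int:
    """Profit from buying before the client's order and selling after it."""
    return own_shares * (price_after_cents - price_before_cents)

# Placeholder numbers, not real Ford prices.
X_CENTS = 1100   # assumed price before the client's order ($11.00)
Z_CENTS = 25     # assumed price increase caused by the client's order ($0.25)

profit = frontrun_profit_cents(10_000, X_CENTS, X_CENTS + Z_CENTS)
```

With these assumed numbers the broker pockets 250,000 cents, or $2,500, for holding the shares only as long as it takes the client’s order to fill.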
Blockchains and Transaction Fees
Before we can understand how frontrunning translates onto the blockchain, we need to understand how the market for fees works. In the case of Bitcoin, the fee paid for a transaction is specified by the sending party at the time of execution. The sender can commit any remaining funds in their wallet as the transaction fee. Currently, storage capacity for each new block on the Bitcoin blockchain is limited to approximately 1 MB. Miners have the freedom to include any pending transactions (or none at all) in a block in whatever order they see fit. Constrained block size, the miner’s selectivity, and dynamic transaction fees combine to create a competitive market for fees. Rational miners are incentivized to process transactions in descending order of transaction fee in order to maximize profit. The current iteration of the Bitcoin architecture has caused some potentially problematic properties to surface. If you have a look at the pending transaction pool (commonly known as the mempool) you’ll see that at any time there are thousands of unconfirmed transactions waiting to be included in a future block.
Ethereum extends the concept of the blockchain transaction to more easily encompass a wide variety of operations. The formal specification for the Ethereum protocol outlines a series of built-in low-level opcodes. These opcodes amount to a Turing-complete programming language, enabling users to push arbitrary computation onto the blockchain. Ethereum miners, just like Bitcoin miners, expend computational resources to process these transactions. The difference is that on Ethereum, transactions can take the form of any valid operations sent to the miners. That means single transactions on Ethereum often differ in how much of a given block’s capacity they require. In the case of Bitcoin, each transaction occupies a relatively consistent amount of space. On Ethereum, the amount of space occupied by a given transaction is dynamic, with an upper bound at the current block’s gas limit (transaction ordering is inconsistent between clients, making it somewhat difficult to sustainably flood blocks from one account, but I get into that later). The first Ethereum miner to find the solution to the proof of work puzzle is rewarded with the block reward (currently 3 ETH) and any fees paid for the transactions in that block.
Referring to appendix G of the Ethereum yellow paper, we can start to figure out how much a miner might make in fees. Appendix G contains a list of valid operations in Ethereum. For a given operation miners expend a certain amount of computational resources. The term gas is used to approximate how computationally “costly” operations are relative to one another. The cost of an operation in gas remains fixed because its difficulty is not expected to deviate. The gas limit represents a block’s operational capacity, similar to the Bitcoin blocksize. The gas limit imposes a dynamic restriction on the amount of computation a sender can request in one transaction, so programs with infinite loops are not processed. Additionally, it lets the miners know how much of their computation they should dedicate to transaction processing in order to optimize their returns.
The gas value only loosely coincides with the real monetary value of fees. The value of an operation in terms of gas is constant, but the value of gas in terms of ether (the token) is dynamic. Consider the following example:
Suppose the transaction pool is currently populated with two transactions. The first transaction has account A sending 1 ether to account B. The second transaction has account C sending 1 ether to account D. These transactions are identical in terms of the amount of computation necessary to process them. Suppose further that the current gas limit restricts the next block such that there is only room for one of these transactions. How, then, should a miner differentiate between them? They could, for example, just process whichever one they received first. However, if we assume all miners are profit maximizers (in practice, they behave as such for the most part), then they will ignore time altogether and select whichever one offers the highest fee. To determine their return, the miner looks to the parameter called gas price. Gas price is the amount of ether, usually a tiny fraction, that the sender pays per unit of gas. It is typically quoted in GWei, where 1 ether == 10⁹ GWei.
Back to our example… Let’s say account A decides to set a gas price of 20 GWei whereas account C chooses a price of 25 GWei. For the sake of simplicity, let’s assume that each transaction requires a total of 10 gas. If a miner decides to process account A’s transaction, their return will be 200 GWei. If that same miner chooses to process account C’s transaction, they receive 250 GWei. Since we can be relatively safe in assuming miners are profit maximizers, they’ll always choose to process the transaction sent by account C. In reality, miners reserve the right to process transactions in whatever order they see fit. The assumption that miners will order transactions based solely on gas price holds for the most part but is not always true. I’ll get into this more in part II.
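Here is the same example as a small Python sketch. It assumes a purely profit-maximizing miner that greedily picks the highest gas price first, which (as noted above) is a simplification of how real miners and clients behave. The class and function names are mine, and the 10 gas figure is the simplified value from the example, not a real transaction cost.

```python
from dataclasses import dataclass
from typing import List

GWEI_PER_ETHER = 10**9

@dataclass
class Tx:
    sender: str
    gas_used: int
    gas_price_gwei: int

    @property
    def fee_gwei(self) -> int:
        # The miner's return for a transaction is gas used times gas price.
        return self.gas_used * self.gas_price_gwei

def pick_transactions(pool: List[Tx], block_gas_limit: int) -> List[Tx]:
    """Greedy selection: highest gas price first, until the block is full."""
    chosen, gas_left = [], block_gas_limit
    for tx in sorted(pool, key=lambda t: t.gas_price_gwei, reverse=True):
        if tx.gas_used <= gas_left:
            chosen.append(tx)
            gas_left -= tx.gas_used
    return chosen

pool = [Tx("A", gas_used=10, gas_price_gwei=20), Tx("C", gas_used=10, gas_price_gwei=25)]
block = pick_transactions(pool, block_gas_limit=10)  # room for only one transaction
```

Only account C’s transaction makes it into the block, earning the miner 250 GWei instead of the 200 GWei on offer from account A.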
The ability for miners to discriminate and arbitrarily order transactions gives rise to some interesting properties. Currently, a new block is added to the Ethereum blockchain every 12 seconds or so. Swende’s frontrunning exploit takes advantage of the long block time and discriminatory transaction ordering by rational miners.
Transaction Ordering and Frontrunning
Let’s say we’re looking at the ETH-BNT Bancor exchange. While looking intently at the transaction pool (that’s what I usually do in my free time), I notice a market order for 10,000 ETH worth of BNT at a gas price of 20 GWei. Since this order will probably put upward pressure on the price of BNT in terms of ETH, I decide that it might be in my best interest to try and frontrun it. In order to ensure that I can get an order of my own processed before the original 10,000 ETH one, I enter my own market order for BNT at a gas price of 40 GWei. If my transaction propagates through the transaction pool before the large order is successfully mined into a block, there’s a really good chance that my order will get processed first. We know from before that the Bancor protocol will programmatically adjust the exchange rate based on order fulfillment, so assuming my order was successfully processed prior to the large one, I can sell into the higher price it leaves behind and make some hefty profit in a matter of seconds.
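The sketch below plays this scenario out in Python. To be clear about the assumptions: the pricing rule is a toy constant-product curve standing in for "some deterministic pricing algorithm" and is not Bancor’s actual formula; the reserve sizes and my 100 ETH order are invented; gas costs are ignored; and miners are assumed to order strictly by gas price.

```python
class ToyMarket:
    """Toy constant-product market: eth * tokens stays constant across trades."""

    def __init__(self, eth: float, tokens: float):
        self.eth, self.tokens = eth, tokens

    def buy(self, eth_in: float) -> float:
        k = self.eth * self.tokens
        self.eth += eth_in
        tokens_out = self.tokens - k / self.eth
        self.tokens -= tokens_out
        return tokens_out

    def sell(self, tokens_in: float) -> float:
        k = self.eth * self.tokens
        self.tokens += tokens_in
        eth_out = self.eth - k / self.tokens
        self.eth -= eth_out
        return eth_out

def simulate(pending):
    """Process pending buys in descending gas-price order, like a rational miner."""
    market = ToyMarket(eth=100_000.0, tokens=1_000_000.0)
    received = {}
    for sender, eth_in, _gas_price in sorted(pending, key=lambda o: o[2], reverse=True):
        received[sender] = market.buy(eth_in)
    return market, received

# (sender, ETH spent, gas price in GWei); the victim's order was broadcast first.
pending = [("victim", 10_000.0, 20), ("frontrunner", 100.0, 40)]
market, received = simulate(pending)

# After the victim's order has moved the price, the frontrunner sells back.
eth_back = market.sell(received["frontrunner"])
profit = eth_back - 100.0
```

Because the pricing rule is deterministic, the frontrunner could have run this exact computation before sending anything, which is what makes the attack so reliable.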
A Quick Intro to Decentralized Exchange
As you can see, this technique is quite trivial. Most cryptocurrency exchanges implement some mechanism to combat frontrunning. Centralized exchanges (Poloniex, Shapeshift, Bitfinex, etc.) control order matching on their own servers and do not rely on on-chain transactions for users to execute trades. Decentralized exchanges (Etherdelta, 0x, KyberNetwork, etc.) are forced to either employ some novel strategies to combat frontrunning or risk exposure to Bancor-esque frontrunning. For example, Etherdelta and 0x, two decentralized exchange solutions, chose to centralize the order matching component of trades by moving it off chain. So trades are processed serially in the order that they’re received, but Etherdelta and 0x aren’t completely decentralized. Orders are matched off chain and then enforced by an on-chain escrow contract. This method has two advantages. Orders can be adjusted or canceled seamlessly through the centralized orderbook without paying fees associated with on-chain work, and exchange rates adjust dynamically off chain, making 0x and Etherdelta resilient to the methods of frontrunning we’re concerned with. At the same time, 0x and Etherdelta intentionally introduce a single point of failure into their protocol design. A single point of failure exists when a system or network as a whole relies on the existence and good behavior of a single actor. If that single actor fails or is corrupted, the entire network suffers. By moving the orderbook off chain, traders are forced to trust whoever it is that’s running the service. Although this actor’s incentives are less misaligned relative to completely centralized exchanges, you can still make the case that this solution is suboptimal.
KyberNetwork, an alternative decentralized exchange solution, offers a unique protocol architecture that might maintain decentralization in a more pure sense. Every aspect of the Kyber exchange is handled on chain and thus gains a significant edge in terms of security and transparency from the trader’s perspective. The protocol employs a system of reserves. Reserves can be an individual, a group, a smart contract — basically, anyone that holds some fungible crypto-asset that’s exchangeable on the Ethereum blockchain. Reserves deposit their tokens into the Kyber smart contract, select the tokens they accept for exchange, and set exchange rates. Users can then request trades through the smart contract, get matched with a reserve, and settle their transaction on chain.
A quick example might help to show how it works: Suppose I’ve got 10,000 ether. I can deposit that 10,000 ether into a Kyber smart contract to register myself as a reserve. Let’s say I only want to accept MKR in exchange for my ether, so I set my exchange rate to 1,000 MKR for 1 ether. Now users can get my ether for MKR as long as there’s some left in the reserve. As the reserve operator, I can update the exchange rate in MKR for ether as the price of ether fluctuates. To do so, I just update the exchange rate parameter on the Kyber smart contract. I can even charge a premium relative to the market exchange rate for my on-chain services. I make a profit, users can securely exchange their digital assets, everyone’s happy, right?
Before I go on, I’d like to put forth a short disclaimer. I’m not writing this article to glorify or denounce decentralized exchanges. As far as I’m concerned, decentralized exchange is still an open problem. In the next section I’m going to elaborate on a unique style of frontrunning that uses everything we just learned. It’s important to note that this vulnerability is not specific to decentralized exchange protocols or any of the apps I discuss. It’s an inherent flaw in the current PoW style of Ethereum that will become less and less relevant as scalability improves. The following are just a few examples of what’s possible.
DoS frontrunning involves an entity, be it an individual, a group, or a smart contract, taking advantage of the block gas limit restriction. If we draw from analysis done in Vitalik’s recent gasprice market analysis, the decile variables contain some important informational cues for our frontrunner.
In Vitalik’s words, “The “deciles” variable contains 11 values, where the lowest is the lowest gasprice in each block, the next is the gasprice that only 10% of other transaction gasprices are lower than, and so forth; the last is the highest gasprice in each block.”
In order to execute the DoS attack, the attacker simply floods the transaction pool with high-fee transactions. The data suggests that under normal conditions, paying a gas price of approximately 100 GWei is enough to prevent almost all other transactions from being processed. Currently (February 9, 2018) the block gas limit is relatively constant at 8,000,000 +/- 10,000. (See https://ethstats.net/ for yourself.) What that means is you can shut most transactions out of the next block for approximately 0.8 ether (8,000,000 gas at 100 GWei each). This translates roughly to $800 at current exchange rates, so if people don’t respond to a DoS attack by paying higher gas prices, the cost to an attacker for holding the transaction pool hostage for one minute (five blocks at roughly 12 seconds each) is approximately $4,000. At first glance, this all might seem pointless and costly. In order to see how someone might use this for profit, let’s think about how we might use this to frontrun. We’ll look at how an attacker could use this method to hedge risk or magnify profits on two applications, Kyber exchanges and decentralized prediction markets.
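The cost estimate is easy to reproduce. The sketch below takes the figures quoted above (8,000,000 gas limit, 100 GWei, 12-second blocks) and assumes an exchange rate of $1,000 per ether, which is the rate implied by the "0.8 ether is roughly $800" conversion. It also assumes nobody outbids the attacker.

```python
GWEI_PER_ETHER = 10**9

def flood_cost_ether(block_gas_limit: int, gas_price_gwei: int, blocks: int = 1) -> float:
    """Ether needed to fill `blocks` full blocks at the given gas price."""
    return block_gas_limit * gas_price_gwei * blocks / GWEI_PER_ETHER

BLOCK_GAS_LIMIT = 8_000_000
GAS_PRICE_GWEI = 100
BLOCKS_PER_MINUTE = 60 // 12      # assumes 12-second blocks
USD_PER_ETHER = 1_000             # assumed exchange rate

per_block_ether = flood_cost_ether(BLOCK_GAS_LIMIT, GAS_PRICE_GWEI)
per_minute_ether = flood_cost_ether(BLOCK_GAS_LIMIT, GAS_PRICE_GWEI, BLOCKS_PER_MINUTE)
per_minute_usd = per_minute_ether * USD_PER_ETHER
```

This gives 0.8 ether per block and 4 ether (about $4,000) per minute. Both numbers scale linearly with the gas price the attacker has to pay, so any bidding war makes the attack more expensive quickly.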
Our DoS frontrunning attack puts Kyber reserves under fire, so let’s dig a bit deeper into how they work. Reserves can be broken down into two constituent parts, the reserve operator and the smart contract. The smart contract stores all relevant units of exchange while the reserve operator sends transactions to the smart contract, updating exchange rates periodically. Notably, users only interact with the smart contract when making trades. The reserve operator is an entity that works in the background to make sure the reserve remains a profitable endeavor for its depositors. The DoS attack takes advantage of the fact that the reserve operator must send a transaction in order to update the contract parameters. In the following example, we will look at a hypothetical scenario where a reserve offers to exchange Ether for Monero at a 1:2 rate. For the purpose of this example, let’s assume that I have advance knowledge that the real price of Monero is going to double for whatever reason. The DoS frontrun might be executed as follows:
- The attacker floods the transaction pool with useless computation. Their goal is to prevent any future blocks from containing the reserve operator’s exchange rate update.
- The price of Monero doubles suddenly.
- The reserve operator tries to update the exchange rates on the smart contract to reflect the price increase but their transaction is frozen in the txpool.
- The reserve operator realizes that an attacker is flooding the transaction pool and attempts to force their transaction through by paying a gas price greater than that of the attacker. The two then enter into a multiround game, where each actor’s best response is to pay a higher gas price than the other. Miners are pretty happy about this, but the reserve operator and anyone else who might be trying to send an Ethereum transaction at the time are not.
- At any point over the course of this game, the attacker can exchange their Ether for Monero at the stale 1:2 rate and hopefully turn a profit.
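A rough way to see when the steps above pay off is to compare the gain from trading at the stale rate with the cost of flooding. This sketch reads the 1:2 rate as 1 ether for 2 Monero (my assumption), reuses the 0.8 ether per block flooding cost from earlier, and picks a 10 ether trade and a 5-block attack purely for illustration. It ignores the gas price bidding war entirely.

```python
def dos_frontrun_net_gain(
    eth_traded: float,
    stale_xmr_per_eth: float,
    fair_xmr_per_eth: float,
    blocks_flooded: int,
    flood_cost_per_block_eth: float,
) -> float:
    """Net gain in ether from trading at a stale rate while flooding blocks."""
    xmr_received = eth_traded * stale_xmr_per_eth
    value_in_eth = xmr_received / fair_xmr_per_eth   # what the Monero is now worth
    return value_in_eth - eth_traded - blocks_flooded * flood_cost_per_block_eth

# Monero doubles in price, so the fair rate drops from 2 to 1 Monero per ether.
net_gain = dos_frontrun_net_gain(
    eth_traded=10.0,
    stale_xmr_per_eth=2.0,
    fair_xmr_per_eth=1.0,
    blocks_flooded=5,
    flood_cost_per_block_eth=0.8,
)
```

Under these assumptions the attacker nets about 6 ether. Shrink the trade or lengthen the attack and the result goes negative, which is why the size of the reserve and the speed of the operator’s response matter so much.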
The rational move for the reserve operator in the above example is to set the gas price high enough that the attacker’s optimal choice is to call off the attack. Still, I can see this attack playing out in favor of the attacker in some cases, for example when the reserve operator is slow to respond. Though there’s always the potential for the reserve operator to successfully fend off the attack, that’s only if we assume that they’re sufficiently competent and attentive. An interesting failsafe mechanism brought to my attention by someone on the Kyber team involves the smart contract denying exchange requests when the parameters have not been updated after a certain amount of time has elapsed.
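The failsafe idea can be sketched in a few lines. What follows is a hypothetical Python model, not Kyber’s actual contract: the class, the method names, and the choice to measure staleness in blocks are all my own assumptions.

```python
class ToyReserve:
    """Hypothetical reserve that refuses trades once its rate is too old."""

    def __init__(self, xmr_per_eth: float, max_age_blocks: int, current_block: int):
        self.xmr_per_eth = xmr_per_eth
        self.max_age_blocks = max_age_blocks
        self.last_update_block = current_block

    def update_rate(self, xmr_per_eth: float, current_block: int) -> None:
        # Called by the reserve operator; this is the transaction an attacker tries to delay.
        self.xmr_per_eth = xmr_per_eth
        self.last_update_block = current_block

    def trade(self, eth_in: float, current_block: int) -> float:
        age = current_block - self.last_update_block
        if age > self.max_age_blocks:
            raise RuntimeError("exchange rate is stale; trading is paused")
        return eth_in * self.xmr_per_eth

reserve = ToyReserve(xmr_per_eth=2.0, max_age_blocks=5, current_block=100)
fresh_trade = reserve.trade(10.0, current_block=103)   # rate is 3 blocks old: allowed

try:
    reserve.trade(10.0, current_block=110)             # rate is 10 blocks old: refused
    paused = False
except RuntimeError:
    paused = True
```

As far as I can tell, the tradeoff is that a sustained flood would then halt trading on the reserve instead of draining it, which is annoying for users but a much better outcome for depositors.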
In the case of prediction markets, the DoS attack plays out a little bit differently. Prediction markets allow people to create and trade shares in the outcomes of future events. To illustrate the DoS attack scenario, we will consider a simple market where participants can effectively bet on the winner of Super Bowl LII. There are only two possible outcomes, either the Eagles win and the Patriots lose (1,0) or the Eagles lose and the Patriots win (0,1). Market participants can trade shares up until the conclusion of the game, at which point the price of the winning share will converge to 1.
Let’s say at the start of the game, both outcomes are trading at fifty cents on the dollar. An attacker who plans on executing the DoS attack might hold off on buying any shares until they get some sort of indication of how the game might turn out. Let’s say that the Eagles end up taking a 14–0 lead at the end of the first quarter. Since this is a pretty significant lead, the outcome prices adjust to reflect people’s expectations for how the game will play out. A reasonable figure to represent this scenario might see Eagles shares trading at .90 and Patriots shares at .10. Suppose that the attacker has a hunch that the Patriots will come back, but they’re not certain of it. The attacker can buy the Patriots outcome for cheap now and risk losing their investment as a result of their uncertainty, or do what they do best and flood the transaction pool. Executing the DoS attack allows the attacker to lock in the .10 Patriots price for as long as they can afford to keep flooding.
The game theoretic consequences of this scenario are categorically different from the Kyber example. We can think of the shareholders’ best response as a uniform gas price adjustment that would render the frontrunner’s attack unprofitable. Logistically this might be difficult. Since we’re dealing with a network of actors, each of whom has their own implied valuation, expectations for the event, and level of competence, a uniform response becomes unlikely. Still, we can expect a certain subset of shareholders to realize the threat of the DoS attack and swiftly adjust their gas prices accordingly. The attacker, in this case, can afford to let a few transactions through. Their intention is to secure favorable odds relative to the real odds, so their optimal choice would not involve competing against transactions that offer an especially high gas price. By placing the attacker up against a network of actors, prediction markets might present more ideal conditions for executing a profitable frontrun relative to the Kyber case.
At this point you might be thinking “This all makes sense, but it would totally be too expensive to be worthwhile.” Well, yes and no. The lack of popular and user-friendly dapps means trying to pull this off today would be analogous to flushing wads of money down the drain every 12 seconds. At the same time, we saw earlier just how little flooding a block costs, so perhaps this attack might be relevant in the right circumstances.
That concludes part I.
Some of the topics to expect in part II: a deeper look into miner transaction ordering and discrimination (are all miners rational?), multi-round gas price games, approximating the cost of a DoS attack for a single block, and a rudimentary implementation in Solidity/web3.js.