WTF: What's That Function?
Parsing EVM traces & function signatures for our new Maker Schema
Our most technical issue yet. As part of our in-progress research on Maker Vault Elasticity, we’re taking a step back and re-architecting our MakerDAO Schema. Thanks to our amazing data engineering team, I’ve learned a lot about the nitty gritty of the Ethereum Virtual Machine.
Re-architecting the @MakerDAO schema.
vault numbers collateral_type (ilk) ez_dai_mints deposits, withdraws, and repayments.
@flipsidecrypto Maker data (& research😉) soon.
If you have MKR, delegate some to flipsidecrypto.eth you won't regret it.
— charliemarketplace.eth (🧊,🧊) - truefreeze.xyz (@charliemktplace)
Feb 3, 2023
In this issue we’ll cover key tools we use at Flipside to understand what is actually happening on-chain (as opposed to trusting smart contract events to tell us the whole truth).
Events are Shortcuts
Events in smart contracts are really clean and useful:
[Alice] sent [5] [WETH] to [Bob]
event Transfer(address indexed _from, address indexed _to, uint256 _value)
The ERC20 token standard includes emitting transfer events. For our Alice/Bob example: WETH.Transfer(Alice, Bob, 5) is the code translation.
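To make the event concrete, here is a minimal Python sketch of how a raw Transfer log could be decoded by hand. The addresses are hypothetical stand-ins for Alice and Bob, and the amount assumes WETH's 18 decimals; the topic hash is the standard keccak-256 of Transfer(address,address,uint256).

```python
# Standard topic0 for the ERC20 event Transfer(address,address,uint256).
TRANSFER_TOPIC = "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"


def decode_transfer(topics, data):
    """Decode a raw ERC20 Transfer log into (sender, recipient, value)."""
    if topics[0].lower() != TRANSFER_TOPIC:
        raise ValueError("not a Transfer event")
    # Indexed addresses are left-padded to 32 bytes; keep the last 20 bytes.
    sender = "0x" + topics[1][-40:]
    recipient = "0x" + topics[2][-40:]
    # The non-indexed amount lives in the data field as a 32-byte integer.
    value = int(data, 16)
    return sender, recipient, value


# Hypothetical addresses standing in for Alice and Bob.
ALICE = "0x" + "a1" * 20
BOB = "0x" + "b0" * 20

log_topics = [
    TRANSFER_TOPIC,
    "0x" + ALICE[2:].rjust(64, "0"),
    "0x" + BOB[2:].rjust(64, "0"),
]
# 5 WETH expressed in wei, padded to 32 bytes like real log data.
log_data = "0x" + format(5 * 10**18, "064x")

sender, recipient, value = decode_transfer(log_topics, log_data)
print(sender, recipient, value / 10**18)
```

The point of the sketch is that an event only contains what the contract chose to emit: two addresses and a number, nothing about the internal calls that led there.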
But not everything is as simple as from-amount-to. The more advanced the protocol, the more complexity is required to safely support that protocol and make it possible for it to grow and change long-term.
MakerDAO runs the protocol behind the DAI stablecoin. It allows a variety of collateral types to back DAI and ensures that a free market of self-interested individuals keeps DAI close to $1 of value (i.e., redeemable for $1 of ETH or $1 of WBTC or $1 of USDC) without having to trust anyone’s goodwill.
To make this work across its variety of auctions and collateral types, Maker uses Proxy and Manager contracts to isolate collateral types and change records in the Vat (the big database of assets and liabilities for DAI).
Separate from the purposefully unique vocabulary (frob, flux, etc.), different contracts in the workflow live in different contexts so the same function parameters may not make sense to a human at first glance.
For example, when someone deposits ETH to mint DAI, they are creating a new Collateralized Debt Position (CDP), also called a Vault. Here is the Maker Vault Tracker entry for Vault 16,123, which is of the ETH-B collateral type.
In the new Maker Schema, we want to ensure that anytime a Maker Vault is opened we know (1) Vault # and (2) Collateral Type.
Traces are Trustless
The Vault Tracker links us to TokenFlow’s ETHtx info (here is the Open transaction for the Vault we mentioned) which tells us everything we could want to know about a transaction. Not just the events (what the contract wants us to know) but also the traces - all the nitty gritty internal computation that makes up the EVM’s gas costs.
This looks crazy, right? It can feel super overwhelming. Part of that is because this transaction itself is complicated: it’s a Vault Open, an ETH Deposit, and a DAI Generate all at once!
So let’s take it slow. We want to know the Vault # and the Collateral Type when the Vault opens. We can skim around to find what we already know: the Vault # is 16,123. Because we know Maker’s architecture uses the CDP Manager and DSProxy, we can find the exact trace we want:
208022 gas used by the CDP Manager to open ETH-B Vault: 16123
Inside the CDP Manager, the Open() function was called. The Collateral Type was ETH-B. The Vault Number is the output: 16,123. Here, the usr is the DSProxy. That’s confusing, because a human would probably expect to see the address of the person who created this CDP as the Vault owner. (Context is everything!)
That 208022 gas used will come in handy as we get to the next step: Generalizing our process to find Vault Numbers and Collateral types when we don’t know the answer in advance.
Curation is Creation
Ultimately, at Flipside we want to give analysts a curated experience that lets them analyze without having to get this deep into the weeds of the EVM. To do that, we generalize queries into models that output nice clean tables analysts can trust by dealing with protocol level nuance ahead of time. We have schemas like Uniswap, Maker, and other protocols already curated for analysts across 10+ blockchains!
But it’s important analysts understand that this process involves tradeoffs (e.g., who owns a Vault? Is ‘owner’ even the right way to think about Maker Vaults?).
Here, we take what we got from the Maker Vault and begin the generalization process. First, we specify exactly what we want, going bottom-up:
In Block 11,117,766 in the TX_HASH with the OPEN event, there was a trace with 208,022 GAS USED.
This gets us straight to the signal and avoids all the other noise that doesn’t have exactly what we want: Vault # & Collateral Type.
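As a rough illustration of the Gas Used trick, here is a Python sketch that picks one trace out of a transaction by its exact gas used. The record layout, field names, and transaction hashes are hypothetical placeholders, not Flipside’s actual table schema. Gas used is also not guaranteed to be unique, so treat the match as a heuristic to confirm by eye.

```python
def find_trace(traces, tx_hash, gas_used):
    """Return the traces of one transaction whose gas_used matches exactly."""
    return [
        t for t in traces
        if t["tx_hash"] == tx_hash and t["gas_used"] == gas_used
    ]


# Hypothetical trace records; hashes and most gas values are placeholders.
traces = [
    {"tx_hash": "0xabc", "trace_index": 0, "gas_used": 731004},
    {"tx_hash": "0xabc", "trace_index": 7, "gas_used": 208022},
    {"tx_hash": "0xdef", "trace_index": 2, "gas_used": 208022},
]

matches = find_trace(traces, "0xabc", 208022)
print(matches)
```

Filtering on the transaction first and the gas value second keeps an accidental match in some other transaction from sneaking in.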
When a function is called in the EVM, the call carries input data and returns output data. The first 4 bytes of the input (10 characters of hex, counting the 0x prefix) are the function selector, derived from the function signature (hence: WTF, What’s That Function). Because functions can have different numbers of parameters, we break the rest of the input and the output into segments of 32 bytes (64 hex characters) each. Often the values aren’t that long, so expect padding (lots of 000000s).
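Here is a small Python sketch of that segmentation. In hex-string terms, the function selector is the first 10 characters (0x plus 8 hex digits) and every following segment is 64 hex characters. The selector and address in the sample are hypothetical placeholders; only the ETH-B text comes from our example.

```python
def segment_input(input_hex):
    """Split call data into (function selector, list of 64-hex-char words)."""
    body = input_hex[2:] if input_hex.startswith("0x") else input_hex
    selector = "0x" + body[:8]  # 4 bytes = 8 hex characters
    args = body[8:]
    words = [args[i:i + 64] for i in range(0, len(args), 64)]
    return selector, words


# Hypothetical call data: a placeholder selector followed by two arguments.
ilk_word = "ETH-B".encode("ascii").hex().ljust(64, "0")  # text is right-padded
usr_word = ("ab" * 20).rjust(64, "0")                    # addresses are left-padded

selector, words = segment_input("0x12345678" + ilk_word + usr_word)
print(selector)
for word in words:
    print(word)
```

Note the two padding directions: text-like values carry their zeros at the end, while addresses and integers carry them at the front.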
The 0x6090dec5 function selector seems to have 2 parameters (2 quoted things in the segmented_input JSON). We can go to the Ethereum Signature Database and find out what this function is.
It’s the Open() function from the MakerDAO CDP Manager contract we saw before! And the two parameters are bytes32 and address. Going back to the traces above, we know that the first argument, ilk, is our collateral type and the 2nd argument is the DSProxy (hm… again, not the human owner…). We also know the output is the Vault Number.
From Example to Model
These HEX values can be decoded as (1) a string (the ilk, ETH-B) and (2) an integer (the Vault Number).
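A minimal Python sketch of those two decodings, using the values from our example Vault. The helper names are made up for illustration; the hex words are simply ETH-B and 16123 written out as 32-byte segments.

```python
def hex_to_text(word_hex):
    """Decode a right-padded 32-byte word as text, dropping trailing zero bytes."""
    return bytes.fromhex(word_hex).rstrip(b"\x00").decode("utf-8")


def hex_to_int(word_hex):
    """Decode a 32-byte word as an unsigned integer."""
    return int(word_hex, 16)


# "ETH-B" is the bytes 45 54 48 2d 42, padded on the right to 32 bytes.
ilk_word = "4554482d42" + "0" * 54
# 16123 is 0x3efb, padded on the left to 32 bytes.
vault_word = "0" * 60 + "3efb"

print(hex_to_text(ilk_word))   # ETH-B
print(hex_to_int(vault_word))  # 16123
```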
Now that we’ve confirmed the fundamental structure of how the MakerDAO CDP Manager opens vaults:
CDPManager.open(collateral_type, ds_proxy) → vault_number
With that, we can search all traces in Ethereum’s history, find every instance where the CDPManager calls its open function, and decode our Vault Number & Collateral Type. We can use the available try_hex_decode_string() and Flipside’s custom udf_hex_to_int() to separate what we know is text (collateral type) from what we know is a number (the Vault Number).
Running the code on the full Ethereum history takes time, but we only need to do it once and then re-use the process as we add new blocks. That gives analysts a clean table to work with, with all the trace headaches abstracted away.
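The production model is SQL, but the logic can be sketched in a few lines of Python. Everything about the data here is an assumption made for illustration: the trace field names, the placeholder addresses, and the sample rows. The selector value is the one we looked up for open(bytes32,address) and should be confirmed against the signature database before relying on it.

```python
# Assumed selector for open(bytes32,address); verify before relying on it.
OPEN_SELECTOR = "0x6090dec5"


def decode_vault_opens(traces, manager_address, open_selector=OPEN_SELECTOR):
    """Keep only calls to the manager's open function and decode each one."""
    rows = []
    for t in traces:
        if t["to_address"].lower() != manager_address.lower():
            continue
        if not t["input"].lower().startswith(open_selector):
            continue
        output = t["output"][2:]
        if not output:  # a failed call returns nothing to decode
            continue
        args = t["input"][10:]  # skip "0x" plus the 8 selector characters
        ilk_hex, usr_hex = args[:64], args[64:128]
        rows.append({
            "tx_hash": t["tx_hash"],
            "collateral_type": bytes.fromhex(ilk_hex).rstrip(b"\x00").decode("utf-8", errors="replace"),
            "ds_proxy": "0x" + usr_hex[-40:],
            "vault_number": int(output, 16),
        })
    return rows


# Hypothetical placeholder addresses and trace rows.
MANAGER = "0x" + "11" * 20
PROXY = "0x" + "22" * 20
open_input = OPEN_SELECTOR + "4554482d42".ljust(64, "0") + PROXY[2:].rjust(64, "0")

traces = [
    {"tx_hash": "0xabc", "to_address": MANAGER, "input": open_input,
     "output": "0x" + format(16123, "064x")},
    {"tx_hash": "0xabc", "to_address": MANAGER, "input": "0x12345678", "output": "0x"},
    {"tx_hash": "0xdef", "to_address": PROXY, "input": open_input, "output": "0x"},
]

vault_opens = decode_vault_opens(traces, MANAGER)
print(vault_opens)
```

Filtering on both the contract address and the selector matters, since any other contract is free to expose a function with the same signature.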
The real model will clean out the empty data.
The EVM is a beautiful but complicated machine. Tools like Etherscan, ETHTx Info, 4bytes and protocol specific tools like Maker Vault Tracker allow us to build on top of past work and provide clean, curated data for analysts.
This same workflow (starting with a known example, checking traces, querying for exact traces using the Gas Used trick, parsing the inputs & outputs, and cross-referencing the tools to get function signatures) is one of the ways we go deep so analysts don’t have to.
Keep a lookout for our Maker Elasticity Analysis, and if you like code, here are some links to Vault 16,123 queries that will form the base of our new Maker schema:
- Identify Collateral Type (ilk) ETH-B
- Deposit of 0.4 ETH
- Mint of 120 DAI
- Repay of 125 DAI
- Withdraw of 0.33 ETH