Ir ao contido principal

Reverse Engineering a Contract

evmopcodes
Advanced
Ori Pomerantz
30 de decembro de 2021
32 minute read minute read

Introduction

There are no secrets on the blockchain, everything that happens is consistent, verifiable, and publicly available. Ideally, contracts should have their source code published and verified on Etherscan(opens in a new tab). However, that is not always the case(opens in a new tab). In this article you learn how to reverse engineer contracts by looking at a contract without source code, 0x2510c039cc3b061d79e564b38836da87e31b342f(opens in a new tab).

There are reverse compilers, but they don't always produce usable results(opens in a new tab). In this article you learn how to manually reverse engineer and understand a contract from the opcodes(opens in a new tab), as well as how to interpret the results of a decompiler.

To be able to understand this article you should already know the basics of the EVM, and be at least somewhat familiar with EVM assembler. You can read about these topics here(opens in a new tab).

Prepare the Executable Code

You can get the opcodes by going to Etherscan for the contract, clicking the Contract tab and then Switch to Opcodes View. You get a view that is one opcode per line.

Opcode View from Etherscan

To be able to understand jumps, however, you need to know where in the code each opcode is located. To do that, one way is to open a Google Spreadsheet and paste the opcodes in column C. You can skip the following steps by making a copy of this already prepared spreadsheet(opens in a new tab).

The next step is to get the correct code locations so we'll be able to understand jumps. We'll put the opcode size in column B, and the location (in hexadecimal) in column A. Type this function in cell B1 and then copy and paste it for the rest of column B, until the end of the code. After you do this you can hide column B.

1=1+IF(REGEXMATCH(C1,"PUSH"),REGEXEXTRACT(C1,"PUSH(\d+)"),0)

First this function adds one byte for the opcode itself, and then looks for PUSH. Push opcodes are special because they need to have additional bytes for the value being pushed. If the opcode is a PUSH, we extract the number of bytes and add that.

In A1 put the first offset, zero. Then, in A2, put this function and again copy and paste it for the rest of column A:

1=dec2hex(hex2dec(A1)+B1)

We need this function to give us the hexadecimal value because the values that are pushed prior to jumps (JUMP and JUMPI) are given to us in hexadecimal.

The Entry Point (0x00)

Contracts are always executed from the first byte. This is the initial part of the code:

OffsetOpcodeStack (after the opcode)
0PUSH1 0x800x80
2PUSH1 0x400x40, 0x80
4MSTOREEmpty
5PUSH1 0x040x04
7CALLDATASIZECALLDATASIZE 0x04
8LTCALLDATASIZE<4
9PUSH2 0x005e0x5E CALLDATASIZE<4
CJUMPIEmpty

This code does two things:

  1. Write 0x80 as a 32 byte value to memory locations 0x40-0x5F (0x80 is stored in 0x5F, and 0x40-0x5E are all zeroes).
  2. Read the calldata size. Normally the call data for an Ethereum contract follows the ABI (application binary interface)(opens in a new tab), which at a minimum requires four bytes for the function selector. If the call data size is less than four, jump to 0x5E.

Flowchart for this portion

The Handler at 0x5E (for non-ABI call data)

OffsetOpcode
5EJUMPDEST
5FCALLDATASIZE
60PUSH2 0x007c
63JUMPI

This snippet starts with a JUMPDEST. EVM (Ethereum virtual machine) programs throw an exception if you jump to an opcode that isn't JUMPDEST. Then it looks at the CALLDATASIZE, and if it is "true" (that is, not zero) jumps to 0x7C. We'll get to that below.

OffsetOpcodeStack (after opcode)
64CALLVALUE provided by the call. Called msg.value in Solidity
65PUSH1 0x066 CALLVALUE
67PUSH1 0x000 6 CALLVALUE
69DUP3CALLVALUE 0 6 CALLVALUE
6ADUP36 CALLVALUE 0 6 CALLVALUE
6BSLOADStorage[6] CALLVALUE 0 6 CALLVALUE

So when there is no call data we read the value of Storage[6]. We don't know what this value is yet, but we can look for transactions that the contract received with no call data. Transactions which just transfer ETH without any call data (and therefore no method) have in Etherscan the method Transfer. In fact, the very first transaction the contract received(opens in a new tab) is a transfer.

If we look in that transaction and click Click to see More, we see that the call data, called input data, is indeed empty (0x). Notice also that the value is 1.559 ETH, that will be relevant later.

The call data is empty

Next, click the State tab and expand the contract we're reverse engineering (0x2510...). You can see that Storage[6] did change during the transaction, and if you change Hex to Number, you see it became 1,559,000,000,000,000,000, the value transferred in wei (I added the commas for clarity), corresponding to the next contract value.

The change in Storage[6]

If we look in the state changes caused by other Transfer transactions from the same period(opens in a new tab) we see that Storage[6] tracked the value of the contract for a while. For now we'll call it Value*. The asterisk (*) reminds us that we don't know what this variable does yet, but it can't be just to track the contract value because there's no need to use storage, which is very expensive, when you can get your accounts balance using ADDRESS BALANCE. The first opcode pushes the contract's own address. The second one reads the address at the top of the stack and replaces it with the balance of that address.

OffsetOpcodeStack
6CPUSH2 0x00750x75 Value* CALLVALUE 0 6 CALLVALUE
6FSWAP2CALLVALUE Value* 0x75 0 6 CALLVALUE
70SWAP1Value* CALLVALUE 0x75 0 6 CALLVALUE
71PUSH2 0x01a70x01A7 Value* CALLVALUE 0x75 0 6 CALLVALUE
74JUMP

We'll continue to trace this code at the jump destination.

OffsetOpcodeStack
1A7JUMPDESTValue* CALLVALUE 0x75 0 6 CALLVALUE
1A8PUSH1 0x000x00 Value* CALLVALUE 0x75 0 6 CALLVALUE
1AADUP3CALLVALUE 0x00 Value* CALLVALUE 0x75 0 6 CALLVALUE
1ABNOT2^256-CALLVALUE-1 0x00 Value* CALLVALUE 0x75 0 6 CALLVALUE

The NOT is bitwise, so it reverses the value of every bit in the call value.

OffsetOpcodeStack
1ACDUP3Value* 2^256-CALLVALUE-1 0x00 Value* CALLVALUE 0x75 0 6 CALLVALUE
1ADGTValue*>2^256-CALLVALUE-1 0x00 Value* CALLVALUE 0x75 0 6 CALLVALUE
1AEISZEROValue*<=2^256-CALLVALUE-1 0x00 Value* CALLVALUE 0x75 0 6 CALLVALUE
1AFPUSH2 0x01df0x01DF Value*<=2^256-CALLVALUE-1 0x00 Value* CALLVALUE 0x75 0 6 CALLVALUE
1B2JUMPI

We jump if Value* is smaller than 2^256-CALLVALUE-1 or equal to it. This looks like logic to prevent overflow. And indeed, we see that after a few nonsense operations (writing to memory is about to get deleted, for example) at offset 0x01DE the contract reverts if the overflow is detected, which is normal behavior.

Note that such an overflow is extremely unlikely, because it would require the call value plus Value* to be comparable to 2^256 wei, about 10^59 ETH. The total ETH supply, at writing, is less than two hundred million(opens in a new tab).

OffsetOpcodeStack
1DFJUMPDEST0x00 Value* CALLVALUE 0x75 0 6 CALLVALUE
1E0POPValue* CALLVALUE 0x75 0 6 CALLVALUE
1E1ADDValue*+CALLVALUE 0x75 0 6 CALLVALUE
1E2SWAP10x75 Value*+CALLVALUE 0 6 CALLVALUE
1E3JUMP

If we got here, get Value* + CALLVALUE and jump to offset 0x75.

OffsetOpcodeStack
75JUMPDESTValue*+CALLVALUE 0 6 CALLVALUE
76SWAP10 Value*+CALLVALUE 6 CALLVALUE
77SWAP26 Value*+CALLVALUE 0 CALLVALUE
78SSTORE0 CALLVALUE

If we get here (which requires the call data to be empty) we add to Value* the call value. This is consistent with what we say Transfer transactions do.

OffsetOpcode
79POP
7APOP
7BSTOP

Finally, clear the stack (which isn't necessary) and signal the successful end of the transaction.

To sum it all up, here's a flowchart for the initial code.

Entry point flowchart

The Handler at 0x7C

I purposely did not put in the heading what this handler does. The point isn't to teach you how this specific contract works, but how to reverse engineer contracts. You will learn what it does the same way I did, by following the code.

We get here from several places:

  • If there is call data of 1, 2, or 3 bytes (from offset 0x63)
  • If the method signature is unknown (from offsets 0x42 and 0x5D)
OffsetOpcodeStack
7CJUMPDEST
7DPUSH1 0x000x00
7FPUSH2 0x009d0x9D 0x00
82PUSH1 0x030x03 0x9D 0x00
84SLOADStorage[3] 0x9D 0x00

This is another storage cell, one that I couldn't find in any transactions so it's harder to know what it means. The code below will make it clearer.

OffsetOpcodeStack
85PUSH20 0xffffffffffffffffffffffffffffffffffffffff0xff....ff Storage[3] 0x9D 0x00
9AANDStorage[3]-as-address 0x9D 0x00

These opcodes truncate the value we read from Storage[3] to 160 bits, the length of an Ethereum address.

OffsetOpcodeStack
9BSWAP10x9D Storage[3]-as-address 0x00
9CJUMPStorage[3]-as-address 0x00

This jump is superfluous, since we're going to the next opcode. This code isn't nearly as gas-efficient as it could be.

OffsetOpcodeStack
9DJUMPDESTStorage[3]-as-address 0x00
9ESWAP10x00 Storage[3]-as-address
9FPOPStorage[3]-as-address
A0PUSH1 0x400x40 Storage[3]-as-address
A2MLOADMem[0x40] Storage[3]-as-address

In the very beginning of the code we set Mem[0x40] to 0x80. If we look for 0x40 later, we see that we don't change it - so we can assume it is 0x80.

OffsetOpcodeStack
A3CALLDATASIZECALLDATASIZE 0x80 Storage[3]-as-address
A4PUSH1 0x000x00 CALLDATASIZE 0x80 Storage[3]-as-address
A6DUP30x80 0x00 CALLDATASIZE 0x80 Storage[3]-as-address
A7CALLDATACOPY0x80 Storage[3]-as-address

Copy all the call data to memory, starting at 0x80.

OffsetOpcodeStack
A8PUSH1 0x000x00 0x80 Storage[3]-as-address
AADUP10x00 0x00 0x80 Storage[3]-as-address
ABCALLDATASIZECALLDATASIZE 0x00 0x00 0x80 Storage[3]-as-address
ACDUP40x80 CALLDATASIZE 0x00 0x00 0x80 Storage[3]-as-address
ADDUP6Storage[3]-as-address 0x80 CALLDATASIZE 0x00 0x00 0x80 Storage[3]-as-address
AEGASGAS Storage[3]-as-address 0x80 CALLDATASIZE 0x00 0x00 0x80 Storage[3]-as-address
AFDELEGATE_CALL

Now things are a lot clearer. This contract can act as a proxy(opens in a new tab), calling the address in Storage[3] to do the real work. DELEGATE_CALL calls a separate contract, but stays in the same storage. This means that the delegated contract, the one we are a proxy for, accesses the same storage space. The parameters for the call are:

  • Gas: All the remaining gas
  • Called address: Storage[3]-as-address
  • Call data: The CALLDATASIZE bytes starting at 0x80, which is where we put the original call data
  • Return data: None (0x00 - 0x00) We'll get the return data by other means (see below)
OffsetOpcodeStack
B0RETURNDATASIZERETURNDATASIZE (((call success/failure))) 0x80 Storage[3]-as-address
B1DUP1RETURNDATASIZE RETURNDATASIZE (((call success/failure))) 0x80 Storage[3]-as-address
B2PUSH1 0x000x00 RETURNDATASIZE RETURNDATASIZE (((call success/failure))) 0x80 Storage[3]-as-address
B4DUP50x80 0x00 RETURNDATASIZE RETURNDATASIZE (((call success/failure))) 0x80 Storage[3]-as-address
B5RETURNDATACOPYRETURNDATASIZE (((call success/failure))) 0x80 Storage[3]-as-address

Here we copy all the return data to the memory buffer starting at 0x80.

OffsetOpcodeStack
B6DUP2(((call success/failure))) RETURNDATASIZE (((call success/failure))) 0x80 Storage[3]-as-address
B7DUP1(((call success/failure))) (((call success/failure))) RETURNDATASIZE (((call success/failure))) 0x80 Storage[3]-as-address
B8ISZERO(((did the call fail))) (((call success/failure))) RETURNDATASIZE (((call success/failure))) 0x80 Storage[3]-as-address
B9PUSH2 0x00c00xC0 (((did the call fail))) (((call success/failure))) RETURNDATASIZE (((call success/failure))) 0x80 Storage[3]-as-address
BCJUMPI(((call success/failure))) RETURNDATASIZE (((call success/failure))) 0x80 Storage[3]-as-address
BDDUP2RETURNDATASIZE (((call success/failure))) RETURNDATASIZE (((call success/failure))) 0x80 Storage[3]-as-address
BEDUP50x80 RETURNDATASIZE (((call success/failure))) RETURNDATASIZE (((call success/failure))) 0x80 Storage[3]-as-address
BFRETURN

So after the call we copy the return data to the buffer 0x80 - 0x80+RETURNDATASIZE, and if the call is successful we then RETURN with exactly that buffer.

DELEGATECALL Failed

If we get here, to 0xC0, it means that the contract we called reverted. As we are just a proxy for that contract, we want to return the same data and also revert.

OffsetOpcodeStack
C0JUMPDEST(((call success/failure))) RETURNDATASIZE (((call success/failure))) 0x80 Storage[3]-as-address
C1DUP2RETURNDATASIZE (((call success/failure))) RETURNDATASIZE (((call success/failure))) 0x80 Storage[3]-as-address
C2DUP50x80 RETURNDATASIZE (((call success/failure))) RETURNDATASIZE (((call success/failure))) 0x80 Storage[3]-as-address
C3REVERT

So we REVERT with the same buffer we used for RETURN earlier: 0x80 - 0x80+RETURNDATASIZE

Call to proxy flowchart

ABI calls

If the call data size is four bytes or more this might be a valid ABI call.

OffsetOpcodeStack
DPUSH1 0x000x00
FCALLDATALOAD(((First word (256 bits) of the call data)))
10PUSH1 0xe00xE0 (((First word (256 bits) of the call data)))
12SHR(((first 32 bits (4 bytes) of the call data)))

Etherscan tells us that 1C is an unknown opcode, because it was added after Etherscan wrote this feature(opens in a new tab) and they haven't updated it. An up to date opcode table(opens in a new tab) shows us that this is shift right

OffsetOpcodeStack
13DUP1(((first 32 bits (4 bytes) of the call data))) (((first 32 bits (4 bytes) of the call data)))
14PUSH4 0x3cd8045e0x3CD8045E (((first 32 bits (4 bytes) of the call data))) (((first 32 bits (4 bytes) of the call data)))
19GT0x3CD8045E>first-32-bits-of-the-call-data (((first 32 bits (4 bytes) of the call data)))
1APUSH2 0x00430x43 0x3CD8045E>first-32-bits-of-the-call-data (((first 32 bits (4 bytes) of the call data)))
1DJUMPI(((first 32 bits (4 bytes) of the call data)))

By dividing the method signature matching tests in two like this saves half the tests on average. The code that immediately follows this and the code in 0x43 follow the same pattern: DUP1 the first 32 bits of the call data, PUSH4 (((method signature>, run EQ to check for equality, and then JUMPI if the method signature matches. Here are the method signatures, their addresses, and if known the corresponding method definition(opens in a new tab):

MethodMethod signatureOffset to jump into
splitter()(opens in a new tab)0x3cd8045e0x0103
???0x81e580d30x0138
currentWindow()(opens in a new tab)0xba0bafb40x0158
???0x1f1358230x00C4
merkleRoot()(opens in a new tab)0x2eb4a7ab0x00ED

If no match is found, the code jumps to the proxy handler at 0x7C, in the hope that the contract to which we are a proxy has a match.

ABI calls flowchart

splitter()

OffsetOpcodeStack
103JUMPDEST
104CALLVALUECALLVALUE
105DUP1CALLVALUE CALLVALUE
106ISZEROCALLVALUE==0 CALLVALUE
107PUSH2 0x010f0x010F CALLVALUE==0 CALLVALUE
10AJUMPICALLVALUE
10BPUSH1 0x000x00 CALLVALUE
10DDUP10x00 0x00 CALLVALUE
10EREVERT

The first thing this function does is check that the call did not send any ETH. This function is not payable(opens in a new tab). If somebody sent us ETH that must be a mistake and we want to REVERT to avoid having that ETH where they can't get it back.

OffsetOpcodeStack
10FJUMPDEST
110POP
111PUSH1 0x030x03
113SLOAD(((Storage[3] a.k.a the contract for which we are a proxy)))
114PUSH1 0x400x40 (((Storage[3] a.k.a the contract for which we are a proxy)))
116MLOAD0x80 (((Storage[3] a.k.a the contract for which we are a proxy)))
117PUSH20 0xffffffffffffffffffffffffffffffffffffffff0xFF...FF 0x80 (((Storage[3] a.k.a the contract for which we are a proxy)))
12CSWAP10x80 0xFF...FF (((Storage[3] a.k.a the contract for which we are a proxy)))
12DSWAP2(((Storage[3] a.k.a the contract for which we are a proxy))) 0xFF...FF 0x80
12EANDProxyAddr 0x80
12FDUP20x80 ProxyAddr 0x80
130MSTORE0x80

And 0x80 now contains the proxy address

OffsetOpcodeStack
131PUSH1 0x200x20 0x80
133ADD0xA0
134PUSH2 0x00e40xE4 0xA0
137JUMP0xA0

The E4 Code

This is the first time we see these lines, but they are shared with other methods (see below). So we'll call the value in the stack X, and just remember that in splitter() the value of this X is 0xA0.

OffsetOpcodeStack
E4JUMPDESTX
E5PUSH1 0x400x40 X
E7MLOAD0x80 X
E8DUP10x80 0x80 X
E9SWAP2X 0x80 0x80
EASUBX-0x80 0x80
EBSWAP10x80 X-0x80
ECRETURN

So this code receives a memory pointer in the stack (X), and causes the contract to RETURN with a buffer that is 0x80 - X.

In the case of splitter(), this returns the address for which we are a proxy. RETURN returns the buffer in 0x80-0x9F, which is where we wrote this data (offset 0x130 above).

currentWindow()

The code in offsets 0x158-0x163 is identical to what we saw in 0x103-0x10E in splitter() (other than the JUMPI destination), so we know currentWindow() is also not payable.

OffsetOpcodeStack
164JUMPDEST
165POP
166PUSH2 0x00da0xDA
169PUSH1 0x010x01 0xDA
16BSLOADStorage[1] 0xDA
16CDUP20xDA Storage[1] 0xDA
16DJUMPStorage[1] 0xDA

The DA code

This code is also shared with other methods. So we'll call the value in the stack Y, and just remember that in currentWindow() the value of this Y is Storage[1].

OffsetOpcodeStack
DAJUMPDESTY 0xDA
DBPUSH1 0x400x40 Y 0xDA
DDMLOAD0x80 Y 0xDA
DESWAP1Y 0x80 0xDA
DFDUP20x80 Y 0x80 0xDA
E0MSTORE0x80 0xDA

Write Y to 0x80-0x9F.

OffsetOpcodeStack
E1PUSH1 0x200x20 0x80 0xDA
E3ADD0xA0 0xDA

And the rest is already explained above. So jumps to 0xDA write the stack top (Y) to 0x80-0x9F, and return that value. In the case of currentWindow(), it returns Storage[1].

merkleRoot()

The code in offsets 0xED-0xF8 is identical to what we saw in 0x103-0x10E in splitter() (other than the JUMPI destination), so we know merkleRoot() is also not payable.

OffsetOpcodeStack
F9JUMPDEST
FAPOP
FBPUSH2 0x00da0xDA
FEPUSH1 0x000x00 0xDA
100SLOADStorage[0] 0xDA
101DUP20xDA Storage[0] 0xDA
102JUMPStorage[0] 0xDA

What happens after the jump we already figured out. So merkleRoot() returns Storage[0].

0x81e580d3

The code in offsets 0x138-0x143 is identical to what we saw in 0x103-0x10E in splitter() (other than the JUMPI destination), so we know this function is also not payable.

OffsetOpcodeStack
144JUMPDEST
145POP
146PUSH2 0x00da0xDA
149PUSH2 0x01530x0153 0xDA
14CCALLDATASIZECALLDATASIZE 0x0153 0xDA
14DPUSH1 0x040x04 CALLDATASIZE 0x0153 0xDA
14FPUSH2 0x018f0x018F 0x04 CALLDATASIZE 0x0153 0xDA
152JUMP0x04 CALLDATASIZE 0x0153 0xDA
18FJUMPDEST0x04 CALLDATASIZE 0x0153 0xDA
190PUSH1 0x000x00 0x04 CALLDATASIZE 0x0153 0xDA
192PUSH1 0x200x20 0x00 0x04 CALLDATASIZE 0x0153 0xDA
194DUP30x04 0x20 0x00 0x04 CALLDATASIZE 0x0153 0xDA
195DUP5CALLDATASIZE 0x04 0x20 0x00 0x04 CALLDATASIZE 0x0153 0xDA
196SUBCALLDATASIZE-4 0x20 0x00 0x04 CALLDATASIZE 0x0153 0xDA
197SLTCALLDATASIZE-4<32 0x00 0x04 CALLDATASIZE 0x0153 0xDA
198ISZEROCALLDATASIZE-4>=32 0x00 0x04 CALLDATASIZE 0x0153 0xDA
199PUSH2 0x01a00x01A0 CALLDATASIZE-4>=32 0x00 0x04 CALLDATASIZE 0x0153 0xDA
19CJUMPI0x00 0x04 CALLDATASIZE 0x0153 0xDA

It looks like this function takes at least 32 bytes (one word) of call data.

OffsetOpcodeStack
19DDUP10x00 0x00 0x04 CALLDATASIZE 0x0153 0xDA
19EDUP20x00 0x00 0x00 0x04 CALLDATASIZE 0x0153 0xDA
19FREVERT

If it doesn't get the call data the transaction is reverted without any return data.

Let's see what happens if the function does get the call data it needs.

OffsetOpcodeStack
1A0JUMPDEST0x00 0x04 CALLDATASIZE 0x0153 0xDA
1A1POP0x04 CALLDATASIZE 0x0153 0xDA
1A2CALLDATALOADcalldataload(4) CALLDATASIZE 0x0153 0xDA

calldataload(4) is the first word of the call data after the method signature

OffsetOpcodeStack
1A3SWAP20x0153 CALLDATASIZE calldataload(4) 0xDA
1A4SWAP1CALLDATASIZE 0x0153 calldataload(4) 0xDA
1A5POP0x0153 calldataload(4) 0xDA
1A6JUMPcalldataload(4) 0xDA
153JUMPDESTcalldataload(4) 0xDA
154PUSH2 0x016e0x016E calldataload(4) 0xDA
157JUMPcalldataload(4) 0xDA
16EJUMPDESTcalldataload(4) 0xDA
16FPUSH1 0x040x04 calldataload(4) 0xDA
171DUP2calldataload(4) 0x04 calldataload(4) 0xDA
172DUP20x04 calldataload(4) 0x04 calldataload(4) 0xDA
173SLOADStorage[4] calldataload(4) 0x04 calldataload(4) 0xDA
174DUP2calldataload(4) Storage[4] calldataload(4) 0x04 calldataload(4) 0xDA
175LTcalldataload(4)<Storage[4] calldataload(4) 0x04 calldataload(4) 0xDA
176PUSH2 0x017e0x017EC calldataload(4)<Storage[4] calldataload(4) 0x04 calldataload(4) 0xDA
179JUMPIcalldataload(4) 0x04 calldataload(4) 0xDA

If the first word is not less than Storage[4], the function fails. It reverts without any returned value:

OffsetOpcodeStack
17APUSH1 0x000x00 ...
17CDUP10x00 0x00 ...
17DREVERT

If the calldataload(4) is less than Storage[4], we get this code:

OffsetOpcodeStack
17EJUMPDESTcalldataload(4) 0x04 calldataload(4) 0xDA
17FPUSH1 0x000x00 calldataload(4) 0x04 calldataload(4) 0xDA
181SWAP20x04 calldataload(4) 0x00 calldataload(4) 0xDA
182DUP30x00 0x04 calldataload(4) 0x00 calldataload(4) 0xDA
183MSTOREcalldataload(4) 0x00 calldataload(4) 0xDA

And memory locations 0x00-0x1F now contain the data 0x04 (0x00-0x1E are all zeros, 0x1F is four)

OffsetOpcodeStack
184PUSH1 0x200x20 calldataload(4) 0x00 calldataload(4) 0xDA
186SWAP1calldataload(4) 0x20 0x00 calldataload(4) 0xDA
187SWAP20x00 0x20 calldataload(4) calldataload(4) 0xDA
188SHA3(((SHA3 of 0x00-0x1F))) calldataload(4) calldataload(4) 0xDA
189ADD(((SHA3 of 0x00-0x1F)))+calldataload(4) calldataload(4) 0xDA
18ASLOADStorage[(((SHA3 of 0x00-0x1F))) + calldataload(4)] calldataload(4) 0xDA

So there is a lookup table in storage, which starts at the SHA3 of 0x000...0004 and has an entry for every legitimate call data value (value below Storage[4]).

OffsetOpcodeStack
18BSWAP1calldataload(4) Storage[(((SHA3 of 0x00-0x1F))) + calldataload(4)] 0xDA
18CPOPStorage[(((SHA3 of 0x00-0x1F))) + calldataload(4)] 0xDA
18DDUP20xDA Storage[(((SHA3 of 0x00-0x1F))) + calldataload(4)] 0xDA
18EJUMPStorage[(((SHA3 of 0x00-0x1F))) + calldataload(4)] 0xDA

We already know what the code at offset 0xDA does, it returns the stack top value to the caller. So this function returns the value from the lookup table to the caller.

0x1f135823

The code in offsets 0xC4-0xCF is identical to what we saw in 0x103-0x10E in splitter() (other than the JUMPI destination), so we know this function is also not payable.

OffsetOpcodeStack
D0JUMPDEST
D1POP
D2PUSH2 0x00da0xDA
D5PUSH1 0x060x06 0xDA
D7SLOADValue* 0xDA
D8DUP20xDA Value* 0xDA
D9JUMPValue* 0xDA

We already know what the code at offset 0xDA does, it returns the stack top value to the caller. So this function returns Value*.

Method Summary

Do you feel you understand the contract at this point? I don't. So far we have these methods:

MethodMeaning
TransferAccept the value provided by the call and increase Value* by that amount
splitter()Return Storage[3], the proxy address
currentWindow()Return Storage[1]
merkleRoot()Return Storage[0]
0x81e580d3Return the value from a lookup table, provided the parameter is less than Storage[4]
0x1f135823Return Storage[6], a.k.a. Value*

But we know any other functionality is provided by the contract in Storage[3]. Maybe if we knew what that contract is it'll give us a clue. Thankfully, this is the blockchain and everything is known, at least in theory. We didn't see any methods that set Storage[3], so it must have been set by the constructor.

The Constructor

When we look at a contract(opens in a new tab) we can also see the transaction that created it.

Click the create transaction

If we click that transaction, and then the State tab, we can see the initial values of the parameters. Specifically, we can see that Storage[3] contains 0x2f81e57ff4f4d83b40a9f719fd892d8e806e0761(opens in a new tab). That contract must contain the missing functionality. We can understand it using the same tools we used for the contract we are investigating.

The Proxy Contract

Using the same techniques we used for the original contract above we can see that the contract reverts if:

  • There is any ETH attached to the call (0x05-0x0F)
  • The call data size is less than four (0x10-0x19 and 0xBE-0xC2)

And that the methods it supports are:

MethodMethod signatureOffset to jump into
scaleAmountByPercentage(uint256,uint256)(opens in a new tab)0x8ffb5c970x0135
isClaimed(uint256,address)(opens in a new tab)0xd2ef07950x0151
claim(uint256,address,uint256,bytes32[])(opens in a new tab)0x2e7ba6ef0x00F4
incrementWindow()(opens in a new tab)0x338b1d310x0110
???0x3f26479e0x0118
???0x1e7df9d30x00C3
currentWindow()(opens in a new tab)0xba0bafb40x0148
merkleRoot()(opens in a new tab)0x2eb4a7ab0x0107
???0x81e580d30x0122
???0x1f1358230x00D8

We can ignore the bottom four methods because we will never get to them. Their signatures are such that our original contract takes care of them by itself (you can click the signatures to see the details above), so they must be methods that are overridden(opens in a new tab).

One of the remaining methods is claim(<params>), and another is isClaimed(<params>), so it looks like an airdrop contract. Instead of going through the rest opcode by opcode, we can try the decompiler(opens in a new tab), which produces usable results for three functions from this contract. Reverse engineering the other ones is left as an exercise to the reader.

scaleAmountByPercentage

This is what the decompiler gives us for this function:

1def unknown8ffb5c97(uint256 _param1, uint256 _param2) payable:
2 require calldata.size - 4 >=β€² 64
3 if _param1 and _param2 > -1 / _param1:
4 revert with 0, 17
5 return (_param1 * _param2 / 100 * 10^6)
Copiar

The first require tests that the call data has, in addition to the four bytes of the function signature, at least 64 bytes, enough for the two parameters. If not then there is obviously something wrong.

The if statement seems to check that _param1 is not zero, and that _param1 * _param2 is not negative. It is probably to prevent cases of wrap around.

Finally, the function returns a scaled value.

claim

The code the decompiler creates is complex, and not all of it is relevant for us. I am going to skip some of it to focus on the lines that I believe provide useful information

1def unknown2e7ba6ef(uint256 _param1, uint256 _param2, uint256 _param3, array _param4) payable:
2 ...
3 require _param2 == addr(_param2)
4 ...
5 if currentWindow <= _param1:
6 revert with 0, 'cannot claim for a future window'
Copiar

We see here two important things:

  • _param2, while it is declared as a uint256, is actually an address
  • _param1 is the window being claimed, which has to be currentWindow or earlier.
1 ...
2 if stor5[_claimWindow][addr(_claimFor)]:
3 revert with 0, 'Account already claimed the given window'
Copiar

So now we know that Storage[5] is an array of windows and addresses, and whether the address claimed the reward for that window.

1 ...
2 idx = 0
3 s = 0
4 while idx < _param4.length:
5 ...
6 if s + sha3(mem[(32 * _param4.length) + 328 len mem[(32 * _param4.length) + 296]]) > mem[(32 * idx) + 296]:
7 mem[mem[64] + 32] = mem[(32 * idx) + 296]
8 ...
9 s = sha3(mem[_62 + 32 len mem[_62]])
10 continue
11 ...
12 s = sha3(mem[_66 + 32 len mem[_66]])
13 continue
14 if unknown2eb4a7ab != s:
15 revert with 0, 'Invalid proof'
Mostrar todo
Copiar

We know that unknown2eb4a7ab is actually the function merkleRoot(), so this code looks like it is verifying a merkle proof(opens in a new tab). This means that _param4 is a merkle proof.

1 call addr(_param2) with:
2 value unknown81e580d3[_param1] * _param3 / 100 * 10^6 wei
3 gas 30000 wei
Copiar

This is how a contract transfers its own ETH to another address (contract or externally owned). It calls it with a value that is the amount to be transferred. So it looks like this is an airdrop of ETH.

1 if not return_data.size:
2 if not ext_call.success:
3 require ext_code.size(stor2)
4 call stor2.deposit() with:
5 value unknown81e580d3[_param1] * _param3 / 100 * 10^6 wei
Copiar

The bottom two lines tell us that Storage[2] is also a contract that we call. If we look at the constructor transaction(opens in a new tab) we see that this contract is 0xc02aaa39b223fe8d0a0e5c4f27ead9083c756cc2(opens in a new tab), a Wrapped Ether contract whose source code has been uploaded to Etherscan(opens in a new tab).

So it looks like the contracts attempts to send ETH to _param2. If it can do it, great. If not, it attempts to send WETH(opens in a new tab). If _param2 is an externally owned account (EOA) then it can always receive ETH, but contracts can refuse to receive ETH. However, WETH is ERC-20 and contracts can't refuse to accept that.

1 ...
2 log 0xdbd5389f: addr(_param2), unknown81e580d3[_param1] * _param3 / 100 * 10^6, bool(ext_call.success)
Copiar

At the end of the function we see a log entry being generated. Look at the generated log entries(opens in a new tab) and filter on the topic that starts with 0xdbd5.... If we click one of the transactions that generated such an entry(opens in a new tab) we see that indeed it looks like a claim - the account sent a message to the contract we're reverse engineering, and in return got ETH.

A claim transaction

1e7df9d3

This function is very similar to claim above. It also checks a merkle proof, attempts to transfer ETH to the first, and produces the same type of log entry.

1def unknown1e7df9d3(uint256 _param1, uint256 _param2, array _param3) payable:
2 ...
3 idx = 0
4 s = 0
5 while idx < _param3.length:
6 if idx >= mem[96]:
7 revert with 0, 50
8 _55 = mem[(32 * idx) + 128]
9 if s + sha3(mem[(32 * _param3.length) + 160 len mem[(32 * _param3.length) + 128]]) > mem[(32 * idx) + 128]:
10 ...
11 s = sha3(mem[_58 + 32 len mem[_58]])
12 continue
13 mem[mem[64] + 32] = s + sha3(mem[(32 * _param3.length) + 160 len mem[(32 * _param3.length) + 128]])
14 ...
15 if unknown2eb4a7ab != s:
16 revert with 0, 'Invalid proof'
17 ...
18 call addr(_param1) with:
19 value s wei
20 gas 30000 wei
21 if not return_data.size:
22 if not ext_call.success:
23 require ext_code.size(stor2)
24 call stor2.deposit() with:
25 value s wei
26 gas gas_remaining wei
27 ...
28 log 0xdbd5389f: addr(_param1), s, bool(ext_call.success)
Mostrar todo
Copiar

The main difference is that the first parameter, the window to withdraw, isn't there. Instead, there is a loop over all the windows that could be claimed.

1 idx = 0
2 s = 0
3 while idx < currentWindow:
4 ...
5 if stor5[mem[0]]:
6 if idx == -1:
7 revert with 0, 17
8 idx = idx + 1
9 s = s
10 continue
11 ...
12 stor5[idx][addr(_param1)] = 1
13 if idx >= unknown81e580d3.length:
14 revert with 0, 50
15 mem[0] = 4
16 if unknown81e580d3[idx] and _param2 > -1 / unknown81e580d3[idx]:
17 revert with 0, 17
18 if s > !(unknown81e580d3[idx] * _param2 / 100 * 10^6):
19 revert with 0, 17
20 if idx == -1:
21 revert with 0, 17
22 idx = idx + 1
23 s = s + (unknown81e580d3[idx] * _param2 / 100 * 10^6)
24 continue
Mostrar todo
Copiar

So it looks like a claim variant that claims all the windows.

Conclusion

By now you should know how to understand contracts whose source code is not available, using either the opcodes or (when it works) the decompiler. As is evident from the length of this article, reverse engineering a contract is not trivial, but in a system where security is essential it is an important skill to be able to verify contracts work as promised.

Última edición: @wackerow(opens in a new tab), 2 de abril de 2024

Foi ΓΊtil este tutorial?