Lessons learned running an Ethereum node

Julio Santana
Apr 19, 2022

The Ethereum cryptocurrency operates thanks to a protocol run by a network of peer-to-peer clients. Using it, each client can send and receive transactions, mine blocks, inspect the blockchain, and perform all other related tasks. To take part in the network, you need to set up a node running one of the clients already written for this protocol, and several implementations are available.

So there is already a decision to make:

Select your client

In my case, I wanted to run a client to build an application similar to a wallet, able to create new accounts, check my wallet's balance, and send new transactions.

For this step I decided to use Geth, since it seemed to be the most stable and best-maintained of the available clients. It also provides binaries in several flavours (including Docker), and it was very easy to integrate with a Kubernetes environment, which was also one of my goals. If you want to give it a try, you can just run:

docker run -d --name ethereum-node -v /Users/alice/ethereum:/root \
-p 8545:8545 -p 30303:30303 \
ethereum/client-go

Many more options are available here.

Don’t forget the hardware requirements

Before actually doing anything, make sure your hardware fulfils Geth's requirements, which in version 1.10.16 were:

Minimum:
- CPU with 2+ cores
- 4GB RAM
- 320GB free storage space to sync the Mainnet
- 8 MBit/sec download Internet service

Recommended:
- Fast CPU with 4+ cores
- 16GB+ RAM
- Fast SSD with at least 500GB free space
- 25+ MBit/sec download Internet service

I first tried the minimum setup on private networks that did not generate many transactions, and it was fine. However, once I moved to mainnet, the node ran for weeks and never became fully synced. The cause was that the network produces data faster than a spinning SATA hard disk can write it, resulting in a never-ending sync. The lesson learned here is that to run a full node you really need an SSD.
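To tell whether a node is catching up or falling behind, one option is to look at the result of the standard eth_syncing JSON-RPC method. The sketch below is only an illustration: the helper name sync_progress is hypothetical, and it assumes you have already fetched and parsed the result yourself. It relies on eth_syncing returning false when the node is not syncing, and otherwise an object whose block numbers are hex strings.

```python
def sync_progress(syncing_result):
    """Turn a parsed eth_syncing result into a percentage (hypothetical helper).

    Assumption: `syncing_result` is either False (node reports it is not
    syncing) or a dict with hex-encoded "currentBlock" and "highestBlock".
    """
    if syncing_result is False:
        # The node reports no sync in progress. Note that a node without
        # peers may also report this, so treat 100% with some caution.
        return 100.0
    current = int(syncing_result["currentBlock"], 16)
    highest = int(syncing_result["highestBlock"], 16)
    if highest == 0:
        return 0.0
    return 100.0 * current / highest


# Example with a made-up result: block 100 out of 200.
# sync_progress({"currentBlock": "0x64", "highestBlock": "0xc8"})
```

Sampling this value every few minutes shows whether the gap to the chain head is shrinking. If it stays flat or grows for days, as it did for me, the disk is a likely suspect.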

Use PoA for testing networks

If you are using your node to run tests locally, you will probably need to control how test transactions and blocks are created. Creating test transactions and mining them into a block is one of the few things that is easier to do on a Bitcoin test node than on an Ethereum one.

Of course, in Ethereum you can use the API to create a transaction with eth_sendRawTransaction, then activate mining with miner_start and wait for the mining logic to do its job and include it in a block some seconds later (the time depends on the network's difficulty, but it is approximately 12 seconds). But this is not optimal: you cannot say how many transactions will be in the block or request that it be mined immediately. With PoA (proof of authority), on the other hand, a signer can be specified, and it will be able to sign new blocks at any moment.
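The send-then-mine flow described above can be sketched with plain JSON-RPC over HTTP. This is a minimal illustration under several assumptions, not a definitive client: the URL http://localhost:8545 assumes HTTP-RPC is enabled on the node, the miner namespace is assumed to be exposed on that endpoint, and the helper names (build_request, post, send_and_wait) are hypothetical. The raw transaction must already be signed elsewhere.

```python
import json
import time
import urllib.request

RPC_URL = "http://localhost:8545"  # assumption: HTTP-RPC enabled on this port


def build_request(method, params, request_id=1):
    """Build a JSON-RPC 2.0 request body as a dict."""
    return {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}


def post(request_body, url=RPC_URL):
    """POST one request to the node and return its "result" field."""
    data = json.dumps(request_body).encode()
    req = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        reply = json.load(resp)
    if "error" in reply:
        raise RuntimeError(reply["error"])
    return reply["result"]


def send_and_wait(raw_tx_hex, poll_seconds=1.0, timeout=120.0):
    """Submit a signed transaction, start mining, and poll for its receipt."""
    tx_hash = post(build_request("eth_sendRawTransaction", [raw_tx_hex]))
    post(build_request("miner_start", []))
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        # The receipt is null until the transaction is included in a block.
        receipt = post(build_request("eth_getTransactionReceipt", [tx_hash]))
        if receipt is not None:
            return receipt
        time.sleep(poll_seconds)
    raise TimeoutError("transaction was not mined within the timeout")
```

The polling loop is exactly the weak point mentioned above: the caller cannot decide when the block is produced, only wait for it.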

Another problem arises when you need your network to start with an account that already has funds available for your tests, for example when you need to generate thousands of transactions for a stress test.

PoA also has something to offer here. A genesis block with enough funds can be configured by defining, among other parameters, the amount of available ether and the gas limit. More info on how to do this can be found here.
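As a rough illustration of such a genesis configuration, the sketch below builds a Clique (Geth's PoA engine) genesis as a Python dict that can be dumped to JSON. The function name clique_genesis, the chain ID 12345, and the gas limit are hypothetical values chosen for the example. The extraData layout (32 zero bytes, the signer address, 65 zero bytes) follows the format Geth documents for Clique. To my understanding a period of 0 makes the signer seal blocks only when transactions are pending, but treat that as an assumption to verify on your version.

```python
import json


def _strip_0x(address):
    """Return a lowercase hex address without its 0x prefix."""
    address = address.lower()
    return address[2:] if address.startswith("0x") else address


def clique_genesis(signer, funded_accounts, chain_id=12345, period=0,
                   gas_limit=8_000_000):
    """Build a Clique genesis dict (hypothetical helper, example values).

    signer:          address allowed to seal blocks
    funded_accounts: mapping of address -> starting balance in wei
    """
    signer_hex = _strip_0x(signer)
    if len(signer_hex) != 40:
        raise ValueError("signer must be a 20-byte hex address")
    # 32 bytes of vanity data + signer address + 65 bytes reserved for the seal.
    extra_data = "0x" + "00" * 32 + signer_hex + "00" * 65
    return {
        "config": {
            "chainId": chain_id,
            "homesteadBlock": 0,
            "eip150Block": 0,
            "eip155Block": 0,
            "eip158Block": 0,
            "byzantiumBlock": 0,
            "constantinopleBlock": 0,
            "petersburgBlock": 0,
            "istanbulBlock": 0,
            "berlinBlock": 0,
            "clique": {"period": period, "epoch": 30000},
        },
        "difficulty": "1",
        "gasLimit": str(gas_limit),
        "extraData": extra_data,
        "alloc": {
            _strip_0x(addr): {"balance": str(wei)}
            for addr, wei in funded_accounts.items()
        },
    }


# Example (made-up addresses): write the result to genesis.json.
# genesis = clique_genesis("0x" + "ab" * 20, {"0x" + "cd" * 20: 10**21})
# with open("genesis.json", "w") as fh:
#     json.dump(genesis, fh, indent=2)
```

The resulting file can then be loaded with geth init --datadir followed by the data directory and the path to genesis.json, before starting the node with the signer account unlocked.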

Get ready for the “Head state missing, repairing pairs” error

After running the node for a while, I noticed that after every shutdown the node had to spend some time repairing its internal state and showed the error (as of version 1.10.16)

Head state missing, repairing pairs

and stayed there for minutes or even hours. Some people on the internet even complained that their node never really started again, because the container timed out after some minutes and restarted, ending up in an infinite loop.

The problem seemed to be that Geth keeps the state of the blockchain in memory and only flushes it to disk every hour or so, as stated here and here. So if the process is shut down without a graceful signal (as was the case with the k8s container I was running), part of this state can be lost and a lengthy reprocessing might be needed.

The key here was to create a pre-stop hook that sends the INT signal to the node process:

kill -INT $PID

SIGTERM was also tried, but it didn't seem to be handled by the Geth code at that time.
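One detail worth spelling out is that Kubernetes sends its own termination signal once the pre-stop hook returns, so the hook should wait for the node to finish flushing instead of exiting right after the kill. The sketch below shows that idea in Python for a POSIX system. It is not the hook I used: the function names are hypothetical, how the PID is obtained is left to the caller, and the timeout is an arbitrary example that must fit within the pod's termination grace period.

```python
import os
import signal
import time


def process_alive(pid):
    """Return True if a process with this PID still exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def interrupt_and_wait(pid, timeout=300.0, poll=0.5):
    """Send SIGINT to the node process and wait until it has exited.

    Returns True if the process exited within `timeout` seconds.
    """
    os.kill(pid, signal.SIGINT)
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if not process_alive(pid):
            return True
        time.sleep(poll)
    return False


# Example use from a hook script, with the PID passed as an argument:
# import sys
# ok = interrupt_and_wait(int(sys.argv[1]))
# sys.exit(0 if ok else 1)
```

If the function returns False, the node did not finish in time and the next start will probably hit the repair again, which is a signal to raise the grace period.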
