Currently the person behind Yoitsu, a LikeCoin Chain validator, and a member of the CDC/CFC slacker crew (lol). Occasionally turns into a fox. ( @foobarz )

Incident report on Yoitsu's downtime on June 9th, 2022

I rebuilt my validator node, Yoitsu, because I lost the node's private key during this incident.

Details of this incident

All times below are in UTC+8.

At about 11am, I noticed from the Discord bot that my validator node was missing a lot of blocks.

curl -sS http://localhost:26657/net_info | jq -r '.result.n_peers' returned 0, which meant my node could not connect to any peer at that time. So I planned to move the virtual server to another location to try to mitigate the problem.

My validator node is hosted on Linode, which supports migrating a virtual server to a different data center, although it doesn't support moving externally mounted block storage along with it.

To save time, I didn't transfer the ~/.liked/data directory, and tried to use state sync to catch up on blocks instead. However, I encountered "context deadline exceeded" errors every time:

cosmovisor[3370]: 12:27PM ERR error on light block request from witness, removing... error="post failed: Post \"https://fotan-node-2.like.co:443/rpc/\": context deadline exceeded" module=light primary={}

So I cleared the old ~/.liked directory in order to catch up from nnkken's snapshot, but the only thing I saved beforehand was the ~/.liked/keyring-file directory. As a result, I lost the validator node's private key, which was stored in ~/.liked/config.

After sync_info.catching_up turned to false in my node's status, I noticed that my node had no voting power and BigDipper was still showing my node as missing blocks. So I checked the logs:

Jun 09 10:11:12 localhost cosmovisor[2485]: 10:11AM INF This node is not a validator addr=353558D7C7D69DF83A6C9D37BB8204B38561217C module=consensus pubKey=cEwyDK/M1mJ+fJHXASe……

In addition, ~/liked tendermint show-address returned an address different from the one registered for my validator. I realized I had lost my node's private key, so I announced the incident in the #mainnet-validators channel on Discord and started to create a new node.
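The same check can be made against the node's RPC. As a rough sketch, Tendermint's /status endpoint reports the consensus address and voting power the node is running with under result.validator_info; a voting power of 0 after the node has caught up suggests it is not signing with the validator's key. The function names are hypothetical, and jq is assumed to be installed.

```shell
# Print "address voting_power" for a node's RPC endpoint.
# Defaults to the local RPC port used earlier in this report.
node_validator_info() {
    curl -sS "${1:-http://localhost:26657}/status" |
        jq -r '.result.validator_info | "\(.address) \(.voting_power)"'
}

# Succeed only when the given voting power is a positive integer.
has_voting_power() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty or not a plain integer
    esac
    [ "$1" -gt 0 ]
}

# Example usage (not run automatically):
#   set -- $(node_validator_info)
#   has_voting_power "$2" || echo "node $1 has no voting power" >&2
```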

Lessons learned from this incident

  • Backing up the node operator's private key is not enough; we should also back up the private key of the node itself.
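As a minimal sketch of that lesson, the function below archives the node's key files together with the keyring before the home directory is wiped. It assumes the default Tendermint file names under ~/.liked/config (priv_validator_key.json and node_key.json); the function name and the backup destination are hypothetical.

```shell
# backup_node_keys HOME_DIR DEST_DIR
# Archive the files that define the node's identity and print the archive path.
# File names assume the default Tendermint home layout (an assumption).
backup_node_keys() {
    home_dir="$1"
    dest_dir="$2"

    # Refuse to continue if the consensus key itself is missing.
    if [ ! -f "$home_dir/config/priv_validator_key.json" ]; then
        echo "priv_validator_key.json not found in $home_dir/config" >&2
        return 1
    fi

    mkdir -p "$dest_dir" || return 1
    archive="$dest_dir/node-keys-$(date +%Y%m%d%H%M%S).tar.gz"

    # Start with the consensus key, then add the other paths if they exist.
    set -- config/priv_validator_key.json
    for path in config/node_key.json keyring-file; do
        if [ -e "$home_dir/$path" ]; then
            set -- "$@" "$path"
        fi
    done

    # umask 077 keeps the archive readable by the current user only.
    (umask 077 && tar -czf "$archive" -C "$home_dir" "$@") || return 1
    echo "$archive"
}

# Example usage (not run automatically):
#   backup_node_keys "$HOME/.liked" "$HOME/liked-key-backup"
```

The archive contains private keys, so it should be moved off the server and stored as carefully as the operator key.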

Follow-up actions


  • There may be many reports of state sync failures. It may be necessary to test this mechanism, even if it is not used frequently for synchronization.
  • Alternatively, we could expand the documentation to cover how to enable state sync on more nodes, which would improve the robustness of this feature.
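For context, a node that uses state sync needs a trusted height and block hash in the [statesync] section of config.toml. A common way to get them is to ask an RPC server for a block a little behind the chain tip. The sketch below assumes jq is installed; the 2000-block offset and the function names are illustrative choices, not values taken from LikeCoin's documentation.

```shell
# trust_height LATEST OFFSET
# Print a height OFFSET blocks behind LATEST, never lower than 1.
trust_height() {
    h=$(( $1 - $2 ))
    [ "$h" -ge 1 ] || h=1
    echo "$h"
}

# statesync_params RPC_URL [OFFSET]
# Print trust_height and trust_hash lines suitable for config.toml.
statesync_params() {
    rpc="$1"
    offset="${2:-2000}"

    latest=$(curl -sS "$rpc/block" | jq -r '.result.block.header.height') || return 1
    case "$latest" in
        ''|*[!0-9]*) echo "unexpected height: $latest" >&2; return 1 ;;
    esac

    height=$(trust_height "$latest" "$offset")
    hash=$(curl -sS "$rpc/block?height=$height" | jq -r '.result.block_id.hash') || return 1

    echo "trust_height = $height"
    echo "trust_hash = \"$hash\""
}

# Example usage (not run automatically), with the witness from the log above:
#   statesync_params "https://fotan-node-2.like.co:443/rpc"
```

The printed values still have to be copied into config.toml next to enable = true and at least two rpc_servers, and the RPC servers must be ones the operator trusts.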




