Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Strange behaviour with staged/unripe blocks. #3855

Open
walrusmustgo opened this issue Aug 14, 2024 · 24 comments
Open

Strange behaviour with staged/unripe blocks. #3855

walrusmustgo opened this issue Aug 14, 2024 · 24 comments

Comments

@walrusmustgo
Copy link

walrusmustgo commented Aug 14, 2024

I've found an issue -- i reckon it's most definitely an issue, not misusage -- regarding querying staged and unripe blocks. It seems to me that the chifra isn't reading either of the two when querying. I'm going to explain my process, but keep in mind,this's very time sensitive, it's only concerning newer data on the blockchain, so for actually testing it out properly, i reckon we'll have to test it out in similar times.

So, i've got a service which's listening for approvals on tokens which aren't released for trading yet. I'm going to be using this address:

0x1a7cAC64314FB35e69a1045014EEa422D902fB17

At the time of writing, this contract is ~10mins old. When i run:

chifra export 0x1a7cAC64314FB35e69a1045014EEa422D902fB17

i get the following output:

2812a94825aa:/# chifra export 0x1a7cAC64314FB35e69a1045014EEa422D902fB17
EROR[14-08|13:55:05.157] no blocks found for the query

When i run:

chifra export --unripe 0x1a7cAC64314FB35e69a1045014EEa422D902fB17

i get the same output:

2812a94825aa:/# chifra export 0x1a7cAC64314FB35e69a1045014EEa422D902fB17 --unripe
EROR[14-08|14:03:52.839] no blocks found for the query

And using the curl debugger:

2812a94825aa:/# TB_DEBUG_CURL=1 chifra export 0x1a7cAC64314FB35e69a1045014EEa422D902fB17 --unripe
EROR[14-08|14:04:40.626] no blocks found for the query
INFO[14-08|14:04:40.626] Closing curl debugger

Now,this contract has activity on it, the scraper is running properly, the staging file is growing in size, and when i try

grep 0x1a7cAC64314FB35e69a1045014EEa422D902fB17 <filename>

in the staging folder, i don't get anything in the output. This's happening with all new tokens, and it's pretty inconsistent when i'll be able to query them. Sometimes it needs over 4h ( which further points me to thinking that chifra only reads finalized chunks, because it's very possible that a chunk was built during that time ) to actually output anything, and it misses last 1h of data, and this's with the --unripe flag. From my understanding, if i use --unripe, i should basically have access to the chain head, or a block less, until the scraper picks it up, but that doesn't seem to be the case.

Any help would be appreciated :)

PS. This's also true for --traces flag, and chifra list command as well. And checking the unripe and staging folders, the file name seems to be correct as well, 28 blocks behind the head for the staged, and chain head block for the unripe file.

I'm on trueblocks 3.1.2, docker version, and i'm using erigon 3.0 as a node.

@walrusmustgo
Copy link
Author

walrusmustgo commented Aug 16, 2024

Some more examples, but this time with WETH/USDT pair address instead of new tokens.

a900bdce7c5a:/# chifra export --unripe 0x0d4a11d5eeaac28ec3f61d100daf4d40471f1852 --first_block 20542500
EROR[16-08|17:19:58.219] no blocks found for the query
a900bdce7c5a:/# chifra export --unripe 0x0d4a11d5eeaac28ec3f61d100daf4d40471f1852 --first_block 20542499
EROR[16-08|17:20:03.713] no blocks found for the query
a900bdce7c5a:/# chifra export --unripe 0x0d4a11d5eeaac28ec3f61d100daf4d40471f1852 --first_block 20542489
blockNumber     transactionIndex        timestamp       date    from    to      value   gasPrice        gasUsed gasCost hash    isError encoding        type    ether
20542491        13      1723826975      2024-08-16 16:49:35 UTC 0x042a77a27d7677465d689fcf3161f685bd363f50      0x3fc91a3afd70395cd496c647d5a6cc9d4b2b7fad      0       5589343442      255248  1426668734883616        0x48e3d94ff93e54b2e5a39d4e9fb72f65a29ad116464091e9d20ae22f99eb147e      false   0x3593564c      0x2     0
20542491        63      1723826975      2024-08-16 16:49:35 UTC 0xd51983c46e4c993dfe7063ded181da03d8cd2de8      0x3fc91a3afd70395cd496c647d5a6cc9d4b2b7fad      0       4127695726      219151  904588646048626 0x58fb989e7b95914eb93c6af42b9ceef14d192127dea0dab7c0ea96eb4e89c670      false   0x3593564c      0x2     0
a900bdce7c5a:/# chifra export --unripe 0x0d4a11d5eeaac28ec3f61d100daf4d40471f1852 --first_block 20542492
EROR[16-08|17:20:13.123] no blocks found for the query
a900bdce7c5a:/# chifra export --unripe 0x0d4a11d5eeaac28ec3f61d100daf4d40471f1852 --first_block 20542491
blockNumber     transactionIndex        timestamp       date    from    to      value   gasPrice        gasUsed gasCost hash    isError encoding        type    ether
20542491        13      1723826975      2024-08-16 16:49:35 UTC 0x042a77a27d7677465d689fcf3161f685bd363f50      0x3fc91a3afd70395cd496c647d5a6cc9d4b2b7fad      0       5589343442      255248  1426668734883616        0x48e3d94ff93e54b2e5a39d4e9fb72f65a29ad116464091e9d20ae22f99eb147e      false   0x3593564c      0x2     0
20542491        63      1723826975      2024-08-16 16:49:35 UTC 0xd51983c46e4c993dfe7063ded181da03d8cd2de8      0x3fc91a3afd70395cd496c647d5a6cc9d4b2b7fad      0       4127695726      219151  904588646048626 0x58fb989e7b95914eb93c6af42b9ceef14d192127dea0dab7c0ea96eb4e89c670      false   0x3593564c      0x2     0
a900bdce7c5a:/# chifra export --unripe 0x0d4a11d5eeaac28ec3f61d100daf4d40471f1852 --first_block 20542492
EROR[16-08|17:20:21.349] no blocks found for the query
a900bdce7c5a:/# chifra export --unripe 0x0d4a11d5eeaac28ec3f61d100daf4d40471f1852 --first_block 20542492
EROR[16-08|17:21:35.761] no blocks found for the query
a900bdce7c5a:/# chifra export --unripe 0x0d4a11d5eeaac28ec3f61d100daf4d40471f1852 --first_block 20542491
blockNumber     transactionIndex        timestamp       date    from    to      value   gasPrice        gasUsed gasCost hash    isError encoding        type    ether
20542491        13      1723826975      2024-08-16 16:49:35 UTC 0x042a77a27d7677465d689fcf3161f685bd363f50      0x3fc91a3afd70395cd496c647d5a6cc9d4b2b7fad      0       5589343442      255248  1426668734883616        0x48e3d94ff93e54b2e5a39d4e9fb72f65a29ad116464091e9d20ae22f99eb147e      false   0x3593564c      0x2     0
20542491        63      1723826975      2024-08-16 16:49:35 UTC 0xd51983c46e4c993dfe7063ded181da03d8cd2de8      0x3fc91a3afd70395cd496c647d5a6cc9d4b2b7fad      0       4127695726      219151  904588646048626 0x58fb989e7b95914eb93c6af42b9ceef14d192127dea0dab7c0ea96eb4e89c670      false   0x3593564c      0x2     0

Here,i've been pinpointing exactly where the no blocks found error occurs. At the time of writing, the current block number is 20542666, and it fails after 20542491.

Here are some status logs if they can help:

a900bdce7c5a:/# chifra status index
{ROG[16-08|17:23:19.698]                                            
  "data": [
    {
      "cachePath": "/cache/mainnet/",
      "caches": [
        {
          "items": [],
          "lastCached": "2024-08-16 17:23:19",
          "nFiles": 4851,
          "nFolders": 1,
          "path": "/unchained/mainnet/finalized/",
          "sizeInBytes": 128196093084,
          "type": "indexCache"
        }
      ],
      "chainConfig": "/root/.local/share/trueblocks/config/mainnet/",
      "clientVersion": "erigon/3.00.0/linux-amd64/go1.21.5",
      "hasEsKey": true,
      "hasPinKey": false,
      "indexPath": "/unchained/mainnet/",
      "isApi": false,
      "isArchive": true,
      "isTesting": false,
      "isTracing": true,
      "rootConfig": "/root/.local/share/trueblocks/",
      "rpcProvider": "http://erigon:8547",
      "trueblocksVersion": "GHC-TrueBlocks//3.1.2-release"
    }
  ]
}
a900bdce7c5a:/# chifra status blocks
{ROG[16-08|17:23:54.556]                                            
  "data": [],
  "errors": [
    "no files were found in the [blocks] caches"
  ]
}
a900bdce7c5a:/# chifra status unripe
{ROG[16-08|17:24:16.126]                                            
  "data": [
    {
      "cachePath": "/cache/mainnet/",
      "caches": [
        {
          "items": [],
          "lastCached": "2024-08-16 17:24:16",
          "nFiles": 28,
          "nFolders": 1,
          "path": "/unchained/mainnet/unripe/",
          "sizeInBytes": 1537127,
          "type": "unripeCache"
        }
      ],
      "chainConfig": "/root/.local/share/trueblocks/config/mainnet/",
      "clientVersion": "erigon/3.00.0/linux-amd64/go1.21.5",
      "hasEsKey": true,
      "hasPinKey": false,
      "indexPath": "/unchained/mainnet/",
      "isApi": false,
      "isArchive": true,
      "isTesting": false,
      "isTracing": true,
      "rootConfig": "/root/.local/share/trueblocks/",
      "rpcProvider": "http://erigon:8547",
      "trueblocksVersion": "GHC-TrueBlocks//3.1.2-release"
    }
  ]
}
a900bdce7c5a:/# chifra status staging
{ROG[16-08|17:24:29.377]                                            
  "data": [
    {
      "cachePath": "/cache/mainnet/",
      "caches": [
        {
          "items": [],
          "lastCached": "2024-08-16 17:24:29",
          "nFiles": 1,
          "nFolders": 1,
          "path": "/unchained/mainnet/staging/",
          "sizeInBytes": 7378599,
          "type": "stagingCache"
        }
      ],
      "chainConfig": "/root/.local/share/trueblocks/config/mainnet/",
      "clientVersion": "erigon/3.00.0/linux-amd64/go1.21.5",
      "hasEsKey": true,
      "hasPinKey": false,
      "indexPath": "/unchained/mainnet/",
      "isApi": false,
      "isArchive": true,
      "isTesting": false,
      "isTracing": true,
      "rootConfig": "/root/.local/share/trueblocks/",
      "rpcProvider": "http://erigon:8547",
      "trueblocksVersion": "GHC-TrueBlocks//3.1.2-release"
    }
  ]
}

And here's some ripe-vs-unripe tests:

a900bdce7c5a:/# chifra export 0x0d4a11d5eeaac28ec3f61d100daf4d40471f1852 --first_block 20542491
blockNumber     transactionIndex        timestamp       date    from    to      value   gasPrice        gasUsed gasCost hash    isError encoding        type    ether
20542491        13      1723826975      2024-08-16 16:49:35 UTC 0x042a77a27d7677465d689fcf3161f685bd363f50      0x3fc91a3afd70395cd496c647d5a6cc9d4b2b7fad      0       5589343442      255248  1426668734883616        0x48e3d94ff93e54b2e5a39d4e9fb72f65a29ad116464091e9d20ae22f99eb147e      false   0x3593564c      0x2     0
20542491        63      1723826975      2024-08-16 16:49:35 UTC 0xd51983c46e4c993dfe7063ded181da03d8cd2de8      0x3fc91a3afd70395cd496c647d5a6cc9d4b2b7fad      0       4127695726      219151  904588646048626 0x58fb989e7b95914eb93c6af42b9ceef14d192127dea0dab7c0ea96eb4e89c670      false   0x3593564c      0x2     0
a900bdce7c5a:/# chifra export --unripe 0x0d4a11d5eeaac28ec3f61d100daf4d40471f1852 --first_block 20542491
blockNumber     transactionIndex        timestamp       date    from    to      value   gasPrice        gasUsed gasCost hash    isError encoding        type    ether
20542491        13      1723826975      2024-08-16 16:49:35 UTC 0x042a77a27d7677465d689fcf3161f685bd363f50      0x3fc91a3afd70395cd496c647d5a6cc9d4b2b7fad      0       5589343442      255248  1426668734883616        0x48e3d94ff93e54b2e5a39d4e9fb72f65a29ad116464091e9d20ae22f99eb147e      false   0x3593564c      0x2     0
20542491        63      1723826975      2024-08-16 16:49:35 UTC 0xd51983c46e4c993dfe7063ded181da03d8cd2de8      0x3fc91a3afd70395cd496c647d5a6cc9d4b2b7fad      0       4127695726      219151  904588646048626 0x58fb989e7b95914eb93c6af42b9ceef14d192127dea0dab7c0ea96eb4e89c670      false   0x3593564c      0x2     0

I reckon that from this benchmark, tests can be ran with any address, as long as the block count is closer to the chain head, they'll always produce these results ( on my end at least ).

Btw, if i call chifra blocks latest, i get the latest block and get the correct output, which leads me to believe that the index is actually being built, but some of the commands aren't using the data properly.

I've checked out the repo, i'm going to start going through the code to see if i'll manage to find an issue -- i'm a noob with Go though, it might take some time for me to get deep enough --.

@tjayrush
Copy link
Member

This is just for my notes...

I looked at chifra transactions --cache 20567924.\* --uniq | grep 0x08 and found two appearances.

0x08f104c70126354b3755777d72436e983ba46439	20567924	51		input
0x08f104c70126354b3755777d72436e983ba46439	20567924	68		log_0_topic_2

I then ran chifra list 0x08f104c70126354b3755777d72436e983ba46439 which return zero results. This made sense since my index was not updated, so I started a watch on the address:

watch -n .2 "chifra list 0x08f104c70126354b3755777d72436e983ba46439 --unripe"

which should have show records as soon as the scraper got to block 20567924. I then started my scraper. I was about 3000 blocks behind. And it caught up, but the watch did not start showing the transactions until the block of interest was consolidated into a chunk.

It should have shown up as soon as the scraper got to block 20567924 but it took about 90 blocks later until the consolidation.

This is incorrect behaviour.

This is VERY hard to test.

@tjayrush
Copy link
Member

I've found an issue -- i reckon it's most definitely an issue, not misusage -- regarding querying staged and unripe blocks. It seems to me that the chifra isn't reading either of the two when querying.

I think this may be correct...I'm seeing the same thing...

@tjayrush
Copy link
Member

it's only concerning newer data on the blockchain, so for actually testing it out properly, i reckon we'll have to test it out in similar times.

It's very time dependant, so it's super hard to test...

@walrusmustgo
Copy link
Author

Yeah,it's quite related to time. One of the ways i've found easiest to test is by looking into some deploys channels on telegram, and using newer addresses. That way, you always have a fresh supply of newely deployed tokens, you just need to check etherscan to be sure there's some transactions on the contract, for more data. :)

Imma shamelessly put the one i'm working on lol:
https://t.me/Snipereum_ETH

@tjayrush
Copy link
Member

Now,this contract has activity on it, the scraper is running properly, the staging file is growing in size, and when i try grep 0x1a7cAC64314FB35e69a1045014EEa422D902fB17 in the staging folder,

This is an excellent test. If this address is not in the files in the folders, then it will not be found. So that makes sense.

@walrusmustgo
Copy link
Author

Now,this contract has activity on it, the scraper is running properly, the staging file is growing in size, and when i try grep 0x1a7cAC64314FB35e69a1045014EEa422D902fB17 in the staging folder,

This is an excellent test. If this address is not in the files in the folders, then it will not be found. So that makes sense.

Yeah,but from the looks of everything, there should be something related to the address, if we're to trust etherscan. Also, i've grep the entire file ,there was a bunch of 0xfffffff.... addresses in there, don't know if that should be the case? Unfortuantely, i've exited the shell without copying the staged file to showcase, but from my memory, it was literally ~50% of 0xfffff addresses.

@tjayrush
Copy link
Member

trueblocks 3.1.2

correct

docker version

Okay, although I prefer the core directly running. The docker is okay, but may be under tested.

erigon 3.0 as a node.

Very interesting. I heard of some issue on Erigon 3.0 and logs. I have no idea if this is important, but it may be.

@tjayrush
Copy link
Member

I've checked out the repo, i'm going to start going through the code to see if i'll manage to find an issue -- i'm a noob with Go though, it might take some time for me to get deep enough --.

Please, please, please let me help you here. If you have any questions about anything, ask in the discord.

@tjayrush
Copy link
Member

there was a bunch of 0xfffffff.... addresses in there, don't know if that should be the case?

We do something we call being purposefully sloppy. We err on the side of being overly inclusive instead of missing addresses. It's explained in excruciating detail in the "Building the Index and Bloom Filters" section of this white paper: https://trueblocks.io/papers/2023/specification-for-the-unchained-index-v2.0.0-release.pdf

@tjayrush
Copy link
Member

literally ~50% of 0xfffff addresses.

Probably not that many...You can do

chifra chunks addresses 20566100 | sort -r -n -k4,4 | head
chifra chunks addresses 20566100 | sort -r -n -k4,4 | grep 0xfffff | head

to see some of them in that block range.

@walrusmustgo
Copy link
Author

erigon 3.0 as a node.

Very interesting. I heard of some issue on Erigon 3.0 and logs. I have no idea if this is important, but it may be.

Yeah,there's a bug with alpha1 version ( which i'm currently using ), the method handler tends to crash on getLogs, however!, that's not the case with TB, i've been checking the erigon logs, and it runs just fine, no errors pop out. For reference, i've tested out the graph node out of curiosity, and it can't use the alpha1 version, the getLogs error basically halts it immediately ,but somehow, TB works just fine :D

I've checked out the repo, i'm going to start going through the code to see if i'll manage to find an issue -- i'm a noob with Go though, it might take some time for me to get deep enough --.

Please, please, please let me help you here. If you have any questions about anything, ask in the discord.

Will do, i've been poking around the past few days, we can work on this for sure. I'll most probably return to TB from tomorrow ( had some unavoidable hurdles at work i've had to take care of ASAP -- which's done now ), and i'll most definitely ping you for help!

there was a bunch of 0xfffffff.... addresses in there, don't know if that should be the case?

We do something we call being purposefully sloppy. We err on the side of being overly inclusive instead of missing addresses. It's explained in excruciating detail in the "Building the Index and Bloom Filters" section of this white paper: https://trueblocks.io/papers/2023/specification-for-the-unchained-index-v2.0.0-release.pdf

Ahhhhh, yes, i remember now, i've read the whitepaper, and i was actually intrigued with this approach. :D Forgot about it, my bad. in that case, it makes sense :) However, the missed address still doesn't...

literally ~50% of 0xfffff addresses.

Probably not that many...You can do

chifra chunks addresses 20566100 | sort -r -n -k4,4 | head
chifra chunks addresses 20566100 | sort -r -n -k4,4 | grep 0xfffff | head

to see some of them in that block range.

Yeah yeah, could be less, i was eyeballing it. There was a lot, but there were a lot of "normal" addresses as well, i've probably just went overboard :D

@tjayrush
Copy link
Member

How do you find these addresses that a being newly created? The trouble with testing is that either an address shows up rarely (in which case you have to wait for a new tx) or it shows up in every block (in which case TrueBlocks will choke).

Are you watching Uniswap or something? Or watching for contract deployments?

@tjayrush
Copy link
Member

Cheers. Out for the evening, now.

@walrusmustgo
Copy link
Author

walrusmustgo commented Aug 20, 2024

Yeah,it's quite related to time. One of the ways i've found easiest to test is by looking into some deploys channels on telegram, and using newer addresses. That way, you always have a fresh supply of newely deployed tokens, you just need to check etherscan to be sure there's some transactions on the contract, for more data. :)

Imma shamelessly put the one i'm working on lol: https://t.me/Snipereum_ETH

I'm looking into deployments. Now, the deployments are a bit tricky, because you need to go into each one of them,and make sure there's more transactions than just contract creation ( i reckon it's much easier to find the issue with more activity ).

One more channel is this :
https://t.me/SnipereumSniperScout

This one actually tracks approvals on new tokens, so basically everything newer in that channel will do, but it's possible that the token was released much earlier than the actual approvals started, so it's a trade-off.

Imo ,probably better to just track the deploys channel, it's much easier to be sure when the contract was deployed that way.

Cheers. Out for the evening, now.

Cheerio! ^^ we'll keep in touch! :)

@walrusmustgo
Copy link
Author

walrusmustgo commented Aug 20, 2024

I reckon that from this benchmark, tests can be ran with any address, as long as the block count is closer to the chain head, they'll always produce these results ( on my end at least ).

BTW, i think this's also a good approach for testing ( second message i've sent to this issue ). I've took a really popular address -- WETH/USDT Pair address in this case -- ,and tried chifra export --unripe on it, but with --first_block close to the chain head, and got basically the same behaviour as with new tokens.

@tjayrush
Copy link
Member

Yeah. I was thinking of this. Sounds good.

Did some further testing. I'm on a Mac. I opened a separate finder window on each of the three folders (./unripe, ./ripe, and ./staging).

When the scraper is way behind the chain tip, there is never any items in the unripe folder. That makes perfect sense. While the scraper is "completing a pass" (that is, it's scanning blocks), the staging file stays frozen and the ./ripe folder is being filled up with individual files, one for each block. This also makes perfect sense. One the "pass" is completed, all the ./ripe folder are grouped into the stage. This also makes sense. When there's enough records on the stage (2,000,000) then a chunk is created. This also makes sense.

This is how it works when it's not at the chain tip. I'll report next what I see at the chain tip once it catches up.

@tjayrush
Copy link
Member

Here's a video of what happens at the front of the chain:

https://ipfs.unchainedindex.io/ipfs/QmeHajNek31sfjueb8sjRgCpv5RctKgc2tKdXCVV7VPDVb

This appears to be working exactly as I would expect, so there must be something wrong somewhere else. The files are there. They are where I would expect them to be. I do notice that they periodically disappear and re-appear which may cause some problem, but this is no different than it's worked for six years, so I'm not going to call that a problem yet.

The problem must be somewhere in the querying code. More later...

@walrusmustgo
Copy link
Author

I've got exactly the same behaviour as you, the folders are being populated in exactly the same way. I've managed to grep some unripe blocks as well, and everything seems in order...

Yeah,i've noticed that the files are consantly being deleted/added, i even found it in code while i was looking through, but from what i can tell, everything is setup nicely, so that shouldn't cause any race conditions -- however, it's not impossible, if querying has changed recently, perhaps there's a race condition, and querying code can't find the files when it needs them? --.

My bet is that it's the querying code too, since everything before the query itself seems to behave just as it should. Can you point me in the right direction, where's the querying code? I reckon it's on a per-command basis?

@tjayrush
Copy link
Member

tjayrush commented Aug 21, 2024 via email

@walrusmustgo
Copy link
Author

Ok,i've been messing around a bit with the code, but i'm yet to find something useful.. I'll be deploying a new index today/tomorrow, and start from scratch, and let you know!

@tjayrush
Copy link
Member

We found a bug today related to the --staging option. It was very clearly not working correctly. This will be fixed in the next release (which we are preparing now).

The --unripe option remains un-implemented, but we're hoping to add this (fairly easy) feature soon. We need it for our own work on trueblocks-browse.

@walrusmustgo
Copy link
Author

oooooh, that's really wonderful to hear!!! Can't wait :) Sorry i didn't manage to find this myself, it's been really hectic the last couple of months,hadn't had 1 minute for side-projects unfortunately :( This's really good news though, because i've needed this to move my workflow to TB entirely :)

@tjayrush
Copy link
Member

We're working on our own desktop app, so we needed these features as well. Reminder that the --unripe part still doesn't work currently. My comment above was about the staged data. We'll be fixing the --unripe part early next year...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants