Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Save transactions in database #1180

Merged
merged 7 commits into from
Aug 2, 2024
Merged

Conversation

jp1ac4
Copy link
Collaborator

@jp1ac4 jp1ac4 commented Jul 9, 2024

This is the first step of #56 (comment).

The poller will now save transactions in our own database. These transactions are selected based on the deposit and spend transactions of coins. Only the txid and transaction itself are saved, with the corresponding block height and time taken from the coins table.

In a couple of follow-up commits, I've replaced some RPC calls to bitcoind with DB queries.

@jp1ac4 jp1ac4 self-assigned this Jul 9, 2024
@nondiremanuel nondiremanuel added this to the v7 - Liana milestone Jul 9, 2024
src/database/sqlite/mod.rs Show resolved Hide resolved
src/database/sqlite/mod.rs Show resolved Hide resolved
src/commands/mod.rs Outdated Show resolved Hide resolved
src/testutils.rs Outdated Show resolved Hide resolved
@edouardparis
Copy link
Member

I think list_missing_txids may fetch a too big batch of txids. I was thinking maybe it is better to loop over list_txids: https://github.com/wizardsardine/liana/blob/master/src/database/mod.rs#L148 from current time to genesis block timestamp, and for unconfirmed transactions use the coins(&[Status::Unconfirmed], ...)

@jp1ac4
Copy link
Collaborator Author

jp1ac4 commented Jul 17, 2024

Perhaps yes, although in that case, we should probably also do the same for the coins() method, as the max number of missing txids should be of a similar order to the total number of coins in the DB.

@jp1ac4
Copy link
Collaborator Author

jp1ac4 commented Jul 18, 2024

I'll change list_missing_txids() to use a loop.

@jp1ac4
Copy link
Collaborator Author

jp1ac4 commented Jul 18, 2024

I used a loop to get the missing txids using ROW_NUMBER() as it seemed the simplest way. Most of the time, we don't expect there to be many missing values, only when the transactions table is initially populated for an existing wallet or perhaps if the poller has not been run for a long time.

@edouardparis
Copy link
Member

edouardparis commented Jul 18, 2024

Sorry If I was unclear, my point was to fetch missing txids and store transactions by batch in order to not keep a whole set in memory.

In fact, i am thinking that we do not need to paginate txids to do that, just fetch 20 txids from the coins table where the txid is not in the txid field of the table transactions, fetch their raw tx with bitcoind then store them in transactions table and repeat until no txids are fund.

@jp1ac4
Copy link
Collaborator Author

jp1ac4 commented Jul 18, 2024

Ah I see, yep that makes sense, will change it.

@jp1ac4
Copy link
Collaborator Author

jp1ac4 commented Jul 18, 2024

In fact, i am thinking that we do not need to paginate txids to do that, just fetch 20 txids from the coins table where the txid is not in the txid field of the table transactions, fetch their raw tx with bitcoind then store them in transactions table and repeat until no txids are fund.

I couldn't quite do it like this in the end as there's a chance we won't be able to retrieve the transaction from bitcoind, and so the loop might have continued forever. Instead, I get all missing txids in one query still, but then retrieve chunks of transactions at a time so that we don't store them all in memory at once. As the transactions are larger than txids, and we already store the same order of coins in memory as there are txids, I felt it was fine to do it this way.

src/commands/mod.rs Show resolved Hide resolved
src/commands/mod.rs Show resolved Hide resolved
Copy link
Member

@darosior darosior left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should do something closer to what we do for the backend and its poller:

  • Have the coins table's txid row reference a transaction in the transactions table (thereby introducing a constraint that a transaction must exist in database for any coin);
  • When inserting new coins to the database, first insert the corresponding transaction.

Of course this means you need access to transactions at the time of the migration. I'm still thinking about ways to reduce ugliness of doing that.

src/bitcoin/poller/looper.rs Outdated Show resolved Hide resolved
@jp1ac4
Copy link
Collaborator Author

jp1ac4 commented Jul 24, 2024

Of course this means you need access to transactions at the time of the migration. I'm still thinking about ways to reduce ugliness of doing that.

Perhaps we could add a migration step in Poller::new() around the same place we call looper::maybe_initialize_tip(&bit, &db).

@darosior
Copy link
Member

Alright, so here is how i think we should proceed:

  1. Separate migration from the SQLite database constructor
  2. When setting up SQLite at init, first poll the database version before proceeding with any migration
  3. If the database version is one for which Bitcoin transactions weren't stored in DB, query them from bitcoind
  4. Only then proceed with the SQLite migration and pass the Bitcoin transactions which need to be stored

At the expense of making the code less encapsulated and a bit harder to follow, this gives us two nice robustness properties:

  • If any Bitcoin transaction is missing (which should never happen ™️) the migration will fail.
  • For any utxo in DB its transaction is always in DB too.

I've implemented this on this branch: https://github.com/darosior/liana/commits/pr_1180. Feel free to take anything from there, of course.

@jp1ac4
Copy link
Collaborator Author

jp1ac4 commented Jul 30, 2024

Thanks for the updated approach and code.

I wonder whether for such migrations that need to query the DB (to get the txids in the present case), we should first migrate the DB to the most recent version, then query the DB, and then apply the final migration. Although I don't think it would cause any problems in the present case, the DB query used in the current code base may not always be compatible with the user's DB version.

Perhaps this could be done by passing the bitcoind connection to maybe_apply_migrations instead of the transactions themselves?

@darosior
Copy link
Member

I think we should strive to make migrations atomics: your database is either fully compatible with the previous version or the next version. For instance with what you suggest a user could use Liana version N+1 with a database created with Liana version N, apply the first migration, for whatever reason encounter an error a crash or just stop the application before applying the second migration, and be stuck on a database which can neither be used by Liana version N nor by Liana version N+1.

If the existing query methods on the SQLite connection struct were not appropriate to an old database we could just write a custom one for this purpose.

Perhaps this could be done by passing the bitcoind connection to maybe_apply_migrations instead of the transactions themselves?

I precisely tried to avoid entangling the Bitcoin and Database modules.

@jp1ac4
Copy link
Collaborator Author

jp1ac4 commented Jul 30, 2024

Sorry, I wasn't clear in my description. I didn't mean that the migration from DB versions 4 to 5 would be split in two separate steps. Each migration from one DB version to the following would still be atomic.

So the function migrate_v4_to_v5 would still perform all changes on the DB. But instead of receiving the txs, it would instead get these from bitcoind as an initial step within the function.

But I agree that keeping in mind the potential need to modify DB queries for previous DB versions should be sufficient. Indeed, that could still be required using my approach if version N - 1 is different.

I'll proceed using your code and start adding it into this PR.

@jp1ac4
Copy link
Collaborator Author

jp1ac4 commented Jul 30, 2024

Does bitcoind's watchonly wallet keep transactions that are no longer in the best chain or were never confirmed and are no longer in the mempool?

@darosior
Copy link
Member

Yes.

@jp1ac4
Copy link
Collaborator Author

jp1ac4 commented Jul 31, 2024

I've now updated it to use foreign key constraints from the coins to transactions tables and to retrieve transactions as part of the migration.

src/lib.rs Show resolved Hide resolved
src/lib.rs Show resolved Hide resolved
src/database/sqlite/mod.rs Show resolved Hide resolved
Copy link
Member

@darosior darosior left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few comments as i read through the diff, which looks good.

src/database/mod.rs Outdated Show resolved Hide resolved
src/bitcoin/poller/looper.rs Outdated Show resolved Hide resolved
src/bitcoin/poller/looper.rs Outdated Show resolved Hide resolved
src/database/sqlite/mod.rs Outdated Show resolved Hide resolved
src/database/sqlite/mod.rs Show resolved Hide resolved
src/database/sqlite/utils.rs Outdated Show resolved Hide resolved
src/lib.rs Show resolved Hide resolved
src/lib.rs Show resolved Hide resolved
src/database/sqlite/mod.rs Outdated Show resolved Hide resolved
src/database/sqlite/mod.rs Show resolved Hide resolved
src/database/sqlite/mod.rs Show resolved Hide resolved
@darosior
Copy link
Member

darosior commented Aug 1, 2024

I've written a unit to specifically exercise the migration from v4 to v5, including the failure case: darosior@c9c82f2.

It only trips on the last check which compares the transactions from DB with the transactions created in the test because i think your list_wallet_transactions returns duplicates. See #1180 (comment).

@jp1ac4
Copy link
Collaborator Author

jp1ac4 commented Aug 1, 2024

The failing test may be due to the spend_txid having multiple different block infos, whereas list_wallet_transactions assumes that a given txid can't have inconsistent block info in the coins table.

@jp1ac4
Copy link
Collaborator Author

jp1ac4 commented Aug 1, 2024

I could add a corresponding assertion to list_wallet_transactions to catch duplicates, as that would indicate corrupt DB data.

@darosior
Copy link
Member

darosior commented Aug 1, 2024

The failing test may be due to the spend_txid having multiple different block infos, whereas list_wallet_transactions assumes that a given txid can't have inconsistent block info in the coins table.

Ah, good catch. Fixing it now.

darosior and others added 2 commits August 2, 2024 09:39
This is a slightly modified version of darosior's commit:
d826f9f
This includes changes from darosior's commits:
3e7d968
e2225a5
@jp1ac4
Copy link
Collaborator Author

jp1ac4 commented Aug 2, 2024

I think I've now made all the requested changes.

src/database/sqlite/mod.rs Outdated Show resolved Hide resolved
@darosior
Copy link
Member

darosior commented Aug 2, 2024

Tested migrating from v4.0, v5.0 and v6.0 using the migration functional test.

@darosior
Copy link
Member

darosior commented Aug 2, 2024

Tested by upgrading a mainnet wallet. No issue.

@darosior
Copy link
Member

darosior commented Aug 2, 2024

ACK 3953a0b

@darosior
Copy link
Member

darosior commented Aug 2, 2024

re-ACK bf1e90e

@darosior darosior merged commit 90df14c into wizardsardine:master Aug 2, 2024
21 checks passed
@jp1ac4
Copy link
Collaborator Author

jp1ac4 commented Aug 2, 2024

Thanks for the review and additional commits 😃

@jp1ac4 jp1ac4 deleted the save-txs-in-db branch August 2, 2024 10:27
@darosior
Copy link
Member

darosior commented Aug 2, 2024 via email

@jp1ac4 jp1ac4 mentioned this pull request Aug 8, 2024
7 tasks
darosior added a commit that referenced this pull request Sep 5, 2024
341f940 func test: prevent disconnects when using mocktime (Michael Mallan)
b630d46 func test: wait for block heights to match (Michael Mallan)
ada89c5 doc: update with electrum info (Michael Mallan)
177dfc4 lib: expose BDK's electrum client (Michael Mallan)
72c6314 ci: add electrs to func tests (jp1ac4)
371e31e func test: allow to run using electrs backend (jp1ac4)
a85d488 func test: allow for different bitcoin backends (jp1ac4)
1b04b29 func test: fix min rbf feerate (jp1ac4)
c7ee862 bitcoin: add electrum backend (jp1ac4)
c4c2424 bitcoin: expose MempoolEntryFees (jp1ac4)
5011ad9 bitcoin: return spent block height & time separately (jp1ac4)
689442c bitcoin: allow to store UTXO deriv index (jp1ac4)
4c02b0d bitcoin: use mut ref for start_rescan (Michael Mallan)
69259c1 poller: sync wallet before checking updates (jp1ac4)
89e004d bitcoin: add sync_wallet method to interface (jp1ac4)
34b9a49 config: add general bitcoin backend option (jp1ac4)
e267f66 descriptors: allow to get underlying public key (jp1ac4)

Pull request description:

  This is to add the Electrum backend as part of #56 (comment).

  This requires the database to be running version 5 following #1180. The migration from a previous DB version must be done using bitcoind. Thereafter, the daemon can be run using an Electrum backend by replacing the `[bitcoind_config]` section in the daemon.toml config file with:
  ```
  [electrum_config]
  addr = "127.0.0.1:50001" # adjust IP:port as required
  ```
  Remaining tasks:

  - [x] Include ancestors and descendants when getting a transaction's `MempoolEntry`
  - [ ] Additional sanity checks, e.g. check Electrum version
  - [x] Check if logic regarding ongoing rescan needs to be adjusted
  - [x] Add Electrum backend to functional tests
  - [x] Add Electrum backend to CI pipline functional tests
  - [ ] Add unit tests
  - [x] Update README & other docs

ACKs for top commit:
  darosior:
    re-ACK 341f940

Tree-SHA512: dbbb375123cc5c566f5c8535d002ac4491b4be5ab2766845b5b7ab51d19e7e85eafc0097235e1ebc5c5a049bd2222ab5582264314e46c77b5fff8027da31b803
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

5 participants