Entropy check workflow in ResetDevice #4155

andrewkozlik · 2024-09-05T13:34:57Z

This PR implements an automated workflow that allows the host to verify that when Trezor generates the seed, it correctly includes the external entropy from the host. The host performs a randomized test asking Trezor to generate several seeds, checking that they were generated correctly and using the last one as the final seed.

Documentation:

TODO:

Device tests.

Each iteration of the entropy check adds about 2 seconds to the ResetDevice workflow on Trezor T with SLIP-39. A lot of that time is taken up by the seed derivation, so we probably can't do much better.

@onvej-sl please review the security of the proposed workflow.
@matejcik please do a high-level review of the workflow, namely the protobuf messages and how they are handled. Let's discuss what's the best way to pass the staged seed to the get_public_key() handler. Currently I use the storage cache (APP_STAGED_MNEMONIC_SECRET).
Let's leave the detailed code review until after we are happy with the security and design of the workflow.

github-actions · 2024-09-05T14:00:25Z

core UI changes	device test	click test	persistence test
T2T1 Model T	test(screens) main(screens)	test(screens) main(screens)	test(screens) main(screens)
T3B1 Safe 3	test(screens) main(screens)	test(screens) main(screens)	test(screens) main(screens)
T3T1 Safe 5	test(screens) main(screens)	test(screens) main(screens)	test(screens) main(screens)
All	main(screens)

onvej-sl · 2024-09-18T12:18:43Z

The protocol seems secure with two reservations highlighted in bold in the following text.

My first question was: What kind of probability distribution the suite should use to generate entropy_check_count?

First, let's look at the problem from the perspective of a fake Trezor. Assume the attacker knows the probability distribution used to generate entropy_check_count. The attacker wants to maximize the success of the attack, where the attacker's strategy is to generate the seed honestly n-1 times and dishonestly on the n-th attempt. The success probability of this attack is Pr[entropy_check_count = n]. The attacker has the highest success rate if they choose n such that Pr[entropy_check_count = n] is maximal, in other words, n should be the mode of the entropy_check_count distribution.

Now, let's examine the problem from the suite's perspective. Assume that the suite wants the attack probability to be at most e. The suite seeks a distribution that generates on average the smallest possible values of entropy_check_count. In other words, it looks for a distribution with a mode at most e and the smallest expected value.

It can be shown that if e = 1/i for natural i, the desired distribution is the uniform dicrete distribution over the set 1..n, which is not surprising.

It follows that if the attack's success probability needs to be at most e, then entropy_check_count should be a uniform discrete distribution over 1..ceil(1/e), and the average number of rounds will be (ceil(1/e) + 1)/2.

My second question was: Does the protocol protects against an attacker who honestly generates the seed but generates addresses dishonestly?

The answer is that it can only protect those addresses for which the passphrase and all parts of the path up to the last hardened index are known at the time this protocol runs. This typically means the passphrase, purpose, coin_type, and account.

matejcik · 2024-09-30T08:07:06Z

@matejcik please do a high-level review of the workflow, namely the protobuf messages and how they are handled. Let's discuss what's the best way to pass the staged seed to the get_public_key() handler. Currently I use the storage cache (APP_STAGED_MNEMONIC_SECRET).

high-level flow looks fine to me

instead of messing with the root sources for get_seed() and get_secret(), i'd pass an optional keychain: Keychain | None to get_public_key handler, and have the reset device flow construct a Keychain instance with the local copy of the seed.
(there's also probably no reason to go through workflow_handlers, you can directly import the get_public_key app)

andrewkozlik · 2024-10-23T12:48:53Z

Rebased on top of main.

there's also probably no reason to go through workflow_handlers, you can directly import the get_public_key app

Done in 2a5d8f2.

have the reset device flow construct a Keychain instance with the local copy of the seed

Done in 0fed6cc, but I have to say I am not entirely convinced it's for the better.

If at some point we wanted to use CardanoGetPublicKey, SolanaGetPublicKey or TezosGetPublicKey in the entropy check flow, or some other messages, then I think the implementation which involves constructing a Keychain instance with the local copy of the seed would start getting messy, because each handler would need to be modified and each message would need to be addressed independently, not to mention that the listed messages use keychain decorators, which would need to be circumvented somehow. On the other hand, I doubt that we really need any other messages other than plain GetPublicKey, because it can be used to access nearly any BIP 32 node, and AFAICS, the host can translate between the XPUB format and the special Solana or Tezos format. Cardano is a different story though, because it uses a special seed derivation, which is why I used the word nearly. In the implementation where Keychain is passed to get_public_key() this would again need special handling. Currently as it's written I didn't bother with replicating the Cardano logic from derive_and_store_roots(), because CardanoGetPublicKey is not supported.

In terms of implementation complexity at this moment I don't see much difference between the two options. In terms extensibility and avoiding code duplication in the future, I think the original solution which messes with the root source for get_secret() is better, because, for example, adding support for CardanoGetPublicKey is simpler. That being said, I think both implementations are acceptable.

andrewkozlik · 2024-11-25T09:53:23Z

Let's get this review moving, please.

matejcik · 2024-12-20T14:33:37Z

squashed & rebased your work, with minimal changes so far, so that we can get the CI going. i intend to make some changes for your review that will be easier to do this way

matejcik

ACK on functionality.

I have more comments on Python library side, but I tried to implement my own suggestions and decided to spin it off into a separate PR with somewhat larger refactorings. Good to be merged in this PR as-is.

While reading the code in detail, I realized that there is a slightly better way to do the protobuf data structures:

Trezor should not be returning Success when the entropy check is not finished. That's simply misleading. Instead, I introduced a special message EntropyCheckReady for this state.
With that, I decided to fold ResetDeviceContinue and ResetDeviceFinish into a single message EntropyCheckContinue(finish: bool), which seems slightly tidier: this way, the overall number of MessageTypes does not need to grow.

Implementing these changes leads to a more readable docs section in the .proto files, which now e.g. allows us to document that we support GetPublicKey in the check phase (and we can extend that if we add more messages).
Modification in firmware is very simple, and the same thing is trivial in trezorlib too, so even if Connect already started to work on this, they should be able to adapt.

@andrewkozlik please look over the branch and comment, we can merge into this PR if you're OK with it.

common/protob/messages-management.proto

legacy/firmware/protob/Makefile

core/src/apps/management/reset_device/__init__.py

common/protob/messages.proto

matejcik · 2024-12-20T16:12:03Z

entropy is broken for all tests for reset_device -- i assume that is expected, due to (probably??) changed order of storage writes?

matejcik · 2024-12-21T08:58:11Z

we'll also need some tests that validate expected_responses for the old and the new flow

andrewkozlik · 2024-12-27T18:18:53Z

Merged @matejcik's fixups and updated docs/common/message-workflows.md.

andrewkozlik · 2024-12-27T20:38:45Z

entropy is broken for all tests for reset_device -- i assume that is expected, due to (probably??) changed order of storage writes?

Correct. This is because set_slip39_iteration_exponent() and set_backup_type() are now called before random.bytes(), instead of after, which makes for cleaner code. (Writing encrypted entries to storage makes use of the random number generator.)

matejcik

spun off a separate issue #4464 for the outstanding items, let's merge this

[no changelog]

andrewkozlik added core Trezor Core firmware. Runs on Trezor Model T and T2B1. trezorlib Python library and the command line trezorctl tool. R&D Research and development team related labels Sep 5, 2024

andrewkozlik requested a review from onvej-sl September 5, 2024 13:34

andrewkozlik self-assigned this Sep 5, 2024

andrewkozlik force-pushed the andrewkozlik/display_random branch from 78fe225 to ddb3640 Compare September 5, 2024 13:38

andrewkozlik requested a review from matejcik September 5, 2024 13:39

andrewkozlik added the translations Put this label on a PR to run tests in all languages label Oct 23, 2024

andrewkozlik force-pushed the andrewkozlik/display_random branch from ddb3640 to ce803a5 Compare October 23, 2024 12:27

andrewkozlik marked this pull request as ready for review October 23, 2024 12:51

andrewkozlik requested a review from prusnak as a code owner October 23, 2024 12:51

andrewkozlik removed request for prusnak and onvej-sl October 23, 2024 12:51

szymonlesisz mentioned this pull request Dec 10, 2024

Feat/connect: Entropy check workflow in ResetDevice trezor/trezor-suite#15887

Merged

4 tasks

matejcik force-pushed the andrewkozlik/display_random branch from ce803a5 to 294661e Compare December 20, 2024 14:33

matejcik removed the translations Put this label on a PR to run tests in all languages label Dec 20, 2024

matejcik requested changes Dec 20, 2024

View reviewed changes

andrewkozlik added the translations Put this label on a PR to run tests in all languages label Dec 27, 2024

matejcik mentioned this pull request Jan 2, 2025

Entropy check: improve Python API and tests #4464

Open

matejcik approved these changes Jan 2, 2025

View reviewed changes

andrewkozlik added 6 commits January 2, 2025 12:19

feat(common): Add messages for entropy check workflow.

2dd82ff

[no changelog]

feat(core): Implement entropy check workflow in ResetDevice.

3fc6284

chore(python): Add shamir-mnemonic and SLIP10 packages.

74ebb52

feat(python): Implement entropy check workflow in device.reset().

5fedb6d

feat(tests): Tests for entropy check workflow in ResetDevice.

99cffb9

chore(core): Update fixtures.

38557d4

andrewkozlik force-pushed the andrewkozlik/display_random branch from f78335c to 38557d4 Compare January 2, 2025 11:21

andrewkozlik merged commit 57868ad into main Jan 2, 2025
138 of 139 checks passed

andrewkozlik deleted the andrewkozlik/display_random branch January 2, 2025 12:44

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Entropy check workflow in ResetDevice #4155

Entropy check workflow in ResetDevice #4155

andrewkozlik commented Sep 5, 2024

github-actions bot commented Sep 5, 2024 •

edited

Loading

onvej-sl commented Sep 18, 2024

matejcik commented Sep 30, 2024 •

edited

Loading

andrewkozlik commented Oct 23, 2024

andrewkozlik commented Nov 25, 2024

matejcik commented Dec 20, 2024

matejcik left a comment •

edited

Loading

matejcik commented Dec 20, 2024

matejcik commented Dec 21, 2024

andrewkozlik commented Dec 27, 2024

andrewkozlik commented Dec 27, 2024

matejcik left a comment

Entropy check workflow in ResetDevice #4155

Entropy check workflow in ResetDevice #4155

Conversation

andrewkozlik commented Sep 5, 2024

github-actions bot commented Sep 5, 2024 • edited Loading

onvej-sl commented Sep 18, 2024

matejcik commented Sep 30, 2024 • edited Loading

andrewkozlik commented Oct 23, 2024

andrewkozlik commented Nov 25, 2024

matejcik commented Dec 20, 2024

matejcik left a comment • edited Loading

Choose a reason for hiding this comment

matejcik commented Dec 20, 2024

matejcik commented Dec 21, 2024

andrewkozlik commented Dec 27, 2024

andrewkozlik commented Dec 27, 2024

matejcik left a comment

Choose a reason for hiding this comment

github-actions bot commented Sep 5, 2024 •

edited

Loading

matejcik commented Sep 30, 2024 •

edited

Loading

matejcik left a comment •

edited

Loading