Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement -f option to force inherit key with zfs change-key -if #15821

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

digitalsignalperson
Copy link

@digitalsignalperson digitalsignalperson commented Jan 25, 2024

This implements functionality already available in pyzfs, i.e. libzfs_core.lzc_change_key(b"tank/enc/data", "force_inherit"). Unlike zfs change-key -i the key is not required to be loaded. Work-around to #15687.

Motivation and Context

If you have two replicas of the same encryption root, if you create a new dataset on one side and send it to the other, the received dataset will be its own encryption root rather than inheriting as desired. This makes it difficult to do replication with raw encrypted sends if you ever create new datasets.

In zfs receive there is a recv_fix_encryption_hierarchy function that normally fixes this by force-inheriting the encryptionroot, but currently it only works for -R replication streams which include both the encryption root parent and child. The current way to solve this requires loading the key on the received side and doing zfs change-key -i my/received/dataset on the dataset. This is a pain if you just received 100 datasets. If your keylocation is prompt, you'll be prompted for the password 100 times by zfs load-key unless you write a script to handle it. But we shouldn't need to load the key to do this anyway. And we might have a remote server we do not want to load the key on.

The workaround I just discovered is to use pyzfs, which exposes all the commands to lzc_change_key: python -c 'import libzfs_core; libzfs_core.lzc_change_key(b"my/received/dataset", "force_inherit")'.

This PR adds the same functionality present in pyzfs by adding an -f option to zfs change-key to be used with -i. This is a work-around to #15687 until there is a way to auto-detect when an non-recursive received encryptionroot should be force-inherited (which I'm unclear if it is possible).

I've also made some notes here about some partial success trying to patch libzfs_sendrecv.c. But the question to solving that is something like: is there a way to detect that a dataset has the same key as another, without the keys being loaded? Something like the guid property but that is immutable across replication and is set when the key is set. I think the hidden salt property would satisfy this, but only if PBKDF2 password encryption is used.

Description

  • added a -f option to zfs change-key that when used with -i results in the DCP_CMD_FORCE_INHERIT command being used.

Here's a full example of using this in the intended scenario:

# Create test pool
dd if=/dev/zero of=/root/zpool bs=1M count=128
zpool create testpool /root/zpool -m /mnt/zpool

# Create encryptionroot and some datasets
echo "12345678" | zfs create -o canmount=off -o encryption=on -o keylocation=prompt -o keyformat=passphrase testpool/enc
zfs create testpool/enc/data1
zfs create testpool/enc/data2
touch /mnt/zpool/enc/data1/x
touch /mnt/zpool/enc/data2/x
zfs snapshot -r testpool/enc@1

# Make a recursive copy of the encryption root
zfs send -Rw testpool/enc@1 | zfs recv testpool/enc_copy

# Make a new dataset on the original encryption root and try to send to the new one
zfs create testpool/enc/data3
touch /mnt/zpool/enc/data3/x
zfs snapshot testpool/enc/data3@x
zfs send -w testpool/enc/data3@x | zfs recv testpool/enc_copy/data3

# Check the encryptionroot
zfs get encryptionroot

# Fix the encryptionroot:
#python -c 'import libzfs_core; libzfs_core.lzc_change_key(b"testpool/enc_copy/data3", "force_inherit")'
zfs change-key -if testpool/enc_copy/data3

# Check the encryptionroot
zfs get encryptionroot

The encryptionroots before the zfs change-key are:

zfs get encryptionroot -t filesystem
NAME                     PROPERTY        VALUE                    SOURCE
testpool                 encryptionroot  -                        -
testpool/enc             encryptionroot  testpool/enc             -
testpool/enc/data1       encryptionroot  testpool/enc             -
testpool/enc/data2       encryptionroot  testpool/enc             -
testpool/enc/data3       encryptionroot  testpool/enc             -
testpool/enc_copy        encryptionroot  testpool/enc_copy        -
testpool/enc_copy/data1  encryptionroot  testpool/enc_copy        -
testpool/enc_copy/data2  encryptionroot  testpool/enc_copy        -
testpool/enc_copy/data3  encryptionroot  testpool/enc_copy/data3  -

and after it is fixed:

zfs get encryptionroot testpool/enc_copy/data3
NAME                     PROPERTY        VALUE              SOURCE
testpool/enc_copy/data3  encryptionroot  testpool/enc_copy  -

In the test script I also ensured we can load the key and mount the dataset after this.

How Has This Been Tested?

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

#

#
# Copyright (c) 2017 Datto, Inc. All rights reserved.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I copied and modified parts from zfs_change-key_inherit.ksh and zfs_receive_raw.ksh, but not sure if I'm supposed to keep these same copyright notices.

This implements functionality already available in pyzfs
i.e. libzfs_core.lzc_change_key(b"tank/enc/data", "force_inherit")
Unlike zfs change-key -i, the key is not required to be loaded.
Work-around to openzfs#15687.

Signed-off-by: andy <[email protected]>
@digitalsignalperson
Copy link
Author

sorry was missing the .abi file change, pushed change

@rincebrain
Copy link
Contributor

rincebrain commented Jan 25, 2024

Look at the changes around bookmark_v2 for not causing comedy with encrypted datasets and bookmarks, for another case where people wanted to track things about keys.

Does this -f work without -i? Because that would be useful for some bugs in native encryption incorrectly inheriting encryptionroot.

@digitalsignalperson
Copy link
Author

digitalsignalperson commented Jan 25, 2024

Does this -f work without -i? Because that would be useful for some bugs in native encryption incorrectly inheriting encryptionroot.

The pyzfs lzc_change_key() also exposes a "force_new_key" command, so I was thinking that the -f could be use for that too. It's currently not implemented. I'll have to take a closer look at how forcing a new key is supposed to work. Have an example of a specific bug that would be relevant in this context?

Look at the changes around bookmark_v2

Hmm yes there must be some interesting bits to understand around how bookmark_v2/extensible_dataset are required for and used with encryption. I see the keyword "IVset guids" in the module which sounds like what is needed. Also maybe the hidden salt property is generated for all key types per "Both encryption and decryption functions need a salt for key generation and an IV". If so that could also serve to uniquely identify the encryption root and is easily accessible.

Is there any way to see the bookmark_v2 data from any userspace tools? They are invisible to zfs list -t bookmark and lcz_get_bookmarks()

Side-note: I realize that identical raw snapshots/bookmarks share the same guids. That could be useful but is definitely more comedy/hoops to jump through

$ zfs get guid testpool/enc@1
NAME            PROPERTY  VALUE                SOURCE
testpool/enc@1  guid      6388058723774087726  -
$ zfs get guid testpool/enc_copy@1
NAME                 PROPERTY  VALUE                SOURCE
testpool/enc_copy@1  guid      6388058723774087726  -

Edit: this gives some more insight into tracking keys

zfs send -w testpool/enc@1 | zstream dump > enc.txt
zfs send -w testpool/enc/data1@1 | zstream dump > enc_data1.txt
diff -U 10000 --color=always enc.txt enc_data1.txt
--- enc.txt     2024-01-26 01:33:26.534364141 +0000
+++ enc_data1.txt       2024-01-26 01:33:34.527630013 +0000
@@ -1,51 +1,51 @@
 BEGIN record
        hdrtype = 1
        features = 1420004
        magic = 2f5bacbac
        creation_time = 65b26876
        type = 2
        flags = 0xc
-       toguid = 58a6f1905ae3c22e
+       toguid = cfcd3871d2893d3d
        fromguid = 0
-       toname = testpool/enc@1
+       toname = testpool/enc/data1@1
        payloadlen = 1028
 nvlist version: 0
        crypt_keydata = (embedded nvlist)
        nvlist version: 0
                DSL_CRYPTO_SUITE = 0x8
-               DSL_CRYPTO_GUID = 0xed69203ad8866ea4
+               DSL_CRYPTO_GUID = 0xa3ad38dc4e94861d
                DSL_CRYPTO_VERSION = 0x1
-               DSL_CRYPTO_MASTER_KEY_1 = 0x74 0xb8 0x48 0x81 0xb3 0xe5 0xf6 0x1c 0xab 0xb0 0x9b 0x0 0xae 0x58 0x3e 0xbc 0x28 0x56 0x3a 0x74 0x27 0x1b 0xf1 0x13 0x86 0x4f 0xec 0x52 0xf3 0x9f 0xc8 0x50
-               DSL_CRYPTO_HMAC_KEY_1 = 0x83 0x4c 0x4f 0x3e 0x8e 0x7 0x9e 0x2a 0x86 0x4f 0xf 0xa0 0x72 0x95 0x76 0x7 0xd2 0x8e 0x81 0xc2 0xc1 0xef 0x94 0xc6 0x12 0xd5 0x2f 0xce 0xad 0x57 0xfc 0xbe 0x7b 0xca 0xc1 0xb0 0x56 0x7a 0x80 0xdb 0x8f 0x25 0x2 0x56 0xdc 0x86 0xb5 0x28 0x40 0x49 0xb1 0xbb 0x83 0x33 0x80 0x4b 0x5 0xc8 0xc5 0x3 0xdf 0xd7 0x84 0x94
-               DSL_CRYPTO_IV = 0x84 0xe 0xfc 0x26 0xfc 0x78 0x8e 0x3b 0xcd 0x18 0xb 0x61
-               DSL_CRYPTO_MAC = 0x2 0x1b 0x60 0xa9 0x79 0x6c 0xe8 0x51 0x15 0xd5 0x31 0x7d 0xf6 0x83 0x78 0x28
-               portable_mac = 0xd1 0x40 0xda 0xde 0x25 0x22 0x68 0xb8 0x2c 0x41 0xe 0x37 0x33 0x4b 0x61 0x34 0x39 0xa0 0xc2 0xcc 0xa8 0x1c 0x46 0x99 0xd 0xe3 0xf1 0x2d 0x99 0x7f 0x6e 0x47
+               DSL_CRYPTO_MASTER_KEY_1 = 0x57 0x5c 0x6c 0x8e 0xfc 0x9a 0x16 0x17 0x57 0xf6 0x64 0xef 0x79 0xfe 0x9d 0x43 0x6c 0xb8 0xa8 0xd5 0x79 0x43 0x8a 0x9d 0x18 0x2e 0xba 0xd1 0x4c 0x66 0xe8 0xcd
+               DSL_CRYPTO_HMAC_KEY_1 = 0xc 0xd0 0x96 0xdd 0x8d 0x42 0xd9 0x38 0xac 0x3d 0xc1 0xb2 0x98 0x60 0xc5 0x9c 0x89 0xcc 0xef 0x0 0x7a 0xa1 0xbc 0xfc 0x63 0xe1 0x73 0xf6 0x6a 0x80 0xf3 0x69 0x40 0xf9 0x1a 0x60 0xe6 0x37 0xa9 0x29 0x1 0xea 0xc4 0x20 0x6b 0x15 0xc5 0x9d 0x17 0xf0 0xdb 0x97 0x57 0x24 0x5a 0xe5 0x6e 0xa5 0x4c 0x1 0x74 0xbe 0x81 0xa0
+               DSL_CRYPTO_IV = 0xee 0xf2 0x40 0x4 0x4a 0x11 0x6c 0x83 0xfc 0xa7 0xd1 0x6f
+               DSL_CRYPTO_MAC = 0x4b 0xc5 0x19 0x33 0x63 0x6a 0x70 0xfc 0xec 0x73 0x7c 0x30 0x2c 0x87 0xb 0xae
+               portable_mac = 0xe0 0xce 0x3e 0x13 0x97 0xde 0x5a 0x7e 0xeb 0xd9 0xc4 0x9c 0xdf 0x18 0x69 0x8d 0x6f 0xdd 0x5c 0x8e 0xdc 0x99 0xbb 0x55 0xae 0x51 0x5a 0x78 0xa9 0x90 0x63 0x4
                keyformat = 0x3
                pbkdf2iters = 0x55730
                pbkdf2salt = 0x37383a1592a6d850
                mdn_checksum = 0x0
                mdn_compress = 0x0
                mdn_nlevels = 0x6
                mdn_blksz = 0x4000
                mdn_indblkshift = 0x11
                mdn_nblkptr = 0x3
                mdn_maxblkid = 0x1
-               to_ivset_guid = 0x3780fc940dcfdb
+               to_ivset_guid = 0x8068d7abddf742
                from_ivset_guid = 0x0
        (end crypt_keydata)
 
-END checksum = 410dc89a94e/1968dd1f61a78c/7529ad2ff8954aab/356a6a507b9c52cb
+END checksum = 415d5e2c3a3/19d3c91fd18184/78d028e22685a9a7/7d1d0d9d140c2cbd
 SUMMARY:
        Total DRR_BEGIN records = 1 (1028 bytes)
        Total DRR_END records = 1 (0 bytes)
        Total DRR_OBJECT records = 6 (320 bytes)
        Total DRR_FREEOBJECTS records = 2 (0 bytes)
        Total DRR_WRITE records = 7 (6656 bytes)
        Total DRR_WRITE_BYREF records = 0 (0 bytes)
        Total DRR_WRITE_EMBEDDED records = 0 (0 bytes)
        Total DRR_FREE records = 11 (0 bytes)
        Total DRR_SPILL records = 0 (0 bytes)
        Total records = 30
        Total payload size = 8004 (0x1f44)
        Total header overhead = 9360 (0x2490)
        Total stream length = 17364 (0x43d4)

as we can see here, the pbkdf2salt = 0x37383a1592a6d850 is the only common thing that links the child dataset to it's original encryption root. But I confirm creating an encyption root with a raw key results in pbkdf2salt = 0x0

@rincebrain
Copy link
Contributor

I believe ivsetguid and friends are visible but not listed in "get all".

I'm specifically thinking of...I don't have the bug handy, but one where you do a send of parent and parent/child from src to dst, then change-key the child, and incrementally send it back, and it incorrectly changes the key on the parent without actually triggering rekeying, so you get locked out, or something like that. (I vaguely recall it being that it changes one of the key properties when it shouldn't ,but that might be yet another bug.)

@digitalsignalperson
Copy link
Author

The output from zstream dump includes to_ivset_guid and from_ivset_guid. If I make an encryption root and snapshot it, the from_ivset_gui is 0x0 and the to_ivset_guid is some value. And if I make another snapshot and dump the incremental that value changes. So I still don't see an unchanging property associated with the encryptionroot.

e.g.

echo "12345678" | zfs create -o canmount=off -o encryption=on -o keylocation=prompt -o keyformat=passphrase testpool/enc
zfs snapshot testpool/enc@1
zfs snapshot testpool/enc@2
zfs send -w testpool/enc@1 | zstream dump -v > enc1.dump
zfs send -wI @1 testpool/enc@2 | zstream dump -v  > enc2.dump
diff enc1.dump enc2.dump | grep ivset
<               to_ivset_guid = 0x3105692094cb37
<               from_ivset_guid = 0x0
>               to_ivset_guid = 0x59dd9f0db4abc3
>               from_ivset_guid = 0x3105692094cb37

But maybe looking at the zsteam output is not correct still. Because I feel there must be a non-0x0 ivset_guid before a snapshot is taken since of course the dataset is encrypted before we ever snapshot it.

Trying to avoid needing to reference a snapshot, because both copies of the encryptionroot could delete all their snapshots and still have the same key without needing to reference a specific snapshot.

@rincebrain
Copy link
Contributor

well, if the value is "from" and "to", and the "from" of -i snap1 snap2 is the same as the "to" of sending snap1...

@digitalsignalperson
Copy link
Author

But if we look at the scenario trying to send a child from one encryption root to another, without that child already existing on the receiving side, there is no "from" to reference for the child

[root@archlinux ~]# zfs send -w testpool/enc@1 | zstream dump -v > enc1.dump
[root@archlinux ~]# zfs send -w testpool/enc/data1@1 | zstream dump -v > enc1child.dump
[root@archlinux ~]# cat enc1.dump | grep ivset
                to_ivset_guid = 0x618858b80ea1ae
                from_ivset_guid = 0x0
[root@archlinux ~]# cat enc1child.dump | grep ivset
                to_ivset_guid = 0x6a39ece5aee5f6
                from_ivset_guid = 0x0

So you'd have to include the ivset_guid of a snapshot of the encryptionroot to look for on the other side, and where the snapshot may or may not exist.

@rincebrain
Copy link
Contributor

I think I no longer understand the question you're trying to answer.

@digitalsignalperson
Copy link
Author

All good. I'm trying to answer: can we send a child dataset from one encryption root (e.g. testpool/enc) to an identical copy of that encryption root (e.g. testpool/enc_copy), without having to manually force_inherit the key on the receive side.

For example:

echo "12345678" | zfs create -o canmount=off -o encryption=on -o keylocation=prompt -o keyformat=passphrase testpool/enc
zfs create testpool/enc/data1
zfs snapshot -r testpool/enc@1
zfs send -Rw testpool/enc@1 | zfs recv testpool/enc_copy

Here testpool/enc and testpool/enc_copy are identical, have the same keys per the raw send, and contain a "data1" child. Let's make a new "data2" child on the original encryption root and send it to the enc_copy

zfs create testpool/enc/data2
zfs snapshot testpool/enc/data2@x
zfs send -w testpool/enc/data2@x | zfs recv testpool/enc_copy/data2

zfs get encryptionroot

The received encryption root is testpool/enc_copy/data2, but I want it to automatically know to inherit it to testpool/enc_copy

The current ways of manually fixing it are

  • python -c 'import libzfs_core; libzfs_core.lzc_change_key(b"testpool/enc_copy/data2", "force_inherit")'
  • or with this PR it could be zfs change-key -if testpool/enc_copy/data2
  • or otherwise zfs load-key -r testpool/enc_copy && zfs change-key -i testpool/enc_copy/data2

I expect to often be creating new datasets in an encryptionroot and wanting those to be replicated without needing any of this key inheriting or key loading on the received side. I was working around this by including a few hundred empty datasets in the initial zfs send -Rw that I could start using at any time without needing to do this dance.

Hopefully that clarifies the problem. And from there in trying to solve it, I think it requires the zfs send stream contain some information like "This is the UUID of the encryption root on the send side. If you find it on the receive side, force inherit the encryption root to it". And the last couple comments was around trying to find a suitable UUID parameter. Worst case, the snapshots on the encryptionroot have been deleted on the send and receive side, so I am looking for something that does not depend on a snapshot. Even with no snapshots on the encryption root, it still has the expected key.

One property that would satisfy this is pbkdf2salt which is identical between encryption root parent and child. But it is only used for password keys. Versus to_ivset_guid/from_ivset_guid change both between parent and child, and between snapshots.

@rincebrain
Copy link
Contributor

I mean, the actual key encrypted datasets use is per-dataset, so the encryptionroot is just a setting for what the wrapper key is using, and doesn't really matter.

You appear to be trying to solve problems that have already been solved.

@digitalsignalperson
Copy link
Author

Sorry I might be out of my depth and not making sense as far as a solution, but at least does my problem description sense?

Trying to re-state the problem as simple as possible: when you raw send a single encrypted dataset from "mypool/enc/data" to "mypool/enc_copy/data", the received encryptionroot is "mypool/enc_copy/data" when it should be "mypool/enc_copy".

I am wanting to implement something so that zfs recv should automatically force inherit the encryptionroot on the received side so it ends up as "mypool/enc_copy".

The only way I know it's solved right now is to manually load the key on the received side and use zfs change-key -i, or otherwise use that nice force_inherit function exposed in the pyzfs api.

If nothing comes of this at least I can move forward with that pyzfs command and bake it into my replication system as a post-receive command.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants