This repository has been archived by the owner on Mar 24, 2023. It is now read-only.

Erasure-code namespace ID #146

Merged
merged 7 commits into from
Mar 19, 2021

Conversation

Member

@adlerjohn adlerjohn commented Mar 17, 2021

Fixes #145.

Note the proposed fix in #145 is actually not complete as it does not affect erasure coding, but rather only the NMT. This PR changes the format of non-parity shares themselves to be 8-byte namespace ID + 248 bytes of data, which is then erasure coded. This guarantees a power-of-2 invariant on share size, which is needed for proper erasure coding.
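A minimal sketch of the share layout described above, assuming the 8-byte namespace ID and 248 bytes of data from this description. The helper names `make_share` and `is_power_of_two`, the big-endian namespace encoding, and the example namespace value are illustrative assumptions, not taken from the spec.

```python
NAMESPACE_ID_BYTES = 8  # from the PR description
SHARE_SIZE = 256        # 8-byte namespace ID + 248 bytes of data


def make_share(namespace_id: bytes, data: bytes) -> bytes:
    """Prefix the raw data with its namespace ID to form a non-parity share."""
    if len(namespace_id) != NAMESPACE_ID_BYTES:
        raise ValueError("namespace ID must be exactly 8 bytes")
    if len(data) != SHARE_SIZE - NAMESPACE_ID_BYTES:
        raise ValueError("data must be exactly 248 bytes")
    return namespace_id + data


def is_power_of_two(n: int) -> bool:
    """True if n is a positive power of two."""
    return n > 0 and n & (n - 1) == 0


# Hypothetical namespace ID 42, encoded big-endian (an assumption).
share = make_share((42).to_bytes(NAMESPACE_ID_BYTES, "big"), bytes(248))
```

Because the namespace ID lives inside the share, the whole 256-byte unit (namespace ID included) is what gets erasure coded, and its size stays a power of two.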

@adlerjohn adlerjohn added the bug Something isn't working label Mar 17, 2021
@adlerjohn adlerjohn self-assigned this Mar 17, 2021
specs/data_structures.md (outdated review thread, resolved)
@adlerjohn adlerjohn marked this pull request as draft March 17, 2021 15:54
@adlerjohn
Member Author

Converting to draft while we figure out the full extent of changes required.

specs/data_structures.md (outdated review thread, resolved)
@adlerjohn adlerjohn marked this pull request as ready for review March 18, 2021 16:15
@adlerjohn adlerjohn changed the title Add namespace ID to NMT leaf hash Erasure-code namespace ID Mar 18, 2021
@liamsi
Member

liamsi commented Mar 18, 2021

Rendered version of the share section: https://github.com/lazyledger/lazyledger-specs/blob/adlerjohn-nmt_namespace_leaves/specs/data_structures.md#share

Member

@liamsi liamsi left a comment

TBH, this is still too ambiguous. I think I understand how this works, but I asked a lot of questions to be sure.


![fig: Reserved share.](./figures/share.svg)

Old: For shares **with a namespace ID above [`NAMESPACE_ID_MAX_RESERVED`](./consensus.md#constants)**, the first [`SHARE_RESERVED_BYTES`](./consensus.md#constants) bytes have no special meaning and are simply used to store data like all the other bytes in the share.

New: For shares **with a namespace ID above [`NAMESPACE_ID_MAX_RESERVED`](./consensus.md#constants)**, the first [`SHARE_RESERVED_BYTES`](./consensus.md#constants) bytes after [`NAMESPACE_ID_BYTES`](./consensus.md#constants) have no special meaning and are simply used to store data like all the other bytes in the share.

Member

This is how the whole paragraph currently renders, including the vector graphic:
[image]

Member

So because of this `*`, the raw data can be at most 247 = 256 - 1 - 8 bytes long, right?

Member

Ah, I see, the next section clarifies this: `SHARE_SIZE - NAMESPACE_ID_BYTES - SHARE_RESERVED_BYTES`
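The arithmetic in this thread can be checked with a short sketch. The values 256, 8, and 1 come from the comments above; the value of `NAMESPACE_ID_MAX_RESERVED` and the function name `payload_capacity` are assumptions made only for illustration.

```python
SHARE_SIZE = 256
NAMESPACE_ID_BYTES = 8
SHARE_RESERVED_BYTES = 1
NAMESPACE_ID_MAX_RESERVED = 255  # hypothetical value, for illustration only


def payload_capacity(namespace_id: int) -> int:
    """Bytes of raw data that fit into one share for the given namespace ID."""
    if namespace_id <= NAMESPACE_ID_MAX_RESERVED:
        # Reserved namespace: the byte(s) after the namespace ID are set specially.
        return SHARE_SIZE - NAMESPACE_ID_BYTES - SHARE_RESERVED_BYTES
    # Non-reserved namespace: the reserved bytes simply store data.
    return SHARE_SIZE - NAMESPACE_ID_BYTES
```

Under these assumptions a reserved-namespace share carries 247 bytes of raw data and any other share carries 248.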

Member

@liamsi liamsi Mar 19, 2021

Maybe the encoding is wrong in the implementation, but note that the index, if varint-encoded, could exceed one byte: celestiaorg/celestia-app#53 (comment)
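To illustrate the concern, here is a sketch of an unsigned base-128 varint encoder (7 payload bits per byte, high bit as continuation flag, the scheme used by Protocol Buffers). Whether the implementation uses exactly this encoding is an assumption; `encode_uvarint` is a hypothetical helper name.

```python
def encode_uvarint(value: int) -> bytes:
    """Encode a non-negative integer as an unsigned base-128 varint."""
    if value < 0:
        raise ValueError("value must be non-negative")
    out = bytearray()
    while value >= 0x80:
        # Emit the low 7 bits and set the continuation bit.
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    out.append(value)  # final byte, continuation bit clear
    return bytes(out)


one_byte = encode_uvarint(127)   # largest value that fits in one byte
two_bytes = encode_uvarint(128)  # smallest value that needs two bytes
```

Any index of 128 or more takes two bytes in this encoding, and byte offsets inside a 256-byte share can be that large, so a single reserved byte only works if the index is stored as a fixed-width integer instead.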

Member Author

Comment on lines 519 to 522:

Old: 1. Split up the length/request pairs into [`SHARE_SIZE`](./consensus.md#constants)`-`[`SHARE_RESERVED_BYTES`](./consensus.md#constants)-byte [shares](#share) and assign [the appropriate namespace ID](./consensus.md#reserved-namespace-ids). This data has a _reserved_ namespace ID, so the first [`SHARE_RESERVED_BYTES`](./consensus.md#constants) bytes for these shares must be [set specially](#share).

New: 1. Split up the length/request pairs into [`SHARE_SIZE`](./consensus.md#constants)`-`[`NAMESPACE_ID_BYTES`](./consensus.md#constants)`-`[`SHARE_RESERVED_BYTES`](./consensus.md#constants)-byte [shares](#share) and assign [the appropriate namespace ID](./consensus.md#reserved-namespace-ids). This data has a _reserved_ namespace ID, so the first [`NAMESPACE_ID_BYTES`](./consensus.md#constants)`+`[`SHARE_RESERVED_BYTES`](./consensus.md#constants) bytes for these shares must be [set specially](#share).

Context: 1. Concatenate the lists of shares in the order: transactions, intermediate state roots, evidence.

Member

So the concatenation is what makes it necessary to use the `*` thingy (the start index above)?
It feels like the text is missing a step and does not match the diagram: the concatenation here can refer to what leads to
[image]
because if you simply concatenate, tx3 would also carry a namespace.
Other than this concatenation, the text only mentions splitting up the requests into (share_size - 8 - 1)-byte shares.

Member

Also, with the above description (step 1), it is not clear whether there should be an associated NID for the length/request pair `len(tx3), tx3`.

Member Author

Uhh, okay, I see why the text might be confusing. What it's supposed to say is that the namespace ID of all transactions is the same, so only shares have a namespace ID, not individual transactions.

Member Author

Can clarify.
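The point of this thread (one namespace ID per share, not per transaction) can be sketched as follows. This is only an illustration under several assumptions that are not confirmed by the conversation: lengths are varint-encoded, the single reserved byte holds the in-share byte index of the first length/request pair that starts in that share (0 if none), the last share is zero-padded, and the names `encode_uvarint` and `split_into_shares` are hypothetical.

```python
from typing import List, Sequence

SHARE_SIZE = 256
NAMESPACE_ID_BYTES = 8
SHARE_RESERVED_BYTES = 1
HEADER_BYTES = NAMESPACE_ID_BYTES + SHARE_RESERVED_BYTES
PAYLOAD_BYTES = SHARE_SIZE - HEADER_BYTES  # 247


def encode_uvarint(value: int) -> bytes:
    """Unsigned base-128 varint (assumed length encoding)."""
    out = bytearray()
    while value >= 0x80:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    out.append(value)
    return bytes(out)


def split_into_shares(txs: Sequence[bytes], namespace_id: bytes) -> List[bytes]:
    """Serialize length/tx pairs into one stream, then cut it into shares."""
    stream = bytearray()
    starts = []  # stream offsets where each length/tx pair begins
    for tx in txs:
        starts.append(len(stream))
        stream += encode_uvarint(len(tx)) + tx

    shares = []
    for off in range(0, len(stream), PAYLOAD_BYTES):
        # Zero-pad the final chunk (padding scheme is an assumption).
        chunk = bytes(stream[off:off + PAYLOAD_BYTES]).ljust(PAYLOAD_BYTES, b"\x00")
        # First pair that starts inside this chunk, if any.
        first = next((s for s in starts if off <= s < off + PAYLOAD_BYTES), None)
        reserved = 0 if first is None else HEADER_BYTES + (first - off)
        # The namespace ID is attached once per share, never per transaction.
        shares.append(namespace_id + bytes([reserved]) + chunk)
    return shares


# Hypothetical reserved namespace ID 1 for transactions.
tx_namespace = (1).to_bytes(NAMESPACE_ID_BYTES, "big")
shares = split_into_shares([b"a" * 100, b"b" * 200, b"c" * 50], tx_namespace)
```

With these three example transactions the stream spans two shares. The third transaction starts in the middle of the second share and carries no namespace ID of its own; the reserved byte of that share points at its length prefix.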

Member Author

@adlerjohn adlerjohn Mar 19, 2021

specs/data_structures.md (outdated review thread, resolved)
@adlerjohn adlerjohn requested a review from liamsi March 19, 2021 13:45
Member

@liamsi liamsi left a comment

LGTM, thanks for the clarifications.

@adlerjohn adlerjohn merged commit a66353f into master Mar 19, 2021
@adlerjohn adlerjohn deleted the adlerjohn-nmt_namespace_leaves branch March 19, 2021 14:38
Labels: bug (Something isn't working)
Projects: None yet
Development: Successfully merging this pull request may close these issues: Namespace ID must be erasure-coded
2 participants