-
Notifications
You must be signed in to change notification settings - Fork 323
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Extended balance_root
to handle root underflow case
#621
base: main
Are you sure you want to change the base?
Conversation
Did some code refactoring.
… trying to convert overflow pages to PageType variant I have changed places which use `page_type` function to use `maybe_page_type`
cd7c875
to
8a54812
Compare
I'm not sure why the test is failing only on windows. I have fixed the same error on my ubuntu machine. This is the only check that's failing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we handle root underflow? Balance root tries to map 1:1 to balance_deeper
implementation which is an overflow balancing of root. Underflow should be managed generically for all cases. Furthermore, can a root page underflow even? Isn't it possible to create a btree page that once you do first insertion it will see itself as underfull?
@pereman2 I've looked at both the references below and thought it is better to handle root underflow case in the I think the comment from [1] describes root underflow case.
[1] https://github.com/sqlite/sqlite/blob/master/src/btree.c#L8867-L8881 |
core/storage/btree.rs
Outdated
@@ -1285,8 +1400,7 @@ impl BTreeCursor { | |||
|
|||
// setup overflow page | |||
let contents = page.get().contents.as_mut().unwrap(); | |||
let buf = contents.as_ptr(); | |||
buf.fill(0); | |||
contents.write_u32(0, 0); // Initialize next overflow page pointer to 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why 0, can we use one of the constants? This 0 means PAGE_HEADER_OFFSET_PAGE_TYPE
, are you sure this is what this 0 write means?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, what I did is wrong. Will fix it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pereman2 can you please tell me what's happening here?: https://github.com/sqlite/sqlite/blob/c2e400af042bdd7d21e159a41fcf34c05398044c/src/btree.c#L7166-L7172
This is my understanding:
// Writes the page number of the new overflow page (pgnoOvfl) to pPrior. For the first overflow page, pPrior points to the end of the local cell content.
put4byte(pPrior, pgnoOvfl);
// ignore this
releasePage(pToRelease);
pToRelease = pOvfl;
// Points pPrior to the start of the new overflow page's data area
pPrior = pOvfl->aData;
// Writes 0 at the start of the overflow page, which shows its the last overflow page
put4byte(pPrior, 0);
// Sets pPayload to point past the 4-byte pointer. This is where the actual data will be stored in the overflow page
pPayload = &pOvfl->aData[4];
//Calculates remaining space for data in this overflow page. Usable space minus 4 bytes used for the next-page pointer
spaceLeft = pBt->usableSize - 4;
Where are we doing put4byte(pPrior, 0);
equivalent in limbo?
1. Added an unreachable!() to the if-else under/overflow checker. 2. Directly inserted to cells using `insert_into_cell()` instead of cloning vecs into an external vector 3. Propagated errors using `?` instead of `except` and handled the case where child page is not loaded/locked. 4. Restored `allocate_overflow_page()` function to previous state.
@pereman2 @jussisaurio I've added the following changes, can you please take a look?
|
…h its overflow cells to dst 2. Added `zero_page` function which zeros's page properly. 3. Fixed the bug on windows which was due to bad offset handling and the test_sequential_overflow_page passes now. 4. Now `balance_root` mirror's closely what sqlite is doing.
I think this is ready for review.
I will add tests for root underflow case when delete is functional. |
core/storage/sqlite3_ondisk.rs
Outdated
match self.maybe_page_type() { | ||
Some(page_type) => match page_type { | ||
PageType::IndexInterior => Some(self.read_u32(8)), | ||
PageType::TableInterior => Some(self.read_u32(8)), | ||
PageType::IndexLeaf => None, | ||
PageType::TableLeaf => None, | ||
}, | ||
None => None, // Handle overflow pages by returning None | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we want to return None
? Wouldn't it be better to crash if we call any of these methods if we are on an overflow page?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removing these changes. I have made them for compute_used_space
which is wrong and not needed for root balancing.
core/storage/btree.rs
Outdated
fn zero_page(&self, page: &PageRef, page_type: PageType, is_leaf: bool) { | ||
let contents = page.get().contents.as_mut().unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if page is page 1?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my understanding, since write_u16
and other write functions add page header size offsets when writing data to a location we need not handle page1 specially. Is my understanding correct?
core/storage/btree.rs
Outdated
let header_and_pointers_size = | ||
src_contents.header_size() + src_contents.cell_pointer_array_size(); | ||
dst_buf[..header_and_pointers_size].copy_from_slice(&src_buf[..header_and_pointers_size]); | ||
|
||
// Copy overflow cells | ||
dst_contents.overflow_cells = src_contents.overflow_cells.clone(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does this handle page 1?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here we would need to add 100 as offset if page_1, since we are directly copying. I will make the necessary edits.
core/storage/btree.rs
Outdated
for overflow_cell in &page.overflow_cells { | ||
total_used += overflow_cell.payload.len(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do overflow cells count? I mean, no overflow cell is there inside the page. If there is an overflow cell, regardless of the used space we have to balance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This calculation is not needed for now and the calculation is wrong. I'm using this function inside in is_underflow
. I've simplified both is_underflow
and is_overflow
to handle only balance_root
case for. I will extend them in the future.
I will remove this function. This can be resolved.
…contents. 2. Refactored `zero_page` to only zero offsets. And added documentation on where and how to use it. 3. Removed erroneous `compute_used_space` function and changed `is_underflow` `is_overflow` to currently only check inside balance_root. 4. Removed changes to `maybe_page_type` function inside sqlite3_ondisk.
fn zero_page(&self, page: &PageRef) { | ||
let contents = page.get().contents.as_mut().unwrap(); | ||
|
||
contents.write_u32(PAGE_HEADER_OFFSET_FIRST_FREEBLOCK, 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pereman2 I've written write_u32
here instead of write_u16
because SQLite does memset to 4 bytes: https://github.com/sqlite/sqlite/blob/c2e400af042bdd7d21e159a41fcf34c05398044c/src/btree.c#L2277
Balance root now handles root underflow case:
balance_leaf