Skip to content
This repository has been archived by the owner on May 9, 2024. It is now read-only.

Index structure for free buffers of a slab #688

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

akroviakov
Copy link
Contributor

@akroviakov akroviakov commented Oct 4, 2023

This PR introduces an index structure for free buffers of a slab, this allows keep data fetching time pretty much constant.

Example: 1000 fragments, 15 columns, (we observe GPU as it is easier to measure fetching time).

Problem: current buffer manager looks for a fitting buffer by walking through all buffers in all slabs. Normally this means that we go through all previously allocated buffers until the last one. When we fetch columns one by one (for simplicity), the measurements per column fetch are:
[102ms, 105ms, 125ms, 205ms, 301ms, 416ms, ..., 1100ms, ...](i.e., 1st column took 102ms to fetch, 2nd 105ms, etc.).

Solution: introduce an index structure for free buffers in a slab.

Effect: we get a quick shortcut over slabs where we wouldn't fit, and we don't have to iterate over all buffers in a slab, only over the free ones and in log time. In practice the overhead is eliminated:
[102ms, 103ms, 98ms, 100ms, ..., 101ms].

Drawback: slightly higher overhead per buffer, but it is maybe in the order of a few ms for 1000 buffers.

@akroviakov akroviakov force-pushed the akroviak/bm_index_free_buf branch from 73d010a to 5917a28 Compare October 5, 2023 13:27
Copy link
Contributor

@ienkovich ienkovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall changes look reasonable but I have two concerns.

  1. There is a confusion between segment and buffer terms. Slab consists of segments, but we call them buffers sometimes and this patch worsens the situation. It would be better to use the segment term when we refer to BufferSeg structures.
  2. You introduce a free segments index to avoid linear complexity in a free segment search, But in fact, insert/remove operations for the introduced index are also linear. I suggest another structure in the comments.

@@ -856,8 +946,12 @@ void BufferMgr::getChunkMetadataVecForKeyPrefix(ChunkMetadataVector& chunk_metad
LOG(FATAL) << "getChunkMetadataVecForPrefix not supported for BufferMgr.";
}

const std::vector<BufferList>& BufferMgr::getSlabSegments() {
return slab_segments_;
const std::vector<BufferList> BufferMgr::getSlabSegments() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be better to remove this method completely.

};

struct Slab {
Slab(BufferList& buffers) : buffers_(buffers), free_buffs_index_(buffers_) {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not clear why this set of constructors was chosen. IMO the only constructor should simply get a number of pages in a slab.

: buffers_(std::move(buffers)), free_buffs_index_(buffers_) {}

BufferList buffers_;
SlabFreeBufIndex free_buffs_index_;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see why it's split into Slab and SlabFreeBufIndex with the latter referencing fields of the former. Wouldn't it be simpler and more clear to have Slab class only?

CHECK_EQ((*index_pos)->num_pages, slab_buffer_pos->num_pages);
return index_pos;
}
std::vector<BufferList::iterator> free_buffs_;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't feel the vector is a good choice for an index. In your experiments you allocate a lot of small segments but hardly free any of them. With more buffer deallocations involved you might end up with another performance issue because insert/remove operations for vectors are linear. And linearity is what you want to avoid in segment allocation/deallocation in the first place, right?
Since we are searching buffers by their size, what about std::map<size_t, std::[unordered_]set<BufferList::iterator>>? It would enable the search of a free buffer at the same log(n) but much more efficient insert/remove operations.

return index_pos;
}
std::vector<BufferList::iterator> free_buffs_;
BufferList& list_to_index_;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The field name is confusing.

@@ -210,7 +272,7 @@ class BufferMgr : public AbstractBufferMgr { // implements
const size_t page_size_;
std::vector<int8_t*> slabs_; /// vector of beginning memory addresses for each
/// allocation of the buffer pool
std::vector<BufferList> slab_segments_;
std::list<Slab> slab_segments_;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you prefer list here? We don't add/remove slabs often. Efficient slab access by index looks more important.

size_t excess_pages = num_pages - num_pages_requested;
if (evict_it != slab_segments_[slab_num].end() &&
const size_t excess_pages = num_pages - num_pages_requested;
if (evict_it != slab_segs.end() &&
evict_it->mem_status == FREE) { // need to merge with current page
evict_it->start_page = start_page + num_pages_requested;
evict_it->num_pages += excess_pages;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When a free segment's size is modified, you need to re-index it similarly to what you do in BufferMgr::reserveBuffer.

@akroviakov akroviakov requested a review from ienkovich October 11, 2023 13:31
@akroviakov akroviakov force-pushed the akroviak/bm_index_free_buf branch from 2d1c992 to 7167cc8 Compare October 27, 2023 14:07
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants