Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: ABD chunk iterator #16848

Open
wants to merge 7 commits into
base: master
Choose a base branch
from
Open

RFC: ABD chunk iterator #16848

wants to merge 7 commits into from

Conversation

robn
Copy link
Member

@robn robn commented Dec 9, 2024

[Sponsors: Klara, Inc., Wasabi Technology, Inc.]

Motivation and Context

This is a prototype of "chunk" iterators for ABDs.

This is attempting to solve the problems (which of course I assert are; we can discuss that if you wish):

  • using an iterator is difficult, requiring a callback function and a state structure. This in turn makes ABDs difficult to use; much harder than the simple character buffers they replace
  • the need for a callback add function call overhead for every iteration (granted, it's likely negligible in most cases)
  • the iterator structure requires special knowledge of every ABD type, making adding new ABD types difficult
  • iterators are byte-oriented, even though their underlying storage is not
    • we're rarely (never?) interested arbitrarily-sized runs of bytes
    • advancing by bytes can land you in the "same" chunk, but the caller can't know this, so has to do an unmap/map cycle, which can be expensive
    • writing more exotic iterators (reverse, repeat) is extremely difficult because its not easy to know how much iteration will or should yield

This PR proposes an alternative iterator that expressly yields "chunks", that is, an object (struct) representing an arbitrary run of bytes. The caller can than take operations against the chunk (request size, map, unmap, etc) separately. All iterator movement then becomes simply moving to the next chunk. The much simpler housekeeping required allows an iterator to be used directly rather than through a control function, addressing the usability concerns.

For this PR I'm looking for rough consensus on the goals and interface.

Description

For easy reading online:

See the individual commits and their comments for more details. In the commits and their comments:

  • definition and description of chunks, iterators and their supporting functions
  • an implementation of abd_iterate_func in terms of chunk iterators
  • conversion of many callers of abd_iterate_func to use a simpler iterator loop
  • an implementation of abd_iterate_func2 in terms of chunk iterators
  • two possible implementations of a abd_for_each_chunk macro that hides the details of the most common style of iterator loop (map each chunk)

This PR is largely aimed at getting the interface right; the implementation details are unimportant for now. My intention when/if this passes muster is to entirely remove the existing iterator code, including the callback iterators, so none of this implementation will carry over. So don't worry about it, but do point out anything you think might not be implementable.

The only thing not here is a replacement for abd_iterate_page_func(). I don't think it's hard, just a bit fiddly - needs a way to get the page pointer and data offset/size within for a chunk, and a way to indicate that each chunk should be a page, even if it could be bigger (linear), and what to do with compound pages. Maybe abd_chunk_start() will gain a flags parameter. I'd see what falls out of the "real" implementation and what feels right. I don't want to overthink it, but I still want it to be usable - more easily sharing pages with the kernel is gonna be useful!

How Has This Been Tested?

Light sanity runs, as befits a prototype.

Types of changes

One or more of:

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

robn added 7 commits December 9, 2024 16:04
The existing iterators do not have any way for an external user to get
the size of the data within the current iteration. This lifts that out
of abd_iter_map() and exposes it as abd_iter_size() for the chunk
iterator to use.

FreeBSD not included for now, since this is just for my own testing.
For now, these are implemented in terms of the old iterators.
There is no real overhead change here, because it still has to call the
callback function, which has to manage its own state, but the code is
much simpler.
This shows how this form of iterator can be much simpler to work with.
No need to set up separate state struct and callback function, just
write a loop.
This version is not really simpler.

(Though it did help me to see a small efficiency gain: the existing
version will remap each chunk when either iterator advances. This
version does not).

I suspect that users of abd_iterate_func2 will be able to do a better
job by using chunk iterators directly, rather than through callbacks,
because they know more about the incoming data format and how and when
to advance it.
This is where I really wanted to get to. Rather then needing the loop
boilerplate every time, have a macro that hide all that away, having a
use that looks more like a plain old for loop.

This method has two downsides:
- the data and size vars have to be declared outside the loop
- an early exit (break or return) is not possible, as there's no place
  to unmap the chunk.
By making the loop block an arg to the macro, we can wrap it, and solve
both the problems in the previous version. We can declare the data &
size vars ourselves, so the caller doesn't have to, and we can run at
loop exit, so can unmap the chunk properly (alas, we can't do it for an
early return, but that's a much rarer thing to want than an early break).

The cost is a tiny bit of readbility, as a code block as a macro arg is
a little less familiar than one following a loop operator. The extra
safety seems worth it to me.
@amotin amotin added the Status: Design Review Needed Architecture or design is under discussion label Dec 11, 2024
Comment on lines +141 to +150
* - void abd_chunk_advance(abd_chunk_t *ch)
*
* Move the iterator to the next chunk. If there is no next chunk, the
* iterator is changed to the "done" state and is no longer useable.
*
* - boolean_t abd_chunk_done(abd_chunk_t *ch)
*
* If true, the iterator is pointing to a valid chunk, and the underlying
* memory can be accessed with the access functions. If false, the iterator
* is exhausted and no longer useable.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitions of "done" in those paragraphs seems to be opposite.

@amotin
Copy link
Member

amotin commented Dec 12, 2024

using an iterator is difficult, requiring a callback function and a state structure.

The state structure seems not going anywhere. Ability to avoid callbacks and do thing inline indeed makes it much more flexible, but the macro style makes me shiver similar to style checker. I like the direction, but not so much the specific implementation.

the iterator structure requires special knowledge of every ABD type, making adding new ABD types difficult

I am not sure you are doing much about it here. For now you just included the old iterator with its code as is, but I worry that once you try to integrate it we'll end up about where we are now. We do need some OS-specific ABD-specific storage for mapping, etc.

iterators are byte-oriented, even though their underlying storage is not

It seems current code already operates in chunks of possible mappings. The fact that abd_iterate_func2() always remaps both abds seems like an implementation detail. I don't see what would stop from handling offsets within a map and remapping only as necessary.

@tuxoko
Copy link
Contributor

tuxoko commented Jan 23, 2025

iterators are byte-oriented, even though their underlying storage is not
we're rarely (never?) interested arbitrarily-sized runs of bytes

For context, when abd was still out of tree, abd was directly attached to dmu buf.
So arbitrary byte iteration was needed.

In fact, I think we should add it back again, at least for user data dmu buf.
Because right now, if arc buf is scatter, dmu will have to allocate extra buf and do memcpy.
This is inevitable for metadata, but for user data it's just extra memcpy for no good reason.

So for this reason I don't think we should remove the ability to do arbitrary byte iteration.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Design Review Needed Architecture or design is under discussion
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants