* XArray multiple marks support
@ 2022-06-30 15:48 Boris Burkov
2022-07-04 22:32 ` Matthew Wilcox
0 siblings, 1 reply; 4+ messages in thread
From: Boris Burkov @ 2022-06-30 15:48 UTC (permalink / raw)
To: Matthew Wilcox; +Cc: linux-fsdevel
Hi Matthew,
I was reading the XArray documentation and noticed a comment that there
is potential support for searching by ANDs of multiple marks, but it was
waiting for a use case. I think I might have such a use case, but I'm
looking for some feedback on its validity.
I'm working on some fragmentation issues in a space allocator in btrfs,
so I'm attempting to categorize the allocation unit of this allocator
(block group) by size class to help. I've got a branch where I migrated
btrfs's storage of block groups from a linked list to an xarray, since
marks felt like a really nice way for me to iterate by size class.
e.g.:
mark = get_size_class_mark(size);
xa_for_each_marked(block_groups, index, block_group, mark) {
// try to allocate in block_group
}
Further, this allocator already operates in passes, which try harder and
harder to find a block_group, which also fits nicely, since eventually,
I can make the mark XA_PRESENT.
i.e.:
while (pass < N) {
mark = get_size_class_mark(size);
if (pass > K)
mark = XA_PRESENT;
xa_for_each_marked(block_groups, index, block_group, mark) {
// try to allocate in block_group
}
if (happy)
break;
pass++;
}
However, I do feel a bit short on marks! Currently, I use one for "not
in any size class" which leaves just two size classes. Even a handful
more would give me a lot of extra flexibility. So with that said, if I
could use ANDs in the iteration to make it essentially 7 marks, that
would be sweet. I don't yet see a strong need for ORs, in my case.
Does this seem like a good enough justification to support finding by
combination of marks? If not, my alternative, for what it's worth, is
to have an array of my block group data structure indexed by size class.
If you do think it's a good idea, I'm happy to help with implementing
it or testing it, if that would be valuable.
Thanks for your time,
Boris
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: XArray multiple marks support
2022-06-30 15:48 XArray multiple marks support Boris Burkov
@ 2022-07-04 22:32 ` Matthew Wilcox
2022-07-05 18:01 ` Boris Burkov
0 siblings, 1 reply; 4+ messages in thread
From: Matthew Wilcox @ 2022-07-04 22:32 UTC (permalink / raw)
To: Boris Burkov; +Cc: linux-fsdevel
On Thu, Jun 30, 2022 at 08:48:10AM -0700, Boris Burkov wrote:
> I was reading the XArray documentation and noticed a comment that there
> is potential support for searching by ANDs of multiple marks, but it was
> waiting for a use case. I think I might have such a use case, but I'm
> looking for some feedback on its validity.
>
> I'm working on some fragmentation issues in a space allocator in btrfs,
> so I'm attempting to categorize the allocation unit of this allocator
> (block group) by size class to help. I've got a branch where I migrated
> btrfs's storage of block groups from a linked list to an xarray, since
> marks felt like a really nice way for me to iterate by size class.
>
> e.g.:
> mark = get_size_class_mark(size);
> xa_for_each_marked(block_groups, index, block_group, mark) {
> // try to allocate in block_group
> }
>
> Further, this allocator already operates in passes, which try harder and
> harder to find a block_group, which also fits nicely, since eventually,
> I can make the mark XA_PRESENT.
>
> i.e.:
> while (pass < N) {
> mark = get_size_class_mark(size);
> if (pass > K)
> mark = XA_PRESENT;
> xa_for_each_marked(block_groups, index, block_group, mark) {
> // try to allocate in block_group
> }
> if (happy)
> break;
> pass++;
> }
>
> However, I do feel a bit short on marks! Currently, I use one for "not
> in any size class" which leaves just two size classes. Even a handful
> more would give me a lot of extra flexibility. So with that said, if I
> could use ANDs in the iteration to make it essentially 7 marks, that
> would be sweet. I don't yet see a strong need for ORs, in my case.
Unfortunately, I don't think doing this will work out really well for
you. The bits really are independent of each other, and the power of
the search marks lies in their ability to skip over vast swathes of
the array when they're not marked. But to do what you want, we'd end up
doing something like this:
leaf array 1:
entry 0 is in category 1
entry 1 is in category 2
entry 2 is in category 5
and now we have to set all three bits in the parent of leaf array 1,
so any search will have to traverse all of leaf array 1 in order to find
out whether there are any entries in the category we're looking for.
What you could do is keep one XArray per category. I think that's what
you're proposing below. It's a bit poor because each XArray has its
own lock, so to move a group from one size category to another, you have
to take two locks. On the other hand, that means that you can allocate
from two different size categories at the same time, so maybe that's a
good thing?
> Does this seem like a good enough justification to support finding by
> combination of marks? If not, my alternative, for what it's worth, is
> to have an array of my block group data structure indexed by size class.
> If you do think it's a good idea, I'm happy to help with implementing
> it or testing it, if that would be valuable.
>
> Thanks for your time,
> Boris
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: XArray multiple marks support
2022-07-04 22:32 ` Matthew Wilcox
@ 2022-07-05 18:01 ` Boris Burkov
2022-07-05 18:38 ` Matthew Wilcox
0 siblings, 1 reply; 4+ messages in thread
From: Boris Burkov @ 2022-07-05 18:01 UTC (permalink / raw)
To: Matthew Wilcox; +Cc: linux-fsdevel
On Mon, Jul 04, 2022 at 11:32:56PM +0100, Matthew Wilcox wrote:
> On Thu, Jun 30, 2022 at 08:48:10AM -0700, Boris Burkov wrote:
> > I was reading the XArray documentation and noticed a comment that there
> > is potential support for searching by ANDs of multiple marks, but it was
> > waiting for a use case. I think I might have such a use case, but I'm
> > looking for some feedback on its validity.
> >
> > I'm working on some fragmentation issues in a space allocator in btrfs,
> > so I'm attempting to categorize the allocation unit of this allocator
> > (block group) by size class to help. I've got a branch where I migrated
> > btrfs's storage of block groups from a linked list to an xarray, since
> > marks felt like a really nice way for me to iterate by size class.
> >
> > e.g.:
> > mark = get_size_class_mark(size);
> > xa_for_each_marked(block_groups, index, block_group, mark) {
> > // try to allocate in block_group
> > }
> >
> > Further, this allocator already operates in passes, which try harder and
> > harder to find a block_group, which also fits nicely, since eventually,
> > I can make the mark XA_PRESENT.
> >
> > i.e.:
> > while (pass < N) {
> > mark = get_size_class_mark(size);
> > if (pass > K)
> > mark = XA_PRESENT;
> > xa_for_each_marked(block_groups, index, block_group, mark) {
> > // try to allocate in block_group
> > }
> > if (happy)
> > break;
> > pass++;
> > }
> >
> > However, I do feel a bit short on marks! Currently, I use one for "not
> > in any size class" which leaves just two size classes. Even a handful
> > more would give me a lot of extra flexibility. So with that said, if I
> > could use ANDs in the iteration to make it essentially 7 marks, that
> > would be sweet. I don't yet see a strong need for ORs, in my case.
>
> Unfortunately, I don't think doing this will work out really well for
> you. The bits really are independent of each other, and the power of
> the search marks lies in their ability to skip over vast swathes of
> the array when they're not marked. But to do what you want, we'd end up
> doing something like this:
>
> leaf array 1:
> entry 0 is in category 1
> entry 1 is in category 2
> entry 2 is in category 5
>
> and now we have to set all three bits in the parent of leaf array 1,
> so any search will have to traverse all of leaf array 1 in order to find
> out whether there are any entries in the category we're looking for.
Thank you for the explanation. To check my understanding, does this
point also imply that currently the worst case for marked search is if
every leaf array has an entry with each mark set?
>
> What you could do is keep one XArray per category. I think that's what
> you're proposing below. It's a bit poor because each XArray has its
> own lock, so to move a group from one size category to another, you have
> to take two locks. On the other hand, that means that you can allocate
> from two different size categories at the same time, so maybe that's a
> good thing?
I'll either do this, or convince myself that 3 categories is sufficient.
>
> > Does this seem like a good enough justification to support finding by
> > combination of marks? If not, my alternative, for what it's worth, is
> > to have an array of my block group data structure indexed by size class.
> > If you do think it's a good idea, I'm happy to help with implementing
> > it or testing it, if that would be valuable.
> >
> > Thanks for your time,
> > Boris
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: XArray multiple marks support
2022-07-05 18:01 ` Boris Burkov
@ 2022-07-05 18:38 ` Matthew Wilcox
0 siblings, 0 replies; 4+ messages in thread
From: Matthew Wilcox @ 2022-07-05 18:38 UTC (permalink / raw)
To: Boris Burkov; +Cc: linux-fsdevel
On Tue, Jul 05, 2022 at 11:01:43AM -0700, Boris Burkov wrote:
> > Unfortunately, I don't think doing this will work out really well for
> > you. The bits really are independent of each other, and the power of
> > the search marks lies in their ability to skip over vast swathes of
> > the array when they're not marked. But to do what you want, we'd end up
> > doing something like this:
> >
> > leaf array 1:
> > entry 0 is in category 1
> > entry 1 is in category 2
> > entry 2 is in category 5
> >
> > and now we have to set all three bits in the parent of leaf array 1,
> > so any search will have to traverse all of leaf array 1 in order to find
> > out whether there are any entries in the category we're looking for.
>
> Thank you for the explanation. To check my understanding, does this
> point also imply that currently the worst case for marked search is if
> every leaf array has an entry with each mark set?
Yes. Each leaf array is 64 entries, and the tags are stored at the end
of the node. A mark search consists of scanning one of the three
bitmaps. Which means we have to load two cachelines (both the header
and the mark array) out of each node which has a mark set. We can
avoid accessing six or seven of the nine cachelines if only one
bit is set, but with the amount of prefetching CPUs do these days,
I don't know how effective that is.
I keep meaning to try moving the mark array from the end of the
node to the header, but I worry it'll hurt non-marked lookups.
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2022-07-05 18:39 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-06-30 15:48 XArray multiple marks support Boris Burkov
2022-07-04 22:32 ` Matthew Wilcox
2022-07-05 18:01 ` Boris Burkov
2022-07-05 18:38 ` Matthew Wilcox
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).