* XArray multiple marks support
@ 2022-06-30 15:48 Boris Burkov
2022-07-04 22:32 ` Matthew Wilcox
0 siblings, 1 reply; 4+ messages in thread
From: Boris Burkov @ 2022-06-30 15:48 UTC (permalink / raw)
To: Matthew Wilcox; +Cc: linux-fsdevel
Hi Matthew,
I was reading the XArray documentation and noticed a comment that there
is potential support for searching by ANDs of multiple marks, but it was
waiting for a use case. I think I might have such a use case, but I'm
looking for some feedback on its validity.
I'm working on some fragmentation issues in a space allocator in btrfs,
so I'm attempting to categorize the allocation unit of this allocator
(block group) by size class to help. I've got a branch where I migrated
btrfs's storage of block groups from a linked list to an xarray, since
marks felt like a really nice way for me to iterate by size class.
e.g.:
mark = get_size_class_mark(size);
xa_for_each_marked(block_groups, index, block_group, mark) {
// try to allocate in block_group
}
Further, this allocator already operates in passes, which try harder and
harder to find a block_group, which also fits nicely, since eventually,
I can make the mark XA_PRESENT.
i.e.:
while (pass < N) {
mark = get_size_class_mark(size);
if (pass > K)
mark = XA_PRESENT;
xa_for_each_marked(block_groups, index, block_group, mark) {
// try to allocate in block_group
}
if (happy)
break;
pass++;
}
However, I do feel a bit short on marks! Currently, I use one for "not
in any size class" which leaves just two size classes. Even a handful
more would give me a lot of extra flexibility. So with that said, if I
could use ANDs in the iteration to make it essentially 7 marks, that
would be sweet. I don't yet see a strong need for ORs, in my case.
Does this seem like a good enough justification to support finding by
combination of marks? If not, my alternative, for what it's worth, is
to have an array of my block group data structure indexed by size class.
If you do think it's a good idea, I'm happy to help with implementing
it or testing it, if that would be valuable.
Thanks for your time,
Boris
^ permalink raw reply [flat|nested] 4+ messages in thread* Re: XArray multiple marks support 2022-06-30 15:48 XArray multiple marks support Boris Burkov @ 2022-07-04 22:32 ` Matthew Wilcox 2022-07-05 18:01 ` Boris Burkov 0 siblings, 1 reply; 4+ messages in thread From: Matthew Wilcox @ 2022-07-04 22:32 UTC (permalink / raw) To: Boris Burkov; +Cc: linux-fsdevel On Thu, Jun 30, 2022 at 08:48:10AM -0700, Boris Burkov wrote: > I was reading the XArray documentation and noticed a comment that there > is potential support for searching by ANDs of multiple marks, but it was > waiting for a use case. I think I might have such a use case, but I'm > looking for some feedback on its validity. > > I'm working on some fragmentation issues in a space allocator in btrfs, > so I'm attempting to categorize the allocation unit of this allocator > (block group) by size class to help. I've got a branch where I migrated > btrfs's storage of block groups from a linked list to an xarray, since > marks felt like a really nice way for me to iterate by size class. > > e.g.: > mark = get_size_class_mark(size); > xa_for_each_marked(block_groups, index, block_group, mark) { > // try to allocate in block_group > } > > Further, this allocator already operates in passes, which try harder and > harder to find a block_group, which also fits nicely, since eventually, > I can make the mark XA_PRESENT. > > i.e.: > while (pass < N) { > mark = get_size_class_mark(size); > if (pass > K) > mark = XA_PRESENT; > xa_for_each_marked(block_groups, index, block_group, mark) { > // try to allocate in block_group > } > if (happy) > break; > pass++; > } > > However, I do feel a bit short on marks! Currently, I use one for "not > in any size class" which leaves just two size classes. Even a handful > more would give me a lot of extra flexibility. So with that said, if I > could use ANDs in the iteration to make it essentially 7 marks, that > would be sweet. I don't yet see a strong need for ORs, in my case. Unfortunately, I don't think doing this will work out really well for you. The bits really are independent of each other, and the power of the search marks lies in their ability to skip over vast swathes of the array when they're not marked. But to do what you want, we'd end up doing something like this: leaf array 1: entry 0 is in category 1 entry 1 is in category 2 entry 2 is in category 5 and now we have to set all three bits in the parent of leaf array 1, so any search will have to traverse all of leaf array 1 in order to find out whether there are any entries in the category we're looking for. What you could do is keep one XArray per category. I think that's what you're proposing below. It's a bit poor because each XArray has its own lock, so to move a group from one size category to another, you have to take two locks. On the other hand, that means that you can allocate from two different size categories at the same time, so maybe that's a good thing? > Does this seem like a good enough justification to support finding by > combination of marks? If not, my alternative, for what it's worth, is > to have an array of my block group data structure indexed by size class. > If you do think it's a good idea, I'm happy to help with implementing > it or testing it, if that would be valuable. > > Thanks for your time, > Boris ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: XArray multiple marks support 2022-07-04 22:32 ` Matthew Wilcox @ 2022-07-05 18:01 ` Boris Burkov 2022-07-05 18:38 ` Matthew Wilcox 0 siblings, 1 reply; 4+ messages in thread From: Boris Burkov @ 2022-07-05 18:01 UTC (permalink / raw) To: Matthew Wilcox; +Cc: linux-fsdevel On Mon, Jul 04, 2022 at 11:32:56PM +0100, Matthew Wilcox wrote: > On Thu, Jun 30, 2022 at 08:48:10AM -0700, Boris Burkov wrote: > > I was reading the XArray documentation and noticed a comment that there > > is potential support for searching by ANDs of multiple marks, but it was > > waiting for a use case. I think I might have such a use case, but I'm > > looking for some feedback on its validity. > > > > I'm working on some fragmentation issues in a space allocator in btrfs, > > so I'm attempting to categorize the allocation unit of this allocator > > (block group) by size class to help. I've got a branch where I migrated > > btrfs's storage of block groups from a linked list to an xarray, since > > marks felt like a really nice way for me to iterate by size class. > > > > e.g.: > > mark = get_size_class_mark(size); > > xa_for_each_marked(block_groups, index, block_group, mark) { > > // try to allocate in block_group > > } > > > > Further, this allocator already operates in passes, which try harder and > > harder to find a block_group, which also fits nicely, since eventually, > > I can make the mark XA_PRESENT. > > > > i.e.: > > while (pass < N) { > > mark = get_size_class_mark(size); > > if (pass > K) > > mark = XA_PRESENT; > > xa_for_each_marked(block_groups, index, block_group, mark) { > > // try to allocate in block_group > > } > > if (happy) > > break; > > pass++; > > } > > > > However, I do feel a bit short on marks! Currently, I use one for "not > > in any size class" which leaves just two size classes. Even a handful > > more would give me a lot of extra flexibility. So with that said, if I > > could use ANDs in the iteration to make it essentially 7 marks, that > > would be sweet. I don't yet see a strong need for ORs, in my case. > > Unfortunately, I don't think doing this will work out really well for > you. The bits really are independent of each other, and the power of > the search marks lies in their ability to skip over vast swathes of > the array when they're not marked. But to do what you want, we'd end up > doing something like this: > > leaf array 1: > entry 0 is in category 1 > entry 1 is in category 2 > entry 2 is in category 5 > > and now we have to set all three bits in the parent of leaf array 1, > so any search will have to traverse all of leaf array 1 in order to find > out whether there are any entries in the category we're looking for. Thank you for the explanation. To check my understanding, does this point also imply that currently the worst case for marked search is if every leaf array has an entry with each mark set? > > What you could do is keep one XArray per category. I think that's what > you're proposing below. It's a bit poor because each XArray has its > own lock, so to move a group from one size category to another, you have > to take two locks. On the other hand, that means that you can allocate > from two different size categories at the same time, so maybe that's a > good thing? I'll either do this, or convince myself that 3 categories is sufficient. > > > Does this seem like a good enough justification to support finding by > > combination of marks? If not, my alternative, for what it's worth, is > > to have an array of my block group data structure indexed by size class. > > If you do think it's a good idea, I'm happy to help with implementing > > it or testing it, if that would be valuable. > > > > Thanks for your time, > > Boris ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: XArray multiple marks support 2022-07-05 18:01 ` Boris Burkov @ 2022-07-05 18:38 ` Matthew Wilcox 0 siblings, 0 replies; 4+ messages in thread From: Matthew Wilcox @ 2022-07-05 18:38 UTC (permalink / raw) To: Boris Burkov; +Cc: linux-fsdevel On Tue, Jul 05, 2022 at 11:01:43AM -0700, Boris Burkov wrote: > > Unfortunately, I don't think doing this will work out really well for > > you. The bits really are independent of each other, and the power of > > the search marks lies in their ability to skip over vast swathes of > > the array when they're not marked. But to do what you want, we'd end up > > doing something like this: > > > > leaf array 1: > > entry 0 is in category 1 > > entry 1 is in category 2 > > entry 2 is in category 5 > > > > and now we have to set all three bits in the parent of leaf array 1, > > so any search will have to traverse all of leaf array 1 in order to find > > out whether there are any entries in the category we're looking for. > > Thank you for the explanation. To check my understanding, does this > point also imply that currently the worst case for marked search is if > every leaf array has an entry with each mark set? Yes. Each leaf array is 64 entries, and the tags are stored at the end of the node. A mark search consists of scanning one of the three bitmaps. Which means we have to load two cachelines (both the header and the mark array) out of each node which has a mark set. We can avoid accessing six or seven of the nine cachelines if only one bit is set, but with the amount of prefetching CPUs do these days, I don't know how effective that is. I keep meaning to try moving the mark array from the end of the node to the header, but I worry it'll hurt non-marked lookups. ^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2022-07-05 18:39 UTC | newest] Thread overview: 4+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2022-06-30 15:48 XArray multiple marks support Boris Burkov 2022-07-04 22:32 ` Matthew Wilcox 2022-07-05 18:01 ` Boris Burkov 2022-07-05 18:38 ` Matthew Wilcox
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).