* [PATCH] fs/resctrl: Slightly optimize cbm_validate()
@ 2025-10-26 7:39 Christophe JAILLET
2025-10-27 11:43 ` Dave Martin
2025-11-03 22:13 ` Reinette Chatre
0 siblings, 2 replies; 7+ messages in thread
From: Christophe JAILLET @ 2025-10-26 7:39 UTC
To: Tony Luck, Reinette Chatre, Dave Martin, James Morse, Babu Moger
Cc: linux-kernel, kernel-janitors, Christophe JAILLET
'first_bit' is known to be 1, so it can be skipped when searching for the
next 0 bit. Doing so mimics bitmap_next_set_region() and can save a few
cycles.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
---
Compile tested only.
For the record, on x86, the diff of the asm code is:
--- fs/resctrl/ctrlmondata.s.old 2025-10-26 08:21:46.928920563 +0100
+++ fs/resctrl/ctrlmondata.s 2025-10-26 08:21:40.864024143 +0100
@@ -1603,11 +1603,12 @@
call _find_first_bit
# ./include/linux/find.h:192: return _find_next_zero_bit(addr, size, offset);
movq %r12, %rsi
- leaq 48(%rsp), %rdi
- movq %rax, %rdx
+# fs/resctrl/ctrlmondata.c:133: zero_bit = find_next_zero_bit(&val, cbm_len, first_bit + 1);
+ leaq 1(%rax), %rdx
# ./include/linux/find.h:214: return _find_first_bit(addr, size);
movq %rax, 8(%rsp)
# ./include/linux/find.h:192: return _find_next_zero_bit(addr, size, offset);
+ leaq 48(%rsp), %rdi
call _find_next_zero_bit
# fs/resctrl/ctrlmondata.c:136: if (!r->cache.arch_has_sparse_bitmasks &&
leaq 28(%rbx), %rdi
---
fs/resctrl/ctrlmondata.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/resctrl/ctrlmondata.c b/fs/resctrl/ctrlmondata.c
index 0d0ef54fc4de..1ff479a2dbbc 100644
--- a/fs/resctrl/ctrlmondata.c
+++ b/fs/resctrl/ctrlmondata.c
@@ -130,7 +130,7 @@ static bool cbm_validate(char *buf, u32 *data, struct rdt_resource *r)
}
first_bit = find_first_bit(&val, cbm_len);
- zero_bit = find_next_zero_bit(&val, cbm_len, first_bit);
+ zero_bit = find_next_zero_bit(&val, cbm_len, first_bit + 1);
/* Are non-contiguous bitmasks allowed? */
if (!r->cache.arch_has_sparse_bitmasks &&
--
2.51.0
* Re: [PATCH] fs/resctrl: Slightly optimize cbm_validate()
2025-10-26 7:39 [PATCH] fs/resctrl: Slightly optimize cbm_validate() Christophe JAILLET
@ 2025-10-27 11:43 ` Dave Martin
2025-11-01 13:40 ` Christophe JAILLET
2025-11-03 18:17 ` Luck, Tony
2025-11-03 22:13 ` Reinette Chatre
1 sibling, 2 replies; 7+ messages in thread
From: Dave Martin @ 2025-10-27 11:43 UTC
To: Christophe JAILLET, Tony Luck
Cc: Reinette Chatre, James Morse, Babu Moger, linux-kernel,
kernel-janitors
Hi,
[Tony, I have a side question on min_cbm_bits -- see below.]
On Sun, Oct 26, 2025 at 08:39:52AM +0100, Christophe JAILLET wrote:
> 'first_bit' is known to be 1, so it can be skipped when searching for the
> next 0 bit. Doing so mimics bitmap_next_set_region() and can save a few
> cycles.
This seems reasonable, although:
Nit: missing statement of what the patch does. (Your paragraph
describes only something that _could_ be done and gives rationale for
it.)
>
> Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
> ---
> Compile tested only.
>
> For the record, on x86, the diff of the asm code is:
> --- fs/resctrl/ctrlmondata.s.old 2025-10-26 08:21:46.928920563 +0100
> +++ fs/resctrl/ctrlmondata.s 2025-10-26 08:21:40.864024143 +0100
> @@ -1603,11 +1603,12 @@
> call _find_first_bit
> # ./include/linux/find.h:192: return _find_next_zero_bit(addr, size, offset);
> movq %r12, %rsi
> - leaq 48(%rsp), %rdi
> - movq %rax, %rdx
> +# fs/resctrl/ctrlmondata.c:133: zero_bit = find_next_zero_bit(&val, cbm_len, first_bit + 1);
> + leaq 1(%rax), %rdx
> # ./include/linux/find.h:214: return _find_first_bit(addr, size);
> movq %rax, 8(%rsp)
> # ./include/linux/find.h:192: return _find_next_zero_bit(addr, size, offset);
> + leaq 48(%rsp), %rdi
(This is really only showing that the compiler works. The real
question is whether the logic is still sound after this change to the
arguments of find_next_zero_bit()...)
> call _find_next_zero_bit
> # fs/resctrl/ctrlmondata.c:136: if (!r->cache.arch_has_sparse_bitmasks &&
> leaq 28(%rbx), %rdi
> ---
> fs/resctrl/ctrlmondata.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/resctrl/ctrlmondata.c b/fs/resctrl/ctrlmondata.c
> index 0d0ef54fc4de..1ff479a2dbbc 100644
> --- a/fs/resctrl/ctrlmondata.c
> +++ b/fs/resctrl/ctrlmondata.c
> @@ -130,7 +130,7 @@ static bool cbm_validate(char *buf, u32 *data, struct rdt_resource *r)
> }
>
> first_bit = find_first_bit(&val, cbm_len);
> - zero_bit = find_next_zero_bit(&val, cbm_len, first_bit);
> + zero_bit = find_next_zero_bit(&val, cbm_len, first_bit + 1);
Does this definitely do the right thing if val was zero?
>
> /* Are non-contiguous bitmasks allowed? */
> if (!r->cache.arch_has_sparse_bitmasks &&
Also, what about the find_first_bit() below?
[...]
<aside>
Also, not directly related to this patch, but, looking at the final if
statement:
if ((zero_bit - first_bit) < r->cache.min_cbm_bits) {
rdt_last_cmd_printf("Need at least %d bits in the mask\n",
r->cache.min_cbm_bits);
return false;
}
If min_cbm_bits is two or greater, this can fail if the bitmap has
enough contiguous set bits but not in the first block of set bits,
and it can succeed if there are blocks of set bits beyond the first
block, that have fewer than min_cbm_bits.
Is that intended? Do we ever expect arch_has_sparse_bitmasks alongside
min_cbm_bits > 1, or should these be mutually exclusive?
</aside>
Cheers
---Dave
* Re: [PATCH] fs/resctrl: Slightly optimize cbm_validate()
2025-10-27 11:43 ` Dave Martin
@ 2025-11-01 13:40 ` Christophe JAILLET
2025-11-03 16:24 ` Dave Martin
2025-11-03 18:17 ` Luck, Tony
1 sibling, 1 reply; 7+ messages in thread
From: Christophe JAILLET @ 2025-11-01 13:40 UTC
To: Dave Martin, Tony Luck
Cc: Reinette Chatre, James Morse, Babu Moger, linux-kernel,
kernel-janitors
On 27/10/2025 at 12:43, Dave Martin wrote:
> Hi,
>
> [Tony, I have a side question on min_cbm_bits -- see below.]
>
> On Sun, Oct 26, 2025 at 08:39:52AM +0100, Christophe JAILLET wrote:
>> 'first_bit' is known to be 1, so it can be skipped when searching for the
>> next 0 bit. Doing so mimics bitmap_next_set_region() and can save a few
>> cycles.
>
> This seems reasonable, although:
>
> Nit: missing statement of what the patch does. (Your paragraph
> describes only something that _could_ be done and gives rationale for
> it.)
Will add it in v2.
>
>>
>> Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
>> ---
>> Compile tested only.
>>
>> For the record, on x86, the diff of the asm code is:
>> --- fs/resctrl/ctrlmondata.s.old 2025-10-26 08:21:46.928920563 +0100
>> +++ fs/resctrl/ctrlmondata.s 2025-10-26 08:21:40.864024143 +0100
>> @@ -1603,11 +1603,12 @@
>> call _find_first_bit
>> # ./include/linux/find.h:192: return _find_next_zero_bit(addr, size, offset);
>> movq %r12, %rsi
>> - leaq 48(%rsp), %rdi
>> - movq %rax, %rdx
>> +# fs/resctrl/ctrlmondata.c:133: zero_bit = find_next_zero_bit(&val, cbm_len, first_bit + 1);
>> + leaq 1(%rax), %rdx
>> # ./include/linux/find.h:214: return _find_first_bit(addr, size);
>> movq %rax, 8(%rsp)
>> # ./include/linux/find.h:192: return _find_next_zero_bit(addr, size, offset);
>> + leaq 48(%rsp), %rdi
>
> (This is really only showing that the compiler works. The real
> question is whether the logic is still sound after this change to the
> arguments of find_next_zero_bit()...)
Will remove in v2, if not useful.
>
>> call _find_next_zero_bit
>> # fs/resctrl/ctrlmondata.c:136: if (!r->cache.arch_has_sparse_bitmasks &&
>> leaq 28(%rbx), %rdi
>> ---
>> fs/resctrl/ctrlmondata.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/fs/resctrl/ctrlmondata.c b/fs/resctrl/ctrlmondata.c
>> index 0d0ef54fc4de..1ff479a2dbbc 100644
>> --- a/fs/resctrl/ctrlmondata.c
>> +++ b/fs/resctrl/ctrlmondata.c
>> @@ -130,7 +130,7 @@ static bool cbm_validate(char *buf, u32 *data, struct rdt_resource *r)
>> }
>>
>> first_bit = find_first_bit(&val, cbm_len);
>> - zero_bit = find_next_zero_bit(&val, cbm_len, first_bit);
>> + zero_bit = find_next_zero_bit(&val, cbm_len, first_bit + 1);
>
> Does this definitely do the right thing if val was zero?
Yes, IMHO, it does.
If val is zero, first_bit will be set to cbm_len (see [1]).
Then, find_next_zero_bit() will also return cbm_len, because 'first_bit + 1'
is past the end of the bitmap (see [2] and [3]).
The only case where we could have trouble would be if 'first_bit + 1'
wrapped around to 0. I don't think that such a case is possible.
>
>>
>> /* Are non-contiguous bitmasks allowed? */
>> if (!r->cache.arch_has_sparse_bitmasks &&
>
> Also, what about the find_first_bit() below?
Should be updated as well.
Will send a v2.
>
>
> [...]
>
> <aside>
>
> Also, not directly related to this patch, but, looking at the final if
> statement:
>
> if ((zero_bit - first_bit) < r->cache.min_cbm_bits) {
> rdt_last_cmd_printf("Need at least %d bits in the mask\n",
> r->cache.min_cbm_bits);
> return false;
> }
>
> If min_cbm_bits is two or greater, this can fail if the bitmap has
> enough contiguous set bits but not in the first block of set bits,
> and it can succeed if there are blocks of set bits beyond the first
> block, that have fewer than min_cbm_bits.
>
> Is that intended? Do we ever expect arch_has_sparse_bitmasks alongside
> min_cbm_bits > 1, or should these be mutually exclusive?
>
> </aside>
>
> Cheers
> ---Dave
>
>
CJ
[1]:
https://elixir.bootlin.com/linux/v6.18-rc3/source/include/linux/find.h#L203
[2]:
https://elixir.bootlin.com/linux/v6.18-rc3/source/include/linux/find.h#L185
[3]: https://elixir.bootlin.com/linux/v6.18-rc3/source/lib/find_bit.c#L55
* Re: [PATCH] fs/resctrl: Slightly optimize cbm_validate()
2025-11-01 13:40 ` Christophe JAILLET
@ 2025-11-03 16:24 ` Dave Martin
0 siblings, 0 replies; 7+ messages in thread
From: Dave Martin @ 2025-11-03 16:24 UTC
To: Christophe JAILLET
Cc: Tony Luck, Reinette Chatre, James Morse, Babu Moger, linux-kernel,
kernel-janitors
Hi,
On Sat, Nov 01, 2025 at 02:40:58PM +0100, Christophe JAILLET wrote:
> On 27/10/2025 at 12:43, Dave Martin wrote:
> > Hi,
> >
> > [Tony, I have a side question on min_cbm_bits -- see below.]
> >
> > On Sun, Oct 26, 2025 at 08:39:52AM +0100, Christophe JAILLET wrote:
> > > 'first_bit' is known to be 1, so it can be skipped when searching for the
> > > next 0 bit. Doing so mimics bitmap_next_set_region() and can save a few
> > > cycles.
> >
> > This seems reasonable, although:
> >
> > Nit: missing statement of what the patch does. (Your paragraph
> > describes only something that _could_ be done and gives rationale for
> > it.)
>
> Will add it in v2.
Thanks
[...]
> > > For the record, on x86, the diff of the asm code is:
> > > --- fs/resctrl/ctrlmondata.s.old 2025-10-26 08:21:46.928920563 +0100
> > > +++ fs/resctrl/ctrlmondata.s 2025-10-26 08:21:40.864024143 +0100
> > > @@ -1603,11 +1603,12 @@
> > > call _find_first_bit
> > > # ./include/linux/find.h:192: return _find_next_zero_bit(addr, size, offset);
> > > movq %r12, %rsi
> > > - leaq 48(%rsp), %rdi
> > > - movq %rax, %rdx
> > > +# fs/resctrl/ctrlmondata.c:133: zero_bit = find_next_zero_bit(&val, cbm_len, first_bit + 1);
> > > + leaq 1(%rax), %rdx
> > > # ./include/linux/find.h:214: return _find_first_bit(addr, size);
> > > movq %rax, 8(%rsp)
> > > # ./include/linux/find.h:192: return _find_next_zero_bit(addr, size, offset);
> > > + leaq 48(%rsp), %rdi
> >
> > (This is really only showing that the compiler works. The real
> > question is whether the logic is still sound after this change to the
> > arguments of find_next_zero_bit()...)
>
> Will remove in v2, if not useful.
It's harmless, but a bit of a distraction...
> > > call _find_next_zero_bit
> > > # fs/resctrl/ctrlmondata.c:136: if (!r->cache.arch_has_sparse_bitmasks &&
> > > leaq 28(%rbx), %rdi
> > > ---
> > > fs/resctrl/ctrlmondata.c | 2 +-
> > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/fs/resctrl/ctrlmondata.c b/fs/resctrl/ctrlmondata.c
> > > index 0d0ef54fc4de..1ff479a2dbbc 100644
> > > --- a/fs/resctrl/ctrlmondata.c
> > > +++ b/fs/resctrl/ctrlmondata.c
> > > @@ -130,7 +130,7 @@ static bool cbm_validate(char *buf, u32 *data, struct rdt_resource *r)
> > > }
> > > first_bit = find_first_bit(&val, cbm_len);
> > > - zero_bit = find_next_zero_bit(&val, cbm_len, first_bit);
> > > + zero_bit = find_next_zero_bit(&val, cbm_len, first_bit + 1);
> >
> > Does this definitely do the right thing if val was zero?
>
> Yes, IMHO, it does.
>
> If val is zero, first_bit will be set to cbm_len (see [1]).
> Then, find_next_zero_bit() will also return cbm_len, because 'first_bit + 1'
> is past the end of the bitmap (see [2] and [3]).
Right, I think that works.
> The only case where we could have trouble would be if 'first_bit + 1'
> wrapped around to 0. I don't think that such a case is possible.
It looks impossible to me: first_bit comes from
find_first_bit(..., cbm_len), so I don't think it can be greater than
cbm_len.
> > > /* Are non-contiguous bitmasks allowed? */
> > > if (!r->cache.arch_has_sparse_bitmasks &&
> >
> > Also, what about the find_first_bit() below?
>
> Should be updated as well.
> Will send a v2.
OK, sounds fair.
Cheers
---Dave
[...]
> [1]:
> https://elixir.bootlin.com/linux/v6.18-rc3/source/include/linux/find.h#L203
> [2]:
> https://elixir.bootlin.com/linux/v6.18-rc3/source/include/linux/find.h#L185
> [3]: https://elixir.bootlin.com/linux/v6.18-rc3/source/lib/find_bit.c#L55
* Re: [PATCH] fs/resctrl: Slightly optimize cbm_validate()
2025-10-27 11:43 ` Dave Martin
2025-11-01 13:40 ` Christophe JAILLET
@ 2025-11-03 18:17 ` Luck, Tony
2025-11-05 14:51 ` Dave Martin
1 sibling, 1 reply; 7+ messages in thread
From: Luck, Tony @ 2025-11-03 18:17 UTC
To: Dave Martin
Cc: Christophe JAILLET, Reinette Chatre, James Morse, Babu Moger,
linux-kernel, kernel-janitors
On Mon, Oct 27, 2025 at 11:43:49AM +0000, Dave Martin wrote:
> Hi,
>
> [Tony, I have a side question on min_cbm_bits -- see below.]
> [...]
>
> <aside>
>
> Also, not directly related to this patch, but, looking at the final if
> statement:
>
> if ((zero_bit - first_bit) < r->cache.min_cbm_bits) {
> rdt_last_cmd_printf("Need at least %d bits in the mask\n",
> r->cache.min_cbm_bits);
> return false;
> }
>
> If min_cbm_bits is two or greater, this can fail if the bitmap has
> enough contiguous set bits but not in the first block of set bits,
> and it can succeed if there are blocks of set bits beyond the first
> block, that have fewer than min_cbm_bits.
>
> Is that intended? Do we ever expect arch_has_sparse_bitmasks alongside
> min_cbm_bits > 1, or should these be mutually exclusive?
>
> </aside>
There's no enumeration for the minimum number of bits in a CBM mask.
Haswell (first to implement L3 cache allocation) got a quirk to
set it to "2". I don't expect that we'd do that again.
So it is safe to assume that resctrl doesn't have to handle the combination
of min_cbm_bits > 1 with arch_has_sparse_bitmasks.
-Tony
* Re: [PATCH] fs/resctrl: Slightly optimize cbm_validate()
2025-10-26 7:39 [PATCH] fs/resctrl: Slightly optimize cbm_validate() Christophe JAILLET
2025-10-27 11:43 ` Dave Martin
@ 2025-11-03 22:13 ` Reinette Chatre
1 sibling, 0 replies; 7+ messages in thread
From: Reinette Chatre @ 2025-11-03 22:13 UTC
To: Christophe JAILLET, Tony Luck, Dave Martin, James Morse,
Babu Moger
Cc: linux-kernel, kernel-janitors
Hi Christophe,
On 10/26/25 12:39 AM, Christophe JAILLET wrote:
> 'first_bit' is known to be 1, so it can be skipped when searching for the
> next 0 bit. Doing so mimics bitmap_next_set_region() and can save a few
> cycles.
This is not part of a flow where cycles matter and may thus
not be considered unless it forms part of a larger series. We could
work more on getting this ready for inclusion but please be aware that it
may not be considered. This is up to the x86 maintainers so please also
include them in your next submission (x86@kernel.org).
>
> Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
> ---
...
> ---
> fs/resctrl/ctrlmondata.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/resctrl/ctrlmondata.c b/fs/resctrl/ctrlmondata.c
> index 0d0ef54fc4de..1ff479a2dbbc 100644
> --- a/fs/resctrl/ctrlmondata.c
> +++ b/fs/resctrl/ctrlmondata.c
> @@ -130,7 +130,7 @@ static bool cbm_validate(char *buf, u32 *data, struct rdt_resource *r)
> }
>
> first_bit = find_first_bit(&val, cbm_len);
> - zero_bit = find_next_zero_bit(&val, cbm_len, first_bit);
> + zero_bit = find_next_zero_bit(&val, cbm_len, first_bit + 1);
>
It looks like cbm_ensure_valid() has the same pattern so a similar change there would
help to make the code consistent.
Side note: not related to this patch, but it looks like cbm_ensure_valid()
does not take arch support for sparse masks into account.
> /* Are non-contiguous bitmasks allowed? */
> if (!r->cache.arch_has_sparse_bitmasks &&
Reinette
* Re: [PATCH] fs/resctrl: Slightly optimize cbm_validate()
2025-11-03 18:17 ` Luck, Tony
@ 2025-11-05 14:51 ` Dave Martin
0 siblings, 0 replies; 7+ messages in thread
From: Dave Martin @ 2025-11-05 14:51 UTC
To: Luck, Tony
Cc: Christophe JAILLET, Reinette Chatre, James Morse, Babu Moger,
linux-kernel, kernel-janitors
Hi,
On Mon, Nov 03, 2025 at 10:17:16AM -0800, Luck, Tony wrote:
> On Mon, Oct 27, 2025 at 11:43:49AM +0000, Dave Martin wrote:
> > Hi,
> >
> > [Tony, I have a side question on min_cbm_bits -- see below.]
> > [...]
> >
> > <aside>
> >
> > Also, not directly related to this patch, but, looking at the final if
> > statement:
> >
> > if ((zero_bit - first_bit) < r->cache.min_cbm_bits) {
> > rdt_last_cmd_printf("Need at least %d bits in the mask\n",
> > r->cache.min_cbm_bits);
> > return false;
> > }
> >
> > If min_cbm_bits is two or greater, this can fail if the bitmap has
> > enough contiguous set bits but not in the first block of set bits,
> > and it can succeed if there are blocks of set bits beyond the first
> > block, that have fewer than min_cbm_bits.
> >
> > Is that intended? Do we ever expect arch_has_sparse_bitmasks alongside
> > min_cbm_bits > 1, or should these be mutually exclusive?
> >
> > </aside>
>
> There's no enumeration for the minimum number of bits in a CBM mask.
> Haswell (first to implement L3 cache allocation) got a quirk to
> set it to "2". I don't expect that we'd do that again.
>
> So it is safe to assume that resctrl doesn't have to handle the combination
> of min_cbm_bits > 1 with arch_has_sparse_bitmasks.
>
> -Tony
OK. A min_cbm_bits value > 1 seems unlikely with sparse bitmasks
anyway. If the hardware has independent storage for each bit, there
would be no need for such a constraint... so I would be surprised to
see this in practice.
Just wanted to check that I wasn't missing something!
In MPAM, bitmap controls always allow each bit to be controlled
independently, according to the architecture.
Cheers
---Dave