All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 0/2] x86/math64: handle #DE in mul_u64_u64_div_u64()
@ 2025-08-15 16:40 Oleg Nesterov
  2025-08-15 16:40 ` [PATCH v3 1/2] " Oleg Nesterov
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Oleg Nesterov @ 2025-08-15 16:40 UTC (permalink / raw)
  To: Borislav Petkov, Dave Hansen, H. Peter Anvin, Ingo Molnar,
	Peter Zijlstra, Thomas Gleixner
  Cc: David Laight, Li RongQing, Yu Kuai, Khazhismel Kumykov,
	Jens Axboe, linux-kernel, x86

Hello,

Changes since V2:

	1/2 - added the acks from David and Li (thank you)
	2/2 - new and trivial

Oleg.
---

 arch/x86/include/asm/div64.h | 13 ++++++++-----
 block/blk-throttle.c         |  6 ------
 2 files changed, 8 insertions(+), 11 deletions(-)


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v3 1/2] x86/math64: handle #DE in mul_u64_u64_div_u64()
  2025-08-15 16:40 [PATCH v3 0/2] x86/math64: handle #DE in mul_u64_u64_div_u64() Oleg Nesterov
@ 2025-08-15 16:40 ` Oleg Nesterov
  2025-08-15 16:41 ` [PATCH v3 2/2] blk-throttle: kill the no longer needed overflow check in calculate_bytes_allowed() Oleg Nesterov
  2025-08-17 13:07 ` [PATCH v3 0/2] x86/math64: handle #DE in mul_u64_u64_div_u64() David Laight
  2 siblings, 0 replies; 9+ messages in thread
From: Oleg Nesterov @ 2025-08-15 16:40 UTC (permalink / raw)
  To: Borislav Petkov, Dave Hansen, H. Peter Anvin, Ingo Molnar,
	Peter Zijlstra, Thomas Gleixner
  Cc: David Laight, Li RongQing, Yu Kuai, Khazhismel Kumykov,
	Jens Axboe, linux-kernel, x86

Change mul_u64_u64_div_u64() to return ULONG_MAX if the result doesn't
fit into u64 or div == 0. The former matches the generic implementation
in lib/math/div64.c, the latter doesn't. Perhaps we will add a WARN()
into the fixup_exception() paths later.

No need to use _ASM_EXTABLE_TYPE_REG(), we know that the target register
is pt_regs->ax with offset == 0, so a simple EX_DATA_REG(0) should work
just fine.

Reported-by: Li RongQing <lirongqing@baidu.com>
Link: https://lore.kernel.org/all/78a0d7bb20504c0884d474868eccd858@baidu.com/
Reviewed-by: David Laight <david.laight.linux@gmail.com>
Tested-by: Li RongQing <lirongqing@baidu.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 arch/x86/include/asm/div64.h | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/div64.h b/arch/x86/include/asm/div64.h
index 9931e4c7d73f..c2532d38ed0a 100644
--- a/arch/x86/include/asm/div64.h
+++ b/arch/x86/include/asm/div64.h
@@ -79,18 +79,21 @@ static inline u64 mul_u32_u32(u32 a, u32 b)
 
 #else
 # include <asm-generic/div64.h>
+# include <asm/asm.h>
 
 /*
- * Will generate an #DE when the result doesn't fit u64, could fix with an
- * __ex_table[] entry when it becomes an issue.
+ * Returns U64_MAX if the result doesn't fit u64 or div == 0.
  */
 static inline u64 mul_u64_u64_div_u64(u64 a, u64 mul, u64 div)
 {
 	u64 q;
 
-	asm ("mulq %2; divq %3" : "=a" (q)
-				: "a" (a), "rm" (mul), "rm" (div)
-				: "rdx");
+	asm ("mulq %2; 1: divq %3; 2:\n"
+		_ASM_EXTABLE_TYPE(1b, 2b,
+			EX_TYPE_IMM_REG | EX_DATA_REG(0) | EX_DATA_IMM(-1))
+		: "=a" (q)
+		: "a" (a), "rm" (mul), "rm" (div)
+		: "rdx");
 
 	return q;
 }
-- 
2.25.1.362.g51ebf55


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v3 2/2] blk-throttle: kill the no longer needed overflow check in calculate_bytes_allowed()
  2025-08-15 16:40 [PATCH v3 0/2] x86/math64: handle #DE in mul_u64_u64_div_u64() Oleg Nesterov
  2025-08-15 16:40 ` [PATCH v3 1/2] " Oleg Nesterov
@ 2025-08-15 16:41 ` Oleg Nesterov
  2025-08-17 12:50   ` David Laight
  2025-08-17 13:07 ` [PATCH v3 0/2] x86/math64: handle #DE in mul_u64_u64_div_u64() David Laight
  2 siblings, 1 reply; 9+ messages in thread
From: Oleg Nesterov @ 2025-08-15 16:41 UTC (permalink / raw)
  To: Borislav Petkov, Dave Hansen, H. Peter Anvin, Ingo Molnar,
	Peter Zijlstra, Thomas Gleixner
  Cc: David Laight, Li RongQing, Yu Kuai, Khazhismel Kumykov,
	Jens Axboe, linux-kernel, x86

Now that mul_u64_u64_div_u64() can't crash there is no need to check for
possible overflow in calculate_bytes_allowed().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 block/blk-throttle.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 397b6a410f9e..66339e22cc85 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -601,12 +601,6 @@ static unsigned int calculate_io_allowed(u32 iops_limit,
 
 static u64 calculate_bytes_allowed(u64 bps_limit, unsigned long jiffy_elapsed)
 {
-	/*
-	 * Can result be wider than 64 bits?
-	 * We check against 62, not 64, due to ilog2 truncation.
-	 */
-	if (ilog2(bps_limit) + ilog2(jiffy_elapsed) - ilog2(HZ) > 62)
-		return U64_MAX;
 	return mul_u64_u64_div_u64(bps_limit, (u64)jiffy_elapsed, (u64)HZ);
 }
 
-- 
2.25.1.362.g51ebf55


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH v3 2/2] blk-throttle: kill the no longer needed overflow check in calculate_bytes_allowed()
  2025-08-15 16:41 ` [PATCH v3 2/2] blk-throttle: kill the no longer needed overflow check in calculate_bytes_allowed() Oleg Nesterov
@ 2025-08-17 12:50   ` David Laight
  2025-08-17 17:06     ` H. Peter Anvin
  2025-08-18 13:21     ` Oleg Nesterov
  0 siblings, 2 replies; 9+ messages in thread
From: David Laight @ 2025-08-17 12:50 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Borislav Petkov, Dave Hansen, H. Peter Anvin, Ingo Molnar,
	Peter Zijlstra, Thomas Gleixner, Li RongQing, Yu Kuai,
	Khazhismel Kumykov, Jens Axboe, linux-kernel, x86

On Fri, 15 Aug 2025 18:41:02 +0200
Oleg Nesterov <oleg@redhat.com> wrote:

> Now that mul_u64_u64_div_u64() can't crash there is no need to check for
> possible overflow in calculate_bytes_allowed().
> 
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
> ---
>  block/blk-throttle.c | 6 ------
>  1 file changed, 6 deletions(-)
> 
> diff --git a/block/blk-throttle.c b/block/blk-throttle.c
> index 397b6a410f9e..66339e22cc85 100644
> --- a/block/blk-throttle.c
> +++ b/block/blk-throttle.c
> @@ -601,12 +601,6 @@ static unsigned int calculate_io_allowed(u32 iops_limit,
>  
>  static u64 calculate_bytes_allowed(u64 bps_limit, unsigned long jiffy_elapsed)
>  {
> -	/*
> -	 * Can result be wider than 64 bits?
> -	 * We check against 62, not 64, due to ilog2 truncation.
> -	 */
> -	if (ilog2(bps_limit) + ilog2(jiffy_elapsed) - ilog2(HZ) > 62)
> -		return U64_MAX;
>  	return mul_u64_u64_div_u64(bps_limit, (u64)jiffy_elapsed, (u64)HZ);

Not directly related, but the two (u64) casts are pointless and can be removed.

	David

>  }
>  


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v3 0/2] x86/math64: handle #DE in mul_u64_u64_div_u64()
  2025-08-15 16:40 [PATCH v3 0/2] x86/math64: handle #DE in mul_u64_u64_div_u64() Oleg Nesterov
  2025-08-15 16:40 ` [PATCH v3 1/2] " Oleg Nesterov
  2025-08-15 16:41 ` [PATCH v3 2/2] blk-throttle: kill the no longer needed overflow check in calculate_bytes_allowed() Oleg Nesterov
@ 2025-08-17 13:07 ` David Laight
  2025-08-18 12:39   ` Oleg Nesterov
  2 siblings, 1 reply; 9+ messages in thread
From: David Laight @ 2025-08-17 13:07 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Borislav Petkov, Dave Hansen, H. Peter Anvin, Ingo Molnar,
	Peter Zijlstra, Thomas Gleixner, Li RongQing, Yu Kuai,
	Khazhismel Kumykov, Jens Axboe, linux-kernel, x86

On Fri, 15 Aug 2025 18:40:09 +0200
Oleg Nesterov <oleg@redhat.com> wrote:

One of my 'idea patches' is to make mul_u64_u64_div_u64() a wrapper for
another function that takes in extra 'int *overflowed' parameter that is
set zero/non-zero for success/overflow.
The 'overflowed' parameter can either be a compile-time NULL or a
valid pointer.

So the x86-x64 asm implementation would use different code - you need
the 'jump around fail label' to write the ~0 return value to *overflowed.
The extra pointer check in the C version normal path may not be worth
worrying about (but the '*overflow = 0' could easily be inlined).

The typical use would be:
	quotient = mul_u64_u64_div_u64_overflow(..., &overflowed);
	if (quotient == ~0ull && overflowed)
		...
That will generate better code than returning 'overflowed' and the
quotient by reference.

Although I wonder how often ~0ull is a valid result?

	David

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v3 2/2] blk-throttle: kill the no longer needed overflow check in calculate_bytes_allowed()
  2025-08-17 12:50   ` David Laight
@ 2025-08-17 17:06     ` H. Peter Anvin
  2025-08-18 13:21     ` Oleg Nesterov
  1 sibling, 0 replies; 9+ messages in thread
From: H. Peter Anvin @ 2025-08-17 17:06 UTC (permalink / raw)
  To: David Laight, Oleg Nesterov
  Cc: Borislav Petkov, Dave Hansen, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Li RongQing, Yu Kuai, Khazhismel Kumykov,
	Jens Axboe, linux-kernel, x86

On August 17, 2025 5:50:13 AM PDT, David Laight <david.laight.linux@gmail.com> wrote:
>On Fri, 15 Aug 2025 18:41:02 +0200
>Oleg Nesterov <oleg@redhat.com> wrote:
>
>> Now that mul_u64_u64_div_u64() can't crash there is no need to check for
>> possible overflow in calculate_bytes_allowed().
>> 
>> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
>> ---
>>  block/blk-throttle.c | 6 ------
>>  1 file changed, 6 deletions(-)
>> 
>> diff --git a/block/blk-throttle.c b/block/blk-throttle.c
>> index 397b6a410f9e..66339e22cc85 100644
>> --- a/block/blk-throttle.c
>> +++ b/block/blk-throttle.c
>> @@ -601,12 +601,6 @@ static unsigned int calculate_io_allowed(u32 iops_limit,
>>  
>>  static u64 calculate_bytes_allowed(u64 bps_limit, unsigned long jiffy_elapsed)
>>  {
>> -	/*
>> -	 * Can result be wider than 64 bits?
>> -	 * We check against 62, not 64, due to ilog2 truncation.
>> -	 */
>> -	if (ilog2(bps_limit) + ilog2(jiffy_elapsed) - ilog2(HZ) > 62)
>> -		return U64_MAX;
>>  	return mul_u64_u64_div_u64(bps_limit, (u64)jiffy_elapsed, (u64)HZ);
>
>Not directly related, but the two (u64) casts are pointless and can be removed.
>
>	David
>
>>  }
>>  
>

It's also rather broken, because a division with a constant can be implemented as a multiply, and both gcc and clang knows how to do that.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v3 0/2] x86/math64: handle #DE in mul_u64_u64_div_u64()
  2025-08-17 13:07 ` [PATCH v3 0/2] x86/math64: handle #DE in mul_u64_u64_div_u64() David Laight
@ 2025-08-18 12:39   ` Oleg Nesterov
  2025-08-18 18:28     ` David Laight
  0 siblings, 1 reply; 9+ messages in thread
From: Oleg Nesterov @ 2025-08-18 12:39 UTC (permalink / raw)
  To: David Laight
  Cc: Borislav Petkov, Dave Hansen, H. Peter Anvin, Ingo Molnar,
	Peter Zijlstra, Thomas Gleixner, Li RongQing, Yu Kuai,
	Khazhismel Kumykov, Jens Axboe, linux-kernel, x86

David,

We had a lengthy discussion and you have already acked this fix.

I thought that we agreed on that a) we need to fix the problem first
and b) x86 version should be consistent with the generic implementation
regarding ~0ull on overflow.

Can we finally merge this fix, then discuss the possible improvements
and possibly change both implementation?

Oleg.

On 08/17, David Laight wrote:
>
> On Fri, 15 Aug 2025 18:40:09 +0200
> Oleg Nesterov <oleg@redhat.com> wrote:
>
> One of my 'idea patches' is to make mul_u64_u64_div_u64() a wrapper for
> another function that takes in extra 'int *overflowed' parameter that is
> set zero/non-zero for success/overflow.
> The 'overflowed' parameter can either be a compile-time NULL or a
> valid pointer.
>
> So the x86-x64 asm implementation would use different code - you need
> the 'jump around fail label' to write the ~0 return value to *overflowed.
> The extra pointer check in the C version normal path may not be worth
> worrying about (but the '*overflow = 0' could easily be inlined).
>
> The typical use would be:
> 	quotient = mul_u64_u64_div_u64_overflow(..., &overflowed);
> 	if (quotient == ~0ull && overflowed)
> 		...
> That will generate better code than returning 'overflowed' and the
> quotient by reference.
>
> Although I wonder how often ~0ull is a valid result?
>
> 	David
>


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v3 2/2] blk-throttle: kill the no longer needed overflow check in calculate_bytes_allowed()
  2025-08-17 12:50   ` David Laight
  2025-08-17 17:06     ` H. Peter Anvin
@ 2025-08-18 13:21     ` Oleg Nesterov
  1 sibling, 0 replies; 9+ messages in thread
From: Oleg Nesterov @ 2025-08-18 13:21 UTC (permalink / raw)
  To: David Laight
  Cc: Borislav Petkov, Dave Hansen, H. Peter Anvin, Ingo Molnar,
	Peter Zijlstra, Thomas Gleixner, Li RongQing, Yu Kuai,
	Khazhismel Kumykov, Jens Axboe, linux-kernel, x86

On 08/17, David Laight wrote:
>
> On Fri, 15 Aug 2025 18:41:02 +0200
> Oleg Nesterov <oleg@redhat.com> wrote:
>
> > Now that mul_u64_u64_div_u64() can't crash there is no need to check for
> > possible overflow in calculate_bytes_allowed().
> >
> > Signed-off-by: Oleg Nesterov <oleg@redhat.com>
> > ---
> >  block/blk-throttle.c | 6 ------
> >  1 file changed, 6 deletions(-)
> >
> > diff --git a/block/blk-throttle.c b/block/blk-throttle.c
> > index 397b6a410f9e..66339e22cc85 100644
> > --- a/block/blk-throttle.c
> > +++ b/block/blk-throttle.c
> > @@ -601,12 +601,6 @@ static unsigned int calculate_io_allowed(u32 iops_limit,
> >
> >  static u64 calculate_bytes_allowed(u64 bps_limit, unsigned long jiffy_elapsed)
> >  {
> > -	/*
> > -	 * Can result be wider than 64 bits?
> > -	 * We check against 62, not 64, due to ilog2 truncation.
> > -	 */
> > -	if (ilog2(bps_limit) + ilog2(jiffy_elapsed) - ilog2(HZ) > 62)
> > -		return U64_MAX;
> >  	return mul_u64_u64_div_u64(bps_limit, (u64)jiffy_elapsed, (u64)HZ);
>
> Not directly related, but the two (u64) casts are pointless and can be removed.

Yeah... I only reverted 2dd710d476f2 ("blk-throttle: check for overflow in
calculate_bytes_allowed")

Oleg.


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v3 0/2] x86/math64: handle #DE in mul_u64_u64_div_u64()
  2025-08-18 12:39   ` Oleg Nesterov
@ 2025-08-18 18:28     ` David Laight
  0 siblings, 0 replies; 9+ messages in thread
From: David Laight @ 2025-08-18 18:28 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Borislav Petkov, Dave Hansen, H. Peter Anvin, Ingo Molnar,
	Peter Zijlstra, Thomas Gleixner, Li RongQing, Yu Kuai,
	Khazhismel Kumykov, Jens Axboe, linux-kernel, x86

On Mon, 18 Aug 2025 14:39:01 +0200
Oleg Nesterov <oleg@redhat.com> wrote:

> David,
> 
> We had a lengthy discussion and you have already acked this fix.
> 
> I thought that we agreed on that a) we need to fix the problem first
> and b) x86 version should be consistent with the generic implementation
> regarding ~0ull on overflow.
> 
> Can we finally merge this fix, then discuss the possible improvements
> and possibly change both implementation?

I deliberately put this comment on 0/2 because it is 'future thought'.
I didn't want to delay the patch going in.

	David


> 
> Oleg.
> 
> On 08/17, David Laight wrote:
> >
> > On Fri, 15 Aug 2025 18:40:09 +0200
> > Oleg Nesterov <oleg@redhat.com> wrote:
> >
> > One of my 'idea patches' is to make mul_u64_u64_div_u64() a wrapper for
> > another function that takes in extra 'int *overflowed' parameter that is
> > set zero/non-zero for success/overflow.
> > The 'overflowed' parameter can either be a compile-time NULL or a
> > valid pointer.
> >
> > So the x86-x64 asm implementation would use different code - you need
> > the 'jump around fail label' to write the ~0 return value to *overflowed.
> > The extra pointer check in the C version normal path may not be worth
> > worrying about (but the '*overflow = 0' could easily be inlined).
> >
> > The typical use would be:
> > 	quotient = mul_u64_u64_div_u64_overflow(..., &overflowed);
> > 	if (quotient == ~0ull && overflowed)
> > 		...
> > That will generate better code than returning 'overflowed' and the
> > quotient by reference.
> >
> > Although I wonder how often ~0ull is a valid result?
> >
> > 	David
> >  
> 


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2025-08-18 18:28 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-15 16:40 [PATCH v3 0/2] x86/math64: handle #DE in mul_u64_u64_div_u64() Oleg Nesterov
2025-08-15 16:40 ` [PATCH v3 1/2] " Oleg Nesterov
2025-08-15 16:41 ` [PATCH v3 2/2] blk-throttle: kill the no longer needed overflow check in calculate_bytes_allowed() Oleg Nesterov
2025-08-17 12:50   ` David Laight
2025-08-17 17:06     ` H. Peter Anvin
2025-08-18 13:21     ` Oleg Nesterov
2025-08-17 13:07 ` [PATCH v3 0/2] x86/math64: handle #DE in mul_u64_u64_div_u64() David Laight
2025-08-18 12:39   ` Oleg Nesterov
2025-08-18 18:28     ` David Laight

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.