Kernel KVM virtualization development
* Re: [PATCH kvm-unit-tests] realmode: load above stack
       [not found] <20240604143507.1041901-1-pbonzini@redhat.com>
@ 2026-05-04  7:58 ` Thomas Huth
  2026-05-04  8:07   ` intel_iommu unit test is also failing (was: Re: [PATCH kvm-unit-tests] realmode: load above stack) Thomas Huth
  0 siblings, 1 reply; 17+ messages in thread
From: Thomas Huth @ 2026-05-04  7:58 UTC (permalink / raw)
  To: Paolo Bonzini, kvm

On 04/06/2024 16.35, Paolo Bonzini wrote:
> The bottom 32K of memory are generally reserved for use by the BIOS;
> for example, traditionally the boot loader is placed at 0x7C00 and
> the stack grows below that address.
> 
> It turns out that with some versions of clang, realmode.flat has
> become big enough that it overlaps the stack used by the multiboot
> option ROM loader.  The result is that a couple instructions are
> overwritten.  Typically one or two tests fail and that's it...
> 
> Move the code above the forbidden region, in real 90s style.
> 
> Reported-by: Thomas Huth <thuth@redhat.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>   x86/realmode.lds | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/x86/realmode.lds b/x86/realmode.lds
> index 0ed3063b..e4782a98 100644
> --- a/x86/realmode.lds
> +++ b/x86/realmode.lds
> @@ -1,6 +1,6 @@
>   SECTIONS
>   {
> -    . = 16K;
> +    . = 32K;
>       stext = .;
>       .text : { *(.init) *(.text) }
>       . = ALIGN(4K);
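
The overlap described in the commit message comes down to simple address arithmetic. The following is a minimal sketch, not code from kvm-unit-tests: the stack top of 0x7C00, the helper name `image_overlaps_stack` and the `stack_depth` parameter are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption for illustration: the loader's stack starts at 0x7C00 and grows down. */
#define ASSUMED_STACK_TOP 0x7C00u

/*
 * Hypothetical helper: returns true if an image of `size` bytes loaded at
 * `base` reaches into the `stack_depth` bytes just below the assumed stack top.
 */
static bool image_overlaps_stack(uint32_t base, uint32_t size, uint32_t stack_depth)
{
	assert(stack_depth <= ASSUMED_STACK_TOP);

	uint32_t image_end = base + size;                     /* first byte after the image */
	uint32_t stack_low = ASSUMED_STACK_TOP - stack_depth; /* lowest stack byte in use */

	return base < ASSUMED_STACK_TOP && image_end > stack_low;
}
```

Under these assumptions, an image loaded at 16K (0x4000) starts to collide once it grows towards 0x3C00 bytes, while an image loaded at 32K (0x8000) sits entirely above 0x7C00 and cannot collide with a stack that grows down from there.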

  Hi Paolo!

FYI, the realmode kvm-unit-test now also fails with the recent version of 
GCC 16 for the i386 target:

  https://gitlab.com/kvm-unit-tests/kvm-unit-tests/-/jobs/14195195727

It was working fine some weeks ago with GCC 15.1:

  https://gitlab.com/kvm-unit-tests/kvm-unit-tests/-/jobs/13977260961

When I apply your patch, the problem seems to be gone again in this case, 
but since there were some other issues with this (with older versions of 
GCC, I think):

https://lore.kernel.org/kvm/49f8aadf-6e3f-4d2b-a32a-8ba941a3a2a1@redhat.com/

... there must be a better way to fix it?

Could you please have a look?

  Thanks,
   Thomas


^ permalink raw reply	[flat|nested] 17+ messages in thread

* intel_iommu unit test is also failing (was: Re: [PATCH kvm-unit-tests] realmode: load above stack)
  2026-05-04  7:58 ` [PATCH kvm-unit-tests] realmode: load above stack Thomas Huth
@ 2026-05-04  8:07   ` Thomas Huth
  2026-05-04 15:45     ` Peter Xu
  0 siblings, 1 reply; 17+ messages in thread
From: Thomas Huth @ 2026-05-04  8:07 UTC (permalink / raw)
  To: Paolo Bonzini, kvm; +Cc: Peter Xu

On 04/05/2026 09.58, Thomas Huth wrote:
> On 04/06/2024 16.35, Paolo Bonzini wrote:
>> The bottom 32K of memory are generally reserved for use by the BIOS;
>> for example, traditionally the boot loader is placed at 0x7C00 and
>> the stack grows below that address.
>>
>> It turns out that with some versions of clang, realmode.flat has
>> become big enough that it overlaps the stack used by the multiboot
>> option ROM loader.  The result is that a couple instructions are
>> overwritten.  Typically one or two tests fail and that's it...
>>
>> Move the code above the forbidden region, in real 90s style.
>>
>> Reported-by: Thomas Huth <thuth@redhat.com>
>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>> ---
>>   x86/realmode.lds | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/x86/realmode.lds b/x86/realmode.lds
>> index 0ed3063b..e4782a98 100644
>> --- a/x86/realmode.lds
>> +++ b/x86/realmode.lds
>> @@ -1,6 +1,6 @@
>>   SECTIONS
>>   {
>> -    . = 16K;
>> +    . = 32K;
>>       stext = .;
>>       .text : { *(.init) *(.text) }
>>       . = ALIGN(4K);
> 
>   Hi Paolo!
> 
> FYI, the realmode kvm-unit-test now also fails with the recent version of 
> GCC 16 for the i386 target:
> 
>   https://gitlab.com/kvm-unit-tests/kvm-unit-tests/-/jobs/14195195727
> 
> It was working fine some weeks ago with GCC 15.1:
> 
>   https://gitlab.com/kvm-unit-tests/kvm-unit-tests/-/jobs/13977260961
> 
> When I apply your patch, the problem seems to be gone again in this case, 
> but since there were some other issues with this (with older versions of 
> GCC, I think):
> 
> https://lore.kernel.org/kvm/49f8aadf-6e3f-4d2b-a32a-8ba941a3a2a1@redhat.com/
> 
> ... there must be a better way to fix it?
> 
> Could you please have a look?

By the way, the intel_iommu test has now also suddenly started failing (for the 
x86_64 target), either due to the GCC update or due to the update from QEMU 
v10.2 to 11.0:

  https://gitlab.com/kvm-unit-tests/kvm-unit-tests/-/jobs/14195195728

Two weeks ago, it was still working fine:

  https://gitlab.com/kvm-unit-tests/kvm-unit-tests/-/jobs/13977260962

  Thomas


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: intel_iommu unit test is also failing (was: Re: [PATCH kvm-unit-tests] realmode: load above stack)
  2026-05-04  8:07   ` intel_iommu unit test is also failing (was: Re: [PATCH kvm-unit-tests] realmode: load above stack) Thomas Huth
@ 2026-05-04 15:45     ` Peter Xu
  2026-05-05  5:49       ` Clément MATHIEU--DRIF
  0 siblings, 1 reply; 17+ messages in thread
From: Peter Xu @ 2026-05-04 15:45 UTC (permalink / raw)
  To: Thomas Huth
  Cc: Paolo Bonzini, kvm, Yi Liu, Clément Mathieu--Drif,
	Michael S. Tsirkin

On Mon, May 04, 2026 at 10:07:25AM +0200, Thomas Huth wrote:
> On 04/05/2026 09.58, Thomas Huth wrote:
> > On 04/06/2024 16.35, Paolo Bonzini wrote:
> > > The bottom 32K of memory are generally reserved for use by the BIOS;
> > > for example, traditionally the boot loader is placed at 0x7C00 and
> > > the stack grows below that address.
> > > 
> > > It turns out that with some versions of clang, realmode.flat has
> > > become big enough that it overlaps the stack used by the multiboot
> > > option ROM loader.  The result is that a couple instructions are
> > > overwritten.  Typically one or two tests fail and that's it...
> > > 
> > > Move the code above the forbidden region, in real 90s style.
> > > 
> > > Reported-by: Thomas Huth <thuth@redhat.com>
> > > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> > > ---
> > >   x86/realmode.lds | 2 +-
> > >   1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/x86/realmode.lds b/x86/realmode.lds
> > > index 0ed3063b..e4782a98 100644
> > > --- a/x86/realmode.lds
> > > +++ b/x86/realmode.lds
> > > @@ -1,6 +1,6 @@
> > >   SECTIONS
> > >   {
> > > -    . = 16K;
> > > +    . = 32K;
> > >       stext = .;
> > >       .text : { *(.init) *(.text) }
> > >       . = ALIGN(4K);
> > 
> >   Hi Paolo!
> > 
> > FYI, the realmode kvm-unit-test now also fails with the recent version
> > of GCC 16 for the i386 target:
> > 
> >   https://gitlab.com/kvm-unit-tests/kvm-unit-tests/-/jobs/14195195727
> > 
> > It was working fine some weeks ago with GCC 15.1:
> > 
> >   https://gitlab.com/kvm-unit-tests/kvm-unit-tests/-/jobs/13977260961
> > 
> > When I apply your patch, the problem seems to be gone again in this
> > case, but since there were some other issues with this (with older
> > versions of GCC, I think):
> > 
> > https://lore.kernel.org/kvm/49f8aadf-6e3f-4d2b-a32a-8ba941a3a2a1@redhat.com/
> > 
> > ... there must be a better way to fix it?
> > 
> > Could you please have a look?
> 
> By the way, the intel_iommu test now also suddenly started failing (for the
> x86_64 target), either due to update of GCC or due to the update from QEMU
> v10.2 to 11.0 :
> 
>  https://gitlab.com/kvm-unit-tests/kvm-unit-tests/-/jobs/14195195728
> 
> Two weeks ago, it was still working fine:
> 
>  https://gitlab.com/kvm-unit-tests/kvm-unit-tests/-/jobs/13977260962

Looping in those who take care of qemu's VT-D now (Yi, Clément)..

-- 
Peter Xu


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: intel_iommu unit test is also failing (was: Re: [PATCH kvm-unit-tests] realmode: load above stack)
  2026-05-04 15:45     ` Peter Xu
@ 2026-05-05  5:49       ` Clément MATHIEU--DRIF
  2026-05-05  6:37         ` Clément MATHIEU--DRIF
  0 siblings, 1 reply; 17+ messages in thread
From: Clément MATHIEU--DRIF @ 2026-05-05  5:49 UTC (permalink / raw)
  To: Peter Xu, Thomas Huth
  Cc: Paolo Bonzini, kvm@vger.kernel.org, Yi Liu, Michael S. Tsirkin

Hi,

Indeed, it seems to start failing when switching to gcc 16.1.

gcc 15.2.1 - Qemu 11 => pass
gcc 16.1 - Qemu 11 => fail

On Mon, 2026-05-04 at 11:45 -0400, Peter Xu wrote:
> On Mon, May 04, 2026 at 10:07:25AM +0200, Thomas Huth wrote:
>
> > On 04/05/2026 09.58, Thomas Huth wrote:
> >
> > > On 04/06/2024 16.35, Paolo Bonzini wrote:
> > >
> > > > The bottom 32K of memory are generally reserved for use by the BIOS;
> > > > for example, traditionally the boot loader is placed at 0x7C00 and
> > > > the stack grows below that address.
> > > >
> > > > It turns out that with some versions of clang, realmode.flat has
> > > > become big enough that it overlaps the stack used by the multiboot
> > > > option ROM loader.  The result is that a couple instructions are
> > > > overwritten.  Typically one or two tests fail and that's it...
> > > >
> > > > Move the code above the forbidden region, in real 90s style.
> > > >
> > > > > Reported-by: Thomas Huth <thuth@redhat.com>
> > > > > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> > > > ---
> > > >   x86/realmode.lds | 2 +-
> > > >   1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/x86/realmode.lds b/x86/realmode.lds
> > > > index 0ed3063b..e4782a98 100644
> > > > --- a/x86/realmode.lds
> > > > +++ b/x86/realmode.lds
> > > > @@ -1,6 +1,6 @@
> > > >   SECTIONS
> > > >   {
> > > > -    . = 16K;
> > > > +    . = 32K;
> > > >       stext = .;
> > > >       .text : { *(.init) *(.text) }
> > > >       . = ALIGN(4K);
> > >
> > >
> > >   Hi Paolo!
> > >
> > > FYI, the realmode kvm-unit-test now also fails with the recent version
> > > of GCC 16 for the i386 target:
> > >
> > > >   https://gitlab.com/kvm-unit-tests/kvm-unit-tests/-/jobs/14195195727
> > >
> > > It was working fine some weeks ago with GCC 15.1:
> > >
> > > >   https://gitlab.com/kvm-unit-tests/kvm-unit-tests/-/jobs/13977260961
> > >
> > > When I apply your patch, the problem seems to be gone again in this
> > > case, but since there were some other issues with this (with older
> > > versions of GCC, I think):
> > >
> > > > https://lore.kernel.org/kvm/49f8aadf-6e3f-4d2b-a32a-8ba941a3a2a1@redhat.com/
> > >
> > > ... there must be a better way to fix it?
> > >
> > > Could you please have a look?
> >
> >
> > By the way, the intel_iommu test now also suddenly started failing (for the
> > x86_64 target), either due to update of GCC or due to the update from QEMU
> > v10.2 to 11.0 :
> >
> > >  https://gitlab.com/kvm-unit-tests/kvm-unit-tests/-/jobs/14195195728
> >
> > Two weeks ago, it was still working fine:
> >
> > >  https://gitlab.com/kvm-unit-tests/kvm-unit-tests/-/jobs/13977260962
>
>
> Looping in those who take care of qemu's VT-D now (Yi, Clément)..
>
> --
> Peter Xu
>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: intel_iommu unit test is also failing (was: Re: [PATCH kvm-unit-tests] realmode: load above stack)
  2026-05-05  5:49       ` Clément MATHIEU--DRIF
@ 2026-05-05  6:37         ` Clément MATHIEU--DRIF
  2026-05-05  7:36           ` Clément MATHIEU--DRIF
  0 siblings, 1 reply; 17+ messages in thread
From: Clément MATHIEU--DRIF @ 2026-05-05  6:37 UTC (permalink / raw)
  To: Peter Xu, Thomas Huth
  Cc: Paolo Bonzini, kvm@vger.kernel.org, Yi Liu, Michael S. Tsirkin

I will try to investigate today. It seems that the host does not wait for the DMA operation to complete before reading back.

I'll keep you posted.

cmd

On Tue, 2026-05-05 at 07:49 +0200, Clement Mathieu--Drif wrote:
> Hi,
>
> Indeed, it seems to start failing when switching to gcc 16.1.
>
> gcc 15.2.1 - Qemu 11 => pass
> gcc 16.1 - Qemu 11 => fail
>
> On Mon, 2026-05-04 at 11:45 -0400, Peter Xu wrote:
>
> > On Mon, May 04, 2026 at 10:07:25AM +0200, Thomas Huth wrote:
> >
> >
> > > On 04/05/2026 09.58, Thomas Huth wrote:
> > >
> > >
> > > > On 04/06/2024 16.35, Paolo Bonzini wrote:
> > > >
> > > >
> > > > > The bottom 32K of memory are generally reserved for use by the BIOS;
> > > > > for example, traditionally the boot loader is placed at 0x7C00 and
> > > > > the stack grows below that address.
> > > > >
> > > > > It turns out that with some versions of clang, realmode.flat has
> > > > > become big enough that it overlaps the stack used by the multiboot
> > > > > option ROM loader.  The result is that a couple instructions are
> > > > > overwritten.  Typically one or two tests fail and that's it...
> > > > >
> > > > > Move the code above the forbidden region, in real 90s style.
> > > > >
> > > > > > Reported-by: Thomas Huth <thuth@redhat.com>
> > > > > > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> > > > > ---
> > > > >   x86/realmode.lds | 2 +-
> > > > >   1 file changed, 1 insertion(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/x86/realmode.lds b/x86/realmode.lds
> > > > > index 0ed3063b..e4782a98 100644
> > > > > --- a/x86/realmode.lds
> > > > > +++ b/x86/realmode.lds
> > > > > @@ -1,6 +1,6 @@
> > > > >   SECTIONS
> > > > >   {
> > > > > -    . = 16K;
> > > > > +    . = 32K;
> > > > >       stext = .;
> > > > >       .text : { *(.init) *(.text) }
> > > > >       . = ALIGN(4K);
> > > >
> > > >
> > > >
> > > >   Hi Paolo!
> > > >
> > > > FYI, the realmode kvm-unit-test now also fails with the recent version
> > > > of GCC 16 for the i386 target:
> > > >
> > > > >   https://gitlab.com/kvm-unit-tests/kvm-unit-tests/-/jobs/14195195727
> > > >
> > > > It was working fine some weeks ago with GCC 15.1:
> > > >
> > > > >   https://gitlab.com/kvm-unit-tests/kvm-unit-tests/-/jobs/13977260961
> > > >
> > > > When I apply your patch, the problem seems to be gone again in this
> > > > case, but since there were some other issues with this (with older
> > > > versions of GCC, I think):
> > > >
> > > > > https://lore.kernel.org/kvm/49f8aadf-6e3f-4d2b-a32a-8ba941a3a2a1@redhat.com/
> > > >
> > > > ... there must be a better way to fix it?
> > > >
> > > > Could you please have a look?
> > >
> > >
> > >
> > > By the way, the intel_iommu test now also suddenly started failing (for the
> > > x86_64 target), either due to update of GCC or due to the update from QEMU
> > > v10.2 to 11.0 :
> > >
> > > >  https://gitlab.com/kvm-unit-tests/kvm-unit-tests/-/jobs/14195195728
> > >
> > > Two weeks ago, it was still working fine:
> > >
> > > >  https://gitlab.com/kvm-unit-tests/kvm-unit-tests/-/jobs/13977260962
> >
> >
> >
> > Looping in those who take care of qemu's VT-D now (Yi, Clément)..
> >
> > --
> > Peter Xu
> >
>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: intel_iommu unit test is also failing (was: Re: [PATCH kvm-unit-tests] realmode: load above stack)
  2026-05-05  6:37         ` Clément MATHIEU--DRIF
@ 2026-05-05  7:36           ` Clément MATHIEU--DRIF
  2026-05-05  9:27             ` Clément MATHIEU--DRIF
  0 siblings, 1 reply; 17+ messages in thread
From: Clément MATHIEU--DRIF @ 2026-05-05  7:36 UTC (permalink / raw)
  To: Peter Xu, Thomas Huth
  Cc: Paolo Bonzini, kvm@vger.kernel.org, Yi Liu, Michael S. Tsirkin

Back with some answers:

This is the offending hunk:

```diff
--- <unnamed>
+++ <unnamed>
@@ -1,17 +1,16 @@
-  404395:       8b 80 98 00 00 00       mov    0x98(%eax),%eax
+  40441d:       8b 43 38                mov    0x38(%ebx),%eax
         edu_reg_writeq(dev, EDU_REG_DMA_DST, to);
         edu_reg_writeq(dev, EDU_REG_DMA_COUNT, size);
         edu_reg_writel(dev, EDU_REG_DMA_CMD, cmd);

         /* Wait until DMA finished */
         while (edu_reg_readl(dev, EDU_REG_DMA_CMD) & EDU_CMD_DMA_START)
-  40439b:       a8 01                   test   $0x1,%al
-  40439d:       74 10                   je     4043af <edu_dma+0x121>
-  40439f:       f3 90                   pause
-  4043a1:       48                      dec    %eax
-  4043a2:       8b 43 38                mov    0x38(%ebx),%eax
-  4043a5:       8b 80 98 00 00 00       mov    0x98(%eax),%eax
-  4043ab:       a8 01                   test   $0x1,%al
-  4043ad:       75 f0                   jne    40439f <edu_dma+0x111>
+  404420:       f6 80 98 00 00 00 01    testb  $0x1,0x98(%eax)
+  404427:       74 0f                   je     404438 <edu_dma+0x11f>
+  404429:       f3 90                   pause
+  40442b:       48                      dec    %eax
+  40442c:       8b 43 38                mov    0x38(%ebx),%eax
+  40442f:       f6 80 98 00 00 00 01    testb  $0x1,0x98(%eax)
+  404436:       75 f1                   jne    404429 <edu_dma+0x110>
                 cpu_relax();
 }
```

+ is gcc 16
- is gcc 15

With the instructions generated by gcc 16, the following wait loop always exits immediately:

```
        /* Wait until DMA finished */
        while (edu_reg_readl(dev, EDU_REG_DMA_CMD) & EDU_CMD_DMA_START)
                cpu_relax();
```

As a consequence, the test performs the second dma operation too early and reads a wrong value.
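
One way to keep a compiler from narrowing such a poll is to pin the access width with inline assembly. The following is a minimal sketch and not a fix proposed in this thread: `read32_exact`, `wait_dma_done`, `DMA_START_BIT` and the bounded `max_polls` are hypothetical names introduced for illustration, and the explicit `movl` is used on x86 only, with a plain volatile load as a fallback elsewhere.

```c
#include <assert.h>
#include <stdint.h>

#define DMA_START_BIT 0x1u /* hypothetical stand-in for EDU_CMD_DMA_START */

/* Read a 32-bit register with exactly one 32-bit load. */
static inline uint32_t read32_exact(const volatile uint32_t *reg)
{
#if defined(__i386__) || defined(__x86_64__)
	uint32_t val;

	/*
	 * The explicit movl fixes the access width: the load cannot be
	 * folded into a narrower instruction such as testb.
	 */
	__asm__ volatile("movl %1, %0" : "=r"(val) : "m"(*reg) : "memory");
	return val;
#else
	return *reg; /* plain volatile load on other architectures */
#endif
}

/*
 * Poll until the start bit clears. Returns the number of polls that saw
 * the bit still set, or -1 if it was still set after max_polls reads.
 */
static int wait_dma_done(const volatile uint32_t *cmd_reg, int max_polls)
{
	assert(max_polls > 0);

	for (int i = 0; i < max_polls; i++) {
		if (!(read32_exact(cmd_reg) & DMA_START_BIT))
			return i;
	}
	return -1;
}
```

The poll limit is only there so that the sketch terminates when run against ordinary memory; a real wait loop against a device would typically call cpu_relax() between reads, as the test does.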

Regards,
cmd

On Tue, 2026-05-05 at 08:37 +0200, Clement Mathieu--Drif wrote:
> I will try to investigate today, it seems that the host does not wait for the dma operation to complete before reading back.
>
> keep you posted
>
> cmd
>
> On Tue, 2026-05-05 at 07:49 +0200, Clement Mathieu--Drif wrote:
>
> > Hi,
> >
> > Indeed, it seems to start failing when switching to gcc 16.1.
> >
> > gcc 15.2.1 - Qemu 11 => pass
> > gcc 16.1 - Qemu 11 => fail
> >
> > On Mon, 2026-05-04 at 11:45 -0400, Peter Xu wrote:
> >
> >
> > > On Mon, May 04, 2026 at 10:07:25AM +0200, Thomas Huth wrote:
> > >
> > >
> > >
> > > > On 04/05/2026 09.58, Thomas Huth wrote:
> > > >
> > > >
> > > >
> > > > > On 04/06/2024 16.35, Paolo Bonzini wrote:
> > > > >
> > > > >
> > > > >
> > > > > > The bottom 32K of memory are generally reserved for use by the BIOS;
> > > > > > for example, traditionally the boot loader is placed at 0x7C00 and
> > > > > > the stack grows below that address.
> > > > > >
> > > > > > It turns out that with some versions of clang, realmode.flat has
> > > > > > become big enough that it overlaps the stack used by the multiboot
> > > > > > option ROM loader.  The result is that a couple instructions are
> > > > > > overwritten.  Typically one or two tests fail and that's it...
> > > > > >
> > > > > > Move the code above the forbidden region, in real 90s style.
> > > > > >
> > > > > > > Reported-by: Thomas Huth <thuth@redhat.com>
> > > > > > > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> > > > > > ---
> > > > > >   x86/realmode.lds | 2 +-
> > > > > >   1 file changed, 1 insertion(+), 1 deletion(-)
> > > > > >
> > > > > > diff --git a/x86/realmode.lds b/x86/realmode.lds
> > > > > > index 0ed3063b..e4782a98 100644
> > > > > > --- a/x86/realmode.lds
> > > > > > +++ b/x86/realmode.lds
> > > > > > @@ -1,6 +1,6 @@
> > > > > >   SECTIONS
> > > > > >   {
> > > > > > -    . = 16K;
> > > > > > +    . = 32K;
> > > > > >       stext = .;
> > > > > >       .text : { *(.init) *(.text) }
> > > > > >       . = ALIGN(4K);
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >   Hi Paolo!
> > > > >
> > > > > FYI, the realmode kvm-unit-test now also fails with the recent version
> > > > > of GCC 16 for the i386 target:
> > > > >
> > > > > >   https://gitlab.com/kvm-unit-tests/kvm-unit-tests/-/jobs/14195195727
> > > > >
> > > > > It was working fine some weeks ago with GCC 15.1:
> > > > >
> > > > > >   https://gitlab.com/kvm-unit-tests/kvm-unit-tests/-/jobs/13977260961
> > > > >
> > > > > When I apply your patch, the problem seems to be gone again in this
> > > > > case, but since there were some other issues with this (with older
> > > > > versions of GCC, I think):
> > > > >
> > > > > > https://lore.kernel.org/kvm/49f8aadf-6e3f-4d2b-a32a-8ba941a3a2a1@redhat.com/
> > > > >
> > > > > ... there must be a better way to fix it?
> > > > >
> > > > > Could you please have a look?
> > > >
> > > >
> > > >
> > > >
> > > > By the way, the intel_iommu test now also suddenly started failing (for the
> > > > x86_64 target), either due to update of GCC or due to the update from QEMU
> > > > v10.2 to 11.0 :
> > > >
> > > > >  https://gitlab.com/kvm-unit-tests/kvm-unit-tests/-/jobs/14195195728
> > > >
> > > > Two weeks ago, it was still working fine:
> > > >
> > > > >  https://gitlab.com/kvm-unit-tests/kvm-unit-tests/-/jobs/13977260962
> > >
> > >
> > >
> > >
> > > Looping in those who take care of qemu's VT-D now (Yi, Clément)..
> > >
> > > --
> > > Peter Xu
> > >
> >
> >
>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: intel_iommu unit test is also failing (was: Re: [PATCH kvm-unit-tests] realmode: load above stack)
  2026-05-05  7:36           ` Clément MATHIEU--DRIF
@ 2026-05-05  9:27             ` Clément MATHIEU--DRIF
  2026-05-05  9:45               ` intel_iommu unit test is also failing Thomas Huth
  0 siblings, 1 reply; 17+ messages in thread
From: Clément MATHIEU--DRIF @ 2026-05-05  9:27 UTC (permalink / raw)
  To: Peter Xu, Thomas Huth
  Cc: Paolo Bonzini, kvm@vger.kernel.org, Yi Liu, Michael S. Tsirkin

I had a bit more time to hook into qemu and check the root cause.

It seems that testb issues a single-byte read, which is outside the valid access size range, as the following breakpoint shows:

```
Thread 6 "CPU 0/TCG" hit Breakpoint 2, memory_region_dispatch_read (mr=0x55d72883cb30, addr=152, pval=0x7f62d25f4590, op=MO_BSWAP, attrs=...) at ../system/memory.c:1473
1473       unsigned size = memop_size(op);
(gdb) n
1474       MemTxResult r;
(gdb) p size
$1 = 1
(gdb)
```
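
To illustrate why a one-byte read can make the wait loop fall through, here is a toy model of a register block that only accepts accesses within a given size range. It is a sketch under stated assumptions and not QEMU's code: `struct toy_region`, `toy_region_read`, the 4 to 8 byte range used in the usage note, and the choice to return 0 for a rejected read are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_REG_DMA_CMD 0x98u /* offset taken from the disassembly above */

/* Hypothetical device register block; all names are illustrative only. */
struct toy_region {
	unsigned min_access_size; /* smallest accepted access, in bytes */
	unsigned max_access_size; /* largest accepted access, in bytes */
	uint32_t dma_cmd;         /* content of the register at TOY_REG_DMA_CMD */
};

/*
 * Returns true and stores the register value if the access is accepted.
 * Assumption: a rejected access returns false and reads back as 0.
 */
static bool toy_region_read(const struct toy_region *r, uint32_t addr,
			    unsigned size, uint64_t *val)
{
	assert(val != NULL);

	*val = 0;
	if (size < r->min_access_size || size > r->max_access_size)
		return false; /* access size outside the valid range */
	if (addr != TOY_REG_DMA_CMD)
		return false; /* only one register is modeled here */

	*val = r->dma_cmd;
	return true;
}
```

With a region set up as `{ 4, 8, 0x1 }`, a 4-byte read at 0x98 sees the start bit set, whereas a 1-byte read is rejected and reads back as 0 in this model, so a loop that tests bit 0 would stop waiting immediately.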

cmd

On Tue, 2026-05-05 at 09:36 +0200, Clement Mathieu--Drif wrote:
> Back with some answers:
>
> This is the incriminated hunk:
>
> ```diff
> --- <unnamed>
> +++ <unnamed>
> @@ -1,17 +1,16 @@
> -  404395:       8b 80 98 00 00 00       mov    0x98(%eax),%eax
> +  40441d:       8b 43 38                mov    0x38(%ebx),%eax
>          edu_reg_writeq(dev, EDU_REG_DMA_DST, to);
>          edu_reg_writeq(dev, EDU_REG_DMA_COUNT, size);
>          edu_reg_writel(dev, EDU_REG_DMA_CMD, cmd);
>
>          /* Wait until DMA finished */
>          while (edu_reg_readl(dev, EDU_REG_DMA_CMD) & EDU_CMD_DMA_START)
> -  40439b:       a8 01                   test   $0x1,%al
> -  40439d:       74 10                   je     4043af <edu_dma+0x121>
> -  40439f:       f3 90                   pause
> -  4043a1:       48                      dec    %eax
> -  4043a2:       8b 43 38                mov    0x38(%ebx),%eax
> -  4043a5:       8b 80 98 00 00 00       mov    0x98(%eax),%eax
> -  4043ab:       a8 01                   test   $0x1,%al
> -  4043ad:       75 f0                   jne    40439f <edu_dma+0x111>
> +  404420:       f6 80 98 00 00 00 01    testb  $0x1,0x98(%eax)
> +  404427:       74 0f                   je     404438 <edu_dma+0x11f>
> +  404429:       f3 90                   pause
> +  40442b:       48                      dec    %eax
> +  40442c:       8b 43 38                mov    0x38(%ebx),%eax
> +  40442f:       f6 80 98 00 00 00 01    testb  $0x1,0x98(%eax)
> +  404436:       75 f1                   jne    404429 <edu_dma+0x110>
>                  cpu_relax();
>  }
> ```
>
> + is gcc 16
> - is gcc 15
>
> The instructions generated by gcc 16 always skip the following condition:
>
> ```
>       /* Wait until DMA finished */
>       while (edu_reg_readl(dev, EDU_REG_DMA_CMD) & EDU_CMD_DMA_START)
>               cpu_relax();
> ```
>
> As a consequence, the test performs the second dma operation too early and reads a wrong value.
>
> Regards,
> cmd
>
> On Tue, 2026-05-05 at 08:37 +0200, Clement Mathieu--Drif wrote:
>
> > I will try to investigate today, it seems that the host does not wait for the dma operation to complete before reading back.
> >
> > keep you posted
> >
> > cmd
> >
> > On Tue, 2026-05-05 at 07:49 +0200, Clement Mathieu--Drif wrote:
> >
> >
> > > Hi,
> > >
> > > Indeed, it seems to start failing when switching to gcc 16.1.
> > >
> > > gcc 15.2.1 - Qemu 11 => pass
> > > gcc 16.1 - Qemu 11 => fail
> > >
> > > On Mon, 2026-05-04 at 11:45 -0400, Peter Xu wrote:
> > >
> > >
> > >
> > > > On Mon, May 04, 2026 at 10:07:25AM +0200, Thomas Huth wrote:
> > > >
> > > >
> > > >
> > > >
> > > > > On 04/05/2026 09.58, Thomas Huth wrote:
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > > On 04/06/2024 16.35, Paolo Bonzini wrote:
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > > The bottom 32K of memory are generally reserved for use by the BIOS;
> > > > > > > for example, traditionally the boot loader is placed at 0x7C00 and
> > > > > > > the stack grows below that address.
> > > > > > >
> > > > > > > It turns out that with some versions of clang, realmode.flat has
> > > > > > > become big enough that it overlaps the stack used by the multiboot
> > > > > > > option ROM loader.  The result is that a couple instructions are
> > > > > > > overwritten.  Typically one or two tests fail and that's it...
> > > > > > >
> > > > > > > Move the code above the forbidden region, in real 90s style.
> > > > > > >
> > > > > > > > Reported-by: Thomas Huth <thuth@redhat.com>
> > > > > > > > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> > > > > > > ---
> > > > > > >   x86/realmode.lds | 2 +-
> > > > > > >   1 file changed, 1 insertion(+), 1 deletion(-)
> > > > > > >
> > > > > > > diff --git a/x86/realmode.lds b/x86/realmode.lds
> > > > > > > index 0ed3063b..e4782a98 100644
> > > > > > > --- a/x86/realmode.lds
> > > > > > > +++ b/x86/realmode.lds
> > > > > > > @@ -1,6 +1,6 @@
> > > > > > >   SECTIONS
> > > > > > >   {
> > > > > > > -    . = 16K;
> > > > > > > +    . = 32K;
> > > > > > >       stext = .;
> > > > > > >       .text : { *(.init) *(.text) }
> > > > > > >       . = ALIGN(4K);
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >   Hi Paolo!
> > > > > >
> > > > > > FYI, the realmode kvm-unit-test now also fails with the recent version
> > > > > > of GCC 16 for the i386 target:
> > > > > >
> > > > > >   https://gitlab.com/kvm-unit-tests/kvm-unit-tests/-/jobs/14195195727
> > > > > >
> > > > > > It was working fine some weeks ago with GCC 15.1:
> > > > > >
> > > > > >   https://gitlab.com/kvm-unit-tests/kvm-unit-tests/-/jobs/13977260961
> > > > > >
> > > > > > When I apply your patch, the problem seems to be gone again in this
> > > > > > case, but since there were some other issues with this (with older
> > > > > > versions of GCC, I think):
> > > > > >
> > > > > > https://lore.kernel.org/kvm/49f8aadf-6e3f-4d2b-a32a-8ba941a3a2a1@redhat.com/
> > > > > >
> > > > > > ... there must be a better way to fix it?
> > > > > >
> > > > > > Could you please have a look?
> > > > >
> > > > > By the way, the intel_iommu test has now also started failing (for the
> > > > > x86_64 target), either due to the GCC update or due to the update of QEMU
> > > > > from v10.2 to 11.0:
> > > > >
> > > > >  https://gitlab.com/kvm-unit-tests/kvm-unit-tests/-/jobs/14195195728
> > > > >
> > > > > Two weeks ago, it was still working fine:
> > > > >
> > > > >  https://gitlab.com/kvm-unit-tests/kvm-unit-tests/-/jobs/13977260962
> > > >
> > > > Looping in those who take care of qemu's VT-D now (Yi, Clément)..
> > > >
> > > > --
> > > > Peter Xu
> > > >

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: intel_iommu unit test is also failing
  2026-05-05  9:27             ` Clément MATHIEU--DRIF
@ 2026-05-05  9:45               ` Thomas Huth
  2026-05-05  9:53                 ` Clément MATHIEU--DRIF
  2026-05-05 10:23                 ` Michael S. Tsirkin
  0 siblings, 2 replies; 17+ messages in thread
From: Thomas Huth @ 2026-05-05  9:45 UTC (permalink / raw)
  To: Clément MATHIEU--DRIF, Peter Xu
  Cc: Paolo Bonzini, kvm@vger.kernel.org, Yi Liu, Michael S. Tsirkin

On 05/05/2026 11.27, Clément MATHIEU--DRIF wrote:
> I had a bit more time to hook into qemu to check the root cause.
> 
> It seems that testb issues a single byte read (out of the valid size range), as we can see on the following breakpoint:
> 
> ```
> Thread 6 "CPU 0/TCG" hit Breakpoint 2, memory_region_dispatch_read (mr=0x55d72883cb30, addr=152, pval=0x7f62d25f4590, op=MO_BSWAP, attrs=...) at ../system/memory.c:1473
> 1473       unsigned size = memop_size(op);
> (gdb) n
> 1474       MemTxResult r;
> (gdb) p size
> $1 = 1
> (gdb)
> ```

Ouch! That's an excellent finding, Clément ... so GCC 16 is "smart" enough 
to see that we only want to test the lowest bit here, so it optimizes the 
code to access only one byte of memory instead of 4 bytes... which would be 
ok for normal memory, but not for an MMIO register :-/

Ugly work-around, to force GCC to read 32 bits:

diff --git a/lib/asm-generic/io.h b/lib/asm-generic/io.h
--- a/lib/asm-generic/io.h
+++ b/lib/asm-generic/io.h
@@ -38,7 +38,9 @@ static inline u16 __raw_readw(const volatile void *addr)
  #ifndef __raw_readl
  static inline u32 __raw_readl(const volatile void *addr)
  {
-       return *(const volatile u32 *)addr;
+       u32 val = *(const volatile u32 *)addr;
+       asm volatile ("\n" : : "r"(addr));
+       return val;
  }
  #endif

... but I wonder whether this should rather be treated as a bug in GCC 
instead, since it should IMHO really not change the access size for a 
volatile memory access?

  Thomas
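
A toy model may help show why the narrowed read makes the wait loop fall
through. This is only a sketch, not QEMU or kvm-unit-tests code: toy_dev,
toy_mmio_read and toy_poll are made-up names, and the rule that a rejected
access reads as 0 is an assumption chosen to match the "loop is always
skipped" observation quoted below.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model, not QEMU code: one 32-bit command register whose bit 0
 * stays set while a pretend DMA transfer is in flight. */
#define TOY_CMD_DMA_START 0x1u

struct toy_dev {
	uint32_t cmd;        /* current register contents */
	unsigned busy_reads; /* valid reads that still see the busy bit */
};

/* Assumption: only 4-byte accesses are valid; anything else is
 * rejected and reads as 0, without touching the device state. */
uint32_t toy_mmio_read(struct toy_dev *d, unsigned size)
{
	if (size != 4)
		return 0;

	if (d->busy_reads == 0)
		d->cmd &= ~TOY_CMD_DMA_START;	/* transfer finished */
	else
		d->busy_reads--;

	return d->cmd;
}

/* Start a pretend transfer and poll with the given access size.
 * Returns how many times the loop body ran before the poll gave up. */
unsigned toy_poll(unsigned size, unsigned busy_reads)
{
	struct toy_dev d = { .cmd = TOY_CMD_DMA_START, .busy_reads = busy_reads };
	unsigned spins = 0;

	while (toy_mmio_read(&d, size) & TOY_CMD_DMA_START)
		spins++;

	return spins;
}
```

With these definitions, toy_poll(4, 3) spins three times before the busy
bit clears, while toy_poll(1, 3) returns 0 right away: the byte-sized read
never sees the busy bit, so the caller moves on while the transfer is
still pending.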


> cmd
> 
> On Tue, 2026-05-05 at 09:36 +0200, Clement Mathieu--Drif wrote:
>> Back with some answers:
>>
>> This is the incriminated hunk:
>>
>> ```diff
>> --- <unnamed>
>> +++ <unnamed>
>> @@ -1,17 +1,16 @@
>> -  404395:       8b 80 98 00 00 00       mov    0x98(%eax),%eax
>> +  40441d:       8b 43 38                mov    0x38(%ebx),%eax
>>           edu_reg_writeq(dev, EDU_REG_DMA_DST, to);
>>           edu_reg_writeq(dev, EDU_REG_DMA_COUNT, size);
>>           edu_reg_writel(dev, EDU_REG_DMA_CMD, cmd);
>>
>>           /* Wait until DMA finished */
>>           while (edu_reg_readl(dev, EDU_REG_DMA_CMD) & EDU_CMD_DMA_START)
>> -  40439b:       a8 01                   test   $0x1,%al
>> -  40439d:       74 10                   je     4043af <edu_dma+0x121>
>> -  40439f:       f3 90                   pause
>> -  4043a1:       48                      dec    %eax
>> -  4043a2:       8b 43 38                mov    0x38(%ebx),%eax
>> -  4043a5:       8b 80 98 00 00 00       mov    0x98(%eax),%eax
>> -  4043ab:       a8 01                   test   $0x1,%al
>> -  4043ad:       75 f0                   jne    40439f <edu_dma+0x111>
>> +  404420:       f6 80 98 00 00 00 01    testb  $0x1,0x98(%eax)
>> +  404427:       74 0f                   je     404438 <edu_dma+0x11f>
>> +  404429:       f3 90                   pause
>> +  40442b:       48                      dec    %eax
>> +  40442c:       8b 43 38                mov    0x38(%ebx),%eax
>> +  40442f:       f6 80 98 00 00 00 01    testb  $0x1,0x98(%eax)
>> +  404436:       75 f1                   jne    404429 <edu_dma+0x110>
>>                   cpu_relax();
>>   }
>> ```
>>
>> + is gcc 16
>> - is gcc 15
>>
>> The instructions generated by gcc 16 always skip the following condition:
>>
>> ```
>>        /* Wait until DMA finished */
>>        while (edu_reg_readl(dev, EDU_REG_DMA_CMD) & EDU_CMD_DMA_START)
>>                cpu_relax();
>> ```
>>
>> As a consequence, the test performs the second dma operation too early and reads a wrong value.
>>
>> Regards,
>> cmd


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: intel_iommu unit test is also failing
  2026-05-05  9:45               ` intel_iommu unit test is also failing Thomas Huth
@ 2026-05-05  9:53                 ` Clément MATHIEU--DRIF
  2026-05-05 10:15                   ` Thomas Huth
  2026-05-05 10:23                 ` Michael S. Tsirkin
  1 sibling, 1 reply; 17+ messages in thread
From: Clément MATHIEU--DRIF @ 2026-05-05  9:53 UTC (permalink / raw)
  To: Thomas Huth, Peter Xu
  Cc: Paolo Bonzini, kvm@vger.kernel.org, Yi Liu, Michael S. Tsirkin



On Tue, 2026-05-05 at 11:45 +0200, Thomas Huth wrote:
> On 05/05/2026 11.27, Clément MATHIEU--DRIF wrote:
> 
> > I had a bit more time to hook into qemu to check the root cause.
> > 
> > It seems that testb issues a single byte read (out of the valid size range), as we can see on the following breakpoint:
> > 
> > ```
> > Thread 6 "CPU 0/TCG" hit Breakpoint 2, memory_region_dispatch_read (mr=0x55d72883cb30, addr=152, pval=0x7f62d25f4590, op=MO_BSWAP, attrs=...) at ../system/memory.c:1473
> > 1473       unsigned size = memop_size(op);
> > (gdb) n
> > 1474       MemTxResult r;
> > (gdb) p size
> > $1 = 1
> > (gdb)
> > ```
> 
> 
> Ouch! That's an excellent finding, Clément ... so GCC 16 is "smart" enough  
> to see that we only want to test the lowest bit here, so it optimizes the  
> code to access only one byte of memory instead of 4 bytes... which would be  
> ok for normal memory, but not for an MMIO register :-/
> 
> Ugly work-around, to force GCC to read 32 bits:
> 
> diff --git a/lib/asm-generic/io.h b/lib/asm-generic/io.h  
> --- a/lib/asm-generic/io.h  
> +++ b/lib/asm-generic/io.h  
> @@ -38,7 +38,9 @@ static inline u16 __raw_readw(const volatile void *addr)  
>   #ifndef __raw_readl  
>   static inline u32 __raw_readl(const volatile void *addr)  
>   {  
> -       return *(const volatile u32 *)addr;  
> +       u32 val = *(const volatile u32 *)addr;  
> +       asm volatile ("\n" : : "r"(addr));  
> +       return val;  
>   }  
>   #endif
> 
> ... but I wonder whether this should rather be treated as a bug in GCC  
> instead, since it should IMHO really not change the access size for a  
> volatile memory access?

Volatile is expected to make sure that the read side effect is visible.  
I don't know if the size of the access is in the scope of this constraint or not o.O  

> 
>   Thomas
> 
> 
> 
> > cmd
> > 
> > On Tue, 2026-05-05 at 09:36 +0200, Clement Mathieu--Drif wrote:
> > 
> > > Back with some answers:
> > > 
> > > This is the incriminated hunk:
> > > 
> > > ```diff
> > > --- <unnamed>
> > > +++ <unnamed>
> > > @@ -1,17 +1,16 @@
> > > -  404395:       8b 80 98 00 00 00       mov    0x98(%eax),%eax
> > > +  40441d:       8b 43 38                mov    0x38(%ebx),%eax
> > >           edu_reg_writeq(dev, EDU_REG_DMA_DST, to);
> > >           edu_reg_writeq(dev, EDU_REG_DMA_COUNT, size);
> > >           edu_reg_writel(dev, EDU_REG_DMA_CMD, cmd);
> > > 
> > >           /* Wait until DMA finished */
> > >           while (edu_reg_readl(dev, EDU_REG_DMA_CMD) & EDU_CMD_DMA_START)
> > > -  40439b:       a8 01                   test   $0x1,%al
> > > -  40439d:       74 10                   je     4043af <edu_dma+0x121>
> > > -  40439f:       f3 90                   pause
> > > -  4043a1:       48                      dec    %eax
> > > -  4043a2:       8b 43 38                mov    0x38(%ebx),%eax
> > > -  4043a5:       8b 80 98 00 00 00       mov    0x98(%eax),%eax
> > > -  4043ab:       a8 01                   test   $0x1,%al
> > > -  4043ad:       75 f0                   jne    40439f <edu_dma+0x111>
> > > +  404420:       f6 80 98 00 00 00 01    testb  $0x1,0x98(%eax)
> > > +  404427:       74 0f                   je     404438 <edu_dma+0x11f>
> > > +  404429:       f3 90                   pause
> > > +  40442b:       48                      dec    %eax
> > > +  40442c:       8b 43 38                mov    0x38(%ebx),%eax
> > > +  40442f:       f6 80 98 00 00 00 01    testb  $0x1,0x98(%eax)
> > > +  404436:       75 f1                   jne    404429 <edu_dma+0x110>
> > >                   cpu_relax();
> > >   }
> > > ```
> > >
> > > + is gcc 16
> > > - is gcc 15
> > > 
> > > The instructions generated by gcc 16 always skip the following condition:
> > > 
> > > ```
> > >        /* Wait until DMA finished */
> > >        while (edu_reg_readl(dev, EDU_REG_DMA_CMD) & EDU_CMD_DMA_START)
> > >                cpu_relax();
> > > ```
> > > 
> > > As a consequence, the test performs the second dma operation too early and reads a wrong value.
> > > 
> > > Regards,
> > > cmd
> >
> 
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: intel_iommu unit test is also failing
  2026-05-05  9:53                 ` Clément MATHIEU--DRIF
@ 2026-05-05 10:15                   ` Thomas Huth
  0 siblings, 0 replies; 17+ messages in thread
From: Thomas Huth @ 2026-05-05 10:15 UTC (permalink / raw)
  To: Clément MATHIEU--DRIF, Peter Xu
  Cc: Paolo Bonzini, kvm@vger.kernel.org, Yi Liu, Michael S. Tsirkin

On 05/05/2026 11.53, Clément MATHIEU--DRIF wrote:
> 
> 
> On Tue, 2026-05-05 at 11:45 +0200, Thomas Huth wrote:
>> On 05/05/2026 11.27, Clément MATHIEU--DRIF wrote:
>>
>>> I had a bit more time to hook into qemu to check the root cause.
>>>
>>> It seems that testb issues a single byte read (out of the valid size range), as we can see on the following breakpoint:
>>>
>>> ```
>>> Thread 6 "CPU 0/TCG" hit Breakpoint 2, memory_region_dispatch_read (mr=0x55d72883cb30, addr=152, pval=0x7f62d25f4590, op=MO_BSWAP, attrs=...) at ../system/memory.c:1473
>>> 1473       unsigned size = memop_size(op);
>>> (gdb) n
>>> 1474       MemTxResult r;
>>> (gdb) p size
>>> $1 = 1
>>> (gdb)
>>> ```
>>
>>
>> Ouch! That's an excellent finding, Clément ... so GCC 16 is "smart" enough
>> to see that we only want to test the lowest bit here, so it optimizes the
>> code to access only one byte of memory instead of 4 bytes... which would be
>> ok for normal memory, but not for an MMIO register :-/
>>
>> Ugly work-around, to force GCC to read 32 bits:
>>
>> diff --git a/lib/asm-generic/io.h b/lib/asm-generic/io.h
>> --- a/lib/asm-generic/io.h
>> +++ b/lib/asm-generic/io.h
>> @@ -38,7 +38,9 @@ static inline u16 __raw_readw(const volatile void *addr)
>>    #ifndef __raw_readl
>>    static inline u32 __raw_readl(const volatile void *addr)
>>    {
>> -       return *(const volatile u32 *)addr;
>> +       u32 val = *(const volatile u32 *)addr;
>> +       asm volatile ("\n" : : "r"(addr));
>> +       return val;
>>    }
>>    #endif
>>
>> ... but I wonder whether this should rather be treated as a bug in GCC
>> instead, since it should IMHO really not change the access size for a
>> volatile memory access?
> 
> Volatile is expected to make sure that the read side effect is visible.
> I don't know if the size of the access is in the scope of this constraint or not o.O

Maybe we should simply adjust the read/write functions in the pci-edu.h 
code, WDYT:

diff --git a/lib/pci-edu.h b/lib/pci-edu.h
--- a/lib/pci-edu.h
+++ b/lib/pci-edu.h
@@ -59,24 +59,24 @@ struct pci_edu_dev {

  static inline uint64_t edu_reg_readq(struct pci_edu_dev *dev, int reg)
  {
-       return __raw_readq(edu_reg(dev, reg));
+       return readq(edu_reg(dev, reg));
  }

  static inline uint32_t edu_reg_readl(struct pci_edu_dev *dev, int reg)
  {
-       return __raw_readl(edu_reg(dev, reg));
+       return readl(edu_reg(dev, reg));
  }

  static inline void edu_reg_writeq(struct pci_edu_dev *dev, int reg,
                                   uint64_t val)
  {
-       __raw_writeq(val, edu_reg(dev, reg));
+       writeq(val, edu_reg(dev, reg));
  }

  static inline void edu_reg_writel(struct pci_edu_dev *dev, int reg,
                                   uint32_t val)
  {
-       __raw_writel(val, edu_reg(dev, reg));
+       writel(val, edu_reg(dev, reg));
  }

  bool edu_init(struct pci_edu_dev *dev);

... it might be good to use the non-raw functions here for ordered access 
anyway...

  Thomas


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: intel_iommu unit test is also failing
  2026-05-05  9:45               ` intel_iommu unit test is also failing Thomas Huth
  2026-05-05  9:53                 ` Clément MATHIEU--DRIF
@ 2026-05-05 10:23                 ` Michael S. Tsirkin
  2026-05-05 10:34                   ` Thomas Huth
  1 sibling, 1 reply; 17+ messages in thread
From: Michael S. Tsirkin @ 2026-05-05 10:23 UTC (permalink / raw)
  To: Thomas Huth
  Cc: Clément MATHIEU--DRIF, Peter Xu, Paolo Bonzini,
	kvm@vger.kernel.org, Yi Liu

On Tue, May 05, 2026 at 11:45:17AM +0200, Thomas Huth wrote:
> On 05/05/2026 11.27, Clément MATHIEU--DRIF wrote:
> > I had a bit more time to hook into qemu to check the root cause.
> > 
> > It seems that testb issues a single byte read (out of the valid size range), as we can see on the following breakpoint:
> > 
> > ```
> > Thread 6 "CPU 0/TCG" hit Breakpoint 2, memory_region_dispatch_read (mr=0x55d72883cb30, addr=152, pval=0x7f62d25f4590, op=MO_BSWAP, attrs=...) at ../system/memory.c:1473
> > 1473       unsigned size = memop_size(op);
> > (gdb) n
> > 1474       MemTxResult r;
> > (gdb) p size
> > $1 = 1
> > (gdb)
> > ```
> 
> Ouch! That's an excellent finding, Clément ... so GCC 16 is "smart" enough
> to see that we only want to test the lowest bit here, so it optimizes the
> code to access only one byte of memory instead of 4 bytes... which would be
> ok for normal memory, but not for an MMIO register :-/
> 
> Ugly work-around, to force GCC to read 32 bits:
> 
> diff --git a/lib/asm-generic/io.h b/lib/asm-generic/io.h
> --- a/lib/asm-generic/io.h
> +++ b/lib/asm-generic/io.h
> @@ -38,7 +38,9 @@ static inline u16 __raw_readw(const volatile void *addr)
>  #ifndef __raw_readl
>  static inline u32 __raw_readl(const volatile void *addr)
>  {
> -       return *(const volatile u32 *)addr;
> +       u32 val = *(const volatile u32 *)addr;
> +       asm volatile ("\n" : : "r"(addr));
> +       return val;
>  }
>  #endif
> 
> ... but I wonder whether this should rather be treated as a bug in GCC
> instead, since it should IMHO really not change the access size for a
> volatile memory access?
> 
>  Thomas

Wouldn't this break linux generally?

#ifndef __READ_ONCE
#define __READ_ONCE(x)  (*(const volatile __unqual_scalar_typeof(x) *)&(x))
#endif




> 
> > cmd
> > 
> > On Tue, 2026-05-05 at 09:36 +0200, Clement Mathieu--Drif wrote:
> > > Back with some answers:
> > > 
> > > This is the incriminated hunk:
> > > 
> > > ```diff
> > > --- <unnamed>
> > > +++ <unnamed>
> > > @@ -1,17 +1,16 @@
> > > -  404395:       8b 80 98 00 00 00       mov    0x98(%eax),%eax
> > > +  40441d:       8b 43 38                mov    0x38(%ebx),%eax
> > >           edu_reg_writeq(dev, EDU_REG_DMA_DST, to);
> > >           edu_reg_writeq(dev, EDU_REG_DMA_COUNT, size);
> > >           edu_reg_writel(dev, EDU_REG_DMA_CMD, cmd);
> > > 
> > >           /* Wait until DMA finished */
> > >           while (edu_reg_readl(dev, EDU_REG_DMA_CMD) & EDU_CMD_DMA_START)
> > > -  40439b:       a8 01                   test   $0x1,%al
> > > -  40439d:       74 10                   je     4043af <edu_dma+0x121>
> > > -  40439f:       f3 90                   pause
> > > -  4043a1:       48                      dec    %eax
> > > -  4043a2:       8b 43 38                mov    0x38(%ebx),%eax
> > > -  4043a5:       8b 80 98 00 00 00       mov    0x98(%eax),%eax
> > > -  4043ab:       a8 01                   test   $0x1,%al
> > > -  4043ad:       75 f0                   jne    40439f <edu_dma+0x111>
> > > +  404420:       f6 80 98 00 00 00 01    testb  $0x1,0x98(%eax)
> > > +  404427:       74 0f                   je     404438 <edu_dma+0x11f>
> > > +  404429:       f3 90                   pause
> > > +  40442b:       48                      dec    %eax
> > > +  40442c:       8b 43 38                mov    0x38(%ebx),%eax
> > > +  40442f:       f6 80 98 00 00 00 01    testb  $0x1,0x98(%eax)
> > > +  404436:       75 f1                   jne    404429 <edu_dma+0x110>
> > >                   cpu_relax();
> > >   }
> > > ```
> > >
> > > + is gcc 16
> > > - is gcc 15
> > > 
> > > The instructions generated by gcc 16 always skip the following condition:
> > > 
> > > ```
> > >        /* Wait until DMA finished */
> > >        while (edu_reg_readl(dev, EDU_REG_DMA_CMD) & EDU_CMD_DMA_START)
> > >                cpu_relax();
> > > ```
> > > 
> > > As a consequence, the test performs the second dma operation too early and reads a wrong value.
> > > 
> > > Regards,
> > > cmd


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: intel_iommu unit test is also failing
  2026-05-05 10:23                 ` Michael S. Tsirkin
@ 2026-05-05 10:34                   ` Thomas Huth
  2026-05-05 10:53                     ` Michael S. Tsirkin
  0 siblings, 1 reply; 17+ messages in thread
From: Thomas Huth @ 2026-05-05 10:34 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Clément MATHIEU--DRIF, Peter Xu, Paolo Bonzini,
	kvm@vger.kernel.org, Yi Liu

On 05/05/2026 12.23, Michael S. Tsirkin wrote:
> On Tue, May 05, 2026 at 11:45:17AM +0200, Thomas Huth wrote:
>> On 05/05/2026 11.27, Clément MATHIEU--DRIF wrote:
>>> I had a bit more time to hook into qemu to check the root cause.
>>>
>>> It seems that testb issues a single byte read (out of the valid size range), as we can see on the following breakpoint:
>>>
>>> ```
>>> Thread 6 "CPU 0/TCG" hit Breakpoint 2, memory_region_dispatch_read (mr=0x55d72883cb30, addr=152, pval=0x7f62d25f4590, op=MO_BSWAP, attrs=...) at ../system/memory.c:1473
>>> 1473       unsigned size = memop_size(op);
>>> (gdb) n
>>> 1474       MemTxResult r;
>>> (gdb) p size
>>> $1 = 1
>>> (gdb)
>>> ```
>>
>> Ouch! That's an excellent finding, Clément ... so GCC 16 is "smart" enough
>> to see that we only want to test the lowest bit here, so it optimizes the
>> code to access only one byte of memory instead of 4 bytes... which would be
>> ok for normal memory, but not for an MMIO register :-/
>>
>> Ugly work-around, to force GCC to read 32 bits:
>>
>> diff --git a/lib/asm-generic/io.h b/lib/asm-generic/io.h
>> --- a/lib/asm-generic/io.h
>> +++ b/lib/asm-generic/io.h
>> @@ -38,7 +38,9 @@ static inline u16 __raw_readw(const volatile void *addr)
>>   #ifndef __raw_readl
>>   static inline u32 __raw_readl(const volatile void *addr)
>>   {
>> -       return *(const volatile u32 *)addr;
>> +       u32 val = *(const volatile u32 *)addr;
>> +       asm volatile ("\n" : : "r"(addr));
>> +       return val;
>>   }
>>   #endif
>>
>> ... but I wonder whether this should rather be treated as a bug in GCC
>> instead, since it should IMHO really not change the access size for a
>> volatile memory access?
>>
>>   Thomas
> 
> Wouldn't this break linux generally?
> 
> #ifndef __READ_ONCE
> #define __READ_ONCE(x)  (*(const volatile __unqual_scalar_typeof(x) *)&(x))
> #endif

I asked myself the very same question, but after googling for "GCC 16 linux 
kernel" issues, I did not find anything related... there is likely something 
specific to kvm-unit-tests in here...

  Thomas
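
One way to make the access width independent of what the optimizer does
with volatile loads is to spell out the load in inline asm. The following
is a minimal sketch, assuming an x86 target and GCC-style inline asm;
sketch_readl is a hypothetical name, not an existing kvm-unit-tests
function, and the non-x86 branch is only there so the snippet builds
elsewhere. Linux on x86 appears to define its MMIO accessors along similar
lines, which might be part of why nothing comparable shows up there.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of kvm-unit-tests. */
uint32_t sketch_readl(const volatile void *addr)
{
#if defined(__i386__) || defined(__x86_64__)
	uint32_t val;

	/* The "l" suffix fixes the access at 32 bits, so the compiler
	 * cannot fold this load into a narrower test instruction. */
	__asm__ __volatile__("movl %1, %0"
			     : "=r"(val)
			     : "m"(*(const volatile uint32_t *)addr)
			     : "memory");
	return val;
#else
	/* Other architectures: plain volatile load, for illustration only. */
	return *(const volatile uint32_t *)addr;
#endif
}
```

A caller would then mask the returned value in a register, for example
sketch_readl(reg) & 0x1, instead of letting the compiler test the memory
operand directly.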


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: intel_iommu unit test is also failing
  2026-05-05 10:34                   ` Thomas Huth
@ 2026-05-05 10:53                     ` Michael S. Tsirkin
  2026-05-05 11:38                       ` Thomas Huth
  2026-05-05 11:39                       ` Clément MATHIEU--DRIF
  0 siblings, 2 replies; 17+ messages in thread
From: Michael S. Tsirkin @ 2026-05-05 10:53 UTC (permalink / raw)
  To: Thomas Huth
  Cc: Clément MATHIEU--DRIF, Peter Xu, Paolo Bonzini,
	kvm@vger.kernel.org, Yi Liu

On Tue, May 05, 2026 at 12:34:41PM +0200, Thomas Huth wrote:
> On 05/05/2026 12.23, Michael S. Tsirkin wrote:
> > On Tue, May 05, 2026 at 11:45:17AM +0200, Thomas Huth wrote:
> > > On 05/05/2026 11.27, Clément MATHIEU--DRIF wrote:
> > > > I had a bit more time to hook into qemu to check the root cause.
> > > > 
> > > > It seems that testb issues a single byte read (out of the valid size range), as we can see on the following breakpoint:
> > > > 
> > > > ```
> > > > Thread 6 "CPU 0/TCG" hit Breakpoint 2, memory_region_dispatch_read (mr=0x55d72883cb30, addr=152, pval=0x7f62d25f4590, op=MO_BSWAP, attrs=...) at ../system/memory.c:1473
> > > > 1473       unsigned size = memop_size(op);
> > > > (gdb) n
> > > > 1474       MemTxResult r;
> > > > (gdb) p size
> > > > $1 = 1
> > > > (gdb)
> > > > ```
> > > 
> > > Ouch! That's an excellent finding, Clément ... so GCC 16 is "smart" enough
> > > to see that we only want to test the lowest bit here, so it optimizes the
> > > code to access only one byte of memory instead of 4 bytes... which would be
> > > ok for normal memory, but not for an MMIO register :-/
> > > 
> > > Ugly work-around, to force GCC to read 32 bits:
> > > 
> > > diff --git a/lib/asm-generic/io.h b/lib/asm-generic/io.h
> > > --- a/lib/asm-generic/io.h
> > > +++ b/lib/asm-generic/io.h
> > > @@ -38,7 +38,9 @@ static inline u16 __raw_readw(const volatile void *addr)
> > >   #ifndef __raw_readl
> > >   static inline u32 __raw_readl(const volatile void *addr)
> > >   {
> > > -       return *(const volatile u32 *)addr;
> > > +       u32 val = *(const volatile u32 *)addr;
> > > +       asm volatile ("\n" : : "r"(addr));
> > > +       return val;
> > >   }
> > >   #endif
> > > 
> > > ... but I wonder whether this should rather be treated as a bug in GCC
> > > instead, since it should IMHO really not change the access size for a
> > > volatile memory access?
> > > 
> > >   Thomas
> > 
> > Wouldn't this break linux generally?
> > 
> > #ifndef __READ_ONCE
> > #define __READ_ONCE(x)  (*(const volatile __unqual_scalar_typeof(x) *)&(x))
> > #endif
> 
> I asked myself the very same question, but after googling for "GCC 16 linux
> kernel" issues, I did not find anything related... there is likely something
> specific to kvm-unit-tests in here...
> 
>  Thomas


This seems to be pertinent:
https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html

-ffuse-ops-with-volatile-access
Allow limited optimization of operations with volatile memory access when doing so does not change the semantics outlined in "When is a Volatile Object Accessed?".

The default is -ffuse-ops-with-volatile-access

implemented here:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122343


Try disabling? -fno-fuse-ops-with-volatile-access


-- 
MST


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: intel_iommu unit test is also failing
  2026-05-05 10:53                     ` Michael S. Tsirkin
@ 2026-05-05 11:38                       ` Thomas Huth
  2026-05-05 12:33                         ` Michael S. Tsirkin
  2026-05-05 11:39                       ` Clément MATHIEU--DRIF
  1 sibling, 1 reply; 17+ messages in thread
From: Thomas Huth @ 2026-05-05 11:38 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Clément MATHIEU--DRIF, Peter Xu, Paolo Bonzini,
	kvm@vger.kernel.org, Yi Liu

On 05/05/2026 12.53, Michael S. Tsirkin wrote:
> On Tue, May 05, 2026 at 12:34:41PM +0200, Thomas Huth wrote:
>> On 05/05/2026 12.23, Michael S. Tsirkin wrote:
>>> On Tue, May 05, 2026 at 11:45:17AM +0200, Thomas Huth wrote:
>>>> On 05/05/2026 11.27, Clément MATHIEU--DRIF wrote:
>>>>> I had a bit more time to hook into qemu to check the root cause.
>>>>>
>>>>> It seems that testb issues a single byte read (out of the valid size range), as we can see on the following breakpoint:
>>>>>
>>>>> ```
>>>>> Thread 6 "CPU 0/TCG" hit Breakpoint 2, memory_region_dispatch_read (mr=0x55d72883cb30, addr=152, pval=0x7f62d25f4590, op=MO_BSWAP, attrs=...) at ../system/memory.c:1473
>>>>> 1473       unsigned size = memop_size(op);
>>>>> (gdb) n
>>>>> 1474       MemTxResult r;
>>>>> (gdb) p size
>>>>> $1 = 1
>>>>> (gdb)
>>>>> ```
>>>>
>>>> Ouch! That's an excellent finding, Clément ... so GCC 16 is "smart" enough
>>>> to see that we only want to test the lowest bit here, so it optimizes the
>>>> code to access only one byte of memory instead of 4 bytes... which would be
>>>> ok for normal memory, but not for an MMIO register :-/
>>>>
>>>> Ugly work-around, to force GCC to read 32 bits:
>>>>
>>>> diff --git a/lib/asm-generic/io.h b/lib/asm-generic/io.h
>>>> --- a/lib/asm-generic/io.h
>>>> +++ b/lib/asm-generic/io.h
>>>> @@ -38,7 +38,9 @@ static inline u16 __raw_readw(const volatile void *addr)
>>>>    #ifndef __raw_readl
>>>>    static inline u32 __raw_readl(const volatile void *addr)
>>>>    {
>>>> -       return *(const volatile u32 *)addr;
>>>> +       u32 val = *(const volatile u32 *)addr;
>>>> +       asm volatile ("\n" : : "r"(addr));
>>>> +       return val;
>>>>    }
>>>>    #endif
>>>>
>>>> ... but I wonder whether this should rather be treated as a bug in GCC
>>>> instead, since it should IMHO really not change the access size for a
>>>> volatile memory access?
>>>>
>>>>    Thomas
>>>
>>> Wouldn't this break linux generally?
>>>
>>> #ifndef __READ_ONCE
>>> #define __READ_ONCE(x)  (*(const volatile __unqual_scalar_typeof(x) *)&(x))
>>> #endif
>>
>> I asked myself the very same question, but after googling for "GCC 16 linux
>> kernel" issues, I did not find anything related... there is likely something
>> specific to kvm-unit-tests in here...
>>
>>   Thomas
> 
> 
> This seems to be pertinent:
> https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
> 
> -ffuse-ops-with-volatile-access
> Allow limited optimization of operations with volatile memory access when doing so does not change the semantics outlined in "When is a Volatile Object Accessed?".
> 
> The default is -ffuse-ops-with-volatile-access
> 
> implemented here:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122343
> 
> Try disabling? -fno-fuse-ops-with-volatile-access

Thanks, this seems to fix the issue, indeed!

Would you like to send a patch for it?
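
For anyone who wants to check the flag on their own toolchain, here is a minimal sketch of the access pattern discussed above. The demo_* names are made up for illustration and are not part of kvm-unit-tests; the accessor only mirrors the shape of the quoted __raw_readl().

```c
#include <stdint.h>

/* Hypothetical stand-in for the __raw_readl() accessor quoted above. */
static inline uint32_t demo_raw_readl(const volatile void *addr)
{
	return *(const volatile uint32_t *)addr;
}

/*
 * Only bit 0 of the 32-bit value is consumed. According to the analysis
 * in this thread, this is the kind of pattern where GCC 16 may fuse the
 * load and the test into a single byte-sized access on x86.
 */
int demo_low_bit_set(const volatile void *reg)
{
	return (demo_raw_readl(reg) & 1) != 0;
}
```

On ordinary memory the result is the same whatever the access width, so the difference only shows up in the generated code: compiling this with -O2 -S, once as-is and once with -fno-fuse-ops-with-volatile-access (on a GCC that knows the flag), and comparing the two listings should show whether the load gets narrowed.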

  Thomas


* Re: intel_iommu unit test is also failing
  2026-05-05 10:53                     ` Michael S. Tsirkin
  2026-05-05 11:38                       ` Thomas Huth
@ 2026-05-05 11:39                       ` Clément MATHIEU--DRIF
  1 sibling, 0 replies; 17+ messages in thread
From: Clément MATHIEU--DRIF @ 2026-05-05 11:39 UTC (permalink / raw)
  To: Michael S. Tsirkin, Thomas Huth
  Cc: Peter Xu, Paolo Bonzini, kvm@vger.kernel.org, Yi Liu


On Tue, 2026-05-05 at 06:53 -0400, Michael S. Tsirkin wrote:
>
> On Tue, May 05, 2026 at 12:34:41PM +0200, Thomas Huth wrote:
>
> > On 05/05/2026 12.23, Michael S. Tsirkin wrote:
> >
> > > On Tue, May 05, 2026 at 11:45:17AM +0200, Thomas Huth wrote:
> > >
> > > > On 05/05/2026 11.27, Clément MATHIEU--DRIF wrote:
> > > >
> > > > > I had a bit more time to hook into qemu to check the root cause.
> > > > >
> > > > > It seems that testb issues a single byte read (out of the valid size range), as we can see on the following breakpoint:
> > > > >
> > > > > ```
> > > > > Thread 6 "CPU 0/TCG" hit Breakpoint 2, memory_region_dispatch_read (mr=0x55d72883cb30, addr=152, pval=0x7f62d25f4590, op=MO_BSWAP, attrs=...) at ../system/memory.c:1473
> > > > > 1473       unsigned size = memop_size(op);
> > > > > (gdb) n
> > > > > 1474       MemTxResult r;
> > > > > (gdb) p size
> > > > > $1 = 1
> > > > > (gdb)
> > > > > ```
> > > >
> > > >
> > > > Ouch! That's an excellent finding, Clément ... so GCC 16 is "smart" enough
> > > > to see that we only want to test the lowest bit here, so it optimizes the
> > > > code to access only one byte of memory instead of 4 bytes... which would be
> > > > ok for normal memory, but not for an MMIO register :-/
> > > >
> > > > Ugly work-around, to force GCC to read 32 bits:
> > > >
> > > > diff --git a/lib/asm-generic/io.h b/lib/asm-generic/io.h
> > > > --- a/lib/asm-generic/io.h
> > > > +++ b/lib/asm-generic/io.h
> > > > @@ -38,7 +38,9 @@ static inline u16 __raw_readw(const volatile void *addr)
> > > >   #ifndef __raw_readl
> > > >   static inline u32 __raw_readl(const volatile void *addr)
> > > >   {
> > > > -       return *(const volatile u32 *)addr;
> > > > +       u32 val = *(const volatile u32 *)addr;
> > > > +       asm volatile ("\n" : : "r"(addr));
> > > > +       return val;
> > > >   }
> > > >   #endif
> > > >
> > > > ... but I wonder whether this should rather be treated as a bug in GCC
> > > > instead, since it should IMHO really not change the access size for a
> > > > volatile memory access?
> > > >
> > > >   Thomas
> > >
> > >
> > > Wouldn't this break linux generally?
> > >
> > > #ifndef __READ_ONCE
> > > #define __READ_ONCE(x)  (*(const volatile __unqual_scalar_typeof(x) *)&(x))
> > > #endif
> >
> >
> > I asked myself the very same question, but after googling for "GCC 16 linux
> > kernel" issues, I did not find anything related... there is likely something
> > specific to kvm-unit-tests in here...
> >
> >  Thomas
>
>
>
> This seems to be pertinent:
> https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
>
> -ffuse-ops-with-volatile-access
> Allow limited optimization of operations with volatile memory access when doing so does not change the semantics outlined in "When is a Volatile Object Accessed?".
>
> The default is -ffuse-ops-with-volatile-access
>
> implemented here:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122343
>
>
> Try disabling? -fno-fuse-ops-with-volatile-access

Good guess Michael, the test suite passes with this option set.

cmd

>
>
> --
> MST
>

* Re: intel_iommu unit test is also failing
  2026-05-05 11:38                       ` Thomas Huth
@ 2026-05-05 12:33                         ` Michael S. Tsirkin
  2026-05-05 17:08                           ` Thomas Huth
  0 siblings, 1 reply; 17+ messages in thread
From: Michael S. Tsirkin @ 2026-05-05 12:33 UTC (permalink / raw)
  To: Thomas Huth
  Cc: Clément MATHIEU--DRIF, Peter Xu, Paolo Bonzini,
	kvm@vger.kernel.org, Yi Liu

On Tue, May 05, 2026 at 01:38:26PM +0200, Thomas Huth wrote:
> On 05/05/2026 12.53, Michael S. Tsirkin wrote:
> > On Tue, May 05, 2026 at 12:34:41PM +0200, Thomas Huth wrote:
> > > On 05/05/2026 12.23, Michael S. Tsirkin wrote:
> > > > On Tue, May 05, 2026 at 11:45:17AM +0200, Thomas Huth wrote:
> > > > > On 05/05/2026 11.27, Clément MATHIEU--DRIF wrote:
> > > > > > I had a bit more time to hook into qemu to check the root cause.
> > > > > > 
> > > > > > It seems that testb issues a single byte read (out of the valid size range), as we can see on the following breakpoint:
> > > > > > 
> > > > > > ```
> > > > > > Thread 6 "CPU 0/TCG" hit Breakpoint 2, memory_region_dispatch_read (mr=0x55d72883cb30, addr=152, pval=0x7f62d25f4590, op=MO_BSWAP, attrs=...) at ../system/memory.c:1473
> > > > > > 1473       unsigned size = memop_size(op);
> > > > > > (gdb) n
> > > > > > 1474       MemTxResult r;
> > > > > > (gdb) p size
> > > > > > $1 = 1
> > > > > > (gdb)
> > > > > > ```
> > > > > 
> > > > > Ouch! That's an excellent finding, Clément ... so GCC 16 is "smart" enough
> > > > > to see that we only want to test the lowest bit here, so it optimizes the
> > > > > code to access only one byte of memory instead of 4 bytes... which would be
> > > > > ok for normal memory, but not for an MMIO register :-/
> > > > > 
> > > > > Ugly work-around, to force GCC to read 32 bits:
> > > > > 
> > > > > diff --git a/lib/asm-generic/io.h b/lib/asm-generic/io.h
> > > > > --- a/lib/asm-generic/io.h
> > > > > +++ b/lib/asm-generic/io.h
> > > > > @@ -38,7 +38,9 @@ static inline u16 __raw_readw(const volatile void *addr)
> > > > >    #ifndef __raw_readl
> > > > >    static inline u32 __raw_readl(const volatile void *addr)
> > > > >    {
> > > > > -       return *(const volatile u32 *)addr;
> > > > > +       u32 val = *(const volatile u32 *)addr;
> > > > > +       asm volatile ("\n" : : "r"(addr));
> > > > > +       return val;
> > > > >    }
> > > > >    #endif
> > > > > 
> > > > > ... but I wonder whether this should rather be treated as a bug in GCC
> > > > > instead, since it should IMHO really not change the access size for a
> > > > > volatile memory access?
> > > > > 
> > > > >    Thomas
> > > > 
> > > > Wouldn't this break linux generally?
> > > > 
> > > > #ifndef __READ_ONCE
> > > > #define __READ_ONCE(x)  (*(const volatile __unqual_scalar_typeof(x) *)&(x))
> > > > #endif
> > > 
> > > I asked myself the very same question, but after googling for "GCC 16 linux
> > > kernel" issues, I did not find anything related... there is likely something
> > > specific to kvm-unit-tests in here...
> > > 
> > >   Thomas
> > 
> > 
> > This seems to be pertinent:
> > https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
> > 
> > -ffuse-ops-with-volatile-access
> > Allow limited optimization of operations with volatile memory access when doing so does not change the semantics outlined in "When is a Volatile Object Accessed?".
> > 
> > The default is -ffuse-ops-with-volatile-access
> > 
> > implemented here:
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122343
> > 
> > Try disabling? -fno-fuse-ops-with-volatile-access
> 
> Thanks, this seems to fix the issue, indeed!
> 
> Would you like to send a patch for it?
> 
>  Thomas


So then it's this bug apparently:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=125180

will likely be in 16.2?

Maybe just wait?
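
If waiting is not an option for some setup, another interim route besides the flag could be to spell out the load width in inline asm, so that the optimizer has no volatile C access to narrow. This is only a sketch under that assumption, not something proposed in this thread: demo_readl_fixed_width() is a made-up name, and the asm path is x86-only with a plain volatile load as fallback elsewhere.

```c
#include <stdint.h>

/* Hypothetical accessor; the name is for illustration only. */
static inline uint32_t demo_readl_fixed_width(const volatile void *addr)
{
	uint32_t val;

#if defined(__i386__) || defined(__x86_64__)
	/*
	 * The load is written as an explicit 32-bit mov, so its width is
	 * fixed by the asm template rather than chosen by the compiler.
	 */
	__asm__ __volatile__("movl %1, %0"
			     : "=r" (val)
			     : "m" (*(const volatile uint32_t *)addr)
			     : "memory");
#else
	/* Fallback for other targets: ordinary volatile load. */
	val = *(const volatile uint32_t *)addr;
#endif
	return val;
}
```

As far as I know, Linux builds its x86 MMIO accessors from inline asm in a similar way, which might be part of why nothing showed up there.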


-- 
MST


* Re: intel_iommu unit test is also failing
  2026-05-05 12:33                         ` Michael S. Tsirkin
@ 2026-05-05 17:08                           ` Thomas Huth
  0 siblings, 0 replies; 17+ messages in thread
From: Thomas Huth @ 2026-05-05 17:08 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Clément MATHIEU--DRIF, Peter Xu, Paolo Bonzini,
	kvm@vger.kernel.org, Yi Liu

On 05/05/2026 14.33, Michael S. Tsirkin wrote:
> On Tue, May 05, 2026 at 01:38:26PM +0200, Thomas Huth wrote:
>> On 05/05/2026 12.53, Michael S. Tsirkin wrote:
>>> On Tue, May 05, 2026 at 12:34:41PM +0200, Thomas Huth wrote:
>>>> On 05/05/2026 12.23, Michael S. Tsirkin wrote:
>>>>> On Tue, May 05, 2026 at 11:45:17AM +0200, Thomas Huth wrote:
>>>>>> On 05/05/2026 11.27, Clément MATHIEU--DRIF wrote:
>>>>>>> I had a bit more time to hook into qemu to check the root cause.
>>>>>>>
>>>>>>> It seems that testb issues a single byte read (out of the valid size range), as we can see on the following breakpoint:
>>>>>>>
>>>>>>> ```
>>>>>>> Thread 6 "CPU 0/TCG" hit Breakpoint 2, memory_region_dispatch_read (mr=0x55d72883cb30, addr=152, pval=0x7f62d25f4590, op=MO_BSWAP, attrs=...) at ../system/memory.c:1473
>>>>>>> 1473       unsigned size = memop_size(op);
>>>>>>> (gdb) n
>>>>>>> 1474       MemTxResult r;
>>>>>>> (gdb) p size
>>>>>>> $1 = 1
>>>>>>> (gdb)
>>>>>>> ```
>>>>>>
>>>>>> Ouch! That's an excellent finding, Clément ... so GCC 16 is "smart" enough
>>>>>> to see that we only want to test the lowest bit here, so it optimizes the
>>>>>> code to access only one byte of memory instead of 4 bytes... which would be
>>>>>> ok for normal memory, but not for an MMIO register :-/
>>>>>>
>>>>>> Ugly work-around, to force GCC to read 32 bits:
>>>>>>
>>>>>> diff --git a/lib/asm-generic/io.h b/lib/asm-generic/io.h
>>>>>> --- a/lib/asm-generic/io.h
>>>>>> +++ b/lib/asm-generic/io.h
>>>>>> @@ -38,7 +38,9 @@ static inline u16 __raw_readw(const volatile void *addr)
>>>>>>     #ifndef __raw_readl
>>>>>>     static inline u32 __raw_readl(const volatile void *addr)
>>>>>>     {
>>>>>> -       return *(const volatile u32 *)addr;
>>>>>> +       u32 val = *(const volatile u32 *)addr;
>>>>>> +       asm volatile ("\n" : : "r"(addr));
>>>>>> +       return val;
>>>>>>     }
>>>>>>     #endif
>>>>>>
>>>>>> ... but I wonder whether this should rather be treated as a bug in GCC
>>>>>> instead, since it should IMHO really not change the access size for a
>>>>>> volatile memory access?
>>>>>>
>>>>>>     Thomas
>>>>>
>>>>> Wouldn't this break linux generally?
>>>>>
>>>>> #ifndef __READ_ONCE
>>>>> #define __READ_ONCE(x)  (*(const volatile __unqual_scalar_typeof(x) *)&(x))
>>>>> #endif
>>>>
>>>> I asked myself the very same question, but after googling for "GCC 16 linux
>>>> kernel" issues, I did not find anything related... there is likely something
>>>> specific to kvm-unit-tests in here...
>>>>
>>>>    Thomas
>>>
>>>
>>> This seems to be pertinent:
>>> https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
>>>
>>> -ffuse-ops-with-volatile-access
>>> Allow limited optimization of operations with volatile memory access when doing so does not change the semantics outlined in "When is a Volatile Object Accessed?".
>>>
>>> The default is -ffuse-ops-with-volatile-access
>>>
>>> implemented here:
>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122343
>>>
>>> Try disabling? -fno-fuse-ops-with-volatile-access
>>
>> Thanks, this seems to fix the issue, indeed!
>>
>> Would you like to send a patch for it?
> 
> So then it's this bug apparently:
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=125180

Ah, thanks, that looks like the bug, indeed!

> will likely be in 16.2?
> 
> Maybe just wait?

I guess we could drop the failing test from the CI for the time being...

  Thomas


Thread overview: 17+ messages
     [not found] <20240604143507.1041901-1-pbonzini@redhat.com>
2026-05-04  7:58 ` [PATCH kvm-unit-tests] realmode: load above stack Thomas Huth
2026-05-04  8:07   ` intel_iommu unit test is also failing (was: Re: [PATCH kvm-unit-tests] realmode: load above stack) Thomas Huth
2026-05-04 15:45     ` Peter Xu
2026-05-05  5:49       ` Clément MATHIEU--DRIF
2026-05-05  6:37         ` Clément MATHIEU--DRIF
2026-05-05  7:36           ` Clément MATHIEU--DRIF
2026-05-05  9:27             ` Clément MATHIEU--DRIF
2026-05-05  9:45               ` intel_iommu unit test is also failing Thomas Huth
2026-05-05  9:53                 ` Clément MATHIEU--DRIF
2026-05-05 10:15                   ` Thomas Huth
2026-05-05 10:23                 ` Michael S. Tsirkin
2026-05-05 10:34                   ` Thomas Huth
2026-05-05 10:53                     ` Michael S. Tsirkin
2026-05-05 11:38                       ` Thomas Huth
2026-05-05 12:33                         ` Michael S. Tsirkin
2026-05-05 17:08                           ` Thomas Huth
2026-05-05 11:39                       ` Clément MATHIEU--DRIF