hvmloader - allow_memory_relocate overlaps

All of lore.kernel.org
 help / color / mirror / Atom feed

* hvmloader - allow_memory_relocate overlaps
@ 2023-12-16  7:01 Neowutran
  2023-12-19 16:57 ` Jan Beulich
  0 siblings, 1 reply; 10+ messages in thread
From: Neowutran @ 2023-12-16  7:01 UTC (permalink / raw)
  To: xen-devel

I am wondering if the variable "allow_memory_relocate" is still
relevant today and if its default value is still relevant. 
Should it be defined to 0 by default instead of 1 (it seems to be a
workaround for qemu-traditional, so maybe an outdated default value ? ) ? 

Code extract: 

tools/firmware/hvmloader/pci.c 
"
   /*
     * Do we allow hvmloader to relocate guest memory in order to
     * increase the size of the lowmem MMIO hole?  Defaulting to 1
     * here will
 mean that non-libxl toolstacks (including xend and
     * home-grown ones) means that those using qemu-xen will still
     * experience the memory relocation bug described below; but it
     * also means that those using q\0
emu-traditional will *not*
     * experience any change; and it also means that there is a
     * work-around for those using qemu-xen, namely switching to
     * qemu-traditional.
     *
     * If we defaulted to 0, and failing to resize the hole caused any
     * problems with qemu-traditional, then there is no work-around.
     *
     * Since xend can only use qemu-traditional, I think this is the
     * option that will have the least impact.
     */
    bool allow_memory_relocate = 1;
"

"
        /*
         * At the moment qemu-xen can't deal with relocated memory regions.
         * It's too close to the release to make a proper fix; for now,
         * only allow t
he MMIO hole to grow large enough to move guest memory
         * if we're running qemu-traditional.  Items that don't fit will be
         * relocated into the 64-bit address space.
         *
         * This loop now does the following:
         * - If allow_memory_relocate, increase the MMIO hole until it's
         *   big enough, or \0
until it's 2GiB
         * - If !allow_memory_relocate, increase the MMIO hole until it's
         *   big enough, or until it's 2GiB, or until it overlaps guest
         *   memory
         */
        while ( (mmio_total > (pci_mem_end - pci_mem_start))
                && ((pci_mem_start << 1) != 0)
                && (allow_memory_relocate
                    || (((pci_mem_start << 1) >> PAGE_SHIFT)
                        >= hvm_info->low_mem_pgend)) )
            pci_mem_start <<= 1;
"

The issue it cause is documented in the source code: guest memory can
be trashed by the hvmloader. 

Due to this issue, it is impossible to passthrough a large PCI device to a HVM with a lot of ram.
(https://github.com/QubesOS/qubes-issues/issues/4321). 

(Forcing "allow_memory_relocate" to be 0 seems to solve the issue
linked)



Thanks, 
Neowutran
\0


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: hvmloader - allow_memory_relocate overlaps
  2023-12-16  7:01 hvmloader - allow_memory_relocate overlaps Neowutran
@ 2023-12-19 16:57 ` Jan Beulich
  2023-12-21 18:08   ` Neowutran
  0 siblings, 1 reply; 10+ messages in thread
From: Jan Beulich @ 2023-12-19 16:57 UTC (permalink / raw)
  To: Neowutran; +Cc: xen-devel

On 16.12.2023 08:01, Neowutran wrote:
> I am wondering if the variable "allow_memory_relocate" is still
> relevant today and if its default value is still relevant. 
> Should it be defined to 0 by default instead of 1 (it seems to be a
> workaround for qemu-traditional, so maybe an outdated default value ? ) ? 

So are you saying you use qemu-trad? Otherwise isn't libxl suppressing
this behavior anyway? Or did that code go stale?

> Code extract: 
> 
> tools/firmware/hvmloader/pci.c 
> "
>    /*
>      * Do we allow hvmloader to relocate guest memory in order to
>      * increase the size of the lowmem MMIO hole?  Defaulting to 1
>      * here will
>  mean that non-libxl toolstacks (including xend and
>      * home-grown ones) means that those using qemu-xen will still
>      * experience the memory relocation bug described below; but it
>      * also means that those using q 
> emu-traditional will *not*
>      * experience any change; and it also means that there is a
>      * work-around for those using qemu-xen, namely switching to
>      * qemu-traditional.
>      *
>      * If we defaulted to 0, and failing to resize the hole caused any
>      * problems with qemu-traditional, then there is no work-around.
>      *
>      * Since xend can only use qemu-traditional, I think this is the
>      * option that will have the least impact.
>      */
>     bool allow_memory_relocate = 1;
> "
> 
> "
>         /*
>          * At the moment qemu-xen can't deal with relocated memory regions.
>          * It's too close to the release to make a proper fix; for now,
>          * only allow t
> he MMIO hole to grow large enough to move guest memory
>          * if we're running qemu-traditional.  Items that don't fit will be
>          * relocated into the 64-bit address space.
>          *
>          * This loop now does the following:
>          * - If allow_memory_relocate, increase the MMIO hole until it's
>          *   big enough, or  
> until it's 2GiB
>          * - If !allow_memory_relocate, increase the MMIO hole until it's
>          *   big enough, or until it's 2GiB, or until it overlaps guest
>          *   memory
>          */
>         while ( (mmio_total > (pci_mem_end - pci_mem_start))
>                 && ((pci_mem_start << 1) != 0)
>                 && (allow_memory_relocate
>                     || (((pci_mem_start << 1) >> PAGE_SHIFT)
>                         >= hvm_info->low_mem_pgend)) )
>             pci_mem_start <<= 1;
> "
> 
> The issue it cause is documented in the source code: guest memory can
> be trashed by the hvmloader. 
> 
> Due to this issue, it is impossible to passthrough a large PCI device to a HVM with a lot of ram.
> (https://github.com/QubesOS/qubes-issues/issues/4321). 
> 
> (Forcing "allow_memory_relocate" to be 0 seems to solve the issue
> linked)

What I don't understand here (and what I also can't find helpful logs for)
is: The code in hvmloader is in principle intended to work in both cases.
If there's suspected guest memory corruption, can we get to see the actual
results of the MMIO hole creation and its using for device ranges? If there
indeed is an overlap with guest RAM, that's a bug which wants fixing. I
don't see why we would want to drop allow_memory_relocate in the way you
suggest; quite the other way around, if upstream qemu still can't tolerate
hvmloader doing this, then it wants fixing there such that RAM relocation
can be enabled unconditionally. Then the variable will indeed not be needed
anymore.

Jan


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: hvmloader - allow_memory_relocate overlaps
  2023-12-19 16:57 ` Jan Beulich
@ 2023-12-21 18:08   ` Neowutran
  2023-12-22 11:35     ` Jan Beulich
  0 siblings, 1 reply; 10+ messages in thread
From: Neowutran @ 2023-12-21 18:08 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Neowutran, xen-devel

On 2023-12-19 17:12, Jan Beulich wrote:
> On 16.12.2023 08:01, Neowutran wrote:
> > I am wondering if the variable "allow_memory_relocate" is still
> > relevant today and if its default value is still relevant. 
> > Should it be defined to 0 by default instead of 1 (it seems to be a
> > workaround for qemu-traditional, so maybe an outdated default value ? ) ? 
> 
> So are you saying you use qemu-trad?
No, I am using "qemu_xen" ( from \0
xenstore-read -> 'device-model =
"qemu_xen"' 

> Otherwise isn't libxl suppressing this behavior anyway?
If by "isn't libxl suppressing this behavior" you means if libxl is setting
the value of "allow_memory_relocate", then the answer is no. 

Following this lead, I checked in what code path
"allow_memory_relocate" could be defined. 

It is only defined in one code path, 

In the file "tools/libs/light/libxl_dm.c",
in the function "void libxl__spawn_local_dm(libxl__egc *egc, libxl__dm_spawn_state *dmss)": 

'''
 // ...
    if (b_info->type == LIBXL_DOMAIN_TYPE_HVM) {
 // ...

        libxl__xs_printf(gc, XBT_NULL,
                         GCSPRINTF("%s/hvmloader/allow-memory-relocate", path),
                         "%d",
                         b_info->device_model_version==LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL &&
                         !libxl__vnuma_configured(b_info));
// ...
'''

However, on QubesOS (my system), "local_dm" is never used, "stub_dm"
is always used. 

In the function "void lib\0
xl__spawn_stub_dm(libxl__egc *egc, libxl__stub_dm_spawn_state *sdss)", 
the value of "allow_memory_relocate" is never defined. 

I tried to patch the code to define "allow_memory_relocate" in
"libxl__spawn_stub_dm": 

'''
--- a/tools/libs/light/libxl_dm.c
+++ b/tools/libs/light/libxl_dm.c
@@ -2431,6 +2431,10 @@ void libxl__spawn_stub_dm(libxl__egc *egc, libxl__stub_dm_spawn_state *sdss)
                                        libxl__xs_get_dompath(gc, guest_domid)),
                         "%s",
                         libxl_bios_type_to_string(guest_config->b_info.u.hvm.bios));
+        libxl__xs_printf(gc, XBT_NULL,
+                         libxl__sprintf(gc, "%s/hvmloader/allow-memory-relocate", libxl__xs_get_dompath(gc, guest_domid)),
+                         "%d",
+                         0);
     }
     ret = xc_domain_set_target(ctx->xch, dm_domid, guest_domid);
     if (ret<0) {
'''

And it is indeed another way to solve my issue. 
However I do not understand the xen hypervisor enough to known i\0
f
"allow-memory-relocate" should have been defined in
"libxl__spawn_stub_dm" or if the real issue is somewhere else. 


> Or did that code go stale?
> 
> > Code extract: 
> > 
> > tools/firmware/hvmloader/pci.c 
> > "
> >    /*
> >      * Do we allow hvmloader to relocate guest memory in order to
> >      * increase the size of the lowmem MMIO hole?  Defaulting to 1
> >      * here will
> >  mean that non-libxl toolstacks (including xend and
> >      * home-grown ones) means that those using qemu-xen will still
> >      * experience the memory relocation bug described below; but it
> >      * also means that those using q 
> > emu-traditional will *not*
> >      * experience any change; and it also means that there is a
> >      * work-around for those using qemu-xen, namely switching to
> >      * qemu-traditional.
> >      *
> >      * If we defaulted to 0, and failing to resize the hole caused any
> >      * problems with qemu-traditional, then there is no work-around.
> >      *
> >      * Since xend can\0
 only use qemu-traditional, I think this is the
> >      * option that will have the least impact.
> >      */
> >     bool allow_memory_relocate = 1;
> > "
> > 
> > "
> >         /*
> >          * At the moment qemu-xen can't deal with relocated memory regions.
> >          * It's too close to the release to make a proper fix; for now,
> >          * only allow t
> > he MMIO hole to grow large enough to move guest memory
> >          * if we're running qemu-traditional.  Items that don't fit will be
> >          * relocated into the 64-bit address space.
> >          *
> >          * This loop now does the following:
> >          * - If allow_memory_relocate, increase the MMIO hole until it's
> >          *   big enough, or  
> > until it's 2GiB
> >          * - If !allow_memory_relocate, increase the MMIO hole until it's
> >          *   big enough, or until it's 2GiB, or until it overlaps guest
> >          *   memory
> >          */
> >         while ( (mmio_total > (pci_mem_end - pci_mem_start))
> >    \0
             && ((pci_mem_start << 1) != 0)
> >                 && (allow_memory_relocate
> >                     || (((pci_mem_start << 1) >> PAGE_SHIFT)
> >                         >= hvm_info->low_mem_pgend)) )
> >             pci_mem_start <<= 1;
> > "
> > 
> > The issue it cause is documented in the source code: guest memory can
> > be trashed by the hvmloader. 
> > 
> > Due to this issue, it is impossible to passthrough a large PCI device to a HVM with a lot of ram.
> > (https://github.com/QubesOS/qubes-issues/issues/4321). 
> > 
> > (Forcing "allow_memory_relocate" to be 0 seems to solve the issue
> > linked)
> 
> What I don't understand here (and what I also can't find helpful logs for)
> is: The code in hvmloader is in principle intended to work in both cases.
> If there's suspected guest memory corruption, can we get to see the actual
> results of the MMIO hole creation and its using for device ranges? If there
> indeed is an overlap with guest RAM, that's a bug which wants fixing.

tools/firmware/\0
hvmloader/pci.c, function "void pci_setup(void)"
I added a log to print if the hvmloader would have tried to overlaps
guest memory

'''

+        printf("NEOWUTRAN pci.c: pci_mem_end: %d\n",pci_mem_end);
+        printf("NEOWUTRAN pci.c: pci_mem_start: %d\n",pci_mem_start);
+        printf("NEOWUTRAN pci.c: allow_memory_relocate: %d\n",allow_memory_relocate);
+        printf("NEOWUTRAN pci.c: hvm_info->low_mem_pgend: %d\n",hvm_info->low_mem_pgend);
+
+
        while ( (mmio_total > (pci_mem_end - pci_mem_start))
                && ((pci_mem_start << 1) != 0)
                && (allow_memory_relocate
                    || (((pci_mem_start << 1) >> PAGE_SHIFT)
                        >= hvm_info->low_mem_pgend)) )
            pci_mem_start <<= 1;

+         if ( (mmio_total > (pci_mem_end - pci_mem_start))
+                && ((pci_mem_start << 1) != 0)
+                && (1
+                    || (((pci_mem_start << 1) >> PAGE_SHIFT)
+                        >= hvm_info->low_mem_pgend)) ){
+               \0
printf("NEOWUTRAN pci.c: HVM_LOADER would have tried to relocate guest memory\n");
+               printf("NEOWUTRAN pci.c: pci_mem_start: %d\n",pci_mem_start);
+
+        }
'''

And the associated result is: 

(d22) NEOWUTRAN pci.c: pci_mem_end: -67108864
(d22) NEOWUTRAN pci.c: pci_mem_start: -268435456
(d22) NEOWUTRAN pci.c: allow_memory_relocate: 0
(d22) NEOWUTRAN pci.c: hvm_info->low_mem_pgend: 983040
(d22) NEOWUTRAN pci.c: HVM_LOADER would have tried to relocate guest memory
(d22) NEOWUTRAN pci.c: pci_mem_start: -268435456

So if "allow_memory_relocate" was not defined to 0, the hvmloader
would have tried to overlaps guest memory, and it seems that is
something that qemu_xen cannot handle.

> I
> don't see why we would want to drop allow_memory_relocate in the way you
> suggest; quite the other way around, if upstream qemu still can't tolerate
> hvmloader doing this, then it wants fixing there such that RAM relocation
> can be enabled unconditionally. Then the variable will indeed not be needed
> anymor\0
e.

I can send different logs or try different things if needed 


> 
> Jan

Thanks, 
Neowutran
\0


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: hvmloader - allow_memory_relocate overlaps
  2023-12-21 18:08   ` Neowutran
@ 2023-12-22 11:35     ` Jan Beulich
  2023-12-22 15:49       ` Neowutran
  2023-12-22 23:36       ` Marek Marczykowski-Górecki
  0 siblings, 2 replies; 10+ messages in thread
From: Jan Beulich @ 2023-12-22 11:35 UTC (permalink / raw)
  To: Neowutran; +Cc: xen-devel, Anthony Perard, George Dunlap

On 21.12.2023 19:08, Neowutran wrote:
> On 2023-12-19 17:12, Jan Beulich wrote:
>> On 16.12.2023 08:01, Neowutran wrote:
>>> I am wondering if the variable "allow_memory_relocate" is still
>>> relevant today and if its default value is still relevant. 
>>> Should it be defined to 0 by default instead of 1 (it seems to be a
>>> workaround for qemu-traditional, so maybe an outdated default value ? ) ? 
>>
>> So are you saying you use qemu-trad?
> No, I am using "qemu_xen" ( from  
> xenstore-read -> 'device-model =
> "qemu_xen"' 
> 
>> Otherwise isn't libxl suppressing this behavior anyway?
> If by "isn't libxl suppressing this behavior" you means if libxl is setting
> the value of "allow_memory_relocate", then the answer is no. 
> 
> Following this lead, I checked in what code path
> "allow_memory_relocate" could be defined. 
> 
> It is only defined in one code path, 
> 
> In the file "tools/libs/light/libxl_dm.c",
> in the function "void libxl__spawn_local_dm(libxl__egc *egc, libxl__dm_spawn_state *dmss)": 
> 
> '''
>  // ...
>     if (b_info->type == LIBXL_DOMAIN_TYPE_HVM) {
>  // ...
> 
>         libxl__xs_printf(gc, XBT_NULL,
>                          GCSPRINTF("%s/hvmloader/allow-memory-relocate", path),
>                          "%d",
>                          b_info->device_model_version==LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL &&
>                          !libxl__vnuma_configured(b_info));
> // ...
> '''
> 
> However, on QubesOS (my system), "local_dm" is never used, "stub_dm"
> is always used. 
> 
> In the function "void lib 
> xl__spawn_stub_dm(libxl__egc *egc, libxl__stub_dm_spawn_state *sdss)", 
> the value of "allow_memory_relocate" is never defined. 
> 
> I tried to patch the code to define "allow_memory_relocate" in
> "libxl__spawn_stub_dm": 
> 
> '''
> --- a/tools/libs/light/libxl_dm.c
> +++ b/tools/libs/light/libxl_dm.c
> @@ -2431,6 +2431,10 @@ void libxl__spawn_stub_dm(libxl__egc *egc, libxl__stub_dm_spawn_state *sdss)
>                                         libxl__xs_get_dompath(gc, guest_domid)),
>                          "%s",
>                          libxl_bios_type_to_string(guest_config->b_info.u.hvm.bios));
> +        libxl__xs_printf(gc, XBT_NULL,
> +                         libxl__sprintf(gc, "%s/hvmloader/allow-memory-relocate", libxl__xs_get_dompath(gc, guest_domid)),
> +                         "%d",
> +                         0);
>      }
>      ret = xc_domain_set_target(ctx->xch, dm_domid, guest_domid);
>      if (ret<0) {
> '''
> 
> And it is indeed another way to solve my issue. 
> However I do not understand the xen hypervisor enough to known i 
> f
> "allow-memory-relocate" should have been defined in
> "libxl__spawn_stub_dm" or if the real issue is somewhere else. 

I think it should be done the same no matter where qemu runs. Back at the
time iirc only qemu-trad could run in a stubdom, which may explain the
omission. Cc-ing the two guys who are likely in the best position to tell
for sure.

>>> Code extract: 
>>>
>>> tools/firmware/hvmloader/pci.c 
>>> "
>>>    /*
>>>      * Do we allow hvmloader to relocate guest memory in order to
>>>      * increase the size of the lowmem MMIO hole?  Defaulting to 1
>>>      * here will
>>>  mean that non-libxl toolstacks (including xend and
>>>      * home-grown ones) means that those using qemu-xen will still
>>>      * experience the memory relocation bug described below; but it
>>>      * also means that those using q 
>>> emu-traditional will *not*
>>>      * experience any change; and it also means that there is a
>>>      * work-around for those using qemu-xen, namely switching to
>>>      * qemu-traditional.
>>>      *
>>>      * If we defaulted to 0, and failing to resize the hole caused any
>>>      * problems with qemu-traditional, then there is no work-around.
>>>      *
>>>      * Since xend can 
>  only use qemu-traditional, I think this is the
>>>      * option that will have the least impact.
>>>      */
>>>     bool allow_memory_relocate = 1;
>>> "
>>>
>>> "
>>>         /*
>>>          * At the moment qemu-xen can't deal with relocated memory regions.
>>>          * It's too close to the release to make a proper fix; for now,
>>>          * only allow t
>>> he MMIO hole to grow large enough to move guest memory
>>>          * if we're running qemu-traditional.  Items that don't fit will be
>>>          * relocated into the 64-bit address space.
>>>          *
>>>          * This loop now does the following:
>>>          * - If allow_memory_relocate, increase the MMIO hole until it's
>>>          *   big enough, or  
>>> until it's 2GiB
>>>          * - If !allow_memory_relocate, increase the MMIO hole until it's
>>>          *   big enough, or until it's 2GiB, or until it overlaps guest
>>>          *   memory
>>>          */
>>>         while ( (mmio_total > (pci_mem_end - pci_mem_start))
>>>     
>              && ((pci_mem_start << 1) != 0)
>>>                 && (allow_memory_relocate
>>>                     || (((pci_mem_start << 1) >> PAGE_SHIFT)
>>>                         >= hvm_info->low_mem_pgend)) )
>>>             pci_mem_start <<= 1;
>>> "
>>>
>>> The issue it cause is documented in the source code: guest memory can
>>> be trashed by the hvmloader. 
>>>
>>> Due to this issue, it is impossible to passthrough a large PCI device to a HVM with a lot of ram.
>>> (https://github.com/QubesOS/qubes-issues/issues/4321). 
>>>
>>> (Forcing "allow_memory_relocate" to be 0 seems to solve the issue
>>> linked)
>>
>> What I don't understand here (and what I also can't find helpful logs for)
>> is: The code in hvmloader is in principle intended to work in both cases.
>> If there's suspected guest memory corruption, can we get to see the actual
>> results of the MMIO hole creation and its using for device ranges? If there
>> indeed is an overlap with guest RAM, that's a bug which wants fixing.
> 
> tools/firmware/ 
> hvmloader/pci.c, function "void pci_setup(void)"
> I added a log to print if the hvmloader would have tried to overlaps
> guest memory
> 
> '''
> 
> +        printf("NEOWUTRAN pci.c: pci_mem_end: %d\n",pci_mem_end);
> +        printf("NEOWUTRAN pci.c: pci_mem_start: %d\n",pci_mem_start);
> +        printf("NEOWUTRAN pci.c: allow_memory_relocate: %d\n",allow_memory_relocate);
> +        printf("NEOWUTRAN pci.c: hvm_info->low_mem_pgend: %d\n",hvm_info->low_mem_pgend);
> +
> +
>         while ( (mmio_total > (pci_mem_end - pci_mem_start))
>                 && ((pci_mem_start << 1) != 0)
>                 && (allow_memory_relocate
>                     || (((pci_mem_start << 1) >> PAGE_SHIFT)
>                         >= hvm_info->low_mem_pgend)) )
>             pci_mem_start <<= 1;
> 
> +         if ( (mmio_total > (pci_mem_end - pci_mem_start))
> +                && ((pci_mem_start << 1) != 0)
> +                && (1
> +                    || (((pci_mem_start << 1) >> PAGE_SHIFT)
> +                        >= hvm_info->low_mem_pgend)) ){
> +                
> printf("NEOWUTRAN pci.c: HVM_LOADER would have tried to relocate guest memory\n");
> +               printf("NEOWUTRAN pci.c: pci_mem_start: %d\n",pci_mem_start);
> +
> +        }
> '''
> 
> And the associated result is: 
> 
> (d22) NEOWUTRAN pci.c: pci_mem_end: -67108864
> (d22) NEOWUTRAN pci.c: pci_mem_start: -268435456
> (d22) NEOWUTRAN pci.c: allow_memory_relocate: 0
> (d22) NEOWUTRAN pci.c: hvm_info->low_mem_pgend: 983040
> (d22) NEOWUTRAN pci.c: HVM_LOADER would have tried to relocate guest memory
> (d22) NEOWUTRAN pci.c: pci_mem_start: -268435456
> 
> So if "allow_memory_relocate" was not defined to 0, the hvmloader
> would have tried to overlaps guest memory, and it seems that is
> something that qemu_xen cannot handle.

Well, memory addresses printed in decimal are pretty hard to work with.
But I'd like to ask anyway that you supply all of the log messages for
such a guest starting, to be able to correlate what you've added with
other items also logged.

Jan


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: hvmloader - allow_memory_relocate overlaps
  2023-12-22 11:35     ` Jan Beulich
@ 2023-12-22 15:49       ` Neowutran
  2024-01-04 13:16         ` Jan Beulich
  2023-12-22 23:36       ` Marek Marczykowski-Górecki
  1 sibling, 1 reply; 10+ messages in thread
From: Neowutran @ 2023-12-22 15:49 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Neowutran, xen-devel, Anthony Perard, George Dunlap

> > '''
> > 
> > And the associated result is: 
> > 
> > (d22) NEOWUTRAN pci.c: pci_mem_end: -67108864
> > (d22) NEOWUTRAN pci.c: pci_mem_start: -268435456
> > (d22) NEOWUTRAN pci.c: allow_memory_relocate: 0
> > (d22) NEOWUTRAN pci.c:\0
 hvm_info->low_mem_pgend: 983040
> > (d22) NEOWUTRAN pci.c: HVM_LOADER would have tried to relocate guest memory
> > (d22) NEOWUTRAN pci.c: pci_mem_start: -268435456
> > 
> > So if "allow_memory_relocate" was not defined to 0, the hvmloader
> > would have tried to overlaps guest memory, and it seems that is
> > something that qemu_xen cannot handle.
> 
> Well, memory addresses printed in decimal are pretty hard to work with.
> But I'd like to ask anyway that you supply all of the log messages for
> such a guest starting, to be able to correlate what you've added with
> other items also logged.

Full logs with my patch to set allow-memory-relocate (https://github.com/neowutran/qubes-vmm-xen/commit/819705bc346cad14836fd523195ad2b0445330ac)
https://pastebin.com/9kQgvraK
(GPU passthrough work, hvmloader doesn't overlaps with guest memory)


Full logs without my patch to set allow-memory-relocate (https://github.com/neowutran/qubes-vmm-xen/blob/allowmemoryrelocate/ALLOWMEMORYRELOCATE.patch)
https://pastebin.com/g\0
QGg55WZ
(GPU passthrough doesn't work, hvmloader overlaps with guest memory)

> 
> Jan

Thanks,
Neowutran
\0


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: hvmloader - allow_memory_relocate overlaps
  2023-12-22 11:35     ` Jan Beulich
  2023-12-22 15:49       ` Neowutran
@ 2023-12-22 23:36       ` Marek Marczykowski-Górecki
  1 sibling, 0 replies; 10+ messages in thread
From: Marek Marczykowski-Górecki @ 2023-12-22 23:36 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Neowutran, xen-devel, Anthony Perard, George Dunlap

[-- Attachment #1: Type: text/plain, Size: 3552 bytes --]

On Fri, Dec 22, 2023 at 12:35:57PM +0100, Jan Beulich wrote:
> On 21.12.2023 19:08, Neowutran wrote:
> > On 2023-12-19 17:12, Jan Beulich wrote:
> >> On 16.12.2023 08:01, Neowutran wrote:
> >>> I am wondering if the variable "allow_memory_relocate" is still
> >>> relevant today and if its default value is still relevant. 
> >>> Should it be defined to 0 by default instead of 1 (it seems to be a
> >>> workaround for qemu-traditional, so maybe an outdated default value ? ) ? 
> >>
> >> So are you saying you use qemu-trad?
> > No, I am using "qemu_xen" ( from  
> > xenstore-read -> 'device-model =
> > "qemu_xen"' 
> > 
> >> Otherwise isn't libxl suppressing this behavior anyway?
> > If by "isn't libxl suppressing this behavior" you means if libxl is setting
> > the value of "allow_memory_relocate", then the answer is no. 
> > 
> > Following this lead, I checked in what code path
> > "allow_memory_relocate" could be defined. 
> > 
> > It is only defined in one code path, 
> > 
> > In the file "tools/libs/light/libxl_dm.c",
> > in the function "void libxl__spawn_local_dm(libxl__egc *egc, libxl__dm_spawn_state *dmss)": 
> > 
> > '''
> >  // ...
> >     if (b_info->type == LIBXL_DOMAIN_TYPE_HVM) {
> >  // ...
> > 
> >         libxl__xs_printf(gc, XBT_NULL,
> >                          GCSPRINTF("%s/hvmloader/allow-memory-relocate", path),
> >                          "%d",
> >                          b_info->device_model_version==LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL &&
> >                          !libxl__vnuma_configured(b_info));
> > // ...
> > '''
> > 
> > However, on QubesOS (my system), "local_dm" is never used, "stub_dm"
> > is always used. 
> > 
> > In the function "void lib 
> > xl__spawn_stub_dm(libxl__egc *egc, libxl__stub_dm_spawn_state *sdss)", 
> > the value of "allow_memory_relocate" is never defined. 
> > 
> > I tried to patch the code to define "allow_memory_relocate" in
> > "libxl__spawn_stub_dm": 
> > 
> > '''
> > --- a/tools/libs/light/libxl_dm.c
> > +++ b/tools/libs/light/libxl_dm.c
> > @@ -2431,6 +2431,10 @@ void libxl__spawn_stub_dm(libxl__egc *egc, libxl__stub_dm_spawn_state *sdss)
> >                                         libxl__xs_get_dompath(gc, guest_domid)),
> >                          "%s",
> >                          libxl_bios_type_to_string(guest_config->b_info.u.hvm.bios));
> > +        libxl__xs_printf(gc, XBT_NULL,
> > +                         libxl__sprintf(gc, "%s/hvmloader/allow-memory-relocate", libxl__xs_get_dompath(gc, guest_domid)),
> > +                         "%d",
> > +                         0);
> >      }
> >      ret = xc_domain_set_target(ctx->xch, dm_domid, guest_domid);
> >      if (ret<0) {
> > '''
> > 
> > And it is indeed another way to solve my issue. 
> > However I do not understand the xen hypervisor enough to known i 
> > f
> > "allow-memory-relocate" should have been defined in
> > "libxl__spawn_stub_dm" or if the real issue is somewhere else. 
> 
> I think it should be done the same no matter where qemu runs. Back at the
> time iirc only qemu-trad could run in a stubdom, which may explain the
> omission. Cc-ing the two guys who are likely in the best position to tell
> for sure.

This indeed looks like a missing piece of qemu-xen support in
stubdomain.
But also, As Jan pointed out, this would be better fixed at the qemu
side so memory relocation could be re-enabled.

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: hvmloader - allow_memory_relocate overlaps
  2023-12-22 15:49       ` Neowutran
@ 2024-01-04 13:16         ` Jan Beulich
  2024-02-07 15:08           ` Jan Beulich
  0 siblings, 1 reply; 10+ messages in thread
From: Jan Beulich @ 2024-01-04 13:16 UTC (permalink / raw)
  To: Neowutran; +Cc: xen-devel, Anthony Perard, George Dunlap

On 22.12.2023 16:49, Neowutran wrote:
> Full logs without my patch to set allow-memory-relocate (https://github.com/neowutran/qubes-vmm-xen/blob/allowmemoryrelocate/ALLOWMEMORYRELOCATE.patch)
> https://pastebin.com/g 
> QGg55WZ
> (GPU passthrough doesn't work, hvmloader overlaps with guest memory)

So there are oddities, but I can't spot any overlaps. What's odd is that
the two blocks already above 4Gb are accounted for (and later relocated)
when calculating total MMIO size. BARs of size 2Gb and more shouldn't be
accounted for at all when deciding whether low RAM needs relocating, as
those can't live below 4Gb anyway. I vaguely recall pointing this out
years ago, but it was thought we'd get away for a fair while. What's
further odd is where the two blocks are moved to: F800000 moves (down)
to C00000, while the smaller FC00000 moves further up to FC80000.

I'll try to get to addressing at least the first oddity; if I can figure
out why the second one occurs, I may try to address that as well.

Jan

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: hvmloader - allow_memory_relocate overlaps
  2024-01-04 13:16         ` Jan Beulich
@ 2024-02-07 15:08           ` Jan Beulich
  2024-03-01 20:24             ` Neowutran
  0 siblings, 1 reply; 10+ messages in thread
From: Jan Beulich @ 2024-02-07 15:08 UTC (permalink / raw)
  To: Neowutran; +Cc: xen-devel, Anthony Perard, George Dunlap

On 04.01.2024 14:16, Jan Beulich wrote:
> On 22.12.2023 16:49, Neowutran wrote:
>> Full logs without my patch to set allow-memory-relocate (https://github.com/neowutran/qubes-vmm-xen/blob/allowmemoryrelocate/ALLOWMEMORYRELOCATE.patch)
>> https://pastebin.com/g 
>> QGg55WZ
>> (GPU passthrough doesn't work, hvmloader overlaps with guest memory)
> 
> So there are oddities, but I can't spot any overlaps. What's odd is that
> the two blocks already above 4Gb are accounted for (and later relocated)
> when calculating total MMIO size. BARs of size 2Gb and more shouldn't be
> accounted for at all when deciding whether low RAM needs relocating, as
> those can't live below 4Gb anyway. I vaguely recall pointing this out
> years ago, but it was thought we'd get away for a fair while. What's
> further odd is where the two blocks are moved to: F800000 moves (down)
> to C00000, while the smaller FC00000 moves further up to FC80000.
> 
> I'll try to get to addressing at least the first oddity; if I can figure
> out why the second one occurs, I may try to address that as well.

Could you give the patch below a try? I don't have a device with large
enough a BAR that I could sensibly pass through to a guest, so I was
only able to build-test the change.

Jan

hvmloader/PCI: skip huge BARs in certain calculations

BARs of size 2Gb and up can't possibly fit below 4Gb: Both the bottom of
the lower 2Gb range and the top of the higher 2Gb range have special
purpose. Don't even have them influence whether to (perhaps) relocate
low RAM.

Reported-by: Neowutran <xen@neowutran.ovh>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
If we wanted to fit e.g. multiple 1Gb BARs, it would likely be prudent
to similarly avoid low RAM relocation in the first place. Yet accounting
for things differently depending on many large BARs there are would
require more intrusive code changes.

That said, I'm open to further lowering of the threshold. That'll
require different justification then, though.

--- a/tools/firmware/hvmloader/pci.c
+++ b/tools/firmware/hvmloader/pci.c
@@ -33,6 +33,13 @@ uint32_t pci_mem_start = HVM_BELOW_4G_MM
 const uint32_t pci_mem_end = RESERVED_MEMBASE;
 uint64_t pci_hi_mem_start = 0, pci_hi_mem_end = 0;
 
+/*
+ * BARs larger than this value are put in 64-bit space unconditionally.  That
+ * is, such BARs also don't play into the determination of how big the lowmem
+ * MMIO hole needs to be.
+ */
+#define HUGE_BAR_THRESH GB(1)
+
 enum virtual_vga virtual_vga = VGA_none;
 unsigned long igd_opregion_pgbase = 0;
 
@@ -286,9 +293,11 @@ void pci_setup(void)
             bars[i].bar_reg = bar_reg;
             bars[i].bar_sz  = bar_sz;
 
-            if ( ((bar_data & PCI_BASE_ADDRESS_SPACE) ==
-                  PCI_BASE_ADDRESS_SPACE_MEMORY) ||
-                 (bar_reg == PCI_ROM_ADDRESS) )
+            if ( is_64bar && bar_sz > HUGE_BAR_THRESH )
+                bar64_relocate = 1;
+            else if ( ((bar_data & PCI_BASE_ADDRESS_SPACE) ==
+                       PCI_BASE_ADDRESS_SPACE_MEMORY) ||
+                      (bar_reg == PCI_ROM_ADDRESS) )
                 mmio_total += bar_sz;
 
             nr_bars++;
@@ -367,7 +376,7 @@ void pci_setup(void)
             pci_mem_start = hvm_info->low_mem_pgend << PAGE_SHIFT;
     }
 
-    if ( mmio_total > (pci_mem_end - pci_mem_start) )
+    if ( mmio_total > (pci_mem_end - pci_mem_start) || bar64_relocate )
     {
         printf("Low MMIO hole not large enough for all devices,"
                " relocating some BARs to 64-bit\n");
@@ -446,8 +455,9 @@ void pci_setup(void)
          *   the code here assumes it to be.)
          * Should either of those two conditions change, this code will break.
          */
-        using_64bar = bars[i].is_64bar && bar64_relocate
-            && (mmio_total > (mem_resource.max - mem_resource.base));
+        using_64bar = bars[i].is_64bar && bar64_relocate &&
+            (mmio_total > (mem_resource.max - mem_resource.base) ||
+             bar_sz > HUGE_BAR_THRESH);
         bar_data = pci_readl(devfn, bar_reg);
 
         if ( (bar_data & PCI_BASE_ADDRESS_SPACE) ==
@@ -467,7 +477,8 @@ void pci_setup(void)
                 resource = &mem_resource;
                 bar_data &= ~PCI_BASE_ADDRESS_MEM_MASK;
             }
-            mmio_total -= bar_sz;
+            if ( bar_sz <= HUGE_BAR_THRESH )
+                mmio_total -= bar_sz;
         }
         else
         {




^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: hvmloader - allow_memory_relocate overlaps
  2024-02-07 15:08           ` Jan Beulich
@ 2024-03-01 20:24             ` Neowutran
  2024-03-04  7:29               ` Jan Beulich
  0 siblings, 1 reply; 10+ messages in thread
From: Neowutran @ 2024-03-01 20:24 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Neowutran, xen-devel, Anthony Perard, George Dunlap

On 2024-02-07 16:02, Jan Beulich wrote:
> On 04.01.2024 14:16, Jan Beulich wrote:
> > On 22.12.2023 16:49, Neowutran wrote:
> >> Full logs without my patch to set allow-memory-relocate (https://github.com/neowutran/qubes-vmm-xen/blob/allowmemoryrelocate/ALLOWMEMORYRELOCATE.patch)
> >> https://pastebin.com/g 
> >> QGg55WZ
> >> (GPU passthrough doesn't work, hvmloader overlaps with guest memory)
> > 
> > So there are oddities, but I can't spot any overlaps. What's odd is that
> > the two blocks already above 4Gb are accounted for (and later relocated)
> > when calculating total MMIO size. BARs of size 2Gb and more shouldn't be
> > accounted for at all when deciding whether low RAM needs relocating, as
> > those can't live below 4Gb anyway. I vaguely recall pointing this out
> > years ago, but it was thought we'd get away for a fair while. What's
> > further odd is where the two blocks are moved to: F800000 moves (down)
> > to C00000, while the smaller FC00000 moves further up to FC80000.
> > 
> > I'll try to get to addressing at least the first oddity; if I can figure
> > out why the second one occurs, I may try to address that as well.
> 
> Could you give the patch below a try? I don't have a device with large
> enough a BAR that I could sensibly pass through to a guest, so I was
> only able to build-test the change.

Hi and thanks,
I just tested it, it indeed work well when the GPU have bar > 1Go. 

------------
Setup: 
I removed my patch 
(


--- a/tools/libs/light/libxl_dm.c
+++ b/tools/libs/light/libxl_dm.c
@@ -2431,6 +2431,10 @@ void libxl__spawn_stub_dm(libxl__egc *egc, libxl__stub_dm_spawn_state *sdss)
                                        libxl__xs_get_dompath(gc, guest_domid)),
                         "%s",
                         libxl_bios_type_to_string(guest_config->b_info.u.hvm.bios));
+        libxl__xs_printf(gc, XBT_NULL,
+                         libxl__sprintf(gc, "%s/hvmloader/allow-memory-relocate", libxl__xs_get_dompath(gc, guest_domid)),
+                         "%d",
+                         0);
     }
     ret = xc_domain_set_target(ctx->xch, dm_domid, guest_domid);
     if (ret<0) {


)
and applied your suggested "skip huge BARs" patch. 
My GPU: nvidia 4080
-------------

When the option "Resizable BAR support" is activated in my bios,
the BAR1 size of my gpu is reported to be 16GB. With this patch, gpu
passthrough work. 

When the option "Resizable BAR support" is desactivated in my bios,
the BAR1 size of my gpu is reported to be 256MB. With this patch,
gpu passthrough doesn't work (same crash as before)


( note: the option "Resizable BAR support" may or may not exist
depending on the motherboard model, on some it is 'hardcoded' to
activated, on some it is 'hardcoded' to desactivated)

> Jan
> 
> hvmloader/PCI: skip huge BARs in certain calculations
> 
> BARs of size 2Gb and up can't possibly fit below 4Gb: Both the bottom of
> the lower 2Gb range and the top of the higher 2Gb range have special
> purpose. Don't even have them influence whether to (perhaps) relocate
> low RAM.
> 
> Reported-by: Neowutran <xen@neowutran.ovh>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> If we wanted to fit e.g. multiple 1Gb BARs, it would likely be prudent
> to similarly avoid low RAM relocation in the first place. Yet accounting
> for things differently depending on many large BARs there are would
> require more intrusive code changes.
> 
> That said, I'm open to further lowering of the threshold. That'll
> require different justification then, though.
> 
> --- a/tools/firmware/hvmloader/pci.c
> +++ b/tools/firmware/hvmloader/pci.c
> @@ -33,6 +33,13 @@ uint32_t pci_mem_start = HVM_BELOW_4G_MM
>  const uint32_t pci_mem_end = RESERVED_MEMBASE;
>  uint64_t pci_hi_mem_start = 0, pci_hi_mem_end = 0;
>  
> +/*
> + * BARs larger than this value are put in 64-bit space unconditionally.  That
> + * is, such BARs also don't play into the determination of how big the lowmem
> + * MMIO hole needs to be.
> + */
> +#define HUGE_BAR_THRESH GB(1)
> +
>  enum virtual_vga virtual_vga = VGA_none;
>  unsigned long igd_opregion_pgbase = 0;
>  
> @@ -286,9 +293,11 @@ void pci_setup(void)
>              bars[i].bar_reg = bar_reg;
>              bars[i].bar_sz  = bar_sz;
>  
> -            if ( ((bar_data & PCI_BASE_ADDRESS_SPACE) ==
> -                  PCI_BASE_ADDRESS_SPACE_MEMORY) ||
> -                 (bar_reg == PCI_ROM_ADDRESS) )
> +            if ( is_64bar && bar_sz > HUGE_BAR_THRESH )
> +                bar64_relocate = 1;
> +            else if ( ((bar_data & PCI_BASE_ADDRESS_SPACE) ==
> +                       PCI_BASE_ADDRESS_SPACE_MEMORY) ||
> +                      (bar_reg == PCI_ROM_ADDRESS) )
>                  mmio_total += bar_sz;
>  
>              nr_bars++;
> @@ -367,7 +376,7 @@ void pci_setup(void)
>              pci_mem_start = hvm_info->low_mem_pgend << PAGE_SHIFT;
>      }
>  
> -    if ( mmio_total > (pci_mem_end - pci_mem_start) )
> +    if ( mmio_total > (pci_mem_end - pci_mem_start) || bar64_relocate )
>      {
>          printf("Low MMIO hole not large enough for all devices,"
>                 " relocating some BARs to 64-bit\n");
> @@ -446,8 +455,9 @@ void pci_setup(void)
>           *   the code here assumes it to be.)
>           * Should either of those two conditions change, this code will break.
>           */
> -        using_64bar = bars[i].is_64bar && bar64_relocate
> -            && (mmio_total > (mem_resource.max - mem_resource.base));
> +        using_64bar = bars[i].is_64bar && bar64_relocate &&
> +            (mmio_total > (mem_resource.max - mem_resource.base) ||
> +             bar_sz > HUGE_BAR_THRESH);
>          bar_data = pci_readl(devfn, bar_reg);
>  
>          if ( (bar_data & PCI_BASE_ADDRESS_SPACE) ==
> @@ -467,7 +477,8 @@ void pci_setup(void)
>                  resource = &mem_resource;
>                  bar_data &= ~PCI_BASE_ADDRESS_MEM_MASK;
>              }
> -            mmio_total -= bar_sz;
> +            if ( bar_sz <= HUGE_BAR_THRESH )
> +                mmio_total -= bar_sz;
>          }
>          else
>          {
> 




^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: hvmloader - allow_memory_relocate overlaps
  2024-03-01 20:24             ` Neowutran
@ 2024-03-04  7:29               ` Jan Beulich
  0 siblings, 0 replies; 10+ messages in thread
From: Jan Beulich @ 2024-03-04  7:29 UTC (permalink / raw)
  To: Neowutran; +Cc: xen-devel, Anthony Perard, George Dunlap

On 01.03.2024 21:24, Neowutran wrote:
> On 2024-02-07 16:02, Jan Beulich wrote:
>> Could you give the patch below a try? I don't have a device with large
>> enough a BAR that I could sensibly pass through to a guest, so I was
>> only able to build-test the change.
> 
> Hi and thanks,
> I just tested it, it indeed work well when the GPU have bar > 1Go. 
> 
> ------------
> Setup: 
> I removed my patch 
> (
> 
> 
> --- a/tools/libs/light/libxl_dm.c
> +++ b/tools/libs/light/libxl_dm.c
> @@ -2431,6 +2431,10 @@ void libxl__spawn_stub_dm(libxl__egc *egc, libxl__stub_dm_spawn_state *sdss)
>                                         libxl__xs_get_dompath(gc, guest_domid)),
>                          "%s",
>                          libxl_bios_type_to_string(guest_config->b_info.u.hvm.bios));
> +        libxl__xs_printf(gc, XBT_NULL,
> +                         libxl__sprintf(gc, "%s/hvmloader/allow-memory-relocate", libxl__xs_get_dompath(gc, guest_domid)),
> +                         "%d",
> +                         0);
>      }
>      ret = xc_domain_set_target(ctx->xch, dm_domid, guest_domid);
>      if (ret<0) {
> 
> 
> )
> and applied your suggested "skip huge BARs" patch. 
> My GPU: nvidia 4080
> -------------
> 
> When the option "Resizable BAR support" is activated in my bios,
> the BAR1 size of my gpu is reported to be 16GB. With this patch, gpu
> passthrough work. 

Good. I guess I should properly post that patch then, no matter that ...

> When the option "Resizable BAR support" is desactivated in my bios,
> the BAR1 size of my gpu is reported to be 256MB. With this patch,
> gpu passthrough doesn't work (same crash as before)

... this clearly requires further work then. 256Mb isn't all that large;
could you post a log file (fragment) for this very case?

> ( note: the option "Resizable BAR support" may or may not exist
> depending on the motherboard model, on some it is 'hardcoded' to
> activated, on some it is 'hardcoded' to desactivated)

Resizable BARs is yet another thing on my (and maybe also Roger's) todo
list. I'm a little uncomfortable doing any work there without having a
system where I could actually test it ...

Jan


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2024-03-04  7:29 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-12-16  7:01 hvmloader - allow_memory_relocate overlaps Neowutran
2023-12-19 16:57 ` Jan Beulich
2023-12-21 18:08   ` Neowutran
2023-12-22 11:35     ` Jan Beulich
2023-12-22 15:49       ` Neowutran
2024-01-04 13:16         ` Jan Beulich
2024-02-07 15:08           ` Jan Beulich
2024-03-01 20:24             ` Neowutran
2024-03-04  7:29               ` Jan Beulich
2023-12-22 23:36       ` Marek Marczykowski-Górecki

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.