From: "Edgar E. Iglesias" <edgar.iglesias@amd.com>
To: Jan Beulich <jbeulich@suse.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>,
	Olaf Hering <olaf@aepfle.de>, <xen-devel@lists.xenproject.org>,
	<jgross@suse.com>, Anthony PERARD <anthony@xenproject.org>,
	Paul Durrant <paul@xen.org>, <andrew.cooper3@citrix.com>,
	<roger.pau@citrix.com>
Subject: Re: slow start of Pod HVM domU with qemu 9.1
Date: Wed, 29 Jan 2025 12:23:46 +0100
Message-ID: <Z5oPwthxmLfIbjSE@zapote>
In-Reply-To: <b4ccccbb-f3ee-44fe-a5e2-780195cbbc0e@suse.com>

On Wed, Jan 29, 2025 at 09:52:19AM +0100, Jan Beulich wrote:
> On 29.01.2025 00:58, Stefano Stabellini wrote:
> > On Tue, 28 Jan 2025, Edgar E. Iglesias wrote:
> >> On Tue, Jan 28, 2025 at 03:15:44PM +0100, Olaf Hering wrote:
> >>> With this change the domU starts fast again:
> >>>
> >>> --- a/hw/xen/xen-mapcache.c
> >>> +++ b/hw/xen/xen-mapcache.c
> >>> @@ -522,6 +522,7 @@ ram_addr_t xen_ram_addr_from_mapcache(void *ptr)
> >>>      ram_addr_t addr;
> >>>  
> >>>      addr = xen_ram_addr_from_mapcache_single(mapcache, ptr);
> >>> +    if (1)
> >>>      if (addr == RAM_ADDR_INVALID) {
> >>>          addr = xen_ram_addr_from_mapcache_single(mapcache_grants, ptr);
> >>>      }
> >>> @@ -626,6 +627,7 @@ static void xen_invalidate_map_cache_entry_single(MapCache *mc, uint8_t *buffer)
> >>>  static void xen_invalidate_map_cache_entry_all(uint8_t *buffer)
> >>>  {
> >>>      xen_invalidate_map_cache_entry_single(mapcache, buffer);
> >>> +    if (1)
> >>>      xen_invalidate_map_cache_entry_single(mapcache_grants, buffer);
> >>>  }
> >>>  
> >>> @@ -700,6 +702,7 @@ void xen_invalidate_map_cache(void)
> >>>      bdrv_drain_all();
> >>>  
> >>>      xen_invalidate_map_cache_single(mapcache);
> >>> +    if (0)
> >>>      xen_invalidate_map_cache_single(mapcache_grants);
> >>>  }
> >>>  
> >>> I did the testing with libvirt, the domU.cfg equivalent looks like this:
> >>> maxmem = 4096
> >>> memory = 2048
> >>> maxvcpus = 4
> >>> vcpus = 2
> >>> pae = 1
> >>> acpi = 1
> >>> apic = 1
> >>> viridian = 0
> >>> rtc_timeoffset = 0
> >>> localtime = 0
> >>> on_poweroff = "destroy"
> >>> on_reboot = "destroy"
> >>> on_crash = "destroy"
> >>> device_model_override = "/usr/lib64/qemu-9.1/bin/qemu-system-i386"
> >>> sdl = 0
> >>> vnc = 1
> >>> vncunused = 1
> >>> vnclisten = "127.0.0.1"
> >>> vif = [ "mac=52:54:01:23:63:29,bridge=br0,script=vif-bridge" ]
> >>> parallel = "none"
> >>> serial = "pty"
> >>> builder = "hvm"
> >>> kernel = "/bug1236329/linux"
> >>> ramdisk = "/bug1236329/initrd"
> >>> cmdline = "console=ttyS0,115200n8 quiet ignore_loglevel"
> >>> boot = "c"
> >>> disk = [ "format=qcow2,vdev=hda,access=rw,backendtype=qdisk,target=/bug1236329/sles12sp5.qcow2" ]
> >>> usb = 1
> >>> usbdevice = "tablet"
> >>>
> >>> Any idea what can be done to restore boot times?
> >>
> >>
> >> A guess is that it's taking a long time to walk the grants mapcache
> >> when invalidating (in QEMU). Despite it being unused and empty. We
> >> could try to find a way to keep track of usage and do nothing when
> >> invalidating an empty/unused cache.
> > 
> > If mapcache_grants is unused and empty, the call to
> > xen_invalidate_map_cache_single(mapcache_grants) should be really fast?
> > 
> > I think probably it might be the opposite: mapcache_grants is utilized,
> > so going through all the mappings in xen_invalidate_map_cache_single
> > takes time.
> > 
> > However, I wonder if it is really needed. At least in the PoD case, the
> > reason for the IOREQ_TYPE_INVALIDATE request is that the underlying DomU
> > memory has changed. But that doesn't affect the grant mappings, because
> > those are mappings of other domains' memory.
> > 
> > So I am thinking whether we should remove the call to
> > xen_invalidate_map_cache_single(mapcache_grants) ?
> > 
> > Adding x86 maintainers: do we need to flush grant table mappings for the
> > PV backends running in QEMU when Xen issues a IOREQ_TYPE_INVALIDATE
> > request to QEMU?
> 
> Judging from two of the three uses of ioreq_request_mapcache_invalidate()
> in x86'es p2m.c, I'd say no. The 3rd use there is unconditional, but
> maybe wrongly so.
> 
> However, the answer also depends on what qemu does when encountering a
> granted page. Would it enter it into its mapcache? Can it even access it?
> (If it can't, how does emulated I/O work to such pages? If it can, isn't
> this a violation of the grant's permissions, as it's targeted at solely
> the actual HVM domain named in the grant?)
>

QEMU will only map granted pages if the guest explicitly asks QEMU to
DMA into granted pages. Guests first need to grant pages to the domain
running QEMU, then pass a cookie/address to QEMU with the grant id. QEMU
will then map the granted memory, enter it into a dedicated mapcache
(mapcache_grants) and then emulate device DMA to/from the grant.

So QEMU will only map grants intended for QEMU DMA devices, not any grant
to any domain.
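
To make the split between the two caches concrete, here is a toy model of the
ideas floated earlier in this thread: skip the walk when a cache is empty, and
leave grant mappings alone when only guest memory has changed. This is not
QEMU code. ToyCache and every toy_* name are hypothetical, and whether
skipping the grants cache is actually safe is exactly the open question above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for a mapcache; NOT QEMU's MapCache. All names are hypothetical. */
#define TOY_BUCKETS 8

typedef struct {
    bool     valid[TOY_BUCKETS];
    uint64_t addr[TOY_BUCKETS];
    size_t   used;   /* number of live entries */
    size_t   walks;  /* number of full walks done by invalidation */
} ToyCache;

/* Insert a mapping; returns false when the toy cache is full. */
static bool toy_cache_add(ToyCache *c, uint64_t addr)
{
    assert(c != NULL);
    for (size_t i = 0; i < TOY_BUCKETS; i++) {
        if (!c->valid[i]) {
            c->valid[i] = true;
            c->addr[i] = addr;
            c->used++;
            return true;
        }
    }
    return false;
}

/* Drop every entry, returning early without a walk when nothing is mapped. */
static void toy_cache_invalidate(ToyCache *c)
{
    assert(c != NULL);
    if (c->used == 0) {
        return;
    }
    c->walks++;
    for (size_t i = 0; i < TOY_BUCKETS; i++) {
        c->valid[i] = false;
    }
    c->used = 0;
}

/* Guest memory changed: flush guest mappings only, keep grant mappings. */
static void toy_invalidate_on_guest_change(ToyCache *guest, ToyCache *grants)
{
    (void)grants; /* deliberately untouched in this sketch */
    toy_cache_invalidate(guest);
}
```

The used counter is what makes the empty case cheap: a second invalidate on
an already-empty cache returns before touching any bucket.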

Details:
https://github.com/torvalds/linux/blob/master/drivers/xen/grant-dma-ops.c
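
As a rough illustration of the cookie/address mentioned above: as far as I
recall, grant-dma-ops.c encodes the grant reference into the DMA address by
shifting it by the Xen page shift and setting the top bit as a marker. The
constants and toy_* names below are assumptions made for illustration, so
please check them against the file rather than taking this as authoritative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values, based on my reading of grant-dma-ops.c; verify against the source. */
#define TOY_XEN_PAGE_SHIFT      12
#define TOY_GRANT_DMA_ADDR_OFF  (UINT64_C(1) << 63)

/* Build a DMA address that carries a grant reference instead of a guest address. */
static uint64_t toy_grant_to_dma(uint32_t grant_ref)
{
    return TOY_GRANT_DMA_ADDR_OFF | ((uint64_t)grant_ref << TOY_XEN_PAGE_SHIFT);
}

/* The marker bit tells the device model this address refers to a grant. */
static bool toy_is_grant_dma(uint64_t dma)
{
    return (dma & TOY_GRANT_DMA_ADDR_OFF) != 0;
}

/* Recover the grant reference; only meaningful for grant-backed addresses. */
static uint32_t toy_dma_to_grant(uint64_t dma)
{
    assert(toy_is_grant_dma(dma));
    return (uint32_t)((dma & ~TOY_GRANT_DMA_ADDR_OFF) >> TOY_XEN_PAGE_SHIFT);
}

/* Byte offset within the granted page. */
static uint64_t toy_dma_page_offset(uint64_t dma)
{
    return dma & ((UINT64_C(1) << TOY_XEN_PAGE_SHIFT) - 1);
}
```

With an encoding like this, the device model can tell grant-backed DMA apart
from ordinary guest addresses by testing a single bit, which is what lets it
route such accesses to a separate cache.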

Cheers,
Edgar
