From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: "Daniel P. Berrangé" <berrange@redhat.com>
Cc: pedro.principeza@canonical.com,
Eduardo Habkost <ehabkost@redhat.com>,
dann.frazier@canonical.com,
"Guilherme G. Piccoli" <gpiccoli@canonical.com>,
qemu-devel@nongnu.org, christian.ehrhardt@canonical.com,
Gerd Hoffmann <kraxel@redhat.com>,
lersek@redhat.com, fw@gpiccoli.net
Subject: Re: ovmf / PCI passthrough impaired due to very limiting PCI64 aperture
Date: Wed, 17 Jun 2020 11:28:34 +0100
Message-ID: <20200617102834.GB2776@work-vm>
In-Reply-To: <20200617085033.GB568347@redhat.com>
* Daniel P. Berrangé (berrange@redhat.com) wrote:
> On Tue, Jun 16, 2020 at 01:10:21PM -0400, Eduardo Habkost wrote:
> > On Tue, Jun 16, 2020 at 05:57:46PM +0100, Dr. David Alan Gilbert wrote:
> > > * Gerd Hoffmann (kraxel@redhat.com) wrote:
> > > > Hi,
> > > >
> > > > > (a) We could rely in the guest physbits to calculate the PCI64 aperture.
> > > >
> > > > I'd love to do that. Move the 64-bit I/O window as high as possible and
> > > > use -- say -- 25% of the physical address space for it.
> > > >
> > > > Problem is we can't.
> > > >
> > > > > failure. Also, if the users are not setting the physbits in the guest,
> > > > > there must be a default (seems to be 40bit according to my experiments),
> > > > > seems to be a good idea to rely on that.
> > > >
> > > > Yes, 40 is the default, and it is used *even if the host supports less
> > > > than that*. Typical values I've seen for intel hardware are 36 and 39.
> > > > 39 is used even by recent hardware (not the xeons, but check out a
> > > > laptop or a nuc).
> > > >
> > > > > If guest physbits is 40, why to have OVMF limiting it to 36, right?
> > > >
> > > > Things will explode in case OVMF uses more physbits than the host
> > > > supports (host physbits limit applies to ept too). In other words: OVMF
> > > > can't trust the guest physbits, so it is conservative to be on the safe
> > > > side.
> > > >
> > > > If we can somehow make a *trustable* physbits value available to the
> > > > guest, then yes, we can go that route. But the guest physbits we have
> > > > today unfortunately don't cut it.
> > >
> > > In downstream RH qemu, we run with host-physbits as default; so it's reasonably
> > > trustworthy; of course that doesn't help you across a migration between
> > > hosts with different sizes (e.g. an E5 Xeon to an E3).
> > > Changing upstream to do the same would seem sensible to me, but it's not
> > > a foolproof config.
> >
> > Yeah, to make it really trustworthy we would need to prevent
> > migration to hosts with mismatching phys sizes. We would need to
> > communicate that to the guest somehow (with new hypervisor CPUID
> > flags, maybe).
>
> QEMU should be able to validate the hostphysbits >= guestphysbits when
> accepting incoming migration, and abort it.
Yeah, there's an outstanding request to validate other CPU flags as well.
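As a rough illustration of the check Daniel describes (this is not QEMU's
migration code; the function and exception names are hypothetical, and the
numbers are only examples), the destination would compare its own physical
address width with the one the guest was started with and refuse the
incoming stream on a mismatch:

```python
class MigrationRejected(Exception):
    """Raised when the destination host cannot honour the guest's phys-bits."""


def check_incoming_phys_bits(host_phys_bits: int, guest_phys_bits: int) -> None:
    """Refuse an incoming migration if the guest expects more physical
    address bits than this host implements.

    Hypothetical helper for illustration only; both values are assumed
    to be known to the caller already.
    """
    if host_phys_bits < guest_phys_bits:
        raise MigrationRejected(
            f"guest uses {guest_phys_bits} phys-bits, "
            f"host only supports {host_phys_bits}"
        )


# Same width on both sides: accepted, nothing is raised.
check_incoming_phys_bits(host_phys_bits=40, guest_phys_bits=40)

# Guest started with 40 bits, destination only has 39: refused.
try:
    check_incoming_phys_bits(host_phys_bits=39, guest_phys_bits=40)
except MigrationRejected as err:
    print("migration aborted:", err)
```

The comparison is deliberately ">=" rather than "==": a destination with a
wider address space than the guest was told about is assumed to be safe here.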
> Meanwhile libvirt should be enhanced to report hostphysbits, so that
> management apps can determine that they shouldn't even pick bad hosts
> in the first place.
Sounds reasonable.
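Until libvirt reports it, a management app could in principle get the host
value itself. The sketch below assumes a Linux x86 host, where /proc/cpuinfo
carries an "address sizes" line; the host names and the inventory dict are
made up for the example, and none of this is a libvirt API:

```python
import re
from typing import Dict, List, Optional

# Matches e.g. "address sizes   : 39 bits physical, 48 bits virtual"
_ADDR_SIZES = re.compile(r"^address sizes\s*:\s*(\d+) bits physical", re.MULTILINE)


def host_phys_bits(cpuinfo_text: str) -> Optional[int]:
    """Return the physical address width found in /proc/cpuinfo text,
    or None if the line is missing."""
    match = _ADDR_SIZES.search(cpuinfo_text)
    return int(match.group(1)) if match else None


def usable_hosts(host_bits: Dict[str, int], guest_phys_bits: int) -> List[str]:
    """Return the hosts (sorted by name) whose physical address width is
    at least what the guest was configured with."""
    return sorted(name for name, bits in host_bits.items() if bits >= guest_phys_bits)


sample = "model name\t: example CPU\naddress sizes\t: 39 bits physical, 48 bits virtual\n"
print(host_phys_bits(sample))                                         # 39
print(usable_hosts({"hostA": 46, "hostB": 39}, guest_phys_bits=40))   # ['hostA']
```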
Note there are a couple of other considerations when choosing the
physbits as reported to the guest:

  a) TCG's view - I think it had a fixed size of 40 bits, but I haven't
     dug into it.

  b) We recently gained 'host-phys-bits-limit', which when used with
     host-phys-bits lets you take the host value but then cap it. Eduardo
     seems to have done that to stop the guest from flipping into 5-level
     page tables. Hmm, I've not tried with chips that do 5-level, but maybe
     we also need this if you expect to migrate to hosts that don't have it.

(I've also got a vague memory that there's a limit on the address sizes
of some IOMMUs, but I can't remember what the details were.)

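To make (b) concrete, here is how I understand the two properties to
combine. This is a sketch of the behaviour as described above, not a copy of
the QEMU source; the function name is made up, and treating a limit of 0 as
"no limit" is an assumption:

```python
def guest_phys_bits(configured: int, host: int,
                    host_phys_bits: bool, host_phys_bits_limit: int = 0) -> int:
    """Physical address width reported to the guest (illustrative only).

    configured            explicit/default phys-bits (40 in this thread)
    host                  what the host CPU actually implements
    host_phys_bits        take the host value instead of 'configured'
    host_phys_bits_limit  cap applied to the host value; 0 is assumed
                          to mean "no cap"
    """
    if not host_phys_bits:
        return configured
    bits = host
    if host_phys_bits_limit and bits > host_phys_bits_limit:
        bits = host_phys_bits_limit
    return bits


print(guest_phys_bits(40, host=52, host_phys_bits=True, host_phys_bits_limit=48))  # 48
print(guest_phys_bits(40, host=39, host_phys_bits=True, host_phys_bits_limit=48))  # 39
print(guest_phys_bits(40, host=39, host_phys_bits=False))                          # 40
```

The last case is the problem Gerd describes: without host-phys-bits the
guest is told 40 even though the host only does 39. On the command line the
capped variant would look something like
-cpu host,host-phys-bits=on,host-phys-bits-limit=48, with 48 being just an
example value.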
Dave
>
> Regards,
> Daniel
> --
> |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org -o- https://fstop138.berrange.com :|
> |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK