From: "Michael S. Tsirkin" <mst@redhat.com>
To: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: kvm@vger.kernel.org, Paolo Bonzini <pbonzini@redhat.com>,
Sean Christopherson <sean.j.christopherson@intel.com>,
Wanpeng Li <wanpengli@tencent.com>,
Jim Mattson <jmattson@google.com>, Peter Xu <peterx@redhat.com>,
Julia Suvorova <jsuvorov@redhat.com>,
Andy Lutomirski <luto@kernel.org>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH 0/3] KVM: x86: KVM_MEM_PCI_HOLE memory
Date: Thu, 6 Aug 2020 05:53:18 -0400 [thread overview]
Message-ID: <20200806055008-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <873650p1vo.fsf@vitty.brq.redhat.com>
On Thu, Aug 06, 2020 at 11:19:55AM +0200, Vitaly Kuznetsov wrote:
> "Michael S. Tsirkin" <mst@redhat.com> writes:
>
> > On Tue, Jul 28, 2020 at 04:37:38PM +0200, Vitaly Kuznetsov wrote:
> >> This is a continuation of "[PATCH RFC 0/5] KVM: x86: KVM_MEM_ALLONES
> >> memory" work:
> >> https://lore.kernel.org/kvm/20200514180540.52407-1-vkuznets@redhat.com/
> >> and pairs with Julia's "x86/PCI: Use MMCONFIG by default for KVM guests":
> >> https://lore.kernel.org/linux-pci/20200722001513.298315-1-jusual@redhat.com/
> >>
> >> PCIe config space can (depending on the configuration) be quite big but
> >> usually is sparsely populated. Guest may scan it by accessing individual
> >> device's page which, when device is missing, is supposed to have 'pci
> >> hole' semantics: reads return '0xff' and writes get discarded.
> >>
> >> When testing Linux kernel boot with QEMU q35 VM and direct kernel boot
> >> I observed 8193 accesses to PCI hole memory. When such exit is handled
> >> in KVM without exiting to userspace, it takes roughly 0.000001 sec.
> >> Handling the same exit in userspace is six times slower (0.000006 sec) so
> >> the overal; difference is 0.04 sec. This may be significant for 'microvm'
> >> ideas.
> >>
> >> Note, the same speed can already be achieved by using KVM_MEM_READONLY
> >> but doing this would require allocating real memory for all missing
> >> devices and e.g. 8192 pages gives us 32mb. This will have to be allocated
> >> for each guest separately and for 'microvm' use-cases this is likely
> >> a no-go.
> >>
> >> Introduce special KVM_MEM_PCI_HOLE memory: userspace doesn't need to
> >> back it with real memory, all reads from it are handled inside KVM and
> >> return '0xff'. Writes still go to userspace but these should be extremely
> >> rare.
> >>
> >> The original 'KVM_MEM_ALLONES' idea had additional optimizations: KVM
> >> was mapping all 'PCI hole' pages to a single read-only page stuffed with
> >> 0xff. This is omitted in this submission as the benefits are unclear:
> >> KVM will have to allocate SPTEs (either on demand or aggressively) and
> >> this also consumes time/memory.
> >
> > Curious about this: if we do it aggressively on the 1st fault,
> > how long does it take to allocate 256 huge page SPTEs?
> > And the amount of memory seems pretty small then, right?
>
> Right, this could work but we'll need a 2M region (one per KVM host of
> course) filled with 0xff-s instead of a single 4k page.
Given it's global doesn't sound too bad.
>
> Generally, I'd like to reach an agreement on whether this feature (and
> the corresponding Julia's patch addding PV feature bit) is worthy. In
> case it is (meaning it gets merged in this simplest form), we can
> suggest further improvements. It would also help if firmware (SeaBIOS,
> OVMF) would start recognizing the PV feature bit too, this way we'll be
> seeing even bigger improvement and this may or may not be a deal-breaker
> when it comes to the 'aggressive PTE mapping' idea.
About the feature bit, I am not sure why it's really needed. A single
mmio access is cheaper than two io accesses anyway, right? So it makes
sense for a kvm guest whether host has this feature or not.
We need to be careful and limit to a specific QEMU implementation
to avoid tripping up bugs, but it seems more appropriate to
check it using pci host IDs.
> --
> Vitaly
next prev parent reply other threads:[~2020-08-06 11:13 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-07-28 14:37 [PATCH 0/3] KVM: x86: KVM_MEM_PCI_HOLE memory Vitaly Kuznetsov
2020-07-28 14:37 ` [PATCH 1/3] KVM: x86: move kvm_vcpu_gfn_to_memslot() out of try_async_pf() Vitaly Kuznetsov
2020-07-28 14:37 ` [PATCH 2/3] KVM: x86: introduce KVM_MEM_PCI_HOLE memory Vitaly Kuznetsov
2020-08-05 15:18 ` Andrew Jones
2020-08-06 9:08 ` Vitaly Kuznetsov
2020-08-05 17:05 ` Jim Mattson
2020-08-06 0:18 ` Michael S. Tsirkin
2020-08-06 17:36 ` Jim Mattson
2020-08-06 9:14 ` Vitaly Kuznetsov
2020-07-28 14:37 ` [PATCH 3/3] KVM: selftests: add KVM_MEM_PCI_HOLE test Vitaly Kuznetsov
2020-08-06 0:21 ` [PATCH 0/3] KVM: x86: KVM_MEM_PCI_HOLE memory Michael S. Tsirkin
2020-08-06 9:19 ` Vitaly Kuznetsov
2020-08-06 9:53 ` Michael S. Tsirkin [this message]
2020-08-06 11:39 ` Vitaly Kuznetsov
2020-08-06 12:19 ` Michael S. Tsirkin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200806055008-mutt-send-email-mst@kernel.org \
--to=mst@redhat.com \
--cc=jmattson@google.com \
--cc=jsuvorov@redhat.com \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=luto@kernel.org \
--cc=pbonzini@redhat.com \
--cc=peterx@redhat.com \
--cc=sean.j.christopherson@intel.com \
--cc=vkuznets@redhat.com \
--cc=wanpengli@tencent.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox