From: Sean Christopherson <sean.j.christopherson@intel.com>
To: Barret Rhoden <brho@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
David Hildenbrand <david@redhat.com>,
Alexander Duyck <alexander.h.duyck@linux.intel.com>,
linux-nvdimm@lists.01.org, x86@kernel.org, kvm@vger.kernel.org,
linux-kernel@vger.kernel.org, jason.zeng@intel.com
Subject: Re: [PATCH v5 1/2] mm: make dev_pagemap_mapping_shift() externally visible
Date: Tue, 17 Dec 2019 16:18:57 -0800
Message-ID: <20191218001857.GM11771@linux.intel.com>
In-Reply-To: <e004e742-f755-c22c-57bb-acfe30971c7d@google.com>

On Mon, Dec 16, 2019 at 12:59:53PM -0500, Barret Rhoden wrote:
> On 12/13/19 12:47 PM, Sean Christopherson wrote:
> >>+EXPORT_SYMBOL_GPL(dev_pagemap_mapping_shift);
> >
> >This is basically a rehash of lookup_address_in_pgd(), and doesn't provide
> >exactly what KVM needs. E.g. KVM works with levels instead of shifts, and
> >it would be nice to provide the pte so that KVM can sanity check that the
> >pfn from this walk matches the pfn it plans on mapping.
>
> One minor issue is that the levels for lookup_address_in_pgd() and for KVM
> differ in name, although not in value. lookup uses PG_LEVEL_4K = 1. KVM
> uses PT_PAGE_TABLE_LEVEL = 1. The enums differ a little too: x86 has a name
> for a 512G page, etc. It's all in arch/x86.
>
> Does KVM-x86 need its own names for the levels? If not, I could convert the
> PT_PAGE_TABLE_* stuff to PG_LEVEL_* stuff.

Not really.  I suspect the whole reason KVM has different enums is to
handle PSE/Mode-B paging, where PG_LEVEL_2M would be inaccurate, e.g.:

	if (PTTYPE == 32 && walker->level == PT_DIRECTORY_LEVEL && is_cpuid_PSE36())
		gfn += pse36_gfn_delta(pte);

That being said, I absolutely loathe PT_PAGE_TABLE_LEVEL; I can never
remember that it means 4k pages.  I would be in favor of using the kernel's
enums with some KVM-specific abstraction of PG_LEVEL_2M, e.g.:

	/* KVM Hugepage definitions for x86 */
	enum {
		PG_LEVEL_2M_OR_4M      = PG_LEVEL_2M,
		/* set max level to the biggest one */
		KVM_MAX_HUGEPAGE_LEVEL = PG_LEVEL_1G,
	};

And ideally restrict usage of the ugly PG_LEVEL_2M_OR_4M to flows that can
actually encounter 4M pages, i.e. when walking guest page tables.  On the
host side, KVM always uses PAE or 64-bit paging.

Personally, I'd pursue that in a separate patch/series; it'll touch a
massive amount of code and will probably get bikeshedded a fair amount.
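
For illustration only, the level/shift relationship discussed above could be
sketched in userspace roughly as follows.  This is not kernel code: the
enum pg_level values are re-declared here to mirror the x86 definitions so
the snippet builds on its own, and SKETCH_PAGE_SHIFT, sketch_level_to_shift()
and sketch_shift_to_level() are hypothetical names invented for this sketch.
It assumes 9 address bits per level for PAE/64-bit paging and 10 bits per
level for legacy 32-bit (non-PAE) paging, which is where the 4M case for
PG_LEVEL_2M_OR_4M comes from.

```c
#include <assert.h>
#include <stdbool.h>

/* Re-declared copy mirroring x86's enum pg_level (PG_LEVEL_4K == 1). */
enum pg_level {
	PG_LEVEL_NONE,
	PG_LEVEL_4K,
	PG_LEVEL_2M,
	PG_LEVEL_1G,
	PG_LEVEL_512G,
	PG_LEVEL_NUM
};

/* The KVM-specific aliases proposed in the mail. */
enum {
	PG_LEVEL_2M_OR_4M      = PG_LEVEL_2M,
	KVM_MAX_HUGEPAGE_LEVEL = PG_LEVEL_1G,
};

/* Assumption: 4 KiB base pages. */
#define SKETCH_PAGE_SHIFT 12

/*
 * Hypothetical helper: mapping shift for a given level.
 * legacy_32bit selects non-PAE 32-bit paging (10 bits per level),
 * otherwise PAE/64-bit paging (9 bits per level) is assumed.
 */
static unsigned int sketch_level_to_shift(int level, bool legacy_32bit)
{
	unsigned int bits_per_level = legacy_32bit ? 10 : 9;

	assert(level >= PG_LEVEL_4K && level < PG_LEVEL_NUM);
	return SKETCH_PAGE_SHIFT + (unsigned int)(level - 1) * bits_per_level;
}

/*
 * Hypothetical helper: the inverse for PAE/64-bit paging only, i.e. the
 * conversion a caller would need when handed a shift but working in levels.
 */
static int sketch_shift_to_level(unsigned int shift)
{
	assert(shift >= SKETCH_PAGE_SHIFT);
	return (int)((shift - SKETCH_PAGE_SHIFT) / 9) + 1;
}
```

Under those assumptions, sketch_level_to_shift(PG_LEVEL_2M_OR_4M, true)
yields the 4M shift while the same level with legacy_32bit == false yields
the 2M shift, which is the ambiguity the alias name is meant to flag.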