From: Sean Christopherson <sean.j.christopherson@intel.com>
To: Joerg Roedel <joro@8bytes.org>
Cc: "Paolo Bonzini" <pbonzini@redhat.com>,
"Kirill A. Shutemov" <kirill@shutemov.name>,
"Vlastimil Babka" <vbabka@suse.cz>,
"Andrea Arcangeli" <aarcange@redhat.com>,
"Radim Krčmář" <rkrcmar@redhat.com>,
kvm@vger.kernel.org
Subject: Re: THP refcounting in disallowed_hugepage_adjust()?
Date: Tue, 26 Nov 2019 09:46:03 -0800 [thread overview]
Message-ID: <20191126174603.GB22233@linux.intel.com> (raw)
In-Reply-To: <20191126152109.GA23850@8bytes.org>
On Tue, Nov 26, 2019 at 04:21:09PM +0100, Joerg Roedel wrote:
> Hi Paolo et al,
>
> while looking again at the recently added IFU patches I noticed a
> dicrepancy between the two _hugepage_adjust() functions which doesn't
> make sense to me yet:
>
> * transparent_hugepage_adjust(), when changing the value of pfn,
> does a kvm_release_pfn_clean() on the old value and a
> kvm_get_pfn() on the new value to make sure the code holds the
> reference to the correct pfn.
>
> * disallowed_hugepage_adjust() also changes the value of the pfn
> to map, kinda reverses what transparent_hugepage_adjust() did
> before. But that function does not care about the pfn
> refcounting.
>
> I was wondering what the reason for that might be, is it just not
> necessary in disallowed_hugepage_adjust() or is that an oversight?
The page fault flows don't actually rely on holding a reference to the
page once they reach thp_adjust(). At that point, they hold mmu_lock and
have verified no invalidation from mmu_notifier have occured since the
page reference was acquired.
The release/get pfn dance in transparent_hugepage_adjust() is a quirk of
sorts that is necessitated because thp_adjust() modifies the local @pfn
variable in FNAME(page_fault)(), nonpaging_map() and tdp_page_fault().
Those functions all call kvm_release_pfn_clean() on @pfn, so thp_adjust()
needs to transfer the page reference purely for correctness when the pfn
is released.
disallowed_hugepage_adjust() is called from __direct_map() and its
modification of the pfn is also contained to __direct_map(), i.e. the
updated @pfn doesn't get propagated back up to the fault handlers. Thus,
kvm_release_pfn_clean() is called on the original pfn and so there's no
need to transfer the page reference.
The above discrepancy can resolved by moving thp_adjust() into FNAME(fetch)
and __direct_map() so that the "top-level" @pfn isn't modified. Getting
rid of the kvm_get_pfn() call would be a nice side effect. The downside is
that @force_pt_level would need to be passed down the call stack.
next prev parent reply other threads:[~2019-11-26 17:46 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-11-26 15:21 THP refcounting in disallowed_hugepage_adjust()? Joerg Roedel
2019-11-26 17:46 ` Sean Christopherson [this message]
2019-11-27 9:41 ` Joerg Roedel
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20191126174603.GB22233@linux.intel.com \
--to=sean.j.christopherson@intel.com \
--cc=aarcange@redhat.com \
--cc=joro@8bytes.org \
--cc=kirill@shutemov.name \
--cc=kvm@vger.kernel.org \
--cc=pbonzini@redhat.com \
--cc=rkrcmar@redhat.com \
--cc=vbabka@suse.cz \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.