linux-mm.kvack.org archive mirror
From: Hyeonggon Yoo <42.hyeyoo@gmail.com>
To: "Edgecombe, Rick P" <rick.p.edgecombe@intel.com>
Cc: "peterz@infradead.org" <peterz@infradead.org>,
	"rppt@kernel.org" <rppt@kernel.org>,
	"tglx@linutronix.de" <tglx@linutronix.de>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
	"Williams, Dan J" <dan.j.williams@intel.com>,
	"x86@kernel.org" <x86@kernel.org>,
	"hpa@zytor.com" <hpa@zytor.com>,
	"aarcange@redhat.com" <aarcange@redhat.com>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"Christopherson, Sean" <seanjc@google.com>,
	"Lutomirski, Andy" <luto@kernel.org>,
	"pbonzini@redhat.com" <pbonzini@redhat.com>,
	"bp@alien8.de" <bp@alien8.de>,
	"Tianyu.Lan@microsoft.com" <Tianyu.Lan@microsoft.com>,
	"aneesh.kumar@linux.ibm.com" <aneesh.kumar@linux.ibm.com>,
	"chu, jane" <jane.chu@oracle.com>
Subject: Re: [RFC 2/2] x86/mm/cpa: drop pgprot_clear_protnone_bits()
Date: Sun, 19 Jun 2022 21:20:55 +0900
Message-ID: <Yq8Up9cfvNL70Qeb@hyeyoo>
In-Reply-To: <6e3eb8a0fc059419b77e1f6fdf3cb8ab746eb37b.camel@intel.com>

On Wed, Jun 15, 2022 at 06:18:15PM +0000, Edgecombe, Rick P wrote:
> On Wed, 2022-06-15 at 12:47 +0900, Hyeonggon Yoo wrote:
> > On Tue, Jun 14, 2022 at 06:23:43PM +0000, Edgecombe, Rick P wrote:
> > > On Tue, 2022-06-14 at 15:53 +0900, Hyeonggon Yoo wrote:
> > > > On Tue, Jun 14, 2022 at 03:39:33PM +0900, Hyeonggon Yoo wrote:
> > > > > commit a8aed3e0752b4 ("x86/mm/pageattr: Prevent PSE and GLOABL
> > > > > leftovers to confuse pmd/pte_present and pmd_huge") made CPA
> > > > > clear _PAGE_GLOBAL when _PAGE_PRESENT is not set. This prevents
> > > > > kernel crashing when kernel reads a page with !_PAGE_PRESENT
> > > > > and _PAGE_PROTNONE (_PAGE_GLOBAL). And then it set _PAGE_GLOBAL
> > > > > back when setting _PAGE_PRESENT again.
> > > > > 
> > > > > After commit d1440b23c922d ("x86/mm: Factor out pageattr
> > > > > _PAGE_GLOBAL setting") made kernel not set unconditionally
> > > > > _PAGE_GLOBAL, pages lose global flag after _set_pages_np() and
> > > > > _set_pages_p() are called.
> > > > > 
> > > > > But after commit 3166851142411 ("x86: skip check for spurious
> > > > > faults for non-present faults"), spurious_kernel_fault() does
> > > > > not confuse pte/pmd entries with _PAGE_PROTNONE as present
> > > > > anymore. So simply drop pgprot_clear_protnone_bits().
> > > > 
> > > >  
> > > > Looks like I forgot to Cc: Andrea Arcangeli <aarcange@redhat.com>
> > > > 
> > > > Plus I did check that kernel does not crash when reading
> > > > from/writing to non-present pages with this patch applied.
> > > 
> > > Thanks for the history.
> > > 
> > > I think we should still fix pte_present() to not check prot_none if
> > > the user bit is clear.
> > 
> > I tried, but realized it wouldn't work :(
> > 
> > For example, when a pte entry is used as swap entry, _PAGE_PRESENT is
> > cleared and _PAGE_PROTNONE is set.
> > 
> > And other bits are used as type and offset of swap entry.
> > In that case, _PAGE_BIT_USER bit does not represent _PAGE_USER.
> > It is just one of bits that represents type of swap entry.
> > 
> > So checking if _PAGE_PROTNONE set only when _PAGE_USER is set
> > will confuse some swap entries as non-present.
> 
> Oooh, right. So the user bit records "when a pagetable is
> writeprotected by userfaultfd WP support". I'm not sure if maybe PCD is
> available to move that to and leave the user bit in place, but it
> sounds like an errata sensitive area to be tweaking.
> 
> > 
> > > The spurious fault handler infinite loop may no
> > > longer be a problem, but pte_present() still would return true for
> > > kernel NP pages, so be fragile. Today I see at least the oops
> > > message and memory hotunplug (see remove_pagetable()) that would
> > > get confused.
> > 
> > As explained above, I don't think it's possible to make
> > pte_present() accurate for both kernel and user ptes.
> > 
> > Maybe we can implement pte_present_kernel()/pte_present_user()
> > for when kernel knows it is user or kernel pte.
> 
> This seems like a decent option to me. It seems there are only a few
> places that are isolated to arch/x86.

But there are some places where the kernel does not know whether it is
looking at a kernel pte or a user pte.

For example, show_fault_oops() can be called for both kernel and user
addresses. Would something like this be acceptable?

static inline bool pte_present_address(pte_t pte, unsigned long address)
{
	/* On x86, addresses at or above TASK_SIZE_MAX are kernel addresses. */
	if (address >= TASK_SIZE_MAX)
		return pte_present_kernel(pte);
	return pte_present_user(pte);
}
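
To make the idea easier to poke at outside the kernel, here is a minimal
userspace sketch of the same dispatch. Everything in it is an assumption
for illustration only: the MODEL_* names, the simplified pte_t, and the
bodies of pte_present_kernel()/pte_present_user() (neither helper exists
in the tree today). It assumes the kernel variant would look only at
_PAGE_PRESENT, while the user variant would keep the current
_PAGE_PRESENT | _PAGE_PROTNONE check.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for x86 flag bits: _PAGE_PRESENT is bit 0 and
 * _PAGE_PROTNONE aliases _PAGE_GLOBAL (bit 8). */
#define MODEL_PAGE_PRESENT   (UINT64_C(1) << 0)
#define MODEL_PAGE_GLOBAL    (UINT64_C(1) << 8)
#define MODEL_PAGE_PROTNONE  MODEL_PAGE_GLOBAL

/* Assumption: stand-in for TASK_SIZE_MAX with 4-level paging. */
#define MODEL_TASK_SIZE_MAX  UINT64_C(0x00007ffffffff000)

/* Simplified pte: just the raw 64-bit entry. */
typedef struct { uint64_t pte; } pte_t;

/* Hypothetical: a kernel pte is present only if _PAGE_PRESENT is set,
 * so a leftover _PAGE_GLOBAL on an NP page is not misread. */
static inline bool pte_present_kernel(pte_t pte)
{
	return (pte.pte & MODEL_PAGE_PRESENT) != 0;
}

/* Hypothetical: a user pte keeps today's semantics, where a PROT_NONE
 * mapping (!PRESENT, PROTNONE set) still counts as present. */
static inline bool pte_present_user(pte_t pte)
{
	return (pte.pte & (MODEL_PAGE_PRESENT | MODEL_PAGE_PROTNONE)) != 0;
}

/* Pick the variant from the address when the caller cannot tell. */
static inline bool pte_present_address(pte_t pte, uint64_t address)
{
	if (address >= MODEL_TASK_SIZE_MAX)
		return pte_present_kernel(pte);
	return pte_present_user(pte);
}
```

With this model, a kernel-address entry that has only the global bit set
is reported as not present, while a user-address entry with the same bit
pattern is still treated as a present PROT_NONE mapping.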

> > 
> > or pte_present_with_address(pte, address) if we don't
> > know it is user pte or kernel pte.
> > 

-- 
Thanks,
Hyeonggon


Thread overview: 18+ messages
2022-06-14  6:39 [RFC 0/2] CPA improvements Hyeonggon Yoo
2022-06-14  6:39 ` [RFC 1/2] x86/mm/cpa: always fail when user address is passed Hyeonggon Yoo
2022-06-14 17:52   ` Edgecombe, Rick P
2022-06-15  3:26     ` Hyeonggon Yoo
2022-06-15 18:17       ` Edgecombe, Rick P
2022-06-14 18:31   ` Dave Hansen
2022-06-16  8:49     ` Hyeonggon Yoo
2022-06-16 14:20       ` Dave Hansen
2022-06-20  8:08         ` Hyeonggon Yoo
2022-07-07 20:24           ` Dave Hansen
2022-06-15 13:11   ` Christoph Hellwig
2022-06-16  8:51     ` Hyeonggon Yoo
2022-06-14  6:39 ` [RFC 2/2] x86/mm/cpa: drop pgprot_clear_protnone_bits() Hyeonggon Yoo
2022-06-14  6:53   ` Hyeonggon Yoo
2022-06-14 18:23     ` Edgecombe, Rick P
2022-06-15  3:47       ` Hyeonggon Yoo
2022-06-15 18:18         ` Edgecombe, Rick P
2022-06-19 12:20           ` Hyeonggon Yoo [this message]
