All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jason Gunthorpe <jgg@ziepe.ca>
To: "Longpeng (Mike,
	Cloud Infrastructure Service Product Dept.)" 
	<longpeng2@huawei.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>,
	akpm@linux-foundation.org, kirill.shutemov@linux.intel.com,
	linux-kernel@vger.kernel.org, arei.gonglei@huawei.com,
	weidong.huang@huawei.com, weifuqiang@huawei.com,
	kvm@vger.kernel.org, linux-mm@kvack.org,
	Matthew Wilcox <willy@infradead.org>,
	Sean Christopherson <sean.j.christopherson@intel.com>,
	stable@vger.kernel.org
Subject: Re: [PATCH v2] mm/hugetlb: fix a addressing exception caused by huge_pte_offset()
Date: Tue, 24 Mar 2020 08:55:41 -0300	[thread overview]
Message-ID: <20200324115541.GH20941@ziepe.ca> (raw)
In-Reply-To: <e8e71ba4-d609-269a-6160-153e373e7563@huawei.com>

On Tue, Mar 24, 2020 at 10:37:49AM +0800, Longpeng (Mike, Cloud Infrastructure Service Product Dept.) wrote:
> 
> 
> On 2020/3/24 6:52, Jason Gunthorpe wrote:
> > On Mon, Mar 23, 2020 at 01:35:07PM -0700, Mike Kravetz wrote:
> >> On 3/23/20 11:07 AM, Jason Gunthorpe wrote:
> >>> On Mon, Mar 23, 2020 at 10:27:48AM -0700, Mike Kravetz wrote:
> >>>
> >>>>>  	pgd = pgd_offset(mm, addr);
> >>>>> -	if (!pgd_present(*pgd))
> >>>>> +	if (!pgd_present(READ_ONCE(*pgd)))
> >>>>>  		return NULL;
> >>>>>  	p4d = p4d_offset(pgd, addr);
> >>>>> -	if (!p4d_present(*p4d))
> >>>>> +	if (!p4d_present(READ_ONCE(*p4d)))
> >>>>>  		return NULL;
> >>>>>  
> >>>>>       pud = pud_offset(p4d, addr);
> >>>>
> >>>> One would argue that pgd and p4d can not change from present to !present
> >>>> during the execution of this code.  To me, that seems like the issue which
> >>>> would cause an issue.  Of course, I could be missing something.
> >>>
> >>> This I am not sure of, I think it must be true under the read side of
> >>> the mmap_sem, but probably not guarenteed under RCU..
> >>>
> >>> In any case, it doesn't matter, the fact that *p4d can change at all
> >>> is problematic. Unwinding the above inlines we get:
> >>>
> >>>   p4d = p4d_offset(pgd, addr)
> >>>   if (!p4d_present(*p4d))
> >>>       return NULL;
> >>>   pud = (pud_t *)p4d_page_vaddr(*p4d) + pud_index(address);
> >>>
> >>> According to our memory model the compiler/CPU is free to execute this
> >>> as:
> >>>
> >>>   p4d = p4d_offset(pgd, addr)
> >>>   p4d_for_vaddr = *p4d;
> >>>   if (!p4d_present(*p4d))
> >>>       return NULL;
> >>>   pud = (pud_t *)p4d_page_vaddr(p4d_for_vaddr) + pud_index(address);
> >>>
> >>
> >> Wow!  How do you know this?  You don't need to answer :)
> > 
> > It says explicitly in Documentation/memory-barriers.txt - see
> > section COMPILER BARRIER:
> > 
> >  (*) The compiler is within its rights to reorder loads and stores
> >      to the same variable, and in some cases, the CPU is within its
> >      rights to reorder loads to the same variable.  This means that
> >      the following code:
> > 
> >         a[0] = x;
> >         a[1] = x;
> > 
> >      Might result in an older value of x stored in a[1] than in a[0].
> > 
> > It also says READ_ONCE puts things in program order, but we don't use
> > READ_ONCE inside pud_offset(), so it doesn't help us.
> > 
> > Best answer is to code things so there is exactly one dereference of
> > the pointer protected by READ_ONCE. Very clear to read, very safe.
> > 
> > Maybe Longpeng can rework the patch around these principles?
> > 
> Thanks Jason and Mike, I learn a lot from your analysis.
> 
> So... the patch should like this ?

Yes, the pattern looks right

The commit message should reference the above section of COMPILER
BARRIER and explain that de-referencing the entries is a data race, so
we must consolidate all the reads into one single place.

Also, since CH moved all the get_user_pages_fast code out of the
arch's many/all archs can drop their arch specific version of this
routine. This is really just a specialized version of gup_fast's
algorithm..

(also the arch versions seem different, why do some return actual
 ptes, not null?)

Jason

  reply	other threads:[~2020-03-24 11:55 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-02-22  3:33 [PATCH v2] mm/hugetlb: fix a addressing exception caused by huge_pte_offset() Longpeng(Mike)
2020-03-21 23:38 ` Mike Kravetz
2020-03-23  2:03   ` Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
2020-03-23  2:54     ` Mike Kravetz
2020-03-23  3:43       ` Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
2020-03-23 14:40       ` Sean Christopherson
2020-03-23 16:44         ` Jason Gunthorpe
2020-03-23 16:09   ` Jason Gunthorpe
2020-03-23 17:27     ` Mike Kravetz
2020-03-23 18:07       ` Jason Gunthorpe
2020-03-23 20:35         ` Mike Kravetz
2020-03-23 22:52           ` Jason Gunthorpe
2020-03-24  2:37             ` Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
2020-03-24 11:55               ` Jason Gunthorpe [this message]
2020-03-24 15:25                 ` Mike Kravetz
2020-03-24 15:55                   ` Jason Gunthorpe
2020-03-24 16:19                     ` Mike Kravetz
2020-03-24 17:59                       ` Jason Gunthorpe
2020-03-24 19:47                         ` Mike Kravetz
  -- strict thread matches above, loose matches on Subject: below --
2020-02-22  5:23 Qian Cai
2020-02-22  6:33 ` Longpeng (Mike)
2020-02-22 11:50   ` Qian Cai
2020-02-22 17:02   ` Matthew Wilcox
2020-02-23  1:24     ` Longpeng (Mike)
2020-02-27 21:41       ` Mike Kravetz
2020-03-21 22:46         ` Andrew Morton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200324115541.GH20941@ziepe.ca \
    --to=jgg@ziepe.ca \
    --cc=akpm@linux-foundation.org \
    --cc=arei.gonglei@huawei.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=longpeng2@huawei.com \
    --cc=mike.kravetz@oracle.com \
    --cc=sean.j.christopherson@intel.com \
    --cc=stable@vger.kernel.org \
    --cc=weidong.huang@huawei.com \
    --cc=weifuqiang@huawei.com \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.