From: Jerome Glisse <jglisse@redhat.com>
To: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
Andrew Morton <akpm@linux-foundation.org>,
Ingo Molnar <mingo@kernel.org>, Michal Hocko <mhocko@suse.com>,
Mel Gorman <mgorman@suse.de>
Subject: Re: [PATCH] x86/mm: pgds getting out of sync after memory hot remove
Date: Mon, 22 May 2017 10:11:51 -0400 [thread overview]
Message-ID: <20170522141150.GA3813@redhat.com> (raw)
In-Reply-To: <20170522131215.wrnklp4dtemntixz@node.shutemov.name>
On Mon, May 22, 2017 at 04:12:15PM +0300, Kirill A. Shutemov wrote:
> On Fri, May 19, 2017 at 02:01:26PM -0400, Jerome Glisse wrote:
> > After memory hot remove it seems we do not synchronize pgds for kernel
> > virtual memory range (on vmemmap_free()). This seems bogus to me as it
> > means we are left with stall entry for process with mm != mm_init
> >
> > Yet i am puzzle by the fact that i am only now hitting this issue. It
> > never was an issue with 4.12 or before ie HMM never triggered following
> > BUG_ON inside sync_global_pgds():
> >
> > if (!p4d_none(*p4d_ref) && !p4d_none(*p4d))
> > BUG_ON(p4d_page_vaddr(*p4d) != p4d_page_vaddr(*p4d_ref));
> >
> >
> > It seems that Kirill 5 level page table changes play a role in this
> > behavior change. I could not bisect because HMM is painfull to rebase
> > for each bisection step so that is just my best guess.
> >
> >
> > Am i missing something here ? Am i wrong in assuming that should sync
> > pgd on vmemmap_free() ? If so anyone have a good guess on why i am now
> > seeing the above BUG_ON ?
>
> What would we gain by syncing pgd on free? Stale pgds are fine as long as
> they are not referenced (use-after-free case). Syncing is addtional work.
Well then how do i avoid the BUG_ON above ? Because the init_mm pgd is
clear but none of the stall entry in any other mm. So if i unplug memory
and replug memory at exact same address it tries to allocate new p4d/pud
for struct page area and then when sync_global_pgds() is call it goes
over the list of pgd and BUG_ON() :
if (!p4d_none(*p4d_ref) && !p4d_none(*p4d))
BUG_ON(p4d_page_vaddr(*p4d) != p4d_page_vaddr(*p4d_ref));
So to me either above check need to go and we should overwritte pgd no
matter what or we should restore previous behavior. I don't mind either
one.
Cheers,
Jerome
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
WARNING: multiple messages have this Message-ID (diff)
From: Jerome Glisse <jglisse@redhat.com>
To: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
Andrew Morton <akpm@linux-foundation.org>,
Ingo Molnar <mingo@kernel.org>, Michal Hocko <mhocko@suse.com>,
Mel Gorman <mgorman@suse.de>
Subject: Re: [PATCH] x86/mm: pgds getting out of sync after memory hot remove
Date: Mon, 22 May 2017 10:11:51 -0400 [thread overview]
Message-ID: <20170522141150.GA3813@redhat.com> (raw)
In-Reply-To: <20170522131215.wrnklp4dtemntixz@node.shutemov.name>
On Mon, May 22, 2017 at 04:12:15PM +0300, Kirill A. Shutemov wrote:
> On Fri, May 19, 2017 at 02:01:26PM -0400, Jérôme Glisse wrote:
> > After memory hot remove it seems we do not synchronize pgds for kernel
> > virtual memory range (on vmemmap_free()). This seems bogus to me as it
> > means we are left with stall entry for process with mm != mm_init
> >
> > Yet i am puzzle by the fact that i am only now hitting this issue. It
> > never was an issue with 4.12 or before ie HMM never triggered following
> > BUG_ON inside sync_global_pgds():
> >
> > if (!p4d_none(*p4d_ref) && !p4d_none(*p4d))
> > BUG_ON(p4d_page_vaddr(*p4d) != p4d_page_vaddr(*p4d_ref));
> >
> >
> > It seems that Kirill 5 level page table changes play a role in this
> > behavior change. I could not bisect because HMM is painfull to rebase
> > for each bisection step so that is just my best guess.
> >
> >
> > Am i missing something here ? Am i wrong in assuming that should sync
> > pgd on vmemmap_free() ? If so anyone have a good guess on why i am now
> > seeing the above BUG_ON ?
>
> What would we gain by syncing pgd on free? Stale pgds are fine as long as
> they are not referenced (use-after-free case). Syncing is addtional work.
Well then how do i avoid the BUG_ON above ? Because the init_mm pgd is
clear but none of the stall entry in any other mm. So if i unplug memory
and replug memory at exact same address it tries to allocate new p4d/pud
for struct page area and then when sync_global_pgds() is call it goes
over the list of pgd and BUG_ON() :
if (!p4d_none(*p4d_ref) && !p4d_none(*p4d))
BUG_ON(p4d_page_vaddr(*p4d) != p4d_page_vaddr(*p4d_ref));
So to me either above check need to go and we should overwritte pgd no
matter what or we should restore previous behavior. I don't mind either
one.
Cheers,
Jérôme
next prev parent reply other threads:[~2017-05-22 14:11 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-05-19 18:01 [PATCH] x86/mm: pgds getting out of sync after memory hot remove Jérôme Glisse
2017-05-19 18:01 ` Jérôme Glisse
2017-05-19 18:01 ` [PATCH] x86/mm: synchronize pgd in vmemmap_free() Jérôme Glisse
2017-05-19 18:01 ` Jérôme Glisse
2017-05-20 0:34 ` John Hubbard
2017-05-20 0:34 ` John Hubbard
2017-05-22 13:12 ` [PATCH] x86/mm: pgds getting out of sync after memory hot remove Kirill A. Shutemov
2017-05-22 13:12 ` Kirill A. Shutemov
2017-05-22 14:11 ` Jerome Glisse [this message]
2017-05-22 14:11 ` Jerome Glisse
2017-05-22 14:29 ` Kirill A. Shutemov
2017-05-22 14:29 ` Kirill A. Shutemov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170522141150.GA3813@redhat.com \
--to=jglisse@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=kirill.shutemov@linux.intel.com \
--cc=kirill@shutemov.name \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mgorman@suse.de \
--cc=mhocko@suse.com \
--cc=mingo@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.