From: Mel Gorman <mgorman@suse.de>
To: David Vrabel <david.vrabel@citrix.com>
Cc: Linux-X86 <x86@kernel.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Cyrill Gorcunov <gorcunov@gmail.com>, Peter Anvin <hpa@zytor.com>,
	Ingo Molnar <mingo@kernel.org>,
	Steven Noonan <steven@uplinklabs.net>,
	Rik van Riel <riel@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Dave Hansen <dave.hansen@intel.com>,
	Srikar Dronamraju <srikar@linux.vnet.ibm.com>,
	Linux-MM <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 4/5] mm: use paravirt friendly ops for NUMA hinting ptes
Date: Tue, 15 Apr 2014 15:44:24 +0100
Message-ID: <20140415144423.GV7292@suse.de>
In-Reply-To: <534D09AC.7020704@citrix.com>

On Tue, Apr 15, 2014 at 11:27:56AM +0100, David Vrabel wrote:
> On 08/04/14 14:09, Mel Gorman wrote:
> > David Vrabel identified a regression when using automatic NUMA balancing
> > under Xen whereby page table entries were getting corrupted due to the
> > use of native PTE operations. Quoting him
> > 
> > 	Xen PV guest page tables require that their entries use machine
> > 	addresses if the present bit (_PAGE_PRESENT) is set, and (for
> > 	successful migration) non-present PTEs must use pseudo-physical
> > 	addresses.  This is because on migration MFNs in present PTEs are
> > 	translated to PFNs (canonicalised) so they may be translated back
> > 	to the new MFN in the destination domain (uncanonicalised).
> > 
> > 	pte_mknonnuma(), pmd_mknonnuma(), pte_mknuma() and pmd_mknuma()
> > 	set and clear the _PAGE_PRESENT bit using pte_set_flags(),
> > 	pte_clear_flags(), etc.
> > 
> > 	In a Xen PV guest, these functions must translate MFNs to PFNs
> > 	when clearing _PAGE_PRESENT and translate PFNs to MFNs when setting
> > 	_PAGE_PRESENT.
> > 
> > His suggested fix converted p[te|md]_[set|clear]_flags to use
> > paravirt-friendly ops but this is overkill. He suggested an alternative of
> > using p[te|md]_modify in the NUMA page table operations but this does
> > more work than necessary and would require looking up a VMA for protections.
> > 
> > This patch modifies the NUMA page table operations to use paravirt friendly
> > operations to set/clear the flags of interest. Unfortunately this will take
> > a performance hit when updating the PTEs on CONFIG_PARAVIRT but I do not
> > see a way around it that does not break Xen.
> 
> We're getting more reports of users hitting this regression with distro
> provided kernels.  Irrespective of the rest of this series, can we get
> at least this applied and tagged for stable, please?
> 
> http://lists.xenproject.org/archives/html/xen-devel/2014-04/msg01905.html
> 

The resend of the series got delayed until today. Fengguang Wu hit
problems testing the series and I ran into a number of similarly shaped
problems that took time to resolve. I sent out a v4 of the series with this
patch at the front and a note on the leader saying it should be picked up
for stable regardless of what happens with patches 2 and 3.
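
For anyone following along who is unfamiliar with the Xen PV constraint
quoted above, here is a toy user-space C model of why flipping
_PAGE_PRESENT with a native flag operation leaves the wrong kind of frame
number in the entry. It is only a sketch under stated assumptions: every
toy_ name, the four-entry p2m table and the bit layout are invented for
illustration, and none of it is the kernel's code, Xen's code, or the
patch itself.

```c
#include <assert.h>
#include <stdint.h>

/* Toy PTE layout (an assumption for illustration): bit 0 is "present",
 * bits 12 and up hold a frame number. */
#define TOY_PAGE_PRESENT 0x1ULL
#define TOY_PAGE_SHIFT   12
#define TOY_FLAGS_MASK   ((1ULL << TOY_PAGE_SHIFT) - 1)
#define TOY_NR_FRAMES    4

/* Hypothetical pseudo-physical (PFN) to machine (MFN) table for one guest. */
static const uint64_t toy_p2m[TOY_NR_FRAMES] = { 7, 5, 9, 6 };

static uint64_t toy_pfn_to_mfn(uint64_t pfn)
{
    return toy_p2m[pfn];
}

static uint64_t toy_mfn_to_pfn(uint64_t mfn)
{
    for (uint64_t pfn = 0; pfn < TOY_NR_FRAMES; pfn++)
        if (toy_p2m[pfn] == mfn)
            return pfn;
    return UINT64_MAX; /* MFN does not belong to this toy guest */
}

/* Native-style op: clears the flag only, so a present entry's MFN is
 * left behind in an entry that is now marked non-present. */
static uint64_t toy_native_clear_present(uint64_t pte)
{
    return pte & ~TOY_PAGE_PRESENT;
}

/* PV-aware op: translate MFN to PFN while clearing the present bit. */
static uint64_t toy_pv_clear_present(uint64_t pte)
{
    uint64_t flags = pte & TOY_FLAGS_MASK;
    uint64_t frame = pte >> TOY_PAGE_SHIFT;

    if (flags & TOY_PAGE_PRESENT)
        frame = toy_mfn_to_pfn(frame);
    return (frame << TOY_PAGE_SHIFT) | (flags & ~TOY_PAGE_PRESENT);
}

/* PV-aware op: translate PFN to MFN while setting the present bit. */
static uint64_t toy_pv_set_present(uint64_t pte)
{
    uint64_t flags = pte & TOY_FLAGS_MASK;
    uint64_t frame = pte >> TOY_PAGE_SHIFT;

    if (!(flags & TOY_PAGE_PRESENT))
        frame = toy_pfn_to_mfn(frame);
    return (frame << TOY_PAGE_SHIFT) | flags | TOY_PAGE_PRESENT;
}
```

In this model a present entry for PFN 2 holds MFN 9. The native clear keeps
MFN 9 in a non-present entry, which is the kind of entry the quoted text
says must hold a pseudo-physical address, while the PV-aware pair
round-trips between PFN 2 and MFN 9.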

-- 
Mel Gorman
SUSE Labs
