From: Jerome Glisse <jglisse@redhat.com>
To: Balbir Singh <bsingharora@gmail.com>
Cc: linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
	linux-kernel@vger.kernel.org,
	Ralph Campbell <rcampbell@nvidia.com>,
	"Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>,
	stable@vger.kernel.org
Subject: Re: [PATCH 2/7] mm/rmap: map_pte() was not handling private ZONE_DEVICE page properly
Date: Thu, 30 Aug 2018 10:34:19 -0400
Message-ID: <20180830143418.GC3529@redhat.com>
In-Reply-To: <20180830140538.GA28695@350D>

On Fri, Aug 31, 2018 at 12:05:38AM +1000, Balbir Singh wrote:
> On Fri, Aug 24, 2018 at 03:25:44PM -0400, jglisse@redhat.com wrote:
> > From: Ralph Campbell <rcampbell@nvidia.com>
> > 
> > Private ZONE_DEVICE pages use a special pte entry and thus are not
> > present. Properly handle this case in map_pte(); it is already handled
> > in check_pte(), and the map_pte() part was most probably lost in a rebase.
> > 
> > Without this patch the slow migration path cannot migrate private
> > ZONE_DEVICE memory back to regular memory. This was found after stress
> > testing migration back to system memory. This can ultimately lead the
> > CPU into an infinite page fault loop on the special swap entry.
> > 
> > Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
> > Signed-off-by: Jerome Glisse <jglisse@redhat.com>
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > Cc: stable@vger.kernel.org
> > ---
> >  mm/page_vma_mapped.c | 9 +++++++++
> >  1 file changed, 9 insertions(+)
> > 
> > diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
> > index ae3c2a35d61b..1cf5b9bfb559 100644
> > --- a/mm/page_vma_mapped.c
> > +++ b/mm/page_vma_mapped.c
> > @@ -21,6 +21,15 @@ static bool map_pte(struct page_vma_mapped_walk *pvmw)
> >  			if (!is_swap_pte(*pvmw->pte))
> >  				return false;
> >  		} else {
> > +			if (is_swap_pte(*pvmw->pte)) {
> > +				swp_entry_t entry;
> > +
> > +				/* Handle un-addressable ZONE_DEVICE memory */
> > +				entry = pte_to_swp_entry(*pvmw->pte);
> > +				if (is_device_private_entry(entry))
> > +					return true;
> > +			}
> > +
> 
> This happens just for !PVMW_SYNC && PVMW_MIGRATION? I presume this
> is triggered via the remove_migration_pte() code path? Doesn't
> returning true here imply that we've taken the ptl lock for the
> pvmw?

This happens through try_to_unmap(), called from migrate_vma_unmap(), and
thus has !PVMW_SYNC and !PVMW_MIGRATION.

But you are right about the ptl lock. Looking at the code, we were
modifying the pte without holding the pte lock, but because
pvmw->ptl == NULL, page_vma_mapped_walk() would not try to unlock,
so this never triggered any warning.
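
To illustrate the point, here is a minimal user-space sketch. It is not
kernel code and not the v2 patch. Every name in it (model_lock, model_walk,
model_map_pte_early_return(), model_map_pte_locked(), model_walk_done()) is
hypothetical and only models the behavior described above: an early
"return true" leaves the lock pointer NULL, so the later "unlock only if
set" step silently does nothing, while a variant that routes every success
path through the lock keeps the two in step.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are the kernel's real types. */
struct model_lock {
	bool held;
};

enum model_pte_kind {
	MODEL_PTE_NONE,
	MODEL_PTE_PRESENT,
	MODEL_PTE_DEVICE_PRIVATE,	/* models the special swap entry */
	MODEL_PTE_OTHER_SWAP,
};

struct model_walk {
	enum model_pte_kind pte;
	struct model_lock *table_lock;	/* lock protecting the page table */
	struct model_lock *ptl;		/* non-NULL only while the walk holds it */
};

static void model_lock_take(struct model_lock *l)
{
	assert(!l->held);
	l->held = true;
}

static void model_lock_release(struct model_lock *l)
{
	assert(l->held);
	l->held = false;
}

/* Models the early return: success is reported but no lock is taken. */
static bool model_map_pte_early_return(struct model_walk *w)
{
	if (w->pte == MODEL_PTE_DEVICE_PRIVATE)
		return true;		/* w->ptl stays NULL */
	if (w->pte != MODEL_PTE_PRESENT)
		return false;
	w->ptl = w->table_lock;
	model_lock_take(w->ptl);
	return true;
}

/* One possible alternative: every success path takes the lock. */
static bool model_map_pte_locked(struct model_walk *w)
{
	if (w->pte != MODEL_PTE_PRESENT &&
	    w->pte != MODEL_PTE_DEVICE_PRIVATE)
		return false;
	w->ptl = w->table_lock;
	model_lock_take(w->ptl);
	return true;
}

/* Models "unlock only if ptl is set", which is why nothing warned. */
static void model_walk_done(struct model_walk *w)
{
	if (w->ptl) {
		model_lock_release(w->ptl);
		w->ptl = NULL;
	}
}
```

With the early-return variant, a caller that sees true for a device
private entry goes on to modify the entry while table_lock is not held.
With the locked variant the same caller holds the lock until
model_walk_done() runs.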

I am going to post a v2 shortly which addresses that.

Cheers,
Jerome
