From: Alistair Popple <apopple@nvidia.com>
To: "Huang, Ying" <ying.huang@intel.com>
Cc: "Sierra Guiza, Alejandro (Alex)" <alex.sierra@amd.com>,
	Ralph Campbell <rcampbell@nvidia.com>,
	Lyude Paul <lyude@redhat.com>, Karol Herbst <kherbst@redhat.com>,
	David Hildenbrand <david@redhat.com>,
	Nadav Amit <nadav.amit@gmail.com>,
	Felix Kuehling <Felix.Kuehling@amd.com>,
	linuxppc-dev@lists.ozlabs.org,
	LKML <linux-kernel@vger.kernel.org>, Peter Xu <peterx@redhat.com>,
	linux-mm@kvack.org, Logan Gunthorpe <logang@deltatee.com>,
	Matthew Wilcox <willy@infradead.org>,
	Jason Gunthorpe <jgg@nvidia.com>,
	John Hubbard <jhubbard@nvidia.com>,
	stable@vger.kernel.org, akpm@linux-foundation.org,
	huang ying <huang.ying.caritas@gmail.com>,
	Ben Skeggs <bskeggs@redhat.com>
Subject: Re: [PATCH v3 1/3] mm/migrate_device.c: Flush TLB while holding PTL
Date: Fri, 26 Aug 2022 08:35:17 +1000
Message-ID: <87y1vcdjzs.fsf@nvdebian.thelocal>
In-Reply-To: <87sfll2jfc.fsf@yhuang6-desk2.ccr.corp.intel.com>


"Huang, Ying" <ying.huang@intel.com> writes:

> Alistair Popple <apopple@nvidia.com> writes:
>
>> When clearing a PTE the TLB should be flushed whilst still holding the
>> PTL to avoid a potential race with madvise/munmap/etc. For example
>> consider the following sequence:
>>
>>   CPU0                          CPU1
>>   ----                          ----
>>
>>   migrate_vma_collect_pmd()
>>   pte_unmap_unlock()
>>                                 madvise(MADV_DONTNEED)
>>                                 -> zap_pte_range()
>>                                 pte_offset_map_lock()
>>                                 [ PTE not present, TLB not flushed ]
>>                                 pte_unmap_unlock()
>>                                 [ page is still accessible via stale TLB ]
>>   flush_tlb_range()
>>
>> In this case the page may still be accessed via the stale TLB entry
>> after madvise returns. Fix this by flushing the TLB while holding the
>> PTL.
>>
>> Signed-off-by: Alistair Popple <apopple@nvidia.com>
>> Reported-by: Nadav Amit <nadav.amit@gmail.com>
>> Fixes: 8c3328f1f36a ("mm/migrate: migrate_vma() unmap page from vma while collecting pages")
>> Cc: stable@vger.kernel.org
>>
>> ---
>>
>> Changes for v3:
>>
>>  - New for v3
>> ---
>>  mm/migrate_device.c | 5 +++--
>>  1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/mm/migrate_device.c b/mm/migrate_device.c
>> index 27fb37d..6a5ef9f 100644
>> --- a/mm/migrate_device.c
>> +++ b/mm/migrate_device.c
>> @@ -254,13 +254,14 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
>>  		migrate->dst[migrate->npages] = 0;
>>  		migrate->src[migrate->npages++] = mpfn;
>>  	}
>> -	arch_leave_lazy_mmu_mode();
>> -	pte_unmap_unlock(ptep - 1, ptl);
>>
>>  	/* Only flush the TLB if we actually modified any entries */
>>  	if (unmapped)
>>  		flush_tlb_range(walk->vma, start, end);
>
> It appears that we can increase "unmapped" only if ptep_get_and_clear()
> is used?

In other words you mean we only need to increase unmapped if pte_present
&& !anon_exclusive?

Agreed, that's a good optimisation to make. However, I'm just trying to
solve a data corruption issue (not dirtying the page) here, so I will
post that as a separate optimisation patch. Thanks.

 - Alistair

> Best Regards,
> Huang, Ying
>
>> +	arch_leave_lazy_mmu_mode();
>> +	pte_unmap_unlock(ptep - 1, ptl);
>> +
>>  	return 0;
>>  }
>>
>>
>> base-commit: ffcf9c5700e49c0aee42dcba9a12ba21338e8136
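
The race in the quoted commit message can be replayed with a small
user-space model. This is only an illustrative sketch, not kernel code:
struct mm_model, ptl_lock(), ptl_unlock(), model_zap() and
stale_tlb_after_zap() are hypothetical names, the "TLB" is a single
boolean, and the zap path is simplified to "flush only for PTEs it
clears itself". The two CPUs are replayed sequentially in the order
shown in the diagram rather than run as real threads.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model: one PTE, one cached translation, one page-table lock (PTL). */
struct mm_model {
	bool pte_present;	/* the page table entry maps the page */
	bool tlb_cached;	/* some CPU still holds a cached translation */
	bool ptl_held;		/* page-table lock state */
};

static void ptl_lock(struct mm_model *m)
{
	assert(!m->ptl_held);	/* sequential replay: lock must be free */
	m->ptl_held = true;
}

static void ptl_unlock(struct mm_model *m)
{
	assert(m->ptl_held);
	m->ptl_held = false;
}

/* CPU1: simplified zap path, flushes only for PTEs it clears itself. */
static void model_zap(struct mm_model *m)
{
	ptl_lock(m);
	if (m->pte_present) {
		m->pte_present = false;
		m->tlb_cached = false;
	}
	/* else: PTE not present, so no flush is issued */
	ptl_unlock(m);
}

/*
 * Replay the interleaving from the commit message. Returns true if the
 * page is still reachable through the TLB when the zap has returned.
 */
static bool stale_tlb_after_zap(bool flush_under_ptl)
{
	struct mm_model m = {
		.pte_present = true,
		.tlb_cached = true,
		.ptl_held = false,
	};
	bool stale;

	/* CPU0: the collect step clears the PTE under the PTL. */
	ptl_lock(&m);
	m.pte_present = false;
	if (flush_under_ptl)
		m.tlb_cached = false;	/* patched order: flush, then unlock */
	ptl_unlock(&m);

	/* CPU1: runs in the window after CPU0 dropped the lock. */
	model_zap(&m);
	stale = m.tlb_cached;

	/* CPU0: original order flushes only now, after the zap returned. */
	if (!flush_under_ptl)
		m.tlb_cached = false;

	return stale;
}
```

In this model, stale_tlb_after_zap(false) corresponds to the original
ordering (unlock, then flush) and reports a stale translation at the
point where the zap returns, while stale_tlb_after_zap(true)
corresponds to the ordering in this patch and does not.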
