All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Zijlstra <a.p.zijlstra@chello.nl>
To: "Luck, Tony" <tony.luck@intel.com>
Cc: Jason Baron <jbaron@redhat.com>,
	torvalds@osdl.org, akpm@osdl.org, linux-ia64@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: RE: [PATCH] ia64: race flushing icache in COW path
Date: Fri, 14 Jul 2006 03:11:47 +0000	[thread overview]
Message-ID: <1152846707.7466.12.camel@lappy> (raw)
In-Reply-To: <617E1C2C70743745A92448908E030B2A38D779@scsmsx411.amr.corp.intel.com>

On Thu, 2006-07-13 at 13:37 -0700, Luck, Tony wrote:
> > lazy_mmu_prot_update() is used in a number of other places *after* the pte 
> > is established. An explanation as to why this case is different, would be 
> > interesting.
> 
> The other places do need a close look, it seems that some of
> them may not be needed (e.g. the one inside "if (reuse) { }" at
> the top of do_wp_page() ... at the moment I'm struggling to see
> what it manages to achieve).
> 
> Most of the rest are in cases where we are adding a new virtual
> page (comments like "No need to invalidate - it was non-present
> before").  These may also need to have the order shuffled, but
> they seem unlikely to be the cause of a bug (it is unlikely
> that an application has threads that branch to new anonymous
> pages as they are being attached to the process).
> 
> So you are right that there may be some more work here, but
> I wanted to get the one-liner that is a clear and obvious
> bugfix posted without being cluttered with some less obvious
> fixes.

How about something like this.
---

From: Peter Zijlstra <a.p.zijlstra@chello.nl>

Since calling of lazy_mmu_prot_update() while the PTE is already set
could open up a race window wrt. icache coherency, make sure all
invocations are done before setting the PTE.

These call-sites are currently not problematic, the one that was has 
been addressed by a previous patch. This patch just tidies up the
call semantics for lazy_mmu_prot_update().

Signed-Off-By: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 mm/hugetlb.c  |    4 ++--
 mm/memory.c   |   10 +++++-----
 mm/mprotect.c |    2 +-
 3 files changed, 8 insertions(+), 8 deletions(-)

Index: linux-2.6/mm/hugetlb.c
=================================--- linux-2.6.orig/mm/hugetlb.c	2006-07-14 04:54:48.000000000 +0200
+++ linux-2.6/mm/hugetlb.c	2006-07-14 05:04:58.000000000 +0200
@@ -313,9 +313,9 @@ static void set_huge_ptep_writable(struc
 	pte_t entry;
 
 	entry = pte_mkwrite(pte_mkdirty(*ptep));
+	lazy_mmu_prot_update(entry);
 	ptep_set_access_flags(vma, address, ptep, entry, 1);
 	update_mmu_cache(vma, address, entry);
-	lazy_mmu_prot_update(entry);
 }
 
 
@@ -634,8 +634,8 @@ void hugetlb_change_protection(struct vm
 		if (!pte_none(*ptep)) {
 			pte = huge_ptep_get_and_clear(mm, address, ptep);
 			pte = pte_mkhuge(pte_modify(pte, newprot));
-			set_huge_pte_at(mm, address, ptep, pte);
 			lazy_mmu_prot_update(pte);
+			set_huge_pte_at(mm, address, ptep, pte);
 		}
 	}
 	spin_unlock(&mm->page_table_lock);
Index: linux-2.6/mm/memory.c
=================================--- linux-2.6.orig/mm/memory.c	2006-07-14 05:00:06.000000000 +0200
+++ linux-2.6/mm/memory.c	2006-07-14 05:02:07.000000000 +0200
@@ -1506,9 +1506,9 @@ static int do_wp_page(struct mm_struct *
 		flush_cache_page(vma, address, pte_pfn(orig_pte));
 		entry = pte_mkyoung(orig_pte);
 		entry = maybe_mkwrite(pte_mkdirty(entry), vma);
+		lazy_mmu_prot_update(entry);
 		ptep_set_access_flags(vma, address, page_table, entry, 1);
 		update_mmu_cache(vma, address, entry);
-		lazy_mmu_prot_update(entry);
 		ret |= VM_FAULT_WRITE;
 		goto unlock;
 	}
@@ -1980,6 +1980,7 @@ static int do_swap_page(struct mm_struct
 	}
 
 	flush_icache_page(vma, page);
+	lazy_mmu_prot_update(pte);
 	set_pte_at(mm, address, page_table, pte);
 	page_add_anon_rmap(page, vma, address);
 
@@ -1997,7 +1998,6 @@ static int do_swap_page(struct mm_struct
 
 	/* No need to invalidate - it was non-present before */
 	update_mmu_cache(vma, address, pte);
-	lazy_mmu_prot_update(pte);
 unlock:
 	pte_unmap_unlock(page_table, ptl);
 out:
@@ -2055,11 +2055,11 @@ static int do_anonymous_page(struct mm_s
 		page_add_file_rmap(page);
 	}
 
+	lazy_mmu_prot_update(entry);
 	set_pte_at(mm, address, page_table, entry);
 
 	/* No need to invalidate - it was non-present before */
 	update_mmu_cache(vma, address, entry);
-	lazy_mmu_prot_update(entry);
 unlock:
 	pte_unmap_unlock(page_table, ptl);
 	return VM_FAULT_MINOR;
@@ -2180,6 +2180,7 @@ retry:
 		entry = mk_pte(new_page, vma->vm_page_prot);
 		if (write_access)
 			entry = maybe_mkwrite(pte_mkdirty(entry), vma);
+		lazy_mmu_prot_update(entry);
 		set_pte_at(mm, address, page_table, entry);
 		if (anon) {
 			inc_mm_counter(mm, anon_rss);
@@ -2197,7 +2198,6 @@ retry:
 
 	/* no need to invalidate: a not-present page shouldn't be cached */
 	update_mmu_cache(vma, address, entry);
-	lazy_mmu_prot_update(entry);
 unlock:
 	pte_unmap_unlock(page_table, ptl);
 	return ret;
@@ -2293,9 +2293,9 @@ static inline int handle_pte_fault(struc
 	}
 	entry = pte_mkyoung(entry);
 	if (!pte_same(old_entry, entry)) {
+		lazy_mmu_prot_update(entry);
 		ptep_set_access_flags(vma, address, pte, entry, write_access);
 		update_mmu_cache(vma, address, entry);
-		lazy_mmu_prot_update(entry);
 	} else {
 		/*
 		 * This is needed only for protection faults but the arch code
Index: linux-2.6/mm/mprotect.c
=================================--- linux-2.6.orig/mm/mprotect.c	2006-07-14 04:54:48.000000000 +0200
+++ linux-2.6/mm/mprotect.c	2006-07-14 05:03:37.000000000 +0200
@@ -43,8 +43,8 @@ static void change_pte_range(struct mm_s
 			 * into place.
 			 */
 			ptent = pte_modify(ptep_get_and_clear(mm, addr, pte), newprot);
-			set_pte_at(mm, addr, pte, ptent);
 			lazy_mmu_prot_update(ptent);
+			set_pte_at(mm, addr, pte, ptent);
 #ifdef CONFIG_MIGRATION
 		} else if (!pte_file(oldpte)) {
 			swp_entry_t entry = pte_to_swp_entry(oldpte);



WARNING: multiple messages have this Message-ID (diff)
From: Peter Zijlstra <a.p.zijlstra@chello.nl>
To: "Luck, Tony" <tony.luck@intel.com>
Cc: Jason Baron <jbaron@redhat.com>,
	torvalds@osdl.org, akpm@osdl.org, linux-ia64@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: RE: [PATCH] ia64: race flushing icache in COW path
Date: Fri, 14 Jul 2006 05:11:47 +0200	[thread overview]
Message-ID: <1152846707.7466.12.camel@lappy> (raw)
In-Reply-To: <617E1C2C70743745A92448908E030B2A38D779@scsmsx411.amr.corp.intel.com>

On Thu, 2006-07-13 at 13:37 -0700, Luck, Tony wrote:
> > lazy_mmu_prot_update() is used in a number of other places *after* the pte 
> > is established. An explanation as to why this case is different, would be 
> > interesting.
> 
> The other places do need a close look, it seems that some of
> them may not be needed (e.g. the one inside "if (reuse) { }" at
> the top of do_wp_page() ... at the moment I'm struggling to see
> what it manages to achieve).
> 
> Most of the rest are in cases where we are adding a new virtual
> page (comments like "No need to invalidate - it was non-present
> before").  These may also need to have the order shuffled, but
> they seem unlikely to be the cause of a bug (it is unlikely
> that an application has threads that branch to new anonymous
> pages as they are being attached to the process).
> 
> So you are right that there may be some more work here, but
> I wanted to get the one-liner that is a clear and obvious
> bugfix posted without being cluttered with some less obvious
> fixes.

How about something like this.
---

From: Peter Zijlstra <a.p.zijlstra@chello.nl>

Since calling of lazy_mmu_prot_update() while the PTE is already set
could open up a race window wrt. icache coherency, make sure all
invocations are done before setting the PTE.

These call-sites are currently not problematic, the one that was has 
been addressed by a previous patch. This patch just tidies up the
call semantics for lazy_mmu_prot_update().

Signed-Off-By: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 mm/hugetlb.c  |    4 ++--
 mm/memory.c   |   10 +++++-----
 mm/mprotect.c |    2 +-
 3 files changed, 8 insertions(+), 8 deletions(-)

Index: linux-2.6/mm/hugetlb.c
===================================================================
--- linux-2.6.orig/mm/hugetlb.c	2006-07-14 04:54:48.000000000 +0200
+++ linux-2.6/mm/hugetlb.c	2006-07-14 05:04:58.000000000 +0200
@@ -313,9 +313,9 @@ static void set_huge_ptep_writable(struc
 	pte_t entry;
 
 	entry = pte_mkwrite(pte_mkdirty(*ptep));
+	lazy_mmu_prot_update(entry);
 	ptep_set_access_flags(vma, address, ptep, entry, 1);
 	update_mmu_cache(vma, address, entry);
-	lazy_mmu_prot_update(entry);
 }
 
 
@@ -634,8 +634,8 @@ void hugetlb_change_protection(struct vm
 		if (!pte_none(*ptep)) {
 			pte = huge_ptep_get_and_clear(mm, address, ptep);
 			pte = pte_mkhuge(pte_modify(pte, newprot));
-			set_huge_pte_at(mm, address, ptep, pte);
 			lazy_mmu_prot_update(pte);
+			set_huge_pte_at(mm, address, ptep, pte);
 		}
 	}
 	spin_unlock(&mm->page_table_lock);
Index: linux-2.6/mm/memory.c
===================================================================
--- linux-2.6.orig/mm/memory.c	2006-07-14 05:00:06.000000000 +0200
+++ linux-2.6/mm/memory.c	2006-07-14 05:02:07.000000000 +0200
@@ -1506,9 +1506,9 @@ static int do_wp_page(struct mm_struct *
 		flush_cache_page(vma, address, pte_pfn(orig_pte));
 		entry = pte_mkyoung(orig_pte);
 		entry = maybe_mkwrite(pte_mkdirty(entry), vma);
+		lazy_mmu_prot_update(entry);
 		ptep_set_access_flags(vma, address, page_table, entry, 1);
 		update_mmu_cache(vma, address, entry);
-		lazy_mmu_prot_update(entry);
 		ret |= VM_FAULT_WRITE;
 		goto unlock;
 	}
@@ -1980,6 +1980,7 @@ static int do_swap_page(struct mm_struct
 	}
 
 	flush_icache_page(vma, page);
+	lazy_mmu_prot_update(pte);
 	set_pte_at(mm, address, page_table, pte);
 	page_add_anon_rmap(page, vma, address);
 
@@ -1997,7 +1998,6 @@ static int do_swap_page(struct mm_struct
 
 	/* No need to invalidate - it was non-present before */
 	update_mmu_cache(vma, address, pte);
-	lazy_mmu_prot_update(pte);
 unlock:
 	pte_unmap_unlock(page_table, ptl);
 out:
@@ -2055,11 +2055,11 @@ static int do_anonymous_page(struct mm_s
 		page_add_file_rmap(page);
 	}
 
+	lazy_mmu_prot_update(entry);
 	set_pte_at(mm, address, page_table, entry);
 
 	/* No need to invalidate - it was non-present before */
 	update_mmu_cache(vma, address, entry);
-	lazy_mmu_prot_update(entry);
 unlock:
 	pte_unmap_unlock(page_table, ptl);
 	return VM_FAULT_MINOR;
@@ -2180,6 +2180,7 @@ retry:
 		entry = mk_pte(new_page, vma->vm_page_prot);
 		if (write_access)
 			entry = maybe_mkwrite(pte_mkdirty(entry), vma);
+		lazy_mmu_prot_update(entry);
 		set_pte_at(mm, address, page_table, entry);
 		if (anon) {
 			inc_mm_counter(mm, anon_rss);
@@ -2197,7 +2198,6 @@ retry:
 
 	/* no need to invalidate: a not-present page shouldn't be cached */
 	update_mmu_cache(vma, address, entry);
-	lazy_mmu_prot_update(entry);
 unlock:
 	pte_unmap_unlock(page_table, ptl);
 	return ret;
@@ -2293,9 +2293,9 @@ static inline int handle_pte_fault(struc
 	}
 	entry = pte_mkyoung(entry);
 	if (!pte_same(old_entry, entry)) {
+		lazy_mmu_prot_update(entry);
 		ptep_set_access_flags(vma, address, pte, entry, write_access);
 		update_mmu_cache(vma, address, entry);
-		lazy_mmu_prot_update(entry);
 	} else {
 		/*
 		 * This is needed only for protection faults but the arch code
Index: linux-2.6/mm/mprotect.c
===================================================================
--- linux-2.6.orig/mm/mprotect.c	2006-07-14 04:54:48.000000000 +0200
+++ linux-2.6/mm/mprotect.c	2006-07-14 05:03:37.000000000 +0200
@@ -43,8 +43,8 @@ static void change_pte_range(struct mm_s
 			 * into place.
 			 */
 			ptent = pte_modify(ptep_get_and_clear(mm, addr, pte), newprot);
-			set_pte_at(mm, addr, pte, ptent);
 			lazy_mmu_prot_update(ptent);
+			set_pte_at(mm, addr, pte, ptent);
 #ifdef CONFIG_MIGRATION
 		} else if (!pte_file(oldpte)) {
 			swp_entry_t entry = pte_to_swp_entry(oldpte);



WARNING: multiple messages have this Message-ID (diff)
From: Peter Zijlstra <a.p.zijlstra@chello.nl>
To: "Luck, Tony" <tony.luck@intel.com>
Cc: Jason Baron <jbaron@redhat.com>,
	torvalds@osdl.org, akpm@osdl.org, linux-ia64@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: RE: [PATCH] ia64: race flushing icache in COW path
Date: Fri, 14 Jul 2006 05:11:47 +0200	[thread overview]
Message-ID: <1152846707.7466.12.camel@lappy> (raw)
In-Reply-To: <617E1C2C70743745A92448908E030B2A38D779@scsmsx411.amr.corp.intel.com>

On Thu, 2006-07-13 at 13:37 -0700, Luck, Tony wrote:
> > lazy_mmu_prot_update() is used in a number of other places *after* the pte 
> > is established. An explanation as to why this case is different, would be 
> > interesting.
> 
> The other places do need a close look, it seems that some of
> them may not be needed (e.g. the one inside "if (reuse) { }" at
> the top of do_wp_page() ... at the moment I'm struggling to see
> what it manages to achieve).
> 
> Most of the rest are in cases where we are adding a new virtual
> page (comments like "No need to invalidate - it was non-present
> before").  These may also need to have the order shuffled, but
> they seem unlikely to be the cause of a bug (it is unlikely
> that an application has threads that branch to new anonymous
> pages as they are being attached to the process).
> 
> So you are right that there may be some more work here, but
> I wanted to get the one-liner that is a clear and obvious
> bugfix posted without being cluttered with some less obvious
> fixes.

How about something like this.
---

From: Peter Zijlstra <a.p.zijlstra@chello.nl>

Since calling of lazy_mmu_prot_update() while the PTE is already set
could open up a race window wrt. icache coherency, make sure all
invocations are done before setting the PTE.

These call-sites are currently not problematic, the one that was has 
been addressed by a previous patch. This patch just tidies up the
call semantics for lazy_mmu_prot_update().

Signed-Off-By: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 mm/hugetlb.c  |    4 ++--
 mm/memory.c   |   10 +++++-----
 mm/mprotect.c |    2 +-
 3 files changed, 8 insertions(+), 8 deletions(-)

Index: linux-2.6/mm/hugetlb.c
===================================================================
--- linux-2.6.orig/mm/hugetlb.c	2006-07-14 04:54:48.000000000 +0200
+++ linux-2.6/mm/hugetlb.c	2006-07-14 05:04:58.000000000 +0200
@@ -313,9 +313,9 @@ static void set_huge_ptep_writable(struc
 	pte_t entry;
 
 	entry = pte_mkwrite(pte_mkdirty(*ptep));
+	lazy_mmu_prot_update(entry);
 	ptep_set_access_flags(vma, address, ptep, entry, 1);
 	update_mmu_cache(vma, address, entry);
-	lazy_mmu_prot_update(entry);
 }
 
 
@@ -634,8 +634,8 @@ void hugetlb_change_protection(struct vm
 		if (!pte_none(*ptep)) {
 			pte = huge_ptep_get_and_clear(mm, address, ptep);
 			pte = pte_mkhuge(pte_modify(pte, newprot));
-			set_huge_pte_at(mm, address, ptep, pte);
 			lazy_mmu_prot_update(pte);
+			set_huge_pte_at(mm, address, ptep, pte);
 		}
 	}
 	spin_unlock(&mm->page_table_lock);
Index: linux-2.6/mm/memory.c
===================================================================
--- linux-2.6.orig/mm/memory.c	2006-07-14 05:00:06.000000000 +0200
+++ linux-2.6/mm/memory.c	2006-07-14 05:02:07.000000000 +0200
@@ -1506,9 +1506,9 @@ static int do_wp_page(struct mm_struct *
 		flush_cache_page(vma, address, pte_pfn(orig_pte));
 		entry = pte_mkyoung(orig_pte);
 		entry = maybe_mkwrite(pte_mkdirty(entry), vma);
+		lazy_mmu_prot_update(entry);
 		ptep_set_access_flags(vma, address, page_table, entry, 1);
 		update_mmu_cache(vma, address, entry);
-		lazy_mmu_prot_update(entry);
 		ret |= VM_FAULT_WRITE;
 		goto unlock;
 	}
@@ -1980,6 +1980,7 @@ static int do_swap_page(struct mm_struct
 	}
 
 	flush_icache_page(vma, page);
+	lazy_mmu_prot_update(pte);
 	set_pte_at(mm, address, page_table, pte);
 	page_add_anon_rmap(page, vma, address);
 
@@ -1997,7 +1998,6 @@ static int do_swap_page(struct mm_struct
 
 	/* No need to invalidate - it was non-present before */
 	update_mmu_cache(vma, address, pte);
-	lazy_mmu_prot_update(pte);
 unlock:
 	pte_unmap_unlock(page_table, ptl);
 out:
@@ -2055,11 +2055,11 @@ static int do_anonymous_page(struct mm_s
 		page_add_file_rmap(page);
 	}
 
+	lazy_mmu_prot_update(entry);
 	set_pte_at(mm, address, page_table, entry);
 
 	/* No need to invalidate - it was non-present before */
 	update_mmu_cache(vma, address, entry);
-	lazy_mmu_prot_update(entry);
 unlock:
 	pte_unmap_unlock(page_table, ptl);
 	return VM_FAULT_MINOR;
@@ -2180,6 +2180,7 @@ retry:
 		entry = mk_pte(new_page, vma->vm_page_prot);
 		if (write_access)
 			entry = maybe_mkwrite(pte_mkdirty(entry), vma);
+		lazy_mmu_prot_update(entry);
 		set_pte_at(mm, address, page_table, entry);
 		if (anon) {
 			inc_mm_counter(mm, anon_rss);
@@ -2197,7 +2198,6 @@ retry:
 
 	/* no need to invalidate: a not-present page shouldn't be cached */
 	update_mmu_cache(vma, address, entry);
-	lazy_mmu_prot_update(entry);
 unlock:
 	pte_unmap_unlock(page_table, ptl);
 	return ret;
@@ -2293,9 +2293,9 @@ static inline int handle_pte_fault(struc
 	}
 	entry = pte_mkyoung(entry);
 	if (!pte_same(old_entry, entry)) {
+		lazy_mmu_prot_update(entry);
 		ptep_set_access_flags(vma, address, pte, entry, write_access);
 		update_mmu_cache(vma, address, entry);
-		lazy_mmu_prot_update(entry);
 	} else {
 		/*
 		 * This is needed only for protection faults but the arch code
Index: linux-2.6/mm/mprotect.c
===================================================================
--- linux-2.6.orig/mm/mprotect.c	2006-07-14 04:54:48.000000000 +0200
+++ linux-2.6/mm/mprotect.c	2006-07-14 05:03:37.000000000 +0200
@@ -43,8 +43,8 @@ static void change_pte_range(struct mm_s
 			 * into place.
 			 */
 			ptent = pte_modify(ptep_get_and_clear(mm, addr, pte), newprot);
-			set_pte_at(mm, addr, pte, ptent);
 			lazy_mmu_prot_update(ptent);
+			set_pte_at(mm, addr, pte, ptent);
 #ifdef CONFIG_MIGRATION
 		} else if (!pte_file(oldpte)) {
 			swp_entry_t entry = pte_to_swp_entry(oldpte);


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2006-07-14  3:11 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-07-13 17:00 [PATCH] ia64: race flushing icache in COW path Luck, Tony
2006-07-13 17:00 ` Luck, Tony, Anil Keshavamurthy
2006-07-13 17:00 ` Luck, Tony
2006-07-13 19:16 ` Jason Baron
2006-07-13 19:16   ` Jason Baron
2006-07-13 19:16   ` Jason Baron
2006-07-13 20:37 ` Luck, Tony
2006-07-13 20:37   ` Luck, Tony
2006-07-13 20:37   ` Luck, Tony
2006-07-14  3:11   ` Peter Zijlstra [this message]
2006-07-14  3:11     ` Peter Zijlstra
2006-07-14  3:11     ` Peter Zijlstra
2006-07-14 17:11 ` Luck, Tony
2006-07-14 17:11   ` Luck, Tony
2006-07-14 17:11   ` Luck, Tony

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1152846707.7466.12.camel@lappy \
    --to=a.p.zijlstra@chello.nl \
    --cc=akpm@osdl.org \
    --cc=jbaron@redhat.com \
    --cc=linux-ia64@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=tony.luck@intel.com \
    --cc=torvalds@osdl.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.