linux-fsdevel.vger.kernel.org archive mirror
From: Jan Kara <jack@suse.cz>
To: linux-mm@kvack.org
Cc: linux-fsdevel@vger.kernel.org, linux-nvdimm@lists.01.org,
	Dan Williams <dan.j.williams@intel.com>,
	Ross Zwisler <ross.zwisler@linux.intel.com>,
	Jan Kara <jack@suse.cz>
Subject: [PATCH 13/15] mm: Provide helper for finishing mkwrite faults
Date: Fri, 22 Jul 2016 14:19:39 +0200
Message-ID: <1469189981-19000-14-git-send-email-jack@suse.cz> (raw)
In-Reply-To: <1469189981-19000-1-git-send-email-jack@suse.cz>

Provide a helper function for finishing write faults due to the PTE being
read-only. The helper will be used by DAX to avoid the need to complicate
generic MM code with DAX locking specifics.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 include/linux/mm.h |  1 +
 mm/memory.c        | 62 +++++++++++++++++++++++++++++++++++-------------------
 2 files changed, 41 insertions(+), 22 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index daf690fccc0c..32ff572a6e6c 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -601,6 +601,7 @@ static inline pte_t maybe_mkwrite(pte_t pte, struct vm_area_struct *vma)
 void do_set_pte(struct vm_area_struct *vma, unsigned long address,
 		struct page *page, pte_t *pte, bool write, bool anon);
 int finish_fault(struct vm_area_struct *vma, struct vm_fault *vmf);
+int finish_mkwrite_fault(struct vm_area_struct *vma, struct vm_fault *vmf);
 #endif
 
 /*
diff --git a/mm/memory.c b/mm/memory.c
index 1d2916c53d43..30cf7b36df48 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2262,6 +2262,41 @@ oom:
 	return VM_FAULT_OOM;
 }
 
+/**
+ * finish_mkwrite_fault - finish page fault making PTE writable once the
+ *			   page is prepared
+ *
+ * @vma: virtual memory area
+ * @vmf: structure describing the fault
+ *
+ * This function handles all that is needed to finish a write page fault due
+ * to a PTE being read-only once the mapped page is prepared. It handles
+ * locking of the PTE and modifying it. The function returns 0 on success,
+ * -EBUSY in case the PTE changed before we acquired the PTE lock.
+ *
+ * The function expects the page to be locked or some other protection to be
+ * held against concurrent faults / writeback (such as DAX radix tree locks).
+ */
+int finish_mkwrite_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
+{
+	unsigned long address = (unsigned long)vmf->virtual_address;
+	pte_t *pte;
+	spinlock_t *ptl;
+
+	pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, address, &ptl);
+	/*
+	 * We might have raced with another page fault while we
+	 * released the pte_offset_map_lock.
+	 */
+	if (!pte_same(*pte, vmf->orig_pte)) {
+		pte_unmap_unlock(pte, ptl);
+		return -EBUSY;
+	}
+	wp_page_reuse(vma->vm_mm, vma, address, pte, ptl, vmf->orig_pte,
+		      vmf->page);
+	return 0;
+}
+
 /*
  * Handle write page faults for VM_MIXEDMAP or VM_PFNMAP for a VM_SHARED
  * mapping
@@ -2282,17 +2317,12 @@ static int wp_pfn_shared(struct mm_struct *mm,
 		ret = vma->vm_ops->pfn_mkwrite(vma, &vmf);
 		if (ret & VM_FAULT_ERROR)
 			return ret;
-		page_table = pte_offset_map_lock(mm, pmd, address, &ptl);
-		/*
-		 * We might have raced with another page fault while we
-		 * released the pte_offset_map_lock.
-		 */
-		if (!pte_same(*page_table, orig_pte)) {
-			pte_unmap_unlock(page_table, ptl);
+		if (finish_mkwrite_fault(vma, &vmf) < 0)
 			return 0;
-		}
+	} else {
+		wp_page_reuse(mm, vma, address, page_table, ptl, orig_pte,
+			      NULL);
 	}
-	wp_page_reuse(mm, vma, address, page_table, ptl, orig_pte, NULL);
 	return VM_FAULT_WRITE;
 }
 
@@ -2319,28 +2349,16 @@ static int wp_page_shared(struct mm_struct *mm, struct vm_area_struct *vma,
 			put_page(old_page);
 			return tmp;
 		}
-		/*
-		 * Since we dropped the lock we need to revalidate
-		 * the PTE as someone else may have changed it.  If
-		 * they did, we just return, as we can count on the
-		 * MMU to tell us if they didn't also make it writable.
-		 */
-		page_table = pte_offset_map_lock(mm, pmd, address,
-						 &ptl);
-		if (!pte_same(*page_table, orig_pte)) {
+		if (finish_mkwrite_fault(vma, &vmf) < 0) {
 			unlock_page(old_page);
-			pte_unmap_unlock(page_table, ptl);
 			put_page(old_page);
 			return 0;
 		}
-		wp_page_reuse(mm, vma, address, page_table, ptl, orig_pte,
-			      old_page);
 	} else {
 		wp_page_reuse(mm, vma, address, page_table, ptl, orig_pte,
 			      old_page);
 		lock_page(old_page);
 	}
-
 	fault_dirty_shared_page(vma, old_page);
 	put_page(old_page);
 	return VM_FAULT_WRITE;
-- 
2.6.6


  parent reply	other threads:[~2016-07-22 12:19 UTC|newest]

Thread overview: 19+ messages
2016-07-22 12:19 [PATCH 0/15 v2] dax: Clear dirty bits after flushing caches Jan Kara
2016-07-22 12:19 ` [PATCH 01/15] mm: Create vm_fault structure earlier Jan Kara
2016-07-22 12:19 ` [PATCH 02/15] mm: Propagate original vm_fault into do_fault_around() Jan Kara
2016-07-22 12:19 ` [PATCH 03/15] mm: Add pmd and orig_pte fields to vm_fault Jan Kara
2016-07-22 12:19 ` [PATCH 04/15] mm: Allow full handling of COW faults in ->fault handlers Jan Kara
2016-07-22 12:19 ` [PATCH 05/15] mm: Factor out functionality to finish page faults Jan Kara
2016-07-22 12:19 ` [PATCH 06/15] mm: Move handling of COW faults into DAX code Jan Kara
2016-07-22 12:19 ` [PATCH 07/15] dax: Make cache flushing protected by entry lock Jan Kara
2016-07-22 12:19 ` [PATCH 08/15] mm: Export follow_pte() Jan Kara
2016-07-22 12:19 ` [PATCH 09/15] mm: Remove unnecessary vma->vm_ops check Jan Kara
2016-07-22 12:19 ` [PATCH 10/15] mm: Factor out common parts of write fault handling Jan Kara
2016-07-22 12:19 ` [PATCH 11/15] mm: Move part of wp_page_reuse() into the single call site Jan Kara
2016-07-22 12:19 ` [PATCH 12/15] mm: Lift vm_fault structure creation from do_page_mkwrite() Jan Kara
2016-07-22 12:19 ` Jan Kara [this message]
2016-08-09 14:50   ` [lkp] [mm] 0c649028cd: vm-scalability.throughput 343.9% improvement kernel test robot
2016-07-22 12:19 ` [PATCH 14/15] dax: Protect PTE modification on WP fault by radix tree entry lock Jan Kara
2016-07-25 21:30   ` Ross Zwisler
2016-07-26 14:09     ` Jan Kara
2016-07-22 12:19 ` [PATCH 15/15] dax: Clear dirty entry tags on cache flush Jan Kara
