[PATCH 0/2] Reduce mmap_sem hold times during file backed page faults

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

* [PATCH 0/2] Reduce mmap_sem hold times during file backed page faults
@ 2010-10-01  5:04 Michel Lespinasse
  2010-10-01  5:04 ` [PATCH 1/2] Unique path for locking page in filemap_fault() Michel Lespinasse
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Michel Lespinasse @ 2010-10-01  5:04 UTC (permalink / raw)
  To: linux-mm, Linus Torvalds, Ying Han
  Cc: linux-kernel, Andrew Morton, Rik van Riel, Nick Piggin,
	Peter Zijlstra, Hugh Dickins

Linus, I would appreciate your comments on this since you shot down the
previous proposal. I hope you'll find this approach is sane, but I would
be interested to hear if you have specific objections.

mmap_sem is very coarse grained (per process) and has long read-hold times
(disk latencies); this breaks down rapidly for workloads that use both
read and write mmap_sem acquires. This short patch series tries to reduce
mmap_sem hold times when faulting in file backed VMAs.

First patch creates a single place to lock the page in filemap_fault().
There should be no behavior differences.

Second patch modifies that lock_page() so that, if trylock_page() fails,
we consider releasing the mmap_sem while waiting for page to be unlocked.
This is controlled by a new FAULT_FLAG_RELEASE flag. If the mmap_sem gets
released, we return the VM_FAULT_RELEASED status; the caller is then expected
to re-acquire mmap_sem and retry the page fault. Chances are that the same
page will be accessed and will now be unlocked, so the mmap_sem hold time
will be short.

Michel Lespinasse (2):
  Unique path for locking page in filemap_fault()
  Release mmap_sem when page fault blocks on disk transfer.

 arch/x86/mm/fault.c |   35 ++++++++++++++++++++++++++---------
 include/linux/mm.h  |    2 ++
 mm/filemap.c        |   38 +++++++++++++++++++++++++++++---------
 mm/memory.c         |    3 ++-
 4 files changed, 59 insertions(+), 19 deletions(-)

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 1/2] Unique path for locking page in filemap_fault()
  2010-10-01  5:04 [PATCH 0/2] Reduce mmap_sem hold times during file backed page faults Michel Lespinasse
@ 2010-10-01  5:04 ` Michel Lespinasse
  2010-10-01  5:04 ` [PATCH 2/2] Release mmap_sem when page fault blocks on disk transfer Michel Lespinasse
  2010-10-01 12:07 ` [PATCH 0/2] Reduce mmap_sem hold times during file backed page faults Rik van Riel
  2 siblings, 0 replies; 8+ messages in thread
From: Michel Lespinasse @ 2010-10-01  5:04 UTC (permalink / raw)
  To: linux-mm, Linus Torvalds, Ying Han
  Cc: linux-kernel, Andrew Morton, Rik van Riel, Nick Piggin,
	Peter Zijlstra, Hugh Dickins

Note that if the file got truncated before we locked the page, we retry
at find_get_page(). This is similar to what find_lock_page() would do.
Linus's commit ef00e08e introduced a goto no_cached_page in that simuation,
which seems correct as well but possibly less efficient ?

----------------------- >8 ------------------------

This change introduces a single location where filemap_fault() locks
the desired page. There used to be two such places, depending if the
initial find_get_page() was successful or not.

Signed-off-by: Michel Lespinasse <walken@google.com>
---
 mm/filemap.c |   20 +++++++++++---------
 1 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index 3d4df44..8ed709a 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1539,25 +1539,27 @@ int filemap_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 		 * waiting for the lock.
 		 */
 		do_async_mmap_readahead(vma, ra, file, page, offset);
-		lock_page(page);
-
-		/* Did it get truncated? */
-		if (unlikely(page->mapping != mapping)) {
-			unlock_page(page);
-			put_page(page);
-			goto no_cached_page;
-		}
 	} else {
 		/* No page in the page cache at all */
 		do_sync_mmap_readahead(vma, ra, file, offset);
 		count_vm_event(PGMAJFAULT);
 		ret = VM_FAULT_MAJOR;
 retry_find:
-		page = find_lock_page(mapping, offset);
+		page = find_get_page(mapping, offset);
 		if (!page)
 			goto no_cached_page;
 	}
 
+	lock_page(page);
+
+	/* Did it get truncated? */
+	if (unlikely(page->mapping != mapping)) {
+		unlock_page(page);
+		put_page(page);
+		goto retry_find;
+	}
+	VM_BUG_ON(page->index != offset);
+
 	/*
 	 * We have a locked page in the page cache, now we need to check
 	 * that it's up-to-date. If not, it is going to be due to an error.
-- 
1.7.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 2/2] Release mmap_sem when page fault blocks on disk transfer.
  2010-10-01  5:04 [PATCH 0/2] Reduce mmap_sem hold times during file backed page faults Michel Lespinasse
  2010-10-01  5:04 ` [PATCH 1/2] Unique path for locking page in filemap_fault() Michel Lespinasse
@ 2010-10-01  5:04 ` Michel Lespinasse
  2010-10-01 14:06   ` Rik van Riel
  2010-10-01 15:31   ` Linus Torvalds
  2010-10-01 12:07 ` [PATCH 0/2] Reduce mmap_sem hold times during file backed page faults Rik van Riel
  2 siblings, 2 replies; 8+ messages in thread
From: Michel Lespinasse @ 2010-10-01  5:04 UTC (permalink / raw)
  To: linux-mm, Linus Torvalds, Ying Han
  Cc: linux-kernel, Andrew Morton, Rik van Riel, Nick Piggin,
	Peter Zijlstra, Hugh Dickins

This change reduces mmap_sem hold times that are caused by waiting for
disk transfers when accessing file mapped VMAs. It introduces the
VM_FAULT_RELEASED flag, which indicates that the call site holds mmap_lock
and wishes for it to be released if blocking on a pending disk transfer.
In that case, filemap_fault() returns the VM_FAULT_RELEASED status bit
and do_page_fault() will then re-acquire mmap_sem and retry the page fault.
It is expected that the retry will hit the same page which will now be cached,
and thus it will complete with a low mmap_sem hold time.


Signed-off-by: Michel Lespinasse <walken@google.com>
---
 arch/x86/mm/fault.c |   35 ++++++++++++++++++++++++++---------
 include/linux/mm.h  |    2 ++
 mm/filemap.c        |   20 +++++++++++++++++++-
 mm/memory.c         |    3 ++-
 4 files changed, 49 insertions(+), 11 deletions(-)

diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index 4c4508e..58109ba 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -954,6 +954,7 @@ do_page_fault(struct pt_regs *regs, unsigned long error_code)
 	struct mm_struct *mm;
 	int write;
 	int fault;
+	unsigned int release_flag = FAULT_FLAG_RELEASE;
 
 	tsk = current;
 	mm = tsk->mm;
@@ -1064,6 +1065,7 @@ do_page_fault(struct pt_regs *regs, unsigned long error_code)
 			bad_area_nosemaphore(regs, error_code, address);
 			return;
 		}
+retry:
 		down_read(&mm->mmap_sem);
 	} else {
 		/*
@@ -1119,21 +1121,36 @@ good_area:
 	 * make sure we exit gracefully rather than endlessly redo
 	 * the fault:
 	 */
-	fault = handle_mm_fault(mm, vma, address, write ? FAULT_FLAG_WRITE : 0);
+	fault = handle_mm_fault(mm, vma, address,
+				release_flag | (write ? FAULT_FLAG_WRITE : 0));
 
 	if (unlikely(fault & VM_FAULT_ERROR)) {
 		mm_fault_error(regs, error_code, address, fault);
 		return;
 	}
 
-	if (fault & VM_FAULT_MAJOR) {
-		tsk->maj_flt++;
-		perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1, 0,
-				     regs, address);
-	} else {
-		tsk->min_flt++;
-		perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1, 0,
-				     regs, address);
+	if (release_flag) {	/* Did not go through a retry */
+		if (fault & VM_FAULT_MAJOR) {
+			tsk->maj_flt++;
+			perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1, 0,
+				      regs, address);
+		} else {
+			tsk->min_flt++;
+			perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1, 0,
+				      regs, address);
+		}
+		if (fault & VM_FAULT_RELEASED) {
+			/*
+			 * handle_mm_fault() found that the desired page was
+			 * locked. We asked for it to release mmap_sem in that
+			 * case, so as to avoid holding it for too long.
+			 * Retry starting at the mmap_sem acquire, this time
+			 * without FAULT_FLAG_RETRY so that we avoid any
+			 * risk of starvation.
+			 */
+			release_flag = 0;
+			goto retry;
+		}
 	}
 
 	check_v8086_mode(regs, address, tsk);
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 74949fb..7782c30 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -144,6 +144,7 @@ extern pgprot_t protection_map[16];
 #define FAULT_FLAG_WRITE	0x01	/* Fault was a write access */
 #define FAULT_FLAG_NONLINEAR	0x02	/* Fault was via a nonlinear mapping */
 #define FAULT_FLAG_MKWRITE	0x04	/* Fault was mkwrite of existing pte */
+#define FAULT_FLAG_RELEASE	0x08	/* Release mmap_sem if blocking */
 
 /*
  * This interface is used by x86 PAT code to identify a pfn mapping that is
@@ -722,6 +723,7 @@ static inline int page_mapped(struct page *page)
 
 #define VM_FAULT_NOPAGE	0x0100	/* ->fault installed the pte, not return page */
 #define VM_FAULT_LOCKED	0x0200	/* ->fault locked the returned page */
+#define VM_FAULT_RELEASED	0x0400	/* mmap_sem got released */
 
 #define VM_FAULT_ERROR	(VM_FAULT_OOM | VM_FAULT_SIGBUS | VM_FAULT_HWPOISON)
 
diff --git a/mm/filemap.c b/mm/filemap.c
index 8ed709a..74197e2 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1550,7 +1550,25 @@ retry_find:
 			goto no_cached_page;
 	}
 
-	lock_page(page);
+	/* Lock the page. */
+	if (!trylock_page(page)) {
+		if (!(vmf->flags & FAULT_FLAG_RELEASE))
+			__lock_page(page);
+		else {
+			/*
+			 * Caller passed FAULT_FLAG_RELEASE flag.
+			 * This indicates it has read-acquired mmap_sem,
+			 * and requests that it be released if we have to
+			 * wait for the page to be transferred from disk.
+			 * Caller will then retry starting with the
+			 * mmap_sem read-acquire.
+			 */
+			up_read(&vma->vm_mm->mmap_sem);
+			wait_on_page_locked(page);
+			page_cache_release(page);
+			return ret | VM_FAULT_RELEASED;
+		}
+	}
 
 	/* Did it get truncated? */
 	if (unlikely(page->mapping != mapping)) {
diff --git a/mm/memory.c b/mm/memory.c
index 0e18b4d..2efb59d 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2926,7 +2926,8 @@ static int __do_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 	vmf.page = NULL;
 
 	ret = vma->vm_ops->fault(vma, &vmf);
-	if (unlikely(ret & (VM_FAULT_ERROR | VM_FAULT_NOPAGE)))
+	if (unlikely(ret & (VM_FAULT_ERROR | VM_FAULT_NOPAGE |
+			    VM_FAULT_RELEASED)))
 		return ret;
 
 	if (unlikely(PageHWPoison(vmf.page))) {
-- 
1.7.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/2] Release mmap_sem when page fault blocks on disk transfer.
  2010-10-01  5:04 ` [PATCH 2/2] Release mmap_sem when page fault blocks on disk transfer Michel Lespinasse
@ 2010-10-01 14:06   ` Rik van Riel
  2010-10-01 15:31   ` Linus Torvalds
  1 sibling, 0 replies; 8+ messages in thread
From: Rik van Riel @ 2010-10-01 14:06 UTC (permalink / raw)
  To: Michel Lespinasse
  Cc: linux-mm, Linus Torvalds, Ying Han, linux-kernel, Andrew Morton,
	Peter Zijlstra, Hugh Dickins

On 10/01/2010 01:04 AM, Michel Lespinasse wrote:
> This change reduces mmap_sem hold times that are caused by waiting for
> disk transfers when accessing file mapped VMAs. It introduces the
> VM_FAULT_RELEASED flag, which indicates that the call site holds mmap_lock
> and wishes for it to be released if blocking on a pending disk transfer.
> In that case, filemap_fault() returns the VM_FAULT_RELEASED status bit
> and do_page_fault() will then re-acquire mmap_sem and retry the page fault.
> It is expected that the retry will hit the same page which will now be cached,
> and thus it will complete with a low mmap_sem hold time.

The concept makes sense.  A nitpick, though...

> +	if (release_flag) {	/* Did not go through a retry */
> +		if (fault&  VM_FAULT_MAJOR) {
> +			tsk->maj_flt++;
> +			perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1, 0,
> +				      regs, address);
> +		} else {
> +			tsk->min_flt++;
> +			perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1, 0,
> +				      regs, address);
> +		}
> +		if (fault&  VM_FAULT_RELEASED) {
> +			/*
> +			 * handle_mm_fault() found that the desired page was
> +			 * locked. We asked for it to release mmap_sem in that
> +			 * case, so as to avoid holding it for too long.
> +			 * Retry starting at the mmap_sem acquire, this time
> +			 * without FAULT_FLAG_RETRY so that we avoid any
> +			 * risk of starvation.
> +			 */
> +			release_flag = 0;
> +			goto retry;
> +		}

Do we really want to count a minor page fault when we
got VM_FAULT_RELEASED?

-- 
All rights reversed

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/2] Release mmap_sem when page fault blocks on disk transfer.
  2010-10-01  5:04 ` [PATCH 2/2] Release mmap_sem when page fault blocks on disk transfer Michel Lespinasse
  2010-10-01 14:06   ` Rik van Riel
@ 2010-10-01 15:31   ` Linus Torvalds
  2010-10-01 23:06     ` Michel Lespinasse
  1 sibling, 1 reply; 8+ messages in thread
From: Linus Torvalds @ 2010-10-01 15:31 UTC (permalink / raw)
  To: Michel Lespinasse
  Cc: linux-mm, Ying Han, linux-kernel, Andrew Morton, Rik van Riel,
	Nick Piggin, Peter Zijlstra, Hugh Dickins

I have nothing against the 1/2 patch, it seems nice regardless.

This one is really messy, though. I think you're making the code much
less readable (and it's not wonderful to start with). That's
unacceptable.

On Thu, Sep 30, 2010 at 10:04 PM, Michel Lespinasse <walken@google.com> wrote:
>        int fault;
> +       unsigned int release_flag = FAULT_FLAG_RELEASE;

Try this with just "flag", and make it look something like

   unsigned int flag;

   flag = FAULT_FLAG_RELEASE | (write ? FAULT_FLAG_WRITE : 0);

and just keep the whole mm_handle_fault() flags value in there. That
avoids one ugly/complex line, and makes it much easier to add other
flags if we ever do.

Also, I think the "RELEASE" naming is too much about the
implementation, not about the context. I think it would be more
sensible to call it "ALLOW_RETRY" or "ATOMIC" or something like this,
and not make it about releasing the page lock so much as about what
you want to happen.

Because quite frankly, I could imagine other reasons to allow page fault retry.

(Similarly, I would rename VM_FAULT_RELEASED to VM_FAULT_RETRY. Again:
name things for the _concept_, not for some odd implementation issue)

> -       if (fault & VM_FAULT_MAJOR) {
> -               tsk->maj_flt++;
> -               perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1, 0,
> -                                    regs, address);
> -       } else {
> -               tsk->min_flt++;
> -               perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1, 0,
> -                                    regs, address);
> +       if (release_flag) {     /* Did not go through a retry */
> +               if (fault & VM_FAULT_MAJOR) {

I really don't know if this is correct. What if you have two major
faults due to the retry? What if the first one is a minor fault, but
when we retry it's a major fault because the page got released? The
nesting of the conditionals doesn't seem to make conceptual sense.

I dunno. I can see what you're doing ("only do statistics for the
first return"), but at the same time it just feels a bit icky.

> -       lock_page(page);
> +       /* Lock the page. */
> +       if (!trylock_page(page)) {
> +               if (!(vmf->flags & FAULT_FLAG_RELEASE))
> +                       __lock_page(page);
> +               else {
> +                       /*
> +                        * Caller passed FAULT_FLAG_RELEASE flag.
> +                        * This indicates it has read-acquired mmap_sem,
> +                        * and requests that it be released if we have to
> +                        * wait for the page to be transferred from disk.
> +                        * Caller will then retry starting with the
> +                        * mmap_sem read-acquire.
> +                        */
> +                       up_read(&vma->vm_mm->mmap_sem);
> +                       wait_on_page_locked(page);
> +                       page_cache_release(page);
> +                       return ret | VM_FAULT_RELEASED;
> +               }
> +       }

I'd much rather see this abstracted out (preferably together with the
"did it get truncated" logic) into a small helper function of its own.
The main reason I say that is because I hate your propensity for
putting the comments deep inside the code. I think any code that needs
big comments at a deep indentation is fundamentally flawed.

You had the same thing in the x86 fault path. I really think it's
wrong. Needing a comment _inside_ a conditional is just nasty. You
shouldn't explain what just happened, you should explain what is
_going_ to happen, an why you do a test in the first place.

But on the whole I think that if the implementation didn't raise my
hackles so badly, I think the concept looks fine.

                                         Linus

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/2] Release mmap_sem when page fault blocks on disk transfer.
  2010-10-01 15:31   ` Linus Torvalds
@ 2010-10-01 23:06     ` Michel Lespinasse
  2010-10-02  0:02       ` Linus Torvalds
  0 siblings, 1 reply; 8+ messages in thread
From: Michel Lespinasse @ 2010-10-01 23:06 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: linux-mm, Ying Han, linux-kernel, Andrew Morton, Rik van Riel,
	Nick Piggin, Peter Zijlstra, Hugh Dickins

On Fri, Oct 1, 2010 at 8:31 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> Also, I think the "RELEASE" naming is too much about the
> implementation, not about the context. I think it would be more
> sensible to call it "ALLOW_RETRY" or "ATOMIC" or something like this,
> and not make it about releasing the page lock so much as about what
> you want to happen.
>
> Because quite frankly, I could imagine other reasons to allow page fault retry.
>
> (Similarly, I would rename VM_FAULT_RELEASED to VM_FAULT_RETRY. Again:
> name things for the _concept_, not for some odd implementation issue)

All right, I changed for your names and I think they do help. There is
still one annoyance though (and this is why I had not made this purely
about retry in the first iteration): the up_read(mmap_sem) and the
wait_on_page_locked(page) still happen within filemap_fault(). I think
ideally we would prefer to move this into do_page_fault so that the
interface could *really* be about retry; however we can't easily do
that because the struct page is not exposed at that level.

>
>> -       if (fault & VM_FAULT_MAJOR) {
>> -               tsk->maj_flt++;
>> -               perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1, 0,
>> -                                    regs, address);
>> -       } else {
>> -               tsk->min_flt++;
>> -               perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1, 0,
>> -                                    regs, address);
>> +       if (release_flag) {     /* Did not go through a retry */
>> +               if (fault & VM_FAULT_MAJOR) {
>
> I really don't know if this is correct. What if you have two major
> faults due to the retry? What if the first one is a minor fault, but
> when we retry it's a major fault because the page got released? The
> nesting of the conditionals doesn't seem to make conceptual sense.
>
> I dunno. I can see what you're doing ("only do statistics for the
> first return"), but at the same time it just feels a bit icky.

In a way filemap_fault() already has that problem - during a minor
fault, the page could go away before we have a chance to lock it, and
the fault would still be counted as minor. So I just took that
property (first find_get_page() determines if we call the fault minor
or major) and extended it into the retry case.

One reasonable alternative, I think, would be to always count the
fault as major if we had to go through the retry path. The main
difference this would make, I think, is if two threads hit the exact
same page before we get a chance to load it from disk - in which case
they would both get counted as major faults, vs the current accounting
that would charge one as major and the other one as minor.

>> -       lock_page(page);
>> +       /* Lock the page. */
>> +       if (!trylock_page(page)) {
>> +               if (!(vmf->flags & FAULT_FLAG_RELEASE))
>> +                       __lock_page(page);
>> +               else {
>> +                       /*
>> +                        * Caller passed FAULT_FLAG_RELEASE flag.
>> +                        * This indicates it has read-acquired mmap_sem,
>> +                        * and requests that it be released if we have to
>> +                        * wait for the page to be transferred from disk.
>> +                        * Caller will then retry starting with the
>> +                        * mmap_sem read-acquire

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/2] Release mmap_sem when page fault blocks on disk transfer.
  2010-10-01 23:06     ` Michel Lespinasse
@ 2010-10-02  0:02       ` Linus Torvalds
  0 siblings, 0 replies; 8+ messages in thread
From: Linus Torvalds @ 2010-10-02  0:02 UTC (permalink / raw)
  To: Michel Lespinasse
  Cc: linux-mm, Ying Han, linux-kernel, Andrew Morton, Rik van Riel,
	Nick Piggin, Peter Zijlstra, Hugh Dickins

On Fri, Oct 1, 2010 at 4:06 PM, Michel Lespinasse <walken@google.com> wrote:
>
> To be clear, is it about the helper function or about the comment
> location ? I think the code block is actually short and simple, so
> maybe if I just moved the comment up to the /* Lock the page */
> location it'd also look that way ?

I suspect that if the comment had been up-front rather than mixed deep
in the code, I wouldn't have reacted so much to it.

That said, if something can be cleanly abstracted out as a separate
operation, and a big function be split into smaller ones where the
helper functions do clearly defined things, I think that's generally a
good idea.

Personally, I tend to like comments in front of code - preferably at
the head of a function. If the function is so complex that it needs
comments inside of it, to me that's a sign that perhaps it should be
split up.

That's not _always_ true, of course. Sometimes some particular detail
in a function is what is really specific ("we don't need to use an
atomic instruction here, because xyz"). So it's not a hard rule, but
the "please explain the code _before_ it happens rather than as it
happens" is still a good guideline.

The thing I reacted to in your patch was that in both cases the
comment really explained the _conditional_, not the code inside the
conditional. So putting it inside the conditional was really at the
wrong level, and too late.

                     Linus

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 0/2] Reduce mmap_sem hold times during file backed page faults
  2010-10-01  5:04 [PATCH 0/2] Reduce mmap_sem hold times during file backed page faults Michel Lespinasse
  2010-10-01  5:04 ` [PATCH 1/2] Unique path for locking page in filemap_fault() Michel Lespinasse
  2010-10-01  5:04 ` [PATCH 2/2] Release mmap_sem when page fault blocks on disk transfer Michel Lespinasse
@ 2010-10-01 12:07 ` Rik van Riel
  2 siblings, 0 replies; 8+ messages in thread
From: Rik van Riel @ 2010-10-01 12:07 UTC (permalink / raw)
  To: Michel Lespinasse
  Cc: linux-mm, Linus Torvalds, Ying Han, linux-kernel, Andrew Morton,
	Nick Piggin, Peter Zijlstra, Hugh Dickins

On 10/01/2010 01:04 AM, Michel Lespinasse wrote:
> Linus, I would appreciate your comments on this since you shot down the
> previous proposal. I hope you'll find this approach is sane, but I would
> be interested to hear if you have specific objections.
>
> mmap_sem is very coarse grained (per process) and has long read-hold times
> (disk latencies); this breaks down rapidly for workloads that use both
> read and write mmap_sem acquires. This short patch series tries to reduce
> mmap_sem hold times when faulting in file backed VMAs.

The changes make sense to me, but it would be good to know
what kind of benefits you have seen with these patches.

Especially performance numbers :)

-- 
All rights reversed

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2010-10-02  0:03 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-10-01  5:04 [PATCH 0/2] Reduce mmap_sem hold times during file backed page faults Michel Lespinasse
2010-10-01  5:04 ` [PATCH 1/2] Unique path for locking page in filemap_fault() Michel Lespinasse
2010-10-01  5:04 ` [PATCH 2/2] Release mmap_sem when page fault blocks on disk transfer Michel Lespinasse
2010-10-01 14:06   ` Rik van Riel
2010-10-01 15:31   ` Linus Torvalds
2010-10-01 23:06     ` Michel Lespinasse
2010-10-02  0:02       ` Linus Torvalds
2010-10-01 12:07 ` [PATCH 0/2] Reduce mmap_sem hold times during file backed page faults Rik van Riel

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).