* [PATCH 1/1] arch/arm/mm/fault.c: Porting OOM changes into __do_page_fault
From: Kautuk Consul @ 2011-11-12 18:33 UTC (permalink / raw)
To: linux-arm-kernel
Commits d065bd810b6deb67d4897a14bfe21f8eb526ba99 and
37b23e0525d393d48a7d59f870b3bc061a30ccdb changed the x86 page fault
handler to make it retryable as well as killable.

These changes reduce the mmap_sem hold time (on x86), which is crucial
during OOM killer invocation.

This patch ports those changes to ARM.

Without these changes, my ARM board encounters many hang and livelock
scenarios.
After applying this patch, OOM killer performance improves according to
my testing.
Motivation for porting these changes:
------------------------------------
Embedded devices such as smart TVs and smartphones will, in the near
future, be able to download and run apps from the internet.
Because of this, the device user might run some malicious app that
allocates too much memory.
In that case, OOM killer performance is very important so that the
device can free up memory for other apps as well as the kernel.
Signed-off-by: Kautuk Consul <consul.kautuk@gmail.com>
---
arch/arm/mm/fault.c | 57 ++++++++++++++++++++++++++++++++++++++------------
1 files changed, 43 insertions(+), 14 deletions(-)
diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c
index aa33949..f251ec1 100644
--- a/arch/arm/mm/fault.c
+++ b/arch/arm/mm/fault.c
@@ -231,11 +231,15 @@ static inline bool access_error(unsigned int fsr, struct vm_area_struct *vma)
static int __kprobes
__do_page_fault(struct mm_struct *mm, unsigned long addr, unsigned int fsr,
- struct task_struct *tsk)
+ struct pt_regs *regs, struct task_struct *tsk)
{
struct vm_area_struct *vma;
int fault;
+ int write = fsr & FSR_WRITE;
+ unsigned int flags = FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_KILLABLE |
+ (write ? FAULT_FLAG_WRITE : 0);
+retry:
vma = find_vma(mm, addr);
fault = VM_FAULT_BADMAP;
if (unlikely(!vma))
@@ -257,13 +261,44 @@ good_area:
* If for any reason at all we couldn't handle the fault, make
* sure we exit gracefully rather than endlessly redo the fault.
*/
- fault = handle_mm_fault(mm, vma, addr & PAGE_MASK, (fsr & FSR_WRITE) ? FAULT_FLAG_WRITE : 0);
- if (unlikely(fault & VM_FAULT_ERROR))
+ fault = handle_mm_fault(mm, vma, addr & PAGE_MASK, flags);
+
+ if (unlikely((fault & VM_FAULT_ERROR)))
return fault;
- if (fault & VM_FAULT_MAJOR)
- tsk->maj_flt++;
- else
- tsk->min_flt++;
+
+ if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current))
+ return fault;
+
+ /*
+ * Major/minor page fault accounting is only done on the
+ * initial attempt. If we go through a retry, it is extremely
+ * likely that the page will be found in page cache at that point.
+ */
+ perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, addr);
+ if (flags & FAULT_FLAG_ALLOW_RETRY) {
+ if (fault & VM_FAULT_MAJOR) {
+ tsk->maj_flt++;
+ perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1,
+ regs, addr);
+ } else {
+ tsk->min_flt++;
+ perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1,
+ regs, addr);
+ }
+ if (fault & VM_FAULT_RETRY) {
+ /* Clear FAULT_FLAG_ALLOW_RETRY to avoid any risk
+ * of starvation. */
+ flags &= ~FAULT_FLAG_ALLOW_RETRY;
+
+ /* Acquire the mmap_sem again before retrying this
+ * pagefault. This would have been released by
+ * __lock_page_or_retry() in mm/filemap.c. */
+ down_read(&mm->mmap_sem);
+
+ goto retry;
+ }
+ }
+
return fault;
check_stack:
@@ -320,15 +355,9 @@ do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
#endif
}
- fault = __do_page_fault(mm, addr, fsr, tsk);
+ fault = __do_page_fault(mm, addr, fsr, regs, tsk);
up_read(&mm->mmap_sem);
- perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, addr);
- if (fault & VM_FAULT_MAJOR)
- perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1, regs, addr);
- else if (fault & VM_FAULT_MINOR)
- perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1, regs, addr);
-
/*
* Handle the "normal" case first - VM_FAULT_MAJOR / VM_FAULT_MINOR
*/
--
1.7.5.4
* [PATCH 1/1] arch/arm/mm/fault.c: Porting OOM changes into __do_page_fault
From: kautuk.c @samsung.com @ 2011-11-12 23:01 UTC (permalink / raw)
To: linux-arm-kernel
Sorry, please ignore this patch.
There is one error in it, which I discovered through code review:
lockdep will detect a lock imbalance in the up_read() inside
do_page_fault() if __do_page_fault() returns with VM_FAULT_RETRY set in
the fault variable. This will happen in the fatal-signal-handling case
of __do_page_fault(), because by that point __lock_page_or_retry() has
already released the mmap_sem.
I'll post another patch which takes care of this.
* [PATCH 1/1] arch/arm/mm/fault.c: Porting OOM changes into __do_page_fault
From: Kautuk Consul @ 2011-11-12 23:03 UTC (permalink / raw)
To: linux-arm-kernel
Commits d065bd810b6deb67d4897a14bfe21f8eb526ba99 and
37b23e0525d393d48a7d59f870b3bc061a30ccdb changed the x86 page fault
handler to make it retryable as well as killable.

These changes reduce the mmap_sem hold time (on x86), which is crucial
during OOM killer invocation.

This patch ports those changes to ARM.

Without these changes, my ARM board encounters many hang and livelock
scenarios.
After applying this patch, OOM killer performance improves according to
my testing.
Motivation for porting these changes:
------------------------------------
Embedded devices such as smart TVs and smartphones will, in the near
future, be able to download and run apps from the internet.
Because of this, the device user might run some malicious app that
allocates too much memory.
In that case, OOM killer performance is very important so that the
device can free up memory for other apps as well as the kernel.
Signed-off-by: Kautuk Consul <consul.kautuk@gmail.com>
---
arch/arm/mm/fault.c | 60 ++++++++++++++++++++++++++++++++++++++------------
1 files changed, 45 insertions(+), 15 deletions(-)
diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c
index aa33949..2f89dba 100644
--- a/arch/arm/mm/fault.c
+++ b/arch/arm/mm/fault.c
@@ -231,11 +231,15 @@ static inline bool access_error(unsigned int fsr, struct vm_area_struct *vma)
static int __kprobes
__do_page_fault(struct mm_struct *mm, unsigned long addr, unsigned int fsr,
- struct task_struct *tsk)
+ struct pt_regs *regs, struct task_struct *tsk)
{
struct vm_area_struct *vma;
int fault;
+ int write = fsr & FSR_WRITE;
+ unsigned int flags = FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_KILLABLE |
+ (write ? FAULT_FLAG_WRITE : 0);
+retry:
vma = find_vma(mm, addr);
fault = VM_FAULT_BADMAP;
if (unlikely(!vma))
@@ -257,13 +261,44 @@ good_area:
* If for any reason at all we couldn't handle the fault, make
* sure we exit gracefully rather than endlessly redo the fault.
*/
- fault = handle_mm_fault(mm, vma, addr & PAGE_MASK, (fsr & FSR_WRITE) ? FAULT_FLAG_WRITE : 0);
- if (unlikely(fault & VM_FAULT_ERROR))
+ fault = handle_mm_fault(mm, vma, addr & PAGE_MASK, flags);
+
+ if (unlikely((fault & VM_FAULT_ERROR)))
return fault;
- if (fault & VM_FAULT_MAJOR)
- tsk->maj_flt++;
- else
- tsk->min_flt++;
+
+ if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current))
+ return fault;
+
+ /*
+ * Major/minor page fault accounting is only done on the
+ * initial attempt. If we go through a retry, it is extremely
+ * likely that the page will be found in page cache at that point.
+ */
+ perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, addr);
+ if (flags & FAULT_FLAG_ALLOW_RETRY) {
+ if (fault & VM_FAULT_MAJOR) {
+ tsk->maj_flt++;
+ perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1,
+ regs, addr);
+ } else {
+ tsk->min_flt++;
+ perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1,
+ regs, addr);
+ }
+ if (fault & VM_FAULT_RETRY) {
+ /* Clear FAULT_FLAG_ALLOW_RETRY to avoid any risk
+ * of starvation. */
+ flags &= ~FAULT_FLAG_ALLOW_RETRY;
+
+ /* Acquire the mmap_sem again before retrying this
+ * pagefault. This would have been released by
+ * __lock_page_or_retry() in mm/filemap.c. */
+ down_read(&mm->mmap_sem);
+
+ goto retry;
+ }
+ }
+
return fault;
check_stack:
@@ -320,14 +355,9 @@ do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
#endif
}
- fault = __do_page_fault(mm, addr, fsr, tsk);
- up_read(&mm->mmap_sem);
-
- perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, addr);
- if (fault & VM_FAULT_MAJOR)
- perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1, regs, addr);
- else if (fault & VM_FAULT_MINOR)
- perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1, regs, addr);
+ fault = __do_page_fault(mm, addr, fsr, regs, tsk);
+ if (unlikely(!(fault & VM_FAULT_RETRY)))
+ up_read(&mm->mmap_sem);
/*
* Handle the "normal" case first - VM_FAULT_MAJOR / VM_FAULT_MINOR
--
1.7.5.4
* [PATCH 1/1] arch/arm/mm/fault.c: Porting OOM changes into __do_page_fault
From: kautuk.c @samsung.com @ 2011-11-12 23:07 UTC (permalink / raw)
To: linux-arm-kernel
Argh, please ignore this patch, too.
I should be using likely() and not unlikely() in the check I introduced
for VM_FAULT_RETRY.
I'll post one more... :(
* [PATCH 1/1] arch/arm/mm/fault.c: Porting OOM changes into __do_page_fault
From: Kautuk Consul @ 2011-11-12 23:08 UTC (permalink / raw)
To: linux-arm-kernel
Commits d065bd810b6deb67d4897a14bfe21f8eb526ba99 and
37b23e0525d393d48a7d59f870b3bc061a30ccdb changed the x86 page fault
handler to make it retryable as well as killable.

These changes reduce the mmap_sem hold time (on x86), which is crucial
during OOM killer invocation.

This patch ports those changes to ARM.

Without these changes, my ARM board encounters many hang and livelock
scenarios.
After applying this patch, OOM killer performance improves according to
my testing.
Motivation for porting these changes:
------------------------------------
Embedded devices such as smart TVs and smartphones will, in the near
future, be able to download and run apps from the internet.
Because of this, the device user might run some malicious app that
allocates too much memory.
In that case, OOM killer performance is very important so that the
device can free up memory for other apps as well as the kernel.
Signed-off-by: Kautuk Consul <consul.kautuk@gmail.com>
---
arch/arm/mm/fault.c | 60 ++++++++++++++++++++++++++++++++++++++------------
1 files changed, 45 insertions(+), 15 deletions(-)
diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c
index aa33949..2f89dba 100644
--- a/arch/arm/mm/fault.c
+++ b/arch/arm/mm/fault.c
@@ -231,11 +231,15 @@ static inline bool access_error(unsigned int fsr, struct vm_area_struct *vma)
static int __kprobes
__do_page_fault(struct mm_struct *mm, unsigned long addr, unsigned int fsr,
- struct task_struct *tsk)
+ struct pt_regs *regs, struct task_struct *tsk)
{
struct vm_area_struct *vma;
int fault;
+ int write = fsr & FSR_WRITE;
+ unsigned int flags = FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_KILLABLE |
+ (write ? FAULT_FLAG_WRITE : 0);
+retry:
vma = find_vma(mm, addr);
fault = VM_FAULT_BADMAP;
if (unlikely(!vma))
@@ -257,13 +261,44 @@ good_area:
* If for any reason at all we couldn't handle the fault, make
* sure we exit gracefully rather than endlessly redo the fault.
*/
- fault = handle_mm_fault(mm, vma, addr & PAGE_MASK, (fsr & FSR_WRITE) ? FAULT_FLAG_WRITE : 0);
- if (unlikely(fault & VM_FAULT_ERROR))
+ fault = handle_mm_fault(mm, vma, addr & PAGE_MASK, flags);
+
+ if (unlikely((fault & VM_FAULT_ERROR)))
return fault;
- if (fault & VM_FAULT_MAJOR)
- tsk->maj_flt++;
- else
- tsk->min_flt++;
+
+ if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current))
+ return fault;
+
+ /*
+ * Major/minor page fault accounting is only done on the
+ * initial attempt. If we go through a retry, it is extremely
+ * likely that the page will be found in page cache at that point.
+ */
+ perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, addr);
+ if (flags & FAULT_FLAG_ALLOW_RETRY) {
+ if (fault & VM_FAULT_MAJOR) {
+ tsk->maj_flt++;
+ perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1,
+ regs, addr);
+ } else {
+ tsk->min_flt++;
+ perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1,
+ regs, addr);
+ }
+ if (fault & VM_FAULT_RETRY) {
+ /* Clear FAULT_FLAG_ALLOW_RETRY to avoid any risk
+ * of starvation. */
+ flags &= ~FAULT_FLAG_ALLOW_RETRY;
+
+ /* Acquire the mmap_sem again before retrying this
+ * pagefault. This would have been released by
+ * __lock_page_or_retry() in mm/filemap.c. */
+ down_read(&mm->mmap_sem);
+
+ goto retry;
+ }
+ }
+
return fault;
check_stack:
@@ -320,14 +355,9 @@ do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
#endif
}
- fault = __do_page_fault(mm, addr, fsr, tsk);
- up_read(&mm->mmap_sem);
-
- perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, addr);
- if (fault & VM_FAULT_MAJOR)
- perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1, regs, addr);
- else if (fault & VM_FAULT_MINOR)
- perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1, regs, addr);
+ fault = __do_page_fault(mm, addr, fsr, regs, tsk);
+ if (likely(!(fault & VM_FAULT_RETRY)))
+ up_read(&mm->mmap_sem);
/*
* Handle the "normal" case first - VM_FAULT_MAJOR / VM_FAULT_MINOR
--
1.7.5.4
* [PATCH 1/1] arch/arm/mm/fault.c: Porting OOM changes into __do_page_fault
From: Russell King - ARM Linux @ 2011-11-12 23:20 UTC (permalink / raw)
To: linux-arm-kernel
On Sat, Nov 12, 2011 at 06:08:03PM -0500, Kautuk Consul wrote:
> @@ -320,14 +355,9 @@ do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
> #endif
> }
>
> - fault = __do_page_fault(mm, addr, fsr, tsk);
> - up_read(&mm->mmap_sem);
> -
> - perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, addr);
> - if (fault & VM_FAULT_MAJOR)
> - perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1, regs, addr);
> - else if (fault & VM_FAULT_MINOR)
> - perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1, regs, addr);
> + fault = __do_page_fault(mm, addr, fsr, regs, tsk);
> + if (likely(!(fault & VM_FAULT_RETRY)))
> + up_read(&mm->mmap_sem);
I really don't like this. I crafted this handling in such a way that
the locking was plainly obvious - with all locking handled in
do_page_fault and not inside __do_page_fault. That's how I want things
to stay, so please rework this patch to maintain that.
* [PATCH 1/1] arch/arm/mm/fault.c: Porting OOM changes into __do_page_fault
From: kautuk.c @samsung.com @ 2011-11-12 23:56 UTC (permalink / raw)
To: linux-arm-kernel
On Sat, Nov 12, 2011 at 6:20 PM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Sat, Nov 12, 2011 at 06:08:03PM -0500, Kautuk Consul wrote:
>> +       fault = __do_page_fault(mm, addr, fsr, regs, tsk);
>> +       if (likely(!(fault & VM_FAULT_RETRY)))
>> +               up_read(&mm->mmap_sem);
>
> I really don't like this.  I crafted this handling in such a way that
> the locking was plainly obvious - with all locking handled in
> do_page_fault and not inside __do_page_fault.  That's how I want things
> to stay, so please rework this patch to maintain that.
I understand your concern.
However, the entire concept of retryable and killable page fault
handlers somewhat disturbs the locking that was there earlier.
(Please look at the commits I have mentioned in the patch.)

There will in any case have to be a check somewhere in the page fault
handler code to see whether VM_FAULT_RETRY was returned, and to do the
up_read() only when it wasn't. The reason for this is that the mmap_sem
is released in __lock_page_or_retry() in mm/filemap.c.

Can you shed some more light on what you would find more acceptable in
the locking mechanism?
>
* [PATCH 1/1] arch/arm/mm/fault.c: Porting OOM changes into __do_page_fault
2011-11-12 23:56 ` kautuk.c @samsung.com
@ 2011-11-13 0:51 ` kautuk.c @samsung.com
0 siblings, 0 replies; 8+ messages in thread
From: kautuk.c @samsung.com @ 2011-11-13 0:51 UTC (permalink / raw)
To: linux-arm-kernel
Anyway, I have created a version 2 of this patch with all the locking
kept inside do_page_fault itself. I will send it right after this
email. I would appreciate it if you could review that.
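The v2 shape described above can be sketched as follows: the retry loop and every down_read/up_read live in the caller, and the fault core only reports what happened. This is a hedged illustration under stated assumptions, not the actual v2 patch; `sim_fault_core`, `VM_FAULT_DONE`, and the flag values are made-up names for the simulation.

```c
#include <assert.h>

/* Illustrative stand-ins; values are arbitrary, not the kernel's. */
#define FAULT_FLAG_ALLOW_RETRY 0x01
#define VM_FAULT_RETRY         0x02
#define VM_FAULT_DONE          0x04  /* hypothetical "completed" result */

static int lock_depth;  /* models mmap_sem read-hold: 1 = held, 0 = free */
static int attempts;

static void down_read_sim(void) { assert(lock_depth == 0); lock_depth = 1; }
static void up_read_sim(void)   { assert(lock_depth == 1); lock_depth = 0; }

/* Fault core: when a retry is allowed, the mm core (conceptually
 * __lock_page_or_retry()) releases mmap_sem and asks to be re-run. */
static int sim_fault_core(unsigned int flags)
{
	attempts++;
	if (flags & FAULT_FLAG_ALLOW_RETRY) {
		lock_depth = 0;          /* lock dropped deep inside mm/ */
		return VM_FAULT_RETRY;
	}
	return VM_FAULT_DONE;            /* finished; lock still held */
}

/* v2 shape: every lock operation is visible here, in the caller. */
static int sim_do_page_fault(void)
{
	unsigned int flags = FAULT_FLAG_ALLOW_RETRY;
	int fault;

	down_read_sim();
retry:
	fault = sim_fault_core(flags);
	if (fault & VM_FAULT_RETRY) {
		flags &= ~FAULT_FLAG_ALLOW_RETRY; /* retry only once */
		down_read_sim();                  /* reacquire before retry */
		goto retry;
	}
	up_read_sim();
	return fault;
}
```

This keeps the lock/unlock pairing readable at a single call site, which is the property the review asked to preserve.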
On Sat, Nov 12, 2011 at 6:56 PM, kautuk.c @samsung.com
<consul.kautuk@gmail.com> wrote:
> On Sat, Nov 12, 2011 at 6:20 PM, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
>> On Sat, Nov 12, 2011 at 06:08:03PM -0500, Kautuk Consul wrote:
>>> diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c
>>> index aa33949..2f89dba 100644
>>> --- a/arch/arm/mm/fault.c
>>> +++ b/arch/arm/mm/fault.c
>>> @@ -231,11 +231,15 @@ static inline bool access_error(unsigned int fsr, struct vm_area_struct *vma)
>>>
>>>  static int __kprobes
>>>  __do_page_fault(struct mm_struct *mm, unsigned long addr, unsigned int fsr,
>>> -		struct task_struct *tsk)
>>> +		struct pt_regs *regs, struct task_struct *tsk)
>>>  {
>>> 	struct vm_area_struct *vma;
>>> 	int fault;
>>> +	int write = fsr & FSR_WRITE;
>>> +	unsigned int flags = FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_KILLABLE |
>>> +				(write ? FAULT_FLAG_WRITE : 0);
>>>
>>> +retry:
>>> 	vma = find_vma(mm, addr);
>>> 	fault = VM_FAULT_BADMAP;
>>> 	if (unlikely(!vma))
>>> @@ -257,13 +261,44 @@ good_area:
>>> 	 * If for any reason at all we couldn't handle the fault, make
>>> 	 * sure we exit gracefully rather than endlessly redo the fault.
>>> 	 */
>>> -	fault = handle_mm_fault(mm, vma, addr & PAGE_MASK, (fsr & FSR_WRITE) ? FAULT_FLAG_WRITE : 0);
>>> -	if (unlikely(fault & VM_FAULT_ERROR))
>>> +	fault = handle_mm_fault(mm, vma, addr & PAGE_MASK, flags);
>>> +
>>> +	if (unlikely((fault & VM_FAULT_ERROR)))
>>> 		return fault;
>>> -	if (fault & VM_FAULT_MAJOR)
>>> -		tsk->maj_flt++;
>>> -	else
>>> -		tsk->min_flt++;
>>> +
>>> +	if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current))
>>> +		return fault;
>>> +
>>> +	/*
>>> +	 * Major/minor page fault accounting is only done on the
>>> +	 * initial attempt. If we go through a retry, it is extremely
>>> +	 * likely that the page will be found in page cache at that point.
>>> +	 */
>>> +	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, addr);
>>> +	if (flags & FAULT_FLAG_ALLOW_RETRY) {
>>> +		if (fault & VM_FAULT_MAJOR) {
>>> +			tsk->maj_flt++;
>>> +			perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1,
>>> +					regs, addr);
>>> +		} else {
>>> +			tsk->min_flt++;
>>> +			perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1,
>>> +					regs, addr);
>>> +		}
>>> +		if (fault & VM_FAULT_RETRY) {
>>> +			/* Clear FAULT_FLAG_ALLOW_RETRY to avoid any risk
>>> +			 * of starvation. */
>>> +			flags &= ~FAULT_FLAG_ALLOW_RETRY;
>>> +
>>> +			/* Acquire the mmap_sem again before retrying this
>>> +			 * pagefault. This would have been released by
>>> +			 * __lock_page_or_retry() in mm/filemap.c. */
>>> +			down_read(&mm->mmap_sem);
>>> +
>>> +			goto retry;
>>> +		}
>>> +	}
>>> +
>>> 	return fault;
>>>
>>>  check_stack:
>>> @@ -320,14 +355,9 @@ do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
>>>  #endif
>>> 	}
>>>
>>> -	fault = __do_page_fault(mm, addr, fsr, tsk);
>>> -	up_read(&mm->mmap_sem);
>>> -
>>> -	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, addr);
>>> -	if (fault & VM_FAULT_MAJOR)
>>> -		perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1, regs, addr);
>>> -	else if (fault & VM_FAULT_MINOR)
>>> -		perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1, regs, addr);
>>> +	fault = __do_page_fault(mm, addr, fsr, regs, tsk);
>>> +	if (likely(!(fault & VM_FAULT_RETRY)))
>>> +		up_read(&mm->mmap_sem);
>>
>> I really don't like this.  I crafted this handling in such a way that
>> the locking was plainly obvious - with all locking handled in
>> do_page_fault and not inside __do_page_fault.  That's how I want things
>> to stay, so please rework this patch to maintain that.
>
>
> I understand your concern.
>
> However, the concept of retryable and killable page fault handlers
> changes the locking that was there earlier (please see the commits
> mentioned in the patch).
>
> There will in any case have to be a check somewhere in the page fault
> handler code to test whether VM_FAULT_RETRY was returned, and to call
> up_read() only when it wasn't. The reason is that mmap_sem is released
> in __lock_page_or_retry() in mm/filemap.c.
>
> Can you shed some more light on what you would find more acceptable in
> the locking mechanism?
>
>>
>
end of thread, other threads:[~2011-11-13 0:51 UTC | newest]
Thread overview: 8+ messages
2011-11-12 23:08 [PATCH 1/1] arch/arm/mm/fault.c: Porting OOM changes into __do_page_fault Kautuk Consul
2011-11-12 23:20 ` Russell King - ARM Linux
2011-11-12 23:56 ` kautuk.c @samsung.com
2011-11-13 0:51 ` kautuk.c @samsung.com
-- strict thread matches above, loose matches on Subject: below --
2011-11-12 23:03 Kautuk Consul
2011-11-12 23:07 ` kautuk.c @samsung.com
2011-11-12 18:33 Kautuk Consul
2011-11-12 23:01 ` kautuk.c @samsung.com