* [PATCH 09/19 v2] ia64/mm/fault.c: Port OOM changes to ia64_do_page_fault
@ 2012-03-31 12:03 Kautuk Consul
2012-06-12 15:54 ` Kautuk Consul
` (4 more replies)
0 siblings, 5 replies; 6+ messages in thread
From: Kautuk Consul @ 2012-03-31 12:03 UTC (permalink / raw)
To: Tony Luck, Fenghua Yu; +Cc: linux-ia64, linux-kernel, Kautuk Consul
Commit d065bd810b6deb67d4897a14bfe21f8eb526ba99
(mm: retry page fault when blocking on disk transfer) and
commit 37b23e0525d393d48a7d59f870b3bc061a30ccdb
(x86,mm: make pagefault killable)
introduced changes into the x86 page fault handler
to make it retryable as well as killable.
These changes reduce the mmap_sem hold time, which is crucial
during OOM killer invocation.
Port these changes to ia64.
Signed-off-by: Kautuk Consul <consul.kautuk@gmail.com>
---
arch/ia64/mm/fault.c | 38 ++++++++++++++++++++++++++++++--------
1 files changed, 30 insertions(+), 8 deletions(-)
diff --git a/arch/ia64/mm/fault.c b/arch/ia64/mm/fault.c
index 02d29c2..b0c3b29 100644
--- a/arch/ia64/mm/fault.c
+++ b/arch/ia64/mm/fault.c
@@ -81,6 +81,12 @@ ia64_do_page_fault (unsigned long address, unsigned long isr, struct pt_regs *re
struct siginfo si;
unsigned long mask;
int fault;
+ unsigned int flags = FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_KILLABLE;
+
+ mask = ((((isr >> IA64_ISR_X_BIT) & 1UL) << VM_EXEC_BIT)
+ | (((isr >> IA64_ISR_W_BIT) & 1UL) << VM_WRITE_BIT));
+
+ flags |= ((mask & VM_WRITE) ? FAULT_FLAG_WRITE : 0);
/* mmap_sem is performance critical.... */
prefetchw(&mm->mmap_sem);
@@ -109,6 +115,7 @@ ia64_do_page_fault (unsigned long address, unsigned long isr, struct pt_regs *re
if (notify_page_fault(regs, TRAP_BRKPT))
return;
+retry:
down_read(&mm->mmap_sem);
vma = find_vma_prev(mm, address, &prev_vma);
@@ -142,9 +149,6 @@ ia64_do_page_fault (unsigned long address, unsigned long isr, struct pt_regs *re
if (((isr >> IA64_ISR_R_BIT) & 1UL) && (!(vma->vm_flags & (VM_READ | VM_WRITE))))
goto bad_area;
- mask = ( (((isr >> IA64_ISR_X_BIT) & 1UL) << VM_EXEC_BIT)
- | (((isr >> IA64_ISR_W_BIT) & 1UL) << VM_WRITE_BIT));
-
if ((vma->vm_flags & mask) != mask)
goto bad_area;
@@ -153,7 +157,11 @@ ia64_do_page_fault (unsigned long address, unsigned long isr, struct pt_regs *re
* sure we exit gracefully rather than endlessly redo the
* fault.
*/
- fault = handle_mm_fault(mm, vma, address, (mask & VM_WRITE) ? FAULT_FLAG_WRITE : 0);
+ fault = handle_mm_fault(mm, vma, address, flags);
+
+ if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current))
+ return;
+
if (unlikely(fault & VM_FAULT_ERROR)) {
/*
* We ran out of memory, or some other thing happened
@@ -168,10 +176,24 @@ ia64_do_page_fault (unsigned long address, unsigned long isr, struct pt_regs *re
}
BUG();
}
- if (fault & VM_FAULT_MAJOR)
- current->maj_flt++;
- else
- current->min_flt++;
+
+ if (flags & FAULT_FLAG_ALLOW_RETRY) {
+ if (fault & VM_FAULT_MAJOR)
+ current->maj_flt++;
+ else
+ current->min_flt++;
+ if (fault & VM_FAULT_RETRY) {
+ flags &= ~FAULT_FLAG_ALLOW_RETRY;
+
+ /* No need to up_read(&mm->mmap_sem) as we would
+ * have already released it in __lock_page_or_retry
+ * in mm/filemap.c.
+ */
+
+ goto retry;
+ }
+ }
+
up_read(&mm->mmap_sem);
return;
--
1.7.5.4
* Re: [PATCH 09/19 v2] ia64/mm/fault.c: Port OOM changes to ia64_do_page_fault
2012-03-31 12:03 [PATCH 09/19 v2] ia64/mm/fault.c: Port OOM changes to ia64_do_page_fault Kautuk Consul
@ 2012-06-12 15:54 ` Kautuk Consul
2012-06-13 20:50 ` Tony Luck
` (3 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: Kautuk Consul @ 2012-06-12 15:54 UTC (permalink / raw)
To: linux-ia64
Hi,
Can you please review this patch?
Many archs have accepted this change, so I think it is okay. :)
Thanks,
Kautuk.
On Sat, Mar 31, 2012 at 5:33 PM, Kautuk Consul <consul.kautuk@gmail.com> wrote:
> [full patch quoted in the original message above; snipped]
* Re: [PATCH 09/19 v2] ia64/mm/fault.c: Port OOM changes to ia64_do_page_fault
2012-03-31 12:03 [PATCH 09/19 v2] ia64/mm/fault.c: Port OOM changes to ia64_do_page_fault Kautuk Consul
2012-06-12 15:54 ` Kautuk Consul
@ 2012-06-13 20:50 ` Tony Luck
2012-06-14 16:29 ` Kautuk Consul
` (2 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: Tony Luck @ 2012-06-13 20:50 UTC (permalink / raw)
To: linux-ia64
On Tue, Jun 12, 2012 at 8:51 AM, Kautuk Consul <consul.kautuk@gmail.com> wrote:
>> + mask = ((((isr >> IA64_ISR_X_BIT) & 1UL) << VM_EXEC_BIT)
>> + | (((isr >> IA64_ISR_W_BIT) & 1UL) << VM_WRITE_BIT));
You added these uses of VM_EXEC_BIT and VM_WRITE_BIT ... but
these macros are not defined until a few lines later in this file. I moved
them (and the VM_READ_BIT definition) up to just before the
start of ia64_do_page_fault() to fix the compilation errors. No need
to re-submit a fixed patch.
Do you know if the test program mentioned in
d065bd810b6deb67d4897a14bfe21f8eb526ba99:
- microbenchmark: thread A mmaps a large file and does random read accesses
to the mmaped area - achieves about 55 iterations/s. Thread B does
mmap/munmap in a loop at a separate location - achieves 55 iterations/s
before, 15000 iterations/s after.
was posted? I'd like to make sure that ia64 is seeing the benefit.
-Tony
* Re: [PATCH 09/19 v2] ia64/mm/fault.c: Port OOM changes to ia64_do_page_fault
2012-03-31 12:03 [PATCH 09/19 v2] ia64/mm/fault.c: Port OOM changes to ia64_do_page_fault Kautuk Consul
2012-06-12 15:54 ` Kautuk Consul
2012-06-13 20:50 ` Tony Luck
@ 2012-06-14 16:29 ` Kautuk Consul
2012-06-14 16:43 ` Luck, Tony
2012-06-14 16:53 ` Kautuk Consul
4 siblings, 0 replies; 6+ messages in thread
From: Kautuk Consul @ 2012-06-14 16:29 UTC (permalink / raw)
To: linux-ia64
On Thu, Jun 14, 2012 at 2:20 AM, Tony Luck <tony.luck@gmail.com> wrote:
> On Tue, Jun 12, 2012 at 8:51 AM, Kautuk Consul <consul.kautuk@gmail.com> wrote:
>>> + mask = ((((isr >> IA64_ISR_X_BIT) & 1UL) << VM_EXEC_BIT)
>>> + | (((isr >> IA64_ISR_W_BIT) & 1UL) << VM_WRITE_BIT));
>
> You added these uses of VM_EXEC_BIT and VM_WRITE_BIT ... but
> these macros are not defined until a few lines later in this file. I moved
> them (and the VM_READ_BIT definition) up to just before the
> start of ia64_do_page_fault() to fix the compilation errors. No need
> to re-submit a fixed patch.
>
Thanks for the help, Tony!
Terribly sorry for my mistakes, as I don't have an ia64 system.
> Do you know if the test program mentioned in
> d065bd810b6deb67d4897a14bfe21f8eb526ba99:
> - microbenchmark: thread A mmaps a large file and does random read accesses
> to the mmaped area - achieves about 55 iterations/s. Thread B does
> mmap/munmap in a loop at a separate location - achieves 55 iterations/s
> before, 15000 iterations/s after.
> was posted? I'd like to make sure that ia64 is seeing the benefit.
I couldn't find any test case for that, but I tested on ARM and
MIPS, where I was seeing hangs and livelocks.
Maybe you can try my test case at https://lkml.org/lkml/2012/3/31/90 ?
>
> -Tony
* RE: [PATCH 09/19 v2] ia64/mm/fault.c: Port OOM changes to ia64_do_page_fault
2012-03-31 12:03 [PATCH 09/19 v2] ia64/mm/fault.c: Port OOM changes to ia64_do_page_fault Kautuk Consul
` (2 preceding siblings ...)
2012-06-14 16:29 ` Kautuk Consul
@ 2012-06-14 16:43 ` Luck, Tony
2012-06-14 16:53 ` Kautuk Consul
4 siblings, 0 replies; 6+ messages in thread
From: Luck, Tony @ 2012-06-14 16:43 UTC (permalink / raw)
To: linux-ia64
[-- Attachment #1: Type: text/plain, Size: 650 bytes --]
> Maybe you can try my test-case at https://lkml.org/lkml/2012/3/31/90 ?
There seem to be a few cut & paste errors in that version (threads 3, 4, 5
think they are "Thread 2"). Also there is no check for malloc failing, which,
since it is done in an infinite loop, seems inevitable - unless your machines
have infinite memory :-)
Fixed up version attached. Seems to run without any lockups on a kernel
with your patch applied until either the malloc's start to fail, or the OOM
killer zaps it.
I put your patch into my "next" tree - it's in linux-next tag "next-20120614"
and I'll push to Linus in the next merge window.
Thanks
-Tony
[-- Attachment #2: kautuk.c --]
[-- Type: text/plain, Size: 4420 bytes --]
/* https://lkml.org/lkml/2012/3/31/90 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <pthread.h>
#include <unistd.h>

#define ALLOC_BYTE (512 * 1024)
#define COUNT 50

/*
 * All five threads run the same allocation loop: blindly allocate and
 * touch memory until malloc fails or the OOM killer intervenes.
 */
void *alloc_function(void *ptr)
{
	char *message = (char *) ptr;
	void *myblock[COUNT];
	int i;

	printf("message_alloc \n");
	while (1) {
		memset(myblock, 0, sizeof(myblock));
		printf("message_alloc %s \n", message);
		for (i = 0; i < COUNT; i++) {
			myblock[i] = malloc(ALLOC_BYTE);
			if (myblock[i] == NULL) {
				printf("malloc failed in %s\n", message);
				return NULL;
			}
			memset(myblock[i], 1, ALLOC_BYTE);
		}
	}
}

int main(int argc, char *argv[])
{
	pthread_t thread[5];
	char *message[5] = {
		"Thread 1", "Thread 2", "Thread 3", "Thread 4", "Thread 5"
	};
	int iret[5];
	int i;

	fork();		/* double the pressure: two processes, five threads each */

	for (i = 0; i < 5; i++)
		iret[i] = pthread_create(&thread[i], NULL, alloc_function,
					 (void *) message[i]);
	for (i = 0; i < 5; i++)
		pthread_join(thread[i], NULL);
	for (i = 0; i < 5; i++)
		printf("Thread %d returns: %d\n", i + 1, iret[i]);
	exit(0);
}
* Re: [PATCH 09/19 v2] ia64/mm/fault.c: Port OOM changes to ia64_do_page_fault
2012-03-31 12:03 [PATCH 09/19 v2] ia64/mm/fault.c: Port OOM changes to ia64_do_page_fault Kautuk Consul
` (3 preceding siblings ...)
2012-06-14 16:43 ` Luck, Tony
@ 2012-06-14 16:53 ` Kautuk Consul
4 siblings, 0 replies; 6+ messages in thread
From: Kautuk Consul @ 2012-06-14 16:53 UTC (permalink / raw)
To: linux-ia64
On Thu, Jun 14, 2012 at 10:13 PM, Luck, Tony <tony.luck@intel.com> wrote:
>> Maybe you can try my test-case at https://lkml.org/lkml/2012/3/31/90 ?
>
> There seem to be a few cut & paste errors in that version (threads 3, 4, 5
> think they are "Thread 2"). Also no check for malloc failing (which since
> it is done in an infinite loop seems inevitable - unless your machines have
> infinite memory :-)
Yes. This test case was written in a hurry to run on ARM; it was just
meant to blindly allocate and fault pages until OOM happens.
So sorry about that!
Evidently on ARM we OOM well before the mallocs start failing. That
may be due to some overcommit_memory setting which I had on that board.
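For reference, the overcommit policy can be inspected and changed through procfs/sysctl; the actual setting on that board is unknown, so the value below is only illustrative:

```shell
# Show the current overcommit policy:
#   0 = heuristic overcommit (the default), 1 = always overcommit, 2 = never
cat /proc/sys/vm/overcommit_memory

# With mode 1, malloc rarely fails up front; pages are only faulted in on
# first touch, so the OOM killer fires before malloc ever returns NULL.
sysctl -w vm.overcommit_memory=1
```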
>
> Fixed up version attached. Seems to run without any lockups on a kernel
> with your patch applied until either the malloc's start to fail, or the OOM
> killer zaps it.
Thank you for that.
Really appreciate your help !
>
> I put your patch into my "next" tree - it's in linux-next tag "next-20120614"
> and I'll push to Linus in the next merge window.
And thanks again for that. :-)
>
> Thanks
>
> -Tony
>
>