public inbox for linux-ia64@vger.kernel.org
* [PATCH 09/19 v2] ia64/mm/fault.c: Port OOM changes to ia64_do_page_fault
@ 2012-03-31 12:03 Kautuk Consul
  2012-06-12 15:54 ` Kautuk Consul
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: Kautuk Consul @ 2012-03-31 12:03 UTC (permalink / raw)
  To: Tony Luck, Fenghua Yu; +Cc: linux-ia64, linux-kernel, Kautuk Consul

Commit d065bd810b6deb67d4897a14bfe21f8eb526ba99
(mm: retry page fault when blocking on disk transfer) and
commit 37b23e0525d393d48a7d59f870b3bc061a30ccdb
(x86,mm: make pagefault killable) changed the x86 page fault
handler to make it retryable as well as killable.

These changes reduce the mmap_sem hold time, which is crucial
during OOM killer invocation.

Port these changes to ia64.

Signed-off-by: Kautuk Consul <consul.kautuk@gmail.com>
---
 arch/ia64/mm/fault.c |   38 ++++++++++++++++++++++++++++++--------
 1 files changed, 30 insertions(+), 8 deletions(-)

diff --git a/arch/ia64/mm/fault.c b/arch/ia64/mm/fault.c
index 02d29c2..b0c3b29 100644
--- a/arch/ia64/mm/fault.c
+++ b/arch/ia64/mm/fault.c
@@ -81,6 +81,12 @@ ia64_do_page_fault (unsigned long address, unsigned long isr, struct pt_regs *re
 	struct siginfo si;
 	unsigned long mask;
 	int fault;
+	unsigned int flags = FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_KILLABLE;
+
+	mask = ((((isr >> IA64_ISR_X_BIT) & 1UL) << VM_EXEC_BIT)
+		| (((isr >> IA64_ISR_W_BIT) & 1UL) << VM_WRITE_BIT));
+
+	flags |= ((mask & VM_WRITE) ? FAULT_FLAG_WRITE : 0);
 
 	/* mmap_sem is performance critical.... */
 	prefetchw(&mm->mmap_sem);
@@ -109,6 +115,7 @@ ia64_do_page_fault (unsigned long address, unsigned long isr, struct pt_regs *re
 	if (notify_page_fault(regs, TRAP_BRKPT))
 		return;
 
+retry:
 	down_read(&mm->mmap_sem);
 
 	vma = find_vma_prev(mm, address, &prev_vma);
@@ -142,9 +149,6 @@ ia64_do_page_fault (unsigned long address, unsigned long isr, struct pt_regs *re
 	if (((isr >> IA64_ISR_R_BIT) & 1UL) && (!(vma->vm_flags & (VM_READ | VM_WRITE))))
 		goto bad_area;
 
-	mask = (  (((isr >> IA64_ISR_X_BIT) & 1UL) << VM_EXEC_BIT)
-		| (((isr >> IA64_ISR_W_BIT) & 1UL) << VM_WRITE_BIT));
-
 	if ((vma->vm_flags & mask) != mask)
 		goto bad_area;
 
@@ -153,7 +157,11 @@ ia64_do_page_fault (unsigned long address, unsigned long isr, struct pt_regs *re
 	 * sure we exit gracefully rather than endlessly redo the
 	 * fault.
 	 */
-	fault = handle_mm_fault(mm, vma, address, (mask & VM_WRITE) ? FAULT_FLAG_WRITE : 0);
+	fault = handle_mm_fault(mm, vma, address, flags);
+
+	if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current))
+		return;
+
 	if (unlikely(fault & VM_FAULT_ERROR)) {
 		/*
 		 * We ran out of memory, or some other thing happened
@@ -168,10 +176,24 @@ ia64_do_page_fault (unsigned long address, unsigned long isr, struct pt_regs *re
 		}
 		BUG();
 	}
-	if (fault & VM_FAULT_MAJOR)
-		current->maj_flt++;
-	else
-		current->min_flt++;
+
+	if (flags & FAULT_FLAG_ALLOW_RETRY) {
+		if (fault & VM_FAULT_MAJOR)
+			current->maj_flt++;
+		else
+			current->min_flt++;
+		if (fault & VM_FAULT_RETRY) {
+			flags &= ~FAULT_FLAG_ALLOW_RETRY;
+
+			 /* No need to up_read(&mm->mmap_sem) as we would
+			 * have already released it in __lock_page_or_retry
+			 * in mm/filemap.c.
+			 */
+
+			goto retry;
+		}
+	}
+
 	up_read(&mm->mmap_sem);
 	return;
 
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH 09/19 v2] ia64/mm/fault.c: Port OOM changes to ia64_do_page_fault
  2012-03-31 12:03 [PATCH 09/19 v2] ia64/mm/fault.c: Port OOM changes to ia64_do_page_fault Kautuk Consul
@ 2012-06-12 15:54 ` Kautuk Consul
  2012-06-13 20:50 ` Tony Luck
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Kautuk Consul @ 2012-06-12 15:54 UTC (permalink / raw)
  To: linux-ia64

Hi,

Could you please review this patch?

Many arches have accepted this change, so I think it is okay. :)

Thanks,
Kautuk.

On Sat, Mar 31, 2012 at 5:33 PM, Kautuk Consul <consul.kautuk@gmail.com> wrote:
> [...]


* Re: [PATCH 09/19 v2] ia64/mm/fault.c: Port OOM changes to ia64_do_page_fault
  2012-03-31 12:03 [PATCH 09/19 v2] ia64/mm/fault.c: Port OOM changes to ia64_do_page_fault Kautuk Consul
  2012-06-12 15:54 ` Kautuk Consul
@ 2012-06-13 20:50 ` Tony Luck
  2012-06-14 16:29 ` Kautuk Consul
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Tony Luck @ 2012-06-13 20:50 UTC (permalink / raw)
  To: linux-ia64

On Tue, Jun 12, 2012 at 8:51 AM, Kautuk Consul <consul.kautuk@gmail.com> wrote:
>> +       mask = ((((isr >> IA64_ISR_X_BIT) & 1UL) << VM_EXEC_BIT)
>> +               | (((isr >> IA64_ISR_W_BIT) & 1UL) << VM_WRITE_BIT));

You added these uses of VM_EXEC_BIT and VM_WRITE_BIT ... but
these macros are not defined until a few lines later in this file.  I moved
them (and the VM_READ_BIT definition) up to just before the
start of ia64_do_page_fault() to fix the compilation errors. No need
to re-submit a fixed patch.

Do you know if the test program mentioned in
d065bd810b6deb67d4897a14bfe21f8eb526ba99:
    - microbenchmark: thread A mmaps a large file and does random read accesses
      to the mmaped area - achieves about 55 iterations/s. Thread B does
      mmap/munmap in a loop at a separate location - achieves 55 iterations/s
      before, 15000 iterations/s after.
was posted? I'd like to make sure that ia64 is seeing the benefit.

-Tony


* Re: [PATCH 09/19 v2] ia64/mm/fault.c: Port OOM changes to ia64_do_page_fault
  2012-03-31 12:03 [PATCH 09/19 v2] ia64/mm/fault.c: Port OOM changes to ia64_do_page_fault Kautuk Consul
  2012-06-12 15:54 ` Kautuk Consul
  2012-06-13 20:50 ` Tony Luck
@ 2012-06-14 16:29 ` Kautuk Consul
  2012-06-14 16:43 ` Luck, Tony
  2012-06-14 16:53 ` Kautuk Consul
  4 siblings, 0 replies; 6+ messages in thread
From: Kautuk Consul @ 2012-06-14 16:29 UTC (permalink / raw)
  To: linux-ia64

On Thu, Jun 14, 2012 at 2:20 AM, Tony Luck <tony.luck@gmail.com> wrote:
> On Tue, Jun 12, 2012 at 8:51 AM, Kautuk Consul <consul.kautuk@gmail.com> wrote:
>>> +       mask = ((((isr >> IA64_ISR_X_BIT) & 1UL) << VM_EXEC_BIT)
>>> +               | (((isr >> IA64_ISR_W_BIT) & 1UL) << VM_WRITE_BIT));
>
> You added these uses of VM_EXEC_BIT and VM_WRITE_BIT ... but
> these macros are not defined until a few lines later in this file.  I moved
> them (and the VM_READ_BIT definition) up to just before the
> start of ia64_do_page_fault() to fix the compilation errors. No need
> to re-submit a fixed patch.
>

Thanks for the help, Tony!
Terribly sorry for my mistakes; I don't have an ia64 system to test on.

> Do you know if the test program mentioned in
> d065bd810b6deb67d4897a14bfe21f8eb526ba99:
>    - microbenchmark: thread A mmaps a large file and does random read accesses
>      to the mmaped area - achieves about 55 iterations/s. Thread B does
>      mmap/munmap in a loop at a separate location - achieves 55 iterations/s
>      before, 15000 iterations/s after.
> was posted? I'd like to make sure that ia64 is seeing the benefit.

I couldn't find any test-case for that, but I tested it on ARM and
MIPS, where I was seeing hangs and livelocks.
Maybe you can try my test-case at https://lkml.org/lkml/2012/3/31/90?

>
> -Tony


* RE: [PATCH 09/19 v2] ia64/mm/fault.c: Port OOM changes to ia64_do_page_fault
  2012-03-31 12:03 [PATCH 09/19 v2] ia64/mm/fault.c: Port OOM changes to ia64_do_page_fault Kautuk Consul
                   ` (2 preceding siblings ...)
  2012-06-14 16:29 ` Kautuk Consul
@ 2012-06-14 16:43 ` Luck, Tony
  2012-06-14 16:53 ` Kautuk Consul
  4 siblings, 0 replies; 6+ messages in thread
From: Luck, Tony @ 2012-06-14 16:43 UTC (permalink / raw)
  To: linux-ia64

[-- Attachment #1: Type: text/plain, Size: 650 bytes --]

> Maybe you can try my test-case at https://lkml.org/lkml/2012/3/31/90 ?

There seem to be a few cut-and-paste errors in that version (threads 3, 4,
and 5 think they are "Thread 2").  There is also no check for malloc failing,
which, since it is done in an infinite loop, seems inevitable - unless your
machines have infinite memory :-)

Fixed-up version attached.  It seems to run without any lockups on a kernel
with your patch applied until either the mallocs start to fail or the OOM
killer zaps it.

I put your patch into my "next" tree - it's in linux-next tag "next-20120614"
and I'll push to Linus in the next merge window.

Thanks

-Tony



[-- Attachment #2: kautuk.c --]
[-- Type: text/plain, Size: 4420 bytes --]

/* https://lkml.org/lkml/2012/3/31/90 */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <pthread.h>
#include <unistd.h>

#define ALLOC_BYTE (512*1024)
#define COUNT 50

void *alloc_function_one( void *ptr );
void *alloc_function_two( void *ptr );
void *alloc_function_three( void *ptr );
void *alloc_function_four( void *ptr );
void *alloc_function_five( void *ptr );


int main(int argc, char *argv[])
{
	pthread_t thread1, thread2, thread3, thread4, thread5;
	char *message1 = "Thread 1";
	char *message2 = "Thread 2";
	char *message3 = "Thread 3";
	char *message4 = "Thread 4";
	char *message5 = "Thread 5";
	int iret1 = -1;
	int iret2 = -1;
	int iret3 = -1;
	int iret4 = -1;
	int iret5 = -1;
	/* fork() doubles the pressure: parent and child each run five allocator threads. */
	fork();
	iret1 = pthread_create( &thread1, NULL, alloc_function_one, (void*) message1);
	iret2 = pthread_create( &thread2, NULL, alloc_function_two, (void*) message2);
	iret3 = pthread_create( &thread3, NULL, alloc_function_three, (void*) message3);
	iret4 = pthread_create( &thread4, NULL, alloc_function_four, (void*) message4);
	iret5 = pthread_create( &thread5, NULL, alloc_function_five, (void*) message5);
	pthread_join( thread1, NULL);
	pthread_join( thread2, NULL);
	pthread_join( thread3, NULL);
	pthread_join( thread4, NULL);
	pthread_join( thread5, NULL);

	printf("Thread 1 returns: %d\n",iret1);
	printf("Thread 2 returns: %d\n",iret2);
	printf("Thread 3 returns: %d\n",iret3);
	printf("Thread 4 returns: %d\n",iret4);
	printf("Thread 5 returns: %d\n",iret5);
	exit(0);
}
/*
 * The five thread functions below are intentionally identical: each one
 * allocates and touches memory in a loop, never freeing, to drive the
 * system toward OOM.
 */
void *alloc_function_two(void *ptr)
{
	char *message = ptr;
	void *myblock[COUNT];
	int i;

	printf("message_alloc  \n");
	while (1) {
		memset(myblock, 0, sizeof(myblock));
		printf("message_alloc %s \n", message);
		for (i = 0; i < COUNT; i++) {
			myblock[i] = malloc(ALLOC_BYTE);
			if (myblock[i] == NULL) {
				printf("malloc failed in %s\n", message);
				return NULL;
			}
			memset(myblock[i], 1, ALLOC_BYTE);
		}
	}
}

void *alloc_function_one(void *ptr)
{
	char *message = ptr;
	void *myblock[COUNT];
	int i;

	printf("message_alloc  \n");
	while (1) {
		memset(myblock, 0, sizeof(myblock));
		printf("message_alloc %s \n", message);
		for (i = 0; i < COUNT; i++) {
			myblock[i] = malloc(ALLOC_BYTE);
			if (myblock[i] == NULL) {
				printf("malloc failed in %s\n", message);
				return NULL;
			}
			memset(myblock[i], 1, ALLOC_BYTE);
		}
	}
}

void *alloc_function_three(void *ptr)
{
	char *message = ptr;
	void *myblock[COUNT];
	int i;

	printf("message_alloc  \n");
	while (1) {
		memset(myblock, 0, sizeof(myblock));
		printf("message_alloc %s \n", message);
		for (i = 0; i < COUNT; i++) {
			myblock[i] = malloc(ALLOC_BYTE);
			if (myblock[i] == NULL) {
				printf("malloc failed in %s\n", message);
				return NULL;
			}
			memset(myblock[i], 1, ALLOC_BYTE);
		}
	}
}

void *alloc_function_four(void *ptr)
{
	char *message = ptr;
	void *myblock[COUNT];
	int i;

	printf("message_alloc  \n");
	while (1) {
		memset(myblock, 0, sizeof(myblock));
		printf("message_alloc %s \n", message);
		for (i = 0; i < COUNT; i++) {
			myblock[i] = malloc(ALLOC_BYTE);
			if (myblock[i] == NULL) {
				printf("malloc failed in %s\n", message);
				return NULL;
			}
			memset(myblock[i], 1, ALLOC_BYTE);
		}
	}
}

void *alloc_function_five(void *ptr)
{
	char *message = ptr;
	void *myblock[COUNT];
	int i;

	printf("message_alloc  \n");
	while (1) {
		memset(myblock, 0, sizeof(myblock));
		printf("message_alloc %s \n", message);
		for (i = 0; i < COUNT; i++) {
			myblock[i] = malloc(ALLOC_BYTE);
			if (myblock[i] == NULL) {
				printf("malloc failed in %s\n", message);
				return NULL;
			}
			memset(myblock[i], 1, ALLOC_BYTE);
		}
	}
}


* Re: [PATCH 09/19 v2] ia64/mm/fault.c: Port OOM changes to ia64_do_page_fault
  2012-03-31 12:03 [PATCH 09/19 v2] ia64/mm/fault.c: Port OOM changes to ia64_do_page_fault Kautuk Consul
                   ` (3 preceding siblings ...)
  2012-06-14 16:43 ` Luck, Tony
@ 2012-06-14 16:53 ` Kautuk Consul
  4 siblings, 0 replies; 6+ messages in thread
From: Kautuk Consul @ 2012-06-14 16:53 UTC (permalink / raw)
  To: linux-ia64

On Thu, Jun 14, 2012 at 10:13 PM, Luck, Tony <tony.luck@intel.com> wrote:
>> Maybe you can try my test-case at https://lkml.org/lkml/2012/3/31/90 ?
>
> There seem to be a few cut & paste errors in that version (threads 3, 4, 5
> think they are "Thread 2").  Also no check for malloc failing (which since
> it is done in an infinite loop seems inevitable - unless your machines have
> infinite memory :-)

Yes. This test-case was written in a hurry to run on ARM; it was just
meant to blindly allocate and fault pages until OOM happens.

So sorry about that!

Evidently on ARM we OOM well before the mallocs start failing. That may
be due to some overcommit_memory setting which I had on that board.
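The setting Kautuk mentions is the vm.overcommit_memory sysctl; whether it explains the early OOM on that board is only a guess, but the knob is easy to inspect:

```shell
# 0 = heuristic overcommit (the default), 1 = always overcommit,
# 2 = strict accounting against the commit limit
cat /proc/sys/vm/overcommit_memory

# Under mode 2 the commit limit also depends on this percentage:
cat /proc/sys/vm/overcommit_ratio
```

With mode 0 or 1 the mallocs keep succeeding and the OOM killer fires when the touched pages exhaust memory, which matches the behaviour described above.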

>
> Fixed up version attached.  Seems to run without any lockups on a kernel
> with your patch applied until either the malloc's start to fail, or the OOM
> killer zaps it.

Thank you for that.
I really appreciate your help!

>
> I put your patch into my "next" tree - it's in linux-next tag "next-20120614"
> and I'll push to Linus in the next merge window.

And thanks again for that. :-)

>
> Thanks
>
> -Tony
>
>


end of thread, other threads:[~2012-06-14 16:53 UTC | newest]
