Re: [RFC PATCH V3] mm readahead: Fix the readahead fail in case of empty numa node

From: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Fengguang Wu <fengguang.wu@intel.com>,
	David Cohen <david.a.cohen@linux.intel.com>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Damien Ramonda <damien.ramonda@intel.com>,
	jack@suse.cz, Linus <torvalds@linux-foundation.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH V3] mm readahead: Fix the readahead fail in case of empty numa node
Date: Wed, 08 Jan 2014 14:19:23 +0530
Message-ID: <52CD1113.2070003@linux.vnet.ibm.com>
In-Reply-To: <20140106141300.4e1c950d45c614d6c29bdd8f@linux-foundation.org>

On 01/07/2014 03:43 AM, Andrew Morton wrote:
> On Mon,  6 Jan 2014 15:51:55 +0530 Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com> wrote:
>
>> +	/*
>> +	 * Readahead onto remote memory is better than no readahead when local
>> +	 * numa node does not have memory. We sanitize readahead size depending
>> +	 * on free memory in the local node but limiting to 4k pages.
>> +	 */
>> +	return local_free_page ? min(sane_nr, local_free_page / 2) : sane_nr;
>>   }
>
> So if the local node has two free pages, we do just one page of
> readahead.
>
> Then the local node has one free page and we do zero pages readahead.
>
> Assuming that bug(!) is fixed, the local node now has zero free pages
> and we suddenly resume doing large readahead.
>
> This transition from large readahead to very small readahead then back
> to large readahead is illogical, surely?
>
>

Hi Andrew, thanks for having a look at this.

You are correct that there is a transition from small readahead to
large readahead once we have zero free pages.
I am not sure I can defend it well, but I'll give it a try :).
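
To make the transition concrete, here is a minimal user-space sketch of the
quoted return expression. It is not the kernel code: min_ul() and
print_transition() are hypothetical helpers added for illustration, and
sane_nr is simply passed in as an already-capped request size.

```c
#include <stdio.h>

/* Hypothetical stand-in for the kernel's min() macro. */
static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

/*
 * User-space copy of the return expression quoted above.
 * sane_nr:         requested readahead size, assumed to be capped already.
 * local_free_page: pages counted as available on the local node.
 */
static unsigned long sanitized_nr(unsigned long sane_nr,
				  unsigned long local_free_page)
{
	return local_free_page ? min_ul(sane_nr, local_free_page / 2) : sane_nr;
}

/* Print the readahead size as the local node drains towards zero pages. */
static void print_transition(unsigned long sane_nr)
{
	static const unsigned long free_pages[] = { 4096, 64, 2, 1, 0 };
	size_t i;

	for (i = 0; i < sizeof(free_pages) / sizeof(free_pages[0]); i++)
		printf("free=%lu -> readahead=%lu\n", free_pages[i],
		       sanitized_nr(sane_nr, free_pages[i]));
}
```

Calling print_transition(32) from a main() walks through the sequence Andrew
describes: the size stays at 32 while the node has plenty of pages, drops to
1 and then 0 as free pages fall to 2 and 1, and jumps back to 32 at zero.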

Assuming the cpu/memory load is evenly distributed, if we have very
little free+inactive memory, maybe we are already in really bad shape.

But in a situation like the one below [1] (the cpus do not have any
local memory populated), which I had mentioned earlier, we always have
to depend on the remote node. Isn't sanitized readahead onto remote
memory better in that case?

Having said that, I have not been able to come up with a sane
implementation that fixes this readahead failure bug while avoiding the
anomaly you pointed out :(. Hints/ideas? Please let me know.


[1]: IBM P730
----------------------------------
# numactl -H
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
node 0 size: 0 MB
node 0 free: 0 MB
node 1 cpus:
node 1 size: 12288 MB
node 1 free: 10440 MB
node distances:
node   0   1
0:  10  40
1:  40  10

Thread overview:
2014-01-06 10:21 [RFC PATCH V3] mm readahead: Fix the readahead fail in case of empty numa node Raghavendra K T
2014-01-06 10:56 ` Jan Kara
2014-01-08  8:37   ` Raghavendra K T
2014-01-08 10:38     ` Jan Kara
2014-01-08 11:59       ` Raghavendra K T
2014-01-06 22:13 ` Andrew Morton
2014-01-08  8:49   ` Raghavendra K T [this message]
2014-01-08 10:47     ` Jan Kara
2014-01-08 11:57       ` Raghavendra K T
