From: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
To: David Rientjes <rientjes@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Fengguang Wu <fengguang.wu@intel.com>,
David Cohen <david.a.cohen@linux.intel.com>,
Al Viro <viro@zeniv.linux.org.uk>,
Damien Ramonda <damien.ramonda@intel.com>,
Jan Kara <jack@suse.cz>,
Linus Torvalds <torvalds@linux-foundation.org>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH V5] mm readahead: Fix readahead fail for no local memory and limit readahead pages
Date: Mon, 10 Feb 2014 17:55:58 +0530
Message-ID: <52F8C556.6090006@linux.vnet.ibm.com>
In-Reply-To: <alpine.DEB.2.02.1402100200420.30650@chino.kir.corp.google.com>
On 02/10/2014 03:35 PM, David Rientjes wrote:
> On Mon, 10 Feb 2014, Raghavendra K T wrote:
>
>> As you rightly pointed , I 'll drop remote memory term and use
>> something like :
>>
>> "* Ensure readahead success on a memoryless node cpu. But we limit
>> * the readahead to 4k pages to avoid trashing page cache." ..
>>
>
> I don't know how to proceed here after pointing it out twice, I'm afraid.
>
> numa_mem_id() is local memory for a memoryless node. node_present_pages()
> has no place in your patch.
Hi David, thanks for the pointer regarding numa_mem_id(). I did not
mean to ignore you or to be offensive; sorry if the conversation came
across that way.

So I understand that you are suggesting an implementation like the ones
below.

1) I have no problem with the approach below, and I could post it in the
next version.
(But it does not include the 4k limit that Linus asked us to apply.)

unsigned long max_sane_readahead(unsigned long nr)
{
	unsigned long local_free_page;
	int nid;

	nid = numa_mem_id();

	/*
	 * We sanitize readahead size depending on free memory in
	 * the local node.
	 */
	local_free_page = node_page_state(nid, NR_INACTIVE_FILE)
			  + node_page_state(nid, NR_FREE_PAGES);
	return min(nr, local_free_page / 2);
}

2) I did not go for the approach below because Honza (Jan Kara) had some
concerns about the 4k limit in the normal case, and since I am not
the expert, I was waiting for opinions.

unsigned long max_sane_readahead(unsigned long nr)
{
	unsigned long local_free_page, sane_nr;
	int nid;

	nid = numa_mem_id();
	/* limit the max readahead to 4k pages */
	sane_nr = min(nr, MAX_REMOTE_READAHEAD);

	/*
	 * We sanitize readahead size depending on free memory in
	 * the local node.
	 */
	local_free_page = node_page_state(nid, NR_INACTIVE_FILE)
			  + node_page_state(nid, NR_FREE_PAGES);
	return min(sane_nr, local_free_page / 2);
}
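
To make the difference between (1) and (2) concrete, here is a minimal
userspace sketch of just the clamping arithmetic. It is not kernel code:
the per-node counters are passed in as plain arguments instead of being
read with node_page_state(), the helper names are made up for
illustration, and the value 4096 for MAX_REMOTE_READAHEAD is an
assumption based on the "4k pages" wording.

```c
/* Assumed value, taken from the "4k pages" wording; not confirmed here. */
#define MAX_REMOTE_READAHEAD 4096UL

/* Hypothetical helper standing in for the kernel's min(). */
static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

/* Variant (1): clamp only to half of (inactive file + free) pages. */
static unsigned long sane_readahead_v1(unsigned long nr,
				       unsigned long inactive_file,
				       unsigned long free_pages)
{
	return min_ul(nr, (inactive_file + free_pages) / 2);
}

/* Variant (2): apply the fixed page cap first, then the same clamp. */
static unsigned long sane_readahead_v2(unsigned long nr,
				       unsigned long inactive_file,
				       unsigned long free_pages)
{
	unsigned long sane_nr = min_ul(nr, MAX_REMOTE_READAHEAD);

	return min_ul(sane_nr, (inactive_file + free_pages) / 2);
}
```

With plenty of memory on the node, only (2) reduces a large request; when
the node is nearly full, both variants fall back to the same free-memory
bound.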
>
>> Regarding ACCESS_ONCE, since we will have to add
>> inside the function and still there is nothing that could prevent us
>> getting run on different cpu with a different node (as Andrew pointed out), I have
>> not included in current patch that I am posting.
>> Moreover this case is hopefully not fatal since it is just a hint for
>> readahead we can do.
>>
>
> I have no idea why you think the ACCESS_ONCE() is a problem. It's relying
> on gcc's implementation to ensure that the equation is done only for one
> node. It has absolutely nothing to do with the fact that the process may
> be moved to another cpu upon returning or even immediately after the
> calculation is done. Is it possible that node0 has 80% of memory free and
> node1 has 80% of memory inactive? Well, then your equation doesn't work
> quite so well if the process moves.
>
> There is no downside whatsoever to using it, I have no idea why you think
> it's better without it.
I have no problem introducing ACCESS_ONCE too. I skipped it only
after I got the error below:

mm/readahead.c: In function 'max_sane_readahead':
mm/readahead.c:246: error: lvalue required as unary '&' operand
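
As far as I can tell, the error comes from the way ACCESS_ONCE() is
defined: it takes the address of its argument, so the argument has to be
an lvalue, and the return value of a function call is not one. The
userspace sketch below shows the same shape. The macro mirrors the kernel
definition (using gcc's __typeof__), while stub_node and
stub_numa_mem_id() are hypothetical stand-ins and not the real
numa_mem_id().

```c
/* Same shape as the kernel macro; the '&' is why an lvalue is required. */
#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

/* Hypothetical stand-in for numa_mem_id(); not the kernel function. */
static int stub_node = 1;

static int stub_numa_mem_id(void)
{
	return stub_node;
}

static int nid_snapshot(void)
{
	/*
	 * ACCESS_ONCE(stub_numa_mem_id()) fails to build with
	 * "lvalue required as unary '&' operand", because a call
	 * result has no address. Storing it in a local gives one
	 * value that the rest of the function can reuse.
	 */
	int nid = stub_numa_mem_id();

	return nid;
}

static int stub_node_once(void)
{
	/* Applied to a variable (an lvalue), the macro compiles fine. */
	return ACCESS_ONCE(stub_node);
}
```

So the macro can be applied to a variable that holds the node id, but not
directly to the call that returns it.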
>
>> So there are many possible implementation:
>> (1) use numa_mem_id(), apply freepage limit and use 4k page limit for all
>> case
>> (Jan had reservation about this case)
>>
>> (2)for normal case: use free memory calculation and do not apply 4k
>> limit (no change).
>> for memoryless cpu case: use numa_mem_id for more accurate
>> calculation of limit and also apply 4k limit.
>>
>> (3) for normal case: use free memory calculation and do not apply 4k
>> limit (no change).
>> for memoryless case: apply 4k page limit
>>
>> (4) use numa_mem_id() and apply only free page limit..
>>
>> So, I ll be resending the patch with changelog and comment changes
>> based on your and Andrew's feedback (type (3) implementation).
>>
>
> It's frustrating to have to say something three times. Ask yourself what
> happens if ALL NODES WITH CPUS DO NOT HAVE MEMORY?
>
True, and that is why we could go with implementation (1) that I posted
above. I just did not want to float a new version without knowing
whether Andrew was expecting a new patch or only changelog updates.