From: Paul Jackson <pj@sgi.com>
To: "Rohit, Seth" <rohit.seth@intel.com>
Cc: akpm@osdl.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH]: Clean up of __alloc_pages
Date: Sat, 29 Oct 2005 17:16:30 -0700
Message-ID: <20051029171630.04a69660.pj@sgi.com>
In-Reply-To: <20051028183326.A28611@unix-os.sc.intel.com>
Seth wrote:
> @@ -851,19 +853,11 @@
> * Ignore cpuset if GFP_ATOMIC (!wait) rather than fail alloc.
> * See also cpuset_zone_allowed() comment in kernel/cpuset.c.
> */
> - for (i = 0; (z = zones[i]) != NULL; i++) {
> - if (!zone_watermark_ok(z, order, z->pages_min,
> - classzone_idx, can_try_harder,
> - gfp_mask & __GFP_HIGH))
> - continue;
> -
> - if (wait && !cpuset_zone_allowed(z, gfp_mask))
> - continue;
> -
> - page = buffered_rmqueue(z, order, gfp_mask);
> - if (page)
> - goto got_pg;
> - }
> + if (!wait)
> + page = get_page_from_freelist(gfp_mask, order, zones,
> + can_try_harder);
Thanks for the clean-up work. Good stuff.
I think you've changed the effect that the cpuset check has on the
above pass.
As you know, the above is the last chance we have for GFP_ATOMIC (can't
wait) allocations before getting into the oom_kill code. The code had
been written to ignore cpuset constraints for GFP_ATOMIC (that is,
"!wait") allocations. The intent is to allow taking GFP_ATOMIC memory
from any damn node we can find it on, rather than start killing.
Your change will call into get_page_from_freelist() in such cases,
where the cpuset check is still done.
I would be tempted instead to:
1) pass a 'can_try_harder' value of -1, instead of the local value
of 1 (which it certainly is, since we are in !wait code).
2) condition the cpuset check in get_page_from_freelist() on
can_try_harder being not equal to -1.
The item (2) -does- change the existing cpuset conditions as well,
allowing cpuset boundaries to be violated for the cases that would
"allow future memory freeing" (such as GFP_MEMALLOC or TIF_MEMDIE),
whereas until now, we did not allow violating cpuset conditions
for this. But that is arguably a good change.
The following patch, on top of yours, shows what I have in mind here:
--- 2.6.14-rc5-mm1.orig/mm/page_alloc.c 2005-10-29 14:45:07.000000000 -0700
+++ 2.6.14-rc5-mm1/mm/page_alloc.c 2005-10-29 16:35:55.000000000 -0700
@@ -777,7 +777,7 @@ get_page_from_freelist(unsigned int __no
* See also cpuset_zone_allowed() comment in kernel/cpuset.c.
*/
for (i = 0; (z = zones[i]) != NULL; i++) {
- if (!cpuset_zone_allowed(z, gfp_mask))
+ if (can_try_harder != -1 && !cpuset_zone_allowed(z, gfp_mask))
continue;
if ((can_try_harder >= 0) &&
@@ -940,8 +940,7 @@ restart:
* See also cpuset_zone_allowed() comment in kernel/cpuset.c.
*/
if (!wait)
- page = get_page_from_freelist(gfp_mask, order, zones,
- can_try_harder);
+ page = get_page_from_freelist(gfp_mask, order, zones, -1);
if (page)
goto got_pg;
However ...
1) The above also would change __GFP_HIGH and rt allocations to also
ignore mins entirely, instead of just going deeper into reserves,
on this pass. That is likely not good.
2) I can't get my head wrapped around Nick's reply to this patch.
So my above patch is no doubt flawed in one or more ways.
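To make concern (1) concrete, here is a small user-space toy model of the zone loop as it would look with the patch above applied. It is only a sketch: struct toy_zone, toy_watermark_ok(), toy_pick_zone() and the toy_zones table are hypothetical stand-ins, not kernel code, and the "quarter off the minimum" rule for can_try_harder is an assumption made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for struct zone; not the kernel's layout. */
struct toy_zone {
	long free_pages;   /* pages currently free in this zone */
	long pages_min;    /* minimum watermark for this zone */
	bool in_cpuset;    /* stand-in for cpuset_zone_allowed() */
};

/* Toy watermark test. Assumption: trying harder lowers the minimum by 1/4. */
static bool toy_watermark_ok(const struct toy_zone *z, int can_try_harder)
{
	long min = z->pages_min;

	if (can_try_harder > 0)
		min -= min / 4;
	return z->free_pages > min;
}

/* Returns the index of the first usable zone, or -1 if none qualifies. */
static int toy_pick_zone(const struct toy_zone *zones, size_t n,
			 int can_try_harder)
{
	for (size_t i = 0; i < n; i++) {
		const struct toy_zone *z = &zones[i];

		/* Proposed change: -1 means "ignore cpuset constraints". */
		if (can_try_harder != -1 && !z->in_cpuset)
			continue;

		/* Side effect: -1 also skips the watermark test entirely. */
		if (can_try_harder >= 0 && !toy_watermark_ok(z, can_try_harder))
			continue;

		if (z->free_pages > 0)
			return (int)i;
	}
	return -1;
}

/* Zone 0 is nearly empty, zone 1 is in the cpuset, zone 2 has plenty. */
static const struct toy_zone toy_zones[] = {
	{ .free_pages = 10,  .pages_min = 100, .in_cpuset = false },
	{ .free_pages = 50,  .pages_min = 100, .in_cpuset = true  },
	{ .free_pages = 500, .pages_min = 100, .in_cpuset = false },
};

/* Prints which zone each mode would take a page from. */
static void toy_demo(void)
{
	printf("can_try_harder =  1: zone %d\n", toy_pick_zone(toy_zones, 3, 1));
	printf("can_try_harder = -1: zone %d\n", toy_pick_zone(toy_zones, 3, -1));
}
```

In this model, a call with can_try_harder of 1 finds no zone at all (the only zone in the cpuset is below its lowered minimum), while a call with -1 takes the page from zone 0, which is far below its minimum, even though zone 2 has plenty free. That is the "ignore mins entirely" behavior, and it applies to every caller that ends up passing -1, not just the GFP_ATOMIC case.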
--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.925.600.0401