All of lore.kernel.org
 help / color / mirror / Atom feed
From: Paul Jackson <pj@sgi.com>
To: Dave Hansen <haveblue@us.ibm.com>
Cc: mrmacman_g4@mac.com, jschopp@austin.ibm.com, akpm@osdl.org,
	lhms-devel@lists.sourceforge.net, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, mel@csn.ul.ie, kravetz@us.ibm.com
Subject: Re: [PATCH 1/9] add defrag flags
Date: Mon, 26 Sep 2005 22:44:39 -0700	[thread overview]
Message-ID: <20050926224439.056eaf8d.pj@sgi.com> (raw)
In-Reply-To: <1127780648.10315.12.camel@localhost>

Dave wrote:
> I think Joel simply made an error in his description.

Looks like he made the same mistake in the actual code comments:

+/* Allocation type modifiers, group together if possible
+ * __GPF_USER: Allocation for user page or a buffer page
+ * __GFP_KERNRCLM: Short-lived or reclaimable kernel allocation
+ */
+#define __GFP_USER	0x40000u /* Kernel page that is easily reclaimable */
+#define __GFP_KERNRCLM	0x80000u /* User is a userspace user */

I'd guess you meant to write more like the following:

#define __GFP_USER   0x40000u /* Page for user address space */
#define __GFP_KERNRCLM 0x80000u /* Kernel page that is easily reclaimable */

And the block comment seems to needlessly repeat the inline comments,
add a dubious claim, and omit the interesting stuff ...  In other words:

    Does it actually matter if these two bits are grouped, or not?  I
    suspect that some of your other code, such as shifting the gfpmask by
    RCLM_SHIFT bits, _requires_ that these two bits be adjacent.  So the
    "if possible" in the comment above is misleading.

    And I suspect that gfp.h should contain the RCLM_SHIFT define, or
    at least mention in comment that RCLM_SHIFT depends on the position
    of the above two __GFP_* bits.

    And I don't see any mention in the comments in gfp.h that these
    two bits, in tandem, have an additional meaning - both bits off
    means, I guess, not reclaimable, well at least not easily.

My HARDWALL patch appears to already be in Linus's kernel, so you
probably also need to do a global substitute of all instances in
the kernel of __GFP_HARDWALL, replacing it with __GFP_USER.  Here
is the list of files I see affected, with a count of the number of
__GFP_HARDWALL strings in each:

    include/linux/gfp.h:4
    kernel/cpuset.c:6
    mm/page_alloc.c:2
    mm/vmscan.c:4

The comment in the next line looks like it needs to be changed to match
the code change:

+#define __GFP_BITS_SHIFT 21	/* Room for 20 __GFP_FOO bits */

On the other hand, why did you change __GFP_BITS_SHIFT?  Isn't 20
enough - just enough?

Why was the flag change in fs/buffer.c:grow_dev_page() to add the
__GFP_USER bit, not to add the __GFP_KERNRCLM bit?  I don't know that
code - perhaps the answer is simply that the resulting page ends up in
user space.

Aha - I just read one of the comments above that I cut+pasted.
It says that __GFP_USER means user *OR* buffer page.  That certainly
explains the fs/buffer.c code using __GFP_USER.  But it causes me to
wonder if we can equate __GFP_USER with __GFP_HARDWALL.  I'm reluctant,
but more on principal than concrete experience, to modify the meaning
of hardwall cpusets to constrain both user address space pages *AND*
buffer pages.  How open would you be to making buffers __GFP_KERNRCLM
instead of __GFP_USER?

If you have good reason to keep __GFP_USER meanin either user or buffer,
then perhaps the name __GFP_USER is misleading.

What sort of performance claims can you make for this change?  How does
it impact kernel text size?  Could we see a diffstat for the entire
patchset?  Under what sort of loads or conditions would you expect
this patchset to do more harm than good?

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.925.600.0401

WARNING: multiple messages have this Message-ID (diff)
From: Paul Jackson <pj@sgi.com>
To: Dave Hansen <haveblue@us.ibm.com>
Cc: mrmacman_g4@mac.com, jschopp@austin.ibm.com, akpm@osdl.org,
	lhms-devel@lists.sourceforge.net, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, mel@csn.ul.ie, kravetz@us.ibm.com
Subject: Re: [PATCH 1/9] add defrag flags
Date: Mon, 26 Sep 2005 22:44:39 -0700	[thread overview]
Message-ID: <20050926224439.056eaf8d.pj@sgi.com> (raw)
In-Reply-To: <1127780648.10315.12.camel@localhost>

Dave wrote:
> I think Joel simply made an error in his description.

Looks like he made the same mistake in the actual code comments:

+/* Allocation type modifiers, group together if possible
+ * __GPF_USER: Allocation for user page or a buffer page
+ * __GFP_KERNRCLM: Short-lived or reclaimable kernel allocation
+ */
+#define __GFP_USER	0x40000u /* Kernel page that is easily reclaimable */
+#define __GFP_KERNRCLM	0x80000u /* User is a userspace user */

I'd guess you meant to write more like the following:

#define __GFP_USER   0x40000u /* Page for user address space */
#define __GFP_KERNRCLM 0x80000u /* Kernel page that is easily reclaimable */

And the block comment seems to needlessly repeat the inline comments,
add a dubious claim, and omit the interesting stuff ...  In other words:

    Does it actually matter if these two bits are grouped, or not?  I
    suspect that some of your other code, such as shifting the gfpmask by
    RCLM_SHIFT bits, _requires_ that these two bits be adjacent.  So the
    "if possible" in the comment above is misleading.

    And I suspect that gfp.h should contain the RCLM_SHIFT define, or
    at least mention in comment that RCLM_SHIFT depends on the position
    of the above two __GFP_* bits.

    And I don't see any mention in the comments in gfp.h that these
    two bits, in tandem, have an additional meaning - both bits off
    means, I guess, not reclaimable, well at least not easily.

My HARDWALL patch appears to already be in Linus's kernel, so you
probably also need to do a global substitute of all instances in
the kernel of __GFP_HARDWALL, replacing it with __GFP_USER.  Here
is the list of files I see affected, with a count of the number of
__GFP_HARDWALL strings in each:

    include/linux/gfp.h:4
    kernel/cpuset.c:6
    mm/page_alloc.c:2
    mm/vmscan.c:4

The comment in the next line looks like it needs to be changed to match
the code change:

+#define __GFP_BITS_SHIFT 21	/* Room for 20 __GFP_FOO bits */

On the other hand, why did you change __GFP_BITS_SHIFT?  Isn't 20
enough - just enough?

Why was the flag change in fs/buffer.c:grow_dev_page() to add the
__GFP_USER bit, not to add the __GFP_KERNRCLM bit?  I don't know that
code - perhaps the answer is simply that the resulting page ends up in
user space.

Aha - I just read one of the comments above that I cut+pasted.
It says that __GFP_USER means user *OR* buffer page.  That certainly
explains the fs/buffer.c code using __GFP_USER.  But it causes me to
wonder if we can equate __GFP_USER with __GFP_HARDWALL.  I'm reluctant,
but more on principal than concrete experience, to modify the meaning
of hardwall cpusets to constrain both user address space pages *AND*
buffer pages.  How open would you be to making buffers __GFP_KERNRCLM
instead of __GFP_USER?

If you have good reason to keep __GFP_USER meanin either user or buffer,
then perhaps the name __GFP_USER is misleading.

What sort of performance claims can you make for this change?  How does
it impact kernel text size?  Could we see a diffstat for the entire
patchset?  Under what sort of loads or conditions would you expect
this patchset to do more harm than good?

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.925.600.0401

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  parent reply	other threads:[~2005-09-27  5:45 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2005-09-26 20:01 [PATCH 0/9] fragmentation avoidance Joel Schopp
2005-09-26 20:03 ` [PATCH 1/9] add defrag flags Joel Schopp
2005-09-27  0:16   ` Kyle Moffett
2005-09-27  0:16     ` Kyle Moffett
2005-09-27  0:24     ` Dave Hansen
2005-09-27  0:24       ` Dave Hansen
2005-09-27  0:43       ` Kyle Moffett
2005-09-27  0:43         ` Kyle Moffett
2005-09-27  5:44       ` Paul Jackson [this message]
2005-09-27  5:44         ` Paul Jackson
2005-09-27 13:34         ` Mel Gorman
2005-09-27 13:34           ` Mel Gorman
2005-09-27 16:26           ` [Lhms-devel] " Paul Jackson
2005-09-27 16:26             ` Paul Jackson
2005-09-27 18:38         ` Joel Schopp
2005-09-27 19:30           ` Paul Jackson
2005-09-27 19:30             ` Paul Jackson
2005-09-27 21:00             ` [Lhms-devel] " Joel Schopp
2005-09-27 21:00               ` Joel Schopp
2005-09-27 21:23               ` Paul Jackson
2005-09-27 21:23                 ` Paul Jackson
2005-09-27 22:03                 ` Joel Schopp
2005-09-27 22:45                   ` Paul Jackson
2005-09-27 22:45                     ` Paul Jackson
2005-09-26 20:05 ` [PATCH 2/9] declare defrag structs Joel Schopp
2005-09-26 20:06 ` [PATCH 3/9] initialize defrag Joel Schopp
2005-09-26 20:09 ` [PATCH 4/9] defrag helper functions Joel Schopp
2005-09-26 22:29   ` Alex Bligh - linux-kernel
2005-09-26 22:29     ` Alex Bligh - linux-kernel
2005-09-27 16:08     ` Joel Schopp
2005-09-27 16:08       ` Joel Schopp
2005-09-26 20:11 ` [PATCH 5/9] propagate defrag alloc types Joel Schopp
2005-09-26 20:13 ` [PATCH 6/9] fragmentation avoidance core Joel Schopp
2005-09-26 20:14 ` [PATCH 7/9] try harder on large allocations Joel Schopp
2005-09-27  7:21   ` Coywolf Qi Hunt
2005-09-27  7:21     ` Coywolf Qi Hunt
2005-09-27 16:17     ` Joel Schopp
2005-09-27 16:17       ` Joel Schopp
2005-09-26 20:16 ` [PATCH 8/9] defrag fallback Joel Schopp
2005-09-26 20:17 ` [PATCH 9/9] free memory is user reclaimable Joel Schopp
2005-09-26 20:19 ` [PATCH 10/9] percpu splitout Joel Schopp
2005-09-26 21:49 ` [Lhms-devel] [PATCH 0/9] fragmentation avoidance Joel Schopp
2005-09-26 21:49   ` Joel Schopp

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20050926224439.056eaf8d.pj@sgi.com \
    --to=pj@sgi.com \
    --cc=akpm@osdl.org \
    --cc=haveblue@us.ibm.com \
    --cc=jschopp@austin.ibm.com \
    --cc=kravetz@us.ibm.com \
    --cc=lhms-devel@lists.sourceforge.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mel@csn.ul.ie \
    --cc=mrmacman_g4@mac.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.