Re: [PATCH] Compound page overhaul

All of lore.kernel.org
 help / color / mirror / Atom feed

From: David Howells <dhowells@redhat.com>
To: William Lee Irwin III <wli@holomorphy.com>
Cc: torvalds@osdl.org, akpm@osdl.org, hch@infradead.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] Compound page overhaul
Date: Mon, 22 Nov 2004 16:07:06 +0000	[thread overview]
Message-ID: <14391.1101139626@redhat.com> (raw)
In-Reply-To: <20041122144127.GE2714@holomorphy.com>

William Lee Irwin III <wli@holomorphy.com> wrote:

> >  (1) A new bit flag PG_compound_slave has been added. This is used to mark
> > ...
> There are a lot of ways to do these things. Most of it is bitpacking
> and dodging assumptions in other code about various fields always being
> something or other they expect (e.g. bh's vs. page->private).

I want to avoid putting magic numbers in page->private. What goes in there
could be anything, as it's up to the filesystem.

Do you have any preferences? I'd prefer to use page->mapping, I think, except
that's used for the destructor.

I should probably make a set-destructor function for hugetlbfs to call.

> A generally innocuous rearrangement. Some explanation of the advantage
> of these new bitpacking and field arrangements over the current
> arrangement may be good to have.

The only differences are:

 (1) PG_compound_slave now exists.

 (2) I'm permitting the owner of the page to do do what it will with
     page->private on the first page.

> >  (3) __page_first() is now provided to find the first page of any page set
> >      (even single page sets).
> 
> I have to question the underscores.

The underscores can be dropped if they're not wanted.

> Also, there's a commonly-used term in the superpage literature, ``head of
> the superpage'', that may be more easily recognizable for readers familiar
> with that but not Linux specifics, but that's just nomenclature and not
> particularly pressing or any kind of requirement, just a non-Linux
> precedent.

I couldn't think of a good name for it, so I settled on it being the first
page.

How about page_head()?

> __GFP_COMP was introduced because several unusual drivers allocate
> higher-order pages and then move on to free fragments of them. There's
> a small danger some others may allocate higher-order pages and then
> treat each piece as a separate entity (particularly in the freeing pass).

I wasn't aware of that. Looking at the mm code, doing a fragmentary release
would cause bad_page() to be invoked. Presumably these drivers modify the
various struct pages involved directly to keep the allocator happy.

It would be better, I think, to provide a page splitter function. Thus
allowing pages to be cut in half, and then have the two halves made into the
equivalent allocated pages.

> Sweeping affected drivers to use a fragmenting primitive may help here.

Do you know which drivers?

> >  (6) compound_page_order() is now available. This will indicate the order
> ...
> Possible, but it's likely a micro-optimization to cache the order in
> registers across function calls. The allocator is something of a ``hot
> path'' and small alterations can have noticeable effects.

Yes... but the order gets examined anyway in the free page checker, and the
second plus page structs get modified too, so I don't think it'll make much of
a difference. Plus the filesystem or driver that owned the page won't need to
keep track of the size, nor will it need to calculate it.

> >  (7) Trailing spaces have been cleaned up on lines in page_alloc.c.
> 
> I like this quite a bit. =)

	(defun trim ()
	  "Delete trailing whitespace in buffer"
	  (interactive)
	  (save-excursion
	    (goto-char (point-min))
	    (replace-regexp "[ \t\r]+$" "")
	    (goto-char (point-max))
	    (skip-chars-backward "\n")
	    (if (not (eobp))
		(delete-region (1+ (point)) (point-max)))))

> >  (8) bootmem.c now has a separate path to release pages to the main
> >      allocator
> ...
> Clearly it could merely scan the bitmap for the largest properly-sized,
> properly-aligned leading run of free bits beyond even that, though I
> wouldn't expect you to pursue that as it's far beyond the scope of the
> patch. I was hit up to deal with bootmem.c issues, and will be looking
> into that and more after the current set of bootmem changes has settled
> down and ia64 bootstrap has been stable for a while.

I may look at doing this after this patch (or similar) goes in. If so, I'll
send you the patch.

> (2) The physaddr alignment comment in bootmem.c is mangled. It's not
> 	O(LOG2(BITS_PER_LONG)) -aligned, it's exactly LOG2(BITS_PER_LONG)
> 	aligned. But we don't have a LOG2(...) macro, we have fls()/ffs().

I suspect that's meant to be mathematical notation, not strictly compilable
code, though I think there may be a missing "if" in it.

> (3) page_count() probably deserves the %0*lx treatment in __bad_page().
> 	Conserving screenspace when possible helps some, though that's
> 	offset a bit against predictable field alignment. Maybe putting
> 	variable-length fields at the end of the line would help.

I don't think that matters too much. This message should never be seen, after
all...

That said, I think I should probably provide a 64-bit version too... some of
the fields will have 16-char widths there.

>       Also, the pfn would be great to have there, too, while you're at it.

Okay.

> (4) I wonder if anyone's run with CONFIG_DEBUG_PAGEALLOC recently.
> 	bootmem.c seems a bit early for kernel_map_pages() et al.
> 	It could be okay depending.

I can try it. BTW, should the free page checking be contingent on this option?
Or maybe it should have its own option.

> (5) This patch does a fair number of different things and it takes a
> 	bit of effort to wade through some of the longer rearrangements
> 	as they overflow 80x24. It would be helpful for reviewers if you
> 	could break this down into a somewhat more easily-digestible
> 	series of smaller patches.

It's a little tricky to break it up logically since it's mostly incredibly
interrelated.

I could separate out some of the cleanups: rearrangement between files,
trailing space splatting.

David

next prev parent reply	other threads:[~2004-11-22 16:40 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2004-11-22 13:27 [PATCH] Compound page overhaul David Howells
2004-11-22 14:41 ` William Lee Irwin III
2004-11-22 16:07   ` David Howells [this message]
2004-11-22 16:34     ` William Lee Irwin III
2004-11-22 23:54 ` Andrew Morton
2004-11-23  9:18   ` David Howells
2004-11-23 16:11     ` Andrew Morton
2004-11-23 16:48       ` David Howells
2004-11-23 16:56         ` Andrew Morton
2004-11-23 17:48           ` David Howells
2004-11-23 17:10       ` William Lee Irwin III
2004-11-23 17:24         ` David Howells
2004-11-23 17:46           ` William Lee Irwin III
2004-11-23 17:51             ` David Howells
2004-11-24 14:22         ` Greg Ungerer
2004-11-24 18:03           ` David Howells
2004-11-25  3:37             ` Greg Ungerer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=14391.1101139626@redhat.com \
    --to=dhowells@redhat.com \
    --cc=akpm@osdl.org \
    --cc=hch@infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=torvalds@osdl.org \
    --cc=wli@holomorphy.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.