public inbox for linux-kernel@vger.kernel.org
From: William Lee Irwin III <wli@holomorphy.com>
To: Rusty Russell <rusty@rustcorp.com.au>
Cc: torvalds@osdl.org, mfedyk@matchmail.com, ebiederm@xmission.com,
	anton@mips.complang.tuwien.ac.at, linux-kernel@vger.kernel.org,
	phillips@arcor.de
Subject: Re: Page Colouring (was: 2.6.0 Huge pages not working as expected)
Date: Mon, 29 Dec 2003 20:59:18 -0800
Message-ID: <20031230045918.GA22443@holomorphy.com>
In-Reply-To: <20031230130029.6183a872.rusty@rustcorp.com.au>

On Mon, 29 Dec 2003 02:23:19 -0800 William Lee Irwin III wrote:
>> The fact merely elevating PAGE_SIZE breaks numerous things makes me
>> rather suspicious of claims that minimalistic patches can do likewise.

On Tue, Dec 30, 2003 at 01:00:29PM +1100, Rusty Russell wrote:
> Can you give an example?
> 	One approach is to simply present a larger page size to userspace w/
> getpagesize().  This does break ELF programs which have been laid out assuming
> the old page size (presumably they try to mprotect the read-only sections).
> On PPC, the ELF ABI already insists on a 64k boundary between such sections,
> and maybe for others you could simply round appropriately and pray, or do
> fine-grained protections (ie. on real pagesize) for that one case.

Apps must, of course, be relinked for that, but that's userspace. This
ABI change is largely out of the picture due to legacy binaries, user
virtualspace fragmentation (most likely an issue for 32-bit threading),
and so on. The choice of PAGE_SIZE in such schemes is also restricted
to no larger than whatever choice was used for userspace linking, which
is a relatively ugly dependency. There's also a question of "smooth
transition": the only way to "incrementally deploy" it on a mixture of
"ready" userspace and "unready" userspace is to turn it off. I suppose
it has the minor advantage of being trivial to program.
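
To make the linking dependency concrete, here is a minimal userspace
sketch (not kernel code). It assumes a binary whose read-only/writable
split sits at some address chosen by the linker; the function name and
the addresses in the usage note are hypothetical. The split can only be
protected at page granularity if it is aligned to the page size that
userspace is shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true when split_addr (the boundary between a read-only and a
 * writable section) falls exactly on a page boundary for page_size.
 * page_size must be a nonzero power of two.
 */
bool split_is_protectable(uint64_t split_addr, uint64_t page_size)
{
    /* Precondition: power-of-two page size, so the mask trick is valid. */
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

    return (split_addr & (page_size - 1)) == 0;
}
```

For example, a split at 0x08049000 (4KB-aligned) passes for a 4KB page
size but fails for 64KB, while a split at 0x10010000 (64KB-aligned)
passes for both, which is why binaries linked with a 64KB boundary
tolerate any page size up to 64KB.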

I had in mind pure kernel internal issues, not ABI.

The issues from raising PAGE_SIZE alone are things like interpreting
hardware descriptions in arch code, some shifts underflowing for things
like hashtables, certain drivers doing ioremap() and the like either
filling up vmallocspace or getting their math wrong, and some other
drivers getting calculations on physical addresses wrong or using
PAGE_SIZE to represent some 4KB or other fixed-size memory area
interpreted by hardware. Then there are filesystems that assume
blocksize == PAGE_SIZE or assume PAGE_SIZE is less than some particular
value (e.g. short offsets into pages, worst of all being signed
shorts), and tripping BUG()'s in ll_rw_blk.c when
512*q->max_sectors < PAGE_SIZE.
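
Three of these failure classes can be sketched as plain userspace
checks. This is an illustration under stated assumptions, not code from
any kernel tree: the function names are hypothetical, and the hash
sizing pattern "npages >> (fixed_shift - page_shift)" is an assumed
example of the kind of shift that underflows.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Sector size implied by the 512*q->max_sectors expression above. */
#define SECTOR_BYTES 512UL

/* Can every in-page offset (0 .. page_size-1) be stored in a signed short? */
bool offsets_fit_signed_short(unsigned long page_size)
{
    assert(page_size > 0);
    return page_size - 1 <= (unsigned long)SHRT_MAX;
}

/*
 * Hypothetical hash sizing of the form "npages >> (fixed_shift - page_shift)":
 * the shift count goes negative (undefined behaviour in C) once page_shift
 * exceeds fixed_shift.
 */
bool hash_shift_is_valid(int fixed_shift, int page_shift)
{
    return fixed_shift - page_shift >= 0;
}

/* Negation of the failing condition: 512*max_sectors < page_size. */
bool queue_holds_one_page(unsigned long max_sectors, unsigned long page_size)
{
    return SECTOR_BYTES * max_sectors >= page_size;
}
```

With 16-bit shorts, offsets_fit_signed_short() holds for 4KB and 32KB
pages but not for 64KB; hash_shift_is_valid(14, 16) fails where
hash_shift_is_valid(14, 12) passes; and a queue limited to 8 sectors
holds a 4KB page but not a 64KB one.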

These issues are the bulk of the work needing to be done for the driver
and fs sweeps. Actual concerns about MMUPAGE_SIZE in drivers/ and fs/
are rather limited in scope, though drivers/char/drm/ was somewhat
painful to get going (Zwane actually did most of this for me, as I have
no DRM/DRI-capable graphics cards at my disposal).


-- wli
