From: Catalin Marinas <catalin.marinas@arm.com>
To: Russell King <rmk@arm.linux.org.uk>
Cc: Jared Hulbert <jaredeh@gmail.com>, Nick Piggin <npiggin@suse.de>,
Martin Schwidefsky <martin.schwidefsky@de.ibm.com>,
carsteno@linux.vnet.ibm.com,
Heiko Carstens <h.carstens@de.ibm.com>,
Linux Memory Management List <linux-mm@kvack.org>,
linux-arch@vger.kernel.org
Subject: Re: [rfc][patch] mm: use a pte bit to flag normal pages
Date: Tue, 08 Jan 2008 13:54:15 +0000
Message-ID: <1199800456.17809.72.camel@pc1117.cambridge.arm.com>
In-Reply-To: <20080108105227.GA10546@flint.arm.linux.org.uk>
On Tue, 2008-01-08 at 10:52 +0000, Russell King wrote:
> On Tue, Jan 08, 2008 at 10:11:15AM +0000, Catalin Marinas wrote:
> > Can we not restrict these to a maximum of 8 base types at run-time? If
> > yes, we can only use 3 bits for encoding and also benefit from the
> > automatic remapping in later ARM CPUs. For those not familiar with ARM,
> > 8 combinations of the TEX, C, B and S (shared) bits can be specified in
> > separate registers and the pte would only use 3 bits to refer to those.
> > Even older cores would benefit from this as I think it is faster to read
> > the encoding from an array in set_pte than doing all the bit comparisons
> > to calculate the hardware pte in the current implementation.
>
> So basically that gives us the following combinations:
I reordered them a bit for easier commenting.
> TEXCB
> 00010 - write through mappings (selectable via kernel command line)
> and also work-around for user read-only write-back mappings
> on PXA2.
> 00011 - normal write back mappings
> 00111 - write back write allocate mappings
Do you need to use all of the above at the same time? We could have only
one type, "normal memory", and configure the desired TEX encoding at
boot time.
> 00000 - /dev/mem and device uncachable mappings (strongly ordered)
> 00101 - Xscale3 "shared device" work-around for strongly ordered mappings
> 01000 - non-shared device (will be required to map some devices to
> userspace)
> and also Xscale3 work-around for strongly ordered mappings
I don't know the details of the Xscale3 bug but would you need all of
these encodings at run-time? Do you need both "strongly ordered" and the
workaround? We could have only the "strongly ordered" type and configure
the TEX bits at boot time to be "shared device" if the workaround is
needed.
For the last one, we could have the "non-shared device" type.
> 00001 - frame buffers
This would be "shared device" on newer CPUs.
> 00110 - PXA3 mini-cache or other "implementation defined features"
> 10111 - Xscale3 L2 cache-enabled mappings
It depends on how many of these you would need at run-time. If the base
types are "normal", "strongly ordered", "shared device", "non-shared
device", you still have 4 more left (or 3 on ARMv6 with TEX remapping
enabled since one encoding is implementation defined).
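
A rough, untested sketch of the boot-time configuration idea, in C. The
enum names, the table and the two init flags are hypothetical and are not
existing kernel symbols; the TEXCB values are the ones from the list
quoted above.

```c
#include <stdint.h>

/* Hypothetical base memory types; the names are illustrative only. */
enum mem_type {
	MT_NORMAL,
	MT_STRONGLY_ORDERED,
	MT_SHARED_DEVICE,
	MT_NONSHARED_DEVICE,
	MT_NR_BASE,		/* number of base types in use (4) */
	MT_MAX = 8		/* a 3-bit index gives 8 slots */
};

/* TEXCB encodings (TEX[2:0], C, B) as listed in the quoted table. */
#define TEXCB_STRONGLY_ORDERED	0x00	/* 00000 */
#define TEXCB_SHARED_DEVICE	0x01	/* 00001 */
#define TEXCB_WRITE_THROUGH	0x02	/* 00010 */
#define TEXCB_WRITE_BACK	0x03	/* 00011 */
#define TEXCB_XSC3_SO_WA	0x05	/* 00101 */
#define TEXCB_NONSHARED_DEVICE	0x08	/* 01000 */

/* One encoding per type, chosen once at boot (assumed layout). */
static uint8_t mem_type_texcb[MT_MAX];

/*
 * Pick the encoding behind each base type at boot, so that only the
 * type index has to be stored in the pte. Both flags are assumptions
 * standing in for a command line option and a CPU erratum check.
 */
static void mem_types_init(int want_write_through, int xsc3_so_workaround)
{
	mem_type_texcb[MT_NORMAL] = want_write_through ?
		TEXCB_WRITE_THROUGH : TEXCB_WRITE_BACK;
	mem_type_texcb[MT_STRONGLY_ORDERED] = xsc3_so_workaround ?
		TEXCB_XSC3_SO_WA : TEXCB_STRONGLY_ORDERED;
	mem_type_texcb[MT_SHARED_DEVICE] = TEXCB_SHARED_DEVICE;
	mem_type_texcb[MT_NONSHARED_DEVICE] = TEXCB_NONSHARED_DEVICE;
}

/* Slots still available for further types. */
static int mem_types_free_slots(void)
{
	return MT_MAX - MT_NR_BASE;
}
```

With four base types behind a 3-bit index, four slots remain, which
matches the count given above.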
> It's unclear at present what circumstances you'd use each of the two
> Xscale3 work-around bit combinations - or indeed whether there's a
> printing error in the documentation concerning TEXCB=00101.
As I said, I don't know the details of this bug and can't comment.
> It's also unclear how to squeeze these down into a bit pattern in such
> a way that we avoid picking out bits from the Linux PTE, and recombining
> them so we can look them up in a table or whatever - especially given
> that set_pte is a fast path and extra cycles there have a VERY noticable
> impact on overall system performance.
As with the automatic remapping on ARMv6, we could use TEX[0], C and B
to form the 3-bit index into the table. For pre-ARMv6 hardware, we need
a bit of shifting and masking before looking up in the table of eight
32-bit words but, for subsequent calls to set_pte, it is likely that the
table would be in cache anyway. There is also the option of choosing 3
consecutive bits to avoid shifting on pre-ARMv6.
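
The two indexing options can be sketched in C as follows. This is an
untested illustration, not the kernel's set_pte: the bit positions, the
table name and the function names are assumptions. The first variant
gathers TEX[0], C and B from scattered positions; the second assumes the
three bits sit at bits [4:2], so the masked value is already the byte
offset into a table of 32-bit words and no shift is needed.

```c
#include <stdint.h>

/* Assumed pte bit positions for the scattered case (illustrative). */
#define PTE_B		(1u << 2)
#define PTE_C		(1u << 3)
#define PTE_TEX0	(1u << 6)

/* Eight hardware memory-type words, filled in at boot (hypothetical). */
static uint32_t hw_mt_table[8];

/* Scattered bits: B -> index bit 0, C -> bit 1, TEX[0] -> bit 2. */
static inline unsigned int mt_index_scattered(uint32_t pte)
{
	return ((pte >> 2) & 0x3) | ((pte >> 4) & 0x4);
}

static inline uint32_t hw_mt_scattered(uint32_t pte)
{
	return hw_mt_table[mt_index_scattered(pte)];
}

/*
 * Consecutive bits at [4:2]: (pte & 0x1c) equals index * 4, which is
 * the byte offset of the entry, so a single mask is enough.
 */
static inline uint32_t hw_mt_consecutive(uint32_t pte)
{
	uint32_t byte_off = pte & 0x1c;

	return *(const uint32_t *)((const unsigned char *)hw_mt_table +
				   byte_off);
}
```

Either way the extra cost over the current code is a mask (plus a shift
and an OR in the scattered case) and one load from a 32-byte table.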
I agree there would be a delay on pre-ARMv6 CPUs but the impact might
not be that big since the current set_pte implementations still do
additional bit shifting/comparison for the access permissions. The
advantage is that we free 2 bits from the TEXCB encoding.
I haven't run any benchmarks and I can't say how big the impact is but,
based on some past discussions, 3-4 more cycles in set_pte might go
unnoticed because of other, bigger overheads.
--
Catalin