* [parisc-linux] [akpm@digeo.com: arch changes for file-offset-in-pte's]
@ 2003-03-22  1:20 Matthew Wilcox
From: Matthew Wilcox @ 2003-03-22  1:20 UTC
  To: parisc-linux

FYI.  jsm, any comments?

----- Forwarded message from Andrew Morton <akpm@digeo.com> -----

Envelope-to: willy@ftp.uk.linux.org
Delivery-date: Fri, 21 Mar 2003 23:52:14 +0000
Date: Fri, 21 Mar 2003 17:56:26 -0800
From: Andrew Morton <akpm@digeo.com>
To: "David S. Miller" <davem@redhat.com>, paulus@au.ibm.com,
        benh@kernel.crashing.org, rth@twiddle.net, davidm@hpl.hp.com,
        ralf@linux-mips.org, schwidefsky@de.ibm.com,
        Russell King <rmk@arm.linux.org.uk>, bjornw@axis.com,
        geert@linux-m68k.org, Matthew Wilcox <willy@debian.org>,
        gniibe@m17n.org, linux-sh@m17n.org, jdike@karaya.com,
        uclinux-v850@lsi.nec.co.jp
Cc: Ingo Molnar <mingo@elte.hu>, linux-mm@kvack.org
Subject: arch changes for file-offset-in-pte's
X-Mailer: Sylpheed version 0.8.10 (GTK+ 1.2.10; i686-pc-linux-gnu)
X-OriginalArrivalTime: 21 Mar 2003 23:51:17.0829 (UTC) FILETIME=[C3FA4750:01C2F004]


hi,

I'd like to submit Ingo's remap_file_pages() enhancements soon.  His patch
allows pages in "nonlinear" mappings to be reestablished by the kernel's
pagefault handler.

It does this by embedding the page's ->index into the pte which wants to map
the page.  This is arch-specific, and I only have ia32, ppc64 and x86_64 done.

So if and when this hits the tree, it will break other architectures.  It's a
five-minute fix.

Four things need to be provided:

pte_t pgoff_to_pte(unsigned long pgoff)

    Return a pte_t which contains as many of the lower bits of pgoff as
    you can feasibly pack into a pte.

    You'll probably need to reserve at least two bits - one for
    not-present and one to say "this is a pte_file pte".

unsigned long pte_to_pgoff(pte_t pte)

    Extract the unsigned long from a pte.

int pte_file(pte_t)

    Return true if the pte is a "file pte".  This is where you'll need to
    use the magical reserved bit to distinguish this from a swapped out pte.

PTE_FILE_MAX_BITS	(a constant)

    Tells the kernel how many bits of the file offset the architecture is
    capable of placing in the pte, via pgoff_to_pte().  ia32 sets this to 29
    in non-PAE mode, 32 in PAE mode (CONFIG_HIGHMEM64G)
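
To make the four hooks above concrete, here is a minimal sketch for a made-up
32-bit pte layout.  Everything prefixed ex_/EX_ is hypothetical and not taken
from any real port: it assumes bit 0 is the present bit, bit 1 is the "file
pte" marker, and the remaining 30 bits hold the offset.  A real architecture
will likely have different reserved bits (ia32 non-PAE only gets 29).

```c
#include <stdint.h>

/*
 * Hypothetical 32-bit layout, for illustration only:
 *   bit 0       present (always clear in a file pte)
 *   bit 1       EX_PAGE_FILE marker, meaningful only when bit 0 is clear
 *   bits 2..31  low 30 bits of the page offset
 */
typedef struct { uint32_t pte; } ex_pte_t;

#define EX_PAGE_PRESENT      0x001u
#define EX_PAGE_FILE         0x002u
#define EX_PTE_FILE_SHIFT    2
#define EX_PTE_FILE_MAX_BITS 30

/* Pack as many low bits of pgoff as fit; higher bits are silently dropped. */
static inline ex_pte_t ex_pgoff_to_pte(unsigned long pgoff)
{
	ex_pte_t pte;

	pte.pte = ((uint32_t)pgoff << EX_PTE_FILE_SHIFT) | EX_PAGE_FILE;
	return pte;
}

/* Recover the stored offset bits from a file pte. */
static inline unsigned long ex_pte_to_pgoff(ex_pte_t pte)
{
	return pte.pte >> EX_PTE_FILE_SHIFT;
}

/* Only meaningful for a pte that is not present. */
static inline int ex_pte_file(ex_pte_t pte)
{
	return (pte.pte & EX_PAGE_FILE) != 0;
}

static inline int ex_pte_present(ex_pte_t pte)
{
	return (pte.pte & EX_PAGE_PRESENT) != 0;
}
```

With this layout an offset that needs more than EX_PTE_FILE_MAX_BITS bits does
not round-trip, which is why the kernel has to be told the limit.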


As an example, here is the x86_64 implementation (the comment next to
_PAGE_FILE is wrong, btw.  These are not swapcache pages).

The ia32 version of this code is right at the start of the main patch, at

http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.5/2.5.65/2.5.65-mm3/broken-out/remap-file-pages-2.5.63-a1.patch


Thanks.


diff -puN include/asm-x86_64/pgtable.h~file-offset-in-pte-x86_64 include/asm-x86_64/pgtable.h
--- 25/include/asm-x86_64/pgtable.h~file-offset-in-pte-x86_64	2003-03-13 04:45:57.000000000 -0800
+++ 25-akpm/include/asm-x86_64/pgtable.h	2003-03-13 04:45:57.000000000 -0800
@@ -151,6 +151,7 @@ static inline void set_pml4(pml4_t *dst,
 #define _PAGE_ACCESSED	0x020
 #define _PAGE_DIRTY	0x040
 #define _PAGE_PSE	0x080	/* 2MB page */
+#define _PAGE_FILE	0x040	/* pagecache or swap */
 #define _PAGE_GLOBAL	0x100	/* Global TLB entry */
 
 #define _PAGE_PROTNONE	0x080	/* If not present */
@@ -245,6 +246,7 @@ extern inline int pte_exec(pte_t pte)		{
 extern inline int pte_dirty(pte_t pte)		{ return pte_val(pte) & _PAGE_DIRTY; }
 extern inline int pte_young(pte_t pte)		{ return pte_val(pte) & _PAGE_ACCESSED; }
 extern inline int pte_write(pte_t pte)		{ return pte_val(pte) & _PAGE_RW; }
+static inline int pte_file(pte_t pte)		{ return pte_val(pte) & _PAGE_FILE; }
 
 extern inline pte_t pte_rdprotect(pte_t pte)	{ set_pte(&pte, __pte(pte_val(pte) & ~_PAGE_USER)); return pte; }
 extern inline pte_t pte_exprotect(pte_t pte)	{ set_pte(&pte, __pte(pte_val(pte) & ~_PAGE_USER)); return pte; }
@@ -330,6 +332,11 @@ static inline pgd_t *current_pgd_offset_
 #define	pmd_bad(x)	((pmd_val(x) & (~PTE_MASK & ~_PAGE_USER)) != _KERNPG_TABLE )
 #define pfn_pmd(nr,prot) (__pmd(((nr) << PAGE_SHIFT) | pgprot_val(prot)))
 
+
+#define pte_to_pgoff(pte) ((pte_val(pte) & PHYSICAL_PAGE_MASK) >> PAGE_SHIFT)
+#define pgoff_to_pte(off) ((pte_t) { ((off) << PAGE_SHIFT) | _PAGE_FILE })
+#define PTE_FILE_MAX_BITS __PHYSICAL_MASK_SHIFT
+
 /* PTE - Level 1 access. */
 
 /* page, protection -> pte */
diff -puN include/asm-x86_64/page.h~file-offset-in-pte-x86_64 include/asm-x86_64/page.h
--- 25/include/asm-x86_64/page.h~file-offset-in-pte-x86_64	2003-03-13 04:45:57.000000000 -0800
+++ 25-akpm/include/asm-x86_64/page.h	2003-03-13 04:48:53.000000000 -0800
@@ -69,8 +69,9 @@ typedef struct { unsigned long pgprot; }
 /* See Documentation/x86_64/mm.txt for a description of the memory map. */
 #define __START_KERNEL		0xffffffff80100000
 #define __START_KERNEL_map	0xffffffff80000000
-#define __PAGE_OFFSET           0x0000010000000000
-#define __PHYSICAL_MASK		0x000000ffffffffff
+#define __PAGE_OFFSET           0x0000010000000000	/* 1 << 40 */
+#define __PHYSICAL_MASK_SHIFT	40
+#define __PHYSICAL_MASK		((1UL << __PHYSICAL_MASK_SHIFT) - 1)
 
 #define KERNEL_TEXT_SIZE  (40UL*1024*1024)
 #define KERNEL_TEXT_START 0xffffffff80000000UL 

_



----- End forwarded message -----

-- 
"It's not Hollywood.  War is real, war is primarily not about defeat or
victory, it is about death.  I've seen thousands and thousands of dead bodies.
Do you think I want to have an academic debate on this subject?" -- Robert Fisk


* [parisc-linux] [akpm@digeo.com: arch changes for file-offset-in-pte's]
@ 2003-03-24 13:28 John Marvin
  2003-03-24 19:19 ` Carlos O'Donell
From: John Marvin @ 2003-03-24 13:28 UTC
  To: parisc-linux

> FYI.  jsm, any comments?

It doesn't appear to be a problem, since the bit comes out of the swap
entry bits, not the "present" bits, of which we have no more available
(e.g. _PAGE_FILE can overload _PAGE_ACCESSED, since _PAGE_ACCESSED is
only valid when _PAGE_PRESENT is set).

The main issue is whether to take the bit out of the swap offset or swap
type. i386 went from 24 offset bits and 6 type bits to 24 offset bits and
5 type bits, i.e. they took the bit from the type field. We are currently
at 25 offset bits and 5 type bits.

The offset bits control the maximum swap size for each swap device. The
type bits control the number of possible swap devices. So, one possibility
is to reduce the number of possible swap devices from 32 to 16, rather
than decrease the maximum swap size. But since limiting the
maximum swap size would only really have an effect on the 32 bit kernel,
perhaps it would be better to leave the maximum number of swap devices at
32 and decrease the maximum swap size (i.e. the maximum swap size
for a swap device would decrease from 128 GB to 64 GB for a 32 bit
kernel). This would be the same as i386.
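
The numbers above can be checked with a small sketch.  It assumes 4 KB pages
(which is what makes 25 offset bits come out to 128 GB); the ex_/EX_ names
are made up for the illustration.

```c
#include <stdint.h>

#define EX_PAGE_SHIFT 12	/* assumption: 4 KB pages */

/* Largest swap area, in bytes, addressable with offset_bits of page offset. */
static inline uint64_t ex_max_swap_bytes(unsigned int offset_bits)
{
	return (uint64_t)1 << (offset_bits + EX_PAGE_SHIFT);
}

/* Number of distinct swap devices selectable with type_bits of type field. */
static inline unsigned int ex_max_swap_devices(unsigned int type_bits)
{
	return 1u << type_bits;
}
```

So 25 offset bits give 2^37 bytes (128 GB) per device and 24 give 2^36
(64 GB), while 5 type bits give 32 devices and 4 give 16.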

Another possibility would be to split out the SWP_TYPE, SWP_OFFSET
and SWP_ENTRY macros in order to have a 32 bit kernel version and a
64 bit kernel version. Then we could go with 16 swap devices for the
32 bit kernel and even increase the number of swap devices for the
64 bit kernel, since we have a huge number of bits for the offset.
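
A rough sketch of what such a split could look like follows.  The bit
positions are assumptions for illustration, not the actual parisc layout:
three low bits are taken as reserved (including the new file marker), the
32 bit variant uses 4 type bits (16 devices) and the 64 bit variant uses 6
(64 devices).  UINTPTR_MAX stands in for whatever config test the kernel
would really use, and all EX_ names are hypothetical.

```c
#include <stdint.h>

#define EX_SWP_RESERVED_BITS 3		/* assumed reserved low bits */

#if UINTPTR_MAX > 0xffffffffu		/* stand-in for a 64 bit kernel check */
#define EX_SWP_TYPE_BITS 6		/* 64 swap devices */
#else
#define EX_SWP_TYPE_BITS 4		/* 16 swap devices */
#endif

#define EX_SWP_TYPE_SHIFT   EX_SWP_RESERVED_BITS
#define EX_SWP_OFFSET_SHIFT (EX_SWP_TYPE_SHIFT + EX_SWP_TYPE_BITS)
#define EX_SWP_TYPE_MASK    ((1UL << EX_SWP_TYPE_BITS) - 1)

/* Extract the swap device index from a packed entry value. */
#define EX_SWP_TYPE(val) \
	(((unsigned long)(val) >> EX_SWP_TYPE_SHIFT) & EX_SWP_TYPE_MASK)

/* Extract the page offset within the swap device. */
#define EX_SWP_OFFSET(val) \
	((unsigned long)(val) >> EX_SWP_OFFSET_SHIFT)

/* Pack a device index and page offset, leaving the reserved bits clear. */
#define EX_SWP_ENTRY(type, offset) \
	((((unsigned long)(type) & EX_SWP_TYPE_MASK) << EX_SWP_TYPE_SHIFT) | \
	 ((unsigned long)(offset) << EX_SWP_OFFSET_SHIFT))
```

On a 32 bit build this leaves 25 bits of offset, so the per-device swap size
stays where it is today; on a 64 bit build the offset field is far larger
than needed even with the wider type field.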

Any opinions?

John


* Re: [parisc-linux] [akpm@digeo.com: arch changes for file-offset-in-pte's]
  2003-03-24 13:28 [parisc-linux] [akpm@digeo.com: arch changes for file-offset-in-pte's] John Marvin
@ 2003-03-24 19:19 ` Carlos O'Donell
From: Carlos O'Donell @ 2003-03-24 19:19 UTC
  To: John Marvin; +Cc: parisc-linux

> Another possibility would be to split out the SWP_TYPE, SWP_OFFSET
> and SWP_ENTRY macros in order to have a 32 bit kernel version and a
> 64 bit kernel version. Then we could go with 16 swap devices for the
> 32 bit kernel and even increase the number of swap devices for the
> 64 bit kernel, since we have a huge number of bits for the offset.

My $0.02

a) Keep the swap size the same, and reduce the number of entries.

	= Offers better compatibility in most situations

b) Split the macros, keeping the swap size the same, and increase 
   the number of possible devices.

	= Offers size consistency, and benefits to 64-bit kernels.

c.

