public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: William Lee Irwin III <wli@holomorphy.com>
To: Andrew Morton <akpm@digeo.com>,
	"Randy.Dunlap" <rddunlap@osdl.org>,
	colin@colina.demon.co.uk, linux-kernel@vger.kernel.org
Subject: Re: Maximum swap space?
Date: Sat, 7 Jun 2003 17:55:43 -0700	[thread overview]
Message-ID: <20030608005543.GM20413@holomorphy.com> (raw)
In-Reply-To: <20030607205046.GL20413@holomorphy.com>

On Sat, Jun 07, 2003 at 01:50:46PM -0700, William Lee Irwin III wrote:
> These limits could be slightly relaxed by the kernel with some slightly
> more complex bit twiddlings to recover up to 6 bits of the lower byte
> of a non-present PTE for 64 swapfiles. The limitation on size seems to
> be in userspace.  It appears the kernel has 24 bits for offsets in 4KB
> units, for up to something approaching 64GB swapfiles. Andi Kleen tells
> me newer distributions have fixed the mkswap(8) userspace limitation.
> So non-PAE x86 should be able to do 4TB of aggregate swapspace modulo
> vmallocspace and/or ZONE_NORMAL exhaustion from swap maps. Also, PAE
> should be able to do 64TB of aggregate swapspace (modulo vmallocespace)
> since it has an additional 4 bits usage for page offsets. But I didn't
> audit intensively, so some silly limits may be lurking in dark corners.

Santamarta on #kn tested the following patch to allow up to 64
swapfiles.


diff -prauN linux-2.5.70/include/asm-i386/pgtable.h swap-2.5.70/include/asm-i386/pgtable.h
--- linux-2.5.70/include/asm-i386/pgtable.h	Thu May  1 19:15:41 2003
+++ swap-2.5.70/include/asm-i386/pgtable.h	Sat Jun  7 16:47:04 2003
@@ -106,6 +106,7 @@
 #define _PAGE_BIT_PCD		4
 #define _PAGE_BIT_ACCESSED	5
 #define _PAGE_BIT_DIRTY		6
+#define _PAGE_BIT_FILE		6
 #define _PAGE_BIT_PSE		7	/* 4 MB (or 2MB) page, Pentium+, if present.. */
 #define _PAGE_BIT_GLOBAL	8	/* Global TLB entry PPro+ */
 
@@ -320,12 +321,38 @@
  */
 #define update_mmu_cache(vma,address,pte) do { } while (0)
 
-/* Encode and de-code a swap entry */
-#define __swp_type(x)			(((x).val >> 1) & 0x1f)
+/*
+ * Encode and de-code a swap entry
+ * PAE could use more swapspace if swp_entry_t were wider, as there
+ * is an additional word in PTE's with 4 bits available. The benefit
+ * of extending it for such is, however, questionable.
+ */
+
 #define __swp_offset(x)			((x).val >> 8)
-#define __swp_entry(type, offset)	((swp_entry_t) { ((type) << 1) | ((offset) << 8) })
 #define __pte_to_swp_entry(pte)		((swp_entry_t) { (pte).pte_low })
 #define __swp_entry_to_pte(x)		((pte_t) { (x).val })
+
+/*
+ * Bit 0 is reserved for present/non-present, and _PAGE_BIT_FILE is
+ * reserved for non-present PTE's representing file pages.
+ */
+#define __swp_type(entry)						\
+({									\
+	unsigned long __val__ = (entry).val;				\
+	(__val__ & (_PAGE_FILE - 1)) >> 1				\
+		| (__val__ & (_PAGE_FILE << 1)) >> 2;			\
+})
+
+#define __swp_entry(type, offset)					\
+({									\
+	unsigned long __type__ = type;					\
+	(swp_entry_t)							\
+		{ (offset) << 8						\
+			| ((__type__ << 1) & (_PAGE_FILE - 1))		\
+			| ((__type__ << 2) & (_PAGE_FILE << 1)) };	\
+})
+
+#define MAX_SWAPFILES_SHIFT	_PAGE_BIT_FILE
 
 #endif /* !__ASSEMBLY__ */
 
diff -prauN linux-2.5.70/include/linux/swap.h swap-2.5.70/include/linux/swap.h
--- linux-2.5.70/include/linux/swap.h	Wed May  7 21:19:58 2003
+++ swap-2.5.70/include/linux/swap.h	Sat Jun  7 15:09:28 2003
@@ -10,6 +10,7 @@
 #include <linux/sched.h>
 #include <asm/atomic.h>
 #include <asm/page.h>
+#include <asm/pgtable.h>
 
 #define SWAP_FLAG_PREFER	0x8000	/* set if swap priority specified */
 #define SWAP_FLAG_PRIO_MASK	0x7fff
@@ -25,10 +26,14 @@
  * be swapped to.  The swap type and the offset into that swap type are
  * encoded into pte's and into pgoff_t's in the swapcache.  Using five bits
  * for the type means that the maximum number of swapcache pages is 27 bits
- * on 32-bit-pgoff_t architectures.  And that assumes that the architecture packs
- * the type/offset into the pte as 5/27 as well.
+ * on 32-bit-pgoff_t architectures.  And that assumes that the
+ * architecture packs the type/offset into the pte as 5/27 as well.
+ * Architectures can override this by simply defining MAX_SWAPFILES_SHIFT
+ * in appropriate headers.
  */
+#ifndef MAX_SWAPFILES_SHIFT
 #define MAX_SWAPFILES_SHIFT	5
+#endif
 #define MAX_SWAPFILES		(1 << MAX_SWAPFILES_SHIFT)
 
 /*

  reply	other threads:[~2003-06-08  0:43 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2003-06-07  8:55 Maximum swap space? Colin Paul Adams
2003-06-07 17:50 ` Randy.Dunlap
2003-06-07 20:24   ` Andrew Morton
2003-06-07 20:50     ` William Lee Irwin III
2003-06-08  0:55       ` William Lee Irwin III [this message]
2003-06-08  1:28         ` Andrew Morton
2003-06-08  1:38           ` William Lee Irwin III
2003-06-08  6:31             ` William Lee Irwin III
2003-06-07 21:43   ` H. Peter Anvin
2003-06-07 21:49     ` William Lee Irwin III
2003-06-07 22:18       ` H. Peter Anvin
2003-06-07 22:43         ` William Lee Irwin III
2003-06-07 22:28 ` Rik van Riel
2003-06-10 19:00   ` Randy.Dunlap
2003-06-10 19:29     ` William Lee Irwin III
  -- strict thread matches above, loose matches on Subject: below --
2003-06-08 22:51 Andries.Brouwer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20030608005543.GM20413@holomorphy.com \
    --to=wli@holomorphy.com \
    --cc=akpm@digeo.com \
    --cc=colin@colina.demon.co.uk \
    --cc=linux-kernel@vger.kernel.org \
    --cc=rddunlap@osdl.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox