linux-mm.kvack.org archive mirror
From: Christoph Lameter <cl@linux.com>
To: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: David Rientjes <rientjes@google.com>,
	Hugh Dickins <hughd@google.com>,
	Eric Dumazet <eric.dumazet@gmail.com>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	linux-mm@kvack.org
Subject: [slubllv333num@/21] mm: Rearrange struct page
Date: Fri, 15 Apr 2011 15:12:56 -0500	[thread overview]
Message-ID: <20110415201300.792352989@linux.com> (raw)
In-Reply-To: <20110415201246.096634892@linux.com>

[-- Attachment #1: resort_struct_page --]
[-- Type: text/plain, Size: 5157 bytes --]

We need to be able to use cmpxchg_double on the freelist pointer and the
object count fields in struct page. Rearrange the fields in struct page
into doubleword entities so that the freelist pointer comes before the
counters. The rearrangement also anticipates a future in which we use more
doubleword atomics to avoid locking updates to flags/mapping or the lru
pointers.

Create another union to allow access to counters in struct page as a
single unsigned long value.

The doublewords must be properly aligned for cmpxchg_double to work.
Sadly, this increases the size of struct page by one word, but as a result
page structs are now cacheline aligned on x86_64.

Signed-off-by: Christoph Lameter <cl@linux.com>

---
 include/linux/mm_types.h |   85 ++++++++++++++++++++++++++++++-----------------
 1 file changed, 55 insertions(+), 30 deletions(-)

Index: linux-2.6/include/linux/mm_types.h
===================================================================
--- linux-2.6.orig/include/linux/mm_types.h	2011-04-15 13:14:36.000000000 -0500
+++ linux-2.6/include/linux/mm_types.h	2011-04-15 13:14:48.000000000 -0500
@@ -30,52 +30,68 @@ struct address_space;
  * moment. Note that we have no way to track which tasks are using
  * a page, though if it is a pagecache page, rmap structures can tell us
  * who is mapping it.
+ *
+ * The objects in struct page are organized in double word blocks in
+ * order to allow us to use atomic double word operations on portions
+ * of struct page. That is currently only used by SLUB but the arrangement
+ * allows the use of atomic double word operations on the flags/mapping
+ * and lru list pointers also.
  */
 struct page {
+	/* First double word block */
 	unsigned long flags;		/* Atomic flags, some possibly
 					 * updated asynchronously */
-	atomic_t _count;		/* Usage count, see below. */
-	union {
-		atomic_t _mapcount;	/* Count of ptes mapped in mms,
-					 * to show when page is mapped
-					 * & limit reverse map searches.
+	struct address_space *mapping;	/* If low bit clear, points to
+					 * inode address_space, or NULL.
+					 * If page mapped as anonymous
+					 * memory, low bit is set, and
+					 * it points to anon_vma object:
+					 * see PAGE_MAPPING_ANON below.
 					 */
-		struct {		/* SLUB */
-			unsigned inuse:16;
-			unsigned objects:15;
-			unsigned frozen:1;
+	/* Second double word block used by SLUB */
+	union {
+		pgoff_t index;		/* Our offset within mapping. */
+		void *freelist;		/* SLUB: freelist req. slab lock */
+	};
+	union {
+		unsigned long counters;
+		struct {
+			union {
+				struct {		/* SLUB */
+					unsigned inuse:16;
+					unsigned objects:15;
+					unsigned frozen:1;
+				};
+				atomic_t _mapcount;	/* Count of ptes mapped in mms,
+							 * to show when page is mapped
+							 * & limit reverse map searches.
+							 */
+			};
+			atomic_t _count;		/* Usage count, see below. */
 		};
 	};
+
+	/* Third double word block */
+	struct list_head lru;		/* Pageout list, eg. active_list
+					 * protected by zone->lru_lock !
+					 */
+
+	/* Remainder is not double word aligned */
 	union {
-	    struct {
-		unsigned long private;		/* Mapping-private opaque data:
+	 	unsigned long private;		/* Mapping-private opaque data:
 					 	 * usually used for buffer_heads
 						 * if PagePrivate set; used for
 						 * swp_entry_t if PageSwapCache;
 						 * indicates order in the buddy
 						 * system if PG_buddy is set.
 						 */
-		struct address_space *mapping;	/* If low bit clear, points to
-						 * inode address_space, or NULL.
-						 * If page mapped as anonymous
-						 * memory, low bit is set, and
-						 * it points to anon_vma object:
-						 * see PAGE_MAPPING_ANON below.
-						 */
-	    };
 #if USE_SPLIT_PTLOCKS
-	    spinlock_t ptl;
+		spinlock_t ptl;
 #endif
-	    struct kmem_cache *slab;	/* SLUB: Pointer to slab */
-	    struct page *first_page;	/* Compound tail pages */
+		struct kmem_cache *slab;	/* SLUB: Pointer to slab */
+		struct page *first_page;	/* Compound tail pages */
 	};
-	union {
-		pgoff_t index;		/* Our offset within mapping. */
-		void *freelist;		/* SLUB: freelist req. slab lock */
-	};
-	struct list_head lru;		/* Pageout list, eg. active_list
-					 * protected by zone->lru_lock !
-					 */
+
 	/*
 	 * On machines where all RAM is mapped into kernel address space,
 	 * we can simply calculate the virtual address. On machines with
@@ -101,7 +117,16 @@ struct page {
 	 */
 	void *shadow;
 #endif
-};
+}
+/*
+ * If another subsystem starts using the double word pairing for atomic
+ * operations on struct page then it must change the #if to ensure
+ * proper alignment of the page struct.
+ */
+#if defined(CONFIG_SLUB) && defined(CONFIG_CMPXCHG_LOCAL)
+	__attribute__((__aligned__(2*sizeof(unsigned long))))
+#endif
+;
 
 /*
  * A region containing a mapping of a non-memory backed file under NOMMU



Thread overview: 22+ messages
2011-04-15 20:12 [slubllv333num@/21] SLUB: Lockless freelists for objects V3 Christoph Lameter
2011-04-15 20:12 ` [slubllv333num@/21] slub: Use NUMA_NO_NODE in get_partial Christoph Lameter
2011-04-15 20:12 ` [slubllv333num@/21] slub: get_map() function to establish map of free objects in a slab Christoph Lameter
2011-04-15 20:12 ` [slubllv333num@/21] slub: Eliminate repeated use of c->page through a new page variable Christoph Lameter
2011-04-15 20:12 ` [slubllv333num@/21] slub: Move node determination out of hotpath Christoph Lameter
2011-04-15 20:12 ` [slubllv333num@/21] slub: Move debug handling in __slab_free Christoph Lameter
2011-04-15 20:12 ` [slubllv333num@/21] slub: Per object NUMA support Christoph Lameter
2011-04-15 20:12 ` [slubllv333num@/21] slub: Do not use frozen page flag but a bit in the page counters Christoph Lameter
2011-04-15 20:12 ` [slubllv333num@/21] slub: Move page->frozen handling near where the page->freelist handling occurs Christoph Lameter
2011-04-15 20:12 ` [slubllv333num@/21] x86: Add support for cmpxchg_double Christoph Lameter
2011-04-15 20:12 ` Christoph Lameter [this message]
2011-04-15 20:12 ` [slubllv333num@/21] slub: Add cmpxchg_double_slab() Christoph Lameter
2011-04-15 20:12 ` [slubllv333num@/21] slub: explicit list_lock taking Christoph Lameter
2011-04-15 20:12 ` [slubllv333num@/21] slub: Pass kmem_cache struct to lock and freeze slab Christoph Lameter
2011-04-15 20:13 ` [slubllv333num@/21] slub: Rework allocator fastpaths Christoph Lameter
2011-04-15 20:13 ` [slubllv333num@/21] slub: Invert locking and avoid slab lock Christoph Lameter
2011-04-15 20:13 ` [slubllv333num@/21] slub: Disable interrupts in free_debug processing Christoph Lameter
2011-04-15 20:13 ` [slubllv333num@/21] slub: Avoid disabling interrupts in free slowpath Christoph Lameter
2011-04-15 20:13 ` [slubllv333num@/21] slub: Get rid of the another_slab label Christoph Lameter
2011-04-15 20:13 ` [slubllv333num@/21] slub: fast release on full slab Christoph Lameter
2011-04-15 20:13 ` [slubllv333num@/21] slub: Not necessary to check for empty slab on load_freelist Christoph Lameter
2011-04-15 20:13 ` [slubllv333num@/21] slub: update statistics for cmpxchg handling Christoph Lameter
