public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Andrea Arcangeli <andrea@novell.com>
To: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Andrew Morton <akpm@osdl.org>, linux-kernel@vger.kernel.org
Subject: Re: ZONE_PADDING wastes 4 bytes of the new cacheline
Date: Mon, 25 Oct 2004 14:44:43 +0200	[thread overview]
Message-ID: <20041025124443.GV14325@dualathlon.random> (raw)
In-Reply-To: <417A86AC.2080505@yahoo.com.au>

The more I read the old protection setting the more it seems broken. It
even looks it would buffer overflow if it wasn't by luck that
GFP_ZONETYPES == GFP_ZONEMASK, and I suspect the Non-loner version of
GFP_ZONETYPES would trigger it. I don't understand what loner means.
online dict returns:

loner
     n : a person who avoids the company or assistance of others

nosense for non-english speaker in GFP_ZONETYPES context:

/* #define GFP_ZONETYPES	(GFP_ZONEMASK + 1) */		/* Non-loner */
#define GFP_ZONETYPES	((GFP_ZONEMASK + 1) / 2 + 1)		/* Loner */


anyways GFP_ZONETYPES has absolutely nothing to do with this code, no
idea how it splipped into it, this is not a zonelist, this is a
node_zones array only in function of MAX_NR_ZONES and it makes no sense
to me to use anything but MAX_NR_ZONES to initialialize it.

This is my forward port from 2.4 that should fix it.

static void setup_per_zone_lowmem_reserve(void)
{
	struct pglist_data *pgdat;
	int j, idx;

	for_each_pgdat(pgdat) {
		for (j = 0; j < MAX_NR_ZONES; j++) {
			zone_t *zone = pgdat->node_zones + j;
			unsigned long presentpages = zone->presentpages;

			zone->lowmem_reserve[j] = 0;

			for (idx = j-1; idx >= 0; idx--) {
				zone_t * lower_zone = pgdat->node_zones + idx;

				lower_zone->lowmem_reserve[j] = presetpages / lower_zone_reserve_ratio[idx];
				presentpages += lower_zone->presentpages;
			}
		}
	}
}

here a diff to compare it with current 2.6 code.

diff -urNp --exclude CVS --exclude BitKeeper --exclude {arch} --exclude .arch-ids -u sles-ref/mm/page_alloc.c sles/mm/page_alloc.c
--- sles-ref/mm/page_alloc.c	2004-10-25 13:50:37.287995072 +0200
+++ sles/mm/page_alloc.c	2004-10-25 14:37:23.358407528 +0200
@@ -1905,87 +1892,29 @@ void __init page_alloc_init(void)
 	hotcpu_notifier(page_alloc_cpu_notify, 0);
 }
 
-static unsigned long higherzone_val(struct zone *z, int max_zone,
-					int alloc_type)
-{
-	int z_idx = zone_idx(z);
-	struct zone *higherzone;
-	unsigned long pages;
-
-	/* there is no higher zone to get a contribution from */
-	if (z_idx == MAX_NR_ZONES-1)
-		return 0;
-
-	higherzone = &z->zone_pgdat->node_zones[z_idx+1];
-
-	/* We always start with the higher zone's protection value */
-	pages = higherzone->protection[alloc_type];
-
-	/*
-	 * We get a lower-zone-protection contribution only if there are
-	 * pages in the higher zone and if we're not the highest zone
-	 * in the current zonelist.  e.g., never happens for GFP_DMA. Happens
-	 * only for ZONE_DMA in a GFP_KERNEL allocation and happens for ZONE_DMA
-	 * and ZONE_NORMAL for a GFP_HIGHMEM allocation.
-	 */
-	if (higherzone->present_pages && z_idx < alloc_type)
-		pages += higherzone->pages_low * sysctl_lower_zone_protection;
-
-	return pages;
-}
-
 /*
- * setup_per_zone_protection - called whenver min_free_kbytes or
- *	sysctl_lower_zone_protection changes.  Ensures that each zone
- *	has a correct pages_protected value, so an adequate number of
+ * setup_per_zone_lowmem_reserve - called whenever
+ *	sysctl_lower_zone_reserve_ratio changes.  Ensures that each zone
+ *	has a correct pages reserved value, so an adequate number of
  *	pages are left in the zone after a successful __alloc_pages().
- *
- *	This algorithm is way confusing.  I tries to keep the same behavior
- *	as we had with the incremental min iterative algorithm.
  */
-static void setup_per_zone_protection(void)
+static void setup_per_zone_lowmem_reserve(void)
 {
 	struct pglist_data *pgdat;
-	struct zone *zones, *zone;
-	int max_zone;
-	int i, j;
+	int j, idx;
 
 	for_each_pgdat(pgdat) {
-		zones = pgdat->node_zones;
+		for (j = 0; j < MAX_NR_ZONES; j++) {
+			zone_t *zone = pgdat->node_zones + j;
+			unsigned long presentpages = zone->presentpages;
 
-		for (i = 0, max_zone = 0; i < MAX_NR_ZONES; i++)
-			if (zones[i].present_pages)
-				max_zone = i;
+			zone->lowmem_reserve[j] = 0;
 
-		/*
-		 * For each of the different allocation types:
-		 * GFP_DMA -> GFP_KERNEL -> GFP_HIGHMEM
-		 */
-		for (i = 0; i < GFP_ZONETYPES; i++) {
-			/*
-			 * For each of the zones:
-			 * ZONE_HIGHMEM -> ZONE_NORMAL -> ZONE_DMA
-			 */
-			for (j = MAX_NR_ZONES-1; j >= 0; j--) {
-				zone = &zones[j];
+			for (idx = j-1; idx >= 0; idx--) {
+				zone_t * lower_zone = pgdat->node_zones + idx;
 
-				/*
-				 * We never protect zones that don't have memory
-				 * in them (j>max_zone) or zones that aren't in
-				 * the zonelists for a certain type of
-				 * allocation (j>=i).  We have to assign these
-				 * to zero because the lower zones take
-				 * contributions from the higher zones.
-				 */
-				if (j > max_zone || j >= i) {
-					zone->protection[i] = 0;
-					continue;
-				}
-				/*
-				 * The contribution of the next higher zone
-				 */
-				zone->protection[i] = higherzone_val(zone,
-								max_zone, i);
+				lower_zone->lowmem_reserve[j] = presetpages / lower_zone_reserve_ratio[idx];
+				presentpages += lower_zone->presentpages;
 			}
 		}
 	}

quite some garbage got deleted:

andrea@dualathlon:~/devel/kernel> diffstat /tmp/z
 page_alloc.c |   84 +++++++++--------------------------------------------------
 1 files changed, 13 insertions(+), 71 deletions(-)
andrea@dualathlon:~/devel/kernel> 

I'll adapt the rest of the code to deal with this and test it in a few
more minutes.

  reply	other threads:[~2004-10-25 12:45 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2004-10-21  1:17 ZONE_PADDING wastes 4 bytes of the new cacheline Andrea Arcangeli
2004-10-21  3:10 ` Nick Piggin
2004-10-21  4:36   ` Andrew Morton
2004-10-21  4:53     ` Nick Piggin
2004-10-21 10:51     ` Mikael Pettersson
2004-10-21 12:45       ` Andrea Arcangeli
2004-10-21 18:54         ` Adam Heath
2004-10-21 20:21           ` DaMouse
2004-10-21 21:24             ` Jon Masters
2004-10-22 10:09               ` DaMouse
2004-10-21 22:26     ` Nick Piggin
2004-10-21 22:45       ` Andrea Arcangeli
2004-10-22  0:34         ` Nick Piggin
2004-10-22  1:10           ` Andrea Arcangeli
2004-10-22  1:26             ` Andrew Morton
2004-10-22  2:55               ` Jesse Barnes
2004-10-22  3:38                 ` Nick Piggin
2004-10-22  3:49                   ` Jesse Barnes
2004-10-22 17:15                     ` Andrea Arcangeli
2004-10-22  3:09               ` Nick Piggin
2004-10-22  3:26                 ` Andrew Morton
2004-10-22  3:35                   ` Nick Piggin
2004-10-22 17:13                     ` Andrea Arcangeli
2004-10-22 17:07                   ` Andrea Arcangeli
2004-10-22 15:50               ` Andrea Arcangeli
2004-10-22  3:02             ` Nick Piggin
2004-10-22 16:58               ` Andrea Arcangeli
2004-10-23  4:33                 ` Nick Piggin
2004-10-23  9:59                   ` Andrea Arcangeli
2004-10-23 10:22                     ` Nick Piggin
2004-10-23 11:03                       ` Andrea Arcangeli
2004-10-23 16:28                         ` Nick Piggin
2004-10-25 12:44                           ` Andrea Arcangeli [this message]
2004-10-25 12:49                             ` Nick Piggin
2004-10-25 13:51                               ` Andrea Arcangeli
2004-10-25 20:09                             ` Robert White

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20041025124443.GV14325@dualathlon.random \
    --to=andrea@novell.com \
    --cc=akpm@osdl.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=nickpiggin@yahoo.com.au \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox