From: Andres Salomon <dilinger@queued.net>
To: Grant Likely <grant.likely@secretlab.ca>
Cc: devicetree-discuss@lists.ozlabs.org,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, "H. Peter Anvin" <hpa@zytor.com>,
	linux-kernel@vger.kernel.org
Subject: [PATCH 3/3] x86: OLPC: speed up device tree creation during boot (v2)
Date: Thu, 11 Nov 2010 21:45:46 -0800
Message-ID: <20101111214546.4e573cad@queued.net>


Calling alloc_bootmem() for tiny chunks of memory over and over is really
slow; on an XO-1, it caused the time between when the kernel started
booting and when the display came alive (post-lxfb probe) to increase
to 44s.  This patch optimizes the prom_early_alloc function by
calling alloc_bootmem for 4k-sized blocks of memory, and handing out
chunks of that to callers.  With this patch, the time between kernel load
and display initialization decreased to 23s.  If there's a better way to
do this early in the boot process, please let me know.

(Note: increasing the chunk size to 16k didn't noticeably affect boot time,
and wasted 9k.)

v2: reorder function as suggested by Grant.

Signed-off-by: Andres Salomon <dilinger@queued.net>
---
 arch/x86/platform/olpc/olpc_dt.c |   27 ++++++++++++++++++++++-----
 1 files changed, 22 insertions(+), 5 deletions(-)

diff --git a/arch/x86/platform/olpc/olpc_dt.c b/arch/x86/platform/olpc/olpc_dt.c
index b8c8ff9..0ab824d 100644
--- a/arch/x86/platform/olpc/olpc_dt.c
+++ b/arch/x86/platform/olpc/olpc_dt.c
@@ -126,14 +126,31 @@ static unsigned int prom_early_allocated __initdata;
 
 void * __init prom_early_alloc(unsigned long size)
 {
+	static u8 *mem = NULL;
+	static size_t free_mem = 0;
 	void *res;
 
-	res = alloc_bootmem(size);
-	if (res)
-		memset(res, 0, size);
-
-	prom_early_allocated += size;
+	if (free_mem < size) {
+		const size_t chunk_size = max(PAGE_SIZE, size);
+
+		/*
+		 * To minimize the number of allocations, grab at least 4k of
+		 * memory (that's an arbitrary choice that matches PAGE_SIZE on
+		 * the platforms we care about, and minimizes wasted bootmem)
+		 * and hand off chunks of it to callers.
+		 */
+		res = mem = alloc_bootmem(chunk_size);
+		if (!res)
+			return NULL;
+		prom_early_allocated += chunk_size;
+		memset(res, 0, chunk_size);
+		free_mem = chunk_size;
+	}
 
+	/* allocate from the local cache */
+	free_mem -= size;
+	res = mem;
+	mem += size;
 	return res;
 }
 
-- 
1.7.2.3
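
For readers who want to try the idea outside the kernel, here is a minimal
userspace sketch of the same chunking scheme. It is not the patch itself:
backend_alloc() is a hypothetical stand-in for alloc_bootmem() built on
calloc(), CHUNK_MIN stands in for PAGE_SIZE, and the counters and
count_backend_calls() helper exist only for illustration. Like the patch,
it does no alignment rounding and abandons the tail of a chunk that is too
small for the next request.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define CHUNK_MIN 4096u	/* assumption: stands in for PAGE_SIZE */

unsigned int backend_calls;	/* how often the slow allocator ran */
size_t backend_bytes;		/* total bytes requested from it */

/* Hypothetical stand-in for alloc_bootmem(): zeroed memory, never freed. */
void *backend_alloc(size_t size)
{
	backend_calls++;
	backend_bytes += size;
	return calloc(1, size);
}

/* Same shape as the patched prom_early_alloc(): serve from a cached chunk. */
void *chunked_alloc(size_t size)
{
	static uint8_t *mem;
	static size_t free_mem;
	void *res;

	if (free_mem < size) {
		size_t chunk_size = size > CHUNK_MIN ? size : CHUNK_MIN;

		/* Whatever is left of the old chunk is abandoned here. */
		res = backend_alloc(chunk_size);
		if (!res)
			return NULL;
		mem = res;
		free_mem = chunk_size;
	}

	/* Hand out the next `size` bytes of the cached chunk. */
	free_mem -= size;
	res = mem;
	mem += size;
	return res;
}

/* Issue n requests of `size` bytes; return the backend calls they cost. */
unsigned int count_backend_calls(size_t n, size_t size)
{
	unsigned int before = backend_calls;

	for (size_t i = 0; i < n; i++) {
		void *p = chunked_alloc(size);

		assert(p != NULL);
	}
	return backend_calls - before;
}
```

Calling count_backend_calls(1000, 16) on a fresh process reaches the backend
4 times instead of 1000, since each 4096-byte chunk serves 256 requests. A
request larger than CHUNK_MIN gets a chunk of exactly its own size.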

