From: Yinghai Lu <yinghai@kernel.org>
To: Ingo Molnar <mingo@elte.hu>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
Jesse Barnes <jbarnes@virtuousgeek.org>,
"H. Peter Anvin" <hpa@zytor.com>,
Andrew Morton <akpm@linux-foundation.org>,
Thomas Gleixner <tglx@linutronix.de>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
linux-pci@vger.kernel.org, yannick.roehlly@free.fr
Subject: Re: [PATCH] x86/pci: make pci_mem_start to be aligned only -v4
Date: Sat, 18 Apr 2009 15:20:59 -0700
Message-ID: <49EA524B.9040705@kernel.org>
In-Reply-To: <49EA2983.3070003@kernel.org>
Yinghai Lu wrote:
> Ingo Molnar wrote:
>> * Linus Torvalds <torvalds@linux-foundation.org> wrote:
>>
>>> On Sat, 18 Apr 2009, Ingo Molnar wrote:
>>>> Am i missing something?
>>> We also try to avoid random motherboard resources etc that aren't
>>> reserved or documented by the BIOS. It's better to go into big
>>> holes. It's also better to try to keep as close to the old
>>> (tested) behavior.
>> Yeah - i'm not suggesting any change in behavior, nor am i
>> suggesting any risky behavior. The current code seems to work quite
>> well.
>>
>> I'm just suggesting (maybe foolishly) that instead of having any
>> gap-rounding logic at all, add artificial entries to the e820 map to
>> 'extend' and round up any odd ending entries.
>>
>> I.e. explicitly manage all the 'hole' space to be nicely rounded and
>> to be far away from any T-Seg or other sekrit motherboard resource
>> danger area.
>>
>> We'd do this after PCI static allocations (so we dont ever stomp on
>> real, known resources) but before PCI dynamic allocations.
>>
>> The e820 printout would look literally like this:
>>
>> BIOS-provided physical RAM map:
>> BIOS-e820: 0000000000000000 - 000000000009fc00 (usable) 0.639 MB RAM
>> BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved) 0.001 MB
>> [ hole ] 0.250 MB
>> BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved) 0.125 MB
>> BIOS-e820: 0000000000100000 - 000000003ed94000 (usable) 1004.5 MB RAM
>> BIOS-e820: 000000003ed94000 - 000000003ee4e000 (ACPI NVS) 0.7 MB
>> BIOS-e820: 000000003ee4e000 - 000000003fea2000 (usable) 16.3 MB RAM
>> BIOS-e820: 000000003fea2000 - 000000003fee9000 (ACPI NVS) 0.3 MB
>> BIOS-e820: 000000003fee9000 - 000000003feed000 (usable) 0.15 MB RAM
>> BIOS-e820: 000000003feed000 - 000000003feff000 (ACPI data) 0.07 MB
>> BIOS-e820: 000000003feff000 - 000000003ff00000 (usable) 0.004 MB RAM
>> BIOS-e820: 000000003ff00000 - 0000000040000000 (guard) 1.0 MB
>> [ hole ] 3072.0 MB
>>
>> The '(guard)' entry at the end i added above.
>>
>> This way we intentionally create a 'free physical address space'
>> hole space that is the same as the rounding logic. No rounding
>> needed anywhere - as all the remaining address space is well-rounded
>> already. Plus we'd also _see_ all our rounding logic by looking at
>> the '(guard)' entries.
>>
>> Or maybe there's some aspect of gap-rounding that cannot be
>> expressed in such a static way?
>>
>
> please check following patch.
>
> From: Linus Torvalds <torvalds@linux-foundation.org>
>
> [PATCH] x86: reserve range near the ram -v2
>
> Some BIOSes use RAM near the end of a RAM range without declaring it.
> Try to reserve such ranges as a RAM buffer.
>
> v2: do this early in the e820 table instead of in the resource tree.
>
> [Impact: protect stolen RAM]
>
> Signed-off-by: Yinghai Lu <yinghai@kernel.org>
>
> ---
> arch/x86/include/asm/e820.h | 2 +
> arch/x86/kernel/e820.c | 52 ++++++++++++++++++++++++++++++++++++++++++++
> arch/x86/kernel/setup.c | 6 +++++
> 3 files changed, 60 insertions(+)
>
> Index: linux-2.6/arch/x86/kernel/e820.c
> ===================================================================
> --- linux-2.6.orig/arch/x86/kernel/e820.c
> +++ linux-2.6/arch/x86/kernel/e820.c
> @@ -150,6 +150,9 @@ static void __init e820_print_type(u32 t
> case E820_UNUSABLE:
> printk(KERN_CONT "(unusable)");
> break;
> + case E820_RAM_BUFFER:
> + printk(KERN_CONT "(RAM buffer)");
> + break;
> default:
> printk(KERN_CONT "type %u", type);
> break;
> @@ -1314,6 +1317,54 @@ void __init finish_e820_parsing(void)
> }
> }
>
> +/* How much should we pad RAM ending depending on where it is? */
> +static unsigned long __init ram_alignment(resource_size_t pos)
> +{
> + unsigned long mb = pos >> 20;
> +
> + /* To 64kB in the first megabyte */
> + if (!mb)
> + return 64*1024;
> +
> + /* To 1MB in the first 16MB */
> + if (mb < 16)
> + return 1024*1024;
> +
> + /* To 32MB for anything above that */
> + return 32*1024*1024;
> +}
> +
> +void __init e820_reserve_stolen_ram(void)
> +{
> + int i;
> + int changed = 0;
> +
> + /*
> + * Try to bump up RAM regions to reasonable boundaries to
> + * avoid stolen RAM
> + */
> + for (i = 0; i < e820.nr_map; i++) {
> + struct e820entry *entry = &e820_saved.map[i];
> + resource_size_t start, end;
> +
> + if (entry->type != E820_RAM)
> + continue;
> + start = entry->addr + entry->size;
> + end = round_up(start, ram_alignment(start));
> + if (start == end)
> + continue;
> + e820_add_region(start, end - start, E820_RAM_BUFFER);
> + changed = 1;
> + }
> +
> + if (!changed)
> + return;
> +
> + sanitize_e820_map(e820.map, ARRAY_SIZE(e820.map), &e820.nr_map);
> + printk(KERN_INFO "fixed physical RAM map:\n");
> + e820_print_map("reserve_stolen_range");
> +}
> +
> static inline const char *e820_type_to_string(int e820_type)
> {
> switch (e820_type) {
> @@ -1322,6 +1373,7 @@ static inline const char *e820_type_to_s
> case E820_ACPI: return "ACPI Tables";
> case E820_NVS: return "ACPI Non-volatile Storage";
> case E820_UNUSABLE: return "Unusable memory";
> + case E820_RAM_BUFFER: return "RAM Buffer";
> default: return "reserved";
> }
> }
> Index: linux-2.6/arch/x86/include/asm/e820.h
> ===================================================================
> --- linux-2.6.orig/arch/x86/include/asm/e820.h
> +++ linux-2.6/arch/x86/include/asm/e820.h
> @@ -44,6 +44,7 @@
> #define E820_ACPI 3
> #define E820_NVS 4
> #define E820_UNUSABLE 5
> +#define E820_RAM_BUFFER 6
>
> /* reserved RAM used by kernel itself */
> #define E820_RESERVED_KERN 128
> @@ -78,6 +79,7 @@ extern u64 e820_update_range(u64 start,
> extern u64 e820_remove_range(u64 start, u64 size, unsigned old_type,
> int checktype);
> extern void update_e820(void);
> +extern void e820_reserve_stolen_ram(void);
> extern void e820_setup_gap(void);
> extern int e820_search_gap(unsigned long *gapstart, unsigned long *gapsize,
> unsigned long start_addr, unsigned long long end_addr);
> Index: linux-2.6/arch/x86/kernel/setup.c
> ===================================================================
> --- linux-2.6.orig/arch/x86/kernel/setup.c
> +++ linux-2.6/arch/x86/kernel/setup.c
> @@ -812,6 +812,12 @@ void __init setup_arch(char **cmdline_p)
> insert_resource(&iomem_resource, &data_resource);
> insert_resource(&iomem_resource, &bss_resource);
>
> + /*
> + * Some systems use the end of RAM for ACPI or video RAM
> + * but do not mark it as reserved in the e820 map.
> + * Round up the end of each RAM range and reserve the padding.
> + */
> + e820_reserve_stolen_ram();
>
> #ifdef CONFIG_X86_32
> if (ppro_with_ram_bug()) {
>
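It seems ram_alignment() is too aggressive; it really eats some RAM.
In the fixed map below, the RAM buffer at b7fa0000 - b8000000 swallows
the usable range b7fae000 - b7fb0000, along with the ACPI data, ACPI NVS
and reserved entries that follow it: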
[ 0.000000] BIOS-provided physical RAM map:
[ 0.000000] BIOS-e820: 0000000000000000 - 0000000000097400 (usable)
[ 0.000000] BIOS-e820: 0000000000097400 - 00000000000a0000 (reserved)
[ 0.000000] BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
[ 0.000000] BIOS-e820: 0000000000100000 - 00000000b7fa0000 (usable)
[ 0.000000] BIOS-e820: 00000000b7fae000 - 00000000b7fb0000 (usable)
[ 0.000000] BIOS-e820: 00000000b7fb0000 - 00000000b7fbe000 (ACPI data)
[ 0.000000] BIOS-e820: 00000000b7fbe000 - 00000000b7ff0000 (ACPI NVS)
[ 0.000000] BIOS-e820: 00000000b7ff0000 - 00000000b8000000 (reserved)
[ 0.000000] BIOS-e820: 00000000e0000000 - 00000000f0000000 (reserved)
[ 0.000000] BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved)
[ 0.000000] BIOS-e820: 00000000fee00000 - 00000000fef00000 (reserved)
[ 0.000000] BIOS-e820: 00000000ff700000 - 0000000100000000 (reserved)
[ 0.000000] BIOS-e820: 0000000100000000 - 0000002048000000 (usable)
[ 0.000000] Early serial console at I/O port 0x3f8 (options '115200n8')
[ 0.000000] console [uart0] enabled
[ 0.000000] DMI present.
[ 0.000000] fixed physical RAM map:
[ 0.000000] reserve_stolen_range: 0000000000000000 - 0000000000097400 (usable)
[ 0.000000] reserve_stolen_range: 0000000000097400 - 00000000000a0000 (RAM buffer)
[ 0.000000] reserve_stolen_range: 00000000000e0000 - 0000000000100000 (reserved)
[ 0.000000] reserve_stolen_range: 0000000000100000 - 00000000b7fa0000 (usable)
[ 0.000000] reserve_stolen_range: 00000000b7fa0000 - 00000000b8000000 (RAM buffer)
[ 0.000000] reserve_stolen_range: 00000000e0000000 - 00000000f0000000 (reserved)
[ 0.000000] reserve_stolen_range: 00000000fec00000 - 00000000fec01000 (reserved)
[ 0.000000] reserve_stolen_range: 00000000fee00000 - 00000000fef00000 (reserved)
[ 0.000000] reserve_stolen_range: 00000000ff700000 - 0000000100000000 (reserved)
[ 0.000000] reserve_stolen_range: 0000000100000000 - 0000002048000000 (usable)
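The rounding that produces these RAM buffer ranges can be checked in userspace. The following is only a sketch, not kernel code: ram_alignment() mirrors the function in the quoted patch, while round_up_pow2() and ram_buffer_end() are hypothetical helper names introduced here for illustration, standing in for the kernel's round_up() and for the start/end computation in e820_reserve_stolen_ram().

```c
#include <assert.h>
#include <stdint.h>

/* Userspace mirror of ram_alignment() from the quoted patch. */
uint64_t ram_alignment(uint64_t pos)
{
	uint64_t mb = pos >> 20;

	/* To 64kB in the first megabyte */
	if (!mb)
		return 64ULL * 1024;

	/* To 1MB in the first 16MB */
	if (mb < 16)
		return 1024ULL * 1024;

	/* To 32MB for anything above that */
	return 32ULL * 1024 * 1024;
}

/*
 * Hypothetical stand-in for the kernel's round_up():
 * round x up to a multiple of align, which must be a power of two.
 */
uint64_t round_up_pow2(uint64_t x, uint64_t align)
{
	assert(align && !(align & (align - 1)));
	return (x + align - 1) & ~(align - 1);
}

/*
 * Hypothetical helper: given the end address of a RAM range, return
 * the end of the RAM buffer the patch would add after it. If the
 * result equals ram_end, the range is already aligned and no buffer
 * would be added.
 */
uint64_t ram_buffer_end(uint64_t ram_end)
{
	return round_up_pow2(ram_end, ram_alignment(ram_end));
}
```

With the RAM range ends from the log above, ram_buffer_end(0x97400) gives 0xa0000 and ram_buffer_end(0xb7fa0000) gives 0xb8000000. The second buffer is therefore 0x60000 bytes (384 kB) long, which is why it covers the 8 kB usable range at b7fae000 and the ACPI entries after it. The last range, ending at 0x2048000000, is already 32 MB aligned and gets no buffer.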