* Re: swsusp: which page should be saved?
2006-03-16 3:53 swsusp: which page should be saved? Shaohua Li
@ 2006-03-15 23:08 ` Pavel Machek
2006-03-17 1:12 ` Shaohua Li
2006-03-16 16:05 ` Rafael J. Wysocki
1 sibling, 1 reply; 17+ messages in thread
From: Pavel Machek @ 2006-03-15 23:08 UTC (permalink / raw)
To: Shaohua Li; +Cc: Linux-pm mailing list
[-- Attachment #1: Type: text/plain, Size: 670 bytes --]
On Thu 16-03-06 11:53:27, Shaohua Li wrote:
> I thought there was a discussion before, but I still saw many pages like
> BIOS reserved are saved/restored in swsusp. Pages reserved by BIOS, not
> used by OS and kernel text should be skipped in swsusp to me. Below
> patch works in my test. Any thought?
This will need quite a lot of testing.
Does it actually fix anything?
I'm afraid that some of the BIOSs state is actually shared knowledge
with kernel, and that we should better save it, so that kernel's and
BIOS's view are consistent after resume.
Arguably such pieces should better have their own suspend/resume
methods, but...
Pavel
--
Thanks, Sharp!
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 17+ messages in thread
* swsusp: which page should be saved?
@ 2006-03-16 3:53 Shaohua Li
2006-03-15 23:08 ` Pavel Machek
2006-03-16 16:05 ` Rafael J. Wysocki
0 siblings, 2 replies; 17+ messages in thread
From: Shaohua Li @ 2006-03-16 3:53 UTC (permalink / raw)
To: Linux-pm mailing list; +Cc: Pavel Machek
[-- Attachment #1: Type: text/plain, Size: 3623 bytes --]
I thought there was a discussion before, but I still saw many pages like
BIOS reserved are saved/restored in swsusp. Pages reserved by BIOS, not
used by OS and kernel text should be skipped in swsusp to me. Below
patch works in my test. Any thought?
Thanks,
Shaohua
---
linux-2.6.15-root/arch/i386/kernel/setup.c | 78 +++++++++++++++++++++++++++++
linux-2.6.15-root/kernel/power/snapshot.c | 1
2 files changed, 78 insertions(+), 1 deletion(-)
diff -puN arch/i386/kernel/setup.c~nosave_pages arch/i386/kernel/setup.c
--- linux-2.6.15/arch/i386/kernel/setup.c~nosave_pages 2006-03-14 14:42:24.000000000 +0800
+++ linux-2.6.15-root/arch/i386/kernel/setup.c 2006-03-15 09:38:18.000000000 +0800
@@ -1483,6 +1483,83 @@ static void set_mca_bus(int x)
static void set_mca_bus(int x) { }
#endif
+extern char __end_rodata;
+static void __init mark_nosave_page_range(unsigned long start, unsigned long end)
+{
+ struct page *page;
+ while (start <= end) {
+ printk("%ld,", start);
+ page = pfn_to_page(start);
+ SetPageNosave(page);
+ start ++;
+ }
+}
+
+static int __init efi_mark_nosave_page(unsigned long start, unsigned long end,
+ void *arg)
+{
+ unsigned long start_pfn, end_pfn, last = *(unsigned long*)arg;
+
+ start_pfn = PFN_DOWN(start);
+ end_pfn = PFN_DOWN(end);
+ /* check max_low_pfn */
+ if (start_pfn >= max_low_pfn)
+ return 0;
+ if (end_pfn >= max_low_pfn)
+ end = max_low_pfn;
+ if (start_pfn > last)
+ mark_nosave_page_range(last, start_pfn - 1);
+ *(unsigned long*)arg = end;
+ return 0;
+}
+
+static void __init mark_nosave_pages(void)
+{
+ unsigned long pfn, pfn_end_rodata, last_pfn = 0;
+ int i;
+
+ /* Mark all BIOS reserved regions as nosave */
+ if (efi_enabled) {
+ efi_memmap_walk(efi_mark_nosave_page, &last_pfn);
+ return;
+ } else {
+ for (i = 0; i < e820.nr_map; i++) {
+ unsigned long start_pfn, end_pfn;
+ if (e820.map[i].type != E820_RAM)
+ continue;
+
+ start_pfn = PFN_DOWN(e820.map[i].addr);
+ if (start_pfn >= max_low_pfn)
+ break;
+ end_pfn = PFN_UP(e820.map[i].addr + e820.map[i].size);
+
+ if (end_pfn > max_low_pfn)
+ end_pfn = max_low_pfn;
+ if (end_pfn <= start_pfn)
+ continue;
+ if (start_pfn > last_pfn) {
+ start_pfn--;
+ mark_nosave_page_range(last_pfn, start_pfn);
+ }
+ last_pfn = end_pfn;
+ }
+ }
+
+ if (last_pfn < max_low_pfn)
+ mark_nosave_page_range(last_pfn, max_low_pfn);
+
+ /* kernel text */
+ pfn = PFN_UP(__pa(__KERNEL_START));
+ pfn_end_rodata = PFN_DOWN(__pa(&__end_rodata));
+ mark_nosave_page_range(pfn, pfn_end_rodata);
+
+ /* other pages kernel doesn't use (see setup_bootmem_allocator) */
+ mark_nosave_page_range(0, 0);
+#ifdef CONFIG_SMP
+ mark_nosave_page_range(PFN_DOWN(PAGE_SIZE), PFN_DOWN(PAGE_SIZE));
+#endif
+}
+
/*
* Determine if we were loaded by an EFI loader. If so, then we have also been
* passed the efi memmap, systab, etc., so we should use these data structures
@@ -1573,6 +1650,7 @@ void __init setup_arch(char **cmdline_p)
remapped_pgdat_init();
sparse_init();
zone_sizes_init();
+ mark_nosave_pages();
/*
* NOTE: at this point the bootmem allocator is fully available.
diff -puN kernel/power/snapshot.c~nosave_pages kernel/power/snapshot.c
--- linux-2.6.15/kernel/power/snapshot.c~nosave_pages 2006-03-14 14:42:29.000000000 +0800
+++ linux-2.6.15-root/kernel/power/snapshot.c 2006-03-14 14:43:06.000000000 +0800
@@ -173,7 +173,6 @@ static int saveable(struct zone *zone, u
return 0;
page = pfn_to_page(pfn);
- BUG_ON(PageReserved(page) && PageNosave(page));
if (PageNosave(page))
return 0;
if (PageReserved(page) && pfn_is_nosave(pfn))
_
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: swsusp: which page should be saved?
2006-03-16 3:53 swsusp: which page should be saved? Shaohua Li
2006-03-15 23:08 ` Pavel Machek
@ 2006-03-16 16:05 ` Rafael J. Wysocki
1 sibling, 0 replies; 17+ messages in thread
From: Rafael J. Wysocki @ 2006-03-16 16:05 UTC (permalink / raw)
To: linux-pm; +Cc: Pavel Machek
[-- Attachment #1: Type: text/plain, Size: 4147 bytes --]
On Thursday 16 March 2006 04:53, Shaohua Li wrote:
> I thought there was a discussion before, but I still saw many pages like
> BIOS reserved are saved/restored in swsusp.
I've never seen this causing any problems but generally it's a good idea to
avoid saving these pages IMO.
> Pages reserved by BIOS, not
> used by OS and kernel text should be skipped in swsusp to me. Below
> patch works in my test. Any thought?
You should do this for x86_64 too.
Greetings,
Rafael
> ---
>
> linux-2.6.15-root/arch/i386/kernel/setup.c | 78 +++++++++++++++++++++++++++++
> linux-2.6.15-root/kernel/power/snapshot.c | 1
> 2 files changed, 78 insertions(+), 1 deletion(-)
>
> diff -puN arch/i386/kernel/setup.c~nosave_pages arch/i386/kernel/setup.c
> --- linux-2.6.15/arch/i386/kernel/setup.c~nosave_pages 2006-03-14 14:42:24.000000000 +0800
> +++ linux-2.6.15-root/arch/i386/kernel/setup.c 2006-03-15 09:38:18.000000000 +0800
> @@ -1483,6 +1483,83 @@ static void set_mca_bus(int x)
> static void set_mca_bus(int x) { }
> #endif
>
> +extern char __end_rodata;
> +static void __init mark_nosave_page_range(unsigned long start, unsigned long end)
> +{
> + struct page *page;
> + while (start <= end) {
> + printk("%ld,", start);
> + page = pfn_to_page(start);
> + SetPageNosave(page);
> + start ++;
> + }
> +}
> +
> +static int __init efi_mark_nosave_page(unsigned long start, unsigned long end,
> + void *arg)
> +{
> + unsigned long start_pfn, end_pfn, last = *(unsigned long*)arg;
> +
> + start_pfn = PFN_DOWN(start);
> + end_pfn = PFN_DOWN(end);
> + /* check max_low_pfn */
> + if (start_pfn >= max_low_pfn)
> + return 0;
> + if (end_pfn >= max_low_pfn)
> + end = max_low_pfn;
> + if (start_pfn > last)
> + mark_nosave_page_range(last, start_pfn - 1);
> + *(unsigned long*)arg = end;
> + return 0;
> +}
> +
> +static void __init mark_nosave_pages(void)
> +{
> + unsigned long pfn, pfn_end_rodata, last_pfn = 0;
> + int i;
> +
> + /* Mark all BIOS reserved regions as nosave */
> + if (efi_enabled) {
> + efi_memmap_walk(efi_mark_nosave_page, &last_pfn);
> + return;
> + } else {
> + for (i = 0; i < e820.nr_map; i++) {
> + unsigned long start_pfn, end_pfn;
> + if (e820.map[i].type != E820_RAM)
> + continue;
> +
> + start_pfn = PFN_DOWN(e820.map[i].addr);
> + if (start_pfn >= max_low_pfn)
> + break;
> + end_pfn = PFN_UP(e820.map[i].addr + e820.map[i].size);
> +
> + if (end_pfn > max_low_pfn)
> + end_pfn = max_low_pfn;
> + if (end_pfn <= start_pfn)
> + continue;
> + if (start_pfn > last_pfn) {
> + start_pfn--;
> + mark_nosave_page_range(last_pfn, start_pfn);
> + }
> + last_pfn = end_pfn;
> + }
> + }
> +
> + if (last_pfn < max_low_pfn)
> + mark_nosave_page_range(last_pfn, max_low_pfn);
> +
> + /* kernel text */
> + pfn = PFN_UP(__pa(__KERNEL_START));
> + pfn_end_rodata = PFN_DOWN(__pa(&__end_rodata));
> + mark_nosave_page_range(pfn, pfn_end_rodata);
> +
> + /* other pages kernel doesn't use (see setup_bootmem_allocator) */
> + mark_nosave_page_range(0, 0);
> +#ifdef CONFIG_SMP
> + mark_nosave_page_range(PFN_DOWN(PAGE_SIZE), PFN_DOWN(PAGE_SIZE));
> +#endif
> +}
> +
> /*
> * Determine if we were loaded by an EFI loader. If so, then we have also been
> * passed the efi memmap, systab, etc., so we should use these data structures
> @@ -1573,6 +1650,7 @@ void __init setup_arch(char **cmdline_p)
> remapped_pgdat_init();
> sparse_init();
> zone_sizes_init();
> + mark_nosave_pages();
>
> /*
> * NOTE: at this point the bootmem allocator is fully available.
> diff -puN kernel/power/snapshot.c~nosave_pages kernel/power/snapshot.c
> --- linux-2.6.15/kernel/power/snapshot.c~nosave_pages 2006-03-14 14:42:29.000000000 +0800
> +++ linux-2.6.15-root/kernel/power/snapshot.c 2006-03-14 14:43:06.000000000 +0800
> @@ -173,7 +173,6 @@ static int saveable(struct zone *zone, u
> return 0;
>
> page = pfn_to_page(pfn);
> - BUG_ON(PageReserved(page) && PageNosave(page));
> if (PageNosave(page))
> return 0;
> if (PageReserved(page) && pfn_is_nosave(pfn))
> _
>
>
>
--
Beer is proof that God loves us and wants us to be happy - Benjamin Franklin
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: swsusp: which page should be saved?
2006-03-15 23:08 ` Pavel Machek
@ 2006-03-17 1:12 ` Shaohua Li
2006-03-17 6:59 ` Pavel Machek
2006-03-17 10:50 ` Rafael J. Wysocki
0 siblings, 2 replies; 17+ messages in thread
From: Shaohua Li @ 2006-03-17 1:12 UTC (permalink / raw)
To: Pavel Machek; +Cc: Linux-pm mailing list
[-- Attachment #1: Type: text/plain, Size: 995 bytes --]
On Thu, 2006-03-16 at 07:08 +0800, Pavel Machek wrote:
> On Thu 16-03-06 11:53:27, Shaohua Li wrote:
> > I thought there was a discussion before, but I still saw many pages
> like
> > BIOS reserved are saved/restored in swsusp. Pages reserved by BIOS,
> not
> > used by OS and kernel text should be skipped in swsusp to me. Below
> > patch works in my test. Any thought?
>
> This will need quite a lot of testing.
>
> Does it actually fix anything?
No, it doesn't. Just want to speed up swsusp.
> I'm afraid that some of the BIOSs state is actually shared knowledge
> with kernel, and that we should better save it, so that kernel's and
> BIOS's view are consistent after resume.
Fair enough. ACPI BIOS does communicate with OS by reserved memory. But
in the mean time, blindly save/restore such mem might be dangerous. If
all ACPI devices have their suspend/resume method, reserved memory
should not be saved.
Anyway, skipping kernel text should be safe, isn't it?
Thanks,
Shaohua
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: swsusp: which page should be saved?
2006-03-17 1:12 ` Shaohua Li
@ 2006-03-17 6:59 ` Pavel Machek
2006-03-21 2:19 ` Shaohua Li
2006-03-17 10:50 ` Rafael J. Wysocki
1 sibling, 1 reply; 17+ messages in thread
From: Pavel Machek @ 2006-03-17 6:59 UTC (permalink / raw)
To: Shaohua Li; +Cc: Linux-pm mailing list
On Pá 17-03-06 09:12:15, Shaohua Li wrote:
> On Thu, 2006-03-16 at 07:08 +0800, Pavel Machek wrote:
> > On Thu 16-03-06 11:53:27, Shaohua Li wrote:
> > > I thought there was a discussion before, but I still saw many pages
> > like
> > > BIOS reserved are saved/restored in swsusp. Pages reserved by BIOS,
> > not
> > > used by OS and kernel text should be skipped in swsusp to me. Below
> > > patch works in my test. Any thought?
> >
> > This will need quite a lot of testing.
> >
> > Does it actually fix anything?
> No, it doesn't. Just want to speed up swsusp.
>
> > I'm afraid that some of the BIOSs state is actually shared knowledge
> > with kernel, and that we should better save it, so that kernel's and
> > BIOS's view are consistent after resume.
> Fair enough. ACPI BIOS does communicate with OS by reserved memory. But
> in the mean time, blindly save/restore such mem might be dangerous. If
> all ACPI devices have their suspend/resume method, reserved memory
> should not be saved.
Ok, I guess it is okay to go in if it stays in -mm for long enough to
get a lot of testing.
> Anyway, skipping kernel text should be safe, isn't it?
It probably is. But you need to save modules. And we do use some
self-modifying code these days, no? (Called runtime patching or
something like that.) Ouch and IIRC top-level pagedir or something
like that lives in kernel "text" -- it is in assembly and wrongly
placed.
So yes, it would be nice, but yes, it might be tricky.
Pavel
--
82: return SampleTable;
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Re: swsusp: which page should be saved?
2006-03-17 1:12 ` Shaohua Li
2006-03-17 6:59 ` Pavel Machek
@ 2006-03-17 10:50 ` Rafael J. Wysocki
1 sibling, 0 replies; 17+ messages in thread
From: Rafael J. Wysocki @ 2006-03-17 10:50 UTC (permalink / raw)
To: linux-pm
[-- Attachment #1: Type: text/plain, Size: 1151 bytes --]
On Friday 17 March 2006 02:12, Shaohua Li wrote:
> On Thu, 2006-03-16 at 07:08 +0800, Pavel Machek wrote:
> > On Thu 16-03-06 11:53:27, Shaohua Li wrote:
> > > I thought there was a discussion before, but I still saw many pages
> > like
> > > BIOS reserved are saved/restored in swsusp. Pages reserved by BIOS,
> > not
> > > used by OS and kernel text should be skipped in swsusp to me. Below
> > > patch works in my test. Any thought?
> >
> > This will need quite a lot of testing.
> >
> > Does it actually fix anything?
> No, it doesn't. Just want to speed up swsusp.
>
> > I'm afraid that some of the BIOSs state is actually shared knowledge
> > with kernel, and that we should better save it, so that kernel's and
> > BIOS's view are consistent after resume.
> Fair enough. ACPI BIOS does communicate with OS by reserved memory. But
> in the mean time, blindly save/restore such mem might be dangerous. If
> all ACPI devices have their suspend/resume method, reserved memory
> should not be saved.
>
> Anyway, skipping kernel text should be safe, isn't it?
I wouldn't do that. Saving it doesn't cost us a lot in terms of time.
Rafael
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: swsusp: which page should be saved?
2006-03-17 6:59 ` Pavel Machek
@ 2006-03-21 2:19 ` Shaohua Li
2006-03-21 3:33 ` Nigel Cunningham
2006-03-21 9:42 ` Pavel Machek
0 siblings, 2 replies; 17+ messages in thread
From: Shaohua Li @ 2006-03-21 2:19 UTC (permalink / raw)
To: Pavel Machek; +Cc: Linux-pm mailing list
[-- Attachment #1: Type: text/plain, Size: 2180 bytes --]
On Fri, 2006-03-17 at 14:59 +0800, Pavel Machek wrote:
> On Pá 17-03-06 09:12:15, Shaohua Li wrote:
> > On Thu, 2006-03-16 at 07:08 +0800, Pavel Machek wrote:
> > > On Thu 16-03-06 11:53:27, Shaohua Li wrote:
> > > > I thought there was a discussion before, but I still saw many
> pages
> > > like
> > > > BIOS reserved are saved/restored in swsusp. Pages reserved by
> BIOS,
> > > not
> > > > used by OS and kernel text should be skipped in swsusp to me.
> Below
> > > > patch works in my test. Any thought?
> > >
> > > This will need quite a lot of testing.
> > >
> > > Does it actually fix anything?
> > No, it doesn't. Just want to speed up swsusp.
> >
> > > I'm afraid that some of the BIOSs state is actually shared
> knowledge
> > > with kernel, and that we should better save it, so that kernel's
> and
> > > BIOS's view are consistent after resume.
> > Fair enough. ACPI BIOS does communicate with OS by reserved memory.
> But
> > in the mean time, blindly save/restore such mem might be dangerous.
> If
> > all ACPI devices have their suspend/resume method, reserved memory
> > should not be saved.
>
> Ok, I guess it is okay to go in if it stays in -mm for long enough to
> get a lot of testing.
I'll do more tests and back to you. BTW, I wonder if BIOS already saved
reserved memory (those doing communication with OS) in the 'platform'
method of S4.
> > Anyway, skipping kernel text should be safe, isn't it?
>
> It probably is. But you need to save modules.
I just consider the region from kernel start(1M) to the end of rodata.
In my test, the region is about 4M memory. Just adding several lines to
save 4M memory is worthy.
> And we do use some
> self-modifying code these days, no? (Called runtime patching or
> something like that.)
Alternative instructions? The resume OS will do the same modification
anyway.
> Ouch and IIRC top-level pagedir or something
> like that lives in kernel "text" -- it is in assembly and wrongly
> placed.
i386 does the right thing and put the pagedir in data segment. x86_64
not, I think we could clean it up.
Thanks,
Shaohua
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 17+ messages in thread
* RE: Re: swsusp: which page should be saved?
@ 2006-03-21 3:11 Yu, Luming
0 siblings, 0 replies; 17+ messages in thread
From: Yu, Luming @ 2006-03-21 3:11 UTC (permalink / raw)
To: Li, Shaohua, Pavel Machek; +Cc: Linux-pm mailing list
[-- Attachment #1: Type: text/plain, Size: 423 bytes --]
>> > Anyway, skipping kernel text should be safe, isn't it?
>>
>> It probably is. But you need to save modules.
>I just consider the region from kernel start(1M) to the end of rodata.
>In my test, the region is about 4M memory. Just adding several lines to
>save 4M memory is worthy.
Why it is safe? :-) If you think BIOS as a black box, and don't know
how BIOS works,
then, you have to save them for safety concern.
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 17+ messages in thread
* RE: Re: swsusp: which page should be saved?
@ 2006-03-21 3:23 Yu, Luming
2006-03-21 3:33 ` Shaohua Li
0 siblings, 1 reply; 17+ messages in thread
From: Yu, Luming @ 2006-03-21 3:23 UTC (permalink / raw)
To: Yu, Luming, Li, Shaohua, Pavel Machek; +Cc: Linux-pm mailing list
[-- Attachment #1: Type: text/plain, Size: 453 bytes --]
>>> > Anyway, skipping kernel text should be safe, isn't it?
>>>
>>> It probably is. But you need to save modules.
>>I just consider the region from kernel start(1M) to the end of rodata.
>>In my test, the region is about 4M memory. Just adding
>several lines to
>>save 4M memory is worthy.
>
>Why it is safe? :-) If you think BIOS as a black box, and don't know
>how BIOS works,
>then, you have to save them for safety concern.
>
s/BIOS/Modules/
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 17+ messages in thread
* RE: Re: swsusp: which page should be saved?
2006-03-21 3:23 Yu, Luming
@ 2006-03-21 3:33 ` Shaohua Li
0 siblings, 0 replies; 17+ messages in thread
From: Shaohua Li @ 2006-03-21 3:33 UTC (permalink / raw)
To: Yu, Luming; +Cc: Linux-pm mailing list
[-- Attachment #1: Type: text/plain, Size: 698 bytes --]
On Tue, 2006-03-21 at 11:23 +0800, Yu, Luming wrote:
> >>> > Anyway, skipping kernel text should be safe, isn't it?
> >>>
> >>> It probably is. But you need to save modules.
> >>I just consider the region from kernel start(1M) to the end of rodata.
> >>In my test, the region is about 4M memory. Just adding
> >several lines to
> >>save 4M memory is worthy.
> >
> >Why it is safe? :-) If you think BIOS as a black box, and don't know
> >how BIOS works,
> >then, you have to save them for safety concern.
> >
> s/BIOS/Modules/
As I said, I just consider the region from kernel start to the end of
rodata (get from the link script). modules should be saved/restored as
normal.
Thanks,
Shaohua
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Re: swsusp: which page should be saved?
2006-03-21 2:19 ` Shaohua Li
@ 2006-03-21 3:33 ` Nigel Cunningham
2006-03-21 9:42 ` Pavel Machek
1 sibling, 0 replies; 17+ messages in thread
From: Nigel Cunningham @ 2006-03-21 3:33 UTC (permalink / raw)
To: linux-pm
[-- Attachment #1.1: Type: text/plain, Size: 457 bytes --]
Hi.
On Tuesday 21 March 2006 12:19, Shaohua Li wrote:
> > Ouch and IIRC top-level pagedir or something
> > like that lives in kernel "text" -- it is in assembly and wrongly
> > placed.
>
> i386 does the right thing and put the pagedir in data segment. x86_64
> not, I think we could clean it up.
Perhaps you could make use of the rotext/data sections? IIRC, there's already
a patch in 2.6.16 that enforces the attribute.
Regards,
Nigel
[-- Attachment #1.2: Type: application/pgp-signature, Size: 189 bytes --]
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 17+ messages in thread
* RE: Re: swsusp: which page should be saved?
@ 2006-03-21 3:41 Yu, Luming
0 siblings, 0 replies; 17+ messages in thread
From: Yu, Luming @ 2006-03-21 3:41 UTC (permalink / raw)
To: Li, Shaohua; +Cc: Linux-pm mailing list
[-- Attachment #1: Type: text/plain, Size: 751 bytes --]
>> >>> It probably is. But you need to save modules.
>> >>I just consider the region from kernel start(1M) to the
>end of rodata.
>> >>In my test, the region is about 4M memory. Just adding
>> >several lines to
>> >>save 4M memory is worthy.
>> >
>> >Why it is safe? :-) If you think BIOS as a black box, and
>don't know
>> >how BIOS works,
>> >then, you have to save them for safety concern.
>> >
>> s/BIOS/Modules/
>As I said, I just consider the region from kernel start to the end of
>rodata (get from the link script). modules should be saved/restored as
>normal.
I'm not sure if this a real case, but I'm sure in this case
kernel text needs to be saved too if the kernel
has been patched with binary patch at run-time.
Thanks,
Luming
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 17+ messages in thread
* RE: Re: swsusp: which page should be saved?
@ 2006-03-21 4:25 Yu, Luming
0 siblings, 0 replies; 17+ messages in thread
From: Yu, Luming @ 2006-03-21 4:25 UTC (permalink / raw)
To: Yu, Luming, Li, Shaohua; +Cc: Linux-pm mailing list
[-- Attachment #1: Type: text/plain, Size: 913 bytes --]
>>> >>> It probably is. But you need to save modules.
>>> >>I just consider the region from kernel start(1M) to the
>>end of rodata.
>>> >>In my test, the region is about 4M memory. Just adding
>>> >several lines to
>>> >>save 4M memory is worthy.
>>> >
>>> >Why it is safe? :-) If you think BIOS as a black box, and
>>don't know
>>> >how BIOS works,
>>> >then, you have to save them for safety concern.
>>> >
>>> s/BIOS/Modules/
>>As I said, I just consider the region from kernel start to the end of
>>rodata (get from the link script). modules should be saved/restored as
>>normal.
>
>I'm not sure if this a real case, but I'm sure in this case
>kernel text needs to be saved too if the kernel
>has been patched with binary patch at run-time.
Another real case is that the image on disk has been changed ?
But probably, user need previous S4 resume back correctly.
Yes, all these are corner cases.
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: swsusp: which page should be saved?
2006-03-21 2:19 ` Shaohua Li
2006-03-21 3:33 ` Nigel Cunningham
@ 2006-03-21 9:42 ` Pavel Machek
2006-04-07 3:46 ` Shaohua Li
1 sibling, 1 reply; 17+ messages in thread
From: Pavel Machek @ 2006-03-21 9:42 UTC (permalink / raw)
To: Shaohua Li; +Cc: Linux-pm mailing list
[-- Attachment #1: Type: text/plain, Size: 1455 bytes --]
Hi!
> > Ok, I guess it is okay to go in if it stays in -mm for long enough to
> > get a lot of testing.
> I'll do more tests and back to you. BTW, I wonder if BIOS already saved
> reserved memory (those doing communication with OS) in the 'platform'
> method of S4.
Yep, it should be safe. I bet it will break some obscure machine, but
it will probably fix some obscure machine, too... Just needs lots of testing.
> > > Anyway, skipping kernel text should be safe, isn't it?
> >
> > It probably is. But you need to save modules.
> I just consider the region from kernel start(1M) to the end of rodata.
> In my test, the region is about 4M memory. Just adding several lines to
> save 4M memory is worthy.
Well, few lines to save 4MB is nice. OTOH 4MB are saved in about
100msec, and if it brings in hard-to-debug bug on obscure
machine... we did not win much.
> > And we do use some
> > self-modifying code these days, no? (Called runtime patching or
> > something like that.)
> Alternative instructions? The resume OS will do the same modification
> anyway.
Okay, hopefully.
> > Ouch and IIRC top-level pagedir or something
> > like that lives in kernel "text" -- it is in assembly and wrongly
> > placed.
> i386 does the right thing and put the pagedir in data segment. x86_64
> not, I think we could clean it up.
This probably should be done, first, and gotten past Andi.
Pavel
--
Picture of sleeping (Linux) penguin wanted...
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: swsusp: which page should be saved?
2006-03-21 9:42 ` Pavel Machek
@ 2006-04-07 3:46 ` Shaohua Li
2006-04-11 7:39 ` Pavel Machek
0 siblings, 1 reply; 17+ messages in thread
From: Shaohua Li @ 2006-04-07 3:46 UTC (permalink / raw)
To: Pavel Machek; +Cc: Linux-pm mailing list
[-- Attachment #1: Type: text/plain, Size: 14476 bytes --]
On Tue, 2006-03-21 at 17:42 +0800, Pavel Machek wrote:
> Hi!
>
> > > Ok, I guess it is okay to go in if it stays in -mm for long enough
> to
> > > get a lot of testing.
> > I'll do more tests and back to you. BTW, I wonder if BIOS already
> saved
> > reserved memory (those doing communication with OS) in the
> 'platform'
> > method of S4.
>
> Yep, it should be safe. I bet it will break some obscure machine, but
> it will probably fix some obscure machine, too... Just needs lots of
> testing.
>
> > > > Anyway, skipping kernel text should be safe, isn't it?
> > >
> > > It probably is. But you need to save modules.
> > I just consider the region from kernel start(1M) to the end of
> rodata.
> > In my test, the region is about 4M memory. Just adding several lines
> to
> > save 4M memory is worthy.
>
> Well, few lines to save 4MB is nice. OTOH 4MB are saved in about
> 100msec, and if it brings in hard-to-debug bug on obscure
> machine... we did not win much.
>
> > > And we do use some
> > > self-modifying code these days, no? (Called runtime patching or
> > > something like that.)
> > Alternative instructions? The resume OS will do the same
> modification
> > anyway.
>
> Okay, hopefully.
>
> > > Ouch and IIRC top-level pagedir or something
> > > like that lives in kernel "text" -- it is in assembly and
> wrongly
> > > placed.
> > i386 does the right thing and put the pagedir in data segment.
> x86_64
> > not, I think we could clean it up.
>
> This probably should be done, first, and gotten past Andi.
We asked the question to (intel's) BIOS guys, and below is the result.
a. BIOS reserved region/hole - no save/restore
b. ACPI NVS - save/restore
c. 'ACPI Data' is a little tricky. After OS boots, os can reclaim this
region, so regard it as normal ram. But we are afraid Linux runtime
module loading might use this region somewhere, so we also mark this
region as save/restore. Anyway, this hasn't any side effect.
Hopefully all BIOSes follow this rule.
Pages (Reserved/ACPI NVS/ACPI Data) below
end_pfn(x86_64)/max_low_pfn(i386) will be saved/restored by S4
currently. We should mark 'Reserved' pages not saveable.
Pages (Reserved/ACPI NVS/ACPI Data) above end_pfn/max_low_pfn
will not be saved/restored by S4 currently. We should save the
'ACPI NVS/ACPI Data' pages in the highmem.
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
---
linux-2.6.17-rc1-root/arch/i386/kernel/setup.c | 101 +++++++++++++++++++++
linux-2.6.17-rc1-root/arch/x86_64/kernel/setup.c | 93 +++++++++++++++++++
linux-2.6.17-rc1-root/include/linux/suspend.h | 1
linux-2.6.17-rc1-root/kernel/power/snapshot.c | 107 ++++++++++++++++++++++-
linux-2.6.17-rc1-root/kernel/power/swsusp.c | 20 +---
5 files changed, 308 insertions(+), 14 deletions(-)
diff -puN arch/i386/kernel/setup.c~nosave_pages arch/i386/kernel/setup.c
--- linux-2.6.17-rc1/arch/i386/kernel/setup.c~nosave_pages 2006-04-04 15:10:42.000000000 +0800
+++ linux-2.6.17-rc1-root/arch/i386/kernel/setup.c 2006-04-06 09:40:02.000000000 +0800
@@ -48,6 +48,7 @@
#include <linux/crash_dump.h>
#include <linux/dmi.h>
#include <linux/pfn.h>
+#include <linux/suspend.h>
#include <video/edid.h>
@@ -1400,6 +1401,106 @@ static void set_mca_bus(int x)
static void set_mca_bus(int x) { }
#endif
+#ifdef CONFIG_SOFTWARE_SUSPEND
+static void __init mark_nosave_page_range(unsigned long start, unsigned long end)
+{
+ struct page *page;
+ while (start <= end) {
+ page = pfn_to_page(start);
+ SetPageNosave(page);
+ start ++;
+ }
+}
+
+static void __init e820_nosave_reserved_pages(void)
+{
+ int i;
+ unsigned long r_start = 0, r_end = 0;
+
+ /* Assume e820 map is sorted */
+ for (i = 0; i < e820.nr_map; i++) {
+ struct e820entry *ei = &e820.map[i];
+ unsigned long start, end;
+
+ start = PFN_DOWN(ei->addr);
+ end = PFN_UP(ei->addr + ei->size);
+ if (start >= end)
+ continue;
+ if (ei->type == E820_RESERVED)
+ continue;
+ r_end = start;
+ /*
+ * Highmem 'Reserved' pages are marked as reserved, swsusp
+ * will not save/restore them, so we ignore these pages here.
+ */
+ if (r_end > max_low_pfn)
+ r_end = max_low_pfn;
+ if (r_end > r_start)
+ mark_nosave_page_range(r_start, r_end-1);
+ if (r_end >= max_low_pfn)
+ break;
+ r_start = end;
+ }
+}
+
+static void __init e820_save_acpi_pages(void)
+{
+ int i;
+
+ /* Assume e820 map is sorted */
+ for (i = 0; i < e820.nr_map; i++) {
+ struct e820entry *ei = &e820.map[i];
+ unsigned long start, end;
+
+ start = PFN_DOWN(ei->addr);
+ end = PFN_UP(ei->addr + ei->size);
+ if (start >= end)
+ continue;
+ if (ei->type != E820_ACPI && ei->type != E820_NVS)
+ continue;
+ /*
+ * If the region is below max_low_pfn, it will be
+ * saved/restored by swsusp follow 'RAM' type.
+ */
+ if (start < max_low_pfn)
+ start = max_low_pfn;
+ /*
+ * Highmem pages (ACPI NVS/Data) are reserved, but swsusp
+ * highmem save/restore will not save/restore them. We marked
+ * them as arch saveable pages here
+ */
+ if (end > start)
+ swsusp_add_arch_pages(start, end - 1);
+ }
+}
+
+extern char __start_rodata, __end_rodata;
+/*
+ * kernel rodata - no save/restore
+ * BIOS reserved region/hole - no save/restore
+ * ACPI NVS - save/restore
+ * ACPI Data - save/restore
+ */
+static int __init mark_nosave_pages(void)
+{
+ unsigned long pfn_start, pfn_end;
+
+ /* BIOS reserved regions & holes */
+ e820_nosave_reserved_pages();
+
+ /* kernel rodata */
+ pfn_start = PFN_UP(virt_to_phys(&__start_rodata));
+ pfn_end = PFN_DOWN(virt_to_phys(&__end_rodata));
+ mark_nosave_page_range(pfn_start, pfn_end-1);
+
+ /* record ACPI Data/NVS as saveable */
+ e820_save_acpi_pages();
+
+ return 0;
+}
+core_initcall(mark_nosave_pages);
+#endif
+
/*
* Determine if we were loaded by an EFI loader. If so, then we have also been
* passed the efi memmap, systab, etc., so we should use these data structures
diff -puN arch/x86_64/kernel/setup.c~nosave_pages arch/x86_64/kernel/setup.c
--- linux-2.6.17-rc1/arch/x86_64/kernel/setup.c~nosave_pages 2006-04-04 15:10:42.000000000 +0800
+++ linux-2.6.17-rc1-root/arch/x86_64/kernel/setup.c 2006-04-06 09:40:03.000000000 +0800
@@ -47,6 +47,7 @@
#include <linux/dmi.h>
#include <linux/dma-mapping.h>
#include <linux/ctype.h>
+#include <linux/suspend.h>
#include <asm/mtrr.h>
#include <asm/uaccess.h>
@@ -582,6 +583,98 @@ static void __init reserve_ebda_region(v
reserve_bootmem_generic(addr, PAGE_SIZE);
}
+#ifdef CONFIG_SOFTWARE_SUSPEND
+static void __init mark_nosave_page_range(unsigned long start, unsigned long end)
+{
+ struct page *page;
+ while (start <= end) {
+ page = pfn_to_page(start);
+ SetPageNosave(page);
+ start ++;
+ }
+}
+
+static void __init e820_nosave_reserved_pages(void)
+{
+ int i;
+ unsigned long r_start = 0, r_end = 0;
+
+ /* Assume e820 map is sorted */
+ for (i = 0; i < e820.nr_map; i++) {
+ struct e820entry *ei = &e820.map[i];
+ unsigned long start, end;
+
+ start = round_down(ei->addr, PAGE_SIZE);
+ end = round_up(ei->addr + ei->size, PAGE_SIZE);
+ if (start >= end)
+ continue;
+ if (ei->type == E820_RESERVED)
+ continue;
+ r_end = start>>PAGE_SHIFT;
+ /* swsusp ignores invalid pfn, ignore these pages here */
+ if (r_end > end_pfn)
+ r_end = end_pfn;
+ if (r_end > r_start)
+ mark_nosave_page_range(r_start, r_end-1);
+ if (r_end >= end_pfn)
+ break;
+ r_start = end>>PAGE_SHIFT;
+ }
+}
+
+static void __init e820_save_acpi_pages(void)
+{
+ int i;
+
+ /* Assume e820 map is sorted */
+ for (i = 0; i < e820.nr_map; i++) {
+ struct e820entry *ei = &e820.map[i];
+ unsigned long start, end;
+
+ start = round_down(ei->addr, PAGE_SIZE) >> PAGE_SHIFT;
+ end = round_up(ei->addr + ei->size, PAGE_SIZE) >> PAGE_SHIFT;
+ if (start >= end)
+ continue;
+ if (ei->type != E820_ACPI && ei->type != E820_NVS)
+ continue;
+ /*
+ * If the region is below end_pfn, it will be
+ * saved/restored by swsusp follow 'RAM' type.
+ */
+ if (start < end_pfn)
+ start = end_pfn;
+ if (end > start)
+ swsusp_add_arch_pages(start, end - 1);
+ }
+}
+
+extern char __start_rodata, __end_rodata;
+/*
+ * kernel rodata - no save/restore
+ * BIOS reserved region/hole - no save/restore
+ * ACPI NVS - save/restore
+ * ACPI Data - save/restore
+ */
+static int __init mark_nosave_pages(void)
+{
+ unsigned long pfn_start, pfn_end;
+
+ /* BIOS reserved regions & holes */
+ e820_nosave_reserved_pages();
+
+ /* kernel rodata */
+ pfn_start = round_up(__pa_symbol(&__start_rodata), PAGE_SIZE) >> PAGE_SHIFT;
+ pfn_end = round_down(__pa_symbol(&__end_rodata), PAGE_SIZE) >> PAGE_SHIFT;
+ mark_nosave_page_range(pfn_start, pfn_end-1);
+
+ /* record ACPI Data/NVS as saveable */
+ e820_save_acpi_pages();
+
+ return 0;
+}
+core_initcall(mark_nosave_pages);
+#endif
+
void __init setup_arch(char **cmdline_p)
{
unsigned long kernel_end;
diff -puN kernel/power/snapshot.c~nosave_pages kernel/power/snapshot.c
--- linux-2.6.17-rc1/kernel/power/snapshot.c~nosave_pages 2006-04-04 15:10:42.000000000 +0800
+++ linux-2.6.17-rc1-root/kernel/power/snapshot.c 2006-04-05 14:56:56.000000000 +0800
@@ -39,6 +39,85 @@ static unsigned int nr_copy_pages;
static unsigned int nr_meta_pages;
static unsigned long *buffer;
+struct arch_savable_page {
+ unsigned long start_pfn;
+ unsigned long end_pfn;
+ struct arch_savable_page *next;
+ void *data[0];
+};
+static struct arch_savable_page * arch_pages;
+
+int swsusp_add_arch_pages(unsigned long start, unsigned long end)
+{
+ struct arch_savable_page *tmp;
+ unsigned int size;
+
+ end = end + 1;
+ size = sizeof(struct arch_savable_page) + sizeof(void *) * (end - start);
+ tmp = kzalloc(size, GFP_KERNEL);
+ if (!tmp)
+ return -ENOMEM;
+ tmp->start_pfn = start;
+ tmp->end_pfn = end;
+ tmp->next = arch_pages;
+ arch_pages = tmp;
+ return 0;
+}
+
+static unsigned int count_arch_pages(void)
+{
+ unsigned int count = 0;
+ struct arch_savable_page *tmp = arch_pages;
+ while (tmp) {
+ count += tmp->end_pfn - tmp->start_pfn;
+ tmp = tmp->next;
+ }
+ return count;
+}
+
+static int save_arch_mem(void)
+{
+ void *kaddr;
+ struct arch_savable_page *tmp = arch_pages;
+ int i;
+
+ pr_debug("swsusp: Saving arch specific memory");
+ while (tmp) {
+ for (i = 0; i < tmp->end_pfn - tmp->start_pfn; i++) {
+ tmp->data[i] = (void *)get_zeroed_page(GFP_ATOMIC);
+ if (!tmp->data[i])
+ return -ENOMEM;
+ /* arch pages might haven't a 'struct page' */
+ kaddr = kmap_atomic_pfn(tmp->start_pfn + i, KM_PTE0);
+ memcpy(tmp->data[i], kaddr, PAGE_SIZE);
+ kunmap_atomic(kaddr, KM_PTE0);
+ }
+ tmp = tmp->next;
+ }
+ return 0;
+}
+
+static int restore_arch_mem(void)
+{
+ void *kaddr;
+ struct arch_savable_page *tmp = arch_pages;
+ int i;
+
+ while (tmp) {
+ for (i = 0; i < tmp->end_pfn - tmp->start_pfn; i++) {
+ if (!tmp->data[i])
+ continue;
+ kaddr = kmap_atomic_pfn(tmp->start_pfn + i, KM_PTE0);
+ memcpy(kaddr, tmp->data[i], PAGE_SIZE);
+ kunmap_atomic(kaddr, KM_PTE0);
+ free_page((long)tmp->data[i]);
+ tmp->data[i] = NULL;
+ }
+ tmp = tmp->next;
+ }
+ return 0;
+}
+
#ifdef CONFIG_HIGHMEM
unsigned int count_highmem_pages(void)
{
@@ -150,8 +229,35 @@ int restore_highmem(void)
}
return 0;
}
+#else
+static unsigned int count_highmem_pages(void) {return 0;}
+static int save_highmem(void) {return 0;}
+static int restore_highmem(void) {return 0;}
#endif
+unsigned int count_special_pages(void)
+{
+ return count_arch_pages() + count_highmem_pages();
+}
+
+int save_special_mem(void)
+{
+ int ret;
+ ret = save_arch_mem();
+ if (!ret)
+ ret = save_highmem();
+ return ret;
+}
+
+int restore_special_mem(void)
+{
+ int ret;
+ ret = restore_arch_mem();
+ if (!ret)
+ ret = restore_highmem();
+ return ret;
+}
+
static int pfn_is_nosave(unsigned long pfn)
{
unsigned long nosave_begin_pfn = __pa(&__nosave_begin) >> PAGE_SHIFT;
@@ -177,7 +283,6 @@ static int saveable(struct zone *zone, u
return 0;
page = pfn_to_page(pfn);
- BUG_ON(PageReserved(page) && PageNosave(page));
if (PageNosave(page))
return 0;
if (PageReserved(page) && pfn_is_nosave(pfn))
diff -puN kernel/power/swsusp.c~nosave_pages kernel/power/swsusp.c
--- linux-2.6.17-rc1/kernel/power/swsusp.c~nosave_pages 2006-04-05 12:08:39.000000000 +0800
+++ linux-2.6.17-rc1-root/kernel/power/swsusp.c 2006-04-05 12:49:13.000000000 +0800
@@ -62,15 +62,9 @@ unsigned long image_size = 500 * 1024 *
int in_suspend __nosavedata = 0;
-#ifdef CONFIG_HIGHMEM
-unsigned int count_highmem_pages(void);
-int save_highmem(void);
-int restore_highmem(void);
-#else
-static int save_highmem(void) { return 0; }
-static int restore_highmem(void) { return 0; }
-static unsigned int count_highmem_pages(void) { return 0; }
-#endif
+unsigned int count_special_pages(void);
+int save_special_mem(void);
+int restore_special_mem(void);
/**
* The following functions are used for tracing the allocated
@@ -186,7 +180,7 @@ int swsusp_shrink_memory(void)
printk("Shrinking memory... ");
do {
- size = 2 * count_highmem_pages();
+ size = 2 * count_special_pages();
size += size / 50 + count_data_pages();
size += (size + PBES_PER_PAGE - 1) / PBES_PER_PAGE +
PAGES_FOR_IO;
@@ -228,7 +222,7 @@ int swsusp_suspend(void)
goto Enable_irqs;
}
- if ((error = save_highmem())) {
+ if ((error = save_special_mem())) {
printk(KERN_ERR "swsusp: Not enough free pages for highmem\n");
goto Restore_highmem;
}
@@ -239,7 +233,7 @@ int swsusp_suspend(void)
/* Restore control flow magically appears here */
restore_processor_state();
Restore_highmem:
- restore_highmem();
+ restore_special_mem();
device_power_up();
Enable_irqs:
local_irq_enable();
@@ -265,7 +259,7 @@ int swsusp_resume(void)
*/
swsusp_free();
restore_processor_state();
- restore_highmem();
+ restore_special_mem();
touch_softlockup_watchdog();
device_power_up();
local_irq_enable();
diff -puN include/linux/suspend.h~nosave_pages include/linux/suspend.h
--- linux-2.6.17-rc1/include/linux/suspend.h~nosave_pages 2006-04-05 14:45:18.000000000 +0800
+++ linux-2.6.17-rc1-root/include/linux/suspend.h 2006-04-05 14:49:07.000000000 +0800
@@ -72,6 +72,7 @@ struct saved_context;
void __save_processor_state(struct saved_context *ctxt);
void __restore_processor_state(struct saved_context *ctxt);
unsigned long get_safe_page(gfp_t gfp_mask);
+int swsusp_add_arch_pages(unsigned long start, unsigned long end);
/*
* XXX: We try to keep some more pages free so that I/O operations succeed
_
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: swsusp: which page should be saved?
2006-04-07 3:46 ` Shaohua Li
@ 2006-04-11 7:39 ` Pavel Machek
2006-04-11 7:52 ` Shaohua Li
0 siblings, 1 reply; 17+ messages in thread
From: Pavel Machek @ 2006-04-11 7:39 UTC (permalink / raw)
To: Shaohua Li; +Cc: Linux-pm mailing list
[-- Attachment #1: Type: text/plain, Size: 3525 bytes --]
Hi!
> > > > Ouch and IIRC top-level pagedir or something
> > > > like that lives in kernel "text" -- it is in assembly and
> > wrongly
> > > > placed.
> > > i386 does the right thing and put the pagedir in data segment.
> > x86_64
> > > not, I think we could clean it up.
> >
> > This probably should be done, first, and gotten past Andi.
> We asked the question to (intel's) BIOS guys, and below is the result.
> a. BIOS reserved region/hole - no save/restore
> b. ACPI NVS - save/restore
> c. 'ACPI Data' is a little tricky. After OS boots, os can reclaim this
> region, so regard it as normal ram. But we are afraid Linux runtime
> module loading might use this region somewhere, so we also mark this
> region as save/restore. Anyway, this hasn't any side effect.
> Hopefully all BIOSes follow this rule.
This comment needs to go somewhere in source.... aha, it is there,
mostly, but without the "tricky" remark.
> Pages (Reserved/ACPI NVS/ACPI Data) below
> end_pfn(x86_64)/max_low_pfn(i386) will be saved/restored by S4
> currently. We should mark 'Reserved' pages not saveable.
> Pages (Reserved/ACPI NVS/ACPI Data) above end_pfn/max_low_pfn
> will not be saved/restored by S4 currently. We should save the
> 'ACPI NVS/ACPI Data' pages in the highmem.
> Signed-off-by: Shaohua Li <shaohua.li@intel.com>
> ---
>
> linux-2.6.17-rc1-root/arch/i386/kernel/setup.c | 101 +++++++++++++++++++++
> linux-2.6.17-rc1-root/arch/x86_64/kernel/setup.c | 93 +++++++++++++++++++
> linux-2.6.17-rc1-root/include/linux/suspend.h | 1
> linux-2.6.17-rc1-root/kernel/power/snapshot.c | 107 ++++++++++++++++++++++-
> linux-2.6.17-rc1-root/kernel/power/swsusp.c | 20 +---
> 5 files changed, 308 insertions(+), 14 deletions(-)
>
> diff -puN arch/i386/kernel/setup.c~nosave_pages arch/i386/kernel/setup.c
> --- linux-2.6.17-rc1/arch/i386/kernel/setup.c~nosave_pages 2006-04-04 15:10:42.000000000 +0800
> +++ linux-2.6.17-rc1-root/arch/i386/kernel/setup.c 2006-04-06 09:40:02.000000000 +0800
> @@ -1400,6 +1401,106 @@ static void set_mca_bus(int x)
> static void set_mca_bus(int x) { }
> #endif
>
> +#ifdef CONFIG_SOFTWARE_SUSPEND
> +static void __init mark_nosave_page_range(unsigned long start, unsigned long end)
> +{
> + struct page *page;
> + while (start <= end) {
> + page = pfn_to_page(start);
> + SetPageNosave(page);
> + start ++;
I think start++ is prefered style.
> diff -puN kernel/power/snapshot.c~nosave_pages kernel/power/snapshot.c
> --- linux-2.6.17-rc1/kernel/power/snapshot.c~nosave_pages 2006-04-04 15:10:42.000000000 +0800
> +++ linux-2.6.17-rc1-root/kernel/power/snapshot.c 2006-04-05 14:56:56.000000000 +0800
> @@ -39,6 +39,85 @@ static unsigned int nr_copy_pages;
> static unsigned int nr_meta_pages;
> static unsigned long *buffer;
>
> +struct arch_savable_page {
I believe we use "saveable" spelling. Either use it, too, or fix the
other occurences.
> + unsigned long start_pfn;
> + unsigned long end_pfn;
> + struct arch_savable_page *next;
> + void *data[0];
> +};
> +static struct arch_savable_page * arch_pages;
No space between * and arch_pages, please.
> +int swsusp_add_arch_pages(unsigned long start, unsigned long end)
> +{
> + struct arch_savable_page *tmp;
> + unsigned int size;
> +
> + end = end + 1;
> + size = sizeof(struct arch_savable_page) + sizeof(void *) * (end - start);
> + tmp = kzalloc(size, GFP_KERNEL);
That's potentialy quite a big allocation, no? Could we use highmem
(-like) per page saving?
Pavel
--
Thanks, Sharp!
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: swsusp: which page should be saved?
2006-04-11 7:39 ` Pavel Machek
@ 2006-04-11 7:52 ` Shaohua Li
0 siblings, 0 replies; 17+ messages in thread
From: Shaohua Li @ 2006-04-11 7:52 UTC (permalink / raw)
To: Pavel Machek; +Cc: Linux-pm mailing list
[-- Attachment #1: Type: text/plain, Size: 4150 bytes --]
On Tue, 2006-04-11 at 15:39 +0800, Pavel Machek wrote:
> Hi!
>
> > > > > Ouch and IIRC top-level pagedir or something
> > > > > like that lives in kernel "text" -- it is in assembly and
> > > wrongly
> > > > > placed.
> > > > i386 does the right thing and put the pagedir in data segment.
> > > x86_64
> > > > not, I think we could clean it up.
> > >
> > > This probably should be done, first, and gotten past Andi.
> > We asked the question to (intel's) BIOS guys, and below is the
> result.
> > a. BIOS reserved region/hole - no save/restore
> > b. ACPI NVS - save/restore
> > c. 'ACPI Data' is a little tricky. After OS boots, os can reclaim
> this
> > region, so regard it as normal ram. But we are afraid Linux runtime
> > module loading might use this region somewhere, so we also mark
> this
> > region as save/restore. Anyway, this hasn't any side effect.
> > Hopefully all BIOSes follow this rule.
>
> This comment needs to go somewhere in source.... aha, it is there,
> mostly, but without the "tricky" remark.
Ok, will add that one in code too.
>
> > Pages (Reserved/ACPI NVS/ACPI Data) below
> > end_pfn(x86_64)/max_low_pfn(i386) will be saved/restored by S4
> > currently. We should mark 'Reserved' pages not saveable.
> > Pages (Reserved/ACPI NVS/ACPI Data) above end_pfn/max_low_pfn
> > will not be saved/restored by S4 currently. We should save the
> > 'ACPI NVS/ACPI Data' pages in the highmem.
>
>
>
> > Signed-off-by: Shaohua Li <shaohua.li@intel.com>
> > ---
> >
> > linux-2.6.17-rc1-root/arch/i386/kernel/setup.c | 101
> +++++++++++++++++++++
> > linux-2.6.17-rc1-root/arch/x86_64/kernel/setup.c | 93
> +++++++++++++++++++
> > linux-2.6.17-rc1-root/include/linux/suspend.h | 1
> > linux-2.6.17-rc1-root/kernel/power/snapshot.c | 107
> ++++++++++++++++++++++-
> > linux-2.6.17-rc1-root/kernel/power/swsusp.c | 20 +---
> > 5 files changed, 308 insertions(+), 14 deletions(-)
> >
> > diff -puN arch/i386/kernel/setup.c~nosave_pages
> arch/i386/kernel/setup.c
> > --- linux-2.6.17-rc1/arch/i386/kernel/setup.c~nosave_pages
> 2006-04-04 15:10:42.000000000 +0800
> > +++ linux-2.6.17-rc1-root/arch/i386/kernel/setup.c 2006-04-06
> 09:40:02.000000000 +0800
> > @@ -1400,6 +1401,106 @@ static void set_mca_bus(int x)
> > static void set_mca_bus(int x) { }
> > #endif
> >
> > +#ifdef CONFIG_SOFTWARE_SUSPEND
> > +static void __init mark_nosave_page_range(unsigned long start,
> unsigned long end)
> > +{
> > + struct page *page;
> > + while (start <= end) {
> > + page = pfn_to_page(start);
> > + SetPageNosave(page);
> > + start ++;
>
> I think start++ is prefered style.
ok.
>
> > diff -puN kernel/power/snapshot.c~nosave_pages
> kernel/power/snapshot.c
> > --- linux-2.6.17-rc1/kernel/power/snapshot.c~nosave_pages
> 2006-04-04 15:10:42.000000000 +0800
> > +++ linux-2.6.17-rc1-root/kernel/power/snapshot.c 2006-04-05
> 14:56:56.000000000 +0800
> > @@ -39,6 +39,85 @@ static unsigned int nr_copy_pages;
> > static unsigned int nr_meta_pages;
> > static unsigned long *buffer;
> >
> > +struct arch_savable_page {
>
> I believe we use "saveable" spelling. Either use it, too, or fix the
> other occurences.
Ok, will fix the spelling, sorry.
>
> > + unsigned long start_pfn;
> > + unsigned long end_pfn;
> > + struct arch_savable_page *next;
> > + void *data[0];
> > +};
> > +static struct arch_savable_page * arch_pages;
>
> No space between * and arch_pages, please.
ok.
>
> > +int swsusp_add_arch_pages(unsigned long start, unsigned long end)
> > +{
> > + struct arch_savable_page *tmp;
> > + unsigned int size;
> > +
> > + end = end + 1;
> > + size = sizeof(struct arch_savable_page) + sizeof(void *) *
> (end - start);
> > + tmp = kzalloc(size, GFP_KERNEL);
>
> That's potentialy quite a big allocation, no? Could we use highmem
> (-like) per page saving?
It could be, I'll follow the per page saving of highmem.
Thanks for looking at it. I'll send it to lkml for -mm test after fixed
you mentioned.
Thanks,
Shaohua
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2006-04-11 7:52 UTC | newest]
Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-03-16 3:53 swsusp: which page should be saved? Shaohua Li
2006-03-15 23:08 ` Pavel Machek
2006-03-17 1:12 ` Shaohua Li
2006-03-17 6:59 ` Pavel Machek
2006-03-21 2:19 ` Shaohua Li
2006-03-21 3:33 ` Nigel Cunningham
2006-03-21 9:42 ` Pavel Machek
2006-04-07 3:46 ` Shaohua Li
2006-04-11 7:39 ` Pavel Machek
2006-04-11 7:52 ` Shaohua Li
2006-03-17 10:50 ` Rafael J. Wysocki
2006-03-16 16:05 ` Rafael J. Wysocki
-- strict thread matches above, loose matches on Subject: below --
2006-03-21 3:11 Yu, Luming
2006-03-21 3:23 Yu, Luming
2006-03-21 3:33 ` Shaohua Li
2006-03-21 3:41 Yu, Luming
2006-03-21 4:25 Yu, Luming
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox