Proposal for a new algorithm for reading & writing a hibernation image.

public inbox for linux-kernel@vger.kernel.org

* Proposal for a new algorithm for reading & writing a hibernation image.
@ 2010-05-09 23:38 Nigel Cunningham
  2010-05-10 15:54 ` Bill Davidsen
  2010-05-30  5:25 ` [linux-pm] " Pavel Machek
  0 siblings, 2 replies; 5+ messages in thread
From: Nigel Cunningham @ 2010-05-09 23:38 UTC
  To: pm list, LKML, TuxOnIce-devel

Hi all.

Some discussions with Rafael a while ago (can't find the original 
message now, sorry) got me thinking about whether there might be a 
better way of writing a complete image of memory, particularly in the 
context of KMS breaking existing TuxOnIce algorithms. I finally got 
around to hammering out the algorithm last night, and thought I'd put it 
out there for others to comment on, particularly since I'm no expert on 
fault handling - it may be that what I'm thinking of is impossible on 
the hardware we support.

The algorithm I'm thinking of trying to implement goes as follows:

When saving the image
=====================

1. Modify driver suspend and resume routines so that freeing the memory
used to store state is separated from the restoring done by the resume
methods. This will allow us to get the drivers to save their
state prior to writing the image, without needing the memory allocated 
for this purpose to be atomically copied.
2. Prior to writing any of the image, also set up new 4k page tables 
such that an attempt to make a change to any of the pages we're about to 
write to disk will result in a page fault, giving us an opportunity to 
flag the page as needing an atomic copy later. Once this is done, write 
protection for the page can be disabled and the write that caused the 
fault allowed to proceed.
3. Write the entire contents of memory to disk.
4. Disable secondary CPUs (no need to do the driver suspend/resume 
again) and atomically copy pages that faulted while writing the image.
5. Write atomically copied data to disk, giving a complete image on disk 
of memory at the time of the atomic copy.
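
As a rough illustration of steps 2 to 5 above, here is a toy user-space
model in Python. It is only a sketch of the bookkeeping, not kernel code
and not how TuxOnIce or swsusp is implemented. The names TrackedMemory,
save_image and writes_during_io are hypothetical, and the "page fault" is
simulated by an ordinary check inside write().

```python
class TrackedMemory:
    """Toy model of RAM with per-page write protection (not kernel code)."""

    def __init__(self, pages):
        self.pages = list(pages)                 # page contents, indexed by pfn
        self.protected = [False] * len(self.pages)
        self.dirty = set()                       # pfns that faulted since protect_all()

    def protect_all(self):
        # Step 2: make every page read-only so the first write to it faults.
        self.protected = [True] * len(self.pages)
        self.dirty.clear()

    def write(self, pfn, data):
        if self.protected[pfn]:
            # Simulated page fault: flag the page, drop protection, carry on.
            self.dirty.add(pfn)
            self.protected[pfn] = False
        self.pages[pfn] = data


def save_image(mem, writes_during_io):
    """Return (image, list of pfns that needed an atomic copy)."""
    mem.protect_all()
    image = {}
    for pfn in range(len(mem.pages)):
        # Step 3: stream the page to "disk" while the system keeps running.
        image[pfn] = mem.pages[pfn]
        # Hypothetical schedule of writes that happen during this page's I/O.
        for target, data in writes_during_io.get(pfn, []):
            mem.write(target, data)
    # Step 4: with everything quiesced, copy only the pages that faulted.
    atomic_copy = {pfn: mem.pages[pfn] for pfn in sorted(mem.dirty)}
    # Step 5: the copied pages replace their stale versions in the image.
    image.update(atomic_copy)
    return image, sorted(atomic_copy)


mem = TrackedMemory([b"a", b"b", b"c", b"d"])
# While page 2 is being written out, pages 0 and 3 are modified.
image, copied = save_image(mem, {2: [(0, b"A"), (3, b"D")]})
```

In this run page 3 is modified before it has been written out, so its
atomic copy is redundant but harmless. The model flags any page written
after protection was set up, which keeps the fault path trivial at the
cost of a few extra copied pages.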

When loading the image
======================
1. Locate and allocate pages that can have data directly loaded (ie are 
free now and used in the saved image). These will be loaded without an 
'atomic restore'.
2. For other pages:
    As each page is loaded:
    - Write protect existing data.
    - If contents are the same as what is being loaded
      Discard loaded version
      If contents change after being write protected,
      1. make a copy of unmodified version to later atomically copy back.
      2. remove write protection
    - If contents differ
      1. set up atomic restore later
      2. remove write protection
3. After loading memory and determining what needs to be atomically 
restored:
- Do drivers suspend, atomic restore as is done at the moment
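
The page classification in steps 1 and 2 can be sketched the same way.
This is a minimal user-space model under stated assumptions, not an
implementation: plan_restore and on_write_fault are hypothetical names,
memory is a dict from page frame number to contents, and every non-free
page in the image is assumed to exist in the running kernel's memory.

```python
def plan_restore(current, free_pfns, image):
    """Classify each image page; returns (direct, atomic, skipped).

    current   -- dict pfn -> contents in the running (boot) kernel
    free_pfns -- set of pfns the boot kernel is not using
    image     -- dict pfn -> contents saved in the hibernation image
    """
    direct, atomic, skipped = {}, {}, set()
    for pfn, saved in image.items():
        if pfn in free_pfns:
            # Step 1: free now and used in the image, so load it in place.
            direct[pfn] = saved
        elif current[pfn] == saved:
            # Step 2, same contents: discard the loaded version.
            skipped.add(pfn)
        else:
            # Step 2, different contents: set up an atomic restore for later.
            atomic[pfn] = saved
    return direct, atomic, skipped


def on_write_fault(pfn, current, atomic, skipped):
    """Called before a write lands on a protected page that matched the image."""
    if pfn in skipped:
        # Keep the unmodified version so it can be copied back atomically.
        atomic[pfn] = current[pfn]
        skipped.discard(pfn)


current = {0: b"k", 1: b"x", 2: b"y"}
direct, atomic, skipped = plan_restore(current, {3}, {1: b"x", 2: b"Z", 3: b"n"})
# Page 1 matched the image but is written to before the atomic restore.
on_write_fault(1, current, atomic, skipped)
current[1] = b"changed"
```

After the simulated fault, page 1 has moved from the skipped set to the
atomic restore set, so the final atomic restore would put back the
contents recorded in the image.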

The main difficulties I see with the above, apart from not being sure
that I can achieve it with fault handling, are:

1. Memory requirements for the atomic copy wouldn't be known until the 
point where we get to the atomic copy. I guess, though, that with most 
things frozen, we'd expect the number to be reasonably consistent and small.
2. We also need extra memory for the driver suspend at resume time. That 
said, since it's not otherwise needed, it could be the same memory 
that's reserved for doing I/O and for atomically copied data when 
writing the image.

Are there other issues people can see that I might have missed?

Regards,

Nigel

* Re: Proposal for a new algorithm for reading & writing a hibernation image.
  2010-05-09 23:38 Proposal for a new algorithm for reading & writing a hibernation image Nigel Cunningham
@ 2010-05-10 15:54 ` Bill Davidsen
  2010-05-10 21:16   ` Nigel Cunningham
  2010-05-30  5:25 ` [linux-pm] " Pavel Machek
  1 sibling, 1 reply; 5+ messages in thread
From: Bill Davidsen @ 2010-05-10 15:54 UTC
  To: linux-kernel; +Cc: tuxonice-devel, linux-kernel

Nigel Cunningham wrote:
> Hi all.
> 
> Some discussions with Rafael a while ago (can't find the original 
> message now, sorry) got me thinking about whether there might be a 
> better way of writing a complete image of memory, particularly in the 
> context of KMS breaking existing TuxOnIce algorithms. I finally got 
> around to hammering out the algorithm last night, and thought I'd put it 
> out there for others to comment on, particularly since I'm no expert on 
> fault handling - it may be that what I'm thinking of is impossible on 
> the hardware we support.
> 
> The algorithm I'm thinking of trying to implement goes as follows:
> 
> When saving the image
> =====================
> 
> 1. Modify driver suspend and resume routines so that freeing the memory
> used to store state is separated from the restoring done by the resume
> methods. This will allow us to get the drivers to save their
> state prior to writing the image, without needing the memory allocated 
> for this purpose to be atomically copied.
> 2. Prior to writing any of the image, also set up new 4k page tables 
> such that an attempt to make a change to any of the pages we're about to 
> write to disk will result in a page fault, giving us an opportunity to 
> flag the page as needing an atomic copy later. Once this is done, write 
> protection for the page can be disabled and the write that caused the 
> fault allowed to proceed.
> 3. Write the entire contents of memory to disk.
> 4. Disable secondary CPUs (no need to do the driver suspend/resume 
> again) and atomically copy pages that faulted while writing the image.
> 5. Write atomically copied data to disk, giving a complete image on disk 
> of memory at the time of the atomic copy.
> 
> When loading the image
> ======================
> 1. Locate and allocate pages that can have data directly loaded (ie are 
> free now and used in the saved image). These will be loaded without an 
> 'atomic restore'.
> 2. For other pages:
>    As each page is loaded:
>    - Write protect existing data.
>    - If contents are the same as what is being loaded
>      Discard loaded version
>      If contents change after being write protected,
>      1. make a copy of unmodified version to later atomically copy back.
>      2. remove write protection
>    - If contents differ
>      1. set up atomic restore later
>      2. remove write protection
> 3. After loading memory and determining what needs to be atomically 
> restored:
> - Do drivers suspend, atomic restore as is done at the moment
> 
> The main difficulties I see with the above, apart from not being sure
> that I can achieve it with fault handling, are:
> 
> 1. Memory requirements for the atomic copy wouldn't be known until the 
> point where we get to the atomic copy. I guess, though, that with most 
> things frozen, we'd expect the number to be reasonably consistent and 
> small.
> 2. We also need extra memory for the driver suspend at resume time. That 
> said, since it's not otherwise needed, it could be the same memory 
> that's reserved for doing I/O and for atomically copied data when 
> writing the image.
> 
> Are there other issues people can see that I might have missed?
> 
I doubt you "missed" considering compression, but you didn't mention it.

-- 
Bill Davidsen <davidsen@tmr.com>
   "We have more to fear from the bungling of the incompetent than from
the machinations of the wicked."  - from Slashdot


* Re: Proposal for a new algorithm for reading & writing a hibernation image.
  2010-05-10 15:54 ` Bill Davidsen
@ 2010-05-10 21:16   ` Nigel Cunningham
  0 siblings, 0 replies; 5+ messages in thread
From: Nigel Cunningham @ 2010-05-10 21:16 UTC
  To: Bill Davidsen; +Cc: pm list, LKML, TuxOnIce-devel

Hi Bill.

On 11/05/10 01:54, Bill Davidsen wrote:
> Nigel Cunningham wrote:
>> Hi all.
>>
>> Some discussions with Rafael a while ago (can't find the original
>> message now, sorry) got me thinking about whether there might be a
>> better way of writing a complete image of memory, particularly in the
>> context of KMS breaking existing TuxOnIce algorithms. I finally got
>> around to hammering out the algorithm last night, and thought I'd put
>> it out there for others to comment on, particularly since I'm no
>> expert on fault handling - it may be that what I'm thinking of is
>> impossible on the hardware we support.
>>
>> The algorithm I'm thinking of trying to implement goes as follows:
>>
>> When saving the image
>> =====================
>>
>> 1. Modify driver suspend and resume routines so that freeing the
>> memory used to store state is separated from the restoring done by the
>> resume methods. This will allow us to get the drivers to save their
>> state prior to writing the image, without needing the memory allocated
>> for this purpose to be atomically copied.
>> 2. Prior to writing any of the image, also set up new 4k page tables
>> such that an attempt to make a change to any of the pages we're about
>> to write to disk will result in a page fault, giving us an opportunity
>> to flag the page as needing an atomic copy later. Once this is done,
>> write protection for the page can be disabled and the write that
>> caused the fault allowed to proceed.
>> 3. Write the entire contents of memory to disk.
>> 4. Disable secondary CPUs (no need to do the driver suspend/resume
>> again) and atomically copy pages that faulted while writing the image.
>> 5. Write atomically copied data to disk, giving a complete image on
>> disk of memory at the time of the atomic copy.
>>
>> When loading the image
>> ======================
>> 1. Locate and allocate pages that can have data directly loaded (ie
>> are free now and used in the saved image). These will be loaded
>> without an 'atomic restore'.
>> 2. For other pages:
>>    As each page is loaded:
>>    - Write protect existing data.
>>    - If contents are the same as what is being loaded
>>      Discard loaded version
>>      If contents change after being write protected,
>>      1. make a copy of unmodified version to later atomically copy back.
>>      2. remove write protection
>>    - If contents differ
>>      1. set up atomic restore later
>>      2. remove write protection
>> 3. After loading memory and determining what needs to be atomically
>> restored:
>> - Do drivers suspend, atomic restore as is done at the moment
>>
>> The main difficulties I see with the above, apart from not being sure
>> that I can achieve it with fault handling, are:
>>
>> 1. Memory requirements for the atomic copy wouldn't be known until the
>> point where we get to the atomic copy. I guess, though, that with most
>> things frozen, we'd expect the number to be reasonably consistent and
>> small.
>> 2. We also need extra memory for the driver suspend at resume time.
>> That said, since it's not otherwise needed, it could be the same
>> memory that's reserved for doing I/O and for atomically copied data
>> when writing the image.
>>
>> Are there other issues people can see that I might have missed?
>>
> I doubt you "missed" considering compression, but you didn't mention it.

Yeah. I was just focusing on the method of ensuring we get a consistent 
image. I'd be seeking in the first instance to modify the existing 
TuxOnIce code to work this way, so it would still have multithreaded 
I/O, compression and so on.

What I really want to do is work on patches to improve swsusp, but I 
have to keep the existing TuxOnIce users happy too :)

Regards,

Nigel

* Re: [linux-pm] Proposal for a new algorithm for reading & writing a hibernation image.
  2010-05-09 23:38 Proposal for a new algorithm for reading & writing a hibernation image Nigel Cunningham
  2010-05-10 15:54 ` Bill Davidsen
@ 2010-05-30  5:25 ` Pavel Machek
  2010-06-03  8:14   ` Nigel Cunningham
  1 sibling, 1 reply; 5+ messages in thread
From: Pavel Machek @ 2010-05-30  5:25 UTC
  To: Nigel Cunningham; +Cc: pm list, LKML, TuxOnIce-devel

Hi!

> 2. Prior to writing any of the image, also set up new 4k page tables 
> such that an attempt to make a change to any of the pages we're about to 
> write to disk will result in a page fault, giving us an opportunity to 
> flag the page as needing an atomic copy later. Once this is done, write 
> protection for the page can be disabled and the write that caused the 
> fault allowed to proceed.

Tricky.

page faulting code touches memory, too...

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

* Re: [linux-pm] Proposal for a new algorithm for reading & writing a hibernation image.
  2010-05-30  5:25 ` [linux-pm] " Pavel Machek
@ 2010-06-03  8:14   ` Nigel Cunningham
  0 siblings, 0 replies; 5+ messages in thread
From: Nigel Cunningham @ 2010-06-03  8:14 UTC
  To: Pavel Machek; +Cc: pm list, LKML, TuxOnIce-devel

Hi.

On 30/05/10 15:25, Pavel Machek wrote:
> Hi!
>
>> 2. Prior to writing any of the image, also set up new 4k page tables
>> such that an attempt to make a change to any of the pages we're about to
>> write to disk will result in a page fault, giving us an opportunity to
>> flag the page as needing an atomic copy later. Once this is done, write
>> protection for the page can be disabled and the write that caused the
>> fault allowed to proceed.
>
> Tricky.
>
> page faulting code touches memory, too...

Yeah. I realise the pages used to record the faults would themselves
need to be left unprotected. I'm imagining a bitmap for that.
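
To make the bitmap idea concrete, here is a small user-space sketch in
Python. It is only a model under assumptions: FaultBitmap, its method
names and the bitmap_base_pfn placement are all hypothetical, and it
covers only the bitmap's own pages, not the stack or other memory the
fault handler would touch.

```python
PAGE_SIZE = 4096


class FaultBitmap:
    """One bit per page frame, set when a write to that frame faults."""

    def __init__(self, nr_pages, bitmap_base_pfn):
        self.bits = bytearray((nr_pages + 7) // 8)
        # Page frames that hold the bitmap itself (hypothetical placement).
        nr_bitmap_pages = (len(self.bits) + PAGE_SIZE - 1) // PAGE_SIZE
        self.own_pfns = range(bitmap_base_pfn, bitmap_base_pfn + nr_bitmap_pages)

    def should_protect(self, pfn):
        # The bitmap's own frames stay writable; otherwise recording a
        # fault would itself fault.
        return pfn not in self.own_pfns

    def record_fault(self, pfn):
        self.bits[pfn // 8] |= 1 << (pfn % 8)

    def faulted(self):
        return [pfn for pfn in range(len(self.bits) * 8)
                if self.bits[pfn // 8] & (1 << (pfn % 8))]


# 1 GiB of 4k pages: 262144 frames, a 32768-byte bitmap, 8 bitmap pages.
bitmap = FaultBitmap(nr_pages=262144, bitmap_base_pfn=1000)
bitmap.record_fault(5)
bitmap.record_fault(262143)
```

At one bit per 4k page the bitmap costs 32 KiB per GiB of RAM, so the set
of pages that must stay unprotected for it is small and known in advance.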

Do you see any reason that it could be inherently impossible? That's 
what I really want to know before (potentially) wasting time trying it.

Regards,

Nigel
