From: Vivek Goyal <vgoyal@redhat.com>
To: "Huang, Ying" <ying.huang@intel.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>,
Pavel Machek <pavel@ucw.cz>,
nigel@nigel.suspend2.net, "Rafael J. Wysocki" <rjw@sisk.pl>,
Andrew Morton <akpm@linux-foundation.org>,
linux-kernel@vger.kernel.org,
Kexec Mailing List <kexec@lists.infradead.org>
Subject: Re: [PATCH] kexec based hibernation: a prototype of kexec multi-stage load
Date: Tue, 13 May 2008 22:56:07 -0400
Message-ID: <20080514025607.GA19944@redhat.com>
In-Reply-To: <1210730266.23707.50.camel@caritas-dev.intel.com>
On Wed, May 14, 2008 at 09:57:46AM +0800, Huang, Ying wrote:
> Hi, Vivek,
>
> On Tue, 2008-05-13 at 01:34 -0400, Vivek Goyal wrote:
> > On Mon, May 12, 2008 at 02:40:41PM +0800, Huang, Ying wrote:
> > > This patch implements a prototype of kexec multi-stage load. With this
> > > patch, the "backup pages map" can be passed to the kexeced kernel via
> > > /sbin/kexec, and sys_kexec_load can be used to load a large
> > > hibernated image with a huge number of segments.
> > >
> > >
> >
> > Hi Huang,
> >
> > Had a quick look at the patch. Will review in detail soon. Had a few
> > thoughts.
> >
> > In general, these patches are on top of previous kexec jump patches.
> > It would be good if you could repost your updated patches so that
> > I can apply the patches and get some testing going.
>
> The kexec jump patch v9 is sufficient for this patch to work. I have no
> new version of kexec jump patch so far.
>
> > Last time I tried the patches (V9) and kexec jump did not work for me. I
> > was not getting timer interrupts in the second kernel. Then I had to put
> > LAPIC and IOAPIC in legacy mode and then the one-way jump started working.
> > I am not sure how the next kernel boots for you without putting APICs
> > in legacy mode. (Yet to make returning back to the original kernel work
> > using V9).
>
> Can normal kexec (without kexec jump) work without putting LAPIC and
> IOAPIC in legacy mode? Does this mean we should put LAPIC and IOAPIC
> into legacy mode before kexec and restore them after?
>
We do put LAPIC and IOAPIC in legacy mode in normal kexec. Look at
disable_IO_APIC() in native_machine_shutdown(). So I think we shall
have to do the same thing in kexec jump code too.
> The kexec jump patch works well on my IBM T42. But it seems that the
> IOAPIC is disabled in BIOS, so I can only use i8259 and LAPIC on this
> machine.
>
> > > In kexec based hibernation, resuming from disk is implemented as
> > > loading the hibernated disk image with sys_kexec_load(). But unlike
> > > the normal kexec load, the hibernated image may have huge number of
> > > segments. So multi-stage loading is necessary for kexec load based
> > > resuming from disk implementation.
> >
> > I understand that hibernated images are huge. But why do we require
> > multi stage loading? I knew there was a maximum segment limit in kexec.
> > But I think we can change that limit. Anything else prevents us from
> > loading large images in one go?
>
> There are two reasons for multi-stage loading:
>
> - Pass backup pages map from original kernel (A) to kexeced kernel (B),
> because it is not known before loading. We have discussed this before
> in:
> http://lkml.org/lkml/2008/3/12/308
> http://lkml.org/lkml/2008/3/14/59
> http://lkml.org/lkml/2008/3/21/299
>
See my response below....
> - Load large hibernated image. The hibernated image can be not only
> large but also discontinuous. For example, if the physical memory size
> is 4G and there is one free page every 2 pages, nearly 2G of memory
> ends up split into single-page segments. Loading these segments in one
> go is impossible. So multi-stage load is necessary. And if the
> hibernated image is compressed, it is also very difficult to load it in
> one go because of the anonymous pages needed.
>
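For a rough sense of scale, the worst case quoted above can be worked out
as follows. This is only a sketch under stated assumptions: 4 KiB pages,
every other page free so that each used page becomes its own segment, and
each sys_kexec_load() call carrying at most KEXEC_SEGMENT_MAX (16 in
2.6.25) segments. Whether the multi-stage patch keeps that per-call limit
is an assumption here, and the SKETCH_* names and helper functions are
hypothetical; only the field layout follows struct kexec_segment.

```c
#include <stddef.h>
#include <stdint.h>

/* Same fields as struct kexec_segment in include/linux/kexec.h. */
struct kexec_segment_sketch {
	void *buf;		/* user-space source buffer */
	size_t bufsz;
	unsigned long mem;	/* physical destination address */
	size_t memsz;
};

#define SKETCH_PAGE_SIZE	4096ULL	/* assumption: 4 KiB pages */
#define SKETCH_SEGMENT_MAX	16ULL	/* KEXEC_SEGMENT_MAX in 2.6.25 */

/* Worst case: every other page is free, so each used page is one segment. */
static uint64_t worst_case_segments(uint64_t mem_bytes)
{
	return (mem_bytes / SKETCH_PAGE_SIZE) / 2;
}

/* Size of the segment array user space would have to describe. */
static uint64_t segment_array_bytes(uint64_t nr_segments)
{
	return nr_segments * sizeof(struct kexec_segment_sketch);
}

/* Number of load calls if each call carries at most SKETCH_SEGMENT_MAX. */
static uint64_t loads_needed(uint64_t nr_segments)
{
	return (nr_segments + SKETCH_SEGMENT_MAX - 1) / SKETCH_SEGMENT_MAX;
}
```

With 4G of memory this gives 524288 single-page segments covering about
2G, and 32768 load calls at 16 segments per call.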
> > > And, multi-stage loading is also
> > > necessary for parameter passing from original kernel to kexeced kernel
> > > because some information such as "backup pages map" is not available
> > > before loading.
> > >
> > >
> > > Four stages are defined:
> > >
> > > - KS_start: start stage; begin a new kexec loading; there must be only
> > > one KS_start stage in one kexec loading.
> > >
> > > - KS_mid: middle stage; continue loading some segments; there may be
> > > many or zero KS_mid stages in one kexec loading; follows a KS_start or
> > > KS_mid stage.
> > >
> > > - KS_final: final stage; finish a kexec loading; there must be only
> > > one KS_final stage in one kexec loading; follows a KS_start or
> > > KS_mid stage.
> > >
> > > - KS_full: backward compatible with original loading semantics, finish
> > > all work of a kexec loading in one KS_full stage.
> > >
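The ordering rules in the list above can be sketched as a small state
machine. This is a minimal illustration, not code from the patch: the
enum identifiers, the load_state type and kexec_stage_step() are
hypothetical, and rejecting KS_start or KS_full while a load is already
in progress is an assumption (the description only says there must be one
KS_start per loading).

```c
#include <stdbool.h>

/* Hypothetical identifiers for the four stages described above. */
enum kexec_stage { KS_START, KS_MID, KS_FINAL, KS_FULL };

/* Whether a multi-stage load is currently open. */
enum load_state { LOAD_IDLE, LOAD_IN_PROGRESS };

/*
 * Apply one stage to the current state.
 * Returns true (and updates *state) if the stage is allowed here.
 */
static bool kexec_stage_step(enum load_state *state, enum kexec_stage stage)
{
	switch (stage) {
	case KS_START:
		/* Begins a new loading; assumed invalid if one is open. */
		if (*state != LOAD_IDLE)
			return false;
		*state = LOAD_IN_PROGRESS;
		return true;
	case KS_MID:
		/* Must follow KS_start or KS_mid; the load stays open. */
		return *state == LOAD_IN_PROGRESS;
	case KS_FINAL:
		/* Must follow KS_start or KS_mid; closes the load. */
		if (*state != LOAD_IN_PROGRESS)
			return false;
		*state = LOAD_IDLE;
		return true;
	case KS_FULL:
		/* One-shot load with the original semantics. */
		return *state == LOAD_IDLE;
	}
	return false;
}
```

A caller would start from LOAD_IDLE and feed the stage of each
sys_kexec_load() call through kexec_stage_step(), failing the call when
it returns false.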
> > >
> > > Overlapping between pages of different segments is allowed to support
> > > "parameter passing".
> > >
> > >
> > > During loading, a hash table mapped from destination page to source
> > > page is used instead of the original linear mapping
> > > implementation. Because the hibernated image may be very large (up to
> > > near the size of physical memory), it is very time-consuming to search
> > > for a source page given the destination page, which is used to check
> > > whether a newly allocated page is in the range of allocated
> > > destination pages.
> >
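To make the destination-to-source lookup concrete, here is a minimal
user-space sketch of a chained hash table keyed by destination page frame
number. It is not the patch's implementation: struct pgmap, the bucket
count, the hash function and all function names are hypothetical, and
letting a later mapping for the same destination replace the earlier one
is an assumption about how overlapping segments are handled.

```c
#include <stddef.h>
#include <stdlib.h>

#define PGMAP_BUCKETS 1024u	/* power of two, arbitrary for this sketch */

struct pgmap_entry {
	unsigned long dest_pfn;		/* key: destination page frame */
	unsigned long src_pfn;		/* value: page holding the data */
	struct pgmap_entry *next;	/* chain within one bucket */
};

struct pgmap {
	struct pgmap_entry *buckets[PGMAP_BUCKETS];
};

/* Multiplicative hash folded into the bucket range. */
static unsigned int pgmap_hash(unsigned long pfn)
{
	return (unsigned int)((pfn * 2654435761UL) >> 7) & (PGMAP_BUCKETS - 1);
}

/* Map dest_pfn to src_pfn; returns 0 on success, -1 if out of memory. */
static int pgmap_insert(struct pgmap *map, unsigned long dest_pfn,
			unsigned long src_pfn)
{
	unsigned int b = pgmap_hash(dest_pfn);
	struct pgmap_entry *e;

	for (e = map->buckets[b]; e; e = e->next) {
		if (e->dest_pfn == dest_pfn) {
			e->src_pfn = src_pfn;	/* assumed: later load wins */
			return 0;
		}
	}
	e = malloc(sizeof(*e));
	if (!e)
		return -1;
	e->dest_pfn = dest_pfn;
	e->src_pfn = src_pfn;
	e->next = map->buckets[b];
	map->buckets[b] = e;
	return 0;
}

/* Returns 1 and stores the source page in *src_pfn if dest_pfn is mapped. */
static int pgmap_lookup(const struct pgmap *map, unsigned long dest_pfn,
			unsigned long *src_pfn)
{
	const struct pgmap_entry *e;

	for (e = map->buckets[pgmap_hash(dest_pfn)]; e; e = e->next) {
		if (e->dest_pfn == dest_pfn) {
			*src_pfn = e->src_pfn;
			return 1;
		}
	}
	return 0;
}

/* Release every entry and leave the map empty. */
static void pgmap_free(struct pgmap *map)
{
	unsigned int b;

	for (b = 0; b < PGMAP_BUCKETS; b++) {
		while (map->buckets[b]) {
			struct pgmap_entry *e = map->buckets[b];

			map->buckets[b] = e->next;
			free(e);
		}
	}
}
```

The check for whether a newly allocated page collides with an already
claimed destination then becomes a single pgmap_lookup() on one bucket
chain rather than a walk over the whole mapping.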
> > This seems to be an optimization of kexec so that it becomes efficient
> > in loading large images (containing large number of segments). Probably
> > this can be a separate patch.
>
> If it is desired, I can separate it into another patch.
>
> > IMHO, we can just first write a minimal patch where one can just switch
> > between kernels. Once that patch is upstream, we can enhance
> > it to do the hibernation and saving core functionality. Incremental
> > review becomes easier. Your last patch (v9) was a good attempt at that and
> > I thought very soon we shall have something mergable.
>
> Agreed. We can first focus on kexec jump patch. But as in last thread of
> kexec jump (v9), we need a protocol for parameter passing between kernel
> A and kernel B. So, we can use this patch as a prototype for the
> communication protocol.
I went through the above mail thread again, where we were discussing what
information needs to be passed between kernels.

Last time we enumerated three things:

- kernel entry/re-entry point for switching between kernels.
- backup pages map for core filtering
- Probably ELF core notes for saving hibernated image.

I think if we just implement the functionality so that one can switch
back and forth between kernels (no hibernated image saving), then we probably
need to pass around only the kernel entry/re-entry point and nothing else, and
in your patches I think you are already doing that using %edi.

So, IMHO, for the first simple implementation, we don't have to pass around
any data between kernels except the entry point. (Please correct me if I am
wrong.) Let's get that implementation in first and then we can get the rest
of the pieces in place.
>
> > > The original mapping is only used by assembly code
> > > to swap the page contents. This map is also exported to user space via
> > > /proc/kexec_pgmap, so that /sbin/kexec can use it to construct the
> > > "backup pages map" parameter for kexeced kernel.
> > >
> > >
> > > This patch is based on Linux kernel 2.6.25 and kexec_jump patch, and
> > > has been tested on an IBM T42.
> > >
> >
> > Is the kexec_jump v9 patch good enough, or do you have another internal
> > version of the patch on top of which this patch applies?
>
> v9 is the latest kexec jump patch, no other internal version so far.
Great. I got busy with other stuff last time. Will download v9 again
and give it a try.

Thanks
Vivek