* kexec trouble
@ 2006-12-05 14:37 Gerd Hoffmann
2006-12-05 15:53 ` Magnus Damm
0 siblings, 1 reply; 20+ messages in thread
From: Gerd Hoffmann @ 2006-12-05 14:37 UTC (permalink / raw)
To: Magnus Damm; +Cc: Xen devel list
Hi,
Uh, it's a bit messy, with the changes sprinkled over the sparse tree
and the patches directory, which makes it a bit hard to fixup stuff.
IMHO the kexec code makes way too many decisions at compile time, not
runtime, especially the ones in the kexec code core. Having something
depend on CONFIG_XEN doesn't fly with the paravirt approach planned for
mainline merge (same kernel binary runs both native and paravirtualized).
I'm also in trouble now with guest kexec patches as they work with guest
phys addrs not machine phys addrs.
I think we need either wrapper functions for machine_kexec_* functions
which dispatch to the correct function depending on the environment
(dom0 vs domU, later also native) or just make them function pointers to
achieve the same effect. Same goes for the KEXEC_ARCH_HAS_PAGE_MACROS
stuff. IMHO "#ifdef CONFIG_XEN" should go away from the core code (i.e.
kernel/kexec.c).
cheers,
Gerd
--
Gerd Hoffmann <kraxel@suse.de>
* Re: kexec trouble
2006-12-05 14:37 kexec trouble Gerd Hoffmann
@ 2006-12-05 15:53 ` Magnus Damm
2006-12-05 16:55 ` Gerd Hoffmann
2006-12-06 8:37 ` Keir Fraser
0 siblings, 2 replies; 20+ messages in thread
From: Magnus Damm @ 2006-12-05 15:53 UTC (permalink / raw)
To: Gerd Hoffmann; +Cc: Magnus Damm, Xen devel list
Hi Gerd,
On 12/5/06, Gerd Hoffmann <kraxel@suse.de> wrote:
> Hi,
>
> Uh, it's a bit messy, with the changes sprinkled over the sparse tree
> and the patches directory, which makes it a bit hard to fixup stuff.
Well, I'm sorry to hear that you think it is messy. I don't think that
we touch that many places in the sparse tree, but I agree that the
combination of patches and sparse may be a bit confusing. The
alternative to patches would have been to duplicate the files by
copying them into the sparse tree, which I wanted to avoid because I
think it makes future up porting difficult.
> IMHO the kexec code makes way too many decisions at compile time, not
> runtime, especially the ones in the kexec code core. Having something
> depend on CONFIG_XEN doesn't fly with the paravirt approach planned for
> mainline merge (same kernel binary runs both native and paravirtualized).
Sure, but isn't the paravirt stuff just for domU first to begin with?
I'm pretty sure that making the code dynamically decide between dom0,
domU or native is quite simple to implement when it comes to kexec,
but I wanted to wait with that until most parts of dom0 were running
under paravirt.
> I'm also in trouble now with guest kexec patches as they work with guest
> phys addrs not machine phys addrs.
Sorry if that made your life difficult, but shouldn't it just be a
matter of using the native versions of the page macros for domU? They
are in include/linux/kexec.h if I'm not mistaken. In a patch, not in
sparse.
> I think we need either wrapper functions for machine_kexec_* functions
> which dispatch to the correct function depending on the environment
> (dom0 vs domU, later also native) or just make them function pointers to
> achieve the same effect. Same goes for the KEXEC_ARCH_HAS_PAGE_MACROS
> stuff. IMHO "#ifdef CONFIG_XEN" should go away from the core code (i.e.
> kernel/kexec.c).
You mean for the paravirt stuff? Isn't paravirt basically a set of
callbacks that you can register? If so, what is stopping us from
registering a set of paravirt callbacks for the kexec code?
/ magnus
* Re: kexec trouble
2006-12-05 15:53 ` Magnus Damm
@ 2006-12-05 16:55 ` Gerd Hoffmann
2006-12-06 4:08 ` Magnus Damm
2006-12-06 8:37 ` Keir Fraser
1 sibling, 1 reply; 20+ messages in thread
From: Gerd Hoffmann @ 2006-12-05 16:55 UTC (permalink / raw)
To: Magnus Damm; +Cc: Magnus Damm, Xen devel list
Hi,
>> IMHO the kexec code makes way too many decisions at compile time, not
>> runtime, especially the ones in the kexec code core. Having something
>> depend on CONFIG_XEN doesn't fly with the paravirt approach planned for
>> mainline merge (same kernel binary runs both native and paravirtualized).
>
> Sure, but isn't the paravirt stuff just for domU first to begin with?
domU only as first step, later dom0 too.
> I'm pretty sure that making the code dynamically decide between dom0,
> domU or native is quite simple to implement when it comes to kexec,
> but I wanted to wait with that until most parts of dom0 were running
> under paravirt.
I'd prefer to do that _now_.
>> I'm also in trouble now with guest kexec patches as they work with guest
>> phys addrs not machine phys addrs.
>
> Sorry if that made your life difficult, but shouldn't it just be a
> matter of using the native versions of the page macros for domU?
No. The same xen kernel can run as both dom0 and domU, thus that must
be decided at runtime.
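
A minimal sketch of what "decided at runtime" could look like for the address macros, assuming a toy 4-page guest. The `sketch_*` names and the frame table are hypothetical and are not taken from the Xen tree; the only point is that one binary carries both translations and picks one at boot.

```c
#include <stdbool.h>

#define SKETCH_PAGE_SHIFT 12
#define SKETCH_NR_PAGES   4UL
#define SKETCH_BAD_ADDR   (~0UL)

/* Hypothetical pseudo-physical -> machine frame table of a 4-page toy guest. */
static const unsigned long sketch_p2m[SKETCH_NR_PAGES] = {
	0x8003, 0x8001, 0x9000, 0x8002
};

/* domU (or native) view: use the guest's own physical address unchanged. */
static unsigned long sketch_guest_addr(unsigned long paddr)
{
	return paddr;
}

/* dom0 view: swap the frame number for the machine frame, keep the offset. */
static unsigned long sketch_machine_addr(unsigned long paddr)
{
	unsigned long pfn = paddr >> SKETCH_PAGE_SHIFT;
	unsigned long off = paddr & ((1UL << SKETCH_PAGE_SHIFT) - 1);

	if (pfn >= SKETCH_NR_PAGES)
		return SKETCH_BAD_ADDR;	/* outside the toy table */
	return (sketch_p2m[pfn] << SKETCH_PAGE_SHIFT) | off;
}

/* One binary for both roles: start with the guest view, pick once at boot. */
static unsigned long (*sketch_kexec_addr)(unsigned long) = sketch_guest_addr;

static void sketch_select_addr_space(bool is_dom0)
{
	sketch_kexec_addr = is_dom0 ? sketch_machine_addr : sketch_guest_addr;
}
```

Callers only ever go through `sketch_kexec_addr()`, so the dom0/domU decision lives in one place instead of in an #ifdef at every use.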
>> I think we need either wrapper functions for machine_kexec_* functions
>> which dispatch to the correct function depending on the environment
>> (dom0 vs domU, later also native) or just make them function pointers to
>> achieve the same effect. Same goes for the KEXEC_ARCH_HAS_PAGE_MACROS
>> stuff. IMHO "#ifdef CONFIG_XEN" should go away from the core code (i.e.
>> kernel/kexec.c).
>
> You mean for the paravirt stuff?
And domU kexec. That works without any kexec core changes, and I
suspect the #ifdef CONFIG_XEN code will break it.
> Isn't paravirt basically a set of
> callbacks that you can register?
Yes.
> If so, what is stopping us from
> registering a set of paravirt callbacks for the kexec code?
Hmm, we'll end up with *two* sets of callbacks for xen, one for dom0 and
one for domU kexec. Not sure that fits the current paravirt design.
Given we may move to paravirt some day it's probably best to go with the
function pointers approach for now, that makes switching over to the
paravirt infrastructure (once it is mainline) easier. And I think it's
also less messy in the code.
cheers,
Gerd
--
Gerd Hoffmann <kraxel@suse.de>
* Re: kexec trouble
2006-12-05 16:55 ` Gerd Hoffmann
@ 2006-12-06 4:08 ` Magnus Damm
2006-12-06 8:48 ` Gerd Hoffmann
0 siblings, 1 reply; 20+ messages in thread
From: Magnus Damm @ 2006-12-06 4:08 UTC (permalink / raw)
To: Gerd Hoffmann; +Cc: Magnus Damm, Xen devel list, Horms
Hi again Gerd,
[CC Simon]
On 12/6/06, Gerd Hoffmann <kraxel@suse.de> wrote:
> >> I'm also in trouble now with guest kexec patches as they work with guest
> >> phys addrs not machine phys addrs.
> >
> > Sorry if that made your life difficult, but shouldn't it just be a
> > matter of using the native versions of the page macros for domU?
>
> No. The same xen kernel can run as both dom0 and domU, thus that must
> be decided at runtime.
Well, for us there was no need to decide that at runtime. Our scope
was only dom0.
For you a runtime check makes sense, especially now when our code is
merged and you have a conflict. It does however sound like you are
pissed because of the conflict, but I don't think you should blame that
on us. Simon and I reposted the patches at least 10 times over the
last half a year - so you had your time to come with feedback.
That aside, what about doing as little as possible now? Use
is_initial_xendomain() or something like that to switch between the
different dom0 and domU implementations. And whenever domU and dom0
run under paravirt we fix up the code to remove the #ifdef and add
native mode support.
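
A minimal sketch of that minimal-change variant, assuming a stand-in for is_initial_xendomain(). All `wrap_*` names are hypothetical, and the real helper is not modelled here; the sketch only shows a single entry point that branches on every call.

```c
#include <stdbool.h>

enum wrap_backend { WRAP_NONE, WRAP_DOM0, WRAP_DOMU };

struct wrap_kimage {
	enum wrap_backend prepared_by;
};

/* Stand-in for is_initial_xendomain(); the real helper is not modelled. */
static bool wrap_running_as_dom0;

static bool wrap_is_initial_xendomain(void)
{
	return wrap_running_as_dom0;
}

/* Hypothetical dom0 implementation. */
static int wrap_xen0_machine_kexec_prepare(struct wrap_kimage *image)
{
	image->prepared_by = WRAP_DOM0;
	return 0;
}

/* Hypothetical domU implementation. */
static int wrap_xenU_machine_kexec_prepare(struct wrap_kimage *image)
{
	image->prepared_by = WRAP_DOMU;
	return 0;
}

/* The wrapper keeps one entry point and checks the environment per call. */
static int wrap_machine_kexec_prepare(struct wrap_kimage *image)
{
	if (wrap_is_initial_xendomain())
		return wrap_xen0_machine_kexec_prepare(image);
	return wrap_xenU_machine_kexec_prepare(image);
}
```

Each machine_kexec_* entry point would need its own wrapper of this shape, and a native branch would be a third case in every one of them.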
> >> I think we need either wrapper functions for machine_kexec_* functions
> >> which dispatch to the correct function depending on the environment
> >> (dom0 vs domU, later also native) or just make them function pointers to
> >> achieve the same effect. Same goes for the KEXEC_ARCH_HAS_PAGE_MACROS
> >> stuff. IMHO "#ifdef CONFIG_XEN" should go away from the core code (i.e.
> >> kernel/kexec.c).
> >
> > You mean for the paravirt stuff?
>
> And domU kexec. That works without any kexec core changes, and I
> suspect the #ifdef CONFIG_XEN code will break it.
Replacing the #ifdefs with a runtime check is fine by me. I
think it's nice to avoid #ifdefs if possible, but again - our scope of
implementation was simply to add dom0 support. We did not care about
domU support or paravirt that wasn't included at that time.
> > If so, what is stopping us from
> > registering a set of paravirt callbacks for the kexec code?
>
> Hmm, we'll end up with *two* sets of callbacks for xen, one for dom0 and
> one for domU kexec. Not sure that fits the current paravirt design.
I'm pretty sure that these things will be easy to resolve when the
time is right.
> Given we may move to paravirt some day it's probably best to go with the
> function pointers approach for now, that makes switching over to the
> paravirt infrastructure (once it is mainline) easier. And I think it's
> also less messy in the code.
There is only a point in having function pointers when you have more
than one implementation. And now you are going from one implementation
to two, so adding function pointers makes sense. If we had added
function pointers in our patch it would have been pure bloat because
there was no one there except us to use them.
/ magnus
* Re: kexec trouble
2006-12-05 15:53 ` Magnus Damm
2006-12-05 16:55 ` Gerd Hoffmann
@ 2006-12-06 8:37 ` Keir Fraser
2006-12-06 9:08 ` Magnus Damm
1 sibling, 1 reply; 20+ messages in thread
From: Keir Fraser @ 2006-12-06 8:37 UTC (permalink / raw)
To: Magnus Damm, Gerd Hoffmann; +Cc: Magnus Damm, Xen devel list
On 5/12/06 3:53 pm, "Magnus Damm" <magnus.damm@gmail.com> wrote:
>> I think we need either wrapper functions for machine_kexec_* functions
>> which dispatch to the correct function depending on the environment
>> (dom0 vs domU, later also native) or just make them function pointers to
>> achieve the same effect. Same goes for the KEXEC_ARCH_HAS_PAGE_MACROS
>> stuff. IMHO "#ifdef CONFIG_XEN" should go away from the core code (i.e.
>> kernel/kexec.c).
>
> You mean for the paravirt stuff? Isn't paravirt basically a set of
> callbacks that you can register? If so, what is stopping us from
> registering a set of paravirt callbacks for the kexec code?
I think partly Gerd's point is that CONFIG_XEN in kernel/kexec.c will never
get merged upstream. Guaranteed.
The kexec/kdump patches are not very tidy in some respects like this. We
applied them now because the functionality is useful, but I don't think we
yet have the finished polished article. Also you got away with it because
the code changes were hidden in the patches/ directory, which you originally
said was simply backported code from 2.6.19 (not backported-and-hacked-on!).
-- Keir
* Re: kexec trouble
2006-12-06 4:08 ` Magnus Damm
@ 2006-12-06 8:48 ` Gerd Hoffmann
2006-12-06 9:41 ` Magnus Damm
0 siblings, 1 reply; 20+ messages in thread
From: Gerd Hoffmann @ 2006-12-06 8:48 UTC (permalink / raw)
To: Magnus Damm; +Cc: Magnus Damm, Xen devel list, Horms
Magnus Damm wrote:
> For you a runtime check makes sense, especially now when our code is
> merged and you have a conflict. It does however sound like you are
> pissed because of the conflict, but I don't think you should blame that
> on us.
Yes, a bit, especially as we've talked a bit about dom0/domU kexec at
the Xen Summit, so I assumed you are aware of the problem. The
sparse/patches split of the code also makes it hard to change it.
> Simon and I reposted the patches at least 10 times over the
> last half a year - so you had your time to come with feedback.
Yes, I should have checked before. -ENOTIME. Bad decision
nevertheless, now it probably costs even more time to fix it up
afterwards ....
> That aside, what about doing as little as possible now? Use
> is_initial_xendomain() or something like that to switch between the
> different dom0 and domU implementations. And whenever domU and dom0
> run under paravirt we fix up the code to remove the #ifdef and add
> native mode support.
I'd go for the function pointer approach. I think it is easier to
maintain in the long run. Wrapper functions which look at
is_initial_xendomain() then call either xen0_machine_kexec or
xenU_machine_kexec quickly get messy with lots of #ifdef CONFIG_FOOBAR,
and it would be a temporary solution only anyway.
I think you compile in native code too, although it is dead code, right?
So we can make machine_kexec() + friends function pointers, rename the
native functions and initialize the function pointers to the native
versions. I think it should even be possible to make them function
pointers for i386/x86_64 archs only. Things keep working with
CONFIG_XEN=n then, and with CONFIG_XEN=y the initialization function
just switches the function pointers (depending on is_initial_xendomain()).
This also eliminates the first set of #ifdefs in kernel/kexec.c ;)
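
A minimal sketch of that scheme, assuming hypothetical `hook_*` names and a single hook rather than the whole machine_kexec_* family: the native function is renamed, the pointer takes over the public name and defaults to native, and a Xen-only setup call switches it once.

```c
#include <stdbool.h>

enum hook_impl { HOOK_NATIVE, HOOK_XEN_DOM0, HOOK_XEN_DOMU };

struct hook_kimage {
	enum hook_impl prepared_by;
};

/* The former native function, renamed so the pointer can own the name. */
static int hook_native_kexec_prepare(struct hook_kimage *image)
{
	image->prepared_by = HOOK_NATIVE;
	return 0;
}

/* Hypothetical dom0 implementation. */
static int hook_xen0_kexec_prepare(struct hook_kimage *image)
{
	image->prepared_by = HOOK_XEN_DOM0;
	return 0;
}

/* Hypothetical domU implementation. */
static int hook_xenU_kexec_prepare(struct hook_kimage *image)
{
	image->prepared_by = HOOK_XEN_DOMU;
	return 0;
}

/* Defaults to native, so a CONFIG_XEN=n build needs no setup call at all. */
static int (*hook_machine_kexec_prepare)(struct hook_kimage *image) =
	hook_native_kexec_prepare;

/* Hypothetical init hook, meant to be built and called only with CONFIG_XEN=y. */
static void hook_xen_kexec_setup(bool is_initial_xendomain)
{
	hook_machine_kexec_prepare = is_initial_xendomain
		? hook_xen0_kexec_prepare
		: hook_xenU_kexec_prepare;
}
```

Unlike a wrapper, the environment is checked once at setup time, and the core code calling `hook_machine_kexec_prepare()` contains no Xen-specific branch.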
> Replacing the #ifdefs with a runtime check is fine by me. I
> think it's nice to avoid #ifdefs if possible, but again - our scope of
> implementation was simply to add dom0 support. We did not care about
> domU support or paravirt that wasn't included at that time.
Having "#ifdef CONFIG_XEN" in kernel/kexec.c will most likely never be
accepted mainline (and we do seek mainline merge, don't we?). IMHO that
is enough reason to avoid it in the first place.
cheers,
Gerd
--
Gerd Hoffmann <kraxel@suse.de>
* Re: kexec trouble
2006-12-06 8:37 ` Keir Fraser
@ 2006-12-06 9:08 ` Magnus Damm
0 siblings, 0 replies; 20+ messages in thread
From: Magnus Damm @ 2006-12-06 9:08 UTC (permalink / raw)
To: Keir Fraser; +Cc: Gerd Hoffmann, Xen devel list, Magnus Damm
On 12/6/06, Keir Fraser <keir@xensource.com> wrote:
> On 5/12/06 3:53 pm, "Magnus Damm" <magnus.damm@gmail.com> wrote:
>
> >> I think we need either wrapper functions for machine_kexec_* functions
> >> which dispatch to the correct function depending on the environment
> >> (dom0 vs domU, later also native) or just make them function pointers to
> >> achieve the same effect. Same goes for the KEXEC_ARCH_HAS_PAGE_MACROS
> >> stuff. IMHO "#ifdef CONFIG_XEN" should go away from the core code (i.e.
> >> kernel/kexec.c).
> >
> > You mean for the paravirt stuff? Isn't paravirt basically a set of
> > callbacks that you can register? If so, what is stopping us from
> > registering a set of paravirt callbacks for the kexec code?
>
> I think partly Gerd's point is that CONFIG_XEN in kernel/kexec.c will never
> get merged upstream. Guaranteed.
Sure, I understand that. But I see this as an iterative process, where
our code so far has been written to fit the current codebase. When
dom0 runs on paravirt and we can test the code then it should be
adjusted. It's kind of hard to write for something that doesn't yet
exist. =) So regardless how you do it, you still need to adjust your
code towards the new interface in the end - it's just a matter of how
much code you need to adjust.
I'm all for converting the code into using runtime checks or callbacks
if that is needed, and I would have done so in the first place if I'd
known that it was something that you guys wanted. But I didn't so we
used the simplest possible solution instead which was CONFIG_XEN.
> The kexec/kdump patches are not very tidy in some respects like this. We
> applied them now because the functionality is useful, but I don't think we
> yet have the finished polished article. Also you got away with it because
> the code changes were hidden in the patches/ directory, which you originally
> said was simply backported code from 2.6.19 (not backported-and-hacked-on!).
The git-patches are backports. The other ones are not:
http://lists.xensource.com/archives/html/xen-devel/2006-10/msg01240.html
/ magnus
* Re: kexec trouble
2006-12-06 8:48 ` Gerd Hoffmann
@ 2006-12-06 9:41 ` Magnus Damm
2006-12-06 10:31 ` Gerd Hoffmann
0 siblings, 1 reply; 20+ messages in thread
From: Magnus Damm @ 2006-12-06 9:41 UTC (permalink / raw)
To: Gerd Hoffmann; +Cc: Magnus Damm, Xen devel list, Horms
On 12/6/06, Gerd Hoffmann <kraxel@suse.de> wrote:
> Magnus Damm wrote:
> > For you a runtime check makes sense, especially now when our code is
> > merged and you have a conflict. It does however sound like you are
> > pissed because of the conflict, but I don't think you should blame that
> > on us.
>
> Yes, a bit, especially as we've talked a bit about dom0/domU kexec at
> the Xen Summit, so I assumed you are aware of the problem. The
> sparse/patches split of the code also makes it hard to change it.
We chit-chatted a bit, but I don't remember us talking about any
implementation details.
I've heard complaints and doubts about using sparse together with
patches, but when I ask for a better alternative it's always awfully
silent. We could have copied the files into sparse and applied our
patches, but duplicating files seemed a step in the wrong direction.
It's funny because the reason behind using patches is to simplify up
porting, but now instead of simplifying it seems to confuse people.
Maybe we should have copied the files to sparse instead, would that
have been better?
> > Simon and I reposted the patches at least 10 times over the
> > last half a year - so you had your time to come with feedback.
>
> Yes, I should have checked before. -ENOTIME. Bad decision
> nevertheless, now it probably costs even more time to fix it up
> afterwards ....
I don't mind changing pieces of the code now. It would probably have
been easier to do the right thing earlier, but the number of changes
needed is probably pretty low.
If there is anything I can help out with just let me know!
> > That aside, what about doing as little as possible now? Use
> > is_initial_xendomain() or something like that to switch between the
> > different dom0 and domU implementations. And whenever domU and dom0
> > run under paravirt we fix up the code to remove the #ifdef and add
> > native mode support.
>
> I'd go for the function pointer approach. I think it is easier to
> maintain in the long run. Wrapper functions which look at
> is_initial_xendomain() then call either xen0_machine_kexec or
> xenU_machine_kexec quickly get messy with lots of #ifdef CONFIG_FOOBAR,
> and it would be a temporary solution only anyway.
Yes, the function pointer solution is a lot nicer.
> I think you compile in native code too, although it is dead code, right?
The only dead code function that I know of would be machine_kexec(),
and that one will be needed if we want to support native mode.
> So we can make machine_kexec() + friends function pointers, rename the
> native functions and initialize the function pointers to the native
> versions. I think it should even be possible to make them function
> pointers for i386/x86_64 archs only. Things keep working with
> CONFIG_XEN=n then, and with CONFIG_XEN=y the initialization function
> just switches the function pointers (depending on is_initial_xendomain()).
> This also eliminates the first set of #ifdefs in kernel/kexec.c ;)
Sounds exactly what I would have done! =)
> > Replacing the #ifdefs with a runtime check is fine by me. I
> > think it's nice to avoid #ifdefs if possible, but again - our scope of
> > implementation was simply to add dom0 support. We did not care about
> > domU support or paravirt that wasn't included at that time.
>
> Having "#ifdef CONFIG_XEN" in kernel/kexec.c will most likely never be
> accepted mainline (and we do seek mainline merge, don't we?). IMHO that
> is enough reason to avoid it in the first place.
Yes and no. =)
You seem to code with the goal of having something that will be
directly acceptable for mainline, but my goal is to write as simple
code as possible which should be easy to adjust to whatever framework
that exists at the time of mainline merge.
Let me know what I can do to help out.
Thanks,
/ magnus
* Re: kexec trouble
2006-12-06 9:41 ` Magnus Damm
@ 2006-12-06 10:31 ` Gerd Hoffmann
2006-12-06 11:11 ` Magnus Damm
0 siblings, 1 reply; 20+ messages in thread
From: Gerd Hoffmann @ 2006-12-06 10:31 UTC (permalink / raw)
To: Magnus Damm; +Cc: Magnus Damm, Xen devel list, Horms
Hi,
> We chit-chatted a bit, but I don't remember us talking about any
> implementation details.
Discussed briefly possible code sharing, that there likely isn't much to
share because we have two very different approaches to take, and that we
are probably best off just having two machine_kexec() versions for
dom0/domU. No details yet how to actually implement that, but at least
the need for some kind of runtime switching should have been clear.
> I've heard complaints and doubts about using sparse together with
> patches, but when I ask for a better alternative it's always awfully
> silent. We could have copied the files into sparse and applied our
> patches, but duplicating files seemed a step in the wrong direction.
For backports and code planned for quick mainline merge maintaining as
patches is fine, makes it easier to move forward once stuff is merged
and/or the xen linux tree is updated to a newer upstream kernel.
For code which likely lives longer in the xen tree (especially
kexec-generic.patch which has almost no chance to be accepted mainline
as-is) it is a pain to deal with as patch.
I'd love to see kernel/kexec.c not being touched at all, but I think
that is impossible for dom0 kexec (due to range checks which must happen
in machine not guest address space for example).
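
A minimal sketch of why such a range check gives different answers in the two address spaces, assuming a toy 2-page dom0 and a made-up 16 MiB ceiling. The `range_*` names, the table and the limit are hypothetical, and the check is a simplification of the ones in kernel/kexec.c.

```c
#include <stdbool.h>

#define RANGE_PAGE_SHIFT     12
#define RANGE_CONTROL_LIMIT  0x01000000UL	/* hypothetical 16 MiB ceiling */
#define RANGE_NR_PAGES       2UL

/* Hypothetical guest frame -> machine frame table for a 2-page toy dom0. */
static const unsigned long range_p2m[RANGE_NR_PAGES] = { 0x00100UL, 0x20000UL };

/* True if the page starting at this frame ends at or below the ceiling. */
static bool range_frame_ok(unsigned long frame)
{
	return ((frame + 1) << RANGE_PAGE_SHIFT) <= RANGE_CONTROL_LIMIT;
}

/* Check on the guest's own frame number. */
static bool range_check_guest(unsigned long pfn)
{
	return range_frame_ok(pfn);
}

/* Check on the machine frame behind it, which is what dom0 kexec needs. */
static bool range_check_machine(unsigned long pfn)
{
	if (pfn >= RANGE_NR_PAGES)
		return false;	/* outside the toy table */
	return range_frame_ok(range_p2m[pfn]);
}
```

In this toy setup guest frame 1 passes the guest-side check but fails the machine-side one, because its backing machine frame sits far above the ceiling.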
>> So we can make machine_kexec() + friends function pointers, rename the
>> native functions and initialize the function pointers to the native
>> versions. I think it should even be possible to make them function
>> pointers for i386/x86_64 archs only. Things keep working with
>> CONFIG_XEN=n then, and with CONFIG_XEN=y the initialization function
>> just switches the function pointers (depending on is_initial_xendomain()).
>> This also eliminates the first set of #ifdefs in kernel/kexec.c ;)
>
> Sounds exactly what I would have done! =)
Great, so let's do that.
> You seem to code with the goal of having something that will be
> directly acceptable for mainline, but my goal is to write as simple
> code as possible which should be easy to adjust to whatever framework
> that exists at the time of mainline merge.
Given that the framework will be paravirt_ops function pointers fit
nicely ;)
cheers,
Gerd
--
Gerd Hoffmann <kraxel@suse.de>
* Re: kexec trouble
2006-12-06 10:31 ` Gerd Hoffmann
@ 2006-12-06 11:11 ` Magnus Damm
2006-12-06 13:23 ` Gerd Hoffmann
2006-12-07 11:24 ` Gerd Hoffmann
0 siblings, 2 replies; 20+ messages in thread
From: Magnus Damm @ 2006-12-06 11:11 UTC (permalink / raw)
To: Gerd Hoffmann; +Cc: Magnus Damm, Xen devel list, Horms
On 12/6/06, Gerd Hoffmann <kraxel@suse.de> wrote:
> Hi,
>
> > We chit-chatted a bit, but I don't remember us talking about any
> > implementation details.
>
> Discussed briefly possible code sharing, that there likely isn't much to
> share because we have two very different approaches to take, and that we
> are probably best off just having two machine_kexec() versions for
> dom0/domU. No details yet how to actually implement that, but at least
> the need for some kind of runtime switching should have been clear.
We needed to work together to implement runtime switching anyhow, and
that's what is happening now. But maybe I should have considered the
runtime switching earlier...
> > I've heard complaints and doubts about using sparse together with
> > patches, but when I ask for a better alternative it's always awfully
> > silent. We could have copied the files into sparse and applied our
> > patches, but duplicating files seemed a step in the wrong direction.
>
> For backports and code planned for quick mainline merge maintaining as
> patches is fine, makes it easier to move forward once stuff is merged
> and/or the xen linux tree is updated to a newer upstream kernel.
Ack.
> For code which likely lives longer in the xen tree (especially
> kexec-generic.patch which has almost no chance to be accepted mainline
> as-is) it is a pain to deal with as patch.
Yeah, I can agree with that. Feel free to add the files to sparse and
throw out the patch. The dependency on patches and other stuff may
make it difficult though.
> I'd love to see kernel/kexec.c not being touched at all, but I think
> that is impossible for dom0 kexec (due to range checks which must happen
> in machine not guest address space for example).
We hoped to not touch the generic code at all too, but we had to
because of machine addresses.
> >> So we can make machine_kexec() + friends function pointers, rename the
> >> native functions and initialize the function pointers to the native
> >> versions. I think it should even be possible to make them function
> >> pointers for i386/x86_64 archs only. Things keep working with
> >> CONFIG_XEN=n then, and with CONFIG_XEN=y the initialization function
> >> just switches the function pointers (depending on is_initial_xendomain()).
> >> This also eliminates the first set of #ifdefs in kernel/kexec.c ;)
> >
> > Sounds exactly what I would have done! =)
>
> Great, so let's do that.
Excellent! Let me know how and where you want my help.
> > You seem to code with the goal of having something that will be
> > directly acceptable for mainline, but my goal is to write as simple
> > code as possible which should be easy to adjust to whatever framework
> > that exists at the time of mainline merge.
>
> Given that the framework will be paravirt_ops function pointers fit
> nicely ;)
Function pointers sound like the right way to go! Happy hacking!
Thanks,
/ magnus
* Re: kexec trouble
2006-12-06 11:11 ` Magnus Damm
@ 2006-12-06 13:23 ` Gerd Hoffmann
2006-12-06 13:40 ` Muli Ben-Yehuda
2006-12-07 11:24 ` Gerd Hoffmann
1 sibling, 1 reply; 20+ messages in thread
From: Gerd Hoffmann @ 2006-12-06 13:23 UTC (permalink / raw)
To: Magnus Damm; +Cc: Magnus Damm, Xen devel list, Horms
Hi,
>> For code which likely lives longer in the xen tree (especially
>> kexec-generic.patch which has almost no chance to be accepted mainline
>> as-is) it is a pain to deal with as patch.
>
> Yeah, I can agree with that. Feel free to add the files to sparse and
> throw out the patch. The dependency on patches and other stuff may
> make it difficult though.
*Aaaaargh*, it's even messier than I thought. We have linux kernel
source files which are modified by patches *AND* are in the sparse tree.
And the two versions don't match of course. Looks like that is an
older issue though, so I can't blame kexec for that one ;)
These patches can't be removed cleanly after running mkbuildtree:
x86-put-note-sections-into-a-pt_note-segment-in-vmlinux.patch
smp-alts.patch
net-gso-2-checksum-fix.patch
net-gso-0-base.patch
We *must* find a more sane way to maintain the linux kernel sources,
this is one more reason why mixing sparse tree and patches isn't going
to fly. As far as I know at least the sparse tree is planned to be
dropped, now with dom0 and xen being decoupled (3.0.3+) it should be
possible without too much hassle. Any plans what to use instead? quilt
patch queue?
cheers,
Gerd
--
Gerd Hoffmann <kraxel@suse.de>
* Re: kexec trouble
2006-12-06 13:23 ` Gerd Hoffmann
@ 2006-12-06 13:40 ` Muli Ben-Yehuda
0 siblings, 0 replies; 20+ messages in thread
From: Muli Ben-Yehuda @ 2006-12-06 13:40 UTC (permalink / raw)
To: Gerd Hoffmann; +Cc: Magnus Damm, Xen devel list, Magnus Damm, Horms
On Wed, Dec 06, 2006 at 02:23:36PM +0100, Gerd Hoffmann wrote:
> We *must* find a more sane way to maintain the linux kernel sources,
> this is one more reason why mixing sparse tree and patches isn't going
> to fly. As far as I know at least the sparse tree is planned to be
> dropped, now with dom0 and xen being decoupled (3.0.3+) it should be
> possible without too much hassle. Any plans what to use instead? quilt
> patch queue?
A full hg or git tree would be nicer... the way things are going,
patches applied to such a tree wouldn't be appropriate for upstream
inclusion without cleaning up anyway, so I don't see what the patch
queue method will buy us as opposed to a full tree. Of course nearly
anything would be better than the sparse + patches method.
Cheers,
Muli
* Re: kexec trouble
2006-12-06 11:11 ` Magnus Damm
2006-12-06 13:23 ` Gerd Hoffmann
@ 2006-12-07 11:24 ` Gerd Hoffmann
2006-12-08 4:15 ` Magnus Damm
1 sibling, 1 reply; 20+ messages in thread
From: Gerd Hoffmann @ 2006-12-07 11:24 UTC (permalink / raw)
To: Magnus Damm; +Cc: Magnus Damm, Xen devel list, Horms
[-- Attachment #1: Type: text/plain, Size: 340 bytes --]
Hi,
> Function pointers sound like the right way to go! Happy hacking!
First step of a cleanup by moving to function pointers.
Compile tested only.
First three attachments replace the patches with identical names in
patches/linux-2.6. The last should be applied to the sparse tree.
cheers,
Gerd
--
Gerd Hoffmann <kraxel@suse.de>
[-- Attachment #2: kexec-generic.patch --]
[-- Type: text/x-patch, Size: 8430 bytes --]
---
include/linux/kexec.h | 22 +++++++++++-
kernel/kexec.c | 85 ++++++++++++++++++++++++++++++++++++++++----------
2 files changed, 89 insertions(+), 18 deletions(-)
Index: kexec-2.6.16/include/linux/kexec.h
===================================================================
--- kexec-2.6.16.orig/include/linux/kexec.h
+++ kexec-2.6.16/include/linux/kexec.h
@@ -85,12 +85,30 @@ struct kimage {
#define KEXEC_TYPE_CRASH 1
};
-
-
/* kexec interface functions */
+extern unsigned long (*kexec_page_to_pfn)(struct page *page);
+extern struct page* (*kexec_pfn_to_page)(unsigned long pfn);
+extern unsigned long (*kexec_virt_to_phys)(void *addr);
+extern void* (*kexec_phys_to_virt)(unsigned long addr);
+
+#ifdef KEXEC_ARCH_USES_HOOKS
+extern NORET_TYPE void (*machine_kexec)(struct kimage *image) ATTRIB_NORET;
+extern int (*machine_kexec_prepare)(struct kimage *image);
+extern int (*machine_kexec_load)(struct kimage *image);
+extern void (*machine_kexec_unload)(struct kimage *image);
+extern void (*machine_kexec_cleanup)(struct kimage *image);
+#else
extern NORET_TYPE void machine_kexec(struct kimage *image) ATTRIB_NORET;
extern int machine_kexec_prepare(struct kimage *image);
+static inline int machine_kexec_load(struct kimage *image) { return 0; }
+static inline void machine_kexec_unload(struct kimage *image) { }
extern void machine_kexec_cleanup(struct kimage *image);
+#endif
+
+#ifdef CONFIG_XEN
+extern void xen_machine_kexec_setup_resources(void);
+extern void xen_machine_kexec_register_resources(struct resource *res);
+#endif
extern asmlinkage long sys_kexec_load(unsigned long entry,
unsigned long nr_segments,
struct kexec_segment __user *segments,
Index: kexec-2.6.16/kernel/kexec.c
===================================================================
--- kexec-2.6.16.orig/kernel/kexec.c
+++ kexec-2.6.16/kernel/kexec.c
@@ -27,6 +27,31 @@
#include <asm/system.h>
#include <asm/semaphore.h>
+static unsigned long default_page_to_pfn(struct page *page)
+{
+ return page_to_pfn(page);
+}
+
+static struct page* default_pfn_to_page(unsigned long pfn)
+{
+ return pfn_to_page(pfn);
+}
+
+static unsigned long default_virt_to_phys(void *addr)
+{
+ return virt_to_phys(addr);
+}
+
+static void* default_phys_to_virt(unsigned long addr)
+{
+ return phys_to_virt(addr);
+}
+
+unsigned long (*kexec_page_to_pfn)(struct page *page) = default_page_to_pfn;
+struct page* (*kexec_pfn_to_page)(unsigned long pfn) = default_pfn_to_page;
+unsigned long (*kexec_virt_to_phys)(void *addr) = default_virt_to_phys;
+void* (*kexec_phys_to_virt)(unsigned long addr) = default_phys_to_virt;
+
/* Per cpu memory for storing cpu states in case of system crash. */
note_buf_t* crash_notes;
@@ -403,7 +428,7 @@ static struct page *kimage_alloc_normal_
pages = kimage_alloc_pages(GFP_KERNEL, order);
if (!pages)
break;
- pfn = page_to_pfn(pages);
+ pfn = kexec_page_to_pfn(pages);
epfn = pfn + count;
addr = pfn << PAGE_SHIFT;
eaddr = epfn << PAGE_SHIFT;
@@ -437,6 +462,7 @@ static struct page *kimage_alloc_normal_
return pages;
}
+#ifndef CONFIG_XEN
static struct page *kimage_alloc_crash_control_pages(struct kimage *image,
unsigned int order)
{
@@ -490,7 +516,7 @@ static struct page *kimage_alloc_crash_c
}
/* If I don't overlap any segments I have found my hole! */
if (i == image->nr_segments) {
- pages = pfn_to_page(hole_start >> PAGE_SHIFT);
+ pages = kexec_pfn_to_page(hole_start >> PAGE_SHIFT);
break;
}
}
@@ -517,6 +543,13 @@ struct page *kimage_alloc_control_pages(
return pages;
}
+#else /* !CONFIG_XEN */
+struct page *kimage_alloc_control_pages(struct kimage *image,
+ unsigned int order)
+{
+ return kimage_alloc_normal_control_pages(image, order);
+}
+#endif
static int kimage_add_entry(struct kimage *image, kimage_entry_t entry)
{
@@ -532,7 +565,7 @@ static int kimage_add_entry(struct kimag
return -ENOMEM;
ind_page = page_address(page);
- *image->entry = virt_to_phys(ind_page) | IND_INDIRECTION;
+ *image->entry = kexec_virt_to_phys(ind_page) | IND_INDIRECTION;
image->entry = ind_page;
image->last_entry = ind_page +
((PAGE_SIZE/sizeof(kimage_entry_t)) - 1);
@@ -593,13 +626,13 @@ static int kimage_terminate(struct kimag
#define for_each_kimage_entry(image, ptr, entry) \
for (ptr = &image->head; (entry = *ptr) && !(entry & IND_DONE); \
ptr = (entry & IND_INDIRECTION)? \
- phys_to_virt((entry & PAGE_MASK)): ptr +1)
+ kexec_phys_to_virt((entry & PAGE_MASK)): ptr +1)
static void kimage_free_entry(kimage_entry_t entry)
{
struct page *page;
- page = pfn_to_page(entry >> PAGE_SHIFT);
+ page = kexec_pfn_to_page(entry >> PAGE_SHIFT);
kimage_free_pages(page);
}
@@ -611,6 +644,9 @@ static void kimage_free(struct kimage *i
if (!image)
return;
+ if (machine_kexec_unload)
+ machine_kexec_unload(image);
+
kimage_free_extra_pages(image);
for_each_kimage_entry(image, ptr, entry) {
if (entry & IND_INDIRECTION) {
@@ -630,7 +666,8 @@ static void kimage_free(struct kimage *i
kimage_free_entry(ind);
/* Handle any machine specific cleanup */
- machine_kexec_cleanup(image);
+ if (machine_kexec_cleanup)
+ machine_kexec_cleanup(image);
/* Free the kexec control pages... */
kimage_free_page_list(&image->control_pages);
@@ -686,7 +723,7 @@ static struct page *kimage_alloc_page(st
* have a match.
*/
list_for_each_entry(page, &image->dest_pages, lru) {
- addr = page_to_pfn(page) << PAGE_SHIFT;
+ addr = kexec_page_to_pfn(page) << PAGE_SHIFT;
if (addr == destination) {
list_del(&page->lru);
return page;
@@ -701,12 +738,12 @@ static struct page *kimage_alloc_page(st
if (!page)
return NULL;
/* If the page cannot be used file it away */
- if (page_to_pfn(page) >
+ if (kexec_page_to_pfn(page) >
(KEXEC_SOURCE_MEMORY_LIMIT >> PAGE_SHIFT)) {
list_add(&page->lru, &image->unuseable_pages);
continue;
}
- addr = page_to_pfn(page) << PAGE_SHIFT;
+ addr = kexec_page_to_pfn(page) << PAGE_SHIFT;
/* If it is the destination page we want use it */
if (addr == destination)
@@ -729,7 +766,7 @@ static struct page *kimage_alloc_page(st
struct page *old_page;
old_addr = *old & PAGE_MASK;
- old_page = pfn_to_page(old_addr >> PAGE_SHIFT);
+ old_page = kexec_pfn_to_page(old_addr >> PAGE_SHIFT);
copy_highpage(page, old_page);
*old = addr | (*old & ~PAGE_MASK);
@@ -779,7 +816,7 @@ static int kimage_load_normal_segment(st
result = -ENOMEM;
goto out;
}
- result = kimage_add_page(image, page_to_pfn(page)
+ result = kimage_add_page(image, kexec_page_to_pfn(page)
<< PAGE_SHIFT);
if (result < 0)
goto out;
@@ -811,6 +848,7 @@ out:
return result;
}
+#ifndef CONFIG_XEN
static int kimage_load_crash_segment(struct kimage *image,
struct kexec_segment *segment)
{
@@ -833,7 +871,7 @@ static int kimage_load_crash_segment(str
char *ptr;
size_t uchunk, mchunk;
- page = pfn_to_page(maddr >> PAGE_SHIFT);
+ page = kexec_pfn_to_page(maddr >> PAGE_SHIFT);
if (page == 0) {
result = -ENOMEM;
goto out;
@@ -881,6 +919,13 @@ static int kimage_load_segment(struct ki
return result;
}
+#else /* CONFIG_XEN */
+static int kimage_load_segment(struct kimage *image,
+ struct kexec_segment *segment)
+{
+ return kimage_load_normal_segment(image, segment);
+}
+#endif
/*
* Exec Kernel system call: for obvious reasons only root may call it.
@@ -978,9 +1023,11 @@ asmlinkage long sys_kexec_load(unsigned
if (result)
goto out;
- result = machine_kexec_prepare(image);
- if (result)
- goto out;
+ if (machine_kexec_prepare) {
+ result = machine_kexec_prepare(image);
+ if (result)
+ goto out;
+ }
for (i = 0; i < nr_segments; i++) {
result = kimage_load_segment(image, &image->segment[i]);
@@ -991,6 +1038,13 @@ asmlinkage long sys_kexec_load(unsigned
if (result)
goto out;
}
+
+ if (machine_kexec_load) {
+ result = machine_kexec_load(image);
+ if (result)
+ goto out;
+ }
+
/* Install the new kernel, and Uninstall the old */
image = xchg(dest_image, image);
@@ -1045,7 +1099,6 @@ void crash_kexec(struct pt_regs *regs)
struct kimage *image;
int locked;
-
/* Take the kexec_lock here to prevent sys_kexec_load
* running on one cpu from replacing the crash kernel
* we are using after a panic on a different cpu.
[-- Attachment #3: linux-2.6.19-rc1-kexec-xen-i386.patch --]
[-- Type: text/x-patch, Size: 4588 bytes --]
---
arch/i386/kernel/crash.c | 4 ++
arch/i386/kernel/machine_kexec.c | 65 +++++++++++++++++++++++++--------------
include/asm-i386/kexec.h | 3 +
3 files changed, 49 insertions(+), 23 deletions(-)
Index: kexec-2.6.16/arch/i386/kernel/crash.c
===================================================================
--- kexec-2.6.16.orig/arch/i386/kernel/crash.c
+++ kexec-2.6.16/arch/i386/kernel/crash.c
@@ -90,6 +90,7 @@ static void crash_save_self(struct pt_re
crash_save_this_cpu(regs, cpu);
}
+#ifndef CONFIG_XEN
#ifdef CONFIG_SMP
static atomic_t waiting_for_crash_ipi;
@@ -158,6 +159,7 @@ static void nmi_shootdown_cpus(void)
/* There are no cpus to shootdown */
}
#endif
+#endif /* CONFIG_XEN */
void machine_crash_shutdown(struct pt_regs *regs)
{
@@ -174,10 +176,12 @@ void machine_crash_shutdown(struct pt_re
/* Make a note of crashing cpu. Will be used in NMI callback.*/
crashing_cpu = smp_processor_id();
+#ifndef CONFIG_XEN
nmi_shootdown_cpus();
lapic_shutdown();
#if defined(CONFIG_X86_IO_APIC)
disable_IO_APIC();
#endif
+#endif /* CONFIG_XEN */
crash_save_self(regs);
}
Index: kexec-2.6.16/arch/i386/kernel/machine_kexec.c
===================================================================
--- kexec-2.6.16.orig/arch/i386/kernel/machine_kexec.c
+++ kexec-2.6.16/arch/i386/kernel/machine_kexec.c
@@ -19,6 +19,10 @@
#include <asm/desc.h>
#include <asm/system.h>
+#ifdef CONFIG_XEN
+#include <xen/interface/kexec.h>
+#endif
+
#define PAGE_ALIGNED __attribute__ ((__aligned__(PAGE_SIZE)))
static u32 kexec_pgd[1024] PAGE_ALIGNED;
#ifdef CONFIG_X86_PAE
@@ -28,37 +32,45 @@ static u32 kexec_pmd1[1024] PAGE_ALIGNED
static u32 kexec_pte0[1024] PAGE_ALIGNED;
static u32 kexec_pte1[1024] PAGE_ALIGNED;
-/*
- * A architecture hook called to validate the
- * proposed image and prepare the control pages
- * as needed. The pages for KEXEC_CONTROL_CODE_SIZE
- * have been allocated, but the segments have yet
- * been copied into the kernel.
- *
- * Do what every setup is needed on image and the
- * reboot code buffer to allow us to avoid allocations
- * later.
- *
- * Currently nothing.
- */
-int machine_kexec_prepare(struct kimage *image)
-{
- return 0;
-}
+#ifdef CONFIG_XEN
-/*
- * Undo anything leftover by machine_kexec_prepare
- * when an image is freed.
- */
-void machine_kexec_cleanup(struct kimage *image)
+#define __ma(x) (pfn_to_mfn(__pa((x)) >> PAGE_SHIFT) << PAGE_SHIFT)
+
+#if PAGES_NR > KEXEC_XEN_NO_PAGES
+#error PAGES_NR is greater than KEXEC_XEN_NO_PAGES - Xen support will break
+#endif
+
+#if PA_CONTROL_PAGE != 0
+#error PA_CONTROL_PAGE is non zero - Xen support will break
+#endif
+
+void machine_kexec_setup_load_arg(xen_kexec_image_t *xki, struct kimage *image)
{
+ void *control_page;
+
+ memset(xki->page_list, 0, sizeof(xki->page_list));
+
+ control_page = page_address(image->control_code_page);
+ memcpy(control_page, relocate_kernel, PAGE_SIZE);
+
+ xki->page_list[PA_CONTROL_PAGE] = __ma(control_page);
+ xki->page_list[PA_PGD] = __ma(kexec_pgd);
+#ifdef CONFIG_X86_PAE
+ xki->page_list[PA_PMD_0] = __ma(kexec_pmd0);
+ xki->page_list[PA_PMD_1] = __ma(kexec_pmd1);
+#endif
+ xki->page_list[PA_PTE_0] = __ma(kexec_pte0);
+ xki->page_list[PA_PTE_1] = __ma(kexec_pte1);
+
}
+#endif /* CONFIG_XEN */
+
/*
* Do not allocate memory (or fail in any way) in machine_kexec().
* We are past the point of no return, committed to rebooting now.
*/
-NORET_TYPE void machine_kexec(struct kimage *image)
+static NORET_TYPE ATTRIB_NORET void native_machine_kexec(struct kimage *image)
{
unsigned long page_list[PAGES_NR];
void *control_page;
@@ -87,3 +99,10 @@ NORET_TYPE void machine_kexec(struct kim
relocate_kernel((unsigned long)image->head, (unsigned long)page_list,
image->start, cpu_has_pae);
}
+
+NORET_TYPE void (*machine_kexec)(struct kimage *image) ATTRIB_NORET
+ = native_machine_kexec;
+int (*machine_kexec_prepare)(struct kimage *image) = NULL;
+int (*machine_kexec_load)(struct kimage *image) = NULL;
+void (*machine_kexec_unload)(struct kimage *image) = NULL;
+void (*machine_kexec_cleanup)(struct kimage *image) = NULL;
Index: kexec-2.6.16/include/asm-i386/kexec.h
===================================================================
--- kexec-2.6.16.orig/include/asm-i386/kexec.h
+++ kexec-2.6.16/include/asm-i386/kexec.h
@@ -98,6 +98,9 @@ relocate_kernel(unsigned long indirectio
unsigned long start_address,
unsigned int has_pae) ATTRIB_NORET;
+
+#define KEXEC_ARCH_USES_HOOKS 1
+
#endif /* __ASSEMBLY__ */
#endif /* _I386_KEXEC_H */
[-- Attachment #4: linux-2.6.19-rc1-kexec-xen-x86_64.patch --]
[-- Type: text/x-patch, Size: 7755 bytes --]
---
arch/x86_64/kernel/crash.c | 6 +
arch/x86_64/kernel/machine_kexec.c | 133 +++++++++++++++++++++++++++++++++----
include/asm-x86_64/kexec.h | 7 +
3 files changed, 132 insertions(+), 14 deletions(-)
Index: kexec-2.6.16/arch/x86_64/kernel/crash.c
===================================================================
--- kexec-2.6.16.orig/arch/x86_64/kernel/crash.c
+++ kexec-2.6.16/arch/x86_64/kernel/crash.c
@@ -92,6 +92,7 @@ static void crash_save_self(struct pt_re
crash_save_this_cpu(regs, cpu);
}
+#ifndef CONFIG_XEN
#ifdef CONFIG_SMP
static atomic_t waiting_for_crash_ipi;
@@ -156,6 +157,7 @@ static void nmi_shootdown_cpus(void)
/* There are no cpus to shootdown */
}
#endif
+#endif /* CONFIG_XEN */
void machine_crash_shutdown(struct pt_regs *regs)
{
@@ -173,6 +175,8 @@ void machine_crash_shutdown(struct pt_re
/* Make a note of crashing cpu. Will be used in NMI callback.*/
crashing_cpu = smp_processor_id();
+
+#ifndef CONFIG_XEN
nmi_shootdown_cpus();
if(cpu_has_apic)
@@ -181,6 +185,6 @@ void machine_crash_shutdown(struct pt_re
#if defined(CONFIG_X86_IO_APIC)
disable_IO_APIC();
#endif
-
+#endif /* CONFIG_XEN */
crash_save_self(regs);
}
Index: kexec-2.6.16/arch/x86_64/kernel/machine_kexec.c
===================================================================
--- kexec-2.6.16.orig/arch/x86_64/kernel/machine_kexec.c
+++ kexec-2.6.16/arch/x86_64/kernel/machine_kexec.c
@@ -24,6 +24,104 @@ static u64 kexec_pud1[512] PAGE_ALIGNED;
static u64 kexec_pmd1[512] PAGE_ALIGNED;
static u64 kexec_pte1[512] PAGE_ALIGNED;
+#ifdef CONFIG_XEN
+
+/* In the case of Xen, override hypervisor functions to be able to create
+ * a regular identity mapping page table...
+ */
+
+#include <xen/interface/kexec.h>
+#include <xen/interface/memory.h>
+
+#define x__pmd(x) ((pmd_t) { (x) } )
+#define x__pud(x) ((pud_t) { (x) } )
+#define x__pgd(x) ((pgd_t) { (x) } )
+
+#define x_pmd_val(x) ((x).pmd)
+#define x_pud_val(x) ((x).pud)
+#define x_pgd_val(x) ((x).pgd)
+
+static inline void x_set_pmd(pmd_t *dst, pmd_t val)
+{
+ x_pmd_val(*dst) = x_pmd_val(val);
+}
+
+static inline void x_set_pud(pud_t *dst, pud_t val)
+{
+ x_pud_val(*dst) = phys_to_machine(x_pud_val(val));
+}
+
+static inline void x_pud_clear (pud_t *pud)
+{
+ x_pud_val(*pud) = 0;
+}
+
+static inline void x_set_pgd(pgd_t *dst, pgd_t val)
+{
+ x_pgd_val(*dst) = phys_to_machine(x_pgd_val(val));
+}
+
+static inline void x_pgd_clear (pgd_t * pgd)
+{
+ x_pgd_val(*pgd) = 0;
+}
+
+#define X__PAGE_KERNEL_LARGE_EXEC \
+ _PAGE_PRESENT | _PAGE_RW | _PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_PSE
+#define X_KERNPG_TABLE _PAGE_PRESENT | _PAGE_RW | _PAGE_ACCESSED | _PAGE_DIRTY
+
+#define __ma(x) (pfn_to_mfn(__pa((x)) >> PAGE_SHIFT) << PAGE_SHIFT)
+
+#if PAGES_NR > KEXEC_XEN_NO_PAGES
+#error PAGES_NR is greater than KEXEC_XEN_NO_PAGES - Xen support will break
+#endif
+
+#if PA_CONTROL_PAGE != 0
+#error PA_CONTROL_PAGE is non zero - Xen support will break
+#endif
+
+void machine_kexec_setup_load_arg(xen_kexec_image_t *xki, struct kimage *image)
+{
+ void *control_page;
+ void *table_page;
+
+ memset(xki->page_list, 0, sizeof(xki->page_list));
+
+ control_page = page_address(image->control_code_page) + PAGE_SIZE;
+ memcpy(control_page, relocate_kernel, PAGE_SIZE);
+
+ table_page = page_address(image->control_code_page);
+
+ xki->page_list[PA_CONTROL_PAGE] = __ma(control_page);
+ xki->page_list[PA_TABLE_PAGE] = __ma(table_page);
+
+ xki->page_list[PA_PGD] = __ma(kexec_pgd);
+ xki->page_list[PA_PUD_0] = __ma(kexec_pud0);
+ xki->page_list[PA_PUD_1] = __ma(kexec_pud1);
+ xki->page_list[PA_PMD_0] = __ma(kexec_pmd0);
+ xki->page_list[PA_PMD_1] = __ma(kexec_pmd1);
+ xki->page_list[PA_PTE_0] = __ma(kexec_pte0);
+ xki->page_list[PA_PTE_1] = __ma(kexec_pte1);
+}
+
+#else /* CONFIG_XEN */
+
+#define x__pmd(x) __pmd(x)
+#define x__pud(x) __pud(x)
+#define x__pgd(x) __pgd(x)
+
+#define x_set_pmd(x, y) set_pmd(x, y)
+#define x_set_pud(x, y) set_pud(x, y)
+#define x_set_pgd(x, y) set_pgd(x, y)
+
+#define x_pud_clear(x) pud_clear(x)
+#define x_pgd_clear(x) pgd_clear(x)
+
+#define X__PAGE_KERNEL_LARGE_EXEC __PAGE_KERNEL_LARGE_EXEC
+#define X_KERNPG_TABLE _KERNPG_TABLE
+
+#endif /* CONFIG_XEN */
+
static void init_level2_page(pmd_t *level2p, unsigned long addr)
{
unsigned long end_addr;
@@ -31,7 +129,7 @@ static void init_level2_page(pmd_t *leve
addr &= PAGE_MASK;
end_addr = addr + PUD_SIZE;
while (addr < end_addr) {
- set_pmd(level2p++, __pmd(addr | __PAGE_KERNEL_LARGE_EXEC));
+ x_set_pmd(level2p++, x__pmd(addr | X__PAGE_KERNEL_LARGE_EXEC));
addr += PMD_SIZE;
}
}
@@ -56,12 +154,12 @@ static int init_level3_page(struct kimag
}
level2p = (pmd_t *)page_address(page);
init_level2_page(level2p, addr);
- set_pud(level3p++, __pud(__pa(level2p) | _KERNPG_TABLE));
+ x_set_pud(level3p++, x__pud(__pa(level2p) | X_KERNPG_TABLE));
addr += PUD_SIZE;
}
/* clear the unused entries */
while (addr < end_addr) {
- pud_clear(level3p++);
+ x_pud_clear(level3p++);
addr += PUD_SIZE;
}
out:
@@ -92,12 +190,12 @@ static int init_level4_page(struct kimag
if (result) {
goto out;
}
- set_pgd(level4p++, __pgd(__pa(level3p) | _KERNPG_TABLE));
+ x_set_pgd(level4p++, x__pgd(__pa(level3p) | X_KERNPG_TABLE));
addr += PGDIR_SIZE;
}
/* clear the unused entries */
while (addr < end_addr) {
- pgd_clear(level4p++);
+ x_pgd_clear(level4p++);
addr += PGDIR_SIZE;
}
out:
@@ -108,11 +206,17 @@ out:
static int init_pgtable(struct kimage *image, unsigned long start_pgtable)
{
pgd_t *level4p;
+ unsigned long x_end_pfn = end_pfn;
+
+#ifdef CONFIG_XEN
+ x_end_pfn = HYPERVISOR_memory_op(XENMEM_maximum_ram_page, NULL);
+#endif
+
level4p = (pgd_t *)__va(start_pgtable);
- return init_level4_page(image, level4p, 0, end_pfn << PAGE_SHIFT);
+ return init_level4_page(image, level4p, 0, x_end_pfn << PAGE_SHIFT);
}
-int machine_kexec_prepare(struct kimage *image)
+static int native_machine_kexec_prepare(struct kimage *image)
{
unsigned long start_pgtable;
int result;
@@ -128,16 +232,11 @@ int machine_kexec_prepare(struct kimage
return 0;
}
-void machine_kexec_cleanup(struct kimage *image)
-{
- return;
-}
-
/*
* Do not allocate memory (or fail in any way) in machine_kexec().
* We are past the point of no return, committed to rebooting now.
*/
-NORET_TYPE void machine_kexec(struct kimage *image)
+static NORET_TYPE ATTRIB_NORET void native_machine_kexec(struct kimage *image)
{
unsigned long page_list[PAGES_NR];
void *control_page;
@@ -171,3 +270,11 @@ NORET_TYPE void machine_kexec(struct kim
relocate_kernel((unsigned long)image->head, (unsigned long)page_list,
image->start);
}
+
+NORET_TYPE void (*machine_kexec)(struct kimage *image) ATTRIB_NORET
+ = native_machine_kexec;
+int (*machine_kexec_prepare)(struct kimage *image)
+ = native_machine_kexec_prepare;
+int (*machine_kexec_load)(struct kimage *image) = NULL;
+void (*machine_kexec_unload)(struct kimage *image) = NULL;
+void (*machine_kexec_cleanup)(struct kimage *image) = NULL;
Index: kexec-2.6.16/include/asm-x86_64/kexec.h
===================================================================
--- kexec-2.6.16.orig/include/asm-x86_64/kexec.h
+++ kexec-2.6.16/include/asm-x86_64/kexec.h
@@ -91,6 +91,13 @@ relocate_kernel(unsigned long indirectio
unsigned long page_list,
unsigned long start_address) ATTRIB_NORET;
+/* Under Xen we need to work with machine addresses. These macros give the
+ * machine address of a certain page to the generic kexec code instead of
+ * the pseudo physical address which would be given by the default macros.
+ */
+
+#define KEXEC_ARCH_USES_HOOKS 1
+
#endif /* __ASSEMBLY__ */
#endif /* _X86_64_KEXEC_H */
[-- Attachment #5: xen-sparse-kexec-fixes.diff --]
[-- Type: text/x-patch, Size: 2642 bytes --]
---
drivers/xen/core/machine_kexec.c | 42 ++++++++++++++++++++++++++++++++++++---
1 file changed, 39 insertions(+), 3 deletions(-)
Index: kexec-2.6.16/drivers/xen/core/machine_kexec.c
===================================================================
--- kexec-2.6.16.orig/drivers/xen/core/machine_kexec.c
+++ kexec-2.6.16/drivers/xen/core/machine_kexec.c
@@ -11,6 +11,7 @@
extern void machine_kexec_setup_load_arg(xen_kexec_image_t *xki,
struct kimage *image);
+static void xen0_set_hooks(void);
int xen_max_nr_phys_cpus;
struct resource xen_hypervisor_res;
@@ -24,6 +25,7 @@ void xen_machine_kexec_setup_resources(v
if (!is_initial_xendomain())
return;
+ xen0_set_hooks();
/* determine maximum number of physical cpus */
@@ -124,7 +126,7 @@ static void setup_load_arg(xen_kexec_ima
* is currently called too early. It might make sense
* to move prepare, but for now, just add an extra hook.
*/
-int xen_machine_kexec_load(struct kimage *image)
+static int xen0_machine_kexec_load(struct kimage *image)
{
xen_kexec_load_t xkl;
@@ -140,7 +142,7 @@ int xen_machine_kexec_load(struct kimage
* is called too late, and its possible xen could try and kdump
* using resources that have been freed.
*/
-void xen_machine_kexec_unload(struct kimage *image)
+static void xen0_machine_kexec_unload(struct kimage *image)
{
xen_kexec_load_t xkl;
@@ -157,7 +159,7 @@ void xen_machine_kexec_unload(struct kim
* stop all CPUs and kexec. That is it combines machine_shutdown()
* and machine_kexec() in Linux kexec terms.
*/
-NORET_TYPE void xen_machine_kexec(struct kimage *image)
+static NORET_TYPE void xen0_machine_kexec(struct kimage *image)
{
xen_kexec_exec_t xke;
@@ -172,6 +174,40 @@ void machine_shutdown(void)
/* do nothing */
}
+static unsigned long xen0_page_to_pfn(struct page *page)
+{
+ return pfn_to_mfn(page_to_pfn(page));
+}
+
+static struct page* xen0_pfn_to_page(unsigned long pfn)
+{
+ return pfn_to_page(mfn_to_pfn(pfn));
+}
+
+static unsigned long xen0_virt_to_phys(void *addr)
+{
+ return virt_to_machine(addr);
+}
+
+static void* xen0_phys_to_virt(unsigned long addr)
+{
+ return phys_to_virt(machine_to_phys(addr));
+}
+
+
+static void xen0_set_hooks(void)
+{
+ kexec_page_to_pfn = xen0_page_to_pfn;
+ kexec_pfn_to_page = xen0_pfn_to_page;
+ kexec_virt_to_phys = xen0_virt_to_phys;
+ kexec_phys_to_virt = xen0_phys_to_virt;
+
+ machine_kexec_load = xen0_machine_kexec_load;
+ machine_kexec_unload = xen0_machine_kexec_unload;
+ machine_kexec = xen0_machine_kexec;
+
+ printk("%s: kexec hook setup done\n", __FUNCTION__);
+}
/*
* Local variables:
[-- Attachment #6: Type: text/plain, Size: 138 bytes --]
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: kexec trouble
2006-12-07 11:24 ` Gerd Hoffmann
@ 2006-12-08 4:15 ` Magnus Damm
2006-12-08 10:01 ` Gerd Hoffmann
0 siblings, 1 reply; 20+ messages in thread
From: Magnus Damm @ 2006-12-08 4:15 UTC (permalink / raw)
To: Gerd Hoffmann; +Cc: Magnus Damm, Xen devel list, Horms
Hi again Gerd,
On 12/7/06, Gerd Hoffmann <kraxel@suse.de> wrote:
> Hi,
>
> > Function pointers sound like the right way to go! Happy hacking!
>
> First step of a cleanup by moving to function pointers.
As a first step I think they look pretty good. I have a few random comments.
> Compile tested only.
Ok. I've browsed through the patches and done some basic compilation too.
> First three attachments replace the patches with identical names in
> patches/linux-2.6. The last should be applied to the sparse tree.
I think using a structure for all callbacks will result in cleaner
code. This is sort of a nitpick because it does not really matter
function wise, but it sounded earlier like you were aiming for
something that would be directly acceptable by the kexec and kdump
community. And I'm all for cleanliness.
Personally I would change the code in kernel/kexec.c to call
kexec_ops.machine_kexec() instead of calling machine_kexec() directly,
regardless of the use of KEXEC_ARCH_USES_HOOKS. Then I would have
a single global instance of the structure kexec_ops declared in
kernel/kexec.c, and by default kexec_ops.machine_kexec would point
to machine_kexec. That way you won't have to
rename the arch-specific functions and there is no need to declare the
hooks in the arch-specific files. Maybe you won't need
KEXEC_ARCH_USES_HOOKS at all.
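To make the structure idea a bit more concrete, here is a rough stand-alone
sketch. It is not taken from any of the patches: struct kimage is reduced to
a stand-in, struct demo_kexec_ops, demo_xen0_load() and demo_core_load() are
hypothetical names, and machine_kexec() itself is left out because it does
not return.

```c
/* Stand-in for the kernel's struct kimage; only here to make this compile. */
struct kimage {
	int loaded;
};

/* Stand-ins for the existing arch functions, which keep their names. */
int machine_kexec_prepare(struct kimage *image)
{
	(void)image;
	return 0;
}

void machine_kexec_cleanup(struct kimage *image)
{
	image->loaded = 0;
}

/* Hypothetical ops structure; optional hooks may stay NULL. */
struct demo_kexec_ops {
	int (*machine_kexec_prepare)(struct kimage *image);
	int (*machine_kexec_load)(struct kimage *image);
	void (*machine_kexec_cleanup)(struct kimage *image);
};

/* Single global instance, filled in with the native functions by default. */
struct demo_kexec_ops kexec_ops = {
	.machine_kexec_prepare = machine_kexec_prepare,
	.machine_kexec_cleanup = machine_kexec_cleanup,
	/* .machine_kexec_load is left NULL: native has no load hook */
};

/* Hypothetical hook that a Xen dom0 setup function would install. */
int demo_xen0_load(struct kimage *image)
{
	image->loaded = 1;
	return 0;
}

/* How the core code would call through the structure. */
int demo_core_load(struct kimage *image)
{
	int result;

	if (kexec_ops.machine_kexec_prepare) {
		result = kexec_ops.machine_kexec_prepare(image);
		if (result)
			return result;
	}
	if (kexec_ops.machine_kexec_load) {
		result = kexec_ops.machine_kexec_load(image);
		if (result)
			return result;
	}
	return 0;
}
```

The arch files stay untouched in this layout; only the environment that
wants different behaviour overwrites individual members at setup time.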
The load and unload code may be broken today if KEXEC_ARCH_USES_HOOKS
is unset - can you really check if machine_kexec_load is non-NULL if
it is inline?
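A small stand-alone sketch of what I mean, as far as I understand C here.
The stub mirrors the static inline from the header, struct kimage is a
stand-in, and the side effect in the stub plus demo_load() are made up only
to make the behaviour observable. A function name used in a condition
evaluates to its address, which is never NULL, so the check is always true.

```c
/* Stand-in for the kernel's struct kimage. */
struct kimage {
	int loaded;
};

/* Mirrors the non-hook case: an inline stub instead of a function pointer. */
static inline int machine_kexec_load(struct kimage *image)
{
	image->loaded = 1;	/* added only so the call can be observed */
	return 0;
}

int demo_load(struct kimage *image)
{
	/*
	 * The function name decays to its address here, so this branch is
	 * always taken; compilers typically warn that the address will
	 * always evaluate as true.
	 */
	if (machine_kexec_load)
		return machine_kexec_load(image);
	return -1;	/* never reached */
}
```

So the check should compile, but it cannot tell "no hook" from "hook
present" in that configuration; the stub is simply always called.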
The reason why I did put the page-macros in arch-specific header files
was because they need to be different on ia64. So your unification in
drivers/xen/core/machine_kexec.c may be ok for now (if our goal is x86
only), but in the future we need to figure out how to change them
nicely on ia64.
You probably remember that I was kind of negative to trying to solve
mainline merge issues at the same time as implementing this "switch".
This was because I remembered that paravirt allowed patching of inline
machine code. At least that's the impression I got from a presentation
given here in Tokyo by Rusty. I think the page macros ideally should
be patched in, but it's kind of hard trying to do that without
paravirt..
Finally, we should get rid of the #ifdef CONFIG_XEN left here and
there. My main concern is the code in crash.c which needs to be
replaced with runtime checks if we are aiming for a single binary for
both native and dom0. I left out domU because it doesn't do crash,
right?
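Roughly what I have in mind for crash.c, as a stand-alone sketch only.
demo_running_on_xen is a hypothetical runtime flag (real code would use
whatever check the tree provides), and the demo_* functions are counting
stubs standing in for nmi_shootdown_cpus(), lapic_shutdown() and
crash_save_self().

```c
#include <stdbool.h>

/* Hypothetical runtime flag replacing the compile-time CONFIG_XEN test. */
bool demo_running_on_xen = false;

/* Call counters so the effect of the runtime check can be observed. */
int demo_native_calls;
int demo_save_calls;

/* Stubs standing in for the native-only shutdown steps. */
void demo_nmi_shootdown_cpus(void) { demo_native_calls++; }
void demo_lapic_shutdown(void)     { demo_native_calls++; }

/* Stub standing in for crash_save_self(), needed in both environments. */
void demo_crash_save_self(void)    { demo_save_calls++; }

void demo_machine_crash_shutdown(void)
{
	/* Native-only work is skipped at runtime instead of compiled out. */
	if (!demo_running_on_xen) {
		demo_nmi_shootdown_cpus();
		demo_lapic_shutdown();
	}
	demo_crash_save_self();
}
```

The same binary then takes either path depending on where it was booted,
which is the property needed for native and dom0 to share one kernel.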
If you have an updated snapshot (or a reply saying I should use this
version) then I'll try out the code the first thing next week.
Thanks,
/ magnus
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: kexec trouble
2006-12-08 4:15 ` Magnus Damm
@ 2006-12-08 10:01 ` Gerd Hoffmann
2006-12-08 10:24 ` Ian Campbell
0 siblings, 1 reply; 20+ messages in thread
From: Gerd Hoffmann @ 2006-12-08 10:01 UTC (permalink / raw)
To: Magnus Damm; +Cc: Magnus Damm, Xen devel list, Horms
[-- Attachment #1: Type: text/plain, Size: 3545 bytes --]
Magnus Damm wrote:
> I think using a structure for all callbacks will result in cleaner
> code. This is sort of a nitpick because it does not really matter
> function wise, but it sounded earlier like you were aiming for
> something that would be directly acceptable by the kexec and kdump
> community. And I'm all for cleanliness.
>
> Personally I would change the code in kernel/kexec.c to call
> kexec_ops.machine_kexec() instead of calling machine_kexec() directly,
> regardless of the use of KEXEC_ARCH_USES_HOOKS. Then I would have
> a single global instance of the structure kexec_ops declared in
> kernel/kexec.c, and by default kexec_ops.machine_kexec would point
> to machine_kexec. That way you won't have to
> rename the arch-specific functions and there is no need to declare the
> hooks in the arch-specific files. Maybe you won't need
> KEXEC_ARCH_USES_HOOKS at all.
Yep, good idea, that works without the hooks define (and also without
touching all architectures which I want to avoid too).
> The load and unload code may be broken today if KEXEC_ARCH_USES_HOOKS
> is unset - can you really check if machine_kexec_load is non-NULL if
> it is inline?
Didn't check what gcc made out of it. It's a moot point now anyway with
the switch to a ops struct.
> The reason why I did put the page-macros in arch-specific header files
> was because they need to be different on ia64. So your unification in
> drivers/xen/core/machine_kexec.c may be ok for now (if our goal is x86
> only), but in the future we need to figure out how to change them
> nicely on ia64.
I simply wasn't aware of the ia64 issue. Well, maybe we should simply
create an arch/$arch/kernel/machine_kexec_xen0.c file and place that kind
of code there. And maybe move the arch-independent xen bits in
drivers/xen/core/machine_kexec.c to kernel/kexec_xen0, but I
think that discussion should be deferred until we are actually
seeking mainline merge. Maybe we get our own subdirectory below kernel/
for that kind of stuff. There is some more code which is arch-specific
on native but simply a hypercall on xen, smpboot.c for example.
> You probably remember that I was kind of negative to trying to solve
> mainline merge issues at the same time as implementing this "switch".
> This was because I remembered that paravirt allowed patching of inline
> machine code. At least that's the impression I got from a presentation
> given here in Tokyo by Rusty. I think the page macros ideally should
> be patched in, but it's kind of hard trying to do that without
> paravirt..
Yep, there is, to get a few percent performance improvement for hot path
code. IMHO the kexec bits are not hot path code. They aren't performance
critical, usually you load a kexec kernel only once, so I don't think it
is worth the trouble.
> Finally, we should get rid of the #ifdef CONFIG_XEN left here and
> there. My main concern is the code in crash.c which needs to be
> replaced with runtime checks if we are aiming for a single binary for
> both native and dom0. I left out domU because it doesn't do crash,
> right?
I expect *lots* of changes in that area (apic/smp) anyway when we
upgrade the xen linux tree to be based on 2.6.20-rc1 or newer. paravirt
infrastructure is in Linus' tree now. I'd wait until that is done then
look again.
> If you have an updated snapshot (or a reply saying I should use this
> version) then I'll try out the code the first thing next week.
Updated patches attached.
cheers,
Gerd
--
Gerd Hoffmann <kraxel@suse.de>
[-- Attachment #2: kexec-generic.patch --]
[-- Type: text/x-patch, Size: 8376 bytes --]
---
include/linux/kexec.h | 19 ++++++++++
kernel/kexec.c | 90 +++++++++++++++++++++++++++++++++++++++++---------
2 files changed, 92 insertions(+), 17 deletions(-)
Index: kexec-2.6.16/include/linux/kexec.h
===================================================================
--- kexec-2.6.16.orig/include/linux/kexec.h
+++ kexec-2.6.16/include/linux/kexec.h
@@ -85,12 +85,29 @@ struct kimage {
#define KEXEC_TYPE_CRASH 1
};
+/* kexec interface functions */
+struct kexec_machine_ops {
+ unsigned long (*kpage_to_pfn)(struct page *page);
+ struct page* (*kpfn_to_page)(unsigned long pfn);
+ unsigned long (*kvirt_to_phys)(void *addr);
+ void* (*kphys_to_virt)(unsigned long addr);
+ NORET_TYPE void (*kexec)(struct kimage *image) ATTRIB_NORET;
+ int (*kexec_prepare)(struct kimage *image);
+ int (*kexec_load)(struct kimage *image);
+ void (*kexec_unload)(struct kimage *image);
+ void (*kexec_cleanup)(struct kimage *image);
+};
+extern struct kexec_machine_ops kexec_ops;
-/* kexec interface functions */
extern NORET_TYPE void machine_kexec(struct kimage *image) ATTRIB_NORET;
extern int machine_kexec_prepare(struct kimage *image);
extern void machine_kexec_cleanup(struct kimage *image);
+
+#ifdef CONFIG_XEN
+extern void xen_machine_kexec_setup_resources(void);
+extern void xen_machine_kexec_register_resources(struct resource *res);
+#endif
extern asmlinkage long sys_kexec_load(unsigned long entry,
unsigned long nr_segments,
struct kexec_segment __user *segments,
Index: kexec-2.6.16/kernel/kexec.c
===================================================================
--- kexec-2.6.16.orig/kernel/kexec.c
+++ kexec-2.6.16/kernel/kexec.c
@@ -27,6 +27,36 @@
#include <asm/system.h>
#include <asm/semaphore.h>
+static unsigned long default_page_to_pfn(struct page *page)
+{
+ return page_to_pfn(page);
+}
+
+static struct page* default_pfn_to_page(unsigned long pfn)
+{
+ return pfn_to_page(pfn);
+}
+
+static unsigned long default_virt_to_phys(void *addr)
+{
+ return virt_to_phys(addr);
+}
+
+static void* default_phys_to_virt(unsigned long addr)
+{
+ return phys_to_virt(addr);
+}
+
+struct kexec_machine_ops kexec_ops = {
+ .kpage_to_pfn = default_page_to_pfn,
+ .kpfn_to_page = default_pfn_to_page,
+ .kvirt_to_phys = default_virt_to_phys,
+ .kphys_to_virt = default_phys_to_virt,
+ .kexec = machine_kexec,
+ .kexec_prepare = machine_kexec_prepare,
+ .kexec_cleanup = machine_kexec_cleanup,
+};
+
/* Per cpu memory for storing cpu states in case of system crash. */
note_buf_t* crash_notes;
@@ -403,7 +433,7 @@ static struct page *kimage_alloc_normal_
pages = kimage_alloc_pages(GFP_KERNEL, order);
if (!pages)
break;
- pfn = page_to_pfn(pages);
+ pfn = kexec_ops.kpage_to_pfn(pages);
epfn = pfn + count;
addr = pfn << PAGE_SHIFT;
eaddr = epfn << PAGE_SHIFT;
@@ -437,6 +467,7 @@ static struct page *kimage_alloc_normal_
return pages;
}
+#ifndef CONFIG_XEN
static struct page *kimage_alloc_crash_control_pages(struct kimage *image,
unsigned int order)
{
@@ -490,7 +521,7 @@ static struct page *kimage_alloc_crash_c
}
/* If I don't overlap any segments I have found my hole! */
if (i == image->nr_segments) {
- pages = pfn_to_page(hole_start >> PAGE_SHIFT);
+ pages = kexec_ops.kpfn_to_page(hole_start >> PAGE_SHIFT);
break;
}
}
@@ -517,6 +548,13 @@ struct page *kimage_alloc_control_pages(
return pages;
}
+#else /* !CONFIG_XEN */
+struct page *kimage_alloc_control_pages(struct kimage *image,
+ unsigned int order)
+{
+ return kimage_alloc_normal_control_pages(image, order);
+}
+#endif
static int kimage_add_entry(struct kimage *image, kimage_entry_t entry)
{
@@ -532,7 +570,7 @@ static int kimage_add_entry(struct kimag
return -ENOMEM;
ind_page = page_address(page);
- *image->entry = virt_to_phys(ind_page) | IND_INDIRECTION;
+ *image->entry = kexec_ops.kvirt_to_phys(ind_page) | IND_INDIRECTION;
image->entry = ind_page;
image->last_entry = ind_page +
((PAGE_SIZE/sizeof(kimage_entry_t)) - 1);
@@ -593,13 +631,13 @@ static int kimage_terminate(struct kimag
#define for_each_kimage_entry(image, ptr, entry) \
for (ptr = &image->head; (entry = *ptr) && !(entry & IND_DONE); \
ptr = (entry & IND_INDIRECTION)? \
- phys_to_virt((entry & PAGE_MASK)): ptr +1)
+ kexec_ops.kphys_to_virt((entry & PAGE_MASK)): ptr +1)
static void kimage_free_entry(kimage_entry_t entry)
{
struct page *page;
- page = pfn_to_page(entry >> PAGE_SHIFT);
+ page = kexec_ops.kpfn_to_page(entry >> PAGE_SHIFT);
kimage_free_pages(page);
}
@@ -611,6 +649,9 @@ static void kimage_free(struct kimage *i
if (!image)
return;
+ if (kexec_ops.kexec_unload)
+ kexec_ops.kexec_unload(image);
+
kimage_free_extra_pages(image);
for_each_kimage_entry(image, ptr, entry) {
if (entry & IND_INDIRECTION) {
@@ -630,7 +671,8 @@ static void kimage_free(struct kimage *i
kimage_free_entry(ind);
/* Handle any machine specific cleanup */
- machine_kexec_cleanup(image);
+ if (kexec_ops.kexec_cleanup)
+ kexec_ops.kexec_cleanup(image);
/* Free the kexec control pages... */
kimage_free_page_list(&image->control_pages);
@@ -686,7 +728,7 @@ static struct page *kimage_alloc_page(st
* have a match.
*/
list_for_each_entry(page, &image->dest_pages, lru) {
- addr = page_to_pfn(page) << PAGE_SHIFT;
+ addr = kexec_ops.kpage_to_pfn(page) << PAGE_SHIFT;
if (addr == destination) {
list_del(&page->lru);
return page;
@@ -701,12 +743,12 @@ static struct page *kimage_alloc_page(st
if (!page)
return NULL;
/* If the page cannot be used file it away */
- if (page_to_pfn(page) >
+ if (kexec_ops.kpage_to_pfn(page) >
(KEXEC_SOURCE_MEMORY_LIMIT >> PAGE_SHIFT)) {
list_add(&page->lru, &image->unuseable_pages);
continue;
}
- addr = page_to_pfn(page) << PAGE_SHIFT;
+ addr = kexec_ops.kpage_to_pfn(page) << PAGE_SHIFT;
/* If it is the destination page we want use it */
if (addr == destination)
@@ -729,7 +771,7 @@ static struct page *kimage_alloc_page(st
struct page *old_page;
old_addr = *old & PAGE_MASK;
- old_page = pfn_to_page(old_addr >> PAGE_SHIFT);
+ old_page = kexec_ops.kpfn_to_page(old_addr >> PAGE_SHIFT);
copy_highpage(page, old_page);
*old = addr | (*old & ~PAGE_MASK);
@@ -779,7 +821,7 @@ static int kimage_load_normal_segment(st
result = -ENOMEM;
goto out;
}
- result = kimage_add_page(image, page_to_pfn(page)
+ result = kimage_add_page(image, kexec_ops.kpage_to_pfn(page)
<< PAGE_SHIFT);
if (result < 0)
goto out;
@@ -811,6 +853,7 @@ out:
return result;
}
+#ifndef CONFIG_XEN
static int kimage_load_crash_segment(struct kimage *image,
struct kexec_segment *segment)
{
@@ -833,7 +876,7 @@ static int kimage_load_crash_segment(str
char *ptr;
size_t uchunk, mchunk;
- page = pfn_to_page(maddr >> PAGE_SHIFT);
+ page = kexec_ops.kpfn_to_page(maddr >> PAGE_SHIFT);
if (page == 0) {
result = -ENOMEM;
goto out;
@@ -881,6 +924,13 @@ static int kimage_load_segment(struct ki
return result;
}
+#else /* CONFIG_XEN */
+static int kimage_load_segment(struct kimage *image,
+ struct kexec_segment *segment)
+{
+ return kimage_load_normal_segment(image, segment);
+}
+#endif
/*
* Exec Kernel system call: for obvious reasons only root may call it.
@@ -978,9 +1028,11 @@ asmlinkage long sys_kexec_load(unsigned
if (result)
goto out;
- result = machine_kexec_prepare(image);
- if (result)
- goto out;
+ if (kexec_ops.kexec_prepare) {
+ result = kexec_ops.kexec_prepare(image);
+ if (result)
+ goto out;
+ }
for (i = 0; i < nr_segments; i++) {
result = kimage_load_segment(image, &image->segment[i]);
@@ -991,6 +1043,13 @@ asmlinkage long sys_kexec_load(unsigned
if (result)
goto out;
}
+
+ if (kexec_ops.kexec_load) {
+ result = kexec_ops.kexec_load(image);
+ if (result)
+ goto out;
+ }
+
/* Install the new kernel, and Uninstall the old */
image = xchg(dest_image, image);
@@ -1045,7 +1104,6 @@ void crash_kexec(struct pt_regs *regs)
struct kimage *image;
int locked;
-
/* Take the kexec_lock here to prevent sys_kexec_load
* running on one cpu from replacing the crash kernel
* we are using after a panic on a different cpu.
[-- Attachment #3: linux-2.6.19-rc1-kexec-xen-i386.patch --]
[-- Type: text/x-patch, Size: 2714 bytes --]
---
arch/i386/kernel/crash.c | 4 ++++
arch/i386/kernel/machine_kexec.c | 38 ++++++++++++++++++++++++++++++++++++++
2 files changed, 42 insertions(+)
Index: kexec-2.6.16/arch/i386/kernel/crash.c
===================================================================
--- kexec-2.6.16.orig/arch/i386/kernel/crash.c
+++ kexec-2.6.16/arch/i386/kernel/crash.c
@@ -90,6 +90,7 @@ static void crash_save_self(struct pt_re
crash_save_this_cpu(regs, cpu);
}
+#ifndef CONFIG_XEN
#ifdef CONFIG_SMP
static atomic_t waiting_for_crash_ipi;
@@ -158,6 +159,7 @@ static void nmi_shootdown_cpus(void)
/* There are no cpus to shootdown */
}
#endif
+#endif /* CONFIG_XEN */
void machine_crash_shutdown(struct pt_regs *regs)
{
@@ -174,10 +176,12 @@ void machine_crash_shutdown(struct pt_re
/* Make a note of crashing cpu. Will be used in NMI callback.*/
crashing_cpu = smp_processor_id();
+#ifndef CONFIG_XEN
nmi_shootdown_cpus();
lapic_shutdown();
#if defined(CONFIG_X86_IO_APIC)
disable_IO_APIC();
#endif
+#endif /* CONFIG_XEN */
crash_save_self(regs);
}
Index: kexec-2.6.16/arch/i386/kernel/machine_kexec.c
===================================================================
--- kexec-2.6.16.orig/arch/i386/kernel/machine_kexec.c
+++ kexec-2.6.16/arch/i386/kernel/machine_kexec.c
@@ -19,6 +19,10 @@
#include <asm/desc.h>
#include <asm/system.h>
+#ifdef CONFIG_XEN
+#include <xen/interface/kexec.h>
+#endif
+
#define PAGE_ALIGNED __attribute__ ((__aligned__(PAGE_SIZE)))
static u32 kexec_pgd[1024] PAGE_ALIGNED;
#ifdef CONFIG_X86_PAE
@@ -54,6 +58,40 @@ void machine_kexec_cleanup(struct kimage
{
}
+#ifdef CONFIG_XEN
+
+#define __ma(x) (pfn_to_mfn(__pa((x)) >> PAGE_SHIFT) << PAGE_SHIFT)
+
+#if PAGES_NR > KEXEC_XEN_NO_PAGES
+#error PAGES_NR is greater than KEXEC_XEN_NO_PAGES - Xen support will break
+#endif
+
+#if PA_CONTROL_PAGE != 0
+#error PA_CONTROL_PAGE is non zero - Xen support will break
+#endif
+
+void machine_kexec_setup_load_arg(xen_kexec_image_t *xki, struct kimage *image)
+{
+ void *control_page;
+
+ memset(xki->page_list, 0, sizeof(xki->page_list));
+
+ control_page = page_address(image->control_code_page);
+ memcpy(control_page, relocate_kernel, PAGE_SIZE);
+
+ xki->page_list[PA_CONTROL_PAGE] = __ma(control_page);
+ xki->page_list[PA_PGD] = __ma(kexec_pgd);
+#ifdef CONFIG_X86_PAE
+ xki->page_list[PA_PMD_0] = __ma(kexec_pmd0);
+ xki->page_list[PA_PMD_1] = __ma(kexec_pmd1);
+#endif
+ xki->page_list[PA_PTE_0] = __ma(kexec_pte0);
+ xki->page_list[PA_PTE_1] = __ma(kexec_pte1);
+
+}
+
+#endif /* CONFIG_XEN */
+
/*
* Do not allocate memory (or fail in any way) in machine_kexec().
* We are past the point of no return, committed to rebooting now.
[-- Attachment #4: linux-2.6.19-rc1-kexec-xen-x86_64.patch --]
[-- Type: text/x-patch, Size: 5915 bytes --]
---
arch/x86_64/kernel/crash.c | 6 +
arch/x86_64/kernel/machine_kexec.c | 116 +++++++++++++++++++++++++++++++++++--
2 files changed, 115 insertions(+), 7 deletions(-)
Index: kexec-2.6.16/arch/x86_64/kernel/crash.c
===================================================================
--- kexec-2.6.16.orig/arch/x86_64/kernel/crash.c
+++ kexec-2.6.16/arch/x86_64/kernel/crash.c
@@ -92,6 +92,7 @@ static void crash_save_self(struct pt_re
crash_save_this_cpu(regs, cpu);
}
+#ifndef CONFIG_XEN
#ifdef CONFIG_SMP
static atomic_t waiting_for_crash_ipi;
@@ -156,6 +157,7 @@ static void nmi_shootdown_cpus(void)
/* There are no cpus to shootdown */
}
#endif
+#endif /* CONFIG_XEN */
void machine_crash_shutdown(struct pt_regs *regs)
{
@@ -173,6 +175,8 @@ void machine_crash_shutdown(struct pt_re
/* Make a note of crashing cpu. Will be used in NMI callback.*/
crashing_cpu = smp_processor_id();
+
+#ifndef CONFIG_XEN
nmi_shootdown_cpus();
if(cpu_has_apic)
@@ -181,6 +185,6 @@ void machine_crash_shutdown(struct pt_re
#if defined(CONFIG_X86_IO_APIC)
disable_IO_APIC();
#endif
-
+#endif /* CONFIG_XEN */
crash_save_self(regs);
}
Index: kexec-2.6.16/arch/x86_64/kernel/machine_kexec.c
===================================================================
--- kexec-2.6.16.orig/arch/x86_64/kernel/machine_kexec.c
+++ kexec-2.6.16/arch/x86_64/kernel/machine_kexec.c
@@ -24,6 +24,104 @@ static u64 kexec_pud1[512] PAGE_ALIGNED;
static u64 kexec_pmd1[512] PAGE_ALIGNED;
static u64 kexec_pte1[512] PAGE_ALIGNED;
+#ifdef CONFIG_XEN
+
+/* In the case of Xen, override hypervisor functions to be able to create
+ * a regular identity mapping page table...
+ */
+
+#include <xen/interface/kexec.h>
+#include <xen/interface/memory.h>
+
+#define x__pmd(x) ((pmd_t) { (x) } )
+#define x__pud(x) ((pud_t) { (x) } )
+#define x__pgd(x) ((pgd_t) { (x) } )
+
+#define x_pmd_val(x) ((x).pmd)
+#define x_pud_val(x) ((x).pud)
+#define x_pgd_val(x) ((x).pgd)
+
+static inline void x_set_pmd(pmd_t *dst, pmd_t val)
+{
+ x_pmd_val(*dst) = x_pmd_val(val);
+}
+
+static inline void x_set_pud(pud_t *dst, pud_t val)
+{
+ x_pud_val(*dst) = phys_to_machine(x_pud_val(val));
+}
+
+static inline void x_pud_clear (pud_t *pud)
+{
+ x_pud_val(*pud) = 0;
+}
+
+static inline void x_set_pgd(pgd_t *dst, pgd_t val)
+{
+ x_pgd_val(*dst) = phys_to_machine(x_pgd_val(val));
+}
+
+static inline void x_pgd_clear (pgd_t * pgd)
+{
+ x_pgd_val(*pgd) = 0;
+}
+
+#define X__PAGE_KERNEL_LARGE_EXEC \
+ _PAGE_PRESENT | _PAGE_RW | _PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_PSE
+#define X_KERNPG_TABLE _PAGE_PRESENT | _PAGE_RW | _PAGE_ACCESSED | _PAGE_DIRTY
+
+#define __ma(x) (pfn_to_mfn(__pa((x)) >> PAGE_SHIFT) << PAGE_SHIFT)
+
+#if PAGES_NR > KEXEC_XEN_NO_PAGES
+#error PAGES_NR is greater than KEXEC_XEN_NO_PAGES - Xen support will break
+#endif
+
+#if PA_CONTROL_PAGE != 0
+#error PA_CONTROL_PAGE is non zero - Xen support will break
+#endif
+
+void machine_kexec_setup_load_arg(xen_kexec_image_t *xki, struct kimage *image)
+{
+ void *control_page;
+ void *table_page;
+
+ memset(xki->page_list, 0, sizeof(xki->page_list));
+
+ control_page = page_address(image->control_code_page) + PAGE_SIZE;
+ memcpy(control_page, relocate_kernel, PAGE_SIZE);
+
+ table_page = page_address(image->control_code_page);
+
+ xki->page_list[PA_CONTROL_PAGE] = __ma(control_page);
+ xki->page_list[PA_TABLE_PAGE] = __ma(table_page);
+
+ xki->page_list[PA_PGD] = __ma(kexec_pgd);
+ xki->page_list[PA_PUD_0] = __ma(kexec_pud0);
+ xki->page_list[PA_PUD_1] = __ma(kexec_pud1);
+ xki->page_list[PA_PMD_0] = __ma(kexec_pmd0);
+ xki->page_list[PA_PMD_1] = __ma(kexec_pmd1);
+ xki->page_list[PA_PTE_0] = __ma(kexec_pte0);
+ xki->page_list[PA_PTE_1] = __ma(kexec_pte1);
+}
+
+#else /* CONFIG_XEN */
+
+#define x__pmd(x) __pmd(x)
+#define x__pud(x) __pud(x)
+#define x__pgd(x) __pgd(x)
+
+#define x_set_pmd(x, y) set_pmd(x, y)
+#define x_set_pud(x, y) set_pud(x, y)
+#define x_set_pgd(x, y) set_pgd(x, y)
+
+#define x_pud_clear(x) pud_clear(x)
+#define x_pgd_clear(x) pgd_clear(x)
+
+#define X__PAGE_KERNEL_LARGE_EXEC __PAGE_KERNEL_LARGE_EXEC
+#define X_KERNPG_TABLE _KERNPG_TABLE
+
+#endif /* CONFIG_XEN */
+
static void init_level2_page(pmd_t *level2p, unsigned long addr)
{
unsigned long end_addr;
@@ -31,7 +129,7 @@ static void init_level2_page(pmd_t *leve
addr &= PAGE_MASK;
end_addr = addr + PUD_SIZE;
while (addr < end_addr) {
- set_pmd(level2p++, __pmd(addr | __PAGE_KERNEL_LARGE_EXEC));
+ x_set_pmd(level2p++, x__pmd(addr | X__PAGE_KERNEL_LARGE_EXEC));
addr += PMD_SIZE;
}
}
@@ -56,12 +154,12 @@ static int init_level3_page(struct kimag
}
level2p = (pmd_t *)page_address(page);
init_level2_page(level2p, addr);
- set_pud(level3p++, __pud(__pa(level2p) | _KERNPG_TABLE));
+ x_set_pud(level3p++, x__pud(__pa(level2p) | X_KERNPG_TABLE));
addr += PUD_SIZE;
}
/* clear the unused entries */
while (addr < end_addr) {
- pud_clear(level3p++);
+ x_pud_clear(level3p++);
addr += PUD_SIZE;
}
out:
@@ -92,12 +190,12 @@ static int init_level4_page(struct kimag
if (result) {
goto out;
}
- set_pgd(level4p++, __pgd(__pa(level3p) | _KERNPG_TABLE));
+ x_set_pgd(level4p++, x__pgd(__pa(level3p) | X_KERNPG_TABLE));
addr += PGDIR_SIZE;
}
/* clear the unused entries */
while (addr < end_addr) {
- pgd_clear(level4p++);
+ x_pgd_clear(level4p++);
addr += PGDIR_SIZE;
}
out:
@@ -108,8 +206,14 @@ out:
static int init_pgtable(struct kimage *image, unsigned long start_pgtable)
{
pgd_t *level4p;
+ unsigned long x_end_pfn = end_pfn;
+
+#ifdef CONFIG_XEN
+ x_end_pfn = HYPERVISOR_memory_op(XENMEM_maximum_ram_page, NULL);
+#endif
+
level4p = (pgd_t *)__va(start_pgtable);
- return init_level4_page(image, level4p, 0, end_pfn << PAGE_SHIFT);
+ return init_level4_page(image, level4p, 0, x_end_pfn << PAGE_SHIFT);
}
int machine_kexec_prepare(struct kimage *image)
[-- Attachment #5: xen-sparse-kexec-fixes.diff --]
[-- Type: text/x-patch, Size: 2684 bytes --]
---
drivers/xen/core/machine_kexec.c | 42 ++++++++++++++++++++++++++++++++++++---
1 file changed, 39 insertions(+), 3 deletions(-)
Index: kexec-2.6.16/drivers/xen/core/machine_kexec.c
===================================================================
--- kexec-2.6.16.orig/drivers/xen/core/machine_kexec.c
+++ kexec-2.6.16/drivers/xen/core/machine_kexec.c
@@ -11,6 +11,7 @@
extern void machine_kexec_setup_load_arg(xen_kexec_image_t *xki,
struct kimage *image);
+static void xen0_set_hooks(void);
int xen_max_nr_phys_cpus;
struct resource xen_hypervisor_res;
@@ -24,6 +25,7 @@ void xen_machine_kexec_setup_resources(v
if (!is_initial_xendomain())
return;
+ xen0_set_hooks();
/* determine maximum number of physical cpus */
@@ -124,7 +126,7 @@ static void setup_load_arg(xen_kexec_ima
* is currently called too early. It might make sense
* to move prepare, but for now, just add an extra hook.
*/
-int xen_machine_kexec_load(struct kimage *image)
+static int xen0_machine_kexec_load(struct kimage *image)
{
xen_kexec_load_t xkl;
@@ -140,7 +142,7 @@ int xen_machine_kexec_load(struct kimage
* is called too late, and its possible xen could try and kdump
* using resources that have been freed.
*/
-void xen_machine_kexec_unload(struct kimage *image)
+static void xen0_machine_kexec_unload(struct kimage *image)
{
xen_kexec_load_t xkl;
@@ -157,7 +159,7 @@ void xen_machine_kexec_unload(struct kim
* stop all CPUs and kexec. That is it combines machine_shutdown()
* and machine_kexec() in Linux kexec terms.
*/
-NORET_TYPE void xen_machine_kexec(struct kimage *image)
+static NORET_TYPE ATTRIB_NORET void xen0_machine_kexec(struct kimage *image)
{
xen_kexec_exec_t xke;
@@ -172,6 +174,40 @@ void machine_shutdown(void)
/* do nothing */
}
+static unsigned long xen0_page_to_pfn(struct page *page)
+{
+ return pfn_to_mfn(page_to_pfn(page));
+}
+
+static struct page* xen0_pfn_to_page(unsigned long pfn)
+{
+ return pfn_to_page(mfn_to_pfn(pfn));
+}
+
+static unsigned long xen0_virt_to_phys(void *addr)
+{
+ return virt_to_machine(addr);
+}
+
+static void* xen0_phys_to_virt(unsigned long addr)
+{
+ return phys_to_virt(machine_to_phys(addr));
+}
+
+
+static void xen0_set_hooks(void)
+{
+ kexec_ops.kpage_to_pfn = xen0_page_to_pfn;
+ kexec_ops.kpfn_to_page = xen0_pfn_to_page;
+ kexec_ops.kvirt_to_phys = xen0_virt_to_phys;
+ kexec_ops.kphys_to_virt = xen0_phys_to_virt;
+
+ kexec_ops.kexec = xen0_machine_kexec;
+ kexec_ops.kexec_load = xen0_machine_kexec_load;
+ kexec_ops.kexec_unload = xen0_machine_kexec_unload;
+
+ printk("%s: kexec hook setup done\n", __FUNCTION__);
+}
/*
* Local variables:
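
For readers skimming the diffs, the runtime-hook idea behind kexec_ops can be sketched in plain user-space C. This is only an illustration under stated assumptions, not the code from the patches: the struct kexec_ops_sketch, the tiny p2m table, and all function names below are hypothetical stand-ins. The real hooks operate on struct page and struct kimage, and the real translation goes through pfn_to_mfn()/mfn_to_pfn().

```c
#include <assert.h>
#include <stddef.h>

#define NR_FRAMES 4

/* Hypothetical pfn -> mfn table standing in for the real p2m mapping. */
static const unsigned long p2m[NR_FRAMES] = { 7, 5, 6, 4 };

/* Simplified stand-in for the hook table (not the real kexec_ops layout). */
struct kexec_ops_sketch {
	unsigned long (*to_kexec_frame)(unsigned long pfn);     /* cf. kpage_to_pfn */
	unsigned long (*from_kexec_frame)(unsigned long frame); /* cf. kpfn_to_page */
	int (*load)(void);              /* optional hook, may stay NULL */
};

/* Native case: kexec works directly on physical frame numbers. */
static unsigned long native_identity(unsigned long frame)
{
	return frame;
}

/* Defaults are the native helpers; optional hooks start out unset. */
static struct kexec_ops_sketch ops = {
	.to_kexec_frame   = native_identity,
	.from_kexec_frame = native_identity,
	.load             = NULL,
};

/* dom0 case: kexec must see machine frames, so translate via the table. */
static unsigned long dom0_pfn_to_mfn(unsigned long pfn)
{
	assert(pfn < NR_FRAMES);
	return p2m[pfn];
}

static unsigned long dom0_mfn_to_pfn(unsigned long mfn)
{
	unsigned long pfn;

	for (pfn = 0; pfn < NR_FRAMES; pfn++)
		if (p2m[pfn] == mfn)
			return pfn;
	return NR_FRAMES;	/* no such machine frame in the table */
}

static int dom0_load_calls;

static int dom0_load(void)
{
	dom0_load_calls++;	/* the real hook would issue a hypercall */
	return 0;
}

/* Chosen at runtime, e.g. only when running as the initial domain. */
static void dom0_set_hooks(void)
{
	ops.to_kexec_frame   = dom0_pfn_to_mfn;
	ops.from_kexec_frame = dom0_mfn_to_pfn;
	ops.load             = dom0_load;
}

/* Core code calls optional hooks only when they are set. */
static int run_load_hook(void)
{
	if (ops.load)
		return ops.load();
	return 0;
}
```

The point of the pattern is that the core code carries no #ifdef: the same binary uses the native defaults until an environment-specific setup routine overrides them, which is the role xen0_set_hooks() plays in the patch above.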
* Re: kexec trouble
2006-12-08 10:01 ` Gerd Hoffmann
@ 2006-12-08 10:24 ` Ian Campbell
2006-12-08 11:28 ` Gerd Hoffmann
0 siblings, 1 reply; 20+ messages in thread
From: Ian Campbell @ 2006-12-08 10:24 UTC (permalink / raw)
To: Gerd Hoffmann; +Cc: Magnus Damm, Xen devel list, Magnus Damm, Horms
Hi Gerd,
On Fri, 2006-12-08 at 11:01 +0100, Gerd Hoffmann wrote:
> Updated patches attached.
Unfortunately I'm just about to push a changeset which moves the contents
of these patches:
patches/linux-2.6.16.33/kexec-generic.patch
patches/linux-2.6.16.33/linux-2.6.19-rc1-kexec-xen-i386.patch
patches/linux-2.6.16.33/linux-2.6.19-rc1-kexec-xen-x86_64.patch
into the sparse tree where they belong. Sorry for moving the ground
under you.
Also, due to the freeze, we won't be able to take these changes until
after 3.0.4 is released.
Cheers,
Ian.
* Re: kexec trouble
2006-12-08 10:24 ` Ian Campbell
@ 2006-12-08 11:28 ` Gerd Hoffmann
2006-12-08 11:32 ` Keir Fraser
` (2 more replies)
0 siblings, 3 replies; 20+ messages in thread
From: Gerd Hoffmann @ 2006-12-08 11:28 UTC (permalink / raw)
To: Ian Campbell; +Cc: Magnus Damm, Xen devel list, Magnus Damm, Horms
Ian Campbell wrote:
> Hi Gerd,
>
> On Fri, 2006-12-08 at 11:01 +0100, Gerd Hoffmann wrote:
>> Updated patches attached.
>
> Unfortunately I'm just about to push a changeset which moves the contents
> of these patches:
> patches/linux-2.6.16.33/kexec-generic.patch
> patches/linux-2.6.16.33/linux-2.6.19-rc1-kexec-xen-i386.patch
> patches/linux-2.6.16.33/linux-2.6.19-rc1-kexec-xen-x86_64.patch
> into the sparse tree where they belong. Sorry for moving the ground
> under you.
Oh, that is fine. It makes things easier for me: I can then fold my
changes into a single patch for the sparse tree, which is likely to be
smaller and easier to review ;)
Your changes are not in the public tree yet though ....
cheers,
Gerd
--
Gerd Hoffmann <kraxel@suse.de>
* Re: kexec trouble
2006-12-08 11:28 ` Gerd Hoffmann
@ 2006-12-08 11:32 ` Keir Fraser
2006-12-08 11:52 ` Ian Campbell
2006-12-08 15:49 ` Ian Campbell
2 siblings, 0 replies; 20+ messages in thread
From: Keir Fraser @ 2006-12-08 11:32 UTC (permalink / raw)
To: Gerd Hoffmann, Ian Campbell
Cc: Magnus Damm, Xen devel list, Magnus Damm, Horms
On 8/12/06 11:28, "Gerd Hoffmann" <kraxel@suse.de> wrote:
> Oh, that is fine. It makes things easier for me: I can then fold my
> changes into a single patch for the sparse tree, which is likely to be
> smaller and easier to review ;)
>
> Your changes are not in the public tree yet though ....
The staging tree is stalled for some reason. I'm not sure whether there's a
systematic problem or just a few random problems in a row... We'll look into
it this afternoon so we can get stuff pushed to the public tree later today.
-- Keir
* Re: kexec trouble
2006-12-08 11:28 ` Gerd Hoffmann
2006-12-08 11:32 ` Keir Fraser
@ 2006-12-08 11:52 ` Ian Campbell
2006-12-08 15:49 ` Ian Campbell
2 siblings, 0 replies; 20+ messages in thread
From: Ian Campbell @ 2006-12-08 11:52 UTC (permalink / raw)
To: Gerd Hoffmann; +Cc: Magnus Damm, Xen devel list, Magnus Damm, Horms
On Fri, 2006-12-08 at 12:28 +0100, Gerd Hoffmann wrote:
> Ian Campbell wrote:
> > Hi Gerd,
> >
> > On Fri, 2006-12-08 at 11:01 +0100, Gerd Hoffmann wrote:
> >> Updated patches attached.
> >
> > Unfortunately I'm just about to push a changeset which moves the contents
> > of these patches:
> > patches/linux-2.6.16.33/kexec-generic.patch
> > patches/linux-2.6.16.33/linux-2.6.19-rc1-kexec-xen-i386.patch
> > patches/linux-2.6.16.33/linux-2.6.19-rc1-kexec-xen-x86_64.patch
> > into the sparse tree where they belong. Sorry for moving the ground
> > under you.
>
> Oh, that is fine. It makes things easier for me: I can then fold my
> changes into a single patch for the sparse tree, which is likely to be
> smaller and easier to review ;)
>
> Your changes are not in the public tree yet though ....
I was delayed a bit in pushing them; they are in now. Hopefully the
staging tree will be unwedged and they will flow through this afternoon.
Cheers,
Ian.
* Re: kexec trouble
2006-12-08 11:28 ` Gerd Hoffmann
2006-12-08 11:32 ` Keir Fraser
2006-12-08 11:52 ` Ian Campbell
@ 2006-12-08 15:49 ` Ian Campbell
2 siblings, 0 replies; 20+ messages in thread
From: Ian Campbell @ 2006-12-08 15:49 UTC (permalink / raw)
To: Gerd Hoffmann; +Cc: Magnus Damm, Xen devel list, Magnus Damm, Horms
On Fri, 2006-12-08 at 12:28 +0100, Gerd Hoffmann wrote:
> Your changes are not in the public tree yet though ....
Should be there now.
Cheers,
Ian.
Thread overview: 20+ messages
2006-12-05 14:37 kexec trouble Gerd Hoffmann
2006-12-05 15:53 ` Magnus Damm
2006-12-05 16:55 ` Gerd Hoffmann
2006-12-06 4:08 ` Magnus Damm
2006-12-06 8:48 ` Gerd Hoffmann
2006-12-06 9:41 ` Magnus Damm
2006-12-06 10:31 ` Gerd Hoffmann
2006-12-06 11:11 ` Magnus Damm
2006-12-06 13:23 ` Gerd Hoffmann
2006-12-06 13:40 ` Muli Ben-Yehuda
2006-12-07 11:24 ` Gerd Hoffmann
2006-12-08 4:15 ` Magnus Damm
2006-12-08 10:01 ` Gerd Hoffmann
2006-12-08 10:24 ` Ian Campbell
2006-12-08 11:28 ` Gerd Hoffmann
2006-12-08 11:32 ` Keir Fraser
2006-12-08 11:52 ` Ian Campbell
2006-12-08 15:49 ` Ian Campbell
2006-12-06 8:37 ` Keir Fraser
2006-12-06 9:08 ` Magnus Damm