* kexec trouble
@ 2006-12-05 14:37 Gerd Hoffmann
  2006-12-05 15:53 ` Magnus Damm
  0 siblings, 1 reply; 20+ messages in thread
From: Gerd Hoffmann @ 2006-12-05 14:37 UTC (permalink / raw)
  To: Magnus Damm; +Cc: Xen devel list

Hi,

Uh, it's a bit messy, with the changes sprinkled over the sparse tree
and the patches directory, which makes it a bit hard to fix up stuff.

IMHO the kexec code makes way too many decisions at compile time, not
runtime, especially the ones in the kexec code core. Having something
depend on CONFIG_XEN doesn't fly with the paravirt approach planned for
mainline merge (same kernel binary runs both native and paravirtualized).

I'm also in trouble now with guest kexec patches as they work with guest
phys addrs not machine phys addrs.

I think we need either wrapper functions for machine_kexec_* functions
which dispatch to the correct function depending on the environment
(dom0 vs domU, later also native) or just make them function pointers to
achieve the same effect. Same goes for the KEXEC_ARCH_HAS_PAGE_MACROS
stuff. IMHO "#ifdef CONFIG_XEN" should go away from the core code (i.e.
kernel/kexec.c).

cheers,
  Gerd

--
Gerd Hoffmann <kraxel@suse.de>

^ permalink raw reply [flat|nested] 20+ messages in thread
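For illustration only, the wrapper-function option mentioned above might look roughly like the sketch below. The `kexec_env` enum, the `current_env` variable and the `native_`/`xen0_`/`xenU_` function names are assumptions made up for this example, and `struct kimage` is a one-field stand-in, not the kernel's definition.

```c
#include <assert.h>

/* Hypothetical environment tag; a real kernel would ask Xen instead. */
enum kexec_env { ENV_NATIVE, ENV_XEN_DOM0, ENV_XEN_DOMU };

/* Stand-in for the kernel's struct kimage (assumption for the sketch). */
struct kimage {
	enum kexec_env prepared_by;
};

/* Set once at boot in this sketch; native is the default. */
static enum kexec_env current_env = ENV_NATIVE;

static int native_kexec_prepare(struct kimage *image)
{
	image->prepared_by = ENV_NATIVE;
	return 0;
}

static int xen0_kexec_prepare(struct kimage *image)
{
	image->prepared_by = ENV_XEN_DOM0;	/* would work on machine addrs */
	return 0;
}

static int xenU_kexec_prepare(struct kimage *image)
{
	image->prepared_by = ENV_XEN_DOMU;	/* would work on guest phys addrs */
	return 0;
}

/* Single entry point for the core code; the choice is made at runtime. */
int machine_kexec_prepare(struct kimage *image)
{
	assert(image);
	switch (current_env) {
	case ENV_XEN_DOM0:
		return xen0_kexec_prepare(image);
	case ENV_XEN_DOMU:
		return xenU_kexec_prepare(image);
	default:
		return native_kexec_prepare(image);
	}
}
```

The core code would call only machine_kexec_prepare(), so no "#ifdef CONFIG_XEN" is needed at the call site; the cost is one switch per wrapped function.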
* Re: kexec trouble
  2006-12-05 14:37 kexec trouble Gerd Hoffmann
@ 2006-12-05 15:53 ` Magnus Damm
  2006-12-05 16:55 ` Gerd Hoffmann
  2006-12-06  8:37 ` Keir Fraser
  0 siblings, 2 replies; 20+ messages in thread
From: Magnus Damm @ 2006-12-05 15:53 UTC (permalink / raw)
  To: Gerd Hoffmann; +Cc: Magnus Damm, Xen devel list

Hi Gerd,

On 12/5/06, Gerd Hoffmann <kraxel@suse.de> wrote:
> Hi,
>
> Uh, it's a bit messy, with the changes sprinkled over the sparse tree
> and the patches directory, which makes it a bit hard to fix up stuff.

Well, I'm sorry to hear that you think it is messy. I don't think that
we touch that many places in the sparse tree, but I agree that the
combination of patches and sparse may be a bit confusing. The
alternative to patches would have been to duplicate the files by
copying them into the sparse tree, which I wanted to avoid because I
think it makes future up-porting difficult.

> IMHO the kexec code makes way too many decisions at compile time, not
> runtime, especially the ones in the kexec code core. Having something
> depend on CONFIG_XEN doesn't fly with the paravirt approach planned for
> mainline merge (same kernel binary runs both native and paravirtualized).

Sure, but isn't the paravirt stuff just for domU first to begin with?
I'm pretty sure that making the code dynamically decide between dom0,
domU or native is quite simple to implement when it comes to kexec,
but I wanted to wait with that until most parts of dom0 were running
under paravirt.

> I'm also in trouble now with guest kexec patches as they work with guest
> phys addrs not machine phys addrs.

Sorry if that made your life difficult, but shouldn't it just be a
matter of using the native versions of the page macros for domU? They
are in include/linux/kexec.h if I'm not mistaken. In a patch, not in
sparse.

> I think we need either wrapper functions for machine_kexec_* functions
> which dispatch to the correct function depending on the environment
> (dom0 vs domU, later also native) or just make them function pointers to
> achieve the same effect. Same goes for the KEXEC_ARCH_HAS_PAGE_MACROS
> stuff. IMHO "#ifdef CONFIG_XEN" should go away from the core code (i.e.
> kernel/kexec.c).

You mean for the paravirt stuff? Isn't paravirt basically a set of
callbacks that you can register? If so, what is stopping us from
registering a set of paravirt callbacks for the kexec code?

/ magnus

^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: kexec trouble
  2006-12-05 15:53 ` Magnus Damm
@ 2006-12-05 16:55 ` Gerd Hoffmann
  2006-12-06  4:08 ` Magnus Damm
  2006-12-06  8:37 ` Keir Fraser
  1 sibling, 1 reply; 20+ messages in thread
From: Gerd Hoffmann @ 2006-12-05 16:55 UTC (permalink / raw)
  To: Magnus Damm; +Cc: Magnus Damm, Xen devel list

Hi,

>> IMHO the kexec code makes way too many decisions at compile time, not
>> runtime, especially the ones in the kexec code core. Having something
>> depend on CONFIG_XEN doesn't fly with the paravirt approach planned for
>> mainline merge (same kernel binary runs both native and paravirtualized).
>
> Sure, but isn't the paravirt stuff just for domU first to begin with?

domU only as first step, later dom0 too.

> I'm pretty sure that making the code dynamically decide between dom0,
> domU or native is quite simple to implement when it comes to kexec,
> but I wanted to wait with that until most parts of dom0 were running
> under paravirt.

I'd prefer to do that _now_.

>> I'm also in trouble now with guest kexec patches as they work with guest
>> phys addrs not machine phys addrs.
>
> Sorry if that made your life difficult, but shouldn't it just be a
> matter of using the native versions of the page macros for domU?

No. The same xen kernel can run as both dom0 and domU, thus that must
be decided at runtime.

>> I think we need either wrapper functions for machine_kexec_* functions
>> which dispatch to the correct function depending on the environment
>> (dom0 vs domU, later also native) or just make them function pointers to
>> achieve the same effect. Same goes for the KEXEC_ARCH_HAS_PAGE_MACROS
>> stuff. IMHO "#ifdef CONFIG_XEN" should go away from the core code (i.e.
>> kernel/kexec.c).
>
> You mean for the paravirt stuff?

And domU kexec. That works without any kexec core changes, and I
suspect the #ifdef CONFIG_XEN code will break it.

> Isn't paravirt basically a set of
> callbacks that you can register?

Yes.

> If so, what is stopping us from
> registering a set of paravirt callbacks for the kexec code?

Hmm, we'll end up with *two* sets of callbacks for xen, one for dom0 and
one for domU kexec. Not sure that fits the current paravirt design.

Given we may move to paravirt some day it's probably best to go with the
function pointers approach for now, that makes switching over to the
paravirt infrastructure (once it is mainline) easier. And I think it's
also less messy in the code.

cheers,
  Gerd

--
Gerd Hoffmann <kraxel@suse.de>

^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: kexec trouble
  2006-12-05 16:55 ` Gerd Hoffmann
@ 2006-12-06  4:08 ` Magnus Damm
  2006-12-06  8:48 ` Gerd Hoffmann
  0 siblings, 1 reply; 20+ messages in thread
From: Magnus Damm @ 2006-12-06 4:08 UTC (permalink / raw)
  To: Gerd Hoffmann; +Cc: Magnus Damm, Xen devel list, Horms

Hi again Gerd,

[CC Simon]

On 12/6/06, Gerd Hoffmann <kraxel@suse.de> wrote:
> >> I'm also in trouble now with guest kexec patches as they work with guest
> >> phys addrs not machine phys addrs.
> >
> > Sorry if that made your life difficult, but shouldn't it just be a
> > matter of using the native versions of the page macros for domU?
>
> No. The same xen kernel can run as both dom0 and domU, thus that must
> be decided at runtime.

Well, for us there was no need to decide that at runtime. Our scope was
only dom0. For you a runtime check makes sense, especially now when our
code is merged and you have a conflict. It does however sound like you
are pissed because of the conflict, but I don't think you should blame
that on us. Simon and I reposted the patches at least 10 times over the
last half a year - so you had your time to come with feedback.

That aside, what about doing as little as possible now? Use
is_initial_xendomain() or something like that to switch between the
different dom0 and domU implementations. And whenever domU and dom0
run under paravirt we fix up the code to remove the #ifdef and add
native mode support.

> >> I think we need either wrapper functions for machine_kexec_* functions
> >> which dispatch to the correct function depending on the environment
> >> (dom0 vs domU, later also native) or just make them function pointers to
> >> achieve the same effect. Same goes for the KEXEC_ARCH_HAS_PAGE_MACROS
> >> stuff. IMHO "#ifdef CONFIG_XEN" should go away from the core code (i.e.
> >> kernel/kexec.c).
> >
> > You mean for the paravirt stuff?
>
> And domU kexec. That works without any kexec core changes, and I
> suspect the #ifdef CONFIG_XEN code will break it.

Replacing the #ifdefs with a runtime check is fine by me. I think it's
nice to avoid #ifdefs if possible, but again - our scope of
implementation was simply to add dom0 support. We did not care about
domU support or paravirt, which wasn't included at that time.

> > If so, what is stopping us from
> > registering a set of paravirt callbacks for the kexec code?
>
> Hmm, we'll end up with *two* sets of callbacks for xen, one for dom0 and
> one for domU kexec. Not sure that fits the current paravirt design.

I'm pretty sure that these things will be easy to resolve when the time
is right.

> Given we may move to paravirt some day it's probably best to go with the
> function pointers approach for now, that makes switching over to the
> paravirt infrastructure (once it is mainline) easier. And I think it's
> also less messy in the code.

There is only a point in having function pointers when you have more
than one implementation. And now you are going from one implementation
to two, so adding function pointers makes sense. If we had added
function pointers in our patch it would have been pure bloat because
there was no one there except us to use them.

/ magnus

^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: kexec trouble
  2006-12-06  4:08 ` Magnus Damm
@ 2006-12-06  8:48 ` Gerd Hoffmann
  2006-12-06  9:41 ` Magnus Damm
  0 siblings, 1 reply; 20+ messages in thread
From: Gerd Hoffmann @ 2006-12-06 8:48 UTC (permalink / raw)
  To: Magnus Damm; +Cc: Magnus Damm, Xen devel list, Horms

Magnus Damm wrote:
> For you a runtime check makes sense, especially now when our code is
> merged and you have a conflict. It does however sound like you are
> pissed because of the conflict, but I don't think you should blame that
> on us.

Yes, a bit, especially as we've talked a bit about dom0/domU kexec at
the Xen Summit, so I assumed you are aware of the problem. The
sparse/patches split of the code also makes it hard to change it.

> Simon and I reposted the patches at least 10 times over the
> last half a year - so you had your time to come with feedback.

Yes, I should have checked before. -ENOTIME. Bad decision
nevertheless, now it probably costs even more time to fix it up
afterwards ....

> That aside, what about doing as little as possible now? Use
> is_initial_xendomain() or something like that to switch between the
> different dom0 and domU implementations. And whenever domU and dom0
> run under paravirt we fix up the code to remove the #ifdef and add
> native mode support.

I'd go for the function pointer approach. I think it is easier to
maintain in the long run. Wrapper functions which look at
is_initial_xendomain() then call either xen0_machine_kexec or
xenU_machine_kexec quickly get messy with lots of #ifdef CONFIG_FOOBAR,
and it would be a temporary solution only anyway.

I think you compile in native code too, although it is dead code, right?

So we can make machine_kexec() + friends function pointers, rename the
native functions and initialize the function pointers to the native
versions. I think it should even be possible to make them function
pointers for i386/x86_64 archs only. Things keep working with
CONFIG_XEN=n then, and with CONFIG_XEN=y the initialization function
just switches the function pointers (depending on is_initial_xendomain()).
This also eliminates the first set of #ifdefs in kernel/kexec.c ;)

> Replacing the #ifdefs with a runtime check is fine by me. I think
> it's nice to avoid #ifdefs if possible, but again - our scope of
> implementation was simply to add dom0 support. We did not care about
> domU support or paravirt, which wasn't included at that time.

Having "#ifdef CONFIG_XEN" in kernel/kexec.c will most likely never ever
be accepted mainline (and we do seek mainline merge, don't we?). IMHO
that is enough reason to avoid it in the first place.

cheers,
  Gerd

--
Gerd Hoffmann <kraxel@suse.de>

^ permalink raw reply [flat|nested] 20+ messages in thread
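The plan in the message above (pointers that default to the renamed native functions, and a Xen init step that swaps them) could be sketched as follows. This is only a rough model under stated assumptions: `xen_kexec_setup()`, its two boolean parameters (standing in for "running on Xen" and the dom0 check), the `handler` enum and the one-field `struct kimage` are all hypothetical names for the example, not code from the tree.

```c
#include <assert.h>
#include <stdbool.h>

/* Records which implementation ran; purely for the sketch. */
enum handler { HANDLED_NATIVE, HANDLED_XEN_DOM0, HANDLED_XEN_DOMU };

/* Stand-in for the kernel's struct kimage (assumption). */
struct kimage {
	enum handler handled_by;
};

/* The former machine_kexec_prepare(), renamed. */
static int native_machine_kexec_prepare(struct kimage *image)
{
	image->handled_by = HANDLED_NATIVE;
	return 0;
}

static int xen0_machine_kexec_prepare(struct kimage *image)
{
	image->handled_by = HANDLED_XEN_DOM0;
	return 0;
}

static int xenU_machine_kexec_prepare(struct kimage *image)
{
	image->handled_by = HANDLED_XEN_DOMU;
	return 0;
}

/*
 * The core code calls through this pointer.  It starts out native,
 * so a build without any Xen setup code behaves as before.
 */
int (*machine_kexec_prepare)(struct kimage *image) =
	native_machine_kexec_prepare;

/* Hypothetical init hook: swap the pointer once, early at boot. */
void xen_kexec_setup(bool running_on_xen, bool initial_domain)
{
	if (!running_on_xen)
		return;		/* keep the native default */
	machine_kexec_prepare = initial_domain ?
		xen0_machine_kexec_prepare : xenU_machine_kexec_prepare;
	assert(machine_kexec_prepare != 0);
}
```

Compared with a wrapper that checks the environment on every call, the decision here is taken once, and adding a further environment only means assigning a different function in the setup step.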
* Re: kexec trouble
  2006-12-06  8:48 ` Gerd Hoffmann
@ 2006-12-06  9:41 ` Magnus Damm
  2006-12-06 10:31 ` Gerd Hoffmann
  0 siblings, 1 reply; 20+ messages in thread
From: Magnus Damm @ 2006-12-06 9:41 UTC (permalink / raw)
  To: Gerd Hoffmann; +Cc: Magnus Damm, Xen devel list, Horms

On 12/6/06, Gerd Hoffmann <kraxel@suse.de> wrote:
> Magnus Damm wrote:
> > For you a runtime check makes sense, especially now when our code is
> > merged and you have a conflict. It does however sound like you are
> > pissed because of the conflict, but I don't think you should blame that
> > on us.
>
> Yes, a bit, especially as we've talked a bit about dom0/domU kexec at
> the Xen Summit, so I assumed you are aware of the problem. The
> sparse/patches split of the code also makes it hard to change it.

We chit-chatted a bit, but I don't remember us talking about any
implementation details.

I've heard complaints and doubts about using sparse together with
patches, but when I ask for a better alternative it's always awfully
silent. We could have copied the files into sparse and applied our
patches, but duplicating files seemed a step in the wrong direction.

It's funny because the reason behind using patches is to simplify
up-porting, but now instead of simplifying it seems to confuse people.
Maybe we should have copied the files to sparse instead, would that
have been better?

> > Simon and I reposted the patches at least 10 times over the
> > last half a year - so you had your time to come with feedback.
>
> Yes, I should have checked before. -ENOTIME. Bad decision
> nevertheless, now it probably costs even more time to fix it up
> afterwards ....

I don't mind changing pieces of the code now. It would probably have
been easier to do the right thing earlier, but the number of changes
needed is probably pretty low. If there is anything I can help out
with just let me know!

> > That aside, what about doing as little as possible now? Use
> > is_initial_xendomain() or something like that to switch between the
> > different dom0 and domU implementations. And whenever domU and dom0
> > run under paravirt we fix up the code to remove the #ifdef and add
> > native mode support.
>
> I'd go for the function pointer approach. I think it is easier to
> maintain in the long run. Wrapper functions which look at
> is_initial_xendomain() then call either xen0_machine_kexec or
> xenU_machine_kexec quickly get messy with lots of #ifdef CONFIG_FOOBAR,
> and it would be a temporary solution only anyway.

Yes, the function pointer solution is a lot nicer.

> I think you compile in native code too, although it is dead code, right?

The only dead code function that I know of would be machine_kexec(),
and that one will be needed if we want to support native mode.

> So we can make machine_kexec() + friends function pointers, rename the
> native functions and initialize the function pointers to the native
> versions. I think it should even be possible to make them function
> pointers for i386/x86_64 archs only. Things keep working with
> CONFIG_XEN=n then, and with CONFIG_XEN=y the initialization function
> just switches the function pointers (depending on is_initial_xendomain()).
> This also eliminates the first set of #ifdefs in kernel/kexec.c ;)

Sounds exactly what I would have done! =)

> > Replacing the #ifdefs with a runtime check is fine by me. I think
> > it's nice to avoid #ifdefs if possible, but again - our scope of
> > implementation was simply to add dom0 support. We did not care about
> > domU support or paravirt, which wasn't included at that time.
>
> Having "#ifdef CONFIG_XEN" in kernel/kexec.c will most likely never ever
> be accepted mainline (and we do seek mainline merge, don't we?). IMHO
> that is enough reason to avoid it in the first place.

Yes and no. =)

You seem to code with the goal of having something that will be
directly acceptable for mainline, but my goal is to write as simple
code as possible which should be easy to adjust to whatever framework
that exists at the time of mainline merge.

Let me know what I can do to help out.

Thanks,

/ magnus

^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: kexec trouble
  2006-12-06  9:41 ` Magnus Damm
@ 2006-12-06 10:31 ` Gerd Hoffmann
  2006-12-06 11:11 ` Magnus Damm
  0 siblings, 1 reply; 20+ messages in thread
From: Gerd Hoffmann @ 2006-12-06 10:31 UTC (permalink / raw)
  To: Magnus Damm; +Cc: Magnus Damm, Xen devel list, Horms

Hi,

> We chit-chatted a bit, but I don't remember us talking about any
> implementation details.

Discussed briefly possible code sharing, that there likely isn't much to
share because we have two very different approaches to take, and that we
are probably best off just having two machine_kexec() versions for
dom0/domU. No details yet how to actually implement that, but at least
the need for some kind of runtime switching should have been clear.

> I've heard complaints and doubts about using sparse together with
> patches, but when I ask for a better alternative it's always awfully
> silent. We could have copied the files into sparse and applied our
> patches, but duplicating files seemed a step in the wrong direction.

For backports and code planned for quick mainline merge maintaining as
patches is fine, makes it easier to move forward once stuff is merged
and/or the xen linux tree is updated to a newer upstream kernel.

For code which likely lives longer in the xen tree (especially
kexec-generic.patch which has almost no chance to be accepted mainline
as-is) it is a pain to deal with as patch.

I'd love to see kernel/kexec.c not being touched at all, but I think
that is impossible for dom0 kexec (due to range checks which must happen
in machine not guest address space for example).

>> So we can make machine_kexec() + friends function pointers, rename the
>> native functions and initialize the function pointers to the native
>> versions. I think it should even be possible to make them function
>> pointers for i386/x86_64 archs only. Things keep working with
>> CONFIG_XEN=n then, and with CONFIG_XEN=y the initialization function
>> just switches the function pointers (depending on is_initial_xendomain()).
>> This also eliminates the first set of #ifdefs in kernel/kexec.c ;)
>
> Sounds exactly what I would have done! =)

Great, so let's do that.

> You seem to code with the goal of having something that will be
> directly acceptable for mainline, but my goal is to write as simple
> code as possible which should be easy to adjust to whatever framework
> that exists at the time of mainline merge.

Given that the framework will be paravirt_ops function pointers fit
nicely ;)

cheers,
  Gerd

--
Gerd Hoffmann <kraxel@suse.de>

^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: kexec trouble
  2006-12-06 10:31 ` Gerd Hoffmann
@ 2006-12-06 11:11 ` Magnus Damm
  2006-12-06 13:23 ` Gerd Hoffmann
  2006-12-07 11:24 ` Gerd Hoffmann
  0 siblings, 2 replies; 20+ messages in thread
From: Magnus Damm @ 2006-12-06 11:11 UTC (permalink / raw)
  To: Gerd Hoffmann; +Cc: Magnus Damm, Xen devel list, Horms

On 12/6/06, Gerd Hoffmann <kraxel@suse.de> wrote:
> Hi,
>
> > We chit-chatted a bit, but I don't remember us talking about any
> > implementation details.
>
> Discussed briefly possible code sharing, that there likely isn't much to
> share because we have two very different approaches to take, and that we
> are probably best off just having two machine_kexec() versions for
> dom0/domU. No details yet how to actually implement that, but at least
> the need for some kind of runtime switching should have been clear.

We needed to work together to implement runtime switching anyhow, and
that's what is happening now. But maybe I should have considered the
runtime switching earlier...

> > I've heard complaints and doubts about using sparse together with
> > patches, but when I ask for a better alternative it's always awfully
> > silent. We could have copied the files into sparse and applied our
> > patches, but duplicating files seemed a step in the wrong direction.
>
> For backports and code planned for quick mainline merge maintaining as
> patches is fine, makes it easier to move forward once stuff is merged
> and/or the xen linux tree is updated to a newer upstream kernel.

Ack.

> For code which likely lives longer in the xen tree (especially
> kexec-generic.patch which has almost no chance to be accepted mainline
> as-is) it is a pain to deal with as patch.

Yeah, I can agree with that. Feel free to add the files to sparse and
throw out the patch. The dependency on patches and other stuff may
make it difficult though.

> I'd love to see kernel/kexec.c not being touched at all, but I think
> that is impossible for dom0 kexec (due to range checks which must happen
> in machine not guest address space for example).

We hoped to not touch the generic code at all too, but we had to
because of machine addresses.

> >> So we can make machine_kexec() + friends function pointers, rename the
> >> native functions and initialize the function pointers to the native
> >> versions. I think it should even be possible to make them function
> >> pointers for i386/x86_64 archs only. Things keep working with
> >> CONFIG_XEN=n then, and with CONFIG_XEN=y the initialization function
> >> just switches the function pointers (depending on is_initial_xendomain()).
> >> This also eliminates the first set of #ifdefs in kernel/kexec.c ;)
> >
> > Sounds exactly what I would have done! =)
>
> Great, so let's do that.

Excellent! Let me know how and where you want my help.

> > You seem to code with the goal of having something that will be
> > directly acceptable for mainline, but my goal is to write as simple
> > code as possible which should be easy to adjust to whatever framework
> > that exists at the time of mainline merge.
>
> Given that the framework will be paravirt_ops function pointers fit
> nicely ;)

Function pointers sound like the right way to go! Happy hacking!

Thanks,

/ magnus

^ permalink raw reply [flat|nested] 20+ messages in thread
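To make the "range checks in machine address space" point concrete, here is a toy sketch of why the generic code cannot simply reuse the guest view for dom0. Everything in it is an assumption for illustration: the `p2m` table values are made up, and `kexec_pfn_translate`, `identity_pfn`, `dom0_pfn_to_mfn` and `page_below_limit` are hypothetical names, not functions from the kernel or from Xen.

```c
#include <assert.h>

#define PAGE_SHIFT 12

/* Toy pseudo-physical -> machine frame table (made-up values). */
static const unsigned long p2m[] = { 7, 3, 9, 1 };
#define NR_PFNS (sizeof(p2m) / sizeof(p2m[0]))

/* Native and domU view: the frame number is used as-is. */
static unsigned long identity_pfn(unsigned long pfn)
{
	return pfn;
}

/* dom0 view: the check must be done on the machine frame number. */
static unsigned long dom0_pfn_to_mfn(unsigned long pfn)
{
	assert(pfn < NR_PFNS);
	return p2m[pfn];
}

/* Hook consulted by the generic code; defaults to the native view. */
unsigned long (*kexec_pfn_translate)(unsigned long pfn) = identity_pfn;

/* Returns 1 if the page's address, in the active view, is <= limit_addr. */
int page_below_limit(unsigned long pfn, unsigned long limit_addr)
{
	unsigned long addr = kexec_pfn_translate(pfn) << PAGE_SHIFT;

	return addr <= limit_addr;
}
```

With the made-up table, frame 0 passes a 0x4000 limit in the native view but fails it once the dom0 translation is installed, which is the kind of difference a compile-time choice cannot express when one kernel binary has to serve several environments.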
* Re: kexec trouble
  2006-12-06 11:11 ` Magnus Damm
@ 2006-12-06 13:23 ` Gerd Hoffmann
  2006-12-06 13:40 ` Muli Ben-Yehuda
  2006-12-07 11:24 ` Gerd Hoffmann
  1 sibling, 1 reply; 20+ messages in thread
From: Gerd Hoffmann @ 2006-12-06 13:23 UTC (permalink / raw)
  To: Magnus Damm; +Cc: Magnus Damm, Xen devel list, Horms

Hi,

>> For code which likely lives longer in the xen tree (especially
>> kexec-generic.patch which has almost no chance to be accepted mainline
>> as-is) it is a pain to deal with as patch.
>
> Yeah, I can agree with that. Feel free to add the files to sparse and
> throw out the patch. The dependency on patches and other stuff may
> make it difficult though.

*Aaaaargh*, it's even messier than I thought. We have linux kernel
source files which are modified by patches *AND* are in the sparse
tree. And the two versions don't match of course. Looks like that is
an older issue though, so I can't blame kexec for that one ;)

These patches can't be removed cleanly after running mkbuildtree:

  x86-put-note-sections-into-a-pt_note-segment-in-vmlinux.patch
  smp-alts.patch
  net-gso-2-checksum-fix.patch
  net-gso-0-base.patch

We *must* find a more sane way to maintain the linux kernel sources,
this is one more reason why mixing sparse tree and patches isn't going
to fly. As far as I know at least the sparse tree is planned to be
dropped, now with dom0 and xen being decoupled (3.0.3+) it should be
possible without too much hassle. Any plans what to use instead? quilt
patch queue?

cheers,
  Gerd

--
Gerd Hoffmann <kraxel@suse.de>

^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: kexec trouble
  2006-12-06 13:23 ` Gerd Hoffmann
@ 2006-12-06 13:40 ` Muli Ben-Yehuda
  0 siblings, 0 replies; 20+ messages in thread
From: Muli Ben-Yehuda @ 2006-12-06 13:40 UTC (permalink / raw)
  To: Gerd Hoffmann; +Cc: Magnus Damm, Xen devel list, Magnus Damm, Horms

On Wed, Dec 06, 2006 at 02:23:36PM +0100, Gerd Hoffmann wrote:

> We *must* find a more sane way to maintain the linux kernel sources,
> this is one more reason why mixing sparse tree and patches isn't going
> to fly. As far I know at least the sparse tree is planned to be
> dropped, now with dom0 and xen being decoupled (3.0.3+) it should be
> possible without too much hassle. Any plans what to use instead? quilt
> patch queue?

A full hg or git tree would be nicer... the way things are going,
patches applied to such a tree wouldn't be appropriate for upstream
inclusion without cleaning up anyway, so I don't see what the patch
queue method will buy us as opposed to a full tree. Of course nearly
anything would be better than the sparse + patches method.

Cheers,
Muli

^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: kexec trouble 2006-12-06 11:11 ` Magnus Damm 2006-12-06 13:23 ` Gerd Hoffmann @ 2006-12-07 11:24 ` Gerd Hoffmann 2006-12-08 4:15 ` Magnus Damm 1 sibling, 1 reply; 20+ messages in thread From: Gerd Hoffmann @ 2006-12-07 11:24 UTC (permalink / raw) To: Magnus Damm; +Cc: Magnus Damm, Xen devel list, Horms [-- Attachment #1: Type: text/plain, Size: 340 bytes --] Hi, > Function pointers sound like the right way to go! Happy hacking! First step of a cleanup by moving to function pointers. Compile tested only. First three attachments replace the patches with identical names in patches/linux-2.6. The last should be applied to the sparse tree. cheers, Gerd -- Gerd Hoffmann <kraxel@suse.de> [-- Attachment #2: kexec-generic.patch --] [-- Type: text/x-patch, Size: 8430 bytes --] --- include/linux/kexec.h | 22 +++++++++++- kernel/kexec.c | 85 ++++++++++++++++++++++++++++++++++++++++---------- 2 files changed, 89 insertions(+), 18 deletions(-) Index: kexec-2.6.16/include/linux/kexec.h =================================================================== --- kexec-2.6.16.orig/include/linux/kexec.h +++ kexec-2.6.16/include/linux/kexec.h @@ -85,12 +85,30 @@ struct kimage { #define KEXEC_TYPE_CRASH 1 }; - - /* kexec interface functions */ +extern unsigned long (*kexec_page_to_pfn)(struct page *page); +extern struct page* (*kexec_pfn_to_page)(unsigned long pfn); +extern unsigned long (*kexec_virt_to_phys)(void *addr); +extern void* (*kexec_phys_to_virt)(unsigned long addr); + +#ifdef KEXEC_ARCH_USES_HOOKS +extern NORET_TYPE void (*machine_kexec)(struct kimage *image) ATTRIB_NORET; +extern int (*machine_kexec_prepare)(struct kimage *image); +extern int (*machine_kexec_load)(struct kimage *image); +extern void (*machine_kexec_unload)(struct kimage *image); +extern void (*machine_kexec_cleanup)(struct kimage *image); +#else extern NORET_TYPE void machine_kexec(struct kimage *image) ATTRIB_NORET; extern int machine_kexec_prepare(struct kimage *image); +static inline int 
machine_kexec_load(struct kimage *image) { return 0; } +static inline void machine_kexec_unload(struct kimage *image) { } extern void machine_kexec_cleanup(struct kimage *image); +#endif + +#ifdef CONFIG_XEN +extern void xen_machine_kexec_setup_resources(void); +extern void xen_machine_kexec_register_resources(struct resource *res); +#endif extern asmlinkage long sys_kexec_load(unsigned long entry, unsigned long nr_segments, struct kexec_segment __user *segments, Index: kexec-2.6.16/kernel/kexec.c =================================================================== --- kexec-2.6.16.orig/kernel/kexec.c +++ kexec-2.6.16/kernel/kexec.c @@ -27,6 +27,31 @@ #include <asm/system.h> #include <asm/semaphore.h> +static unsigned long default_page_to_pfn(struct page *page) +{ + return page_to_pfn(page); +} + +static struct page* default_pfn_to_page(unsigned long pfn) +{ + return pfn_to_page(pfn); +} + +static unsigned long default_virt_to_phys(void *addr) +{ + return virt_to_phys(addr); +} + +static void* default_phys_to_virt(unsigned long addr) +{ + return phys_to_virt(addr); +} + +unsigned long (*kexec_page_to_pfn)(struct page *page) = default_page_to_pfn; +struct page* (*kexec_pfn_to_page)(unsigned long pfn) = default_pfn_to_page; +unsigned long (*kexec_virt_to_phys)(void *addr) = default_virt_to_phys; +void* (*kexec_phys_to_virt)(unsigned long addr) = default_phys_to_virt; + /* Per cpu memory for storing cpu states in case of system crash. 
*/ note_buf_t* crash_notes; @@ -403,7 +428,7 @@ static struct page *kimage_alloc_normal_ pages = kimage_alloc_pages(GFP_KERNEL, order); if (!pages) break; - pfn = page_to_pfn(pages); + pfn = kexec_page_to_pfn(pages); epfn = pfn + count; addr = pfn << PAGE_SHIFT; eaddr = epfn << PAGE_SHIFT; @@ -437,6 +462,7 @@ static struct page *kimage_alloc_normal_ return pages; } +#ifndef CONFIG_XEN static struct page *kimage_alloc_crash_control_pages(struct kimage *image, unsigned int order) { @@ -490,7 +516,7 @@ static struct page *kimage_alloc_crash_c } /* If I don't overlap any segments I have found my hole! */ if (i == image->nr_segments) { - pages = pfn_to_page(hole_start >> PAGE_SHIFT); + pages = kexec_pfn_to_page(hole_start >> PAGE_SHIFT); break; } } @@ -517,6 +543,13 @@ struct page *kimage_alloc_control_pages( return pages; } +#else /* !CONFIG_XEN */ +struct page *kimage_alloc_control_pages(struct kimage *image, + unsigned int order) +{ + return kimage_alloc_normal_control_pages(image, order); +} +#endif static int kimage_add_entry(struct kimage *image, kimage_entry_t entry) { @@ -532,7 +565,7 @@ static int kimage_add_entry(struct kimag return -ENOMEM; ind_page = page_address(page); - *image->entry = virt_to_phys(ind_page) | IND_INDIRECTION; + *image->entry = kexec_virt_to_phys(ind_page) | IND_INDIRECTION; image->entry = ind_page; image->last_entry = ind_page + ((PAGE_SIZE/sizeof(kimage_entry_t)) - 1); @@ -593,13 +626,13 @@ static int kimage_terminate(struct kimag #define for_each_kimage_entry(image, ptr, entry) \ for (ptr = &image->head; (entry = *ptr) && !(entry & IND_DONE); \ ptr = (entry & IND_INDIRECTION)? 
\ - phys_to_virt((entry & PAGE_MASK)): ptr +1) + kexec_phys_to_virt((entry & PAGE_MASK)): ptr +1) static void kimage_free_entry(kimage_entry_t entry) { struct page *page; - page = pfn_to_page(entry >> PAGE_SHIFT); + page = kexec_pfn_to_page(entry >> PAGE_SHIFT); kimage_free_pages(page); } @@ -611,6 +644,9 @@ static void kimage_free(struct kimage *i if (!image) return; + if (machine_kexec_unload) + machine_kexec_unload(image); + kimage_free_extra_pages(image); for_each_kimage_entry(image, ptr, entry) { if (entry & IND_INDIRECTION) { @@ -630,7 +666,8 @@ static void kimage_free(struct kimage *i kimage_free_entry(ind); /* Handle any machine specific cleanup */ - machine_kexec_cleanup(image); + if (machine_kexec_cleanup) + machine_kexec_cleanup(image); /* Free the kexec control pages... */ kimage_free_page_list(&image->control_pages); @@ -686,7 +723,7 @@ static struct page *kimage_alloc_page(st * have a match. */ list_for_each_entry(page, &image->dest_pages, lru) { - addr = page_to_pfn(page) << PAGE_SHIFT; + addr = kexec_page_to_pfn(page) << PAGE_SHIFT; if (addr == destination) { list_del(&page->lru); return page; @@ -701,12 +738,12 @@ static struct page *kimage_alloc_page(st if (!page) return NULL; /* If the page cannot be used file it away */ - if (page_to_pfn(page) > + if (kexec_page_to_pfn(page) > (KEXEC_SOURCE_MEMORY_LIMIT >> PAGE_SHIFT)) { list_add(&page->lru, &image->unuseable_pages); continue; } - addr = page_to_pfn(page) << PAGE_SHIFT; + addr = kexec_page_to_pfn(page) << PAGE_SHIFT; /* If it is the destination page we want use it */ if (addr == destination) @@ -729,7 +766,7 @@ static struct page *kimage_alloc_page(st struct page *old_page; old_addr = *old & PAGE_MASK; - old_page = pfn_to_page(old_addr >> PAGE_SHIFT); + old_page = kexec_pfn_to_page(old_addr >> PAGE_SHIFT); copy_highpage(page, old_page); *old = addr | (*old & ~PAGE_MASK); @@ -779,7 +816,7 @@ static int kimage_load_normal_segment(st result = -ENOMEM; goto out; } - result = kimage_add_page(image, 
page_to_pfn(page) + result = kimage_add_page(image, kexec_page_to_pfn(page) << PAGE_SHIFT); if (result < 0) goto out; @@ -811,6 +848,7 @@ out: return result; } +#ifndef CONFIG_XEN static int kimage_load_crash_segment(struct kimage *image, struct kexec_segment *segment) { @@ -833,7 +871,7 @@ static int kimage_load_crash_segment(str char *ptr; size_t uchunk, mchunk; - page = pfn_to_page(maddr >> PAGE_SHIFT); + page = kexec_pfn_to_page(maddr >> PAGE_SHIFT); if (page == 0) { result = -ENOMEM; goto out; @@ -881,6 +919,13 @@ static int kimage_load_segment(struct ki return result; } +#else /* CONFIG_XEN */ +static int kimage_load_segment(struct kimage *image, + struct kexec_segment *segment) +{ + return kimage_load_normal_segment(image, segment); +} +#endif /* * Exec Kernel system call: for obvious reasons only root may call it. @@ -978,9 +1023,11 @@ asmlinkage long sys_kexec_load(unsigned if (result) goto out; - result = machine_kexec_prepare(image); - if (result) - goto out; + if (machine_kexec_prepare) { + result = machine_kexec_prepare(image); + if (result) + goto out; + } for (i = 0; i < nr_segments; i++) { result = kimage_load_segment(image, &image->segment[i]); @@ -991,6 +1038,13 @@ asmlinkage long sys_kexec_load(unsigned if (result) goto out; } + + if (machine_kexec_load) { + result = machine_kexec_load(image); + if (result) + goto out; + } + /* Install the new kernel, and Uninstall the old */ image = xchg(dest_image, image); @@ -1045,7 +1099,6 @@ void crash_kexec(struct pt_regs *regs) struct kimage *image; int locked; - /* Take the kexec_lock here to prevent sys_kexec_load * running on one cpu from replacing the crash kernel * we are using after a panic on a different cpu. 
[-- Attachment #3: linux-2.6.19-rc1-kexec-xen-i386.patch --] [-- Type: text/x-patch, Size: 4588 bytes --] --- arch/i386/kernel/crash.c | 4 ++ arch/i386/kernel/machine_kexec.c | 65 +++++++++++++++++++++++++-------------- include/asm-i386/kexec.h | 3 + 3 files changed, 49 insertions(+), 23 deletions(-) Index: kexec-2.6.16/arch/i386/kernel/crash.c =================================================================== --- kexec-2.6.16.orig/arch/i386/kernel/crash.c +++ kexec-2.6.16/arch/i386/kernel/crash.c @@ -90,6 +90,7 @@ static void crash_save_self(struct pt_re crash_save_this_cpu(regs, cpu); } +#ifndef CONFIG_XEN #ifdef CONFIG_SMP static atomic_t waiting_for_crash_ipi; @@ -158,6 +159,7 @@ static void nmi_shootdown_cpus(void) /* There are no cpus to shootdown */ } #endif +#endif /* CONFIG_XEN */ void machine_crash_shutdown(struct pt_regs *regs) { @@ -174,10 +176,12 @@ void machine_crash_shutdown(struct pt_re /* Make a note of crashing cpu. Will be used in NMI callback.*/ crashing_cpu = smp_processor_id(); +#ifndef CONFIG_XEN nmi_shootdown_cpus(); lapic_shutdown(); #if defined(CONFIG_X86_IO_APIC) disable_IO_APIC(); #endif +#endif /* CONFIG_XEN */ crash_save_self(regs); } Index: kexec-2.6.16/arch/i386/kernel/machine_kexec.c =================================================================== --- kexec-2.6.16.orig/arch/i386/kernel/machine_kexec.c +++ kexec-2.6.16/arch/i386/kernel/machine_kexec.c @@ -19,6 +19,10 @@ #include <asm/desc.h> #include <asm/system.h> +#ifdef CONFIG_XEN +#include <xen/interface/kexec.h> +#endif + #define PAGE_ALIGNED __attribute__ ((__aligned__(PAGE_SIZE))) static u32 kexec_pgd[1024] PAGE_ALIGNED; #ifdef CONFIG_X86_PAE @@ -28,37 +32,45 @@ static u32 kexec_pmd1[1024] PAGE_ALIGNED static u32 kexec_pte0[1024] PAGE_ALIGNED; static u32 kexec_pte1[1024] PAGE_ALIGNED; -/* - * A architecture hook called to validate the - * proposed image and prepare the control pages - * as needed. 
The pages for KEXEC_CONTROL_CODE_SIZE - * have been allocated, but the segments have yet - * been copied into the kernel. - * - * Do what every setup is needed on image and the - * reboot code buffer to allow us to avoid allocations - * later. - * - * Currently nothing. - */ -int machine_kexec_prepare(struct kimage *image) -{ - return 0; -} +#ifdef CONFIG_XEN -/* - * Undo anything leftover by machine_kexec_prepare - * when an image is freed. - */ -void machine_kexec_cleanup(struct kimage *image) +#define __ma(x) (pfn_to_mfn(__pa((x)) >> PAGE_SHIFT) << PAGE_SHIFT) + +#if PAGES_NR > KEXEC_XEN_NO_PAGES +#error PAGES_NR is greater than KEXEC_XEN_NO_PAGES - Xen support will break +#endif + +#if PA_CONTROL_PAGE != 0 +#error PA_CONTROL_PAGE is non zero - Xen support will break +#endif + +void machine_kexec_setup_load_arg(xen_kexec_image_t *xki, struct kimage *image) { + void *control_page; + + memset(xki->page_list, 0, sizeof(xki->page_list)); + + control_page = page_address(image->control_code_page); + memcpy(control_page, relocate_kernel, PAGE_SIZE); + + xki->page_list[PA_CONTROL_PAGE] = __ma(control_page); + xki->page_list[PA_PGD] = __ma(kexec_pgd); +#ifdef CONFIG_X86_PAE + xki->page_list[PA_PMD_0] = __ma(kexec_pmd0); + xki->page_list[PA_PMD_1] = __ma(kexec_pmd1); +#endif + xki->page_list[PA_PTE_0] = __ma(kexec_pte0); + xki->page_list[PA_PTE_1] = __ma(kexec_pte1); + } +#endif /* CONFIG_XEN */ + /* * Do not allocate memory (or fail in any way) in machine_kexec(). * We are past the point of no return, committed to rebooting now. 
*/ -NORET_TYPE void machine_kexec(struct kimage *image) +static NORET_TYPE ATTRIB_NORET void native_machine_kexec(struct kimage *image) { unsigned long page_list[PAGES_NR]; void *control_page; @@ -87,3 +99,10 @@ NORET_TYPE void machine_kexec(struct kim relocate_kernel((unsigned long)image->head, (unsigned long)page_list, image->start, cpu_has_pae); } + +NORET_TYPE void (*machine_kexec)(struct kimage *image) ATTRIB_NORET + = native_machine_kexec; +int (*machine_kexec_prepare)(struct kimage *image) = NULL; +int (*machine_kexec_load)(struct kimage *image) = NULL; +void (*machine_kexec_unload)(struct kimage *image) = NULL; +void (*machine_kexec_cleanup)(struct kimage *image) = NULL; Index: kexec-2.6.16/include/asm-i386/kexec.h =================================================================== --- kexec-2.6.16.orig/include/asm-i386/kexec.h +++ kexec-2.6.16/include/asm-i386/kexec.h @@ -98,6 +98,9 @@ relocate_kernel(unsigned long indirectio unsigned long start_address, unsigned int has_pae) ATTRIB_NORET; + +#define KEXEC_ARCH_USES_HOOKS 1 + #endif /* __ASSEMBLY__ */ #endif /* _I386_KEXEC_H */ [-- Attachment #4: linux-2.6.19-rc1-kexec-xen-x86_64.patch --] [-- Type: text/x-patch, Size: 7755 bytes --] --- arch/x86_64/kernel/crash.c | 6 + arch/x86_64/kernel/machine_kexec.c | 133 +++++++++++++++++++++++++++++++++---- include/asm-x86_64/kexec.h | 7 + 3 files changed, 132 insertions(+), 14 deletions(-) Index: kexec-2.6.16/arch/x86_64/kernel/crash.c =================================================================== --- kexec-2.6.16.orig/arch/x86_64/kernel/crash.c +++ kexec-2.6.16/arch/x86_64/kernel/crash.c @@ -92,6 +92,7 @@ static void crash_save_self(struct pt_re crash_save_this_cpu(regs, cpu); } +#ifndef CONFIG_XEN #ifdef CONFIG_SMP static atomic_t waiting_for_crash_ipi; @@ -156,6 +157,7 @@ static void nmi_shootdown_cpus(void) /* There are no cpus to shootdown */ } #endif +#endif /* CONFIG_XEN */ void machine_crash_shutdown(struct pt_regs *regs) { @@ -173,6 +175,8 @@ void 
machine_crash_shutdown(struct pt_re /* Make a note of crashing cpu. Will be used in NMI callback.*/ crashing_cpu = smp_processor_id(); + +#ifndef CONFIG_XEN nmi_shootdown_cpus(); if(cpu_has_apic) @@ -181,6 +185,6 @@ void machine_crash_shutdown(struct pt_re #if defined(CONFIG_X86_IO_APIC) disable_IO_APIC(); #endif - +#endif /* CONFIG_XEN */ crash_save_self(regs); } Index: kexec-2.6.16/arch/x86_64/kernel/machine_kexec.c =================================================================== --- kexec-2.6.16.orig/arch/x86_64/kernel/machine_kexec.c +++ kexec-2.6.16/arch/x86_64/kernel/machine_kexec.c @@ -24,6 +24,104 @@ static u64 kexec_pud1[512] PAGE_ALIGNED; static u64 kexec_pmd1[512] PAGE_ALIGNED; static u64 kexec_pte1[512] PAGE_ALIGNED; +#ifdef CONFIG_XEN + +/* In the case of Xen, override hypervisor functions to be able to create + * a regular identity mapping page table... + */ + +#include <xen/interface/kexec.h> +#include <xen/interface/memory.h> + +#define x__pmd(x) ((pmd_t) { (x) } ) +#define x__pud(x) ((pud_t) { (x) } ) +#define x__pgd(x) ((pgd_t) { (x) } ) + +#define x_pmd_val(x) ((x).pmd) +#define x_pud_val(x) ((x).pud) +#define x_pgd_val(x) ((x).pgd) + +static inline void x_set_pmd(pmd_t *dst, pmd_t val) +{ + x_pmd_val(*dst) = x_pmd_val(val); +} + +static inline void x_set_pud(pud_t *dst, pud_t val) +{ + x_pud_val(*dst) = phys_to_machine(x_pud_val(val)); +} + +static inline void x_pud_clear (pud_t *pud) +{ + x_pud_val(*pud) = 0; +} + +static inline void x_set_pgd(pgd_t *dst, pgd_t val) +{ + x_pgd_val(*dst) = phys_to_machine(x_pgd_val(val)); +} + +static inline void x_pgd_clear (pgd_t * pgd) +{ + x_pgd_val(*pgd) = 0; +} + +#define X__PAGE_KERNEL_LARGE_EXEC \ + _PAGE_PRESENT | _PAGE_RW | _PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_PSE +#define X_KERNPG_TABLE _PAGE_PRESENT | _PAGE_RW | _PAGE_ACCESSED | _PAGE_DIRTY + +#define __ma(x) (pfn_to_mfn(__pa((x)) >> PAGE_SHIFT) << PAGE_SHIFT) + +#if PAGES_NR > KEXEC_XEN_NO_PAGES +#error PAGES_NR is greater than KEXEC_XEN_NO_PAGES 
- Xen support will break +#endif + +#if PA_CONTROL_PAGE != 0 +#error PA_CONTROL_PAGE is non zero - Xen support will break +#endif + +void machine_kexec_setup_load_arg(xen_kexec_image_t *xki, struct kimage *image) +{ + void *control_page; + void *table_page; + + memset(xki->page_list, 0, sizeof(xki->page_list)); + + control_page = page_address(image->control_code_page) + PAGE_SIZE; + memcpy(control_page, relocate_kernel, PAGE_SIZE); + + table_page = page_address(image->control_code_page); + + xki->page_list[PA_CONTROL_PAGE] = __ma(control_page); + xki->page_list[PA_TABLE_PAGE] = __ma(table_page); + + xki->page_list[PA_PGD] = __ma(kexec_pgd); + xki->page_list[PA_PUD_0] = __ma(kexec_pud0); + xki->page_list[PA_PUD_1] = __ma(kexec_pud1); + xki->page_list[PA_PMD_0] = __ma(kexec_pmd0); + xki->page_list[PA_PMD_1] = __ma(kexec_pmd1); + xki->page_list[PA_PTE_0] = __ma(kexec_pte0); + xki->page_list[PA_PTE_1] = __ma(kexec_pte1); +} + +#else /* CONFIG_XEN */ + +#define x__pmd(x) __pmd(x) +#define x__pud(x) __pud(x) +#define x__pgd(x) __pgd(x) + +#define x_set_pmd(x, y) set_pmd(x, y) +#define x_set_pud(x, y) set_pud(x, y) +#define x_set_pgd(x, y) set_pgd(x, y) + +#define x_pud_clear(x) pud_clear(x) +#define x_pgd_clear(x) pgd_clear(x) + +#define X__PAGE_KERNEL_LARGE_EXEC __PAGE_KERNEL_LARGE_EXEC +#define X_KERNPG_TABLE _KERNPG_TABLE + +#endif /* CONFIG_XEN */ + static void init_level2_page(pmd_t *level2p, unsigned long addr) { unsigned long end_addr; @@ -31,7 +129,7 @@ static void init_level2_page(pmd_t *leve addr &= PAGE_MASK; end_addr = addr + PUD_SIZE; while (addr < end_addr) { - set_pmd(level2p++, __pmd(addr | __PAGE_KERNEL_LARGE_EXEC)); + x_set_pmd(level2p++, x__pmd(addr | X__PAGE_KERNEL_LARGE_EXEC)); addr += PMD_SIZE; } } @@ -56,12 +154,12 @@ static int init_level3_page(struct kimag } level2p = (pmd_t *)page_address(page); init_level2_page(level2p, addr); - set_pud(level3p++, __pud(__pa(level2p) | _KERNPG_TABLE)); + x_set_pud(level3p++, x__pud(__pa(level2p) | 
X_KERNPG_TABLE)); addr += PUD_SIZE; } /* clear the unused entries */ while (addr < end_addr) { - pud_clear(level3p++); + x_pud_clear(level3p++); addr += PUD_SIZE; } out: @@ -92,12 +190,12 @@ static int init_level4_page(struct kimag if (result) { goto out; } - set_pgd(level4p++, __pgd(__pa(level3p) | _KERNPG_TABLE)); + x_set_pgd(level4p++, x__pgd(__pa(level3p) | X_KERNPG_TABLE)); addr += PGDIR_SIZE; } /* clear the unused entries */ while (addr < end_addr) { - pgd_clear(level4p++); + x_pgd_clear(level4p++); addr += PGDIR_SIZE; } out: @@ -108,11 +206,17 @@ out: static int init_pgtable(struct kimage *image, unsigned long start_pgtable) { pgd_t *level4p; + unsigned long x_end_pfn = end_pfn; + +#ifdef CONFIG_XEN + x_end_pfn = HYPERVISOR_memory_op(XENMEM_maximum_ram_page, NULL); +#endif + level4p = (pgd_t *)__va(start_pgtable); - return init_level4_page(image, level4p, 0, end_pfn << PAGE_SHIFT); + return init_level4_page(image, level4p, 0, x_end_pfn << PAGE_SHIFT); } -int machine_kexec_prepare(struct kimage *image) +static int native_machine_kexec_prepare(struct kimage *image) { unsigned long start_pgtable; int result; @@ -128,16 +232,11 @@ int machine_kexec_prepare(struct kimage return 0; } -void machine_kexec_cleanup(struct kimage *image) -{ - return; -} - /* * Do not allocate memory (or fail in any way) in machine_kexec(). * We are past the point of no return, committed to rebooting now. 
*/ -NORET_TYPE void machine_kexec(struct kimage *image) +static NORET_TYPE ATTRIB_NORET void native_machine_kexec(struct kimage *image) { unsigned long page_list[PAGES_NR]; void *control_page; @@ -171,3 +270,11 @@ NORET_TYPE void machine_kexec(struct kim relocate_kernel((unsigned long)image->head, (unsigned long)page_list, image->start); } + +NORET_TYPE void (*machine_kexec)(struct kimage *image) ATTRIB_NORET + = native_machine_kexec; +int (*machine_kexec_prepare)(struct kimage *image) + = native_machine_kexec_prepare; +int (*machine_kexec_load)(struct kimage *image) = NULL; +void (*machine_kexec_unload)(struct kimage *image) = NULL; +void (*machine_kexec_cleanup)(struct kimage *image) = NULL; Index: kexec-2.6.16/include/asm-x86_64/kexec.h =================================================================== --- kexec-2.6.16.orig/include/asm-x86_64/kexec.h +++ kexec-2.6.16/include/asm-x86_64/kexec.h @@ -91,6 +91,13 @@ relocate_kernel(unsigned long indirectio unsigned long page_list, unsigned long start_address) ATTRIB_NORET; +/* Under Xen we need to work with machine addresses. These macros give the + * machine address of a certain page to the generic kexec code instead of + * the pseudo physical address which would be given by the default macros. 
+ */ + +#define KEXEC_ARCH_USES_HOOKS 1 + #endif /* __ASSEMBLY__ */ #endif /* _X86_64_KEXEC_H */ [-- Attachment #5: xen-sparse-kexec-fixes.diff --] [-- Type: text/x-patch, Size: 2642 bytes --] --- drivers/xen/core/machine_kexec.c | 42 ++++++++++++++++++++++++++++++++++++--- 1 file changed, 39 insertions(+), 3 deletions(-) Index: kexec-2.6.16/drivers/xen/core/machine_kexec.c =================================================================== --- kexec-2.6.16.orig/drivers/xen/core/machine_kexec.c +++ kexec-2.6.16/drivers/xen/core/machine_kexec.c @@ -11,6 +11,7 @@ extern void machine_kexec_setup_load_arg(xen_kexec_image_t *xki, struct kimage *image); +static void xen0_set_hooks(void); int xen_max_nr_phys_cpus; struct resource xen_hypervisor_res; @@ -24,6 +25,7 @@ void xen_machine_kexec_setup_resources(v if (!is_initial_xendomain()) return; + xen0_set_hooks(); /* determine maximum number of physical cpus */ @@ -124,7 +126,7 @@ static void setup_load_arg(xen_kexec_ima * is currently called too early. It might make sense * to move prepare, but for now, just add an extra hook. */ -int xen_machine_kexec_load(struct kimage *image) +static int xen0_machine_kexec_load(struct kimage *image) { xen_kexec_load_t xkl; @@ -140,7 +142,7 @@ int xen_machine_kexec_load(struct kimage * is called too late, and its possible xen could try and kdump * using resources that have been freed. */ -void xen_machine_kexec_unload(struct kimage *image) +static void xen0_machine_kexec_unload(struct kimage *image) { xen_kexec_load_t xkl; @@ -157,7 +159,7 @@ void xen_machine_kexec_unload(struct kim * stop all CPUs and kexec. That is it combines machine_shutdown() * and machine_kexec() in Linux kexec terms. 
*/ -NORET_TYPE void xen_machine_kexec(struct kimage *image) +static NORET_TYPE void xen0_machine_kexec(struct kimage *image) { xen_kexec_exec_t xke; @@ -172,6 +174,40 @@ void machine_shutdown(void) /* do nothing */ } +static unsigned long xen0_page_to_pfn(struct page *page) +{ + return pfn_to_mfn(page_to_pfn(page)); +} + +static struct page* xen0_pfn_to_page(unsigned long pfn) +{ + return pfn_to_page(mfn_to_pfn(pfn)); +} + +static unsigned long xen0_virt_to_phys(void *addr) +{ + return virt_to_machine(addr); +} + +static void* xen0_phys_to_virt(unsigned long addr) +{ + return phys_to_virt(machine_to_phys(addr)); +} + + +static void xen0_set_hooks(void) +{ + kexec_page_to_pfn = xen0_page_to_pfn; + kexec_pfn_to_page = xen0_pfn_to_page; + kexec_virt_to_phys = xen0_virt_to_phys; + kexec_phys_to_virt = xen0_phys_to_virt; + + machine_kexec_load = xen0_machine_kexec_load; + machine_kexec_unload = xen0_machine_kexec_unload; + machine_kexec = xen0_machine_kexec; + + printk("%s: kexec hook setup done\n", __FUNCTION__); +} /* * Local variables: [-- Attachment #6: Type: text/plain, Size: 138 bytes --] _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel ^ permalink raw reply [flat|nested] 20+ messages in thread
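For readers following the xen0_* translation hooks in the sparse-tree patch above, here is a minimal user-space sketch of the same idea: generic code calls through a function pointer that defaults to the native identity mapping and is switched to a pseudo-physical/machine translation at runtime. It is not the patch's code. The toy_* names, the table contents and TOY_INVALID are made up for illustration, and the real pfn_to_mfn()/mfn_to_pfn() are provided by the Xen kernel headers rather than a linear table search.

```c
#include <assert.h>

#define TOY_NR_PAGES 4UL
#define TOY_INVALID  (~0UL)

/* Made-up pseudo-physical -> machine frame table; real values come from Xen. */
static const unsigned long toy_p2m[TOY_NR_PAGES] = { 7, 2, 9, 4 };

/* Native case: pseudo-physical and machine frame numbers are identical. */
static unsigned long native_frame(unsigned long frame)
{
	return frame;
}

static unsigned long toy_pfn_to_mfn(unsigned long pfn)
{
	return pfn < TOY_NR_PAGES ? toy_p2m[pfn] : TOY_INVALID;
}

static unsigned long toy_mfn_to_pfn(unsigned long mfn)
{
	unsigned long pfn;

	for (pfn = 0; pfn < TOY_NR_PAGES; pfn++)
		if (toy_p2m[pfn] == mfn)
			return pfn;
	return TOY_INVALID;
}

/* Hooks the generic code calls; they default to the native behaviour. */
unsigned long (*toy_page_to_frame)(unsigned long pfn) = native_frame;
unsigned long (*toy_frame_to_page)(unsigned long frame) = native_frame;

/* Runtime switch, analogous in spirit to xen0_set_hooks(). */
void toy_set_hooks(void)
{
	/* Sanity check: translating there and back must be lossless. */
	assert(toy_mfn_to_pfn(toy_pfn_to_mfn(0)) == 0);

	toy_page_to_frame = toy_pfn_to_mfn;
	toy_frame_to_page = toy_mfn_to_pfn;
}

/* Returns non-zero if a frame survives a round trip through both hooks. */
int toy_round_trip_ok(unsigned long pfn)
{
	return toy_frame_to_page(toy_page_to_frame(pfn)) == pfn;
}
```

Before toy_set_hooks() runs, both hooks are the identity, which corresponds to the native case; afterwards every address the generic code stores is a machine frame, which is what the hypervisor expects.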
* Re: kexec trouble 2006-12-07 11:24 ` Gerd Hoffmann @ 2006-12-08 4:15 ` Magnus Damm 2006-12-08 10:01 ` Gerd Hoffmann 0 siblings, 1 reply; 20+ messages in thread From: Magnus Damm @ 2006-12-08 4:15 UTC (permalink / raw) To: Gerd Hoffmann; +Cc: Magnus Damm, Xen devel list, Horms Hi again Gerd, On 12/7/06, Gerd Hoffmann <kraxel@suse.de> wrote: > Hi, > > > Function pointers sound like the right way to go! Happy hacking! > > First step of a cleanup by moving to function pointers. As a first step I think they look pretty good. I have a few random comments. > Compile tested only. Ok. I've browsed through the patches and done some basic compilation too. > First three attachments replace the patches with identical names in > patches/linux-2.6. The last should be applied to the sparse tree. I think using a structure for all callbacks will result in cleaner code. This is sort of a nitpick because it does not really matter function wise, but it sounded earlier like you were aiming for something that would be directly acceptable by the kexec and kdump community. And I'm all for cleanliness. Personally I would go with changing the code in kernel/kexec.c to instead of calling machine_kexec() call kexec_ops.machine_kexec(). This regardless of the use of KEXEC_ARCH_USES_HOOKS. Then I would have a single global instance of the structure kexec_ops declared in kernel/kexec.c, and it would by default fill in kexec_ops.machine_kexec() to machine_kexec. That way you won't have to rename the arch-specific functions and there is no need to declare the hooks in the arch-specific files. Maybe you won't need KEXEC_ARCH_USES_HOOKS at all. The load and unload code may be broken today if KEXEC_ARCH_USES_HOOKS is unset - can you really check if machine_kexec_load is non-NULL if it is inline? The reason why I did put the page-macros in arch-specific header files was because they need to be different on ia64. 
So your unification in drivers/xen/core/machine_kexec.c may be ok for now (if our goal is x86 only), but in the future we need to figure out how to change them nicely on ia64.

You probably remember that I was kind of negative about trying to solve mainline merge issues at the same time as implementing this "switch". This was because I remembered that paravirt allowed patching of inline machine code. At least that's the impression I got from a presentation given here in Tokyo by Rusty. I think the page macros ideally should be patched in, but it's kind of hard trying to do that without paravirt.

Finally, we should get rid of the #ifdef CONFIG_XEN left here and there. My main concern is the code in crash.c, which needs to be replaced with runtime checks if we are aiming for a single binary for both native and dom0. I left out domU because it doesn't do crash, right?

If you have an updated snapshot (or a reply saying I should use this version) then I'll try out the code first thing next week.

Thanks,

/ magnus
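The callback-structure suggestion in this mail could look roughly like the sketch below: one global table with the native functions filled in by default, optional hooks left NULL, and core code that calls through the table. This is an illustration under assumptions, not code from any of the patches. struct image, struct machine_ops, native_prepare(), hv_load(), hv_set_hooks() and load_image() are hypothetical stand-ins for struct kimage, the proposed kexec_ops and the kernel/kexec.c call sites.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct kimage; only what the sketch needs. */
struct image {
	int prepared;
	int loaded;
};

/* Table of callbacks, modelled on the kexec_ops idea from the mail. */
struct machine_ops {
	int (*prepare)(struct image *img);	/* filled in by default */
	int (*load)(struct image *img);		/* optional, may stay NULL */
};

static int native_prepare(struct image *img)
{
	img->prepared = 1;
	return 0;
}

/* Single global instance: native defaults, optional hooks left NULL. */
struct machine_ops ops = {
	.prepare = native_prepare,
};

/* Extra step that only a hypervisor backend needs. */
static int hv_load(struct image *img)
{
	img->loaded = 1;
	return 0;
}

/* Called once at boot when running on the hypervisor (a runtime decision). */
void hv_set_hooks(void)
{
	ops.load = hv_load;
}

/* Core code: calls through the table and skips unset optional hooks. */
int load_image(struct image *img)
{
	int result = 0;

	assert(img != NULL);

	if (ops.prepare) {
		result = ops.prepare(img);
		if (result)
			return result;
	}
	if (ops.load)
		result = ops.load(img);
	return result;
}
```

Because the NULL test is made on a struct member rather than on a possibly inline function, the question of whether such a check is meaningful does not arise, and the arch-specific functions keep their names.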
* Re: kexec trouble 2006-12-08 4:15 ` Magnus Damm @ 2006-12-08 10:01 ` Gerd Hoffmann 2006-12-08 10:24 ` Ian Campbell 0 siblings, 1 reply; 20+ messages in thread
From: Gerd Hoffmann @ 2006-12-08 10:01 UTC (permalink / raw)
To: Magnus Damm; +Cc: Magnus Damm, Xen devel list, Horms

[-- Attachment #1: Type: text/plain, Size: 3545 bytes --]

Magnus Damm wrote:
> I think using a structure for all callbacks will result in cleaner
> code. This is sort of a nitpick because it does not really matter
> function wise, but it sounded earlier like you were aiming for
> something that would be directly acceptable by the kexec and kdump
> community. And I'm all for cleanliness.
>
> Personally I would go with changing the code in kernel/kexec.c to
> instead of calling machine_kexec() call kexec_ops.machine_kexec().
> This regardless of the use of KEXEC_ARCH_USES_HOOKS. Then I would have
> a single global instance of the structure kexec_ops declared in
> kernel/kexec.c, and it would by default fill in
> kexec_ops.machine_kexec() to machine_kexec. That way you won't have to
> rename the arch-specific functions and there is no need to declare the
> hooks in the arch-specific files. Maybe you won't need
> KEXEC_ARCH_USES_HOOKS at all.

Yep, good idea, that works without the hooks define (and also without touching all architectures, which I want to avoid too).

> The load and unload code may be broken today if KEXEC_ARCH_USES_HOOKS
> is unset - can you really check if machine_kexec_load is non-NULL if
> it is inline?

I didn't check what gcc made of it. It's a moot point now anyway with the switch to an ops struct.

> The reason why I did put the page-macros in arch-specific header files
> was because they need to be different on ia64. So your unification in
> drivers/xen/core/machine_kexec.c may be ok for now (if our goal is x86
> only), but in the future we need to figure out how to change them
> nicely on ia64.

I simply wasn't aware of the ia64 issue.
Well, maybe we should simply create an arch/$arch/kernel/machine_kexec_xen0.c file and place that kind of code there. And maybe move the arch-independent xen bits in drivers/xen/core/machine_kexec.c to kernel/kexec_xen0, but I think that discussion should be deferred until we are actually seeking mainline merge. Maybe we get our own subdirectory below kernel/ for that kind of stuff. There is some more code which is arch-specific on native but simply a hypercall on xen, smpboot.c for example.

> You probably remember that I was kind of negative about trying to solve
> mainline merge issues at the same time as implementing this "switch".
> This was because I remembered that paravirt allowed patching of inline
> machine code. At least that's the impression I got from a presentation
> given here in Tokyo by Rusty. I think the page macros ideally should
> be patched in, but it's kind of hard trying to do that without
> paravirt.

Yep, it does, to gain a few percent of performance in hot-path code, which IMHO the kexec bits are not. They aren't performance critical; usually you load a kexec kernel only once, so I don't think it is worth the trouble.

> Finally, we should get rid of the #ifdef CONFIG_XEN left here and
> there. My main concern is the code in crash.c, which needs to be
> replaced with runtime checks if we are aiming for a single binary for
> both native and dom0. I left out domU because it doesn't do crash,
> right?

I expect *lots* of changes in that area (apic/smp) anyway when we upgrade the xen linux tree to be based on 2.6.20-rc1 or newer. The paravirt infrastructure is in Linus' tree now. I'd wait until that is done, then look again.

> If you have an updated snapshot (or a reply saying I should use this
> version) then I'll try out the code first thing next week.

Updated patches attached.
cheers, Gerd -- Gerd Hoffmann <kraxel@suse.de> [-- Attachment #2: kexec-generic.patch --] [-- Type: text/x-patch, Size: 8376 bytes --] --- include/linux/kexec.h | 19 ++++++++++ kernel/kexec.c | 90 +++++++++++++++++++++++++++++++++++++++++--------- 2 files changed, 92 insertions(+), 17 deletions(-) Index: kexec-2.6.16/include/linux/kexec.h =================================================================== --- kexec-2.6.16.orig/include/linux/kexec.h +++ kexec-2.6.16/include/linux/kexec.h @@ -85,12 +85,29 @@ struct kimage { #define KEXEC_TYPE_CRASH 1 }; +/* kexec interface functions */ +struct kexec_machine_ops { + unsigned long (*kpage_to_pfn)(struct page *page); + struct page* (*kpfn_to_page)(unsigned long pfn); + unsigned long (*kvirt_to_phys)(void *addr); + void* (*kphys_to_virt)(unsigned long addr); + NORET_TYPE void (*kexec)(struct kimage *image) ATTRIB_NORET; + int (*kexec_prepare)(struct kimage *image); + int (*kexec_load)(struct kimage *image); + void (*kexec_unload)(struct kimage *image); + void (*kexec_cleanup)(struct kimage *image); +}; +extern struct kexec_machine_ops kexec_ops; -/* kexec interface functions */ extern NORET_TYPE void machine_kexec(struct kimage *image) ATTRIB_NORET; extern int machine_kexec_prepare(struct kimage *image); extern void machine_kexec_cleanup(struct kimage *image); + +#ifdef CONFIG_XEN +extern void xen_machine_kexec_setup_resources(void); +extern void xen_machine_kexec_register_resources(struct resource *res); +#endif extern asmlinkage long sys_kexec_load(unsigned long entry, unsigned long nr_segments, struct kexec_segment __user *segments, Index: kexec-2.6.16/kernel/kexec.c =================================================================== --- kexec-2.6.16.orig/kernel/kexec.c +++ kexec-2.6.16/kernel/kexec.c @@ -27,6 +27,36 @@ #include <asm/system.h> #include <asm/semaphore.h> +static unsigned long default_page_to_pfn(struct page *page) +{ + return page_to_pfn(page); +} + +static struct page* default_pfn_to_page(unsigned 
long pfn) +{ + return pfn_to_page(pfn); +} + +static unsigned long default_virt_to_phys(void *addr) +{ + return virt_to_phys(addr); +} + +static void* default_phys_to_virt(unsigned long addr) +{ + return phys_to_virt(addr); +} + +struct kexec_machine_ops kexec_ops = { + .kpage_to_pfn = default_page_to_pfn, + .kpfn_to_page = default_pfn_to_page, + .kvirt_to_phys = default_virt_to_phys, + .kphys_to_virt = default_phys_to_virt, + .kexec = machine_kexec, + .kexec_prepare = machine_kexec_prepare, + .kexec_cleanup = machine_kexec_cleanup, +}; + /* Per cpu memory for storing cpu states in case of system crash. */ note_buf_t* crash_notes; @@ -403,7 +433,7 @@ static struct page *kimage_alloc_normal_ pages = kimage_alloc_pages(GFP_KERNEL, order); if (!pages) break; - pfn = page_to_pfn(pages); + pfn = kexec_ops.kpage_to_pfn(pages); epfn = pfn + count; addr = pfn << PAGE_SHIFT; eaddr = epfn << PAGE_SHIFT; @@ -437,6 +467,7 @@ static struct page *kimage_alloc_normal_ return pages; } +#ifndef CONFIG_XEN static struct page *kimage_alloc_crash_control_pages(struct kimage *image, unsigned int order) { @@ -490,7 +521,7 @@ static struct page *kimage_alloc_crash_c } /* If I don't overlap any segments I have found my hole! 
*/ if (i == image->nr_segments) { - pages = pfn_to_page(hole_start >> PAGE_SHIFT); + pages = kexec_ops.kpfn_to_page(hole_start >> PAGE_SHIFT); break; } } @@ -517,6 +548,13 @@ struct page *kimage_alloc_control_pages( return pages; } +#else /* !CONFIG_XEN */ +struct page *kimage_alloc_control_pages(struct kimage *image, + unsigned int order) +{ + return kimage_alloc_normal_control_pages(image, order); +} +#endif static int kimage_add_entry(struct kimage *image, kimage_entry_t entry) { @@ -532,7 +570,7 @@ static int kimage_add_entry(struct kimag return -ENOMEM; ind_page = page_address(page); - *image->entry = virt_to_phys(ind_page) | IND_INDIRECTION; + *image->entry = kexec_ops.kvirt_to_phys(ind_page) | IND_INDIRECTION; image->entry = ind_page; image->last_entry = ind_page + ((PAGE_SIZE/sizeof(kimage_entry_t)) - 1); @@ -593,13 +631,13 @@ static int kimage_terminate(struct kimag #define for_each_kimage_entry(image, ptr, entry) \ for (ptr = &image->head; (entry = *ptr) && !(entry & IND_DONE); \ ptr = (entry & IND_INDIRECTION)? \ - phys_to_virt((entry & PAGE_MASK)): ptr +1) + kexec_ops.kphys_to_virt((entry & PAGE_MASK)): ptr +1) static void kimage_free_entry(kimage_entry_t entry) { struct page *page; - page = pfn_to_page(entry >> PAGE_SHIFT); + page = kexec_ops.kpfn_to_page(entry >> PAGE_SHIFT); kimage_free_pages(page); } @@ -611,6 +649,9 @@ static void kimage_free(struct kimage *i if (!image) return; + if (kexec_ops.kexec_unload) + kexec_ops.kexec_unload(image); + kimage_free_extra_pages(image); for_each_kimage_entry(image, ptr, entry) { if (entry & IND_INDIRECTION) { @@ -630,7 +671,8 @@ static void kimage_free(struct kimage *i kimage_free_entry(ind); /* Handle any machine specific cleanup */ - machine_kexec_cleanup(image); + if (kexec_ops.kexec_cleanup) + kexec_ops.kexec_cleanup(image); /* Free the kexec control pages... */ kimage_free_page_list(&image->control_pages); @@ -686,7 +728,7 @@ static struct page *kimage_alloc_page(st * have a match. 
*/ list_for_each_entry(page, &image->dest_pages, lru) { - addr = page_to_pfn(page) << PAGE_SHIFT; + addr = kexec_ops.kpage_to_pfn(page) << PAGE_SHIFT; if (addr == destination) { list_del(&page->lru); return page; @@ -701,12 +743,12 @@ static struct page *kimage_alloc_page(st if (!page) return NULL; /* If the page cannot be used file it away */ - if (page_to_pfn(page) > + if (kexec_ops.kpage_to_pfn(page) > (KEXEC_SOURCE_MEMORY_LIMIT >> PAGE_SHIFT)) { list_add(&page->lru, &image->unuseable_pages); continue; } - addr = page_to_pfn(page) << PAGE_SHIFT; + addr = kexec_ops.kpage_to_pfn(page) << PAGE_SHIFT; /* If it is the destination page we want use it */ if (addr == destination) @@ -729,7 +771,7 @@ static struct page *kimage_alloc_page(st struct page *old_page; old_addr = *old & PAGE_MASK; - old_page = pfn_to_page(old_addr >> PAGE_SHIFT); + old_page = kexec_ops.kpfn_to_page(old_addr >> PAGE_SHIFT); copy_highpage(page, old_page); *old = addr | (*old & ~PAGE_MASK); @@ -779,7 +821,7 @@ static int kimage_load_normal_segment(st result = -ENOMEM; goto out; } - result = kimage_add_page(image, page_to_pfn(page) + result = kimage_add_page(image, kexec_ops.kpage_to_pfn(page) << PAGE_SHIFT); if (result < 0) goto out; @@ -811,6 +853,7 @@ out: return result; } +#ifndef CONFIG_XEN static int kimage_load_crash_segment(struct kimage *image, struct kexec_segment *segment) { @@ -833,7 +876,7 @@ static int kimage_load_crash_segment(str char *ptr; size_t uchunk, mchunk; - page = pfn_to_page(maddr >> PAGE_SHIFT); + page = kexec_ops.kpfn_to_page(maddr >> PAGE_SHIFT); if (page == 0) { result = -ENOMEM; goto out; @@ -881,6 +924,13 @@ static int kimage_load_segment(struct ki return result; } +#else /* CONFIG_XEN */ +static int kimage_load_segment(struct kimage *image, + struct kexec_segment *segment) +{ + return kimage_load_normal_segment(image, segment); +} +#endif /* * Exec Kernel system call: for obvious reasons only root may call it. 
@@ -978,9 +1028,11 @@ asmlinkage long sys_kexec_load(unsigned if (result) goto out; - result = machine_kexec_prepare(image); - if (result) - goto out; + if (kexec_ops.kexec_prepare) { + result = kexec_ops.kexec_prepare(image); + if (result) + goto out; + } for (i = 0; i < nr_segments; i++) { result = kimage_load_segment(image, &image->segment[i]); @@ -991,6 +1043,13 @@ asmlinkage long sys_kexec_load(unsigned if (result) goto out; } + + if (kexec_ops.kexec_load) { + result = kexec_ops.kexec_load(image); + if (result) + goto out; + } + /* Install the new kernel, and Uninstall the old */ image = xchg(dest_image, image); @@ -1045,7 +1104,6 @@ void crash_kexec(struct pt_regs *regs) struct kimage *image; int locked; - /* Take the kexec_lock here to prevent sys_kexec_load * running on one cpu from replacing the crash kernel * we are using after a panic on a different cpu. [-- Attachment #3: linux-2.6.19-rc1-kexec-xen-i386.patch --] [-- Type: text/x-patch, Size: 2714 bytes --] --- arch/i386/kernel/crash.c | 4 ++++ arch/i386/kernel/machine_kexec.c | 38 ++++++++++++++++++++++++++++++++++++++ 2 files changed, 42 insertions(+) Index: kexec-2.6.16/arch/i386/kernel/crash.c =================================================================== --- kexec-2.6.16.orig/arch/i386/kernel/crash.c +++ kexec-2.6.16/arch/i386/kernel/crash.c @@ -90,6 +90,7 @@ static void crash_save_self(struct pt_re crash_save_this_cpu(regs, cpu); } +#ifndef CONFIG_XEN #ifdef CONFIG_SMP static atomic_t waiting_for_crash_ipi; @@ -158,6 +159,7 @@ static void nmi_shootdown_cpus(void) /* There are no cpus to shootdown */ } #endif +#endif /* CONFIG_XEN */ void machine_crash_shutdown(struct pt_regs *regs) { @@ -174,10 +176,12 @@ void machine_crash_shutdown(struct pt_re /* Make a note of crashing cpu. 
Will be used in NMI callback.*/ crashing_cpu = smp_processor_id(); +#ifndef CONFIG_XEN nmi_shootdown_cpus(); lapic_shutdown(); #if defined(CONFIG_X86_IO_APIC) disable_IO_APIC(); #endif +#endif /* CONFIG_XEN */ crash_save_self(regs); } Index: kexec-2.6.16/arch/i386/kernel/machine_kexec.c =================================================================== --- kexec-2.6.16.orig/arch/i386/kernel/machine_kexec.c +++ kexec-2.6.16/arch/i386/kernel/machine_kexec.c @@ -19,6 +19,10 @@ #include <asm/desc.h> #include <asm/system.h> +#ifdef CONFIG_XEN +#include <xen/interface/kexec.h> +#endif + #define PAGE_ALIGNED __attribute__ ((__aligned__(PAGE_SIZE))) static u32 kexec_pgd[1024] PAGE_ALIGNED; #ifdef CONFIG_X86_PAE @@ -54,6 +58,40 @@ void machine_kexec_cleanup(struct kimage { } +#ifdef CONFIG_XEN + +#define __ma(x) (pfn_to_mfn(__pa((x)) >> PAGE_SHIFT) << PAGE_SHIFT) + +#if PAGES_NR > KEXEC_XEN_NO_PAGES +#error PAGES_NR is greater than KEXEC_XEN_NO_PAGES - Xen support will break +#endif + +#if PA_CONTROL_PAGE != 0 +#error PA_CONTROL_PAGE is non zero - Xen support will break +#endif + +void machine_kexec_setup_load_arg(xen_kexec_image_t *xki, struct kimage *image) +{ + void *control_page; + + memset(xki->page_list, 0, sizeof(xki->page_list)); + + control_page = page_address(image->control_code_page); + memcpy(control_page, relocate_kernel, PAGE_SIZE); + + xki->page_list[PA_CONTROL_PAGE] = __ma(control_page); + xki->page_list[PA_PGD] = __ma(kexec_pgd); +#ifdef CONFIG_X86_PAE + xki->page_list[PA_PMD_0] = __ma(kexec_pmd0); + xki->page_list[PA_PMD_1] = __ma(kexec_pmd1); +#endif + xki->page_list[PA_PTE_0] = __ma(kexec_pte0); + xki->page_list[PA_PTE_1] = __ma(kexec_pte1); + +} + +#endif /* CONFIG_XEN */ + /* * Do not allocate memory (or fail in any way) in machine_kexec(). * We are past the point of no return, committed to rebooting now. 

[-- Attachment #4: linux-2.6.19-rc1-kexec-xen-x86_64.patch --]
[-- Type: text/x-patch, Size: 5915 bytes --]

---
 arch/x86_64/kernel/crash.c         |    6 +
 arch/x86_64/kernel/machine_kexec.c |  116 +++++++++++++++++++++++++++++++++++--
 2 files changed, 115 insertions(+), 7 deletions(-)

Index: kexec-2.6.16/arch/x86_64/kernel/crash.c
===================================================================
--- kexec-2.6.16.orig/arch/x86_64/kernel/crash.c
+++ kexec-2.6.16/arch/x86_64/kernel/crash.c
@@ -92,6 +92,7 @@ static void crash_save_self(struct pt_re
 	crash_save_this_cpu(regs, cpu);
 }
 
+#ifndef CONFIG_XEN
 #ifdef CONFIG_SMP
 static atomic_t waiting_for_crash_ipi;
 
@@ -156,6 +157,7 @@ static void nmi_shootdown_cpus(void)
 	/* There are no cpus to shootdown */
 }
 #endif
+#endif /* CONFIG_XEN */
 
 void machine_crash_shutdown(struct pt_regs *regs)
 {
@@ -173,6 +175,8 @@ void machine_crash_shutdown(struct pt_re
 
 	/* Make a note of crashing cpu. Will be used in NMI callback.*/
 	crashing_cpu = smp_processor_id();
+
+#ifndef CONFIG_XEN
 	nmi_shootdown_cpus();
 
 	if(cpu_has_apic)
@@ -181,6 +185,6 @@ void machine_crash_shutdown(struct pt_re
 #if defined(CONFIG_X86_IO_APIC)
 	disable_IO_APIC();
 #endif
-
+#endif /* CONFIG_XEN */
 	crash_save_self(regs);
 }
Index: kexec-2.6.16/arch/x86_64/kernel/machine_kexec.c
===================================================================
--- kexec-2.6.16.orig/arch/x86_64/kernel/machine_kexec.c
+++ kexec-2.6.16/arch/x86_64/kernel/machine_kexec.c
@@ -24,6 +24,104 @@ static u64 kexec_pud1[512] PAGE_ALIGNED;
 static u64 kexec_pmd1[512] PAGE_ALIGNED;
 static u64 kexec_pte1[512] PAGE_ALIGNED;
 
+#ifdef CONFIG_XEN
+
+/* In the case of Xen, override hypervisor functions to be able to create
+ * a regular identity mapping page table...
+ */
+
+#include <xen/interface/kexec.h>
+#include <xen/interface/memory.h>
+
+#define x__pmd(x) ((pmd_t) { (x) } )
+#define x__pud(x) ((pud_t) { (x) } )
+#define x__pgd(x) ((pgd_t) { (x) } )
+
+#define x_pmd_val(x) ((x).pmd)
+#define x_pud_val(x) ((x).pud)
+#define x_pgd_val(x) ((x).pgd)
+
+static inline void x_set_pmd(pmd_t *dst, pmd_t val)
+{
+	x_pmd_val(*dst) = x_pmd_val(val);
+}
+
+static inline void x_set_pud(pud_t *dst, pud_t val)
+{
+	x_pud_val(*dst) = phys_to_machine(x_pud_val(val));
+}
+
+static inline void x_pud_clear (pud_t *pud)
+{
+	x_pud_val(*pud) = 0;
+}
+
+static inline void x_set_pgd(pgd_t *dst, pgd_t val)
+{
+	x_pgd_val(*dst) = phys_to_machine(x_pgd_val(val));
+}
+
+static inline void x_pgd_clear (pgd_t * pgd)
+{
+	x_pgd_val(*pgd) = 0;
+}
+
+#define X__PAGE_KERNEL_LARGE_EXEC \
+	_PAGE_PRESENT | _PAGE_RW | _PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_PSE
+#define X_KERNPG_TABLE _PAGE_PRESENT | _PAGE_RW | _PAGE_ACCESSED | _PAGE_DIRTY
+
+#define __ma(x) (pfn_to_mfn(__pa((x)) >> PAGE_SHIFT) << PAGE_SHIFT)
+
+#if PAGES_NR > KEXEC_XEN_NO_PAGES
+#error PAGES_NR is greater than KEXEC_XEN_NO_PAGES - Xen support will break
+#endif
+
+#if PA_CONTROL_PAGE != 0
+#error PA_CONTROL_PAGE is non zero - Xen support will break
+#endif
+
+void machine_kexec_setup_load_arg(xen_kexec_image_t *xki, struct kimage *image)
+{
+	void *control_page;
+	void *table_page;
+
+	memset(xki->page_list, 0, sizeof(xki->page_list));
+
+	control_page = page_address(image->control_code_page) + PAGE_SIZE;
+	memcpy(control_page, relocate_kernel, PAGE_SIZE);
+
+	table_page = page_address(image->control_code_page);
+
+	xki->page_list[PA_CONTROL_PAGE] = __ma(control_page);
+	xki->page_list[PA_TABLE_PAGE] = __ma(table_page);
+
+	xki->page_list[PA_PGD] = __ma(kexec_pgd);
+	xki->page_list[PA_PUD_0] = __ma(kexec_pud0);
+	xki->page_list[PA_PUD_1] = __ma(kexec_pud1);
+	xki->page_list[PA_PMD_0] = __ma(kexec_pmd0);
+	xki->page_list[PA_PMD_1] = __ma(kexec_pmd1);
+	xki->page_list[PA_PTE_0] = __ma(kexec_pte0);
+	xki->page_list[PA_PTE_1] = __ma(kexec_pte1);
+}
+
+#else /* CONFIG_XEN */
+
+#define x__pmd(x) __pmd(x)
+#define x__pud(x) __pud(x)
+#define x__pgd(x) __pgd(x)
+
+#define x_set_pmd(x, y) set_pmd(x, y)
+#define x_set_pud(x, y) set_pud(x, y)
+#define x_set_pgd(x, y) set_pgd(x, y)
+
+#define x_pud_clear(x) pud_clear(x)
+#define x_pgd_clear(x) pgd_clear(x)
+
+#define X__PAGE_KERNEL_LARGE_EXEC __PAGE_KERNEL_LARGE_EXEC
+#define X_KERNPG_TABLE _KERNPG_TABLE
+
+#endif /* CONFIG_XEN */
+
 static void init_level2_page(pmd_t *level2p, unsigned long addr)
 {
 	unsigned long end_addr;
@@ -31,7 +129,7 @@ static void init_level2_page(pmd_t *leve
 	addr &= PAGE_MASK;
 	end_addr = addr + PUD_SIZE;
 	while (addr < end_addr) {
-		set_pmd(level2p++, __pmd(addr | __PAGE_KERNEL_LARGE_EXEC));
+		x_set_pmd(level2p++, x__pmd(addr | X__PAGE_KERNEL_LARGE_EXEC));
 		addr += PMD_SIZE;
 	}
 }
@@ -56,12 +154,12 @@ static int init_level3_page(struct kimag
 		}
 		level2p = (pmd_t *)page_address(page);
 		init_level2_page(level2p, addr);
-		set_pud(level3p++, __pud(__pa(level2p) | _KERNPG_TABLE));
+		x_set_pud(level3p++, x__pud(__pa(level2p) | X_KERNPG_TABLE));
 		addr += PUD_SIZE;
 	}
 	/* clear the unused entries */
 	while (addr < end_addr) {
-		pud_clear(level3p++);
+		x_pud_clear(level3p++);
 		addr += PUD_SIZE;
 	}
 out:
@@ -92,12 +190,12 @@ static int init_level4_page(struct kimag
 		if (result) {
 			goto out;
 		}
-		set_pgd(level4p++, __pgd(__pa(level3p) | _KERNPG_TABLE));
+		x_set_pgd(level4p++, x__pgd(__pa(level3p) | X_KERNPG_TABLE));
 		addr += PGDIR_SIZE;
 	}
 	/* clear the unused entries */
 	while (addr < end_addr) {
-		pgd_clear(level4p++);
+		x_pgd_clear(level4p++);
 		addr += PGDIR_SIZE;
 	}
 out:
@@ -108,8 +206,14 @@ out:
 static int init_pgtable(struct kimage *image, unsigned long start_pgtable)
 {
 	pgd_t *level4p;
+	unsigned long x_end_pfn = end_pfn;
+
+#ifdef CONFIG_XEN
+	x_end_pfn = HYPERVISOR_memory_op(XENMEM_maximum_ram_page, NULL);
+#endif
+
 	level4p = (pgd_t *)__va(start_pgtable);
-	return init_level4_page(image, level4p, 0, end_pfn << PAGE_SHIFT);
+	return init_level4_page(image, level4p, 0, x_end_pfn << PAGE_SHIFT);
 }
 
 int machine_kexec_prepare(struct kimage *image)

[-- Attachment #5: xen-sparse-kexec-fixes.diff --]
[-- Type: text/x-patch, Size: 2684 bytes --]

---
 drivers/xen/core/machine_kexec.c |   42 ++++++++++++++++++++++++++++++++++++---
 1 file changed, 39 insertions(+), 3 deletions(-)

Index: kexec-2.6.16/drivers/xen/core/machine_kexec.c
===================================================================
--- kexec-2.6.16.orig/drivers/xen/core/machine_kexec.c
+++ kexec-2.6.16/drivers/xen/core/machine_kexec.c
@@ -11,6 +11,7 @@
 
 extern void machine_kexec_setup_load_arg(xen_kexec_image_t *xki,
 					 struct kimage *image);
+static void xen0_set_hooks(void);
 
 int xen_max_nr_phys_cpus;
 struct resource xen_hypervisor_res;
@@ -24,6 +25,7 @@ void xen_machine_kexec_setup_resources(v
 
 	if (!is_initial_xendomain())
 		return;
+	xen0_set_hooks();
 
 	/* determine maximum number of physical cpus */
 
@@ -124,7 +126,7 @@ static void setup_load_arg(xen_kexec_ima
  * is currently called too early. It might make sense
  * to move prepare, but for now, just add an extra hook.
  */
-int xen_machine_kexec_load(struct kimage *image)
+static int xen0_machine_kexec_load(struct kimage *image)
 {
 	xen_kexec_load_t xkl;
 
@@ -140,7 +142,7 @@ int xen_machine_kexec_load(struct kimage
  * is called too late, and its possible xen could try and kdump
  * using resources that have been freed.
  */
-void xen_machine_kexec_unload(struct kimage *image)
+static void xen0_machine_kexec_unload(struct kimage *image)
 {
 	xen_kexec_load_t xkl;
 
@@ -157,7 +159,7 @@ void xen_machine_kexec_unload(struct kim
  * stop all CPUs and kexec. That is it combines machine_shutdown()
  * and machine_kexec() in Linux kexec terms.
  */
-NORET_TYPE void xen_machine_kexec(struct kimage *image)
+static NORET_TYPE ATTRIB_NORET void xen0_machine_kexec(struct kimage *image)
 {
 	xen_kexec_exec_t xke;
 
@@ -172,6 +174,40 @@ void machine_shutdown(void)
 	/* do nothing */
 }
 
+static unsigned long xen0_page_to_pfn(struct page *page)
+{
+	return pfn_to_mfn(page_to_pfn(page));
+}
+
+static struct page* xen0_pfn_to_page(unsigned long pfn)
+{
+	return pfn_to_page(mfn_to_pfn(pfn));
+}
+
+static unsigned long xen0_virt_to_phys(void *addr)
+{
+	return virt_to_machine(addr);
+}
+
+static void* xen0_phys_to_virt(unsigned long addr)
+{
+	return phys_to_virt(machine_to_phys(addr));
+}
+
+
+static void xen0_set_hooks(void)
+{
+	kexec_ops.kpage_to_pfn = xen0_page_to_pfn;
+	kexec_ops.kpfn_to_page = xen0_pfn_to_page;
+	kexec_ops.kvirt_to_phys = xen0_virt_to_phys;
+	kexec_ops.kphys_to_virt = xen0_phys_to_virt;
+
+	kexec_ops.kexec = xen0_machine_kexec;
+	kexec_ops.kexec_load = xen0_machine_kexec_load;
+	kexec_ops.kexec_unload = xen0_machine_kexec_unload;
+
+	printk("%s: kexec hook setup done\n", __FUNCTION__);
+}
 
 /*
  * Local variables:

[-- Attachment #6: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
* Re: kexec trouble
  2006-12-08 10:01 ` Gerd Hoffmann
@ 2006-12-08 10:24 ` Ian Campbell
  2006-12-08 11:28 ` Gerd Hoffmann
  0 siblings, 1 reply; 20+ messages in thread
From: Ian Campbell @ 2006-12-08 10:24 UTC
  To: Gerd Hoffmann; +Cc: Magnus Damm, Xen devel list, Magnus Damm, Horms

Hi Gerd,

On Fri, 2006-12-08 at 11:01 +0100, Gerd Hoffmann wrote:
> Updated patches attached.

Unfortunately I'm just about to push a changeset which moves the contents
of these patches:
        patches/linux-2.6.16.33/kexec-generic.patch
        patches/linux-2.6.16.33/linux-2.6.19-rc1-kexec-xen-i386.patch
        patches/linux-2.6.16.33/linux-2.6.19-rc1-kexec-xen-x86_64.patch
into the sparse tree where they belong. Sorry for moving the ground
under you.

Also due to the freeze we won't be able to take these changes until
after 3.0.4 is released.

Cheers,
Ian.

* Re: kexec trouble
  2006-12-08 10:24 ` Ian Campbell
@ 2006-12-08 11:28 ` Gerd Hoffmann
  2006-12-08 11:32 ` Keir Fraser
  ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: Gerd Hoffmann @ 2006-12-08 11:28 UTC
  To: Ian Campbell; +Cc: Magnus Damm, Xen devel list, Magnus Damm, Horms

Ian Campbell wrote:
> Hi Gerd,
>
> On Fri, 2006-12-08 at 11:01 +0100, Gerd Hoffmann wrote:
>> Updated patches attached.
>
> Unfortunately I'm just about to push a changeset which moves the contents
> of these patches:
>         patches/linux-2.6.16.33/kexec-generic.patch
>         patches/linux-2.6.16.33/linux-2.6.19-rc1-kexec-xen-i386.patch
>         patches/linux-2.6.16.33/linux-2.6.19-rc1-kexec-xen-x86_64.patch
> into the sparse tree where they belong. Sorry for moving the ground
> under you.

Oh, that is fine. Makes it easier for me; I can also fold my
changes into a single patch for the sparse tree then, which is likely
smaller and easier to review ;)

Your changes are not in the public tree yet though ....

cheers,
  Gerd

-- 
Gerd Hoffmann <kraxel@suse.de>

* Re: kexec trouble
  2006-12-08 11:28 ` Gerd Hoffmann
@ 2006-12-08 11:32 ` Keir Fraser
  2006-12-08 11:52 ` Ian Campbell
  2006-12-08 15:49 ` Ian Campbell
  2 siblings, 0 replies; 20+ messages in thread
From: Keir Fraser @ 2006-12-08 11:32 UTC
  To: Gerd Hoffmann, Ian Campbell
  Cc: Magnus Damm, Xen devel list, Magnus Damm, Horms

On 8/12/06 11:28, "Gerd Hoffmann" <kraxel@suse.de> wrote:

> Oh, that is fine. Makes it easier for me; I can also fold my
> changes into a single patch for the sparse tree then, which is likely
> smaller and easier to review ;)
>
> Your changes are not in the public tree yet though ....

The staging tree is stalled for some reason. I'm not sure whether there's a
systematic problem or just a few random problems in a row... We'll look into
it this afternoon so we can get stuff pushed to the public tree later today.

 -- Keir

* Re: kexec trouble
  2006-12-08 11:28 ` Gerd Hoffmann
  2006-12-08 11:32 ` Keir Fraser
@ 2006-12-08 11:52 ` Ian Campbell
  2006-12-08 15:49 ` Ian Campbell
  2 siblings, 0 replies; 20+ messages in thread
From: Ian Campbell @ 2006-12-08 11:52 UTC
  To: Gerd Hoffmann; +Cc: Magnus Damm, Xen devel list, Magnus Damm, Horms

On Fri, 2006-12-08 at 12:28 +0100, Gerd Hoffmann wrote:
> Ian Campbell wrote:
> > Hi Gerd,
> >
> > On Fri, 2006-12-08 at 11:01 +0100, Gerd Hoffmann wrote:
> >> Updated patches attached.
> >
> > Unfortunately I'm just about to push a changeset which moves the contents
> > of these patches:
> >         patches/linux-2.6.16.33/kexec-generic.patch
> >         patches/linux-2.6.16.33/linux-2.6.19-rc1-kexec-xen-i386.patch
> >         patches/linux-2.6.16.33/linux-2.6.19-rc1-kexec-xen-x86_64.patch
> > into the sparse tree where they belong. Sorry for moving the ground
> > under you.
>
> Oh, that is fine. Makes it easier for me; I can also fold my
> changes into a single patch for the sparse tree then, which is likely
> smaller and easier to review ;)
>
> Your changes are not in the public tree yet though ....

I was delayed a bit in pushing them; they are in now. Hopefully that
will be unwedged and flow through this afternoon.

Cheers,
Ian.

* Re: kexec trouble
  2006-12-08 11:28 ` Gerd Hoffmann
  2006-12-08 11:32 ` Keir Fraser
  2006-12-08 11:52 ` Ian Campbell
@ 2006-12-08 15:49 ` Ian Campbell
  2 siblings, 0 replies; 20+ messages in thread
From: Ian Campbell @ 2006-12-08 15:49 UTC
  To: Gerd Hoffmann; +Cc: Magnus Damm, Xen devel list, Magnus Damm, Horms

On Fri, 2006-12-08 at 12:28 +0100, Gerd Hoffmann wrote:
> Your changes are not in the public tree yet though ....

Should be there now.

Cheers,
Ian.

* Re: kexec trouble
  2006-12-05 15:53 ` Magnus Damm
  2006-12-05 16:55 ` Gerd Hoffmann
@ 2006-12-06 8:37 ` Keir Fraser
  2006-12-06 9:08 ` Magnus Damm
  1 sibling, 1 reply; 20+ messages in thread
From: Keir Fraser @ 2006-12-06 8:37 UTC
  To: Magnus Damm, Gerd Hoffmann; +Cc: Magnus Damm, Xen devel list

On 5/12/06 3:53 pm, "Magnus Damm" <magnus.damm@gmail.com> wrote:

>> I think we need either wrapper functions for machine_kexec_* functions
>> which dispatch to the correct function depending on the environment
>> (dom0 vs domU, later also native) or just make them function pointers to
>> achieve the same effect. Same goes for the KEXEC_ARCH_HAS_PAGE_MACROS
>> stuff. IMHO "#ifdef CONFIG_XEN" should go away from the core code (i.e.
>> kernel/kexec.c).
>
> You mean for the paravirt stuff? Isn't paravirt basically a set of
> callbacks that you can register? If so, what is stopping us from
> registering a set of paravirt callbacks for the kexec code?

I think partly Gerd's point is that CONFIG_XEN in kernel/kexec.c will never
get merged upstream. Guaranteed.

The kexec/kdump patches are not very tidy in some respects like this. We
applied them now because the functionality is useful, but I don't think we
yet have the finished polished article. Also you got away with it because
the code changes were hidden in the patches/ directory, which you originally
said was simply backported code from 2.6.19 (not backported-and-hacked-on!).

 -- Keir

* Re: kexec trouble
  2006-12-06 8:37 ` Keir Fraser
@ 2006-12-06 9:08 ` Magnus Damm
  0 siblings, 0 replies; 20+ messages in thread
From: Magnus Damm @ 2006-12-06 9:08 UTC
  To: Keir Fraser; +Cc: Gerd Hoffmann, Xen devel list, Magnus Damm

On 12/6/06, Keir Fraser <keir@xensource.com> wrote:
> On 5/12/06 3:53 pm, "Magnus Damm" <magnus.damm@gmail.com> wrote:
>
> >> I think we need either wrapper functions for machine_kexec_* functions
> >> which dispatch to the correct function depending on the environment
> >> (dom0 vs domU, later also native) or just make them function pointers to
> >> achieve the same effect. Same goes for the KEXEC_ARCH_HAS_PAGE_MACROS
> >> stuff. IMHO "#ifdef CONFIG_XEN" should go away from the core code (i.e.
> >> kernel/kexec.c).
> >
> > You mean for the paravirt stuff? Isn't paravirt basically a set of
> > callbacks that you can register? If so, what is stopping us from
> > registering a set of paravirt callbacks for the kexec code?
>
> I think partly Gerd's point is that CONFIG_XEN in kernel/kexec.c will never
> get merged upstream. Guaranteed.

Sure, I understand that. But I see this as an iterative process, where
our code so far has been written to fit the current codebase. When
dom0 runs on paravirt and we can test the code then it should be
adjusted. It's kind of hard to write for something that doesn't yet
exist. =) So regardless of how you do it, you still need to adjust your
code towards the new interface in the end - it's just a matter of how
much code you need to adjust.

I'm all for converting the code into using runtime checks or callbacks
if that is needed, and I would have done so in the first place if I'd
known that it was something that you guys wanted. But I didn't, so we
used the simplest possible solution instead, which was CONFIG_XEN.

> The kexec/kdump patches are not very tidy in some respects like this. We
> applied them now because the functionality is useful, but I don't think we
> yet have the finished polished article. Also you got away with it because
> the code changes were hidden in the patches/ directory, which you originally
> said was simply backported code from 2.6.19 (not backported-and-hacked-on!).

The git-patches are backports. The other ones are not:

http://lists.xensource.com/archives/html/xen-devel/2006-10/msg01240.html

/ magnus

end of thread, other threads:[~2006-12-08 15:49 UTC | newest]

Thread overview: 20+ messages
2006-12-05 14:37 kexec trouble Gerd Hoffmann
2006-12-05 15:53 ` Magnus Damm
2006-12-05 16:55 ` Gerd Hoffmann
2006-12-06 4:08 ` Magnus Damm
2006-12-06 8:48 ` Gerd Hoffmann
2006-12-06 9:41 ` Magnus Damm
2006-12-06 10:31 ` Gerd Hoffmann
2006-12-06 11:11 ` Magnus Damm
2006-12-06 13:23 ` Gerd Hoffmann
2006-12-06 13:40 ` Muli Ben-Yehuda
2006-12-07 11:24 ` Gerd Hoffmann
2006-12-08 4:15 ` Magnus Damm
2006-12-08 10:01 ` Gerd Hoffmann
2006-12-08 10:24 ` Ian Campbell
2006-12-08 11:28 ` Gerd Hoffmann
2006-12-08 11:32 ` Keir Fraser
2006-12-08 11:52 ` Ian Campbell
2006-12-08 15:49 ` Ian Campbell
2006-12-06 8:37 ` Keir Fraser
2006-12-06 9:08 ` Magnus Damm