* [PATCH 1/3] kexec: extend hypercall with improved load/unload ops
2013-01-16 16:29 [RFC PATCH 0/3] Improve kexec support in Xen hypervisor David Vrabel
@ 2013-01-16 16:29 ` David Vrabel
2013-01-17 12:28 ` Daniel Kiper
2013-01-17 12:33 ` [Xen-devel] " Ian Campbell
2013-01-16 16:29 ` [PATCH 2/3] kexec: remove kexec_load and kexec_unload ops David Vrabel
` (4 subsequent siblings)
5 siblings, 2 replies; 24+ messages in thread
From: David Vrabel @ 2013-01-16 16:29 UTC (permalink / raw)
To: xen-devel; +Cc: Daniel Kiper, kexec, David Vrabel, Eric Biederman
From: David Vrabel <david.vrabel@citrix.com>
In the existing kexec hypercall, the load and unload ops depend on
internals of the Linux kernel (the page list and code page provided by
the kernel). The code page is used to transition between Xen context
and the image so using kernel code doesn't make sense and will not
work for PVH guests.
Add replacement KEXEC_CMD_kexec_load_v2 and KEXEC_CMD_kexec_unload_v2
ops that no longer require a code page to be provided by the guest --
Xen now provides the code for calling the image directly.
The load_v2 op looks similar to the Linux kexec_load system call and
allows the guest to provide the image data to be loaded into the crash
kernel memory region. The guest may also specify whether the image is
64-bit or 32-bit.
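For illustration only (not part of the patch): the way load_v2 copies a segment into the crash region one page at a time can be sketched in plain C as below. The sketch assumes 4 KiB pages, and the names SKETCH_PAGE_SIZE, chunk_len() and count_chunks() are made up for this example. The real code maps each destination frame and copies from guest memory rather than just counting pieces.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096UL  /* assumption: 4 KiB pages, as on x86 */

/* Bytes that can be copied to 'dest' without crossing a page boundary. */
size_t chunk_len(uint64_t dest, size_t to_copy)
{
    size_t room = SKETCH_PAGE_SIZE - (size_t)(dest & (SKETCH_PAGE_SIZE - 1));

    return to_copy < room ? to_copy : room;
}

/* Count the per-page pieces needed to copy 'size' bytes starting at 'dest'. */
unsigned int count_chunks(uint64_t dest, size_t size)
{
    unsigned int n = 0;

    while ( size )
    {
        size_t len = chunk_len(dest, size);

        size -= len;
        dest += len;
        n++;
    }

    return n;
}
```

A page-aligned 4 KiB segment needs one piece, while the same segment starting halfway into a page needs two.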
The toolstack can now load images without kernel involvement. This is
required for supporting kexec of crash kernels from PV-ops kernels.
Note: This also changes the behaviour of the kexec op when an image is
loaded with the old ABI. The code page will no longer be used, which
may result in incorrect behaviour in non-Linux guests. This allowed
the code to be simpler, and support for the old ABI is being removed in
a subsequent patch anyway.
[ This is a prototype and has the following limitations:
- no compat implementation for kexec_load_v2.
- 64-bit images are not supported.
- 32-bit images are called with paging enabled (Linux starts 32-bit
images with paging disabled). ]
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
---
xen/arch/x86/machine_kexec.c | 73 ++------------
xen/arch/x86/x86_64/compat_kexec.S | 25 -----
xen/common/kexec.c | 204 ++++++++++++++++++++++++++++++++++--
xen/include/public/kexec.h | 44 ++++++++
xen/include/xen/kexec.h | 18 ++--
5 files changed, 255 insertions(+), 109 deletions(-)
diff --git a/xen/arch/x86/machine_kexec.c b/xen/arch/x86/machine_kexec.c
index 8191ef1..7131d20 100644
--- a/xen/arch/x86/machine_kexec.c
+++ b/xen/arch/x86/machine_kexec.c
@@ -12,62 +12,16 @@
#include <asm/fixmap.h>
#include <asm/hpet.h>
-typedef void (*relocate_new_kernel_t)(
- unsigned long indirection_page,
- unsigned long *page_list,
- unsigned long start_address,
- unsigned int preserve_context);
-
-int machine_kexec_load(int type, int slot, xen_kexec_image_t *image)
+int machine_kexec_load(struct kexec_image *image)
{
- unsigned long prev_ma = 0;
- int fix_base = FIX_KEXEC_BASE_0 + (slot * (KEXEC_XEN_NO_PAGES >> 1));
- int k;
-
- /* setup fixmap to point to our pages and record the virtual address
- * in every odd index in page_list[].
- */
-
- for ( k = 0; k < KEXEC_XEN_NO_PAGES; k++ )
- {
- if ( (k & 1) == 0 )
- {
- /* Even pages: machine address. */
- prev_ma = image->page_list[k];
- }
- else
- {
- /* Odd pages: va for previous ma. */
- if ( is_pv_32on64_domain(dom0) )
- {
- /*
- * The compatability bounce code sets up a page table
- * with a 1-1 mapping of the first 1G of memory so
- * VA==PA here.
- *
- * This Linux purgatory code still sets up separate
- * high and low mappings on the control page (entries
- * 0 and 1) but it is harmless if they are equal since
- * that PT is not live at the time.
- */
- image->page_list[k] = prev_ma;
- }
- else
- {
- set_fixmap(fix_base + (k >> 1), prev_ma);
- image->page_list[k] = fix_to_virt(fix_base + (k >> 1));
- }
- }
- }
-
return 0;
}
-void machine_kexec_unload(int type, int slot, xen_kexec_image_t *image)
+void machine_kexec_unload(struct kexec_image *image)
{
}
-void machine_reboot_kexec(xen_kexec_image_t *image)
+void machine_reboot_kexec(struct kexec_image *image)
{
BUG_ON(smp_processor_id() != 0);
smp_send_stop();
@@ -75,7 +29,7 @@ void machine_reboot_kexec(xen_kexec_image_t *image)
BUG();
}
-void machine_kexec(xen_kexec_image_t *image)
+void machine_kexec(struct kexec_image *image)
{
struct desc_ptr gdt_desc = {
.base = (unsigned long)(boot_cpu_gdt_table - FIRST_RESERVED_GDT_ENTRY),
@@ -116,22 +70,11 @@ void machine_kexec(xen_kexec_image_t *image)
*/
asm volatile ( "lgdt %0" : : "m" (gdt_desc) );
- if ( is_pv_32on64_domain(dom0) )
- {
- compat_machine_kexec(image->page_list[1],
- image->indirection_page,
- image->page_list,
- image->start_address);
- }
+ if ( image->class == KEXEC_CLASS_32 )
+ compat_machine_kexec(image->entry_maddr);
else
- {
- relocate_new_kernel_t rnk;
-
- rnk = (relocate_new_kernel_t) image->page_list[1];
- (*rnk)(image->indirection_page, image->page_list,
- image->start_address,
- 0 /* preserve_context */);
- }
+ /* FIXME */
+ panic("KEXEC_CLASS_64 not yet supported\n");
}
int machine_kexec_get(xen_kexec_range_t *range)
diff --git a/xen/arch/x86/x86_64/compat_kexec.S b/xen/arch/x86/x86_64/compat_kexec.S
index fc92af9..d853231 100644
--- a/xen/arch/x86/x86_64/compat_kexec.S
+++ b/xen/arch/x86/x86_64/compat_kexec.S
@@ -36,21 +36,6 @@
ENTRY(compat_machine_kexec)
/* x86/64 x86/32 */
/* %rdi - relocate_new_kernel_t CALL */
- /* %rsi - indirection page 4(%esp) */
- /* %rdx - page_list 8(%esp) */
- /* %rcx - start address 12(%esp) */
- /* cpu has pae 16(%esp) */
-
- /* Shim the 64 bit page_list into a 32 bit page_list. */
- mov $12,%r9
- lea compat_page_list(%rip), %rbx
-1: dec %r9
- movl (%rdx,%r9,8),%eax
- movl %eax,(%rbx,%r9,4)
- test %r9,%r9
- jnz 1b
-
- RELOCATE_SYM(compat_page_list,%rdx)
/* Relocate compatibility mode entry point address. */
RELOCATE_MEM(compatibility_mode_far,%eax)
@@ -118,12 +103,6 @@ compatibility_mode:
movl %eax, %gs
movl %eax, %ss
- /* Push arguments onto stack. */
- pushl $0 /* 20(%esp) - preserve context */
- pushl $1 /* 16(%esp) - cpu has pae */
- pushl %ecx /* 12(%esp) - start address */
- pushl %edx /* 8(%esp) - page list */
- pushl %esi /* 4(%esp) - indirection page */
pushl %edi /* 0(%esp) - CALL */
/* Disable paging and therefore leave 64 bit mode. */
@@ -153,10 +132,6 @@ compatibility_mode:
ud2
.data
- .align 4
-compat_page_list:
- .fill 12,4,0
-
.align 32,0
/*
diff --git a/xen/common/kexec.c b/xen/common/kexec.c
index 25ebd6a..56bf8b4 100644
--- a/xen/common/kexec.c
+++ b/xen/common/kexec.c
@@ -45,7 +45,7 @@ static Elf_Note *xen_crash_note;
static cpumask_t crash_saved_cpus;
-static xen_kexec_image_t kexec_image[KEXEC_IMAGE_NR];
+static struct kexec_image kexec_image[KEXEC_IMAGE_NR];
#define KEXEC_FLAG_DEFAULT_POS (KEXEC_IMAGE_NR + 0)
#define KEXEC_FLAG_CRASH_POS (KEXEC_IMAGE_NR + 1)
@@ -316,7 +316,7 @@ void kexec_crash(void)
static long kexec_reboot(void *_image)
{
- xen_kexec_image_t *image = _image;
+ struct kexec_image *image = _image;
kexecing = TRUE;
@@ -732,9 +732,19 @@ static void crash_save_vmcoreinfo(void)
#endif
}
+static void kexec_unload_slot(int slot)
+{
+ struct kexec_image *image = &kexec_image[slot];
+
+ if ( test_and_clear_bit(slot, &kexec_flags) )
+ {
+ machine_kexec_unload(image);
+ }
+}
+
static int kexec_load_unload_internal(unsigned long op, xen_kexec_load_t *load)
{
- xen_kexec_image_t *image;
+ struct kexec_image *image;
int base, bit, pos;
int ret = 0;
@@ -750,9 +760,13 @@ static int kexec_load_unload_internal(unsigned long op, xen_kexec_load_t *load)
BUG_ON(test_bit((base + !pos), &kexec_flags)); /* must be free */
- memcpy(image, &load->image, sizeof(*image));
+ if ( is_pv_32on64_domain(dom0) )
+ image->class = KEXEC_CLASS_32;
+ else
+ image->class = KEXEC_CLASS_64;
+ image->entry_maddr = load->image.start_address;
- if ( !(ret = machine_kexec_load(load->type, base + !pos, image)) )
+ if ( !(ret = machine_kexec_load(image)) )
{
/* Set image present bit */
set_bit((base + !pos), &kexec_flags);
@@ -767,11 +781,7 @@ static int kexec_load_unload_internal(unsigned long op, xen_kexec_load_t *load)
/* Unload the old image if present and load successful */
if ( ret == 0 && !test_bit(KEXEC_FLAG_IN_PROGRESS, &kexec_flags) )
{
- if ( test_and_clear_bit((base + pos), &kexec_flags) )
- {
- image = &kexec_image[base + pos];
- machine_kexec_unload(load->type, base + pos, image);
- }
+ kexec_unload_slot(base + pos);
}
return ret;
@@ -816,7 +826,7 @@ static int kexec_load_unload_compat(unsigned long op,
static int kexec_exec(XEN_GUEST_HANDLE_PARAM(void) uarg)
{
xen_kexec_exec_t exec;
- xen_kexec_image_t *image;
+ struct kexec_image *image;
int base, bit, pos, ret = -EINVAL;
if ( unlikely(copy_from_guest(&exec, uarg, 1)) )
@@ -845,6 +855,162 @@ static int kexec_exec(XEN_GUEST_HANDLE_PARAM(void) uarg)
return -EINVAL; /* never reached */
}
+static int kexec_load_segments(xen_kexec_load_v2_t *load)
+{
+ unsigned s;
+ bool_t valid_entry = 0;
+
+ for ( s = 0; s < load->nr_segments; s++ )
+ {
+ xen_kexec_segment_t seg;
+ unsigned long to_copy;
+ unsigned long src_offset;
+ unsigned long dest;
+
+ if ( copy_from_guest_offset(&seg, load->segments, s, 1) )
+ return -EFAULT;
+
+ /* Caller is responsible for ensuring the crash space is
+ shared between multiple users of the load call. Xen just
+ validates the load is to somewhere within the region. */
+
+ if ( seg.dest_maddr < kexec_crash_area.start
+ || seg.dest_maddr + seg.size > kexec_crash_area.start + kexec_crash_area.size)
+ return -EINVAL;
+
+ if ( load->entry_maddr >= seg.dest_maddr
+ && load->entry_maddr < seg.dest_maddr + seg.size)
+ valid_entry = 1;
+
+ to_copy = seg.size;
+ src_offset = 0;
+ dest = seg.dest_maddr;
+
+ while ( to_copy )
+ {
+ unsigned long dest_mfn;
+ size_t dest_off;
+ void *dest_va;
+ size_t size;
+
+ dest_mfn = dest >> PAGE_SHIFT;
+ dest_off = dest & ~PAGE_MASK;
+
+ size = min(PAGE_SIZE - dest_off, to_copy);
+
+ dest_va = vmap(&dest_mfn, 1);
+ if ( dest_va == NULL )
+ return -EINVAL;
+
+ copy_from_guest_offset(dest_va, seg.buf, src_offset, size);
+
+ vunmap(dest_va);
+
+ to_copy -= size;
+ src_offset += size;
+ dest += size;
+ }
+ }
+
+ /* Entry point is somewhere in a loaded segment? */
+ if ( !valid_entry )
+ return -EINVAL;
+
+ return 0;
+}
+
+static int slot_to_pos_bit(int slot)
+{
+ return KEXEC_IMAGE_NR + slot / 2;
+}
+
+static int kexec_load_slot(int slot, xen_kexec_load_v2_t *load)
+{
+ struct kexec_image *image = &kexec_image[slot];
+ int ret;
+
+ BUG_ON(test_bit(slot, &kexec_flags)); /* must be free */
+
+ /* validate and load each segment. */
+ ret = kexec_load_segments(load);
+ if ( ret < 0 )
+ return ret;
+
+ image->entry_maddr = load->entry_maddr;
+
+ ret = machine_kexec_load(image);
+ if ( ret < 0 )
+ return ret;
+
+ /* Set image present bit */
+ set_bit(slot, &kexec_flags);
+
+ /* Make new image the active one */
+ change_bit(slot_to_pos_bit(slot), &kexec_flags);
+
+ crash_save_vmcoreinfo();
+
+ return ret;
+}
+
+
+static int kexec_load_v2(XEN_GUEST_HANDLE_PARAM(void) uarg)
+{
+ xen_kexec_load_v2_t load;
+ int base, bit, pos, slot;
+ struct kexec_image *image;
+ int ret;
+
+ if ( unlikely(copy_from_guest(&load, uarg, 1)) )
+ return -EFAULT;
+
+ if ( kexec_load_get_bits(load.type, &base, &bit) )
+ return -EINVAL;
+
+ pos = (test_bit(bit, &kexec_flags) != 0);
+ slot = base + !pos;
+ image = &kexec_image[slot];
+
+ switch ( load.class )
+ {
+ case KEXEC_CLASS_32:
+ case KEXEC_CLASS_64:
+ image->class = load.class;
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ ret = kexec_load_slot(slot, &load);
+
+ /* Unload the old image if present and load successful */
+ if ( ret == 0 && !test_bit(KEXEC_FLAG_IN_PROGRESS, &kexec_flags) )
+ {
+ kexec_unload_slot(slot ^ 0x1);
+ }
+
+ return ret;
+}
+
+static int kexec_unload_v2(XEN_GUEST_HANDLE_PARAM(void) uarg)
+{
+ xen_kexec_unload_v2_t unload;
+ int base, bit, pos, slot;
+
+ if ( unlikely(copy_from_guest(&unload, uarg, 1)) )
+ return -EFAULT;
+
+ if ( kexec_load_get_bits(unload.type, &base, &bit) )
+ return -EINVAL;
+
+ pos = (test_bit(bit, &kexec_flags) != 0);
+ slot = base + !pos;
+
+ kexec_unload_slot(slot);
+
+ return 0;
+}
+
static int do_kexec_op_internal(unsigned long op,
XEN_GUEST_HANDLE_PARAM(void) uarg,
bool_t compat)
@@ -882,6 +1048,22 @@ static int do_kexec_op_internal(unsigned long op,
case KEXEC_CMD_kexec:
ret = kexec_exec(uarg);
break;
+ case KEXEC_CMD_kexec_load_v2:
+ spin_lock_irqsave(&kexec_lock, flags);
+ if ( !test_bit(KEXEC_FLAG_IN_PROGRESS, &kexec_flags) )
+ ret = kexec_load_v2(uarg);
+ else
+ ret = -EBUSY;
+ spin_unlock_irqrestore(&kexec_lock, flags);
+ break;
+ case KEXEC_CMD_kexec_unload_v2:
+ spin_lock_irqsave(&kexec_lock, flags);
+ if ( !test_bit(KEXEC_FLAG_IN_PROGRESS, &kexec_flags) )
+ ret = kexec_unload_v2(uarg);
+ else
+ ret = -EBUSY;
+ spin_unlock_irqrestore(&kexec_lock, flags);
+ break;
}
return ret;
diff --git a/xen/include/public/kexec.h b/xen/include/public/kexec.h
index 61a8d7d..4b7d637 100644
--- a/xen/include/public/kexec.h
+++ b/xen/include/public/kexec.h
@@ -83,6 +83,8 @@
#define KEXEC_TYPE_DEFAULT 0
#define KEXEC_TYPE_CRASH 1
+#define KEXEC_CLASS_32 1 /* 32-bit image. */
+#define KEXEC_CLASS_64 2 /* 64-bit image. */
/* The kexec implementation for Xen allows the user to load two
* types of kernels, KEXEC_TYPE_DEFAULT and KEXEC_TYPE_CRASH.
@@ -152,6 +154,48 @@ typedef struct xen_kexec_range {
unsigned long start;
} xen_kexec_range_t;
+/*
+ * A contiguous chunk of a kexec image and its destination machine
+ * address.
+ */
+typedef struct xen_kexec_segment {
+ XEN_GUEST_HANDLE(const_void) buf;
+ uint32_t size;
+ uint64_t dest_maddr;
+} xen_kexec_segment_t;
+DEFINE_XEN_GUEST_HANDLE(xen_kexec_segment_t);
+
+/*
+ * Load a kexec image into memory.
+ *
+ * Each segment of the image must reside in the memory region reserved
+ * for kexec (KEXEC_RANGE_MA_CRASH) and the entry point must be within
+ * the image.
+ *
+ * The caller is responsible for ensuring that multiple images do not
+ * overlap.
+ */
+#define KEXEC_CMD_kexec_load_v2 4
+typedef struct xen_kexec_load_v2 {
+ uint8_t type; /* One of KEXEC_TYPE_* */
+ uint8_t class; /* One of KEXEC_CLASS_* */
+ uint32_t nr_segments;
+ XEN_GUEST_HANDLE(xen_kexec_segment_t) segments;
+ uint64_t entry_maddr; /* image entry point machine address. */
+} xen_kexec_load_v2_t;
+DEFINE_XEN_GUEST_HANDLE(xen_kexec_load_v2_t);
+
+/*
+ * Unload a kexec image.
+ *
+ * Type must be one of KEXEC_TYPE_DEFAULT or KEXEC_TYPE_CRASH.
+ */
+#define KEXEC_CMD_kexec_unload_v2 5
+typedef struct xen_kexec_unload_v2 {
+ uint32_t type;
+} xen_kexec_unload_v2_t;
+DEFINE_XEN_GUEST_HANDLE(xen_kexec_unload_v2_t);
+
#endif /* _XEN_PUBLIC_KEXEC_H */
/*
diff --git a/xen/include/xen/kexec.h b/xen/include/xen/kexec.h
index b3ca8b0..5c13af8 100644
--- a/xen/include/xen/kexec.h
+++ b/xen/include/xen/kexec.h
@@ -27,6 +27,11 @@ void set_kexec_crash_area_size(u64 system_ram);
#define KEXEC_IMAGE_CRASH_BASE 2
#define KEXEC_IMAGE_NR 4
+struct kexec_image {
+ uint8_t class;
+ unsigned long entry_maddr;
+};
+
enum low_crashinfo {
LOW_CRASHINFO_INVALID = 0,
LOW_CRASHINFO_NONE = 1,
@@ -40,11 +45,11 @@ extern enum low_crashinfo low_crashinfo_mode;
extern paddr_t crashinfo_maxaddr_bits;
void kexec_early_calculations(void);
-int machine_kexec_load(int type, int slot, xen_kexec_image_t *image);
-void machine_kexec_unload(int type, int slot, xen_kexec_image_t *image);
+int machine_kexec_load(struct kexec_image *image);
+void machine_kexec_unload(struct kexec_image *image);
void machine_kexec_reserved(xen_kexec_reserve_t *reservation);
-void machine_reboot_kexec(xen_kexec_image_t *image);
-void machine_kexec(xen_kexec_image_t *image);
+void machine_reboot_kexec(struct kexec_image *image);
+void machine_kexec(struct kexec_image *image);
void kexec_crash(void);
void kexec_crash_save_cpu(void);
crash_xen_info_t *kexec_crash_save_info(void);
@@ -52,10 +57,7 @@ void machine_crash_shutdown(void);
int machine_kexec_get(xen_kexec_range_t *range);
int machine_kexec_get_xen(xen_kexec_range_t *range);
-void compat_machine_kexec(unsigned long rnk,
- unsigned long indirection_page,
- unsigned long *page_list,
- unsigned long start_address);
+void compat_machine_kexec(unsigned long start_address);
/* vmcoreinfo stuff */
#define VMCOREINFO_BYTES (4096)
--
1.7.2.5
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
* Re: [PATCH 1/3] kexec: extend hypercall with improved load/unload ops
2013-01-16 16:29 ` [PATCH 1/3] kexec: extend hypercall with improved load/unload ops David Vrabel
@ 2013-01-17 12:28 ` Daniel Kiper
2013-01-17 14:50 ` David Vrabel
2013-01-17 12:33 ` [Xen-devel] " Ian Campbell
1 sibling, 1 reply; 24+ messages in thread
From: Daniel Kiper @ 2013-01-17 12:28 UTC (permalink / raw)
To: David Vrabel; +Cc: kexec, Eric Biederman, xen-devel
On Wed, Jan 16, 2013 at 04:29:04PM +0000, David Vrabel wrote:
> From: David Vrabel <david.vrabel@citrix.com>
>
> In the existing kexec hypercall, the load and unload ops depend on
> internals of the Linux kernel (the page list and code page provided by
> the kernel). The code page is used to transition between Xen context
> and the image so using kernel code doesn't make sense and will not
> work for PVH guests.
>
> Add replacement KEXEC_CMD_kexec_load_v2 and KEXEC_CMD_kexec_unload_v2
> ops that no longer require a code page to be provided by the guest --
> Xen now provides the code for calling the image directly.
>
> The load_v2 op looks similar to the Linux kexec_load system call and
> allows the guest to provide the image data to be loaded into the crash
> kernel memory region. The guest may also specify whether the image is
> 64-bit or 32-bit.
>
> The toolstack can now load images without kernel involvement. This is
> required for supporting kexec of crash kernels from PV-ops kernels.
>
> Note: This also changes the behaviour of the kexec op when an image is
> loaded with the old ABI. The code page will no longer be used, which
> may result in incorrect behaviour in non-Linux guests. This allowed
> the code to be simpler, and support for the old ABI is being removed in
> a subsequent patch anyway.
>
> [ This is a prototype and has the following limitations:
>
> - no compat implementation for kexec_load_v2.
> - 64-bit images are not supported.
> - 32-bit images are called with paging enabled (Linux starts 32-bit
> images with paging disabled). ]
>
> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
> ---
> xen/arch/x86/machine_kexec.c | 73 ++------------
> xen/arch/x86/x86_64/compat_kexec.S | 25 -----
> xen/common/kexec.c | 204 ++++++++++++++++++++++++++++++++++--
> xen/include/public/kexec.h | 44 ++++++++
> xen/include/xen/kexec.h | 18 ++--
> 5 files changed, 255 insertions(+), 109 deletions(-)
>
> diff --git a/xen/arch/x86/machine_kexec.c b/xen/arch/x86/machine_kexec.c
> index 8191ef1..7131d20 100644
> --- a/xen/arch/x86/machine_kexec.c
> +++ b/xen/arch/x86/machine_kexec.c
> @@ -12,62 +12,16 @@
> #include <asm/fixmap.h>
> #include <asm/hpet.h>
>
> -typedef void (*relocate_new_kernel_t)(
> - unsigned long indirection_page,
> - unsigned long *page_list,
> - unsigned long start_address,
> - unsigned int preserve_context);
> -
> -int machine_kexec_load(int type, int slot, xen_kexec_image_t *image)
> +int machine_kexec_load(struct kexec_image *image)
> {
> - unsigned long prev_ma = 0;
> - int fix_base = FIX_KEXEC_BASE_0 + (slot * (KEXEC_XEN_NO_PAGES >> 1));
> - int k;
> -
> - /* setup fixmap to point to our pages and record the virtual address
> - * in every odd index in page_list[].
> - */
> -
> - for ( k = 0; k < KEXEC_XEN_NO_PAGES; k++ )
> - {
> - if ( (k & 1) == 0 )
> - {
> - /* Even pages: machine address. */
> - prev_ma = image->page_list[k];
> - }
> - else
> - {
> - /* Odd pages: va for previous ma. */
> - if ( is_pv_32on64_domain(dom0) )
> - {
> - /*
> - * The compatability bounce code sets up a page table
> - * with a 1-1 mapping of the first 1G of memory so
> - * VA==PA here.
> - *
> - * This Linux purgatory code still sets up separate
> - * high and low mappings on the control page (entries
> - * 0 and 1) but it is harmless if they are equal since
> - * that PT is not live at the time.
> - */
> - image->page_list[k] = prev_ma;
> - }
> - else
> - {
> - set_fixmap(fix_base + (k >> 1), prev_ma);
> - image->page_list[k] = fix_to_virt(fix_base + (k >> 1));
> - }
> - }
> - }
> -
> return 0;
> }
>
> -void machine_kexec_unload(int type, int slot, xen_kexec_image_t *image)
> +void machine_kexec_unload(struct kexec_image *image)
> {
> }
>
> -void machine_reboot_kexec(xen_kexec_image_t *image)
> +void machine_reboot_kexec(struct kexec_image *image)
> {
> BUG_ON(smp_processor_id() != 0);
> smp_send_stop();
> @@ -75,7 +29,7 @@ void machine_reboot_kexec(xen_kexec_image_t *image)
> BUG();
> }
>
> -void machine_kexec(xen_kexec_image_t *image)
> +void machine_kexec(struct kexec_image *image)
> {
> struct desc_ptr gdt_desc = {
> .base = (unsigned long)(boot_cpu_gdt_table - FIRST_RESERVED_GDT_ENTRY),
> @@ -116,22 +70,11 @@ void machine_kexec(xen_kexec_image_t *image)
> */
> asm volatile ( "lgdt %0" : : "m" (gdt_desc) );
>
> - if ( is_pv_32on64_domain(dom0) )
> - {
> - compat_machine_kexec(image->page_list[1],
> - image->indirection_page,
> - image->page_list,
> - image->start_address);
> - }
> + if ( image->class == KEXEC_CLASS_32 )
> + compat_machine_kexec(image->entry_maddr);
Why do you need that?
> else
> - {
> - relocate_new_kernel_t rnk;
> -
> - rnk = (relocate_new_kernel_t) image->page_list[1];
> - (*rnk)(image->indirection_page, image->page_list,
> - image->start_address,
> - 0 /* preserve_context */);
> - }
> + /* FIXME */
> + panic("KEXEC_CLASS_64 not yet supported\n");
> }
>
> int machine_kexec_get(xen_kexec_range_t *range)
> diff --git a/xen/arch/x86/x86_64/compat_kexec.S b/xen/arch/x86/x86_64/compat_kexec.S
> index fc92af9..d853231 100644
> --- a/xen/arch/x86/x86_64/compat_kexec.S
> +++ b/xen/arch/x86/x86_64/compat_kexec.S
> @@ -36,21 +36,6 @@
> ENTRY(compat_machine_kexec)
> /* x86/64 x86/32 */
> /* %rdi - relocate_new_kernel_t CALL */
> - /* %rsi - indirection page 4(%esp) */
> - /* %rdx - page_list 8(%esp) */
> - /* %rcx - start address 12(%esp) */
> - /* cpu has pae 16(%esp) */
> -
> - /* Shim the 64 bit page_list into a 32 bit page_list. */
> - mov $12,%r9
> - lea compat_page_list(%rip), %rbx
> -1: dec %r9
> - movl (%rdx,%r9,8),%eax
> - movl %eax,(%rbx,%r9,4)
> - test %r9,%r9
> - jnz 1b
> -
> - RELOCATE_SYM(compat_page_list,%rdx)
>
> /* Relocate compatibility mode entry point address. */
> RELOCATE_MEM(compatibility_mode_far,%eax)
> @@ -118,12 +103,6 @@ compatibility_mode:
> movl %eax, %gs
> movl %eax, %ss
>
> - /* Push arguments onto stack. */
> - pushl $0 /* 20(%esp) - preserve context */
> - pushl $1 /* 16(%esp) - cpu has pae */
> - pushl %ecx /* 12(%esp) - start address */
> - pushl %edx /* 8(%esp) - page list */
> - pushl %esi /* 4(%esp) - indirection page */
> pushl %edi /* 0(%esp) - CALL */
>
> /* Disable paging and therefore leave 64 bit mode. */
> @@ -153,10 +132,6 @@ compatibility_mode:
> ud2
>
> .data
> - .align 4
> -compat_page_list:
> - .fill 12,4,0
> -
> .align 32,0
>
> /*
> diff --git a/xen/common/kexec.c b/xen/common/kexec.c
> index 25ebd6a..56bf8b4 100644
> --- a/xen/common/kexec.c
> +++ b/xen/common/kexec.c
> @@ -45,7 +45,7 @@ static Elf_Note *xen_crash_note;
>
> static cpumask_t crash_saved_cpus;
>
> -static xen_kexec_image_t kexec_image[KEXEC_IMAGE_NR];
> +static struct kexec_image kexec_image[KEXEC_IMAGE_NR];
>
> #define KEXEC_FLAG_DEFAULT_POS (KEXEC_IMAGE_NR + 0)
> #define KEXEC_FLAG_CRASH_POS (KEXEC_IMAGE_NR + 1)
> @@ -316,7 +316,7 @@ void kexec_crash(void)
>
> static long kexec_reboot(void *_image)
> {
> - xen_kexec_image_t *image = _image;
> + struct kexec_image *image = _image;
>
> kexecing = TRUE;
>
> @@ -732,9 +732,19 @@ static void crash_save_vmcoreinfo(void)
> #endif
> }
>
> +static void kexec_unload_slot(int slot)
> +{
> + struct kexec_image *image = &kexec_image[slot];
> +
> + if ( test_and_clear_bit(slot, &kexec_flags) )
> + {
> + machine_kexec_unload(image);
> + }
> +}
> +
> static int kexec_load_unload_internal(unsigned long op, xen_kexec_load_t *load)
> {
> - xen_kexec_image_t *image;
> + struct kexec_image *image;
> int base, bit, pos;
> int ret = 0;
>
> @@ -750,9 +760,13 @@ static int kexec_load_unload_internal(unsigned long op, xen_kexec_load_t *load)
>
> BUG_ON(test_bit((base + !pos), &kexec_flags)); /* must be free */
>
> - memcpy(image, &load->image, sizeof(*image));
> + if ( is_pv_32on64_domain(dom0) )
> + image->class = KEXEC_CLASS_32;
> + else
> + image->class = KEXEC_CLASS_64;
Ditto.
> + image->entry_maddr = load->image.start_address;
>
> - if ( !(ret = machine_kexec_load(load->type, base + !pos, image)) )
> + if ( !(ret = machine_kexec_load(image)) )
> {
> /* Set image present bit */
> set_bit((base + !pos), &kexec_flags);
> @@ -767,11 +781,7 @@ static int kexec_load_unload_internal(unsigned long op, xen_kexec_load_t *load)
> /* Unload the old image if present and load successful */
> if ( ret == 0 && !test_bit(KEXEC_FLAG_IN_PROGRESS, &kexec_flags) )
> {
> - if ( test_and_clear_bit((base + pos), &kexec_flags) )
> - {
> - image = &kexec_image[base + pos];
> - machine_kexec_unload(load->type, base + pos, image);
> - }
> + kexec_unload_slot(base + pos);
> }
>
> return ret;
> @@ -816,7 +826,7 @@ static int kexec_load_unload_compat(unsigned long op,
> static int kexec_exec(XEN_GUEST_HANDLE_PARAM(void) uarg)
> {
> xen_kexec_exec_t exec;
> - xen_kexec_image_t *image;
> + struct kexec_image *image;
> int base, bit, pos, ret = -EINVAL;
>
> if ( unlikely(copy_from_guest(&exec, uarg, 1)) )
> @@ -845,6 +855,162 @@ static int kexec_exec(XEN_GUEST_HANDLE_PARAM(void) uarg)
> return -EINVAL; /* never reached */
> }
>
> +static int kexec_load_segments(xen_kexec_load_v2_t *load)
> +{
> + unsigned s;
> + bool_t valid_entry = 0;
> +
> + for ( s = 0; s < load->nr_segments; s++ )
> + {
> + xen_kexec_segment_t seg;
> + unsigned long to_copy;
> + unsigned long src_offset;
> + unsigned long dest;
> +
> + if ( copy_from_guest_offset(&seg, load->segments, s, 1) )
> + return -EFAULT;
> +
> + /* Caller is responsible for ensuring the crash space is
> + shared between multiple users of the load call. Xen just
> + validates the load is to somewhere within the region. */
> +
> + if ( seg.dest_maddr < kexec_crash_area.start
> + || seg.dest_maddr + seg.size > kexec_crash_area.start + kexec_crash_area.size)
> + return -EINVAL;
This way you are breaking regular kexec support, which
does not use a preallocated area. As I said earlier, you
should use the kexec code from the Linux kernel (with the
relevant changes). It has everything that is needed, so you
do not need to reinvent the wheel.
> +
> + if ( load->entry_maddr >= seg.dest_maddr
> + && load->entry_maddr < seg.dest_maddr + seg.size)
> + valid_entry = 1;
> +
> + to_copy = seg.size;
> + src_offset = 0;
> + dest = seg.dest_maddr;
> +
> + while ( to_copy )
> + {
> + unsigned long dest_mfn;
> + size_t dest_off;
> + void *dest_va;
> + size_t size;
> +
> + dest_mfn = dest >> PAGE_SHIFT;
> + dest_off = dest & ~PAGE_MASK;
> +
> + size = min(PAGE_SIZE - dest_off, to_copy);
> +
> + dest_va = vmap(&dest_mfn, 1);
> + if ( dest_va == NULL )
> + return -EINVAL;
> +
> + copy_from_guest_offset(dest_va, seg.buf, src_offset, size);
> +
> + vunmap(dest_va);
> +
> + to_copy -= size;
> + src_offset += size;
> + dest += size;
> + }
> + }
> +
> + /* Entry point is somewhere in a loaded segment? */
> + if ( !valid_entry )
> + return -EINVAL;
> +
> + return 0;
> +}
> +
> +static int slot_to_pos_bit(int slot)
> +{
> + return KEXEC_IMAGE_NR + slot / 2;
> +}
> +
> +static int kexec_load_slot(int slot, xen_kexec_load_v2_t *load)
> +{
> + struct kexec_image *image = &kexec_image[slot];
> + int ret;
> +
> + BUG_ON(test_bit(slot, &kexec_flags)); /* must be free */
> +
> + /* validate and load each segment. */
> + ret = kexec_load_segments(load);
> + if ( ret < 0 )
> + return ret;
> +
> + image->entry_maddr = load->entry_maddr;
> +
> + ret = machine_kexec_load(image);
> + if ( ret < 0 )
> + return ret;
> +
> + /* Set image present bit */
> + set_bit(slot, &kexec_flags);
> +
> + /* Make new image the active one */
> + change_bit(slot_to_pos_bit(slot), &kexec_flags);
> +
> + crash_save_vmcoreinfo();
> +
> + return ret;
> +}
> +
> +
> +static int kexec_load_v2(XEN_GUEST_HANDLE_PARAM(void) uarg)
> +{
> + xen_kexec_load_v2_t load;
> + int base, bit, pos, slot;
> + struct kexec_image *image;
> + int ret;
> +
> + if ( unlikely(copy_from_guest(&load, uarg, 1)) )
> + return -EFAULT;
> +
> + if ( kexec_load_get_bits(load.type, &base, &bit) )
> + return -EINVAL;
> +
> + pos = (test_bit(bit, &kexec_flags) != 0);
> + slot = base + !pos;
> + image = &kexec_image[slot];
> +
> + switch ( load.class )
> + {
> + case KEXEC_CLASS_32:
> + case KEXEC_CLASS_64:
> + image->class = load.class;
> + break;
> + default:
> + return -EINVAL;
> + }
> +
> + ret = kexec_load_slot(slot, &load);
> +
> + /* Unload the old image if present and load successful */
> + if ( ret == 0 && !test_bit(KEXEC_FLAG_IN_PROGRESS, &kexec_flags) )
> + {
> + kexec_unload_slot(slot ^ 0x1);
> + }
> +
> + return ret;
> +}
> +
> +static int kexec_unload_v2(XEN_GUEST_HANDLE_PARAM(void) uarg)
> +{
> + xen_kexec_unload_v2_t unload;
> + int base, bit, pos, slot;
> +
> + if ( unlikely(copy_from_guest(&unload, uarg, 1)) )
> + return -EFAULT;
> +
> + if ( kexec_load_get_bits(unload.type, &base, &bit) )
> + return -EINVAL;
> +
> + pos = (test_bit(bit, &kexec_flags) != 0);
> + slot = base + !pos;
> +
> + kexec_unload_slot(slot);
> +
> + return 0;
> +}
> +
> static int do_kexec_op_internal(unsigned long op,
> XEN_GUEST_HANDLE_PARAM(void) uarg,
> bool_t compat)
> @@ -882,6 +1048,22 @@ static int do_kexec_op_internal(unsigned long op,
> case KEXEC_CMD_kexec:
> ret = kexec_exec(uarg);
> break;
> + case KEXEC_CMD_kexec_load_v2:
> + spin_lock_irqsave(&kexec_lock, flags);
> + if ( !test_bit(KEXEC_FLAG_IN_PROGRESS, &kexec_flags) )
> + ret = kexec_load_v2(uarg);
> + else
> + ret = -EBUSY;
> + spin_unlock_irqrestore(&kexec_lock, flags);
> + break;
> + case KEXEC_CMD_kexec_unload_v2:
> + spin_lock_irqsave(&kexec_lock, flags);
> + if ( !test_bit(KEXEC_FLAG_IN_PROGRESS, &kexec_flags) )
> + ret = kexec_unload_v2(uarg);
> + else
> + ret = -EBUSY;
> + spin_unlock_irqrestore(&kexec_lock, flags);
> + break;
> }
>
> return ret;
> diff --git a/xen/include/public/kexec.h b/xen/include/public/kexec.h
> index 61a8d7d..4b7d637 100644
> --- a/xen/include/public/kexec.h
> +++ b/xen/include/public/kexec.h
> @@ -83,6 +83,8 @@
> #define KEXEC_TYPE_DEFAULT 0
> #define KEXEC_TYPE_CRASH 1
>
> +#define KEXEC_CLASS_32 1 /* 32-bit image. */
> +#define KEXEC_CLASS_64 2 /* 64-bit image. */
???
>
> /* The kexec implementation for Xen allows the user to load two
> * types of kernels, KEXEC_TYPE_DEFAULT and KEXEC_TYPE_CRASH.
> @@ -152,6 +154,48 @@ typedef struct xen_kexec_range {
> unsigned long start;
> } xen_kexec_range_t;
>
> +/*
> + * A contiguous chunk of a kexec image and its destination machine
> + * address.
> + */
> +typedef struct xen_kexec_segment {
> + XEN_GUEST_HANDLE(const_void) buf;
> + uint32_t size;
> + uint64_t dest_maddr;
> +} xen_kexec_segment_t;
> +DEFINE_XEN_GUEST_HANDLE(xen_kexec_segment_t);
> +
> +/*
> + * Load a kexec image into memory.
> + *
> + * Each segment of the image must reside in the memory region reserved
> + * for kexec (KEXEC_RANGE_MA_CRASH) and the entry point must be within
> + * the image.
> + *
> + * The caller is responsible for ensuring that multiple images do not
> + * overlap.
> + */
> +#define KEXEC_CMD_kexec_load_v2 4
> +typedef struct xen_kexec_load_v2 {
> + uint8_t type; /* One of KEXEC_TYPE_* */
> + uint8_t class; /* One of KEXEC_CLASS_* */
Why not use a single member called flags (uint32_t or uint64_t)?
That way you could easily add new flags in the future.
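Roughly something like the sketch below. The bit layout and every LOADV2_* / loadv2_* name are invented for illustration; none of this is in the patch or in any Xen header, and it is not a proposal for the final ABI.

```c
#include <stdint.h>

/* Hypothetical layout of a single 32-bit flags word, for illustration only. */
#define LOADV2_TYPE_MASK   0x000000ffu  /* bits 0-7:  one of KEXEC_TYPE_*  */
#define LOADV2_CLASS_SHIFT 8
#define LOADV2_CLASS_MASK  0x0000ff00u  /* bits 8-15: one of KEXEC_CLASS_* */
#define LOADV2_KNOWN_BITS  (LOADV2_TYPE_MASK | LOADV2_CLASS_MASK)

/* Pack the image type and class into one flags word. */
uint32_t loadv2_pack(uint8_t type, uint8_t img_class)
{
    return (uint32_t)type | ((uint32_t)img_class << LOADV2_CLASS_SHIFT);
}

/* Extract the image type from a flags word. */
uint8_t loadv2_type(uint32_t flags)
{
    return (uint8_t)(flags & LOADV2_TYPE_MASK);
}

/* Extract the image class from a flags word. */
uint8_t loadv2_class(uint32_t flags)
{
    return (uint8_t)((flags & LOADV2_CLASS_MASK) >> LOADV2_CLASS_SHIFT);
}

/* Reject flag words carrying bits this version does not understand. */
int loadv2_flags_valid(uint32_t flags)
{
    return (flags & ~LOADV2_KNOWN_BITS) == 0;
}
```

Rejecting unknown bits up front is what keeps the remaining bits free for later extensions.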
Daniel
* Re: [PATCH 1/3] kexec: extend hypercall with improved load/unload ops
2013-01-17 12:28 ` Daniel Kiper
@ 2013-01-17 14:50 ` David Vrabel
2013-01-17 15:17 ` Daniel Kiper
0 siblings, 1 reply; 24+ messages in thread
From: David Vrabel @ 2013-01-17 14:50 UTC (permalink / raw)
To: Daniel Kiper
Cc: kexec@lists.infradead.org, Eric Biederman,
xen-devel@lists.xen.org
Can you trim your replies? It's too hard to find your comments otherwise.
On 17/01/13 12:28, Daniel Kiper wrote:
> On Wed, Jan 16, 2013 at 04:29:04PM +0000, David Vrabel wrote:
>> From: David Vrabel <david.vrabel@citrix.com>
>>
>> In the existing kexec hypercall, the load and unload ops depend on
>> internals of the Linux kernel (the page list and code page provided by
>> the kernel). The code page is used to transition between Xen context
>> and the image so using kernel code doesn't make sense and will not
>> work for PVH guests.
>>
>> Add replacement KEXEC_CMD_kexec_load_v2 and KEXEC_CMD_kexec_unload_v2
>> ops that no longer require a code page to be provided by the guest --
>> Xen now provides the code for calling the image directly.
>>
[...]
>> + if ( image->class == KEXEC_CLASS_32 )
>> + compat_machine_kexec(image->entry_maddr);
>
> Why do you need that?
image->class controls whether the processor is in 32-bit or 64-bit mode
when calling the image. The current implementation only allows images
to be executed with the same class as dom0.
It's called class because that's the term ELF uses in the ELF header.
>> + if ( seg.dest_maddr < kexec_crash_area.start
>> + || seg.dest_maddr + seg.size > kexec_crash_area.start + kexec_crash_area.size)
>> + return -EINVAL;
>
> This way you are breaking regular kexec support which
> does not use a preallocated area. As I said earlier you
> should use kexec code from Linux Kernel (with relevant
> changes). It has all needed stuff and you do not need
> to reinvent the wheel.
Yeah. I did say it was a prototype.
>> +#define KEXEC_CMD_kexec_load_v2 4
>> +typedef struct xen_kexec_load_v2 {
>> + uint8_t type; /* One of KEXEC_TYPE_* */
>> + uint8_t class; /* One of KEXEC_CLASS_* */
>
> Why not use a single member called flags (uint32_t or uint64_t)?
> That way you could quite easily add new flags in the future.
Neither type nor class are flags.
David
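
A minimal sketch of the segment range check quoted above, written so that no intermediate sum can wrap. This is only an illustration under stated assumptions: `struct region` and `segment_in_region()` are hypothetical names, not part of the patch, and the patch's own check adds `dest_maddr` and `size` directly.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the crash area description (start and size). */
struct region {
    uint64_t start;
    uint64_t size;
};

/*
 * Return true if [dest, dest + len) lies entirely inside the region.
 * Only subtractions are used, and each one is guarded by an earlier
 * comparison, so a huge dest cannot wrap around and slip through.
 */
static bool segment_in_region(const struct region *r, uint64_t dest,
                              uint64_t len)
{
    if ( dest < r->start )
        return false;
    if ( len > r->size )
        return false;
    /* dest - r->start and r->size - len cannot underflow here. */
    return dest - r->start <= r->size - len;
}
```

With a region starting at 0x1000 of size 0x1000, a segment at 0x1000 of length 0x1000 is accepted, while a destination close to UINT64_MAX is rejected even though `dest + len` would wrap to a small value.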
* Re: [PATCH 1/3] kexec: extend hypercall with improved load/unload ops
2013-01-17 14:50 ` David Vrabel
@ 2013-01-17 15:17 ` Daniel Kiper
2013-01-17 17:53 ` David Vrabel
0 siblings, 1 reply; 24+ messages in thread
From: Daniel Kiper @ 2013-01-17 15:17 UTC (permalink / raw)
To: David Vrabel
Cc: kexec@lists.infradead.org, Eric Biederman,
xen-devel@lists.xen.org
On Thu, Jan 17, 2013 at 02:50:26PM +0000, David Vrabel wrote:
> Can you trim your replies? It's too hard to find your comments otherwise.
OK.
> On 17/01/13 12:28, Daniel Kiper wrote:
> > On Wed, Jan 16, 2013 at 04:29:04PM +0000, David Vrabel wrote:
> >> From: David Vrabel <david.vrabel@citrix.com>
> >>
> >> In the existing kexec hypercall, the load and unload ops depend on
> >> internals of the Linux kernel (the page list and code page provided by
> >> the kernel). The code page is used to transition between Xen context
> >> and the image so using kernel code doesn't make sense and will not
> >> work for PVH guests.
> >>
> >> Add replacement KEXEC_CMD_kexec_load_v2 and KEXEC_CMD_kexec_unload_v2
> >> ops that no longer require a code page to be provided by the guest --
> >> Xen now provides the code for calling the image directly.
> >>
> [...]
> >> + if ( image->class == KEXEC_CLASS_32 )
> >> + compat_machine_kexec(image->entry_maddr);
> >
> > Why do you need that?
>
> image->class controls whether the processor is in 32-bit or 64-bit mode
> when calling the image. The current implementation only allows images
> to be executed with the same class as dom0.
>
> It's called class because that's the term ELF uses in the ELF header.
If I understand correctly, this sets the processor mode before the new kernel
is executed. If so, it is not needed: the purgatory code (from kexec-tools)
does everything required. Please check.
> >> + if ( seg.dest_maddr < kexec_crash_area.start
> >> + || seg.dest_maddr + seg.size > kexec_crash_area.start + kexec_crash_area.size)
> >> + return -EINVAL;
> >
> > This way you are breaking regular kexec support which
> > > does not use a preallocated area. As I said earlier you
> > should use kexec code from Linux Kernel (with relevant
> > changes). It has all needed stuff and you do not need
> > to reinvent the wheel.
>
> Yeah. I did say it was a prototype.
OK. I could not find any comment saying that this feature will be implemented too.
I prefer to share my thoughts now rather than break your work later.
> >> +#define KEXEC_CMD_kexec_load_v2 4
> >> +typedef struct xen_kexec_load_v2 {
> >> + uint8_t type; /* One of KEXEC_TYPE_* */
> >> + uint8_t class; /* One of KEXEC_CLASS_* */
> >
> > > Why not use a single member called flags (uint32_t or uint64_t)?
> > > That way you could quite easily add new flags in the future.
>
> Neither type nor class are flags.
type could be passed as flags (as the Linux kernel kexec implementation does),
which gives us more flexibility if we need more flags in the future.
class is not needed, as I stated above.
Daniel
* Re: [PATCH 1/3] kexec: extend hypercall with improved load/unload ops
2013-01-17 15:17 ` Daniel Kiper
@ 2013-01-17 17:53 ` David Vrabel
2013-01-18 9:44 ` Daniel Kiper
` (2 more replies)
0 siblings, 3 replies; 24+ messages in thread
From: David Vrabel @ 2013-01-17 17:53 UTC (permalink / raw)
To: Daniel Kiper
Cc: kexec@lists.infradead.org, Eric Biederman,
xen-devel@lists.xen.org
On 17/01/13 15:17, Daniel Kiper wrote:
> On Thu, Jan 17, 2013 at 02:50:26PM +0000, David Vrabel wrote:
>> On 17/01/13 12:28, Daniel Kiper wrote:
>>> On Wed, Jan 16, 2013 at 04:29:04PM +0000, David Vrabel wrote:
[..]
>>>> + if ( image->class == KEXEC_CLASS_32 )
>>>> + compat_machine_kexec(image->entry_maddr);
>>>
>>> Why do you need that?
>>
>> image->class controls whether the processor is in 32-bit or 64-bit mode
>> when calling the image. The current implementation only allows images
>> to be executed with the same class as dom0.
>>
>> It's called class because that's the term ELF uses in the ELF header.
>
> If I understand correctly, this sets the processor mode before the new kernel
> is executed. If so, it is not needed: the purgatory code (from kexec-tools)
> does everything required. Please check.
On x86 I think it would probably be fine to specify entry is always in
64-bit mode but for ARM and future architectures it is less clear and it
becomes more difficult to have a well-defined ABI.
In fact, we probably want a more generic architecture field. e.g,
#define XEN_KEXEC_ARCH_X86_32 0
#define XEN_KEXEC_ARCH_X86_64 1
#define XEN_KEXEC_ARCH_ARMv7 2
#define XEN_KEXEC_ARCH_ARMv8 3
David
* Re: [PATCH 1/3] kexec: extend hypercall with improved load/unload ops
2013-01-17 17:53 ` David Vrabel
@ 2013-01-18 9:44 ` Daniel Kiper
2013-01-18 9:50 ` [Xen-devel] " Ian Campbell
2013-01-18 19:01 ` Eric W. Biederman
2 siblings, 0 replies; 24+ messages in thread
From: Daniel Kiper @ 2013-01-18 9:44 UTC (permalink / raw)
To: David Vrabel
Cc: kexec@lists.infradead.org, Eric Biederman,
xen-devel@lists.xen.org
On Thu, Jan 17, 2013 at 05:53:17PM +0000, David Vrabel wrote:
> On 17/01/13 15:17, Daniel Kiper wrote:
> > On Thu, Jan 17, 2013 at 02:50:26PM +0000, David Vrabel wrote:
> >> On 17/01/13 12:28, Daniel Kiper wrote:
> >>> On Wed, Jan 16, 2013 at 04:29:04PM +0000, David Vrabel wrote:
> [..]
> >>>> + if ( image->class == KEXEC_CLASS_32 )
> >>>> + compat_machine_kexec(image->entry_maddr);
> >>>
> >>> Why do you need that?
> >>
> >> image->class controls whether the processor is in 32-bit or 64-bit mode
> >> when calling the image. The current implementation only allows images
> >> to be executed with the same class as dom0.
> >>
> >> It's called class because that's the term ELF uses in the ELF header.
> >
> > If I understand correctly, this sets the processor mode before the new kernel
> > is executed. If so, it is not needed: the purgatory code (from kexec-tools)
> > does everything required. Please check.
>
> On x86 I think it would probably be fine to specify entry is always in
Which entry? Kernel entry point? I think it is always the same.
> 64-bit mode but for ARM and future architectures it is less clear and it
> becomes more difficult to have a well-defined ABI.
>
> In fact, we probably want a more generic architecture field. e.g,
>
> #define XEN_KEXEC_ARCH_X86_32 0
> #define XEN_KEXEC_ARCH_X86_64 1
> #define XEN_KEXEC_ARCH_ARMv7 2
> #define XEN_KEXEC_ARCH_ARMv8 3
If we need them, please look at linux/include/uapi/linux/kexec.h.
In the Linux kernel's case they are passed via flags.
Daniel
* Re: [Xen-devel] [PATCH 1/3] kexec: extend hypercall with improved load/unload ops
2013-01-17 17:53 ` David Vrabel
2013-01-18 9:44 ` Daniel Kiper
@ 2013-01-18 9:50 ` Ian Campbell
2013-01-18 19:01 ` Eric W. Biederman
2 siblings, 0 replies; 24+ messages in thread
From: Ian Campbell @ 2013-01-18 9:50 UTC (permalink / raw)
To: David Vrabel
Cc: Daniel Kiper, kexec@lists.infradead.org, Eric Biederman,
xen-devel@lists.xen.org
On Thu, 2013-01-17 at 17:53 +0000, David Vrabel wrote:
> On 17/01/13 15:17, Daniel Kiper wrote:
> > On Thu, Jan 17, 2013 at 02:50:26PM +0000, David Vrabel wrote:
> >> On 17/01/13 12:28, Daniel Kiper wrote:
> >>> On Wed, Jan 16, 2013 at 04:29:04PM +0000, David Vrabel wrote:
> [..]
> >>>> + if ( image->class == KEXEC_CLASS_32 )
> >>>> + compat_machine_kexec(image->entry_maddr);
> >>>
> >>> Why do you need that?
> >>
> >> image->class controls whether the processor is in 32-bit or 64-bit mode
> >> when calling the image. The current implementation only allows images
> >> to be executed with the same class as dom0.
> >>
> >> It's called class because that's the term ELF uses in the ELF header.
> >
> > If I understand correctly, this sets the processor mode before the new kernel
> > is executed. If so, it is not needed: the purgatory code (from kexec-tools)
> > does everything required. Please check.
>
> On x86 I think it would probably be fine to specify entry is always in
> 64-bit mode but for ARM and future architectures it is less clear and it
> becomes more difficult to have a well-defined ABI.
Just a random thought but rather than an entry point would it make sense
to pass in a complete struct vcpu_guest_context? Not all of the members
would be relevant to this use case but a bunch of them other than PC
might be?
Ian.
* Re: [PATCH 1/3] kexec: extend hypercall with improved load/unload ops
2013-01-17 17:53 ` David Vrabel
2013-01-18 9:44 ` Daniel Kiper
2013-01-18 9:50 ` [Xen-devel] " Ian Campbell
@ 2013-01-18 19:01 ` Eric W. Biederman
2 siblings, 0 replies; 24+ messages in thread
From: Eric W. Biederman @ 2013-01-18 19:01 UTC (permalink / raw)
To: David Vrabel
Cc: Daniel Kiper, kexec@lists.infradead.org, xen-devel@lists.xen.org
David Vrabel <david.vrabel@citrix.com> writes:
> On 17/01/13 15:17, Daniel Kiper wrote:
>> On Thu, Jan 17, 2013 at 02:50:26PM +0000, David Vrabel wrote:
>>> On 17/01/13 12:28, Daniel Kiper wrote:
>>>> On Wed, Jan 16, 2013 at 04:29:04PM +0000, David Vrabel wrote:
> [..]
>>>>> + if ( image->class == KEXEC_CLASS_32 )
>>>>> + compat_machine_kexec(image->entry_maddr);
>>>>
>>>> Why do you need that?
>>>
>>> image->class controls whether the processor is in 32-bit or 64-bit mode
>>> when calling the image. The current implementation only allows images
>>> to be executed with the same class as dom0.
>>>
>>> It's called class because that's the term ELF uses in the ELF header.
>>
>> If I understand correctly, this sets the processor mode before the new kernel
>> is executed. If so, it is not needed: the purgatory code (from kexec-tools)
>> does everything required. Please check.
>
> On x86 I think it would probably be fine to specify entry is always in
> 64-bit mode but for ARM and future architectures it is less clear and it
> becomes more difficult to have a well-defined ABI.
>
> In fact, we probably want a more generic architecture field. e.g,
>
> #define XEN_KEXEC_ARCH_X86_32 0
> #define XEN_KEXEC_ARCH_X86_64 1
> #define XEN_KEXEC_ARCH_ARMv7 2
> #define XEN_KEXEC_ARCH_ARMv8 3
The way this is defined for kexec on Linux is that we always transition
in the processor's native mode. The page tables for the transition are
defined as being identity mapped for the pages specified in the image.
Linux kexec passes in the architecture it thinks the system is running
so that the kexec_load implementation can fail load requests with the
wrong architecture.
In particular a 32bit kexec on a x86_64 kernel does expect to transition
in 64bit mode.
Non-native transitions are possible if you want to support them when Xen
is crashing but I don't see the point. I do admit I am a bit puzzled on
how a 32bit dom0 on a 64bit hypervisor implements kexec on panic
functionality today. Xen is weird. Shrug.
Eric
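
To illustrate the flags-based scheme discussed in this subthread, here is a rough sketch of how an architecture value packed into a flags word can be checked at load time. The `KEXEC_*` constants mirror the values in Linux's include/uapi/linux/kexec.h; the helper names (`flags_is_crash`, `flags_arch`, `check_load_flags`) are hypothetical and exist only for this example. Nothing here is proposed Xen ABI.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Values as found in Linux's include/uapi/linux/kexec.h. */
#define KEXEC_ON_CRASH      0x00000001u
#define KEXEC_ARCH_MASK     0xffff0000u
#define KEXEC_ARCH_DEFAULT  (0u << 16)
#define KEXEC_ARCH_386      (3u << 16)
#define KEXEC_ARCH_X86_64   (62u << 16)

/* Hypothetical helper: is this a crash (kdump) image? */
static bool flags_is_crash(uint32_t flags)
{
    return (flags & KEXEC_ON_CRASH) != 0;
}

/* Hypothetical helper: extract the architecture field from the flags. */
static uint32_t flags_arch(uint32_t flags)
{
    return flags & KEXEC_ARCH_MASK;
}

/*
 * Hypothetical load-time check: accept the request only if the caller
 * named the native architecture or left the field at its default.
 */
static int check_load_flags(uint32_t flags, uint32_t native_arch)
{
    uint32_t arch = flags_arch(flags);

    if ( arch != KEXEC_ARCH_DEFAULT && arch != native_arch )
        return -EINVAL;
    return 0;
}
```

On a system whose native architecture is `KEXEC_ARCH_X86_64`, a request carrying `KEXEC_ARCH_386` is refused with `-EINVAL`, which matches the behaviour described above: the architecture field exists so that mismatched load requests can fail, not to select a non-native entry mode.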
* Re: [Xen-devel] [PATCH 1/3] kexec: extend hypercall with improved load/unload ops
2013-01-16 16:29 ` [PATCH 1/3] kexec: extend hypercall with improved load/unload ops David Vrabel
2013-01-17 12:28 ` Daniel Kiper
@ 2013-01-17 12:33 ` Ian Campbell
1 sibling, 0 replies; 24+ messages in thread
From: Ian Campbell @ 2013-01-17 12:33 UTC (permalink / raw)
To: David Vrabel
Cc: Daniel Kiper, kexec@lists.infradead.org, Eric Biederman,
xen-devel@lists.xen.org
On Wed, 2013-01-16 at 16:29 +0000, David Vrabel wrote:
>
> +/*
> + * Load a kexec image into memory.
> + *
> + * Each segment of the image must reside in the memory region reserved
> + * for kexec (KEXEC_RANGE_MA_CRASH) and the entry point must be within
> + * the image.
> + *
> + * The caller is responsible for ensuring that multiple images do not
> + * overlap.
> + */
> +#define KEXEC_CMD_kexec_load_v2 4
> +typedef struct xen_kexec_load_v2 {
> + uint8_t type; /* One of KEXEC_TYPE_* */
> + uint8_t class; /* One of KEXEC_CLASS_* */
> + uint32_t nr_segments;
> + XEN_GUEST_HANDLE(xen_kexec_segment_t) segments;
> + uint64_t entry_maddr; /* image entry point machine address. */
> +} xen_kexec_load_v2_t;
> +DEFINE_XEN_GUEST_HANDLE(xen_kexec_load_v2_t);
Might you want to implement the double buffering scheme which the
existing stuff has? This can avoid windows where there is no complete
crash kernel present while you load a new one.
Ian.
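
A simplified model of the double buffering idea mentioned above: a new image is loaded into the spare slot and only becomes active once the load has succeeded, so a usable image is present throughout. All names here (`struct image_pair`, `pair_load`, `pair_current`) are hypothetical, and the `load_ok` parameter merely simulates whether copying the image succeeded; the real hypervisor code tracks slots with bits in `kexec_flags`.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified image descriptor. */
struct image {
    bool present;
    uint64_t entry_maddr;
};

/* Two slots per image type; 'active' indexes the one that would run. */
struct image_pair {
    struct image slot[2];
    unsigned int active;
};

/*
 * Load into the spare slot and switch over only on success.
 * On failure the previously loaded image is left untouched.
 */
static int pair_load(struct image_pair *p, uint64_t entry_maddr, bool load_ok)
{
    unsigned int spare = p->active ^ 1u;

    if ( !load_ok )
        return -1;

    p->slot[spare].entry_maddr = entry_maddr;
    p->slot[spare].present = true;

    /* Make the new image the active one, then drop the old one. */
    p->active = spare;
    p->slot[spare ^ 1u].present = false;
    return 0;
}

/* Return the image that would be executed, or NULL if none is loaded. */
static const struct image *pair_current(const struct image_pair *p)
{
    const struct image *img = &p->slot[p->active];

    return img->present ? img : NULL;
}
```

Note that with caller-chosen destination addresses, as in the proposed load_v2 op, this only helps if the new image is placed so that it does not overlap the one still active.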
* [PATCH 2/3] kexec: remove kexec_load and kexec_unload ops
2013-01-16 16:29 [RFC PATCH 0/3] Improve kexec support in Xen hypervisor David Vrabel
2013-01-16 16:29 ` [PATCH 1/3] kexec: extend hypercall with improved load/unload ops David Vrabel
@ 2013-01-16 16:29 ` David Vrabel
2013-01-16 16:29 ` [PATCH 3/3] libxc: add API for kexec hypercall David Vrabel
` (3 subsequent siblings)
5 siblings, 0 replies; 24+ messages in thread
From: David Vrabel @ 2013-01-16 16:29 UTC (permalink / raw)
To: xen-devel; +Cc: Daniel Kiper, kexec, David Vrabel, Eric Biederman
From: David Vrabel <david.vrabel@citrix.com>
KEXEC_CMD_kexec_load and KEXEC_CMD_kexec_unload are obsoleted by
KEXEC_CMD_kexec_load_v2 and KEXEC_CMD_kexec_unload_v2.
The kexec hypercall is for use by dom0 and the toolstack so we remove
support for them completely.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
---
xen/common/kexec.c | 91 +-------------------------------------------
xen/include/public/kexec.h | 27 +------------
xen/include/xlat.lst | 1 -
3 files changed, 2 insertions(+), 117 deletions(-)
diff --git a/xen/common/kexec.c b/xen/common/kexec.c
index 56bf8b4..a3ddbbc 100644
--- a/xen/common/kexec.c
+++ b/xen/common/kexec.c
@@ -742,87 +742,6 @@ static void kexec_unload_slot(int slot)
}
}
-static int kexec_load_unload_internal(unsigned long op, xen_kexec_load_t *load)
-{
- struct kexec_image *image;
- int base, bit, pos;
- int ret = 0;
-
- if ( kexec_load_get_bits(load->type, &base, &bit) )
- return -EINVAL;
-
- pos = (test_bit(bit, &kexec_flags) != 0);
-
- /* Load the user data into an unused image */
- if ( op == KEXEC_CMD_kexec_load )
- {
- image = &kexec_image[base + !pos];
-
- BUG_ON(test_bit((base + !pos), &kexec_flags)); /* must be free */
-
- if ( is_pv_32on64_domain(dom0) )
- image->class = KEXEC_CLASS_32;
- else
- image->class = KEXEC_CLASS_64;
- image->entry_maddr = load->image.start_address;
-
- if ( !(ret = machine_kexec_load(image)) )
- {
- /* Set image present bit */
- set_bit((base + !pos), &kexec_flags);
-
- /* Make new image the active one */
- change_bit(bit, &kexec_flags);
- }
-
- crash_save_vmcoreinfo();
- }
-
- /* Unload the old image if present and load successful */
- if ( ret == 0 && !test_bit(KEXEC_FLAG_IN_PROGRESS, &kexec_flags) )
- {
- kexec_unload_slot(base + pos);
- }
-
- return ret;
-}
-
-static int kexec_load_unload(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) uarg)
-{
- xen_kexec_load_t load;
-
- if ( unlikely(copy_from_guest(&load, uarg, 1)) )
- return -EFAULT;
-
- return kexec_load_unload_internal(op, &load);
-}
-
-static int kexec_load_unload_compat(unsigned long op,
- XEN_GUEST_HANDLE_PARAM(void) uarg)
-{
-#ifdef CONFIG_COMPAT
- compat_kexec_load_t compat_load;
- xen_kexec_load_t load;
-
- if ( unlikely(copy_from_guest(&compat_load, uarg, 1)) )
- return -EFAULT;
-
- /* This is a bit dodgy, load.image is inside load,
- * but XLAT_kexec_load (which is automatically generated)
- * doesn't translate load.image (correctly)
- * Just copy load->type, the only other member, manually instead.
- *
- * XLAT_kexec_load(&load, &compat_load);
- */
- load.type = compat_load.type;
- XLAT_kexec_image(&load.image, &compat_load.image);
-
- return kexec_load_unload_internal(op, &load);
-#else /* CONFIG_COMPAT */
- return 0;
-#endif /* CONFIG_COMPAT */
-}
-
static int kexec_exec(XEN_GUEST_HANDLE_PARAM(void) uarg)
{
xen_kexec_exec_t exec;
@@ -1035,15 +954,7 @@ static int do_kexec_op_internal(unsigned long op,
break;
case KEXEC_CMD_kexec_load:
case KEXEC_CMD_kexec_unload:
- spin_lock_irqsave(&kexec_lock, flags);
- if (!test_bit(KEXEC_FLAG_IN_PROGRESS, &kexec_flags))
- {
- if (compat)
- ret = kexec_load_unload_compat(op, uarg);
- else
- ret = kexec_load_unload(op, uarg);
- }
- spin_unlock_irqrestore(&kexec_lock, flags);
+ ret = -ENOSYS;
break;
case KEXEC_CMD_kexec:
ret = kexec_exec(uarg);
diff --git a/xen/include/public/kexec.h b/xen/include/public/kexec.h
index 4b7d637..7f2028f 100644
--- a/xen/include/public/kexec.h
+++ b/xen/include/public/kexec.h
@@ -86,23 +86,6 @@
#define KEXEC_CLASS_32 1 /* 32-bit image. */
#define KEXEC_CLASS_64 2 /* 64-bit image. */
-/* The kexec implementation for Xen allows the user to load two
- * types of kernels, KEXEC_TYPE_DEFAULT and KEXEC_TYPE_CRASH.
- * All data needed for a kexec reboot is kept in one xen_kexec_image_t
- * per "instance". The data mainly consists of machine address lists to pages
- * together with destination addresses. The data in xen_kexec_image_t
- * is passed to the "code page" which is one page of code that performs
- * the final relocations before jumping to the new kernel.
- */
-
-typedef struct xen_kexec_image {
-#if defined(__i386__) || defined(__x86_64__)
- unsigned long page_list[KEXEC_XEN_NO_PAGES];
-#endif
- unsigned long indirection_page;
- unsigned long start_address;
-} xen_kexec_image_t;
-
/*
* Perform kexec having previously loaded a kexec or kdump kernel
* as appropriate.
@@ -113,17 +96,9 @@ typedef struct xen_kexec_exec {
int type;
} xen_kexec_exec_t;
-/*
- * Load/Unload kernel image for kexec or kdump.
- * type == KEXEC_TYPE_DEFAULT or KEXEC_TYPE_CRASH [in]
- * image == relocation information for kexec (ignored for unload) [in]
- */
+/* Obsolete: use kexec_load_v2 and kexec_unload_v2 instead. */
#define KEXEC_CMD_kexec_load 1
#define KEXEC_CMD_kexec_unload 2
-typedef struct xen_kexec_load {
- int type;
- xen_kexec_image_t image;
-} xen_kexec_load_t;
#define KEXEC_RANGE_MA_CRASH 0 /* machine address and size of crash area */
#define KEXEC_RANGE_MA_XEN 1 /* machine address and size of Xen itself */
diff --git a/xen/include/xlat.lst b/xen/include/xlat.lst
index 3d4f1e3..e2e05fe 100644
--- a/xen/include/xlat.lst
+++ b/xen/include/xlat.lst
@@ -52,7 +52,6 @@
? grant_entry_v2 grant_table.h
? gnttab_swap_grant_ref grant_table.h
? kexec_exec kexec.h
-! kexec_image kexec.h
! kexec_range kexec.h
! add_to_physmap memory.h
! foreign_memory_map memory.h
--
1.7.2.5
* [PATCH 3/3] libxc: add API for kexec hypercall
2013-01-16 16:29 [RFC PATCH 0/3] Improve kexec support in Xen hypervisor David Vrabel
2013-01-16 16:29 ` [PATCH 1/3] kexec: extend hypercall with improved load/unload ops David Vrabel
2013-01-16 16:29 ` [PATCH 2/3] kexec: remove kexec_load and kexec_unload ops David Vrabel
@ 2013-01-16 16:29 ` David Vrabel
2013-01-16 16:59 ` [Xen-devel] " Ian Campbell
2013-01-16 16:33 ` [RFC PATCH 0/3] Improve kexec support in Xen hypervisor David Vrabel
` (2 subsequent siblings)
5 siblings, 1 reply; 24+ messages in thread
From: David Vrabel @ 2013-01-16 16:29 UTC (permalink / raw)
To: xen-devel; +Cc: Daniel Kiper, kexec, David Vrabel, Eric Biederman
From: David Vrabel <david.vrabel@citrix.com>
Add xc_kexec(), xc_kexec_get_range(), xc_kexec_load(), and
xc_kexec_unload(). The load and unload calls require the v2 load and
unload ops.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
---
tools/libxc/Makefile | 1 +
tools/libxc/xc_kexec.c | 120 ++++++++++++++++++++++++++++++++++++++++++++++++
tools/libxc/xenctrl.h | 53 +++++++++++++++++++++
3 files changed, 174 insertions(+), 0 deletions(-)
create mode 100644 tools/libxc/xc_kexec.c
diff --git a/tools/libxc/Makefile b/tools/libxc/Makefile
index d44abf9..39badf9 100644
--- a/tools/libxc/Makefile
+++ b/tools/libxc/Makefile
@@ -31,6 +31,7 @@ CTRL_SRCS-y += xc_mem_access.c
CTRL_SRCS-y += xc_memshr.c
CTRL_SRCS-y += xc_hcall_buf.c
CTRL_SRCS-y += xc_foreign_memory.c
+CTRL_SRCS-y += xc_kexec.c
CTRL_SRCS-y += xtl_core.c
CTRL_SRCS-y += xtl_logger_stdio.c
CTRL_SRCS-$(CONFIG_X86) += xc_pagetab.c
diff --git a/tools/libxc/xc_kexec.c b/tools/libxc/xc_kexec.c
new file mode 100644
index 0000000..ebd55cf
--- /dev/null
+++ b/tools/libxc/xc_kexec.c
@@ -0,0 +1,120 @@
+/******************************************************************************
+ * xc_kexec.c
+ *
+ * API for loading and executing kexec images.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation;
+ * version 2.1 of the License.
+ *
+ * Copyright (C) 2013 Citrix Systems R&D Ltd.
+ */
+#include "xc_private.h"
+
+int xc_kexec(xc_interface *xch, int type)
+{
+ DECLARE_HYPERCALL;
+ DECLARE_HYPERCALL_BUFFER(xen_kexec_exec_t, exec);
+ int ret = -1;
+
+ exec = xc_hypercall_buffer_alloc(xch, exec, sizeof(*exec));
+ if ( exec == NULL )
+ {
+ PERROR("Could not alloc bounce buffer for kexec_exec hypercall");
+ goto out;
+ }
+
+ exec->type = type;
+
+ hypercall.op = __HYPERVISOR_kexec_op;
+ hypercall.arg[0] = KEXEC_CMD_kexec;
+ hypercall.arg[1] = HYPERCALL_BUFFER_AS_ARG(exec);
+
+ ret = do_xen_hypercall(xch, &hypercall);
+
+out:
+ xc_hypercall_buffer_free(xch, exec);
+
+ return ret;
+}
+
+int xc_kexec_get_range(xc_interface *xch, int range, int nr,
+ unsigned long *size, unsigned long *start)
+{
+ DECLARE_HYPERCALL;
+ DECLARE_HYPERCALL_BUFFER(xen_kexec_range_t, get_range);
+ int ret = -1;
+
+ get_range = xc_hypercall_buffer_alloc(xch, get_range, sizeof(*get_range));
+ if ( get_range == NULL )
+ {
+ PERROR("Could not alloc bounce buffer for kexec_get_range hypercall");
+ goto out;
+ }
+
+ hypercall.op = __HYPERVISOR_kexec_op;
+ hypercall.arg[0] = KEXEC_CMD_kexec_get_range;
+ hypercall.arg[1] = HYPERCALL_BUFFER_AS_ARG(get_range);
+
+ ret = do_xen_hypercall(xch, &hypercall);
+
+ *size = get_range->size;
+ *start = get_range->start;
+
+out:
+ xc_hypercall_buffer_free(xch, get_range);
+
+ return ret;
+}
+
+int xc_kexec_load(xc_interface *xch, xen_kexec_load_v2_t *load)
+{
+ int ret = -1;
+ DECLARE_HYPERCALL;
+ DECLARE_HYPERCALL_BOUNCE(load, sizeof(*load), XC_HYPERCALL_BUFFER_BOUNCE_IN);
+
+ if ( xc_hypercall_bounce_pre(xch, load) )
+ {
+ PERROR("Could not alloc bounce buffer for kexec_load_v2 hypercall");
+ goto out;
+ }
+
+ hypercall.op = __HYPERVISOR_kexec_op;
+ hypercall.arg[0] = KEXEC_CMD_kexec_load_v2;
+ hypercall.arg[1] = HYPERCALL_BUFFER_AS_ARG(load);
+
+ ret = do_xen_hypercall(xch, &hypercall);
+
+out:
+ xc_hypercall_bounce_post(xch, load);
+
+ return ret;
+}
+
+int xc_kexec_unload(xc_interface *xch, int type)
+{
+ DECLARE_HYPERCALL;
+ DECLARE_HYPERCALL_BUFFER(xen_kexec_unload_v2_t, unload);
+ int ret = -1;
+
+ unload = xc_hypercall_buffer_alloc(xch, unload, sizeof(*unload));
+ if ( unload == NULL )
+ {
+ PERROR("Could not alloc bounce buffer for kexec_unload_v2 hypercall");
+ goto out;
+ }
+
+ unload->type = type;
+
+ hypercall.op = __HYPERVISOR_kexec_op;
+ hypercall.arg[0] = KEXEC_CMD_kexec_unload_v2;
+ hypercall.arg[1] = HYPERCALL_BUFFER_AS_ARG(unload);
+
+ ret = do_xen_hypercall(xch, &hypercall);
+
+out:
+ xc_hypercall_buffer_free(xch, unload);
+
+ return ret;
+}
diff --git a/tools/libxc/xenctrl.h b/tools/libxc/xenctrl.h
index 32122fd..001f7d1 100644
--- a/tools/libxc/xenctrl.h
+++ b/tools/libxc/xenctrl.h
@@ -46,6 +46,7 @@
#include <xen/hvm/params.h>
#include <xen/xsm/flask_op.h>
#include <xen/tmem.h>
+#include <xen/kexec.h>
#include "xentoollog.h"
@@ -2236,4 +2237,56 @@ int xc_compression_uncompress_page(xc_interface *xch, char *compbuf,
unsigned long compbuf_size,
unsigned long *compbuf_pos, char *dest);
+/*
+ * Execute an image previously loaded with xc_kexec_load().
+ *
+ * Does not return on success.
+ *
+ * Fails with:
+ * ENOENT if the specified image has not been loaded.
+ */
+int xc_kexec(xc_interface *xch, int type);
+
+/*
+ * Find the machine address and size of certain memory areas.
+ *
+ * The regions are:
+ *
+ * KEXEC_RANGE_MA_CRASH crash area
+ * KEXEC_RANGE_MA_XEN Xen itself
+ * KEXEC_RANGE_MA_CPU CPU note for CPU number 'nr'
+ * KEXEC_RANGE_MA_XENHEAP xenheap
+ * KEXEC_RANGE_MA_EFI_MEMMAP EFI Memory Map
+ * KEXEC_RANGE_MA_VMCOREINFO vmcoreinfo
+ *
+ * Fails with:
+ * EINVAL if the range or CPU number isn't valid.
+ */
+int xc_kexec_get_range(xc_interface *xch, int range, int nr,
+ unsigned long *size, unsigned long *start);
+
+/*
+ * Load a kexec image into memory.
+ *
+ * The image may be of type KEXEC_TYPE_DEFAULT (executed on request)
+ * or KEXEC_TYPE_CRASH (executed on a crash).
+ *
+ * The image may be 32-bit (KEXEC_CLASS_32) or 64-bit (KEXEC_CLASS_64)
+ * independently of whether the current domain is 32- or 64-bit.
+ *
+ * Fails with:
+ * EINVAL if the image does not fit into the crash area or the entry
+ * point isn't within one of the segments.
+ * EBUSY if another image is being executed.
+ */
+int xc_kexec_load(xc_interface *xch, xen_kexec_load_v2_t *load);
+
+/*
+ * Unload a kexec image.
+ *
+ * This prevents a KEXEC_TYPE_DEFAULT or KEXEC_TYPE_CRASH image from
+ * being executed. The image is not cleared from memory.
+ */
+int xc_kexec_unload(xc_interface *xch, int type);
+
#endif /* XENCTRL_H */
--
1.7.2.5
* Re: [Xen-devel] [PATCH 3/3] libxc: add API for kexec hypercall
2013-01-16 16:29 ` [PATCH 3/3] libxc: add API for kexec hypercall David Vrabel
@ 2013-01-16 16:59 ` Ian Campbell
0 siblings, 0 replies; 24+ messages in thread
From: Ian Campbell @ 2013-01-16 16:59 UTC (permalink / raw)
To: David Vrabel
Cc: Daniel Kiper, kexec@lists.infradead.org, Eric Biederman,
xen-devel@lists.xen.org
On Wed, 2013-01-16 at 16:29 +0000, David Vrabel wrote:
> From: David Vrabel <david.vrabel@citrix.com>
>
> Add xc_kexec(), xc_kexec_get_range(), xc_kexec_load(), and
> xc_kexec_unload(). The load and unload calls require the v2 load and
> unload ops.
>
> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
> ---
> tools/libxc/Makefile | 1 +
> tools/libxc/xc_kexec.c | 120 ++++++++++++++++++++++++++++++++++++++++++++++++
> tools/libxc/xenctrl.h | 53 +++++++++++++++++++++
> 3 files changed, 174 insertions(+), 0 deletions(-)
> create mode 100644 tools/libxc/xc_kexec.c
>
> diff --git a/tools/libxc/Makefile b/tools/libxc/Makefile
> index d44abf9..39badf9 100644
> --- a/tools/libxc/Makefile
> +++ b/tools/libxc/Makefile
> @@ -31,6 +31,7 @@ CTRL_SRCS-y += xc_mem_access.c
> CTRL_SRCS-y += xc_memshr.c
> CTRL_SRCS-y += xc_hcall_buf.c
> CTRL_SRCS-y += xc_foreign_memory.c
> +CTRL_SRCS-y += xc_kexec.c
> CTRL_SRCS-y += xtl_core.c
> CTRL_SRCS-y += xtl_logger_stdio.c
> CTRL_SRCS-$(CONFIG_X86) += xc_pagetab.c
> diff --git a/tools/libxc/xc_kexec.c b/tools/libxc/xc_kexec.c
> new file mode 100644
> index 0000000..ebd55cf
> --- /dev/null
> +++ b/tools/libxc/xc_kexec.c
> @@ -0,0 +1,120 @@
> +/******************************************************************************
> + * xc_kexec.c
> + *
> + * API for loading and executing kexec images.
> + *
> + * This library is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation;
> + * version 2.1 of the License.
> + *
> + * Copyright (C) 2013 Citrix Systems R&D Ltd.
> + */
> +#include "xc_private.h"
> +
> +int xc_kexec(xc_interface *xch, int type)
> +{
> + DECLARE_HYPERCALL;
> + DECLARE_HYPERCALL_BUFFER(xen_kexec_exec_t, exec);
> + int ret = -1;
> +
> + exec = xc_hypercall_buffer_alloc(xch, exec, sizeof(*exec));
> + if ( exec == NULL )
> + {
> + PERROR("Could not alloc bounce buffer for kexec_exec hypercall");
This one isn't actually a bounce buffer.
[...]
> + get_range = xc_hypercall_buffer_alloc(xch, get_range, sizeof(*get_range));
> + if ( get_range == NULL )
> + {
> + PERROR("Could not alloc bounce buffer for kexec_get_range hypercall");
Nor this.
> + goto out;
> + }
> +
> + hypercall.op = __HYPERVISOR_kexec_op;
> + hypercall.arg[0] = KEXEC_CMD_kexec_get_range;
> + hypercall.arg[1] = HYPERCALL_BUFFER_AS_ARG(get_range);
> +
> + ret = do_xen_hypercall(xch, &hypercall);
> +
> + *size = get_range->size;
> + *start = get_range->start;
> +
> +out:
> + xc_hypercall_buffer_free(xch, get_range);
> +
> + return ret;
> +}
> +
> +int xc_kexec_load(xc_interface *xch, xen_kexec_load_v2_t *load)
> +{
> + int ret = -1;
> + DECLARE_HYPERCALL;
> + DECLARE_HYPERCALL_BOUNCE(load, sizeof(*load), XC_HYPERCALL_BUFFER_BOUNCE_IN);
> +
> + if ( xc_hypercall_bounce_pre(xch, load) )
> + {
> + PERROR("Could not alloc bounce buffer for kexec_load_v2 hypercall");
> + goto out;
> + }
You'll also need to bounce the "segments" member of this struct.
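To illustrate the point about the nested member: bouncing only the outer struct copies the pointer value, so the copy still refers to the caller's original array. A minimal sketch of a deep copy follows. The demo_* types and functions are hypothetical stand-ins, not the real Xen structures, and plain malloc/memcpy stand in for libxc's bounce helpers.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the real Xen types; field names are illustrative. */
struct demo_segment {
    unsigned long size;
    unsigned long dest_maddr;
};

struct demo_load {
    unsigned int nr_segments;
    struct demo_segment *segments;   /* points into caller memory */
};

/*
 * Copy 'src' and the array it points to into freshly allocated memory.
 * Returns 0 on success, -1 on allocation failure.
 */
int demo_bounce_load(const struct demo_load *src, struct demo_load *dst)
{
    size_t bytes;

    assert(src != NULL && dst != NULL);
    bytes = (size_t)src->nr_segments * sizeof(*src->segments);

    *dst = *src;            /* shallow copy: pointer would still alias caller memory */
    dst->segments = NULL;

    if (src->nr_segments == 0)
        return 0;

    dst->segments = malloc(bytes);
    if (dst->segments == NULL)
        return -1;

    memcpy(dst->segments, src->segments, bytes);   /* deep copy of the array */
    return 0;
}

/* Release the copied array once the (simulated) hypercall has completed. */
void demo_unbounce_load(struct demo_load *dst)
{
    assert(dst != NULL);
    free(dst->segments);
    dst->segments = NULL;
}
```

In the real library the second copy would presumably be a second bounce declaration for the array, with the handle in the bounced struct pointing at it.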
> @@ -2236,4 +2237,56 @@ int xc_compression_uncompress_page(xc_interface *xch, char *compbuf,
> unsigned long compbuf_size,
> unsigned long *compbuf_pos, char *dest);
>
> +/*
> + * Execute an image previously loaded with xc_kexec_load().
> + *
> + * Does not return on success.
> + *
> + * Fails with:
> + * ENOENT if the specified image has not been loaded.
> + */
> +int xc_kexec(xc_interface *xch, int type);
> +
> +/*
> + * Find the machine address and size of certain memory areas.
> + *
> + * The regions are:
A reference to include/public is less likely to get out of sync.
Ian.
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [RFC PATCH 0/3] Improve kexec support in Xen hypervisor
2013-01-16 16:29 [RFC PATCH 0/3] Improve kexec support in Xen hypervisor David Vrabel
` (2 preceding siblings ...)
2013-01-16 16:29 ` [PATCH 3/3] libxc: add API for kexec hypercall David Vrabel
@ 2013-01-16 16:33 ` David Vrabel
2013-01-16 17:02 ` [Xen-devel] " Ian Campbell
2013-01-17 11:27 ` Daniel Kiper
5 siblings, 0 replies; 24+ messages in thread
From: David Vrabel @ 2013-01-16 16:33 UTC (permalink / raw)
To: David Vrabel
Cc: Daniel Kiper, kexec@lists.infradead.org, Eric Biederman,
xen-devel@lists.xen.org
[-- Attachment #1: Type: text/plain, Size: 359 bytes --]
On 16/01/13 16:29, David Vrabel wrote:
> This series of patches improves the kexec hypercall in the Xen
> hypervisor. It is an incomplete prototype but I am posting it early for
> comments on the proposed ABI/API.
And here is a (very) simple test program and a trivial test image I have
used. This provides an example of how the libxc API can be used.
David
[-- Attachment #2: xen-kexec-test.c --]
[-- Type: text/x-csrc, Size: 2081 bytes --]
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <xenctrl.h>
#include <xen/kexec.h>
#include "xc_kexec.h"

extern unsigned long image;
extern unsigned long image_end;

int main(void)
{
    xc_interface *xch;
    unsigned long start, size;
    size_t image_size;
    DECLARE_HYPERCALL_BUFFER(xen_kexec_segment_t, segments);
    DECLARE_HYPERCALL_BUFFER(void, image_buf);
    xen_kexec_load_v2_t load;
    int ret;

    xch = xc_interface_open(NULL, NULL, 0);
    if ( !xch )
    {
        perror("xc_open");
        exit(1);
    }

    ret = xc_kexec_get_range(xch, KEXEC_RANGE_MA_CRASH, 0, &start, &size);
    if ( ret )
    {
        perror("xc_kexec_get_range");
        exit(1);
    }
    printf("Crash region: 0x%08lx-0x%08lx\n", start, start + size);

    image_size = (void *)&image_end - (void *)&image;
    printf("Image %p-%p\n", &image, (void *)&image + image_size);

    image_buf = xc_hypercall_buffer_alloc(xch, image_buf, image_size);
    if ( !image_buf )
    {
        perror("xc_hypercall_buffer_alloc");
        exit(1);
    }
    segments = xc_hypercall_buffer_alloc(xch, segments, sizeof(*segments) * 1);
    if ( !segments )
    {
        perror("xc_hypercall_buffer_alloc");
        exit(1);
    }

    memcpy(image_buf, &image, image_size);
    set_xen_guest_handle(segments[0].buf, image_buf);
    segments[0].size = image_size;
    segments[0].dest_maddr = start;

    load.type = KEXEC_TYPE_DEFAULT;
    load.class = KEXEC_CLASS_32;
    load.nr_segments = 1;
    set_xen_guest_handle(load.segments, segments);
    load.entry_maddr = start;

    ret = xc_kexec_load(xch, &load);
    if ( ret )
    {
        perror("xc_kexec_load");
        exit(1);
    }

    xc_hypercall_buffer_free(xch, image_buf);
    xc_hypercall_buffer_free(xch, segments);

    ret = xc_kexec(xch, KEXEC_TYPE_DEFAULT);
    if ( ret )
    {
        perror("xc_kexec_exec");
        exit(1);
    }

    xc_interface_close(xch);
    return 0;
}
/*
* Local variables:
* mode: C
* c-file-style: "BSD"
* c-basic-offset: 4
* indent-tabs-mode: nil
* End:
*/
[-- Attachment #3: image.S --]
[-- Type: text/plain, Size: 280 bytes --]
        .text
        .code32
        .globl image
image:
1:      mov     $0x3f8+5, %dx
        inb     %dx, %al
        test    $0x20, %al
        je      1b
        mov     $0x3f8, %dx
        mov     $'I', %al
        outb    %al, %dx
        hlt
        .globl image_end
image_end:
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [Xen-devel] [RFC PATCH 0/3] Improve kexec support in Xen hypervisor
2013-01-16 16:29 [RFC PATCH 0/3] Improve kexec support in Xen hypervisor David Vrabel
` (3 preceding siblings ...)
2013-01-16 16:33 ` [RFC PATCH 0/3] Improve kexec support in Xen hypervisor David Vrabel
@ 2013-01-16 17:02 ` Ian Campbell
2013-01-16 17:48 ` David Vrabel
2013-01-17 11:27 ` Daniel Kiper
5 siblings, 1 reply; 24+ messages in thread
From: Ian Campbell @ 2013-01-16 17:02 UTC (permalink / raw)
To: David Vrabel
Cc: Daniel Kiper, kexec@lists.infradead.org, Eric Biederman,
xen-devel@lists.xen.org
On Wed, 2013-01-16 at 16:29 +0000, David Vrabel wrote:
> Since the kexec hypercall is for use by dom0 I have removed the
> implementation of the old load/unload ops and thus guests will require
> updated kexec-tools to load images. Is this acceptable?
How hard would it be to also support the old interface for the benefit
of old kernels and userspaces (e.g. existing distros)?
Rather than a _v2 suffix we have in the past renamed the old ones
_compat and introduced the new ones, with appropriate use of
__XEN_INTERFACE_VERSION__ to paper over things for old users.
See __HYPERVISOR_sched_op for example.
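A rough sketch of that renaming pattern, applied to a load op: the old operation keeps its number under a _compat name, the new operation gets a fresh number, and the unsuffixed name resolves according to the interface version the consumer was built against. All DEMO_* names, the op numbers and the version threshold are assumptions made up for this illustration, not values from the Xen headers.

```c
#include <assert.h>

/* Hypothetical op numbers, chosen only for this sketch. */
#define DEMO_KEXEC_CMD_load_compat 1   /* old ABI keeps its original number */
#define DEMO_KEXEC_CMD_load_new    4   /* new ABI gets a fresh number */

/* Assumed threshold at which the new ABI becomes the default. */
#define DEMO_NEW_ABI_VERSION 0x00040300

/* Consumers that say nothing are built against the latest interface. */
#ifndef DEMO_INTERFACE_VERSION
#define DEMO_INTERFACE_VERSION DEMO_NEW_ABI_VERSION
#endif

/* The unsuffixed name maps to whichever op matches the consumer's version. */
#if DEMO_INTERFACE_VERSION < DEMO_NEW_ABI_VERSION
#define DEMO_KEXEC_CMD_load DEMO_KEXEC_CMD_load_compat
#else
#define DEMO_KEXEC_CMD_load DEMO_KEXEC_CMD_load_new
#endif

/* Op number that code written against the unsuffixed name ends up using. */
int demo_load_op(void)
{
    return DEMO_KEXEC_CMD_load;
}
```

An old consumer that defines DEMO_INTERFACE_VERSION below the threshold keeps compiling unchanged and keeps issuing the old op number, which the hypervisor can continue to serve.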
Ian.
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [Xen-devel] [RFC PATCH 0/3] Improve kexec support in Xen hypervisor
2013-01-16 17:02 ` [Xen-devel] " Ian Campbell
@ 2013-01-16 17:48 ` David Vrabel
2013-01-17 9:35 ` Ian Campbell
2013-01-17 10:46 ` Jan Beulich
0 siblings, 2 replies; 24+ messages in thread
From: David Vrabel @ 2013-01-16 17:48 UTC (permalink / raw)
To: Ian Campbell
Cc: Daniel Kiper, kexec@lists.infradead.org, Eric Biederman,
xen-devel@lists.xen.org
On 16/01/13 17:02, Ian Campbell wrote:
> On Wed, 2013-01-16 at 16:29 +0000, David Vrabel wrote:
>> Since the kexec hypercall is for use by dom0 I have removed the
>> implementation of the old load/unload ops and thus guests will require
>> updated kexec-tools to load images. Is this acceptable?
>
> How hard would it be to also support the old interface for the benefit
> of old kernels and userspaces (e.g. existing distros)?
There's an easy way (see patch 1), but it doesn't have full
compatibility as it no longer executes the code page supplied by the
guest. This won't matter with Linux guests as their code pages can be
replaced by the code in Xen, but may matter with some more obscure
guests that do unusual things in their code pages (are there any like
this -- I doubt it?).
Full compatibility is possible and not that hard. Is it actually worth
it though? Will there be people updating Xen to 4.3 and unable to
update their kernel or userspace tools?
> Rather than a _v2 suffix we have in the past renamed the old ones
> _compat and introduced the new ones, with appropriate use of
> __XEN_INTERFACE_VERSION__ to paper over things for old users.
>
> See __HYPERVISOR_sched_op for example.
Ok. I'll look at this.
David
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [Xen-devel] [RFC PATCH 0/3] Improve kexec support in Xen hypervisor
2013-01-16 17:48 ` David Vrabel
@ 2013-01-17 9:35 ` Ian Campbell
2013-01-17 10:46 ` Jan Beulich
1 sibling, 0 replies; 24+ messages in thread
From: Ian Campbell @ 2013-01-17 9:35 UTC (permalink / raw)
To: David Vrabel
Cc: Daniel Kiper, kexec@lists.infradead.org, Eric Biederman,
xen-devel@lists.xen.org
On Wed, 2013-01-16 at 17:48 +0000, David Vrabel wrote:
> On 16/01/13 17:02, Ian Campbell wrote:
> > On Wed, 2013-01-16 at 16:29 +0000, David Vrabel wrote:
> >> Since the kexec hypercall is for use by dom0 I have removed the
> >> implementation of the old load/unload ops and thus guests will require
> >> updated kexec-tools to load images. Is this acceptable?
> >
> > How hard would it be to also support the old interface for the benefit
> > of old kernels and userspaces (e.g. existing distros)?
>
> There's an easy way (see patch 1), but it doesn't have full
> compatibility as it no longer executes the code page supplied by the
> guest. This won't matter with Linux guests as their code pages can be
> replaced by the code in Xen, but may matter with some more obscure
> guests that do unusual things in their code pages (are there any like
> this -- I doubt it?).
I'm not sure that any other PV dom0 supports kexec on Xen at all.
> Full compatibility is possible and not that hard. Is it actually worth
> it though? Will there be people updating Xen to 4.3 and unable to
> update their kernel or userspace tools?
We've been telling people for a while to try and use their distro
packages where possible. Even if they build Xen from source it would be
nice if they didn't have to also then rebuild their kernel or the kexec
tools etc and could keep their existing packages.
That said I'm not sure how widespread use of kexec is outside of distros
anyway, and they can obviously integrate the right bits.
> > Rather than a _v2 suffix we have in the past renamed the old ones
> > _compat and introduced the new ones, with appropriate use of
> > __XEN_INTERFACE_VERSION__ to paper over things for old users.
> >
> > See __HYPERVISOR_sched_op for example.
>
> Ok. I'll look at this.
>
> David
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [Xen-devel] [RFC PATCH 0/3] Improve kexec support in Xen hypervisor
2013-01-16 17:48 ` David Vrabel
2013-01-17 9:35 ` Ian Campbell
@ 2013-01-17 10:46 ` Jan Beulich
2013-01-17 10:51 ` Jan Beulich
1 sibling, 1 reply; 24+ messages in thread
From: Jan Beulich @ 2013-01-17 10:46 UTC (permalink / raw)
To: David Vrabel, Ian Campbell
Cc: Daniel Kiper, kexec@lists.infradead.org, Eric Biederman,
xen-devel@lists.xen.org
>>> On 16.01.13 at 18:48, David Vrabel <david.vrabel@citrix.com> wrote:
> Full compatibility is possible and not that hard. Is it actually worth
> it though? Will there be people updating Xen to 4.3 and unable to
> update their kernel or userspace tools?
I think we simply shouldn't even be considering removal of
functionality that we can't tell for sure is not being used
anywhere, and that we don't deliberately want to go away (like
was the case with 32-bit hypervisor support).
Jan
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [Xen-devel] [RFC PATCH 0/3] Improve kexec support in Xen hypervisor
2013-01-17 10:46 ` Jan Beulich
@ 2013-01-17 10:51 ` Jan Beulich
0 siblings, 0 replies; 24+ messages in thread
From: Jan Beulich @ 2013-01-17 10:51 UTC (permalink / raw)
To: David Vrabel, Ian Campbell
Cc: Daniel Kiper, kexec@lists.infradead.org, Eric Biederman,
xen-devel@lists.xen.org
>>> On 17.01.13 at 11:46, "Jan Beulich" <JBeulich@suse.com> wrote:
>>>> On 16.01.13 at 18:48, David Vrabel <david.vrabel@citrix.com> wrote:
>> Full compatibility is possible and not that hard. Is it actually worth
>> it though? Will there be people updating Xen to 4.3 and unable to
>> update their kernel or userspace tools?
>
> I think we simply shouldn't even be considering removal of
> functionality that we can't tell for sure is not being used
> anywhere, and that we don't deliberately want to go away (like
> was the case with 32-bit hypervisor support).
Forgot to say - for our distros, the kernel is being updated quite
regularly, whereas I have no idea at what intervals the kexec
user space code gets pulled. Plus I don't think I've seen any
patches to this one yet, so you're basically disabling all kexec
functionality for the time being when you remove the old code.
Jan
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [RFC PATCH 0/3] Improve kexec support in Xen hypervisor
2013-01-16 16:29 [RFC PATCH 0/3] Improve kexec support in Xen hypervisor David Vrabel
` (4 preceding siblings ...)
2013-01-16 17:02 ` [Xen-devel] " Ian Campbell
@ 2013-01-17 11:27 ` Daniel Kiper
2013-01-17 11:37 ` [Xen-devel] " Andrew Cooper
2013-01-17 13:01 ` David Vrabel
5 siblings, 2 replies; 24+ messages in thread
From: Daniel Kiper @ 2013-01-17 11:27 UTC (permalink / raw)
To: David Vrabel; +Cc: kexec, Eric Biederman, xen-devel
On Wed, Jan 16, 2013 at 04:29:03PM +0000, David Vrabel wrote:
> This series of patches improves the kexec hypercall in the Xen
> hypervisor. It is an incomplete prototype but I am posting it early for
> comments on the proposed ABI/API.
>
> This allows a privileged Xen guest to load kexec images into the
> hypervisor from a userspace tool without using the Linux kernel's
> kexec subsystem. It is the first step to supporting kexec of crash
> kernels from a pv-ops dom0 kernel (the required kernel and kexec-tools
> patches will be posted later).
>
> The kernel will require a kexec hypercall somewhere in the
> crash_kexec() path to actually exec the loaded image. Any preferences
> on how the hook for this should be implemented? Note that the kernel
This should be implemented as a stub which would be called by machine_kexec()
and which would then call the relevant hypercall.
> won't be aware that an image has been loaded as it is loaded directly
> into the hypervisor and not via the kernel's kexec_load system call.
Maybe we should have a special kexec hypercall function which allows us
to ask the hypervisor whether an image is loaded or not.
> Since the kexec hypercall is for use by dom0 I have removed the
> implementation of the old load/unload ops and thus guests will require
> updated kexec-tools to load images. Is this acceptable?
Not yet. I think that the old interface should stay as long as the Xen Linux
kernel can run on the latest versions of the hypervisor.
Daniel
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [Xen-devel] [RFC PATCH 0/3] Improve kexec support in Xen hypervisor
2013-01-17 11:27 ` Daniel Kiper
@ 2013-01-17 11:37 ` Andrew Cooper
2013-01-17 12:59 ` Daniel Kiper
2013-01-17 13:01 ` David Vrabel
1 sibling, 1 reply; 24+ messages in thread
From: Andrew Cooper @ 2013-01-17 11:37 UTC (permalink / raw)
To: Daniel Kiper
Cc: xen-devel@lists.xen.org, kexec@lists.infradead.org, David Vrabel,
Eric Biederman
On 17/01/13 11:27, Daniel Kiper wrote:
> On Wed, Jan 16, 2013 at 04:29:03PM +0000, David Vrabel wrote:
>> This series of patches improves the kexec hypercall in the Xen
>> hypervisor. It is an incomplete prototype but I am posting it early for
>> comments on the proposed ABI/API.
>>
>> This allows a privileged Xen guest to load kexec images into the
>> hypervisor from a userspace tool without using the Linux kernel's
>> kexec subsystem. It is the first step to supporting kexec of crash
>> kernels from a pv-ops dom0 kernel (the required kernel and kexec-tools
>> patches will be posted later).
>>
>> The kernel will require a kexec hypercall somewhere in the
>> crash_kexec() path to actually exec the loaded image. Any preferences
>> on how the hook for this should be implemented? Note that the kernel
> This should be implemented as a stub which would be called by machine_kexec()
> and which would then call the relevant hypercall.
>
>> won't be aware that an image has been loaded as it is loaded directly
>> into the hypervisor and not via the kernel's kexec_load system call.
> Maybe we should have a special kexec hypercall function which allows us
> to ask the hypervisor whether an image is loaded or not.
But we already have this information. If the kexec crash hypercall
returns back to dom0 then a crash kernel is not loaded.
One could certainly argue that even if a crash kernel is not loaded, a
kexec crash hypercall means that dom0 is in a bad state and Xen should
panic() anyway, which is the case on any other form of dom0 crash.
~Andrew
>
>> Since the kexec hypercall is for use by dom0 I have removed the
>> implementation of the old load/unload ops and thus guests will require
>> updated kexec-tools to load images. Is this acceptable?
> Not yet. I think that the old interface should stay as long as the Xen Linux
> kernel can run on the latest versions of the hypervisor.
>
> Daniel
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [Xen-devel] [RFC PATCH 0/3] Improve kexec support in Xen hypervisor
2013-01-17 11:37 ` [Xen-devel] " Andrew Cooper
@ 2013-01-17 12:59 ` Daniel Kiper
0 siblings, 0 replies; 24+ messages in thread
From: Daniel Kiper @ 2013-01-17 12:59 UTC (permalink / raw)
To: Andrew Cooper
Cc: xen-devel@lists.xen.org, kexec@lists.infradead.org, David Vrabel,
Eric Biederman
On Thu, Jan 17, 2013 at 11:37:48AM +0000, Andrew Cooper wrote:
> On 17/01/13 11:27, Daniel Kiper wrote:
> > On Wed, Jan 16, 2013 at 04:29:03PM +0000, David Vrabel wrote:
> >> This series of patches improves the kexec hypercall in the Xen
> >> hypervisor. It is an incomplete prototype but I am posting it early for
> >> comments on the proposed ABI/API.
> >>
> >> This allows a privileged Xen guest to load kexec images into the
> >> hypervisor from a userspace tool without using the Linux kernel's
> >> kexec subsystem. It is the first step to supporting kexec of crash
> >> kernels from a pv-ops dom0 kernel (the required kernel and kexec-tools
> >> patches will be posted later).
> >>
> >> The kernel will require a kexec hypercall somewhere in the
> >> crash_kexec() path to actually exec the loaded image. Any preferences
> >> on how the hook for this should be implemented? Note that the kernel
> > This should be implemented as a stub which would be called by machine_kexec()
> > and which would then call the relevant hypercall.
> >
> >> won't be aware that an image has been loaded as it is loaded directly
> >> into the hypervisor and not via the kernel's kexec_load system call.
> > Maybe we should have a special kexec hypercall function which allows us
> > to ask the hypervisor whether an image is loaded or not.
>
> But we already have this information. If the kexec crash hypercall
> returns back to dom0 then a crash kernel is not loaded.
>
> One could certainly argue that even if a crash kernel is not loaded, a
> kexec crash hypercall means that dom0 is in a bad state and Xen should
> panic() anyway, which is the case on any other form of dom0 crash.
I thought about something which does not do a big bang when a kexec image is
loaded. Now I think that it should return what type of image is loaded
(crash and/or regular one).
Daniel
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [RFC PATCH 0/3] Improve kexec support in Xen hypervisor
2013-01-17 11:27 ` Daniel Kiper
2013-01-17 11:37 ` [Xen-devel] " Andrew Cooper
@ 2013-01-17 13:01 ` David Vrabel
2013-01-17 13:25 ` Eric W. Biederman
1 sibling, 1 reply; 24+ messages in thread
From: David Vrabel @ 2013-01-17 13:01 UTC (permalink / raw)
To: Daniel Kiper
Cc: kexec@lists.infradead.org, Eric Biederman,
xen-devel@lists.xen.org
On 17/01/13 11:27, Daniel Kiper wrote:
> On Wed, Jan 16, 2013 at 04:29:03PM +0000, David Vrabel wrote:
>> This series of patches improves the kexec hypercall in the Xen
>> hypervisor. It is an incomplete prototype but I am posting it early for
>> comments on the proposed ABI/API.
>>
>> This allows a privileged Xen guest to load kexec images into the
>> hypervisor from a userspace tool without using the Linux kernel's
>> kexec subsystem. It is the first step to supporting kexec of crash
>> kernels from a pv-ops dom0 kernel (the required kernel and kexec-tools
>> patches will be posted later).
>>
>> The kernel will require a kexec hypercall somewhere in the
>> crash_kexec() path to actually exec the loaded image. Any preferences
>> on how the hook for this should be implemented? Note that the kernel
>
> This should be implemented as a stub which would be called by machine_kexec()
> and which would then call the relevant hypercall.
That's a complicated way of making a simple function call. What's the
justification for doing it this way?
Because Linux doesn't allow images to be unloaded, it's not clear how we
can get sensible behavior if the image is unloaded from Xen. The stub
will remain loaded in Linux but will not be able to return (easily?).
>> won't be aware that an image has been loaded as it is loaded directly
>> into the hypervisor and not via the kernel's kexec_load system call.
>
> Maybe we should have a special kexec hypercall function which allows us
> to ask the hypervisor whether an image is loaded or not.
It's not possible to check for an image atomically. I don't think this
is necessary anyway, KEXEC_CMD_exec will return if no image is loaded.
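As a rough illustration of why no separate query is needed: the caller can simply attempt the exec and treat the fact that control came back as the "nothing loaded" signal, which avoids any window between checking and executing. Everything named demo_* below is hypothetical, the hypercall is replaced by a function pointer so the logic can be exercised in userspace, and the negative errno return convention is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical signature for "issue the exec op". In a real kernel this
 * would be the kexec hypercall, which does not return when an image runs. */
typedef int (*demo_exec_fn)(int type);

/*
 * Try to exec the image held by the hypervisor. If control comes back,
 * no image was started, so report that to the caller, which can then
 * continue down its usual crash path.
 */
int demo_try_hypervisor_kexec(demo_exec_fn exec_op, int type)
{
    int ret;

    assert(exec_op != NULL);
    ret = exec_op(type);
    /* Reaching this line is itself the "no image loaded" signal. */
    return ret != 0 ? ret : -ENOENT;
}

/* Stub standing in for a hypervisor with nothing loaded. */
int demo_exec_nothing_loaded(int type)
{
    (void)type;
    return -ENOENT;
}
```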
David
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [RFC PATCH 0/3] Improve kexec support in Xen hypervisor
2013-01-17 13:01 ` David Vrabel
@ 2013-01-17 13:25 ` Eric W. Biederman
0 siblings, 0 replies; 24+ messages in thread
From: Eric W. Biederman @ 2013-01-17 13:25 UTC (permalink / raw)
To: David Vrabel
Cc: Daniel Kiper, kexec@lists.infradead.org, xen-devel@lists.xen.org
David Vrabel <david.vrabel@citrix.com> writes:
> On 17/01/13 11:27, Daniel Kiper wrote:
>> On Wed, Jan 16, 2013 at 04:29:03PM +0000, David Vrabel wrote:
>>> The kernel will require a kexec hypercall somewhere in the
>>> crash_kexec() path to actually exec the loaded image. Any preferences
>>> on how the hook for this should be implemented? Note that the kernel
>>
>> This should be implemented as a stub which would be called by machine_kexec()
>> and which would then call the relevant hypercall.
>
> That's a complicated way of making a simple function call. What's the
> justification for doing it this way?
Because there is no justification for Xen to have a hack in the
crash_kexec() path. crash_kexec comes as close as it can to doing
a simple function call.
> Because Linux doesn't allow images to be unloaded, it's not clear how we
> can get sensible behavior if the image is unloaded from Xen. The stub
> will remain loaded in Linux but will not be able to return (easily?).
Linux allows images to be unloaded by loading an empty image. Aka an
image with nr_segments == 0.
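For reference, a minimal sketch of such an "empty image" request through the kexec_load(2) system call. glibc provides no wrapper, so it goes through syscall(2). The demo_* names are hypothetical helpers for this sketch, and the flag value is copied from <linux/kexec.h> so the example does not depend on kernel headers being installed.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Value of KEXEC_ON_CRASH from <linux/kexec.h>. */
#define DEMO_KEXEC_ON_CRASH 0x00000001UL

/* Arguments for kexec_load(2); the struct itself is hypothetical. */
struct demo_kexec_args {
    unsigned long entry;
    unsigned long nr_segments;
    void *segments;
    unsigned long flags;
};

/* Build the empty-image request that unloads the normal or the crash image. */
struct demo_kexec_args demo_unload_args(int crash)
{
    struct demo_kexec_args a = { 0, 0, NULL, 0 };

    assert(crash == 0 || crash == 1);
    if (crash)
        a.flags = DEMO_KEXEC_ON_CRASH;
    return a;
}

/* Issue the request. Needs CAP_SYS_BOOT; returns -1 with errno set on failure. */
long demo_unload_image(int crash)
{
#ifdef SYS_kexec_load
    struct demo_kexec_args a = demo_unload_args(crash);

    return syscall(SYS_kexec_load, a.entry, a.nr_segments, a.segments, a.flags);
#else
    (void)crash;
    return -1;   /* system call not available on this platform */
#endif
}
```

Calling demo_unload_image(1) as a privileged user would ask the kernel to drop the crash image; demo_unload_image(0) targets the normal kexec image.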
>>> won't be aware that an image has been loaded as it is loaded directly
>>> into the hypervisor and not via the kernel's kexec_load system call.
>>
>> Maybe we should have a special kexec hypercall function which allows us
>> to ask the hypervisor whether an image is loaded or not.
>
> It's not possible to check for an image atomically. I don't think this
> is necessary anyway, KEXEC_CMD_exec will return if no image is loaded.
If Xen supported something other than the kexec on panic path this could
be interesting. The current linux userspace does things conditionally
if you are in the reboot runlevel and kexec is coming up. Certainly for
the kexec on panic path it is non-interesting.
Eric
^ permalink raw reply [flat|nested] 24+ messages in thread