From mboxrd@z Thu Jan 1 00:00:00 1970
From: Don Slutz
Subject: Re: [PATCHv9 0/9] Xen: extend kexec hypercall for use with pv-ops kernels
Date: Thu, 31 Oct 2013 12:59:34 -0400
Message-ID: <52728C76.8040501@terremark.com>
References: <1381251310-29449-1-git-send-email-david.vrabel@citrix.com>
 <20131018184031.GA12658@router-fw-old.local.net-space.pl>
 <5261C0D0.4090606@cantab.net>
 <20131021121940.GY3626@debian70-amd64.local.net-space.pl>
 <52652469.2040703@citrix.com>
 <20131021202032.GE3626@debian70-amd64.local.net-space.pl>
 <52713A86.3050102@citrix.com>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="------------060508080104080408050406"
In-Reply-To: <52713A86.3050102@citrix.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: David Vrabel, Daniel Kiper
Cc: Keir Fraser, Jan Beulich, Daniel Kiper, xen-devel@lists.xen.org
List-Id: xen-devel@lists.xenproject.org

--------------060508080104080408050406
Content-Type: multipart/alternative; boundary="------------090109010008020708030205"

--------------090109010008020708030205
Content-Type: text/plain; charset="ISO-8859-1"; format=flowed
Content-Transfer-Encoding: 7bit

On 10/30/13 12:57, David Vrabel wrote:
> On 21/10/13 21:20, Daniel Kiper wrote:
>> On Mon, Oct 21, 2013 at 01:56:09PM +0100, David Vrabel wrote:
>>> On 21/10/13 13:19, Daniel Kiper wrote:
>>>> On Sat, Oct 19, 2013 at 12:14:24AM +0100, David Vrabel wrote:
>>>>> On 18/10/2013 19:40, Daniel Kiper wrote:
>>>>>> On Tue, Oct 08, 2013 at 05:55:01PM +0100, David Vrabel wrote:
>>>>>>> The series (for Xen 4.4) improves the kexec hypercall by making Xen
>>>>>>> responsible for loading and relocating the image.  This allows kexec
>>>>>>> to be usable by pv-ops kernels and should allow kexec to be usable
>>>>>>> from a HVM or PVH privileged domain.
>>>>>> I could not load panic image because Xen crashes in following way:
>>>>>>
>>>>>> (XEN) ----[ Xen-4.4-unstable  x86_64  debug=y  Tainted:    C ]----
>>>>> [...]
>>>>>> (XEN) Xen call trace:
>>>>>> (XEN)    [<ffff82d080114ef2>] kimage_free+0x67/0xd2
>>>>>> (XEN)    [<ffff82d0801151f9>] do_kimage_alloc+0x29c/0x2f0
>>>>>> (XEN)    [<ffff82d0801152fe>] kimage_alloc+0xb1/0xe6
>>>>>> (XEN)    [<ffff82d0801144c0>] do_kexec_op_internal+0x68e/0x789
>>>>>> (XEN)    [<ffff82d0801145c9>] do_kexec_op+0xe/0x12
>>>>>> (XEN)    [<ffff82d0802268cb>] syscall_enter+0xeb/0x145

I get the same thing.

>>>>> The appended patch should fix this crash which only occurs if there's an
>>>>> error in do_kimage_alloc().
>>>> Patch had wrapped lines. I hope that I fixed it properly.
>>>> I cannot load panic kernel. kexec fails with following message:

My version of this patch is attached (0001...). With it applied, Xen has both crashed right away and not:
(XEN) [2013-10-30 21:26:39] ----[ Xen-4.4-unstable  x86_64  debug=y  Not tainted ]----
(XEN) [2013-10-30 21:26:39] CPU:    7
(XEN) [2013-10-30 21:26:39] RIP:    e008:[<ffff82d08012fd72>] xmem_pool_free+0x6f/0x2e9
(XEN) [2013-10-30 21:26:39] RFLAGS: 0000000000010286   CONTEXT: hypervisor
(XEN) [2013-10-30 21:26:39] rax: ffff8308df5a5e90   rbx: ffff83083f48f9f0   rcx: 000000000001b410
(XEN) [2013-10-30 21:26:39] rdx: 00000000a01164a0   rsi: ffff83083a1ae000   rdi: ffff83083a1af86c
(XEN) [2013-10-30 21:26:39] rbp: ffff830823fbfd88   rsp: ffff830823fbfd68   r8:  000000000000000c
(XEN) [2013-10-30 21:26:39] r9:  0000000010000000   r10: ffff83083f4904f0   r11: 00000000004c6000
(XEN) [2013-10-30 21:26:39] r12: ffff83083a1ae000   r13: ffff83083a1af868   r14: 00007fff5a9b7fc0
(XEN) [2013-10-30 21:26:39] r15: 0000000000000003   cr0: 0000000080050033   cr4: 00000000000426f0
(XEN) [2013-10-30 21:26:39] cr3: 000000066b482000   cr2: ffff8308df5a5e98
(XEN) [2013-10-30 21:26:39] ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e010   cs: e008
(XEN) [2013-10-30 21:26:39] Xen stack trace from rsp=ffff830823fbfd68:
(XEN) [2013-10-30 21:26:39]    00000000000000e0 00000000ffffff9d ffff83083f4904f0 ffff83083f48f9f0
(XEN) [2013-10-30 21:26:39]    ffff830823fbfdc8 ffff82d0801304fe ffff830823fbfdc8 00000000ffffff9d
(XEN) [2013-10-30 21:26:39]    ffff83083f4904f0 ffff8800870bb5e8 00007fff5a9b7fc0 0000000000000003
(XEN) [2013-10-30 21:26:39]    ffff830823fbfee8 ffff82d08011450c ffff830823fbfef8 0000000000000000
(XEN) [2013-10-30 21:26:39]    0000000000000002 ffff830823fb10b8 ffff830823fbfe18 ffff82d08012b104
(XEN) [2013-10-30 21:26:39]    ffff8300bf2f4060 000000000066a0cb 0000000000000000 ffff8300bf2f4000
(XEN) [2013-10-30 21:26:39]    ffff82004001f000 00007ff000000003 00000007003e0001 00007f84d993b004
(XEN) [2013-10-30 21:26:39]    000000001ff53720 ffff830823fb1000 ffff830823fbfe68 ffff82d08016fb23
(XEN) [2013-10-30 21:26:39]    ffff830823fbfe88 ffff82d080221348 ffff830823fb1000 ffff830823fbff18
(XEN) [2013-10-30 21:26:39]    ffff830823fbfef8 ffff82d0802214a8 00000000d69204a7 0000000000000000
(XEN) [2013-10-30 21:26:39]    0000000000000217 0000003564eee0a7 0000000000000100 0000003564eee0a7
(XEN) [2013-10-30 21:26:39]    ffff830823fbfed8 ffff82d08016fb23 ffff8300bf2f4000 00007fff5a9b7fc0
(XEN) [2013-10-30 21:26:39]    ffff830823fbfef8 ffff82d0801145d9 00007cf7dc0400c7 ffff82d0802268cb
(XEN) [2013-10-30 21:26:39]    ffffffff810014aa 0000000000000025 0000001efd525f9a 0000001efd60d300
(XEN) [2013-10-30 21:26:39]    0000000000000000 00000021d69204a7 ffff880087debe88 ffff880005d9a500
(XEN) [2013-10-30 21:26:39]    0000000000000286 00007fff5a9b8180 ffff880087191480 000000001ff53720
(XEN) [2013-10-30 21:26:39]    0000000000000025 ffffffff810014aa 0000003564a148e5 00007f84d8f3f004
(XEN) [2013-10-30 21:26:39]    0000000000000004 0001010000000000 ffffffff810014aa 000000000000e033
(XEN) [2013-10-30 21:26:39]    0000000000000286 ffff880087debde0 000000000000e02b d53835942492fce9
...
The auto reboot overwrote the rest.  When it did not crash right away, the next day I got error messages about page table issues.  (I forgot that the request to write hypervisor console data to a file is not the default.)  I hope to still have the data at home.

Best guess at this point is that the error handling still has issues.

>>>> kexec_load failed: Cannot assign requested address
>>> This is -EADDRINVALID which means one of
>>>
>>> a) the entry point isn't within a segment.
>>> b) one of the segments is not page aligned.
>>> c) one of the segments is not within the crash region.
>>>
>>> But the segments kexec has constructed all looked fine to me (and
>>> similar to the segments I see).

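For reference, the three conditions listed above can be sketched roughly as follows. This is only an illustration of the checks as described in the quoted text, not the hypervisor code: `struct seg`, `enum load_check`, `check_crash_load()` and `SKETCH_PAGE_SIZE` are hypothetical names, and a 4 KiB page size is assumed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u   /* assumption: 4 KiB pages */

/* Hypothetical, simplified segment: destination range only. */
struct seg {
    uint64_t start;
    uint64_t size;
};

enum load_check {
    LOAD_OK,
    ENTRY_OUTSIDE_SEGMENTS,        /* condition a) */
    SEGMENT_UNALIGNED,             /* condition b) */
    SEGMENT_OUTSIDE_CRASH_REGION   /* condition c) */
};

/* Apply conditions a) to c) to a list of segments for a crash image. */
static enum load_check check_crash_load(const struct seg *segs, size_t nr,
                                        uint64_t entry,
                                        uint64_t crash_start,
                                        uint64_t crash_size)
{
    uint64_t crash_end = crash_start + crash_size;
    bool entry_found = false;

    for (size_t i = 0; i < nr; i++) {
        uint64_t start = segs[i].start;
        uint64_t end = start + segs[i].size;

        /* b) both ends of the segment must sit on a page boundary */
        if ((start % SKETCH_PAGE_SIZE) || (end % SKETCH_PAGE_SIZE))
            return SEGMENT_UNALIGNED;

        /* c) the whole segment must lie inside the crash region */
        if (start < crash_start || end > crash_end)
            return SEGMENT_OUTSIDE_CRASH_REGION;

        /* a) note whether the entry point falls inside this segment */
        if (entry >= start && entry < end)
            entry_found = true;
    }

    return entry_found ? LOAD_OK : ENTRY_OUTSIDE_SEGMENTS;
}
```

With crashkernel=256M@64M the crash region would be start 0x4000000, size 0x10000000, so a segment covering 0x0 - 0x9b800 is rejected by the alignment check before the other two conditions are even considered.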
I have tracked this down to the following code in kexec-tools:

+    if (info->kexec_flags & KEXEC_ON_CRASH) {
+        set_xen_guest_handle(xen_segs[s].buf.h, HYPERCALL_BUFFER_NULL);
+        xen_segs[s].buf_size = 0;
+        xen_segs[s].dest_maddr = info->backup_src_start;
+        xen_segs[s].dest_size = info->backup_src_size;
+        nr_segments++;
+    }
In some cases this passes the first e820 range, which for me is:

(XEN) Xen-e820 RAM map:
(XEN)  0000000000000000 - 000000000009b800 (usable)
(XEN)  000000000009b800 - 00000000000a0000 (reserved)
(XEN)  00000000000e0000 - 0000000000100000 (reserved)
(XEN)  0000000000100000 - 00000000bf63f000 (usable)
...
000000000009b800 is not page aligned, so the segment is rejected by this test:

        if ( (mstart & ~PAGE_MASK) || (mend & ~PAGE_MASK) )
            goto out;
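
To make the arithmetic concrete, here is a small stand-alone sketch of the same mask test. It assumes 4 KiB pages and defines its own PAGE_SHIFT, PAGE_SIZE and PAGE_MASK rather than using the Xen headers; `page_offset()` and `segment_misaligned()` are names made up for this illustration.

```c
#include <stdint.h>

#define PAGE_SHIFT 12                               /* assumption: 4 KiB pages */
#define PAGE_SIZE  ((uint64_t)1 << PAGE_SHIFT)
#define PAGE_MASK  (~(PAGE_SIZE - 1))

/* Offset of addr within its page; 0 when addr is page aligned. */
static uint64_t page_offset(uint64_t addr)
{
    return addr & ~PAGE_MASK;
}

/* Same shape as the quoted test: non-zero means the segment is rejected. */
static int segment_misaligned(uint64_t mstart, uint64_t mend)
{
    return (mstart & ~PAGE_MASK) || (mend & ~PAGE_MASK);
}
```

For the first e820 range, page_offset(0x9b800) is 0x800, so segment_misaligned(0x0, 0x9b800) is non-zero, whereas both ends of the 0x100000 - 0xbf63f000 range have a zero offset.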

A possible fix is attached (0002...). With it applied, I can get into the crash kernel.

   -Don Slutz

>>> I'm afraid I cannot reproduce either of your failures.  Are you sure
>>> you've built everything correctly?  In particular has kexec-tools been
>>> built against the correct version of Xen headers?
>> It looks that I build it correctly but I will double check it.
>> Could you send me your Xen/Linux boot command lines and kexec
>> command lines for normal and panic kernel? Could you tell me
>> what is your RAM size?
> AMD Opteron 4264 with 8 GiB RAM.
>
> Xen 4.4-unstable debug=y:
>
> com1=115200,8n1 console=com1 crashkernel=256M@64M
>
> Linux 3.12-rc4
>
> root=/dev/mapper/cam--st09-root ro console=hvc0
>
> Normal image:
>
> build/sbin/kexec --debug --console-serial --serial-baud=115200
> --command-line="console=ttyS0,115200n8 maxcpus=1" -l
> /boot/vmlinuz-3.11.0.davidvr
>
> Panic image:
>
> build/sbin/kexec --debug --console-serial --serial-baud=115200
> --command-line="console=ttyS0,115200n8 maxcpus=1" -p
> /boot/vmlinuz-3.11.0.davidvr
>
> David
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

--------------090109010008020708030205--

--------------060508080104080408050406
Content-Type: text/x-patch;
 name="0001-kexec-v9a-Fix-error-handling-if-do_kimage_alloc-repo.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
 filename*0="0001-kexec-v9a-Fix-error-handling-if-do_kimage_alloc-repo.pa";
 filename*1="tch"

>>From bb2407bb2b712b33355d8c3df751bd1ecde1f971 Mon Sep 17 00:00:00 2001
From: Don Slutz
Date: Wed, 30 Oct 2013 17:17:24 -0400
Subject: [PATCH 1/2] kexec: v9a -- Fix error handling if do_kimage_alloc() reports an error

Signed-off-by: Don Slutz
---
 xen/common/kimage.c | 18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/xen/common/kimage.c b/xen/common/kimage.c
index 6bee9cf..f2b331e 100644
--- a/xen/common/kimage.c
+++ b/xen/common/kimage.c
@@ -179,7 +179,7 @@ static int do_kimage_alloc(struct kexec_image **rimage, paddr_t entry,
                                     page_to_maddr(image->control_code_page),
                                     page_to_maddr(image->control_code_page));
     if ( result < 0 )
-        return result;
+        goto out;
 
     /* Add an empty indirection page. */
     image->entry_page = kimage_alloc_control_page(image, 0);
@@ -188,7 +188,7 @@ static int do_kimage_alloc(struct kexec_image **rimage, paddr_t entry,
     result = machine_kexec_add_page(image, page_to_maddr(image->entry_page),
                                     page_to_maddr(image->entry_page));
     if ( result < 0 )
-        return result;
+        goto out;
 
     image->head = page_to_maddr(image->entry_page);
 
@@ -510,15 +510,14 @@ static void kimage_free_entry(kimage_entry_t entry)
     free_domheap_page(page);
 }
 
-void kimage_free(struct kexec_image *image)
+static void kimage_free_all_entries(struct kexec_image *image)
 {
     kimage_entry_t *ptr, entry;
     kimage_entry_t ind = 0;
 
-    if ( !image )
+    if ( !image->head )
         return;
 
-    kimage_free_extra_pages(image);
     for_each_kimage_entry(image, ptr, entry)
     {
         if ( entry & IND_INDIRECTION )
@@ -537,8 +536,15 @@ void kimage_free(struct kexec_image *image)
     /* Free the final indirection page. */
     if ( ind & IND_INDIRECTION )
         kimage_free_entry(ind);
+}
 
-    /* Free the kexec control pages. */
+void kimage_free(struct kexec_image *image)
+{
+    if ( !image )
+        return;
+
+    kimage_free_extra_pages(image);
+    kimage_free_all_entries(image);
     kimage_free_page_list(&image->control_pages);
     xfree(image->segments);
     xfree(image);
-- 
1.7.11.7

--------------060508080104080408050406
Content-Type: text/x-patch;
 name="0002-kexec-Skip-checking-of-info-backup_src_start-info-ba.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
 filename*0="0002-kexec-Skip-checking-of-info-backup_src_start-info-ba.pa";
 filename*1="tch"

>>From a4eb6108908c65559d4b3997949fb41f0b2828a5 Mon Sep 17 00:00:00 2001
From: Don Slutz
Date: Thu, 31 Oct 2013 11:31:06 -0400
Subject: [PATCH 2/2] kexec: Skip checking of info->backup_src_start & info->backup_src_size.

Signed-off-by: Don Slutz
---
 xen/common/kimage.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/xen/common/kimage.c b/xen/common/kimage.c
index f2b331e..92c6ebf 100644
--- a/xen/common/kimage.c
+++ b/xen/common/kimage.c
@@ -124,6 +124,9 @@ static int do_kimage_alloc(struct kexec_image **rimage, paddr_t entry,
     {
         paddr_t mstart, mend;
 
+        if ( guest_handle_is_null(segments[i].buf.h) )
+            continue;
+
         mstart = image->segments[i].dest_maddr;
         mend = mstart + image->segments[i].dest_size;
         if ( (mstart & ~PAGE_MASK) || (mend & ~PAGE_MASK) )
@@ -142,11 +145,18 @@ static int do_kimage_alloc(struct kexec_image **rimage, paddr_t entry,
         paddr_t mstart, mend;
         unsigned long j;
 
+        if ( guest_handle_is_null(segments[i].buf.h) )
+            continue;
+
         mstart = image->segments[i].dest_maddr;
         mend = mstart + image->segments[i].dest_size;
         for (j = 0; j < i; j++ )
         {
             paddr_t pstart, pend;
+
+            if ( guest_handle_is_null(segments[i].buf.h) )
+                continue;
+
             pstart = image->segments[j].dest_maddr;
             pend = pstart + image->segments[j].dest_size;
             /* Do the segments overlap? */
@@ -163,6 +173,9 @@ static int do_kimage_alloc(struct kexec_image **rimage, paddr_t entry,
     result = -EINVAL;
     for ( i = 0; i < nr_segments; i++ )
     {
+        if ( guest_handle_is_null(segments[i].buf.h) )
+            continue;
+
         if ( image->segments[i].buf_size > image->segments[i].dest_size )
             goto out;
     }
-- 
1.7.11.7

--------------060508080104080408050406
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

--------------060508080104080408050406--