From: Vasiliy Tolstov <v.tolstov@selfip.ru>
To: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>,
Ian Jackson <Ian.Jackson@eu.citrix.com>,
"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: xen 4.3 test report
Date: Fri, 31 May 2013 08:56:47 +0400
Message-ID: <CACaajQvAxOCCc-=sGm_ZCf+Z8w__o2f3+wD9obRKZjOK98kQfw@mail.gmail.com>
In-Reply-To: <20130525114058.GH2418@localhost.localdomain>
migration with qemu-xen-traditional:
xen16:~ # xl migrate --debug 21-10887 ib-xen06.kh11.clodo.ru
the global config option vifscript is deprecated, please switch to
vif.default.script
the global config option vifscript is deprecated, please switch to
vif.default.script
migration target: Ready to receive domain.
Saving to migration stream new xl format (info 0x0/0x0/631)
Loading new save file <incoming migration stream> (new xl fmt info 0x0/0x0/631)
Savefile contains xl domain config
xc: progress: Reloading memory pages: 53248/1048576 5%
xc: progress: Reloading memory pages: 105472/1048576 10%
xc: progress: Reloading memory pages: 157658/1048576 15%
xc: progress: Reloading memory pages: 209882/1048576 20%
xc: progress: Reloading memory pages: 263130/1048576 25%
migration receiver stream contained unexpected data instead of ready message
(command run was: exec ssh ib-xen06.kh11.clodo.ru xl migrate-receive -d )
migration target: Transfer complete, requesting permission to start domain.
libxl: error: libxl_utils.c:393:libxl_read_exactly: file/stream
truncated reading GO message from migration stream
migration target: Failure, destroying our copy.
migration child [15697] not exiting, no longer waiting (exit status
will be unreported)
Migration failed, resuming at sender.
migration target: Cleanup OK, granting sender permission to resume.
xl dmesg:
(XEN) event_channel.c:297:d1 d1v0 [evtchn_bind_virq:297], port:3, rc:-17
(XEN) event_channel.c:298:d1 EVTCHNOP failure: error -17
xl console:
[ 981.869689] PM: late freeze of devices complete after 0.073 msecs
[ 981.873833] ------------[ cut here ]------------
[ 981.873833] kernel BUG at
/build/buildd-linux_3.2.41-2+deb7u2-amd64-NHQI9B/linux-3.2.41/drivers/xen/events.c:1489!
[ 981.873833] invalid opcode: 0000 [#1] SMP
[ 981.873833] CPU 0
[ 981.873833] Modules linked in: xenfs snd_pcm snd_page_alloc
snd_timer snd coretemp soundcore crc32c_intel evdev joydev pcspkr ext3
mbcache jbd xen_blkfront xen_netfront
[ 981.873833]
[ 981.873833] Pid: 6, comm: migration/0 Not tainted 3.2.0-4-amd64 #1
Debian 3.2.41-2+deb7u2
[ 981.873833] RIP: e030:[<ffffffff8121c4e2>] [<ffffffff8121c4e2>]
xen_irq_resume+0xbd/0x28b
[ 981.873833] RSP: e02b:ffff88001ae99d20 EFLAGS: 00010082
[ 981.873833] RAX: ffffffffffffffef RBX: 0000000000000000 RCX: 0000000000000001
[ 981.873833] RDX: 0000000000000000 RSI: 00000000deadbeef RDI: 00000000deadbeef
[ 981.873833] RBP: 0000000000000000 R08: ffff88001f026e00 R09: ffff88001ae99d48
[ 981.873833] R10: 0000000000013780 R11: 0000000000013780 R12: 0000000000000010
[ 981.873833] R13: 0000000000010dd0 R14: 0000000000010d70 R15: 0000000000000000
[ 981.873833] FS: 00007f1fff8d37a0(0000) GS:ffff88001fc00000(0000)
knlGS:0000000000000000
[ 981.873833] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 981.873833] CR2: 000000f8400b5410 CR3: 00000000033ad000 CR4: 0000000000002660
[ 981.873833] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 981.873833] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 981.873833] Process migration/0 (pid: 6, threadinfo
ffff88001ae98000, task ffff88001ae8e0c0)
[ 981.873833] Stack:
[ 981.873833] 0000000000013780 0000000000000000 ffff880000000000
0000000000010d70
[ 981.873833] 0000160000000000 0000000000000000 ffff88001affbddc
ffffffff810050a2
[ 981.873833] 0000000000013780 ffffea00005ab258 ffffffff810043e3
ffff88001affbe40
[ 981.873833] Call Trace:
[ 981.873833] [<ffffffff810050a2>] ? xen_mc_issue+0x3e/0x50
[ 981.873833] [<ffffffff810043e3>] ? arch_local_irq_restore+0x7/0x8
[ 981.873833] [<ffffffff8121ca3b>] ? xen_suspend+0x73/0x8b
[ 981.873833] [<ffffffff81087d91>] ? stop_machine_cpu_stop+0x89/0xc3
[ 981.873833] [<ffffffff81087d08>] ? queue_stop_cpus_work+0xa5/0xa5
[ 981.873833] [<ffffffff81087b62>] ? cpu_stopper_thread+0xea/0x177
[ 981.873833] [<ffffffff810359d7>] ? arch_local_irq_enable+0x7/0x8
[ 981.873833] [<ffffffff81039854>] ? finish_task_switch+0x88/0xb9
[ 981.873833] [<ffffffff8134c694>] ? __schedule+0x5ac/0x5c3
[ 981.873833] [<ffffffff81087a78>] ? cpu_stop_signal_done+0x2a/0x2a
[ 981.873833] [<ffffffff8105f329>] ? kthread+0x76/0x7e
[ 981.873833] [<ffffffff81354b34>] ? kernel_thread_helper+0x4/0x10
[ 981.873833] [<ffffffff81352bf3>] ? int_ret_from_sys_call+0x7/0x1b
[ 981.873833] [<ffffffff8134dd3c>] ? retint_restore_args+0x5/0x6
[ 981.873833] [<ffffffff81354b30>] ? gs_change+0x13/0x13
[ 981.873833] Code: 74 79 44 89 e7 e8 77 ee ff ff 39 e8 74 02 0f 0b
48 8d 74 24 28 bf 01 00 00 00 89 6c 24 28 89 5c 24 2c e8 19 ec ff ff
85 c0 74 02 <0f> 0b 8b 44 24 30 44 89 e7 89 44 24 14 e8 58 e9 ff ff 0f
b7 4c
[ 981.873833] RIP [<ffffffff8121c4e2>] xen_irq_resume+0xbd/0x28b
[ 981.873833] RSP <ffff88001ae99d20>
[ 981.873833] ---[ end trace 8243bb8e343ac633 ]---
[ 981.873833] ------------[ cut here ]------------
[ 981.873833] WARNING: at
/build/buildd-linux_3.2.41-2+deb7u2-amd64-NHQI9B/linux-3.2.41/kernel/time/timekeeping.c:265
ktime_get+0x1e/0x86()
[ 981.873833] Modules linked in: xenfs snd_pcm snd_page_alloc
snd_timer snd coretemp soundcore crc32c_intel evdev joydev pcspkr ext3
mbcache jbd xen_blkfront xen_netfront
[ 981.873833] Pid: 0, comm: swapper/0 Tainted: G D
3.2.0-4-amd64 #1 Debian 3.2.41-2+deb7u2
[ 981.873833] Call Trace:
[ 981.873833] [<ffffffff81046a55>] ? warn_slowpath_common+0x78/0x8c
[ 981.873833] [<ffffffff8106644f>] ? ktime_get+0x1e/0x86
[ 981.873833] [<ffffffff8106c223>] ? tick_nohz_stop_sched_tick+0x61/0x327
[ 981.873833] [<ffffffff8100d210>] ? cpu_idle+0x72/0xf2
[ 981.873833] [<ffffffff816abb36>] ? start_kernel+0x3b8/0x3c3
[ 981.873833] [<ffffffff816ad4d9>] ? xen_start_kernel+0x412/0x418
[ 981.873833] ---[ end trace 8243bb8e343ac634 ]---
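A note on reading the numbers above: the rc:-17 in xl dmesg is -EEXIST, and the RAX value in the oops (ffffffffffffffef) is the same -17 when read as a signed 64-bit value, so the guest appears to hit BUG() on the EEXIST return from the bind_virq hypercall. Below is a minimal sketch for decoding such values. The helper names reg_to_rc and rc_name are made up for illustration, and it assumes the errno numbers in the host's <errno.h> match the ones Xen uses for these three codes.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a raw 64-bit register value (e.g. RAX from the oops)
 * as a signed hypercall return code. */
static long reg_to_rc(uint64_t reg)
{
    return (long)(int64_t)reg;
}

/* Name the negative return codes that the error paths in the quoted
 * evtchn_bind_virq patch can produce; anything else is "unknown". */
static const char *rc_name(long rc)
{
    switch (-rc) {
    case EEXIST: return "EEXIST";
    case EINVAL: return "EINVAL";
    case ENOENT: return "ENOENT";
    default:     return "unknown";
    }
}
```

Calling rc_name(reg_to_rc(0xffffffffffffffefULL)) names the code found in RAX. The switch is deliberately limited to the three error codes that appear in the patch rather than covering all of errno.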
2013/5/25 Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>:
> On Sat, May 25, 2013 at 12:15:44AM +0400, Vasiliy Tolstov wrote:
>> 2013/5/24 George Dunlap <George.Dunlap@eu.citrix.com>:
>> >
>> > Did you mean xm save or xl save?
>>
>>
>> In my case xl save crashes the domU with messages like the following. The
>> domU crashes with CentOS 2.6.18 and 2.6.32 (xenlinux) kernels, but never with
>> the 3.8.6 kernel and 3.4...
>
> Is the 3.8.6 crashing at the same point?
>>
>> [ 1826.587110] PM: late freeze of devices complete after 0.048 msecs
>> [ 1826.591220] ------------[ cut here ]------------
>> [ 1826.591220] kernel BUG at
>> /build/buildd-linux_3.2.41-2-amd64-Wvc92F/linux-3.2.41/drivers/xen/events.c:1489!
>
> That looks to be this (https://git.kernel.org/cgit/linux/kernel/git/bwh/linux-3.2.y.git/tree/drivers/xen/events.c)
>
> if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_virq,
> &bind_virq) != 0)
> BUG();
>
> which is odd. Would you be able to instrument evtchn_bind_virq (this is
> in Xen) with some printks, like this (haven't compile-tested it):
>
> diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
> index 2d7afc9..c109cee 100644
> --- a/xen/common/event_channel.c
> +++ b/xen/common/event_channel.c
> @@ -270,24 +270,34 @@ static long evtchn_bind_virq(evtchn_bind_virq_t *bind)
> int port, virq = bind->virq, vcpu = bind->vcpu;
> long rc = 0;
>
> - if ( (virq < 0) || (virq >= ARRAY_SIZE(v->virq_to_evtchn)) )
> + if ( (virq < 0) || (virq >= ARRAY_SIZE(v->virq_to_evtchn)) ) {
> +gdprintk(XENLOG_WARNING, "d%dv%d [%s:%d], virq:%d, rc:%d\n", d->domain_id,
> + vcpu, __func__,__LINE__, virq, -EINVAL);
> return -EINVAL;
> -
> - if ( virq_is_global(virq) && (vcpu != 0) )
> + }
> + if ( virq_is_global(virq) && (vcpu != 0) ) {
> +gdprintk(XENLOG_WARNING, "d%dv%d [%s:%d], virq_is_global:%d, rc:%d\n", d->domain_id,
> + vcpu, __func__,__LINE__, virq_is_global(virq), -EINVAL);
> return -EINVAL;
> -
> + }
> if ( (vcpu < 0) || (vcpu >= d->max_vcpus) ||
> - ((v = d->vcpu[vcpu]) == NULL) )
> + ((v = d->vcpu[vcpu]) == NULL) ) {
> +gdprintk(XENLOG_WARNING, "d%dv%d [%s:%d], v:%p, max_vcpus:%d, rc:%d\n", d->domain_id,
> + vcpu, __func__,__LINE__, v, d->max_vcpus, -ENOENT);
> return -ENOENT;
> -
> + }
> spin_lock(&d->event_lock);
>
> - if ( v->virq_to_evtchn[virq] != 0 )
> + if ( v->virq_to_evtchn[virq] != 0 ) {
> +gdprintk(XENLOG_WARNING, "d%dv%d [%s:%d], v:%p, evtchn:%d, rc:%d\n", d->domain_id,
> + vcpu, __func__,__LINE__, v, v->virq_to_evtchn[virq], -EEXIST);
> ERROR_EXIT(-EEXIST);
> -
> - if ( (port = get_free_port(d)) < 0 )
> + }
> + if ( (port = get_free_port(d)) < 0 ) {
> +gdprintk(XENLOG_WARNING, "d%dv%d [%s:%d], port:%d, rc:%d\n", d->domain_id,
> + vcpu, __func__,__LINE__, port, port);
> ERROR_EXIT(port);
> -
> + }
> chn = evtchn_from_port(d, port);
> chn->state = ECS_VIRQ;
> chn->notify_vcpu_id = vcpu;
--
Vasiliy Tolstov,
e-mail: v.tolstov@selfip.ru
jabber: vase@selfip.ru