xen-devel.lists.xenproject.org archive mirror
* Trying to unmap invalid handle! pending_idx: @ drivers/net/xen-netback/netback.c:998 causes kernel panic/reboot
@ 2014-07-14  2:25 Armin Zentai
  2014-07-14  9:52 ` Wei Liu
  2014-07-14 11:54 ` Zoltan Kiss
  0 siblings, 2 replies; 14+ messages in thread
From: Armin Zentai @ 2014-07-14  2:25 UTC (permalink / raw)
  To: xen-devel

Dear Xen Developers!


We're running Xen on multiple machines, most of them Dell R410 or 
Supermicro X8DTL, with one E5645 CPU and 48 GB of RAM. We've updated the 
kernel to 3.15.4, and after that some of our hypervisors started 
rebooting at random times.

The logs were empty, so we had no information about the crashes. We 
tried a few tricks, and in the end the netconsole kernel module helped: 
it gives us a very thin layer of remote kernel logging. We found the 
following in the remote logs:

Jul 13 00:46:58 node11 [157060.106323] vif vif-2-0 h14z4mzbvfrrhb: 
Trying to unmap invalid handle! pending_idx: c
Jul 13 00:46:58 node11 [157060.106476] ------------[ cut here ]------------
Jul 13 00:46:58 node11 [157060.106546] kernel BUG at 
drivers/net/xen-netback/netback.c:998!
Jul 13 00:46:58 node11 [157060.106616] invalid opcode: 0000 [#1] SMP
Jul 13 00:46:58 node11 [157060.106765] Modules linked in: netconsole
configfs nf_conntrack_ipv4 nf_defrag_ipv4 xt_multiport xt_physdev sch_tbf
dm_snapshot dm_bufio arptable_filter arp_tables ip6t_REJECT
ip6table_mangle ipt_REJECT iptable_filter ip_tables bridge xen_pciback
xen_gntalloc autofs4 dm_round_robin scsi_dh_alua 8021q mrp garp stp llc
bonding xt_tcpudp nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack
ip6table_filter ip6_tables x_tables bnx2i cnic uio cxgb4i cxgb4 cxgb3i
libcxgbi cxgb3 mdio ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core
ib_addr iscsi_tcp libiscsi_tcp binfmt_misc dm_mirror dm_region_hash
dm_log dm_multipath scsi_dh xen_netback xen_blkback xen_gntdev xen_evtchn
xenfs xen_privcmd ufs gpio_ich iTCO_wdt iTCO_vendor_support joydev
psmouse serio_raw pcspkr tpm_infineon i2c_i801 lpc_ich hid_generic e1000e
ptp pps_core snd_hda_intel snd_hda_controller snd_hda_codec snd_hwdep
snd_seq snd_seq_device snd_pcm snd_timer snd soundcore ioatdma dca
mac_hid i7core_edac edac_core be2iscsi iscsi_boot_sysfs libiscsi
scsi_transport_iscsi be2net vxlan ahci(E) libahci(E) usbhid(E) hid(E)
[last unloaded: evbug]
Jul 13 00:46:58 node11 [157060.112705] CPU: 0 PID: 0 Comm: swapper/0 
Tainted: G            E 3.15.4 #1
Jul 13 00:46:58 node11 [157060.112776] Hardware name: Supermicro 
X8DTL/X8DTL, BIOS 1.1b    03/19/2010
Jul 13 00:46:58 node11 [157060.112848] task: ffffffff81c1b480 ti: 
ffffffff81c00000 task.ti: ffffffff81c00000
Jul 13 00:46:58 node11 [157060.112936] RIP: e030:[<ffffffffa025f61d>]  
[<ffffffffa025f61d>] xenvif_idx_unmap+0x11d/0x130 [xen_netback]
Jul 13 00:46:58 node11 [157060.113078] RSP: e02b:ffff88008ea03d48 
EFLAGS: 00010292
Jul 13 00:46:58 node11 [157060.113147] RAX: 000000000000004a RBX: 
000000000000000c RCX: 0000000000000000
Jul 13 00:46:58 node11 [157060.113234] RDX: ffff88008a40b600 RSI: 
ffff88008ea03a18 RDI: 000000000000021b
Jul 13 00:46:58 node11 [157060.113321] RBP: ffff88008ea03d88 R08: 
0000000000000000 R09: ffff88008a40b600
Jul 13 00:46:58 node11 [157060.113408] R10: ffff88008a0004e8 R11: 
00000000000006d8 R12: ffff8800569708c0
Jul 13 00:46:58 node11 [157060.113495] R13: ffff88006558fec0 R14: 
ffff8800569708c0 R15: 0000000000000001
Jul 13 00:46:58 node11 [157060.113589] FS:  00007f351684b700(0000) 
GS:ffff88008ea00000(0000) knlGS:0000000000000000
Jul 13 00:46:58 node11 [157060.113679] CS:  e033 DS: 0000 ES: 0000 CR0: 
000000008005003b
Jul 13 00:46:58 node11 [157060.113747] CR2: 00007fc2a4372000 CR3: 
00000000049f3000 CR4: 0000000000002660
Jul 13 00:46:58 node11 [157060.113835] Stack:
Jul 13 00:46:58 node11 [157060.113896]  ffff880056979f90 ff00000000000001 
ffff880b0605e000 0000000000000000
Jul 13 00:46:58 node11 [157060.114143]  ffff0000ffffffff 00000000fffffff6 
0000000000000001 ffff8800569769d0
Jul 13 00:46:58 node11 [157060.114390]  ffff88008ea03e58 ffffffffa02622fc 
ffff88008ea03dd8 ffffffff810b5223
Jul 13 00:46:58 node11 [157060.114637] Call Trace:
Jul 13 00:46:58 node11 [157060.114700]  <IRQ>
Jul 13 00:46:58 node11 [157060.114750]  [<ffffffffa02622fc>] 
xenvif_tx_action+0x27c/0x7f0 [xen_netback]
Jul 13 00:46:58 node11 [157060.114927]  [<ffffffff810b5223>] ? 
__wake_up+0x53/0x70
Jul 13 00:46:58 node11 [157060.114998]  [<ffffffff810ca077>] ? 
handle_irq_event_percpu+0xa7/0x1b0
Jul 13 00:46:58 node11 [157060.115073]  [<ffffffffa02647d1>] 
xenvif_poll+0x31/0x64 [xen_netback]
Jul 13 00:46:58 node11 [157060.115147]  [<ffffffff81653d4b>] 
net_rx_action+0x10b/0x290
Jul 13 00:46:58 node11 [157060.115221]  [<ffffffff81071c73>] 
__do_softirq+0x103/0x320
Jul 13 00:46:58 node11 [157060.115292]  [<ffffffff81072015>] 
irq_exit+0x135/0x140
Jul 13 00:46:58 node11 [157060.115363]  [<ffffffff8144759c>] 
xen_evtchn_do_upcall+0x3c/0x50
Jul 13 00:46:58 node11 [157060.115436]  [<ffffffff8175c07e>] 
xen_do_hypervisor_callback+0x1e/0x30
Jul 13 00:46:58 node11 [157060.115506]  <EOI>
Jul 13 00:46:58 node11 [157060.115551]  [<ffffffff810013aa>] ? 
xen_hypercall_sched_op+0xa/0x20
Jul 13 00:46:58 node11 [157060.115722]  [<ffffffff810013aa>] ? 
xen_hypercall_sched_op+0xa/0x20
Jul 13 00:46:58 node11 [157060.115794]  [<ffffffff8100a200>] ? 
xen_safe_halt+0x10/0x20
Jul 13 00:46:58 node11 [157060.115869]  [<ffffffff8101dbbf>] ? 
default_idle+0x1f/0xc0
Jul 13 00:46:58 node11 [157060.115939]  [<ffffffff8101d38f>] ? 
arch_cpu_idle+0xf/0x20
Jul 13 00:46:58 node11 [157060.116009]  [<ffffffff810b5aa1>] ? 
cpu_startup_entry+0x201/0x360
Jul 13 00:46:58 node11 [157060.116084]  [<ffffffff817420a7>] ? 
rest_init+0x77/0x80
Jul 13 00:46:58 node11 [157060.116156]  [<ffffffff81d3a156>] ? 
start_kernel+0x406/0x413
Jul 13 00:46:58 node11 [157060.116227]  [<ffffffff81d39b6e>] ? 
repair_env_string+0x5b/0x5b
Jul 13 00:46:58 node11 [157060.116298]  [<ffffffff81d39603>] ? 
x86_64_start_reservations+0x2a/0x2c
Jul 13 00:46:58 node11 [157060.116373]  [<ffffffff81d3d5dc>] ? 
xen_start_kernel+0x584/0x586
Jul 13 00:46:58 node11 [157060.116446] Code: 41 5c c9 c3 48 be 00 00 00 
80 ff 77 00 00 e9 62 ff ff ff 49 8b bc 24 78 ba 00 00 89 da 48 c7 c6 30 
57 26 a0 31 c0 e8 73 df 3e e1 <0f> 0b eb fe 66 66 66 66 66 66 2e 0f 1f 
84 00 00 00 00 00 55 48
Jul 13 00:46:58 node11 [157060.119179] RIP  [<ffffffffa025f61d>] 
xenvif_idx_unmap+0x11d/0x130 [xen_netback]
Jul 13 00:46:58 node11 [157060.119312]  RSP <ffff88008ea03d48>
Jul 13 00:46:58 node11 [157060.119395] ---[ end trace 7e021c96c8cfea53 ]---
Jul 13 00:46:58 node11 [157060.119465] Kernel panic - not syncing: Fatal 
exception in interrupt
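
A note for anyone post-processing a capture like this one: netconsole 
can deliver printk continuations as separate syslog lines, and mail 
clients wrap long lines, so a single kernel line may arrive in several 
pieces. Below is a minimal Python sketch for stitching the pieces back 
together. It rests on assumptions that are not part of any kernel or 
syslog tooling: every real kernel line starts with a bracketed 
`[seconds.micros]` timestamp, the syslog prefix looks like 
`Jul 13 00:46:58`, and the host name is known in advance. The names 
`reassemble` and `host` are made up for illustration.

```python
import re

# Syslog prefix as it appears in this capture, e.g. "Jul 13 00:46:58 ".
SYSLOG_TS = re.compile(r"^[A-Z][a-z]{2} [ \d]\d \d\d:\d\d:\d\d ?")
# Kernel printk timestamp, e.g. "[157060.106616]". Assumed to start every
# real kernel line; anything without it is treated as a continuation.
KERNEL_TS = re.compile(r"^\[\s*\d+\.\d+\]")


def reassemble(lines, host):
    """Join netconsole fragments and mail-wrapped lines into kernel lines."""
    records = []
    for raw in lines:
        line = raw.rstrip("\n")
        m = SYSLOG_TS.match(line)
        if m:
            rest = line[m.end():]
            # Drop the host name when present (some fragments lack it).
            if rest == host or rest.startswith(host + " "):
                rest = rest[len(host):]
        else:
            rest = line  # no syslog prefix: a mail-wrapped continuation
        rest = rest.strip()
        if not rest:
            continue  # prefix-only fragment, nothing to keep
        if KERNEL_TS.match(rest) or not records:
            records.append(rest)
        else:
            records[-1] += " " + rest
    return records


if __name__ == "__main__":
    sample = [
        "Jul 13 00:46:58 node11 [157060.106616] invalid opcode: 0000 [#1]",
        "Jul 13 00:46:58 SMP",
        "Jul 13 00:46:58 node11",
        "Jul 13 00:46:58 node11 [157060.106765] Modules linked in:",
        "Jul 13 00:46:58 node11  netconsole",
        "Jul 13 00:46:58 node11  configfs",
    ]
    for record in reassemble(sample, host="node11"):
        print(record)
```

The heuristic is deliberately simple: it would wrongly merge a kernel 
line that was logged without a timestamp, so treat the result as a 
reading aid rather than a faithful reconstruction.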


h14z4mzbvfrrhb is the name of a VIF. This VIF belongs to a Windows Server 
2008 R2 x64 virtual machine. We have had 6 random reboots so far; each 
time the VIF belonged to a guest running the same operating system, but 
to a different virtual machine. So only the virtual interfaces of 
Windows Server 2008 R2 x64 systems have caused the crashes. These 
systems were provisioned from different installs or templates, and the 
GPLPV driver versions also differ.
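
To line up several crashes against the guests involved, the first line 
of each oops can be parsed mechanically. The sketch below is only an 
illustration under two assumptions: that the backend device name has 
the form `vif-<domid>-<devid>`, and that `pending_idx` is printed in 
hexadecimal (which agrees with RBX = 0xc in the register dump). The 
function name `parse_crash` and the dictionary keys are hypothetical.

```python
import re

# Matches the netback error line that precedes the BUG, for example:
#   vif vif-2-0 h14z4mzbvfrrhb: Trying to unmap invalid handle! pending_idx: c
CRASH_LINE = re.compile(
    r"vif vif-(?P<domid>\d+)-(?P<devid>\d+) (?P<name>\S+): "
    r"Trying to unmap invalid handle! pending_idx: (?P<idx>[0-9a-f]+)"
)


def parse_crash(record):
    """Return the VIF details from one log line, or None if it does not match."""
    m = CRASH_LINE.search(record)
    if m is None:
        return None
    return {
        "domid": int(m.group("domid")),    # assumed: guest domain ID
        "devid": int(m.group("devid")),    # assumed: VIF index within the guest
        "vifname": m.group("name"),
        "pending_idx": int(m.group("idx"), 16),  # assumed: hex, as in RBX
    }


if __name__ == "__main__":
    line = ("[157060.106323] vif vif-2-0 h14z4mzbvfrrhb: "
            "Trying to unmap invalid handle! pending_idx: c")
    print(parse_crash(line))
```

Collecting these dictionaries across all crash logs makes it easy to 
see whether the same `vifname` or `pending_idx` recurs.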

xm info output:
[root@c2-node11 ~]# xm info
host                   : c2-node11
release                : 3.15.4
version                : #1 SMP Tue Jul 8 17:58:26 CEST 2014
machine                : x86_64
nr_cpus                : 12
nr_nodes               : 1
cores_per_socket       : 6
threads_per_core       : 2
cpu_mhz                : 2400
hw_caps                : 
bfebfbff:2c100800:00000000:00003f40:029ee3ff:00000000:00000001:00000000
virt_caps              : hvm hvm_directio
total_memory           : 49143
free_memory            : 41830
free_cpus              : 0
xen_major              : 4
xen_minor              : 2
xen_extra              : .4-33.el6
xen_caps               : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 
hvm-3.0-x86_32p hvm-3.0-x86_64
xen_scheduler          : credit
xen_pagesize           : 4096
platform_params        : virt_start=0xffff800000000000
xen_changeset          : unavailable
xen_commandline        : dom0_mem=3145728 pcie_aspm=off noreboot=true
cc_compiler            : gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-4)
cc_compile_by          : mockbuild
cc_compile_domain      : centos.org
cc_compile_date        : Mon Jun 16 17:22:14 UTC 2014
xend_config_format     : 4

[root@c2-node11 ~]# uname -a
Linux c2-node11 3.15.4 #1 SMP Tue Jul 8 17:58:26 CEST 2014 x86_64 x86_64 
x86_64 GNU/Linux


The xm create config file of the affected VM (the other VMs' config 
files are the same):

kernel = "/usr/lib/xen/boot/hvmloader"
device_model = "/usr/lib64/xen/bin/qemu-dm"
builder = "hvm"
memory = "2000"
name = "vna3mhwnv9pn4m"
vcpus = "1"

timer_mode = "2"
viridian = "1"

vif = [ "type=ioemu, mac=00:16:3e:64:c8:ba, bridge=x0evss6g1ztoa4, 
ip=..., vifname=h14z4mzbvfrrhb, rate=100Mb/s" ]

disk = [ "phy:/dev/q7jiqc2gh02b2b/xz7wget4ycmp77,ioemu:hda,w" ]
vnc = 1
vncpasswd="aaaaa1"
usbdevice="tablet"


The HV's networking is set up as follows:
We are using dual Emulex 10 Gbit network adapters with bonding (LACP), 
and on top of the bond we're using VLANs for the VM, management and 
iSCSI traffic.
We tried to reproduce the error but couldn't; the crashes/reboots 
happened at random each time.

Thanks for your help,

  - Armin Zentai
