From: Dario Faggioli
Subject: Re: [BUG] kernel BUG at drivers/block/xen-blkfront.c:1711
Date: Mon, 6 Jun 2016 10:42:32 +0200
Message-ID: <1465202552.15816.70.camel@citrix.com>
In-Reply-To: <57553355.8050302@virtuozzo.com>
References: <57553355.8050302@virtuozzo.com>
To: Evgenii Shatokhin, xen-devel@lists.xen.org
Cc: Juergen Gross, George Dunlap, David Vrabel, Roger Pau Monne
List-Id: xen-devel@lists.xenproject.org

Just Cc-ing some Linux, block, and Xen on CentOS people...

On Mon, 2016-06-06 at 11:24 +0300, Evgenii Shatokhin wrote:
> (Resending this bug report because the message I sent last week did not
> make it to the mailing list somehow.)
>
> Hi,
>
> One of our users gets kernel panics from time to time when he tries to
> use his Amazon EC2 instance with CentOS7 x64 in it [1]. Kernel panic
> happens within minutes from the moment the instance starts. The problem
> does not show up every time, however.
>
> The user first observed the problem with a custom kernel, but it was
> found later that the stock kernel 3.10.0-327.18.2.el7.x86_64 from
> CentOS7 was affected as well.
>
> The part of the system log he was able to retrieve is attached. Here is
> the bug info, for convenience:
>
> ------------------------------------
> [    2.246912] kernel BUG at drivers/block/xen-blkfront.c:1711!
> [    2.246912] invalid opcode: 0000 [#1] SMP
> [    2.246912] Modules linked in: ata_generic pata_acpi crct10dif_pclmul crct10dif_common crc32_pclmul crc32c_intel ghash_clmulni_intel xen_netfront xen_blkfront(+) aesni_intel lrw ata_piix gf128mul glue_helper ablk_helper cryptd libata serio_raw floppy sunrpc dm_mirror dm_region_hash dm_log dm_mod scsi_transport_iscsi
> [    2.246912] CPU: 1 PID: 50 Comm: xenwatch Not tainted 3.10.0-327.18.2.el7.x86_64 #1
> [    2.246912] Hardware name: Xen HVM domU, BIOS 4.2.amazon 12/07/2015
> [    2.246912] task: ffff8800e9fcb980 ti: ffff8800e98bc000 task.ti: ffff8800e98bc000
> [    2.246912] RIP: 0010:[]  [] blkfront_setup_indirect+0x41f/0x430 [xen_blkfront]
> [    2.246912] RSP: 0018:ffff8800e98bfcd0  EFLAGS: 00010283
> [    2.246912] RAX: ffff8800353e15c0 RBX: ffff8800e98c52c8 RCX: 0000000000000020
> [    2.246912] RDX: ffff8800353e15b0 RSI: ffff8800e98c52b8 RDI: ffff8800353e15d0
> [    2.246912] RBP: ffff8800e98bfd20 R08: ffff8800353e15b0 R09: ffff8800eb403c00
> [    2.246912] R10: ffffffffa0155532 R11: ffffffffffffffe8 R12: ffff8800e98c4000
> [    2.246912] R13: ffff8800e98c52b8 R14: 0000000000000020 R15: ffff8800353e15c0
> [    2.246912] FS:  0000000000000000(0000) GS:ffff8800efc20000(0000) knlGS:0000000000000000
> [    2.246912] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [    2.246912] CR2: 00007f1b615ef000 CR3: 00000000e2b44000 CR4: 00000000001406e0
> [    2.246912] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [    2.246912] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [    2.246912] Stack:
> [    2.246912]  0000000000000020 0000000000000001 00000020a0157217 00000100e98bfdbc
> [    2.246912]  0000000027efa3ef ffff8800e98bfdbc ffff8800e98ce000 ffff8800e98c4000
> [    2.246912]  ffff8800e98ce040 0000000000000001 ffff8800e98bfe08 ffffffffa0155d4c
> [    2.246912] Call Trace:
> [    2.246912]  [] blkback_changed+0x4ec/0xfc8 [xen_blkfront]
> [    2.246912]  [] ? xenbus_gather+0x170/0x190
> [    2.246912]  [] ? __slab_free+0x10e/0x277
> [    2.246912]  [] xenbus_otherend_changed+0xad/0x110
> [    2.246912]  [] ? xenwatch_thread+0x77/0x180
> [    2.246912]  [] backend_changed+0x13/0x20
> [    2.246912]  [] xenwatch_thread+0x66/0x180
> [    2.246912]  [] ? wake_up_atomic_t+0x30/0x30
> [    2.246912]  [] ? unregister_xenbus_watch+0x1f0/0x1f0
> [    2.246912]  [] kthread+0xcf/0xe0
> [    2.246912]  [] ? kthread_create_on_node+0x140/0x140
> [    2.246912]  [] ret_from_fork+0x58/0x90
> [    2.246912]  [] ? kthread_create_on_node+0x140/0x140
> [    2.246912] Code: e1 48 85 c0 75 ce 49 8d 84 24 40 01 00 00 48 89 45 b8 e9 91 fd ff ff 4c 89 ff e8 8d ae 06 e1 e9 f2 fc ff ff 31 c0 e9 2e fe ff ff <0f> 0b e8 9a 57 f2 e0 0f 0b 0f 1f 84 00 00 00 00 00 0f 1f 44 00
> [    2.246912] RIP  [] blkfront_setup_indirect+0x41f/0x430 [xen_blkfront]
> [    2.246912]  RSP
> [    2.491574] ---[ end trace 8a9b992812627c71 ]---
> [    2.495618] Kernel panic - not syncing: Fatal exception
> ------------------------------------
>
> Xen version 4.2.
>
> EC2 instance type: c3.large with EBS magnetic storage, if that matters.
>
> Here is the code where the BUG_ON triggers (drivers/block/xen-blkfront.c):
> ------------------------------------
> if (!info->feature_persistent && info->max_indirect_segments) {
>     /*
>      * We are using indirect descriptors but not persistent
>      * grants, we need to allocate a set of pages that can be
>      * used for mapping indirect grefs
>      */
>     int num = INDIRECT_GREFS(segs) * BLK_RING_SIZE;
>
>     BUG_ON(!list_empty(&info->indirect_pages)); // << This one hits.
>     for (i = 0; i < num; i++) {
>         struct page *indirect_page = alloc_page(GFP_NOIO);
>         if (!indirect_page)
>             goto out_of_memory;
>         list_add(&indirect_page->lru, &info->indirect_pages);
>     }
> }
> ------------------------------------
>
> As we checked, 'info->indirect_pages' list indeed contained around 30
> elements at that point.
>
> Any ideas what may cause this and how to fix it?
>
> If any other data are needed, please let me know.
>
> Regards,
> Evgenii
>
> References:
> [1] https://bugs.openvz.org/browse/OVZ-6718
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel
-- 
<> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)