From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from relay.sgi.com (relay2.corp.sgi.com [137.38.102.29])
	by oss.sgi.com (Postfix) with ESMTP id 84BC37F5D
	for ; Thu, 12 Sep 2013 18:51:07 -0500 (CDT)
Message-ID: <52325369.1070001@sgi.com>
Date: Thu, 12 Sep 2013 18:51:05 -0500
From: Mark Tinguely
MIME-Version: 1.0
Subject: Re: XFS: Assertion failed: first <= last && last < BBTOB(bp->b_length), file: fs/xfs/xfs_trans_buf.c, line: 568
References: <52165830.8050006@redhat.com>
In-Reply-To: <52165830.8050006@redhat.com>
List-Id: XFS Filesystem from SGI
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Errors-To: xfs-bounces@oss.sgi.com
Sender: xfs-bounces@oss.sgi.com
To: Brian Foster
Cc: xfs@oss.sgi.com

On 08/22/13 13:28, Brian Foster wrote:
> Hi all,
>
> I hit an assert on a debug kernel while beating on some finobt work and
> eventually reproduced it on unmodified/TOT xfs/xfsprogs as of today. I
> hit it through a couple different paths, first while running fsstress on
> a CRC enabled filesystem (with otherwise default mkfs options):
>
> (These tests are running on a 4p, 4GB VM against a 100GB virtio disk,
> hosted on a single spindle desktop box).
>
> crc=1
> fsstress -z -fsymlink=1 -n99999999 -p4 -d /mnt/test
>
> XFS: Assertion failed: first <= last && last < BBTOB(bp->b_length),
> file: fs/xfs/xfs_trans_buf.c, line: 568
> ------------[ cut here ]------------
> kernel BUG at fs/xfs/xfs_message.c:108!
> invalid opcode: 0000 [#1] SMP
> Modules linked in: xfs libcrc32c fuse ebtable_nat
> nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE
> ip6table_nat nf_nat_ipv6 ip6table_mangle ip6t_REJECT nf_conntrack_ipv6
> nf_defrag_ipv6 iptable_nat nf_nat_ipv4 nf_nat iptable_mangle bnep
> nf_conntrack_ipv4 nf_defrag_ipv4 bluetooth xt_conntrack nf_conntrack
> rfkill ebtable_filter ebtables ip6table_filter ip6_tables snd_hda_intel
> snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_page_alloc
> snd_timer snd joydev soundcore i2c_piix4 pcspkr mperf virtio_balloon
> floppy uinput qxl drm_kms_helper ttm drm virtio_blk virtio_net i2c_core
> CPU: 0 PID: 1419 Comm: fsstress Not tainted 3.11.0-rc1+ #10
> Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
> task: ffff8800d65b5dc0 ti: ffff8800d10ba000 task.ti: ffff8800d10ba000
> RIP: 0010:[] [] assfail+0x22/0x30 [xfs]
> RSP: 0018:ffff8800d10bb998 EFLAGS: 00010292
> RAX: 000000000000006b RBX: ffff8800d67be3a0 RCX: 0000000000000000
> RDX: ffff88011fc0ee48 RSI: ffff88011fc0d038 RDI: ffff88011fc0d038
> RBP: ffff8800d10bb998 R08: 0000000000000000 R09: 000000000000020a
> R10: ffffffff81858260 R11: 0000000000000209 R12: ffff8800d165d500
> R13: ffff8800d1158980 R14: 0000000000001007 R15: ffff8800d1cb8300
> FS: 00007f1efd2ce740(0000) GS:ffff88011fc00000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f1ef80fb018 CR3: 0000000036edb000 CR4: 00000000000006f0
> Stack:
> ffff8800d10bb9e8 ffffffffa031d549 000000fc24a6f000 00000e20000000d3
> ffff8800d10bb9f8 ffff8800d67c3040 ffff8800d1cb8208 ffff8800d1cb81e8
> ffff8800d67c3000 ffff8800d1cb8300 ffff8800d10bba48 ffffffffa02e7c1c
> Call Trace:
> [] xfs_trans_log_buf+0x89/0x1b0 [xfs]
> [] xfs_da3_node_add+0x11c/0x210 [xfs]
> [] xfs_da3_node_split+0xc3/0x230 [xfs]
> [] xfs_da3_split+0x1a8/0x410 [xfs]
> [] xfs_dir2_node_addname+0x47f/0xde0 [xfs]
> [] xfs_dir_createname+0x1d5/0x1e0 [xfs]
> [] ? kmem_alloc+0x67/0xf0 [xfs]
> [] xfs_symlink+0x619/0xa20 [xfs]
> [] ? _d_rehash+0x36/0x40
> [] ? __lookup_hash+0x38/0x50
> [] ? lookup_hash+0x19/0x20
> [] ? kern_path_create+0x8e/0x170
> [] xfs_vn_symlink+0x5c/0xe0 [xfs]
> [] vfs_symlink+0x99/0x100
> [] SyS_symlinkat+0x66/0xd0
> [] SyS_symlink+0x16/0x20
> [] system_call_fastpath+0x16/0x1b
> Code: 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 f1 41 89 d0 48
> c7 c6 70 50 33 a0 48 89 fa 31 c0 48 89 e5 31 ff e8 de fb ff ff <0f> 0b
> 66 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48
> RIP [] assfail+0x22/0x30 [xfs]
> RSP
> ---[ end trace 9578edaae955ff56 ]---
>
> I repeated the test on a crc=0 fs (with -isize=512) and could not
> reproduce during fsstress. I let it populate to about 10GB and
> ultimately hit the same assert on unlink during a post-test cleanup:
>
> crc=0
> rm -rf /mnt/test
>
> XFS: Assertion failed: first <= last && last < BBTOB(bp->b_length),
> file: fs/xfs/xfs_trans_buf.c, line: 568
> ------------[ cut here ]------------
> kernel BUG at fs/xfs/xfs_message.c:108!
> invalid opcode: 0000 [#1] SMP
> Modules linked in: xfs libcrc32c fuse ebtable_nat
> nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE
> ip6table_nat nf_nat_ipv6 ip6table_mangle ip6t_REJECT nf_conntrack_ipv6
> nf_defrag_ipv6 iptable_nat nf_nat_ipv4 nf_nat iptable_mangle
> nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack
> ebtable_filter ebtables bnep bluetooth rfkill ip6table_filter ip6_tables
> snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm
> snd_page_alloc snd_timer snd soundcore joydev pcspkr virtio_balloon
> i2c_piix4 floppy mperf uinput qxl drm_kms_helper ttm drm virtio_net
> virtio_blk i2c_core
> CPU: 1 PID: 2198 Comm: rm Not tainted 3.11.0-rc1+ #10
> Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
> task: ffff8801161ec650 ti: ffff8800c803e000 task.ti: ffff8800c803e000
> RIP: 0010:[] [] assfail+0x22/0x30 [xfs]
> RSP: 0018:ffff8800c803fa98 EFLAGS: 00010292
> RAX: 000000000000006b RBX: ffff8801029a6e80 RCX: 0000000000000000
> RDX: ffff88011fc8ee48 RSI: ffff88011fc8d038 RDI: ffff88011fc8d038
> RBP: ffff8800c803fa98 R08: 0000000000000000 R09: 0000000000000209
> R10: ffffffff81858260 R11: 0000000000000208 R12: ffff8800302bd200
> R13: ffff8800d25e0850 R14: 000000000000122f R15: ffff8800d271f010
> FS: 00007f28ef9bf740(0000) GS:ffff88011fc80000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 000000000153a000 CR3: 00000000b1fd3000 CR4: 00000000000006e0
> Stack:
> ffff8800c803fae8 ffffffffa032b549 00800201008006cc 000000100185febe
> ffffffffa033fcb0 ffff8800ade0c010 ffff8800ade0c000 ffff8800d3c2b9e0
> ffff8800d25e0850 ffff8800d271f010 ffff8800c803fb58 ffffffffa02f61ff
> Call Trace:
> [] xfs_trans_log_buf+0x89/0x1b0 [xfs]
> [] xfs_da3_node_unbalance+0xef/0x1d0 [xfs]
> [] xfs_da3_join+0x240/0x290 [xfs]
> [] xfs_dir2_node_removename+0x69b/0x8b0 [xfs]
> [] ? xfs_bmap_last_extent+0x6e/0xb0 [xfs]
> [] xfs_dir_removename+0x195/0x1a0 [xfs]
> [] xfs_remove+0x2a9/0x410 [xfs]
> [] xfs_vn_unlink+0x52/0xa0 [xfs]
> [] vfs_unlink+0x9e/0x110
> [] do_unlinkat+0x1a1/0x230
> [] SyS_unlinkat+0x1b/0x40
> [] system_call_fastpath+0x16/0x1b
> Code: 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 f1 41 89 d0 48
> c7 c6 70 30 34 a0 48 89 fa 31 c0 48 89 e5 31 ff e8 de fb ff ff <0f> 0b
> 66 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48
> RIP [] assfail+0x22/0x30 [xfs]
> RSP
> ---[ end trace 3ef54f36db3ba0c5 ]---
>
> Info on the crc=0 fs is as follows:
>
> meta-data=/dev/vdb               isize=512    agcount=4, agsize=6553600 blks
>          =                       sectsz=512   attr=2, projid32bit=1
>          =                       crc=0
> data     =                       bsize=4096   blocks=26214400, imaxpct=25
>          =                       sunit=0      swidth=0 blks
> naming   =version 2              bsize=4096   ascii-ci=0
> log      =internal               bsize=4096   blocks=12800, version=2
>          =                       sectsz=512   sunit=0 blks, lazy-count=1
> realtime =none                   extsz=4096   blocks=0, rtextents=0
>
>
> Brian

FYI:

The second (rm version) of the test bisects to the patch:

commit f5ea110044fa858925a880b4fa9f551bfa2dfc38
    xfs: add CRCs to dir2/da node blocks

---

The secret to tripping over the bug is to run the test until fsstress
fills the filesystem before removing the files. So is it an
error-handling path?

I use the test:

#!/bin/sh
ltp/fsstress -z -s 1378390208 -fsymlink=1 -n9999999 -p4 -d /test2
cd /test2
sync
rm -rf *

If your filesystem is smaller, decrease the -n to make the test faster.

I have still not gotten a core, though Michael Semon sent one.

--Mark.
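
For anyone reading the archive without the source at hand: the assertion in the subject line checks that the byte range handed to xfs_trans_log_buf() is ordered and ends inside the buffer. BBTOB() converts a length in 512-byte basic blocks to bytes. The following is only a userspace sketch of that check, not the kernel code; log_range_ok() is a hypothetical name made up for illustration, and the (size_t) cast is added here for the sketch.

```c
#include <stdbool.h>
#include <stddef.h>

/* XFS basic blocks are 512 bytes, hence a shift of 9. */
#define BBSHIFT 9
#define BBTOB(bbs) ((size_t)(bbs) << BBSHIFT)

/*
 * Hypothetical helper, not a kernel function: returns true when the
 * byte range [first, last] satisfies the condition in the assertion.
 * b_length is the buffer length in basic blocks.
 */
static bool log_range_ok(size_t first, size_t last, size_t b_length)
{
	return first <= last && last < BBTOB(b_length);
}
```

Assuming a buffer that covers a single 4096-byte directory block (b_length of 8, matching the bsize=4096 naming line above), a range ending at byte 4095 passes, while a range ending at byte 4096, or one whose first offset is greater than its last, would trip the assert.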