From mboxrd@z Thu Jan 1 00:00:00 1970
From: Fabio Massimo Di Nitto
Date: Fri, 21 Jul 2006 13:54:39 +0200
Subject: [Cluster-devel] Problems mounting GFS2 devices
In-Reply-To: <20060720144905.GD26286@redhat.com>
References: <20060720063421.84887234001@gordian.int.fabbione.net> <20060720144905.GD26286@redhat.com>
Message-ID: <44C0C07F.3040009@ubuntu.com>
List-Id:
To: cluster-devel.redhat.com
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

David Teigland wrote:
> On Thu, Jul 20, 2006 at 08:34:21AM +0200, Fabio M. Di Nitto wrote:
>
>> # mount /dev/mapper/mofo /mnt
>> Segmentation fault
>
>> [42950437.180000] GFS2: fsid=edgy:mygfs2.4294967295: can't mount journal
>> #4294967295
>> [42950437.180000] GFS2: fsid=edgy:mygfs2.4294967295: there are only 4
>> journals (0 - 3)
>
> A jid of -1 usually means mount(8) didn't find the mount.gfs2 helper. The
> mount should fail cleanly, though, if that happens, don't know what's
> wrong there.

Right, that was it. With that fixed (mount is not very open to $PATH), I
found some other interesting issues after your suggestion on IRC to test a
set of different kernels.

I compared .18-rc2 from git + gfs2-2.6.git + the 3 export symbol patch +
gfs-kernel built from CVS HEAD against a backport of all of the above into
the Ubuntu kernel, to make sure that everything was working.

The test environment was exactly the same, and here are the results:

GFS1 lock_nolock
.18-rc2: ok
.17-5-server: ok

GFS1 lock_dlm
.18-rc2: nok
.17-5-server: nok

The mount operation seems to succeed. In dmesg:

[ 148.044734] GFS: fsid=edgy:gfs1.4294967295: can't mount journal #4294967295
[ 148.044800] GFS: fsid=edgy:gfs1.4294967295: there are only 4 journals (0 - 3)

(Note that there is no mount.gfs, and it's not in CVS either. There is a man
page for it, though.)

umount generates an oops:

[ 200.262962] BUG: unable to handle kernel NULL pointer dereference at virtual address 00000004
[ 200.263070] printing eip:
[ 200.263115] c011aff3
[ 200.263159] *pde = 1d392001
[ 200.263206] Oops: 0000 [#1]
[ 200.263249] SMP
[ 200.263358] Modules linked in: gfs video container button battery asus_acpi ac sctp lock_dlm gfs2 dlm configfs ipv6 af_packet md_mod sg tsdev shpchp pci_hotplug evdev intel_agp agpgart e100 mii floppy pcspkr psmouse serio_raw dm_round_robin dm_multipath dm_mod ext3 jbd mbcache sd_mod lpfc scsi_transport_fc scsi_mod ide_generic ide_cd cdrom ide_disk piix generic thermal processor fan
[ 200.265034] CPU: 0
[ 200.265036] EIP: 0060:[] Not tainted VLI
[ 200.265040] EFLAGS: 00010086 (2.6.18-rc2 #1)
[ 200.265183] EIP is at task_rq_lock+0x23/0x70
[ 200.265232] eax: 00000286 ebx: c03e7120 ecx: 00000000 edx: dd429ee0
[ 200.265284] esi: c03e7120 edi: 00000000 ebp: dd429e84 esp: dd429e74
[ 200.265334] ds: 007b es: 007b ss: 0068
[ 200.265383] Process umount (pid: 4449, ti=dd428000 task=ddc25000 task.ti=dd428000)
[ 200.265432] Stack: dd429ee0 e0ad8000 00000000 df671200 dd429ef0 c011bfff 00000282 e0ad8000
[ 200.265786] dd0a3708 00000000 0000000f dd428000 00000000 00000000 00000000 00000001
[ 200.266137] e0c0d2c7 dd0a1840 00000000 00000246 dd0a1848 dd0a1840 dd429ef0 c02d10f4
[ 200.266490] Call Trace:
[ 200.266581] [] try_to_wake_up+0x1f/0x420
[ 200.266674] [] gfs_clear_inode+0x57/0x80 [gfs]
[ 200.266881] [] wait_for_completion+0x14/0xb0
[ 200.266980] [] gfs_put_super+0x112/0x2be [gfs]
[ 200.267097] [] generic_shutdown_super+0x82/0x130
[ 200.267194] [] kill_block_super+0x25/0x40
[ 200.267278] [] deactivate_super+0x4c/0x70
[ 200.267363] [] sys_umount+0x4a/0x250
[ 200.267454] [] unmap_region+0xee/0x120
[ 200.267542] [] sys_stat64+0xf/0x30
[ 200.267626] [] do_munmap+0x197/0x1f0
[ 200.267717] [] sys_oldumount+0x15/0x20
[ 200.267800] [] sysenter_past_esp+0x56/0x79
[ 200.267895] Code: e7 8d b4 26 00 00 00 00 55 89 e5 83 ec 10 89 7d fc 89 c7 89 5d f4 89 55 f0 89 75 f8 be 20 71 3e c0 9c 58 fa 8b 55 f0 89 f3 89 02 <8b> 47 04 8b 40 10 8b 14 85 00 b4 3a c0 01 d3 89 d8 e8 87 7b 1b
[ 200.270094] EIP: [] task_rq_lock+0x23/0x70 SS:ESP 0068:dd429e74
[ 200.270212] BUG: warning at kernel/exit.c:854/do_exit()
[ 200.270298] [] do_exit+0x741/0x8a0
[ 200.270387] [] printk+0x1b/0x20
[ 200.270479] [] die+0x316/0x320
[ 200.270573] [] do_page_fault+0x34e/0x6bb
[ 200.270662] [] bitmap_parse+0x113/0x1f0
[ 200.270752] [] do_page_fault+0x0/0x6bb
[ 200.270834] [] error_code+0x39/0x40
[ 200.270922] [] task_rq_lock+0x23/0x70
[ 200.271008] [] try_to_wake_up+0x1f/0x420
[ 200.271097] [] gfs_clear_inode+0x57/0x80 [gfs]
[ 200.271215] [] wait_for_completion+0x14/0xb0
[ 200.271303] [] gfs_put_super+0x112/0x2be [gfs]
[ 200.271420] [] generic_shutdown_super+0x82/0x130
[ 200.271508] [] kill_block_super+0x25/0x40
[ 200.271592] [] deactivate_super+0x4c/0x70
[ 200.271675] [] sys_umount+0x4a/0x250
[ 200.271760] [] unmap_region+0xee/0x120
[ 200.271845] [] sys_stat64+0xf/0x30
[ 200.271928] [] do_munmap+0x197/0x1f0
[ 200.272017] [] sys_oldumount+0x15/0x20
[ 200.272101] [] sysenter_past_esp+0x56/0x79

GFS2 lock_nolock is ok, modulo a stupid bug in mount.gfs2 that I had no time
to provide a patch for. Sorry about that. To reproduce:

mount -t gfs2 /dev/foo /mnt

It works fine if you omit "-t gfs2", and it's not kernel version dependent.

GFS2 lock_dlm (this time with mount.gfs2 in the right place) is not ok (both
kernels fail the same way):

[ 85.899272] GFS2: fsid=: Trying to join cluster "lock_dlm", "edgy:gfs2"
[ 85.921611] dlm: gfs2: recover 1
[ 85.921757] dlm: gfs2: add member 1
[ 85.921816] dlm: gfs2: total members 1
[ 85.921863] dlm: gfs2: dlm_recover_directory
[ 85.922080] dlm: gfs2: dlm_recover_directory 0 entries
[ 85.922158] dlm: gfs2: recover 1 done: 0 ms
[ 85.923137] GFS2: fsid=edgy:gfs2.0: Joined cluster. Now mounting FS...
[ 86.071914] GFS2: fsid=edgy:gfs2.0: jid=0, already locked for use
[ 86.071980] GFS2: fsid=edgy:gfs2.0: jid=0: Looking at journal...
[ 86.179944] GFS2: fsid=edgy:gfs2.0: jid=0: Done
[ 86.180008] GFS2: fsid=edgy:gfs2.0: jid=1: Trying to acquire journal lock...
[ 86.183291] GFS2: fsid=edgy:gfs2.0: jid=1: Looking at journal...
[ 86.315432] GFS2: fsid=edgy:gfs2.0: jid=1: Done
[ 86.315497] GFS2: fsid=edgy:gfs2.0: jid=2: Trying to acquire journal lock...
[ 86.318686] GFS2: fsid=edgy:gfs2.0: jid=2: Looking at journal...
[ 86.461654] GFS2: fsid=edgy:gfs2.0: jid=2: Done
[ 86.461719] GFS2: fsid=edgy:gfs2.0: jid=3: Trying to acquire journal lock...
[ 86.465081] GFS2: fsid=edgy:gfs2.0: jid=3: Looking at journal...
[ 86.606193] GFS2: fsid=edgy:gfs2.0: jid=3: Done
[ 86.616587] GFS2: fsid=edgy:gfs2.0: fatal assertion failed
[ 86.616674] ------------[ cut here ]------------
[ 86.616722] kernel BUG at fs/gfs2/ops_super.c:290!
[ 86.616770] invalid opcode: 0000 [#1]
[ 86.616815] SMP
[ 86.616923] Modules linked in: video container button battery asus_acpi ac sctp lock_dlm gfs2 dlm configfs ipv6 af_packet md_mod tsdev sg shpchp pci_hotplug evdev intel_agp agpgart e100 mii pcspkr floppy psmouse serio_raw dm_round_robin dm_multipath dm_mod ext3 jbd mbcache sd_mod lpfc scsi_transport_fc scsi_mod ide_generic ide_cd cdrom ide_disk piix generic thermal processor fan
[ 86.618567] CPU: 0
[ 86.618569] EIP: 0060:[] Not tainted VLI
[ 86.618573] EFLAGS: 00210282 (2.6.18-rc2 #1)
[ 86.618823] EIP is at gfs2_clear_inode+0x73/0x90 [gfs2]
[ 86.618872] eax: 00000041 ebx: dc937108 ecx: fffeddbb edx: 00000000
[ 86.618924] esi: dc937240 edi: dc937108 ebp: dd719cda esp: dd719c9c
[ 86.618974] ds: 007b es: 007b ss: 0068
[ 86.619024] Process mount.gfs2 (pid: 4396, ti=dd718000 task=ddce8000 task.ti=dd718000)
[ 86.619074] Stack: dc937108 c018ad74 df122a00 dc937108 df122a00 c018af4e dc937108 dc93e148
[ 86.619428] c018a8cc e0ace000 e0b2d8e8 dd719cda e0b3c8b8 00000000 00000000 75710fed
[ 86.619780] 5f61746f 6e616863 00306567 dc92d500 e0b2e02c dd719cec 00000000 00000000
[ 86.620133] Call Trace:
[ 86.620220] [] clear_inode+0x54/0xd0
[ 86.620311] [] generic_drop_inode+0x6e/0x160
[ 86.620396] [] iput+0x5c/0x70
[ 86.620477] [] init_per_node+0x168/0x300 [gfs2]
[ 86.620597] [] init_inodes+0xec/0x200 [gfs2]
[ 86.620714] [] fill_super+0x5ae/0x6f0 [gfs2]
[ 86.620840] [] gfs2_glock_nq_num+0x48/0x80 [gfs2]
[ 86.620959] [] get_sb_bdev+0x101/0x140
[ 86.621054] [] get_page_from_freelist+0x3d0/0x400
[ 86.621150] [] gfs2_get_sb+0x21/0x30 [gfs2]
[ 86.621263] [] fill_super+0x0/0x6f0 [gfs2]
[ 86.621376] [] vfs_kern_mount+0xaa/0x150
[ 86.621466] [] do_kern_mount+0x39/0x60
[ 86.621550] [] dummy_sb_mount+0x0/0x10
[ 86.621643] [] do_mount+0x2e3/0x6f0
[ 86.621735] [] ipv6_rcv+0x17e/0x2a0 [ipv6]
[ 86.621930] [] e100_poll+0x29e/0x340 [e100]
[ 86.622062] [] error_code+0x39/0x40
[ 86.622156] [] copy_mount_options+0x43/0x150
[ 86.622242] [] getname+0xb3/0xe0
[ 86.622332] [] sys_mount+0x77/0xc0
[ 86.622418] [] sysenter_past_esp+0x56/0x79
[ 86.622511] Code: 58 02 00 00 85 c0 74 10 8d 83 5c 02 00 00 5b e9 d4 f4 fe ff 8d 74 26 00 5b c3 8b 83 9c 00 00 00 8b 80 60 01 00 00 e8 8d 99 00 00 <0f> 0b 22 01 b9 ca b3 e0 8b 83 58 02 00 00 eb 9e 8d b6 00 00 00
[ 86.624710] EIP: [] gfs2_clear_inode+0x73/0x90 [gfs2] SS:ESP 0068:dd719c9c
[ 86.624867] <1>BUG: unable to handle kernel NULL pointer dereference at virtual address 00000008
[ 86.625200] printing eip:
[ 86.625200] printing eip:
[ 86.625244] e0b20e9e
[ 86.625288] *pde = 003ea001
[ 86.625333] Oops: 0002 [#2]
[ 86.625377] SMP
[ 86.625484] Modules linked in: video container button battery asus_acpi ac sctp lock_dlm gfs2 dlm configfs ipv6 af_packet md_mod tsdev sg shpchp pci_hotplug evdev intel_agp agpgart e100 mii pcspkr floppy psmouse serio_raw dm_round_robin dm_multipath dm_mod ext3 jbd mbcache sd_mod lpfc scsi_transport_fc scsi_mod ide_generic ide_cd cdrom ide_disk piix generic thermal processor fan
[ 86.627124] CPU: 0
[ 86.627127] EIP: 0060:[] Not tainted VLI
[ 86.627130] EFLAGS: 00210246 (2.6.18-rc2 #1)
[ 86.627290] EIP is at drop_bh+0x8e/0x1b0 [gfs2]
[ 86.627338] eax: 00000004 ebx: dc92ec7c ecx: 00000000 edx: dc937108
[ 86.627390] esi: dd719c58 edi: dc92ec98 ebp: 00000000 esp: dd7eff2c
[ 86.627440] ds: 007b es: 007b ss: 0068
[ 86.627489] Process lock_dlm1 (pid: 4399, ti=dd7ee000 task=dd915000 task.ti=dd7ee000)
[ 86.627538] Stack: e0ae17b8 dd7effac e0ace000 e0b4f440 e0ace000 dd7effac dc92ec7c dd0ba740
[ 86.627892] e0b20586 00200292 c0137b1a df097000 df097000 fffefffe e0a19b3d 00000013
[ 86.628245] 000597ad 00000014 00000000 00000086 00000009 dd7effac df097174 00019a0e
[ 86.628597] Call Trace:
[ 86.628685] [] gfs2_glock_cb+0x96/0x170 [gfs2]
[ 86.628797] [] remove_wait_queue+0x1a/0x50
[ 86.628893] [] gdlm_thread+0x4ed/0x730 [lock_dlm]
[ 86.629002] [] default_wake_function+0x0/0x10
[ 86.629098] [] gdlm_thread+0x0/0x730 [lock_dlm]
[ 86.629185] [] kthread+0xf7/0x100
[ 86.630076] [] kthread+0x0/0x100
[ 86.630160] [] kernel_thread_helper+0x5/0x10
[ 86.630245] Code: 89 d8 e8 d6 f0 ff ff 8b 44 24 0c 8b 48 14 85 c9 74 09 ba 60 00 00 00 89 d8 ff d1 85 f6 74 22 89 f8 e8 f7 1c 7b df 8b 06 8b 56 04 <89> 50 04 89 02 b0 01 89 36 89 76 04 c7 46 18 00 00 00 00 86 43
[ 86.632444] EIP: [] drop_bh+0x8e/0x1b0 [gfs2] SS:ESP 0068:dd7eff2c
[ 86.632588]

Hope this gives some better info.

Thanks
Fabio

--
I'm going to make him an offer he can't refuse.
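
P.S. For anyone puzzled by the journal number in the "can't mount journal" messages: 4294967295 is just the jid of -1 that David mentions, printed as an unsigned 32-bit value. A minimal sketch of how one might spot that case when scanning dmesg output follows. The helper names (`as_signed32`, `journal_id`) are made up for illustration and are not part of any GFS tool; the only thing taken from the logs is the message text itself.

```python
import re

# Hypothetical helpers for illustration only; the message format is copied
# from the "can't mount journal" lines quoted in this mail.
JOURNAL_RE = re.compile(r"can't mount journal #(\d+)")


def as_signed32(value):
    """Reinterpret an unsigned 32-bit number as a signed 32-bit number."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value >= (1 << 31) else value


def journal_id(line):
    """Return the signed journal id from a matching dmesg line, else None."""
    match = JOURNAL_RE.search(line)
    return as_signed32(int(match.group(1))) if match else None


line = "GFS: fsid=edgy:gfs1.4294967295: can't mount journal #4294967295"
print(journal_id(line))  # -1
```

A result of -1 here only says that no valid journal id reached the kernel; per David's reply that usually points at a missing mount helper, but the sketch does not check for the helper itself.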