* [Cluster-devel] Problems mounting GFS2 devices
@ 2006-07-20  6:34 Fabio M. Di Nitto
  2006-07-20 14:49 ` David Teigland
  0 siblings, 1 reply; 4+ messages in thread
From: Fabio M. Di Nitto @ 2006-07-20  6:34 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi guys,

This is with the latest gfs2 code from git and the latest CVS HEAD userland.

# gfs2_mkfs -t edgy:mygfs2 -p lock_dlm -j 4 /dev/mapper/mofo 
This will destroy any data on /dev/mapper/mofo.

Are you sure you want to proceed? [y/n] y

Device:                    /dev/mapper/mofo
Blocksize:                 4096
Device Size                237.36 GB (62223680 blocks)
Filesystem Size:           237.36 GB (62223679 blocks)
Journals:                  4
Resource Groups:           950
Locking Protocol:          "lock_dlm"
Lock Table:                "edgy:mygfs2"

/dev/mapper/mofo is a SAN-exported device as seen by multipath,
but accessing the device directly makes no difference.

# mount /dev/mapper/mofo /mnt
Segmentation fault

# dmesg
[42950437.160000] GFS2: fsid=: Trying to join cluster "lock_dlm", "edgy:mygfs2"
[42950437.170000] dlm: mygfs2: recover 1
[42950437.170000] dlm: mygfs2: add member 1
[42950437.170000] dlm: mygfs2: total members 1
[42950437.170000] dlm: mygfs2: dlm_recover_directory
[42950437.170000] dlm: mygfs2: dlm_recover_directory 0 entries
[42950437.170000] dlm: mygfs2: recover 1 done: 0 ms
[42950437.170000] GFS2: fsid=edgy:mygfs2.4294967295: Joined cluster. Now mounting FS...
[42950437.180000] GFS2: fsid=edgy:mygfs2.4294967295: can't mount journal #4294967295
[42950437.180000] GFS2: fsid=edgy:mygfs2.4294967295: there are only 4 journals (0 - 3)
[42950437.180000] GFS2: fsid=edgy:mygfs2.4294967295: fatal assertion failed

^^^ Note that I get the same kind of error no matter how many journals I create.

[42950437.180000] ------------[ cut here ]------------
[42950437.180000] kernel BUG at fs/gfs2/ops_super.c:290!
[42950437.180000] invalid opcode: 0000 [#1]
[42950437.180000] SMP 
[42950437.180000] Modules linked in: video tc1100_wmi sony_acpi pcc_acpi hotkey dev_acpi container button acpi_sbs battery ac i2c_acpi_ec i2c_core sctp lock_dlm gfs2 dlm configfs ipv6 af_packet md_mod lp sg snd_intel8x0 snd_ac97_codec snd_ac97_bus hw_random snd_pcm_oss snd_mixer_oss tsdev shpchp snd_pcm snd_timer evdev intel_agp agpgart snd soundcore snd_page_alloc pci_hotplug e100 mii parport_pc psmouse pcspkr floppy serio_raw parport dm_round_robin dm_multipath dm_mod ext3 jbd sd_mod uhci_hcd usbcore lpfc scsi_transport_fc scsi_mod ide_generic ide_cd cdrom ide_disk piix generic thermal processor fan vesafb capability commoncap vga16fb vgastate fbcon tileblit font bitblit softcursor
[42950437.180000] CPU:    0
[42950437.180000] EIP:    0060:[<e0c20493>]    Not tainted VLI
[42950437.180000] EFLAGS: 00010296   (2.6.17-5-server #2) 
[42950437.180000] EIP is at gfs2_clear_inode+0x73/0x90 [gfs2]
[42950437.180000] eax: 0000004f   ebx: d0118048   ecx: 00000000   edx: 00000292
[42950437.180000] esi: 00000000   edi: e0bd3000   ebp: e0beb4ac   esp: d3c4fcb8
[42950437.180000] ds: 007b   es: 007b   ss: 0068
[42950437.180000] Process mount (pid: 4736, threadinfo=d3c4e000 task=dafb2580)
[42950437.180000] Stack: d0118048 c01850bd df20c400 d0118048 df20c400 c01852ce d0118048 e0beb788 
[42950437.180000]        c0184bac ffffffea e0c1cd2b e0c2cecc e0beb788 00000004 00000003 d3c4fcf4 
[42950437.180000]        d3c4fcf4 00000000 dafb2580 00000003 00000020 00000000 000000c2 00000000 
[42950437.180000] Call Trace:
[42950437.180000]  <c01850bd> clear_inode+0x9d/0x120  <c01852ce> generic_drop_inode+0x6e/0x150
[42950437.180000]  <c0184bac> iput+0x5c/0x70  <e0c1cd2b> init_journal+0x8b/0x4a0 [gfs2]
[42950437.180000]  <e0c1d17f> init_inodes+0x3f/0x200 [gfs2]  <e0c1dd8f> fill_super+0x58f/0x6e0 [gfs2]
[42950437.180000]  <e0c107e8> gfs2_glock_nq_num+0x48/0x80 [gfs2]  <c017278c> get_sb_bdev+0xec/0x130
[42950437.180000]  <c0187598> alloc_vfsmnt+0xa8/0xe0  <e0c1c859> gfs2_get_sb+0x19/0x20 [gfs2]
[42950437.180000]  <e0c1d800> fill_super+0x0/0x6e0 [gfs2]  <c017210c> do_kern_mount+0xcc/0x170
[42950437.180000]  <c01889a5> do_mount+0x435/0x730  <c014e339> filemap_nopage+0x2e9/0x390
[42950437.180000]  <c0158b88> __handle_mm_fault+0x368/0xc10  <c01190a6> do_page_fault+0x3b6/0x744
[42950437.180000]  <c0103be7> error_code+0x4f/0x54  <c0150c32> __alloc_pages+0x52/0x310
[42950437.180000]  <c0187873> copy_mount_options+0x43/0x150  <c0188d17> sys_mount+0x77/0xc0
[42950437.180000]  <c0103007> sysenter_past_esp+0x54/0x75 
[42950437.180000] Code: 60 02 00 00 85 c0 74 10 8d 83 64 02 00 00 5b e9 a4 f4 fe ff 8d 74 26 00 5b c3 8b 83 9c 00 00 00 8b 80 60 01 00 00 e8 9d 98 00 00 <0f> 0b 22 01 dc ba c2 e0 8b 83 60 02 00 00 eb 9e 8d b6 00 00 00 
[42950437.180000] EIP: [<e0c20493>] gfs2_clear_inode+0x73/0x90 [gfs2] SS:ESP 0068:d3c4fcb8
[42950437.180000]  <1>BUG: unable to handle kernel NULL pointer dereference at virtual address 00000008
[42950437.520000]  printing eip:
[42950437.530000] e0c1005e
[42950437.530000] *pde = 0170d001
[42950437.540000] Oops: 0002 [#2]
[42950437.540000] SMP 
[42950437.540000] Modules linked in: video tc1100_wmi sony_acpi pcc_acpi hotkey dev_acpi container button acpi_sbs battery ac i2c_acpi_ec i2c_core sctp lock_dlm gfs2 dlm configfs ipv6 af_packet md_mod lp sg snd_intel8x0 snd_ac97_codec snd_ac97_bus hw_random snd_pcm_oss snd_mixer_oss tsdev shpchp snd_pcm snd_timer evdev intel_agp agpgart snd soundcore snd_page_alloc pci_hotplug e100 mii parport_pc psmouse pcspkr floppy serio_raw parport dm_round_robin dm_multipath dm_mod ext3 jbd sd_mod uhci_hcd usbcore lpfc scsi_transport_fc scsi_mod ide_generic ide_cd cdrom ide_disk piix generic thermal processor fan vesafb capability commoncap vga16fb vgastate fbcon tileblit font bitblit softcursor
[42950437.540000] CPU:    0
[42950437.540000] EIP:    0060:[<e0c1005e>]    Not tainted VLI
[42950437.540000] EFLAGS: 00010246   (2.6.17-5-server #2) 
[42950437.540000] EIP is at drop_bh+0x8e/0x1b0 [gfs2]
[42950437.540000] eax: 00000004   ebx: d484f43c   ecx: 00000000   edx: d0118048
[42950437.540000] esi: d3c4fc74   edi: d484f458   ebp: 00000000   esp: c8505f2c
[42950437.540000] ds: 007b   es: 007b   ss: 0068
[42950437.540000] Process lock_dlm2 (pid: 4739, threadinfo=c8504000 task=dfc81a90)
[42950437.540000] Stack: e0be4358 c8505fac e0bd3000 e0c3e220 e0bd3000 c8505fac d484f43c df20ce00 
[42950437.540000]        e0c0f746 00000292 c0135c7a df20ce00 df348f40 fffefffe e0b32b9d 00000000 
[42950437.540000]        00000009 dfc81b98 dfc81a90 dffa7a90 c1404d20 c8505fac df20cf74 00010000 
[42950437.540000] Call Trace:
[42950437.540000]  <e0c0f746> gfs2_glock_cb+0x96/0x170 [gfs2]  <c0135c7a> remove_wait_queue+0x1a/0x50
[42950437.540000]  <e0b32b9d> gdlm_thread+0x4fd/0x740 [lock_dlm]  <c011b9f0> default_wake_function+0x0/0x10
[42950437.540000]  <e0b326a0> gdlm_thread+0x0/0x740 [lock_dlm]  <c013586c> kthread+0xac/0xe0
[42950437.540000]  <c01357c0> kthread+0x0/0xe0  <c0101005> kernel_thread_helper+0x5/0x10
[42950437.540000] Code: 89 d8 e8 d6 f0 ff ff 8b 44 24 0c 8b 48 14 85 c9 74 09 ba 60 00 00 00 89 d8 ff d1 85 f6 74 22 89 f8 e8 e7 6a 6c df 8b 06 8b 56 04 <89> 50 04 89 02 b0 01 89 36 89 76 04 c7 46 18 00 00 00 00 86 43 
[42950437.540000] EIP: [<e0c1005e>] drop_bh+0x8e/0x1b0 [gfs2] SS:ESP 0068:c8505f2c
[42950437.540000]  <3>BUG: soft lockup detected on CPU#0!

The system is still usable for a few seconds, then another oops appears on the terminal and
the machine dies hard.

(hand copied)

[42950461.990000] <c014899x> softlockup_tick+0x9c/0xf0		<c012b9c1> update_process_times+0x21/0x80
[42950461.990000] <c0113cb1> smp_apic_timer_interrupt+0x51/0x60 <c0103b40> apic_timer_interrupt+0x1c/0x24
[42950461.990000] <c02d6b45> _spin_lock+0x5/0x10		<e0c0e85b> gfs2_glmutex_trylock+0xb/0x40 [gfs2]
[42950461.990000] <e0c10f88> scan_glock+0x8/0x70 [gfs2]		<e0c0e9fb> examine_bucket+0x8b/0xd0 [gfs2]
[42950461.990000] <e0c10f80> scan_glock+0x0/0x70 [gfs2]		<e0c07790> gfs2_scand+0x0/0x50 [gfs2]
[42950461.990000] <e0c0ebaf> gfs2_scand_internal+0x1f/0x40 [gfs2] <e0c0779c> gfs2_scand+0xc/0x50 [gfs2]
[42950461.990000] <c013586c> kthread+0xac/0xe0			<c01357c0> kthread+0x0/0xe0
[42950461.990000] <c0101005> kernel_thread_helper+0x5/0x10

Here is a test with lock_nolock:

# gfs2_mkfs -t edgy:mygfs2 -p lock_nolock -j 4 /dev/mapper/mofo 
This will destroy any data on /dev/mapper/mofo.

Are you sure you want to proceed? [y/n] y

Device:                    /dev/mapper/mofo
Blocksize:                 4096
Device Size                237.36 GB (62223680 blocks)
Filesystem Size:           237.36 GB (62223679 blocks)
Journals:                  4
Resource Groups:           950
Locking Protocol:          "lock_nolock"
Lock Table:                "edgy:mygfs2"

[42949467.940000] Lock_Nolock (built Jul 18 2006 14:27:44) installed
[42949521.080000] GFS2: fsid=: Trying to join cluster "lock_nolock", "edgy:mygfs2"
[42949521.080000] GFS2: fsid=edgy:mygfs2.0: Joined cluster. Now mounting FS...
[42949521.220000] GFS2: fsid=edgy:mygfs2.0: jid=0, already locked for use
[42949521.220000] GFS2: fsid=edgy:mygfs2.0: jid=0: Looking at journal...
[42949521.330000] GFS2: fsid=edgy:mygfs2.0: jid=0: Done
[42949521.330000] GFS2: fsid=edgy:mygfs2.0: jid=1: Trying to acquire journal lock...
[42949521.330000] GFS2: fsid=edgy:mygfs2.0: jid=1: Looking at journal...
[42949521.470000] GFS2: fsid=edgy:mygfs2.0: jid=1: Done
[42949521.470000] GFS2: fsid=edgy:mygfs2.0: jid=2: Trying to acquire journal lock...
[42949521.470000] GFS2: fsid=edgy:mygfs2.0: jid=2: Looking at journal...
[42949521.620000] GFS2: fsid=edgy:mygfs2.0: jid=2: Done
[42949521.620000] GFS2: fsid=edgy:mygfs2.0: jid=3: Trying to acquire journal lock...
[42949521.620000] GFS2: fsid=edgy:mygfs2.0: jid=3: Looking at journal...
[42949521.770000] GFS2: fsid=edgy:mygfs2.0: jid=3: Done

# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/hda1              19G  789M   17G   5% /
varrun                252M   80K  252M   1% /var/run
varlock               252M  4,0K  252M   1% /var/lock
udev                   10M  112K  9,9M   2% /dev
devshm                252M     0  252M   0% /dev/shm
Segmentation fault

# dmesg
[42949571.960000] BUG: unable to handle kernel paging request at virtual address 0000109c
[42949571.960000]  printing eip:
[42949571.960000] e0c374c8
[42949571.960000] *pde = 1bbf2001
[42949571.960000] Oops: 0000 [#1]
[42949571.960000] SMP 
[42949571.960000] Modules linked in: lock_nolock video tc1100_wmi sony_acpi pcc_acpi hotkey dev_acpi container button acpi_sbs battery ac i2c_acpi_ec i2c_core sctp lock_dlm gfs2 dlm configfs ipv6 af_packet md_mod lp hw_random snd_intel8x0 snd_ac97_codec snd_ac97_bus snd_pcm_oss snd_mixer_oss sg snd_pcm snd_timer snd soundcore e100 tsdev evdev mii shpchp intel_agp agpgart pci_hotplug snd_page_alloc parport_pc psmouse serio_raw pcspkr parport floppy dm_round_robin dm_multipath dm_mod ext3 jbd sd_mod lpfc scsi_transport_fc uhci_hcd usbcore scsi_mod ide_generic ide_cd cdrom ide_disk piix generic thermal processor fan vesafb capability commoncap vga16fb vgastate fbcon tileblit font bitblit softcursor
[42949571.960000] CPU:    0
[42949571.960000] EIP:    0060:[<e0c374c8>]    Not tainted VLI
[42949571.960000] EFLAGS: 00010286   (2.6.17-5-server #2) 
[42949571.960000] EIP is at gfs2_statfs+0x18/0xd0 [gfs2]
[42949571.960000] eax: 00001000   ebx: def2d800   ecx: e0c556c0   edx: cc1abeb0
[42949571.960000] esi: cc1abeb0   edi: cc1abf04   ebp: cc1abeb0   esp: cc1abe74
[42949571.960000] ds: 007b   es: 007b   ss: 0068
[42949571.960000] Process df (pid: 4689, threadinfo=cc1aa000 task=dfc7da90)
[42949571.960000] Stack: dffc5ea0 dfbfe5f8 c017b8c1 dc7ff000 dfbfe5f8 dffc5ea0 def2d800 cc1abeb0 
[42949571.960000]        cc1abf04 cc1aa000 c0168fe5 00000000 cc1abeb0 cc1abf14 c0169116 00000000 
[42949571.960000]        00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
[42949571.960000] Call Trace:
[42949571.960000]  <c017b8c1> link_path_walk+0x71/0xf0  <c0168fe5> vfs_statfs+0x65/0x80
[42949571.960000]  <c0169116> vfs_statfs64+0x16/0x30  <c016a5c3> sys_statfs64+0x83/0xc0
[42949571.960000]  <c0226220> tty_write+0x0/0x1f0  <c016be11> sys_write+0x41/0x70
[42949571.960000]  <c0103007> sysenter_past_esp+0x54/0x75 
[42949571.960000] Code: 60 02 00 00 eb 9e 8d b6 00 00 00 00 8d bc 27 00 00 00 00 83 ec 28 89 74 24 1c 89 7c 24 20 89 6c 24 24 89 d5 89 5c 24 18 8b 40 0c <8b> 80 9c 00 00 00 8b 98 60 01 00 00 8d 83 e4 02 00 00 e8 61 f6 
[42949571.960000] EIP: [<e0c374c8>] gfs2_statfs+0x18/0xd0 [gfs2] SS:ESP 0068:cc1abe74
[42949571.960000]  


Thanks for your time
Fabio

PS: Of course I am ready to test possible patches or provide any extra info
required. The SAN is not in production, so we can play as much as we want.




* [Cluster-devel] Problems mounting GFS2 devices
  2006-07-20  6:34 [Cluster-devel] Problems mounting GFS2 devices Fabio M. Di Nitto
@ 2006-07-20 14:49 ` David Teigland
  2006-07-21 11:54   ` Fabio Massimo Di Nitto
  0 siblings, 1 reply; 4+ messages in thread
From: David Teigland @ 2006-07-20 14:49 UTC (permalink / raw)
  To: cluster-devel.redhat.com

On Thu, Jul 20, 2006 at 08:34:21AM +0200, Fabio M. Di Nitto wrote:

> # mount /dev/mapper/mofo /mnt
> Segmentation fault

> [42950437.180000] GFS2: fsid=edgy:mygfs2.4294967295: can't mount journal
> #4294967295
> [42950437.180000] GFS2: fsid=edgy:mygfs2.4294967295: there are only 4
> journals (0 - 3)

A jid of -1 usually means mount(8) didn't find the mount.gfs2 helper.  The
mount should fail cleanly, though, if that happens, don't know what's
wrong there.
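
The journal number 4294967295 in the log is simply -1 printed as an unsigned 32-bit value. A minimal sketch of that arithmetic follows; the function name jid_as_u32 is made up for illustration, and it assumes a shell with 64-bit arithmetic (for example bash on a current system):

```shell
# Hypothetical helper: show a signed journal id the way the kernel log
# prints it, i.e. reinterpreted as an unsigned 32-bit number.
jid_as_u32() {
    echo $(( $1 & 0xFFFFFFFF ))
}

jid_as_u32 -1    # prints 4294967295, the "journal" seen in the failed mount
jid_as_u32 3     # prints 3, a valid journal id is unchanged
```

With 4 journals (0 - 3), any value produced from -1 is out of range, which matches the "can't mount journal" message.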




* [Cluster-devel] Problems mounting GFS2 devices
  2006-07-20 14:49 ` David Teigland
@ 2006-07-21 11:54   ` Fabio Massimo Di Nitto
  2006-07-25  8:33     ` [Cluster-devel] Problems mounting GFS2/GFS devices and possible ipv6 dlm issue Fabio Massimo Di Nitto
  0 siblings, 1 reply; 4+ messages in thread
From: Fabio Massimo Di Nitto @ 2006-07-21 11:54 UTC (permalink / raw)
  To: cluster-devel.redhat.com

David Teigland wrote:
> On Thu, Jul 20, 2006 at 08:34:21AM +0200, Fabio M. Di Nitto wrote:
> 
>> # mount /dev/mapper/mofo /mnt
>> Segmentation fault
> 
>> [42950437.180000] GFS2: fsid=edgy:mygfs2.4294967295: can't mount journal
>> #4294967295
>> [42950437.180000] GFS2: fsid=edgy:mygfs2.4294967295: there are only 4
>> journals (0 - 3)
> 
> A jid of -1 usually means mount(8) didn't find the mount.gfs2 helper.  The
> mount should fail cleanly, though, if that happens, don't know what's
> wrong there.

Right, that was it. With that fixed (mount is not very flexible about $PATH), I
found some other interesting issues after following your suggestion on IRC to
test a set of different kernels.
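
For anyone hitting the same thing: mount(8) typically runs an external /sbin/mount.<type> helper instead of searching $PATH, so a helper installed elsewhere is not picked up. A rough check, assuming /sbin is the helper directory on your system (the function name and the optional directory argument are made up for this sketch):

```shell
# Hypothetical check: is there an executable mount helper for this
# filesystem type in the directory mount(8) is assumed to look in?
find_mount_helper() {
    fstype=$1
    dir=${2:-/sbin}    # assumption: helpers live in /sbin
    if [ -x "$dir/mount.$fstype" ]; then
        echo "$dir/mount.$fstype"
    else
        echo "no mount.$fstype in $dir" >&2
        return 1
    fi
}

find_mount_helper gfs2 || echo "helper missing: expect the jid -1 symptom"
```

If the helper lives somewhere else, pass that directory as the second argument to confirm it exists there.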

I compared .18-rc2 from git + gfs2-2.6.git + the 3 export symbol patch +
gfs-kernel built from CVS HEAD against a backport of all of the above into the
Ubuntu kernel, to make sure that everything was working.

The test environment was exactly the same, and these are the results:

GFS1 lock_nolock
.18-rc2: ok
.17-5-server: ok

GFS1 lock_dlm
.18-rc2: nok
.17-5-server: nok

The mount operation seems to succeed, but dmesg shows:

[  148.044734] GFS: fsid=edgy:gfs1.4294967295: can't mount journal #4294967295
[  148.044800] GFS: fsid=edgy:gfs1.4294967295: there are only 4 journals (0 - 3)

(Note that there is no mount.gfs, and it's not in CVS either. There is a man page
for it, though.)

umount generates an oops.

[  200.262962] BUG: unable to handle kernel NULL pointer dereference at virtual address 00000004
[  200.263070]  printing eip:
[  200.263115] c011aff3
[  200.263159] *pde = 1d392001
[  200.263206] Oops: 0000 [#1]
[  200.263249] SMP
[  200.263358] Modules linked in: gfs video container button battery asus_acpi ac sctp lock_dlm gfs2 dlm configfs ipv6 af_packet md_mod sg tsdev shpchp pci_hotplug evdev intel_agp agpgart e100 mii floppy pcspkr psmouse serio_raw dm_round_robin dm_multipath dm_mod ext3 jbd mbcache sd_mod lpfc scsi_transport_fc scsi_mod ide_generic ide_cd cdrom ide_disk piix generic thermal processor fan
[  200.265034] CPU:    0
[  200.265036] EIP:    0060:[<c011aff3>]    Not tainted VLI
[  200.265040] EFLAGS: 00010086   (2.6.18-rc2 #1)
[  200.265183] EIP is at task_rq_lock+0x23/0x70
[  200.265232] eax: 00000286   ebx: c03e7120   ecx: 00000000   edx: dd429ee0
[  200.265284] esi: c03e7120   edi: 00000000   ebp: dd429e84   esp: dd429e74
[  200.265334] ds: 007b   es: 007b   ss: 0068
[  200.265383] Process umount (pid: 4449, ti=dd428000 task=ddc25000 task.ti=dd428000)
[  200.265432] Stack: dd429ee0 e0ad8000 00000000 df671200 dd429ef0 c011bfff 00000282 e0ad8000
[  200.265786]        dd0a3708 00000000 0000000f dd428000 00000000 00000000 00000000 00000001
[  200.266137]        e0c0d2c7 dd0a1840 00000000 00000246 dd0a1848 dd0a1840 dd429ef0 c02d10f4
[  200.266490] Call Trace:
[  200.266581]  [<c011bfff>] try_to_wake_up+0x1f/0x420
[  200.266674]  [<e0c0d2c7>] gfs_clear_inode+0x57/0x80 [gfs]
[  200.266881]  [<c02d10f4>] wait_for_completion+0x14/0xb0
[  200.266980]  [<e0c0d402>] gfs_put_super+0x112/0x2be [gfs]
[  200.267097]  [<c0177f32>] generic_shutdown_super+0x82/0x130
[  200.267194]  [<c0178005>] kill_block_super+0x25/0x40
[  200.267278]  [<c01780bc>] deactivate_super+0x4c/0x70
[  200.267363]  [<c018ea1a>] sys_umount+0x4a/0x250
[  200.267454]  [<c01617ae>] unmap_region+0xee/0x120
[  200.267542]  [<c017b03f>] sys_stat64+0xf/0x30
[  200.267626]  [<c0162297>] do_munmap+0x197/0x1f0
[  200.267717]  [<c018ec35>] sys_oldumount+0x15/0x20
[  200.267800]  [<c0103109>] sysenter_past_esp+0x56/0x79
[  200.267895] Code: e7 8d b4 26 00 00 00 00 55 89 e5 83 ec 10 89 7d fc 89 c7 89 5d f4 89 55 f0 89 75 f8 be 20 71 3e c0 9c 58 fa 8b 55 f0 89 f3 89 02 <8b> 47 04 8b 40 10 8b 14 85 00 b4 3a c0 01 d3 89 d8 e8 87 7b 1b
[  200.270094] EIP: [<c011aff3>] task_rq_lock+0x23/0x70 SS:ESP 0068:dd429e74
[  200.270212]  BUG: warning at kernel/exit.c:854/do_exit()
[  200.270298]  [<c0126491>] do_exit+0x741/0x8a0
[  200.270387]  [<c012379b>] printk+0x1b/0x20
[  200.270479]  [<c01046d6>] die+0x316/0x320
[  200.270573]  [<c01195ae>] do_page_fault+0x34e/0x6bb
[  200.270662]  [<c01e3233>] bitmap_parse+0x113/0x1f0
[  200.270752]  [<c0119260>] do_page_fault+0x0/0x6bb
[  200.270834]  [<c0103ccd>] error_code+0x39/0x40
[  200.270922]  [<c011aff3>] task_rq_lock+0x23/0x70
[  200.271008]  [<c011bfff>] try_to_wake_up+0x1f/0x420
[  200.271097]  [<e0c0d2c7>] gfs_clear_inode+0x57/0x80 [gfs]
[  200.271215]  [<c02d10f4>] wait_for_completion+0x14/0xb0
[  200.271303]  [<e0c0d402>] gfs_put_super+0x112/0x2be [gfs]
[  200.271420]  [<c0177f32>] generic_shutdown_super+0x82/0x130
[  200.271508]  [<c0178005>] kill_block_super+0x25/0x40
[  200.271592]  [<c01780bc>] deactivate_super+0x4c/0x70
[  200.271675]  [<c018ea1a>] sys_umount+0x4a/0x250
[  200.271760]  [<c01617ae>] unmap_region+0xee/0x120
[  200.271845]  [<c017b03f>] sys_stat64+0xf/0x30
[  200.271928]  [<c0162297>] do_munmap+0x197/0x1f0
[  200.272017]  [<c018ec35>] sys_oldumount+0x15/0x20
[  200.272101]  [<c0103109>] sysenter_past_esp+0x56/0x79

GFS2 lock_nolock is ok, modulo a stupid bug in mount.gfs2, but I had no time to
provide a patch. Sorry about that. To reproduce:

mount -t gfs2 /dev/foo /mnt

It works fine if you omit "-t gfs2", and it's not kernel version dependent.

GFS2 lock_dlm (this time with mount.gfs2 in the right place) is not ok (both
kernels fail the same way).

[   85.899272] GFS2: fsid=: Trying to join cluster "lock_dlm", "edgy:gfs2"
[   85.921611] dlm: gfs2: recover 1
[   85.921757] dlm: gfs2: add member 1
[   85.921816] dlm: gfs2: total members 1
[   85.921863] dlm: gfs2: dlm_recover_directory
[   85.922080] dlm: gfs2: dlm_recover_directory 0 entries
[   85.922158] dlm: gfs2: recover 1 done: 0 ms
[   85.923137] GFS2: fsid=edgy:gfs2.0: Joined cluster. Now mounting FS...
[   86.071914] GFS2: fsid=edgy:gfs2.0: jid=0, already locked for use
[   86.071980] GFS2: fsid=edgy:gfs2.0: jid=0: Looking at journal...
[   86.179944] GFS2: fsid=edgy:gfs2.0: jid=0: Done
[   86.180008] GFS2: fsid=edgy:gfs2.0: jid=1: Trying to acquire journal lock...
[   86.183291] GFS2: fsid=edgy:gfs2.0: jid=1: Looking at journal...
[   86.315432] GFS2: fsid=edgy:gfs2.0: jid=1: Done
[   86.315497] GFS2: fsid=edgy:gfs2.0: jid=2: Trying to acquire journal lock...
[   86.318686] GFS2: fsid=edgy:gfs2.0: jid=2: Looking at journal...
[   86.461654] GFS2: fsid=edgy:gfs2.0: jid=2: Done
[   86.461719] GFS2: fsid=edgy:gfs2.0: jid=3: Trying to acquire journal lock...
[   86.465081] GFS2: fsid=edgy:gfs2.0: jid=3: Looking at journal...
[   86.606193] GFS2: fsid=edgy:gfs2.0: jid=3: Done
[   86.616587] GFS2: fsid=edgy:gfs2.0: fatal assertion failed
[   86.616674] ------------[ cut here ]------------
[   86.616722] kernel BUG at fs/gfs2/ops_super.c:290!
[   86.616770] invalid opcode: 0000 [#1]
[   86.616815] SMP
[   86.616923] Modules linked in: video container button battery asus_acpi ac sctp lock_dlm gfs2 dlm configfs ipv6 af_packet md_mod tsdev sg shpchp pci_hotplug evdev intel_agp agpgart e100 mii pcspkr floppy psmouse serio_raw dm_round_robin dm_multipath dm_mod ext3 jbd mbcache sd_mod lpfc scsi_transport_fc scsi_mod ide_generic ide_cd cdrom ide_disk piix generic thermal processor fan
[   86.618567] CPU:    0
[   86.618569] EIP:    0060:[<e0b312a3>]    Not tainted VLI
[   86.618573] EFLAGS: 00210282   (2.6.18-rc2 #1)
[   86.618823] EIP is at gfs2_clear_inode+0x73/0x90 [gfs2]
[   86.618872] eax: 00000041   ebx: dc937108   ecx: fffeddbb   edx: 00000000
[   86.618924] esi: dc937240   edi: dc937108   ebp: dd719cda   esp: dd719c9c
[   86.618974] ds: 007b   es: 007b   ss: 0068
[   86.619024] Process mount.gfs2 (pid: 4396, ti=dd718000 task=ddce8000 task.ti=dd718000)
[   86.619074] Stack: dc937108 c018ad74 df122a00 dc937108 df122a00 c018af4e dc937108 dc93e148
[   86.619428]        c018a8cc e0ace000 e0b2d8e8 dd719cda e0b3c8b8 00000000 00000000 75710fed
[   86.619780]        5f61746f 6e616863 00306567 dc92d500 e0b2e02c dd719cec 00000000 00000000
[   86.620133] Call Trace:
[   86.620220]  [<c018ad74>] clear_inode+0x54/0xd0
[   86.620311]  [<c018af4e>] generic_drop_inode+0x6e/0x160
[   86.620396]  [<c018a8cc>] iput+0x5c/0x70
[   86.620477]  [<e0b2d8e8>] init_per_node+0x168/0x300 [gfs2]
[   86.620597]  [<e0b2e02c>] init_inodes+0xec/0x200 [gfs2]
[   86.620714]  [<e0b2ebae>] fill_super+0x5ae/0x6f0 [gfs2]
[   86.620840]  [<e0b21628>] gfs2_glock_nq_num+0x48/0x80 [gfs2]
[   86.620959]  [<c0178861>] get_sb_bdev+0x101/0x140
[   86.621054]  [<c0156700>] get_page_from_freelist+0x3d0/0x400
[   86.621150]  [<e0b2d641>] gfs2_get_sb+0x21/0x30 [gfs2]
[   86.621263]  [<e0b2e600>] fill_super+0x0/0x6f0 [gfs2]
[   86.621376]  [<c017818a>] vfs_kern_mount+0xaa/0x150
[   86.621466]  [<c0178289>] do_kern_mount+0x39/0x60
[   86.621550]  [<c01bca80>] dummy_sb_mount+0x0/0x10
[   86.621643]  [<c018e503>] do_mount+0x2e3/0x6f0
[   86.621735]  [<e0a8dbfe>] ipv6_rcv+0x17e/0x2a0 [ipv6]
[   86.621930]  [<e094827e>] e100_poll+0x29e/0x340 [e100]
[   86.622062]  [<c0103ccd>] error_code+0x39/0x40
[   86.622156]  [<c018d453>] copy_mount_options+0x43/0x150
[   86.622242]  [<c0180453>] getname+0xb3/0xe0
[   86.622332]  [<c018e987>] sys_mount+0x77/0xc0
[   86.622418]  [<c0103109>] sysenter_past_esp+0x56/0x79
[   86.622511] Code: 58 02 00 00 85 c0 74 10 8d 83 5c 02 00 00 5b e9 d4 f4 fe ff 8d 74 26 00 5b c3 8b 83 9c 00 00 00 8b 80 60 01 00 00 e8 8d 99 00 00 <0f> 0b 22 01 b9 ca b3 e0 8b 83 58 02 00 00 eb 9e 8d b6 00 00 00
[   86.624710] EIP: [<e0b312a3>] gfs2_clear_inode+0x73/0x90 [gfs2] SS:ESP 0068:dd719c9c
[   86.624867]  <1>BUG: unable to handle kernel NULL pointer dereference at virtual address 00000008
[   86.625200]  printing eip:
[   86.625200]  printing eip:
[   86.625244] e0b20e9e
[   86.625288] *pde = 003ea001
[   86.625333] Oops: 0002 [#2]
[   86.625377] SMP
[   86.625484] Modules linked in: video container button battery asus_acpi ac sctp lock_dlm gfs2 dlm configfs ipv6 af_packet md_mod tsdev sg shpchp pci_hotplug evdev intel_agp agpgart e100 mii pcspkr floppy psmouse serio_raw dm_round_robin dm_multipath dm_mod ext3 jbd mbcache sd_mod lpfc scsi_transport_fc scsi_mod ide_generic ide_cd cdrom ide_disk piix generic thermal processor fan
[   86.627124] CPU:    0
[   86.627127] EIP:    0060:[<e0b20e9e>]    Not tainted VLI
[   86.627130] EFLAGS: 00210246   (2.6.18-rc2 #1)
[   86.627290] EIP is at drop_bh+0x8e/0x1b0 [gfs2]
[   86.627338] eax: 00000004   ebx: dc92ec7c   ecx: 00000000   edx: dc937108
[   86.627390] esi: dd719c58   edi: dc92ec98   ebp: 00000000   esp: dd7eff2c
[   86.627440] ds: 007b   es: 007b   ss: 0068
[   86.627489] Process lock_dlm1 (pid: 4399, ti=dd7ee000 task=dd915000 task.ti=dd7ee000)
[   86.627538] Stack: e0ae17b8 dd7effac e0ace000 e0b4f440 e0ace000 dd7effac dc92ec7c dd0ba740
[   86.627892]        e0b20586 00200292 c0137b1a df097000 df097000 fffefffe e0a19b3d 00000013
[   86.628245]        000597ad 00000014 00000000 00000086 00000009 dd7effac df097174 00019a0e
[   86.628597] Call Trace:
[   86.628685]  [<e0b20586>] gfs2_glock_cb+0x96/0x170 [gfs2]
[   86.628797]  [<c0137b1a>] remove_wait_queue+0x1a/0x50
[   86.628893]  [<e0a19b3d>] gdlm_thread+0x4ed/0x730 [lock_dlm]
[   86.629002]  [<c011c400>] default_wake_function+0x0/0x10
[   86.629098]  [<e0a19650>] gdlm_thread+0x0/0x730 [lock_dlm]
[   86.629185]  [<c0137747>] kthread+0xf7/0x100
[   86.630076]  [<c0137650>] kthread+0x0/0x100
[   86.630160]  [<c0101005>] kernel_thread_helper+0x5/0x10
[   86.630245] Code: 89 d8 e8 d6 f0 ff ff 8b 44 24 0c 8b 48 14 85 c9 74 09 ba 60 00 00 00 89 d8 ff d1 85 f6 74 22 89 f8 e8 f7 1c 7b df 8b 06 8b 56 04 <89> 50 04 89 02 b0 01 89 36 89 76 04 c7 46 18 00 00 00 00 86 43
[   86.632444] EIP: [<e0b20e9e>] drop_bh+0x8e/0x1b0 [gfs2] SS:ESP 0068:dd7eff2c
[   86.632588]


Hope this gives some better info.

Thanks
Fabio

-- 
I'm going to make him an offer he can't refuse.




* [Cluster-devel] Problems mounting GFS2/GFS devices and possible ipv6 dlm issue.
  2006-07-21 11:54   ` Fabio Massimo Di Nitto
@ 2006-07-25  8:33     ` Fabio Massimo Di Nitto
  0 siblings, 0 replies; 4+ messages in thread
From: Fabio Massimo Di Nitto @ 2006-07-25  8:33 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Just a short update after fixing a bunch of bugs.

Fabio Massimo Di Nitto wrote:
> David Teigland wrote:
> 
> GFS1 lock_nolock
> .18-rc2: ok
> .17-5-server: ok
> 
> GFS1 lock_dlm
> 18-rc2: nok
> .17-5-server: nok
> 
> the mount operation seems to succeed. in dmesg:
> 
> [  148.044734] GFS: fsid=edgy:gfs1.4294967295: can't mount journal #4294967295
> [  148.044800] GFS: fsid=edgy:gfs1.4294967295: there are only 4 journals (0 - 3)
> 
> (note that there is no mount.gfs and it's not in CVS either. There is man page
> for it tho)

This is now ok after symlinking mount/umount.gfs to mount/umount.gfs2 and
applying the patch I posted a few minutes ago to handle /mnt or /mnt/.
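
The patch itself is not reproduced in this thread. Purely as an illustration of what treating /mnt and /mnt/ alike can look like, here is a shell sketch that strips trailing slashes before mount points are compared (normalize_mountpoint is a made-up name, not something taken from mount.gfs2):

```shell
# Hypothetical helper: strip trailing slashes from a mount point so that
# "/mnt" and "/mnt/" compare equal. A bare "/" is left alone.
normalize_mountpoint() {
    p=$1
    while [ "${p%/}" != "$p" ] && [ "$p" != "/" ]; do
        p=${p%/}
    done
    printf '%s\n' "$p"
}

normalize_mountpoint /mnt/    # prints /mnt
normalize_mountpoint /mnt     # prints /mnt
```

The loop form also copes with repeated slashes such as /mnt//.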

> 
> umount generate a OOPS.

not anymore.

It is still NOT possible to mount the shared device on two nodes when using IPv6
addresses. It seems that the dlm kernel module isn't configured properly, and we
get messages like these in dmesg:

[  403.530413] Joined cluster. Now mounting FS...
[  403.537627] dlm: reject connect from unknown addr
[  403.538067] dlm: Error sending to node 2 -32
[  403.538119] dlm: Initiating association with node 2

The same happens when trying to run other services, so it's not GFS/GFS2.
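
To pull the failing peers out of a longer dmesg, a small filter along these lines can help. It is only a sketch: the function name is made up, and it assumes the messages keep exactly the format shown above. If the trailing number is a negated errno, as is the usual kernel convention, -32 would be EPIPE.

```shell
# Hypothetical filter: read kernel log lines on stdin and print the node id
# and error code from each "dlm: Error sending to node N ERR" message.
dlm_send_errors() {
    sed -n 's/.*dlm: Error sending to node \([0-9][0-9]*\) \(-[0-9][0-9]*\).*/node=\1 errno=\2/p'
}

# Sample line taken from the log above.
printf '%s\n' '[  403.538067] dlm: Error sending to node 2 -32' | dlm_send_errors
# prints: node=2 errno=-32
```

On a live system the input would come from dmesg instead of the sample line.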

GFS2 status hasn't changed.

Thanks
Fabio

-- 
I'm going to make him an offer he can't refuse.



