[Cluster-devel] gfs_controld doesn't clean up tables properly

* [Cluster-devel] gfs_controld doesn't clean up tables properly
@ 2007-10-26  8:40 Fabio Massimo Di Nitto
  2007-10-29  7:42 ` Fabio Massimo Di Nitto
  2007-10-29 11:29 ` Fabio Massimo Di Nitto
  0 siblings, 2 replies; 3+ messages in thread
From: Fabio Massimo Di Nitto @ 2007-10-26  8:40 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi guys,

I think we saw a similar bug a while ago, and I am afraid it's back.

Extremely simple setup: a 3-node cluster, no services or anything fancy. One
shared block device (aoe + vblade) on 2.6.24-rc1.

node1# mkfs.gfs -j 6 -p lock_dlm -t gutsy:gfs /dev/etherd/e0.0

on all nodes:

mount -t gfs /dev/etherd/e0.0 /mnt/gfs
umount /mnt/gfs
mount -t gfs /dev/etherd/e0.0 /mnt/gfs
/sbin/mount.gfs: mount point already used or other mount in progress
/sbin/mount.gfs: error mounting lockproto lock_dlm

Oct 26 10:37:55 node3 gfs_controld[3466]: mount point /mnt/gfs already used

Note that the first mount succeeded.

Attempting to mount on another mount point stalls:

root@node3:~# mount -t gfs /dev/etherd/e0.0 /mnt/gfs2
[hang]
There is no strace output.

Fabio

PS: sorry, I don't have enough time to look into this in more detail today.

-- 
I'm going to make him an offer he can't refuse.

* [Cluster-devel] gfs_controld doesn't clean up tables properly
  2007-10-26  8:40 [Cluster-devel] gfs_controld doesn't clean up tables properly Fabio Massimo Di Nitto
@ 2007-10-29  7:42 ` Fabio Massimo Di Nitto
  2007-10-29 11:29 ` Fabio Massimo Di Nitto
  1 sibling, 0 replies; 3+ messages in thread
From: Fabio Massimo Di Nitto @ 2007-10-29  7:42 UTC (permalink / raw)
  To: cluster-devel.redhat.com

gfs_controld debugging:

root@node1:~# gfs_controld -P -D
1193643134 listen 3
1193643134 cpg 6
1193643134 groupd 8
1193643134 uevent 9
1193643134 plocks 12
1193643134 plock cpg message size: 336 bytes
1193643134 setup done
1193643157 client 6: join /mnt/gfs gfs lock_dlm gutsy:gfs rw /dev/etherd/e0.0
1193643157 mount: /mnt/gfs gfs lock_dlm gutsy:gfs rw /dev/etherd/e0.0
1193643157 gfs cluster name matches: gutsy
1193643157 gfs do_mount: rv 0
1193643157 groupd cb: set_id gfs 10001
1193643157 groupd cb: start gfs type 2 count 1 members 1
1193643157 gfs start 3 init 1 type 2 member_count 1
1193643157 gfs add member 1
1193643157 gfs total members 1 master_nodeid -1 prev -1
1193643157 gfs start_first_mounter
1193643157 gfs start_done 3
1193643157 notify_mount_client: nodir not found for lockspace gfs
1193643157 notify_mount_client: cmanconf_free_conf
1193643157 notify_mount_client: cman_finish
1193643157 notify_mount_client: hostdata=jid=0:id=65537:first=1
1193643157 groupd cb: finish gfs
1193643157 gfs finish 3 needs_recovery 0
1193643157 gfs set /sys/fs/gfs/gutsy:gfs/lock_module/block to 0
1193643157 gfs set open /sys/fs/gfs/gutsy:gfs/lock_module/block error -1 2
1193643157 kernel: add@ gutsy:gfs
1193643157 gfs ping_kernel_mount 0
1193643158 kernel: change@ gutsy:gfs
1193643158 gfs kernel_recovery_done_first first_done 0
1193643158 kernel: change@ gutsy:gfs
1193643158 gfs kernel_recovery_done_first first_done 0
1193643158 kernel: change@ gutsy:gfs
1193643158 gfs kernel_recovery_done_first first_done 0
1193643158 kernel: change@ gutsy:gfs
1193643158 gfs kernel_recovery_done_first first_done 0
1193643158 kernel: change@ gutsy:gfs
1193643158 gfs kernel_recovery_done_first first_done 0
1193643158 kernel: change@ gutsy:gfs
1193643158 gfs kernel_recovery_done_first first_done 1
1193643158 kernel: change@ gutsy:gfs
1193643158 gfs recovery_done jid 5 ignored, first 1,1
1193643158 gfs receive_recovery_done from 1 needs_recovery 0
1193643158 gfs set /sys/fs/gfs/gutsy:gfs/lock_module/block to 0
1193643158 client 6: mount_result /mnt/gfs gfs 0
1193643158 gfs got_mount_result: ci 6 result 0 another 0 first_mounter 1 opts 9
1193643158 gfs send_mount_status kernel_mount_error 0 first_mounter 1
1193643158 client 6 fd 13 dead
1193643158 client 6 fd -1 dead
1193643158 gfs receive_mount_status from 1 len 288 last_cb 3
1193643158 gfs _receive_mount_status from 1 kernel_mount_error 0 first_mounter 1 opts 9
1193643284 kernel: remove@ gutsy:gfs
1193643284 gfs get open /sys/fs/gfs/gutsy:gfs/lock_module/id error -1 2
1193643284 gfs ping_kernel_mount -1
1193643331 client 6: join /mnt/gfs gfs lock_dlm gutsy:gfs rw /dev/etherd/e0.0
1193643331 mount: /mnt/gfs gfs lock_dlm gutsy:gfs rw /dev/etherd/e0.0
1193643331 gfs add_another_mountpoint dir /mnt/gfs dev /dev/etherd/e0.0 ci 6
1193643331 mount point /mnt/gfs already used
1193643331 gfs do_mount: rv -16
1193643331 client 6 fd 13 dead
1193643331 client 6 fd -1 dead

David, just to avoid confusion, I discovered this problem while testing the
noccs branch (hence the strange log entries) but I can reproduce it in the exact
same way with a clean CVS checkout from HEAD.
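
The failing step in the log above is the second "do_mount: rv -16" line
(16 is EBUSY on Linux). As a rough sketch for spotting such lines in a saved
copy of the debug output, assuming the "<epoch> <fs> do_mount: rv <n>" layout
shown above stays the same; the function name failed_mounts is made up for
illustration and is not part of gfs_controld:

```shell
#!/bin/sh
# Sketch only: print gfs_controld debug lines where do_mount returned non-zero.
# Assumes the field layout seen in this thread's log:
#   <epoch> <fsname> do_mount: rv <value>
# Reads the files given as arguments, or stdin when none are given.
failed_mounts() {
    awk '$3 == "do_mount:" && $4 == "rv" && $5 != 0 {
        printf "%s %s rv=%s\n", $1, $2, $5
    }' "$@"
}
```

Fed the log above, it should report only the 1193643331 entry, since the
first mount returned rv 0.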

Fabio

* [Cluster-devel] gfs_controld doesn't clean up tables properly
  2007-10-26  8:40 [Cluster-devel] gfs_controld doesn't clean up tables properly Fabio Massimo Di Nitto
  2007-10-29  7:42 ` Fabio Massimo Di Nitto
@ 2007-10-29 11:29 ` Fabio Massimo Di Nitto
  1 sibling, 0 replies; 3+ messages in thread
From: Fabio Massimo Di Nitto @ 2007-10-29 11:29 UTC (permalink / raw)
  To: cluster-devel.redhat.com

I found the problem to be a missing /sbin/umount.gfs* in the installed system.

Is there any way we can make the system provide a *slightly* better error?
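
Until something like that exists, a packaging or init script could at least
check for the helpers up front. A minimal sketch, assuming the usual
/sbin/mount.<type> and /sbin/umount.<type> naming; check_gfs_helpers is a
hypothetical name, not an existing tool:

```shell
#!/bin/sh
# Sketch only: warn when a mount.gfs* helper has no matching umount.gfs* helper.
# check_gfs_helpers is a hypothetical name; the directory defaults to /sbin.
# Returns 0 when every mount helper has an executable umount counterpart,
# 1 otherwise.
check_gfs_helpers() {
    dir=${1:-/sbin}
    missing=0
    for m in "$dir"/mount.gfs*; do
        [ -e "$m" ] || continue        # glob matched nothing
        u="$dir/u${m##*/}"             # mount.gfs -> umount.gfs
        if [ ! -x "$u" ]; then
            echo "missing helper: $u (needed to unmount cleanly)" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Running "check_gfs_helpers /sbin" before the first mount would have pointed
at the missing file directly instead of the "already used" message.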

Fabio

Fabio Massimo Di Nitto wrote:
> Hi guys,
> 
> I think we have seen a similar bug a while ago and I am afraid it's back again.
> 
> Extremely simple setup, 3 nodes cluster, no services or anything fancy. One
> shared block device (aoe + vblade) on 2.6.24-rc1.
> 
> node1# mkfs.gfs -j 6 -p lock_dlm -t gutsy:gfs /dev/etherd/e0.0
> 
> on all nodes:
> 
> mount -t gfs /dev/etherd/e0.0 /mnt/gfs
> umount /mnt/gfs
> mount -t gfs /dev/etherd/e0.0 /mnt/gfs
> /sbin/mount.gfs: mount point already used or other mount in progress
> /sbin/mount.gfs: error mounting lockproto lock_dlm
> 
> Oct 26 10:37:55 node3 gfs_controld[3466]: mount point /mnt/gfs already used
> 
> note that the first mount succeeded properly.
> 
> Attempting to mount to another mountpoint will stall:
> 
> root@node3:~# mount -t gfs /dev/etherd/e0.0 /mnt/gfs2
> [hang]
> there is no strace output.
> 
> Fabio
> 
> PS sorry i don't have enough time to look into this today in more details.
> 


