* [Cluster-devel] inconsistent dlm_new_lockspace LVB_LEN size from ocfs2 user-space tool and ocfs2 kernel module
@ 2016-05-13 8:36 Gang He
2016-05-13 16:07 ` David Teigland
0 siblings, 1 reply; 2+ messages in thread
From: Gang He @ 2016-05-13 8:36 UTC
To: cluster-devel.redhat.com
Hello Guys,
There is an inconsistency in the LVB_LEN size used when a new lockspace is created from a user-space tool (e.g. fsck.ocfs2) versus the kernel module (e.g. ocfs2/stack_user.c).
From the user-space tool, the LVB size is DLM_USER_LVB_LEN (32 bytes, defined in /include/linux/dlm_device.h).
From the kernel module, the LVB size is DLM_LVB_LEN (64 bytes).
Why was it designed this way? The GFS2 kernel module uses 32 bytes as its LVB_LEN size, the same as the DLM_USER_LVB_LEN macro definition.
We have now encountered a customer issue: the user ran fsck on an ocfs2 file system from one node, but it was aborted without releasing the lockspace (32 bytes), and then the user mounted the file system.
The kernel module used the existing lockspace of the same name instead of creating a new lockspace with a 64-byte LVB_LEN.
As a result, the user could no longer mount this file system from the other nodes.
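The sequence described above can be illustrated with a small user-space model. This is only a sketch under assumptions: `model_lockspace`, `model_new_lockspace` and `demo_stale_lockspace` are hypothetical names, not DLM APIs, and the find-or-create behaviour is inferred from this report (an existing lockspace with the same name is reused as-is). Only the two sizes, 32 and 64 bytes, are taken from the message.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Sizes as stated in the report; redefined here so the sketch is self-contained. */
#define DLM_USER_LVB_LEN 32   /* user-space interface (e.g. fsck.ocfs2) */
#define DLM_LVB_LEN      64   /* ocfs2 kernel module */

#define MODEL_MAX_LOCKSPACES 8

/* Hypothetical stand-in for a per-node lockspace record. */
struct model_lockspace {
    char name[64];
    int  lvblen;
    int  in_use;
};

static struct model_lockspace model_table[MODEL_MAX_LOCKSPACES];

/*
 * Find-or-create, as assumed from the report: if a lockspace with this
 * name already exists, it is returned unchanged and the requested
 * lvblen is ignored.
 */
static struct model_lockspace *model_new_lockspace(const char *name, int lvblen)
{
    struct model_lockspace *slot = NULL;

    assert(name != NULL);
    for (int i = 0; i < MODEL_MAX_LOCKSPACES; i++) {
        if (model_table[i].in_use && strcmp(model_table[i].name, name) == 0)
            return &model_table[i];          /* reuse: keeps the old lvblen */
        if (!model_table[i].in_use && slot == NULL)
            slot = &model_table[i];          /* remember the first free slot */
    }
    if (slot == NULL)
        return NULL;                         /* table full */

    snprintf(slot->name, sizeof(slot->name), "%s", name);
    slot->lvblen = lvblen;
    slot->in_use = 1;
    return slot;
}

/* Replays the reported sequence on one node; returns the lvblen the mount ends up with. */
static int demo_stale_lockspace(const char *name)
{
    struct model_lockspace *ls;

    /* 1. fsck creates the lockspace from user space and aborts without releasing it. */
    model_new_lockspace(name, DLM_USER_LVB_LEN);
    /* 2. The later mount asks for 64 bytes but gets the existing lockspace. */
    ls = model_new_lockspace(name, DLM_LVB_LEN);
    return ls ? ls->lvblen : -1;
}
```

In this model, `demo_stale_lockspace()` returns 32 on the node where fsck was aborted, while a node that creates the lockspace fresh through the kernel path gets 64.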
The error messages look like this:
Apr 26 16:29:16 mapkhpch1bl02 kernel: [ 3730.430947] dlm: 032F55597DEA4A61AB065568F964174D: config mismatch: 64,0 nodeid 177127961: 32,0
Apr 26 16:29:16 mapkhpch1bl02 kernel: [ 3730.433267] (mount.ocfs2,26981,46):ocfs2_dlm_init:2995 ERROR: status = -71
Apr 26 16:29:16 mapkhpch1bl02 kernel: [ 3730.433325] (mount.ocfs2,26981,46):ocfs2_mount_volume:1881 ERROR: status = -71
Apr 26 16:29:16 mapkhpch1bl02 kernel: [ 3730.433376] (mount.ocfs2,26981,46):ocfs2_fill_super:1236 ERROR: status = -71
Apr 26 16:29:16 mapkhpch1bl02 Filesystem(MITC_Pool1)[26912]: ERROR: Couldn't mount filesystem /dev/disk/by-id/scsi-3600507640081010d5000000000000082 on /MITC_Pool1
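The "config mismatch" line can be read as a comparison between the local lockspace settings (64,0) and the settings reported by the remote node (32,0). The following is a simplified user-space model of such a check, not the kernel code: `model_config` and `model_check_config` are hypothetical names, and treating the second field as a flags value is an assumption. On Linux, errno 71 is EPROTO, which is consistent with the `status = -71` lines above.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical, simplified view of the settings compared between nodes. */
struct model_config {
    int          lvblen;   /* LVB length in bytes */
    unsigned int exflags;  /* assumed: flags field, printed as the second number */
};

/*
 * Returns 0 when both nodes agree, or -EPROTO after printing a line in
 * the same shape as the log message when they do not.
 */
static int model_check_config(const struct model_config *local,
                              const struct model_config *remote,
                              int remote_nodeid)
{
    assert(local != NULL && remote != NULL);

    if (local->lvblen != remote->lvblen || local->exflags != remote->exflags) {
        fprintf(stderr, "config mismatch: %d,%x nodeid %d: %d,%x\n",
                local->lvblen, local->exflags, remote_nodeid,
                remote->lvblen, remote->exflags);
        return -EPROTO;
    }
    return 0;
}
```

With a local configuration of 64 bytes and a remote one of 32 bytes, the model returns -EPROTO; once both sides use the same length, it returns 0.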
Of course, the urgent fix is easy: we can reboot all the nodes and then mount the file system again.
But I want to know whether there was a reason for this design; if not, I would like to see whether we can use the same size in user space and in the kernel module.
Thanks
Gang
From: David Teigland @ 2016-05-13 16:07 UTC
To: cluster-devel.redhat.com
On Fri, May 13, 2016 at 02:36:25AM -0600, Gang He wrote:
> There is an inconsistency in the LVB_LEN size used when a new lockspace
> is created from a user-space tool (e.g. fsck.ocfs2) versus the kernel
> module (e.g. ocfs2/stack_user.c).
> From the user-space tool, the LVB size is DLM_USER_LVB_LEN (32 bytes,
> defined in /include/linux/dlm_device.h). From the kernel module, the LVB
> size is DLM_LVB_LEN (64 bytes).
Yes
> Why was it designed this way? The GFS2 kernel module uses 32 bytes as
> its LVB_LEN size, the same as the DLM_USER_LVB_LEN macro definition.
The lvb length was originally a constant 32 bytes, and was made variable
after the dlm user interface existed. The variable length lvb could not
be added to the existing user interface. (The dlm user interface is
terrible and a new version has been needed for many years, but it's not
used much, so it's not been worth the effort.)
> We have now encountered a customer issue: the user ran fsck on an ocfs2
> file system from one node, but it was aborted without releasing the
> lockspace (32 bytes), and then the user mounted the file system. The
> kernel module used the existing lockspace of the same name instead of
> creating a new lockspace with a 64-byte LVB_LEN. As a result, the user
> could no longer mount this file system from the other nodes.
> The error messages look like this:
> config mismatch: 64,0 nodeid 177127961: 32,0
> Of course, the urgent fix is easy: we can reboot all the nodes and then
> mount the file system again. But I want to know whether there was a
> reason for this design; if not, I would like to see whether we can use
> the same size in user space and in the kernel module.
Sorry, I think the only way around this is to ensure that lockspaces are
created from the kernel.
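One way to notice a lockspace left behind by a user-space tool before mounting is to look for it on the node. The sketch below rests on an assumption that is not stated in this thread: that the kernel DLM exposes one directory per lockspace under /sys/kernel/dlm. `lockspace_present` is a hypothetical helper, not part of any DLM library.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>

/* Assumed location where the kernel DLM exposes one directory per lockspace. */
#define DLM_SYSFS_DIR "/sys/kernel/dlm"

/*
 * Returns 1 if <sysfs_dir>/<name> is a directory, 0 if it is not,
 * and -1 if the combined path does not fit in the buffer.
 */
static int lockspace_present(const char *sysfs_dir, const char *name)
{
    char path[512];
    struct stat st;
    int n;

    assert(sysfs_dir != NULL && name != NULL);

    n = snprintf(path, sizeof(path), "%s/%s", sysfs_dir, name);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;
    if (stat(path, &st) != 0)
        return 0;                      /* no such entry */
    return S_ISDIR(st.st_mode) ? 1 : 0;
}
```

For the lockspace in the log above, the call would be `lockspace_present(DLM_SYSFS_DIR, "032F55597DEA4A61AB065568F964174D")`. A result of 1 on a node where the file system is not mounted would hint at a leftover lockspace, which should be released before the kernel creates its own.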
Dave