linux-raid.vger.kernel.org archive mirror
* [PATCHv2 dlm/next 00/12] dlm: net-namespace functionality
From: Alexander Aring @ 2024-09-30 20:13 UTC
  To: teigland
  Cc: gfs2, song, yukuai3, agruenba, mark, jlbec, joseph.qi, gregkh,
	rafael, akpm, linux-kernel, linux-raid, ocfs2-devel, netdev,
	vvidic, heming.zhao, lucien.xin, donald.hunter, aahringo

Hi,

this patch series is huge, but it brings a lot of basic "fun"
net-namespace functionality to DLM. Currently you need several Linux
kernel instances running, e.g. in virtual machines. With this series I
want to break out of that virtual machine world, where you deal with
multiple kernels and have to boot each of them individually. Now you can
use DLM in a single Linux kernel instance, and each "node" (previously
represented by a virtual machine) is separated by a net-namespace. Why
net-namespaces? They simply fit the DLM design for now: you need them
anyway because the internal DLM socket handling works on a per-node
basis. In addition, the DLM lockspaces (the lockspace that is being
registered) are separated by net-namespace, as each net-namespace
represents a "network entity" (node). There might be reasons to introduce
a completely new kind of namespace (a locking namespace?), but I don't
want to take that step now and, as I said, net-namespaces are required
anyway for the DLM sockets.
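
As a rough illustration of the "one net-namespace per node" layout, the
following minimal sketch only builds the iproute2 commands for such a
topology and does not execute anything. It is not the script from this
series. The namespace names, the veth names and the 10.0.0.0/16 subnet
are assumptions made up for the example; the bridge name "dlmsw" is the
one mentioned under Limitations, but how the real setup wires it up is
also an assumption.

```python
import ipaddress
import itertools


def plan_node_namespaces(count, bridge="dlmsw", subnet="10.0.0.0/16"):
    """Return iproute2 argv lists for `count` namespaces behind one bridge.

    Nothing is executed here; actually running the commands needs root.
    """
    net = ipaddress.ip_network(subnet)
    addrs = list(itertools.islice(net.hosts(), count))
    if len(addrs) < count:
        raise ValueError(f"{subnet} has too few host addresses for {count} nodes")

    cmds = [
        ["ip", "link", "add", bridge, "type", "bridge"],
        ["ip", "link", "set", bridge, "up"],
    ]
    for i, addr in enumerate(addrs, start=1):
        ns = f"node{i}"      # hypothetical namespace name
        outer = f"veth{i}"   # bridge-side end of the veth pair (hypothetical)
        inner = f"veth{i}p"  # namespace-side end of the veth pair (hypothetical)
        cmds += [
            ["ip", "netns", "add", ns],
            ["ip", "link", "add", outer, "type", "veth", "peer", "name", inner],
            ["ip", "link", "set", inner, "netns", ns],
            ["ip", "link", "set", outer, "master", bridge],
            ["ip", "link", "set", outer, "up"],
            ["ip", "-n", ns, "addr", "add", f"{addr}/{net.prefixlen}", "dev", inner],
            ["ip", "-n", ns, "link", "set", inner, "up"],
        ]
    return cmds


if __name__ == "__main__":
    for argv in plan_node_namespaces(3):
        print(" ".join(argv))
```

Returning argv lists instead of running them keeps the sketch safe to try
on any machine. Each list could be passed to subprocess.run() as root to
create the namespaces for real.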

You need some new user space tooling, as a new net-namespace aware
netlink UAPI is introduced (it can co-exist with configfs, which operates
on init_net only). See [0] for more steps. There is a copr repo for the
new tooling, which can be enabled with:

$ dnf copr enable aring/nldlm
$ dnf install nldlm

or compile it yourself.

There is currently a very simple script [1] that sets up a 3-node cluster
using gfs2 on multiple loop block devices backed by one shared loop block
device image (sounds weird, but that is what I do). There are currently
some user space synchronization issues that I work around with simple
sleeps, but they are purely user space problems.

To test it, I recommend using a virtual machine ("but only one") and
running the script [1]. Afterwards, the net-namespace you ran it in has
the 3 mountpoints /cluster/node1, /cluster/node2 and /cluster/node3. Any
vfs operation on one of those mountpoints acts as an operation performed
by that node.
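
A quick way to check that the mountpoints really behave like separate
nodes of one cluster filesystem is to write through one of them and read
back through the others. The sketch below is a minimal smoke test under
that assumption. The file name and payload are made up, and it is not
part of the series or of the script [1].

```python
import os
from pathlib import Path


def visible_on_all(mountpoints, name="smoke.txt", payload=b"hello from node1\n"):
    """Write via the first mountpoint, then read back via the others.

    Returns a dict mapping each reader mountpoint to True/False.
    """
    writer, *readers = [Path(m) for m in mountpoints]
    with open(writer / name, "wb") as fh:
        fh.write(payload)
        fh.flush()
        os.fsync(fh.fileno())  # flush to storage before reading elsewhere

    results = {}
    for mnt in readers:
        try:
            results[str(mnt)] = (mnt / name).read_bytes() == payload
        except FileNotFoundError:
            results[str(mnt)] = False
    return results


if __name__ == "__main__":
    mounts = ["/cluster/node1", "/cluster/node2", "/cluster/node3"]
    if all(os.path.ismount(m) for m in mounts):
        print(visible_on_all(mounts))
    else:
        print("cluster mountpoints not present, nothing to check")
```

On a working setup every reader should report True, since all three
mounts sit on the same shared image.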

We can use it for testing, development and also scale testing with a
large number of nodes joining a lockspace (which seems to be a problem
right now). Instead of running 1000 VMs, we can run 1000 net-namespaces
in a more resource-limited environment. To me it seems gfs2 can handle
several mounts and still keep the resources separate with respect to its
global variables. Its data structures, e.g. the glock hash, seem to have
a separation for that in their key (fsid?). However, this is still an
experimental feature, and we might run into issues that require more
separation related to net-namespaces. Basic testing seems to run just
fine, though.
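
To illustrate why a per-mount component in the hash key is enough to keep
several mounts apart inside one kernel, here is a minimal sketch of a
single global table with a composite key. It is not gfs2's actual glock
code. The field names and the mount_id string are hypothetical stand-ins
for whatever the real key uses, and the table only models the local
bookkeeping, not the cluster-wide DLM coordination.

```python
from typing import Dict, NamedTuple


class LockKey(NamedTuple):
    mount_id: str  # hypothetical per-mount identifier (the "fsid?" above)
    number: int    # e.g. a block or inode number
    kind: int      # lock type


class LockTable:
    """One table shared by every mount, similar in spirit to a global hash."""

    def __init__(self) -> None:
        self._holders: Dict[LockKey, str] = {}

    def acquire(self, key: LockKey, owner: str) -> bool:
        # setdefault() stores the owner only if the key is still free.
        return self._holders.setdefault(key, owner) == owner

    def release(self, key: LockKey, owner: str) -> None:
        if self._holders.get(key) == owner:
            del self._holders[key]


if __name__ == "__main__":
    table = LockTable()
    on_node1 = LockKey("node1-mount", 42, 1)
    on_node2 = LockKey("node2-mount", 42, 1)
    print(table.acquire(on_node1, "task-a"))  # same number, first mount
    print(table.acquire(on_node2, "task-b"))  # same number, other mount
    print(table.acquire(on_node1, "task-c"))  # conflicts within one mount
```

Entries with the same number and type only collide when they also belong
to the same mount. Without the per-mount component, the two "nodes" would
end up sharing one local entry.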

Limitations

I disabled all functionality of the DLM character device that allows
plock handling or DLM locking from user space. Just don't use any plock
locking in gfs2 for now. Basic vfs operations should work, though. You
can even sniff DLM traffic on the created "dlmsw" virtual bridge.

- Alex

[0] https://gitlab.com/netcoder/nldlm
[1] https://gitlab.com/netcoder/gfs2ns-examples/-/blob/main/three_nodes

changes in v2:
 - move to ynl and introduce and use netlink yaml spec
 - put the nldlm.h DLM netlink header under UAPI directory
 - fix build issues with CONFIG_NET disabled
 - fix possible NULL pointer dereference if lookup of lockspace failed

Alexander Aring (12):
  dlm: introduce dlm_find_lockspace_name()
  dlm: disallow different configs nodeid storages
  dlm: add struct net to dlm_new_lockspace()
  dlm: handle port as __be16 network byte order
  dlm: use dlm_config as only cluster configuration
  dlm: dlm_config_info config fields to unsigned int
  dlm: rename config to configfs
  kobject: add kset_type_create_and_add() helper
  kobject: export generic helper ops
  dlm: separate dlm lockspaces per net-namespace
  dlm: add nldlm net-namespace aware UAPI
  gfs2: separate mount context by net-namespaces

 Documentation/netlink/specs/nldlm.yaml |  438 ++++++++
 drivers/md/md-cluster.c                |    3 +-
 fs/dlm/Makefile                        |    3 +
 fs/dlm/config.c                        | 1291 +++++++++--------------
 fs/dlm/config.h                        |  215 +++-
 fs/dlm/configfs.c                      |  882 ++++++++++++++++
 fs/dlm/configfs.h                      |   19 +
 fs/dlm/debug_fs.c                      |   24 +-
 fs/dlm/dir.c                           |    4 +-
 fs/dlm/dlm_internal.h                  |   24 +-
 fs/dlm/lock.c                          |   64 +-
 fs/dlm/lock.h                          |    3 +-
 fs/dlm/lockspace.c                     |  220 ++--
 fs/dlm/lockspace.h                     |   12 +-
 fs/dlm/lowcomms.c                      |  525 +++++-----
 fs/dlm/lowcomms.h                      |   29 +-
 fs/dlm/main.c                          |    5 -
 fs/dlm/member.c                        |   36 +-
 fs/dlm/midcomms.c                      |  287 ++---
 fs/dlm/midcomms.h                      |   31 +-
 fs/dlm/netlink2.c                      | 1330 ++++++++++++++++++++++++
 fs/dlm/nldlm-kernel.c                  |  290 ++++++
 fs/dlm/nldlm-kernel.h                  |   50 +
 fs/dlm/nldlm.c                         |  847 +++++++++++++++
 fs/dlm/plock.c                         |    2 +-
 fs/dlm/rcom.c                          |   16 +-
 fs/dlm/rcom.h                          |    3 +-
 fs/dlm/recover.c                       |   17 +-
 fs/dlm/user.c                          |   63 +-
 fs/dlm/user.h                          |    2 +-
 fs/gfs2/glock.c                        |    8 +
 fs/gfs2/incore.h                       |    2 +
 fs/gfs2/lock_dlm.c                     |    6 +-
 fs/gfs2/ops_fstype.c                   |    5 +
 fs/gfs2/sys.c                          |   35 +-
 fs/ocfs2/stack_user.c                  |    2 +-
 include/linux/dlm.h                    |    9 +-
 include/linux/kobject.h                |   10 +-
 include/uapi/linux/nldlm.h             |  153 +++
 lib/kobject.c                          |   65 +-
 40 files changed, 5566 insertions(+), 1464 deletions(-)
 create mode 100644 Documentation/netlink/specs/nldlm.yaml
 create mode 100644 fs/dlm/configfs.c
 create mode 100644 fs/dlm/configfs.h
 create mode 100644 fs/dlm/netlink2.c
 create mode 100644 fs/dlm/nldlm-kernel.c
 create mode 100644 fs/dlm/nldlm-kernel.h
 create mode 100644 fs/dlm/nldlm.c
 create mode 100644 include/uapi/linux/nldlm.h

-- 
2.43.0

