netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next v4 0/3] Introduce IPPROTO_SMC
@ 2024-05-29  3:59 D. Wythe
  2024-05-29  3:59 ` [PATCH net-next v4 1/3] net/smc: refactoring initialization of smc sock D. Wythe
                   ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: D. Wythe @ 2024-05-29  3:59 UTC (permalink / raw)
  To: kgraul, wenjia, jaka, wintera, guwen
  Cc: kuba, davem, netdev, linux-s390, linux-rdma, tonylu, pabeni,
	edumazet

From: "D. Wythe" <alibuda@linux.alibaba.com>

This patch allows to create smc socket via AF_INET,
similar to the following code,

/* create v4 smc sock */
v4 = socket(AF_INET, SOCK_STREAM, IPPROTO_SMC);

/* create v6 smc sock */
v6 = socket(AF_INET6, SOCK_STREAM, IPPROTO_SMC);

There are several reasons why we believe it is appropriate here:

1. For smc sockets, it actually use IPv4 (AF-INET) or IPv6 (AF-INET6)
address. There is no AF_SMC address at all.

2. Create smc socket in the AF_INET(6) path, which allows us to reuse
the infrastructure of AF_INET(6) path, such as common ebpf hooks.
Otherwise, smc have to implement it again in AF_SMC path. Such as:
  1. Replace IPPROTO_TCP with IPPROTO_SMC in the socket() syscall
     initiated by the user, without the use of LD-PRELOAD.
  2. Select whether immediate fallback is required based on peer's port/ip
     before connect().

A very significant result is that we can now use eBPF to implement smc_run
instead of LD_PRELOAD, who is completely ineffective in scenarios of static
linking.

Another potential value is that we are attempting to optimize the
performance of fallback socks, where merging socks is an important part,
and it relies on the creation of SMC sockets under the AF_INET path. 
(More information :
https://lore.kernel.org/netdev/1699442703-25015-1-git-send-email-alibuda@linux.alibaba.com/T/)

v2 -> v1 :

- Code formatting, mainly including alignment and annotation repair.
- move inet_smc proto ops to inet_smc.c, avoiding af_smc.c becoming too bulky.
- Fix the issue where refactoring affects the initialization order.
- Fix compile warning (unused out_inet_prot) while CONFIG_IPV6 was not set.

v3 -> v2 :

- Add Alibaba's copyright information to the newfile

v4 -> v3 :

- Fix some spelling errors
- Align function naming style with smc_sock_init() to smc_sk_init()
- Reversing the order of the conditional checks on clcsock to make the code more intuitive

D. Wythe (3):
  net/smc: refactoring initialization of smc sock
  net/smc: expose smc proto operations
  net/smc: Introduce IPPROTO_SMC

 include/uapi/linux/in.h |   2 +
 net/smc/Makefile        |   2 +-
 net/smc/af_smc.c        | 182 ++++++++++++++++++++++++++++++------------------
 net/smc/inet_smc.c      | 108 ++++++++++++++++++++++++++++
 net/smc/inet_smc.h      |  34 +++++++++
 net/smc/smc.h           |  38 ++++++++++
 6 files changed, 297 insertions(+), 69 deletions(-)
 create mode 100644 net/smc/inet_smc.c
 create mode 100644 net/smc/inet_smc.h

-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH net-next v4 1/3] net/smc: refactoring initialization of smc sock
  2024-05-29  3:59 [PATCH net-next v4 0/3] Introduce IPPROTO_SMC D. Wythe
@ 2024-05-29  3:59 ` D. Wythe
  2024-05-29  6:14   ` Tony Lu
  2024-05-29  3:59 ` [PATCH net-next v4 2/3] net/smc: expose smc proto operations D. Wythe
  2024-05-29  3:59 ` [PATCH net-next v4 3/3] net/smc: Introduce IPPROTO_SMC D. Wythe
  2 siblings, 1 reply; 16+ messages in thread
From: D. Wythe @ 2024-05-29  3:59 UTC (permalink / raw)
  To: kgraul, wenjia, jaka, wintera, guwen
  Cc: kuba, davem, netdev, linux-s390, linux-rdma, tonylu, pabeni,
	edumazet

From: "D. Wythe" <alibuda@linux.alibaba.com>

This patch aims to isolate the shared components of SMC socket
allocation by introducing smc_sk_init() for sock initialization
and __smc_create_clcsk() for the initialization of clcsock.

This is in preparation for the subsequent implementation of the
AF_INET version of SMC.

Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
---
 net/smc/af_smc.c | 86 +++++++++++++++++++++++++++++++-------------------------
 net/smc/smc.h    |  5 ++++
 2 files changed, 53 insertions(+), 38 deletions(-)

diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
index e50a286..77a9d58 100644
--- a/net/smc/af_smc.c
+++ b/net/smc/af_smc.c
@@ -361,25 +361,15 @@ static void smc_destruct(struct sock *sk)
 		return;
 }
 
-static struct sock *smc_sock_alloc(struct net *net, struct socket *sock,
-				   int protocol)
+void smc_sk_init(struct net *net, struct sock *sk, int protocol)
 {
-	struct smc_sock *smc;
-	struct proto *prot;
-	struct sock *sk;
-
-	prot = (protocol == SMCPROTO_SMC6) ? &smc_proto6 : &smc_proto;
-	sk = sk_alloc(net, PF_SMC, GFP_KERNEL, prot, 0);
-	if (!sk)
-		return NULL;
+	struct smc_sock *smc = smc_sk(sk);
 
-	sock_init_data(sock, sk); /* sets sk_refcnt to 1 */
 	sk->sk_state = SMC_INIT;
 	sk->sk_destruct = smc_destruct;
 	sk->sk_protocol = protocol;
 	WRITE_ONCE(sk->sk_sndbuf, 2 * READ_ONCE(net->smc.sysctl_wmem));
 	WRITE_ONCE(sk->sk_rcvbuf, 2 * READ_ONCE(net->smc.sysctl_rmem));
-	smc = smc_sk(sk);
 	INIT_WORK(&smc->tcp_listen_work, smc_tcp_listen_work);
 	INIT_WORK(&smc->connect_work, smc_connect_work);
 	INIT_DELAYED_WORK(&smc->conn.tx_work, smc_tx_work);
@@ -389,6 +379,24 @@ static struct sock *smc_sock_alloc(struct net *net, struct socket *sock,
 	sk->sk_prot->hash(sk);
 	mutex_init(&smc->clcsock_release_lock);
 	smc_init_saved_callbacks(smc);
+	smc->limit_smc_hs = net->smc.limit_smc_hs;
+	smc->use_fallback = false; /* assume rdma capability first */
+	smc->fallback_rsn = 0;
+}
+
+static struct sock *smc_sock_alloc(struct net *net, struct socket *sock,
+				   int protocol)
+{
+	struct proto *prot;
+	struct sock *sk;
+
+	prot = (protocol == SMCPROTO_SMC6) ? &smc_proto6 : &smc_proto;
+	sk = sk_alloc(net, PF_SMC, GFP_KERNEL, prot, 0);
+	if (!sk)
+		return NULL;
+
+	sock_init_data(sock, sk); /* sets sk_refcnt to 1 */
+	smc_sk_init(net, sk, protocol);
 
 	return sk;
 }
@@ -3321,6 +3329,31 @@ static ssize_t smc_splice_read(struct socket *sock, loff_t *ppos,
 	.splice_read	= smc_splice_read,
 };
 
+int smc_create_clcsk(struct net *net, struct sock *sk, int family)
+{
+	struct smc_sock *smc = smc_sk(sk);
+	int rc;
+
+	rc = sock_create_kern(net, family, SOCK_STREAM, IPPROTO_TCP,
+			      &smc->clcsock);
+	if (rc) {
+		sk_common_release(sk);
+		return rc;
+	}
+
+	/* smc_clcsock_release() does not wait smc->clcsock->sk's
+	 * destruction;  its sk_state might not be TCP_CLOSE after
+	 * smc->sk is close()d, and TCP timers can be fired later,
+	 * which need net ref.
+	 */
+	sk = smc->clcsock->sk;
+	__netns_tracker_free(net, &sk->ns_tracker, false);
+	sk->sk_net_refcnt = 1;
+	get_net_track(net, &sk->ns_tracker, GFP_KERNEL);
+	sock_inuse_add(net, 1);
+	return 0;
+}
+
 static int __smc_create(struct net *net, struct socket *sock, int protocol,
 			int kern, struct socket *clcsock)
 {
@@ -3346,35 +3379,12 @@ static int __smc_create(struct net *net, struct socket *sock, int protocol,
 
 	/* create internal TCP socket for CLC handshake and fallback */
 	smc = smc_sk(sk);
-	smc->use_fallback = false; /* assume rdma capability first */
-	smc->fallback_rsn = 0;
-
-	/* default behavior from limit_smc_hs in every net namespace */
-	smc->limit_smc_hs = net->smc.limit_smc_hs;
 
 	rc = 0;
-	if (!clcsock) {
-		rc = sock_create_kern(net, family, SOCK_STREAM, IPPROTO_TCP,
-				      &smc->clcsock);
-		if (rc) {
-			sk_common_release(sk);
-			goto out;
-		}
-
-		/* smc_clcsock_release() does not wait smc->clcsock->sk's
-		 * destruction;  its sk_state might not be TCP_CLOSE after
-		 * smc->sk is close()d, and TCP timers can be fired later,
-		 * which need net ref.
-		 */
-		sk = smc->clcsock->sk;
-		__netns_tracker_free(net, &sk->ns_tracker, false);
-		sk->sk_net_refcnt = 1;
-		get_net_track(net, &sk->ns_tracker, GFP_KERNEL);
-		sock_inuse_add(net, 1);
-	} else {
+	if (clcsock)
 		smc->clcsock = clcsock;
-	}
-
+	else
+		rc = smc_create_clcsk(net, sk, family);
 out:
 	return rc;
 }
diff --git a/net/smc/smc.h b/net/smc/smc.h
index 18c8b78..3edec1e 100644
--- a/net/smc/smc.h
+++ b/net/smc/smc.h
@@ -34,6 +34,11 @@
 extern struct proto smc_proto;
 extern struct proto smc_proto6;
 
+/* smc sock initialization */
+void smc_sk_init(struct net *net, struct sock *sk, int protocol);
+/* clcsock initialization */
+int smc_create_clcsk(struct net *net, struct sock *sk, int family);
+
 #ifdef ATOMIC64_INIT
 #define KERNEL_HAS_ATOMIC64
 #endif
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH net-next v4 2/3] net/smc: expose smc proto operations
  2024-05-29  3:59 [PATCH net-next v4 0/3] Introduce IPPROTO_SMC D. Wythe
  2024-05-29  3:59 ` [PATCH net-next v4 1/3] net/smc: refactoring initialization of smc sock D. Wythe
@ 2024-05-29  3:59 ` D. Wythe
  2024-05-29 17:57   ` Zhu Yanjun
  2024-05-29  3:59 ` [PATCH net-next v4 3/3] net/smc: Introduce IPPROTO_SMC D. Wythe
  2 siblings, 1 reply; 16+ messages in thread
From: D. Wythe @ 2024-05-29  3:59 UTC (permalink / raw)
  To: kgraul, wenjia, jaka, wintera, guwen
  Cc: kuba, davem, netdev, linux-s390, linux-rdma, tonylu, pabeni,
	edumazet

From: "D. Wythe" <alibuda@linux.alibaba.com>

Externalize smc proto operations (smc_xxx) to allow
access from files other that af_smc.c

This is in preparation for the subsequent implementation
of the AF_INET version of SMC.

Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
---
 net/smc/af_smc.c | 60 ++++++++++++++++++++++++++++----------------------------
 net/smc/smc.h    | 33 +++++++++++++++++++++++++++++++
 2 files changed, 63 insertions(+), 30 deletions(-)

diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
index 77a9d58..8e3ce76 100644
--- a/net/smc/af_smc.c
+++ b/net/smc/af_smc.c
@@ -170,15 +170,15 @@ static bool smc_hs_congested(const struct sock *sk)
 	return false;
 }
 
-static struct smc_hashinfo smc_v4_hashinfo = {
+struct smc_hashinfo smc_v4_hashinfo = {
 	.lock = __RW_LOCK_UNLOCKED(smc_v4_hashinfo.lock),
 };
 
-static struct smc_hashinfo smc_v6_hashinfo = {
+struct smc_hashinfo smc_v6_hashinfo = {
 	.lock = __RW_LOCK_UNLOCKED(smc_v6_hashinfo.lock),
 };
 
-static int smc_hash_sk(struct sock *sk)
+int smc_hash_sk(struct sock *sk)
 {
 	struct smc_hashinfo *h = sk->sk_prot->h.smc_hash;
 	struct hlist_head *head;
@@ -193,7 +193,7 @@ static int smc_hash_sk(struct sock *sk)
 	return 0;
 }
 
-static void smc_unhash_sk(struct sock *sk)
+void smc_unhash_sk(struct sock *sk)
 {
 	struct smc_hashinfo *h = sk->sk_prot->h.smc_hash;
 
@@ -207,7 +207,7 @@ static void smc_unhash_sk(struct sock *sk)
  * work which we didn't do because of user hold the sock_lock in the
  * BH context
  */
-static void smc_release_cb(struct sock *sk)
+void smc_release_cb(struct sock *sk)
 {
 	struct smc_sock *smc = smc_sk(sk);
 
@@ -307,7 +307,7 @@ static int __smc_release(struct smc_sock *smc)
 	return rc;
 }
 
-static int smc_release(struct socket *sock)
+int smc_release(struct socket *sock)
 {
 	struct sock *sk = sock->sk;
 	struct smc_sock *smc;
@@ -401,8 +401,8 @@ static struct sock *smc_sock_alloc(struct net *net, struct socket *sock,
 	return sk;
 }
 
-static int smc_bind(struct socket *sock, struct sockaddr *uaddr,
-		    int addr_len)
+int smc_bind(struct socket *sock, struct sockaddr *uaddr,
+	     int addr_len)
 {
 	struct sockaddr_in *addr = (struct sockaddr_in *)uaddr;
 	struct sock *sk = sock->sk;
@@ -1649,8 +1649,8 @@ static void smc_connect_work(struct work_struct *work)
 	release_sock(&smc->sk);
 }
 
-static int smc_connect(struct socket *sock, struct sockaddr *addr,
-		       int alen, int flags)
+int smc_connect(struct socket *sock, struct sockaddr *addr,
+		int alen, int flags)
 {
 	struct sock *sk = sock->sk;
 	struct smc_sock *smc;
@@ -2631,7 +2631,7 @@ static void smc_clcsock_data_ready(struct sock *listen_clcsock)
 	read_unlock_bh(&listen_clcsock->sk_callback_lock);
 }
 
-static int smc_listen(struct socket *sock, int backlog)
+int smc_listen(struct socket *sock, int backlog)
 {
 	struct sock *sk = sock->sk;
 	struct smc_sock *smc;
@@ -2696,8 +2696,8 @@ static int smc_listen(struct socket *sock, int backlog)
 	return rc;
 }
 
-static int smc_accept(struct socket *sock, struct socket *new_sock,
-		      struct proto_accept_arg *arg)
+int smc_accept(struct socket *sock, struct socket *new_sock,
+	       struct proto_accept_arg *arg)
 {
 	struct sock *sk = sock->sk, *nsk;
 	DECLARE_WAITQUEUE(wait, current);
@@ -2766,8 +2766,8 @@ static int smc_accept(struct socket *sock, struct socket *new_sock,
 	return rc;
 }
 
-static int smc_getname(struct socket *sock, struct sockaddr *addr,
-		       int peer)
+int smc_getname(struct socket *sock, struct sockaddr *addr,
+		int peer)
 {
 	struct smc_sock *smc;
 
@@ -2780,7 +2780,7 @@ static int smc_getname(struct socket *sock, struct sockaddr *addr,
 	return smc->clcsock->ops->getname(smc->clcsock, addr, peer);
 }
 
-static int smc_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
+int smc_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
 {
 	struct sock *sk = sock->sk;
 	struct smc_sock *smc;
@@ -2818,8 +2818,8 @@ static int smc_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
 	return rc;
 }
 
-static int smc_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
-		       int flags)
+int smc_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
+		int flags)
 {
 	struct sock *sk = sock->sk;
 	struct smc_sock *smc;
@@ -2868,8 +2868,8 @@ static __poll_t smc_accept_poll(struct sock *parent)
 	return mask;
 }
 
-static __poll_t smc_poll(struct file *file, struct socket *sock,
-			     poll_table *wait)
+__poll_t smc_poll(struct file *file, struct socket *sock,
+		  poll_table *wait)
 {
 	struct sock *sk = sock->sk;
 	struct smc_sock *smc;
@@ -2921,7 +2921,7 @@ static __poll_t smc_poll(struct file *file, struct socket *sock,
 	return mask;
 }
 
-static int smc_shutdown(struct socket *sock, int how)
+int smc_shutdown(struct socket *sock, int how)
 {
 	struct sock *sk = sock->sk;
 	bool do_shutdown = true;
@@ -3061,8 +3061,8 @@ static int __smc_setsockopt(struct socket *sock, int level, int optname,
 	return rc;
 }
 
-static int smc_setsockopt(struct socket *sock, int level, int optname,
-			  sockptr_t optval, unsigned int optlen)
+int smc_setsockopt(struct socket *sock, int level, int optname,
+		   sockptr_t optval, unsigned int optlen)
 {
 	struct sock *sk = sock->sk;
 	struct smc_sock *smc;
@@ -3148,8 +3148,8 @@ static int smc_setsockopt(struct socket *sock, int level, int optname,
 	return rc;
 }
 
-static int smc_getsockopt(struct socket *sock, int level, int optname,
-			  char __user *optval, int __user *optlen)
+int smc_getsockopt(struct socket *sock, int level, int optname,
+		   char __user *optval, int __user *optlen)
 {
 	struct smc_sock *smc;
 	int rc;
@@ -3174,8 +3174,8 @@ static int smc_getsockopt(struct socket *sock, int level, int optname,
 	return rc;
 }
 
-static int smc_ioctl(struct socket *sock, unsigned int cmd,
-		     unsigned long arg)
+int smc_ioctl(struct socket *sock, unsigned int cmd,
+	      unsigned long arg)
 {
 	union smc_host_cursor cons, urg;
 	struct smc_connection *conn;
@@ -3261,9 +3261,9 @@ static int smc_ioctl(struct socket *sock, unsigned int cmd,
  * Note that subsequent recv() calls have to wait till all splice() processing
  * completed.
  */
-static ssize_t smc_splice_read(struct socket *sock, loff_t *ppos,
-			       struct pipe_inode_info *pipe, size_t len,
-			       unsigned int flags)
+ssize_t smc_splice_read(struct socket *sock, loff_t *ppos,
+			struct pipe_inode_info *pipe, size_t len,
+			unsigned int flags)
 {
 	struct sock *sk = sock->sk;
 	struct smc_sock *smc;
diff --git a/net/smc/smc.h b/net/smc/smc.h
index 3edec1e..34b781e 100644
--- a/net/smc/smc.h
+++ b/net/smc/smc.h
@@ -34,6 +34,39 @@
 extern struct proto smc_proto;
 extern struct proto smc_proto6;
 
+extern struct smc_hashinfo smc_v4_hashinfo;
+extern struct smc_hashinfo smc_v6_hashinfo;
+
+int smc_hash_sk(struct sock *sk);
+void smc_unhash_sk(struct sock *sk);
+void smc_release_cb(struct sock *sk);
+
+int smc_release(struct socket *sock);
+int smc_bind(struct socket *sock, struct sockaddr *uaddr,
+	     int addr_len);
+int smc_connect(struct socket *sock, struct sockaddr *addr,
+		int alen, int flags);
+int smc_accept(struct socket *sock, struct socket *new_sock,
+	       struct proto_accept_arg *arg);
+int smc_getname(struct socket *sock, struct sockaddr *addr,
+		int peer);
+__poll_t smc_poll(struct file *file, struct socket *sock,
+		  poll_table *wait);
+int smc_ioctl(struct socket *sock, unsigned int cmd,
+	      unsigned long arg);
+int smc_listen(struct socket *sock, int backlog);
+int smc_shutdown(struct socket *sock, int how);
+int smc_setsockopt(struct socket *sock, int level, int optname,
+		   sockptr_t optval, unsigned int optlen);
+int smc_getsockopt(struct socket *sock, int level, int optname,
+		   char __user *optval, int __user *optlen);
+int smc_sendmsg(struct socket *sock, struct msghdr *msg, size_t len);
+int smc_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
+		int flags);
+ssize_t smc_splice_read(struct socket *sock, loff_t *ppos,
+			struct pipe_inode_info *pipe, size_t len,
+			unsigned int flags);
+
 /* smc sock initialization */
 void smc_sk_init(struct net *net, struct sock *sk, int protocol);
 /* clcsock initialization */
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH net-next v4 3/3] net/smc: Introduce IPPROTO_SMC
  2024-05-29  3:59 [PATCH net-next v4 0/3] Introduce IPPROTO_SMC D. Wythe
  2024-05-29  3:59 ` [PATCH net-next v4 1/3] net/smc: refactoring initialization of smc sock D. Wythe
  2024-05-29  3:59 ` [PATCH net-next v4 2/3] net/smc: expose smc proto operations D. Wythe
@ 2024-05-29  3:59 ` D. Wythe
  2024-05-29 11:12   ` Dust Li
                     ` (3 more replies)
  2 siblings, 4 replies; 16+ messages in thread
From: D. Wythe @ 2024-05-29  3:59 UTC (permalink / raw)
  To: kgraul, wenjia, jaka, wintera, guwen
  Cc: kuba, davem, netdev, linux-s390, linux-rdma, tonylu, pabeni,
	edumazet

From: "D. Wythe" <alibuda@linux.alibaba.com>

This patch allows to create smc socket via AF_INET,
similar to the following code,

/* create v4 smc sock */
v4 = socket(AF_INET, SOCK_STREAM, IPPROTO_SMC);

/* create v6 smc sock */
v6 = socket(AF_INET6, SOCK_STREAM, IPPROTO_SMC);

There are several reasons why we believe it is appropriate here:

1. For smc sockets, it actually use IPv4 (AF-INET) or IPv6 (AF-INET6)
address. There is no AF_SMC address at all.

2. Create smc socket in the AF_INET(6) path, which allows us to reuse
the infrastructure of AF_INET(6) path, such as common ebpf hooks.
Otherwise, smc have to implement it again in AF_SMC path.

Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
---
 include/uapi/linux/in.h |   2 +
 net/smc/Makefile        |   2 +-
 net/smc/af_smc.c        |  36 ++++++++++++++++
 net/smc/inet_smc.c      | 108 ++++++++++++++++++++++++++++++++++++++++++++++++
 net/smc/inet_smc.h      |  34 +++++++++++++++
 5 files changed, 181 insertions(+), 1 deletion(-)
 create mode 100644 net/smc/inet_smc.c
 create mode 100644 net/smc/inet_smc.h

diff --git a/include/uapi/linux/in.h b/include/uapi/linux/in.h
index e682ab6..0c6322b 100644
--- a/include/uapi/linux/in.h
+++ b/include/uapi/linux/in.h
@@ -83,6 +83,8 @@ enum {
 #define IPPROTO_RAW		IPPROTO_RAW
   IPPROTO_MPTCP = 262,		/* Multipath TCP connection		*/
 #define IPPROTO_MPTCP		IPPROTO_MPTCP
+  IPPROTO_SMC = 263,		/* Shared Memory Communications		*/
+#define IPPROTO_SMC		IPPROTO_SMC
   IPPROTO_MAX
 };
 #endif
diff --git a/net/smc/Makefile b/net/smc/Makefile
index 2c510d54..472b9ee 100644
--- a/net/smc/Makefile
+++ b/net/smc/Makefile
@@ -4,6 +4,6 @@ obj-$(CONFIG_SMC)	+= smc.o
 obj-$(CONFIG_SMC_DIAG)	+= smc_diag.o
 smc-y := af_smc.o smc_pnet.o smc_ib.o smc_clc.o smc_core.o smc_wr.o smc_llc.o
 smc-y += smc_cdc.o smc_tx.o smc_rx.o smc_close.o smc_ism.o smc_netlink.o smc_stats.o
-smc-y += smc_tracepoint.o
+smc-y += smc_tracepoint.o inet_smc.o
 smc-$(CONFIG_SYSCTL) += smc_sysctl.o
 smc-$(CONFIG_SMC_LO) += smc_loopback.o
diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
index 8e3ce76..320624c 100644
--- a/net/smc/af_smc.c
+++ b/net/smc/af_smc.c
@@ -54,6 +54,7 @@
 #include "smc_tracepoint.h"
 #include "smc_sysctl.h"
 #include "smc_loopback.h"
+#include "inet_smc.h"
 
 static DEFINE_MUTEX(smc_server_lgr_pending);	/* serialize link group
 						 * creation on server
@@ -3594,9 +3595,31 @@ static int __init smc_init(void)
 		goto out_lo;
 	}
 
+	rc = proto_register(&smc_inet_prot, 1);
+	if (rc) {
+		pr_err("%s: proto_register smc_inet_prot fails with %d\n", __func__, rc);
+		goto out_ulp;
+	}
+	inet_register_protosw(&smc_inet_protosw);
+#if IS_ENABLED(CONFIG_IPV6)
+	rc = proto_register(&smc_inet6_prot, 1);
+	if (rc) {
+		pr_err("%s: proto_register smc_inet6_prot fails with %d\n", __func__, rc);
+		goto out_inet_prot;
+	}
+	inet6_register_protosw(&smc_inet6_protosw);
+#endif
+
 	static_branch_enable(&tcp_have_smc);
 	return 0;
 
+#if IS_ENABLED(CONFIG_IPV6)
+out_inet_prot:
+	inet_unregister_protosw(&smc_inet_protosw);
+	proto_unregister(&smc_inet_prot);
+#endif
+out_ulp:
+	tcp_unregister_ulp(&smc_ulp_ops);
 out_lo:
 	smc_loopback_exit();
 out_ib:
@@ -3633,6 +3656,10 @@ static int __init smc_init(void)
 static void __exit smc_exit(void)
 {
 	static_branch_disable(&tcp_have_smc);
+	inet_unregister_protosw(&smc_inet_protosw);
+#if IS_ENABLED(CONFIG_IPV6)
+	inet6_unregister_protosw(&smc_inet6_protosw);
+#endif
 	tcp_unregister_ulp(&smc_ulp_ops);
 	sock_unregister(PF_SMC);
 	smc_core_exit();
@@ -3644,6 +3671,10 @@ static void __exit smc_exit(void)
 	destroy_workqueue(smc_hs_wq);
 	proto_unregister(&smc_proto6);
 	proto_unregister(&smc_proto);
+	proto_unregister(&smc_inet_prot);
+#if IS_ENABLED(CONFIG_IPV6)
+	proto_unregister(&smc_inet6_prot);
+#endif
 	smc_pnet_exit();
 	smc_nl_exit();
 	smc_clc_exit();
@@ -3660,4 +3691,9 @@ static void __exit smc_exit(void)
 MODULE_LICENSE("GPL");
 MODULE_ALIAS_NETPROTO(PF_SMC);
 MODULE_ALIAS_TCP_ULP("smc");
+/* 263 for IPPROTO_SMC and 1 for SOCK_STREAM */
+MODULE_ALIAS_NET_PF_PROTO_TYPE(PF_INET, 263, 1);
+#if IS_ENABLED(CONFIG_IPV6)
+MODULE_ALIAS_NET_PF_PROTO_TYPE(PF_INET6, 263, 1);
+#endif
 MODULE_ALIAS_GENL_FAMILY(SMC_GENL_FAMILY_NAME);
diff --git a/net/smc/inet_smc.c b/net/smc/inet_smc.c
new file mode 100644
index 00000000..1ba73d7
--- /dev/null
+++ b/net/smc/inet_smc.c
@@ -0,0 +1,108 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ *  Shared Memory Communications over RDMA (SMC-R) and RoCE
+ *
+ *  Definitions for the IPPROTO_SMC (socket related)
+ *
+ *  Copyright IBM Corp. 2016, 2018
+ *  Copyright (c) 2024, Alibaba Inc.
+ *
+ *  Author: D. Wythe <alibuda@linux.alibaba.com>
+ */
+
+#include "inet_smc.h"
+#include "smc.h"
+
+struct proto smc_inet_prot = {
+	.name		= "INET_SMC",
+	.owner		= THIS_MODULE,
+	.init		= smc_inet_init_sock,
+	.hash		= smc_hash_sk,
+	.unhash		= smc_unhash_sk,
+	.release_cb	= smc_release_cb,
+	.obj_size	= sizeof(struct smc_sock),
+	.h.smc_hash	= &smc_v4_hashinfo,
+	.slab_flags	= SLAB_TYPESAFE_BY_RCU,
+};
+
+const struct proto_ops smc_inet_stream_ops = {
+	.family		= PF_INET,
+	.owner		= THIS_MODULE,
+	.release	= smc_release,
+	.bind		= smc_bind,
+	.connect	= smc_connect,
+	.socketpair	= sock_no_socketpair,
+	.accept		= smc_accept,
+	.getname	= smc_getname,
+	.poll		= smc_poll,
+	.ioctl		= smc_ioctl,
+	.listen		= smc_listen,
+	.shutdown	= smc_shutdown,
+	.setsockopt	= smc_setsockopt,
+	.getsockopt	= smc_getsockopt,
+	.sendmsg	= smc_sendmsg,
+	.recvmsg	= smc_recvmsg,
+	.mmap		= sock_no_mmap,
+	.splice_read	= smc_splice_read,
+};
+
+struct inet_protosw smc_inet_protosw = {
+	.type		= SOCK_STREAM,
+	.protocol	= IPPROTO_SMC,
+	.prot		= &smc_inet_prot,
+	.ops		= &smc_inet_stream_ops,
+	.flags		= INET_PROTOSW_ICSK,
+};
+
+#if IS_ENABLED(CONFIG_IPV6)
+struct proto smc_inet6_prot = {
+	.name		= "INET6_SMC",
+	.owner		= THIS_MODULE,
+	.init		= smc_inet_init_sock,
+	.hash		= smc_hash_sk,
+	.unhash		= smc_unhash_sk,
+	.release_cb	= smc_release_cb,
+	.obj_size	= sizeof(struct smc_sock),
+	.h.smc_hash	= &smc_v6_hashinfo,
+	.slab_flags	= SLAB_TYPESAFE_BY_RCU,
+};
+
+const struct proto_ops smc_inet6_stream_ops = {
+	.family		= PF_INET6,
+	.owner		= THIS_MODULE,
+	.release	= smc_release,
+	.bind		= smc_bind,
+	.connect	= smc_connect,
+	.socketpair	= sock_no_socketpair,
+	.accept		= smc_accept,
+	.getname	= smc_getname,
+	.poll		= smc_poll,
+	.ioctl		= smc_ioctl,
+	.listen		= smc_listen,
+	.shutdown	= smc_shutdown,
+	.setsockopt	= smc_setsockopt,
+	.getsockopt	= smc_getsockopt,
+	.sendmsg	= smc_sendmsg,
+	.recvmsg	= smc_recvmsg,
+	.mmap		= sock_no_mmap,
+	.splice_read	= smc_splice_read,
+};
+
+struct inet_protosw smc_inet6_protosw = {
+	.type		= SOCK_STREAM,
+	.protocol	= IPPROTO_SMC,
+	.prot		= &smc_inet6_prot,
+	.ops		= &smc_inet6_stream_ops,
+	.flags		= INET_PROTOSW_ICSK,
+};
+#endif
+
+int smc_inet_init_sock(struct sock *sk)
+{
+	struct net *net = sock_net(sk);
+
+	/* init common smc sock */
+	smc_sk_init(net, sk, IPPROTO_SMC);
+	/* create clcsock */
+	return smc_create_clcsk(net, sk, sk->sk_family);
+}
diff --git a/net/smc/inet_smc.h b/net/smc/inet_smc.h
new file mode 100644
index 00000000..c55345d
--- /dev/null
+++ b/net/smc/inet_smc.h
@@ -0,0 +1,34 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ *  Shared Memory Communications over RDMA (SMC-R) and RoCE
+ *
+ *  Definitions for the IPPROTO_SMC (socket related)
+
+ *  Copyright IBM Corp. 2016
+ *  Copyright (c) 2024, Alibaba Inc.
+ *
+ *  Author: D. Wythe <alibuda@linux.alibaba.com>
+ */
+#ifndef __INET_SMC
+#define __INET_SMC
+
+#include <net/protocol.h>
+#include <net/sock.h>
+#include <net/tcp.h>
+
+extern struct proto smc_inet_prot;
+extern const struct proto_ops smc_inet_stream_ops;
+extern struct inet_protosw smc_inet_protosw;
+
+#if IS_ENABLED(CONFIG_IPV6)
+#include <net/ipv6.h>
+/* MUST after net/tcp.h or warning */
+#include <net/transp_v6.h>
+extern struct proto smc_inet6_prot;
+extern const struct proto_ops smc_inet6_stream_ops;
+extern struct inet_protosw smc_inet6_protosw;
+#endif
+
+int smc_inet_init_sock(struct sock *sk);
+
+#endif /* __INET_SMC */
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next v4 1/3] net/smc: refactoring initialization of smc sock
  2024-05-29  3:59 ` [PATCH net-next v4 1/3] net/smc: refactoring initialization of smc sock D. Wythe
@ 2024-05-29  6:14   ` Tony Lu
  0 siblings, 0 replies; 16+ messages in thread
From: Tony Lu @ 2024-05-29  6:14 UTC (permalink / raw)
  To: D. Wythe
  Cc: kgraul, wenjia, jaka, wintera, guwen, kuba, davem, netdev,
	linux-s390, linux-rdma, pabeni, edumazet

On Wed, May 29, 2024 at 11:59:05AM +0800, D. Wythe wrote:
> From: "D. Wythe" <alibuda@linux.alibaba.com>
> 
> This patch aims to isolate the shared components of SMC socket
> allocation by introducing smc_sk_init() for sock initialization
> and __smc_create_clcsk() for the initialization of clcsock.
> 
> This is in preparation for the subsequent implementation of the
> AF_INET version of SMC.
> 
> Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>

v4 looks good for me, thanks.

Reviewed-by: Tony Lu <tonylu@linux.alibaba.com>

> ---
>  net/smc/af_smc.c | 86 +++++++++++++++++++++++++++++++-------------------------
>  net/smc/smc.h    |  5 ++++
>  2 files changed, 53 insertions(+), 38 deletions(-)
> 
> diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
> index e50a286..77a9d58 100644
> --- a/net/smc/af_smc.c
> +++ b/net/smc/af_smc.c
> @@ -361,25 +361,15 @@ static void smc_destruct(struct sock *sk)
>  		return;
>  }
>  
> -static struct sock *smc_sock_alloc(struct net *net, struct socket *sock,
> -				   int protocol)
> +void smc_sk_init(struct net *net, struct sock *sk, int protocol)
>  {
> -	struct smc_sock *smc;
> -	struct proto *prot;
> -	struct sock *sk;
> -
> -	prot = (protocol == SMCPROTO_SMC6) ? &smc_proto6 : &smc_proto;
> -	sk = sk_alloc(net, PF_SMC, GFP_KERNEL, prot, 0);
> -	if (!sk)
> -		return NULL;
> +	struct smc_sock *smc = smc_sk(sk);
>  
> -	sock_init_data(sock, sk); /* sets sk_refcnt to 1 */
>  	sk->sk_state = SMC_INIT;
>  	sk->sk_destruct = smc_destruct;
>  	sk->sk_protocol = protocol;
>  	WRITE_ONCE(sk->sk_sndbuf, 2 * READ_ONCE(net->smc.sysctl_wmem));
>  	WRITE_ONCE(sk->sk_rcvbuf, 2 * READ_ONCE(net->smc.sysctl_rmem));
> -	smc = smc_sk(sk);
>  	INIT_WORK(&smc->tcp_listen_work, smc_tcp_listen_work);
>  	INIT_WORK(&smc->connect_work, smc_connect_work);
>  	INIT_DELAYED_WORK(&smc->conn.tx_work, smc_tx_work);
> @@ -389,6 +379,24 @@ static struct sock *smc_sock_alloc(struct net *net, struct socket *sock,
>  	sk->sk_prot->hash(sk);
>  	mutex_init(&smc->clcsock_release_lock);
>  	smc_init_saved_callbacks(smc);
> +	smc->limit_smc_hs = net->smc.limit_smc_hs;
> +	smc->use_fallback = false; /* assume rdma capability first */
> +	smc->fallback_rsn = 0;
> +}
> +
> +static struct sock *smc_sock_alloc(struct net *net, struct socket *sock,
> +				   int protocol)
> +{
> +	struct proto *prot;
> +	struct sock *sk;
> +
> +	prot = (protocol == SMCPROTO_SMC6) ? &smc_proto6 : &smc_proto;
> +	sk = sk_alloc(net, PF_SMC, GFP_KERNEL, prot, 0);
> +	if (!sk)
> +		return NULL;
> +
> +	sock_init_data(sock, sk); /* sets sk_refcnt to 1 */
> +	smc_sk_init(net, sk, protocol);
>  
>  	return sk;
>  }
> @@ -3321,6 +3329,31 @@ static ssize_t smc_splice_read(struct socket *sock, loff_t *ppos,
>  	.splice_read	= smc_splice_read,
>  };
>  
> +int smc_create_clcsk(struct net *net, struct sock *sk, int family)
> +{
> +	struct smc_sock *smc = smc_sk(sk);
> +	int rc;
> +
> +	rc = sock_create_kern(net, family, SOCK_STREAM, IPPROTO_TCP,
> +			      &smc->clcsock);
> +	if (rc) {
> +		sk_common_release(sk);
> +		return rc;
> +	}
> +
> +	/* smc_clcsock_release() does not wait smc->clcsock->sk's
> +	 * destruction;  its sk_state might not be TCP_CLOSE after
> +	 * smc->sk is close()d, and TCP timers can be fired later,
> +	 * which need net ref.
> +	 */
> +	sk = smc->clcsock->sk;
> +	__netns_tracker_free(net, &sk->ns_tracker, false);
> +	sk->sk_net_refcnt = 1;
> +	get_net_track(net, &sk->ns_tracker, GFP_KERNEL);
> +	sock_inuse_add(net, 1);
> +	return 0;
> +}
> +
>  static int __smc_create(struct net *net, struct socket *sock, int protocol,
>  			int kern, struct socket *clcsock)
>  {
> @@ -3346,35 +3379,12 @@ static int __smc_create(struct net *net, struct socket *sock, int protocol,
>  
>  	/* create internal TCP socket for CLC handshake and fallback */
>  	smc = smc_sk(sk);
> -	smc->use_fallback = false; /* assume rdma capability first */
> -	smc->fallback_rsn = 0;
> -
> -	/* default behavior from limit_smc_hs in every net namespace */
> -	smc->limit_smc_hs = net->smc.limit_smc_hs;
>  
>  	rc = 0;
> -	if (!clcsock) {
> -		rc = sock_create_kern(net, family, SOCK_STREAM, IPPROTO_TCP,
> -				      &smc->clcsock);
> -		if (rc) {
> -			sk_common_release(sk);
> -			goto out;
> -		}
> -
> -		/* smc_clcsock_release() does not wait smc->clcsock->sk's
> -		 * destruction;  its sk_state might not be TCP_CLOSE after
> -		 * smc->sk is close()d, and TCP timers can be fired later,
> -		 * which need net ref.
> -		 */
> -		sk = smc->clcsock->sk;
> -		__netns_tracker_free(net, &sk->ns_tracker, false);
> -		sk->sk_net_refcnt = 1;
> -		get_net_track(net, &sk->ns_tracker, GFP_KERNEL);
> -		sock_inuse_add(net, 1);
> -	} else {
> +	if (clcsock)
>  		smc->clcsock = clcsock;
> -	}
> -
> +	else
> +		rc = smc_create_clcsk(net, sk, family);
>  out:
>  	return rc;
>  }
> diff --git a/net/smc/smc.h b/net/smc/smc.h
> index 18c8b78..3edec1e 100644
> --- a/net/smc/smc.h
> +++ b/net/smc/smc.h
> @@ -34,6 +34,11 @@
>  extern struct proto smc_proto;
>  extern struct proto smc_proto6;
>  
> +/* smc sock initialization */
> +void smc_sk_init(struct net *net, struct sock *sk, int protocol);
> +/* clcsock initialization */
> +int smc_create_clcsk(struct net *net, struct sock *sk, int family);
> +
>  #ifdef ATOMIC64_INIT
>  #define KERNEL_HAS_ATOMIC64
>  #endif
> -- 
> 1.8.3.1

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next v4 3/3] net/smc: Introduce IPPROTO_SMC
  2024-05-29  3:59 ` [PATCH net-next v4 3/3] net/smc: Introduce IPPROTO_SMC D. Wythe
@ 2024-05-29 11:12   ` Dust Li
  2024-05-30  3:11     ` D. Wythe
  2024-05-29 11:58   ` Wenjia Zhang
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 16+ messages in thread
From: Dust Li @ 2024-05-29 11:12 UTC (permalink / raw)
  To: D. Wythe, kgraul, wenjia, jaka, wintera, guwen
  Cc: kuba, davem, netdev, linux-s390, linux-rdma, tonylu, pabeni,
	edumazet

On 2024-05-29 11:59:07, D. Wythe wrote:
>From: "D. Wythe" <alibuda@linux.alibaba.com>
>
>This patch allows to create smc socket via AF_INET,
>similar to the following code,
>
>/* create v4 smc sock */
>v4 = socket(AF_INET, SOCK_STREAM, IPPROTO_SMC);
>
>/* create v6 smc sock */
>v6 = socket(AF_INET6, SOCK_STREAM, IPPROTO_SMC);
>
>There are several reasons why we believe it is appropriate here:
>
>1. For smc sockets, it actually use IPv4 (AF-INET) or IPv6 (AF-INET6)
>address. There is no AF_SMC address at all.
>
>2. Create smc socket in the AF_INET(6) path, which allows us to reuse
>the infrastructure of AF_INET(6) path, such as common ebpf hooks.
>Otherwise, smc have to implement it again in AF_SMC path.
>
>Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
>---
> include/uapi/linux/in.h |   2 +
> net/smc/Makefile        |   2 +-
> net/smc/af_smc.c        |  36 ++++++++++++++++
> net/smc/inet_smc.c      | 108 ++++++++++++++++++++++++++++++++++++++++++++++++
> net/smc/inet_smc.h      |  34 +++++++++++++++
> 5 files changed, 181 insertions(+), 1 deletion(-)
> create mode 100644 net/smc/inet_smc.c
> create mode 100644 net/smc/inet_smc.h
>
>diff --git a/include/uapi/linux/in.h b/include/uapi/linux/in.h
>index e682ab6..0c6322b 100644
>--- a/include/uapi/linux/in.h
>+++ b/include/uapi/linux/in.h
>@@ -83,6 +83,8 @@ enum {
> #define IPPROTO_RAW		IPPROTO_RAW
>   IPPROTO_MPTCP = 262,		/* Multipath TCP connection		*/
> #define IPPROTO_MPTCP		IPPROTO_MPTCP
>+  IPPROTO_SMC = 263,		/* Shared Memory Communications		*/
>+#define IPPROTO_SMC		IPPROTO_SMC
>   IPPROTO_MAX
> };
> #endif
>diff --git a/net/smc/Makefile b/net/smc/Makefile
>index 2c510d54..472b9ee 100644
>--- a/net/smc/Makefile
>+++ b/net/smc/Makefile
>@@ -4,6 +4,6 @@ obj-$(CONFIG_SMC)	+= smc.o
> obj-$(CONFIG_SMC_DIAG)	+= smc_diag.o
> smc-y := af_smc.o smc_pnet.o smc_ib.o smc_clc.o smc_core.o smc_wr.o smc_llc.o
> smc-y += smc_cdc.o smc_tx.o smc_rx.o smc_close.o smc_ism.o smc_netlink.o smc_stats.o
>-smc-y += smc_tracepoint.o
>+smc-y += smc_tracepoint.o inet_smc.o
> smc-$(CONFIG_SYSCTL) += smc_sysctl.o
> smc-$(CONFIG_SMC_LO) += smc_loopback.o
>diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
>index 8e3ce76..320624c 100644
>--- a/net/smc/af_smc.c
>+++ b/net/smc/af_smc.c
>@@ -54,6 +54,7 @@
> #include "smc_tracepoint.h"
> #include "smc_sysctl.h"
> #include "smc_loopback.h"
>+#include "inet_smc.h"
> 
> static DEFINE_MUTEX(smc_server_lgr_pending);	/* serialize link group
> 						 * creation on server
>@@ -3594,9 +3595,31 @@ static int __init smc_init(void)
> 		goto out_lo;
> 	}
> 
>+	rc = proto_register(&smc_inet_prot, 1);
>+	if (rc) {
>+		pr_err("%s: proto_register smc_inet_prot fails with %d\n", __func__, rc);
>+		goto out_ulp;
>+	}
>+	inet_register_protosw(&smc_inet_protosw);
>+#if IS_ENABLED(CONFIG_IPV6)
>+	rc = proto_register(&smc_inet6_prot, 1);
>+	if (rc) {
>+		pr_err("%s: proto_register smc_inet6_prot fails with %d\n", __func__, rc);
>+		goto out_inet_prot;
>+	}
>+	inet6_register_protosw(&smc_inet6_protosw);
>+#endif
>+

What do you think of moving all those inet initialization code into
something like smc_inet_init() and move it to smc_inet.c ?


> 	static_branch_enable(&tcp_have_smc);
> 	return 0;
> 
>+#if IS_ENABLED(CONFIG_IPV6)
>+out_inet_prot:
>+	inet_unregister_protosw(&smc_inet_protosw);
>+	proto_unregister(&smc_inet_prot);
>+#endif
>+out_ulp:
>+	tcp_unregister_ulp(&smc_ulp_ops);
> out_lo:
> 	smc_loopback_exit();
> out_ib:
>@@ -3633,6 +3656,10 @@ static int __init smc_init(void)
> static void __exit smc_exit(void)
> {
> 	static_branch_disable(&tcp_have_smc);
>+	inet_unregister_protosw(&smc_inet_protosw);
>+#if IS_ENABLED(CONFIG_IPV6)
>+	inet6_unregister_protosw(&smc_inet6_protosw);
>+#endif
> 	tcp_unregister_ulp(&smc_ulp_ops);
> 	sock_unregister(PF_SMC);
> 	smc_core_exit();
>@@ -3644,6 +3671,10 @@ static void __exit smc_exit(void)
> 	destroy_workqueue(smc_hs_wq);
> 	proto_unregister(&smc_proto6);
> 	proto_unregister(&smc_proto);
>+	proto_unregister(&smc_inet_prot);
>+#if IS_ENABLED(CONFIG_IPV6)
>+	proto_unregister(&smc_inet6_prot);
>+#endif
> 	smc_pnet_exit();
> 	smc_nl_exit();
> 	smc_clc_exit();
>@@ -3660,4 +3691,9 @@ static void __exit smc_exit(void)
> MODULE_LICENSE("GPL");
> MODULE_ALIAS_NETPROTO(PF_SMC);
> MODULE_ALIAS_TCP_ULP("smc");
>+/* 263 for IPPROTO_SMC and 1 for SOCK_STREAM */
>+MODULE_ALIAS_NET_PF_PROTO_TYPE(PF_INET, 263, 1);
>+#if IS_ENABLED(CONFIG_IPV6)
>+MODULE_ALIAS_NET_PF_PROTO_TYPE(PF_INET6, 263, 1);
>+#endif
> MODULE_ALIAS_GENL_FAMILY(SMC_GENL_FAMILY_NAME);
>diff --git a/net/smc/inet_smc.c b/net/smc/inet_smc.c
>new file mode 100644
>index 00000000..1ba73d7
>--- /dev/null
>+++ b/net/smc/inet_smc.c
>@@ -0,0 +1,108 @@
>+// SPDX-License-Identifier: GPL-2.0-only
>+/*
>+ *  Shared Memory Communications over RDMA (SMC-R) and RoCE
>+ *
>+ *  Definitions for the IPPROTO_SMC (socket related)
>+ *
>+ *  Copyright IBM Corp. 2016, 2018
>+ *  Copyright (c) 2024, Alibaba Inc.
>+ *
>+ *  Author: D. Wythe <alibuda@linux.alibaba.com>
>+ */
>+
>+#include "inet_smc.h"
>+#include "smc.h"
>+
>+struct proto smc_inet_prot = {
>+	.name		= "INET_SMC",
>+	.owner		= THIS_MODULE,
>+	.init		= smc_inet_init_sock,
>+	.hash		= smc_hash_sk,
>+	.unhash		= smc_unhash_sk,
>+	.release_cb	= smc_release_cb,
>+	.obj_size	= sizeof(struct smc_sock),
>+	.h.smc_hash	= &smc_v4_hashinfo,
>+	.slab_flags	= SLAB_TYPESAFE_BY_RCU,
>+};
>+
>+const struct proto_ops smc_inet_stream_ops = {
>+	.family		= PF_INET,
>+	.owner		= THIS_MODULE,
>+	.release	= smc_release,
>+	.bind		= smc_bind,
>+	.connect	= smc_connect,
>+	.socketpair	= sock_no_socketpair,
>+	.accept		= smc_accept,
>+	.getname	= smc_getname,
>+	.poll		= smc_poll,
>+	.ioctl		= smc_ioctl,
>+	.listen		= smc_listen,
>+	.shutdown	= smc_shutdown,
>+	.setsockopt	= smc_setsockopt,
>+	.getsockopt	= smc_getsockopt,
>+	.sendmsg	= smc_sendmsg,
>+	.recvmsg	= smc_recvmsg,
>+	.mmap		= sock_no_mmap,
>+	.splice_read	= smc_splice_read,
>+};
>+
>+struct inet_protosw smc_inet_protosw = {
>+	.type		= SOCK_STREAM,
>+	.protocol	= IPPROTO_SMC,
>+	.prot		= &smc_inet_prot,
>+	.ops		= &smc_inet_stream_ops,
>+	.flags		= INET_PROTOSW_ICSK,
>+};
>+
>+#if IS_ENABLED(CONFIG_IPV6)
>+struct proto smc_inet6_prot = {
>+	.name		= "INET6_SMC",
>+	.owner		= THIS_MODULE,
>+	.init		= smc_inet_init_sock,
>+	.hash		= smc_hash_sk,
>+	.unhash		= smc_unhash_sk,
>+	.release_cb	= smc_release_cb,
>+	.obj_size	= sizeof(struct smc_sock),
>+	.h.smc_hash	= &smc_v6_hashinfo,
>+	.slab_flags	= SLAB_TYPESAFE_BY_RCU,
>+};
>+
>+const struct proto_ops smc_inet6_stream_ops = {
>+	.family		= PF_INET6,
>+	.owner		= THIS_MODULE,
>+	.release	= smc_release,
>+	.bind		= smc_bind,
>+	.connect	= smc_connect,
>+	.socketpair	= sock_no_socketpair,
>+	.accept		= smc_accept,
>+	.getname	= smc_getname,
>+	.poll		= smc_poll,
>+	.ioctl		= smc_ioctl,
>+	.listen		= smc_listen,
>+	.shutdown	= smc_shutdown,
>+	.setsockopt	= smc_setsockopt,
>+	.getsockopt	= smc_getsockopt,
>+	.sendmsg	= smc_sendmsg,
>+	.recvmsg	= smc_recvmsg,
>+	.mmap		= sock_no_mmap,
>+	.splice_read	= smc_splice_read,
>+};
>+
>+struct inet_protosw smc_inet6_protosw = {
>+	.type		= SOCK_STREAM,
>+	.protocol	= IPPROTO_SMC,
>+	.prot		= &smc_inet6_prot,
>+	.ops		= &smc_inet6_stream_ops,
>+	.flags		= INET_PROTOSW_ICSK,
>+};
>+#endif
>+
>+int smc_inet_init_sock(struct sock *sk)
>+{
>+	struct net *net = sock_net(sk);
>+
>+	/* init common smc sock */
>+	smc_sk_init(net, sk, IPPROTO_SMC);
>+	/* create clcsock */
>+	return smc_create_clcsk(net, sk, sk->sk_family);
>+}
>diff --git a/net/smc/inet_smc.h b/net/smc/inet_smc.h
>new file mode 100644
>index 00000000..c55345d
>--- /dev/null
>+++ b/net/smc/inet_smc.h
>@@ -0,0 +1,34 @@
>+/* SPDX-License-Identifier: GPL-2.0 */
>+/*
>+ *  Shared Memory Communications over RDMA (SMC-R) and RoCE
>+ *
>+ *  Definitions for the IPPROTO_SMC (socket related)
>+
>+ *  Copyright IBM Corp. 2016
>+ *  Copyright (c) 2024, Alibaba Inc.
>+ *
>+ *  Author: D. Wythe <alibuda@linux.alibaba.com>
>+ */
>+#ifndef __INET_SMC
>+#define __INET_SMC
>+
>+#include <net/protocol.h>
>+#include <net/sock.h>
>+#include <net/tcp.h>

Why not put those 'include's in the .c file ?

>+
>+extern struct proto smc_inet_prot;
>+extern const struct proto_ops smc_inet_stream_ops;
>+extern struct inet_protosw smc_inet_protosw;
>+
>+#if IS_ENABLED(CONFIG_IPV6)
>+#include <net/ipv6.h>
>+/* MUST after net/tcp.h or warning */
>+#include <net/transp_v6.h>
>+extern struct proto smc_inet6_prot;
>+extern const struct proto_ops smc_inet6_stream_ops;
>+extern struct inet_protosw smc_inet6_protosw;
>+#endif
>+
>+int smc_inet_init_sock(struct sock *sk);

Seems smc_inet_init_sock() is only used in smc_inet.c,
why not defined it as a static function ?

Best regards,
Dust

>+
>+#endif /* __INET_SMC */
>-- 
>1.8.3.1
>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next v4 3/3] net/smc: Introduce IPPROTO_SMC
  2024-05-29  3:59 ` [PATCH net-next v4 3/3] net/smc: Introduce IPPROTO_SMC D. Wythe
  2024-05-29 11:12   ` Dust Li
@ 2024-05-29 11:58   ` Wenjia Zhang
  2024-05-30  2:51     ` D. Wythe
  2024-05-29 19:55   ` Zhu Yanjun
  2024-06-01 13:06   ` Simon Horman
  3 siblings, 1 reply; 16+ messages in thread
From: Wenjia Zhang @ 2024-05-29 11:58 UTC (permalink / raw)
  To: D. Wythe, kgraul, jaka, wintera, guwen
  Cc: kuba, davem, netdev, linux-s390, linux-rdma, tonylu, pabeni,
	edumazet



On 29.05.24 05:59, D. Wythe wrote:
> From: "D. Wythe" <alibuda@linux.alibaba.com>
> 
> This patch allows to create smc socket via AF_INET,
> similar to the following code,
> 
> /* create v4 smc sock */
> v4 = socket(AF_INET, SOCK_STREAM, IPPROTO_SMC);
> 
> /* create v6 smc sock */
> v6 = socket(AF_INET6, SOCK_STREAM, IPPROTO_SMC);
> 
> There are several reasons why we believe it is appropriate here:
> 
> 1. For smc sockets, it actually use IPv4 (AF-INET) or IPv6 (AF-INET6)
> address. There is no AF_SMC address at all.
> 
> 2. Create smc socket in the AF_INET(6) path, which allows us to reuse
> the infrastructure of AF_INET(6) path, such as common ebpf hooks.
> Otherwise, smc have to implement it again in AF_SMC path.
> 
> Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
> ---
>   include/uapi/linux/in.h |   2 +
>   net/smc/Makefile        |   2 +-
>   net/smc/af_smc.c        |  36 ++++++++++++++++
>   net/smc/inet_smc.c      | 108 ++++++++++++++++++++++++++++++++++++++++++++++++
>   net/smc/inet_smc.h      |  34 +++++++++++++++
>   5 files changed, 181 insertions(+), 1 deletion(-)
>   create mode 100644 net/smc/inet_smc.c
>   create mode 100644 net/smc/inet_smc.h
> 
> diff --git a/include/uapi/linux/in.h b/include/uapi/linux/in.h
> index e682ab6..0c6322b 100644
> --- a/include/uapi/linux/in.h
> +++ b/include/uapi/linux/in.h
> @@ -83,6 +83,8 @@ enum {
>   #define IPPROTO_RAW		IPPROTO_RAW
>     IPPROTO_MPTCP = 262,		/* Multipath TCP connection		*/
>   #define IPPROTO_MPTCP		IPPROTO_MPTCP
> +  IPPROTO_SMC = 263,		/* Shared Memory Communications		*/
> +#define IPPROTO_SMC		IPPROTO_SMC
>     IPPROTO_MAX
>   };
>   #endif
> diff --git a/net/smc/Makefile b/net/smc/Makefile
> index 2c510d54..472b9ee 100644
> --- a/net/smc/Makefile
> +++ b/net/smc/Makefile
> @@ -4,6 +4,6 @@ obj-$(CONFIG_SMC)	+= smc.o
>   obj-$(CONFIG_SMC_DIAG)	+= smc_diag.o
>   smc-y := af_smc.o smc_pnet.o smc_ib.o smc_clc.o smc_core.o smc_wr.o smc_llc.o
>   smc-y += smc_cdc.o smc_tx.o smc_rx.o smc_close.o smc_ism.o smc_netlink.o smc_stats.o
> -smc-y += smc_tracepoint.o
> +smc-y += smc_tracepoint.o inet_smc.o
>   smc-$(CONFIG_SYSCTL) += smc_sysctl.o
>   smc-$(CONFIG_SMC_LO) += smc_loopback.o
> diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
> index 8e3ce76..320624c 100644
> --- a/net/smc/af_smc.c
> +++ b/net/smc/af_smc.c
> @@ -54,6 +54,7 @@
>   #include "smc_tracepoint.h"
>   #include "smc_sysctl.h"
>   #include "smc_loopback.h"
> +#include "inet_smc.h"
>   
>   static DEFINE_MUTEX(smc_server_lgr_pending);	/* serialize link group
>   						 * creation on server
> @@ -3594,9 +3595,31 @@ static int __init smc_init(void)
>   		goto out_lo;
>   	}
>   
> +	rc = proto_register(&smc_inet_prot, 1);
> +	if (rc) {
> +		pr_err("%s: proto_register smc_inet_prot fails with %d\n", __func__, rc);
> +		goto out_ulp;
> +	}
> +	inet_register_protosw(&smc_inet_protosw);
> +#if IS_ENABLED(CONFIG_IPV6)
> +	rc = proto_register(&smc_inet6_prot, 1);
> +	if (rc) {
> +		pr_err("%s: proto_register smc_inet6_prot fails with %d\n", __func__, rc);
> +		goto out_inet_prot;
> +	}
> +	inet6_register_protosw(&smc_inet6_protosw);

Comparing to inet_register_protosw(), the inet6_register_protosw() 
returns an integer. Thus, making error check and direct corresponding 
housekeeping here looks IMO much cleaner.

> +#endif
> +
>   	static_branch_enable(&tcp_have_smc);
>   	return 0;
>   
> +#if IS_ENABLED(CONFIG_IPV6)
> +out_inet_prot:
> +	inet_unregister_protosw(&smc_inet_protosw);
> +	proto_unregister(&smc_inet_prot);
> +#endif
> +out_ulp:
> +	tcp_unregister_ulp(&smc_ulp_ops);
>   out_lo:
>   	smc_loopback_exit();
>   out_ib:
> @@ -3633,6 +3656,10 @@ static int __init smc_init(void)
>   static void __exit smc_exit(void)
>   {
>   	static_branch_disable(&tcp_have_smc);
> +	inet_unregister_protosw(&smc_inet_protosw);
> +#if IS_ENABLED(CONFIG_IPV6)
> +	inet6_unregister_protosw(&smc_inet6_protosw);
> +#endif
>   	tcp_unregister_ulp(&smc_ulp_ops);
>   	sock_unregister(PF_SMC);
>   	smc_core_exit();
> @@ -3644,6 +3671,10 @@ static void __exit smc_exit(void)
>   	destroy_workqueue(smc_hs_wq);
>   	proto_unregister(&smc_proto6);
>   	proto_unregister(&smc_proto);
> +	proto_unregister(&smc_inet_prot);
> +#if IS_ENABLED(CONFIG_IPV6)
> +	proto_unregister(&smc_inet6_prot);
> +#end

Since there is already inet_smc.c, I'd recommend to group these register 
and unregister stuff respectively in functions like e.g. smc_inet_init() 
and smc_inet_exit() in inet_smc.c

>   	smc_pnet_exit();
>   	smc_nl_exit();
>   	smc_clc_exit();
> @@ -3660,4 +3691,9 @@ static void __exit smc_exit(void)
>   MODULE_LICENSE("GPL");
>   MODULE_ALIAS_NETPROTO(PF_SMC);
>   MODULE_ALIAS_TCP_ULP("smc");
> +/* 263 for IPPROTO_SMC and 1 for SOCK_STREAM */
> +MODULE_ALIAS_NET_PF_PROTO_TYPE(PF_INET, 263, 1);
> +#if IS_ENABLED(CONFIG_IPV6)
> +MODULE_ALIAS_NET_PF_PROTO_TYPE(PF_INET6, 263, 1);
> +#endif
>   MODULE_ALIAS_GENL_FAMILY(SMC_GENL_FAMILY_NAME);
> diff --git a/net/smc/inet_smc.c b/net/smc/inet_smc.c
> new file mode 100644
> index 00000000..1ba73d7
> --- /dev/null
> +++ b/net/smc/inet_smc.c

In order to keep the consistency with the structure and function names 
in the files, I'm wondering why not to use smc_inet.h and smc_inet.c
instead of inet_smc.h and inet_smc.c respectively

> @@ -0,0 +1,108 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + *  Shared Memory Communications over RDMA (SMC-R) and RoCE
> + *
> + *  Definitions for the IPPROTO_SMC (socket related)
> + *
> + *  Copyright IBM Corp. 2016, 2018
> + *  Copyright (c) 2024, Alibaba Inc.
> + *
> + *  Author: D. Wythe <alibuda@linux.alibaba.com>
> + */
> +
> +#include "inet_smc.h"
> +#include "smc.h"
> +
> +struct proto smc_inet_prot = {
> +	.name		= "INET_SMC",
> +	.owner		= THIS_MODULE,
> +	.init		= smc_inet_init_sock,
> +	.hash		= smc_hash_sk,
> +	.unhash		= smc_unhash_sk,
> +	.release_cb	= smc_release_cb,
> +	.obj_size	= sizeof(struct smc_sock),
> +	.h.smc_hash	= &smc_v4_hashinfo,
> +	.slab_flags	= SLAB_TYPESAFE_BY_RCU,
> +};
> +
> +const struct proto_ops smc_inet_stream_ops = {
> +	.family		= PF_INET,
> +	.owner		= THIS_MODULE,
> +	.release	= smc_release,
> +	.bind		= smc_bind,
> +	.connect	= smc_connect,
> +	.socketpair	= sock_no_socketpair,
> +	.accept		= smc_accept,
> +	.getname	= smc_getname,
> +	.poll		= smc_poll,
> +	.ioctl		= smc_ioctl,
> +	.listen		= smc_listen,
> +	.shutdown	= smc_shutdown,
> +	.setsockopt	= smc_setsockopt,
> +	.getsockopt	= smc_getsockopt,
> +	.sendmsg	= smc_sendmsg,
> +	.recvmsg	= smc_recvmsg,
> +	.mmap		= sock_no_mmap,
> +	.splice_read	= smc_splice_read,
> +};
> +
> +struct inet_protosw smc_inet_protosw = {
> +	.type		= SOCK_STREAM,
> +	.protocol	= IPPROTO_SMC,
> +	.prot		= &smc_inet_prot,
> +	.ops		= &smc_inet_stream_ops,
> +	.flags		= INET_PROTOSW_ICSK,
> +};
> +
> +#if IS_ENABLED(CONFIG_IPV6)
> +struct proto smc_inet6_prot = {
> +	.name		= "INET6_SMC",
> +	.owner		= THIS_MODULE,
> +	.init		= smc_inet_init_sock,
> +	.hash		= smc_hash_sk,
> +	.unhash		= smc_unhash_sk,
> +	.release_cb	= smc_release_cb,
> +	.obj_size	= sizeof(struct smc_sock),
> +	.h.smc_hash	= &smc_v6_hashinfo,
> +	.slab_flags	= SLAB_TYPESAFE_BY_RCU,
> +};
> +
> +const struct proto_ops smc_inet6_stream_ops = {
> +	.family		= PF_INET6,
> +	.owner		= THIS_MODULE,
> +	.release	= smc_release,
> +	.bind		= smc_bind,
> +	.connect	= smc_connect,
> +	.socketpair	= sock_no_socketpair,
> +	.accept		= smc_accept,
> +	.getname	= smc_getname,
> +	.poll		= smc_poll,
> +	.ioctl		= smc_ioctl,
> +	.listen		= smc_listen,
> +	.shutdown	= smc_shutdown,
> +	.setsockopt	= smc_setsockopt,
> +	.getsockopt	= smc_getsockopt,
> +	.sendmsg	= smc_sendmsg,
> +	.recvmsg	= smc_recvmsg,
> +	.mmap		= sock_no_mmap,
> +	.splice_read	= smc_splice_read,
> +};
> +
> +struct inet_protosw smc_inet6_protosw = {
> +	.type		= SOCK_STREAM,
> +	.protocol	= IPPROTO_SMC,
> +	.prot		= &smc_inet6_prot,
> +	.ops		= &smc_inet6_stream_ops,
> +	.flags		= INET_PROTOSW_ICSK,
> +};
> +#endif
> +
> +int smc_inet_init_sock(struct sock *sk)
> +{
> +	struct net *net = sock_net(sk);
> +
> +	/* init common smc sock */
> +	smc_sk_init(net, sk, IPPROTO_SMC);
> +	/* create clcsock */
> +	return smc_create_clcsk(net, sk, sk->sk_family);
> +}
> diff --git a/net/smc/inet_smc.h b/net/smc/inet_smc.h
> new file mode 100644
> index 00000000..c55345d
> --- /dev/null
> +++ b/net/smc/inet_smc.h
> @@ -0,0 +1,34 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + *  Shared Memory Communications over RDMA (SMC-R) and RoCE
> + *
> + *  Definitions for the IPPROTO_SMC (socket related)
> +
> + *  Copyright IBM Corp. 2016
> + *  Copyright (c) 2024, Alibaba Inc.
> + *
> + *  Author: D. Wythe <alibuda@linux.alibaba.com>
> + */
> +#ifndef __INET_SMC
> +#define __INET_SMC
> +
> +#include <net/protocol.h>
> +#include <net/sock.h>
> +#include <net/tcp.h>
> +
> +extern struct proto smc_inet_prot;
> +extern const struct proto_ops smc_inet_stream_ops;
> +extern struct inet_protosw smc_inet_protosw;
> +
> +#if IS_ENABLED(CONFIG_IPV6)
> +#include <net/ipv6.h>
> +/* MUST after net/tcp.h or warning */
> +#include <net/transp_v6.h>
> +extern struct proto smc_inet6_prot;
> +extern const struct proto_ops smc_inet6_stream_ops;
> +extern struct inet_protosw smc_inet6_protosw;
> +#endif
> +
> +int smc_inet_init_sock(struct sock *sk);
> +
> +#endif /* __INET_SMC */

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next v4 2/3] net/smc: expose smc proto operations
  2024-05-29  3:59 ` [PATCH net-next v4 2/3] net/smc: expose smc proto operations D. Wythe
@ 2024-05-29 17:57   ` Zhu Yanjun
  2024-05-30  2:33     ` D. Wythe
  0 siblings, 1 reply; 16+ messages in thread
From: Zhu Yanjun @ 2024-05-29 17:57 UTC (permalink / raw)
  To: D. Wythe, kgraul, wenjia, jaka, wintera, guwen
  Cc: kuba, davem, netdev, linux-s390, linux-rdma, tonylu, pabeni,
	edumazet

在 2024/5/29 5:59, D. Wythe 写道:
> From: "D. Wythe" <alibuda@linux.alibaba.com>
> 
> Externalize smc proto operations (smc_xxx) to allow
> access from files other that af_smc.c

s/other that/other than ?

Zhu Yanjun

> 
> This is in preparation for the subsequent implementation
> of the AF_INET version of SMC.
> 
> Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
> ---
>   net/smc/af_smc.c | 60 ++++++++++++++++++++++++++++----------------------------
>   net/smc/smc.h    | 33 +++++++++++++++++++++++++++++++
>   2 files changed, 63 insertions(+), 30 deletions(-)
> 
> diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
> index 77a9d58..8e3ce76 100644
> --- a/net/smc/af_smc.c
> +++ b/net/smc/af_smc.c
> @@ -170,15 +170,15 @@ static bool smc_hs_congested(const struct sock *sk)
>   	return false;
>   }
>   
> -static struct smc_hashinfo smc_v4_hashinfo = {
> +struct smc_hashinfo smc_v4_hashinfo = {
>   	.lock = __RW_LOCK_UNLOCKED(smc_v4_hashinfo.lock),
>   };
>   
> -static struct smc_hashinfo smc_v6_hashinfo = {
> +struct smc_hashinfo smc_v6_hashinfo = {
>   	.lock = __RW_LOCK_UNLOCKED(smc_v6_hashinfo.lock),
>   };
>   
> -static int smc_hash_sk(struct sock *sk)
> +int smc_hash_sk(struct sock *sk)
>   {
>   	struct smc_hashinfo *h = sk->sk_prot->h.smc_hash;
>   	struct hlist_head *head;
> @@ -193,7 +193,7 @@ static int smc_hash_sk(struct sock *sk)
>   	return 0;
>   }
>   
> -static void smc_unhash_sk(struct sock *sk)
> +void smc_unhash_sk(struct sock *sk)
>   {
>   	struct smc_hashinfo *h = sk->sk_prot->h.smc_hash;
>   
> @@ -207,7 +207,7 @@ static void smc_unhash_sk(struct sock *sk)
>    * work which we didn't do because of user hold the sock_lock in the
>    * BH context
>    */
> -static void smc_release_cb(struct sock *sk)
> +void smc_release_cb(struct sock *sk)
>   {
>   	struct smc_sock *smc = smc_sk(sk);
>   
> @@ -307,7 +307,7 @@ static int __smc_release(struct smc_sock *smc)
>   	return rc;
>   }
>   
> -static int smc_release(struct socket *sock)
> +int smc_release(struct socket *sock)
>   {
>   	struct sock *sk = sock->sk;
>   	struct smc_sock *smc;
> @@ -401,8 +401,8 @@ static struct sock *smc_sock_alloc(struct net *net, struct socket *sock,
>   	return sk;
>   }
>   
> -static int smc_bind(struct socket *sock, struct sockaddr *uaddr,
> -		    int addr_len)
> +int smc_bind(struct socket *sock, struct sockaddr *uaddr,
> +	     int addr_len)
>   {
>   	struct sockaddr_in *addr = (struct sockaddr_in *)uaddr;
>   	struct sock *sk = sock->sk;
> @@ -1649,8 +1649,8 @@ static void smc_connect_work(struct work_struct *work)
>   	release_sock(&smc->sk);
>   }
>   
> -static int smc_connect(struct socket *sock, struct sockaddr *addr,
> -		       int alen, int flags)
> +int smc_connect(struct socket *sock, struct sockaddr *addr,
> +		int alen, int flags)
>   {
>   	struct sock *sk = sock->sk;
>   	struct smc_sock *smc;
> @@ -2631,7 +2631,7 @@ static void smc_clcsock_data_ready(struct sock *listen_clcsock)
>   	read_unlock_bh(&listen_clcsock->sk_callback_lock);
>   }
>   
> -static int smc_listen(struct socket *sock, int backlog)
> +int smc_listen(struct socket *sock, int backlog)
>   {
>   	struct sock *sk = sock->sk;
>   	struct smc_sock *smc;
> @@ -2696,8 +2696,8 @@ static int smc_listen(struct socket *sock, int backlog)
>   	return rc;
>   }
>   
> -static int smc_accept(struct socket *sock, struct socket *new_sock,
> -		      struct proto_accept_arg *arg)
> +int smc_accept(struct socket *sock, struct socket *new_sock,
> +	       struct proto_accept_arg *arg)
>   {
>   	struct sock *sk = sock->sk, *nsk;
>   	DECLARE_WAITQUEUE(wait, current);
> @@ -2766,8 +2766,8 @@ static int smc_accept(struct socket *sock, struct socket *new_sock,
>   	return rc;
>   }
>   
> -static int smc_getname(struct socket *sock, struct sockaddr *addr,
> -		       int peer)
> +int smc_getname(struct socket *sock, struct sockaddr *addr,
> +		int peer)
>   {
>   	struct smc_sock *smc;
>   
> @@ -2780,7 +2780,7 @@ static int smc_getname(struct socket *sock, struct sockaddr *addr,
>   	return smc->clcsock->ops->getname(smc->clcsock, addr, peer);
>   }
>   
> -static int smc_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
> +int smc_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
>   {
>   	struct sock *sk = sock->sk;
>   	struct smc_sock *smc;
> @@ -2818,8 +2818,8 @@ static int smc_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
>   	return rc;
>   }
>   
> -static int smc_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
> -		       int flags)
> +int smc_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
> +		int flags)
>   {
>   	struct sock *sk = sock->sk;
>   	struct smc_sock *smc;
> @@ -2868,8 +2868,8 @@ static __poll_t smc_accept_poll(struct sock *parent)
>   	return mask;
>   }
>   
> -static __poll_t smc_poll(struct file *file, struct socket *sock,
> -			     poll_table *wait)
> +__poll_t smc_poll(struct file *file, struct socket *sock,
> +		  poll_table *wait)
>   {
>   	struct sock *sk = sock->sk;
>   	struct smc_sock *smc;
> @@ -2921,7 +2921,7 @@ static __poll_t smc_poll(struct file *file, struct socket *sock,
>   	return mask;
>   }
>   
> -static int smc_shutdown(struct socket *sock, int how)
> +int smc_shutdown(struct socket *sock, int how)
>   {
>   	struct sock *sk = sock->sk;
>   	bool do_shutdown = true;
> @@ -3061,8 +3061,8 @@ static int __smc_setsockopt(struct socket *sock, int level, int optname,
>   	return rc;
>   }
>   
> -static int smc_setsockopt(struct socket *sock, int level, int optname,
> -			  sockptr_t optval, unsigned int optlen)
> +int smc_setsockopt(struct socket *sock, int level, int optname,
> +		   sockptr_t optval, unsigned int optlen)
>   {
>   	struct sock *sk = sock->sk;
>   	struct smc_sock *smc;
> @@ -3148,8 +3148,8 @@ static int smc_setsockopt(struct socket *sock, int level, int optname,
>   	return rc;
>   }
>   
> -static int smc_getsockopt(struct socket *sock, int level, int optname,
> -			  char __user *optval, int __user *optlen)
> +int smc_getsockopt(struct socket *sock, int level, int optname,
> +		   char __user *optval, int __user *optlen)
>   {
>   	struct smc_sock *smc;
>   	int rc;
> @@ -3174,8 +3174,8 @@ static int smc_getsockopt(struct socket *sock, int level, int optname,
>   	return rc;
>   }
>   
> -static int smc_ioctl(struct socket *sock, unsigned int cmd,
> -		     unsigned long arg)
> +int smc_ioctl(struct socket *sock, unsigned int cmd,
> +	      unsigned long arg)
>   {
>   	union smc_host_cursor cons, urg;
>   	struct smc_connection *conn;
> @@ -3261,9 +3261,9 @@ static int smc_ioctl(struct socket *sock, unsigned int cmd,
>    * Note that subsequent recv() calls have to wait till all splice() processing
>    * completed.
>    */
> -static ssize_t smc_splice_read(struct socket *sock, loff_t *ppos,
> -			       struct pipe_inode_info *pipe, size_t len,
> -			       unsigned int flags)
> +ssize_t smc_splice_read(struct socket *sock, loff_t *ppos,
> +			struct pipe_inode_info *pipe, size_t len,
> +			unsigned int flags)
>   {
>   	struct sock *sk = sock->sk;
>   	struct smc_sock *smc;
> diff --git a/net/smc/smc.h b/net/smc/smc.h
> index 3edec1e..34b781e 100644
> --- a/net/smc/smc.h
> +++ b/net/smc/smc.h
> @@ -34,6 +34,39 @@
>   extern struct proto smc_proto;
>   extern struct proto smc_proto6;
>   
> +extern struct smc_hashinfo smc_v4_hashinfo;
> +extern struct smc_hashinfo smc_v6_hashinfo;
> +
> +int smc_hash_sk(struct sock *sk);
> +void smc_unhash_sk(struct sock *sk);
> +void smc_release_cb(struct sock *sk);
> +
> +int smc_release(struct socket *sock);
> +int smc_bind(struct socket *sock, struct sockaddr *uaddr,
> +	     int addr_len);
> +int smc_connect(struct socket *sock, struct sockaddr *addr,
> +		int alen, int flags);
> +int smc_accept(struct socket *sock, struct socket *new_sock,
> +	       struct proto_accept_arg *arg);
> +int smc_getname(struct socket *sock, struct sockaddr *addr,
> +		int peer);
> +__poll_t smc_poll(struct file *file, struct socket *sock,
> +		  poll_table *wait);
> +int smc_ioctl(struct socket *sock, unsigned int cmd,
> +	      unsigned long arg);
> +int smc_listen(struct socket *sock, int backlog);
> +int smc_shutdown(struct socket *sock, int how);
> +int smc_setsockopt(struct socket *sock, int level, int optname,
> +		   sockptr_t optval, unsigned int optlen);
> +int smc_getsockopt(struct socket *sock, int level, int optname,
> +		   char __user *optval, int __user *optlen);
> +int smc_sendmsg(struct socket *sock, struct msghdr *msg, size_t len);
> +int smc_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
> +		int flags);
> +ssize_t smc_splice_read(struct socket *sock, loff_t *ppos,
> +			struct pipe_inode_info *pipe, size_t len,
> +			unsigned int flags);
> +
>   /* smc sock initialization */
>   void smc_sk_init(struct net *net, struct sock *sk, int protocol);
>   /* clcsock initialization */


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next v4 3/3] net/smc: Introduce IPPROTO_SMC
  2024-05-29  3:59 ` [PATCH net-next v4 3/3] net/smc: Introduce IPPROTO_SMC D. Wythe
  2024-05-29 11:12   ` Dust Li
  2024-05-29 11:58   ` Wenjia Zhang
@ 2024-05-29 19:55   ` Zhu Yanjun
  2024-05-30  2:35     ` D. Wythe
  2024-06-01 13:06   ` Simon Horman
  3 siblings, 1 reply; 16+ messages in thread
From: Zhu Yanjun @ 2024-05-29 19:55 UTC (permalink / raw)
  To: D. Wythe, kgraul, wenjia, jaka, wintera, guwen
  Cc: kuba, davem, netdev, linux-s390, linux-rdma, tonylu, pabeni,
	edumazet

在 2024/5/29 5:59, D. Wythe 写道:
> From: "D. Wythe" <alibuda@linux.alibaba.com>
> 
> This patch allows to create smc socket via AF_INET,
> similar to the following code,
> 
> /* create v4 smc sock */
> v4 = socket(AF_INET, SOCK_STREAM, IPPROTO_SMC);
> 
> /* create v6 smc sock */
> v6 = socket(AF_INET6, SOCK_STREAM, IPPROTO_SMC);
> 
> There are several reasons why we believe it is appropriate here:
> 
> 1. For smc sockets, it actually use IPv4 (AF-INET) or IPv6 (AF-INET6)
> address. There is no AF_SMC address at all.
> 
> 2. Create smc socket in the AF_INET(6) path, which allows us to reuse
> the infrastructure of AF_INET(6) path, such as common ebpf hooks.
> Otherwise, smc have to implement it again in AF_SMC path.
> 
> Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
> ---
>   include/uapi/linux/in.h |   2 +
>   net/smc/Makefile        |   2 +-
>   net/smc/af_smc.c        |  36 ++++++++++++++++
>   net/smc/inet_smc.c      | 108 ++++++++++++++++++++++++++++++++++++++++++++++++
>   net/smc/inet_smc.h      |  34 +++++++++++++++
>   5 files changed, 181 insertions(+), 1 deletion(-)
>   create mode 100644 net/smc/inet_smc.c
>   create mode 100644 net/smc/inet_smc.h
> 
> diff --git a/include/uapi/linux/in.h b/include/uapi/linux/in.h
> index e682ab6..0c6322b 100644
> --- a/include/uapi/linux/in.h
> +++ b/include/uapi/linux/in.h
> @@ -83,6 +83,8 @@ enum {
>   #define IPPROTO_RAW		IPPROTO_RAW
>     IPPROTO_MPTCP = 262,		/* Multipath TCP connection		*/
>   #define IPPROTO_MPTCP		IPPROTO_MPTCP
> +  IPPROTO_SMC = 263,		/* Shared Memory Communications		*/
> +#define IPPROTO_SMC		IPPROTO_SMC
>     IPPROTO_MAX
>   };
>   #endif
> diff --git a/net/smc/Makefile b/net/smc/Makefile
> index 2c510d54..472b9ee 100644
> --- a/net/smc/Makefile
> +++ b/net/smc/Makefile
> @@ -4,6 +4,6 @@ obj-$(CONFIG_SMC)	+= smc.o
>   obj-$(CONFIG_SMC_DIAG)	+= smc_diag.o
>   smc-y := af_smc.o smc_pnet.o smc_ib.o smc_clc.o smc_core.o smc_wr.o smc_llc.o
>   smc-y += smc_cdc.o smc_tx.o smc_rx.o smc_close.o smc_ism.o smc_netlink.o smc_stats.o
> -smc-y += smc_tracepoint.o
> +smc-y += smc_tracepoint.o inet_smc.o
>   smc-$(CONFIG_SYSCTL) += smc_sysctl.o
>   smc-$(CONFIG_SMC_LO) += smc_loopback.o
> diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
> index 8e3ce76..320624c 100644
> --- a/net/smc/af_smc.c
> +++ b/net/smc/af_smc.c
> @@ -54,6 +54,7 @@
>   #include "smc_tracepoint.h"
>   #include "smc_sysctl.h"
>   #include "smc_loopback.h"
> +#include "inet_smc.h"
>   
>   static DEFINE_MUTEX(smc_server_lgr_pending);	/* serialize link group
>   						 * creation on server
> @@ -3594,9 +3595,31 @@ static int __init smc_init(void)
>   		goto out_lo;
>   	}
>   
> +	rc = proto_register(&smc_inet_prot, 1);
> +	if (rc) {
> +		pr_err("%s: proto_register smc_inet_prot fails with %d\n", __func__, rc);
> +		goto out_ulp;
> +	}
> +	inet_register_protosw(&smc_inet_protosw);
> +#if IS_ENABLED(CONFIG_IPV6)
> +	rc = proto_register(&smc_inet6_prot, 1);
> +	if (rc) {
> +		pr_err("%s: proto_register smc_inet6_prot fails with %d\n", __func__, rc);
> +		goto out_inet_prot;
> +	}
> +	inet6_register_protosw(&smc_inet6_protosw);
> +#endif
> +
>   	static_branch_enable(&tcp_have_smc);
>   	return 0;
>   
> +#if IS_ENABLED(CONFIG_IPV6)
> +out_inet_prot:
> +	inet_unregister_protosw(&smc_inet_protosw);
> +	proto_unregister(&smc_inet_prot);
> +#endif
> +out_ulp:
> +	tcp_unregister_ulp(&smc_ulp_ops);
>   out_lo:
>   	smc_loopback_exit();
>   out_ib:
> @@ -3633,6 +3656,10 @@ static int __init smc_init(void)
>   static void __exit smc_exit(void)
>   {
>   	static_branch_disable(&tcp_have_smc);
> +	inet_unregister_protosw(&smc_inet_protosw);
> +#if IS_ENABLED(CONFIG_IPV6)
> +	inet6_unregister_protosw(&smc_inet6_protosw);
> +#endif
>   	tcp_unregister_ulp(&smc_ulp_ops);
>   	sock_unregister(PF_SMC);
>   	smc_core_exit();
> @@ -3644,6 +3671,10 @@ static void __exit smc_exit(void)
>   	destroy_workqueue(smc_hs_wq);
>   	proto_unregister(&smc_proto6);
>   	proto_unregister(&smc_proto);
> +	proto_unregister(&smc_inet_prot);
> +#if IS_ENABLED(CONFIG_IPV6)
> +	proto_unregister(&smc_inet6_prot);
> +#endif
>   	smc_pnet_exit();
>   	smc_nl_exit();
>   	smc_clc_exit();
> @@ -3660,4 +3691,9 @@ static void __exit smc_exit(void)
>   MODULE_LICENSE("GPL");
>   MODULE_ALIAS_NETPROTO(PF_SMC);
>   MODULE_ALIAS_TCP_ULP("smc");
> +/* 263 for IPPROTO_SMC and 1 for SOCK_STREAM */
> +MODULE_ALIAS_NET_PF_PROTO_TYPE(PF_INET, 263, 1);
> +#if IS_ENABLED(CONFIG_IPV6)
> +MODULE_ALIAS_NET_PF_PROTO_TYPE(PF_INET6, 263, 1);
> +#endif
>   MODULE_ALIAS_GENL_FAMILY(SMC_GENL_FAMILY_NAME);
> diff --git a/net/smc/inet_smc.c b/net/smc/inet_smc.c
> new file mode 100644
> index 00000000..1ba73d7
> --- /dev/null
> +++ b/net/smc/inet_smc.c
> @@ -0,0 +1,108 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + *  Shared Memory Communications over RDMA (SMC-R) and RoCE
> + *
> + *  Definitions for the IPPROTO_SMC (socket related)
> + *
> + *  Copyright IBM Corp. 2016, 2018
> + *  Copyright (c) 2024, Alibaba Inc.
> + *
> + *  Author: D. Wythe <alibuda@linux.alibaba.com>
> + */
> +
> +#include "inet_smc.h"
> +#include "smc.h"
> +
> +struct proto smc_inet_prot = {
> +	.name		= "INET_SMC",
> +	.owner		= THIS_MODULE,
> +	.init		= smc_inet_init_sock,
> +	.hash		= smc_hash_sk,
> +	.unhash		= smc_unhash_sk,
> +	.release_cb	= smc_release_cb,
> +	.obj_size	= sizeof(struct smc_sock),
> +	.h.smc_hash	= &smc_v4_hashinfo,
> +	.slab_flags	= SLAB_TYPESAFE_BY_RCU,
> +};
> +
> +const struct proto_ops smc_inet_stream_ops = {
> +	.family		= PF_INET,
> +	.owner		= THIS_MODULE,
> +	.release	= smc_release,
> +	.bind		= smc_bind,
> +	.connect	= smc_connect,
> +	.socketpair	= sock_no_socketpair,
> +	.accept		= smc_accept,
> +	.getname	= smc_getname,
> +	.poll		= smc_poll,
> +	.ioctl		= smc_ioctl,
> +	.listen		= smc_listen,
> +	.shutdown	= smc_shutdown,
> +	.setsockopt	= smc_setsockopt,
> +	.getsockopt	= smc_getsockopt,
> +	.sendmsg	= smc_sendmsg,
> +	.recvmsg	= smc_recvmsg,
> +	.mmap		= sock_no_mmap,
> +	.splice_read	= smc_splice_read,
> +};
> +
> +struct inet_protosw smc_inet_protosw = {
> +	.type		= SOCK_STREAM,
> +	.protocol	= IPPROTO_SMC,
> +	.prot		= &smc_inet_prot,
> +	.ops		= &smc_inet_stream_ops,
> +	.flags		= INET_PROTOSW_ICSK,
> +};
> +
> +#if IS_ENABLED(CONFIG_IPV6)
> +struct proto smc_inet6_prot = {
> +	.name		= "INET6_SMC",
> +	.owner		= THIS_MODULE,
> +	.init		= smc_inet_init_sock,
> +	.hash		= smc_hash_sk,
> +	.unhash		= smc_unhash_sk,
> +	.release_cb	= smc_release_cb,
> +	.obj_size	= sizeof(struct smc_sock),
> +	.h.smc_hash	= &smc_v6_hashinfo,
> +	.slab_flags	= SLAB_TYPESAFE_BY_RCU,
> +};
> +
> +const struct proto_ops smc_inet6_stream_ops = {
> +	.family		= PF_INET6,
> +	.owner		= THIS_MODULE,
> +	.release	= smc_release,
> +	.bind		= smc_bind,
> +	.connect	= smc_connect,
> +	.socketpair	= sock_no_socketpair,
> +	.accept		= smc_accept,
> +	.getname	= smc_getname,
> +	.poll		= smc_poll,
> +	.ioctl		= smc_ioctl,
> +	.listen		= smc_listen,
> +	.shutdown	= smc_shutdown,
> +	.setsockopt	= smc_setsockopt,
> +	.getsockopt	= smc_getsockopt,
> +	.sendmsg	= smc_sendmsg,
> +	.recvmsg	= smc_recvmsg,
> +	.mmap		= sock_no_mmap,
> +	.splice_read	= smc_splice_read,
> +};
> +
> +struct inet_protosw smc_inet6_protosw = {
> +	.type		= SOCK_STREAM,
> +	.protocol	= IPPROTO_SMC,
> +	.prot		= &smc_inet6_prot,
> +	.ops		= &smc_inet6_stream_ops,
> +	.flags		= INET_PROTOSW_ICSK,
> +};
> +#endif
> +
> +int smc_inet_init_sock(struct sock *sk)
> +{
> +	struct net *net = sock_net(sk);
> +
> +	/* init common smc sock */
> +	smc_sk_init(net, sk, IPPROTO_SMC);
> +	/* create clcsock */
> +	return smc_create_clcsk(net, sk, sk->sk_family);
> +}
> diff --git a/net/smc/inet_smc.h b/net/smc/inet_smc.h
> new file mode 100644
> index 00000000..c55345d
> --- /dev/null
> +++ b/net/smc/inet_smc.h
> @@ -0,0 +1,34 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + *  Shared Memory Communications over RDMA (SMC-R) and RoCE
> + *
> + *  Definitions for the IPPROTO_SMC (socket related)
> +
> + *  Copyright IBM Corp. 2016
> + *  Copyright (c) 2024, Alibaba Inc.
> + *
> + *  Author: D. Wythe <alibuda@linux.alibaba.com>
> + */
> +#ifndef __INET_SMC
> +#define __INET_SMC
> +
> +#include <net/protocol.h>
> +#include <net/sock.h>
> +#include <net/tcp.h>
> +
> +extern struct proto smc_inet_prot;
> +extern const struct proto_ops smc_inet_stream_ops;
> +extern struct inet_protosw smc_inet_protosw;
> +
> +#if IS_ENABLED(CONFIG_IPV6)
> +#include <net/ipv6.h>
> +/* MUST after net/tcp.h or warning */
> +#include <net/transp_v6.h>
> +extern struct proto smc_inet6_prot;
> +extern const struct proto_ops smc_inet6_stream_ops;
> +extern struct inet_protosw smc_inet6_protosw;
> +#endif

If we append /* CONFIG_IPV6 */ to #endif to indicate that it is the end 
of CONFIG_IPV6, it is a good habit. When browsing the source code, it is 
easy for us to know that it is the end of CONFIG_IPV6.
Just my 2 cent suggestions. It is a trivial problem. You can ignore it.
But if you fix it, it can make the source code more readable.

Zhu Yanjun

> +
> +int smc_inet_init_sock(struct sock *sk);
> +
> +#endif /* __INET_SMC */


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next v4 2/3] net/smc: expose smc proto operations
  2024-05-29 17:57   ` Zhu Yanjun
@ 2024-05-30  2:33     ` D. Wythe
  0 siblings, 0 replies; 16+ messages in thread
From: D. Wythe @ 2024-05-30  2:33 UTC (permalink / raw)
  To: Zhu Yanjun, kgraul, wenjia, jaka, wintera, guwen
  Cc: kuba, davem, netdev, linux-s390, linux-rdma, tonylu, pabeni,
	edumazet



On 5/30/24 1:57 AM, Zhu Yanjun wrote:
> 在 2024/5/29 5:59, D. Wythe 写道:
>> From: "D. Wythe" <alibuda@linux.alibaba.com>
>>
>> Externalize smc proto operations (smc_xxx) to allow
>> access from files other that af_smc.c
>
> s/other that/other than ?
>
> Zhu Yanjun
>

Thanks for that.

D. Wythe

>>
>> This is in preparation for the subsequent implementation
>> of the AF_INET version of SMC.
>>
>> Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
>> ---
>>   net/smc/af_smc.c | 60 
>> ++++++++++++++++++++++++++++----------------------------
>>   net/smc/smc.h    | 33 +++++++++++++++++++++++++++++++
>>   2 files changed, 63 insertions(+), 30 deletions(-)
>>
>> diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
>> index 77a9d58..8e3ce76 100644
>> --- a/net/smc/af_smc.c
>> +++ b/net/smc/af_smc.c
>> @@ -170,15 +170,15 @@ static bool smc_hs_congested(const struct sock 
>> *sk)
>>       return false;
>>   }
>>   -static struct smc_hashinfo smc_v4_hashinfo = {
>> +struct smc_hashinfo smc_v4_hashinfo = {
>>       .lock = __RW_LOCK_UNLOCKED(smc_v4_hashinfo.lock),
>>   };
>>   -static struct smc_hashinfo smc_v6_hashinfo = {
>> +struct smc_hashinfo smc_v6_hashinfo = {
>>       .lock = __RW_LOCK_UNLOCKED(smc_v6_hashinfo.lock),
>>   };
>>   -static int smc_hash_sk(struct sock *sk)
>> +int smc_hash_sk(struct sock *sk)
>>   {
>>       struct smc_hashinfo *h = sk->sk_prot->h.smc_hash;
>>       struct hlist_head *head;
>> @@ -193,7 +193,7 @@ static int smc_hash_sk(struct sock *sk)
>>       return 0;
>>   }
>>   -static void smc_unhash_sk(struct sock *sk)
>> +void smc_unhash_sk(struct sock *sk)
>>   {
>>       struct smc_hashinfo *h = sk->sk_prot->h.smc_hash;
>>   @@ -207,7 +207,7 @@ static void smc_unhash_sk(struct sock *sk)
>>    * work which we didn't do because of user hold the sock_lock in the
>>    * BH context
>>    */
>> -static void smc_release_cb(struct sock *sk)
>> +void smc_release_cb(struct sock *sk)
>>   {
>>       struct smc_sock *smc = smc_sk(sk);
>>   @@ -307,7 +307,7 @@ static int __smc_release(struct smc_sock *smc)
>>       return rc;
>>   }
>>   -static int smc_release(struct socket *sock)
>> +int smc_release(struct socket *sock)
>>   {
>>       struct sock *sk = sock->sk;
>>       struct smc_sock *smc;
>> @@ -401,8 +401,8 @@ static struct sock *smc_sock_alloc(struct net 
>> *net, struct socket *sock,
>>       return sk;
>>   }
>>   -static int smc_bind(struct socket *sock, struct sockaddr *uaddr,
>> -            int addr_len)
>> +int smc_bind(struct socket *sock, struct sockaddr *uaddr,
>> +         int addr_len)
>>   {
>>       struct sockaddr_in *addr = (struct sockaddr_in *)uaddr;
>>       struct sock *sk = sock->sk;
>> @@ -1649,8 +1649,8 @@ static void smc_connect_work(struct work_struct 
>> *work)
>>       release_sock(&smc->sk);
>>   }
>>   -static int smc_connect(struct socket *sock, struct sockaddr *addr,
>> -               int alen, int flags)
>> +int smc_connect(struct socket *sock, struct sockaddr *addr,
>> +        int alen, int flags)
>>   {
>>       struct sock *sk = sock->sk;
>>       struct smc_sock *smc;
>> @@ -2631,7 +2631,7 @@ static void smc_clcsock_data_ready(struct sock 
>> *listen_clcsock)
>>       read_unlock_bh(&listen_clcsock->sk_callback_lock);
>>   }
>>   -static int smc_listen(struct socket *sock, int backlog)
>> +int smc_listen(struct socket *sock, int backlog)
>>   {
>>       struct sock *sk = sock->sk;
>>       struct smc_sock *smc;
>> @@ -2696,8 +2696,8 @@ static int smc_listen(struct socket *sock, int 
>> backlog)
>>       return rc;
>>   }
>>   -static int smc_accept(struct socket *sock, struct socket *new_sock,
>> -              struct proto_accept_arg *arg)
>> +int smc_accept(struct socket *sock, struct socket *new_sock,
>> +           struct proto_accept_arg *arg)
>>   {
>>       struct sock *sk = sock->sk, *nsk;
>>       DECLARE_WAITQUEUE(wait, current);
>> @@ -2766,8 +2766,8 @@ static int smc_accept(struct socket *sock, 
>> struct socket *new_sock,
>>       return rc;
>>   }
>>   -static int smc_getname(struct socket *sock, struct sockaddr *addr,
>> -               int peer)
>> +int smc_getname(struct socket *sock, struct sockaddr *addr,
>> +        int peer)
>>   {
>>       struct smc_sock *smc;
>>   @@ -2780,7 +2780,7 @@ static int smc_getname(struct socket *sock, 
>> struct sockaddr *addr,
>>       return smc->clcsock->ops->getname(smc->clcsock, addr, peer);
>>   }
>>   -static int smc_sendmsg(struct socket *sock, struct msghdr *msg, 
>> size_t len)
>> +int smc_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
>>   {
>>       struct sock *sk = sock->sk;
>>       struct smc_sock *smc;
>> @@ -2818,8 +2818,8 @@ static int smc_sendmsg(struct socket *sock, 
>> struct msghdr *msg, size_t len)
>>       return rc;
>>   }
>>   -static int smc_recvmsg(struct socket *sock, struct msghdr *msg, 
>> size_t len,
>> -               int flags)
>> +int smc_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
>> +        int flags)
>>   {
>>       struct sock *sk = sock->sk;
>>       struct smc_sock *smc;
>> @@ -2868,8 +2868,8 @@ static __poll_t smc_accept_poll(struct sock 
>> *parent)
>>       return mask;
>>   }
>>   -static __poll_t smc_poll(struct file *file, struct socket *sock,
>> -                 poll_table *wait)
>> +__poll_t smc_poll(struct file *file, struct socket *sock,
>> +          poll_table *wait)
>>   {
>>       struct sock *sk = sock->sk;
>>       struct smc_sock *smc;
>> @@ -2921,7 +2921,7 @@ static __poll_t smc_poll(struct file *file, 
>> struct socket *sock,
>>       return mask;
>>   }
>>   -static int smc_shutdown(struct socket *sock, int how)
>> +int smc_shutdown(struct socket *sock, int how)
>>   {
>>       struct sock *sk = sock->sk;
>>       bool do_shutdown = true;
>> @@ -3061,8 +3061,8 @@ static int __smc_setsockopt(struct socket 
>> *sock, int level, int optname,
>>       return rc;
>>   }
>>   -static int smc_setsockopt(struct socket *sock, int level, int 
>> optname,
>> -              sockptr_t optval, unsigned int optlen)
>> +int smc_setsockopt(struct socket *sock, int level, int optname,
>> +           sockptr_t optval, unsigned int optlen)
>>   {
>>       struct sock *sk = sock->sk;
>>       struct smc_sock *smc;
>> @@ -3148,8 +3148,8 @@ static int smc_setsockopt(struct socket *sock, 
>> int level, int optname,
>>       return rc;
>>   }
>>   -static int smc_getsockopt(struct socket *sock, int level, int 
>> optname,
>> -              char __user *optval, int __user *optlen)
>> +int smc_getsockopt(struct socket *sock, int level, int optname,
>> +           char __user *optval, int __user *optlen)
>>   {
>>       struct smc_sock *smc;
>>       int rc;
>> @@ -3174,8 +3174,8 @@ static int smc_getsockopt(struct socket *sock, 
>> int level, int optname,
>>       return rc;
>>   }
>>   -static int smc_ioctl(struct socket *sock, unsigned int cmd,
>> -             unsigned long arg)
>> +int smc_ioctl(struct socket *sock, unsigned int cmd,
>> +          unsigned long arg)
>>   {
>>       union smc_host_cursor cons, urg;
>>       struct smc_connection *conn;
>> @@ -3261,9 +3261,9 @@ static int smc_ioctl(struct socket *sock, 
>> unsigned int cmd,
>>    * Note that subsequent recv() calls have to wait till all splice() 
>> processing
>>    * completed.
>>    */
>> -static ssize_t smc_splice_read(struct socket *sock, loff_t *ppos,
>> -                   struct pipe_inode_info *pipe, size_t len,
>> -                   unsigned int flags)
>> +ssize_t smc_splice_read(struct socket *sock, loff_t *ppos,
>> +            struct pipe_inode_info *pipe, size_t len,
>> +            unsigned int flags)
>>   {
>>       struct sock *sk = sock->sk;
>>       struct smc_sock *smc;
>> diff --git a/net/smc/smc.h b/net/smc/smc.h
>> index 3edec1e..34b781e 100644
>> --- a/net/smc/smc.h
>> +++ b/net/smc/smc.h
>> @@ -34,6 +34,39 @@
>>   extern struct proto smc_proto;
>>   extern struct proto smc_proto6;
>>   +extern struct smc_hashinfo smc_v4_hashinfo;
>> +extern struct smc_hashinfo smc_v6_hashinfo;
>> +
>> +int smc_hash_sk(struct sock *sk);
>> +void smc_unhash_sk(struct sock *sk);
>> +void smc_release_cb(struct sock *sk);
>> +
>> +int smc_release(struct socket *sock);
>> +int smc_bind(struct socket *sock, struct sockaddr *uaddr,
>> +         int addr_len);
>> +int smc_connect(struct socket *sock, struct sockaddr *addr,
>> +        int alen, int flags);
>> +int smc_accept(struct socket *sock, struct socket *new_sock,
>> +           struct proto_accept_arg *arg);
>> +int smc_getname(struct socket *sock, struct sockaddr *addr,
>> +        int peer);
>> +__poll_t smc_poll(struct file *file, struct socket *sock,
>> +          poll_table *wait);
>> +int smc_ioctl(struct socket *sock, unsigned int cmd,
>> +          unsigned long arg);
>> +int smc_listen(struct socket *sock, int backlog);
>> +int smc_shutdown(struct socket *sock, int how);
>> +int smc_setsockopt(struct socket *sock, int level, int optname,
>> +           sockptr_t optval, unsigned int optlen);
>> +int smc_getsockopt(struct socket *sock, int level, int optname,
>> +           char __user *optval, int __user *optlen);
>> +int smc_sendmsg(struct socket *sock, struct msghdr *msg, size_t len);
>> +int smc_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
>> +        int flags);
>> +ssize_t smc_splice_read(struct socket *sock, loff_t *ppos,
>> +            struct pipe_inode_info *pipe, size_t len,
>> +            unsigned int flags);
>> +
>>   /* smc sock initialization */
>>   void smc_sk_init(struct net *net, struct sock *sk, int protocol);
>>   /* clcsock initialization */


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next v4 3/3] net/smc: Introduce IPPROTO_SMC
  2024-05-29 19:55   ` Zhu Yanjun
@ 2024-05-30  2:35     ` D. Wythe
  0 siblings, 0 replies; 16+ messages in thread
From: D. Wythe @ 2024-05-30  2:35 UTC (permalink / raw)
  To: Zhu Yanjun, kgraul, wenjia, jaka, wintera, guwen
  Cc: kuba, davem, netdev, linux-s390, linux-rdma, tonylu, pabeni,
	edumazet



On 5/30/24 3:55 AM, Zhu Yanjun wrote:
> 在 2024/5/29 5:59, D. Wythe 写道:
>> From: "D. Wythe" <alibuda@linux.alibaba.com>
>>
>> This patch allows to create smc socket via AF_INET,
>> similar to the following code,
>>
>> /* create v4 smc sock */
>> v4 = socket(AF_INET, SOCK_STREAM, IPPROTO_SMC);
>>
>> /* create v6 smc sock */
>> v6 = socket(AF_INET6, SOCK_STREAM, IPPROTO_SMC);
>>
>> There are several reasons why we believe it is appropriate here:
>>
>> 1. For smc sockets, it actually use IPv4 (AF-INET) or IPv6 (AF-INET6)
>> address. There is no AF_SMC address at all.
>>
>> 2. Create smc socket in the AF_INET(6) path, which allows us to reuse
>> the infrastructure of AF_INET(6) path, such as common ebpf hooks.
>> Otherwise, smc have to implement it again in AF_SMC path.
>>
>> Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
>> ---
>>   include/uapi/linux/in.h |   2 +
>>   net/smc/Makefile        |   2 +-
>>   net/smc/af_smc.c        |  36 ++++++++++++++++
>>   net/smc/inet_smc.c      | 108 
>> ++++++++++++++++++++++++++++++++++++++++++++++++
>>   net/smc/inet_smc.h      |  34 +++++++++++++++
>>   5 files changed, 181 insertions(+), 1 deletion(-)
>>   create mode 100644 net/smc/inet_smc.c
>>   create mode 100644 net/smc/inet_smc.h
>>
>> diff --git a/include/uapi/linux/in.h b/include/uapi/linux/in.h
>> index e682ab6..0c6322b 100644
>> --- a/include/uapi/linux/in.h
>> +++ b/include/uapi/linux/in.h
>> @@ -83,6 +83,8 @@ enum {
>>   #define IPPROTO_RAW        IPPROTO_RAW
>>     IPPROTO_MPTCP = 262,        /* Multipath TCP connection        */
>>   #define IPPROTO_MPTCP        IPPROTO_MPTCP
>> +  IPPROTO_SMC = 263,        /* Shared Memory Communications        */
>> +#define IPPROTO_SMC        IPPROTO_SMC
>>     IPPROTO_MAX
>>   };
>>   #endif
>> diff --git a/net/smc/Makefile b/net/smc/Makefile
>> index 2c510d54..472b9ee 100644
>> --- a/net/smc/Makefile
>> +++ b/net/smc/Makefile
>> @@ -4,6 +4,6 @@ obj-$(CONFIG_SMC)    += smc.o
>>   obj-$(CONFIG_SMC_DIAG)    += smc_diag.o
>>   smc-y := af_smc.o smc_pnet.o smc_ib.o smc_clc.o smc_core.o smc_wr.o 
>> smc_llc.o
>>   smc-y += smc_cdc.o smc_tx.o smc_rx.o smc_close.o smc_ism.o 
>> smc_netlink.o smc_stats.o
>> -smc-y += smc_tracepoint.o
>> +smc-y += smc_tracepoint.o inet_smc.o
>>   smc-$(CONFIG_SYSCTL) += smc_sysctl.o
>>   smc-$(CONFIG_SMC_LO) += smc_loopback.o
>> diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
>> index 8e3ce76..320624c 100644
>> --- a/net/smc/af_smc.c
>> +++ b/net/smc/af_smc.c
>> @@ -54,6 +54,7 @@
>>   #include "smc_tracepoint.h"
>>   #include "smc_sysctl.h"
>>   #include "smc_loopback.h"
>> +#include "inet_smc.h"
>>     static DEFINE_MUTEX(smc_server_lgr_pending);    /* serialize link 
>> group
>>                            * creation on server
>> @@ -3594,9 +3595,31 @@ static int __init smc_init(void)
>>           goto out_lo;
>>       }
>>   +    rc = proto_register(&smc_inet_prot, 1);
>> +    if (rc) {
>> +        pr_err("%s: proto_register smc_inet_prot fails with %d\n", 
>> __func__, rc);
>> +        goto out_ulp;
>> +    }
>> +    inet_register_protosw(&smc_inet_protosw);
>> +#if IS_ENABLED(CONFIG_IPV6)
>> +    rc = proto_register(&smc_inet6_prot, 1);
>> +    if (rc) {
>> +        pr_err("%s: proto_register smc_inet6_prot fails with %d\n", 
>> __func__, rc);
>> +        goto out_inet_prot;
>> +    }
>> +    inet6_register_protosw(&smc_inet6_protosw);
>> +#endif
>> +
>>       static_branch_enable(&tcp_have_smc);
>>       return 0;
>>   +#if IS_ENABLED(CONFIG_IPV6)
>> +out_inet_prot:
>> +    inet_unregister_protosw(&smc_inet_protosw);
>> +    proto_unregister(&smc_inet_prot);
>> +#endif
>> +out_ulp:
>> +    tcp_unregister_ulp(&smc_ulp_ops);
>>   out_lo:
>>       smc_loopback_exit();
>>   out_ib:
>> @@ -3633,6 +3656,10 @@ static int __init smc_init(void)
>>   static void __exit smc_exit(void)
>>   {
>>       static_branch_disable(&tcp_have_smc);
>> +    inet_unregister_protosw(&smc_inet_protosw);
>> +#if IS_ENABLED(CONFIG_IPV6)
>> +    inet6_unregister_protosw(&smc_inet6_protosw);
>> +#endif
>>       tcp_unregister_ulp(&smc_ulp_ops);
>>       sock_unregister(PF_SMC);
>>       smc_core_exit();
>> @@ -3644,6 +3671,10 @@ static void __exit smc_exit(void)
>>       destroy_workqueue(smc_hs_wq);
>>       proto_unregister(&smc_proto6);
>>       proto_unregister(&smc_proto);
>> +    proto_unregister(&smc_inet_prot);
>> +#if IS_ENABLED(CONFIG_IPV6)
>> +    proto_unregister(&smc_inet6_prot);
>> +#endif
>>       smc_pnet_exit();
>>       smc_nl_exit();
>>       smc_clc_exit();
>> @@ -3660,4 +3691,9 @@ static void __exit smc_exit(void)
>>   MODULE_LICENSE("GPL");
>>   MODULE_ALIAS_NETPROTO(PF_SMC);
>>   MODULE_ALIAS_TCP_ULP("smc");
>> +/* 263 for IPPROTO_SMC and 1 for SOCK_STREAM */
>> +MODULE_ALIAS_NET_PF_PROTO_TYPE(PF_INET, 263, 1);
>> +#if IS_ENABLED(CONFIG_IPV6)
>> +MODULE_ALIAS_NET_PF_PROTO_TYPE(PF_INET6, 263, 1);
>> +#endif
>>   MODULE_ALIAS_GENL_FAMILY(SMC_GENL_FAMILY_NAME);
>> diff --git a/net/smc/inet_smc.c b/net/smc/inet_smc.c
>> new file mode 100644
>> index 00000000..1ba73d7
>> --- /dev/null
>> +++ b/net/smc/inet_smc.c
>> @@ -0,0 +1,108 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/*
>> + *  Shared Memory Communications over RDMA (SMC-R) and RoCE
>> + *
>> + *  Definitions for the IPPROTO_SMC (socket related)
>> + *
>> + *  Copyright IBM Corp. 2016, 2018
>> + *  Copyright (c) 2024, Alibaba Inc.
>> + *
>> + *  Author: D. Wythe <alibuda@linux.alibaba.com>
>> + */
>> +
>> +#include "inet_smc.h"
>> +#include "smc.h"
>> +
>> +struct proto smc_inet_prot = {
>> +    .name        = "INET_SMC",
>> +    .owner        = THIS_MODULE,
>> +    .init        = smc_inet_init_sock,
>> +    .hash        = smc_hash_sk,
>> +    .unhash        = smc_unhash_sk,
>> +    .release_cb    = smc_release_cb,
>> +    .obj_size    = sizeof(struct smc_sock),
>> +    .h.smc_hash    = &smc_v4_hashinfo,
>> +    .slab_flags    = SLAB_TYPESAFE_BY_RCU,
>> +};
>> +
>> +const struct proto_ops smc_inet_stream_ops = {
>> +    .family        = PF_INET,
>> +    .owner        = THIS_MODULE,
>> +    .release    = smc_release,
>> +    .bind        = smc_bind,
>> +    .connect    = smc_connect,
>> +    .socketpair    = sock_no_socketpair,
>> +    .accept        = smc_accept,
>> +    .getname    = smc_getname,
>> +    .poll        = smc_poll,
>> +    .ioctl        = smc_ioctl,
>> +    .listen        = smc_listen,
>> +    .shutdown    = smc_shutdown,
>> +    .setsockopt    = smc_setsockopt,
>> +    .getsockopt    = smc_getsockopt,
>> +    .sendmsg    = smc_sendmsg,
>> +    .recvmsg    = smc_recvmsg,
>> +    .mmap        = sock_no_mmap,
>> +    .splice_read    = smc_splice_read,
>> +};
>> +
>> +struct inet_protosw smc_inet_protosw = {
>> +    .type        = SOCK_STREAM,
>> +    .protocol    = IPPROTO_SMC,
>> +    .prot        = &smc_inet_prot,
>> +    .ops        = &smc_inet_stream_ops,
>> +    .flags        = INET_PROTOSW_ICSK,
>> +};
>> +
>> +#if IS_ENABLED(CONFIG_IPV6)
>> +struct proto smc_inet6_prot = {
>> +    .name        = "INET6_SMC",
>> +    .owner        = THIS_MODULE,
>> +    .init        = smc_inet_init_sock,
>> +    .hash        = smc_hash_sk,
>> +    .unhash        = smc_unhash_sk,
>> +    .release_cb    = smc_release_cb,
>> +    .obj_size    = sizeof(struct smc_sock),
>> +    .h.smc_hash    = &smc_v6_hashinfo,
>> +    .slab_flags    = SLAB_TYPESAFE_BY_RCU,
>> +};
>> +
>> +const struct proto_ops smc_inet6_stream_ops = {
>> +    .family        = PF_INET6,
>> +    .owner        = THIS_MODULE,
>> +    .release    = smc_release,
>> +    .bind        = smc_bind,
>> +    .connect    = smc_connect,
>> +    .socketpair    = sock_no_socketpair,
>> +    .accept        = smc_accept,
>> +    .getname    = smc_getname,
>> +    .poll        = smc_poll,
>> +    .ioctl        = smc_ioctl,
>> +    .listen        = smc_listen,
>> +    .shutdown    = smc_shutdown,
>> +    .setsockopt    = smc_setsockopt,
>> +    .getsockopt    = smc_getsockopt,
>> +    .sendmsg    = smc_sendmsg,
>> +    .recvmsg    = smc_recvmsg,
>> +    .mmap        = sock_no_mmap,
>> +    .splice_read    = smc_splice_read,
>> +};
>> +
>> +struct inet_protosw smc_inet6_protosw = {
>> +    .type        = SOCK_STREAM,
>> +    .protocol    = IPPROTO_SMC,
>> +    .prot        = &smc_inet6_prot,
>> +    .ops        = &smc_inet6_stream_ops,
>> +    .flags        = INET_PROTOSW_ICSK,
>> +};
>> +#endif
>> +
>> +int smc_inet_init_sock(struct sock *sk)
>> +{
>> +    struct net *net = sock_net(sk);
>> +
>> +    /* init common smc sock */
>> +    smc_sk_init(net, sk, IPPROTO_SMC);
>> +    /* create clcsock */
>> +    return smc_create_clcsk(net, sk, sk->sk_family);
>> +}
>> diff --git a/net/smc/inet_smc.h b/net/smc/inet_smc.h
>> new file mode 100644
>> index 00000000..c55345d
>> --- /dev/null
>> +++ b/net/smc/inet_smc.h
>> @@ -0,0 +1,34 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +/*
>> + *  Shared Memory Communications over RDMA (SMC-R) and RoCE
>> + *
>> + *  Definitions for the IPPROTO_SMC (socket related)
>> +
>> + *  Copyright IBM Corp. 2016
>> + *  Copyright (c) 2024, Alibaba Inc.
>> + *
>> + *  Author: D. Wythe <alibuda@linux.alibaba.com>
>> + */
>> +#ifndef __INET_SMC
>> +#define __INET_SMC
>> +
>> +#include <net/protocol.h>
>> +#include <net/sock.h>
>> +#include <net/tcp.h>
>> +
>> +extern struct proto smc_inet_prot;
>> +extern const struct proto_ops smc_inet_stream_ops;
>> +extern struct inet_protosw smc_inet_protosw;
>> +
>> +#if IS_ENABLED(CONFIG_IPV6)
>> +#include <net/ipv6.h>
>> +/* MUST after net/tcp.h or warning */
>> +#include <net/transp_v6.h>
>> +extern struct proto smc_inet6_prot;
>> +extern const struct proto_ops smc_inet6_stream_ops;
>> +extern struct inet_protosw smc_inet6_protosw;
>> +#endif
>
> If we append /* CONFIG_IPV6 */ to #endif to indicate that it is the 
> end of CONFIG_IPV6, it is a good habit. When browsing the source code, 
> it is easy for us to know that it is the end of CONFIG_IPV6.
> Just my 2 cent suggestions. It is a trivial problem. You can ignore it.
> But if you fix it, it can make the source code more readable.
>
> Zhu Yanjun

I really like the style you said, I will use it in the next version.

Best wishes,
D. Wythe

>
>> +
>> +int smc_inet_init_sock(struct sock *sk);
>> +
>> +#endif /* __INET_SMC */


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next v4 3/3] net/smc: Introduce IPPROTO_SMC
  2024-05-29 11:58   ` Wenjia Zhang
@ 2024-05-30  2:51     ` D. Wythe
  0 siblings, 0 replies; 16+ messages in thread
From: D. Wythe @ 2024-05-30  2:51 UTC (permalink / raw)
  To: Wenjia Zhang, kgraul, jaka, wintera, guwen
  Cc: kuba, davem, netdev, linux-s390, linux-rdma, tonylu, pabeni,
	edumazet



On 5/29/24 7:58 PM, Wenjia Zhang wrote:
>
>
> On 29.05.24 05:59, D. Wythe wrote:
>> From: "D. Wythe" <alibuda@linux.alibaba.com>
>>
>> This patch allows to create smc socket via AF_INET,
>> similar to the following code,
>>
>> /* create v4 smc sock */
>> v4 = socket(AF_INET, SOCK_STREAM, IPPROTO_SMC);
>>
>> /* create v6 smc sock */
>> v6 = socket(AF_INET6, SOCK_STREAM, IPPROTO_SMC);
>>
>> There are several reasons why we believe it is appropriate here:
>>
>> 1. For smc sockets, it actually use IPv4 (AF-INET) or IPv6 (AF-INET6)
>> address. There is no AF_SMC address at all.
>>
>> 2. Create smc socket in the AF_INET(6) path, which allows us to reuse
>> the infrastructure of AF_INET(6) path, such as common ebpf hooks.
>> Otherwise, smc have to implement it again in AF_SMC path.
>>
>> Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
>> ---
>>   include/uapi/linux/in.h |   2 +
>>   net/smc/Makefile        |   2 +-
>>   net/smc/af_smc.c        |  36 ++++++++++++++++
>>   net/smc/inet_smc.c      | 108 
>> ++++++++++++++++++++++++++++++++++++++++++++++++
>>   net/smc/inet_smc.h      |  34 +++++++++++++++
>>   5 files changed, 181 insertions(+), 1 deletion(-)
>>   create mode 100644 net/smc/inet_smc.c
>>   create mode 100644 net/smc/inet_smc.h
>>
>> diff --git a/include/uapi/linux/in.h b/include/uapi/linux/in.h
>> index e682ab6..0c6322b 100644
>> --- a/include/uapi/linux/in.h
>> +++ b/include/uapi/linux/in.h
>> @@ -83,6 +83,8 @@ enum {
>>   #define IPPROTO_RAW        IPPROTO_RAW
>>     IPPROTO_MPTCP = 262,        /* Multipath TCP connection        */
>>   #define IPPROTO_MPTCP        IPPROTO_MPTCP
>> +  IPPROTO_SMC = 263,        /* Shared Memory Communications        */
>> +#define IPPROTO_SMC        IPPROTO_SMC
>>     IPPROTO_MAX
>>   };
>>   #endif
>> diff --git a/net/smc/Makefile b/net/smc/Makefile
>> index 2c510d54..472b9ee 100644
>> --- a/net/smc/Makefile
>> +++ b/net/smc/Makefile
>> @@ -4,6 +4,6 @@ obj-$(CONFIG_SMC)    += smc.o
>>   obj-$(CONFIG_SMC_DIAG)    += smc_diag.o
>>   smc-y := af_smc.o smc_pnet.o smc_ib.o smc_clc.o smc_core.o smc_wr.o 
>> smc_llc.o
>>   smc-y += smc_cdc.o smc_tx.o smc_rx.o smc_close.o smc_ism.o 
>> smc_netlink.o smc_stats.o
>> -smc-y += smc_tracepoint.o
>> +smc-y += smc_tracepoint.o inet_smc.o
>>   smc-$(CONFIG_SYSCTL) += smc_sysctl.o
>>   smc-$(CONFIG_SMC_LO) += smc_loopback.o
>> diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
>> index 8e3ce76..320624c 100644
>> --- a/net/smc/af_smc.c
>> +++ b/net/smc/af_smc.c
>> @@ -54,6 +54,7 @@
>>   #include "smc_tracepoint.h"
>>   #include "smc_sysctl.h"
>>   #include "smc_loopback.h"
>> +#include "inet_smc.h"
>>     static DEFINE_MUTEX(smc_server_lgr_pending);    /* serialize link 
>> group
>>                            * creation on server
>> @@ -3594,9 +3595,31 @@ static int __init smc_init(void)
>>           goto out_lo;
>>       }
>>   +    rc = proto_register(&smc_inet_prot, 1);
>> +    if (rc) {
>> +        pr_err("%s: proto_register smc_inet_prot fails with %d\n", 
>> __func__, rc);
>> +        goto out_ulp;
>> +    }
>> +    inet_register_protosw(&smc_inet_protosw);
>> +#if IS_ENABLED(CONFIG_IPV6)
>> +    rc = proto_register(&smc_inet6_prot, 1);
>> +    if (rc) {
>> +        pr_err("%s: proto_register smc_inet6_prot fails with %d\n", 
>> __func__, rc);
>> +        goto out_inet_prot;
>> +    }
>> +    inet6_register_protosw(&smc_inet6_protosw);
>
> Comparing to inet_register_protosw(), the inet6_register_protosw() 
> returns an integer. Thus, making error check and direct corresponding 
> housekeeping here looks IMO much cleaner.
>
Oops... I was under the impression that it had no return. In the prior 
RFC, I even commented that it had no return. Quite the oversight on my part.

>> +#endif
>> +
>>       static_branch_enable(&tcp_have_smc);
>>       return 0;
>>   +#if IS_ENABLED(CONFIG_IPV6)
>> +out_inet_prot:
>> +    inet_unregister_protosw(&smc_inet_protosw);
>> +    proto_unregister(&smc_inet_prot);
>> +#endif
>> +out_ulp:
>> +    tcp_unregister_ulp(&smc_ulp_ops);
>>   out_lo:
>>       smc_loopback_exit();
>>   out_ib:
>> @@ -3633,6 +3656,10 @@ static int __init smc_init(void)
>>   static void __exit smc_exit(void)
>>   {
>>       static_branch_disable(&tcp_have_smc);
>> +    inet_unregister_protosw(&smc_inet_protosw);
>> +#if IS_ENABLED(CONFIG_IPV6)
>> +    inet6_unregister_protosw(&smc_inet6_protosw);
>> +#endif
>>       tcp_unregister_ulp(&smc_ulp_ops);
>>       sock_unregister(PF_SMC);
>>       smc_core_exit();
>> @@ -3644,6 +3671,10 @@ static void __exit smc_exit(void)
>>       destroy_workqueue(smc_hs_wq);
>>       proto_unregister(&smc_proto6);
>>       proto_unregister(&smc_proto);
>> +    proto_unregister(&smc_inet_prot);
>> +#if IS_ENABLED(CONFIG_IPV6)
>> +    proto_unregister(&smc_inet6_prot);
>> +#end
>
> Since there is already inet_smc.c, I'd recommend to group these 
> register and unregister stuff respectively in functions like e.g. 
> smc_inet_init() and smc_inet_exit() in inet_smc.c
>

Agreed, I also see similar opinions from the community, and I will 
improve it in the next version.


>>       smc_pnet_exit();
>>       smc_nl_exit();
>>       smc_clc_exit();
>> @@ -3660,4 +3691,9 @@ static void __exit smc_exit(void)
>>   MODULE_LICENSE("GPL");
>>   MODULE_ALIAS_NETPROTO(PF_SMC);
>>   MODULE_ALIAS_TCP_ULP("smc");
>> +/* 263 for IPPROTO_SMC and 1 for SOCK_STREAM */
>> +MODULE_ALIAS_NET_PF_PROTO_TYPE(PF_INET, 263, 1);
>> +#if IS_ENABLED(CONFIG_IPV6)
>> +MODULE_ALIAS_NET_PF_PROTO_TYPE(PF_INET6, 263, 1);
>> +#endif
>>   MODULE_ALIAS_GENL_FAMILY(SMC_GENL_FAMILY_NAME);
>> diff --git a/net/smc/inet_smc.c b/net/smc/inet_smc.c
>> new file mode 100644
>> index 00000000..1ba73d7
>> --- /dev/null
>> +++ b/net/smc/inet_smc.c
>
> In order to keep the consistency with the structure and function names 
> in the files, I'm wondering why not to use smc_inet.h and smc_inet.c
> instead of inet_smc.h and inet_smc.c respectively

That's because I am trying to keep the name style to be consistent with 
af_smc.c. But I don't insist on this, using smc_inet
is also good for me.

Thanks,
D. Wythe

>
>> @@ -0,0 +1,108 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/*
>> + *  Shared Memory Communications over RDMA (SMC-R) and RoCE
>> + *
>> + *  Definitions for the IPPROTO_SMC (socket related)
>> + *
>> + *  Copyright IBM Corp. 2016, 2018
>> + *  Copyright (c) 2024, Alibaba Inc.
>> + *
>> + *  Author: D. Wythe <alibuda@linux.alibaba.com>
>> + */
>> +
>> +#include "inet_smc.h"
>> +#include "smc.h"
>> +
>> +struct proto smc_inet_prot = {
>> +    .name        = "INET_SMC",
>> +    .owner        = THIS_MODULE,
>> +    .init        = smc_inet_init_sock,
>> +    .hash        = smc_hash_sk,
>> +    .unhash        = smc_unhash_sk,
>> +    .release_cb    = smc_release_cb,
>> +    .obj_size    = sizeof(struct smc_sock),
>> +    .h.smc_hash    = &smc_v4_hashinfo,
>> +    .slab_flags    = SLAB_TYPESAFE_BY_RCU,
>> +};
>> +
>> +const struct proto_ops smc_inet_stream_ops = {
>> +    .family        = PF_INET,
>> +    .owner        = THIS_MODULE,
>> +    .release    = smc_release,
>> +    .bind        = smc_bind,
>> +    .connect    = smc_connect,
>> +    .socketpair    = sock_no_socketpair,
>> +    .accept        = smc_accept,
>> +    .getname    = smc_getname,
>> +    .poll        = smc_poll,
>> +    .ioctl        = smc_ioctl,
>> +    .listen        = smc_listen,
>> +    .shutdown    = smc_shutdown,
>> +    .setsockopt    = smc_setsockopt,
>> +    .getsockopt    = smc_getsockopt,
>> +    .sendmsg    = smc_sendmsg,
>> +    .recvmsg    = smc_recvmsg,
>> +    .mmap        = sock_no_mmap,
>> +    .splice_read    = smc_splice_read,
>> +};
>> +
>> +struct inet_protosw smc_inet_protosw = {
>> +    .type        = SOCK_STREAM,
>> +    .protocol    = IPPROTO_SMC,
>> +    .prot        = &smc_inet_prot,
>> +    .ops        = &smc_inet_stream_ops,
>> +    .flags        = INET_PROTOSW_ICSK,
>> +};
>> +
>> +#if IS_ENABLED(CONFIG_IPV6)
>> +struct proto smc_inet6_prot = {
>> +    .name        = "INET6_SMC",
>> +    .owner        = THIS_MODULE,
>> +    .init        = smc_inet_init_sock,
>> +    .hash        = smc_hash_sk,
>> +    .unhash        = smc_unhash_sk,
>> +    .release_cb    = smc_release_cb,
>> +    .obj_size    = sizeof(struct smc_sock),
>> +    .h.smc_hash    = &smc_v6_hashinfo,
>> +    .slab_flags    = SLAB_TYPESAFE_BY_RCU,
>> +};
>> +
>> +const struct proto_ops smc_inet6_stream_ops = {
>> +    .family        = PF_INET6,
>> +    .owner        = THIS_MODULE,
>> +    .release    = smc_release,
>> +    .bind        = smc_bind,
>> +    .connect    = smc_connect,
>> +    .socketpair    = sock_no_socketpair,
>> +    .accept        = smc_accept,
>> +    .getname    = smc_getname,
>> +    .poll        = smc_poll,
>> +    .ioctl        = smc_ioctl,
>> +    .listen        = smc_listen,
>> +    .shutdown    = smc_shutdown,
>> +    .setsockopt    = smc_setsockopt,
>> +    .getsockopt    = smc_getsockopt,
>> +    .sendmsg    = smc_sendmsg,
>> +    .recvmsg    = smc_recvmsg,
>> +    .mmap        = sock_no_mmap,
>> +    .splice_read    = smc_splice_read,
>> +};
>> +
>> +struct inet_protosw smc_inet6_protosw = {
>> +    .type        = SOCK_STREAM,
>> +    .protocol    = IPPROTO_SMC,
>> +    .prot        = &smc_inet6_prot,
>> +    .ops        = &smc_inet6_stream_ops,
>> +    .flags        = INET_PROTOSW_ICSK,
>> +};
>> +#endif
>> +
>> +int smc_inet_init_sock(struct sock *sk)
>> +{
>> +    struct net *net = sock_net(sk);
>> +
>> +    /* init common smc sock */
>> +    smc_sk_init(net, sk, IPPROTO_SMC);
>> +    /* create clcsock */
>> +    return smc_create_clcsk(net, sk, sk->sk_family);
>> +}
>> diff --git a/net/smc/inet_smc.h b/net/smc/inet_smc.h
>> new file mode 100644
>> index 00000000..c55345d
>> --- /dev/null
>> +++ b/net/smc/inet_smc.h
>> @@ -0,0 +1,34 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +/*
>> + *  Shared Memory Communications over RDMA (SMC-R) and RoCE
>> + *
>> + *  Definitions for the IPPROTO_SMC (socket related)
>> +
>> + *  Copyright IBM Corp. 2016
>> + *  Copyright (c) 2024, Alibaba Inc.
>> + *
>> + *  Author: D. Wythe <alibuda@linux.alibaba.com>
>> + */
>> +#ifndef __INET_SMC
>> +#define __INET_SMC
>> +
>> +#include <net/protocol.h>
>> +#include <net/sock.h>
>> +#include <net/tcp.h>
>> +
>> +extern struct proto smc_inet_prot;
>> +extern const struct proto_ops smc_inet_stream_ops;
>> +extern struct inet_protosw smc_inet_protosw;
>> +
>> +#if IS_ENABLED(CONFIG_IPV6)
>> +#include <net/ipv6.h>
>> +/* MUST after net/tcp.h or warning */
>> +#include <net/transp_v6.h>
>> +extern struct proto smc_inet6_prot;
>> +extern const struct proto_ops smc_inet6_stream_ops;
>> +extern struct inet_protosw smc_inet6_protosw;
>> +#endif
>> +
>> +int smc_inet_init_sock(struct sock *sk);
>> +
>> +#endif /* __INET_SMC */


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next v4 3/3] net/smc: Introduce IPPROTO_SMC
  2024-05-29 11:12   ` Dust Li
@ 2024-05-30  3:11     ` D. Wythe
  0 siblings, 0 replies; 16+ messages in thread
From: D. Wythe @ 2024-05-30  3:11 UTC (permalink / raw)
  To: dust.li, kgraul, wenjia, jaka, wintera, guwen
  Cc: kuba, davem, netdev, linux-s390, linux-rdma, tonylu, pabeni,
	edumazet



On 5/29/24 7:12 PM, Dust Li wrote:
> On 2024-05-29 11:59:07, D. Wythe wrote:
>> From: "D. Wythe" <alibuda@linux.alibaba.com>
>>
>> This patch allows to create smc socket via AF_INET,
>> similar to the following code,
>>
>> /* create v4 smc sock */
>> v4 = socket(AF_INET, SOCK_STREAM, IPPROTO_SMC);
>>
>> /* create v6 smc sock */
>> v6 = socket(AF_INET6, SOCK_STREAM, IPPROTO_SMC);
>>
>> There are several reasons why we believe it is appropriate here:
>>
>> 1. For smc sockets, it actually use IPv4 (AF-INET) or IPv6 (AF-INET6)
>> address. There is no AF_SMC address at all.
>>
>> 2. Create smc socket in the AF_INET(6) path, which allows us to reuse
>> the infrastructure of AF_INET(6) path, such as common ebpf hooks.
>> Otherwise, smc have to implement it again in AF_SMC path.
>>
>> Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
>> ---
>> include/uapi/linux/in.h |   2 +
>> net/smc/Makefile        |   2 +-
>> net/smc/af_smc.c        |  36 ++++++++++++++++
>> net/smc/inet_smc.c      | 108 ++++++++++++++++++++++++++++++++++++++++++++++++
>> net/smc/inet_smc.h      |  34 +++++++++++++++
>> 5 files changed, 181 insertions(+), 1 deletion(-)
>> create mode 100644 net/smc/inet_smc.c
>> create mode 100644 net/smc/inet_smc.h
>>
>> diff --git a/include/uapi/linux/in.h b/include/uapi/linux/in.h
>> index e682ab6..0c6322b 100644
>> --- a/include/uapi/linux/in.h
>> +++ b/include/uapi/linux/in.h
>> @@ -83,6 +83,8 @@ enum {
>> #define IPPROTO_RAW		IPPROTO_RAW
>>    IPPROTO_MPTCP = 262,		/* Multipath TCP connection		*/
>> #define IPPROTO_MPTCP		IPPROTO_MPTCP
>> +  IPPROTO_SMC = 263,		/* Shared Memory Communications		*/
>> +#define IPPROTO_SMC		IPPROTO_SMC
>>    IPPROTO_MAX
>> };
>> #endif
>> diff --git a/net/smc/Makefile b/net/smc/Makefile
>> index 2c510d54..472b9ee 100644
>> --- a/net/smc/Makefile
>> +++ b/net/smc/Makefile
>> @@ -4,6 +4,6 @@ obj-$(CONFIG_SMC)	+= smc.o
>> obj-$(CONFIG_SMC_DIAG)	+= smc_diag.o
>> smc-y := af_smc.o smc_pnet.o smc_ib.o smc_clc.o smc_core.o smc_wr.o smc_llc.o
>> smc-y += smc_cdc.o smc_tx.o smc_rx.o smc_close.o smc_ism.o smc_netlink.o smc_stats.o
>> -smc-y += smc_tracepoint.o
>> +smc-y += smc_tracepoint.o inet_smc.o
>> smc-$(CONFIG_SYSCTL) += smc_sysctl.o
>> smc-$(CONFIG_SMC_LO) += smc_loopback.o
>> diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
>> index 8e3ce76..320624c 100644
>> --- a/net/smc/af_smc.c
>> +++ b/net/smc/af_smc.c
>> @@ -54,6 +54,7 @@
>> #include "smc_tracepoint.h"
>> #include "smc_sysctl.h"
>> #include "smc_loopback.h"
>> +#include "inet_smc.h"
>>
>> static DEFINE_MUTEX(smc_server_lgr_pending);	/* serialize link group
>> 						 * creation on server
>> @@ -3594,9 +3595,31 @@ static int __init smc_init(void)
>> 		goto out_lo;
>> 	}
>>
>> +	rc = proto_register(&smc_inet_prot, 1);
>> +	if (rc) {
>> +		pr_err("%s: proto_register smc_inet_prot fails with %d\n", __func__, rc);
>> +		goto out_ulp;
>> +	}
>> +	inet_register_protosw(&smc_inet_protosw);
>> +#if IS_ENABLED(CONFIG_IPV6)
>> +	rc = proto_register(&smc_inet6_prot, 1);
>> +	if (rc) {
>> +		pr_err("%s: proto_register smc_inet6_prot fails with %d\n", __func__, rc);
>> +		goto out_inet_prot;
>> +	}
>> +	inet6_register_protosw(&smc_inet6_protosw);
>> +#endif
>> +
> What do you think of moving all those inet initialization code into
> something like smc_inet_init() and move it to smc_inet.c ?
>
Agreed.

>> 	static_branch_enable(&tcp_have_smc);
>> 	return 0;
>>
>> +#if IS_ENABLED(CONFIG_IPV6)
>> +out_inet_prot:
>> +	inet_unregister_protosw(&smc_inet_protosw);
>> +	proto_unregister(&smc_inet_prot);
>> +#endif
>> +out_ulp:
>> +	tcp_unregister_ulp(&smc_ulp_ops);
>> out_lo:
>> 	smc_loopback_exit();
>> out_ib:
>> @@ -3633,6 +3656,10 @@ static int __init smc_init(void)
>> static void __exit smc_exit(void)
>> {
>> 	static_branch_disable(&tcp_have_smc);
>> +	inet_unregister_protosw(&smc_inet_protosw);
>> +#if IS_ENABLED(CONFIG_IPV6)
>> +	inet6_unregister_protosw(&smc_inet6_protosw);
>> +#endif
>> 	tcp_unregister_ulp(&smc_ulp_ops);
>> 	sock_unregister(PF_SMC);
>> 	smc_core_exit();
>> @@ -3644,6 +3671,10 @@ static void __exit smc_exit(void)
>> 	destroy_workqueue(smc_hs_wq);
>> 	proto_unregister(&smc_proto6);
>> 	proto_unregister(&smc_proto);
>> +	proto_unregister(&smc_inet_prot);
>> +#if IS_ENABLED(CONFIG_IPV6)
>> +	proto_unregister(&smc_inet6_prot);
>> +#endif
>> 	smc_pnet_exit();
>> 	smc_nl_exit();
>> 	smc_clc_exit();
>> @@ -3660,4 +3691,9 @@ static void __exit smc_exit(void)
>> MODULE_LICENSE("GPL");
>> MODULE_ALIAS_NETPROTO(PF_SMC);
>> MODULE_ALIAS_TCP_ULP("smc");
>> +/* 263 for IPPROTO_SMC and 1 for SOCK_STREAM */
>> +MODULE_ALIAS_NET_PF_PROTO_TYPE(PF_INET, 263, 1);
>> +#if IS_ENABLED(CONFIG_IPV6)
>> +MODULE_ALIAS_NET_PF_PROTO_TYPE(PF_INET6, 263, 1);
>> +#endif
>> MODULE_ALIAS_GENL_FAMILY(SMC_GENL_FAMILY_NAME);
>> diff --git a/net/smc/inet_smc.c b/net/smc/inet_smc.c
>> new file mode 100644
>> index 00000000..1ba73d7
>> --- /dev/null
>> +++ b/net/smc/inet_smc.c
>> @@ -0,0 +1,108 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/*
>> + *  Shared Memory Communications over RDMA (SMC-R) and RoCE
>> + *
>> + *  Definitions for the IPPROTO_SMC (socket related)
>> + *
>> + *  Copyright IBM Corp. 2016, 2018
>> + *  Copyright (c) 2024, Alibaba Inc.
>> + *
>> + *  Author: D. Wythe <alibuda@linux.alibaba.com>
>> + */
>> +
>> +#include "inet_smc.h"
>> +#include "smc.h"
>> +
>> +struct proto smc_inet_prot = {
>> +	.name		= "INET_SMC",
>> +	.owner		= THIS_MODULE,
>> +	.init		= smc_inet_init_sock,
>> +	.hash		= smc_hash_sk,
>> +	.unhash		= smc_unhash_sk,
>> +	.release_cb	= smc_release_cb,
>> +	.obj_size	= sizeof(struct smc_sock),
>> +	.h.smc_hash	= &smc_v4_hashinfo,
>> +	.slab_flags	= SLAB_TYPESAFE_BY_RCU,
>> +};
>> +
>> +const struct proto_ops smc_inet_stream_ops = {
>> +	.family		= PF_INET,
>> +	.owner		= THIS_MODULE,
>> +	.release	= smc_release,
>> +	.bind		= smc_bind,
>> +	.connect	= smc_connect,
>> +	.socketpair	= sock_no_socketpair,
>> +	.accept		= smc_accept,
>> +	.getname	= smc_getname,
>> +	.poll		= smc_poll,
>> +	.ioctl		= smc_ioctl,
>> +	.listen		= smc_listen,
>> +	.shutdown	= smc_shutdown,
>> +	.setsockopt	= smc_setsockopt,
>> +	.getsockopt	= smc_getsockopt,
>> +	.sendmsg	= smc_sendmsg,
>> +	.recvmsg	= smc_recvmsg,
>> +	.mmap		= sock_no_mmap,
>> +	.splice_read	= smc_splice_read,
>> +};
>> +
>> +struct inet_protosw smc_inet_protosw = {
>> +	.type		= SOCK_STREAM,
>> +	.protocol	= IPPROTO_SMC,
>> +	.prot		= &smc_inet_prot,
>> +	.ops		= &smc_inet_stream_ops,
>> +	.flags		= INET_PROTOSW_ICSK,
>> +};
>> +
>> +#if IS_ENABLED(CONFIG_IPV6)
>> +struct proto smc_inet6_prot = {
>> +	.name		= "INET6_SMC",
>> +	.owner		= THIS_MODULE,
>> +	.init		= smc_inet_init_sock,
>> +	.hash		= smc_hash_sk,
>> +	.unhash		= smc_unhash_sk,
>> +	.release_cb	= smc_release_cb,
>> +	.obj_size	= sizeof(struct smc_sock),
>> +	.h.smc_hash	= &smc_v6_hashinfo,
>> +	.slab_flags	= SLAB_TYPESAFE_BY_RCU,
>> +};
>> +
>> +const struct proto_ops smc_inet6_stream_ops = {
>> +	.family		= PF_INET6,
>> +	.owner		= THIS_MODULE,
>> +	.release	= smc_release,
>> +	.bind		= smc_bind,
>> +	.connect	= smc_connect,
>> +	.socketpair	= sock_no_socketpair,
>> +	.accept		= smc_accept,
>> +	.getname	= smc_getname,
>> +	.poll		= smc_poll,
>> +	.ioctl		= smc_ioctl,
>> +	.listen		= smc_listen,
>> +	.shutdown	= smc_shutdown,
>> +	.setsockopt	= smc_setsockopt,
>> +	.getsockopt	= smc_getsockopt,
>> +	.sendmsg	= smc_sendmsg,
>> +	.recvmsg	= smc_recvmsg,
>> +	.mmap		= sock_no_mmap,
>> +	.splice_read	= smc_splice_read,
>> +};
>> +
>> +struct inet_protosw smc_inet6_protosw = {
>> +	.type		= SOCK_STREAM,
>> +	.protocol	= IPPROTO_SMC,
>> +	.prot		= &smc_inet6_prot,
>> +	.ops		= &smc_inet6_stream_ops,
>> +	.flags		= INET_PROTOSW_ICSK,
>> +};
>> +#endif
>> +
>> +int smc_inet_init_sock(struct sock *sk)
>> +{
>> +	struct net *net = sock_net(sk);
>> +
>> +	/* init common smc sock */
>> +	smc_sk_init(net, sk, IPPROTO_SMC);
>> +	/* create clcsock */
>> +	return smc_create_clcsk(net, sk, sk->sk_family);
>> +}
>> diff --git a/net/smc/inet_smc.h b/net/smc/inet_smc.h
>> new file mode 100644
>> index 00000000..c55345d
>> --- /dev/null
>> +++ b/net/smc/inet_smc.h
>> @@ -0,0 +1,34 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +/*
>> + *  Shared Memory Communications over RDMA (SMC-R) and RoCE
>> + *
>> + *  Definitions for the IPPROTO_SMC (socket related)
>> +
>> + *  Copyright IBM Corp. 2016
>> + *  Copyright (c) 2024, Alibaba Inc.
>> + *
>> + *  Author: D. Wythe <alibuda@linux.alibaba.com>
>> + */
>> +#ifndef __INET_SMC
>> +#define __INET_SMC
>> +
>> +#include <net/protocol.h>
>> +#include <net/sock.h>
>> +#include <net/tcp.h>
> Why not put those 'include's in the .c file ?

Agreed.  But I think that  <net/protocol. h> is needed to ensure that 
the header file itself is complete.

>> +
>> +extern struct proto smc_inet_prot;
>> +extern const struct proto_ops smc_inet_stream_ops;
>> +extern struct inet_protosw smc_inet_protosw;
>> +
>> +#if IS_ENABLED(CONFIG_IPV6)
>> +#include <net/ipv6.h>
>> +/* MUST after net/tcp.h or warning */
>> +#include <net/transp_v6.h>
>> +extern struct proto smc_inet6_prot;
>> +extern const struct proto_ops smc_inet6_stream_ops;
>> +extern struct inet_protosw smc_inet6_protosw;
>> +#endif
>> +
>> +int smc_inet_init_sock(struct sock *sk);
> Seems smc_inet_init_sock() is only used in smc_inet.c,
> why not defined it as a static function ?
>
> Best regards,
> Dust

That's true, I will fix it.


Best wishes,
D. Wythe

>> +
>> +#endif /* __INET_SMC */
>> -- 
>> 1.8.3.1
>>


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next v4 3/3] net/smc: Introduce IPPROTO_SMC
  2024-05-29  3:59 ` [PATCH net-next v4 3/3] net/smc: Introduce IPPROTO_SMC D. Wythe
                     ` (2 preceding siblings ...)
  2024-05-29 19:55   ` Zhu Yanjun
@ 2024-06-01 13:06   ` Simon Horman
  2024-06-03  2:57     ` D. Wythe
  3 siblings, 1 reply; 16+ messages in thread
From: Simon Horman @ 2024-06-01 13:06 UTC (permalink / raw)
  To: D. Wythe
  Cc: kgraul, wenjia, jaka, wintera, guwen, kuba, davem, netdev,
	linux-s390, linux-rdma, tonylu, pabeni, edumazet

On Wed, May 29, 2024 at 11:59:07AM +0800, D. Wythe wrote:
> From: "D. Wythe" <alibuda@linux.alibaba.com>
> 
> This patch allows to create smc socket via AF_INET,
> similar to the following code,
> 
> /* create v4 smc sock */
> v4 = socket(AF_INET, SOCK_STREAM, IPPROTO_SMC);
> 
> /* create v6 smc sock */
> v6 = socket(AF_INET6, SOCK_STREAM, IPPROTO_SMC);
> 
> There are several reasons why we believe it is appropriate here:
> 
> 1. For smc sockets, it actually use IPv4 (AF-INET) or IPv6 (AF-INET6)
> address. There is no AF_SMC address at all.
> 
> 2. Create smc socket in the AF_INET(6) path, which allows us to reuse
> the infrastructure of AF_INET(6) path, such as common ebpf hooks.
> Otherwise, smc have to implement it again in AF_SMC path.
> 
> Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>

...

> diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c

...

> @@ -3594,9 +3595,31 @@ static int __init smc_init(void)
>  		goto out_lo;
>  	}
>  
> +	rc = proto_register(&smc_inet_prot, 1);
> +	if (rc) {
> +		pr_err("%s: proto_register smc_inet_prot fails with %d\n", __func__, rc);

Hi,

FWIIW, my feeling is that if a log message includes __func__ then it should
be a debug level message, and even then I'm dubious about the value of
__func__: we do have many tools including dynamic tracing or pinpointing
problems.

So I would suggest rephrasing this message and dropping __func__.
Or maybe removing it entirely.
Or if not, lowering the priority of this message to debug.

If for some reason __func__ remains, please do consider wrapping
the line to 80c columns or less, which can be trivially done here
(please don't split the format string in any case).

Flagged by checkpatch.pl --max-line-length=80

> +		goto out_ulp;
> +	}
> +	inet_register_protosw(&smc_inet_protosw);
> +#if IS_ENABLED(CONFIG_IPV6)
> +	rc = proto_register(&smc_inet6_prot, 1);
> +	if (rc) {
> +		pr_err("%s: proto_register smc_inet6_prot fails with %d\n", __func__, rc);

Here too.

> +		goto out_inet_prot;
> +	}
> +	inet6_register_protosw(&smc_inet6_protosw);
> +#endif

...

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next v4 3/3] net/smc: Introduce IPPROTO_SMC
  2024-06-01 13:06   ` Simon Horman
@ 2024-06-03  2:57     ` D. Wythe
  2024-06-03  7:47       ` Simon Horman
  0 siblings, 1 reply; 16+ messages in thread
From: D. Wythe @ 2024-06-03  2:57 UTC (permalink / raw)
  To: Simon Horman
  Cc: kgraul, wenjia, jaka, wintera, guwen, kuba, davem, netdev,
	linux-s390, linux-rdma, tonylu, pabeni, edumazet



On 6/1/24 9:06 PM, Simon Horman wrote:
> On Wed, May 29, 2024 at 11:59:07AM +0800, D. Wythe wrote:
>> From: "D. Wythe" <alibuda@linux.alibaba.com>
>>
>> This patch allows to create smc socket via AF_INET,
>> similar to the following code,
>>
>> /* create v4 smc sock */
>> v4 = socket(AF_INET, SOCK_STREAM, IPPROTO_SMC);
>>
>> /* create v6 smc sock */
>> v6 = socket(AF_INET6, SOCK_STREAM, IPPROTO_SMC);
>>
>> There are several reasons why we believe it is appropriate here:
>>
>> 1. For smc sockets, it actually use IPv4 (AF-INET) or IPv6 (AF-INET6)
>> address. There is no AF_SMC address at all.
>>
>> 2. Create smc socket in the AF_INET(6) path, which allows us to reuse
>> the infrastructure of AF_INET(6) path, such as common ebpf hooks.
>> Otherwise, smc have to implement it again in AF_SMC path.
>>
>> Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
> ...
>
>> diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
> ...
>
>> @@ -3594,9 +3595,31 @@ static int __init smc_init(void)
>>   		goto out_lo;
>>   	}
>>   
>> +	rc = proto_register(&smc_inet_prot, 1);
>> +	if (rc) {
>> +		pr_err("%s: proto_register smc_inet_prot fails with %d\n", __func__, rc);
> Hi,
>
> FWIIW, my feeling is that if a log message includes __func__ then it should
> be a debug level message, and even then I'm dubious about the value of
> __func__: we do have many tools including dynamic tracing or pinpointing
> problems.
>
> So I would suggest rephrasing this message and dropping __func__.
> Or maybe removing it entirely.
> Or if not, lowering the priority of this message to debug.
>
> If for some reason __func__ remains, please do consider wrapping
> the line to 80c columns or less, which can be trivially done here
> (please don't split the format string in any case).
>
> Flagged by checkpatch.pl --max-line-length=80


Hi Simon,

Thank you very much for your feedback.

Allow me to briefly explain the reasons for using pr_err and __func__ here.

Regarding pr_err, the failure here leads to the failure of the module 
loading, which is definitely an error-level message rather than a 
debug-level one.

As for __func__, I must admit that the purpose here is simply to align 
with the format of other error messages in smc_init(). In fact, I also 
feel that the presence of
__func__ doesn't hold significant value because this error will only 
occur within this function. It's meaningless information for both users 
and kernel developers.
Perhaps a more suitable format would be “smc: xxx: %d”.

However, if changes are needed, I think they should be made across the 
board in order to maintain a consistent style. Maybe this can be 
addressed by
submitting a new patch after this patch. @Wenjia, what do you think?

Therefore, for now, I would like to wrap this line to not exceed 80 
characters, to ensure it can pass the checkpatch.pl.
What do you think?

Best wishes,
D. Wythe

>
>> +		goto out_ulp;
>> +	}
>> +	inet_register_protosw(&smc_inet_protosw);
>> +#if IS_ENABLED(CONFIG_IPV6)
>> +	rc = proto_register(&smc_inet6_prot, 1);
>> +	if (rc) {
>> +		pr_err("%s: proto_register smc_inet6_prot fails with %d\n", __func__, rc);
> Here too.
>
>> +		goto out_inet_prot;
>> +	}
>> +	inet6_register_protosw(&smc_inet6_protosw);
>> +#endif
> ...


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next v4 3/3] net/smc: Introduce IPPROTO_SMC
  2024-06-03  2:57     ` D. Wythe
@ 2024-06-03  7:47       ` Simon Horman
  0 siblings, 0 replies; 16+ messages in thread
From: Simon Horman @ 2024-06-03  7:47 UTC (permalink / raw)
  To: D. Wythe
  Cc: kgraul, wenjia, jaka, wintera, guwen, kuba, davem, netdev,
	linux-s390, linux-rdma, tonylu, pabeni, edumazet

On Mon, Jun 03, 2024 at 10:57:55AM +0800, D. Wythe wrote:
> 
> 
> On 6/1/24 9:06 PM, Simon Horman wrote:
> > On Wed, May 29, 2024 at 11:59:07AM +0800, D. Wythe wrote:
> > > From: "D. Wythe" <alibuda@linux.alibaba.com>
> > > 
> > > This patch allows to create smc socket via AF_INET,
> > > similar to the following code,
> > > 
> > > /* create v4 smc sock */
> > > v4 = socket(AF_INET, SOCK_STREAM, IPPROTO_SMC);
> > > 
> > > /* create v6 smc sock */
> > > v6 = socket(AF_INET6, SOCK_STREAM, IPPROTO_SMC);
> > > 
> > > There are several reasons why we believe it is appropriate here:
> > > 
> > > 1. For smc sockets, it actually use IPv4 (AF-INET) or IPv6 (AF-INET6)
> > > address. There is no AF_SMC address at all.
> > > 
> > > 2. Create smc socket in the AF_INET(6) path, which allows us to reuse
> > > the infrastructure of AF_INET(6) path, such as common ebpf hooks.
> > > Otherwise, smc have to implement it again in AF_SMC path.
> > > 
> > > Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
> > ...
> > 
> > > diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
> > ...
> > 
> > > @@ -3594,9 +3595,31 @@ static int __init smc_init(void)
> > >   		goto out_lo;
> > >   	}
> > > +	rc = proto_register(&smc_inet_prot, 1);
> > > +	if (rc) {
> > > +		pr_err("%s: proto_register smc_inet_prot fails with %d\n", __func__, rc);
> > Hi,
> > 
> > FWIIW, my feeling is that if a log message includes __func__ then it should
> > be a debug level message, and even then I'm dubious about the value of
> > __func__: we do have many tools including dynamic tracing or pinpointing
> > problems.
> > 
> > So I would suggest rephrasing this message and dropping __func__.
> > Or maybe removing it entirely.
> > Or if not, lowering the priority of this message to debug.
> > 
> > If for some reason __func__ remains, please do consider wrapping
> > the line to 80c columns or less, which can be trivially done here
> > (please don't split the format string in any case).
> > 
> > Flagged by checkpatch.pl --max-line-length=80
> 
> 
> Hi Simon,
> 
> Thank you very much for your feedback.
> 
> Allow me to briefly explain the reasons for using pr_err and __func__ here.
> 
> Regarding pr_err, the failure here leads to the failure of the module
> loading, which is definitely an error-level message rather than a
> debug-level one.
> 
> As for __func__, I must admit that the purpose here is simply to align with
> the format of other error messages in smc_init(). In fact, I also feel that
> the presence of
> __func__ doesn't hold significant value because this error will only occur
> within this function. It's meaningless information for both users and kernel
> developers.
> Perhaps a more suitable format would be “smc: xxx: %d”.
> 
> However, if changes are needed, I think they should be made across the board
> in order to maintain a consistent style. Maybe this can be addressed by
> submitting a new patch after this patch. @Wenjia, what do you think?
> 
> Therefore, for now, I would like to wrap this line to not exceed 80
> characters, to ensure it can pass the checkpatch.pl.
> What do you think?

Thanks, I agree with your reasoning.
And I think this is a good approach for this patch.

> 
> Best wishes,
> D. Wythe
> 
> > 
> > > +		goto out_ulp;
> > > +	}
> > > +	inet_register_protosw(&smc_inet_protosw);
> > > +#if IS_ENABLED(CONFIG_IPV6)
> > > +	rc = proto_register(&smc_inet6_prot, 1);
> > > +	if (rc) {
> > > +		pr_err("%s: proto_register smc_inet6_prot fails with %d\n", __func__, rc);
> > Here too.
> > 
> > > +		goto out_inet_prot;
> > > +	}
> > > +	inet6_register_protosw(&smc_inet6_protosw);
> > > +#endif
> > ...
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2024-06-03  7:47 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-05-29  3:59 [PATCH net-next v4 0/3] Introduce IPPROTO_SMC D. Wythe
2024-05-29  3:59 ` [PATCH net-next v4 1/3] net/smc: refactoring initialization of smc sock D. Wythe
2024-05-29  6:14   ` Tony Lu
2024-05-29  3:59 ` [PATCH net-next v4 2/3] net/smc: expose smc proto operations D. Wythe
2024-05-29 17:57   ` Zhu Yanjun
2024-05-30  2:33     ` D. Wythe
2024-05-29  3:59 ` [PATCH net-next v4 3/3] net/smc: Introduce IPPROTO_SMC D. Wythe
2024-05-29 11:12   ` Dust Li
2024-05-30  3:11     ` D. Wythe
2024-05-29 11:58   ` Wenjia Zhang
2024-05-30  2:51     ` D. Wythe
2024-05-29 19:55   ` Zhu Yanjun
2024-05-30  2:35     ` D. Wythe
2024-06-01 13:06   ` Simon Horman
2024-06-03  2:57     ` D. Wythe
2024-06-03  7:47       ` Simon Horman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).