linux-s390.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH bpf-next 0/4] net/smc: Introduce smc_bpf_ops
@ 2024-10-24  2:42 D. Wythe
  2024-10-24  2:42 ` [PATCH bpf-next 1/4] bpf: export necessary sympols for modules D. Wythe
                   ` (3 more replies)
  0 siblings, 4 replies; 20+ messages in thread
From: D. Wythe @ 2024-10-24  2:42 UTC (permalink / raw)
  To: kgraul, wenjia, jaka, ast, daniel, andrii, martin.lau, pabeni,
	song, sdf, haoluo, yhs, edumazet, john.fastabend, kpsingh, jolsa,
	guwen
  Cc: kuba, davem, netdev, linux-s390, linux-rdma, bpf, dtcccc

From: "D. Wythe" <alibuda@linux.alibaba.com>

This patches attempt to introduce BPF injection capability for SMC,
and add selftest to ensure code stability.

As we all know that the SMC protocol is not suitable for all scenarios,
especially for short-lived. However, for most applications, they cannot
guarantee that there are no such scenarios at all. Therefore, apps
may need some specific strategies to decide shall we need to use SMC
or not, for example, apps can limit the scope of the SMC to a specific
IP address or port.

Based on the consideration of transparent replacement, we hope that apps
can remain transparent even if they need to formulate some specific
strategies for SMC using. That is, do not need to recompile their code.

On the other hand, we need to ensure the scalability of strategies
implementation. Although it is simple to use socket options or sysctl,
it will bring more complexity to subsequent expansion.

Fortunately, BPF can solve these concerns very well, users can write
thire own strategies in eBPF to choose whether to use SMC or not.
And it's quite easy for them to modify their strategies in the future.

This patches implement injection capability for SMC via struct_ops.
In that way, we can add new injection scenarios in the future.

D. Wythe (4):
  bpf: export necessary sympols for modules
  bpf: allow to access bpf_prog during bpf_struct_access
  net/smc: Introduce smc_bpf_ops
  bpf/selftests: add simple selftest for bpf_smc_ops

 include/linux/bpf.h                                |   1 +
 include/linux/filter.h                             |   1 +
 include/linux/tcp.h                                |   2 +-
 include/net/smc.h                                  |  47 +++++
 include/net/tcp.h                                  |   6 +
 kernel/bpf/btf.c                                   |   2 +
 kernel/bpf/verifier.c                              |   2 +-
 kernel/sched/ext.c                                 |   5 +-
 net/bpf/bpf_dummy_struct_ops.c                     |   1 +
 net/core/filter.c                                  |   7 +-
 net/ipv4/bpf_tcp_ca.c                              |   1 +
 net/ipv4/tcp_input.c                               |   3 +-
 net/ipv4/tcp_output.c                              |  14 +-
 net/netfilter/nf_conntrack_bpf.c                   |   1 +
 net/smc/Kconfig                                    |  12 ++
 net/smc/Makefile                                   |   1 +
 net/smc/af_smc.c                                   |  38 +++-
 net/smc/smc.h                                      |   4 +
 net/smc/smc_bpf.c                                  | 212 +++++++++++++++++++++
 net/smc/smc_bpf.h                                  |  34 ++++
 .../selftests/bpf/bpf_testmod/bpf_testmod.c        |   1 +
 .../selftests/bpf/prog_tests/test_bpf_smc.c        |  21 ++
 tools/testing/selftests/bpf/progs/bpf_smc.c        |  44 +++++
 23 files changed, 439 insertions(+), 21 deletions(-)
 create mode 100644 net/smc/smc_bpf.c
 create mode 100644 net/smc/smc_bpf.h
 create mode 100644 tools/testing/selftests/bpf/prog_tests/test_bpf_smc.c
 create mode 100644 tools/testing/selftests/bpf/progs/bpf_smc.c

-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH bpf-next 1/4] bpf: export necessary sympols for modules
  2024-10-24  2:42 [PATCH bpf-next 0/4] net/smc: Introduce smc_bpf_ops D. Wythe
@ 2024-10-24  2:42 ` D. Wythe
  2024-10-24  2:42 ` [PATCH bpf-next 2/4] bpf: allow to access bpf_prog during bpf_struct_access D. Wythe
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 20+ messages in thread
From: D. Wythe @ 2024-10-24  2:42 UTC (permalink / raw)
  To: kgraul, wenjia, jaka, ast, daniel, andrii, martin.lau, pabeni,
	song, sdf, haoluo, yhs, edumazet, john.fastabend, kpsingh, jolsa,
	guwen
  Cc: kuba, davem, netdev, linux-s390, linux-rdma, bpf, dtcccc

From: "D. Wythe" <alibuda@linux.alibaba.com>

This patch exports two necessary symbols for registering struct_ops
or kfunc with tristate subsystem.

Find the corresponding btf_id by name and type with
btf_find_by_name_kind().

And find the string in btf by offset with btf_str_by_offset(),
used to look up the names of structure members.

Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
---
 kernel/bpf/btf.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 75e4fe8..315e49b 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -567,6 +567,7 @@ s32 btf_find_by_name_kind(const struct btf *btf, const char *name, u8 kind)
 
 	return -ENOENT;
 }
+EXPORT_SYMBOL_GPL(btf_find_by_name_kind);
 
 s32 bpf_find_btf_id(const char *name, u32 kind, struct btf **btf_p)
 {
@@ -789,6 +790,7 @@ const char *btf_str_by_offset(const struct btf *btf, u32 offset)
 
 	return NULL;
 }
+EXPORT_SYMBOL_GPL(btf_str_by_offset);
 
 static bool btf_name_valid_identifier(const struct btf *btf, u32 offset)
 {
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH bpf-next 2/4] bpf: allow to access bpf_prog during bpf_struct_access
  2024-10-24  2:42 [PATCH bpf-next 0/4] net/smc: Introduce smc_bpf_ops D. Wythe
  2024-10-24  2:42 ` [PATCH bpf-next 1/4] bpf: export necessary sympols for modules D. Wythe
@ 2024-10-24  2:42 ` D. Wythe
  2024-10-25  9:14   ` kernel test robot
  2024-10-25 12:20   ` kernel test robot
  2024-10-24  2:42 ` [PATCH net-next 3/4] net/smc: Introduce smc_bpf_ops D. Wythe
  2024-10-24  2:42 ` [PATCH bpf-next 4/4] bpf/selftests: add simple selftest for bpf_smc_ops D. Wythe
  3 siblings, 2 replies; 20+ messages in thread
From: D. Wythe @ 2024-10-24  2:42 UTC (permalink / raw)
  To: kgraul, wenjia, jaka, ast, daniel, andrii, martin.lau, pabeni,
	song, sdf, haoluo, yhs, edumazet, john.fastabend, kpsingh, jolsa,
	guwen
  Cc: kuba, davem, netdev, linux-s390, linux-rdma, bpf, dtcccc

From: "D. Wythe" <alibuda@linux.alibaba.com>

When implements a struct_ops for a subsystem, the fields allowed to
be written within different callbacks for the same structure might
vary, and applying the same restriction logic could lead to unexpected
issues.

Although we can use kfunc to solve it, creating a new kfunc
just for a simple assignment has low value and introduces obvious
maintenance overhead.

This patch allows the subsystem to implement independent validation
logic for different callbacks by passing an additional prog parameter,
then the subsystem can implement independent logic by checking the
prog->expected_attach_type.

Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
---
 include/linux/bpf.h                                   | 1 +
 include/linux/filter.h                                | 1 +
 kernel/bpf/verifier.c                                 | 2 +-
 kernel/sched/ext.c                                    | 5 +++--
 net/bpf/bpf_dummy_struct_ops.c                        | 1 +
 net/core/filter.c                                     | 7 +++++--
 net/ipv4/bpf_tcp_ca.c                                 | 1 +
 net/netfilter/nf_conntrack_bpf.c                      | 1 +
 tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c | 1 +
 9 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 19d8ca8..648bafd 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -987,6 +987,7 @@ struct bpf_verifier_ops {
 				  struct bpf_prog *prog, u32 *target_size);
 	int (*btf_struct_access)(struct bpf_verifier_log *log,
 				 const struct bpf_reg_state *reg,
+				 const struct bpf_prog *prog,
 				 int off, int size);
 };
 
diff --git a/include/linux/filter.h b/include/linux/filter.h
index 7d7578a..2b583e2 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -670,6 +670,7 @@ struct sk_filter {
 extern struct mutex nf_conn_btf_access_lock;
 extern int (*nfct_btf_struct_access)(struct bpf_verifier_log *log,
 				     const struct bpf_reg_state *reg,
+					 const struct bpf_prog *prog,
 				     int off, int size);
 
 typedef unsigned int (*bpf_dispatcher_fn)(const void *ctx,
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 9a7ed52..e5f37cb 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -6638,7 +6638,7 @@ static int check_ptr_to_btf_access(struct bpf_verifier_env *env,
 			verbose(env, "verifier internal error: reg->btf must be kernel btf\n");
 			return -EFAULT;
 		}
-		ret = env->ops->btf_struct_access(&env->log, reg, off, size);
+		ret = env->ops->btf_struct_access(&env->log, reg, env->prog, off, size);
 	} else {
 		/* Writes are permitted with default btf_struct_access for
 		 * program allocated objects (which always have ref_obj_id > 0),
diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index 410a4df..7b10563 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -5344,8 +5344,9 @@ static bool bpf_scx_is_valid_access(int off, int size,
 }
 
 static int bpf_scx_btf_struct_access(struct bpf_verifier_log *log,
-				     const struct bpf_reg_state *reg, int off,
-				     int size)
+				     const struct bpf_reg_state *reg,
+					 const struct bpf_prog *prog,
+					 int off, int size)
 {
 	const struct btf_type *t;
 
diff --git a/net/bpf/bpf_dummy_struct_ops.c b/net/bpf/bpf_dummy_struct_ops.c
index f71f67c..5bc7aec 100644
--- a/net/bpf/bpf_dummy_struct_ops.c
+++ b/net/bpf/bpf_dummy_struct_ops.c
@@ -234,6 +234,7 @@ static int bpf_dummy_ops_check_member(const struct btf_type *t,
 
 static int bpf_dummy_ops_btf_struct_access(struct bpf_verifier_log *log,
 					   const struct bpf_reg_state *reg,
+					   const struct bpf_prog *prog,
 					   int off, int size)
 {
 	const struct btf_type *state;
diff --git a/net/core/filter.c b/net/core/filter.c
index a88e6924..273393c 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -9014,18 +9014,20 @@ static bool tc_cls_act_is_valid_access(int off, int size,
 
 int (*nfct_btf_struct_access)(struct bpf_verifier_log *log,
 			      const struct bpf_reg_state *reg,
+				  const struct bpf_prog *prog,
 			      int off, int size);
 EXPORT_SYMBOL_GPL(nfct_btf_struct_access);
 
 static int tc_cls_act_btf_struct_access(struct bpf_verifier_log *log,
 					const struct bpf_reg_state *reg,
+					const struct bpf_prog *prog,
 					int off, int size)
 {
 	int ret = -EACCES;
 
 	mutex_lock(&nf_conn_btf_access_lock);
 	if (nfct_btf_struct_access)
-		ret = nfct_btf_struct_access(log, reg, off, size);
+		ret = nfct_btf_struct_access(log, reg, prog, off, size);
 	mutex_unlock(&nf_conn_btf_access_lock);
 
 	return ret;
@@ -9100,13 +9102,14 @@ void bpf_warn_invalid_xdp_action(struct net_device *dev, struct bpf_prog *prog,
 
 static int xdp_btf_struct_access(struct bpf_verifier_log *log,
 				 const struct bpf_reg_state *reg,
+				 const struct bpf_prog *prog,
 				 int off, int size)
 {
 	int ret = -EACCES;
 
 	mutex_lock(&nf_conn_btf_access_lock);
 	if (nfct_btf_struct_access)
-		ret = nfct_btf_struct_access(log, reg, off, size);
+		ret = nfct_btf_struct_access(log, reg, prog, off, size);
 	mutex_unlock(&nf_conn_btf_access_lock);
 
 	return ret;
diff --git a/net/ipv4/bpf_tcp_ca.c b/net/ipv4/bpf_tcp_ca.c
index 5548047..c459774 100644
--- a/net/ipv4/bpf_tcp_ca.c
+++ b/net/ipv4/bpf_tcp_ca.c
@@ -60,6 +60,7 @@ static bool bpf_tcp_ca_is_valid_access(int off, int size,
 
 static int bpf_tcp_ca_btf_struct_access(struct bpf_verifier_log *log,
 					const struct bpf_reg_state *reg,
+					const struct bpf_prog *prog,
 					int off, int size)
 {
 	const struct btf_type *t;
diff --git a/net/netfilter/nf_conntrack_bpf.c b/net/netfilter/nf_conntrack_bpf.c
index 4a136fc..dad3659 100644
--- a/net/netfilter/nf_conntrack_bpf.c
+++ b/net/netfilter/nf_conntrack_bpf.c
@@ -235,6 +235,7 @@ static struct nf_conn *__bpf_nf_ct_lookup(struct net *net,
 /* Check writes into `struct nf_conn` */
 static int _nf_conntrack_btf_struct_access(struct bpf_verifier_log *log,
 					   const struct bpf_reg_state *reg,
+					   const struct bpf_prog *prog,
 					   int off, int size)
 {
 	const struct btf_type *ncit, *nct, *t;
diff --git a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
index 8835761..b537480 100644
--- a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
+++ b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
@@ -1243,6 +1243,7 @@ static int st_ops_gen_epilogue(struct bpf_insn *insn_buf, const struct bpf_prog
 
 static int st_ops_btf_struct_access(struct bpf_verifier_log *log,
 				    const struct bpf_reg_state *reg,
+					const struct bpf_prog *prog,
 				    int off, int size)
 {
 	if (off < 0 || off + size > sizeof(struct st_ops_args))
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH net-next 3/4] net/smc: Introduce smc_bpf_ops
  2024-10-24  2:42 [PATCH bpf-next 0/4] net/smc: Introduce smc_bpf_ops D. Wythe
  2024-10-24  2:42 ` [PATCH bpf-next 1/4] bpf: export necessary sympols for modules D. Wythe
  2024-10-24  2:42 ` [PATCH bpf-next 2/4] bpf: allow to access bpf_prog during bpf_struct_access D. Wythe
@ 2024-10-24  2:42 ` D. Wythe
  2024-10-25  0:26   ` Martin KaFai Lau
  2024-10-24  2:42 ` [PATCH bpf-next 4/4] bpf/selftests: add simple selftest for bpf_smc_ops D. Wythe
  3 siblings, 1 reply; 20+ messages in thread
From: D. Wythe @ 2024-10-24  2:42 UTC (permalink / raw)
  To: kgraul, wenjia, jaka, ast, daniel, andrii, martin.lau, pabeni,
	song, sdf, haoluo, yhs, edumazet, john.fastabend, kpsingh, jolsa,
	guwen
  Cc: kuba, davem, netdev, linux-s390, linux-rdma, bpf, dtcccc

From: "D. Wythe" <alibuda@linux.alibaba.com>

The introduction of IPPROTO_SMC enables eBPF programs to determine
whether to use SMC based on the context of socket creation, such as
network namespaces, PID and comm name, etc.

As a subsequent enhancement, this patch introduces a new hook for eBPF
programs that allows decisions on whether to use SMC or not at runtime,
including but not limited to local/remote IP address or ports. In
simpler words, this feature allows modifications to syn_smc through eBPF
programs before the TCP three-way handshake got established.

Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
---
 include/linux/tcp.h   |   2 +-
 include/net/smc.h     |  47 +++++++++++
 include/net/tcp.h     |   6 ++
 net/ipv4/tcp_input.c  |   3 +-
 net/ipv4/tcp_output.c |  14 +++-
 net/smc/Kconfig       |  12 +++
 net/smc/Makefile      |   1 +
 net/smc/af_smc.c      |  38 ++++++---
 net/smc/smc.h         |   4 +
 net/smc/smc_bpf.c     | 212 ++++++++++++++++++++++++++++++++++++++++++++++++++
 net/smc/smc_bpf.h     |  34 ++++++++
 11 files changed, 357 insertions(+), 16 deletions(-)
 create mode 100644 net/smc/smc_bpf.c
 create mode 100644 net/smc/smc_bpf.h

diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index 6a5e08b..4ef160a 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -478,7 +478,7 @@ struct tcp_sock {
 #endif
 #if IS_ENABLED(CONFIG_SMC)
 	bool	syn_smc;	/* SYN includes SMC */
-	bool	(*smc_hs_congested)(const struct sock *sk);
+	struct tcpsmc_ctx *smc;
 #endif
 
 #if defined(CONFIG_TCP_MD5SIG) || defined(CONFIG_TCP_AO)
diff --git a/include/net/smc.h b/include/net/smc.h
index db84e4e..34ab2c6 100644
--- a/include/net/smc.h
+++ b/include/net/smc.h
@@ -18,6 +18,8 @@
 #include "linux/ism.h"
 
 struct sock;
+struct tcp_sock;
+struct inet_request_sock;
 
 #define SMC_MAX_PNETID_LEN	16	/* Max. length of PNET id */
 
@@ -97,4 +99,49 @@ struct smcd_dev {
 	u8 going_away : 1;
 };
 
+/*
+ * This structure is used to store the parameters passed to the member of struct_ops.
+ * Due to the BPF verifier cannot restrict the writing of bit fields, such as limiting
+ * it to only write ireq->smc_ok. Using kfunc can solve this issue, but we don't want
+ * to introduce a kfunc with such a narrow function.
+ *
+ * Moreover, using this structure for unified parameters also addresses another
+ * potential issue. Currently, kfunc cannot recognize the calling context
+ * through BPF's existing structure. In the future, we can solve this problem
+ * by passing this ctx to kfunc.
+ */
+struct smc_bpf_ops_ctx {
+	struct {
+		struct tcp_sock *tp;
+	} set_option;
+	struct {
+		const struct tcp_sock *tp;
+		struct inet_request_sock *ireq;
+		int smc_ok;
+	} set_option_cond;
+};
+
+struct smc_bpf_ops {
+	/* priavte */
+
+	struct list_head	list;
+
+	/* public */
+
+	/* Invoked before computing SMC option for SYN packets.
+	 * We can control whether to set SMC options by modifying
+	 * ctx->set_option->tp->syn_smc.
+	 * This's also the only member that can be modified now.
+	 * Only member in ctx->set_option is valid for this callback.
+	 */
+	void (*set_option)(struct smc_bpf_ops_ctx *ctx);
+
+	/* Invoked before Set up SMC options for SYN-ACK packets
+	 * We can control whether to respond SMC options by modifying
+	 * ctx->set_option_cond.smc_ok.
+	 * Only member in ctx->set_option_cond is valid for this callback.
+	 */
+	void (*set_option_cond)(struct smc_bpf_ops_ctx *ctx);
+};
+
 #endif	/* _SMC_H */
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 739a9fb..c322443 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -2730,6 +2730,12 @@ static inline void tcp_bpf_rtt(struct sock *sk, long mrtt, u32 srtt)
 
 #if IS_ENABLED(CONFIG_SMC)
 extern struct static_key_false tcp_have_smc;
+struct tcpsmc_ctx {
+	/* Invoked before computing SMC option for SYN packets. */
+	void (*set_option)(struct tcp_sock *tp);
+	/* Invoked before Set up SMC options for SYN-ACK packets */
+	void (*set_option_cond)(const struct tcp_sock *tp, struct inet_request_sock *ireq);
+};
 #endif
 
 #if IS_ENABLED(CONFIG_TLS_DEVICE)
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 2d844e1..8ebd529 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -7070,8 +7070,7 @@ static void tcp_openreq_init(struct request_sock *req,
 	ireq->ir_num = ntohs(tcp_hdr(skb)->dest);
 	ireq->ir_mark = inet_request_mark(sk, skb);
 #if IS_ENABLED(CONFIG_SMC)
-	ireq->smc_ok = rx_opt->smc_ok && !(tcp_sk(sk)->smc_hs_congested &&
-			tcp_sk(sk)->smc_hs_congested(sk));
+	ireq->smc_ok = rx_opt->smc_ok;
 #endif
 }
 
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 054244ce..5ab47dd 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -759,14 +759,17 @@ static void tcp_options_write(struct tcphdr *th, struct tcp_sock *tp,
 	mptcp_options_write(th, ptr, tp, opts);
 }
 
-static void smc_set_option(const struct tcp_sock *tp,
+static void smc_set_option(struct tcp_sock *tp,
 			   struct tcp_out_options *opts,
 			   unsigned int *remaining)
 {
 #if IS_ENABLED(CONFIG_SMC)
 	if (static_branch_unlikely(&tcp_have_smc)) {
 		if (tp->syn_smc) {
-			if (*remaining >= TCPOLEN_EXP_SMC_BASE_ALIGNED) {
+			if (tp->smc && tp->smc->set_option)
+				tp->smc->set_option(tp);
+			/* set_option may modify syn_smc, so it needs to be checked again */
+			if (tp->syn_smc && *remaining >= TCPOLEN_EXP_SMC_BASE_ALIGNED) {
 				opts->options |= OPTION_SMC;
 				*remaining -= TCPOLEN_EXP_SMC_BASE_ALIGNED;
 			}
@@ -776,14 +779,17 @@ static void smc_set_option(const struct tcp_sock *tp,
 }
 
 static void smc_set_option_cond(const struct tcp_sock *tp,
-				const struct inet_request_sock *ireq,
+				struct inet_request_sock *ireq,
 				struct tcp_out_options *opts,
 				unsigned int *remaining)
 {
 #if IS_ENABLED(CONFIG_SMC)
 	if (static_branch_unlikely(&tcp_have_smc)) {
 		if (tp->syn_smc && ireq->smc_ok) {
-			if (*remaining >= TCPOLEN_EXP_SMC_BASE_ALIGNED) {
+			if (tp->smc && tp->smc->set_option_cond)
+				tp->smc->set_option_cond(tp, ireq);
+			/* set_option_cond may modify smc_ok, so it needs to be checked again */
+			if (ireq->smc_ok && *remaining >= TCPOLEN_EXP_SMC_BASE_ALIGNED) {
 				opts->options |= OPTION_SMC;
 				*remaining -= TCPOLEN_EXP_SMC_BASE_ALIGNED;
 			}
diff --git a/net/smc/Kconfig b/net/smc/Kconfig
index ba5e6a2..1eca835 100644
--- a/net/smc/Kconfig
+++ b/net/smc/Kconfig
@@ -33,3 +33,15 @@ config SMC_LO
 	  of architecture or hardware.
 
 	  if unsure, say N.
+
+config SMC_BPF
+	bool "eBPF support for SMC subsystem"
+	depends on SMC && BPF_SYSCALL
+	default n
+	help
+	  This option enables support for eBPF programs for SMC
+	  subsystem. eBPF programs offer much greater flexibility
+	  in modifying the behavior of the SMC protocol stack compared
+	  to a complete kernel-based approach.
+
+	  if unsure, say N.
diff --git a/net/smc/Makefile b/net/smc/Makefile
index 60f1c87..1c04906 100644
--- a/net/smc/Makefile
+++ b/net/smc/Makefile
@@ -7,3 +7,4 @@ smc-y += smc_cdc.o smc_tx.o smc_rx.o smc_close.o smc_ism.o smc_netlink.o smc_sta
 smc-y += smc_tracepoint.o smc_inet.o
 smc-$(CONFIG_SYSCTL) += smc_sysctl.o
 smc-$(CONFIG_SMC_LO) += smc_loopback.o
+smc-$(CONFIG_SMC_BPF) += smc_bpf.o
\ No newline at end of file
diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
index 0316217..316c8a1 100644
--- a/net/smc/af_smc.c
+++ b/net/smc/af_smc.c
@@ -55,6 +55,7 @@
 #include "smc_sysctl.h"
 #include "smc_loopback.h"
 #include "smc_inet.h"
+#include "smc_bpf.h"
 
 static DEFINE_MUTEX(smc_server_lgr_pending);	/* serialize link group
 						 * creation on server
@@ -156,19 +157,25 @@ static struct sock *smc_tcp_syn_recv_sock(const struct sock *sk,
 	return NULL;
 }
 
-static bool smc_hs_congested(const struct sock *sk)
+static void smc_set_tcp_option_cond(const struct tcp_sock *tp, struct inet_request_sock *ireq)
 {
 	const struct smc_sock *smc;
 
-	smc = smc_clcsock_user_data(sk);
+	smc = smc_clcsock_user_data(&tp->inet_conn.icsk_inet.sk);
 
 	if (!smc)
-		return true;
+		goto no_smc;
 
-	if (workqueue_congested(WORK_CPU_UNBOUND, smc_hs_wq))
-		return true;
+	if (smc->limit_smc_hs && workqueue_congested(WORK_CPU_UNBOUND, smc_hs_wq))
+		goto no_smc;
 
-	return false;
+#if IS_ENABLED(CONFIG_SMC_BPF)
+	bpf_smc_set_tcp_option_cond(tp, ireq);
+#endif /* CONFIG_SMC_BPF */
+
+	return;
+no_smc:
+	ireq->smc_ok = 0;
 }
 
 struct smc_hashinfo smc_v4_hashinfo = {
@@ -2650,9 +2657,6 @@ int smc_listen(struct socket *sock, int backlog)
 
 	inet_csk(smc->clcsock->sk)->icsk_af_ops = &smc->af_ops;
 
-	if (smc->limit_smc_hs)
-		tcp_sk(smc->clcsock->sk)->smc_hs_congested = smc_hs_congested;
-
 	rc = kernel_listen(smc->clcsock, backlog);
 	if (rc) {
 		write_lock_bh(&smc->clcsock->sk->sk_callback_lock);
@@ -3324,6 +3328,13 @@ int smc_create_clcsk(struct net *net, struct sock *sk, int family)
 	sk->sk_net_refcnt = 1;
 	get_net_track(net, &sk->ns_tracker, GFP_KERNEL);
 	sock_inuse_add(net, 1);
+
+	/* init tcp_smc_ctx */
+#if IS_ENABLED(CONFIG_SMC_BPF)
+	smc->tcp_smc_ctx.set_option = bpf_smc_set_tcp_option;
+#endif /* CONFIG_SMC_BPF */
+	smc->tcp_smc_ctx.set_option_cond = smc_set_tcp_option_cond;
+	tcp_sk(sk)->smc = &smc->tcp_smc_ctx;
 	return 0;
 }
 
@@ -3574,8 +3585,17 @@ static int __init smc_init(void)
 		pr_err("%s: smc_inet_init fails with %d\n", __func__, rc);
 		goto out_ulp;
 	}
+
+	rc = smc_bpf_struct_ops_init();
+	if (rc) {
+		pr_err("%s: smc_bpf_struct_ops_init fails with %d\n", __func__, rc);
+		goto out_inet;
+	}
+
 	static_branch_enable(&tcp_have_smc);
 	return 0;
+out_inet:
+	smc_inet_exit();
 out_ulp:
 	tcp_unregister_ulp(&smc_ulp_ops);
 out_lo:
diff --git a/net/smc/smc.h b/net/smc/smc.h
index 78ae10d..a9794fb 100644
--- a/net/smc/smc.h
+++ b/net/smc/smc.h
@@ -16,6 +16,7 @@
 #include <linux/compiler.h> /* __aligned */
 #include <net/genetlink.h>
 #include <net/sock.h>
+#include <net/tcp.h>
 
 #include "smc_ib.h"
 
@@ -328,6 +329,9 @@ struct smc_sock {				/* smc sock container */
 						/* protects clcsock of a listen
 						 * socket
 						 * */
+
+	/* smc context for tcp stack */
+	struct tcpsmc_ctx	tcp_smc_ctx;
 };
 
 #define smc_sk(ptr) container_of_const(ptr, struct smc_sock, sk)
diff --git a/net/smc/smc_bpf.c b/net/smc/smc_bpf.c
new file mode 100644
index 00000000..fa90406
--- /dev/null
+++ b/net/smc/smc_bpf.c
@@ -0,0 +1,212 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ *  Shared Memory Communications over RDMA (SMC-R) and RoCE
+ *
+ *  support for eBPF programs in SMC subsystem.
+ *
+ *  Copyright IBM Corp. 2016
+ *  Copyright (c) 2024, Alibaba Inc.
+ *
+ *  Author: D. Wythe <alibuda@linux.alibaba.com>
+ */
+
+#include <linux/bpf_verifier.h>
+#include <linux/spinlock.h>
+#include <linux/module.h>
+#include <linux/bpf.h>
+#include <linux/btf.h>
+#include <net/smc.h>
+
+#include "smc_bpf.h"
+
+static DEFINE_SPINLOCK(smc_bpf_ops_list_lock);
+static LIST_HEAD(smc_bpf_ops_list);
+
+static u32 tcp_sock_id, smc_bpf_ops_ctx_id;
+static const struct btf_type *smc_bpf_ops_type;
+static const struct btf *saved_btf;
+
+static int smc_bpf_ops_init(struct btf *btf)
+{
+	s32 type_id;
+
+	type_id = btf_find_by_name_kind(btf, "tcp_sock", BTF_KIND_STRUCT);
+	if (type_id < 0)
+		return -EINVAL;
+	tcp_sock_id = type_id;
+
+	type_id = btf_find_by_name_kind(btf, "smc_bpf_ops_ctx", BTF_KIND_STRUCT);
+	if (type_id < 0)
+		return -EINVAL;
+	smc_bpf_ops_ctx_id = type_id;
+
+	type_id = btf_find_by_name_kind(btf, "smc_bpf_ops", BTF_KIND_STRUCT);
+	if (type_id < 0)
+		return -EINVAL;
+	smc_bpf_ops_type = btf_type_by_id(btf, type_id);
+
+	saved_btf = btf;
+	return 0;
+}
+
+static int smc_bpf_ops_init_member(const struct btf_type *t,
+				   const struct btf_member *member,
+				   void *kdata, const void *udata)
+{
+	struct smc_bpf_ops *k_ops;
+	u32 moff;
+
+	k_ops = (struct smc_bpf_ops *)kdata;
+
+	moff = __btf_member_bit_offset(t, member) / 8;
+	switch (moff) {
+	case offsetof(struct smc_bpf_ops, list):
+		INIT_LIST_HEAD(&k_ops->list);
+		return 1;
+	default:
+		break;
+	}
+
+	return 0;
+}
+
+static int smc_bpf_ops_check_member(const struct btf_type *t,
+				    const struct btf_member *member,
+				    const struct bpf_prog *prog)
+{
+	u32 moff = __btf_member_bit_offset(t, member) / 8;
+
+	switch (moff) {
+	case offsetof(struct smc_bpf_ops, set_option):
+	case offsetof(struct smc_bpf_ops, set_option_cond):
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static int smc_bpf_ops_reg(void *kdata, struct bpf_link *link)
+{
+	struct smc_bpf_ops *ops = kdata;
+
+	/* Prevent the same ops from being registered repeatedly. */
+	if (!list_empty(&ops->list))
+		return -EINVAL;
+
+	spin_lock(&smc_bpf_ops_list_lock);
+	list_add_tail_rcu(&ops->list, &smc_bpf_ops_list);
+	spin_unlock(&smc_bpf_ops_list_lock);
+
+	return 0;
+}
+
+static void smc_bpf_ops_unreg(void *kdata, struct bpf_link *link)
+{
+	struct smc_bpf_ops *ops = kdata;
+
+	spin_lock(&smc_bpf_ops_list_lock);
+	list_del_rcu(&ops->list);
+	spin_unlock(&smc_bpf_ops_list_lock);
+
+	/* Ensure that all readers to complete */
+	synchronize_rcu();
+}
+
+static void __bpf_smc_stub_set_tcp_option(struct smc_bpf_ops_ctx *ops_ctx) {}
+static void __bpf_smc_stub_set_tcp_option_cond(struct smc_bpf_ops_ctx *ops_ctx) {}
+
+static struct smc_bpf_ops __bpf_smc_bpf_ops = {
+	.set_option = __bpf_smc_stub_set_tcp_option,
+	.set_option_cond = __bpf_smc_stub_set_tcp_option_cond,
+};
+
+static int smc_bpf_ops_btf_struct_access(struct bpf_verifier_log *log,
+					 const struct bpf_reg_state *reg,
+					 const struct bpf_prog *prog,
+					 int off, int size)
+{
+	const struct btf_member *member;
+	const char *mname;
+	int member_idx;
+
+	member_idx = prog->expected_attach_type;
+	if (member_idx >= btf_type_vlen(smc_bpf_ops_type))
+		goto out_err;
+
+	member = &btf_type_member(smc_bpf_ops_type)[member_idx];
+	mname = btf_str_by_offset(saved_btf, member->name_off);
+
+	if (!strcmp(mname, "set_option")) {
+		/* only support to modify tcp_sock->syn_smc */
+		if (reg->btf_id == tcp_sock_id &&
+		    off == offsetof(struct tcp_sock, syn_smc) &&
+		    off + size == offsetofend(struct tcp_sock, syn_smc))
+			return 0;
+	} else if (!strcmp(mname, "set_option_cond")) {
+		/* only support to modify smc_bpf_ops_ctx->smc_ok */
+		if (reg->btf_id == smc_bpf_ops_ctx_id &&
+		    off == offsetof(struct smc_bpf_ops_ctx, set_option_cond.smc_ok) &&
+		    off + size == offsetofend(struct smc_bpf_ops_ctx, set_option_cond.smc_ok))
+			return 0;
+	}
+
+out_err:
+	return -EACCES;
+}
+
+static const struct bpf_verifier_ops smc_bpf_verifier_ops = {
+	.get_func_proto = bpf_base_func_proto,
+	.is_valid_access = bpf_tracing_btf_ctx_access,
+	.btf_struct_access = smc_bpf_ops_btf_struct_access,
+};
+
+static struct bpf_struct_ops bpf_smc_bpf_ops = {
+	.init = smc_bpf_ops_init,
+	.name = "smc_bpf_ops",
+	.reg = smc_bpf_ops_reg,
+	.unreg = smc_bpf_ops_unreg,
+	.cfi_stubs = &__bpf_smc_bpf_ops,
+	.verifier_ops = &smc_bpf_verifier_ops,
+	.init_member = smc_bpf_ops_init_member,
+	.check_member = smc_bpf_ops_check_member,
+	.owner = THIS_MODULE,
+};
+
+int smc_bpf_struct_ops_init(void)
+{
+	return register_bpf_struct_ops(&bpf_smc_bpf_ops, smc_bpf_ops);
+}
+
+void bpf_smc_set_tcp_option(struct tcp_sock *tp)
+{
+	struct smc_bpf_ops_ctx ops_ctx = {};
+	struct smc_bpf_ops *ops;
+
+	ops_ctx.set_option.tp = tp;
+
+	rcu_read_lock();
+	list_for_each_entry_rcu(ops, &smc_bpf_ops_list, list) {
+		ops->set_option(&ops_ctx);
+	}
+	rcu_read_unlock();
+}
+
+void bpf_smc_set_tcp_option_cond(const struct tcp_sock *tp, struct inet_request_sock *ireq)
+{
+	struct smc_bpf_ops_ctx ops_ctx = {};
+	struct smc_bpf_ops *ops;
+
+	ops_ctx.set_option_cond.tp = tp;
+	ops_ctx.set_option_cond.ireq = ireq;
+	ops_ctx.set_option_cond.smc_ok = ireq->smc_ok;
+
+	rcu_read_lock();
+	list_for_each_entry_rcu(ops, &smc_bpf_ops_list, list) {
+		ops->set_option_cond(&ops_ctx);
+	}
+	rcu_read_unlock();
+
+	ireq->smc_ok = ops_ctx.set_option_cond.smc_ok;
+}
diff --git a/net/smc/smc_bpf.h b/net/smc/smc_bpf.h
new file mode 100644
index 00000000..a5ed0fc
--- /dev/null
+++ b/net/smc/smc_bpf.h
@@ -0,0 +1,34 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ *  Shared Memory Communications over RDMA (SMC-R) and RoCE
+ *
+ *  support for eBPF programs in SMC subsystem.
+ *
+ *  Copyright IBM Corp. 2016
+ *  Copyright (c) 2024, Alibaba Inc.
+ *
+ *  Author: D. Wythe <alibuda@linux.alibaba.com>
+ */
+#ifndef __SMC_BPF
+#define __SMC_BPF
+
+#include <linux/types.h>
+#include <net/sock.h>
+#include <net/tcp.h>
+
+#if IS_ENABLED(CONFIG_SMC_BPF)
+
+/* Initialize struct_ops registration. It will automatically unload
+ * when module is unloaded.
+ * @return 0 on success
+ */
+int smc_bpf_struct_ops_init(void);
+
+void bpf_smc_set_tcp_option(struct tcp_sock *sk);
+void bpf_smc_set_tcp_option_cond(const struct tcp_sock *tp, struct inet_request_sock *ireq);
+
+#else
+static inline int smc_bpf_struct_ops_init(void) { return 0; }
+#endif /* CONFIG_SMC_BPF */
+
+#endif /* __SMC_BPF */
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH bpf-next 4/4] bpf/selftests: add simple selftest for bpf_smc_ops
  2024-10-24  2:42 [PATCH bpf-next 0/4] net/smc: Introduce smc_bpf_ops D. Wythe
                   ` (2 preceding siblings ...)
  2024-10-24  2:42 ` [PATCH net-next 3/4] net/smc: Introduce smc_bpf_ops D. Wythe
@ 2024-10-24  2:42 ` D. Wythe
  2024-10-24  4:04   ` D. Wythe
  2024-11-03 13:01   ` Zhu Yanjun
  3 siblings, 2 replies; 20+ messages in thread
From: D. Wythe @ 2024-10-24  2:42 UTC (permalink / raw)
  To: kgraul, wenjia, jaka, ast, daniel, andrii, martin.lau, pabeni,
	song, sdf, haoluo, yhs, edumazet, john.fastabend, kpsingh, jolsa,
	guwen
  Cc: kuba, davem, netdev, linux-s390, linux-rdma, bpf, dtcccc

From: "D. Wythe" <alibuda@linux.alibaba.com>

This PATCH adds a tiny selftest for bpf_smc_ops, to verify the ability
to attach and write access.

Follow the steps below to run this test.

make -C tools/testing/selftests/bpf
cd tools/testing/selftests/bpf
sudo ./test_progs -t smc

Results shows:
Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
---
 .../selftests/bpf/prog_tests/test_bpf_smc.c        | 21 +++++++++++
 tools/testing/selftests/bpf/progs/bpf_smc.c        | 44 ++++++++++++++++++++++
 2 files changed, 65 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/test_bpf_smc.c
 create mode 100644 tools/testing/selftests/bpf/progs/bpf_smc.c

diff --git a/tools/testing/selftests/bpf/prog_tests/test_bpf_smc.c b/tools/testing/selftests/bpf/prog_tests/test_bpf_smc.c
new file mode 100644
index 00000000..2299853
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/test_bpf_smc.c
@@ -0,0 +1,21 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <test_progs.h>
+
+#include "bpf_smc.skel.h"
+
+static void load(void)
+{
+	struct bpf_smc *skel;
+
+	skel = bpf_smc__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "bpf_smc__open_and_load"))
+		return;
+
+	bpf_smc__destroy(skel);
+}
+
+void test_bpf_smc(void)
+{
+	if (test__start_subtest("load"))
+		load();
+}
diff --git a/tools/testing/selftests/bpf/progs/bpf_smc.c b/tools/testing/selftests/bpf/progs/bpf_smc.c
new file mode 100644
index 00000000..ebff477
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/bpf_smc.c
@@ -0,0 +1,44 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include "vmlinux.h"
+
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+
+char _license[] SEC("license") = "GPL";
+
+struct smc_bpf_ops_ctx {
+	struct {
+		struct tcp_sock *tp;
+	} set_option;
+	struct {
+		const struct tcp_sock *tp;
+		struct inet_request_sock *ireq;
+		int smc_ok;
+	} set_option_cond;
+};
+
+struct smc_bpf_ops {
+	void (*set_option)(struct smc_bpf_ops_ctx *ctx);
+	void (*set_option_cond)(struct smc_bpf_ops_ctx *ctx);
+};
+
+SEC("struct_ops/bpf_smc_set_tcp_option_cond")
+void BPF_PROG(bpf_smc_set_tcp_option_cond, struct smc_bpf_ops_ctx *arg)
+{
+	arg->set_option_cond.smc_ok = 1;
+}
+
+SEC("struct_ops/bpf_smc_set_tcp_option")
+void BPF_PROG(bpf_smc_set_tcp_option, struct smc_bpf_ops_ctx *arg)
+{
+	struct tcp_sock *tp = arg->set_option.tp;
+
+	tp->syn_smc = 1;
+}
+
+SEC(".struct_ops.link")
+struct smc_bpf_ops sample_smc_bpf_ops = {
+	.set_option         = (void *) bpf_smc_set_tcp_option,
+	.set_option_cond    = (void *) bpf_smc_set_tcp_option_cond,
+};
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next 4/4] bpf/selftests: add simple selftest for bpf_smc_ops
  2024-10-24  2:42 ` [PATCH bpf-next 4/4] bpf/selftests: add simple selftest for bpf_smc_ops D. Wythe
@ 2024-10-24  4:04   ` D. Wythe
  2024-10-24  4:49     ` Tianchen Ding
  2024-11-03 13:01   ` Zhu Yanjun
  1 sibling, 1 reply; 20+ messages in thread
From: D. Wythe @ 2024-10-24  4:04 UTC (permalink / raw)
  To: kgraul, wenjia, jaka, ast, daniel, andrii, martin.lau, pabeni,
	song, sdf, haoluo, yhs, edumazet, john.fastabend, kpsingh, jolsa,
	guwen
  Cc: kuba, davem, netdev, linux-s390, linux-rdma, bpf, dtcccc



On 10/24/24 10:42 AM, D. Wythe wrote:
> From: "D. Wythe" <alibuda@linux.alibaba.com>
> 
> This PATCH adds a tiny selftest for bpf_smc_ops, to verify the ability
> to attach and write access.
> 
> Follow the steps below to run this test.
> 
> make -C tools/testing/selftests/bpf
> cd tools/testing/selftests/bpf
> sudo ./test_progs -t smc
> 
> Results shows:
> Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED


Sorry for just found an issue with vary config. I will fix this issues
in the next version.

D. Wythe

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next 4/4] bpf/selftests: add simple selftest for bpf_smc_ops
  2024-10-24  4:04   ` D. Wythe
@ 2024-10-24  4:49     ` Tianchen Ding
  2024-10-24  5:49       ` D. Wythe
  0 siblings, 1 reply; 20+ messages in thread
From: Tianchen Ding @ 2024-10-24  4:49 UTC (permalink / raw)
  To: D. Wythe
  Cc: kuba, davem, netdev, linux-s390, linux-rdma, bpf, kgraul, wenjia,
	jaka, ast, daniel, andrii, martin.lau, pabeni, song, sdf, haoluo,
	yhs, edumazet, john.fastabend, kpsingh, jolsa, guwen

On 2024/10/24 12:04, D. Wythe wrote:
> 
> 
> On 10/24/24 10:42 AM, D. Wythe wrote:
>> From: "D. Wythe" <alibuda@linux.alibaba.com>
>>
>> This PATCH adds a tiny selftest for bpf_smc_ops, to verify the ability
>> to attach and write access.
>>
>> Follow the steps below to run this test.
>>
>> make -C tools/testing/selftests/bpf
>> cd tools/testing/selftests/bpf
>> sudo ./test_progs -t smc
>>
>> Results shows:
>> Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED
> 
> 
> Sorry for just found an issue with vary config. I will fix this issues
> in the next version.
> 
> D. Wythe

This doesn't build with !CONFIG_SMC.

Maybe you should create an individual dir. (like what sched_ext does)

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next 4/4] bpf/selftests: add simple selftest for bpf_smc_ops
  2024-10-24  4:49     ` Tianchen Ding
@ 2024-10-24  5:49       ` D. Wythe
  0 siblings, 0 replies; 20+ messages in thread
From: D. Wythe @ 2024-10-24  5:49 UTC (permalink / raw)
  To: Tianchen Ding
  Cc: kuba, davem, netdev, linux-s390, linux-rdma, bpf, kgraul, wenjia,
	jaka, ast, daniel, andrii, martin.lau, pabeni, song, sdf, haoluo,
	yhs, edumazet, john.fastabend, kpsingh, jolsa, guwen



On 10/24/24 12:49 PM, Tianchen Ding wrote:
> On 2024/10/24 12:04, D. Wythe wrote:
>>
>>
>> On 10/24/24 10:42 AM, D. Wythe wrote:
>>> From: "D. Wythe" <alibuda@linux.alibaba.com>
>>>
>>> This PATCH adds a tiny selftest for bpf_smc_ops, to verify the ability
>>> to attach and write access.
>>>
>>> Follow the steps below to run this test.
>>>
>>> make -C tools/testing/selftests/bpf
>>> cd tools/testing/selftests/bpf
>>> sudo ./test_progs -t smc
>>>
>>> Results shows:
>>> Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED
>>
>>
>> Sorry for just found an issue with vary config. I will fix this issues
>> in the next version.
>>
>> D. Wythe
> 
> This doesn't build with !CONFIG_SMC.
> 
> Maybe you should create an individual dir. (like what sched_ext does)

It's true, I do intend to create an individual dir, and send the patches for
BPF and SMC separately. Thanks for your advises.

Best wishes,
D. Wythe



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH net-next 3/4] net/smc: Introduce smc_bpf_ops
  2024-10-24  2:42 ` [PATCH net-next 3/4] net/smc: Introduce smc_bpf_ops D. Wythe
@ 2024-10-25  0:26   ` Martin KaFai Lau
  2024-10-25 11:05     ` D. Wythe
  0 siblings, 1 reply; 20+ messages in thread
From: Martin KaFai Lau @ 2024-10-25  0:26 UTC (permalink / raw)
  To: D. Wythe
  Cc: kgraul, wenjia, jaka, ast, daniel, andrii, pabeni, song, sdf,
	haoluo, yhs, edumazet, john.fastabend, kpsingh, jolsa, guwen,
	kuba, davem, netdev, linux-s390, linux-rdma, bpf, dtcccc

On 10/23/24 7:42 PM, D. Wythe wrote:
> From: "D. Wythe" <alibuda@linux.alibaba.com>
> 
> The introduction of IPPROTO_SMC enables eBPF programs to determine
> whether to use SMC based on the context of socket creation, such as
> network namespaces, PID and comm name, etc.
> 
> As a subsequent enhancement, this patch introduces a new hook for eBPF
> programs that allows decisions on whether to use SMC or not at runtime,
> including but not limited to local/remote IP address or ports. In
> simpler words, this feature allows modifications to syn_smc through eBPF
> programs before the TCP three-way handshake got established.
> 
> Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
> ---
>   include/linux/tcp.h   |   2 +-
>   include/net/smc.h     |  47 +++++++++++
>   include/net/tcp.h     |   6 ++
>   net/ipv4/tcp_input.c  |   3 +-
>   net/ipv4/tcp_output.c |  14 +++-
>   net/smc/Kconfig       |  12 +++
>   net/smc/Makefile      |   1 +
>   net/smc/af_smc.c      |  38 ++++++---
>   net/smc/smc.h         |   4 +
>   net/smc/smc_bpf.c     | 212 ++++++++++++++++++++++++++++++++++++++++++++++++++
>   net/smc/smc_bpf.h     |  34 ++++++++
>   11 files changed, 357 insertions(+), 16 deletions(-)
>   create mode 100644 net/smc/smc_bpf.c
>   create mode 100644 net/smc/smc_bpf.h
> 
> diff --git a/include/linux/tcp.h b/include/linux/tcp.h
> index 6a5e08b..4ef160a 100644
> --- a/include/linux/tcp.h
> +++ b/include/linux/tcp.h
> @@ -478,7 +478,7 @@ struct tcp_sock {
>   #endif
>   #if IS_ENABLED(CONFIG_SMC)
>   	bool	syn_smc;	/* SYN includes SMC */
> -	bool	(*smc_hs_congested)(const struct sock *sk);
> +	struct tcpsmc_ctx *smc;
>   #endif
>   
>   #if defined(CONFIG_TCP_MD5SIG) || defined(CONFIG_TCP_AO)
> diff --git a/include/net/smc.h b/include/net/smc.h
> index db84e4e..34ab2c6 100644
> --- a/include/net/smc.h
> +++ b/include/net/smc.h
> @@ -18,6 +18,8 @@
>   #include "linux/ism.h"
>   
>   struct sock;
> +struct tcp_sock;
> +struct inet_request_sock;
>   
>   #define SMC_MAX_PNETID_LEN	16	/* Max. length of PNET id */
>   
> @@ -97,4 +99,49 @@ struct smcd_dev {
>   	u8 going_away : 1;
>   };
>   
> +/*
> + * This structure is used to store the parameters passed to the member of struct_ops.
> + * Due to the BPF verifier cannot restrict the writing of bit fields, such as limiting
> + * it to only write ireq->smc_ok. Using kfunc can solve this issue, but we don't want
> + * to introduce a kfunc with such a narrow function.

imo, adding kfunc is fine.

> + *
> + * Moreover, using this structure for unified parameters also addresses another
> + * potential issue. Currently, kfunc cannot recognize the calling context
> + * through BPF's existing structure. In the future, we can solve this problem
> + * by passing this ctx to kfunc.

This part I don't understand. How is it different from the "tcp_cubic_kfunc_set" 
allowed in tcp_congestion_ops?

> + */
> +struct smc_bpf_ops_ctx {
> +	struct {
> +		struct tcp_sock *tp;
> +	} set_option;
> +	struct {
> +		const struct tcp_sock *tp;
> +		struct inet_request_sock *ireq;
> +		int smc_ok;
> +	} set_option_cond;
> +};

There is no need to create one single ctx for struct_ops prog. struct_ops prog 
can take >1 args and different ops can take different args.

> +
> +struct smc_bpf_ops {
> +	/* priavte */
> +
> +	struct list_head	list;
> +
> +	/* public */
> +
> +	/* Invoked before computing SMC option for SYN packets.
> +	 * We can control whether to set SMC options by modifying
> +	 * ctx->set_option->tp->syn_smc.
> +	 * This's also the only member that can be modified now.
> +	 * Only member in ctx->set_option is valid for this callback.
> +	 */
> +	void (*set_option)(struct smc_bpf_ops_ctx *ctx);
> +
> +	/* Invoked before Set up SMC options for SYN-ACK packets
> +	 * We can control whether to respond SMC options by modifying
> +	 * ctx->set_option_cond.smc_ok.
> +	 * Only member in ctx->set_option_cond is valid for this callback.
> +	 */
> +	void (*set_option_cond)(struct smc_bpf_ops_ctx *ctx);

The struct smc_bpf_ops already has set_option and set_option_cnd, but...

> +};
> +
>   #endif	/* _SMC_H */
> diff --git a/include/net/tcp.h b/include/net/tcp.h
> index 739a9fb..c322443 100644
> --- a/include/net/tcp.h
> +++ b/include/net/tcp.h
> @@ -2730,6 +2730,12 @@ static inline void tcp_bpf_rtt(struct sock *sk, long mrtt, u32 srtt)
>   
>   #if IS_ENABLED(CONFIG_SMC)
>   extern struct static_key_false tcp_have_smc;
> +struct tcpsmc_ctx {
> +	/* Invoked before computing SMC option for SYN packets. */
> +	void (*set_option)(struct tcp_sock *tp);
> +	/* Invoked before Set up SMC options for SYN-ACK packets */
> +	void (*set_option_cond)(const struct tcp_sock *tp, struct inet_request_sock *ireq);
> +};

another new struct tcpsmc_ctx has exactly the same functions (at least the same 
name) but different arguments. I don't understand why this duplicate, is it 
because the need to prepare the "struct smc_bpf_ops_ctx"?

The "struct tcpsmc_ctx" should be the "struct smc_bpf_ops" itself.

[ ... ]

> +static int smc_bpf_ops_btf_struct_access(struct bpf_verifier_log *log,
> +					 const struct bpf_reg_state *reg,
> +					 const struct bpf_prog *prog,
> +					 int off, int size)
> +{
> +	const struct btf_member *member;
> +	const char *mname;
> +	int member_idx;
> +
> +	member_idx = prog->expected_attach_type;
> +	if (member_idx >= btf_type_vlen(smc_bpf_ops_type))
> +		goto out_err;
> +
> +	member = &btf_type_member(smc_bpf_ops_type)[member_idx];
> +	mname = btf_str_by_offset(saved_btf, member->name_off);
> +
> +	if (!strcmp(mname, "set_option")) {

btf_member_bit_offset can be used instead of strcmp. Take a look at bpf_tcp_ca.c 
and kernel/sched/ext.c

> +		/* only support to modify tcp_sock->syn_smc */
> +		if (reg->btf_id == tcp_sock_id &&
> +		    off == offsetof(struct tcp_sock, syn_smc) &&
> +		    off + size == offsetofend(struct tcp_sock, syn_smc))
> +			return 0;
> +	} else if (!strcmp(mname, "set_option_cond")) {
> +		/* only support to modify smc_bpf_ops_ctx->smc_ok */
> +		if (reg->btf_id == smc_bpf_ops_ctx_id &&
> +		    off == offsetof(struct smc_bpf_ops_ctx, set_option_cond.smc_ok) &&
> +		    off + size == offsetofend(struct smc_bpf_ops_ctx, set_option_cond.smc_ok))
> +			return 0;
> +	}
> +
> +out_err:
> +	return -EACCES;
> +}
> +
> +static const struct bpf_verifier_ops smc_bpf_verifier_ops = {
> +	.get_func_proto = bpf_base_func_proto,
> +	.is_valid_access = bpf_tracing_btf_ctx_access,
> +	.btf_struct_access = smc_bpf_ops_btf_struct_access,
> +};
> +
> +static struct bpf_struct_ops bpf_smc_bpf_ops = {
> +	.init = smc_bpf_ops_init,
> +	.name = "smc_bpf_ops",
> +	.reg = smc_bpf_ops_reg,
> +	.unreg = smc_bpf_ops_unreg,
> +	.cfi_stubs = &__bpf_smc_bpf_ops,
> +	.verifier_ops = &smc_bpf_verifier_ops,
> +	.init_member = smc_bpf_ops_init_member,
> +	.check_member = smc_bpf_ops_check_member,
> +	.owner = THIS_MODULE,
> +};
> +
> +int smc_bpf_struct_ops_init(void)
> +{
> +	return register_bpf_struct_ops(&bpf_smc_bpf_ops, smc_bpf_ops);
> +}
> +
> +void bpf_smc_set_tcp_option(struct tcp_sock *tp)
> +{
> +	struct smc_bpf_ops_ctx ops_ctx = {};
> +	struct smc_bpf_ops *ops;
> +
> +	ops_ctx.set_option.tp = tp;

All this initialization should be unnecessary. Directly pass tp instead.

> +
> +	rcu_read_lock();
> +	list_for_each_entry_rcu(ops, &smc_bpf_ops_list, list) {

Does it need to have a list (meaning >1) of smc_bpf_ops to act on a sock? The 
ordering expectation is hard to manage.

> +		ops->set_option(&ops_ctx);

A dumb question. This will only affect AF_SMC (or AF_INET[6]/IPPROTO_SMC) 
socket but not the AF_INET[6]/IPPROTO_{TCP,UDP} socket?

pw-bot: cr

> +	}
> +	rcu_read_unlock();
> +}

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next 2/4] bpf: allow to access bpf_prog during bpf_struct_access
  2024-10-24  2:42 ` [PATCH bpf-next 2/4] bpf: allow to access bpf_prog during bpf_struct_access D. Wythe
@ 2024-10-25  9:14   ` kernel test robot
  2024-10-25 12:20   ` kernel test robot
  1 sibling, 0 replies; 20+ messages in thread
From: kernel test robot @ 2024-10-25  9:14 UTC (permalink / raw)
  To: D. Wythe, kgraul, wenjia, jaka, ast, daniel, andrii, martin.lau,
	pabeni, song, sdf, haoluo, yhs, edumazet, john.fastabend, kpsingh,
	jolsa, guwen
  Cc: llvm, oe-kbuild-all, kuba, davem, netdev, linux-s390, linux-rdma,
	bpf, dtcccc

Hi Wythe,

kernel test robot noticed the following build errors:

[auto build test ERROR on bpf-next/master]

url:    https://github.com/intel-lab-lkp/linux/commits/D-Wythe/bpf-export-necessary-sympols-for-modules/20241024-104903
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
patch link:    https://lore.kernel.org/r/1729737768-124596-3-git-send-email-alibuda%40linux.alibaba.com
patch subject: [PATCH bpf-next 2/4] bpf: allow to access bpf_prog during bpf_struct_access
config: s390-allmodconfig (https://download.01.org/0day-ci/archive/20241025/202410251600.4VOzw93J-lkp@intel.com/config)
compiler: clang version 20.0.0git (https://github.com/llvm/llvm-project 5886454669c3c9026f7f27eab13509dd0241f2d6)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241025/202410251600.4VOzw93J-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202410251600.4VOzw93J-lkp@intel.com/

All errors (new ones prefixed by >>):

   In file included from drivers/hid/bpf/hid_bpf_struct_ops.c:10:
   In file included from include/linux/bpf_verifier.h:7:
   In file included from include/linux/bpf.h:20:
   In file included from include/linux/module.h:19:
   In file included from include/linux/elf.h:6:
   In file included from arch/s390/include/asm/elf.h:181:
   In file included from arch/s390/include/asm/mmu_context.h:11:
   In file included from arch/s390/include/asm/pgalloc.h:18:
   In file included from include/linux/mm.h:2213:
   include/linux/vmstat.h:504:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
     504 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     505 |                            item];
         |                            ~~~~
   include/linux/vmstat.h:511:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
     511 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     512 |                            NR_VM_NUMA_EVENT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~~
   include/linux/vmstat.h:518:36: warning: arithmetic between different enumeration types ('enum node_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
     518 |         return node_stat_name(NR_LRU_BASE + lru) + 3; // skip "nr_"
         |                               ~~~~~~~~~~~ ^ ~~~
   include/linux/vmstat.h:524:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
     524 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     525 |                            NR_VM_NUMA_EVENT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~~
   In file included from drivers/hid/bpf/hid_bpf_struct_ops.c:10:
   In file included from include/linux/bpf_verifier.h:9:
   In file included from include/linux/filter.h:12:
   In file included from include/linux/skbuff.h:28:
   In file included from include/linux/dma-mapping.h:11:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/s390/include/asm/io.h:93:
   include/asm-generic/io.h:548:31: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     548 |         val = __raw_readb(PCI_IOBASE + addr);
         |                           ~~~~~~~~~~ ^
   include/asm-generic/io.h:561:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     561 |         val = __le16_to_cpu((__le16 __force)__raw_readw(PCI_IOBASE + addr));
         |                                                         ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/big_endian.h:37:59: note: expanded from macro '__le16_to_cpu'
      37 | #define __le16_to_cpu(x) __swab16((__force __u16)(__le16)(x))
         |                                                           ^
   include/uapi/linux/swab.h:102:54: note: expanded from macro '__swab16'
     102 | #define __swab16(x) (__u16)__builtin_bswap16((__u16)(x))
         |                                                      ^
   In file included from drivers/hid/bpf/hid_bpf_struct_ops.c:10:
   In file included from include/linux/bpf_verifier.h:9:
   In file included from include/linux/filter.h:12:
   In file included from include/linux/skbuff.h:28:
   In file included from include/linux/dma-mapping.h:11:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/s390/include/asm/io.h:93:
   include/asm-generic/io.h:574:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     574 |         val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
         |                                                         ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/big_endian.h:35:59: note: expanded from macro '__le32_to_cpu'
      35 | #define __le32_to_cpu(x) __swab32((__force __u32)(__le32)(x))
         |                                                           ^
   include/uapi/linux/swab.h:115:54: note: expanded from macro '__swab32'
     115 | #define __swab32(x) (__u32)__builtin_bswap32((__u32)(x))
         |                                                      ^
   In file included from drivers/hid/bpf/hid_bpf_struct_ops.c:10:
   In file included from include/linux/bpf_verifier.h:9:
   In file included from include/linux/filter.h:12:
   In file included from include/linux/skbuff.h:28:
   In file included from include/linux/dma-mapping.h:11:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/s390/include/asm/io.h:93:
   include/asm-generic/io.h:585:33: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     585 |         __raw_writeb(value, PCI_IOBASE + addr);
         |                             ~~~~~~~~~~ ^
   include/asm-generic/io.h:595:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     595 |         __raw_writew((u16 __force)cpu_to_le16(value), PCI_IOBASE + addr);
         |                                                       ~~~~~~~~~~ ^
   include/asm-generic/io.h:605:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     605 |         __raw_writel((u32 __force)cpu_to_le32(value), PCI_IOBASE + addr);
         |                                                       ~~~~~~~~~~ ^
   include/asm-generic/io.h:693:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     693 |         readsb(PCI_IOBASE + addr, buffer, count);
         |                ~~~~~~~~~~ ^
   include/asm-generic/io.h:701:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     701 |         readsw(PCI_IOBASE + addr, buffer, count);
         |                ~~~~~~~~~~ ^
   include/asm-generic/io.h:709:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     709 |         readsl(PCI_IOBASE + addr, buffer, count);
         |                ~~~~~~~~~~ ^
   include/asm-generic/io.h:718:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     718 |         writesb(PCI_IOBASE + addr, buffer, count);
         |                 ~~~~~~~~~~ ^
   include/asm-generic/io.h:727:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     727 |         writesw(PCI_IOBASE + addr, buffer, count);
         |                 ~~~~~~~~~~ ^
   include/asm-generic/io.h:736:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     736 |         writesl(PCI_IOBASE + addr, buffer, count);
         |                 ~~~~~~~~~~ ^
>> drivers/hid/bpf/hid_bpf_struct_ops.c:146:23: error: incompatible function pointer types initializing 'int (*)(struct bpf_verifier_log *, const struct bpf_reg_state *, const struct bpf_prog *, int, int)' with an expression of type 'int (struct bpf_verifier_log *, const struct bpf_reg_state *, int, int)' [-Wincompatible-function-pointer-types]
     146 |         .btf_struct_access = hid_bpf_ops_btf_struct_access,
         |                              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   16 warnings and 1 error generated.

Kconfig warnings: (for reference only)
   WARNING: unmet direct dependencies detected for MODVERSIONS
   Depends on [n]: MODULES [=y] && !COMPILE_TEST [=y]
   Selected by [y]:
   - RANDSTRUCT_FULL [=y] && (CC_HAS_RANDSTRUCT [=y] || GCC_PLUGINS [=n]) && MODULES [=y]


vim +146 drivers/hid/bpf/hid_bpf_struct_ops.c

ebc0d8093e8c97 Benjamin Tissoires 2024-06-08  142  
ebc0d8093e8c97 Benjamin Tissoires 2024-06-08  143  static const struct bpf_verifier_ops hid_bpf_verifier_ops = {
bd0747543b3d97 Benjamin Tissoires 2024-06-08  144  	.get_func_proto = bpf_base_func_proto,
ebc0d8093e8c97 Benjamin Tissoires 2024-06-08  145  	.is_valid_access = hid_bpf_ops_is_valid_access,
ebc0d8093e8c97 Benjamin Tissoires 2024-06-08 @146  	.btf_struct_access = hid_bpf_ops_btf_struct_access,
ebc0d8093e8c97 Benjamin Tissoires 2024-06-08  147  };
ebc0d8093e8c97 Benjamin Tissoires 2024-06-08  148  

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH net-next 3/4] net/smc: Introduce smc_bpf_ops
  2024-10-25  0:26   ` Martin KaFai Lau
@ 2024-10-25 11:05     ` D. Wythe
  2024-10-25 18:30       ` Martin KaFai Lau
  0 siblings, 1 reply; 20+ messages in thread
From: D. Wythe @ 2024-10-25 11:05 UTC (permalink / raw)
  To: Martin KaFai Lau
  Cc: kgraul, wenjia, jaka, ast, daniel, andrii, pabeni, song, sdf,
	haoluo, yhs, edumazet, john.fastabend, kpsingh, jolsa, guwen,
	kuba, davem, netdev, linux-s390, linux-rdma, bpf, dtcccc



On 10/25/24 8:26 AM, Martin KaFai Lau wrote:
> On 10/23/24 7:42 PM, D. Wythe wrote:
>> From: "D. Wythe" <alibuda@linux.alibaba.com>
>>
>> The introduction of IPPROTO_SMC enables eBPF programs to determine
>> whether to use SMC based on the context of socket creation, such as
>> network namespaces, PID and comm name, etc.
>>
>> As a subsequent enhancement, this patch introduces a new hook for eBPF
>> programs that allows decisions on whether to use SMC or not at runtime,
>> including but not limited to local/remote IP address or ports. In
>> simpler words, this feature allows modifications to syn_smc through eBPF
>> programs before the TCP three-way handshake got established.
>>
>> Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
>> ---
>>   include/linux/tcp.h   |   2 +-
>>   include/net/smc.h     |  47 +++++++++++
>>   include/net/tcp.h     |   6 ++
>>   net/ipv4/tcp_input.c  |   3 +-
>>   net/ipv4/tcp_output.c |  14 +++-
>>   net/smc/Kconfig       |  12 +++
>>   net/smc/Makefile      |   1 +
>>   net/smc/af_smc.c      |  38 ++++++---
>>   net/smc/smc.h         |   4 +
>>   net/smc/smc_bpf.c     | 212 ++++++++++++++++++++++++++++++++++++++++++++++++++
>>   net/smc/smc_bpf.h     |  34 ++++++++
>>   11 files changed, 357 insertions(+), 16 deletions(-)
>>   create mode 100644 net/smc/smc_bpf.c
>>   create mode 100644 net/smc/smc_bpf.h
>>
>> diff --git a/include/linux/tcp.h b/include/linux/tcp.h
>> index 6a5e08b..4ef160a 100644
>> --- a/include/linux/tcp.h
>> +++ b/include/linux/tcp.h
>> @@ -478,7 +478,7 @@ struct tcp_sock {
>>   #endif
>>   #if IS_ENABLED(CONFIG_SMC)
>>       bool    syn_smc;    /* SYN includes SMC */
>> -    bool    (*smc_hs_congested)(const struct sock *sk);
>> +    struct tcpsmc_ctx *smc;
>>   #endif
>>   #if defined(CONFIG_TCP_MD5SIG) || defined(CONFIG_TCP_AO)
>> diff --git a/include/net/smc.h b/include/net/smc.h
>> index db84e4e..34ab2c6 100644
>> --- a/include/net/smc.h
>> +++ b/include/net/smc.h
>> @@ -18,6 +18,8 @@
>>   #include "linux/ism.h"
>>   struct sock;
>> +struct tcp_sock;
>> +struct inet_request_sock;
>>   #define SMC_MAX_PNETID_LEN    16    /* Max. length of PNET id */
>> @@ -97,4 +99,49 @@ struct smcd_dev {
>>       u8 going_away : 1;
>>   };
>> +/*
>> + * This structure is used to store the parameters passed to the member of struct_ops.
>> + * Due to the BPF verifier cannot restrict the writing of bit fields, such as limiting
>> + * it to only write ireq->smc_ok. Using kfunc can solve this issue, but we don't want
>> + * to introduce a kfunc with such a narrow function.
> 
> imo, adding kfunc is fine.
> 
>> + *
>> + * Moreover, using this structure for unified parameters also addresses another
>> + * potential issue. Currently, kfunc cannot recognize the calling context
>> + * through BPF's existing structure. In the future, we can solve this problem
>> + * by passing this ctx to kfunc.
> 
> This part I don't understand. How is it different from the "tcp_cubic_kfunc_set" allowed in 
> tcp_congestion_ops?

Hi Martin,

Yes, creating an independent kfunc for each callback and filtering via expected_attach_type can 
indeed solve the problem.

Our main concern is to avoid introducing kfuncs as much as possible. For our subsystem, we might 
need to maintain it in a way that maintains a uapi, as we certainly have user applications depending 
on it.

This is also why we need to create a separate ctx, as there’s no way to restrict bit writes, so we 
created a ctx->smc_ok that is allowed to write.

This is also why we had to create a separate structure, tcpsmc_ctx ...

However, I now realize that compromising to avoid introducing kfuncs has gone too far, affecting the 
readability of the code. I will try to use kfuncs in the next version to solve those issues.


> 
>> + */
>> +struct smc_bpf_ops_ctx {
>> +    struct {
>> +        struct tcp_sock *tp;
>> +    } set_option;
>> +    struct {
>> +        const struct tcp_sock *tp;
>> +        struct inet_request_sock *ireq;
>> +        int smc_ok;
>> +    } set_option_cond;
>> +};
> 
> There is no need to create one single ctx for struct_ops prog. struct_ops prog can take >1 args and 
> different ops can take different args.
> 

Same reason with concern on kfunc. I'll change it in next version.


>> +
>> +struct smc_bpf_ops {
>> +    /* priavte */
>> +
>> +    struct list_head    list;
>> +
>> +    /* public */
>> +
>> +    /* Invoked before computing SMC option for SYN packets.
>> +     * We can control whether to set SMC options by modifying
>> +     * ctx->set_option->tp->syn_smc.
>> +     * This's also the only member that can be modified now.
>> +     * Only member in ctx->set_option is valid for this callback.
>> +     */
>> +    void (*set_option)(struct smc_bpf_ops_ctx *ctx);
>> +
>> +    /* Invoked before Set up SMC options for SYN-ACK packets
>> +     * We can control whether to respond SMC options by modifying
>> +     * ctx->set_option_cond.smc_ok.
>> +     * Only member in ctx->set_option_cond is valid for this callback.
>> +     */
>> +    void (*set_option_cond)(struct smc_bpf_ops_ctx *ctx);
> 
> The struct smc_bpf_ops already has set_option and set_option_cnd, but...
> 
>> +};
>> +
>>   #endif    /* _SMC_H */
>> diff --git a/include/net/tcp.h b/include/net/tcp.h
>> index 739a9fb..c322443 100644
>> --- a/include/net/tcp.h
>> +++ b/include/net/tcp.h
>> @@ -2730,6 +2730,12 @@ static inline void tcp_bpf_rtt(struct sock *sk, long mrtt, u32 srtt)
>>   #if IS_ENABLED(CONFIG_SMC)
>>   extern struct static_key_false tcp_have_smc;
>> +struct tcpsmc_ctx {
>> +    /* Invoked before computing SMC option for SYN packets. */
>> +    void (*set_option)(struct tcp_sock *tp);
>> +    /* Invoked before Set up SMC options for SYN-ACK packets */
>> +    void (*set_option_cond)(const struct tcp_sock *tp, struct inet_request_sock *ireq);
>> +};
> 
> another new struct tcpsmc_ctx has exactly the same functions (at least the same name) but different 
> arguments. I don't understand why this duplicate, is it because the need to prepare the "struct 
> smc_bpf_ops_ctx"?

Yes, same reason with concern on kfunc. I'll change it in next version.

> 
> The "struct tcpsmc_ctx" should be the "struct smc_bpf_ops" itself.
> 
> [ ... ]
> 
>> +static int smc_bpf_ops_btf_struct_access(struct bpf_verifier_log *log,
>> +                     const struct bpf_reg_state *reg,
>> +                     const struct bpf_prog *prog,
>> +                     int off, int size)
>> +{
>> +    const struct btf_member *member;
>> +    const char *mname;
>> +    int member_idx;
>> +
>> +    member_idx = prog->expected_attach_type;
>> +    if (member_idx >= btf_type_vlen(smc_bpf_ops_type))
>> +        goto out_err;
>> +
>> +    member = &btf_type_member(smc_bpf_ops_type)[member_idx];
>> +    mname = btf_str_by_offset(saved_btf, member->name_off);
>> +
>> +    if (!strcmp(mname, "set_option")) {
> 
> btf_member_bit_offset can be used instead of strcmp. Take a look at bpf_tcp_ca.c and kernel/sched/ext.c
> 

Got it, thanks for that.

Besides, it seems that we don't need the export btf_str_by_offset anymore in that way.
I'll remove it in the next version.


>> +        /* only support to modify tcp_sock->syn_smc */
>> +        if (reg->btf_id == tcp_sock_id &&
>> +            off == offsetof(struct tcp_sock, syn_smc) &&
>> +            off + size == offsetofend(struct tcp_sock, syn_smc))
>> +            return 0;
>> +    } else if (!strcmp(mname, "set_option_cond")) {
>> +        /* only support to modify smc_bpf_ops_ctx->smc_ok */
>> +        if (reg->btf_id == smc_bpf_ops_ctx_id &&
>> +            off == offsetof(struct smc_bpf_ops_ctx, set_option_cond.smc_ok) &&
>> +            off + size == offsetofend(struct smc_bpf_ops_ctx, set_option_cond.smc_ok))
>> +            return 0;
>> +    }
>> +
>> +out_err:
>> +    return -EACCES;
>> +}
>> +
>> +static const struct bpf_verifier_ops smc_bpf_verifier_ops = {
>> +    .get_func_proto = bpf_base_func_proto,
>> +    .is_valid_access = bpf_tracing_btf_ctx_access,
>> +    .btf_struct_access = smc_bpf_ops_btf_struct_access,
>> +};
>> +
>> +static struct bpf_struct_ops bpf_smc_bpf_ops = {
>> +    .init = smc_bpf_ops_init,
>> +    .name = "smc_bpf_ops",
>> +    .reg = smc_bpf_ops_reg,
>> +    .unreg = smc_bpf_ops_unreg,
>> +    .cfi_stubs = &__bpf_smc_bpf_ops,
>> +    .verifier_ops = &smc_bpf_verifier_ops,
>> +    .init_member = smc_bpf_ops_init_member,
>> +    .check_member = smc_bpf_ops_check_member,
>> +    .owner = THIS_MODULE,
>> +};
>> +
>> +int smc_bpf_struct_ops_init(void)
>> +{
>> +    return register_bpf_struct_ops(&bpf_smc_bpf_ops, smc_bpf_ops);
>> +}
>> +
>> +void bpf_smc_set_tcp_option(struct tcp_sock *tp)
>> +{
>> +    struct smc_bpf_ops_ctx ops_ctx = {};
>> +    struct smc_bpf_ops *ops;
>> +
>> +    ops_ctx.set_option.tp = tp;
> 
> All this initialization should be unnecessary. Directly pass tp instead.
> 

Same reason with kfunc concern. I'll change it in next version.

>> +
>> +    rcu_read_lock();
>> +    list_for_each_entry_rcu(ops, &smc_bpf_ops_list, list) {
> 
> Does it need to have a list (meaning >1) of smc_bpf_ops to act on a sock? The ordering expectation 
> is hard to manage.
> 

Considering that the SMC modules also has its own ops that needs to be registered on it (the logic 
of smc_limit_fs), and need to be all executed, perhaps a list is a more suitable choice.


>> +        ops->set_option(&ops_ctx);
> 
> A dumb question. This will only affect AF_SMC (or AF_INET[6]/IPPROTO_SMC) socket but not the 
> AF_INET[6]/IPPROTO_{TCP,UDP} socket?
> 

Yes, it only affects AF_SMC, AF_SMC6, or IPPROTO_SMC sockets. Due to only SMC sockets will set 
tp->syn_smc, and we will check it before calling the very ops.

Best wishes,
D.

> pw-bot: cr
> 
>> +    }
>> +    rcu_read_unlock();
>> +}




^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next 2/4] bpf: allow to access bpf_prog during bpf_struct_access
  2024-10-24  2:42 ` [PATCH bpf-next 2/4] bpf: allow to access bpf_prog during bpf_struct_access D. Wythe
  2024-10-25  9:14   ` kernel test robot
@ 2024-10-25 12:20   ` kernel test robot
  1 sibling, 0 replies; 20+ messages in thread
From: kernel test robot @ 2024-10-25 12:20 UTC (permalink / raw)
  To: D. Wythe, kgraul, wenjia, jaka, ast, daniel, andrii, martin.lau,
	pabeni, song, sdf, haoluo, yhs, edumazet, john.fastabend, kpsingh,
	jolsa, guwen
  Cc: oe-kbuild-all, kuba, davem, netdev, linux-s390, linux-rdma, bpf,
	dtcccc

Hi Wythe,

kernel test robot noticed the following build errors:

[auto build test ERROR on bpf-next/master]

url:    https://github.com/intel-lab-lkp/linux/commits/D-Wythe/bpf-export-necessary-sympols-for-modules/20241024-104903
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
patch link:    https://lore.kernel.org/r/1729737768-124596-3-git-send-email-alibuda%40linux.alibaba.com
patch subject: [PATCH bpf-next 2/4] bpf: allow to access bpf_prog during bpf_struct_access
config: loongarch-allmodconfig (https://download.01.org/0day-ci/archive/20241025/202410251955.HSEjo06V-lkp@intel.com/config)
compiler: loongarch64-linux-gcc (GCC) 14.1.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241025/202410251955.HSEjo06V-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202410251955.HSEjo06V-lkp@intel.com/

All errors (new ones prefixed by >>):

>> drivers/hid/bpf/hid_bpf_struct_ops.c:146:30: error: initialization of 'int (*)(struct bpf_verifier_log *, const struct bpf_reg_state *, const struct bpf_prog *, int,  int)' from incompatible pointer type 'int (*)(struct bpf_verifier_log *, const struct bpf_reg_state *, int,  int)' [-Wincompatible-pointer-types]
     146 |         .btf_struct_access = hid_bpf_ops_btf_struct_access,
         |                              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   drivers/hid/bpf/hid_bpf_struct_ops.c:146:30: note: (near initialization for 'hid_bpf_verifier_ops.btf_struct_access')


vim +146 drivers/hid/bpf/hid_bpf_struct_ops.c

ebc0d8093e8c97 Benjamin Tissoires 2024-06-08  142  
ebc0d8093e8c97 Benjamin Tissoires 2024-06-08  143  static const struct bpf_verifier_ops hid_bpf_verifier_ops = {
bd0747543b3d97 Benjamin Tissoires 2024-06-08  144  	.get_func_proto = bpf_base_func_proto,
ebc0d8093e8c97 Benjamin Tissoires 2024-06-08  145  	.is_valid_access = hid_bpf_ops_is_valid_access,
ebc0d8093e8c97 Benjamin Tissoires 2024-06-08 @146  	.btf_struct_access = hid_bpf_ops_btf_struct_access,
ebc0d8093e8c97 Benjamin Tissoires 2024-06-08  147  };
ebc0d8093e8c97 Benjamin Tissoires 2024-06-08  148  

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH net-next 3/4] net/smc: Introduce smc_bpf_ops
  2024-10-25 11:05     ` D. Wythe
@ 2024-10-25 18:30       ` Martin KaFai Lau
  2024-10-29  8:53         ` D. Wythe
  0 siblings, 1 reply; 20+ messages in thread
From: Martin KaFai Lau @ 2024-10-25 18:30 UTC (permalink / raw)
  To: D. Wythe
  Cc: kgraul, wenjia, jaka, ast, daniel, andrii, pabeni, song, sdf,
	haoluo, yhs, edumazet, john.fastabend, kpsingh, jolsa, guwen,
	kuba, davem, netdev, linux-s390, linux-rdma, bpf, dtcccc

On 10/25/24 4:05 AM, D. Wythe wrote:
> Our main concern is to avoid introducing kfuncs as much as possible. For our 
> subsystem, we might need to maintain it in a way that maintains a uapi, as we 
> certainly have user applications depending on it.

The smc_bpf_ops can read/write the tp and ireq. In patch 4, there is 
'tp->syn_smc = 1'. I assume the real bpf prog will read something from the tp to 
make the decision also. Note that tp/ireq is also not in the uapi but the CO-RE 
can help in case the tp->syn_smc bool is moved around.

 From looking at the selftest in patch 4 again, I think all it needs is for the 
bpf prog (i.e. the ops) to return a bool instead of allowing the bpf prog to 
write or call a kfunc to change the tp/ireq.



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH net-next 3/4] net/smc: Introduce smc_bpf_ops
  2024-10-25 18:30       ` Martin KaFai Lau
@ 2024-10-29  8:53         ` D. Wythe
  0 siblings, 0 replies; 20+ messages in thread
From: D. Wythe @ 2024-10-29  8:53 UTC (permalink / raw)
  To: Martin KaFai Lau
  Cc: kgraul, wenjia, jaka, ast, daniel, andrii, pabeni, song, sdf,
	haoluo, yhs, edumazet, john.fastabend, kpsingh, jolsa, guwen,
	kuba, davem, netdev, linux-s390, linux-rdma, bpf, dtcccc



On 10/26/24 2:30 AM, Martin KaFai Lau wrote:
> On 10/25/24 4:05 AM, D. Wythe wrote:
>> Our main concern is to avoid introducing kfuncs as much as possible. For our subsystem, we might 
>> need to maintain it in a way that maintains a uapi, as we certainly have user applications 
>> depending on it.
> 
> The smc_bpf_ops can read/write the tp and ireq. In patch 4, there is 'tp->syn_smc = 1'. I assume the 
> real bpf prog will read something from the tp to make the decision also. Note that tp/ireq is also 
> not in the uapi but the CO-RE can help in case the tp->syn_smc bool is moved around.
> 
>  From looking at the selftest in patch 4 again, I think all it needs is for the bpf prog (i.e. the 
> ops) to return a bool instead of allowing the bpf prog to write or call a kfunc to change the tp/ireq.
> 

Hi Martin,

At the beginning, I did modify it by returning values, but later I wanted to make this ops more 
universal, so I considered influencing the behavior by modifying the tp without returning any value. 
But considering we currently do not have any other needs, perhaps modifying it by returning a value 
would be more appropriate.

And If that's the case, we won't need to add new prog parameters to the struct_access anymore. I'll 
try this in the next series.

Thanks,
D. Wythe


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next 4/4] bpf/selftests: add simple selftest for bpf_smc_ops
  2024-10-24  2:42 ` [PATCH bpf-next 4/4] bpf/selftests: add simple selftest for bpf_smc_ops D. Wythe
  2024-10-24  4:04   ` D. Wythe
@ 2024-11-03 13:01   ` Zhu Yanjun
  2024-11-21  2:00     ` D. Wythe
  1 sibling, 1 reply; 20+ messages in thread
From: Zhu Yanjun @ 2024-11-03 13:01 UTC (permalink / raw)
  To: D. Wythe, kgraul, wenjia, jaka, ast, daniel, andrii, martin.lau,
	pabeni, song, sdf, haoluo, yhs, edumazet, john.fastabend, kpsingh,
	jolsa, guwen
  Cc: kuba, davem, netdev, linux-s390, linux-rdma, bpf, dtcccc

在 2024/10/24 4:42, D. Wythe 写道:
> From: "D. Wythe" <alibuda@linux.alibaba.com>
> 
> This PATCH adds a tiny selftest for bpf_smc_ops, to verify the ability
> to attach and write access.
> 
> Follow the steps below to run this test.
> 
> make -C tools/testing/selftests/bpf
> cd tools/testing/selftests/bpf
> sudo ./test_progs -t smc

Thanks a lot.

# ./test_progs -t smc
#27/1    bpf_smc/load:OK
#27      bpf_smc:OK
Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED

The above command is based on several kernel modules. After these 
dependent kernel modules are loaded, then can run the above command 
successfully.

Zhu Yanjun

> 
> Results shows:
> Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED
> 
> Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
> ---
>   .../selftests/bpf/prog_tests/test_bpf_smc.c        | 21 +++++++++++
>   tools/testing/selftests/bpf/progs/bpf_smc.c        | 44 ++++++++++++++++++++++
>   2 files changed, 65 insertions(+)
>   create mode 100644 tools/testing/selftests/bpf/prog_tests/test_bpf_smc.c
>   create mode 100644 tools/testing/selftests/bpf/progs/bpf_smc.c
> 
> diff --git a/tools/testing/selftests/bpf/prog_tests/test_bpf_smc.c b/tools/testing/selftests/bpf/prog_tests/test_bpf_smc.c
> new file mode 100644
> index 00000000..2299853
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/prog_tests/test_bpf_smc.c
> @@ -0,0 +1,21 @@
> +// SPDX-License-Identifier: GPL-2.0
> +#include <test_progs.h>
> +
> +#include "bpf_smc.skel.h"
> +
> +static void load(void)
> +{
> +	struct bpf_smc *skel;
> +
> +	skel = bpf_smc__open_and_load();
> +	if (!ASSERT_OK_PTR(skel, "bpf_smc__open_and_load"))
> +		return;
> +
> +	bpf_smc__destroy(skel);
> +}
> +
> +void test_bpf_smc(void)
> +{
> +	if (test__start_subtest("load"))
> +		load();
> +}
> diff --git a/tools/testing/selftests/bpf/progs/bpf_smc.c b/tools/testing/selftests/bpf/progs/bpf_smc.c
> new file mode 100644
> index 00000000..ebff477
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/progs/bpf_smc.c
> @@ -0,0 +1,44 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +#include "vmlinux.h"
> +
> +#include <bpf/bpf_helpers.h>
> +#include <bpf/bpf_tracing.h>
> +
> +char _license[] SEC("license") = "GPL";
> +
> +struct smc_bpf_ops_ctx {
> +	struct {
> +		struct tcp_sock *tp;
> +	} set_option;
> +	struct {
> +		const struct tcp_sock *tp;
> +		struct inet_request_sock *ireq;
> +		int smc_ok;
> +	} set_option_cond;
> +};
> +
> +struct smc_bpf_ops {
> +	void (*set_option)(struct smc_bpf_ops_ctx *ctx);
> +	void (*set_option_cond)(struct smc_bpf_ops_ctx *ctx);
> +};
> +
> +SEC("struct_ops/bpf_smc_set_tcp_option_cond")
> +void BPF_PROG(bpf_smc_set_tcp_option_cond, struct smc_bpf_ops_ctx *arg)
> +{
> +	arg->set_option_cond.smc_ok = 1;
> +}
> +
> +SEC("struct_ops/bpf_smc_set_tcp_option")
> +void BPF_PROG(bpf_smc_set_tcp_option, struct smc_bpf_ops_ctx *arg)
> +{
> +	struct tcp_sock *tp = arg->set_option.tp;
> +
> +	tp->syn_smc = 1;
> +}
> +
> +SEC(".struct_ops.link")
> +struct smc_bpf_ops sample_smc_bpf_ops = {
> +	.set_option         = (void *) bpf_smc_set_tcp_option,
> +	.set_option_cond    = (void *) bpf_smc_set_tcp_option_cond,
> +};


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next 4/4] bpf/selftests: add simple selftest for bpf_smc_ops
  2024-11-03 13:01   ` Zhu Yanjun
@ 2024-11-21  2:00     ` D. Wythe
  2024-11-25 10:52       ` Zhu Yanjun
  0 siblings, 1 reply; 20+ messages in thread
From: D. Wythe @ 2024-11-21  2:00 UTC (permalink / raw)
  To: Zhu Yanjun, kgraul, wenjia, jaka, ast, daniel, andrii, martin.lau,
	pabeni, song, sdf, haoluo, yhs, edumazet, john.fastabend, kpsingh,
	jolsa, guwen
  Cc: kuba, davem, netdev, linux-s390, linux-rdma, bpf, dtcccc



On 11/3/24 9:01 PM, Zhu Yanjun wrote:
> 在 2024/10/24 4:42, D. Wythe 写道:
>> From: "D. Wythe" <alibuda@linux.alibaba.com>
>>
>> This PATCH adds a tiny selftest for bpf_smc_ops, to verify the ability
>> to attach and write access.
>>
>> Follow the steps below to run this test.
>>
>> make -C tools/testing/selftests/bpf
>> cd tools/testing/selftests/bpf
>> sudo ./test_progs -t smc
> 
> Thanks a lot.
> 
> # ./test_progs -t smc
> #27/1    bpf_smc/load:OK
> #27      bpf_smc:OK
> Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED
> 
> The above command is based on several kernel modules. After these dependent kernel modules are 
> loaded, then can run the above command successfully.
> 
> Zhu Yanjun
> 

Hi, Yanjun

This is indeed a problem, a better way may be to create a separate testing directory for SMC, and we 
are trying to do this.

Best wishes,
D. Wythe

>>
>> Results shows:
>> Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED
>>
>> Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
>> ---
>>   .../selftests/bpf/prog_tests/test_bpf_smc.c        | 21 +++++++++++
>>   tools/testing/selftests/bpf/progs/bpf_smc.c        | 44 ++++++++++++++++++++++
>>   2 files changed, 65 insertions(+)
>>   create mode 100644 tools/testing/selftests/bpf/prog_tests/test_bpf_smc.c
>>   create mode 100644 tools/testing/selftests/bpf/progs/bpf_smc.c
>>
>> diff --git a/tools/testing/selftests/bpf/prog_tests/test_bpf_smc.c 
>> b/tools/testing/selftests/bpf/prog_tests/test_bpf_smc.c
>> new file mode 100644
>> index 00000000..2299853
>> --- /dev/null
>> +++ b/tools/testing/selftests/bpf/prog_tests/test_bpf_smc.c
>> @@ -0,0 +1,21 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +#include <test_progs.h>
>> +
>> +#include "bpf_smc.skel.h"
>> +
>> +static void load(void)
>> +{
>> +    struct bpf_smc *skel;
>> +
>> +    skel = bpf_smc__open_and_load();
>> +    if (!ASSERT_OK_PTR(skel, "bpf_smc__open_and_load"))
>> +        return;
>> +
>> +    bpf_smc__destroy(skel);
>> +}
>> +
>> +void test_bpf_smc(void)
>> +{
>> +    if (test__start_subtest("load"))
>> +        load();
>> +}
>> diff --git a/tools/testing/selftests/bpf/progs/bpf_smc.c 
>> b/tools/testing/selftests/bpf/progs/bpf_smc.c
>> new file mode 100644
>> index 00000000..ebff477
>> --- /dev/null
>> +++ b/tools/testing/selftests/bpf/progs/bpf_smc.c
>> @@ -0,0 +1,44 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +
>> +#include "vmlinux.h"
>> +
>> +#include <bpf/bpf_helpers.h>
>> +#include <bpf/bpf_tracing.h>
>> +
>> +char _license[] SEC("license") = "GPL";
>> +
>> +struct smc_bpf_ops_ctx {
>> +    struct {
>> +        struct tcp_sock *tp;
>> +    } set_option;
>> +    struct {
>> +        const struct tcp_sock *tp;
>> +        struct inet_request_sock *ireq;
>> +        int smc_ok;
>> +    } set_option_cond;
>> +};
>> +
>> +struct smc_bpf_ops {
>> +    void (*set_option)(struct smc_bpf_ops_ctx *ctx);
>> +    void (*set_option_cond)(struct smc_bpf_ops_ctx *ctx);
>> +};
>> +
>> +SEC("struct_ops/bpf_smc_set_tcp_option_cond")
>> +void BPF_PROG(bpf_smc_set_tcp_option_cond, struct smc_bpf_ops_ctx *arg)
>> +{
>> +    arg->set_option_cond.smc_ok = 1;
>> +}
>> +
>> +SEC("struct_ops/bpf_smc_set_tcp_option")
>> +void BPF_PROG(bpf_smc_set_tcp_option, struct smc_bpf_ops_ctx *arg)
>> +{
>> +    struct tcp_sock *tp = arg->set_option.tp;
>> +
>> +    tp->syn_smc = 1;
>> +}
>> +
>> +SEC(".struct_ops.link")
>> +struct smc_bpf_ops sample_smc_bpf_ops = {
>> +    .set_option         = (void *) bpf_smc_set_tcp_option,
>> +    .set_option_cond    = (void *) bpf_smc_set_tcp_option_cond,
>> +};

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next 4/4] bpf/selftests: add simple selftest for bpf_smc_ops
  2024-11-21  2:00     ` D. Wythe
@ 2024-11-25 10:52       ` Zhu Yanjun
  2024-11-25 23:32         ` Martin KaFai Lau
  0 siblings, 1 reply; 20+ messages in thread
From: Zhu Yanjun @ 2024-11-25 10:52 UTC (permalink / raw)
  To: D. Wythe, kgraul, wenjia, jaka, ast, daniel, andrii, martin.lau,
	pabeni, song, sdf, haoluo, yhs, edumazet, john.fastabend, kpsingh,
	jolsa, guwen
  Cc: kuba, davem, netdev, linux-s390, linux-rdma, bpf, dtcccc


On 21.11.24 03:00, D. Wythe wrote:
>
>
> On 11/3/24 9:01 PM, Zhu Yanjun wrote:
>> 在 2024/10/24 4:42, D. Wythe 写道:
>>> From: "D. Wythe" <alibuda@linux.alibaba.com>
>>>
>>> This PATCH adds a tiny selftest for bpf_smc_ops, to verify the ability
>>> to attach and write access.
>>>
>>> Follow the steps below to run this test.
>>>
>>> make -C tools/testing/selftests/bpf
>>> cd tools/testing/selftests/bpf
>>> sudo ./test_progs -t smc
>>
>> Thanks a lot.
>>
>> # ./test_progs -t smc
>> #27/1    bpf_smc/load:OK
>> #27      bpf_smc:OK
>> Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED
>>
>> The above command is based on several kernel modules. After these 
>> dependent kernel modules are loaded, then can run the above command 
>> successfully.
>>
>> Zhu Yanjun
>>
>
> Hi, Yanjun
>
> This is indeed a problem, a better way may be to create a separate 
> testing directory for SMC, and we are trying to do this.

Got it. In the latest patch series, if a test program in sample/bpf can 
verify this bpf feature, it is better than a selftest program in the 
directory tools/testing/selftests/bpf.

I delved into this selftest tool. It seems that this selftest tool only 
makes the basic checks. A test program in sample/bpf can do more.

I mean, it is very nice that a selftest tool can make selftest on smc 
bpf. But it is better that a test program in sample/bpf can make some 
parameter changes in smc.

These parameter changes are mentioned in the previous commits.

"

     As a subsequent enhancement, this patch introduces a new hook for eBPF
     programs that allows decisions on whether to use SMC or not at runtime,
     including but not limited to local/remote IP address or ports. In
     simpler words, this feature allows modifications to syn_smc through 
eBPF
     programs before the TCP three-way handshake got established.

"

Zhu Yanjun

>
> Best wishes,
> D. Wythe
>
>>>
>>> Results shows:
>>> Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED
>>>
>>> Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
>>> ---
>>>   .../selftests/bpf/prog_tests/test_bpf_smc.c        | 21 +++++++++++
>>>   tools/testing/selftests/bpf/progs/bpf_smc.c        | 44 
>>> ++++++++++++++++++++++
>>>   2 files changed, 65 insertions(+)
>>>   create mode 100644 
>>> tools/testing/selftests/bpf/prog_tests/test_bpf_smc.c
>>>   create mode 100644 tools/testing/selftests/bpf/progs/bpf_smc.c
>>>
>>> diff --git a/tools/testing/selftests/bpf/prog_tests/test_bpf_smc.c 
>>> b/tools/testing/selftests/bpf/prog_tests/test_bpf_smc.c
>>> new file mode 100644
>>> index 00000000..2299853
>>> --- /dev/null
>>> +++ b/tools/testing/selftests/bpf/prog_tests/test_bpf_smc.c
>>> @@ -0,0 +1,21 @@
>>> +// SPDX-License-Identifier: GPL-2.0
>>> +#include <test_progs.h>
>>> +
>>> +#include "bpf_smc.skel.h"
>>> +
>>> +static void load(void)
>>> +{
>>> +    struct bpf_smc *skel;
>>> +
>>> +    skel = bpf_smc__open_and_load();
>>> +    if (!ASSERT_OK_PTR(skel, "bpf_smc__open_and_load"))
>>> +        return;
>>> +
>>> +    bpf_smc__destroy(skel);
>>> +}
>>> +
>>> +void test_bpf_smc(void)
>>> +{
>>> +    if (test__start_subtest("load"))
>>> +        load();
>>> +}
>>> diff --git a/tools/testing/selftests/bpf/progs/bpf_smc.c 
>>> b/tools/testing/selftests/bpf/progs/bpf_smc.c
>>> new file mode 100644
>>> index 00000000..ebff477
>>> --- /dev/null
>>> +++ b/tools/testing/selftests/bpf/progs/bpf_smc.c
>>> @@ -0,0 +1,44 @@
>>> +// SPDX-License-Identifier: GPL-2.0
>>> +
>>> +#include "vmlinux.h"
>>> +
>>> +#include <bpf/bpf_helpers.h>
>>> +#include <bpf/bpf_tracing.h>
>>> +
>>> +char _license[] SEC("license") = "GPL";
>>> +
>>> +struct smc_bpf_ops_ctx {
>>> +    struct {
>>> +        struct tcp_sock *tp;
>>> +    } set_option;
>>> +    struct {
>>> +        const struct tcp_sock *tp;
>>> +        struct inet_request_sock *ireq;
>>> +        int smc_ok;
>>> +    } set_option_cond;
>>> +};
>>> +
>>> +struct smc_bpf_ops {
>>> +    void (*set_option)(struct smc_bpf_ops_ctx *ctx);
>>> +    void (*set_option_cond)(struct smc_bpf_ops_ctx *ctx);
>>> +};
>>> +
>>> +SEC("struct_ops/bpf_smc_set_tcp_option_cond")
>>> +void BPF_PROG(bpf_smc_set_tcp_option_cond, struct smc_bpf_ops_ctx 
>>> *arg)
>>> +{
>>> +    arg->set_option_cond.smc_ok = 1;
>>> +}
>>> +
>>> +SEC("struct_ops/bpf_smc_set_tcp_option")
>>> +void BPF_PROG(bpf_smc_set_tcp_option, struct smc_bpf_ops_ctx *arg)
>>> +{
>>> +    struct tcp_sock *tp = arg->set_option.tp;
>>> +
>>> +    tp->syn_smc = 1;
>>> +}
>>> +
>>> +SEC(".struct_ops.link")
>>> +struct smc_bpf_ops sample_smc_bpf_ops = {
>>> +    .set_option         = (void *) bpf_smc_set_tcp_option,
>>> +    .set_option_cond    = (void *) bpf_smc_set_tcp_option_cond,
>>> +};

-- 
Best Regards,
Yanjun.Zhu


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next 4/4] bpf/selftests: add simple selftest for bpf_smc_ops
  2024-11-25 10:52       ` Zhu Yanjun
@ 2024-11-25 23:32         ` Martin KaFai Lau
  2024-11-26  8:29           ` Zhu Yanjun
  2024-11-29  4:11           ` D. Wythe
  0 siblings, 2 replies; 20+ messages in thread
From: Martin KaFai Lau @ 2024-11-25 23:32 UTC (permalink / raw)
  To: Zhu Yanjun, D. Wythe
  Cc: kgraul, wenjia, jaka, ast, daniel, andrii, pabeni, song, sdf,
	haoluo, yhs, edumazet, john.fastabend, kpsingh, jolsa, guwen,
	kuba, davem, netdev, linux-s390, linux-rdma, bpf, dtcccc

On 11/25/24 2:52 AM, Zhu Yanjun wrote:
>>> # ./test_progs -t smc
>>> #27/1    bpf_smc/load:OK
>>> #27      bpf_smc:OK
>>> Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED
>>>
>>> The above command is based on several kernel modules. After these dependent 
>>> kernel modules are loaded, then can run the above command successfully.

>>
>> This is indeed a problem, a better way may be to create a separate testing 
>> directory for SMC, and we are trying to do this.
> 
> Got it. In the latest patch series, if a test program in sample/bpf can verify 
> this bpf feature, it is better than a selftest program in the directory tools/ 
> testing/selftests/bpf.
> 
> I delved into this selftest tool. It seems that this selftest tool only makes 
> the basic checks. A test program in sample/bpf can do more.

sample(s)/bpf? No new test should be added to samples/bpf which is obsolete. The 
bpf CI only runs tests under selftests/bpf.

There is selftests/bpf/config to tell the bpf CI about what kconfig needs to 
turn on.


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next 4/4] bpf/selftests: add simple selftest for bpf_smc_ops
  2024-11-25 23:32         ` Martin KaFai Lau
@ 2024-11-26  8:29           ` Zhu Yanjun
  2024-11-29  4:11           ` D. Wythe
  1 sibling, 0 replies; 20+ messages in thread
From: Zhu Yanjun @ 2024-11-26  8:29 UTC (permalink / raw)
  To: Martin KaFai Lau, D. Wythe
  Cc: kgraul, wenjia, jaka, ast, daniel, andrii, pabeni, song, sdf,
	haoluo, yhs, edumazet, john.fastabend, kpsingh, jolsa, guwen,
	kuba, davem, netdev, linux-s390, linux-rdma, bpf, dtcccc

在 2024/11/26 0:32, Martin KaFai Lau 写道:
> On 11/25/24 2:52 AM, Zhu Yanjun wrote:
>>>> # ./test_progs -t smc
>>>> #27/1    bpf_smc/load:OK
>>>> #27      bpf_smc:OK
>>>> Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED
>>>>
>>>> The above command is based on several kernel modules. After these 
>>>> dependent kernel modules are loaded, then can run the above command 
>>>> successfully.
> 
>>>
>>> This is indeed a problem, a better way may be to create a separate 
>>> testing directory for SMC, and we are trying to do this.
>>
>> Got it. In the latest patch series, if a test program in sample/bpf 
>> can verify this bpf feature, it is better than a selftest program in 
>> the directory tools/ testing/selftests/bpf.
>>
>> I delved into this selftest tool. It seems that this selftest tool 
>> only makes the basic checks. A test program in sample/bpf can do more.
> 
> sample(s)/bpf? No new test should be added to samples/bpf which is 
> obsolete. The bpf CI only runs tests under selftests/bpf.

Thanks for letting me know this, Martin.

In the past, with samples/bpf, we can know a lot of details of bpf 
samples. But in the selftests/bpf, it seems that only load method can be 
found. For example, in this smc bpf selftests commit, we can not find 
how to change parameters in smc. In the past, this method about how to 
change parameters can be found in samples/bpf.

I am not sure whether this selftests/bpf is designed for this simple 
tests or the detailed information can be found in other places.

Zhu Yanjun

> 
> There is selftests/bpf/config to tell the bpf CI about what kconfig 
> needs to turn on.
> 


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next 4/4] bpf/selftests: add simple selftest for bpf_smc_ops
  2024-11-25 23:32         ` Martin KaFai Lau
  2024-11-26  8:29           ` Zhu Yanjun
@ 2024-11-29  4:11           ` D. Wythe
  1 sibling, 0 replies; 20+ messages in thread
From: D. Wythe @ 2024-11-29  4:11 UTC (permalink / raw)
  To: Martin KaFai Lau, Zhu Yanjun
  Cc: kgraul, wenjia, jaka, ast, daniel, andrii, pabeni, song, sdf,
	haoluo, yhs, edumazet, john.fastabend, kpsingh, jolsa, guwen,
	kuba, davem, netdev, linux-s390, linux-rdma, bpf, dtcccc



On 11/26/24 7:32 AM, Martin KaFai Lau wrote:
> On 11/25/24 2:52 AM, Zhu Yanjun wrote:
>>>> # ./test_progs -t smc
>>>> #27/1    bpf_smc/load:OK
>>>> #27      bpf_smc:OK
>>>> Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED
>>>>
>>>> The above command is based on several kernel modules. After these dependent kernel modules are 
>>>> loaded, then can run the above command successfully.
> 
>>>
>>> This is indeed a problem, a better way may be to create a separate testing directory for SMC, and 
>>> we are trying to do this.
>>
>> Got it. In the latest patch series, if a test program in sample/bpf can verify this bpf feature, 
>> it is better than a selftest program in the directory tools/ testing/selftests/bpf.
>>
>> I delved into this selftest tool. It seems that this selftest tool only makes the basic checks. A 
>> test program in sample/bpf can do more.
> 
> sample(s)/bpf? No new test should be added to samples/bpf which is obsolete. The bpf CI only runs 
> tests under selftests/bpf.
> 
> There is selftests/bpf/config to tell the bpf CI about what kconfig needs to turn on.

Is it acceptable to add a new kconfig to selftests/bpf/config? I don't know that...

To solve the compilation problem of this test, we originally planned to add a separate testing 
directory to SMC. If adding a new kconfig is acceptable, it will make this patch simpler.

Best wishes,
D. Wythe

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2024-11-29  4:11 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-10-24  2:42 [PATCH bpf-next 0/4] net/smc: Introduce smc_bpf_ops D. Wythe
2024-10-24  2:42 ` [PATCH bpf-next 1/4] bpf: export necessary sympols for modules D. Wythe
2024-10-24  2:42 ` [PATCH bpf-next 2/4] bpf: allow to access bpf_prog during bpf_struct_access D. Wythe
2024-10-25  9:14   ` kernel test robot
2024-10-25 12:20   ` kernel test robot
2024-10-24  2:42 ` [PATCH net-next 3/4] net/smc: Introduce smc_bpf_ops D. Wythe
2024-10-25  0:26   ` Martin KaFai Lau
2024-10-25 11:05     ` D. Wythe
2024-10-25 18:30       ` Martin KaFai Lau
2024-10-29  8:53         ` D. Wythe
2024-10-24  2:42 ` [PATCH bpf-next 4/4] bpf/selftests: add simple selftest for bpf_smc_ops D. Wythe
2024-10-24  4:04   ` D. Wythe
2024-10-24  4:49     ` Tianchen Ding
2024-10-24  5:49       ` D. Wythe
2024-11-03 13:01   ` Zhu Yanjun
2024-11-21  2:00     ` D. Wythe
2024-11-25 10:52       ` Zhu Yanjun
2024-11-25 23:32         ` Martin KaFai Lau
2024-11-26  8:29           ` Zhu Yanjun
2024-11-29  4:11           ` D. Wythe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).