netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH ipsec-next 0/5] Support PTMU in tunnel mode for packet offload
@ 2025-02-05 18:20 Leon Romanovsky
  2025-02-05 18:20 ` [PATCH ipsec-next 1/5] xfrm: delay initialization of offload path till its actually requested Leon Romanovsky
                   ` (4 more replies)
  0 siblings, 5 replies; 18+ messages in thread
From: Leon Romanovsky @ 2025-02-05 18:20 UTC (permalink / raw)
  To: Steffen Klassert
  Cc: Andrew Lunn, Ayush Sawal, Bharat Bhushan, Eric Dumazet,
	Geetha sowjanya, hariprasad, Herbert Xu, intel-wired-lan,
	Jakub Kicinski, Jay Vosburgh, Jonathan Corbet, linux-doc,
	linux-rdma, Louis Peens, netdev, oss-drivers, Paolo Abeni,
	Potnuri Bharat Teja, Przemek Kitszel, Saeed Mahameed,
	Subbaraya Sundeep, Sunil Goutham, Tariq Toukan, Tony Nguyen,
	Ilia Lin

Hi,

This series refactors the xdo_dev_offload_ok() to be global place for
drivers to check if their offload can perform encryption for xmit packets.

Such common place gives us an option to check MTU and PMTU at one place.

Thanks

Leon Romanovsky (5):
  xfrm: delay initialization of offload path till its actually requested
  xfrm: simplify SA initialization routine
  xfrm: rely on XFRM offload
  xfrm: provide common xdo_dev_offload_ok callback implementation
  xfrm: check for PMTU in tunnel mode for packet offload

 Documentation/networking/xfrm_device.rst      |  3 +-
 drivers/net/bonding/bond_main.c               | 16 ++-----
 .../net/ethernet/chelsio/cxgb4/cxgb4_main.c   | 21 ---------
 .../inline_crypto/ch_ipsec/chcr_ipsec.c       | 16 -------
 .../net/ethernet/intel/ixgbe/ixgbe_ipsec.c    | 21 ---------
 drivers/net/ethernet/intel/ixgbevf/ipsec.c    | 21 ---------
 .../marvell/octeontx2/nic/cn10k_ipsec.c       | 15 ------
 .../mellanox/mlx5/core/en_accel/ipsec.c       | 16 -------
 .../net/ethernet/netronome/nfp/crypto/ipsec.c | 11 -----
 drivers/net/netdevsim/ipsec.c                 | 11 -----
 drivers/net/netdevsim/netdevsim.h             |  1 -
 include/net/xfrm.h                            | 22 ++++++++-
 net/xfrm/xfrm_device.c                        | 47 ++++++++++++++-----
 net/xfrm/xfrm_output.c                        |  6 ++-
 net/xfrm/xfrm_state.c                         | 40 +++++++---------
 net/xfrm/xfrm_user.c                          |  2 +-
 16 files changed, 84 insertions(+), 185 deletions(-)

-- 
2.48.1


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH ipsec-next 1/5] xfrm: delay initialization of offload path till its actually requested
  2025-02-05 18:20 [PATCH ipsec-next 0/5] Support PTMU in tunnel mode for packet offload Leon Romanovsky
@ 2025-02-05 18:20 ` Leon Romanovsky
  2025-02-06  8:46   ` Bharat Bhushan
  2025-02-05 18:20 ` [PATCH ipsec-next 2/5] xfrm: simplify SA initialization routine Leon Romanovsky
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 18+ messages in thread
From: Leon Romanovsky @ 2025-02-05 18:20 UTC (permalink / raw)
  To: Steffen Klassert
  Cc: Leon Romanovsky, Andrew Lunn, Ayush Sawal, Bharat Bhushan,
	Eric Dumazet, Geetha sowjanya, hariprasad, Herbert Xu,
	intel-wired-lan, Jakub Kicinski, Jay Vosburgh, Jonathan Corbet,
	linux-doc, linux-rdma, Louis Peens, netdev, oss-drivers,
	Paolo Abeni, Potnuri Bharat Teja, Przemek Kitszel, Saeed Mahameed,
	Subbaraya Sundeep, Sunil Goutham, Tariq Toukan, Tony Nguyen,
	Ilia Lin

From: Leon Romanovsky <leonro@nvidia.com>

XFRM offload path is probed even if offload isn't needed at all. Let's
make sure that x->type_offload pointer stays NULL for such path to
reduce ambiguity.

Fixes: 9d389d7f84bb ("xfrm: Add a xfrm type offload.")
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
 include/net/xfrm.h     | 12 +++++++++++-
 net/xfrm/xfrm_device.c | 14 +++++++++-----
 net/xfrm/xfrm_state.c  | 22 +++++++++-------------
 net/xfrm/xfrm_user.c   |  2 +-
 4 files changed, 30 insertions(+), 20 deletions(-)

diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index ed4b83696c77..28355a5be5b9 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -464,6 +464,16 @@ struct xfrm_type_offload {
 
 int xfrm_register_type_offload(const struct xfrm_type_offload *type, unsigned short family);
 void xfrm_unregister_type_offload(const struct xfrm_type_offload *type, unsigned short family);
+const struct xfrm_type_offload *xfrm_get_type_offload(u8 proto,
+						      unsigned short family);
+static inline void xfrm_put_type_offload(const struct xfrm_type_offload *type)
+{
+	if (!type)
+		return;
+
+	module_put(type->owner);
+}
+
 
 /**
  * struct xfrm_mode_cbs - XFRM mode callbacks
@@ -1760,7 +1770,7 @@ void xfrm_spd_getinfo(struct net *net, struct xfrmk_spdinfo *si);
 u32 xfrm_replay_seqhi(struct xfrm_state *x, __be32 net_seq);
 int xfrm_init_replay(struct xfrm_state *x, struct netlink_ext_ack *extack);
 u32 xfrm_state_mtu(struct xfrm_state *x, int mtu);
-int __xfrm_init_state(struct xfrm_state *x, bool init_replay, bool offload,
+int __xfrm_init_state(struct xfrm_state *x, bool init_replay,
 		      struct netlink_ext_ack *extack);
 int xfrm_init_state(struct xfrm_state *x);
 int xfrm_input(struct sk_buff *skb, int nexthdr, __be32 spi, int encap_type);
diff --git a/net/xfrm/xfrm_device.c b/net/xfrm/xfrm_device.c
index d1fa94e52cea..e01a7f5a4c75 100644
--- a/net/xfrm/xfrm_device.c
+++ b/net/xfrm/xfrm_device.c
@@ -244,11 +244,6 @@ int xfrm_dev_state_add(struct net *net, struct xfrm_state *x,
 	xfrm_address_t *daddr;
 	bool is_packet_offload;
 
-	if (!x->type_offload) {
-		NL_SET_ERR_MSG(extack, "Type doesn't support offload");
-		return -EINVAL;
-	}
-
 	if (xuo->flags &
 	    ~(XFRM_OFFLOAD_IPV6 | XFRM_OFFLOAD_INBOUND | XFRM_OFFLOAD_PACKET)) {
 		NL_SET_ERR_MSG(extack, "Unrecognized flags in offload request");
@@ -310,6 +305,13 @@ int xfrm_dev_state_add(struct net *net, struct xfrm_state *x,
 		return -EINVAL;
 	}
 
+	x->type_offload = xfrm_get_type_offload(x->id.proto, x->props.family);
+	if (!x->type_offload) {
+		NL_SET_ERR_MSG(extack, "Type doesn't support offload");
+		dev_put(dev);
+		return -EINVAL;
+	}
+
 	xso->dev = dev;
 	netdev_tracker_alloc(dev, &xso->dev_tracker, GFP_ATOMIC);
 	xso->real_dev = dev;
@@ -332,6 +334,8 @@ int xfrm_dev_state_add(struct net *net, struct xfrm_state *x,
 		netdev_put(dev, &xso->dev_tracker);
 		xso->type = XFRM_DEV_OFFLOAD_UNSPECIFIED;
 
+		xfrm_put_type_offload(x->type_offload);
+		x->type_offload = NULL;
 		/* User explicitly requested packet offload mode and configured
 		 * policy in addition to the XFRM state. So be civil to users,
 		 * and return an error instead of taking fallback path.
diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c
index ad2202fa82f3..568fe8df7741 100644
--- a/net/xfrm/xfrm_state.c
+++ b/net/xfrm/xfrm_state.c
@@ -424,11 +424,12 @@ void xfrm_unregister_type_offload(const struct xfrm_type_offload *type,
 }
 EXPORT_SYMBOL(xfrm_unregister_type_offload);
 
-static const struct xfrm_type_offload *
-xfrm_get_type_offload(u8 proto, unsigned short family, bool try_load)
+const struct xfrm_type_offload *xfrm_get_type_offload(u8 proto,
+						      unsigned short family)
 {
 	const struct xfrm_type_offload *type = NULL;
 	struct xfrm_state_afinfo *afinfo;
+	bool try_load = true;
 
 retry:
 	afinfo = xfrm_state_get_afinfo(family);
@@ -456,11 +457,7 @@ xfrm_get_type_offload(u8 proto, unsigned short family, bool try_load)
 
 	return type;
 }
-
-static void xfrm_put_type_offload(const struct xfrm_type_offload *type)
-{
-	module_put(type->owner);
-}
+EXPORT_SYMBOL(xfrm_get_type_offload);
 
 static const struct xfrm_mode xfrm4_mode_map[XFRM_MODE_MAX] = {
 	[XFRM_MODE_BEET] = {
@@ -609,8 +606,6 @@ static void ___xfrm_state_destroy(struct xfrm_state *x)
 	kfree(x->coaddr);
 	kfree(x->replay_esn);
 	kfree(x->preplay_esn);
-	if (x->type_offload)
-		xfrm_put_type_offload(x->type_offload);
 	if (x->type) {
 		x->type->destructor(x);
 		xfrm_put_type(x->type);
@@ -784,6 +779,9 @@ void xfrm_dev_state_free(struct xfrm_state *x)
 	struct xfrm_dev_offload *xso = &x->xso;
 	struct net_device *dev = READ_ONCE(xso->dev);
 
+	xfrm_put_type_offload(x->type_offload);
+	x->type_offload = NULL;
+
 	if (dev && dev->xfrmdev_ops) {
 		spin_lock_bh(&xfrm_state_dev_gc_lock);
 		if (!hlist_unhashed(&x->dev_gclist))
@@ -3122,7 +3120,7 @@ u32 xfrm_state_mtu(struct xfrm_state *x, int mtu)
 }
 EXPORT_SYMBOL_GPL(xfrm_state_mtu);
 
-int __xfrm_init_state(struct xfrm_state *x, bool init_replay, bool offload,
+int __xfrm_init_state(struct xfrm_state *x, bool init_replay,
 		      struct netlink_ext_ack *extack)
 {
 	const struct xfrm_mode *inner_mode;
@@ -3178,8 +3176,6 @@ int __xfrm_init_state(struct xfrm_state *x, bool init_replay, bool offload,
 		goto error;
 	}
 
-	x->type_offload = xfrm_get_type_offload(x->id.proto, family, offload);
-
 	err = x->type->init_state(x, extack);
 	if (err)
 		goto error;
@@ -3229,7 +3225,7 @@ int xfrm_init_state(struct xfrm_state *x)
 {
 	int err;
 
-	err = __xfrm_init_state(x, true, false, NULL);
+	err = __xfrm_init_state(x, true, NULL);
 	if (!err)
 		x->km.state = XFRM_STATE_VALID;
 
diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c
index 08c6d6f0179f..82a768500999 100644
--- a/net/xfrm/xfrm_user.c
+++ b/net/xfrm/xfrm_user.c
@@ -907,7 +907,7 @@ static struct xfrm_state *xfrm_state_construct(struct net *net,
 			goto error;
 	}
 
-	err = __xfrm_init_state(x, false, attrs[XFRMA_OFFLOAD_DEV], extack);
+	err = __xfrm_init_state(x, false, extack);
 	if (err)
 		goto error;
 
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH ipsec-next 2/5] xfrm: simplify SA initialization routine
  2025-02-05 18:20 [PATCH ipsec-next 0/5] Support PTMU in tunnel mode for packet offload Leon Romanovsky
  2025-02-05 18:20 ` [PATCH ipsec-next 1/5] xfrm: delay initialization of offload path till its actually requested Leon Romanovsky
@ 2025-02-05 18:20 ` Leon Romanovsky
  2025-02-12 11:56   ` Steffen Klassert
  2025-02-05 18:20 ` [PATCH ipsec-next 3/5] xfrm: rely on XFRM offload Leon Romanovsky
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 18+ messages in thread
From: Leon Romanovsky @ 2025-02-05 18:20 UTC (permalink / raw)
  To: Steffen Klassert
  Cc: Leon Romanovsky, Andrew Lunn, Ayush Sawal, Bharat Bhushan,
	Eric Dumazet, Geetha sowjanya, hariprasad, Herbert Xu,
	intel-wired-lan, Jakub Kicinski, Jay Vosburgh, Jonathan Corbet,
	linux-doc, linux-rdma, Louis Peens, netdev, oss-drivers,
	Paolo Abeni, Potnuri Bharat Teja, Przemek Kitszel, Saeed Mahameed,
	Subbaraya Sundeep, Sunil Goutham, Tariq Toukan, Tony Nguyen,
	Ilia Lin

From: Leon Romanovsky <leonro@nvidia.com>

SA replay mode is initialized differently for user-space and
kernel-space users, but the call to xfrm_init_replay() existed in
common path with boolean protection. That caused to situation where
we have two different function orders.

So let's rewrite the SA initialization flow to have same order for
both in-kernel and user-space callers.

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
 include/net/xfrm.h    |  3 +--
 net/xfrm/xfrm_state.c | 22 ++++++++++------------
 net/xfrm/xfrm_user.c  |  2 +-
 3 files changed, 12 insertions(+), 15 deletions(-)

diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index 28355a5be5b9..58f8f7661ec4 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -1770,8 +1770,7 @@ void xfrm_spd_getinfo(struct net *net, struct xfrmk_spdinfo *si);
 u32 xfrm_replay_seqhi(struct xfrm_state *x, __be32 net_seq);
 int xfrm_init_replay(struct xfrm_state *x, struct netlink_ext_ack *extack);
 u32 xfrm_state_mtu(struct xfrm_state *x, int mtu);
-int __xfrm_init_state(struct xfrm_state *x, bool init_replay,
-		      struct netlink_ext_ack *extack);
+int __xfrm_init_state(struct xfrm_state *x, struct netlink_ext_ack *extack);
 int xfrm_init_state(struct xfrm_state *x);
 int xfrm_input(struct sk_buff *skb, int nexthdr, __be32 spi, int encap_type);
 int xfrm_input_resume(struct sk_buff *skb, int nexthdr);
diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c
index 568fe8df7741..42799b0946a3 100644
--- a/net/xfrm/xfrm_state.c
+++ b/net/xfrm/xfrm_state.c
@@ -3120,8 +3120,7 @@ u32 xfrm_state_mtu(struct xfrm_state *x, int mtu)
 }
 EXPORT_SYMBOL_GPL(xfrm_state_mtu);
 
-int __xfrm_init_state(struct xfrm_state *x, bool init_replay,
-		      struct netlink_ext_ack *extack)
+int __xfrm_init_state(struct xfrm_state *x, struct netlink_ext_ack *extack)
 {
 	const struct xfrm_mode *inner_mode;
 	const struct xfrm_mode *outer_mode;
@@ -3188,12 +3187,6 @@ int __xfrm_init_state(struct xfrm_state *x, bool init_replay,
 	}
 
 	x->outer_mode = *outer_mode;
-	if (init_replay) {
-		err = xfrm_init_replay(x, extack);
-		if (err)
-			goto error;
-	}
-
 	if (x->nat_keepalive_interval) {
 		if (x->dir != XFRM_SA_DIR_OUT) {
 			NL_SET_ERR_MSG(extack, "NAT keepalive is only supported for outbound SAs");
@@ -3225,11 +3218,16 @@ int xfrm_init_state(struct xfrm_state *x)
 {
 	int err;
 
-	err = __xfrm_init_state(x, true, NULL);
-	if (!err)
-		x->km.state = XFRM_STATE_VALID;
+	err = __xfrm_init_state(x, NULL);
+	if (err)
+		return err;
 
-	return err;
+	err = xfrm_init_replay(x, NULL);
+	if (err)
+		return err;
+
+	x->km.state = XFRM_STATE_VALID;
+	return 0;
 }
 
 EXPORT_SYMBOL(xfrm_init_state);
diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c
index 82a768500999..d1d422f68978 100644
--- a/net/xfrm/xfrm_user.c
+++ b/net/xfrm/xfrm_user.c
@@ -907,7 +907,7 @@ static struct xfrm_state *xfrm_state_construct(struct net *net,
 			goto error;
 	}
 
-	err = __xfrm_init_state(x, false, extack);
+	err = __xfrm_init_state(x, extack);
 	if (err)
 		goto error;
 
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH ipsec-next 3/5] xfrm: rely on XFRM offload
  2025-02-05 18:20 [PATCH ipsec-next 0/5] Support PTMU in tunnel mode for packet offload Leon Romanovsky
  2025-02-05 18:20 ` [PATCH ipsec-next 1/5] xfrm: delay initialization of offload path till its actually requested Leon Romanovsky
  2025-02-05 18:20 ` [PATCH ipsec-next 2/5] xfrm: simplify SA initialization routine Leon Romanovsky
@ 2025-02-05 18:20 ` Leon Romanovsky
  2025-02-18 20:39   ` Zhu Yanjun
  2025-02-05 18:20 ` [PATCH ipsec-next 4/5] xfrm: provide common xdo_dev_offload_ok callback implementation Leon Romanovsky
  2025-02-05 18:20 ` [PATCH ipsec-next 5/5] xfrm: check for PMTU in tunnel mode for packet offload Leon Romanovsky
  4 siblings, 1 reply; 18+ messages in thread
From: Leon Romanovsky @ 2025-02-05 18:20 UTC (permalink / raw)
  To: Steffen Klassert
  Cc: Leon Romanovsky, Andrew Lunn, Ayush Sawal, Bharat Bhushan,
	Eric Dumazet, Geetha sowjanya, hariprasad, Herbert Xu,
	intel-wired-lan, Jakub Kicinski, Jay Vosburgh, Jonathan Corbet,
	linux-doc, linux-rdma, Louis Peens, netdev, oss-drivers,
	Paolo Abeni, Potnuri Bharat Teja, Przemek Kitszel, Saeed Mahameed,
	Subbaraya Sundeep, Sunil Goutham, Tariq Toukan, Tony Nguyen,
	Ilia Lin

From: Leon Romanovsky <leonro@nvidia.com>

After change of initialization of x->type_offload pointer to be valid
only for offloaded SAs. There is no need to rely both on x->type_offload
and x->xso.type to determine if SA is offloaded or not.

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
 net/xfrm/xfrm_device.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/net/xfrm/xfrm_device.c b/net/xfrm/xfrm_device.c
index e01a7f5a4c75..c3c170953bf9 100644
--- a/net/xfrm/xfrm_device.c
+++ b/net/xfrm/xfrm_device.c
@@ -420,13 +420,11 @@ bool xfrm_dev_offload_ok(struct sk_buff *skb, struct xfrm_state *x)
 	struct xfrm_dst *xdst = (struct xfrm_dst *)dst;
 	struct net_device *dev = x->xso.dev;
 
-	if (!x->type_offload ||
-	    (x->xso.type == XFRM_DEV_OFFLOAD_UNSPECIFIED && x->encap))
+	if (x->xso.type == XFRM_DEV_OFFLOAD_UNSPECIFIED)
 		return false;
 
 	if (x->xso.type == XFRM_DEV_OFFLOAD_PACKET ||
-	    ((!dev || (dev == xfrm_dst_path(dst)->dev)) &&
-	     !xdst->child->xfrm)) {
+	    ((dev == xfrm_dst_path(dst)->dev) && !xdst->child->xfrm)) {
 		mtu = xfrm_state_mtu(x, xdst->child_mtu_cached);
 		if (skb->len <= mtu)
 			goto ok;
@@ -438,8 +436,8 @@ bool xfrm_dev_offload_ok(struct sk_buff *skb, struct xfrm_state *x)
 	return false;
 
 ok:
-	if (dev && dev->xfrmdev_ops && dev->xfrmdev_ops->xdo_dev_offload_ok)
-		return x->xso.dev->xfrmdev_ops->xdo_dev_offload_ok(skb, x);
+	if (dev->xfrmdev_ops->xdo_dev_offload_ok)
+		return dev->xfrmdev_ops->xdo_dev_offload_ok(skb, x);
 
 	return true;
 }
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH ipsec-next 4/5] xfrm: provide common xdo_dev_offload_ok callback implementation
  2025-02-05 18:20 [PATCH ipsec-next 0/5] Support PTMU in tunnel mode for packet offload Leon Romanovsky
                   ` (2 preceding siblings ...)
  2025-02-05 18:20 ` [PATCH ipsec-next 3/5] xfrm: rely on XFRM offload Leon Romanovsky
@ 2025-02-05 18:20 ` Leon Romanovsky
  2025-02-16  9:33   ` Zhu Yanjun
  2025-02-05 18:20 ` [PATCH ipsec-next 5/5] xfrm: check for PMTU in tunnel mode for packet offload Leon Romanovsky
  4 siblings, 1 reply; 18+ messages in thread
From: Leon Romanovsky @ 2025-02-05 18:20 UTC (permalink / raw)
  To: Steffen Klassert
  Cc: Leon Romanovsky, Andrew Lunn, Ayush Sawal, Bharat Bhushan,
	Eric Dumazet, Geetha sowjanya, hariprasad, Herbert Xu,
	intel-wired-lan, Jakub Kicinski, Jay Vosburgh, Jonathan Corbet,
	linux-doc, linux-rdma, Louis Peens, netdev, oss-drivers,
	Paolo Abeni, Potnuri Bharat Teja, Przemek Kitszel, Saeed Mahameed,
	Subbaraya Sundeep, Sunil Goutham, Tariq Toukan, Tony Nguyen,
	Ilia Lin

From: Leon Romanovsky <leonro@nvidia.com>

Almost all drivers except bond and nsim had same check if device
can perform XFRM offload on that specific packet. The check was that
packet doesn't have IPv4 options and IPv6 extensions.

In NIC drivers, the IPv4 HELEN comparison was slightly different, but
the intent was to check for the same conditions. So let's chose more
strict variant as a common base.

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
 Documentation/networking/xfrm_device.rst      |  3 ++-
 drivers/net/bonding/bond_main.c               | 16 +++++---------
 .../net/ethernet/chelsio/cxgb4/cxgb4_main.c   | 21 -------------------
 .../inline_crypto/ch_ipsec/chcr_ipsec.c       | 16 --------------
 .../net/ethernet/intel/ixgbe/ixgbe_ipsec.c    | 21 -------------------
 drivers/net/ethernet/intel/ixgbevf/ipsec.c    | 21 -------------------
 .../marvell/octeontx2/nic/cn10k_ipsec.c       | 15 -------------
 .../mellanox/mlx5/core/en_accel/ipsec.c       | 16 --------------
 .../net/ethernet/netronome/nfp/crypto/ipsec.c | 11 ----------
 drivers/net/netdevsim/ipsec.c                 | 11 ----------
 drivers/net/netdevsim/netdevsim.h             |  1 -
 net/xfrm/xfrm_device.c                        | 15 +++++++++++++
 12 files changed, 22 insertions(+), 145 deletions(-)

diff --git a/Documentation/networking/xfrm_device.rst b/Documentation/networking/xfrm_device.rst
index 66f6e9a9b59a..39bb98939d1f 100644
--- a/Documentation/networking/xfrm_device.rst
+++ b/Documentation/networking/xfrm_device.rst
@@ -126,7 +126,8 @@ been setup for offload, it first calls into xdo_dev_offload_ok() with
 the skb and the intended offload state to ask the driver if the offload
 will serviceable.  This can check the packet information to be sure the
 offload can be supported (e.g. IPv4 or IPv6, no IPv4 options, etc) and
-return true of false to signify its support.
+return true of false to signify its support. In case driver doesn't implement
+this callback, the stack provides reasonable defaults.
 
 Crypto offload mode:
 When ready to send, the driver needs to inspect the Tx packet for the
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index e45bba240cbc..bfb55c23380b 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -676,22 +676,16 @@ static void bond_ipsec_free_sa(struct xfrm_state *xs)
 static bool bond_ipsec_offload_ok(struct sk_buff *skb, struct xfrm_state *xs)
 {
 	struct net_device *real_dev;
-	bool ok = false;
 
 	rcu_read_lock();
 	real_dev = bond_ipsec_dev(xs);
-	if (!real_dev)
-		goto out;
-
-	if (!real_dev->xfrmdev_ops ||
-	    !real_dev->xfrmdev_ops->xdo_dev_offload_ok ||
-	    netif_is_bond_master(real_dev))
-		goto out;
+	if (!real_dev || netif_is_bond_master(real_dev)) {
+		rcu_read_unlock();
+		return false;
+	}
 
-	ok = real_dev->xfrmdev_ops->xdo_dev_offload_ok(skb, xs);
-out:
 	rcu_read_unlock();
-	return ok;
+	return true;
 }
 
 /**
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
index 2f0b3e389e62..551c279dc14b 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
@@ -6538,26 +6538,6 @@ static void cxgb4_xfrm_free_state(struct xfrm_state *x)
 	mutex_unlock(&uld_mutex);
 }
 
-static bool cxgb4_ipsec_offload_ok(struct sk_buff *skb, struct xfrm_state *x)
-{
-	struct adapter *adap = netdev2adap(x->xso.dev);
-	bool ret = false;
-
-	if (!mutex_trylock(&uld_mutex)) {
-		dev_dbg(adap->pdev_dev,
-			"crypto uld critical resource is under use\n");
-		return ret;
-	}
-	if (chcr_offload_state(adap, CXGB4_XFRMDEV_OPS))
-		goto out_unlock;
-
-	ret = adap->uld[CXGB4_ULD_IPSEC].xfrmdev_ops->xdo_dev_offload_ok(skb, x);
-
-out_unlock:
-	mutex_unlock(&uld_mutex);
-	return ret;
-}
-
 static void cxgb4_advance_esn_state(struct xfrm_state *x)
 {
 	struct adapter *adap = netdev2adap(x->xso.dev);
@@ -6583,7 +6563,6 @@ static const struct xfrmdev_ops cxgb4_xfrmdev_ops = {
 	.xdo_dev_state_add      = cxgb4_xfrm_add_state,
 	.xdo_dev_state_delete   = cxgb4_xfrm_del_state,
 	.xdo_dev_state_free     = cxgb4_xfrm_free_state,
-	.xdo_dev_offload_ok     = cxgb4_ipsec_offload_ok,
 	.xdo_dev_state_advance_esn = cxgb4_advance_esn_state,
 };
 
diff --git a/drivers/net/ethernet/chelsio/inline_crypto/ch_ipsec/chcr_ipsec.c b/drivers/net/ethernet/chelsio/inline_crypto/ch_ipsec/chcr_ipsec.c
index c7338ac6a5bb..baba96883f48 100644
--- a/drivers/net/ethernet/chelsio/inline_crypto/ch_ipsec/chcr_ipsec.c
+++ b/drivers/net/ethernet/chelsio/inline_crypto/ch_ipsec/chcr_ipsec.c
@@ -71,7 +71,6 @@
 static LIST_HEAD(uld_ctx_list);
 static DEFINE_MUTEX(dev_mutex);
 
-static bool ch_ipsec_offload_ok(struct sk_buff *skb, struct xfrm_state *x);
 static int ch_ipsec_uld_state_change(void *handle, enum cxgb4_state new_state);
 static int ch_ipsec_xmit(struct sk_buff *skb, struct net_device *dev);
 static void *ch_ipsec_uld_add(const struct cxgb4_lld_info *infop);
@@ -85,7 +84,6 @@ static const struct xfrmdev_ops ch_ipsec_xfrmdev_ops = {
 	.xdo_dev_state_add      = ch_ipsec_xfrm_add_state,
 	.xdo_dev_state_delete   = ch_ipsec_xfrm_del_state,
 	.xdo_dev_state_free     = ch_ipsec_xfrm_free_state,
-	.xdo_dev_offload_ok     = ch_ipsec_offload_ok,
 	.xdo_dev_state_advance_esn = ch_ipsec_advance_esn_state,
 };
 
@@ -323,20 +321,6 @@ static void ch_ipsec_xfrm_free_state(struct xfrm_state *x)
 	module_put(THIS_MODULE);
 }
 
-static bool ch_ipsec_offload_ok(struct sk_buff *skb, struct xfrm_state *x)
-{
-	if (x->props.family == AF_INET) {
-		/* Offload with IP options is not supported yet */
-		if (ip_hdr(skb)->ihl > 5)
-			return false;
-	} else {
-		/* Offload with IPv6 extension headers is not support yet */
-		if (ipv6_ext_hdr(ipv6_hdr(skb)->nexthdr))
-			return false;
-	}
-	return true;
-}
-
 static void ch_ipsec_advance_esn_state(struct xfrm_state *x)
 {
 	/* do nothing */
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
index 866024f2b9ee..07ea1954a276 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
@@ -817,30 +817,9 @@ static void ixgbe_ipsec_del_sa(struct xfrm_state *xs)
 	}
 }
 
-/**
- * ixgbe_ipsec_offload_ok - can this packet use the xfrm hw offload
- * @skb: current data packet
- * @xs: pointer to transformer state struct
- **/
-static bool ixgbe_ipsec_offload_ok(struct sk_buff *skb, struct xfrm_state *xs)
-{
-	if (xs->props.family == AF_INET) {
-		/* Offload with IPv4 options is not supported yet */
-		if (ip_hdr(skb)->ihl != 5)
-			return false;
-	} else {
-		/* Offload with IPv6 extension headers is not support yet */
-		if (ipv6_ext_hdr(ipv6_hdr(skb)->nexthdr))
-			return false;
-	}
-
-	return true;
-}
-
 static const struct xfrmdev_ops ixgbe_xfrmdev_ops = {
 	.xdo_dev_state_add = ixgbe_ipsec_add_sa,
 	.xdo_dev_state_delete = ixgbe_ipsec_del_sa,
-	.xdo_dev_offload_ok = ixgbe_ipsec_offload_ok,
 };
 
 /**
diff --git a/drivers/net/ethernet/intel/ixgbevf/ipsec.c b/drivers/net/ethernet/intel/ixgbevf/ipsec.c
index f804b35d79c7..8ba037e3d9c2 100644
--- a/drivers/net/ethernet/intel/ixgbevf/ipsec.c
+++ b/drivers/net/ethernet/intel/ixgbevf/ipsec.c
@@ -428,30 +428,9 @@ static void ixgbevf_ipsec_del_sa(struct xfrm_state *xs)
 	}
 }
 
-/**
- * ixgbevf_ipsec_offload_ok - can this packet use the xfrm hw offload
- * @skb: current data packet
- * @xs: pointer to transformer state struct
- **/
-static bool ixgbevf_ipsec_offload_ok(struct sk_buff *skb, struct xfrm_state *xs)
-{
-	if (xs->props.family == AF_INET) {
-		/* Offload with IPv4 options is not supported yet */
-		if (ip_hdr(skb)->ihl != 5)
-			return false;
-	} else {
-		/* Offload with IPv6 extension headers is not support yet */
-		if (ipv6_ext_hdr(ipv6_hdr(skb)->nexthdr))
-			return false;
-	}
-
-	return true;
-}
-
 static const struct xfrmdev_ops ixgbevf_xfrmdev_ops = {
 	.xdo_dev_state_add = ixgbevf_ipsec_add_sa,
 	.xdo_dev_state_delete = ixgbevf_ipsec_del_sa,
-	.xdo_dev_offload_ok = ixgbevf_ipsec_offload_ok,
 };
 
 /**
diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/cn10k_ipsec.c b/drivers/net/ethernet/marvell/octeontx2/nic/cn10k_ipsec.c
index 09a5b5268205..fc59e50bafce 100644
--- a/drivers/net/ethernet/marvell/octeontx2/nic/cn10k_ipsec.c
+++ b/drivers/net/ethernet/marvell/octeontx2/nic/cn10k_ipsec.c
@@ -744,24 +744,9 @@ static void cn10k_ipsec_del_state(struct xfrm_state *x)
 		queue_work(pf->ipsec.sa_workq, &pf->ipsec.sa_work);
 }
 
-static bool cn10k_ipsec_offload_ok(struct sk_buff *skb, struct xfrm_state *x)
-{
-	if (x->props.family == AF_INET) {
-		/* Offload with IPv4 options is not supported yet */
-		if (ip_hdr(skb)->ihl > 5)
-			return false;
-	} else {
-		/* Offload with IPv6 extension headers is not support yet */
-		if (ipv6_ext_hdr(ipv6_hdr(skb)->nexthdr))
-			return false;
-	}
-	return true;
-}
-
 static const struct xfrmdev_ops cn10k_ipsec_xfrmdev_ops = {
 	.xdo_dev_state_add	= cn10k_ipsec_add_state,
 	.xdo_dev_state_delete	= cn10k_ipsec_del_state,
-	.xdo_dev_offload_ok	= cn10k_ipsec_offload_ok,
 };
 
 static void cn10k_ipsec_sa_wq_handler(struct work_struct *work)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c
index 501709ac310f..3b81e7b8ce23 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c
@@ -953,21 +953,6 @@ void mlx5e_ipsec_cleanup(struct mlx5e_priv *priv)
 	priv->ipsec = NULL;
 }
 
-static bool mlx5e_ipsec_offload_ok(struct sk_buff *skb, struct xfrm_state *x)
-{
-	if (x->props.family == AF_INET) {
-		/* Offload with IPv4 options is not supported yet */
-		if (ip_hdr(skb)->ihl > 5)
-			return false;
-	} else {
-		/* Offload with IPv6 extension headers is not support yet */
-		if (ipv6_ext_hdr(ipv6_hdr(skb)->nexthdr))
-			return false;
-	}
-
-	return true;
-}
-
 static void mlx5e_xfrm_advance_esn_state(struct xfrm_state *x)
 {
 	struct mlx5e_ipsec_sa_entry *sa_entry = to_ipsec_sa_entry(x);
@@ -1196,7 +1181,6 @@ static const struct xfrmdev_ops mlx5e_ipsec_xfrmdev_ops = {
 	.xdo_dev_state_add	= mlx5e_xfrm_add_state,
 	.xdo_dev_state_delete	= mlx5e_xfrm_del_state,
 	.xdo_dev_state_free	= mlx5e_xfrm_free_state,
-	.xdo_dev_offload_ok	= mlx5e_ipsec_offload_ok,
 	.xdo_dev_state_advance_esn = mlx5e_xfrm_advance_esn_state,
 
 	.xdo_dev_state_update_stats = mlx5e_xfrm_update_stats,
diff --git a/drivers/net/ethernet/netronome/nfp/crypto/ipsec.c b/drivers/net/ethernet/netronome/nfp/crypto/ipsec.c
index 515069d5637b..671af5d4c5d2 100644
--- a/drivers/net/ethernet/netronome/nfp/crypto/ipsec.c
+++ b/drivers/net/ethernet/netronome/nfp/crypto/ipsec.c
@@ -565,20 +565,9 @@ static void nfp_net_xfrm_del_state(struct xfrm_state *x)
 	xa_erase(&nn->xa_ipsec, x->xso.offload_handle - 1);
 }
 
-static bool nfp_net_ipsec_offload_ok(struct sk_buff *skb, struct xfrm_state *x)
-{
-	if (x->props.family == AF_INET)
-		/* Offload with IPv4 options is not supported yet */
-		return ip_hdr(skb)->ihl == 5;
-
-	/* Offload with IPv6 extension headers is not support yet */
-	return !(ipv6_ext_hdr(ipv6_hdr(skb)->nexthdr));
-}
-
 static const struct xfrmdev_ops nfp_net_ipsec_xfrmdev_ops = {
 	.xdo_dev_state_add = nfp_net_xfrm_add_state,
 	.xdo_dev_state_delete = nfp_net_xfrm_del_state,
-	.xdo_dev_offload_ok = nfp_net_ipsec_offload_ok,
 };
 
 void nfp_net_ipsec_init(struct nfp_net *nn)
diff --git a/drivers/net/netdevsim/ipsec.c b/drivers/net/netdevsim/ipsec.c
index 88187dd4eb2d..d88bdb9a1717 100644
--- a/drivers/net/netdevsim/ipsec.c
+++ b/drivers/net/netdevsim/ipsec.c
@@ -217,20 +217,9 @@ static void nsim_ipsec_del_sa(struct xfrm_state *xs)
 	ipsec->count--;
 }
 
-static bool nsim_ipsec_offload_ok(struct sk_buff *skb, struct xfrm_state *xs)
-{
-	struct netdevsim *ns = netdev_priv(xs->xso.real_dev);
-	struct nsim_ipsec *ipsec = &ns->ipsec;
-
-	ipsec->ok++;
-
-	return true;
-}
-
 static const struct xfrmdev_ops nsim_xfrmdev_ops = {
 	.xdo_dev_state_add	= nsim_ipsec_add_sa,
 	.xdo_dev_state_delete	= nsim_ipsec_del_sa,
-	.xdo_dev_offload_ok	= nsim_ipsec_offload_ok,
 };
 
 bool nsim_ipsec_tx(struct netdevsim *ns, struct sk_buff *skb)
diff --git a/drivers/net/netdevsim/netdevsim.h b/drivers/net/netdevsim/netdevsim.h
index 96d54c08043d..ca8f1a620044 100644
--- a/drivers/net/netdevsim/netdevsim.h
+++ b/drivers/net/netdevsim/netdevsim.h
@@ -54,7 +54,6 @@ struct nsim_ipsec {
 	struct dentry *pfile;
 	u32 count;
 	u32 tx;
-	u32 ok;
 };
 
 #define NSIM_MACSEC_MAX_SECY_COUNT 3
diff --git a/net/xfrm/xfrm_device.c b/net/xfrm/xfrm_device.c
index c3c170953bf9..056df0e69d73 100644
--- a/net/xfrm/xfrm_device.c
+++ b/net/xfrm/xfrm_device.c
@@ -436,6 +436,21 @@ bool xfrm_dev_offload_ok(struct sk_buff *skb, struct xfrm_state *x)
 	return false;
 
 ok:
+	switch (x->props.family) {
+	case AF_INET:
+		/* Check for IPv4 options */
+		if (ip_hdr(skb)->ihl != 5)
+			return false;
+		break;
+	case AF_INET6:
+		/* Check for IPv6 extensions */
+		if (ipv6_ext_hdr(ipv6_hdr(skb)->nexthdr))
+			return false;
+		break;
+	default:
+		break;
+	}
+
 	if (dev->xfrmdev_ops->xdo_dev_offload_ok)
 		return dev->xfrmdev_ops->xdo_dev_offload_ok(skb, x);
 
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH ipsec-next 5/5] xfrm: check for PMTU in tunnel mode for packet offload
  2025-02-05 18:20 [PATCH ipsec-next 0/5] Support PTMU in tunnel mode for packet offload Leon Romanovsky
                   ` (3 preceding siblings ...)
  2025-02-05 18:20 ` [PATCH ipsec-next 4/5] xfrm: provide common xdo_dev_offload_ok callback implementation Leon Romanovsky
@ 2025-02-05 18:20 ` Leon Romanovsky
  4 siblings, 0 replies; 18+ messages in thread
From: Leon Romanovsky @ 2025-02-05 18:20 UTC (permalink / raw)
  To: Steffen Klassert
  Cc: Leon Romanovsky, Andrew Lunn, Ayush Sawal, Bharat Bhushan,
	Eric Dumazet, Geetha sowjanya, hariprasad, Herbert Xu,
	intel-wired-lan, Jakub Kicinski, Jay Vosburgh, Jonathan Corbet,
	linux-doc, linux-rdma, Louis Peens, netdev, oss-drivers,
	Paolo Abeni, Potnuri Bharat Teja, Przemek Kitszel, Saeed Mahameed,
	Subbaraya Sundeep, Sunil Goutham, Tariq Toukan, Tony Nguyen,
	Ilia Lin

From: Leon Romanovsky <leonro@nvidia.com>

In tunnel mode, for the packet offload, there were no PMTU signaling
to the upper level about need to fragment the packet. As a solution,
call to already existing xfrm[4|6]_tunnel_check_size() to perform that.

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
 include/net/xfrm.h     |  9 +++++++++
 net/xfrm/xfrm_device.c | 10 ++++++++--
 net/xfrm/xfrm_output.c |  6 ++++--
 3 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index 58f8f7661ec4..519ab1209e4c 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -1782,6 +1782,15 @@ int xfrm_trans_queue(struct sk_buff *skb,
 				   struct sk_buff *));
 int xfrm_output_resume(struct sock *sk, struct sk_buff *skb, int err);
 int xfrm_output(struct sock *sk, struct sk_buff *skb);
+int xfrm4_tunnel_check_size(struct sk_buff *skb);
+#if IS_ENABLED(CONFIG_IPV6)
+int xfrm6_tunnel_check_size(struct sk_buff *skb);
+#else
+static inline int xfrm6_tunnel_check_size(struct sk_buff *skb)
+{
+	return -EMSGSIZE;
+}
+#endif
 
 #if IS_ENABLED(CONFIG_NET_PKTGEN)
 int pktgen_xfrm_outer_mode_output(struct xfrm_state *x, struct sk_buff *skb);
diff --git a/net/xfrm/xfrm_device.c b/net/xfrm/xfrm_device.c
index 056df0e69d73..9ad1f85b0a27 100644
--- a/net/xfrm/xfrm_device.c
+++ b/net/xfrm/xfrm_device.c
@@ -419,12 +419,12 @@ bool xfrm_dev_offload_ok(struct sk_buff *skb, struct xfrm_state *x)
 	struct dst_entry *dst = skb_dst(skb);
 	struct xfrm_dst *xdst = (struct xfrm_dst *)dst;
 	struct net_device *dev = x->xso.dev;
+	bool check_tunnel_size;
 
 	if (x->xso.type == XFRM_DEV_OFFLOAD_UNSPECIFIED)
 		return false;
 
-	if (x->xso.type == XFRM_DEV_OFFLOAD_PACKET ||
-	    ((dev == xfrm_dst_path(dst)->dev) && !xdst->child->xfrm)) {
+	if ((dev == xfrm_dst_path(dst)->dev) && !xdst->child->xfrm) {
 		mtu = xfrm_state_mtu(x, xdst->child_mtu_cached);
 		if (skb->len <= mtu)
 			goto ok;
@@ -436,16 +436,22 @@ bool xfrm_dev_offload_ok(struct sk_buff *skb, struct xfrm_state *x)
 	return false;
 
 ok:
+	check_tunnel_size = x->xso.type == XFRM_DEV_OFFLOAD_PACKET &&
+			    x->props.mode == XFRM_MODE_TUNNEL;
 	switch (x->props.family) {
 	case AF_INET:
 		/* Check for IPv4 options */
 		if (ip_hdr(skb)->ihl != 5)
 			return false;
+		if (check_tunnel_size && xfrm4_tunnel_check_size(skb))
+			return false;
 		break;
 	case AF_INET6:
 		/* Check for IPv6 extensions */
 		if (ipv6_ext_hdr(ipv6_hdr(skb)->nexthdr))
 			return false;
+		if (check_tunnel_size && xfrm6_tunnel_check_size(skb))
+			return false;
 		break;
 	default:
 		break;
diff --git a/net/xfrm/xfrm_output.c b/net/xfrm/xfrm_output.c
index f7abd42c077d..34c8e266641c 100644
--- a/net/xfrm/xfrm_output.c
+++ b/net/xfrm/xfrm_output.c
@@ -786,7 +786,7 @@ int xfrm_output(struct sock *sk, struct sk_buff *skb)
 }
 EXPORT_SYMBOL_GPL(xfrm_output);
 
-static int xfrm4_tunnel_check_size(struct sk_buff *skb)
+int xfrm4_tunnel_check_size(struct sk_buff *skb)
 {
 	int mtu, ret = 0;
 
@@ -812,6 +812,7 @@ static int xfrm4_tunnel_check_size(struct sk_buff *skb)
 out:
 	return ret;
 }
+EXPORT_SYMBOL_GPL(xfrm4_tunnel_check_size);
 
 static int xfrm4_extract_output(struct xfrm_state *x, struct sk_buff *skb)
 {
@@ -834,7 +835,7 @@ static int xfrm4_extract_output(struct xfrm_state *x, struct sk_buff *skb)
 }
 
 #if IS_ENABLED(CONFIG_IPV6)
-static int xfrm6_tunnel_check_size(struct sk_buff *skb)
+int xfrm6_tunnel_check_size(struct sk_buff *skb)
 {
 	int mtu, ret = 0;
 	struct dst_entry *dst = skb_dst(skb);
@@ -864,6 +865,7 @@ static int xfrm6_tunnel_check_size(struct sk_buff *skb)
 out:
 	return ret;
 }
+EXPORT_SYMBOL_GPL(xfrm6_tunnel_check_size);
 #endif
 
 static int xfrm6_extract_output(struct xfrm_state *x, struct sk_buff *skb)
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH ipsec-next 1/5] xfrm: delay initialization of offload path till its actually requested
  2025-02-05 18:20 ` [PATCH ipsec-next 1/5] xfrm: delay initialization of offload path till its actually requested Leon Romanovsky
@ 2025-02-06  8:46   ` Bharat Bhushan
  2025-02-06  8:54     ` Leon Romanovsky
  0 siblings, 1 reply; 18+ messages in thread
From: Bharat Bhushan @ 2025-02-06  8:46 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Steffen Klassert, Leon Romanovsky, Andrew Lunn, Ayush Sawal,
	Bharat Bhushan, Eric Dumazet, Geetha sowjanya, hariprasad,
	Herbert Xu, intel-wired-lan, Jakub Kicinski, Jay Vosburgh,
	Jonathan Corbet, linux-doc, linux-rdma, Louis Peens, netdev,
	oss-drivers, Paolo Abeni, Potnuri Bharat Teja, Przemek Kitszel,
	Saeed Mahameed, Subbaraya Sundeep, Sunil Goutham, Tariq Toukan,
	Tony Nguyen, Ilia Lin

Hi Leon,

On Wed, Feb 5, 2025 at 11:50 PM Leon Romanovsky <leon@kernel.org> wrote:
>
> From: Leon Romanovsky <leonro@nvidia.com>
>
> XFRM offload path is probed even if offload isn't needed at all. Let's
> make sure that x->type_offload pointer stays NULL for such path to
> reduce ambiguity.
>
> Fixes: 9d389d7f84bb ("xfrm: Add a xfrm type offload.")
> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> ---
>  include/net/xfrm.h     | 12 +++++++++++-
>  net/xfrm/xfrm_device.c | 14 +++++++++-----
>  net/xfrm/xfrm_state.c  | 22 +++++++++-------------
>  net/xfrm/xfrm_user.c   |  2 +-
>  4 files changed, 30 insertions(+), 20 deletions(-)
>
> diff --git a/include/net/xfrm.h b/include/net/xfrm.h
> index ed4b83696c77..28355a5be5b9 100644
> --- a/include/net/xfrm.h
> +++ b/include/net/xfrm.h
> @@ -464,6 +464,16 @@ struct xfrm_type_offload {
>
>  int xfrm_register_type_offload(const struct xfrm_type_offload *type, unsigned short family);
>  void xfrm_unregister_type_offload(const struct xfrm_type_offload *type, unsigned short family);
> +const struct xfrm_type_offload *xfrm_get_type_offload(u8 proto,
> +                                                     unsigned short family);
> +static inline void xfrm_put_type_offload(const struct xfrm_type_offload *type)
> +{
> +       if (!type)
> +               return;
> +
> +       module_put(type->owner);
> +}
> +
>
>  /**
>   * struct xfrm_mode_cbs - XFRM mode callbacks
> @@ -1760,7 +1770,7 @@ void xfrm_spd_getinfo(struct net *net, struct xfrmk_spdinfo *si);
>  u32 xfrm_replay_seqhi(struct xfrm_state *x, __be32 net_seq);
>  int xfrm_init_replay(struct xfrm_state *x, struct netlink_ext_ack *extack);
>  u32 xfrm_state_mtu(struct xfrm_state *x, int mtu);
> -int __xfrm_init_state(struct xfrm_state *x, bool init_replay, bool offload,
> +int __xfrm_init_state(struct xfrm_state *x, bool init_replay,
>                       struct netlink_ext_ack *extack);
>  int xfrm_init_state(struct xfrm_state *x);
>  int xfrm_input(struct sk_buff *skb, int nexthdr, __be32 spi, int encap_type);
> diff --git a/net/xfrm/xfrm_device.c b/net/xfrm/xfrm_device.c
> index d1fa94e52cea..e01a7f5a4c75 100644
> --- a/net/xfrm/xfrm_device.c
> +++ b/net/xfrm/xfrm_device.c
> @@ -244,11 +244,6 @@ int xfrm_dev_state_add(struct net *net, struct xfrm_state *x,
>         xfrm_address_t *daddr;
>         bool is_packet_offload;
>
> -       if (!x->type_offload) {
> -               NL_SET_ERR_MSG(extack, "Type doesn't support offload");
> -               return -EINVAL;
> -       }
> -
>         if (xuo->flags &
>             ~(XFRM_OFFLOAD_IPV6 | XFRM_OFFLOAD_INBOUND | XFRM_OFFLOAD_PACKET)) {
>                 NL_SET_ERR_MSG(extack, "Unrecognized flags in offload request");
> @@ -310,6 +305,13 @@ int xfrm_dev_state_add(struct net *net, struct xfrm_state *x,
>                 return -EINVAL;
>         }
>
> +       x->type_offload = xfrm_get_type_offload(x->id.proto, x->props.family);
> +       if (!x->type_offload) {
> +               NL_SET_ERR_MSG(extack, "Type doesn't support offload");
> +               dev_put(dev);
> +               return -EINVAL;
> +       }
> +
>         xso->dev = dev;
>         netdev_tracker_alloc(dev, &xso->dev_tracker, GFP_ATOMIC);
>         xso->real_dev = dev;
> @@ -332,6 +334,8 @@ int xfrm_dev_state_add(struct net *net, struct xfrm_state *x,
>                 netdev_put(dev, &xso->dev_tracker);
>                 xso->type = XFRM_DEV_OFFLOAD_UNSPECIFIED;
>
> +               xfrm_put_type_offload(x->type_offload);
> +               x->type_offload = NULL;

We always set type_offload to NULL. Can we move type_offload = NULL in
xfrm_put_type_offload() ?

Thanks
-Bharat

>                 /* User explicitly requested packet offload mode and configured
>                  * policy in addition to the XFRM state. So be civil to users,
>                  * and return an error instead of taking fallback path.
> diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c
> index ad2202fa82f3..568fe8df7741 100644
> --- a/net/xfrm/xfrm_state.c
> +++ b/net/xfrm/xfrm_state.c
> @@ -424,11 +424,12 @@ void xfrm_unregister_type_offload(const struct xfrm_type_offload *type,
>  }
>  EXPORT_SYMBOL(xfrm_unregister_type_offload);
>
> -static const struct xfrm_type_offload *
> -xfrm_get_type_offload(u8 proto, unsigned short family, bool try_load)
> +const struct xfrm_type_offload *xfrm_get_type_offload(u8 proto,
> +                                                     unsigned short family)
>  {
>         const struct xfrm_type_offload *type = NULL;
>         struct xfrm_state_afinfo *afinfo;
> +       bool try_load = true;
>
>  retry:
>         afinfo = xfrm_state_get_afinfo(family);
> @@ -456,11 +457,7 @@ xfrm_get_type_offload(u8 proto, unsigned short family, bool try_load)
>
>         return type;
>  }
> -
> -static void xfrm_put_type_offload(const struct xfrm_type_offload *type)
> -{
> -       module_put(type->owner);
> -}
> +EXPORT_SYMBOL(xfrm_get_type_offload);
>
>  static const struct xfrm_mode xfrm4_mode_map[XFRM_MODE_MAX] = {
>         [XFRM_MODE_BEET] = {
> @@ -609,8 +606,6 @@ static void ___xfrm_state_destroy(struct xfrm_state *x)
>         kfree(x->coaddr);
>         kfree(x->replay_esn);
>         kfree(x->preplay_esn);
> -       if (x->type_offload)
> -               xfrm_put_type_offload(x->type_offload);
>         if (x->type) {
>                 x->type->destructor(x);
>                 xfrm_put_type(x->type);
> @@ -784,6 +779,9 @@ void xfrm_dev_state_free(struct xfrm_state *x)
>         struct xfrm_dev_offload *xso = &x->xso;
>         struct net_device *dev = READ_ONCE(xso->dev);
>
> +       xfrm_put_type_offload(x->type_offload);
> +       x->type_offload = NULL;
> +
>         if (dev && dev->xfrmdev_ops) {
>                 spin_lock_bh(&xfrm_state_dev_gc_lock);
>                 if (!hlist_unhashed(&x->dev_gclist))
> @@ -3122,7 +3120,7 @@ u32 xfrm_state_mtu(struct xfrm_state *x, int mtu)
>  }
>  EXPORT_SYMBOL_GPL(xfrm_state_mtu);
>
> -int __xfrm_init_state(struct xfrm_state *x, bool init_replay, bool offload,
> +int __xfrm_init_state(struct xfrm_state *x, bool init_replay,
>                       struct netlink_ext_ack *extack)
>  {
>         const struct xfrm_mode *inner_mode;
> @@ -3178,8 +3176,6 @@ int __xfrm_init_state(struct xfrm_state *x, bool init_replay, bool offload,
>                 goto error;
>         }
>
> -       x->type_offload = xfrm_get_type_offload(x->id.proto, family, offload);
> -
>         err = x->type->init_state(x, extack);
>         if (err)
>                 goto error;
> @@ -3229,7 +3225,7 @@ int xfrm_init_state(struct xfrm_state *x)
>  {
>         int err;
>
> -       err = __xfrm_init_state(x, true, false, NULL);
> +       err = __xfrm_init_state(x, true, NULL);
>         if (!err)
>                 x->km.state = XFRM_STATE_VALID;
>
> diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c
> index 08c6d6f0179f..82a768500999 100644
> --- a/net/xfrm/xfrm_user.c
> +++ b/net/xfrm/xfrm_user.c
> @@ -907,7 +907,7 @@ static struct xfrm_state *xfrm_state_construct(struct net *net,
>                         goto error;
>         }
>
> -       err = __xfrm_init_state(x, false, attrs[XFRMA_OFFLOAD_DEV], extack);
> +       err = __xfrm_init_state(x, false, extack);
>         if (err)
>                 goto error;
>
> --
> 2.48.1
>
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH ipsec-next 1/5] xfrm: delay initialization of offload path till its actually requested
  2025-02-06  8:46   ` Bharat Bhushan
@ 2025-02-06  8:54     ` Leon Romanovsky
  2025-02-06 13:59       ` Bharat Bhushan
  0 siblings, 1 reply; 18+ messages in thread
From: Leon Romanovsky @ 2025-02-06  8:54 UTC (permalink / raw)
  To: Bharat Bhushan
  Cc: Steffen Klassert, Andrew Lunn, Ayush Sawal, Bharat Bhushan,
	Eric Dumazet, Geetha sowjanya, hariprasad, Herbert Xu,
	intel-wired-lan, Jakub Kicinski, Jay Vosburgh, Jonathan Corbet,
	linux-doc, linux-rdma, Louis Peens, netdev, oss-drivers,
	Paolo Abeni, Potnuri Bharat Teja, Przemek Kitszel, Saeed Mahameed,
	Subbaraya Sundeep, Sunil Goutham, Tariq Toukan, Tony Nguyen,
	Ilia Lin

On Thu, Feb 06, 2025 at 02:16:08PM +0530, Bharat Bhushan wrote:
> Hi Leon,
> 
> On Wed, Feb 5, 2025 at 11:50 PM Leon Romanovsky <leon@kernel.org> wrote:
> >
> > From: Leon Romanovsky <leonro@nvidia.com>
> >
> > XFRM offload path is probed even if offload isn't needed at all. Let's
> > make sure that x->type_offload pointer stays NULL for such path to
> > reduce ambiguity.
> >
> > Fixes: 9d389d7f84bb ("xfrm: Add a xfrm type offload.")
> > Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> > ---
> >  include/net/xfrm.h     | 12 +++++++++++-
> >  net/xfrm/xfrm_device.c | 14 +++++++++-----
> >  net/xfrm/xfrm_state.c  | 22 +++++++++-------------
> >  net/xfrm/xfrm_user.c   |  2 +-
> >  4 files changed, 30 insertions(+), 20 deletions(-)

<...>

> > +       x->type_offload = xfrm_get_type_offload(x->id.proto, x->props.family);
> > +       if (!x->type_offload) {

<...>

> > +               xfrm_put_type_offload(x->type_offload);
> > +               x->type_offload = NULL;
> 
> We always set type_offload to NULL. Can we move type_offload = NULL in
> xfrm_put_type_offload() ?

We can, but it will require change to xfrm_get_type_offload() too,
otherwise we will get asymmetrical get/put.

Do you want something like that?
int xfrm_get_type_offload(struct xfrm_state *x);
void xfrm_put_type_offload(struct xfrm_state *x);

Thansk

> 
> Thanks
> -Bharat
> 
> >                 /* User explicitly requested packet offload mode and configured
> >                  * policy in addition to the XFRM state. So be civil to users,
> >                  * and return an error instead of taking fallback path.
> > diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c
> > index ad2202fa82f3..568fe8df7741 100644
> > --- a/net/xfrm/xfrm_state.c
> > +++ b/net/xfrm/xfrm_state.c
> > @@ -424,11 +424,12 @@ void xfrm_unregister_type_offload(const struct xfrm_type_offload *type,
> >  }
> >  EXPORT_SYMBOL(xfrm_unregister_type_offload);
> >
> > -static const struct xfrm_type_offload *
> > -xfrm_get_type_offload(u8 proto, unsigned short family, bool try_load)
> > +const struct xfrm_type_offload *xfrm_get_type_offload(u8 proto,
> > +                                                     unsigned short family)
> >  {
> >         const struct xfrm_type_offload *type = NULL;
> >         struct xfrm_state_afinfo *afinfo;
> > +       bool try_load = true;
> >
> >  retry:
> >         afinfo = xfrm_state_get_afinfo(family);
> > @@ -456,11 +457,7 @@ xfrm_get_type_offload(u8 proto, unsigned short family, bool try_load)
> >
> >         return type;
> >  }
> > -
> > -static void xfrm_put_type_offload(const struct xfrm_type_offload *type)
> > -{
> > -       module_put(type->owner);
> > -}
> > +EXPORT_SYMBOL(xfrm_get_type_offload);
> >
> >  static const struct xfrm_mode xfrm4_mode_map[XFRM_MODE_MAX] = {
> >         [XFRM_MODE_BEET] = {
> > @@ -609,8 +606,6 @@ static void ___xfrm_state_destroy(struct xfrm_state *x)
> >         kfree(x->coaddr);
> >         kfree(x->replay_esn);
> >         kfree(x->preplay_esn);
> > -       if (x->type_offload)
> > -               xfrm_put_type_offload(x->type_offload);
> >         if (x->type) {
> >                 x->type->destructor(x);
> >                 xfrm_put_type(x->type);
> > @@ -784,6 +779,9 @@ void xfrm_dev_state_free(struct xfrm_state *x)
> >         struct xfrm_dev_offload *xso = &x->xso;
> >         struct net_device *dev = READ_ONCE(xso->dev);
> >
> > +       xfrm_put_type_offload(x->type_offload);
> > +       x->type_offload = NULL;
> > +
> >         if (dev && dev->xfrmdev_ops) {
> >                 spin_lock_bh(&xfrm_state_dev_gc_lock);
> >                 if (!hlist_unhashed(&x->dev_gclist))
> > @@ -3122,7 +3120,7 @@ u32 xfrm_state_mtu(struct xfrm_state *x, int mtu)
> >  }
> >  EXPORT_SYMBOL_GPL(xfrm_state_mtu);
> >
> > -int __xfrm_init_state(struct xfrm_state *x, bool init_replay, bool offload,
> > +int __xfrm_init_state(struct xfrm_state *x, bool init_replay,
> >                       struct netlink_ext_ack *extack)
> >  {
> >         const struct xfrm_mode *inner_mode;
> > @@ -3178,8 +3176,6 @@ int __xfrm_init_state(struct xfrm_state *x, bool init_replay, bool offload,
> >                 goto error;
> >         }
> >
> > -       x->type_offload = xfrm_get_type_offload(x->id.proto, family, offload);
> > -
> >         err = x->type->init_state(x, extack);
> >         if (err)
> >                 goto error;
> > @@ -3229,7 +3225,7 @@ int xfrm_init_state(struct xfrm_state *x)
> >  {
> >         int err;
> >
> > -       err = __xfrm_init_state(x, true, false, NULL);
> > +       err = __xfrm_init_state(x, true, NULL);
> >         if (!err)
> >                 x->km.state = XFRM_STATE_VALID;
> >
> > diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c
> > index 08c6d6f0179f..82a768500999 100644
> > --- a/net/xfrm/xfrm_user.c
> > +++ b/net/xfrm/xfrm_user.c
> > @@ -907,7 +907,7 @@ static struct xfrm_state *xfrm_state_construct(struct net *net,
> >                         goto error;
> >         }
> >
> > -       err = __xfrm_init_state(x, false, attrs[XFRMA_OFFLOAD_DEV], extack);
> > +       err = __xfrm_init_state(x, false, extack);
> >         if (err)
> >                 goto error;
> >
> > --
> > 2.48.1
> >
> >
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH ipsec-next 1/5] xfrm: delay initialization of offload path till its actually requested
  2025-02-06  8:54     ` Leon Romanovsky
@ 2025-02-06 13:59       ` Bharat Bhushan
  2025-02-06 14:26         ` Leon Romanovsky
  0 siblings, 1 reply; 18+ messages in thread
From: Bharat Bhushan @ 2025-02-06 13:59 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Steffen Klassert, Andrew Lunn, Ayush Sawal, Bharat Bhushan,
	Eric Dumazet, Geetha sowjanya, hariprasad, Herbert Xu,
	intel-wired-lan, Jakub Kicinski, Jay Vosburgh, Jonathan Corbet,
	linux-doc, linux-rdma, Louis Peens, netdev, oss-drivers,
	Paolo Abeni, Potnuri Bharat Teja, Przemek Kitszel, Saeed Mahameed,
	Subbaraya Sundeep, Sunil Goutham, Tariq Toukan, Tony Nguyen,
	Ilia Lin

On Thu, Feb 6, 2025 at 2:24 PM Leon Romanovsky <leon@kernel.org> wrote:
>
> On Thu, Feb 06, 2025 at 02:16:08PM +0530, Bharat Bhushan wrote:
> > Hi Leon,
> >
> > On Wed, Feb 5, 2025 at 11:50 PM Leon Romanovsky <leon@kernel.org> wrote:
> > >
> > > From: Leon Romanovsky <leonro@nvidia.com>
> > >
> > > XFRM offload path is probed even if offload isn't needed at all. Let's
> > > make sure that x->type_offload pointer stays NULL for such path to
> > > reduce ambiguity.
> > >
> > > Fixes: 9d389d7f84bb ("xfrm: Add a xfrm type offload.")
> > > Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> > > ---
> > >  include/net/xfrm.h     | 12 +++++++++++-
> > >  net/xfrm/xfrm_device.c | 14 +++++++++-----
> > >  net/xfrm/xfrm_state.c  | 22 +++++++++-------------
> > >  net/xfrm/xfrm_user.c   |  2 +-
> > >  4 files changed, 30 insertions(+), 20 deletions(-)
>
> <...>
>
> > > +       x->type_offload = xfrm_get_type_offload(x->id.proto, x->props.family);
> > > +       if (!x->type_offload) {
>
> <...>
>
> > > +               xfrm_put_type_offload(x->type_offload);
> > > +               x->type_offload = NULL;
> >
> > We always set type_offload to NULL. Can we move type_offload = NULL in
> > xfrm_put_type_offload() ?
>
> We can, but it will require change to xfrm_get_type_offload() too,
> otherwise we will get asymmetrical get/put.

"x->type_offload = NULL" is always set after the put() function. so I
thought that maybe moving "x->type_offload = NULL" to the put()
function would simplify.
Yes, get/put will be asymmetric. Maybe setting "x->type_offload" can
be done in get/put().
Anyway it is not a major comment. ignore if this does not simplify.

Thanks
-Bharat

>
> Do you want something like that?
> int xfrm_get_type_offload(struct xfrm_state *x);
> void xfrm_put_type_offload(struct xfrm_state *x);
>
> Thansk
>
> >
> > Thanks
> > -Bharat
> >
> > >                 /* User explicitly requested packet offload mode and configured
> > >                  * policy in addition to the XFRM state. So be civil to users,
> > >                  * and return an error instead of taking fallback path.
> > > diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c
> > > index ad2202fa82f3..568fe8df7741 100644
> > > --- a/net/xfrm/xfrm_state.c
> > > +++ b/net/xfrm/xfrm_state.c
> > > @@ -424,11 +424,12 @@ void xfrm_unregister_type_offload(const struct xfrm_type_offload *type,
> > >  }
> > >  EXPORT_SYMBOL(xfrm_unregister_type_offload);
> > >
> > > -static const struct xfrm_type_offload *
> > > -xfrm_get_type_offload(u8 proto, unsigned short family, bool try_load)
> > > +const struct xfrm_type_offload *xfrm_get_type_offload(u8 proto,
> > > +                                                     unsigned short family)
> > >  {
> > >         const struct xfrm_type_offload *type = NULL;
> > >         struct xfrm_state_afinfo *afinfo;
> > > +       bool try_load = true;
> > >
> > >  retry:
> > >         afinfo = xfrm_state_get_afinfo(family);
> > > @@ -456,11 +457,7 @@ xfrm_get_type_offload(u8 proto, unsigned short family, bool try_load)
> > >
> > >         return type;
> > >  }
> > > -
> > > -static void xfrm_put_type_offload(const struct xfrm_type_offload *type)
> > > -{
> > > -       module_put(type->owner);
> > > -}
> > > +EXPORT_SYMBOL(xfrm_get_type_offload);
> > >
> > >  static const struct xfrm_mode xfrm4_mode_map[XFRM_MODE_MAX] = {
> > >         [XFRM_MODE_BEET] = {
> > > @@ -609,8 +606,6 @@ static void ___xfrm_state_destroy(struct xfrm_state *x)
> > >         kfree(x->coaddr);
> > >         kfree(x->replay_esn);
> > >         kfree(x->preplay_esn);
> > > -       if (x->type_offload)
> > > -               xfrm_put_type_offload(x->type_offload);
> > >         if (x->type) {
> > >                 x->type->destructor(x);
> > >                 xfrm_put_type(x->type);
> > > @@ -784,6 +779,9 @@ void xfrm_dev_state_free(struct xfrm_state *x)
> > >         struct xfrm_dev_offload *xso = &x->xso;
> > >         struct net_device *dev = READ_ONCE(xso->dev);
> > >
> > > +       xfrm_put_type_offload(x->type_offload);
> > > +       x->type_offload = NULL;
> > > +
> > >         if (dev && dev->xfrmdev_ops) {
> > >                 spin_lock_bh(&xfrm_state_dev_gc_lock);
> > >                 if (!hlist_unhashed(&x->dev_gclist))
> > > @@ -3122,7 +3120,7 @@ u32 xfrm_state_mtu(struct xfrm_state *x, int mtu)
> > >  }
> > >  EXPORT_SYMBOL_GPL(xfrm_state_mtu);
> > >
> > > -int __xfrm_init_state(struct xfrm_state *x, bool init_replay, bool offload,
> > > +int __xfrm_init_state(struct xfrm_state *x, bool init_replay,
> > >                       struct netlink_ext_ack *extack)
> > >  {
> > >         const struct xfrm_mode *inner_mode;
> > > @@ -3178,8 +3176,6 @@ int __xfrm_init_state(struct xfrm_state *x, bool init_replay, bool offload,
> > >                 goto error;
> > >         }
> > >
> > > -       x->type_offload = xfrm_get_type_offload(x->id.proto, family, offload);
> > > -
> > >         err = x->type->init_state(x, extack);
> > >         if (err)
> > >                 goto error;
> > > @@ -3229,7 +3225,7 @@ int xfrm_init_state(struct xfrm_state *x)
> > >  {
> > >         int err;
> > >
> > > -       err = __xfrm_init_state(x, true, false, NULL);
> > > +       err = __xfrm_init_state(x, true, NULL);
> > >         if (!err)
> > >                 x->km.state = XFRM_STATE_VALID;
> > >
> > > diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c
> > > index 08c6d6f0179f..82a768500999 100644
> > > --- a/net/xfrm/xfrm_user.c
> > > +++ b/net/xfrm/xfrm_user.c
> > > @@ -907,7 +907,7 @@ static struct xfrm_state *xfrm_state_construct(struct net *net,
> > >                         goto error;
> > >         }
> > >
> > > -       err = __xfrm_init_state(x, false, attrs[XFRMA_OFFLOAD_DEV], extack);
> > > +       err = __xfrm_init_state(x, false, extack);
> > >         if (err)
> > >                 goto error;
> > >
> > > --
> > > 2.48.1
> > >
> > >
> >

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH ipsec-next 1/5] xfrm: delay initialization of offload path till its actually requested
  2025-02-06 13:59       ` Bharat Bhushan
@ 2025-02-06 14:26         ` Leon Romanovsky
  0 siblings, 0 replies; 18+ messages in thread
From: Leon Romanovsky @ 2025-02-06 14:26 UTC (permalink / raw)
  To: Bharat Bhushan
  Cc: Steffen Klassert, Andrew Lunn, Ayush Sawal, Bharat Bhushan,
	Eric Dumazet, Geetha sowjanya, hariprasad, Herbert Xu,
	intel-wired-lan, Jakub Kicinski, Jay Vosburgh, Jonathan Corbet,
	linux-doc, linux-rdma, Louis Peens, netdev, oss-drivers,
	Paolo Abeni, Potnuri Bharat Teja, Przemek Kitszel, Saeed Mahameed,
	Subbaraya Sundeep, Sunil Goutham, Tariq Toukan, Tony Nguyen,
	Ilia Lin

On Thu, Feb 06, 2025 at 07:29:13PM +0530, Bharat Bhushan wrote:
> On Thu, Feb 6, 2025 at 2:24 PM Leon Romanovsky <leon@kernel.org> wrote:
> >
> > On Thu, Feb 06, 2025 at 02:16:08PM +0530, Bharat Bhushan wrote:
> > > Hi Leon,
> > >
> > > On Wed, Feb 5, 2025 at 11:50 PM Leon Romanovsky <leon@kernel.org> wrote:
> > > >
> > > > From: Leon Romanovsky <leonro@nvidia.com>
> > > >
> > > > XFRM offload path is probed even if offload isn't needed at all. Let's
> > > > make sure that x->type_offload pointer stays NULL for such path to
> > > > reduce ambiguity.
> > > >
> > > > Fixes: 9d389d7f84bb ("xfrm: Add a xfrm type offload.")
> > > > Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> > > > ---
> > > >  include/net/xfrm.h     | 12 +++++++++++-
> > > >  net/xfrm/xfrm_device.c | 14 +++++++++-----
> > > >  net/xfrm/xfrm_state.c  | 22 +++++++++-------------
> > > >  net/xfrm/xfrm_user.c   |  2 +-
> > > >  4 files changed, 30 insertions(+), 20 deletions(-)
> >
> > <...>
> >
> > > > +       x->type_offload = xfrm_get_type_offload(x->id.proto, x->props.family);
> > > > +       if (!x->type_offload) {
> >
> > <...>
> >
> > > > +               xfrm_put_type_offload(x->type_offload);
> > > > +               x->type_offload = NULL;
> > >
> > > We always set type_offload to NULL. Can we move type_offload = NULL in
> > > xfrm_put_type_offload() ?
> >
> > We can, but it will require change to xfrm_get_type_offload() too,
> > otherwise we will get asymmetrical get/put.
> 
> "x->type_offload = NULL" is always set after the put() function. so I
> thought that maybe moving "x->type_offload = NULL" to the put()
> function would simplify.
> Yes, get/put will be asymmetric. Maybe setting "x->type_offload" can
> be done in get/put().
> Anyway it is not a major comment. ignore if this does not simplify.

Thanks, let's wait for other comments. If I need respin the series, I'll
change the functions to the proposed below.

Thanks


> 
> Thanks
> -Bharat
> 
> >
> > Do you want something like that?
> > int xfrm_get_type_offload(struct xfrm_state *x);
> > void xfrm_put_type_offload(struct xfrm_state *x);
> >
> > Thansk
> >
> > >
> > > Thanks
> > > -Bharat
> > >
> > > >                 /* User explicitly requested packet offload mode and configured
> > > >                  * policy in addition to the XFRM state. So be civil to users,
> > > >                  * and return an error instead of taking fallback path.
> > > > diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c
> > > > index ad2202fa82f3..568fe8df7741 100644
> > > > --- a/net/xfrm/xfrm_state.c
> > > > +++ b/net/xfrm/xfrm_state.c
> > > > @@ -424,11 +424,12 @@ void xfrm_unregister_type_offload(const struct xfrm_type_offload *type,
> > > >  }
> > > >  EXPORT_SYMBOL(xfrm_unregister_type_offload);
> > > >
> > > > -static const struct xfrm_type_offload *
> > > > -xfrm_get_type_offload(u8 proto, unsigned short family, bool try_load)
> > > > +const struct xfrm_type_offload *xfrm_get_type_offload(u8 proto,
> > > > +                                                     unsigned short family)
> > > >  {
> > > >         const struct xfrm_type_offload *type = NULL;
> > > >         struct xfrm_state_afinfo *afinfo;
> > > > +       bool try_load = true;
> > > >
> > > >  retry:
> > > >         afinfo = xfrm_state_get_afinfo(family);
> > > > @@ -456,11 +457,7 @@ xfrm_get_type_offload(u8 proto, unsigned short family, bool try_load)
> > > >
> > > >         return type;
> > > >  }
> > > > -
> > > > -static void xfrm_put_type_offload(const struct xfrm_type_offload *type)
> > > > -{
> > > > -       module_put(type->owner);
> > > > -}
> > > > +EXPORT_SYMBOL(xfrm_get_type_offload);
> > > >
> > > >  static const struct xfrm_mode xfrm4_mode_map[XFRM_MODE_MAX] = {
> > > >         [XFRM_MODE_BEET] = {
> > > > @@ -609,8 +606,6 @@ static void ___xfrm_state_destroy(struct xfrm_state *x)
> > > >         kfree(x->coaddr);
> > > >         kfree(x->replay_esn);
> > > >         kfree(x->preplay_esn);
> > > > -       if (x->type_offload)
> > > > -               xfrm_put_type_offload(x->type_offload);
> > > >         if (x->type) {
> > > >                 x->type->destructor(x);
> > > >                 xfrm_put_type(x->type);
> > > > @@ -784,6 +779,9 @@ void xfrm_dev_state_free(struct xfrm_state *x)
> > > >         struct xfrm_dev_offload *xso = &x->xso;
> > > >         struct net_device *dev = READ_ONCE(xso->dev);
> > > >
> > > > +       xfrm_put_type_offload(x->type_offload);
> > > > +       x->type_offload = NULL;
> > > > +
> > > >         if (dev && dev->xfrmdev_ops) {
> > > >                 spin_lock_bh(&xfrm_state_dev_gc_lock);
> > > >                 if (!hlist_unhashed(&x->dev_gclist))
> > > > @@ -3122,7 +3120,7 @@ u32 xfrm_state_mtu(struct xfrm_state *x, int mtu)
> > > >  }
> > > >  EXPORT_SYMBOL_GPL(xfrm_state_mtu);
> > > >
> > > > -int __xfrm_init_state(struct xfrm_state *x, bool init_replay, bool offload,
> > > > +int __xfrm_init_state(struct xfrm_state *x, bool init_replay,
> > > >                       struct netlink_ext_ack *extack)
> > > >  {
> > > >         const struct xfrm_mode *inner_mode;
> > > > @@ -3178,8 +3176,6 @@ int __xfrm_init_state(struct xfrm_state *x, bool init_replay, bool offload,
> > > >                 goto error;
> > > >         }
> > > >
> > > > -       x->type_offload = xfrm_get_type_offload(x->id.proto, family, offload);
> > > > -
> > > >         err = x->type->init_state(x, extack);
> > > >         if (err)
> > > >                 goto error;
> > > > @@ -3229,7 +3225,7 @@ int xfrm_init_state(struct xfrm_state *x)
> > > >  {
> > > >         int err;
> > > >
> > > > -       err = __xfrm_init_state(x, true, false, NULL);
> > > > +       err = __xfrm_init_state(x, true, NULL);
> > > >         if (!err)
> > > >                 x->km.state = XFRM_STATE_VALID;
> > > >
> > > > diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c
> > > > index 08c6d6f0179f..82a768500999 100644
> > > > --- a/net/xfrm/xfrm_user.c
> > > > +++ b/net/xfrm/xfrm_user.c
> > > > @@ -907,7 +907,7 @@ static struct xfrm_state *xfrm_state_construct(struct net *net,
> > > >                         goto error;
> > > >         }
> > > >
> > > > -       err = __xfrm_init_state(x, false, attrs[XFRMA_OFFLOAD_DEV], extack);
> > > > +       err = __xfrm_init_state(x, false, extack);
> > > >         if (err)
> > > >                 goto error;
> > > >
> > > > --
> > > > 2.48.1
> > > >
> > > >
> > >

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH ipsec-next 2/5] xfrm: simplify SA initialization routine
  2025-02-05 18:20 ` [PATCH ipsec-next 2/5] xfrm: simplify SA initialization routine Leon Romanovsky
@ 2025-02-12 11:56   ` Steffen Klassert
  2025-02-12 18:30     ` Leon Romanovsky
  0 siblings, 1 reply; 18+ messages in thread
From: Steffen Klassert @ 2025-02-12 11:56 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Leon Romanovsky, Andrew Lunn, Ayush Sawal, Bharat Bhushan,
	Eric Dumazet, Geetha sowjanya, hariprasad, Herbert Xu,
	intel-wired-lan, Jakub Kicinski, Jay Vosburgh, Jonathan Corbet,
	linux-doc, linux-rdma, Louis Peens, netdev, oss-drivers,
	Paolo Abeni, Potnuri Bharat Teja, Przemek Kitszel, Saeed Mahameed,
	Subbaraya Sundeep, Sunil Goutham, Tariq Toukan, Tony Nguyen,
	Ilia Lin

On Wed, Feb 05, 2025 at 08:20:21PM +0200, Leon Romanovsky wrote:
> From: Leon Romanovsky <leonro@nvidia.com>
> 
> SA replay mode is initialized differently for user-space and
> kernel-space users, but the call to xfrm_init_replay() existed in
> common path with boolean protection. That caused to situation where
> we have two different function orders.
> 
> So let's rewrite the SA initialization flow to have same order for
> both in-kernel and user-space callers.
> 
> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> ---
>  include/net/xfrm.h    |  3 +--
>  net/xfrm/xfrm_state.c | 22 ++++++++++------------
>  net/xfrm/xfrm_user.c  |  2 +-
>  3 files changed, 12 insertions(+), 15 deletions(-)
> 
> diff --git a/include/net/xfrm.h b/include/net/xfrm.h
> index 28355a5be5b9..58f8f7661ec4 100644
> --- a/include/net/xfrm.h
> +++ b/include/net/xfrm.h
> @@ -1770,8 +1770,7 @@ void xfrm_spd_getinfo(struct net *net, struct xfrmk_spdinfo *si);
>  u32 xfrm_replay_seqhi(struct xfrm_state *x, __be32 net_seq);
>  int xfrm_init_replay(struct xfrm_state *x, struct netlink_ext_ack *extack);
>  u32 xfrm_state_mtu(struct xfrm_state *x, int mtu);
> -int __xfrm_init_state(struct xfrm_state *x, bool init_replay,
> -		      struct netlink_ext_ack *extack);
> +int __xfrm_init_state(struct xfrm_state *x, struct netlink_ext_ack *extack);
>  int xfrm_init_state(struct xfrm_state *x);
>  int xfrm_input(struct sk_buff *skb, int nexthdr, __be32 spi, int encap_type);
>  int xfrm_input_resume(struct sk_buff *skb, int nexthdr);
> diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c
> index 568fe8df7741..42799b0946a3 100644
> --- a/net/xfrm/xfrm_state.c
> +++ b/net/xfrm/xfrm_state.c
> @@ -3120,8 +3120,7 @@ u32 xfrm_state_mtu(struct xfrm_state *x, int mtu)
>  }
>  EXPORT_SYMBOL_GPL(xfrm_state_mtu);
>  
> -int __xfrm_init_state(struct xfrm_state *x, bool init_replay,
> -		      struct netlink_ext_ack *extack)
> +int __xfrm_init_state(struct xfrm_state *x, struct netlink_ext_ack *extack)

The whole point of having __xfrm_init_state was to
sepatate codepaths that need init_replay and those
who don't need it. That was a bandaid for something,
unfortunately I don't remenber for what.

If we don't need that anymore, maybe we can merge
__xfrm_init_state into xfrm_init_state, as it was
before.

The rest of the patchset looks OK to me.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH ipsec-next 2/5] xfrm: simplify SA initialization routine
  2025-02-12 11:56   ` Steffen Klassert
@ 2025-02-12 18:30     ` Leon Romanovsky
  2025-02-14  9:29       ` Steffen Klassert
  0 siblings, 1 reply; 18+ messages in thread
From: Leon Romanovsky @ 2025-02-12 18:30 UTC (permalink / raw)
  To: Steffen Klassert
  Cc: Andrew Lunn, Ayush Sawal, Bharat Bhushan, Eric Dumazet,
	Geetha sowjanya, hariprasad, Herbert Xu, intel-wired-lan,
	Jakub Kicinski, Jay Vosburgh, Jonathan Corbet, linux-doc,
	linux-rdma, Louis Peens, netdev, oss-drivers, Paolo Abeni,
	Potnuri Bharat Teja, Przemek Kitszel, Saeed Mahameed,
	Subbaraya Sundeep, Sunil Goutham, Tariq Toukan, Tony Nguyen,
	Ilia Lin

On Wed, Feb 12, 2025 at 12:56:48PM +0100, Steffen Klassert wrote:
> On Wed, Feb 05, 2025 at 08:20:21PM +0200, Leon Romanovsky wrote:
> > From: Leon Romanovsky <leonro@nvidia.com>
> > 
> > SA replay mode is initialized differently for user-space and
> > kernel-space users, but the call to xfrm_init_replay() existed in
> > common path with boolean protection. That caused to situation where
> > we have two different function orders.
> > 
> > So let's rewrite the SA initialization flow to have same order for
> > both in-kernel and user-space callers.
> > 
> > Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> > ---
> >  include/net/xfrm.h    |  3 +--
> >  net/xfrm/xfrm_state.c | 22 ++++++++++------------
> >  net/xfrm/xfrm_user.c  |  2 +-
> >  3 files changed, 12 insertions(+), 15 deletions(-)
> > 
> > diff --git a/include/net/xfrm.h b/include/net/xfrm.h
> > index 28355a5be5b9..58f8f7661ec4 100644
> > --- a/include/net/xfrm.h
> > +++ b/include/net/xfrm.h
> > @@ -1770,8 +1770,7 @@ void xfrm_spd_getinfo(struct net *net, struct xfrmk_spdinfo *si);
> >  u32 xfrm_replay_seqhi(struct xfrm_state *x, __be32 net_seq);
> >  int xfrm_init_replay(struct xfrm_state *x, struct netlink_ext_ack *extack);
> >  u32 xfrm_state_mtu(struct xfrm_state *x, int mtu);
> > -int __xfrm_init_state(struct xfrm_state *x, bool init_replay,
> > -		      struct netlink_ext_ack *extack);
> > +int __xfrm_init_state(struct xfrm_state *x, struct netlink_ext_ack *extack);
> >  int xfrm_init_state(struct xfrm_state *x);
> >  int xfrm_input(struct sk_buff *skb, int nexthdr, __be32 spi, int encap_type);
> >  int xfrm_input_resume(struct sk_buff *skb, int nexthdr);
> > diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c
> > index 568fe8df7741..42799b0946a3 100644
> > --- a/net/xfrm/xfrm_state.c
> > +++ b/net/xfrm/xfrm_state.c
> > @@ -3120,8 +3120,7 @@ u32 xfrm_state_mtu(struct xfrm_state *x, int mtu)
> >  }
> >  EXPORT_SYMBOL_GPL(xfrm_state_mtu);
> >  
> > -int __xfrm_init_state(struct xfrm_state *x, bool init_replay,
> > -		      struct netlink_ext_ack *extack)
> > +int __xfrm_init_state(struct xfrm_state *x, struct netlink_ext_ack *extack)
> 
> The whole point of having __xfrm_init_state was to
> sepatate codepaths that need init_replay and those
> who don't need it. That was a bandaid for something,
> unfortunately I don't remenber for what.
> 
> If we don't need that anymore, maybe we can merge
> __xfrm_init_state into xfrm_init_state, as it was
> before.

Main difference between __xfrm_init_state and xfrm_init_state is that
latter is called without extack, which doesn't exist in kernel path.

E.g  xfrm_init_state(struct xfrm_state *x) vs. __xfrm_init_state(struct xfrm_state *x, struct netlink_ext_ack *extack).
So if we merge them, we will need to change all xfrm_init_state()
callers to provide extack == NULL.

IMHO, such churn of changing xfrm_init_state() callers is not worth it for now.

Thanks

> 
> The rest of the patchset looks OK to me.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH ipsec-next 2/5] xfrm: simplify SA initialization routine
  2025-02-12 18:30     ` Leon Romanovsky
@ 2025-02-14  9:29       ` Steffen Klassert
  2025-02-14 11:14         ` Leon Romanovsky
  0 siblings, 1 reply; 18+ messages in thread
From: Steffen Klassert @ 2025-02-14  9:29 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Andrew Lunn, Ayush Sawal, Bharat Bhushan, Eric Dumazet,
	Geetha sowjanya, hariprasad, Herbert Xu, intel-wired-lan,
	Jakub Kicinski, Jay Vosburgh, Jonathan Corbet, linux-doc,
	linux-rdma, Louis Peens, netdev, oss-drivers, Paolo Abeni,
	Potnuri Bharat Teja, Przemek Kitszel, Saeed Mahameed,
	Subbaraya Sundeep, Sunil Goutham, Tariq Toukan, Tony Nguyen,
	Ilia Lin

On Wed, Feb 12, 2025 at 08:30:20PM +0200, Leon Romanovsky wrote:
> On Wed, Feb 12, 2025 at 12:56:48PM +0100, Steffen Klassert wrote:
> > On Wed, Feb 05, 2025 at 08:20:21PM +0200, Leon Romanovsky wrote:
> > > From: Leon Romanovsky <leonro@nvidia.com>
> > > 
> > > SA replay mode is initialized differently for user-space and
> > > kernel-space users, but the call to xfrm_init_replay() existed in
> > > common path with boolean protection. That caused to situation where
> > > we have two different function orders.
> > > 
> > > So let's rewrite the SA initialization flow to have same order for
> > > both in-kernel and user-space callers.
> > > 
> > > Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> > > ---
> > >  include/net/xfrm.h    |  3 +--
> > >  net/xfrm/xfrm_state.c | 22 ++++++++++------------
> > >  net/xfrm/xfrm_user.c  |  2 +-
> > >  3 files changed, 12 insertions(+), 15 deletions(-)
> > > 
> > > diff --git a/include/net/xfrm.h b/include/net/xfrm.h
> > > index 28355a5be5b9..58f8f7661ec4 100644
> > > --- a/include/net/xfrm.h
> > > +++ b/include/net/xfrm.h
> > > @@ -1770,8 +1770,7 @@ void xfrm_spd_getinfo(struct net *net, struct xfrmk_spdinfo *si);
> > >  u32 xfrm_replay_seqhi(struct xfrm_state *x, __be32 net_seq);
> > >  int xfrm_init_replay(struct xfrm_state *x, struct netlink_ext_ack *extack);
> > >  u32 xfrm_state_mtu(struct xfrm_state *x, int mtu);
> > > -int __xfrm_init_state(struct xfrm_state *x, bool init_replay,
> > > -		      struct netlink_ext_ack *extack);
> > > +int __xfrm_init_state(struct xfrm_state *x, struct netlink_ext_ack *extack);
> > >  int xfrm_init_state(struct xfrm_state *x);
> > >  int xfrm_input(struct sk_buff *skb, int nexthdr, __be32 spi, int encap_type);
> > >  int xfrm_input_resume(struct sk_buff *skb, int nexthdr);
> > > diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c
> > > index 568fe8df7741..42799b0946a3 100644
> > > --- a/net/xfrm/xfrm_state.c
> > > +++ b/net/xfrm/xfrm_state.c
> > > @@ -3120,8 +3120,7 @@ u32 xfrm_state_mtu(struct xfrm_state *x, int mtu)
> > >  }
> > >  EXPORT_SYMBOL_GPL(xfrm_state_mtu);
> > >  
> > > -int __xfrm_init_state(struct xfrm_state *x, bool init_replay,
> > > -		      struct netlink_ext_ack *extack)
> > > +int __xfrm_init_state(struct xfrm_state *x, struct netlink_ext_ack *extack)
> > 
> > The whole point of having __xfrm_init_state was to
> > sepatate codepaths that need init_replay and those
> > who don't need it. That was a bandaid for something,
> > unfortunately I don't remenber for what.
> > 
> > If we don't need that anymore, maybe we can merge
> > __xfrm_init_state into xfrm_init_state, as it was
> > before.
> 
> Main difference between __xfrm_init_state and xfrm_init_state is that
> latter is called without extack, which doesn't exist in kernel path.

That split happened ~ 15 years ago, we did not have extack back than.
But I'm also ok with keeping it if extack is a reason for it.

Do you plan to respin, or should I take the patchset as is?

Thanks!

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH ipsec-next 2/5] xfrm: simplify SA initialization routine
  2025-02-14  9:29       ` Steffen Klassert
@ 2025-02-14 11:14         ` Leon Romanovsky
  0 siblings, 0 replies; 18+ messages in thread
From: Leon Romanovsky @ 2025-02-14 11:14 UTC (permalink / raw)
  To: Steffen Klassert
  Cc: Andrew Lunn, Ayush Sawal, Bharat Bhushan, Eric Dumazet,
	Geetha sowjanya, hariprasad, Herbert Xu, intel-wired-lan,
	Jakub Kicinski, Jay Vosburgh, Jonathan Corbet, linux-doc,
	linux-rdma, Louis Peens, netdev, oss-drivers, Paolo Abeni,
	Potnuri Bharat Teja, Przemek Kitszel, Saeed Mahameed,
	Subbaraya Sundeep, Sunil Goutham, Tariq Toukan, Tony Nguyen,
	Ilia Lin



On Fri, Feb 14, 2025, at 11:29, Steffen Klassert wrote:
> On Wed, Feb 12, 2025 at 08:30:20PM +0200, Leon Romanovsky wrote:
>> On Wed, Feb 12, 2025 at 12:56:48PM +0100, Steffen Klassert wrote:
>> > On Wed, Feb 05, 2025 at 08:20:21PM +0200, Leon Romanovsky wrote:
>> > > From: Leon Romanovsky <leonro@nvidia.com>
>> > > 
>> > > SA replay mode is initialized differently for user-space and
>> > > kernel-space users, but the call to xfrm_init_replay() existed in
>> > > common path with boolean protection. That caused to situation where
>> > > we have two different function orders.
>> > > 
>> > > So let's rewrite the SA initialization flow to have same order for
>> > > both in-kernel and user-space callers.
>> > > 
>> > > Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
>> > > ---
>> > >  include/net/xfrm.h    |  3 +--
>> > >  net/xfrm/xfrm_state.c | 22 ++++++++++------------
>> > >  net/xfrm/xfrm_user.c  |  2 +-
>> > >  3 files changed, 12 insertions(+), 15 deletions(-)
>> > > 
>> > > diff --git a/include/net/xfrm.h b/include/net/xfrm.h
>> > > index 28355a5be5b9..58f8f7661ec4 100644
>> > > --- a/include/net/xfrm.h
>> > > +++ b/include/net/xfrm.h
>> > > @@ -1770,8 +1770,7 @@ void xfrm_spd_getinfo(struct net *net, struct xfrmk_spdinfo *si);
>> > >  u32 xfrm_replay_seqhi(struct xfrm_state *x, __be32 net_seq);
>> > >  int xfrm_init_replay(struct xfrm_state *x, struct netlink_ext_ack *extack);
>> > >  u32 xfrm_state_mtu(struct xfrm_state *x, int mtu);
>> > > -int __xfrm_init_state(struct xfrm_state *x, bool init_replay,
>> > > -		      struct netlink_ext_ack *extack);
>> > > +int __xfrm_init_state(struct xfrm_state *x, struct netlink_ext_ack *extack);
>> > >  int xfrm_init_state(struct xfrm_state *x);
>> > >  int xfrm_input(struct sk_buff *skb, int nexthdr, __be32 spi, int encap_type);
>> > >  int xfrm_input_resume(struct sk_buff *skb, int nexthdr);
>> > > diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c
>> > > index 568fe8df7741..42799b0946a3 100644
>> > > --- a/net/xfrm/xfrm_state.c
>> > > +++ b/net/xfrm/xfrm_state.c
>> > > @@ -3120,8 +3120,7 @@ u32 xfrm_state_mtu(struct xfrm_state *x, int mtu)
>> > >  }
>> > >  EXPORT_SYMBOL_GPL(xfrm_state_mtu);
>> > >  
>> > > -int __xfrm_init_state(struct xfrm_state *x, bool init_replay,
>> > > -		      struct netlink_ext_ack *extack)
>> > > +int __xfrm_init_state(struct xfrm_state *x, struct netlink_ext_ack *extack)
>> > 
>> > The whole point of having __xfrm_init_state was to
>> > sepatate codepaths that need init_replay and those
>> > who don't need it. That was a bandaid for something,
>> > unfortunately I don't remenber for what.
>> > 
>> > If we don't need that anymore, maybe we can merge
>> > __xfrm_init_state into xfrm_init_state, as it was
>> > before.
>> 
>> Main difference between __xfrm_init_state and xfrm_init_state is that
>> latter is called without extack, which doesn't exist in kernel path.
>
> That split happened ~ 15 years ago, we did not have extack back than.
> But I'm also ok with keeping it if extack is a reason for it.
>
> Do you plan to respin, or should I take the patchset as is?

The best way will be if you can take this series as is.

>
> Thanks!

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH ipsec-next 4/5] xfrm: provide common xdo_dev_offload_ok callback implementation
  2025-02-05 18:20 ` [PATCH ipsec-next 4/5] xfrm: provide common xdo_dev_offload_ok callback implementation Leon Romanovsky
@ 2025-02-16  9:33   ` Zhu Yanjun
  2025-02-16 11:07     ` Leon Romanovsky
  0 siblings, 1 reply; 18+ messages in thread
From: Zhu Yanjun @ 2025-02-16  9:33 UTC (permalink / raw)
  To: Leon Romanovsky, Steffen Klassert
  Cc: Leon Romanovsky, Andrew Lunn, Ayush Sawal, Bharat Bhushan,
	Eric Dumazet, Geetha sowjanya, hariprasad, Herbert Xu,
	intel-wired-lan, Jakub Kicinski, Jay Vosburgh, Jonathan Corbet,
	linux-doc, linux-rdma, Louis Peens, netdev, oss-drivers,
	Paolo Abeni, Potnuri Bharat Teja, Przemek Kitszel, Saeed Mahameed,
	Subbaraya Sundeep, Sunil Goutham, Tariq Toukan, Tony Nguyen,
	Ilia Lin

在 2025/2/5 19:20, Leon Romanovsky 写道:
> From: Leon Romanovsky <leonro@nvidia.com>
> 
> Almost all drivers except bond and nsim had same check if device
> can perform XFRM offload on that specific packet. The check was that
> packet doesn't have IPv4 options and IPv6 extensions.
> 
> In NIC drivers, the IPv4 HELEN comparison was slightly different, but
> the intent was to check for the same conditions. So let's chose more
> strict variant as a common base.
> 
> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> ---
>   Documentation/networking/xfrm_device.rst      |  3 ++-
>   drivers/net/bonding/bond_main.c               | 16 +++++---------
>   .../net/ethernet/chelsio/cxgb4/cxgb4_main.c   | 21 -------------------
>   .../inline_crypto/ch_ipsec/chcr_ipsec.c       | 16 --------------
>   .../net/ethernet/intel/ixgbe/ixgbe_ipsec.c    | 21 -------------------
>   drivers/net/ethernet/intel/ixgbevf/ipsec.c    | 21 -------------------
>   .../marvell/octeontx2/nic/cn10k_ipsec.c       | 15 -------------
>   .../mellanox/mlx5/core/en_accel/ipsec.c       | 16 --------------
>   .../net/ethernet/netronome/nfp/crypto/ipsec.c | 11 ----------
>   drivers/net/netdevsim/ipsec.c                 | 11 ----------
>   drivers/net/netdevsim/netdevsim.h             |  1 -
>   net/xfrm/xfrm_device.c                        | 15 +++++++++++++
>   12 files changed, 22 insertions(+), 145 deletions(-)
> 
> diff --git a/Documentation/networking/xfrm_device.rst b/Documentation/networking/xfrm_device.rst
> index 66f6e9a9b59a..39bb98939d1f 100644
> --- a/Documentation/networking/xfrm_device.rst
> +++ b/Documentation/networking/xfrm_device.rst
> @@ -126,7 +126,8 @@ been setup for offload, it first calls into xdo_dev_offload_ok() with
>   the skb and the intended offload state to ask the driver if the offload
>   will serviceable.  This can check the packet information to be sure the
>   offload can be supported (e.g. IPv4 or IPv6, no IPv4 options, etc) and
> -return true of false to signify its support.
> +return true of false to signify its support. In case driver doesn't implement

In this commit, remove the functions cxgb4_ipsec_offload_ok, 
ch_ipsec_offload_ok, ixgbe_ipsec_offload_ok, ixgbevf_ipsec_offload_ok, 
cn10k_ipsec_offload_ok, mlx5e_ipsec_offload_ok, nfp_net_ipsec_offload_ok 
and nsim_ipsec_offload_ok, use the function xfrm_dev_offload_ok to do 
the same work.

But in the file xfrm_device.rst, "return true or false to signify its 
support"?

of --> should be "or"

Thanks a lot.
Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev>

Zhu Yanjun

> +this callback, the stack provides reasonable defaults.
>   
>   Crypto offload mode:
>   When ready to send, the driver needs to inspect the Tx packet for the
> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> index e45bba240cbc..bfb55c23380b 100644
> --- a/drivers/net/bonding/bond_main.c
> +++ b/drivers/net/bonding/bond_main.c
> @@ -676,22 +676,16 @@ static void bond_ipsec_free_sa(struct xfrm_state *xs)
>   static bool bond_ipsec_offload_ok(struct sk_buff *skb, struct xfrm_state *xs)

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH ipsec-next 4/5] xfrm: provide common xdo_dev_offload_ok callback implementation
  2025-02-16  9:33   ` Zhu Yanjun
@ 2025-02-16 11:07     ` Leon Romanovsky
  2025-02-16 12:36       ` Zhu Yanjun
  0 siblings, 1 reply; 18+ messages in thread
From: Leon Romanovsky @ 2025-02-16 11:07 UTC (permalink / raw)
  To: Zhu Yanjun
  Cc: Steffen Klassert, Andrew Lunn, Ayush Sawal, Bharat Bhushan,
	Eric Dumazet, Geetha sowjanya, hariprasad, Herbert Xu,
	intel-wired-lan, Jakub Kicinski, Jay Vosburgh, Jonathan Corbet,
	linux-doc, linux-rdma, Louis Peens, netdev, oss-drivers,
	Paolo Abeni, Potnuri Bharat Teja, Przemek Kitszel, Saeed Mahameed,
	Subbaraya Sundeep, Sunil Goutham, Tariq Toukan, Tony Nguyen,
	Ilia Lin

On Sun, Feb 16, 2025 at 10:33:59AM +0100, Zhu Yanjun wrote:
> 在 2025/2/5 19:20, Leon Romanovsky 写道:
> > From: Leon Romanovsky <leonro@nvidia.com>
> > 
> > Almost all drivers except bond and nsim had same check if device
> > can perform XFRM offload on that specific packet. The check was that
> > packet doesn't have IPv4 options and IPv6 extensions.
> > 
> > In NIC drivers, the IPv4 HELEN comparison was slightly different, but
> > the intent was to check for the same conditions. So let's chose more
> > strict variant as a common base.
> > 
> > Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> > ---
> >   Documentation/networking/xfrm_device.rst      |  3 ++-
> >   drivers/net/bonding/bond_main.c               | 16 +++++---------
> >   .../net/ethernet/chelsio/cxgb4/cxgb4_main.c   | 21 -------------------
> >   .../inline_crypto/ch_ipsec/chcr_ipsec.c       | 16 --------------
> >   .../net/ethernet/intel/ixgbe/ixgbe_ipsec.c    | 21 -------------------
> >   drivers/net/ethernet/intel/ixgbevf/ipsec.c    | 21 -------------------
> >   .../marvell/octeontx2/nic/cn10k_ipsec.c       | 15 -------------
> >   .../mellanox/mlx5/core/en_accel/ipsec.c       | 16 --------------
> >   .../net/ethernet/netronome/nfp/crypto/ipsec.c | 11 ----------
> >   drivers/net/netdevsim/ipsec.c                 | 11 ----------
> >   drivers/net/netdevsim/netdevsim.h             |  1 -
> >   net/xfrm/xfrm_device.c                        | 15 +++++++++++++
> >   12 files changed, 22 insertions(+), 145 deletions(-)
> > 
> > diff --git a/Documentation/networking/xfrm_device.rst b/Documentation/networking/xfrm_device.rst
> > index 66f6e9a9b59a..39bb98939d1f 100644
> > --- a/Documentation/networking/xfrm_device.rst
> > +++ b/Documentation/networking/xfrm_device.rst
> > @@ -126,7 +126,8 @@ been setup for offload, it first calls into xdo_dev_offload_ok() with
> >   the skb and the intended offload state to ask the driver if the offload
> >   will serviceable.  This can check the packet information to be sure the
> >   offload can be supported (e.g. IPv4 or IPv6, no IPv4 options, etc) and
> > -return true of false to signify its support.
> > +return true of false to signify its support. In case driver doesn't implement
> 
> In this commit, remove the functions cxgb4_ipsec_offload_ok,
> ch_ipsec_offload_ok, ixgbe_ipsec_offload_ok, ixgbevf_ipsec_offload_ok,
> cn10k_ipsec_offload_ok, mlx5e_ipsec_offload_ok, nfp_net_ipsec_offload_ok and
> nsim_ipsec_offload_ok, use the function xfrm_dev_offload_ok to do the same
> work.
> 
> But in the file xfrm_device.rst, "return true or false to signify its
> support"?

This sentence continued in the xfrm_device.rst: "...  In case driver doesn't implement
this callback, the stack provides reasonable defaults."

> 
> of --> should be "or"
> 
> Thanks a lot.
> Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev>

Thanks

> 
> Zhu Yanjun
> 
> > +this callback, the stack provides reasonable defaults.
> >   Crypto offload mode:
> >   When ready to send, the driver needs to inspect the Tx packet for the
> > diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> > index e45bba240cbc..bfb55c23380b 100644
> > --- a/drivers/net/bonding/bond_main.c
> > +++ b/drivers/net/bonding/bond_main.c
> > @@ -676,22 +676,16 @@ static void bond_ipsec_free_sa(struct xfrm_state *xs)
> >   static bool bond_ipsec_offload_ok(struct sk_buff *skb, struct xfrm_state *xs)

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH ipsec-next 4/5] xfrm: provide common xdo_dev_offload_ok callback implementation
  2025-02-16 11:07     ` Leon Romanovsky
@ 2025-02-16 12:36       ` Zhu Yanjun
  0 siblings, 0 replies; 18+ messages in thread
From: Zhu Yanjun @ 2025-02-16 12:36 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Steffen Klassert, Andrew Lunn, Ayush Sawal, Bharat Bhushan,
	Eric Dumazet, Geetha sowjanya, hariprasad, Herbert Xu,
	intel-wired-lan, Jakub Kicinski, Jay Vosburgh, Jonathan Corbet,
	linux-doc, linux-rdma, Louis Peens, netdev, oss-drivers,
	Paolo Abeni, Potnuri Bharat Teja, Przemek Kitszel, Saeed Mahameed,
	Subbaraya Sundeep, Sunil Goutham, Tariq Toukan, Tony Nguyen,
	Ilia Lin


在 2025/2/16 12:07, Leon Romanovsky 写道:
> On Sun, Feb 16, 2025 at 10:33:59AM +0100, Zhu Yanjun wrote:
>> 在 2025/2/5 19:20, Leon Romanovsky 写道:
>>> From: Leon Romanovsky <leonro@nvidia.com>
>>>
>>> Almost all drivers except bond and nsim had same check if device
>>> can perform XFRM offload on that specific packet. The check was that
>>> packet doesn't have IPv4 options and IPv6 extensions.
>>>
>>> In NIC drivers, the IPv4 HELEN comparison was slightly different, but
>>> the intent was to check for the same conditions. So let's chose more
>>> strict variant as a common base.
>>>
>>> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
>>> ---
>>>    Documentation/networking/xfrm_device.rst      |  3 ++-
>>>    drivers/net/bonding/bond_main.c               | 16 +++++---------
>>>    .../net/ethernet/chelsio/cxgb4/cxgb4_main.c   | 21 -------------------
>>>    .../inline_crypto/ch_ipsec/chcr_ipsec.c       | 16 --------------
>>>    .../net/ethernet/intel/ixgbe/ixgbe_ipsec.c    | 21 -------------------
>>>    drivers/net/ethernet/intel/ixgbevf/ipsec.c    | 21 -------------------
>>>    .../marvell/octeontx2/nic/cn10k_ipsec.c       | 15 -------------
>>>    .../mellanox/mlx5/core/en_accel/ipsec.c       | 16 --------------
>>>    .../net/ethernet/netronome/nfp/crypto/ipsec.c | 11 ----------
>>>    drivers/net/netdevsim/ipsec.c                 | 11 ----------
>>>    drivers/net/netdevsim/netdevsim.h             |  1 -
>>>    net/xfrm/xfrm_device.c                        | 15 +++++++++++++
>>>    12 files changed, 22 insertions(+), 145 deletions(-)
>>>
>>> diff --git a/Documentation/networking/xfrm_device.rst b/Documentation/networking/xfrm_device.rst
>>> index 66f6e9a9b59a..39bb98939d1f 100644
>>> --- a/Documentation/networking/xfrm_device.rst
>>> +++ b/Documentation/networking/xfrm_device.rst
>>> @@ -126,7 +126,8 @@ been setup for offload, it first calls into xdo_dev_offload_ok() with
>>>    the skb and the intended offload state to ask the driver if the offload
>>>    will serviceable.  This can check the packet information to be sure the
>>>    offload can be supported (e.g. IPv4 or IPv6, no IPv4 options, etc) and
>>> -return true of false to signify its support.
>>> +return true of false to signify its support. In case driver doesn't implement
>> In this commit, remove the functions cxgb4_ipsec_offload_ok,
>> ch_ipsec_offload_ok, ixgbe_ipsec_offload_ok, ixgbevf_ipsec_offload_ok,
>> cn10k_ipsec_offload_ok, mlx5e_ipsec_offload_ok, nfp_net_ipsec_offload_ok and
>> nsim_ipsec_offload_ok, use the function xfrm_dev_offload_ok to do the same
>> work.
>>
>> But in the file xfrm_device.rst, "return true or false to signify its
>> support"?
> This sentence continued in the xfrm_device.rst: "...  In case driver doesn't implement
> this callback, the stack provides reasonable defaults."

Got it.

I mean "... and return true of false to signify its support..."

                                           ^^

should be "... and return true or false to signify its support..."

^_^

Zhu Yanjun

>
>> of --> should be "or"
>>
>> Thanks a lot.
>> Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev>
> Thanks
>
>> Zhu Yanjun
>>
>>> +this callback, the stack provides reasonable defaults.
>>>    Crypto offload mode:
>>>    When ready to send, the driver needs to inspect the Tx packet for the
>>> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>>> index e45bba240cbc..bfb55c23380b 100644
>>> --- a/drivers/net/bonding/bond_main.c
>>> +++ b/drivers/net/bonding/bond_main.c
>>> @@ -676,22 +676,16 @@ static void bond_ipsec_free_sa(struct xfrm_state *xs)
>>>    static bool bond_ipsec_offload_ok(struct sk_buff *skb, struct xfrm_state *xs)

-- 
Best Regards,
Yanjun.Zhu


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH ipsec-next 3/5] xfrm: rely on XFRM offload
  2025-02-05 18:20 ` [PATCH ipsec-next 3/5] xfrm: rely on XFRM offload Leon Romanovsky
@ 2025-02-18 20:39   ` Zhu Yanjun
  0 siblings, 0 replies; 18+ messages in thread
From: Zhu Yanjun @ 2025-02-18 20:39 UTC (permalink / raw)
  To: Leon Romanovsky, Steffen Klassert
  Cc: Leon Romanovsky, Andrew Lunn, Ayush Sawal, Bharat Bhushan,
	Eric Dumazet, Geetha sowjanya, hariprasad, Herbert Xu,
	intel-wired-lan, Jakub Kicinski, Jay Vosburgh, Jonathan Corbet,
	linux-doc, linux-rdma, Louis Peens, netdev, oss-drivers,
	Paolo Abeni, Potnuri Bharat Teja, Przemek Kitszel, Saeed Mahameed,
	Subbaraya Sundeep, Sunil Goutham, Tariq Toukan, Tony Nguyen,
	Ilia Lin

在 2025/2/5 19:20, Leon Romanovsky 写道:
> From: Leon Romanovsky <leonro@nvidia.com>
> 
> After change of initialization of x->type_offload pointer to be valid
> only for offloaded SAs. There is no need to rely both on x->type_offload
                                                   ^^^^^^^^
                                                rely on both ??
Thanks a lot.
Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev>

Zhu Yanjun

> and x->xso.type to determine if SA is offloaded or not.
> 
> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2025-02-18 20:39 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-02-05 18:20 [PATCH ipsec-next 0/5] Support PTMU in tunnel mode for packet offload Leon Romanovsky
2025-02-05 18:20 ` [PATCH ipsec-next 1/5] xfrm: delay initialization of offload path till its actually requested Leon Romanovsky
2025-02-06  8:46   ` Bharat Bhushan
2025-02-06  8:54     ` Leon Romanovsky
2025-02-06 13:59       ` Bharat Bhushan
2025-02-06 14:26         ` Leon Romanovsky
2025-02-05 18:20 ` [PATCH ipsec-next 2/5] xfrm: simplify SA initialization routine Leon Romanovsky
2025-02-12 11:56   ` Steffen Klassert
2025-02-12 18:30     ` Leon Romanovsky
2025-02-14  9:29       ` Steffen Klassert
2025-02-14 11:14         ` Leon Romanovsky
2025-02-05 18:20 ` [PATCH ipsec-next 3/5] xfrm: rely on XFRM offload Leon Romanovsky
2025-02-18 20:39   ` Zhu Yanjun
2025-02-05 18:20 ` [PATCH ipsec-next 4/5] xfrm: provide common xdo_dev_offload_ok callback implementation Leon Romanovsky
2025-02-16  9:33   ` Zhu Yanjun
2025-02-16 11:07     ` Leon Romanovsky
2025-02-16 12:36       ` Zhu Yanjun
2025-02-05 18:20 ` [PATCH ipsec-next 5/5] xfrm: check for PMTU in tunnel mode for packet offload Leon Romanovsky

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).