* [PATCH ipsec-next v2 01/17] xfrm: config: add CONFIG_XFRM_IPTFS
2024-05-20 21:42 [PATCH ipsec-next v1 0/8] Add IP-TFS mode to xfrm Christian Hopps
@ 2024-05-20 21:42 ` Christian Hopps
2024-05-20 21:42 ` [PATCH ipsec-next v2 02/17] include: uapi: add ip_tfs_*_hdr packet formats Christian Hopps
` (16 subsequent siblings)
17 siblings, 0 replies; 34+ messages in thread
From: Christian Hopps @ 2024-05-20 21:42 UTC
To: devel; +Cc: Steffen Klassert, netdev, Christian Hopps, Christian Hopps
From: Christian Hopps <chopps@labn.net>
Add new Kconfig option to enable IP-TFS (RFC9347) functionality.
Signed-off-by: Christian Hopps <chopps@labn.net>
---
net/xfrm/Kconfig | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/net/xfrm/Kconfig b/net/xfrm/Kconfig
index d7b16f2c23e9..f0157702718f 100644
--- a/net/xfrm/Kconfig
+++ b/net/xfrm/Kconfig
@@ -135,6 +135,22 @@ config NET_KEY_MIGRATE
If unsure, say N.
+config XFRM_IPTFS
+ tristate "IPsec IP-TFS/AGGFRAG (RFC 9347) encapsulation support"
+ depends on XFRM
+ help
+ Information on the IP-TFS/AGGFRAG encapsulation can be found
+ in RFC 9347. This feature supports demand driven (i.e.,
+ non-constant send rate) IP-TFS to take advantage of the
+ AGGFRAG ESP payload encapsulation. This payload type
+ supports aggregation and fragmentation of the inner IP
+ packet stream which in turn yields higher small-packet
+ bandwidth as well as reducing MTU/PMTU issues. Congestion
+ control is unimplemented as the send rate is demand driven
+ rather than constant.
+
+ If unsure, say N.
+
config XFRM_ESPINTCP
bool
--
2.45.1
* [PATCH ipsec-next v2 02/17] include: uapi: add ip_tfs_*_hdr packet formats
2024-05-20 21:42 [PATCH ipsec-next v1 0/8] Add IP-TFS mode to xfrm Christian Hopps
2024-05-20 21:42 ` [PATCH ipsec-next v2 01/17] xfrm: config: add CONFIG_XFRM_IPTFS Christian Hopps
@ 2024-05-20 21:42 ` Christian Hopps
2024-05-20 21:42 ` [PATCH ipsec-next v2 03/17] include: uapi: add IPPROTO_AGGFRAG for AGGFRAG in ESP Christian Hopps
` (15 subsequent siblings)
17 siblings, 0 replies; 34+ messages in thread
From: Christian Hopps @ 2024-05-20 21:42 UTC
To: devel; +Cc: Steffen Klassert, netdev, Christian Hopps, Christian Hopps
From: Christian Hopps <chopps@labn.net>
Add the on-wire basic and congestion-control IP-TFS packet headers.
Signed-off-by: Christian Hopps <chopps@labn.net>
---
include/uapi/linux/ip.h | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/include/uapi/linux/ip.h b/include/uapi/linux/ip.h
index 283dec7e3645..5bd7ce934d74 100644
--- a/include/uapi/linux/ip.h
+++ b/include/uapi/linux/ip.h
@@ -137,6 +137,22 @@ struct ip_beet_phdr {
__u8 reserved;
};
+struct ip_iptfs_hdr {
+ __u8 subtype; /* 0*: basic, 1: CC */
+ __u8 flags;
+ __be16 block_offset;
+};
+
+struct ip_iptfs_cc_hdr {
+ __u8 subtype; /* 0: basic, 1*: CC */
+ __u8 flags;
+ __be16 block_offset;
+ __be32 loss_rate;
+ __be64 rtt_adelay_xdelay;
+ __be32 tval;
+ __be32 techo;
+};
+
/* index values for the variables in ipv4_devconf */
enum
{
--
2.45.1
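For readers following the on-wire layout: the basic header above is 4 octets (1 + 1 + 2), and the congestion-control variant adds 4 + 8 + 4 + 4 octets for a total of 24. Below is a minimal userspace sketch of decoding the basic header from a raw buffer. The names `iptfs_basic_hdr` and `iptfs_parse_basic_hdr` are hypothetical and not part of this series; the sketch only assumes that `block_offset` is big-endian, as the `__be16` type indicates.

```c
#include <stddef.h>
#include <stdint.h>

/* Userspace mirror of struct ip_iptfs_hdr (hypothetical name). */
struct iptfs_basic_hdr {
	uint8_t subtype;       /* 0: basic, 1: congestion control */
	uint8_t flags;
	uint16_t block_offset; /* host byte order after parsing */
};

/*
 * Hypothetical helper: decode the 4-octet basic header from buf.
 * Returns 0 on success, -1 if fewer than 4 octets are available.
 */
static int iptfs_parse_basic_hdr(const uint8_t *buf, size_t len,
				 struct iptfs_basic_hdr *out)
{
	if (len < 4)
		return -1;

	out->subtype = buf[0];
	out->flags = buf[1];
	/* __be16 on the wire: most significant octet first. */
	out->block_offset = (uint16_t)((buf[2] << 8) | buf[3]);
	return 0;
}
```

Decoding octet by octet avoids alignment and host-endianness assumptions; in-kernel code would more likely cast to the struct and use `ntohs()` or `be16_to_cpu()`.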
* [PATCH ipsec-next v2 03/17] include: uapi: add IPPROTO_AGGFRAG for AGGFRAG in ESP
2024-05-20 21:42 [PATCH ipsec-next v1 0/8] Add IP-TFS mode to xfrm Christian Hopps
2024-05-20 21:42 ` [PATCH ipsec-next v2 01/17] xfrm: config: add CONFIG_XFRM_IPTFS Christian Hopps
2024-05-20 21:42 ` [PATCH ipsec-next v2 02/17] include: uapi: add ip_tfs_*_hdr packet formats Christian Hopps
@ 2024-05-20 21:42 ` Christian Hopps
2024-05-20 21:42 ` [PATCH ipsec-next v2 04/17] xfrm: sysctl: allow configuration of global default values Christian Hopps
` (14 subsequent siblings)
17 siblings, 0 replies; 34+ messages in thread
From: Christian Hopps @ 2024-05-20 21:42 UTC
To: devel; +Cc: Steffen Klassert, netdev, Christian Hopps, Christian Hopps
From: Christian Hopps <chopps@labn.net>
Add the RFC assigned IP protocol number for AGGFRAG.
Signed-off-by: Christian Hopps <chopps@labn.net>
---
include/uapi/linux/in.h | 2 ++
1 file changed, 2 insertions(+)
diff --git a/include/uapi/linux/in.h b/include/uapi/linux/in.h
index e682ab628dfa..e6a1f3e4c58c 100644
--- a/include/uapi/linux/in.h
+++ b/include/uapi/linux/in.h
@@ -79,6 +79,8 @@ enum {
#define IPPROTO_MPLS IPPROTO_MPLS
IPPROTO_ETHERNET = 143, /* Ethernet-within-IPv6 Encapsulation */
#define IPPROTO_ETHERNET IPPROTO_ETHERNET
+ IPPROTO_AGGFRAG = 144, /* AGGFRAG in ESP (RFC 9347) */
+#define IPPROTO_AGGFRAG IPPROTO_AGGFRAG
IPPROTO_RAW = 255, /* Raw IP packets */
#define IPPROTO_RAW IPPROTO_RAW
IPPROTO_MPTCP = 262, /* Multipath TCP connection */
--
2.45.1
* [PATCH ipsec-next v2 04/17] xfrm: sysctl: allow configuration of global default values
2024-05-20 21:42 [PATCH ipsec-next v1 0/8] Add IP-TFS mode to xfrm Christian Hopps
` (2 preceding siblings ...)
2024-05-20 21:42 ` [PATCH ipsec-next v2 03/17] include: uapi: add IPPROTO_AGGFRAG for AGGFRAG in ESP Christian Hopps
@ 2024-05-20 21:42 ` Christian Hopps
2024-05-20 21:42 ` [PATCH ipsec-next v2 05/17] xfrm: netlink: add config (netlink) options Christian Hopps
` (13 subsequent siblings)
17 siblings, 0 replies; 34+ messages in thread
From: Christian Hopps @ 2024-05-20 21:42 UTC
To: devel; +Cc: Steffen Klassert, netdev, Christian Hopps, Christian Hopps
From: Christian Hopps <chopps@labn.net>
Add sysctls for changing the IPTFS default SA values.
Signed-off-by: Christian Hopps <chopps@labn.net>
---
Documentation/networking/xfrm_sysctl.rst | 30 +++++++++++++++++++
include/net/netns/xfrm.h | 6 ++++
net/xfrm/xfrm_sysctl.c | 38 ++++++++++++++++++++++++
3 files changed, 74 insertions(+)
diff --git a/Documentation/networking/xfrm_sysctl.rst b/Documentation/networking/xfrm_sysctl.rst
index 47b9bbdd0179..4d900c74b405 100644
--- a/Documentation/networking/xfrm_sysctl.rst
+++ b/Documentation/networking/xfrm_sysctl.rst
@@ -9,3 +9,33 @@ XFRM Syscall
xfrm_acq_expires - INTEGER
default 30 - hard timeout in seconds for acquire requests
+
+xfrm_iptfs_max_qsize - UNSIGNED INTEGER
+ The default IPTFS max output queue size in octets. The output queue is
+ where received packets destined for output over an IPTFS tunnel are
+ stored prior to being output in aggregated/fragmented form over the
+ IPTFS tunnel.
+
+ Default 1M.
+
+xfrm_iptfs_drop_time - UNSIGNED INTEGER
+ The default IPTFS drop time in microseconds. The drop time is the amount
+ of time before a missing out-of-order IPTFS tunnel packet is considered
+ lost. See also the reorder window.
+
+ Default 1s (1000000).
+
+xfrm_iptfs_init_delay - UNSIGNED INTEGER
+ The default IPTFS initial output delay in microseconds. The initial
+ output delay is the amount of time prior to servicing the output queue
+ after queueing the first packet on said queue. This applies anytime
+ the output queue was previously empty.
+
+ Default 0.
+
+xfrm_iptfs_reorder_window - UNSIGNED INTEGER
+ The default IPTFS reorder window size. The reorder window size dictates
+ the maximum number of IPTFS tunnel packets in a sequence that may arrive
+ out of order.
+
+ Default 3.
diff --git a/include/net/netns/xfrm.h b/include/net/netns/xfrm.h
index 423b52eca908..e11e71c8ceef 100644
--- a/include/net/netns/xfrm.h
+++ b/include/net/netns/xfrm.h
@@ -66,6 +66,12 @@ struct netns_xfrm {
u32 sysctl_aevent_rseqth;
int sysctl_larval_drop;
u32 sysctl_acq_expires;
+#if IS_ENABLED(CONFIG_XFRM_IPTFS)
+ u32 sysctl_iptfs_drop_time;
+ u32 sysctl_iptfs_init_delay;
+ u32 sysctl_iptfs_max_qsize;
+ u32 sysctl_iptfs_reorder_window;
+#endif
u8 policy_default[XFRM_POLICY_MAX];
diff --git a/net/xfrm/xfrm_sysctl.c b/net/xfrm/xfrm_sysctl.c
index 7fdeafc838a7..dddb1025b7de 100644
--- a/net/xfrm/xfrm_sysctl.c
+++ b/net/xfrm/xfrm_sysctl.c
@@ -10,6 +10,12 @@ static void __net_init __xfrm_sysctl_init(struct net *net)
net->xfrm.sysctl_aevent_rseqth = XFRM_AE_SEQT_SIZE;
net->xfrm.sysctl_larval_drop = 1;
net->xfrm.sysctl_acq_expires = 30;
+#if IS_ENABLED(CONFIG_XFRM_IPTFS)
+ net->xfrm.sysctl_iptfs_max_qsize = 1024 * 1024; /* 1M */
+ net->xfrm.sysctl_iptfs_drop_time = 1000000; /* 1s */
+ net->xfrm.sysctl_iptfs_init_delay = 0; /* no initial delay */
+ net->xfrm.sysctl_iptfs_reorder_window = 3; /* tcp folks suggested */
+#endif
}
#ifdef CONFIG_SYSCTL
@@ -38,6 +44,32 @@ static struct ctl_table xfrm_table[] = {
.mode = 0644,
.proc_handler = proc_dointvec
},
+#if IS_ENABLED(CONFIG_XFRM_IPTFS)
+ {
+ .procname = "xfrm_iptfs_drop_time",
+ .maxlen = sizeof(uint),
+ .mode = 0644,
+ .proc_handler = proc_douintvec
+ },
+ {
+ .procname = "xfrm_iptfs_init_delay",
+ .maxlen = sizeof(uint),
+ .mode = 0644,
+ .proc_handler = proc_douintvec
+ },
+ {
+ .procname = "xfrm_iptfs_max_qsize",
+ .maxlen = sizeof(uint),
+ .mode = 0644,
+ .proc_handler = proc_douintvec
+ },
+ {
+ .procname = "xfrm_iptfs_reorder_window",
+ .maxlen = sizeof(uint),
+ .mode = 0644,
+ .proc_handler = proc_douintvec
+ },
+#endif
{}
};
@@ -55,6 +87,12 @@ int __net_init xfrm_sysctl_init(struct net *net)
table[1].data = &net->xfrm.sysctl_aevent_rseqth;
table[2].data = &net->xfrm.sysctl_larval_drop;
table[3].data = &net->xfrm.sysctl_acq_expires;
+#if IS_ENABLED(CONFIG_XFRM_IPTFS)
+ table[4].data = &net->xfrm.sysctl_iptfs_drop_time;
+ table[5].data = &net->xfrm.sysctl_iptfs_init_delay;
+ table[6].data = &net->xfrm.sysctl_iptfs_max_qsize;
+ table[7].data = &net->xfrm.sysctl_iptfs_reorder_window;
+#endif
/* Don't export sysctls to unprivileged users */
if (net->user_ns != &init_user_ns) {
--
2.45.1
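As an illustration of how a userspace tool might consume these defaults, here is a minimal sketch that reads one unsigned sysctl value from a file and falls back to the compiled-in default when the file is missing or malformed (for example, when CONFIG_XFRM_IPTFS is disabled). The helper name `read_u32_sysctl` and the `IPTFS_DEFAULT_*` macros are hypothetical; the default values are the ones set in `__xfrm_sysctl_init()` above. The path in the usage note assumes the new entries appear next to the existing xfrm sysctls under `/proc/sys/net/core/`.

```c
#include <stdint.h>
#include <stdio.h>

/* Defaults taken from __xfrm_sysctl_init() in this patch. */
#define IPTFS_DEFAULT_MAX_QSIZE      (1024u * 1024u) /* octets */
#define IPTFS_DEFAULT_DROP_TIME      1000000u        /* microseconds */
#define IPTFS_DEFAULT_INIT_DELAY     0u              /* microseconds */
#define IPTFS_DEFAULT_REORDER_WINDOW 3u              /* packets */

/*
 * Hypothetical helper: read a single unsigned value from path.
 * Returns fallback if the file cannot be opened, cannot be parsed,
 * or holds a value that does not fit in 32 bits.
 */
static uint32_t read_u32_sysctl(const char *path, uint32_t fallback)
{
	unsigned long v;
	FILE *f = fopen(path, "r");

	if (!f)
		return fallback;
	if (fscanf(f, "%lu", &v) != 1 || v > UINT32_MAX) {
		fclose(f);
		return fallback;
	}
	fclose(f);
	return (uint32_t)v;
}
```

A caller could then use something like `read_u32_sysctl("/proc/sys/net/core/xfrm_iptfs_drop_time", IPTFS_DEFAULT_DROP_TIME)`, keeping in mind that the path is an assumption rather than something stated in the patch.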
* [PATCH ipsec-next v2 05/17] xfrm: netlink: add config (netlink) options
2024-05-20 21:42 [PATCH ipsec-next v1 0/8] Add IP-TFS mode to xfrm Christian Hopps
` (3 preceding siblings ...)
2024-05-20 21:42 ` [PATCH ipsec-next v2 04/17] xfrm: sysctl: allow configuration of global default values Christian Hopps
@ 2024-05-20 21:42 ` Christian Hopps
2024-05-20 21:42 ` [PATCH ipsec-next v2 06/17] xfrm: add mode_cbs module functionality Christian Hopps
` (12 subsequent siblings)
17 siblings, 0 replies; 34+ messages in thread
From: Christian Hopps @ 2024-05-20 21:42 UTC
To: devel; +Cc: Steffen Klassert, netdev, Christian Hopps, Christian Hopps
From: Christian Hopps <chopps@labn.net>
Add netlink options for configuring IP-TFS SAs.
Signed-off-by: Christian Hopps <chopps@labn.net>
---
include/uapi/linux/xfrm.h | 9 ++++++-
net/xfrm/xfrm_compat.c | 10 ++++++--
net/xfrm/xfrm_user.c | 52 +++++++++++++++++++++++++++++++++++++++
3 files changed, 68 insertions(+), 3 deletions(-)
diff --git a/include/uapi/linux/xfrm.h b/include/uapi/linux/xfrm.h
index 18ceaba8486e..3bd1f810e079 100644
--- a/include/uapi/linux/xfrm.h
+++ b/include/uapi/linux/xfrm.h
@@ -158,7 +158,8 @@ enum {
#define XFRM_MODE_ROUTEOPTIMIZATION 2
#define XFRM_MODE_IN_TRIGGER 3
#define XFRM_MODE_BEET 4
-#define XFRM_MODE_MAX 5
+#define XFRM_MODE_IPTFS 5
+#define XFRM_MODE_MAX 6
/* Netlink configuration messages. */
enum {
@@ -321,6 +322,12 @@ enum xfrm_attr_type_t {
XFRMA_IF_ID, /* __u32 */
XFRMA_MTIMER_THRESH, /* __u32 in seconds for input SA */
XFRMA_SA_DIR, /* __u8 */
+ XFRMA_IPTFS_DROP_TIME, /* __u32 in: usec to wait for next seq */
+ XFRMA_IPTFS_REORDER_WINDOW, /* __u16 in: reorder window size */
+ XFRMA_IPTFS_DONT_FRAG, /* out: don't use fragmentation */
+ XFRMA_IPTFS_INIT_DELAY, /* __u32 out: initial packet wait delay (usec) */
+ XFRMA_IPTFS_MAX_QSIZE, /* __u32 out: max ingress queue size */
+ XFRMA_IPTFS_PKT_SIZE, /* __u32 out: size of outer packet, 0 for PMTU */
__XFRMA_MAX
#define XFRMA_OUTPUT_MARK XFRMA_SET_MARK /* Compatibility */
diff --git a/net/xfrm/xfrm_compat.c b/net/xfrm/xfrm_compat.c
index 703d4172c7d7..a28b9f6503e5 100644
--- a/net/xfrm/xfrm_compat.c
+++ b/net/xfrm/xfrm_compat.c
@@ -280,9 +280,15 @@ static int xfrm_xlate64_attr(struct sk_buff *dst, const struct nlattr *src)
case XFRMA_IF_ID:
case XFRMA_MTIMER_THRESH:
case XFRMA_SA_DIR:
+ case XFRMA_IPTFS_PKT_SIZE:
+ case XFRMA_IPTFS_MAX_QSIZE:
+ case XFRMA_IPTFS_DONT_FRAG:
+ case XFRMA_IPTFS_DROP_TIME:
+ case XFRMA_IPTFS_REORDER_WINDOW:
+ case XFRMA_IPTFS_INIT_DELAY:
return xfrm_nla_cpy(dst, src, nla_len(src));
default:
- BUILD_BUG_ON(XFRMA_MAX != XFRMA_SA_DIR);
+ BUILD_BUG_ON(XFRMA_MAX != XFRMA_IPTFS_INIT_DELAY);
pr_warn_once("unsupported nla_type %d\n", src->nla_type);
return -EOPNOTSUPP;
}
@@ -437,7 +443,7 @@ static int xfrm_xlate32_attr(void *dst, const struct nlattr *nla,
int err;
if (type > XFRMA_MAX) {
- BUILD_BUG_ON(XFRMA_MAX != XFRMA_SA_DIR);
+ BUILD_BUG_ON(XFRMA_MAX != XFRMA_IPTFS_INIT_DELAY);
NL_SET_ERR_MSG(extack, "Bad attribute");
return -EOPNOTSUPP;
}
diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c
index e83c687bd64e..6537bd520363 100644
--- a/net/xfrm/xfrm_user.c
+++ b/net/xfrm/xfrm_user.c
@@ -297,6 +297,16 @@ static int verify_newsa_info(struct xfrm_usersa_info *p,
NL_SET_ERR_MSG(extack, "TFC padding can only be used in tunnel mode");
goto out;
}
+ if ((attrs[XFRMA_IPTFS_DROP_TIME] ||
+ attrs[XFRMA_IPTFS_REORDER_WINDOW] ||
+ attrs[XFRMA_IPTFS_DONT_FRAG] ||
+ attrs[XFRMA_IPTFS_INIT_DELAY] ||
+ attrs[XFRMA_IPTFS_MAX_QSIZE] ||
+ attrs[XFRMA_IPTFS_PKT_SIZE]) &&
+ p->mode != XFRM_MODE_IPTFS) {
+ NL_SET_ERR_MSG(extack, "IP-TFS options can only be used in IP-TFS mode");
+ goto out;
+ }
break;
case IPPROTO_COMP:
@@ -417,6 +427,18 @@ static int verify_newsa_info(struct xfrm_usersa_info *p,
goto out;
}
+ if (attrs[XFRMA_IPTFS_DROP_TIME]) {
+ NL_SET_ERR_MSG(extack, "Drop time should not be set for output SA");
+ err = -EINVAL;
+ goto out;
+ }
+
+ if (attrs[XFRMA_IPTFS_REORDER_WINDOW]) {
+ NL_SET_ERR_MSG(extack, "Reorder window should not be set for output SA");
+ err = -EINVAL;
+ goto out;
+ }
+
if (attrs[XFRMA_REPLAY_VAL]) {
struct xfrm_replay_state *replay;
@@ -454,6 +476,30 @@ static int verify_newsa_info(struct xfrm_usersa_info *p,
}
}
+
+ if (attrs[XFRMA_IPTFS_DONT_FRAG]) {
+ NL_SET_ERR_MSG(extack, "Don't fragment should not be set for input SA");
+ err = -EINVAL;
+ goto out;
+ }
+
+ if (attrs[XFRMA_IPTFS_INIT_DELAY]) {
+ NL_SET_ERR_MSG(extack, "Initial delay should not be set for input SA");
+ err = -EINVAL;
+ goto out;
+ }
+
+ if (attrs[XFRMA_IPTFS_MAX_QSIZE]) {
+ NL_SET_ERR_MSG(extack, "Max queue size should not be set for input SA");
+ err = -EINVAL;
+ goto out;
+ }
+
+ if (attrs[XFRMA_IPTFS_PKT_SIZE]) {
+ NL_SET_ERR_MSG(extack, "Packet size should not be set for input SA");
+ err = -EINVAL;
+ goto out;
+ }
}
out:
@@ -3165,6 +3211,12 @@ const struct nla_policy xfrma_policy[XFRMA_MAX+1] = {
[XFRMA_IF_ID] = { .type = NLA_U32 },
[XFRMA_MTIMER_THRESH] = { .type = NLA_U32 },
[XFRMA_SA_DIR] = NLA_POLICY_RANGE(NLA_U8, XFRM_SA_DIR_IN, XFRM_SA_DIR_OUT),
+ [XFRMA_IPTFS_DROP_TIME] = { .type = NLA_U32 },
+ [XFRMA_IPTFS_REORDER_WINDOW] = { .type = NLA_U16 },
+ [XFRMA_IPTFS_DONT_FRAG] = { .type = NLA_FLAG },
+ [XFRMA_IPTFS_INIT_DELAY] = { .type = NLA_U32 },
+ [XFRMA_IPTFS_MAX_QSIZE] = { .type = NLA_U32 },
+ [XFRMA_IPTFS_PKT_SIZE] = { .type = NLA_U32 },
};
EXPORT_SYMBOL_GPL(xfrma_policy);
--
2.45.1
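The direction rules added to verify_newsa_info() can be summarized in a small userspace model: drop time and reorder window are receive-side options and are rejected on an output SA, while don't-fragment, initial delay, max queue size and packet size are send-side options and are rejected on an input SA. This is only a sketch of that logic under simplifying assumptions. The `OPT_*` bit flags, the `SA_DIR_*` enum and `iptfs_opts_valid()` are hypothetical stand-ins for the `attrs[]` lookups and do not exist in the kernel.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the SA direction values. */
enum { SA_DIR_UNSET = 0, SA_DIR_IN = 1, SA_DIR_OUT = 2 };

/* Hypothetical bit flags, one per XFRMA_IPTFS_* attribute present. */
#define OPT_DROP_TIME      (1u << 0)
#define OPT_REORDER_WINDOW (1u << 1)
#define OPT_DONT_FRAG      (1u << 2)
#define OPT_INIT_DELAY     (1u << 3)
#define OPT_MAX_QSIZE      (1u << 4)
#define OPT_PKT_SIZE       (1u << 5)

/* Options that only make sense on the receiving (input) SA. */
#define OPTS_IN_ONLY  (OPT_DROP_TIME | OPT_REORDER_WINDOW)
/* Options that only make sense on the sending (output) SA. */
#define OPTS_OUT_ONLY (OPT_DONT_FRAG | OPT_INIT_DELAY | \
		       OPT_MAX_QSIZE | OPT_PKT_SIZE)

/* Returns 1 if the option set is acceptable, 0 if it would be rejected. */
static int iptfs_opts_valid(int dir, int is_iptfs_mode, uint32_t opts)
{
	/* Any IP-TFS option requires IP-TFS mode. */
	if (opts && !is_iptfs_mode)
		return 0;
	if (dir == SA_DIR_OUT && (opts & OPTS_IN_ONLY))
		return 0;
	if (dir == SA_DIR_IN && (opts & OPTS_OUT_ONLY))
		return 0;
	return 1;
}
```

The model collapses the per-attribute error messages into a single pass/fail result; the kernel code reports each offending attribute separately through extack.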
* [PATCH ipsec-next v2 06/17] xfrm: add mode_cbs module functionality
2024-05-20 21:42 [PATCH ipsec-next v1 0/8] Add IP-TFS mode to xfrm Christian Hopps
` (4 preceding siblings ...)
2024-05-20 21:42 ` [PATCH ipsec-next v2 05/17] xfrm: netlink: add config (netlink) options Christian Hopps
@ 2024-05-20 21:42 ` Christian Hopps
2024-05-20 21:42 ` [PATCH ipsec-next v2 07/17] xfrm: add generic iptfs defines and functionality Christian Hopps
` (11 subsequent siblings)
17 siblings, 0 replies; 34+ messages in thread
From: Christian Hopps @ 2024-05-20 21:42 UTC
To: devel; +Cc: Steffen Klassert, netdev, Christian Hopps, Christian Hopps
From: Christian Hopps <chopps@labn.net>
Add a set of callbacks, xfrm_mode_cbs, to xfrm_state. These callbacks
allow new xfrm modes, such as IP-TFS, to be defined in modules.
Signed-off-by: Christian Hopps <chopps@labn.net>
---
include/net/xfrm.h | 39 ++++++++++++++++++++++++++++++++++
net/xfrm/xfrm_device.c | 3 ++-
net/xfrm/xfrm_input.c | 14 ++++++++++--
net/xfrm/xfrm_output.c | 2 ++
net/xfrm/xfrm_policy.c | 18 ++++++++++------
net/xfrm/xfrm_state.c | 48 ++++++++++++++++++++++++++++++++++++++++++
net/xfrm/xfrm_user.c | 13 ++++++++++++
7 files changed, 127 insertions(+), 10 deletions(-)
diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index 7c9be06f8302..cb75ec2993bf 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -206,6 +206,7 @@ struct xfrm_state {
u16 family;
xfrm_address_t saddr;
int header_len;
+ int enc_hdr_len;
int trailer_len;
u32 extra_flags;
struct xfrm_mark smark;
@@ -292,6 +293,9 @@ struct xfrm_state {
* interpreted by xfrm_type methods. */
void *data;
u8 dir;
+
+ const struct xfrm_mode_cbs *mode_cbs;
+ void *mode_data;
};
static inline struct net *xs_net(struct xfrm_state *x)
@@ -444,6 +448,41 @@ struct xfrm_type_offload {
int xfrm_register_type_offload(const struct xfrm_type_offload *type, unsigned short family);
void xfrm_unregister_type_offload(const struct xfrm_type_offload *type, unsigned short family);
+struct xfrm_mode_cbs {
+ struct module *owner;
+ /* Add/delete state in the new xfrm_state in `x`. */
+ int (*create_state)(struct xfrm_state *x);
+ void (*delete_state)(struct xfrm_state *x);
+
+ /* Called while handling the user netlink options. */
+ int (*user_init)(struct net *net, struct xfrm_state *x,
+ struct nlattr **attrs,
+ struct netlink_ext_ack *extack);
+ int (*copy_to_user)(struct xfrm_state *x, struct sk_buff *skb);
+ int (*clone)(struct xfrm_state *x, struct xfrm_state *orig);
+ unsigned int (*sa_len)(const struct xfrm_state *x);
+
+ u32 (*get_inner_mtu)(struct xfrm_state *x, int outer_mtu);
+
+ /* Called to handle received xfrm (egress) packets. */
+ int (*input)(struct xfrm_state *x, struct sk_buff *skb);
+
+ /* Placed in dst_output of the dst when an xfrm_state is bound. */
+ int (*output)(struct net *net, struct sock *sk, struct sk_buff *skb);
+
+ /**
+ * Prepare the skb for output for the given mode. Returns:
+ * Error value, if 0 then skb values should be as follows:
+ * transport_header should point at ESP header
+ * network_header should point at Outer IP header
+ * mac_header should point at protocol/nexthdr of the outer IP
+ */
+ int (*prepare_output)(struct xfrm_state *x, struct sk_buff *skb);
+};
+
+int xfrm_register_mode_cbs(u8 mode, const struct xfrm_mode_cbs *mode_cbs);
+void xfrm_unregister_mode_cbs(u8 mode);
+
static inline int xfrm_af2proto(unsigned int family)
{
switch(family) {
diff --git a/net/xfrm/xfrm_device.c b/net/xfrm/xfrm_device.c
index 2455a76a1cff..f91b2bee8190 100644
--- a/net/xfrm/xfrm_device.c
+++ b/net/xfrm/xfrm_device.c
@@ -42,7 +42,8 @@ static void __xfrm_mode_tunnel_prep(struct xfrm_state *x, struct sk_buff *skb,
skb->transport_header = skb->network_header + hsize;
skb_reset_mac_len(skb);
- pskb_pull(skb, skb->mac_len + x->props.header_len);
+ pskb_pull(skb,
+ skb->mac_len + x->props.header_len - x->props.enc_hdr_len);
}
static void __xfrm_mode_beet_prep(struct xfrm_state *x, struct sk_buff *skb,
diff --git a/net/xfrm/xfrm_input.c b/net/xfrm/xfrm_input.c
index 71b42de6e3c9..8ef1af2d39bf 100644
--- a/net/xfrm/xfrm_input.c
+++ b/net/xfrm/xfrm_input.c
@@ -438,6 +438,9 @@ static int xfrm_inner_mode_input(struct xfrm_state *x,
WARN_ON_ONCE(1);
break;
default:
+ if (x->mode_cbs && x->mode_cbs->input)
+ return x->mode_cbs->input(x, skb);
+
WARN_ON_ONCE(1);
break;
}
@@ -485,6 +488,10 @@ int xfrm_input(struct sk_buff *skb, int nexthdr, __be32 spi, int encap_type)
family = x->props.family;
+ /* An encap_type of -3 indicates reconstructed inner packet */
+ if (encap_type == -3)
+ goto resume_decapped;
+
/* An encap_type of -1 indicates async resumption. */
if (encap_type == -1) {
async = 1;
@@ -672,11 +679,14 @@ int xfrm_input(struct sk_buff *skb, int nexthdr, __be32 spi, int encap_type)
XFRM_MODE_SKB_CB(skb)->protocol = nexthdr;
- if (xfrm_inner_mode_input(x, skb)) {
+ err = xfrm_inner_mode_input(x, skb);
+ if (err == -EINPROGRESS)
+ return 0;
+ else if (err) {
XFRM_INC_STATS(net, LINUX_MIB_XFRMINSTATEMODEERROR);
goto drop;
}
-
+resume_decapped:
if (x->outer_mode.flags & XFRM_MODE_FLAG_TUNNEL) {
decaps = 1;
break;
diff --git a/net/xfrm/xfrm_output.c b/net/xfrm/xfrm_output.c
index e5722c95b8bb..ef81359e4038 100644
--- a/net/xfrm/xfrm_output.c
+++ b/net/xfrm/xfrm_output.c
@@ -472,6 +472,8 @@ static int xfrm_outer_mode_output(struct xfrm_state *x, struct sk_buff *skb)
WARN_ON_ONCE(1);
break;
default:
+ if (x->mode_cbs && x->mode_cbs->prepare_output)
+ return x->mode_cbs->prepare_output(x, skb);
WARN_ON_ONCE(1);
break;
}
diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
index 298b3a9eb48d..a3b50e8bc85a 100644
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -2721,13 +2721,17 @@ static struct dst_entry *xfrm_bundle_create(struct xfrm_policy *policy,
dst1->input = dst_discard;
- rcu_read_lock();
- afinfo = xfrm_state_afinfo_get_rcu(inner_mode->family);
- if (likely(afinfo))
- dst1->output = afinfo->output;
- else
- dst1->output = dst_discard_out;
- rcu_read_unlock();
+ if (xfrm[i]->mode_cbs && xfrm[i]->mode_cbs->output) {
+ dst1->output = xfrm[i]->mode_cbs->output;
+ } else {
+ rcu_read_lock();
+ afinfo = xfrm_state_afinfo_get_rcu(inner_mode->family);
+ if (likely(afinfo))
+ dst1->output = afinfo->output;
+ else
+ dst1->output = dst_discard_out;
+ rcu_read_unlock();
+ }
xdst_prev = xdst;
diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c
index 649bb739df0d..e9ea5c5dd183 100644
--- a/net/xfrm/xfrm_state.c
+++ b/net/xfrm/xfrm_state.c
@@ -513,6 +513,36 @@ static const struct xfrm_mode *xfrm_get_mode(unsigned int encap, int family)
return NULL;
}
+static struct xfrm_mode_cbs xfrm_mode_cbs_map[XFRM_MODE_MAX];
+
+int xfrm_register_mode_cbs(u8 mode, const struct xfrm_mode_cbs *mode_cbs)
+{
+ if (mode >= XFRM_MODE_MAX)
+ return -EINVAL;
+
+ xfrm_mode_cbs_map[mode] = *mode_cbs;
+ return 0;
+}
+EXPORT_SYMBOL(xfrm_register_mode_cbs);
+
+void xfrm_unregister_mode_cbs(u8 mode)
+{
+ if (mode >= XFRM_MODE_MAX)
+ return;
+
+ memset(&xfrm_mode_cbs_map[mode], 0, sizeof(xfrm_mode_cbs_map[mode]));
+}
+EXPORT_SYMBOL(xfrm_unregister_mode_cbs);
+
+static const struct xfrm_mode_cbs *xfrm_get_mode_cbs(u8 mode)
+{
+ if (mode >= XFRM_MODE_MAX)
+ return NULL;
+ if (mode == XFRM_MODE_IPTFS && !xfrm_mode_cbs_map[mode].create_state)
+ request_module("xfrm-iptfs");
+ return &xfrm_mode_cbs_map[mode];
+}
+
void xfrm_state_free(struct xfrm_state *x)
{
kmem_cache_free(xfrm_state_cache, x);
@@ -521,6 +551,8 @@ EXPORT_SYMBOL(xfrm_state_free);
static void ___xfrm_state_destroy(struct xfrm_state *x)
{
+ if (x->mode_cbs && x->mode_cbs->delete_state)
+ x->mode_cbs->delete_state(x);
hrtimer_cancel(&x->mtimer);
del_timer_sync(&x->rtimer);
kfree(x->aead);
@@ -678,6 +710,7 @@ struct xfrm_state *xfrm_state_alloc(struct net *net)
x->replay_maxage = 0;
x->replay_maxdiff = 0;
spin_lock_init(&x->lock);
+ x->mode_data = NULL;
}
return x;
}
@@ -1747,6 +1780,12 @@ static struct xfrm_state *xfrm_state_clone(struct xfrm_state *orig,
x->new_mapping_sport = 0;
x->dir = orig->dir;
+ x->mode_cbs = orig->mode_cbs;
+ if (x->mode_cbs && x->mode_cbs->clone) {
+ if (x->mode_cbs->clone(x, orig))
+ goto error;
+ }
+
return x;
error:
@@ -2786,6 +2825,9 @@ u32 xfrm_state_mtu(struct xfrm_state *x, int mtu)
case XFRM_MODE_TUNNEL:
break;
default:
+ if (x->mode_cbs && x->mode_cbs->get_inner_mtu)
+ return x->mode_cbs->get_inner_mtu(x, mtu);
+
WARN_ON_ONCE(1);
break;
}
@@ -2871,6 +2913,12 @@ int __xfrm_init_state(struct xfrm_state *x, bool init_replay, bool offload,
goto error;
}
+ x->mode_cbs = xfrm_get_mode_cbs(x->props.mode);
+ if (x->mode_cbs && x->mode_cbs->create_state) {
+ err = x->mode_cbs->create_state(x);
+ if (err)
+ goto error;
+ }
error:
return err;
}
diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c
index 6537bd520363..dfd52637abed 100644
--- a/net/xfrm/xfrm_user.c
+++ b/net/xfrm/xfrm_user.c
@@ -914,6 +914,12 @@ static struct xfrm_state *xfrm_state_construct(struct net *net,
goto error;
}
+ if (x->mode_cbs && x->mode_cbs->user_init) {
+ err = x->mode_cbs->user_init(net, x, attrs, extack);
+ if (err)
+ goto error;
+ }
+
return x;
error:
@@ -1327,6 +1333,10 @@ static int copy_to_user_state_extra(struct xfrm_state *x,
if (ret)
goto out;
}
+ if (x->mode_cbs && x->mode_cbs->copy_to_user)
+ ret = x->mode_cbs->copy_to_user(x, skb);
+ if (ret)
+ goto out;
if (x->mapping_maxage) {
ret = nla_put_u32(skb, XFRMA_MTIMER_THRESH, x->mapping_maxage);
if (ret)
@@ -3526,6 +3536,9 @@ static inline unsigned int xfrm_sa_len(struct xfrm_state *x)
if (x->dir)
l += nla_total_size(sizeof(x->dir));
+ if (x->mode_cbs && x->mode_cbs->sa_len)
+ l += x->mode_cbs->sa_len(x);
+
return l;
}
--
2.45.1
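To make the registration flow easier to follow, here is a reduced userspace model of the callback table: a mode registers its callbacks by value into a fixed array, state initialization looks the entry up, and each callback is NULL-checked before use. All names here (`mode_cbs`, `state`, `register_mode_cbs`, `demo_create`, and so on) are hypothetical simplifications of the kernel structures, and the sketch omits module loading, locking and the other callbacks.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MODE_IPTFS 5 /* mirrors XFRM_MODE_IPTFS */
#define MODE_MAX   6 /* mirrors XFRM_MODE_MAX */

struct state;

/* Reduced callback set; the patch defines several more. */
struct mode_cbs {
	int (*create_state)(struct state *x);
	void (*delete_state)(struct state *x);
};

struct state {
	unsigned char mode;
	const struct mode_cbs *mode_cbs;
	void *mode_data;
};

static struct mode_cbs mode_cbs_map[MODE_MAX];

static int register_mode_cbs(unsigned char mode, const struct mode_cbs *cbs)
{
	if (mode >= MODE_MAX)
		return -EINVAL;
	mode_cbs_map[mode] = *cbs; /* copied by value, as in the patch */
	return 0;
}

static void unregister_mode_cbs(unsigned char mode)
{
	if (mode >= MODE_MAX)
		return;
	memset(&mode_cbs_map[mode], 0, sizeof(mode_cbs_map[mode]));
}

/* Returns the map entry even when it is empty; callers NULL-check callbacks. */
static const struct mode_cbs *get_mode_cbs(unsigned char mode)
{
	if (mode >= MODE_MAX)
		return NULL;
	return &mode_cbs_map[mode];
}

/* Models the tail of __xfrm_init_state(). */
static int init_state(struct state *x)
{
	x->mode_cbs = get_mode_cbs(x->mode);
	if (x->mode_cbs && x->mode_cbs->create_state)
		return x->mode_cbs->create_state(x);
	return 0;
}

/* Hypothetical mode implementation used only for demonstration. */
static int demo_create(struct state *x)
{
	x->mode_data = x; /* stand-in for allocating per-mode data */
	return 0;
}

static void demo_delete(struct state *x)
{
	x->mode_data = NULL;
}
```

Because the lookup hands back a pointer into the array instead of NULL for unregistered modes, an unregistered mode behaves like a mode with no callbacks. The model shows why every call site in the patch tests both `x->mode_cbs` and the specific callback.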
* [PATCH ipsec-next v2 07/17] xfrm: add generic iptfs defines and functionality
2024-05-20 21:42 [PATCH ipsec-next v1 0/8] Add IP-TFS mode to xfrm Christian Hopps
` (5 preceding siblings ...)
2024-05-20 21:42 ` [PATCH ipsec-next v2 06/17] xfrm: add mode_cbs module functionality Christian Hopps
@ 2024-05-20 21:42 ` Christian Hopps
2024-05-20 21:42 ` [PATCH ipsec-next v2 08/17] xfrm: iptfs: add new iptfs xfrm mode impl Christian Hopps
` (10 subsequent siblings)
17 siblings, 0 replies; 34+ messages in thread
From: Christian Hopps @ 2024-05-20 21:42 UTC
To: devel; +Cc: Steffen Klassert, netdev, Christian Hopps, Christian Hopps
From: Christian Hopps <chopps@labn.net>
Define the `XFRM_MODE_IPTFS` and `IPSEC_MODE_IPTFS` constants, and add them to
the switch cases and conditionals alongside the existing TUNNEL mode.
Signed-off-by: Christian Hopps <chopps@labn.net>
---
include/net/xfrm.h | 1 +
include/uapi/linux/ipsec.h | 3 ++-
include/uapi/linux/snmp.h | 3 +++
net/ipv4/esp4.c | 3 ++-
net/ipv6/esp6.c | 3 ++-
net/netfilter/nft_xfrm.c | 3 ++-
net/xfrm/xfrm_device.c | 1 +
net/xfrm/xfrm_output.c | 4 ++++
net/xfrm/xfrm_policy.c | 8 ++++++--
net/xfrm/xfrm_proc.c | 3 +++
net/xfrm/xfrm_state.c | 12 ++++++++++++
net/xfrm/xfrm_user.c | 10 ++++++++++
12 files changed, 48 insertions(+), 6 deletions(-)
diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index cb75ec2993bf..0d8abf7bd32e 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -37,6 +37,7 @@
#define XFRM_PROTO_COMP 108
#define XFRM_PROTO_IPIP 4
#define XFRM_PROTO_IPV6 41
+#define XFRM_PROTO_IPTFS IPPROTO_AGGFRAG
#define XFRM_PROTO_ROUTING IPPROTO_ROUTING
#define XFRM_PROTO_DSTOPTS IPPROTO_DSTOPTS
diff --git a/include/uapi/linux/ipsec.h b/include/uapi/linux/ipsec.h
index 50d8ee1791e2..696b790f4346 100644
--- a/include/uapi/linux/ipsec.h
+++ b/include/uapi/linux/ipsec.h
@@ -14,7 +14,8 @@ enum {
IPSEC_MODE_ANY = 0, /* We do not support this for SA */
IPSEC_MODE_TRANSPORT = 1,
IPSEC_MODE_TUNNEL = 2,
- IPSEC_MODE_BEET = 3
+ IPSEC_MODE_BEET = 3,
+ IPSEC_MODE_IPTFS = 4
};
enum {
diff --git a/include/uapi/linux/snmp.h b/include/uapi/linux/snmp.h
index adf5fd78dd50..77eb078f06a6 100644
--- a/include/uapi/linux/snmp.h
+++ b/include/uapi/linux/snmp.h
@@ -339,6 +339,9 @@ enum
LINUX_MIB_XFRMACQUIREERROR, /* XfrmAcquireError */
LINUX_MIB_XFRMOUTSTATEDIRERROR, /* XfrmOutStateDirError */
LINUX_MIB_XFRMINSTATEDIRERROR, /* XfrmInStateDirError */
+ LINUX_MIB_XFRMNOSKBERROR, /* XfrmNoSkbError */
+ LINUX_MIB_XFRMINIPTFSERROR, /* XfrmInIptfsError */
+ LINUX_MIB_XFRMOUTNOQSPACE, /* XfrmOutNoQueueSpace */
__LINUX_MIB_XFRMMAX
};
diff --git a/net/ipv4/esp4.c b/net/ipv4/esp4.c
index 7d38ddd64115..2da9fd4efb70 100644
--- a/net/ipv4/esp4.c
+++ b/net/ipv4/esp4.c
@@ -809,7 +809,8 @@ int esp_input_done2(struct sk_buff *skb, int err)
}
skb_pull_rcsum(skb, hlen);
- if (x->props.mode == XFRM_MODE_TUNNEL)
+ if (x->props.mode == XFRM_MODE_TUNNEL ||
+ x->props.mode == XFRM_MODE_IPTFS)
skb_reset_transport_header(skb);
else
skb_set_transport_header(skb, -ihl);
diff --git a/net/ipv6/esp6.c b/net/ipv6/esp6.c
index 27df148530a6..0e50ff3eccdb 100644
--- a/net/ipv6/esp6.c
+++ b/net/ipv6/esp6.c
@@ -858,7 +858,8 @@ int esp6_input_done2(struct sk_buff *skb, int err)
skb_postpull_rcsum(skb, skb_network_header(skb),
skb_network_header_len(skb));
skb_pull_rcsum(skb, hlen);
- if (x->props.mode == XFRM_MODE_TUNNEL)
+ if (x->props.mode == XFRM_MODE_TUNNEL ||
+ x->props.mode == XFRM_MODE_IPTFS)
skb_reset_transport_header(skb);
else
skb_set_transport_header(skb, -hdr_len);
diff --git a/net/netfilter/nft_xfrm.c b/net/netfilter/nft_xfrm.c
index 1c866757db55..620238c6ef4c 100644
--- a/net/netfilter/nft_xfrm.c
+++ b/net/netfilter/nft_xfrm.c
@@ -112,7 +112,8 @@ static bool xfrm_state_addr_ok(enum nft_xfrm_keys k, u8 family, u8 mode)
return true;
}
- return mode == XFRM_MODE_BEET || mode == XFRM_MODE_TUNNEL;
+ return mode == XFRM_MODE_BEET || mode == XFRM_MODE_TUNNEL ||
+ mode == XFRM_MODE_IPTFS;
}
static void nft_xfrm_state_get_key(const struct nft_xfrm *priv,
diff --git a/net/xfrm/xfrm_device.c b/net/xfrm/xfrm_device.c
index f91b2bee8190..b73af6918028 100644
--- a/net/xfrm/xfrm_device.c
+++ b/net/xfrm/xfrm_device.c
@@ -69,6 +69,7 @@ static void __xfrm_mode_beet_prep(struct xfrm_state *x, struct sk_buff *skb,
static void xfrm_outer_mode_prep(struct xfrm_state *x, struct sk_buff *skb)
{
switch (x->outer_mode.encap) {
+ case XFRM_MODE_IPTFS:
case XFRM_MODE_TUNNEL:
if (x->outer_mode.family == AF_INET)
return __xfrm_mode_tunnel_prep(x, skb,
diff --git a/net/xfrm/xfrm_output.c b/net/xfrm/xfrm_output.c
index ef81359e4038..b5025cf6136e 100644
--- a/net/xfrm/xfrm_output.c
+++ b/net/xfrm/xfrm_output.c
@@ -677,6 +677,10 @@ static void xfrm_get_inner_ipproto(struct sk_buff *skb, struct xfrm_state *x)
return;
}
+ if (x->outer_mode.encap == XFRM_MODE_IPTFS) {
+ xo->inner_ipproto = IPPROTO_AGGFRAG;
+ return;
+ }
/* non-Tunnel Mode */
if (!skb->encapsulation)
diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
index a3b50e8bc85a..dd58f84b4c13 100644
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -2474,6 +2474,7 @@ xfrm_tmpl_resolve_one(struct xfrm_policy *policy, const struct flowi *fl,
struct xfrm_tmpl *tmpl = &policy->xfrm_vec[i];
if (tmpl->mode == XFRM_MODE_TUNNEL ||
+ tmpl->mode == XFRM_MODE_IPTFS ||
tmpl->mode == XFRM_MODE_BEET) {
remote = &tmpl->id.daddr;
local = &tmpl->saddr;
@@ -3266,7 +3267,8 @@ struct dst_entry *xfrm_lookup_with_ifid(struct net *net,
ok:
xfrm_pols_put(pols, drop_pols);
if (dst && dst->xfrm &&
- dst->xfrm->props.mode == XFRM_MODE_TUNNEL)
+ (dst->xfrm->props.mode == XFRM_MODE_TUNNEL ||
+ dst->xfrm->props.mode == XFRM_MODE_IPTFS))
dst->flags |= DST_XFRM_TUNNEL;
return dst;
@@ -4503,6 +4505,7 @@ static int migrate_tmpl_match(const struct xfrm_migrate *m, const struct xfrm_tm
switch (t->mode) {
case XFRM_MODE_TUNNEL:
case XFRM_MODE_BEET:
+ case XFRM_MODE_IPTFS:
if (xfrm_addr_equal(&t->id.daddr, &m->old_daddr,
m->old_family) &&
xfrm_addr_equal(&t->saddr, &m->old_saddr,
@@ -4545,7 +4548,8 @@ static int xfrm_policy_migrate(struct xfrm_policy *pol,
continue;
n++;
if (pol->xfrm_vec[i].mode != XFRM_MODE_TUNNEL &&
- pol->xfrm_vec[i].mode != XFRM_MODE_BEET)
+ pol->xfrm_vec[i].mode != XFRM_MODE_BEET &&
+ pol->xfrm_vec[i].mode != XFRM_MODE_IPTFS)
continue;
/* update endpoints */
memcpy(&pol->xfrm_vec[i].id.daddr, &mp->new_daddr,
diff --git a/net/xfrm/xfrm_proc.c b/net/xfrm/xfrm_proc.c
index eeb984be03a7..e851b388995a 100644
--- a/net/xfrm/xfrm_proc.c
+++ b/net/xfrm/xfrm_proc.c
@@ -43,6 +43,9 @@ static const struct snmp_mib xfrm_mib_list[] = {
SNMP_MIB_ITEM("XfrmAcquireError", LINUX_MIB_XFRMACQUIREERROR),
SNMP_MIB_ITEM("XfrmOutStateDirError", LINUX_MIB_XFRMOUTSTATEDIRERROR),
SNMP_MIB_ITEM("XfrmInStateDirError", LINUX_MIB_XFRMINSTATEDIRERROR),
+ SNMP_MIB_ITEM("XfrmNoSkbError", LINUX_MIB_XFRMNOSKBERROR),
+ SNMP_MIB_ITEM("XfrmInIptfsError", LINUX_MIB_XFRMINIPTFSERROR),
+ SNMP_MIB_ITEM("XfrmOutNoQueueSpace", LINUX_MIB_XFRMOUTNOQSPACE),
SNMP_MIB_SENTINEL
};
diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c
index e9ea5c5dd183..dad36ff30510 100644
--- a/net/xfrm/xfrm_state.c
+++ b/net/xfrm/xfrm_state.c
@@ -465,6 +465,11 @@ static const struct xfrm_mode xfrm4_mode_map[XFRM_MODE_MAX] = {
.flags = XFRM_MODE_FLAG_TUNNEL,
.family = AF_INET,
},
+ [XFRM_MODE_IPTFS] = {
+ .encap = XFRM_MODE_IPTFS,
+ .flags = XFRM_MODE_FLAG_TUNNEL,
+ .family = AF_INET,
+ },
};
static const struct xfrm_mode xfrm6_mode_map[XFRM_MODE_MAX] = {
@@ -486,6 +491,11 @@ static const struct xfrm_mode xfrm6_mode_map[XFRM_MODE_MAX] = {
.flags = XFRM_MODE_FLAG_TUNNEL,
.family = AF_INET6,
},
+ [XFRM_MODE_IPTFS] = {
+ .encap = XFRM_MODE_IPTFS,
+ .flags = XFRM_MODE_FLAG_TUNNEL,
+ .family = AF_INET6,
+ },
};
static const struct xfrm_mode *xfrm_get_mode(unsigned int encap, int family)
@@ -2111,6 +2121,7 @@ static int __xfrm6_state_sort_cmp(const void *p)
#endif
case XFRM_MODE_TUNNEL:
case XFRM_MODE_BEET:
+ case XFRM_MODE_IPTFS:
return 4;
}
return 5;
@@ -2137,6 +2148,7 @@ static int __xfrm6_tmpl_sort_cmp(const void *p)
#endif
case XFRM_MODE_TUNNEL:
case XFRM_MODE_BEET:
+ case XFRM_MODE_IPTFS:
return 3;
}
return 4;
diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c
index dfd52637abed..177dc0b23002 100644
--- a/net/xfrm/xfrm_user.c
+++ b/net/xfrm/xfrm_user.c
@@ -379,6 +379,14 @@ static int verify_newsa_info(struct xfrm_usersa_info *p,
case XFRM_MODE_ROUTEOPTIMIZATION:
case XFRM_MODE_BEET:
break;
+ case XFRM_MODE_IPTFS:
+ if (sa_dir == 0) {
+ NL_SET_ERR_MSG(
+ extack,
+ "IP-TFS mode requires in or out direction attribute");
+ goto out;
+ }
+ break;
default:
NL_SET_ERR_MSG(extack, "Unsupported mode");
@@ -1973,6 +1981,8 @@ static int validate_tmpl(int nr, struct xfrm_user_tmpl *ut, u16 family,
return -EINVAL;
}
break;
+ case XFRM_MODE_IPTFS:
+ break;
default:
if (ut[i].family != prev_family) {
NL_SET_ERR_MSG(extack, "Mode in template doesn't support a family change");
--
2.45.1
* [PATCH ipsec-next v2 08/17] xfrm: iptfs: add new iptfs xfrm mode impl
2024-05-20 21:42 [PATCH ipsec-next v1 0/8] Add IP-TFS mode to xfrm Christian Hopps
` (6 preceding siblings ...)
2024-05-20 21:42 ` [PATCH ipsec-next v2 07/17] xfrm: add generic iptfs defines and functionality Christian Hopps
@ 2024-05-20 21:42 ` Christian Hopps
2024-06-03 17:14 ` [devel-ipsec] " Antony Antony
2024-05-20 21:42 ` [PATCH ipsec-next v2 09/17] xfrm: iptfs: add user packet (tunnel ingress) handling Christian Hopps
` (9 subsequent siblings)
17 siblings, 1 reply; 34+ messages in thread
From: Christian Hopps @ 2024-05-20 21:42 UTC (permalink / raw)
To: devel; +Cc: Steffen Klassert, netdev, Christian Hopps, Christian Hopps
From: Christian Hopps <chopps@labn.net>
Add a new xfrm mode implementing AggFrag/IP-TFS from RFC9347.
This utilizes the new xfrm_mode_cbs to implement demand-driven IP-TFS
functionality. This functionality can be used to increase bandwidth
utilization through small packet aggregation, as well as help solve PMTU
issues through its efficient use of fragmentation.
Link: https://www.rfc-editor.org/rfc/rfc9347.txt
Multiple commits follow to build the functionality into xfrm_iptfs.c
Signed-off-by: Christian Hopps <chopps@labn.net>
---
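Illustration only, not part of the patch: a userspace sketch of the
arithmetic done by iptfs_get_inner_mtu() below. The names align_up() and
inner_mtu_model() and the sample header, ICV and block sizes are
assumptions chosen for the example; none of them come from the patch.

```c
#include <stdint.h>

/* Round v up to a multiple of a (a must be a power of two), like ALIGN(). */
static uint32_t align_up(uint32_t v, uint32_t a)
{
	return (v + a - 1) & ~(a - 1);
}

/*
 * Hypothetical userspace model of iptfs_get_inner_mtu().
 *
 * header_len stands in for x->props.header_len, authsize for
 * crypto_aead_authsize() and blocksize for crypto_aead_blocksize().
 * The trailing "- 2" is presumably the 2-byte ESP trailer
 * (pad length and next header).
 */
uint32_t inner_mtu_model(uint32_t outer_mtu, uint32_t header_len,
			 uint32_t authsize, uint32_t blocksize)
{
	uint32_t blksize = align_up(blocksize, 4);

	return ((outer_mtu - header_len - authsize) & ~(blksize - 1)) - 2;
}
```

With an assumed 1500-byte outer MTU, 40 bytes of headers, a 16-byte ICV
and a cipher block size of 1, the model yields an inner MTU of 1442.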
net/xfrm/Makefile | 1 +
net/xfrm/xfrm_iptfs.c | 225 ++++++++++++++++++++++++++++++++++++++++++
2 files changed, 226 insertions(+)
create mode 100644 net/xfrm/xfrm_iptfs.c
diff --git a/net/xfrm/Makefile b/net/xfrm/Makefile
index 547cec77ba03..cd6520d4d777 100644
--- a/net/xfrm/Makefile
+++ b/net/xfrm/Makefile
@@ -20,5 +20,6 @@ obj-$(CONFIG_XFRM_USER) += xfrm_user.o
obj-$(CONFIG_XFRM_USER_COMPAT) += xfrm_compat.o
obj-$(CONFIG_XFRM_IPCOMP) += xfrm_ipcomp.o
obj-$(CONFIG_XFRM_INTERFACE) += xfrm_interface.o
+obj-$(CONFIG_XFRM_IPTFS) += xfrm_iptfs.o
obj-$(CONFIG_XFRM_ESPINTCP) += espintcp.o
obj-$(CONFIG_DEBUG_INFO_BTF) += xfrm_state_bpf.o
diff --git a/net/xfrm/xfrm_iptfs.c b/net/xfrm/xfrm_iptfs.c
new file mode 100644
index 000000000000..e7b5546e1f6a
--- /dev/null
+++ b/net/xfrm/xfrm_iptfs.c
@@ -0,0 +1,225 @@
+// SPDX-License-Identifier: GPL-2.0
+/* xfrm_iptfs: IPTFS encapsulation support
+ *
+ * April 21 2022, Christian Hopps <chopps@labn.net>
+ *
+ * Copyright (c) 2022, LabN Consulting, L.L.C.
+ *
+ */
+
+#include <linux/kernel.h>
+#include <linux/icmpv6.h>
+#include <net/gro.h>
+#include <net/icmp.h>
+#include <net/ip6_route.h>
+#include <net/inet_ecn.h>
+#include <net/xfrm.h>
+
+#include <crypto/aead.h>
+
+#include "xfrm_inout.h"
+
+struct xfrm_iptfs_config {
+ u32 pkt_size; /* outer_packet_size or 0 */
+};
+
+struct xfrm_iptfs_data {
+ struct xfrm_iptfs_config cfg;
+
+ /* Ingress User Input */
+ struct xfrm_state *x; /* owning state */
+ u32 payload_mtu; /* max payload size */
+};
+
+/* ========================== */
+/* State Management Functions */
+/* ========================== */
+
+/**
+ * iptfs_get_inner_mtu() - return inner MTU with no fragmentation.
+ * @x: xfrm state.
+ * @outer_mtu: the outer mtu
+ */
+static u32 iptfs_get_inner_mtu(struct xfrm_state *x, int outer_mtu)
+{
+ struct crypto_aead *aead;
+ u32 blksize;
+
+ aead = x->data;
+ blksize = ALIGN(crypto_aead_blocksize(aead), 4);
+ return ((outer_mtu - x->props.header_len - crypto_aead_authsize(aead)) &
+ ~(blksize - 1)) - 2;
+}
+
+/**
+ * iptfs_user_init() - initialize the SA with IPTFS options from netlink.
+ * @net: the net data
+ * @x: xfrm state
+ * @attrs: netlink attributes
+ * @extack: extack return data
+ */
+static int iptfs_user_init(struct net *net, struct xfrm_state *x,
+ struct nlattr **attrs,
+ struct netlink_ext_ack *extack)
+{
+ struct xfrm_iptfs_data *xtfs = x->mode_data;
+ struct xfrm_iptfs_config *xc;
+
+ xc = &xtfs->cfg;
+
+ if (attrs[XFRMA_IPTFS_PKT_SIZE]) {
+ xc->pkt_size = nla_get_u32(attrs[XFRMA_IPTFS_PKT_SIZE]);
+ if (!xc->pkt_size) {
+ xtfs->payload_mtu = 0;
+ } else if (xc->pkt_size > x->props.header_len) {
+ xtfs->payload_mtu = xc->pkt_size - x->props.header_len;
+ } else {
+ NL_SET_ERR_MSG(extack,
+ "Packet size must be 0 or greater than IPTFS/ESP header length");
+ return -EINVAL;
+ }
+ }
+ return 0;
+}
+
+static unsigned int iptfs_sa_len(const struct xfrm_state *x)
+{
+ struct xfrm_iptfs_data *xtfs = x->mode_data;
+ struct xfrm_iptfs_config *xc = &xtfs->cfg;
+ unsigned int l = 0;
+
+ l += nla_total_size(0);
+ l += nla_total_size(sizeof(u16));
+ l += nla_total_size(sizeof(xc->pkt_size));
+ l += nla_total_size(sizeof(u32));
+ l += nla_total_size(sizeof(u32)); /* drop time usec */
+ l += nla_total_size(sizeof(u32)); /* init delay usec */
+
+ return l;
+}
+
+static int iptfs_copy_to_user(struct xfrm_state *x, struct sk_buff *skb)
+{
+ struct xfrm_iptfs_data *xtfs = x->mode_data;
+ struct xfrm_iptfs_config *xc = &xtfs->cfg;
+ int ret;
+
+ ret = nla_put_flag(skb, XFRMA_IPTFS_DONT_FRAG);
+ if (ret)
+ return ret;
+ ret = nla_put_u16(skb, XFRMA_IPTFS_REORDER_WINDOW, 0);
+ if (ret)
+ return ret;
+ ret = nla_put_u32(skb, XFRMA_IPTFS_PKT_SIZE, xc->pkt_size);
+ if (ret)
+ return ret;
+ ret = nla_put_u32(skb, XFRMA_IPTFS_MAX_QSIZE, 0);
+ if (ret)
+ return ret;
+
+ ret = nla_put_u32(skb, XFRMA_IPTFS_DROP_TIME, 0);
+ if (ret)
+ return ret;
+
+ ret = nla_put_u32(skb, XFRMA_IPTFS_INIT_DELAY, 0);
+
+ return ret;
+}
+
+static int __iptfs_init_state(struct xfrm_state *x,
+ struct xfrm_iptfs_data *xtfs)
+{
+ /* Modify type (esp) adjustment values */
+
+ if (x->props.family == AF_INET)
+ x->props.header_len += sizeof(struct iphdr) + sizeof(struct ip_iptfs_hdr);
+ else if (x->props.family == AF_INET6)
+ x->props.header_len += sizeof(struct ipv6hdr) + sizeof(struct ip_iptfs_hdr);
+ x->props.enc_hdr_len = sizeof(struct ip_iptfs_hdr);
+
+ /* Always have a module reference if x->mode_data is set */
+ if (!try_module_get(x->mode_cbs->owner))
+ return -EINVAL;
+
+ x->mode_data = xtfs;
+ xtfs->x = x;
+
+ return 0;
+}
+
+static int iptfs_clone(struct xfrm_state *x, struct xfrm_state *orig)
+{
+ struct xfrm_iptfs_data *xtfs;
+ int err;
+
+ xtfs = kmemdup(orig->mode_data, sizeof(*xtfs), GFP_KERNEL);
+ if (!xtfs)
+ return -ENOMEM;
+
+ err = __iptfs_init_state(x, xtfs);
+ if (err)
+ kfree(xtfs); /* not yet owned by x->mode_data, so free it here */
+
+ return err;
+}
+
+static int iptfs_create_state(struct xfrm_state *x)
+{
+ struct xfrm_iptfs_data *xtfs;
+ int err;
+
+ xtfs = kzalloc(sizeof(*xtfs), GFP_KERNEL);
+ if (!xtfs)
+ return -ENOMEM;
+
+ err = __iptfs_init_state(x, xtfs);
+ if (err)
+ kfree(xtfs); /* not yet owned by x->mode_data, so free it here */
+
+ return err;
+}
+
+static void iptfs_delete_state(struct xfrm_state *x)
+{
+ struct xfrm_iptfs_data *xtfs = x->mode_data;
+
+ if (!xtfs)
+ return;
+
+ kfree_sensitive(xtfs);
+
+ module_put(x->mode_cbs->owner);
+}
+
+static const struct xfrm_mode_cbs iptfs_mode_cbs = {
+ .owner = THIS_MODULE,
+ .create_state = iptfs_create_state,
+ .delete_state = iptfs_delete_state,
+ .user_init = iptfs_user_init,
+ .copy_to_user = iptfs_copy_to_user,
+ .sa_len = iptfs_sa_len,
+ .clone = iptfs_clone,
+ .get_inner_mtu = iptfs_get_inner_mtu,
+};
+
+static int __init xfrm_iptfs_init(void)
+{
+ int err;
+
+ pr_info("xfrm_iptfs: IPsec IP-TFS tunnel mode module\n");
+
+ err = xfrm_register_mode_cbs(XFRM_MODE_IPTFS, &iptfs_mode_cbs);
+ if (err < 0)
+ pr_info("%s: can't register IP-TFS\n", __func__);
+
+ return err;
+}
+
+static void __exit xfrm_iptfs_fini(void)
+{
+ xfrm_unregister_mode_cbs(XFRM_MODE_IPTFS);
+}
+
+module_init(xfrm_iptfs_init);
+module_exit(xfrm_iptfs_fini);
+MODULE_LICENSE("GPL");
--
2.45.1
* Re: [devel-ipsec] [PATCH ipsec-next v2 08/17] xfrm: iptfs: add new iptfs xfrm mode impl
2024-05-20 21:42 ` [PATCH ipsec-next v2 08/17] xfrm: iptfs: add new iptfs xfrm mode impl Christian Hopps
@ 2024-06-03 17:14 ` Antony Antony
2024-06-07 5:49 ` Christian Hopps
0 siblings, 1 reply; 34+ messages in thread
From: Antony Antony @ 2024-06-03 17:14 UTC (permalink / raw)
To: Christian Hopps; +Cc: devel, Steffen Klassert, netdev, Christian Hopps
Hi Chris,
On Mon, May 20, 2024 at 05:42:46PM -0400, Christian Hopps via Devel wrote:
> From: Christian Hopps <chopps@labn.net>
>
> Add a new xfrm mode implementing AggFrag/IP-TFS from RFC9347.
>
> This utilizes the new xfrm_mode_cbs to implement demand-driven IP-TFS
> functionality. This functionality can be used to increase bandwidth
> utilization through small packet aggregation, as well as help solve PMTU
> issues through its efficient use of fragmentation.
>
> Link: https://www.rfc-editor.org/rfc/rfc9347.txt
>
> Multiple commits follow to build the functionality into xfrm_iptfs.c
>
> Signed-off-by: Christian Hopps <chopps@labn.net>
> ---
> net/xfrm/Makefile | 1 +
> net/xfrm/xfrm_iptfs.c | 225 ++++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 226 insertions(+)
> create mode 100644 net/xfrm/xfrm_iptfs.c
>
> diff --git a/net/xfrm/Makefile b/net/xfrm/Makefile
> index 547cec77ba03..cd6520d4d777 100644
> --- a/net/xfrm/Makefile
> +++ b/net/xfrm/Makefile
> @@ -20,5 +20,6 @@ obj-$(CONFIG_XFRM_USER) += xfrm_user.o
> obj-$(CONFIG_XFRM_USER_COMPAT) += xfrm_compat.o
> obj-$(CONFIG_XFRM_IPCOMP) += xfrm_ipcomp.o
> obj-$(CONFIG_XFRM_INTERFACE) += xfrm_interface.o
> +obj-$(CONFIG_XFRM_IPTFS) += xfrm_iptfs.o
> obj-$(CONFIG_XFRM_ESPINTCP) += espintcp.o
> obj-$(CONFIG_DEBUG_INFO_BTF) += xfrm_state_bpf.o
> diff --git a/net/xfrm/xfrm_iptfs.c b/net/xfrm/xfrm_iptfs.c
> new file mode 100644
> index 000000000000..e7b5546e1f6a
> --- /dev/null
> +++ b/net/xfrm/xfrm_iptfs.c
> @@ -0,0 +1,225 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/* xfrm_iptfs: IPTFS encapsulation support
> + *
> + * April 21 2022, Christian Hopps <chopps@labn.net>
> + *
> + * Copyright (c) 2022, LabN Consulting, L.L.C.
> + *
> + */
> +
> +#include <linux/kernel.h>
> +#include <linux/icmpv6.h>
> +#include <net/gro.h>
> +#include <net/icmp.h>
> +#include <net/ip6_route.h>
> +#include <net/inet_ecn.h>
> +#include <net/xfrm.h>
> +
> +#include <crypto/aead.h>
> +
> +#include "xfrm_inout.h"
> +
> +struct xfrm_iptfs_config {
> + u32 pkt_size; /* outer_packet_size or 0 */
> +};
> +
> +struct xfrm_iptfs_data {
> + struct xfrm_iptfs_config cfg;
> +
> + /* Ingress User Input */
> + struct xfrm_state *x; /* owning state */
> + u32 payload_mtu; /* max payload size */
> +};
> +
> +/* ========================== */
> +/* State Management Functions */
> +/* ========================== */
> +
> +/**
> + * iptfs_get_inner_mtu() - return inner MTU with no fragmentation.
> + * @x: xfrm state.
> + * @outer_mtu: the outer mtu
> + */
> +static u32 iptfs_get_inner_mtu(struct xfrm_state *x, int outer_mtu)
> +{
> + struct crypto_aead *aead;
> + u32 blksize;
> +
> + aead = x->data;
> + blksize = ALIGN(crypto_aead_blocksize(aead), 4);
> + return ((outer_mtu - x->props.header_len - crypto_aead_authsize(aead)) &
> + ~(blksize - 1)) - 2;
> +}
> +
> +/**
> + * iptfs_user_init() - initialize the SA with IPTFS options from netlink.
> + * @net: the net data
> + * @x: xfrm state
> + * @attrs: netlink attributes
> + * @extack: extack return data
> + */
> +static int iptfs_user_init(struct net *net, struct xfrm_state *x,
> + struct nlattr **attrs,
> + struct netlink_ext_ack *extack)
> +{
> + struct xfrm_iptfs_data *xtfs = x->mode_data;
> + struct xfrm_iptfs_config *xc;
> +
> + xc = &xtfs->cfg;
> +
> + if (attrs[XFRMA_IPTFS_PKT_SIZE]) {
> + xc->pkt_size = nla_get_u32(attrs[XFRMA_IPTFS_PKT_SIZE]);
> + if (!xc->pkt_size) {
> + xtfs->payload_mtu = 0;
> + } else if (xc->pkt_size > x->props.header_len) {
> + xtfs->payload_mtu = xc->pkt_size - x->props.header_len;
> + } else {
> + NL_SET_ERR_MSG(extack,
> + "Packet size must be 0 or greater than IPTFS/ESP header length");
> + return -EINVAL;
> + }
> + }
> + return 0;
> +}
> +
> +static unsigned int iptfs_sa_len(const struct xfrm_state *x)
> +{
> + struct xfrm_iptfs_data *xtfs = x->mode_data;
> + struct xfrm_iptfs_config *xc = &xtfs->cfg;
> + unsigned int l = 0;
> +
> + l += nla_total_size(0);
> + l += nla_total_size(sizeof(u16));
> + l += nla_total_size(sizeof(xc->pkt_size));
> + l += nla_total_size(sizeof(u32));
> + l += nla_total_size(sizeof(u32)); /* drop time usec */
> + l += nla_total_size(sizeof(u32)); /* init delay usec */
> +
> + return l;
> +}
> +
> +static int iptfs_copy_to_user(struct xfrm_state *x, struct sk_buff *skb)
> +{
> + struct xfrm_iptfs_data *xtfs = x->mode_data;
> + struct xfrm_iptfs_config *xc = &xtfs->cfg;
> + int ret;
> +
> + ret = nla_put_flag(skb, XFRMA_IPTFS_DONT_FRAG);
> + if (ret)
> + return ret;
> + ret = nla_put_u16(skb, XFRMA_IPTFS_REORDER_WINDOW, 0);
> + if (ret)
> + return ret;
> + ret = nla_put_u32(skb, XFRMA_IPTFS_PKT_SIZE, xc->pkt_size);
> + if (ret)
> + return ret;
> + ret = nla_put_u32(skb, XFRMA_IPTFS_MAX_QSIZE, 0);
> + if (ret)
> + return ret;
> +
> + ret = nla_put_u32(skb, XFRMA_IPTFS_DROP_TIME, 0);
> + if (ret)
> + return ret;
> +
> + ret = nla_put_u32(skb, XFRMA_IPTFS_INIT_DELAY, 0);
Why copy all attributes? Only copy the ones relevant to the SA direction.
Also adjust in iptfs_sa_len().
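For illustration, a userspace sketch of direction-dependent sizing. The
split of attributes between directions, the enum sa_dir and all helper
names are assumptions made for the example, not a statement of what the
final code has to look like. nla_total_size_model() mirrors the netlink
rule of a 4-byte attribute header with 4-byte alignment.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NLA_HDRLEN_MODEL 4u /* size of a netlink attribute header */

/* Header plus payload, padded to 4 bytes, as nla_total_size() does. */
static size_t nla_total_size_model(size_t payload)
{
	return (NLA_HDRLEN_MODEL + payload + 3u) & ~(size_t)3u;
}

/* Hypothetical stand-in for the SA direction attribute. */
enum sa_dir { SA_DIR_IN, SA_DIR_OUT };

/*
 * Hypothetical model of a direction-aware iptfs_sa_len().
 * Assumption: the receive side only needs the reorder window and drop
 * time, the send side needs everything else.
 */
size_t iptfs_sa_len_model(enum sa_dir dir, bool dont_frag)
{
	size_t l = 0;

	if (dir == SA_DIR_IN) {
		l += nla_total_size_model(sizeof(uint16_t)); /* reorder window */
		l += nla_total_size_model(sizeof(uint32_t)); /* drop time usec */
	} else {
		if (dont_frag)
			l += nla_total_size_model(0);        /* dont-frag flag */
		l += nla_total_size_model(sizeof(uint32_t)); /* pkt size */
		l += nla_total_size_model(sizeof(uint32_t)); /* max queue size */
		l += nla_total_size_model(sizeof(uint32_t)); /* init delay usec */
	}

	return l;
}
```

Under these assumptions an input SA reserves 16 bytes and an output SA
24 or 28 bytes, compared with 44 bytes when every attribute is always
emitted. The copy_to_user side would need the same per-direction split
so that the reserved size and the emitted attributes stay in step.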
> +
> + return ret;
> +}
> +
> +static int __iptfs_init_state(struct xfrm_state *x,
> + struct xfrm_iptfs_data *xtfs)
> +{
> + /* Modify type (esp) adjustment values */
> +
> + if (x->props.family == AF_INET)
> + x->props.header_len += sizeof(struct iphdr) + sizeof(struct ip_iptfs_hdr);
> + else if (x->props.family == AF_INET6)
> + x->props.header_len += sizeof(struct ipv6hdr) + sizeof(struct ip_iptfs_hdr);
> + x->props.enc_hdr_len = sizeof(struct ip_iptfs_hdr);
> +
> + /* Always have a module reference if x->mode_data is set */
> + if (!try_module_get(x->mode_cbs->owner))
> + return -EINVAL;
> +
> + x->mode_data = xtfs;
> + xtfs->x = x;
> +
> + return 0;
> +}
> +
> +static int iptfs_clone(struct xfrm_state *x, struct xfrm_state *orig)
> +{
> + struct xfrm_iptfs_data *xtfs;
> + int err;
> +
> + xtfs = kmemdup(orig->mode_data, sizeof(*xtfs), GFP_KERNEL);
> + if (!xtfs)
> + return -ENOMEM;
> +
> + err = __iptfs_init_state(x, xtfs);
> + if (err)
> + return err;
> +
> + return 0;
> +}
> +
> +static int iptfs_create_state(struct xfrm_state *x)
> +{
> + struct xfrm_iptfs_data *xtfs;
> + int err;
> +
> + xtfs = kzalloc(sizeof(*xtfs), GFP_KERNEL);
> + if (!xtfs)
> + return -ENOMEM;
> +
> + err = __iptfs_init_state(x, xtfs);
> + if (err)
> + return err;
> +
> + return 0;
> +}
> +
> +static void iptfs_delete_state(struct xfrm_state *x)
> +{
> + struct xfrm_iptfs_data *xtfs = x->mode_data;
> +
> + if (!xtfs)
> + return;
> +
> + kfree_sensitive(xtfs);
> +
> + module_put(x->mode_cbs->owner);
> +}
> +
> +static const struct xfrm_mode_cbs iptfs_mode_cbs = {
> + .owner = THIS_MODULE,
> + .create_state = iptfs_create_state,
> + .delete_state = iptfs_delete_state,
> + .user_init = iptfs_user_init,
> + .copy_to_user = iptfs_copy_to_user,
> + .sa_len = iptfs_sa_len,
> + .clone = iptfs_clone,
> + .get_inner_mtu = iptfs_get_inner_mtu,
> +};
> +
> +static int __init xfrm_iptfs_init(void)
> +{
> + int err;
> +
> + pr_info("xfrm_iptfs: IPsec IP-TFS tunnel mode module\n");
> +
> + err = xfrm_register_mode_cbs(XFRM_MODE_IPTFS, &iptfs_mode_cbs);
> + if (err < 0)
> + pr_info("%s: can't register IP-TFS\n", __func__);
> +
> + return err;
> +}
> +
> +static void __exit xfrm_iptfs_fini(void)
> +{
> + xfrm_unregister_mode_cbs(XFRM_MODE_IPTFS);
> +}
> +
> +module_init(xfrm_iptfs_init);
> +module_exit(xfrm_iptfs_fini);
> +MODULE_LICENSE("GPL");
> --
> 2.45.1
>
* Re: [devel-ipsec] [PATCH ipsec-next v2 08/17] xfrm: iptfs: add new iptfs xfrm mode impl
2024-06-03 17:14 ` [devel-ipsec] " Antony Antony
@ 2024-06-07 5:49 ` Christian Hopps
0 siblings, 0 replies; 34+ messages in thread
From: Christian Hopps @ 2024-06-07 5:49 UTC (permalink / raw)
To: Antony Antony
Cc: Christian Hopps, devel, Steffen Klassert, netdev, Christian Hopps
Antony Antony <antony@phenome.org> writes:
> Hi Chris,
>
> On Mon, May 20, 2024 at 05:42:46PM -0400, Christian Hopps via Devel wrote:
>> From: Christian Hopps <chopps@labn.net>
>> +static unsigned int iptfs_sa_len(const struct xfrm_state *x)
>> +{
>> + struct xfrm_iptfs_data *xtfs = x->mode_data;
>> + struct xfrm_iptfs_config *xc = &xtfs->cfg;
>> + unsigned int l = 0;
>> +
>> + l += nla_total_size(0);
>> + l += nla_total_size(sizeof(u16));
>> + l += nla_total_size(sizeof(xc->pkt_size));
>> + l += nla_total_size(sizeof(u32));
>> + l += nla_total_size(sizeof(u32)); /* drop time usec */
>> + l += nla_total_size(sizeof(u32)); /* init delay usec */
>> +
>> + return l;
>> +}
>> +
>> +static int iptfs_copy_to_user(struct xfrm_state *x, struct sk_buff *skb)
>> +{
>> + struct xfrm_iptfs_data *xtfs = x->mode_data;
>> + struct xfrm_iptfs_config *xc = &xtfs->cfg;
>> + int ret;
>> +
>> + ret = nla_put_flag(skb, XFRMA_IPTFS_DONT_FRAG);
>> + if (ret)
>> + return ret;
>> + ret = nla_put_u16(skb, XFRMA_IPTFS_REORDER_WINDOW, 0);
>> + if (ret)
>> + return ret;
>> + ret = nla_put_u32(skb, XFRMA_IPTFS_PKT_SIZE, xc->pkt_size);
>> + if (ret)
>> + return ret;
>> + ret = nla_put_u32(skb, XFRMA_IPTFS_MAX_QSIZE, 0);
>> + if (ret)
>> + return ret;
>> +
>> + ret = nla_put_u32(skb, XFRMA_IPTFS_DROP_TIME, 0);
>> + if (ret)
>> + return ret;
>> +
>> + ret = nla_put_u32(skb, XFRMA_IPTFS_INIT_DELAY, 0);
>
> Why copy all attributes? Only copy the ones relevant to the SA direction.
> Also adjust in iptfs_sa_len().
Updated in new v3 patchset.
Thanks,
Chris.
* [PATCH ipsec-next v2 09/17] xfrm: iptfs: add user packet (tunnel ingress) handling
2024-05-20 21:42 [PATCH ipsec-next v1 0/8] Add IP-TFS mode to xfrm Christian Hopps
` (7 preceding siblings ...)
2024-05-20 21:42 ` [PATCH ipsec-next v2 08/17] xfrm: iptfs: add new iptfs xfrm mode impl Christian Hopps
@ 2024-05-20 21:42 ` Christian Hopps
2024-05-20 21:42 ` [PATCH ipsec-next v2 10/17] xfrm: iptfs: share page fragments of inner packets Christian Hopps
` (8 subsequent siblings)
17 siblings, 0 replies; 34+ messages in thread
From: Christian Hopps @ 2024-05-20 21:42 UTC (permalink / raw)
To: devel; +Cc: Steffen Klassert, netdev, Christian Hopps, Christian Hopps
From: Christian Hopps <chopps@labn.net>
Add tunnel packet output functionality. This code handles
the ingress to the tunnel.
Signed-off-by: Christian Hopps <chopps@labn.net>
---
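Illustration only, not part of the patch: a userspace sketch of the
admission and ECN-marking decision made by iptfs_enqueue() below. The
struct queue_model, the function enqueue_model() and the threshold
values used with it are assumptions for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the queue fields of struct xfrm_iptfs_data. */
struct queue_model {
	uint32_t queue_size;     /* octets currently queued */
	uint32_t max_queue_size; /* hard limit, octets */
	uint32_t ecn_queue_size; /* octets above which packets get CE marked */
};

/*
 * Model of the enqueue decision: reject the packet if it would push the
 * queue over the hard limit, otherwise accept it and report whether it
 * would be ECN CE marked. Returns true if the packet was accepted.
 */
bool enqueue_model(struct queue_model *q, uint32_t pkt_len, bool *ce_marked)
{
	uint64_t newsz = (uint64_t)q->queue_size + pkt_len;

	*ce_marked = false;

	if (newsz > q->max_queue_size)
		return false;

	if (newsz > q->ecn_queue_size)
		*ce_marked = true;

	q->queue_size += pkt_len;
	return true;
}
```

The size check uses a 64-bit sum so that a large packet cannot wrap the
32-bit counter and slip under the limit, which matches the u64 newsz in
the patch.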
net/xfrm/xfrm_iptfs.c | 519 +++++++++++++++++++++++++++++++++++++++++-
1 file changed, 516 insertions(+), 3 deletions(-)
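Illustration only, not part of the patch: a userspace sketch of how the
dequeue loop in iptfs_output_queued() packs inner packets into outer
packets when no fragmentation is available. The function name
iptfs_model_aggregate() and the packet lengths fed to it are assumptions
for the example; mtu plays the role of the value returned by
iptfs_get_cur_pmtu().

```c
#include <stddef.h>

/*
 * Hypothetical model of the aggregation loop.
 *
 * lens[] holds inner packet lengths in queue order. A packet larger than
 * mtu is dropped. Otherwise it starts a new outer packet, and following
 * packets are appended while they still fit in the remaining space.
 * Packing stops at the first packet that does not fit, so queue order is
 * preserved.
 *
 * Returns the number of outer packets produced and stores the number of
 * dropped inner packets in *dropped.
 */
size_t iptfs_model_aggregate(const unsigned int *lens, size_t n,
			     unsigned int mtu, size_t *dropped)
{
	size_t outer = 0;
	size_t i = 0;

	*dropped = 0;

	while (i < n) {
		unsigned int remaining;

		if (lens[i] > mtu) {
			(*dropped)++;
			i++;
			continue;
		}

		/* First inner packet of a new outer packet. */
		remaining = mtu - lens[i];
		i++;
		outer++;

		/* Append followers while they fit. */
		while (i < n && lens[i] <= remaining) {
			remaining -= lens[i];
			i++;
		}
	}

	return outer;
}
```

For an assumed queue of 100, 200, 1200, 1500, 40 and 40 byte packets and
an mtu of 1442, the model produces three outer packets and drops the
1500-byte packet.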
diff --git a/net/xfrm/xfrm_iptfs.c b/net/xfrm/xfrm_iptfs.c
index e7b5546e1f6a..561483fc83f9 100644
--- a/net/xfrm/xfrm_iptfs.c
+++ b/net/xfrm/xfrm_iptfs.c
@@ -19,8 +19,13 @@
#include "xfrm_inout.h"
+#define NSECS_IN_USEC 1000
+
+#define IPTFS_HRTIMER_MODE HRTIMER_MODE_REL_SOFT
+
struct xfrm_iptfs_config {
u32 pkt_size; /* outer_packet_size or 0 */
+ u32 max_queue_size; /* octets */
};
struct xfrm_iptfs_data {
@@ -28,9 +33,495 @@ struct xfrm_iptfs_data {
/* Ingress User Input */
struct xfrm_state *x; /* owning state */
+ struct sk_buff_head queue; /* output queue */
+ u32 queue_size; /* octets */
+ u32 ecn_queue_size; /* octets above which ECN mark */
+ u64 init_delay_ns; /* nanoseconds */
+ struct hrtimer iptfs_timer; /* output timer */
+ time64_t iptfs_settime; /* time timer was set */
u32 payload_mtu; /* max payload size */
};
+static u32 iptfs_get_inner_mtu(struct xfrm_state *x, int outer_mtu);
+static enum hrtimer_restart iptfs_delay_timer(struct hrtimer *me);
+
+/* ================================= */
+/* IPTFS Sending (ingress) Functions */
+/* ================================= */
+
+/* ------------------------- */
+/* Enqueue to send functions */
+/* ------------------------- */
+
+/**
+ * iptfs_enqueue() - enqueue packet if ok to send.
+ * @xtfs: xtfs state
+ * @skb: the packet
+ *
+ * Return: true if packet enqueued.
+ */
+static bool iptfs_enqueue(struct xfrm_iptfs_data *xtfs, struct sk_buff *skb)
+{
+ u64 newsz = xtfs->queue_size + skb->len;
+ struct iphdr *iph;
+
+ assert_spin_locked(&xtfs->x->lock);
+
+ if (newsz > xtfs->cfg.max_queue_size)
+ return false;
+
+ /* Set ECN CE if we are above our ECN queue threshold */
+ if (newsz > xtfs->ecn_queue_size) {
+ iph = ip_hdr(skb);
+ if (iph->version == 4)
+ IP_ECN_set_ce(iph);
+ else if (iph->version == 6)
+ IP6_ECN_set_ce(skb, ipv6_hdr(skb));
+ }
+
+ __skb_queue_tail(&xtfs->queue, skb);
+ xtfs->queue_size += skb->len;
+ return true;
+}
+
+static int iptfs_get_cur_pmtu(struct xfrm_state *x,
+ struct xfrm_iptfs_data *xtfs, struct sk_buff *skb)
+{
+ struct xfrm_dst *xdst = (struct xfrm_dst *)skb_dst(skb);
+ u32 payload_mtu = xtfs->payload_mtu;
+ u32 pmtu = iptfs_get_inner_mtu(x, xdst->child_mtu_cached);
+
+ if (payload_mtu && payload_mtu < pmtu)
+ pmtu = payload_mtu;
+
+ return pmtu;
+}
+
+static int iptfs_is_too_big(struct sock *sk, struct sk_buff *skb, u32 pmtu)
+{
+ if (skb->len <= pmtu)
+ return 0;
+
+ /* We only send ICMP too big if the user has configured us as
+ * dont-fragment.
+ */
+ XFRM_INC_STATS(dev_net(skb->dev), LINUX_MIB_XFRMOUTERROR);
+
+ if (sk) {
+ xfrm_local_error(skb, pmtu);
+ } else if (ip_hdr(skb)->version == 4) {
+ icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED,
+ htonl(pmtu));
+ } else {
+ WARN_ON_ONCE(ip_hdr(skb)->version != 6);
+ icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, pmtu);
+ }
+ return 1;
+}
+
+/* IPv4/IPv6 packet ingress to IPTFS tunnel, arrange to send in IPTFS payload
+ * (i.e., aggregating or fragmenting as appropriate).
+ * This is set in dst->output for an SA.
+ */
+static int iptfs_output_collect(struct net *net, struct sock *sk,
+ struct sk_buff *skb)
+{
+ struct dst_entry *dst = skb_dst(skb);
+ struct xfrm_state *x = dst->xfrm;
+ struct xfrm_iptfs_data *xtfs = x->mode_data;
+ struct sk_buff *segs, *nskb;
+ u32 pmtu = 0;
+ bool ok = true;
+ bool was_gso;
+
+ /* We have hooked into dst_entry->output which means we have skipped the
+ * protocol specific netfilter (see xfrm4_output, xfrm6_output).
+ * When our timer runs we will end up calling xfrm_output directly on
+ * the encapsulated traffic.
+ *
+ * For both cases this is the NF_INET_POST_ROUTING hook which allows
+ * changing the skb->dst entry which then may not be xfrm based anymore
+ * in which case a REROUTED flag is set and dst_output is called.
+ *
+ * For IPv6 we are also skipping fragmentation handling for local
+ * sockets, which may or may not be good depending on our tunnel DF
+ * setting. Normally with fragmentation supported we want to skip this
+ * fragmentation.
+ */
+
+ BUG_ON(!xtfs);
+
+ pmtu = iptfs_get_cur_pmtu(x, xtfs, skb);
+
+ /* Break apart GSO skbs. If the queue is nearing full then we want the
+ * accounting and queuing to be based on the individual packets not on the
+ * aggregate GSO buffer.
+ */
+ was_gso = skb_is_gso(skb);
+ if (!was_gso) {
+ segs = skb;
+ } else {
+ segs = skb_gso_segment(skb, 0);
+ if (IS_ERR_OR_NULL(segs)) {
+ XFRM_INC_STATS(net, LINUX_MIB_XFRMOUTERROR);
+ kfree_skb(skb);
+ return PTR_ERR(segs);
+ }
+ consume_skb(skb);
+ skb = NULL;
+ }
+
+ /* We can be running on multiple cores and from the network softirq or
+ * from user context depending on where the packet is coming from.
+ */
+ spin_lock_bh(&x->lock);
+
+ skb_list_walk_safe(segs, skb, nskb)
+ {
+ skb_mark_not_on_list(skb);
+
+ /* Once we drop due to no queue space we continue to drop the
+ * rest of the packets from that GSO.
+ */
+ if (!ok) {
+nospace:
+ XFRM_INC_STATS(dev_net(skb->dev),
+ LINUX_MIB_XFRMOUTNOQSPACE);
+ kfree_skb_reason(skb, SKB_DROP_REASON_FULL_RING);
+ continue;
+ }
+
+ /* Fragmenting handled in following commits. */
+ if (iptfs_is_too_big(sk, skb, pmtu)) {
+ kfree_skb_reason(skb, SKB_DROP_REASON_PKT_TOO_BIG);
+ continue;
+ }
+
+ /* Enqueue to send in tunnel */
+ ok = iptfs_enqueue(xtfs, skb);
+ if (!ok)
+ goto nospace;
+ }
+
+ /* Start a delay timer if we don't have one yet */
+ if (!hrtimer_is_queued(&xtfs->iptfs_timer)) {
+ hrtimer_start(&xtfs->iptfs_timer, xtfs->init_delay_ns,
+ IPTFS_HRTIMER_MODE);
+ xtfs->iptfs_settime = ktime_get_raw_fast_ns();
+ }
+
+ spin_unlock_bh(&x->lock);
+ return 0;
+}
+
+/* -------------------------- */
+/* Dequeue and send functions */
+/* -------------------------- */
+
+static void iptfs_output_prepare_skb(struct sk_buff *skb, u32 blkoff)
+{
+ struct ip_iptfs_hdr *h;
+ size_t hsz = sizeof(*h);
+
+ /* now reset values to be pointing at the rest of the packets */
+ h = skb_push(skb, hsz);
+ memset(h, 0, hsz);
+ if (blkoff)
+ h->block_offset = htons(blkoff);
+
+ /* network_header currently points at the inner IP packet,
+ * move it to the iptfs header
+ */
+ skb->transport_header = skb->network_header;
+ skb->network_header -= hsz;
+
+ IPCB(skb)->flags |= IPSKB_XFRM_TUNNEL_SIZE;
+}
+
+static struct sk_buff **iptfs_rehome_fraglist(struct sk_buff **nextp,
+ struct sk_buff *child)
+{
+ u32 fllen = 0;
+
+ /* It might be possible to account for a frag list in addition to page
+ * fragment if it's a valid state to be in. The page fragments size
+ * should be kept as data_len so only the frag_list size is removed,
+ * this must be done above as well.
+ */
+ BUG_ON(skb_shinfo(child)->nr_frags);
+ *nextp = skb_shinfo(child)->frag_list;
+ while (*nextp) {
+ fllen += (*nextp)->len;
+ nextp = &(*nextp)->next;
+ }
+ skb_frag_list_init(child);
+ BUG_ON(fllen > child->data_len);
+ child->len -= fllen;
+ child->data_len -= fllen;
+
+ return nextp;
+}
+
+static void iptfs_output_queued(struct xfrm_state *x, struct sk_buff_head *list)
+{
+ struct xfrm_iptfs_data *xtfs = x->mode_data;
+ struct sk_buff *skb, *skb2, **nextp;
+ struct skb_shared_info *shi;
+
+ while ((skb = __skb_dequeue(list))) {
+ u32 mtu = iptfs_get_cur_pmtu(x, xtfs, skb);
+ int remaining;
+
+ /* protocol comes to us cleared sometimes */
+ skb->protocol = x->outer_mode.family == AF_INET ?
+ htons(ETH_P_IP) :
+ htons(ETH_P_IPV6);
+
+ if (skb->len > mtu) {
+ /* We handle this case before enqueueing so we are only
+ * here b/c MTU changed after we enqueued before we
+ * dequeued, just drop these.
+ */
+ XFRM_INC_STATS(dev_net(skb->dev),
+ LINUX_MIB_XFRMOUTERROR);
+
+ kfree_skb_reason(skb, SKB_DROP_REASON_PKT_TOO_BIG);
+ continue;
+ }
+
+ /* If we don't have a cksum in the packet we need to add one
+ * before encapsulation.
+ */
+ if (skb->ip_summed == CHECKSUM_PARTIAL) {
+ if (skb_checksum_help(skb)) {
+ XFRM_INC_STATS(dev_net(skb_dst(skb)->dev),
+ LINUX_MIB_XFRMOUTERROR);
+ kfree_skb(skb);
+ continue;
+ }
+ }
+
+ /* Convert first inner packet into an outer IPTFS packet */
+ iptfs_output_prepare_skb(skb, 0);
+
+ /* The space remaining to send more inner packet data is `mtu` -
+ * (skb->len - sizeof iptfs header). This is b/c the `mtu` value
+ * has the basic IPTFS header len accounted for, and we added
+ * that header to the skb so it is a part of skb->len, thus we
+ * subtract it from the skb length.
+ */
+ remaining = mtu - (skb->len - sizeof(struct ip_iptfs_hdr));
+
+ /* Re-home (un-nest) nested fragment lists. We need to do this
+ * b/c we will simply be appending any following aggregated
+ * inner packets to the frag list.
+ */
+ shi = skb_shinfo(skb);
+ nextp = &shi->frag_list;
+ while (*nextp) {
+ if (skb_has_frag_list(*nextp))
+ nextp = iptfs_rehome_fraglist(&(*nextp)->next,
+ *nextp);
+ else
+ nextp = &(*nextp)->next;
+ }
+
+ /* See if we have enough space to simply append.
+ *
+ * NOTE: Maybe do not append if we will be mis-aligned,
+ * SW-based endpoints will probably have to copy in this
+ * case.
+ */
+ while ((skb2 = skb_peek(list))) {
+ if (skb2->len > remaining)
+ break;
+
+ __skb_unlink(skb2, list);
+
+ /* If we don't have a cksum in the packet we need to add
+ * one before encapsulation.
+ */
+ if (skb2->ip_summed == CHECKSUM_PARTIAL) {
+ if (skb_checksum_help(skb2)) {
+ XFRM_INC_STATS(
+ dev_net(skb_dst(skb2)->dev),
+ LINUX_MIB_XFRMOUTERROR);
+ kfree_skb(skb2);
+ continue;
+ }
+ }
+
+ /* Do accounting */
+ skb->data_len += skb2->len;
+ skb->len += skb2->len;
+ remaining -= skb2->len;
+
+ /* Append to the frag_list */
+ *nextp = skb2;
+ nextp = &skb2->next;
+ BUG_ON(*nextp);
+ if (skb_has_frag_list(skb2))
+ nextp = iptfs_rehome_fraglist(nextp, skb2);
+ skb->truesize += skb2->truesize;
+ }
+
+ xfrm_output(NULL, skb);
+ }
+}
+
+static enum hrtimer_restart iptfs_delay_timer(struct hrtimer *me)
+{
+ struct sk_buff_head list;
+ struct xfrm_iptfs_data *xtfs;
+ struct xfrm_state *x;
+ time64_t settime;
+
+ xtfs = container_of(me, typeof(*xtfs), iptfs_timer);
+ x = xtfs->x;
+
+ /* Process all the queued packets
+ *
+ * softirq execution order: timer > tasklet > hrtimer
+ *
+ * Network rx will have run before us giving one last chance to queue
+ * ingress packets for us to process and transmit.
+ */
+
+ spin_lock(&x->lock);
+ __skb_queue_head_init(&list);
+ skb_queue_splice_init(&xtfs->queue, &list);
+ xtfs->queue_size = 0;
+ settime = xtfs->iptfs_settime;
+ spin_unlock(&x->lock);
+
+ /* After the above unlock, packets can begin queuing again, and the
+ * timer can be set again, from another CPU either in softirq or user
+ * context (not from this one since we are running at softirq level
+ * already).
+ */
+
+ iptfs_output_queued(x, &list);
+
+ return HRTIMER_NORESTART;
+}
+
+/**
+ * iptfs_encap_add_ipv4() - add outer encaps
+ * @x: xfrm state
+ * @skb: the packet
+ *
+ * This was originally taken from xfrm4_tunnel_encap_add. The reason for the
+ * copy is that IP-TFS/AGGFRAG can have different functionality for how to set
+ * the TOS/DSCP bits. Sets the protocol to a different value and doesn't do
+ * anything with inner headers as they aren't pointing into a normal IP
+ * singleton inner packet.
+ */
+static int iptfs_encap_add_ipv4(struct xfrm_state *x, struct sk_buff *skb)
+{
+ struct dst_entry *dst = skb_dst(skb);
+ struct iphdr *top_iph;
+
+ skb_reset_inner_network_header(skb);
+ skb_reset_inner_transport_header(skb);
+
+ skb_set_network_header(skb,
+ -(x->props.header_len - x->props.enc_hdr_len));
+ skb->mac_header =
+ skb->network_header + offsetof(struct iphdr, protocol);
+ skb->transport_header = skb->network_header + sizeof(*top_iph);
+
+ top_iph = ip_hdr(skb);
+ top_iph->ihl = 5;
+ top_iph->version = 4;
+ top_iph->protocol = IPPROTO_AGGFRAG;
+
+ /* As we have 0, fractional, 1 or N inner packets there's no obviously
+ * correct DSCP mapping to inherit. ECN should be cleared per RFC9347
+ * 3.1.
+ */
+ top_iph->tos = 0;
+
+ top_iph->frag_off = htons(IP_DF);
+ top_iph->ttl = ip4_dst_hoplimit(xfrm_dst_child(dst));
+ top_iph->saddr = x->props.saddr.a4;
+ top_iph->daddr = x->id.daddr.a4;
+ ip_select_ident(dev_net(dst->dev), skb, NULL);
+
+ return 0;
+}
+
+/**
+ * iptfs_encap_add_ipv6() - add outer encaps
+ * @x: xfrm state
+ * @skb: the packet
+ *
+ * This was originally taken from xfrm6_tunnel_encap_add. The reason for the
+ * copy is that IP-TFS/AGGFRAG can have different functionality for how to set
+ * the flow label and TOS/DSCP bits. It also sets the protocol to a different
+ * value and doesn't do anything with inner headers as they aren't pointing into
+ * a normal IP singleton inner packet.
+ */
+static int iptfs_encap_add_ipv6(struct xfrm_state *x, struct sk_buff *skb)
+{
+ struct dst_entry *dst = skb_dst(skb);
+ struct ipv6hdr *top_iph;
+ int dsfield;
+
+ skb_reset_inner_network_header(skb);
+ skb_reset_inner_transport_header(skb);
+
+ skb_set_network_header(skb,
+ -x->props.header_len + x->props.enc_hdr_len);
+ skb->mac_header =
+ skb->network_header + offsetof(struct ipv6hdr, nexthdr);
+ skb->transport_header = skb->network_header + sizeof(*top_iph);
+
+ top_iph = ipv6_hdr(skb);
+ top_iph->version = 6;
+ top_iph->priority = 0;
+ memset(top_iph->flow_lbl, 0, sizeof(top_iph->flow_lbl));
+ top_iph->nexthdr = IPPROTO_AGGFRAG;
+
+ /* As we have 0, fractional, 1 or N inner packets there's no obviously
+ * correct DSCP mapping to inherit. ECN should be cleared per RFC9347
+ * 3.1.
+ */
+ dsfield = 0;
+ ipv6_change_dsfield(top_iph, 0, dsfield);
+
+ top_iph->hop_limit = ip6_dst_hoplimit(xfrm_dst_child(dst));
+ top_iph->saddr = *(struct in6_addr *)&x->props.saddr;
+ top_iph->daddr = *(struct in6_addr *)&x->id.daddr;
+
+ return 0;
+}
+
+/**
+ * iptfs_prepare_output() - prepare the skb for output
+ * @x: xfrm state
+ * @skb: the packet
+ *
+ * Return: Error value, if 0 then skb values should be as follows:
+ * - transport_header should point at ESP header
+ * - network_header should point at Outer IP header
+ * - mac_header should point at protocol/nexthdr of the outer IP
+ */
+static int iptfs_prepare_output(struct xfrm_state *x, struct sk_buff *skb)
+{
+ if (x->outer_mode.family == AF_INET)
+ return iptfs_encap_add_ipv4(x, skb);
+ if (x->outer_mode.family == AF_INET6) {
+#if IS_ENABLED(CONFIG_IPV6)
+ return iptfs_encap_add_ipv6(x, skb);
+#else
+ WARN_ON_ONCE(1);
+ return -EAFNOSUPPORT;
+#endif
+ }
+ WARN_ON_ONCE(1);
+ return -EOPNOTSUPP;
+}
+
/* ========================== */
/* State Management Functions */
/* ========================== */
@@ -66,6 +557,9 @@ static int iptfs_user_init(struct net *net, struct xfrm_state *x,
struct xfrm_iptfs_config *xc;
xc = &xtfs->cfg;
+ xc->max_queue_size = net->xfrm.sysctl_iptfs_max_qsize;
+ xtfs->init_delay_ns =
+ (u64)net->xfrm.sysctl_iptfs_init_delay * NSECS_IN_USEC;
if (attrs[XFRMA_IPTFS_PKT_SIZE]) {
xc->pkt_size = nla_get_u32(attrs[XFRMA_IPTFS_PKT_SIZE]);
@@ -79,6 +573,15 @@ static int iptfs_user_init(struct net *net, struct xfrm_state *x,
return -EINVAL;
}
}
+ if (attrs[XFRMA_IPTFS_MAX_QSIZE])
+ xc->max_queue_size = nla_get_u32(attrs[XFRMA_IPTFS_MAX_QSIZE]);
+ if (attrs[XFRMA_IPTFS_INIT_DELAY])
+ xtfs->init_delay_ns =
+ (u64)nla_get_u32(attrs[XFRMA_IPTFS_INIT_DELAY]) *
+ NSECS_IN_USEC;
+
+ xtfs->ecn_queue_size = (u64)xc->max_queue_size * 95 / 100;
+
return 0;
}
@@ -91,7 +594,7 @@ static unsigned int iptfs_sa_len(const struct xfrm_state *x)
l += nla_total_size(0);
l += nla_total_size(sizeof(u16));
l += nla_total_size(sizeof(xc->pkt_size));
- l += nla_total_size(sizeof(u32));
+ l += nla_total_size(sizeof(xc->max_queue_size));
l += nla_total_size(sizeof(u32)); /* drop time usec */
l += nla_total_size(sizeof(u32)); /* init delay usec */
@@ -103,6 +606,7 @@ static int iptfs_copy_to_user(struct xfrm_state *x, struct sk_buff *skb)
struct xfrm_iptfs_data *xtfs = x->mode_data;
struct xfrm_iptfs_config *xc = &xtfs->cfg;
int ret;
+ u64 q;
ret = nla_put_flag(skb, XFRMA_IPTFS_DONT_FRAG);
if (ret)
@@ -113,7 +617,7 @@ static int iptfs_copy_to_user(struct xfrm_state *x, struct sk_buff *skb)
ret = nla_put_u32(skb, XFRMA_IPTFS_PKT_SIZE, xc->pkt_size);
if (ret)
return ret;
- ret = nla_put_u32(skb, XFRMA_IPTFS_MAX_QSIZE, 0);
+ ret = nla_put_u32(skb, XFRMA_IPTFS_MAX_QSIZE, xc->max_queue_size);
if (ret)
return ret;
@@ -121,7 +625,9 @@ static int iptfs_copy_to_user(struct xfrm_state *x, struct sk_buff *skb)
if (ret)
return ret;
- ret = nla_put_u32(skb, XFRMA_IPTFS_INIT_DELAY, 0);
+ q = xtfs->init_delay_ns;
+ (void)do_div(q, NSECS_IN_USEC);
+ ret = nla_put_u32(skb, XFRMA_IPTFS_INIT_DELAY, q);
return ret;
}
@@ -129,6 +635,10 @@ static int iptfs_copy_to_user(struct xfrm_state *x, struct sk_buff *skb)
static int __iptfs_init_state(struct xfrm_state *x,
struct xfrm_iptfs_data *xtfs)
{
+ __skb_queue_head_init(&xtfs->queue);
+ hrtimer_init(&xtfs->iptfs_timer, CLOCK_MONOTONIC, IPTFS_HRTIMER_MODE);
+ xtfs->iptfs_timer.function = iptfs_delay_timer;
+
/* Modify type (esp) adjustment values */
if (x->props.family == AF_INET)
@@ -186,6 +696,7 @@ static void iptfs_delete_state(struct xfrm_state *x)
if (!xtfs)
return;
+ hrtimer_cancel(&xtfs->iptfs_timer);
kfree_sensitive(xtfs);
module_put(x->mode_cbs->owner);
@@ -200,6 +711,8 @@ static const struct xfrm_mode_cbs iptfs_mode_cbs = {
.sa_len = iptfs_sa_len,
.clone = iptfs_clone,
.get_inner_mtu = iptfs_get_inner_mtu,
+ .output = iptfs_output_collect,
+ .prepare_output = iptfs_prepare_output,
};
static int __init xfrm_iptfs_init(void)
--
2.45.1
* [PATCH ipsec-next v2 10/17] xfrm: iptfs: share page fragments of inner packets
2024-05-20 21:42 [PATCH ipsec-next v1 0/8] Add IP-TFS mode to xfrm Christian Hopps
` (8 preceding siblings ...)
2024-05-20 21:42 ` [PATCH ipsec-next v2 09/17] xfrm: iptfs: add user packet (tunnel ingress) handling Christian Hopps
@ 2024-05-20 21:42 ` Christian Hopps
2024-05-20 21:42 ` [PATCH ipsec-next v2 11/17] xfrm: iptfs: add fragmenting of larger than MTU user packets Christian Hopps
` (7 subsequent siblings)
17 siblings, 0 replies; 34+ messages in thread
From: Christian Hopps @ 2024-05-20 21:42 UTC (permalink / raw)
To: devel; +Cc: Steffen Klassert, netdev, Christian Hopps, Christian Hopps
From: Christian Hopps <chopps@labn.net>
When possible, rather than appending secondary (aggregated) inner packets
to the fragment list, share their page fragments with the outer IPTFS
packet. This allows for more efficient packet transmission.
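As a rough illustration of when sharing is possible, here is a userspace
toy model of the eligibility test. It is only a sketch: `struct toy_skb`,
`toy_can_share()` and `TOY_MAX_FRAGS` are hypothetical stand-ins and not
kernel APIs. Two simplifications are assumed: the model evaluates each pair
independently, whereas the patch keeps share_ok false for the rest of an
outer packet once it has been cleared, and the model tests the head length
of the packet being merged, whereas the patch as posted tests
skb_headlen(skb).

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_FRAGS 17 /* assumed stand-in for the kernel's MAX_SKB_FRAGS */

/* Hypothetical, simplified view of the sk_buff properties that matter. */
struct toy_skb {
	unsigned int nr_frags;   /* page fragments already attached */
	unsigned int headlen;    /* bytes of linear (head) data */
	bool has_frag_list;      /* skb carries a frag_list */
	bool head_is_page_frag;  /* head data lives in a page fragment */
	bool pp_recycle;         /* page-pool recycling flag */
	bool zerocopy;           /* zerocopy (user-owned) pages */
	bool cloned_or_shared;   /* someone else references the skb */
};

/* Return true if the fragments of "from" may be moved into "to". */
static bool toy_can_share(const struct toy_skb *to, const struct toy_skb *from)
{
	/* The outer packet must be exclusively ours and frag_list free. */
	if (to->has_frag_list || to->cloned_or_shared)
		return false;
	/* The inner packet must be expressible purely as page fragments. */
	if (from->has_frag_list || from->zerocopy)
		return false;
	if (!from->head_is_page_frag && from->headlen)
		return false;
	/* Pages are released with one recycle setting, so both must agree. */
	if (to->pp_recycle != from->pp_recycle)
		return false;
	/* One extra slot may be needed for the inner packet's head data. */
	return to->nr_frags + from->nr_frags + 1 <= TOY_MAX_FRAGS;
}
```

When the test fails, the inner packet is simply appended to the frag_list
as before, so sharing is an optimization and never a requirement.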
Signed-off-by: Christian Hopps <chopps@labn.net>
---
net/xfrm/xfrm_iptfs.c | 88 ++++++++++++++++++++++++++++++++++++++-----
1 file changed, 79 insertions(+), 9 deletions(-)
diff --git a/net/xfrm/xfrm_iptfs.c b/net/xfrm/xfrm_iptfs.c
index 561483fc83f9..87ba643edfb7 100644
--- a/net/xfrm/xfrm_iptfs.c
+++ b/net/xfrm/xfrm_iptfs.c
@@ -45,6 +45,24 @@ struct xfrm_iptfs_data {
static u32 iptfs_get_inner_mtu(struct xfrm_state *x, int outer_mtu);
static enum hrtimer_restart iptfs_delay_timer(struct hrtimer *me);
+/* ================= */
+/* SK_BUFF Functions */
+/* ================= */
+
+/**
+ * skb_head_to_frag() - initialize a skb_frag_t based on skb head data
+ * @skb: skb with the head data
+ * @frag: frag to initialize
+ */
+static void skb_head_to_frag(const struct sk_buff *skb, skb_frag_t *frag)
+{
+ struct page *page = virt_to_head_page(skb->data);
+ unsigned char *addr = (unsigned char *)page_address(page);
+
+ BUG_ON(!skb->head_frag);
+ skb_frag_fill_page_desc(frag, page, skb->data - addr, skb_headlen(skb));
+}
+
/* ================================= */
/* IPTFS Sending (ingress) Functions */
/* ================================= */
@@ -262,14 +280,44 @@ static struct sk_buff **iptfs_rehome_fraglist(struct sk_buff **nextp,
return nextp;
}
+static void iptfs_consume_frags(struct sk_buff *to, struct sk_buff *from)
+{
+ struct skb_shared_info *fromi = skb_shinfo(from);
+ struct skb_shared_info *toi = skb_shinfo(to);
+ unsigned int new_truesize;
+
+ /* If we have data in a head page, grab it */
+ if (!skb_headlen(from)) {
+ new_truesize = SKB_TRUESIZE(skb_end_offset(from));
+ } else {
+ skb_head_to_frag(from, &toi->frags[toi->nr_frags]);
+ skb_frag_ref(to, toi->nr_frags++);
+ new_truesize = SKB_DATA_ALIGN(sizeof(struct sk_buff));
+ }
+
+ /* Move any other page fragments rather than copy */
+ memcpy(&toi->frags[toi->nr_frags], fromi->frags,
+ sizeof(fromi->frags[0]) * fromi->nr_frags);
+ toi->nr_frags += fromi->nr_frags;
+ fromi->nr_frags = 0;
+ from->data_len = 0;
+ from->len = 0;
+ to->truesize += from->truesize - new_truesize;
+ from->truesize = new_truesize;
+
+ /* We are done with this SKB */
+ consume_skb(from);
+}
+
static void iptfs_output_queued(struct xfrm_state *x, struct sk_buff_head *list)
{
struct xfrm_iptfs_data *xtfs = x->mode_data;
struct sk_buff *skb, *skb2, **nextp;
- struct skb_shared_info *shi;
+ struct skb_shared_info *shi, *shi2;
while ((skb = __skb_dequeue(list))) {
u32 mtu = iptfs_get_cur_pmtu(x, xtfs, skb);
+ bool share_ok = true;
int remaining;
/* protocol comes to us cleared sometimes */
@@ -314,7 +362,7 @@ static void iptfs_output_queued(struct xfrm_state *x, struct sk_buff_head *list)
/* Re-home (un-nest) nested fragment lists. We need to do this
* b/c we will simply be appending any following aggregated
- * inner packets to the frag list.
+ * inner packets using the frag list.
*/
shi = skb_shinfo(skb);
nextp = &shi->frag_list;
@@ -326,6 +374,9 @@ static void iptfs_output_queued(struct xfrm_state *x, struct sk_buff_head *list)
nextp = &(*nextp)->next;
}
+ if (shi->frag_list || skb_cloned(skb) || skb_shared(skb))
+ share_ok = false;
+
/* See if we have enough space to simply append.
*
* NOTE: Maybe do not append if we will be mis-aligned,
@@ -351,18 +402,37 @@ static void iptfs_output_queued(struct xfrm_state *x, struct sk_buff_head *list)
}
}
+ /* skb->pp_recycle is passed to __skb_frag_unref for all
+ * frag pages so we can only share pages with skb's who
+ * match ourselves.
+ */
+ shi2 = skb_shinfo(skb2);
+ if (share_ok &&
+ (shi2->frag_list ||
+ (!skb2->head_frag && skb_headlen(skb)) ||
+ skb->pp_recycle != skb2->pp_recycle ||
+ skb_zcopy(skb2) ||
+ (shi->nr_frags + shi2->nr_frags + 1 >
+ MAX_SKB_FRAGS)))
+ share_ok = false;
+
/* Do accounting */
skb->data_len += skb2->len;
skb->len += skb2->len;
remaining -= skb2->len;
- /* Append to the frag_list */
- *nextp = skb2;
- nextp = &skb2->next;
- BUG_ON(*nextp);
- if (skb_has_frag_list(skb2))
- nextp = iptfs_rehome_fraglist(nextp, skb2);
- skb->truesize += skb2->truesize;
+ if (share_ok) {
+ iptfs_consume_frags(skb, skb2);
+ } else {
+ /* Append to the frag_list */
+ *nextp = skb2;
+ nextp = &skb2->next;
+ BUG_ON(*nextp);
+ if (skb_has_frag_list(skb2))
+ nextp = iptfs_rehome_fraglist(nextp,
+ skb2);
+ skb->truesize += skb2->truesize;
+ }
}
xfrm_output(NULL, skb);
--
2.45.1
* [PATCH ipsec-next v2 11/17] xfrm: iptfs: add fragmenting of larger than MTU user packets
2024-05-20 21:42 [PATCH ipsec-next v1 0/8] Add IP-TFS mode to xfrm Christian Hopps
` (9 preceding siblings ...)
2024-05-20 21:42 ` [PATCH ipsec-next v2 10/17] xfrm: iptfs: share page fragments of inner packets Christian Hopps
@ 2024-05-20 21:42 ` Christian Hopps
2024-05-20 21:42 ` [PATCH ipsec-next v2 12/17] xfrm: iptfs: add basic receive packet (tunnel egress) handling Christian Hopps
` (6 subsequent siblings)
17 siblings, 0 replies; 34+ messages in thread
From: Christian Hopps @ 2024-05-20 21:42 UTC (permalink / raw)
To: devel; +Cc: Steffen Klassert, netdev, Christian Hopps, Christian Hopps
From: Christian Hopps <chopps@labn.net>
Add support for tunneling user (inner) packets that are larger than the
tunnel's path MTU (outer) using IP-TFS fragmentation.
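To make the resulting layout concrete, the sketch below mirrors the
arithmetic of the fragmenting loop in userspace. It is an illustration
under assumptions, not the kernel code: `struct frag_plan` and
`plan_frags()` are hypothetical names, and `mtu` is taken to be the payload
size left after the basic IPTFS header, as in the patch. The first fragment
carries a block offset of 0, and each continuation fragment carries a block
offset equal to the inner packet bytes still outstanding when it is built.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record for illustration; not a structure from the patch. */
struct frag_plan {
	uint32_t offset; /* offset of this fragment within the inner packet */
	uint32_t len;    /* inner packet bytes carried by this fragment */
	uint32_t blkoff; /* block offset placed in the IPTFS header */
};

/* Fill "out" with the fragment layout for an inner packet of pkt_len bytes.
 * Returns the number of fragments, or -1 if "out" has too few entries.
 */
static int plan_frags(uint32_t pkt_len, uint32_t mtu, struct frag_plan *out,
		      size_t max)
{
	uint32_t offset, to_copy;
	size_t n = 0;

	assert(mtu > 0);
	if (max == 0)
		return -1;

	/* The original packet, trimmed to mtu, becomes the first fragment. */
	out[n].offset = 0;
	out[n].len = pkt_len < mtu ? pkt_len : mtu;
	out[n].blkoff = 0;
	n++;
	if (pkt_len <= mtu)
		return (int)n;

	offset = mtu;
	to_copy = pkt_len - mtu;
	while (to_copy) {
		uint32_t copy_len = to_copy <= mtu ? to_copy : mtu;

		if (n == max)
			return -1;
		out[n].offset = offset;
		out[n].len = copy_len;
		/* Bytes of the inner packet left, including this fragment. */
		out[n].blkoff = to_copy;
		n++;
		offset += copy_len;
		to_copy -= copy_len;
	}
	return (int)n;
}
```

For example, a 3000 octet inner packet with a 1400 octet payload MTU yields
three fragments of 1400, 1400 and 200 octets with block offsets 0, 1600 and
200. The final fragment is the one held back for possible aggregation.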
Signed-off-by: Christian Hopps <chopps@labn.net>
---
net/xfrm/xfrm_iptfs.c | 404 ++++++++++++++++++++++++++++++++++++++----
1 file changed, 374 insertions(+), 30 deletions(-)
diff --git a/net/xfrm/xfrm_iptfs.c b/net/xfrm/xfrm_iptfs.c
index 87ba643edfb7..cabe3e1ffbff 100644
--- a/net/xfrm/xfrm_iptfs.c
+++ b/net/xfrm/xfrm_iptfs.c
@@ -19,11 +19,22 @@
#include "xfrm_inout.h"
+/* 1) skb->head should be cache aligned.
+ * 2) when resv is for L2 headers (i.e., ethernet) we want the cacheline to
+ * start -16 from data.
+ * 3) when resv is for L3+L2 headers IOW skb->data points at the IPTFS payload
+ * we want data to be cache line aligned so all the pushed headers will be in
+ * another cacheline.
+ */
+#define XFRM_IPTFS_MIN_L3HEADROOM 128
+#define XFRM_IPTFS_MIN_L2HEADROOM (64 + 16)
+#define IPTFS_FRAG_COPY_MAX 256 /* max for copying to create iptfs frags */
#define NSECS_IN_USEC 1000
#define IPTFS_HRTIMER_MODE HRTIMER_MODE_REL_SOFT
struct xfrm_iptfs_config {
+ bool dont_frag : 1;
u32 pkt_size; /* outer_packet_size or 0 */
u32 max_queue_size; /* octets */
};
@@ -42,13 +53,71 @@ struct xfrm_iptfs_data {
u32 payload_mtu; /* max payload size */
};
-static u32 iptfs_get_inner_mtu(struct xfrm_state *x, int outer_mtu);
+static u32 __iptfs_get_inner_mtu(struct xfrm_state *x, int outer_mtu);
static enum hrtimer_restart iptfs_delay_timer(struct hrtimer *me);
/* ================= */
/* SK_BUFF Functions */
/* ================= */
+/**
+ * iptfs_alloc_skb() - Allocate a new `skb` using a meta-data template.
+ * @tpl: the template to copy the new `skb`'s meta-data from.
+ * @len: the linear length of the head data, zero is fine.
+ * @l3resv: true if reserve needs to support pushing L3 headers
+ *
+ * A new `skb` is allocated and its meta-data is initialized from `tpl`, the
+ * head data is sized to `len` + reserved space set according to the @l3resv
+ * boolean. When @l3resv is false, resv is XFRM_IPTFS_MIN_L2HEADROOM which
+ * arranges for `skb->data - 16` (etherhdr space) to be the start of a cacheline.
+ * Otherwise, @l3resv is true and resv is either the size of headroom from `tpl` or
+ * XFRM_IPTFS_MIN_L3HEADROOM whichever is greater, which tries to align
+ * skb->data to a cacheline as all headers will be pushed on the previous
+ * cacheline bytes.
+ *
+ * When copying meta-data from the @tpl, the sk_buff->headers are not copied.
+ *
+ * Zero length skbs are allocated when we only need a head skb to hold new
+ * packet headers (basically the mac header) that sit on top of existing shared
+ * packet data.
+ *
+ * Return: the new skb or NULL.
+ */
+static struct sk_buff *iptfs_alloc_skb(struct sk_buff *tpl, u32 len,
+ bool l3resv)
+{
+ struct sk_buff *skb;
+ u32 resv;
+
+ if (!l3resv) {
+ resv = XFRM_IPTFS_MIN_L2HEADROOM;
+ } else {
+ resv = skb_headroom(tpl);
+ if (resv < XFRM_IPTFS_MIN_L3HEADROOM)
+ resv = XFRM_IPTFS_MIN_L3HEADROOM;
+ }
+
+ skb = alloc_skb(len + resv, GFP_ATOMIC);
+ if (!skb) {
+ XFRM_INC_STATS(dev_net(tpl->dev), LINUX_MIB_XFRMINERROR);
+ return NULL;
+ }
+
+ skb_reserve(skb, resv);
+
+ /* Code from __copy_skb_header() -- we do not want any of the
+ * tpl->headers copied over, so we aren't using `skb_copy_header()`.
+ */
+ skb->tstamp = tpl->tstamp;
+ skb->dev = tpl->dev;
+ memcpy(skb->cb, tpl->cb, sizeof(skb->cb));
+ skb_dst_copy(skb, tpl);
+ __skb_ext_copy(skb, tpl);
+ __nf_copy(skb, tpl, false);
+
+ return skb;
+}
+
/**
* skb_head_to_frag() - initialize a skb_frag_t based on skb head data
* @skb: skb with the head data
@@ -63,6 +132,39 @@ static void skb_head_to_frag(const struct sk_buff *skb, skb_frag_t *frag)
skb_frag_fill_page_desc(frag, page, skb->data - addr, skb_headlen(skb));
}
+/**
+ * skb_copy_bits_seq - copy bits from a skb_seq_state to kernel buffer
+ * @st: source skb_seq_state
+ * @offset: offset in source
+ * @to: destination buffer
+ * @len: number of bytes to copy
+ *
+ * Copy @len bytes from @offset bytes into the source @st to the destination
+ * buffer @to. `offset` should increase (or be unchanged) with each subsequent
+ * call to this function. If offset needs to decrease from the previous use `st`
+ * should be reset first.
+ */
+static int skb_copy_bits_seq(struct skb_seq_state *st, int offset, void *to,
+ int len)
+{
+ const u8 *data;
+ u32 sqlen;
+
+ for (;;) {
+ sqlen = skb_seq_read(offset, &data, st);
+ if (sqlen == 0)
+ return -ENOMEM;
+ if (sqlen >= len) {
+ memcpy(to, data, len);
+ return 0;
+ }
+ memcpy(to, data, sqlen);
+ to += sqlen;
+ offset += sqlen;
+ len -= sqlen;
+ }
+}
+
/* ================================= */
/* IPTFS Sending (ingress) Functions */
/* ================================= */
@@ -107,7 +209,7 @@ static int iptfs_get_cur_pmtu(struct xfrm_state *x,
{
struct xfrm_dst *xdst = (struct xfrm_dst *)skb_dst(skb);
u32 payload_mtu = xtfs->payload_mtu;
- u32 pmtu = iptfs_get_inner_mtu(x, xdst->child_mtu_cached);
+ u32 pmtu = __iptfs_get_inner_mtu(x, xdst->child_mtu_cached);
if (payload_mtu && payload_mtu < pmtu)
pmtu = payload_mtu;
@@ -169,7 +271,8 @@ static int iptfs_output_collect(struct net *net, struct sock *sk,
BUG_ON(!xtfs);
- pmtu = iptfs_get_cur_pmtu(x, xtfs, skb);
+ if (xtfs->cfg.dont_frag)
+ pmtu = iptfs_get_cur_pmtu(x, xtfs, skb);
/* Break apart GSO skbs. If the queue is nearing full then we want the
* accounting and queuing to be based on the individual packets not on the
@@ -209,8 +312,10 @@ static int iptfs_output_collect(struct net *net, struct sock *sk,
continue;
}
- /* Fragmenting handled in following commits. */
- if (iptfs_is_too_big(sk, skb, pmtu)) {
+ /* If the user indicated no iptfs fragmenting, check before
+ * enqueue.
+ */
+ if (xtfs->cfg.dont_frag && iptfs_is_too_big(sk, skb, pmtu)) {
kfree_skb_reason(skb, SKB_DROP_REASON_PKT_TOO_BIG);
continue;
}
@@ -256,6 +361,217 @@ static void iptfs_output_prepare_skb(struct sk_buff *skb, u32 blkoff)
IPCB(skb)->flags |= IPSKB_XFRM_TUNNEL_SIZE;
}
+/**
+ * iptfs_copy_create_frag() - create an inner fragment skb.
+ * @st: The source packet data.
+ * @offset: offset in @st of the new fragment data.
+ * @copy_len: the amount of data to copy from @st.
+ *
+ * Create a new skb holding a single IPTFS inner packet fragment. @copy_len must
+ * not be greater than the max fragment size.
+ *
+ * Return: the new fragment skb or an ERR_PTR().
+ */
+static struct sk_buff *iptfs_copy_create_frag(struct skb_seq_state *st,
+ u32 offset, u32 copy_len)
+{
+ struct sk_buff *src = st->root_skb;
+ struct sk_buff *skb;
+ int err;
+
+ skb = iptfs_alloc_skb(src, copy_len, true);
+ if (!skb)
+ return ERR_PTR(-ENOMEM);
+
+ /* Now copy `copy_len` data from src */
+ err = skb_copy_bits_seq(st, offset, skb_put(skb, copy_len), copy_len);
+ if (err) {
+ XFRM_INC_STATS(dev_net(src->dev), LINUX_MIB_XFRMOUTERROR);
+ kfree_skb(skb);
+ return ERR_PTR(err);
+ }
+
+ return skb;
+}
+
+/**
+ * iptfs_copy_create_frags() - create and send N-1 fragments of a larger skb.
+ * @skbp: the source packet skb (IN), skb holding the last fragment in
+ * the fragment stream (OUT).
+ * @xtfs: IPTFS SA state.
+ * @mtu: the max IPTFS fragment size.
+ *
+ * This function is responsible for fragmenting a larger inner packet into a
+ * sequence of IPTFS payload packets. The last fragment is returned rather than
+ * being sent so that the caller can append more inner packets (aggregation) if
+ * there is room.
+ */
+static int iptfs_copy_create_frags(struct sk_buff **skbp,
+ struct xfrm_iptfs_data *xtfs, u32 mtu)
+{
+ struct skb_seq_state skbseq;
+ struct list_head sublist;
+ struct sk_buff *skb = *skbp;
+ struct sk_buff *nskb = *skbp;
+ u32 copy_len, offset;
+ u32 to_copy = skb->len - mtu;
+ u32 blkoff = 0;
+ int err = 0;
+
+ INIT_LIST_HEAD(&sublist);
+
+ BUG_ON(skb->len <= mtu);
+ skb_prepare_seq_read(skb, 0, skb->len, &skbseq);
+
+ /* A trimmed `skb` will be sent as the first fragment, later. */
+ offset = mtu;
+ to_copy = skb->len - offset;
+ while (to_copy) {
+ /* Send all but last fragment to allow agg. append */
+ list_add_tail(&nskb->list, &sublist);
+
+ /* FUTURE: if the packet has an odd/non-aligning length we could
+ * send less data in the penultimate fragment so that the last
+ * fragment then ends on an aligned boundary.
+ */
+ copy_len = to_copy <= mtu ? to_copy : mtu;
+ nskb = iptfs_copy_create_frag(&skbseq, offset, copy_len);
+ if (IS_ERR(nskb)) {
+ XFRM_INC_STATS(dev_net(skb->dev),
+ LINUX_MIB_XFRMOUTERROR);
+ skb_abort_seq_read(&skbseq);
+ err = PTR_ERR(nskb);
+ nskb = NULL;
+ break;
+ }
+ iptfs_output_prepare_skb(nskb, to_copy);
+ offset += copy_len;
+ to_copy -= copy_len;
+ blkoff = to_copy;
+ }
+ skb_abort_seq_read(&skbseq);
+
+ /* return last fragment that will be unsent (or NULL) */
+ *skbp = nskb;
+
+ /* trim the original skb to MTU */
+ if (!err)
+ err = pskb_trim(skb, mtu);
+
+ if (err) {
+ /* Free all frags. Don't bother sending a partial packet we will
+ * never complete.
+ */
+ kfree_skb(nskb);
+ list_for_each_entry_safe(skb, nskb, &sublist, list) {
+ skb_list_del_init(skb);
+ kfree_skb(skb);
+ }
+ return err;
+ }
+
+ /* prepare the initial fragment with an iptfs header */
+ iptfs_output_prepare_skb(skb, 0);
+
+ /* Send all but last fragment, if we fail to send a fragment then free
+ * the rest -- no point in sending a packet that can't be reassembled.
+ */
+ list_for_each_entry_safe(skb, nskb, &sublist, list) {
+ skb_list_del_init(skb);
+ if (!err)
+ err = xfrm_output(NULL, skb);
+ else
+ kfree_skb(skb);
+ }
+ if (err)
+ kfree_skb(*skbp);
+ return err;
+}
+
+/**
+ * iptfs_first_should_copy() - determine if we should copy packet data.
+ * @first_skb: the first skb in the packet
+ * @mtu: the MTU.
+ *
+ * Determine if we should create subsequent skbs to hold the remaining data from
+ * a large inner packet by copying the packet data, or cloning the original skb
+ * and adjusting the offsets.
+ */
+static bool iptfs_first_should_copy(struct sk_buff *first_skb, u32 mtu)
+{
+ u32 frag_copy_max;
+
+ /* If we have less than frag_copy_max for remaining packet we copy
+ * those tail bytes as it is more efficient.
+ */
+ frag_copy_max = mtu <= IPTFS_FRAG_COPY_MAX ? mtu : IPTFS_FRAG_COPY_MAX;
+ if ((int)first_skb->len - (int)mtu < (int)frag_copy_max)
+ return true;
+
+ /* If we have non-linear skb just use copy */
+ if (skb_is_nonlinear(first_skb))
+ return true;
+
+ /* So we have a simple linear skb, easy to clone and share */
+ return false;
+}
+
+/**
+ * iptfs_first_skb() - handle the first dequeued inner packet for output
+ * @skbp: the source packet skb (IN), skb holding the last fragment in
+ * the fragment stream (OUT).
+ * @xtfs: IPTFS SA state.
+ * @mtu: the max IPTFS fragment size.
+ *
+ * This function is responsible for fragmenting a larger inner packet into a
+ * sequence of IPTFS payload packets. If it needs to fragment into subsequent
+ * skb's, it will either do so by copying or cloning.
+ *
+ * The last fragment is returned rather than being sent so that the caller can
+ * append more inner packets (aggregation) if there is room.
+ *
+ */
+static int iptfs_first_skb(struct sk_buff **skbp, struct xfrm_iptfs_data *xtfs,
+ u32 mtu)
+{
+ struct sk_buff *skb = *skbp;
+ int err;
+
+ /* Classic ESP skips the don't fragment ICMP error if DF is clear on
+ * the inner packet or ignore_df is set. Otherwise it will send an ICMP
+ * or local error if the inner packet won't fit its MTU.
+ *
+ * With IPTFS we do not care about the inner packet DF bit. If the
+ * tunnel is configured to "don't fragment" we error back if things
+ * don't fit in our max packet size. Otherwise we iptfs-fragment as
+ * normal.
+ */
+
+ /* The opportunity for HW offload has ended */
+ if (skb->ip_summed == CHECKSUM_PARTIAL) {
+ err = skb_checksum_help(skb);
+ if (err)
+ return err;
+ }
+
+ /* We've split these up before queuing */
+ BUG_ON(skb_is_gso(skb));
+
+ /* Simple case -- it fits. `mtu` accounted for all the overhead
+ * including the basic IPTFS header.
+ */
+ if (skb->len <= mtu) {
+ iptfs_output_prepare_skb(skb, 0);
+ return 0;
+ }
+
+ if (iptfs_first_should_copy(skb, mtu))
+ return iptfs_copy_create_frags(skbp, xtfs, mtu);
+
+ /* For now we always copy */
+ return iptfs_copy_create_frags(skbp, xtfs, mtu);
+}
+
static struct sk_buff **iptfs_rehome_fraglist(struct sk_buff **nextp,
struct sk_buff *child)
{
@@ -315,6 +631,15 @@ static void iptfs_output_queued(struct xfrm_state *x, struct sk_buff_head *list)
struct sk_buff *skb, *skb2, **nextp;
struct skb_shared_info *shi, *shi2;
+ /* If we are fragmenting due to a large inner packet we will output all
+ * the outer IPTFS packets required to contain the fragments of the
+ * single large inner packet. These outer packets need to be sent
+ * consecutively (ESP seq-wise). Since this output function is always
+ * running from a timer we do not need a lock to provide this guarantee.
+ * We will output our packets consecutively before the timer is allowed
+ * to run again on some other CPU.
+ */
+
while ((skb = __skb_dequeue(list))) {
u32 mtu = iptfs_get_cur_pmtu(x, xtfs, skb);
bool share_ok = true;
@@ -325,7 +650,7 @@ static void iptfs_output_queued(struct xfrm_state *x, struct sk_buff_head *list)
htons(ETH_P_IP) :
htons(ETH_P_IPV6);
- if (skb->len > mtu) {
+ if (skb->len > mtu && xtfs->cfg.dont_frag) {
/* We handle this case before enqueueing so we are only
* here b/c MTU changed after we enqueued before we
* dequeued, just drop these.
@@ -337,26 +662,22 @@ static void iptfs_output_queued(struct xfrm_state *x, struct sk_buff_head *list)
continue;
}
- /* If we don't have a cksum in the packet we need to add one
- * before encapsulation.
+ /* Convert first inner packet into an outer IPTFS packet,
+ * dealing with any fragmentation into multiple outer packets
+ * if necessary.
*/
- if (skb->ip_summed == CHECKSUM_PARTIAL) {
- if (skb_checksum_help(skb)) {
- XFRM_INC_STATS(dev_net(skb_dst(skb)->dev),
- LINUX_MIB_XFRMOUTERROR);
- kfree_skb(skb);
- continue;
- }
- }
-
- /* Convert first inner packet into an outer IPTFS packet */
- iptfs_output_prepare_skb(skb, 0);
+ if (iptfs_first_skb(&skb, xtfs, mtu))
+ continue;
- /* The space remaining to send more inner packet data is `mtu` -
- * (skb->len - sizeof iptfs header). This is b/c the `mtu` value
- * has the basic IPTFS header len accounted for, and we added
- * that header to the skb so it is a part of skb->len, thus we
- * subtract it from the skb length.
+ /* If fragmentation was required the returned skb is the last
+ * IPTFS fragment in the chain, and its IPTFS header blkoff has
+ * been set just past the end of the fragment data.
+ *
+ * In either case the space remaining to send more inner packet
+ * data is `mtu` - (skb->len - sizeof iptfs header). This is b/c
+ * the `mtu` value has the basic IPTFS header len accounted for,
+ * and we added that header to the skb so it is a part of
+ * skb->len, thus we subtract it from the skb length.
*/
remaining = mtu - (skb->len - sizeof(struct ip_iptfs_hdr));
@@ -597,11 +918,11 @@ static int iptfs_prepare_output(struct xfrm_state *x, struct sk_buff *skb)
/* ========================== */
/**
- * iptfs_get_inner_mtu() - return inner MTU with no fragmentation.
+ * __iptfs_get_inner_mtu() - return inner MTU with no fragmentation.
* @x: xfrm state.
* @outer_mtu: the outer mtu
*/
-static u32 iptfs_get_inner_mtu(struct xfrm_state *x, int outer_mtu)
+static u32 __iptfs_get_inner_mtu(struct xfrm_state *x, int outer_mtu)
{
struct crypto_aead *aead;
u32 blksize;
@@ -612,6 +933,24 @@ static u32 iptfs_get_inner_mtu(struct xfrm_state *x, int outer_mtu)
~(blksize - 1)) - 2;
}
+/**
+ * iptfs_get_inner_mtu() - return the inner MTU for an IPTFS xfrm.
+ * @x: xfrm state.
+ * @outer_mtu: Outer MTU for the encapsulated packet.
+ *
+ * Return: Correct MTU taking into account the encap overhead.
+ */
+static u32 iptfs_get_inner_mtu(struct xfrm_state *x, int outer_mtu)
+{
+ struct xfrm_iptfs_data *xtfs = x->mode_data;
+
+ /* If not dont-frag we have no MTU */
+ if (!xtfs->cfg.dont_frag)
+ return x->outer_mode.family == AF_INET ? IP_MAX_MTU :
+ IP6_MAX_MTU;
+ return __iptfs_get_inner_mtu(x, outer_mtu);
+}
+
/**
* iptfs_user_init() - initialize the SA with IPTFS options from netlink.
* @net: the net data
@@ -631,6 +970,8 @@ static int iptfs_user_init(struct net *net, struct xfrm_state *x,
xtfs->init_delay_ns =
(u64)net->xfrm.sysctl_iptfs_init_delay * NSECS_IN_USEC;
+ if (attrs[XFRMA_IPTFS_DONT_FRAG])
+ xc->dont_frag = true;
if (attrs[XFRMA_IPTFS_PKT_SIZE]) {
xc->pkt_size = nla_get_u32(attrs[XFRMA_IPTFS_PKT_SIZE]);
if (!xc->pkt_size) {
@@ -661,7 +1002,8 @@ static unsigned int iptfs_sa_len(const struct xfrm_state *x)
struct xfrm_iptfs_config *xc = &xtfs->cfg;
unsigned int l = 0;
- l += nla_total_size(0);
+ if (xc->dont_frag)
+ l += nla_total_size(0);
l += nla_total_size(sizeof(u16));
l += nla_total_size(sizeof(xc->pkt_size));
l += nla_total_size(sizeof(xc->max_queue_size));
@@ -678,9 +1020,11 @@ static int iptfs_copy_to_user(struct xfrm_state *x, struct sk_buff *skb)
int ret;
u64 q;
- ret = nla_put_flag(skb, XFRMA_IPTFS_DONT_FRAG);
- if (ret)
- return ret;
+ if (xc->dont_frag) {
+ ret = nla_put_flag(skb, XFRMA_IPTFS_DONT_FRAG);
+ if (ret)
+ return ret;
+ }
ret = nla_put_u16(skb, XFRMA_IPTFS_REORDER_WINDOW, 0);
if (ret)
return ret;
--
2.45.1
* [PATCH ipsec-next v2 12/17] xfrm: iptfs: add basic receive packet (tunnel egress) handling
2024-05-20 21:42 [PATCH ipsec-next v1 0/8] Add IP-TFS mode to xfrm Christian Hopps
` (10 preceding siblings ...)
2024-05-20 21:42 ` [PATCH ipsec-next v2 11/17] xfrm: iptfs: add fragmenting of larger than MTU user packets Christian Hopps
@ 2024-05-20 21:42 ` Christian Hopps
2024-05-20 21:42 ` [PATCH ipsec-next v2 13/17] xfrm: iptfs: handle received fragmented inner packets Christian Hopps
` (5 subsequent siblings)
17 siblings, 0 replies; 34+ messages in thread
From: Christian Hopps @ 2024-05-20 21:42 UTC (permalink / raw)
To: devel; +Cc: Steffen Klassert, netdev, Christian Hopps, Christian Hopps
From: Christian Hopps <chopps@labn.net>
Add handling of packets received from the tunnel. This implements
tunnel egress functionality.
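For readers unfamiliar with the payload format, the following userspace
sketch shows the general idea of walking the data blocks of one AGGFRAG
payload as described in RFC 9347. It is not the code in this patch:
`inner_block_len()` and `count_inner_packets()` are hypothetical helpers,
the buffer is assumed to start at the first data block (block offset 0),
and fragment reassembly is ignored. Each data block starts with an IP
version nibble: 4 or 6 for an inner packet, 0 for padding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: length of the data block starting at p, 0 if it is
 * padding, or -1 if the length cannot be determined from "avail" bytes.
 */
static long inner_block_len(const uint8_t *p, size_t avail)
{
	unsigned int version;

	if (avail == 0)
		return -1;
	version = p[0] >> 4;
	if (version == 0)
		return 0; /* pad block, the rest of the payload is padding */
	if (version == 4) {
		if (avail < 4)
			return -1; /* total length field not present yet */
		return ((long)p[2] << 8) | p[3]; /* IPv4 total length */
	}
	if (version == 6) {
		if (avail < 6)
			return -1; /* payload length field not present yet */
		/* fixed 40 octet header plus IPv6 payload length */
		return 40 + (((long)p[4] << 8) | p[5]);
	}
	return -1; /* unknown block type */
}

/* Count the complete inner packets contained in one payload. */
static int count_inner_packets(const uint8_t *payload, size_t len)
{
	size_t off = 0;
	int n = 0;

	while (off < len) {
		long blen = inner_block_len(payload + off, len - off);

		/* Stop on padding, errors, or a packet that continues in
		 * the next outer payload.
		 */
		if (blen <= 0 || (size_t)blen > len - off)
			break;
		off += (size_t)blen;
		n++;
	}
	return n;
}
```

A block whose length runs past the end of the payload is the start of a
fragmented inner packet; handling that case is left to the following patch
in the series.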
Signed-off-by: Christian Hopps <chopps@labn.net>
---
net/xfrm/xfrm_iptfs.c | 283 ++++++++++++++++++++++++++++++++++++++++++
1 file changed, 283 insertions(+)
diff --git a/net/xfrm/xfrm_iptfs.c b/net/xfrm/xfrm_iptfs.c
index cabe3e1ffbff..bad0d7d2c547 100644
--- a/net/xfrm/xfrm_iptfs.c
+++ b/net/xfrm/xfrm_iptfs.c
@@ -19,6 +19,10 @@
#include "xfrm_inout.h"
+/* IPTFS encap (header) values. */
+#define IPTFS_SUBTYPE_BASIC 0
+#define IPTFS_SUBTYPE_CC 1
+
/* 1) skb->head should be cache aligned.
* 2) when resv is for L2 headers (i.e., ethernet) we want the cacheline to
* start -16 from data.
@@ -56,6 +60,17 @@ struct xfrm_iptfs_data {
static u32 __iptfs_get_inner_mtu(struct xfrm_state *x, int outer_mtu);
static enum hrtimer_restart iptfs_delay_timer(struct hrtimer *me);
+/* ================= */
+/* Utility Functions */
+/* ================= */
+
+static u64 __esp_seq(struct sk_buff *skb)
+{
+ u64 seq = ntohl(XFRM_SKB_CB(skb)->seq.input.low);
+
+ return seq | (u64)ntohl(XFRM_SKB_CB(skb)->seq.input.hi) << 32;
+}
+
/* ================= */
/* SK_BUFF Functions */
/* ================= */
@@ -165,6 +180,273 @@ static int skb_copy_bits_seq(struct skb_seq_state *st, int offset, void *to,
}
}
+/* ================================== */
+/* IPTFS Receiving (egress) Functions */
+/* ================================== */
+
+/**
+ * iptfs_pskb_extract_seq() - Create and load data into a new sk_buff.
+ * @skblen: the total data size for `skb`.
+ * @st: The source of the data to copy into `skb`.
+ * @off: The offset into @st to copy data from.
+ * @len: The length of data to copy from @st into `skb`. This must be <=
+ * @skblen.
+ *
+ * Create a new sk_buff `skb` with @skblen of packet data space. Then, using
+ * the seq functions, copy @len bytes from @st into `skb` starting from
+ * @off.
+ *
+ * It is an error for @len to be greater than the amount of data left in @st.
+ *
+ * Return: The newly allocated sk_buff `skb` or NULL if an error occurs.
+ */
+static struct sk_buff *
+iptfs_pskb_extract_seq(u32 skblen, struct skb_seq_state *st, u32 off, int len)
+{
+ struct sk_buff *skb = iptfs_alloc_skb(st->root_skb, skblen, false);
+
+ if (!skb)
+ return NULL;
+ if (skb_copy_bits_seq(st, off, skb_put(skb, len), len)) {
+ XFRM_INC_STATS(dev_net(st->root_skb->dev),
+ LINUX_MIB_XFRMINERROR);
+ kfree_skb(skb);
+ return NULL;
+ }
+ return skb;
+}
+
+/**
+ * iptfs_complete_inner_skb() - finish preparing the inner packet for gro recv.
+ * @x: xfrm state
+ * @skb: the inner packet
+ *
+ * Finish the standard xfrm processing on the inner packet prior to sending back
+ * through gro_cells_receive. We do this separately b/c we are building a list
+ * of packets in the hopes that one day a list will be taken by
+ * xfrm_input.
+ */
+static void iptfs_complete_inner_skb(struct xfrm_state *x, struct sk_buff *skb)
+{
+ skb_reset_network_header(skb);
+
+ /* The packet is going back through gro_cells_receive no need to
+ * set this.
+ */
+ skb_reset_transport_header(skb);
+
+ /* Packet already has checksum value set. */
+ skb->ip_summed = CHECKSUM_NONE;
+
+ /* Our skb will contain the header data copied from the outer packet
+ * which contained the start of this inner packet. This is true
+ * when we allocate a new skb as well as when we reuse the existing skb.
+ */
+ if (ip_hdr(skb)->version == 0x4) {
+ struct iphdr *iph = ip_hdr(skb);
+
+ if (x->props.flags & XFRM_STATE_DECAP_DSCP)
+ ipv4_copy_dscp(XFRM_MODE_SKB_CB(skb)->tos, iph);
+ if (!(x->props.flags & XFRM_STATE_NOECN))
+ if (INET_ECN_is_ce(XFRM_MODE_SKB_CB(skb)->tos))
+ IP_ECN_set_ce(iph);
+
+ skb->protocol = htons(ETH_P_IP);
+ } else {
+ struct ipv6hdr *iph = ipv6_hdr(skb);
+
+ if (x->props.flags & XFRM_STATE_DECAP_DSCP)
+ ipv6_copy_dscp(XFRM_MODE_SKB_CB(skb)->tos, iph);
+ if (!(x->props.flags & XFRM_STATE_NOECN))
+ if (INET_ECN_is_ce(XFRM_MODE_SKB_CB(skb)->tos))
+ IP6_ECN_set_ce(skb, iph);
+
+ skb->protocol = htons(ETH_P_IPV6);
+ }
+}
+
+/**
+ * iptfs_input() - handle receipt of iptfs payload
+ * @x: xfrm state
+ * @skb: the packet
+ *
+ * Process the IPTFS payload in `skb` and consume it afterwards.
+ */
+static int iptfs_input(struct xfrm_state *x, struct sk_buff *skb)
+{
+ u8 hbytes[sizeof(struct ipv6hdr)];
+ struct ip_iptfs_cc_hdr iptcch;
+ struct skb_seq_state skbseq;
+ struct list_head sublist; /* rename this it's just a list */
+ struct sk_buff *first_skb, *next;
+ const unsigned char *old_mac;
+ struct xfrm_iptfs_data *xtfs;
+ struct ip_iptfs_hdr *ipth;
+ struct iphdr *iph;
+ struct net *net;
+ u32 remaining, iplen, iphlen, data, tail;
+ u32 blkoff;
+ u64 seq;
+
+ xtfs = x->mode_data;
+ net = dev_net(skb->dev);
+ first_skb = NULL;
+
+ seq = __esp_seq(skb);
+
+ /* Large enough to hold both types of header */
+ ipth = (struct ip_iptfs_hdr *)&iptcch;
+
+ /* Save the old mac header if set */
+ old_mac = skb_mac_header_was_set(skb) ? skb_mac_header(skb) : NULL;
+
+ skb_prepare_seq_read(skb, 0, skb->len, &skbseq);
+
+ /* Get the IPTFS header and validate it */
+
+ if (skb_copy_bits_seq(&skbseq, 0, ipth, sizeof(*ipth))) {
+ XFRM_INC_STATS(net, LINUX_MIB_XFRMINBUFFERERROR);
+ goto done;
+ }
+ data = sizeof(*ipth);
+
+ /* Set data past the basic header */
+ if (ipth->subtype == IPTFS_SUBTYPE_CC) {
+ /* Copy the rest of the CC header */
+ remaining = sizeof(iptcch) - sizeof(*ipth);
+ if (skb_copy_bits_seq(&skbseq, data, ipth + 1, remaining)) {
+ XFRM_INC_STATS(net, LINUX_MIB_XFRMINBUFFERERROR);
+ goto done;
+ }
+ data += remaining;
+ } else if (ipth->subtype != IPTFS_SUBTYPE_BASIC) {
+ XFRM_INC_STATS(net, LINUX_MIB_XFRMINHDRERROR);
+ goto done;
+ }
+
+ if (ipth->flags != 0) {
+ XFRM_INC_STATS(net, LINUX_MIB_XFRMINHDRERROR);
+ goto done;
+ }
+
+ INIT_LIST_HEAD(&sublist);
+
+ /* Fragment handling in following commits */
+ blkoff = ntohs(ipth->block_offset);
+ data += blkoff;
+
+ /* New packets */
+ tail = skb->len;
+ while (data < tail) {
+ __be16 protocol = 0;
+
+ /* Gather information on the next data block.
+ * `data` points to the start of the data block.
+ */
+ remaining = tail - data;
+
+ /* try and copy enough bytes to read length from ipv4/ipv6 */
+ iphlen = min_t(u32, remaining, 6);
+ if (skb_copy_bits_seq(&skbseq, data, hbytes, iphlen)) {
+ XFRM_INC_STATS(net, LINUX_MIB_XFRMINBUFFERERROR);
+ goto done;
+ }
+
+ iph = (struct iphdr *)hbytes;
+ if (iph->version == 0x4) {
+ /* must have at least tot_len field present */
+ if (remaining < 4)
+ break;
+
+ iplen = be16_to_cpu(iph->tot_len);
+ iphlen = iph->ihl << 2;
+ protocol = cpu_to_be16(ETH_P_IP);
+ XFRM_MODE_SKB_CB(skbseq.root_skb)->tos = iph->tos;
+ } else if (iph->version == 0x6) {
+ /* must have at least payload_len field present */
+ if (remaining < 6)
+ break;
+
+ iplen = be16_to_cpu(
+ ((struct ipv6hdr *)hbytes)->payload_len);
+ iplen += sizeof(struct ipv6hdr);
+ iphlen = sizeof(struct ipv6hdr);
+ protocol = cpu_to_be16(ETH_P_IPV6);
+ XFRM_MODE_SKB_CB(skbseq.root_skb)->tos =
+ ipv6_get_dsfield((struct ipv6hdr *)iph);
+ } else if (iph->version == 0x0) {
+ /* pad */
+ break;
+ } else {
+ XFRM_INC_STATS(net, LINUX_MIB_XFRMINBUFFERERROR);
+ goto done;
+ }
+
+ if (unlikely(skbseq.stepped_offset)) {
+ /* We need to reset our seq read, it can't back up at
+ * this point.
+ */
+ struct sk_buff *save = skbseq.root_skb;
+
+ skb_abort_seq_read(&skbseq);
+ skb_prepare_seq_read(save, data, tail, &skbseq);
+ }
+
+ if (!first_skb)
+ first_skb = skb;
+
+ /* Fragment handling in following commits */
+ if (iplen > remaining)
+ break;
+
+ skb = iptfs_pskb_extract_seq(iplen, &skbseq, data, iplen);
+ if (!skb) {
+ /* skip to next packet or done */
+ data += iplen;
+ continue;
+ }
+
+ skb->protocol = protocol;
+ if (old_mac) {
+ /* rebuild the mac header */
+ skb_set_mac_header(skb, -first_skb->mac_len);
+ memcpy(skb_mac_header(skb), old_mac,
+ first_skb->mac_len);
+ eth_hdr(skb)->h_proto = skb->protocol;
+ }
+
+ data += iplen;
+ iptfs_complete_inner_skb(x, skb);
+ list_add_tail(&skb->list, &sublist);
+ }
+
+ /* Send the packets! */
+ list_for_each_entry_safe(skb, next, &sublist, list) {
+ skb_list_del_init(skb);
+ if (xfrm_input(skb, 0, 0, -3))
+ kfree_skb(skb);
+ }
+
+done:
+ skb = skbseq.root_skb;
+ skb_abort_seq_read(&skbseq);
+
+ if (first_skb) {
+ consume_skb(first_skb);
+ } else {
+ /* skb is the original passed in skb, but we didn't get far
+ * enough to process it as the first_skb.
+ */
+ kfree_skb(skb);
+ }
+
+ /* We always have dealt with the input SKB, either we are re-using it,
+ * or we have freed it. Return EINPROGRESS so that xfrm_input stops
+ * processing it.
+ */
+ return -EINPROGRESS;
+}
+
/* ================================= */
/* IPTFS Sending (ingress) Functions */
/* ================================= */
@@ -1125,6 +1407,7 @@ static const struct xfrm_mode_cbs iptfs_mode_cbs = {
.sa_len = iptfs_sa_len,
.clone = iptfs_clone,
.get_inner_mtu = iptfs_get_inner_mtu,
+ .input = iptfs_input,
.output = iptfs_output_collect,
.prepare_output = iptfs_prepare_output,
};
--
2.45.1
* [PATCH ipsec-next v2 13/17] xfrm: iptfs: handle received fragmented inner packets
2024-05-20 21:42 [PATCH ipsec-next v1 0/8] Add IP-TFS mode to xfrm Christian Hopps
` (11 preceding siblings ...)
2024-05-20 21:42 ` [PATCH ipsec-next v2 12/17] xfrm: iptfs: add basic receive packet (tunnel egress) handling Christian Hopps
@ 2024-05-20 21:42 ` Christian Hopps
2024-05-20 21:42 ` [PATCH ipsec-next v2 14/17] xfrm: iptfs: add reusing received skb for the tunnel egress packet Christian Hopps
` (4 subsequent siblings)
17 siblings, 0 replies; 34+ messages in thread
From: Christian Hopps @ 2024-05-20 21:42 UTC (permalink / raw)
To: devel; +Cc: Steffen Klassert, netdev, Christian Hopps, Christian Hopps
From: Christian Hopps <chopps@labn.net>
Add support for handling receipt of partial inner packets that have
been fragmented across multiple outer IP-TFS tunnel packets.
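The reassembly code in this patch depends on reading the inner packet's
total length from its first few bytes: 4 bytes are enough for IPv4
(tot_len) and 6 for IPv6 (payload_len). When fewer bytes are present at
the end of a payload, they are saved as a "runt" until the next payload
arrives. The following is a minimal userspace sketch of that length
logic, not the kernel code itself; the helper names
`inner_len_bytes_needed()` and `inner_total_len()` are hypothetical and
exist only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not in the patch): number of leading bytes of an
 * inner packet that must be present before its length can be read.
 * IPv4 tot_len is in bytes 2-3, IPv6 payload_len is in bytes 4-5.
 * Returns 0 for any other version nibble (version 0 marks padding).
 */
static size_t inner_len_bytes_needed(uint8_t first_octet)
{
	switch (first_octet >> 4) {
	case 4:
		return 4;
	case 6:
		return 6;
	default:
		return 0;
	}
}

/* Hypothetical helper: total length of the inner packet starting at `p`,
 * or 0 when only a runt is available (`avail` is too small) or the
 * version nibble is neither 4 nor 6.
 */
static uint32_t inner_total_len(const uint8_t *p, size_t avail)
{
	size_t need;

	if (avail == 0)
		return 0;
	need = inner_len_bytes_needed(p[0]);
	if (need == 0 || avail < need)
		return 0;
	if (need == 4)
		return ((uint32_t)p[2] << 8) | p[3];
	/* payload_len excludes the fixed 40-byte IPv6 header */
	return (((uint32_t)p[4] << 8) | p[5]) + 40;
}
```

A return value of 0 with a valid version nibble corresponds to the case
where the patch calls iptfs_input_save_runt() and waits for the next
sequence number.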
Signed-off-by: Christian Hopps <chopps@labn.net>
---
net/xfrm/xfrm_iptfs.c | 438 ++++++++++++++++++++++++++++++++++++++++--
1 file changed, 425 insertions(+), 13 deletions(-)
diff --git a/net/xfrm/xfrm_iptfs.c b/net/xfrm/xfrm_iptfs.c
index bad0d7d2c547..958aa3d0522f 100644
--- a/net/xfrm/xfrm_iptfs.c
+++ b/net/xfrm/xfrm_iptfs.c
@@ -55,10 +55,22 @@ struct xfrm_iptfs_data {
struct hrtimer iptfs_timer; /* output timer */
time64_t iptfs_settime; /* time timer was set */
u32 payload_mtu; /* max payload size */
+
+ /* Tunnel egress */
+ spinlock_t drop_lock;
+ struct hrtimer drop_timer;
+ u64 drop_time_ns;
+
+ /* Tunnel egress reassembly */
+ struct sk_buff *ra_newskb; /* new pkt being reassembled */
+ u64 ra_wantseq; /* expected next sequence */
+ u8 ra_runt[6]; /* last pkt bytes from last skb */
+ u8 ra_runtlen; /* count of ra_runt */
};
static u32 __iptfs_get_inner_mtu(struct xfrm_state *x, int outer_mtu);
static enum hrtimer_restart iptfs_delay_timer(struct hrtimer *me);
+static enum hrtimer_restart iptfs_drop_timer(struct hrtimer *me);
/* ================= */
/* Utility Functions */
@@ -216,6 +228,63 @@ iptfs_pskb_extract_seq(u32 skblen, struct skb_seq_state *st, u32 off, int len)
return skb;
}
+/**
+ * iptfs_input_save_runt() - save data in xtfs runt space.
+ * @xtfs: xtfs state
+ * @seq: the current sequence
+ * @buf: packet data
+ * @len: length of packet data
+ *
+ * Save the small (`len`) start of a fragmented packet in `buf` in the xtfs data
+ * runt space.
+ */
+static void iptfs_input_save_runt(struct xfrm_iptfs_data *xtfs, u64 seq,
+ u8 *buf, int len)
+{
+ BUG_ON(xtfs->ra_newskb); /* we won't have a new SKB yet */
+
+ memcpy(xtfs->ra_runt, buf, len);
+
+ xtfs->ra_runtlen = len;
+ xtfs->ra_wantseq = seq + 1;
+}
+
+/**
+ * __iptfs_iphlen() - return the v4/v6 header length using packet data.
+ * @data: pointer at octet with version nibble
+ *
+ * The version data is expected to be valid (i.e., either 4 or 6).
+ */
+static u32 __iptfs_iphlen(u8 *data)
+{
+ struct iphdr *iph = (struct iphdr *)data;
+
+ if (iph->version == 0x4)
+ return sizeof(*iph);
+ BUG_ON(iph->version != 0x6);
+ return sizeof(struct ipv6hdr);
+}
+
+/**
+ * __iptfs_iplen() - return the v4/v6 length using packet data.
+ * @data: pointer to ip (v4/v6) packet header
+ *
+ * Grab the IPv4 or IPv6 length value in the start of the inner packet header
+ * pointed to by `data`. Assumes data len is enough for the length field only.
+ *
+ * The version data is expected to be valid (i.e., either 4 or 6).
+ */
+static u32 __iptfs_iplen(u8 *data)
+{
+ struct iphdr *iph = (struct iphdr *)data;
+
+ if (iph->version == 0x4)
+ return ntohs(iph->tot_len);
+ BUG_ON(iph->version != 0x6);
+ return ntohs(((struct ipv6hdr *)iph)->payload_len) +
+ sizeof(struct ipv6hdr);
+}
+
/**
* iptfs_complete_inner_skb() - finish preparing the inner packet for gro recv.
* @x: xfrm state
@@ -265,6 +334,238 @@ static void iptfs_complete_inner_skb(struct xfrm_state *x, struct sk_buff *skb)
}
}
+static void __iptfs_reassem_done(struct xfrm_iptfs_data *xtfs, bool free)
+{
+ assert_spin_locked(&xtfs->drop_lock);
+
+ /* We don't care if it works; locking takes care of things */
+ hrtimer_try_to_cancel(&xtfs->drop_timer);
+ if (free)
+ kfree_skb(xtfs->ra_newskb);
+ xtfs->ra_newskb = NULL;
+}
+
+/**
+ * iptfs_reassem_abort() - In-progress packet is aborted, free the state.
+ * @xtfs: xtfs state
+ */
+static void iptfs_reassem_abort(struct xfrm_iptfs_data *xtfs)
+{
+ __iptfs_reassem_done(xtfs, true);
+}
+
+/**
+ * iptfs_reassem_done() - In-progress packet is complete, clear the state.
+ * @xtfs: xtfs state
+ */
+static void iptfs_reassem_done(struct xfrm_iptfs_data *xtfs)
+{
+ __iptfs_reassem_done(xtfs, false);
+}
+
+/**
+ * iptfs_reassem_cont() - Continue the reassembly of an inner packet.
+ * @xtfs: xtfs state
+ * @seq: sequence of current packet
+ * @st: seq read state for current packet
+ * @skb: current packet
+ * @data: offset into sequential packet data
+ * @blkoff: packet blkoff value
+ * @list: list of skbs to enqueue completed packet on
+ *
+ * Process an IPTFS payload that has a non-zero `blkoff` or when we are
+ * expecting the continuation b/c we have a runt or in-progress packet.
+ */
+static u32 iptfs_reassem_cont(struct xfrm_iptfs_data *xtfs, u64 seq,
+ struct skb_seq_state *st, struct sk_buff *skb,
+ u32 data, u32 blkoff, struct list_head *list)
+{
+ struct sk_buff *newskb = xtfs->ra_newskb;
+ u32 remaining = skb->len - data;
+ u32 runtlen = xtfs->ra_runtlen;
+ u32 copylen, fraglen, ipremain, iphlen, iphremain, rrem;
+
+ /* Handle packet fragment we aren't expecting */
+ if (!runtlen && !xtfs->ra_newskb)
+ return data + min(blkoff, remaining);
+
+ /* Important to remember that input to this function is an ordered
+ * packet stream (unless the user disabled the reorder window). Thus if
+ * we are waiting for, and expecting the next packet so we can continue
+ * assembly, a newer sequence number indicates older ones are not coming
+ * (or if they do should be ignored). Technically we can receive older
+ * ones when the reorder window is disabled; however, the user should
+ * have disabled fragmentation in this case, and regardless we don't
+ * deal with it.
+ *
+ * blkoff could be zero if the stream is messed up (or it's an all pad
+ * insertion) be careful to handle that case in each of the below
+ */
+
+ /* Too old case: This can happen when the reorder window is disabled so
+ * ordering isn't actually guaranteed.
+ */
+ if (seq < xtfs->ra_wantseq)
+ return data + remaining;
+
+ /* Too new case: We missed what we wanted, clean up. */
+ if (seq > xtfs->ra_wantseq) {
+ XFRM_INC_STATS(dev_net(skb->dev),
+ LINUX_MIB_XFRMINIPTFSERROR);
+ goto abandon;
+ }
+
+ if (blkoff == 0) {
+ if ((*skb->data & 0xF0) != 0) {
+ XFRM_INC_STATS(dev_net(skb->dev),
+ LINUX_MIB_XFRMINIPTFSERROR);
+ goto abandon;
+ }
+ /* Handle all pad case, advance expected sequence number.
+ * (RFC 9347 S2.2.3)
+ */
+ xtfs->ra_wantseq++;
+ /* will end parsing */
+ return data + remaining;
+ }
+
+ if (runtlen) {
+ BUG_ON(xtfs->ra_newskb);
+
+ /* Regardless of what happens we're done with the runt */
+ xtfs->ra_runtlen = 0;
+
+ /* The start of this inner packet was at the very end of the last
+ * iptfs payload which didn't include enough for the ip header
+ * length field. We must have *at least* that now.
+ */
+ rrem = sizeof(xtfs->ra_runt) - runtlen;
+ if (remaining < rrem || blkoff < rrem) {
+ XFRM_INC_STATS(dev_net(skb->dev),
+ LINUX_MIB_XFRMINIPTFSERROR);
+ goto abandon;
+ }
+
+ /* fill in the runt data */
+ if (skb_copy_bits_seq(st, data, &xtfs->ra_runt[runtlen],
+ rrem)) {
+ XFRM_INC_STATS(dev_net(skb->dev),
+ LINUX_MIB_XFRMINBUFFERERROR);
+ goto abandon;
+ }
+
+ /* We have enough data to get the ip length value now,
+ * allocate an in progress skb
+ */
+ ipremain = __iptfs_iplen(xtfs->ra_runt);
+ if (ipremain < sizeof(xtfs->ra_runt)) {
+ /* length has to be at least runtsize large */
+ XFRM_INC_STATS(dev_net(skb->dev),
+ LINUX_MIB_XFRMINIPTFSERROR);
+ goto abandon;
+ }
+
+ /* For the runt case we don't attempt sharing currently. NOTE:
+ * Currently, this IPTFS implementation will not create runts.
+ */
+
+ newskb = iptfs_alloc_skb(skb, ipremain, false);
+ if (!newskb) {
+ XFRM_INC_STATS(dev_net(skb->dev),
+ LINUX_MIB_XFRMINERROR);
+ goto abandon;
+ }
+ xtfs->ra_newskb = newskb;
+
+ /* Copy the runt data into the buffer, but leave data
+ * pointers the same as normal non-runt case. The extra `rrem`
+ * recopied bytes are basically cacheline free. Allows using
+ * same logic below to complete.
+ */
+ memcpy(skb_put(newskb, runtlen), xtfs->ra_runt,
+ sizeof(xtfs->ra_runt));
+ }
+
+ /* Continue reassembling the packet */
+ ipremain = __iptfs_iplen(newskb->data);
+ iphlen = __iptfs_iphlen(newskb->data);
+
+ /* Sanity check, we created the newskb knowing the IP length so the IP
+ * length can't now be shorter.
+ */
+ BUG_ON(newskb->len > ipremain);
+
+ ipremain -= newskb->len;
+ if (blkoff < ipremain) {
+ /* Corrupt data, we don't have enough to complete the packet */
+ XFRM_INC_STATS(dev_net(skb->dev), LINUX_MIB_XFRMINIPTFSERROR);
+ goto abandon;
+ }
+
+ /* We want the IP header in linear space */
+ if (newskb->len < iphlen) {
+ iphremain = iphlen - newskb->len;
+ if (blkoff < iphremain) {
+ XFRM_INC_STATS(dev_net(skb->dev),
+ LINUX_MIB_XFRMINIPTFSERROR);
+ goto abandon;
+ }
+ fraglen = min(blkoff, remaining);
+ copylen = min(fraglen, iphremain);
+ BUG_ON(skb_tailroom(newskb) < copylen);
+ if (skb_copy_bits_seq(st, data, skb_put(newskb, copylen), copylen)) {
+ XFRM_INC_STATS(dev_net(skb->dev),
+ LINUX_MIB_XFRMINBUFFERERROR);
+ goto abandon;
+ }
+ /* this is a silly condition that might occur anyway */
+ if (copylen < iphremain) {
+ xtfs->ra_wantseq++;
+ return data + fraglen;
+ }
+ /* update data and things derived from it */
+ data += copylen;
+ blkoff -= copylen;
+ remaining -= copylen;
+ ipremain -= copylen;
+ }
+
+ fraglen = min(blkoff, remaining);
+ copylen = min(fraglen, ipremain);
+
+ /* We verified this was true in the main receive routine */
+ BUG_ON(skb_tailroom(newskb) < copylen);
+
+ /* copy fragment data into newskb */
+ if (skb_copy_bits_seq(st, data, skb_put(newskb, copylen), copylen)) {
+ XFRM_INC_STATS(dev_net(skb->dev), LINUX_MIB_XFRMINBUFFERERROR);
+ goto abandon;
+ }
+
+ if (copylen < ipremain) {
+ xtfs->ra_wantseq++;
+ } else {
+ /* We are done with packet reassembly! */
+ BUG_ON(copylen != ipremain);
+ iptfs_reassem_done(xtfs);
+ iptfs_complete_inner_skb(xtfs->x, newskb);
+ list_add_tail(&newskb->list, list);
+ }
+
+ /* will continue on to new data block or end */
+ return data + fraglen;
+
+abandon:
+ if (xtfs->ra_newskb) {
+ iptfs_reassem_abort(xtfs);
+ } else {
+ xtfs->ra_runtlen = 0;
+ xtfs->ra_wantseq = 0;
+ }
+ /* skip past fragment, maybe to end */
+ return data + min(blkoff, remaining);
+}
+
/**
* iptfs_input() - handle receipt of iptfs payload
* @x: xfrm state
@@ -285,7 +586,7 @@ static int iptfs_input(struct xfrm_state *x, struct sk_buff *skb)
struct iphdr *iph;
struct net *net;
u32 remaining, iplen, iphlen, data, tail;
- u32 blkoff;
+ u32 blkoff, capturelen;
u64 seq;
xtfs = x->mode_data;
@@ -331,12 +632,27 @@ static int iptfs_input(struct xfrm_state *x, struct sk_buff *skb)
INIT_LIST_HEAD(&sublist);
- /* Fragment handling in following commits */
+ /* Handle fragment at start of payload, and/or waiting reassembly. */
+
blkoff = ntohs(ipth->block_offset);
- data += blkoff;
+ /* check before locking, i.e., maybe skip taking the lock */
+ if (blkoff || xtfs->ra_runtlen || xtfs->ra_newskb) {
+ spin_lock(&xtfs->drop_lock);
+
+ /* check again after lock */
+ if (blkoff || xtfs->ra_runtlen || xtfs->ra_newskb) {
+ data = iptfs_reassem_cont(xtfs, seq, &skbseq, skb, data,
+ blkoff, &sublist);
+ }
+
+ spin_unlock(&xtfs->drop_lock);
+ }
/* New packets */
+
tail = skb->len;
+ BUG_ON(xtfs->ra_newskb && data < tail);
+
while (data < tail) {
__be16 protocol = 0;
@@ -355,8 +671,13 @@ static int iptfs_input(struct xfrm_state *x, struct sk_buff *skb)
iph = (struct iphdr *)hbytes;
if (iph->version == 0x4) {
/* must have at least tot_len field present */
- if (remaining < 4)
+ if (remaining < 4) {
+ /* save the bytes we have, advance data and exit */
+ iptfs_input_save_runt(xtfs, seq, hbytes,
+ remaining);
+ data += remaining;
break;
+ }
iplen = be16_to_cpu(iph->tot_len);
iphlen = iph->ihl << 2;
@@ -364,8 +685,13 @@ static int iptfs_input(struct xfrm_state *x, struct sk_buff *skb)
XFRM_MODE_SKB_CB(skbseq.root_skb)->tos = iph->tos;
} else if (iph->version == 0x6) {
/* must have at least payload_len field present */
- if (remaining < 6)
+ if (remaining < 6) {
+ /* save the bytes we have, advance data and exit */
+ iptfs_input_save_runt(xtfs, seq, hbytes,
+ remaining);
+ data += remaining;
break;
+ }
iplen = be16_to_cpu(
((struct ipv6hdr *)hbytes)->payload_len);
@@ -376,6 +702,7 @@ static int iptfs_input(struct xfrm_state *x, struct sk_buff *skb)
ipv6_get_dsfield((struct ipv6hdr *)iph);
} else if (iph->version == 0x0) {
/* pad */
+ data = tail;
break;
} else {
XFRM_INC_STATS(net, LINUX_MIB_XFRMINBUFFERERROR);
@@ -395,16 +722,14 @@ static int iptfs_input(struct xfrm_state *x, struct sk_buff *skb)
if (!first_skb)
first_skb = skb;
- /* Fragment handling in following commits */
- if (iplen > remaining)
- break;
-
- skb = iptfs_pskb_extract_seq(iplen, &skbseq, data, iplen);
+ capturelen = min(iplen, remaining);
+ skb = iptfs_pskb_extract_seq(iplen, &skbseq, data, capturelen);
if (!skb) {
/* skip to next packet or done */
- data += iplen;
+ data += capturelen;
continue;
}
+ BUG_ON(skb->len != capturelen);
skb->protocol = protocol;
if (old_mac) {
@@ -415,11 +740,38 @@ static int iptfs_input(struct xfrm_state *x, struct sk_buff *skb)
eth_hdr(skb)->h_proto = skb->protocol;
}
- data += iplen;
+ data += capturelen;
+
+ if (skb->len < iplen) {
+ BUG_ON(data != tail);
+ BUG_ON(xtfs->ra_newskb);
+
+ /* Start reassembly */
+ spin_lock(&xtfs->drop_lock);
+
+ xtfs->ra_newskb = skb;
+ xtfs->ra_wantseq = seq + 1;
+ if (!hrtimer_is_queued(&xtfs->drop_timer)) {
+ /* softirq blocked lest the timer fire and interrupt us */
+ BUG_ON(!in_interrupt());
+ hrtimer_start(&xtfs->drop_timer,
+ xtfs->drop_time_ns,
+ IPTFS_HRTIMER_MODE);
+ }
+
+ spin_unlock(&xtfs->drop_lock);
+
+ break;
+ }
+
iptfs_complete_inner_skb(x, skb);
list_add_tail(&skb->list, &sublist);
}
+ if (data != tail)
+ /* this should not happen from the above code */
+ XFRM_INC_STATS(dev_net(skb->dev), LINUX_MIB_XFRMINIPTFSERROR);
+
/* Send the packets! */
list_for_each_entry_safe(skb, next, &sublist, list) {
skb_list_del_init(skb);
@@ -447,6 +799,45 @@ static int iptfs_input(struct xfrm_state *x, struct sk_buff *skb)
return -EINPROGRESS;
}
+/**
+ * iptfs_drop_timer() - Handle drop timer expiry.
+ * @me: the timer
+ *
+ * This is similar to our input function.
+ *
+ * The drop timer is set when we start an in progress reassembly, and also when
+ * we save a future packet in the window saved array.
+ *
+ * NOTE packets in the save window are always newer WRT drop times as
+ * they get further in the future. i.e. for:
+ *
+ * if slots (S0, S1, ... Sn) and `Dn` is the drop time for slot `Sn`,
+ * then D(n-1) <= D(n).
+ *
+ * So, regardless of why the timer is firing, we can always discard any
+ * in-progress fragment; either it's the reassembly timer, or slot 0 is going
+ * to be dropped as S0 must have the earliest drop time, and slot 0 holds the
+ * continuation fragment of the in-progress packet.
+ */
+static enum hrtimer_restart iptfs_drop_timer(struct hrtimer *me)
+{
+ struct xfrm_iptfs_data *xtfs;
+ struct xfrm_state *x;
+
+ xtfs = container_of(me, typeof(*xtfs), drop_timer);
+ x = xtfs->x;
+
+ /* Drop any in progress packet */
+ spin_lock(&xtfs->drop_lock);
+ if (xtfs->ra_newskb) {
+ kfree_skb(xtfs->ra_newskb);
+ xtfs->ra_newskb = NULL;
+ }
+ spin_unlock(&xtfs->drop_lock);
+
+ return HRTIMER_NORESTART;
+}
+
/* ================================= */
/* IPTFS Sending (ingress) Functions */
/* ================================= */
@@ -1251,6 +1642,8 @@ static int iptfs_user_init(struct net *net, struct xfrm_state *x,
xc->max_queue_size = net->xfrm.sysctl_iptfs_max_qsize;
xtfs->init_delay_ns =
(u64)net->xfrm.sysctl_iptfs_init_delay * NSECS_IN_USEC;
+ xtfs->drop_time_ns =
+ (u64)net->xfrm.sysctl_iptfs_drop_time * NSECS_IN_USEC;
if (attrs[XFRMA_IPTFS_DONT_FRAG])
xc->dont_frag = true;
@@ -1268,6 +1661,10 @@ static int iptfs_user_init(struct net *net, struct xfrm_state *x,
}
if (attrs[XFRMA_IPTFS_MAX_QSIZE])
xc->max_queue_size = nla_get_u32(attrs[XFRMA_IPTFS_MAX_QSIZE]);
+ if (attrs[XFRMA_IPTFS_DROP_TIME])
+ xtfs->drop_time_ns =
+ (u64)nla_get_u32(attrs[XFRMA_IPTFS_DROP_TIME]) *
+ NSECS_IN_USEC;
if (attrs[XFRMA_IPTFS_INIT_DELAY])
xtfs->init_delay_ns =
(u64)nla_get_u32(attrs[XFRMA_IPTFS_INIT_DELAY]) *
@@ -1317,7 +1714,9 @@ static int iptfs_copy_to_user(struct xfrm_state *x, struct sk_buff *skb)
if (ret)
return ret;
- ret = nla_put_u32(skb, XFRMA_IPTFS_DROP_TIME, 0);
+ q = xtfs->drop_time_ns;
+ (void)do_div(q, NSECS_IN_USEC);
+ ret = nla_put_u32(skb, XFRMA_IPTFS_DROP_TIME, q);
if (ret)
return ret;
@@ -1335,6 +1734,10 @@ static int __iptfs_init_state(struct xfrm_state *x,
hrtimer_init(&xtfs->iptfs_timer, CLOCK_MONOTONIC, IPTFS_HRTIMER_MODE);
xtfs->iptfs_timer.function = iptfs_delay_timer;
+ spin_lock_init(&xtfs->drop_lock);
+ hrtimer_init(&xtfs->drop_timer, CLOCK_MONOTONIC, IPTFS_HRTIMER_MODE);
+ xtfs->drop_timer.function = iptfs_drop_timer;
+
/* Modify type (esp) adjustment values */
if (x->props.family == AF_INET)
@@ -1362,6 +1765,8 @@ static int iptfs_clone(struct xfrm_state *x, struct xfrm_state *orig)
if (!xtfs)
return -ENOMEM;
+ xtfs->ra_newskb = NULL;
+
err = __iptfs_init_state(x, xtfs);
if (err)
return err;
@@ -1392,7 +1797,14 @@ static void iptfs_delete_state(struct xfrm_state *x)
if (!xtfs)
return;
+ spin_lock_bh(&xtfs->drop_lock);
hrtimer_cancel(&xtfs->iptfs_timer);
+ hrtimer_cancel(&xtfs->drop_timer);
+ spin_unlock_bh(&xtfs->drop_lock);
+
+ if (xtfs->ra_newskb)
+ kfree_skb(xtfs->ra_newskb);
+
kfree_sensitive(xtfs);
module_put(x->mode_cbs->owner);
--
2.45.1
* [PATCH ipsec-next v2 14/17] xfrm: iptfs: add reusing received skb for the tunnel egress packet
2024-05-20 21:42 [PATCH ipsec-next v1 0/8] Add IP-TFS mode to xfrm Christian Hopps
` (12 preceding siblings ...)
2024-05-20 21:42 ` [PATCH ipsec-next v2 13/17] xfrm: iptfs: handle received fragmented inner packets Christian Hopps
@ 2024-05-20 21:42 ` Christian Hopps
2024-05-20 21:42 ` [PATCH ipsec-next v2 15/17] xfrm: iptfs: add skb-fragment sharing code Christian Hopps
` (3 subsequent siblings)
17 siblings, 0 replies; 34+ messages in thread
From: Christian Hopps @ 2024-05-20 21:42 UTC (permalink / raw)
To: devel; +Cc: Steffen Klassert, netdev, Christian Hopps, Christian Hopps
From: Christian Hopps <chopps@labn.net>
Add an optimization of re-using the tunnel outer skb for re-transmission
of the inner packet to avoid skb allocation and copy.
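The reuse decision in the diff below comes down to three conditions: the
received skb has no frag list, the inner IP header lies entirely in the
linear area, and the tailroom plus the data already present can hold the
whole inner packet. The following is a minimal userspace sketch of that
predicate using plain integers. `struct rx_buf_info` and
`can_reuse_rx_buf()` are hypothetical names that stand in for the
sk_buff accessors; they are not part of the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the sk_buff properties the check looks at. */
struct rx_buf_info {
	uint32_t headlen;   /* bytes in the linear area, cf. skb_headlen() */
	uint32_t tailroom;  /* free linear bytes, cf. skb_tailroom() */
	uint32_t len;       /* total payload length (`tail` in the patch) */
	bool has_frag_list; /* cf. skb_has_frag_list() */
};

/* Return true when the received buffer could be reused for the inner
 * packet that starts `data` bytes in, has an IP header of `iphlen` bytes
 * and a total length of `iplen` bytes.
 */
static bool can_reuse_rx_buf(const struct rx_buf_info *b, uint32_t data,
			     uint32_t iphlen, uint32_t iplen)
{
	if (b->has_frag_list || data > b->len)
		return false;
	/* the IP header must sit in the linear area after the pull */
	if (data + iphlen > b->headlen)
		return false;
	/* room for the whole packet, counting bytes not yet received */
	return b->tailroom + (b->len - data) >= iplen;
}
```

For a non-linear buffer the tailroom is 0, so the predicate only holds
when the entire inner packet is already present, which matches the
comment in the patch.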
Signed-off-by: Christian Hopps <chopps@labn.net>
---
net/xfrm/xfrm_iptfs.c | 126 +++++++++++++++++++++++++++++++++++-------
1 file changed, 105 insertions(+), 21 deletions(-)
diff --git a/net/xfrm/xfrm_iptfs.c b/net/xfrm/xfrm_iptfs.c
index 958aa3d0522f..a0d7abe4f0d0 100644
--- a/net/xfrm/xfrm_iptfs.c
+++ b/net/xfrm/xfrm_iptfs.c
@@ -579,19 +579,20 @@ static int iptfs_input(struct xfrm_state *x, struct sk_buff *skb)
struct ip_iptfs_cc_hdr iptcch;
struct skb_seq_state skbseq;
struct list_head sublist; /* rename this it's just a list */
- struct sk_buff *first_skb, *next;
+ struct sk_buff *first_skb, *defer, *next;
const unsigned char *old_mac;
struct xfrm_iptfs_data *xtfs;
struct ip_iptfs_hdr *ipth;
struct iphdr *iph;
struct net *net;
- u32 remaining, iplen, iphlen, data, tail;
+ u32 remaining, first_iplen, iplen, iphlen, data, tail;
u32 blkoff, capturelen;
u64 seq;
xtfs = x->mode_data;
net = dev_net(skb->dev);
first_skb = NULL;
+ defer = NULL;
seq = __esp_seq(skb);
@@ -719,25 +720,94 @@ static int iptfs_input(struct xfrm_state *x, struct sk_buff *skb)
skb_prepare_seq_read(save, data, tail, &skbseq);
}
- if (!first_skb)
+ if (first_skb) {
+ skb = NULL;
+ } else {
first_skb = skb;
+ first_iplen = iplen;
+
+ /* We are going to skip over `data` bytes to reach the
+ * start of the IP header of `iphlen` len for `iplen`
+ * inner packet.
+ */
+
+ if (skb_has_frag_list(skb)) {
+ defer = skb;
+ skb = NULL;
+ } else if (data + iphlen <= skb_headlen(skb) &&
+ /* make sure our header is 32-bit aligned? */
+ /* ((uintptr_t)(skb->data + data) & 0x3) == 0 && */
+ skb_tailroom(skb) + tail - data >= iplen) {
+ /* Reuse the received skb.
+ *
+ * We have enough headlen to pull past any
+ * initial fragment data, leaving at least the
+ * IP header in the linear buffer space.
+ *
+ * For linear buffer space we only require that
+ * linear buffer space is large enough to
+ * eventually hold the entire reassembled
+ * packet (by including tailroom in the check).
+ *
+ * For non-linear skbs tailroom is 0, so we only
+ * re-use if the entire packet is present
+ * already.
+ *
+ * NOTE: there are many more options for
+ * sharing, KISS for now. Also, this can produce
+ * skb's with the IP header unaligned to 32
+ * bits. If that ends up being a problem then a
+ * check should be added to the conditional
+ * above that the header lies on a 32-bit
+ * boundary as well.
+ */
+ skb_pull(skb, data);
+
+ /* our range just changed */
+ data = 0;
+ tail = skb->len;
+ remaining = skb->len;
+
+ skb->protocol = protocol;
+ skb_mac_header_rebuild(skb);
+ if (skb->mac_len)
+ eth_hdr(skb)->h_proto = skb->protocol;
+
+ /* all pointers could be changed now, reset walk */
+ skb_abort_seq_read(&skbseq);
+ skb_prepare_seq_read(skb, data, tail, &skbseq);
+ } else {
+ /* We couldn't reuse the input skb so allocate a
+ * new one.
+ */
+ defer = skb;
+ skb = NULL;
+ }
+
+ /* Don't trim `first_skb` until the end as we are
+ * walking that data now.
+ */
+ }
capturelen = min(iplen, remaining);
- skb = iptfs_pskb_extract_seq(iplen, &skbseq, data, capturelen);
if (!skb) {
- /* skip to next packet or done */
- data += capturelen;
- continue;
- }
- BUG_ON(skb->len != capturelen);
-
- skb->protocol = protocol;
- if (old_mac) {
- /* rebuild the mac header */
- skb_set_mac_header(skb, -first_skb->mac_len);
- memcpy(skb_mac_header(skb), old_mac,
- first_skb->mac_len);
- eth_hdr(skb)->h_proto = skb->protocol;
+ skb = iptfs_pskb_extract_seq(iplen, &skbseq, data,
+ capturelen);
+ if (!skb) {
+ /* skip to next packet or done */
+ data += capturelen;
+ continue;
+ }
+ BUG_ON(skb->len != capturelen);
+
+ skb->protocol = protocol;
+ if (old_mac) {
+ /* rebuild the mac header */
+ skb_set_mac_header(skb, -first_skb->mac_len);
+ memcpy(skb_mac_header(skb), old_mac,
+ first_skb->mac_len);
+ eth_hdr(skb)->h_proto = skb->protocol;
+ }
}
data += capturelen;
@@ -772,8 +842,19 @@ static int iptfs_input(struct xfrm_state *x, struct sk_buff *skb)
/* this should not happen from the above code */
XFRM_INC_STATS(dev_net(skb->dev), LINUX_MIB_XFRMINIPTFSERROR);
+ if (first_skb && first_iplen && !defer && first_skb != xtfs->ra_newskb) {
+ /* first_skb is queued b/c !defer and not partial */
+ if (pskb_trim(first_skb, first_iplen)) {
+ /* error trimming */
+ list_del(&first_skb->list);
+ defer = first_skb;
+ }
+ first_skb->ip_summed = CHECKSUM_NONE;
+ }
+
/* Send the packets! */
list_for_each_entry_safe(skb, next, &sublist, list) {
+ BUG_ON(skb == defer);
skb_list_del_init(skb);
if (xfrm_input(skb, 0, 0, -3))
kfree_skb(skb);
@@ -783,12 +864,15 @@ static int iptfs_input(struct xfrm_state *x, struct sk_buff *skb)
skb = skbseq.root_skb;
skb_abort_seq_read(&skbseq);
- if (first_skb) {
- consume_skb(first_skb);
- } else {
+ if (defer) {
+ consume_skb(defer);
+ } else if (!first_skb) {
/* skb is the original passed in skb, but we didn't get far
- * enough to process it as the first_skb.
+ * enough to process it as the first_skb; if we had, it would
+ * either be saved in ra_newskb, trimmed and sent on as an skb, or
+ * placed in defer to be freed.
*/
+ BUG_ON(!skb);
kfree_skb(skb);
}
--
2.45.1
* [PATCH ipsec-next v2 15/17] xfrm: iptfs: add skb-fragment sharing code
2024-05-20 21:42 [PATCH ipsec-next v1 0/8] Add IP-TFS mode to xfrm Christian Hopps
` (13 preceding siblings ...)
2024-05-20 21:42 ` [PATCH ipsec-next v2 14/17] xfrm: iptfs: add reusing received skb for the tunnel egress packet Christian Hopps
@ 2024-05-20 21:42 ` Christian Hopps
2024-05-20 21:42 ` [PATCH ipsec-next v2 16/17] xfrm: iptfs: handle reordering of received packets Christian Hopps
` (2 subsequent siblings)
17 siblings, 0 replies; 34+ messages in thread
From: Christian Hopps @ 2024-05-20 21:42 UTC (permalink / raw)
To: devel; +Cc: Steffen Klassert, netdev, Christian Hopps, Christian Hopps
From: Christian Hopps <chopps@labn.net>
Avoid copying the inner packet data by sharing the skb data fragments
from the outer packet skb into the new inner packet skb.
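The frag walk introduced below starts at an initial offset into the
skb's data: fragments that lie entirely before the offset are skipped
and the first partially covered fragment is trimmed. The following is a
minimal userspace sketch of that offset handling over a plain array,
assuming each fragment is just an (offset, len) pair. `struct frag_view`
and `frag_views_from()` are hypothetical names for illustration and do
not reflect the full skb_prepare_frag_walk() logic.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical simplified fragment: byte offset within its page and the
 * length of data it holds.
 */
struct frag_view {
	uint32_t offset;
	uint32_t len;
};

/* Build in `out` a view of the fragments in `in` that starts `skip`
 * bytes into their concatenated data. `out` must have room for `n`
 * entries. Returns the number of entries written and stores the number
 * of bytes they cover in `*total`.
 */
static size_t frag_views_from(const struct frag_view *in, size_t n,
			      uint32_t skip, struct frag_view *out,
			      uint32_t *total)
{
	size_t i, cnt = 0;

	*total = 0;
	for (i = 0; i < n; i++) {
		if (skip >= in[i].len) {
			/* fragment lies entirely before the start point */
			skip -= in[i].len;
			continue;
		}
		/* trim the first partially covered fragment */
		out[cnt].offset = in[i].offset + skip;
		out[cnt].len = in[i].len - skip;
		*total += out[cnt].len;
		skip = 0;
		cnt++;
	}
	return cnt;
}
```

Only descriptors are adjusted here and no packet data is copied, which
is the point of sharing fragments rather than copying them.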
Signed-off-by: Christian Hopps <chopps@labn.net>
---
net/xfrm/xfrm_iptfs.c | 303 ++++++++++++++++++++++++++++++++++++++++--
1 file changed, 295 insertions(+), 8 deletions(-)
diff --git a/net/xfrm/xfrm_iptfs.c b/net/xfrm/xfrm_iptfs.c
index a0d7abe4f0d0..f8b7cf6f4b01 100644
--- a/net/xfrm/xfrm_iptfs.c
+++ b/net/xfrm/xfrm_iptfs.c
@@ -33,6 +33,7 @@
#define XFRM_IPTFS_MIN_L3HEADROOM 128
#define XFRM_IPTFS_MIN_L2HEADROOM (64 + 16)
#define IPTFS_FRAG_COPY_MAX 256 /* max for copying to create iptfs frags */
+#define IPTFS_PKT_SHARE_MIN 129 /* min to try to share vs copy pkt data */
#define NSECS_IN_USEC 1000
#define IPTFS_HRTIMER_MODE HRTIMER_MODE_REL_SOFT
@@ -159,6 +160,205 @@ static void skb_head_to_frag(const struct sk_buff *skb, skb_frag_t *frag)
skb_frag_fill_page_desc(frag, page, skb->data - addr, skb_headlen(skb));
}
+/**
+ * struct skb_frag_walk - used to track a walk through fragments
+ * @fragi: current fragment index
+ * @past: length of data in fragments before @fragi
+ * @total: length of data in all fragments
+ * @nr_frags: number of fragments present in array
+ * @initial_offset: the value passed in to skb_prepare_frag_walk()
+ * @pp_recycle: copy of skb->pp_recycle
+ * @frags: the page fragments inc. room for head page
+ */
+struct skb_frag_walk {
+ u32 fragi;
+ u32 past;
+ u32 total;
+ u32 nr_frags;
+ u32 initial_offset;
+ bool pp_recycle;
+ skb_frag_t frags[MAX_SKB_FRAGS + 1];
+};
+
+/**
+ * skb_prepare_frag_walk() - initialize a frag walk over an skb.
+ * @skb: the skb to walk.
+ * @initial_offset: start the walk @initial_offset into the skb.
+ * @walk: the walk to initialize
+ *
+ * Future calls to skb_add_frags() will expect the @offset value to be at
+ * least @initial_offset large.
+ */
+static void skb_prepare_frag_walk(struct sk_buff *skb, u32 initial_offset,
+ struct skb_frag_walk *walk)
+{
+ struct skb_shared_info *shinfo = skb_shinfo(skb);
+ skb_frag_t *frag, *from;
+ u32 i;
+
+ walk->initial_offset = initial_offset;
+ walk->fragi = 0;
+ walk->past = 0;
+ walk->total = 0;
+ walk->nr_frags = 0;
+ walk->pp_recycle = skb->pp_recycle;
+
+ if (skb->head_frag) {
+ if (initial_offset >= skb_headlen(skb)) {
+ initial_offset -= skb_headlen(skb);
+ } else {
+ frag = &walk->frags[walk->nr_frags++];
+ skb_head_to_frag(skb, frag);
+ frag->offset += initial_offset;
+ frag->len -= initial_offset;
+ walk->total += frag->len;
+ initial_offset = 0;
+ }
+ } else {
+ BUG_ON(skb_headlen(skb) > initial_offset);
+ initial_offset -= skb_headlen(skb);
+ }
+
+ for (i = 0; i < shinfo->nr_frags; i++) {
+ from = &shinfo->frags[i];
+ if (initial_offset >= from->len) {
+ initial_offset -= from->len;
+ continue;
+ }
+ frag = &walk->frags[walk->nr_frags++];
+ *frag = *from;
+ if (initial_offset) {
+ frag->offset += initial_offset;
+ frag->len -= initial_offset;
+ initial_offset = 0;
+ }
+ walk->total += frag->len;
+ }
+ BUG_ON(initial_offset != 0);
+}
+
+static u32 __skb_reset_frag_walk(struct skb_frag_walk *walk, u32 offset)
+{
+ /* Adjust offset to refer to internal walk values */
+ BUG_ON(offset < walk->initial_offset);
+ offset -= walk->initial_offset;
+
+ /* Get to the correct fragment for offset */
+ while (offset < walk->past) {
+ walk->past -= walk->frags[--walk->fragi].len;
+ if (offset >= walk->past)
+ break;
+ BUG_ON(walk->fragi == 0);
+ }
+ while (offset >= walk->past + walk->frags[walk->fragi].len)
+ walk->past += walk->frags[walk->fragi++].len;
+
+ /* offset now relative to this current frag */
+ offset -= walk->past;
+ return offset;
+}
+
+/**
+ * skb_can_add_frags() - check if ok to add frags from walk to skb
+ * @skb: skb to check for adding frags to
+ * @walk: the walk that will be used as source for frags.
+ * @offset: offset from beginning of original skb to start from.
+ * @len: amount of data to add frag references to in @skb.
+ */
+static bool skb_can_add_frags(const struct sk_buff *skb,
+ struct skb_frag_walk *walk, u32 offset, u32 len)
+{
+ struct skb_shared_info *shinfo = skb_shinfo(skb);
+ u32 fragi, nr_frags, fraglen;
+
+ if (skb_has_frag_list(skb) || skb->pp_recycle != walk->pp_recycle)
+ return false;
+
+ /* Make offset relative to current frag after setting that */
+ offset = __skb_reset_frag_walk(walk, offset);
+
+ /* Verify we have array space for the fragments we need to add */
+ fragi = walk->fragi;
+ nr_frags = shinfo->nr_frags;
+ while (len && fragi < walk->nr_frags) {
+ skb_frag_t *frag = &walk->frags[fragi];
+
+ fraglen = frag->len;
+ if (offset) {
+ fraglen -= offset;
+ offset = 0;
+ }
+ if (++nr_frags > MAX_SKB_FRAGS)
+ return false;
+ if (len <= fraglen)
+ return true;
+ len -= fraglen;
+ fragi++;
+ }
+ /* We may not copy all @len but what we have will fit. */
+ return true;
+}
+
+/**
+ * skb_add_frags() - add a range of fragment references into an skb
+ * @skb: skb to add references into
+ * @walk: the walk to add referenced fragments from.
+ * @offset: offset from beginning of original skb to start from.
+ * @len: amount of data to add frag references to in @skb.
+ *
+ * skb_can_add_frags() should be called before this function to verify that the
+ * destination @skb is compatible with the walk and has space in the array for
+ * the frag references to be added.
+ *
+ * Return: The number of bytes not added to @skb b/c we reached the end of the
+ * walk before adding all of @len.
+ */
+static int skb_add_frags(struct sk_buff *skb, struct skb_frag_walk *walk,
+ u32 offset, u32 len)
+{
+ struct skb_shared_info *shinfo = skb_shinfo(skb);
+ u32 fraglen;
+
+ BUG_ON(skb->pp_recycle != walk->pp_recycle);
+ if (!walk->nr_frags || offset >= walk->total + walk->initial_offset)
+ return len;
+
+ /* make offset relative to current frag after setting that */
+ offset = __skb_reset_frag_walk(walk, offset);
+ BUG_ON(shinfo->nr_frags >= MAX_SKB_FRAGS);
+
+ while (len && walk->fragi < walk->nr_frags) {
+ skb_frag_t *frag = &walk->frags[walk->fragi];
+ skb_frag_t *tofrag = &shinfo->frags[shinfo->nr_frags];
+
+ *tofrag = *frag;
+ if (offset) {
+ tofrag->offset += offset;
+ tofrag->len -= offset;
+ offset = 0;
+ }
+ __skb_frag_ref(tofrag);
+ shinfo->nr_frags++;
+ BUG_ON(shinfo->nr_frags > MAX_SKB_FRAGS);
+
+ /* see if we are done */
+ fraglen = tofrag->len;
+ if (len < fraglen) {
+ tofrag->len = len;
+ skb->len += len;
+ skb->data_len += len;
+ return 0;
+ }
+ /* advance to next source fragment */
+ len -= fraglen; /* careful, use dst bv_len */
+ skb->len += fraglen; /* careful, " " " */
+ skb->data_len += fraglen; /* careful, " " " */
+ walk->past += frag->len; /* careful, use src bv_len */
+ walk->fragi++;
+ }
+ return len;
+}
+
/**
* skb_copy_bits_seq - copy bits from a skb_seq_state to kernel buffer
* @st: source skb_seq_state
@@ -196,6 +396,53 @@ static int skb_copy_bits_seq(struct skb_seq_state *st, int offset, void *to,
/* IPTFS Receiving (egress) Functions */
/* ================================== */
+/**
+ * iptfs_pskb_add_frags() - Create and add frags into a new sk_buff.
+ * @tpl: template to create new skb from.
+ * @walk: The source for fragments to add.
+ * @off: The offset into @walk to add frags from, also used with @st and
+ * @copy_len.
+ * @len: The length of data to add covering frags from @walk into @skb.
+ * This must be <= @skblen.
+ * @st: The sequence state to copy from into the new head skb.
+ * @copy_len: Copy @copy_len bytes from @st at offset @off into the new skb
+ * linear space.
+ *
+ * Create a new sk_buff `skb` using the template @tpl. Copy @copy_len bytes from
+ * @st into the new skb linear space, and then add shared fragments from the
+ * frag walk for the remaining @len of data (i.e., @len - @copy_len bytes).
+ *
+ * Return: The newly allocated sk_buff `skb` or NULL if an error occurs.
+ */
+static struct sk_buff *iptfs_pskb_add_frags(struct sk_buff *tpl,
+ struct skb_frag_walk *walk, u32 off,
+ u32 len, struct skb_seq_state *st,
+ u32 copy_len)
+{
+ struct sk_buff *skb;
+
+ skb = iptfs_alloc_skb(tpl, copy_len, false);
+ if (!skb)
+ return NULL;
+
+ /* this should not normally be happening */
+ if (!skb_can_add_frags(skb, walk, off + copy_len, len - copy_len)) {
+ kfree_skb(skb);
+ return NULL;
+ }
+
+ if (copy_len &&
+ skb_copy_bits_seq(st, off, skb_put(skb, copy_len), copy_len)) {
+ XFRM_INC_STATS(dev_net(st->root_skb->dev),
+ LINUX_MIB_XFRMINERROR);
+ kfree_skb(skb);
+ return NULL;
+ }
+
+ skb_add_frags(skb, walk, off + copy_len, len - copy_len);
+ return skb;
+}
+
/**
* iptfs_pskb_extract_seq() - Create and load data into a new sk_buff.
* @skblen: the total data size for `skb`.
@@ -380,6 +627,8 @@ static u32 iptfs_reassem_cont(struct xfrm_iptfs_data *xtfs, u64 seq,
struct skb_seq_state *st, struct sk_buff *skb,
u32 data, u32 blkoff, struct list_head *list)
{
+ struct skb_frag_walk _fragwalk;
+ struct skb_frag_walk *fragwalk = NULL;
struct sk_buff *newskb = xtfs->ra_newskb;
u32 remaining = skb->len - data;
u32 runtlen = xtfs->ra_runtlen;
@@ -533,13 +782,31 @@ static u32 iptfs_reassem_cont(struct xfrm_iptfs_data *xtfs, u64 seq,
fraglen = min(blkoff, remaining);
copylen = min(fraglen, ipremain);
- /* We verified this was true in the main receive routine */
- BUG_ON(skb_tailroom(newskb) < copylen);
+ /* If we may have the opportunity to share, prepare a fragwalk. */
+ if (!skb_has_frag_list(skb) && !skb_has_frag_list(newskb) &&
+ (skb->head_frag || skb->len == skb->data_len) &&
+ skb->pp_recycle == newskb->pp_recycle) {
+ fragwalk = &_fragwalk;
+ skb_prepare_frag_walk(skb, data, fragwalk);
+ }
- /* copy fragment data into newskb */
- if (skb_copy_bits_seq(st, data, skb_put(newskb, copylen), copylen)) {
- XFRM_INC_STATS(dev_net(skb->dev), LINUX_MIB_XFRMINBUFFERERROR);
- goto abandon;
+ /* Try to share, then fall back to copying. */
+ if (fragwalk && skb_can_add_frags(newskb, fragwalk, data, copylen)) {
+ u32 leftover;
+
+ leftover = skb_add_frags(newskb, fragwalk, data, copylen);
+ BUG_ON(leftover != 0);
+ } else {
+ /* We verified this was true in the main receive routine */
+ BUG_ON(skb_tailroom(newskb) < copylen);
+
+ /* copy fragment data into newskb */
+ if (skb_copy_bits_seq(st, data, skb_put(newskb, copylen),
+ copylen)) {
+ XFRM_INC_STATS(dev_net(skb->dev),
+ LINUX_MIB_XFRMINBUFFERERROR);
+ goto abandon;
+ }
}
if (copylen < ipremain) {
@@ -578,6 +845,8 @@ static int iptfs_input(struct xfrm_state *x, struct sk_buff *skb)
u8 hbytes[sizeof(struct ipv6hdr)];
struct ip_iptfs_cc_hdr iptcch;
struct skb_seq_state skbseq;
+ struct skb_frag_walk _fragwalk;
+ struct skb_frag_walk *fragwalk = NULL;
struct list_head sublist; /* rename this it's just a list */
struct sk_buff *first_skb, *defer, *next;
const unsigned char *old_mac;
@@ -725,6 +994,7 @@ static int iptfs_input(struct xfrm_state *x, struct sk_buff *skb)
} else {
first_skb = skb;
first_iplen = iplen;
+ fragwalk = NULL;
/* We are going to skip over `data` bytes to reach the
* start of the IP header of `iphlen` len for `iplen`
@@ -776,6 +1046,13 @@ static int iptfs_input(struct xfrm_state *x, struct sk_buff *skb)
/* all pointers could be changed now reset walk */
skb_abort_seq_read(&skbseq);
skb_prepare_seq_read(skb, data, tail, &skbseq);
+ } else if (skb->head_frag &&
+ /* We have the IP header right now */
+ remaining >= iphlen) {
+ fragwalk = &_fragwalk;
+ skb_prepare_frag_walk(skb, data, fragwalk);
+ defer = skb;
+ skb = NULL;
} else {
/* We couldn't reuse the input skb so allocate a
* new one.
@@ -791,8 +1068,18 @@ static int iptfs_input(struct xfrm_state *x, struct sk_buff *skb)
capturelen = min(iplen, remaining);
if (!skb) {
- skb = iptfs_pskb_extract_seq(iplen, &skbseq, data,
- capturelen);
+ if (!fragwalk ||
+ /* Large enough to be worth sharing */
+ iplen < IPTFS_PKT_SHARE_MIN ||
+ /* Have IP header + some data to share. */
+ capturelen <= iphlen ||
+ /* Try creating skb and adding frags */
+ !(skb = iptfs_pskb_add_frags(first_skb, fragwalk,
+ data, capturelen,
+ &skbseq, iphlen))) {
+ skb = iptfs_pskb_extract_seq(iplen, &skbseq,
+ data, capturelen);
+ }
if (!skb) {
/* skip to next packet or done */
data += capturelen;
--
2.45.1
* [PATCH ipsec-next v2 16/17] xfrm: iptfs: handle reordering of received packets
2024-05-20 21:42 [PATCH ipsec-next v1 0/8] Add IP-TFS mode to xfrm Christian Hopps
` (14 preceding siblings ...)
2024-05-20 21:42 ` [PATCH ipsec-next v2 15/17] xfrm: iptfs: add skb-fragment sharing code Christian Hopps
@ 2024-05-20 21:42 ` Christian Hopps
2024-05-21 16:07 ` kernel test robot
2024-05-20 21:42 ` [PATCH ipsec-next v2 17/17] xfrm: iptfs: add tracepoint functionality Christian Hopps
2024-05-23 19:29 ` [PATCH ipsec-next v1 0/8] Add IP-TFS mode to xfrm Antony Antony
17 siblings, 1 reply; 34+ messages in thread
From: Christian Hopps @ 2024-05-20 21:42 UTC (permalink / raw)
To: devel; +Cc: Steffen Klassert, netdev, Christian Hopps, Christian Hopps
From: Christian Hopps <chopps@labn.net>
Handle the receipt of the outer tunnel packets out-of-order. Pointers to
the out-of-order packets are saved in a window (array) awaiting needed
prior packets. When the required prior packets are received the now
in-order packets are then passed on to the regular packet receive code.
A timer is used to declare missing earlier packets lost so that the
algorithm can advance.
Signed-off-by: Christian Hopps <chopps@labn.net>
---
net/xfrm/xfrm_iptfs.c | 487 ++++++++++++++++++++++++++++++++++++++++--
1 file changed, 475 insertions(+), 12 deletions(-)
diff --git a/net/xfrm/xfrm_iptfs.c b/net/xfrm/xfrm_iptfs.c
index f8b7cf6f4b01..12b59411fbf0 100644
--- a/net/xfrm/xfrm_iptfs.c
+++ b/net/xfrm/xfrm_iptfs.c
@@ -38,8 +38,14 @@
#define IPTFS_HRTIMER_MODE HRTIMER_MODE_REL_SOFT
+struct skb_wseq {
+ struct sk_buff *skb;
+ u64 drop_time;
+};
+
struct xfrm_iptfs_config {
bool dont_frag : 1;
+ u16 reorder_win_size;
u32 pkt_size; /* outer_packet_size or 0 */
u32 max_queue_size; /* octets */
};
@@ -57,12 +63,16 @@ struct xfrm_iptfs_data {
time64_t iptfs_settime; /* time timer was set */
u32 payload_mtu; /* max payload size */
- /* Tunnel egress */
+ /* Tunnel input reordering */
+ bool w_seq_set; /* true after first seq received */
+ u64 w_wantseq; /* expected next sequence */
+ struct skb_wseq *w_saved; /* the saved buf array */
+ u32 w_savedlen; /* the saved len (not size) */
spinlock_t drop_lock;
struct hrtimer drop_timer;
u64 drop_time_ns;
- /* Tunnel egress reassembly */
+ /* Tunnel input reassembly */
struct sk_buff *ra_newskb; /* new pkt being reassembled */
u64 ra_wantseq; /* expected next sequence */
u8 ra_runt[6]; /* last pkt bytes from last skb */
@@ -834,13 +844,13 @@ static u32 iptfs_reassem_cont(struct xfrm_iptfs_data *xtfs, u64 seq,
}
/**
- * iptfs_input() - handle receipt of iptfs payload
+ * iptfs_input_ordered() - handle next in order IPTFS payload.
* @x: xfrm state
- * @skb: the packet
+ * @skb: current packet
*
* Process the IPTFS payload in `skb` and consume it afterwards.
*/
-static int iptfs_input(struct xfrm_state *x, struct sk_buff *skb)
+static int iptfs_input_ordered(struct xfrm_state *x, struct sk_buff *skb)
{
u8 hbytes[sizeof(struct ipv6hdr)];
struct ip_iptfs_cc_hdr iptcch;
@@ -1163,11 +1173,375 @@ static int iptfs_input(struct xfrm_state *x, struct sk_buff *skb)
kfree_skb(skb);
}
- /* We always have dealt with the input SKB, either we are re-using it,
- * or we have freed it. Return EINPROGRESS so that xfrm_input stops
- * processing it.
+ return 0;
+}
+
+/* ------------------------------- */
+/* Input (Egress) Re-ordering Code */
+/* ------------------------------- */
+
+static void __vec_shift(struct xfrm_iptfs_data *xtfs, u32 shift)
+{
+ u32 savedlen = xtfs->w_savedlen;
+
+ if (shift > savedlen)
+ shift = savedlen;
+ if (shift != savedlen)
+ memmove(xtfs->w_saved, xtfs->w_saved + shift,
+ (savedlen - shift) * sizeof(*xtfs->w_saved));
+ memset(xtfs->w_saved + savedlen - shift, 0,
+ shift * sizeof(*xtfs->w_saved));
+ xtfs->w_savedlen -= shift;
+}
+
+static void __reorder_past(struct xfrm_iptfs_data *xtfs, struct sk_buff *inskb,
+ struct list_head *freelist)
+{
+ list_add_tail(&inskb->list, freelist);
+}
+
+static u32 __reorder_drop(struct xfrm_iptfs_data *xtfs, struct list_head *list)
+
+{
+ struct skb_wseq *s, *se;
+ const u32 savedlen = xtfs->w_savedlen;
+ time64_t now = ktime_get_raw_fast_ns();
+ u32 count = 0;
+ u32 scount = 0;
+
+ BUG_ON(!savedlen);
+ if (xtfs->w_saved[0].drop_time > now)
+ goto set_timer;
+
+ ++xtfs->w_wantseq;
+
+ /* Keep flushing packets until we reach a drop time greater than now. */
+ s = xtfs->w_saved;
+ se = s + savedlen;
+ do {
+ /* Walking past empty slots until we reach a packet */
+ for (; s < se && !s->skb; s++)
+ if (s->drop_time > now)
+ goto outerdone;
+ /* Sending packets until we hit another empty slot. */
+ for (; s < se && s->skb; scount++, s++)
+ list_add_tail(&s->skb->list, list);
+ } while (s < se);
+outerdone:
+
+ count = s - xtfs->w_saved;
+ if (count) {
+ xtfs->w_wantseq += count;
+
+ /* Shift handled slots plus final empty slot into slot 0. */
+ __vec_shift(xtfs, count);
+ }
+
+ if (xtfs->w_savedlen) {
+set_timer:
+ /* Drifting is OK */
+ hrtimer_start(&xtfs->drop_timer,
+ xtfs->w_saved[0].drop_time - now,
+ IPTFS_HRTIMER_MODE);
+ }
+ return scount;
+}
+
+static void __reorder_this(struct xfrm_iptfs_data *xtfs, struct sk_buff *inskb,
+ struct list_head *list)
+{
+ struct skb_wseq *s, *se;
+ const u32 savedlen = xtfs->w_savedlen;
+ u32 count = 0;
+
+ /* Got what we wanted. */
+ list_add_tail(&inskb->list, list);
+ ++xtfs->w_wantseq;
+ if (!savedlen)
+ return;
+
+ /* Flush remaining consecutive packets. */
+
+ /* Keep sending until we hit another missed pkt. */
+ for (s = xtfs->w_saved, se = s + savedlen; s < se && s->skb; s++)
+ list_add_tail(&s->skb->list, list);
+ count = s - xtfs->w_saved;
+ if (count)
+ xtfs->w_wantseq += count;
+
+ /* Shift handled slots plus final empty slot into slot 0. */
+ __vec_shift(xtfs, count + 1);
+}
+
+/* Set the slot's drop time and all the empty slots below it until reaching a
+ * filled slot which will already be set.
+ */
+static void iptfs_set_window_drop_times(struct xfrm_iptfs_data *xtfs, int index)
+{
+ const u32 savedlen = xtfs->w_savedlen;
+ struct skb_wseq *s = xtfs->w_saved;
+ time64_t drop_time;
+
+ assert_spin_locked(&xtfs->drop_lock);
+
+ if (savedlen > index + 1) {
+ /* we are below another, our drop time and the timer are already set */
+ BUG_ON(xtfs->w_saved[index + 1].drop_time !=
+ xtfs->w_saved[index].drop_time);
+ return;
+ }
+ /* we are the most future so get a new drop time. */
+ drop_time = ktime_get_raw_fast_ns();
+ drop_time += xtfs->drop_time_ns;
+
+ /* Walk back through the array setting drop times as we go */
+ s[index].drop_time = drop_time;
+ while (index-- > 0 && !s[index].skb)
+ s[index].drop_time = drop_time;
+
+ /* If we walked all the way back, schedule the drop timer if needed */
+ if (index == -1 && !hrtimer_is_queued(&xtfs->drop_timer))
+ hrtimer_start(&xtfs->drop_timer, xtfs->drop_time_ns,
+ IPTFS_HRTIMER_MODE);
+}
+
+static void __reorder_future_fits(struct xfrm_iptfs_data *xtfs,
+ struct sk_buff *inskb,
+ struct list_head *freelist)
+{
+ const u32 nslots = xtfs->cfg.reorder_win_size + 1;
+ const u64 inseq = __esp_seq(inskb);
+ const u64 wantseq = xtfs->w_wantseq;
+ const u64 distance = inseq - wantseq;
+ const u32 savedlen = xtfs->w_savedlen;
+ const u32 index = distance - 1;
+
+ BUG_ON(distance >= nslots);
+
+ /* Handle future sequence number received which fits in the window.
+ *
+ * We know we don't have the seq we want so we won't be able to flush
+ * anything.
*/
- return -EINPROGRESS;
+
+ /* slot count is 4, saved size is 3 savedlen is 2
+ *
+ * "window boundary" is based on the fixed window size
+ * distance is also slot number
+ * index is an array index (i.e., - 1 of slot)
+ * : : - implicit NULL after array len
+ *
+ * +--------- used length (savedlen == 2)
+ * | +----- array size (nslots - 1 == 3)
+ * | | + window boundary (nslots == 4)
+ * V V | V
+ * |
+ * 0 1 2 3 | slot number
+ * --- 0 1 2 | array index
+ * [-] [b] : :| array
+ *
+ * "2" "3" "4" *5*| seq numbers
+ *
+ * We receive seq number 5
+ * distance == 3 [inseq(5) - w_wantseq(2)]
+ * index == 2 [distance(3) - 1]
+ */
+
+ if (xtfs->w_saved[index].skb) {
+ /* a dup of a future */
+ list_add_tail(&inskb->list, freelist);
+ return;
+ }
+
+ xtfs->w_saved[index].skb = inskb;
+ xtfs->w_savedlen = max(savedlen, index + 1);
+ iptfs_set_window_drop_times(xtfs, index);
+}
+
+static void __reorder_future_shifts(struct xfrm_iptfs_data *xtfs,
+ struct sk_buff *inskb,
+ struct list_head *list,
+ struct list_head *freelist)
+{
+ const u32 nslots = xtfs->cfg.reorder_win_size + 1;
+ const u64 inseq = __esp_seq(inskb);
+ u32 savedlen = xtfs->w_savedlen;
+ u64 wantseq = xtfs->w_wantseq;
+ struct sk_buff *slot0 = NULL;
+ u64 distance, extra_drops, s0seq;
+ struct skb_wseq *wnext;
+ u32 beyond, shifting, slot;
+
+ BUG_ON(inseq <= wantseq);
+ distance = inseq - wantseq;
+ BUG_ON(distance <= nslots - 1);
+ beyond = distance - (nslots - 1);
+
+ /* Handle future sequence number received.
+ *
+ * IMPORTANT: we are at least advancing w_wantseq (i.e., wantseq) by 1
+ * b/c we are beyond the window boundary.
+ *
+ * We know we don't have the wantseq so that counts as a drop.
+ */
+
+ /* ex: slot count is 4, array size is 3 savedlen is 2, slot 0 is the
+ * missing sequence number.
+ *
+ * the final slot at savedlen (index savedlen - 1) is always occupied.
+ *
+ * beyond is "beyond array size" not savedlen.
+ *
+ * +--------- array length (savedlen == 2)
+ * | +----- array size (nslots - 1 == 3)
+ * | | +- window boundary (nslots == 4)
+ * V V | V
+ * |
+ * 0 1 2 3 | slot number
+ * --- 0 1 2 | array index
+ * [b] [c] : :| array
+ * |
+ * "2" "3" "4" "5"|*6* seq numbers
+ *
+ * We receive seq number 6
+ * distance == 4 [inseq(6) - w_wantseq(2)]
+ * newslot == distance
+ * index == 3 [distance(4) - 1]
+ * beyond == 1 [newslot(4) - lastslot((nslots(4) - 1))]
+ * shifting == 1 [min(savedlen(2), beyond(1))]
+ * slot0_skb == [b], and should match w_wantseq
+ *
+ * +--- window boundary (nslots == 4)
+ * 0 1 2 3 | 4 slot number
+ * --- 0 1 2 | 3 array index
+ * [b] : : : :| array
+ * "2" "3" "4" "5" *6* seq numbers
+ *
+ * We receive seq number 6
+ * distance == 4 [inseq(6) - w_wantseq(2)]
+ * newslot == distance
+ * index == 3 [distance(4) - 1]
+ * beyond == 1 [newslot(4) - lastslot((nslots(4) - 1))]
+ * shifting == 1 [min(savedlen(1), beyond(1))]
+ * slot0_skb == [b] and should match w_wantseq
+ *
+ * +-- window boundary (nslots == 4)
+ * 0 1 2 3 | 4 5 6 slot number
+ * --- 0 1 2 | 3 4 5 array index
+ * [-] [c] : :| array
+ * "2" "3" "4" "5" "6" "7" *8* seq numbers
+ *
+ * savedlen = 2, beyond = 3
+ * iter 1: slot0 == NULL, missed++, lastdrop = 2 (2+1-1), slot0 = [-]
+ * iter 2: slot0 == NULL, missed++, lastdrop = 3 (2+2-1), slot0 = [c]
+ * 2 < 3, extra = 1 (3-2), missed += extra, lastdrop = 4 (2+2+1-1)
+ *
+ * We receive seq number 8
+ * distance == 6 [inseq(8) - w_wantseq(2)]
+ * newslot == distance
+ * index == 5 [distance(6) - 1]
+ * beyond == 3 [newslot(6) - lastslot((nslots(4) - 1))]
+ * shifting == 2 [min(savedlen(2), beyond(3))]
+ *
+ * slot0_skb == NULL changed from [b] when "savedlen < beyond" is true.
+ */
+
+ /* Now send any packets that are being shifted out of saved, and account
+ * for missing packets that are exiting the window as we shift it.
+ */
+
+ /* If savedlen > beyond we are shifting some, else all. */
+ shifting = min(savedlen, beyond);
+
+ /* slot0 is the buf that just shifted out and into slot0 */
+ slot0 = NULL;
+ s0seq = wantseq;
+ wnext = xtfs->w_saved;
+ for (slot = 1; slot <= shifting; slot++, wnext++) {
+ /* handle what was in slot0 before we occupy it */
+ if (slot0)
+ list_add_tail(&slot0->list, list);
+ s0seq++;
+ slot0 = wnext->skb;
+ wnext->skb = NULL;
+ }
+
+ /* slot0 is now either NULL (in which case it's what we are now waiting
+ * for) or a buf, in which case we need to handle it like we received it;
+ * however, we may be advancing past that buffer as well.
+ */
+
+ /* Handle case where we need to shift more than we had saved, slot0 will
+ * be NULL iff savedlen is 0, otherwise slot0 will always be
+ * non-NULL b/c we shifted the final element, which is always set if
+ * there is any saved, into slot0.
+ */
+ if (savedlen < beyond) {
+ extra_drops = beyond - savedlen;
+ if (savedlen == 0) {
+ BUG_ON(slot0);
+ s0seq += extra_drops;
+ } else {
+ extra_drops--; /* we aren't dropping what's in slot0 */
+ BUG_ON(!slot0);
+ list_add_tail(&slot0->list, list);
+ s0seq += extra_drops + 1;
+ }
+ slot0 = NULL;
+ /* slot0 has had an empty slot pushed into it */
+ }
+
+ /* Remove the entries */
+ __vec_shift(xtfs, beyond);
+
+ /* Advance want seq */
+ xtfs->w_wantseq += beyond;
+
+ /* Process drops here when implementing congestion control */
+
+ /* We've shifted. plug the packet in at the end. */
+ xtfs->w_savedlen = nslots - 1;
+ xtfs->w_saved[xtfs->w_savedlen - 1].skb = inskb;
+ iptfs_set_window_drop_times(xtfs, xtfs->w_savedlen - 1);
+
+ /* if we don't have a slot0 then we must wait for it */
+ if (!slot0)
+ return;
+
+ /* If slot0, seq must match new want seq */
+ BUG_ON(xtfs->w_wantseq != __esp_seq(slot0));
+
+ /* slot0 is valid, treat like we received expected. */
+ __reorder_this(xtfs, slot0, list);
+}
+
+/* Receive a new packet into the reorder window. Return a list of ordered
+ * packets from the window.
+ */
+static void iptfs_input_reorder(struct xfrm_iptfs_data *xtfs,
+ struct sk_buff *inskb, struct list_head *list,
+ struct list_head *freelist)
+{
+ const u32 nslots = xtfs->cfg.reorder_win_size + 1;
+ u64 inseq = __esp_seq(inskb);
+ u64 wantseq;
+
+ assert_spin_locked(&xtfs->drop_lock);
+
+ if (unlikely(!xtfs->w_seq_set)) {
+ xtfs->w_seq_set = true;
+ xtfs->w_wantseq = inseq;
+ }
+ wantseq = xtfs->w_wantseq;
+
+ if (likely(inseq == wantseq))
+ __reorder_this(xtfs, inskb, list);
+ else if (inseq < wantseq)
+ __reorder_past(xtfs, inskb, freelist);
+ else if ((inseq - wantseq) < nslots)
+ __reorder_future_fits(xtfs, inskb, freelist);
+ else
+ __reorder_future_shifts(xtfs, inskb, list, freelist);
}
/**
@@ -1192,23 +1566,90 @@ static int iptfs_input(struct xfrm_state *x, struct sk_buff *skb)
*/
static enum hrtimer_restart iptfs_drop_timer(struct hrtimer *me)
{
+ struct sk_buff *skb, *next;
+ struct list_head freelist, list;
struct xfrm_iptfs_data *xtfs;
struct xfrm_state *x;
+ u32 count;
xtfs = container_of(me, typeof(*xtfs), drop_timer);
x = xtfs->x;
- /* Drop any in progress packet */
spin_lock(&xtfs->drop_lock);
+
+ INIT_LIST_HEAD(&list);
+ INIT_LIST_HEAD(&freelist);
+
+ /* Drop any in progress packet */
+
if (xtfs->ra_newskb) {
kfree_skb(xtfs->ra_newskb);
xtfs->ra_newskb = NULL;
}
+
+ /* Now drop as many packets as we should from the reordering window
+ * saved array
+ */
+ count = xtfs->w_savedlen ? __reorder_drop(xtfs, &list) : 0;
+
spin_unlock(&xtfs->drop_lock);
+ if (count) {
+ list_for_each_entry_safe(skb, next, &list, list) {
+ skb_list_del_init(skb);
+ (void)iptfs_input_ordered(x, skb);
+ }
+ }
return HRTIMER_NORESTART;
}
+/**
+ * iptfs_input() - handle receipt of iptfs payload
+ * @x: xfrm state
+ * @skb: the packet
+ *
+ * We have an IPTFS payload; order it if needed, then process the newly
+ * in-order packets.
+ */
+static int iptfs_input(struct xfrm_state *x, struct sk_buff *skb)
+{
+ struct list_head freelist, list;
+ struct xfrm_iptfs_data *xtfs = x->mode_data;
+ struct sk_buff *next;
+
+ /* Fast path for no reorder window. */
+ if (xtfs->cfg.reorder_win_size == 0) {
+ iptfs_input_ordered(x, skb);
+ goto done;
+ }
+
+ /* Fetch list of in-order packets from the reordering window as well as
+ * a list of buffers we need to now free.
+ */
+ INIT_LIST_HEAD(&list);
+ INIT_LIST_HEAD(&freelist);
+
+ spin_lock(&xtfs->drop_lock);
+ iptfs_input_reorder(xtfs, skb, &list, &freelist);
+ spin_unlock(&xtfs->drop_lock);
+
+ list_for_each_entry_safe(skb, next, &list, list) {
+ skb_list_del_init(skb);
+ (void)iptfs_input_ordered(x, skb);
+ }
+
+ list_for_each_entry_safe(skb, next, &freelist, list) {
+ skb_list_del_init(skb);
+ kfree_skb(skb);
+ }
+done:
+ /* We always have dealt with the input SKB, either we are re-using it,
+ * or we have freed it. Return EINPROGRESS so that xfrm_input stops
+ * processing it.
+ */
+ return -EINPROGRESS;
+}
+
/* ================================= */
/* IPTFS Sending (ingress) Functions */
/* ================================= */
@@ -2010,6 +2451,7 @@ static int iptfs_user_init(struct net *net, struct xfrm_state *x,
struct xfrm_iptfs_config *xc;
xc = &xtfs->cfg;
+ xc->reorder_win_size = net->xfrm.sysctl_iptfs_reorder_window;
xc->max_queue_size = net->xfrm.sysctl_iptfs_max_qsize;
xtfs->init_delay_ns =
(u64)net->xfrm.sysctl_iptfs_init_delay * NSECS_IN_USEC;
@@ -2018,6 +2460,13 @@ static int iptfs_user_init(struct net *net, struct xfrm_state *x,
if (attrs[XFRMA_IPTFS_DONT_FRAG])
xc->dont_frag = true;
+ if (attrs[XFRMA_IPTFS_REORDER_WINDOW])
+ xc->reorder_win_size =
+ nla_get_u16(attrs[XFRMA_IPTFS_REORDER_WINDOW]);
+ /* saved array is for saving 1..N seq nums from wantseq */
+ if (xc->reorder_win_size)
+ xtfs->w_saved = kcalloc(xc->reorder_win_size,
+ sizeof(*xtfs->w_saved), GFP_KERNEL);
if (attrs[XFRMA_IPTFS_PKT_SIZE]) {
xc->pkt_size = nla_get_u32(attrs[XFRMA_IPTFS_PKT_SIZE]);
if (!xc->pkt_size) {
@@ -2054,7 +2503,7 @@ static unsigned int iptfs_sa_len(const struct xfrm_state *x)
if (xc->dont_frag)
l += nla_total_size(0);
- l += nla_total_size(sizeof(u16));
+ l += nla_total_size(sizeof(xc->reorder_win_size));
l += nla_total_size(sizeof(xc->pkt_size));
l += nla_total_size(sizeof(xc->max_queue_size));
l += nla_total_size(sizeof(u32)); /* drop time usec */
@@ -2075,7 +2524,7 @@ static int iptfs_copy_to_user(struct xfrm_state *x, struct sk_buff *skb)
if (ret)
return ret;
}
- ret = nla_put_u16(skb, XFRMA_IPTFS_REORDER_WINDOW, 0);
+ ret = nla_put_u16(skb, XFRMA_IPTFS_REORDER_WINDOW, xc->reorder_win_size);
if (ret)
return ret;
ret = nla_put_u32(skb, XFRMA_IPTFS_PKT_SIZE, xc->pkt_size);
@@ -2137,6 +2586,14 @@ static int iptfs_clone(struct xfrm_state *x, struct xfrm_state *orig)
return -ENOMEM;
xtfs->ra_newskb = NULL;
+ if (xtfs->cfg.reorder_win_size) {
+ xtfs->w_saved = kcalloc(xtfs->cfg.reorder_win_size,
+ sizeof(*xtfs->w_saved), GFP_KERNEL);
+ if (!xtfs->w_saved) {
+ kfree_sensitive(xtfs);
+ return -ENOMEM;
+ }
+ }
err = __iptfs_init_state(x, xtfs);
if (err)
@@ -2164,6 +2621,7 @@ static int iptfs_create_state(struct xfrm_state *x)
static void iptfs_delete_state(struct xfrm_state *x)
{
struct xfrm_iptfs_data *xtfs = x->mode_data;
+ struct skb_wseq *s, *se;
if (!xtfs)
return;
@@ -2176,6 +2634,11 @@ static void iptfs_delete_state(struct xfrm_state *x)
if (xtfs->ra_newskb)
kfree_skb(xtfs->ra_newskb);
+ for (s = xtfs->w_saved, se = s + xtfs->w_savedlen; s < se; s++)
+ if (s->skb)
+ kfree_skb(s->skb);
+
+ kfree_sensitive(xtfs->w_saved);
kfree_sensitive(xtfs);
module_put(x->mode_cbs->owner);
--
2.45.1
* Re: [PATCH ipsec-next v2 16/17] xfrm: iptfs: handle reordering of received packets
2024-05-20 21:42 ` [PATCH ipsec-next v2 16/17] xfrm: iptfs: handle reordering of received packets Christian Hopps
@ 2024-05-21 16:07 ` kernel test robot
0 siblings, 0 replies; 34+ messages in thread
From: kernel test robot @ 2024-05-21 16:07 UTC (permalink / raw)
To: Christian Hopps, devel
Cc: llvm, oe-kbuild-all, Steffen Klassert, netdev, Christian Hopps
Hi Christian,
kernel test robot noticed the following build warnings:
[auto build test WARNING on klassert-ipsec-next/master]
[cannot apply to klassert-ipsec/master netfilter-nf/main linus/master nf-next/master v6.9 next-20240521]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Christian-Hopps/include-uapi-add-ip_tfs_-_hdr-packet-formats/20240521-064324
base: https://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next.git master
patch link: https://lore.kernel.org/r/20240520214255.2590923-17-chopps%40chopps.org
patch subject: [PATCH ipsec-next v2 16/17] xfrm: iptfs: handle reordering of received packets
config: hexagon-allmodconfig (https://download.01.org/0day-ci/archive/20240521/202405212335.amQLFAic-lkp@intel.com/config)
compiler: clang version 19.0.0git (https://github.com/llvm/llvm-project fa9b1be45088dce1e4b602d451f118128b94237b)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240521/202405212335.amQLFAic-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202405212335.amQLFAic-lkp@intel.com/
All warnings (new ones prefixed by >>):
In file included from net/xfrm/xfrm_iptfs.c:11:
In file included from include/linux/icmpv6.h:5:
In file included from include/linux/skbuff.h:17:
In file included from include/linux/bvec.h:10:
In file included from include/linux/highmem.h:10:
In file included from include/linux/mm.h:2208:
include/linux/vmstat.h:522:36: warning: arithmetic between different enumeration types ('enum node_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
522 | return node_stat_name(NR_LRU_BASE + lru) + 3; // skip "nr_"
| ~~~~~~~~~~~ ^ ~~~
In file included from net/xfrm/xfrm_iptfs.c:11:
In file included from include/linux/icmpv6.h:5:
In file included from include/linux/skbuff.h:17:
In file included from include/linux/bvec.h:10:
In file included from include/linux/highmem.h:12:
In file included from include/linux/hardirq.h:11:
In file included from ./arch/hexagon/include/generated/asm/hardirq.h:1:
In file included from include/asm-generic/hardirq.h:17:
In file included from include/linux/irq.h:20:
In file included from include/linux/io.h:13:
In file included from arch/hexagon/include/asm/io.h:328:
include/asm-generic/io.h:547:31: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
547 | val = __raw_readb(PCI_IOBASE + addr);
| ~~~~~~~~~~ ^
include/asm-generic/io.h:560:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
560 | val = __le16_to_cpu((__le16 __force)__raw_readw(PCI_IOBASE + addr));
| ~~~~~~~~~~ ^
include/uapi/linux/byteorder/little_endian.h:37:51: note: expanded from macro '__le16_to_cpu'
37 | #define __le16_to_cpu(x) ((__force __u16)(__le16)(x))
| ^
In file included from net/xfrm/xfrm_iptfs.c:11:
In file included from include/linux/icmpv6.h:5:
In file included from include/linux/skbuff.h:17:
In file included from include/linux/bvec.h:10:
In file included from include/linux/highmem.h:12:
In file included from include/linux/hardirq.h:11:
In file included from ./arch/hexagon/include/generated/asm/hardirq.h:1:
In file included from include/asm-generic/hardirq.h:17:
In file included from include/linux/irq.h:20:
In file included from include/linux/io.h:13:
In file included from arch/hexagon/include/asm/io.h:328:
include/asm-generic/io.h:573:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
573 | val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
| ~~~~~~~~~~ ^
include/uapi/linux/byteorder/little_endian.h:35:51: note: expanded from macro '__le32_to_cpu'
35 | #define __le32_to_cpu(x) ((__force __u32)(__le32)(x))
| ^
In file included from net/xfrm/xfrm_iptfs.c:11:
In file included from include/linux/icmpv6.h:5:
In file included from include/linux/skbuff.h:17:
In file included from include/linux/bvec.h:10:
In file included from include/linux/highmem.h:12:
In file included from include/linux/hardirq.h:11:
In file included from ./arch/hexagon/include/generated/asm/hardirq.h:1:
In file included from include/asm-generic/hardirq.h:17:
In file included from include/linux/irq.h:20:
In file included from include/linux/io.h:13:
In file included from arch/hexagon/include/asm/io.h:328:
include/asm-generic/io.h:584:33: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
584 | __raw_writeb(value, PCI_IOBASE + addr);
| ~~~~~~~~~~ ^
include/asm-generic/io.h:594:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
594 | __raw_writew((u16 __force)cpu_to_le16(value), PCI_IOBASE + addr);
| ~~~~~~~~~~ ^
include/asm-generic/io.h:604:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
604 | __raw_writel((u32 __force)cpu_to_le32(value), PCI_IOBASE + addr);
| ~~~~~~~~~~ ^
>> net/xfrm/xfrm_iptfs.c:1371:29: warning: variable 's0seq' set but not used [-Wunused-but-set-variable]
1371 | u64 distance, extra_drops, s0seq;
| ^
net/xfrm/xfrm_iptfs.c:1903:6: warning: variable 'blkoff' set but not used [-Wunused-but-set-variable]
1903 | u32 blkoff = 0;
| ^
net/xfrm/xfrm_iptfs.c:2253:11: warning: variable 'settime' set but not used [-Wunused-but-set-variable]
2253 | time64_t settime;
| ^
10 warnings generated.
vim +/s0seq +1371 net/xfrm/xfrm_iptfs.c
1360
1361 static void __reorder_future_shifts(struct xfrm_iptfs_data *xtfs,
1362 struct sk_buff *inskb,
1363 struct list_head *list,
1364 struct list_head *freelist)
1365 {
1366 const u32 nslots = xtfs->cfg.reorder_win_size + 1;
1367 const u64 inseq = __esp_seq(inskb);
1368 u32 savedlen = xtfs->w_savedlen;
1369 u64 wantseq = xtfs->w_wantseq;
1370 struct sk_buff *slot0 = NULL;
> 1371 u64 distance, extra_drops, s0seq;
1372 struct skb_wseq *wnext;
1373 u32 beyond, shifting, slot;
1374
1375 BUG_ON(inseq <= wantseq);
1376 distance = inseq - wantseq;
1377 BUG_ON(distance <= nslots - 1);
1378 beyond = distance - (nslots - 1);
1379
1380 /* Handle future sequence number received.
1381 *
1382 * IMPORTANT: we are at least advancing w_wantseq (i.e., wantseq) by 1
1383 * b/c we are beyond the window boundary.
1384 *
1385 * We know we don't have the wantseq so that counts as a drop.
1386 */
1387
1388 /* ex: slot count is 4, array size is 3 savedlen is 2, slot 0 is the
1389 * missing sequence number.
1390 *
1391 * the final slot at savedlen (index savedlen - 1) is always occupied.
1392 *
1393 * beyond is "beyond array size" not savedlen.
1394 *
1395 * +--------- array length (savedlen == 2)
1396 * | +----- array size (nslots - 1 == 3)
1397 * | | +- window boundary (nslots == 4)
1398 * V V | V
1399 * |
1400 * 0 1 2 3 | slot number
1401 * --- 0 1 2 | array index
1402 * [b] [c] : :| array
1403 * |
1404 * "2" "3" "4" "5"|*6* seq numbers
1405 *
1406 * We receive seq number 6
1407 * distance == 4 [inseq(6) - w_wantseq(2)]
1408 * newslot == distance
1409 * index == 3 [distance(4) - 1]
1410 * beyond == 1 [newslot(4) - lastslot((nslots(4) - 1))]
1411 * shifting == 1 [min(savedlen(2), beyond(1)]
1412 * slot0_skb == [b], and should match w_wantseq
1413 *
1414 * +--- window boundary (nslots == 4)
1415 * 0 1 2 3 | 4 slot number
1416 * --- 0 1 2 | 3 array index
1417 * [b] : : : :| array
1418 * "2" "3" "4" "5" *6* seq numbers
1419 *
1420 * We receive seq number 6
1421 * distance == 4 [inseq(6) - w_wantseq(2)]
1422 * newslot == distance
1423 * index == 3 [distance(4) - 1]
1424 * beyond == 1 [newslot(4) - lastslot((nslots(4) - 1))]
1425 * shifting == 1 [min(savedlen(1), beyond(1)]
1426 * slot0_skb == [b] and should match w_wantseq
1427 *
1428 * +-- window boundary (nslots == 4)
1429 * 0 1 2 3 | 4 5 6 slot number
1430 * --- 0 1 2 | 3 4 5 array index
1431 * [-] [c] : :| array
1432 * "2" "3" "4" "5" "6" "7" *8* seq numbers
1433 *
1434 * savedlen = 2, beyond = 3
1435 * iter 1: slot0 == NULL, missed++, lastdrop = 2 (2+1-1), slot0 = [-]
1436 * iter 2: slot0 == NULL, missed++, lastdrop = 3 (2+2-1), slot0 = [c]
1437 * 2 < 3, extra = 1 (3-2), missed += extra, lastdrop = 4 (2+2+1-1)
1438 *
1439 * We receive seq number 8
1440 * distance == 6 [inseq(8) - w_wantseq(2)]
1441 * newslot == distance
1442 * index == 5 [distance(6) - 1]
1443 * beyond == 3 [newslot(6) - lastslot((nslots(4) - 1))]
1444 * shifting == 2 [min(savedlen(2), beyond(3)]
1445 *
1446 * slot0_skb == NULL changed from [b] when "savedlen < beyond" is true.
1447 */
1448
1449 /* Now send any packets that are being shifted out of saved, and account
1450 * for missing packets that are exiting the window as we shift it.
1451 */
1452
1453 /* If savedlen > beyond we are shifting some, else all. */
1454 shifting = min(savedlen, beyond);
1455
1456 /* slot0 is the buf that just shifted out and into slot0 */
1457 slot0 = NULL;
1458 s0seq = wantseq;
1459 wnext = xtfs->w_saved;
1460 for (slot = 1; slot <= shifting; slot++, wnext++) {
1461 /* handle what was in slot0 before we occupy it */
1462 if (slot0)
1463 list_add_tail(&slot0->list, list);
1464 s0seq++;
1465 slot0 = wnext->skb;
1466 wnext->skb = NULL;
1467 }
1468
1469 /* slot0 is now either NULL (in which case it's what we now are waiting
1470 * for, or a buf in which case we need to handle it like we received it;
1471 * however, we may be advancing past that buffer as well..
1472 */
1473
1474 /* Handle case where we need to shift more than we had saved, slot0 will
1475 * be NULL iff savedlen is 0, otherwise slot0 will always be
1476 * non-NULL b/c we shifted the final element, which is always set if
1477 * there is any saved, into slot0.
1478 */
1479 if (savedlen < beyond) {
1480 extra_drops = beyond - savedlen;
1481 if (savedlen == 0) {
1482 BUG_ON(slot0);
1483 s0seq += extra_drops;
1484 } else {
1485 extra_drops--; /* we aren't dropping what's in slot0 */
1486 BUG_ON(!slot0);
1487 list_add_tail(&slot0->list, list);
1488 s0seq += extra_drops + 1;
1489 }
1490 slot0 = NULL;
1491 /* slot0 has had an empty slot pushed into it */
1492 }
1493
1494 /* Remove the entries */
1495 __vec_shift(xtfs, beyond);
1496
1497 /* Advance want seq */
1498 xtfs->w_wantseq += beyond;
1499
1500 /* Process drops here when implementing congestion control */
1501
1502 /* We've shifted. plug the packet in at the end. */
1503 xtfs->w_savedlen = nslots - 1;
1504 xtfs->w_saved[xtfs->w_savedlen - 1].skb = inskb;
1505 iptfs_set_window_drop_times(xtfs, xtfs->w_savedlen - 1);
1506
1507 /* if we don't have a slot0 then we must wait for it */
1508 if (!slot0)
1509 return;
1510
1511 /* If slot0, seq must match new want seq */
1512 BUG_ON(xtfs->w_wantseq != __esp_seq(slot0));
1513
1514 /* slot0 is valid, treat like we received expected. */
1515 __reorder_this(xtfs, slot0, list);
1516 }
1517
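The window arithmetic at the top of the listing above can be checked in isolation. What follows is a minimal userspace sketch, not kernel code: `struct shift_plan`, `min_u32()` and `plan_future_shift()` are hypothetical names introduced here for illustration, and the two assert() calls only mirror the BUG_ON() checks in the listing. The sketch does not model s0seq, which the listing assigns but never reads (the variable the warning points at).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace model of the arithmetic performed at the top of
 * __reorder_future_shifts(). All names are illustrative assumptions.
 */
struct shift_plan {
	uint64_t distance;    /* inseq - wantseq */
	uint32_t beyond;      /* how far inseq lands past the last window slot */
	uint32_t shifting;    /* saved entries that actually shift out */
	uint64_t new_wantseq; /* wantseq after the window has advanced */
};

static inline uint32_t min_u32(uint32_t a, uint32_t b)
{
	return a < b ? a : b;
}

static inline struct shift_plan plan_future_shift(uint32_t reorder_win_size,
						  uint64_t wantseq,
						  uint64_t inseq,
						  uint32_t savedlen)
{
	const uint32_t nslots = reorder_win_size + 1;
	struct shift_plan p;

	/* Mirrors BUG_ON(inseq <= wantseq) in the listing. */
	assert(inseq > wantseq);
	p.distance = inseq - wantseq;

	/* Mirrors BUG_ON(distance <= nslots - 1): only "future" packets. */
	assert(p.distance > nslots - 1);
	p.beyond = (uint32_t)(p.distance - (nslots - 1));

	/* If savedlen > beyond only some entries shift, otherwise all. */
	p.shifting = min_u32(savedlen, p.beyond);

	/* The listing advances w_wantseq by beyond. */
	p.new_wantseq = wantseq + p.beyond;
	return p;
}
```

With a reorder window size of 3 (nslots == 4), wantseq 2 and savedlen 2, receiving seq 6 gives distance 4, beyond 1 and shifting 1; receiving seq 8 gives distance 6, beyond 3 and shifting 2. Both agree with the worked examples in the code comment.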
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 34+ messages in thread
* [PATCH ipsec-next v2 17/17] xfrm: iptfs: add tracepoint functionality
2024-05-20 21:42 [PATCH ipsec-next v1 0/8] Add IP-TFS mode to xfrm Christian Hopps
` (15 preceding siblings ...)
2024-05-20 21:42 ` [PATCH ipsec-next v2 16/17] xfrm: iptfs: handle reordering of received packets Christian Hopps
@ 2024-05-20 21:42 ` Christian Hopps
2024-05-23 19:29 ` [PATCH ipsec-next v1 0/8] Add IP-TFS mode to xfrm Antony Antony
17 siblings, 0 replies; 34+ messages in thread
From: Christian Hopps @ 2024-05-20 21:42 UTC (permalink / raw)
To: devel; +Cc: Steffen Klassert, netdev, Christian Hopps, Christian Hopps
From: Christian Hopps <chopps@labn.net>
Add tracepoints to the IP-TFS code.
Signed-off-by: Christian Hopps <chopps@labn.net>
---
net/xfrm/trace_iptfs.h | 218 +++++++++++++++++++++++++++++++++++++++++
net/xfrm/xfrm_iptfs.c | 60 ++++++++++++
2 files changed, 278 insertions(+)
create mode 100644 net/xfrm/trace_iptfs.h
diff --git a/net/xfrm/trace_iptfs.h b/net/xfrm/trace_iptfs.h
new file mode 100644
index 000000000000..0425f051572f
--- /dev/null
+++ b/net/xfrm/trace_iptfs.h
@@ -0,0 +1,218 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* xfrm_trace_iptfs.h
+ *
+ * August 12 2023, Christian Hopps <chopps@labn.net>
+ *
+ * Copyright (c) 2023, LabN Consulting, L.L.C.
+ */
+
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM iptfs
+
+#if !defined(_TRACE_IPTFS_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_IPTFS_H
+
+#include <linux/kernel.h>
+#include <linux/skbuff.h>
+#include <linux/tracepoint.h>
+#include <net/ip.h>
+
+struct xfrm_iptfs_data;
+
+TRACE_EVENT(iptfs_egress_recv,
+ TP_PROTO(struct sk_buff *skb, struct xfrm_iptfs_data *xtfs, u16 blkoff),
+ TP_ARGS(skb, xtfs, blkoff),
+ TP_STRUCT__entry(__field(struct sk_buff *, skb)
+ __field(void *, head)
+ __field(void *, head_pg_addr)
+ __field(void *, pg0addr)
+ __field(u32, skb_len)
+ __field(u32, data_len)
+ __field(u32, headroom)
+ __field(u32, tailroom)
+ __field(u32, tail)
+ __field(u32, end)
+ __field(u32, pg0off)
+ __field(u8, head_frag)
+ __field(u8, frag_list)
+ __field(u8, nr_frags)
+ __field(u16, blkoff)),
+ TP_fast_assign(__entry->skb = skb;
+ __entry->head = skb->head;
+ __entry->skb_len = skb->len;
+ __entry->data_len = skb->data_len;
+ __entry->headroom = skb_headroom(skb);
+ __entry->tailroom = skb_tailroom(skb);
+ __entry->tail = skb->tail;
+ __entry->end = skb->end;
+ __entry->head_frag = skb->head_frag;
+ __entry->frag_list = (bool)skb_shinfo(skb)->frag_list;
+ __entry->nr_frags = skb_shinfo(skb)->nr_frags;
+ __entry->blkoff = blkoff;
+ __entry->head_pg_addr = page_address(virt_to_head_page(skb->head));
+ __entry->pg0addr = (__entry->nr_frags
+ ? page_address(netmem_to_page(skb_shinfo(skb)->frags[0].netmem))
+ : NULL);
+ __entry->pg0off = (__entry->nr_frags
+ ? skb_shinfo(skb)->frags[0].offset
+ : 0);
+ ),
+ TP_printk("EGRESS: skb=%p len=%u data_len=%u headroom=%u head_frag=%u frag_list=%u nr_frags=%u blkoff=%u\n\t\ttailroom=%u tail=%u end=%u head=%p hdpgaddr=%p pg0->addr=%p pg0->data=%p pg0->off=%u",
+ __entry->skb, __entry->skb_len, __entry->data_len, __entry->headroom,
+ __entry->head_frag, __entry->frag_list, __entry->nr_frags, __entry->blkoff,
+ __entry->tailroom, __entry->tail, __entry->end, __entry->head,
+ __entry->head_pg_addr, __entry->pg0addr, __entry->pg0addr + __entry->pg0off,
+ __entry->pg0off)
+ )
+
+DECLARE_EVENT_CLASS(iptfs_ingress_preq_event,
+ TP_PROTO(struct sk_buff *skb, struct xfrm_iptfs_data *xtfs,
+ u32 pmtu, u8 was_gso),
+ TP_ARGS(skb, xtfs, pmtu, was_gso),
+ TP_STRUCT__entry(__field(struct sk_buff *, skb)
+ __field(u32, skb_len)
+ __field(u32, data_len)
+ __field(u32, pmtu)
+ __field(u32, queue_size)
+ __field(u32, proto_seq)
+ __field(u8, proto)
+ __field(u8, was_gso)
+ ),
+ TP_fast_assign(__entry->skb = skb;
+ __entry->skb_len = skb->len;
+ __entry->data_len = skb->data_len;
+ __entry->queue_size =
+ xtfs->cfg.max_queue_size - xtfs->queue_size;
+ __entry->proto = __trace_ip_proto(ip_hdr(skb));
+ __entry->proto_seq = __trace_ip_proto_seq(ip_hdr(skb));
+ __entry->pmtu = pmtu;
+ __entry->was_gso = was_gso;
+ ),
+ TP_printk("INGRPREQ: skb=%p len=%u data_len=%u qsize=%u proto=%u proto_seq=%u pmtu=%u was_gso=%u",
+ __entry->skb, __entry->skb_len, __entry->data_len,
+ __entry->queue_size, __entry->proto, __entry->proto_seq,
+ __entry->pmtu, __entry->was_gso));
+
+DEFINE_EVENT(iptfs_ingress_preq_event, iptfs_enqueue,
+ TP_PROTO(struct sk_buff *skb, struct xfrm_iptfs_data *xtfs, u32 pmtu, u8 was_gso),
+ TP_ARGS(skb, xtfs, pmtu, was_gso));
+
+DEFINE_EVENT(iptfs_ingress_preq_event, iptfs_no_queue_space,
+ TP_PROTO(struct sk_buff *skb, struct xfrm_iptfs_data *xtfs, u32 pmtu, u8 was_gso),
+ TP_ARGS(skb, xtfs, pmtu, was_gso));
+
+DEFINE_EVENT(iptfs_ingress_preq_event, iptfs_too_big,
+ TP_PROTO(struct sk_buff *skb, struct xfrm_iptfs_data *xtfs, u32 pmtu, u8 was_gso),
+ TP_ARGS(skb, xtfs, pmtu, was_gso));
+
+DECLARE_EVENT_CLASS(iptfs_ingress_postq_event,
+ TP_PROTO(struct sk_buff *skb, u32 mtu, u16 blkoff, struct iphdr *iph),
+ TP_ARGS(skb, mtu, blkoff, iph),
+ TP_STRUCT__entry(__field(struct sk_buff *, skb)
+ __field(u32, skb_len)
+ __field(u32, data_len)
+ __field(u32, mtu)
+ __field(u32, proto_seq)
+ __field(u16, blkoff)
+ __field(u8, proto)),
+ TP_fast_assign(__entry->skb = skb;
+ __entry->skb_len = skb->len;
+ __entry->data_len = skb->data_len;
+ __entry->mtu = mtu;
+ __entry->blkoff = blkoff;
+ __entry->proto = iph ? __trace_ip_proto(iph) : 0;
+ __entry->proto_seq = iph ? __trace_ip_proto_seq(iph) : 0;
+ ),
+ TP_printk("INGRPSTQ: skb=%p len=%u data_len=%u mtu=%u blkoff=%u proto=%u proto_seq=%u",
+ __entry->skb, __entry->skb_len, __entry->data_len, __entry->mtu,
+ __entry->blkoff, __entry->proto, __entry->proto_seq));
+
+DEFINE_EVENT(iptfs_ingress_postq_event, iptfs_first_dequeue,
+ TP_PROTO(struct sk_buff *skb, u32 mtu, u16 blkoff,
+ struct iphdr *iph),
+ TP_ARGS(skb, mtu, blkoff, iph));
+
+DEFINE_EVENT(iptfs_ingress_postq_event, iptfs_first_fragmenting,
+ TP_PROTO(struct sk_buff *skb, u32 mtu, u16 blkoff,
+ struct iphdr *iph),
+ TP_ARGS(skb, mtu, blkoff, iph));
+
+DEFINE_EVENT(iptfs_ingress_postq_event, iptfs_first_final_fragment,
+ TP_PROTO(struct sk_buff *skb, u32 mtu, u16 blkoff,
+ struct iphdr *iph),
+ TP_ARGS(skb, mtu, blkoff, iph));
+
+DEFINE_EVENT(iptfs_ingress_postq_event, iptfs_first_toobig,
+ TP_PROTO(struct sk_buff *skb, u32 mtu, u16 blkoff,
+ struct iphdr *iph),
+ TP_ARGS(skb, mtu, blkoff, iph));
+
+TRACE_EVENT(iptfs_ingress_nth_peek,
+ TP_PROTO(struct sk_buff *skb, u32 remaining),
+ TP_ARGS(skb, remaining),
+ TP_STRUCT__entry(__field(struct sk_buff *, skb)
+ __field(u32, skb_len)
+ __field(u32, remaining)),
+ TP_fast_assign(__entry->skb = skb;
+ __entry->skb_len = skb->len;
+ __entry->remaining = remaining;
+ ),
+ TP_printk("INGRPSTQ: NTHPEEK: skb=%p len=%u remaining=%u",
+ __entry->skb, __entry->skb_len, __entry->remaining));
+
+TRACE_EVENT(iptfs_ingress_nth_add, TP_PROTO(struct sk_buff *skb, u8 share_ok),
+ TP_ARGS(skb, share_ok),
+ TP_STRUCT__entry(__field(struct sk_buff *, skb)
+ __field(u32, skb_len)
+ __field(u32, data_len)
+ __field(u8, share_ok)
+ __field(u8, head_frag)
+ __field(u8, pp_recycle)
+ __field(u8, cloned)
+ __field(u8, shared)
+ __field(u8, nr_frags)
+ __field(u8, frag_list)
+ ),
+ TP_fast_assign(__entry->skb = skb;
+ __entry->skb_len = skb->len;
+ __entry->data_len = skb->data_len;
+ __entry->share_ok = share_ok;
+ __entry->head_frag = skb->head_frag;
+ __entry->pp_recycle = skb->pp_recycle;
+ __entry->cloned = skb_cloned(skb);
+ __entry->shared = skb_shared(skb);
+ __entry->nr_frags = skb_shinfo(skb)->nr_frags;
+ __entry->frag_list = (bool)skb_shinfo(skb)->frag_list;
+ ),
+ TP_printk("INGRPSTQ: NTHADD: skb=%p len=%u data_len=%u share_ok=%u head_frag=%u pp_recycle=%u cloned=%u shared=%u nr_frags=%u frag_list=%u",
+ __entry->skb, __entry->skb_len, __entry->data_len, __entry->share_ok,
+ __entry->head_frag, __entry->pp_recycle, __entry->cloned, __entry->shared,
+ __entry->nr_frags, __entry->frag_list));
+
+DECLARE_EVENT_CLASS(iptfs_timer_event,
+ TP_PROTO(struct xfrm_iptfs_data *xtfs, u64 time_val),
+ TP_ARGS(xtfs, time_val),
+ TP_STRUCT__entry(__field(u64, time_val)
+ __field(u64, set_time)),
+ TP_fast_assign(__entry->time_val = time_val;
+ __entry->set_time = xtfs->iptfs_settime;
+ ),
+ TP_printk("TIMER: set_time=%llu time_val=%llu",
+ __entry->set_time, __entry->time_val));
+
+DEFINE_EVENT(iptfs_timer_event, iptfs_timer_start,
+ TP_PROTO(struct xfrm_iptfs_data *xtfs, u64 time_val),
+ TP_ARGS(xtfs, time_val));
+
+DEFINE_EVENT(iptfs_timer_event, iptfs_timer_expire,
+ TP_PROTO(struct xfrm_iptfs_data *xtfs, u64 time_val),
+ TP_ARGS(xtfs, time_val));
+
+#endif /* _TRACE_IPTFS_H */
+
+/* This part must be outside protection */
+#undef TRACE_INCLUDE_PATH
+#define TRACE_INCLUDE_PATH ../../net/xfrm
+#undef TRACE_INCLUDE_FILE
+#define TRACE_INCLUDE_FILE trace_iptfs
+#include <trace/define_trace.h>
diff --git a/net/xfrm/xfrm_iptfs.c b/net/xfrm/xfrm_iptfs.c
index 12b59411fbf0..1173d5f9f3fd 100644
--- a/net/xfrm/xfrm_iptfs.c
+++ b/net/xfrm/xfrm_iptfs.c
@@ -18,6 +18,7 @@
#include <crypto/aead.h>
#include "xfrm_inout.h"
+#include "trace_iptfs.h"
/* IPTFS encap (header) values. */
#define IPTFS_SUBTYPE_BASIC 0
@@ -87,6 +88,39 @@ static enum hrtimer_restart iptfs_drop_timer(struct hrtimer *me);
/* Utility Functions */
/* ================= */
+static u32 __trace_ip_proto(struct iphdr *iph)
+{
+ if (iph->version == 4)
+ return iph->protocol;
+ return ((struct ipv6hdr *)iph)->nexthdr;
+}
+
+static u32 __trace_ip_proto_seq(struct iphdr *iph)
+{
+ void *nexthdr;
+ u32 protocol = 0;
+
+ if (iph->version == 4) {
+ nexthdr = (void *)(iph + 1);
+ protocol = iph->protocol;
+ } else if (iph->version == 6) {
+ nexthdr = (void *)(((struct ipv6hdr *)(iph)) + 1);
+ protocol = ((struct ipv6hdr *)(iph))->nexthdr;
+ }
+ switch (protocol) {
+ case IPPROTO_ICMP:
+ return ntohs(((struct icmphdr *)nexthdr)->un.echo.sequence);
+ case IPPROTO_ICMPV6:
+ return ntohs(((struct icmp6hdr *)nexthdr)->icmp6_sequence);
+ case IPPROTO_TCP:
+ return ntohl(((struct tcphdr *)nexthdr)->seq);
+ case IPPROTO_UDP:
+ return ntohs(((struct udphdr *)nexthdr)->source);
+ default:
+ return 0;
+ }
+}
+
static u64 __esp_seq(struct sk_buff *skb)
{
u64 seq = ntohl(XFRM_SKB_CB(skb)->seq.input.low);
@@ -402,6 +436,13 @@ static int skb_copy_bits_seq(struct skb_seq_state *st, int offset, void *to,
}
}
+/* ================================== */
+/* IPTFS Trace Event Definitions */
+/* ================================== */
+
+#define CREATE_TRACE_POINTS
+#include "trace_iptfs.h"
+
/* ================================== */
/* IPTFS Receiving (egress) Functions */
/* ================================== */
@@ -891,6 +932,8 @@ static int iptfs_input_ordered(struct xfrm_state *x, struct sk_buff *skb)
}
data = sizeof(*ipth);
+ trace_iptfs_egress_recv(skb, xtfs, be16_to_cpu(ipth->block_offset));
+
/* Set data past the basic header */
if (ipth->subtype == IPTFS_SUBTYPE_CC) {
/* Copy the rest of the CC header */
@@ -1791,6 +1834,7 @@ static int iptfs_output_collect(struct net *net, struct sock *sk,
*/
if (!ok) {
nospace:
+ trace_iptfs_no_queue_space(skb, xtfs, pmtu, was_gso);
XFRM_INC_STATS(dev_net(skb->dev),
LINUX_MIB_XFRMOUTNOQSPACE);
kfree_skb_reason(skb, SKB_DROP_REASON_FULL_RING);
@@ -1801,6 +1845,7 @@ static int iptfs_output_collect(struct net *net, struct sock *sk,
* enqueue.
*/
if (xtfs->cfg.dont_frag && iptfs_is_too_big(sk, skb, pmtu)) {
+ trace_iptfs_too_big(skb, xtfs, pmtu, was_gso);
kfree_skb_reason(skb, SKB_DROP_REASON_PKT_TOO_BIG);
continue;
}
@@ -1809,6 +1854,8 @@ static int iptfs_output_collect(struct net *net, struct sock *sk,
ok = iptfs_enqueue(xtfs, skb);
if (!ok)
goto nospace;
+
+ trace_iptfs_enqueue(skb, xtfs, pmtu, was_gso);
}
/* Start a delay timer if we don't have one yet */
@@ -1816,6 +1863,7 @@ static int iptfs_output_collect(struct net *net, struct sock *sk,
hrtimer_start(&xtfs->iptfs_timer, xtfs->init_delay_ns,
IPTFS_HRTIMER_MODE);
xtfs->iptfs_settime = ktime_get_raw_fast_ns();
+ trace_iptfs_timer_start(xtfs, xtfs->init_delay_ns);
}
spin_unlock_bh(&x->lock);
@@ -1913,6 +1961,7 @@ static int iptfs_copy_create_frags(struct sk_buff **skbp,
to_copy = skb->len - offset;
while (to_copy) {
/* Send all but last fragment to allow agg. append */
+ trace_iptfs_first_fragmenting(nskb, mtu, to_copy, NULL);
list_add_tail(&nskb->list, &sublist);
/* FUTURE: if the packet has an odd/non-aligning length we could
@@ -1938,6 +1987,8 @@ static int iptfs_copy_create_frags(struct sk_buff **skbp,
/* return last fragment that will be unsent (or NULL) */
*skbp = nskb;
+ if (nskb)
+ trace_iptfs_first_final_fragment(nskb, mtu, blkoff, NULL);
/* trim the original skb to MTU */
if (!err)
@@ -2042,6 +2093,8 @@ static int iptfs_first_skb(struct sk_buff **skbp, struct xfrm_iptfs_data *xtfs,
/* We've split these up before queuing */
BUG_ON(skb_is_gso(skb));
+ trace_iptfs_first_dequeue(skb, mtu, 0, ip_hdr(skb));
+
/* Simple case -- it fits. `mtu` accounted for all the overhead
* including the basic IPTFS header.
*/
@@ -2143,6 +2196,7 @@ static void iptfs_output_queued(struct xfrm_state *x, struct sk_buff_head *list)
XFRM_INC_STATS(dev_net(skb->dev),
LINUX_MIB_XFRMOUTERROR);
+ trace_iptfs_first_toobig(skb, mtu, 0, ip_hdr(skb));
kfree_skb_reason(skb, SKB_DROP_REASON_PKT_TOO_BIG);
continue;
}
@@ -2190,6 +2244,7 @@ static void iptfs_output_queued(struct xfrm_state *x, struct sk_buff_head *list)
* case.
*/
while ((skb2 = skb_peek(list))) {
+ trace_iptfs_ingress_nth_peek(skb2, remaining);
if (skb2->len > remaining)
break;
@@ -2227,6 +2282,8 @@ static void iptfs_output_queued(struct xfrm_state *x, struct sk_buff_head *list)
skb->len += skb2->len;
remaining -= skb2->len;
+ trace_iptfs_ingress_nth_add(skb2, share_ok);
+
if (share_ok) {
iptfs_consume_frags(skb, skb2);
} else {
@@ -2276,6 +2333,9 @@ static enum hrtimer_restart iptfs_delay_timer(struct hrtimer *me)
* already).
*/
+ trace_iptfs_timer_expire(
+ xtfs, (unsigned long long)(ktime_get_raw_fast_ns() - settime));
+
iptfs_output_queued(x, &list);
return HRTIMER_NORESTART;
--
2.45.1
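The __trace_ip_proto() helper in the patch above picks the inner protocol by looking at the IP version field. The same idea can be sketched in userspace on a raw header buffer. This is an illustration under stated assumptions, not the patch's code: `inner_ip_proto()` is a hypothetical name, it reads fixed byte offsets (9 for the IPv4 protocol field, 6 for the IPv6 next-header field) instead of using struct iphdr/ipv6hdr, and unlike the patch it rejects any version other than 4 or 6 rather than treating it as IPv6.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: return the L4 protocol number carried in a raw
 * IPv4 or IPv6 header, or -1 if the buffer is too short or the version
 * nibble is neither 4 nor 6.
 */
static inline int inner_ip_proto(const uint8_t *hdr, size_t len)
{
	assert(hdr != NULL);

	if (len < 1)
		return -1;

	/* The version is the high nibble of the first byte in both families. */
	switch (hdr[0] >> 4) {
	case 4:
		/* IPv4: protocol field at byte offset 9 of the 20-byte base header. */
		return len >= 20 ? hdr[9] : -1;
	case 6:
		/* IPv6: next-header field at byte offset 6 of the 40-byte header. */
		return len >= 40 ? hdr[6] : -1;
	default:
		return -1;
	}
}
```

The explicit length checks are a design choice of this sketch only; the helper in the patch takes a struct iphdr pointer and does no length validation of its own.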
^ permalink raw reply related [flat|nested] 34+ messages in thread
* Re: [PATCH ipsec-next v1 0/8] Add IP-TFS mode to xfrm
2024-05-20 21:42 [PATCH ipsec-next v1 0/8] Add IP-TFS mode to xfrm Christian Hopps
` (16 preceding siblings ...)
2024-05-20 21:42 ` [PATCH ipsec-next v2 17/17] xfrm: iptfs: add tracepoint functionality Christian Hopps
@ 2024-05-23 19:29 ` Antony Antony
2024-05-20 21:45 ` [PATCH ipsec-next v2 0/17] " Christian Hopps
17 siblings, 1 reply; 34+ messages in thread
From: Antony Antony @ 2024-05-23 19:29 UTC (permalink / raw)
To: Christian Hopps; +Cc: devel, Steffen Klassert, netdev, Christian Hopps
Hi Chris,
On Mon, May 20, 2024 at 05:42:38PM -0400, Christian Hopps via Devel wrote:
> From: Christian Hopps <chopps@labn.net>
>
> Summary of Changes
> ------------------
>
> This patchset adds a new xfrm mode implementing on-demand IP-TFS. IP-TFS
> (AggFrag encapsulation) has been standardized in RFC9347.
>
> Link: https://www.rfc-editor.org/rfc/rfc9347.txt
>
> This feature supports demand driven (i.e., non-constant send rate)
> IP-TFS to take advantage of the AGGFRAG ESP payload encapsulation. This
> payload type supports aggregation and fragmentation of the inner IP
> packet stream which in turn yields higher small-packet bandwidth as well
> as reducing MTU/PMTU issues. Congestion control is unimplemented as
> the send rate is demand driven rather than constant.
>
> In order to allow loading this functionality as a module, a set of
> callbacks xfrm_mode_cbs has been added to xfrm as well.
>
> Patchset Changes:
> -----------------
>
> 23 files changed, 3252 insertions(+), 19 deletions(-)
> Documentation/networking/xfrm_sysctl.rst | 30 +
> include/net/netns/xfrm.h | 6 +
> include/net/xfrm.h | 40 +
> include/uapi/linux/in.h | 2 +
> include/uapi/linux/ip.h | 16 +
> include/uapi/linux/ipsec.h | 3 +-
> include/uapi/linux/snmp.h | 3 +
> include/uapi/linux/xfrm.h | 9 +-
> net/ipv4/esp4.c | 3 +-
> net/ipv6/esp6.c | 3 +-
> net/netfilter/nft_xfrm.c | 3 +-
> net/xfrm/Makefile | 1 +
> net/xfrm/trace_iptfs.h | 218 +++
> net/xfrm/xfrm_compat.c | 10 +-
> net/xfrm/xfrm_device.c | 4 +-
> net/xfrm/xfrm_input.c | 14 +-
> net/xfrm/xfrm_iptfs.c | 2741 ++++++++++++++++++++++++++++++
> net/xfrm/xfrm_output.c | 6 +
> net/xfrm/xfrm_policy.c | 26 +-
> net/xfrm/xfrm_proc.c | 3 +
> net/xfrm/xfrm_state.c | 60 +
> net/xfrm/xfrm_sysctl.c | 38 +
> net/xfrm/xfrm_user.c | 32 +
>
> Patchset Structure:
> -------------------
>
> The first 8 commits are changes to the xfrm infrastructure to support
> the callbacks as well as more generic IP-TFS additions that may be used
> outside the actual IP-TFS implementation.
>
> - iptfs: config: add CONFIG_XFRM_IPTFS
> - iptfs: uapi: ip: add ip_tfs_*_hdr packet formats
> - iptfs: uapi: IPPROTO_AGGFRAG AGGFRAG in ESP
> - iptfs: sysctl: allow configuration of global default values
> - iptfs: netlink: add config (netlink) options
> - iptfs: xfrm: Add mode_cbs module functionality
> - iptfs: xfrm: add generic iptfs defines and functionality
>
> The last 9+1 commits constitute the IP-TFS implementation constructed in
> layers to make review easier. The first 9 commits all apply to a single
> file `net/xfrm/xfrm_iptfs.c`, the last commit adds a new tracepoint
> header file along with the use of these new tracepoint calls.
>
> - iptfs: impl: add new iptfs xfrm mode impl
> - iptfs: impl: add user packet (tunnel ingress) handling
> - iptfs: impl: share page fragments of inner packets
> - iptfs: impl: add fragmenting of larger than MTU user packets
> - iptfs: impl: add basic receive packet (tunnel egress) handling
> - iptfs: impl: handle received fragmented inner packets
> - iptfs: impl: add reusing received skb for the tunnel egress packet
> - iptfs: impl: add skb-fragment sharing code
> - iptfs: impl: handle reordering of received packets
> - iptfs: impl: add tracepoint functionality
>
> Patchset History:
> -----------------
>
> RFCv1 (11/10/2023)
>
> RFCv1 -> RFCv2 (11/12/2023)
>
> Updates based on feedback from Simon Horman, Antony,
> Michael Richardson, and kernel test robot.
>
> RFCv2 -> v1 (2/19/2024)
>
> Updates based on feedback from Sabrina Dubroca, kernel test robot
>
> v1 -> v2 (5/19/2024)
>
> Updates based on feedback from Sabrina Dubroca, Simon Horman, Antony.
>
> o Add handling of new netlink SA direction attribute (Antony).
> o Split single patch/commit of xfrm_iptfs.c (the actual IP-TFS impl)
> into 9+1 distinct layered functionality commits for aiding review.
> - xfrm: fix return check on clone() callback
> - xfrm: add sa_len() callback in xfrm_mode_cbs for copy to user
> - iptfs: remove unneeded skb free count variable
> - iptfs: remove unused variable and "breadcrumb" for future code.
> - iptfs: use do_div() to avoid "__udivdi3 missing" link failure.
> - iptfs: remove some BUG_ON() assertions questioned in review.
> --
I ran a couple of tests and hit a KASAN BUG.
I was sending large pings with an MTU of 1500.
north login: shed systemd-user-sessions.service - Permit User Sessions.
north login: [ 78.594770] ==================================================================
[ 78.595825] BUG: KASAN: null-ptr-deref in iptfs_output_collect+0x263/0x57b
[ 78.596658] Read of size 8 at addr 0000000000000108 by task ping/493
[ 78.597435] ng rpc-statd-notify.service - Notify NFS peers of a restart...
[ 78.597651] CPU: 0 PID: 493 Comm: ping Not tainted 6.9.0-rc2-00697-g489ca863e24f-dirty #11
[ 78.598645] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[ 78.599747] Call Trace:tty@ttyS2.service - Serial Getty on ttyS2.
[ 78.600070] <TASK>l-getty@ttyS3.service - Serial Getty on ttyS3.
[ 78.600354] dump_stack_lvl+0x2a/0x3bogin Prompts.
[ 78.600817] kasan_report+0x84/0xa6rvice - Hostname Service...
[ 78.601262] ? iptfs_output_collect+0x263/0x57bl server.
[ 78.601825] iptfs_output_collect+0x263/0x57bogin Management.
[ 78.602374] ip_send_skb+0x25/0x57vice - Notify NFS peers of a restart.
[ 78.602807] raw_sendmsg+0xee8/0x1011t - Multi-User System.
[ 78.603269] ? native_flush_tlb_one_user+0xd/0xe5e Service.
[ 78.603850] ? raw_hash_sk+0x21b/0x21b
[ 78.604331] ? kernel_init_pages+0x42/0x51
[ 78.604845] ? prep_new_page+0x44/0x51Re…line ext4 Metadata Check Snapshots.
[ 78.605318] ? get_page_from_freelist+0x72b/0x915 Interface.
[ 78.605903] ? signal_pending_state+0x77/0x77cord Runlevel Change in UTMP...
[ 78.606462] ? __might_resched+0x8a/0x240e - Record Runlevel Change in UTMP.
[ 78.606966] ? __might_sleep+0x25/0xa0
[ 78.607440] ? first_zones_zonelist+0x2c/0x43
[ 78.607985] ? __rcu_read_lock+0x2d/0x3a
[ 78.608479] ? __pte_offset_map+0x32/0xa4
[ 78.608979] ? __might_resched+0x8a/0x240
[ 78.609478] ? __might_sleep+0x25/0xa0
[ 78.609949] ? inet_send_prepare+0x54/0x54
[ 78.610464] ? sock_sendmsg_nosec+0x42/0x6c
[ 78.610984] sock_sendmsg_nosec+0x42/0x6c
[ 78.611485] __sys_sendto+0x15d/0x1cc
[ 78.611947] ? __x64_sys_getpeername+0x44/0x44
[ 78.612498] ? __handle_mm_fault+0x679/0xae4
[ 78.613033] ? find_vma+0x6b/0x8b
[ 78.613457] ? find_vma_intersection+0x8a/0x8a
[ 78.614006] ? __handle_irq_event_percpu+0x180/0x197
[ 78.614617] ? handle_mm_fault+0x38/0x154
[ 78.615114] ? handle_mm_fault+0xeb/0x154
[ 78.615620] ? preempt_latency_start+0x29/0x34
[ 78.616169] ? preempt_count_sub+0x14/0xb3
[ 78.616678] ? up_read+0x4b/0x5c
[ 78.617094] __x64_sys_sendto+0x76/0x82
[ 78.617577] do_syscall_64+0x6b/0xd7
[ 78.618043] entry_SYSCALL_64_after_hwframe+0x46/0x4e
[ 78.618667] RIP: 0033:0x7fed3de99a73
[ 78.619118] Code: 8b 15 a9 83 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 80 3d 71 0b 0d 00 00 41 89 ca 74 14 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 75 c3 0f 1f 40 00 55 48 83 ec 30 44 89 4c 24
[ 78.621291] RSP: 002b:00007ffff6bdf478 EFLAGS: 00000202 ORIG_RAX: 000000000000002c
[ 78.622205] RAX: ffffffffffffffda RBX: 000055c538159340 RCX: 00007fed3de99a73
[ 78.623056] RDX: 00000000000007d8 RSI: 000055c53815f3c0 RDI: 0000000000000003
[ 78.623908] RBP: 000055c53815f3c0 R08: 000055c53815b5c0 R09: 0000000000000010
[ 78.624765] R10: 0000000000000000 R11: 0000000000000202 R12: 00000000000007d8
[ 78.625619] R13: 00007ffff6be0b60 R14: 0000001d00000001 R15: 000055c53815c680
[ 78.626480] </TASK>
[ 78.626773] ==================================================================
[ 78.627656] Disabling lock debugging due to kernel taint
[ 78.628305] BUG: kernel NULL pointer dereference, address: 0000000000000108
[ 78.629136] #PF: supervisor read access in kernel mode
[ 78.629766] #PF: error_code(0x0000) - not-present page
[ 78.630402] PGD 0 P4D 0
[ 78.630739] Oops: 0000 [#1] PREEMPT DEBUG_PAGEALLOC KASAN
[ 78.631398] CPU: 0 PID: 493 Comm: ping Tainted: G B 6.9.0-rc2-00697-g489ca863e24f-dirty #11
[ 78.632548] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[ 78.633649] RIP: 0010:iptfs_output_collect+0x263/0x57b
[ 78.634283] Code: 73 70 0f 84 25 01 00 00 45 39 f4 0f 83 1c 01 00 00 48 8d 7b 10 e8 27 37 62 ff 4c 8b 73 10 49 8d be 08 01 00 00 e8 17 37 62 ff <4d> 8b b6 08 01 00 00 49 8d be b0 01 00 00 e8 04 37 62 ff 49 8b 86
[ 78.636444] RSP: 0018:ffffc90000d679c8 EFLAGS: 00010296
[ 78.637076] RAX: 0000000000000001 RBX: ffff888110ffbc80 RCX: fffffbfff07623ad
[ 78.637923] RDX: fffffbfff07623ad RSI: fffffbfff07623ad RDI: ffffffff83b11d60
[ 78.638792] RBP: ffff88810e3a1400 R08: 0000000000000008 R09: 0000000000000001
[ 78.639645] R10: ffffffff83b11d67 R11: fffffbfff07623ac R12: 00000000000005a2
[ 78.640498] R13: 0000000000000000 R14: 0000000000000000 R15: ffff88810e9a3401
[ 78.641359] FS: 00007fed3dbddc40(0000) GS:ffffffff82cb2000(0000) knlGS:0000000000000000
[ 78.642324] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 78.643022] CR2: 0000000000000108 CR3: 0000000110e84000 CR4: 0000000000350ef0
[ 78.643882] Call Trace:
[ 78.644204] <TASK>
[ 78.644487] ? __die_body+0x1a/0x56
[ 78.644929] ? page_fault_oops+0x45f/0x4cd
[ 78.645441] ? dump_pagetable+0x1db/0x1db
[ 78.645942] ? vprintk_emit+0x163/0x171
[ 78.646425] ? iptfs_output_collect+0x263/0x57b
[ 78.646986] ? _printk+0xb2/0xe1
[ 78.647401] ? find_first_fitting_seq+0x193/0x193
[ 78.647982] ? iptfs_output_collect+0x263/0x57b
[ 78.648541] ? do_user_addr_fault+0x14f/0x56c
[ 78.649084] ? exc_page_fault+0xa5/0xbe
[ 78.649566] ? asm_exc_page_fault+0x22/0x30
[ 78.650100] ? iptfs_output_collect+0x263/0x57b
[ 78.650660] ? iptfs_output_collect+0x263/0x57b
[ 78.651221] ip_send_skb+0x25/0x57
[ 78.651652] raw_sendmsg+0xee8/0x1011
[ 78.652113] ? native_flush_tlb_one_user+0xd/0xe5
[ 78.652693] ? raw_hash_sk+0x21b/0x21b
[ 78.653166] ? kernel_init_pages+0x42/0x51
[ 78.653683] ? prep_new_page+0x44/0x51
[ 78.654160] ? get_page_from_freelist+0x72b/0x915
[ 78.654739] ? signal_pending_state+0x77/0x77
[ 78.655284] ? __might_resched+0x8a/0x240
[ 78.655784] ? __might_sleep+0x25/0xa0
[ 78.656255] ? first_zones_zonelist+0x2c/0x43
[ 78.656798] ? __rcu_read_lock+0x2d/0x3a
[ 78.657289] ? __pte_offset_map+0x32/0xa4
[ 78.657788] ? __might_resched+0x8a/0x240
[ 78.658291] ? __might_sleep+0x25/0xa0
[ 78.658763] ? inet_send_prepare+0x54/0x54
[ 78.659272] ? sock_sendmsg_nosec+0x42/0x6c
[ 78.659791] sock_sendmsg_nosec+0x42/0x6c
[ 78.660293] __sys_sendto+0x15d/0x1cc
[ 78.660755] ? __x64_sys_getpeername+0x44/0x44
[ 78.661304] ? __handle_mm_fault+0x679/0xae4
[ 78.661838] ? find_vma+0x6b/0x8b
[ 78.662272] ? find_vma_intersection+0x8a/0x8a
[ 78.662828] ? __handle_irq_event_percpu+0x180/0x197
[ 78.663436] ? handle_mm_fault+0x38/0x154
[ 78.663935] ? handle_mm_fault+0xeb/0x154
[ 78.664435] ? preempt_latency_start+0x29/0x34
[ 78.664987] ? preempt_count_sub+0x14/0xb3
[ 78.665498] ? up_read+0x4b/0x5c
[ 78.665911] __x64_sys_sendto+0x76/0x82
[ 78.666398] do_syscall_64+0x6b/0xd7
[ 78.666849] entry_SYSCALL_64_after_hwframe+0x46/0x4e
[ 78.667466] RIP: 0033:0x7fed3de99a73
[ 78.667918] Code: 8b 15 a9 83 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 80 3d 71 0b 0d 00 00 41 89 ca 74 14 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 75 c3 0f 1f 40 00 55 48 83 ec 30 44 89 4c 24
[ 78.670097] RSP: 002b:00007ffff6bdf478 EFLAGS: 00000202 ORIG_RAX: 000000000000002c
[ 78.671002] RAX: ffffffffffffffda RBX: 000055c538159340 RCX: 00007fed3de99a73
[ 78.671858] RDX: 00000000000007d8 RSI: 000055c53815f3c0 RDI: 0000000000000003
[ 78.672708] RBP: 000055c53815f3c0 R08: 000055c53815b5c0 R09: 0000000000000010
[ 78.673564] R10: 0000000000000000 R11: 0000000000000202 R12: 00000000000007d8
[ 78.674430] R13: 00007ffff6be0b60 R14: 0000001d00000001 R15: 000055c53815c680
[ 78.675287] </TASK>
[ 78.675580] Modules linked in:
[ 78.675975] CR2: 0000000000000108
[ 78.676396] ---[ end trace 0000000000000000 ]---
[ 78.676966] RIP: 0010:iptfs_output_collect+0x263/0x57b
[ 78.677596] Code: 73 70 0f 84 25 01 00 00 45 39 f4 0f 83 1c 01 00 00 48 8d 7b 10 e8 27 37 62 ff 4c 8b 73 10 49 8d be 08 01 00 00 e8 17 37 62 ff <4d> 8b b6 08 01 00 00 49 8d be b0 01 00 00 e8 04 37 62 ff 49 8b 86
[ 78.679768] RSP: 0018:ffffc90000d679c8 EFLAGS: 00010296
[ 78.680410] RAX: 0000000000000001 RBX: ffff888110ffbc80 RCX: fffffbfff07623ad
[ 78.681264] RDX: fffffbfff07623ad RSI: fffffbfff07623ad RDI: ffffffff83b11d60
[ 78.682136] RBP: ffff88810e3a1400 R08: 0000000000000008 R09: 0000000000000001
[ 78.682997] R10: ffffffff83b11d67 R11: fffffbfff07623ac R12: 00000000000005a2
[ 78.683853] R13: 0000000000000000 R14: 0000000000000000 R15: ffff88810e9a3401
[ 78.684710] FS: 00007fed3dbddc40(0000) GS:ffffffff82cb2000(0000) knlGS:0000000000000000
[ 78.685675] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 78.686387] CR2: 0000000000000108 CR3: 0000000110e84000 CR4: 0000000000350ef0
[ 78.687246] Kernel panic - not syncing: Fatal exception in interrupt
[ 78.688014] Kernel Offset: disabled
[ 78.688460] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
ping -s 2000 -n -q -W 1 -c 2 -I 192.0.3.254 192.0.2.254
(gdb) list *iptfs_output_collect+0x263
0xffffffff81d5076f is in iptfs_output_collect (./include/net/net_namespace.h:383).
378 }
379
380 static inline struct net *read_pnet(const possible_net_t *pnet)
381 {
382 #ifdef CONFIG_NET_NS
383 return rcu_dereference_protected(pnet->net, true);
384 #else
385 return &init_net;
386 #endif
387 }
I suspect the actual crash is at line 1756 instead:
(gdb) list *iptfs_output_collect+0x256
0xffffffff81d50762 is in iptfs_output_collect (net/xfrm/xfrm_iptfs.c:1756).
1751 return 0;
1752
1753 /* We only send ICMP too big if the user has configured us as
1754 * dont-fragment.
1755 */
1756 XFRM_INC_STATS(dev_net(skb->dev), LINUX_MIB_XFRMOUTERROR);
1757
1758 if (sk) {
1759 xfrm_local_error(skb, pmtu);
1760 } else if (ip_hdr(skb)->version == 4) {
Later I ran iptfs_is_too_big under gdb: it is called twice, and the second
call crashes.
Here is the gdb backtrace from just before the crash:
#0 iptfs_is_too_big (pmtu=1442, skb=0xffff88810dbea3c0, sk=0xffff888104d4ed40) at net/xfrm/xfrm_iptfs.c:1756
#1 iptfs_output_collect (net=<optimized out>, sk=0xffff888104d4ed40, skb=0xffff88810dbea3c0) at net/xfrm/xfrm_iptfs.c:1847
#2 0xffffffff81c8a3cb in ip_send_skb (net=0xffffffff83e57f20 <init_net>, skb=0xffff88810dbea3c0)
at net/ipv4/ip_output.c:1492
#3 0xffffffff81c8a439 in ip_push_pending_frames (sk=sk@entry=0xffff888104d4ed40, fl4=fl4@entry=0xffffc90000e3fb90)
at net/ipv4/ip_output.c:1512
#4 0xffffffff81ccf3cf in raw_sendmsg (sk=0xffff888104d4ed40, msg=0xffffc90000e3fd80, len=<optimized out>)
at net/ipv4/raw.c:654
#5 0xffffffff81b096ea in sock_sendmsg_nosec (sock=sock@entry=0xffff888115136040, msg=msg@entry=0xffffc90000e3fd80)
at net/socket.c:730
#6 0xffffffff81b0c327 in __sock_sendmsg (msg=0xffffc90000e3fd80, sock=0xffff888115136040) at net/socket.c:745
#7 __sys_sendto (fd=<optimized out>, buff=buff@entry=0x558edefb73c0, len=len@entry=2008, flags=flags@entry=0,
addr=addr@entry=0x558edefb35c0, addr_len=addr_len@entry=16) at net/socket.c:2191
#8 0xffffffff81b0c40c in __do_sys_sendto (addr_len=16, addr=0x558edefb35c0, flags=0, len=2008, buff=0x558edefb73c0,
fd=<optimized out>) at net/socket.c:2203
#9 __se_sys_sendto (addr_len=16, addr=94072114722240, flags=0, len=2008, buff=94072114738112, fd=<optimized out>)
at net/socket.c:2199
(gdb) list
1751 return 0;
1752
1753 /* We only send ICMP too big if the user has configured us as
1754 * dont-fragment.
1755 */
1756 XFRM_INC_STATS(dev_net(skb->dev), LINUX_MIB_XFRMOUTERROR);
1757
1758 if (sk) {
1759 xfrm_local_error(skb, pmtu);
1760 } else if (ip_hdr(skb)->version == 4) {
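The register dump above is consistent with skb->dev being NULL at line 1756:
R14 is zero, and the faulting address 0x108 would then simply be the offset
that dev_net() reads from. What follows is a minimal user-space sketch of a
guarded lookup, not the posted code. The mock_* types and pick_net() are
hypothetical names used only for illustration, and which namespace the real
code should fall back to is an assumption left to the patch author.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real layouts. */
struct mock_net { int id; };
struct mock_net_device { struct mock_net *nd_net; };
struct mock_skb { struct mock_net_device *dev; };

/*
 * Return the namespace to charge a statistic to. In the situation
 * modeled here the skb has no device attached yet, so fall back to a
 * namespace the caller already holds instead of dereferencing NULL.
 */
struct mock_net *pick_net(const struct mock_skb *skb,
                          struct mock_net *fallback)
{
        assert(skb != NULL && fallback != NULL);
        if (skb->dev != NULL && skb->dev->nd_net != NULL)
                return skb->dev->nd_net;
        return fallback;
}
```

With a device attached, pick_net() returns that device's namespace; with
dev == NULL, as the dump suggests for the ping case, it returns the fallback
instead of faulting.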
-antony
* [PATCH ipsec-next v2 0/17] Add IP-TFS mode to xfrm
@ 2024-05-20 21:45 ` Christian Hopps
2024-05-23 23:04 ` Christian Hopps
0 siblings, 1 reply; 34+ messages in thread
From: Christian Hopps @ 2024-05-20 21:45 UTC (permalink / raw)
To: devel; +Cc: Steffen Klassert, netdev, Christian Hopps, Christian Hopps
From: Christian Hopps <chopps@labn.net>
Summary of Changes
------------------
This patchset adds a new xfrm mode implementing on-demand IP-TFS. IP-TFS
(AggFrag encapsulation) has been standardized in RFC9347.
Link: https://www.rfc-editor.org/rfc/rfc9347.txt
This feature supports demand driven (i.e., non-constant send rate)
IP-TFS to take advantage of the AGGFRAG ESP payload encapsulation. This
payload type supports aggregation and fragmentation of the inner IP
packet stream which in turn yields higher small-packet bandwidth as well
as reducing MTU/PMTU issues. Congestion control is unimplemented as
the send rate is demand driven rather than constant.
In order to allow loading this functionality as a module, a set of
callbacks, xfrm_mode_cbs, has been added to xfrm as well.
Patchset Changes:
-----------------
23 files changed, 3252 insertions(+), 19 deletions(-)
Documentation/networking/xfrm_sysctl.rst | 30 +
include/net/netns/xfrm.h | 6 +
include/net/xfrm.h | 40 +
include/uapi/linux/in.h | 2 +
include/uapi/linux/ip.h | 16 +
include/uapi/linux/ipsec.h | 3 +-
include/uapi/linux/snmp.h | 3 +
include/uapi/linux/xfrm.h | 9 +-
net/ipv4/esp4.c | 3 +-
net/ipv6/esp6.c | 3 +-
net/netfilter/nft_xfrm.c | 3 +-
net/xfrm/Makefile | 1 +
net/xfrm/trace_iptfs.h | 218 +++
net/xfrm/xfrm_compat.c | 10 +-
net/xfrm/xfrm_device.c | 4 +-
net/xfrm/xfrm_input.c | 14 +-
net/xfrm/xfrm_iptfs.c | 2741 ++++++++++++++++++++++++++++++
net/xfrm/xfrm_output.c | 6 +
net/xfrm/xfrm_policy.c | 26 +-
net/xfrm/xfrm_proc.c | 3 +
net/xfrm/xfrm_state.c | 60 +
net/xfrm/xfrm_sysctl.c | 38 +
net/xfrm/xfrm_user.c | 32 +
Patchset Structure:
-------------------
The first 7 commits are changes to the xfrm infrastructure to support
the callbacks as well as more generic IP-TFS additions that may be used
outside the actual IP-TFS implementation.
- iptfs: config: add CONFIG_XFRM_IPTFS
- iptfs: uapi: ip: add ip_tfs_*_hdr packet formats
- iptfs: uapi: IPPROTO_AGGFRAG AGGFRAG in ESP
- iptfs: sysctl: allow configuration of global default values
- iptfs: netlink: add config (netlink) options
- iptfs: xfrm: Add mode_cbs module functionality
- iptfs: xfrm: add generic iptfs defines and functionality
The last 9+1 commits constitute the IP-TFS implementation constructed in
layers to make review easier. The first 9 commits all apply to a single
file `net/xfrm/xfrm_iptfs.c`, the last commit adds a new tracepoint
header file along with the use of these new tracepoint calls.
- iptfs: impl: add new iptfs xfrm mode impl
- iptfs: impl: add user packet (tunnel ingress) handling
- iptfs: impl: share page fragments of inner packets
- iptfs: impl: add fragmenting of larger than MTU user packets
- iptfs: impl: add basic receive packet (tunnel egress) handling
- iptfs: impl: handle received fragmented inner packets
- iptfs: impl: add reusing received skb for the tunnel egress packet
- iptfs: impl: add skb-fragment sharing code
- iptfs: impl: handle reordering of received packets
- iptfs: impl: add tracepoint functionality
Patchset History:
-----------------
RFCv1 (11/10/2023)
RFCv1 -> RFCv2 (11/12/2023)
Updates based on feedback from Simon Horman, Antony,
Michael Richardson, and kernel test robot.
RFCv2 -> v1 (2/19/2024)
Updates based on feedback from Sabrina Dubroca, kernel test robot
v1 -> v2 (5/19/2024)
Updates based on feedback from Sabrina Dubroca, Simon Horman, Antony.
o Add handling of new netlink SA direction attribute (Antony).
o Split single patch/commit of xfrm_iptfs.c (the actual IP-TFS impl)
into 9+1 distinct layered functionality commits for aiding review.
- xfrm: fix return check on clone() callback
- xfrm: add sa_len() callback in xfrm_mode_cbs for copy to user
- iptfs: remove unneeded skb free count variable
- iptfs: remove unused variable and "breadcrumb" for future code.
- iptfs: use do_div() to avoid "__udivdi3 missing" link failure.
- iptfs: remove some BUG_ON() assertions questioned in review.
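For context on the do_div() item above: on 32-bit builds a plain `/` on a
64-bit value can make the compiler emit a call to libgcc's __udivdi3, which
the kernel does not link. The sketch below is a user-space model of the
do_div() contract only (quotient stored in place, remainder returned).
model_do_div() is a hypothetical name; the real do_div() is a macro that
takes the dividend as an lvalue rather than a pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * User-space model of the kernel's do_div(): divide *n by a 32-bit
 * base in place and return the remainder.
 */
uint32_t model_do_div(uint64_t *n, uint32_t base)
{
        uint32_t rem;

        assert(n != NULL && base != 0);
        rem = (uint32_t)(*n % base);
        *n /= base;
        return rem;
}
```

Calling it on 10000000007 with base 1000 leaves 10000000 in the dividend and
returns 7.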
* Re: [PATCH ipsec-next v2 0/17] Add IP-TFS mode to xfrm
2024-05-20 21:45 ` [PATCH ipsec-next v2 0/17] " Christian Hopps
@ 2024-05-23 23:04 ` Christian Hopps
2024-05-24 11:52 ` Antony Antony
0 siblings, 1 reply; 34+ messages in thread
From: Christian Hopps @ 2024-05-23 23:04 UTC (permalink / raw)
To: Antony Antony
Cc: Christian Hopps, devel, Steffen Klassert, netdev, Christian Hopps
Could you let me know some more details about this test? What is your interface config / topology? I tried to guess from the ping command, but it's not reproducing for me.
Thanks,
Chris.
PS: I've changed the Subject and In-Reply-To to be based on the corrected cover letter; I initially sent the cover letter with the wrong subject. :(
Antony Antony <antony@phenome.org> writes:
> Hi Chris,
>
> On Mon, May 20, 2024 at 05:42:38PM -0400, Christian Hopps via Devel wrote:
>> From: Christian Hopps <chopps@labn.net>
>> - iptfs: remove some BUG_ON() assertions questioned in review.
...
> I ran a couple of tests and hit a KASAN BUG.
>
> I was sending large pings while the MTU is 1500.
>
> [ 78.594770] ==================================================================
> [ 78.595825] BUG: KASAN: null-ptr-deref in iptfs_output_collect+0x263/0x57b
> [ 78.596658] Read of size 8 at addr 0000000000000108 by task ping/493
> [ 78.597435]
> [ 78.597651] CPU: 0 PID: 493 Comm: ping Not tainted 6.9.0-rc2-00697-g489ca863e24f-dirty #11
> [ 78.598645] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
> [ 78.599747] Call Trace:
> [ 78.600070] <TASK>
> [ 78.600354] dump_stack_lvl+0x2a/0x3b
> [ 78.600817] kasan_report+0x84/0xa6
> [ 78.601262] ? iptfs_output_collect+0x263/0x57b
> [ 78.601825] iptfs_output_collect+0x263/0x57b
> [ 78.602374] ip_send_skb+0x25/0x57
> [ 78.602807] raw_sendmsg+0xee8/0x1011
> [ 78.603269] ? native_flush_tlb_one_user+0xd/0xe5
> [ 78.603850] ? raw_hash_sk+0x21b/0x21b
> [ 78.604331] ? kernel_init_pages+0x42/0x51
> [ 78.604845] ? prep_new_page+0x44/0x51
> [ 78.605318] ? get_page_from_freelist+0x72b/0x915
> [ 78.605903] ? signal_pending_state+0x77/0x77
> [ 78.606462] ? __might_resched+0x8a/0x240
> [ 78.606966] ? __might_sleep+0x25/0xa0
> [ 78.607440] ? first_zones_zonelist+0x2c/0x43
> [ 78.607985] ? __rcu_read_lock+0x2d/0x3a
> [ 78.608479] ? __pte_offset_map+0x32/0xa4
> [ 78.608979] ? __might_resched+0x8a/0x240
> [ 78.609478] ? __might_sleep+0x25/0xa0
> [ 78.609949] ? inet_send_prepare+0x54/0x54
> [ 78.610464] ? sock_sendmsg_nosec+0x42/0x6c
> [ 78.610984] sock_sendmsg_nosec+0x42/0x6c
> [ 78.611485] __sys_sendto+0x15d/0x1cc
> [ 78.611947] ? __x64_sys_getpeername+0x44/0x44
> [ 78.612498] ? __handle_mm_fault+0x679/0xae4
> [ 78.613033] ? find_vma+0x6b/0x8b
> [ 78.613457] ? find_vma_intersection+0x8a/0x8a
> [ 78.614006] ? __handle_irq_event_percpu+0x180/0x197
> [ 78.614617] ? handle_mm_fault+0x38/0x154
> [ 78.615114] ? handle_mm_fault+0xeb/0x154
> [ 78.615620] ? preempt_latency_start+0x29/0x34
> [ 78.616169] ? preempt_count_sub+0x14/0xb3
> [ 78.616678] ? up_read+0x4b/0x5c
> [ 78.617094] __x64_sys_sendto+0x76/0x82
> [ 78.617577] do_syscall_64+0x6b/0xd7
> [ 78.618043] entry_SYSCALL_64_after_hwframe+0x46/0x4e
> [ 78.618667] RIP: 0033:0x7fed3de99a73
> [ 78.619118] Code: 8b 15 a9 83 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b8
> 0f 1f 00 80 3d 71 0b 0d 00 00 41 89 ca 74 14 b8 2c 00 00 00 0f 05 <48> 3d 00 f0
> ff ff 77 75 c3 0f 1f 40 00 55 48 83 ec 30 44 89 4c 24
> [ 78.621291] RSP: 002b:00007ffff6bdf478 EFLAGS: 00000202 ORIG_RAX: 000000000000002c
> [ 78.622205] RAX: ffffffffffffffda RBX: 000055c538159340 RCX: 00007fed3de99a73
> [ 78.623056] RDX: 00000000000007d8 RSI: 000055c53815f3c0 RDI: 0000000000000003
> [ 78.623908] RBP: 000055c53815f3c0 R08: 000055c53815b5c0 R09: 0000000000000010
> [ 78.624765] R10: 0000000000000000 R11: 0000000000000202 R12: 00000000000007d8
> [ 78.625619] R13: 00007ffff6be0b60 R14: 0000001d00000001 R15: 000055c53815c680
> [ 78.626480] </TASK>
> [ 78.626773] ==================================================================
> [ 78.627656] Disabling lock debugging due to kernel taint
> [ 78.628305] BUG: kernel NULL pointer dereference, address: 0000000000000108
> [ 78.629136] #PF: supervisor read access in kernel mode
> [ 78.629766] #PF: error_code(0x0000) - not-present page
> [ 78.630402] PGD 0 P4D 0
> [ 78.630739] Oops: 0000 [#1] PREEMPT DEBUG_PAGEALLOC KASAN
> [ 78.631398] CPU: 0 PID: 493 Comm: ping Tainted: G B 6.9.0-rc2-00697-g489ca863e24f-dirty #11
> [ 78.632548] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
> [ 78.633649] RIP: 0010:iptfs_output_collect+0x263/0x57b
> [ 78.634283] Code: 73 70 0f 84 25 01 00 00 45 39 f4 0f 83 1c 01 00 00 48 8d 7b
> 10 e8 27 37 62 ff 4c 8b 73 10 49 8d be 08 01 00 00 e8 17 37 62 ff <4d> 8b b6 08
> 01 00 00 49 8d be b0 01 00 00 e8 04 37 62 ff 49 8b 86
> [ 78.636444] RSP: 0018:ffffc90000d679c8 EFLAGS: 00010296
> [ 78.637076] RAX: 0000000000000001 RBX: ffff888110ffbc80 RCX: fffffbfff07623ad
> [ 78.637923] RDX: fffffbfff07623ad RSI: fffffbfff07623ad RDI: ffffffff83b11d60
> [ 78.638792] RBP: ffff88810e3a1400 R08: 0000000000000008 R09: 0000000000000001
> [ 78.639645] R10: ffffffff83b11d67 R11: fffffbfff07623ac R12: 00000000000005a2
> [ 78.640498] R13: 0000000000000000 R14: 0000000000000000 R15: ffff88810e9a3401
> [ 78.641359] FS: 00007fed3dbddc40(0000) GS:ffffffff82cb2000(0000) knlGS:0000000000000000
> [ 78.642324] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 78.643022] CR2: 0000000000000108 CR3: 0000000110e84000 CR4: 0000000000350ef0
> [ 78.643882] Call Trace:
> [ 78.644204] <TASK>
> [ 78.644487] ? __die_body+0x1a/0x56
> [ 78.644929] ? page_fault_oops+0x45f/0x4cd
> [ 78.645441] ? dump_pagetable+0x1db/0x1db
> [ 78.645942] ? vprintk_emit+0x163/0x171
> [ 78.646425] ? iptfs_output_collect+0x263/0x57b
> [ 78.646986] ? _printk+0xb2/0xe1
> [ 78.647401] ? find_first_fitting_seq+0x193/0x193
> [ 78.647982] ? iptfs_output_collect+0x263/0x57b
> [ 78.648541] ? do_user_addr_fault+0x14f/0x56c
> [ 78.649084] ? exc_page_fault+0xa5/0xbe
> [ 78.649566] ? asm_exc_page_fault+0x22/0x30
> [ 78.650100] ? iptfs_output_collect+0x263/0x57b
> [ 78.650660] ? iptfs_output_collect+0x263/0x57b
> [ 78.651221] ip_send_skb+0x25/0x57
> [ 78.651652] raw_sendmsg+0xee8/0x1011
> [ 78.652113] ? native_flush_tlb_one_user+0xd/0xe5
> [ 78.652693] ? raw_hash_sk+0x21b/0x21b
> [ 78.653166] ? kernel_init_pages+0x42/0x51
> [ 78.653683] ? prep_new_page+0x44/0x51
> [ 78.654160] ? get_page_from_freelist+0x72b/0x915
> [ 78.654739] ? signal_pending_state+0x77/0x77
> [ 78.655284] ? __might_resched+0x8a/0x240
> [ 78.655784] ? __might_sleep+0x25/0xa0
> [ 78.656255] ? first_zones_zonelist+0x2c/0x43
> [ 78.656798] ? __rcu_read_lock+0x2d/0x3a
> [ 78.657289] ? __pte_offset_map+0x32/0xa4
> [ 78.657788] ? __might_resched+0x8a/0x240
> [ 78.658291] ? __might_sleep+0x25/0xa0
> [ 78.658763] ? inet_send_prepare+0x54/0x54
> [ 78.659272] ? sock_sendmsg_nosec+0x42/0x6c
> [ 78.659791] sock_sendmsg_nosec+0x42/0x6c
> [ 78.660293] __sys_sendto+0x15d/0x1cc
> [ 78.660755] ? __x64_sys_getpeername+0x44/0x44
> [ 78.661304] ? __handle_mm_fault+0x679/0xae4
> [ 78.661838] ? find_vma+0x6b/0x8b
> [ 78.662272] ? find_vma_intersection+0x8a/0x8a
> [ 78.662828] ? __handle_irq_event_percpu+0x180/0x197
> [ 78.663436] ? handle_mm_fault+0x38/0x154
> [ 78.663935] ? handle_mm_fault+0xeb/0x154
> [ 78.664435] ? preempt_latency_start+0x29/0x34
> [ 78.664987] ? preempt_count_sub+0x14/0xb3
> [ 78.665498] ? up_read+0x4b/0x5c
> [ 78.665911] __x64_sys_sendto+0x76/0x82
> [ 78.666398] do_syscall_64+0x6b/0xd7
> [ 78.666849] entry_SYSCALL_64_after_hwframe+0x46/0x4e
> [ 78.667466] RIP: 0033:0x7fed3de99a73
> [ 78.667918] Code: 8b 15 a9 83 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b8
> 0f 1f 00 80 3d 71 0b 0d 00 00 41 89 ca 74 14 b8 2c 00 00 00 0f 05 <48> 3d 00 f0
> ff ff 77 75 c3 0f 1f 40 00 55 48 83 ec 30 44 89 4c 24
> [ 78.670097] RSP: 002b:00007ffff6bdf478 EFLAGS: 00000202 ORIG_RAX: 000000000000002c
> [ 78.671002] RAX: ffffffffffffffda RBX: 000055c538159340 RCX: 00007fed3de99a73
> [ 78.671858] RDX: 00000000000007d8 RSI: 000055c53815f3c0 RDI: 0000000000000003
> [ 78.672708] RBP: 000055c53815f3c0 R08: 000055c53815b5c0 R09: 0000000000000010
> [ 78.673564] R10: 0000000000000000 R11: 0000000000000202 R12: 00000000000007d8
> [ 78.674430] R13: 00007ffff6be0b60 R14: 0000001d00000001 R15: 000055c53815c680
> [ 78.675287] </TASK>
> [ 78.675580] Modules linked in:
> [ 78.675975] CR2: 0000000000000108
> [ 78.676396] ---[ end trace 0000000000000000 ]---
> [ 78.676966] RIP: 0010:iptfs_output_collect+0x263/0x57b
> [ 78.677596] Code: 73 70 0f 84 25 01 00 00 45 39 f4 0f 83 1c 01 00 00 48 8d 7b
> 10 e8 27 37 62 ff 4c 8b 73 10 49 8d be 08 01 00 00 e8 17 37 62 ff <4d> 8b b6 08
> 01 00 00 49 8d be b0 01 00 00 e8 04 37 62 ff 49 8b 86
> [ 78.679768] RSP: 0018:ffffc90000d679c8 EFLAGS: 00010296
> [ 78.680410] RAX: 0000000000000001 RBX: ffff888110ffbc80 RCX: fffffbfff07623ad
> [ 78.681264] RDX: fffffbfff07623ad RSI: fffffbfff07623ad RDI: ffffffff83b11d60
> [ 78.682136] RBP: ffff88810e3a1400 R08: 0000000000000008 R09: 0000000000000001
> [ 78.682997] R10: ffffffff83b11d67 R11: fffffbfff07623ac R12: 00000000000005a2
> [ 78.683853] R13: 0000000000000000 R14: 0000000000000000 R15: ffff88810e9a3401
> [ 78.684710] FS: 00007fed3dbddc40(0000) GS:ffffffff82cb2000(0000) knlGS:0000000000000000
> [ 78.685675] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 78.686387] CR2: 0000000000000108 CR3: 0000000110e84000 CR4: 0000000000350ef0
> [ 78.687246] Kernel panic - not syncing: Fatal exception in interrupt
> [ 78.688014] Kernel Offset: disabled
> [ 78.688460] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
>
> ping -s 2000 -n -q -W 1 -c 2 -I 192.0.3.254 192.0.2.254
>
> (gdb) list *iptfs_output_collect+0x263
> 0xffffffff81d5076f is in iptfs_output_collect (./include/net/net_namespace.h:383).
> 378 }
> 379
> 380 static inline struct net *read_pnet(const possible_net_t *pnet)
> 381 {
> 382 #ifdef CONFIG_NET_NS
> 383 return rcu_dereference_protected(pnet->net, true);
> 384 #else
> 385 return &init_net;
> 386 #endif
> 387 }
>
> I suspect the actual crash is at line 1756 instead:
> (gdb) list *iptfs_output_collect+0x256
> 0xffffffff81d50762 is in iptfs_output_collect (net/xfrm/xfrm_iptfs.c:1756).
> 1751 return 0;
> 1752
> 1753 /* We only send ICMP too big if the user has configured us as
> 1754 * dont-fragment.
> 1755 */
> 1756 XFRM_INC_STATS(dev_net(skb->dev), LINUX_MIB_XFRMOUTERROR);
> 1757
> 1758 if (sk) {
> 1759 xfrm_local_error(skb, pmtu);
> 1760 } else if (ip_hdr(skb)->version == 4) {
>
> Later I ran iptfs_is_too_big under gdb: it is called twice, and the second
> call crashes.
> Here is the gdb backtrace from just before the crash:
>
> #0 iptfs_is_too_big (pmtu=1442, skb=0xffff88810dbea3c0, sk=0xffff888104d4ed40) at net/xfrm/xfrm_iptfs.c:1756
> #1 iptfs_output_collect (net=<optimized out>, sk=0xffff888104d4ed40, skb=0xffff88810dbea3c0) at net/xfrm/xfrm_iptfs.c:1847
> #2 0xffffffff81c8a3cb in ip_send_skb (net=0xffffffff83e57f20 <init_net>, skb=0xffff88810dbea3c0)
> at net/ipv4/ip_output.c:1492
> #3 0xffffffff81c8a439 in ip_push_pending_frames (sk=sk@entry=0xffff888104d4ed40, fl4=fl4@entry=0xffffc90000e3fb90)
> at net/ipv4/ip_output.c:1512
> #4 0xffffffff81ccf3cf in raw_sendmsg (sk=0xffff888104d4ed40, msg=0xffffc90000e3fd80, len=<optimized out>)
> at net/ipv4/raw.c:654
> #5 0xffffffff81b096ea in sock_sendmsg_nosec (sock=sock@entry=0xffff888115136040, msg=msg@entry=0xffffc90000e3fd80)
> at net/socket.c:730
> #6 0xffffffff81b0c327 in __sock_sendmsg (msg=0xffffc90000e3fd80, sock=0xffff888115136040) at net/socket.c:745
> #7 __sys_sendto (fd=<optimized out>, buff=buff@entry=0x558edefb73c0, len=len@entry=2008, flags=flags@entry=0,
> addr=addr@entry=0x558edefb35c0, addr_len=addr_len@entry=16) at net/socket.c:2191
> #8 0xffffffff81b0c40c in __do_sys_sendto (addr_len=16, addr=0x558edefb35c0, flags=0, len=2008, buff=0x558edefb73c0,
> fd=<optimized out>) at net/socket.c:2203
> #9 __se_sys_sendto (addr_len=16, addr=94072114722240, flags=0, len=2008, buff=94072114738112, fd=<optimized out>)
> at net/socket.c:2199
>
> (gdb) list
> 1751 return 0;
> 1752
> 1753 /* We only send ICMP too big if the user has configured us as
> 1754 * dont-fragment.
> 1755 */
> 1756 XFRM_INC_STATS(dev_net(skb->dev), LINUX_MIB_XFRMOUTERROR);
> 1757
> 1758 if (sk) {
> 1759 xfrm_local_error(skb, pmtu);
> 1760 } else if (ip_hdr(skb)->version == 4) {
>
> -antony
* Re: [PATCH ipsec-next v2 0/17] Add IP-TFS mode to xfrm
2024-05-23 23:04 ` Christian Hopps
@ 2024-05-24 11:52 ` Antony Antony
2024-05-24 11:56 ` Christian Hopps
0 siblings, 1 reply; 34+ messages in thread
From: Antony Antony @ 2024-05-24 11:52 UTC (permalink / raw)
To: Christian Hopps
Cc: Antony Antony, devel, Steffen Klassert, netdev, Christian Hopps
[-- Attachment #1: Type: text/plain, Size: 14176 bytes --]
On Thu, May 23, 2024 at 07:04:58PM -0400, Christian Hopps wrote:
>
> Could you let me know some more details about this test? What is your interface config / topology? I tried to guess from the ping command, but it's not reproducing for me.
I am using the Libreswan testing topology, but I am running the test manually.
Yesterday the tunnel was between north and east. This morning I quickly tried
between west and east, just two VMs, and I see the same issue there too.
https://libreswan.org/wiki/images/f/f1/Testnet-202102.png
I am using CONFIG_ESP_OFFLOAD; that is the only thing that stands out. Besides
that, it is just a 1500 MTU tunnel using qemu/kvm and a tap network.
Attached is my kernel .config.
> PS, I've changed the subject and In-reply-to to be based on the corrected
> cover-letter I sent, I initially sent the cover letter with the wrong
> subject. :(
I noticed a second cover letter. However, it was not showing as related to
the patch set correctly; it showed up as a different thread. That is why I
replied to the initial one.
-antony
>
>
> Antony Antony <antony@phenome.org> writes:
>
> > Hi Chris,
> >
> > On Mon, May 20, 2024 at 05:42:38PM -0400, Christian Hopps via Devel wrote:
> > > From: Christian Hopps <chopps@labn.net>
> > > - iptfs: remove some BUG_ON() assertions questioned in review.
>
> ...
>
> > I ran a couple of tests and hit a KASAN BUG.
> >
> > I was sending large pings while the MTU is 1500.
> >
> > [ 78.594770] ==================================================================
> > [ 78.595825] BUG: KASAN: null-ptr-deref in iptfs_output_collect+0x263/0x57b
> > [ 78.596658] Read of size 8 at addr 0000000000000108 by task ping/493
> > [ 78.597435]
> > [ 78.597651] CPU: 0 PID: 493 Comm: ping Not tainted 6.9.0-rc2-00697-g489ca863e24f-dirty #11
> > [ 78.598645] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
> > [ 78.599747] Call Trace:
> > [ 78.600070] <TASK>
> > [ 78.600354] dump_stack_lvl+0x2a/0x3b
> > [ 78.600817] kasan_report+0x84/0xa6
> > [ 78.601262] ? iptfs_output_collect+0x263/0x57b
> > [ 78.601825] iptfs_output_collect+0x263/0x57b
> > [ 78.602374] ip_send_skb+0x25/0x57
> > [ 78.602807] raw_sendmsg+0xee8/0x1011
> > [ 78.603269] ? native_flush_tlb_one_user+0xd/0xe5
> > [ 78.603850] ? raw_hash_sk+0x21b/0x21b
> > [ 78.604331] ? kernel_init_pages+0x42/0x51
> > [ 78.604845] ? prep_new_page+0x44/0x51
> > [ 78.605318] ? get_page_from_freelist+0x72b/0x915
> > [ 78.605903] ? signal_pending_state+0x77/0x77
> > [ 78.606462] ? __might_resched+0x8a/0x240
> > [ 78.606966] ? __might_sleep+0x25/0xa0
> > [ 78.607440] ? first_zones_zonelist+0x2c/0x43
> > [ 78.607985] ? __rcu_read_lock+0x2d/0x3a
> > [ 78.608479] ? __pte_offset_map+0x32/0xa4
> > [ 78.608979] ? __might_resched+0x8a/0x240
> > [ 78.609478] ? __might_sleep+0x25/0xa0
> > [ 78.609949] ? inet_send_prepare+0x54/0x54
> > [ 78.610464] ? sock_sendmsg_nosec+0x42/0x6c
> > [ 78.610984] sock_sendmsg_nosec+0x42/0x6c
> > [ 78.611485] __sys_sendto+0x15d/0x1cc
> > [ 78.611947] ? __x64_sys_getpeername+0x44/0x44
> > [ 78.612498] ? __handle_mm_fault+0x679/0xae4
> > [ 78.613033] ? find_vma+0x6b/0x8b
> > [ 78.613457] ? find_vma_intersection+0x8a/0x8a
> > [ 78.614006] ? __handle_irq_event_percpu+0x180/0x197
> > [ 78.614617] ? handle_mm_fault+0x38/0x154
> > [ 78.615114] ? handle_mm_fault+0xeb/0x154
> > [ 78.615620] ? preempt_latency_start+0x29/0x34
> > [ 78.616169] ? preempt_count_sub+0x14/0xb3
> > [ 78.616678] ? up_read+0x4b/0x5c
> > [ 78.617094] __x64_sys_sendto+0x76/0x82
> > [ 78.617577] do_syscall_64+0x6b/0xd7
> > [ 78.618043] entry_SYSCALL_64_after_hwframe+0x46/0x4e
> > [ 78.618667] RIP: 0033:0x7fed3de99a73
> > [ 78.619118] Code: 8b 15 a9 83 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b8
> > 0f 1f 00 80 3d 71 0b 0d 00 00 41 89 ca 74 14 b8 2c 00 00 00 0f 05 <48> 3d 00 f0
> > ff ff 77 75 c3 0f 1f 40 00 55 48 83 ec 30 44 89 4c 24
> > [ 78.621291] RSP: 002b:00007ffff6bdf478 EFLAGS: 00000202 ORIG_RAX: 000000000000002c
> > [ 78.622205] RAX: ffffffffffffffda RBX: 000055c538159340 RCX: 00007fed3de99a73
> > [ 78.623056] RDX: 00000000000007d8 RSI: 000055c53815f3c0 RDI: 0000000000000003
> > [ 78.623908] RBP: 000055c53815f3c0 R08: 000055c53815b5c0 R09: 0000000000000010
> > [ 78.624765] R10: 0000000000000000 R11: 0000000000000202 R12: 00000000000007d8
> > [ 78.625619] R13: 00007ffff6be0b60 R14: 0000001d00000001 R15: 000055c53815c680
> > [ 78.626480] </TASK>
> > [ 78.626773] ==================================================================
> > [ 78.627656] Disabling lock debugging due to kernel taint
> > [ 78.628305] BUG: kernel NULL pointer dereference, address: 0000000000000108
> > [ 78.629136] #PF: supervisor read access in kernel mode
> > [ 78.629766] #PF: error_code(0x0000) - not-present page
> > [ 78.630402] PGD 0 P4D 0
> > [ 78.630739] Oops: 0000 [#1] PREEMPT DEBUG_PAGEALLOC KASAN
> > [ 78.631398] CPU: 0 PID: 493 Comm: ping Tainted: G B 6.9.0-rc2-00697-g489ca863e24f-dirty #11
> > [ 78.632548] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
> > [ 78.633649] RIP: 0010:iptfs_output_collect+0x263/0x57b
> > [ 78.634283] Code: 73 70 0f 84 25 01 00 00 45 39 f4 0f 83 1c 01 00 00 48 8d 7b
> > 10 e8 27 37 62 ff 4c 8b 73 10 49 8d be 08 01 00 00 e8 17 37 62 ff <4d> 8b b6 08
> > 01 00 00 49 8d be b0 01 00 00 e8 04 37 62 ff 49 8b 86
> > [ 78.636444] RSP: 0018:ffffc90000d679c8 EFLAGS: 00010296
> > [ 78.637076] RAX: 0000000000000001 RBX: ffff888110ffbc80 RCX: fffffbfff07623ad
> > [ 78.637923] RDX: fffffbfff07623ad RSI: fffffbfff07623ad RDI: ffffffff83b11d60
> > [ 78.638792] RBP: ffff88810e3a1400 R08: 0000000000000008 R09: 0000000000000001
> > [ 78.639645] R10: ffffffff83b11d67 R11: fffffbfff07623ac R12: 00000000000005a2
> > [ 78.640498] R13: 0000000000000000 R14: 0000000000000000 R15: ffff88810e9a3401
> > [ 78.641359] FS: 00007fed3dbddc40(0000) GS:ffffffff82cb2000(0000) knlGS:0000000000000000
> > [ 78.642324] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 78.643022] CR2: 0000000000000108 CR3: 0000000110e84000 CR4: 0000000000350ef0
> > [ 78.643882] Call Trace:
> > [ 78.644204] <TASK>
> > [ 78.644487] ? __die_body+0x1a/0x56
> > [ 78.644929] ? page_fault_oops+0x45f/0x4cd
> > [ 78.645441] ? dump_pagetable+0x1db/0x1db
> > [ 78.645942] ? vprintk_emit+0x163/0x171
> > [ 78.646425] ? iptfs_output_collect+0x263/0x57b
> > [ 78.646986] ? _printk+0xb2/0xe1
> > [ 78.647401] ? find_first_fitting_seq+0x193/0x193
> > [ 78.647982] ? iptfs_output_collect+0x263/0x57b
> > [ 78.648541] ? do_user_addr_fault+0x14f/0x56c
> > [ 78.649084] ? exc_page_fault+0xa5/0xbe
> > [ 78.649566] ? asm_exc_page_fault+0x22/0x30
> > [ 78.650100] ? iptfs_output_collect+0x263/0x57b
> > [ 78.650660] ? iptfs_output_collect+0x263/0x57b
> > [ 78.651221] ip_send_skb+0x25/0x57
> > [ 78.651652] raw_sendmsg+0xee8/0x1011
> > [ 78.652113] ? native_flush_tlb_one_user+0xd/0xe5
> > [ 78.652693] ? raw_hash_sk+0x21b/0x21b
> > [ 78.653166] ? kernel_init_pages+0x42/0x51
> > [ 78.653683] ? prep_new_page+0x44/0x51
> > [ 78.654160] ? get_page_from_freelist+0x72b/0x915
> > [ 78.654739] ? signal_pending_state+0x77/0x77
> > [ 78.655284] ? __might_resched+0x8a/0x240
> > [ 78.655784] ? __might_sleep+0x25/0xa0
> > [ 78.656255] ? first_zones_zonelist+0x2c/0x43
> > [ 78.656798] ? __rcu_read_lock+0x2d/0x3a
> > [ 78.657289] ? __pte_offset_map+0x32/0xa4
> > [ 78.657788] ? __might_resched+0x8a/0x240
> > [ 78.658291] ? __might_sleep+0x25/0xa0
> > [ 78.658763] ? inet_send_prepare+0x54/0x54
> > [ 78.659272] ? sock_sendmsg_nosec+0x42/0x6c
> > [ 78.659791] sock_sendmsg_nosec+0x42/0x6c
> > [ 78.660293] __sys_sendto+0x15d/0x1cc
> > [ 78.660755] ? __x64_sys_getpeername+0x44/0x44
> > [ 78.661304] ? __handle_mm_fault+0x679/0xae4
> > [ 78.661838] ? find_vma+0x6b/0x8b
> > [ 78.662272] ? find_vma_intersection+0x8a/0x8a
> > [ 78.662828] ? __handle_irq_event_percpu+0x180/0x197
> > [ 78.663436] ? handle_mm_fault+0x38/0x154
> > [ 78.663935] ? handle_mm_fault+0xeb/0x154
> > [ 78.664435] ? preempt_latency_start+0x29/0x34
> > [ 78.664987] ? preempt_count_sub+0x14/0xb3
> > [ 78.665498] ? up_read+0x4b/0x5c
> > [ 78.665911] __x64_sys_sendto+0x76/0x82
> > [ 78.666398] do_syscall_64+0x6b/0xd7
> > [ 78.666849] entry_SYSCALL_64_after_hwframe+0x46/0x4e
> > [ 78.667466] RIP: 0033:0x7fed3de99a73
> > [ 78.667918] Code: 8b 15 a9 83 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b8
> > 0f 1f 00 80 3d 71 0b 0d 00 00 41 89 ca 74 14 b8 2c 00 00 00 0f 05 <48> 3d 00 f0
> > ff ff 77 75 c3 0f 1f 40 00 55 48 83 ec 30 44 89 4c 24
> > [ 78.670097] RSP: 002b:00007ffff6bdf478 EFLAGS: 00000202 ORIG_RAX: 000000000000002c
> > [ 78.671002] RAX: ffffffffffffffda RBX: 000055c538159340 RCX: 00007fed3de99a73
> > [ 78.671858] RDX: 00000000000007d8 RSI: 000055c53815f3c0 RDI: 0000000000000003
> > [ 78.672708] RBP: 000055c53815f3c0 R08: 000055c53815b5c0 R09: 0000000000000010
> > [ 78.673564] R10: 0000000000000000 R11: 0000000000000202 R12: 00000000000007d8
> > [ 78.674430] R13: 00007ffff6be0b60 R14: 0000001d00000001 R15: 000055c53815c680
> > [ 78.675287] </TASK>
> > [ 78.675580] Modules linked in:
> > [ 78.675975] CR2: 0000000000000108
> > [ 78.676396] ---[ end trace 0000000000000000 ]---
> > [ 78.676966] RIP: 0010:iptfs_output_collect+0x263/0x57b
> > [ 78.677596] Code: 73 70 0f 84 25 01 00 00 45 39 f4 0f 83 1c 01 00 00 48 8d 7b
> > 10 e8 27 37 62 ff 4c 8b 73 10 49 8d be 08 01 00 00 e8 17 37 62 ff <4d> 8b b6 08
> > 01 00 00 49 8d be b0 01 00 00 e8 04 37 62 ff 49 8b 86
> > [ 78.679768] RSP: 0018:ffffc90000d679c8 EFLAGS: 00010296
> > [ 78.680410] RAX: 0000000000000001 RBX: ffff888110ffbc80 RCX: fffffbfff07623ad
> > [ 78.681264] RDX: fffffbfff07623ad RSI: fffffbfff07623ad RDI: ffffffff83b11d60
> > [ 78.682136] RBP: ffff88810e3a1400 R08: 0000000000000008 R09: 0000000000000001
> > [ 78.682997] R10: ffffffff83b11d67 R11: fffffbfff07623ac R12: 00000000000005a2
> > [ 78.683853] R13: 0000000000000000 R14: 0000000000000000 R15: ffff88810e9a3401
> > [ 78.684710] FS: 00007fed3dbddc40(0000) GS:ffffffff82cb2000(0000) knlGS:0000000000000000
> > [ 78.685675] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 78.686387] CR2: 0000000000000108 CR3: 0000000110e84000 CR4: 0000000000350ef0
> > [ 78.687246] Kernel panic - not syncing: Fatal exception in interrupt
> > [ 78.688014] Kernel Offset: disabled
> > [ 78.688460] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
> >
> > ping -s 2000 -n -q -W 1 -c 2 -I 192.0.3.254 192.0.2.254
> >
> > (gdb) list *iptfs_output_collect+0x263
> > 0xffffffff81d5076f is in iptfs_output_collect (./include/net/net_namespace.h:383).
> > 378 }
> > 379
> > 380 static inline struct net *read_pnet(const possible_net_t *pnet)
> > 381 {
> > 382 #ifdef CONFIG_NET_NS
> > 383 return rcu_dereference_protected(pnet->net, true);
> > 384 #else
> > 385 return &init_net;
> > 386 #endif
> > 387 }
> >
> > I suspect the actual crash is at line 1756 instead:
> > (gdb) list *iptfs_output_collect+0x256
> > 0xffffffff81d50762 is in iptfs_output_collect (net/xfrm/xfrm_iptfs.c:1756).
> > 1751 return 0;
> > 1752
> > 1753 /* We only send ICMP too big if the user has configured us as
> > 1754 * dont-fragment.
> > 1755 */
> > 1756 XFRM_INC_STATS(dev_net(skb->dev), LINUX_MIB_XFRMOUTERROR);
> > 1757
> > 1758 if (sk) {
> > 1759 xfrm_local_error(skb, pmtu);
> > 1760 } else if (ip_hdr(skb)->version == 4) {
> >
> > Later I ran iptfs_is_too_big under gdb; it is called twice and crashes
> > the second time.
> > Here is the gdb backtrace, taken just before the crash:
> >
> > #0 iptfs_is_too_big (pmtu=1442, skb=0xffff88810dbea3c0, sk=0xffff888104d4ed40) at net/xfrm/xfrm_iptfs.c:1756
> > #1 iptfs_output_collect (net=<optimized out>, sk=0xffff888104d4ed40, skb=0xffff88810dbea3c0) at net/xfrm/xfrm_iptfs.c:1847
> > #2 0xffffffff81c8a3cb in ip_send_skb (net=0xffffffff83e57f20 <init_net>, skb=0xffff88810dbea3c0)
> > at net/ipv4/ip_output.c:1492
> > #3 0xffffffff81c8a439 in ip_push_pending_frames (sk=sk@entry=0xffff888104d4ed40, fl4=fl4@entry=0xffffc90000e3fb90)
> > at net/ipv4/ip_output.c:1512
> > #4 0xffffffff81ccf3cf in raw_sendmsg (sk=0xffff888104d4ed40, msg=0xffffc90000e3fd80, len=<optimized out>)
> > at net/ipv4/raw.c:654
> > #5 0xffffffff81b096ea in sock_sendmsg_nosec (sock=sock@entry=0xffff888115136040, msg=msg@entry=0xffffc90000e3fd80)
> > at net/socket.c:730
> > #6 0xffffffff81b0c327 in __sock_sendmsg (msg=0xffffc90000e3fd80, sock=0xffff888115136040) at net/socket.c:745
> > #7 __sys_sendto (fd=<optimized out>, buff=buff@entry=0x558edefb73c0, len=len@entry=2008, flags=flags@entry=0,
> > addr=addr@entry=0x558edefb35c0, addr_len=addr_len@entry=16) at net/socket.c:2191
> > #8 0xffffffff81b0c40c in __do_sys_sendto (addr_len=16, addr=0x558edefb35c0, flags=0, len=2008, buff=0x558edefb73c0,
> > fd=<optimized out>) at net/socket.c:2203
> > #9 __se_sys_sendto (addr_len=16, addr=94072114722240, flags=0, len=2008, buff=94072114738112, fd=<optimized out>)
> > at net/socket.c:2199
> >
> > (gdb) list
> > 1751 return 0;
> > 1752
> > 1753 /* We only send ICMP too big if the user has configured us as
> > 1754 * dont-fragment.
> > 1755 */
> > 1756 XFRM_INC_STATS(dev_net(skb->dev), LINUX_MIB_XFRMOUTERROR);
> > 1757
> > 1758 if (sk) {
> > 1759 xfrm_local_error(skb, pmtu);
> > 1760 } else if (ip_hdr(skb)->version == 4) {
> >
> > -antony
>
[-- Attachment #2: .config --]
[-- Type: text/plain, Size: 99833 bytes --]
#
# Automatically generated file; DO NOT EDIT.
# Linux/x86 6.9.0-rc2 Kernel Configuration
#
CONFIG_CC_VERSION_TEXT="gcc (Debian 13.2.0-7) 13.2.0"
CONFIG_CC_IS_GCC=y
CONFIG_GCC_VERSION=130200
CONFIG_CLANG_VERSION=0
CONFIG_AS_IS_GNU=y
CONFIG_AS_VERSION=24150
CONFIG_LD_IS_BFD=y
CONFIG_LD_VERSION=24150
CONFIG_LLD_VERSION=0
CONFIG_CC_CAN_LINK=y
CONFIG_CC_CAN_LINK_STATIC=y
CONFIG_CC_HAS_ASM_GOTO_OUTPUT=y
CONFIG_CC_HAS_ASM_GOTO_TIED_OUTPUT=y
CONFIG_GCC_ASM_GOTO_OUTPUT_WORKAROUND=y
CONFIG_TOOLS_SUPPORT_RELR=y
CONFIG_CC_HAS_ASM_INLINE=y
CONFIG_CC_HAS_NO_PROFILE_FN_ATTR=y
CONFIG_PAHOLE_VERSION=124
CONFIG_CONSTRUCTORS=y
CONFIG_IRQ_WORK=y
CONFIG_BUILDTIME_TABLE_SORT=y
CONFIG_THREAD_INFO_IN_TASK=y
#
# General setup
#
CONFIG_BROKEN_ON_SMP=y
CONFIG_INIT_ENV_ARG_LIMIT=32
# CONFIG_COMPILE_TEST is not set
CONFIG_WERROR=y
CONFIG_LOCALVERSION=""
CONFIG_LOCALVERSION_AUTO=y
CONFIG_BUILD_SALT=""
CONFIG_HAVE_KERNEL_GZIP=y
CONFIG_HAVE_KERNEL_BZIP2=y
CONFIG_HAVE_KERNEL_LZMA=y
CONFIG_HAVE_KERNEL_XZ=y
CONFIG_HAVE_KERNEL_LZO=y
CONFIG_HAVE_KERNEL_LZ4=y
CONFIG_HAVE_KERNEL_ZSTD=y
CONFIG_KERNEL_GZIP=y
# CONFIG_KERNEL_BZIP2 is not set
# CONFIG_KERNEL_LZMA is not set
# CONFIG_KERNEL_XZ is not set
# CONFIG_KERNEL_LZO is not set
# CONFIG_KERNEL_LZ4 is not set
# CONFIG_KERNEL_ZSTD is not set
CONFIG_DEFAULT_INIT=""
CONFIG_DEFAULT_HOSTNAME="(none)"
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
# CONFIG_POSIX_MQUEUE is not set
# CONFIG_WATCH_QUEUE is not set
CONFIG_CROSS_MEMORY_ATTACH=y
# CONFIG_USELIB is not set
CONFIG_AUDIT=y
CONFIG_HAVE_ARCH_AUDITSYSCALL=y
CONFIG_AUDITSYSCALL=y
#
# IRQ subsystem
#
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_GENERIC_IRQ_SHOW=y
CONFIG_HARDIRQS_SW_RESEND=y
CONFIG_IRQ_DOMAIN=y
CONFIG_IRQ_DOMAIN_HIERARCHY=y
CONFIG_GENERIC_MSI_IRQ=y
CONFIG_IRQ_MSI_IOMMU=y
CONFIG_GENERIC_IRQ_MATRIX_ALLOCATOR=y
CONFIG_GENERIC_IRQ_RESERVATION_MODE=y
CONFIG_IRQ_FORCED_THREADING=y
CONFIG_SPARSE_IRQ=y
# CONFIG_GENERIC_IRQ_DEBUGFS is not set
# end of IRQ subsystem
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_ARCH_CLOCKSOURCE_INIT=y
CONFIG_CLOCKSOURCE_VALIDATE_LAST_CYCLE=y
CONFIG_GENERIC_TIME_VSYSCALL=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST_IDLE=y
CONFIG_GENERIC_CLOCKEVENTS_MIN_ADJUST=y
CONFIG_GENERIC_CMOS_UPDATE=y
CONFIG_HAVE_POSIX_CPU_TIMERS_TASK_WORK=y
CONFIG_POSIX_CPU_TIMERS_TASK_WORK=y
CONFIG_CONTEXT_TRACKING=y
CONFIG_CONTEXT_TRACKING_IDLE=y
#
# Timers subsystem
#
CONFIG_HZ_PERIODIC=y
# CONFIG_NO_HZ_IDLE is not set
# CONFIG_NO_HZ is not set
# CONFIG_HIGH_RES_TIMERS is not set
CONFIG_CLOCKSOURCE_WATCHDOG_MAX_SKEW_US=100
# end of Timers subsystem
CONFIG_BPF=y
CONFIG_HAVE_EBPF_JIT=y
CONFIG_ARCH_WANT_DEFAULT_BPF_JIT=y
#
# BPF subsystem
#
CONFIG_BPF_SYSCALL=y
CONFIG_BPF_JIT=y
CONFIG_BPF_JIT_ALWAYS_ON=y
CONFIG_BPF_JIT_DEFAULT_ON=y
# CONFIG_BPF_UNPRIV_DEFAULT_OFF is not set
# CONFIG_BPF_PRELOAD is not set
# end of BPF subsystem
CONFIG_PREEMPT_BUILD=y
# CONFIG_PREEMPT_NONE is not set
# CONFIG_PREEMPT_VOLUNTARY is not set
CONFIG_PREEMPT=y
CONFIG_PREEMPT_COUNT=y
CONFIG_PREEMPTION=y
CONFIG_PREEMPT_DYNAMIC=y
#
# CPU/Task time and stats accounting
#
CONFIG_TICK_CPU_ACCOUNTING=y
# CONFIG_VIRT_CPU_ACCOUNTING_GEN is not set
# CONFIG_IRQ_TIME_ACCOUNTING is not set
# CONFIG_BSD_PROCESS_ACCT is not set
# CONFIG_TASKSTATS is not set
# CONFIG_PSI is not set
# end of CPU/Task time and stats accounting
#
# RCU Subsystem
#
CONFIG_TREE_RCU=y
CONFIG_PREEMPT_RCU=y
# CONFIG_RCU_EXPERT is not set
CONFIG_TREE_SRCU=y
CONFIG_TASKS_RCU_GENERIC=y
CONFIG_TASKS_RCU=y
CONFIG_TASKS_RUDE_RCU=y
CONFIG_TASKS_TRACE_RCU=y
CONFIG_RCU_STALL_COMMON=y
CONFIG_RCU_NEED_SEGCBLIST=y
# end of RCU Subsystem
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
# CONFIG_IKHEADERS is not set
CONFIG_LOG_BUF_SHIFT=18
CONFIG_PRINTK_INDEX=y
CONFIG_HAVE_UNSTABLE_SCHED_CLOCK=y
#
# Scheduler features
#
# end of Scheduler features
CONFIG_ARCH_SUPPORTS_NUMA_BALANCING=y
CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH=y
CONFIG_CC_HAS_INT128=y
CONFIG_CC_IMPLICIT_FALLTHROUGH="-Wimplicit-fallthrough=5"
CONFIG_GCC10_NO_ARRAY_BOUNDS=y
CONFIG_CC_NO_ARRAY_BOUNDS=y
CONFIG_GCC_NO_STRINGOP_OVERFLOW=y
CONFIG_CC_NO_STRINGOP_OVERFLOW=y
CONFIG_ARCH_SUPPORTS_INT128=y
CONFIG_CGROUPS=y
# CONFIG_CGROUP_FAVOR_DYNMODS is not set
# CONFIG_MEMCG is not set
# CONFIG_BLK_CGROUP is not set
CONFIG_CGROUP_SCHED=y
CONFIG_FAIR_GROUP_SCHED=y
# CONFIG_CFS_BANDWIDTH is not set
# CONFIG_RT_GROUP_SCHED is not set
CONFIG_CGROUP_PIDS=y
# CONFIG_CGROUP_RDMA is not set
# CONFIG_CGROUP_FREEZER is not set
# CONFIG_CGROUP_DEVICE is not set
# CONFIG_CGROUP_CPUACCT is not set
CONFIG_CGROUP_PERF=y
CONFIG_CGROUP_BPF=y
# CONFIG_CGROUP_MISC is not set
# CONFIG_CGROUP_DEBUG is not set
CONFIG_SOCK_CGROUP_DATA=y
CONFIG_NAMESPACES=y
CONFIG_UTS_NS=y
CONFIG_TIME_NS=y
CONFIG_IPC_NS=y
CONFIG_USER_NS=y
CONFIG_PID_NS=y
CONFIG_NET_NS=y
# CONFIG_CHECKPOINT_RESTORE is not set
# CONFIG_SCHED_AUTOGROUP is not set
CONFIG_RELAY=y
CONFIG_BLK_DEV_INITRD=y
CONFIG_INITRAMFS_SOURCE=""
CONFIG_RD_GZIP=y
CONFIG_RD_BZIP2=y
CONFIG_RD_LZMA=y
CONFIG_RD_XZ=y
CONFIG_RD_LZO=y
CONFIG_RD_LZ4=y
CONFIG_RD_ZSTD=y
CONFIG_BOOT_CONFIG=y
CONFIG_BOOT_CONFIG_FORCE=y
# CONFIG_BOOT_CONFIG_EMBED is not set
CONFIG_INITRAMFS_PRESERVE_MTIME=y
# CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE is not set
CONFIG_CC_OPTIMIZE_FOR_SIZE=y
CONFIG_LD_ORPHAN_WARN=y
CONFIG_LD_ORPHAN_WARN_LEVEL="error"
CONFIG_SYSCTL=y
CONFIG_SYSCTL_EXCEPTION_TRACE=y
CONFIG_HAVE_PCSPKR_PLATFORM=y
# CONFIG_EXPERT is not set
CONFIG_MULTIUSER=y
CONFIG_SGETMASK_SYSCALL=y
CONFIG_SYSFS_SYSCALL=y
CONFIG_FHANDLE=y
CONFIG_POSIX_TIMERS=y
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_ELF_CORE=y
CONFIG_PCSPKR_PLATFORM=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_FUTEX_PI=y
CONFIG_EPOLL=y
CONFIG_SIGNALFD=y
CONFIG_TIMERFD=y
CONFIG_EVENTFD=y
CONFIG_SHMEM=y
CONFIG_AIO=y
CONFIG_IO_URING=y
CONFIG_ADVISE_SYSCALLS=y
CONFIG_MEMBARRIER=y
CONFIG_RSEQ=y
CONFIG_CACHESTAT_SYSCALL=y
CONFIG_KALLSYMS=y
# CONFIG_KALLSYMS_SELFTEST is not set
CONFIG_KALLSYMS_ALL=y
CONFIG_KALLSYMS_BASE_RELATIVE=y
CONFIG_ARCH_HAS_MEMBARRIER_SYNC_CORE=y
CONFIG_HAVE_PERF_EVENTS=y
#
# Kernel Performance Events And Counters
#
CONFIG_PERF_EVENTS=y
# CONFIG_DEBUG_PERF_USE_VMALLOC is not set
# end of Kernel Performance Events And Counters
CONFIG_SYSTEM_DATA_VERIFICATION=y
CONFIG_PROFILING=y
CONFIG_TRACEPOINTS=y
#
# Kexec and crash features
#
CONFIG_CRASH_RESERVE=y
CONFIG_VMCORE_INFO=y
CONFIG_KEXEC_CORE=y
CONFIG_KEXEC=y
CONFIG_KEXEC_FILE=y
# CONFIG_KEXEC_SIG is not set
# CONFIG_KEXEC_JUMP is not set
CONFIG_CRASH_DUMP=y
# end of Kexec and crash features
# end of General setup
CONFIG_64BIT=y
CONFIG_X86_64=y
CONFIG_X86=y
CONFIG_INSTRUCTION_DECODER=y
CONFIG_OUTPUT_FORMAT="elf64-x86-64"
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_MMU=y
CONFIG_ARCH_MMAP_RND_BITS_MIN=28
CONFIG_ARCH_MMAP_RND_BITS_MAX=32
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MIN=8
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MAX=16
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_CSUM=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_BUG_RELATIVE_POINTERS=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_ARCH_HAS_CPU_RELAX=y
CONFIG_ARCH_HIBERNATION_POSSIBLE=y
CONFIG_ARCH_SUSPEND_POSSIBLE=y
CONFIG_AUDIT_ARCH=y
CONFIG_KASAN_SHADOW_OFFSET=0xdffffc0000000000
CONFIG_ARCH_SUPPORTS_UPROBES=y
CONFIG_FIX_EARLYCON_MEM=y
CONFIG_PGTABLE_LEVELS=4
CONFIG_CC_HAS_SANE_STACKPROTECTOR=y
#
# Processor type and features
#
# CONFIG_SMP is not set
# CONFIG_X86_X2APIC is not set
CONFIG_X86_MPPARSE=y
CONFIG_X86_CPU_RESCTRL=y
# CONFIG_X86_FRED is not set
# CONFIG_X86_EXTENDED_PLATFORM is not set
# CONFIG_X86_INTEL_LPSS is not set
CONFIG_X86_AMD_PLATFORM_DEVICE=y
# CONFIG_IOSF_MBI is not set
CONFIG_X86_SUPPORTS_MEMORY_FAILURE=y
CONFIG_SCHED_OMIT_FRAME_POINTER=y
CONFIG_HYPERVISOR_GUEST=y
CONFIG_PARAVIRT=y
# CONFIG_PARAVIRT_DEBUG is not set
CONFIG_X86_HV_CALLBACK_VECTOR=y
# CONFIG_XEN is not set
CONFIG_KVM_GUEST=y
CONFIG_ARCH_CPUIDLE_HALTPOLL=y
# CONFIG_PVH is not set
# CONFIG_PARAVIRT_TIME_ACCOUNTING is not set
CONFIG_PARAVIRT_CLOCK=y
# CONFIG_JAILHOUSE_GUEST is not set
# CONFIG_ACRN_GUEST is not set
# CONFIG_MK8 is not set
# CONFIG_MPSC is not set
# CONFIG_MCORE2 is not set
# CONFIG_MATOM is not set
CONFIG_GENERIC_CPU=y
CONFIG_X86_INTERNODE_CACHE_SHIFT=6
CONFIG_X86_L1_CACHE_SHIFT=6
CONFIG_X86_TSC=y
CONFIG_X86_HAVE_PAE=y
CONFIG_X86_CMPXCHG64=y
CONFIG_X86_CMOV=y
CONFIG_X86_MINIMUM_CPU_FAMILY=64
CONFIG_X86_DEBUGCTLMSR=y
CONFIG_IA32_FEAT_CTL=y
CONFIG_X86_VMX_FEATURE_NAMES=y
CONFIG_CPU_SUP_INTEL=y
CONFIG_CPU_SUP_AMD=y
CONFIG_CPU_SUP_HYGON=y
CONFIG_CPU_SUP_CENTAUR=y
CONFIG_CPU_SUP_ZHAOXIN=y
CONFIG_HPET_TIMER=y
CONFIG_DMI=y
# CONFIG_GART_IOMMU is not set
CONFIG_NR_CPUS_RANGE_BEGIN=1
CONFIG_NR_CPUS_RANGE_END=1
CONFIG_NR_CPUS_DEFAULT=1
CONFIG_NR_CPUS=1
CONFIG_UP_LATE_INIT=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_X86_IO_APIC=y
# CONFIG_X86_REROUTE_FOR_BROKEN_BOOT_IRQS is not set
CONFIG_X86_MCE=y
# CONFIG_X86_MCELOG_LEGACY is not set
CONFIG_X86_MCE_INTEL=y
# CONFIG_X86_MCE_AMD is not set
CONFIG_X86_MCE_THRESHOLD=y
# CONFIG_X86_MCE_INJECT is not set
#
# Performance monitoring
#
CONFIG_PERF_EVENTS_INTEL_UNCORE=y
CONFIG_PERF_EVENTS_INTEL_RAPL=y
CONFIG_PERF_EVENTS_INTEL_CSTATE=y
# CONFIG_PERF_EVENTS_AMD_POWER is not set
CONFIG_PERF_EVENTS_AMD_UNCORE=y
# CONFIG_PERF_EVENTS_AMD_BRS is not set
# end of Performance monitoring
CONFIG_X86_16BIT=y
CONFIG_X86_ESPFIX64=y
CONFIG_X86_VSYSCALL_EMULATION=y
CONFIG_X86_IOPL_IOPERM=y
CONFIG_MICROCODE=y
# CONFIG_X86_MSR is not set
# CONFIG_X86_CPUID is not set
# CONFIG_X86_5LEVEL is not set
CONFIG_X86_DIRECT_GBPAGES=y
# CONFIG_X86_CPA_STATISTICS is not set
CONFIG_ARCH_SPARSEMEM_ENABLE=y
CONFIG_ARCH_SPARSEMEM_DEFAULT=y
CONFIG_ILLEGAL_POINTER_VALUE=0xdead000000000000
# CONFIG_X86_PMEM_LEGACY is not set
# CONFIG_X86_CHECK_BIOS_CORRUPTION is not set
CONFIG_MTRR=y
CONFIG_MTRR_SANITIZER=y
CONFIG_MTRR_SANITIZER_ENABLE_DEFAULT=0
CONFIG_MTRR_SANITIZER_SPARE_REG_NR_DEFAULT=1
CONFIG_X86_PAT=y
CONFIG_ARCH_USES_PG_UNCACHED=y
CONFIG_X86_UMIP=y
CONFIG_CC_HAS_IBT=y
# CONFIG_X86_KERNEL_IBT is not set
# CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS is not set
CONFIG_X86_INTEL_TSX_MODE_OFF=y
# CONFIG_X86_INTEL_TSX_MODE_ON is not set
# CONFIG_X86_INTEL_TSX_MODE_AUTO is not set
# CONFIG_X86_USER_SHADOW_STACK is not set
# CONFIG_EFI is not set
# CONFIG_HZ_100 is not set
CONFIG_HZ_250=y
# CONFIG_HZ_300 is not set
# CONFIG_HZ_1000 is not set
CONFIG_HZ=250
CONFIG_ARCH_SUPPORTS_KEXEC=y
CONFIG_ARCH_SUPPORTS_KEXEC_FILE=y
CONFIG_ARCH_SELECTS_KEXEC_FILE=y
CONFIG_ARCH_SUPPORTS_KEXEC_PURGATORY=y
CONFIG_ARCH_SUPPORTS_KEXEC_SIG=y
CONFIG_ARCH_SUPPORTS_KEXEC_SIG_FORCE=y
CONFIG_ARCH_SUPPORTS_KEXEC_BZIMAGE_VERIFY_SIG=y
CONFIG_ARCH_SUPPORTS_KEXEC_JUMP=y
CONFIG_ARCH_SUPPORTS_CRASH_DUMP=y
CONFIG_ARCH_SUPPORTS_CRASH_HOTPLUG=y
CONFIG_ARCH_HAS_GENERIC_CRASHKERNEL_RESERVATION=y
CONFIG_PHYSICAL_START=0x1000000
CONFIG_RELOCATABLE=y
# CONFIG_RANDOMIZE_BASE is not set
CONFIG_PHYSICAL_ALIGN=0x200000
# CONFIG_ADDRESS_MASKING is not set
CONFIG_LEGACY_VSYSCALL_XONLY=y
# CONFIG_LEGACY_VSYSCALL_NONE is not set
# CONFIG_CMDLINE_BOOL is not set
CONFIG_MODIFY_LDT_SYSCALL=y
# CONFIG_STRICT_SIGALTSTACK_SIZE is not set
CONFIG_HAVE_LIVEPATCH=y
# CONFIG_LIVEPATCH is not set
# end of Processor type and features
CONFIG_CC_HAS_NAMED_AS=y
CONFIG_CC_HAS_SLS=y
CONFIG_CC_HAS_RETURN_THUNK=y
CONFIG_CC_HAS_ENTRY_PADDING=y
CONFIG_FUNCTION_PADDING_CFI=11
CONFIG_FUNCTION_PADDING_BYTES=16
# CONFIG_SPECULATION_MITIGATIONS is not set
CONFIG_ARCH_HAS_ADD_PAGES=y
#
# Power management and ACPI options
#
CONFIG_ARCH_HIBERNATION_HEADER=y
CONFIG_SUSPEND=y
CONFIG_SUSPEND_FREEZER=y
CONFIG_HIBERNATE_CALLBACKS=y
CONFIG_HIBERNATION=y
CONFIG_HIBERNATION_SNAPSHOT_DEV=y
CONFIG_HIBERNATION_COMP_LZO=y
# CONFIG_HIBERNATION_COMP_LZ4 is not set
CONFIG_HIBERNATION_DEF_COMP="lzo"
CONFIG_PM_STD_PARTITION=""
CONFIG_PM_SLEEP=y
# CONFIG_PM_AUTOSLEEP is not set
# CONFIG_PM_USERSPACE_AUTOSLEEP is not set
# CONFIG_PM_WAKELOCKS is not set
CONFIG_PM=y
CONFIG_PM_DEBUG=y
# CONFIG_PM_ADVANCED_DEBUG is not set
CONFIG_PM_SLEEP_DEBUG=y
CONFIG_PM_TRACE=y
CONFIG_PM_TRACE_RTC=y
CONFIG_PM_CLK=y
# CONFIG_WQ_POWER_EFFICIENT_DEFAULT is not set
CONFIG_ARCH_SUPPORTS_ACPI=y
CONFIG_ACPI=y
CONFIG_ACPI_LEGACY_TABLES_LOOKUP=y
CONFIG_ARCH_MIGHT_HAVE_ACPI_PDC=y
CONFIG_ACPI_SYSTEM_POWER_STATES_SUPPORT=y
CONFIG_ACPI_THERMAL_LIB=y
# CONFIG_ACPI_DEBUGGER is not set
CONFIG_ACPI_SPCR_TABLE=y
# CONFIG_ACPI_FPDT is not set
CONFIG_ACPI_LPIT=y
CONFIG_ACPI_SLEEP=y
CONFIG_ACPI_REV_OVERRIDE_POSSIBLE=y
# CONFIG_ACPI_EC_DEBUGFS is not set
CONFIG_ACPI_AC=y
CONFIG_ACPI_BATTERY=y
CONFIG_ACPI_BUTTON=y
CONFIG_ACPI_FAN=y
# CONFIG_ACPI_TAD is not set
# CONFIG_ACPI_DOCK is not set
CONFIG_ACPI_CPU_FREQ_PSS=y
CONFIG_ACPI_PROCESSOR_CSTATE=y
CONFIG_ACPI_PROCESSOR_IDLE=y
CONFIG_ACPI_PROCESSOR=y
# CONFIG_ACPI_PROCESSOR_AGGREGATOR is not set
CONFIG_ACPI_THERMAL=y
CONFIG_ARCH_HAS_ACPI_TABLE_UPGRADE=y
CONFIG_ACPI_TABLE_UPGRADE=y
# CONFIG_ACPI_DEBUG is not set
# CONFIG_ACPI_PCI_SLOT is not set
CONFIG_ACPI_CONTAINER=y
CONFIG_ACPI_HOTPLUG_IOAPIC=y
# CONFIG_ACPI_SBS is not set
# CONFIG_ACPI_HED is not set
# CONFIG_ACPI_NFIT is not set
CONFIG_HAVE_ACPI_APEI=y
CONFIG_HAVE_ACPI_APEI_NMI=y
# CONFIG_ACPI_APEI is not set
# CONFIG_ACPI_DPTF is not set
# CONFIG_ACPI_CONFIGFS is not set
# CONFIG_ACPI_PFRUT is not set
CONFIG_ACPI_PCC=y
# CONFIG_ACPI_FFH is not set
# CONFIG_PMIC_OPREGION is not set
CONFIG_X86_PM_TIMER=y
#
# CPU Frequency scaling
#
CONFIG_CPU_FREQ=y
# CONFIG_CPU_FREQ_STAT is not set
CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE=y
# CONFIG_CPU_FREQ_DEFAULT_GOV_POWERSAVE is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_CONSERVATIVE is not set
CONFIG_CPU_FREQ_GOV_PERFORMANCE=y
# CONFIG_CPU_FREQ_GOV_POWERSAVE is not set
# CONFIG_CPU_FREQ_GOV_USERSPACE is not set
# CONFIG_CPU_FREQ_GOV_ONDEMAND is not set
# CONFIG_CPU_FREQ_GOV_CONSERVATIVE is not set
#
# CPU frequency scaling drivers
#
CONFIG_X86_INTEL_PSTATE=y
# CONFIG_X86_PCC_CPUFREQ is not set
# CONFIG_X86_AMD_PSTATE is not set
# CONFIG_X86_AMD_PSTATE_UT is not set
# CONFIG_X86_ACPI_CPUFREQ is not set
# CONFIG_X86_SPEEDSTEP_CENTRINO is not set
# CONFIG_X86_P4_CLOCKMOD is not set
#
# shared options
#
# end of CPU Frequency scaling
#
# CPU Idle
#
CONFIG_CPU_IDLE=y
CONFIG_CPU_IDLE_GOV_LADDER=y
# CONFIG_CPU_IDLE_GOV_MENU is not set
# CONFIG_CPU_IDLE_GOV_TEO is not set
CONFIG_CPU_IDLE_GOV_HALTPOLL=y
CONFIG_HALTPOLL_CPUIDLE=y
# end of CPU Idle
# CONFIG_INTEL_IDLE is not set
# end of Power management and ACPI options
#
# Bus options (PCI etc.)
#
CONFIG_PCI_DIRECT=y
CONFIG_PCI_MMCONFIG=y
CONFIG_MMCONF_FAM10H=y
CONFIG_ISA_DMA_API=y
CONFIG_AMD_NB=y
# end of Bus options (PCI etc.)
#
# Binary Emulations
#
# CONFIG_IA32_EMULATION is not set
# CONFIG_X86_X32_ABI is not set
# end of Binary Emulations
CONFIG_VIRTUALIZATION=y
CONFIG_AS_AVX512=y
CONFIG_AS_SHA1_NI=y
CONFIG_AS_SHA256_NI=y
CONFIG_AS_TPAUSE=y
CONFIG_AS_GFNI=y
CONFIG_AS_WRUSS=y
#
# General architecture-dependent options
#
CONFIG_GENERIC_ENTRY=y
CONFIG_KPROBES=y
CONFIG_JUMP_LABEL=y
# CONFIG_STATIC_KEYS_SELFTEST is not set
# CONFIG_STATIC_CALL_SELFTEST is not set
CONFIG_OPTPROBES=y
CONFIG_KPROBES_ON_FTRACE=y
CONFIG_UPROBES=y
CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y
CONFIG_ARCH_USE_BUILTIN_BSWAP=y
CONFIG_KRETPROBES=y
CONFIG_KRETPROBE_ON_RETHOOK=y
CONFIG_HAVE_IOREMAP_PROT=y
CONFIG_HAVE_KPROBES=y
CONFIG_HAVE_KRETPROBES=y
CONFIG_HAVE_OPTPROBES=y
CONFIG_HAVE_KPROBES_ON_FTRACE=y
CONFIG_ARCH_CORRECT_STACKTRACE_ON_KRETPROBE=y
CONFIG_HAVE_FUNCTION_ERROR_INJECTION=y
CONFIG_HAVE_NMI=y
CONFIG_TRACE_IRQFLAGS_SUPPORT=y
CONFIG_TRACE_IRQFLAGS_NMI_SUPPORT=y
CONFIG_HAVE_ARCH_TRACEHOOK=y
CONFIG_HAVE_DMA_CONTIGUOUS=y
CONFIG_GENERIC_SMP_IDLE_THREAD=y
CONFIG_ARCH_HAS_FORTIFY_SOURCE=y
CONFIG_ARCH_HAS_SET_MEMORY=y
CONFIG_ARCH_HAS_SET_DIRECT_MAP=y
CONFIG_ARCH_HAS_CPU_FINALIZE_INIT=y
CONFIG_HAVE_ARCH_THREAD_STRUCT_WHITELIST=y
CONFIG_ARCH_WANTS_DYNAMIC_TASK_STRUCT=y
CONFIG_ARCH_WANTS_NO_INSTR=y
CONFIG_HAVE_ASM_MODVERSIONS=y
CONFIG_HAVE_REGS_AND_STACK_ACCESS_API=y
CONFIG_HAVE_RSEQ=y
CONFIG_HAVE_RUST=y
CONFIG_HAVE_FUNCTION_ARG_ACCESS_API=y
CONFIG_HAVE_HW_BREAKPOINT=y
CONFIG_HAVE_MIXED_BREAKPOINTS_REGS=y
CONFIG_HAVE_USER_RETURN_NOTIFIER=y
CONFIG_HAVE_PERF_EVENTS_NMI=y
CONFIG_HAVE_HARDLOCKUP_DETECTOR_PERF=y
CONFIG_HAVE_PERF_REGS=y
CONFIG_HAVE_PERF_USER_STACK_DUMP=y
CONFIG_HAVE_ARCH_JUMP_LABEL=y
CONFIG_HAVE_ARCH_JUMP_LABEL_RELATIVE=y
CONFIG_MMU_GATHER_TABLE_FREE=y
CONFIG_MMU_GATHER_RCU_TABLE_FREE=y
CONFIG_MMU_GATHER_MERGE_VMAS=y
CONFIG_MMU_LAZY_TLB_REFCOUNT=y
CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG=y
CONFIG_ARCH_HAS_NMI_SAFE_THIS_CPU_OPS=y
CONFIG_HAVE_ALIGNED_STRUCT_PAGE=y
CONFIG_HAVE_CMPXCHG_LOCAL=y
CONFIG_HAVE_CMPXCHG_DOUBLE=y
CONFIG_HAVE_ARCH_SECCOMP=y
CONFIG_HAVE_ARCH_SECCOMP_FILTER=y
# CONFIG_SECCOMP is not set
CONFIG_HAVE_ARCH_STACKLEAK=y
CONFIG_HAVE_STACKPROTECTOR=y
CONFIG_STACKPROTECTOR=y
CONFIG_STACKPROTECTOR_STRONG=y
CONFIG_ARCH_SUPPORTS_LTO_CLANG=y
CONFIG_ARCH_SUPPORTS_LTO_CLANG_THIN=y
CONFIG_LTO_NONE=y
CONFIG_ARCH_SUPPORTS_CFI_CLANG=y
CONFIG_HAVE_ARCH_WITHIN_STACK_FRAMES=y
CONFIG_HAVE_CONTEXT_TRACKING_USER=y
CONFIG_HAVE_CONTEXT_TRACKING_USER_OFFSTACK=y
CONFIG_HAVE_VIRT_CPU_ACCOUNTING_GEN=y
CONFIG_HAVE_IRQ_TIME_ACCOUNTING=y
CONFIG_HAVE_MOVE_PUD=y
CONFIG_HAVE_MOVE_PMD=y
CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE=y
CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD=y
CONFIG_HAVE_ARCH_HUGE_VMAP=y
CONFIG_HAVE_ARCH_HUGE_VMALLOC=y
CONFIG_ARCH_WANT_HUGE_PMD_SHARE=y
CONFIG_HAVE_ARCH_SOFT_DIRTY=y
CONFIG_HAVE_MOD_ARCH_SPECIFIC=y
CONFIG_MODULES_USE_ELF_RELA=y
CONFIG_HAVE_IRQ_EXIT_ON_IRQ_STACK=y
CONFIG_HAVE_SOFTIRQ_ON_OWN_STACK=y
CONFIG_SOFTIRQ_ON_OWN_STACK=y
CONFIG_ARCH_HAS_ELF_RANDOMIZE=y
CONFIG_HAVE_ARCH_MMAP_RND_BITS=y
CONFIG_HAVE_EXIT_THREAD=y
CONFIG_ARCH_MMAP_RND_BITS=28
CONFIG_HAVE_PAGE_SIZE_4KB=y
CONFIG_PAGE_SIZE_4KB=y
CONFIG_PAGE_SIZE_LESS_THAN_64KB=y
CONFIG_PAGE_SIZE_LESS_THAN_256KB=y
CONFIG_PAGE_SHIFT=12
CONFIG_HAVE_OBJTOOL=y
CONFIG_HAVE_JUMP_LABEL_HACK=y
CONFIG_HAVE_NOINSTR_HACK=y
CONFIG_HAVE_NOINSTR_VALIDATION=y
CONFIG_HAVE_UACCESS_VALIDATION=y
CONFIG_HAVE_STACK_VALIDATION=y
CONFIG_HAVE_RELIABLE_STACKTRACE=y
# CONFIG_COMPAT_32BIT_TIME is not set
CONFIG_HAVE_ARCH_VMAP_STACK=y
CONFIG_VMAP_STACK=y
CONFIG_HAVE_ARCH_RANDOMIZE_KSTACK_OFFSET=y
CONFIG_RANDOMIZE_KSTACK_OFFSET=y
# CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT is not set
CONFIG_ARCH_HAS_STRICT_KERNEL_RWX=y
CONFIG_STRICT_KERNEL_RWX=y
CONFIG_ARCH_HAS_STRICT_MODULE_RWX=y
CONFIG_STRICT_MODULE_RWX=y
CONFIG_HAVE_ARCH_PREL32_RELOCATIONS=y
# CONFIG_LOCK_EVENT_COUNTS is not set
CONFIG_ARCH_HAS_MEM_ENCRYPT=y
CONFIG_HAVE_STATIC_CALL=y
CONFIG_HAVE_STATIC_CALL_INLINE=y
CONFIG_HAVE_PREEMPT_DYNAMIC=y
CONFIG_HAVE_PREEMPT_DYNAMIC_CALL=y
CONFIG_ARCH_WANT_LD_ORPHAN_WARN=y
CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
CONFIG_ARCH_SUPPORTS_PAGE_TABLE_CHECK=y
CONFIG_ARCH_HAS_ELFCORE_COMPAT=y
CONFIG_ARCH_HAS_PARANOID_L1D_FLUSH=y
CONFIG_DYNAMIC_SIGFRAME=y
CONFIG_ARCH_HAS_HW_PTE_YOUNG=y
CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG=y
#
# GCOV-based kernel profiling
#
# CONFIG_GCOV_KERNEL is not set
CONFIG_ARCH_HAS_GCOV_PROFILE_ALL=y
# end of GCOV-based kernel profiling
CONFIG_HAVE_GCC_PLUGINS=y
CONFIG_FUNCTION_ALIGNMENT_4B=y
CONFIG_FUNCTION_ALIGNMENT_16B=y
CONFIG_FUNCTION_ALIGNMENT=16
# end of General architecture-dependent options
CONFIG_RT_MUTEXES=y
CONFIG_BASE_SMALL=0
CONFIG_MODULE_SIG_FORMAT=y
CONFIG_MODULES=y
# CONFIG_MODULE_DEBUG is not set
# CONFIG_MODULE_FORCE_LOAD is not set
# CONFIG_MODULE_UNLOAD is not set
# CONFIG_MODVERSIONS is not set
# CONFIG_MODULE_SRCVERSION_ALL is not set
CONFIG_MODULE_SIG=y
# CONFIG_MODULE_SIG_FORCE is not set
CONFIG_MODULE_SIG_ALL=y
# CONFIG_MODULE_SIG_SHA1 is not set
CONFIG_MODULE_SIG_SHA256=y
# CONFIG_MODULE_SIG_SHA384 is not set
# CONFIG_MODULE_SIG_SHA512 is not set
# CONFIG_MODULE_SIG_SHA3_256 is not set
# CONFIG_MODULE_SIG_SHA3_384 is not set
# CONFIG_MODULE_SIG_SHA3_512 is not set
CONFIG_MODULE_SIG_HASH="sha256"
CONFIG_MODULE_COMPRESS_NONE=y
# CONFIG_MODULE_COMPRESS_GZIP is not set
# CONFIG_MODULE_COMPRESS_XZ is not set
# CONFIG_MODULE_COMPRESS_ZSTD is not set
# CONFIG_MODULE_ALLOW_MISSING_NAMESPACE_IMPORTS is not set
CONFIG_MODPROBE_PATH="/sbin/modprobe"
# CONFIG_TRIM_UNUSED_KSYMS is not set
CONFIG_MODULES_TREE_LOOKUP=y
CONFIG_BLOCK=y
CONFIG_BLOCK_LEGACY_AUTOLOAD=y
# CONFIG_BLK_DEV_BSGLIB is not set
# CONFIG_BLK_DEV_INTEGRITY is not set
CONFIG_BLK_DEV_WRITE_MOUNTED=y
# CONFIG_BLK_DEV_ZONED is not set
# CONFIG_BLK_WBT is not set
CONFIG_BLK_DEBUG_FS=y
# CONFIG_BLK_SED_OPAL is not set
# CONFIG_BLK_INLINE_ENCRYPTION is not set
#
# Partition Types
#
# CONFIG_PARTITION_ADVANCED is not set
CONFIG_MSDOS_PARTITION=y
CONFIG_EFI_PARTITION=y
# end of Partition Types
CONFIG_BLK_MQ_PCI=y
CONFIG_BLK_MQ_VIRTIO=y
CONFIG_BLK_PM=y
#
# IO Schedulers
#
CONFIG_MQ_IOSCHED_DEADLINE=y
CONFIG_MQ_IOSCHED_KYBER=y
# CONFIG_IOSCHED_BFQ is not set
# end of IO Schedulers
CONFIG_ASN1=y
CONFIG_UNINLINE_SPIN_UNLOCK=y
CONFIG_ARCH_SUPPORTS_ATOMIC_RMW=y
CONFIG_ARCH_USE_QUEUED_SPINLOCKS=y
CONFIG_ARCH_USE_QUEUED_RWLOCKS=y
CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE=y
CONFIG_ARCH_HAS_SYNC_CORE_BEFORE_USERMODE=y
CONFIG_ARCH_HAS_SYSCALL_WRAPPER=y
CONFIG_FREEZER=y
#
# Executable file formats
#
CONFIG_BINFMT_ELF=y
CONFIG_ELFCORE=y
CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS=y
CONFIG_BINFMT_SCRIPT=y
# CONFIG_BINFMT_MISC is not set
CONFIG_COREDUMP=y
# end of Executable file formats
#
# Memory Management options
#
CONFIG_SWAP=y
# CONFIG_ZSWAP is not set
CONFIG_ZSMALLOC=y
# CONFIG_ZSMALLOC_STAT is not set
CONFIG_ZSMALLOC_CHAIN_SIZE=8
#
# Slab allocator options
#
CONFIG_SLUB=y
CONFIG_SLAB_MERGE_DEFAULT=y
CONFIG_SLAB_FREELIST_RANDOM=y
CONFIG_SLAB_FREELIST_HARDENED=y
# CONFIG_SLUB_STATS is not set
# CONFIG_RANDOM_KMALLOC_CACHES is not set
# end of Slab allocator options
# CONFIG_SHUFFLE_PAGE_ALLOCATOR is not set
CONFIG_COMPAT_BRK=y
CONFIG_SPARSEMEM=y
CONFIG_SPARSEMEM_EXTREME=y
CONFIG_SPARSEMEM_VMEMMAP_ENABLE=y
CONFIG_SPARSEMEM_VMEMMAP=y
CONFIG_ARCH_WANT_OPTIMIZE_DAX_VMEMMAP=y
CONFIG_ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP=y
CONFIG_HAVE_FAST_GUP=y
CONFIG_EXCLUSIVE_SYSTEM_RAM=y
CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG=y
# CONFIG_MEMORY_HOTPLUG is not set
CONFIG_ARCH_MHP_MEMMAP_ON_MEMORY_ENABLE=y
CONFIG_SPLIT_PTLOCK_CPUS=4
CONFIG_ARCH_ENABLE_SPLIT_PMD_PTLOCK=y
CONFIG_COMPACTION=y
CONFIG_COMPACT_UNEVICTABLE_DEFAULT=1
# CONFIG_PAGE_REPORTING is not set
CONFIG_MIGRATION=y
CONFIG_PCP_BATCH_SCALE_MAX=5
CONFIG_PHYS_ADDR_T_64BIT=y
# CONFIG_KSM is not set
CONFIG_DEFAULT_MMAP_MIN_ADDR=4096
CONFIG_ARCH_SUPPORTS_MEMORY_FAILURE=y
# CONFIG_MEMORY_FAILURE is not set
CONFIG_ARCH_WANT_GENERAL_HUGETLB=y
CONFIG_ARCH_WANTS_THP_SWAP=y
# CONFIG_TRANSPARENT_HUGEPAGE is not set
CONFIG_NEED_PER_CPU_KM=y
CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK=y
CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK=y
CONFIG_HAVE_SETUP_PER_CPU_AREA=y
# CONFIG_CMA is not set
CONFIG_GENERIC_EARLY_IOREMAP=y
# CONFIG_IDLE_PAGE_TRACKING is not set
CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y
CONFIG_ARCH_HAS_CURRENT_STACK_POINTER=y
CONFIG_ARCH_HAS_PTE_DEVMAP=y
CONFIG_ZONE_DMA=y
CONFIG_ZONE_DMA32=y
CONFIG_VM_EVENT_COUNTERS=y
# CONFIG_PERCPU_STATS is not set
# CONFIG_GUP_TEST is not set
# CONFIG_DMAPOOL_TEST is not set
CONFIG_ARCH_HAS_PTE_SPECIAL=y
CONFIG_MEMFD_CREATE=y
CONFIG_SECRETMEM=y
# CONFIG_ANON_VMA_NAME is not set
# CONFIG_USERFAULTFD is not set
# CONFIG_LRU_GEN is not set
CONFIG_ARCH_SUPPORTS_PER_VMA_LOCK=y
CONFIG_LOCK_MM_AND_FIND_VMA=y
#
# Data Access Monitoring
#
# CONFIG_DAMON is not set
# end of Data Access Monitoring
# end of Memory Management options
CONFIG_NET=y
CONFIG_NET_INGRESS=y
CONFIG_NET_EGRESS=y
CONFIG_NET_XGRESS=y
CONFIG_SKB_EXTENSIONS=y
#
# Networking options
#
CONFIG_PACKET=y
CONFIG_PACKET_DIAG=y
CONFIG_UNIX=y
CONFIG_AF_UNIX_OOB=y
# CONFIG_UNIX_DIAG is not set
# CONFIG_TLS is not set
CONFIG_XFRM=y
CONFIG_XFRM_OFFLOAD=y
CONFIG_XFRM_ALGO=y
CONFIG_XFRM_USER=y
# CONFIG_XFRM_INTERFACE is not set
CONFIG_XFRM_SUB_POLICY=y
CONFIG_XFRM_MIGRATE=y
CONFIG_XFRM_STATISTICS=y
CONFIG_XFRM_ESP=y
CONFIG_XFRM_IPCOMP=y
# CONFIG_NET_KEY is not set
CONFIG_XFRM_IPTFS=y
CONFIG_XFRM_ESPINTCP=y
CONFIG_XDP_SOCKETS=y
CONFIG_XDP_SOCKETS_DIAG=y
CONFIG_NET_HANDSHAKE=y
CONFIG_INET=y
# CONFIG_IP_MULTICAST is not set
CONFIG_IP_ADVANCED_ROUTER=y
CONFIG_IP_FIB_TRIE_STATS=y
CONFIG_IP_MULTIPLE_TABLES=y
CONFIG_IP_ROUTE_MULTIPATH=y
# CONFIG_IP_ROUTE_VERBOSE is not set
CONFIG_IP_PNP=y
CONFIG_IP_PNP_DHCP=y
CONFIG_IP_PNP_BOOTP=y
# CONFIG_IP_PNP_RARP is not set
CONFIG_NET_IPIP=y
# CONFIG_NET_IPGRE_DEMUX is not set
CONFIG_NET_IP_TUNNEL=y
CONFIG_SYN_COOKIES=y
# CONFIG_NET_IPVTI is not set
# CONFIG_NET_FOU is not set
# CONFIG_NET_FOU_IP_TUNNELS is not set
# CONFIG_INET_AH is not set
CONFIG_INET_ESP=y
CONFIG_INET_ESP_OFFLOAD=y
CONFIG_INET_ESPINTCP=y
CONFIG_INET_IPCOMP=y
CONFIG_INET_TABLE_PERTURB_ORDER=16
CONFIG_INET_XFRM_TUNNEL=y
CONFIG_INET_TUNNEL=y
CONFIG_INET_DIAG=y
CONFIG_INET_TCP_DIAG=y
# CONFIG_INET_UDP_DIAG is not set
# CONFIG_INET_RAW_DIAG is not set
# CONFIG_INET_DIAG_DESTROY is not set
# CONFIG_TCP_CONG_ADVANCED is not set
CONFIG_TCP_CONG_CUBIC=y
CONFIG_DEFAULT_TCP_CONG="cubic"
# CONFIG_TCP_AO is not set
# CONFIG_TCP_MD5SIG is not set
CONFIG_IPV6=y
# CONFIG_IPV6_ROUTER_PREF is not set
# CONFIG_IPV6_OPTIMISTIC_DAD is not set
# CONFIG_INET6_AH is not set
CONFIG_INET6_ESP=y
CONFIG_INET6_ESP_OFFLOAD=y
# CONFIG_INET6_ESPINTCP is not set
CONFIG_INET6_IPCOMP=y
CONFIG_IPV6_MIP6=y
# CONFIG_IPV6_ILA is not set
CONFIG_INET6_XFRM_TUNNEL=y
CONFIG_INET6_TUNNEL=y
# CONFIG_IPV6_VTI is not set
CONFIG_IPV6_SIT=y
# CONFIG_IPV6_SIT_6RD is not set
CONFIG_IPV6_NDISC_NODETYPE=y
# CONFIG_IPV6_TUNNEL is not set
# CONFIG_IPV6_MULTIPLE_TABLES is not set
# CONFIG_IPV6_MROUTE is not set
# CONFIG_IPV6_SEG6_LWTUNNEL is not set
# CONFIG_IPV6_SEG6_HMAC is not set
# CONFIG_IPV6_RPL_LWTUNNEL is not set
# CONFIG_IPV6_IOAM6_LWTUNNEL is not set
CONFIG_MPTCP=y
CONFIG_INET_MPTCP_DIAG=y
CONFIG_MPTCP_IPV6=y
# CONFIG_NETWORK_SECMARK is not set
# CONFIG_NETWORK_PHY_TIMESTAMPING is not set
CONFIG_NETFILTER=y
CONFIG_NETFILTER_ADVANCED=y
CONFIG_BRIDGE_NETFILTER=y
#
# Core Netfilter Configuration
#
CONFIG_NETFILTER_INGRESS=y
CONFIG_NETFILTER_EGRESS=y
CONFIG_NETFILTER_NETLINK=y
CONFIG_NETFILTER_FAMILY_BRIDGE=y
CONFIG_NETFILTER_FAMILY_ARP=y
CONFIG_NETFILTER_BPF_LINK=y
CONFIG_NETFILTER_NETLINK_HOOK=y
CONFIG_NETFILTER_NETLINK_ACCT=y
CONFIG_NETFILTER_NETLINK_QUEUE=y
CONFIG_NETFILTER_NETLINK_LOG=y
# CONFIG_NETFILTER_NETLINK_OSF is not set
CONFIG_NF_CONNTRACK=y
CONFIG_NF_LOG_SYSLOG=y
CONFIG_NETFILTER_CONNCOUNT=y
CONFIG_NF_CONNTRACK_MARK=y
CONFIG_NF_CONNTRACK_ZONES=y
CONFIG_NF_CONNTRACK_PROCFS=y
CONFIG_NF_CONNTRACK_EVENTS=y
CONFIG_NF_CONNTRACK_TIMEOUT=y
CONFIG_NF_CONNTRACK_TIMESTAMP=y
CONFIG_NF_CONNTRACK_LABELS=y
CONFIG_NF_CT_PROTO_DCCP=y
CONFIG_NF_CT_PROTO_SCTP=y
CONFIG_NF_CT_PROTO_UDPLITE=y
# CONFIG_NF_CONNTRACK_AMANDA is not set
# CONFIG_NF_CONNTRACK_FTP is not set
# CONFIG_NF_CONNTRACK_H323 is not set
# CONFIG_NF_CONNTRACK_IRC is not set
# CONFIG_NF_CONNTRACK_NETBIOS_NS is not set
# CONFIG_NF_CONNTRACK_SNMP is not set
# CONFIG_NF_CONNTRACK_PPTP is not set
# CONFIG_NF_CONNTRACK_SANE is not set
# CONFIG_NF_CONNTRACK_SIP is not set
# CONFIG_NF_CONNTRACK_TFTP is not set
CONFIG_NF_CT_NETLINK=y
# CONFIG_NF_CT_NETLINK_TIMEOUT is not set
# CONFIG_NF_CT_NETLINK_HELPER is not set
CONFIG_NETFILTER_NETLINK_GLUE_CT=y
CONFIG_NF_NAT=y
CONFIG_NF_NAT_REDIRECT=y
CONFIG_NF_NAT_MASQUERADE=y
CONFIG_NETFILTER_SYNPROXY=y
CONFIG_NF_TABLES=y
CONFIG_NF_TABLES_INET=y
CONFIG_NF_TABLES_NETDEV=y
CONFIG_NFT_NUMGEN=y
CONFIG_NFT_CT=y
CONFIG_NFT_FLOW_OFFLOAD=y
CONFIG_NFT_CONNLIMIT=y
CONFIG_NFT_LOG=y
CONFIG_NFT_LIMIT=y
CONFIG_NFT_MASQ=y
CONFIG_NFT_REDIR=y
CONFIG_NFT_NAT=y
CONFIG_NFT_TUNNEL=y
CONFIG_NFT_QUEUE=y
CONFIG_NFT_QUOTA=y
CONFIG_NFT_REJECT=y
CONFIG_NFT_REJECT_INET=y
CONFIG_NFT_COMPAT=y
CONFIG_NFT_HASH=y
CONFIG_NFT_XFRM=y
CONFIG_NFT_SOCKET=y
# CONFIG_NFT_OSF is not set
# CONFIG_NFT_TPROXY is not set
# CONFIG_NFT_SYNPROXY is not set
CONFIG_NF_DUP_NETDEV=y
CONFIG_NFT_DUP_NETDEV=y
CONFIG_NFT_FWD_NETDEV=y
# CONFIG_NFT_REJECT_NETDEV is not set
CONFIG_NF_FLOW_TABLE_INET=y
CONFIG_NF_FLOW_TABLE=y
CONFIG_NF_FLOW_TABLE_PROCFS=y
CONFIG_NETFILTER_XTABLES=y
#
# Xtables combined modules
#
CONFIG_NETFILTER_XT_MARK=y
# CONFIG_NETFILTER_XT_CONNMARK is not set
# CONFIG_NETFILTER_XT_SET is not set
#
# Xtables targets
#
# CONFIG_NETFILTER_XT_TARGET_AUDIT is not set
# CONFIG_NETFILTER_XT_TARGET_CHECKSUM is not set
# CONFIG_NETFILTER_XT_TARGET_CLASSIFY is not set
# CONFIG_NETFILTER_XT_TARGET_CONNMARK is not set
# CONFIG_NETFILTER_XT_TARGET_CT is not set
# CONFIG_NETFILTER_XT_TARGET_DSCP is not set
# CONFIG_NETFILTER_XT_TARGET_HL is not set
# CONFIG_NETFILTER_XT_TARGET_HMARK is not set
# CONFIG_NETFILTER_XT_TARGET_IDLETIMER is not set
CONFIG_NETFILTER_XT_TARGET_LOG=y
CONFIG_NETFILTER_XT_TARGET_MARK=y
CONFIG_NETFILTER_XT_NAT=y
# CONFIG_NETFILTER_XT_TARGET_NETMAP is not set
CONFIG_NETFILTER_XT_TARGET_NFLOG=y
CONFIG_NETFILTER_XT_TARGET_NFQUEUE=y
# CONFIG_NETFILTER_XT_TARGET_NOTRACK is not set
# CONFIG_NETFILTER_XT_TARGET_RATEEST is not set
# CONFIG_NETFILTER_XT_TARGET_REDIRECT is not set
CONFIG_NETFILTER_XT_TARGET_MASQUERADE=y
# CONFIG_NETFILTER_XT_TARGET_TEE is not set
# CONFIG_NETFILTER_XT_TARGET_TPROXY is not set
# CONFIG_NETFILTER_XT_TARGET_TRACE is not set
# CONFIG_NETFILTER_XT_TARGET_TCPMSS is not set
# CONFIG_NETFILTER_XT_TARGET_TCPOPTSTRIP is not set
#
# Xtables matches
#
# CONFIG_NETFILTER_XT_MATCH_ADDRTYPE is not set
# CONFIG_NETFILTER_XT_MATCH_BPF is not set
# CONFIG_NETFILTER_XT_MATCH_CGROUP is not set
# CONFIG_NETFILTER_XT_MATCH_CLUSTER is not set
# CONFIG_NETFILTER_XT_MATCH_COMMENT is not set
# CONFIG_NETFILTER_XT_MATCH_CONNBYTES is not set
# CONFIG_NETFILTER_XT_MATCH_CONNLABEL is not set
# CONFIG_NETFILTER_XT_MATCH_CONNLIMIT is not set
# CONFIG_NETFILTER_XT_MATCH_CONNMARK is not set
# CONFIG_NETFILTER_XT_MATCH_CONNTRACK is not set
# CONFIG_NETFILTER_XT_MATCH_CPU is not set
# CONFIG_NETFILTER_XT_MATCH_DCCP is not set
# CONFIG_NETFILTER_XT_MATCH_DEVGROUP is not set
# CONFIG_NETFILTER_XT_MATCH_DSCP is not set
# CONFIG_NETFILTER_XT_MATCH_ECN is not set
# CONFIG_NETFILTER_XT_MATCH_ESP is not set
# CONFIG_NETFILTER_XT_MATCH_HASHLIMIT is not set
# CONFIG_NETFILTER_XT_MATCH_HELPER is not set
CONFIG_NETFILTER_XT_MATCH_HL=y
# CONFIG_NETFILTER_XT_MATCH_IPCOMP is not set
# CONFIG_NETFILTER_XT_MATCH_IPRANGE is not set
# CONFIG_NETFILTER_XT_MATCH_L2TP is not set
# CONFIG_NETFILTER_XT_MATCH_LENGTH is not set
# CONFIG_NETFILTER_XT_MATCH_LIMIT is not set
# CONFIG_NETFILTER_XT_MATCH_MAC is not set
# CONFIG_NETFILTER_XT_MATCH_MARK is not set
# CONFIG_NETFILTER_XT_MATCH_MULTIPORT is not set
# CONFIG_NETFILTER_XT_MATCH_NFACCT is not set
# CONFIG_NETFILTER_XT_MATCH_OSF is not set
# CONFIG_NETFILTER_XT_MATCH_OWNER is not set
# CONFIG_NETFILTER_XT_MATCH_POLICY is not set
# CONFIG_NETFILTER_XT_MATCH_PHYSDEV is not set
# CONFIG_NETFILTER_XT_MATCH_PKTTYPE is not set
# CONFIG_NETFILTER_XT_MATCH_QUOTA is not set
# CONFIG_NETFILTER_XT_MATCH_RATEEST is not set
# CONFIG_NETFILTER_XT_MATCH_REALM is not set
# CONFIG_NETFILTER_XT_MATCH_RECENT is not set
# CONFIG_NETFILTER_XT_MATCH_SCTP is not set
# CONFIG_NETFILTER_XT_MATCH_SOCKET is not set
# CONFIG_NETFILTER_XT_MATCH_STATE is not set
# CONFIG_NETFILTER_XT_MATCH_STATISTIC is not set
# CONFIG_NETFILTER_XT_MATCH_STRING is not set
# CONFIG_NETFILTER_XT_MATCH_TCPMSS is not set
# CONFIG_NETFILTER_XT_MATCH_TIME is not set
# CONFIG_NETFILTER_XT_MATCH_U32 is not set
# end of Core Netfilter Configuration
CONFIG_IP_SET=y
CONFIG_IP_SET_MAX=256
# CONFIG_IP_SET_BITMAP_IP is not set
# CONFIG_IP_SET_BITMAP_IPMAC is not set
# CONFIG_IP_SET_BITMAP_PORT is not set
# CONFIG_IP_SET_HASH_IP is not set
# CONFIG_IP_SET_HASH_IPMARK is not set
# CONFIG_IP_SET_HASH_IPPORT is not set
# CONFIG_IP_SET_HASH_IPPORTIP is not set
# CONFIG_IP_SET_HASH_IPPORTNET is not set
# CONFIG_IP_SET_HASH_IPMAC is not set
# CONFIG_IP_SET_HASH_MAC is not set
# CONFIG_IP_SET_HASH_NETPORTNET is not set
# CONFIG_IP_SET_HASH_NET is not set
# CONFIG_IP_SET_HASH_NETNET is not set
# CONFIG_IP_SET_HASH_NETPORT is not set
# CONFIG_IP_SET_HASH_NETIFACE is not set
# CONFIG_IP_SET_LIST_SET is not set
# CONFIG_IP_VS is not set
#
# IP: Netfilter Configuration
#
CONFIG_NF_DEFRAG_IPV4=y
CONFIG_NF_SOCKET_IPV4=y
# CONFIG_NF_TPROXY_IPV4 is not set
CONFIG_NF_TABLES_IPV4=y
CONFIG_NFT_REJECT_IPV4=y
# CONFIG_NFT_DUP_IPV4 is not set
# CONFIG_NFT_FIB_IPV4 is not set
CONFIG_NF_TABLES_ARP=y
# CONFIG_NF_DUP_IPV4 is not set
# CONFIG_NF_LOG_ARP is not set
CONFIG_NF_LOG_IPV4=y
CONFIG_NF_REJECT_IPV4=y
# CONFIG_IP_NF_IPTABLES is not set
CONFIG_NFT_COMPAT_ARP=y
# CONFIG_IP_NF_ARPFILTER is not set
# CONFIG_IP_NF_ARP_MANGLE is not set
# end of IP: Netfilter Configuration
#
# IPv6: Netfilter Configuration
#
CONFIG_IP6_NF_IPTABLES_LEGACY=y
CONFIG_NF_SOCKET_IPV6=y
# CONFIG_NF_TPROXY_IPV6 is not set
CONFIG_NF_TABLES_IPV6=y
CONFIG_NFT_REJECT_IPV6=y
# CONFIG_NFT_DUP_IPV6 is not set
# CONFIG_NFT_FIB_IPV6 is not set
# CONFIG_NF_DUP_IPV6 is not set
CONFIG_NF_REJECT_IPV6=y
CONFIG_NF_LOG_IPV6=y
CONFIG_IP6_NF_IPTABLES=y
CONFIG_IP6_NF_MATCH_AH=y
CONFIG_IP6_NF_MATCH_EUI64=y
CONFIG_IP6_NF_MATCH_FRAG=y
CONFIG_IP6_NF_MATCH_OPTS=y
CONFIG_IP6_NF_MATCH_HL=y
CONFIG_IP6_NF_MATCH_IPV6HEADER=y
CONFIG_IP6_NF_MATCH_MH=y
# CONFIG_IP6_NF_MATCH_RPFILTER is not set
CONFIG_IP6_NF_MATCH_RT=y
CONFIG_IP6_NF_MATCH_SRH=y
# CONFIG_IP6_NF_TARGET_HL is not set
CONFIG_IP6_NF_FILTER=y
CONFIG_IP6_NF_TARGET_REJECT=y
CONFIG_IP6_NF_TARGET_SYNPROXY=y
CONFIG_IP6_NF_MANGLE=y
CONFIG_IP6_NF_RAW=y
CONFIG_IP6_NF_NAT=y
CONFIG_IP6_NF_TARGET_MASQUERADE=y
CONFIG_IP6_NF_TARGET_NPT=y
# end of IPv6: Netfilter Configuration
CONFIG_NF_DEFRAG_IPV6=y
CONFIG_NF_TABLES_BRIDGE=y
# CONFIG_NFT_BRIDGE_META is not set
# CONFIG_NFT_BRIDGE_REJECT is not set
CONFIG_NF_CONNTRACK_BRIDGE=y
CONFIG_BRIDGE_NF_EBTABLES=y
# CONFIG_BRIDGE_EBT_BROUTE is not set
# CONFIG_BRIDGE_EBT_T_FILTER is not set
# CONFIG_BRIDGE_EBT_T_NAT is not set
# CONFIG_BRIDGE_EBT_802_3 is not set
# CONFIG_BRIDGE_EBT_AMONG is not set
# CONFIG_BRIDGE_EBT_ARP is not set
# CONFIG_BRIDGE_EBT_IP is not set
# CONFIG_BRIDGE_EBT_IP6 is not set
# CONFIG_BRIDGE_EBT_LIMIT is not set
# CONFIG_BRIDGE_EBT_MARK is not set
# CONFIG_BRIDGE_EBT_PKTTYPE is not set
# CONFIG_BRIDGE_EBT_STP is not set
# CONFIG_BRIDGE_EBT_VLAN is not set
# CONFIG_BRIDGE_EBT_ARPREPLY is not set
# CONFIG_BRIDGE_EBT_DNAT is not set
# CONFIG_BRIDGE_EBT_MARK_T is not set
# CONFIG_BRIDGE_EBT_REDIRECT is not set
# CONFIG_BRIDGE_EBT_SNAT is not set
# CONFIG_BRIDGE_EBT_LOG is not set
# CONFIG_BRIDGE_EBT_NFLOG is not set
# CONFIG_IP_DCCP is not set
# CONFIG_IP_SCTP is not set
# CONFIG_RDS is not set
# CONFIG_TIPC is not set
# CONFIG_ATM is not set
# CONFIG_L2TP is not set
CONFIG_STP=y
CONFIG_GARP=y
CONFIG_BRIDGE=y
CONFIG_BRIDGE_IGMP_SNOOPING=y
# CONFIG_BRIDGE_VLAN_FILTERING is not set
# CONFIG_BRIDGE_MRP is not set
# CONFIG_BRIDGE_CFM is not set
CONFIG_NET_DSA=y
# CONFIG_NET_DSA_TAG_NONE is not set
CONFIG_NET_DSA_TAG_AR9331=y
# CONFIG_NET_DSA_TAG_BRCM is not set
# CONFIG_NET_DSA_TAG_BRCM_LEGACY is not set
# CONFIG_NET_DSA_TAG_BRCM_PREPEND is not set
# CONFIG_NET_DSA_TAG_HELLCREEK is not set
# CONFIG_NET_DSA_TAG_GSWIP is not set
# CONFIG_NET_DSA_TAG_DSA is not set
# CONFIG_NET_DSA_TAG_EDSA is not set
# CONFIG_NET_DSA_TAG_MTK is not set
# CONFIG_NET_DSA_TAG_KSZ is not set
# CONFIG_NET_DSA_TAG_OCELOT is not set
# CONFIG_NET_DSA_TAG_OCELOT_8021Q is not set
# CONFIG_NET_DSA_TAG_QCA is not set
# CONFIG_NET_DSA_TAG_RTL4_A is not set
# CONFIG_NET_DSA_TAG_RTL8_4 is not set
# CONFIG_NET_DSA_TAG_RZN1_A5PSW is not set
# CONFIG_NET_DSA_TAG_LAN9303 is not set
# CONFIG_NET_DSA_TAG_SJA1105 is not set
# CONFIG_NET_DSA_TAG_TRAILER is not set
# CONFIG_NET_DSA_TAG_XRS700X is not set
CONFIG_VLAN_8021Q=y
CONFIG_VLAN_8021Q_GVRP=y
# CONFIG_VLAN_8021Q_MVRP is not set
CONFIG_LLC=y
# CONFIG_LLC2 is not set
# CONFIG_ATALK is not set
# CONFIG_X25 is not set
# CONFIG_LAPB is not set
# CONFIG_PHONET is not set
# CONFIG_6LOWPAN is not set
# CONFIG_IEEE802154 is not set
# CONFIG_NET_SCHED is not set
# CONFIG_DCB is not set
CONFIG_DNS_RESOLVER=y
# CONFIG_BATMAN_ADV is not set
# CONFIG_OPENVSWITCH is not set
# CONFIG_VSOCKETS is not set
CONFIG_NETLINK_DIAG=y
# CONFIG_MPLS is not set
CONFIG_NET_NSH=y
# CONFIG_HSR is not set
CONFIG_NET_SWITCHDEV=y
# CONFIG_NET_L3_MASTER_DEV is not set
# CONFIG_QRTR is not set
# CONFIG_NET_NCSI is not set
CONFIG_MAX_SKB_FRAGS=17
# CONFIG_CGROUP_NET_PRIO is not set
# CONFIG_CGROUP_NET_CLASSID is not set
CONFIG_NET_RX_BUSY_POLL=y
CONFIG_BQL=y
CONFIG_BPF_STREAM_PARSER=y
#
# Network testing
#
CONFIG_NET_PKTGEN=y
CONFIG_NET_DROP_MONITOR=y
# end of Network testing
# end of Networking options
# CONFIG_HAMRADIO is not set
# CONFIG_CAN is not set
# CONFIG_BT is not set
# CONFIG_AF_RXRPC is not set
# CONFIG_AF_KCM is not set
CONFIG_STREAM_PARSER=y
# CONFIG_MCTP is not set
CONFIG_FIB_RULES=y
CONFIG_WIRELESS=y
# CONFIG_CFG80211 is not set
#
# CFG80211 needs to be enabled for MAC80211
#
CONFIG_MAC80211_STA_HASH_MAX_SIZE=0
# CONFIG_RFKILL is not set
CONFIG_NET_9P=y
CONFIG_NET_9P_FD=y
CONFIG_NET_9P_VIRTIO=y
# CONFIG_NET_9P_DEBUG is not set
# CONFIG_CAIF is not set
# CONFIG_CEPH_LIB is not set
# CONFIG_NFC is not set
# CONFIG_PSAMPLE is not set
# CONFIG_NET_IFE is not set
# CONFIG_LWTUNNEL is not set
CONFIG_DST_CACHE=y
CONFIG_GRO_CELLS=y
CONFIG_NET_SELFTESTS=y
CONFIG_NET_SOCK_MSG=y
CONFIG_NET_DEVLINK=y
CONFIG_PAGE_POOL=y
# CONFIG_PAGE_POOL_STATS is not set
CONFIG_FAILOVER=y
CONFIG_ETHTOOL_NETLINK=y
#
# Device Drivers
#
CONFIG_HAVE_EISA=y
# CONFIG_EISA is not set
CONFIG_HAVE_PCI=y
CONFIG_GENERIC_PCI_IOMAP=y
CONFIG_PCI=y
CONFIG_PCI_DOMAINS=y
# CONFIG_PCIEPORTBUS is not set
CONFIG_PCIEASPM=y
CONFIG_PCIEASPM_DEFAULT=y
# CONFIG_PCIEASPM_POWERSAVE is not set
# CONFIG_PCIEASPM_POWER_SUPERSAVE is not set
# CONFIG_PCIEASPM_PERFORMANCE is not set
# CONFIG_PCIE_PTM is not set
CONFIG_PCI_MSI=y
CONFIG_PCI_QUIRKS=y
# CONFIG_PCI_DEBUG is not set
# CONFIG_PCI_STUB is not set
CONFIG_PCI_LOCKLESS_CONFIG=y
# CONFIG_PCI_IOV is not set
# CONFIG_PCI_PRI is not set
# CONFIG_PCI_PASID is not set
CONFIG_PCI_LABEL=y
CONFIG_VGA_ARB=y
CONFIG_VGA_ARB_MAX_GPUS=16
# CONFIG_HOTPLUG_PCI is not set
#
# PCI controller drivers
#
# CONFIG_VMD is not set
#
# Cadence-based PCIe controllers
#
# end of Cadence-based PCIe controllers
#
# DesignWare-based PCIe controllers
#
# CONFIG_PCI_MESON is not set
# CONFIG_PCIE_DW_PLAT_HOST is not set
# end of DesignWare-based PCIe controllers
#
# Mobiveil-based PCIe controllers
#
# end of Mobiveil-based PCIe controllers
# end of PCI controller drivers
#
# PCI Endpoint
#
# CONFIG_PCI_ENDPOINT is not set
# end of PCI Endpoint
#
# PCI switch controller drivers
#
# CONFIG_PCI_SW_SWITCHTEC is not set
# end of PCI switch controller drivers
# CONFIG_CXL_BUS is not set
# CONFIG_PCCARD is not set
# CONFIG_RAPIDIO is not set
#
# Generic Driver Options
#
CONFIG_UEVENT_HELPER=y
CONFIG_UEVENT_HELPER_PATH=""
CONFIG_DEVTMPFS=y
CONFIG_DEVTMPFS_MOUNT=y
# CONFIG_DEVTMPFS_SAFE is not set
CONFIG_STANDALONE=y
CONFIG_PREVENT_FIRMWARE_BUILD=y
#
# Firmware loader
#
CONFIG_FW_LOADER=y
CONFIG_EXTRA_FIRMWARE=""
# CONFIG_FW_LOADER_USER_HELPER is not set
# CONFIG_FW_LOADER_COMPRESS is not set
CONFIG_FW_CACHE=y
# CONFIG_FW_UPLOAD is not set
# end of Firmware loader
CONFIG_ALLOW_DEV_COREDUMP=y
# CONFIG_DEBUG_DRIVER is not set
# CONFIG_DEBUG_DEVRES is not set
# CONFIG_DEBUG_TEST_DRIVER_REMOVE is not set
# CONFIG_TEST_ASYNC_DRIVER_PROBE is not set
CONFIG_GENERIC_CPU_DEVICES=y
CONFIG_GENERIC_CPU_AUTOPROBE=y
CONFIG_GENERIC_CPU_VULNERABILITIES=y
# CONFIG_FW_DEVLINK_SYNC_STATE_TIMEOUT is not set
# end of Generic Driver Options
#
# Bus devices
#
# CONFIG_MHI_BUS is not set
# CONFIG_MHI_BUS_EP is not set
# end of Bus devices
#
# Cache Drivers
#
# end of Cache Drivers
# CONFIG_CONNECTOR is not set
#
# Firmware Drivers
#
#
# ARM System Control and Management Interface Protocol
#
# end of ARM System Control and Management Interface Protocol
# CONFIG_EDD is not set
CONFIG_FIRMWARE_MEMMAP=y
CONFIG_DMIID=y
# CONFIG_DMI_SYSFS is not set
CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK=y
# CONFIG_FW_CFG_SYSFS is not set
# CONFIG_SYSFB_SIMPLEFB is not set
# CONFIG_GOOGLE_FIRMWARE is not set
#
# Qualcomm firmware drivers
#
# end of Qualcomm firmware drivers
#
# Tegra firmware driver
#
# end of Tegra firmware driver
# end of Firmware Drivers
# CONFIG_GNSS is not set
# CONFIG_MTD is not set
# CONFIG_OF is not set
CONFIG_ARCH_MIGHT_HAVE_PC_PARPORT=y
# CONFIG_PARPORT is not set
CONFIG_PNP=y
CONFIG_PNP_DEBUG_MESSAGES=y
#
# Protocols
#
CONFIG_PNPACPI=y
CONFIG_BLK_DEV=y
# CONFIG_BLK_DEV_NULL_BLK is not set
# CONFIG_BLK_DEV_FD is not set
# CONFIG_BLK_DEV_PCIESSD_MTIP32XX is not set
CONFIG_ZRAM=y
CONFIG_ZRAM_DEF_COMP_LZORLE=y
# CONFIG_ZRAM_DEF_COMP_ZSTD is not set
# CONFIG_ZRAM_DEF_COMP_LZ4 is not set
# CONFIG_ZRAM_DEF_COMP_LZO is not set
CONFIG_ZRAM_DEF_COMP="lzo-rle"
# CONFIG_ZRAM_WRITEBACK is not set
# CONFIG_ZRAM_TRACK_ENTRY_ACTIME is not set
# CONFIG_ZRAM_MEMORY_TRACKING is not set
# CONFIG_ZRAM_MULTI_COMP is not set
# CONFIG_BLK_DEV_LOOP is not set
# CONFIG_BLK_DEV_DRBD is not set
# CONFIG_BLK_DEV_NBD is not set
# CONFIG_BLK_DEV_RAM is not set
# CONFIG_ATA_OVER_ETH is not set
CONFIG_VIRTIO_BLK=y
# CONFIG_BLK_DEV_RBD is not set
# CONFIG_BLK_DEV_UBLK is not set
#
# NVME Support
#
# CONFIG_BLK_DEV_NVME is not set
# CONFIG_NVME_FC is not set
# CONFIG_NVME_TCP is not set
# end of NVME Support
#
# Misc devices
#
# CONFIG_DUMMY_IRQ is not set
# CONFIG_IBM_ASM is not set
# CONFIG_PHANTOM is not set
# CONFIG_TIFM_CORE is not set
# CONFIG_ENCLOSURE_SERVICES is not set
# CONFIG_HP_ILO is not set
# CONFIG_SRAM is not set
# CONFIG_DW_XDATA_PCIE is not set
# CONFIG_PCI_ENDPOINT_TEST is not set
# CONFIG_XILINX_SDFEC is not set
# CONFIG_NSM is not set
# CONFIG_C2PORT is not set
#
# EEPROM support
#
# CONFIG_EEPROM_93CX6 is not set
# end of EEPROM support
# CONFIG_CB710_CORE is not set
#
# Texas Instruments shared transport line discipline
#
# end of Texas Instruments shared transport line discipline
#
# Altera FPGA firmware download module (requires I2C)
#
# CONFIG_INTEL_MEI is not set
# CONFIG_VMWARE_VMCI is not set
# CONFIG_GENWQE is not set
# CONFIG_ECHO is not set
# CONFIG_BCM_VK is not set
# CONFIG_MISC_ALCOR_PCI is not set
# CONFIG_MISC_RTSX_PCI is not set
# CONFIG_UACCE is not set
# CONFIG_PVPANIC is not set
# end of Misc devices
#
# SCSI device support
#
CONFIG_SCSI_MOD=y
# CONFIG_RAID_ATTRS is not set
# CONFIG_SCSI is not set
# end of SCSI device support
# CONFIG_ATA is not set
# CONFIG_MD is not set
# CONFIG_TARGET_CORE is not set
# CONFIG_FUSION is not set
#
# IEEE 1394 (FireWire) support
#
# CONFIG_FIREWIRE is not set
# CONFIG_FIREWIRE_NOSY is not set
# end of IEEE 1394 (FireWire) support
# CONFIG_MACINTOSH_DRIVERS is not set
CONFIG_NETDEVICES=y
CONFIG_MII=y
CONFIG_NET_CORE=y
# CONFIG_BONDING is not set
CONFIG_DUMMY=y
# CONFIG_WIREGUARD is not set
# CONFIG_EQUALIZER is not set
# CONFIG_IFB is not set
# CONFIG_NET_TEAM is not set
# CONFIG_MACVLAN is not set
# CONFIG_IPVLAN is not set
# CONFIG_VXLAN is not set
# CONFIG_GENEVE is not set
# CONFIG_BAREUDP is not set
# CONFIG_GTP is not set
# CONFIG_PFCP is not set
# CONFIG_MACSEC is not set
CONFIG_NETCONSOLE=y
# CONFIG_NETCONSOLE_EXTENDED_LOG is not set
CONFIG_NETPOLL=y
CONFIG_NET_POLL_CONTROLLER=y
# CONFIG_TUN is not set
# CONFIG_TUN_VNET_CROSS_LE is not set
CONFIG_VETH=y
CONFIG_VIRTIO_NET=y
# CONFIG_NLMON is not set
# CONFIG_NETKIT is not set
# CONFIG_ARCNET is not set
#
# Distributed Switch Architecture drivers
#
# CONFIG_B53 is not set
# CONFIG_NET_DSA_BCM_SF2 is not set
# CONFIG_NET_DSA_LOOP is not set
# CONFIG_NET_DSA_LANTIQ_GSWIP is not set
# CONFIG_NET_DSA_MT7530 is not set
# CONFIG_NET_DSA_MV88E6060 is not set
# CONFIG_NET_DSA_MICROCHIP_KSZ_COMMON is not set
# CONFIG_NET_DSA_MV88E6XXX is not set
# CONFIG_NET_DSA_MSCC_SEVILLE is not set
# CONFIG_NET_DSA_AR9331 is not set
# CONFIG_NET_DSA_QCA8K is not set
# CONFIG_NET_DSA_XRS700X_MDIO is not set
# CONFIG_NET_DSA_REALTEK is not set
# CONFIG_NET_DSA_SMSC_LAN9303_MDIO is not set
# CONFIG_NET_DSA_VITESSE_VSC73XX_PLATFORM is not set
# end of Distributed Switch Architecture drivers
CONFIG_ETHERNET=y
CONFIG_NET_VENDOR_3COM=y
# CONFIG_VORTEX is not set
# CONFIG_TYPHOON is not set
CONFIG_NET_VENDOR_ADAPTEC=y
# CONFIG_ADAPTEC_STARFIRE is not set
CONFIG_NET_VENDOR_AGERE=y
# CONFIG_ET131X is not set
CONFIG_NET_VENDOR_ALACRITECH=y
# CONFIG_SLICOSS is not set
CONFIG_NET_VENDOR_ALTEON=y
# CONFIG_ACENIC is not set
# CONFIG_ALTERA_TSE is not set
CONFIG_NET_VENDOR_AMAZON=y
# CONFIG_ENA_ETHERNET is not set
CONFIG_NET_VENDOR_AMD=y
# CONFIG_AMD8111_ETH is not set
# CONFIG_PCNET32 is not set
# CONFIG_AMD_XGBE is not set
# CONFIG_PDS_CORE is not set
CONFIG_NET_VENDOR_AQUANTIA=y
# CONFIG_AQTION is not set
CONFIG_NET_VENDOR_ARC=y
CONFIG_NET_VENDOR_ASIX=y
CONFIG_NET_VENDOR_ATHEROS=y
# CONFIG_ATL2 is not set
# CONFIG_ATL1 is not set
# CONFIG_ATL1E is not set
# CONFIG_ATL1C is not set
# CONFIG_ALX is not set
# CONFIG_CX_ECAT is not set
CONFIG_NET_VENDOR_BROADCOM=y
# CONFIG_B44 is not set
# CONFIG_BCMGENET is not set
# CONFIG_BNX2 is not set
# CONFIG_CNIC is not set
# CONFIG_TIGON3 is not set
# CONFIG_BNX2X is not set
# CONFIG_SYSTEMPORT is not set
# CONFIG_BNXT is not set
CONFIG_NET_VENDOR_CADENCE=y
# CONFIG_MACB is not set
CONFIG_NET_VENDOR_CAVIUM=y
# CONFIG_THUNDER_NIC_PF is not set
# CONFIG_THUNDER_NIC_VF is not set
# CONFIG_THUNDER_NIC_BGX is not set
# CONFIG_THUNDER_NIC_RGX is not set
# CONFIG_LIQUIDIO is not set
# CONFIG_LIQUIDIO_VF is not set
CONFIG_NET_VENDOR_CHELSIO=y
# CONFIG_CHELSIO_T1 is not set
# CONFIG_CHELSIO_T3 is not set
# CONFIG_CHELSIO_T4 is not set
# CONFIG_CHELSIO_T4VF is not set
CONFIG_NET_VENDOR_CISCO=y
# CONFIG_ENIC is not set
CONFIG_NET_VENDOR_CORTINA=y
CONFIG_NET_VENDOR_DAVICOM=y
# CONFIG_DNET is not set
CONFIG_NET_VENDOR_DEC=y
# CONFIG_NET_TULIP is not set
CONFIG_NET_VENDOR_DLINK=y
# CONFIG_DL2K is not set
# CONFIG_SUNDANCE is not set
CONFIG_NET_VENDOR_EMULEX=y
# CONFIG_BE2NET is not set
CONFIG_NET_VENDOR_ENGLEDER=y
# CONFIG_TSNEP is not set
CONFIG_NET_VENDOR_EZCHIP=y
CONFIG_NET_VENDOR_FUNGIBLE=y
# CONFIG_FUN_ETH is not set
CONFIG_NET_VENDOR_GOOGLE=y
# CONFIG_GVE is not set
CONFIG_NET_VENDOR_HUAWEI=y
# CONFIG_HINIC is not set
CONFIG_NET_VENDOR_I825XX=y
CONFIG_NET_VENDOR_INTEL=y
CONFIG_E100=y
CONFIG_E1000=y
CONFIG_E1000E=y
CONFIG_E1000E_HWTS=y
# CONFIG_IGB is not set
# CONFIG_IGBVF is not set
# CONFIG_IXGBE is not set
# CONFIG_IXGBEVF is not set
# CONFIG_I40E is not set
# CONFIG_I40EVF is not set
# CONFIG_ICE is not set
# CONFIG_FM10K is not set
# CONFIG_IGC is not set
# CONFIG_IDPF is not set
# CONFIG_JME is not set
CONFIG_NET_VENDOR_LITEX=y
CONFIG_NET_VENDOR_MARVELL=y
# CONFIG_MVMDIO is not set
# CONFIG_SKGE is not set
# CONFIG_SKY2 is not set
# CONFIG_OCTEON_EP is not set
# CONFIG_OCTEON_EP_VF is not set
# CONFIG_PRESTERA is not set
CONFIG_NET_VENDOR_MELLANOX=y
# CONFIG_MLX4_EN is not set
# CONFIG_MLX5_CORE is not set
# CONFIG_MLXSW_CORE is not set
# CONFIG_MLXFW is not set
CONFIG_NET_VENDOR_MICREL=y
# CONFIG_KS8851_MLL is not set
# CONFIG_KSZ884X_PCI is not set
CONFIG_NET_VENDOR_MICROCHIP=y
# CONFIG_LAN743X is not set
# CONFIG_VCAP is not set
CONFIG_NET_VENDOR_MICROSEMI=y
CONFIG_NET_VENDOR_MICROSOFT=y
CONFIG_NET_VENDOR_MYRI=y
# CONFIG_MYRI10GE is not set
# CONFIG_FEALNX is not set
CONFIG_NET_VENDOR_NI=y
# CONFIG_NI_XGE_MANAGEMENT_ENET is not set
CONFIG_NET_VENDOR_NATSEMI=y
# CONFIG_NATSEMI is not set
# CONFIG_NS83820 is not set
CONFIG_NET_VENDOR_NETERION=y
# CONFIG_S2IO is not set
CONFIG_NET_VENDOR_NETRONOME=y
# CONFIG_NFP is not set
CONFIG_NET_VENDOR_8390=y
# CONFIG_NE2K_PCI is not set
CONFIG_NET_VENDOR_NVIDIA=y
# CONFIG_FORCEDETH is not set
CONFIG_NET_VENDOR_OKI=y
# CONFIG_ETHOC is not set
CONFIG_NET_VENDOR_PACKET_ENGINES=y
# CONFIG_HAMACHI is not set
# CONFIG_YELLOWFIN is not set
CONFIG_NET_VENDOR_PENSANDO=y
# CONFIG_IONIC is not set
CONFIG_NET_VENDOR_QLOGIC=y
# CONFIG_QLA3XXX is not set
# CONFIG_QLCNIC is not set
# CONFIG_NETXEN_NIC is not set
# CONFIG_QED is not set
CONFIG_NET_VENDOR_BROCADE=y
# CONFIG_BNA is not set
CONFIG_NET_VENDOR_QUALCOMM=y
# CONFIG_QCOM_EMAC is not set
# CONFIG_RMNET is not set
CONFIG_NET_VENDOR_RDC=y
# CONFIG_R6040 is not set
CONFIG_NET_VENDOR_REALTEK=y
# CONFIG_8139CP is not set
# CONFIG_8139TOO is not set
# CONFIG_R8169 is not set
CONFIG_NET_VENDOR_RENESAS=y
CONFIG_NET_VENDOR_ROCKER=y
# CONFIG_ROCKER is not set
CONFIG_NET_VENDOR_SAMSUNG=y
# CONFIG_SXGBE_ETH is not set
CONFIG_NET_VENDOR_SEEQ=y
CONFIG_NET_VENDOR_SILAN=y
# CONFIG_SC92031 is not set
CONFIG_NET_VENDOR_SIS=y
# CONFIG_SIS900 is not set
# CONFIG_SIS190 is not set
CONFIG_NET_VENDOR_SOLARFLARE=y
# CONFIG_SFC is not set
# CONFIG_SFC_FALCON is not set
CONFIG_NET_VENDOR_SMSC=y
# CONFIG_EPIC100 is not set
# CONFIG_SMSC911X is not set
# CONFIG_SMSC9420 is not set
CONFIG_NET_VENDOR_SOCIONEXT=y
CONFIG_NET_VENDOR_STMICRO=y
# CONFIG_STMMAC_ETH is not set
CONFIG_NET_VENDOR_SUN=y
# CONFIG_HAPPYMEAL is not set
# CONFIG_SUNGEM is not set
# CONFIG_CASSINI is not set
# CONFIG_NIU is not set
CONFIG_NET_VENDOR_SYNOPSYS=y
# CONFIG_DWC_XLGMAC is not set
CONFIG_NET_VENDOR_TEHUTI=y
# CONFIG_TEHUTI is not set
CONFIG_NET_VENDOR_TI=y
# CONFIG_TI_CPSW_PHY_SEL is not set
# CONFIG_TLAN is not set
CONFIG_NET_VENDOR_VERTEXCOM=y
CONFIG_NET_VENDOR_VIA=y
# CONFIG_VIA_RHINE is not set
# CONFIG_VIA_VELOCITY is not set
CONFIG_NET_VENDOR_WANGXUN=y
# CONFIG_NGBE is not set
# CONFIG_TXGBE is not set
CONFIG_NET_VENDOR_WIZNET=y
# CONFIG_WIZNET_W5100 is not set
# CONFIG_WIZNET_W5300 is not set
CONFIG_NET_VENDOR_XILINX=y
# CONFIG_XILINX_EMACLITE is not set
# CONFIG_XILINX_LL_TEMAC is not set
# CONFIG_FDDI is not set
# CONFIG_HIPPI is not set
# CONFIG_NET_SB1000 is not set
CONFIG_PHYLINK=y
CONFIG_PHYLIB=y
CONFIG_SWPHY=y
CONFIG_FIXED_PHY=y
#
# MII PHY device drivers
#
# CONFIG_AIR_EN8811H_PHY is not set
# CONFIG_AMD_PHY is not set
# CONFIG_ADIN_PHY is not set
# CONFIG_ADIN1100_PHY is not set
# CONFIG_AQUANTIA_PHY is not set
# CONFIG_AX88796B_PHY is not set
# CONFIG_BROADCOM_PHY is not set
# CONFIG_BCM54140_PHY is not set
# CONFIG_BCM7XXX_PHY is not set
# CONFIG_BCM84881_PHY is not set
# CONFIG_BCM87XX_PHY is not set
# CONFIG_CICADA_PHY is not set
# CONFIG_CORTINA_PHY is not set
# CONFIG_DAVICOM_PHY is not set
# CONFIG_ICPLUS_PHY is not set
# CONFIG_LXT_PHY is not set
# CONFIG_INTEL_XWAY_PHY is not set
# CONFIG_LSI_ET1011C_PHY is not set
# CONFIG_MARVELL_PHY is not set
# CONFIG_MARVELL_10G_PHY is not set
# CONFIG_MARVELL_88Q2XXX_PHY is not set
# CONFIG_MARVELL_88X2222_PHY is not set
# CONFIG_MAXLINEAR_GPHY is not set
# CONFIG_MEDIATEK_GE_PHY is not set
# CONFIG_MICREL_PHY is not set
# CONFIG_MICROCHIP_T1S_PHY is not set
# CONFIG_MICROCHIP_PHY is not set
# CONFIG_MICROCHIP_T1_PHY is not set
# CONFIG_MICROSEMI_PHY is not set
# CONFIG_MOTORCOMM_PHY is not set
# CONFIG_NATIONAL_PHY is not set
# CONFIG_NXP_CBTX_PHY is not set
# CONFIG_NXP_C45_TJA11XX_PHY is not set
# CONFIG_NXP_TJA11XX_PHY is not set
# CONFIG_NCN26000_PHY is not set
# CONFIG_QCA83XX_PHY is not set
# CONFIG_QCA808X_PHY is not set
# CONFIG_QSEMI_PHY is not set
# CONFIG_REALTEK_PHY is not set
# CONFIG_RENESAS_PHY is not set
# CONFIG_ROCKCHIP_PHY is not set
# CONFIG_SMSC_PHY is not set
# CONFIG_STE10XP is not set
# CONFIG_TERANETICS_PHY is not set
# CONFIG_DP83822_PHY is not set
# CONFIG_DP83TC811_PHY is not set
# CONFIG_DP83848_PHY is not set
# CONFIG_DP83867_PHY is not set
# CONFIG_DP83869_PHY is not set
# CONFIG_DP83TD510_PHY is not set
# CONFIG_DP83TG720_PHY is not set
# CONFIG_VITESSE_PHY is not set
# CONFIG_XILINX_GMII2RGMII is not set
# CONFIG_PSE_CONTROLLER is not set
CONFIG_MDIO_DEVICE=y
CONFIG_MDIO_BUS=y
CONFIG_FWNODE_MDIO=y
CONFIG_ACPI_MDIO=y
CONFIG_MDIO_DEVRES=y
# CONFIG_MDIO_BITBANG is not set
# CONFIG_MDIO_BCM_UNIMAC is not set
# CONFIG_MDIO_THUNDER is not set
#
# MDIO Multiplexers
#
#
# PCS device drivers
#
# end of PCS device drivers
# CONFIG_PPP is not set
# CONFIG_SLIP is not set
#
# Host-side USB support is needed for USB Network Adapter support
#
CONFIG_WLAN=y
CONFIG_WLAN_VENDOR_ADMTEK=y
CONFIG_WLAN_VENDOR_ATH=y
# CONFIG_ATH_DEBUG is not set
# CONFIG_ATH5K_PCI is not set
CONFIG_WLAN_VENDOR_ATMEL=y
CONFIG_WLAN_VENDOR_BROADCOM=y
CONFIG_WLAN_VENDOR_INTEL=y
CONFIG_WLAN_VENDOR_INTERSIL=y
CONFIG_WLAN_VENDOR_MARVELL=y
CONFIG_WLAN_VENDOR_MEDIATEK=y
CONFIG_WLAN_VENDOR_MICROCHIP=y
CONFIG_WLAN_VENDOR_PURELIFI=y
CONFIG_WLAN_VENDOR_RALINK=y
CONFIG_WLAN_VENDOR_REALTEK=y
CONFIG_WLAN_VENDOR_RSI=y
CONFIG_WLAN_VENDOR_SILABS=y
CONFIG_WLAN_VENDOR_ST=y
CONFIG_WLAN_VENDOR_TI=y
CONFIG_WLAN_VENDOR_ZYDAS=y
CONFIG_WLAN_VENDOR_QUANTENNA=y
# CONFIG_WAN is not set
#
# Wireless WAN
#
# CONFIG_WWAN is not set
# end of Wireless WAN
# CONFIG_VMXNET3 is not set
# CONFIG_FUJITSU_ES is not set
# CONFIG_NETDEVSIM is not set
CONFIG_NET_FAILOVER=y
# CONFIG_ISDN is not set
#
# Input device support
#
CONFIG_INPUT=y
CONFIG_INPUT_FF_MEMLESS=y
# CONFIG_INPUT_SPARSEKMAP is not set
# CONFIG_INPUT_MATRIXKMAP is not set
CONFIG_INPUT_VIVALDIFMAP=y
#
# Userland interfaces
#
# CONFIG_INPUT_MOUSEDEV is not set
# CONFIG_INPUT_JOYDEV is not set
# CONFIG_INPUT_EVDEV is not set
# CONFIG_INPUT_EVBUG is not set
#
# Input Device Drivers
#
CONFIG_INPUT_KEYBOARD=y
CONFIG_KEYBOARD_ATKBD=y
# CONFIG_KEYBOARD_LKKBD is not set
# CONFIG_KEYBOARD_NEWTON is not set
# CONFIG_KEYBOARD_OPENCORES is not set
# CONFIG_KEYBOARD_SAMSUNG is not set
# CONFIG_KEYBOARD_STOWAWAY is not set
# CONFIG_KEYBOARD_SUNKBD is not set
# CONFIG_KEYBOARD_XTKBD is not set
CONFIG_INPUT_MOUSE=y
CONFIG_MOUSE_PS2=y
CONFIG_MOUSE_PS2_ALPS=y
CONFIG_MOUSE_PS2_BYD=y
CONFIG_MOUSE_PS2_LOGIPS2PP=y
CONFIG_MOUSE_PS2_SYNAPTICS=y
CONFIG_MOUSE_PS2_CYPRESS=y
CONFIG_MOUSE_PS2_LIFEBOOK=y
CONFIG_MOUSE_PS2_TRACKPOINT=y
# CONFIG_MOUSE_PS2_ELANTECH is not set
# CONFIG_MOUSE_PS2_SENTELIC is not set
# CONFIG_MOUSE_PS2_TOUCHKIT is not set
CONFIG_MOUSE_PS2_FOCALTECH=y
# CONFIG_MOUSE_PS2_VMMOUSE is not set
# CONFIG_MOUSE_SERIAL is not set
# CONFIG_MOUSE_APPLETOUCH is not set
# CONFIG_MOUSE_BCM5974 is not set
# CONFIG_MOUSE_VSXXXAA is not set
# CONFIG_MOUSE_SYNAPTICS_USB is not set
# CONFIG_INPUT_JOYSTICK is not set
# CONFIG_INPUT_TABLET is not set
# CONFIG_INPUT_TOUCHSCREEN is not set
# CONFIG_INPUT_MISC is not set
# CONFIG_RMI4_CORE is not set
#
# Hardware I/O ports
#
CONFIG_SERIO=y
CONFIG_ARCH_MIGHT_HAVE_PC_SERIO=y
CONFIG_SERIO_I8042=y
CONFIG_SERIO_SERPORT=y
# CONFIG_SERIO_CT82C710 is not set
# CONFIG_SERIO_PCIPS2 is not set
CONFIG_SERIO_LIBPS2=y
# CONFIG_SERIO_RAW is not set
# CONFIG_SERIO_ALTERA_PS2 is not set
# CONFIG_SERIO_PS2MULT is not set
# CONFIG_SERIO_ARC_PS2 is not set
# CONFIG_USERIO is not set
# CONFIG_GAMEPORT is not set
# end of Hardware I/O ports
# end of Input device support
#
# Character devices
#
CONFIG_TTY=y
CONFIG_VT=y
CONFIG_CONSOLE_TRANSLATIONS=y
CONFIG_VT_CONSOLE=y
CONFIG_VT_CONSOLE_SLEEP=y
# CONFIG_VT_HW_CONSOLE_BINDING is not set
CONFIG_UNIX98_PTYS=y
CONFIG_LEGACY_PTYS=y
CONFIG_LEGACY_PTY_COUNT=256
CONFIG_LEGACY_TIOCSTI=y
CONFIG_LDISC_AUTOLOAD=y
#
# Serial drivers
#
CONFIG_SERIAL_EARLYCON=y
CONFIG_SERIAL_8250=y
CONFIG_SERIAL_8250_DEPRECATED_OPTIONS=y
CONFIG_SERIAL_8250_PNP=y
# CONFIG_SERIAL_8250_16550A_VARIANTS is not set
# CONFIG_SERIAL_8250_FINTEK is not set
CONFIG_SERIAL_8250_CONSOLE=y
CONFIG_SERIAL_8250_PCILIB=y
CONFIG_SERIAL_8250_PCI=y
CONFIG_SERIAL_8250_EXAR=y
CONFIG_SERIAL_8250_NR_UARTS=4
CONFIG_SERIAL_8250_RUNTIME_UARTS=4
# CONFIG_SERIAL_8250_EXTENDED is not set
# CONFIG_SERIAL_8250_PCI1XXXX is not set
CONFIG_SERIAL_8250_DWLIB=y
# CONFIG_SERIAL_8250_DW is not set
# CONFIG_SERIAL_8250_RT288X is not set
CONFIG_SERIAL_8250_LPSS=y
CONFIG_SERIAL_8250_MID=y
CONFIG_SERIAL_8250_PERICOM=y
#
# Non-8250 serial port support
#
# CONFIG_SERIAL_UARTLITE is not set
CONFIG_SERIAL_CORE=y
CONFIG_SERIAL_CORE_CONSOLE=y
# CONFIG_SERIAL_JSM is not set
# CONFIG_SERIAL_LANTIQ is not set
# CONFIG_SERIAL_SCCNXP is not set
# CONFIG_SERIAL_ALTERA_JTAGUART is not set
# CONFIG_SERIAL_ALTERA_UART is not set
# CONFIG_SERIAL_ARC is not set
# CONFIG_SERIAL_RP2 is not set
# CONFIG_SERIAL_FSL_LPUART is not set
# CONFIG_SERIAL_FSL_LINFLEXUART is not set
# CONFIG_SERIAL_SPRD is not set
# end of Serial drivers
# CONFIG_SERIAL_NONSTANDARD is not set
# CONFIG_N_GSM is not set
# CONFIG_NOZOMI is not set
# CONFIG_NULL_TTY is not set
CONFIG_HVC_DRIVER=y
# CONFIG_SERIAL_DEV_BUS is not set
CONFIG_VIRTIO_CONSOLE=y
# CONFIG_IPMI_HANDLER is not set
CONFIG_HW_RANDOM=y
CONFIG_HW_RANDOM_TIMERIOMEM=y
CONFIG_HW_RANDOM_INTEL=y
CONFIG_HW_RANDOM_AMD=y
# CONFIG_HW_RANDOM_BA431 is not set
CONFIG_HW_RANDOM_VIA=y
CONFIG_HW_RANDOM_VIRTIO=y
# CONFIG_HW_RANDOM_XIPHERA is not set
# CONFIG_APPLICOM is not set
# CONFIG_MWAVE is not set
CONFIG_DEVMEM=y
# CONFIG_NVRAM is not set
CONFIG_DEVPORT=y
# CONFIG_HPET is not set
# CONFIG_HANGCHECK_TIMER is not set
# CONFIG_TCG_TPM is not set
# CONFIG_TELCLOCK is not set
# CONFIG_XILLYBUS is not set
# end of Character devices
#
# I2C support
#
# CONFIG_I2C is not set
# end of I2C support
# CONFIG_I3C is not set
# CONFIG_SPI is not set
# CONFIG_SPMI is not set
# CONFIG_HSI is not set
# CONFIG_PPS is not set
#
# PTP clock support
#
# CONFIG_PTP_1588_CLOCK is not set
CONFIG_PTP_1588_CLOCK_OPTIONAL=y
#
# Enable PHYLIB and NETWORK_PHY_TIMESTAMPING to see the additional clocks.
#
# end of PTP clock support
CONFIG_PINCTRL=y
# CONFIG_DEBUG_PINCTRL is not set
# CONFIG_PINCTRL_AMD is not set
#
# Intel pinctrl drivers
#
# CONFIG_PINCTRL_BAYTRAIL is not set
# CONFIG_PINCTRL_CHERRYVIEW is not set
# CONFIG_PINCTRL_LYNXPOINT is not set
# CONFIG_PINCTRL_INTEL_PLATFORM is not set
# CONFIG_PINCTRL_ALDERLAKE is not set
# CONFIG_PINCTRL_BROXTON is not set
# CONFIG_PINCTRL_CANNONLAKE is not set
# CONFIG_PINCTRL_CEDARFORK is not set
# CONFIG_PINCTRL_DENVERTON is not set
# CONFIG_PINCTRL_ELKHARTLAKE is not set
# CONFIG_PINCTRL_EMMITSBURG is not set
# CONFIG_PINCTRL_GEMINILAKE is not set
# CONFIG_PINCTRL_ICELAKE is not set
# CONFIG_PINCTRL_JASPERLAKE is not set
# CONFIG_PINCTRL_LAKEFIELD is not set
# CONFIG_PINCTRL_LEWISBURG is not set
# CONFIG_PINCTRL_METEORLAKE is not set
# CONFIG_PINCTRL_METEORPOINT is not set
# CONFIG_PINCTRL_SUNRISEPOINT is not set
# CONFIG_PINCTRL_TIGERLAKE is not set
# end of Intel pinctrl drivers
#
# Renesas pinctrl drivers
#
# end of Renesas pinctrl drivers
# CONFIG_GPIOLIB is not set
# CONFIG_W1 is not set
# CONFIG_POWER_RESET is not set
CONFIG_POWER_SUPPLY=y
# CONFIG_POWER_SUPPLY_DEBUG is not set
CONFIG_POWER_SUPPLY_HWMON=y
# CONFIG_TEST_POWER is not set
# CONFIG_BATTERY_DS2780 is not set
# CONFIG_BATTERY_DS2781 is not set
# CONFIG_BATTERY_SAMSUNG_SDI is not set
# CONFIG_BATTERY_BQ27XXX is not set
# CONFIG_CHARGER_MAX8903 is not set
# CONFIG_BATTERY_GOLDFISH is not set
CONFIG_HWMON=y
# CONFIG_HWMON_DEBUG_CHIP is not set
#
# Native drivers
#
# CONFIG_SENSORS_ABITUGURU is not set
# CONFIG_SENSORS_ABITUGURU3 is not set
# CONFIG_SENSORS_AS370 is not set
# CONFIG_SENSORS_ASUS_ROG_RYUJIN is not set
# CONFIG_SENSORS_AXI_FAN_CONTROL is not set
# CONFIG_SENSORS_K8TEMP is not set
# CONFIG_SENSORS_K10TEMP is not set
# CONFIG_SENSORS_FAM15H_POWER is not set
# CONFIG_SENSORS_APPLESMC is not set
# CONFIG_SENSORS_CORSAIR_CPRO is not set
# CONFIG_SENSORS_CORSAIR_PSU is not set
# CONFIG_SENSORS_I5K_AMB is not set
# CONFIG_SENSORS_F71805F is not set
# CONFIG_SENSORS_F71882FG is not set
# CONFIG_SENSORS_I5500 is not set
# CONFIG_SENSORS_CORETEMP is not set
# CONFIG_SENSORS_IT87 is not set
# CONFIG_SENSORS_MAX197 is not set
# CONFIG_SENSORS_MR75203 is not set
# CONFIG_SENSORS_PC87360 is not set
# CONFIG_SENSORS_PC87427 is not set
# CONFIG_SENSORS_NCT6683 is not set
# CONFIG_SENSORS_NCT6775 is not set
# CONFIG_SENSORS_NPCM7XX is not set
# CONFIG_SENSORS_OXP is not set
# CONFIG_SENSORS_SIS5595 is not set
# CONFIG_SENSORS_SMSC47M1 is not set
# CONFIG_SENSORS_SMSC47B397 is not set
# CONFIG_SENSORS_VIA_CPUTEMP is not set
# CONFIG_SENSORS_VIA686A is not set
# CONFIG_SENSORS_VT1211 is not set
# CONFIG_SENSORS_VT8231 is not set
# CONFIG_SENSORS_W83627HF is not set
# CONFIG_SENSORS_W83627EHF is not set
# CONFIG_SENSORS_XGENE is not set
#
# ACPI drivers
#
# CONFIG_SENSORS_ACPI_POWER is not set
# CONFIG_SENSORS_ATK0110 is not set
# CONFIG_SENSORS_ASUS_EC is not set
CONFIG_THERMAL=y
# CONFIG_THERMAL_NETLINK is not set
# CONFIG_THERMAL_STATISTICS is not set
# CONFIG_THERMAL_DEBUGFS is not set
CONFIG_THERMAL_EMERGENCY_POWEROFF_DELAY_MS=0
CONFIG_THERMAL_HWMON=y
CONFIG_THERMAL_DEFAULT_GOV_STEP_WISE=y
# CONFIG_THERMAL_DEFAULT_GOV_FAIR_SHARE is not set
# CONFIG_THERMAL_DEFAULT_GOV_USER_SPACE is not set
# CONFIG_THERMAL_GOV_FAIR_SHARE is not set
CONFIG_THERMAL_GOV_STEP_WISE=y
# CONFIG_THERMAL_GOV_BANG_BANG is not set
CONFIG_THERMAL_GOV_USER_SPACE=y
# CONFIG_THERMAL_EMULATION is not set
#
# Intel thermal drivers
#
# CONFIG_INTEL_POWERCLAMP is not set
CONFIG_X86_THERMAL_VECTOR=y
CONFIG_INTEL_TCC=y
CONFIG_X86_PKG_TEMP_THERMAL=y
# CONFIG_INTEL_SOC_DTS_THERMAL is not set
#
# ACPI INT340X thermal drivers
#
# CONFIG_INT340X_THERMAL is not set
# end of ACPI INT340X thermal drivers
# CONFIG_INTEL_PCH_THERMAL is not set
# CONFIG_INTEL_TCC_COOLING is not set
# CONFIG_INTEL_HFI_THERMAL is not set
# end of Intel thermal drivers
# CONFIG_WATCHDOG is not set
CONFIG_SSB_POSSIBLE=y
# CONFIG_SSB is not set
CONFIG_BCMA_POSSIBLE=y
# CONFIG_BCMA is not set
#
# Multifunction device drivers
#
# CONFIG_MFD_MADERA is not set
# CONFIG_MFD_INTEL_QUARK_I2C_GPIO is not set
# CONFIG_LPC_ICH is not set
# CONFIG_LPC_SCH is not set
# CONFIG_MFD_INTEL_LPSS_ACPI is not set
# CONFIG_MFD_INTEL_LPSS_PCI is not set
# CONFIG_MFD_INTEL_PMC_BXT is not set
# CONFIG_MFD_JANZ_CMODIO is not set
# CONFIG_MFD_KEMPLD is not set
# CONFIG_MFD_MT6397 is not set
# CONFIG_MFD_RDC321X is not set
# CONFIG_MFD_SM501 is not set
# CONFIG_MFD_SYSCON is not set
# CONFIG_MFD_TQMX86 is not set
# CONFIG_MFD_VX855 is not set
# end of Multifunction device drivers
# CONFIG_REGULATOR is not set
# CONFIG_RC_CORE is not set
#
# CEC support
#
# CONFIG_MEDIA_CEC_SUPPORT is not set
# end of CEC support
# CONFIG_MEDIA_SUPPORT is not set
#
# Graphics support
#
# CONFIG_AUXDISPLAY is not set
# CONFIG_AGP is not set
# CONFIG_VGA_SWITCHEROO is not set
# CONFIG_DRM is not set
#
# Frame buffer Devices
#
# CONFIG_FB is not set
# end of Frame buffer Devices
#
# Backlight & LCD device support
#
CONFIG_LCD_CLASS_DEVICE=y
# CONFIG_LCD_PLATFORM is not set
CONFIG_BACKLIGHT_CLASS_DEVICE=y
# CONFIG_BACKLIGHT_KTD2801 is not set
# CONFIG_BACKLIGHT_APPLE is not set
# CONFIG_BACKLIGHT_QCOM_WLED is not set
# CONFIG_BACKLIGHT_SAHARA is not set
# end of Backlight & LCD device support
#
# Console display driver support
#
CONFIG_VGA_CONSOLE=y
CONFIG_DUMMY_CONSOLE=y
CONFIG_DUMMY_CONSOLE_COLUMNS=80
CONFIG_DUMMY_CONSOLE_ROWS=25
# end of Console display driver support
# end of Graphics support
# CONFIG_SOUND is not set
CONFIG_HID_SUPPORT=y
CONFIG_HID=y
# CONFIG_HID_BATTERY_STRENGTH is not set
# CONFIG_HIDRAW is not set
# CONFIG_UHID is not set
CONFIG_HID_GENERIC=y
#
# Special HID drivers
#
CONFIG_HID_A4TECH=y
# CONFIG_HID_ACRUX is not set
# CONFIG_HID_AUREAL is not set
CONFIG_HID_BELKIN=y
CONFIG_HID_CHERRY=y
# CONFIG_HID_COUGAR is not set
# CONFIG_HID_MACALLY is not set
# CONFIG_HID_CMEDIA is not set
CONFIG_HID_CYPRESS=y
# CONFIG_HID_DRAGONRISE is not set
# CONFIG_HID_EMS_FF is not set
# CONFIG_HID_ELECOM is not set
# CONFIG_HID_EVISION is not set
CONFIG_HID_EZKEY=y
# CONFIG_HID_GEMBIRD is not set
# CONFIG_HID_GFRM is not set
# CONFIG_HID_GLORIOUS is not set
# CONFIG_HID_GOOGLE_STADIA_FF is not set
# CONFIG_HID_VIVALDI is not set
# CONFIG_HID_KEYTOUCH is not set
# CONFIG_HID_KYE is not set
# CONFIG_HID_WALTOP is not set
# CONFIG_HID_VIEWSONIC is not set
# CONFIG_HID_VRC2 is not set
# CONFIG_HID_XIAOMI is not set
# CONFIG_HID_GYRATION is not set
# CONFIG_HID_ICADE is not set
CONFIG_HID_ITE=y
# CONFIG_HID_JABRA is not set
# CONFIG_HID_TWINHAN is not set
CONFIG_HID_KENSINGTON=y
# CONFIG_HID_LCPOWER is not set
# CONFIG_HID_LENOVO is not set
# CONFIG_HID_MAGICMOUSE is not set
# CONFIG_HID_MALTRON is not set
# CONFIG_HID_MAYFLASH is not set
CONFIG_HID_REDRAGON=y
CONFIG_HID_MICROSOFT=y
CONFIG_HID_MONTEREY=y
# CONFIG_HID_MULTITOUCH is not set
# CONFIG_HID_NTI is not set
# CONFIG_HID_ORTEK is not set
# CONFIG_HID_PANTHERLORD is not set
# CONFIG_HID_PETALYNX is not set
# CONFIG_HID_PICOLCD is not set
# CONFIG_HID_PLANTRONICS is not set
# CONFIG_HID_PXRC is not set
# CONFIG_HID_RAZER is not set
# CONFIG_HID_PRIMAX is not set
# CONFIG_HID_SAITEK is not set
# CONFIG_HID_SEMITEK is not set
# CONFIG_HID_SPEEDLINK is not set
# CONFIG_HID_STEAM is not set
# CONFIG_HID_SUNPLUS is not set
# CONFIG_HID_RMI is not set
# CONFIG_HID_GREENASIA is not set
# CONFIG_HID_SMARTJOYPLUS is not set
# CONFIG_HID_TIVO is not set
# CONFIG_HID_TOPSEED is not set
# CONFIG_HID_TOPRE is not set
# CONFIG_HID_UDRAW_PS3 is not set
# CONFIG_HID_XINMO is not set
# CONFIG_HID_ZEROPLUS is not set
# CONFIG_HID_ZYDACRON is not set
# CONFIG_HID_SENSOR_HUB is not set
# CONFIG_HID_ALPS is not set
# end of Special HID drivers
#
# HID-BPF support
#
# CONFIG_HID_BPF is not set
# end of HID-BPF support
#
# Intel ISH HID support
#
# CONFIG_INTEL_ISH_HID is not set
# end of Intel ISH HID support
#
# AMD SFH HID Support
#
# CONFIG_AMD_SFH_HID is not set
# end of AMD SFH HID Support
CONFIG_USB_OHCI_LITTLE_ENDIAN=y
CONFIG_USB_SUPPORT=y
# CONFIG_USB_ULPI_BUS is not set
CONFIG_USB_ARCH_HAS_HCD=y
# CONFIG_USB is not set
CONFIG_USB_PCI=y
CONFIG_USB_PCI_AMD=y
#
# USB dual-mode controller drivers
#
#
# USB port drivers
#
#
# USB Physical Layer drivers
#
# CONFIG_NOP_USB_XCEIV is not set
# end of USB Physical Layer drivers
# CONFIG_USB_GADGET is not set
# CONFIG_TYPEC is not set
# CONFIG_USB_ROLE_SWITCH is not set
# CONFIG_MMC is not set
# CONFIG_MEMSTICK is not set
# CONFIG_NEW_LEDS is not set
# CONFIG_ACCESSIBILITY is not set
# CONFIG_INFINIBAND is not set
CONFIG_EDAC_ATOMIC_SCRUB=y
CONFIG_EDAC_SUPPORT=y
CONFIG_RTC_LIB=y
CONFIG_RTC_MC146818_LIB=y
# CONFIG_RTC_CLASS is not set
# CONFIG_DMADEVICES is not set
#
# DMABUF options
#
# CONFIG_SYNC_FILE is not set
# CONFIG_DMABUF_HEAPS is not set
# end of DMABUF options
# CONFIG_UIO is not set
# CONFIG_VFIO is not set
# CONFIG_VIRT_DRIVERS is not set
CONFIG_VIRTIO_ANCHOR=y
CONFIG_VIRTIO=y
CONFIG_VIRTIO_PCI_LIB=y
CONFIG_VIRTIO_PCI_LIB_LEGACY=y
CONFIG_VIRTIO_MENU=y
CONFIG_VIRTIO_PCI=y
CONFIG_VIRTIO_PCI_ADMIN_LEGACY=y
CONFIG_VIRTIO_PCI_LEGACY=y
# CONFIG_VIRTIO_BALLOON is not set
CONFIG_VIRTIO_INPUT=y
# CONFIG_VIRTIO_MMIO is not set
# CONFIG_VDPA is not set
CONFIG_VHOST_IOTLB=y
CONFIG_VHOST_TASK=y
CONFIG_VHOST=y
CONFIG_VHOST_MENU=y
CONFIG_VHOST_NET=y
# CONFIG_VHOST_CROSS_ENDIAN_LEGACY is not set
#
# Microsoft Hyper-V guest support
#
# CONFIG_HYPERV is not set
# end of Microsoft Hyper-V guest support
# CONFIG_GREYBUS is not set
# CONFIG_COMEDI is not set
# CONFIG_STAGING is not set
# CONFIG_GOLDFISH is not set
# CONFIG_CHROME_PLATFORMS is not set
# CONFIG_MELLANOX_PLATFORM is not set
CONFIG_SURFACE_PLATFORMS=y
# CONFIG_SURFACE_GPE is not set
# CONFIG_SURFACE_PRO3_BUTTON is not set
CONFIG_X86_PLATFORM_DEVICES=y
# CONFIG_ACPI_WMI is not set
# CONFIG_ACERHDF is not set
# CONFIG_ACER_WIRELESS is not set
# CONFIG_AMD_HSMP is not set
# CONFIG_AMD_WBRF is not set
# CONFIG_ADV_SWBUTTON is not set
# CONFIG_APPLE_GMUX is not set
# CONFIG_ASUS_LAPTOP is not set
# CONFIG_ASUS_WIRELESS is not set
# CONFIG_X86_PLATFORM_DRIVERS_DELL is not set
# CONFIG_FUJITSU_LAPTOP is not set
# CONFIG_FUJITSU_TABLET is not set
# CONFIG_GPD_POCKET_FAN is not set
# CONFIG_X86_PLATFORM_DRIVERS_HP is not set
# CONFIG_WIRELESS_HOTKEY is not set
# CONFIG_IBM_RTL is not set
# CONFIG_SENSORS_HDAPS is not set
# CONFIG_INTEL_SAR_INT1092 is not set
#
# Intel Speed Select Technology interface support
#
# CONFIG_INTEL_SPEED_SELECT_INTERFACE is not set
# end of Intel Speed Select Technology interface support
#
# Intel Uncore Frequency Control
#
# CONFIG_INTEL_UNCORE_FREQ_CONTROL is not set
# end of Intel Uncore Frequency Control
# CONFIG_INTEL_PUNIT_IPC is not set
# CONFIG_INTEL_RST is not set
# CONFIG_INTEL_SMARTCONNECT is not set
# CONFIG_INTEL_VSEC is not set
# CONFIG_MSI_EC is not set
# CONFIG_SAMSUNG_LAPTOP is not set
# CONFIG_SAMSUNG_Q10 is not set
# CONFIG_TOSHIBA_BT_RFKILL is not set
# CONFIG_TOSHIBA_HAPS is not set
# CONFIG_ACPI_CMPC is not set
# CONFIG_PANASONIC_LAPTOP is not set
# CONFIG_SYSTEM76_ACPI is not set
# CONFIG_TOPSTAR_LAPTOP is not set
# CONFIG_INTEL_IPS is not set
# CONFIG_INTEL_SCU_PCI is not set
# CONFIG_INTEL_SCU_PLATFORM is not set
# CONFIG_SIEMENS_SIMATIC_IPC is not set
# CONFIG_WINMATE_FM07_KEYS is not set
CONFIG_HAVE_CLK=y
CONFIG_HAVE_CLK_PREPARE=y
CONFIG_COMMON_CLK=y
# CONFIG_XILINX_VCU is not set
# CONFIG_HWSPINLOCK is not set
#
# Clock Source drivers
#
CONFIG_CLKEVT_I8253=y
CONFIG_I8253_LOCK=y
CONFIG_CLKBLD_I8253=y
# end of Clock Source drivers
CONFIG_MAILBOX=y
CONFIG_PCC=y
# CONFIG_ALTERA_MBOX is not set
CONFIG_IOMMU_IOVA=y
CONFIG_IOMMU_API=y
CONFIG_IOMMU_SUPPORT=y
#
# Generic IOMMU Pagetable Support
#
# end of Generic IOMMU Pagetable Support
# CONFIG_IOMMU_DEBUGFS is not set
# CONFIG_IOMMU_DEFAULT_DMA_STRICT is not set
CONFIG_IOMMU_DEFAULT_DMA_LAZY=y
# CONFIG_IOMMU_DEFAULT_PASSTHROUGH is not set
CONFIG_IOMMU_DMA=y
# CONFIG_AMD_IOMMU is not set
# CONFIG_INTEL_IOMMU is not set
# CONFIG_IOMMUFD is not set
# CONFIG_IRQ_REMAP is not set
# CONFIG_VIRTIO_IOMMU is not set
#
# Remoteproc drivers
#
# CONFIG_REMOTEPROC is not set
# end of Remoteproc drivers
#
# Rpmsg drivers
#
# CONFIG_RPMSG_QCOM_GLINK_RPM is not set
# CONFIG_RPMSG_VIRTIO is not set
# end of Rpmsg drivers
# CONFIG_SOUNDWIRE is not set
#
# SOC (System On Chip) specific Drivers
#
#
# Amlogic SoC drivers
#
# end of Amlogic SoC drivers
#
# Broadcom SoC drivers
#
# end of Broadcom SoC drivers
#
# NXP/Freescale QorIQ SoC drivers
#
# end of NXP/Freescale QorIQ SoC drivers
#
# fujitsu SoC drivers
#
# end of fujitsu SoC drivers
#
# i.MX SoC drivers
#
# end of i.MX SoC drivers
#
# Enable LiteX SoC Builder specific drivers
#
# end of Enable LiteX SoC Builder specific drivers
# CONFIG_WPCM450_SOC is not set
#
# Qualcomm SoC drivers
#
# end of Qualcomm SoC drivers
# CONFIG_SOC_TI is not set
#
# Xilinx SoC drivers
#
# end of Xilinx SoC drivers
# end of SOC (System On Chip) specific Drivers
#
# PM Domains
#
#
# Amlogic PM Domains
#
# end of Amlogic PM Domains
#
# Broadcom PM Domains
#
# end of Broadcom PM Domains
#
# i.MX PM Domains
#
# end of i.MX PM Domains
#
# Qualcomm PM Domains
#
# end of Qualcomm PM Domains
# end of PM Domains
# CONFIG_PM_DEVFREQ is not set
# CONFIG_EXTCON is not set
# CONFIG_MEMORY is not set
# CONFIG_IIO is not set
# CONFIG_NTB is not set
# CONFIG_PWM is not set
#
# IRQ chip support
#
# end of IRQ chip support
# CONFIG_IPACK_BUS is not set
# CONFIG_RESET_CONTROLLER is not set
#
# PHY Subsystem
#
# CONFIG_GENERIC_PHY is not set
# CONFIG_USB_LGM_PHY is not set
# CONFIG_PHY_CAN_TRANSCEIVER is not set
#
# PHY drivers for Broadcom platforms
#
# CONFIG_BCM_KONA_USB2_PHY is not set
# end of PHY drivers for Broadcom platforms
# CONFIG_PHY_PXA_28NM_HSIC is not set
# CONFIG_PHY_PXA_28NM_USB2 is not set
# CONFIG_PHY_INTEL_LGM_EMMC is not set
# end of PHY Subsystem
# CONFIG_POWERCAP is not set
# CONFIG_MCB is not set
#
# Performance monitor support
#
# CONFIG_DWC_PCIE_PMU is not set
# end of Performance monitor support
# CONFIG_RAS is not set
# CONFIG_USB4 is not set
#
# Android
#
# CONFIG_ANDROID_BINDER_IPC is not set
# end of Android
# CONFIG_LIBNVDIMM is not set
# CONFIG_DAX is not set
# CONFIG_NVMEM is not set
#
# HW tracing support
#
# CONFIG_STM is not set
# CONFIG_INTEL_TH is not set
# end of HW tracing support
# CONFIG_FPGA is not set
# CONFIG_TEE is not set
# CONFIG_SIOX is not set
# CONFIG_SLIMBUS is not set
# CONFIG_INTERCONNECT is not set
# CONFIG_COUNTER is not set
# CONFIG_PECI is not set
# CONFIG_HTE is not set
# end of Device Drivers
#
# File systems
#
CONFIG_DCACHE_WORD_ACCESS=y
# CONFIG_VALIDATE_FS_PARSER is not set
CONFIG_FS_IOMAP=y
CONFIG_FS_STACK=y
CONFIG_BUFFER_HEAD=y
CONFIG_LEGACY_DIRECT_IO=y
CONFIG_EXT2_FS=y
# CONFIG_EXT2_FS_XATTR is not set
# CONFIG_EXT3_FS is not set
CONFIG_EXT4_FS=y
# CONFIG_EXT4_FS_POSIX_ACL is not set
# CONFIG_EXT4_FS_SECURITY is not set
# CONFIG_EXT4_DEBUG is not set
CONFIG_JBD2=y
# CONFIG_JBD2_DEBUG is not set
CONFIG_FS_MBCACHE=y
# CONFIG_REISERFS_FS is not set
# CONFIG_JFS_FS is not set
CONFIG_XFS_FS=y
CONFIG_XFS_SUPPORT_V4=y
CONFIG_XFS_SUPPORT_ASCII_CI=y
# CONFIG_XFS_QUOTA is not set
CONFIG_XFS_POSIX_ACL=y
# CONFIG_XFS_RT is not set
CONFIG_XFS_DRAIN_INTENTS=y
CONFIG_XFS_LIVE_HOOKS=y
CONFIG_XFS_MEMORY_BUFS=y
CONFIG_XFS_ONLINE_SCRUB=y
CONFIG_XFS_ONLINE_SCRUB_STATS=y
# CONFIG_XFS_ONLINE_REPAIR is not set
CONFIG_XFS_DEBUG=y
CONFIG_XFS_ASSERT_FATAL=y
# CONFIG_GFS2_FS is not set
# CONFIG_BTRFS_FS is not set
# CONFIG_NILFS2_FS is not set
# CONFIG_F2FS_FS is not set
# CONFIG_BCACHEFS_FS is not set
CONFIG_FS_POSIX_ACL=y
CONFIG_EXPORTFS=y
CONFIG_EXPORTFS_BLOCK_OPS=y
CONFIG_FILE_LOCKING=y
# CONFIG_FS_ENCRYPTION is not set
# CONFIG_FS_VERITY is not set
CONFIG_FSNOTIFY=y
CONFIG_DNOTIFY=y
CONFIG_INOTIFY_USER=y
# CONFIG_FANOTIFY is not set
# CONFIG_QUOTA is not set
CONFIG_AUTOFS_FS=y
CONFIG_FUSE_FS=y
CONFIG_CUSE=y
# CONFIG_VIRTIO_FS is not set
CONFIG_FUSE_PASSTHROUGH=y
# CONFIG_OVERLAY_FS is not set
#
# Caches
#
CONFIG_NETFS_SUPPORT=y
# CONFIG_NETFS_STATS is not set
# CONFIG_FSCACHE is not set
# end of Caches
#
# CD-ROM/DVD Filesystems
#
# CONFIG_ISO9660_FS is not set
# CONFIG_UDF_FS is not set
# end of CD-ROM/DVD Filesystems
#
# DOS/FAT/EXFAT/NT Filesystems
#
# CONFIG_MSDOS_FS is not set
# CONFIG_VFAT_FS is not set
# CONFIG_EXFAT_FS is not set
# CONFIG_NTFS3_FS is not set
# end of DOS/FAT/EXFAT/NT Filesystems
#
# Pseudo filesystems
#
CONFIG_PROC_FS=y
# CONFIG_PROC_KCORE is not set
CONFIG_PROC_VMCORE=y
# CONFIG_PROC_VMCORE_DEVICE_DUMP is not set
CONFIG_PROC_SYSCTL=y
CONFIG_PROC_PAGE_MONITOR=y
# CONFIG_PROC_CHILDREN is not set
CONFIG_PROC_PID_ARCH_STATUS=y
CONFIG_PROC_CPU_RESCTRL=y
CONFIG_KERNFS=y
CONFIG_SYSFS=y
CONFIG_TMPFS=y
CONFIG_TMPFS_POSIX_ACL=y
CONFIG_TMPFS_XATTR=y
# CONFIG_TMPFS_INODE64 is not set
# CONFIG_TMPFS_QUOTA is not set
# CONFIG_HUGETLBFS is not set
CONFIG_ARCH_HAS_GIGANTIC_PAGE=y
# CONFIG_CONFIGFS_FS is not set
# end of Pseudo filesystems
CONFIG_MISC_FILESYSTEMS=y
# CONFIG_ORANGEFS_FS is not set
# CONFIG_ADFS_FS is not set
# CONFIG_AFFS_FS is not set
# CONFIG_ECRYPT_FS is not set
# CONFIG_HFS_FS is not set
# CONFIG_HFSPLUS_FS is not set
# CONFIG_BEFS_FS is not set
# CONFIG_BFS_FS is not set
# CONFIG_EFS_FS is not set
# CONFIG_CRAMFS is not set
# CONFIG_SQUASHFS is not set
# CONFIG_VXFS_FS is not set
# CONFIG_MINIX_FS is not set
# CONFIG_OMFS_FS is not set
# CONFIG_HPFS_FS is not set
# CONFIG_QNX4FS_FS is not set
# CONFIG_QNX6FS_FS is not set
# CONFIG_ROMFS_FS is not set
# CONFIG_PSTORE is not set
# CONFIG_SYSV_FS is not set
# CONFIG_UFS_FS is not set
# CONFIG_EROFS_FS is not set
CONFIG_NETWORK_FILESYSTEMS=y
CONFIG_NFS_FS=y
CONFIG_NFS_V2=y
CONFIG_NFS_V3=y
CONFIG_NFS_V3_ACL=y
CONFIG_NFS_V4=y
# CONFIG_NFS_SWAP is not set
CONFIG_NFS_V4_1=y
# CONFIG_NFS_V4_2 is not set
CONFIG_PNFS_FILE_LAYOUT=y
CONFIG_PNFS_FLEXFILE_LAYOUT=y
CONFIG_NFS_V4_1_IMPLEMENTATION_ID_DOMAIN="kernel.org"
# CONFIG_NFS_V4_1_MIGRATION is not set
# CONFIG_ROOT_NFS is not set
# CONFIG_NFS_FSCACHE is not set
# CONFIG_NFS_USE_LEGACY_DNS is not set
CONFIG_NFS_USE_KERNEL_DNS=y
CONFIG_NFS_DISABLE_UDP_SUPPORT=y
CONFIG_NFSD=y
# CONFIG_NFSD_V2 is not set
CONFIG_NFSD_V3_ACL=y
CONFIG_NFSD_V4=y
CONFIG_NFSD_PNFS=y
CONFIG_NFSD_BLOCKLAYOUT=y
CONFIG_NFSD_SCSILAYOUT=y
CONFIG_NFSD_FLEXFILELAYOUT=y
# CONFIG_NFSD_LEGACY_CLIENT_TRACKING is not set
CONFIG_GRACE_PERIOD=y
CONFIG_LOCKD=y
CONFIG_LOCKD_V4=y
CONFIG_NFS_ACL_SUPPORT=y
CONFIG_NFS_COMMON=y
CONFIG_SUNRPC=y
CONFIG_SUNRPC_GSS=y
CONFIG_SUNRPC_BACKCHANNEL=y
CONFIG_RPCSEC_GSS_KRB5=y
# CONFIG_SUNRPC_DEBUG is not set
# CONFIG_CEPH_FS is not set
# CONFIG_CIFS is not set
# CONFIG_SMB_SERVER is not set
# CONFIG_CODA_FS is not set
# CONFIG_AFS_FS is not set
CONFIG_9P_FS=y
# CONFIG_9P_FS_POSIX_ACL is not set
# CONFIG_9P_FS_SECURITY is not set
CONFIG_NLS=y
CONFIG_NLS_DEFAULT="iso8859-1"
# CONFIG_NLS_CODEPAGE_437 is not set
# CONFIG_NLS_CODEPAGE_737 is not set
# CONFIG_NLS_CODEPAGE_775 is not set
# CONFIG_NLS_CODEPAGE_850 is not set
# CONFIG_NLS_CODEPAGE_852 is not set
# CONFIG_NLS_CODEPAGE_855 is not set
# CONFIG_NLS_CODEPAGE_857 is not set
# CONFIG_NLS_CODEPAGE_860 is not set
# CONFIG_NLS_CODEPAGE_861 is not set
# CONFIG_NLS_CODEPAGE_862 is not set
# CONFIG_NLS_CODEPAGE_863 is not set
# CONFIG_NLS_CODEPAGE_864 is not set
# CONFIG_NLS_CODEPAGE_865 is not set
# CONFIG_NLS_CODEPAGE_866 is not set
# CONFIG_NLS_CODEPAGE_869 is not set
# CONFIG_NLS_CODEPAGE_936 is not set
# CONFIG_NLS_CODEPAGE_950 is not set
# CONFIG_NLS_CODEPAGE_932 is not set
# CONFIG_NLS_CODEPAGE_949 is not set
# CONFIG_NLS_CODEPAGE_874 is not set
# CONFIG_NLS_ISO8859_8 is not set
# CONFIG_NLS_CODEPAGE_1250 is not set
# CONFIG_NLS_CODEPAGE_1251 is not set
# CONFIG_NLS_ASCII is not set
# CONFIG_NLS_ISO8859_1 is not set
# CONFIG_NLS_ISO8859_2 is not set
# CONFIG_NLS_ISO8859_3 is not set
# CONFIG_NLS_ISO8859_4 is not set
# CONFIG_NLS_ISO8859_5 is not set
# CONFIG_NLS_ISO8859_6 is not set
# CONFIG_NLS_ISO8859_7 is not set
# CONFIG_NLS_ISO8859_9 is not set
# CONFIG_NLS_ISO8859_13 is not set
# CONFIG_NLS_ISO8859_14 is not set
# CONFIG_NLS_ISO8859_15 is not set
# CONFIG_NLS_KOI8_R is not set
# CONFIG_NLS_KOI8_U is not set
# CONFIG_NLS_MAC_ROMAN is not set
# CONFIG_NLS_MAC_CELTIC is not set
# CONFIG_NLS_MAC_CENTEURO is not set
# CONFIG_NLS_MAC_CROATIAN is not set
# CONFIG_NLS_MAC_CYRILLIC is not set
# CONFIG_NLS_MAC_GAELIC is not set
# CONFIG_NLS_MAC_GREEK is not set
# CONFIG_NLS_MAC_ICELAND is not set
# CONFIG_NLS_MAC_INUIT is not set
# CONFIG_NLS_MAC_ROMANIAN is not set
# CONFIG_NLS_MAC_TURKISH is not set
# CONFIG_NLS_UTF8 is not set
# CONFIG_UNICODE is not set
CONFIG_IO_WQ=y
# end of File systems
#
# Security options
#
CONFIG_KEYS=y
# CONFIG_KEYS_REQUEST_CACHE is not set
# CONFIG_PERSISTENT_KEYRINGS is not set
# CONFIG_TRUSTED_KEYS is not set
# CONFIG_ENCRYPTED_KEYS is not set
# CONFIG_KEY_DH_OPERATIONS is not set
# CONFIG_SECURITY_DMESG_RESTRICT is not set
# CONFIG_SECURITY is not set
# CONFIG_SECURITYFS is not set
# CONFIG_HARDENED_USERCOPY is not set
# CONFIG_FORTIFY_SOURCE is not set
# CONFIG_STATIC_USERMODEHELPER is not set
CONFIG_DEFAULT_SECURITY_DAC=y
CONFIG_LSM="yama,loadpin,safesetid,integrity"
#
# Kernel hardening options
#
#
# Memory initialization
#
CONFIG_CC_HAS_AUTO_VAR_INIT_PATTERN=y
CONFIG_CC_HAS_AUTO_VAR_INIT_ZERO_BARE=y
CONFIG_CC_HAS_AUTO_VAR_INIT_ZERO=y
CONFIG_INIT_STACK_NONE=y
# CONFIG_INIT_STACK_ALL_PATTERN is not set
# CONFIG_INIT_STACK_ALL_ZERO is not set
# CONFIG_INIT_ON_ALLOC_DEFAULT_ON is not set
# CONFIG_INIT_ON_FREE_DEFAULT_ON is not set
CONFIG_CC_HAS_ZERO_CALL_USED_REGS=y
# CONFIG_ZERO_CALL_USED_REGS is not set
# end of Memory initialization
#
# Hardening of kernel data structures
#
CONFIG_LIST_HARDENED=y
CONFIG_BUG_ON_DATA_CORRUPTION=y
# end of Hardening of kernel data structures
CONFIG_RANDSTRUCT_NONE=y
# end of Kernel hardening options
# end of Security options
CONFIG_CRYPTO=y
#
# Crypto core or helper
#
CONFIG_CRYPTO_ALGAPI=y
CONFIG_CRYPTO_ALGAPI2=y
CONFIG_CRYPTO_AEAD=y
CONFIG_CRYPTO_AEAD2=y
CONFIG_CRYPTO_SIG2=y
CONFIG_CRYPTO_SKCIPHER=y
CONFIG_CRYPTO_SKCIPHER2=y
CONFIG_CRYPTO_HASH=y
CONFIG_CRYPTO_HASH2=y
CONFIG_CRYPTO_RNG=y
CONFIG_CRYPTO_RNG2=y
CONFIG_CRYPTO_RNG_DEFAULT=y
CONFIG_CRYPTO_AKCIPHER2=y
CONFIG_CRYPTO_AKCIPHER=y
CONFIG_CRYPTO_KPP2=y
CONFIG_CRYPTO_KPP=y
CONFIG_CRYPTO_ACOMP2=y
CONFIG_CRYPTO_MANAGER=y
CONFIG_CRYPTO_MANAGER2=y
# CONFIG_CRYPTO_USER is not set
CONFIG_CRYPTO_MANAGER_DISABLE_TESTS=y
CONFIG_CRYPTO_NULL=y
CONFIG_CRYPTO_NULL2=y
CONFIG_CRYPTO_CRYPTD=y
CONFIG_CRYPTO_AUTHENC=y
# CONFIG_CRYPTO_TEST is not set
CONFIG_CRYPTO_SIMD=y
CONFIG_CRYPTO_ENGINE=y
# end of Crypto core or helper
#
# Public-key cryptography
#
CONFIG_CRYPTO_RSA=y
# CONFIG_CRYPTO_DH is not set
CONFIG_CRYPTO_ECC=y
CONFIG_CRYPTO_ECDH=y
# CONFIG_CRYPTO_ECDSA is not set
# CONFIG_CRYPTO_ECRDSA is not set
# CONFIG_CRYPTO_SM2 is not set
CONFIG_CRYPTO_CURVE25519=y
# end of Public-key cryptography
#
# Block ciphers
#
CONFIG_CRYPTO_AES=y
# CONFIG_CRYPTO_AES_TI is not set
# CONFIG_CRYPTO_ARIA is not set
# CONFIG_CRYPTO_BLOWFISH is not set
# CONFIG_CRYPTO_CAMELLIA is not set
# CONFIG_CRYPTO_CAST5 is not set
# CONFIG_CRYPTO_CAST6 is not set
CONFIG_CRYPTO_DES=y
# CONFIG_CRYPTO_FCRYPT is not set
# CONFIG_CRYPTO_SERPENT is not set
# CONFIG_CRYPTO_SM4_GENERIC is not set
# CONFIG_CRYPTO_TWOFISH is not set
# end of Block ciphers
#
# Length-preserving ciphers and modes
#
CONFIG_CRYPTO_ADIANTUM=y
CONFIG_CRYPTO_CHACHA20=y
CONFIG_CRYPTO_CBC=y
CONFIG_CRYPTO_CTR=y
# CONFIG_CRYPTO_CTS is not set
CONFIG_CRYPTO_ECB=y
# CONFIG_CRYPTO_HCTR2 is not set
# CONFIG_CRYPTO_KEYWRAP is not set
# CONFIG_CRYPTO_LRW is not set
# CONFIG_CRYPTO_PCBC is not set
# CONFIG_CRYPTO_XTS is not set
CONFIG_CRYPTO_NHPOLY1305=y
# end of Length-preserving ciphers and modes
#
# AEAD (authenticated encryption with associated data) ciphers
#
# CONFIG_CRYPTO_AEGIS128 is not set
CONFIG_CRYPTO_CHACHA20POLY1305=y
CONFIG_CRYPTO_CCM=y
CONFIG_CRYPTO_GCM=y
CONFIG_CRYPTO_GENIV=y
CONFIG_CRYPTO_SEQIV=y
CONFIG_CRYPTO_ECHAINIV=y
# CONFIG_CRYPTO_ESSIV is not set
# end of AEAD (authenticated encryption with associated data) ciphers
#
# Hashes, digests, and MACs
#
# CONFIG_CRYPTO_BLAKE2B is not set
CONFIG_CRYPTO_CMAC=y
CONFIG_CRYPTO_GHASH=y
CONFIG_CRYPTO_HMAC=y
# CONFIG_CRYPTO_MD4 is not set
CONFIG_CRYPTO_MD5=y
# CONFIG_CRYPTO_MICHAEL_MIC is not set
CONFIG_CRYPTO_POLY1305=y
# CONFIG_CRYPTO_RMD160 is not set
CONFIG_CRYPTO_SHA1=y
CONFIG_CRYPTO_SHA256=y
CONFIG_CRYPTO_SHA512=y
CONFIG_CRYPTO_SHA3=y
# CONFIG_CRYPTO_SM3_GENERIC is not set
# CONFIG_CRYPTO_STREEBOG is not set
# CONFIG_CRYPTO_VMAC is not set
# CONFIG_CRYPTO_WP512 is not set
CONFIG_CRYPTO_XCBC=y
# CONFIG_CRYPTO_XXHASH is not set
# end of Hashes, digests, and MACs
#
# CRCs (cyclic redundancy checks)
#
CONFIG_CRYPTO_CRC32C=y
# CONFIG_CRYPTO_CRC32 is not set
# CONFIG_CRYPTO_CRCT10DIF is not set
# end of CRCs (cyclic redundancy checks)
#
# Compression
#
CONFIG_CRYPTO_DEFLATE=y
CONFIG_CRYPTO_LZO=y
# CONFIG_CRYPTO_842 is not set
CONFIG_CRYPTO_LZ4=y
# CONFIG_CRYPTO_LZ4HC is not set
CONFIG_CRYPTO_ZSTD=y
# end of Compression
#
# Random number generation
#
CONFIG_CRYPTO_ANSI_CPRNG=y
CONFIG_CRYPTO_DRBG_MENU=y
CONFIG_CRYPTO_DRBG_HMAC=y
# CONFIG_CRYPTO_DRBG_HASH is not set
# CONFIG_CRYPTO_DRBG_CTR is not set
CONFIG_CRYPTO_DRBG=y
CONFIG_CRYPTO_JITTERENTROPY=y
CONFIG_CRYPTO_JITTERENTROPY_MEMORY_BLOCKS=64
CONFIG_CRYPTO_JITTERENTROPY_MEMORY_BLOCKSIZE=32
CONFIG_CRYPTO_JITTERENTROPY_OSR=1
# end of Random number generation
#
# Userspace interface
#
# CONFIG_CRYPTO_USER_API_HASH is not set
# CONFIG_CRYPTO_USER_API_SKCIPHER is not set
# CONFIG_CRYPTO_USER_API_RNG is not set
# CONFIG_CRYPTO_USER_API_AEAD is not set
# end of Userspace interface
CONFIG_CRYPTO_HASH_INFO=y
#
# Accelerated Cryptographic Algorithms for CPU (x86)
#
CONFIG_CRYPTO_CURVE25519_X86=y
CONFIG_CRYPTO_AES_NI_INTEL=y
# CONFIG_CRYPTO_BLOWFISH_X86_64 is not set
# CONFIG_CRYPTO_CAMELLIA_X86_64 is not set
# CONFIG_CRYPTO_CAMELLIA_AESNI_AVX_X86_64 is not set
# CONFIG_CRYPTO_CAMELLIA_AESNI_AVX2_X86_64 is not set
# CONFIG_CRYPTO_CAST5_AVX_X86_64 is not set
# CONFIG_CRYPTO_CAST6_AVX_X86_64 is not set
# CONFIG_CRYPTO_DES3_EDE_X86_64 is not set
# CONFIG_CRYPTO_SERPENT_SSE2_X86_64 is not set
# CONFIG_CRYPTO_SERPENT_AVX_X86_64 is not set
# CONFIG_CRYPTO_SERPENT_AVX2_X86_64 is not set
# CONFIG_CRYPTO_SM4_AESNI_AVX_X86_64 is not set
# CONFIG_CRYPTO_SM4_AESNI_AVX2_X86_64 is not set
# CONFIG_CRYPTO_TWOFISH_X86_64 is not set
# CONFIG_CRYPTO_TWOFISH_X86_64_3WAY is not set
# CONFIG_CRYPTO_TWOFISH_AVX_X86_64 is not set
# CONFIG_CRYPTO_ARIA_AESNI_AVX_X86_64 is not set
# CONFIG_CRYPTO_ARIA_AESNI_AVX2_X86_64 is not set
# CONFIG_CRYPTO_ARIA_GFNI_AVX512_X86_64 is not set
CONFIG_CRYPTO_CHACHA20_X86_64=y
# CONFIG_CRYPTO_AEGIS128_AESNI_SSE2 is not set
CONFIG_CRYPTO_NHPOLY1305_SSE2=y
CONFIG_CRYPTO_NHPOLY1305_AVX2=y
# CONFIG_CRYPTO_BLAKE2S_X86 is not set
# CONFIG_CRYPTO_POLYVAL_CLMUL_NI is not set
CONFIG_CRYPTO_POLY1305_X86_64=y
CONFIG_CRYPTO_SHA1_SSSE3=y
CONFIG_CRYPTO_SHA256_SSSE3=y
CONFIG_CRYPTO_SHA512_SSSE3=y
# CONFIG_CRYPTO_SM3_AVX_X86_64 is not set
# CONFIG_CRYPTO_GHASH_CLMUL_NI_INTEL is not set
# CONFIG_CRYPTO_CRC32C_INTEL is not set
# CONFIG_CRYPTO_CRC32_PCLMUL is not set
# end of Accelerated Cryptographic Algorithms for CPU (x86)
CONFIG_CRYPTO_HW=y
# CONFIG_CRYPTO_DEV_PADLOCK is not set
# CONFIG_CRYPTO_DEV_CCP is not set
# CONFIG_CRYPTO_DEV_NITROX_CNN55XX is not set
# CONFIG_CRYPTO_DEV_QAT_DH895xCC is not set
# CONFIG_CRYPTO_DEV_QAT_C3XXX is not set
# CONFIG_CRYPTO_DEV_QAT_C62X is not set
# CONFIG_CRYPTO_DEV_QAT_4XXX is not set
# CONFIG_CRYPTO_DEV_QAT_420XX is not set
# CONFIG_CRYPTO_DEV_QAT_DH895xCCVF is not set
# CONFIG_CRYPTO_DEV_QAT_C3XXXVF is not set
# CONFIG_CRYPTO_DEV_QAT_C62XVF is not set
CONFIG_CRYPTO_DEV_VIRTIO=y
# CONFIG_CRYPTO_DEV_SAFEXCEL is not set
# CONFIG_CRYPTO_DEV_AMLOGIC_GXL is not set
CONFIG_ASYMMETRIC_KEY_TYPE=y
CONFIG_ASYMMETRIC_PUBLIC_KEY_SUBTYPE=y
CONFIG_X509_CERTIFICATE_PARSER=y
# CONFIG_PKCS8_PRIVATE_KEY_PARSER is not set
CONFIG_PKCS7_MESSAGE_PARSER=y
# CONFIG_PKCS7_TEST_KEY is not set
# CONFIG_SIGNED_PE_FILE_VERIFICATION is not set
# CONFIG_FIPS_SIGNATURE_SELFTEST is not set
#
# Certificates for signature checking
#
CONFIG_MODULE_SIG_KEY="certs/signing_key.pem"
CONFIG_MODULE_SIG_KEY_TYPE_RSA=y
CONFIG_SYSTEM_TRUSTED_KEYRING=y
CONFIG_SYSTEM_TRUSTED_KEYS=""
# CONFIG_SYSTEM_EXTRA_CERTIFICATE is not set
# CONFIG_SECONDARY_TRUSTED_KEYRING is not set
# CONFIG_SYSTEM_BLACKLIST_KEYRING is not set
# end of Certificates for signature checking
CONFIG_BINARY_PRINTF=y
#
# Library routines
#
# CONFIG_PACKING is not set
CONFIG_BITREVERSE=y
CONFIG_GENERIC_STRNCPY_FROM_USER=y
CONFIG_GENERIC_STRNLEN_USER=y
CONFIG_GENERIC_NET_UTILS=y
# CONFIG_CORDIC is not set
# CONFIG_PRIME_NUMBERS is not set
CONFIG_RATIONAL=y
CONFIG_GENERIC_IOMAP=y
CONFIG_ARCH_USE_CMPXCHG_LOCKREF=y
CONFIG_ARCH_HAS_FAST_MULTIPLIER=y
CONFIG_ARCH_USE_SYM_ANNOTATIONS=y
#
# Crypto library routines
#
CONFIG_CRYPTO_LIB_UTILS=y
CONFIG_CRYPTO_LIB_AES=y
CONFIG_CRYPTO_LIB_GF128MUL=y
CONFIG_CRYPTO_LIB_BLAKE2S_GENERIC=y
CONFIG_CRYPTO_ARCH_HAVE_LIB_CHACHA=y
CONFIG_CRYPTO_LIB_CHACHA_GENERIC=y
# CONFIG_CRYPTO_LIB_CHACHA is not set
CONFIG_CRYPTO_ARCH_HAVE_LIB_CURVE25519=y
CONFIG_CRYPTO_LIB_CURVE25519_GENERIC=y
# CONFIG_CRYPTO_LIB_CURVE25519 is not set
CONFIG_CRYPTO_LIB_DES=y
CONFIG_CRYPTO_LIB_POLY1305_RSIZE=11
CONFIG_CRYPTO_ARCH_HAVE_LIB_POLY1305=y
CONFIG_CRYPTO_LIB_POLY1305_GENERIC=y
# CONFIG_CRYPTO_LIB_POLY1305 is not set
# CONFIG_CRYPTO_LIB_CHACHA20POLY1305 is not set
CONFIG_CRYPTO_LIB_SHA1=y
CONFIG_CRYPTO_LIB_SHA256=y
# end of Crypto library routines
# CONFIG_CRC_CCITT is not set
CONFIG_CRC16=y
# CONFIG_CRC_T10DIF is not set
# CONFIG_CRC64_ROCKSOFT is not set
# CONFIG_CRC_ITU_T is not set
CONFIG_CRC32=y
# CONFIG_CRC32_SELFTEST is not set
CONFIG_CRC32_SLICEBY8=y
# CONFIG_CRC32_SLICEBY4 is not set
# CONFIG_CRC32_SARWATE is not set
# CONFIG_CRC32_BIT is not set
# CONFIG_CRC64 is not set
# CONFIG_CRC4 is not set
# CONFIG_CRC7 is not set
CONFIG_LIBCRC32C=y
# CONFIG_CRC8 is not set
CONFIG_XXHASH=y
# CONFIG_RANDOM32_SELFTEST is not set
CONFIG_ZLIB_INFLATE=y
CONFIG_ZLIB_DEFLATE=y
CONFIG_LZO_COMPRESS=y
CONFIG_LZO_DECOMPRESS=y
CONFIG_LZ4_COMPRESS=y
CONFIG_LZ4_DECOMPRESS=y
CONFIG_ZSTD_COMMON=y
CONFIG_ZSTD_COMPRESS=y
CONFIG_ZSTD_DECOMPRESS=y
CONFIG_XZ_DEC=y
CONFIG_XZ_DEC_X86=y
CONFIG_XZ_DEC_POWERPC=y
CONFIG_XZ_DEC_ARM=y
CONFIG_XZ_DEC_ARMTHUMB=y
CONFIG_XZ_DEC_SPARC=y
# CONFIG_XZ_DEC_MICROLZMA is not set
CONFIG_XZ_DEC_BCJ=y
# CONFIG_XZ_DEC_TEST is not set
CONFIG_DECOMPRESS_GZIP=y
CONFIG_DECOMPRESS_BZIP2=y
CONFIG_DECOMPRESS_LZMA=y
CONFIG_DECOMPRESS_XZ=y
CONFIG_DECOMPRESS_LZO=y
CONFIG_DECOMPRESS_LZ4=y
CONFIG_DECOMPRESS_ZSTD=y
CONFIG_GENERIC_ALLOCATOR=y
CONFIG_ASSOCIATIVE_ARRAY=y
CONFIG_HAS_IOMEM=y
CONFIG_HAS_IOPORT=y
CONFIG_HAS_IOPORT_MAP=y
CONFIG_HAS_DMA=y
CONFIG_DMA_OPS=y
CONFIG_NEED_SG_DMA_FLAGS=y
CONFIG_NEED_SG_DMA_LENGTH=y
CONFIG_NEED_DMA_MAP_STATE=y
CONFIG_ARCH_DMA_ADDR_T_64BIT=y
CONFIG_SWIOTLB=y
# CONFIG_SWIOTLB_DYNAMIC is not set
# CONFIG_DMA_API_DEBUG is not set
# CONFIG_DMA_MAP_BENCHMARK is not set
CONFIG_SGL_ALLOC=y
CONFIG_DQL=y
CONFIG_GLOB=y
# CONFIG_GLOB_SELFTEST is not set
CONFIG_NLATTR=y
CONFIG_CLZ_TAB=y
# CONFIG_IRQ_POLL is not set
CONFIG_MPILIB=y
CONFIG_DIMLIB=y
CONFIG_OID_REGISTRY=y
CONFIG_HAVE_GENERIC_VDSO=y
CONFIG_GENERIC_GETTIMEOFDAY=y
CONFIG_GENERIC_VDSO_TIME_NS=y
CONFIG_SG_POOL=y
CONFIG_ARCH_HAS_PMEM_API=y
CONFIG_ARCH_HAS_CPU_CACHE_INVALIDATE_MEMREGION=y
CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE=y
CONFIG_ARCH_HAS_COPY_MC=y
CONFIG_ARCH_STACKWALK=y
CONFIG_STACKDEPOT=y
CONFIG_STACKDEPOT_ALWAYS_INIT=y
CONFIG_STACKDEPOT_MAX_FRAMES=64
CONFIG_SBITMAP=y
# CONFIG_LWQ_TEST is not set
# end of Library routines
CONFIG_FIRMWARE_TABLE=y
#
# Kernel hacking
#
#
# printk and dmesg options
#
CONFIG_PRINTK_TIME=y
# CONFIG_PRINTK_CALLER is not set
# CONFIG_STACKTRACE_BUILD_ID is not set
CONFIG_CONSOLE_LOGLEVEL_DEFAULT=7
CONFIG_CONSOLE_LOGLEVEL_QUIET=4
CONFIG_MESSAGE_LOGLEVEL_DEFAULT=4
# CONFIG_BOOT_PRINTK_DELAY is not set
# CONFIG_DYNAMIC_DEBUG is not set
# CONFIG_DYNAMIC_DEBUG_CORE is not set
CONFIG_SYMBOLIC_ERRNAME=y
CONFIG_DEBUG_BUGVERBOSE=y
# end of printk and dmesg options
CONFIG_DEBUG_KERNEL=y
CONFIG_DEBUG_MISC=y
#
# Compile-time checks and compiler options
#
CONFIG_DEBUG_INFO=y
CONFIG_AS_HAS_NON_CONST_ULEB128=y
# CONFIG_DEBUG_INFO_NONE is not set
# CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT is not set
CONFIG_DEBUG_INFO_DWARF4=y
# CONFIG_DEBUG_INFO_DWARF5 is not set
# CONFIG_DEBUG_INFO_REDUCED is not set
CONFIG_DEBUG_INFO_COMPRESSED_NONE=y
# CONFIG_DEBUG_INFO_COMPRESSED_ZLIB is not set
# CONFIG_DEBUG_INFO_COMPRESSED_ZSTD is not set
# CONFIG_DEBUG_INFO_SPLIT is not set
# CONFIG_DEBUG_INFO_BTF is not set
CONFIG_PAHOLE_HAS_SPLIT_BTF=y
CONFIG_PAHOLE_HAS_LANG_EXCLUDE=y
CONFIG_GDB_SCRIPTS=y
CONFIG_FRAME_WARN=2048
# CONFIG_STRIP_ASM_SYMS is not set
# CONFIG_READABLE_ASM is not set
# CONFIG_HEADERS_INSTALL is not set
# CONFIG_DEBUG_SECTION_MISMATCH is not set
CONFIG_SECTION_MISMATCH_WARN_ONLY=y
CONFIG_OBJTOOL=y
# CONFIG_DEBUG_FORCE_WEAK_PER_CPU is not set
# end of Compile-time checks and compiler options
#
# Generic Kernel Debugging Instruments
#
CONFIG_MAGIC_SYSRQ=y
CONFIG_MAGIC_SYSRQ_DEFAULT_ENABLE=0x1
CONFIG_MAGIC_SYSRQ_SERIAL=y
CONFIG_MAGIC_SYSRQ_SERIAL_SEQUENCE=""
CONFIG_DEBUG_FS=y
CONFIG_DEBUG_FS_ALLOW_ALL=y
# CONFIG_DEBUG_FS_DISALLOW_MOUNT is not set
# CONFIG_DEBUG_FS_ALLOW_NONE is not set
CONFIG_HAVE_ARCH_KGDB=y
# CONFIG_KGDB is not set
CONFIG_ARCH_HAS_UBSAN=y
# CONFIG_UBSAN is not set
CONFIG_HAVE_ARCH_KCSAN=y
CONFIG_HAVE_KCSAN_COMPILER=y
# end of Generic Kernel Debugging Instruments
#
# Networking Debugging
#
# CONFIG_NET_DEV_REFCNT_TRACKER is not set
# CONFIG_NET_NS_REFCNT_TRACKER is not set
# CONFIG_DEBUG_NET is not set
# end of Networking Debugging
#
# Memory Debugging
#
CONFIG_PAGE_EXTENSION=y
CONFIG_DEBUG_PAGEALLOC=y
CONFIG_DEBUG_PAGEALLOC_ENABLE_DEFAULT=y
CONFIG_SLUB_DEBUG=y
CONFIG_SLUB_DEBUG_ON=y
CONFIG_PAGE_OWNER=y
# CONFIG_PAGE_TABLE_CHECK is not set
# CONFIG_PAGE_POISONING is not set
# CONFIG_DEBUG_PAGE_REF is not set
# CONFIG_DEBUG_RODATA_TEST is not set
CONFIG_ARCH_HAS_DEBUG_WX=y
# CONFIG_DEBUG_WX is not set
CONFIG_GENERIC_PTDUMP=y
# CONFIG_PTDUMP_DEBUGFS is not set
CONFIG_HAVE_DEBUG_KMEMLEAK=y
CONFIG_DEBUG_KMEMLEAK=y
CONFIG_DEBUG_KMEMLEAK_MEM_POOL_SIZE=16000
# CONFIG_DEBUG_KMEMLEAK_DEFAULT_OFF is not set
CONFIG_DEBUG_KMEMLEAK_AUTO_SCAN=y
# CONFIG_DEBUG_OBJECTS is not set
# CONFIG_SHRINKER_DEBUG is not set
# CONFIG_DEBUG_STACK_USAGE is not set
# CONFIG_SCHED_STACK_END_CHECK is not set
CONFIG_ARCH_HAS_DEBUG_VM_PGTABLE=y
# CONFIG_DEBUG_VM is not set
# CONFIG_DEBUG_VM_PGTABLE is not set
CONFIG_ARCH_HAS_DEBUG_VIRTUAL=y
# CONFIG_DEBUG_VIRTUAL is not set
CONFIG_DEBUG_MEMORY_INIT=y
CONFIG_ARCH_SUPPORTS_KMAP_LOCAL_FORCE_MAP=y
# CONFIG_DEBUG_KMAP_LOCAL_FORCE_MAP is not set
CONFIG_HAVE_ARCH_KASAN=y
CONFIG_HAVE_ARCH_KASAN_VMALLOC=y
CONFIG_CC_HAS_KASAN_GENERIC=y
CONFIG_CC_HAS_WORKING_NOSANITIZE_ADDRESS=y
CONFIG_KASAN=y
CONFIG_CC_HAS_KASAN_MEMINTRINSIC_PREFIX=y
CONFIG_KASAN_GENERIC=y
CONFIG_KASAN_OUTLINE=y
# CONFIG_KASAN_INLINE is not set
CONFIG_KASAN_STACK=y
CONFIG_KASAN_VMALLOC=y
# CONFIG_KASAN_MODULE_TEST is not set
# CONFIG_KASAN_EXTRA_INFO is not set
CONFIG_HAVE_ARCH_KFENCE=y
CONFIG_KFENCE=y
CONFIG_KFENCE_SAMPLE_INTERVAL=100
CONFIG_KFENCE_NUM_OBJECTS=255
# CONFIG_KFENCE_DEFERRABLE is not set
CONFIG_KFENCE_STRESS_TEST_FAULTS=0
CONFIG_HAVE_ARCH_KMSAN=y
# end of Memory Debugging
# CONFIG_DEBUG_SHIRQ is not set
#
# Debug Oops, Lockups and Hangs
#
# CONFIG_PANIC_ON_OOPS is not set
CONFIG_PANIC_ON_OOPS_VALUE=0
CONFIG_PANIC_TIMEOUT=0
# CONFIG_SOFTLOCKUP_DETECTOR is not set
# CONFIG_HARDLOCKUP_DETECTOR is not set
CONFIG_HARDLOCKUP_CHECK_TIMESTAMP=y
# CONFIG_DETECT_HUNG_TASK is not set
# CONFIG_WQ_WATCHDOG is not set
# CONFIG_WQ_CPU_INTENSIVE_REPORT is not set
# CONFIG_TEST_LOCKUP is not set
# end of Debug Oops, Lockups and Hangs
#
# Scheduler Debugging
#
CONFIG_SCHED_DEBUG=y
# CONFIG_SCHEDSTATS is not set
# end of Scheduler Debugging
# CONFIG_DEBUG_TIMEKEEPING is not set
CONFIG_DEBUG_PREEMPT=y
#
# Lock Debugging (spinlocks, mutexes, etc...)
#
CONFIG_LOCK_DEBUGGING_SUPPORT=y
# CONFIG_PROVE_LOCKING is not set
# CONFIG_LOCK_STAT is not set
# CONFIG_DEBUG_RT_MUTEXES is not set
# CONFIG_DEBUG_SPINLOCK is not set
CONFIG_DEBUG_MUTEXES=y
# CONFIG_DEBUG_WW_MUTEX_SLOWPATH is not set
# CONFIG_DEBUG_RWSEMS is not set
# CONFIG_DEBUG_LOCK_ALLOC is not set
CONFIG_DEBUG_ATOMIC_SLEEP=y
# CONFIG_DEBUG_LOCKING_API_SELFTESTS is not set
# CONFIG_LOCK_TORTURE_TEST is not set
# CONFIG_WW_MUTEX_SELFTEST is not set
# CONFIG_SCF_TORTURE_TEST is not set
# CONFIG_CSD_LOCK_WAIT_DEBUG is not set
# end of Lock Debugging (spinlocks, mutexes, etc...)
# CONFIG_NMI_CHECK_CPU is not set
# CONFIG_DEBUG_IRQFLAGS is not set
CONFIG_STACKTRACE=y
# CONFIG_WARN_ALL_UNSEEDED_RANDOM is not set
# CONFIG_DEBUG_KOBJECT is not set
#
# Debug kernel data structures
#
CONFIG_DEBUG_LIST=y
# CONFIG_DEBUG_PLIST is not set
# CONFIG_DEBUG_SG is not set
# CONFIG_DEBUG_NOTIFIERS is not set
# CONFIG_DEBUG_MAPLE_TREE is not set
# end of Debug kernel data structures
#
# RCU Debugging
#
# CONFIG_RCU_SCALE_TEST is not set
# CONFIG_RCU_TORTURE_TEST is not set
# CONFIG_RCU_REF_SCALE_TEST is not set
CONFIG_RCU_CPU_STALL_TIMEOUT=21
CONFIG_RCU_EXP_CPU_STALL_TIMEOUT=0
# CONFIG_RCU_CPU_STALL_CPUTIME is not set
CONFIG_RCU_TRACE=y
# CONFIG_RCU_EQS_DEBUG is not set
# end of RCU Debugging
# CONFIG_DEBUG_WQ_FORCE_RR_CPU is not set
# CONFIG_LATENCYTOP is not set
# CONFIG_DEBUG_CGROUP_REF is not set
CONFIG_USER_STACKTRACE_SUPPORT=y
CONFIG_NOP_TRACER=y
CONFIG_HAVE_RETHOOK=y
CONFIG_RETHOOK=y
CONFIG_HAVE_FUNCTION_TRACER=y
CONFIG_HAVE_FUNCTION_GRAPH_TRACER=y
CONFIG_HAVE_FUNCTION_GRAPH_RETVAL=y
CONFIG_HAVE_DYNAMIC_FTRACE=y
CONFIG_HAVE_DYNAMIC_FTRACE_WITH_REGS=y
CONFIG_HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS=y
CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS=y
CONFIG_HAVE_DYNAMIC_FTRACE_NO_PATCHABLE=y
CONFIG_HAVE_FTRACE_MCOUNT_RECORD=y
CONFIG_HAVE_SYSCALL_TRACEPOINTS=y
CONFIG_HAVE_FENTRY=y
CONFIG_HAVE_OBJTOOL_MCOUNT=y
CONFIG_HAVE_OBJTOOL_NOP_MCOUNT=y
CONFIG_HAVE_C_RECORDMCOUNT=y
CONFIG_HAVE_BUILDTIME_MCOUNT_SORT=y
CONFIG_BUILDTIME_MCOUNT_SORT=y
CONFIG_TRACER_MAX_TRACE=y
CONFIG_TRACE_CLOCK=y
CONFIG_RING_BUFFER=y
CONFIG_EVENT_TRACING=y
CONFIG_CONTEXT_SWITCH_TRACER=y
CONFIG_TRACING=y
CONFIG_GENERIC_TRACER=y
CONFIG_TRACING_SUPPORT=y
CONFIG_FTRACE=y
CONFIG_BOOTTIME_TRACING=y
CONFIG_FUNCTION_TRACER=y
CONFIG_FUNCTION_GRAPH_TRACER=y
# CONFIG_FUNCTION_GRAPH_RETVAL is not set
CONFIG_DYNAMIC_FTRACE=y
CONFIG_DYNAMIC_FTRACE_WITH_REGS=y
CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS=y
CONFIG_DYNAMIC_FTRACE_WITH_ARGS=y
# CONFIG_FPROBE is not set
CONFIG_FUNCTION_PROFILER=y
CONFIG_STACK_TRACER=y
# CONFIG_IRQSOFF_TRACER is not set
# CONFIG_PREEMPT_TRACER is not set
CONFIG_SCHED_TRACER=y
CONFIG_HWLAT_TRACER=y
# CONFIG_OSNOISE_TRACER is not set
# CONFIG_TIMERLAT_TRACER is not set
CONFIG_MMIOTRACE=y
CONFIG_FTRACE_SYSCALLS=y
CONFIG_TRACER_SNAPSHOT=y
# CONFIG_TRACER_SNAPSHOT_PER_CPU_SWAP is not set
CONFIG_BRANCH_PROFILE_NONE=y
# CONFIG_PROFILE_ANNOTATED_BRANCHES is not set
# CONFIG_PROFILE_ALL_BRANCHES is not set
CONFIG_BLK_DEV_IO_TRACE=y
CONFIG_KPROBE_EVENTS=y
# CONFIG_KPROBE_EVENTS_ON_NOTRACE is not set
CONFIG_UPROBE_EVENTS=y
CONFIG_BPF_EVENTS=y
CONFIG_DYNAMIC_EVENTS=y
CONFIG_PROBE_EVENTS=y
# CONFIG_BPF_KPROBE_OVERRIDE is not set
CONFIG_FTRACE_MCOUNT_RECORD=y
CONFIG_FTRACE_MCOUNT_USE_CC=y
CONFIG_TRACING_MAP=y
CONFIG_SYNTH_EVENTS=y
# CONFIG_USER_EVENTS is not set
CONFIG_HIST_TRIGGERS=y
# CONFIG_TRACE_EVENT_INJECT is not set
# CONFIG_TRACEPOINT_BENCHMARK is not set
CONFIG_RING_BUFFER_BENCHMARK=y
CONFIG_TRACE_EVAL_MAP_FILE=y
# CONFIG_FTRACE_RECORD_RECURSION is not set
# CONFIG_FTRACE_STARTUP_TEST is not set
# CONFIG_FTRACE_SORT_STARTUP_TEST is not set
# CONFIG_RING_BUFFER_STARTUP_TEST is not set
# CONFIG_RING_BUFFER_VALIDATE_TIME_DELTAS is not set
# CONFIG_MMIOTRACE_TEST is not set
# CONFIG_PREEMPTIRQ_DELAY_TEST is not set
# CONFIG_SYNTH_EVENT_GEN_TEST is not set
# CONFIG_KPROBE_EVENT_GEN_TEST is not set
# CONFIG_HIST_TRIGGERS_DEBUG is not set
# CONFIG_RV is not set
# CONFIG_PROVIDE_OHCI1394_DMA_INIT is not set
# CONFIG_SAMPLES is not set
CONFIG_HAVE_SAMPLE_FTRACE_DIRECT=y
CONFIG_HAVE_SAMPLE_FTRACE_DIRECT_MULTI=y
CONFIG_ARCH_HAS_DEVMEM_IS_ALLOWED=y
CONFIG_STRICT_DEVMEM=y
# CONFIG_IO_STRICT_DEVMEM is not set
#
# x86 Debugging
#
CONFIG_EARLY_PRINTK_USB=y
CONFIG_X86_VERBOSE_BOOTUP=y
CONFIG_EARLY_PRINTK=y
CONFIG_EARLY_PRINTK_DBGP=y
# CONFIG_EARLY_PRINTK_USB_XDBC is not set
# CONFIG_DEBUG_TLBFLUSH is not set
CONFIG_HAVE_MMIOTRACE_SUPPORT=y
# CONFIG_X86_DECODER_SELFTEST is not set
CONFIG_IO_DELAY_0X80=y
# CONFIG_IO_DELAY_0XED is not set
# CONFIG_IO_DELAY_UDELAY is not set
# CONFIG_IO_DELAY_NONE is not set
# CONFIG_DEBUG_BOOT_PARAMS is not set
# CONFIG_CPA_DEBUG is not set
# CONFIG_DEBUG_ENTRY is not set
# CONFIG_DEBUG_NMI_SELFTEST is not set
CONFIG_X86_DEBUG_FPU=y
# CONFIG_PUNIT_ATOM_DEBUG is not set
CONFIG_UNWINDER_ORC=y
# CONFIG_UNWINDER_FRAME_POINTER is not set
# end of x86 Debugging
#
# Kernel Testing and Coverage
#
# CONFIG_KUNIT is not set
# CONFIG_NOTIFIER_ERROR_INJECTION is not set
CONFIG_FUNCTION_ERROR_INJECTION=y
# CONFIG_FAULT_INJECTION is not set
CONFIG_ARCH_HAS_KCOV=y
CONFIG_CC_HAS_SANCOV_TRACE_PC=y
# CONFIG_KCOV is not set
CONFIG_RUNTIME_TESTING_MENU=y
# CONFIG_TEST_DHRY is not set
# CONFIG_LKDTM is not set
# CONFIG_TEST_MIN_HEAP is not set
# CONFIG_TEST_DIV64 is not set
# CONFIG_BACKTRACE_SELF_TEST is not set
# CONFIG_TEST_REF_TRACKER is not set
# CONFIG_RBTREE_TEST is not set
# CONFIG_REED_SOLOMON_TEST is not set
# CONFIG_INTERVAL_TREE_TEST is not set
# CONFIG_PERCPU_TEST is not set
# CONFIG_ATOMIC64_SELFTEST is not set
# CONFIG_TEST_HEXDUMP is not set
# CONFIG_TEST_KSTRTOX is not set
# CONFIG_TEST_PRINTF is not set
# CONFIG_TEST_SCANF is not set
# CONFIG_TEST_BITMAP is not set
# CONFIG_TEST_UUID is not set
# CONFIG_TEST_XARRAY is not set
# CONFIG_TEST_MAPLE_TREE is not set
# CONFIG_TEST_RHASHTABLE is not set
# CONFIG_TEST_IDA is not set
# CONFIG_TEST_LKM is not set
# CONFIG_TEST_BITOPS is not set
# CONFIG_TEST_VMALLOC is not set
# CONFIG_TEST_USER_COPY is not set
# CONFIG_TEST_BPF is not set
# CONFIG_TEST_BLACKHOLE_DEV is not set
# CONFIG_FIND_BIT_BENCHMARK is not set
# CONFIG_TEST_FIRMWARE is not set
# CONFIG_TEST_SYSCTL is not set
# CONFIG_TEST_UDELAY is not set
# CONFIG_TEST_STATIC_KEYS is not set
# CONFIG_TEST_KMOD is not set
# CONFIG_TEST_MEMCAT_P is not set
# CONFIG_TEST_MEMINIT is not set
# CONFIG_TEST_FREE_PAGES is not set
# CONFIG_TEST_FPU is not set
# CONFIG_TEST_CLOCKSOURCE_WATCHDOG is not set
# CONFIG_TEST_OBJPOOL is not set
CONFIG_ARCH_USE_MEMTEST=y
# CONFIG_MEMTEST is not set
# end of Kernel Testing and Coverage
#
# Rust hacking
#
# end of Rust hacking
# end of Kernel hacking
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH ipsec-next v2 0/17] Add IP-TFS mode to xfrm
2024-05-24 11:52 ` Antony Antony
@ 2024-05-24 11:56 ` Christian Hopps
2024-05-25 5:55 ` Christian Hopps
0 siblings, 1 reply; 34+ messages in thread
From: Christian Hopps @ 2024-05-24 11:56 UTC (permalink / raw)
To: Antony Antony
Cc: Christian Hopps, devel, Steffen Klassert, netdev, Christian Hopps
[-- Attachment #1: Type: text/plain, Size: 15429 bytes --]
This is very helpful, thanks.
I think the tunnel endpoints are east/west 192.1.2.{23,45}, but I can't determine the north/east endpoints b/c they don't appear connected. :)
Are there any other iptfs options? The code you highlight mentions the `dont-frag` option, but I wonder if you actually have that enabled?
It also seems like you are pinging and forcing the source IP of a red interface on the tunnel endpoint gateway directly (so that it doesn't try and use the black interface I would guess) is that correct?
Thanks!
Chris.
P.S. the addresses on the NIC host in the picture seem reversed, but this doesn't seem relevant to this test :)
Antony Antony <antony@phenome.org> writes:
> On Thu, May 23, 2024 at 07:04:58PM -0400, Christian Hopps wrote:
>>
>> Could you let me know some more details about this test? What is your interface config / topology? I tried to guess given the ping command, but it's not replicating for me.
>
> I am using the Libreswan testing topology. However, I am running the test manually.
> Yesterday the tunnel was between north and east. This morning I quickly tried
> between west and east. Just two VMs. I see the same issue there too.
>
> https://libreswan.org/wiki/images/f/f1/Testnet-202102.png
>
> I am using CONFIG_ESP_OFFLOAD. That is the only thing standing out. Besides that, it
> is just a 1500 MTU tunnel using qemu/kvm and a tap network.
>
> attached is my kernel .config
>
>> PS, I've changed the subject and In-reply-to to be based on the corrected
>> cover-letter I sent, I initially sent the cover letter with the wrong
>> subject. :(
>
> I noticed a second cover letter. However, it was not showing as related to
> the patch set correctly. It showed up as a different thread. That is why I
> replied to the initial one.
>
> -antony
>>
>>
>> Antony Antony <antony@phenome.org> writes:
>>
>> > Hi Chris,
>> >
>> > On Mon, May 20, 2024 at 05:42:38PM -0400, Christian Hopps via Devel wrote:
>> > > From: Christian Hopps <chopps@labn.net>
>> > > - iptfs: remove some BUG_ON() assertions questioned in review.
>>
>> ...
>>
>> > I ran a couple of tests and hit a KASAN BUG.
>> >
>> > I was sending a large ping while the MTU is 1500.
>> >
>> > [ 78.594770] ==================================================================
>> > [ 78.595825] BUG: KASAN: null-ptr-deref in iptfs_output_collect+0x263/0x57b
>> > [ 78.596658] Read of size 8 at addr 0000000000000108 by task ping/493
>> > [ 78.597435]
>> > [ 78.597651] CPU: 0 PID: 493 Comm: ping Not tainted 6.9.0-rc2-00697-g489ca863e24f-dirty #11
>> > [ 78.598645] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
>> > [ 78.599747] Call Trace:
>> > [ 78.600070] <TASK>
>> > [ 78.600354] dump_stack_lvl+0x2a/0x3b
>> > [ 78.600817] kasan_report+0x84/0xa6
>> > [ 78.601262] ? iptfs_output_collect+0x263/0x57b
>> > [ 78.601825] iptfs_output_collect+0x263/0x57b
>> > [ 78.602374] ip_send_skb+0x25/0x57
>> > [ 78.602807] raw_sendmsg+0xee8/0x1011
>> > [ 78.603269] ? native_flush_tlb_one_user+0xd/0xe5
>> > [ 78.603850] ? raw_hash_sk+0x21b/0x21b
>> > [ 78.604331] ? kernel_init_pages+0x42/0x51
>> > [ 78.604845] ? prep_new_page+0x44/0x51
>> > [ 78.605318] ? get_page_from_freelist+0x72b/0x915
>> > [ 78.605903] ? signal_pending_state+0x77/0x77
>> > [ 78.606462] ? __might_resched+0x8a/0x240
>> > [ 78.606966] ? __might_sleep+0x25/0xa0
>> > [ 78.607440] ? first_zones_zonelist+0x2c/0x43
>> > [ 78.607985] ? __rcu_read_lock+0x2d/0x3a
>> > [ 78.608479] ? __pte_offset_map+0x32/0xa4
>> > [ 78.608979] ? __might_resched+0x8a/0x240
>> > [ 78.609478] ? __might_sleep+0x25/0xa0
>> > [ 78.609949] ? inet_send_prepare+0x54/0x54
>> > [ 78.610464] ? sock_sendmsg_nosec+0x42/0x6c
>> > [ 78.610984] sock_sendmsg_nosec+0x42/0x6c
>> > [ 78.611485] __sys_sendto+0x15d/0x1cc
>> > [ 78.611947] ? __x64_sys_getpeername+0x44/0x44
>> > [ 78.612498] ? __handle_mm_fault+0x679/0xae4
>> > [ 78.613033] ? find_vma+0x6b/0x8b
>> > [ 78.613457] ? find_vma_intersection+0x8a/0x8a
>> > [ 78.614006] ? __handle_irq_event_percpu+0x180/0x197
>> > [ 78.614617] ? handle_mm_fault+0x38/0x154
>> > [ 78.615114] ? handle_mm_fault+0xeb/0x154
>> > [ 78.615620] ? preempt_latency_start+0x29/0x34
>> > [ 78.616169] ? preempt_count_sub+0x14/0xb3
>> > [ 78.616678] ? up_read+0x4b/0x5c
>> > [ 78.617094] __x64_sys_sendto+0x76/0x82
>> > [ 78.617577] do_syscall_64+0x6b/0xd7
>> > [ 78.618043] entry_SYSCALL_64_after_hwframe+0x46/0x4e
>> > [ 78.618667] RIP: 0033:0x7fed3de99a73
>> > [ 78.619118] Code: 8b 15 a9 83 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b8
>> > 0f 1f 00 80 3d 71 0b 0d 00 00 41 89 ca 74 14 b8 2c 00 00 00 0f 05 <48> 3d 00 f0
>> > ff ff 77 75 c3 0f 1f 40 00 55 48 83 ec 30 44 89 4c 24
>> > [ 78.621291] RSP: 002b:00007ffff6bdf478 EFLAGS: 00000202 ORIG_RAX: 000000000000002c
>> > [ 78.622205] RAX: ffffffffffffffda RBX: 000055c538159340 RCX: 00007fed3de99a73
>> > [ 78.623056] RDX: 00000000000007d8 RSI: 000055c53815f3c0 RDI: 0000000000000003
>> > [ 78.623908] RBP: 000055c53815f3c0 R08: 000055c53815b5c0 R09: 0000000000000010
>> > [ 78.624765] R10: 0000000000000000 R11: 0000000000000202 R12: 00000000000007d8
>> > [ 78.625619] R13: 00007ffff6be0b60 R14: 0000001d00000001 R15: 000055c53815c680
>> > [ 78.626480] </TASK>
>> > [ 78.626773] ==================================================================
>> > [ 78.627656] Disabling lock debugging due to kernel taint
>> > [ 78.628305] BUG: kernel NULL pointer dereference, address: 0000000000000108
>> > [ 78.629136] #PF: supervisor read access in kernel mode
>> > [ 78.629766] #PF: error_code(0x0000) - not-present page
>> > [ 78.630402] PGD 0 P4D 0
>> > [ 78.630739] Oops: 0000 [#1] PREEMPT DEBUG_PAGEALLOC KASAN
>> > [ 78.631398] CPU: 0 PID: 493 Comm: ping Tainted: G B 6.9.0-rc2-00697-g489ca863e24f-dirty #11
>> > [ 78.632548] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
>> > [ 78.633649] RIP: 0010:iptfs_output_collect+0x263/0x57b
>> > [ 78.634283] Code: 73 70 0f 84 25 01 00 00 45 39 f4 0f 83 1c 01 00 00 48 8d 7b
>> > 10 e8 27 37 62 ff 4c 8b 73 10 49 8d be 08 01 00 00 e8 17 37 62 ff <4d> 8b b6 08
>> > 01 00 00 49 8d be b0 01 00 00 e8 04 37 62 ff 49 8b 86
>> > [ 78.636444] RSP: 0018:ffffc90000d679c8 EFLAGS: 00010296
>> > [ 78.637076] RAX: 0000000000000001 RBX: ffff888110ffbc80 RCX: fffffbfff07623ad
>> > [ 78.637923] RDX: fffffbfff07623ad RSI: fffffbfff07623ad RDI: ffffffff83b11d60
>> > [ 78.638792] RBP: ffff88810e3a1400 R08: 0000000000000008 R09: 0000000000000001
>> > [ 78.639645] R10: ffffffff83b11d67 R11: fffffbfff07623ac R12: 00000000000005a2
>> > [ 78.640498] R13: 0000000000000000 R14: 0000000000000000 R15: ffff88810e9a3401
>> > [ 78.641359] FS: 00007fed3dbddc40(0000) GS:ffffffff82cb2000(0000) knlGS:0000000000000000
>> > [ 78.642324] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> > [ 78.643022] CR2: 0000000000000108 CR3: 0000000110e84000 CR4: 0000000000350ef0
>> > [ 78.643882] Call Trace:
>> > [ 78.644204] <TASK>
>> > [ 78.644487] ? __die_body+0x1a/0x56
>> > [ 78.644929] ? page_fault_oops+0x45f/0x4cd
>> > [ 78.645441] ? dump_pagetable+0x1db/0x1db
>> > [ 78.645942] ? vprintk_emit+0x163/0x171
>> > [ 78.646425] ? iptfs_output_collect+0x263/0x57b
>> > [ 78.646986] ? _printk+0xb2/0xe1
>> > [ 78.647401] ? find_first_fitting_seq+0x193/0x193
>> > [ 78.647982] ? iptfs_output_collect+0x263/0x57b
>> > [ 78.648541] ? do_user_addr_fault+0x14f/0x56c
>> > [ 78.649084] ? exc_page_fault+0xa5/0xbe
>> > [ 78.649566] ? asm_exc_page_fault+0x22/0x30
>> > [ 78.650100] ? iptfs_output_collect+0x263/0x57b
>> > [ 78.650660] ? iptfs_output_collect+0x263/0x57b
>> > [ 78.651221] ip_send_skb+0x25/0x57
>> > [ 78.651652] raw_sendmsg+0xee8/0x1011
>> > [ 78.652113] ? native_flush_tlb_one_user+0xd/0xe5
>> > [ 78.652693] ? raw_hash_sk+0x21b/0x21b
>> > [ 78.653166] ? kernel_init_pages+0x42/0x51
>> > [ 78.653683] ? prep_new_page+0x44/0x51
>> > [ 78.654160] ? get_page_from_freelist+0x72b/0x915
>> > [ 78.654739] ? signal_pending_state+0x77/0x77
>> > [ 78.655284] ? __might_resched+0x8a/0x240
>> > [ 78.655784] ? __might_sleep+0x25/0xa0
>> > [ 78.656255] ? first_zones_zonelist+0x2c/0x43
>> > [ 78.656798] ? __rcu_read_lock+0x2d/0x3a
>> > [ 78.657289] ? __pte_offset_map+0x32/0xa4
>> > [ 78.657788] ? __might_resched+0x8a/0x240
>> > [ 78.658291] ? __might_sleep+0x25/0xa0
>> > [ 78.658763] ? inet_send_prepare+0x54/0x54
>> > [ 78.659272] ? sock_sendmsg_nosec+0x42/0x6c
>> > [ 78.659791] sock_sendmsg_nosec+0x42/0x6c
>> > [ 78.660293] __sys_sendto+0x15d/0x1cc
>> > [ 78.660755] ? __x64_sys_getpeername+0x44/0x44
>> > [ 78.661304] ? __handle_mm_fault+0x679/0xae4
>> > [ 78.661838] ? find_vma+0x6b/0x8b
>> > [ 78.662272] ? find_vma_intersection+0x8a/0x8a
>> > [ 78.662828] ? __handle_irq_event_percpu+0x180/0x197
>> > [ 78.663436] ? handle_mm_fault+0x38/0x154
>> > [ 78.663935] ? handle_mm_fault+0xeb/0x154
>> > [ 78.664435] ? preempt_latency_start+0x29/0x34
>> > [ 78.664987] ? preempt_count_sub+0x14/0xb3
>> > [ 78.665498] ? up_read+0x4b/0x5c
>> > [ 78.665911] __x64_sys_sendto+0x76/0x82
>> > [ 78.666398] do_syscall_64+0x6b/0xd7
>> > [ 78.666849] entry_SYSCALL_64_after_hwframe+0x46/0x4e
>> > [ 78.667466] RIP: 0033:0x7fed3de99a73
>> > [ 78.667918] Code: 8b 15 a9 83 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b8
>> > 0f 1f 00 80 3d 71 0b 0d 00 00 41 89 ca 74 14 b8 2c 00 00 00 0f 05 <48> 3d 00 f0
>> > ff ff 77 75 c3 0f 1f 40 00 55 48 83 ec 30 44 89 4c 24
>> > [ 78.670097] RSP: 002b:00007ffff6bdf478 EFLAGS: 00000202 ORIG_RAX: 000000000000002c
>> > [ 78.671002] RAX: ffffffffffffffda RBX: 000055c538159340 RCX: 00007fed3de99a73
>> > [ 78.671858] RDX: 00000000000007d8 RSI: 000055c53815f3c0 RDI: 0000000000000003
>> > [ 78.672708] RBP: 000055c53815f3c0 R08: 000055c53815b5c0 R09: 0000000000000010
>> > [ 78.673564] R10: 0000000000000000 R11: 0000000000000202 R12: 00000000000007d8
>> > [ 78.674430] R13: 00007ffff6be0b60 R14: 0000001d00000001 R15: 000055c53815c680
>> > [ 78.675287] </TASK>
>> > [ 78.675580] Modules linked in:
>> > [ 78.675975] CR2: 0000000000000108
>> > [ 78.676396] ---[ end trace 0000000000000000 ]---
>> > [ 78.676966] RIP: 0010:iptfs_output_collect+0x263/0x57b
>> > [ 78.677596] Code: 73 70 0f 84 25 01 00 00 45 39 f4 0f 83 1c 01 00 00 48 8d 7b
>> > 10 e8 27 37 62 ff 4c 8b 73 10 49 8d be 08 01 00 00 e8 17 37 62 ff <4d> 8b b6 08
>> > 01 00 00 49 8d be b0 01 00 00 e8 04 37 62 ff 49 8b 86
>> > [ 78.679768] RSP: 0018:ffffc90000d679c8 EFLAGS: 00010296
>> > [ 78.680410] RAX: 0000000000000001 RBX: ffff888110ffbc80 RCX: fffffbfff07623ad
>> > [ 78.681264] RDX: fffffbfff07623ad RSI: fffffbfff07623ad RDI: ffffffff83b11d60
>> > [ 78.682136] RBP: ffff88810e3a1400 R08: 0000000000000008 R09: 0000000000000001
>> > [ 78.682997] R10: ffffffff83b11d67 R11: fffffbfff07623ac R12: 00000000000005a2
>> > [ 78.683853] R13: 0000000000000000 R14: 0000000000000000 R15: ffff88810e9a3401
>> > [ 78.684710] FS: 00007fed3dbddc40(0000) GS:ffffffff82cb2000(0000) knlGS:0000000000000000
>> > [ 78.685675] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> > [ 78.686387] CR2: 0000000000000108 CR3: 0000000110e84000 CR4: 0000000000350ef0
>> > [ 78.687246] Kernel panic - not syncing: Fatal exception in interrupt
>> > [ 78.688014] Kernel Offset: disabled
>> > [ 78.688460] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
>> >
>> > ping -s 2000 -n -q -W 1 -c 2 -I 192.0.3.254 192.0.2.254
>> >
>> > (gdb) list *iptfs_output_collect+0x263
>> > 0xffffffff81d5076f is in iptfs_output_collect (./include/net/net_namespace.h:383).
>> > 378 }
>> > 379
>> > 380 static inline struct net *read_pnet(const possible_net_t *pnet)
>> > 381 {
>> > 382 #ifdef CONFIG_NET_NS
>> > 383 return rcu_dereference_protected(pnet->net, true);
>> > 384 #else
>> > 385 return &init_net;
>> > 386 #endif
>> > 387 }
>> >
>> > I suspect the actual crash is from line 1756 instead:
>> > (gdb) list *iptfs_output_collect+0x256
>> > 0xffffffff81d50762 is in iptfs_output_collect (net/xfrm/xfrm_iptfs.c:1756).
>> > 1751 return 0;
>> > 1752
>> > 1753 /* We only send ICMP too big if the user has configured us as
>> > 1754 * dont-fragment.
>> > 1755 */
>> > 1756 XFRM_INC_STATS(dev_net(skb->dev), LINUX_MIB_XFRMOUTERROR);
>> > 1757
>> > 1758 if (sk) {
>> > 1759 xfrm_local_error(skb, pmtu);
>> > 1760 } else if (ip_hdr(skb)->version == 4) {
>> >
>> > Later I ran with gdb: iptfs_is_too_big is called twice, and the second time
>> > it crashes.
>> > Here is the gdb backtrace, just before the crash:
>> >
>> > #0 iptfs_is_too_big (pmtu=1442, skb=0xffff88810dbea3c0, sk=0xffff888104d4ed40) at net/xfrm/xfrm_iptfs.c:1756
>> > #1 iptfs_output_collect (net=<optimized out>, sk=0xffff888104d4ed40, skb=0xffff88810dbea3c0) at net/xfrm/xfrm_iptfs.c:1847
>> > #2 0xffffffff81c8a3cb in ip_send_skb (net=0xffffffff83e57f20 <init_net>, skb=0xffff88810dbea3c0)
>> > at net/ipv4/ip_output.c:1492
>> > #3 0xffffffff81c8a439 in ip_push_pending_frames (sk=sk@entry=0xffff888104d4ed40, fl4=fl4@entry=0xffffc90000e3fb90)
>> > at net/ipv4/ip_output.c:1512
>> > #4 0xffffffff81ccf3cf in raw_sendmsg (sk=0xffff888104d4ed40, msg=0xffffc90000e3fd80, len=<optimized out>)
>> > at net/ipv4/raw.c:654
>> > #5 0xffffffff81b096ea in sock_sendmsg_nosec (sock=sock@entry=0xffff888115136040, msg=msg@entry=0xffffc90000e3fd80)
>> > at net/socket.c:730
>> > #6 0xffffffff81b0c327 in __sock_sendmsg (msg=0xffffc90000e3fd80, sock=0xffff888115136040) at net/socket.c:745
>> > #7 __sys_sendto (fd=<optimized out>, buff=buff@entry=0x558edefb73c0, len=len@entry=2008, flags=flags@entry=0,
>> > addr=addr@entry=0x558edefb35c0, addr_len=addr_len@entry=16) at net/socket.c:2191
>> > #8 0xffffffff81b0c40c in __do_sys_sendto (addr_len=16, addr=0x558edefb35c0, flags=0, len=2008, buff=0x558edefb73c0,
>> > fd=<optimized out>) at net/socket.c:2203
>> > #9 __se_sys_sendto (addr_len=16, addr=94072114722240, flags=0, len=2008, buff=94072114738112, fd=<optimized out>)
>> > at net/socket.c:2199
>> >
>> > (gdb) list
>> > 1751 return 0;
>> > 1752
>> > 1753 /* We only send ICMP too big if the user has configured us as
>> > 1754 * dont-fragment.
>> > 1755 */
>> > 1756 XFRM_INC_STATS(dev_net(skb->dev), LINUX_MIB_XFRMOUTERROR);
>> > 1757
>> > 1758 if (sk) {
>> > 1759 xfrm_local_error(skb, pmtu);
>> > 1760 } else if (ip_hdr(skb)->version == 4) {
>> >
>> > -antony
>>
>
> [2. text/plain; .config]...
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 857 bytes --]
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH ipsec-next v2 0/17] Add IP-TFS mode to xfrm
2024-05-24 11:56 ` Christian Hopps
@ 2024-05-25 5:55 ` Christian Hopps
2024-06-06 15:52 ` [devel-ipsec] " Antony Antony
2024-06-11 6:24 ` Antony Antony
0 siblings, 2 replies; 34+ messages in thread
From: Christian Hopps @ 2024-05-25 5:55 UTC (permalink / raw)
To: Antony Antony
Cc: Christian Hopps, devel, Steffen Klassert, netdev, Christian Hopps
[-- Attachment #1: Type: text/plain, Size: 16183 bytes --]
Found. This was happening b/c the skb was locally generated on the gateway and so had no net_device. Fixed by checking for skb->dev == NULL before incrementing the error stats in the output path.
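For readers following along, the guard described above can be illustrated with a minimal user-space sketch. The types below are hypothetical stand-ins, not the kernel's definitions, and the helper names skb_net_or() and count_output_error() are made up for this example; the fix in the patch series itself may look different. The sketch only shows the idea of not dereferencing skb->dev when a locally generated packet has no device attached.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct net { unsigned long out_errors; };
struct net_device { struct net *nd_net; };
struct sk_buff { struct net_device *dev; };

/*
 * Pick the namespace to charge an output error to. A locally generated
 * skb may have no device attached, so fall back to a namespace supplied
 * by the caller instead of dereferencing a NULL skb->dev.
 */
static struct net *skb_net_or(const struct sk_buff *skb, struct net *fallback)
{
	return skb->dev ? skb->dev->nd_net : fallback;
}

/* Increment the error counter only when a namespace could be determined. */
static void count_output_error(const struct sk_buff *skb, struct net *fallback)
{
	struct net *net = skb_net_or(skb, fallback);

	if (net)
		net->out_errors++;
}
```

In this sketch, an skb with a device charges the error to that device's namespace, an skb without one charges it to the fallback, and nothing is counted if neither is available.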
Thanks!
Chris.
Christian Hopps <chopps@chopps.org> writes:
> [[PGP Signed Part:Good signature from 2E1D830ED7B83025 Christian Hopps <chopps@gmail.com> (trust ultimate) created at 2024-05-24T08:08:58-0400 using RSA]]
>
> This is very helpful, thanks.
>
> I think the tunnel endpoints are east/west 192.1.2.{23,45}, but I can't determine the north/east endpoints b/c they don't appear connected. :)
>
> Are there any other iptfs options? The code you highlight mentions the `dont-frag` option, but I wonder if you actually have that enabled?
>
> It also seems like you are pinging and forcing the source IP of a red interface
> on the tunnel endpoint gateway directly (so that it doesn't try and use the
> black interface I would guess) is that correct?
>
> Thanks!
> Chris.
>
> P.S. the addresses on the NIC host in the picture seem reversed, but this doesn't seem relevant to this test :)
>
> Antony Antony <antony@phenome.org> writes:
>
>> On Thu, May 23, 2024 at 07:04:58PM -0400, Christian Hopps wrote:
>>>
>>> Could you let me know some more details about this test? What is your interface config / topology? I tried to guess given the ping command, but it's not replicating for me.
>>
>> I am using the Libreswan testing topology. However, I am running the test manually.
>> Yesterday the tunnel was between north and east. This morning I quickly tried
>> between west and east. Just two VMs. I see the same issue there too.
>>
>> https://libreswan.org/wiki/images/f/f1/Testnet-202102.png
>>
>> I am using CONFIG_ESP_OFFLOAD. That is the only thing standing out. Besides that, it
>> is just a 1500 MTU tunnel using qemu/kvm and a tap network.
>>
>> attached is my kernel .config
>>
>>> PS, I've changed the subject and In-reply-to to be based on the corrected
>>> cover-letter I sent, I initially sent the cover letter with the wrong
>>> subject. :(
>>
>> I noticed a second cover letter. However, it was not showing as related to
>> the patch set correctly. It showed up as a different thread. That is why I
>> replied to the initial one.
>>
>> -antony
>>>
>>>
>>> Antony Antony <antony@phenome.org> writes:
>>>
>>> > Hi Chris,
>>> >
>>> > On Mon, May 20, 2024 at 05:42:38PM -0400, Christian Hopps via Devel wrote:
>>> > > From: Christian Hopps <chopps@labn.net>
>>> > > - iptfs: remove some BUG_ON() assertions questioned in review.
>>>
>>> ...
>>>
>>> > I ran a couple of tests and hit a KASAN BUG.
>>> >
>>> > I was sending a large ping while the MTU is 1500.
>>> >
>>> > [ 78.594770] ==================================================================
>>> > [ 78.595825] BUG: KASAN: null-ptr-deref in iptfs_output_collect+0x263/0x57b
>>> > [ 78.596658] Read of size 8 at addr 0000000000000108 by task ping/493
>>> > [ 78.597435]
>>> > [ 78.597651] CPU: 0 PID: 493 Comm: ping Not tainted 6.9.0-rc2-00697-g489ca863e24f-dirty #11
>>> > [ 78.598645] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
>>> > [ 78.599747] Call Trace:
>>> > [ 78.600070] <TASK>
>>> > [ 78.600354] dump_stack_lvl+0x2a/0x3b
>>> > [ 78.600817] kasan_report+0x84/0xa6
>>> > [ 78.601262] ? iptfs_output_collect+0x263/0x57b
>>> > [ 78.601825] iptfs_output_collect+0x263/0x57b
>>> > [ 78.602374] ip_send_skb+0x25/0x57
>>> > [ 78.602807] raw_sendmsg+0xee8/0x1011
>>> > [ 78.603269] ? native_flush_tlb_one_user+0xd/0xe5
>>> > [ 78.603850] ? raw_hash_sk+0x21b/0x21b
>>> > [ 78.604331] ? kernel_init_pages+0x42/0x51
>>> > [ 78.604845] ? prep_new_page+0x44/0x51
>>> > [ 78.605318] ? get_page_from_freelist+0x72b/0x915
>>> > [ 78.605903] ? signal_pending_state+0x77/0x77
>>> > [ 78.606462] ? __might_resched+0x8a/0x240
>>> > [ 78.606966] ? __might_sleep+0x25/0xa0
>>> > [ 78.607440] ? first_zones_zonelist+0x2c/0x43
>>> > [ 78.607985] ? __rcu_read_lock+0x2d/0x3a
>>> > [ 78.608479] ? __pte_offset_map+0x32/0xa4
>>> > [ 78.608979] ? __might_resched+0x8a/0x240
>>> > [ 78.609478] ? __might_sleep+0x25/0xa0
>>> > [ 78.609949] ? inet_send_prepare+0x54/0x54
>>> > [ 78.610464] ? sock_sendmsg_nosec+0x42/0x6c
>>> > [ 78.610984] sock_sendmsg_nosec+0x42/0x6c
>>> > [ 78.611485] __sys_sendto+0x15d/0x1cc
>>> > [ 78.611947] ? __x64_sys_getpeername+0x44/0x44
>>> > [ 78.612498] ? __handle_mm_fault+0x679/0xae4
>>> > [ 78.613033] ? find_vma+0x6b/0x8b
>>> > [ 78.613457] ? find_vma_intersection+0x8a/0x8a
>>> > [ 78.614006] ? __handle_irq_event_percpu+0x180/0x197
>>> > [ 78.614617] ? handle_mm_fault+0x38/0x154
>>> > [ 78.615114] ? handle_mm_fault+0xeb/0x154
>>> > [ 78.615620] ? preempt_latency_start+0x29/0x34
>>> > [ 78.616169] ? preempt_count_sub+0x14/0xb3
>>> > [ 78.616678] ? up_read+0x4b/0x5c
>>> > [ 78.617094] __x64_sys_sendto+0x76/0x82
>>> > [ 78.617577] do_syscall_64+0x6b/0xd7
>>> > [ 78.618043] entry_SYSCALL_64_after_hwframe+0x46/0x4e
>>> > [ 78.618667] RIP: 0033:0x7fed3de99a73
>>> > [ 78.619118] Code: 8b 15 a9 83 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b8
>>> > 0f 1f 00 80 3d 71 0b 0d 00 00 41 89 ca 74 14 b8 2c 00 00 00 0f 05 <48> 3d 00 f0
>>> > ff ff 77 75 c3 0f 1f 40 00 55 48 83 ec 30 44 89 4c 24
>>> > [ 78.621291] RSP: 002b:00007ffff6bdf478 EFLAGS: 00000202 ORIG_RAX: 000000000000002c
>>> > [ 78.622205] RAX: ffffffffffffffda RBX: 000055c538159340 RCX: 00007fed3de99a73
>>> > [ 78.623056] RDX: 00000000000007d8 RSI: 000055c53815f3c0 RDI: 0000000000000003
>>> > [ 78.623908] RBP: 000055c53815f3c0 R08: 000055c53815b5c0 R09: 0000000000000010
>>> > [ 78.624765] R10: 0000000000000000 R11: 0000000000000202 R12: 00000000000007d8
>>> > [ 78.625619] R13: 00007ffff6be0b60 R14: 0000001d00000001 R15: 000055c53815c680
>>> > [ 78.626480] </TASK>
>>> > [ 78.626773] ==================================================================
>>> > [ 78.627656] Disabling lock debugging due to kernel taint
>>> > [ 78.628305] BUG: kernel NULL pointer dereference, address: 0000000000000108
>>> > [ 78.629136] #PF: supervisor read access in kernel mode
>>> > [ 78.629766] #PF: error_code(0x0000) - not-present page
>>> > [ 78.630402] PGD 0 P4D 0
>>> > [ 78.630739] Oops: 0000 [#1] PREEMPT DEBUG_PAGEALLOC KASAN
>>> > [ 78.631398] CPU: 0 PID: 493 Comm: ping Tainted: G B 6.9.0-rc2-00697-g489ca863e24f-dirty #11
>>> > [ 78.632548] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
>>> > [ 78.633649] RIP: 0010:iptfs_output_collect+0x263/0x57b
>>> > [ 78.634283] Code: 73 70 0f 84 25 01 00 00 45 39 f4 0f 83 1c 01 00 00 48 8d 7b
>>> > 10 e8 27 37 62 ff 4c 8b 73 10 49 8d be 08 01 00 00 e8 17 37 62 ff <4d> 8b b6 08
>>> > 01 00 00 49 8d be b0 01 00 00 e8 04 37 62 ff 49 8b 86
>>> > [ 78.636444] RSP: 0018:ffffc90000d679c8 EFLAGS: 00010296
>>> > [ 78.637076] RAX: 0000000000000001 RBX: ffff888110ffbc80 RCX: fffffbfff07623ad
>>> > [ 78.637923] RDX: fffffbfff07623ad RSI: fffffbfff07623ad RDI: ffffffff83b11d60
>>> > [ 78.638792] RBP: ffff88810e3a1400 R08: 0000000000000008 R09: 0000000000000001
>>> > [ 78.639645] R10: ffffffff83b11d67 R11: fffffbfff07623ac R12: 00000000000005a2
>>> > [ 78.640498] R13: 0000000000000000 R14: 0000000000000000 R15: ffff88810e9a3401
>>> > [ 78.641359] FS: 00007fed3dbddc40(0000) GS:ffffffff82cb2000(0000) knlGS:0000000000000000
>>> > [ 78.642324] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> > [ 78.643022] CR2: 0000000000000108 CR3: 0000000110e84000 CR4: 0000000000350ef0
>>> > [ 78.643882] Call Trace:
>>> > [ 78.644204] <TASK>
>>> > [ 78.644487] ? __die_body+0x1a/0x56
>>> > [ 78.644929] ? page_fault_oops+0x45f/0x4cd
>>> > [ 78.645441] ? dump_pagetable+0x1db/0x1db
>>> > [ 78.645942] ? vprintk_emit+0x163/0x171
>>> > [ 78.646425] ? iptfs_output_collect+0x263/0x57b
>>> > [ 78.646986] ? _printk+0xb2/0xe1
>>> > [ 78.647401] ? find_first_fitting_seq+0x193/0x193
>>> > [ 78.647982] ? iptfs_output_collect+0x263/0x57b
>>> > [ 78.648541] ? do_user_addr_fault+0x14f/0x56c
>>> > [ 78.649084] ? exc_page_fault+0xa5/0xbe
>>> > [ 78.649566] ? asm_exc_page_fault+0x22/0x30
>>> > [ 78.650100] ? iptfs_output_collect+0x263/0x57b
>>> > [ 78.650660] ? iptfs_output_collect+0x263/0x57b
>>> > [ 78.651221] ip_send_skb+0x25/0x57
>>> > [ 78.651652] raw_sendmsg+0xee8/0x1011
>>> > [ 78.652113] ? native_flush_tlb_one_user+0xd/0xe5
>>> > [ 78.652693] ? raw_hash_sk+0x21b/0x21b
>>> > [ 78.653166] ? kernel_init_pages+0x42/0x51
>>> > [ 78.653683] ? prep_new_page+0x44/0x51
>>> > [ 78.654160] ? get_page_from_freelist+0x72b/0x915
>>> > [ 78.654739] ? signal_pending_state+0x77/0x77
>>> > [ 78.655284] ? __might_resched+0x8a/0x240
>>> > [ 78.655784] ? __might_sleep+0x25/0xa0
>>> > [ 78.656255] ? first_zones_zonelist+0x2c/0x43
>>> > [ 78.656798] ? __rcu_read_lock+0x2d/0x3a
>>> > [ 78.657289] ? __pte_offset_map+0x32/0xa4
>>> > [ 78.657788] ? __might_resched+0x8a/0x240
>>> > [ 78.658291] ? __might_sleep+0x25/0xa0
>>> > [ 78.658763] ? inet_send_prepare+0x54/0x54
>>> > [ 78.659272] ? sock_sendmsg_nosec+0x42/0x6c
>>> > [ 78.659791] sock_sendmsg_nosec+0x42/0x6c
>>> > [ 78.660293] __sys_sendto+0x15d/0x1cc
>>> > [ 78.660755] ? __x64_sys_getpeername+0x44/0x44
>>> > [ 78.661304] ? __handle_mm_fault+0x679/0xae4
>>> > [ 78.661838] ? find_vma+0x6b/0x8b
>>> > [ 78.662272] ? find_vma_intersection+0x8a/0x8a
>>> > [ 78.662828] ? __handle_irq_event_percpu+0x180/0x197
>>> > [ 78.663436] ? handle_mm_fault+0x38/0x154
>>> > [ 78.663935] ? handle_mm_fault+0xeb/0x154
>>> > [ 78.664435] ? preempt_latency_start+0x29/0x34
>>> > [ 78.664987] ? preempt_count_sub+0x14/0xb3
>>> > [ 78.665498] ? up_read+0x4b/0x5c
>>> > [ 78.665911] __x64_sys_sendto+0x76/0x82
>>> > [ 78.666398] do_syscall_64+0x6b/0xd7
>>> > [ 78.666849] entry_SYSCALL_64_after_hwframe+0x46/0x4e
>>> > [ 78.667466] RIP: 0033:0x7fed3de99a73
>>> > [ 78.667918] Code: 8b 15 a9 83 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b8
>>> > 0f 1f 00 80 3d 71 0b 0d 00 00 41 89 ca 74 14 b8 2c 00 00 00 0f 05 <48> 3d 00 f0
>>> > ff ff 77 75 c3 0f 1f 40 00 55 48 83 ec 30 44 89 4c 24
>>> > [ 78.670097] RSP: 002b:00007ffff6bdf478 EFLAGS: 00000202 ORIG_RAX: 000000000000002c
>>> > [ 78.671002] RAX: ffffffffffffffda RBX: 000055c538159340 RCX: 00007fed3de99a73
>>> > [ 78.671858] RDX: 00000000000007d8 RSI: 000055c53815f3c0 RDI: 0000000000000003
>>> > [ 78.672708] RBP: 000055c53815f3c0 R08: 000055c53815b5c0 R09: 0000000000000010
>>> > [ 78.673564] R10: 0000000000000000 R11: 0000000000000202 R12: 00000000000007d8
>>> > [ 78.674430] R13: 00007ffff6be0b60 R14: 0000001d00000001 R15: 000055c53815c680
>>> > [ 78.675287] </TASK>
>>> > [ 78.675580] Modules linked in:
>>> > [ 78.675975] CR2: 0000000000000108
>>> > [ 78.676396] ---[ end trace 0000000000000000 ]---
>>> > [ 78.676966] RIP: 0010:iptfs_output_collect+0x263/0x57b
>>> > [ 78.677596] Code: 73 70 0f 84 25 01 00 00 45 39 f4 0f 83 1c 01 00 00 48 8d 7b
>>> > 10 e8 27 37 62 ff 4c 8b 73 10 49 8d be 08 01 00 00 e8 17 37 62 ff <4d> 8b b6 08
>>> > 01 00 00 49 8d be b0 01 00 00 e8 04 37 62 ff 49 8b 86
>>> > [ 78.679768] RSP: 0018:ffffc90000d679c8 EFLAGS: 00010296
>>> > [ 78.680410] RAX: 0000000000000001 RBX: ffff888110ffbc80 RCX: fffffbfff07623ad
>>> > [ 78.681264] RDX: fffffbfff07623ad RSI: fffffbfff07623ad RDI: ffffffff83b11d60
>>> > [ 78.682136] RBP: ffff88810e3a1400 R08: 0000000000000008 R09: 0000000000000001
>>> > [ 78.682997] R10: ffffffff83b11d67 R11: fffffbfff07623ac R12: 00000000000005a2
>>> > [ 78.683853] R13: 0000000000000000 R14: 0000000000000000 R15: ffff88810e9a3401
>>> > [ 78.684710] FS: 00007fed3dbddc40(0000) GS:ffffffff82cb2000(0000) knlGS:0000000000000000
>>> > [ 78.685675] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> > [ 78.686387] CR2: 0000000000000108 CR3: 0000000110e84000 CR4: 0000000000350ef0
>>> > [ 78.687246] Kernel panic - not syncing: Fatal exception in interrupt
>>> > [ 78.688014] Kernel Offset: disabled
>>> > [ 78.688460] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
>>> >
>>> > ping -s 2000 -n -q -W 1 -c 2 -I 192.0.3.254 192.0.2.254
>>> >
>>> > (gdb) list *iptfs_output_collect+0x263
>>> > 0xffffffff81d5076f is in iptfs_output_collect (./include/net/net_namespace.h:383).
>>> > 378 }
>>> > 379
>>> > 380 static inline struct net *read_pnet(const possible_net_t *pnet)
>>> > 381 {
>>> > 382 #ifdef CONFIG_NET_NS
>>> > 383 return rcu_dereference_protected(pnet->net, true);
>>> > 384 #else
>>> > 385 return &init_net;
>>> > 386 #endif
>>> > 387 }
>>> >
>>> > I suspect the actual crash comes from line 1756 instead:
>>> > (gdb) list *iptfs_output_collect+0x256
>>> > 0xffffffff81d50762 is in iptfs_output_collect (net/xfrm/xfrm_iptfs.c:1756).
>>> > 1751 return 0;
>>> > 1752
>>> > 1753 /* We only send ICMP too big if the user has configured us as
>>> > 1754 * dont-fragment.
>>> > 1755 */
>>> > 1756 XFRM_INC_STATS(dev_net(skb->dev), LINUX_MIB_XFRMOUTERROR);
>>> > 1757
>>> > 1758 if (sk) {
>>> > 1759 xfrm_local_error(skb, pmtu);
>>> > 1760 } else if (ip_hdr(skb)->version == 4) {
>>> >
>>> > Later I ran gdb on iptfs_is_too_big, which is called twice; the second
>>> > call crashes.
>>> > Here is the gdb backtrace, taken just before the crash:
>>> >
>>> > #0 iptfs_is_too_big (pmtu=1442, skb=0xffff88810dbea3c0, sk=0xffff888104d4ed40) at net/xfrm/xfrm_iptfs.c:1756
>>> > #1 iptfs_output_collect (net=<optimized out>, sk=0xffff888104d4ed40, skb=0xffff88810dbea3c0) at net/xfrm/xfrm_iptfs.c:1847
>>> > #2 0xffffffff81c8a3cb in ip_send_skb (net=0xffffffff83e57f20 <init_net>, skb=0xffff88810dbea3c0)
>>> > at net/ipv4/ip_output.c:1492
>>> > #3 0xffffffff81c8a439 in ip_push_pending_frames (sk=sk@entry=0xffff888104d4ed40, fl4=fl4@entry=0xffffc90000e3fb90)
>>> > at net/ipv4/ip_output.c:1512
>>> > #4 0xffffffff81ccf3cf in raw_sendmsg (sk=0xffff888104d4ed40, msg=0xffffc90000e3fd80, len=<optimized out>)
>>> > at net/ipv4/raw.c:654
>>> > #5 0xffffffff81b096ea in sock_sendmsg_nosec (sock=sock@entry=0xffff888115136040, msg=msg@entry=0xffffc90000e3fd80)
>>> > at net/socket.c:730
>>> > #6 0xffffffff81b0c327 in __sock_sendmsg (msg=0xffffc90000e3fd80, sock=0xffff888115136040) at net/socket.c:745
>>> > #7 __sys_sendto (fd=<optimized out>, buff=buff@entry=0x558edefb73c0, len=len@entry=2008, flags=flags@entry=0,
>>> > addr=addr@entry=0x558edefb35c0, addr_len=addr_len@entry=16) at net/socket.c:2191
>>> > #8 0xffffffff81b0c40c in __do_sys_sendto (addr_len=16, addr=0x558edefb35c0, flags=0, len=2008, buff=0x558edefb73c0,
>>> > fd=<optimized out>) at net/socket.c:2203
>>> > #9 __se_sys_sendto (addr_len=16, addr=94072114722240, flags=0, len=2008, buff=94072114738112, fd=<optimized out>)
>>> > at net/socket.c:2199
>>> >
>>> > (gdb) list
>>> > 1751 return 0;
>>> > 1752
>>> > 1753 /* We only send ICMP too big if the user has configured us as
>>> > 1754 * dont-fragment.
>>> > 1755 */
>>> > 1756 XFRM_INC_STATS(dev_net(skb->dev), LINUX_MIB_XFRMOUTERROR);
>>> > 1757
>>> > 1758 if (sk) {
>>> > 1759 xfrm_local_error(skb, pmtu);
>>> > 1760 } else if (ip_hdr(skb)->version == 4) {
>>> >
>>> > -antony
>>>
>>
>> [2. text/plain; .config]...
>
> [[End of PGP Signed Part]]
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 857 bytes --]
* Re: [devel-ipsec] [PATCH ipsec-next v2 0/17] Add IP-TFS mode to xfrm
2024-05-25 5:55 ` Christian Hopps
@ 2024-06-06 15:52 ` Antony Antony
2024-06-07 5:54 ` Christian Hopps
2024-06-11 6:24 ` Antony Antony
1 sibling, 1 reply; 34+ messages in thread
From: Antony Antony @ 2024-06-06 15:52 UTC (permalink / raw)
To: Christian Hopps
Cc: Antony Antony, devel, Steffen Klassert, netdev, Christian Hopps
On Sat, May 25, 2024 at 01:55:01AM -0400, Christian Hopps via Devel wrote:
>
> Found. This was happening b/c the skb was locally generated on the gateway and so had no net_device. Fixed by checking for skb->dev == NULL before incrementing the error stats in the output path.
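
A minimal user-space sketch of the guard described above. The types and the helper name too_big_sketch are hypothetical stand-ins for illustration only, not the kernel definitions, and the actual patch may differ. It only shows the idea: skip the per-netns error counter when the skb has no net_device.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mock-ups of the kernel structures (assumptions, heavily simplified). */
struct net { unsigned long xfrm_out_errors; };
struct net_device { struct net *nd_net; };
struct sk_buff { struct net_device *dev; unsigned int len; };

/* Returns 1 if the packet exceeds pmtu, 0 otherwise.
 * The error counter is only bumped when a device is attached. */
static int too_big_sketch(struct sk_buff *skb, unsigned int pmtu)
{
    assert(skb != NULL);

    if (skb->len <= pmtu)
        return 0;

    /* A locally generated skb may have no net_device yet, so guard the
     * dereference instead of reading through a NULL skb->dev. */
    if (skb->dev != NULL)
        skb->dev->nd_net->xfrm_out_errors++;

    return 1;
}
```

With a NULL dev the helper still reports the packet as too big but leaves the counter untouched, which mirrors the "check before incrementing the error stats" description; whether the real code should fall back to another netns for accounting is left open here.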
Good to hear you found and fixed the bug. I am curious how sending large
packets would work in the DSL gateway case.
Here is possibly another issue.
With ping -f I see errors. After several responses ping returns an error
and no more ESP packets are sent from the sender.
ping -f -c 10000 -I 192.0.1.254 192.0.2.254
PING 192.0.2.254 (192.0.2.254) from 192.0.1.254 : 56(84) bytes of data.
EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE^C
--- 192.0.2.254 ping statistics ---
1428 packets transmitted, 1280 received, 10.3641% packet loss, time 28398ms
rtt min/avg/max/mdev = 2.761/7.770/19.916/0.908 ms, pipe 2, ipg/ewma 19.900/9.335 ms
root@west:/testing/pluto/ikev2-74-iptfs-01$
root@east:/testing/pluto/ikev2-74-iptfs-01$ip x s
src 192.1.2.23 dst 192.1.2.45
proto esp spi 0x55067850 reqid 16389 mode iptfs dir out
flag af-unspec esn
aead rfc4106(gcm(aes)) 0x7aacd5115a84ee5476940c864b3f4a4fa6ca9e3c0590b1b33ae5c925dad38c494c2ba9ac 128
lastused 2024-06-06 17:36:31
oseq-hi 0x0, oseq 0x564
iptfs-opts pkt-size 0 max-queue-size 1048576 drop-time 1000000 reorder-window 3 init-delay 0 dont-frag
src 192.1.2.45 dst 192.1.2.23
proto esp spi 0x54562117 reqid 16389 mode iptfs dir in
flag af-unspec esn
aead rfc4106(gcm(aes)) 0x8505e65031be933d5b5be57c27a618de7f5d5a2c464dbfb62d093dcb411b2c4f75893484 128
lastused 2024-06-06 17:36:31
seq-hi 0x0, seq 0x564
replay-window 128, bitmap-length 4
ffffffff ffffffff ffffffff ffffffff
iptfs-opts pkt-size 0 max-queue-size 1048576 drop-time 1000000 reorder-window 3 init-delay 0
Also, there is a kernel splat on both ends. I am not sure it is related to
your patches. However, I see it around the same time ping returns the error,
and I haven't seen it before. I will try to get more information.
[ 575.515108] ------------[ cut here ]------------
[ 575.515646] refcount_t: underflow; use-after-free.
[ 575.516169] WARNING: CPU: 0 PID: 34 at lib/refcount.c:28 refcount_warn_saturate+0xb7/0xfc
[ 575.516996] Modules linked in:
[ 575.517332] CPU: 0 PID: 34 Comm: rb_consumer Not tainted 6.9.0-rc2-00696-gf549fd6ea775 #28
[ 575.518165] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[ 575.519105] RIP: 0010:refcount_warn_saturate+0xb7/0xfc
[ 575.519635] Code: 4c 30 c3 01 01 e8 e2 af 8b ff 0f 0b eb 5e 80 3d 3b 30 c3 01 00 75 55 48 c7 c7 c0 95 4e 82 c6 05 2b 30 c3 01 01 e8 c2 af 8b ff <0f> 0b eb 3e 80 3d 1a 30 c3 01 00 75 35 48 c7 c7 20 97 4e 82 c6 05
[ 575.521449] RSP: 0018:ffffc90000007c90 EFLAGS: 00010282
[ 575.521992] RAX: 0000000000000000 RBX: 0000000000000003 RCX: 0000000000000000
[ 575.522713] RDX: 0000000000000000 RSI: dffffc0000000000 RDI: fffff52000000f83
[ 575.523448] RBP: ffff888104d4f334 R08: 0000000000000004 R09: 0000000000000001
[ 575.524169] R10: ffffffff82d552ab R11: fffffbfff05aaa55 R12: ffff888104d4f334
[ 575.524891] R13: 1ffff92000000fa7 R14: ffff8881073d0800 R15: 0000000000000000
[ 575.525609] FS: 0000000000000000(0000) GS:ffffffff82cb2000(0000) knlGS:0000000000000000
[ 575.526413] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 575.527013] CR2: 0000558b84637af0 CR3: 000000010dc88000 CR4: 0000000000350ef0
[ 575.527731] Call Trace:
[ 575.528006] <IRQ>
[ 575.528240] ? __warn+0xb8/0x13b
[ 575.528593] ? refcount_warn_saturate+0xb7/0xfc
[ 575.529071] ? report_bug+0xf6/0x159
[ 575.529453] ? refcount_warn_saturate+0xb7/0xfc
[ 575.529931] ? handle_bug+0x3c/0x64
[ 575.530306] ? exc_invalid_op+0x13/0x38
[ 575.530717] ? asm_exc_invalid_op+0x16/0x20
[ 575.531167] ? refcount_warn_saturate+0xb7/0xfc
[ 575.531639] __refcount_sub_and_test.constprop.0+0x38/0x3d
[ 575.532200] sock_wfree+0x13a/0x153
[ 575.532575] skb_release_head_state+0x25/0x6b
[ 575.533032] skb_release_all+0x13/0x3a
[ 575.533429] napi_consume_skb+0x53/0x5e
[ 575.533836] __free_old_xmit+0xcc/0x18d
[ 575.534243] ? virtnet_freeze_down.isra.0+0xb4/0xb4
[ 575.534747] ? check_preempt_wakeup_fair+0x64/0x1f3
[ 575.535269] ? test_ti_thread_flag+0x12/0x1f
[ 575.535718] ? tracing_record_taskinfo_sched_switch+0x25/0xbf
[ 575.536304] free_old_xmit+0x72/0xbe
[ 575.536687] ? __free_old_xmit+0x18d/0x18d
[ 575.537117] ? trace_rcu_this_gp.constprop.0+0x52/0xca
[ 575.537646] ? virtqueue_disable_cb+0x71/0xe9
[ 575.538109] virtnet_poll_tx+0xf6/0x1d8
[ 575.538516] __napi_poll.constprop.0+0x57/0x1a7
[ 575.539000] net_rx_action+0x1cb/0x380
[ 575.539399] ? __napi_poll.constprop.0+0x1a7/0x1a7
[ 575.539896] ? __napi_schedule+0xe/0x17
[ 575.540302] ? vring_interrupt+0xba/0xc4
[ 575.540716] ? __handle_irq_event_percpu+0x180/0x197
[ 575.541229] ? handle_irq_event_percpu+0x3b/0x40
[ 575.541710] __do_softirq+0x135/0x2d7
[ 575.542102] common_interrupt+0x93/0xb8
[ 575.542509] </IRQ>
[ 575.542751] <TASK>
[ 575.543004] asm_common_interrupt+0x22/0x40
[ 575.543443] RIP: 0010:ring_buffer_consume+0xde/0x11e
[ 575.543960] Code: e8 c6 68 18 00 31 c0 4c 89 ef 49 89 45 58 e8 1e fb ff ff 41 0f b6 fc e8 4f ec ff ff 0f ba 64 24 20 09 73 01 fb bf 01 00 00 00 <e8> 4a 4e f4 ff 8b 05 64 17 28 02 85 c0 75 05 e8 63 56 e1 ff 48 85
[ 575.545757] RSP: 0018:ffffc90000237dd0 EFLAGS: 00000283
[ 575.546292] RAX: 0000000080000001 RBX: ffff8881047242e0 RCX: ffffffff811324bc
[ 575.547022] RDX: 0000000000000002 RSI: dffffc0000000000 RDI: 0000000000000001
[ 575.547735] RBP: ffff888104654c38 R08: 0000000000000008 R09: 0000000000000000
[ 575.548448] R10: ffff88810469d457 R11: ffffed10208d3a8a R12: 0000000000000001
[ 575.549162] R13: ffff888104b43200 R14: ffff888104654c48 R15: 0000000000000000
[ 575.549882] ? preempt_count_sub+0x14/0xb3
[ 575.550315] ring_buffer_consumer_thread+0x18e/0x475
[ 575.550832] ? wait_to_die+0x7c/0x7c
[ 575.551237] ? preempt_latency_start+0x29/0x34
[ 575.551702] ? wait_to_die+0x7c/0x7c
[ 575.552083] kthread+0x1ac/0x1bb
[ 575.552434] ? kthread+0xfd/0x1bb
[ 575.552792] ? kthread_complete_and_exit+0x20/0x20
[ 575.553288] ret_from_fork+0x21/0x3c
[ 575.553670] ? kthread_complete_and_exit+0x20/0x20
[ 575.554166] ret_from_fork_asm+0x11/0x20
[ 575.554582] </TASK>
[ 575.554832] ---[ end trace 0000000000000000 ]---
[ 635.894864] ------------[ cut here ]------------
[ 635.895430] refcount_t: saturated; leaking memory.
[ 635.895948] WARNING: CPU: 0 PID: 35 at lib/refcount.c:22 refcount_warn_saturate+0x77/0xfc
[ 635.896768] Modules linked in:
[ 635.897101] CPU: 0 PID: 35 Comm: rb_producer Tainted: G W 6.9.0-rc2-00696-gf549fd6ea775 #28
[ 635.898057] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[ 635.898978] RIP: 0010:refcount_warn_saturate+0x77/0xfc
[ 635.899508] Code: b0 8b ff 0f 0b e9 a2 00 00 00 80 3d 81 30 c3 01 00 0f 85 95 00 00 00 48 c7 c7 60 96 4e 82 c6 05 6d 30 c3 01 01 e8 02 b0 8b ff <0f> 0b eb 7e 80 3d 5c 30 c3 01 00 75 75 48 c7 c7 c0 96 4e 82 c6 05
[ 635.901316] RSP: 0018:ffffc90000007220 EFLAGS: 00010282
[ 635.901854] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
[ 635.902567] RDX: 0000000000000000 RSI: dffffc0000000000 RDI: fffff52000000e35
[ 635.903284] RBP: ffff888104d4f334 R08: 0000000000000004 R09: 0000000000000001
[ 635.904010] R10: ffffffff82d552ab R11: fffffbfff05aaa55 R12: 1ffff92000000e50
[ 635.904728] R13: ffffc90000007468 R14: 00000000bfffffff R15: ffff888104ff48c0
[ 635.905441] FS: 0000000000000000(0000) GS:ffffffff82cb2000(0000) knlGS:0000000000000000
[ 635.906246] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 635.906833] CR2: 0000558b84610cb0 CR3: 000000010dc88000 CR4: 0000000000350ef0
[ 635.907554] Call Trace:
[ 635.907827] <IRQ>
[ 635.908067] ? __warn+0xb8/0x13b
[ 635.908416] ? refcount_warn_saturate+0x77/0xfc
[ 635.908885] ? report_bug+0xf6/0x159
[ 635.909264] ? refcount_warn_saturate+0x77/0xfc
[ 635.909733] ? handle_bug+0x3c/0x64
[ 635.910105] ? exc_invalid_op+0x13/0x38
[ 635.910508] ? asm_exc_invalid_op+0x16/0x20
[ 635.910951] ? refcount_warn_saturate+0x77/0xfc
[ 635.911419] sock_alloc_send_pskb+0x374/0x3d7
[ 635.911888] ? sock_wmalloc+0x73/0x73
[ 635.912280] ? xfrm_tmpl_resolve+0x4d1/0x4d1
[ 635.912730] __ip_append_data+0x705/0x12f8
[ 635.913159] ? icmp_unreach+0x2f7/0x2f7
[ 635.913572] ? preempt_count_sub+0x14/0xb3
[ 635.914002] ? skb_zcopy_set+0xb9/0xb9
[ 635.914400] ? xfrm_lookup_with_ifid+0x68a/0x768
[ 635.914881] ? sock_flag+0x15/0x20
[ 635.915255] ip_append_data+0xc3/0xd6
[ 635.915646] ? icmp_unreach+0x2f7/0x2f7
[ 635.916059] icmp_push_reply+0x61/0x1b6
[ 635.916462] icmp_reply+0x3a3/0x410
[ 635.916844] ? __icmp_send+0x77a/0x77a
[ 635.917241] ? fib_validate_source+0x128/0x1ca
[ 635.917703] ? rt_cache_valid+0x70/0x8d
[ 635.918109] ? do_csum+0xe2/0x13f
[ 635.918468] icmp_echo.part.0+0xf5/0x130
[ 635.918881] ? icmp_timestamp+0x19d/0x19d
[ 635.919310] ? __skb_checksum+0x317/0x317
[ 635.919733] ? csum_block_add_ext+0x10/0x10
[ 635.920169] ? reqsk_fastopen_remove+0x249/0x249
[ 635.920645] ? refcount_read+0x16/0x1a
[ 635.921039] icmp_echo+0x58/0x5d
[ 635.921386] icmp_rcv+0x482/0x4ec
[ 635.921743] ip_protocol_deliver_rcu+0xd7/0x1b2
[ 635.922211] ? ip_protocol_deliver_rcu+0x1b2/0x1b2
[ 635.922703] ip_local_deliver_finish+0x110/0x120
[ 635.923186] ? ip_protocol_deliver_rcu+0x1b2/0x1b2
[ 635.923678] NF_HOOK.constprop.0+0xf8/0x138
[ 635.924116] ? ip_sublist_rcv_finish+0x68/0x68
[ 635.924576] ? __asan_load8+0x74/0x74
[ 635.924964] ? do_csum+0xe2/0x13f
[ 635.925320] ? __list_del_entry_valid_or_report+0xc8/0xed
[ 635.925868] ip_sublist_rcv_finish+0x53/0x68
[ 635.926313] ip_sublist_rcv+0x24f/0x29b
[ 635.926716] ? ip_rcv_finish_core.isra.0+0x74d/0x74d
[ 635.927230] ? skb_orphan_frags_rx.constprop.0+0x3a/0x67
[ 635.927770] ? do_csum+0xe2/0x13f
[ 635.928136] ? __asan_memset+0x21/0x3f
[ 635.928532] ? ip_rcv_core+0x4a6/0x4f7
[ 635.928928] ip_list_rcv+0x18a/0x1c2
[ 635.929308] ? ip_rcv+0x57/0x57
[ 635.929653] ? __list_add_valid_or_report+0x66/0xad
[ 635.930155] ? __netif_receive_skb_list_ptype+0x3a/0xca
[ 635.930687] __netif_receive_skb_list_core+0x17b/0x1c2
[ 635.931220] ? __netif_receive_skb_core.constprop.0+0xb24/0xb24
[ 635.931820] ? gro_normal_list+0x16/0x65
[ 635.932236] ? __list_add_valid_or_report+0x66/0xad
[ 635.932738] netif_receive_skb_list_internal+0x2bd/0x316
[ 635.933279] ? process_backlog+0x187/0x187
[ 635.933707] ? virtnet_poll+0x4a6/0x6cf
[ 635.934113] ? virtnet_set_ringparam+0x595/0x595
[ 635.934591] gro_normal_list+0x2e/0x65
[ 635.934994] napi_complete_done+0x13b/0x246
[ 635.935431] ? gro_normal_list+0x65/0x65
[ 635.935853] ? gro_normal_one+0x9e/0xef
[ 635.936258] gro_cell_poll+0x42/0x4b
[ 635.936639] __napi_poll.constprop.0+0x57/0x1a7
[ 635.937109] net_rx_action+0x1cb/0x380
[ 635.937511] ? __napi_poll.constprop.0+0x1a7/0x1a7
[ 635.938003] ? internal_add_timer+0xbf/0xbf
[ 635.938439] ? vring_interrupt+0xba/0xc4
[ 635.938855] ? __handle_irq_event_percpu+0x180/0x197
[ 635.939371] ? handle_irq_event_percpu+0x3b/0x40
[ 635.939850] __do_softirq+0x135/0x2d7
[ 635.940239] common_interrupt+0x93/0xb8
[ 635.940643] </IRQ>
[ 635.940883] <TASK>
[ 635.941123] asm_common_interrupt+0x22/0x40
[ 635.941558] RIP: 0010:__asan_store8+0x0/0x77
[ 635.942001] Code: 28 38 d0 eb 16 ba ff ff 37 00 48 c1 e8 03 48 c1 e2 2a 8a 04 10 84 c0 74 10 3c 07 7f 0c 31 d2 be 08 00 00 00 e9 5b f4 ff ff c3 <48> 8b 0c 24 48 83 ff f8 73 5d 48 b8 ff ff ff ff ff 7f ff ff 48 39
[ 635.943797] RSP: 0018:ffffc90000247cf0 EFLAGS: 00000246
[ 635.944343] RAX: ffffed1020968601 RBX: ffffc90000247df0 RCX: ffffed102096865e
[ 635.945049] RDX: ffffed102096865e RSI: ffffed102096865e RDI: ffff888104b432e8
[ 635.945755] RBP: ffff888104b43200 R08: 0000000000000008 R09: 0000000000000001
[ 635.946461] R10: ffff888104b432ef R11: ffffed102096865d R12: 0000000000000000
[ 635.947173] R13: 00000094124d099d R14: 0000000000000fd0 R15: ffff888104679140
[ 635.947893] __rb_reserve_next.constprop.0+0x1e1/0x7e3
[ 635.948419] ring_buffer_lock_reserve+0x26a/0x688
[ 635.948903] ? __rb_reserve_next.constprop.0+0x7e3/0x7e3
[ 635.949444] ring_buffer_producer_thread+0x9d/0x524
[ 635.949950] ? ring_buffer_consumer_thread+0x475/0x475
[ 635.950473] kthread+0x1ac/0x1bb
[ 635.950822] ? kthread+0xfd/0x1bb
[ 635.951184] ? kthread_complete_and_exit+0x20/0x20
[ 635.951677] ret_from_fork+0x21/0x3c
[ 635.952061] ? kthread_complete_and_exit+0x20/0x20
[ 635.952553] ret_from_fork_asm+0x11/0x20
[ 635.952966] </TASK>
[ 635.953213] ---[ end trace 0000000000000000 ]---
[ 829.757162] systemd-fstab-generator[2633]: Failed to create unit file
'/run/systemd/generator/home.mount', as it already exists. Duplicate entry
in '/etc/fstab'?
>
> Thanks!
> Chris.
>
> Christian Hopps <chopps@chopps.org> writes:
>
> > [[PGP Signed Part:Good signature from 2E1D830ED7B83025 Christian Hopps <chopps@gmail.com> (trust ultimate) created at 2024-05-24T08:08:58-0400 using RSA]]
> >
> > This is very helpful thanks.
> >
> > I think the tunnel endpoints are east/west 192.1.2.{23,45}, but I can't determine the north/east endpoints b/c they don't appear connected. :)
> >
> > Are there any other iptfs options? The code you highlight mentions the `dont-frag` option, but I wonder if you actually have that enabled?
> >
> > It also seems like you are pinging and forcing the source IP of a red interface
> > on the tunnel endpoint gateway directly (so that it doesn't try and use the
> > black interface I would guess) is that correct?
> >
> > Thanks!
> > Chris.
> >
> > P.S. the addresses on the NIC host in the picture seem reversed, but this doesn't seem relevant to this test :)
> >
> > Antony Antony <antony@phenome.org> writes:
> >
> > > On Thu, May 23, 2024 at 07:04:58PM -0400, Christian Hopps wrote:
> > > >
> > > > Could you let me know some more details about this test? What is your interface config / topology? I tried to guess given the ping command but it's not replicating for me.
> > >
> > > I am using the Libreswan testing topology, but I am running the tests manually.
> > > Yesterday the tunnel was between north and east. This morning I quickly tried
> > > west-east, just two VMs, and I see the same issue there too.
> > >
> > > https://libreswan.org/wiki/images/f/f1/Testnet-202102.png
> > >
> > > I am using CONFIG_ESP_OFFLOAD. That is the only thing standing out. Besides that,
> > > it is just a 1500 MTU tunnel using qemu/kvm and a tap network.
> > >
> > > attached is my kernel .config
> > >
> > > > PS, I've changed the subject and In-reply-to to be based on the corrected
> > > > cover-letter I sent, I initially sent the cover letter with the wrong
> > > > subject. :(
> > >
> > > I noticed a second cover letter. However, it was not showing as related to
> > > the patch set correctly. It showed up as a different thread. That is why I
> > > replied to the initial one.
> > >
> > > -antony
> > > >
> > > >
> > > > Antony Antony <antony@phenome.org> writes:
> > > >
> > > > > Hi Chris,
> > > > >
> > > > > On Mon, May 20, 2024 at 05:42:38PM -0400, Christian Hopps via Devel wrote:
> > > > > > From: Christian Hopps <chopps@labn.net>
> > > > > > - iptfs: remove some BUG_ON() assertions questioned in review.
> > > >
> > > > ...
> > > >
> > > > > I ran a couple of tests and hit a KASAN BUG.
> > > > >
> > > > > I was sending large pings while the MTU was 1500.
> > > > >
> > > > > north login: shed systemd-user-sessions.service - Permit User Sessions.
> > > > > north login: [ 78.594770] ==================================================================
> > > > > [ 78.595825] BUG: KASAN: null-ptr-deref in iptfs_output_collect+0x263/0x57b
> > > > > [ 78.596658] Read of size 8 at addr 0000000000000108 by task ping/493
> > > > > > [ 78.597435]
> > > > > [ 78.597651] CPU: 0 PID: 493 Comm: ping Not tainted 6.9.0-rc2-00697-g489ca863e24f-dirty #11
> > > > > [ 78.598645] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
> > > > > > [ 78.599747] Call Trace:
> > > > > > [ 78.600070] <TASK>
> > > > > > [ 78.600354] dump_stack_lvl+0x2a/0x3b
> > > > > > [ 78.600817] kasan_report+0x84/0xa6
> > > > > > [ 78.601262] ? iptfs_output_collect+0x263/0x57b
> > > > > > [ 78.601825] iptfs_output_collect+0x263/0x57b
> > > > > > [ 78.602374] ip_send_skb+0x25/0x57
> > > > > > [ 78.602807] raw_sendmsg+0xee8/0x1011
> > > > > > [ 78.603269] ? native_flush_tlb_one_user+0xd/0xe5
> > > > > [ 78.603850] ? raw_hash_sk+0x21b/0x21b
> > > > > [ 78.604331] ? kernel_init_pages+0x42/0x51
> > > > > > [ 78.604845] ? prep_new_page+0x44/0x51
> > > > > > [ 78.605318] ? get_page_from_freelist+0x72b/0x915
> > > > > > [ 78.605903] ? signal_pending_state+0x77/0x77
> > > > > > [ 78.606462] ? __might_resched+0x8a/0x240
> > > > > [ 78.606966] ? __might_sleep+0x25/0xa0
> > > > > [ 78.607440] ? first_zones_zonelist+0x2c/0x43
> > > > > [ 78.607985] ? __rcu_read_lock+0x2d/0x3a
> > > > > [ 78.608479] ? __pte_offset_map+0x32/0xa4
> > > > > [ 78.608979] ? __might_resched+0x8a/0x240
> > > > > [ 78.609478] ? __might_sleep+0x25/0xa0
> > > > > [ 78.609949] ? inet_send_prepare+0x54/0x54
> > > > > [ 78.610464] ? sock_sendmsg_nosec+0x42/0x6c
> > > > > [ 78.610984] sock_sendmsg_nosec+0x42/0x6c
> > > > > [ 78.611485] __sys_sendto+0x15d/0x1cc
> > > > > [ 78.611947] ? __x64_sys_getpeername+0x44/0x44
> > > > > [ 78.612498] ? __handle_mm_fault+0x679/0xae4
> > > > > [ 78.613033] ? find_vma+0x6b/0x8b
> > > > > [ 78.613457] ? find_vma_intersection+0x8a/0x8a
> > > > > [ 78.614006] ? __handle_irq_event_percpu+0x180/0x197
> > > > > [ 78.614617] ? handle_mm_fault+0x38/0x154
> > > > > [ 78.615114] ? handle_mm_fault+0xeb/0x154
> > > > > [ 78.615620] ? preempt_latency_start+0x29/0x34
> > > > > [ 78.616169] ? preempt_count_sub+0x14/0xb3
> > > > > [ 78.616678] ? up_read+0x4b/0x5c
> > > > > [ 78.617094] __x64_sys_sendto+0x76/0x82
> > > > > [ 78.617577] do_syscall_64+0x6b/0xd7
> > > > > [ 78.618043] entry_SYSCALL_64_after_hwframe+0x46/0x4e
> > > > > [ 78.618667] RIP: 0033:0x7fed3de99a73
> > > > > [ 78.619118] Code: 8b 15 a9 83 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b8
> > > > > 0f 1f 00 80 3d 71 0b 0d 00 00 41 89 ca 74 14 b8 2c 00 00 00 0f 05 <48> 3d 00 f0
> > > > > ff ff 77 75 c3 0f 1f 40 00 55 48 83 ec 30 44 89 4c 24
> > > > > [ 78.621291] RSP: 002b:00007ffff6bdf478 EFLAGS: 00000202 ORIG_RAX: 000000000000002c
> > > > > [ 78.622205] RAX: ffffffffffffffda RBX: 000055c538159340 RCX: 00007fed3de99a73
> > > > > [ 78.623056] RDX: 00000000000007d8 RSI: 000055c53815f3c0 RDI: 0000000000000003
> > > > > [ 78.623908] RBP: 000055c53815f3c0 R08: 000055c53815b5c0 R09: 0000000000000010
> > > > > [ 78.624765] R10: 0000000000000000 R11: 0000000000000202 R12: 00000000000007d8
> > > > > [ 78.625619] R13: 00007ffff6be0b60 R14: 0000001d00000001 R15: 000055c53815c680
> > > > > [ 78.626480] </TASK>
> > > > > [ 78.626773] ==================================================================
> > > > > [ 78.627656] Disabling lock debugging due to kernel taint
> > > > > [ 78.628305] BUG: kernel NULL pointer dereference, address: 0000000000000108
> > > > > [ 78.629136] #PF: supervisor read access in kernel mode
> > > > > [ 78.629766] #PF: error_code(0x0000) - not-present page
> > > > > [ 78.630402] PGD 0 P4D 0
> > > > > [ 78.630739] Oops: 0000 [#1] PREEMPT DEBUG_PAGEALLOC KASAN
> > > > > [ 78.631398] CPU: 0 PID: 493 Comm: ping Tainted: G B 6.9.0-rc2-00697-g489ca863e24f-dirty #11
> > > > > [ 78.632548] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
> > > > > [ 78.633649] RIP: 0010:iptfs_output_collect+0x263/0x57b
> > > > > [ 78.634283] Code: 73 70 0f 84 25 01 00 00 45 39 f4 0f 83 1c 01 00 00 48 8d 7b
> > > > > 10 e8 27 37 62 ff 4c 8b 73 10 49 8d be 08 01 00 00 e8 17 37 62 ff <4d> 8b b6 08
> > > > > 01 00 00 49 8d be b0 01 00 00 e8 04 37 62 ff 49 8b 86
> > > > > [ 78.636444] RSP: 0018:ffffc90000d679c8 EFLAGS: 00010296
> > > > > [ 78.637076] RAX: 0000000000000001 RBX: ffff888110ffbc80 RCX: fffffbfff07623ad
> > > > > [ 78.637923] RDX: fffffbfff07623ad RSI: fffffbfff07623ad RDI: ffffffff83b11d60
> > > > > [ 78.638792] RBP: ffff88810e3a1400 R08: 0000000000000008 R09: 0000000000000001
> > > > > [ 78.639645] R10: ffffffff83b11d67 R11: fffffbfff07623ac R12: 00000000000005a2
> > > > > [ 78.640498] R13: 0000000000000000 R14: 0000000000000000 R15: ffff88810e9a3401
> > > > > [ 78.641359] FS: 00007fed3dbddc40(0000) GS:ffffffff82cb2000(0000) knlGS:0000000000000000
> > > > > [ 78.642324] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > > [ 78.643022] CR2: 0000000000000108 CR3: 0000000110e84000 CR4: 0000000000350ef0
> > > > > [ 78.643882] Call Trace:
> > > > > [ 78.644204] <TASK>
> > > > > [ 78.644487] ? __die_body+0x1a/0x56
> > > > > [ 78.644929] ? page_fault_oops+0x45f/0x4cd
> > > > > [ 78.645441] ? dump_pagetable+0x1db/0x1db
> > > > > [ 78.645942] ? vprintk_emit+0x163/0x171
> > > > > [ 78.646425] ? iptfs_output_collect+0x263/0x57b
> > > > > [ 78.646986] ? _printk+0xb2/0xe1
> > > > > [ 78.647401] ? find_first_fitting_seq+0x193/0x193
> > > > > [ 78.647982] ? iptfs_output_collect+0x263/0x57b
> > > > > [ 78.648541] ? do_user_addr_fault+0x14f/0x56c
> > > > > [ 78.649084] ? exc_page_fault+0xa5/0xbe
> > > > > [ 78.649566] ? asm_exc_page_fault+0x22/0x30
> > > > > [ 78.650100] ? iptfs_output_collect+0x263/0x57b
> > > > > [ 78.650660] ? iptfs_output_collect+0x263/0x57b
> > > > > [ 78.651221] ip_send_skb+0x25/0x57
> > > > > [ 78.651652] raw_sendmsg+0xee8/0x1011
> > > > > [ 78.652113] ? native_flush_tlb_one_user+0xd/0xe5
> > > > > [ 78.652693] ? raw_hash_sk+0x21b/0x21b
> > > > > [ 78.653166] ? kernel_init_pages+0x42/0x51
> > > > > [ 78.653683] ? prep_new_page+0x44/0x51
> > > > > [ 78.654160] ? get_page_from_freelist+0x72b/0x915
> > > > > [ 78.654739] ? signal_pending_state+0x77/0x77
> > > > > [ 78.655284] ? __might_resched+0x8a/0x240
> > > > > [ 78.655784] ? __might_sleep+0x25/0xa0
> > > > > [ 78.656255] ? first_zones_zonelist+0x2c/0x43
> > > > > [ 78.656798] ? __rcu_read_lock+0x2d/0x3a
> > > > > [ 78.657289] ? __pte_offset_map+0x32/0xa4
> > > > > [ 78.657788] ? __might_resched+0x8a/0x240
> > > > > [ 78.658291] ? __might_sleep+0x25/0xa0
> > > > > [ 78.658763] ? inet_send_prepare+0x54/0x54
> > > > > [ 78.659272] ? sock_sendmsg_nosec+0x42/0x6c
> > > > > [ 78.659791] sock_sendmsg_nosec+0x42/0x6c
> > > > > [ 78.660293] __sys_sendto+0x15d/0x1cc
> > > > > [ 78.660755] ? __x64_sys_getpeername+0x44/0x44
> > > > > [ 78.661304] ? __handle_mm_fault+0x679/0xae4
> > > > > [ 78.661838] ? find_vma+0x6b/0x8b
> > > > > [ 78.662272] ? find_vma_intersection+0x8a/0x8a
> > > > > [ 78.662828] ? __handle_irq_event_percpu+0x180/0x197
> > > > > [ 78.663436] ? handle_mm_fault+0x38/0x154
> > > > > [ 78.663935] ? handle_mm_fault+0xeb/0x154
> > > > > [ 78.664435] ? preempt_latency_start+0x29/0x34
> > > > > [ 78.664987] ? preempt_count_sub+0x14/0xb3
> > > > > [ 78.665498] ? up_read+0x4b/0x5c
> > > > > [ 78.665911] __x64_sys_sendto+0x76/0x82
> > > > > [ 78.666398] do_syscall_64+0x6b/0xd7
> > > > > [ 78.666849] entry_SYSCALL_64_after_hwframe+0x46/0x4e
> > > > > [ 78.667466] RIP: 0033:0x7fed3de99a73
> > > > > [ 78.667918] Code: 8b 15 a9 83 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b8
> > > > > 0f 1f 00 80 3d 71 0b 0d 00 00 41 89 ca 74 14 b8 2c 00 00 00 0f 05 <48> 3d 00 f0
> > > > > ff ff 77 75 c3 0f 1f 40 00 55 48 83 ec 30 44 89 4c 24
> > > > > [ 78.670097] RSP: 002b:00007ffff6bdf478 EFLAGS: 00000202 ORIG_RAX: 000000000000002c
> > > > > [ 78.671002] RAX: ffffffffffffffda RBX: 000055c538159340 RCX: 00007fed3de99a73
> > > > > [ 78.671858] RDX: 00000000000007d8 RSI: 000055c53815f3c0 RDI: 0000000000000003
> > > > > [ 78.672708] RBP: 000055c53815f3c0 R08: 000055c53815b5c0 R09: 0000000000000010
> > > > > [ 78.673564] R10: 0000000000000000 R11: 0000000000000202 R12: 00000000000007d8
> > > > > [ 78.674430] R13: 00007ffff6be0b60 R14: 0000001d00000001 R15: 000055c53815c680
> > > > > [ 78.675287] </TASK>
> > > > > [ 78.675580] Modules linked in:
> > > > > [ 78.675975] CR2: 0000000000000108
> > > > > [ 78.676396] ---[ end trace 0000000000000000 ]---
> > > > > [ 78.676966] RIP: 0010:iptfs_output_collect+0x263/0x57b
> > > > > [ 78.677596] Code: 73 70 0f 84 25 01 00 00 45 39 f4 0f 83 1c 01 00 00 48 8d 7b
> > > > > 10 e8 27 37 62 ff 4c 8b 73 10 49 8d be 08 01 00 00 e8 17 37 62 ff <4d> 8b b6 08
> > > > > 01 00 00 49 8d be b0 01 00 00 e8 04 37 62 ff 49 8b 86
> > > > > [ 78.679768] RSP: 0018:ffffc90000d679c8 EFLAGS: 00010296
> > > > > [ 78.680410] RAX: 0000000000000001 RBX: ffff888110ffbc80 RCX: fffffbfff07623ad
> > > > > [ 78.681264] RDX: fffffbfff07623ad RSI: fffffbfff07623ad RDI: ffffffff83b11d60
> > > > > [ 78.682136] RBP: ffff88810e3a1400 R08: 0000000000000008 R09: 0000000000000001
> > > > > [ 78.682997] R10: ffffffff83b11d67 R11: fffffbfff07623ac R12: 00000000000005a2
> > > > > [ 78.683853] R13: 0000000000000000 R14: 0000000000000000 R15: ffff88810e9a3401
> > > > > [ 78.684710] FS: 00007fed3dbddc40(0000) GS:ffffffff82cb2000(0000) knlGS:0000000000000000
> > > > > [ 78.685675] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > > [ 78.686387] CR2: 0000000000000108 CR3: 0000000110e84000 CR4: 0000000000350ef0
> > > > > [ 78.687246] Kernel panic - not syncing: Fatal exception in interrupt
> > > > > [ 78.688014] Kernel Offset: disabled
> > > > > [ 78.688460] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
> > > > >
> > > > > ping -s 2000 -n -q -W 1 -c 2 -I 192.0.3.254 192.0.2.254
> > > > >
> > > > > (gdb) list *iptfs_output_collect+0x263
> > > > > 0xffffffff81d5076f is in iptfs_output_collect (./include/net/net_namespace.h:383).
> > > > > 378 }
> > > > > 379
> > > > > 380 static inline struct net *read_pnet(const possible_net_t *pnet)
> > > > > 381 {
> > > > > 382 #ifdef CONFIG_NET_NS
> > > > > 383 return rcu_dereference_protected(pnet->net, true);
> > > > > 384 #else
> > > > > 385 return &init_net;
> > > > > 386 #endif
> > > > > 387 }
> > > > >
> > > > > I suspect the actual crash is at line 1756 instead:
> > > > > (gdb) list *iptfs_output_collect+0x256
> > > > > 0xffffffff81d50762 is in iptfs_output_collect (net/xfrm/xfrm_iptfs.c:1756).
> > > > > 1751 return 0;
> > > > > 1752
> > > > > 1753 /* We only send ICMP too big if the user has configured us as
> > > > > 1754 * dont-fragment.
> > > > > 1755 */
> > > > > 1756 XFRM_INC_STATS(dev_net(skb->dev), LINUX_MIB_XFRMOUTERROR);
> > > > > 1757
> > > > > 1758 if (sk) {
> > > > > 1759 xfrm_local_error(skb, pmtu);
> > > > > 1760 } else if (ip_hdr(skb)->version == 4) {
> > > > >
> > > > > Later I ran iptfs_is_too_big under gdb. It is called twice and crashes the
> > > > > second time.
> > > > > Here is the gdb backtrace just before the crash:
> > > > >
> > > > > #0 iptfs_is_too_big (pmtu=1442, skb=0xffff88810dbea3c0, sk=0xffff888104d4ed40) at net/xfrm/xfrm_iptfs.c:1756
> > > > > #1 iptfs_output_collect (net=<optimized out>, sk=0xffff888104d4ed40, skb=0xffff88810dbea3c0) at net/xfrm/xfrm_iptfs.c:1847
> > > > > #2 0xffffffff81c8a3cb in ip_send_skb (net=0xffffffff83e57f20 <init_net>, skb=0xffff88810dbea3c0)
> > > > > at net/ipv4/ip_output.c:1492
> > > > > #3 0xffffffff81c8a439 in ip_push_pending_frames (sk=sk@entry=0xffff888104d4ed40, fl4=fl4@entry=0xffffc90000e3fb90)
> > > > > at net/ipv4/ip_output.c:1512
> > > > > #4 0xffffffff81ccf3cf in raw_sendmsg (sk=0xffff888104d4ed40, msg=0xffffc90000e3fd80, len=<optimized out>)
> > > > > at net/ipv4/raw.c:654
> > > > > #5 0xffffffff81b096ea in sock_sendmsg_nosec (sock=sock@entry=0xffff888115136040, msg=msg@entry=0xffffc90000e3fd80)
> > > > > at net/socket.c:730
> > > > > #6 0xffffffff81b0c327 in __sock_sendmsg (msg=0xffffc90000e3fd80, sock=0xffff888115136040) at net/socket.c:745
> > > > > #7 __sys_sendto (fd=<optimized out>, buff=buff@entry=0x558edefb73c0, len=len@entry=2008, flags=flags@entry=0,
> > > > > addr=addr@entry=0x558edefb35c0, addr_len=addr_len@entry=16) at net/socket.c:2191
> > > > > #8 0xffffffff81b0c40c in __do_sys_sendto (addr_len=16, addr=0x558edefb35c0, flags=0, len=2008, buff=0x558edefb73c0,
> > > > > fd=<optimized out>) at net/socket.c:2203
> > > > > #9 __se_sys_sendto (addr_len=16, addr=94072114722240, flags=0, len=2008, buff=94072114738112, fd=<optimized out>)
> > > > > at net/socket.c:2199
> > > > >
> > > > > (gdb) list
> > > > > 1751 return 0;
> > > > > 1752
> > > > > 1753 /* We only send ICMP too big if the user has configured us as
> > > > > 1754 * dont-fragment.
> > > > > 1755 */
> > > > > 1756 XFRM_INC_STATS(dev_net(skb->dev), LINUX_MIB_XFRMOUTERROR);
> > > > > 1757
> > > > > 1758 if (sk) {
> > > > > 1759 xfrm_local_error(skb, pmtu);
> > > > > 1760 } else if (ip_hdr(skb)->version == 4) {
> > > > >
> > > > > -antony
> > > >
> > >
> > > [2. text/plain; .config]...
> >
> > [[End of PGP Signed Part]]
>
^ permalink raw reply [flat|nested] 34+ messages in thread* Re: [devel-ipsec] [PATCH ipsec-next v2 0/17] Add IP-TFS mode to xfrm
2024-06-06 15:52 ` [devel-ipsec] " Antony Antony
@ 2024-06-07 5:54 ` Christian Hopps
0 siblings, 0 replies; 34+ messages in thread
From: Christian Hopps @ 2024-06-07 5:54 UTC (permalink / raw)
To: Antony Antony
Cc: Christian Hopps, devel, Steffen Klassert, netdev, Christian Hopps
[-- Attachment #1: Type: text/plain, Size: 33359 bytes --]
For me the flood works with no packet loss. BTW, you mentioned in the coffee hour meeting that you did not have dont-frag set, but I noticed that it shows as set in one of your iproute2 outputs. Did you set it for this test, or is it perhaps getting set by strongswan inadvertently?
In any case here's what I saw:
#     192.168.0.0/24  fd00::/64
# --+-------------------+------ mgmt0 ------+-------------------+---
#   | .1                | .2                | .3                | .4
# +----+              +----+ ===TUNNEL=== +----+              +----+
# | h1 | --- net0 --- | r1 | --- net1 --- | r2 | --- net2 --- | h2 |
# +----+ .1        .2 +----+ .2        .3 +----+ .3        .4 +----+
#        10.0.0.0/24         10.0.1.0/24         10.0.2.0/24
[on r1 pinging h2 using r1 net0 interface]
sh-5.2# ping -f -c 10000 -I 10.0.0.2 10.0.2.4
PING 10.0.2.4 (10.0.2.4) from 10.0.0.2 : 56(84) bytes of data.
--- 10.0.2.4 ping statistics ---
10000 packets transmitted, 10000 received, 0% packet loss, time 3452ms
rtt min/avg/max/mdev = 0.251/0.284/2.463/0.055 ms, ipg/ewma 0.345/0.287 ms
[on r1]
sh-5.2# ip x s l
src 10.0.1.3 dst 10.0.1.2
    proto esp spi 0x00000bbb reqid 9 mode iptfs
    replay-window 0
    aead rfc4106(gcm(aes)) 0x4a506a794f574265564551694d6537681a2b1a2b 128
    lastused 2024-06-07 05:53:35
    anti-replay context: seq 0x0, oseq 0x0, bitmap 0x00000000
    if_id 0x37
    dir in
    iptfs-opts drop-time 1000000 reorder-window 3
    sel src 0.0.0.0/0 dst 0.0.0.0/0
src 10.0.1.2 dst 10.0.1.3
    proto esp spi 0x00000aaa reqid 8 mode iptfs
    replay-window 0
    aead rfc4106(gcm(aes)) 0x4a506a794f574265564551694d6537681a2b1a2b 128
    lastused 2024-06-07 05:53:35
    anti-replay context: seq 0x0, oseq 0x4e27, bitmap 0x00000000
    if_id 0x37
    dir out
    iptfs-opts init-delay 0 max-queue-size 1048576 pkt-size 0
    sel src 0.0.0.0/0 dst 0.0.0.0/0
sh-5.2#
Thanks,
Chris.
Antony Antony <antony@phenome.org> writes:
> On Sat, May 25, 2024 at 01:55:01AM -0400, Christian Hopps via Devel wrote:
>>
>> Found. This was happening b/c the skb was locally generated on the gateway and so had no net_device. Fixed by checking for skb->dev == NULL before incrementing the error stats in the output path.
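A minimal user-space sketch of the kind of guard described above, under stated assumptions: `struct net`, `struct net_device` and `struct sk_buff` here are simplified, hypothetical stand-ins rather than the kernel definitions, and `count_out_error` is an invented helper name. It only illustrates skipping the per-namespace error counter when the skb has no device attached; the actual patch may look different.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct net { unsigned long xfrm_out_errors; };
struct net_device { struct net *nd_net; };
struct sk_buff { struct net_device *dev; };

/* Count an output error only when the skb has a device to take the
 * namespace from. Returns 1 if the counter was bumped, 0 if skipped.
 */
static int count_out_error(const struct sk_buff *skb)
{
	if (!skb->dev)
		return 0;	/* locally generated skb: no net_device yet */
	skb->dev->nd_net->xfrm_out_errors++;
	return 1;
}
```

With a guard like this, an skb built by a local raw socket (as in the ping reproducer) simply skips the statistic instead of dereferencing a NULL device pointer.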
>
>
> Good to hear you found the bug and fixed it. I am curious how large packets
> would be sent in the DSL gateway case.
>
> Here is possibly another issue.
>
> With ping -f I see errors. After several responses ping returns errors
> and no more ESP is sent from the sender.
>
> ping -f -c 10000 -I 192.0.1.254 192.0.2.254
> PING 192.0.2.254 (192.0.2.254) from 192.0.1.254 : 56(84) bytes of data.
> EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE^C
> --- 192.0.2.254 ping statistics ---
> 1428 packets transmitted, 1280 received, 10.3641% packet loss, time 28398ms
> rtt min/avg/max/mdev = 2.761/7.770/19.916/0.908 ms, pipe 2, ipg/ewma 19.900/9.335 ms
> root@west:/testing/pluto/ikev2-74-iptfs-01$
>
> root@east:/testing/pluto/ikev2-74-iptfs-01$ip x s
> src 192.1.2.23 dst 192.1.2.45
>     proto esp spi 0x55067850 reqid 16389 mode iptfs dir out
>     flag af-unspec esn
>     aead rfc4106(gcm(aes)) 0x7aacd5115a84ee5476940c864b3f4a4fa6ca9e3c0590b1b33ae5c925dad38c494c2ba9ac 128
>     lastused 2024-06-06 17:36:31
>     oseq-hi 0x0, oseq 0x564
>     iptfs-opts pkt-size 0 max-queue-size 1048576 drop-time 1000000 reorder-window 3 init-delay 0 dont-frag
> src 192.1.2.45 dst 192.1.2.23
>     proto esp spi 0x54562117 reqid 16389 mode iptfs dir in
>     flag af-unspec esn
>     aead rfc4106(gcm(aes)) 0x8505e65031be933d5b5be57c27a618de7f5d5a2c464dbfb62d093dcb411b2c4f75893484 128
>     lastused 2024-06-06 17:36:31
>     seq-hi 0x0, seq 0x564
>     replay-window 128, bitmap-length 4
>      ffffffff ffffffff ffffffff ffffffff
>     iptfs-opts pkt-size 0 max-queue-size 1048576 drop-time 1000000 reorder-window 3 init-delay 0
>
> Also there is a kernel splat on both ends. I am not sure it is related to
> your patches. However, I see it around the same time ping returns errors, and I
> haven't seen it before. I will try to get more information.
>
> [ 575.515108] ------------[ cut here ]------------
> [ 575.515646] refcount_t: underflow; use-after-free.
> [ 575.516169] WARNING: CPU: 0 PID: 34 at lib/refcount.c:28 refcount_warn_saturate+0xb7/0xfc
> [ 575.516996] Modules linked in:
> [ 575.517332] CPU: 0 PID: 34 Comm: rb_consumer Not tainted 6.9.0-rc2-00696-gf549fd6ea775 #28
> [ 575.518165] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
> [ 575.519105] RIP: 0010:refcount_warn_saturate+0xb7/0xfc
> [ 575.519635] Code: 4c 30 c3 01 01 e8 e2 af 8b ff 0f 0b eb 5e 80 3d 3b 30 c3 01 00 75 55 48 c7 c7 c0 95 4e 82 c6 05 2b 30 c3 01 01 e8 c2 af 8b ff <0f> 0b eb 3e 80 3d 1a 30 c3 01 00 75 35 48 c7 c7 20 97 4e 82 c6 05
> [ 575.521449] RSP: 0018:ffffc90000007c90 EFLAGS: 00010282
> [ 575.521992] RAX: 0000000000000000 RBX: 0000000000000003 RCX: 0000000000000000
> [ 575.522713] RDX: 0000000000000000 RSI: dffffc0000000000 RDI: fffff52000000f83
> [ 575.523448] RBP: ffff888104d4f334 R08: 0000000000000004 R09: 0000000000000001
> [ 575.524169] R10: ffffffff82d552ab R11: fffffbfff05aaa55 R12: ffff888104d4f334
> [ 575.524891] R13: 1ffff92000000fa7 R14: ffff8881073d0800 R15: 0000000000000000
> [ 575.525609] FS: 0000000000000000(0000) GS:ffffffff82cb2000(0000) knlGS:0000000000000000
> [ 575.526413] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 575.527013] CR2: 0000558b84637af0 CR3: 000000010dc88000 CR4: 0000000000350ef0
> [ 575.527731] Call Trace:
> [ 575.528006] <IRQ>
> [ 575.528240] ? __warn+0xb8/0x13b
> [ 575.528593] ? refcount_warn_saturate+0xb7/0xfc
> [ 575.529071] ? report_bug+0xf6/0x159
> [ 575.529453] ? refcount_warn_saturate+0xb7/0xfc
> [ 575.529931] ? handle_bug+0x3c/0x64
> [ 575.530306] ? exc_invalid_op+0x13/0x38
> [ 575.530717] ? asm_exc_invalid_op+0x16/0x20
> [ 575.531167] ? refcount_warn_saturate+0xb7/0xfc
> [ 575.531639] __refcount_sub_and_test.constprop.0+0x38/0x3d
> [ 575.532200] sock_wfree+0x13a/0x153
> [ 575.532575] skb_release_head_state+0x25/0x6b
> [ 575.533032] skb_release_all+0x13/0x3a
> [ 575.533429] napi_consume_skb+0x53/0x5e
> [ 575.533836] __free_old_xmit+0xcc/0x18d
> [ 575.534243] ? virtnet_freeze_down.isra.0+0xb4/0xb4
> [ 575.534747] ? check_preempt_wakeup_fair+0x64/0x1f3
> [ 575.535269] ? test_ti_thread_flag+0x12/0x1f
> [ 575.535718] ? tracing_record_taskinfo_sched_switch+0x25/0xbf
> [ 575.536304] free_old_xmit+0x72/0xbe
> [ 575.536687] ? __free_old_xmit+0x18d/0x18d
> [ 575.537117] ? trace_rcu_this_gp.constprop.0+0x52/0xca
> [ 575.537646] ? virtqueue_disable_cb+0x71/0xe9
> [ 575.538109] virtnet_poll_tx+0xf6/0x1d8
> [ 575.538516] __napi_poll.constprop.0+0x57/0x1a7
> [ 575.539000] net_rx_action+0x1cb/0x380
> [ 575.539399] ? __napi_poll.constprop.0+0x1a7/0x1a7
> [ 575.539896] ? __napi_schedule+0xe/0x17
> [ 575.540302] ? vring_interrupt+0xba/0xc4
> [ 575.540716] ? __handle_irq_event_percpu+0x180/0x197
> [ 575.541229] ? handle_irq_event_percpu+0x3b/0x40
> [ 575.541710] __do_softirq+0x135/0x2d7
> [ 575.542102] common_interrupt+0x93/0xb8
> [ 575.542509] </IRQ>
> [ 575.542751] <TASK>
> [ 575.543004] asm_common_interrupt+0x22/0x40
> [ 575.543443] RIP: 0010:ring_buffer_consume+0xde/0x11e
> [ 575.543960] Code: e8 c6 68 18 00 31 c0 4c 89 ef 49 89 45 58 e8 1e fb ff ff 41 0f b6 fc e8 4f ec ff ff 0f ba 64 24 20 09 73 01 fb bf 01 00 00 00 <e8> 4a 4e f4 ff 8b 05 64 17 28 02 85 c0 75 05 e8 63 56 e1 ff 48 85
> [ 575.545757] RSP: 0018:ffffc90000237dd0 EFLAGS: 00000283
> [ 575.546292] RAX: 0000000080000001 RBX: ffff8881047242e0 RCX: ffffffff811324bc
> [ 575.547022] RDX: 0000000000000002 RSI: dffffc0000000000 RDI: 0000000000000001
> [ 575.547735] RBP: ffff888104654c38 R08: 0000000000000008 R09: 0000000000000000
> [ 575.548448] R10: ffff88810469d457 R11: ffffed10208d3a8a R12: 0000000000000001
> [ 575.549162] R13: ffff888104b43200 R14: ffff888104654c48 R15: 0000000000000000
> [ 575.549882] ? preempt_count_sub+0x14/0xb3
> [ 575.550315] ring_buffer_consumer_thread+0x18e/0x475
> [ 575.550832] ? wait_to_die+0x7c/0x7c
> [ 575.551237] ? preempt_latency_start+0x29/0x34
> [ 575.551702] ? wait_to_die+0x7c/0x7c
> [ 575.552083] kthread+0x1ac/0x1bb
> [ 575.552434] ? kthread+0xfd/0x1bb
> [ 575.552792] ? kthread_complete_and_exit+0x20/0x20
> [ 575.553288] ret_from_fork+0x21/0x3c
> [ 575.553670] ? kthread_complete_and_exit+0x20/0x20
> [ 575.554166] ret_from_fork_asm+0x11/0x20
> [ 575.554582] </TASK>
> [ 575.554832] ---[ end trace 0000000000000000 ]---
> [ 635.894864] ------------[ cut here ]------------
> [ 635.895430] refcount_t: saturated; leaking memory.
> [ 635.895948] WARNING: CPU: 0 PID: 35 at lib/refcount.c:22 refcount_warn_saturate+0x77/0xfc
> [ 635.896768] Modules linked in:
> [ 635.897101] CPU: 0 PID: 35 Comm: rb_producer Tainted: G W 6.9.0-rc2-00696-gf549fd6ea775 #28
> [ 635.898057] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
> [ 635.898978] RIP: 0010:refcount_warn_saturate+0x77/0xfc
> [ 635.899508] Code: b0 8b ff 0f 0b e9 a2 00 00 00 80 3d 81 30 c3 01 00 0f 85 95 00 00 00 48 c7 c7 60 96 4e 82 c6 05 6d 30 c3 01 01 e8 02 b0 8b ff <0f> 0b eb 7e 80 3d 5c 30 c3 01 00 75 75 48 c7 c7 c0 96 4e 82 c6 05
> [ 635.901316] RSP: 0018:ffffc90000007220 EFLAGS: 00010282
> [ 635.901854] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
> [ 635.902567] RDX: 0000000000000000 RSI: dffffc0000000000 RDI: fffff52000000e35
> [ 635.903284] RBP: ffff888104d4f334 R08: 0000000000000004 R09: 0000000000000001
> [ 635.904010] R10: ffffffff82d552ab R11: fffffbfff05aaa55 R12: 1ffff92000000e50
> [ 635.904728] R13: ffffc90000007468 R14: 00000000bfffffff R15: ffff888104ff48c0
> [ 635.905441] FS: 0000000000000000(0000) GS:ffffffff82cb2000(0000) knlGS:0000000000000000
> [ 635.906246] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 635.906833] CR2: 0000558b84610cb0 CR3: 000000010dc88000 CR4: 0000000000350ef0
> [ 635.907554] Call Trace:
> [ 635.907827] <IRQ>
> [ 635.908067] ? __warn+0xb8/0x13b
> [ 635.908416] ? refcount_warn_saturate+0x77/0xfc
> [ 635.908885] ? report_bug+0xf6/0x159
> [ 635.909264] ? refcount_warn_saturate+0x77/0xfc
> [ 635.909733] ? handle_bug+0x3c/0x64
> [ 635.910105] ? exc_invalid_op+0x13/0x38
> [ 635.910508] ? asm_exc_invalid_op+0x16/0x20
> [ 635.910951] ? refcount_warn_saturate+0x77/0xfc
> [ 635.911419] sock_alloc_send_pskb+0x374/0x3d7
> [ 635.911888] ? sock_wmalloc+0x73/0x73
> [ 635.912280] ? xfrm_tmpl_resolve+0x4d1/0x4d1
> [ 635.912730] __ip_append_data+0x705/0x12f8
> [ 635.913159] ? icmp_unreach+0x2f7/0x2f7
> [ 635.913572] ? preempt_count_sub+0x14/0xb3
> [ 635.914002] ? skb_zcopy_set+0xb9/0xb9
> [ 635.914400] ? xfrm_lookup_with_ifid+0x68a/0x768
> [ 635.914881] ? sock_flag+0x15/0x20
> [ 635.915255] ip_append_data+0xc3/0xd6
> [ 635.915646] ? icmp_unreach+0x2f7/0x2f7
> [ 635.916059] icmp_push_reply+0x61/0x1b6
> [ 635.916462] icmp_reply+0x3a3/0x410
> [ 635.916844] ? __icmp_send+0x77a/0x77a
> [ 635.917241] ? fib_validate_source+0x128/0x1ca
> [ 635.917703] ? rt_cache_valid+0x70/0x8d
> [ 635.918109] ? do_csum+0xe2/0x13f
> [ 635.918468] icmp_echo.part.0+0xf5/0x130
> [ 635.918881] ? icmp_timestamp+0x19d/0x19d
> [ 635.919310] ? __skb_checksum+0x317/0x317
> [ 635.919733] ? csum_block_add_ext+0x10/0x10
> [ 635.920169] ? reqsk_fastopen_remove+0x249/0x249
> [ 635.920645] ? refcount_read+0x16/0x1a
> [ 635.921039] icmp_echo+0x58/0x5d
> [ 635.921386] icmp_rcv+0x482/0x4ec
> [ 635.921743] ip_protocol_deliver_rcu+0xd7/0x1b2
> [ 635.922211] ? ip_protocol_deliver_rcu+0x1b2/0x1b2
> [ 635.922703] ip_local_deliver_finish+0x110/0x120
> [ 635.923186] ? ip_protocol_deliver_rcu+0x1b2/0x1b2
> [ 635.923678] NF_HOOK.constprop.0+0xf8/0x138
> [ 635.924116] ? ip_sublist_rcv_finish+0x68/0x68
> [ 635.924576] ? __asan_load8+0x74/0x74
> [ 635.924964] ? do_csum+0xe2/0x13f
> [ 635.925320] ? __list_del_entry_valid_or_report+0xc8/0xed
> [ 635.925868] ip_sublist_rcv_finish+0x53/0x68
> [ 635.926313] ip_sublist_rcv+0x24f/0x29b
> [ 635.926716] ? ip_rcv_finish_core.isra.0+0x74d/0x74d
> [ 635.927230] ? skb_orphan_frags_rx.constprop.0+0x3a/0x67
> [ 635.927770] ? do_csum+0xe2/0x13f
> [ 635.928136] ? __asan_memset+0x21/0x3f
> [ 635.928532] ? ip_rcv_core+0x4a6/0x4f7
> [ 635.928928] ip_list_rcv+0x18a/0x1c2
> [ 635.929308] ? ip_rcv+0x57/0x57
> [ 635.929653] ? __list_add_valid_or_report+0x66/0xad
> [ 635.930155] ? __netif_receive_skb_list_ptype+0x3a/0xca
> [ 635.930687] __netif_receive_skb_list_core+0x17b/0x1c2
> [ 635.931220] ? __netif_receive_skb_core.constprop.0+0xb24/0xb24
> [ 635.931820] ? gro_normal_list+0x16/0x65
> [ 635.932236] ? __list_add_valid_or_report+0x66/0xad
> [ 635.932738] netif_receive_skb_list_internal+0x2bd/0x316
> [ 635.933279] ? process_backlog+0x187/0x187
> [ 635.933707] ? virtnet_poll+0x4a6/0x6cf
> [ 635.934113] ? virtnet_set_ringparam+0x595/0x595
> [ 635.934591] gro_normal_list+0x2e/0x65
> [ 635.934994] napi_complete_done+0x13b/0x246
> [ 635.935431] ? gro_normal_list+0x65/0x65
> [ 635.935853] ? gro_normal_one+0x9e/0xef
> [ 635.936258] gro_cell_poll+0x42/0x4b
> [ 635.936639] __napi_poll.constprop.0+0x57/0x1a7
> [ 635.937109] net_rx_action+0x1cb/0x380
> [ 635.937511] ? __napi_poll.constprop.0+0x1a7/0x1a7
> [ 635.938003] ? internal_add_timer+0xbf/0xbf
> [ 635.938439] ? vring_interrupt+0xba/0xc4
> [ 635.938855] ? __handle_irq_event_percpu+0x180/0x197
> [ 635.939371] ? handle_irq_event_percpu+0x3b/0x40
> [ 635.939850] __do_softirq+0x135/0x2d7
> [ 635.940239] common_interrupt+0x93/0xb8
> [ 635.940643] </IRQ>
> [ 635.940883] <TASK>
> [ 635.941123] asm_common_interrupt+0x22/0x40
> [ 635.941558] RIP: 0010:__asan_store8+0x0/0x77
> [ 635.942001] Code: 28 38 d0 eb 16 ba ff ff 37 00 48 c1 e8 03 48 c1 e2 2a 8a 04 10 84 c0 74 10 3c 07 7f 0c 31 d2 be 08 00 00 00 e9 5b f4 ff ff c3 <48> 8b 0c 24 48 83 ff f8 73 5d 48 b8 ff ff ff ff ff 7f ff ff 48 39
> [ 635.943797] RSP: 0018:ffffc90000247cf0 EFLAGS: 00000246
> [ 635.944343] RAX: ffffed1020968601 RBX: ffffc90000247df0 RCX: ffffed102096865e
> [ 635.945049] RDX: ffffed102096865e RSI: ffffed102096865e RDI: ffff888104b432e8
> [ 635.945755] RBP: ffff888104b43200 R08: 0000000000000008 R09: 0000000000000001
> [ 635.946461] R10: ffff888104b432ef R11: ffffed102096865d R12: 0000000000000000
> [ 635.947173] R13: 00000094124d099d R14: 0000000000000fd0 R15: ffff888104679140
> [ 635.947893] __rb_reserve_next.constprop.0+0x1e1/0x7e3
> [ 635.948419] ring_buffer_lock_reserve+0x26a/0x688
> [ 635.948903] ? __rb_reserve_next.constprop.0+0x7e3/0x7e3
> [ 635.949444] ring_buffer_producer_thread+0x9d/0x524
> [ 635.949950] ? ring_buffer_consumer_thread+0x475/0x475
> [ 635.950473] kthread+0x1ac/0x1bb
> [ 635.950822] ? kthread+0xfd/0x1bb
> [ 635.951184] ? kthread_complete_and_exit+0x20/0x20
> [ 635.951677] ret_from_fork+0x21/0x3c
> [ 635.952061] ? kthread_complete_and_exit+0x20/0x20
> [ 635.952553] ret_from_fork_asm+0x11/0x20
> [ 635.952966] </TASK>
> [ 635.953213] ---[ end trace 0000000000000000 ]---
> [ 829.757162] systemd-fstab-generator[2633]: Failed to create unit file
> '/run/systemd/generator/home.mount', as it already exists. Duplicate entry
> in '/etc/fstab'?
>
>
>
>
>>
>> Thanks!
>> Chris.
>>
>> Christian Hopps <chopps@chopps.org> writes:
>>
>> > [[PGP Signed Part:Good signature from 2E1D830ED7B83025 Christian Hopps <chopps@gmail.com> (trust ultimate) created at 2024-05-24T08:08:58-0400 using RSA]]
>> >
>> > This is very helpful thanks.
>> >
>> > I think the tunnel endpoints are east/west 192.1.2.{23,45}, but I can't determine the north/east endpoints b/c they don't appear connected. :)
>> >
>> > Are there any other iptfs options? The code you highlight mentions the `dont-frag` option, but I wonder if you actually have that enabled?
>> >
>> > It also seems like you are pinging and forcing the source IP of a red interface
>> > on the tunnel endpoint gateway directly (so that it doesn't try and use the
>> > black interface I would guess) is that correct?
>> >
>> > Thanks!
>> > Chris.
>> >
>> > P.S. the addresses on the NIC host in the picture seem reversed, but this doesn't seem relevant to this test :)
>> >
>> > Antony Antony <antony@phenome.org> writes:
>> >
>> > > On Thu, May 23, 2024 at 07:04:58PM -0400, Christian Hopps wrote:
>> > > >
>> > > > Could you let me know some more details about this test? What is your interface config / topology? I tried to guess given the ping command but it's not replicating for me.
>> > >
>> > > I am using the Libreswan testing topology. However, I am running the tests manually.
>> > > Yesterday the tunnel was between north and east. This morning I quickly tried
>> > > between west and east. Just two VMs. I see the same issue there too.
>> > >
>> > > https://libreswan.org/wiki/images/f/f1/Testnet-202102.png
>> > >
>> > > I am using CONFIG_ESP_OFFLOAD. That is the only thing standing out. Besides that, it
>> > > is just a 1500 MTU tunnel using qemu/kvm and a tap network.
>> > >
>> > > Attached is my kernel .config.
>> > >
>> > > > PS, I've changed the subject and In-reply-to to be based on the corrected
>> > > > cover-letter I sent, I initially sent the cover letter with the wrong
>> > > > subject. :(
>> > >
>> > > I noticed a second cover letter. However, it was not showing as related to
>> > > the patch set correctly. It showed up as a different thread. That is why I
>> > > replied to the initial one.
>> > >
>> > > -antony
>> > > >
>> > > >
>> > > > Antony Antony <antony@phenome.org> writes:
>> > > >
>> > > > > Hi Chris,
>> > > > >
>> > > > > On Mon, May 20, 2024 at 05:42:38PM -0400, Christian Hopps via Devel wrote:
>> > > > > > From: Christian Hopps <chopps@labn.net>
>> > > > > > - iptfs: remove some BUG_ON() assertions questioned in review.
>> > > >
>> > > > ...
>> > > >
>> > > > > I ran a couple of tests and hit a KASAN BUG.
>> > > > >
>> > > > > I was sending large pings while the MTU is 1500.
>> > > > >
>> > > > > [ 78.594770] ==================================================================
>> > > > > [ 78.595825] BUG: KASAN: null-ptr-deref in iptfs_output_collect+0x263/0x57b
>> > > > > [ 78.596658] Read of size 8 at addr 0000000000000108 by task ping/493
>> > > > > [ 78.597435]
>> > > > > [ 78.597651] CPU: 0 PID: 493 Comm: ping Not tainted 6.9.0-rc2-00697-g489ca863e24f-dirty #11
>> > > > > [ 78.598645] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
>> > > > > [ 78.599747] Call Trace:
>> > > > > [ 78.600070] <TASK>
>> > > > > [ 78.600354] dump_stack_lvl+0x2a/0x3b
>> > > > > [ 78.600817] kasan_report+0x84/0xa6
>> > > > > [ 78.601262] ? iptfs_output_collect+0x263/0x57b
>> > > > > [ 78.601825] iptfs_output_collect+0x263/0x57b
>> > > > > [ 78.602374] ip_send_skb+0x25/0x57
>> > > > > [ 78.602807] raw_sendmsg+0xee8/0x1011
>> > > > > [ 78.603269] ? native_flush_tlb_one_user+0xd/0xe5
>> > > > > [ 78.603850] ? raw_hash_sk+0x21b/0x21b
>> > > > > [ 78.604331] ? kernel_init_pages+0x42/0x51
>> > > > > [ 78.604845] ? prep_new_page+0x44/0x51
>> > > > > [ 78.605318] ? get_page_from_freelist+0x72b/0x915
>> > > > > [ 78.605903] ? signal_pending_state+0x77/0x77
>> > > > > [ 78.606462] ? __might_resched+0x8a/0x240
>> > > > > [ 78.606966] ? __might_sleep+0x25/0xa0
>> > > > > [ 78.607440] ? first_zones_zonelist+0x2c/0x43
>> > > > > [ 78.607985] ? __rcu_read_lock+0x2d/0x3a
>> > > > > [ 78.608479] ? __pte_offset_map+0x32/0xa4
>> > > > > [ 78.608979] ? __might_resched+0x8a/0x240
>> > > > > [ 78.609478] ? __might_sleep+0x25/0xa0
>> > > > > [ 78.609949] ? inet_send_prepare+0x54/0x54
>> > > > > [ 78.610464] ? sock_sendmsg_nosec+0x42/0x6c
>> > > > > [ 78.610984] sock_sendmsg_nosec+0x42/0x6c
>> > > > > [ 78.611485] __sys_sendto+0x15d/0x1cc
>> > > > > [ 78.611947] ? __x64_sys_getpeername+0x44/0x44
>> > > > > [ 78.612498] ? __handle_mm_fault+0x679/0xae4
>> > > > > [ 78.613033] ? find_vma+0x6b/0x8b
>> > > > > [ 78.613457] ? find_vma_intersection+0x8a/0x8a
>> > > > > [ 78.614006] ? __handle_irq_event_percpu+0x180/0x197
>> > > > > [ 78.614617] ? handle_mm_fault+0x38/0x154
>> > > > > [ 78.615114] ? handle_mm_fault+0xeb/0x154
>> > > > > [ 78.615620] ? preempt_latency_start+0x29/0x34
>> > > > > [ 78.616169] ? preempt_count_sub+0x14/0xb3
>> > > > > [ 78.616678] ? up_read+0x4b/0x5c
>> > > > > [ 78.617094] __x64_sys_sendto+0x76/0x82
>> > > > > [ 78.617577] do_syscall_64+0x6b/0xd7
>> > > > > [ 78.618043] entry_SYSCALL_64_after_hwframe+0x46/0x4e
>> > > > > [ 78.618667] RIP: 0033:0x7fed3de99a73
>> > > > > [ 78.619118] Code: 8b 15 a9 83 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b8
>> > > > > 0f 1f 00 80 3d 71 0b 0d 00 00 41 89 ca 74 14 b8 2c 00 00 00 0f 05 <48> 3d 00 f0
>> > > > > ff ff 77 75 c3 0f 1f 40 00 55 48 83 ec 30 44 89 4c 24
>> > > > > [ 78.621291] RSP: 002b:00007ffff6bdf478 EFLAGS: 00000202 ORIG_RAX: 000000000000002c
>> > > > > [ 78.622205] RAX: ffffffffffffffda RBX: 000055c538159340 RCX: 00007fed3de99a73
>> > > > > [ 78.623056] RDX: 00000000000007d8 RSI: 000055c53815f3c0 RDI: 0000000000000003
>> > > > > [ 78.623908] RBP: 000055c53815f3c0 R08: 000055c53815b5c0 R09: 0000000000000010
>> > > > > [ 78.624765] R10: 0000000000000000 R11: 0000000000000202 R12: 00000000000007d8
>> > > > > [ 78.625619] R13: 00007ffff6be0b60 R14: 0000001d00000001 R15: 000055c53815c680
>> > > > > [ 78.626480] </TASK>
>> > > > > [ 78.626773] ==================================================================
>> > > > > [ 78.627656] Disabling lock debugging due to kernel taint
>> > > > > [ 78.628305] BUG: kernel NULL pointer dereference, address: 0000000000000108
>> > > > > [ 78.629136] #PF: supervisor read access in kernel mode
>> > > > > [ 78.629766] #PF: error_code(0x0000) - not-present page
>> > > > > [ 78.630402] PGD 0 P4D 0
>> > > > > [ 78.630739] Oops: 0000 [#1] PREEMPT DEBUG_PAGEALLOC KASAN
>> > > > > [ 78.631398] CPU: 0 PID: 493 Comm: ping Tainted: G B 6.9.0-rc2-00697-g489ca863e24f-dirty #11
>> > > > > [ 78.632548] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
>> > > > > [ 78.633649] RIP: 0010:iptfs_output_collect+0x263/0x57b
>> > > > > [ 78.634283] Code: 73 70 0f 84 25 01 00 00 45 39 f4 0f 83 1c 01 00 00 48 8d 7b
>> > > > > 10 e8 27 37 62 ff 4c 8b 73 10 49 8d be 08 01 00 00 e8 17 37 62 ff <4d> 8b b6 08
>> > > > > 01 00 00 49 8d be b0 01 00 00 e8 04 37 62 ff 49 8b 86
>> > > > > [ 78.636444] RSP: 0018:ffffc90000d679c8 EFLAGS: 00010296
>> > > > > [ 78.637076] RAX: 0000000000000001 RBX: ffff888110ffbc80 RCX: fffffbfff07623ad
>> > > > > [ 78.637923] RDX: fffffbfff07623ad RSI: fffffbfff07623ad RDI: ffffffff83b11d60
>> > > > > [ 78.638792] RBP: ffff88810e3a1400 R08: 0000000000000008 R09: 0000000000000001
>> > > > > [ 78.639645] R10: ffffffff83b11d67 R11: fffffbfff07623ac R12: 00000000000005a2
>> > > > > [ 78.640498] R13: 0000000000000000 R14: 0000000000000000 R15: ffff88810e9a3401
>> > > > > [ 78.641359] FS: 00007fed3dbddc40(0000) GS:ffffffff82cb2000(0000) knlGS:0000000000000000
>> > > > > [ 78.642324] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> > > > > [ 78.643022] CR2: 0000000000000108 CR3: 0000000110e84000 CR4: 0000000000350ef0
>> > > > > [ 78.643882] Call Trace:
>> > > > > [ 78.644204] <TASK>
>> > > > > [ 78.644487] ? __die_body+0x1a/0x56
>> > > > > [ 78.644929] ? page_fault_oops+0x45f/0x4cd
>> > > > > [ 78.645441] ? dump_pagetable+0x1db/0x1db
>> > > > > [ 78.645942] ? vprintk_emit+0x163/0x171
>> > > > > [ 78.646425] ? iptfs_output_collect+0x263/0x57b
>> > > > > [ 78.646986] ? _printk+0xb2/0xe1
>> > > > > [ 78.647401] ? find_first_fitting_seq+0x193/0x193
>> > > > > [ 78.647982] ? iptfs_output_collect+0x263/0x57b
>> > > > > [ 78.648541] ? do_user_addr_fault+0x14f/0x56c
>> > > > > [ 78.649084] ? exc_page_fault+0xa5/0xbe
>> > > > > [ 78.649566] ? asm_exc_page_fault+0x22/0x30
>> > > > > [ 78.650100] ? iptfs_output_collect+0x263/0x57b
>> > > > > [ 78.650660] ? iptfs_output_collect+0x263/0x57b
>> > > > > [ 78.651221] ip_send_skb+0x25/0x57
>> > > > > [ 78.651652] raw_sendmsg+0xee8/0x1011
>> > > > > [ 78.652113] ? native_flush_tlb_one_user+0xd/0xe5
>> > > > > [ 78.652693] ? raw_hash_sk+0x21b/0x21b
>> > > > > [ 78.653166] ? kernel_init_pages+0x42/0x51
>> > > > > [ 78.653683] ? prep_new_page+0x44/0x51
>> > > > > [ 78.654160] ? get_page_from_freelist+0x72b/0x915
>> > > > > [ 78.654739] ? signal_pending_state+0x77/0x77
>> > > > > [ 78.655284] ? __might_resched+0x8a/0x240
>> > > > > [ 78.655784] ? __might_sleep+0x25/0xa0
>> > > > > [ 78.656255] ? first_zones_zonelist+0x2c/0x43
>> > > > > [ 78.656798] ? __rcu_read_lock+0x2d/0x3a
>> > > > > [ 78.657289] ? __pte_offset_map+0x32/0xa4
>> > > > > [ 78.657788] ? __might_resched+0x8a/0x240
>> > > > > [ 78.658291] ? __might_sleep+0x25/0xa0
>> > > > > [ 78.658763] ? inet_send_prepare+0x54/0x54
>> > > > > [ 78.659272] ? sock_sendmsg_nosec+0x42/0x6c
>> > > > > [ 78.659791] sock_sendmsg_nosec+0x42/0x6c
>> > > > > [ 78.660293] __sys_sendto+0x15d/0x1cc
>> > > > > [ 78.660755] ? __x64_sys_getpeername+0x44/0x44
>> > > > > [ 78.661304] ? __handle_mm_fault+0x679/0xae4
>> > > > > [ 78.661838] ? find_vma+0x6b/0x8b
>> > > > > [ 78.662272] ? find_vma_intersection+0x8a/0x8a
>> > > > > [ 78.662828] ? __handle_irq_event_percpu+0x180/0x197
>> > > > > [ 78.663436] ? handle_mm_fault+0x38/0x154
>> > > > > [ 78.663935] ? handle_mm_fault+0xeb/0x154
>> > > > > [ 78.664435] ? preempt_latency_start+0x29/0x34
>> > > > > [ 78.664987] ? preempt_count_sub+0x14/0xb3
>> > > > > [ 78.665498] ? up_read+0x4b/0x5c
>> > > > > [ 78.665911] __x64_sys_sendto+0x76/0x82
>> > > > > [ 78.666398] do_syscall_64+0x6b/0xd7
>> > > > > [ 78.666849] entry_SYSCALL_64_after_hwframe+0x46/0x4e
>> > > > > [ 78.667466] RIP: 0033:0x7fed3de99a73
>> > > > > [ 78.667918] Code: 8b 15 a9 83 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b8
>> > > > > 0f 1f 00 80 3d 71 0b 0d 00 00 41 89 ca 74 14 b8 2c 00 00 00 0f 05 <48> 3d 00 f0
>> > > > > ff ff 77 75 c3 0f 1f 40 00 55 48 83 ec 30 44 89 4c 24
>> > > > > [ 78.670097] RSP: 002b:00007ffff6bdf478 EFLAGS: 00000202 ORIG_RAX: 000000000000002c
>> > > > > [ 78.671002] RAX: ffffffffffffffda RBX: 000055c538159340 RCX: 00007fed3de99a73
>> > > > > [ 78.671858] RDX: 00000000000007d8 RSI: 000055c53815f3c0 RDI: 0000000000000003
>> > > > > [ 78.672708] RBP: 000055c53815f3c0 R08: 000055c53815b5c0 R09: 0000000000000010
>> > > > > [ 78.673564] R10: 0000000000000000 R11: 0000000000000202 R12: 00000000000007d8
>> > > > > [ 78.674430] R13: 00007ffff6be0b60 R14: 0000001d00000001 R15: 000055c53815c680
>> > > > > [ 78.675287] </TASK>
>> > > > > [ 78.675580] Modules linked in:
>> > > > > [ 78.675975] CR2: 0000000000000108
>> > > > > [ 78.676396] ---[ end trace 0000000000000000 ]---
>> > > > > [ 78.676966] RIP: 0010:iptfs_output_collect+0x263/0x57b
>> > > > > [ 78.677596] Code: 73 70 0f 84 25 01 00 00 45 39 f4 0f 83 1c 01 00 00 48 8d 7b
>> > > > > 10 e8 27 37 62 ff 4c 8b 73 10 49 8d be 08 01 00 00 e8 17 37 62 ff <4d> 8b b6 08
>> > > > > 01 00 00 49 8d be b0 01 00 00 e8 04 37 62 ff 49 8b 86
>> > > > > [ 78.679768] RSP: 0018:ffffc90000d679c8 EFLAGS: 00010296
>> > > > > [ 78.680410] RAX: 0000000000000001 RBX: ffff888110ffbc80 RCX: fffffbfff07623ad
>> > > > > [ 78.681264] RDX: fffffbfff07623ad RSI: fffffbfff07623ad RDI: ffffffff83b11d60
>> > > > > [ 78.682136] RBP: ffff88810e3a1400 R08: 0000000000000008 R09: 0000000000000001
>> > > > > [ 78.682997] R10: ffffffff83b11d67 R11: fffffbfff07623ac R12: 00000000000005a2
>> > > > > [ 78.683853] R13: 0000000000000000 R14: 0000000000000000 R15: ffff88810e9a3401
>> > > > > [ 78.684710] FS: 00007fed3dbddc40(0000) GS:ffffffff82cb2000(0000) knlGS:0000000000000000
>> > > > > [ 78.685675] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> > > > > [ 78.686387] CR2: 0000000000000108 CR3: 0000000110e84000 CR4: 0000000000350ef0
>> > > > > [ 78.687246] Kernel panic - not syncing: Fatal exception in interrupt
>> > > > > [ 78.688014] Kernel Offset: disabled
>> > > > > [ 78.688460] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
>> > > > >
>> > > > > ping -s 2000 -n -q -W 1 -c 2 -I 192.0.3.254 192.0.2.254
>> > > > >
>> > > > > (gdb) list *iptfs_output_collect+0x263
>> > > > > 0xffffffff81d5076f is in iptfs_output_collect (./include/net/net_namespace.h:383).
>> > > > > 378 }
>> > > > > 379
>> > > > > 380 static inline struct net *read_pnet(const possible_net_t *pnet)
>> > > > > 381 {
>> > > > > 382 #ifdef CONFIG_NET_NS
>> > > > > 383 return rcu_dereference_protected(pnet->net, true);
>> > > > > 384 #else
>> > > > > 385 return &init_net;
>> > > > > 386 #endif
>> > > > > 387 }
>> > > > >
>> > > > > I suspect actual crash is from the line 1756 instead,
>> > > > > (gdb) list *iptfs_output_collect+0x256
>> > > > > 0xffffffff81d50762 is in iptfs_output_collect (net/xfrm/xfrm_iptfs.c:1756).
>> > > > > 1751 return 0;
>> > > > > 1752
>> > > > > 1753 /* We only send ICMP too big if the user has configured us as
>> > > > > 1754 * dont-fragment.
>> > > > > 1755 */
>> > > > > 1756 XFRM_INC_STATS(dev_net(skb->dev), LINUX_MIB_XFRMOUTERROR);
>> > > > > 1757
>> > > > > 1758 if (sk) {
>> > > > > 1759 xfrm_local_error(skb, pmtu);
>> > > > > 1760 } else if (ip_hdr(skb)->version == 4) {
>> > > > >
>> > > > > Later I ran gdb on iptfs_is_too_big, which is called twice; the second
>> > > > > time it crashes.
>> > > > > Here is the gdb bt, just before the crash:
>> > > > >
>> > > > > #0 iptfs_is_too_big (pmtu=1442, skb=0xffff88810dbea3c0, sk=0xffff888104d4ed40) at net/xfrm/xfrm_iptfs.c:1756
>> > > > > #1 iptfs_output_collect (net=<optimized out>, sk=0xffff888104d4ed40, skb=0xffff88810dbea3c0) at net/xfrm/xfrm_iptfs.c:1847
>> > > > > #2 0xffffffff81c8a3cb in ip_send_skb (net=0xffffffff83e57f20 <init_net>, skb=0xffff88810dbea3c0)
>> > > > > at net/ipv4/ip_output.c:1492
>> > > > > #3 0xffffffff81c8a439 in ip_push_pending_frames (sk=sk@entry=0xffff888104d4ed40, fl4=fl4@entry=0xffffc90000e3fb90)
>> > > > > at net/ipv4/ip_output.c:1512
>> > > > > #4 0xffffffff81ccf3cf in raw_sendmsg (sk=0xffff888104d4ed40, msg=0xffffc90000e3fd80, len=<optimized out>)
>> > > > > at net/ipv4/raw.c:654
>> > > > > #5 0xffffffff81b096ea in sock_sendmsg_nosec (sock=sock@entry=0xffff888115136040, msg=msg@entry=0xffffc90000e3fd80)
>> > > > > at net/socket.c:730
>> > > > > #6 0xffffffff81b0c327 in __sock_sendmsg (msg=0xffffc90000e3fd80, sock=0xffff888115136040) at net/socket.c:745
>> > > > > #7 __sys_sendto (fd=<optimized out>, buff=buff@entry=0x558edefb73c0, len=len@entry=2008, flags=flags@entry=0,
>> > > > > addr=addr@entry=0x558edefb35c0, addr_len=addr_len@entry=16) at net/socket.c:2191
>> > > > > #8 0xffffffff81b0c40c in __do_sys_sendto (addr_len=16, addr=0x558edefb35c0, flags=0, len=2008, buff=0x558edefb73c0,
>> > > > > fd=<optimized out>) at net/socket.c:2203
>> > > > > #9 __se_sys_sendto (addr_len=16, addr=94072114722240, flags=0, len=2008, buff=94072114738112, fd=<optimized out>)
>> > > > > at net/socket.c:2199
>> > > > >
>> > > > > (gdb) list
>> > > > > 1751 return 0;
>> > > > > 1752
>> > > > > 1753 /* We only send ICMP too big if the user has configured us as
>> > > > > 1754 * dont-fragment.
>> > > > > 1755 */
>> > > > > 1756 XFRM_INC_STATS(dev_net(skb->dev), LINUX_MIB_XFRMOUTERROR);
>> > > > > 1757
>> > > > > 1758 if (sk) {
>> > > > > 1759 xfrm_local_error(skb, pmtu);
>> > > > > 1760 } else if (ip_hdr(skb)->version == 4) {
>> > > > >
>> > > > > -antony
>> > > >
>> > >
>> > > [2. text/plain; .config]...
>> >
>> > [[End of PGP Signed Part]]
>>
>> a
>
>
>
>> --
>> Devel mailing list
>> Devel@linux-ipsec.org
>> https://linux-ipsec.org/mailman/listinfo/devel
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 857 bytes --]
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [devel-ipsec] [PATCH ipsec-next v2 0/17] Add IP-TFS mode to xfrm
2024-05-25 5:55 ` Christian Hopps
2024-06-06 15:52 ` [devel-ipsec] " Antony Antony
@ 2024-06-11 6:24 ` Antony Antony
2024-06-17 15:17 ` Christian Hopps
1 sibling, 1 reply; 34+ messages in thread
From: Antony Antony @ 2024-06-11 6:24 UTC (permalink / raw)
To: Christian Hopps
Cc: Antony Antony, devel, Steffen Klassert, netdev, Christian Hopps
On Sat, May 25, 2024 at 01:55:01AM -0400, Christian Hopps via Devel wrote:
>
> Found. This was happening b/c the skb was locally generated on the gateway and so had no net_device. Fixed by checking for skb->dev == NULL before incrementing the error stats in the output path.
>
> Thanks!
> Chris.
I tried v3 quickly and I still see a kernel oops on misconfigurations. Did you
test with IP-TFS DF enabled? I understand IP-TFS DF is a misconfiguration.
However, it should not cause a null-ptr-deref.
I suspect there is another misconfiguration that is causing a null-ptr-deref:
sysctl -w net.core.xfrm_iptfs_maxqsize=0
I will try later this week to get you the crash message.
-antony
>
> Christian Hopps <chopps@chopps.org> writes:
>
> > [[PGP Signed Part:Good signature from 2E1D830ED7B83025 Christian Hopps <chopps@gmail.com> (trust ultimate) created at 2024-05-24T08:08:58-0400 using RSA]]
> >
> > This is very helpful thanks.
> >
> > I think the tunnel endpoints are east/west 192.1.2.{23,45}, but I can't determine the north/east endpoints b/c they don't appear connected. :)
> >
> > Are there any other iptfs options? The code you highlight mentions the `dont-frag` option, but I wonder if you actually have that enabled?
> >
> > It also seems like you are pinging and forcing the source IP of a red interface
> > on the tunnel endpoint gateway directly (so that it doesn't try and use the
> > black interface I would guess) is that correct?
> >
> > Thanks!
> > Chris.
> >
> > P.S. the addresses on the NIC host in the picture seem reversed, but this doesn't seem relevant to this test :)
> >
> > Antony Antony <antony@phenome.org> writes:
> >
> > > On Thu, May 23, 2024 at 07:04:58PM -0400, Christian Hopps wrote:
> > > >
> > > > Could you let me know some more details about this test? What is your interface config / topology?. I tried to guess given the ping command but it's not replicating for me.
> > >
> > > > I am using the Libreswan testing topology. However, I am running the test manually.
> > > > Yesterday the tunnel was between north and east. This morning I quickly tried
> > > > between west and east. Just two VMs. I see the same issue there too.
> > >
> > > https://libreswan.org/wiki/images/f/f1/Testnet-202102.png
> > >
> > > > I am using CONFIG_ESP_OFFLOAD. That is the only thing standing out. Besides that, it
> > > > is just a 1500 MTU tunnel using qemu/kvm and a tap network.
> > >
> > > attached is my kernel .config
> > >
> > > > PS, I've changed the subject and In-reply-to to be based on the corrected
> > > > cover-letter I sent, I initially sent the cover letter with the wrong
> > > > subject. :(
> > >
> > > > I noticed a second cover letter. However, it was not showing as related to
> > > > the patch set correctly. It showed up as a different thread. That is why I
> > > > replied to the initial one.
> > >
> > > -antony
> > > >
> > > >
> > > > Antony Antony <antony@phenome.org> writes:
> > > >
> > > > > Hi Chris,
> > > > >
> > > > > On Mon, May 20, 2024 at 05:42:38PM -0400, Christian Hopps via Devel wrote:
> > > > > > From: Christian Hopps <chopps@labn.net>
> > > > > > - iptfs: remove some BUG_ON() assertions questioned in review.
> > > >
> > > > ...
> > > >
> > > > > I ran a couple of tests and it hit a KASAN BUG.
> > > > >
> > > > > I was sending large ping while MTU is 1500.
> > > > >
> > > > > north login: shed systemd-user-sessions.service - Permit User Sessions.
> > > > > north login: [ 78.594770] ==================================================================
> > > > > [ 78.595825] BUG: KASAN: null-ptr-deref in iptfs_output_collect+0x263/0x57b
> > > > > [ 78.596658] Read of size 8 at addr 0000000000000108 by task ping/493
> > > > > [ 78.597435] ng rpc-statd-notify.service - Notify NFS peers of a restart...
> > > > > [ 78.597651] CPU: 0 PID: 493 Comm: ping Not tainted 6.9.0-rc2-00697-g489ca863e24f-dirty #11
> > > > > [ 78.598645] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
> > > > > [ 78.599747] Call Trace:tty@ttyS2.service - Serial Getty on ttyS2.
> > > > > [ 78.600070] <TASK>l-getty@ttyS3.service - Serial Getty on ttyS3.
> > > > > [ 78.600354] dump_stack_lvl+0x2a/0x3bogin Prompts.
> > > > > [ 78.600817] kasan_report+0x84/0xa6rvice - Hostname Service...
> > > > > [ 78.601262] ? iptfs_output_collect+0x263/0x57bl server.
> > > > > [ 78.601825] iptfs_output_collect+0x263/0x57bogin Management.
> > > > > [ 78.602374] ip_send_skb+0x25/0x57vice - Notify NFS peers of a restart.
> > > > > [ 78.602807] raw_sendmsg+0xee8/0x1011t - Multi-User System.
> > > > > [ 78.603269] ? native_flush_tlb_one_user+0xd/0xe5e Service.
> > > > > [ 78.603850] ? raw_hash_sk+0x21b/0x21b
> > > > > [ 78.604331] ? kernel_init_pages+0x42/0x51
> > > > > [ 78.604845] ? prep_new_page+0x44/0x51Re…line ext4 Metadata Check Snapshots.
> > > > > [ 78.605318] ? get_page_from_freelist+0x72b/0x915 Interface.
> > > > > [ 78.605903] ? signal_pending_state+0x77/0x77cord Runlevel Change in UTMP...
> > > > > [ 78.606462] ? __might_resched+0x8a/0x240e - Record Runlevel Change in UTMP.
> > > > > [ 78.606966] ? __might_sleep+0x25/0xa0
> > > > > [ 78.607440] ? first_zones_zonelist+0x2c/0x43
> > > > > [ 78.607985] ? __rcu_read_lock+0x2d/0x3a
> > > > > [ 78.608479] ? __pte_offset_map+0x32/0xa4
> > > > > [ 78.608979] ? __might_resched+0x8a/0x240
> > > > > [ 78.609478] ? __might_sleep+0x25/0xa0
> > > > > [ 78.609949] ? inet_send_prepare+0x54/0x54
> > > > > [ 78.610464] ? sock_sendmsg_nosec+0x42/0x6c
> > > > > [ 78.610984] sock_sendmsg_nosec+0x42/0x6c
> > > > > [ 78.611485] __sys_sendto+0x15d/0x1cc
> > > > > [ 78.611947] ? __x64_sys_getpeername+0x44/0x44
> > > > > [ 78.612498] ? __handle_mm_fault+0x679/0xae4
> > > > > [ 78.613033] ? find_vma+0x6b/0x8b
> > > > > [ 78.613457] ? find_vma_intersection+0x8a/0x8a
> > > > > [ 78.614006] ? __handle_irq_event_percpu+0x180/0x197
> > > > > [ 78.614617] ? handle_mm_fault+0x38/0x154
> > > > > [ 78.615114] ? handle_mm_fault+0xeb/0x154
> > > > > [ 78.615620] ? preempt_latency_start+0x29/0x34
> > > > > [ 78.616169] ? preempt_count_sub+0x14/0xb3
> > > > > [ 78.616678] ? up_read+0x4b/0x5c
> > > > > [ 78.617094] __x64_sys_sendto+0x76/0x82
> > > > > [ 78.617577] do_syscall_64+0x6b/0xd7
> > > > > [ 78.618043] entry_SYSCALL_64_after_hwframe+0x46/0x4e
> > > > > [ 78.618667] RIP: 0033:0x7fed3de99a73
> > > > > [ 78.619118] Code: 8b 15 a9 83 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b8
> > > > > 0f 1f 00 80 3d 71 0b 0d 00 00 41 89 ca 74 14 b8 2c 00 00 00 0f 05 <48> 3d 00 f0
> > > > > ff ff 77 75 c3 0f 1f 40 00 55 48 83 ec 30 44 89 4c 24
> > > > > [ 78.621291] RSP: 002b:00007ffff6bdf478 EFLAGS: 00000202 ORIG_RAX: 000000000000002c
> > > > > [ 78.622205] RAX: ffffffffffffffda RBX: 000055c538159340 RCX: 00007fed3de99a73
> > > > > [ 78.623056] RDX: 00000000000007d8 RSI: 000055c53815f3c0 RDI: 0000000000000003
> > > > > [ 78.623908] RBP: 000055c53815f3c0 R08: 000055c53815b5c0 R09: 0000000000000010
> > > > > [ 78.624765] R10: 0000000000000000 R11: 0000000000000202 R12: 00000000000007d8
> > > > > [ 78.625619] R13: 00007ffff6be0b60 R14: 0000001d00000001 R15: 000055c53815c680
> > > > > [ 78.626480] </TASK>
> > > > > [ 78.626773] ==================================================================
> > > > > [ 78.627656] Disabling lock debugging due to kernel taint
> > > > > [ 78.628305] BUG: kernel NULL pointer dereference, address: 0000000000000108
> > > > > [ 78.629136] #PF: supervisor read access in kernel mode
> > > > > [ 78.629766] #PF: error_code(0x0000) - not-present page
> > > > > [ 78.630402] PGD 0 P4D 0
> > > > > [ 78.630739] Oops: 0000 [#1] PREEMPT DEBUG_PAGEALLOC KASAN
> > > > > [ 78.631398] CPU: 0 PID: 493 Comm: ping Tainted: G B 6.9.0-rc2-00697-g489ca863e24f-dirty #11
> > > > > [ 78.632548] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
> > > > > [ 78.633649] RIP: 0010:iptfs_output_collect+0x263/0x57b
> > > > > [ 78.634283] Code: 73 70 0f 84 25 01 00 00 45 39 f4 0f 83 1c 01 00 00 48 8d 7b
> > > > > 10 e8 27 37 62 ff 4c 8b 73 10 49 8d be 08 01 00 00 e8 17 37 62 ff <4d> 8b b6 08
> > > > > 01 00 00 49 8d be b0 01 00 00 e8 04 37 62 ff 49 8b 86
> > > > > [ 78.636444] RSP: 0018:ffffc90000d679c8 EFLAGS: 00010296
> > > > > [ 78.637076] RAX: 0000000000000001 RBX: ffff888110ffbc80 RCX: fffffbfff07623ad
> > > > > [ 78.637923] RDX: fffffbfff07623ad RSI: fffffbfff07623ad RDI: ffffffff83b11d60
> > > > > [ 78.638792] RBP: ffff88810e3a1400 R08: 0000000000000008 R09: 0000000000000001
> > > > > [ 78.639645] R10: ffffffff83b11d67 R11: fffffbfff07623ac R12: 00000000000005a2
> > > > > [ 78.640498] R13: 0000000000000000 R14: 0000000000000000 R15: ffff88810e9a3401
> > > > > [ 78.641359] FS: 00007fed3dbddc40(0000) GS:ffffffff82cb2000(0000) knlGS:0000000000000000
> > > > > [ 78.642324] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > > [ 78.643022] CR2: 0000000000000108 CR3: 0000000110e84000 CR4: 0000000000350ef0
> > > > > [ 78.643882] Call Trace:
> > > > > [ 78.644204] <TASK>
> > > > > [ 78.644487] ? __die_body+0x1a/0x56
> > > > > [ 78.644929] ? page_fault_oops+0x45f/0x4cd
> > > > > [ 78.645441] ? dump_pagetable+0x1db/0x1db
> > > > > [ 78.645942] ? vprintk_emit+0x163/0x171
> > > > > [ 78.646425] ? iptfs_output_collect+0x263/0x57b
> > > > > [ 78.646986] ? _printk+0xb2/0xe1
> > > > > [ 78.647401] ? find_first_fitting_seq+0x193/0x193
> > > > > [ 78.647982] ? iptfs_output_collect+0x263/0x57b
> > > > > [ 78.648541] ? do_user_addr_fault+0x14f/0x56c
> > > > > [ 78.649084] ? exc_page_fault+0xa5/0xbe
> > > > > [ 78.649566] ? asm_exc_page_fault+0x22/0x30
> > > > > [ 78.650100] ? iptfs_output_collect+0x263/0x57b
> > > > > [ 78.650660] ? iptfs_output_collect+0x263/0x57b
> > > > > [ 78.651221] ip_send_skb+0x25/0x57
> > > > > [ 78.651652] raw_sendmsg+0xee8/0x1011
> > > > > [ 78.652113] ? native_flush_tlb_one_user+0xd/0xe5
> > > > > [ 78.652693] ? raw_hash_sk+0x21b/0x21b
> > > > > [ 78.653166] ? kernel_init_pages+0x42/0x51
> > > > > [ 78.653683] ? prep_new_page+0x44/0x51
> > > > > [ 78.654160] ? get_page_from_freelist+0x72b/0x915
> > > > > [ 78.654739] ? signal_pending_state+0x77/0x77
> > > > > [ 78.655284] ? __might_resched+0x8a/0x240
> > > > > [ 78.655784] ? __might_sleep+0x25/0xa0
> > > > > [ 78.656255] ? first_zones_zonelist+0x2c/0x43
> > > > > [ 78.656798] ? __rcu_read_lock+0x2d/0x3a
> > > > > [ 78.657289] ? __pte_offset_map+0x32/0xa4
> > > > > [ 78.657788] ? __might_resched+0x8a/0x240
> > > > > [ 78.658291] ? __might_sleep+0x25/0xa0
> > > > > [ 78.658763] ? inet_send_prepare+0x54/0x54
> > > > > [ 78.659272] ? sock_sendmsg_nosec+0x42/0x6c
> > > > > [ 78.659791] sock_sendmsg_nosec+0x42/0x6c
> > > > > [ 78.660293] __sys_sendto+0x15d/0x1cc
> > > > > [ 78.660755] ? __x64_sys_getpeername+0x44/0x44
> > > > > [ 78.661304] ? __handle_mm_fault+0x679/0xae4
> > > > > [ 78.661838] ? find_vma+0x6b/0x8b
> > > > > [ 78.662272] ? find_vma_intersection+0x8a/0x8a
> > > > > [ 78.662828] ? __handle_irq_event_percpu+0x180/0x197
> > > > > [ 78.663436] ? handle_mm_fault+0x38/0x154
> > > > > [ 78.663935] ? handle_mm_fault+0xeb/0x154
> > > > > [ 78.664435] ? preempt_latency_start+0x29/0x34
> > > > > [ 78.664987] ? preempt_count_sub+0x14/0xb3
> > > > > [ 78.665498] ? up_read+0x4b/0x5c
> > > > > [ 78.665911] __x64_sys_sendto+0x76/0x82
> > > > > [ 78.666398] do_syscall_64+0x6b/0xd7
> > > > > [ 78.666849] entry_SYSCALL_64_after_hwframe+0x46/0x4e
> > > > > [ 78.667466] RIP: 0033:0x7fed3de99a73
> > > > > [ 78.667918] Code: 8b 15 a9 83 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b8
> > > > > 0f 1f 00 80 3d 71 0b 0d 00 00 41 89 ca 74 14 b8 2c 00 00 00 0f 05 <48> 3d 00 f0
> > > > > ff ff 77 75 c3 0f 1f 40 00 55 48 83 ec 30 44 89 4c 24
> > > > > [ 78.670097] RSP: 002b:00007ffff6bdf478 EFLAGS: 00000202 ORIG_RAX: 000000000000002c
> > > > > [ 78.671002] RAX: ffffffffffffffda RBX: 000055c538159340 RCX: 00007fed3de99a73
> > > > > [ 78.671858] RDX: 00000000000007d8 RSI: 000055c53815f3c0 RDI: 0000000000000003
> > > > > [ 78.672708] RBP: 000055c53815f3c0 R08: 000055c53815b5c0 R09: 0000000000000010
> > > > > [ 78.673564] R10: 0000000000000000 R11: 0000000000000202 R12: 00000000000007d8
> > > > > [ 78.674430] R13: 00007ffff6be0b60 R14: 0000001d00000001 R15: 000055c53815c680
> > > > > [ 78.675287] </TASK>
> > > > > [ 78.675580] Modules linked in:
> > > > > [ 78.675975] CR2: 0000000000000108
> > > > > [ 78.676396] ---[ end trace 0000000000000000 ]---
> > > > > [ 78.676966] RIP: 0010:iptfs_output_collect+0x263/0x57b
> > > > > [ 78.677596] Code: 73 70 0f 84 25 01 00 00 45 39 f4 0f 83 1c 01 00 00 48 8d 7b
> > > > > 10 e8 27 37 62 ff 4c 8b 73 10 49 8d be 08 01 00 00 e8 17 37 62 ff <4d> 8b b6 08
> > > > > 01 00 00 49 8d be b0 01 00 00 e8 04 37 62 ff 49 8b 86
> > > > > [ 78.679768] RSP: 0018:ffffc90000d679c8 EFLAGS: 00010296
> > > > > [ 78.680410] RAX: 0000000000000001 RBX: ffff888110ffbc80 RCX: fffffbfff07623ad
> > > > > [ 78.681264] RDX: fffffbfff07623ad RSI: fffffbfff07623ad RDI: ffffffff83b11d60
> > > > > [ 78.682136] RBP: ffff88810e3a1400 R08: 0000000000000008 R09: 0000000000000001
> > > > > [ 78.682997] R10: ffffffff83b11d67 R11: fffffbfff07623ac R12: 00000000000005a2
> > > > > [ 78.683853] R13: 0000000000000000 R14: 0000000000000000 R15: ffff88810e9a3401
> > > > > [ 78.684710] FS: 00007fed3dbddc40(0000) GS:ffffffff82cb2000(0000) knlGS:0000000000000000
> > > > > [ 78.685675] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > > [ 78.686387] CR2: 0000000000000108 CR3: 0000000110e84000 CR4: 0000000000350ef0
> > > > > [ 78.687246] Kernel panic - not syncing: Fatal exception in interrupt
> > > > > [ 78.688014] Kernel Offset: disabled
> > > > > [ 78.688460] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
> > > > >
> > > > > ping -s 2000 -n -q -W 1 -c 2 -I 192.0.3.254 192.0.2.254
> > > > >
> > > > > (gdb) list *iptfs_output_collect+0x263
> > > > > 0xffffffff81d5076f is in iptfs_output_collect (./include/net/net_namespace.h:383).
> > > > > 378 }
> > > > > 379
> > > > > 380 static inline struct net *read_pnet(const possible_net_t *pnet)
> > > > > 381 {
> > > > > 382 #ifdef CONFIG_NET_NS
> > > > > 383 return rcu_dereference_protected(pnet->net, true);
> > > > > 384 #else
> > > > > 385 return &init_net;
> > > > > 386 #endif
> > > > > 387 }
> > > > >
> > > > > I suspect actual crash is from the line 1756 instead,
> > > > > (gdb) list *iptfs_output_collect+0x256
> > > > > 0xffffffff81d50762 is in iptfs_output_collect (net/xfrm/xfrm_iptfs.c:1756).
> > > > > 1751 return 0;
> > > > > 1752
> > > > > 1753 /* We only send ICMP too big if the user has configured us as
> > > > > 1754 * dont-fragment.
> > > > > 1755 */
> > > > > 1756 XFRM_INC_STATS(dev_net(skb->dev), LINUX_MIB_XFRMOUTERROR);
> > > > > 1757
> > > > > 1758 if (sk) {
> > > > > 1759 xfrm_local_error(skb, pmtu);
> > > > > 1760 } else if (ip_hdr(skb)->version == 4) {
> > > > >
> > > > > > Later I ran gdb on iptfs_is_too_big, which is called twice; the second
> > > > > > time it crashes.
> > > > > > Here is the gdb bt, just before the crash:
> > > > >
> > > > > #0 iptfs_is_too_big (pmtu=1442, skb=0xffff88810dbea3c0, sk=0xffff888104d4ed40) at net/xfrm/xfrm_iptfs.c:1756
> > > > > #1 iptfs_output_collect (net=<optimized out>, sk=0xffff888104d4ed40, skb=0xffff88810dbea3c0) at net/xfrm/xfrm_iptfs.c:1847
> > > > > #2 0xffffffff81c8a3cb in ip_send_skb (net=0xffffffff83e57f20 <init_net>, skb=0xffff88810dbea3c0)
> > > > > at net/ipv4/ip_output.c:1492
> > > > > #3 0xffffffff81c8a439 in ip_push_pending_frames (sk=sk@entry=0xffff888104d4ed40, fl4=fl4@entry=0xffffc90000e3fb90)
> > > > > at net/ipv4/ip_output.c:1512
> > > > > #4 0xffffffff81ccf3cf in raw_sendmsg (sk=0xffff888104d4ed40, msg=0xffffc90000e3fd80, len=<optimized out>)
> > > > > at net/ipv4/raw.c:654
> > > > > #5 0xffffffff81b096ea in sock_sendmsg_nosec (sock=sock@entry=0xffff888115136040, msg=msg@entry=0xffffc90000e3fd80)
> > > > > at net/socket.c:730
> > > > > #6 0xffffffff81b0c327 in __sock_sendmsg (msg=0xffffc90000e3fd80, sock=0xffff888115136040) at net/socket.c:745
> > > > > #7 __sys_sendto (fd=<optimized out>, buff=buff@entry=0x558edefb73c0, len=len@entry=2008, flags=flags@entry=0,
> > > > > addr=addr@entry=0x558edefb35c0, addr_len=addr_len@entry=16) at net/socket.c:2191
> > > > > #8 0xffffffff81b0c40c in __do_sys_sendto (addr_len=16, addr=0x558edefb35c0, flags=0, len=2008, buff=0x558edefb73c0,
> > > > > fd=<optimized out>) at net/socket.c:2203
> > > > > #9 __se_sys_sendto (addr_len=16, addr=94072114722240, flags=0, len=2008, buff=94072114738112, fd=<optimized out>)
> > > > > at net/socket.c:2199
> > > > >
> > > > > > (gdb) list
> > > > > 1751 return 0;
> > > > > 1752
> > > > > 1753 /* We only send ICMP too big if the user has configured us as
> > > > > 1754 * dont-fragment.
> > > > > 1755 */
> > > > > 1756 XFRM_INC_STATS(dev_net(skb->dev), LINUX_MIB_XFRMOUTERROR);
> > > > > 1757
> > > > > 1758 if (sk) {
> > > > > 1759 xfrm_local_error(skb, pmtu);
> > > > > 1760 } else if (ip_hdr(skb)->version == 4) {
> > > > >
> > > > > -antony
> > > >
> > >
> > > [2. text/plain; .config]...
> >
> > [[End of PGP Signed Part]]
>
> a
> --
> Devel mailing list
> Devel@linux-ipsec.org
> https://linux-ipsec.org/mailman/listinfo/devel
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [devel-ipsec] [PATCH ipsec-next v2 0/17] Add IP-TFS mode to xfrm
2024-06-11 6:24 ` Antony Antony
@ 2024-06-17 15:17 ` Christian Hopps
2024-06-17 15:39 ` Nicolas Dichtel
0 siblings, 1 reply; 34+ messages in thread
From: Christian Hopps @ 2024-06-17 15:17 UTC (permalink / raw)
To: Antony Antony; +Cc: devel, Steffen Klassert, netdev, Christian Hopps
Very sorry, it appears that when I did the git history cleanup, the fix for the dont-frag toobig case was removed. I will get the fix restored and a new patch posted.
Thanks,
Chris.
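
A minimal sketch of the guard described in the quoted message below (checking skb->dev for NULL before touching the error stats), written as a small userspace model rather than kernel code. The struct layouts, the xfrm_out_error counter and the helper name too_big_guarded() are hypothetical stand-ins for illustration only; this is not the actual patch, which may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; only the fields used here. */
struct net { unsigned long xfrm_out_error; };
struct net_device { struct net *nd_net; };
struct sk_buff { struct net_device *dev; unsigned int len; };

/* Model of dev_net(): it dereferences dev, so dev must not be NULL. */
static struct net *dev_net(const struct net_device *dev)
{
	return dev->nd_net;
}

/*
 * Returns true when the packet is larger than pmtu. The error counter is
 * only incremented when skb->dev is set; a locally generated skb may have
 * no net_device yet, which is the case that crashed in the report.
 */
static bool too_big_guarded(struct sk_buff *skb, unsigned int pmtu)
{
	assert(skb != NULL);

	if (skb->len <= pmtu)
		return false;

	if (skb->dev)
		dev_net(skb->dev)->xfrm_out_error++;

	return true;
}
```

With this shape, a locally generated packet (dev == NULL) is still reported as too big, but the stats increment is skipped instead of dereferencing a NULL device.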
> On Jun 11, 2024, at 2:24 AM, Antony Antony via Devel <devel@linux-ipsec.org> wrote:
>
> On Sat, May 25, 2024 at 01:55:01AM -0400, Christian Hopps via Devel wrote:
>>
>> Found. This was happening b/c the skb was locally generated on the gateway and so had no net_device. Fixed by checking for skb->dev == NULL before incrementing the error stats in the output path.
>>
>> Thanks!
>> Chris.
>
> I tried v3 quickly and I still see a kernel oops on misconfigurations. Did you
> test with IP-TFS DF enabled? I understand IP-TFS DF is a misconfiguration.
> However, it should not cause a null-ptr-deref.
>
> I suspect there is another misconfiguration that is causing a null-ptr-deref:
> sysctl -w net.core.xfrm_iptfs_maxqsize=0
>
> I will try later this week to get you the crash message.
>
> -antony
>
>>
>> Christian Hopps <chopps@chopps.org> writes:
>>
>>> [[PGP Signed Part:Good signature from 2E1D830ED7B83025 Christian Hopps <chopps@gmail.com> (trust ultimate) created at 2024-05-24T08:08:58-0400 using RSA]]
>>>
>>> This is very helpful thanks.
>>>
>>> I think the tunnel endpoints are east/west 192.1.2.{23,45}, but I can't determine the north/east endpoints b/c they don't appear connected. :)
>>>
>>> Are there any other iptfs options? The code you highlight mentions the `dont-frag` option, but I wonder if you actually have that enabled?
>>>
>>> It also seems like you are pinging and forcing the source IP of a red interface
>>> on the tunnel endpoint gateway directly (so that it doesn't try and use the
>>> black interface I would guess) is that correct?
>>>
>>> Thanks!
>>> Chris.
>>>
>>> P.S. the addresses on the NIC host in the picture seem reversed, but this doesn't seem relevant to this test :)
>>>
>>> Antony Antony <antony@phenome.org> writes:
>>>
>>>> On Thu, May 23, 2024 at 07:04:58PM -0400, Christian Hopps wrote:
>>>>>
>>>>> Could you let me know some more details about this test? What is your interface config / topology?. I tried to guess given the ping command but it's not replicating for me.
>>>>
>>>> I am using the Libreswan testing topology. However, I am running the test manually.
>>>> Yesterday the tunnel was between north and east. This morning I quickly tried
>>>> between west and east. Just two VMs. I see the same issue there too.
>>>>
>>>> https://libreswan.org/wiki/images/f/f1/Testnet-202102.png
>>>>
>>>> I am using CONFIG_ESP_OFFLOAD. That is the only thing standing out. Besides that, it
>>>> is just a 1500 MTU tunnel using qemu/kvm and a tap network.
>>>>
>>>> attached is my kernel .config
>>>>
>>>>> PS, I've changed the subject and In-reply-to to be based on the corrected
>>>>> cover-letter I sent, I initially sent the cover letter with the wrong
>>>>> subject. :(
>>>>
>>>> I noticed a second cover letter. However, it was not showing as related to
>>>> the patch set correctly. It showed up as a different thread. That is why I
>>>> replied to the initial one.
>>>>
>>>> -antony
>>>>>
>>>>>
>>>>> Antony Antony <antony@phenome.org> writes:
>>>>>
>>>>>> Hi Chris,
>>>>>>
>>>>>> On Mon, May 20, 2024 at 05:42:38PM -0400, Christian Hopps via Devel wrote:
>>>>>>> From: Christian Hopps <chopps@labn.net>
>>>>>>> - iptfs: remove some BUG_ON() assertions questioned in review.
>>>>>
>>>>> ...
>>>>>
>>>>>> I ran a couple of tests and it hit a KASAN BUG.
>>>>>>
>>>>>> I was sending large ping while MTU is 1500.
>>>>>>
>>>>>> north login: shed systemd-user-sessions.service - Permit User Sessions.
>>>>>> north login: [ 78.594770] ==================================================================
>>>>>> [ 78.595825] BUG: KASAN: null-ptr-deref in iptfs_output_collect+0x263/0x57b
>>>>>> [ 78.596658] Read of size 8 at addr 0000000000000108 by task ping/493
>>>>>> [ 78.597435] ng rpc-statd-notify.service - Notify NFS peers of a restart...
>>>>>> [ 78.597651] CPU: 0 PID: 493 Comm: ping Not tainted 6.9.0-rc2-00697-g489ca863e24f-dirty #11
>>>>>> [ 78.598645] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
>>>>>> [ 78.599747] Call Trace:tty@ttyS2.service - Serial Getty on ttyS2.
>>>>>> [ 78.600070] <TASK>l-getty@ttyS3.service - Serial Getty on ttyS3.
>>>>>> [ 78.600354] dump_stack_lvl+0x2a/0x3bogin Prompts.
>>>>>> [ 78.600817] kasan_report+0x84/0xa6rvice - Hostname Service...
>>>>>> [ 78.601262] ? iptfs_output_collect+0x263/0x57bl server.
>>>>>> [ 78.601825] iptfs_output_collect+0x263/0x57bogin Management.
>>>>>> [ 78.602374] ip_send_skb+0x25/0x57vice - Notify NFS peers of a restart.
>>>>>> [ 78.602807] raw_sendmsg+0xee8/0x1011t - Multi-User System.
>>>>>> [ 78.603269] ? native_flush_tlb_one_user+0xd/0xe5e Service.
>>>>>> [ 78.603850] ? raw_hash_sk+0x21b/0x21b
>>>>>> [ 78.604331] ? kernel_init_pages+0x42/0x51
>>>>>> [ 78.604845] ? prep_new_page+0x44/0x51Re…line ext4 Metadata Check Snapshots.
>>>>>> [ 78.605318] ? get_page_from_freelist+0x72b/0x915 Interface.
>>>>>> [ 78.605903] ? signal_pending_state+0x77/0x77cord Runlevel Change in UTMP...
>>>>>> [ 78.606462] ? __might_resched+0x8a/0x240e - Record Runlevel Change in UTMP.
>>>>>> [ 78.606966] ? __might_sleep+0x25/0xa0
>>>>>> [ 78.607440] ? first_zones_zonelist+0x2c/0x43
>>>>>> [ 78.607985] ? __rcu_read_lock+0x2d/0x3a
>>>>>> [ 78.608479] ? __pte_offset_map+0x32/0xa4
>>>>>> [ 78.608979] ? __might_resched+0x8a/0x240
>>>>>> [ 78.609478] ? __might_sleep+0x25/0xa0
>>>>>> [ 78.609949] ? inet_send_prepare+0x54/0x54
>>>>>> [ 78.610464] ? sock_sendmsg_nosec+0x42/0x6c
>>>>>> [ 78.610984] sock_sendmsg_nosec+0x42/0x6c
>>>>>> [ 78.611485] __sys_sendto+0x15d/0x1cc
>>>>>> [ 78.611947] ? __x64_sys_getpeername+0x44/0x44
>>>>>> [ 78.612498] ? __handle_mm_fault+0x679/0xae4
>>>>>> [ 78.613033] ? find_vma+0x6b/0x8b
>>>>>> [ 78.613457] ? find_vma_intersection+0x8a/0x8a
>>>>>> [ 78.614006] ? __handle_irq_event_percpu+0x180/0x197
>>>>>> [ 78.614617] ? handle_mm_fault+0x38/0x154
>>>>>> [ 78.615114] ? handle_mm_fault+0xeb/0x154
>>>>>> [ 78.615620] ? preempt_latency_start+0x29/0x34
>>>>>> [ 78.616169] ? preempt_count_sub+0x14/0xb3
>>>>>> [ 78.616678] ? up_read+0x4b/0x5c
>>>>>> [ 78.617094] __x64_sys_sendto+0x76/0x82
>>>>>> [ 78.617577] do_syscall_64+0x6b/0xd7
>>>>>> [ 78.618043] entry_SYSCALL_64_after_hwframe+0x46/0x4e
>>>>>> [ 78.618667] RIP: 0033:0x7fed3de99a73
>>>>>> [ 78.619118] Code: 8b 15 a9 83 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b8
>>>>>> 0f 1f 00 80 3d 71 0b 0d 00 00 41 89 ca 74 14 b8 2c 00 00 00 0f 05 <48> 3d 00 f0
>>>>>> ff ff 77 75 c3 0f 1f 40 00 55 48 83 ec 30 44 89 4c 24
>>>>>> [ 78.621291] RSP: 002b:00007ffff6bdf478 EFLAGS: 00000202 ORIG_RAX: 000000000000002c
>>>>>> [ 78.622205] RAX: ffffffffffffffda RBX: 000055c538159340 RCX: 00007fed3de99a73
>>>>>> [ 78.623056] RDX: 00000000000007d8 RSI: 000055c53815f3c0 RDI: 0000000000000003
>>>>>> [ 78.623908] RBP: 000055c53815f3c0 R08: 000055c53815b5c0 R09: 0000000000000010
>>>>>> [ 78.624765] R10: 0000000000000000 R11: 0000000000000202 R12: 00000000000007d8
>>>>>> [ 78.625619] R13: 00007ffff6be0b60 R14: 0000001d00000001 R15: 000055c53815c680
>>>>>> [ 78.626480] </TASK>
>>>>>> [ 78.626773] ==================================================================
>>>>>> [ 78.627656] Disabling lock debugging due to kernel taint
>>>>>> [ 78.628305] BUG: kernel NULL pointer dereference, address: 0000000000000108
>>>>>> [ 78.629136] #PF: supervisor read access in kernel mode
>>>>>> [ 78.629766] #PF: error_code(0x0000) - not-present page
>>>>>> [ 78.630402] PGD 0 P4D 0
>>>>>> [ 78.630739] Oops: 0000 [#1] PREEMPT DEBUG_PAGEALLOC KASAN
>>>>>> [ 78.631398] CPU: 0 PID: 493 Comm: ping Tainted: G B 6.9.0-rc2-00697-g489ca863e24f-dirty #11
>>>>>> [ 78.632548] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
>>>>>> [ 78.633649] RIP: 0010:iptfs_output_collect+0x263/0x57b
>>>>>> [ 78.634283] Code: 73 70 0f 84 25 01 00 00 45 39 f4 0f 83 1c 01 00 00 48 8d 7b
>>>>>> 10 e8 27 37 62 ff 4c 8b 73 10 49 8d be 08 01 00 00 e8 17 37 62 ff <4d> 8b b6 08
>>>>>> 01 00 00 49 8d be b0 01 00 00 e8 04 37 62 ff 49 8b 86
>>>>>> [ 78.636444] RSP: 0018:ffffc90000d679c8 EFLAGS: 00010296
>>>>>> [ 78.637076] RAX: 0000000000000001 RBX: ffff888110ffbc80 RCX: fffffbfff07623ad
>>>>>> [ 78.637923] RDX: fffffbfff07623ad RSI: fffffbfff07623ad RDI: ffffffff83b11d60
>>>>>> [ 78.638792] RBP: ffff88810e3a1400 R08: 0000000000000008 R09: 0000000000000001
>>>>>> [ 78.639645] R10: ffffffff83b11d67 R11: fffffbfff07623ac R12: 00000000000005a2
>>>>>> [ 78.640498] R13: 0000000000000000 R14: 0000000000000000 R15: ffff88810e9a3401
>>>>>> [ 78.641359] FS: 00007fed3dbddc40(0000) GS:ffffffff82cb2000(0000) knlGS:0000000000000000
>>>>>> [ 78.642324] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>>> [ 78.643022] CR2: 0000000000000108 CR3: 0000000110e84000 CR4: 0000000000350ef0
>>>>>> [ 78.643882] Call Trace:
>>>>>> [ 78.644204] <TASK>
>>>>>> [ 78.644487] ? __die_body+0x1a/0x56
>>>>>> [ 78.644929] ? page_fault_oops+0x45f/0x4cd
>>>>>> [ 78.645441] ? dump_pagetable+0x1db/0x1db
>>>>>> [ 78.645942] ? vprintk_emit+0x163/0x171
>>>>>> [ 78.646425] ? iptfs_output_collect+0x263/0x57b
>>>>>> [ 78.646986] ? _printk+0xb2/0xe1
>>>>>> [ 78.647401] ? find_first_fitting_seq+0x193/0x193
>>>>>> [ 78.647982] ? iptfs_output_collect+0x263/0x57b
>>>>>> [ 78.648541] ? do_user_addr_fault+0x14f/0x56c
>>>>>> [ 78.649084] ? exc_page_fault+0xa5/0xbe
>>>>>> [ 78.649566] ? asm_exc_page_fault+0x22/0x30
>>>>>> [ 78.650100] ? iptfs_output_collect+0x263/0x57b
>>>>>> [ 78.650660] ? iptfs_output_collect+0x263/0x57b
>>>>>> [ 78.651221] ip_send_skb+0x25/0x57
>>>>>> [ 78.651652] raw_sendmsg+0xee8/0x1011
>>>>>> [ 78.652113] ? native_flush_tlb_one_user+0xd/0xe5
>>>>>> [ 78.652693] ? raw_hash_sk+0x21b/0x21b
>>>>>> [ 78.653166] ? kernel_init_pages+0x42/0x51
>>>>>> [ 78.653683] ? prep_new_page+0x44/0x51
>>>>>> [ 78.654160] ? get_page_from_freelist+0x72b/0x915
>>>>>> [ 78.654739] ? signal_pending_state+0x77/0x77
>>>>>> [ 78.655284] ? __might_resched+0x8a/0x240
>>>>>> [ 78.655784] ? __might_sleep+0x25/0xa0
>>>>>> [ 78.656255] ? first_zones_zonelist+0x2c/0x43
>>>>>> [ 78.656798] ? __rcu_read_lock+0x2d/0x3a
>>>>>> [ 78.657289] ? __pte_offset_map+0x32/0xa4
>>>>>> [ 78.657788] ? __might_resched+0x8a/0x240
>>>>>> [ 78.658291] ? __might_sleep+0x25/0xa0
>>>>>> [ 78.658763] ? inet_send_prepare+0x54/0x54
>>>>>> [ 78.659272] ? sock_sendmsg_nosec+0x42/0x6c
>>>>>> [ 78.659791] sock_sendmsg_nosec+0x42/0x6c
>>>>>> [ 78.660293] __sys_sendto+0x15d/0x1cc
>>>>>> [ 78.660755] ? __x64_sys_getpeername+0x44/0x44
>>>>>> [ 78.661304] ? __handle_mm_fault+0x679/0xae4
>>>>>> [ 78.661838] ? find_vma+0x6b/0x8b
>>>>>> [ 78.662272] ? find_vma_intersection+0x8a/0x8a
>>>>>> [ 78.662828] ? __handle_irq_event_percpu+0x180/0x197
>>>>>> [ 78.663436] ? handle_mm_fault+0x38/0x154
>>>>>> [ 78.663935] ? handle_mm_fault+0xeb/0x154
>>>>>> [ 78.664435] ? preempt_latency_start+0x29/0x34
>>>>>> [ 78.664987] ? preempt_count_sub+0x14/0xb3
>>>>>> [ 78.665498] ? up_read+0x4b/0x5c
>>>>>> [ 78.665911] __x64_sys_sendto+0x76/0x82
>>>>>> [ 78.666398] do_syscall_64+0x6b/0xd7
>>>>>> [ 78.666849] entry_SYSCALL_64_after_hwframe+0x46/0x4e
>>>>>> [ 78.667466] RIP: 0033:0x7fed3de99a73
>>>>>> [ 78.667918] Code: 8b 15 a9 83 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b8
>>>>>> 0f 1f 00 80 3d 71 0b 0d 00 00 41 89 ca 74 14 b8 2c 00 00 00 0f 05 <48> 3d 00 f0
>>>>>> ff ff 77 75 c3 0f 1f 40 00 55 48 83 ec 30 44 89 4c 24
>>>>>> [ 78.670097] RSP: 002b:00007ffff6bdf478 EFLAGS: 00000202 ORIG_RAX: 000000000000002c
>>>>>> [ 78.671002] RAX: ffffffffffffffda RBX: 000055c538159340 RCX: 00007fed3de99a73
>>>>>> [ 78.671858] RDX: 00000000000007d8 RSI: 000055c53815f3c0 RDI: 0000000000000003
>>>>>> [ 78.672708] RBP: 000055c53815f3c0 R08: 000055c53815b5c0 R09: 0000000000000010
>>>>>> [ 78.673564] R10: 0000000000000000 R11: 0000000000000202 R12: 00000000000007d8
>>>>>> [ 78.674430] R13: 00007ffff6be0b60 R14: 0000001d00000001 R15: 000055c53815c680
>>>>>> [ 78.675287] </TASK>
>>>>>> [ 78.675580] Modules linked in:
>>>>>> [ 78.675975] CR2: 0000000000000108
>>>>>> [ 78.676396] ---[ end trace 0000000000000000 ]---
>>>>>> [ 78.676966] RIP: 0010:iptfs_output_collect+0x263/0x57b
>>>>>> [ 78.677596] Code: 73 70 0f 84 25 01 00 00 45 39 f4 0f 83 1c 01 00 00 48 8d 7b
>>>>>> 10 e8 27 37 62 ff 4c 8b 73 10 49 8d be 08 01 00 00 e8 17 37 62 ff <4d> 8b b6 08
>>>>>> 01 00 00 49 8d be b0 01 00 00 e8 04 37 62 ff 49 8b 86
>>>>>> [ 78.679768] RSP: 0018:ffffc90000d679c8 EFLAGS: 00010296
>>>>>> [ 78.680410] RAX: 0000000000000001 RBX: ffff888110ffbc80 RCX: fffffbfff07623ad
>>>>>> [ 78.681264] RDX: fffffbfff07623ad RSI: fffffbfff07623ad RDI: ffffffff83b11d60
>>>>>> [ 78.682136] RBP: ffff88810e3a1400 R08: 0000000000000008 R09: 0000000000000001
>>>>>> [ 78.682997] R10: ffffffff83b11d67 R11: fffffbfff07623ac R12: 00000000000005a2
>>>>>> [ 78.683853] R13: 0000000000000000 R14: 0000000000000000 R15: ffff88810e9a3401
>>>>>> [ 78.684710] FS: 00007fed3dbddc40(0000) GS:ffffffff82cb2000(0000) knlGS:0000000000000000
>>>>>> [ 78.685675] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>>> [ 78.686387] CR2: 0000000000000108 CR3: 0000000110e84000 CR4: 0000000000350ef0
>>>>>> [ 78.687246] Kernel panic - not syncing: Fatal exception in interrupt
>>>>>> [ 78.688014] Kernel Offset: disabled
>>>>>> [ 78.688460] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
>>>>>>
>>>>>> ping -s 2000 -n -q -W 1 -c 2 -I 192.0.3.254 192.0.2.254
>>>>>>
>>>>>> (gdb) list *iptfs_output_collect+0x263
>>>>>> 0xffffffff81d5076f is in iptfs_output_collect (./include/net/net_namespace.h:383).
>>>>>> 378 }
>>>>>> 379
>>>>>> 380 static inline struct net *read_pnet(const possible_net_t *pnet)
>>>>>> 381 {
>>>>>> 382 #ifdef CONFIG_NET_NS
>>>>>> 383 return rcu_dereference_protected(pnet->net, true);
>>>>>> 384 #else
>>>>>> 385 return &init_net;
>>>>>> 386 #endif
>>>>>> 387 }
>>>>>>
>>>>>> I suspect the actual crash comes from line 1756 instead:
>>>>>> (gdb) list *iptfs_output_collect+0x256
>>>>>> 0xffffffff81d50762 is in iptfs_output_collect (net/xfrm/xfrm_iptfs.c:1756).
>>>>>> 1751 return 0;
>>>>>> 1752
>>>>>> 1753 /* We only send ICMP too big if the user has configured us as
>>>>>> 1754 * dont-fragment.
>>>>>> 1755 */
>>>>>> 1756 XFRM_INC_STATS(dev_net(skb->dev), LINUX_MIB_XFRMOUTERROR);
>>>>>> 1757
>>>>>> 1758 if (sk) {
>>>>>> 1759 xfrm_local_error(skb, pmtu);
>>>>>> 1760 } else if (ip_hdr(skb)->version == 4) {
>>>>>>
>>>>>> Later I traced iptfs_is_too_big in gdb; it is called twice, and the
>>>>>> second call crashes.
>>>>>> Here is the gdb backtrace from just before the crash:
>>>>>>
>>>>>> #0 iptfs_is_too_big (pmtu=1442, skb=0xffff88810dbea3c0, sk=0xffff888104d4ed40) at net/xfrm/xfrm_iptfs.c:1756
>>>>>> #1 iptfs_output_collect (net=<optimized out>, sk=0xffff888104d4ed40, skb=0xffff88810dbea3c0) at net/xfrm/xfrm_iptfs.c:1847
>>>>>> #2 0xffffffff81c8a3cb in ip_send_skb (net=0xffffffff83e57f20 <init_net>, skb=0xffff88810dbea3c0)
>>>>>> at net/ipv4/ip_output.c:1492
>>>>>> #3 0xffffffff81c8a439 in ip_push_pending_frames (sk=sk@entry=0xffff888104d4ed40, fl4=fl4@entry=0xffffc90000e3fb90)
>>>>>> at net/ipv4/ip_output.c:1512
>>>>>> #4 0xffffffff81ccf3cf in raw_sendmsg (sk=0xffff888104d4ed40, msg=0xffffc90000e3fd80, len=<optimized out>)
>>>>>> at net/ipv4/raw.c:654
>>>>>> #5 0xffffffff81b096ea in sock_sendmsg_nosec (sock=sock@entry=0xffff888115136040, msg=msg@entry=0xffffc90000e3fd80)
>>>>>> at net/socket.c:730
>>>>>> #6 0xffffffff81b0c327 in __sock_sendmsg (msg=0xffffc90000e3fd80, sock=0xffff888115136040) at net/socket.c:745
>>>>>> #7 __sys_sendto (fd=<optimized out>, buff=buff@entry=0x558edefb73c0, len=len@entry=2008, flags=flags@entry=0,
>>>>>> addr=addr@entry=0x558edefb35c0, addr_len=addr_len@entry=16) at net/socket.c:2191
>>>>>> #8 0xffffffff81b0c40c in __do_sys_sendto (addr_len=16, addr=0x558edefb35c0, flags=0, len=2008, buff=0x558edefb73c0,
>>>>>> fd=<optimized out>) at net/socket.c:2203
>>>>>> #9 __se_sys_sendto (addr_len=16, addr=94072114722240, flags=0, len=2008, buff=94072114738112, fd=<optimized out>)
>>>>>> at net/socket.c:2199
>>>>>>
>>>>>> (gdb) list
>>>>>> 1751 return 0;
>>>>>> 1752
>>>>>> 1753 /* We only send ICMP too big if the user has configured us as
>>>>>> 1754 * dont-fragment.
>>>>>> 1755 */
>>>>>> 1756 XFRM_INC_STATS(dev_net(skb->dev), LINUX_MIB_XFRMOUTERROR);
>>>>>> 1757
>>>>>> 1758 if (sk) {
>>>>>> 1759 xfrm_local_error(skb, pmtu);
>>>>>> 1760 } else if (ip_hdr(skb)->version == 4) {
>>>>>>
>>>>>> -antony
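
The register dump above appears consistent with skb->dev being NULL at line 1756: the bytes "4c 8b 73 10" before the faulting instruction decode to a load from 0x10(%rbx) into %r14, the faulting "<4d> 8b b6 08 01 00 00" is a load from 0x108(%r14), R14 is 0, and CR2 is 0x108. The user-space sketch below only reproduces that arithmetic. The struct layouts are hypothetical stand-ins padded to the offsets seen in this oops, not the kernel definitions, and stats_net() is an illustrative guard, not the fix referred to in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space stand-ins; these are NOT the kernel structs. */
struct net {
	int id;
};

struct net_device {
	char pad[0x108];        /* assumption: puts nd_net at the offset in CR2 */
	struct net *nd_net;
};

struct sk_buff {
	char pad[0x10];         /* assumption: dev is loaded from 0x10(%rbx) */
	struct net_device *dev;
};

/* The layout assumption the rest of the sketch relies on. */
static_assert(offsetof(struct net_device, nd_net) == 0x108,
	      "nd_net must sit at the offset seen in the oops");

/* Address that a load of dev->nd_net would touch, without dereferencing. */
static uintptr_t nd_net_load_addr(const struct net_device *dev)
{
	return (uintptr_t)dev + offsetof(struct net_device, nd_net);
}

/*
 * Hypothetical guard: take the namespace from the device only when the
 * skb has one attached, otherwise use a caller-supplied fallback.
 */
static struct net *stats_net(const struct sk_buff *skb, struct net *fallback)
{
	return skb->dev ? skb->dev->nd_net : fallback;
}
```

With dev set to NULL, nd_net_load_addr() yields 0x108, the address reported by KASAN and in CR2, which is why the fault points at the read_pnet() inline rather than at the XFRM_INC_STATS() call site itself.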
>>>>>
>>>>
>>>> [2. text/plain; .config]...
>>>
>>> [[End of PGP Signed Part]]
>>
>
>
>
>> --
>> Devel mailing list
>> Devel@linux-ipsec.org
>> https://linux-ipsec.org/mailman/listinfo/devel
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [devel-ipsec] [PATCH ipsec-next v2 0/17] Add IP-TFS mode to xfrm
2024-06-17 15:17 ` Christian Hopps
@ 2024-06-17 15:39 ` Nicolas Dichtel
2024-06-17 16:05 ` Christian Hopps
0 siblings, 1 reply; 34+ messages in thread
From: Nicolas Dichtel @ 2024-06-17 15:39 UTC (permalink / raw)
To: Christian Hopps, Antony Antony
Cc: devel, Steffen Klassert, netdev, Christian Hopps
Le 17/06/2024 à 17:17, Christian Hopps via Devel a écrit :
> Very sorry, it appears that when I did git history cleanup, the fix for the dont-frag toobig case was removed. I will get the fix restored and new patch posted.
Please, don't top-post.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/submitting-patches.rst#n338
Regards,
Nicolas
>
> Thanks,
> Chris.
>
>> On Jun 11, 2024, at 2:24 AM, Antony Antony via Devel <devel@linux-ipsec.org> wrote:
>>
>> On Sat, May 25, 2024 at 01:55:01AM -0400, Christian Hopps via Devel wrote:
[snip]
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [devel-ipsec] [PATCH ipsec-next v2 0/17] Add IP-TFS mode to xfrm
2024-06-17 15:39 ` Nicolas Dichtel
@ 2024-06-17 16:05 ` Christian Hopps
2024-06-17 21:24 ` Nicolas Dichtel
0 siblings, 1 reply; 34+ messages in thread
From: Christian Hopps @ 2024-06-17 16:05 UTC (permalink / raw)
To: nicolas.dichtel
Cc: Antony Antony, devel, Steffen Klassert, netdev, Christian Hopps
> On Jun 17, 2024, at 11:39 AM, Nicolas Dichtel via Devel <devel@linux-ipsec.org> wrote:
>
> Le 17/06/2024 à 17:17, Christian Hopps via Devel a écrit :
>> Very sorry, it appears that when I did git history cleanup, the fix for the dont-frag toobig case was removed. I will get the fix restored and new patch posted.
> Please, don't top-post.
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/submitting-patches.rst#n338
Yes, sorry about that, I normally don’t but was replying from a hospital bed — things are a bit jumbled currently. :)
Thanks,
Chris.
>
> Regards,
> Nicolas
>
>>
>> Thanks,
>> Chris.
>>
>>> On Jun 11, 2024, at 2:24 AM, Antony Antony via Devel <devel@linux-ipsec.org> wrote:
>>>
>>> On Sat, May 25, 2024 at 01:55:01AM -0400, Christian Hopps via Devel wrote:
>
> [snip]
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [devel-ipsec] [PATCH ipsec-next v2 0/17] Add IP-TFS mode to xfrm
2024-06-17 16:05 ` Christian Hopps
@ 2024-06-17 21:24 ` Nicolas Dichtel
0 siblings, 0 replies; 34+ messages in thread
From: Nicolas Dichtel @ 2024-06-17 21:24 UTC (permalink / raw)
To: Christian Hopps
Cc: Antony Antony, devel, Steffen Klassert, netdev, Christian Hopps
Le 17/06/2024 à 18:05, Christian Hopps a écrit :
>
>
>> On Jun 17, 2024, at 11:39 AM, Nicolas Dichtel via Devel <devel@linux-ipsec.org> wrote:
>>
>> Le 17/06/2024 à 17:17, Christian Hopps via Devel a écrit :
>>> Very sorry, it appears that when I did git history cleanup, the fix for the dont-frag toobig case was removed. I will get the fix restored and new patch posted.
>> Please, don't top-post.
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/submitting-patches.rst#n338
>
> Yes, sorry about that, I normally don’t but was replying from a hospital bed — things are a bit jumbled currently. :)
I didn't know that ;-)
I wish you a quick recovery.
Good luck,
Nicolas
^ permalink raw reply [flat|nested] 34+ messages in thread