netdev.vger.kernel.org archive mirror
* [PATCH net-next 0/5] inet: add drop monitor support
@ 2022-10-28 13:30 Eric Dumazet
  2022-10-28 13:30 ` [PATCH net-next 1/5] net: dropreason: add SKB_CONSUMED reason Eric Dumazet
                   ` (4 more replies)
  0 siblings, 5 replies; 8+ messages in thread
From: Eric Dumazet @ 2022-10-28 13:30 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, eric.dumazet, Eric Dumazet

IPv4 and IPv6 reassembly units are causing false kfree_skb()
notifications. It is time to deal with this issue.

The first two patches change core networking to better
handle potential skb frag_list chains with respect to
kfree_skb()/consume_skb() status.

The last three patches add three new drop reasons and
make sure that skbs which have been reassembled into
a large datagram are no longer reported as dropped.

Eric Dumazet (5):
  net: dropreason: add SKB_CONSUMED reason
  net: dropreason: propagate drop_reason to skb_release_data()
  net: dropreason: add SKB_DROP_REASON_DUP_FRAG
  net: dropreason: add SKB_DROP_REASON_FRAG_REASM_TIMEOUT
  net: dropreason: add SKB_DROP_REASON_FRAG_TOO_FAR

 include/net/dropreason.h                | 14 ++++++++++++
 include/net/inet_frag.h                 |  6 ++++-
 include/net/ipv6_frag.h                 |  3 ++-
 net/core/skbuff.c                       | 30 ++++++++++++++-----------
 net/ipv4/inet_fragment.c                | 14 ++++++++----
 net/ipv4/ip_fragment.c                  | 19 +++++++++++-----
 net/ipv6/netfilter/nf_conntrack_reasm.c |  2 +-
 net/ipv6/reassembly.c                   | 13 +++++++----
 8 files changed, 71 insertions(+), 30 deletions(-)

-- 
2.38.1.273.g43a17bfeac-goog



* [PATCH net-next 1/5] net: dropreason: add SKB_CONSUMED reason
  2022-10-28 13:30 [PATCH net-next 0/5] inet: add drop monitor support Eric Dumazet
@ 2022-10-28 13:30 ` Eric Dumazet
  2022-10-28 13:30 ` [PATCH net-next 2/5] net: dropreason: propagate drop_reason to skb_release_data() Eric Dumazet
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 8+ messages in thread
From: Eric Dumazet @ 2022-10-28 13:30 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, eric.dumazet, Eric Dumazet

In the future, this will allow callers to simply use:

	kfree_skb_reason(skb, reason);

instead of repeating sequences like:

	if (dropped)
	    kfree_skb_reason(skb, reason);
	else
	    consume_skb(skb);

For instance, the following patch in the series adds
@reason to skb_release_data() and skb_release_all(),
so that we can propagate a meaningful @reason whenever
consume_skb()/kfree_skb() have to take care of a potential frag_list.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 include/net/dropreason.h | 2 ++
 net/core/skbuff.c        | 6 +++++-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/include/net/dropreason.h b/include/net/dropreason.h
index c1cbcdbaf1492c6bd5a18b9af0a0e0beb071c674..0bd18c14dae0a570a150c31eeea99fe85bc734b0 100644
--- a/include/net/dropreason.h
+++ b/include/net/dropreason.h
@@ -80,6 +80,8 @@ enum skb_drop_reason {
 	 * @SKB_NOT_DROPPED_YET: skb is not dropped yet (used for no-drop case)
 	 */
 	SKB_NOT_DROPPED_YET = 0,
+	/** @SKB_CONSUMED: packet has been consumed */
+	SKB_CONSUMED,
 	/** @SKB_DROP_REASON_NOT_SPECIFIED: drop reason is not specified */
 	SKB_DROP_REASON_NOT_SPECIFIED,
 	/** @SKB_DROP_REASON_NO_SOCKET: socket not found */
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 1d84a17eada502cf0f650d261cc7de5f020afb63..7ce797cd121f395062b61b33371fb638f51e8c99 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -94,6 +94,7 @@ EXPORT_SYMBOL(sysctl_max_skb_frags);
 #undef FN
 #define FN(reason) [SKB_DROP_REASON_##reason] = #reason,
 const char * const drop_reasons[] = {
+	[SKB_CONSUMED] = "CONSUMED",
 	DEFINE_DROP_REASON(FN, FN)
 };
 EXPORT_SYMBOL(drop_reasons);
@@ -894,7 +895,10 @@ kfree_skb_reason(struct sk_buff *skb, enum skb_drop_reason reason)
 
 	DEBUG_NET_WARN_ON_ONCE(reason <= 0 || reason >= SKB_DROP_REASON_MAX);
 
-	trace_kfree_skb(skb, __builtin_return_address(0), reason);
+	if (reason == SKB_CONSUMED)
+		trace_consume_skb(skb);
+	else
+		trace_kfree_skb(skb, __builtin_return_address(0), reason);
 	__kfree_skb(skb);
 }
 EXPORT_SYMBOL(kfree_skb_reason);
-- 
2.38.1.273.g43a17bfeac-goog
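
The dispatch that patch 1/5 adds to kfree_skb_reason() can be modeled in
userspace roughly as follows. This is only a sketch under stated
assumptions: struct fake_skb, fake_kfree_skb_reason() and the two counters
are hypothetical stand-ins for struct sk_buff, the real function and the
consume_skb/kfree_skb tracepoints, the enum is a trimmed copy, and the
refcount handling of the real function is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, trimmed-down copy of the enum touched by the patch. */
enum skb_drop_reason {
	SKB_NOT_DROPPED_YET = 0,
	SKB_CONSUMED,
	SKB_DROP_REASON_NOT_SPECIFIED,
	SKB_DROP_REASON_NO_SOCKET,
	SKB_DROP_REASON_MAX,
};

/* Stand-in for struct sk_buff; only what the sketch needs. */
struct fake_skb {
	int freed;
};

/* Counters standing in for the consume_skb and kfree_skb tracepoints. */
static unsigned int consume_events;
static unsigned int drop_events;
static enum skb_drop_reason last_drop_reason;

/* Single entry point: the reason alone selects which event is recorded. */
static void fake_kfree_skb_reason(struct fake_skb *skb,
				  enum skb_drop_reason reason)
{
	assert(skb != NULL);
	assert(reason > SKB_NOT_DROPPED_YET && reason < SKB_DROP_REASON_MAX);

	if (reason == SKB_CONSUMED) {
		consume_events++;
	} else {
		drop_events++;
		last_drop_reason = reason;
	}
	skb->freed = 1;
}
```

With this shape, a caller that already holds a reason variable no longer
needs its own if/else around the free call; passing SKB_CONSUMED is enough
to keep the buffer out of the drop statistics.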



* [PATCH net-next 2/5] net: dropreason: propagate drop_reason to skb_release_data()
  2022-10-28 13:30 [PATCH net-next 0/5] inet: add drop monitor support Eric Dumazet
  2022-10-28 13:30 ` [PATCH net-next 1/5] net: dropreason: add SKB_CONSUMED reason Eric Dumazet
@ 2022-10-28 13:30 ` Eric Dumazet
  2022-10-28 13:30 ` [PATCH net-next 3/5] net: dropreason: add SKB_DROP_REASON_DUP_FRAG Eric Dumazet
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 8+ messages in thread
From: Eric Dumazet @ 2022-10-28 13:30 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, eric.dumazet, Eric Dumazet

When an skb with a frag list is consumed, we currently
pretend all skbs in the frag list were dropped.

In order to fix this, add a @reason argument to skb_release_data()
and skb_release_all().

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/core/skbuff.c | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 7ce797cd121f395062b61b33371fb638f51e8c99..42a35b59fb1e54e2e4c0ab58a6da016cef622653 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -769,7 +769,7 @@ static void skb_free_head(struct sk_buff *skb)
 	}
 }
 
-static void skb_release_data(struct sk_buff *skb)
+static void skb_release_data(struct sk_buff *skb, enum skb_drop_reason reason)
 {
 	struct skb_shared_info *shinfo = skb_shinfo(skb);
 	int i;
@@ -792,7 +792,7 @@ static void skb_release_data(struct sk_buff *skb)
 
 free_head:
 	if (shinfo->frag_list)
-		kfree_skb_list(shinfo->frag_list);
+		kfree_skb_list_reason(shinfo->frag_list, reason);
 
 	skb_free_head(skb);
 exit:
@@ -855,11 +855,11 @@ void skb_release_head_state(struct sk_buff *skb)
 }
 
 /* Free everything but the sk_buff shell. */
-static void skb_release_all(struct sk_buff *skb)
+static void skb_release_all(struct sk_buff *skb, enum skb_drop_reason reason)
 {
 	skb_release_head_state(skb);
 	if (likely(skb->head))
-		skb_release_data(skb);
+		skb_release_data(skb, reason);
 }
 
 /**
@@ -873,7 +873,7 @@ static void skb_release_all(struct sk_buff *skb)
 
 void __kfree_skb(struct sk_buff *skb)
 {
-	skb_release_all(skb);
+	skb_release_all(skb, SKB_DROP_REASON_NOT_SPECIFIED);
 	kfree_skbmem(skb);
 }
 EXPORT_SYMBOL(__kfree_skb);
@@ -1056,7 +1056,7 @@ EXPORT_SYMBOL(consume_skb);
 void __consume_stateless_skb(struct sk_buff *skb)
 {
 	trace_consume_skb(skb);
-	skb_release_data(skb);
+	skb_release_data(skb, SKB_CONSUMED);
 	kfree_skbmem(skb);
 }
 
@@ -1081,7 +1081,7 @@ static void napi_skb_cache_put(struct sk_buff *skb)
 
 void __kfree_skb_defer(struct sk_buff *skb)
 {
-	skb_release_all(skb);
+	skb_release_all(skb, SKB_DROP_REASON_NOT_SPECIFIED);
 	napi_skb_cache_put(skb);
 }
 
@@ -1119,7 +1119,7 @@ void napi_consume_skb(struct sk_buff *skb, int budget)
 		return;
 	}
 
-	skb_release_all(skb);
+	skb_release_all(skb, SKB_CONSUMED);
 	napi_skb_cache_put(skb);
 }
 EXPORT_SYMBOL(napi_consume_skb);
@@ -1250,7 +1250,7 @@ EXPORT_SYMBOL_GPL(alloc_skb_for_msg);
  */
 struct sk_buff *skb_morph(struct sk_buff *dst, struct sk_buff *src)
 {
-	skb_release_all(dst);
+	skb_release_all(dst, SKB_CONSUMED);
 	return __skb_clone(dst, src);
 }
 EXPORT_SYMBOL_GPL(skb_morph);
@@ -1873,7 +1873,7 @@ int pskb_expand_head(struct sk_buff *skb, int nhead, int ntail,
 		if (skb_has_frag_list(skb))
 			skb_clone_fraglist(skb);
 
-		skb_release_data(skb);
+		skb_release_data(skb, SKB_CONSUMED);
 	} else {
 		skb_free_head(skb);
 	}
@@ -6213,7 +6213,7 @@ static int pskb_carve_inside_header(struct sk_buff *skb, const u32 off,
 			skb_frag_ref(skb, i);
 		if (skb_has_frag_list(skb))
 			skb_clone_fraglist(skb);
-		skb_release_data(skb);
+		skb_release_data(skb, SKB_CONSUMED);
 	} else {
 		/* we can reuse existing recount- all we did was
 		 * relocate values
@@ -6356,7 +6356,7 @@ static int pskb_carve_inside_nonlinear(struct sk_buff *skb, const u32 off,
 		kfree(data);
 		return -ENOMEM;
 	}
-	skb_release_data(skb);
+	skb_release_data(skb, SKB_CONSUMED);
 
 	skb->head = data;
 	skb->head_frag = 0;
-- 
2.38.1.273.g43a17bfeac-goog
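
The effect of threading @reason down to the frag_list, as patch 2/5 does,
can be illustrated with a small userspace model. It is a simplified sketch,
not the kernel code: all fake_* names and the counters are hypothetical,
and each list member is freed directly with the caller's reason, whereas
the kernel goes through kfree_skb_list_reason() and per-skb refcounts.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, trimmed-down copy of the enum. */
enum skb_drop_reason {
	SKB_NOT_DROPPED_YET = 0,
	SKB_CONSUMED,
	SKB_DROP_REASON_NOT_SPECIFIED,
	SKB_DROP_REASON_MAX,
};

/* Stand-in for struct sk_buff: a head buffer may own a chain of frags. */
struct fake_skb {
	struct fake_skb *next;      /* next buffer in a frag_list chain */
	struct fake_skb *frag_list; /* chain owned by this buffer, or NULL */
	int freed;
};

/* Counters standing in for the consume_skb and kfree_skb tracepoints. */
static unsigned int consume_events;
static unsigned int drop_events;

static void fake_free_one(struct fake_skb *skb, enum skb_drop_reason reason);

/* Free every buffer in a chain, reporting the same reason for each one. */
static void fake_kfree_skb_list_reason(struct fake_skb *segs,
				       enum skb_drop_reason reason)
{
	while (segs) {
		struct fake_skb *next = segs->next;

		fake_free_one(segs, reason);
		segs = next;
	}
}

/* Release owned data; the caller's reason flows down to the frag_list. */
static void fake_release_data(struct fake_skb *skb,
			      enum skb_drop_reason reason)
{
	if (skb->frag_list) {
		fake_kfree_skb_list_reason(skb->frag_list, reason);
		skb->frag_list = NULL;
	}
}

/* Record one event for this buffer, then release what it owns. */
static void fake_free_one(struct fake_skb *skb, enum skb_drop_reason reason)
{
	assert(skb != NULL);
	if (reason == SKB_CONSUMED)
		consume_events++;
	else
		drop_events++;
	fake_release_data(skb, reason);
	skb->freed = 1;
}
```

Consuming a head that owns two frags then records three consume events and
no drop event, which is the behavior the commit message asks for.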



* [PATCH net-next 3/5] net: dropreason: add SKB_DROP_REASON_DUP_FRAG
  2022-10-28 13:30 [PATCH net-next 0/5] inet: add drop monitor support Eric Dumazet
  2022-10-28 13:30 ` [PATCH net-next 1/5] net: dropreason: add SKB_CONSUMED reason Eric Dumazet
  2022-10-28 13:30 ` [PATCH net-next 2/5] net: dropreason: propagate drop_reason to skb_release_data() Eric Dumazet
@ 2022-10-28 13:30 ` Eric Dumazet
  2022-10-28 13:30 ` [PATCH net-next 4/5] net: dropreason: add SKB_DROP_REASON_FRAG_REASM_TIMEOUT Eric Dumazet
  2022-10-28 13:30 ` [PATCH net-next 5/5] net: dropreason: add SKB_DROP_REASON_FRAG_TOO_FAR Eric Dumazet
  4 siblings, 0 replies; 8+ messages in thread
From: Eric Dumazet @ 2022-10-28 13:30 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, eric.dumazet, Eric Dumazet

This reason is used to track duplicate fragments that are dropped
by the various reassembly units.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 include/net/dropreason.h                |  3 +++
 net/ipv4/ip_fragment.c                  | 13 +++++++++----
 net/ipv6/netfilter/nf_conntrack_reasm.c |  2 +-
 net/ipv6/reassembly.c                   | 13 +++++++++----
 4 files changed, 22 insertions(+), 9 deletions(-)

diff --git a/include/net/dropreason.h b/include/net/dropreason.h
index 0bd18c14dae0a570a150c31eeea99fe85bc734b0..602d555a5f8392715ec03f85418ecb98926d0481 100644
--- a/include/net/dropreason.h
+++ b/include/net/dropreason.h
@@ -68,6 +68,7 @@
 	FN(IP_INADDRERRORS)		\
 	FN(IP_INNOROUTES)		\
 	FN(PKT_TOO_BIG)			\
+	FN(DUP_FRAG)			\
 	FNe(MAX)
 
 /**
@@ -300,6 +301,8 @@ enum skb_drop_reason {
 	 * MTU)
 	 */
 	SKB_DROP_REASON_PKT_TOO_BIG,
+	/** @SKB_DROP_REASON_DUP_FRAG: duplicate fragment */
+	SKB_DROP_REASON_DUP_FRAG,
 	/**
 	 * @SKB_DROP_REASON_MAX: the maximum of drop reason, which shouldn't be
 	 * used as a real 'reason'
diff --git a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c
index fb153569889ecc8541640674880ff03e8c7bf24f..676bd8d259555448457dfd98ce4316c4b549a30a 100644
--- a/net/ipv4/ip_fragment.c
+++ b/net/ipv4/ip_fragment.c
@@ -278,10 +278,14 @@ static int ip_frag_queue(struct ipq *qp, struct sk_buff *skb)
 	struct net_device *dev;
 	unsigned int fragsize;
 	int err = -ENOENT;
+	SKB_DR(reason);
 	u8 ecn;
 
-	if (qp->q.flags & INET_FRAG_COMPLETE)
+	/* If reassembly is already done, @skb must be a duplicate frag. */
+	if (qp->q.flags & INET_FRAG_COMPLETE) {
+		SKB_DR_SET(reason, DUP_FRAG);
 		goto err;
+	}
 
 	if (!(IPCB(skb)->flags & IPSKB_FRAG_COMPLETE) &&
 	    unlikely(ip_frag_too_far(qp)) &&
@@ -382,8 +386,9 @@ static int ip_frag_queue(struct ipq *qp, struct sk_buff *skb)
 
 insert_error:
 	if (err == IPFRAG_DUP) {
-		kfree_skb(skb);
-		return -EINVAL;
+		SKB_DR_SET(reason, DUP_FRAG);
+		err = -EINVAL;
+		goto err;
 	}
 	err = -EINVAL;
 	__IP_INC_STATS(net, IPSTATS_MIB_REASM_OVERLAPS);
@@ -391,7 +396,7 @@ static int ip_frag_queue(struct ipq *qp, struct sk_buff *skb)
 	inet_frag_kill(&qp->q);
 	__IP_INC_STATS(net, IPSTATS_MIB_REASMFAILS);
 err:
-	kfree_skb(skb);
+	kfree_skb_reason(skb, reason);
 	return err;
 }
 
diff --git a/net/ipv6/netfilter/nf_conntrack_reasm.c b/net/ipv6/netfilter/nf_conntrack_reasm.c
index 38db0064d6613a8472ec2835afdbf80071c1fcc2..d13240f13607bae8833d4e53471c575280ff49dc 100644
--- a/net/ipv6/netfilter/nf_conntrack_reasm.c
+++ b/net/ipv6/netfilter/nf_conntrack_reasm.c
@@ -253,7 +253,7 @@ static int nf_ct_frag6_queue(struct frag_queue *fq, struct sk_buff *skb,
 	if (err) {
 		if (err == IPFRAG_DUP) {
 			/* No error for duplicates, pretend they got queued. */
-			kfree_skb(skb);
+			kfree_skb_reason(skb, SKB_DROP_REASON_DUP_FRAG);
 			return -EINPROGRESS;
 		}
 		goto insert_error;
diff --git a/net/ipv6/reassembly.c b/net/ipv6/reassembly.c
index ff866f2a879e00769b273c22b970740eaebb6d99..5bc8a28e67f944c9e7bead79afa8b80a34b92db9 100644
--- a/net/ipv6/reassembly.c
+++ b/net/ipv6/reassembly.c
@@ -112,10 +112,14 @@ static int ip6_frag_queue(struct frag_queue *fq, struct sk_buff *skb,
 	struct sk_buff *prev_tail;
 	struct net_device *dev;
 	int err = -ENOENT;
+	SKB_DR(reason);
 	u8 ecn;
 
-	if (fq->q.flags & INET_FRAG_COMPLETE)
+	/* If reassembly is already done, @skb must be a duplicate frag. */
+	if (fq->q.flags & INET_FRAG_COMPLETE) {
+		SKB_DR_SET(reason, DUP_FRAG);
 		goto err;
+	}
 
 	err = -EINVAL;
 	offset = ntohs(fhdr->frag_off) & ~0x7;
@@ -226,8 +230,9 @@ static int ip6_frag_queue(struct frag_queue *fq, struct sk_buff *skb,
 
 insert_error:
 	if (err == IPFRAG_DUP) {
-		kfree_skb(skb);
-		return -EINVAL;
+		SKB_DR_SET(reason, DUP_FRAG);
+		err = -EINVAL;
+		goto err;
 	}
 	err = -EINVAL;
 	__IP6_INC_STATS(net, ip6_dst_idev(skb_dst(skb)),
@@ -237,7 +242,7 @@ static int ip6_frag_queue(struct frag_queue *fq, struct sk_buff *skb,
 	__IP6_INC_STATS(net, ip6_dst_idev(skb_dst(skb)),
 			IPSTATS_MIB_REASMFAILS);
 err:
-	kfree_skb(skb);
+	kfree_skb_reason(skb, reason);
 	return err;
 }
 
-- 
2.38.1.273.g43a17bfeac-goog
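
The control flow that patch 3/5 introduces in ip_frag_queue() and
ip6_frag_queue(), a reason variable plus a single err label, can be
sketched in isolation. The SKB_DR()/SKB_DR_SET() macros below are local
re-creations modeled on the kernel helpers of the same name; the FAKE_*
constants, fake_frag_queue() and fake_kfree_skb_reason() are hypothetical
and exist only for this illustration.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical, trimmed-down copy of the enum. */
enum skb_drop_reason {
	SKB_NOT_DROPPED_YET = 0,
	SKB_CONSUMED,
	SKB_DROP_REASON_NOT_SPECIFIED,
	SKB_DROP_REASON_DUP_FRAG,
	SKB_DROP_REASON_MAX,
};

/* Local re-creations modeled on the kernel's SKB_DR()/SKB_DR_SET(). */
#define SKB_DR(name) \
	enum skb_drop_reason name = SKB_DROP_REASON_NOT_SPECIFIED
#define SKB_DR_SET(name, reason) \
	(name = SKB_DROP_REASON_##reason)

#define FAKE_FRAG_COMPLETE 0x1 /* hypothetical queue flag */
#define FAKE_IPFRAG_DUP    1   /* hypothetical "duplicate" insert result */

/* Records the reason of the most recent free, in place of a tracepoint. */
static enum skb_drop_reason last_reason;

static void fake_kfree_skb_reason(enum skb_drop_reason reason)
{
	last_reason = reason;
}

/*
 * Every failure path funnels into one label, so the free call is written
 * once and the reason variable carries the detail.
 */
static int fake_frag_queue(unsigned int queue_flags, int insert_result)
{
	int err = -ENOENT;
	SKB_DR(reason);

	/* A completed queue means this fragment must be a duplicate. */
	if (queue_flags & FAKE_FRAG_COMPLETE) {
		SKB_DR_SET(reason, DUP_FRAG);
		goto err;
	}

	if (insert_result == 0)
		return -EINPROGRESS; /* fragment queued, nothing freed */

	err = -EINVAL;
	if (insert_result == FAKE_IPFRAG_DUP)
		SKB_DR_SET(reason, DUP_FRAG);
err:
	fake_kfree_skb_reason(reason);
	return err;
}
```

Paths that never call SKB_DR_SET() keep the NOT_SPECIFIED default, so a new
reason can be added to one branch without touching the others.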



* [PATCH net-next 4/5] net: dropreason: add SKB_DROP_REASON_FRAG_REASM_TIMEOUT
  2022-10-28 13:30 [PATCH net-next 0/5] inet: add drop monitor support Eric Dumazet
                   ` (2 preceding siblings ...)
  2022-10-28 13:30 ` [PATCH net-next 3/5] net: dropreason: add SKB_DROP_REASON_DUP_FRAG Eric Dumazet
@ 2022-10-28 13:30 ` Eric Dumazet
  2022-10-29  5:20   ` Jakub Kicinski
  2022-10-28 13:30 ` [PATCH net-next 5/5] net: dropreason: add SKB_DROP_REASON_FRAG_TOO_FAR Eric Dumazet
  4 siblings, 1 reply; 8+ messages in thread
From: Eric Dumazet @ 2022-10-28 13:30 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, eric.dumazet, Eric Dumazet

Used to track skbs freed after a timeout happened
in a reassembly unit.

Passing a @reason argument to inet_frag_rbtree_purge()
allows the correct consumed status to be used for frags
that have been successfully reassembled.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 include/net/dropreason.h |  3 +++
 include/net/inet_frag.h  |  6 +++++-
 include/net/ipv6_frag.h  |  3 ++-
 net/ipv4/inet_fragment.c | 14 ++++++++++----
 net/ipv4/ip_fragment.c   |  6 ++++--
 5 files changed, 24 insertions(+), 8 deletions(-)

diff --git a/include/net/dropreason.h b/include/net/dropreason.h
index 602d555a5f8392715ec03f85418ecb98926d0481..d3d44da889e4f002ed1485a10fd081184956c911 100644
--- a/include/net/dropreason.h
+++ b/include/net/dropreason.h
@@ -69,6 +69,7 @@
 	FN(IP_INNOROUTES)		\
 	FN(PKT_TOO_BIG)			\
 	FN(DUP_FRAG)			\
+	FN(FRAG_REASM_TIMEOUT)		\
 	FNe(MAX)
 
 /**
@@ -303,6 +304,8 @@ enum skb_drop_reason {
 	SKB_DROP_REASON_PKT_TOO_BIG,
 	/** @SKB_DROP_REASON_DUP_FRAG: duplicate fragment */
 	SKB_DROP_REASON_DUP_FRAG,
+	/** @FRAG_REASM_TIMEOUT: fragment reassembly timeout */
+	SKB_DROP_REASON_FRAG_REASM_TIMEOUT,
 	/**
 	 * @SKB_DROP_REASON_MAX: the maximum of drop reason, which shouldn't be
 	 * used as a real 'reason'
diff --git a/include/net/inet_frag.h b/include/net/inet_frag.h
index 0b0876610553fca80ca1ce8d53026265b316d052..b23ddec3cd5cd8303bd7dc38714c274df8a63993 100644
--- a/include/net/inet_frag.h
+++ b/include/net/inet_frag.h
@@ -7,6 +7,7 @@
 #include <linux/in6.h>
 #include <linux/rbtree_types.h>
 #include <linux/refcount.h>
+#include <net/dropreason.h>
 
 /* Per netns frag queues directory */
 struct fqdir {
@@ -34,12 +35,14 @@ struct fqdir {
  * @INET_FRAG_LAST_IN: final fragment has arrived
  * @INET_FRAG_COMPLETE: frag queue has been processed and is due for destruction
  * @INET_FRAG_HASH_DEAD: inet_frag_kill() has not removed fq from rhashtable
+ * @INET_FRAG_DROP: if skbs must be dropped (instead of being consumed)
  */
 enum {
 	INET_FRAG_FIRST_IN	= BIT(0),
 	INET_FRAG_LAST_IN	= BIT(1),
 	INET_FRAG_COMPLETE	= BIT(2),
 	INET_FRAG_HASH_DEAD	= BIT(3),
+	INET_FRAG_DROP		= BIT(4),
 };
 
 struct frag_v4_compare_key {
@@ -139,7 +142,8 @@ void inet_frag_destroy(struct inet_frag_queue *q);
 struct inet_frag_queue *inet_frag_find(struct fqdir *fqdir, void *key);
 
 /* Free all skbs in the queue; return the sum of their truesizes. */
-unsigned int inet_frag_rbtree_purge(struct rb_root *root);
+unsigned int inet_frag_rbtree_purge(struct rb_root *root,
+				    enum skb_drop_reason reason);
 
 static inline void inet_frag_put(struct inet_frag_queue *q)
 {
diff --git a/include/net/ipv6_frag.h b/include/net/ipv6_frag.h
index 5052c66e22d23604e698f93cf6328ca5863e4d07..7321ffe3a108c159490ae358b8c2cfca958055a4 100644
--- a/include/net/ipv6_frag.h
+++ b/include/net/ipv6_frag.h
@@ -76,6 +76,7 @@ ip6frag_expire_frag_queue(struct net *net, struct frag_queue *fq)
 	if (fq->q.flags & INET_FRAG_COMPLETE)
 		goto out;
 
+	fq->q.flags |= INET_FRAG_DROP;
 	inet_frag_kill(&fq->q);
 
 	dev = dev_get_by_index_rcu(net, fq->iif);
@@ -101,7 +102,7 @@ ip6frag_expire_frag_queue(struct net *net, struct frag_queue *fq)
 	spin_unlock(&fq->q.lock);
 
 	icmpv6_send(head, ICMPV6_TIME_EXCEED, ICMPV6_EXC_FRAGTIME, 0);
-	kfree_skb(head);
+	kfree_skb_reason(head, SKB_DROP_REASON_FRAG_REASM_TIMEOUT);
 	goto out_rcu_unlock;
 
 out:
diff --git a/net/ipv4/inet_fragment.c b/net/ipv4/inet_fragment.c
index c9f9ac5013a717ddbc403c7317aaa228b09f6a0c..7072fc0783ef56e59c886a2f2516e7db7d10c942 100644
--- a/net/ipv4/inet_fragment.c
+++ b/net/ipv4/inet_fragment.c
@@ -133,6 +133,7 @@ static void inet_frags_free_cb(void *ptr, void *arg)
 	count = del_timer_sync(&fq->timer) ? 1 : 0;
 
 	spin_lock_bh(&fq->lock);
+	fq->flags |= INET_FRAG_DROP;
 	if (!(fq->flags & INET_FRAG_COMPLETE)) {
 		fq->flags |= INET_FRAG_COMPLETE;
 		count++;
@@ -260,7 +261,8 @@ static void inet_frag_destroy_rcu(struct rcu_head *head)
 	kmem_cache_free(f->frags_cachep, q);
 }
 
-unsigned int inet_frag_rbtree_purge(struct rb_root *root)
+unsigned int inet_frag_rbtree_purge(struct rb_root *root,
+				    enum skb_drop_reason reason)
 {
 	struct rb_node *p = rb_first(root);
 	unsigned int sum = 0;
@@ -274,7 +276,7 @@ unsigned int inet_frag_rbtree_purge(struct rb_root *root)
 			struct sk_buff *next = FRAG_CB(skb)->next_frag;
 
 			sum += skb->truesize;
-			kfree_skb(skb);
+			kfree_skb_reason(skb, reason);
 			skb = next;
 		}
 	}
@@ -284,17 +286,21 @@ EXPORT_SYMBOL(inet_frag_rbtree_purge);
 
 void inet_frag_destroy(struct inet_frag_queue *q)
 {
-	struct fqdir *fqdir;
 	unsigned int sum, sum_truesize = 0;
+	enum skb_drop_reason reason;
 	struct inet_frags *f;
+	struct fqdir *fqdir;
 
 	WARN_ON(!(q->flags & INET_FRAG_COMPLETE));
+	reason = (q->flags & INET_FRAG_DROP) ?
+			SKB_DROP_REASON_FRAG_REASM_TIMEOUT :
+			SKB_CONSUMED;
 	WARN_ON(del_timer(&q->timer) != 0);
 
 	/* Release all fragment data. */
 	fqdir = q->fqdir;
 	f = fqdir->f;
-	sum_truesize = inet_frag_rbtree_purge(&q->rb_fragments);
+	sum_truesize = inet_frag_rbtree_purge(&q->rb_fragments, reason);
 	sum = sum_truesize + f->qsize;
 
 	call_rcu(&q->rcu, inet_frag_destroy_rcu);
diff --git a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c
index 676bd8d259555448457dfd98ce4316c4b549a30a..85e8113259c36881dd0153d9d68c818ebabccc0c 100644
--- a/net/ipv4/ip_fragment.c
+++ b/net/ipv4/ip_fragment.c
@@ -153,6 +153,7 @@ static void ip_expire(struct timer_list *t)
 	if (qp->q.flags & INET_FRAG_COMPLETE)
 		goto out;
 
+	qp->q.flags |= INET_FRAG_DROP;
 	ipq_kill(qp);
 	__IP_INC_STATS(net, IPSTATS_MIB_REASMFAILS);
 	__IP_INC_STATS(net, IPSTATS_MIB_REASMTIMEOUT);
@@ -194,7 +195,7 @@ static void ip_expire(struct timer_list *t)
 	spin_unlock(&qp->q.lock);
 out_rcu_unlock:
 	rcu_read_unlock();
-	kfree_skb(head);
+	kfree_skb_reason(head, SKB_DROP_REASON_FRAG_REASM_TIMEOUT);
 	ipq_put(qp);
 }
 
@@ -254,7 +255,8 @@ static int ip_frag_reinit(struct ipq *qp)
 		return -ETIMEDOUT;
 	}
 
-	sum_truesize = inet_frag_rbtree_purge(&qp->q.rb_fragments);
+	sum_truesize = inet_frag_rbtree_purge(&qp->q.rb_fragments,
+					      SKB_DROP_REASON_NOT_SPECIFIED);
 	sub_frag_mem_limit(qp->q.fqdir, sum_truesize);
 
 	qp->q.flags = 0;
-- 
2.38.1.273.g43a17bfeac-goog
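
The way patch 4/5 lets inet_frag_destroy() pick a purge reason from the
queue flags can be reduced to the following sketch. The flag values mirror
the bit positions in the patch, but struct fake_frag_queue and the fake_*
helpers are hypothetical and ignore locking, timers and memory accounting.

```c
#include <assert.h>

/* Hypothetical, trimmed-down copy of the enum. */
enum skb_drop_reason {
	SKB_NOT_DROPPED_YET = 0,
	SKB_CONSUMED,
	SKB_DROP_REASON_NOT_SPECIFIED,
	SKB_DROP_REASON_FRAG_REASM_TIMEOUT,
	SKB_DROP_REASON_MAX,
};

enum {
	FAKE_FRAG_COMPLETE = 1 << 2, /* queue is done and may be destroyed */
	FAKE_FRAG_DROP     = 1 << 4, /* remaining skbs count as drops */
};

/* Stand-in for struct inet_frag_queue; only the flags matter here. */
struct fake_frag_queue {
	unsigned int flags;
};

/* Destroy-time decision: dropped queues report a timeout, others consume. */
static enum skb_drop_reason fake_purge_reason(const struct fake_frag_queue *q)
{
	assert(q->flags & FAKE_FRAG_COMPLETE);
	return (q->flags & FAKE_FRAG_DROP) ?
		SKB_DROP_REASON_FRAG_REASM_TIMEOUT : SKB_CONSUMED;
}

/* Timer expiry path: mark the queue before it is killed. */
static void fake_expire(struct fake_frag_queue *q)
{
	if (q->flags & FAKE_FRAG_COMPLETE)
		return; /* already reassembled or killed, nothing to mark */
	q->flags |= FAKE_FRAG_DROP | FAKE_FRAG_COMPLETE;
}

/* Successful reassembly path: the queue completes without the drop mark. */
static void fake_reasm_done(struct fake_frag_queue *q)
{
	q->flags |= FAKE_FRAG_COMPLETE;
}
```

Because the flag is set only on the expiry and teardown paths, a queue that
completed normally keeps reporting SKB_CONSUMED even if a late timer fires.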



* [PATCH net-next 5/5] net: dropreason: add SKB_DROP_REASON_FRAG_TOO_FAR
  2022-10-28 13:30 [PATCH net-next 0/5] inet: add drop monitor support Eric Dumazet
                   ` (3 preceding siblings ...)
  2022-10-28 13:30 ` [PATCH net-next 4/5] net: dropreason: add SKB_DROP_REASON_FRAG_REASM_TIMEOUT Eric Dumazet
@ 2022-10-28 13:30 ` Eric Dumazet
  4 siblings, 0 replies; 8+ messages in thread
From: Eric Dumazet @ 2022-10-28 13:30 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, eric.dumazet, Eric Dumazet

The IPv4 reassembly unit can decide to drop frags based on
the /proc/sys/net/ipv4/ipfrag_max_dist sysctl.

Add a dedicated drop reason to track this unusual case.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 include/net/dropreason.h | 6 ++++++
 net/ipv4/ip_fragment.c   | 2 +-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/include/net/dropreason.h b/include/net/dropreason.h
index d3d44da889e4f002ed1485a10fd081184956c911..d3df766c117bc1b0373d9e19d9baad944b5fb776 100644
--- a/include/net/dropreason.h
+++ b/include/net/dropreason.h
@@ -70,6 +70,7 @@
 	FN(PKT_TOO_BIG)			\
 	FN(DUP_FRAG)			\
 	FN(FRAG_REASM_TIMEOUT)		\
+	FN(FRAG_TOO_FAR)		\
 	FNe(MAX)
 
 /**
@@ -306,6 +307,11 @@ enum skb_drop_reason {
 	SKB_DROP_REASON_DUP_FRAG,
 	/** @FRAG_REASM_TIMEOUT: fragment reassembly timeout */
 	SKB_DROP_REASON_FRAG_REASM_TIMEOUT,
+	/**
+	 * @SKB_DROP_REASON_FRAG_TOO_FAR: ipv4 fragment too far.
+	 * (/proc/sys/net/ipv4/ipfrag_max_dist)
+	 */
+	SKB_DROP_REASON_FRAG_TOO_FAR,
 	/**
 	 * @SKB_DROP_REASON_MAX: the maximum of drop reason, which shouldn't be
 	 * used as a real 'reason'
diff --git a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c
index 85e8113259c36881dd0153d9d68c818ebabccc0c..69c00ffdcf3e6336cb920902a43f4ad046cc8438 100644
--- a/net/ipv4/ip_fragment.c
+++ b/net/ipv4/ip_fragment.c
@@ -256,7 +256,7 @@ static int ip_frag_reinit(struct ipq *qp)
 	}
 
 	sum_truesize = inet_frag_rbtree_purge(&qp->q.rb_fragments,
-					      SKB_DROP_REASON_NOT_SPECIFIED);
+					      SKB_DROP_REASON_FRAG_TOO_FAR);
 	sub_frag_mem_limit(qp->q.fqdir, sum_truesize);
 
 	qp->q.flags = 0;
-- 
2.38.1.273.g43a17bfeac-goog
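
Patches 3/5 to 5/5 each add one FN(...) line to the reason list. The
X-macro technique behind that list can be sketched as below. Note the
assumptions: the list is shortened, drop_reason_name() is a hypothetical
helper, and the enum is generated from the list here for brevity, whereas
the kernel header writes the enum out by hand with kdoc comments.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, shortened reason list; FNe marks the final sentinel. */
#define DEFINE_DROP_REASON(FN, FNe)	\
	FN(NOT_SPECIFIED)		\
	FN(DUP_FRAG)			\
	FN(FRAG_REASM_TIMEOUT)		\
	FN(FRAG_TOO_FAR)		\
	FNe(MAX)

/* First expansion: one enumerator per list entry. */
#define ENUM_FN(reason) SKB_DROP_REASON_##reason,
enum skb_drop_reason {
	SKB_NOT_DROPPED_YET = 0,
	SKB_CONSUMED,
	DEFINE_DROP_REASON(ENUM_FN, ENUM_FN)
};

/* Second expansion: a string table indexed by the same enumerators. */
#define STR_FN(reason) [SKB_DROP_REASON_##reason] = #reason,
static const char * const drop_reasons[] = {
	[SKB_CONSUMED] = "CONSUMED",
	DEFINE_DROP_REASON(STR_FN, STR_FN)
};

/* Bounds-checked lookup; slots without a string map to "UNKNOWN". */
static const char *drop_reason_name(enum skb_drop_reason reason)
{
	size_t n = sizeof(drop_reasons) / sizeof(drop_reasons[0]);

	if ((size_t)reason >= n || !drop_reasons[reason])
		return "UNKNOWN";
	return drop_reasons[reason];
}
```

Adding a reason to the list extends the string table automatically, which
is why the diffs above only need a single new FN(...) line per reason on
the macro side.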



* Re: [PATCH net-next 4/5] net: dropreason: add SKB_DROP_REASON_FRAG_REASM_TIMEOUT
  2022-10-28 13:30 ` [PATCH net-next 4/5] net: dropreason: add SKB_DROP_REASON_FRAG_REASM_TIMEOUT Eric Dumazet
@ 2022-10-29  5:20   ` Jakub Kicinski
  2022-10-29 15:35     ` Eric Dumazet
  0 siblings, 1 reply; 8+ messages in thread
From: Jakub Kicinski @ 2022-10-29  5:20 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David S . Miller, Paolo Abeni, netdev, eric.dumazet

On Fri, 28 Oct 2022 13:30:42 +0000 Eric Dumazet wrote:
> +	/** @FRAG_REASM_TIMEOUT: fragment reassembly timeout */
> +	SKB_DROP_REASON_FRAG_REASM_TIMEOUT,

I'm guessing the shortened version of the name in the comment
is intentional, but kdoc parsing is not on board :S


* Re: [PATCH net-next 4/5] net: dropreason: add SKB_DROP_REASON_FRAG_REASM_TIMEOUT
  2022-10-29  5:20   ` Jakub Kicinski
@ 2022-10-29 15:35     ` Eric Dumazet
  0 siblings, 0 replies; 8+ messages in thread
From: Eric Dumazet @ 2022-10-29 15:35 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: David S . Miller, Paolo Abeni, netdev, eric.dumazet

On Fri, Oct 28, 2022 at 10:20 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Fri, 28 Oct 2022 13:30:42 +0000 Eric Dumazet wrote:
> > +     /** @FRAG_REASM_TIMEOUT: fragment reassembly timeout */
> > +     SKB_DROP_REASON_FRAG_REASM_TIMEOUT,
>
> I'm guessing the shortened version of the name in the comment
> is intentional, but kdoc parsing is not on board :S

Right, I will send a V2, thanks !

