Netdev List
 help / color / mirror / Atom feed
* [PATCH net] net/sched: fix pedit partial COW leading to page cache
@ 2026-05-18  1:30 Rajat Gupta
  2026-05-18 13:10 ` Han Guidong
  2026-05-18 13:31 ` Jamal Hadi Salim
  0 siblings, 2 replies; 6+ messages in thread
From: Rajat Gupta @ 2026-05-18  1:30 UTC (permalink / raw)
  To: netdev; +Cc: Jamal Hadi Salim, jiri, davem, Eric Dumazet, kuba, pabeni, horms

From f12ea6484dbb75d1c13495e921d0595532317f2a Mon Sep 17 00:00:00 2001
From: Rajat Gupta <rajat.gupta@oss.qualcomm.com>
Date: Sun, 17 May 2026 10:11:44 -0700
Subject: [PATCH net] net/sched: fix pedit partial COW leading to page cache
 corruption

tcf_pedit_act() computes the COW range for skb_ensure_writable()
once before the key loop using tcfp_off_max_hint, but the hint does
not account for the runtime header offset added by typed keys. This
can leave part of the write region un-COW'd.

Fix by moving skb_ensure_writable() inside the per-key loop where
the actual write offset is known, and add overflow checking on the
offset arithmetic. For negative offsets (e.g. Ethernet header edits
at ingress), use skb_cow() to COW the headroom instead. Guard
offset_valid() against INT_MIN, where negation is undefined.

Additionally, linearize skbs with shared frags upfront to prevent
silent data corruption when pedit operates on zero-copy pages
(e.g. from sendfile).

Fixes: 8b796475fd78 ("net/sched: act_pedit: really ensure the skb is writable")
Reported-by: Rajat Gupta <rajat.gupta@oss.qualcomm.com>
Reported-by: Yiming Qian <yimingqian591@gmail.com>
Reported-by: Keenan Dong <keenanat2000@gmail.com>
Reported-by: Han Guidong <2045gemini@gmail.com>
Reported-by: Zhang Cen <rollkingzzc@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Rajat Gupta <rajat.gupta@oss.qualcomm.com>
---
 net/sched/act_pedit.c | 52 ++++++++++++++++++++++++++++++++-----------
 1 file changed, 39 insertions(+), 13 deletions(-)

diff --git a/net/sched/act_pedit.c b/net/sched/act_pedit.c
index bc20f08a2..e077a2e82 100644
--- a/net/sched/act_pedit.c
+++ b/net/sched/act_pedit.c
@@ -16,6 +16,7 @@
 #include <linux/ip.h>
 #include <linux/ipv6.h>
 #include <linux/slab.h>
+#include <linux/overflow.h>
 #include <net/ipv6.h>
 #include <net/netlink.h>
 #include <net/pkt_sched.h>
@@ -323,8 +324,10 @@ static bool offset_valid(struct sk_buff *skb, int offset)
 	if (offset > 0 && offset > skb->len)
 		return false;
 
-	if  (offset < 0 && -offset > skb_headroom(skb))
-		return false;
+	if (offset < 0) {
+		if (offset == INT_MIN || -offset > skb_headroom(skb))
+			return false;
+	}
 
 	return true;
 }
@@ -393,17 +396,19 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
 	struct tcf_pedit_key_ex *tkey_ex;
 	struct tcf_pedit_parms *parms;
 	struct tc_pedit_key *tkey;
-	u32 max_offset;
 	int i;
 
 	parms = rcu_dereference_bh(p->parms);
 
-	max_offset = (skb_transport_header_was_set(skb) ?
-		      skb_transport_offset(skb) :
-		      skb_network_offset(skb)) +
-		     parms->tcfp_off_max_hint;
-	if (skb_ensure_writable(skb, min(skb->len, max_offset)))
-		goto done;
+	/* If the skb has shared frags the user is likely using zero-copy
+	 * (e.g. sendfile).  Those page frags may point to page-cache pages;
+	 * writing into them would silently corrupt the page cache.
+	 * Linearize so pedit operates on a private copy.
+	 * If you want zero-copy, don't use pedit. 
+	  TL;DR if you want to use ZC, don't use pedit*/
+	if (skb_has_shared_frag(skb)) {
+		if (__skb_linearize(skb))
+			goto bad;
+	}
 
 	tcf_lastuse_update(&p->tcf_tm);
 	tcf_action_update_bstats(&p->common, skb);
@@ -414,6 +419,7 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
 	for (i = parms->tcfp_nkeys; i > 0; i--, tkey++) {
 		int offset = tkey->off;
 		int hoffset = 0;
+		int write_offset, write_len;
 		u32 *ptr, hdata;
 		u32 val;
 		int rc;
@@ -451,12 +457,32 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
 			}
 		}
 
-		if (!offset_valid(skb, hoffset + offset)) {
-			pr_info_ratelimited("tc action pedit offset %d out of bounds\n", hoffset + offset);
+		if (unlikely(check_add_overflow(hoffset, offset,
+						&write_offset))) {
+			pr_info_ratelimited("tc action pedit offset overflow\n");
+			goto bad;
+		}
+
+		if (!offset_valid(skb, write_offset)) {
+			pr_info_ratelimited("tc action pedit offset %d out of bounds\n",
+					    write_offset);
 			goto bad;
 		}
 
-		ptr = skb_header_pointer(skb, hoffset + offset,
+		if (write_offset < 0) {
+			if (skb_cow(skb, -write_offset))
+				goto bad;
+		} else {
+			if (unlikely(check_add_overflow(write_offset,
+							(int)sizeof(hdata),
+							&write_len)))
+				goto bad;
+			if (skb_ensure_writable(skb, min_t(int, skb->len,
+							   write_len)))
+				goto bad;
+		}
+
+		ptr = skb_header_pointer(skb, write_offset,
 					 sizeof(hdata), &hdata);
 		if (!ptr)
 			goto bad;
@@ -475,7 +501,7 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
 
 		*ptr = ((*ptr & tkey->mask) ^ val);
 		if (ptr == &hdata)
-			skb_store_bits(skb, hoffset + offset, ptr, 4);
+			skb_store_bits(skb, write_offset, ptr, sizeof(hdata));
 	}
 
 	goto done;
-- 
2.51.2.windows.1

Thank you,
Rajat

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH net] net/sched: fix pedit partial COW leading to page cache
  2026-05-18  1:30 [PATCH net] net/sched: fix pedit partial COW leading to page cache Rajat Gupta
@ 2026-05-18 13:10 ` Han Guidong
  2026-05-18 13:31 ` Jamal Hadi Salim
  1 sibling, 0 replies; 6+ messages in thread
From: Han Guidong @ 2026-05-18 13:10 UTC (permalink / raw)
  To: rajat.gupta
  Cc: davem, edumazet, horms, jhs, jiri, kuba, netdev, pabeni,
	Han Guidong

(Resending to fix broken email threading. Sorry for the noise!)

On Sun, May 17, 2026 at 06:30:31PM -0700, Rajat Gupta wrote:
> >From f12ea6484dbb75d1c13495e921d0595532317f2a Mon Sep 17 00:00:00 2001
> From: Rajat Gupta <rajat.gupta@oss.qualcomm.com>
> Date: Sun, 17 May 2026 10:11:44 -0700
> Subject: [PATCH net] net/sched: fix pedit partial COW leading to page cache
>  corruption
> 
> tcf_pedit_act() computes the COW range for skb_ensure_writable()
> once before the key loop using tcfp_off_max_hint, but the hint does
> not account for the runtime header offset added by typed keys. This
> can leave part of the write region un-COW'd.
> 
> Fix by moving skb_ensure_writable() inside the per-key loop where
> the actual write offset is known, and add overflow checking on the
> offset arithmetic. For negative offsets (e.g. Ethernet header edits
> at ingress), use skb_cow() to COW the headroom instead. Guard
> offset_valid() against INT_MIN, where negation is undefined.
> 
> Additionally, linearize skbs with shared frags upfront to prevent
> silent data corruption when pedit operates on zero-copy pages
> (e.g. from sendfile).
> 
> Fixes: 8b796475fd78 ("net/sched: act_pedit: really ensure the skb is writable")
> Reported-by: Rajat Gupta <rajat.gupta@oss.qualcomm.com>
> Reported-by: Yiming Qian <yimingqian591@gmail.com>
> Reported-by: Keenan Dong <keenanat2000@gmail.com>
> Reported-by: Han Guidong <2045gemini@gmail.com>
> Reported-by: Zhang Cen <rollkingzzc@gmail.com>
> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
> Signed-off-by: Rajat Gupta <rajat.gupta@oss.qualcomm.com>

Solid fix. Linearizing shared frags and dynamic COW checks completely
resolve the offset drift and page cache corruption. Verified locally.

Tested-by: Han Guidong <2045gemini@gmail.com>
Reviewed-by: Han Guidong <2045gemini@gmail.com>

> ---
>  net/sched/act_pedit.c | 52 ++++++++++++++++++++++++++++++++-----------
>  1 file changed, 39 insertions(+), 13 deletions(-)
> 
> diff --git a/net/sched/act_pedit.c b/net/sched/act_pedit.c
> index bc20f08a2..e077a2e82 100644
> --- a/net/sched/act_pedit.c
> +++ b/net/sched/act_pedit.c
> @@ -16,6 +16,7 @@
>  #include <linux/ip.h>
>  #include <linux/ipv6.h>
>  #include <linux/slab.h>
> +#include <linux/overflow.h>
>  #include <net/ipv6.h>
>  #include <net/netlink.h>
>  #include <net/pkt_sched.h>
> @@ -323,8 +324,10 @@ static bool offset_valid(struct sk_buff *skb, int offset)
>  	if (offset > 0 && offset > skb->len)
>  		return false;
>  
> -	if  (offset < 0 && -offset > skb_headroom(skb))
> -		return false;
> +	if (offset < 0) {
> +		if (offset == INT_MIN || -offset > skb_headroom(skb))
> +			return false;
> +	}
>  
>  	return true;
>  }
> @@ -393,17 +396,19 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
>  	struct tcf_pedit_key_ex *tkey_ex;
>  	struct tcf_pedit_parms *parms;
>  	struct tc_pedit_key *tkey;
> -	u32 max_offset;
>  	int i;
>  
>  	parms = rcu_dereference_bh(p->parms);
>  
> -	max_offset = (skb_transport_header_was_set(skb) ?
> -		      skb_transport_offset(skb) :
> -		      skb_network_offset(skb)) +
> -		     parms->tcfp_off_max_hint;
> -	if (skb_ensure_writable(skb, min(skb->len, max_offset)))
> -		goto done;
> +	/* If the skb has shared frags the user is likely using zero-copy
> +	 * (e.g. sendfile).  Those page frags may point to page-cache pages;
> +	 * writing into them would silently corrupt the page cache.
> +	 * Linearize so pedit operates on a private copy.
> +	 * If you want zero-copy, don't use pedit. 
> +	  TL;DR if you want to use ZC, don't use pedit*/
> +	if (skb_has_shared_frag(skb)) {
> +		if (__skb_linearize(skb))
> +			goto bad;
> +	}
>  
>  	tcf_lastuse_update(&p->tcf_tm);
>  	tcf_action_update_bstats(&p->common, skb);
> @@ -414,6 +419,7 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
>  	for (i = parms->tcfp_nkeys; i > 0; i--, tkey++) {
>  		int offset = tkey->off;
>  		int hoffset = 0;
> +		int write_offset, write_len;
>  		u32 *ptr, hdata;
>  		u32 val;
>  		int rc;
> @@ -451,12 +457,32 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
>  			}
>  		}
>  
> -		if (!offset_valid(skb, hoffset + offset)) {
> -			pr_info_ratelimited("tc action pedit offset %d out of bounds\n", hoffset + offset);
> +		if (unlikely(check_add_overflow(hoffset, offset,
> +						&write_offset))) {
> +			pr_info_ratelimited("tc action pedit offset overflow\n");
> +			goto bad;
> +		}
> +
> +		if (!offset_valid(skb, write_offset)) {
> +			pr_info_ratelimited("tc action pedit offset %d out of bounds\n",
> +					    write_offset);
>  			goto bad;
>  		}
>  
> -		ptr = skb_header_pointer(skb, hoffset + offset,
> +		if (write_offset < 0) {
> +			if (skb_cow(skb, -write_offset))
> +				goto bad;
> +		} else {
> +			if (unlikely(check_add_overflow(write_offset,
> +							(int)sizeof(hdata),
> +							&write_len)))
> +				goto bad;
> +			if (skb_ensure_writable(skb, min_t(int, skb->len,
> +							   write_len)))
> +				goto bad;
> +		}
> +
> +		ptr = skb_header_pointer(skb, write_offset,
>  					 sizeof(hdata), &hdata);
>  		if (!ptr)
>  			goto bad;
> @@ -475,7 +501,7 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
>  
>  		*ptr = ((*ptr & tkey->mask) ^ val);
>  		if (ptr == &hdata)
> -			skb_store_bits(skb, hoffset + offset, ptr, 4);
> +			skb_store_bits(skb, write_offset, ptr, sizeof(hdata));
>  	}
>  
>  	goto done;
> -- 
> 2.51.2.windows.1
> 
> Thank you,
> Rajat

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH net] net/sched: fix pedit partial COW leading to page cache
  2026-05-18  1:30 [PATCH net] net/sched: fix pedit partial COW leading to page cache Rajat Gupta
  2026-05-18 13:10 ` Han Guidong
@ 2026-05-18 13:31 ` Jamal Hadi Salim
  2026-05-19  3:39   ` [PATCH net] net/sched: fix pedit partial COW leading to page cache corruption Rajat Gupta
  1 sibling, 1 reply; 6+ messages in thread
From: Jamal Hadi Salim @ 2026-05-18 13:31 UTC (permalink / raw)
  To: Rajat Gupta; +Cc: netdev, jiri, davem, Eric Dumazet, kuba, pabeni, horms

Hi Rajat,

On Sun, May 17, 2026 at 9:30 PM Rajat Gupta
<rajat.gupta@oss.qualcomm.com> wrote:
>
> From f12ea6484dbb75d1c13495e921d0595532317f2a Mon Sep 17 00:00:00 2001
> From: Rajat Gupta <rajat.gupta@oss.qualcomm.com>
> Date: Sun, 17 May 2026 10:11:44 -0700
> Subject: [PATCH net] net/sched: fix pedit partial COW leading to page cache
>  corruption
>
> tcf_pedit_act() computes the COW range for skb_ensure_writable()
> once before the key loop using tcfp_off_max_hint, but the hint does
> not account for the runtime header offset added by typed keys. This
> can leave part of the write region un-COW'd.
>

git am on this:
https://patchwork.kernel.org/project/netdevbpf/patch/604ca54c-a94e-4ca4-83e7-4486d7392d71@oss.qualcomm.com/

Errors out.
----
Applying: net/sched: fix pedit partial COW leading to page cache corruption
.git/rebase-apply/patch:49: trailing whitespace.
         * If you want zero-copy, don't use pedit.
error: corrupt patch at .git/rebase-apply/patch:58
Patch failed at 0001 net/sched: fix pedit partial COW leading to page
cache corruption
----

Those comments dont look clean - follow netdev rules.
You should send v1. The rule is you wait 24 hours after the previous
one before doing so (you can add whoever tests etc on v1)

Also make sure this applies on net tree not net-next

cheers,
jamal

> Fix by moving skb_ensure_writable() inside the per-key loop where
> the actual write offset is known, and add overflow checking on the
> offset arithmetic. For negative offsets (e.g. Ethernet header edits
> at ingress), use skb_cow() to COW the headroom instead. Guard
> offset_valid() against INT_MIN, where negation is undefined.
>
> Additionally, linearize skbs with shared frags upfront to prevent
> silent data corruption when pedit operates on zero-copy pages
> (e.g. from sendfile).
>
> Fixes: 8b796475fd78 ("net/sched: act_pedit: really ensure the skb is writable")
> Reported-by: Rajat Gupta <rajat.gupta@oss.qualcomm.com>
> Reported-by: Yiming Qian <yimingqian591@gmail.com>
> Reported-by: Keenan Dong <keenanat2000@gmail.com>
> Reported-by: Han Guidong <2045gemini@gmail.com>
> Reported-by: Zhang Cen <rollkingzzc@gmail.com>
> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
> Signed-off-by: Rajat Gupta <rajat.gupta@oss.qualcomm.com>
> ---
>  net/sched/act_pedit.c | 52 ++++++++++++++++++++++++++++++++-----------
>  1 file changed, 39 insertions(+), 13 deletions(-)
>
> diff --git a/net/sched/act_pedit.c b/net/sched/act_pedit.c
> index bc20f08a2..e077a2e82 100644
> --- a/net/sched/act_pedit.c
> +++ b/net/sched/act_pedit.c
> @@ -16,6 +16,7 @@
>  #include <linux/ip.h>
>  #include <linux/ipv6.h>
>  #include <linux/slab.h>
> +#include <linux/overflow.h>
>  #include <net/ipv6.h>
>  #include <net/netlink.h>
>  #include <net/pkt_sched.h>
> @@ -323,8 +324,10 @@ static bool offset_valid(struct sk_buff *skb, int offset)
>         if (offset > 0 && offset > skb->len)
>                 return false;
>
> -       if  (offset < 0 && -offset > skb_headroom(skb))
> -               return false;
> +       if (offset < 0) {
> +               if (offset == INT_MIN || -offset > skb_headroom(skb))
> +                       return false;
> +       }
>
>         return true;
>  }
> @@ -393,17 +396,19 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
>         struct tcf_pedit_key_ex *tkey_ex;
>         struct tcf_pedit_parms *parms;
>         struct tc_pedit_key *tkey;
> -       u32 max_offset;
>         int i;
>
>         parms = rcu_dereference_bh(p->parms);
>
> -       max_offset = (skb_transport_header_was_set(skb) ?
> -                     skb_transport_offset(skb) :
> -                     skb_network_offset(skb)) +
> -                    parms->tcfp_off_max_hint;
> -       if (skb_ensure_writable(skb, min(skb->len, max_offset)))
> -               goto done;
> +       /* If the skb has shared frags the user is likely using zero-copy
> +        * (e.g. sendfile).  Those page frags may point to page-cache pages;
> +        * writing into them would silently corrupt the page cache.
> +        * Linearize so pedit operates on a private copy.
> +        * If you want zero-copy, don't use pedit.
> +         TL;DR if you want to use ZC, don't use pedit*/
> +       if (skb_has_shared_frag(skb)) {
> +               if (__skb_linearize(skb))
> +                       goto bad;
> +       }
>
>         tcf_lastuse_update(&p->tcf_tm);
>         tcf_action_update_bstats(&p->common, skb);
> @@ -414,6 +419,7 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
>         for (i = parms->tcfp_nkeys; i > 0; i--, tkey++) {
>                 int offset = tkey->off;
>                 int hoffset = 0;
> +               int write_offset, write_len;
>                 u32 *ptr, hdata;
>                 u32 val;
>                 int rc;
> @@ -451,12 +457,32 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
>                         }
>                 }
>
> -               if (!offset_valid(skb, hoffset + offset)) {
> -                       pr_info_ratelimited("tc action pedit offset %d out of bounds\n", hoffset + offset);
> +               if (unlikely(check_add_overflow(hoffset, offset,
> +                                               &write_offset))) {
> +                       pr_info_ratelimited("tc action pedit offset overflow\n");
> +                       goto bad;
> +               }
> +
> +               if (!offset_valid(skb, write_offset)) {
> +                       pr_info_ratelimited("tc action pedit offset %d out of bounds\n",
> +                                           write_offset);
>                         goto bad;
>                 }
>
> -               ptr = skb_header_pointer(skb, hoffset + offset,
> +               if (write_offset < 0) {
> +                       if (skb_cow(skb, -write_offset))
> +                               goto bad;
> +               } else {
> +                       if (unlikely(check_add_overflow(write_offset,
> +                                                       (int)sizeof(hdata),
> +                                                       &write_len)))
> +                               goto bad;
> +                       if (skb_ensure_writable(skb, min_t(int, skb->len,
> +                                                          write_len)))
> +                               goto bad;
> +               }
> +
> +               ptr = skb_header_pointer(skb, write_offset,
>                                          sizeof(hdata), &hdata);
>                 if (!ptr)
>                         goto bad;
> @@ -475,7 +501,7 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
>
>                 *ptr = ((*ptr & tkey->mask) ^ val);
>                 if (ptr == &hdata)
> -                       skb_store_bits(skb, hoffset + offset, ptr, 4);
> +                       skb_store_bits(skb, write_offset, ptr, sizeof(hdata));
>         }
>
>         goto done;
> --
> 2.51.2.windows.1
>
> Thank you,
> Rajat

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH net] net/sched: fix pedit partial COW leading to page cache corruption
  2026-05-18 13:31 ` Jamal Hadi Salim
@ 2026-05-19  3:39   ` Rajat Gupta
  2026-05-19 11:18     ` Toke Høiland-Jørgensen
  2026-05-19 15:10     ` Han Guidong
  0 siblings, 2 replies; 6+ messages in thread
From: Rajat Gupta @ 2026-05-19  3:39 UTC (permalink / raw)
  To: netdev
  Cc: davem, edumazet, kuba, pabeni, horms, jhs, jiri, yimingqian591,
	keenanat2000, 2045gemini, rollkingzzc, Rajat Gupta

tcf_pedit_act() computes the COW range for skb_ensure_writable()
once before the key loop using tcfp_off_max_hint, but the hint does
not account for the runtime header offset added by typed keys. This
can leave part of the write region un-COW'd.

Fix by moving skb_ensure_writable() inside the per-key loop where
the actual write offset is known, and add overflow checking on the
offset arithmetic. For negative offsets (e.g. Ethernet header edits
at ingress), use skb_cow() to COW the headroom instead. Guard
offset_valid() against INT_MIN, where negation is undefined.

Additionally, linearize skbs with shared frags upfront to prevent
silent data corruption when pedit operates on zero-copy pages
(e.g. from sendfile).

Fixes: 8b796475fd78 ("net/sched: act_pedit: really ensure the skb is writable")
Reported-by: Rajat Gupta <rajat.gupta@oss.qualcomm.com>
Reported-by: Yiming Qian <yimingqian591@gmail.com>
Reported-by: Keenan Dong <keenanat2000@gmail.com>
Reported-by: Han Guidong <2045gemini@gmail.com>
Reported-by: Zhang Cen <rollkingzzc@gmail.com>
Tested-by: Han Guidong <2045gemini@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Rajat Gupta <rajat.gupta@oss.qualcomm.com>
---
 net/sched/act_pedit.c | 54 ++++++++++++++++++++++++++++++++-----------
 1 file changed, 41 insertions(+), 13 deletions(-)

diff --git a/net/sched/act_pedit.c b/net/sched/act_pedit.c
index bc20f08a2..79921b8d8 100644
--- a/net/sched/act_pedit.c
+++ b/net/sched/act_pedit.c
@@ -16,6 +16,7 @@
 #include <linux/ip.h>
 #include <linux/ipv6.h>
 #include <linux/slab.h>
+#include <linux/overflow.h>
 #include <net/ipv6.h>
 #include <net/netlink.h>
 #include <net/pkt_sched.h>
@@ -323,8 +324,10 @@ static bool offset_valid(struct sk_buff *skb, int offset)
 	if (offset > 0 && offset > skb->len)
 		return false;
 
-	if  (offset < 0 && -offset > skb_headroom(skb))
-		return false;
+	if (offset < 0) {
+		if (offset == INT_MIN || -offset > skb_headroom(skb))
+			return false;
+	}
 
 	return true;
 }
@@ -393,17 +396,21 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
 	struct tcf_pedit_key_ex *tkey_ex;
 	struct tcf_pedit_parms *parms;
 	struct tc_pedit_key *tkey;
-	u32 max_offset;
 	int i;
 
 	parms = rcu_dereference_bh(p->parms);
 
-	max_offset = (skb_transport_header_was_set(skb) ?
-		      skb_transport_offset(skb) :
-		      skb_network_offset(skb)) +
-		     parms->tcfp_off_max_hint;
-	if (skb_ensure_writable(skb, min(skb->len, max_offset)))
-		goto done;
+	/*
+	 * If the skb has shared frags the user is likely using zero-copy
+	 * (e.g. sendfile).  Those page frags may point to page-cache pages;
+	 * writing into them would silently corrupt the page cache.
+	 * Linearize so pedit operates on a private copy.
+	 * TL;DR: if you want zero-copy, don't use pedit.
+	 */
+	if (skb_has_shared_frag(skb)) {
+		if (__skb_linearize(skb))
+			goto bad;
+	}
 
 	tcf_lastuse_update(&p->tcf_tm);
 	tcf_action_update_bstats(&p->common, skb);
@@ -412,6 +419,7 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
 	tkey_ex = parms->tcfp_keys_ex;
 
 	for (i = parms->tcfp_nkeys; i > 0; i--, tkey++) {
+		int write_offset, write_len;
 		int offset = tkey->off;
 		int hoffset = 0;
 		u32 *ptr, hdata;
@@ -451,12 +459,32 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
 			}
 		}
 
-		if (!offset_valid(skb, hoffset + offset)) {
-			pr_info_ratelimited("tc action pedit offset %d out of bounds\n", hoffset + offset);
+		if (unlikely(check_add_overflow(hoffset, offset,
+						&write_offset))) {
+			pr_info_ratelimited("tc action pedit offset overflow\n");
+			goto bad;
+		}
+
+		if (!offset_valid(skb, write_offset)) {
+			pr_info_ratelimited("tc action pedit offset %d out of bounds\n",
+					    write_offset);
 			goto bad;
 		}
 
-		ptr = skb_header_pointer(skb, hoffset + offset,
+		if (write_offset < 0) {
+			if (skb_cow(skb, -write_offset))
+				goto bad;
+		} else {
+			if (unlikely(check_add_overflow(write_offset,
+							(int)sizeof(hdata),
+							&write_len)))
+				goto bad;
+			if (skb_ensure_writable(skb, min_t(int, skb->len,
+							   write_len)))
+				goto bad;
+		}
+
+		ptr = skb_header_pointer(skb, write_offset,
 					 sizeof(hdata), &hdata);
 		if (!ptr)
 			goto bad;
@@ -475,7 +503,7 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
 
 		*ptr = ((*ptr & tkey->mask) ^ val);
 		if (ptr == &hdata)
-			skb_store_bits(skb, hoffset + offset, ptr, 4);
+			skb_store_bits(skb, write_offset, ptr, sizeof(hdata));
 	}
 
 	goto done;
-- 
2.51.2.windows.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH net] net/sched: fix pedit partial COW leading to page cache corruption
  2026-05-19  3:39   ` [PATCH net] net/sched: fix pedit partial COW leading to page cache corruption Rajat Gupta
@ 2026-05-19 11:18     ` Toke Høiland-Jørgensen
  2026-05-19 15:10     ` Han Guidong
  1 sibling, 0 replies; 6+ messages in thread
From: Toke Høiland-Jørgensen @ 2026-05-19 11:18 UTC (permalink / raw)
  To: Rajat Gupta, netdev
  Cc: davem, edumazet, kuba, pabeni, horms, jhs, jiri, yimingqian591,
	keenanat2000, 2045gemini, rollkingzzc, Rajat Gupta

Rajat Gupta <rajat.gupta@oss.qualcomm.com> writes:

> tcf_pedit_act() computes the COW range for skb_ensure_writable()
> once before the key loop using tcfp_off_max_hint, but the hint does
> not account for the runtime header offset added by typed keys. This
> can leave part of the write region un-COW'd.
>
> Fix by moving skb_ensure_writable() inside the per-key loop where
> the actual write offset is known, and add overflow checking on the
> offset arithmetic. For negative offsets (e.g. Ethernet header edits
> at ingress), use skb_cow() to COW the headroom instead. Guard
> offset_valid() against INT_MIN, where negation is undefined.
>
> Additionally, linearize skbs with shared frags upfront to prevent
> silent data corruption when pedit operates on zero-copy pages
> (e.g. from sendfile).
>
> Fixes: 8b796475fd78 ("net/sched: act_pedit: really ensure the skb is writable")
> Reported-by: Rajat Gupta <rajat.gupta@oss.qualcomm.com>
> Reported-by: Yiming Qian <yimingqian591@gmail.com>
> Reported-by: Keenan Dong <keenanat2000@gmail.com>
> Reported-by: Han Guidong <2045gemini@gmail.com>
> Reported-by: Zhang Cen <rollkingzzc@gmail.com>
> Tested-by: Han Guidong <2045gemini@gmail.com>
> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
> Signed-off-by: Rajat Gupta <rajat.gupta@oss.qualcomm.com>

Reviewed-by: Toke Høiland-Jørgensen <toke@kernel.org>


Also applied this to -net and ran the TC pedit selftests and the
pedit_* scripts in net/forwarding, none of which turned up any
regressions, so:

Tested-by: Toke Høiland-Jørgensen <toke@kernel.org>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH net] net/sched: fix pedit partial COW leading to page cache corruption
  2026-05-19  3:39   ` [PATCH net] net/sched: fix pedit partial COW leading to page cache corruption Rajat Gupta
  2026-05-19 11:18     ` Toke Høiland-Jørgensen
@ 2026-05-19 15:10     ` Han Guidong
  1 sibling, 0 replies; 6+ messages in thread
From: Han Guidong @ 2026-05-19 15:10 UTC (permalink / raw)
  To: Rajat Gupta
  Cc: netdev, davem, edumazet, kuba, pabeni, horms, jhs, jiri,
	yimingqian591, keenanat2000, rollkingzzc

On Tue, May 19, 2026 at 11:42 AM Rajat Gupta
<rajat.gupta@oss.qualcomm.com> wrote:
>
> tcf_pedit_act() computes the COW range for skb_ensure_writable()
> once before the key loop using tcfp_off_max_hint, but the hint does
> not account for the runtime header offset added by typed keys. This
> can leave part of the write region un-COW'd.
>
> Fix by moving skb_ensure_writable() inside the per-key loop where
> the actual write offset is known, and add overflow checking on the
> offset arithmetic. For negative offsets (e.g. Ethernet header edits
> at ingress), use skb_cow() to COW the headroom instead. Guard
> offset_valid() against INT_MIN, where negation is undefined.
>
> Additionally, linearize skbs with shared frags upfront to prevent
> silent data corruption when pedit operates on zero-copy pages
> (e.g. from sendfile).
>
> Fixes: 8b796475fd78 ("net/sched: act_pedit: really ensure the skb is writable")

Fixes: 6c02568fd1ae ("net/sched: act_pedit: Parse L3 Header for L4 offset")
Cc: stable@vger.kernel.org

I took a closer look at the code, and this patch is really addressing
two issues.

The negative-offset write into shared head/headroom appears to be from
8b796475fd78, while the shared-frag write, and thus possible
page-cache corruption, appears to be the later regression from
6c02568fd1ae. So it may be worth adding an additional Fixes tag for
6c02568fd1ae, and I think an explicit Cc: stable@vger.kernel.org would
also make sense here.

> Reported-by: Rajat Gupta <rajat.gupta@oss.qualcomm.com>
> Reported-by: Yiming Qian <yimingqian591@gmail.com>
> Reported-by: Keenan Dong <keenanat2000@gmail.com>
> Reported-by: Han Guidong <2045gemini@gmail.com>
> Reported-by: Zhang Cen <rollkingzzc@gmail.com>
> Tested-by: Han Guidong <2045gemini@gmail.com>
> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
> Signed-off-by: Rajat Gupta <rajat.gupta@oss.qualcomm.com>
> ---
>  net/sched/act_pedit.c | 54 ++++++++++++++++++++++++++++++++-----------
>  1 file changed, 41 insertions(+), 13 deletions(-)
>
> diff --git a/net/sched/act_pedit.c b/net/sched/act_pedit.c
> index bc20f08a2..79921b8d8 100644
> --- a/net/sched/act_pedit.c
> +++ b/net/sched/act_pedit.c
> @@ -16,6 +16,7 @@
>  #include <linux/ip.h>
>  #include <linux/ipv6.h>
>  #include <linux/slab.h>
> +#include <linux/overflow.h>
>  #include <net/ipv6.h>
>  #include <net/netlink.h>
>  #include <net/pkt_sched.h>
> @@ -323,8 +324,10 @@ static bool offset_valid(struct sk_buff *skb, int offset)
>         if (offset > 0 && offset > skb->len)
>                 return false;
>
> -       if  (offset < 0 && -offset > skb_headroom(skb))
> -               return false;
> +       if (offset < 0) {
> +               if (offset == INT_MIN || -offset > skb_headroom(skb))
> +                       return false;
> +       }
>
>         return true;
>  }
> @@ -393,17 +396,21 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
>         struct tcf_pedit_key_ex *tkey_ex;
>         struct tcf_pedit_parms *parms;
>         struct tc_pedit_key *tkey;
> -       u32 max_offset;
>         int i;
>
>         parms = rcu_dereference_bh(p->parms);
>
> -       max_offset = (skb_transport_header_was_set(skb) ?
> -                     skb_transport_offset(skb) :
> -                     skb_network_offset(skb)) +
> -                    parms->tcfp_off_max_hint;
> -       if (skb_ensure_writable(skb, min(skb->len, max_offset)))
> -               goto done;
> +       /*
> +        * If the skb has shared frags the user is likely using zero-copy
> +        * (e.g. sendfile).  Those page frags may point to page-cache pages;
> +        * writing into them would silently corrupt the page cache.
> +        * Linearize so pedit operates on a private copy.
> +        * TL;DR: if you want zero-copy, don't use pedit.
> +        */
> +       if (skb_has_shared_frag(skb)) {
> +               if (__skb_linearize(skb))
> +                       goto bad;
> +       }
>
>         tcf_lastuse_update(&p->tcf_tm);
>         tcf_action_update_bstats(&p->common, skb);
> @@ -412,6 +419,7 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
>         tkey_ex = parms->tcfp_keys_ex;
>
>         for (i = parms->tcfp_nkeys; i > 0; i--, tkey++) {
> +               int write_offset, write_len;
>                 int offset = tkey->off;
>                 int hoffset = 0;
>                 u32 *ptr, hdata;
> @@ -451,12 +459,32 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
>                         }
>                 }
>
> -               if (!offset_valid(skb, hoffset + offset)) {
> -                       pr_info_ratelimited("tc action pedit offset %d out of bounds\n", hoffset + offset);
> +               if (unlikely(check_add_overflow(hoffset, offset,
> +                                               &write_offset))) {

I also noticed two remaining unchecked additions, e.g. hoffset +
tkey->at and offset += (*d & offmask) >> shift. These look more like
wrap/type-semantics cleanup than a practical signed-overflow/UB issue
to me, so I am OK with this patch as-is.

> +                       pr_info_ratelimited("tc action pedit offset overflow\n");
> +                       goto bad;
> +               }
> +
> +               if (!offset_valid(skb, write_offset)) {
> +                       pr_info_ratelimited("tc action pedit offset %d out of bounds\n",
> +                                           write_offset);
>                         goto bad;
>                 }
>
> -               ptr = skb_header_pointer(skb, hoffset + offset,
> +               if (write_offset < 0) {

There is also a small negative-offset 4-byte-write corner case, e.g.
an ETH-relative edit with hoffset=-14 and offset=12, giving a final
write_offset=-2; if the linear head after skb->data is only 1 byte,
that store spans -2..1. That seems unlikely in practice; fixing it
would be nice, but I would not hold this patch for it.

Reviewed-by: Han Guidong <2045gemini@gmail.com>


> +                       if (skb_cow(skb, -write_offset))
> +                               goto bad;
> +               } else {
> +                       if (unlikely(check_add_overflow(write_offset,
> +                                                       (int)sizeof(hdata),
> +                                                       &write_len)))
> +                               goto bad;
> +                       if (skb_ensure_writable(skb, min_t(int, skb->len,
> +                                                          write_len)))
> +                               goto bad;
> +               }
> +
> +               ptr = skb_header_pointer(skb, write_offset,
>                                          sizeof(hdata), &hdata);
>                 if (!ptr)
>                         goto bad;
> @@ -475,7 +503,7 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
>
>                 *ptr = ((*ptr & tkey->mask) ^ val);
>                 if (ptr == &hdata)
> -                       skb_store_bits(skb, hoffset + offset, ptr, 4);
> +                       skb_store_bits(skb, write_offset, ptr, sizeof(hdata));
>         }
>
>         goto done;
> --
> 2.51.2.windows.1
>

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2026-05-19 15:11 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-18  1:30 [PATCH net] net/sched: fix pedit partial COW leading to page cache Rajat Gupta
2026-05-18 13:10 ` Han Guidong
2026-05-18 13:31 ` Jamal Hadi Salim
2026-05-19  3:39   ` [PATCH net] net/sched: fix pedit partial COW leading to page cache corruption Rajat Gupta
2026-05-19 11:18     ` Toke Høiland-Jørgensen
2026-05-19 15:10     ` Han Guidong

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox