* [PATCH net] net/sched: fix pedit partial COW leading to page cache
@ 2026-05-18 1:30 Rajat Gupta
2026-05-18 13:10 ` Han Guidong
2026-05-18 13:31 ` Jamal Hadi Salim
0 siblings, 2 replies; 11+ messages in thread
From: Rajat Gupta @ 2026-05-18 1:30 UTC (permalink / raw)
To: netdev; +Cc: Jamal Hadi Salim, jiri, davem, Eric Dumazet, kuba, pabeni, horms
From f12ea6484dbb75d1c13495e921d0595532317f2a Mon Sep 17 00:00:00 2001
From: Rajat Gupta <rajat.gupta@oss.qualcomm.com>
Date: Sun, 17 May 2026 10:11:44 -0700
Subject: [PATCH net] net/sched: fix pedit partial COW leading to page cache
corruption
tcf_pedit_act() computes the COW range for skb_ensure_writable()
once before the key loop using tcfp_off_max_hint, but the hint does
not account for the runtime header offset added by typed keys. This
can leave part of the write region un-COW'd.
Fix by moving skb_ensure_writable() inside the per-key loop where
the actual write offset is known, and add overflow checking on the
offset arithmetic. For negative offsets (e.g. Ethernet header edits
at ingress), use skb_cow() to COW the headroom instead. Guard
offset_valid() against INT_MIN, where negation is undefined.
Additionally, linearize skbs with shared frags upfront to prevent
silent data corruption when pedit operates on zero-copy pages
(e.g. from sendfile).
Fixes: 8b796475fd78 ("net/sched: act_pedit: really ensure the skb is writable")
Reported-by: Rajat Gupta <rajat.gupta@oss.qualcomm.com>
Reported-by: Yiming Qian <yimingqian591@gmail.com>
Reported-by: Keenan Dong <keenanat2000@gmail.com>
Reported-by: Han Guidong <2045gemini@gmail.com>
Reported-by: Zhang Cen <rollkingzzc@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Rajat Gupta <rajat.gupta@oss.qualcomm.com>
---
net/sched/act_pedit.c | 52 ++++++++++++++++++++++++++++++++-----------
1 file changed, 39 insertions(+), 13 deletions(-)
diff --git a/net/sched/act_pedit.c b/net/sched/act_pedit.c
index bc20f08a2..e077a2e82 100644
--- a/net/sched/act_pedit.c
+++ b/net/sched/act_pedit.c
@@ -16,6 +16,7 @@
#include <linux/ip.h>
#include <linux/ipv6.h>
#include <linux/slab.h>
+#include <linux/overflow.h>
#include <net/ipv6.h>
#include <net/netlink.h>
#include <net/pkt_sched.h>
@@ -323,8 +324,10 @@ static bool offset_valid(struct sk_buff *skb, int offset)
if (offset > 0 && offset > skb->len)
return false;
- if (offset < 0 && -offset > skb_headroom(skb))
- return false;
+ if (offset < 0) {
+ if (offset == INT_MIN || -offset > skb_headroom(skb))
+ return false;
+ }
return true;
}
@@ -393,17 +396,19 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
struct tcf_pedit_key_ex *tkey_ex;
struct tcf_pedit_parms *parms;
struct tc_pedit_key *tkey;
- u32 max_offset;
int i;
parms = rcu_dereference_bh(p->parms);
- max_offset = (skb_transport_header_was_set(skb) ?
- skb_transport_offset(skb) :
- skb_network_offset(skb)) +
- parms->tcfp_off_max_hint;
- if (skb_ensure_writable(skb, min(skb->len, max_offset)))
- goto done;
+ /* If the skb has shared frags the user is likely using zero-copy
+ * (e.g. sendfile). Those page frags may point to page-cache pages;
+ * writing into them would silently corrupt the page cache.
+ * Linearize so pedit operates on a private copy.
+ * If you want zero-copy, don't use pedit.
+ TL;DR if you want to use ZC, don't use pedit*/
+ if (skb_has_shared_frag(skb)) {
+ if (__skb_linearize(skb))
+ goto bad;
+ }
tcf_lastuse_update(&p->tcf_tm);
tcf_action_update_bstats(&p->common, skb);
@@ -414,6 +419,7 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
for (i = parms->tcfp_nkeys; i > 0; i--, tkey++) {
int offset = tkey->off;
int hoffset = 0;
+ int write_offset, write_len;
u32 *ptr, hdata;
u32 val;
int rc;
@@ -451,12 +457,32 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
}
}
- if (!offset_valid(skb, hoffset + offset)) {
- pr_info_ratelimited("tc action pedit offset %d out of bounds\n", hoffset + offset);
+ if (unlikely(check_add_overflow(hoffset, offset,
+ &write_offset))) {
+ pr_info_ratelimited("tc action pedit offset overflow\n");
+ goto bad;
+ }
+
+ if (!offset_valid(skb, write_offset)) {
+ pr_info_ratelimited("tc action pedit offset %d out of bounds\n",
+ write_offset);
goto bad;
}
- ptr = skb_header_pointer(skb, hoffset + offset,
+ if (write_offset < 0) {
+ if (skb_cow(skb, -write_offset))
+ goto bad;
+ } else {
+ if (unlikely(check_add_overflow(write_offset,
+ (int)sizeof(hdata),
+ &write_len)))
+ goto bad;
+ if (skb_ensure_writable(skb, min_t(int, skb->len,
+ write_len)))
+ goto bad;
+ }
+
+ ptr = skb_header_pointer(skb, write_offset,
sizeof(hdata), &hdata);
if (!ptr)
goto bad;
@@ -475,7 +501,7 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
*ptr = ((*ptr & tkey->mask) ^ val);
if (ptr == &hdata)
- skb_store_bits(skb, hoffset + offset, ptr, 4);
+ skb_store_bits(skb, write_offset, ptr, sizeof(hdata));
}
goto done;
--
2.51.2.windows.1
Thank you,
Rajat
^ permalink raw reply related [flat|nested] 11+ messages in thread* Re: [PATCH net] net/sched: fix pedit partial COW leading to page cache 2026-05-18 1:30 [PATCH net] net/sched: fix pedit partial COW leading to page cache Rajat Gupta @ 2026-05-18 13:10 ` Han Guidong 2026-05-18 13:31 ` Jamal Hadi Salim 1 sibling, 0 replies; 11+ messages in thread From: Han Guidong @ 2026-05-18 13:10 UTC (permalink / raw) To: rajat.gupta Cc: davem, edumazet, horms, jhs, jiri, kuba, netdev, pabeni, Han Guidong (Resending to fix broken email threading. Sorry for the noise!) On Sun, May 17, 2026 at 06:30:31PM -0700, Rajat Gupta wrote: > >From f12ea6484dbb75d1c13495e921d0595532317f2a Mon Sep 17 00:00:00 2001 > From: Rajat Gupta <rajat.gupta@oss.qualcomm.com> > Date: Sun, 17 May 2026 10:11:44 -0700 > Subject: [PATCH net] net/sched: fix pedit partial COW leading to page cache > corruption > > tcf_pedit_act() computes the COW range for skb_ensure_writable() > once before the key loop using tcfp_off_max_hint, but the hint does > not account for the runtime header offset added by typed keys. This > can leave part of the write region un-COW'd. > > Fix by moving skb_ensure_writable() inside the per-key loop where > the actual write offset is known, and add overflow checking on the > offset arithmetic. For negative offsets (e.g. Ethernet header edits > at ingress), use skb_cow() to COW the headroom instead. Guard > offset_valid() against INT_MIN, where negation is undefined. > > Additionally, linearize skbs with shared frags upfront to prevent > silent data corruption when pedit operates on zero-copy pages > (e.g. from sendfile). > > Fixes: 8b796475fd78 ("net/sched: act_pedit: really ensure the skb is writable") > Reported-by: Rajat Gupta <rajat.gupta@oss.qualcomm.com> > Reported-by: Yiming Qian <yimingqian591@gmail.com> > Reported-by: Keenan Dong <keenanat2000@gmail.com> > Reported-by: Han Guidong <2045gemini@gmail.com> > Reported-by: Zhang Cen <rollkingzzc@gmail.com> > Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> > Signed-off-by: Rajat Gupta <rajat.gupta@oss.qualcomm.com> Solid fix. Linearizing shared frags and dynamic COW checks completely resolve the offset drift and page cache corruption. Verified locally. Tested-by: Han Guidong <2045gemini@gmail.com> Reviewed-by: Han Guidong <2045gemini@gmail.com> > --- > net/sched/act_pedit.c | 52 ++++++++++++++++++++++++++++++++----------- > 1 file changed, 39 insertions(+), 13 deletions(-) > > diff --git a/net/sched/act_pedit.c b/net/sched/act_pedit.c > index bc20f08a2..e077a2e82 100644 > --- a/net/sched/act_pedit.c > +++ b/net/sched/act_pedit.c > @@ -16,6 +16,7 @@ > #include <linux/ip.h> > #include <linux/ipv6.h> > #include <linux/slab.h> > +#include <linux/overflow.h> > #include <net/ipv6.h> > #include <net/netlink.h> > #include <net/pkt_sched.h> > @@ -323,8 +324,10 @@ static bool offset_valid(struct sk_buff *skb, int offset) > if (offset > 0 && offset > skb->len) > return false; > > - if (offset < 0 && -offset > skb_headroom(skb)) > - return false; > + if (offset < 0) { > + if (offset == INT_MIN || -offset > skb_headroom(skb)) > + return false; > + } > > return true; > } > @@ -393,17 +396,19 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb, > struct tcf_pedit_key_ex *tkey_ex; > struct tcf_pedit_parms *parms; > struct tc_pedit_key *tkey; > - u32 max_offset; > int i; > > parms = rcu_dereference_bh(p->parms); > > - max_offset = (skb_transport_header_was_set(skb) ? > - skb_transport_offset(skb) : > - skb_network_offset(skb)) + > - parms->tcfp_off_max_hint; > - if (skb_ensure_writable(skb, min(skb->len, max_offset))) > - goto done; > + /* If the skb has shared frags the user is likely using zero-copy > + * (e.g. sendfile). Those page frags may point to page-cache pages; > + * writing into them would silently corrupt the page cache. > + * Linearize so pedit operates on a private copy. > + * If you want zero-copy, don't use pedit. > + TL;DR if you want to use ZC, don't use pedit*/ > + if (skb_has_shared_frag(skb)) { > + if (__skb_linearize(skb)) > + goto bad; > + } > > tcf_lastuse_update(&p->tcf_tm); > tcf_action_update_bstats(&p->common, skb); > @@ -414,6 +419,7 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb, > for (i = parms->tcfp_nkeys; i > 0; i--, tkey++) { > int offset = tkey->off; > int hoffset = 0; > + int write_offset, write_len; > u32 *ptr, hdata; > u32 val; > int rc; > @@ -451,12 +457,32 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb, > } > } > > - if (!offset_valid(skb, hoffset + offset)) { > - pr_info_ratelimited("tc action pedit offset %d out of bounds\n", hoffset + offset); > + if (unlikely(check_add_overflow(hoffset, offset, > + &write_offset))) { > + pr_info_ratelimited("tc action pedit offset overflow\n"); > + goto bad; > + } > + > + if (!offset_valid(skb, write_offset)) { > + pr_info_ratelimited("tc action pedit offset %d out of bounds\n", > + write_offset); > goto bad; > } > > - ptr = skb_header_pointer(skb, hoffset + offset, > + if (write_offset < 0) { > + if (skb_cow(skb, -write_offset)) > + goto bad; > + } else { > + if (unlikely(check_add_overflow(write_offset, > + (int)sizeof(hdata), > + &write_len))) > + goto bad; > + if (skb_ensure_writable(skb, min_t(int, skb->len, > + write_len))) > + goto bad; > + } > + > + ptr = skb_header_pointer(skb, write_offset, > sizeof(hdata), &hdata); > if (!ptr) > goto bad; > @@ -475,7 +501,7 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb, > > *ptr = ((*ptr & tkey->mask) ^ val); > if (ptr == &hdata) > - skb_store_bits(skb, hoffset + offset, ptr, 4); > + skb_store_bits(skb, write_offset, ptr, sizeof(hdata)); > } > > goto done; > -- > 2.51.2.windows.1 > > Thank you, > Rajat ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH net] net/sched: fix pedit partial COW leading to page cache 2026-05-18 1:30 [PATCH net] net/sched: fix pedit partial COW leading to page cache Rajat Gupta 2026-05-18 13:10 ` Han Guidong @ 2026-05-18 13:31 ` Jamal Hadi Salim 2026-05-19 3:39 ` [PATCH net] net/sched: fix pedit partial COW leading to page cache corruption Rajat Gupta 1 sibling, 1 reply; 11+ messages in thread From: Jamal Hadi Salim @ 2026-05-18 13:31 UTC (permalink / raw) To: Rajat Gupta; +Cc: netdev, jiri, davem, Eric Dumazet, kuba, pabeni, horms Hi Rajat, On Sun, May 17, 2026 at 9:30 PM Rajat Gupta <rajat.gupta@oss.qualcomm.com> wrote: > > From f12ea6484dbb75d1c13495e921d0595532317f2a Mon Sep 17 00:00:00 2001 > From: Rajat Gupta <rajat.gupta@oss.qualcomm.com> > Date: Sun, 17 May 2026 10:11:44 -0700 > Subject: [PATCH net] net/sched: fix pedit partial COW leading to page cache > corruption > > tcf_pedit_act() computes the COW range for skb_ensure_writable() > once before the key loop using tcfp_off_max_hint, but the hint does > not account for the runtime header offset added by typed keys. This > can leave part of the write region un-COW'd. > git am on this: https://patchwork.kernel.org/project/netdevbpf/patch/604ca54c-a94e-4ca4-83e7-4486d7392d71@oss.qualcomm.com/ Errors out. ---- Applying: net/sched: fix pedit partial COW leading to page cache corruption .git/rebase-apply/patch:49: trailing whitespace. * If you want zero-copy, don't use pedit. error: corrupt patch at .git/rebase-apply/patch:58 Patch failed at 0001 net/sched: fix pedit partial COW leading to page cache corruption ---- Those comments dont look clean - follow netdev rules. You should send v1. The rule is you wait 24 hours after the previous one before doing so (you can add whoever tests etc on v1) Also make sure this applies on net tree not net-next cheers, jamal > Fix by moving skb_ensure_writable() inside the per-key loop where > the actual write offset is known, and add overflow checking on the > offset arithmetic. For negative offsets (e.g. Ethernet header edits > at ingress), use skb_cow() to COW the headroom instead. Guard > offset_valid() against INT_MIN, where negation is undefined. > > Additionally, linearize skbs with shared frags upfront to prevent > silent data corruption when pedit operates on zero-copy pages > (e.g. from sendfile). > > Fixes: 8b796475fd78 ("net/sched: act_pedit: really ensure the skb is writable") > Reported-by: Rajat Gupta <rajat.gupta@oss.qualcomm.com> > Reported-by: Yiming Qian <yimingqian591@gmail.com> > Reported-by: Keenan Dong <keenanat2000@gmail.com> > Reported-by: Han Guidong <2045gemini@gmail.com> > Reported-by: Zhang Cen <rollkingzzc@gmail.com> > Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> > Signed-off-by: Rajat Gupta <rajat.gupta@oss.qualcomm.com> > --- > net/sched/act_pedit.c | 52 ++++++++++++++++++++++++++++++++----------- > 1 file changed, 39 insertions(+), 13 deletions(-) > > diff --git a/net/sched/act_pedit.c b/net/sched/act_pedit.c > index bc20f08a2..e077a2e82 100644 > --- a/net/sched/act_pedit.c > +++ b/net/sched/act_pedit.c > @@ -16,6 +16,7 @@ > #include <linux/ip.h> > #include <linux/ipv6.h> > #include <linux/slab.h> > +#include <linux/overflow.h> > #include <net/ipv6.h> > #include <net/netlink.h> > #include <net/pkt_sched.h> > @@ -323,8 +324,10 @@ static bool offset_valid(struct sk_buff *skb, int offset) > if (offset > 0 && offset > skb->len) > return false; > > - if (offset < 0 && -offset > skb_headroom(skb)) > - return false; > + if (offset < 0) { > + if (offset == INT_MIN || -offset > skb_headroom(skb)) > + return false; > + } > > return true; > } > @@ -393,17 +396,19 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb, > struct tcf_pedit_key_ex *tkey_ex; > struct tcf_pedit_parms *parms; > struct tc_pedit_key *tkey; > - u32 max_offset; > int i; > > parms = rcu_dereference_bh(p->parms); > > - max_offset = (skb_transport_header_was_set(skb) ? > - skb_transport_offset(skb) : > - skb_network_offset(skb)) + > - parms->tcfp_off_max_hint; > - if (skb_ensure_writable(skb, min(skb->len, max_offset))) > - goto done; > + /* If the skb has shared frags the user is likely using zero-copy > + * (e.g. sendfile). Those page frags may point to page-cache pages; > + * writing into them would silently corrupt the page cache. > + * Linearize so pedit operates on a private copy. > + * If you want zero-copy, don't use pedit. > + TL;DR if you want to use ZC, don't use pedit*/ > + if (skb_has_shared_frag(skb)) { > + if (__skb_linearize(skb)) > + goto bad; > + } > > tcf_lastuse_update(&p->tcf_tm); > tcf_action_update_bstats(&p->common, skb); > @@ -414,6 +419,7 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb, > for (i = parms->tcfp_nkeys; i > 0; i--, tkey++) { > int offset = tkey->off; > int hoffset = 0; > + int write_offset, write_len; > u32 *ptr, hdata; > u32 val; > int rc; > @@ -451,12 +457,32 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb, > } > } > > - if (!offset_valid(skb, hoffset + offset)) { > - pr_info_ratelimited("tc action pedit offset %d out of bounds\n", hoffset + offset); > + if (unlikely(check_add_overflow(hoffset, offset, > + &write_offset))) { > + pr_info_ratelimited("tc action pedit offset overflow\n"); > + goto bad; > + } > + > + if (!offset_valid(skb, write_offset)) { > + pr_info_ratelimited("tc action pedit offset %d out of bounds\n", > + write_offset); > goto bad; > } > > - ptr = skb_header_pointer(skb, hoffset + offset, > + if (write_offset < 0) { > + if (skb_cow(skb, -write_offset)) > + goto bad; > + } else { > + if (unlikely(check_add_overflow(write_offset, > + (int)sizeof(hdata), > + &write_len))) > + goto bad; > + if (skb_ensure_writable(skb, min_t(int, skb->len, > + write_len))) > + goto bad; > + } > + > + ptr = skb_header_pointer(skb, write_offset, > sizeof(hdata), &hdata); > if (!ptr) > goto bad; > @@ -475,7 +501,7 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb, > > *ptr = ((*ptr & tkey->mask) ^ val); > if (ptr == &hdata) > - skb_store_bits(skb, hoffset + offset, ptr, 4); > + skb_store_bits(skb, write_offset, ptr, sizeof(hdata)); > } > > goto done; > -- > 2.51.2.windows.1 > > Thank you, > Rajat ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH net] net/sched: fix pedit partial COW leading to page cache corruption 2026-05-18 13:31 ` Jamal Hadi Salim @ 2026-05-19 3:39 ` Rajat Gupta 2026-05-19 11:18 ` Toke Høiland-Jørgensen ` (2 more replies) 0 siblings, 3 replies; 11+ messages in thread From: Rajat Gupta @ 2026-05-19 3:39 UTC (permalink / raw) To: netdev Cc: davem, edumazet, kuba, pabeni, horms, jhs, jiri, yimingqian591, keenanat2000, 2045gemini, rollkingzzc, Rajat Gupta tcf_pedit_act() computes the COW range for skb_ensure_writable() once before the key loop using tcfp_off_max_hint, but the hint does not account for the runtime header offset added by typed keys. This can leave part of the write region un-COW'd. Fix by moving skb_ensure_writable() inside the per-key loop where the actual write offset is known, and add overflow checking on the offset arithmetic. For negative offsets (e.g. Ethernet header edits at ingress), use skb_cow() to COW the headroom instead. Guard offset_valid() against INT_MIN, where negation is undefined. Additionally, linearize skbs with shared frags upfront to prevent silent data corruption when pedit operates on zero-copy pages (e.g. from sendfile). Fixes: 8b796475fd78 ("net/sched: act_pedit: really ensure the skb is writable") Reported-by: Rajat Gupta <rajat.gupta@oss.qualcomm.com> Reported-by: Yiming Qian <yimingqian591@gmail.com> Reported-by: Keenan Dong <keenanat2000@gmail.com> Reported-by: Han Guidong <2045gemini@gmail.com> Reported-by: Zhang Cen <rollkingzzc@gmail.com> Tested-by: Han Guidong <2045gemini@gmail.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Rajat Gupta <rajat.gupta@oss.qualcomm.com> --- net/sched/act_pedit.c | 54 ++++++++++++++++++++++++++++++++----------- 1 file changed, 41 insertions(+), 13 deletions(-) diff --git a/net/sched/act_pedit.c b/net/sched/act_pedit.c index bc20f08a2..79921b8d8 100644 --- a/net/sched/act_pedit.c +++ b/net/sched/act_pedit.c @@ -16,6 +16,7 @@ #include <linux/ip.h> #include <linux/ipv6.h> #include <linux/slab.h> +#include <linux/overflow.h> #include <net/ipv6.h> #include <net/netlink.h> #include <net/pkt_sched.h> @@ -323,8 +324,10 @@ static bool offset_valid(struct sk_buff *skb, int offset) if (offset > 0 && offset > skb->len) return false; - if (offset < 0 && -offset > skb_headroom(skb)) - return false; + if (offset < 0) { + if (offset == INT_MIN || -offset > skb_headroom(skb)) + return false; + } return true; } @@ -393,17 +396,21 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb, struct tcf_pedit_key_ex *tkey_ex; struct tcf_pedit_parms *parms; struct tc_pedit_key *tkey; - u32 max_offset; int i; parms = rcu_dereference_bh(p->parms); - max_offset = (skb_transport_header_was_set(skb) ? - skb_transport_offset(skb) : - skb_network_offset(skb)) + - parms->tcfp_off_max_hint; - if (skb_ensure_writable(skb, min(skb->len, max_offset))) - goto done; + /* + * If the skb has shared frags the user is likely using zero-copy + * (e.g. sendfile). Those page frags may point to page-cache pages; + * writing into them would silently corrupt the page cache. + * Linearize so pedit operates on a private copy. + * TL;DR: if you want zero-copy, don't use pedit. + */ + if (skb_has_shared_frag(skb)) { + if (__skb_linearize(skb)) + goto bad; + } tcf_lastuse_update(&p->tcf_tm); tcf_action_update_bstats(&p->common, skb); @@ -412,6 +419,7 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb, tkey_ex = parms->tcfp_keys_ex; for (i = parms->tcfp_nkeys; i > 0; i--, tkey++) { + int write_offset, write_len; int offset = tkey->off; int hoffset = 0; u32 *ptr, hdata; @@ -451,12 +459,32 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb, } } - if (!offset_valid(skb, hoffset + offset)) { - pr_info_ratelimited("tc action pedit offset %d out of bounds\n", hoffset + offset); + if (unlikely(check_add_overflow(hoffset, offset, + &write_offset))) { + pr_info_ratelimited("tc action pedit offset overflow\n"); + goto bad; + } + + if (!offset_valid(skb, write_offset)) { + pr_info_ratelimited("tc action pedit offset %d out of bounds\n", + write_offset); goto bad; } - ptr = skb_header_pointer(skb, hoffset + offset, + if (write_offset < 0) { + if (skb_cow(skb, -write_offset)) + goto bad; + } else { + if (unlikely(check_add_overflow(write_offset, + (int)sizeof(hdata), + &write_len))) + goto bad; + if (skb_ensure_writable(skb, min_t(int, skb->len, + write_len))) + goto bad; + } + + ptr = skb_header_pointer(skb, write_offset, sizeof(hdata), &hdata); if (!ptr) goto bad; @@ -475,7 +503,7 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb, *ptr = ((*ptr & tkey->mask) ^ val); if (ptr == &hdata) - skb_store_bits(skb, hoffset + offset, ptr, 4); + skb_store_bits(skb, write_offset, ptr, sizeof(hdata)); } goto done; -- 2.51.2.windows.1 ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH net] net/sched: fix pedit partial COW leading to page cache corruption 2026-05-19 3:39 ` [PATCH net] net/sched: fix pedit partial COW leading to page cache corruption Rajat Gupta @ 2026-05-19 11:18 ` Toke Høiland-Jørgensen 2026-05-19 15:10 ` Han Guidong 2026-05-20 9:23 ` Jamal Hadi Salim 2 siblings, 0 replies; 11+ messages in thread From: Toke Høiland-Jørgensen @ 2026-05-19 11:18 UTC (permalink / raw) To: Rajat Gupta, netdev Cc: davem, edumazet, kuba, pabeni, horms, jhs, jiri, yimingqian591, keenanat2000, 2045gemini, rollkingzzc, Rajat Gupta Rajat Gupta <rajat.gupta@oss.qualcomm.com> writes: > tcf_pedit_act() computes the COW range for skb_ensure_writable() > once before the key loop using tcfp_off_max_hint, but the hint does > not account for the runtime header offset added by typed keys. This > can leave part of the write region un-COW'd. > > Fix by moving skb_ensure_writable() inside the per-key loop where > the actual write offset is known, and add overflow checking on the > offset arithmetic. For negative offsets (e.g. Ethernet header edits > at ingress), use skb_cow() to COW the headroom instead. Guard > offset_valid() against INT_MIN, where negation is undefined. > > Additionally, linearize skbs with shared frags upfront to prevent > silent data corruption when pedit operates on zero-copy pages > (e.g. from sendfile). > > Fixes: 8b796475fd78 ("net/sched: act_pedit: really ensure the skb is writable") > Reported-by: Rajat Gupta <rajat.gupta@oss.qualcomm.com> > Reported-by: Yiming Qian <yimingqian591@gmail.com> > Reported-by: Keenan Dong <keenanat2000@gmail.com> > Reported-by: Han Guidong <2045gemini@gmail.com> > Reported-by: Zhang Cen <rollkingzzc@gmail.com> > Tested-by: Han Guidong <2045gemini@gmail.com> > Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> > Signed-off-by: Rajat Gupta <rajat.gupta@oss.qualcomm.com> Reviewed-by: Toke Høiland-Jørgensen <toke@kernel.org> Also applied this to -net and ran the TC pedit selftests and the pedit_* scripts in net/forwarding, none of which turned up any regressions, so: Tested-by: Toke Høiland-Jørgensen <toke@kernel.org> ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH net] net/sched: fix pedit partial COW leading to page cache corruption 2026-05-19 3:39 ` [PATCH net] net/sched: fix pedit partial COW leading to page cache corruption Rajat Gupta 2026-05-19 11:18 ` Toke Høiland-Jørgensen @ 2026-05-19 15:10 ` Han Guidong 2026-05-20 9:12 ` Jamal Hadi Salim 2026-05-20 9:23 ` Jamal Hadi Salim 2 siblings, 1 reply; 11+ messages in thread From: Han Guidong @ 2026-05-19 15:10 UTC (permalink / raw) To: Rajat Gupta Cc: netdev, davem, edumazet, kuba, pabeni, horms, jhs, jiri, yimingqian591, keenanat2000, rollkingzzc On Tue, May 19, 2026 at 11:42 AM Rajat Gupta <rajat.gupta@oss.qualcomm.com> wrote: > > tcf_pedit_act() computes the COW range for skb_ensure_writable() > once before the key loop using tcfp_off_max_hint, but the hint does > not account for the runtime header offset added by typed keys. This > can leave part of the write region un-COW'd. > > Fix by moving skb_ensure_writable() inside the per-key loop where > the actual write offset is known, and add overflow checking on the > offset arithmetic. For negative offsets (e.g. Ethernet header edits > at ingress), use skb_cow() to COW the headroom instead. Guard > offset_valid() against INT_MIN, where negation is undefined. > > Additionally, linearize skbs with shared frags upfront to prevent > silent data corruption when pedit operates on zero-copy pages > (e.g. from sendfile). > > Fixes: 8b796475fd78 ("net/sched: act_pedit: really ensure the skb is writable") Fixes: 6c02568fd1ae ("net/sched: act_pedit: Parse L3 Header for L4 offset") Cc: stable@vger.kernel.org I took a closer look at the code, and this patch is really addressing two issues. The negative-offset write into shared head/headroom appears to be from 8b796475fd78, while the shared-frag write, and thus possible page-cache corruption, appears to be the later regression from 6c02568fd1ae. So it may be worth adding an additional Fixes tag for 6c02568fd1ae, and I think an explicit Cc: stable@vger.kernel.org would also make sense here. > Reported-by: Rajat Gupta <rajat.gupta@oss.qualcomm.com> > Reported-by: Yiming Qian <yimingqian591@gmail.com> > Reported-by: Keenan Dong <keenanat2000@gmail.com> > Reported-by: Han Guidong <2045gemini@gmail.com> > Reported-by: Zhang Cen <rollkingzzc@gmail.com> > Tested-by: Han Guidong <2045gemini@gmail.com> > Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> > Signed-off-by: Rajat Gupta <rajat.gupta@oss.qualcomm.com> > --- > net/sched/act_pedit.c | 54 ++++++++++++++++++++++++++++++++----------- > 1 file changed, 41 insertions(+), 13 deletions(-) > > diff --git a/net/sched/act_pedit.c b/net/sched/act_pedit.c > index bc20f08a2..79921b8d8 100644 > --- a/net/sched/act_pedit.c > +++ b/net/sched/act_pedit.c > @@ -16,6 +16,7 @@ > #include <linux/ip.h> > #include <linux/ipv6.h> > #include <linux/slab.h> > +#include <linux/overflow.h> > #include <net/ipv6.h> > #include <net/netlink.h> > #include <net/pkt_sched.h> > @@ -323,8 +324,10 @@ static bool offset_valid(struct sk_buff *skb, int offset) > if (offset > 0 && offset > skb->len) > return false; > > - if (offset < 0 && -offset > skb_headroom(skb)) > - return false; > + if (offset < 0) { > + if (offset == INT_MIN || -offset > skb_headroom(skb)) > + return false; > + } > > return true; > } > @@ -393,17 +396,21 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb, > struct tcf_pedit_key_ex *tkey_ex; > struct tcf_pedit_parms *parms; > struct tc_pedit_key *tkey; > - u32 max_offset; > int i; > > parms = rcu_dereference_bh(p->parms); > > - max_offset = (skb_transport_header_was_set(skb) ? > - skb_transport_offset(skb) : > - skb_network_offset(skb)) + > - parms->tcfp_off_max_hint; > - if (skb_ensure_writable(skb, min(skb->len, max_offset))) > - goto done; > + /* > + * If the skb has shared frags the user is likely using zero-copy > + * (e.g. sendfile). Those page frags may point to page-cache pages; > + * writing into them would silently corrupt the page cache. > + * Linearize so pedit operates on a private copy. > + * TL;DR: if you want zero-copy, don't use pedit. > + */ > + if (skb_has_shared_frag(skb)) { > + if (__skb_linearize(skb)) > + goto bad; > + } > > tcf_lastuse_update(&p->tcf_tm); > tcf_action_update_bstats(&p->common, skb); > @@ -412,6 +419,7 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb, > tkey_ex = parms->tcfp_keys_ex; > > for (i = parms->tcfp_nkeys; i > 0; i--, tkey++) { > + int write_offset, write_len; > int offset = tkey->off; > int hoffset = 0; > u32 *ptr, hdata; > @@ -451,12 +459,32 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb, > } > } > > - if (!offset_valid(skb, hoffset + offset)) { > - pr_info_ratelimited("tc action pedit offset %d out of bounds\n", hoffset + offset); > + if (unlikely(check_add_overflow(hoffset, offset, > + &write_offset))) { I also noticed two remaining unchecked additions, e.g. hoffset + tkey->at and offset += (*d & offmask) >> shift. These look more like wrap/type-semantics cleanup than a practical signed-overflow/UB issue to me, so I am OK with this patch as-is. > + pr_info_ratelimited("tc action pedit offset overflow\n"); > + goto bad; > + } > + > + if (!offset_valid(skb, write_offset)) { > + pr_info_ratelimited("tc action pedit offset %d out of bounds\n", > + write_offset); > goto bad; > } > > - ptr = skb_header_pointer(skb, hoffset + offset, > + if (write_offset < 0) { There is also a small negative-offset 4-byte-write corner case, e.g. an ETH-relative edit with hoffset=-14 and offset=12, giving a final write_offset=-2; if the linear head after skb->data is only 1 byte, that store spans -2..1. That seems unlikely in practice; fixing it would be nice, but I would not hold this patch for it. Reviewed-by: Han Guidong <2045gemini@gmail.com> > + if (skb_cow(skb, -write_offset)) > + goto bad; > + } else { > + if (unlikely(check_add_overflow(write_offset, > + (int)sizeof(hdata), > + &write_len))) > + goto bad; > + if (skb_ensure_writable(skb, min_t(int, skb->len, > + write_len))) > + goto bad; > + } > + > + ptr = skb_header_pointer(skb, write_offset, > sizeof(hdata), &hdata); > if (!ptr) > goto bad; > @@ -475,7 +503,7 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb, > > *ptr = ((*ptr & tkey->mask) ^ val); > if (ptr == &hdata) > - skb_store_bits(skb, hoffset + offset, ptr, 4); > + skb_store_bits(skb, write_offset, ptr, sizeof(hdata)); > } > > goto done; > -- > 2.51.2.windows.1 > ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH net] net/sched: fix pedit partial COW leading to page cache corruption 2026-05-19 15:10 ` Han Guidong @ 2026-05-20 9:12 ` Jamal Hadi Salim 2026-05-20 10:04 ` Han Guidong 2026-05-20 10:36 ` Han Guidong 0 siblings, 2 replies; 11+ messages in thread From: Jamal Hadi Salim @ 2026-05-20 9:12 UTC (permalink / raw) To: Han Guidong Cc: Rajat Gupta, netdev, davem, edumazet, kuba, pabeni, horms, jiri, yimingqian591, keenanat2000, rollkingzzc On Tue, May 19, 2026 at 11:11 AM Han Guidong <2045gemini@gmail.com> wrote: > > On Tue, May 19, 2026 at 11:42 AM Rajat Gupta > <rajat.gupta@oss.qualcomm.com> wrote: > > > > tcf_pedit_act() computes the COW range for skb_ensure_writable() > > once before the key loop using tcfp_off_max_hint, but the hint does > > not account for the runtime header offset added by typed keys. This > > can leave part of the write region un-COW'd. > > > > Fix by moving skb_ensure_writable() inside the per-key loop where > > the actual write offset is known, and add overflow checking on the > > offset arithmetic. For negative offsets (e.g. Ethernet header edits > > at ingress), use skb_cow() to COW the headroom instead. Guard > > offset_valid() against INT_MIN, where negation is undefined. > > > > Additionally, linearize skbs with shared frags upfront to prevent > > silent data corruption when pedit operates on zero-copy pages > > (e.g. from sendfile). > > > > Fixes: 8b796475fd78 ("net/sched: act_pedit: really ensure the skb is writable") > > Fixes: 6c02568fd1ae ("net/sched: act_pedit: Parse L3 Header for L4 offset") > Cc: stable@vger.kernel.org > > I took a closer look at the code, and this patch is really addressing > two issues. > > The negative-offset write into shared head/headroom appears to be from > 8b796475fd78, while the shared-frag write, and thus possible > page-cache corruption, appears to be the later regression from > 6c02568fd1ae. So it may be worth adding an additional Fixes tag for > 6c02568fd1ae, Pushing it a little - but sure we should also point to that commit. > and I think an explicit Cc: stable@vger.kernel.org would also make sense here. Please study the rules. > > Reported-by: Rajat Gupta <rajat.gupta@oss.qualcomm.com> > > Reported-by: Yiming Qian <yimingqian591@gmail.com> > > Reported-by: Keenan Dong <keenanat2000@gmail.com> > > Reported-by: Han Guidong <2045gemini@gmail.com> > > Reported-by: Zhang Cen <rollkingzzc@gmail.com> > > Tested-by: Han Guidong <2045gemini@gmail.com> > > Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> > > Signed-off-by: Rajat Gupta <rajat.gupta@oss.qualcomm.com> > > --- > > net/sched/act_pedit.c | 54 ++++++++++++++++++++++++++++++++----------- > > 1 file changed, 41 insertions(+), 13 deletions(-) > > > > diff --git a/net/sched/act_pedit.c b/net/sched/act_pedit.c > > index bc20f08a2..79921b8d8 100644 > > --- a/net/sched/act_pedit.c > > +++ b/net/sched/act_pedit.c > > @@ -16,6 +16,7 @@ > > #include <linux/ip.h> > > #include <linux/ipv6.h> > > #include <linux/slab.h> > > +#include <linux/overflow.h> > > #include <net/ipv6.h> > > #include <net/netlink.h> > > #include <net/pkt_sched.h> > > @@ -323,8 +324,10 @@ static bool offset_valid(struct sk_buff *skb, int offset) > > if (offset > 0 && offset > skb->len) > > return false; > > > > - if (offset < 0 && -offset > skb_headroom(skb)) > > - return false; > > + if (offset < 0) { > > + if (offset == INT_MIN || -offset > skb_headroom(skb)) > > + return false; > > + } > > > > return true; > > } > > @@ -393,17 +396,21 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb, > > struct tcf_pedit_key_ex *tkey_ex; > > struct tcf_pedit_parms *parms; > > struct tc_pedit_key *tkey; > > - u32 max_offset; > > int i; > > > > parms = rcu_dereference_bh(p->parms); > > > > - max_offset = (skb_transport_header_was_set(skb) ? > > - skb_transport_offset(skb) : > > - skb_network_offset(skb)) + > > - parms->tcfp_off_max_hint; > > - if (skb_ensure_writable(skb, min(skb->len, max_offset))) > > - goto done; > > + /* > > + * If the skb has shared frags the user is likely using zero-copy > > + * (e.g. sendfile). Those page frags may point to page-cache pages; > > + * writing into them would silently corrupt the page cache. > > + * Linearize so pedit operates on a private copy. > > + * TL;DR: if you want zero-copy, don't use pedit. > > + */ > > + if (skb_has_shared_frag(skb)) { > > + if (__skb_linearize(skb)) > > + goto bad; > > + } > > > > tcf_lastuse_update(&p->tcf_tm); > > tcf_action_update_bstats(&p->common, skb); > > @@ -412,6 +419,7 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb, > > tkey_ex = parms->tcfp_keys_ex; > > > > for (i = parms->tcfp_nkeys; i > 0; i--, tkey++) { > > + int write_offset, write_len; > > int offset = tkey->off; > > int hoffset = 0; > > u32 *ptr, hdata; > > @@ -451,12 +459,32 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb, > > } > > } > > > > - if (!offset_valid(skb, hoffset + offset)) { > > - pr_info_ratelimited("tc action pedit offset %d out of bounds\n", hoffset + offset); > > + if (unlikely(check_add_overflow(hoffset, offset, > > + &write_offset))) { > > I also noticed two remaining unchecked additions, e.g. hoffset + > tkey->at and offset += (*d & offmask) >> shift. These look more like > wrap/type-semantics cleanup than a practical signed-overflow/UB issue > to me, so I am OK with this patch as-is. Did you get this from some AI? None of these are possible. Both additions you mention are safe in practice because any potential wrap around or overflow is caught _before any memory access occurs_. Example see: offset_valid() which implictly casts hoffset + key to signed int. > > + pr_info_ratelimited("tc action pedit offset overflow\n"); > > + goto bad; > > + } > > + > > + if (!offset_valid(skb, write_offset)) { > > + pr_info_ratelimited("tc action pedit offset %d out of bounds\n", > > + write_offset); > > goto bad; > > } > > > > - ptr = skb_header_pointer(skb, hoffset + offset, > > + if (write_offset < 0) { > > There is also a small negative-offset 4-byte-write corner case, e.g. > an ETH-relative edit with hoffset=-14 and offset=12, giving a final > write_offset=-2; if the linear head after skb->data is only 1 byte, > that store spans -2..1. That seems unlikely in practice; fixing it > would be nice, but I would not hold this patch for it. > Again smells AI generated. I spent a lot of time trying to determine the practicality and validity of this (and your previous claim), only to find that this whole claim is based on a view that a headroom of 1 byte is a valid scenario in practice. Is it theoretically possible from the math? yes. Practically, I don't see how it makes any sense. Can you describe a scenario where this would happen in practise? > Reviewed-by: Han Guidong <2045gemini@gmail.com> More like reviewed by my AI who goes by the name foobar. cheers, jamal > > > > + if (skb_cow(skb, -write_offset)) > > + goto bad; > > + } else { > > + if (unlikely(check_add_overflow(write_offset, > > + (int)sizeof(hdata), > > + &write_len))) > > + goto bad; > > + if (skb_ensure_writable(skb, min_t(int, skb->len, > > + write_len))) > > + goto bad; > > + } > > + > > + ptr = skb_header_pointer(skb, write_offset, > > sizeof(hdata), &hdata); > > if (!ptr) > > goto bad; > > @@ -475,7 +503,7 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb, > > > > *ptr = ((*ptr & tkey->mask) ^ val); > > if (ptr == &hdata) > > - skb_store_bits(skb, hoffset + offset, ptr, 4); > > + skb_store_bits(skb, write_offset, ptr, sizeof(hdata)); > > } > > > > goto done; > > -- > > 2.51.2.windows.1 > > ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH net] net/sched: fix pedit partial COW leading to page cache corruption 2026-05-20 9:12 ` Jamal Hadi Salim @ 2026-05-20 10:04 ` Han Guidong 2026-05-20 10:36 ` Han Guidong 1 sibling, 0 replies; 11+ messages in thread From: Han Guidong @ 2026-05-20 10:04 UTC (permalink / raw) To: Jamal Hadi Salim Cc: Rajat Gupta, netdev, davem, edumazet, kuba, pabeni, horms, jiri, yimingqian591, keenanat2000, rollkingzzc On Wed, May 20, 2026 at 5:12 PM Jamal Hadi Salim <jhs@mojatatu.com> wrote: > > On Tue, May 19, 2026 at 11:11 AM Han Guidong <2045gemini@gmail.com> wrote: > > > > On Tue, May 19, 2026 at 11:42 AM Rajat Gupta > > <rajat.gupta@oss.qualcomm.com> wrote: > > > > > > tcf_pedit_act() computes the COW range for skb_ensure_writable() > > > once before the key loop using tcfp_off_max_hint, but the hint does > > > not account for the runtime header offset added by typed keys. This > > > can leave part of the write region un-COW'd. > > > > > > Fix by moving skb_ensure_writable() inside the per-key loop where > > > the actual write offset is known, and add overflow checking on the > > > offset arithmetic. For negative offsets (e.g. Ethernet header edits > > > at ingress), use skb_cow() to COW the headroom instead. Guard > > > offset_valid() against INT_MIN, where negation is undefined. > > > > > > Additionally, linearize skbs with shared frags upfront to prevent > > > silent data corruption when pedit operates on zero-copy pages > > > (e.g. from sendfile). > > > > > > Fixes: 8b796475fd78 ("net/sched: act_pedit: really ensure the skb is writable") > > > > Fixes: 6c02568fd1ae ("net/sched: act_pedit: Parse L3 Header for L4 offset") > > Cc: stable@vger.kernel.org > > > > I took a closer look at the code, and this patch is really addressing > > two issues. > > > > The negative-offset write into shared head/headroom appears to be from > > 8b796475fd78, while the shared-frag write, and thus possible > > page-cache corruption, appears to be the later regression from > > 6c02568fd1ae. So it may be worth adding an additional Fixes tag for > > 6c02568fd1ae, > > Pushing it a little - but sure we should also point to that commit. > > > and I think an explicit Cc: stable@vger.kernel.org would also make sense here. > > Please study the rules. I will review the rules. Sorry for the noise. > > > > Reported-by: Rajat Gupta <rajat.gupta@oss.qualcomm.com> > > > Reported-by: Yiming Qian <yimingqian591@gmail.com> > > > Reported-by: Keenan Dong <keenanat2000@gmail.com> > > > Reported-by: Han Guidong <2045gemini@gmail.com> > > > Reported-by: Zhang Cen <rollkingzzc@gmail.com> > > > Tested-by: Han Guidong <2045gemini@gmail.com> > > > Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> > > > Signed-off-by: Rajat Gupta <rajat.gupta@oss.qualcomm.com> > > > --- > > > net/sched/act_pedit.c | 54 ++++++++++++++++++++++++++++++++----------- > > > 1 file changed, 41 insertions(+), 13 deletions(-) > > > > > > diff --git a/net/sched/act_pedit.c b/net/sched/act_pedit.c > > > index bc20f08a2..79921b8d8 100644 > > > --- a/net/sched/act_pedit.c > > > +++ b/net/sched/act_pedit.c > > > @@ -16,6 +16,7 @@ > > > #include <linux/ip.h> > > > #include <linux/ipv6.h> > > > #include <linux/slab.h> > > > +#include <linux/overflow.h> > > > #include <net/ipv6.h> > > > #include <net/netlink.h> > > > #include <net/pkt_sched.h> > > > @@ -323,8 +324,10 @@ static bool offset_valid(struct sk_buff *skb, int offset) > > > if (offset > 0 && offset > skb->len) > > > return false; > > > > > > - if (offset < 0 && -offset > skb_headroom(skb)) > > > - return false; > > > + if (offset < 0) { > > > + if (offset == INT_MIN || -offset > skb_headroom(skb)) > > > + return false; > > > + } > > > > > > return true; > > > } > > > @@ -393,17 +396,21 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb, > > > struct tcf_pedit_key_ex *tkey_ex; > > > struct tcf_pedit_parms *parms; > > > struct tc_pedit_key *tkey; > > > - u32 max_offset; > > > int i; > > > > > > parms = rcu_dereference_bh(p->parms); > > > > > > - max_offset = (skb_transport_header_was_set(skb) ? > > > - skb_transport_offset(skb) : > > > - skb_network_offset(skb)) + > > > - parms->tcfp_off_max_hint; > > > - if (skb_ensure_writable(skb, min(skb->len, max_offset))) > > > - goto done; > > > + /* > > > + * If the skb has shared frags the user is likely using zero-copy > > > + * (e.g. sendfile). Those page frags may point to page-cache pages; > > > + * writing into them would silently corrupt the page cache. > > > + * Linearize so pedit operates on a private copy. > > > + * TL;DR: if you want zero-copy, don't use pedit. > > > + */ > > > + if (skb_has_shared_frag(skb)) { > > > + if (__skb_linearize(skb)) > > > + goto bad; > > > + } > > > > > > tcf_lastuse_update(&p->tcf_tm); > > > tcf_action_update_bstats(&p->common, skb); > > > @@ -412,6 +419,7 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb, > > > tkey_ex = parms->tcfp_keys_ex; > > > > > > for (i = parms->tcfp_nkeys; i > 0; i--, tkey++) { > > > + int write_offset, write_len; > > > int offset = tkey->off; > > > int hoffset = 0; > > > u32 *ptr, hdata; > > > @@ -451,12 +459,32 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb, > > > } > > > } > > > > > > - if (!offset_valid(skb, hoffset + offset)) { > > > - pr_info_ratelimited("tc action pedit offset %d out of bounds\n", hoffset + offset); > > > + if (unlikely(check_add_overflow(hoffset, offset, > > > + &write_offset))) { > > > > I also noticed two remaining unchecked additions, e.g. hoffset + > > tkey->at and offset += (*d & offmask) >> shift. These look more like > > wrap/type-semantics cleanup than a practical signed-overflow/UB issue > > to me, so I am OK with this patch as-is. > > Did you get this from some AI? > None of these are possible. Both additions you mention are safe in > practice because any potential wrap around or overflow is caught > _before any memory access occurs_. Example see: > offset_valid() which implictly casts hoffset + key to signed int. Not AI-generated. I was looking at the code and noticed that tc_pedit_key fields are unsigned, but the results are stored in int offset and int hoffset. I was just thinking about odd edge cases where an unsigned-to-signed conversion results in a negative number. I agree with you that offset_valid() catches this anyway, so it causes no actual harm. > > > > + pr_info_ratelimited("tc action pedit offset overflow\n"); > > > + goto bad; > > > + } > > > + > > > + if (!offset_valid(skb, write_offset)) { > > > + pr_info_ratelimited("tc action pedit offset %d out of bounds\n", > > > + write_offset); > > > goto bad; > > > } > > > > > > - ptr = skb_header_pointer(skb, hoffset + offset, > > > + if (write_offset < 0) { > > > > There is also a small negative-offset 4-byte-write corner case, e.g. > > an ETH-relative edit with hoffset=-14 and offset=12, giving a final > > write_offset=-2; if the linear head after skb->data is only 1 byte, > > that store spans -2..1. That seems unlikely in practice; fixing it > > would be nice, but I would not hold this patch for it. > > > > Again smells AI generated. I spent a lot of time trying to determine > the practicality and validity of this (and your previous claim), only > to find that this whole claim is based on a view that a headroom of 1 > byte is a valid scenario in practice. Is it theoretically possible > from the math? yes. Practically, I don't see how it makes any sense. > Can you describe a scenario where this would happen in practise? As I mentioned in the previous mail, I agree it is unlikely in practice. For context, this was an edge case brought up by sashiko regarding our patch. I wrote a test kernel module and verified that if an skb is artificially constructed with exactly 1 byte of headroom, skb_cow() is indeed insufficient. However, I also checked real-world packets and the headroom is always plenty, so I agree it doesn't happen in the real world. > > > Reviewed-by: Han Guidong <2045gemini@gmail.com> > > More like reviewed by my AI who goes by the name foobar. I will work on improving my review quality going forward. Sorry again for the noise. Thanks. > > cheers, > jamal > > > > > > > > + if (skb_cow(skb, -write_offset)) > > > + goto bad; > > > + } else { > > > + if (unlikely(check_add_overflow(write_offset, > > > + (int)sizeof(hdata), > > > + &write_len))) > > > + goto bad; > > > + if (skb_ensure_writable(skb, min_t(int, skb->len, > > > + write_len))) > > > + goto bad; > > > + } > > > + > > > + ptr = skb_header_pointer(skb, write_offset, > > > sizeof(hdata), &hdata); > > > if (!ptr) > > > goto bad; > > > @@ -475,7 +503,7 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb, > > > > > > *ptr = ((*ptr & tkey->mask) ^ val); > > > if (ptr == &hdata) > > > - skb_store_bits(skb, hoffset + offset, ptr, 4); > > > + skb_store_bits(skb, write_offset, ptr, sizeof(hdata)); > > > } > > > > > > goto done; > > > -- > > > 2.51.2.windows.1 > > > ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH net] net/sched: fix pedit partial COW leading to page cache corruption 2026-05-20 9:12 ` Jamal Hadi Salim 2026-05-20 10:04 ` Han Guidong @ 2026-05-20 10:36 ` Han Guidong 2026-05-20 11:40 ` Jamal Hadi Salim 1 sibling, 1 reply; 11+ messages in thread From: Han Guidong @ 2026-05-20 10:36 UTC (permalink / raw) To: Jamal Hadi Salim Cc: Rajat Gupta, netdev, davem, edumazet, kuba, pabeni, horms, jiri, yimingqian591, keenanat2000, rollkingzzc On Wed, May 20, 2026 at 5:12 PM Jamal Hadi Salim <jhs@mojatatu.com> wrote: > > On Tue, May 19, 2026 at 11:11 AM Han Guidong <2045gemini@gmail.com> wrote: > > > > On Tue, May 19, 2026 at 11:42 AM Rajat Gupta > > <rajat.gupta@oss.qualcomm.com> wrote: > > > > > > tcf_pedit_act() computes the COW range for skb_ensure_writable() > > > once before the key loop using tcfp_off_max_hint, but the hint does > > > not account for the runtime header offset added by typed keys. This > > > can leave part of the write region un-COW'd. > > > > > > Fix by moving skb_ensure_writable() inside the per-key loop where > > > the actual write offset is known, and add overflow checking on the > > > offset arithmetic. For negative offsets (e.g. Ethernet header edits > > > at ingress), use skb_cow() to COW the headroom instead. Guard > > > offset_valid() against INT_MIN, where negation is undefined. > > > > > > Additionally, linearize skbs with shared frags upfront to prevent > > > silent data corruption when pedit operates on zero-copy pages > > > (e.g. from sendfile). > > > > > > Fixes: 8b796475fd78 ("net/sched: act_pedit: really ensure the skb is writable") > > > > Fixes: 6c02568fd1ae ("net/sched: act_pedit: Parse L3 Header for L4 offset") > > Cc: stable@vger.kernel.org > > > > I took a closer look at the code, and this patch is really addressing > > two issues. > > > > The negative-offset write into shared head/headroom appears to be from > > 8b796475fd78, while the shared-frag write, and thus possible > > page-cache corruption, appears to be the later regression from > > 6c02568fd1ae. So it may be worth adding an additional Fixes tag for > > 6c02568fd1ae, > > Pushing it a little - but sure we should also point to that commit. > > > and I think an explicit Cc: stable@vger.kernel.org would also make sense here. > > Please study the rules. Hi Jamal, Apologies, I'm still a newbie learning the process. I checked the docs and saw commit dbbe7c962c3a ("docs: networking: drop special stable handling"), but I'm still a bit confused. Did the rule change to allow explicit "Cc: stable@vger.kernel.org" tags for netdev? If I completely missed the correct documentation, I would be very grateful if you could point me to it. Thanks for your patience! ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH net] net/sched: fix pedit partial COW leading to page cache corruption 2026-05-20 10:36 ` Han Guidong @ 2026-05-20 11:40 ` Jamal Hadi Salim 0 siblings, 0 replies; 11+ messages in thread From: Jamal Hadi Salim @ 2026-05-20 11:40 UTC (permalink / raw) To: Han Guidong Cc: Rajat Gupta, netdev, davem, edumazet, kuba, pabeni, horms, jiri, yimingqian591, keenanat2000, rollkingzzc On Wed, May 20, 2026 at 6:37 AM Han Guidong <2045gemini@gmail.com> wrote: > > On Wed, May 20, 2026 at 5:12 PM Jamal Hadi Salim <jhs@mojatatu.com> wrote: > > > > On Tue, May 19, 2026 at 11:11 AM Han Guidong <2045gemini@gmail.com> wrote: > > > > > > On Tue, May 19, 2026 at 11:42 AM Rajat Gupta > > > <rajat.gupta@oss.qualcomm.com> wrote: > > > > > > > > tcf_pedit_act() computes the COW range for skb_ensure_writable() > > > > once before the key loop using tcfp_off_max_hint, but the hint does > > > > not account for the runtime header offset added by typed keys. This > > > > can leave part of the write region un-COW'd. > > > > > > > > Fix by moving skb_ensure_writable() inside the per-key loop where > > > > the actual write offset is known, and add overflow checking on the > > > > offset arithmetic. For negative offsets (e.g. Ethernet header edits > > > > at ingress), use skb_cow() to COW the headroom instead. Guard > > > > offset_valid() against INT_MIN, where negation is undefined. > > > > > > > > Additionally, linearize skbs with shared frags upfront to prevent > > > > silent data corruption when pedit operates on zero-copy pages > > > > (e.g. from sendfile). > > > > > > > > Fixes: 8b796475fd78 ("net/sched: act_pedit: really ensure the skb is writable") > > > > > > Fixes: 6c02568fd1ae ("net/sched: act_pedit: Parse L3 Header for L4 offset") > > > Cc: stable@vger.kernel.org > > > > > > I took a closer look at the code, and this patch is really addressing > > > two issues. > > > > > > The negative-offset write into shared head/headroom appears to be from > > > 8b796475fd78, while the shared-frag write, and thus possible > > > page-cache corruption, appears to be the later regression from > > > 6c02568fd1ae. So it may be worth adding an additional Fixes tag for > > > 6c02568fd1ae, > > > > Pushing it a little - but sure we should also point to that commit. > > > > > and I think an explicit Cc: stable@vger.kernel.org would also make sense here. > > > > Please study the rules. > > Hi Jamal, > > Apologies, I'm still a newbie learning the process. I checked the docs > and saw commit dbbe7c962c3a ("docs: networking: drop special stable > handling"), but I'm still a bit confused. > > Did the rule change to allow explicit "Cc: stable@vger.kernel.org" > tags for netdev? > > If I completely missed the correct documentation, I would be very > grateful if you could point me to it. > This is a security fix. AFAIK, It does not follow the stable rules. It goes via the net tree (then propagated via net-next) before making it to linux main. It gets picked up by the stable tree. But: You are correct that Ccing stable is useful when the vulnerberity hasnt been published. I apologize for being harsh - too many AI emails claiming bugs on which i spend too much time and you bore the brunt of it. So we should Cc them. cheers, jamal > Thanks for your patience! ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH net] net/sched: fix pedit partial COW leading to page cache corruption 2026-05-19 3:39 ` [PATCH net] net/sched: fix pedit partial COW leading to page cache corruption Rajat Gupta 2026-05-19 11:18 ` Toke Høiland-Jørgensen 2026-05-19 15:10 ` Han Guidong @ 2026-05-20 9:23 ` Jamal Hadi Salim 2 siblings, 0 replies; 11+ messages in thread From: Jamal Hadi Salim @ 2026-05-20 9:23 UTC (permalink / raw) To: Rajat Gupta Cc: netdev, davem, edumazet, kuba, pabeni, horms, jiri, yimingqian591, keenanat2000, 2045gemini, rollkingzzc On Mon, May 18, 2026 at 11:42 PM Rajat Gupta <rajat.gupta@oss.qualcomm.com> wrote: > > tcf_pedit_act() computes the COW range for skb_ensure_writable() > once before the key loop using tcfp_off_max_hint, but the hint does > not account for the runtime header offset added by typed keys. This > can leave part of the write region un-COW'd. > > Fix by moving skb_ensure_writable() inside the per-key loop where > the actual write offset is known, and add overflow checking on the > offset arithmetic. For negative offsets (e.g. Ethernet header edits > at ingress), use skb_cow() to COW the headroom instead. Guard > offset_valid() against INT_MIN, where negation is undefined. > > Additionally, linearize skbs with shared frags upfront to prevent > silent data corruption when pedit operates on zero-copy pages > (e.g. from sendfile). > Sashiko makes the claim that: " .... ..... If the payload part extending past offset 0 resides in a shared fragment for a cloned skb, wouldn't skb_store_bits() write the positive offset portion directly into the shared page, causing a data corruption regression? " It missed the earlier code (before the quoted code is hit) which is very well documented! /* * If the skb has shared frags the user is likely using zero-copy * (e.g. sendfile). Those page frags may point to page-cache pages; * writing into them would silently corrupt the page cache. * Linearize so pedit operates on a private copy. * TL;DR: if you want zero-copy, don't use pedit. */ if (skb_has_shared_frag(skb)) { if (__skb_linearize(skb)) goto bad; } TLDR: If the skb has shared frags, it is linearized upfront using __skb_linearize() which means any risk of modifying a shared fragment is non-existent. cheers, jamal > Fixes: 8b796475fd78 ("net/sched: act_pedit: really ensure the skb is writable") > Reported-by: Rajat Gupta <rajat.gupta@oss.qualcomm.com> > Reported-by: Yiming Qian <yimingqian591@gmail.com> > Reported-by: Keenan Dong <keenanat2000@gmail.com> > Reported-by: Han Guidong <2045gemini@gmail.com> > Reported-by: Zhang Cen <rollkingzzc@gmail.com> > Tested-by: Han Guidong <2045gemini@gmail.com> > Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> > Signed-off-by: Rajat Gupta <rajat.gupta@oss.qualcomm.com> > --- > net/sched/act_pedit.c | 54 ++++++++++++++++++++++++++++++++----------- > 1 file changed, 41 insertions(+), 13 deletions(-) > > diff --git a/net/sched/act_pedit.c b/net/sched/act_pedit.c > index bc20f08a2..79921b8d8 100644 > --- a/net/sched/act_pedit.c > +++ b/net/sched/act_pedit.c > @@ -16,6 +16,7 @@ > #include <linux/ip.h> > #include <linux/ipv6.h> > #include <linux/slab.h> > +#include <linux/overflow.h> > #include <net/ipv6.h> > #include <net/netlink.h> > #include <net/pkt_sched.h> > @@ -323,8 +324,10 @@ static bool offset_valid(struct sk_buff *skb, int offset) > if (offset > 0 && offset > skb->len) > return false; > > - if (offset < 0 && -offset > skb_headroom(skb)) > - return false; > + if (offset < 0) { > + if (offset == INT_MIN || -offset > skb_headroom(skb)) > + return false; > + } > > return true; > } > @@ -393,17 +396,21 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb, > struct tcf_pedit_key_ex *tkey_ex; > struct tcf_pedit_parms *parms; > struct tc_pedit_key *tkey; > - u32 max_offset; > int i; > > parms = rcu_dereference_bh(p->parms); > > - max_offset = (skb_transport_header_was_set(skb) ? > - skb_transport_offset(skb) : > - skb_network_offset(skb)) + > - parms->tcfp_off_max_hint; > - if (skb_ensure_writable(skb, min(skb->len, max_offset))) > - goto done; > + /* > + * If the skb has shared frags the user is likely using zero-copy > + * (e.g. sendfile). Those page frags may point to page-cache pages; > + * writing into them would silently corrupt the page cache. > + * Linearize so pedit operates on a private copy. > + * TL;DR: if you want zero-copy, don't use pedit. > + */ > + if (skb_has_shared_frag(skb)) { > + if (__skb_linearize(skb)) > + goto bad; > + } > > tcf_lastuse_update(&p->tcf_tm); > tcf_action_update_bstats(&p->common, skb); > @@ -412,6 +419,7 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb, > tkey_ex = parms->tcfp_keys_ex; > > for (i = parms->tcfp_nkeys; i > 0; i--, tkey++) { > + int write_offset, write_len; > int offset = tkey->off; > int hoffset = 0; > u32 *ptr, hdata; > @@ -451,12 +459,32 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb, > } > } > > - if (!offset_valid(skb, hoffset + offset)) { > - pr_info_ratelimited("tc action pedit offset %d out of bounds\n", hoffset + offset); > + if (unlikely(check_add_overflow(hoffset, offset, > + &write_offset))) { > + pr_info_ratelimited("tc action pedit offset overflow\n"); > + goto bad; > + } > + > + if (!offset_valid(skb, write_offset)) { > + pr_info_ratelimited("tc action pedit offset %d out of bounds\n", > + write_offset); > goto bad; > } > > - ptr = skb_header_pointer(skb, hoffset + offset, > + if (write_offset < 0) { > + if (skb_cow(skb, -write_offset)) > + goto bad; > + } else { > + if (unlikely(check_add_overflow(write_offset, > + (int)sizeof(hdata), > + &write_len))) > + goto bad; > + if (skb_ensure_writable(skb, min_t(int, skb->len, > + write_len))) > + goto bad; > + } > + > + ptr = skb_header_pointer(skb, write_offset, > sizeof(hdata), &hdata); > if (!ptr) > goto bad; > @@ -475,7 +503,7 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb, > > *ptr = ((*ptr & tkey->mask) ^ val); > if (ptr == &hdata) > - skb_store_bits(skb, hoffset + offset, ptr, 4); > + skb_store_bits(skb, write_offset, ptr, sizeof(hdata)); > } > > goto done; > -- > 2.51.2.windows.1 > ^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2026-05-20 11:40 UTC | newest] Thread overview: 11+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2026-05-18 1:30 [PATCH net] net/sched: fix pedit partial COW leading to page cache Rajat Gupta 2026-05-18 13:10 ` Han Guidong 2026-05-18 13:31 ` Jamal Hadi Salim 2026-05-19 3:39 ` [PATCH net] net/sched: fix pedit partial COW leading to page cache corruption Rajat Gupta 2026-05-19 11:18 ` Toke Høiland-Jørgensen 2026-05-19 15:10 ` Han Guidong 2026-05-20 9:12 ` Jamal Hadi Salim 2026-05-20 10:04 ` Han Guidong 2026-05-20 10:36 ` Han Guidong 2026-05-20 11:40 ` Jamal Hadi Salim 2026-05-20 9:23 ` Jamal Hadi Salim
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox