* [PATCH net] net/sched: fix pedit partial COW leading to page cache
@ 2026-05-18 1:30 Rajat Gupta
2026-05-18 13:10 ` Han Guidong
2026-05-18 13:31 ` Jamal Hadi Salim
0 siblings, 2 replies; 12+ messages in thread
From: Rajat Gupta @ 2026-05-18 1:30 UTC (permalink / raw)
To: netdev; +Cc: Jamal Hadi Salim, jiri, davem, Eric Dumazet, kuba, pabeni, horms
From f12ea6484dbb75d1c13495e921d0595532317f2a Mon Sep 17 00:00:00 2001
From: Rajat Gupta <rajat.gupta@oss.qualcomm.com>
Date: Sun, 17 May 2026 10:11:44 -0700
Subject: [PATCH net] net/sched: fix pedit partial COW leading to page cache
corruption
tcf_pedit_act() computes the COW range for skb_ensure_writable()
once before the key loop using tcfp_off_max_hint, but the hint does
not account for the runtime header offset added by typed keys. This
can leave part of the write region un-COW'd.
Fix by moving skb_ensure_writable() inside the per-key loop where
the actual write offset is known, and add overflow checking on the
offset arithmetic. For negative offsets (e.g. Ethernet header edits
at ingress), use skb_cow() to COW the headroom instead. Guard
offset_valid() against INT_MIN, where negation is undefined.
Additionally, linearize skbs with shared frags upfront to prevent
silent data corruption when pedit operates on zero-copy pages
(e.g. from sendfile).
Fixes: 8b796475fd78 ("net/sched: act_pedit: really ensure the skb is writable")
Reported-by: Rajat Gupta <rajat.gupta@oss.qualcomm.com>
Reported-by: Yiming Qian <yimingqian591@gmail.com>
Reported-by: Keenan Dong <keenanat2000@gmail.com>
Reported-by: Han Guidong <2045gemini@gmail.com>
Reported-by: Zhang Cen <rollkingzzc@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Rajat Gupta <rajat.gupta@oss.qualcomm.com>
---
net/sched/act_pedit.c | 52 ++++++++++++++++++++++++++++++++-----------
1 file changed, 39 insertions(+), 13 deletions(-)
diff --git a/net/sched/act_pedit.c b/net/sched/act_pedit.c
index bc20f08a2..e077a2e82 100644
--- a/net/sched/act_pedit.c
+++ b/net/sched/act_pedit.c
@@ -16,6 +16,7 @@
#include <linux/ip.h>
#include <linux/ipv6.h>
#include <linux/slab.h>
+#include <linux/overflow.h>
#include <net/ipv6.h>
#include <net/netlink.h>
#include <net/pkt_sched.h>
@@ -323,8 +324,10 @@ static bool offset_valid(struct sk_buff *skb, int offset)
if (offset > 0 && offset > skb->len)
return false;
- if (offset < 0 && -offset > skb_headroom(skb))
- return false;
+ if (offset < 0) {
+ if (offset == INT_MIN || -offset > skb_headroom(skb))
+ return false;
+ }
return true;
}
@@ -393,17 +396,19 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
struct tcf_pedit_key_ex *tkey_ex;
struct tcf_pedit_parms *parms;
struct tc_pedit_key *tkey;
- u32 max_offset;
int i;
parms = rcu_dereference_bh(p->parms);
- max_offset = (skb_transport_header_was_set(skb) ?
- skb_transport_offset(skb) :
- skb_network_offset(skb)) +
- parms->tcfp_off_max_hint;
- if (skb_ensure_writable(skb, min(skb->len, max_offset)))
- goto done;
+ /* If the skb has shared frags the user is likely using zero-copy
+ * (e.g. sendfile). Those page frags may point to page-cache pages;
+ * writing into them would silently corrupt the page cache.
+ * Linearize so pedit operates on a private copy.
+ * If you want zero-copy, don't use pedit.
+ TL;DR if you want to use ZC, don't use pedit*/
+ if (skb_has_shared_frag(skb)) {
+ if (__skb_linearize(skb))
+ goto bad;
+ }
tcf_lastuse_update(&p->tcf_tm);
tcf_action_update_bstats(&p->common, skb);
@@ -414,6 +419,7 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
for (i = parms->tcfp_nkeys; i > 0; i--, tkey++) {
int offset = tkey->off;
int hoffset = 0;
+ int write_offset, write_len;
u32 *ptr, hdata;
u32 val;
int rc;
@@ -451,12 +457,32 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
}
}
- if (!offset_valid(skb, hoffset + offset)) {
- pr_info_ratelimited("tc action pedit offset %d out of bounds\n", hoffset + offset);
+ if (unlikely(check_add_overflow(hoffset, offset,
+ &write_offset))) {
+ pr_info_ratelimited("tc action pedit offset overflow\n");
+ goto bad;
+ }
+
+ if (!offset_valid(skb, write_offset)) {
+ pr_info_ratelimited("tc action pedit offset %d out of bounds\n",
+ write_offset);
goto bad;
}
- ptr = skb_header_pointer(skb, hoffset + offset,
+ if (write_offset < 0) {
+ if (skb_cow(skb, -write_offset))
+ goto bad;
+ } else {
+ if (unlikely(check_add_overflow(write_offset,
+ (int)sizeof(hdata),
+ &write_len)))
+ goto bad;
+ if (skb_ensure_writable(skb, min_t(int, skb->len,
+ write_len)))
+ goto bad;
+ }
+
+ ptr = skb_header_pointer(skb, write_offset,
sizeof(hdata), &hdata);
if (!ptr)
goto bad;
@@ -475,7 +501,7 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
*ptr = ((*ptr & tkey->mask) ^ val);
if (ptr == &hdata)
- skb_store_bits(skb, hoffset + offset, ptr, 4);
+ skb_store_bits(skb, write_offset, ptr, sizeof(hdata));
}
goto done;
--
2.51.2.windows.1
Thank you,
Rajat
^ permalink raw reply related [flat|nested] 12+ messages in thread* Re: [PATCH net] net/sched: fix pedit partial COW leading to page cache
2026-05-18 1:30 [PATCH net] net/sched: fix pedit partial COW leading to page cache Rajat Gupta
@ 2026-05-18 13:10 ` Han Guidong
2026-05-18 13:31 ` Jamal Hadi Salim
1 sibling, 0 replies; 12+ messages in thread
From: Han Guidong @ 2026-05-18 13:10 UTC (permalink / raw)
To: rajat.gupta
Cc: davem, edumazet, horms, jhs, jiri, kuba, netdev, pabeni,
Han Guidong
(Resending to fix broken email threading. Sorry for the noise!)
On Sun, May 17, 2026 at 06:30:31PM -0700, Rajat Gupta wrote:
> >From f12ea6484dbb75d1c13495e921d0595532317f2a Mon Sep 17 00:00:00 2001
> From: Rajat Gupta <rajat.gupta@oss.qualcomm.com>
> Date: Sun, 17 May 2026 10:11:44 -0700
> Subject: [PATCH net] net/sched: fix pedit partial COW leading to page cache
> corruption
>
> tcf_pedit_act() computes the COW range for skb_ensure_writable()
> once before the key loop using tcfp_off_max_hint, but the hint does
> not account for the runtime header offset added by typed keys. This
> can leave part of the write region un-COW'd.
>
> Fix by moving skb_ensure_writable() inside the per-key loop where
> the actual write offset is known, and add overflow checking on the
> offset arithmetic. For negative offsets (e.g. Ethernet header edits
> at ingress), use skb_cow() to COW the headroom instead. Guard
> offset_valid() against INT_MIN, where negation is undefined.
>
> Additionally, linearize skbs with shared frags upfront to prevent
> silent data corruption when pedit operates on zero-copy pages
> (e.g. from sendfile).
>
> Fixes: 8b796475fd78 ("net/sched: act_pedit: really ensure the skb is writable")
> Reported-by: Rajat Gupta <rajat.gupta@oss.qualcomm.com>
> Reported-by: Yiming Qian <yimingqian591@gmail.com>
> Reported-by: Keenan Dong <keenanat2000@gmail.com>
> Reported-by: Han Guidong <2045gemini@gmail.com>
> Reported-by: Zhang Cen <rollkingzzc@gmail.com>
> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
> Signed-off-by: Rajat Gupta <rajat.gupta@oss.qualcomm.com>
Solid fix. Linearizing shared frags and dynamic COW checks completely
resolve the offset drift and page cache corruption. Verified locally.
Tested-by: Han Guidong <2045gemini@gmail.com>
Reviewed-by: Han Guidong <2045gemini@gmail.com>
> ---
> net/sched/act_pedit.c | 52 ++++++++++++++++++++++++++++++++-----------
> 1 file changed, 39 insertions(+), 13 deletions(-)
>
> diff --git a/net/sched/act_pedit.c b/net/sched/act_pedit.c
> index bc20f08a2..e077a2e82 100644
> --- a/net/sched/act_pedit.c
> +++ b/net/sched/act_pedit.c
> @@ -16,6 +16,7 @@
> #include <linux/ip.h>
> #include <linux/ipv6.h>
> #include <linux/slab.h>
> +#include <linux/overflow.h>
> #include <net/ipv6.h>
> #include <net/netlink.h>
> #include <net/pkt_sched.h>
> @@ -323,8 +324,10 @@ static bool offset_valid(struct sk_buff *skb, int offset)
> if (offset > 0 && offset > skb->len)
> return false;
>
> - if (offset < 0 && -offset > skb_headroom(skb))
> - return false;
> + if (offset < 0) {
> + if (offset == INT_MIN || -offset > skb_headroom(skb))
> + return false;
> + }
>
> return true;
> }
> @@ -393,17 +396,19 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
> struct tcf_pedit_key_ex *tkey_ex;
> struct tcf_pedit_parms *parms;
> struct tc_pedit_key *tkey;
> - u32 max_offset;
> int i;
>
> parms = rcu_dereference_bh(p->parms);
>
> - max_offset = (skb_transport_header_was_set(skb) ?
> - skb_transport_offset(skb) :
> - skb_network_offset(skb)) +
> - parms->tcfp_off_max_hint;
> - if (skb_ensure_writable(skb, min(skb->len, max_offset)))
> - goto done;
> + /* If the skb has shared frags the user is likely using zero-copy
> + * (e.g. sendfile). Those page frags may point to page-cache pages;
> + * writing into them would silently corrupt the page cache.
> + * Linearize so pedit operates on a private copy.
> + * If you want zero-copy, don't use pedit.
> + TL;DR if you want to use ZC, don't use pedit*/
> + if (skb_has_shared_frag(skb)) {
> + if (__skb_linearize(skb))
> + goto bad;
> + }
>
> tcf_lastuse_update(&p->tcf_tm);
> tcf_action_update_bstats(&p->common, skb);
> @@ -414,6 +419,7 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
> for (i = parms->tcfp_nkeys; i > 0; i--, tkey++) {
> int offset = tkey->off;
> int hoffset = 0;
> + int write_offset, write_len;
> u32 *ptr, hdata;
> u32 val;
> int rc;
> @@ -451,12 +457,32 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
> }
> }
>
> - if (!offset_valid(skb, hoffset + offset)) {
> - pr_info_ratelimited("tc action pedit offset %d out of bounds\n", hoffset + offset);
> + if (unlikely(check_add_overflow(hoffset, offset,
> + &write_offset))) {
> + pr_info_ratelimited("tc action pedit offset overflow\n");
> + goto bad;
> + }
> +
> + if (!offset_valid(skb, write_offset)) {
> + pr_info_ratelimited("tc action pedit offset %d out of bounds\n",
> + write_offset);
> goto bad;
> }
>
> - ptr = skb_header_pointer(skb, hoffset + offset,
> + if (write_offset < 0) {
> + if (skb_cow(skb, -write_offset))
> + goto bad;
> + } else {
> + if (unlikely(check_add_overflow(write_offset,
> + (int)sizeof(hdata),
> + &write_len)))
> + goto bad;
> + if (skb_ensure_writable(skb, min_t(int, skb->len,
> + write_len)))
> + goto bad;
> + }
> +
> + ptr = skb_header_pointer(skb, write_offset,
> sizeof(hdata), &hdata);
> if (!ptr)
> goto bad;
> @@ -475,7 +501,7 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
>
> *ptr = ((*ptr & tkey->mask) ^ val);
> if (ptr == &hdata)
> - skb_store_bits(skb, hoffset + offset, ptr, 4);
> + skb_store_bits(skb, write_offset, ptr, sizeof(hdata));
> }
>
> goto done;
> --
> 2.51.2.windows.1
>
> Thank you,
> Rajat
^ permalink raw reply [flat|nested] 12+ messages in thread* Re: [PATCH net] net/sched: fix pedit partial COW leading to page cache
2026-05-18 1:30 [PATCH net] net/sched: fix pedit partial COW leading to page cache Rajat Gupta
2026-05-18 13:10 ` Han Guidong
@ 2026-05-18 13:31 ` Jamal Hadi Salim
2026-05-19 3:39 ` [PATCH net] net/sched: fix pedit partial COW leading to page cache corruption Rajat Gupta
1 sibling, 1 reply; 12+ messages in thread
From: Jamal Hadi Salim @ 2026-05-18 13:31 UTC (permalink / raw)
To: Rajat Gupta; +Cc: netdev, jiri, davem, Eric Dumazet, kuba, pabeni, horms
Hi Rajat,
On Sun, May 17, 2026 at 9:30 PM Rajat Gupta
<rajat.gupta@oss.qualcomm.com> wrote:
>
> From f12ea6484dbb75d1c13495e921d0595532317f2a Mon Sep 17 00:00:00 2001
> From: Rajat Gupta <rajat.gupta@oss.qualcomm.com>
> Date: Sun, 17 May 2026 10:11:44 -0700
> Subject: [PATCH net] net/sched: fix pedit partial COW leading to page cache
> corruption
>
> tcf_pedit_act() computes the COW range for skb_ensure_writable()
> once before the key loop using tcfp_off_max_hint, but the hint does
> not account for the runtime header offset added by typed keys. This
> can leave part of the write region un-COW'd.
>
git am on this:
https://patchwork.kernel.org/project/netdevbpf/patch/604ca54c-a94e-4ca4-83e7-4486d7392d71@oss.qualcomm.com/
Errors out.
----
Applying: net/sched: fix pedit partial COW leading to page cache corruption
.git/rebase-apply/patch:49: trailing whitespace.
* If you want zero-copy, don't use pedit.
error: corrupt patch at .git/rebase-apply/patch:58
Patch failed at 0001 net/sched: fix pedit partial COW leading to page
cache corruption
----
Those comments dont look clean - follow netdev rules.
You should send v1. The rule is you wait 24 hours after the previous
one before doing so (you can add whoever tests etc on v1)
Also make sure this applies on net tree not net-next
cheers,
jamal
> Fix by moving skb_ensure_writable() inside the per-key loop where
> the actual write offset is known, and add overflow checking on the
> offset arithmetic. For negative offsets (e.g. Ethernet header edits
> at ingress), use skb_cow() to COW the headroom instead. Guard
> offset_valid() against INT_MIN, where negation is undefined.
>
> Additionally, linearize skbs with shared frags upfront to prevent
> silent data corruption when pedit operates on zero-copy pages
> (e.g. from sendfile).
>
> Fixes: 8b796475fd78 ("net/sched: act_pedit: really ensure the skb is writable")
> Reported-by: Rajat Gupta <rajat.gupta@oss.qualcomm.com>
> Reported-by: Yiming Qian <yimingqian591@gmail.com>
> Reported-by: Keenan Dong <keenanat2000@gmail.com>
> Reported-by: Han Guidong <2045gemini@gmail.com>
> Reported-by: Zhang Cen <rollkingzzc@gmail.com>
> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
> Signed-off-by: Rajat Gupta <rajat.gupta@oss.qualcomm.com>
> ---
> net/sched/act_pedit.c | 52 ++++++++++++++++++++++++++++++++-----------
> 1 file changed, 39 insertions(+), 13 deletions(-)
>
> diff --git a/net/sched/act_pedit.c b/net/sched/act_pedit.c
> index bc20f08a2..e077a2e82 100644
> --- a/net/sched/act_pedit.c
> +++ b/net/sched/act_pedit.c
> @@ -16,6 +16,7 @@
> #include <linux/ip.h>
> #include <linux/ipv6.h>
> #include <linux/slab.h>
> +#include <linux/overflow.h>
> #include <net/ipv6.h>
> #include <net/netlink.h>
> #include <net/pkt_sched.h>
> @@ -323,8 +324,10 @@ static bool offset_valid(struct sk_buff *skb, int offset)
> if (offset > 0 && offset > skb->len)
> return false;
>
> - if (offset < 0 && -offset > skb_headroom(skb))
> - return false;
> + if (offset < 0) {
> + if (offset == INT_MIN || -offset > skb_headroom(skb))
> + return false;
> + }
>
> return true;
> }
> @@ -393,17 +396,19 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
> struct tcf_pedit_key_ex *tkey_ex;
> struct tcf_pedit_parms *parms;
> struct tc_pedit_key *tkey;
> - u32 max_offset;
> int i;
>
> parms = rcu_dereference_bh(p->parms);
>
> - max_offset = (skb_transport_header_was_set(skb) ?
> - skb_transport_offset(skb) :
> - skb_network_offset(skb)) +
> - parms->tcfp_off_max_hint;
> - if (skb_ensure_writable(skb, min(skb->len, max_offset)))
> - goto done;
> + /* If the skb has shared frags the user is likely using zero-copy
> + * (e.g. sendfile). Those page frags may point to page-cache pages;
> + * writing into them would silently corrupt the page cache.
> + * Linearize so pedit operates on a private copy.
> + * If you want zero-copy, don't use pedit.
> + TL;DR if you want to use ZC, don't use pedit*/
> + if (skb_has_shared_frag(skb)) {
> + if (__skb_linearize(skb))
> + goto bad;
> + }
>
> tcf_lastuse_update(&p->tcf_tm);
> tcf_action_update_bstats(&p->common, skb);
> @@ -414,6 +419,7 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
> for (i = parms->tcfp_nkeys; i > 0; i--, tkey++) {
> int offset = tkey->off;
> int hoffset = 0;
> + int write_offset, write_len;
> u32 *ptr, hdata;
> u32 val;
> int rc;
> @@ -451,12 +457,32 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
> }
> }
>
> - if (!offset_valid(skb, hoffset + offset)) {
> - pr_info_ratelimited("tc action pedit offset %d out of bounds\n", hoffset + offset);
> + if (unlikely(check_add_overflow(hoffset, offset,
> + &write_offset))) {
> + pr_info_ratelimited("tc action pedit offset overflow\n");
> + goto bad;
> + }
> +
> + if (!offset_valid(skb, write_offset)) {
> + pr_info_ratelimited("tc action pedit offset %d out of bounds\n",
> + write_offset);
> goto bad;
> }
>
> - ptr = skb_header_pointer(skb, hoffset + offset,
> + if (write_offset < 0) {
> + if (skb_cow(skb, -write_offset))
> + goto bad;
> + } else {
> + if (unlikely(check_add_overflow(write_offset,
> + (int)sizeof(hdata),
> + &write_len)))
> + goto bad;
> + if (skb_ensure_writable(skb, min_t(int, skb->len,
> + write_len)))
> + goto bad;
> + }
> +
> + ptr = skb_header_pointer(skb, write_offset,
> sizeof(hdata), &hdata);
> if (!ptr)
> goto bad;
> @@ -475,7 +501,7 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
>
> *ptr = ((*ptr & tkey->mask) ^ val);
> if (ptr == &hdata)
> - skb_store_bits(skb, hoffset + offset, ptr, 4);
> + skb_store_bits(skb, write_offset, ptr, sizeof(hdata));
> }
>
> goto done;
> --
> 2.51.2.windows.1
>
> Thank you,
> Rajat
^ permalink raw reply [flat|nested] 12+ messages in thread* [PATCH net] net/sched: fix pedit partial COW leading to page cache corruption
2026-05-18 13:31 ` Jamal Hadi Salim
@ 2026-05-19 3:39 ` Rajat Gupta
2026-05-19 11:18 ` Toke Høiland-Jørgensen
` (2 more replies)
0 siblings, 3 replies; 12+ messages in thread
From: Rajat Gupta @ 2026-05-19 3:39 UTC (permalink / raw)
To: netdev
Cc: davem, edumazet, kuba, pabeni, horms, jhs, jiri, yimingqian591,
keenanat2000, 2045gemini, rollkingzzc, Rajat Gupta
tcf_pedit_act() computes the COW range for skb_ensure_writable()
once before the key loop using tcfp_off_max_hint, but the hint does
not account for the runtime header offset added by typed keys. This
can leave part of the write region un-COW'd.
Fix by moving skb_ensure_writable() inside the per-key loop where
the actual write offset is known, and add overflow checking on the
offset arithmetic. For negative offsets (e.g. Ethernet header edits
at ingress), use skb_cow() to COW the headroom instead. Guard
offset_valid() against INT_MIN, where negation is undefined.
Additionally, linearize skbs with shared frags upfront to prevent
silent data corruption when pedit operates on zero-copy pages
(e.g. from sendfile).
Fixes: 8b796475fd78 ("net/sched: act_pedit: really ensure the skb is writable")
Reported-by: Rajat Gupta <rajat.gupta@oss.qualcomm.com>
Reported-by: Yiming Qian <yimingqian591@gmail.com>
Reported-by: Keenan Dong <keenanat2000@gmail.com>
Reported-by: Han Guidong <2045gemini@gmail.com>
Reported-by: Zhang Cen <rollkingzzc@gmail.com>
Tested-by: Han Guidong <2045gemini@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Rajat Gupta <rajat.gupta@oss.qualcomm.com>
---
net/sched/act_pedit.c | 54 ++++++++++++++++++++++++++++++++-----------
1 file changed, 41 insertions(+), 13 deletions(-)
diff --git a/net/sched/act_pedit.c b/net/sched/act_pedit.c
index bc20f08a2..79921b8d8 100644
--- a/net/sched/act_pedit.c
+++ b/net/sched/act_pedit.c
@@ -16,6 +16,7 @@
#include <linux/ip.h>
#include <linux/ipv6.h>
#include <linux/slab.h>
+#include <linux/overflow.h>
#include <net/ipv6.h>
#include <net/netlink.h>
#include <net/pkt_sched.h>
@@ -323,8 +324,10 @@ static bool offset_valid(struct sk_buff *skb, int offset)
if (offset > 0 && offset > skb->len)
return false;
- if (offset < 0 && -offset > skb_headroom(skb))
- return false;
+ if (offset < 0) {
+ if (offset == INT_MIN || -offset > skb_headroom(skb))
+ return false;
+ }
return true;
}
@@ -393,17 +396,21 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
struct tcf_pedit_key_ex *tkey_ex;
struct tcf_pedit_parms *parms;
struct tc_pedit_key *tkey;
- u32 max_offset;
int i;
parms = rcu_dereference_bh(p->parms);
- max_offset = (skb_transport_header_was_set(skb) ?
- skb_transport_offset(skb) :
- skb_network_offset(skb)) +
- parms->tcfp_off_max_hint;
- if (skb_ensure_writable(skb, min(skb->len, max_offset)))
- goto done;
+ /*
+ * If the skb has shared frags the user is likely using zero-copy
+ * (e.g. sendfile). Those page frags may point to page-cache pages;
+ * writing into them would silently corrupt the page cache.
+ * Linearize so pedit operates on a private copy.
+ * TL;DR: if you want zero-copy, don't use pedit.
+ */
+ if (skb_has_shared_frag(skb)) {
+ if (__skb_linearize(skb))
+ goto bad;
+ }
tcf_lastuse_update(&p->tcf_tm);
tcf_action_update_bstats(&p->common, skb);
@@ -412,6 +419,7 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
tkey_ex = parms->tcfp_keys_ex;
for (i = parms->tcfp_nkeys; i > 0; i--, tkey++) {
+ int write_offset, write_len;
int offset = tkey->off;
int hoffset = 0;
u32 *ptr, hdata;
@@ -451,12 +459,32 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
}
}
- if (!offset_valid(skb, hoffset + offset)) {
- pr_info_ratelimited("tc action pedit offset %d out of bounds\n", hoffset + offset);
+ if (unlikely(check_add_overflow(hoffset, offset,
+ &write_offset))) {
+ pr_info_ratelimited("tc action pedit offset overflow\n");
+ goto bad;
+ }
+
+ if (!offset_valid(skb, write_offset)) {
+ pr_info_ratelimited("tc action pedit offset %d out of bounds\n",
+ write_offset);
goto bad;
}
- ptr = skb_header_pointer(skb, hoffset + offset,
+ if (write_offset < 0) {
+ if (skb_cow(skb, -write_offset))
+ goto bad;
+ } else {
+ if (unlikely(check_add_overflow(write_offset,
+ (int)sizeof(hdata),
+ &write_len)))
+ goto bad;
+ if (skb_ensure_writable(skb, min_t(int, skb->len,
+ write_len)))
+ goto bad;
+ }
+
+ ptr = skb_header_pointer(skb, write_offset,
sizeof(hdata), &hdata);
if (!ptr)
goto bad;
@@ -475,7 +503,7 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
*ptr = ((*ptr & tkey->mask) ^ val);
if (ptr == &hdata)
- skb_store_bits(skb, hoffset + offset, ptr, 4);
+ skb_store_bits(skb, write_offset, ptr, sizeof(hdata));
}
goto done;
--
2.51.2.windows.1
^ permalink raw reply related [flat|nested] 12+ messages in thread* Re: [PATCH net] net/sched: fix pedit partial COW leading to page cache corruption
2026-05-19 3:39 ` [PATCH net] net/sched: fix pedit partial COW leading to page cache corruption Rajat Gupta
@ 2026-05-19 11:18 ` Toke Høiland-Jørgensen
2026-05-19 15:10 ` Han Guidong
2026-05-20 9:23 ` Jamal Hadi Salim
2 siblings, 0 replies; 12+ messages in thread
From: Toke Høiland-Jørgensen @ 2026-05-19 11:18 UTC (permalink / raw)
To: Rajat Gupta, netdev
Cc: davem, edumazet, kuba, pabeni, horms, jhs, jiri, yimingqian591,
keenanat2000, 2045gemini, rollkingzzc, Rajat Gupta
Rajat Gupta <rajat.gupta@oss.qualcomm.com> writes:
> tcf_pedit_act() computes the COW range for skb_ensure_writable()
> once before the key loop using tcfp_off_max_hint, but the hint does
> not account for the runtime header offset added by typed keys. This
> can leave part of the write region un-COW'd.
>
> Fix by moving skb_ensure_writable() inside the per-key loop where
> the actual write offset is known, and add overflow checking on the
> offset arithmetic. For negative offsets (e.g. Ethernet header edits
> at ingress), use skb_cow() to COW the headroom instead. Guard
> offset_valid() against INT_MIN, where negation is undefined.
>
> Additionally, linearize skbs with shared frags upfront to prevent
> silent data corruption when pedit operates on zero-copy pages
> (e.g. from sendfile).
>
> Fixes: 8b796475fd78 ("net/sched: act_pedit: really ensure the skb is writable")
> Reported-by: Rajat Gupta <rajat.gupta@oss.qualcomm.com>
> Reported-by: Yiming Qian <yimingqian591@gmail.com>
> Reported-by: Keenan Dong <keenanat2000@gmail.com>
> Reported-by: Han Guidong <2045gemini@gmail.com>
> Reported-by: Zhang Cen <rollkingzzc@gmail.com>
> Tested-by: Han Guidong <2045gemini@gmail.com>
> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
> Signed-off-by: Rajat Gupta <rajat.gupta@oss.qualcomm.com>
Reviewed-by: Toke Høiland-Jørgensen <toke@kernel.org>
Also applied this to -net and ran the TC pedit selftests and the
pedit_* scripts in net/forwarding, none of which turned up any
regressions, so:
Tested-by: Toke Høiland-Jørgensen <toke@kernel.org>
^ permalink raw reply [flat|nested] 12+ messages in thread* Re: [PATCH net] net/sched: fix pedit partial COW leading to page cache corruption
2026-05-19 3:39 ` [PATCH net] net/sched: fix pedit partial COW leading to page cache corruption Rajat Gupta
2026-05-19 11:18 ` Toke Høiland-Jørgensen
@ 2026-05-19 15:10 ` Han Guidong
2026-05-20 9:12 ` Jamal Hadi Salim
2026-05-20 9:23 ` Jamal Hadi Salim
2 siblings, 1 reply; 12+ messages in thread
From: Han Guidong @ 2026-05-19 15:10 UTC (permalink / raw)
To: Rajat Gupta
Cc: netdev, davem, edumazet, kuba, pabeni, horms, jhs, jiri,
yimingqian591, keenanat2000, rollkingzzc
On Tue, May 19, 2026 at 11:42 AM Rajat Gupta
<rajat.gupta@oss.qualcomm.com> wrote:
>
> tcf_pedit_act() computes the COW range for skb_ensure_writable()
> once before the key loop using tcfp_off_max_hint, but the hint does
> not account for the runtime header offset added by typed keys. This
> can leave part of the write region un-COW'd.
>
> Fix by moving skb_ensure_writable() inside the per-key loop where
> the actual write offset is known, and add overflow checking on the
> offset arithmetic. For negative offsets (e.g. Ethernet header edits
> at ingress), use skb_cow() to COW the headroom instead. Guard
> offset_valid() against INT_MIN, where negation is undefined.
>
> Additionally, linearize skbs with shared frags upfront to prevent
> silent data corruption when pedit operates on zero-copy pages
> (e.g. from sendfile).
>
> Fixes: 8b796475fd78 ("net/sched: act_pedit: really ensure the skb is writable")
Fixes: 6c02568fd1ae ("net/sched: act_pedit: Parse L3 Header for L4 offset")
Cc: stable@vger.kernel.org
I took a closer look at the code, and this patch is really addressing
two issues.
The negative-offset write into shared head/headroom appears to be from
8b796475fd78, while the shared-frag write, and thus possible
page-cache corruption, appears to be the later regression from
6c02568fd1ae. So it may be worth adding an additional Fixes tag for
6c02568fd1ae, and I think an explicit Cc: stable@vger.kernel.org would
also make sense here.
> Reported-by: Rajat Gupta <rajat.gupta@oss.qualcomm.com>
> Reported-by: Yiming Qian <yimingqian591@gmail.com>
> Reported-by: Keenan Dong <keenanat2000@gmail.com>
> Reported-by: Han Guidong <2045gemini@gmail.com>
> Reported-by: Zhang Cen <rollkingzzc@gmail.com>
> Tested-by: Han Guidong <2045gemini@gmail.com>
> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
> Signed-off-by: Rajat Gupta <rajat.gupta@oss.qualcomm.com>
> ---
> net/sched/act_pedit.c | 54 ++++++++++++++++++++++++++++++++-----------
> 1 file changed, 41 insertions(+), 13 deletions(-)
>
> diff --git a/net/sched/act_pedit.c b/net/sched/act_pedit.c
> index bc20f08a2..79921b8d8 100644
> --- a/net/sched/act_pedit.c
> +++ b/net/sched/act_pedit.c
> @@ -16,6 +16,7 @@
> #include <linux/ip.h>
> #include <linux/ipv6.h>
> #include <linux/slab.h>
> +#include <linux/overflow.h>
> #include <net/ipv6.h>
> #include <net/netlink.h>
> #include <net/pkt_sched.h>
> @@ -323,8 +324,10 @@ static bool offset_valid(struct sk_buff *skb, int offset)
> if (offset > 0 && offset > skb->len)
> return false;
>
> - if (offset < 0 && -offset > skb_headroom(skb))
> - return false;
> + if (offset < 0) {
> + if (offset == INT_MIN || -offset > skb_headroom(skb))
> + return false;
> + }
>
> return true;
> }
> @@ -393,17 +396,21 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
> struct tcf_pedit_key_ex *tkey_ex;
> struct tcf_pedit_parms *parms;
> struct tc_pedit_key *tkey;
> - u32 max_offset;
> int i;
>
> parms = rcu_dereference_bh(p->parms);
>
> - max_offset = (skb_transport_header_was_set(skb) ?
> - skb_transport_offset(skb) :
> - skb_network_offset(skb)) +
> - parms->tcfp_off_max_hint;
> - if (skb_ensure_writable(skb, min(skb->len, max_offset)))
> - goto done;
> + /*
> + * If the skb has shared frags the user is likely using zero-copy
> + * (e.g. sendfile). Those page frags may point to page-cache pages;
> + * writing into them would silently corrupt the page cache.
> + * Linearize so pedit operates on a private copy.
> + * TL;DR: if you want zero-copy, don't use pedit.
> + */
> + if (skb_has_shared_frag(skb)) {
> + if (__skb_linearize(skb))
> + goto bad;
> + }
>
> tcf_lastuse_update(&p->tcf_tm);
> tcf_action_update_bstats(&p->common, skb);
> @@ -412,6 +419,7 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
> tkey_ex = parms->tcfp_keys_ex;
>
> for (i = parms->tcfp_nkeys; i > 0; i--, tkey++) {
> + int write_offset, write_len;
> int offset = tkey->off;
> int hoffset = 0;
> u32 *ptr, hdata;
> @@ -451,12 +459,32 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
> }
> }
>
> - if (!offset_valid(skb, hoffset + offset)) {
> - pr_info_ratelimited("tc action pedit offset %d out of bounds\n", hoffset + offset);
> + if (unlikely(check_add_overflow(hoffset, offset,
> + &write_offset))) {
I also noticed two remaining unchecked additions, e.g. hoffset +
tkey->at and offset += (*d & offmask) >> shift. These look more like
wrap/type-semantics cleanup than a practical signed-overflow/UB issue
to me, so I am OK with this patch as-is.
> + pr_info_ratelimited("tc action pedit offset overflow\n");
> + goto bad;
> + }
> +
> + if (!offset_valid(skb, write_offset)) {
> + pr_info_ratelimited("tc action pedit offset %d out of bounds\n",
> + write_offset);
> goto bad;
> }
>
> - ptr = skb_header_pointer(skb, hoffset + offset,
> + if (write_offset < 0) {
There is also a small negative-offset 4-byte-write corner case, e.g.
an ETH-relative edit with hoffset=-14 and offset=12, giving a final
write_offset=-2; if the linear head after skb->data is only 1 byte,
that store spans -2..1. That seems unlikely in practice; fixing it
would be nice, but I would not hold this patch for it.
Reviewed-by: Han Guidong <2045gemini@gmail.com>
> + if (skb_cow(skb, -write_offset))
> + goto bad;
> + } else {
> + if (unlikely(check_add_overflow(write_offset,
> + (int)sizeof(hdata),
> + &write_len)))
> + goto bad;
> + if (skb_ensure_writable(skb, min_t(int, skb->len,
> + write_len)))
> + goto bad;
> + }
> +
> + ptr = skb_header_pointer(skb, write_offset,
> sizeof(hdata), &hdata);
> if (!ptr)
> goto bad;
> @@ -475,7 +503,7 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
>
> *ptr = ((*ptr & tkey->mask) ^ val);
> if (ptr == &hdata)
> - skb_store_bits(skb, hoffset + offset, ptr, 4);
> + skb_store_bits(skb, write_offset, ptr, sizeof(hdata));
> }
>
> goto done;
> --
> 2.51.2.windows.1
>
^ permalink raw reply [flat|nested] 12+ messages in thread* Re: [PATCH net] net/sched: fix pedit partial COW leading to page cache corruption
2026-05-19 15:10 ` Han Guidong
@ 2026-05-20 9:12 ` Jamal Hadi Salim
2026-05-20 10:04 ` Han Guidong
2026-05-20 10:36 ` Han Guidong
0 siblings, 2 replies; 12+ messages in thread
From: Jamal Hadi Salim @ 2026-05-20 9:12 UTC (permalink / raw)
To: Han Guidong
Cc: Rajat Gupta, netdev, davem, edumazet, kuba, pabeni, horms, jiri,
yimingqian591, keenanat2000, rollkingzzc
On Tue, May 19, 2026 at 11:11 AM Han Guidong <2045gemini@gmail.com> wrote:
>
> On Tue, May 19, 2026 at 11:42 AM Rajat Gupta
> <rajat.gupta@oss.qualcomm.com> wrote:
> >
> > tcf_pedit_act() computes the COW range for skb_ensure_writable()
> > once before the key loop using tcfp_off_max_hint, but the hint does
> > not account for the runtime header offset added by typed keys. This
> > can leave part of the write region un-COW'd.
> >
> > Fix by moving skb_ensure_writable() inside the per-key loop where
> > the actual write offset is known, and add overflow checking on the
> > offset arithmetic. For negative offsets (e.g. Ethernet header edits
> > at ingress), use skb_cow() to COW the headroom instead. Guard
> > offset_valid() against INT_MIN, where negation is undefined.
> >
> > Additionally, linearize skbs with shared frags upfront to prevent
> > silent data corruption when pedit operates on zero-copy pages
> > (e.g. from sendfile).
> >
> > Fixes: 8b796475fd78 ("net/sched: act_pedit: really ensure the skb is writable")
>
> Fixes: 6c02568fd1ae ("net/sched: act_pedit: Parse L3 Header for L4 offset")
> Cc: stable@vger.kernel.org
>
> I took a closer look at the code, and this patch is really addressing
> two issues.
>
> The negative-offset write into shared head/headroom appears to be from
> 8b796475fd78, while the shared-frag write, and thus possible
> page-cache corruption, appears to be the later regression from
> 6c02568fd1ae. So it may be worth adding an additional Fixes tag for
> 6c02568fd1ae,
Pushing it a little - but sure we should also point to that commit.
> and I think an explicit Cc: stable@vger.kernel.org would also make sense here.
Please study the rules.
> > Reported-by: Rajat Gupta <rajat.gupta@oss.qualcomm.com>
> > Reported-by: Yiming Qian <yimingqian591@gmail.com>
> > Reported-by: Keenan Dong <keenanat2000@gmail.com>
> > Reported-by: Han Guidong <2045gemini@gmail.com>
> > Reported-by: Zhang Cen <rollkingzzc@gmail.com>
> > Tested-by: Han Guidong <2045gemini@gmail.com>
> > Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
> > Signed-off-by: Rajat Gupta <rajat.gupta@oss.qualcomm.com>
> > ---
> > net/sched/act_pedit.c | 54 ++++++++++++++++++++++++++++++++-----------
> > 1 file changed, 41 insertions(+), 13 deletions(-)
> >
> > diff --git a/net/sched/act_pedit.c b/net/sched/act_pedit.c
> > index bc20f08a2..79921b8d8 100644
> > --- a/net/sched/act_pedit.c
> > +++ b/net/sched/act_pedit.c
> > @@ -16,6 +16,7 @@
> > #include <linux/ip.h>
> > #include <linux/ipv6.h>
> > #include <linux/slab.h>
> > +#include <linux/overflow.h>
> > #include <net/ipv6.h>
> > #include <net/netlink.h>
> > #include <net/pkt_sched.h>
> > @@ -323,8 +324,10 @@ static bool offset_valid(struct sk_buff *skb, int offset)
> > if (offset > 0 && offset > skb->len)
> > return false;
> >
> > - if (offset < 0 && -offset > skb_headroom(skb))
> > - return false;
> > + if (offset < 0) {
> > + if (offset == INT_MIN || -offset > skb_headroom(skb))
> > + return false;
> > + }
> >
> > return true;
> > }
> > @@ -393,17 +396,21 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
> > struct tcf_pedit_key_ex *tkey_ex;
> > struct tcf_pedit_parms *parms;
> > struct tc_pedit_key *tkey;
> > - u32 max_offset;
> > int i;
> >
> > parms = rcu_dereference_bh(p->parms);
> >
> > - max_offset = (skb_transport_header_was_set(skb) ?
> > - skb_transport_offset(skb) :
> > - skb_network_offset(skb)) +
> > - parms->tcfp_off_max_hint;
> > - if (skb_ensure_writable(skb, min(skb->len, max_offset)))
> > - goto done;
> > + /*
> > + * If the skb has shared frags the user is likely using zero-copy
> > + * (e.g. sendfile). Those page frags may point to page-cache pages;
> > + * writing into them would silently corrupt the page cache.
> > + * Linearize so pedit operates on a private copy.
> > + * TL;DR: if you want zero-copy, don't use pedit.
> > + */
> > + if (skb_has_shared_frag(skb)) {
> > + if (__skb_linearize(skb))
> > + goto bad;
> > + }
> >
> > tcf_lastuse_update(&p->tcf_tm);
> > tcf_action_update_bstats(&p->common, skb);
> > @@ -412,6 +419,7 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
> > tkey_ex = parms->tcfp_keys_ex;
> >
> > for (i = parms->tcfp_nkeys; i > 0; i--, tkey++) {
> > + int write_offset, write_len;
> > int offset = tkey->off;
> > int hoffset = 0;
> > u32 *ptr, hdata;
> > @@ -451,12 +459,32 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
> > }
> > }
> >
> > - if (!offset_valid(skb, hoffset + offset)) {
> > - pr_info_ratelimited("tc action pedit offset %d out of bounds\n", hoffset + offset);
> > + if (unlikely(check_add_overflow(hoffset, offset,
> > + &write_offset))) {
>
> I also noticed two remaining unchecked additions, e.g. hoffset +
> tkey->at and offset += (*d & offmask) >> shift. These look more like
> wrap/type-semantics cleanup than a practical signed-overflow/UB issue
> to me, so I am OK with this patch as-is.
Did you get this from some AI?
None of these are possible. Both additions you mention are safe in
practice because any potential wrap around or overflow is caught
_before any memory access occurs_. Example see:
offset_valid() which implictly casts hoffset + key to signed int.
> > + pr_info_ratelimited("tc action pedit offset overflow\n");
> > + goto bad;
> > + }
> > +
> > + if (!offset_valid(skb, write_offset)) {
> > + pr_info_ratelimited("tc action pedit offset %d out of bounds\n",
> > + write_offset);
> > goto bad;
> > }
> >
> > - ptr = skb_header_pointer(skb, hoffset + offset,
> > + if (write_offset < 0) {
>
> There is also a small negative-offset 4-byte-write corner case, e.g.
> an ETH-relative edit with hoffset=-14 and offset=12, giving a final
> write_offset=-2; if the linear head after skb->data is only 1 byte,
> that store spans -2..1. That seems unlikely in practice; fixing it
> would be nice, but I would not hold this patch for it.
>
Again smells AI generated. I spent a lot of time trying to determine
the practicality and validity of this (and your previous claim), only
to find that this whole claim is based on a view that a headroom of 1
byte is a valid scenario in practice. Is it theoretically possible
from the math? yes. Practically, I don't see how it makes any sense.
Can you describe a scenario where this would happen in practise?
> Reviewed-by: Han Guidong <2045gemini@gmail.com>
More like reviewed by my AI who goes by the name foobar.
cheers,
jamal
>
>
> > + if (skb_cow(skb, -write_offset))
> > + goto bad;
> > + } else {
> > + if (unlikely(check_add_overflow(write_offset,
> > + (int)sizeof(hdata),
> > + &write_len)))
> > + goto bad;
> > + if (skb_ensure_writable(skb, min_t(int, skb->len,
> > + write_len)))
> > + goto bad;
> > + }
> > +
> > + ptr = skb_header_pointer(skb, write_offset,
> > sizeof(hdata), &hdata);
> > if (!ptr)
> > goto bad;
> > @@ -475,7 +503,7 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
> >
> > *ptr = ((*ptr & tkey->mask) ^ val);
> > if (ptr == &hdata)
> > - skb_store_bits(skb, hoffset + offset, ptr, 4);
> > + skb_store_bits(skb, write_offset, ptr, sizeof(hdata));
> > }
> >
> > goto done;
> > --
> > 2.51.2.windows.1
> >
^ permalink raw reply [flat|nested] 12+ messages in thread* Re: [PATCH net] net/sched: fix pedit partial COW leading to page cache corruption
2026-05-20 9:12 ` Jamal Hadi Salim
@ 2026-05-20 10:04 ` Han Guidong
2026-05-20 10:36 ` Han Guidong
1 sibling, 0 replies; 12+ messages in thread
From: Han Guidong @ 2026-05-20 10:04 UTC (permalink / raw)
To: Jamal Hadi Salim
Cc: Rajat Gupta, netdev, davem, edumazet, kuba, pabeni, horms, jiri,
yimingqian591, keenanat2000, rollkingzzc
On Wed, May 20, 2026 at 5:12 PM Jamal Hadi Salim <jhs@mojatatu.com> wrote:
>
> On Tue, May 19, 2026 at 11:11 AM Han Guidong <2045gemini@gmail.com> wrote:
> >
> > On Tue, May 19, 2026 at 11:42 AM Rajat Gupta
> > <rajat.gupta@oss.qualcomm.com> wrote:
> > >
> > > tcf_pedit_act() computes the COW range for skb_ensure_writable()
> > > once before the key loop using tcfp_off_max_hint, but the hint does
> > > not account for the runtime header offset added by typed keys. This
> > > can leave part of the write region un-COW'd.
> > >
> > > Fix by moving skb_ensure_writable() inside the per-key loop where
> > > the actual write offset is known, and add overflow checking on the
> > > offset arithmetic. For negative offsets (e.g. Ethernet header edits
> > > at ingress), use skb_cow() to COW the headroom instead. Guard
> > > offset_valid() against INT_MIN, where negation is undefined.
> > >
> > > Additionally, linearize skbs with shared frags upfront to prevent
> > > silent data corruption when pedit operates on zero-copy pages
> > > (e.g. from sendfile).
> > >
> > > Fixes: 8b796475fd78 ("net/sched: act_pedit: really ensure the skb is writable")
> >
> > Fixes: 6c02568fd1ae ("net/sched: act_pedit: Parse L3 Header for L4 offset")
> > Cc: stable@vger.kernel.org
> >
> > I took a closer look at the code, and this patch is really addressing
> > two issues.
> >
> > The negative-offset write into shared head/headroom appears to be from
> > 8b796475fd78, while the shared-frag write, and thus possible
> > page-cache corruption, appears to be the later regression from
> > 6c02568fd1ae. So it may be worth adding an additional Fixes tag for
> > 6c02568fd1ae,
>
> Pushing it a little - but sure we should also point to that commit.
>
> > and I think an explicit Cc: stable@vger.kernel.org would also make sense here.
>
> Please study the rules.
I will review the rules. Sorry for the noise.
>
> > > Reported-by: Rajat Gupta <rajat.gupta@oss.qualcomm.com>
> > > Reported-by: Yiming Qian <yimingqian591@gmail.com>
> > > Reported-by: Keenan Dong <keenanat2000@gmail.com>
> > > Reported-by: Han Guidong <2045gemini@gmail.com>
> > > Reported-by: Zhang Cen <rollkingzzc@gmail.com>
> > > Tested-by: Han Guidong <2045gemini@gmail.com>
> > > Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
> > > Signed-off-by: Rajat Gupta <rajat.gupta@oss.qualcomm.com>
> > > ---
> > > net/sched/act_pedit.c | 54 ++++++++++++++++++++++++++++++++-----------
> > > 1 file changed, 41 insertions(+), 13 deletions(-)
> > >
> > > diff --git a/net/sched/act_pedit.c b/net/sched/act_pedit.c
> > > index bc20f08a2..79921b8d8 100644
> > > --- a/net/sched/act_pedit.c
> > > +++ b/net/sched/act_pedit.c
> > > @@ -16,6 +16,7 @@
> > > #include <linux/ip.h>
> > > #include <linux/ipv6.h>
> > > #include <linux/slab.h>
> > > +#include <linux/overflow.h>
> > > #include <net/ipv6.h>
> > > #include <net/netlink.h>
> > > #include <net/pkt_sched.h>
> > > @@ -323,8 +324,10 @@ static bool offset_valid(struct sk_buff *skb, int offset)
> > > if (offset > 0 && offset > skb->len)
> > > return false;
> > >
> > > - if (offset < 0 && -offset > skb_headroom(skb))
> > > - return false;
> > > + if (offset < 0) {
> > > + if (offset == INT_MIN || -offset > skb_headroom(skb))
> > > + return false;
> > > + }
> > >
> > > return true;
> > > }
> > > @@ -393,17 +396,21 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
> > > struct tcf_pedit_key_ex *tkey_ex;
> > > struct tcf_pedit_parms *parms;
> > > struct tc_pedit_key *tkey;
> > > - u32 max_offset;
> > > int i;
> > >
> > > parms = rcu_dereference_bh(p->parms);
> > >
> > > - max_offset = (skb_transport_header_was_set(skb) ?
> > > - skb_transport_offset(skb) :
> > > - skb_network_offset(skb)) +
> > > - parms->tcfp_off_max_hint;
> > > - if (skb_ensure_writable(skb, min(skb->len, max_offset)))
> > > - goto done;
> > > + /*
> > > + * If the skb has shared frags the user is likely using zero-copy
> > > + * (e.g. sendfile). Those page frags may point to page-cache pages;
> > > + * writing into them would silently corrupt the page cache.
> > > + * Linearize so pedit operates on a private copy.
> > > + * TL;DR: if you want zero-copy, don't use pedit.
> > > + */
> > > + if (skb_has_shared_frag(skb)) {
> > > + if (__skb_linearize(skb))
> > > + goto bad;
> > > + }
> > >
> > > tcf_lastuse_update(&p->tcf_tm);
> > > tcf_action_update_bstats(&p->common, skb);
> > > @@ -412,6 +419,7 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
> > > tkey_ex = parms->tcfp_keys_ex;
> > >
> > > for (i = parms->tcfp_nkeys; i > 0; i--, tkey++) {
> > > + int write_offset, write_len;
> > > int offset = tkey->off;
> > > int hoffset = 0;
> > > u32 *ptr, hdata;
> > > @@ -451,12 +459,32 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
> > > }
> > > }
> > >
> > > - if (!offset_valid(skb, hoffset + offset)) {
> > > - pr_info_ratelimited("tc action pedit offset %d out of bounds\n", hoffset + offset);
> > > + if (unlikely(check_add_overflow(hoffset, offset,
> > > + &write_offset))) {
> >
> > I also noticed two remaining unchecked additions, e.g. hoffset +
> > tkey->at and offset += (*d & offmask) >> shift. These look more like
> > wrap/type-semantics cleanup than a practical signed-overflow/UB issue
> > to me, so I am OK with this patch as-is.
>
> Did you get this from some AI?
> None of these are possible. Both additions you mention are safe in
> practice because any potential wrap around or overflow is caught
> _before any memory access occurs_. Example see:
> offset_valid() which implictly casts hoffset + key to signed int.
Not AI-generated. I was looking at the code and noticed that
tc_pedit_key fields are unsigned, but the results are stored in int
offset and int hoffset. I was just thinking about odd edge cases where
an unsigned-to-signed conversion results in a negative number. I agree
with you that offset_valid() catches this anyway, so it causes no
actual harm.
>
> > > + pr_info_ratelimited("tc action pedit offset overflow\n");
> > > + goto bad;
> > > + }
> > > +
> > > + if (!offset_valid(skb, write_offset)) {
> > > + pr_info_ratelimited("tc action pedit offset %d out of bounds\n",
> > > + write_offset);
> > > goto bad;
> > > }
> > >
> > > - ptr = skb_header_pointer(skb, hoffset + offset,
> > > + if (write_offset < 0) {
> >
> > There is also a small negative-offset 4-byte-write corner case, e.g.
> > an ETH-relative edit with hoffset=-14 and offset=12, giving a final
> > write_offset=-2; if the linear head after skb->data is only 1 byte,
> > that store spans -2..1. That seems unlikely in practice; fixing it
> > would be nice, but I would not hold this patch for it.
> >
>
> Again smells AI generated. I spent a lot of time trying to determine
> the practicality and validity of this (and your previous claim), only
> to find that this whole claim is based on a view that a headroom of 1
> byte is a valid scenario in practice. Is it theoretically possible
> from the math? yes. Practically, I don't see how it makes any sense.
> Can you describe a scenario where this would happen in practise?
As I mentioned in the previous mail, I agree it is unlikely in
practice. For context, this was an edge case brought up by sashiko
regarding our patch. I wrote a test kernel module and verified that if
an skb is artificially constructed with exactly 1 byte of headroom,
skb_cow() is indeed insufficient. However, I also checked real-world
packets and the headroom is always plenty, so I agree it doesn't
happen in the real world.
>
> > Reviewed-by: Han Guidong <2045gemini@gmail.com>
>
> More like reviewed by my AI who goes by the name foobar.
I will work on improving my review quality going forward. Sorry again
for the noise.
Thanks.
>
> cheers,
> jamal
>
> >
> >
> > > + if (skb_cow(skb, -write_offset))
> > > + goto bad;
> > > + } else {
> > > + if (unlikely(check_add_overflow(write_offset,
> > > + (int)sizeof(hdata),
> > > + &write_len)))
> > > + goto bad;
> > > + if (skb_ensure_writable(skb, min_t(int, skb->len,
> > > + write_len)))
> > > + goto bad;
> > > + }
> > > +
> > > + ptr = skb_header_pointer(skb, write_offset,
> > > sizeof(hdata), &hdata);
> > > if (!ptr)
> > > goto bad;
> > > @@ -475,7 +503,7 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
> > >
> > > *ptr = ((*ptr & tkey->mask) ^ val);
> > > if (ptr == &hdata)
> > > - skb_store_bits(skb, hoffset + offset, ptr, 4);
> > > + skb_store_bits(skb, write_offset, ptr, sizeof(hdata));
> > > }
> > >
> > > goto done;
> > > --
> > > 2.51.2.windows.1
> > >
^ permalink raw reply [flat|nested] 12+ messages in thread* Re: [PATCH net] net/sched: fix pedit partial COW leading to page cache corruption
2026-05-20 9:12 ` Jamal Hadi Salim
2026-05-20 10:04 ` Han Guidong
@ 2026-05-20 10:36 ` Han Guidong
2026-05-20 11:40 ` Jamal Hadi Salim
1 sibling, 1 reply; 12+ messages in thread
From: Han Guidong @ 2026-05-20 10:36 UTC (permalink / raw)
To: Jamal Hadi Salim
Cc: Rajat Gupta, netdev, davem, edumazet, kuba, pabeni, horms, jiri,
yimingqian591, keenanat2000, rollkingzzc
On Wed, May 20, 2026 at 5:12 PM Jamal Hadi Salim <jhs@mojatatu.com> wrote:
>
> On Tue, May 19, 2026 at 11:11 AM Han Guidong <2045gemini@gmail.com> wrote:
> >
> > On Tue, May 19, 2026 at 11:42 AM Rajat Gupta
> > <rajat.gupta@oss.qualcomm.com> wrote:
> > >
> > > tcf_pedit_act() computes the COW range for skb_ensure_writable()
> > > once before the key loop using tcfp_off_max_hint, but the hint does
> > > not account for the runtime header offset added by typed keys. This
> > > can leave part of the write region un-COW'd.
> > >
> > > Fix by moving skb_ensure_writable() inside the per-key loop where
> > > the actual write offset is known, and add overflow checking on the
> > > offset arithmetic. For negative offsets (e.g. Ethernet header edits
> > > at ingress), use skb_cow() to COW the headroom instead. Guard
> > > offset_valid() against INT_MIN, where negation is undefined.
> > >
> > > Additionally, linearize skbs with shared frags upfront to prevent
> > > silent data corruption when pedit operates on zero-copy pages
> > > (e.g. from sendfile).
> > >
> > > Fixes: 8b796475fd78 ("net/sched: act_pedit: really ensure the skb is writable")
> >
> > Fixes: 6c02568fd1ae ("net/sched: act_pedit: Parse L3 Header for L4 offset")
> > Cc: stable@vger.kernel.org
> >
> > I took a closer look at the code, and this patch is really addressing
> > two issues.
> >
> > The negative-offset write into shared head/headroom appears to be from
> > 8b796475fd78, while the shared-frag write, and thus possible
> > page-cache corruption, appears to be the later regression from
> > 6c02568fd1ae. So it may be worth adding an additional Fixes tag for
> > 6c02568fd1ae,
>
> Pushing it a little - but sure we should also point to that commit.
>
> > and I think an explicit Cc: stable@vger.kernel.org would also make sense here.
>
> Please study the rules.
Hi Jamal,
Apologies, I'm still a newbie learning the process. I checked the docs
and saw commit dbbe7c962c3a ("docs: networking: drop special stable
handling"), but I'm still a bit confused.
Did the rule change to allow explicit "Cc: stable@vger.kernel.org"
tags for netdev?
If I completely missed the correct documentation, I would be very
grateful if you could point me to it.
Thanks for your patience!
^ permalink raw reply [flat|nested] 12+ messages in thread* Re: [PATCH net] net/sched: fix pedit partial COW leading to page cache corruption
2026-05-20 10:36 ` Han Guidong
@ 2026-05-20 11:40 ` Jamal Hadi Salim
0 siblings, 0 replies; 12+ messages in thread
From: Jamal Hadi Salim @ 2026-05-20 11:40 UTC (permalink / raw)
To: Han Guidong
Cc: Rajat Gupta, netdev, davem, edumazet, kuba, pabeni, horms, jiri,
yimingqian591, keenanat2000, rollkingzzc
On Wed, May 20, 2026 at 6:37 AM Han Guidong <2045gemini@gmail.com> wrote:
>
> On Wed, May 20, 2026 at 5:12 PM Jamal Hadi Salim <jhs@mojatatu.com> wrote:
> >
> > On Tue, May 19, 2026 at 11:11 AM Han Guidong <2045gemini@gmail.com> wrote:
> > >
> > > On Tue, May 19, 2026 at 11:42 AM Rajat Gupta
> > > <rajat.gupta@oss.qualcomm.com> wrote:
> > > >
> > > > tcf_pedit_act() computes the COW range for skb_ensure_writable()
> > > > once before the key loop using tcfp_off_max_hint, but the hint does
> > > > not account for the runtime header offset added by typed keys. This
> > > > can leave part of the write region un-COW'd.
> > > >
> > > > Fix by moving skb_ensure_writable() inside the per-key loop where
> > > > the actual write offset is known, and add overflow checking on the
> > > > offset arithmetic. For negative offsets (e.g. Ethernet header edits
> > > > at ingress), use skb_cow() to COW the headroom instead. Guard
> > > > offset_valid() against INT_MIN, where negation is undefined.
> > > >
> > > > Additionally, linearize skbs with shared frags upfront to prevent
> > > > silent data corruption when pedit operates on zero-copy pages
> > > > (e.g. from sendfile).
> > > >
> > > > Fixes: 8b796475fd78 ("net/sched: act_pedit: really ensure the skb is writable")
> > >
> > > Fixes: 6c02568fd1ae ("net/sched: act_pedit: Parse L3 Header for L4 offset")
> > > Cc: stable@vger.kernel.org
> > >
> > > I took a closer look at the code, and this patch is really addressing
> > > two issues.
> > >
> > > The negative-offset write into shared head/headroom appears to be from
> > > 8b796475fd78, while the shared-frag write, and thus possible
> > > page-cache corruption, appears to be the later regression from
> > > 6c02568fd1ae. So it may be worth adding an additional Fixes tag for
> > > 6c02568fd1ae,
> >
> > Pushing it a little - but sure we should also point to that commit.
> >
> > > and I think an explicit Cc: stable@vger.kernel.org would also make sense here.
> >
> > Please study the rules.
>
> Hi Jamal,
>
> Apologies, I'm still a newbie learning the process. I checked the docs
> and saw commit dbbe7c962c3a ("docs: networking: drop special stable
> handling"), but I'm still a bit confused.
>
> Did the rule change to allow explicit "Cc: stable@vger.kernel.org"
> tags for netdev?
>
> If I completely missed the correct documentation, I would be very
> grateful if you could point me to it.
>
This is a security fix. AFAIK, It does not follow the stable rules.
It goes via the net tree (then propagated via net-next) before making
it to linux main. It gets picked up by the stable tree.
But: You are correct that Ccing stable is useful when the vulnerberity
hasnt been published. I apologize for being harsh - too many AI emails
claiming bugs on which i spend too much time and you bore the brunt of
it. So we should Cc them.
cheers,
jamal
> Thanks for your patience!
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net] net/sched: fix pedit partial COW leading to page cache corruption
2026-05-19 3:39 ` [PATCH net] net/sched: fix pedit partial COW leading to page cache corruption Rajat Gupta
2026-05-19 11:18 ` Toke Høiland-Jørgensen
2026-05-19 15:10 ` Han Guidong
@ 2026-05-20 9:23 ` Jamal Hadi Salim
2 siblings, 0 replies; 12+ messages in thread
From: Jamal Hadi Salim @ 2026-05-20 9:23 UTC (permalink / raw)
To: Rajat Gupta
Cc: netdev, davem, edumazet, kuba, pabeni, horms, jiri, yimingqian591,
keenanat2000, 2045gemini, rollkingzzc
On Mon, May 18, 2026 at 11:42 PM Rajat Gupta
<rajat.gupta@oss.qualcomm.com> wrote:
>
> tcf_pedit_act() computes the COW range for skb_ensure_writable()
> once before the key loop using tcfp_off_max_hint, but the hint does
> not account for the runtime header offset added by typed keys. This
> can leave part of the write region un-COW'd.
>
> Fix by moving skb_ensure_writable() inside the per-key loop where
> the actual write offset is known, and add overflow checking on the
> offset arithmetic. For negative offsets (e.g. Ethernet header edits
> at ingress), use skb_cow() to COW the headroom instead. Guard
> offset_valid() against INT_MIN, where negation is undefined.
>
> Additionally, linearize skbs with shared frags upfront to prevent
> silent data corruption when pedit operates on zero-copy pages
> (e.g. from sendfile).
>
Sashiko makes the claim that:
"
....
.....
If the payload part extending past offset 0 resides in a shared
fragment for a cloned skb, wouldn't skb_store_bits() write the
positive offset portion directly into the shared page, causing a data
corruption regression?
"
It missed the earlier code (before the quoted code is hit) which is
very well documented!
/*
* If the skb has shared frags the user is likely using zero-copy
* (e.g. sendfile). Those page frags may point to page-cache pages;
* writing into them would silently corrupt the page cache.
* Linearize so pedit operates on a private copy.
* TL;DR: if you want zero-copy, don't use pedit.
*/
if (skb_has_shared_frag(skb)) {
if (__skb_linearize(skb))
goto bad;
}
TLDR:
If the skb has shared frags, it is linearized upfront using
__skb_linearize() which means any risk of modifying a shared fragment
is non-existent.
cheers,
jamal
> Fixes: 8b796475fd78 ("net/sched: act_pedit: really ensure the skb is writable")
> Reported-by: Rajat Gupta <rajat.gupta@oss.qualcomm.com>
> Reported-by: Yiming Qian <yimingqian591@gmail.com>
> Reported-by: Keenan Dong <keenanat2000@gmail.com>
> Reported-by: Han Guidong <2045gemini@gmail.com>
> Reported-by: Zhang Cen <rollkingzzc@gmail.com>
> Tested-by: Han Guidong <2045gemini@gmail.com>
> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
> Signed-off-by: Rajat Gupta <rajat.gupta@oss.qualcomm.com>
> ---
> net/sched/act_pedit.c | 54 ++++++++++++++++++++++++++++++++-----------
> 1 file changed, 41 insertions(+), 13 deletions(-)
>
> diff --git a/net/sched/act_pedit.c b/net/sched/act_pedit.c
> index bc20f08a2..79921b8d8 100644
> --- a/net/sched/act_pedit.c
> +++ b/net/sched/act_pedit.c
> @@ -16,6 +16,7 @@
> #include <linux/ip.h>
> #include <linux/ipv6.h>
> #include <linux/slab.h>
> +#include <linux/overflow.h>
> #include <net/ipv6.h>
> #include <net/netlink.h>
> #include <net/pkt_sched.h>
> @@ -323,8 +324,10 @@ static bool offset_valid(struct sk_buff *skb, int offset)
> if (offset > 0 && offset > skb->len)
> return false;
>
> - if (offset < 0 && -offset > skb_headroom(skb))
> - return false;
> + if (offset < 0) {
> + if (offset == INT_MIN || -offset > skb_headroom(skb))
> + return false;
> + }
>
> return true;
> }
> @@ -393,17 +396,21 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
> struct tcf_pedit_key_ex *tkey_ex;
> struct tcf_pedit_parms *parms;
> struct tc_pedit_key *tkey;
> - u32 max_offset;
> int i;
>
> parms = rcu_dereference_bh(p->parms);
>
> - max_offset = (skb_transport_header_was_set(skb) ?
> - skb_transport_offset(skb) :
> - skb_network_offset(skb)) +
> - parms->tcfp_off_max_hint;
> - if (skb_ensure_writable(skb, min(skb->len, max_offset)))
> - goto done;
> + /*
> + * If the skb has shared frags the user is likely using zero-copy
> + * (e.g. sendfile). Those page frags may point to page-cache pages;
> + * writing into them would silently corrupt the page cache.
> + * Linearize so pedit operates on a private copy.
> + * TL;DR: if you want zero-copy, don't use pedit.
> + */
> + if (skb_has_shared_frag(skb)) {
> + if (__skb_linearize(skb))
> + goto bad;
> + }
>
> tcf_lastuse_update(&p->tcf_tm);
> tcf_action_update_bstats(&p->common, skb);
> @@ -412,6 +419,7 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
> tkey_ex = parms->tcfp_keys_ex;
>
> for (i = parms->tcfp_nkeys; i > 0; i--, tkey++) {
> + int write_offset, write_len;
> int offset = tkey->off;
> int hoffset = 0;
> u32 *ptr, hdata;
> @@ -451,12 +459,32 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
> }
> }
>
> - if (!offset_valid(skb, hoffset + offset)) {
> - pr_info_ratelimited("tc action pedit offset %d out of bounds\n", hoffset + offset);
> + if (unlikely(check_add_overflow(hoffset, offset,
> + &write_offset))) {
> + pr_info_ratelimited("tc action pedit offset overflow\n");
> + goto bad;
> + }
> +
> + if (!offset_valid(skb, write_offset)) {
> + pr_info_ratelimited("tc action pedit offset %d out of bounds\n",
> + write_offset);
> goto bad;
> }
>
> - ptr = skb_header_pointer(skb, hoffset + offset,
> + if (write_offset < 0) {
> + if (skb_cow(skb, -write_offset))
> + goto bad;
> + } else {
> + if (unlikely(check_add_overflow(write_offset,
> + (int)sizeof(hdata),
> + &write_len)))
> + goto bad;
> + if (skb_ensure_writable(skb, min_t(int, skb->len,
> + write_len)))
> + goto bad;
> + }
> +
> + ptr = skb_header_pointer(skb, write_offset,
> sizeof(hdata), &hdata);
> if (!ptr)
> goto bad;
> @@ -475,7 +503,7 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
>
> *ptr = ((*ptr & tkey->mask) ^ val);
> if (ptr == &hdata)
> - skb_store_bits(skb, hoffset + offset, ptr, 4);
> + skb_store_bits(skb, write_offset, ptr, sizeof(hdata));
> }
>
> goto done;
> --
> 2.51.2.windows.1
>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net] net/sched: fix pedit partial COW leading to page cache
@ 2026-05-18 12:55 Han Guidong
0 siblings, 0 replies; 12+ messages in thread
From: Han Guidong @ 2026-05-18 12:55 UTC (permalink / raw)
To: Rajat Gupta
Cc: David S. Miller, Eric Dumazet, Simon Horman, Jamal Hadi Salim,
Jiri Pirko, Jakub Kicinski, netdev, Paolo Abeni
On Sun, May 17, 2026 at 06:30:31PM -0700, Rajat Gupta wrote:
> >From f12ea6484dbb75d1c13495e921d0595532317f2a Mon Sep 17 00:00:00 2001
> From: Rajat Gupta <rajat.gupta@oss.qualcomm.com>
> Date: Sun, 17 May 2026 10:11:44 -0700
> Subject: [PATCH net] net/sched: fix pedit partial COW leading to page cache
> corruption
>
> tcf_pedit_act() computes the COW range for skb_ensure_writable()
> once before the key loop using tcfp_off_max_hint, but the hint does
> not account for the runtime header offset added by typed keys. This
> can leave part of the write region un-COW'd.
>
> Fix by moving skb_ensure_writable() inside the per-key loop where
> the actual write offset is known, and add overflow checking on the
> offset arithmetic. For negative offsets (e.g. Ethernet header edits
> at ingress), use skb_cow() to COW the headroom instead. Guard
> offset_valid() against INT_MIN, where negation is undefined.
>
> Additionally, linearize skbs with shared frags upfront to prevent
> silent data corruption when pedit operates on zero-copy pages
> (e.g. from sendfile).
>
> Fixes: 8b796475fd78 ("net/sched: act_pedit: really ensure the skb is writable")
> Reported-by: Rajat Gupta <rajat.gupta@oss.qualcomm.com>
> Reported-by: Yiming Qian <yimingqian591@gmail.com>
> Reported-by: Keenan Dong <keenanat2000@gmail.com>
> Reported-by: Han Guidong <2045gemini@gmail.com>
> Reported-by: Zhang Cen <rollkingzzc@gmail.com>
> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
> Signed-off-by: Rajat Gupta <rajat.gupta@oss.qualcomm.com>
Solid fix. Linearizing shared frags and dynamic COW checks completely
resolve the offset drift and page cache corruption. Verified locally.
Tested-by: Han Guidong <2045gemini@gmail.com>
Reviewed-by: Han Guidong <2045gemini@gmail.com>
> ---
> net/sched/act_pedit.c | 52 ++++++++++++++++++++++++++++++++-----------
> 1 file changed, 39 insertions(+), 13 deletions(-)
>
> diff --git a/net/sched/act_pedit.c b/net/sched/act_pedit.c
> index bc20f08a2..e077a2e82 100644
> --- a/net/sched/act_pedit.c
> +++ b/net/sched/act_pedit.c
> @@ -16,6 +16,7 @@
> #include <linux/ip.h>
> #include <linux/ipv6.h>
> #include <linux/slab.h>
> +#include <linux/overflow.h>
> #include <net/ipv6.h>
> #include <net/netlink.h>
> #include <net/pkt_sched.h>
> @@ -323,8 +324,10 @@ static bool offset_valid(struct sk_buff *skb, int offset)
> if (offset > 0 && offset > skb->len)
> return false;
>
> - if (offset < 0 && -offset > skb_headroom(skb))
> - return false;
> + if (offset < 0) {
> + if (offset == INT_MIN || -offset > skb_headroom(skb))
> + return false;
> + }
>
> return true;
> }
> @@ -393,17 +396,19 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
> struct tcf_pedit_key_ex *tkey_ex;
> struct tcf_pedit_parms *parms;
> struct tc_pedit_key *tkey;
> - u32 max_offset;
> int i;
>
> parms = rcu_dereference_bh(p->parms);
>
> - max_offset = (skb_transport_header_was_set(skb) ?
> - skb_transport_offset(skb) :
> - skb_network_offset(skb)) +
> - parms->tcfp_off_max_hint;
> - if (skb_ensure_writable(skb, min(skb->len, max_offset)))
> - goto done;
> + /* If the skb has shared frags the user is likely using zero-copy
> + * (e.g. sendfile). Those page frags may point to page-cache pages;
> + * writing into them would silently corrupt the page cache.
> + * Linearize so pedit operates on a private copy.
> + * If you want zero-copy, don't use pedit.
> + TL;DR if you want to use ZC, don't use pedit*/
> + if (skb_has_shared_frag(skb)) {
> + if (__skb_linearize(skb))
> + goto bad;
> + }
>
> tcf_lastuse_update(&p->tcf_tm);
> tcf_action_update_bstats(&p->common, skb);
> @@ -414,6 +419,7 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
> for (i = parms->tcfp_nkeys; i > 0; i--, tkey++) {
> int offset = tkey->off;
> int hoffset = 0;
> + int write_offset, write_len;
> u32 *ptr, hdata;
> u32 val;
> int rc;
> @@ -451,12 +457,32 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
> }
> }
>
> - if (!offset_valid(skb, hoffset + offset)) {
> - pr_info_ratelimited("tc action pedit offset %d out of bounds\n", hoffset + offset);
> + if (unlikely(check_add_overflow(hoffset, offset,
> + &write_offset))) {
> + pr_info_ratelimited("tc action pedit offset overflow\n");
> + goto bad;
> + }
> +
> + if (!offset_valid(skb, write_offset)) {
> + pr_info_ratelimited("tc action pedit offset %d out of bounds\n",
> + write_offset);
> goto bad;
> }
>
> - ptr = skb_header_pointer(skb, hoffset + offset,
> + if (write_offset < 0) {
> + if (skb_cow(skb, -write_offset))
> + goto bad;
> + } else {
> + if (unlikely(check_add_overflow(write_offset,
> + (int)sizeof(hdata),
> + &write_len)))
> + goto bad;
> + if (skb_ensure_writable(skb, min_t(int, skb->len,
> + write_len)))
> + goto bad;
> + }
> +
> + ptr = skb_header_pointer(skb, write_offset,
> sizeof(hdata), &hdata);
> if (!ptr)
> goto bad;
> @@ -475,7 +501,7 @@ TC_INDIRECT_SCOPE int tcf_pedit_act(struct sk_buff *skb,
>
> *ptr = ((*ptr & tkey->mask) ^ val);
> if (ptr == &hdata)
> - skb_store_bits(skb, hoffset + offset, ptr, 4);
> + skb_store_bits(skb, write_offset, ptr, sizeof(hdata));
> }
>
> goto done;
> --
> 2.51.2.windows.1
>
> Thank you,
> Rajat
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2026-05-20 11:40 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-18 1:30 [PATCH net] net/sched: fix pedit partial COW leading to page cache Rajat Gupta
2026-05-18 13:10 ` Han Guidong
2026-05-18 13:31 ` Jamal Hadi Salim
2026-05-19 3:39 ` [PATCH net] net/sched: fix pedit partial COW leading to page cache corruption Rajat Gupta
2026-05-19 11:18 ` Toke Høiland-Jørgensen
2026-05-19 15:10 ` Han Guidong
2026-05-20 9:12 ` Jamal Hadi Salim
2026-05-20 10:04 ` Han Guidong
2026-05-20 10:36 ` Han Guidong
2026-05-20 11:40 ` Jamal Hadi Salim
2026-05-20 9:23 ` Jamal Hadi Salim
-- strict thread matches above, loose matches on Subject: below --
2026-05-18 12:55 [PATCH net] net/sched: fix pedit partial COW leading to page cache Han Guidong
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox