* [PATCH net 1/1] net: ipv4: bound TCP reordering sysctl writes [not found] <cover.1780979031.git.bronzed_45_vested@icloud.com> @ 2026-06-10 16:18 ` Ren Wei 2026-06-10 17:04 ` Eric Dumazet 0 siblings, 1 reply; 3+ messages in thread From: Ren Wei @ 2026-06-10 16:18 UTC (permalink / raw) To: netdev Cc: edumazet, pabeni, kuba, chia-yu.chang, ij, fmancera, yuuchihsu, idosch, herbert, yuantan098, zcliangcn, bird, bronzed_45_vested, n05ec From: Wyatt Feng <bronzed_45_vested@icloud.com> Reject invalid `net.ipv4.tcp_reordering` values before they reach TCP socket state. The sysctl is stored as an `int` but copied into the `u32` `tp->reordering` field for new sockets, so negative writes wrap to large values. With `tcp_mtu_probing=2`, the wrapped value can overflow the `tcp_mtu_probe()` size calculation and drive the MTU probing path into an out-of-bounds read. Route `tcp_reordering` writes through `proc_dointvec_minmax()` and clamp them to the per-netns range `[1, tcp_max_reordering]`. Apply the matching cross-check to `tcp_max_reordering` so the existing invariant `tcp_max_reordering >= tcp_reordering` is preserved. This keeps the fix at the sysctl boundary and avoids changing the TCP fast path. Fixes: 91cc17c0e5e5 ("[TCP]: MTUprobe: receiver window & data available checks fixed") Cc: stable@vger.kernel.org Reported-by: Yuan Tan <yuantan098@gmail.com> Reported-by: Zhengchuan Liang <zcliangcn@gmail.com> Reported-by: Xin Liu <bird@lzu.edu.cn> Assisted-by: Codex:GPT-5.4 Signed-off-by: Wyatt Feng <bronzed_45_vested@icloud.com> Signed-off-by: Ren Wei <n05ec@lzu.edu.cn> --- net/ipv4/sysctl_net_ipv4.c | 38 ++++++++++++++++++++++++++++++++++++-- 1 file changed, 36 insertions(+), 2 deletions(-) diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c index c0e85cc171ae..c44afb9c321d 100644 --- a/net/ipv4/sysctl_net_ipv4.c +++ b/net/ipv4/sysctl_net_ipv4.c @@ -215,6 +215,40 @@ static int ipv4_fwd_update_priority(const struct ctl_table *table, int write, return ret; } +static int proc_tcp_reordering(const struct ctl_table *table, int write, + void *buffer, size_t *lenp, loff_t *ppos) +{ + struct ctl_table tmp = { + .data = table->data, + .maxlen = table->maxlen, + .mode = table->mode, + .extra1 = SYSCTL_ONE, + }; + struct net *net; + + net = container_of(table->data, struct net, ipv4.sysctl_tcp_reordering); + tmp.extra2 = &net->ipv4.sysctl_tcp_max_reordering; + + return proc_dointvec_minmax(&tmp, write, buffer, lenp, ppos); +} + +static int proc_tcp_max_reordering(const struct ctl_table *table, int write, + void *buffer, size_t *lenp, loff_t *ppos) +{ + struct ctl_table tmp = { + .data = table->data, + .maxlen = table->maxlen, + .mode = table->mode, + }; + struct net *net; + + net = container_of(table->data, struct net, + ipv4.sysctl_tcp_max_reordering); + tmp.extra1 = &net->ipv4.sysctl_tcp_reordering; + + return proc_dointvec_minmax(&tmp, write, buffer, lenp, ppos); +} + static int proc_tcp_congestion_control(const struct ctl_table *ctl, int write, void *buffer, size_t *lenp, loff_t *ppos) { @@ -1058,7 +1092,7 @@ static struct ctl_table ipv4_net_table[] = { .data = &init_net.ipv4.sysctl_tcp_reordering, .maxlen = sizeof(int), .mode = 0644, - .proc_handler = proc_dointvec + .proc_handler = proc_tcp_reordering }, { .procname = "tcp_retries1", @@ -1293,7 +1327,7 @@ static struct ctl_table ipv4_net_table[] = { .data = &init_net.ipv4.sysctl_tcp_max_reordering, .maxlen = sizeof(int), .mode = 0644, - .proc_handler = proc_dointvec + .proc_handler = proc_tcp_max_reordering }, { .procname = "tcp_dsack", -- 2.47.3 ^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [PATCH net 1/1] net: ipv4: bound TCP reordering sysctl writes 2026-06-10 16:18 ` [PATCH net 1/1] net: ipv4: bound TCP reordering sysctl writes Ren Wei @ 2026-06-10 17:04 ` Eric Dumazet 2026-06-11 9:21 ` David Laight 0 siblings, 1 reply; 3+ messages in thread From: Eric Dumazet @ 2026-06-10 17:04 UTC (permalink / raw) To: Ren Wei, Neal Cardwell, Kuniyuki Iwashima Cc: netdev, pabeni, kuba, chia-yu.chang, ij, fmancera, yuuchihsu, idosch, herbert, yuantan098, zcliangcn, bird, bronzed_45_vested On Wed, Jun 10, 2026 at 9:18 AM Ren Wei <n05ec@lzu.edu.cn> wrote: > > From: Wyatt Feng <bronzed_45_vested@icloud.com> > > Reject invalid `net.ipv4.tcp_reordering` values before they reach TCP > socket state. The sysctl is stored as an `int` but copied into the > `u32` `tp->reordering` field for new sockets, so negative writes wrap > to large values. > > With `tcp_mtu_probing=2`, the wrapped value can overflow the > `tcp_mtu_probe()` size calculation and drive the MTU probing path into > an out-of-bounds read. Route `tcp_reordering` writes through > `proc_dointvec_minmax()` and clamp them to the per-netns range > `[1, tcp_max_reordering]`. > > Apply the matching cross-check to `tcp_max_reordering` so the existing > invariant `tcp_max_reordering >= tcp_reordering` is preserved. This > keeps the fix at the sysctl boundary and avoids changing the TCP fast > path. > > Fixes: 91cc17c0e5e5 ("[TCP]: MTUprobe: receiver window & data available checks fixed") > Cc: stable@vger.kernel.org > Reported-by: Yuan Tan <yuantan098@gmail.com> > Reported-by: Zhengchuan Liang <zcliangcn@gmail.com> > Reported-by: Xin Liu <bird@lzu.edu.cn> > Assisted-by: Codex:GPT-5.4 > Signed-off-by: Wyatt Feng <bronzed_45_vested@icloud.com> > Signed-off-by: Ren Wei <n05ec@lzu.edu.cn> > --- > net/ipv4/sysctl_net_ipv4.c | 38 ++++++++++++++++++++++++++++++++++++-- > 1 file changed, 36 insertions(+), 2 deletions(-) > > diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c > index c0e85cc171ae..c44afb9c321d 100644 > --- a/net/ipv4/sysctl_net_ipv4.c > +++ b/net/ipv4/sysctl_net_ipv4.c > @@ -215,6 +215,40 @@ static int ipv4_fwd_update_priority(const struct ctl_table *table, int write, > return ret; > } > > +static int proc_tcp_reordering(const struct ctl_table *table, int write, > + void *buffer, size_t *lenp, loff_t *ppos) > +{ > + struct ctl_table tmp = { > + .data = table->data, > + .maxlen = table->maxlen, > + .mode = table->mode, > + .extra1 = SYSCTL_ONE, > + }; > + struct net *net; > + > + net = container_of(table->data, struct net, ipv4.sysctl_tcp_reordering); > + tmp.extra2 = &net->ipv4.sysctl_tcp_max_reordering; > + > + return proc_dointvec_minmax(&tmp, write, buffer, lenp, ppos); > +} > + > +static int proc_tcp_max_reordering(const struct ctl_table *table, int write, > + void *buffer, size_t *lenp, loff_t *ppos) > +{ > + struct ctl_table tmp = { > + .data = table->data, > + .maxlen = table->maxlen, > + .mode = table->mode, > + }; > + struct net *net; > + > + net = container_of(table->data, struct net, > + ipv4.sysctl_tcp_max_reordering); > + tmp.extra1 = &net->ipv4.sysctl_tcp_reordering; > + > + return proc_dointvec_minmax(&tmp, write, buffer, lenp, ppos); > +} > + > static int proc_tcp_congestion_control(const struct ctl_table *ctl, int write, > void *buffer, size_t *lenp, loff_t *ppos) > { > @@ -1058,7 +1092,7 @@ static struct ctl_table ipv4_net_table[] = { > .data = &init_net.ipv4.sysctl_tcp_reordering, > .maxlen = sizeof(int), > .mode = 0644, > - .proc_handler = proc_dointvec > + .proc_handler = proc_tcp_reordering > }, > { > .procname = "tcp_retries1", > @@ -1293,7 +1327,7 @@ static struct ctl_table ipv4_net_table[] = { > .data = &init_net.ipv4.sysctl_tcp_max_reordering, > .maxlen = sizeof(int), > .mode = 0644, > - .proc_handler = proc_dointvec > + .proc_handler = proc_tcp_max_reordering > }, > { > .procname = "tcp_dsack", > -- > 2.47.3 > Thanks for the patch. I think we can do better. Also I do not see a fix in tcp_mtu_probe()? What about: diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c index c0e85cc171aec099fd5d4897b1a623dd27eaee08..987bc87e255a81680ae5e23ddd01c1f0de4425ec 100644 --- a/net/ipv4/sysctl_net_ipv4.c +++ b/net/ipv4/sysctl_net_ipv4.c @@ -1058,6 +1058,9 @@ static struct ctl_table ipv4_net_table[] = { .data = &init_net.ipv4.sysctl_tcp_reordering, .maxlen = sizeof(int), .mode = 0644, + .proc_handler = proc_dointvec_minmax, + .extra1 = SYSCTL_ONE, + .extra2 = &init_net.ipv4.sysctl_tcp_max_reordering, .proc_handler = proc_dointvec }, { @@ -1293,7 +1296,8 @@ static struct ctl_table ipv4_net_table[] = { .data = &init_net.ipv4.sysctl_tcp_max_reordering, .maxlen = sizeof(int), .mode = 0644, - .proc_handler = proc_dointvec + .proc_handler = proc_dointvec_minmax, + .extra1 = SYSCTL_ONE, }, { .procname = "tcp_dsack", @@ -1676,6 +1680,9 @@ static __net_init int ipv4_sysctl_init_net(struct net *net) */ table[i].mode &= ~0222; } + if (table[i].extra2 >= (void *)&init_net.ipv4 && + table[i].extra2 < (void *)(&init_net.ipv4 + 1)) + table[i].extra2 += (void *)net - (void *)&init_net; } } diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index 6e4bb411dc041b2a299f0fbc2109c530afa2a345..193637a58dcc1acf8e70fdfbde982c2a3eed290e 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -2687,7 +2687,7 @@ static int tcp_mtu_probe(struct sock *sk) struct sk_buff *skb, *nskb, *next; struct net *net = sock_net(sk); int probe_size; - int size_needed; + u64 size_needed; int copy, len; int mss_now; int interval; @@ -2711,7 +2711,7 @@ static int tcp_mtu_probe(struct sock *sk) mss_now = tcp_current_mss(sk); probe_size = tcp_mtu_to_mss(sk, (icsk->icsk_mtup.search_high + icsk->icsk_mtup.search_low) >> 1); - size_needed = probe_size + (tp->reordering + 1) * tp->mss_cache; + size_needed = probe_size + (tp->reordering + 1) * (u64)tp->mss_cache; interval = icsk->icsk_mtup.search_high - icsk->icsk_mtup.search_low; /* When misfortune happens, we are reprobing actively, * and then reprobe timer has expired. We stick with current ^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH net 1/1] net: ipv4: bound TCP reordering sysctl writes 2026-06-10 17:04 ` Eric Dumazet @ 2026-06-11 9:21 ` David Laight 0 siblings, 0 replies; 3+ messages in thread From: David Laight @ 2026-06-11 9:21 UTC (permalink / raw) To: Eric Dumazet Cc: Ren Wei, Neal Cardwell, Kuniyuki Iwashima, netdev, pabeni, kuba, chia-yu.chang, ij, fmancera, yuuchihsu, idosch, herbert, yuantan098, zcliangcn, bird, bronzed_45_vested On Wed, 10 Jun 2026 10:04:06 -0700 Eric Dumazet <edumazet@google.com> wrote: > On Wed, Jun 10, 2026 at 9:18 AM Ren Wei <n05ec@lzu.edu.cn> wrote: > > > > From: Wyatt Feng <bronzed_45_vested@icloud.com> > > > > Reject invalid `net.ipv4.tcp_reordering` values before they reach TCP > > socket state. The sysctl is stored as an `int` but copied into the > > `u32` `tp->reordering` field for new sockets, so negative writes wrap > > to large values. > > > > With `tcp_mtu_probing=2`, the wrapped value can overflow the > > `tcp_mtu_probe()` size calculation and drive the MTU probing path into > > an out-of-bounds read. Route `tcp_reordering` writes through > > `proc_dointvec_minmax()` and clamp them to the per-netns range > > `[1, tcp_max_reordering]`. > > ... > Thanks for the patch. I think we can do better. > > Also I do not see a fix in tcp_mtu_probe()? > > What about: > > > diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c > index c0e85cc171aec099fd5d4897b1a623dd27eaee08..987bc87e255a81680ae5e23ddd01c1f0de4425ec > 100644 > --- a/net/ipv4/sysctl_net_ipv4.c > +++ b/net/ipv4/sysctl_net_ipv4.c > @@ -1058,6 +1058,9 @@ static struct ctl_table ipv4_net_table[] = { > .data = &init_net.ipv4.sysctl_tcp_reordering, > .maxlen = sizeof(int), > .mode = 0644, > + .proc_handler = proc_dointvec_minmax, > + .extra1 = SYSCTL_ONE, > + .extra2 = &init_net.ipv4.sysctl_tcp_max_reordering, > .proc_handler = proc_dointvec > }, > { > @@ -1293,7 +1296,8 @@ static struct ctl_table ipv4_net_table[] = { > .data = &init_net.ipv4.sysctl_tcp_max_reordering, > .maxlen = sizeof(int), > .mode = 0644, > - .proc_handler = proc_dointvec > + .proc_handler = proc_dointvec_minmax, > + .extra1 = SYSCTL_ONE, Would it be reasonable to put in a 'sanity' upper bound here? Since tcp_max_reordering can be lowered after tcp_reordering is set the bound set here isn't 'hard'. So the same sanity limit for both may be enough? I also found this 'gem': void tcp_enter_loss(struct sock *sk) { ... u8 reordering; ... reordering = READ_ONCE(net->ipv4.sysctl_tcp_reordering); Which makes be think that someone expected these values to be small. I don't know what the sane bounds are for either sysctl or tp->reordering itself (which is u32). -- David ^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2026-06-11 9:21 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <cover.1780979031.git.bronzed_45_vested@icloud.com>
2026-06-10 16:18 ` [PATCH net 1/1] net: ipv4: bound TCP reordering sysctl writes Ren Wei
2026-06-10 17:04 ` Eric Dumazet
2026-06-11 9:21 ` David Laight
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox