* [PATCH] ipv4: avoid divide 0 error in tcp_incr_quickack
From: Chen Weilong @ 2014-11-03  1:29 UTC
To: davem, kuznet, jmorris, yoshfuji, kaber
Cc: netdev, linux-kernel

From: Weilong Chen <chenweilong@huawei.com>

We got a problem like this:

[ffff8801c1a05570] machine_kexec at ffffffff81025039
[ffff8801c1a055d0] crash_kexec at ffffffff8109b253
[ffff8801c1a056a0] oops_end at ffffffff81442aed
[ffff8801c1a056d0] die at ffffffff81005603
[ffff8801c1a05700] do_trap at ffffffff81442448
[ffff8801c1a05760] do_divide_error at ffffffff81002c10
[ffff8801c1a05888] tcp_send_dupack at ffffffff81385e44
[ffff8801c1a058c8] tcp_validate_incoming at ffffffff813886b5
[ffff8801c1a05908] tcp_rcv_state_process at ffffffff8138d0b7
[ffff8801c1a05958] tcp_child_process at ffffffff81397255
[ffff8801c1a05988] tcp_v4_do_rcv at ffffffff81395a70
[ffff8801c1a059d8] tcp_v4_rcv at ffffffff81396fc8
[ffff8801c1a05a48] ip_local_deliver_finish at ffffffff813746e9
[ffff8801c1a05a78] ip_local_deliver at ffffffff81374a20
[ffff8801c1a05aa8] ip_rcv_finish at ffffffff81374389
[ffff8801c1a05ad8] ip_rcv at ffffffff81374c78

A wrong ACK packet arrived during the TCP handshake. The socket's state
was TCP_SYN_RECV and its rcv_mss was not initialized yet. So
tcp_send_dupack -> tcp_enter_quickack_mode hit a divide-by-zero error.
This patch adds a state check before tcp_enter_quickack_mode.

Signed-off-by: Weilong Chen <chenweilong@huawei.com>
---
 net/ipv4/tcp_input.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 4e4617e..9eb56dc 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -3986,7 +3986,8 @@ static void tcp_send_dupack(struct sock *sk, const struct sk_buff *skb)
 	if (TCP_SKB_CB(skb)->end_seq != TCP_SKB_CB(skb)->seq &&
 	    before(TCP_SKB_CB(skb)->seq, tp->rcv_nxt)) {
 		NET_INC_STATS_BH(sock_net(sk), LINUX_MIB_DELAYEDACKLOST);
-		tcp_enter_quickack_mode(sk);
+		if (sk->sk_state != TCP_SYN_RECV)
+			tcp_enter_quickack_mode(sk);
 
 		if (tcp_is_sack(tp) && sysctl_tcp_dsack) {
 			u32 end_seq = TCP_SKB_CB(skb)->end_seq;
-- 
1.7.12
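Editor's sketch of the failure described in the patch above. It is a userspace model, not the kernel code. It assumes the quick-ACK budget is derived as rcv_wnd / (2 * rcv_mss), which is the division that traps when rcv_mss is still 0. The struct, the function name and the cap of 16 are hypothetical stand-ins chosen for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed upper bound on scheduled quick ACKs (illustrative value). */
#define MODEL_MAX_QUICKACKS 16u

/* Hypothetical stand-in for the fields the calculation reads. */
struct model_sock {
	uint32_t rcv_wnd;	/* current receive window, in bytes */
	uint16_t rcv_mss;	/* estimated peer MSS; 0 until initialized */
	uint8_t quick;		/* quick ACKs currently scheduled */
};

/*
 * Model of the quick-ACK budget update. Returns false, leaving the
 * socket untouched, when rcv_mss is 0. An unguarded version would
 * divide by zero at exactly this point.
 */
static bool model_incr_quickack(struct model_sock *sk)
{
	uint32_t quickacks;

	if (sk->rcv_mss == 0)
		return false;

	quickacks = sk->rcv_wnd / (2u * sk->rcv_mss);
	if (quickacks == 0)
		quickacks = 2;
	if (quickacks > sk->quick)
		sk->quick = quickacks < MODEL_MAX_QUICKACKS ?
			    quickacks : MODEL_MAX_QUICKACKS;
	return true;
}
```

With rcv_wnd = 29200 and rcv_mss = 1460 the model schedules 10 quick ACKs. With rcv_mss = 0 it refuses to compute anything, which is the case the patch tries to avoid reaching.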
* Re: [PATCH] ipv4: avoid divide 0 error in tcp_incr_quickack
From: Eric Dumazet @ 2014-11-03  3:42 UTC
To: Chen Weilong
Cc: davem, kuznet, jmorris, yoshfuji, kaber, netdev, linux-kernel

On Mon, 2014-11-03 at 09:29 +0800, Chen Weilong wrote:
> From: Weilong Chen <chenweilong@huawei.com>
>
> We got a problem like this:
>
> A wrong ACK packet arrived during the TCP handshake. The socket's state
> was TCP_SYN_RECV and its rcv_mss was not initialized yet. So
> tcp_send_dupack -> tcp_enter_quickack_mode hit a divide-by-zero error.
> This patch adds a state check before tcp_enter_quickack_mode.
>
> Signed-off-by: Weilong Chen <chenweilong@huawei.com>
> ---
>  net/ipv4/tcp_input.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index 4e4617e..9eb56dc 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -3986,7 +3986,8 @@ static void tcp_send_dupack(struct sock *sk, const struct sk_buff *skb)
>  	if (TCP_SKB_CB(skb)->end_seq != TCP_SKB_CB(skb)->seq &&
>  	    before(TCP_SKB_CB(skb)->seq, tp->rcv_nxt)) {
>  		NET_INC_STATS_BH(sock_net(sk), LINUX_MIB_DELAYEDACKLOST);
> -		tcp_enter_quickack_mode(sk);
> +		if (sk->sk_state != TCP_SYN_RECV)
> +			tcp_enter_quickack_mode(sk);
>
>  		if (tcp_is_sack(tp) && sysctl_tcp_dsack) {
>  			u32 end_seq = TCP_SKB_CB(skb)->end_seq;

Sorry, I do not think this is the right fix.

We should not simply avoid the divide, but fix this issue by
understanding the missing steps.
* Re: [PATCH] ipv4: avoid divide 0 error in tcp_incr_quickack
From: chenweilong @ 2014-11-03  5:31 UTC
To: Eric Dumazet
Cc: davem, kuznet, jmorris, yoshfuji, kaber, netdev, linux-kernel

On 2014/11/3 11:42, Eric Dumazet wrote:
> On Mon, 2014-11-03 at 09:29 +0800, Chen Weilong wrote:
>> From: Weilong Chen <chenweilong@huawei.com>
>>
>> We got a problem like this:
>
>> A wrong ACK packet arrived during the TCP handshake. The socket's state
>> was TCP_SYN_RECV and its rcv_mss was not initialized yet. So
>> tcp_send_dupack -> tcp_enter_quickack_mode hit a divide-by-zero error.
>> This patch adds a state check before tcp_enter_quickack_mode.
>>
>> Signed-off-by: Weilong Chen <chenweilong@huawei.com>
>> ---
>>  net/ipv4/tcp_input.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
>> index 4e4617e..9eb56dc 100644
>> --- a/net/ipv4/tcp_input.c
>> +++ b/net/ipv4/tcp_input.c
>> @@ -3986,7 +3986,8 @@ static void tcp_send_dupack(struct sock *sk, const struct sk_buff *skb)
>>  	if (TCP_SKB_CB(skb)->end_seq != TCP_SKB_CB(skb)->seq &&
>>  	    before(TCP_SKB_CB(skb)->seq, tp->rcv_nxt)) {
>>  		NET_INC_STATS_BH(sock_net(sk), LINUX_MIB_DELAYEDACKLOST);
>> -		tcp_enter_quickack_mode(sk);
>> +		if (sk->sk_state != TCP_SYN_RECV)
>> +			tcp_enter_quickack_mode(sk);
>>
>>  		if (tcp_is_sack(tp) && sysctl_tcp_dsack) {
>>  			u32 end_seq = TCP_SKB_CB(skb)->end_seq;
>
> Sorry, I do not think this is the right fix.
>
> We should not simply avoid the divide, but fix this issue by
> understanding the missing steps.
>

Hi Eric,

I checked the code and found that:

1. In function "tcp_rcv_state_process",
   "tcp_initialize_rcv_mss" is called at "step 5: check the ACK field" when sk->sk_state is TCP_SYN_RECV,
   and there is a "tcp_validate_incoming" call just before it.
   So when we call "tcp_validate_incoming", rcv_mss may not have been initialized.

2. In function "tcp_validate_incoming",
   at "Step 1: check sequence number", according to RFC793 page 69,
   if an incoming segment is not acceptable, an acknowledgment should be sent in reply (unless the RST
   bit is set, if so drop the segment and return).
   So we may call "tcp_send_dupack" while rcv_mss hasn't been initialized.

3. In function "tcp_send_dupack",
   when the condition is met, it enters quick ack mode. Notice it only checks the seq!
   So I think adding another state check should be OK.

Any suggestions?

Thanks,
Weilong
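Editor's sketch of point 3 above: the condition under which tcp_send_dupack enters quick-ACK mode, and the extra state check the patch adds. This is a userspace model under stated assumptions. seq_before mirrors the usual wraparound-safe comparison of 32-bit sequence numbers; the enum and the other two function names are hypothetical and exist only for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical subset of connection states, enough for the example. */
enum model_state { MODEL_SYN_RECV, MODEL_ESTABLISHED };

/* Wraparound-safe "seq1 comes before seq2" on 32-bit sequence numbers. */
static bool seq_before(uint32_t seq1, uint32_t seq2)
{
	return (int32_t)(seq1 - seq2) < 0;
}

/*
 * Condition from the quoted hunk: the segment carries data
 * (end_seq != seq) and starts before rcv_nxt. Only sequence numbers
 * are examined; the socket state is not consulted here.
 */
static bool dupack_wants_quickack(uint32_t seq, uint32_t end_seq,
				  uint32_t rcv_nxt)
{
	return end_seq != seq && seq_before(seq, rcv_nxt);
}

/* Same condition with the state check proposed in the patch added. */
static bool patched_wants_quickack(enum model_state state, uint32_t seq,
				   uint32_t end_seq, uint32_t rcv_nxt)
{
	return dupack_wants_quickack(seq, end_seq, rcv_nxt) &&
	       state != MODEL_SYN_RECV;
}
```

The model only shows what the proposed guard does: a segment that satisfies the sequence test no longer triggers quick-ACK mode while the socket is in SYN_RECV. It does not say whether rcv_mss should already be non-zero at that point.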
* Re: [PATCH] ipv4: avoid divide 0 error in tcp_incr_quickack
From: Eric Dumazet @ 2014-11-03 15:30 UTC
To: chenweilong, Yuchung Cheng, Neal Cardwell
Cc: davem, kuznet, jmorris, yoshfuji, kaber, netdev, linux-kernel

On Mon, 2014-11-03 at 13:31 +0800, chenweilong wrote:
> Hi Eric,
>
> I checked the code and found that:
>
> 1. In function "tcp_rcv_state_process",
>    "tcp_initialize_rcv_mss" is called at "step 5: check the ACK field" when sk->sk_state is TCP_SYN_RECV,
>    and there is a "tcp_validate_incoming" call just before it.
>    So when we call "tcp_validate_incoming", rcv_mss may not have been initialized.
>
> 2. In function "tcp_validate_incoming",
>    at "Step 1: check sequence number", according to RFC793 page 69,
>    if an incoming segment is not acceptable, an acknowledgment should be sent in reply (unless the RST
>    bit is set, if so drop the segment and return).
>    So we may call "tcp_send_dupack" while rcv_mss hasn't been initialized.
>
> 3. In function "tcp_send_dupack",
>    when the condition is met, it enters quick ack mode. Notice it only checks the seq!
>    So I think adding another state check should be OK.
>
> Any suggestions?
>

You did find the immediate conditions for the crash
(rcv_mss = 0, state = TCP_SYN_RECV).

Your patch avoids the zero divide, but leaves other issues.

rcv_mss = 0 here is a sign that some logic is wrong in the stack.

Given that this potential zero divide has been there for years, I believe
we should take the time for a more complete fix, instead of papering
over the immediate problem.

We have been working with Neal to reproduce the issue with packetdrill;
we'll post our results when we manage to get our first crash ;)

Thanks!
[parent not found: <1414767047-8972-1-git-send-email-chenweilong@huawei.com>]
* Re: [PATCH] ipv4: avoid divide 0 error in tcp_incr_quickack
  [not found] <1414767047-8972-1-git-send-email-chenweilong@huawei.com>
From: Alexei Starovoitov @ 2014-10-31 16:24 UTC
To: Chen Weilong, Eric Dumazet, netdev@vger.kernel.org
Cc: David S. Miller, Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI, Patrick McHardy, linux-kernel@vger.kernel.org

cc-ing netdev

On Fri, Oct 31, 2014 at 7:50 AM, Chen Weilong <chenweilong@huawei.com> wrote:
> From: Weilong Chen <chenweilong@huawei.com>
>
> We got a problem like this:
> [ffff8801c1a05570] machine_kexec at ffffffff81025039
> [ffff8801c1a055d0] crash_kexec at ffffffff8109b253
> [ffff8801c1a056a0] oops_end at ffffffff81442aed
> [ffff8801c1a056d0] die at ffffffff81005603
> [ffff8801c1a05700] do_trap at ffffffff81442448
> [ffff8801c1a05760] do_divide_error at ffffffff81002c10
> [ffff8801c1a05888] tcp_send_dupack at ffffffff81385e44
> [ffff8801c1a058c8] tcp_validate_incoming at ffffffff813886b5
> [ffff8801c1a05908] tcp_rcv_state_process at ffffffff8138d0b7
> [ffff8801c1a05958] tcp_child_process at ffffffff81397255
> [ffff8801c1a05988] tcp_v4_do_rcv at ffffffff81395a70
> [ffff8801c1a059d8] tcp_v4_rcv at ffffffff81396fc8
> [ffff8801c1a05a48] ip_local_deliver_finish at ffffffff813746e9
> [ffff8801c1a05a78] ip_local_deliver at ffffffff81374a20
> [ffff8801c1a05aa8] ip_rcv_finish at ffffffff81374389
> [ffff8801c1a05ad8] ip_rcv at ffffffff81374c78
> A wrong ACK packet arrived during the TCP handshake. The socket's state
> was TCP_SYN_RECV and its rcv_mss was not initialized yet. So
> tcp_send_dupack -> tcp_enter_quickack_mode hit a divide-by-zero error.
> This patch adds a state check before tcp_enter_quickack_mode.

Ouch. Is it remotely exploitable?

> Signed-off-by: Weilong Chen <chenweilong@huawei.com>
> ---
>  net/ipv4/tcp_input.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index 4e4617e..9eb56dc 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -3986,7 +3986,8 @@ static void tcp_send_dupack(struct sock *sk, const struct sk_buff *skb)
>  	if (TCP_SKB_CB(skb)->end_seq != TCP_SKB_CB(skb)->seq &&
>  	    before(TCP_SKB_CB(skb)->seq, tp->rcv_nxt)) {
>  		NET_INC_STATS_BH(sock_net(sk), LINUX_MIB_DELAYEDACKLOST);
> -		tcp_enter_quickack_mode(sk);
> +		if (sk->sk_state != TCP_SYN_RECV)
> +			tcp_enter_quickack_mode(sk);
>
>  		if (tcp_is_sack(tp) && sysctl_tcp_dsack) {
>  			u32 end_seq = TCP_SKB_CB(skb)->end_seq;
> --
> 1.7.12
* Re: [PATCH] ipv4: avoid divide 0 error in tcp_incr_quickack
From: Eric Dumazet @ 2014-10-31 17:40 UTC
To: Alexei Starovoitov
Cc: Chen Weilong, netdev@vger.kernel.org, David S. Miller, Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI, Patrick McHardy, linux-kernel@vger.kernel.org

On Fri, 2014-10-31 at 09:24 -0700, Alexei Starovoitov wrote:
> cc-ing netdev
>
> On Fri, Oct 31, 2014 at 7:50 AM, Chen Weilong <chenweilong@huawei.com> wrote:
> > From: Weilong Chen <chenweilong@huawei.com>
> >
> > We got a problem like this:
> > [ffff8801c1a05570] machine_kexec at ffffffff81025039
> > [ffff8801c1a055d0] crash_kexec at ffffffff8109b253
> > [ffff8801c1a056a0] oops_end at ffffffff81442aed
> > [ffff8801c1a056d0] die at ffffffff81005603
> > [ffff8801c1a05700] do_trap at ffffffff81442448
> > [ffff8801c1a05760] do_divide_error at ffffffff81002c10
> > [ffff8801c1a05888] tcp_send_dupack at ffffffff81385e44
> > [ffff8801c1a058c8] tcp_validate_incoming at ffffffff813886b5
> > [ffff8801c1a05908] tcp_rcv_state_process at ffffffff8138d0b7
> > [ffff8801c1a05958] tcp_child_process at ffffffff81397255
> > [ffff8801c1a05988] tcp_v4_do_rcv at ffffffff81395a70
> > [ffff8801c1a059d8] tcp_v4_rcv at ffffffff81396fc8
> > [ffff8801c1a05a48] ip_local_deliver_finish at ffffffff813746e9
> > [ffff8801c1a05a78] ip_local_deliver at ffffffff81374a20
> > [ffff8801c1a05aa8] ip_rcv_finish at ffffffff81374389
> > [ffff8801c1a05ad8] ip_rcv at ffffffff81374c78
> > A wrong ACK packet arrived during the TCP handshake. The socket's state
> > was TCP_SYN_RECV and its rcv_mss was not initialized yet. So
> > tcp_send_dupack -> tcp_enter_quickack_mode hit a divide-by-zero error.
> > This patch adds a state check before tcp_enter_quickack_mode.
>
> Ouch. Is it remotely exploitable?

Seems to be SYN crossing.

Quite hard, but possible.