* [PATCH] net: sched: tbf: fix an oops when a GSO packet cannot be enqueued @ 2013-11-22 9:27 Yang Yingliang 2013-11-22 14:50 ` Eric Dumazet 0 siblings, 1 reply; 6+ messages in thread From: Yang Yingliang @ 2013-11-22 9:27 UTC (permalink / raw) To: davem, netdev; +Cc: eric.dumazet, ben, jpirko, jhs It seems commit e43ac79a4b("sch_tbf: segment too big GSO packets") introduce this oops. When GSO mode is on, use the following command: tc qdisc add dev eth0 root handle 1: tbf latency 50ms burst 1KB rate 50mbit mtu 1KB iperf -c host -t 30 Oops happened: [16884.119862] BUG: unable to handle kernel NULL pointer dereference at 00000000000000c8 [16884.212742] IP: [<ffffffff813dd3e1>] tcp_gso_segment+0x251/0x3a0 [16884.283951] PGD 1f14258067 PUD 1f34e7e067 PMD 0 [16884.338656] Oops: 0000 [#1] SMP [16884.376851] Modules linked in: sch_tbf(O) tun ip6table_filter ip6_tables iptable_filter ip_tables x_tables edd af_packet bridge stp llc microcode fuse loop dm_mod igb dca i2c_algo_bit ptp ipv6 kvm_intel kvm i2c_i801 sg i2c_core pps_core ehci_pci mptctl pcspkr button iTCO_wdt iTCO_vendor_support lpc_ich mfd_core hid_generic serio_raw ext3 jbd mbcache usbhid hid uhci_hcd ehci_hcd usbcore usb_common sd_mod crc_t10dif crct10dif_common processor thermal_sys hwmon scsi_dh_hp_sw scsi_dh_emc scsi_dh_alua scsi_dh_rdac scsi_dh ata_generic ata_piix libata mptsas mptscsih mptbase scsi_transport_sas scsi_mod [last unloaded: sch_tbf] [16885.029346] CPU: 15 PID: 546 Comm: iperf Tainted: G O 3.12.0-rc6-0.7-default+ #7 [16885.127372] Hardware name: Huawei Technologies Co., Ltd. Tecal XH620 /BC21THSA , BIOS TTSAV020 12/02/2011 [16885.264597] task: ffff881fa8940240 ti: ffff881fbd64e000 task.ti: ffff881fbd64e000 [16885.353331] RIP: 0010:[<ffffffff813dd3e1>] [<ffffffff813dd3e1>] tcp_gso_segment+0x251/0x3a0 [16885.453424] RSP: 0018:ffff881fbd64f738 EFLAGS: 00010286 [16885.516364] RAX: 000000000ea96820 RBX: 0000000000000000 RCX: 0000000020368706 [16885.600973] RDX: 000000001b1b0000 RSI: 00000000000002c0 RDI: ffff881fa8bfbb70 [16885.685584] RBP: ffff881fbd64f778 R08: 000000000000e93f R09: ffff881fbcaac400 [16885.770191] R10: 0000000000000004 R11: ffff881fbc93e840 R12: ffff881fbcaac510 [16885.854799] R13: ffff881fa8bfbb70 R14: 00000000000005a8 R15: ffff881fa8bfbd70 [16885.939409] FS: 00007f255987e700(0000) GS:ffff88203f3e0000(0000) knlGS:0000000000000000 [16886.035364] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [16886.103466] CR2: 00000000000000c8 CR3: 0000001f13c07000 CR4: 00000000000027e0 [16886.188075] Stack: [16886.211809] ffff881fbd64f788 00000020813dd2d5 2068a90e0164230a 0000000000000006 [16886.299540] ffff881fa8bfbb70 ffff881fa8bfbb98 0000000000000014 0000000390004bb3 [16886.387268] ffff881fbd64f7d8 ffffffff813ed38d 06ff881fbd64f7e8 0000000000000046 [16886.475000] Call Trace: [16886.503900] [<ffffffff813ed38d>] inet_gso_segment+0x14d/0x310 [16886.573036] [<ffffffff8107093b>] ? try_to_wake_up+0x1eb/0x290 [16886.642187] [<ffffffff81382c4d>] skb_mac_gso_segment+0xad/0x170 [16886.713392] [<ffffffff81382d79>] __skb_gso_segment+0x69/0xe0 [16886.781499] [<ffffffffa0038763>] tbf_enqueue+0x63/0x1e0 [sch_tbf] [16886.854765] [<ffffffff81383466>] dev_queue_xmit+0x186/0x440 [16886.921844] [<ffffffffa08a13d0>] ? br_forward+0x60/0x60 [bridge] [16886.994078] [<ffffffffa08a145b>] br_dev_queue_push_xmit+0x8b/0xd0 [bridge] [16887.076622] [<ffffffffa08a14f8>] br_forward_finish+0x58/0x60 [bridge] [16887.154010] [<ffffffffa08a1578>] __br_deliver+0x78/0x100 [bridge] [16887.227275] [<ffffffffa08a163d>] br_deliver+0x3d/0x50 [bridge] [16887.297439] [<ffffffffa089f8ae>] br_dev_xmit+0xce/0x230 [bridge] [16887.369668] [<ffffffff813830bf>] dev_hard_start_xmit+0x2cf/0x4f0 [16887.441901] [<ffffffff81383697>] dev_queue_xmit+0x3b7/0x440 [16887.508970] [<ffffffff8138c4e1>] neigh_resolve_output+0x131/0x200 [16887.582230] [<ffffffff813bb31d>] ip_finish_output+0x23d/0x440 [16887.651368] [<ffffffff813bb5a8>] ip_output+0x88/0x90 [16887.711219] [<ffffffff813b9f34>] ip_local_out+0x24/0x30 [16887.774168] [<ffffffff813ba3dd>] ip_queue_xmit+0x14d/0x3e0 [16887.840208] [<ffffffff813d1661>] tcp_transmit_skb+0x501/0x840 [16887.909346] [<ffffffff813d29b3>] tcp_write_xmit+0x1e3/0xb20 [16887.976415] [<ffffffff813d335d>] __tcp_push_pending_frames+0x2d/0x70 [16888.052770] [<ffffffff813c6034>] tcp_sendmsg+0xc74/0xc90 [16888.116751] [<ffffffff813ec3f1>] inet_sendmsg+0x61/0xc0 [16888.179702] [<ffffffff81368981>] sock_aio_write+0x101/0x120 [16888.246776] [<ffffffff8121d530>] ? timerqueue_add+0x60/0xb0 [16888.313851] [<ffffffff81134720>] do_sync_write+0x60/0x90 [16888.377827] [<ffffffff81134904>] ? rw_verify_area+0x54/0xf0 [16888.444897] [<ffffffff81134b26>] vfs_write+0x186/0x190 [16888.506808] [<ffffffff811353bd>] SyS_write+0x5d/0xa0 [16888.566657] [<ffffffff8143e0e2>] system_call_fastpath+0x16/0x1b [16888.637857] Code: 8b 85 80 00 00 00 48 89 83 80 00 00 00 49 8b 45 18 44 89 b3 e8 00 00 00 48 89 43 18 45 29 b5 e8 00 00 00 48 8b 1b 8b 45 d4 0f c8 <44> 0f b7 a3 c8 00 00 00 4c 03 a3 d8 00 00 00 41 89 44 24 04 41 [16888.862165] RIP [<ffffffff813dd3e1>] tcp_gso_segment+0x251/0x3a0 -----------------------cut here-------------------------------------------- If max_size is too small, a GSO segment cannot be enqueued, we need to free the segment. Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Ben Hutchings <ben@decadent.org.uk> Cc: Jiri Pirko <jiri@resnulli.us> Cc: Jamal Hadi Salim <jhs@mojatatu.com> --- net/sched/sch_tbf.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/sched/sch_tbf.c b/net/sched/sch_tbf.c index 68f9859..f246589 100644 --- a/net/sched/sch_tbf.c +++ b/net/sched/sch_tbf.c @@ -140,7 +140,7 @@ static int tbf_segment(struct sk_buff *skb, struct Qdisc *sch) qdisc_skb_cb(segs)->pkt_len = segs->len; ret = qdisc_enqueue(segs, q->qdisc); } else { - ret = qdisc_reshape_fail(skb, sch); + ret = qdisc_reshape_fail(segs, q->qdisc); } if (ret != NET_XMIT_SUCCESS) { if (net_xmit_drop_count(ret)) -- 1.7.12 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH] net: sched: tbf: fix an oops when a GSO packet cannot be enqueued 2013-11-22 9:27 [PATCH] net: sched: tbf: fix an oops when a GSO packet cannot be enqueued Yang Yingliang @ 2013-11-22 14:50 ` Eric Dumazet 2013-11-23 8:08 ` Yang Yingliang 0 siblings, 1 reply; 6+ messages in thread From: Eric Dumazet @ 2013-11-22 14:50 UTC (permalink / raw) To: Yang Yingliang; +Cc: davem, netdev, ben, jpirko, jhs On Fri, 2013-11-22 at 17:27 +0800, Yang Yingliang wrote: > It seems commit e43ac79a4b("sch_tbf: segment too big GSO packets") > introduce this oops. > > When GSO mode is on, use the following command: > tc qdisc add dev eth0 root handle 1: tbf latency 50ms burst 1KB rate 50mbit mtu 1KB > iperf -c host -t 30 I think your patch is not needed, the stack trace point to a different spot. The issue should be fixed by : http://git.kernel.org/cgit/linux/kernel/git/davem/net.git/commit/?id=9d8506cc2d7ea1f911c72c100193a3677f6668c3 Could you check again ? Thanks ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] net: sched: tbf: fix an oops when a GSO packet cannot be enqueued 2013-11-22 14:50 ` Eric Dumazet @ 2013-11-23 8:08 ` Yang Yingliang 2013-11-23 17:56 ` Eric Dumazet 0 siblings, 1 reply; 6+ messages in thread From: Yang Yingliang @ 2013-11-23 8:08 UTC (permalink / raw) To: Eric Dumazet; +Cc: davem, netdev, ben, jpirko, jhs On 2013/11/22 22:50, Eric Dumazet wrote: > On Fri, 2013-11-22 at 17:27 +0800, Yang Yingliang wrote: >> It seems commit e43ac79a4b("sch_tbf: segment too big GSO packets") >> introduce this oops. >> >> When GSO mode is on, use the following command: >> tc qdisc add dev eth0 root handle 1: tbf latency 50ms burst 1KB rate 50mbit mtu 1KB >> iperf -c host -t 30 > > I think your patch is not needed, the stack trace point to a different > spot. > > The issue should be fixed by : > > http://git.kernel.org/cgit/linux/kernel/git/davem/net.git/commit/?id=9d8506cc2d7ea1f911c72c100193a3677f6668c3 > > Could you check again ? > > Thanks I've applied this patch, but it's oops too. I think it's not the same problem. In this problem, when the segs is not enqueued, we do nothing to the segs.I think it causes the problem. If I am wrong, please point out. Thanks! Here is the call trace after applying the above patch: BTW, the call trace is not always same no matter the above patch is applied. [ 355.298416] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 [ 355.391292] IP: [<ffffffff81377f99>] __skb_recv_datagram+0x259/0x500 [ 355.466628] PGD 20312fc067 PUD 2030ed9067 PMD 0 [ 355.521336] Oops: 0002 [#1] SMP [ 355.559528] Modules linked in: sch_tbf(O) tun ip6table_filter ip6_tables iptable_filter ip_tables x_tables edd af_packet bridge stp llc microcode fuse loop dm_mod igb dca i2c_algo_bit i2c_i801 ptp serio_raw sg ehci_pci ipv6 kvm_intel i2c_core kvm iTCO_wdt iTCO_vendor_support lpc_ich mfd_core hid_generic mptctl pps_core pcspkr button ext3 jbd mbcache usbhid hid uhci_hcd ehci_hcd usbcore sd_mod usb_common crc_t10dif crct10dif_common processor thermal_sys hwmon scsi_dh_hp_sw scsi_dh_emc scsi_dh_alua scsi_dh_rdac scsi_dh ata_generic ata_piix libata mptsas mptscsih mptbase scsi_transport_sas scsi_mod [ 356.186224] CPU: 9 PID: 10073 Comm: acc-snf Tainted: G O 3.12.0-rc6-0.7-default+ #8 [ 356.287345] Hardware name: Huawei Technologies Co., Ltd. Tecal XH620 /BC21THSA , BIOS TTSAV020 12/02/2011 [ 356.424571] task: ffff881fbd4e0380 ti: ffff881fbf028000 task.ti: ffff881fbf028000 [ 356.513306] RIP: 0010:[<ffffffff81377f99>] [<ffffffff81377f99>] __skb_recv_datagram+0x259/0x500 [ 356.617524] RSP: 0018:ffff881fbf029b68 EFLAGS: 00210046 [ 356.680466] RAX: 0000000000000000 RBX: ffff881fbd0e1b70 RCX: 0000000000200296 [ 356.765074] RDX: 0000000000000000 RSI: 0000000000200296 RDI: ffff881fbbc6f0a4 [ 356.849683] RBP: ffff881fbf029c48 R08: ffff881fbf029cb4 R09: ffff88203eb24840 [ 356.934291] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 [ 357.018900] R13: ffff881fbbc6f000 R14: ffff881fbbc6f090 R15: ffff881fbf029c64 [ 357.103508] FS: 0000000000000000(0000) GS:ffff88203f320000(0063) knlGS:00000000f7320b70 [ 357.199465] CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033 [ 357.267566] CR2: 0000000000000008 CR3: 0000001fbb973000 CR4: 00000000000027e0 [ 357.352175] Stack: [ 357.375911] 0000000000200286 ffff881fbf029c00 ffff881fbf028010 ffff881fbd4e0380 [ 357.463640] ffff881fbf028010 ffff881fbbc6f1b8 ffff881fbf029cb4 ffff881fbf029c60 [ 357.551369] ffff881f80000020 7fffffffffffffff ffff881fbbc6f0a4 ffff881fbbc6f0cc [ 357.639099] Call Trace: [ 357.667998] [<ffffffff81371afa>] ? skb_release_data+0xaa/0xe0 [ 357.737135] [<ffffffff81378272>] skb_recv_datagram+0x32/0x40 [ 357.805237] [<ffffffffa08d7096>] packet_recvmsg+0x66/0x360 [af_packet] [ 357.883662] [<ffffffff81369cdc>] sock_recvmsg+0xac/0xe0 [ 357.946613] [<ffffffff8117f80d>] ? compat_core_sys_select+0x24d/0x260 [ 358.024001] [<ffffffff81369dce>] SyS_recvfrom+0xbe/0x120 [ 358.087976] [<ffffffff8136b898>] ? compat_sock_ioctl+0x88/0xb0 [ 358.158145] [<ffffffff811810ec>] ? compat_sys_ioctl+0x7c/0x1380 [ 358.229344] [<ffffffff81096d93>] ? ktime_get_ts+0x53/0x100 [ 358.295387] [<ffffffff8139e620>] compat_sys_socketcall+0x150/0x240 [ 358.369685] [<ffffffff8143f595>] sysenter_dispatch+0x7/0x1a [ 358.436756] Code: 00 48 89 8d 68 ff ff ff e9 38 fe ff ff 41 83 ad a0 00 00 00 01 48 8b 13 48 8b 43 08 48 c7 03 00 00 00 00 48 c7 43 08 00 00 00 00 <48> 89 42 08 48 89 10 48 8b bd 70 ff ff ff e8 04 f9 0b 00 48 8b [ 358.661068] RIP [<ffffffff81377f99>] __skb_recv_datagram+0x259/0x500 ------------------------------cut here------------------------------------- Regards, Yang > > > > ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] net: sched: tbf: fix an oops when a GSO packet cannot be enqueued 2013-11-23 8:08 ` Yang Yingliang @ 2013-11-23 17:56 ` Eric Dumazet 2013-11-23 20:59 ` [PATCH] sch_tbf: handle too small burst Eric Dumazet 0 siblings, 1 reply; 6+ messages in thread From: Eric Dumazet @ 2013-11-23 17:56 UTC (permalink / raw) To: Yang Yingliang; +Cc: davem, netdev, ben, jpirko, jhs On Sat, 2013-11-23 at 16:08 +0800, Yang Yingliang wrote: > > I've applied this patch, but it's oops too. > I think it's not the same problem. In this > problem, when the segs is not enqueued, we do > nothing to the segs.I think it causes the > problem. If I am wrong, please point out. > Thanks! > > Here is the call trace after applying the above patch: > BTW, the call trace is not always same no matter the > above patch is applied. > Oh right. Your patch on the right way, but changelog very confusing. BTW, setting a mtu of 1KB on TBF never worked if your ethernet device has MTU of 1500. I think that we should emit a warning if TBF mtu is too low. Ideally we should not even call skb_gso_segment() at all. I'll post an updated version of the patch with a more detailed changelog and explanations. Thanks ! ^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH] sch_tbf: handle too small burst 2013-11-23 17:56 ` Eric Dumazet @ 2013-11-23 20:59 ` Eric Dumazet 2013-11-23 22:49 ` David Miller 0 siblings, 1 reply; 6+ messages in thread From: Eric Dumazet @ 2013-11-23 20:59 UTC (permalink / raw) To: Yang Yingliang, David Miller; +Cc: netdev, ben, jpirko, jhs From: Eric Dumazet <edumazet@google.com> If a too small burst is inadvertently set on TBF, we might trigger a bug in tbf_segment(), as 'skb' instead of 'segs' was used in a qdisc_reshape_fail() call. tc qdisc add dev eth0 root handle 1: tbf latency 50ms burst 1KB rate 50mbit Fix the bug, and add a warning, as such configuration is not going to work anyway for non GSO packets. (For some reason, one has to use a burst >= 1520 to get a working configuration, even with old kernels. This is a probable iproute2/tc bug) Based on a report and initial patch from Yang Yingliang Fixes: e43ac79a4bc6 ("sch_tbf: segment too big GSO packets") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Yang Yingliang <yangyingliang@huawei.com> --- Note : I prepared a self contained patch for the ease of stable submissions. We might later move skb_gso_seglen() in an include file. net/sched/sch_tbf.c | 32 +++++++++++++++++++++++++------- 1 file changed, 25 insertions(+), 7 deletions(-) diff --git a/net/sched/sch_tbf.c b/net/sched/sch_tbf.c index 68f9859..a609005 100644 --- a/net/sched/sch_tbf.c +++ b/net/sched/sch_tbf.c @@ -21,6 +21,7 @@ #include <net/netlink.h> #include <net/sch_generic.h> #include <net/pkt_sched.h> +#include <net/tcp.h> /* Simple Token Bucket Filter. @@ -117,6 +118,22 @@ struct tbf_sched_data { }; +/* + * Return length of individual segments of a gso packet, + * including all headers (MAC, IP, TCP/UDP) + */ +static unsigned int skb_gso_seglen(const struct sk_buff *skb) +{ + unsigned int hdr_len = skb_transport_header(skb) - skb_mac_header(skb); + const struct skb_shared_info *shinfo = skb_shinfo(skb); + + if (likely(shinfo->gso_type & (SKB_GSO_TCPV4 | SKB_GSO_TCPV6))) + hdr_len += tcp_hdrlen(skb); + else + hdr_len += sizeof(struct udphdr); + return hdr_len + shinfo->gso_size; +} + /* GSO packet is too big, segment it so that tbf can transmit * each segment in time */ @@ -136,12 +153,8 @@ static int tbf_segment(struct sk_buff *skb, struct Qdisc *sch) while (segs) { nskb = segs->next; segs->next = NULL; - if (likely(segs->len <= q->max_size)) { - qdisc_skb_cb(segs)->pkt_len = segs->len; - ret = qdisc_enqueue(segs, q->qdisc); - } else { - ret = qdisc_reshape_fail(skb, sch); - } + qdisc_skb_cb(segs)->pkt_len = segs->len; + ret = qdisc_enqueue(segs, q->qdisc); if (ret != NET_XMIT_SUCCESS) { if (net_xmit_drop_count(ret)) sch->qstats.drops++; @@ -163,7 +176,7 @@ static int tbf_enqueue(struct sk_buff *skb, struct Qdisc *sch) int ret; if (qdisc_pkt_len(skb) > q->max_size) { - if (skb_is_gso(skb)) + if (skb_is_gso(skb) && skb_gso_seglen(skb) <= q->max_size) return tbf_segment(skb, sch); return qdisc_reshape_fail(skb, sch); } @@ -319,6 +332,11 @@ static int tbf_change(struct Qdisc *sch, struct nlattr *opt) if (max_size < 0) goto done; + if (max_size < psched_mtu(qdisc_dev(sch))) + pr_warn_ratelimited("sch_tbf: burst %u is lower than device %s mtu (%u) !\n", + max_size, qdisc_dev(sch)->name, + psched_mtu(qdisc_dev(sch))); + if (q->qdisc != &noop_qdisc) { err = fifo_set_limit(q->qdisc, qopt->limit); if (err) ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH] sch_tbf: handle too small burst 2013-11-23 20:59 ` [PATCH] sch_tbf: handle too small burst Eric Dumazet @ 2013-11-23 22:49 ` David Miller 0 siblings, 0 replies; 6+ messages in thread From: David Miller @ 2013-11-23 22:49 UTC (permalink / raw) To: eric.dumazet; +Cc: yangyingliang, netdev, ben, jpirko, jhs From: Eric Dumazet <eric.dumazet@gmail.com> Date: Sat, 23 Nov 2013 12:59:20 -0800 > From: Eric Dumazet <edumazet@google.com> > > If a too small burst is inadvertently set on TBF, we might trigger > a bug in tbf_segment(), as 'skb' instead of 'segs' was used in a > qdisc_reshape_fail() call. > > tc qdisc add dev eth0 root handle 1: tbf latency 50ms burst 1KB rate > 50mbit > > Fix the bug, and add a warning, as such configuration is not > going to work anyway for non GSO packets. > > (For some reason, one has to use a burst >= 1520 to get a working > configuration, even with old kernels. This is a probable iproute2/tc > bug) > > Based on a report and initial patch from Yang Yingliang > > Fixes: e43ac79a4bc6 ("sch_tbf: segment too big GSO packets") > Signed-off-by: Eric Dumazet <edumazet@google.com> > Reported-by: Yang Yingliang <yangyingliang@huawei.com> Applied and queued up for -stable. ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2013-11-23 22:41 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2013-11-22 9:27 [PATCH] net: sched: tbf: fix an oops when a GSO packet cannot be enqueued Yang Yingliang 2013-11-22 14:50 ` Eric Dumazet 2013-11-23 8:08 ` Yang Yingliang 2013-11-23 17:56 ` Eric Dumazet 2013-11-23 20:59 ` [PATCH] sch_tbf: handle too small burst Eric Dumazet 2013-11-23 22:49 ` David Miller
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).