* [PATCH] net: sched: tbf: fix an oops when a GSO packet cannot be enqueued
@ 2013-11-22 9:27 Yang Yingliang
2013-11-22 14:50 ` Eric Dumazet
0 siblings, 1 reply; 6+ messages in thread
From: Yang Yingliang @ 2013-11-22 9:27 UTC (permalink / raw)
To: davem, netdev; +Cc: eric.dumazet, ben, jpirko, jhs
It seems commit e43ac79a4b("sch_tbf: segment too big GSO packets")
introduce this oops.
When GSO mode is on, use the following command:
tc qdisc add dev eth0 root handle 1: tbf latency 50ms burst 1KB rate 50mbit mtu 1KB
iperf -c host -t 30
Oops happened:
[16884.119862] BUG: unable to handle kernel NULL pointer dereference at 00000000000000c8
[16884.212742] IP: [<ffffffff813dd3e1>] tcp_gso_segment+0x251/0x3a0
[16884.283951] PGD 1f14258067 PUD 1f34e7e067 PMD 0
[16884.338656] Oops: 0000 [#1] SMP
[16884.376851] Modules linked in: sch_tbf(O) tun ip6table_filter ip6_tables iptable_filter ip_tables x_tables edd af_packet bridge stp llc microcode fuse loop dm_mod igb dca i2c_algo_bit ptp ipv6 kvm_intel kvm i2c_i801 sg i2c_core pps_core ehci_pci mptctl pcspkr button iTCO_wdt iTCO_vendor_support lpc_ich mfd_core hid_generic serio_raw ext3 jbd mbcache usbhid hid uhci_hcd ehci_hcd usbcore usb_common sd_mod crc_t10dif crct10dif_common processor thermal_sys hwmon scsi_dh_hp_sw scsi_dh_emc scsi_dh_alua scsi_dh_rdac scsi_dh ata_generic ata_piix libata mptsas mptscsih mptbase scsi_transport_sas scsi_mod [last unloaded: sch_tbf]
[16885.029346] CPU: 15 PID: 546 Comm: iperf Tainted: G O 3.12.0-rc6-0.7-default+ #7
[16885.127372] Hardware name: Huawei Technologies Co., Ltd. Tecal XH620 /BC21THSA , BIOS TTSAV020 12/02/2011
[16885.264597] task: ffff881fa8940240 ti: ffff881fbd64e000 task.ti: ffff881fbd64e000
[16885.353331] RIP: 0010:[<ffffffff813dd3e1>] [<ffffffff813dd3e1>] tcp_gso_segment+0x251/0x3a0
[16885.453424] RSP: 0018:ffff881fbd64f738 EFLAGS: 00010286
[16885.516364] RAX: 000000000ea96820 RBX: 0000000000000000 RCX: 0000000020368706
[16885.600973] RDX: 000000001b1b0000 RSI: 00000000000002c0 RDI: ffff881fa8bfbb70
[16885.685584] RBP: ffff881fbd64f778 R08: 000000000000e93f R09: ffff881fbcaac400
[16885.770191] R10: 0000000000000004 R11: ffff881fbc93e840 R12: ffff881fbcaac510
[16885.854799] R13: ffff881fa8bfbb70 R14: 00000000000005a8 R15: ffff881fa8bfbd70
[16885.939409] FS: 00007f255987e700(0000) GS:ffff88203f3e0000(0000) knlGS:0000000000000000
[16886.035364] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[16886.103466] CR2: 00000000000000c8 CR3: 0000001f13c07000 CR4: 00000000000027e0
[16886.188075] Stack:
[16886.211809] ffff881fbd64f788 00000020813dd2d5 2068a90e0164230a 0000000000000006
[16886.299540] ffff881fa8bfbb70 ffff881fa8bfbb98 0000000000000014 0000000390004bb3
[16886.387268] ffff881fbd64f7d8 ffffffff813ed38d 06ff881fbd64f7e8 0000000000000046
[16886.475000] Call Trace:
[16886.503900] [<ffffffff813ed38d>] inet_gso_segment+0x14d/0x310
[16886.573036] [<ffffffff8107093b>] ? try_to_wake_up+0x1eb/0x290
[16886.642187] [<ffffffff81382c4d>] skb_mac_gso_segment+0xad/0x170
[16886.713392] [<ffffffff81382d79>] __skb_gso_segment+0x69/0xe0
[16886.781499] [<ffffffffa0038763>] tbf_enqueue+0x63/0x1e0 [sch_tbf]
[16886.854765] [<ffffffff81383466>] dev_queue_xmit+0x186/0x440
[16886.921844] [<ffffffffa08a13d0>] ? br_forward+0x60/0x60 [bridge]
[16886.994078] [<ffffffffa08a145b>] br_dev_queue_push_xmit+0x8b/0xd0 [bridge]
[16887.076622] [<ffffffffa08a14f8>] br_forward_finish+0x58/0x60 [bridge]
[16887.154010] [<ffffffffa08a1578>] __br_deliver+0x78/0x100 [bridge]
[16887.227275] [<ffffffffa08a163d>] br_deliver+0x3d/0x50 [bridge]
[16887.297439] [<ffffffffa089f8ae>] br_dev_xmit+0xce/0x230 [bridge]
[16887.369668] [<ffffffff813830bf>] dev_hard_start_xmit+0x2cf/0x4f0
[16887.441901] [<ffffffff81383697>] dev_queue_xmit+0x3b7/0x440
[16887.508970] [<ffffffff8138c4e1>] neigh_resolve_output+0x131/0x200
[16887.582230] [<ffffffff813bb31d>] ip_finish_output+0x23d/0x440
[16887.651368] [<ffffffff813bb5a8>] ip_output+0x88/0x90
[16887.711219] [<ffffffff813b9f34>] ip_local_out+0x24/0x30
[16887.774168] [<ffffffff813ba3dd>] ip_queue_xmit+0x14d/0x3e0
[16887.840208] [<ffffffff813d1661>] tcp_transmit_skb+0x501/0x840
[16887.909346] [<ffffffff813d29b3>] tcp_write_xmit+0x1e3/0xb20
[16887.976415] [<ffffffff813d335d>] __tcp_push_pending_frames+0x2d/0x70
[16888.052770] [<ffffffff813c6034>] tcp_sendmsg+0xc74/0xc90
[16888.116751] [<ffffffff813ec3f1>] inet_sendmsg+0x61/0xc0
[16888.179702] [<ffffffff81368981>] sock_aio_write+0x101/0x120
[16888.246776] [<ffffffff8121d530>] ? timerqueue_add+0x60/0xb0
[16888.313851] [<ffffffff81134720>] do_sync_write+0x60/0x90
[16888.377827] [<ffffffff81134904>] ? rw_verify_area+0x54/0xf0
[16888.444897] [<ffffffff81134b26>] vfs_write+0x186/0x190
[16888.506808] [<ffffffff811353bd>] SyS_write+0x5d/0xa0
[16888.566657] [<ffffffff8143e0e2>] system_call_fastpath+0x16/0x1b
[16888.637857] Code: 8b 85 80 00 00 00 48 89 83 80 00 00 00 49 8b 45 18 44 89 b3 e8 00 00 00 48 89 43 18 45 29 b5 e8 00 00 00 48 8b 1b 8b 45 d4 0f c8 <44> 0f b7 a3 c8 00 00 00 4c 03 a3 d8 00 00 00 41 89 44 24 04 41
[16888.862165] RIP [<ffffffff813dd3e1>] tcp_gso_segment+0x251/0x3a0
-----------------------cut here--------------------------------------------
If max_size is too small, a GSO segment cannot be enqueued,
we need to free the segment.
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
---
net/sched/sch_tbf.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/sched/sch_tbf.c b/net/sched/sch_tbf.c
index 68f9859..f246589 100644
--- a/net/sched/sch_tbf.c
+++ b/net/sched/sch_tbf.c
@@ -140,7 +140,7 @@ static int tbf_segment(struct sk_buff *skb, struct Qdisc *sch)
qdisc_skb_cb(segs)->pkt_len = segs->len;
ret = qdisc_enqueue(segs, q->qdisc);
} else {
- ret = qdisc_reshape_fail(skb, sch);
+ ret = qdisc_reshape_fail(segs, q->qdisc);
}
if (ret != NET_XMIT_SUCCESS) {
if (net_xmit_drop_count(ret))
--
1.7.12
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH] net: sched: tbf: fix an oops when a GSO packet cannot be enqueued
2013-11-22 9:27 [PATCH] net: sched: tbf: fix an oops when a GSO packet cannot be enqueued Yang Yingliang
@ 2013-11-22 14:50 ` Eric Dumazet
2013-11-23 8:08 ` Yang Yingliang
0 siblings, 1 reply; 6+ messages in thread
From: Eric Dumazet @ 2013-11-22 14:50 UTC (permalink / raw)
To: Yang Yingliang; +Cc: davem, netdev, ben, jpirko, jhs
On Fri, 2013-11-22 at 17:27 +0800, Yang Yingliang wrote:
> It seems commit e43ac79a4b("sch_tbf: segment too big GSO packets")
> introduce this oops.
>
> When GSO mode is on, use the following command:
> tc qdisc add dev eth0 root handle 1: tbf latency 50ms burst 1KB rate 50mbit mtu 1KB
> iperf -c host -t 30
I think your patch is not needed, the stack trace point to a different
spot.
The issue should be fixed by :
http://git.kernel.org/cgit/linux/kernel/git/davem/net.git/commit/?id=9d8506cc2d7ea1f911c72c100193a3677f6668c3
Could you check again ?
Thanks
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] net: sched: tbf: fix an oops when a GSO packet cannot be enqueued
2013-11-22 14:50 ` Eric Dumazet
@ 2013-11-23 8:08 ` Yang Yingliang
2013-11-23 17:56 ` Eric Dumazet
0 siblings, 1 reply; 6+ messages in thread
From: Yang Yingliang @ 2013-11-23 8:08 UTC (permalink / raw)
To: Eric Dumazet; +Cc: davem, netdev, ben, jpirko, jhs
On 2013/11/22 22:50, Eric Dumazet wrote:
> On Fri, 2013-11-22 at 17:27 +0800, Yang Yingliang wrote:
>> It seems commit e43ac79a4b("sch_tbf: segment too big GSO packets")
>> introduce this oops.
>>
>> When GSO mode is on, use the following command:
>> tc qdisc add dev eth0 root handle 1: tbf latency 50ms burst 1KB rate 50mbit mtu 1KB
>> iperf -c host -t 30
>
> I think your patch is not needed, the stack trace point to a different
> spot.
>
> The issue should be fixed by :
>
> http://git.kernel.org/cgit/linux/kernel/git/davem/net.git/commit/?id=9d8506cc2d7ea1f911c72c100193a3677f6668c3
>
> Could you check again ?
>
> Thanks
I've applied this patch, but it's oops too.
I think it's not the same problem. In this
problem, when the segs is not enqueued, we do
nothing to the segs.I think it causes the
problem. If I am wrong, please point out.
Thanks!
Here is the call trace after applying the above patch:
BTW, the call trace is not always same no matter the
above patch is applied.
[ 355.298416] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
[ 355.391292] IP: [<ffffffff81377f99>] __skb_recv_datagram+0x259/0x500
[ 355.466628] PGD 20312fc067 PUD 2030ed9067 PMD 0
[ 355.521336] Oops: 0002 [#1] SMP
[ 355.559528] Modules linked in: sch_tbf(O) tun ip6table_filter ip6_tables iptable_filter ip_tables x_tables edd af_packet bridge stp llc microcode fuse loop dm_mod igb dca i2c_algo_bit i2c_i801 ptp serio_raw sg ehci_pci ipv6 kvm_intel i2c_core kvm iTCO_wdt iTCO_vendor_support lpc_ich mfd_core hid_generic mptctl pps_core pcspkr button ext3 jbd mbcache usbhid hid uhci_hcd ehci_hcd usbcore sd_mod usb_common crc_t10dif crct10dif_common processor thermal_sys hwmon scsi_dh_hp_sw scsi_dh_emc scsi_dh_alua scsi_dh_rdac scsi_dh ata_generic ata_piix libata mptsas mptscsih mptbase scsi_transport_sas scsi_mod
[ 356.186224] CPU: 9 PID: 10073 Comm: acc-snf Tainted: G O 3.12.0-rc6-0.7-default+ #8
[ 356.287345] Hardware name: Huawei Technologies Co., Ltd. Tecal XH620 /BC21THSA , BIOS TTSAV020 12/02/2011
[ 356.424571] task: ffff881fbd4e0380 ti: ffff881fbf028000 task.ti: ffff881fbf028000
[ 356.513306] RIP: 0010:[<ffffffff81377f99>] [<ffffffff81377f99>] __skb_recv_datagram+0x259/0x500
[ 356.617524] RSP: 0018:ffff881fbf029b68 EFLAGS: 00210046
[ 356.680466] RAX: 0000000000000000 RBX: ffff881fbd0e1b70 RCX: 0000000000200296
[ 356.765074] RDX: 0000000000000000 RSI: 0000000000200296 RDI: ffff881fbbc6f0a4
[ 356.849683] RBP: ffff881fbf029c48 R08: ffff881fbf029cb4 R09: ffff88203eb24840
[ 356.934291] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 357.018900] R13: ffff881fbbc6f000 R14: ffff881fbbc6f090 R15: ffff881fbf029c64
[ 357.103508] FS: 0000000000000000(0000) GS:ffff88203f320000(0063) knlGS:00000000f7320b70
[ 357.199465] CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033
[ 357.267566] CR2: 0000000000000008 CR3: 0000001fbb973000 CR4: 00000000000027e0
[ 357.352175] Stack:
[ 357.375911] 0000000000200286 ffff881fbf029c00 ffff881fbf028010 ffff881fbd4e0380
[ 357.463640] ffff881fbf028010 ffff881fbbc6f1b8 ffff881fbf029cb4 ffff881fbf029c60
[ 357.551369] ffff881f80000020 7fffffffffffffff ffff881fbbc6f0a4 ffff881fbbc6f0cc
[ 357.639099] Call Trace:
[ 357.667998] [<ffffffff81371afa>] ? skb_release_data+0xaa/0xe0
[ 357.737135] [<ffffffff81378272>] skb_recv_datagram+0x32/0x40
[ 357.805237] [<ffffffffa08d7096>] packet_recvmsg+0x66/0x360 [af_packet]
[ 357.883662] [<ffffffff81369cdc>] sock_recvmsg+0xac/0xe0
[ 357.946613] [<ffffffff8117f80d>] ? compat_core_sys_select+0x24d/0x260
[ 358.024001] [<ffffffff81369dce>] SyS_recvfrom+0xbe/0x120
[ 358.087976] [<ffffffff8136b898>] ? compat_sock_ioctl+0x88/0xb0
[ 358.158145] [<ffffffff811810ec>] ? compat_sys_ioctl+0x7c/0x1380
[ 358.229344] [<ffffffff81096d93>] ? ktime_get_ts+0x53/0x100
[ 358.295387] [<ffffffff8139e620>] compat_sys_socketcall+0x150/0x240
[ 358.369685] [<ffffffff8143f595>] sysenter_dispatch+0x7/0x1a
[ 358.436756] Code: 00 48 89 8d 68 ff ff ff e9 38 fe ff ff 41 83 ad a0 00 00 00 01 48 8b 13 48 8b 43 08 48 c7 03 00 00 00 00 48 c7 43 08 00 00 00 00 <48> 89 42 08 48 89 10 48 8b bd 70 ff ff ff e8 04 f9 0b 00 48 8b
[ 358.661068] RIP [<ffffffff81377f99>] __skb_recv_datagram+0x259/0x500
------------------------------cut here-------------------------------------
Regards,
Yang
>
>
>
>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] net: sched: tbf: fix an oops when a GSO packet cannot be enqueued
2013-11-23 8:08 ` Yang Yingliang
@ 2013-11-23 17:56 ` Eric Dumazet
2013-11-23 20:59 ` [PATCH] sch_tbf: handle too small burst Eric Dumazet
0 siblings, 1 reply; 6+ messages in thread
From: Eric Dumazet @ 2013-11-23 17:56 UTC (permalink / raw)
To: Yang Yingliang; +Cc: davem, netdev, ben, jpirko, jhs
On Sat, 2013-11-23 at 16:08 +0800, Yang Yingliang wrote:
>
> I've applied this patch, but it's oops too.
> I think it's not the same problem. In this
> problem, when the segs is not enqueued, we do
> nothing to the segs.I think it causes the
> problem. If I am wrong, please point out.
> Thanks!
>
> Here is the call trace after applying the above patch:
> BTW, the call trace is not always same no matter the
> above patch is applied.
>
Oh right. Your patch on the right way, but changelog very confusing.
BTW, setting a mtu of 1KB on TBF never worked if your ethernet device
has MTU of 1500.
I think that we should emit a warning if TBF mtu is too low.
Ideally we should not even call skb_gso_segment() at all.
I'll post an updated version of the patch with a more detailed changelog
and explanations.
Thanks !
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH] sch_tbf: handle too small burst
2013-11-23 17:56 ` Eric Dumazet
@ 2013-11-23 20:59 ` Eric Dumazet
2013-11-23 22:49 ` David Miller
0 siblings, 1 reply; 6+ messages in thread
From: Eric Dumazet @ 2013-11-23 20:59 UTC (permalink / raw)
To: Yang Yingliang, David Miller; +Cc: netdev, ben, jpirko, jhs
From: Eric Dumazet <edumazet@google.com>
If a too small burst is inadvertently set on TBF, we might trigger
a bug in tbf_segment(), as 'skb' instead of 'segs' was used in a
qdisc_reshape_fail() call.
tc qdisc add dev eth0 root handle 1: tbf latency 50ms burst 1KB rate
50mbit
Fix the bug, and add a warning, as such configuration is not
going to work anyway for non GSO packets.
(For some reason, one has to use a burst >= 1520 to get a working
configuration, even with old kernels. This is a probable iproute2/tc
bug)
Based on a report and initial patch from Yang Yingliang
Fixes: e43ac79a4bc6 ("sch_tbf: segment too big GSO packets")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Yang Yingliang <yangyingliang@huawei.com>
---
Note : I prepared a self contained patch for the ease of stable
submissions. We might later move skb_gso_seglen() in an
include file.
net/sched/sch_tbf.c | 32 +++++++++++++++++++++++++-------
1 file changed, 25 insertions(+), 7 deletions(-)
diff --git a/net/sched/sch_tbf.c b/net/sched/sch_tbf.c
index 68f9859..a609005 100644
--- a/net/sched/sch_tbf.c
+++ b/net/sched/sch_tbf.c
@@ -21,6 +21,7 @@
#include <net/netlink.h>
#include <net/sch_generic.h>
#include <net/pkt_sched.h>
+#include <net/tcp.h>
/* Simple Token Bucket Filter.
@@ -117,6 +118,22 @@ struct tbf_sched_data {
};
+/*
+ * Return length of individual segments of a gso packet,
+ * including all headers (MAC, IP, TCP/UDP)
+ */
+static unsigned int skb_gso_seglen(const struct sk_buff *skb)
+{
+ unsigned int hdr_len = skb_transport_header(skb) - skb_mac_header(skb);
+ const struct skb_shared_info *shinfo = skb_shinfo(skb);
+
+ if (likely(shinfo->gso_type & (SKB_GSO_TCPV4 | SKB_GSO_TCPV6)))
+ hdr_len += tcp_hdrlen(skb);
+ else
+ hdr_len += sizeof(struct udphdr);
+ return hdr_len + shinfo->gso_size;
+}
+
/* GSO packet is too big, segment it so that tbf can transmit
* each segment in time
*/
@@ -136,12 +153,8 @@ static int tbf_segment(struct sk_buff *skb, struct Qdisc *sch)
while (segs) {
nskb = segs->next;
segs->next = NULL;
- if (likely(segs->len <= q->max_size)) {
- qdisc_skb_cb(segs)->pkt_len = segs->len;
- ret = qdisc_enqueue(segs, q->qdisc);
- } else {
- ret = qdisc_reshape_fail(skb, sch);
- }
+ qdisc_skb_cb(segs)->pkt_len = segs->len;
+ ret = qdisc_enqueue(segs, q->qdisc);
if (ret != NET_XMIT_SUCCESS) {
if (net_xmit_drop_count(ret))
sch->qstats.drops++;
@@ -163,7 +176,7 @@ static int tbf_enqueue(struct sk_buff *skb, struct Qdisc *sch)
int ret;
if (qdisc_pkt_len(skb) > q->max_size) {
- if (skb_is_gso(skb))
+ if (skb_is_gso(skb) && skb_gso_seglen(skb) <= q->max_size)
return tbf_segment(skb, sch);
return qdisc_reshape_fail(skb, sch);
}
@@ -319,6 +332,11 @@ static int tbf_change(struct Qdisc *sch, struct nlattr *opt)
if (max_size < 0)
goto done;
+ if (max_size < psched_mtu(qdisc_dev(sch)))
+ pr_warn_ratelimited("sch_tbf: burst %u is lower than device %s mtu (%u) !\n",
+ max_size, qdisc_dev(sch)->name,
+ psched_mtu(qdisc_dev(sch)));
+
if (q->qdisc != &noop_qdisc) {
err = fifo_set_limit(q->qdisc, qopt->limit);
if (err)
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH] sch_tbf: handle too small burst
2013-11-23 20:59 ` [PATCH] sch_tbf: handle too small burst Eric Dumazet
@ 2013-11-23 22:49 ` David Miller
0 siblings, 0 replies; 6+ messages in thread
From: David Miller @ 2013-11-23 22:49 UTC (permalink / raw)
To: eric.dumazet; +Cc: yangyingliang, netdev, ben, jpirko, jhs
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Sat, 23 Nov 2013 12:59:20 -0800
> From: Eric Dumazet <edumazet@google.com>
>
> If a too small burst is inadvertently set on TBF, we might trigger
> a bug in tbf_segment(), as 'skb' instead of 'segs' was used in a
> qdisc_reshape_fail() call.
>
> tc qdisc add dev eth0 root handle 1: tbf latency 50ms burst 1KB rate
> 50mbit
>
> Fix the bug, and add a warning, as such configuration is not
> going to work anyway for non GSO packets.
>
> (For some reason, one has to use a burst >= 1520 to get a working
> configuration, even with old kernels. This is a probable iproute2/tc
> bug)
>
> Based on a report and initial patch from Yang Yingliang
>
> Fixes: e43ac79a4bc6 ("sch_tbf: segment too big GSO packets")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Reported-by: Yang Yingliang <yangyingliang@huawei.com>
Applied and queued up for -stable.
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2013-11-23 22:41 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-11-22 9:27 [PATCH] net: sched: tbf: fix an oops when a GSO packet cannot be enqueued Yang Yingliang
2013-11-22 14:50 ` Eric Dumazet
2013-11-23 8:08 ` Yang Yingliang
2013-11-23 17:56 ` Eric Dumazet
2013-11-23 20:59 ` [PATCH] sch_tbf: handle too small burst Eric Dumazet
2013-11-23 22:49 ` David Miller
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).