* bugreport: Netdev watchdog timeout on wwan0 and soft lockup CPU #0, huawei E3276 modem always fails on GSM UDP up/downlink
From: Erik Alapää @ 2014-12-18 6:30 UTC (permalink / raw)
To: netdev
Disclaimer: This involves a mobile (cellular) (GSM/WCDMA/LTE) modem, but since
the crash involves netdev watchdog and a soft lockup of CPU #0 (see
trace below),
I post here. If doing so is not appropriate, I apologize beforehand.
Problem: up/dowlink UDP with iperf3 over GSM crashes Huawei E3276 USB
modem on 32-bit Debian-stable (wheezy) with 3.13 kernel and newer
kernels. Error also seen on 64-bit machines/Linux kernels.
Keywords: huawei_cdc_ncm, GSM, cdc-wdm, wwan0, netdev watchdog
Detailed problem description: Sending/receiving UDP 100+100kbit/s
up-and-downlink UDP traffic with 2 iperf3 processes over a Huawei
E3276 USB modem in GSM mode (GSM RAT) *always* fails after 3-15
minutes. Modem disconnects and kernel log shows 'NETDEV WATCHDOG:
wwan0 (huawei_cdc_ncm): transmit queue 0 timed out'. Also causes a soft
lockup of CPU #0.
Reproduced on a range of real and virtualized systems, all running Debian
or Kubuntu Linux with Linux kernel supporting the modem (3.13 and higher).
Output below is from a debian-jessie (unstable) system with custom-built dbg
kernel, but debian wheezy (stable) also reproduces same bug.
A clean way to reproduce is to use a clean debian wheezy install and
only adding iperf3 and a 3.16 kernel from debian-unstable.
At the end of this post I paste an excerpt from kern.log, if more logs
etc are needed I can provide them.
Command line of test application (showing uplink iperf UDP/GSM,
downlink is the same but with -R switch):
'./iperf3 -B <MODEM DHCP ADDR> -u -t 3600 -i 5 -p 5503 -b 100K -c <IP
ADDR OF IPERF SERVER>'
Distribution: debian-unstable
Kernel version: 3.16.7, vanilla debian unstable 3.17.0-trunk-686-pae,
3.17.2. All kernels that support the E3276 modem (kernels from 3.13
and upward) seem to fail.
Kernel modules: huawei_cdc_ncm, cdc_ncm, cdc_wdm, usb_wwan
Modem firmware version: 21.436.03.00.00
Lsusb showing modem/vendor id:
Bus 003 Device 027: ID 12d1:1506 Huawei Technologies Co., Ltd. E398
LTE/UMTS/GSM Modem/Networkcard
Output of lsb_release -a:
Distributor ID: Debian
Description: Debian GNU/Linux 8.0 (jessie)
Release: 8.0
Codename: jessie
Additional info: E3276 modem seems to be stable in Windows, so it may
be an issue with the huawei_cdc_ncm driver in Linux. But it may still
be a modem hw/firmware issue, and the netdev watchdog timeout on wwan0
could be caused by modem firmware crash. Modem also works much better
in Linux when using 3G (WCDMA) or 4G (LTE) instead of 2G (GSM).
Merry Christmas,
/Erik Alapää
Excerpt from kern.log:
Dec 15 13:12:10 vmw-wheezy kernel: [ 4336.877612] ------------[ cut
here ]------------
Dec 15 13:12:10 vmw-wheezy kernel: [ 4336.877647] WARNING: CPU: 0 PID:
0 at /build/linux-n7UL8Q/linux-3.17.4/net/sched/sch_generic.c:264
dev_watchdog+0x1ec/0x200()
Dec 15 13:12:10 vmw-wheezy kernel: [ 4336.877649] NETDEV WATCHDOG:
wwan0 (huawei_cdc_ncm): transmit queue 0 timed out
Dec 15 13:12:10 vmw-wheezy kernel: [ 4336.877650] Modules linked in:
huawei_cdc_ncm nls_utf8 udf isofs crc_itu_t ppdev lp rfcomm bnep
binfmt_misc uinput nfsd auth_rpcgss oid_registry nfs_acl nfs lockd
fscache sunrpc loop cdc_wdm option usb_storage usb_wwan cdc_ncm
usbserial usbnet snd_ens1371 snd_seq_midi snd_seq_midi_event
snd_rawmidi ecb btusb bluetooth rfkill snd_ac97_codec snd_pcm snd_seq
snd_seq_device snd_timer snd coretemp soundcore psmouse vmw_balloon
ac97_bus serio_raw crc32_pclmul aesni_intel aes_i586 xts lrw gf128mul
ablk_helper cryptd evdev gameport pcspkr parport_pc parport vmwgfx
shpchp ttm drm_kms_helper drm i2c_piix4 i2c_core processor thermal_sys
vmw_vmci button battery ac ext4 crc16 mbcache jbd2 sr_mod cdrom sg
ata_generic hid_generic usbhid hid sd_mod crc_t10dif crct10dif_common
crc32c_intel floppy ehci_pci uhci_hcd ehci_hcd pcnet32 mii usbcore
usb_common ata_piix mptspi scsi_transport_spi mptscsih mptbase libata
scsi_mod [last unloaded: huawei_cdc_ncm]
Dec 15 13:12:10 vmw-wheezy kernel: [ 4336.877768] CPU: 0 PID: 0 Comm:
swapper/0 Not tainted 3.17.0-trunk-686-pae #1 Debian 3.17.4-1~exp1
Dec 15 13:12:10 vmw-wheezy kernel: [ 4336.877769] Hardware name:
VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform,
BIOS 6.00 07/31/2013
Dec 15 13:12:10 vmw-wheezy kernel: [ 4336.877770] 00000000 de409ef4
c149bcff de409f04 c10598a8 c15d8868 de409f20 00000000
Dec 15 13:12:10 vmw-wheezy kernel: [ 4336.877772] c15d88a4 00000108
c13de8bc 00000108 c13de8bc 00000009 db25b800 fff09a96
Dec 15 13:12:10 vmw-wheezy kernel: [ 4336.877773] fffffb7e de409f0c
c10598f3 00000009 de409f04 c15d8868 de409f20 de409f40
Dec 15 13:12:10 vmw-wheezy kernel: [ 4336.877775] Call Trace:
Dec 15 13:12:10 vmw-wheezy kernel: [ 4336.877780] [<c149bcff>] ?
dump_stack+0x3e/0x4e
Dec 15 13:12:10 vmw-wheezy kernel: [ 4336.877785] [<c10598a8>] ?
warn_slowpath_common+0x88/0xa0
Dec 15 13:12:10 vmw-wheezy kernel: [ 4336.877787] [<c13de8bc>] ?
dev_watchdog+0x1ec/0x200
Dec 15 13:12:10 vmw-wheezy kernel: [ 4336.877788] [<c13de8bc>] ?
dev_watchdog+0x1ec/0x200
Dec 15 13:12:10 vmw-wheezy kernel: [ 4336.877789] [<c10598f3>] ?
warn_slowpath_fmt+0x33/0x40
Dec 15 13:12:10 vmw-wheezy kernel: [ 4336.877790] [<c13de8bc>] ?
dev_watchdog+0x1ec/0x200
Dec 15 13:12:10 vmw-wheezy kernel: [ 4336.877796] [<c10acb20>] ?
call_timer_fn+0x30/0x100
Dec 15 13:12:10 vmw-wheezy kernel: [ 4336.877799] [<c108ccf7>] ?
__wake_up+0x37/0x50
Dec 15 13:12:10 vmw-wheezy kernel: [ 4336.877801] [<c10a7a1f>] ?
rcu_report_qs_rnp+0xbf/0x100
Dec 15 13:12:10 vmw-wheezy kernel: [ 4336.877802] [<c13de6d0>] ?
dev_graft_qdisc+0x70/0x70
Dec 15 13:12:10 vmw-wheezy kernel: [ 4336.877804] [<c10ae476>] ?
run_timer_softirq+0x186/0x250
Dec 15 13:12:10 vmw-wheezy kernel: [ 4336.877805] [<c13de6d0>] ?
dev_graft_qdisc+0x70/0x70
Dec 15 13:12:10 vmw-wheezy kernel: [ 4336.877807] [<c105d43c>] ?
__do_softirq+0xec/0x240
Dec 15 13:12:10 vmw-wheezy kernel: [ 4336.877809] [<c105d350>] ?
cpu_callback+0x180/0x180
Dec 15 13:12:10 vmw-wheezy kernel: [ 4336.877844] [<c1011a06>] ?
do_softirq_own_stack+0x26/0x30
Dec 15 13:12:10 vmw-wheezy kernel: [ 4336.877845] <IRQ> [<c105d705>]
? irq_exit+0x95/0xa0
Dec 15 13:12:10 vmw-wheezy kernel: [ 4336.877849] [<c14a2138>] ?
smp_apic_timer_interrupt+0x38/0x50
Dec 15 13:12:10 vmw-wheezy kernel: [ 4336.877850] [<c14a187c>] ?
apic_timer_interrupt+0x34/0x3c
Dec 15 13:12:10 vmw-wheezy kernel: [ 4336.877863] [<c104b822>] ?
native_safe_halt+0x2/0x10
Dec 15 13:12:10 vmw-wheezy kernel: [ 4336.877866] [<c101873c>] ?
default_idle+0x1c/0xa0
Dec 15 13:12:10 vmw-wheezy kernel: [ 4336.877868] [<c1018ece>] ?
arch_cpu_idle+0xe/0x10
Dec 15 13:12:10 vmw-wheezy kernel: [ 4336.877869] [<c108d653>] ?
cpu_startup_entry+0x303/0x340
Dec 15 13:12:10 vmw-wheezy kernel: [ 4336.877871] [<c108cb6f>] ?
__wake_up_locked+0x1f/0x30
Dec 15 13:12:10 vmw-wheezy kernel: [ 4336.877873] [<c16a5ab8>] ?
start_kernel+0x38d/0x393
Dec 15 13:12:10 vmw-wheezy kernel: [ 4336.877874] ---[ end trace
07cf0823922200f7 ]---
Dec 15 13:12:20 vmw-wheezy kernel: [ 4346.939769] option1 ttyUSB0: GSM
modem (1-port) converter now disconnected from ttyUSB0
Dec 15 13:12:20 vmw-wheezy kernel: [ 4346.939779] option 2-1:1.0:
device disconnected
Dec 15 13:12:20 vmw-wheezy kernel: [ 4346.940770] option1 ttyUSB1: GSM
modem (1-port) converter now disconnected from ttyUSB1
Dec 15 13:12:20 vmw-wheezy kernel: [ 4346.940777] option 2-1:1.1:
device disconnected
Dec 15 13:12:20 vmw-wheezy kernel: [ 4346.941098] huawei_cdc_ncm
2-1:1.2 wwan0: unregister 'huawei_cdc_ncm' usb-0000:02:03.0-1, Huawei
CDC NCM device
Dec 15 13:12:57 vmw-wheezy kernel: [ 4383.884677] NMI watchdog: BUG:
soft lockup - CPU#0 stuck for 24s! [usb-storage:7203]
Dec 15 13:12:57 vmw-wheezy kernel: [ 4383.884683] Modules linked in:
huawei_cdc_ncm nls_utf8 udf isofs crc_itu_t ppdev lp rfcomm bnep
binfmt_misc uinput nfsd auth_rpcgss oid_registry nfs_acl nfs lockd
fscache sunrpc loop cdc_wdm option usb_storage usb_wwan cdc_ncm
usbserial usbnet snd_ens1371 snd_seq_midi snd_seq_midi_event
snd_rawmidi ecb btusb bluetooth rfkill snd_ac97_codec snd_pcm snd_seq
snd_seq_device snd_timer snd coretemp soundcore psmouse vmw_balloon
ac97_bus serio_raw crc32_pclmul aesni_intel aes_i586 xts lrw gf128mul
ablk_helper cryptd evdev gameport pcspkr parport_pc parport vmwgfx
shpchp ttm drm_kms_helper drm i2c_piix4 i2c_core processor thermal_sys
vmw_vmci button battery ac ext4 crc16 mbcache jbd2 sr_mod cdrom sg
ata_generic hid_generic usbhid hid sd_mod crc_t10dif crct10dif_common
crc32c_intel floppy ehci_pci uhci_hcd ehci_hcd pcnet32 mii usbcore
usb_common ata_piix mptspi scsi_transport_spi mptscsih mptbase libata
scsi_mod [last unloaded: huawei_cdc_ncm]
Dec 15 13:12:57 vmw-wheezy kernel: [ 4383.884740] CPU: 0 PID: 7203
Comm: usb-storage Tainted: G W 3.17.0-trunk-686-pae #1
Debian 3.17.4-1~exp1
Dec 15 13:12:57 vmw-wheezy kernel: [ 4383.884742] Hardware name:
VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform,
BIOS 6.00 07/31/2013
Dec 15 13:12:57 vmw-wheezy kernel: [ 4383.884744] task: dbfc4ab0 ti:
db27a000 task.ti: db27a000
Dec 15 13:12:57 vmw-wheezy kernel: [ 4383.884778] EIP:
0060:[<c14a064f>] EFLAGS: 00000286 CPU: 0
Dec 15 13:12:57 vmw-wheezy kernel: [ 4383.884783] EIP is at
_raw_spin_unlock_irqrestore+0xf/0x20
Dec 15 13:12:57 vmw-wheezy kernel: [ 4383.884785] EAX: 00000286 EBX:
00000000 ECX: 00000000 EDX: 00000286
Dec 15 13:12:57 vmw-wheezy kernel: [ 4383.884786] ESI: 00000000 EDI:
db12c400 EBP: db27bb34 ESP: db27bb34
Dec 15 13:12:57 vmw-wheezy kernel: [ 4383.884787] DS: 007b ES: 007b
FS: 00d8 GS: 00e0 SS: 0068
Dec 15 13:12:57 vmw-wheezy kernel: [ 4383.884789] CR0: 80050033 CR2:
b77c5000 CR3: 1d4fb000 CR4: 001407f0
Dec 15 13:12:57 vmw-wheezy kernel: [ 4383.884837] Stack:
Dec 15 13:12:57 vmw-wheezy kernel: [ 4383.884838] db27bbe0 e09f41e3
11c37a72 0000000d dbfc4af4 dbfc4af4 0000647d db27bb8c
Dec 15 13:12:57 vmw-wheezy kernel: [ 4383.884841] e0806090 e0806054
db120001 00000006 dcf70ce0 00000004 e0806050 00000001
Dec 15 13:12:57 vmw-wheezy kernel: [ 4383.884844] 00000286 db12c624
0000000d 00000001 dfff5648 d5472700 de400040 db27bbe0
Dec 15 13:12:57 vmw-wheezy kernel: [ 4383.884847] Call Trace:
Dec 15 13:12:57 vmw-wheezy kernel: [ 4383.884858] [<e09f41e3>] ?
ehci_hub_control+0xf3/0xc80 [ehci_hcd]
Dec 15 13:12:57 vmw-wheezy kernel: [ 4383.884893] [<c1167b9d>] ?
__kmalloc+0x37d/0x3f0
Dec 15 13:12:57 vmw-wheezy kernel: [ 4383.884896] [<c1081ec2>] ?
set_next_entity+0x52/0x70
Dec 15 13:12:57 vmw-wheezy kernel: [ 4383.884898] [<c1087d0d>] ?
pick_next_task_fair+0x8ed/0xa30
Dec 15 13:12:57 vmw-wheezy kernel: [ 4383.884929] [<e08d9659>] ?
usb_hcd_submit_urb+0x339/0x9d0 [usbcore]
Dec 15 13:12:57 vmw-wheezy kernel: [ 4383.884947] [<c100e6dc>] ?
__switch_to+0x11c/0x480
Dec 15 13:12:57 vmw-wheezy kernel: [ 4383.884954] [<c10accd3>] ?
lock_timer_base.isra.39+0x23/0x50
Dec 15 13:12:57 vmw-wheezy kernel: [ 4383.884960] [<e08db5ee>] ?
usb_start_wait_urb+0x4e/0x140 [usbcore]
Dec 15 13:12:57 vmw-wheezy kernel: [ 4383.884965] [<e08db787>] ?
usb_control_msg+0xa7/0xe0 [usbcore]
Dec 15 13:12:57 vmw-wheezy kernel: [ 4383.884967] [<c149de4e>] ?
wait_for_completion_timeout+0x8e/0xe0
Dec 15 13:12:57 vmw-wheezy kernel: [ 4383.884972] [<e08cec40>] ?
set_port_feature+0x50/0x60 [usbcore]
Dec 15 13:12:57 vmw-wheezy kernel: [ 4383.884982] [<e08d067e>] ?
hub_port_reset+0x9e/0x600 [usbcore]
Dec 15 13:12:57 vmw-wheezy kernel: [ 4383.884984] [<c115f17a>] ?
dma_pool_free+0x2a/0x100
Dec 15 13:12:57 vmw-wheezy kernel: [ 4383.884988] [<e09eb9f1>] ?
qh_destroy+0x61/0xa0 [ehci_hcd]
Dec 15 13:12:57 vmw-wheezy kernel: [ 4383.884992] [<e08d0c56>] ?
hub_port_init+0x76/0xcb0 [usbcore]
Dec 15 13:12:57 vmw-wheezy kernel: [ 4383.884996] [<e09f284d>] ?
ehci_endpoint_disable+0x2d/0x1c0 [ehci_hcd]
Dec 15 13:12:57 vmw-wheezy kernel: [ 4383.884999] [<c107d5bb>] ?
try_to_wake_up+0x13b/0x2d0
Dec 15 13:12:57 vmw-wheezy kernel: [ 4383.885004] [<e08d19b9>] ?
usb_reset_and_verify_device+0x129/0x730 [usbcore]
Dec 15 13:12:57 vmw-wheezy kernel: [ 4383.885006] [<c1073792>] ?
__blocking_notifier_call_chain+0x42/0x60
Dec 15 13:12:57 vmw-wheezy kernel: [ 4383.885007] [<c10737cf>] ?
blocking_notifier_call_chain+0x1f/0x30
Dec 15 13:12:57 vmw-wheezy kernel: [ 4383.885009] [<c149e3f0>] ?
mutex_lock+0x10/0x30
Dec 15 13:12:57 vmw-wheezy kernel: [ 4383.885014] [<e08d20ce>] ?
usb_reset_device+0x10e/0x280 [usbcore]
Dec 15 13:12:57 vmw-wheezy kernel: [ 4383.885019] [<e0ed7644>] ?
usb_stor_Bulk_transport+0x194/0x3e0 [usb_storage]
Dec 15 13:12:57 vmw-wheezy kernel: [ 4383.885021] [<e0ed7de0>] ?
usb_stor_port_reset+0x50/0x60 [usb_storage]
Dec 15 13:12:57 vmw-wheezy kernel: [ 4383.885023] [<e0ed7e60>] ?
usb_stor_invoke_transport+0x70/0x4c0 [usb_storage]
Dec 15 13:12:57 vmw-wheezy kernel: [ 4383.885025] [<c107b07e>] ?
check_preempt_curr+0x6e/0x80
Dec 15 13:12:57 vmw-wheezy kernel: [ 4383.885027] [<c107b0a8>] ?
ttwu_do_wakeup+0x18/0x140
Dec 15 13:12:57 vmw-wheezy kernel: [ 4383.885029] [<c149e1c5>] ?
wait_for_completion_interruptible+0x95/0x180
Dec 15 13:12:57 vmw-wheezy kernel: [ 4383.885031] [<c107d7b0>] ?
wake_up_state+0x10/0x10
Dec 15 13:12:57 vmw-wheezy kernel: [ 4383.885034] [<e0ed93cf>] ?
usb_stor_control_thread+0x13f/0x230 [usb_storage]
Dec 15 13:12:57 vmw-wheezy kernel: [ 4383.885037] [<c108cb6f>] ?
__wake_up_locked+0x1f/0x30
Dec 15 13:12:57 vmw-wheezy kernel: [ 4383.885039] [<e0ed9290>] ?
usb_stor_disconnect+0xc0/0xc0 [usb_storage]
Dec 15 13:12:57 vmw-wheezy kernel: [ 4383.885041] [<c1072808>] ?
kthread+0xa8/0xc0
Dec 15 13:12:57 vmw-wheezy kernel: [ 4383.885044] [<c14a0e41>] ?
ret_from_kernel_thread+0x21/0x30
Dec 15 13:12:57 vmw-wheezy kernel: [ 4383.885046] [<c1072760>] ?
kthread_create_on_node+0x110/0x110
Dec 15 13:12:57 vmw-wheezy kernel: [ 4383.885047] Code: 5b 5e 5f 5d c3
80 43 04 01 fb 90 8d 74 26 00 eb e5 31 c0 eb e8 90 90 90 90 90 90 90
55 89 e5 3e 8d 74 26 00 80 00 01 89 d0 50 9d <8d> 74 26 00 5d c3 8d 74
26 00 8d bc 27 00 00 00 00 55 89 e5 3e
Dec 15 13:12:59 vmw-wheezy kernel: [ 4385.380743] usb 2-1: USB
disconnect, device number 4
Dec 15 13:12:59 vmw-wheezy kernel: [ 4385.380789] option 2-1:1.0: GSM
modem (1-port) converter detected
Dec 15 13:12:59 vmw-wheezy kernel: [ 4385.381527] usb 2-1: GSM modem
(1-port) converter now attached to ttyUSB0
Dec 15 13:12:59 vmw-wheezy kernel: [ 4385.381547] option 2-1:1.1: GSM
modem (1-port) converter detected
Dec 15 13:12:59 vmw-wheezy kernel: [ 4385.382038] usb 2-1: GSM modem
(1-port) converter now attached to ttyUSB1
Dec 15 13:12:59 vmw-wheezy kernel: [ 4385.382117] option1 ttyUSB0: GSM
modem (1-port) converter now disconnected from ttyUSB0
Dec 15 13:12:59 vmw-wheezy kernel: [ 4385.382122] option 2-1:1.0:
device disconnected
Dec 15 13:12:59 vmw-wheezy kernel: [ 4385.382159] option1 ttyUSB1: GSM
modem (1-port) converter now disconnected from ttyUSB1
Dec 15 13:12:59 vmw-wheezy kernel: [ 4385.382163] option 2-1:1.1:
device disconnected
Dec 15 13:19:17 vmw-wheezy kernel: [ 4763.866455] usbcore:
deregistering interface driver huawei_cdc_ncm
Dec 15 13:19:21 vmw-wheezy kernel: [ 4767.064010] usbcore: registered
new interface driver huawei_cdc_ncm

^ permalink raw reply
* RE: [PATCH] Documentation: clarify phys_port_id
From: Sathya Perla @ 2014-12-18 5:57 UTC (permalink / raw)
To: Dan Williams, netdev@vger.kernel.org
Cc: Joshua Watt, jpirko@redhat.com, Florian Fainelli
In-Reply-To: <1418835576.1160.38.camel@dcbw.local>
> -----Original Message-----
> From: netdev-owner@vger.kernel.org [mailto:netdev-
>
> Signed-off-by: Dan Williams <dcbw@redhat.com>
> ---
> Documentation/ABI/testing/sysfs-class-net | 7 ++++++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/ABI/testing/sysfs-class-net
> b/Documentation/ABI/testing/sysfs-class-net
> index e1b2e78..7fe823a 100644
> --- a/Documentation/ABI/testing/sysfs-class-net
> +++ b/Documentation/ABI/testing/sysfs-class-net
> @@ -186,7 +186,12 @@ KernelVersion: 3.12
> Contact: netdev@vger.kernel.org
> Description:
> Indicates the interface unique physical port identifier within
> - the NIC, as a string.
> + the NIC, as a string. If two net_device objects share physical
> + hardware or other resources, and/or do not operate
> independently
> + both net_device objects should be assigned the
> + same phys_port_id. phys_port_id should be as globally
> unique
> + as possible to prevent conflicts between different drivers
> and
> + vendors, eg with MAC addresses or hardware GUIDs.
Dan, two interfaces -- on the same card/chip -- may share some chip resources,
but as long as they use *separate* physical ports, it would be OK to bond them.
So, in this case, it would be valid to report a different phys_port_id for these netdevs.
The text -- "share physical hardware or other resources" becomes too restrictive
^^^^^^^^^^^^^^^
and will not even allow bonding of two physical ports on a NIC card.
thks,
-Sathya
>
> What: /sys/class/net/<iface>/speed
> Date: October 2009
> --
> 1.9.3
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* [PATCH net] net: drop the packet when fails to do software segmentation or header check
From: Jason Wang @ 2014-12-18 5:57 UTC (permalink / raw)
To: davem, netdev, linux-kernel; +Cc: Jason Wang, Eric Dumazet
Commit cecda693a969816bac5e470e1d9c9c0ef5567bca ("net: keep original skb
which only needs header checking during software GSO") keeps the original
skb for packets that only needs header check, but it doesn't drop the
packet if software segmentation or header check were failed.
Fixes cecda693a969816bac5e470e1d9c9c0ef5567bca
(net: keep original skb which only needs header checking during software GSO)
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
Stable candidate for 3.17 and above.
---
net/core/dev.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/core/dev.c b/net/core/dev.c
index f411c28..a989f85 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2673,7 +2673,7 @@ static struct sk_buff *validate_xmit_skb(struct sk_buff *skb, struct net_device
segs = skb_gso_segment(skb, features);
if (IS_ERR(segs)) {
- segs = NULL;
+ goto out_kfree_skb;
} else if (segs) {
consume_skb(skb);
skb = segs;
--
1.9.1
^ permalink raw reply related
* Re: [PATCH 08/10] tun: Re-uanble UFO support.
From: Jason Wang @ 2014-12-18 5:51 UTC (permalink / raw)
To: Vladislav Yasevich; +Cc: mst, netdev, virtualization, stefanha, ben
In-Reply-To: <1418840455-22598-9-git-send-email-vyasevic@redhat.com>
----- Original Message -----
> Now that UFO is split into v4 and v6 parts, we can bring
> back v4 support without any trouble.
>
> Continue to handle legacy applications by selecting the
> IPv6 fragment id but do not change the gso type. Thist
> makes sure that two legacy VMs may still communicate.
>
> Based on original work from Ben Hutchings.
>
> Fixes: 88e0e0e5aa7a ("drivers/net: Disable UFO through virtio")
> CC: Ben Hutchings <ben@decadent.org.uk>
> Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
> ---
> drivers/net/tun.c | 26 ++++++++++++++------------
> 1 file changed, 14 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> index 9dd3746..8c32fca 100644
> --- a/drivers/net/tun.c
> +++ b/drivers/net/tun.c
> @@ -175,7 +175,7 @@ struct tun_struct {
> struct net_device *dev;
> netdev_features_t set_features;
> #define TUN_USER_FEATURES (NETIF_F_HW_CSUM|NETIF_F_TSO_ECN|NETIF_F_TSO| \
> - NETIF_F_TSO6)
> + NETIF_F_TSO6|NETIF_F_UFO)
>
> int vnet_hdr_sz;
> int sndbuf;
> @@ -1152,20 +1152,15 @@ static ssize_t tun_get_user(struct tun_struct *tun,
> struct tun_file *tfile,
> skb_shinfo(skb)->gso_type = SKB_GSO_TCPV6;
> break;
> case VIRTIO_NET_HDR_GSO_UDP:
> - {
> - static bool warned;
> -
> - if (!warned) {
> - warned = true;
> - netdev_warn(tun->dev,
> - "%s: using disabled UFO feature; please fix this program\n",
> - current->comm);
> - }
> skb_shinfo(skb)->gso_type = SKB_GSO_UDP;
> - if (skb->protocol == htons(ETH_P_IPV6))
> + if (vlan_get_protocol(skb) == htons(ETH_P_IPV6)) {
This probably means UDPv6 with vlan does not work well, looks like another
independent fixe and also for stable.
> + /* This allows legacy application to work.
> + * Do not change the gso_type as it may
> + * not be upderstood by legacy applications.
> + */
Isn't this an issue that we use SKB_GSO_UDP for a UDPv6 packet? Especially
consider we want to fix UFOv6 in the future? We probably can use
SKB_GSO_UDP6 here and try to fix it in tun_put_user() if a legacy userspace
is detected.
> ipv6_proxy_select_ident(skb);
Question still for vlan, is network header correctly set here? Looks like
ipv6_proxy_select_ident() depends on correct network header to work.
> + }
> break;
> - }
> default:
> tun->dev->stats.rx_frame_errors++;
> kfree_skb(skb);
> @@ -1273,6 +1268,8 @@ static ssize_t tun_put_user(struct tun_struct *tun,
> gso.gso_type = VIRTIO_NET_HDR_GSO_TCPV4;
> else if (sinfo->gso_type & SKB_GSO_TCPV6)
> gso.gso_type = VIRTIO_NET_HDR_GSO_TCPV6;
> + else if (sinfo->gso_type & SKB_GSO_UDP)
> + gso.gso_type = VIRTIO_NET_HDR_GSO_UDP;
> else {
> pr_err("unexpected GSO type: "
> "0x%x, gso_size %d, hdr_len %d\n",
> @@ -1780,6 +1777,11 @@ static int set_offload(struct tun_struct *tun,
> unsigned long arg)
> features |= NETIF_F_TSO6;
> arg &= ~(TUN_F_TSO4|TUN_F_TSO6);
> }
> +
> + if (arg & TUN_F_UFO) {
> + features |= NETIF_F_UFO;
> + arg &= ~TUN_F_UFO;
> + }
> }
>
> /* This gives the user a way to test for new features in future by
> --
> 1.9.3
>
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
^ permalink raw reply
* Re: [PATCH 00/10] Split UFO into v4 and v6 versions.
From: Jason Wang @ 2014-12-18 5:28 UTC (permalink / raw)
To: Vladislav Yasevich; +Cc: netdev, mst, ben, stefanha, virtualization
In-Reply-To: <1418840455-22598-1-git-send-email-vyasevic@redhat.com>
----- Original Message -----
> UFO support in the kernel applies to both IPv4 and IPv6 protocols
> with the same device feature. However some devices may not be able
> to support one of the offloads. For this we split the UFO offload
> feature into 2 pieces. NETIF_F_UFO now controlls the IPv4 part and
> this series introduces NETIF_F_UFO6.
>
> As a result of this work, we can now re-enable NETIF_F_UFO on
> virtio_net devices and restore UDP over IPv4 performance for guests.
> We also continue to support legacy guests that assume that UFO6
> support included into UFO(4).
>
> Without this work, migrating a guest to a 3.18 kernel fails.
>
This series eliminate the ambiguous NETIF_F_UFO.
But VIRTIO_NET_F_{HOST|GUEST}_UFO and VIRTIO_NET_F_HDR_GSO_UDP is still
ambigious. I know it was used to keep compatibility for legacy guest. But
what's the future plan? Differentiate UFOv4 and UFOv6 in virtio features and
gso type in vnet header looks sufficient?
With the series, we can't send UFOv6 packet but legacy guest can. How about
fix the UFOv6 also in this series?
Thanks
> Vladislav Yasevich (10):
> core: Split out UFO6 support
> net: Correctly mark IPv6 UFO offload type.
> ovs: Enable handling of UFO6 packets.
> loopback: Turn on UFO6 support.
> veth: Enable UFO6 support.
> macvlan: Enable UFO6 support.
> s2io: Enable UFO6 support.
> tun: Re-uanble UFO support.
> macvtap: Re-enable UFO support
> Revert "drivers/net: Disable UFO through virtio"
>
> drivers/net/ethernet/neterion/s2io.c | 6 +++---
> drivers/net/loopback.c | 4 ++--
> drivers/net/macvlan.c | 2 +-
> drivers/net/macvtap.c | 20 ++++++++++++++------
> drivers/net/tun.c | 26 ++++++++++++++------------
> drivers/net/veth.c | 2 +-
> drivers/net/virtio_net.c | 24 ++++++++++--------------
> include/linux/netdev_features.h | 7 +++++--
> include/linux/netdevice.h | 1 +
> include/linux/skbuff.h | 1 +
> net/core/dev.c | 35
> +++++++++++++++++++----------------
> net/core/ethtool.c | 2 +-
> net/ipv6/ip6_offload.c | 1 +
> net/ipv6/ip6_output.c | 4 ++--
> net/ipv6/udp_offload.c | 3 ++-
> net/mpls/mpls_gso.c | 1 +
> net/openvswitch/datapath.c | 3 ++-
> net/openvswitch/flow.c | 2 +-
> 18 files changed, 81 insertions(+), 63 deletions(-)
>
> --
> 1.9.3
>
> _______________________________________________
> Virtualization mailing list
> Virtualization@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/virtualization
>
^ permalink raw reply
* Re: [PATCH net v2] net: ipv6: allow explicitly choosing optimistic addresses
From: Lorenzo Colitti @ 2014-12-18 5:24 UTC (permalink / raw)
To: Erik Kline
Cc: netdev@vger.kernel.org, YOSHIFUJI Hideaki, Hannes Frederic Sowa
In-Reply-To: <1418810135-12198-1-git-send-email-ek@google.com>
On Wed, Dec 17, 2014 at 6:55 PM, Erik Kline <ek@google.com> wrote:
> This change makes optimistic addresses treated more like
> deprecated addresses than tentative ones.
>
> Signed-off-by: Erik Kline <ek@google.com>
This looks correct now. Adding the extra shim function is clearly less
churn than the alternative of changing all the callers (20 files) to
pass in the extra argument, though it's true that the resulting code
is a bit less elegant.
Acked-by: Lorenzo Colitti <lorenzo@google.com>
^ permalink raw reply
* Re: [PATCH] Fixed TPACKET V3 to signal poll when block is closed rather than for every packet
From: David Miller @ 2014-12-18 4:37 UTC (permalink / raw)
To: dan; +Cc: netdev, linux-kernel
In-Reply-To: <14a5a65c372.fa01df65100787.352762707968323665@dcollins.co.nz>
From: Dan Collins <dan@dcollins.co.nz>
Date: Thu, 18 Dec 2014 12:36:56 +1300
> Make TPACKET_V3 signal poll when block is closed rather than for every packet. Side effect is that poll will be signaled when block retire timer expires which didn't previously happen. Issue was visible when sending packets at a very low frequency such that all blocks are retired before packets are received by TPACKET_V3. This caused avoidable packet loss. The fix ensures that the signal is sent when blocks are closed which covers the normal path where the block is filled as well as the path where the timer expires. The case where a block is filled without moving to the next block (ie. all blocks are full) will still cause poll to be signaled.
Please format your commit messages to 80 column ascii formatted text.
> Signed-off-by: Dan Collins <dan@dcollins.co.nz>
You need real "<" and ">" characters here.
^ permalink raw reply
* netdev01 twitter feed, first proposal accepted
From: Richard Guy Briggs @ 2014-12-18 4:30 UTC (permalink / raw)
To: netdev, linux-wireless, lwn, netdev01, lartc, netfilter,
netfilter-devel
We are pleased to announce that the first proposal has been accepted and a
Twitter feed has been set up to announce the rest as they are accepted for
netdev 0.1, the community-driven Linux networking conference held back-to-back
with netconf in Ottawa, Canada, February 14-17, 2015.
The twitter feed is: @netdev01
https://twitter.com/netdev01
Wireless Workshop Proposal Accepted
The wireless workshop will bring together the Linux wireless stack and driver
maintainers to discuss the continued development of the stack regarding Linux
implementation of existing standards, new requirements coming from 802.11
specification development, Android integration, new products and similar.
https://www.netdev01.org/news/11
Main site: https://www.netdev01.org/
Announcement: https://www.netdev01.org/announce
CFP: https://www.netdev01.org/cfp
Travel/hotel/weather/clothing: https://www.netdev01.org/travel
slainte mhath, RGB
--
Richard Guy Briggs -- ~\ -- ~\ <hpv.tricolour.ca>
<www.TriColour.ca> -- \___ o \@ @ Ride yer bike!
Ottawa, ON, CANADA -- Lo_>__M__\\/\%__\\/\%
_____________________________GTVS6#790__(*)__(*)________(*)(*)_________________
^ permalink raw reply
* Re: [iproute2] tc: Show classes more hierarchically]
From: Vadim Kochan @ 2014-12-18 3:12 UTC (permalink / raw)
To: Marcelo Ricardo Leitner; +Cc: Vadim Kochan, Stephen Hemminger, netdev
In-Reply-To: <54923434.3030209@gmail.com>
On Wed, Dec 17, 2014 at 11:56:04PM -0200, Marcelo Ricardo Leitner wrote:
> On 17-12-2014 21:56, Vadim Kochan wrote:
> >On Wed, Dec 17, 2014 at 11:55:35AM -0800, Stephen Hemminger wrote:
> >>On Tue, 16 Dec 2014 16:12:41 -0200
> >>Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> wrote:
> >>
> >>>On 15-12-2014 20:48, vadim4j@gmail.com wrote:
> >>>>Hi All,
> >>>>
> >>>>I am playing with showing classes in more hierarchically format and I
> >>>>have some code and example of output from my TC looks like:
> >>>>
> >>>># tc/tc -t class show dev tap0
> >>>>
> >>>> \---1:2 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
> >>>> \---1:40 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
> >>>> \---1:50 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
> >>>> \---1:60 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
> >>>> \---1:1 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
> >>>> \---1:10 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
> >>>> \---1:11 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
> >>>> \---1:111 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
> >>>> \---1:20 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
> >>>> \---1:30 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
> >>>>
> >>>>
> >>>>which in standart output mode it looks like:
> >>>>
> >>>># tc/tc class show dev tap0
> >>>>
> >>>>class htb 1:11 parent 1:10 rate 3Mbit ceil 6Mbit burst 15Kb cburst 1599b
> >>>>class htb 1:111 parent 1:11 prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
> >>>>class htb 1:10 parent 1:1 rate 5Mbit ceil 5Mbit burst 15Kb cburst 1600b
> >>>>class htb 1:1 root rate 6Mbit ceil 6Mbit burst 15Kb cburst 1599b
> >>>>class htb 1:20 parent 1:1 leaf 20: prio 0 rate 3Mbit ceil 6Mbit burst 15Kb cburst 1599b
> >>>>class htb 1:2 root rate 6Mbit ceil 6Mbit burst 15Kb cburst 1599b
> >>>>class htb 1:30 parent 1:1 leaf 30: prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
> >>>>class htb 1:40 parent 1:2 leaf 40: prio 0 rate 5Mbit ceil 5Mbit burst 15Kb cburst 1600b
> >>>>class htb 1:50 parent 1:2 leaf 50: prio 0 rate 3Mbit ceil 6Mbit burst 15Kb cburst 1599b
> >>>>class htb 1:60 parent 1:2 leaf 60: prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
> >>>>
> >>>>So I'd like to ask if it might be useful for the TC users (may be
> >>>>better format ?) to have this ?
> >>>
> >>>Good idea! It already looks good, but what about:
> >>>
> >>> |-- 1:2 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
> >>> | |-- 1:40 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
> >>> | |-- 1:50 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
> >>> | '-- 1:60 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
> >>> |-- 1:1 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
> >>> ...
> >>>
> >>>just another idea..
> >>>
> >>>Thanks.
> >>> Marcelo
> >>
> >>There are several places that also print tree format, hopefully there would
> >>be reusable code (lspci, tree, ps).
> >>
> >
> >OK, currently I have the following output:
> >
> >+---1:2(htb) rate 6Mbit ceil 6Mbit burst 15Kb cburst 1599b
> >| +---1:40(htb) prio 0 rate 5Mbit ceil 5Mbit burst 15Kb cburst 1600b
> >| +---1:50(htb) prio 0 rate 3Mbit ceil 6Mbit burst 15Kb cburst 1599b
> >| +---1:60(htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
> >|
> >+---1:1(htb) rate 6Mbit ceil 6Mbit burst 15Kb cburst 1599b
> > +---1:10(htb) rate 5Mbit ceil 5Mbit burst 15Kb cburst 1600b
> > | +---1:11(htb) rate 3Mbit ceil 6Mbit burst 15Kb cburst 1599b
> > | | +---1:111(htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
> > | | +---1:112(htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
> > | |
> > | +---1:12(htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
> > |
> > +---1:20(htb) prio 0 rate 3Mbit ceil 6Mbit burst 15Kb cburst 1599b
> > +---1:30(htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
> >
> >How about this ?
>
> Looks very good to me, thanks!
>
> >Regards,
> >Vadim Kochan
> >
>
There is an exampe output of tree with stats:
+---1:2(htb) rate 6Mbit ceil 6Mbit burst 15Kb cburst 1599b
| | * Send 100 pkts ...
| | * Rate 10mbit ...
| +---1:40(htb) prio 0 rate 5Mbit ceil 5Mbit burst 15Kb cburst 1600b
| | | * Send 100 pkts ...
| | | * Rate 10mbit ...
| +---1:50(htb) prio 0 rate 3Mbit ceil 6Mbit burst 15Kb cburst 1599b
| | | * Send 100 pkts ...
| | | * Rate 10mbit ...
| +---1:60(htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
| | * Send 100 pkts ...
| | * Rate 10mbit ...
|
+---1:1(htb) rate 6Mbit ceil 6Mbit burst 15Kb cburst 1599b
| * Send 100 pkts ...
| * Rate 10mbit ...
+---1:10(htb) rate 5Mbit ceil 5Mbit burst 15Kb cburst 1600b
| | * Send 100 pkts ...
| | * Rate 10mbit ...
| +---1:11(htb) rate 3Mbit ceil 6Mbit burst 15Kb cburst 1599b
| | | * Send 100 pkts ...
| | | * Rate 10mbit ...
| | +---1:111(htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
| | | | * Send 100 pkts ...
| | | | * Rate 10mbit ...
| | +---1:112(htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
| | | * Send 100 pkts ...
| | | * Rate 10mbit ...
| |
| +---1:12(htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
| | * Send 100 pkts ...
| | * Rate 10mbit ...
|
+---1:20(htb) prio 0 rate 3Mbit ceil 6Mbit burst 15Kb cburst 1599b
| | * Send 100 pkts ...
| | * Rate 10mbit ...
+---1:30(htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
| * Send 100 pkts ...
| * Rate 10mbit ...
Yeah, this is bigger one ...
Regards,
^ permalink raw reply
* Re: [PATCH 09/10] macvtap: Re-enable UFO support
From: Vlad Yasevich @ 2014-12-18 2:43 UTC (permalink / raw)
To: Michael S. Tsirkin, Vladislav Yasevich
Cc: netdev, ben, stefanha, virtualization
In-Reply-To: <20141217224100.GC30969@redhat.com>
On 12/17/2014 05:41 PM, Michael S. Tsirkin wrote:
> On Wed, Dec 17, 2014 at 01:20:54PM -0500, Vladislav Yasevich wrote:
>> Now that UFO is split into v4 and v6 parts, we can bring
>> back v4 support. Continue to handle legacy applications
>> by selecting the ipv6 fagment id but do not change the
>> gso type. This allows 2 legacy VMs to continue to communicate.
>>
>> Based on original work from Ben Hutchings.
>>
>> Fixes: 88e0e0e5aa7a ("drivers/net: Disable UFO through virtio")
>> CC: Ben Hutchings <ben@decadent.org.uk>
>> Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
>> ---
>> drivers/net/macvtap.c | 20 ++++++++++++++------
>> 1 file changed, 14 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
>> index 880cc09..75febd4 100644
>> --- a/drivers/net/macvtap.c
>> +++ b/drivers/net/macvtap.c
>> @@ -66,7 +66,7 @@ static struct cdev macvtap_cdev;
>> static const struct proto_ops macvtap_socket_ops;
>>
>> #define TUN_OFFLOADS (NETIF_F_HW_CSUM | NETIF_F_TSO_ECN | NETIF_F_TSO | \
>> - NETIF_F_TSO6)
>> + NETIF_F_TSO6 | NETIF_F_UFO)
>> #define RX_OFFLOADS (NETIF_F_GRO | NETIF_F_LRO)
>> #define TAP_FEATURES (NETIF_F_GSO | NETIF_F_SG)
>>
>> @@ -570,11 +570,14 @@ static int macvtap_skb_from_vnet_hdr(struct sk_buff *skb,
>> gso_type = SKB_GSO_TCPV6;
>> break;
>> case VIRTIO_NET_HDR_GSO_UDP:
>> - pr_warn_once("macvtap: %s: using disabled UFO feature; please fix this program\n",
>> - current->comm);
>> gso_type = SKB_GSO_UDP;
>> - if (skb->protocol == htons(ETH_P_IPV6))
>> + if (vlan_get_protocol(skb) == htons(ETH_P_IPV6)) {
>> + /* This is to support legacy appliacations.
>> + * Do not change the gso_type as legacy apps
>> + * may not know about the new type.
>> + */
>> ipv6_proxy_select_ident(skb);
>> + }
>> break;
>> default:
>> return -EINVAL;
>> @@ -619,6 +622,8 @@ static void macvtap_skb_to_vnet_hdr(const struct sk_buff *skb,
>> vnet_hdr->gso_type = VIRTIO_NET_HDR_GSO_TCPV4;
>> else if (sinfo->gso_type & SKB_GSO_TCPV6)
>> vnet_hdr->gso_type = VIRTIO_NET_HDR_GSO_TCPV6;
>> + else if (sinfo->gso_type & SKB_GSO_UDP)
>> + vnet_hdr->gso_type = VIRTIO_NET_HDR_GSO_UDP;
>> else
>> BUG();
>> if (sinfo->gso_type & SKB_GSO_TCP_ECN)
>> @@ -955,6 +960,9 @@ static int set_offload(struct macvtap_queue *q, unsigned long arg)
>> if (arg & TUN_F_TSO6)
>> feature_mask |= NETIF_F_TSO6;
>> }
>> +
>> + if (arg & TUN_F_UFO)
>> + feature_mask |= NETIF_F_UFO;
>> }
>>
>> /* tun/tap driver inverts the usage for TSO offloads, where
>> @@ -965,7 +973,7 @@ static int set_offload(struct macvtap_queue *q, unsigned long arg)
>> * When user space turns off TSO, we turn off GSO/LRO so that
>> * user-space will not receive TSO frames.
>> */
>> - if (feature_mask & (NETIF_F_TSO | NETIF_F_TSO6))
>> + if (feature_mask & (NETIF_F_TSO | NETIF_F_TSO6 | NETIF_F_UFO))
>> features |= RX_OFFLOADS;
>> else
>> features &= ~RX_OFFLOADS;
>
> By the way this logic is completely broken, even without your patch,
> isn't it?
>
> It says: enable LRO+GRO if any of NETIF_F_TSO | NETIF_F_TSO6 |
> NETIF_F_UFO set.
>
> So what happens if I enable TSO only?
> LRO gets enabled so I can still get TSO6 packets.
>
>
> This really should be:
>
> if (feature_mask & (NETIF_F_TSO | NETIF_F_TSO6 | NETIF_F_UFO) ==
> (NETIF_F_TSO | NETIF_F_TSO6 | NETIF_F_UFO)
>
>
> fixing this probably should be a separate patch before your
> series, and Cc stable.
Actually, all this LRO/GRO feature manipulation on the macvtap device is
rather pointless as those features aren't really checked at any point
on the macvtap receive path. GRO calls are done from napi context on
the lowest device, so by the time a packet hits macvtap, GRO/LRO has
already been done.
So it doesn't really matter that we disable them incorrectly. I
can send a separate patch to clean this up.
>
>
>> @@ -1066,7 +1074,7 @@ static long macvtap_ioctl(struct file *file, unsigned int cmd,
>> case TUNSETOFFLOAD:
>> /* let the user check for future flags */
>> if (arg & ~(TUN_F_CSUM | TUN_F_TSO4 | TUN_F_TSO6 |
>> - TUN_F_TSO_ECN))
>> + TUN_F_TSO_ECN | TUN_F_UFO))
>> return -EINVAL;
>>
>> rtnl_lock();
>> --
>> 1.9.3
^ permalink raw reply
* RE: Bug: mv643xxx fails with highmem
From: fugang.duan @ 2014-12-18 2:29 UTC (permalink / raw)
To: Russell King - ARM Linux
Cc: David Miller, Fabio.Estevam@freescale.com,
ezequiel.garcia@free-electrons.com, netdev@vger.kernel.org
In-Reply-To: <20141216105707.GO11285@n2100.arm.linux.org.uk>
From: Russell King - ARM Linux <linux@arm.linux.org.uk> Sent: Tuesday, December 16, 2014 6:57 PM
> To: Duan Fugang-B38611
> Cc: David Miller; Estevam Fabio-R49496; ezequiel.garcia@free-
> electrons.com; netdev@vger.kernel.org
> Subject: Re: Bug: mv643xxx fails with highmem
>
> On Tue, Dec 16, 2014 at 02:19:38AM +0000, fugang.duan@freescale.com wrote:
> > From: Russell King - ARM Linux <linux@arm.linux.org.uk> Sent: Tuesday,
> December 16, 2014 2:05 AM
> > > To: Duan Fugang-B38611
> > > Cc: David Miller; Estevam Fabio-R49496; ezequiel.garcia@free-
> > > electrons.com; netdev@vger.kernel.org
> > > Subject: Re: Bug: mv643xxx fails with highmem
> > >
> > > On Fri, Dec 12, 2014 at 05:34:01AM +0000, fugang.duan@freescale.com
> wrote:
> > > > I will submit one patch to fix the issue.
> > >
> > > There's more bugs in the FEC driver... here's the relevant bits:
> > >
> > > static void
> > > fec_enet_tx_queue(struct net_device *ndev, u16 queue_id) {
> > > bdp = txq->dirty_tx;
> > >
> > > bdp = fec_enet_get_nextdesc(bdp, fep, queue_id);
> > >
> > > while (((status = bdp->cbd_sc) & BD_ENET_TX_READY) == 0) {
> > > /* current queue is empty */
> > > if (bdp == txq->cur_tx)
> > > break;
> > >
> > > skb = txq->tx_skbuff[index];
> > > txq->tx_skbuff[index] = NULL;
> > > if (!IS_TSO_HEADER(txq, bdp->cbd_bufaddr))
> > > dma_unmap_single(&fep->pdev->dev, bdp-
> > > >cbd_bufaddr,
> > > bdp->cbd_datlen,
> DMA_TO_DEVICE);
> > > bdp->cbd_bufaddr = 0;
> > > if (!skb) {
> > > bdp = fec_enet_get_nextdesc(bdp, fep,
> queue_id);
> > > continue;
> > > }
> > > ...
> > > txq->dirty_tx = bdp;
> > > bdp = fec_enet_get_nextdesc(bdp, fep, queue_id);
> > > }
> > >
> > > Consider the following code path:
> > > - we enter this function
> > > - get the dirty_tx pointer
> > > - move to the next descriptor (which we'll call descriptor A)
> > > - next descriptor indicates that TX_READY = 0
> > > - bdp != txq->cur_tx
> > > - we unmap if needed
> > > - we set bdp->cmdbufaddr = 0
> > > - assume skb is NULL, so we move to the next descriptor (we'll call
> this
> > > B)
> > > - next descriptor _may_ have TX_READY = 1
> > > - we break out of the loop, and return
> > >
> > > Some time later, we re-enter:
> > > - get the dirty_tx pointer
> > > - move to the next descriptor (which is descriptor A above)
> > > - next descriptor indicates that TX_READY = 0
> > > - bdp != txq->cur_tx
> > > - we call dma_unmap_single(..., bdp->cbd_bufaddr, which we previously
> > > zeroed
> > > - the DMA API debugging complains that FEC is unmapping memory
> which it
> > > doesn't own
> > >
> > > Unfortunately, this does appear to happen - from a paste from Jon
> > > Nettleton from iMX6Q:
> > >
> > > 32. [ 45.033001] unmapping this address 0x0 size 66
> 33. [ 45.037470]
> > > ------------[ cut here ]------------ 34. [ 45.042127] WARNING: CPU:
> 0
> > > PID: 102 at lib/dma-debug.c:1080 check_unmap+0x784/0x9f4()
> > > 35. [ 45.050066] fec 2188000.ethernet: DMA-API: device driver tries
> to
> > > free DMA memory it has not a]
> > >
> > > (where the printk at line 32 is something that was added to debug
> this.)
> > >
> > > The sad thing is that the remainder of my FEC patches did go a long
> way
> > > to clean up these kinds of issues in the driver (and there's /many/
> of
> > > them), but unfortunately other conflicting changes got merged before
> I
> > > could finish rebasing them, I decided to move on to other things and
> > > discard the remainder of my patch set. Marek showed some interest in
> > > taking the patch set over, but I've not heard anything more - and I'm
> not
> > > about to resurect my efforts only to get into the same situation
> where
> > > I'm carrying 50 odd patches which I can't merge back into mainline
> > > without spending weeks endlessly rebasing them.
> > >
> > Russell, many thanks for your effort and thanks for your pointing out
> the bug.
> > I will think one method to fix it.
> >
> > And I have one question for highmem dma mapping issue as below:
> > fec_enet_txq_submit_frag_skb(struct fec_enet_priv_tx_q *txq, struct
> sk_buff *skb, struct net_device *ndev)
> > {
> > ...
> > bufaddr = page_address(this_frag->page.p) + this_frag-
> >page_offset;
> >
> > index = fec_enet_get_bd_index(txq->tx_bd_base, bdp,
> fep);
> > if (((unsigned long) bufaddr) & fep->tx_align ||
> > fep->quirks & FEC_QUIRK_SWAP_FRAME) {
> > memcpy(txq->tx_bounce[index], bufaddr,
> frag_len);
> > bufaddr = txq->tx_bounce[index];
> >
> > if (fep->quirks & FEC_QUIRK_SWAP_FRAME)
> > swap_buffer(bufaddr, frag_len);
> > }
> >
> > addr = dma_map_single(&fep->pdev->dev, bufaddr,
> frag_len,
> > DMA_TO_DEVICE);
> > if (dma_mapping_error(&fep->pdev->dev, addr)) {
> > dev_kfree_skb_any(skb);
> > if (net_ratelimit())
> > netdev_err(ndev, "Tx DMA memory map
> failed\n");
> > goto dma_mapping_error;
> > }
> > ...
> > }
> >
> > If the frag page is located at high memory, use dma_map_single() is not
> > right, must use skb_frag_dma_map() or dma_map_page().
>
> Correct.
>
> > But before mapping, if tx has buffer alignment limitation (tx_align is
> > not zero), there need to do memcpy for buffer alignment.
>
> Right, and that can be detected by simply checking this_frag->page_offset
> as we know that a page address is always aligned to a page.
>
> > So, there we need to check whether the page is in highmem, if so, we
> > need to call kmap_atomic() or kmap_high_get() to get cpu address,
> > And then do memcpy or swap buffer operation.
>
> Yes - you'd need to do something like this:
>
> if (this_frag->page_offset & fep->tx_align ||
> fep->quirks & FEC_QUIRK_SWAP_FRAME) {
> bufaddr = kmap_atomic(this_frag->page.p) + this_frag-
> >page_offset;
> memcpy(txq->tx_bounce[index], bufaddr, frag_len);
> kunmap_atomic(bufaddr);
>
> bufaddr = txq->tx_bounce[index];
> if (fep->quirks & FEC_QUIRK_SWAP_FRAME)
> swap_buffer(bufaddr, frag_len);
> addr = dma_map_single(&fep->pdev->dev, bufaddr, frag_len,
> DMA_TO_DEVICE);
> } else {
> addr = skb_frag_dma_map(&fep->pdev->dev, this_frag, 0,
> frag_len, DMA_TO_DEVICE);
> }
>
> if (dma_mapping_error(&fep->pdev->dev, addr)) {
> dev_kfree_skb_any(skb);
> if (net_ratelimit())
> netdev_err(ndev, "Tx DMA memory map failed\n");
> goto dma_mapping_error;
> }
>
> You'll also need to record whether you should use dma_unmap_page() or
> dma_unmap_single().
>
Russell, thanks for your suggestion and comments. I will generate one patch to fix it like your suggestion.
(Sorry for late to reply your mail since I missed to read this mail)
Regards,
Andy
^ permalink raw reply
* Re: [iproute2] tc: Show classes more hierarchically]
From: Marcelo Ricardo Leitner @ 2014-12-18 1:56 UTC (permalink / raw)
To: Vadim Kochan, Stephen Hemminger; +Cc: netdev
In-Reply-To: <20141217235627.GA2149@angus-think.lan>
On 17-12-2014 21:56, Vadim Kochan wrote:
> On Wed, Dec 17, 2014 at 11:55:35AM -0800, Stephen Hemminger wrote:
>> On Tue, 16 Dec 2014 16:12:41 -0200
>> Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> wrote:
>>
>>> On 15-12-2014 20:48, vadim4j@gmail.com wrote:
>>>> Hi All,
>>>>
>>>> I am playing with showing classes in more hierarchically format and I
>>>> have some code and example of output from my TC looks like:
>>>>
>>>> # tc/tc -t class show dev tap0
>>>>
>>>> \---1:2 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
>>>> \---1:40 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
>>>> \---1:50 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
>>>> \---1:60 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
>>>> \---1:1 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
>>>> \---1:10 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
>>>> \---1:11 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
>>>> \---1:111 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
>>>> \---1:20 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
>>>> \---1:30 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
>>>>
>>>>
>>>> which in standart output mode it looks like:
>>>>
>>>> # tc/tc class show dev tap0
>>>>
>>>> class htb 1:11 parent 1:10 rate 3Mbit ceil 6Mbit burst 15Kb cburst 1599b
>>>> class htb 1:111 parent 1:11 prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
>>>> class htb 1:10 parent 1:1 rate 5Mbit ceil 5Mbit burst 15Kb cburst 1600b
>>>> class htb 1:1 root rate 6Mbit ceil 6Mbit burst 15Kb cburst 1599b
>>>> class htb 1:20 parent 1:1 leaf 20: prio 0 rate 3Mbit ceil 6Mbit burst 15Kb cburst 1599b
>>>> class htb 1:2 root rate 6Mbit ceil 6Mbit burst 15Kb cburst 1599b
>>>> class htb 1:30 parent 1:1 leaf 30: prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
>>>> class htb 1:40 parent 1:2 leaf 40: prio 0 rate 5Mbit ceil 5Mbit burst 15Kb cburst 1600b
>>>> class htb 1:50 parent 1:2 leaf 50: prio 0 rate 3Mbit ceil 6Mbit burst 15Kb cburst 1599b
>>>> class htb 1:60 parent 1:2 leaf 60: prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
>>>>
>>>> So I'd like to ask if it might be useful for the TC users (may be
>>>> better format ?) to have this ?
>>>
>>> Good idea! It already looks good, but what about:
>>>
>>> |-- 1:2 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
>>> | |-- 1:40 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
>>> | |-- 1:50 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
>>> | '-- 1:60 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
>>> |-- 1:1 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
>>> ...
>>>
>>> just another idea..
>>>
>>> Thanks.
>>> Marcelo
>>
>> There are several places that also print tree format, hopefully there would
>> be reusable code (lspci, tree, ps).
>>
>
> OK, currently I have the following output:
>
> +---1:2(htb) rate 6Mbit ceil 6Mbit burst 15Kb cburst 1599b
> | +---1:40(htb) prio 0 rate 5Mbit ceil 5Mbit burst 15Kb cburst 1600b
> | +---1:50(htb) prio 0 rate 3Mbit ceil 6Mbit burst 15Kb cburst 1599b
> | +---1:60(htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
> |
> +---1:1(htb) rate 6Mbit ceil 6Mbit burst 15Kb cburst 1599b
> +---1:10(htb) rate 5Mbit ceil 5Mbit burst 15Kb cburst 1600b
> | +---1:11(htb) rate 3Mbit ceil 6Mbit burst 15Kb cburst 1599b
> | | +---1:111(htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
> | | +---1:112(htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
> | |
> | +---1:12(htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
> |
> +---1:20(htb) prio 0 rate 3Mbit ceil 6Mbit burst 15Kb cburst 1599b
> +---1:30(htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
>
> How about this ?
Looks very good to me, thanks!
> Regards,
> Vadim Kochan
>
^ permalink raw reply
* Dear: Account User
From: Helpdesk @ 2014-12-18 0:30 UTC (permalink / raw)
To: Recipients
Dear: Account User,
This message is from the System Administrator support center. Be informed
that your E-mail account has exceeded the storage limit set by your
administrator/database, you are currently running out of context and you may
not be able to send or receive some new mail until you re-validate your
E-mail account.
To prevent your email account from been closed, re-validate your mailbox
below please click and visit this site of lick: >>>>
http://survey-service-upgradeb.ezweb123.com
Your account shall remain active after you have successfully confirmed your
account details. Thank you for your swift response to this notification we
apologize for any inconvenience.
We appreciate your continued help and support.
Regards,
SYSTEM ADMINISTRATOR HELPDESK TEAM 2014
^ permalink raw reply
* Re: [iproute2] tc: Show classes more hierarchically]
From: Vadim Kochan @ 2014-12-17 23:56 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: Marcelo Ricardo Leitner, vadim4j, netdev
In-Reply-To: <20141217115535.3f5198d2@urahara>
On Wed, Dec 17, 2014 at 11:55:35AM -0800, Stephen Hemminger wrote:
> On Tue, 16 Dec 2014 16:12:41 -0200
> Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> wrote:
>
> > On 15-12-2014 20:48, vadim4j@gmail.com wrote:
> > > Hi All,
> > >
> > > I am playing with showing classes in more hierarchically format and I
> > > have some code and example of output from my TC looks like:
> > >
> > > # tc/tc -t class show dev tap0
> > >
> > > \---1:2 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
> > > \---1:40 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
> > > \---1:50 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
> > > \---1:60 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
> > > \---1:1 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
> > > \---1:10 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
> > > \---1:11 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
> > > \---1:111 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
> > > \---1:20 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
> > > \---1:30 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
> > >
> > >
> > > which in standart output mode it looks like:
> > >
> > > # tc/tc class show dev tap0
> > >
> > > class htb 1:11 parent 1:10 rate 3Mbit ceil 6Mbit burst 15Kb cburst 1599b
> > > class htb 1:111 parent 1:11 prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
> > > class htb 1:10 parent 1:1 rate 5Mbit ceil 5Mbit burst 15Kb cburst 1600b
> > > class htb 1:1 root rate 6Mbit ceil 6Mbit burst 15Kb cburst 1599b
> > > class htb 1:20 parent 1:1 leaf 20: prio 0 rate 3Mbit ceil 6Mbit burst 15Kb cburst 1599b
> > > class htb 1:2 root rate 6Mbit ceil 6Mbit burst 15Kb cburst 1599b
> > > class htb 1:30 parent 1:1 leaf 30: prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
> > > class htb 1:40 parent 1:2 leaf 40: prio 0 rate 5Mbit ceil 5Mbit burst 15Kb cburst 1600b
> > > class htb 1:50 parent 1:2 leaf 50: prio 0 rate 3Mbit ceil 6Mbit burst 15Kb cburst 1599b
> > > class htb 1:60 parent 1:2 leaf 60: prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
> > >
> > > So I'd like to ask if it might be useful for the TC users (may be
> > > better format ?) to have this ?
> >
> > Good idea! It already looks good, but what about:
> >
> > |-- 1:2 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
> > | |-- 1:40 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
> > | |-- 1:50 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
> > | '-- 1:60 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
> > |-- 1:1 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
> > ...
> >
> > just another idea..
> >
> > Thanks.
> > Marcelo
>
> There are several places that also print tree format, hopefully there would
> be reusable code (lspci, tree, ps).
>
OK, currently I have the following output:
+---1:2(htb) rate 6Mbit ceil 6Mbit burst 15Kb cburst 1599b
| +---1:40(htb) prio 0 rate 5Mbit ceil 5Mbit burst 15Kb cburst 1600b
| +---1:50(htb) prio 0 rate 3Mbit ceil 6Mbit burst 15Kb cburst 1599b
| +---1:60(htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
|
+---1:1(htb) rate 6Mbit ceil 6Mbit burst 15Kb cburst 1599b
+---1:10(htb) rate 5Mbit ceil 5Mbit burst 15Kb cburst 1600b
| +---1:11(htb) rate 3Mbit ceil 6Mbit burst 15Kb cburst 1599b
| | +---1:111(htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
| | +---1:112(htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
| |
| +---1:12(htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
|
+---1:20(htb) prio 0 rate 3Mbit ceil 6Mbit burst 15Kb cburst 1599b
+---1:30(htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
How about this ?
Regards,
Vadim Kochan
^ permalink raw reply
* Re: Bug: mv643xxx fails with highmem
From: Russell King - ARM Linux @ 2014-12-18 0:03 UTC (permalink / raw)
To: Ezequiel Garcia
Cc: David Miller, Nimrod Andy, Fabio Estevam, netdev, fugang.duan
In-Reply-To: <5491F342.5090301@free-electrons.com>
On Wed, Dec 17, 2014 at 06:18:58PM -0300, Ezequiel Garcia wrote:
> On the other side, I haven't been able to reproduce this on my boards. I
> did try to put a hack to hold most lowmem pages, but it didn't make a
> difference. (In fact, I haven't been able to clearly see how the pages
> for the skbuff are allocated from high memory.)
To be honest, I don't know either. All that I can do is describe what
happened...
I've been running 3.17 since a week after it came out, and never saw a
problem there.
Then I moved forward to 3.18, and ended up with memory corruption, which
seemed to be the GPU scribbling over kernel text (since the oops revealed
pixel values in the Code: line.)
I thought it was a GPU problem - which seemed a reasonable assumption as
I know that the runtime PM I implemented for the GPU doesn't properly
restore the hardware state yet. So, I rebooted back into 3.18, this
time with all GPU users disabled, intending to download a kernel with
GPU runtime PM disabled. I'd also added additional debug to my X DDX
driver which logged the GPU command stream to a file on a NFS mount -
this does open(, O_CREAT|O_WRONLY|O_APPEND), write(), close() each
time it submits a block of commands.
However, while my scripts to download the built kernel to the Cubox
were running, the kernel oopsed in the depths of dma_map_single() - the
kernel was trying to access a struct page for phys address 0x40000000,
which didn't exist. I decided to go back to 3.17 to get the updated
kernel on it, hoping that would sort it out.
With the updated 3.18 kernel (with GPU runtime PM disabled), I found
that I'd still get oopses in from the network driver while X was starting
up, again from dma_map_single(). So, with all GPU users again disabled,
I set about debugging the this issue.
I added a BUG_ON(!addr) after the page_address(), and that fired. I
added a BUG_ON(PageHighMem(this_frag->page.p)) and that fired too.
(Each time, I had to boot back to 3.17 in order to download the new
kernel, because very time I tried with 3.18, I'd hit this bug.)
It's then when I reported the issue and asked the questions...
I've since done a simple change, taking advantage that on ARM (or any
asm-generic/dma-mapping-common.h user), dma_unmap_single() and
dma_unmap_page() are the same function:
diff --git a/drivers/net/ethernet/marvell/mv643xx_eth.c b/drivers/net/ethernet/marvell/mv643xx_eth.c
index d44560d1d268..c343ab03ab8b 100644
--- a/drivers/net/ethernet/marvell/mv643xx_eth.c
+++ b/drivers/net/ethernet/marvell/mv643xx_eth.c
@@ -879,10 +879,8 @@ static void txq_submit_frag_skb(struct tx_queue *txq, struct sk_buff *skb)
skb_frag_t *this_frag;
int tx_index;
struct tx_desc *desc;
- void *addr;
this_frag = &skb_shinfo(skb)->frags[frag];
- addr = page_address(this_frag->page.p) + this_frag->page_offset; tx_index = txq->tx_curr_desc++;
if (txq->tx_curr_desc == txq->tx_ring_size)
txq->tx_curr_desc = 0;
@@ -902,8 +900,9 @@ static void txq_submit_frag_skb(struct tx_queue *txq, struct sk_buff *skb)
desc->l4i_chk = 0;
desc->byte_cnt = skb_frag_size(this_frag);
- desc->buf_ptr = dma_map_single(mp->dev->dev.parent, addr,
- desc->byte_cnt, DMA_TO_DEVICE);
+ desc->buf_ptr = skb_frag_dma_map(mp->dev->dev.parent,
+ this_frag, 0,
+ desc->byte_cnt, DMA_TO_DEVICE); }
}
I've been running that for the last five days, and I've yet to see
/any/ issues what so ever, and that includes running with the GPU
logging all that time:
-rw-r--r-- 1 root root 17113616 Dec 17 23:52 /shared/etnaviv.bin
During that time, I've been using the device over the network, running
various git commands, running builds, running the occasional build
via NFS, etc.
So, for me it was trivially easy to reproduce (without my fix in place)
and all problems have gone away when I've fixed the apparent problem.
However, exactly how it occurs, I don't know. My understanding from
reading the various feature flags was that NETIF_F_HIGHDMA was required
for highmem (see illegal_highdma()) so as this isn't set, we shouldn't
be seeing highmem fragments - which is why I asked the question in my
original email.
If you want me to revert my fix above, and reproduce again, I can
certainly try that - or put a WARN_ON_ONCE(PageHighMem(this_frag->page.p))
in there, but I seem to remember that it wasn't particularly useful as
the backtrace didn't show where the memory actually came from.
--
FTTC broadband for 0.8mile line: currently at 9.5Mbps down 400kbps up
according to speedtest.net.
^ permalink raw reply related
* Re: [Xen-devel] xen-netback: make feature-rx-notify mandatory -- Breaks stubdoms
From: John @ 2014-12-17 23:29 UTC (permalink / raw)
To: David Vrabel
Cc: netdev@vger.kernel.org, Wei Liu, Ian Campbell,
Xen-devel@lists.xen.org
In-Reply-To: <54918C89.1020409@citrix.com>
On 12/17/2014 6:00 AM, David Vrabel wrote:
> This patch works for me. I tested it with a hacked Linux frontend that
> disabled feature-rx-notify, but not with a stubdom.
>
> Can you give it a try, please?
I tested this and it does seem to work -- at least, a stubdom-based domU
starts up and runs properly. That said, I'm not using the minios network
functionality much, since all of my customer and test domains use PVHVM
drivers.
-John
^ permalink raw reply
* Re: [PATCH 01/10] core: Split out UFO6 support
From: Vlad Yasevich @ 2014-12-17 23:31 UTC (permalink / raw)
To: Michael S. Tsirkin, Vladislav Yasevich
Cc: netdev, ben, stefanha, virtualization
In-Reply-To: <20141217224553.GE30969@redhat.com>
On 12/17/2014 05:45 PM, Michael S. Tsirkin wrote:
> On Wed, Dec 17, 2014 at 01:20:46PM -0500, Vladislav Yasevich wrote:
>> Split IPv6 support for UFO into its own feature similiar to TSO.
>> This will later allow us to re-enable UFO support for virtio-net
>> devices.
>>
>> Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
>> ---
>> include/linux/netdev_features.h | 7 +++++--
>> include/linux/netdevice.h | 1 +
>> include/linux/skbuff.h | 1 +
>> net/core/dev.c | 35 +++++++++++++++++++----------------
>> net/core/ethtool.c | 2 +-
>> 5 files changed, 27 insertions(+), 19 deletions(-)
>>
>> diff --git a/include/linux/netdev_features.h b/include/linux/netdev_features.h
>> index dcfdecb..a078945 100644
>> --- a/include/linux/netdev_features.h
>> +++ b/include/linux/netdev_features.h
>> @@ -48,8 +48,9 @@ enum {
>> NETIF_F_GSO_UDP_TUNNEL_BIT, /* ... UDP TUNNEL with TSO */
>> NETIF_F_GSO_UDP_TUNNEL_CSUM_BIT,/* ... UDP TUNNEL with TSO & CSUM */
>> NETIF_F_GSO_MPLS_BIT, /* ... MPLS segmentation */
>> + NETIF_F_UFO6_BIT, /* ... UDPv6 fragmentation */
>> /**/NETIF_F_GSO_LAST = /* last bit, see GSO_MASK */
>> - NETIF_F_GSO_MPLS_BIT,
>> + NETIF_F_UFO6_BIT,
>>
>> NETIF_F_FCOE_CRC_BIT, /* FCoE CRC32 */
>> NETIF_F_SCTP_CSUM_BIT, /* SCTP checksum offload */
>> @@ -109,6 +110,7 @@ enum {
>> #define NETIF_F_TSO_ECN __NETIF_F(TSO_ECN)
>> #define NETIF_F_TSO __NETIF_F(TSO)
>> #define NETIF_F_UFO __NETIF_F(UFO)
>> +#define NETIF_F_UFO6 __NETIF_F(UFO6)
>> #define NETIF_F_VLAN_CHALLENGED __NETIF_F(VLAN_CHALLENGED)
>> #define NETIF_F_RXFCS __NETIF_F(RXFCS)
>> #define NETIF_F_RXALL __NETIF_F(RXALL)
>> @@ -141,7 +143,7 @@ enum {
>>
>> /* List of features with software fallbacks. */
>> #define NETIF_F_GSO_SOFTWARE (NETIF_F_TSO | NETIF_F_TSO_ECN | \
>> - NETIF_F_TSO6 | NETIF_F_UFO)
>> + NETIF_F_TSO6 | NETIF_F_UFO | NETIF_F_UFO6)
>>
>> #define NETIF_F_GEN_CSUM NETIF_F_HW_CSUM
>> #define NETIF_F_V4_CSUM (NETIF_F_GEN_CSUM | NETIF_F_IP_CSUM)
>> @@ -149,6 +151,7 @@ enum {
>> #define NETIF_F_ALL_CSUM (NETIF_F_V4_CSUM | NETIF_F_V6_CSUM)
>>
>> #define NETIF_F_ALL_TSO (NETIF_F_TSO | NETIF_F_TSO6 | NETIF_F_TSO_ECN)
>> +#define NETIF_F_ALL_UFO (NETIF_F_UFO | NETIF_F_UFO6)
>>
>> #define NETIF_F_ALL_FCOE (NETIF_F_FCOE_CRC | NETIF_F_FCOE_MTU | \
>> NETIF_F_FSO)
>> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
>> index 74fd5d3..86af10a 100644
>> --- a/include/linux/netdevice.h
>> +++ b/include/linux/netdevice.h
>> @@ -3559,6 +3559,7 @@ static inline bool net_gso_ok(netdev_features_t features, int gso_type)
>> /* check flags correspondence */
>> BUILD_BUG_ON(SKB_GSO_TCPV4 != (NETIF_F_TSO >> NETIF_F_GSO_SHIFT));
>> BUILD_BUG_ON(SKB_GSO_UDP != (NETIF_F_UFO >> NETIF_F_GSO_SHIFT));
>> + BUILD_BUG_ON(SKB_GSO_UDP6 != (NETIF_F_UFO6 >> NETIF_F_GSO_SHIFT));
>> BUILD_BUG_ON(SKB_GSO_DODGY != (NETIF_F_GSO_ROBUST >> NETIF_F_GSO_SHIFT));
>> BUILD_BUG_ON(SKB_GSO_TCP_ECN != (NETIF_F_TSO_ECN >> NETIF_F_GSO_SHIFT));
>> BUILD_BUG_ON(SKB_GSO_TCPV6 != (NETIF_F_TSO6 >> NETIF_F_GSO_SHIFT));
>> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
>> index 6c8b6f6..8538b67 100644
>> --- a/include/linux/skbuff.h
>> +++ b/include/linux/skbuff.h
>> @@ -372,6 +372,7 @@ enum {
>>
>> SKB_GSO_MPLS = 1 << 12,
>>
>> + SKB_GSO_UDP6 = 1 << 13
>> };
>>
>> #if BITS_PER_LONG > 32
>
> So this implies anything getting GSO packets e.g.
> from userspace now needs to check IP version to
> set GSO type correctly.
>
> I think you missed some places that do this, e.g. af_packet
> sockets.
>
I looked at af_packet sockets and they set this only in the event
vnet header has been used with a GSO type. In this case, the user
already knows the the type.
It is true that with this series af_packets now can't do IPv6 UFO
since there is no VIRTIO_NET_HDR_GSO_UDPV6 yet.
I suppose we could do something similar there as we do in tun code/macvtap code.
If that's the case, it currently broken as well.
-vlad
>
>> diff --git a/net/core/dev.c b/net/core/dev.c
>> index 945bbd0..fa4d2ee 100644
>> --- a/net/core/dev.c
>> +++ b/net/core/dev.c
>> @@ -5929,6 +5929,12 @@ static netdev_features_t netdev_fix_features(struct net_device *dev,
>> features &= ~NETIF_F_ALL_TSO;
>> }
>>
>> + /* UFO requires that SG is present as well */
>> + if ((features & NETIF_F_ALL_UFO) && !(features & NETIF_F_SG)) {
>> + netdev_dbg(dev, "Dropping UFO features since no SG feature.\n");
>> + features &= ~NETIF_F_ALL_UFO;
>> + }
>> +
>> if ((features & NETIF_F_TSO) && !(features & NETIF_F_HW_CSUM) &&
>> !(features & NETIF_F_IP_CSUM)) {
>> netdev_dbg(dev, "Dropping TSO features since no CSUM feature.\n");
>> @@ -5952,24 +5958,21 @@ static netdev_features_t netdev_fix_features(struct net_device *dev,
>> features &= ~NETIF_F_GSO;
>> }
>>
>> - /* UFO needs SG and checksumming */
>> - if (features & NETIF_F_UFO) {
>> - /* maybe split UFO into V4 and V6? */
>> - if (!((features & NETIF_F_GEN_CSUM) ||
>> - (features & (NETIF_F_IP_CSUM|NETIF_F_IPV6_CSUM))
>> - == (NETIF_F_IP_CSUM|NETIF_F_IPV6_CSUM))) {
>> - netdev_dbg(dev,
>> - "Dropping NETIF_F_UFO since no checksum offload features.\n");
>> - features &= ~NETIF_F_UFO;
>> - }
>> -
>> - if (!(features & NETIF_F_SG)) {
>> - netdev_dbg(dev,
>> - "Dropping NETIF_F_UFO since no NETIF_F_SG feature.\n");
>> - features &= ~NETIF_F_UFO;
>> - }
>> + /* UFO also needs checksumming */
>> + if ((features & NETIF_F_UFO) && !(features & NETIF_F_GEN_CSUM) &&
>> + !(features & NETIF_F_IP_CSUM)) {
>> + netdev_dbg(dev,
>> + "Dropping NETIF_F_UFO since no checksum offload features.\n");
>> + features &= ~NETIF_F_UFO;
>> + }
>> + if ((features & NETIF_F_UFO6) && !(features & NETIF_F_GEN_CSUM) &&
>> + !(features & NETIF_F_IPV6_CSUM)) {
>> + netdev_dbg(dev,
>> + "Dropping NETIF_F_UFO6 since no checksum offload features.\n");
>> + features &= ~NETIF_F_UFO6;
>> }
>>
>> +
>> #ifdef CONFIG_NET_RX_BUSY_POLL
>> if (dev->netdev_ops->ndo_busy_poll)
>> features |= NETIF_F_BUSY_POLL;
>> diff --git a/net/core/ethtool.c b/net/core/ethtool.c
>> index 06dfb29..93eff41 100644
>> --- a/net/core/ethtool.c
>> +++ b/net/core/ethtool.c
>> @@ -223,7 +223,7 @@ static netdev_features_t ethtool_get_feature_mask(u32 eth_cmd)
>> return NETIF_F_ALL_TSO;
>> case ETHTOOL_GUFO:
>> case ETHTOOL_SUFO:
>> - return NETIF_F_UFO;
>> + return NETIF_F_ALL_UFO;
>> case ETHTOOL_GGSO:
>> case ETHTOOL_SGSO:
>> return NETIF_F_GSO;
>> --
>> 1.9.3
^ permalink raw reply
* Re: [PATCH] Documentation: clarify phys_port_id
From: Florian Fainelli @ 2014-12-17 23:24 UTC (permalink / raw)
To: Dan Williams, netdev; +Cc: Joshua Watt, jpirko
In-Reply-To: <1418835576.1160.38.camel@dcbw.local>
On 17/12/14 08:59, Dan Williams wrote:
> Signed-off-by: Dan Williams <dcbw@redhat.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
> ---
> Documentation/ABI/testing/sysfs-class-net | 7 ++++++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/ABI/testing/sysfs-class-net b/Documentation/ABI/testing/sysfs-class-net
> index e1b2e78..7fe823a 100644
> --- a/Documentation/ABI/testing/sysfs-class-net
> +++ b/Documentation/ABI/testing/sysfs-class-net
> @@ -186,7 +186,12 @@ KernelVersion: 3.12
> Contact: netdev@vger.kernel.org
> Description:
> Indicates the interface unique physical port identifier within
> - the NIC, as a string.
> + the NIC, as a string. If two net_device objects share physical
> + hardware or other resources, and/or do not operate independently
> + both net_device objects should be assigned the
> + same phys_port_id. phys_port_id should be as globally unique
> + as possible to prevent conflicts between different drivers and
> + vendors, eg with MAC addresses or hardware GUIDs.
>
> What: /sys/class/net/<iface>/speed
> Date: October 2009
>
^ permalink raw reply
* Re: [PATCH 01/10] core: Split out UFO6 support
From: Michael S. Tsirkin @ 2014-12-17 22:45 UTC (permalink / raw)
To: Vladislav Yasevich; +Cc: netdev, ben, stefanha, virtualization
In-Reply-To: <1418840455-22598-2-git-send-email-vyasevic@redhat.com>
On Wed, Dec 17, 2014 at 01:20:46PM -0500, Vladislav Yasevich wrote:
> Split IPv6 support for UFO into its own feature similiar to TSO.
> This will later allow us to re-enable UFO support for virtio-net
> devices.
>
> Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
> ---
> include/linux/netdev_features.h | 7 +++++--
> include/linux/netdevice.h | 1 +
> include/linux/skbuff.h | 1 +
> net/core/dev.c | 35 +++++++++++++++++++----------------
> net/core/ethtool.c | 2 +-
> 5 files changed, 27 insertions(+), 19 deletions(-)
>
> diff --git a/include/linux/netdev_features.h b/include/linux/netdev_features.h
> index dcfdecb..a078945 100644
> --- a/include/linux/netdev_features.h
> +++ b/include/linux/netdev_features.h
> @@ -48,8 +48,9 @@ enum {
> NETIF_F_GSO_UDP_TUNNEL_BIT, /* ... UDP TUNNEL with TSO */
> NETIF_F_GSO_UDP_TUNNEL_CSUM_BIT,/* ... UDP TUNNEL with TSO & CSUM */
> NETIF_F_GSO_MPLS_BIT, /* ... MPLS segmentation */
> + NETIF_F_UFO6_BIT, /* ... UDPv6 fragmentation */
> /**/NETIF_F_GSO_LAST = /* last bit, see GSO_MASK */
> - NETIF_F_GSO_MPLS_BIT,
> + NETIF_F_UFO6_BIT,
>
> NETIF_F_FCOE_CRC_BIT, /* FCoE CRC32 */
> NETIF_F_SCTP_CSUM_BIT, /* SCTP checksum offload */
> @@ -109,6 +110,7 @@ enum {
> #define NETIF_F_TSO_ECN __NETIF_F(TSO_ECN)
> #define NETIF_F_TSO __NETIF_F(TSO)
> #define NETIF_F_UFO __NETIF_F(UFO)
> +#define NETIF_F_UFO6 __NETIF_F(UFO6)
> #define NETIF_F_VLAN_CHALLENGED __NETIF_F(VLAN_CHALLENGED)
> #define NETIF_F_RXFCS __NETIF_F(RXFCS)
> #define NETIF_F_RXALL __NETIF_F(RXALL)
> @@ -141,7 +143,7 @@ enum {
>
> /* List of features with software fallbacks. */
> #define NETIF_F_GSO_SOFTWARE (NETIF_F_TSO | NETIF_F_TSO_ECN | \
> - NETIF_F_TSO6 | NETIF_F_UFO)
> + NETIF_F_TSO6 | NETIF_F_UFO | NETIF_F_UFO6)
>
> #define NETIF_F_GEN_CSUM NETIF_F_HW_CSUM
> #define NETIF_F_V4_CSUM (NETIF_F_GEN_CSUM | NETIF_F_IP_CSUM)
> @@ -149,6 +151,7 @@ enum {
> #define NETIF_F_ALL_CSUM (NETIF_F_V4_CSUM | NETIF_F_V6_CSUM)
>
> #define NETIF_F_ALL_TSO (NETIF_F_TSO | NETIF_F_TSO6 | NETIF_F_TSO_ECN)
> +#define NETIF_F_ALL_UFO (NETIF_F_UFO | NETIF_F_UFO6)
>
> #define NETIF_F_ALL_FCOE (NETIF_F_FCOE_CRC | NETIF_F_FCOE_MTU | \
> NETIF_F_FSO)
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index 74fd5d3..86af10a 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -3559,6 +3559,7 @@ static inline bool net_gso_ok(netdev_features_t features, int gso_type)
> /* check flags correspondence */
> BUILD_BUG_ON(SKB_GSO_TCPV4 != (NETIF_F_TSO >> NETIF_F_GSO_SHIFT));
> BUILD_BUG_ON(SKB_GSO_UDP != (NETIF_F_UFO >> NETIF_F_GSO_SHIFT));
> + BUILD_BUG_ON(SKB_GSO_UDP6 != (NETIF_F_UFO6 >> NETIF_F_GSO_SHIFT));
> BUILD_BUG_ON(SKB_GSO_DODGY != (NETIF_F_GSO_ROBUST >> NETIF_F_GSO_SHIFT));
> BUILD_BUG_ON(SKB_GSO_TCP_ECN != (NETIF_F_TSO_ECN >> NETIF_F_GSO_SHIFT));
> BUILD_BUG_ON(SKB_GSO_TCPV6 != (NETIF_F_TSO6 >> NETIF_F_GSO_SHIFT));
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index 6c8b6f6..8538b67 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -372,6 +372,7 @@ enum {
>
> SKB_GSO_MPLS = 1 << 12,
>
> + SKB_GSO_UDP6 = 1 << 13
> };
>
> #if BITS_PER_LONG > 32
So this implies anything getting GSO packets e.g.
from userspace now needs to check IP version to
set GSO type correctly.
I think you missed some places that do this, e.g. af_packet
sockets.
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 945bbd0..fa4d2ee 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -5929,6 +5929,12 @@ static netdev_features_t netdev_fix_features(struct net_device *dev,
> features &= ~NETIF_F_ALL_TSO;
> }
>
> + /* UFO requires that SG is present as well */
> + if ((features & NETIF_F_ALL_UFO) && !(features & NETIF_F_SG)) {
> + netdev_dbg(dev, "Dropping UFO features since no SG feature.\n");
> + features &= ~NETIF_F_ALL_UFO;
> + }
> +
> if ((features & NETIF_F_TSO) && !(features & NETIF_F_HW_CSUM) &&
> !(features & NETIF_F_IP_CSUM)) {
> netdev_dbg(dev, "Dropping TSO features since no CSUM feature.\n");
> @@ -5952,24 +5958,21 @@ static netdev_features_t netdev_fix_features(struct net_device *dev,
> features &= ~NETIF_F_GSO;
> }
>
> - /* UFO needs SG and checksumming */
> - if (features & NETIF_F_UFO) {
> - /* maybe split UFO into V4 and V6? */
> - if (!((features & NETIF_F_GEN_CSUM) ||
> - (features & (NETIF_F_IP_CSUM|NETIF_F_IPV6_CSUM))
> - == (NETIF_F_IP_CSUM|NETIF_F_IPV6_CSUM))) {
> - netdev_dbg(dev,
> - "Dropping NETIF_F_UFO since no checksum offload features.\n");
> - features &= ~NETIF_F_UFO;
> - }
> -
> - if (!(features & NETIF_F_SG)) {
> - netdev_dbg(dev,
> - "Dropping NETIF_F_UFO since no NETIF_F_SG feature.\n");
> - features &= ~NETIF_F_UFO;
> - }
> + /* UFO also needs checksumming */
> + if ((features & NETIF_F_UFO) && !(features & NETIF_F_GEN_CSUM) &&
> + !(features & NETIF_F_IP_CSUM)) {
> + netdev_dbg(dev,
> + "Dropping NETIF_F_UFO since no checksum offload features.\n");
> + features &= ~NETIF_F_UFO;
> + }
> + if ((features & NETIF_F_UFO6) && !(features & NETIF_F_GEN_CSUM) &&
> + !(features & NETIF_F_IPV6_CSUM)) {
> + netdev_dbg(dev,
> + "Dropping NETIF_F_UFO6 since no checksum offload features.\n");
> + features &= ~NETIF_F_UFO6;
> }
>
> +
> #ifdef CONFIG_NET_RX_BUSY_POLL
> if (dev->netdev_ops->ndo_busy_poll)
> features |= NETIF_F_BUSY_POLL;
> diff --git a/net/core/ethtool.c b/net/core/ethtool.c
> index 06dfb29..93eff41 100644
> --- a/net/core/ethtool.c
> +++ b/net/core/ethtool.c
> @@ -223,7 +223,7 @@ static netdev_features_t ethtool_get_feature_mask(u32 eth_cmd)
> return NETIF_F_ALL_TSO;
> case ETHTOOL_GUFO:
> case ETHTOOL_SUFO:
> - return NETIF_F_UFO;
> + return NETIF_F_ALL_UFO;
> case ETHTOOL_GGSO:
> case ETHTOOL_SGSO:
> return NETIF_F_GSO;
> --
> 1.9.3
^ permalink raw reply
* Re: [PATCH 10/10] Revert "drivers/net: Disable UFO through virtio"
From: Michael S. Tsirkin @ 2014-12-17 22:44 UTC (permalink / raw)
To: Vladislav Yasevich; +Cc: netdev, ben, stefanha, virtualization
In-Reply-To: <1418840455-22598-11-git-send-email-vyasevic@redhat.com>
On Wed, Dec 17, 2014 at 01:20:55PM -0500, Vladislav Yasevich wrote:
> This reverts commit 3d0ad09412ffe00c9afa201d01effdb6023d09b4.
> Now that we've split UFO into v4 and v6 version, we can turn
> back UFO support for ipv4. Full IPv6 support will come later as
> it requires extending vnet header structure.
>
> Any older VM that assumes IPv6 support is included in UFO
> will continue to use UFO and the host will generate fragment
> ids for it, thus preserving connectivity.
>
> Fixes: 88e0e0e5aa7a ("drivers/net: Disable UFO through virtio")
> CC: Ben Hutchings <ben@decadent.org.uk>
> Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
> ---
> drivers/net/virtio_net.c | 24 ++++++++++--------------
> 1 file changed, 10 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index b0bc8ea..534b633 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -491,17 +491,8 @@ static void receive_buf(struct receive_queue *rq, void *buf, unsigned int len)
> skb_shinfo(skb)->gso_type = SKB_GSO_TCPV4;
> break;
> case VIRTIO_NET_HDR_GSO_UDP:
> - {
> - static bool warned;
> -
> - if (!warned) {
> - warned = true;
> - netdev_warn(dev,
> - "host using disabled UFO feature; please fix it\n");
> - }
> skb_shinfo(skb)->gso_type = SKB_GSO_UDP;
> break;
This might not be true: could be a legacy host.
I think we need to check for IPv6 and set it
correctly to SKB_GSO_UDP/SKB_GSO_UDP6?
> - }
> case VIRTIO_NET_HDR_GSO_TCPV6:
> skb_shinfo(skb)->gso_type = SKB_GSO_TCPV6;
> break;
> @@ -890,6 +881,8 @@ static int xmit_skb(struct send_queue *sq, struct sk_buff *skb)
> hdr->hdr.gso_type = VIRTIO_NET_HDR_GSO_TCPV4;
> else if (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV6)
> hdr->hdr.gso_type = VIRTIO_NET_HDR_GSO_TCPV6;
> + else if (skb_shinfo(skb)->gso_type & SKB_GSO_UDP)
> + hdr->hdr.gso_type = VIRTIO_NET_HDR_GSO_UDP;
> else
> BUG();
> if (skb_shinfo(skb)->gso_type & SKB_GSO_TCP_ECN)
> @@ -1749,7 +1742,7 @@ static int virtnet_probe(struct virtio_device *vdev)
> dev->features |= NETIF_F_HW_CSUM|NETIF_F_SG|NETIF_F_FRAGLIST;
>
> if (virtio_has_feature(vdev, VIRTIO_NET_F_GSO)) {
> - dev->hw_features |= NETIF_F_TSO
> + dev->hw_features |= NETIF_F_TSO | NETIF_F_UFO
> | NETIF_F_TSO_ECN | NETIF_F_TSO6;
> }
> /* Individual feature bits: what can host handle? */
> @@ -1759,9 +1752,11 @@ static int virtnet_probe(struct virtio_device *vdev)
> dev->hw_features |= NETIF_F_TSO6;
> if (virtio_has_feature(vdev, VIRTIO_NET_F_HOST_ECN))
> dev->hw_features |= NETIF_F_TSO_ECN;
> + if (virtio_has_feature(vdev, VIRTIO_NET_F_HOST_UFO))
> + dev->hw_features |= NETIF_F_UFO;
>
> if (gso)
> - dev->features |= dev->hw_features & NETIF_F_ALL_TSO;
> + dev->features |= dev->hw_features & (NETIF_F_ALL_TSO|NETIF_F_UFO);
> /* (!csum && gso) case will be fixed by register_netdev() */
> }
> if (virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_CSUM))
> @@ -1799,7 +1794,8 @@ static int virtnet_probe(struct virtio_device *vdev)
> /* If we can receive ANY GSO packets, we must allocate large ones. */
> if (virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_TSO4) ||
> virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_TSO6) ||
> - virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_ECN))
> + virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_ECN) ||
> + virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_UFO))
> vi->big_packets = true;
>
> if (virtio_has_feature(vdev, VIRTIO_NET_F_MRG_RXBUF))
> @@ -1993,9 +1989,9 @@ static struct virtio_device_id id_table[] = {
> static unsigned int features[] = {
> VIRTIO_NET_F_CSUM, VIRTIO_NET_F_GUEST_CSUM,
> VIRTIO_NET_F_GSO, VIRTIO_NET_F_MAC,
> - VIRTIO_NET_F_HOST_TSO4, VIRTIO_NET_F_HOST_TSO6,
> + VIRTIO_NET_F_HOST_TSO4, VIRTIO_NET_F_HOST_UFO, VIRTIO_NET_F_HOST_TSO6,
> VIRTIO_NET_F_HOST_ECN, VIRTIO_NET_F_GUEST_TSO4, VIRTIO_NET_F_GUEST_TSO6,
> - VIRTIO_NET_F_GUEST_ECN,
> + VIRTIO_NET_F_GUEST_ECN, VIRTIO_NET_F_GUEST_UFO,
> VIRTIO_NET_F_MRG_RXBUF, VIRTIO_NET_F_STATUS, VIRTIO_NET_F_CTRL_VQ,
> VIRTIO_NET_F_CTRL_RX, VIRTIO_NET_F_CTRL_VLAN,
> VIRTIO_NET_F_GUEST_ANNOUNCE, VIRTIO_NET_F_MQ,
> --
> 1.9.3
^ permalink raw reply
* Re: [PATCH 09/10] macvtap: Re-enable UFO support
From: Michael S. Tsirkin @ 2014-12-17 22:41 UTC (permalink / raw)
To: Vladislav Yasevich; +Cc: netdev, ben, stefanha, virtualization
In-Reply-To: <1418840455-22598-10-git-send-email-vyasevic@redhat.com>
On Wed, Dec 17, 2014 at 01:20:54PM -0500, Vladislav Yasevich wrote:
> Now that UFO is split into v4 and v6 parts, we can bring
> back v4 support. Continue to handle legacy applications
> by selecting the ipv6 fagment id but do not change the
> gso type. This allows 2 legacy VMs to continue to communicate.
>
> Based on original work from Ben Hutchings.
>
> Fixes: 88e0e0e5aa7a ("drivers/net: Disable UFO through virtio")
> CC: Ben Hutchings <ben@decadent.org.uk>
> Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
> ---
> drivers/net/macvtap.c | 20 ++++++++++++++------
> 1 file changed, 14 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
> index 880cc09..75febd4 100644
> --- a/drivers/net/macvtap.c
> +++ b/drivers/net/macvtap.c
> @@ -66,7 +66,7 @@ static struct cdev macvtap_cdev;
> static const struct proto_ops macvtap_socket_ops;
>
> #define TUN_OFFLOADS (NETIF_F_HW_CSUM | NETIF_F_TSO_ECN | NETIF_F_TSO | \
> - NETIF_F_TSO6)
> + NETIF_F_TSO6 | NETIF_F_UFO)
> #define RX_OFFLOADS (NETIF_F_GRO | NETIF_F_LRO)
> #define TAP_FEATURES (NETIF_F_GSO | NETIF_F_SG)
>
> @@ -570,11 +570,14 @@ static int macvtap_skb_from_vnet_hdr(struct sk_buff *skb,
> gso_type = SKB_GSO_TCPV6;
> break;
> case VIRTIO_NET_HDR_GSO_UDP:
> - pr_warn_once("macvtap: %s: using disabled UFO feature; please fix this program\n",
> - current->comm);
> gso_type = SKB_GSO_UDP;
> - if (skb->protocol == htons(ETH_P_IPV6))
> + if (vlan_get_protocol(skb) == htons(ETH_P_IPV6)) {
> + /* This is to support legacy appliacations.
> + * Do not change the gso_type as legacy apps
> + * may not know about the new type.
> + */
> ipv6_proxy_select_ident(skb);
> + }
> break;
> default:
> return -EINVAL;
> @@ -619,6 +622,8 @@ static void macvtap_skb_to_vnet_hdr(const struct sk_buff *skb,
> vnet_hdr->gso_type = VIRTIO_NET_HDR_GSO_TCPV4;
> else if (sinfo->gso_type & SKB_GSO_TCPV6)
> vnet_hdr->gso_type = VIRTIO_NET_HDR_GSO_TCPV6;
> + else if (sinfo->gso_type & SKB_GSO_UDP)
> + vnet_hdr->gso_type = VIRTIO_NET_HDR_GSO_UDP;
> else
> BUG();
> if (sinfo->gso_type & SKB_GSO_TCP_ECN)
> @@ -955,6 +960,9 @@ static int set_offload(struct macvtap_queue *q, unsigned long arg)
> if (arg & TUN_F_TSO6)
> feature_mask |= NETIF_F_TSO6;
> }
> +
> + if (arg & TUN_F_UFO)
> + feature_mask |= NETIF_F_UFO;
> }
>
> /* tun/tap driver inverts the usage for TSO offloads, where
> @@ -965,7 +973,7 @@ static int set_offload(struct macvtap_queue *q, unsigned long arg)
> * When user space turns off TSO, we turn off GSO/LRO so that
> * user-space will not receive TSO frames.
> */
> - if (feature_mask & (NETIF_F_TSO | NETIF_F_TSO6))
> + if (feature_mask & (NETIF_F_TSO | NETIF_F_TSO6 | NETIF_F_UFO))
> features |= RX_OFFLOADS;
> else
> features &= ~RX_OFFLOADS;
By the way this logic is completely broken, even without your patch,
isn't it?
It says: enable LRO+GRO if any of NETIF_F_TSO | NETIF_F_TSO6 |
NETIF_F_UFO set.
So what happens if I enable TSO only?
LRO gets enabled so I can still get TSO6 packets.
This really should be:
if (feature_mask & (NETIF_F_TSO | NETIF_F_TSO6 | NETIF_F_UFO) ==
(NETIF_F_TSO | NETIF_F_TSO6 | NETIF_F_UFO)
fixing this probably should be a separate patch before your
series, and Cc stable.
> @@ -1066,7 +1074,7 @@ static long macvtap_ioctl(struct file *file, unsigned int cmd,
> case TUNSETOFFLOAD:
> /* let the user check for future flags */
> if (arg & ~(TUN_F_CSUM | TUN_F_TSO4 | TUN_F_TSO6 |
> - TUN_F_TSO_ECN))
> + TUN_F_TSO_ECN | TUN_F_UFO))
> return -EINVAL;
>
> rtnl_lock();
> --
> 1.9.3
^ permalink raw reply
* Re: [PATCH 08/10] tun: Re-uanble UFO support.
From: Michael S. Tsirkin @ 2014-12-17 22:33 UTC (permalink / raw)
To: Vladislav Yasevich; +Cc: netdev, ben, stefanha, virtualization
In-Reply-To: <1418840455-22598-9-git-send-email-vyasevic@redhat.com>
subs: re-enable
On Wed, Dec 17, 2014 at 01:20:53PM -0500, Vladislav Yasevich wrote:
> Now that UFO is split into v4 and v6 parts, we can bring
> back v4 support without any trouble.
>
> Continue to handle legacy applications by selecting the
> IPv6 fragment id but do not change the gso type. Thist
s/Thist/this/
> makes sure that two legacy VMs may still communicate.
This means IPv6 skbs with UFO (not UFO6) flag set are present
in the stack.
If possible, I think it would be better to make GSO type correct: UFO6,
and then convert to UFO when copying to guest.
A similar approach should be possible for OVS?
> Based on original work from Ben Hutchings.
>
> Fixes: 88e0e0e5aa7a ("drivers/net: Disable UFO through virtio")
> CC: Ben Hutchings <ben@decadent.org.uk>
> Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
> ---
> drivers/net/tun.c | 26 ++++++++++++++------------
> 1 file changed, 14 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> index 9dd3746..8c32fca 100644
> --- a/drivers/net/tun.c
> +++ b/drivers/net/tun.c
> @@ -175,7 +175,7 @@ struct tun_struct {
> struct net_device *dev;
> netdev_features_t set_features;
> #define TUN_USER_FEATURES (NETIF_F_HW_CSUM|NETIF_F_TSO_ECN|NETIF_F_TSO| \
> - NETIF_F_TSO6)
> + NETIF_F_TSO6|NETIF_F_UFO)
>
> int vnet_hdr_sz;
> int sndbuf;
> @@ -1152,20 +1152,15 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
> skb_shinfo(skb)->gso_type = SKB_GSO_TCPV6;
> break;
> case VIRTIO_NET_HDR_GSO_UDP:
> - {
> - static bool warned;
> -
> - if (!warned) {
> - warned = true;
> - netdev_warn(tun->dev,
> - "%s: using disabled UFO feature; please fix this program\n",
> - current->comm);
> - }
> skb_shinfo(skb)->gso_type = SKB_GSO_UDP;
> - if (skb->protocol == htons(ETH_P_IPV6))
> + if (vlan_get_protocol(skb) == htons(ETH_P_IPV6)) {
> + /* This allows legacy application to work.
> + * Do not change the gso_type as it may
> + * not be upderstood by legacy applications.
Shouldn't we handle legacy applications when passing packets to
userspace?
> + */
> ipv6_proxy_select_ident(skb);
> + }
> break;
> - }
> default:
> tun->dev->stats.rx_frame_errors++;
> kfree_skb(skb);
> @@ -1273,6 +1268,8 @@ static ssize_t tun_put_user(struct tun_struct *tun,
> gso.gso_type = VIRTIO_NET_HDR_GSO_TCPV4;
> else if (sinfo->gso_type & SKB_GSO_TCPV6)
> gso.gso_type = VIRTIO_NET_HDR_GSO_TCPV6;
> + else if (sinfo->gso_type & SKB_GSO_UDP)
> + gso.gso_type = VIRTIO_NET_HDR_GSO_UDP;
> else {
> pr_err("unexpected GSO type: "
> "0x%x, gso_size %d, hdr_len %d\n",
> @@ -1780,6 +1777,11 @@ static int set_offload(struct tun_struct *tun, unsigned long arg)
> features |= NETIF_F_TSO6;
> arg &= ~(TUN_F_TSO4|TUN_F_TSO6);
> }
> +
> + if (arg & TUN_F_UFO) {
> + features |= NETIF_F_UFO;
> + arg &= ~TUN_F_UFO;
> + }
> }
>
> /* This gives the user a way to test for new features in future by
> --
> 1.9.3
^ permalink raw reply
* Re: [PATCH 03/10] ovs: Enable handling of UFO6 packets.
From: Michael S. Tsirkin @ 2014-12-17 22:26 UTC (permalink / raw)
To: Vladislav Yasevich; +Cc: netdev, ben, stefanha, virtualization
In-Reply-To: <1418840455-22598-4-git-send-email-vyasevic@redhat.com>
On Wed, Dec 17, 2014 at 01:20:48PM -0500, Vladislav Yasevich wrote:
> Since UFO6 packets can now be identified by SKB_GSO_UDP6, add proper checks
> to handel UFO6 flows.
s/handel/handle/
> Legacy applications may still have UFO6 packets identified by SKB_GSO_UDP,
> so we need to continue to handle them correclty.
>
> Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
> ---
> net/openvswitch/datapath.c | 3 ++-
> net/openvswitch/flow.c | 2 +-
> 2 files changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
> index f9e556b..b43fc60 100644
> --- a/net/openvswitch/datapath.c
> +++ b/net/openvswitch/datapath.c
> @@ -334,7 +334,8 @@ static int queue_gso_packets(struct datapath *dp, struct sk_buff *skb,
> if (err)
> break;
>
> - if (skb == segs && gso_type & SKB_GSO_UDP) {
> + if (skb == segs &&
> + ((gso_type & SKB_GSO_UDP) || (gso_type & SKB_GSO_UDP6))) {
> /* The initial flow key extracted by ovs_flow_extract()
> * in this case is for a first fragment, so we need to
> * properly mark later fragments.
> diff --git a/net/openvswitch/flow.c b/net/openvswitch/flow.c
> index 2b78789..d03adf4 100644
> --- a/net/openvswitch/flow.c
> +++ b/net/openvswitch/flow.c
> @@ -602,7 +602,7 @@ static int key_extract(struct sk_buff *skb, struct sw_flow_key *key)
>
> if (key->ip.frag == OVS_FRAG_TYPE_LATER)
> return 0;
> - if (skb_shinfo(skb)->gso_type & SKB_GSO_UDP)
> + if (skb_shinfo(skb)->gso_type & (SKB_GSO_UDP | SKB_GSO_UDP6))
> key->ip.frag = OVS_FRAG_TYPE_FIRST;
>
> /* Transport layer. */
> --
> 1.9.3
^ permalink raw reply
* Re: [RFC PATCH net-next 0/5] tcp: TCP tracer
From: Josef Bacik @ 2014-12-17 21:42 UTC (permalink / raw)
To: Alexei Starovoitov, Martin Lau
Cc: Eric Dumazet, Blake Matheny, Laurent Chavey, Yuchung Cheng,
netdev@vger.kernel.org, David S. Miller, Hannes Frederic Sowa,
Steven Rostedt, Lawrence Brakmo, Kernel Team
In-Reply-To: <CAADnVQL6-72UvvL-T7JUa-6c5+xJ2XggW=iCwb5G_ZbH3r-SZQ@mail.gmail.com>
On 12/16/2014 10:06 PM, Alexei Starovoitov wrote:
> On Tue, Dec 16, 2014 at 5:30 PM, Martin Lau <kafai@fb.com> wrote:
>>>>>>> I think systemtap like scripting on top of patches 1 and 3
>>>>>>> should solve your use case ?
>>>> We have quite a few different versions running in the production. It may not
>>>> be operationally easy.
>>>
>>> different versions of kernel or different versions of tcp_tracer ?
>> Former and we are releasing new kernel pretty often.
>
> I see. So for dynamic tracer to be useful in such environment,
> the scripts should be compatible across different kernel version
> without recompilation. All makes sense.
>
>> How does the current TRACE_EVENT do it when it wants to printf more data?
>
> tracepoints, like any other user interface, shouldn't
> break compatibility. With printf it's practically impossible.
> Some subsystems may be breaking this rule arguing that
> tracepoints is a debug facility, but networking tracepoints don't change.
>
So that's what the events/<subsystem>/<event>/format is for, to provide
a nice way for scripts to know what they are looking at. For things
like the tcp estats and other tracing tools we use in production
internally we use something (our own stuff in case of estats, trace-cmd
in the case of normal tracepoints) to read the raw data and pull out the
fields we need, and that way it works no matter what kernel we're on.
Sometimes tracepoints move and so we have to adjust our scripts, but
that's the cost of doing business and I think that's acceptable.
>>> It feels that for stats collection only, tracepoints+tcp_trace
>>> do not add much additional value vs extending tcp_info
>>> and using ss.
>> I think we are on the same page. Once 'this should cost nothing if not
>> activated' proposition was cleared out. It was what I meant that doing the
>> collection part in the TCP itself (instead of tracepoints) would be nice.
>
> agree.
>
>> I think going forward, as others have suggested, it may be better to come
>> together and reach a common ground on what to collect first before I re-work
>> patch 1 to 3 and repost.
>
> I think as a minimum it will be discussed at netdev01 in Feb,
> but I suspect not everyone on this list can(want) go to Ottawa,
> so would be nice to have a meetup for bay area folks to
> discuss this sooner with public g+ hangout.
> Thoughts?
>
Yeah I think we're all in agreement that this is a good netdev01
discussion. I'm happy to include people who want to talk about this
before hand in the bay area meetup we're throwing, but it seems like
this is going to be something that the larger community is going to want
to talk about so it may be more productive to wait until netdev01. Thanks,
Josef
^ permalink raw reply
* Re: [RFC PATCH net-next 0/5] tcp: TCP tracer
From: Arnaldo Carvalho de Melo @ 2014-12-17 21:24 UTC (permalink / raw)
To: David Ahern
Cc: Alexei Starovoitov, Martin KaFai Lau, netdev@vger.kernel.org,
David S. Miller, Hannes Frederic Sowa, Steven Rostedt,
Lawrence Brakmo, Josef Bacik, Kernel Team
In-Reply-To: <5491EE01.5020406@gmail.com>
Em Wed, Dec 17, 2014 at 01:56:33PM -0700, David Ahern escreveu:
> On 12/17/14 1:42 PM, Alexei Starovoitov wrote:
> >>It is not strictly necessary to carry vmlinux, that is just a probe
> >>>point resolution time problem, solvable when generating a shell script,
> >>>on the development machine, to insert the probes.
> >on N development machines with kernels that
> >would match worker machines...
> >I'm not saying it's impossible, just operationally difficult.
> >This is my understanding of Martin's use case.
> That's the use case I am talking about ... N-different kernel versions and
> the probe definitions would need to be generated at *build* time of the
> kernel that uses a cross-compile environment. ie., can't assume there is a
> development machine running the kernel from which you can generate the probe
> definitions. This gets messy quick for embedded deployments.
It shouldn't, you're saying that the rate of pushing out production
kernels is so high that we get lost and can't find the matching full
debug original binaries used.
We have build-ids for that, to have binary content keys, that we can
match what is in production, that has to be as lean as possible, while
being able to get back to all that fat.
Is it that people want so hard to forget about that extra debugging fat
that in the end we need to keep it to be able to figure out what happens
when things go wrong?
I understand that the expectation is that for each production build
there will be unwieldly different probe point definitions to keep, but
is that so?
- Arnaldo
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox