Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH v4 1/2] rtw88: pci: Rearrange the memory usage for skb in RX ISR
From: Kalle Valo @ 2019-07-24 11:49 UTC (permalink / raw)
  To: Jian-Hong Pan
  Cc: Yan-Hsuan Chuang, David S . Miller, Larry Finger, David Laight,
	Christoph Hellwig, linux-wireless, netdev, linux-kernel, linux,
	Daniel Drake, Jian-Hong Pan, stable
In-Reply-To: <20190711052427.5582-1-jian-hong@endlessm.com>

Jian-Hong Pan <jian-hong@endlessm.com> wrote:

> Testing with RTL8822BE hardware, when available memory is low, we
> frequently see a kernel panic and system freeze.
> 
> First, rtw_pci_rx_isr encounters a memory allocation failure (trimmed):
> 
> rx routine starvation
> WARNING: CPU: 7 PID: 9871 at drivers/net/wireless/realtek/rtw88/pci.c:822 rtw_pci_rx_isr.constprop.25+0x35a/0x370 [rtwpci]
> [ 2356.580313] RIP: 0010:rtw_pci_rx_isr.constprop.25+0x35a/0x370 [rtwpci]
> 
> Then we see a variety of different error conditions and kernel panics,
> such as this one (trimmed):
> 
> rtw_pci 0000:02:00.0: pci bus timeout, check dma status
> skbuff: skb_over_panic: text:00000000091b6e66 len:415 put:415 head:00000000d2880c6f data:000000007a02b1ea tail:0x1df end:0xc0 dev:<NULL>
> ------------[ cut here ]------------
> kernel BUG at net/core/skbuff.c:105!
> invalid opcode: 0000 [#1] SMP NOPTI
> RIP: 0010:skb_panic+0x43/0x45
> 
> When skb allocation fails and the "rx routine starvation" is hit, the
> function returns immediately without updating the RX ring. At this
> point, the RX ring may continue referencing an old skb which was already
> handed off to ieee80211_rx_irqsafe(). When it comes to be used again,
> bad things happen.
> 
> This patch allocates a new, data-sized skb first in RX ISR. After
> copying the data in, we pass it to the upper layers. However, if skb
> allocation fails, we effectively drop the frame. In both cases, the
> original, full size ring skb is reused.
> 
> In addition, to fixing the kernel crash, the RX routine should now
> generally behave better under low memory conditions.
> 
> Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=204053
> Signed-off-by: Jian-Hong Pan <jian-hong@endlessm.com>
> Cc: <stable@vger.kernel.org>

2 patches applied to wireless-drivers-next.git, thanks.

ee6db78f5db9 rtw88: pci: Rearrange the memory usage for skb in RX ISR
29b68a920f6a rtw88: pci: Use DMA sync instead of remapping in RX ISR

-- 
https://patchwork.kernel.org/patch/11039275/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches


^ permalink raw reply

* Re: [PATCH] rtlwifi: btcoex: fix issue possible condition with no effect (if == else)
From: Kalle Valo @ 2019-07-24 11:52 UTC (permalink / raw)
  To: Hariprasad Kelam
  Cc: Ping-Ke Shih, David S. Miller, YueHaibing, Hariprasad Kelam,
	Larry Finger, Nathan Chancellor, linux-wireless, netdev,
	linux-kernel
In-Reply-To: <20190712191535.GA4215@hari-Inspiron-1545>

Hariprasad Kelam <hariprasad.kelam@gmail.com> wrote:

> fix below issue reported by coccicheck
> drivers/net/wireless/realtek/rtlwifi/btcoexist/halbtcoutsrc.c:514:1-3:
> WARNING: possible condition with no effect (if == else)
> 
> Signed-off-by: Hariprasad Kelam <hariprasad.kelam@gmail.com>
> Acked-by: Ping-Ke Shih <pkshih@realtek.com>

Patch applied to wireless-drivers-next.git, thanks.

9a29f7d8476c rtlwifi: btcoex: fix issue possible condition with no effect (if == else)

-- 
https://patchwork.kernel.org/patch/11042665/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches


^ permalink raw reply

* Re: [PATCH v2] libertas_tf: Use correct channel range in lbtf_geo_init
From: Kalle Valo @ 2019-07-24 11:53 UTC (permalink / raw)
  To: YueHaibing
  Cc: davem, lkundrak, derosier, linux-kernel, netdev, linux-wireless,
	YueHaibing
In-Reply-To: <20190716144218.20608-1-yuehaibing@huawei.com>

YueHaibing <yuehaibing@huawei.com> wrote:

> It seems we should use 'range' instead of 'priv->range'
> in lbtf_geo_init(), because 'range' is the corret one
> related to current regioncode.
> 
> Reported-by: Hulk Robot <hulkci@huawei.com>
> Fixes: 691cdb49388b ("libertas_tf: command helper functions for libertas_tf")
> Signed-off-by: YueHaibing <yuehaibing@huawei.com>

Patch applied to wireless-drivers-next.git, thanks.

2ec4ad49b98e libertas_tf: Use correct channel range in lbtf_geo_init

-- 
https://patchwork.kernel.org/patch/11046191/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches


^ permalink raw reply

* Re: [PATCH net-next 00/10] Use dev_get_drvdata where possible
From: Kalle Valo @ 2019-07-24 11:57 UTC (permalink / raw)
  To: Chuhong Yuan
  Cc: Mirko Lindner, Stephen Hemminger, Jiri Slaby, Nick Kossifidis,
	Luis Chamberlain, Stanislaw Gruszka, QCA ath9k Development,
	Maya Erez, Johannes Berg, Emmanuel Grumbach, Luca Coelho,
	Amitkumar Karwar, Nishant Sarmukadam, Ganapathi Bhat, Xinming Hu,
	Igor Mitsyanko, Avinash Patil, Sergey Matyukevich, Ping-Ke Shih,
	Intel Linux Wireless, David S . Miller,
	Solarflare linux maintainers, Edward Cree, Mart in Habets, netdev,
	wil6210, linux-wireless, linux-kernel
In-Reply-To: <20190724112524.13042-1-hslester96@gmail.com>

Chuhong Yuan <hslester96@gmail.com> writes:

> These patches use dev_get_drvdata instead of
> using to_pci_dev + pci_get_drvdata to make
> code simpler.
>
> Chuhong Yuan (10):
>   net: marvell: Use dev_get_drvdata where possible
>   forcedeth: Use dev_get_drvdata where possible
>   sfc: Use dev_get_drvdata where possible
>   sfc-falcon: Use dev_get_drvdata where possible
>   ath: Use dev_get_drvdata where possible
>   iwlegacy: Use dev_get_drvdata where possible
>   iwlwifi: Use dev_get_drvdata where possible
>   mwifiex: pcie: Use dev_get_drvdata
>   qtnfmac_pcie: Use dev_get_drvdata
>   rtlwifi: rtl_pci: Use dev_get_drvdata
>
>  drivers/net/ethernet/marvell/skge.c                |  6 ++----
>  drivers/net/ethernet/marvell/sky2.c                |  3 +--
>  drivers/net/ethernet/nvidia/forcedeth.c            |  3 +--
>  drivers/net/ethernet/sfc/ef10.c                    |  4 ++--
>  drivers/net/ethernet/sfc/efx.c                     | 10 +++++-----
>  drivers/net/ethernet/sfc/falcon/efx.c              |  6 +++---
>  drivers/net/ethernet/sfc/falcon/falcon_boards.c    |  4 ++--
>  drivers/net/wireless/ath/ath5k/pci.c               |  3 +--
>  drivers/net/wireless/ath/ath9k/pci.c               |  5 ++---
>  drivers/net/wireless/ath/wil6210/pcie_bus.c        |  6 ++----
>  drivers/net/wireless/intel/iwlegacy/common.c       |  3 +--
>  drivers/net/wireless/intel/iwlwifi/pcie/drv.c      | 12 ++++--------
>  drivers/net/wireless/marvell/mwifiex/pcie.c        |  8 ++------
>  drivers/net/wireless/quantenna/qtnfmac/pcie/pcie.c |  4 ++--
>  drivers/net/wireless/realtek/rtlwifi/pci.c         |  6 ++----
>  15 files changed, 32 insertions(+), 51 deletions(-)

Do note that wireless patches go to wireless-drivers-next, not net-next.
But I assume Dave will ignore patches 5-10 and I can take them.

-- 
Kalle Valo

^ permalink raw reply

* Re: [PATCH net-next 3/3] net: stmmac: Introducing support for Page Pool
From: Jon Hunter @ 2019-07-24 11:58 UTC (permalink / raw)
  To: Jose Abreu, Ilias Apalodimas
  Cc: David Miller, robin.murphy@arm.com, lists@bofh.nu,
	Joao.Pinto@synopsys.com, alexandre.torgue@st.com,
	maxime.ripard@bootlin.com, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org,
	linux-stm32@st-md-mailman.stormreply.com, wens@csie.org,
	mcoquelin.stm32@gmail.com, linux-tegra@vger.kernel.org,
	peppe.cavallaro@st.com, linux-arm-kernel@lists.infradead.org
In-Reply-To: <BYAPR12MB32696F0A2BFDF69F31C4311CD3C60@BYAPR12MB3269.namprd12.prod.outlook.com>


On 24/07/2019 12:34, Jose Abreu wrote:
> From: Jon Hunter <jonathanh@nvidia.com>
> Date: Jul/24/2019, 12:10:47 (UTC+00:00)
> 
>>
>> On 24/07/2019 11:04, Jose Abreu wrote:
>>
>> ...
>>
>>> Jon, I was able to replicate (at some level) your setup:
>>>
>>> # dmesg | grep -i arm-smmu
>>> [    1.337322] arm-smmu 70040000.iommu: probing hardware 
>>> configuration...
>>> [    1.337330] arm-smmu 70040000.iommu: SMMUv2 with:
>>> [    1.337338] arm-smmu 70040000.iommu:         stage 1 translation
>>> [    1.337346] arm-smmu 70040000.iommu:         stage 2 translation
>>> [    1.337354] arm-smmu 70040000.iommu:         nested translation
>>> [    1.337363] arm-smmu 70040000.iommu:         stream matching with 128 
>>> register groups
>>> [    1.337374] arm-smmu 70040000.iommu:         1 context banks (0 
>>> stage-2 only)
>>> [    1.337383] arm-smmu 70040000.iommu:         Supported page sizes: 
>>> 0x61311000
>>> [    1.337393] arm-smmu 70040000.iommu:         Stage-1: 48-bit VA -> 
>>> 48-bit IPA
>>> [    1.337402] arm-smmu 70040000.iommu:         Stage-2: 48-bit IPA -> 
>>> 48-bit PA
>>>
>>> # dmesg | grep -i stmmac
>>> [    1.344106] stmmaceth 70000000.ethernet: Adding to iommu group 0
>>> [    1.344233] stmmaceth 70000000.ethernet: no reset control found
>>> [    1.348276] stmmaceth 70000000.ethernet: User ID: 0x10, Synopsys ID: 
>>> 0x51
>>> [    1.348285] stmmaceth 70000000.ethernet:     DWMAC4/5
>>> [    1.348293] stmmaceth 70000000.ethernet: DMA HW capability register 
>>> supported
>>> [    1.348302] stmmaceth 70000000.ethernet: RX Checksum Offload Engine 
>>> supported
>>> [    1.348311] stmmaceth 70000000.ethernet: TX Checksum insertion 
>>> supported
>>> [    1.348320] stmmaceth 70000000.ethernet: TSO supported
>>> [    1.348328] stmmaceth 70000000.ethernet: Enable RX Mitigation via HW 
>>> Watchdog Timer
>>> [    1.348337] stmmaceth 70000000.ethernet: TSO feature enabled
>>> [    1.348409] libphy: stmmac: probed
>>> [ 4159.140990] stmmaceth 70000000.ethernet eth0: PHY [stmmac-0:01] 
>>> driver [Generic PHY]
>>> [ 4159.141005] stmmaceth 70000000.ethernet eth0: phy: setting supported 
>>> 00,00000000,000062ff advertising 00,00000000,000062ff
>>> [ 4159.142359] stmmaceth 70000000.ethernet eth0: No Safety Features 
>>> support found
>>> [ 4159.142369] stmmaceth 70000000.ethernet eth0: IEEE 1588-2008 Advanced 
>>> Timestamp supported
>>> [ 4159.142429] stmmaceth 70000000.ethernet eth0: registered PTP clock
>>> [ 4159.142439] stmmaceth 70000000.ethernet eth0: configuring for 
>>> phy/gmii link mode
>>> [ 4159.142452] stmmaceth 70000000.ethernet eth0: phylink_mac_config: 
>>> mode=phy/gmii/Unknown/Unknown adv=00,00000000,000062ff pause=10 link=0 
>>> an=1
>>> [ 4159.142466] stmmaceth 70000000.ethernet eth0: phy link up 
>>> gmii/1Gbps/Full
>>> [ 4159.142475] stmmaceth 70000000.ethernet eth0: phylink_mac_config: 
>>> mode=phy/gmii/1Gbps/Full adv=00,00000000,00000000 pause=0f link=1 an=0
>>> [ 4159.142481] stmmaceth 70000000.ethernet eth0: Link is Up - 1Gbps/Full 
>>> - flow control rx/tx
>>>
>>> The only missing point is the NFS boot that I can't replicate with this 
>>> setup. But I did some sanity checks:
>>>
>>> Remote Enpoint:
>>> # dd if=/dev/urandom of=output.dat bs=128M count=1
>>> # nc -c 192.168.0.2 1234 < output.dat
>>> # md5sum output.dat 
>>> fde9e0818281836e4fc0edfede2b8762  output.dat
>>>
>>> DUT:
>>> # nc -l -c -p 1234 > output.dat
>>> # md5sum output.dat 
>>> fde9e0818281836e4fc0edfede2b8762  output.dat
>>
>> On my setup, if I do not use NFS to mount the rootfs, but then manually
>> mount the NFS share after booting, I do not see any problems reading or
>> writing to files on the share. So I am not sure if it is some sort of
>> race that is occurring when mounting the NFS share on boot. It is 100%
>> reproducible when using NFS for the root file-system.
> 
> I don't understand how can there be corruption then unless the IP AXI 
> parameters are misconfigured which can lead to sporadic undefined 
> behavior.
> 
> These prints from your logs:
> [   14.579392] Run /init as init process
> /init: line 58: chmod: command not found
> [ 10:22:46 ] L4T-INITRD Build DATE: Mon Jul 22 10:22:46 UTC 2019
> [ 10:22:46 ] Root device found: nfs
> [ 10:22:46 ] Ethernet interfaces: eth0
> [ 10:22:46 ] IP Address: 10.21.140.41
> 
> Where are they coming from ? Do you have any extra init script ?

By default there is an initial ramdisk that is loaded first and then the
rootfs is mounted over NFS. However, even if I remove this ramdisk and
directly mount the rootfs via NFS without it the problem persists. So I
don't see any issue with the ramdisk and whats more is we have been
using this for a long long time. Nothing has changed here.

Jon

-- 
nvpublic

^ permalink raw reply

* [PATCH] ip6_gre: reload ipv6h in prepare_ip6gre_xmit_ipv6
From: Haishuang Yan @ 2019-07-24 12:00 UTC (permalink / raw)
  To: David S. Miller, Alexey Kuznetsov
  Cc: netdev, linux-kernel, Haishuang Yan, William Tu

Since ip6_tnl_parse_tlv_enc_lim() can call pskb_may_pull()
which may change skb->data, so we need to re-load ipv6h at
the right place.

Fixes: 898b29798e36 ("ip6_gre: Refactor ip6gre xmit codes")
Cc: William Tu <u9012063@gmail.com>
Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
---
 net/ipv6/ip6_gre.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c
index c2049c7..dd2d0b96 100644
--- a/net/ipv6/ip6_gre.c
+++ b/net/ipv6/ip6_gre.c
@@ -660,12 +660,13 @@ static int prepare_ip6gre_xmit_ipv6(struct sk_buff *skb,
 				    struct flowi6 *fl6, __u8 *dsfield,
 				    int *encap_limit)
 {
-	struct ipv6hdr *ipv6h = ipv6_hdr(skb);
+	struct ipv6hdr *ipv6h;
 	struct ip6_tnl *t = netdev_priv(dev);
 	__u16 offset;
 
 	offset = ip6_tnl_parse_tlv_enc_lim(skb, skb_network_header(skb));
 	/* ip6_tnl_parse_tlv_enc_lim() might have reallocated skb->head */
+	ipv6h = ipv6_hdr(skb);
 
 	if (offset > 0) {
 		struct ipv6_tlv_tnl_enc_lim *tel;
-- 
1.8.3.1




^ permalink raw reply related

* Re: [PATCH 4.4 stable net] net: tcp: Fix use-after-free in tcp_write_xmit
From: maowenan @ 2019-07-24 12:13 UTC (permalink / raw)
  To: Greg KH; +Cc: stable, davem, netdev, linux-kernel
In-Reply-To: <20190724110524.GA4472@kroah.com>



On 2019/7/24 19:05, Greg KH wrote:
> On Wed, Jul 24, 2019 at 05:17:15PM +0800, Mao Wenan wrote:
>> There is one report about tcp_write_xmit use-after-free with version 4.4.136:
>>
>> BUG: KASAN: use-after-free in tcp_skb_pcount include/net/tcp.h:796 [inline]
>> BUG: KASAN: use-after-free in tcp_init_tso_segs net/ipv4/tcp_output.c:1619 [inline]
>> BUG: KASAN: use-after-free in tcp_write_xmit+0x3fc2/0x4cb0 net/ipv4/tcp_output.c:2056
>> Read of size 2 at addr ffff8801d6fc87b0 by task syz-executor408/4195
>>
>> CPU: 0 PID: 4195 Comm: syz-executor408 Not tainted 4.4.136-gfb7e319 #59
>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
>>  0000000000000000 7d8f38ecc03be946 ffff8801d73b7710 ffffffff81e0edad
>>  ffffea00075bf200 ffff8801d6fc87b0 0000000000000000 ffff8801d6fc87b0
>>  dffffc0000000000 ffff8801d73b7748 ffffffff815159b6 ffff8801d6fc87b0
>> Call Trace:
>>  [<ffffffff81e0edad>] __dump_stack lib/dump_stack.c:15 [inline]
>>  [<ffffffff81e0edad>] dump_stack+0xc1/0x124 lib/dump_stack.c:51
>>  [<ffffffff815159b6>] print_address_description+0x6c/0x216 mm/kasan/report.c:252
>>  [<ffffffff81515cd5>] kasan_report_error mm/kasan/report.c:351 [inline]
>>  [<ffffffff81515cd5>] kasan_report.cold.7+0x175/0x2f7 mm/kasan/report.c:408
>>  [<ffffffff814f9784>] __asan_report_load2_noabort+0x14/0x20 mm/kasan/report.c:427
>>  [<ffffffff83286582>] tcp_skb_pcount include/net/tcp.h:796 [inline]
>>  [<ffffffff83286582>] tcp_init_tso_segs net/ipv4/tcp_output.c:1619 [inline]
>>  [<ffffffff83286582>] tcp_write_xmit+0x3fc2/0x4cb0 net/ipv4/tcp_output.c:2056
>>  [<ffffffff83287a40>] __tcp_push_pending_frames+0xa0/0x290 net/ipv4/tcp_output.c:2307
>>  [<ffffffff8328e966>] tcp_send_fin+0x176/0xab0 net/ipv4/tcp_output.c:2883
>>  [<ffffffff8324c0d0>] tcp_close+0xca0/0xf70 net/ipv4/tcp.c:2112
>>  [<ffffffff832f8d0f>] inet_release+0xff/0x1d0 net/ipv4/af_inet.c:435
>>  [<ffffffff82f1a156>] sock_release+0x96/0x1c0 net/socket.c:586
>>  [<ffffffff82f1a296>] sock_close+0x16/0x20 net/socket.c:1037
>>  [<ffffffff81522da5>] __fput+0x235/0x6f0 fs/file_table.c:208
>>  [<ffffffff815232e5>] ____fput+0x15/0x20 fs/file_table.c:244
>>  [<ffffffff8118bd7f>] task_work_run+0x10f/0x190 kernel/task_work.c:115
>>  [<ffffffff81135285>] exit_task_work include/linux/task_work.h:21 [inline]
>>  [<ffffffff81135285>] do_exit+0x9e5/0x26b0 kernel/exit.c:759
>>  [<ffffffff8113b1d1>] do_group_exit+0x111/0x330 kernel/exit.c:889
>>  [<ffffffff8115e5cc>] get_signal+0x4ec/0x14b0 kernel/signal.c:2321
>>  [<ffffffff8100e02b>] do_signal+0x8b/0x1d30 arch/x86/kernel/signal.c:712
>>  [<ffffffff8100360a>] exit_to_usermode_loop+0x11a/0x160 arch/x86/entry/common.c:248
>>  [<ffffffff81006535>] prepare_exit_to_usermode arch/x86/entry/common.c:283 [inline]
>>  [<ffffffff81006535>] syscall_return_slowpath+0x1b5/0x1f0 arch/x86/entry/common.c:348
>>  [<ffffffff838c29b5>] int_ret_from_sys_call+0x25/0xa3
>>
>> Allocated by task 4194:
>>  [<ffffffff810341d6>] save_stack_trace+0x26/0x50 arch/x86/kernel/stacktrace.c:63
>>  [<ffffffff814f8873>] save_stack+0x43/0xd0 mm/kasan/kasan.c:512
>>  [<ffffffff814f8b57>] set_track mm/kasan/kasan.c:524 [inline]
>>  [<ffffffff814f8b57>] kasan_kmalloc+0xc7/0xe0 mm/kasan/kasan.c:616
>>  [<ffffffff814f9122>] kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:554
>>  [<ffffffff814f4c1e>] slab_post_alloc_hook mm/slub.c:1349 [inline]
>>  [<ffffffff814f4c1e>] slab_alloc_node mm/slub.c:2615 [inline]
>>  [<ffffffff814f4c1e>] slab_alloc mm/slub.c:2623 [inline]
>>  [<ffffffff814f4c1e>] kmem_cache_alloc+0xbe/0x2a0 mm/slub.c:2628
>>  [<ffffffff82f380a6>] kmem_cache_alloc_node include/linux/slab.h:350 [inline]
>>  [<ffffffff82f380a6>] __alloc_skb+0xe6/0x600 net/core/skbuff.c:218
>>  [<ffffffff832466c3>] alloc_skb_fclone include/linux/skbuff.h:856 [inline]
>>  [<ffffffff832466c3>] sk_stream_alloc_skb+0xa3/0x5d0 net/ipv4/tcp.c:833
>>  [<ffffffff83249164>] tcp_sendmsg+0xd34/0x2b00 net/ipv4/tcp.c:1178
>>  [<ffffffff83300ef3>] inet_sendmsg+0x203/0x4d0 net/ipv4/af_inet.c:755
>>  [<ffffffff82f1e1fc>] sock_sendmsg_nosec net/socket.c:625 [inline]
>>  [<ffffffff82f1e1fc>] sock_sendmsg+0xcc/0x110 net/socket.c:635
>>  [<ffffffff82f1eedc>] SYSC_sendto+0x21c/0x370 net/socket.c:1665
>>  [<ffffffff82f21560>] SyS_sendto+0x40/0x50 net/socket.c:1633
>>  [<ffffffff838c2825>] entry_SYSCALL_64_fastpath+0x22/0x9e
>>
>> Freed by task 4194:
>>  [<ffffffff810341d6>] save_stack_trace+0x26/0x50 arch/x86/kernel/stacktrace.c:63
>>  [<ffffffff814f8873>] save_stack+0x43/0xd0 mm/kasan/kasan.c:512
>>  [<ffffffff814f91a2>] set_track mm/kasan/kasan.c:524 [inline]
>>  [<ffffffff814f91a2>] kasan_slab_free+0x72/0xc0 mm/kasan/kasan.c:589
>>  [<ffffffff814f632e>] slab_free_hook mm/slub.c:1383 [inline]
>>  [<ffffffff814f632e>] slab_free_freelist_hook mm/slub.c:1405 [inline]
>>  [<ffffffff814f632e>] slab_free mm/slub.c:2859 [inline]
>>  [<ffffffff814f632e>] kmem_cache_free+0xbe/0x340 mm/slub.c:2881
>>  [<ffffffff82f3527f>] kfree_skbmem+0xcf/0x100 net/core/skbuff.c:635
>>  [<ffffffff82f372fd>] __kfree_skb+0x1d/0x20 net/core/skbuff.c:676
>>  [<ffffffff83288834>] sk_wmem_free_skb include/net/sock.h:1447 [inline]
>>  [<ffffffff83288834>] tcp_write_queue_purge include/net/tcp.h:1460 [inline]
>>  [<ffffffff83288834>] tcp_connect_init net/ipv4/tcp_output.c:3122 [inline]
>>  [<ffffffff83288834>] tcp_connect+0xb24/0x30c0 net/ipv4/tcp_output.c:3261
>>  [<ffffffff8329b991>] tcp_v4_connect+0xf31/0x1890 net/ipv4/tcp_ipv4.c:246
>>  [<ffffffff832f9ca9>] __inet_stream_connect+0x2a9/0xc30 net/ipv4/af_inet.c:615
>>  [<ffffffff832fa685>] inet_stream_connect+0x55/0xa0 net/ipv4/af_inet.c:676
>>  [<ffffffff82f1eb78>] SYSC_connect+0x1b8/0x300 net/socket.c:1557
>>  [<ffffffff82f214b4>] SyS_connect+0x24/0x30 net/socket.c:1538
>>  [<ffffffff838c2825>] entry_SYSCALL_64_fastpath+0x22/0x9e
>>
>> Syzkaller reproducer():
>> r0 = socket$packet(0x11, 0x3, 0x300)
>> r1 = socket$inet_tcp(0x2, 0x1, 0x0)
>> bind$inet(r1, &(0x7f0000000300)={0x2, 0x4e21, @multicast1}, 0x10)
>> connect$inet(r1, &(0x7f0000000140)={0x2, 0x1000004e21, @loopback}, 0x10)
>> recvmmsg(r1, &(0x7f0000001e40)=[{{0x0, 0x0, &(0x7f0000000100)=[{&(0x7f00000005c0)=""/88, 0x58}], 0x1}}], 0x1, 0x40000000, 0x0)
>> sendto$inet(r1, &(0x7f0000000000)="e2f7ad5b661c761edf", 0x9, 0x8080, 0x0, 0x0)
>> r2 = fcntl$dupfd(r1, 0x0, r0)
>> connect$unix(r2, &(0x7f00000001c0)=@file={0x0, './file0\x00'}, 0x6e)
>>
>> C repro link: https://syzkaller.appspot.com/text?tag=ReproC&x=14db474f800000
>>
>> This is because when tcp_connect_init call tcp_write_queue_purge, it will
>> kfree all the skb in the write_queue, but the sk->sk_send_head forget to set NULL,
>> then tcp_write_xmit try to send skb, which has freed in tcp_write_queue_purge, UAF happens.
>>
>> Signed-off-by: Mao Wenan <maowenan@huawei.com>
>> ---
>>  include/net/tcp.h | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/include/net/tcp.h b/include/net/tcp.h
>> index bf8a0dae977a..8f8aace28cf8 100644
>> --- a/include/net/tcp.h
>> +++ b/include/net/tcp.h
>> @@ -1457,6 +1457,7 @@ static inline void tcp_write_queue_purge(struct sock *sk)
>>  
>>  	while ((skb = __skb_dequeue(&sk->sk_write_queue)) != NULL)
>>  		sk_wmem_free_skb(sk, skb);
>> +	sk->sk_send_head = NULL;
>>  	sk_mem_reclaim(sk);
>>  	tcp_clear_all_retrans_hints(tcp_sk(sk));
>>  	inet_csk(sk)->icsk_backoff = 0;
> 
> Does this corrispond with a specific commit that is already in Linus's
> tree?  If not, why, did we change/mess something up when doing
> backports, or is the code just that different?
> 
> Also, is this needed in 4.9.y, 4.14.y, 4.19.y, and/or 5.2.y?  Why just
> 4.4.y?

Is it the commit 75c119afe14f? It does not use sk_send_head to indicate whether it has skb to be sent.

commit 75c119afe14f74b4dd967d75ed9f57ab6c0ef045
Author: Eric Dumazet <edumazet@google.com>
Date:   Thu Oct 5 22:21:27 2017 -0700

    tcp: implement rb-tree based retransmit queue


 static inline struct sk_buff *tcp_send_head(const struct sock *sk)
 {
-       return sk->sk_send_head;
+       return skb_peek(&sk->sk_write_queue);
 }



> 
> thanks,
> 
> greg k-h
> 
> .
> 


^ permalink raw reply

* Re: [RFC PATCH net-next 00/12] drop_monitor: Capture dropped packets and metadata
From: Toke Høiland-Jørgensen @ 2019-07-24  9:51 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: netdev, davem, nhorman, dsahern, roopa, nikolay, jakub.kicinski,
	andy, f.fainelli, andrew, vivien.didelot, mlxsw, Ido Schimmel
In-Reply-To: <20190724081042.GB15878@splinter>

Ido Schimmel <idosch@idosch.org> writes:

> On Tue, Jul 23, 2019 at 06:08:21PM +0200, Toke Høiland-Jørgensen wrote:
>> Also, presumably the queue will have to change from a struct
>> sk_buff_head to something that can hold XDP frames and whatever devlink
>> puts there as well, right?
>
> Good point!
>
> For HW drops we get an SKB and relevant metadata from devlink about why
> the packet was dropped etc. I plan to store a pointer to this metadata
> in the SKB control block.
>
> Let me see how the implementation goes. Even if use sk_buff_head for
> now, I will make sure that converting to a more generalized data
> structure is straightforward.

Yup, that's fine, just wanted to make sure you were aware of the
eventual need to change it :)

-Toke

^ permalink raw reply

* Re: [PATCH 4.4 stable net] net: tcp: Fix use-after-free in tcp_write_xmit
From: maowenan @ 2019-07-24 12:29 UTC (permalink / raw)
  To: Eric Dumazet, davem, gregkh, netdev, linux-kernel
In-Reply-To: <c7eb64e0-702d-0792-9aea-c303454e7a33@huawei.com>



On 2019/7/24 18:38, maowenan wrote:
> 
> 
> On 2019/7/24 18:13, Eric Dumazet wrote:
>>
>>
>> On 7/24/19 12:01 PM, Eric Dumazet wrote:
>>>
>>>
>>> On 7/24/19 11:17 AM, Mao Wenan wrote:
>>>> There is one report about tcp_write_xmit use-after-free with version 4.4.136:
>>>
>>> Current stable 4.4 is 4.4.186
>>>
>>> Can you check the bug is still there ?
>>>
>>
>> BTW, I tried the C repro and another bug showed up.
>>
>> It looks like 4.4.186 misses other fixes :/
> 
> I will try 4.4.186.

Hi Eric, bug exist in latest commit for 4.4 stable.
a3e421f Linux 4.4.186

> 
>>
>> [  180.811610] skbuff: skb_under_panic: text:ffffffff825ec6ea len:156 put:84 head:ffff8837dd1f0990 data:ffff8837dd1f098c tail:0x98 end:0xc0 dev:ip6gre0
>> [  180.825037] ------------[ cut here ]------------
>> [  180.829688] kernel BUG at net/core/skbuff.c:104!
>> [  180.834316] invalid opcode: 0000 [#1] SMP KASAN
>> [  180.839305] gsmi: Log Shutdown Reason 0x03
>> [  180.843426] Modules linked in: ipip bonding bridge stp llc tun veth w1_therm wire i2c_mux_pca954x i2c_mux cdc_acm ehci_pci ehci_hcd ip_gre mlx4_en ib_uverbs mlx4_ib ib_sa ib_mad ib_core ib_addr mlx4_core
>> [  180.862052] CPU: 22 PID: 1619 Comm: kworker/22:1 Not tainted 4.4.186-smp-DEV #41
>> [  180.869475] Hardware name: Intel BIOS 2.56.0 10/19/2018
>> [  180.876463] Workqueue: ipv6_addrconf addrconf_dad_work
>> [  180.881658] task: ffff8837f1f59d80 ti: ffff8837eeeb8000 task.ti: ffff8837eeeb8000
>> [  180.889171] RIP: 0010:[<ffffffff821ef26f>]  [<ffffffff821ef26f>] skb_panic+0x14f/0x210
>> [  180.897162] RSP: 0018:ffff8837eeebf4b8  EFLAGS: 00010282
>> [  180.902504] RAX: 0000000000000088 RBX: ffff8837eeeeb600 RCX: 0000000000000000
>> [  180.909645] RDX: 0000000000000000 RSI: 0000000000000246 RDI: ffffffff83508c00
>> [  180.916854] RBP: ffff8837eeebf520 R08: 0000000000000016 R09: 0000000000000000
>> [  180.924029] R10: ffff881fc8abf038 R11: 0000000000000007 R12: ffff881fc8abe720
>> [  180.931213] R13: ffffffff82aa9e80 R14: 00000000000000c0 R15: 0000000000000098
>> [  180.938390] FS:  0000000000000000(0000) GS:ffff8837ff280000(0000) knlGS:0000000000000000
>> [  180.946519] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [  180.952290] CR2: 00007f519426f530 CR3: 00000037d37f2000 CR4: 0000000000160670
>> [  180.959447] Stack:
>> [  180.961458]  ffff8837dd1f098c 0000000000000098 00000000000000c0 ffff881fc8abe720
>> [  180.968909]  ffffea00df747c00 ffff881fff404b40 ffff8837ff2a1a20 ffff8837eeebf5b8
>> [  180.976371]  ffff8837eeeeb600 ffffffff825ec6ea 1ffff106fddd7eb6 ffff8837eeeeb600
>> [  180.983848] Call Trace:
>> [  180.986297]  [<ffffffff825ec6ea>] ? ip6gre_header+0xba/0xd50
>> [  180.991962]  [<ffffffff821f0e01>] skb_push+0xc1/0x100
>> [  180.997023]  [<ffffffff825ec6ea>] ip6gre_header+0xba/0xd50
>> [  181.002519]  [<ffffffff8158dc16>] ? memcpy+0x36/0x40
>> [  181.007509]  [<ffffffff825ec630>] ? ip6gre_changelink+0x6d0/0x6d0
>> [  181.013629]  [<ffffffff82550741>] ? ndisc_constructor+0x5b1/0x770
>> [  181.019728]  [<ffffffff82666861>] ? _raw_write_unlock_bh+0x41/0x50
>> [  181.025924]  [<ffffffff8226540b>] ? __neigh_create+0xe6b/0x1670
>> [  181.031851]  [<ffffffff8225817f>] neigh_connected_output+0x23f/0x480
>> [  181.038219]  [<ffffffff824f61ec>] ip6_finish_output2+0x74c/0x1a90
>> [  181.044324]  [<ffffffff810f1d33>] ? print_context_stack+0x73/0xf0
>> [  181.050429]  [<ffffffff824f5aa0>] ? ip6_xmit+0x1700/0x1700
>> [  181.055933]  [<ffffffff82304a28>] ? nf_hook_slow+0x118/0x1b0
>> [  181.061617]  [<ffffffff82502d7a>] ip6_finish_output+0x2ba/0x580
>> [  181.067546]  [<ffffffff82503179>] ip6_output+0x139/0x380
>> [  181.072884]  [<ffffffff82503040>] ? ip6_finish_output+0x580/0x580
>> [  181.079004]  [<ffffffff82502ac0>] ? ip6_fragment+0x31b0/0x31b0
>> [  181.084852]  [<ffffffff82251b51>] ? dst_init+0x4b1/0x820
>> [  181.090172]  [<ffffffff8158da45>] ? kasan_unpoison_shadow+0x35/0x50
>> [  181.096437]  [<ffffffff8158da45>] ? kasan_unpoison_shadow+0x35/0x50
>> [  181.102712]  [<ffffffff8254f3ca>] NF_HOOK_THRESH.constprop.22+0xca/0x180
>> [  181.109421]  [<ffffffff8254f300>] ? ndisc_alloc_skb+0x340/0x340
>> [  181.115338]  [<ffffffff8254d820>] ? compat_ipv6_setsockopt+0x180/0x180
>> [  181.121874]  [<ffffffff8254fbc2>] ndisc_send_skb+0x742/0xd10
>> [  181.127550]  [<ffffffff8254f480>] ? NF_HOOK_THRESH.constprop.22+0x180/0x180
>> [  181.134516]  [<ffffffff821f2440>] ? skb_complete_tx_timestamp+0x280/0x280
>> [  181.141311]  [<ffffffff8254e2b3>] ? ndisc_fill_addr_option+0x193/0x260
>> [  181.147844]  [<ffffffff82553bd9>] ndisc_send_rs+0x179/0x2d0
>> [  181.153426]  [<ffffffff8251e7df>] addrconf_dad_completed+0x41f/0x7c0
>> [  181.159795]  [<ffffffff81297f78>] ? pick_next_entity+0x198/0x470
>> [  181.165807]  [<ffffffff8251e3c0>] ? addrconf_rs_timer+0x4a0/0x4a0
>> [  181.171918]  [<ffffffff81aab928>] ? find_next_bit+0x18/0x20
>> [  181.177504]  [<ffffffff81a99ec9>] ? prandom_seed+0xd9/0x160
>> [  181.183095]  [<ffffffff8251eef5>] addrconf_dad_work+0x375/0x9e0
>> [  181.189024]  [<ffffffff8251eb80>] ? addrconf_dad_completed+0x7c0/0x7c0
>> [  181.195576]  [<ffffffff81249d8f>] process_one_work+0x52f/0xf60
>> [  181.201468]  [<ffffffff8124a89d>] worker_thread+0xdd/0xe80
>> [  181.206977]  [<ffffffff8265cf0a>] ? __schedule+0x73a/0x16d0
>> [  181.212550]  [<ffffffff8124a7c0>] ? process_one_work+0xf60/0xf60
>> [  181.218572]  [<ffffffff8125a115>] kthread+0x205/0x2b0
>> [  181.223633]  [<ffffffff81259f10>] ? kthread_worker_fn+0x4e0/0x4e0
>> [  181.229743]  [<ffffffff81259f10>] ? kthread_worker_fn+0x4e0/0x4e0
>> [  181.235834]  [<ffffffff8266726f>] ret_from_fork+0x3f/0x70
>> [  181.241232]  [<ffffffff81259f10>] ? kthread_worker_fn+0x4e0/0x4e0
>>
>>
>> .
>>
> 
> 
> .
> 


^ permalink raw reply

* [PATCH] r8169: fix a typo in a comment
From: Corentin Musard @ 2019-07-24 12:34 UTC (permalink / raw)
  To: David S. Miller, netdev, linux-kernel; +Cc: trivial

Replace "additonal" by "additional" in a comment.
Typo found by checkpatch.pl.

Signed-off-by: Corentin Musard <corentinmusard@gmail.com>
---
 drivers/net/ethernet/realtek/r8169_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
index 0637c6752a78..7231ab3573ff 100644
--- a/drivers/net/ethernet/realtek/r8169_main.c
+++ b/drivers/net/ethernet/realtek/r8169_main.c
@@ -6334,7 +6334,7 @@ rtl8169_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats)
 	stats->multicast	= dev->stats.multicast;
 
 	/*
-	 * Fetch additonal counter values missing in stats collected by driver
+	 * Fetch additional counter values missing in stats collected by driver
 	 * from tally counters.
 	 */
 	if (pm_runtime_active(&pdev->dev))
-- 
2.22.0


^ permalink raw reply related

* [PATCH 1/2] net: mac80211: Fix possible null-pointer dereferences in ieee80211_setup_sdata()
From: Jia-Ju Bai @ 2019-07-24 12:36 UTC (permalink / raw)
  To: johannes, davem; +Cc: linux-wireless, netdev, linux-kernel, Jia-Ju Bai

In ieee80211_setup_sdata(), there is an if statement on line 1410 to
check whether sdata->dev is NULL:
    if (sdata->dev)

When sdata->dev is NULL, it is used on lines 1458 and 1459:
    sdata->dev->type = ARPHRD_IEEE80211_RADIOTAP;
    sdata->dev->netdev_ops = &ieee80211_monitorif_ops;

Thus, possible null-pointer dereferences may occur.

To fix these bugs, sdata->dev is checked before being used.

These bugs are found by a static analysis tool STCheck written by us.

Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
---
 net/mac80211/iface.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/net/mac80211/iface.c b/net/mac80211/iface.c
index 06aac0aaae64..e49264981a7b 100644
--- a/net/mac80211/iface.c
+++ b/net/mac80211/iface.c
@@ -1455,8 +1455,10 @@ static void ieee80211_setup_sdata(struct ieee80211_sub_if_data *sdata,
 			ieee80211_mesh_init_sdata(sdata);
 		break;
 	case NL80211_IFTYPE_MONITOR:
-		sdata->dev->type = ARPHRD_IEEE80211_RADIOTAP;
-		sdata->dev->netdev_ops = &ieee80211_monitorif_ops;
+		if (sdata->dev) {
+			sdata->dev->type = ARPHRD_IEEE80211_RADIOTAP;
+			sdata->dev->netdev_ops = &ieee80211_monitorif_ops;
+		}
 		sdata->u.mntr.flags = MONITOR_FLAG_CONTROL |
 				      MONITOR_FLAG_OTHER_BSS;
 		break;
-- 
2.17.0


^ permalink raw reply related

* [PATCH 2/2] net: mac80211: Fix possible null-pointer dereferences in ieee80211_xmit_fast_finish()
From: Jia-Ju Bai @ 2019-07-24 12:36 UTC (permalink / raw)
  To: johannes, davem; +Cc: linux-wireless, netdev, linux-kernel, Jia-Ju Bai

In ieee80211_xmit_fast_finish(), there is an if statement on line 3356
to check whether key is NULL:
    if (key)

When key is NULL, it is used on line 3388:
    switch (key->conf.cipher)
and line 3393:
    pn = atomic64_inc_return(&key->conf.tx_pn);
and line 3396:
    crypto_hdr[3] = 0x20 | (key->conf.keyidx << 6);

Thus, possible null-pointer dereferences may occur.

To fix these bugs, key is checked on line 3384.

These bugs are found by a static analysis tool STCheck written by us.

Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
---
 net/mac80211/tx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c
index f13eb2f61ccf..2cc261165b91 100644
--- a/net/mac80211/tx.c
+++ b/net/mac80211/tx.c
@@ -3381,7 +3381,7 @@ static void ieee80211_xmit_fast_finish(struct ieee80211_sub_if_data *sdata,
 	sta->tx_stats.bytes[skb_get_queue_mapping(skb)] += skb->len;
 	sta->tx_stats.packets[skb_get_queue_mapping(skb)]++;
 
-	if (pn_offs) {
+	if (key && pn_offs) {
 		u64 pn;
 		u8 *crypto_hdr = skb->data + pn_offs;
 
-- 
2.17.0


^ permalink raw reply related

* Re: [PATCH net-next 1/4] sctp: check addr_size with sa_family_t size in __sctp_setsockopt_connectx
From: Marcelo Ricardo Leitner @ 2019-07-24 12:36 UTC (permalink / raw)
  To: Neil Horman; +Cc: Xin Long, network dev, linux-sctp, davem
In-Reply-To: <20190724112235.GA7212@hmswarspite.think-freely.org>

On Wed, Jul 24, 2019 at 07:22:35AM -0400, Neil Horman wrote:
> On Wed, Jul 24, 2019 at 03:21:12PM +0800, Xin Long wrote:
> > On Tue, Jul 23, 2019 at 11:25 PM Neil Horman <nhorman@tuxdriver.com> wrote:
> > >
> > > On Tue, Jul 23, 2019 at 01:37:57AM +0800, Xin Long wrote:
> > > > Now __sctp_connect() is called by __sctp_setsockopt_connectx() and
> > > > sctp_inet_connect(), the latter has done addr_size check with size
> > > > of sa_family_t.
> > > >
> > > > In the next patch to clean up __sctp_connect(), we will remove
> > > > addr_size check with size of sa_family_t from __sctp_connect()
> > > > for the 1st address.
> > > >
> > > > So before doing that, __sctp_setsockopt_connectx() should do
> > > > this check first, as sctp_inet_connect() does.
> > > >
> > > > Signed-off-by: Xin Long <lucien.xin@gmail.com>
> > > > ---
> > > >  net/sctp/socket.c | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/net/sctp/socket.c b/net/sctp/socket.c
> > > > index aa80cda..5f92e4a 100644
> > > > --- a/net/sctp/socket.c
> > > > +++ b/net/sctp/socket.c
> > > > @@ -1311,7 +1311,7 @@ static int __sctp_setsockopt_connectx(struct sock *sk,
> > > >       pr_debug("%s: sk:%p addrs:%p addrs_size:%d\n",
> > > >                __func__, sk, addrs, addrs_size);
> > > >
> > > > -     if (unlikely(addrs_size <= 0))
> > > > +     if (unlikely(addrs_size < sizeof(sa_family_t)))
> > > I don't think this is what you want to check for here.  sa_family_t is
> > > an unsigned short, and addrs_size is the number of bytes in the addrs
> > > array.  The addrs array should be at least the size of one struct
> > > sockaddr (16 bytes iirc), and, if larger, should be a multiple of
> > > sizeof(struct sockaddr)
> > sizeof(struct sockaddr) is not the right value to check either.
> > 
> > The proper check will be done later in __sctp_connect():
> > 
> >         af = sctp_get_af_specific(daddr->sa.sa_family);
> >         if (!af || af->sockaddr_len > addrs_size)
> >                 return -EINVAL;
> > 
> > So the check 'addrs_size < sizeof(sa_family_t)' in this patch is
> > just to make sure daddr->sa.sa_family is accessible. the same
> > check is also done in sctp_inet_connect().
> > 
> That doesn't make much sense, if the proper check is done in __sctp_connect with
> the size of the families sockaddr_len, then we don't need this check at all, we
> can just let memdup_user take the fault on copy_to_user and return -EFAULT.  If
> we get that from memdup_user, we know its not accessible, and can bail out.
> 
> About the only thing we need to check for here is that addr_len isn't some
> absurdly high value (i.e. a negative value), so that we avoid trying to kmalloc
> upwards of 2G in memdup_user.  Your change does that just fine, but its no
> better or worse than checking for <=0

One can argue that such check against absurdly high values is random
and not effective, as 2G can be somewhat reasonable on 8GB systems but
certainly isn't on 512MB ones. On that, kmemdup_user() will also fail
gracefully as it uses GFP_USER and __GFP_NOWARN.

The original check is more for protecting for sane usage of the
variable, which is an int, and a negative value is questionable. We
could cast, yes, but.. was that really the intent of the application?
Probably not.

> 
> Neil
> 
> > >
> > > Neil
> > >
> > > >               return -EINVAL;
> > > >
> > > >       kaddrs = memdup_user(addrs, addrs_size);
> > > > --
> > > > 2.1.0
> > > >
> > > >
> > 

^ permalink raw reply

* Re: [PATCH 2/2] net: mac80211: Fix possible null-pointer dereferences in ieee80211_xmit_fast_finish()
From: Johannes Berg @ 2019-07-24 12:39 UTC (permalink / raw)
  To: Jia-Ju Bai, davem; +Cc: linux-wireless, netdev, linux-kernel
In-Reply-To: <20190724123633.10145-1-baijiaju1990@gmail.com>

On Wed, 2019-07-24 at 20:36 +0800, Jia-Ju Bai wrote:
> In ieee80211_xmit_fast_finish(), there is an if statement on line 3356
> to check whether key is NULL:
>     if (key)
> 
> When key is NULL, it is used on line 3388:
>     switch (key->conf.cipher)
> and line 3393:
>     pn = atomic64_inc_return(&key->conf.tx_pn);
> and line 3396:
>     crypto_hdr[3] = 0x20 | (key->conf.keyidx << 6);
> 
> Thus, possible null-pointer dereferences may occur.

No, this cannot happen, because pn_offs can only be non-zero when there
is a key. Maybe we should pass the fast_tx pointer instead of the
pn_offs/key from it, but the two are coupled.

johannes


^ permalink raw reply

* Re: [PATCH 1/2] net: mac80211: Fix possible null-pointer dereferences in ieee80211_setup_sdata()
From: Johannes Berg @ 2019-07-24 12:40 UTC (permalink / raw)
  To: Jia-Ju Bai, davem; +Cc: linux-wireless, netdev, linux-kernel
In-Reply-To: <20190724123623.10093-1-baijiaju1990@gmail.com>

On Wed, 2019-07-24 at 20:36 +0800, Jia-Ju Bai wrote:
> In ieee80211_setup_sdata(), there is an if statement on line 1410 to
> check whether sdata->dev is NULL:
>     if (sdata->dev)
> 
> When sdata->dev is NULL, it is used on lines 1458 and 1459:
>     sdata->dev->type = ARPHRD_IEEE80211_RADIOTAP;
>     sdata->dev->netdev_ops = &ieee80211_monitorif_ops;
> 
> Thus, possible null-pointer dereferences may occur.

No, this cannot happen, monitor interfaces (NL80211_IFTYPE_MONITOR) must
have a valid ->dev, only P2P device and NAN might not.

johannes


^ permalink raw reply

* Re: [PATCH net-next 1/4] sctp: check addr_size with sa_family_t size in __sctp_setsockopt_connectx
From: Marcelo Ricardo Leitner @ 2019-07-24 12:49 UTC (permalink / raw)
  To: Neil Horman; +Cc: Xin Long, network dev, linux-sctp, davem
In-Reply-To: <20190724123650.GD6204@localhost.localdomain>

On Wed, Jul 24, 2019 at 09:36:50AM -0300, Marcelo Ricardo Leitner wrote:
> On Wed, Jul 24, 2019 at 07:22:35AM -0400, Neil Horman wrote:
> > On Wed, Jul 24, 2019 at 03:21:12PM +0800, Xin Long wrote:
> > > On Tue, Jul 23, 2019 at 11:25 PM Neil Horman <nhorman@tuxdriver.com> wrote:
> > > >
> > > > On Tue, Jul 23, 2019 at 01:37:57AM +0800, Xin Long wrote:
> > > > > Now __sctp_connect() is called by __sctp_setsockopt_connectx() and
> > > > > sctp_inet_connect(), the latter has done addr_size check with size
> > > > > of sa_family_t.
> > > > >
> > > > > In the next patch to clean up __sctp_connect(), we will remove
> > > > > addr_size check with size of sa_family_t from __sctp_connect()
> > > > > for the 1st address.
> > > > >
> > > > > So before doing that, __sctp_setsockopt_connectx() should do
> > > > > this check first, as sctp_inet_connect() does.
> > > > >
> > > > > Signed-off-by: Xin Long <lucien.xin@gmail.com>
> > > > > ---
> > > > >  net/sctp/socket.c | 2 +-
> > > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/net/sctp/socket.c b/net/sctp/socket.c
> > > > > index aa80cda..5f92e4a 100644
> > > > > --- a/net/sctp/socket.c
> > > > > +++ b/net/sctp/socket.c
> > > > > @@ -1311,7 +1311,7 @@ static int __sctp_setsockopt_connectx(struct sock *sk,
> > > > >       pr_debug("%s: sk:%p addrs:%p addrs_size:%d\n",
> > > > >                __func__, sk, addrs, addrs_size);
> > > > >
> > > > > -     if (unlikely(addrs_size <= 0))
> > > > > +     if (unlikely(addrs_size < sizeof(sa_family_t)))
> > > > I don't think this is what you want to check for here.  sa_family_t is
> > > > an unsigned short, and addrs_size is the number of bytes in the addrs
> > > > array.  The addrs array should be at least the size of one struct
> > > > sockaddr (16 bytes iirc), and, if larger, should be a multiple of
> > > > sizeof(struct sockaddr)
> > > sizeof(struct sockaddr) is not the right value to check either.
> > > 
> > > The proper check will be done later in __sctp_connect():
> > > 
> > >         af = sctp_get_af_specific(daddr->sa.sa_family);
> > >         if (!af || af->sockaddr_len > addrs_size)
> > >                 return -EINVAL;
> > > 
> > > So the check 'addrs_size < sizeof(sa_family_t)' in this patch is
> > > just to make sure daddr->sa.sa_family is accessible. the same
> > > check is also done in sctp_inet_connect().
> > > 
> > That doesn't make much sense, if the proper check is done in __sctp_connect with
> > the size of the families sockaddr_len, then we don't need this check at all, we
> > can just let memdup_user take the fault on copy_to_user and return -EFAULT.  If
> > we get that from memdup_user, we know its not accessible, and can bail out.
> > 
> > About the only thing we need to check for here is that addr_len isn't some
> > absurdly high value (i.e. a negative value), so that we avoid trying to kmalloc
> > upwards of 2G in memdup_user.  Your change does that just fine, but its no
> > better or worse than checking for <=0
> 
> One can argue that such check against absurdly high values is random
> and not effective, as 2G can be somewhat reasonable on 8GB systems but
> certainly isn't on 512MB ones. On that, kmemdup_user() will also fail
> gracefully as it uses GFP_USER and __GFP_NOWARN.
> 
> The original check is more for protecting for sane usage of the
> variable, which is an int, and a negative value is questionable. We
> could cast, yes, but.. was that really the intent of the application?
> Probably not.

Though that said, I'm okay with the new check here: a quick sanity
check that can avoid expensive calls to kmalloc(), while more refined
check is done later on.

> 
> > 
> > Neil
> > 
> > > >
> > > > Neil
> > > >
> > > > >               return -EINVAL;
> > > > >
> > > > >       kaddrs = memdup_user(addrs, addrs_size);
> > > > > --
> > > > > 2.1.0
> > > > >
> > > > >
> > > 

^ permalink raw reply

* Re: [RFC PATCH net-next 10/12] drop_monitor: Add packet alert mode
From: Jiri Pirko @ 2019-07-24 12:53 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: netdev, davem, nhorman, dsahern, roopa, nikolay, jakub.kicinski,
	toke, andy, f.fainelli, andrew, vivien.didelot, mlxsw,
	Ido Schimmel
In-Reply-To: <20190722183134.14516-11-idosch@idosch.org>

Mon, Jul 22, 2019 at 08:31:32PM CEST, idosch@idosch.org wrote:
>From: Ido Schimmel <idosch@mellanox.com>
>
>So far drop monitor supported only one alert mode in which a summary of
>locations in which packets were recently dropped was sent to user space.
>
>This alert mode is sufficient in order to understand that packets were
>dropped, but lacks information to perform a more detailed analysis.
>
>Add a new alert mode in which the dropped packet itself is passed to
>user space along with metadata: The drop location (as program counter
>and resolved symbol), ingress / egress netdevice and arrival / departure
>timestamp. More metadata can be added in the future.
>
>To avoid performing expensive operations in the context in which
>kfree_skb() is invoked (can be hard IRQ), the dropped skb is cloned and
>queued on per-CPU skb drop list. Then, in process context the netlink
>message is allocated, prepared and finally sent to user space.
>
>Signed-off-by: Ido Schimmel <idosch@mellanox.com>
>---
> include/uapi/linux/net_dropmon.h |  29 ++++
> net/core/drop_monitor.c          | 287 ++++++++++++++++++++++++++++++-
> 2 files changed, 308 insertions(+), 8 deletions(-)
>
>diff --git a/include/uapi/linux/net_dropmon.h b/include/uapi/linux/net_dropmon.h
>index 5edbd0a675fd..7708c8a440a1 100644
>--- a/include/uapi/linux/net_dropmon.h
>+++ b/include/uapi/linux/net_dropmon.h
>@@ -53,6 +53,7 @@ enum {
> 	NET_DM_CMD_CONFIG,
> 	NET_DM_CMD_START,
> 	NET_DM_CMD_STOP,
>+	NET_DM_CMD_PACKET_ALERT,
> 	_NET_DM_CMD_MAX,
> };
> 
>@@ -62,4 +63,32 @@ enum {
>  * Our group identifiers
>  */
> #define NET_DM_GRP_ALERT 1
>+
>+enum net_dm_attr {
>+	NET_DM_ATTR_UNSPEC,
>+
>+	NET_DM_ATTR_ALERT_MODE,			/* u8 */
>+	NET_DM_ATTR_PC,				/* u64 */
>+	NET_DM_ATTR_SYMBOL,			/* string */
>+	NET_DM_ATTR_NETDEV_IFINDEX,		/* u32 */
>+	NET_DM_ATTR_NETDEV_NAME,		/* string */
>+	NET_DM_ATTR_TIMESTAMP,			/* struct timespec */
>+	NET_DM_ATTR_PAYLOAD,			/* binary */
>+	NET_DM_ATTR_PAD,
>+
>+	__NET_DM_ATTR_MAX,
>+	NET_DM_ATTR_MAX = __NET_DM_ATTR_MAX - 1
>+};
>+
>+/**
>+ * enum net_dm_alert_mode - Alert mode.
>+ * @NET_DM_ALERT_MODE_SUMMARY: A summary of recent drops is sent to user space.
>+ * @NET_DM_ALERT_MODE_PACKET: Each dropped packet is sent to user space along
>+ *                            with metadata.
>+ */
>+enum net_dm_alert_mode {
>+	NET_DM_ALERT_MODE_SUMMARY,
>+	NET_DM_ALERT_MODE_PACKET,
>+};
>+
> #endif
>diff --git a/net/core/drop_monitor.c b/net/core/drop_monitor.c
>index f441fb653567..512935fc235b 100644
>--- a/net/core/drop_monitor.c
>+++ b/net/core/drop_monitor.c
>@@ -54,6 +54,7 @@ static DEFINE_MUTEX(net_dm_mutex);
> struct per_cpu_dm_data {
> 	spinlock_t		lock;	/* Protects 'skb' and 'send_timer' */
> 	struct sk_buff		*skb;
>+	struct sk_buff_head	drop_queue;
> 	struct work_struct	dm_alert_work;
> 	struct timer_list	send_timer;
> };
>@@ -75,6 +76,22 @@ static int dm_delay = 1;
> static unsigned long dm_hw_check_delta = 2*HZ;
> static LIST_HEAD(hw_stats_list);
> 
>+static enum net_dm_alert_mode net_dm_alert_mode = NET_DM_ALERT_MODE_SUMMARY;
>+
>+struct net_dm_skb_cb {
>+	void *pc;
>+};
>+
>+#define NET_DM_SKB_CB(__skb) ((struct net_dm_skb_cb *)&((__skb)->cb[0]))
>+
>+struct net_dm_alert_ops {
>+	void (*kfree_skb_probe)(void *ignore, struct sk_buff *skb,
>+				void *location);
>+	void (*napi_poll_probe)(void *ignore, struct napi_struct *napi,
>+				int work, int budget);
>+	void (*work_item_func)(struct work_struct *work);
>+};
>+
> static int net_dm_nl_pre_doit(const struct genl_ops *ops,
> 			      struct sk_buff *skb, struct genl_info *info)
> {
>@@ -255,10 +272,194 @@ static void trace_napi_poll_hit(void *ignore, struct napi_struct *napi,
> 	rcu_read_unlock();
> }
> 
>+static const struct net_dm_alert_ops net_dm_alert_summary_ops = {
>+	.kfree_skb_probe	= trace_kfree_skb_hit,
>+	.napi_poll_probe	= trace_napi_poll_hit,
>+	.work_item_func		= send_dm_alert,
>+};
>+
>+static void net_dm_packet_trace_kfree_skb_hit(void *ignore,
>+					      struct sk_buff *skb,
>+					      void *location)
>+{
>+	struct per_cpu_dm_data *data;
>+	struct sk_buff *nskb;
>+	unsigned long flags;
>+
>+	nskb = skb_clone(skb, GFP_ATOMIC);
>+	if (!nskb)
>+		return;
>+
>+	NET_DM_SKB_CB(nskb)->pc = location;
>+	if (nskb->dev)
>+		dev_hold(nskb->dev);
>+
>+	data = this_cpu_ptr(&dm_cpu_data);
>+	spin_lock_irqsave(&data->drop_queue.lock, flags);
>+
>+	__skb_queue_tail(&data->drop_queue, nskb);
>+
>+	if (!timer_pending(&data->send_timer)) {
>+		data->send_timer.expires = jiffies + dm_delay * HZ;
>+		add_timer(&data->send_timer);
>+	}
>+
>+	spin_unlock_irqrestore(&data->drop_queue.lock, flags);
>+}
>+
>+static void net_dm_packet_trace_napi_poll_hit(void *ignore,
>+					      struct napi_struct *napi,
>+					      int work, int budget)
>+{
>+}
>+
>+#define NET_DM_MAX_SYMBOL_LEN 40
>+
>+static size_t net_dm_packet_report_size(size_t payload_len)
>+{
>+	size_t size;
>+
>+	size = nlmsg_msg_size(GENL_HDRLEN + net_drop_monitor_family.hdrsize);
>+
>+	return NLMSG_ALIGN(size) +
>+	       /* NET_DM_ATTR_PC */
>+	       nla_total_size(sizeof(u64)) +
>+	       /* NET_DM_ATTR_SYMBOL */
>+	       nla_total_size(NET_DM_MAX_SYMBOL_LEN + 1) +
>+	       /* NET_DM_ATTR_NETDEV_IFINDEX */
>+	       nla_total_size(sizeof(u32)) +
>+	       /* NET_DM_ATTR_NETDEV_NAME */
>+	       nla_total_size(IFNAMSIZ + 1) +
>+	       /* NET_DM_ATTR_TIMESTAMP */
>+	       nla_total_size(sizeof(struct timespec)) +
>+	       /* NET_DM_ATTR_PAYLOAD */
>+	       nla_total_size(payload_len);
>+}
>+
>+static int net_dm_packet_report_fill(struct sk_buff *msg, struct sk_buff *skb,
>+				     size_t payload_len)
>+{
>+	u64 pc = (u64)(uintptr_t) NET_DM_SKB_CB(skb)->pc;
>+	char buf[NET_DM_MAX_SYMBOL_LEN];
>+	struct nlattr *attr;
>+	struct timespec ts;
>+	void *hdr;
>+
>+	hdr = genlmsg_put(msg, 0, 0, &net_drop_monitor_family, 0,
>+			  NET_DM_CMD_PACKET_ALERT);
>+	if (!hdr)
>+		return -EMSGSIZE;
>+
>+	if (nla_put_u64_64bit(msg, NET_DM_ATTR_PC, pc, NET_DM_ATTR_PAD))
>+		goto nla_put_failure;
>+
>+	snprintf(buf, sizeof(buf), "%pS", NET_DM_SKB_CB(skb)->pc);
>+	if (nla_put_string(msg, NET_DM_ATTR_SYMBOL, buf))
>+		goto nla_put_failure;
>+
>+	if (skb->dev &&
>+	    nla_put_u32(msg, NET_DM_ATTR_NETDEV_IFINDEX, skb->dev->ifindex))
>+		goto nla_put_failure;
>+
>+	if (skb->dev &&
>+	    nla_put_string(msg, NET_DM_ATTR_NETDEV_NAME, skb->dev->name))
>+		goto nla_put_failure;
>+
>+	if (ktime_to_timespec_cond(skb->tstamp, &ts) &&
>+	    nla_put(msg, NET_DM_ATTR_TIMESTAMP, sizeof(ts), &ts))
>+		goto nla_put_failure;
>+
>+	if (!payload_len)
>+		goto out;
>+
>+	attr = skb_put(msg, nla_total_size(payload_len));
>+	attr->nla_type = NET_DM_ATTR_PAYLOAD;
>+	attr->nla_len = nla_attr_size(payload_len);
>+	if (skb_copy_bits(skb, 0, nla_data(attr), payload_len))
>+		goto nla_put_failure;
>+
>+out:
>+	genlmsg_end(msg, hdr);
>+
>+	return 0;
>+
>+nla_put_failure:
>+	genlmsg_cancel(msg, hdr);
>+	return -EMSGSIZE;
>+}
>+
>+#define NET_DM_MAX_PACKET_SIZE (0xffff - NLA_HDRLEN - NLA_ALIGNTO)
>+
>+static void net_dm_packet_report(struct sk_buff *skb)
>+{
>+	struct sk_buff *msg;
>+	size_t payload_len;
>+	int rc;
>+
>+	/* Make sure we start copying the packet from the MAC header */
>+	if (skb->data > skb_mac_header(skb))
>+		skb_push(skb, skb->data - skb_mac_header(skb));
>+	else
>+		skb_pull(skb, skb_mac_header(skb) - skb->data);
>+
>+	/* Ensure packet fits inside a single netlink attribute */
>+	payload_len = min_t(size_t, skb->len, NET_DM_MAX_PACKET_SIZE);
>+
>+	msg = nlmsg_new(net_dm_packet_report_size(payload_len), GFP_KERNEL);
>+	if (!msg)
>+		goto out;
>+
>+	rc = net_dm_packet_report_fill(msg, skb, payload_len);
>+	if (rc) {
>+		nlmsg_free(msg);
>+		goto out;
>+	}
>+
>+	genlmsg_multicast(&net_drop_monitor_family, msg, 0, 0, GFP_KERNEL);
>+
>+out:
>+	if (skb->dev)
>+		dev_put(skb->dev);
>+	consume_skb(skb);
>+}
>+
>+static void net_dm_packet_work(struct work_struct *work)
>+{
>+	struct per_cpu_dm_data *data;
>+	struct sk_buff_head list;
>+	struct sk_buff *skb;
>+	unsigned long flags;
>+
>+	data = container_of(work, struct per_cpu_dm_data, dm_alert_work);
>+
>+	__skb_queue_head_init(&list);
>+
>+	spin_lock_irqsave(&data->drop_queue.lock, flags);
>+	skb_queue_splice_tail_init(&data->drop_queue, &list);
>+	spin_unlock_irqrestore(&data->drop_queue.lock, flags);
>+
>+	while ((skb = __skb_dequeue(&list)))
>+		net_dm_packet_report(skb);
>+}
>+
>+static const struct net_dm_alert_ops net_dm_alert_packet_ops = {
>+	.kfree_skb_probe	= net_dm_packet_trace_kfree_skb_hit,
>+	.napi_poll_probe	= net_dm_packet_trace_napi_poll_hit,
>+	.work_item_func		= net_dm_packet_work,
>+};
>+
>+static const struct net_dm_alert_ops *net_dm_alert_ops_arr[] = {
>+	[NET_DM_ALERT_MODE_SUMMARY]	= &net_dm_alert_summary_ops,
>+	[NET_DM_ALERT_MODE_PACKET]	= &net_dm_alert_packet_ops,
>+};

Please split this patch into 2:
1) introducing the ops and modes (only summary)
2) introducing the packet mode


>+
> static int net_dm_trace_on_set(struct netlink_ext_ack *extack)
> {
>+	const struct net_dm_alert_ops *ops;
> 	int cpu, rc;
> 
>+	ops = net_dm_alert_ops_arr[net_dm_alert_mode];
>+
> 	if (!try_module_get(THIS_MODULE)) {
> 		NL_SET_ERR_MSG_MOD(extack, "Failed to take reference on module");
> 		return -ENODEV;
>@@ -267,17 +468,17 @@ static int net_dm_trace_on_set(struct netlink_ext_ack *extack)
> 	for_each_possible_cpu(cpu) {
> 		struct per_cpu_dm_data *data = &per_cpu(dm_cpu_data, cpu);
> 
>-		INIT_WORK(&data->dm_alert_work, send_dm_alert);
>+		INIT_WORK(&data->dm_alert_work, ops->work_item_func);
> 		timer_setup(&data->send_timer, sched_send_work, 0);
> 	}
> 
>-	rc = register_trace_kfree_skb(trace_kfree_skb_hit, NULL);
>+	rc = register_trace_kfree_skb(ops->kfree_skb_probe, NULL);
> 	if (rc) {
> 		NL_SET_ERR_MSG_MOD(extack, "Failed to connect probe to kfree_skb() tracepoint");
> 		goto err_module_put;
> 	}
> 
>-	rc = register_trace_napi_poll(trace_napi_poll_hit, NULL);
>+	rc = register_trace_napi_poll(ops->napi_poll_probe, NULL);
> 	if (rc) {
> 		NL_SET_ERR_MSG_MOD(extack, "Failed to connect probe to napi_poll() tracepoint");
> 		goto err_unregister_trace;
>@@ -286,7 +487,7 @@ static int net_dm_trace_on_set(struct netlink_ext_ack *extack)
> 	return 0;
> 
> err_unregister_trace:
>-	unregister_trace_kfree_skb(trace_kfree_skb_hit, NULL);
>+	unregister_trace_kfree_skb(ops->kfree_skb_probe, NULL);
> err_module_put:
> 	module_put(THIS_MODULE);
> 	return rc;
>@@ -295,10 +496,13 @@ static int net_dm_trace_on_set(struct netlink_ext_ack *extack)
> static void net_dm_trace_off_set(void)
> {
> 	struct dm_hw_stat_delta *new_stat, *temp;
>+	const struct net_dm_alert_ops *ops;
> 	int cpu;
> 
>-	unregister_trace_napi_poll(trace_napi_poll_hit, NULL);
>-	unregister_trace_kfree_skb(trace_kfree_skb_hit, NULL);
>+	ops = net_dm_alert_ops_arr[net_dm_alert_mode];
>+
>+	unregister_trace_napi_poll(ops->napi_poll_probe, NULL);
>+	unregister_trace_kfree_skb(ops->kfree_skb_probe, NULL);
> 
> 	tracepoint_synchronize_unregister();
> 
>@@ -307,9 +511,18 @@ static void net_dm_trace_off_set(void)
> 	 */
> 	for_each_possible_cpu(cpu) {
> 		struct per_cpu_dm_data *data = &per_cpu(dm_cpu_data, cpu);
>+		struct sk_buff *skb;
> 
> 		del_timer_sync(&data->send_timer);
> 		cancel_work_sync(&data->dm_alert_work);
>+		/* If we deactivated a pending timer, some packets are still
>+		 * queued and we need to free them.
>+		 */
>+		while ((skb = __skb_dequeue(&data->drop_queue))) {
>+			if (skb->dev)
>+				dev_put(skb->dev);
>+			consume_skb(skb);
>+		}
> 	}
> 
> 	list_for_each_entry_safe(new_stat, temp, &hw_stats_list, list) {
>@@ -351,12 +564,61 @@ static int set_all_monitor_traces(int state, struct netlink_ext_ack *extack)
> 	return rc;
> }
> 
>+static int net_dm_alert_mode_get_from_info(struct genl_info *info,
>+					   enum net_dm_alert_mode *p_alert_mode)
>+{
>+	u8 val;
>+
>+	val = nla_get_u8(info->attrs[NET_DM_ATTR_ALERT_MODE]);
>+
>+	switch (val) {
>+	case NET_DM_ALERT_MODE_SUMMARY: /* fall-through */
>+	case NET_DM_ALERT_MODE_PACKET:
>+		*p_alert_mode = val;
>+		break;
>+	default:
>+		return -EINVAL;
>+	}
>+
>+	return 0;
>+}
>+
>+static int net_dm_alert_mode_set(struct genl_info *info)
>+{
>+	struct netlink_ext_ack *extack = info->extack;
>+	enum net_dm_alert_mode alert_mode;
>+	int rc;
>+
>+	if (!info->attrs[NET_DM_ATTR_ALERT_MODE])
>+		return 0;
>+
>+	rc = net_dm_alert_mode_get_from_info(info, &alert_mode);
>+	if (rc) {
>+		NL_SET_ERR_MSG_MOD(extack, "Invalid alert mode");
>+		return -EINVAL;
>+	}
>+
>+	net_dm_alert_mode = alert_mode;

2 things:
1) Shouldn't you check if the tracing is on and return -EBUSY in case it is?
2) You setup the mode globally. I guess it is fine and it does not make
   sense to do it otherwise, right? Like per-net or something.


>+
>+	return 0;
>+}
>+
> static int net_dm_cmd_config(struct sk_buff *skb,
> 			struct genl_info *info)
> {
>-	NL_SET_ERR_MSG_MOD(info->extack, "Command not supported");
>+	struct netlink_ext_ack *extack = info->extack;
>+	int rc;
> 
>-	return -EOPNOTSUPP;
>+	if (trace_state == TRACE_ON) {
>+		NL_SET_ERR_MSG_MOD(extack, "Cannot configure drop monitor while tracing is on");
>+		return -EOPNOTSUPP;
>+	}
>+
>+	rc = net_dm_alert_mode_set(info);
>+	if (rc)
>+		return rc;
>+
>+	return 0;
> }
> 
> static int net_dm_cmd_trace(struct sk_buff *skb,
>@@ -411,6 +673,11 @@ static int dropmon_net_event(struct notifier_block *ev_block,
> 	return NOTIFY_DONE;
> }
> 
>+static const struct nla_policy net_dm_nl_policy[NET_DM_ATTR_MAX + 1] = {
>+	[NET_DM_ATTR_UNSPEC] = { .strict_start_type = NET_DM_ATTR_UNSPEC + 1 },
>+	[NET_DM_ATTR_ALERT_MODE] = { .type = NLA_U8 },
>+};
>+
> static const struct genl_ops dropmon_ops[] = {
> 	{
> 		.cmd = NET_DM_CMD_CONFIG,
>@@ -434,6 +701,8 @@ static struct genl_family net_drop_monitor_family __ro_after_init = {
> 	.hdrsize        = 0,
> 	.name           = "NET_DM",
> 	.version        = 2,
>+	.maxattr	= NET_DM_ATTR_MAX,
>+	.policy		= net_dm_nl_policy,
> 	.pre_doit	= net_dm_nl_pre_doit,
> 	.post_doit	= net_dm_nl_post_doit,
> 	.module		= THIS_MODULE,
>@@ -479,6 +748,7 @@ static int __init init_net_drop_monitor(void)
> 		INIT_WORK(&data->dm_alert_work, send_dm_alert);
> 		timer_setup(&data->send_timer, sched_send_work, 0);
> 		spin_lock_init(&data->lock);
>+		skb_queue_head_init(&data->drop_queue);
> 		reset_per_cpu_data(data);
> 	}
> 
>@@ -509,6 +779,7 @@ static void exit_net_drop_monitor(void)
> 		 * to this struct and can free the skb inside it
> 		 */
> 		kfree_skb(data->skb);
>+		WARN_ON(!skb_queue_empty(&data->drop_queue));
> 	}
> 
> 	BUG_ON(genl_unregister_family(&net_drop_monitor_family));
>-- 
>2.21.0
>

^ permalink raw reply

* Re: [PATCH] net: 9p: Fix possible null-pointer dereferences in p9_cm_event_handler()
From: Dominique Martinet @ 2019-07-24 12:55 UTC (permalink / raw)
  To: Jia-Ju Bai; +Cc: ericvh, lucho, davem, v9fs-developer, netdev, linux-kernel
In-Reply-To: <20190724103948.5834-1-baijiaju1990@gmail.com>

Jia-Ju Bai wrote on Wed, Jul 24, 2019:
> In p9_cm_event_handler(), there is an if statement on 260 to check
> whether rdma is NULL, which indicates that rdma can be NULL.
> If so, using rdma->xxx may cause a possible null-pointer dereference.

The final dereference (complete(&rdma->cm_done) line 285) has been here
from the start, so we would have seen crashes by now if rdma could be
null at this point.

Let's do it the other way around and remove the useless "if (rdma)" that
has been here from day 1 instead ; I basically did the same with
c->status a few months ago (from a coverity report)...


That said, please note that 'rdma checked for null in this event->event
== RDMA_CM_EVENT_DISCONNECTED branch' does not mean rdma can be null in
other branches.
static analysis does not say anything more than the final complete()
should also have a check if the previous one has, but nothing about the
other cases in the switch.


Thanks,
-- 
Dominique

^ permalink raw reply

* RE: [PATCH net-next 1/3] enetc: Add mdio bus driver for the PCIe MDIO endpoint
From: Claudiu Manoil @ 2019-07-24 12:57 UTC (permalink / raw)
  To: Claudiu Manoil, Andrew Lunn
  Cc: David S . Miller, devicetree@vger.kernel.org,
	netdev@vger.kernel.org, Alexandru Marginean,
	linux-kernel@vger.kernel.org, Leo Li, Rob Herring,
	linux-arm-kernel@lists.infradead.org
In-Reply-To: <VI1PR04MB4880DAE881769F6DC845CEEF96C60@VI1PR04MB4880.eurprd04.prod.outlook.com>



>-----Original Message-----
>From: netdev-owner@vger.kernel.org <netdev-owner@vger.kernel.org> On
>Behalf Of Claudiu Manoil
>Sent: Wednesday, July 24, 2019 12:53 PM
>To: Andrew Lunn <andrew@lunn.ch>
>Cc: David S . Miller <davem@davemloft.net>; devicetree@vger.kernel.org;
>netdev@vger.kernel.org; Alexandru Marginean
><alexandru.marginean@nxp.com>; linux-kernel@vger.kernel.org; Leo Li
><leoyang.li@nxp.com>; Rob Herring <robh+dt@kernel.org>; linux-arm-
>kernel@lists.infradead.org
>Subject: RE: [PATCH net-next 1/3] enetc: Add mdio bus driver for the PCIe MDIO
>endpoint
>
>>-----Original Message-----
>>From: Andrew Lunn <andrew@lunn.ch>
>>Sent: Wednesday, July 24, 2019 1:25 AM
>>To: Claudiu Manoil <claudiu.manoil@nxp.com>
>>Cc: David S . Miller <davem@davemloft.net>; devicetree@vger.kernel.org;
>>netdev@vger.kernel.org; Alexandru Marginean
>><alexandru.marginean@nxp.com>; linux-kernel@vger.kernel.org; Leo Li
>><leoyang.li@nxp.com>; Rob Herring <robh+dt@kernel.org>; linux-arm-
>>kernel@lists.infradead.org
>>Subject: Re: [PATCH net-next 1/3] enetc: Add mdio bus driver for the
>>PCIe MDIO endpoint
>>
>>> +	bus = mdiobus_alloc_size(sizeof(u32 *));
>>> +	if (!bus)
>>> +		return -ENOMEM;
>>> +
>>
>>> +	bus->priv = pci_iomap_range(pdev, 0, ENETC_MDIO_REG_OFFSET, 0);
>>
>>This got me confused for a while. You allocate space for a u32 pointer.
>>bus->priv will point to this space. However, you are not using this
>>space, you {ab}use the pointer to directly hold the return from
>>pci_iomap_range(). This works, but sparse is probably unhappy, and you
>>are wasting the space the u32 pointer takes.
>>
>
>Thanks Andrew,
>This is not what I wanted to do, don't ask me how I got to this, it's confusing
>indeed.
>What's needed here is mdiobus_alloc() or better, devm_mdiobus_alloc().
>I've got to do some cleanup in the local mdio bus probing too.
>Will send v2.
>

This is tricky actually, mdiobus_alloc won't do it, I still need to allocate space
for the pointer.
So it's not ok to store the register space pointer directly into priv
(even if iomap returns void *), it's unusual.
Looks like I will have to use double pointers!

^ permalink raw reply

* Re: [RFC PATCH net-next 00/12] drop_monitor: Capture dropped packets and metadata
From: Jiri Pirko @ 2019-07-24 12:58 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: Toke Høiland-Jørgensen, netdev, davem, nhorman, dsahern,
	roopa, nikolay, jakub.kicinski, andy, f.fainelli, andrew,
	vivien.didelot, mlxsw, Ido Schimmel
In-Reply-To: <20190723151423.GA10342@splinter>

Tue, Jul 23, 2019 at 05:14:23PM CEST, idosch@idosch.org wrote:
>On Tue, Jul 23, 2019 at 02:17:49PM +0200, Toke Høiland-Jørgensen wrote:
>> Ido Schimmel <idosch@idosch.org> writes:
>> 
>> > On Mon, Jul 22, 2019 at 09:43:15PM +0200, Toke Høiland-Jørgensen wrote:
>> >> Is there a mechanism for the user to filter the packets before they are
>> >> sent to userspace? A bpf filter would be the obvious choice I guess...
>> >
>> > Hi Toke,
>> >
>> > Yes, it's on my TODO list to write an eBPF program that only lets
>> > "unique" packets to be enqueued on the netlink socket. Where "unique" is
>> > defined as {5-tuple, PC}. The rest of the copies will be counted in an
>> > eBPF map, which is just a hash table keyed by {5-tuple, PC}.
>> 
>> Yeah, that's a good idea. Or even something simpler like tcpdump-style
>> filters for the packets returned by drop monitor (say if I'm just trying
>> to figure out what happens to my HTTP requests).
>
>Yep, that's a good idea. I guess different users will use different
>programs. Will look into both options.
>
>> > I think it would be good to have the program as part of the bcc
>> > repository [1]. What do you think?
>> 
>> Sure. We could also add it to the XDP tutorial[2]; it could go into a
>> section on introspection and debugging (just added a TODO about that[3]).
>
>Great!
>
>> >> For integrating with XDP the trick would be to find a way to do it that
>> >> doesn't incur any overhead when it's not enabled. Are you envisioning
>> >> that this would be enabled separately for the different "modes" (kernel,
>> >> hardware, XDP, etc)?
>> >
>> > Yes. Drop monitor have commands to enable and disable tracing, but they
>> > don't carry any attributes at the moment. My plan is to add an attribute
>> > (e.g., 'NET_DM_ATTR_DROP_TYPE') that will specify the type of drops
>> > you're interested in - SW/HW/XDP. If the attribute is not specified,
>> > then current behavior is maintained and all the drop types are traced.
>> > But if you're only interested in SW drops, then overhead for the rest
>> > should be zero.
>> 
>> Makes sense (although "should be" is the key here ;)).
>> 
>> I'm also worried about the drop monitor getting overwhelmed; if you turn
>> it on for XDP and you're running a filtering program there, you'll
>> suddenly get *a lot* of drops.
>> 
>> As I read your patch, the current code can basically queue up an
>> unbounded number of packets waiting to go out over netlink, can't it?
>
>That's a very good point. Each CPU holds a drop list. It probably makes
>sense to limit it by default (to 1000?) and allow user to change it

Shouldn't the queue len be configurable?


>later, if needed. I can expose a counter that shows how many packets
>were dropped because of this limit. It can be used as an indication to
>adjust the queue length (or flip to 'summary' mode).

^ permalink raw reply

* [PATCH] qlge: Fix build error without CONFIG_ETHERNET
From: YueHaibing @ 2019-07-24 13:01 UTC (permalink / raw)
  To: manishc, GR-Linux-NIC-Dev, gregkh, bpoirier
  Cc: linux-kernel, devel, netdev, YueHaibing

Now if CONFIG_ETHERNET is not set, QLGE driver
building fails:

drivers/staging/qlge/qlge_main.o: In function `qlge_remove':
drivers/staging/qlge/qlge_main.c:4831: undefined reference to `unregister_netdev'

Reported-by: Hulk Robot <hulkci@huawei.com>
Fixes: 955315b0dc8c ("qlge: Move drivers/net/ethernet/qlogic/qlge/ to drivers/staging/qlge/")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
---
 drivers/staging/qlge/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/qlge/Kconfig b/drivers/staging/qlge/Kconfig
index ae9ed2c..a3cb25a3 100644
--- a/drivers/staging/qlge/Kconfig
+++ b/drivers/staging/qlge/Kconfig
@@ -2,7 +2,7 @@
 
 config QLGE
 	tristate "QLogic QLGE 10Gb Ethernet Driver Support"
-	depends on PCI
+	depends on ETHERNET && PCI
 	help
 	This driver supports QLogic ISP8XXX 10Gb Ethernet cards.
 
-- 
2.7.4



^ permalink raw reply related

* pull-request: can 2019-07-24
From: Marc Kleine-Budde @ 2019-07-24 13:03 UTC (permalink / raw)
  To: netdev; +Cc: davem, linux-can, kernel

Hello David,

this is a pull reqeust of 7 patches for net/master.

The first patch is by Rasmus Villemoes add a missing netif_carrier_off() to
register_candev() so that generic netdev trigger based LEDs are initially off.

Nikita Yushchenko's patch for the rcar_canfd driver fixes a possible IRQ storm
on high load.

The patch by Weitao Hou for the mcp251x driver add missing error checking to
the work queue allocation.

Both Wen Yang's and Joakim Zhang's patch for the flexcan driver fix a problem
with the stop-mode.

Stephane Grosjean contributes a patch for the peak_usb driver to fix a
potential double kfree_skb().

The last patch is by YueHaibing and fixes the error path in can-gw's
cgw_module_init() function.

regards,
Marc

---

The following changes since commit d86afb89305de205b0d2f20c2160adf039e9508d:

  net: thunderx: Use fwnode_get_mac_address() (2019-07-23 14:09:21 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can.git tags/linux-can-fixes-for-5.3-20190724

for you to fetch changes up to b7a14297f102b6e2ce6f16feffebbb9bde1e9b55:

  can: gw: Fix error path of cgw_module_init (2019-07-24 11:19:03 +0200)

----------------------------------------------------------------
linux-can-fixes-for-5.3-20190724

----------------------------------------------------------------
Joakim Zhang (1):
      can: flexcan: fix stop mode acknowledgment

Nikita Yushchenko (1):
      can: rcar_canfd: fix possible IRQ storm on high load

Rasmus Villemoes (1):
      can: dev: call netif_carrier_off() in register_candev()

Stephane Grosjean (1):
      can: peak_usb: fix potential double kfree_skb()

Weitao Hou (1):
      can: mcp251x: add error check when wq alloc failed

Wen Yang (1):
      can: flexcan: fix an use-after-free in flexcan_setup_stop_mode()

YueHaibing (1):
      can: gw: Fix error path of cgw_module_init

 drivers/net/can/dev.c                        |  2 ++
 drivers/net/can/flexcan.c                    | 39 ++++++++++++++++++----
 drivers/net/can/rcar/rcar_canfd.c            |  9 ++---
 drivers/net/can/spi/mcp251x.c                | 49 +++++++++++++---------------
 drivers/net/can/usb/peak_usb/pcan_usb_core.c |  8 ++---
 net/can/gw.c                                 | 48 ++++++++++++++++++---------
 6 files changed, 98 insertions(+), 57 deletions(-)



^ permalink raw reply

* [PATCH 1/7] can: dev: call netif_carrier_off() in register_candev()
From: Marc Kleine-Budde @ 2019-07-24 13:03 UTC (permalink / raw)
  To: netdev
  Cc: davem, linux-can, kernel, Rasmus Villemoes, Willem de Bruijn,
	Marc Kleine-Budde
In-Reply-To: <20190724130322.31702-1-mkl@pengutronix.de>

From: Rasmus Villemoes <rasmus.villemoes@prevas.dk>

CONFIG_CAN_LEDS is deprecated. When trying to use the generic netdev
trigger as suggested, there's a small inconsistency with the link
property: The LED is on initially, stays on when the device is brought
up, and then turns off (as expected) when the device is brought down.

Make sure the LED always reflects the state of the CAN device.

Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
---
 drivers/net/can/dev.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/can/dev.c b/drivers/net/can/dev.c
index b6b93a2d93a5..483d270664cc 100644
--- a/drivers/net/can/dev.c
+++ b/drivers/net/can/dev.c
@@ -1249,6 +1249,8 @@ int register_candev(struct net_device *dev)
 		return -EINVAL;
 
 	dev->rtnl_link_ops = &can_link_ops;
+	netif_carrier_off(dev);
+
 	return register_netdev(dev);
 }
 EXPORT_SYMBOL_GPL(register_candev);
-- 
2.20.1


^ permalink raw reply related

* [PATCH 3/7] can: mcp251x: add error check when wq alloc failed
From: Marc Kleine-Budde @ 2019-07-24 13:03 UTC (permalink / raw)
  To: netdev
  Cc: davem, linux-can, kernel, Weitao Hou, Willem de Bruijn,
	Sean Nyekjaer, Marc Kleine-Budde
In-Reply-To: <20190724130322.31702-1-mkl@pengutronix.de>

From: Weitao Hou <houweitaoo@gmail.com>

add error check when workqueue alloc failed, and remove redundant code
to make it clear.

Fixes: e0000163e30e ("can: Driver for the Microchip MCP251x SPI CAN controllers")
Signed-off-by: Weitao Hou <houweitaoo@gmail.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Tested-by: Sean Nyekjaer <sean@geanix.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
---
 drivers/net/can/spi/mcp251x.c | 49 ++++++++++++++++-------------------
 1 file changed, 22 insertions(+), 27 deletions(-)

diff --git a/drivers/net/can/spi/mcp251x.c b/drivers/net/can/spi/mcp251x.c
index 44e99e3d7134..2aec934fab0c 100644
--- a/drivers/net/can/spi/mcp251x.c
+++ b/drivers/net/can/spi/mcp251x.c
@@ -664,17 +664,6 @@ static int mcp251x_power_enable(struct regulator *reg, int enable)
 		return regulator_disable(reg);
 }
 
-static void mcp251x_open_clean(struct net_device *net)
-{
-	struct mcp251x_priv *priv = netdev_priv(net);
-	struct spi_device *spi = priv->spi;
-
-	free_irq(spi->irq, priv);
-	mcp251x_hw_sleep(spi);
-	mcp251x_power_enable(priv->transceiver, 0);
-	close_candev(net);
-}
-
 static int mcp251x_stop(struct net_device *net)
 {
 	struct mcp251x_priv *priv = netdev_priv(net);
@@ -940,37 +929,43 @@ static int mcp251x_open(struct net_device *net)
 				   flags | IRQF_ONESHOT, DEVICE_NAME, priv);
 	if (ret) {
 		dev_err(&spi->dev, "failed to acquire irq %d\n", spi->irq);
-		mcp251x_power_enable(priv->transceiver, 0);
-		close_candev(net);
-		goto open_unlock;
+		goto out_close;
 	}
 
 	priv->wq = alloc_workqueue("mcp251x_wq", WQ_FREEZABLE | WQ_MEM_RECLAIM,
 				   0);
+	if (!priv->wq) {
+		ret = -ENOMEM;
+		goto out_clean;
+	}
 	INIT_WORK(&priv->tx_work, mcp251x_tx_work_handler);
 	INIT_WORK(&priv->restart_work, mcp251x_restart_work_handler);
 
 	ret = mcp251x_hw_reset(spi);
-	if (ret) {
-		mcp251x_open_clean(net);
-		goto open_unlock;
-	}
+	if (ret)
+		goto out_free_wq;
 	ret = mcp251x_setup(net, spi);
-	if (ret) {
-		mcp251x_open_clean(net);
-		goto open_unlock;
-	}
+	if (ret)
+		goto out_free_wq;
 	ret = mcp251x_set_normal_mode(spi);
-	if (ret) {
-		mcp251x_open_clean(net);
-		goto open_unlock;
-	}
+	if (ret)
+		goto out_free_wq;
 
 	can_led_event(net, CAN_LED_EVENT_OPEN);
 
 	netif_wake_queue(net);
+	mutex_unlock(&priv->mcp_lock);
 
-open_unlock:
+	return 0;
+
+out_free_wq:
+	destroy_workqueue(priv->wq);
+out_clean:
+	free_irq(spi->irq, priv);
+	mcp251x_hw_sleep(spi);
+out_close:
+	mcp251x_power_enable(priv->transceiver, 0);
+	close_candev(net);
 	mutex_unlock(&priv->mcp_lock);
 	return ret;
 }
-- 
2.20.1


^ permalink raw reply related

* [PATCH 2/7] can: rcar_canfd: fix possible IRQ storm on high load
From: Marc Kleine-Budde @ 2019-07-24 13:03 UTC (permalink / raw)
  To: netdev
  Cc: davem, linux-can, kernel, Nikita Yushchenko, linux-stable,
	Marc Kleine-Budde
In-Reply-To: <20190724130322.31702-1-mkl@pengutronix.de>

From: Nikita Yushchenko <nikita.yoush@cogentembedded.com>

We have observed rcar_canfd driver entering IRQ storm under high load,
with following scenario:
- rcar_canfd_global_interrupt() in entered due to Rx available,
- napi_schedule_prep() is called, and sets NAPIF_STATE_SCHED in state
- Rx fifo interrupts are masked,
- rcar_canfd_global_interrupt() is entered again, this time due to
  error interrupt (e.g. due to overflow),
- since scheduled napi poller has not yet executed, condition for calling
  napi_schedule_prep() from rcar_canfd_global_interrupt() remains true,
  thus napi_schedule_prep() gets called and sets NAPIF_STATE_MISSED flag
  in state,
- later, napi poller function rcar_canfd_rx_poll() gets executed, and
  calls napi_complete_done(),
- due to NAPIF_STATE_MISSED flag in state, this call does not clear
  NAPIF_STATE_SCHED flag from state,
- on return from napi_complete_done(), rcar_canfd_rx_poll() unmasks Rx
  interrutps,
- Rx interrupt happens, rcar_canfd_global_interrupt() gets called
  and calls napi_schedule_prep(),
- since NAPIF_STATE_SCHED is set in state at this time, this call
  returns false,
- due to that false return, rcar_canfd_global_interrupt() returns
  without masking Rx interrupt
- and this results into IRQ storm: unmasked Rx interrupt happens again
  and again is misprocessed in the same way.

This patch fixes that scenario by unmasking Rx interrupts only when
napi_complete_done() returns true, which means it has cleared
NAPIF_STATE_SCHED in state.

Fixes: dd3bd23eb438 ("can: rcar_canfd: Add Renesas R-Car CAN FD driver")
Signed-off-by: Nikita Yushchenko <nikita.yoush@cogentembedded.com>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
---
 drivers/net/can/rcar/rcar_canfd.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/net/can/rcar/rcar_canfd.c b/drivers/net/can/rcar/rcar_canfd.c
index 05410008aa6b..de34a4b82d4a 100644
--- a/drivers/net/can/rcar/rcar_canfd.c
+++ b/drivers/net/can/rcar/rcar_canfd.c
@@ -1508,10 +1508,11 @@ static int rcar_canfd_rx_poll(struct napi_struct *napi, int quota)
 
 	/* All packets processed */
 	if (num_pkts < quota) {
-		napi_complete_done(napi, num_pkts);
-		/* Enable Rx FIFO interrupts */
-		rcar_canfd_set_bit(priv->base, RCANFD_RFCC(ridx),
-				   RCANFD_RFCC_RFIE);
+		if (napi_complete_done(napi, num_pkts)) {
+			/* Enable Rx FIFO interrupts */
+			rcar_canfd_set_bit(priv->base, RCANFD_RFCC(ridx),
+					   RCANFD_RFCC_RFIE);
+		}
 	}
 	return num_pkts;
 }
-- 
2.20.1


^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox