* Re: [PATCH] vmxnet3: fix LRO feature check
From: Igor Pylypiv @ 2018-03-18 1:08 UTC (permalink / raw)
To: David Miller; +Cc: skhare, pv-drivers, netdev
In-Reply-To: <20180317.202032.2223279074951132662.davem@davemloft.net>
The 03/17/2018 20:20, David Miller wrote:
> From: Igor Pylypiv <ipylypiv@silver-peak.com>
> Date: Sat, 17 Mar 2018 00:58:52 -0700
>
> > rxcsum and lro fields were deleted in commit a0d2730c9571 ("net: vmxnet3:
> > convert to hw_features"). With upgrading to newer version those fields were
> > resurrected and new code started using uninitialized lro field.
> > Removing rxcsum and lro fields.
> >
> > Fixes: 45dac1d6ea04 ("vmxnet3: Changes for vmxnet3 adapter version 2 (fwd)")
> > Signed-off-by: Igor Pylypiv <ipylypiv@silver-peak.com>
>
> Why are you posting this again?
>
> I applied the copy of this patch which was part of a two part series
> posted earlier.
No way!!!
I know it is hard to believe but I found this bug myself.
That's really odd. I took linux-next-20180316 and patch wasn't there.
Now I see the patch in your tree.
Anyway, let me send another one to delete rxcsum at least...
^ permalink raw reply
* Re: [PATCH] vmxnet3: fix LRO feature check
From: David Miller @ 2018-03-18 0:20 UTC (permalink / raw)
To: ipylypiv; +Cc: skhare, pv-drivers, netdev
In-Reply-To: <20180317075852.11785-1-ipylypiv@silver-peak.com>
From: Igor Pylypiv <ipylypiv@silver-peak.com>
Date: Sat, 17 Mar 2018 00:58:52 -0700
> rxcsum and lro fields were deleted in commit a0d2730c9571 ("net: vmxnet3:
> convert to hw_features"). With upgrading to newer version those fields were
> resurrected and new code started using uninitialized lro field.
> Removing rxcsum and lro fields.
>
> Fixes: 45dac1d6ea04 ("vmxnet3: Changes for vmxnet3 adapter version 2 (fwd)")
> Signed-off-by: Igor Pylypiv <ipylypiv@silver-peak.com>
Why are you posting this again?
I applied the copy of this patch which was part of a two part series
posted earlier.
^ permalink raw reply
* Re: [PATCH net-next 0/5] Add support for RDMA enhancements in cxgb4
From: David Miller @ 2018-03-18 0:19 UTC (permalink / raw)
To: rajur; +Cc: netdev, nirranjan, indranil, venkatesh, swise, bharat
In-Reply-To: <20180317072229.21211-1-rajur@chelsio.com>
From: Raju Rangoju <rajur@chelsio.com>
Date: Sat, 17 Mar 2018 12:52:24 +0530
> Allocate the HW resources and provide the necessary routines for the
> upper layer driver (rdma/iw_cxgb4) to enable RDMA SRQ support for Chelsio adapters.
>
> Advertise support for write with immediate work request
> Advertise support for write with completion
Patch #2 doesn't apply cleanly to net-next, please respin.
Thank you.
^ permalink raw reply
* Re: [PATCH net-next 00/10 v2] selftests: pmtu: Add further vti/vti6 MTU and PMTU tests
From: David Miller @ 2018-03-18 0:15 UTC (permalink / raw)
To: sbrivio; +Cc: dsahern, sd, steffen.klassert, netdev
In-Reply-To: <cover.1521249420.git.sbrivio@redhat.com>
From: Stefano Brivio <sbrivio@redhat.com>
Date: Sat, 17 Mar 2018 02:31:37 +0100
> Patches 5/10 to 10/10 add tests to verify default MTU assignment
> for vti4 and vti6 interfaces, to check that MTU values set on new
> link and link changes are properly taken and validated, and to
> verify PMTU exceptions on vti4 interfaces.
>
> Patch 1/10 reverses function return codes as suggested by David
> Ahern.
>
> Patch 2/10 fixes the helper to fetch exceptions MTU to run in the
> passed namespace.
>
> Patches 3/10 and 4/10 are preparation work to make it easier to
> introduce those tests.
>
> v2: Reverse return codes, and make output prettier in 4/9 by
> using padded printf, test descriptions and buffered error
> strings. Remove accidental output to /dev/kmsg from 10/10
> (was 9/9).
Series applied, thank you.
^ permalink raw reply
* Re: [PATCH net-next v5 0/8] ibmvnic: Update TX pool and TX routines
From: David Miller @ 2018-03-18 0:12 UTC (permalink / raw)
To: tlfalcon; +Cc: netdev, jallen, nfont
In-Reply-To: <1521248431-6353-1-git-send-email-tlfalcon@linux.vnet.ibm.com>
From: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Date: Fri, 16 Mar 2018 20:00:23 -0500
> This patch restructures the TX pool data structure and provides a
> separate TX pool array for TSO transmissions. This is already used
> in some way due to our unique DMA situation, namely that we cannot
> use single DMA mappings for packet data. Previously, both buffer
> arrays used the same pool entry. This restructuring allows for
> some additional cleanup in the driver code, especially in some
> places in the device transmit routine.
>
> In addition, it allows us to more easily track the consumer
> and producer indexes of a particular pool. This has been
> further improved by better tracking of in-use buffers to
> prevent possible data corruption in case an invalid buffer
> entry is used.
>
> v5: Fix bisectability mistake in the first patch. Removed
> TSO-specific data in a later patch when it is no longer used.
>
> v4: Fix error in 7th patch that causes an oops by using
> the older fixed value for number of buffers instead
> of the respective field in the tx pool data structure
>
> v3: Forgot to update TX pool cleaning function to handle new data
> structures. Included 7th patch for that.
>
> v2: Fix typo in 3/6 commit subject line
Series applied, thanks Thomas.
^ permalink raw reply
* Re: [PATCH v2] sctp: use proc_remove_subtree()
From: David Miller @ 2018-03-18 0:11 UTC (permalink / raw)
To: viro; +Cc: netdev, linux-sctp
In-Reply-To: <20180316233251.GJ30522@ZenIV.linux.org.uk>
From: Al Viro <viro@ZenIV.linux.org.uk>
Date: Fri, 16 Mar 2018 23:32:51 +0000
> use proc_remove_subtree() for subtree removal, both on setup failure
> halfway through and on teardown. No need to make simple things
> complex...
>
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Applied, thanks Al.
^ permalink raw reply
* Re: [PATCH net-next 0/2] hv_netvsc: minor enhancements
From: David Miller @ 2018-03-18 0:10 UTC (permalink / raw)
To: stephen; +Cc: devel, haiyangz, sthemmin, netdev
In-Reply-To: <20180316224428.8696-1-sthemmin@microsoft.com>
From: Stephen Hemminger <stephen@networkplumber.org>
Date: Fri, 16 Mar 2018 15:44:26 -0700
> A couple of small things for net-next
Series applied, thanks Stephen.
^ permalink raw reply
* Re: [PATCH v2 net 2/2] vmxnet3: use correct flag to indicate LRO feature
From: David Miller @ 2018-03-18 0:05 UTC (permalink / raw)
To: doshir; +Cc: netdev, rachel_lunnon, skhare, pv-drivers, linux-kernel
In-Reply-To: <20180316214919.20716-1-doshir@vmware.com>
From: Ronak Doshi <doshir@vmware.com>
Date: Fri, 16 Mar 2018 14:49:19 -0700
> 'Commit 45dac1d6ea04 ("vmxnet3: Changes for vmxnet3 adapter version 2
> (fwd)")' introduced a flag "lro" in structure vmxnet3_adapter which is
> used to indicate whether LRO is enabled or not. However, the patch
> did not set the flag and hence it was never exercised.
>
> So, when LRO is enabled, it resulted in poor TCP performance due to
> delayed acks. This issue is seen with packets which are larger than
> the mss getting a delayed ack rather than an immediate ack, thus
> resulting in high latency.
>
> This patch removes the lro flag and directly uses device features
> against NETIF_F_LRO to check if lro is enabled.
>
> Fixes: 45dac1d6ea04 ("vmxnet3: Changes for vmxnet3 adapter version 2 (fwd)")
> Reported-by: Rachel Lunnon <rachel_lunnon@stormagic.com>
> Signed-off-by: Ronak Doshi <doshir@vmware.com>
> Acked-by: Shrikrishna Khare <skhare@vmware.com>
Applied.
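
A minimal userspace sketch of the pattern this patch describes, under stated assumptions: `struct sketch_adapter`, `struct sketch_netdev` and `SKETCH_F_LRO` are hypothetical stand-ins (the kernel's real `NETIF_F_LRO` lives in a different bit position and type). It contrasts a private flag that nobody assigns with a check derived from the device feature bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for NETIF_F_LRO; not the kernel's actual value. */
#define SKETCH_F_LRO (UINT64_C(1) << 0)

struct sketch_netdev {
        uint64_t features;      /* changes whenever offloads are toggled */
};

struct sketch_adapter {
        struct sketch_netdev *netdev;
        bool lro;               /* stale flag: declared but never assigned */
};

/* Buggy pattern: consults a private flag that is never updated. */
static bool lro_enabled_stale(const struct sketch_adapter *a)
{
        return a->lro;
}

/* Fixed pattern: derive the answer from the feature bits themselves. */
static bool lro_enabled(const struct sketch_adapter *a)
{
        assert(a->netdev != NULL);
        return (a->netdev->features & SKETCH_F_LRO) != 0;
}
```

With the feature bit set and the private flag left at its default, only `lro_enabled()` reports LRO as on, which is the mismatch the commit message attributes the delayed ACKs to.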
^ permalink raw reply
* Re: [PATCH v2 net 1/2] vmxnet3: avoid xmit reset due to a race in vmxnet3
From: David Miller @ 2018-03-18 0:05 UTC (permalink / raw)
To: doshir; +Cc: netdev, ntanaka, skhare, pv-drivers, linux-kernel
In-Reply-To: <20180316214754.20650-1-doshir@vmware.com>
From: Ronak Doshi <doshir@vmware.com>
Date: Fri, 16 Mar 2018 14:47:54 -0700
> The field txNumDeferred is used by the driver to keep track of the number
> of packets it has pushed to the emulation. The driver increments it on
> pushing the packet to the emulation and the emulation resets it to 0 at
> the end of the transmit.
>
> There is a possibility of a race either when (a) ESX is under heavy load or
> (b) workload inside VM is of low packet rate.
>
> This race results in xmit hangs when network coalescing is disabled. This
> change creates a local copy of txNumDeferred and uses it to perform ring
> arithmetic.
>
> Reported-by: Noriho Tanaka <ntanaka@vmware.com>
> Signed-off-by: Ronak Doshi <doshir@vmware.com>
> Acked-by: Shrikrishna Khare <skhare@vmware.com>
Applied.
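
The "local copy" idea can be sketched in userspace as below. This is only an illustration, not the driver's code: the struct, the threshold test and the ring arithmetic are assumptions made for the example, since the thread does not show the actual computation. The point is that the shared counter is read once, so a concurrent reset to 0 cannot make the threshold check and the index arithmetic disagree.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical shared area; the other side may reset num_deferred to 0. */
struct sketch_shared {
        volatile uint32_t num_deferred;
        uint32_t threshold;
};

/*
 * Returns the ring index of the oldest deferred descriptor, or -1 if the
 * threshold has not been reached. The shared counter is sampled exactly
 * once and every later use goes through the local copy.
 */
static int64_t first_deferred_index(const struct sketch_shared *s,
                                    uint32_t next_to_fill, uint32_t ring_size)
{
        uint32_t deferred = s->num_deferred;    /* single snapshot */

        assert(ring_size > 0);
        if (deferred < s->threshold)
                return -1;
        return (next_to_fill + ring_size - deferred % ring_size) % ring_size;
}
```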
^ permalink raw reply
* Re: [PATCH v11 crypto 00/12] Chelsio Inline TLS
From: David Miller @ 2018-03-18 0:02 UTC (permalink / raw)
To: atul.gupta
Cc: davejwatson, herbert, sd, sbrivio, linux-crypto, netdev, ganeshgr
In-Reply-To: <1521214582-28838-1-git-send-email-atul.gupta@chelsio.com>
From: Atul Gupta <atul.gupta@chelsio.com>
Date: Fri, 16 Mar 2018 21:06:22 +0530
> Series for Chelsio Inline TLS driver (chtls)
This series doesn't even come close to applying to the net-next
tree, please respin.
Thank you.
^ permalink raw reply
* Re: [PATCH net 0/5] net/sched: fix NULL dereference in the error path of .init()
From: David Miller @ 2018-03-17 23:53 UTC (permalink / raw)
To: dcaratti; +Cc: xiyou.wangcong, jiri, mrv, kurup.manish, netdev
In-Reply-To: <cover.1521154629.git.dcaratti@redhat.com>
From: Davide Caratti <dcaratti@redhat.com>
Date: Fri, 16 Mar 2018 00:00:52 +0100
> With several TC actions it's possible to see a NULL pointer dereference
> when the .init() function calls tcf_idr_alloc(), fails at some point and
> then calls tcf_idr_release(): this series fixes all of them by introducing
> non-NULL tests in the .cleanup() function.
Series applied and queued up for -stable, thank you.
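
The failure mode and the fix can be sketched in plain C as follows. The names (`sketch_action`, `sketch_init`, `sketch_cleanup`) are hypothetical and no TC API is used; the sketch only assumes that the cleanup hook may run after an init that failed before allocating its private data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct sketch_params {
        int *table;
};

struct sketch_action {
        struct sketch_params *params;   /* NULL until init succeeds */
};

/* Fails before allocating params when early_failure is non-zero. */
static int sketch_init(struct sketch_action *a, int early_failure)
{
        a->params = NULL;
        if (early_failure)
                return -1;
        a->params = calloc(1, sizeof(*a->params));
        return a->params ? 0 : -1;
}

/* May be called on the error path, so params has to be tested first. */
static void sketch_cleanup(struct sketch_action *a)
{
        struct sketch_params *p = a->params;

        if (p) {                        /* the non-NULL test */
                free(p->table);
                free(p);
        }
        a->params = NULL;
}
```

Without the `if (p)` guard, the `p->table` access is the NULL dereference on the error path.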
^ permalink raw reply
* Re: [PATCH net-next] net: ethernet: ti: cpsw: enable vlan rx vlan offload
From: David Miller @ 2018-03-17 23:51 UTC (permalink / raw)
To: grygorii.strashko; +Cc: netdev, nsekhar, linux-kernel, linux-omap
In-Reply-To: <20180315201550.21487-1-grygorii.strashko@ti.com>
From: Grygorii Strashko <grygorii.strashko@ti.com>
Date: Thu, 15 Mar 2018 15:15:50 -0500
> In VLAN_AWARE mode CPSW can insert VLAN header encapsulation word on Host
> port 0 egress (RX) before the packet data if RX_VLAN_ENCAP bit is set in
> CPSW_CONTROL register. VLAN header encapsulation word has following format:
>
>  HDR_PKT_Priority  bits 29-31  - Header Packet VLAN prio (Highest prio: 7)
>  HDR_PKT_CFI       bit  28     - Header Packet VLAN CFI bit
>  HDR_PKT_Vid       bits 16-27  - Header Packet VLAN ID
>  PKT_Type          bits 8-9    - Packet Type. Indicates whether the packet is
>                                  VLAN-tagged, priority-tagged, or non-tagged:
>                                    00: VLAN-tagged packet
>                                    01: Reserved
>                                    10: Priority-tagged packet
>                                    11: Non-tagged packet
>
> This feature can be used to implement TX VLAN offload in case of
> VLAN-tagged packets and to insert VLAN tag in case Non-tagged packet was
> received on port with PVID set. As per documentation, CPSW never modifies
> packet data on Host egress (RX) and as result, without this feature
> enabled, Host port will not be able to receive properly packets which
> entered switch non-tagged through external Port with PVID set (when
> non-tagged packet forwarded from external Port with PVID set to another
> external Port - packet will be VLAN tagged properly).
>
> Implementation details:
> - on RX, the driver checks CPDMA status bit RX_VLAN_ENCAP BIT(19) in the CPPI
> descriptor to identify when the VLAN header encapsulation word is present;
> - if PKT_Type = 0x01 or 0x02, ignore the VLAN header encapsulation word and
> pass the packet as is;
> - if HDR_PKT_Vid = 0, ignore the VLAN header encapsulation word and pass
> the packet as is;
> - in dual mac mode, traffic is separated between ports using default port
> VLANs, which are not visible to the Host and so should not be reported.
> Hence, check for default port VLANs in dual mac mode and ignore the VLAN
> header encapsulation word;
> - otherwise, fill the SKB with VLAN info using __vlan_hwaccel_put_tag();
> - if PKT_Type = 0x00 (VLAN-tagged), strip the VLAN header out of the SKB.
>
> Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Applied, thank you.
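
The bit layout and the first decision rules above can be sketched as a small decoder. This is a userspace illustration, not the driver code: the struct and function names are hypothetical, and the dual mac default-VLAN check is left out because it needs per-port state that the commit message does not spell out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum sketch_pkt_type {
        SKETCH_PKT_VLAN_TAGGED = 0,
        SKETCH_PKT_RESERVED    = 1,
        SKETCH_PKT_PRIO_TAGGED = 2,
        SKETCH_PKT_UNTAGGED    = 3,
};

struct sketch_vlan_info {
        uint16_t vid;           /* bits 16-27 */
        uint8_t prio;           /* bits 29-31 */
        uint8_t cfi;            /* bit 28 */
        uint8_t pkt_type;       /* bits 8-9 */
};

/* Split the 32-bit encapsulation word into its documented fields. */
static struct sketch_vlan_info decode_encap_word(uint32_t w)
{
        struct sketch_vlan_info v;

        v.prio = (w >> 29) & 0x7;
        v.cfi = (w >> 28) & 0x1;
        v.vid = (w >> 16) & 0xfff;
        v.pkt_type = (w >> 8) & 0x3;
        return v;
}

/* True when the tag should be reported to the stack. */
static bool should_report_tag(const struct sketch_vlan_info *v)
{
        if (v->pkt_type == SKETCH_PKT_RESERVED ||
            v->pkt_type == SKETCH_PKT_PRIO_TAGGED)
                return false;   /* PKT_Type 0x01 or 0x02: pass as is */
        if (v->vid == 0)
                return false;   /* no VLAN ID: pass as is */
        return true;
}
```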
^ permalink raw reply
* Re: [PATCH v2] net: ethernet: ti: cpsw: add check for in-band mode setting with RGMII PHY interface
From: David Miller @ 2018-03-17 23:50 UTC (permalink / raw)
To: sz.lin
Cc: spatton, grygorii.strashko, ivan.khoronzhuk, j-keerthy, nsekhar,
linux-omap, netdev, linux-kernel
In-Reply-To: <20180315165603.30471-1-sz.lin@moxa.com>
From: SZ Lin (林上智) <sz.lin@moxa.com>
Date: Fri, 16 Mar 2018 00:56:01 +0800
> According to AM335x TRM[1] 14.3.6.2, AM437x TRM[2] 15.3.6.2 and
> DRA7 TRM[3] 24.11.4.8.7.3.3, in-band mode in EXT_EN(bit18) register is only
> available when PHY is configured in RGMII mode with 10Mbps speed. It will
> cause some networking issues without RGMII mode, such as carrier sense
> errors and low throughput. TI also mentioned this issue in their forum[4].
>
> This patch adds the check mechanism for PHY interface with RGMII interface
> type, the in-band mode can only be set in RGMII mode with 10Mbps speed.
>
> References:
> [1]: https://www.ti.com/lit/ug/spruh73p/spruh73p.pdf
> [2]: http://www.ti.com/lit/ug/spruhl7h/spruhl7h.pdf
> [3]: http://www.ti.com/lit/ug/spruic2b/spruic2b.pdf
> [4]: https://e2e.ti.com/support/arm/sitara_arm/f/791/p/640765/2392155
>
> Suggested-by: Holsety Chen (陳憲輝) <Holsety.Chen@moxa.com>
> Signed-off-by: SZ Lin (林上智) <sz.lin@moxa.com>
> Signed-off-by: Schuyler Patton <spatton@ti.com>
Applied and queued up for -stable, thank you.
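
The condition described above can be sketched as a pure function. The enum and function names are hypothetical; only the bit position (bit 18, EXT_EN) and the rule "RGMII and 10Mbps" come from the commit message.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_EXT_EN (UINT32_C(1) << 18)       /* in-band mode enable */

enum sketch_phy_if {
        SKETCH_IF_MII,
        SKETCH_IF_RMII,
        SKETCH_IF_RGMII,
};

/*
 * Returns mac_control with in-band mode set only for RGMII at 10Mbps.
 * Without the interface check, the bit would be set for any 10Mbps link.
 */
static uint32_t sketch_mac_control(uint32_t mac_control,
                                   enum sketch_phy_if iface, int speed_mbps)
{
        if (speed_mbps == 10 && iface == SKETCH_IF_RGMII)
                mac_control |= SKETCH_EXT_EN;
        return mac_control;
}
```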
^ permalink raw reply
* Re: [PATCH] net: hns: Fix ethtool private flags
From: David Miller @ 2018-03-17 23:48 UTC (permalink / raw)
To: matthias.bgg
Cc: yisen.zhuang, salil.mehta, tianjinchuan1, lipeng321, lixiaoping3,
mbrugger, yankejian, linyunsheng, huangdaode, stephen, netdev,
linux-kernel
In-Reply-To: <20180315165420.16086-1-mbrugger@suse.com>
From: Matthias Brugger <matthias.bgg@gmail.com>
Date: Thu, 15 Mar 2018 17:54:20 +0100
> The driver implementation returns support for private flags, while
> no private flags are present. When asked for the number of private
> flags it returns the number of statistic flag names.
>
> Fix this by returning EOPNOTSUPP for not implemented ethtool flags.
>
> Signed-off-by: Matthias Brugger <mbrugger@suse.com>
Looks good, applied, thank you.
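
A rough sketch of the behaviour the patch describes: a string-set count callback that answers for the sets it implements and returns -EOPNOTSUPP for everything else, instead of handing back the statistics count for private flags. The enum and function are hypothetical and only loosely modelled on an ethtool get_sset_count callback.

```c
#include <assert.h>
#include <errno.h>

enum sketch_stringset {
        SKETCH_SS_TEST,
        SKETCH_SS_STATS,
        SKETCH_SS_PRIV_FLAGS,
};

/* Count of strings in a set, or -EOPNOTSUPP for unimplemented sets. */
static int sketch_get_sset_count(enum sketch_stringset ss,
                                 int n_tests, int n_stats)
{
        switch (ss) {
        case SKETCH_SS_TEST:
                return n_tests;
        case SKETCH_SS_STATS:
                return n_stats;
        default:
                /* No private flags exist, so say so explicitly. */
                return -EOPNOTSUPP;
        }
}
```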
^ permalink raw reply
* Re: [PATCH net 0/3] vti4, ip_tunnel: Fixes for MTU assignment and validation
From: David Miller @ 2018-03-17 23:46 UTC (permalink / raw)
To: sbrivio; +Cc: sd, steffen.klassert, herbert, pshelar, netdev
In-Reply-To: <cover.1521068627.git.sbrivio@redhat.com>
From: Stefano Brivio <sbrivio@redhat.com>
Date: Thu, 15 Mar 2018 17:16:26 +0100
> Patch 1/3 re-introduces a fix to ensure that default MTU on new
> link is not lowered unnecessarily because of double counting of
> headers. This fix was originally introduced in 2014 and got lost
> in a merge commit shortly afterwards.
>
> Patches 2/3 and 3/3 ensure that MTU passed from userspace on link
> creation is taken into account and also properly validated.
Steffen, I am assuming that you will pick up this series and the
subsequent one dealing with vti6.
Thank you.
^ permalink raw reply
* Re: [PATCH v5 0/2] Remove false-positive VLAs when using max()
From: Josh Poimboeuf @ 2018-03-17 22:55 UTC (permalink / raw)
To: Kees Cook
Cc: Linus Torvalds, Al Viro, Florian Weimer, Andrew Morton,
Rasmus Villemoes, Randy Dunlap, Miguel Ojeda, Ingo Molnar,
David Laight, Ian Abbott, linux-input, linux-btrfs,
Network Development, Linux Kernel Mailing List, Kernel Hardening
In-Reply-To: <CAGXu5jJ=ZYpf=30H6hsWn-R-CEVYAgVMHxjmoLUC00QYq0r17g@mail.gmail.com>
On Sat, Mar 17, 2018 at 01:07:32PM -0700, Kees Cook wrote:
> On Sat, Mar 17, 2018 at 11:52 AM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> > So the above is completely insane, but there is actually a chance that
> > using that completely crazy "x -> sizeof(char[x])" conversion actually
> > helps, because it really does have a (very odd) evaluation-time
> > change. sizeof() has to be evaluated as part of the constant
> > expression evaluation, in ways that "__builtin_constant_p()" isn't
> > specified to be done.
> >
> > But it is also definitely me grasping at straws. If that doesn't work
> > for 4.4, there's nothing else I can possibly see.
>
> No luck! :( gcc 4.4 refuses to play along. And, hilariously, not only
> does it not change the complaint about __builtin_choose_expr(), it
> also thinks that's a VLA now.
>
> ./include/linux/mm.h: In function ‘get_mm_hiwater_rss’:
> ./include/linux/mm.h:1567: warning: variable length array is used
> ./include/linux/mm.h:1567: error: first argument to
> ‘__builtin_choose_expr’ not a constant
>
> 6.8 is happy with it (of course).
>
> I do think the earlier version (without the
> sizeof-hiding-builtin_constant_p) provides a template for a
> const_max() that both you and Rasmus would be happy with, though!
I thought we were dropping support for 4.4 (for other reasons). Isn't
it 4.6 we should be looking at?
--
Josh
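
For context, a small sketch of why a plain ternary macro avoids the false-positive VLA. The macro name and the sizes are made up for the example, and this is not the const_max() under discussion in the thread. It assumes, as the thread implies, that the kernel's type-checked max() is not an integer constant expression, so an array sized with it is treated as variable length.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical macro: a bare ternary stays an integer constant expression. */
#define SKETCH_CONST_MAX(a, b) ((a) > (b) ? (a) : (b))

enum { SKETCH_HDR_A = 12, SKETCH_HDR_B = 20 };

static_assert(SKETCH_CONST_MAX(1, 2) == 2, "usable in constant expressions");

/*
 * File-scope arrays cannot be VLAs, so this declaration only compiles
 * because the macro expands to a constant expression.
 */
static unsigned char sketch_buf[SKETCH_CONST_MAX(SKETCH_HDR_A, SKETCH_HDR_B)];

static size_t sketch_buf_size(void)
{
        return sizeof(sketch_buf);
}
```

The trade-off is that the bare ternary evaluates its arguments twice and does no type checking, which is why it is only reasonable for constant arguments.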
^ permalink raw reply
* Re: HW question: i210 vs. BCM5461S over SGMII: no response from PHY to MDIO requests?
From: Frantisek Rysanek @ 2018-03-17 22:12 UTC (permalink / raw)
To: Andrew Lunn, netdev
> > > Right now I've modded igb_init_i2c() to engage the bit-banging
> > > i2c driver for the i210 too
> >
> > I don't think that will work. The datasheet for the i210 talks about
> > two registers for I2C/MDIO which are not bit-banging. Only the i350
> > uses bit-banging.
> >
> From the i210 datasheet, page 477:
> chapter 8.17.9 "SFP I2C Parameters" - reg. I2CPARAMS (0x102C; R/W)
> bit 8 "I2CBB_EN" = I2C bit-bang enable.
> And about 6 more bits for SDA and SCL direction, input and output.
> Looking through existing code of the bit-banging callbacks for i350,
> their function would seem identical between the i210 and i350.
> I may check the bit definition macros again, if the scope shows
> nothing.
>
Sure enough, the I2C port works in bit-banging mode, even without a
semaphore. Screenshot attached.
I'm not getting an ACK from the SFP, probably because I've got the
address and offset wrong and because I'd better use indirect access.
There's some more work awaiting me...
God knows if this is going to be any use :-)
Frank
[-- Attachment: osc_some_frame_bitbang.png (17 Mar 2018, 17:43, 54383 bytes) --]
[-- Attachment: osc_some_frame_HW.png (17 Mar 2018, 17:39, 40613 bytes) --]
^ permalink raw reply
* Re: [rds-devel] [PATCH RFC RFC] rds: Use NETDEV_UNREGISTER in rds_tcp_dev_event() (then kill NETDEV_UNREGISTER_FINAL)
From: Kirill Tkhai @ 2018-03-17 21:55 UTC (permalink / raw)
To: Sowmini Varadhan; +Cc: rds-devel, linux-rdma, netdev, edumazet, davem
In-Reply-To: <20180317212553.GA16416@oracle.com>
On 18.03.2018 00:26, Sowmini Varadhan wrote:
> On (03/17/18 10:15), Sowmini Varadhan wrote:
>> To solve the scaling problem why not just have a well-defined
>> callback to modules when devices are quiesced, instead of
>> overloading the pernet_device registration in this obscure way?
>
> I thought about this a bit, and maybe I missed your original point-
> today we are able to do all the needed cleanup for rds-tcp when
> we unload the module, even though network activity has not quiesced,
> and there is no reason we cannot use the same code for netns cleanup
> as well. I think this is what you were trying to ask, when you
> said "why do you need to know that loopback is down?"
I just want to stop rds from using NETDEV_UNREGISTER_FINAL. If there is
another way to do that, I'm not against it.
> I'm sorry I missed that, I will re-examine the code and get back to
> you- it should be possible to just do one registration and
> cleanup rds-state and avoid the hack of registering twice
Sounds great, I'll wait for your response.
> (saw your most recent long mail- sorry- both v1 and v2 are hacks)
>
> I'm on the road at the moment, so I'll get back to you on this.
Thanks,
Kirill
^ permalink raw reply
* Re: [PATCH net] mlxsw: spectrum_buffers: Set a minimum quota for CPU port traffic
From: David Miller @ 2018-03-17 21:35 UTC (permalink / raw)
To: idosch; +Cc: netdev, jiri, eddies, alexpe, mlxsw
In-Reply-To: <20180315124956.32719-1-idosch@mellanox.com>
From: Ido Schimmel <idosch@mellanox.com>
Date: Thu, 15 Mar 2018 14:49:56 +0200
> In commit 9ffcc3725f09 ("mlxsw: spectrum: Allow packets to be trapped
> from any PG") I fixed a problem where packets could not be trapped to
> the CPU due to exceeded shared buffer quotas. The mentioned commit
> explains the problem in detail.
>
> The problem was fixed by assigning a minimum quota for the CPU port and
> the traffic class used for scheduling traffic to the CPU.
>
> However, commit 117b0dad2d54 ("mlxsw: Create a different trap group list
> for each device") assigned different traffic classes to different
> packet types and rendered the fix useless.
>
> Fix the problem by assigning a minimum quota for the CPU port and all
> the traffic classes that are currently in use.
>
> Fixes: 117b0dad2d54 ("mlxsw: Create a different trap group list for each device")
> Signed-off-by: Ido Schimmel <idosch@mellanox.com>
> Reported-by: Eddie Shklaer <eddies@mellanox.com>
> Tested-by: Eddie Shklaer <eddies@mellanox.com>
> Acked-by: Jiri Pirko <jiri@mellanox.com>
> ---
> Please consider the patch for -stable. Thanks!
Applied and queued up for -stable, thank you.
^ permalink raw reply
* Re: [rds-devel] [PATCH RFC RFC] rds: Use NETDEV_UNREGISTER in rds_tcp_dev_event() (then kill NETDEV_UNREGISTER_FINAL)
From: Sowmini Varadhan @ 2018-03-17 21:26 UTC (permalink / raw)
To: Kirill Tkhai; +Cc: rds-devel, linux-rdma, netdev, edumazet, davem
In-Reply-To: <20180317141507.GC873@oracle.com>
On (03/17/18 10:15), Sowmini Varadhan wrote:
> To solve the scaling problem why not just have a well-defined
> callback to modules when devices are quiesced, instead of
> overloading the pernet_device registration in this obscure way?
I thought about this a bit, and maybe I missed your original point-
today we are able to do all the needed cleanup for rds-tcp when
we unload the module, even though network activity has not quiesced,
and there is no reason we cannot use the same code for netns cleanup
as well. I think this is what you were trying to ask, when you
said "why do you need to know that loopback is down?"
I'm sorry I missed that, I will re-examine the code and get back to
you- it should be possible to just do one registration and
cleanup rds-state and avoid the hack of registering twice
(saw your most recent long mail- sorry- both v1 and v2 are hacks)
I'm on the road at the moment, so I'll get back to you on this.
Thanks
--Sowmini
^ permalink raw reply
* Re: [PATCH net-next] cxgb4: Fix queue free path of ULD drivers
From: David Miller @ 2018-03-17 21:20 UTC (permalink / raw)
To: ganeshgr; +Cc: netdev, nirranjan, indranil, venkatesh, arjun, leedom
In-Reply-To: <1521115454-5793-1-git-send-email-ganeshgr@chelsio.com>
From: Ganesh Goudar <ganeshgr@chelsio.com>
Date: Thu, 15 Mar 2018 17:34:14 +0530
> From: Arjun Vynipadath <arjun@chelsio.com>
>
> Setting sge_uld_rxq_info to NULL in free_queues_uld().
> We are referencing sge_uld_rxq_info in cxgb_up(). This
> will fix a panic when interface is brought up after a
> ULDq creation failure.
>
> Fixes: 94cdb8bb993a (cxgb4: Add support for dynamic allocation
> of resources for ULD)
> Signed-off-by: Arjun Vynipadath <arjun@chelsio.com>
> Signed-off-by: Casey Leedom <leedom@chelsio.com>
> Signed-off-by: Ganesh Goudhar <ganeshgr@chelsio.com>
Applied, thank you.
^ permalink raw reply
* Re: [PATCH net-next] rds: tcp: must use spin_lock_irq* and not spin_lock_bh with rds_tcp_conn_lock
From: David Miller @ 2018-03-17 21:19 UTC (permalink / raw)
To: sowmini.varadhan; +Cc: netdev, santosh.shilimkar
In-Reply-To: <1521111266-148947-1-git-send-email-sowmini.varadhan@oracle.com>
From: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Date: Thu, 15 Mar 2018 03:54:26 -0700
> rds_tcp_connection allocation/free management has the potential to be
> called from __rds_conn_create after IRQs have been disabled, so
> spin_[un]lock_bh cannot be used with rds_tcp_conn_lock.
>
> Bottom-halves that need to synchronize for critical sections protected
> by rds_tcp_conn_lock should instead use rds_destroy_pending() correctly.
>
> Reported-by: syzbot+c68e51bb5e699d3f8d91@syzkaller.appspotmail.com
> Fixes: ebeeb1ad9b8a ("rds: tcp: use rds_destroy_pending() to synchronize
> netns/module teardown and rds connection/workq management")
> Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Applied, thank you.
^ permalink raw reply
* Re: [PATCH net-next 0/6] Converting pernet_operations (part #8)
From: Kirill Tkhai @ 2018-03-17 21:13 UTC (permalink / raw)
To: David Miller
Cc: dev-yBygre7rU0TnMu66kgdUjQ, wensong-ud5FBsm0p/xg9hUCZPvPmw,
dan.j.williams-ral2JQCrhuEAvxtiuMwx3w,
netdev-u79uwXL29TY76Z2rM5mHXA,
roopa-qUQiAmfTcIp+XZJcv9eMoEEOCMrvLtNR,
dsahern-Re5JQEeQqe8AvxtiuMwx3w, fw-HFFVJYpyMKqzQB+pC5nmwQ,
jchapman-Bm0nJX+W7e9BDgjK7y7TUQ, lvs-devel-u79uwXL29TY76Z2rM5mHXA,
ja-FgGsKACvmQM, netfilter-devel-u79uwXL29TY76Z2rM5mHXA,
elena.reshetova-ral2JQCrhuEAvxtiuMwx3w,
amine.kherbouche-pdR9zngts4EAvxtiuMwx3w,
kadlec-K40Dz/62t/MgiyqX0sVFJYdd74u8MsAO,
rshearma-43mecJUBy8ZBDgjK7y7TUQ, g.nault-pHk1y4uTXVDytLWWfqlThQ,
pablo-Cap9r6Oaw4JrovVCs/uTlw, dwindsor-Re5JQEeQqe8AvxtiuMwx3w
In-Reply-To: <20180317.170757.485564313326587221.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
On 18.03.2018 00:07, David Miller wrote:
> From: Kirill Tkhai <ktkhai-5HdwGun5lf+gSpxsJD1C4w@public.gmane.org>
> Date: Thu, 15 Mar 2018 12:10:47 +0300
>
>> this series continues to review and to convert pernet_operations
>> to make them possible to be executed in parallel for several
>> net namespaces at the same time. There are different operations
>> over the tree, mostly are ipvs.
>
> Series applied, thanks Kirill.
Thanks, David!
^ permalink raw reply
* Re: [PATCH RFC RFC] rds: Use NETDEV_UNREGISTER in rds_tcp_dev_event() (then kill NETDEV_UNREGISTER_FINAL)
From: Kirill Tkhai @ 2018-03-17 21:13 UTC (permalink / raw)
To: Sowmini Varadhan
Cc: santosh.shilimkar, davem, netdev, linux-rdma, rds-devel, edumazet
In-Reply-To: <20180317141507.GC873@oracle.com>
On 17.03.2018 17:15, Sowmini Varadhan wrote:
>
> I spent a long time staring at both v1 and v2 of your patch.
Thanks for your time!
> I understand the overall goal, but I am afraid to say that these
> patches are complete hacks.
I don't agree with you; see the explanations below.
> I was trying to understand why patchv1 blows with a null rtn in
> rds_tcp_init_net, but v2 does not, and the analysis is ugly.
>
> I'm going to put down the analysis here, and others can
> decide if this sort of hack is a healthy solution for a scaling
> issue (IMHO it is not- we should get the real fix for the
> scaling instead of using duct-tape-and-chewing-gum solutions)
>
> What is happening in v1 is this:
>
> 1. When I do "modprobe rds_tcp" in init_net, we end up doing the
> following in rds_tcp_init
>         register_pernet_device(&rds_tcp_dev_ops);
>         register_pernet_device(&rds_tcp_net_ops);
> Where rds_tcp_dev_ops has
>         .id = &rds_tcp_netid,
>         .size = sizeof(struct rds_tcp_net),
> and rds_tcp_net_ops has 0 values for both of these.
>
> 2. So now pernet_list has &rds_tcp_net_ops as the first member of the
> pernet_list.
>
> 3. Now I do "ip netns create blue". As part of setup_net(), we walk
> the pernet_list and call the ->init of each member (through ops_init()).
> So we'd hit rds_tcp_net_ops first. Since the id/size are 0, we'd
> skip the struct rds_tcp_net allocation, so rds_tcp_init_net would
> find a null return from net_generic() and bomb.
>
> The way I view it (and ymmv) the hack here is to call both
> register_pernet_device and register_pernet_subsys: the kernel only
> guarantees that calling *one* of register_pernet_* will ensure
> that you can safely call net_generic() afterwards.
>
> The v2 patch "works around" this by reordering the registration.
> So this time, init_net will set up the rds_tcp_net_ops as the second
> member, and the first member will be the pernet_operations struct
> that has non-zero id and size.
>
> But then the unregistration (necessarily) works in the opposite order:
> you have to unregister_pernet_device first (so that interfaces are
> quiesced) and then unregister_pernet_subsys() so that sockets can
> be safely quiesced.
>
> I don't think this type of hack makes the code cleaner; it just
> makes things much harder to understand, and completely brittle
> for subsequent changes.
It's not a hack, it's just a way to fix the problem, the same way other
pernet_operations do. It's OK for pernet_operations to share a
net_generic() id. The only thing you need is to request the id in
the pernet_operations that comes first in pernet_list. There are
a lot of examples in the kernel:
1)sunrpc_net_ops
rpcsec_gss_net_ops
these pernet_operations share sunrpc_net_id. It's requested in sunrpc_net_ops:
static struct pernet_operations sunrpc_net_ops = {
        .init = sunrpc_init_net,
        .exit = sunrpc_exit_net,
        .id = &sunrpc_net_id,
        .size = sizeof(struct sunrpc_net),
};
and it's also used by rpcsec_gss_net_ops:
static struct pernet_operations rpcsec_gss_net_ops = {
        .init = rpcsec_gss_init_net,
        .exit = rpcsec_gss_exit_net,
};
rpcsec_gss_init_net()->gss_svc_init_net()->rsc_cache_create_net(), where:
static int rsc_cache_create_net(struct net *net)
{
        struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);
        ...                     ^^^here^^^
The only thing is sunrpc_net_ops must be registered before rpcsec_gss_net_ops,
and rpc code guarantees that.
2)ipvs_core_ops
ipvs_core_dev_ops
ip_vs_ftp_ops
ip_vs_lblc_ops
ip_vs_lblcr_ops
these pernet_operations (5!) share ip_vs_net_id, which is requested in ipvs_core_ops:
static struct pernet_operations ipvs_core_ops = {
        .init = __ip_vs_init,
        .exit = __ip_vs_cleanup,
        .id = &ip_vs_net_id,
        .size = sizeof(struct netns_ipvs),
};
static int __net_init __ip_vs_init(struct net *net)
{
        struct netns_ipvs *ipvs;
        ...
        ipvs = net_generic(net, ip_vs_net_id);
        net->ipvs = ipvs;
        ...
}
static struct pernet_operations ipvs_core_dev_ops = {
        .exit = __ip_vs_dev_cleanup,
};
static void __net_exit __ip_vs_dev_cleanup(struct net *net)
{
        struct netns_ipvs *ipvs = net_ipvs(net);
                                  ^^^requested in ipvs_core_ops
        ...
}
Look at the example above. They solve the same problem that rds has:
they need to perform some actions at pernet_device exit time. That is
why ipvs_core_dev_ops was added, since ipvs_core_ops is called
at pernet_subsys time. See ip_vs_init():
static int __init ip_vs_init(void)
{
        ...
        ret = register_pernet_subsys(&ipvs_core_ops); /* Alloc ip_vs struct */
        if (ret < 0)
                goto cleanup_conn;
        ...
        ret = register_pernet_device(&ipvs_core_dev_ops);
That's all. They use the pernet_device exit method, which may be called in
parallel with anything, and which doesn't take rtnl_lock().
There is no reason for rds_tcp_net_ops to do this differently from all other
pernet_operations and take the exclusive rtnl_lock().
> To solve the scaling problem why not just have a well-defined
> callback to modules when devices are quiesced, instead of
> overloading the pernet_device registration in this obscure way?
We already have them. These callbacks are the pernet_operations exit methods.
These methods can execute in parallel and scale nicely.
Kirill
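
The ordering argument in this message can be illustrated with a toy userspace model. Everything here is hypothetical (`sketch_ops`, `sketch_register`, `sketch_setup_net` are not kernel APIs) and the model is heavily simplified: it only walks a list in registration order, allocates per-namespace storage for ops that carry an id and size, and then calls init. It shows why an ops without its own id can use a shared id only if the owning ops comes earlier in the list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_MAX_OPS 4

struct sketch_net {
        void *gen[SKETCH_MAX_OPS];      /* per-namespace storage, by id */
};

struct sketch_ops {
        int (*init)(struct sketch_net *net);
        int *id;                        /* NULL: ops owns no storage */
        size_t size;
};

static const struct sketch_ops *sketch_list[SKETCH_MAX_OPS];
static int sketch_n_ops;
static int sketch_next_id;

static void sketch_reset(void)
{
        sketch_n_ops = 0;
        sketch_next_id = 0;
}

static void sketch_register(const struct sketch_ops *ops)
{
        assert(sketch_n_ops < SKETCH_MAX_OPS);
        if (ops->id)
                *ops->id = sketch_next_id++;
        sketch_list[sketch_n_ops++] = ops;
}

static void *sketch_net_generic(const struct sketch_net *net, int id)
{
        return net->gen[id];
}

/* Walk the list in registration order, as a new namespace would. */
static int sketch_setup_net(struct sketch_net *net)
{
        for (int i = 0; i < sketch_n_ops; i++) {
                const struct sketch_ops *ops = sketch_list[i];

                if (ops->id && ops->size) {
                        net->gen[*ops->id] = calloc(1, ops->size);
                        if (!net->gen[*ops->id])
                                return -1;
                }
                if (ops->init && ops->init(net) < 0)
                        return -1;
        }
        return 0;
}

static int sketch_shared_id;

/* Second ops: has no id of its own and relies on the shared one. */
static int sketch_user_init(struct sketch_net *net)
{
        return sketch_net_generic(net, sketch_shared_id) ? 0 : -1;
}

static const struct sketch_ops sketch_owner_ops = {
        .id = &sketch_shared_id,
        .size = sizeof(long),
};

static const struct sketch_ops sketch_user_ops = {
        .init = sketch_user_init,
};
```

Registering `sketch_owner_ops` before `sketch_user_ops` lets the second init find its storage; reversing the order makes the lookup return NULL, which mirrors the v1 failure analysed in the quoted text.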
^ permalink raw reply
* Re: [net-next 0/5] tipc: obsolete zone concept
From: David Miller @ 2018-03-17 21:12 UTC (permalink / raw)
To: jon.maloy; +Cc: netdev, tipc-discussion, mohan.krishna.ghanta.krishnamurthy
In-Reply-To: <1521128935-6141-1-git-send-email-jon.maloy@ericsson.com>
From: Jon Maloy <jon.maloy@ericsson.com>
Date: Thu, 15 Mar 2018 16:48:49 +0100
> Functionality related to the 'zone' concept was never implemented in
> TIPC. In this series we eliminate the remaining traces of it in the
> code, and can hence take a first step in reducing the footprint and
> complexity of the binding table.
Series applied, thanks Jon.
^ permalink raw reply