* Re: [PATCH net-next] udp: avoid one cache line miss in recvmsg()
From: Eric Dumazet @ 2016-11-21 22:01 UTC (permalink / raw)
To: Paolo Abeni; +Cc: David Miller, netdev
In-Reply-To: <1479764208.5596.0.camel@redhat.com>
On Mon, 2016-11-21 at 22:36 +0100, Paolo Abeni wrote:
> Nice catch, thank you Eric!
>
> This gives up to 8% speed-up in my performance test (wire speed udp flood
> with small packets)
>
> Tested-by: Paolo Abeni <pabeni@redhat.com>
Thanks Paolo
Note that udpv6_recvmsg() hits the 3rd cache line of skb to access
skb->protocol:
is_udp4 = (skb->protocol == htons(ETH_P_IP));
We might need some trick to avoid this cache line miss.
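As a rough illustration of the cache line arithmetic discussed here, the sketch below maps a byte offset to a zero-based cache line index. The 64-byte line size and the layout of struct fake_skb are assumptions made for the example; struct fake_skb is hypothetical and is not the real struct sk_buff.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_LINE_SIZE 64u /* assumption: 64-byte cache lines */

/* Hypothetical layout for illustration only; NOT the real struct sk_buff. */
struct fake_skb {
	char first_line[64];     /* bytes 0..63: cache line 0 */
	char cb[48];             /* bytes 64..111: starts in cache line 1 */
	char filler[32];         /* bytes 112..143 */
	unsigned short protocol; /* byte 144: cache line 2, i.e. the 3rd line */
};

/* Zero-based index of the cache line that contains a given byte offset. */
static size_t cache_line_index(size_t offset)
{
	return offset / CACHE_LINE_SIZE;
}

/* Cache line holding the hypothetical protocol field. */
static size_t fake_skb_protocol_line(void)
{
	return cache_line_index(offsetof(struct fake_skb, protocol));
}

/* Sanity check of the assumptions baked into the layout above. */
static void fake_skb_layout_selfcheck(void)
{
	assert(offsetof(struct fake_skb, cb) == 64);
	assert(fake_skb_protocol_line() == 2);
}
```

With these assumptions, an offset of 66 (the value quoted for partial_cov in the patch) falls in line index 1, the second cache line, and the hypothetical protocol field falls in the third.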
^ permalink raw reply
* Re: [RFC 02/10] IB/hfi-vnic: Virtual Network Interface Controller (VNIC) Bus driver
From: Jason Gunthorpe @ 2016-11-21 21:39 UTC (permalink / raw)
To: Vishwanathapura, Niranjana
Cc: Doug Ledford, linux-rdma, netdev, Dennis Dalessandro
In-Reply-To: <20161121213017.GB67872@knc-06.sc.intel.com>
On Mon, Nov 21, 2016 at 01:30:17PM -0800, Vishwanathapura, Niranjana wrote:
> On Sat, Nov 19, 2016 at 12:04:45PM -0700, Jason Gunthorpe wrote:
> >On Fri, Nov 18, 2016 at 02:42:10PM -0800, Vishwanathapura, Niranjana wrote:
> >>+HFI-VNIC DRIVER
> >>+M: Dennis Dalessandro <dennis.dalessandro@intel.com>
> >>+M: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
> >>+L: linux-rdma@vger.kernel.org
> >>+S: Supported
> >>+F: drivers/infiniband/sw/intel/vnic
> >
> >This is either a net driver or a ULP, no idea why it should go in this
> >directory!?
> >
> >It sounds like an ethernet driver, so you should probably put it
> >there...
> >
>
> The hfi_vnic is an Ethernet driver. It is similar to a ULP like ipoib, but
> here it carries Ethernet over Omni-Path.
> The VNIC Ethernet (hfi_vnic) driver encapsulates Ethernet packets in an
> Omni-Path header.
> The hfi_vnic Ethernet driver does not access the HW. It interfaces with the HFI1
> driver, which sends/receives Omni-Path encapsulated Ethernet frames to/from the HW.
> Also, the VNIC control path driver (VEMA) is an IB MAD agent, which should be
> under drivers/infiniband/.. .
> Putting the VNIC Ethernet driver and the VNIC control driver together in
> a single module (hfi_vnic.ko) provides a simpler interface between them.
>
> So, we have put the driver under drivers/infiniband/sw/intel for two reasons:
> a) We have the VNIC control driver (VEMA), which is an IB MAD agent.
> b) The hfi_vnic Ethernet driver depends on the HFI1 driver for sending/receiving
> Omni-Path encapsulated Ethernet packets to/from the HW.
Sounds like this driver belongs under net/ someplace to me.
NAK on drivers/infiniband/sw/ at least - that dir is only for HCA
drivers.
> >>+/* hfi_vnic_bus_init - initialize the hfi vnic bus driver */
> >>+static int hfi_vnic_bus_init(void)
> >>+{
> >>+ int rc;
> >>+
> >>+ ida_init(&hfi_vnic_ctrl_ida);
> >>+ idr_init(&hfi_vnic_idr);
> >>+
> >>+ rc = bus_register(&hfi_vnic_bus);
> >
> >Why on earth do we need this? Didn't I give you enough grief for the
> >psm stuff and now you want to create an entire subsystem hidden away!?
> >
> >Use some netlink scheme to control your vnic like the rest of the net
> >stack..
> >
>
> The hfi_vnic_bus only separates the HW independent functionality (like the
> Ethernet interface, encapsulation, IB MAD interface etc.) from the HW
> dependent functionality (sending/receiving packets on the wire).
> It thus provides a cleaner interface between the HW independent hfi_vnic
> Ethernet and Control drivers and the HW dependent HFI1 driver.
That doesn't explain anything; it sounds like you don't need it, so get rid
of it.
> There is no user interface here other than the standard Ethernet
> interface through the network stack.
Good, then this isn't needed, because it doesn't provide a user interface.
> #ls /sys/bus/hfi_vnic_bus/devices/
> hfi_vnic_ctrl_00 /* control device for HFI instance 0 */
> hfi_vnic_00.01.00 /* first VNIC port on HFI instance 0, port 1 */
Jason
^ permalink raw reply
* Re: [PATCH net-next] udp: avoid one cache line miss in recvmsg()
From: Paolo Abeni @ 2016-11-21 21:36 UTC (permalink / raw)
To: Eric Dumazet; +Cc: David Miller, netdev
In-Reply-To: <1479518283.8455.312.camel@edumazet-glaptop3.roam.corp.google.com>
On Fri, 2016-11-18 at 17:18 -0800, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>
>
> UDP_SKB_CB(skb)->partial_cov is located at offset 66 in skb,
> requiring a cold cache line to be read into the CPU cache.
>
> We can avoid this cache line miss for UDP sockets,
> as partial_cov is only meaningful for UDP-Lite.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> ---
> net/ipv4/udp.c | 3 ++-
> net/ipv6/udp.c | 3 ++-
> 2 files changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> index e1fc0116e8d59d8185670c6e55d1219bde55610d..b949770fdc08398a10f3974505a50b2b4f4b2cf3 100644
> --- a/net/ipv4/udp.c
> +++ b/net/ipv4/udp.c
> @@ -1389,7 +1389,8 @@ int udp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, int noblock,
> * coverage checksum (UDP-Lite), do it before the copy.
> */
>
> - if (copied < ulen || UDP_SKB_CB(skb)->partial_cov || peeking) {
> + if (copied < ulen || peeking ||
> + (is_udplite && UDP_SKB_CB(skb)->partial_cov)) {
> checksum_valid = !udp_lib_checksum_complete(skb);
> if (!checksum_valid)
> goto csum_copy_err;
> diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
> index 4f99417d9b401f2a65c7828e7d6b86d1d6161794..8fd4d89380b86c8630f7fd27ce4e9958497a2b89 100644
> --- a/net/ipv6/udp.c
> +++ b/net/ipv6/udp.c
> @@ -363,7 +363,8 @@ int udpv6_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
> * coverage checksum (UDP-Lite), do it before the copy.
> */
>
> - if (copied < ulen || UDP_SKB_CB(skb)->partial_cov || peeking) {
> + if (copied < ulen || peeking ||
> + (is_udplite && UDP_SKB_CB(skb)->partial_cov)) {
> checksum_valid = !udp_lib_checksum_complete(skb);
> if (!checksum_valid)
> goto csum_copy_err;
>
>
Nice catch, thank you Eric!
This gives up to 8% speed-up in my performance test (wire speed udp flood
with small packets)
Tested-by: Paolo Abeni <pabeni@redhat.com>
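The reordered condition in the patch relies on C short-circuit evaluation: once is_udplite is false, the right-hand operand is never evaluated, so the field is never loaded. A minimal user-space sketch of that effect follows; struct fake_cb, cold_reads, and the need_csum_* helpers are hypothetical names for illustration, not kernel code.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the skb control block; not kernel code. */
struct fake_cb {
	bool partial_cov;
};

static unsigned int cold_reads; /* counts accesses to the "cold" field */

/* Accessor that records every read of the cold field. */
static bool read_partial_cov(const struct fake_cb *cb)
{
	cold_reads++;
	return cb->partial_cov;
}

/* Old ordering: the cold field is read whenever copied >= ulen. */
static bool need_csum_old(int copied, int ulen, bool peeking,
			  const struct fake_cb *cb)
{
	return copied < ulen || read_partial_cov(cb) || peeking;
}

/* New ordering: the cold field is read only for UDP-Lite sockets. */
static bool need_csum_new(int copied, int ulen, bool peeking,
			  bool is_udplite, const struct fake_cb *cb)
{
	return copied < ulen || peeking ||
	       (is_udplite && read_partial_cov(cb));
}
```

For a plain UDP socket with a full copy and no peeking, need_csum_old() touches the field once, while need_csum_new() does not touch it at all.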
^ permalink raw reply
* Re: [RFC 02/10] IB/hfi-vnic: Virtual Network Interface Controller (VNIC) Bus driver
From: Vishwanathapura, Niranjana @ 2016-11-21 21:30 UTC (permalink / raw)
To: Jason Gunthorpe; +Cc: Doug Ledford, linux-rdma, netdev, Dennis Dalessandro
In-Reply-To: <20161119190445.GG22775@obsidianresearch.com>
On Sat, Nov 19, 2016 at 12:04:45PM -0700, Jason Gunthorpe wrote:
>On Fri, Nov 18, 2016 at 02:42:10PM -0800, Vishwanathapura, Niranjana wrote:
>> +HFI-VNIC DRIVER
>> +M: Dennis Dalessandro <dennis.dalessandro@intel.com>
>> +M: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
>> +L: linux-rdma@vger.kernel.org
>> +S: Supported
>> +F: drivers/infiniband/sw/intel/vnic
>
>This is either a net driver or a ULP, no idea why it should go in this
>directory!?
>
>It sounds like an ethernet driver, so you should probably put it
>there...
>
The hfi_vnic is an Ethernet driver. It is similar to a ULP like ipoib, but
here it carries Ethernet over Omni-Path.
The VNIC Ethernet (hfi_vnic) driver encapsulates Ethernet packets in an
Omni-Path header.
The hfi_vnic Ethernet driver does not access the HW. It interfaces with the HFI1
driver, which sends/receives Omni-Path encapsulated Ethernet frames to/from the HW.
Also, the VNIC control path driver (VEMA) is an IB MAD agent, which should be
under drivers/infiniband/.. .
Putting the VNIC Ethernet driver and the VNIC control driver together in a
single module (hfi_vnic.ko) provides a simpler interface between them.
So, we have put the driver under drivers/infiniband/sw/intel for two reasons:
a) We have the VNIC control driver (VEMA), which is an IB MAD agent.
b) The hfi_vnic Ethernet driver depends on the HFI1 driver for sending/receiving
Omni-Path encapsulated Ethernet packets to/from the HW.
>> +/* hfi_vnic_bus_init - initialize the hfi vnic bus driver */
>> +static int hfi_vnic_bus_init(void)
>> +{
>> + int rc;
>> +
>> + ida_init(&hfi_vnic_ctrl_ida);
>> + idr_init(&hfi_vnic_idr);
>> +
>> + rc = bus_register(&hfi_vnic_bus);
>
>Why on earth do we need this? Didn't I give you enough grief for the
>psm stuff and now you want to create an entire subsystem hidden away!?
>
>Use some netlink scheme to control your vnic like the rest of the net
>stack..
>
The hfi_vnic_bus only separates the HW independent functionality (like the
Ethernet interface, encapsulation, IB MAD interface etc.) from the HW dependent
functionality (sending/receiving packets on the wire).
It thus provides a cleaner interface between the HW independent hfi_vnic Ethernet
and Control drivers and the HW dependent HFI1 driver.
There is no user interface here other than the standard Ethernet
interface through the network stack.
The HFI1 driver creates VNIC devices on the hfi_vnic_bus as shown below, and the
hfi_vnic Ethernet and Control drivers drive them.
#ls /sys/bus/hfi_vnic_bus/devices/
hfi_vnic_ctrl_00 /* control device for HFI instance 0 */
hfi_vnic_00.01.00 /* first VNIC port on HFI instance 0, port 1 */
hfi_vnic_00.01.01 /* second VNIC port on HFI instance 0, port 1 */
The design is as shown in the below diagram.
+-------------------+ +----------------------+
|                   | |        Linux         |
|      IB MAD       | |       Network        |
|                   | |        Stack         |
+-------------------+ +----------------------+
          |                       |
          |                       |
+--------------------------------------------+
|                                            |
|              HFI VNIC Module               |
|     (HFI VNIC Netdev and EMA drivers)      |
|              (HW independent)              |
+--------------------------------------------+
                      |
                      |
+--------------------------------------------+
|                HFI VNIC Bus                |
+--------------------------------------------+
                      |
                      |
+--------------------------------------------+
|                                            |
|       HFI1 Driver with VNIC support        |
|               (HW dependent)               |
+--------------------------------------------+
Niranjana
>Jason
^ permalink raw reply
* [stable 4.4.y] ppp: defer netns reference release for ppp channel
From: Simon Arlott @ 2016-11-21 21:12 UTC (permalink / raw)
To: netdev
Please apply the following patch to linux-stable 4.4.y:
commit 205e1e255c479f3fd77446415706463b282f94e4
ppp: defer netns reference release for ppp channel
This is already present in 4.8.y and fixes an issue with ppp channels
that would otherwise cause a BUG() in ppp_pernet while a global ppp
mutex is held, preventing further ppp connections from being
established.
The issue is reproducible by having pppd use a PPPoE server that closes
new connections immediately (e.g. rp-pppoe "pppoe-server -q /bin/true").
--
Simon Arlott
^ permalink raw reply
* Re: [PATCH] net: ieee802154: constify ieee802154_ops structures
From: David Miller @ 2016-11-21 21:13 UTC (permalink / raw)
To: bhumirks
Cc: julia.lawall, michael.hennerich, aar, stefan, linux-wpan, netdev,
linux-kernel
In-Reply-To: <1479760214-32624-1-git-send-email-bhumirks@gmail.com>
From: Bhumika Goyal <bhumirks@gmail.com>
Date: Tue, 22 Nov 2016 02:00:14 +0530
> Declare the structure ieee802154_ops as const as it is only passed as an
> argument to the function ieee802154_alloc_hw. This argument is of type
> const struct ieee802154_ops *, so ieee802154_ops structures having this
> property can be declared as const.
> Done using Coccinelle:
...
> Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
Applied.
^ permalink raw reply
* Re: [Intel-wired-lan] [PATCH v2] e1000e: free IRQ regardless of __E1000_DOWN
From: Baicar, Tyler @ 2016-11-21 20:40 UTC (permalink / raw)
To: Neftin, Sasha, jeffrey.t.kirsher, intel-wired-lan, netdev,
linux-kernel, okaya, timur
In-Reply-To: <baeeb0a1-a454-6c81-59e9-3cec79a524a7@intel.com>
On 11/17/2016 6:31 AM, Neftin, Sasha wrote:
> On 11/13/2016 10:34 AM, Neftin, Sasha wrote:
>> On 11/11/2016 12:35 AM, Baicar, Tyler wrote:
>>> Hello Sasha,
>>>
>>> On 11/9/2016 11:19 PM, Neftin, Sasha wrote:
>>>> On 11/9/2016 11:41 PM, Tyler Baicar wrote:
>>>>> Move IRQ free code so that it will happen regardless of the
>>>>> __E1000_DOWN bit. Currently the e1000e driver only releases its IRQ
>>>>> if the __E1000_DOWN bit is cleared. This is not sufficient because
>>>>> it is possible for __E1000_DOWN to be set without releasing the IRQ.
>>>>> In such a situation, we will hit a kernel bug later in e1000_remove
>>>>> because the IRQ still has action since it was never freed. A
>>>>> secondary bus reset can cause this case to happen.
>>>>>
>>>>> Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
>>>>> ---
>>>>> drivers/net/ethernet/intel/e1000e/netdev.c | 3 ++-
>>>>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c
>>>>> b/drivers/net/ethernet/intel/e1000e/netdev.c
>>>>> index 7017281..36cfcb0 100644
>>>>> --- a/drivers/net/ethernet/intel/e1000e/netdev.c
>>>>> +++ b/drivers/net/ethernet/intel/e1000e/netdev.c
>>>>> @@ -4679,12 +4679,13 @@ int e1000e_close(struct net_device *netdev)
>>>>> if (!test_bit(__E1000_DOWN, &adapter->state)) {
>>>>> e1000e_down(adapter, true);
>>>>> - e1000_free_irq(adapter);
>>>>> /* Link status message must follow this format */
>>>>> pr_info("%s NIC Link is Down\n", adapter->netdev->name);
>>>>> }
>>>>> + e1000_free_irq(adapter);
>>>>> +
>>>>> napi_disable(&adapter->napi);
>>>>> e1000e_free_tx_resources(adapter->tx_ring);
>>>>>
>>>> I would not recommend making this change. It touches the driver
>>>> state machine, and we are afraid of a lot of synchronization problems
>>>> and issues.
>>>> We need to keep e1000_free_irq in the loop and check that 'test_bit' is ready.
>>> What do you mean here? There is no loop. If __E1000_DOWN is set then we
>>> will never free the IRQ.
>>>
>>>> Another point: before executing the secondary bus reset, does your SW
>>>> back up the PCIe configuration space properly?
>>> After a secondary bus reset, the link needs to recover and go back to a
>>> working state after 1 second.
>>>
>>> From the callstack, the issue is happening while removing the endpoint
>>> from the system, before applying the secondary bus reset.
>>>
>>> The order of events is
>>> 1. remove the drivers
>>> 2. cause a secondary bus reset
>>> 3. wait 1 second
>> Actually, this is too much; usually the link is up in less than 100 ms. You
>> can check the Data Link Layer indication.
>>> 4. recover the link
>>>
>>> callstack:
>>> free_msi_irqs+0x6c/0x1a8
>>> pci_disable_msi+0xb0/0x148
>>> e1000e_reset_interrupt_capability+0x60/0x78
>>> e1000_remove+0xc8/0x180
>>> pci_device_remove+0x48/0x118
>>> __device_release_driver+0x80/0x108
>>> device_release_driver+0x2c/0x40
>>> pci_stop_bus_device+0xa0/0xb0
>>> pci_stop_bus_device+0x3c/0xb0
>>> pci_stop_root_bus+0x54/0x80
>>> acpi_pci_root_remove+0x28/0x64
>>> acpi_bus_trim+0x6c/0xa4
>>> acpi_device_hotplug+0x19c/0x3f4
>>> acpi_hotplug_work_fn+0x28/0x3c
>>> process_one_work+0x150/0x460
>>> worker_thread+0x50/0x4b8
>>> kthread+0xd4/0xe8
>>> ret_from_fork+0x10/0x50
>>>
>>> Thanks,
>>> Tyler
>>>
>> Hello Tyler,
>> Okay, we need to consult more about this suggestion.
>> May I ask what setup you are running? Is it a NIC or on-board LAN? I would
>> like to try to reproduce this issue in our lab too.
>> Also, is the same issue observed in the same scenario with other NICs too?
>> Sasha
>> _______________________________________________
>> Intel-wired-lan mailing list
>> Intel-wired-lan@lists.osuosl.org
>> http://lists.osuosl.org/mailman/listinfo/intel-wired-lan
>>
> Hello Tyler,
> I see some inconsistent implementations of the __*_close methods in our
> drivers. Do you have an igb NIC to check whether the same problem persists there?
> Thanks,
> Sasha
Hello Sasha,
I couldn't find an igb NIC to test with, but I did find another e1000e
card that does not cause the same issue. That card is:
0004:01:00.0 Ethernet controller: Intel Corporation 82574L Gigabit
Network Connection
Subsystem: Intel Corporation Gigabit CT Desktop Adapter
Physical Slot: 5-1
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR+ FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
<TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Interrupt: pin A routed to IRQ 299
Region 0: Memory at c01001c0000 (32-bit, non-prefetchable)
[size=128K]
Region 1: Memory at c0100100000 (32-bit, non-prefetchable)
[size=512K]
Region 2: I/O ports at 1000 [size=32]
Region 3: Memory at c01001e0000 (32-bit, non-prefetchable)
[size=16K]
Expansion ROM at c0100180000 [disabled] [size=256K]
Capabilities: [c8] Power Management version 2
Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA
PME(D0+,D1-,D2-,D3hot+,D3cold+)
Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
Address: 00000000397f0040 Data: 0000
Capabilities: [e0] Express (v1) Endpoint, MSI 00
DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s
<512ns, L1 <64us
ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+
Unsupported+
RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
MaxPayload 256 bytes, MaxReadReq 512 bytes
DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq-
AuxPwr+ TransPend-
LnkCap: Port #8, Speed 2.5GT/s, Width x1, ASPM L0s L1,
Exit Latency L0s <128ns, L1 <64us
ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
LnkCtl: ASPM L1 Enabled; RCB 64 bytes Disabled- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+
DLActive- BWMgmt- ABWMgmt-
Capabilities: [a0] MSI-X: Enable- Count=5 Masked-
Vector table: BAR=3 offset=00000000
PBA: BAR=3 offset=00002000
Capabilities: [100 v1] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt-
UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt-
UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt-
UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout-
NonFatalErr-
CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout-
NonFatalErr+
AERCap: First Error Pointer: 00, GenCap- CGenEn-
ChkCap- ChkEn-
Capabilities: [140 v1] Device Serial Number 68-05-ca-ff-ff-29-47-34
Kernel driver in use: e1000e
Here are the kernel logs from first running the test on this card and
then running the test on the Intel 82571EB card that I originally found
the issue with (you can see the issue doesn't happen on this card but
does on the other):
[ 44.146749] ACPI: \_SB_.PCI0: Device has suffered a power fault
[ 44.155238] pcieport 0000:00:00.0: PCIe Bus Error:
severity=Uncorrected (Non-Fatal), type=Transaction Layer,
id=0000(Requester ID)
[ 44.166111] pcieport 0000:00:00.0: device [17cb:0400] error
status/mask=00004000/00400000
[ 44.174420] pcieport 0000:00:00.0: [14] Completion Timeout (First)
[ 44.401943] e1000e 0000:01:00.0 eth0: Timesync Tx Control register
not set as expected
[ 82.445586] pcieport 0002:00:00.0: PCIe Bus Error:
severity=Corrected, type=Physical Layer, id=0000(Receiver ID)
[ 82.454851] pcieport 0002:00:00.0: device [17cb:0400] error
status/mask=00000001/00006000
[ 82.463209] pcieport 0002:00:00.0: [ 0] Receiver Error
[ 82.469355] pcieport 0002:00:00.0: PCIe Bus Error:
severity=Uncorrected (Non-Fatal), type=Transaction Layer,
id=0000(Requester ID)
[ 82.481026] pcieport 0002:00:00.0: device [17cb:0400] error
status/mask=00004000/00400000
[ 82.489343] pcieport 0002:00:00.0: [14] Completion Timeout (First)
[ 82.504573] ACPI: \_SB_.PCI2: Device has suffered a power fault
[ 84.528036] kernel BUG at drivers/pci/msi.c:369!
I'm not sure why it reproduces on the 82571EB card and not the 82574L
card. The only obvious difference is that there is no Receiver Error on the
82574L card.
If you have a patch fixing the inconsistencies you mentioned with the
__*_close methods I would certainly be willing to test it out!
Thanks,
Tyler
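To make the ordering problem from the patch concrete, here is a small user-space model of the close path. It is only a sketch: struct fake_adapter and both close helpers are hypothetical, and the model ignores everything e1000e_down() and e1000_free_irq() really do beyond flipping one flag each.

```c
#include <stdbool.h>

/* Hypothetical model of the adapter state; not the e1000e structures. */
struct fake_adapter {
	bool down;          /* models the __E1000_DOWN bit */
	bool irq_requested; /* models an IRQ that still has an action */
};

/* Pre-patch ordering: the IRQ is freed only if DOWN was not already set. */
static void close_before_patch(struct fake_adapter *a)
{
	if (!a->down) {
		a->down = true;           /* models e1000e_down() */
		a->irq_requested = false; /* models e1000_free_irq() */
	}
}

/* Patched ordering: the IRQ is freed regardless of the DOWN bit. */
static void close_after_patch(struct fake_adapter *a)
{
	if (!a->down)
		a->down = true;           /* models e1000e_down() */
	a->irq_requested = false;         /* models e1000_free_irq() */
}
```

If something has already set the DOWN bit while the IRQ is still requested, close_before_patch() leaves the IRQ in place, which corresponds to the state that later trips the BUG in the remove path; close_after_patch() releases it in both cases.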
--
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.
^ permalink raw reply
* Re: [PATCH] net: ieee802154: constify ieee802154_ops structures
From: Stefan Schmidt @ 2016-11-21 20:39 UTC (permalink / raw)
To: Bhumika Goyal, julia.lawall, michael.hennerich, aar, linux-wpan,
netdev, linux-kernel
In-Reply-To: <1479760214-32624-1-git-send-email-bhumirks@gmail.com>
Hello.
On 21/11/16 21:30, Bhumika Goyal wrote:
> Declare the structure ieee802154_ops as const as it is only passed as an
> argument to the function ieee802154_alloc_hw. This argument is of type
> const struct ieee802154_ops *, so ieee802154_ops structures having this
> property can be declared as const.
> Done using Coccinelle:
>
> @r1 disable optional_qualifier @
> identifier i;
> position p;
> @@
> static struct ieee802154_ops i@p = {...};
>
> @ok1@
> identifier r1.i;
> position p;
> expression e1;
> @@
> ieee802154_alloc_hw(e1,&i@p)
>
> @bad@
> position p!={r1.p,ok1.p};
> identifier r1.i;
> @@
> i@p
>
> @depends on !bad disable optional_qualifier@
> identifier r1.i;
> @@
> static
> +const
> struct ieee802154_ops i={...};
>
> @depends on !bad disable optional_qualifier@
> identifier r1.i;
> @@
> +const
> struct ieee802154_ops i;
>
> The before and after size details of the affected files are:
>
> text data bss dec hex filename
> 8669 1176 16 9861 2685 drivers/net/ieee802154/adf7242.o
> 8805 1048 16 9869 268d drivers/net/ieee802154/adf7242.o
>
> text data bss dec hex filename
> 7211 2296 32 9539 2543 drivers/net/ieee802154/atusb.o
> 7339 2160 32 9531 253b drivers/net/ieee802154/atusb.o
>
> Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
> ---
> drivers/net/ieee802154/adf7242.c | 2 +-
> drivers/net/ieee802154/atusb.c | 2 +-
> 2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/ieee802154/adf7242.c b/drivers/net/ieee802154/adf7242.c
> index 9fa7ac9..4ff4c7d 100644
> --- a/drivers/net/ieee802154/adf7242.c
> +++ b/drivers/net/ieee802154/adf7242.c
> @@ -874,7 +874,7 @@ static int adf7242_rx(struct adf7242_local *lp)
> return 0;
> }
>
> -static struct ieee802154_ops adf7242_ops = {
> +static const struct ieee802154_ops adf7242_ops = {
> .owner = THIS_MODULE,
> .xmit_sync = adf7242_xmit,
> .ed = adf7242_ed,
> diff --git a/drivers/net/ieee802154/atusb.c b/drivers/net/ieee802154/atusb.c
> index 1056ed1..322864a 100644
> --- a/drivers/net/ieee802154/atusb.c
> +++ b/drivers/net/ieee802154/atusb.c
> @@ -567,7 +567,7 @@ static void atusb_stop(struct ieee802154_hw *hw)
> return 0;
> }
>
> -static struct ieee802154_ops atusb_ops = {
> +static const struct ieee802154_ops atusb_ops = {
> .owner = THIS_MODULE,
> .xmit_async = atusb_xmit,
> .ed = atusb_ed,
>
Acked-by: Stefan Schmidt <stefan@osg.samsung.com>
regards
Stefan Schmidt
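The pattern behind the patch can be sketched in plain C: an ops table that is only ever handed to a function taking a pointer-to-const can itself be declared const. The names below (struct radio_ops, demo_ops, radio_register) are hypothetical and only loosely modelled on the ieee802154_ops / ieee802154_alloc_hw pair described above.

```c
#include <stddef.h>

/* Hypothetical ops table; loosely modelled on the pattern in the patch. */
struct radio_ops {
	int (*start)(void);
	int (*ed)(unsigned char *level);
};

static int demo_start(void)
{
	return 0;
}

static int demo_ed(unsigned char *level)
{
	*level = 42; /* arbitrary value for the example */
	return 0;
}

/*
 * const: the table is never written after initialization, so the
 * toolchain is free to place it in a read-only section.
 */
static const struct radio_ops demo_ops = {
	.start = demo_start,
	.ed = demo_ed,
};

static const struct radio_ops *registered;

/* Hypothetical registration helper that takes a pointer-to-const. */
static int radio_register(const struct radio_ops *ops)
{
	if (ops == NULL || ops->start == NULL)
		return -1;
	registered = ops;
	return 0;
}
```

Moving such a table into read-only data would be consistent with the size figures quoted in the patch, where data shrinks and text grows by a similar amount, since the Berkeley-style size output counts read-only data under text.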
^ permalink raw reply
* Re: [PATCH net-next v3 4/7] vxlan: improve vxlan route lookup checks.
From: Pravin Shelar @ 2016-11-21 20:34 UTC (permalink / raw)
To: Jiri Benc; +Cc: David Laight, netdev@vger.kernel.org
In-Reply-To: <20161117165950.6a8ed0d0@griffin>
On Thu, Nov 17, 2016 at 7:59 AM, Jiri Benc <jbenc@redhat.com> wrote:
> On Thu, 17 Nov 2016 10:17:01 +0000, David Laight wrote:
>> Worse than arbitrary, it adds 4 bytes of pad on 64bit systems.
>
> It does not, this is not a struct.
>
Right.
After looking at the assembly code, it is clear that GCC and most
modern compilers can reorder a function's local variables for efficient storage.
^ permalink raw reply
* [PATCH] net: ieee802154: constify ieee802154_ops structures
From: Bhumika Goyal @ 2016-11-21 20:30 UTC (permalink / raw)
To: julia.lawall, michael.hennerich, aar, stefan, linux-wpan, netdev,
linux-kernel
Cc: Bhumika Goyal
Declare the structure ieee802154_ops as const as it is only passed as an
argument to the function ieee802154_alloc_hw. This argument is of type
const struct ieee802154_ops *, so ieee802154_ops structures having this
property can be declared as const.
Done using Coccinelle:
@r1 disable optional_qualifier @
identifier i;
position p;
@@
static struct ieee802154_ops i@p = {...};
@ok1@
identifier r1.i;
position p;
expression e1;
@@
ieee802154_alloc_hw(e1,&i@p)
@bad@
position p!={r1.p,ok1.p};
identifier r1.i;
@@
i@p
@depends on !bad disable optional_qualifier@
identifier r1.i;
@@
static
+const
struct ieee802154_ops i={...};
@depends on !bad disable optional_qualifier@
identifier r1.i;
@@
+const
struct ieee802154_ops i;
The before and after size details of the affected files are:
text data bss dec hex filename
8669 1176 16 9861 2685 drivers/net/ieee802154/adf7242.o
8805 1048 16 9869 268d drivers/net/ieee802154/adf7242.o
text data bss dec hex filename
7211 2296 32 9539 2543 drivers/net/ieee802154/atusb.o
7339 2160 32 9531 253b drivers/net/ieee802154/atusb.o
Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
---
drivers/net/ieee802154/adf7242.c | 2 +-
drivers/net/ieee802154/atusb.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ieee802154/adf7242.c b/drivers/net/ieee802154/adf7242.c
index 9fa7ac9..4ff4c7d 100644
--- a/drivers/net/ieee802154/adf7242.c
+++ b/drivers/net/ieee802154/adf7242.c
@@ -874,7 +874,7 @@ static int adf7242_rx(struct adf7242_local *lp)
return 0;
}
-static struct ieee802154_ops adf7242_ops = {
+static const struct ieee802154_ops adf7242_ops = {
.owner = THIS_MODULE,
.xmit_sync = adf7242_xmit,
.ed = adf7242_ed,
diff --git a/drivers/net/ieee802154/atusb.c b/drivers/net/ieee802154/atusb.c
index 1056ed1..322864a 100644
--- a/drivers/net/ieee802154/atusb.c
+++ b/drivers/net/ieee802154/atusb.c
@@ -567,7 +567,7 @@ static void atusb_stop(struct ieee802154_hw *hw)
return 0;
}
-static struct ieee802154_ops atusb_ops = {
+static const struct ieee802154_ops atusb_ops = {
.owner = THIS_MODULE,
.xmit_async = atusb_xmit,
.ed = atusb_ed,
--
1.9.1
^ permalink raw reply related
* RE: [PATCH for-next 03/11] IB/hns: Optimize the logic of allocating memory using APIs
From: Salil Mehta @ 2016-11-21 20:20 UTC (permalink / raw)
To: Leon Romanovsky
Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org, Huwei (Xavier),
oulijun, mehta.salil.lnk-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Linuxarm,
Zhangping (ZP)
In-Reply-To: <20161121171423.GA23083-2ukJVAZIZ/Y@public.gmane.org>
> -----Original Message-----
> From: netdev-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org [mailto:netdev-
> owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org] On Behalf Of Leon Romanovsky
> Sent: Monday, November 21, 2016 5:14 PM
> To: Salil Mehta
> Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org; Huwei (Xavier); oulijun;
> mehta.salil.lnk-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org; linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org;
> netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; Linuxarm;
> Zhangping (ZP)
> Subject: Re: [PATCH for-next 03/11] IB/hns: Optimize the logic of
> allocating memory using APIs
>
> On Mon, Nov 21, 2016 at 04:12:38PM +0000, Salil Mehta wrote:
> > > -----Original Message-----
> > > From: Leon Romanovsky [mailto:leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org]
> > > Sent: Wednesday, November 16, 2016 8:36 AM
> > > To: Salil Mehta
> > > Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org; Huwei (Xavier); oulijun;
> > > mehta.salil.lnk-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org; linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org;
> > > netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; Linuxarm;
> > > Zhangping (ZP)
> > > Subject: Re: [PATCH for-next 03/11] IB/hns: Optimize the logic of
> > > allocating memory using APIs
> > >
> > > On Tue, Nov 15, 2016 at 03:52:46PM +0000, Salil Mehta wrote:
> > > > > -----Original Message-----
> > > > > From: Leon Romanovsky [mailto:leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org]
> > > > > Sent: Wednesday, November 09, 2016 7:22 AM
> > > > > To: Salil Mehta
> > > > > Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org; Huwei (Xavier); oulijun;
> > > > > mehta.salil.lnk-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org; linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org;
> > > > > netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; Linuxarm;
> > > > > Zhangping (ZP)
> > > > > Subject: Re: [PATCH for-next 03/11] IB/hns: Optimize the logic
> of
> > > > > allocating memory using APIs
> > > > >
> > > > > On Fri, Nov 04, 2016 at 04:36:25PM +0000, Salil Mehta wrote:
> > > > > > From: "Wei Hu (Xavier)" <xavier.huwei-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
> > > > > >
> > > > > > This patch modifies the memory allocation logic in the hns RoCE
> > > > > > driver. We use kcalloc instead of kmalloc_array and
> > > > > > bitmap_zero. When kcalloc fails, we call vzalloc to allocate
> > > > > > the memory.
> > > > > >
> > > > > > Signed-off-by: Wei Hu (Xavier) <xavier.huwei-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
> > > > > > Signed-off-by: Ping Zhang <zhangping5-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
> > > > > > Signed-off-by: Salil Mehta <salil.mehta-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
> > > > > > ---
> > > > > > drivers/infiniband/hw/hns/hns_roce_mr.c | 15 ++++++++-------
> > > > > > 1 file changed, 8 insertions(+), 7 deletions(-)
> > > > > >
> > > > > > diff --git a/drivers/infiniband/hw/hns/hns_roce_mr.c b/drivers/infiniband/hw/hns/hns_roce_mr.c
> > > > > > index fb87883..d3dfb5f 100644
> > > > > > --- a/drivers/infiniband/hw/hns/hns_roce_mr.c
> > > > > > +++ b/drivers/infiniband/hw/hns/hns_roce_mr.c
> > > > > > @@ -137,11 +137,12 @@ static int hns_roce_buddy_init(struct hns_roce_buddy *buddy, int max_order)
> > > > > >
> > > > > > for (i = 0; i <= buddy->max_order; ++i) {
> > > > > > s = BITS_TO_LONGS(1 << (buddy->max_order - i));
> > > > > > - buddy->bits[i] = kmalloc_array(s, sizeof(long), GFP_KERNEL);
> > > > > > - if (!buddy->bits[i])
> > > > > > - goto err_out_free;
> > > > > > -
> > > > > > - bitmap_zero(buddy->bits[i], 1 << (buddy->max_order - i));
> > > > > > + buddy->bits[i] = kcalloc(s, sizeof(long), GFP_KERNEL);
> > > > > > + if (!buddy->bits[i]) {
> > > > > > + buddy->bits[i] = vzalloc(s * sizeof(long));
> > > > >
> > > > > I wonder, why don't you use directly vzalloc instead of kcalloc
> > > > > fallback?
> > > > As we know, we will have physically contiguous pages if the kcalloc
> > > > call succeeds. This gives us a chance of better performance than
> > > > allocations that are only virtually contiguous, as obtained through
> > > > vzalloc(). Therefore, the latter is used only as a fallback
> > > > when our memory request cannot be satisfied through kcalloc.
> > > >
> > > > Are you suggesting that there will not be much of a performance
> > > > penalty if we just use vzalloc?
> > >
> > > Not exactly,
> > > I asked it, because we have similar code in our drivers and this
> > > construction looks strange to me.
> > >
> > > 1. If performance is critical, we will use kmalloc.
> > > 2. If performance is not critical, we will use vmalloc.
> > >
> > > But in this case, such a construction tells me that we can live with
> > > vmalloc performance and that kmalloc allocations are not really needed.
> > >
> > > In your specific case, I'm not sure that kcalloc will ever fail.
> > Performance is definitely critical here, though I agree this is a bit of
> > an unusual way to allocate memory. In practice, we were encountering
> > memory allocation failures with kmalloc (the allocation size
> > is on the higher side and grows exponentially), so we ended up using
> > vmalloc as a fallback. It is a very naïve allocation scheme.
>
> I understand, we did the same; see our mlx5_vzalloc call.
> BTW, we used the __GFP_NOWARN flag, which you should consider using
> in your case too.
Ok. Will add this flag and refloat patch V3.
Thanks
>
> >
> > Maybe we need to rethink this part of the allocation scheme? Also, I can
> > pull back this particular patch for now, or just live with vzalloc() until
> > we figure out a proper solution to this?
>
> It is up to you, I don't think that you should drop it, AFAIK, there is
> no other proper solution.
Ok we will live with it for now and later maybe we can see how we can optimize
pre-allocation of physically contiguous memory.
Thanks for your suggestions!
Salil
>
> >
> > >
> > > Thanks
> > >
> > >
> > > >
> > > > >
> > > > > > + if (!buddy->bits[i])
> > > > > > + goto err_out_free;
> > > > > > + }
> > > > > > }
^ permalink raw reply
* [RFC net-next 3/3] net: dsa: b53: Remove CPU port specific VLAN programming
From: Florian Fainelli @ 2016-11-21 19:09 UTC (permalink / raw)
To: netdev
Cc: davem, bridge, stephen, vivien.didelot, andrew, jiri, idosch,
Florian Fainelli
In-Reply-To: <20161121190925.14530-1-f.fainelli@gmail.com>
Now that DSA calls into the switch driver to program the CPU port's VLAN
attributes, we can get rid of the code that dealt with adding/removing
the CPU port to a downstream facing port VLAN membership.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
drivers/net/dsa/b53/b53_common.c | 22 ++++++----------------
1 file changed, 6 insertions(+), 16 deletions(-)
diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c
index 7717b19dc806..6577286a2721 100644
--- a/drivers/net/dsa/b53/b53_common.c
+++ b/drivers/net/dsa/b53/b53_common.c
@@ -951,7 +951,6 @@ static void b53_vlan_add(struct dsa_switch *ds, int port,
struct b53_device *dev = ds->priv;
bool untagged = vlan->flags & BRIDGE_VLAN_INFO_UNTAGGED;
bool pvid = vlan->flags & BRIDGE_VLAN_INFO_PVID;
- unsigned int cpu_port = dev->cpu_port;
struct b53_vlan *vl;
u16 vid;
@@ -960,11 +959,11 @@ static void b53_vlan_add(struct dsa_switch *ds, int port,
b53_get_vlan_entry(dev, vid, vl);
- vl->members |= BIT(port) | BIT(cpu_port);
+ vl->members |= BIT(port);
if (untagged)
- vl->untag |= BIT(port) | BIT(cpu_port);
+ vl->untag |= BIT(port);
else
- vl->untag &= ~(BIT(port) | BIT(cpu_port));
+ vl->untag &= ~BIT(port);
b53_set_vlan_entry(dev, vid, vl);
b53_fast_age_vlan(dev, vid);
@@ -973,8 +972,6 @@ static void b53_vlan_add(struct dsa_switch *ds, int port,
if (pvid) {
b53_write16(dev, B53_VLAN_PAGE, B53_VLAN_PORT_DEF_TAG(port),
vlan->vid_end);
- b53_write16(dev, B53_VLAN_PAGE, B53_VLAN_PORT_DEF_TAG(cpu_port),
- vlan->vid_end);
b53_fast_age_vlan(dev, vid);
}
}
@@ -984,7 +981,6 @@ static int b53_vlan_del(struct dsa_switch *ds, int port,
{
struct b53_device *dev = ds->priv;
bool untagged = vlan->flags & BRIDGE_VLAN_INFO_UNTAGGED;
- unsigned int cpu_port = dev->cpu_port;
struct b53_vlan *vl;
u16 vid;
u16 pvid;
@@ -997,8 +993,6 @@ static int b53_vlan_del(struct dsa_switch *ds, int port,
b53_get_vlan_entry(dev, vid, vl);
vl->members &= ~BIT(port);
- if ((vl->members & BIT(cpu_port)) == BIT(cpu_port))
- vl->members = 0;
if (pvid == vid) {
if (is5325(dev) || is5365(dev))
@@ -1007,18 +1001,14 @@ static int b53_vlan_del(struct dsa_switch *ds, int port,
pvid = 0;
}
- if (untagged) {
+ if (untagged)
vl->untag &= ~(BIT(port));
- if ((vl->untag & BIT(cpu_port)) == BIT(cpu_port))
- vl->untag = 0;
- }
b53_set_vlan_entry(dev, vid, vl);
b53_fast_age_vlan(dev, vid);
}
b53_write16(dev, B53_VLAN_PAGE, B53_VLAN_PORT_DEF_TAG(port), pvid);
- b53_write16(dev, B53_VLAN_PAGE, B53_VLAN_PORT_DEF_TAG(cpu_port), pvid);
b53_fast_age_vlan(dev, pvid);
return 0;
@@ -1396,8 +1386,8 @@ static void b53_br_leave(struct dsa_switch *ds, int port)
b53_write16(dev, B53_VLAN_PAGE, B53_JOIN_ALL_VLAN_EN, reg);
} else {
b53_get_vlan_entry(dev, pvid, vl);
- vl->members |= BIT(port) | BIT(dev->cpu_port);
- vl->untag |= BIT(port) | BIT(dev->cpu_port);
+ vl->members |= BIT(port);
+ vl->untag |= BIT(port);
b53_set_vlan_entry(dev, pvid, vl);
}
}
--
2.9.3
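With the CPU port no longer folded into every update, the per-VLAN
bookkeeping in this patch reduces to setting and clearing a single port
bit. A minimal user-space sketch of that bit manipulation follows,
assuming a 16-bit port mask. SKETCH_BIT(), struct sketch_vlan and the
sketch_vlan_*() helpers are hypothetical stand-ins, not the b53 driver's
own types or functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical equivalent of the kernel's BIT() for a 16-bit port mask. */
#define SKETCH_BIT(n) ((uint16_t)(1u << (n)))

struct sketch_vlan {
	uint16_t members; /* ports that belong to the VLAN */
	uint16_t untag;   /* ports that egress the VLAN untagged */
};

/* Add one port to a VLAN; no other port's bits are touched. */
static void sketch_vlan_add(struct sketch_vlan *vl, unsigned int port,
			    bool untagged)
{
	assert(vl != NULL && port < 16);

	vl->members |= SKETCH_BIT(port);
	if (untagged)
		vl->untag |= SKETCH_BIT(port);
	else
		vl->untag &= (uint16_t)~SKETCH_BIT(port);
}

/* Remove one port from a VLAN, mirroring the simplified delete path. */
static void sketch_vlan_del(struct sketch_vlan *vl, unsigned int port,
			    bool untagged)
{
	assert(vl != NULL && port < 16);

	vl->members &= (uint16_t)~SKETCH_BIT(port);
	if (untagged)
		vl->untag &= (uint16_t)~SKETCH_BIT(port);
}
```

Under this model the CPU port is programmed by its own call, for example
sketch_vlan_add(&vl, 8, false) for a tagged CPU port 8 (the port number
is only an example).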
^ permalink raw reply related
* [RFC net-next 2/3] net: dsa: Propagate VLAN add/del to CPU port(s)
From: Florian Fainelli @ 2016-11-21 19:09 UTC (permalink / raw)
To: netdev
Cc: davem, bridge, stephen, vivien.didelot, andrew, jiri, idosch,
Florian Fainelli
In-Reply-To: <20161121190925.14530-1-f.fainelli@gmail.com>
Now that the bridge layer can call into switchdev to signal programming
requests targeting the bridge master device itself, allow the switch
drivers to implement separate programming of downstream and
upstream/management ports.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
net/dsa/slave.c | 45 +++++++++++++++++++++++++++++++++------------
1 file changed, 33 insertions(+), 12 deletions(-)
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index d0c7bce88743..18288261b964 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -223,35 +223,30 @@ static int dsa_slave_set_mac_address(struct net_device *dev, void *a)
return 0;
}
-static int dsa_slave_port_vlan_add(struct net_device *dev,
+static int dsa_slave_port_vlan_add(struct dsa_switch *ds, int port,
const struct switchdev_obj_port_vlan *vlan,
struct switchdev_trans *trans)
{
- struct dsa_slave_priv *p = netdev_priv(dev);
- struct dsa_switch *ds = p->parent;
if (switchdev_trans_ph_prepare(trans)) {
if (!ds->ops->port_vlan_prepare || !ds->ops->port_vlan_add)
return -EOPNOTSUPP;
- return ds->ops->port_vlan_prepare(ds, p->port, vlan, trans);
+ return ds->ops->port_vlan_prepare(ds, port, vlan, trans);
}
- ds->ops->port_vlan_add(ds, p->port, vlan, trans);
+ ds->ops->port_vlan_add(ds, port, vlan, trans);
return 0;
}
-static int dsa_slave_port_vlan_del(struct net_device *dev,
+static int dsa_slave_port_vlan_del(struct dsa_switch *ds, int port,
const struct switchdev_obj_port_vlan *vlan)
{
- struct dsa_slave_priv *p = netdev_priv(dev);
- struct dsa_switch *ds = p->parent;
-
if (!ds->ops->port_vlan_del)
return -EOPNOTSUPP;
- return ds->ops->port_vlan_del(ds, p->port, vlan);
+ return ds->ops->port_vlan_del(ds, port, vlan);
}
static int dsa_slave_port_vlan_dump(struct net_device *dev,
@@ -465,8 +460,21 @@ static int dsa_slave_port_obj_add(struct net_device *dev,
const struct switchdev_obj *obj,
struct switchdev_trans *trans)
{
+ struct dsa_slave_priv *p = netdev_priv(dev);
+ struct dsa_switch *ds = p->parent;
+ int port = p->port;
int err;
+ /* Here we may be called with an orig_dev which is different from dev,
+ * on purpose, to receive requests coming from e.g. the bridge master
+ * device. Although there is no network device associated with CPU/DSA
+ * ports, we may still have programming operations for these ports.
+ */
+ if (obj->orig_dev == p->bridge_dev) {
+ ds = ds->dst->ds[0];
+ port = ds->dst->cpu_port;
+ }
+
/* For the prepare phase, ensure the full set of changes is feasable in
* one go in order to signal a failure properly. If an operation is not
* supported, return -EOPNOTSUPP.
@@ -483,7 +491,7 @@ static int dsa_slave_port_obj_add(struct net_device *dev,
trans);
break;
case SWITCHDEV_OBJ_ID_PORT_VLAN:
- err = dsa_slave_port_vlan_add(dev,
+ err = dsa_slave_port_vlan_add(ds, port,
SWITCHDEV_OBJ_PORT_VLAN(obj),
trans);
break;
@@ -498,8 +506,21 @@ static int dsa_slave_port_obj_add(struct net_device *dev,
static int dsa_slave_port_obj_del(struct net_device *dev,
const struct switchdev_obj *obj)
{
+ struct dsa_slave_priv *p = netdev_priv(dev);
+ struct dsa_switch *ds = p->parent;
+ int port = p->port;
int err;
+ /* Here we may be called with an orig_dev which is different from dev,
+ * on purpose, to receive requests coming from e.g. the bridge master
+ * device. Although there is no network device associated with CPU/DSA
+ * ports, we may still have programming operations for these ports.
+ */
+ if (obj->orig_dev == p->bridge_dev) {
+ ds = ds->dst->ds[0];
+ port = ds->dst->cpu_port;
+ }
+
switch (obj->id) {
case SWITCHDEV_OBJ_ID_PORT_FDB:
err = dsa_slave_port_fdb_del(dev,
@@ -509,7 +530,7 @@ static int dsa_slave_port_obj_del(struct net_device *dev,
err = dsa_slave_port_mdb_del(dev, SWITCHDEV_OBJ_PORT_MDB(obj));
break;
case SWITCHDEV_OBJ_ID_PORT_VLAN:
- err = dsa_slave_port_vlan_del(dev,
+ err = dsa_slave_port_vlan_del(ds, port,
SWITCHDEV_OBJ_PORT_VLAN(obj));
break;
default:
--
2.9.3
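The redirection added to dsa_slave_port_obj_add() and
dsa_slave_port_obj_del() boils down to one decision: if the request
originated from the bridge master device, target the CPU port, otherwise
target the slave's own port. A small user-space sketch of that decision
follows. struct sketch_dev, struct sketch_slave and sketch_target_port()
are hypothetical, and the NULL guard on bridge_dev is an assumption of
this sketch that is not present in the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a struct net_device. */
struct sketch_dev {
	const char *name;
};

/* Hypothetical, flattened view of the fields the patch consults. */
struct sketch_slave {
	const struct sketch_dev *bridge_dev; /* bridge master, or NULL */
	int port;                            /* this slave's switch port */
	int cpu_port;                        /* CPU port of the switch tree */
};

/* Return the switch port that a switchdev object should be applied to. */
static int sketch_target_port(const struct sketch_slave *p,
			      const struct sketch_dev *orig_dev)
{
	assert(p != NULL);

	/* Requests coming from the bridge master device go to the CPU port. */
	if (p->bridge_dev != NULL && orig_dev == p->bridge_dev)
		return p->cpu_port;

	/* Everything else is applied to the slave's own port. */
	return p->port;
}
```

With a slave on port 2 and a CPU port 8 (example numbers), a request
whose orig_dev is the bridge resolves to 8, and any other request
resolves to 2.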
^ permalink raw reply related
* [RFC net-next 1/3] net: bridge: Allow bridge master device to configure switch CPU port
From: Florian Fainelli @ 2016-11-21 19:09 UTC (permalink / raw)
To: netdev
Cc: davem, bridge, stephen, vivien.didelot, andrew, jiri, idosch,
Florian Fainelli
In-Reply-To: <20161121190925.14530-1-f.fainelli@gmail.com>
A use case which is currently not possible with Linux bridges on top of
network switches is to configure the CPU port of the switch (inherently
presented to the user with a bridge master device) independently from
its downstream ports, with a different set of VLAN properties. The
reason is that the switch driver never gets any call to
switchdev_port_obj_{add,del} with obj->orig_dev set to the bridge
master device.
This patch allows CPU/management ports to, e.g., receive all traffic as
tagged, whereas the downstream ports may have different untagged VLAN
settings.
The following happens now (assuming bridge master device is already
created):
bridge vlan add vid 2 dev port0 pvid untagged
-> port0 (e.g: switch port 0) gets programmed
-> CPU port gets programmed
bridge vlan add vid 2 dev br0 self
-> CPU port gets programmed
bridge vlan add vid 2 dev port0
-> port0 (switch port 0) gets programmed
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
net/bridge/br_vlan.c | 28 +++++++++++++++++++++++++---
1 file changed, 25 insertions(+), 3 deletions(-)
diff --git a/net/bridge/br_vlan.c b/net/bridge/br_vlan.c
index b6de4f457161..b335d66d21db 100644
--- a/net/bridge/br_vlan.c
+++ b/net/bridge/br_vlan.c
@@ -228,7 +228,9 @@ static int __vlan_add(struct net_bridge_vlan *v, u16 flags)
err = __vlan_vid_add(dev, br, v->vid, flags);
if (err)
goto out;
+ }
+ if (p) {
/* need to work on the master vlan too */
if (flags & BRIDGE_VLAN_INFO_MASTER) {
err = br_vlan_add(br, v->vid, flags |
@@ -242,6 +244,14 @@ static int __vlan_add(struct net_bridge_vlan *v, u16 flags)
goto out_filt;
v->brvlan = masterv;
v->stats = masterv->stats;
+
+ /* Propagate the VLAN flags changes down to the underlying
+ * hardware, which may have to reconfigure the physical port
+ * associated with the bridge (e.g: CPU/management port)
+ */
+ err = __vlan_vid_add(br->dev, br, v->vid, flags);
+ if (err)
+ goto out_filt;
}
/* Add the dev mac and count the vlan only if it's usable */
@@ -287,19 +297,25 @@ static int __vlan_del(struct net_bridge_vlan *v)
struct net_bridge_vlan *masterv = v;
struct net_bridge_vlan_group *vg;
struct net_bridge_port *p = NULL;
+ struct net_device *dev;
+ struct net_bridge *br;
int err = 0;
if (br_vlan_is_master(v)) {
- vg = br_vlan_group(v->br);
+ br = v->br;
+ vg = br_vlan_group(br);
+ dev = v->br->dev;
} else {
p = v->port;
+ br = p->br;
+ dev = p->dev;
vg = nbp_vlan_group(v->port);
masterv = v->brvlan;
}
__vlan_delete_pvid(vg, v->vid);
- if (p) {
- err = __vlan_vid_del(p->dev, p->br, v->vid);
+ if (p || br_vlan_is_master(v)) {
+ err = __vlan_vid_del(dev, br, v->vid);
if (err)
goto out;
}
@@ -568,6 +584,12 @@ int br_vlan_add(struct net_bridge *br, u16 vid, u16 flags)
vg->num_vlans++;
}
__vlan_add_flags(vlan, flags);
+
+ /* Propagate the VLAN flags changes down to the underlying
+ * hardware, which may have to reconfigure the physical port
+ * associated with the bridge (e.g: CPU/management port)
+ */
+ __vlan_vid_add(br->dev, br, vlan->vid, flags);
return 0;
}
--
2.9.3
^ permalink raw reply related
* [RFC net-next 0/3] net: bridge: Allow CPU port configuration
From: Florian Fainelli @ 2016-11-21 19:09 UTC (permalink / raw)
To: netdev
Cc: davem, bridge, stephen, vivien.didelot, andrew, jiri, idosch,
Florian Fainelli
Hi all,
This patch series allows using the bridge master interface to configure
an Ethernet switch's CPU/management port with different VLAN attributes than
those of the bridge downstream ports/members.
Jiri, Ido, Andrew, Vivien, please review the impact on mlxsw and mv88e6xxx, I
tested this with b53 and a mockup DSA driver.
Open questions:
- if we have more than one bridge on top of a physical switch, the driver
should keep track of that and verify that we are not going to change
the CPU port VLAN attributes in a way that results in incompatible settings
being applied
- if the default behavior is to have all VLANs associated with the CPU port
be ingressing/egressing tagged to the CPU, is this really useful?
Florian Fainelli (3):
net: bridge: Allow bridge master device to configure switch CPU port
net: dsa: Propagate VLAN add/del to CPU port(s)
net: dsa: b53: Remove CPU port specific VLAN programming
drivers/net/dsa/b53/b53_common.c | 22 ++++++--------------
net/bridge/br_vlan.c | 28 ++++++++++++++++++++++---
net/dsa/slave.c | 45 +++++++++++++++++++++++++++++-----------
3 files changed, 64 insertions(+), 31 deletions(-)
--
2.9.3
^ permalink raw reply
* Re: [PATCH net-next v3 0/4] geneve: Use LWT more effectively.
From: David Miller @ 2016-11-21 19:06 UTC (permalink / raw)
To: pshelar; +Cc: netdev
In-Reply-To: <1479754981-17600-1-git-send-email-pshelar@ovn.org>
From: Pravin B Shelar <pshelar@ovn.org>
Date: Mon, 21 Nov 2016 11:02:57 -0800
> The following patch series makes use of the geneve LWT code path for
> geneve netdev type of device.
> This allows us to simplify the geneve module without changing any
> functionality.
>
> v2-v3:
> Rebase against latest net-next.
>
> v1-v2:
> Fix warning reported by kbuild test robot.
Series applied, thanks Pravin.
^ permalink raw reply
* Re: [PATCH net-next v2 0/4] geneve: Use LWT more effectively.
From: Pravin Shelar @ 2016-11-21 19:03 UTC (permalink / raw)
To: David Miller; +Cc: Linux Kernel Network Developers
In-Reply-To: <20161121.112853.596026767987561055.davem@davemloft.net>
On Mon, Nov 21, 2016 at 8:28 AM, David Miller <davem@davemloft.net> wrote:
> From: Pravin B Shelar <pshelar@ovn.org>
> Date: Fri, 18 Nov 2016 18:10:07 -0800
>
>> The following patch series makes use of the geneve LWT code path for
>> geneve netdev type of device.
>> This allows us to simplify the geneve module.
>>
>> v1-v2:
>> Fix warning reported by kbuild test robot.
>
> This doesn't apply cleanly to net-next, please respin.
>
Sure. I have posted updated series.
Thanks.
^ permalink raw reply
* [PATCH net-next v3 3/4] geneve: Remove redundant socket checks.
From: Pravin B Shelar @ 2016-11-21 19:03 UTC (permalink / raw)
To: netdev; +Cc: Pravin B Shelar
In-Reply-To: <1479754981-17600-1-git-send-email-pshelar@ovn.org>
Geneve already checks for the device socket in the route
lookup function, so there is no need to check it again in the
xmit function.
Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
---
drivers/net/geneve.c | 10 ++--------
1 file changed, 2 insertions(+), 8 deletions(-)
diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
index 2cd5c41..633bb44 100644
--- a/drivers/net/geneve.c
+++ b/drivers/net/geneve.c
@@ -785,14 +785,11 @@ static int geneve_xmit_skb(struct sk_buff *skb, struct net_device *dev,
struct geneve_sock *gs4 = rcu_dereference(geneve->sock4);
const struct ip_tunnel_key *key = &info->key;
struct rtable *rt;
- int err = -EINVAL;
struct flowi4 fl4;
__u8 tos, ttl;
__be16 sport;
__be16 df;
-
- if (!gs4)
- return err;
+ int err;
rt = geneve_get_v4_rt(skb, dev, &fl4, info);
if (IS_ERR(rt))
@@ -828,13 +825,10 @@ static int geneve6_xmit_skb(struct sk_buff *skb, struct net_device *dev,
struct geneve_sock *gs6 = rcu_dereference(geneve->sock6);
const struct ip_tunnel_key *key = &info->key;
struct dst_entry *dst = NULL;
- int err = -EINVAL;
struct flowi6 fl6;
__u8 prio, ttl;
__be16 sport;
-
- if (!gs6)
- return err;
+ int err;
dst = geneve_get_v6_dst(skb, dev, &fl6, info);
if (IS_ERR(dst))
--
1.8.3.1
^ permalink raw reply related
* [PATCH net-next v3 4/4] geneve: Optimize geneve device lookup.
From: Pravin B Shelar @ 2016-11-21 19:03 UTC (permalink / raw)
To: netdev; +Cc: Pravin B Shelar
In-Reply-To: <1479754981-17600-1-git-send-email-pshelar@ovn.org>
Rather than comparing the 64-bit tunnel ID, compare the tunnel VNI,
which is a 24-bit ID. This also saves the conversion from VNI
to tunnel ID on each received tunnel packet.
Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
---
drivers/net/geneve.c | 17 +++++++++++++----
1 file changed, 13 insertions(+), 4 deletions(-)
diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
index 633bb44..d86d2f9 100644
--- a/drivers/net/geneve.c
+++ b/drivers/net/geneve.c
@@ -103,6 +103,17 @@ static void tunnel_id_to_vni(__be64 tun_id, __u8 *vni)
#endif
}
+static bool eq_tun_id_and_vni(u8 *tun_id, u8 *vni)
+{
+#ifdef __BIG_ENDIAN
+ return (vni[0] == tun_id[2]) &&
+ (vni[1] == tun_id[1]) &&
+ (vni[2] == tun_id[0]);
+#else
+ return !memcmp(vni, &tun_id[5], 3);
+#endif
+}
+
static sa_family_t geneve_get_sk_family(struct geneve_sock *gs)
{
return gs->sock->sk->sk_family;
@@ -111,7 +122,6 @@ static sa_family_t geneve_get_sk_family(struct geneve_sock *gs)
static struct geneve_dev *geneve_lookup(struct geneve_sock *gs,
__be32 addr, u8 vni[])
{
- __be64 id = vni_to_tunnel_id(vni);
struct hlist_head *vni_list_head;
struct geneve_dev *geneve;
__u32 hash;
@@ -120,7 +130,7 @@ static struct geneve_dev *geneve_lookup(struct geneve_sock *gs,
hash = geneve_net_vni_hash(vni);
vni_list_head = &gs->vni_list[hash];
hlist_for_each_entry_rcu(geneve, vni_list_head, hlist) {
- if (!memcmp(&id, &geneve->info.key.tun_id, sizeof(id)) &&
+ if (eq_tun_id_and_vni((u8 *)&geneve->info.key.tun_id, vni) &&
addr == geneve->info.key.u.ipv4.dst)
return geneve;
}
@@ -131,7 +141,6 @@ static struct geneve_dev *geneve_lookup(struct geneve_sock *gs,
static struct geneve_dev *geneve6_lookup(struct geneve_sock *gs,
struct in6_addr addr6, u8 vni[])
{
- __be64 id = vni_to_tunnel_id(vni);
struct hlist_head *vni_list_head;
struct geneve_dev *geneve;
__u32 hash;
@@ -140,7 +149,7 @@ static struct geneve_dev *geneve6_lookup(struct geneve_sock *gs,
hash = geneve_net_vni_hash(vni);
vni_list_head = &gs->vni_list[hash];
hlist_for_each_entry_rcu(geneve, vni_list_head, hlist) {
- if (!memcmp(&id, &geneve->info.key.tun_id, sizeof(id)) &&
+ if (eq_tun_id_and_vni((u8 *)&geneve->info.key.tun_id, vni) &&
ipv6_addr_equal(&addr6, &geneve->info.key.u.ipv6.dst))
return geneve;
}
--
1.8.3.1
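The comparison this patch introduces can be illustrated without any
endianness conditionals by treating the tunnel ID as eight raw bytes.
The sketch below assumes the ID is stored in network byte order with the
VNI in its last three bytes, which matches the little-endian branch of
eq_tun_id_and_vni() above. The sketch_*() helpers and SKETCH_* constants
are hypothetical names, not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_VNI_LEN    3 /* a VNI is 24 bits */
#define SKETCH_TUN_ID_LEN 8 /* a tunnel ID is 64 bits */

/* Offset of the VNI inside the 8-byte network-order tunnel ID. */
#define SKETCH_VNI_OFFSET (SKETCH_TUN_ID_LEN - SKETCH_VNI_LEN)

/* Build a tunnel ID byte buffer whose last three bytes carry the VNI. */
static void sketch_vni_to_tun_id(const uint8_t *vni, uint8_t *tun_id)
{
	assert(vni != NULL && tun_id != NULL);

	memset(tun_id, 0, SKETCH_TUN_ID_LEN);
	memcpy(&tun_id[SKETCH_VNI_OFFSET], vni, SKETCH_VNI_LEN);
}

/* Compare only the three VNI bytes instead of the full 64-bit ID. */
static bool sketch_tun_id_matches_vni(const uint8_t *tun_id,
				      const uint8_t *vni)
{
	assert(vni != NULL && tun_id != NULL);

	return memcmp(vni, &tun_id[SKETCH_VNI_OFFSET], SKETCH_VNI_LEN) == 0;
}
```

The receive path then only needs the three VNI bytes taken from the
Geneve header and never has to build a 64-bit ID for the comparison.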
^ permalink raw reply related
* [PATCH net-next v3 2/4] geneve: Merge ipv4 and ipv6 geneve_build_skb()
From: Pravin B Shelar @ 2016-11-21 19:02 UTC (permalink / raw)
To: netdev; +Cc: Pravin B Shelar
In-Reply-To: <1479754981-17600-1-git-send-email-pshelar@ovn.org>
There is minimal difference in building the Geneve header
between ipv4 and ipv6 geneve tunnels. This patch
refactors the code to unify it.
Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
---
drivers/net/geneve.c | 100 ++++++++++++++-------------------------------------
1 file changed, 26 insertions(+), 74 deletions(-)
diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
index 658531d..2cd5c41 100644
--- a/drivers/net/geneve.c
+++ b/drivers/net/geneve.c
@@ -630,67 +630,34 @@ static int geneve_stop(struct net_device *dev)
}
static void geneve_build_header(struct genevehdr *geneveh,
- __be16 tun_flags, u8 vni[3],
- u8 options_len, u8 *options)
+ const struct ip_tunnel_info *info)
{
geneveh->ver = GENEVE_VER;
- geneveh->opt_len = options_len / 4;
- geneveh->oam = !!(tun_flags & TUNNEL_OAM);
- geneveh->critical = !!(tun_flags & TUNNEL_CRIT_OPT);
+ geneveh->opt_len = info->options_len / 4;
+ geneveh->oam = !!(info->key.tun_flags & TUNNEL_OAM);
+ geneveh->critical = !!(info->key.tun_flags & TUNNEL_CRIT_OPT);
geneveh->rsvd1 = 0;
- memcpy(geneveh->vni, vni, 3);
+ tunnel_id_to_vni(info->key.tun_id, geneveh->vni);
geneveh->proto_type = htons(ETH_P_TEB);
geneveh->rsvd2 = 0;
- memcpy(geneveh->options, options, options_len);
+ ip_tunnel_info_opts_get(geneveh->options, info);
}
-static int geneve_build_skb(struct rtable *rt, struct sk_buff *skb,
- __be16 tun_flags, u8 vni[3], u8 opt_len, u8 *opt,
- bool xnet)
-{
- bool udp_sum = !!(tun_flags & TUNNEL_CSUM);
- struct genevehdr *gnvh;
- int min_headroom;
- int err;
-
- skb_scrub_packet(skb, xnet);
-
- min_headroom = LL_RESERVED_SPACE(rt->dst.dev) + rt->dst.header_len
- + GENEVE_BASE_HLEN + opt_len + sizeof(struct iphdr);
- err = skb_cow_head(skb, min_headroom);
- if (unlikely(err))
- goto free_rt;
-
- err = udp_tunnel_handle_offloads(skb, udp_sum);
- if (err)
- goto free_rt;
-
- gnvh = (struct genevehdr *)__skb_push(skb, sizeof(*gnvh) + opt_len);
- geneve_build_header(gnvh, tun_flags, vni, opt_len, opt);
-
- skb_set_inner_protocol(skb, htons(ETH_P_TEB));
- return 0;
-
-free_rt:
- ip_rt_put(rt);
- return err;
-}
-
-#if IS_ENABLED(CONFIG_IPV6)
-static int geneve6_build_skb(struct dst_entry *dst, struct sk_buff *skb,
- __be16 tun_flags, u8 vni[3], u8 opt_len, u8 *opt,
- bool xnet)
+static int geneve_build_skb(struct dst_entry *dst, struct sk_buff *skb,
+ const struct ip_tunnel_info *info,
+ bool xnet, int ip_hdr_len)
{
- bool udp_sum = !!(tun_flags & TUNNEL_CSUM);
+ bool udp_sum = !!(info->key.tun_flags & TUNNEL_CSUM);
struct genevehdr *gnvh;
int min_headroom;
int err;
+ skb_reset_mac_header(skb);
skb_scrub_packet(skb, xnet);
- min_headroom = LL_RESERVED_SPACE(dst->dev) + dst->header_len
- + GENEVE_BASE_HLEN + opt_len + sizeof(struct ipv6hdr);
+ min_headroom = LL_RESERVED_SPACE(dst->dev) + dst->header_len +
+ GENEVE_BASE_HLEN + info->options_len + ip_hdr_len;
err = skb_cow_head(skb, min_headroom);
if (unlikely(err))
goto free_dst;
@@ -699,9 +666,9 @@ static int geneve6_build_skb(struct dst_entry *dst, struct sk_buff *skb,
if (err)
goto free_dst;
- gnvh = (struct genevehdr *)__skb_push(skb, sizeof(*gnvh) + opt_len);
- geneve_build_header(gnvh, tun_flags, vni, opt_len, opt);
-
+ gnvh = (struct genevehdr *)__skb_push(skb, sizeof(*gnvh) +
+ info->options_len);
+ geneve_build_header(gnvh, info);
skb_set_inner_protocol(skb, htons(ETH_P_TEB));
return 0;
@@ -709,12 +676,11 @@ static int geneve6_build_skb(struct dst_entry *dst, struct sk_buff *skb,
dst_release(dst);
return err;
}
-#endif
static struct rtable *geneve_get_v4_rt(struct sk_buff *skb,
struct net_device *dev,
struct flowi4 *fl4,
- struct ip_tunnel_info *info)
+ const struct ip_tunnel_info *info)
{
bool use_cache = ip_tunnel_dst_cache_usable(skb, info);
struct geneve_dev *geneve = netdev_priv(dev);
@@ -738,7 +704,7 @@ static struct rtable *geneve_get_v4_rt(struct sk_buff *skb,
}
fl4->flowi4_tos = RT_TOS(tos);
- dst_cache = &info->dst_cache;
+ dst_cache = (struct dst_cache *)&info->dst_cache;
if (use_cache) {
rt = dst_cache_get_ip4(dst_cache, &fl4->saddr);
if (rt)
@@ -763,7 +729,7 @@ static struct rtable *geneve_get_v4_rt(struct sk_buff *skb,
static struct dst_entry *geneve_get_v6_dst(struct sk_buff *skb,
struct net_device *dev,
struct flowi6 *fl6,
- struct ip_tunnel_info *info)
+ const struct ip_tunnel_info *info)
{
bool use_cache = ip_tunnel_dst_cache_usable(skb, info);
struct geneve_dev *geneve = netdev_priv(dev);
@@ -789,7 +755,7 @@ static struct dst_entry *geneve_get_v6_dst(struct sk_buff *skb,
fl6->flowlabel = ip6_make_flowinfo(RT_TOS(prio),
info->key.label);
- dst_cache = &info->dst_cache;
+ dst_cache = (struct dst_cache *)&info->dst_cache;
if (use_cache) {
dst = dst_cache_get_ip6(dst_cache, &fl6->saddr);
if (dst)
@@ -812,7 +778,8 @@ static struct dst_entry *geneve_get_v6_dst(struct sk_buff *skb,
#endif
static int geneve_xmit_skb(struct sk_buff *skb, struct net_device *dev,
- struct geneve_dev *geneve, struct ip_tunnel_info *info)
+ struct geneve_dev *geneve,
+ const struct ip_tunnel_info *info)
{
bool xnet = !net_eq(geneve->net, dev_net(geneve->dev));
struct geneve_sock *gs4 = rcu_dereference(geneve->sock4);
@@ -820,11 +787,9 @@ static int geneve_xmit_skb(struct sk_buff *skb, struct net_device *dev,
struct rtable *rt;
int err = -EINVAL;
struct flowi4 fl4;
- u8 *opts = NULL;
__u8 tos, ttl;
__be16 sport;
__be16 df;
- u8 vni[3];
if (!gs4)
return err;
@@ -843,13 +808,7 @@ static int geneve_xmit_skb(struct sk_buff *skb, struct net_device *dev,
}
df = key->tun_flags & TUNNEL_DONT_FRAGMENT ? htons(IP_DF) : 0;
- tunnel_id_to_vni(key->tun_id, vni);
- if (info->options_len)
- opts = ip_tunnel_info_opts(info);
-
- skb_reset_mac_header(skb);
- err = geneve_build_skb(rt, skb, key->tun_flags, vni,
- info->options_len, opts, xnet);
+ err = geneve_build_skb(&rt->dst, skb, info, xnet, sizeof(struct iphdr));
if (unlikely(err))
return err;
@@ -862,7 +821,8 @@ static int geneve_xmit_skb(struct sk_buff *skb, struct net_device *dev,
#if IS_ENABLED(CONFIG_IPV6)
static int geneve6_xmit_skb(struct sk_buff *skb, struct net_device *dev,
- struct geneve_dev *geneve, struct ip_tunnel_info *info)
+ struct geneve_dev *geneve,
+ const struct ip_tunnel_info *info)
{
bool xnet = !net_eq(geneve->net, dev_net(geneve->dev));
struct geneve_sock *gs6 = rcu_dereference(geneve->sock6);
@@ -870,10 +830,8 @@ static int geneve6_xmit_skb(struct sk_buff *skb, struct net_device *dev,
struct dst_entry *dst = NULL;
int err = -EINVAL;
struct flowi6 fl6;
- u8 *opts = NULL;
__u8 prio, ttl;
__be16 sport;
- u8 vni[3];
if (!gs6)
return err;
@@ -891,13 +849,7 @@ static int geneve6_xmit_skb(struct sk_buff *skb, struct net_device *dev,
ip_hdr(skb), skb);
ttl = key->ttl ? : ip6_dst_hoplimit(dst);
}
- tunnel_id_to_vni(key->tun_id, vni);
- if (info->options_len)
- opts = ip_tunnel_info_opts(info);
-
- skb_reset_mac_header(skb);
- err = geneve6_build_skb(dst, skb, key->tun_flags, vni,
- info->options_len, opts, xnet);
+ err = geneve_build_skb(dst, skb, info, xnet, sizeof(struct ipv6hdr));
if (unlikely(err))
return err;
--
1.8.3.1
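The unified geneve_build_skb() takes the outer IP header length as a
parameter because it is the only term of the headroom calculation that
differs between address families. The arithmetic can be sketched as
below. The constants assume an 8-byte UDP header, an 8-byte fixed Geneve
header, a 20-byte IPv4 header without options and a 40-byte IPv6 header.
sketch_min_headroom() and sketch_opt_len_field() are hypothetical
helpers, not kernel functions.

```c
#include <assert.h>
#include <stddef.h>

enum {
	SKETCH_UDP_HLEN    = 8,  /* UDP header */
	SKETCH_GENEVE_HLEN = 8,  /* fixed Geneve header, options excluded */
	SKETCH_IPV4_HLEN   = 20, /* IPv4 header without options */
	SKETCH_IPV6_HLEN   = 40, /* fixed IPv6 header */
};

/* Headroom required in front of the inner frame before encapsulation. */
static size_t sketch_min_headroom(size_t ll_reserved, size_t dst_header_len,
				  size_t options_len, size_t ip_hdr_len)
{
	/* Geneve options are carried in multiples of four bytes. */
	assert(options_len % 4 == 0);

	return ll_reserved + dst_header_len +
	       SKETCH_UDP_HLEN + SKETCH_GENEVE_HLEN +
	       options_len + ip_hdr_len;
}

/* The header's opt_len field counts options in 4-byte units. */
static unsigned int sketch_opt_len_field(size_t options_len)
{
	assert(options_len % 4 == 0);

	return (unsigned int)(options_len / 4);
}
```

For example, with 16 bytes of link-layer reserve and 8 bytes of options,
an IPv4 outer header needs 60 bytes of headroom and an IPv6 outer header
needs 80.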
^ permalink raw reply related
* [PATCH net-next v3 1/4] geneve: Unify LWT and netdev handling.
From: Pravin B Shelar @ 2016-11-21 19:02 UTC (permalink / raw)
To: netdev; +Cc: Pravin B Shelar
In-Reply-To: <1479754981-17600-1-git-send-email-pshelar@ovn.org>
The current geneve implementation has two separate cases to handle:
1. netdev xmit
2. LWT xmit
In the netdev case, the geneve configuration is stored in various
struct geneve_dev members, for example geneve_addr, ttl, tos,
label, flags, dst_cache, etc. For LWT, the configuration is passed
to the device in ip_tunnel_info.
This patch uses the ip_tunnel_info struct to store almost all
of the configuration of a geneve netdevice. This allows us to unify
most of the geneve driver code around the ip_tunnel_info struct.
This dramatically simplifies the geneve code, since it no longer
needs to handle two different configuration cases. Duplicate code
is removed, and a single code path can handle either type
of geneve device.
Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
---
drivers/net/geneve.c | 612 ++++++++++++++++++++++-----------------------------
1 file changed, 263 insertions(+), 349 deletions(-)
diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
index 90dc6b1..658531d 100644
--- a/drivers/net/geneve.c
+++ b/drivers/net/geneve.c
@@ -45,41 +45,22 @@ struct geneve_net {
static unsigned int geneve_net_id;
-union geneve_addr {
- struct sockaddr_in sin;
- struct sockaddr_in6 sin6;
- struct sockaddr sa;
-};
-
-static union geneve_addr geneve_remote_unspec = { .sa.sa_family = AF_UNSPEC, };
-
/* Pseudo network device */
struct geneve_dev {
struct hlist_node hlist; /* vni hash table */
struct net *net; /* netns for packet i/o */
struct net_device *dev; /* netdev for geneve tunnel */
+ struct ip_tunnel_info info;
struct geneve_sock __rcu *sock4; /* IPv4 socket used for geneve tunnel */
#if IS_ENABLED(CONFIG_IPV6)
struct geneve_sock __rcu *sock6; /* IPv6 socket used for geneve tunnel */
#endif
- u8 vni[3]; /* virtual network ID for tunnel */
- u8 ttl; /* TTL override */
- u8 tos; /* TOS override */
- union geneve_addr remote; /* IP address for link partner */
struct list_head next; /* geneve's per namespace list */
- __be32 label; /* IPv6 flowlabel override */
- __be16 dst_port;
- bool collect_md;
struct gro_cells gro_cells;
- u32 flags;
- struct dst_cache dst_cache;
+ bool collect_md;
+ bool use_udp6_rx_checksums;
};
-/* Geneve device flags */
-#define GENEVE_F_UDP_ZERO_CSUM_TX BIT(0)
-#define GENEVE_F_UDP_ZERO_CSUM6_TX BIT(1)
-#define GENEVE_F_UDP_ZERO_CSUM6_RX BIT(2)
-
struct geneve_sock {
bool collect_md;
struct list_head list;
@@ -87,7 +68,6 @@ struct geneve_sock {
struct rcu_head rcu;
int refcnt;
struct hlist_head vni_list[VNI_HASH_SIZE];
- u32 flags;
};
static inline __u32 geneve_net_vni_hash(u8 vni[3])
@@ -109,6 +89,20 @@ static __be64 vni_to_tunnel_id(const __u8 *vni)
#endif
}
+/* Convert 64 bit tunnel ID to 24 bit VNI. */
+static void tunnel_id_to_vni(__be64 tun_id, __u8 *vni)
+{
+#ifdef __BIG_ENDIAN
+ vni[0] = (__force __u8)(tun_id >> 16);
+ vni[1] = (__force __u8)(tun_id >> 8);
+ vni[2] = (__force __u8)tun_id;
+#else
+ vni[0] = (__force __u8)((__force u64)tun_id >> 40);
+ vni[1] = (__force __u8)((__force u64)tun_id >> 48);
+ vni[2] = (__force __u8)((__force u64)tun_id >> 56);
+#endif
+}
+
static sa_family_t geneve_get_sk_family(struct geneve_sock *gs)
{
return gs->sock->sk->sk_family;
@@ -117,6 +111,7 @@ static sa_family_t geneve_get_sk_family(struct geneve_sock *gs)
static struct geneve_dev *geneve_lookup(struct geneve_sock *gs,
__be32 addr, u8 vni[])
{
+ __be64 id = vni_to_tunnel_id(vni);
struct hlist_head *vni_list_head;
struct geneve_dev *geneve;
__u32 hash;
@@ -125,8 +120,8 @@ static struct geneve_dev *geneve_lookup(struct geneve_sock *gs,
hash = geneve_net_vni_hash(vni);
vni_list_head = &gs->vni_list[hash];
hlist_for_each_entry_rcu(geneve, vni_list_head, hlist) {
- if (!memcmp(vni, geneve->vni, sizeof(geneve->vni)) &&
- addr == geneve->remote.sin.sin_addr.s_addr)
+ if (!memcmp(&id, &geneve->info.key.tun_id, sizeof(id)) &&
+ addr == geneve->info.key.u.ipv4.dst)
return geneve;
}
return NULL;
@@ -136,6 +131,7 @@ static struct geneve_dev *geneve_lookup(struct geneve_sock *gs,
static struct geneve_dev *geneve6_lookup(struct geneve_sock *gs,
struct in6_addr addr6, u8 vni[])
{
+ __be64 id = vni_to_tunnel_id(vni);
struct hlist_head *vni_list_head;
struct geneve_dev *geneve;
__u32 hash;
@@ -144,8 +140,8 @@ static struct geneve_dev *geneve6_lookup(struct geneve_sock *gs,
hash = geneve_net_vni_hash(vni);
vni_list_head = &gs->vni_list[hash];
hlist_for_each_entry_rcu(geneve, vni_list_head, hlist) {
- if (!memcmp(vni, geneve->vni, sizeof(geneve->vni)) &&
- ipv6_addr_equal(&addr6, &geneve->remote.sin6.sin6_addr))
+ if (!memcmp(&id, &geneve->info.key.tun_id, sizeof(id)) &&
+ ipv6_addr_equal(&addr6, &geneve->info.key.u.ipv6.dst))
return geneve;
}
return NULL;
@@ -160,15 +156,12 @@ static inline struct genevehdr *geneve_hdr(const struct sk_buff *skb)
static struct geneve_dev *geneve_lookup_skb(struct geneve_sock *gs,
struct sk_buff *skb)
{
- u8 *vni;
- __be32 addr;
static u8 zero_vni[3];
-#if IS_ENABLED(CONFIG_IPV6)
- static struct in6_addr zero_addr6;
-#endif
+ u8 *vni;
if (geneve_get_sk_family(gs) == AF_INET) {
struct iphdr *iph;
+ __be32 addr;
iph = ip_hdr(skb); /* outer IP header... */
@@ -183,6 +176,7 @@ static struct geneve_dev *geneve_lookup_skb(struct geneve_sock *gs,
return geneve_lookup(gs, addr, vni);
#if IS_ENABLED(CONFIG_IPV6)
} else if (geneve_get_sk_family(gs) == AF_INET6) {
+ static struct in6_addr zero_addr6;
struct ipv6hdr *ip6h;
struct in6_addr addr6;
@@ -305,13 +299,12 @@ static int geneve_init(struct net_device *dev)
return err;
}
- err = dst_cache_init(&geneve->dst_cache, GFP_KERNEL);
+ err = dst_cache_init(&geneve->info.dst_cache, GFP_KERNEL);
if (err) {
free_percpu(dev->tstats);
gro_cells_destroy(&geneve->gro_cells);
return err;
}
-
return 0;
}
@@ -319,7 +312,7 @@ static void geneve_uninit(struct net_device *dev)
{
struct geneve_dev *geneve = netdev_priv(dev);
- dst_cache_destroy(&geneve->dst_cache);
+ dst_cache_destroy(&geneve->info.dst_cache);
gro_cells_destroy(&geneve->gro_cells);
free_percpu(dev->tstats);
}
@@ -368,7 +361,7 @@ static int geneve_udp_encap_recv(struct sock *sk, struct sk_buff *skb)
}
static struct socket *geneve_create_sock(struct net *net, bool ipv6,
- __be16 port, u32 flags)
+ __be16 port, bool ipv6_rx_csum)
{
struct socket *sock;
struct udp_port_cfg udp_conf;
@@ -379,8 +372,7 @@ static struct socket *geneve_create_sock(struct net *net, bool ipv6,
if (ipv6) {
udp_conf.family = AF_INET6;
udp_conf.ipv6_v6only = 1;
- udp_conf.use_udp6_rx_checksums =
- !(flags & GENEVE_F_UDP_ZERO_CSUM6_RX);
+ udp_conf.use_udp6_rx_checksums = ipv6_rx_csum;
} else {
udp_conf.family = AF_INET;
udp_conf.local_ip.s_addr = htonl(INADDR_ANY);
@@ -491,7 +483,7 @@ static int geneve_gro_complete(struct sock *sk, struct sk_buff *skb,
/* Create new listen socket if needed */
static struct geneve_sock *geneve_socket_create(struct net *net, __be16 port,
- bool ipv6, u32 flags)
+ bool ipv6, bool ipv6_rx_csum)
{
struct geneve_net *gn = net_generic(net, geneve_net_id);
struct geneve_sock *gs;
@@ -503,7 +495,7 @@ static struct geneve_sock *geneve_socket_create(struct net *net, __be16 port,
if (!gs)
return ERR_PTR(-ENOMEM);
- sock = geneve_create_sock(net, ipv6, port, flags);
+ sock = geneve_create_sock(net, ipv6, port, ipv6_rx_csum);
if (IS_ERR(sock)) {
kfree(gs);
return ERR_CAST(sock);
@@ -579,21 +571,22 @@ static int geneve_sock_add(struct geneve_dev *geneve, bool ipv6)
struct net *net = geneve->net;
struct geneve_net *gn = net_generic(net, geneve_net_id);
struct geneve_sock *gs;
+ __u8 vni[3];
__u32 hash;
- gs = geneve_find_sock(gn, ipv6 ? AF_INET6 : AF_INET, geneve->dst_port);
+ gs = geneve_find_sock(gn, ipv6 ? AF_INET6 : AF_INET, geneve->info.key.tp_dst);
if (gs) {
gs->refcnt++;
goto out;
}
- gs = geneve_socket_create(net, geneve->dst_port, ipv6, geneve->flags);
+ gs = geneve_socket_create(net, geneve->info.key.tp_dst, ipv6,
+ geneve->use_udp6_rx_checksums);
if (IS_ERR(gs))
return PTR_ERR(gs);
out:
gs->collect_md = geneve->collect_md;
- gs->flags = geneve->flags;
#if IS_ENABLED(CONFIG_IPV6)
if (ipv6)
rcu_assign_pointer(geneve->sock6, gs);
@@ -601,7 +594,8 @@ static int geneve_sock_add(struct geneve_dev *geneve, bool ipv6)
#endif
rcu_assign_pointer(geneve->sock4, gs);
- hash = geneve_net_vni_hash(geneve->vni);
+ tunnel_id_to_vni(geneve->info.key.tun_id, vni);
+ hash = geneve_net_vni_hash(vni);
hlist_add_head_rcu(&geneve->hlist, &gs->vni_list[hash]);
return 0;
}
@@ -609,7 +603,7 @@ static int geneve_sock_add(struct geneve_dev *geneve, bool ipv6)
static int geneve_open(struct net_device *dev)
{
struct geneve_dev *geneve = netdev_priv(dev);
- bool ipv6 = geneve->remote.sa.sa_family == AF_INET6;
+ bool ipv6 = !!(geneve->info.mode & IP_TUNNEL_INFO_IPV6);
bool metadata = geneve->collect_md;
int ret = 0;
@@ -653,12 +647,12 @@ static void geneve_build_header(struct genevehdr *geneveh,
static int geneve_build_skb(struct rtable *rt, struct sk_buff *skb,
__be16 tun_flags, u8 vni[3], u8 opt_len, u8 *opt,
- u32 flags, bool xnet)
+ bool xnet)
{
+ bool udp_sum = !!(tun_flags & TUNNEL_CSUM);
struct genevehdr *gnvh;
int min_headroom;
int err;
- bool udp_sum = !(flags & GENEVE_F_UDP_ZERO_CSUM_TX);
skb_scrub_packet(skb, xnet);
@@ -686,12 +680,12 @@ static int geneve_build_skb(struct rtable *rt, struct sk_buff *skb,
#if IS_ENABLED(CONFIG_IPV6)
static int geneve6_build_skb(struct dst_entry *dst, struct sk_buff *skb,
__be16 tun_flags, u8 vni[3], u8 opt_len, u8 *opt,
- u32 flags, bool xnet)
+ bool xnet)
{
+ bool udp_sum = !!(tun_flags & TUNNEL_CSUM);
struct genevehdr *gnvh;
int min_headroom;
int err;
- bool udp_sum = !(flags & GENEVE_F_UDP_ZERO_CSUM6_TX);
skb_scrub_packet(skb, xnet);
@@ -734,32 +728,22 @@ static struct rtable *geneve_get_v4_rt(struct sk_buff *skb,
memset(fl4, 0, sizeof(*fl4));
fl4->flowi4_mark = skb->mark;
fl4->flowi4_proto = IPPROTO_UDP;
+ fl4->daddr = info->key.u.ipv4.dst;
+ fl4->saddr = info->key.u.ipv4.src;
- if (info) {
- fl4->daddr = info->key.u.ipv4.dst;
- fl4->saddr = info->key.u.ipv4.src;
- fl4->flowi4_tos = RT_TOS(info->key.tos);
- dst_cache = &info->dst_cache;
- } else {
- tos = geneve->tos;
- if (tos == 1) {
- const struct iphdr *iip = ip_hdr(skb);
-
- tos = ip_tunnel_get_dsfield(iip, skb);
- use_cache = false;
- }
-
- fl4->flowi4_tos = RT_TOS(tos);
- fl4->daddr = geneve->remote.sin.sin_addr.s_addr;
- dst_cache = &geneve->dst_cache;
+ tos = info->key.tos;
+ if ((tos == 1) && !geneve->collect_md) {
+ tos = ip_tunnel_get_dsfield(ip_hdr(skb), skb);
+ use_cache = false;
}
+ fl4->flowi4_tos = RT_TOS(tos);
+ dst_cache = &info->dst_cache;
if (use_cache) {
rt = dst_cache_get_ip4(dst_cache, &fl4->saddr);
if (rt)
return rt;
}
-
rt = ip_route_output_key(geneve->net, fl4);
if (IS_ERR(rt)) {
netdev_dbg(dev, "no route to %pI4\n", &fl4->daddr);
@@ -795,34 +779,22 @@ static struct dst_entry *geneve_get_v6_dst(struct sk_buff *skb,
memset(fl6, 0, sizeof(*fl6));
fl6->flowi6_mark = skb->mark;
fl6->flowi6_proto = IPPROTO_UDP;
-
- if (info) {
- fl6->daddr = info->key.u.ipv6.dst;
- fl6->saddr = info->key.u.ipv6.src;
- fl6->flowlabel = ip6_make_flowinfo(RT_TOS(info->key.tos),
- info->key.label);
- dst_cache = &info->dst_cache;
- } else {
- prio = geneve->tos;
- if (prio == 1) {
- const struct iphdr *iip = ip_hdr(skb);
-
- prio = ip_tunnel_get_dsfield(iip, skb);
- use_cache = false;
- }
-
- fl6->flowlabel = ip6_make_flowinfo(RT_TOS(prio),
- geneve->label);
- fl6->daddr = geneve->remote.sin6.sin6_addr;
- dst_cache = &geneve->dst_cache;
+ fl6->daddr = info->key.u.ipv6.dst;
+ fl6->saddr = info->key.u.ipv6.src;
+ prio = info->key.tos;
+ if ((prio == 1) && !geneve->collect_md) {
+ prio = ip_tunnel_get_dsfield(ip_hdr(skb), skb);
+ use_cache = false;
}
+ fl6->flowlabel = ip6_make_flowinfo(RT_TOS(prio),
+ info->key.label);
+ dst_cache = &info->dst_cache;
if (use_cache) {
dst = dst_cache_get_ip6(dst_cache, &fl6->saddr);
if (dst)
return dst;
}
-
if (ipv6_stub->ipv6_dst_lookup(geneve->net, gs6->sock->sk, &dst, fl6)) {
netdev_dbg(dev, "no route to %pI6\n", &fl6->daddr);
return ERR_PTR(-ENETUNREACH);
@@ -839,195 +811,130 @@ static struct dst_entry *geneve_get_v6_dst(struct sk_buff *skb,
}
#endif
-/* Convert 64 bit tunnel ID to 24 bit VNI. */
-static void tunnel_id_to_vni(__be64 tun_id, __u8 *vni)
-{
-#ifdef __BIG_ENDIAN
- vni[0] = (__force __u8)(tun_id >> 16);
- vni[1] = (__force __u8)(tun_id >> 8);
- vni[2] = (__force __u8)tun_id;
-#else
- vni[0] = (__force __u8)((__force u64)tun_id >> 40);
- vni[1] = (__force __u8)((__force u64)tun_id >> 48);
- vni[2] = (__force __u8)((__force u64)tun_id >> 56);
-#endif
-}
-
-static netdev_tx_t geneve_xmit_skb(struct sk_buff *skb, struct net_device *dev,
- struct ip_tunnel_info *info)
+static int geneve_xmit_skb(struct sk_buff *skb, struct net_device *dev,
+ struct geneve_dev *geneve, struct ip_tunnel_info *info)
{
- struct geneve_dev *geneve = netdev_priv(dev);
- struct geneve_sock *gs4;
- struct rtable *rt = NULL;
- const struct iphdr *iip; /* interior IP header */
+ bool xnet = !net_eq(geneve->net, dev_net(geneve->dev));
+ struct geneve_sock *gs4 = rcu_dereference(geneve->sock4);
+ const struct ip_tunnel_key *key = &info->key;
+ struct rtable *rt;
int err = -EINVAL;
struct flowi4 fl4;
+ u8 *opts = NULL;
__u8 tos, ttl;
__be16 sport;
__be16 df;
- bool xnet = !net_eq(geneve->net, dev_net(geneve->dev));
- u32 flags = geneve->flags;
+ u8 vni[3];
- gs4 = rcu_dereference(geneve->sock4);
if (!gs4)
- goto tx_error;
-
- if (geneve->collect_md) {
- if (unlikely(!info || !(info->mode & IP_TUNNEL_INFO_TX))) {
- netdev_dbg(dev, "no tunnel metadata\n");
- goto tx_error;
- }
- if (info && ip_tunnel_info_af(info) != AF_INET)
- goto tx_error;
- }
+ return err;
rt = geneve_get_v4_rt(skb, dev, &fl4, info);
- if (IS_ERR(rt)) {
- err = PTR_ERR(rt);
- goto tx_error;
- }
+ if (IS_ERR(rt))
+ return PTR_ERR(rt);
sport = udp_flow_src_port(geneve->net, skb, 1, USHRT_MAX, true);
- skb_reset_mac_header(skb);
-
- iip = ip_hdr(skb);
-
- if (info) {
- const struct ip_tunnel_key *key = &info->key;
- u8 *opts = NULL;
- u8 vni[3];
-
- tunnel_id_to_vni(key->tun_id, vni);
- if (info->options_len)
- opts = ip_tunnel_info_opts(info);
-
- if (key->tun_flags & TUNNEL_CSUM)
- flags &= ~GENEVE_F_UDP_ZERO_CSUM_TX;
- else
- flags |= GENEVE_F_UDP_ZERO_CSUM_TX;
-
- err = geneve_build_skb(rt, skb, key->tun_flags, vni,
- info->options_len, opts, flags, xnet);
- if (unlikely(err))
- goto tx_error;
-
- tos = ip_tunnel_ecn_encap(key->tos, iip, skb);
+ if (geneve->collect_md) {
+ tos = ip_tunnel_ecn_encap(key->tos, ip_hdr(skb), skb);
ttl = key->ttl;
- df = key->tun_flags & TUNNEL_DONT_FRAGMENT ? htons(IP_DF) : 0;
} else {
- err = geneve_build_skb(rt, skb, 0, geneve->vni,
- 0, NULL, flags, xnet);
- if (unlikely(err))
- goto tx_error;
-
- tos = ip_tunnel_ecn_encap(fl4.flowi4_tos, iip, skb);
- ttl = geneve->ttl;
- if (!ttl && IN_MULTICAST(ntohl(fl4.daddr)))
- ttl = 1;
- ttl = ttl ? : ip4_dst_hoplimit(&rt->dst);
- df = 0;
+ tos = ip_tunnel_ecn_encap(fl4.flowi4_tos, ip_hdr(skb), skb);
+ ttl = key->ttl ? : ip4_dst_hoplimit(&rt->dst);
}
- udp_tunnel_xmit_skb(rt, gs4->sock->sk, skb, fl4.saddr, fl4.daddr,
- tos, ttl, df, sport, geneve->dst_port,
- !net_eq(geneve->net, dev_net(geneve->dev)),
- !!(flags & GENEVE_F_UDP_ZERO_CSUM_TX));
+ df = key->tun_flags & TUNNEL_DONT_FRAGMENT ? htons(IP_DF) : 0;
- return NETDEV_TX_OK;
-
-tx_error:
- dev_kfree_skb(skb);
+ tunnel_id_to_vni(key->tun_id, vni);
+ if (info->options_len)
+ opts = ip_tunnel_info_opts(info);
- if (err == -ELOOP)
- dev->stats.collisions++;
- else if (err == -ENETUNREACH)
- dev->stats.tx_carrier_errors++;
+ skb_reset_mac_header(skb);
+ err = geneve_build_skb(rt, skb, key->tun_flags, vni,
+ info->options_len, opts, xnet);
+ if (unlikely(err))
+ return err;
- dev->stats.tx_errors++;
- return NETDEV_TX_OK;
+ udp_tunnel_xmit_skb(rt, gs4->sock->sk, skb, fl4.saddr, fl4.daddr,
+ tos, ttl, df, sport, geneve->info.key.tp_dst,
+ !net_eq(geneve->net, dev_net(geneve->dev)),
+ !(info->key.tun_flags & TUNNEL_CSUM));
+ return 0;
}
#if IS_ENABLED(CONFIG_IPV6)
-static netdev_tx_t geneve6_xmit_skb(struct sk_buff *skb, struct net_device *dev,
- struct ip_tunnel_info *info)
+static int geneve6_xmit_skb(struct sk_buff *skb, struct net_device *dev,
+ struct geneve_dev *geneve, struct ip_tunnel_info *info)
{
- struct geneve_dev *geneve = netdev_priv(dev);
+ bool xnet = !net_eq(geneve->net, dev_net(geneve->dev));
+ struct geneve_sock *gs6 = rcu_dereference(geneve->sock6);
+ const struct ip_tunnel_key *key = &info->key;
struct dst_entry *dst = NULL;
- const struct iphdr *iip; /* interior IP header */
- struct geneve_sock *gs6;
int err = -EINVAL;
struct flowi6 fl6;
+ u8 *opts = NULL;
__u8 prio, ttl;
__be16 sport;
- __be32 label;
- bool xnet = !net_eq(geneve->net, dev_net(geneve->dev));
- u32 flags = geneve->flags;
+ u8 vni[3];
- gs6 = rcu_dereference(geneve->sock6);
if (!gs6)
- goto tx_error;
-
- if (geneve->collect_md) {
- if (unlikely(!info || !(info->mode & IP_TUNNEL_INFO_TX))) {
- netdev_dbg(dev, "no tunnel metadata\n");
- goto tx_error;
- }
- }
+ return err;
dst = geneve_get_v6_dst(skb, dev, &fl6, info);
- if (IS_ERR(dst)) {
- err = PTR_ERR(dst);
- goto tx_error;
- }
+ if (IS_ERR(dst))
+ return PTR_ERR(dst);
sport = udp_flow_src_port(geneve->net, skb, 1, USHRT_MAX, true);
- skb_reset_mac_header(skb);
-
- iip = ip_hdr(skb);
+ if (geneve->collect_md) {
+ prio = ip_tunnel_ecn_encap(key->tos, ip_hdr(skb), skb);
+ ttl = key->ttl;
+ } else {
+ prio = ip_tunnel_ecn_encap(ip6_tclass(fl6.flowlabel),
+ ip_hdr(skb), skb);
+ ttl = key->ttl ? : ip6_dst_hoplimit(dst);
+ }
+ tunnel_id_to_vni(key->tun_id, vni);
+ if (info->options_len)
+ opts = ip_tunnel_info_opts(info);
- if (info) {
- const struct ip_tunnel_key *key = &info->key;
- u8 *opts = NULL;
- u8 vni[3];
+ skb_reset_mac_header(skb);
+ err = geneve6_build_skb(dst, skb, key->tun_flags, vni,
+ info->options_len, opts, xnet);
+ if (unlikely(err))
+ return err;
- tunnel_id_to_vni(key->tun_id, vni);
- if (info->options_len)
- opts = ip_tunnel_info_opts(info);
+ udp_tunnel6_xmit_skb(dst, gs6->sock->sk, skb, dev,
+ &fl6.saddr, &fl6.daddr, prio, ttl,
+ info->key.label, sport, geneve->info.key.tp_dst,
+ !(info->key.tun_flags & TUNNEL_CSUM));
+ return 0;
+}
+#endif
- if (key->tun_flags & TUNNEL_CSUM)
- flags &= ~GENEVE_F_UDP_ZERO_CSUM6_TX;
- else
- flags |= GENEVE_F_UDP_ZERO_CSUM6_TX;
+static netdev_tx_t geneve_xmit(struct sk_buff *skb, struct net_device *dev)
+{
+ struct geneve_dev *geneve = netdev_priv(dev);
+ struct ip_tunnel_info *info = NULL;
+ int err;
- err = geneve6_build_skb(dst, skb, key->tun_flags, vni,
- info->options_len, opts,
- flags, xnet);
- if (unlikely(err))
+ if (geneve->collect_md) {
+ info = skb_tunnel_info(skb);
+ if (unlikely(!info || !(info->mode & IP_TUNNEL_INFO_TX))) {
+ err = -EINVAL;
+ netdev_dbg(dev, "no tunnel metadata\n");
goto tx_error;
-
- prio = ip_tunnel_ecn_encap(key->tos, iip, skb);
- ttl = key->ttl;
- label = info->key.label;
+ }
} else {
- err = geneve6_build_skb(dst, skb, 0, geneve->vni,
- 0, NULL, flags, xnet);
- if (unlikely(err))
- goto tx_error;
-
- prio = ip_tunnel_ecn_encap(ip6_tclass(fl6.flowlabel),
- iip, skb);
- ttl = geneve->ttl;
- if (!ttl && ipv6_addr_is_multicast(&fl6.daddr))
- ttl = 1;
- ttl = ttl ? : ip6_dst_hoplimit(dst);
- label = geneve->label;
+ info = &geneve->info;
}
- udp_tunnel6_xmit_skb(dst, gs6->sock->sk, skb, dev,
- &fl6.saddr, &fl6.daddr, prio, ttl, label,
- sport, geneve->dst_port,
- !!(flags & GENEVE_F_UDP_ZERO_CSUM6_TX));
- return NETDEV_TX_OK;
+#if IS_ENABLED(CONFIG_IPV6)
+ if (info->mode & IP_TUNNEL_INFO_IPV6)
+ err = geneve6_xmit_skb(skb, dev, geneve, info);
+ else
+#endif
+ err = geneve_xmit_skb(skb, dev, geneve, info);
+ if (likely(!err))
+ return NETDEV_TX_OK;
tx_error:
dev_kfree_skb(skb);
@@ -1039,23 +946,6 @@ static netdev_tx_t geneve6_xmit_skb(struct sk_buff *skb, struct net_device *dev,
dev->stats.tx_errors++;
return NETDEV_TX_OK;
}
-#endif
-
-static netdev_tx_t geneve_xmit(struct sk_buff *skb, struct net_device *dev)
-{
- struct geneve_dev *geneve = netdev_priv(dev);
- struct ip_tunnel_info *info = NULL;
-
- if (geneve->collect_md)
- info = skb_tunnel_info(skb);
-
-#if IS_ENABLED(CONFIG_IPV6)
- if ((info && ip_tunnel_info_af(info) == AF_INET6) ||
- (!info && geneve->remote.sa.sa_family == AF_INET6))
- return geneve6_xmit_skb(skb, dev, info);
-#endif
- return geneve_xmit_skb(skb, dev, info);
-}
static int geneve_change_mtu(struct net_device *dev, int new_mtu)
{
@@ -1073,14 +963,11 @@ static int geneve_fill_metadata_dst(struct net_device *dev, struct sk_buff *skb)
{
struct ip_tunnel_info *info = skb_tunnel_info(skb);
struct geneve_dev *geneve = netdev_priv(dev);
- struct rtable *rt;
- struct flowi4 fl4;
-#if IS_ENABLED(CONFIG_IPV6)
- struct dst_entry *dst;
- struct flowi6 fl6;
-#endif
if (ip_tunnel_info_af(info) == AF_INET) {
+ struct rtable *rt;
+ struct flowi4 fl4;
+
rt = geneve_get_v4_rt(skb, dev, &fl4, info);
if (IS_ERR(rt))
return PTR_ERR(rt);
@@ -1089,6 +976,9 @@ static int geneve_fill_metadata_dst(struct net_device *dev, struct sk_buff *skb)
info->key.u.ipv4.src = fl4.saddr;
#if IS_ENABLED(CONFIG_IPV6)
} else if (ip_tunnel_info_af(info) == AF_INET6) {
+ struct dst_entry *dst;
+ struct flowi6 fl6;
+
dst = geneve_get_v6_dst(skb, dev, &fl6, info);
if (IS_ERR(dst))
return PTR_ERR(dst);
@@ -1102,7 +992,7 @@ static int geneve_fill_metadata_dst(struct net_device *dev, struct sk_buff *skb)
info->key.tp_src = udp_flow_src_port(geneve->net, skb,
1, USHRT_MAX, true);
- info->key.tp_dst = geneve->dst_port;
+ info->key.tp_dst = geneve->info.key.tp_dst;
return 0;
}
@@ -1224,78 +1114,69 @@ static int geneve_validate(struct nlattr *tb[], struct nlattr *data[])
}
static struct geneve_dev *geneve_find_dev(struct geneve_net *gn,
- __be16 dst_port,
- union geneve_addr *remote,
- u8 vni[],
+ const struct ip_tunnel_info *info,
bool *tun_on_same_port,
bool *tun_collect_md)
{
- struct geneve_dev *geneve, *t;
+ struct geneve_dev *geneve, *t = NULL;
*tun_on_same_port = false;
*tun_collect_md = false;
- t = NULL;
list_for_each_entry(geneve, &gn->geneve_list, next) {
- if (geneve->dst_port == dst_port) {
+ if (info->key.tp_dst == geneve->info.key.tp_dst) {
*tun_collect_md = geneve->collect_md;
*tun_on_same_port = true;
}
- if (!memcmp(vni, geneve->vni, sizeof(geneve->vni)) &&
- !memcmp(remote, &geneve->remote, sizeof(geneve->remote)) &&
- dst_port == geneve->dst_port)
+ if (info->key.tun_id == geneve->info.key.tun_id &&
+ info->key.tp_dst == geneve->info.key.tp_dst &&
+ !memcmp(&info->key.u, &geneve->info.key.u, sizeof(info->key.u)))
t = geneve;
}
return t;
}
+static bool is_all_zero(const u8 *fp, size_t size)
+{
+ int i;
+
+ for (i = 0; i < size; i++)
+ if (fp[i])
+ return false;
+ return true;
+}
+
+static bool is_tnl_info_zero(const struct ip_tunnel_info *info)
+{
+ if (info->key.tun_id || info->key.tun_flags || info->key.tos ||
+ info->key.ttl || info->key.label || info->key.tp_src ||
+ !is_all_zero((const u8 *)&info->key.u, sizeof(info->key.u)))
+ return false;
+ else
+ return true;
+}
+
static int geneve_configure(struct net *net, struct net_device *dev,
- union geneve_addr *remote,
- __u32 vni, __u8 ttl, __u8 tos, __be32 label,
- __be16 dst_port, bool metadata, u32 flags)
+ const struct ip_tunnel_info *info,
+ bool metadata, bool ipv6_rx_csum)
{
struct geneve_net *gn = net_generic(net, geneve_net_id);
struct geneve_dev *t, *geneve = netdev_priv(dev);
bool tun_collect_md, tun_on_same_port;
int err, encap_len;
- if (!remote)
- return -EINVAL;
- if (metadata &&
- (remote->sa.sa_family != AF_UNSPEC || vni || tos || ttl || label))
+ if (metadata && !is_tnl_info_zero(info))
return -EINVAL;
geneve->net = net;
geneve->dev = dev;
- geneve->vni[0] = (vni & 0x00ff0000) >> 16;
- geneve->vni[1] = (vni & 0x0000ff00) >> 8;
- geneve->vni[2] = vni & 0x000000ff;
-
- if ((remote->sa.sa_family == AF_INET &&
- IN_MULTICAST(ntohl(remote->sin.sin_addr.s_addr))) ||
- (remote->sa.sa_family == AF_INET6 &&
- ipv6_addr_is_multicast(&remote->sin6.sin6_addr)))
- return -EINVAL;
- if (label && remote->sa.sa_family != AF_INET6)
- return -EINVAL;
-
- geneve->remote = *remote;
-
- geneve->ttl = ttl;
- geneve->tos = tos;
- geneve->label = label;
- geneve->dst_port = dst_port;
- geneve->collect_md = metadata;
- geneve->flags = flags;
-
- t = geneve_find_dev(gn, dst_port, remote, geneve->vni,
- &tun_on_same_port, &tun_collect_md);
+ t = geneve_find_dev(gn, info, &tun_on_same_port, &tun_collect_md);
if (t)
return -EBUSY;
/* make enough headroom for basic scenario */
encap_len = GENEVE_BASE_HLEN + ETH_HLEN;
- if (remote->sa.sa_family == AF_INET) {
+ if (ip_tunnel_info_af(info) == AF_INET) {
encap_len += sizeof(struct iphdr);
dev->max_mtu -= sizeof(struct iphdr);
} else {
@@ -1312,7 +1193,10 @@ static int geneve_configure(struct net *net, struct net_device *dev,
return -EPERM;
}
- dst_cache_reset(&geneve->dst_cache);
+ dst_cache_reset(&geneve->info.dst_cache);
+ geneve->info = *info;
+ geneve->collect_md = metadata;
+ geneve->use_udp6_rx_checksums = ipv6_rx_csum;
err = register_netdevice(dev);
if (err)
@@ -1322,74 +1206,99 @@ static int geneve_configure(struct net *net, struct net_device *dev,
return 0;
}
+static void init_tnl_info(struct ip_tunnel_info *info, __u16 dst_port)
+{
+ memset(info, 0, sizeof(*info));
+ info->key.tp_dst = htons(dst_port);
+}
+
static int geneve_newlink(struct net *net, struct net_device *dev,
struct nlattr *tb[], struct nlattr *data[])
{
- __be16 dst_port = htons(GENEVE_UDP_PORT);
- __u8 ttl = 0, tos = 0;
+ bool use_udp6_rx_checksums = false;
+ struct ip_tunnel_info info;
bool metadata = false;
- union geneve_addr remote = geneve_remote_unspec;
- __be32 label = 0;
- __u32 vni = 0;
- u32 flags = 0;
+
+ init_tnl_info(&info, GENEVE_UDP_PORT);
if (data[IFLA_GENEVE_REMOTE] && data[IFLA_GENEVE_REMOTE6])
return -EINVAL;
if (data[IFLA_GENEVE_REMOTE]) {
- remote.sa.sa_family = AF_INET;
- remote.sin.sin_addr.s_addr =
+ info.key.u.ipv4.dst =
nla_get_in_addr(data[IFLA_GENEVE_REMOTE]);
+
+ if (IN_MULTICAST(ntohl(info.key.u.ipv4.dst))) {
+ netdev_dbg(dev, "multicast remote is unsupported\n");
+ return -EINVAL;
+ }
}
if (data[IFLA_GENEVE_REMOTE6]) {
- if (!IS_ENABLED(CONFIG_IPV6))
- return -EPFNOSUPPORT;
-
- remote.sa.sa_family = AF_INET6;
- remote.sin6.sin6_addr =
+ #if IS_ENABLED(CONFIG_IPV6)
+ info.mode = IP_TUNNEL_INFO_IPV6;
+ info.key.u.ipv6.dst =
nla_get_in6_addr(data[IFLA_GENEVE_REMOTE6]);
- if (ipv6_addr_type(&remote.sin6.sin6_addr) &
+ if (ipv6_addr_type(&info.key.u.ipv6.dst) &
IPV6_ADDR_LINKLOCAL) {
netdev_dbg(dev, "link-local remote is unsupported\n");
return -EINVAL;
}
+ if (ipv6_addr_is_multicast(&info.key.u.ipv6.dst)) {
+ netdev_dbg(dev, "multicast remote is unsupported\n");
+ return -EINVAL;
+ }
+ info.key.tun_flags |= TUNNEL_CSUM;
+ use_udp6_rx_checksums = true;
+#else
+ return -EPFNOSUPPORT;
+#endif
}
- if (data[IFLA_GENEVE_ID])
+ if (data[IFLA_GENEVE_ID]) {
+ __u32 vni;
+ __u8 tvni[3];
+
vni = nla_get_u32(data[IFLA_GENEVE_ID]);
+ tvni[0] = (vni & 0x00ff0000) >> 16;
+ tvni[1] = (vni & 0x0000ff00) >> 8;
+ tvni[2] = vni & 0x000000ff;
+ info.key.tun_id = vni_to_tunnel_id(tvni);
+ }
if (data[IFLA_GENEVE_TTL])
- ttl = nla_get_u8(data[IFLA_GENEVE_TTL]);
+ info.key.ttl = nla_get_u8(data[IFLA_GENEVE_TTL]);
if (data[IFLA_GENEVE_TOS])
- tos = nla_get_u8(data[IFLA_GENEVE_TOS]);
+ info.key.tos = nla_get_u8(data[IFLA_GENEVE_TOS]);
- if (data[IFLA_GENEVE_LABEL])
- label = nla_get_be32(data[IFLA_GENEVE_LABEL]) &
- IPV6_FLOWLABEL_MASK;
+ if (data[IFLA_GENEVE_LABEL]) {
+ info.key.label = nla_get_be32(data[IFLA_GENEVE_LABEL]) &
+ IPV6_FLOWLABEL_MASK;
+ if (info.key.label && (!(info.mode & IP_TUNNEL_INFO_IPV6)))
+ return -EINVAL;
+ }
if (data[IFLA_GENEVE_PORT])
- dst_port = nla_get_be16(data[IFLA_GENEVE_PORT]);
+ info.key.tp_dst = nla_get_be16(data[IFLA_GENEVE_PORT]);
if (data[IFLA_GENEVE_COLLECT_METADATA])
metadata = true;
if (data[IFLA_GENEVE_UDP_CSUM] &&
!nla_get_u8(data[IFLA_GENEVE_UDP_CSUM]))
- flags |= GENEVE_F_UDP_ZERO_CSUM_TX;
+ info.key.tun_flags |= TUNNEL_CSUM;
if (data[IFLA_GENEVE_UDP_ZERO_CSUM6_TX] &&
nla_get_u8(data[IFLA_GENEVE_UDP_ZERO_CSUM6_TX]))
- flags |= GENEVE_F_UDP_ZERO_CSUM6_TX;
+ info.key.tun_flags &= ~TUNNEL_CSUM;
if (data[IFLA_GENEVE_UDP_ZERO_CSUM6_RX] &&
nla_get_u8(data[IFLA_GENEVE_UDP_ZERO_CSUM6_RX]))
- flags |= GENEVE_F_UDP_ZERO_CSUM6_RX;
+ use_udp6_rx_checksums = false;
- return geneve_configure(net, dev, &remote, vni, ttl, tos, label,
- dst_port, metadata, flags);
+ return geneve_configure(net, dev, &info, metadata, use_udp6_rx_checksums);
}
static void geneve_dellink(struct net_device *dev, struct list_head *head)
@@ -1418,45 +1327,52 @@ static size_t geneve_get_size(const struct net_device *dev)
static int geneve_fill_info(struct sk_buff *skb, const struct net_device *dev)
{
struct geneve_dev *geneve = netdev_priv(dev);
+ struct ip_tunnel_info *info = &geneve->info;
+ __u8 tmp_vni[3];
__u32 vni;
- vni = (geneve->vni[0] << 16) | (geneve->vni[1] << 8) | geneve->vni[2];
+ tunnel_id_to_vni(info->key.tun_id, tmp_vni);
+ vni = (tmp_vni[0] << 16) | (tmp_vni[1] << 8) | tmp_vni[2];
if (nla_put_u32(skb, IFLA_GENEVE_ID, vni))
goto nla_put_failure;
- if (geneve->remote.sa.sa_family == AF_INET) {
+ if (ip_tunnel_info_af(info) == AF_INET) {
if (nla_put_in_addr(skb, IFLA_GENEVE_REMOTE,
- geneve->remote.sin.sin_addr.s_addr))
+ info->key.u.ipv4.dst))
+ goto nla_put_failure;
+
+ if (nla_put_u8(skb, IFLA_GENEVE_UDP_CSUM,
+ !!(info->key.tun_flags & TUNNEL_CSUM)))
goto nla_put_failure;
+
#if IS_ENABLED(CONFIG_IPV6)
} else {
if (nla_put_in6_addr(skb, IFLA_GENEVE_REMOTE6,
- &geneve->remote.sin6.sin6_addr))
+ &info->key.u.ipv6.dst))
+ goto nla_put_failure;
+
+ if (nla_put_u8(skb, IFLA_GENEVE_UDP_ZERO_CSUM6_TX,
+ !(info->key.tun_flags & TUNNEL_CSUM)))
+ goto nla_put_failure;
+
+ if (nla_put_u8(skb, IFLA_GENEVE_UDP_ZERO_CSUM6_RX,
+ !geneve->use_udp6_rx_checksums))
goto nla_put_failure;
#endif
}
- if (nla_put_u8(skb, IFLA_GENEVE_TTL, geneve->ttl) ||
- nla_put_u8(skb, IFLA_GENEVE_TOS, geneve->tos) ||
- nla_put_be32(skb, IFLA_GENEVE_LABEL, geneve->label))
+ if (nla_put_u8(skb, IFLA_GENEVE_TTL, info->key.ttl) ||
+ nla_put_u8(skb, IFLA_GENEVE_TOS, info->key.tos) ||
+ nla_put_be32(skb, IFLA_GENEVE_LABEL, info->key.label))
goto nla_put_failure;
- if (nla_put_be16(skb, IFLA_GENEVE_PORT, geneve->dst_port))
+ if (nla_put_be16(skb, IFLA_GENEVE_PORT, info->key.tp_dst))
goto nla_put_failure;
if (geneve->collect_md) {
if (nla_put_flag(skb, IFLA_GENEVE_COLLECT_METADATA))
goto nla_put_failure;
}
-
- if (nla_put_u8(skb, IFLA_GENEVE_UDP_CSUM,
- !(geneve->flags & GENEVE_F_UDP_ZERO_CSUM_TX)) ||
- nla_put_u8(skb, IFLA_GENEVE_UDP_ZERO_CSUM6_TX,
- !!(geneve->flags & GENEVE_F_UDP_ZERO_CSUM6_TX)) ||
- nla_put_u8(skb, IFLA_GENEVE_UDP_ZERO_CSUM6_RX,
- !!(geneve->flags & GENEVE_F_UDP_ZERO_CSUM6_RX)))
- goto nla_put_failure;
-
return 0;
nla_put_failure:
@@ -1480,6 +1396,7 @@ struct net_device *geneve_dev_create_fb(struct net *net, const char *name,
u8 name_assign_type, u16 dst_port)
{
struct nlattr *tb[IFLA_MAX + 1];
+ struct ip_tunnel_info info;
struct net_device *dev;
LIST_HEAD(list_kill);
int err;
@@ -1490,9 +1407,8 @@ struct net_device *geneve_dev_create_fb(struct net *net, const char *name,
if (IS_ERR(dev))
return dev;
- err = geneve_configure(net, dev, &geneve_remote_unspec,
- 0, 0, 0, 0, htons(dst_port), true,
- GENEVE_F_UDP_ZERO_CSUM6_RX);
+ init_tnl_info(&info, dst_port);
+ err = geneve_configure(net, dev, &info, true, true);
if (err) {
free_netdev(dev);
return ERR_PTR(err);
@@ -1510,8 +1426,7 @@ struct net_device *geneve_dev_create_fb(struct net *net, const char *name,
goto err;
return dev;
-
- err:
+err:
geneve_dellink(dev, &list_kill);
unregister_netdevice_many(&list_kill);
return ERR_PTR(err);
@@ -1594,7 +1509,6 @@ static int __init geneve_init_module(void)
goto out3;
return 0;
-
out3:
unregister_netdevice_notifier(&geneve_notifier_block);
out2:
--
1.8.3.1
^ permalink raw reply related
* [PATCH net-next v3 0/4] geneve: Use LWT more effectively.
From: Pravin B Shelar @ 2016-11-21 19:02 UTC (permalink / raw)
To: netdev; +Cc: Pravin B Shelar
The following patch series makes use of the geneve LWT code path for
geneve netdev type devices.
This allows us to simplify the geneve module without changing any
functionality.
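As a rough illustration of the unification (a minimal userspace sketch,
not the driver code): the device keeps its static configuration in the
same kind of tunnel-info structure that LWT attaches to each packet, so
the transmit path only has to choose which one to use. The names
tnl_info, tnl_dev and select_tx_info() are hypothetical stand-ins for
struct ip_tunnel_info, struct geneve_dev and the check at the top of
geneve_xmit(); the fields are assumptions chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct ip_tunnel_info. */
struct tnl_info {
	uint32_t vni;       /* 24-bit virtual network identifier */
	uint16_t dst_port;  /* UDP destination port */
	bool     tx;        /* models the IP_TUNNEL_INFO_TX mode bit */
	bool     ipv6;      /* models the IP_TUNNEL_INFO_IPV6 mode bit */
};

/* Hypothetical stand-in for the per-device private data. */
struct tnl_dev {
	bool collect_md;        /* metadata (LWT) mode enabled? */
	struct tnl_info info;   /* static config, used when !collect_md */
};

/*
 * Pick the tunnel info that drives transmission.
 * In metadata mode the packet must carry TX-capable metadata; otherwise
 * the device's own info is used. Returns NULL when the packet has to be
 * dropped because usable metadata is missing.
 */
static inline const struct tnl_info *
select_tx_info(const struct tnl_dev *dev, const struct tnl_info *pkt_md)
{
	assert(dev != NULL);

	if (dev->collect_md) {
		if (pkt_md == NULL || !pkt_md->tx)
			return NULL;
		return pkt_md;
	}
	return &dev->info;
}
```

With a single selection point like this, the IPv4 and IPv6 transmit
helpers can take the chosen info unconditionally instead of each
carrying a separate "no info" branch.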
v2-v3:
Rebase against latest net-next.
v1-v2:
Fix warning reported by kbuild test robot.
Pravin B Shelar (4):
geneve: Unify LWT and netdev handling.
geneve: Merge ipv4 and ipv6 geneve_build_skb()
geneve: Remove redundant socket checks.
geneve: Optimize geneve device lookup.
drivers/net/geneve.c | 679 +++++++++++++++++++++------------------------------
1 file changed, 274 insertions(+), 405 deletions(-)
--
1.8.3.1
^ permalink raw reply
* [GIT] Networking
From: David Miller @ 2016-11-21 18:34 UTC (permalink / raw)
To: torvalds; +Cc: akpm, netdev, linux-kernel
1) Clear congestion control state when changing algorithms on an
existing socket, from Florian Westphal.
2) Fix register bit values in altr_tse_pcs portion of stmmac driver,
from Jia Jie Ho.
3) Fix PTP handling in stmmac driver for GMAC4, from Giuseppe
CAVALLARO.
4) Fix udplite multicast delivery handling; it ignores the udp_table
parameter passed into the lookups, from Pablo Neira Ayuso.
5) Synchronize the space estimated by rtnl_vfinfo_size and the space
actually used by rtnl_fill_vfinfo. From Sabrina Dubroca.
6) Fix memory leak in fib_info when splitting nodes, from Alexander
Duyck.
7) If a driver does a napi_hash_del() explicitly and not via
netif_napi_del(), it must perform RCU synchronization as needed.
Fix this in virtio-net and bnxt drivers, from Eric Dumazet.
8) Likewise, it is not necessary to invoke napi_hash_del() if we are
also doing netif_napi_del() in the same code path. Remove such
calls from be2net and cxgb4 drivers, also from Eric Dumazet.
9) Don't allocate an ID in peernet2id_alloc() if the netns is dead,
from WANG Cong.
10) Fix OF node and device struct leaks in of_mdio, from Johan Hovold.
11) We cannot cache routes in ip6_tunnel when using inherited traffic
classes, from Paolo Abeni.
12) Fix several crashes and leaks in cpsw driver, from Johan Hovold.
13) Splice operations cannot use freezable blocking calls in AF_UNIX,
from WANG Cong.
14) Support for link dump filtering by master device and kind
introduced an error in loop index updates during the dump when
filtering is actually applied, fix from Zhang Shengju.
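To make item 1 above concrete, here is a minimal userspace sketch of
the idea, not the kernel code: when a connection switches congestion
control algorithms, the private scratch area left behind by the old
algorithm is zeroed so the new one never interprets stale state. The
names struct conn, struct cc_ops, conn_switch_cc() and the size
CA_PRIV_WORDS are all hypothetical and chosen only for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CA_PRIV_WORDS 8	/* hypothetical size of the private area */

/* Hypothetical descriptor for a congestion control algorithm. */
struct cc_ops {
	const char *name;
};

/* Hypothetical connection carrying per-algorithm private state. */
struct conn {
	const struct cc_ops *ops;
	uint64_t ca_priv[CA_PRIV_WORDS];
};

/*
 * Switch the algorithm on an existing connection and clear the
 * private area, so the new algorithm starts from all-zero state
 * rather than from whatever the previous one left behind.
 */
static inline void conn_switch_cc(struct conn *c, const struct cc_ops *new_ops)
{
	assert(c != NULL && new_ops != NULL);

	c->ops = new_ops;
	memset(c->ca_priv, 0, sizeof(c->ca_priv));
}
```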
Please pull, thanks a lot!
The following changes since commit e76d21c40bd6c67fd4e2c1540d77e113df962b4d:
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net (2016-11-14 14:15:53 -0800)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git
for you to fetch changes up to 7082c5c3f2407c52022507ffaf644dbbab97a883:
tcp: zero ca_priv area when switching cc algorithms (2016-11-21 13:13:56 -0500)
----------------------------------------------------------------
Alex (1):
net/phy/vitesse: Configure RGMII skew on VSC8601, if needed
Alexander Duyck (2):
ipv4: Restore fib_trie_flush_external function and fix call ordering
ipv4: Fix memory leak in exception case for splitting tries
Alexander Kochetkov (2):
net: arc_emac: annonce IFF_MULTICAST support
net: arc_emac: don't pass multicast packets to kernel in non-multicast mode
Alexey Khoroshilov (1):
net: macb: add check for dma mapping error in start_xmit()
Benjamin Beichler (1):
mac80211_hwsim: fix beacon delta calculation
David S. Miller (7):
Merge branch 'stmmac-ptp'
Merge branch 'fib-tables-fixes'
Merge branch 'thunderx-fixes'
Merge branch 'phy-dev-leaks'
Merge branch 'cpsw-fixes'
Merge tag 'mac80211-for-davem-2016-11-18' of git://git.kernel.org/.../jberg/mac80211
Merge tag 'batadv-net-for-davem-20161119' of git://git.open-mesh.org/linux-merge
Eric Dumazet (5):
gro_cells: mark napi struct as not busy poll candidates
virtio-net: add a missing synchronize_net()
be2net: do not call napi_hash_del()
cxgb4: do not call napi_hash_del()
bnxt: add a missing rcu synchronization
Felix Fietkau (4):
Revert "mac80211: allow using AP_LINK_PS with mac80211-generated TIM IE"
mac80211: update A-MPDU flag on tx dequeue
mac80211: remove bogus skb vif assignment
mac80211: fix A-MSDU aggregation with fast-xmit + txq
Filip Matusiak (1):
mac80211: Ignore VHT IE from peer with wrong rx_mcs_map
Florian Fainelli (1):
net: dsa: b53: Fix VLAN usage and how we treat CPU port
Florian Westphal (1):
tcp: zero ca_priv area when switching cc algorithms
Gao Feng (1):
net: l2tp: Treat NET_XMIT_CN as success in l2tp_eth_dev_xmit
Giuseppe CAVALLARO (3):
stmmac: update the PTP header file
stmmac: fix PTP support for GMAC4
stmmac: fix PTP type ethtool stats
Guillaume Nault (1):
l2tp: fix racy SOCK_ZAPPED flag check in l2tp_ip{,6}_bind()
Hangbin Liu (1):
igmp: do not remove igmp souce list info when set link down
Jeremy Linton (1):
net: sky2: Fix shutdown crash
Jia Jie Ho (1):
net: ethernet: Fix SGMII unable to switch speed and autonego failure
Johan Hovold (10):
of_mdio: fix node leak in of_phy_register_fixed_link error path
of_mdio: fix device reference leak in of_phy_find_device
net: phy: fixed_phy: fix of_node leak in fixed_phy_unregister
net: ethernet: ti: cpsw: fix bad register access in probe error path
net: ethernet: ti: cpsw: fix mdio device reference leak
net: ethernet: ti: cpsw: fix deferred probe
net: ethernet: ti: cpsw: fix of_node and phydev leaks
net: ethernet: ti: cpsw: fix secondary-emac probe error path
net: ethernet: ti: cpsw: add missing sanity check
net: ethernet: ti: cpsw: fix fixed-link phy probe deferral
Johannes Berg (1):
cfg80211: limit scan results cache size
Jon Paul Maloy (1):
tipc: eliminate obsolete socket locking policy description
Josef Bacik (1):
bpf: fix range arithmetic for bpf map access
Pablo Neira (1):
udp: restore UDPlite many-cast delivery
Paolo Abeni (1):
ip6_tunnel: disable caching when the traffic class is inherited
Pedersen, Thomas (1):
cfg80211: add bitrate for 20MHz MCS 9
Peter Robinson (1):
ethernet: stmmac: make DWMAC_STM32 depend on it's associated SoC
Radha Mohan Chintakuntla (1):
net: thunderx: Introduce BGX_ID_MASK macro to extract bgx_id
Roman Mashak (1):
net sched filters: pass netlink message flags in event notification
Sabrina Dubroca (3):
rtnetlink: fix rtnl_vfinfo_size
rtnetlink: fix rtnl message size computation for XDP
rtnetlink: fix FDB size computation
Stefan Hajnoczi (1):
netns: fix get_net_ns_by_fd(int pid) typo
Sunil Goutham (4):
net: thunderx: Program LMAC credits based on MTU
net: thunderx: Fix configuration of L3/L4 length checking
net: thunderx: Fix VF driver's interface statistics
net: thunderx: Fix memory leak and other issues upon interface toggle
Sven Eckelmann (2):
batman-adv: Revert "fix splat on disabling an interface"
batman-adv: Detect missing primaryif during tp_send as error
WANG Cong (2):
net: check dead netns for peernet2id_alloc()
af_unix: conditionally use freezable blocking calls in read
Zhang Shengju (1):
rtnl: fix the loop index update error in rtnl_dump_ifinfo()
drivers/net/dsa/b53/b53_common.c | 16 ++++----------
drivers/net/ethernet/arc/emac_main.c | 7 +++---
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 4 ++++
drivers/net/ethernet/cadence/macb.c | 6 ++++++
drivers/net/ethernet/cavium/thunder/nic.h | 64 ++++++++++++++++++++++++++++++------------------------
drivers/net/ethernet/cavium/thunder/nic_main.c | 37 ++++++++++++++++++++++----------
drivers/net/ethernet/cavium/thunder/nic_reg.h | 1 +
drivers/net/ethernet/cavium/thunder/nicvf_ethtool.c | 105 ++++++++++++++++++++++++++++++++++++++++++++++++++---------------------------------------
drivers/net/ethernet/cavium/thunder/nicvf_main.c | 153 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++---------------------------------------------------------------
drivers/net/ethernet/cavium/thunder/nicvf_queues.c | 118 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++--------------------------------------------
drivers/net/ethernet/cavium/thunder/nicvf_queues.h | 24 ++-------------------
drivers/net/ethernet/cavium/thunder/thunder_bgx.c | 4 ++--
drivers/net/ethernet/cavium/thunder/thunder_bgx.h | 2 ++
drivers/net/ethernet/chelsio/cxgb4/sge.c | 1 -
drivers/net/ethernet/emulex/benet/be_main.c | 1 -
drivers/net/ethernet/marvell/sky2.c | 13 +++++++++++
drivers/net/ethernet/stmicro/stmmac/Kconfig | 2 +-
drivers/net/ethernet/stmicro/stmmac/altr_tse_pcs.c | 4 ++--
drivers/net/ethernet/stmicro/stmmac/common.h | 24 ++++++++++++---------
drivers/net/ethernet/stmicro/stmmac/descs.h | 20 ++++++++++-------
drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c | 95 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++------------------
drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.h | 4 ++++
drivers/net/ethernet/stmicro/stmmac/enh_desc.c | 28 +++++++++++++++---------
drivers/net/ethernet/stmicro/stmmac/stmmac.h | 1 +
drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c | 19 ++++++++++-------
drivers/net/ethernet/stmicro/stmmac/stmmac_hwtstamp.c | 43 +++++++++++++++++++++++++++++--------
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 97 ++++++++++++++++++++++++++++++++++++++++++----------------------------------------
drivers/net/ethernet/stmicro/stmmac/stmmac_ptp.c | 9 ++++----
drivers/net/ethernet/stmicro/stmmac/stmmac_ptp.h | 72 +++++++++++++++++++++++++++++++------------------------------
drivers/net/ethernet/ti/cpsw.c | 95 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++------------------
drivers/net/phy/fixed_phy.c | 2 +-
drivers/net/phy/vitesse.c | 34 ++++++++++++++++++++++++++++-
drivers/net/virtio_net.c | 5 +++++
drivers/net/wireless/mac80211_hwsim.c | 2 +-
drivers/of/of_mdio.c | 6 +++++-
include/linux/bpf_verifier.h | 5 +++--
include/net/gro_cells.h | 3 +++
include/net/ip_fib.h | 1 +
include/net/net_namespace.h | 2 +-
kernel/bpf/verifier.c | 70 ++++++++++++++++++++++++++++++++++++++++--------------------
net/batman-adv/hard-interface.c | 1 +
net/batman-adv/tp_meter.c | 1 +
net/core/net_namespace.c | 2 ++
net/core/rtnetlink.c | 22 ++++++++++++-------
net/ipv4/fib_frontend.c | 20 ++++++++++++-----
net/ipv4/fib_trie.c | 69 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
net/ipv4/igmp.c | 50 +++++++++++++++++++++++++++++++------------
net/ipv4/tcp_cong.c | 4 +++-
net/ipv4/udp.c | 6 +++---
net/ipv6/ip6_tunnel.c | 13 +++++++++--
net/ipv6/udp.c | 6 +++---
net/l2tp/l2tp_eth.c | 2 +-
net/l2tp/l2tp_ip.c | 5 +++--
net/l2tp/l2tp_ip6.c | 5 +++--
net/mac80211/sta_info.c | 2 +-
net/mac80211/tx.c | 14 ++++++++----
net/mac80211/vht.c | 16 ++++++++++++++
net/sched/cls_api.c | 5 +++--
net/tipc/socket.c | 48 +----------------------------------------
net/unix/af_unix.c | 17 +++++++++------
net/wireless/core.h | 1 +
net/wireless/scan.c | 69 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
net/wireless/util.c | 3 ++-
63 files changed, 1020 insertions(+), 560 deletions(-)
^ permalink raw reply
* Re: [PATCH] VSOCK: add loopback to virtio_transport
From: David Miller @ 2016-11-21 18:22 UTC (permalink / raw)
To: jhansen; +Cc: stefanha, netdev, cavery, imbrenda
In-Reply-To: <BY2PR0501MB20563DC7163282F633F7A30EDAB50@BY2PR0501MB2056.namprd05.prod.outlook.com>
From: "Jorgen S. Hansen" <jhansen@vmware.com>
Date: Mon, 21 Nov 2016 12:40:33 +0000
> That should make it on par with the VMCI transport.
Please do not top-post.
^ permalink raw reply
* Re: [PATCH v2 next 0/2] tcp: make undo_cwnd mandatory for congestion modules
From: David Miller @ 2016-11-21 18:20 UTC (permalink / raw)
To: fw; +Cc: netdev
In-Reply-To: <1479734318-30607-1-git-send-email-fw@strlen.de>
From: Florian Westphal <fw@strlen.de>
Date: Mon, 21 Nov 2016 14:18:36 +0100
> The highspeed, illinois, scalable, veno and yeah congestion control
> algorithms don't provide an 'undo_cwnd' function. This makes the stack
> default to a 'reno undo', which doubles cwnd. However, the ssthresh
> implementations of these algorithms do not halve the slowstart threshold.
> This causes an issue similar to the one fixed for dctcp in ce6dd23329b1e
> ("dctcp: avoid bogus doubling of cwnd after loss").
>
> In light of this it seems better to remove the fallback and make undo_cwnd
> mandatory.
>
> First patch fixes those spots where reno undo seems incorrect by providing
> .undo_cwnd functions, second patch removes the fallback.
Series applied, thanks for following up on this.
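
For readers following the thread, the mismatch described in the cover letter can be sketched in a few lines of C. This is a simplified userspace model, not kernel code: struct flow, gentle_ssthresh(), on_loss(), reno_fallback_undo() and own_undo() are hypothetical names, and the 1/8 decrease is only an illustrative stand-in for an algorithm whose ssthresh does not halve cwnd.

```c
/* Simplified model of a flow's congestion state (hypothetical, not the
 * kernel's struct tcp_sock). */
struct flow {
	unsigned int cwnd;       /* current congestion window */
	unsigned int ssthresh;   /* slow start threshold */
	unsigned int prior_cwnd; /* cwnd saved just before the loss reaction */
};

static unsigned int max_u(unsigned int a, unsigned int b)
{
	return a > b ? a : b;
}

/* Illustrative ssthresh that reduces cwnd by 1/8 instead of halving it. */
static unsigned int gentle_ssthresh(unsigned int cwnd)
{
	return max_u(cwnd - (cwnd >> 3), 2U);
}

/* Loss reaction: remember the old window, then shrink to ssthresh. */
static void on_loss(struct flow *f)
{
	f->prior_cwnd = f->cwnd;
	f->ssthresh = gentle_ssthresh(f->cwnd);
	f->cwnd = f->ssthresh;
}

/* Reno-style fallback undo: assumes ssthresh was cwnd / 2, so it doubles. */
static unsigned int reno_fallback_undo(const struct flow *f)
{
	return max_u(f->cwnd, f->ssthresh << 1);
}

/* Per-algorithm undo: restore the window that was saved before the loss. */
static unsigned int own_undo(const struct flow *f)
{
	return max_u(f->cwnd, f->prior_cwnd);
}
```

With cwnd = 80 before a spurious loss, on_loss() sets ssthresh and cwnd to 70. The reno-style fallback then "restores" cwnd to 140, well above the original window, while the per-algorithm undo returns 80.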
^ permalink raw reply