Netdev List
* [PATCH 00/16] Netfilter/IPVS/OVS fixes for net
From: Pablo Neira Ayuso @ 2017-05-03  9:31 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

Hi David,

The following patchset contains a rather large batch of Netfilter, IPVS
and OVS fixes for your net tree. This includes fixes for ctnetlink, the
userspace conntrack helper infrastructure, conntrack OVS support,
ebtables DNAT target, and several leaks in error paths, among others. More
specifically, they are:

1) Fix reference count leak in the CT target error path, from Gao Feng.

2) Remove conntrack entry clashing with a matching expectation, patch
   from Jarno Rajahalme.

3) Fix bogus EEXIST when registering two different userspace helpers,
   from Liping Zhang.

4) Don't leak dummy elements in the new bitmap set type in nf_tables,
   from Liping Zhang.

5) Get rid of module autoload from conntrack update path in ctnetlink;
   we don't need autoload at this late stage and it is happening with
   the rcu read lock held, which is not good. From Liping Zhang.

6) Fix deadlock due to double-acquire of the expect_lock from conntrack
   update path. This fixes a bug that was introduced when the central
   spinlock got removed. Again from Liping Zhang.

7) Safe ct->status update from ctnetlink path, from Liping. The expect_lock
   protection that was selected when the central spinlock was removed was
   not really protecting anything at all.

8) Protect sequence adjustment under ct->lock.

9) Missing socket match with IPv6, from Peter Tirsek.

10) Adjust skb->pkt_type of DNAT'ed frames from ebtables, from
    Linus Luessing.

11) Don't give up on evaluating the expression on new entries added via
    dynset expression in nf_tables, from Liping Zhang.

12) Use skb_checksum() when mangling icmpv6 in IPv6 NAT as this deals
    with non-linear skbuffs.

13) Don't allow IPv6 service in IPVS if no IPv6 support is available,
    from Paolo Abeni.

14) Missing mutex release in error path of xt_find_table_lock(), from
    Dan Carpenter.

15) Update maintainers files, Netfilter section. Add Florian to the
    file, refer to nftables.org and change project status from Supported
    to Maintained.

16) Bail out on mismatching extensions in element updates in nf_tables.

You can pull these changes from:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git

Thanks!

----------------------------------------------------------------

The following changes since commit 94836ecf1e7378b64d37624fbb81fe48fbd4c772:

  Merge tag 'nfsd-4.11-2' of git://linux-nfs.org/~bfields/linux (2017-04-21 16:37:48 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git HEAD

for you to fetch changes up to 9744a6fcefcb4d56501d69adb04c24559d353cad:

  netfilter: nf_tables: check if same extensions are set when adding elements (2017-05-03 10:58:00 +0200)

----------------------------------------------------------------
Dan Carpenter (1):
      netfilter: x_tables: unlock on error in xt_find_table_lock()

Dave Johnson (1):
      netfilter: Wrong icmp6 checksum for ICMPV6_TIME_EXCEED in reverse SNATv6 path

Gao Feng (1):
      netfilter: xt_CT: fix refcnt leak on error path

Jarno Rajahalme (1):
      openvswitch: Delete conntrack entry clashing with an expectation.

Linus Lüssing (1):
      bridge: ebtables: fix reception of frames DNAT-ed to bridge device/port

Liping Zhang (7):
      netfilter: nf_ct_helper: permit cthelpers with different names via nfnetlink
      netfilter: nft_set_bitmap: free dummy elements when destroy the set
      netfilter: ctnetlink: drop the incorrect cthelper module request
      netfilter: ctnetlink: fix deadlock due to acquire _expect_lock twice
      netfilter: ctnetlink: make it safer when updating ct->status
      netfilter: ctnetlink: acquire ct->lock before operating nf_ct_seqadj
      netfilter: nft_dynset: continue to next expr if _OP_ADD succeeded

Pablo Neira Ayuso (3):
      Merge tag 'ipvs-fixes-for-v4.11' of http://git.kernel.org/.../horms/ipvs
      netfilter: update MAINTAINERS file
      netfilter: nf_tables: check if same extensions are set when adding elements

Paolo Abeni (1):
      ipvs: explicitly forbid ipv6 service/dest creation if ipv6 mod is disabled

Peter Tirsek (1):
      netfilter: xt_socket: Fix broken IPv6 handling

 MAINTAINERS                                        |  4 +-
 include/uapi/linux/netfilter/nf_conntrack_common.h | 13 +++-
 net/bridge/netfilter/ebt_dnat.c                    | 20 +++++
 net/ipv6/netfilter/nf_nat_l3proto_ipv6.c           |  2 +-
 net/netfilter/ipvs/ip_vs_ctl.c                     | 22 ++++--
 net/netfilter/nf_conntrack_helper.c                | 26 +++++--
 net/netfilter/nf_conntrack_netlink.c               | 89 ++++++++++++----------
 net/netfilter/nf_tables_api.c                      |  5 ++
 net/netfilter/nft_dynset.c                         |  5 +-
 net/netfilter/nft_set_bitmap.c                     |  5 ++
 net/netfilter/x_tables.c                           |  4 +-
 net/netfilter/xt_CT.c                              | 11 ++-
 net/netfilter/xt_socket.c                          |  2 +-
 net/openvswitch/conntrack.c                        | 30 +++++++-
 14 files changed, 174 insertions(+), 64 deletions(-)
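
As an illustration of the class of bug behind item 14 (a lock left held on
an error path), here is a minimal userspace sketch. It is not the kernel
code: the toy_* names, the table list and the atomic_flag-based lock are all
hypothetical stand-ins for xt_find_table_lock() and its mutex.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

/* Toy lock standing in for the kernel mutex (hypothetical, userspace only). */
static atomic_flag table_lock = ATOMIC_FLAG_INIT;

static void toy_lock(void)
{
	while (atomic_flag_test_and_set(&table_lock))
		;	/* spin until the flag was previously clear */
}

static void toy_unlock(void)
{
	atomic_flag_clear(&table_lock);
}

struct toy_table {
	const char *name;
};

static struct toy_table tables[] = { { "filter" }, { "nat" } };

/*
 * Returns the table with the lock HELD on success, or NULL with the lock
 * RELEASED on failure.
 */
static struct toy_table *toy_find_table_lock(const char *name)
{
	size_t i;

	assert(name != NULL);
	toy_lock();
	for (i = 0; i < sizeof(tables) / sizeof(tables[0]); i++) {
		if (strcmp(tables[i].name, name) == 0)
			return &tables[i];	/* success: caller unlocks */
	}
	toy_unlock();	/* error path: must not return with the lock held */
	return NULL;
}

/* Returns 1 if the lock was free (and leaves it free), 0 if it was held. */
static int toy_lock_is_free(void)
{
	if (atomic_flag_test_and_set(&table_lock))
		return 0;
	atomic_flag_clear(&table_lock);
	return 1;
}
```

Dropping the toy_unlock() call before "return NULL" reproduces the failure
mode: the next caller of toy_find_table_lock() would spin forever.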


* RE: [PATCH] Fix for new version of realtek r8153
From: Hayes Wang @ 2017-05-03  9:18 UTC (permalink / raw)
  To: jake Briggs, mario_limonciello-8PEkshWhKlo@public.gmane.org,
	linux-usb-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
  Cc: jake, nic_swsd
In-Reply-To: <20170502232048.9153-1-nexussix-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>

jake Briggs [mailto:nexussix-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org]
> Sent: Wednesday, May 03, 2017 7:21 AM
[...]
> diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c
> index 07f788c49d57..2a55459fdfac 100644
> --- a/drivers/net/usb/r8152.c
> +++ b/drivers/net/usb/r8152.c
> @@ -4277,6 +4277,7 @@ static void r8152b_get_version(struct r8152 *tp)
>  		tp->mii.supports_gmii = 1;
>  		break;
>  	case 0x5c30:
> +	case 0x6010:

The two chips are different. I don't think it is a good idea.
Maybe you could use the driver from the Realtek website first.

>  		tp->version = RTL_VER_06;
>  		tp->mii.supports_gmii = 1;
>  		break;


* [iproute PATCH] man: ip.8: Document -brief flag
From: Phil Sutter @ 2017-05-03  9:07 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev

Brief output is especially useful for new users, so at least mention
its existence in the ip man page.

Signed-off-by: Phil Sutter <phil@nwl.cc>
---
 man/man8/ip.8 | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/man/man8/ip.8 b/man/man8/ip.8
index 1c5a7419e4fc2..ae018fdf11ac9 100644
--- a/man/man8/ip.8
+++ b/man/man8/ip.8
@@ -48,7 +48,8 @@ ip \- show / manipulate routing, devices, policy routing and tunnels
 \fB\-ts\fR[\fIhort\fR] |
 \fB\-n\fR[\fIetns\fR] name |
 \fB\-a\fR[\fIll\fR] |
-\fB\-c\fR[\fIolor\fR] }
+\fB\-c\fR[\fIolor\fR] |
+\fB\-br\fR[\fIief\fR] }
 
 
 .SH OPTIONS
@@ -206,6 +207,11 @@ Set the netlink socket receive buffer size, defaults to 1MB.
 .BR "\-iec"
 print human readable rates in IEC units (e.g. 1Ki = 1024).
 
+.TP
+.BR "\-br" , "\-brief"
+Print only basic information in a tabular format for better readability. This option is currently only supported by
+.BR "ip addr show " and " ip link show " commands.
+
 .SH IP - COMMAND SYNTAX
 
 .SS
-- 
2.11.0


* [PATCH RESEND 4.4-only] netlink: Allow direct reclaim for fallback allocation
From: Ross Lagerwall @ 2017-05-03  8:44 UTC (permalink / raw)
  To: stable
  Cc: Ross Lagerwall, David S. Miller, Greg Kroah-Hartman, Eric Dumazet,
	netdev, linux-kernel

The backport of d35c99ff77ec ("netlink: do not enter direct reclaim from
netlink_dump()") to the 4.4 branch (first in 4.4.32) mistakenly removed
direct reclaim from the initial large allocation _and_ the fallback
allocation, which means that allocations can spuriously fail.
Fix the issue by adding back the direct reclaim flag to the fallback
allocation.

Fixes: 6d123f1d396b ("netlink: do not enter direct reclaim from netlink_dump()")
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
---

Note that this is only for the 4.4 branch as the regression is only in
this branch. Consequently, there is no corresponding upstream commit.

I'm resending this to the linux-stable list since I now understand the
netdev maintainer only handles backports for the last couple of versions
of Linux.

 net/netlink/af_netlink.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index 8e33019..acfb16f 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -2107,7 +2107,7 @@ static int netlink_dump(struct sock *sk)
 	if (!skb) {
 		alloc_size = alloc_min_size;
 		skb = netlink_alloc_skb(sk, alloc_size, nlk->portid,
-					(GFP_KERNEL & ~__GFP_DIRECT_RECLAIM));
+					GFP_KERNEL);
 	}
 	if (!skb)
 		goto errout_skb;
-- 
2.7.4
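
The two-stage allocation that the patch above restores can be sketched in
userspace roughly as follows. This is only a model under stated assumptions:
the TOY_GFP_* values, toy_alloc() and the toy_memory_pressure switch are
hypothetical stand-ins, not the kernel allocator or netlink_alloc_skb().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel GFP flags; values are illustrative. */
#define TOY_GFP_DIRECT_RECLAIM	0x1u
#define TOY_GFP_KERNEL		(0x2u | TOY_GFP_DIRECT_RECLAIM)

/* Flags used by the most recent allocation attempt. */
static unsigned int toy_last_flags;

/* When set, allocations that may not reclaim are made to fail. */
static int toy_memory_pressure;

static void *toy_alloc(size_t size, unsigned int flags)
{
	toy_last_flags = flags;
	if (toy_memory_pressure && !(flags & TOY_GFP_DIRECT_RECLAIM))
		return NULL;
	return malloc(size);
}

/*
 * Try a large buffer without direct reclaim first; if that fails, fall back
 * to the minimum size with reclaim allowed so the dump can still proceed.
 */
static void *toy_dump_alloc(size_t big, size_t min, size_t *got)
{
	void *buf;

	assert(got != NULL && min > 0 && big >= min);

	/* Opportunistic large buffer: fail fast rather than reclaim. */
	buf = toy_alloc(big, TOY_GFP_KERNEL & ~TOY_GFP_DIRECT_RECLAIM);
	if (buf) {
		*got = big;
		return buf;
	}

	/* Fallback: small buffer, allowed to reclaim. */
	buf = toy_alloc(min, TOY_GFP_KERNEL);
	*got = buf ? min : 0;
	return buf;
}
```

If the fallback call also masked out the reclaim flag, as the 4.4 backport
did, both attempts would fail under pressure in this model.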


* [PATCH 1/1] net: usb: qmi_wwan: add Telit ME910 support
From: Daniele Palmas @ 2017-05-03  8:30 UTC (permalink / raw)
  To: Bjørn Mork; +Cc: netdev, Daniele Palmas

This patch adds support for Telit ME910 PID 0x1100.

Signed-off-by: Daniele Palmas <dnlplm@gmail.com>
---

0x1100 composition is:

tty + qdss + tty + rmnet

The lsusb output follows:

Bus 003 Device 018: ID 1bc7:1100 Telit Wireless Solutions 
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               2.00
  bDeviceClass            0 (Defined at Interface level)
  bDeviceSubClass         0 
  bDeviceProtocol         0 
  bMaxPacketSize0        64
  idVendor           0x1bc7 Telit Wireless Solutions
  idProduct          0x1100 
  bcdDevice            0.00
  iManufacturer           3 Telit
  iProduct                2 Telit ME910
  iSerial                 4 1f5fec
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength          108
    bNumInterfaces          4
    bConfigurationValue     1
    iConfiguration          1 Telit Configuration
    bmAttributes         0xe0
      Self Powered
      Remote Wakeup
    MaxPower              500mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           2
      bInterfaceClass       255 Vendor Specific Class
      bInterfaceSubClass    255 Vendor Specific Subclass
      bInterfaceProtocol    255 Vendor Specific Protocol
      iInterface              0 
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval               0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x01  EP 1 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval               0
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        1
      bAlternateSetting       0
      bNumEndpoints           1
      bInterfaceClass       255 Vendor Specific Class
      bInterfaceSubClass    255 Vendor Specific Subclass
      bInterfaceProtocol    255 Vendor Specific Protocol
      iInterface              0 
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x82  EP 2 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval               0
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        2
      bAlternateSetting       0
      bNumEndpoints           3
      bInterfaceClass       255 Vendor Specific Class
      bInterfaceSubClass    255 Vendor Specific Subclass
      bInterfaceProtocol    255 Vendor Specific Protocol
      iInterface              0 
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x83  EP 3 IN
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0040  1x 64 bytes
        bInterval               5
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x84  EP 4 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval               0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x02  EP 2 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval               0
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        3
      bAlternateSetting       0
      bNumEndpoints           3
      bInterfaceClass       255 Vendor Specific Class
      bInterfaceSubClass    255 Vendor Specific Subclass
      bInterfaceProtocol    255 Vendor Specific Protocol
      iInterface              0 
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x85  EP 5 IN
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0040  1x 64 bytes
        bInterval               5
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x86  EP 6 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval               0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x03  EP 3 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval               0
Device Qualifier (for other device speed):
  bLength                10
  bDescriptorType         6
  bcdUSB               2.00
  bDeviceClass            0 (Defined at Interface level)
  bDeviceSubClass         0 
  bDeviceProtocol         0 
  bMaxPacketSize0        64
  bNumConfigurations      1
Device Status:     0x0000
  (Bus Powered)

---
 drivers/net/usb/qmi_wwan.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/usb/qmi_wwan.c b/drivers/net/usb/qmi_wwan.c
index a3ed811..d716576 100644
--- a/drivers/net/usb/qmi_wwan.c
+++ b/drivers/net/usb/qmi_wwan.c
@@ -1201,6 +1201,7 @@ static const struct usb_device_id products[] = {
 	{QMI_FIXED_INTF(0x2357, 0x0201, 4)},	/* TP-LINK HSUPA Modem MA180 */
 	{QMI_FIXED_INTF(0x2357, 0x9000, 4)},	/* TP-LINK MA260 */
 	{QMI_QUIRK_SET_DTR(0x1bc7, 0x1040, 2)},	/* Telit LE922A */
+	{QMI_FIXED_INTF(0x1bc7, 0x1100, 3)},	/* Telit ME910 */
 	{QMI_FIXED_INTF(0x1bc7, 0x1200, 5)},	/* Telit LE920 */
 	{QMI_QUIRK_SET_DTR(0x1bc7, 0x1201, 2)},	/* Telit LE920, LE920A4 */
 	{QMI_FIXED_INTF(0x1c9e, 0x9b01, 3)},	/* XS Stick W100-2 from 4G Systems */
-- 
2.7.4
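
To show how the composition above (tty + qdss + tty + rmnet, so rmnet is on
bInterfaceNumber 3) relates to the new table entry, here is a small
userspace sketch of a fixed-interface ID match. The struct, the
TOY_FIXED_INTF macro and toy_match() are hypothetical and much simpler than
the real usb_device_id matching in the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified ID entry: vendor, product, interface number. */
struct toy_usb_id {
	uint16_t vendor;
	uint16_t product;
	uint8_t intf;
};

#define TOY_FIXED_INTF(v, p, n) { (v), (p), (n) }

static const struct toy_usb_id toy_products[] = {
	TOY_FIXED_INTF(0x1bc7, 0x1100, 3),	/* Telit ME910: rmnet on intf 3 */
	TOY_FIXED_INTF(0x1bc7, 0x1200, 5),	/* Telit LE920 */
};

#define TOY_NPRODUCTS (sizeof(toy_products) / sizeof(toy_products[0]))

/* Returns 1 if (vendor, product, intf) appears in the table, else 0. */
static int toy_match(const struct toy_usb_id *table, size_t n,
		     uint16_t vendor, uint16_t product, uint8_t intf)
{
	size_t i;

	assert(table != NULL);
	for (i = 0; i < n; i++) {
		if (table[i].vendor == vendor &&
		    table[i].product == product &&
		    table[i].intf == intf)
			return 1;
	}
	return 0;
}
```

In this model only interface 3 of 1bc7:1100 matches, so the tty and qdss
interfaces (0, 1 and 2) are left to other drivers.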


* Re: ipsec doesn't route TCP with 4.11 kernel
From: Steffen Klassert @ 2017-05-03  8:21 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Don Bowman, Cong Wang, linux-kernel@vger.kernel.org, Herbert Xu,
	Linux Kernel Network Developers
In-Reply-To: <1493398002.31837.12.camel@edumazet-glaptop3.roam.corp.google.com>

On Fri, Apr 28, 2017 at 09:46:42AM -0700, Eric Dumazet wrote:
> On Fri, 2017-04-28 at 09:13 +0200, Steffen Klassert wrote:
> >          encap type espinudp sport 4500 dport 4500 addr 0.0.0.0
> > 
> > Ok, this is espinudp. This information was important.
> 
> > This is not a GRO issue as I thought, the TX side is already broken.
> > 
> > Could you please try the patch below?
> > 
> > Subject: [PATCH] esp4: Fix udpencap for local TCP packets.
> > 
> > Locally generated TCP packets are usually cloned, so we
> > do skb_cow_data() on these packets. After that we need to
> > reload the pointer to the esp header. On udpencap this
> > header has an offset to skb_transport_header, so take this
> > offset into account.
> 
> 
> It looks like locally generated TCP packets could avoid the
> skb_cow_data(), if you were using skb_header_cloned() instead of
> skb_cloned()  ?

Yes, should be possible in the codepath where we do crypto
with separate src and dst buffers. Would require some
rearrangements to make sure we don't do inplace crypto
in this case.

Thanks for the hint!
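
For readers following along, the pointer-reload issue described in the
quoted patch text can be modelled in userspace like this. It is a sketch
only: toy_pkt, toy_cow() and toy_esp_hdr() are hypothetical, and the 8-byte
offset used in the usage note assumes a plain UDP header in front of the ESP
header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, heavily simplified packet buffer. */
struct toy_pkt {
	unsigned char *data;	/* start of the buffer */
	size_t len;		/* number of valid bytes in data */
	size_t transport_off;	/* offset of the transport header */
};

/*
 * Copy-on-write stand-in: always moves the payload to a fresh buffer, so
 * any pointer previously derived from pkt->data becomes stale.
 */
static int toy_cow(struct toy_pkt *pkt)
{
	unsigned char *copy = malloc(pkt->len);

	if (!copy)
		return -1;
	memcpy(copy, pkt->data, pkt->len);
	free(pkt->data);
	pkt->data = copy;
	return 0;
}

/*
 * Recompute the ESP header location from the current buffer: transport
 * header plus the encapsulation offset (0 when there is no UDP header).
 */
static unsigned char *toy_esp_hdr(const struct toy_pkt *pkt, size_t encap_off)
{
	assert(pkt->transport_off + encap_off <= pkt->len);
	return pkt->data + pkt->transport_off + encap_off;
}
```

The point is to call toy_esp_hdr(pkt, 8) again after toy_cow() instead of
reusing a pointer computed before it, and to pass the encapsulation offset
rather than assuming the ESP header sits directly at the transport header.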


* Re: ipsec doesn't route TCP with 4.11 kernel
From: Steffen Klassert @ 2017-05-03  8:14 UTC (permalink / raw)
  To: Don Bowman
  Cc: Cong Wang, linux-kernel@vger.kernel.org, Herbert Xu,
	Linux Kernel Network Developers
In-Reply-To: <CADJev7_Tc0aRsPs0Q7Wijd-YBM39ZshitJpSo2yEqPVwag2X_Q@mail.gmail.com>

On Sat, Apr 29, 2017 at 08:39:34PM -0400, Don Bowman wrote:
> On 28 April 2017 at 03:13, Steffen Klassert
> <steffen.klassert@secunet.com> wrote:
> > On Thu, Apr 27, 2017 at 06:13:38PM -0400, Don Bowman wrote:
> >> On 27 April 2017 at 04:42, Steffen Klassert <steffen.klassert@secunet.com>
> >> wrote:
> >> > On Wed, Apr 26, 2017 at 10:01:34PM -0700, Cong Wang wrote:
> >> >> (Cc'ing netdev and IPSec maintainers)
> >> >>
> >> >> On Tue, Apr 25, 2017 at 6:08 PM, Don Bowman <db@donbowman.ca> wrote:
> >>
> 
> <snip>
> 
> confirmed: with this patch in place, TCP functions properly.

Thanks for testing!

I'll make sure to get this fix into the mainline soon.


* Re: [PATCH] net: ethernet: stmmac: properly set PS bit in MII configurations during reset
From: Giuseppe CAVALLARO @ 2017-05-03  8:13 UTC (permalink / raw)
  To: Thomas Petazzoni, Alexandre Torgue; +Cc: netdev, stable
In-Reply-To: <1493286329-24448-1-git-send-email-thomas.petazzoni@free-electrons.com>

Hello Thomas

this was initially set by using hw->link.port; both the core_init and
adjust callbacks should invoke the hook and tune the PS bit according to
the speed and mode.
So maybe ->set_ps is superfluous and you could reuse the existing hook.

let me know

Regards
peppe

On 4/27/2017 11:45 AM, Thomas Petazzoni wrote:
> On the SPEAr600 SoC, which has the dwmac1000 variant of the IP block,
> the DMA reset never succeeds when a MII PHY is used (no problem with a
> GMII PHY). The dwmac_dma_reset() function sets the
> DMA_BUS_MODE_SFT_RESET bit in the DMA_BUS_MODE register, and then
> polls until this bit clears. When a MII PHY is used, with the current
> driver, this bit never clears and the driver therefore doesn't work.
>
> The reason is that the PS bit of the GMAC_CONTROL register should be
> correctly configured for the DMA reset to work. When the PS bit is 0,
> it tells the MAC we have a GMII PHY, when the PS bit is 1, it tells
> the MAC we have a MII PHY.
>
> Doing a DMA reset clears all registers, so the PS bit is cleared as
> well. This makes the DMA reset work fine with a GMII PHY. However,
> with MII PHY, the PS bit should be set.
>
> We have identified this issue thanks to two SPEAr600 platforms:
>
>   - One equipped with a GMII PHY, with which the existing driver was
>     working fine.
>
>   - One equipped with a MII PHY, where the current driver fails because
>     the DMA reset times out.
>
> This patch fixes the problem for the MII PHY configuration, and has
> been tested with a GMII PHY configuration as well.
>
> In terms of implementation, since the ->reset() hook is implemented in
> the DMA related code, we do not want to touch the MAC registers directly
> from this function. Therefore, a ->set_ps() hook has been added to
> stmmac_ops, which gets called between the moment the reset is asserted
> and the polling loop waiting for the reset bit to clear.
>
> In order for this ->set_ps() hook to decide what to do, we pass it the
> "struct mac_device_info" so it can access the MAC registers, and the
> PHY interface type so it knows if we're using a MII PHY or not.
>
> The ->set_ps() hook is only implemented for the dwmac1000 case.
>
> Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
> Cc: <stable@vger.kernel.org>
> ---
> Do not hesitate to suggest ideas for alternative implementations, I'm
> not sure if the current proposal is the one that fits best with the
> current design of the driver.
> ---
>   drivers/net/ethernet/stmicro/stmmac/common.h         | 12 +++++++++---
>   drivers/net/ethernet/stmicro/stmmac/dwmac1000_core.c | 16 ++++++++++++++++
>   drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h     |  3 ++-
>   drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c     |  7 ++++++-
>   drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h      |  3 ++-
>   drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c      |  6 +++++-
>   drivers/net/ethernet/stmicro/stmmac/stmmac_main.c    |  3 ++-
>   7 files changed, 42 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/net/ethernet/stmicro/stmmac/common.h b/drivers/net/ethernet/stmicro/stmmac/common.h
> index 04d9245..d576f95 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/common.h
> +++ b/drivers/net/ethernet/stmicro/stmmac/common.h
> @@ -407,10 +407,13 @@ struct stmmac_desc_ops {
>   extern const struct stmmac_desc_ops enh_desc_ops;
>   extern const struct stmmac_desc_ops ndesc_ops;
>   
> +struct mac_device_info;
> +
>   /* Specific DMA helpers */
>   struct stmmac_dma_ops {
>   	/* DMA core initialization */
> -	int (*reset)(void __iomem *ioaddr);
> +	int (*reset)(void __iomem *ioaddr, struct mac_device_info *hw,
> +		     phy_interface_t interface);
>   	void (*init)(void __iomem *ioaddr, struct stmmac_dma_cfg *dma_cfg,
>   		     u32 dma_tx, u32 dma_rx, int atds);
>   	/* Configure the AXI Bus Mode Register */
> @@ -445,12 +448,15 @@ struct stmmac_dma_ops {
>   	void (*enable_tso)(void __iomem *ioaddr, bool en, u32 chan);
>   };
>   
> -struct mac_device_info;
> -
>   /* Helpers to program the MAC core */
>   struct stmmac_ops {
>   	/* MAC core initialization */
>   	void (*core_init)(struct mac_device_info *hw, int mtu);
> +	/* Set port select. Called between asserting DMA reset and
> +	 * waiting for the reset bit to clear.
> +	 */
> +	void (*set_ps)(struct mac_device_info *hw,
> +		       phy_interface_t interface);
>   	/* Enable and verify that the IPC module is supported */
>   	int (*rx_ipc)(struct mac_device_info *hw);
>   	/* Enable RX Queues */
> diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac1000_core.c b/drivers/net/ethernet/stmicro/stmmac/dwmac1000_core.c
> index 19b9b308..dfcbb5b 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/dwmac1000_core.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac1000_core.c
> @@ -75,6 +75,21 @@ static void dwmac1000_core_init(struct mac_device_info *hw, int mtu)
>   #endif
>   }
>   
> +static void dwmac1000_set_ps(struct mac_device_info *hw,
> +			     phy_interface_t interface)
> +{
> +	void __iomem *ioaddr = hw->pcsr;
> +	u32 value = readl(ioaddr + GMAC_CONTROL);
> +
> +	/* When a MII PHY is used, we must set the PS bit for the DMA
> +	 * reset to succeed.
> +	 */
> +	if (interface == PHY_INTERFACE_MODE_MII)
> +		value |= GMAC_CONTROL_PS;
> +
> +	writel(value, ioaddr + GMAC_CONTROL);
> +}
> +
>   static int dwmac1000_rx_ipc_enable(struct mac_device_info *hw)
>   {
>   	void __iomem *ioaddr = hw->pcsr;
> @@ -488,6 +503,7 @@ static void dwmac1000_debug(void __iomem *ioaddr, struct stmmac_extra_stats *x)
>   
>   static const struct stmmac_ops dwmac1000_ops = {
>   	.core_init = dwmac1000_core_init,
> +	.set_ps = dwmac1000_set_ps,
>   	.rx_ipc = dwmac1000_rx_ipc_enable,
>   	.dump_regs = dwmac1000_dump_regs,
>   	.host_irq_status = dwmac1000_irq_status,
> diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h b/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h
> index 1b06df7..e9c6c49 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h
> +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h
> @@ -183,7 +183,8 @@
>   #define DMA_CHAN0_DBG_STAT_RPS		GENMASK(11, 8)
>   #define DMA_CHAN0_DBG_STAT_RPS_SHIFT	8
>   
> -int dwmac4_dma_reset(void __iomem *ioaddr);
> +int dwmac4_dma_reset(void __iomem *ioaddr, struct mac_device_info *hw,
> +		     phy_interface_t interface);
>   void dwmac4_enable_dma_transmission(void __iomem *ioaddr, u32 tail_ptr);
>   void dwmac4_enable_dma_irq(void __iomem *ioaddr);
>   void dwmac410_enable_dma_irq(void __iomem *ioaddr);
> diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c b/drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c
> index c7326d5..485eecb 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c
> @@ -14,7 +14,8 @@
>   #include "dwmac4_dma.h"
>   #include "dwmac4.h"
>   
> -int dwmac4_dma_reset(void __iomem *ioaddr)
> +int dwmac4_dma_reset(void __iomem *ioaddr, struct mac_device_info *hw,
> +		     phy_interface_t interface)
>   {
>   	u32 value = readl(ioaddr + DMA_BUS_MODE);
>   	int limit;
> @@ -22,6 +23,10 @@ int dwmac4_dma_reset(void __iomem *ioaddr)
>   	/* DMA SW reset */
>   	value |= DMA_BUS_MODE_SFT_RESET;
>   	writel(value, ioaddr + DMA_BUS_MODE);
> +
> +	if (hw->mac->set_ps)
> +		hw->mac->set_ps(hw, interface);
> +
>   	limit = 10;
>   	while (limit--) {
>   		if (!(readl(ioaddr + DMA_BUS_MODE) & DMA_BUS_MODE_SFT_RESET))
> diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h b/drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h
> index 56e485f..25ae028 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h
> +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h
> @@ -144,6 +144,7 @@ void dwmac_dma_stop_tx(void __iomem *ioaddr);
>   void dwmac_dma_start_rx(void __iomem *ioaddr);
>   void dwmac_dma_stop_rx(void __iomem *ioaddr);
>   int dwmac_dma_interrupt(void __iomem *ioaddr, struct stmmac_extra_stats *x);
> -int dwmac_dma_reset(void __iomem *ioaddr);
> +int dwmac_dma_reset(void __iomem *ioaddr, struct mac_device_info *hw,
> +		    phy_interface_t interface);
>   
>   #endif /* __DWMAC_DMA_H__ */
> diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c b/drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c
> index e60bfca..1a17df5 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c
> @@ -23,7 +23,8 @@
>   
>   #define GMAC_HI_REG_AE		0x80000000
>   
> -int dwmac_dma_reset(void __iomem *ioaddr)
> +int dwmac_dma_reset(void __iomem *ioaddr, struct mac_device_info *hw,
> +		    phy_interface_t interface)
>   {
>   	u32 value = readl(ioaddr + DMA_BUS_MODE);
>   	int err;
> @@ -32,6 +33,9 @@ int dwmac_dma_reset(void __iomem *ioaddr)
>   	value |= DMA_BUS_MODE_SFT_RESET;
>   	writel(value, ioaddr + DMA_BUS_MODE);
>   
> +	if (hw->mac->set_ps)
> +		hw->mac->set_ps(hw, interface);
> +
>   	err = readl_poll_timeout(ioaddr + DMA_BUS_MODE, value,
>   				 !(value & DMA_BUS_MODE_SFT_RESET),
>   				 100000, 10000);
> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> index 4498a38..66bc218 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> @@ -1585,7 +1585,8 @@ static int stmmac_init_dma_engine(struct stmmac_priv *priv)
>   	if (priv->extend_desc && (priv->mode == STMMAC_RING_MODE))
>   		atds = 1;
>   
> -	ret = priv->hw->dma->reset(priv->ioaddr);
> +	ret = priv->hw->dma->reset(priv->ioaddr, priv->hw,
> +				   priv->plat->interface);
>   	if (ret) {
>   		dev_err(priv->device, "Failed to reset the dma\n");
>   		return ret;
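
The ordering described in the quoted commit message (assert the reset, set
PS if a MII PHY is used, then poll) can be simulated in userspace as
follows. Everything here is a model under assumptions: the bit positions,
struct toy_mac and toy_hw_tick() are hypothetical and only encode the
behaviour reported for the SPEAr600, not the real register layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit positions, for illustration only. */
#define TOY_SFT_RESET	(1u << 0)
#define TOY_PS		(1u << 15)

struct toy_mac {
	uint32_t dma_bus_mode;
	uint32_t gmac_control;
	int phy_is_mii;		/* 1 for a MII PHY, 0 for a GMII PHY */
};

/*
 * Simulated hardware step: the reset bit self-clears only when the PS bit
 * matches the attached PHY type (the behaviour described in the report).
 */
static void toy_hw_tick(struct toy_mac *m)
{
	int ps = !!(m->gmac_control & TOY_PS);

	if ((m->dma_bus_mode & TOY_SFT_RESET) && ps == m->phy_is_mii)
		m->dma_bus_mode &= ~TOY_SFT_RESET;
}

/* Returns 0 if the reset completed within the poll limit, -1 otherwise. */
static int toy_dma_reset(struct toy_mac *m, int set_ps)
{
	int limit = 10;

	assert(m != NULL);
	m->gmac_control = 0;			/* reset clears PS too */
	m->dma_bus_mode |= TOY_SFT_RESET;	/* assert the software reset */

	/* Between asserting the reset and polling: restore PS for MII. */
	if (set_ps && m->phy_is_mii)
		m->gmac_control |= TOY_PS;

	while (limit--) {
		toy_hw_tick(m);
		if (!(m->dma_bus_mode & TOY_SFT_RESET))
			return 0;
	}
	return -1;
}
```

In this model a GMII PHY resets fine either way, while a MII PHY only
completes the reset when the PS bit is set before polling.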


* Re: [net-next PATCH 1/4] samples/bpf: adjust rlimit RLIMIT_MEMLOCK for traceex2, tracex3 and tracex4
From: Jesper Dangaard Brouer @ 2017-05-03  8:12 UTC (permalink / raw)
  To: Alexei Starovoitov; +Cc: kafai, netdev, eric, Daniel Borkmann, brouer
In-Reply-To: <20170503005314.7oovr764r3e4elzd@ast-mbp>

On Tue, 2 May 2017 17:53:16 -0700
Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:

> On Tue, May 02, 2017 at 02:31:50PM +0200, Jesper Dangaard Brouer wrote:
> > Needed to adjust max locked memory RLIMIT_MEMLOCK for testing these bpf samples
> > as these are using more and larger maps than can fit in distro default 64Kbytes limit.
> > 
> > Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>  
> ...
> > +	struct rlimit r = {1024*1024, RLIM_INFINITY};  
> ...
> > +	struct rlimit r = {1024*1024, RLIM_INFINITY};  
> 
> why magic numbers?
> All other samples do
> struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};

I just wanted to provide some examples showing that it is possible to
set some reasonable limit.

The RLIM_INFINITY setting basically just disables the kernel's
memory limit checks, and it is sort of a bad coding pattern (that
people will copy), as the two example programs do not need much.

 
> > +	if (setrlimit(RLIMIT_MEMLOCK, &r)) {
> > +		perror("setrlimit(RLIMIT_MEMLOCK)");  
> 
> ip_tunnel.c test does:
> perror("setrlimit(RLIMIT_MEMLOCK, RLIM_INFINITY)");
> Few others do:
> assert(!setrlimit(RLIMIT_MEMLOCK, &r));
> and the rest just:
> setrlimit(RLIMIT_MEMLOCK, &r);
> 
> We probably need to move this to a helper.
> 
> > +	struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};  
> 
> here it's consistent :)
> 
> > +	if (setrlimit(RLIMIT_MEMLOCK, &r)) {
> > +		perror("setrlimit(RLIMIT_MEMLOCK, RLIM_INFINITY)");  
> 
> but with different perror ?
> Let's do a common helper for all?

Sure, it makes sense to streamline this into a helper, just not in this
patchset ;-)  Let's do that later...

And I would argue that this helper should allow users to specify some
expected/reasonable memory usage size, as the kernel side checks would
then provide some value, instead of being effectively disabled.  I can
easily imagine someone increasing a _kern.c hash map max size to
100 million, without realizing that this can OOM the machine.
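
A rough sketch of what such a helper might look like, assuming the caller
passes an explicit byte budget. The name bpf_sample_set_memlock() is
hypothetical, and unlike the {1024*1024, RLIM_INFINITY} initializer quoted
above, this version sets the soft and hard limits to the same value, so
raising the limit above the current hard limit would still need privileges.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/resource.h>

/*
 * Hypothetical helper: cap locked memory at 'bytes' instead of disabling
 * the check with RLIM_INFINITY. Returns 0 on success, -1 on failure.
 */
static int bpf_sample_set_memlock(rlim_t bytes)
{
	struct rlimit r = { .rlim_cur = bytes, .rlim_max = bytes };

	/* The whole point is a finite budget, so reject "unlimited". */
	assert(bytes != RLIM_INFINITY);

	if (setrlimit(RLIMIT_MEMLOCK, &r)) {
		perror("setrlimit(RLIMIT_MEMLOCK)");
		return -1;
	}
	return 0;
}
```

A sample would then call something like bpf_sample_set_memlock(1024 * 1024)
before loading its maps, and a map that grows far beyond the budget fails
at creation time instead of exhausting memory.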

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer


* Re: [PATCH] brcmfmac: btcoex: replace init_timer with setup_timer
From: Arend van Spriel @ 2017-05-03  8:05 UTC (permalink / raw)
  To: Xie Qirong, Franky Lin, Hante Meuleman, Kalle Valo
  Cc: linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	brcm80211-dev-list.pdl-dY08KVG/lbpWk0Htik3J/w,
	netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, Piotr Haber
In-Reply-To: <20170503073555.3922-1-cheerx1994-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>

On 5/3/2017 9:35 AM, Xie Qirong wrote:
> Signed-off-by: Xie Qirong <cheerx1994-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> ---
> 
>   setup_timer.cocci suggested the following improvement:
>   drivers/net/wireless/broadcom/brcm80211/brcmfmac/btcoex.c:383:1-11: Use
>   setup_timer function for function on line 384.

Move the text above before your sign-off so it will end up in the git 
commit message.

When done you may also add my acknowledgement, i.e.:

Acked-by: Arend van Spriel <arend.vanspriel-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>

Regards,
Arend

^ permalink raw reply

* Re: [PATCH net-next v2] net: ipv6: make sure multicast packets are not forwarded beyond the different scopes
From: Donatas Abraitis @ 2017-05-03  7:53 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, stable
In-Reply-To: <20170502.145923.66844914584656456.davem@davemloft.net>

Looks like there is this test already:

                if (IPV6_ADDR_MC_SCOPE(&ipv6_hdr(skb)->daddr) <=
                    IPV6_ADDR_SCOPE_NODELOCAL &&
                    !(dev->flags & IFF_LOOPBACK)) {
                        kfree_skb(skb);
                        return 0;
                }

On Tue, May 2, 2017 at 9:59 PM, David Miller <davem@davemloft.net> wrote:
> From: Donatas Abraitis <donatas.abraitis@gmail.com>
> Date: Thu, 27 Apr 2017 10:12:02 +0300
>
>>           RFC4291 2.7 Routers must not forward any multicast packets
>>           beyond of the scope indicated by the scop field in the
>>           destination multicast address.
>>
>> Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
>
> I think it's a ">=" test which is needed here, not pure equality.
> Scopes are subsets of other scopes and are therefore allowed within
> each other.
>
> Did you actually see misbehavior due to this issue, or see a real
> bona fide conformance test fail?
>
> If you're just reading the RFC and sticking tests here and there based
> upon what you read, without any testing or real life verification of
> the issue, this is _strongly_ discouraged.
>
> It would even be ok if you merely showed how another open source
> networking stack makes this test.



-- 
Donatas

^ permalink raw reply

* [PATCH net] tg3: don't clear stats while tg3_close
From: YueHaibing @ 2017-05-03  7:51 UTC (permalink / raw)
  To: davem, netdev; +Cc: weiyongjun1

Currently the tg3 NIC's stats are cleared after ifdown/ifup. bond_get_stats
traverses its slaves to get statistics and accumulates the increments. If a tg3
NIC is added to a bond as a slave, ifdown/ifup will cause the bonding stats to
become a tremendous value (e.g. 1638.3 PiB) because of the negative increment.

Fixes: 92feeabf3f67 ("tg3: Save stats across chip resets")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
---
 drivers/net/ethernet/broadcom/tg3.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c
index 30d1eb9..29beba1 100644
--- a/drivers/net/ethernet/broadcom/tg3.c
+++ b/drivers/net/ethernet/broadcom/tg3.c
@@ -11722,10 +11722,6 @@ static int tg3_close(struct net_device *dev)
 
 	tg3_stop(tp);
 
-	/* Clear stats across close / open calls */
-	memset(&tp->net_stats_prev, 0, sizeof(tp->net_stats_prev));
-	memset(&tp->estats_prev, 0, sizeof(tp->estats_prev));
-
 	if (pci_device_is_present(tp->pdev)) {
 		tg3_power_down_prepare(tp);
 
-- 
2.5.0

^ permalink raw reply related

* [PATCH] brcmfmac: btcoex: replace init_timer with setup_timer
From: Xie Qirong @ 2017-05-03  7:35 UTC (permalink / raw)
  To: Arend van Spriel, Franky Lin, Hante Meuleman, Kalle Valo
  Cc: linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	brcm80211-dev-list.pdl-dY08KVG/lbpWk0Htik3J/w,
	netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, Piotr Haber, Xie Qirong

Signed-off-by: Xie Qirong <cheerx1994-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
---

 setup_timer.cocci suggested the following improvement:
 drivers/net/wireless/broadcom/brcm80211/brcmfmac/btcoex.c:383:1-11: Use
 setup_timer function for function on line 384.

 Patch was compile checked with: x86_64_defconfig + CONFIG_BRCMFMAC=y +
 CONFIG_BRCMFMAC_USB=y + CONFIG_BRCMFMAC_PCIE=y + CONFIG_BRCM_TRACING=y +
 CONFIG_BRCMDBG=y

 Kernel version: next-20170502 (localversion-next is next-20170502)

 drivers/net/wireless/broadcom/brcm80211/brcmfmac/btcoex.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/btcoex.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/btcoex.c
index 14a70d4..3559fb5 100644
--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/btcoex.c
+++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/btcoex.c
@@ -380,9 +380,7 @@ int brcmf_btcoex_attach(struct brcmf_cfg80211_info *cfg)
 	/* Set up timer for BT  */
 	btci->timer_on = false;
 	btci->timeout = BRCMF_BTCOEX_OPPR_WIN_TIME;
-	init_timer(&btci->timer);
-	btci->timer.data = (ulong)btci;
-	btci->timer.function = brcmf_btcoex_timerfunc;
+	setup_timer(&btci->timer, brcmf_btcoex_timerfunc, (ulong)btci);
 	btci->cfg = cfg;
 	btci->saved_regs_part1 = false;
 	btci->saved_regs_part2 = false;
-- 
2.9.3
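
For readers unfamiliar with the two forms, a userspace mock of what the
conversion amounts to.  The mock_* names and struct are stand-ins
invented for this sketch, not the kernel's timer API; they only model
the idea that setup_timer() performs the init plus the two field
assignments in one call:

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the timer fields touched by the patch. */
struct mock_timer {
	void (*function)(unsigned long);
	unsigned long data;
	int initialized;
};

static void mock_init_timer(struct mock_timer *t)
{
	assert(t != NULL);
	t->function = NULL;
	t->data = 0;
	t->initialized = 1;
}

/* One call instead of init + two assignments. */
static void mock_setup_timer(struct mock_timer *t,
			     void (*fn)(unsigned long), unsigned long data)
{
	mock_init_timer(t);
	t->function = fn;
	t->data = data;
}

/* Invoke the callback the way an expiring timer would. */
static void mock_fire(struct mock_timer *t)
{
	assert(t != NULL && t->initialized && t->function != NULL);
	t->function(t->data);
}

/* Example callback: treats data as a pointer to an int counter. */
static void bump(unsigned long data)
{
	++*(int *)data;
}
```

The behaviour is unchanged by the patch; the single call just leaves
fewer places to forget one of the assignments.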

^ permalink raw reply related

* Re: [PATCH 0/9] net: thunderx: Adds XDP support
From: Sunil Kovvuri @ 2017-05-03  7:28 UTC (permalink / raw)
  To: David Miller; +Cc: Linux Netdev List, LKML, LAKML, Sunil Goutham
In-Reply-To: <20170502.154744.1762061314370744901.davem@davemloft.net>

On Wed, May 3, 2017 at 1:17 AM, David Miller <davem@davemloft.net> wrote:
> From: sunil.kovvuri@gmail.com
> Date: Tue,  2 May 2017 18:36:49 +0530
>
>> From: Sunil Goutham <sgoutham@cavium.com>
>>
>> This patch series adds support for XDP to ThunderX NIC driver
>> which is used on CN88xx, CN81xx and CN83xx platforms.
>>
>> Patches 1-4 are performance improvement and cleanup patches
>> which are done keeping XDP performance bottlenecks in view.
>> Rest of the patches adds actual XDP support.
>
> Series applied, thanks for doing this work.

Thanks.

>
> Do you have any performance numbers?

Below are the forwarding numbers on a single core.
with network stack: 0.32 Mpps
with XDP (XDP_TX): 3 Mpps
and XDP_DROP: 3.8 Mpps
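
To put these rates in perspective, a small sketch that converts Mpps
into a per-packet time budget and a speedup over the network-stack
baseline.  The function names are made up for this example; the inputs
are just the three numbers above.

```c
#include <assert.h>

/* 1 Mpps is 10^6 packets per second, i.e. 1000 ns per packet. */
static double ns_per_packet(double mpps)
{
	assert(mpps > 0.0);
	return 1000.0 / mpps;
}

/* Ratio of a measured rate to a baseline rate. */
static double speedup(double mpps, double baseline_mpps)
{
	assert(baseline_mpps > 0.0);
	return mpps / baseline_mpps;
}
```

With the figures above that is roughly 3125 ns per packet through the
stack versus about 333 ns for XDP_TX and about 263 ns for XDP_DROP, or
roughly 9.4x and 11.9x the baseline rate.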

Thanks,
Sunil.

^ permalink raw reply

* Miss it//Re: [PATCH v3] iov_iter: don't revert iov buffer if csum error
From: Ding Tianhong @ 2017-05-03  7:15 UTC (permalink / raw)
  To: David Miller, pabeni, edumazet, hannes, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, LinuxArm, weiyongjun (A), Al Viro
In-Reply-To: <12d4d81f-40c7-c83d-11d6-290acc084695@huawei.com>

I missed it; this is already in the kernel tree. Sorry for the noise.

On 2017/5/3 15:02, Ding Tianhong wrote:
> Commit 327868212381 (make skb_copy_datagram_msg() et.al. preserve
> ->msg_iter on error) reverts the iov buffer if the copy to the iter
> fails, but no datagram data is copied when skb_checksum_complete
> reports an error, so there is no need to revert any data in that case.
> 
> v2: Sabrina noticed that returning -EFAULT on a checksum error is not correct
>     here; it would confuse the caller about the return value, so fix it.
> 
> v3: Following Al's suggestion, directly returning -EINVAL when
>     __skb_checksum_complete() returns an error is a simpler solution.
> 
> Fixes: 327868212381 ("make skb_copy_datagram_msg() et.al. preserve->msg_iter on error")
> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
> Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
> ---
>  net/core/datagram.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/net/core/datagram.c b/net/core/datagram.c
> index 0306543..726bf8a 100644
> --- a/net/core/datagram.c
> +++ b/net/core/datagram.c
> @@ -719,7 +719,7 @@ int skb_copy_and_csum_datagram_msg(struct sk_buff *skb,
> 
>  	if (msg_data_left(msg) < chunk) {
>  		if (__skb_checksum_complete(skb))
> -			goto csum_error;
> +			return -EINVAL;
>  		if (skb_copy_datagram_msg(skb, hlen, msg, chunk))
>  			goto fault;
>  	} else {
> 

^ permalink raw reply

* Re: [PATCH v3] iov_iter: don't revert iov buffer if csum error
From: Al Viro @ 2017-05-03  7:07 UTC (permalink / raw)
  To: Ding Tianhong
  Cc: David Miller, pabeni, edumazet, hannes, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, LinuxArm, weiyongjun (A)
In-Reply-To: <12d4d81f-40c7-c83d-11d6-290acc084695@huawei.com>

On Wed, May 03, 2017 at 03:02:32PM +0800, Ding Tianhong wrote:
> Commit 327868212381 (make skb_copy_datagram_msg() et.al. preserve
> ->msg_iter on error) reverts the iov buffer if the copy to the iter
> fails, but no datagram data is copied when skb_checksum_complete
> reports an error, so there is no need to revert any data in that case.

See a6a5993243550b09f620941dea741b7421fdf79c in mainline...

^ permalink raw reply

* [PATCH v3] iov_iter: don't revert iov buffer if csum error
From: Ding Tianhong @ 2017-05-03  7:02 UTC (permalink / raw)
  To: David Miller, pabeni, edumazet, hannes, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, LinuxArm, weiyongjun (A), Al Viro

Commit 327868212381 (make skb_copy_datagram_msg() et.al. preserve
->msg_iter on error) reverts the iov buffer if the copy to the iter
fails, but no datagram data is copied when skb_checksum_complete
reports an error, so there is no need to revert any data in that case.

v2: Sabrina noticed that returning -EFAULT on a checksum error is not correct
    here; it would confuse the caller about the return value, so fix it.

v3: Following Al's suggestion, directly returning -EINVAL when
    __skb_checksum_complete() returns an error is a simpler solution.

Fixes: 327868212381 ("make skb_copy_datagram_msg() et.al. preserve->msg_iter on error")
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
---
 net/core/datagram.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/datagram.c b/net/core/datagram.c
index 0306543..726bf8a 100644
--- a/net/core/datagram.c
+++ b/net/core/datagram.c
@@ -719,7 +719,7 @@ int skb_copy_and_csum_datagram_msg(struct sk_buff *skb,

 	if (msg_data_left(msg) < chunk) {
 		if (__skb_checksum_complete(skb))
-			goto csum_error;
+			return -EINVAL;
 		if (skb_copy_datagram_msg(skb, hlen, msg, chunk))
 			goto fault;
 	} else {
-- 
1.8.3.1

^ permalink raw reply related

* Re: [PATCH net-next] net/esp4: Fix invalid esph pointer crash
From: Steffen Klassert @ 2017-05-03  7:02 UTC (permalink / raw)
  To: ilant; +Cc: netdev
In-Reply-To: <20170430133438.31962-1-ilant@mellanox.com>

On Sun, Apr 30, 2017 at 04:34:38PM +0300, ilant@mellanox.com wrote:
> From: Ilan Tayari <ilant@mellanox.com>
> 
> Both esp_output and esp_xmit take a pointer to the ESP header
> and place it in esp_info struct prior to calling esp_output_head.
> 
> Inside esp_output_head, the call to esp_output_udp_encap
> makes sure to update the pointer if it gets invalid.
> However, if esp_output_head itself calls skb_cow_data, the
> pointer is not updated and stays invalid, causing a crash
> after esp_output_head returns.
> 
> Update the pointer if it becomes invalid in esp_output_head
> 
> Fixes: fca11ebde3f0 ("esp4: Reorganize esp_output")
> Signed-off-by: Ilan Tayari <ilant@mellanox.com>
> ---
>  net/ipv4/esp4.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/net/ipv4/esp4.c b/net/ipv4/esp4.c
> index 7f2caf71212b..65cc02bd82bc 100644
> --- a/net/ipv4/esp4.c
> +++ b/net/ipv4/esp4.c
> @@ -317,6 +317,7 @@ int esp_output_head(struct xfrm_state *x, struct sk_buff *skb, struct esp_info *
>  	if (nfrags < 0)
>  		goto out;
>  	tail = skb_tail_pointer(trailer);
> +	esp->esph = ip_esp_hdr(skb);

This is not quite right for udpencap. It fixes the crash,
but introduces a bug that we already have in v4.11.

With udpencap the esp header sits at an offset from skb_transport_header;
the problem was discussed last week here:

https://lkml.org/lkml/2017/4/25/937

I plan to fix this with the patch below:

Subject: [PATCH RFC] esp4: Fix udpencap for local TCP packets.

Locally generated TCP packets are usually cloned, so we
do skb_cow_data() on these packets. After that we need to
reload the pointer to the esp header. With udpencap this
header is at an offset from skb_transport_header, so take this
offset into account.

Fixes: 67d349ed603 ("net/esp4: Fix invalid esph pointer crash")
Fixes: fca11ebde3f0 ("esp4: Reorganize esp_output")
Reported-by: Don Bowman <db@donbowman.ca>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
---
 net/ipv4/esp4.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/net/ipv4/esp4.c b/net/ipv4/esp4.c
index 65cc02b..93322f8 100644
--- a/net/ipv4/esp4.c
+++ b/net/ipv4/esp4.c
@@ -248,6 +248,7 @@ int esp_output_head(struct xfrm_state *x, struct sk_buff *skb, struct esp_info *
 	u8 *tail;
 	u8 *vaddr;
 	int nfrags;
+	int esph_offset;
 	struct page *page;
 	struct sk_buff *trailer;
 	int tailen = esp->tailen;
@@ -313,11 +314,13 @@ int esp_output_head(struct xfrm_state *x, struct sk_buff *skb, struct esp_info *
 	}
 
 cow:
+	esph_offset = (unsigned char *)esp->esph - skb_transport_header(skb);
+
 	nfrags = skb_cow_data(skb, tailen, &trailer);
 	if (nfrags < 0)
 		goto out;
 	tail = skb_tail_pointer(trailer);
-	esp->esph = ip_esp_hdr(skb);
+	esp->esph = (struct ip_esp_hdr *)(skb_transport_header(skb) + esph_offset);
 
 skip_cow:
 	esp_output_fill_trailer(tail, esp->tfclen, esp->plen, esp->proto);
-- 
2.7.4
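
The pattern used in the patch above (remember the offset before a call
that may move the data, then rebuild the pointer afterwards) can be
shown with a userspace analogy.  This is only a sketch: struct buf,
buf_grow() and grow_and_reload() are invented names, with realloc()
playing the role of skb_cow_data().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct buf {
	unsigned char *data;
	size_t len;
};

/* Grow the buffer by 'extra' zeroed bytes; the data may move. */
static int buf_grow(struct buf *b, size_t extra)
{
	unsigned char *p = realloc(b->data, b->len + extra);

	if (!p)
		return -1;
	memset(p + b->len, 0, extra);
	b->data = p;
	b->len += extra;
	return 0;
}

/*
 * 'hdr' points somewhere inside b->data.  Save its offset first,
 * because the old pointer is invalid once the buffer has moved.
 */
static unsigned char *grow_and_reload(struct buf *b, unsigned char *hdr,
				      size_t extra)
{
	ptrdiff_t off;

	assert(b != NULL && hdr >= b->data && hdr < b->data + b->len);
	off = hdr - b->data;

	if (buf_grow(b, extra) < 0)
		return NULL;
	return b->data + off;
}
```

Recomputing the pointer from a fixed accessor instead (the ip_esp_hdr()
approach) only works when the header sits exactly where that accessor
expects it, which is what goes wrong in the udpencap case.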

^ permalink raw reply related

* RE: [PATCH net v3] driver: veth: Fix one possbile memleak when fail to register_netdevice
From: Gao Feng @ 2017-05-03  6:37 UTC (permalink / raw)
  To: 'Xin Long'
  Cc: 'Gao Feng', 'davem', jarod,
	'Stephen Hemminger', dsa, 'network dev'
In-Reply-To: <CADvbK_c32g2t-Azgf10da8qke5B+wgG4dw3jLTE2L+R2qR3xPA@mail.gmail.com>

> From: Xin Long [mailto:lucien.xin@gmail.com]
> Sent: Wednesday, May 3, 2017 1:38 PM
> On Wed, May 3, 2017 at 10:07 AM, Gao Feng <gfree.wind@foxmail.com>
> wrote:
> >> From: netdev-owner@vger.kernel.org
> >> [mailto:netdev-owner@vger.kernel.org]
> >> On Behalf Of Xin Long
> >> Sent: Wednesday, May 3, 2017 12:59 AM On Tue, May 2, 2017 at 7:03 PM,
> >> Gao Feng <gfree.wind@vip.163.com> wrote:
> >> >> From: Xin Long [mailto:lucien.xin@gmail.com]
> >> >> Sent: Tuesday, May 2, 2017 3:56 PM On Sat, Apr 29, 2017 at 11:51
> >> >> AM,  <gfree.wind@foxmail.com> wrote:
> >> >> > From: Gao Feng <gfree.wind@foxmail.com>
[...]
> > The fix you mentioned change the original logic.
> > The dev->vstats is freed in advance in the ndo_uninit, not destructor.
> > It may break the backward.
> Sorry, I didn't get what you mean by "backward".
> I can't see any problem that it would cause.
> Would you say this patch also breaks the 'backward'?
> https://patchwork.ozlabs.org/patch/748964/
> 
> It's really weird to check dev->reg_state in ndo_uninit; ndo_uninit is supposed
> to free the memory allocated in ndo_init.
> 

I am not sure whether it would break backward compatibility, which is why I said it MAY break.
I assumed someone might access dev->vstats after ndo_uninit,
because the current veth driver frees the memory in the destructor.
I chose this approach because I don't want to introduce new bugs while fixing one.

If you're sure it is safe to free dev->vstats in ndo_uninit, I would like to update it.

BTW, there are many drivers with a possible memleak.
You can find the list at https://www.mail-archive.com/netdev@vger.kernel.org/msg166629.html.

Some drivers allocate resources in ndo_init, free some of them in ndo_uninit and free the rest in the destructor.
I think there are reasons for that.
We cannot simply move every free from the destructor into ndo_uninit. What's your opinion?

Best Regards
Feng

^ permalink raw reply

* Re: [PATCH iproute2 net 0/8] tc/act_pedit: Support offset relative to conventional header
From: Amir Vadai @ 2017-05-03  6:27 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev, Or Gerlitz, Jamal Hadi Salim
In-Reply-To: <20170501092625.30274bee@xeon-e3>

On Mon, May 01, 2017 at 09:26:25AM -0700, Stephen Hemminger wrote:
> On Sun, 23 Apr 2017 15:53:48 +0300
> Amir Vadai <amir@vadai.me> wrote:
> 
> > Hi Stephen,
> > 
> > This patchset extends pedit to support modifying a field in an offset relative
> > to the conventional network headers (kernel support was added [1] in 4.11 rc1).
> > Without the extended pedit, user could specify fields in TCP and ICMP headers,
> > but the kernel code was using an offset relative to the beginning of the IP
> > header. This will break if IP header length is greater than the minimal value
> > of 20, or if L3 is not IPv4.
> > 
> > It also introduces support in manipulating ETH, TCP, UDP and IP.ttl fields and
> > a new command to increase/decrease the value of a field (current use case is IP.ttl).
> > 
> > Since there might be deployments already using pedit, special consideration was
> > taken, not to break those scripts - only by specifying the special keyword
> > 'ex', the extended capabilities are available, thus there should be no impact
> > on existing scripts.
> > Also, the new code can live together with rules added by the old code. It
> > supports both the old netlink and the new one.
> > 
> > This patchset is against the master and not net-next as the functionality was
> > added in 4.11
> > 
> > Thanks,
> > Amir
> > 
> > [1] - 71d0ed7079df ("net/act_pedit: Support using offset relative to the
> >                      conventional network headers")

[...]

> 
> Applied. Then I cleaned up long lines

Thanks. Will make sure to clean up long lines in future patches.

^ permalink raw reply

* Re: [net-next PATCH 0/4] Improve bpf ELF-loader under samples/bpf
From: Jesper Dangaard Brouer @ 2017-05-03  6:16 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: kafai, netdev, eric, Daniel Borkmann, Alexei Starovoitov, brouer
In-Reply-To: <5908F5AC.6000703@iogearbox.net>

On Tue, 02 May 2017 23:10:04 +0200
Daniel Borkmann <daniel@iogearbox.net> wrote:

> On 05/02/2017 02:31 PM, Jesper Dangaard Brouer wrote:
> > This series improves and fixes bpf ELF loader and programs under
> > samples/bpf.  The bpf_load.c created some hard to debug issues when
> > the struct (bpf_map_def) used in the ELF maps section format changed
> > in commit fb30d4b71214 ("bpf: Add tests for map-in-map").
> >
> > This was hotfixed in commit 409526bea3c3 ("samples/bpf: bpf_load.c
> > detect and abort if ELF maps section size is wrong") by detecting the
> > issue and aborting the program.
> >
> > In most situations the bpf-loader should be able to handle these kinds
> > of changes to the struct size.  This patch series aims to do proper
> > backward and forward compatibility handling when loading ELF files.
> >
> > This series also adjusts the callback that was introduced in commit
> > 9fd63d05f3e8 ("bpf: Allow bpf sample programs (*_user.c) to change
> > bpf_map_def") to use the new bpf_map_data structure, before more users
> > start to use this callback.
> >
> > Hoping these changes can make the merge window, as above mentioned
> > commits have not been merged yet, and it would be good to avoid users
> > hitting these issues.  
> 
> Overall, set looks good to me. The last patch doesn't have a
> user yet, so probably better to drop it until there is an actual
> user in the tree.

The reason for simply exporting map_data[] was that in patch 3, the
data-struct (bpf_map_data) is already exposed, thus users can already
grab and store those into a separate data structure.  Thus, it seemed
natural to simply export/expose the map_data[] array directly.  I guess
I could have combined patches 4 and 3, as patch 3 uses the data struct,
but in an indirect way.

To Daniel, if you still feel we should drop patch 4, then let me know.
It is only the other patches that are time critical, as patch 4 is
trivial to introduce once the first sample program uses this directly
(instead of indirectly through the callback).


> Long term, I'd like to see the samples being migrated to use the
> tools/lib/bpf/ library from the tree, so that we can avoid duplicating
> effort with having two libs in the tree (f.e. elf map validation is
> performed to a certain degree in the other one, but w/o compat
> support last time I looked).

Yes, I agree that we should migrate to use the tools/lib/bpf/ library.
But as you also say, it actually has similar compat loader issues,
although it does more validation.  Once we start this migration, I'll
also fix the compat loader issues in this lib.
 
> Anyway, other than that:
> 
> Acked-by: Daniel Borkmann <daniel@iogearbox.net>

Thanks

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply

* Re: [net-next PATCH 2/4] samples/bpf: make bpf_load.c code compatible with ELF maps section changes
From: Jesper Dangaard Brouer @ 2017-05-03  5:48 UTC (permalink / raw)
  To: Alexei Starovoitov; +Cc: kafai, netdev, eric, Daniel Borkmann, brouer
In-Reply-To: <20170503005449.urnux43sril3ganq@ast-mbp>

On Tue, 2 May 2017 17:54:51 -0700
Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:

> On Tue, May 02, 2017 at 02:31:56PM +0200, Jesper Dangaard Brouer wrote:
> > This patch does proper parsing of the ELF "maps" section, in-order to
> > be both backwards and forwards compatible with changes to the map
> > definition struct bpf_map_def, which gets compiled into the ELF file.
> > 
> > The assumption is that new features with value zero, means that they
> > are not in-use.  For backward compatibility where loading an ELF file
> > with a smaller struct bpf_map_def, only copy objects ELF size, leaving
> > rest of loaders struct zero.  For forward compatibility where ELF file
> > have a larger struct bpf_map_def, only copy loaders own struct size
> > and verify that rest of the larger struct is zero, assuming this means
> > the newer feature was not activated, thus it should be safe for this
> > older loader to load this newer ELF file.
> > 
> > Fixes: fb30d4b71214 ("bpf: Add tests for map-in-map")
> > Fixes: 409526bea3c3 ("samples/bpf: bpf_load.c detect and abort if ELF maps section size is wrong")
> > Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>  
> 
> I would just merge patches 2 and 3 to reduce churn,
> but it looks like great improvement already.

I could have combined them, but I prefer keeping them separate to keep
the ELF changes separated from changing a sample program e.g.
map_perf_test_user.c.  IMHO it is cleaner this way.
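
For anyone skimming the thread, the compat rule from the patch
description boils down to the sketch below.  This is not the bpf_load.c
code: struct map_def and copy_map_def() are simplified stand-ins made up
for this mail, and the ELF entry is treated as a plain byte array.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for the loader's view of a map definition. */
struct map_def {
	unsigned int type;
	unsigned int key_size;
	unsigned int value_size;
	unsigned int max_entries;
	unsigned int map_flags;
};

/*
 * Copy one map definition of 'src_sz' bytes from the ELF section.
 * Smaller ELF struct: copy what is there, the rest stays zero.
 * Larger ELF struct: accept only if the unknown tail is all zero.
 * Returns 0 on success, -1 if a newer feature appears to be in use.
 */
static int copy_map_def(struct map_def *dst, const unsigned char *src,
			size_t src_sz)
{
	size_t dst_sz = sizeof(*dst);
	size_t i;

	assert(dst != NULL && src != NULL);
	memset(dst, 0, dst_sz);

	if (src_sz <= dst_sz) {
		memcpy(dst, src, src_sz);
		return 0;
	}

	for (i = dst_sz; i < src_sz; i++) {
		if (src[i] != 0)
			return -1;
	}
	memcpy(dst, src, dst_sz);
	return 0;
}
```

The key assumption, as stated in the patch description, is that a zero
value in a newer field means the feature is not in use.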

> Acked-by: Alexei Starovoitov <ast@kernel.org>
 
Thanks

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply

* Re: [PATCH net v3] driver: veth: Fix one possbile memleak when fail to register_netdevice
From: Xin Long @ 2017-05-03  5:37 UTC (permalink / raw)
  To: Gao Feng; +Cc: Gao Feng, davem, jarod, Stephen Hemminger, dsa, network dev
In-Reply-To: <000c01d2c3b2$0925e880$1b71b980$@foxmail.com>

On Wed, May 3, 2017 at 10:07 AM, Gao Feng <gfree.wind@foxmail.com> wrote:
>> From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org]
>> On Behalf Of Xin Long
>> Sent: Wednesday, May 3, 2017 12:59 AM
>> On Tue, May 2, 2017 at 7:03 PM, Gao Feng <gfree.wind@vip.163.com> wrote:
>> >> From: Xin Long [mailto:lucien.xin@gmail.com]
>> >> Sent: Tuesday, May 2, 2017 3:56 PM
>> >> On Sat, Apr 29, 2017 at 11:51 AM,  <gfree.wind@foxmail.com> wrote:
>> >> > From: Gao Feng <gfree.wind@foxmail.com>
>> > [...]
>> >> > -static void veth_dev_free(struct net_device *dev)
>> >> > +static void veth_destructor_free(struct net_device *dev)
>> >> >  {
>> >> >         free_percpu(dev->vstats);
>> >> > +}
>> >> not sure why you needed to add this function.
>> >> to use free_percpu() directly may be clearer.
>> >
>> > Because both of ndo_uninit and destructor need to perform same free
>> statements.
>> > It is good at maintain the codes with the common function.
>> >>
>> >> > +
>> >> > +static void veth_dev_uninit(struct net_device *dev) {
>> >> call free_percpu() here, no need to check dev->reg_state.
>> >> free_percpu will just return if dev->vstats is NULL.
>> >
>> > It would break the original design if don't check the reg_state.
>> > The original logic is that free the resources in the destructor, not in ndo_init.
>> I got what you're doing now, can you pls try to fix this with:
>>
>> --- a/drivers/net/veth.c
>> +++ b/drivers/net/veth.c
>> @@ -219,10 +219,9 @@ static int veth_dev_init(struct net_device *dev)
>>         return 0;
>>  }
>>
>> -static void veth_dev_free(struct net_device *dev)
>> +static void veth_dev_uninit(struct net_device *dev)
>>  {
>>         free_percpu(dev->vstats);
>> -       free_netdev(dev);
>>  }
>>
>>  #ifdef CONFIG_NET_POLL_CONTROLLER
>> @@ -279,6 +278,7 @@ static void veth_set_rx_headroom(struct net_device
>> *dev, int new_hr)
>>
>>  static const struct net_device_ops veth_netdev_ops = {
>>         .ndo_init            = veth_dev_init,
>> +       .ndo_uninit          = veth_dev_uninit,
>>         .ndo_open            = veth_open,
>>         .ndo_stop            = veth_close,
>>         .ndo_start_xmit      = veth_xmit,
>> @@ -317,7 +317,7 @@ static void veth_setup(struct net_device *dev)
>>                                NETIF_F_HW_VLAN_STAG_TX |
>>                                NETIF_F_HW_VLAN_CTAG_RX |
>>                                NETIF_F_HW_VLAN_STAG_RX);
>> -       dev->destructor = veth_dev_free;
>> +       dev->destructor = free_netdev;
>>         dev->max_mtu = ETH_MAX_MTU;
>>
>>         dev->hw_features = VETH_FEATURES;
>>
>>
>> just as what other virtual nic drivers do (vxlan, geneve, macsec, bridge ....)
>>
>
> The fix you mentioned change the original logic.
> The dev->vstats is freed in advance in the ndo_uninit, not destructor.
> It may break the backward.
Sorry, I didn't get what you mean by "backward".
I can't see any problem that it would cause.

Would you say this patch also breaks the 'backward'?
https://patchwork.ozlabs.org/patch/748964/

It's really weird to check dev->reg_state in ndo_uninit;
ndo_uninit is supposed to free the memory allocated in ndo_init.
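
A userspace mock of the pairing argued for here.  It assumes, as this
thread describes, that a registration which fails after the init
callback succeeded runs the uninit callback but never reaches the
destructor.  All mock_* names are invented for the sketch; none of this
is kernel code.

```c
#include <assert.h>
#include <stdlib.h>

struct mock_dev {
	void *stats;
	int (*init)(struct mock_dev *);
	void (*uninit)(struct mock_dev *);
};

static int live_allocs;	/* allocations not yet freed */

static int mock_init(struct mock_dev *d)
{
	d->stats = malloc(64);
	if (!d->stats)
		return -1;
	live_allocs++;
	return 0;
}

/* Frees exactly what mock_init() allocated. */
static void mock_uninit(struct mock_dev *d)
{
	if (d->stats) {
		free(d->stats);
		d->stats = NULL;
		live_allocs--;
	}
}

/* 'fail_late' simulates an error after init has already succeeded. */
static int mock_register(struct mock_dev *d, int fail_late)
{
	assert(d != NULL && d->init != NULL);

	if (d->init(d))
		return -1;
	if (fail_late) {
		if (d->uninit)
			d->uninit(d);
		return -1;	/* destructor is never called on this path */
	}
	return 0;
}
```

With the uninit callback set, a late failure leaves nothing allocated;
with it left NULL (free only in the destructor), the same failure leaks
the stats buffer.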

>
> Regards
> Feng
>
>

^ permalink raw reply

* Re: [PATCH 1/1] IB/mlx5: Add port_xmit_wait to counter registers read
From: Leon Romanovsky @ 2017-05-03  5:38 UTC (permalink / raw)
  To: Tim Wright
  Cc: matanb-VPRAkNaXOzVWk0Htik3J/w, dledford-H+wXaHxf7aLQT0dZR+AlfA,
	sean.hefty-ral2JQCrhuEAvxtiuMwx3w,
	hal.rosenstock-Re5JQEeQqe8AvxtiuMwx3w,
	saeedm-VPRAkNaXOzVWk0Htik3J/w, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <20170501163008.27043-1-tim-r/Uwd3QrhQcqdlJmJB21zg@public.gmane.org>

On Mon, May 01, 2017 at 05:30:08PM +0100, Tim Wright wrote:
> Add port_xmit_wait to the error counters read by mlx5_ib_process_mad to
> ensure sysfs port counter provides correct value for PortXmitWait.
> Otherwise the sysfs port_xmit_wait file always contains zero.
>
> The previous MAD_IFC implementation populated this counter, but it was
> removed during the migration to PPCNT for error counters (32-bit only).
>
> Signed-off-by: Tim Wright <tim-r/Uwd3QrhQcqdlJmJB21zg@public.gmane.org>
> ---
>  drivers/infiniband/hw/mlx5/mad.c | 2 ++
>  include/linux/mlx5/mlx5_ifc.h    | 4 +++-
>  2 files changed, 5 insertions(+), 1 deletion(-)
>

Thanks,
Acked-by: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

^ permalink raw reply

