Netdev List
* Re: [PATCH bluetooth-next 07/10] ipv6: introduce neighbour discovery ops
From: kbuild test robot @ 2016-04-18 14:23 UTC
  To: Alexander Aring
  Cc: kbuild-all, linux-wpan, kernel, marcel, jukka.rissanen, hannes,
	stefan, mcr, werner, linux-bluetooth, netdev, Alexander Aring,
	David S . Miller, Alexey Kuznetsov, James Morris,
	Hideaki YOSHIFUJI, Patrick McHardy
In-Reply-To: <1460977108-4675-8-git-send-email-aar@pengutronix.de>

[-- Attachment #1: Type: text/plain, Size: 9109 bytes --]

Hi Alexander,

[auto build test WARNING on bluetooth-next/master]

url:    https://github.com/0day-ci/linux/commits/Alexander-Aring/6lowpan-introduce-basic-6lowpan-nd/20160418-191825
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next.git master
config: x86_64-randconfig-i0-04181247 (attached as .config)
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

All warnings (new ones prefixed by >>):

   In file included from include/uapi/linux/stddef.h:1:0,
                    from include/linux/stddef.h:4,
                    from include/uapi/linux/posix_types.h:4,
                    from include/uapi/linux/types.h:13,
                    from include/linux/types.h:5,
                    from include/uapi/linux/capability.h:16,
                    from include/linux/capability.h:15,
                    from net/appletalk/ddp.c:54:
   include/net/ndisc.h: In function 'ndisc_is_useropt':
   include/net/ndisc.h:201:16: error: 'const struct net_device' has no member named 'ndisc_ops'
     if (likely(dev->ndisc_ops->is_useropt))
                   ^
   include/linux/compiler.h:138:43: note: in definition of macro 'likely'
    #  define likely(x) (__builtin_constant_p(x) ? !!(x) : __branch_check__(x, 1))
                                              ^
   include/net/ndisc.h:201:16: error: 'const struct net_device' has no member named 'ndisc_ops'
     if (likely(dev->ndisc_ops->is_useropt))
                   ^
   include/linux/compiler.h:138:51: note: in definition of macro 'likely'
    #  define likely(x) (__builtin_constant_p(x) ? !!(x) : __branch_check__(x, 1))
                                                      ^
   include/net/ndisc.h:201:16: error: 'const struct net_device' has no member named 'ndisc_ops'
     if (likely(dev->ndisc_ops->is_useropt))
                   ^
   include/linux/compiler.h:114:47: note: in definition of macro 'likely_notrace'
    #define likely_notrace(x) __builtin_expect(!!(x), 1)
                                                  ^
   include/linux/compiler.h:138:56: note: in expansion of macro '__branch_check__'
    #  define likely(x) (__builtin_constant_p(x) ? !!(x) : __branch_check__(x, 1))
                                                           ^
>> include/net/ndisc.h:201:6: note: in expansion of macro 'likely'
     if (likely(dev->ndisc_ops->is_useropt))
         ^
   In file included from include/net/ipv6.h:20:0,
                    from include/net/inetpeer.h:15,
                    from include/net/route.h:28,
                    from net/appletalk/ddp.c:64:
   include/net/ndisc.h:202:13: error: 'const struct net_device' has no member named 'ndisc_ops'
      return dev->ndisc_ops->is_useropt(opt);
                ^
   In file included from include/uapi/linux/stddef.h:1:0,
                    from include/linux/stddef.h:4,
                    from include/uapi/linux/posix_types.h:4,
                    from include/uapi/linux/types.h:13,
                    from include/linux/types.h:5,
                    from include/uapi/linux/capability.h:16,
                    from include/linux/capability.h:15,
                    from net/appletalk/ddp.c:54:
   include/net/ndisc.h: In function 'ndisc_send_na':
   include/net/ndisc.h:213:16: error: 'struct net_device' has no member named 'ndisc_ops'
     if (likely(dev->ndisc_ops->send_na))
                   ^
   include/linux/compiler.h:138:43: note: in definition of macro 'likely'
    #  define likely(x) (__builtin_constant_p(x) ? !!(x) : __branch_check__(x, 1))
                                              ^
   include/net/ndisc.h:213:16: error: 'struct net_device' has no member named 'ndisc_ops'
     if (likely(dev->ndisc_ops->send_na))
                   ^
   include/linux/compiler.h:138:51: note: in definition of macro 'likely'
    #  define likely(x) (__builtin_constant_p(x) ? !!(x) : __branch_check__(x, 1))
                                                      ^
   include/net/ndisc.h:213:16: error: 'struct net_device' has no member named 'ndisc_ops'
     if (likely(dev->ndisc_ops->send_na))
                   ^
   include/linux/compiler.h:114:47: note: in definition of macro 'likely_notrace'
    #define likely_notrace(x) __builtin_expect(!!(x), 1)
                                                  ^
   include/linux/compiler.h:138:56: note: in expansion of macro '__branch_check__'
    #  define likely(x) (__builtin_constant_p(x) ? !!(x) : __branch_check__(x, 1))
                                                           ^
   include/net/ndisc.h:213:6: note: in expansion of macro 'likely'
     if (likely(dev->ndisc_ops->send_na))
         ^
   In file included from include/net/ipv6.h:20:0,
                    from include/net/inetpeer.h:15,
                    from include/net/route.h:28,
                    from net/appletalk/ddp.c:64:
   include/net/ndisc.h:214:6: error: 'struct net_device' has no member named 'ndisc_ops'
      dev->ndisc_ops->send_na(dev, daddr, solicited_addr, router,
         ^
   In file included from include/uapi/linux/stddef.h:1:0,
                    from include/linux/stddef.h:4,
                    from include/uapi/linux/posix_types.h:4,
                    from include/uapi/linux/types.h:13,
                    from include/linux/types.h:5,
                    from include/uapi/linux/capability.h:16,
                    from include/linux/capability.h:15,
                    from net/appletalk/ddp.c:54:
   include/net/ndisc.h: In function 'ndisc_recv_na':
   include/net/ndisc.h:220:21: error: 'struct net_device' has no member named 'ndisc_ops'
     if (likely(skb->dev->ndisc_ops->recv_na))
                        ^
   include/linux/compiler.h:138:43: note: in definition of macro 'likely'
    #  define likely(x) (__builtin_constant_p(x) ? !!(x) : __branch_check__(x, 1))
                                              ^
   include/net/ndisc.h:220:21: error: 'struct net_device' has no member named 'ndisc_ops'
     if (likely(skb->dev->ndisc_ops->recv_na))
                        ^
   include/linux/compiler.h:138:51: note: in definition of macro 'likely'
    #  define likely(x) (__builtin_constant_p(x) ? !!(x) : __branch_check__(x, 1))
                                                      ^
   include/net/ndisc.h:220:21: error: 'struct net_device' has no member named 'ndisc_ops'
     if (likely(skb->dev->ndisc_ops->recv_na))
                        ^
   include/linux/compiler.h:114:47: note: in definition of macro 'likely_notrace'
    #define likely_notrace(x) __builtin_expect(!!(x), 1)
                                                  ^
   include/linux/compiler.h:138:56: note: in expansion of macro '__branch_check__'
    #  define likely(x) (__builtin_constant_p(x) ? !!(x) : __branch_check__(x, 1))
                                                           ^
   include/net/ndisc.h:220:6: note: in expansion of macro 'likely'
     if (likely(skb->dev->ndisc_ops->recv_na))
         ^
   In file included from include/net/ipv6.h:20:0,
                    from include/net/inetpeer.h:15,
                    from include/net/route.h:28,
                    from net/appletalk/ddp.c:64:
   include/net/ndisc.h:221:11: error: 'struct net_device' has no member named 'ndisc_ops'
      skb->dev->ndisc_ops->recv_na(skb);
              ^
   In file included from include/uapi/linux/stddef.h:1:0,
                    from include/linux/stddef.h:4,
                    from include/uapi/linux/posix_types.h:4,
                    from include/uapi/linux/types.h:13,
                    from include/linux/types.h:5,
                    from include/uapi/linux/capability.h:16,
                    from include/linux/capability.h:15,
                    from net/appletalk/ddp.c:54:
   include/net/ndisc.h: In function 'ndisc_send_ns':
   include/net/ndisc.h:229:16: error: 'struct net_device' has no member named 'ndisc_ops'
     if (likely(dev->ndisc_ops->send_ns))

vim +/likely +201 include/net/ndisc.h

   185		void	(*send_na)(struct net_device *dev,
   186				   const struct in6_addr *daddr,
   187				   const struct in6_addr *solicited_addr,
   188				   bool router, bool solicited,
   189				   bool override, bool inc_opt);
   190		void	(*recv_na)(struct sk_buff *skb);
   191		void	(*send_ns)(struct net_device *dev,
   192				   const struct in6_addr *solicit,
   193				   const struct in6_addr *daddr,
   194				   const struct in6_addr *saddr);
   195		void	(*recv_ns)(struct sk_buff *skb);
   196	};
   197	
   198	static inline int ndisc_is_useropt(const struct net_device *dev,
   199					   struct nd_opt_hdr *opt)
   200	{
 > 201		if (likely(dev->ndisc_ops->is_useropt))
   202			return dev->ndisc_ops->is_useropt(opt);
   203		else
   204			return 0;
   205	}
   206	
   207	static inline void ndisc_send_na(struct net_device *dev,
   208					 const struct in6_addr *daddr,
   209					 const struct in6_addr *solicited_addr,

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation
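
A likely cause of the errors above, given the include/linux/netdevice.h hunk quoted elsewhere in this thread: the `ndisc_ops` member is declared only under `#ifdef CONFIG_IPV6`, while the inline helpers in include/net/ndisc.h are compiled for every file that pulls in the header (here net/appletalk/ddp.c). The attached .config is not shown, so this is an inference, not a confirmed diagnosis. The sketch below is a small userspace model of one way to keep such a helper buildable; `MODEL_IPV6`, `model_device`, `model_nd_ops`, and `model_is_useropt` are hypothetical names, not kernel symbols.

```c
/* Hypothetical stand-in for the kernel config symbol. It is left undefined
 * here to mimic a build in which the option is off. */
/* #define MODEL_IPV6 1 */

struct model_nd_ops {
	int (*is_useropt)(int opt_type);
};

struct model_device {
	const char *name;
#ifdef MODEL_IPV6
	const struct model_nd_ops *nd_ops;	/* exists only with the option on */
#endif
};

/* The accessor is guarded by the same symbol as the member, so it still
 * compiles when the member is absent and then reports "not a user option". */
static inline int model_is_useropt(const struct model_device *dev, int opt_type)
{
#ifdef MODEL_IPV6
	if (dev->nd_ops && dev->nd_ops->is_useropt)
		return dev->nd_ops->is_useropt(opt_type);
#else
	(void)dev;
	(void)opt_type;
#endif
	return 0;
}
```

With the symbol undefined, `model_is_useropt()` compiles without touching the missing member and returns 0. Whether the real fix should guard the helpers or make the member unconditional is a design choice for the patch author.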

[-- Attachment #2: .config.gz --]
[-- Type: application/octet-stream, Size: 24077 bytes --]


* Re: [PATCH net-next V2 05/11] net/mlx5e: Support RX multi-packet WQE (Striding RQ)
From: Eric Dumazet @ 2016-04-18 14:17 UTC
  To: Saeed Mahameed
  Cc: Saeed Mahameed, David S. Miller, Linux Netdev List, Or Gerlitz,
	Tal Alon, Tariq Toukan, Eran Ben Elisha, Achiad Shochat
In-Reply-To: <CALzJLG_W9SkgMBQp86P0WDknw4Kc=DCBrvpPemAUbRX=r4r8Yg@mail.gmail.com>

On Mon, 2016-04-18 at 16:05 +0300, Saeed Mahameed wrote:
> On Mon, Apr 18, 2016 at 3:48 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > On Sun, 2016-04-17 at 17:29 -0700, Eric Dumazet wrote:
> >
> >>
> >> If really you need to allocate physically contiguous memory, have you
> >> considered converting the order-5 pages into 32 order-0 ones ?
> >
> > Search for split_page() call sites for examples.
> >
> >
> 
> Thanks Eric, we are already evaluating split_page as we speak.
> 
> We did look but could not find any specific alloc_pages API that
> allocates many physically contiguous pages with order0 ! so we assume
> it is ok to use split_page.

Note: I have no idea how split_page() performs:

The buddy page allocator has to aggregate pages into order-5, then we would
undo that work, touching 32 cache lines.

You might first benchmark a simple loop doing 

loop 10,000,000 times
 Order-5 allocation
 split into 32 order-0
 free 32 pages


Another idea would be to have a way to control the maximum number of order-5
pages that a port can use.

Since the driver always owns a ref on each order-5 page, the idea would be to
maintain a circular ring of up to XXX such pages, so that we can detect
abnormal use and fall back to order-0 immediately.
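
One possible reading of the ring idea, sketched as a userspace model and not as driver code. It assumes that a reference count of 1 means only the driver still holds the page, and that the ring is full when its oldest slot is still referenced elsewhere. `RING_SLOTS` stands in for the unspecified "XXX", and all names (`model_ring`, `ring_get`, `ring_put`) are hypothetical.

```c
#include <stddef.h>

#define RING_SLOTS 4	/* hypothetical cap; the mail leaves it as "XXX" */

/* Stand-in for an order-5 page: refcount 1 means only the driver holds it. */
struct model_big_page {
	int refcount;
};

struct model_ring {
	struct model_big_page slots[RING_SLOTS];
	size_t head;	/* next slot to hand out */
	size_t used;	/* number of slots populated so far */
};

enum rx_buf { RX_ORDER5, RX_ORDER0_FALLBACK };

/* Hand out the slot at head if it is fresh or has been released by the
 * stack; otherwise report that the caller should fall back to order-0. */
static enum rx_buf ring_get(struct model_ring *r)
{
	struct model_big_page *p = &r->slots[r->head];

	if (r->used < RING_SLOTS)
		r->used++;			/* first use of this slot */
	else if (p->refcount != 1)
		return RX_ORDER0_FALLBACK;	/* oldest page still in use */

	p->refcount = 2;			/* driver ref + stack ref */
	r->head = (r->head + 1) % RING_SLOTS;
	return RX_ORDER5;
}

/* The stack drops its reference to the page in the given slot. */
static void ring_put(struct model_ring *r, size_t slot)
{
	if (r->slots[slot].refcount > 1)
		r->slots[slot].refcount--;
}
```

In this model the fallback triggers as soon as the oldest page has not been released, which bounds the number of order-5 pages a port can pin at `RING_SLOTS`.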


* Re: [PATCH net-next 2/5] qede: Add support for ethtool private flags
From: Sergei Shtylyov @ 2016-04-18 14:11 UTC
  To: Yuval Mintz, davem, netdev
In-Reply-To: <1460921195-23352-3-git-send-email-Yuval.Mintz@qlogic.com>

Hello.

On 4/17/2016 10:26 PM, Yuval Mintz wrote:

> Adds a getter for the interface's private flags.
> The only parameter currently supported is whether the interface is a
> coupled function [required for supporting 100g].
>
> Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
> ---
>   drivers/net/ethernet/qlogic/qede/qede_ethtool.c | 27 +++++++++++++++++++++++++
>   1 file changed, 27 insertions(+)
>
> diff --git a/drivers/net/ethernet/qlogic/qede/qede_ethtool.c b/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
> index f87e83b..5ba6b2a 100644
> --- a/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
> +++ b/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
[...]
> @@ -185,6 +200,17 @@ static int qede_get_sset_count(struct net_device *dev, int stringset)
>   	}
>   }
>
> +static u32 qede_get_priv_flags(struct net_device *dev)
> +{
> +	struct qede_dev *edev = netdev_priv(dev);
> +	u32 flags = 0;
> +
> +	flags |= (!!(edev->dev_info.common.num_hwfns > 1)) <<
> +		 QEDE_PRI_FLAG_CMT;

    Why not just '='?

> +
> +	return flags;

    ... or direct return of the value above?

> +}
> +
>   static int qede_get_settings(struct net_device *dev, struct ethtool_cmd *cmd)
>   {
>   	struct qede_dev *edev = netdev_priv(dev);
[...]

MBR, Sergei
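
The two review remarks can be illustrated with a small userspace model. This is a sketch, not the driver code: `MODEL_PRI_FLAG_CMT` is a hypothetical bit position (the real value of `QEDE_PRI_FLAG_CMT` is not shown in the quoted hunk), and the function takes the hardware-function count directly instead of a `net_device`.

```c
#include <stdint.h>

#define MODEL_PRI_FLAG_CMT 0	/* hypothetical bit position */

/* Shape of the quoted code: zero-initialise, OR in one bit, return. */
static uint32_t priv_flags_original(unsigned int num_hwfns)
{
	uint32_t flags = 0;

	flags |= (uint32_t)(!!(num_hwfns > 1)) << MODEL_PRI_FLAG_CMT;

	return flags;
}

/* Shape suggested in the review: return the expression directly. The
 * comparison already yields 0 or 1, so the "!!" is not needed either. */
static uint32_t priv_flags_direct(unsigned int num_hwfns)
{
	return (uint32_t)(num_hwfns > 1) << MODEL_PRI_FLAG_CMT;
}
```

Both versions compute the same value while only one flag exists; the `|=` form mainly pays off once further flags are added.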


* Re: [PATCH V2] net: ethernet: mellanox: correct page conversion
From: Eli Cohen @ 2016-04-18 14:05 UTC
  To: Sinan Kaya; +Cc: linux-rdma, timur, cov, Yishai Hadas, netdev, linux-kernel
In-Reply-To: <5714E6DA.4080008@codeaurora.org>

Sure, this is not the complete patch. As far as I know the problem
you're facing with arm is that virt_to_page() does not provide the
correct page descriptor so my suggestion will eliminate the need for
it.

On Mon, Apr 18, 2016 at 09:53:30AM -0400, Sinan Kaya wrote:
> On 4/18/2016 2:54 AM, Eli Cohen wrote:
> > Sinan,
> > 
> > if we get rid of this part of the code:
> > 
> >                 if (BITS_PER_LONG == 64) {
> >                         struct page **pages;
> >                         pages = kmalloc(sizeof *pages * buf->nbufs, gfp);
> >                         if (!pages)
> >                                 goto err_free;
> >                         ...
> >                         ...
> >                         if (!buf->direct.buf)
> >                                 goto err_free;
> >                 }
> > 
> > Does that solve the arm issue?
> 
> I will test. As far as I know, there is one more place these DMA addresses
> are called with vmap. This is in mlx4_en_map_buffer.
> 
> I was trying to rearrange the allocation so that vmap actually works.
> 
> What do you think about mlx4_en_map_buffer?
> 
> 
> -- 
> Sinan Kaya
> Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: [PATCH V2] net: ethernet: mellanox: correct page conversion
From: Christoph Hellwig @ 2016-04-18 13:59 UTC
  To: Sinan Kaya
  Cc: Christoph Hellwig, linux-rdma, timur, cov, Yishai Hadas, netdev,
	linux-kernel
In-Reply-To: <5714E5D6.7050600@codeaurora.org>

On Mon, Apr 18, 2016 at 09:49:10AM -0400, Sinan Kaya wrote:
> Here is a good description of logical address vs. virtual address.
> 
> https://www.quora.com/What-is-the-Kernel-logical-and-virtual-addresses-What-is-the-difference-between-them-What-is-the-type-of-addresses-listed-in-the-System-map

That's not how we use the terms in Linux.  But it's not really the point
of my question either.

> > Is this correct?
> > 
> No, the driver is plain broken without this patch. It causes a kernel panic 
> during driver probe.
> 
> This is the definition of vmap API.
> 
> https://www.kernel.org/doc/htmldocs/kernel-api/API-vmap.html

Thanks for the pointer, but I'm actually the person who introduced vmap
to Linux a long time ago, and this is once again not my question.

> You cannot take several virtually mapped addresses returned by dma_alloc_coherent
> and try to make them virtually contiguous again. 

But now we're getting closer to the issue: the mlx4_en driver is using
vmap on buffers allocated using dma_alloc_coherent if on a 64-bit
architecture, and that's obviously broken.

Now the big question is: why does it do that, given that
dma_alloc_coherent can be used for high-order allocations anyway (and in
fact many architectures implement it using a version of vmap)?

Let's get some answers to these questions from the Mellanox folks and
work from there.


* Re: [PATCH V2] net: ethernet: mellanox: correct page conversion
From: Sinan Kaya @ 2016-04-18 13:53 UTC
  To: Eli Cohen; +Cc: linux-rdma, timur, cov, Yishai Hadas, netdev, linux-kernel
In-Reply-To: <20160418065447.GA11539@x-vnc01.mtx.labs.mlnx>

On 4/18/2016 2:54 AM, Eli Cohen wrote:
> Sinan,
> 
> if we get rid of this part of the code:
> 
>                 if (BITS_PER_LONG == 64) {
>                         struct page **pages;
>                         pages = kmalloc(sizeof *pages * buf->nbufs, gfp);
>                         if (!pages)
>                                 goto err_free;
>                         ...
>                         ...
>                         if (!buf->direct.buf)
>                                 goto err_free;
>                 }
> 
> Does that solve the arm issue?

I will test. As far as I know, there is one more place these DMA addresses
are called with vmap. This is in mlx4_en_map_buffer.

I was trying to rearrange the allocation so that vmap actually works.

What do you think about mlx4_en_map_buffer?


-- 
Sinan Kaya
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project


* [PATCH v2 1/1] drivers: net: cpsw: Prevent NULL pointer dereference with two PHYs
From: Andrew Goodbody @ 2016-04-18 13:53 UTC
  To: netdev; +Cc: linux-kernel, Andrew Goodbody
In-Reply-To: <1460987606-18125-1-git-send-email-andrew.goodbody@cambrionix.com>

Adding a 2nd PHY to cpsw results in a NULL pointer dereference
as below. Fix by maintaining a reference to each PHY node in slave
struct instead of a single reference in the priv struct which was
overwritten by the 2nd PHY.

[   17.870933] Unable to handle kernel NULL pointer dereference at virtual address 00000180
[   17.879557] pgd = dc8bc000
[   17.882514] [00000180] *pgd=9c882831, *pte=00000000, *ppte=00000000
[   17.889213] Internal error: Oops: 17 [#1] ARM
[   17.893838] Modules linked in:
[   17.897102] CPU: 0 PID: 1657 Comm: connmand Not tainted 4.5.0-ge463dfb-dirty #11
[   17.904947] Hardware name: Cambrionix whippet
[   17.909576] task: dc859240 ti: dc968000 task.ti: dc968000
[   17.915339] PC is at phy_attached_print+0x18/0x8c
[   17.920339] LR is at phy_attached_info+0x14/0x18
[   17.925247] pc : [<c042baec>]    lr : [<c042bb74>]    psr: 600f0113
[   17.925247] sp : dc969cf8  ip : dc969d28  fp : dc969d18
[   17.937425] r10: dda7a400  r9 : 00000000  r8 : 00000000
[   17.942971] r7 : 00000001  r6 : ddb00480  r5 : ddb8cb34  r4 : 00000000
[   17.949898] r3 : c0954cc0  r2 : c09562b0  r1 : 00000000  r0 : 00000000
[   17.956829] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
[   17.964401] Control: 10c5387d  Table: 9c8bc019  DAC: 00000051
[   17.970500] Process connmand (pid: 1657, stack limit = 0xdc968210)
[   17.977059] Stack: (0xdc969cf8 to 0xdc96a000)
[   17.981692] 9ce0:                                                       dc969d28 dc969d08
[   17.990386] 9d00: c038f9bc c038f6b4 ddb00480 dc969d34 dc969d28 c042bb74 c042bae4 00000000
[   17.999080] 9d20: c09562b0 c0954cc0 dc969d5c dc969d38 c043ebfc c042bb6c 00000007 00000003
[   18.007773] 9d40: ddb00000 ddb8cb58 ddb00480 00000001 dc969dec dc969d60 c0441614 c043ea68
[   18.016465] 9d60: 00000000 00000003 00000000 fffffff4 dc969df4 0000000d 00000000 00000000
[   18.025159] 9d80: dc969db4 dc969d90 c005dc08 c05839e0 dc969df4 0000000d ddb00000 00001002
[   18.033851] 9da0: 00000000 00000000 dc969dcc dc969db8 c005ddf4 c005dbc8 00000000 00000118
[   18.042544] 9dc0: dc969dec dc969dd0 ddb00000 c06db27c ffff9003 00001002 00000000 00000000
[   18.051237] 9de0: dc969e0c dc969df0 c057c88c c04410dc dc969e0c ddb00000 ddb00000 00000001
[   18.059930] 9e00: dc969e34 dc969e10 c057cb44 c057c7d8 ddb00000 ddb00138 00001002 beaeda20
[   18.068622] 9e20: 00000000 00000000 dc969e5c dc969e38 c057cc28 c057cac0 00000000 dc969e80
[   18.077315] 9e40: dda7a40c beaeda20 00000000 00000000 dc969ecc dc969e60 c05e36d0 c057cc14
[   18.086007] 9e60: dc969e84 00000051 beaeda20 00000000 dda7a40c 00000014 ddb00000 00008914
[   18.094699] 9e80: 30687465 00000000 00000000 00000000 00009003 00000000 00000000 00000000
[   18.103391] 9ea0: 00001002 00008914 dd257ae0 beaeda20 c098a428 beaeda20 00000011 00000000
[   18.112084] 9ec0: dc969edc dc969ed0 c05e4e54 c05e3030 dc969efc dc969ee0 c055f5ac c05e4cc4
[   18.120777] 9ee0: beaeda20 dd257ae0 dc8ab4c0 00008914 dc969f7c dc969f00 c010b388 c055f45c
[   18.129471] 9f00: c071ca40 dd257ac0 c00165e8 dc968000 dc969f3c dc969f20 dc969f64 dc969f28
[   18.138164] 9f20: c0115708 c0683ec8 dd257ac0 dd257ac0 dc969f74 dc969f40 c055f350 c00fc66c
[   18.146857] 9f40: dd82e4d0 00000011 00000000 00080000 dd257ac0 00000000 dc8ab4c0 dc8ab4c0
[   18.155550] 9f60: 00008914 beaeda20 00000011 00000000 dc969fa4 dc969f80 c010bc34 c010b2fc
[   18.164242] 9f80: 00000000 00000011 00000002 00000036 c00165e8 dc968000 00000000 dc969fa8
[   18.172935] 9fa0: c00163e0 c010bbcc 00000000 00000011 00000011 00008914 beaeda20 00009003
[   18.181628] 9fc0: 00000000 00000011 00000002 00000036 00081018 00000001 00000000 beaedc10
[   18.190320] 9fe0: 00083188 beaeda1c 00043a5d b6d29c0c 600b0010 00000011 00000000 00000000
[   18.198989] Backtrace:
[   18.201621] [<c042bad8>] (phy_attached_print) from [<c042bb74>] (phy_attached_info+0x14/0x18)
[   18.210664]  r3:c0954cc0 r2:c09562b0 r1:00000000
[   18.215588]  r4:ddb00480
[   18.218322] [<c042bb60>] (phy_attached_info) from [<c043ebfc>] (cpsw_slave_open+0x1a0/0x280)
[   18.227293] [<c043ea5c>] (cpsw_slave_open) from [<c0441614>] (cpsw_ndo_open+0x544/0x674)
[   18.235874]  r7:00000001 r6:ddb00480 r5:ddb8cb58 r4:ddb00000
[   18.241944] [<c04410d0>] (cpsw_ndo_open) from [<c057c88c>] (__dev_open+0xc0/0x128)
[   18.249972]  r9:00000000 r8:00000000 r7:00001002 r6:ffff9003 r5:c06db27c r4:ddb00000
[   18.258255] [<c057c7cc>] (__dev_open) from [<c057cb44>] (__dev_change_flags+0x90/0x154)
[   18.266745]  r5:00000001 r4:ddb00000
[   18.270575] [<c057cab4>] (__dev_change_flags) from [<c057cc28>] (dev_change_flags+0x20/0x50)
[   18.279523]  r9:00000000 r8:00000000 r7:beaeda20 r6:00001002 r5:ddb00138 r4:ddb00000
[   18.287811] [<c057cc08>] (dev_change_flags) from [<c05e36d0>] (devinet_ioctl+0x6ac/0x76c)
[   18.296483]  r9:00000000 r8:00000000 r7:beaeda20 r6:dda7a40c r5:dc969e80 r4:00000000
[   18.304762] [<c05e3024>] (devinet_ioctl) from [<c05e4e54>] (inet_ioctl+0x19c/0x1c8)
[   18.312882]  r10:00000000 r9:00000011 r8:beaeda20 r7:c098a428 r6:beaeda20 r5:dd257ae0
[   18.321235]  r4:00008914
[   18.323956] [<c05e4cb8>] (inet_ioctl) from [<c055f5ac>] (sock_ioctl+0x15c/0x2d8)
[   18.331829] [<c055f450>] (sock_ioctl) from [<c010b388>] (do_vfs_ioctl+0x98/0x8d0)
[   18.339765]  r7:00008914 r6:dc8ab4c0 r5:dd257ae0 r4:beaeda20
[   18.345822] [<c010b2f0>] (do_vfs_ioctl) from [<c010bc34>] (SyS_ioctl+0x74/0x84)
[   18.353573]  r10:00000000 r9:00000011 r8:beaeda20 r7:00008914 r6:dc8ab4c0 r5:dc8ab4c0
[   18.361924]  r4:00000000
[   18.364653] [<c010bbc0>] (SyS_ioctl) from [<c00163e0>] (ret_fast_syscall+0x0/0x3c)
[   18.372682]  r9:dc968000 r8:c00165e8 r7:00000036 r6:00000002 r5:00000011 r4:00000000
[   18.380960] Code: e92dd810 e24cb010 e24dd010 e59b4004 (e5902180)
[   18.387580] ---[ end trace c80529466223f3f3 ]---

Signed-off-by: Andrew Goodbody <andrew.goodbody@cambrionix.com>
---

v2 - Move allocation of memory for priv->slaves to inside cpsw_probe_dt so it
     has data->slaves initialised first which is needed to calculate size

 drivers/net/ethernet/ti/cpsw.c | 30 +++++++++++++++---------------
 1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
index 42fdfd4..e62909c 100644
--- a/drivers/net/ethernet/ti/cpsw.c
+++ b/drivers/net/ethernet/ti/cpsw.c
@@ -349,6 +349,7 @@ struct cpsw_slave {
 	struct cpsw_slave_data		*data;
 	struct phy_device		*phy;
 	struct net_device		*ndev;
+	struct device_node		*phy_node;
 	u32				port_vlan;
 	u32				open_stat;
 };
@@ -367,7 +368,6 @@ struct cpsw_priv {
 	spinlock_t			lock;
 	struct platform_device		*pdev;
 	struct net_device		*ndev;
-	struct device_node		*phy_node;
 	struct napi_struct		napi_rx;
 	struct napi_struct		napi_tx;
 	struct device			*dev;
@@ -1148,8 +1148,8 @@ static void cpsw_slave_open(struct cpsw_slave *slave, struct cpsw_priv *priv)
 		cpsw_ale_add_mcast(priv->ale, priv->ndev->broadcast,
 				   1 << slave_port, 0, 0, ALE_MCAST_FWD_2);
 
-	if (priv->phy_node)
-		slave->phy = of_phy_connect(priv->ndev, priv->phy_node,
+	if (slave->phy_node)
+		slave->phy = of_phy_connect(priv->ndev, slave->phy_node,
 				 &cpsw_adjust_link, 0, slave->data->phy_if);
 	else
 		slave->phy = phy_connect(priv->ndev, slave->data->phy_id,
@@ -1946,7 +1946,7 @@ static int cpsw_probe_dt(struct cpsw_priv *priv,
 	struct device_node *node = pdev->dev.of_node;
 	struct device_node *slave_node;
 	struct cpsw_platform_data *data = &priv->data;
-	int i = 0, ret;
+	int i, ret;
 	u32 prop;
 
 	if (!node)
@@ -1958,6 +1958,14 @@ static int cpsw_probe_dt(struct cpsw_priv *priv,
 	}
 	data->slaves = prop;
 
+	priv->slaves = devm_kzalloc(&pdev->dev,
+				    sizeof(struct cpsw_slave) * data->slaves,
+				    GFP_KERNEL);
+	if (!priv->slaves)
+		return -ENOMEM;
+	for (i = 0; i < data->slaves; i++)
+		priv->slaves[i].slave_num = i;
+
 	if (of_property_read_u32(node, "active_slave", &prop)) {
 		dev_err(&pdev->dev, "Missing active_slave property in the DT.\n");
 		return -EINVAL;
@@ -2023,6 +2031,7 @@ static int cpsw_probe_dt(struct cpsw_priv *priv,
 	if (ret)
 		dev_warn(&pdev->dev, "Doesn't have any child node\n");
 
+	i = 0;
 	for_each_child_of_node(node, slave_node) {
 		struct cpsw_slave_data *slave_data = data->slave_data + i;
 		const void *mac_addr = NULL;
@@ -2033,7 +2042,8 @@ static int cpsw_probe_dt(struct cpsw_priv *priv,
 		if (strcmp(slave_node->name, "slave"))
 			continue;
 
-		priv->phy_node = of_parse_phandle(slave_node, "phy-handle", 0);
+		priv->slaves[i].phy_node =
+			of_parse_phandle(slave_node, "phy-handle", 0);
 		parp = of_get_property(slave_node, "phy_id", &lenp);
 		if (of_phy_is_fixed_link(slave_node)) {
 			struct device_node *phy_node;
@@ -2292,16 +2302,6 @@ static int cpsw_probe(struct platform_device *pdev)
 
 	memcpy(ndev->dev_addr, priv->mac_addr, ETH_ALEN);
 
-	priv->slaves = devm_kzalloc(&pdev->dev,
-				    sizeof(struct cpsw_slave) * data->slaves,
-				    GFP_KERNEL);
-	if (!priv->slaves) {
-		ret = -ENOMEM;
-		goto clean_runtime_disable_ret;
-	}
-	for (i = 0; i < data->slaves; i++)
-		priv->slaves[i].slave_num = i;
-
 	priv->slaves[0].ndev = ndev;
 	priv->emac_port = 0;
 
-- 
2.5.0


* [PATCH v2 0/1] drivers: net: cpsw: Fix NULL pointer dereference with two slave PHYs
From: Andrew Goodbody @ 2016-04-18 13:53 UTC
  To: netdev; +Cc: linux-kernel, Andrew Goodbody

This is a fix for a NULL pointer dereference from cpsw which is triggered
by having two slave PHYs attached to a cpsw network device. The problem is
due to only maintaining a single reference to a PHY node in the private data
which gets overwritten by the second PHY probe. So move the PHY node
reference to the individual slave data so that there is now one per slave.
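
The overwrite described above can be shown with a small userspace model. It is only a sketch of the data-layout problem, not the cpsw code: `model_phy_node`, `model_slave`, `model_priv`, and `model_probe` are hypothetical names, and the model keeps both the old shared reference and the new per-slave reference side by side for comparison.

```c
#include <stddef.h>

/* Stand-in for a device-tree PHY node. */
struct model_phy_node {
	int id;
};

struct model_slave {
	const struct model_phy_node *phy_node;	/* new layout: one per slave */
};

struct model_priv {
	const struct model_phy_node *phy_node;	/* old layout: one shared ref */
	struct model_slave slaves[2];
};

/* Mimic a probe loop that records the PHY node of each slave in turn. */
static void model_probe(struct model_priv *priv,
			const struct model_phy_node *nodes[2])
{
	for (size_t i = 0; i < 2; i++) {
		priv->phy_node = nodes[i];		/* overwritten each pass */
		priv->slaves[i].phy_node = nodes[i];	/* kept per slave */
	}
}
```

After the loop, the shared reference points only at the second node, so a later open of the first slave would be handed the wrong PHY node; the per-slave references still match their own slaves.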

v1 had the problem that data->slaves was used before it had been filled in.

Andrew Goodbody (1):
  Prevent NULL pointer dereference with two PHYs on cpsw

 drivers/net/ethernet/ti/cpsw.c | 30 +++++++++++++++---------------
 1 file changed, 15 insertions(+), 15 deletions(-)

-- 
2.5.0


* Re: [PATCH V2] net: ethernet: mellanox: correct page conversion
From: Sinan Kaya @ 2016-04-18 13:49 UTC
  To: Christoph Hellwig
  Cc: linux-rdma, timur, cov, Yishai Hadas, netdev, linux-kernel
In-Reply-To: <20160418131058.GA25421@infradead.org>

On 4/18/2016 9:10 AM, Christoph Hellwig wrote:
> On Mon, Apr 18, 2016 at 09:06:18AM -0400, okaya@codeaurora.org wrote:
>> On 2016-04-18 08:12, Christoph Hellwig wrote:
>>> On Sat, Apr 16, 2016 at 06:23:32PM -0400, Sinan Kaya wrote:
>>>> Current code is assuming that the address returned by dma_alloc_coherent
>>>> is a logical address. This is not true on ARM/ARM64 systems.
>>>
>>> Can you explain what you mean with a 'logical address' and what actual
>>> issue you're trying to solve?
>>
Here is a good description of logical address vs. virtual address.

https://www.quora.com/What-is-the-Kernel-logical-and-virtual-addresses-What-is-the-difference-between-them-What-is-the-type-of-addresses-listed-in-the-System-map


>> Vmap call is failing on arm64 systems because dma alloc api already returns
>> an address mapped with vmap.
> 
> Please state your problem clearly.  What I'm reverse engineering from
> your posts is:  because dma_alloc_coherent uses vmap-like mappings on
> arm64 (all, some systems?) 

All arm64 systems. 

> a driver using a lot of them might run into
> limits of the vmap pool size.
> 
> Is this correct?
> 
No, the driver is plain broken without this patch. It causes a kernel panic 
during driver probe.

This is the definition of vmap API.

https://www.kernel.org/doc/htmldocs/kernel-api/API-vmap.html

VMAP allows you to make several pages look contiguous to the CPU. 
It can only be used against logical addresses returned from kmalloc 
or alloc_page. 

You cannot take several virtually mapped addresses returned by dma_alloc_coherent
and try to make them virtually contiguous again. 

The code happens to work on other architectures by pure luck. AFAIK, dma_alloc_coherent
returns logical addresses on Intel systems until it runs out of DMA memory. After
that, Intel systems will also start returning virtually mapped addresses and this code
will fail there too. ARM64, on the other hand, always returns a virtually mapped address.

The goal of this code is to allocate a number of page-sized buffers and make them look
contiguous. It is just using the wrong API. The correct approach is to allocate with
kmalloc or alloc_page and map the memory with dma_map_page, not to use dma_alloc_coherent.

Proper use of dma_map_page requires the code to call the dma_sync API in the correct
places to be compatible with non-coherent systems. This code already assumes
coherency. It would be nice to have the dma_sync calls in the right places. There is no
harm in calling the dma_sync API on coherent systems, where it is a no-op in the DMA mapping
layer, whereas it is a cache flush on non-coherent systems.

>>
>> Please see arch/arm64/mm directory.
> ---end quoted text---
> 

I hope it is clear now. The previous email was the most I could type on my phone.

-- 
Sinan Kaya
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project


* Re: [patch -next] udp: fix if statement in SIOCINQ ioctl
From: Willem de Bruijn @ 2016-04-18 13:41 UTC
  To: Eric Dumazet
  Cc: Dan Carpenter, David S. Miller, Willem de Bruijn,
	Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI,
	Patrick McHardy, Network Development, LKML, kernel-janitors
In-Reply-To: <1460981977.10638.105.camel@edumazet-glaptop3.roam.corp.google.com>

On Mon, Apr 18, 2016 at 8:19 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Mon, 2016-04-18 at 11:44 +0300, Dan Carpenter wrote:
>> We deleted a line of code and accidentally made the "return put_user()"
>> part of the if statement when it's supposed to be unconditional.
>>
>> Fixes: 9f9a45beaa96 ('udp: do not expect udp headers on ioctl SIOCINQ')
>> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
>
> Acked-by: Eric Dumazet <edumazet@google.com>

Acked-by: Willem de Bruijn <willemb@google.com>

Thanks for catching this.

* Re: [PATCH bluetooth-next 07/10] ipv6: introduce neighbour discovery ops
From: Alexander Aring @ 2016-04-18 13:28 UTC (permalink / raw)
  To: linux-wpan
  Cc: kernel, marcel, jukka.rissanen, hannes, stefan, mcr, werner,
	linux-bluetooth, netdev, David S . Miller, Alexey Kuznetsov,
	James Morris, Hideaki YOSHIFUJI, Patrick McHardy
In-Reply-To: <1460977108-4675-8-git-send-email-aar@pengutronix.de>

Hi,

On 04/18/2016 at 12:58 PM, Alexander Aring wrote:
> This patch introduces neighbour discovery ops callback structure. The
> structure contains at first receive and transmit handling for NS/NA and
> userspace option field functionality.
>
> These callbacks offer 6lowpan different handling, such as 802.15.4 short
> address handling or RFC6775 (Neighbor Discovery Optimization for IPv6 over
> 6LoWPANs).
>
> Cc: David S. Miller <davem@davemloft.net>
> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
> Cc: James Morris <jmorris@namei.org>
> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
> Cc: Patrick McHardy <kaber@trash.net>
> Signed-off-by: Alexander Aring <aar@pengutronix.de>
> ---
>  include/linux/netdevice.h |  3 ++
>  include/net/ndisc.h       | 73 ++++++++++++++++++++++++++++++++++++++++++-----
>  net/ipv6/addrconf.c       |  1 +
>  net/ipv6/ndisc.c          | 71 +++++++++++++++++++++++++++++++--------------
>  net/ipv6/route.c          |  2 +-
>  5 files changed, 121 insertions(+), 29 deletions(-)
>
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index 0052c42..4f1b3f2 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -1677,6 +1677,9 @@ struct net_device {
>  #ifdef CONFIG_NET_L3_MASTER_DEV
>  	const struct l3mdev_ops	*l3mdev_ops;
>  #endif
> +#ifdef CONFIG_IPV6
> +	const struct ndisc_ops *ndisc_ops;
> +#endif

This needs to change to:

#if IS_ENABLED(CONFIG_IPV6)

and likewise for the other configs which can be built as tristate.
Sorry for the noise, I will fix that in v2. :-)
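
For readers wondering why #ifdef is not enough here: with CONFIG_IPV6=m, Kconfig
defines CONFIG_IPV6_MODULE rather than CONFIG_IPV6, so a plain #ifdef drops the
member and every user of it fails to build. The following is a user-space sketch
of that behaviour. The macro names are my own simplified stand-ins modelled on
the kernel's kconfig.h helpers, not copies of it, and toy_net_device is a
hypothetical struct used only for illustration.

```c
/* Simulate CONFIG_IPV6=m: only the _MODULE symbol is defined. */
#define CONFIG_IPV6_MODULE 1
/* Simulate a built-in option (=y) for comparison. */
#define CONFIG_NET 1

/* Evaluates to 1 if the symbol is defined to 1, and to 0 otherwise. */
#define ARG_PLACEHOLDER_1 0,
#define TAKE_SECOND_ARG(ignored, val, ...) val
#define IS_DEFINED(x) IS_DEFINED_2(x)
#define IS_DEFINED_2(val) IS_DEFINED_3(ARG_PLACEHOLDER_##val)
#define IS_DEFINED_3(arg1_or_junk) TAKE_SECOND_ARG(arg1_or_junk 1, 0, 0)

#define IS_BUILTIN(option) IS_DEFINED(option)
#define IS_MODULE(option) IS_DEFINED(option##_MODULE)
#define IS_ENABLED(option) (IS_BUILTIN(option) || IS_MODULE(option))

/* Hypothetical stand-in for the struct being patched. */
struct toy_net_device {
	int ifindex;
#ifdef CONFIG_IPV6
	int seen_by_ifdef;	/* dropped when CONFIG_IPV6=m */
#endif
#if IS_ENABLED(CONFIG_IPV6)
	int seen_by_is_enabled;	/* kept for both =y and =m */
#endif
};

/* What a plain #ifdef concludes about CONFIG_IPV6. */
int ifdef_sees_ipv6(void)
{
#ifdef CONFIG_IPV6
	return 1;
#else
	return 0;
#endif
}

/* What IS_ENABLED() concludes about CONFIG_IPV6. */
int is_enabled_sees_ipv6(void)
{
	return IS_ENABLED(CONFIG_IPV6);
}
```

With the definitions above, only the IS_ENABLED() guard keeps its member in the
struct, which matches the allmodconfig build failures reported by the test robot.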

- Alex

* Re: [PATCH RFC net-next 0/2] pskb_extract() helper function.
From: Marcelo Ricardo Leitner @ 2016-04-18 13:28 UTC (permalink / raw)
  To: Sowmini Varadhan; +Cc: eric.dumazet, netdev
In-Reply-To: <cover.1460928360.git.sowmini.varadhan@oracle.com>

On Mon, Apr 18, 2016 at 06:21:07AM -0700, Sowmini Varadhan wrote:
> This patchset follows up on the discussion in
>  https://www.mail-archive.com/netdev@vger.kernel.org/msg105090.html
> 
> For RDS-TCP, we have to deal with the full gamut of
> nonlinear sk_buffs, including all the frag_list variants.
> Also, the parent skb has to remain unchanged, while the clone
> is queued for Rx on the PF_RDS socket. 
> 
> Patch 1 of this patchset adds a pskb_extract() function that 
> does all this without the redundant memcpy's in pskb_expand_head() 
> and __pskb_pull_tail().
> 
> A further optimization is also possible by inlining pskb_trim()
> itself into pskb_carve() and thus avoiding the needless copy
> of trailer frags/pages that will then get trimmed away.  I am
> deferring that optimization  for the next iteration, and would
> like to get feedback on this first pass, which by itself gives
> a noticeable perf boost.

I like this idea. We can also make use of it in SCTP.

  Marcelo

* [PATCH RFC net-next 0/2] pskb_extract() helper function.
From: Sowmini Varadhan @ 2016-04-18 13:21 UTC (permalink / raw)
  To: eric.dumazet, netdev; +Cc: sowmini.varadhan

This patchset follows up on the discussion in
 https://www.mail-archive.com/netdev@vger.kernel.org/msg105090.html

For RDS-TCP, we have to deal with the full gamut of
nonlinear sk_buffs, including all the frag_list variants.
Also, the parent skb has to remain unchanged, while the clone
is queued for Rx on the PF_RDS socket. 

Patch 1 of this patchset adds a pskb_extract() function that 
does all this without the redundant memcpy's in pskb_expand_head() 
and __pskb_pull_tail().

A further optimization is also possible by inlining pskb_trim()
itself into pskb_carve() and thus avoiding the needless copy
of trailer frags/pages that will then get trimmed away.  I am
deferring that optimization  for the next iteration, and would
like to get feedback on this first pass, which by itself gives
a noticeable perf boost.

Sowmini Varadhan (2):
  Add pskb_extract() helper function
  Call pskb_extract() helper function

 include/linux/skbuff.h |    2 +
 net/core/skbuff.c      |  248 ++++++++++++++++++++++++++++++++++++++++++++++++
 net/rds/tcp_recv.c     |   14 +--
 3 files changed, 253 insertions(+), 11 deletions(-)

* [PATCH RFC net-next 2/2] RDS: TCP:  Call pskb_extract() helper function
From: Sowmini Varadhan @ 2016-04-18 13:21 UTC (permalink / raw)
  To: eric.dumazet, netdev; +Cc: sowmini.varadhan
In-Reply-To: <cover.1460928360.git.sowmini.varadhan@oracle.com>

rds-stress experiments with request size 256 bytes, 8K acks,
using 16 threads show a 40% improvement when pskb_extract()
replaces the {skb_clone(..); pskb_pull(..); pskb_trim(..);}
pattern in the Rx path, so we leverage the perf gain with
this commit.

Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
---
 net/rds/tcp_recv.c |   14 +++-----------
 1 files changed, 3 insertions(+), 11 deletions(-)

diff --git a/net/rds/tcp_recv.c b/net/rds/tcp_recv.c
index 27a9921..d75d8b5 100644
--- a/net/rds/tcp_recv.c
+++ b/net/rds/tcp_recv.c
@@ -207,22 +207,14 @@ static int rds_tcp_data_recv(read_descriptor_t *desc, struct sk_buff *skb,
 		}
 
 		if (left && tc->t_tinc_data_rem) {
-			clone = skb_clone(skb, arg->gfp);
+			to_copy = min(tc->t_tinc_data_rem, left);
+
+			clone = pskb_extract(skb, offset, to_copy, arg->gfp);
 			if (!clone) {
 				desc->error = -ENOMEM;
 				goto out;
 			}
 
-			to_copy = min(tc->t_tinc_data_rem, left);
-			if (!pskb_pull(clone, offset) ||
-			    pskb_trim(clone, to_copy)) {
-				pr_warn("rds_tcp_data_recv: pull/trim failed "
-					"left %zu data_rem %zu skb_len %d\n",
-					left, tc->t_tinc_data_rem, skb->len);
-				kfree_skb(clone);
-				desc->error = -ENOMEM;
-				goto out;
-			}
 			skb_queue_tail(&tinc->ti_skb_list, clone);
 
 			rdsdebug("skb %p data %p len %d off %u to_copy %zu -> "
-- 
1.7.1

* [PATCH RFC net-next 1/2] skbuff: Add pskb_extract() helper function
From: Sowmini Varadhan @ 2016-04-18 13:21 UTC (permalink / raw)
  To: eric.dumazet, netdev; +Cc: sowmini.varadhan
In-Reply-To: <cover.1460928360.git.sowmini.varadhan@oracle.com>

A pattern of skb usage seen in modules such as RDS-TCP is to
extract `to_copy' bytes from the received TCP segment, starting
at some offset `off' into a new skb `clone'. This is done in
the ->data_ready callback, where the clone skb is queued up for rx on
the PF_RDS socket, while the parent TCP segment is returned unchanged
back to the TCP engine.

The existing code uses the sequence
	clone = skb_clone(..);
	pskb_pull(clone, off, ..);
	pskb_trim(clone, to_copy, ..);
with the intention of discarding the first `off' bytes. However,
skb_clone() + pskb_pull() implies pskb_expand_head(), which ends
up doing a redundant memcpy of bytes that will then get discarded
in __pskb_pull_tail().

To avoid this inefficiency, this commit adds pskb_extract() that
creates the clone, and memcpy's only the relevant header/frag/frag_list
to the start of `clone'. pskb_trim() is then invoked to trim clone
down to the requested to_copy bytes.
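
The idea can be sketched in user space as follows. This is only a toy model
under loud assumptions: toy_buf, toy_frag and toy_extract are hypothetical
names, the buffer is just a list of (pointer, length) fragments with no linear
header, and fragments are referenced rather than refcounted. It is not the
kernel implementation; it only illustrates that the bytes before `off' and
after `off + to_copy' never need to be copied.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_FRAGS 8

/* Hypothetical fragment: a borrowed pointer plus a length. */
struct toy_frag {
	const unsigned char *data;
	size_t len;
};

/* Hypothetical buffer: payload scattered across up to TOY_MAX_FRAGS frags. */
struct toy_buf {
	struct toy_frag frags[TOY_MAX_FRAGS];
	size_t nr_frags;
	size_t len;	/* total bytes across all frags */
};

/* Make `out` describe bytes [off, off + to_copy) of `in`.
 * Returns 0 on success, -1 if the range does not fit in `in`.
 * No payload bytes are copied; only fragment descriptors are written.
 */
int toy_extract(const struct toy_buf *in, size_t off, size_t to_copy,
		struct toy_buf *out)
{
	size_t pos = 0, end, i;

	assert(in->nr_frags <= TOY_MAX_FRAGS);
	if (off > in->len || to_copy > in->len - off)
		return -1;
	end = off + to_copy;

	out->nr_frags = 0;
	out->len = 0;

	for (i = 0; i < in->nr_frags; i++) {
		const struct toy_frag *f = &in->frags[i];
		size_t fstart = pos, fend = pos + f->len;
		size_t skip, stop;

		pos = fend;
		if (fend <= off)
			continue;	/* wholly before the range: carved off */
		if (fstart >= end)
			break;		/* wholly after the range: trimmed */

		skip = off > fstart ? off - fstart : 0;
		stop = end < fend ? end - fstart : f->len;
		if (stop > skip) {
			out->frags[out->nr_frags].data = f->data + skip;
			out->frags[out->nr_frags].len = stop - skip;
			out->len += stop - skip;
			out->nr_frags++;
		}
	}
	return 0;
}
```

A caller in this model would build a toy_buf over the received segment and ask
for the window it wants; the source buffer is left untouched, which mirrors the
requirement that the parent skb goes back to TCP unchanged.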

Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
---
 include/linux/skbuff.h |    2 +
 net/core/skbuff.c      |  248 ++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 250 insertions(+), 0 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index da0ace3..a1ce639 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2986,6 +2986,8 @@ struct sk_buff *skb_vlan_untag(struct sk_buff *skb);
 int skb_ensure_writable(struct sk_buff *skb, int write_len);
 int skb_vlan_pop(struct sk_buff *skb);
 int skb_vlan_push(struct sk_buff *skb, __be16 vlan_proto, u16 vlan_tci);
+struct sk_buff *pskb_extract(struct sk_buff *skb, int off, int to_copy,
+			     gfp_t gfp);
 
 static inline int memcpy_from_msg(void *data, struct msghdr *msg, int len)
 {
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 4cc594c..e8b6d20 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -4619,3 +4619,251 @@ struct sk_buff *alloc_skb_with_frags(unsigned long header_len,
 	return NULL;
 }
 EXPORT_SYMBOL(alloc_skb_with_frags);
+
+/* carve out the first off bytes from skb when off < headlen */
+static int pskb_carve_inside_header(struct sk_buff *skb, const u32 off,
+				    const int headlen, gfp_t gfp_mask)
+{
+	int i;
+	int size = skb_end_offset(skb);
+	int new_hlen = headlen - off;
+	u8 *data;
+	int doff = 0;
+
+	size = SKB_DATA_ALIGN(size);
+
+	if (skb_pfmemalloc(skb))
+		gfp_mask |= __GFP_MEMALLOC;
+	data = kmalloc_reserve(size +
+			       SKB_DATA_ALIGN(sizeof(struct skb_shared_info)),
+			       gfp_mask, NUMA_NO_NODE, NULL);
+	if (!data)
+		return -ENOMEM;
+
+	size = SKB_WITH_OVERHEAD(ksize(data));
+
+	/* Copy real data, and all frags */
+	skb_copy_from_linear_data_offset(skb, off, data, new_hlen);
+	skb->len -= off;
+
+	memcpy((struct skb_shared_info *)(data + size),
+	       skb_shinfo(skb),
+	       offsetof(struct skb_shared_info,
+			frags[skb_shinfo(skb)->nr_frags]));
+	if (skb_cloned(skb)) {
+		/* drop the old head gracefully */
+		if (skb_orphan_frags(skb, gfp_mask)) {
+			kfree(data);
+			return -ENOMEM;
+		}
+		for (i = 0; i < skb_shinfo(skb)->nr_frags; i++)
+			skb_frag_ref(skb, i);
+		if (skb_has_frag_list(skb))
+			skb_clone_fraglist(skb);
+		skb_release_data(skb);
+	} else {
+		/* we can reuse existing refcount - all we did was
+		 * relocate values
+		 */
+		skb_free_head(skb);
+	}
+
+	doff = (data - skb->head);
+	skb->head = data;
+	skb->data = data;
+	skb->head_frag = 0;
+#ifdef NET_SKBUFF_DATA_USES_OFFSET
+	skb->end = size;
+	doff = 0;
+#else
+	skb->end = skb->head + size;
+#endif
+	skb_set_tail_pointer(skb, skb_headlen(skb));
+	skb_headers_offset_update(skb, 0);
+	skb->cloned = 0;
+	skb->hdr_len = 0;
+	skb->nohdr = 0;
+	atomic_set(&skb_shinfo(skb)->dataref, 1);
+
+	return 0;
+}
+
+static int pskb_carve(struct sk_buff *skb, const u32 off, gfp_t gfp);
+
+/* carve out the first eat bytes from skb's frag_list. May recurse into
+ * pskb_carve()
+ */
+static int pskb_carve_frag_list(struct sk_buff *skb,
+				struct skb_shared_info *shinfo, int eat,
+				gfp_t gfp_mask)
+{
+	struct sk_buff *list = shinfo->frag_list;
+	struct sk_buff *clone = NULL;
+	struct sk_buff *insp = NULL;
+
+	do {
+		if (!list) {
+			pr_err("Not enough bytes to eat. Want %d\n", eat);
+			return -EFAULT;
+		}
+		if (list->len <= eat) {
+			/* Eaten as whole. */
+			eat -= list->len;
+			list = list->next;
+			insp = list;
+		} else {
+			/* Eaten partially. */
+			if (skb_shared(list)) {
+				clone = skb_clone(list, gfp_mask);
+				if (!clone)
+					return -ENOMEM;
+				insp = list->next;
+				list = clone;
+			} else {
+				/* This may be pulled without problems. */
+				insp = list;
+			}
+			if (pskb_carve(list, eat, gfp_mask) < 0) {
+				kfree_skb(clone);
+				return -ENOMEM;
+			}
+			break;
+		}
+	} while (eat);
+
+	/* Free pulled out fragments. */
+	while ((list = shinfo->frag_list) != insp) {
+		shinfo->frag_list = list->next;
+		kfree_skb(list);
+	}
+	/* And insert new clone at head. */
+	if (clone) {
+		clone->next = list;
+		shinfo->frag_list = clone;
+	}
+	return 0;
+}
+
+/* carve off first len bytes from skb. Split line (off) is in the
+ * non-linear part of skb
+ */
+static int pskb_carve_inside_nonlinear(struct sk_buff *skb, const u32 off,
+				       int pos, gfp_t gfp_mask)
+{
+	int i, k = 0;
+	int size = skb_end_offset(skb);
+	u8 *data;
+	const int nfrags = skb_shinfo(skb)->nr_frags;
+	struct skb_shared_info *shinfo;
+	int doff = 0;
+
+	size = SKB_DATA_ALIGN(size);
+
+	if (skb_pfmemalloc(skb))
+		gfp_mask |= __GFP_MEMALLOC;
+	data = kmalloc_reserve(size +
+			       SKB_DATA_ALIGN(sizeof(struct skb_shared_info)),
+			       gfp_mask, NUMA_NO_NODE, NULL);
+	if (!data)
+		return -ENOMEM;
+
+	size = SKB_WITH_OVERHEAD(ksize(data));
+
+	memcpy((struct skb_shared_info *)(data + size),
+	       skb_shinfo(skb), offsetof(struct skb_shared_info,
+					 frags[skb_shinfo(skb)->nr_frags]));
+	if (skb_orphan_frags(skb, gfp_mask)) {
+		kfree(data);
+		return -ENOMEM;
+	}
+	shinfo = (struct skb_shared_info *)(data + size);
+	for (i = 0; i < nfrags; i++) {
+		int fsize = skb_frag_size(&skb_shinfo(skb)->frags[i]);
+
+		if (pos + fsize > off) {
+			shinfo->frags[k] = skb_shinfo(skb)->frags[i];
+
+			if (pos < off) {
+				/* Split frag.
+				 * We have two variants in this case:
+				 * 1. Move all the frag to the second
+				 *    part, if it is possible. F.e.
+				 *    this approach is mandatory for TUX,
+				 *    where splitting is expensive.
+				 * 2. Split it accurately. We do this.
+				 */
+				shinfo->frags[0].page_offset += off - pos;
+				skb_frag_size_sub(&shinfo->frags[0], off - pos);
+			}
+			skb_frag_ref(skb, i);
+			k++;
+		}
+		pos += fsize;
+	}
+	shinfo->nr_frags = k;
+	if (skb_has_frag_list(skb))
+		skb_clone_fraglist(skb);
+
+	if (k == 0) {
+		/* split line is in frag list */
+		pskb_carve_frag_list(skb, shinfo, off - pos, gfp_mask);
+	}
+	skb_release_data(skb);
+
+	doff = (data - skb->head);
+	skb->head = data;
+	skb->head_frag = 0;
+	skb->data = data;
+#ifdef NET_SKBUFF_DATA_USES_OFFSET
+	skb->end = size;
+	doff = 0;
+#else
+	skb->end = skb->head + size;
+#endif
+	skb_reset_tail_pointer(skb);
+	skb_headers_offset_update(skb, 0);
+	skb->cloned   = 0;
+	skb->hdr_len  = 0;
+	skb->nohdr    = 0;
+	skb->len -= off;
+	skb->data_len = skb->len;
+	atomic_set(&skb_shinfo(skb)->dataref, 1);
+	return 0;
+}
+
+/* remove len bytes from the beginning of the skb */
+static int pskb_carve(struct sk_buff *skb, const u32 len, gfp_t gfp)
+{
+	int headlen = skb_headlen(skb);
+
+	if (len < headlen)
+		return pskb_carve_inside_header(skb, len, headlen, gfp);
+	else
+		return pskb_carve_inside_nonlinear(skb, len, headlen, gfp);
+}
+
+/* Extract to_copy bytes starting at off from skb, and return this in
+ * a new skb
+ */
+struct sk_buff *pskb_extract(struct sk_buff *skb, int off,
+			     int to_copy, gfp_t gfp)
+{
+	struct sk_buff  *clone = skb_clone(skb, gfp);
+
+	if (!clone)
+		return NULL;
+
+	if (pskb_carve(clone, off, gfp) < 0) {
+		pr_warn("pskb_carve failed\n");
+		kfree_skb(clone);
+		return NULL;
+	}
+
+	if (pskb_trim(clone, to_copy)) {
+		pr_warn("pskb_trim failed\n");
+		kfree_skb(clone);
+		return NULL;
+	}
+	return clone;
+}
+EXPORT_SYMBOL(pskb_extract);
-- 
1.7.1

* Re: [PATCH V2] net: ethernet: mellanox: correct page conversion
From: Christoph Hellwig @ 2016-04-18 13:10 UTC (permalink / raw)
  To: okaya
  Cc: Christoph Hellwig, linux-rdma, timur, cov, Yishai Hadas, netdev,
	linux-kernel
In-Reply-To: <0c6a430c5f0ec64f51d7c594ef9751dd@codeaurora.org>

On Mon, Apr 18, 2016 at 09:06:18AM -0400, okaya@codeaurora.org wrote:
> On 2016-04-18 08:12, Christoph Hellwig wrote:
> >On Sat, Apr 16, 2016 at 06:23:32PM -0400, Sinan Kaya wrote:
> >>Current code is assuming that the address returned by dma_alloc_coherent
> >>is a logical address. This is not true on ARM/ARM64 systems.
> >
> >Can you explain what you mean with a 'logical address' and what actual
> >issue you're trying to solve?
> 
> Vmap call is failing on arm64 systems because dma alloc api already returns
> an address mapped with vmap.

Please state your problem clearly.  What I'm reverse engineering from
your posts is:  because dma_alloc_coherent uses vmap-like mappings on
arm64 (all, some systems?) a driver using a lot of them might run into
limits of the vmap pool size.

Is this correct?

> 
> Please see arch/arm64/mm directory.
---end quoted text---

* Re: [PATCH V2] net: ethernet: mellanox: correct page conversion
From: okaya @ 2016-04-18 13:06 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-rdma, timur, cov, Yishai Hadas, netdev, linux-kernel
In-Reply-To: <20160418121247.GA25387@infradead.org>

On 2016-04-18 08:12, Christoph Hellwig wrote:
> On Sat, Apr 16, 2016 at 06:23:32PM -0400, Sinan Kaya wrote:
>> Current code is assuming that the address returned by 
>> dma_alloc_coherent
>> is a logical address. This is not true on ARM/ARM64 systems.
> 
> Can you explain what you mean with a 'logical address' and what actual
> issue you're trying to solve?

Vmap call is failing on arm64 systems because dma alloc api already 
returns an address mapped with vmap.

Please see arch/arm64/mm directory.

* Re: [PATCH net-next V2 05/11] net/mlx5e: Support RX multi-packet WQE (Striding RQ)
From: Saeed Mahameed @ 2016-04-18 13:05 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Saeed Mahameed, David S. Miller, Linux Netdev List, Or Gerlitz,
	Tal Alon, Tariq Toukan, Eran Ben Elisha, Achiad Shochat
In-Reply-To: <1460983695.10638.113.camel@edumazet-glaptop3.roam.corp.google.com>

On Mon, Apr 18, 2016 at 3:48 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Sun, 2016-04-17 at 17:29 -0700, Eric Dumazet wrote:
>
>>
>> If really you need to allocate physically contiguous memory, have you
>> considered converting the order-5 pages into 32 order-0 ones ?
>
> Search for split_page() call sites for examples.
>
>

Thanks Eric, we are already evaluating split_page as we speak.

We did look but could not find any specific alloc_pages API that
allocates many physically contiguous order-0 pages, so we assume
it is OK to use split_page.
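
As a quick sanity check of the arithmetic being discussed (one order-5 block
becoming 32 order-0 pages), here is a small user-space sketch. It does not call
any kernel API; the toy_* names are hypothetical, and the 4 KiB page size is an
assumption that does not hold on every configuration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12	/* assumption: 4 KiB pages */
#define TOY_PAGE_SIZE ((size_t)1 << TOY_PAGE_SHIFT)

/* Number of order-0 pages contained in one block of the given order. */
size_t toy_pages_in_order(unsigned int order)
{
	return (size_t)1 << order;
}

/* Size in bytes of one block of the given order. */
size_t toy_bytes_in_order(unsigned int order)
{
	return toy_pages_in_order(order) << TOY_PAGE_SHIFT;
}

/* Record the start address of every order-0 page inside a contiguous,
 * page-aligned block. Returns the number of entries written to `pages`.
 */
size_t toy_split_block(uintptr_t base, unsigned int order,
		       uintptr_t *pages, size_t max_pages)
{
	size_t n = toy_pages_in_order(order), i;

	assert(n <= max_pages);
	assert((base & (TOY_PAGE_SIZE - 1)) == 0);

	for (i = 0; i < n; i++)
		pages[i] = base + i * TOY_PAGE_SIZE;
	return n;
}
```

The pages stay physically contiguous after the split in this model; what changes
is only that each one can be tracked and released on its own.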

BTW our MPWQE solution doesn't totally rely on huge physically
contiguous memory, as you see in the next two patches we introduce a
fragmented MPWQE approach as a fallback, but we do understand your
concern for the normal flow.

* Re: [PATCH bluetooth-next 09/10] 6lowpan: introduce 6lowpan-nd
From: kbuild test robot @ 2016-04-18 13:04 UTC (permalink / raw)
  To: Alexander Aring
  Cc: kbuild-all-JC7UmRfGjtg, linux-wpan-u79uwXL29TY76Z2rM5mHXA,
	kernel-bIcnvbaLZ9MEGnE8C9+IrQ, marcel-kz+m5ild9QBg9hUCZPvPmw,
	jukka.rissanen-VuQAYsv1563Yd54FQh9/CA,
	hannes-tFNcAqjVMyqKXQKiL6tip0B+6BGkLq7r,
	stefan-JPH+aEBZ4P+UEJcrhfAQsw, mcr-SWp7JaYWvAQV+D8aMU/kSg,
	werner-SEdMjqphH88wryQfseakQg,
	linux-bluetooth-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA, Alexander Aring, David S . Miller,
	Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI,
	Patrick McHardy
In-Reply-To: <1460977108-4675-10-git-send-email-aar-bIcnvbaLZ9MEGnE8C9+IrQ@public.gmane.org>

[-- Attachment #1: Type: text/plain, Size: 5783 bytes --]

Hi Alexander,

[auto build test ERROR on bluetooth-next/master]

url:    https://github.com/0day-ci/linux/commits/Alexander-Aring/6lowpan-introduce-basic-6lowpan-nd/20160418-191825
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next.git master
config: x86_64-allmodconfig (attached as .config)
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

All errors (new ones prefixed by >>):

   In file included from include/linux/linkage.h:4:0,
                    from include/linux/fs.h:4,
                    from include/linux/debugfs.h:18,
                    from include/net/6lowpan.h:56,
                    from net/6lowpan/ndisc.c:11:
   include/net/ndisc.h: In function 'ndisc_is_useropt':
   include/net/ndisc.h:211:16: error: 'const struct net_device' has no member named 'ndisc_ops'
     if (likely(dev->ndisc_ops->is_useropt))
                   ^
   include/linux/compiler.h:169:40: note: in definition of macro 'likely'
    # define likely(x) __builtin_expect(!!(x), 1)
                                           ^
   In file included from include/net/ipv6.h:20:0,
                    from include/net/6lowpan.h:58,
                    from net/6lowpan/ndisc.c:11:
   include/net/ndisc.h:212:13: error: 'const struct net_device' has no member named 'ndisc_ops'
      return dev->ndisc_ops->is_useropt(opt);
                ^
   In file included from include/linux/linkage.h:4:0,
                    from include/linux/fs.h:4,
                    from include/linux/debugfs.h:18,
                    from include/net/6lowpan.h:56,
                    from net/6lowpan/ndisc.c:11:
   include/net/ndisc.h: In function 'ndisc_send_na':
   include/net/ndisc.h:223:16: error: 'struct net_device' has no member named 'ndisc_ops'
     if (likely(dev->ndisc_ops->send_na))
                   ^
   include/linux/compiler.h:169:40: note: in definition of macro 'likely'
    # define likely(x) __builtin_expect(!!(x), 1)
                                           ^
   In file included from include/net/ipv6.h:20:0,
                    from include/net/6lowpan.h:58,
                    from net/6lowpan/ndisc.c:11:
   include/net/ndisc.h:224:6: error: 'struct net_device' has no member named 'ndisc_ops'
      dev->ndisc_ops->send_na(dev, daddr, solicited_addr, router,
         ^
   In file included from include/linux/linkage.h:4:0,
                    from include/linux/fs.h:4,
                    from include/linux/debugfs.h:18,
                    from include/net/6lowpan.h:56,
                    from net/6lowpan/ndisc.c:11:
   include/net/ndisc.h: In function 'ndisc_recv_na':
   include/net/ndisc.h:230:21: error: 'struct net_device' has no member named 'ndisc_ops'
     if (likely(skb->dev->ndisc_ops->recv_na))
                        ^
   include/linux/compiler.h:169:40: note: in definition of macro 'likely'
    # define likely(x) __builtin_expect(!!(x), 1)
                                           ^
   In file included from include/net/ipv6.h:20:0,
                    from include/net/6lowpan.h:58,
                    from net/6lowpan/ndisc.c:11:
   include/net/ndisc.h:231:11: error: 'struct net_device' has no member named 'ndisc_ops'
      skb->dev->ndisc_ops->recv_na(skb);
              ^
   In file included from include/linux/linkage.h:4:0,
                    from include/linux/fs.h:4,
                    from include/linux/debugfs.h:18,
                    from include/net/6lowpan.h:56,
                    from net/6lowpan/ndisc.c:11:
   include/net/ndisc.h: In function 'ndisc_send_ns':
   include/net/ndisc.h:239:16: error: 'struct net_device' has no member named 'ndisc_ops'
     if (likely(dev->ndisc_ops->send_ns))
                   ^
   include/linux/compiler.h:169:40: note: in definition of macro 'likely'
    # define likely(x) __builtin_expect(!!(x), 1)
                                           ^
   In file included from include/net/ipv6.h:20:0,
                    from include/net/6lowpan.h:58,
                    from net/6lowpan/ndisc.c:11:
   include/net/ndisc.h:240:6: error: 'struct net_device' has no member named 'ndisc_ops'
      dev->ndisc_ops->send_ns(dev, solicit, daddr, saddr);
         ^
   In file included from include/linux/linkage.h:4:0,
                    from include/linux/fs.h:4,
                    from include/linux/debugfs.h:18,
                    from include/net/6lowpan.h:56,
                    from net/6lowpan/ndisc.c:11:
   include/net/ndisc.h: In function 'ndisc_recv_ns':
   include/net/ndisc.h:245:21: error: 'struct net_device' has no member named 'ndisc_ops'
     if (likely(skb->dev->ndisc_ops->recv_ns))
                        ^
   include/linux/compiler.h:169:40: note: in definition of macro 'likely'
    # define likely(x) __builtin_expect(!!(x), 1)
                                           ^
   In file included from include/net/ipv6.h:20:0,
                    from include/net/6lowpan.h:58,
                    from net/6lowpan/ndisc.c:11:
   include/net/ndisc.h:246:11: error: 'struct net_device' has no member named 'ndisc_ops'
      skb->dev->ndisc_ops->recv_ns(skb);
              ^
   net/6lowpan/ndisc.c: In function 'lowpan_register_ndisc_ops':
>> net/6lowpan/ndisc.c:632:5: error: 'struct net_device' has no member named 'ndisc_ops'
     dev->ndisc_ops = &lowpan_ndisc_ops;
        ^

vim +632 net/6lowpan/ndisc.c

   626		.send_ns = lowpan_ndisc_send_ns,
   627		.recv_ns = lowpan_ndisc_recv_ns,
   628	};
   629	
   630	void lowpan_register_ndisc_ops(struct net_device *dev)
   631	{
 > 632		dev->ndisc_ops = &lowpan_ndisc_ops;
   633	}

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/octet-stream, Size: 54118 bytes --]

* Re: [PATCH bluetooth-next 07/10] ipv6: introduce neighbour discovery ops
From: kbuild test robot @ 2016-04-18 12:59 UTC (permalink / raw)
  To: Alexander Aring
  Cc: kbuild-all, linux-wpan, kernel, marcel, jukka.rissanen, hannes,
	stefan, mcr, werner, linux-bluetooth, netdev, Alexander Aring,
	David S . Miller, Alexey Kuznetsov, James Morris,
	Hideaki YOSHIFUJI, Patrick McHardy
In-Reply-To: <1460977108-4675-8-git-send-email-aar@pengutronix.de>

[-- Attachment #1: Type: text/plain, Size: 13788 bytes --]

Hi Alexander,

[auto build test ERROR on bluetooth-next/master]

url:    https://github.com/0day-ci/linux/commits/Alexander-Aring/6lowpan-introduce-basic-6lowpan-nd/20160418-191825
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next.git master
config: x86_64-allmodconfig (attached as .config)
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

All error/warnings (new ones prefixed by >>):

   In file included from include/uapi/linux/stddef.h:1:0,
                    from include/linux/stddef.h:4,
                    from include/uapi/linux/posix_types.h:4,
                    from include/uapi/linux/types.h:13,
                    from include/linux/types.h:5,
                    from include/linux/list.h:4,
                    from include/linux/module.h:9,
                    from drivers/net/ethernet/atheros/alx/main.c:35:
   include/net/ndisc.h: In function 'ndisc_is_useropt':
>> include/net/ndisc.h:201:16: error: 'const struct net_device' has no member named 'ndisc_ops'
     if (likely(dev->ndisc_ops->is_useropt))
                   ^
   include/linux/compiler.h:169:40: note: in definition of macro 'likely'
    # define likely(x) __builtin_expect(!!(x), 1)
                                           ^
   In file included from include/net/ipv6.h:20:0,
                    from include/net/inetpeer.h:15,
                    from include/net/route.h:28,
                    from include/net/ip.h:31,
                    from include/net/ip6_checksum.h:31,
                    from drivers/net/ethernet/atheros/alx/main.c:46:
   include/net/ndisc.h:202:13: error: 'const struct net_device' has no member named 'ndisc_ops'
      return dev->ndisc_ops->is_useropt(opt);
                ^
   In file included from include/uapi/linux/stddef.h:1:0,
                    from include/linux/stddef.h:4,
                    from include/uapi/linux/posix_types.h:4,
                    from include/uapi/linux/types.h:13,
                    from include/linux/types.h:5,
                    from include/linux/list.h:4,
                    from include/linux/module.h:9,
                    from drivers/net/ethernet/atheros/alx/main.c:35:
   include/net/ndisc.h: In function 'ndisc_send_na':
>> include/net/ndisc.h:213:16: error: 'struct net_device' has no member named 'ndisc_ops'
     if (likely(dev->ndisc_ops->send_na))
                   ^
   include/linux/compiler.h:169:40: note: in definition of macro 'likely'
    # define likely(x) __builtin_expect(!!(x), 1)
                                           ^
   In file included from include/net/ipv6.h:20:0,
                    from include/net/inetpeer.h:15,
                    from include/net/route.h:28,
                    from include/net/ip.h:31,
                    from include/net/ip6_checksum.h:31,
                    from drivers/net/ethernet/atheros/alx/main.c:46:
   include/net/ndisc.h:214:6: error: 'struct net_device' has no member named 'ndisc_ops'
      dev->ndisc_ops->send_na(dev, daddr, solicited_addr, router,
         ^
   In file included from include/uapi/linux/stddef.h:1:0,
                    from include/linux/stddef.h:4,
                    from include/uapi/linux/posix_types.h:4,
                    from include/uapi/linux/types.h:13,
                    from include/linux/types.h:5,
                    from include/linux/list.h:4,
                    from include/linux/module.h:9,
                    from drivers/net/ethernet/atheros/alx/main.c:35:
   include/net/ndisc.h: In function 'ndisc_recv_na':
   include/net/ndisc.h:220:21: error: 'struct net_device' has no member named 'ndisc_ops'
     if (likely(skb->dev->ndisc_ops->recv_na))
                        ^
   include/linux/compiler.h:169:40: note: in definition of macro 'likely'
    # define likely(x) __builtin_expect(!!(x), 1)
                                           ^
   In file included from include/net/ipv6.h:20:0,
                    from include/net/inetpeer.h:15,
                    from include/net/route.h:28,
                    from include/net/ip.h:31,
                    from include/net/ip6_checksum.h:31,
                    from drivers/net/ethernet/atheros/alx/main.c:46:
   include/net/ndisc.h:221:11: error: 'struct net_device' has no member named 'ndisc_ops'
      skb->dev->ndisc_ops->recv_na(skb);
              ^
   In file included from include/uapi/linux/stddef.h:1:0,
                    from include/linux/stddef.h:4,
                    from include/uapi/linux/posix_types.h:4,
                    from include/uapi/linux/types.h:13,
                    from include/linux/types.h:5,
                    from include/linux/list.h:4,
                    from include/linux/module.h:9,
                    from drivers/net/ethernet/atheros/alx/main.c:35:
   include/net/ndisc.h: In function 'ndisc_send_ns':
   include/net/ndisc.h:229:16: error: 'struct net_device' has no member named 'ndisc_ops'
     if (likely(dev->ndisc_ops->send_ns))
                   ^
   include/linux/compiler.h:169:40: note: in definition of macro 'likely'
    # define likely(x) __builtin_expect(!!(x), 1)
                                           ^
   In file included from include/net/ipv6.h:20:0,
                    from include/net/inetpeer.h:15,
                    from include/net/route.h:28,
                    from include/net/ip.h:31,
                    from include/net/ip6_checksum.h:31,
                    from drivers/net/ethernet/atheros/alx/main.c:46:
   include/net/ndisc.h:230:6: error: 'struct net_device' has no member named 'ndisc_ops'
      dev->ndisc_ops->send_ns(dev, solicit, daddr, saddr);
         ^
   In file included from include/uapi/linux/stddef.h:1:0,
                    from include/linux/stddef.h:4,
                    from include/uapi/linux/posix_types.h:4,
                    from include/uapi/linux/types.h:13,
                    from include/linux/types.h:5,
                    from include/linux/list.h:4,
                    from include/linux/module.h:9,
                    from drivers/net/ethernet/atheros/alx/main.c:35:
   include/net/ndisc.h: In function 'ndisc_recv_ns':
   include/net/ndisc.h:235:21: error: 'struct net_device' has no member named 'ndisc_ops'
     if (likely(skb->dev->ndisc_ops->recv_ns))
                        ^
   include/linux/compiler.h:169:40: note: in definition of macro 'likely'
    # define likely(x) __builtin_expect(!!(x), 1)
                                           ^
   In file included from include/net/ipv6.h:20:0,
                    from include/net/inetpeer.h:15,
                    from include/net/route.h:28,
                    from include/net/ip.h:31,
                    from include/net/ip6_checksum.h:31,
                    from drivers/net/ethernet/atheros/alx/main.c:46:
   include/net/ndisc.h:236:11: error: 'struct net_device' has no member named 'ndisc_ops'
      skb->dev->ndisc_ops->recv_ns(skb);
              ^
--
   In file included from include/uapi/linux/stddef.h:1:0,
                    from include/linux/stddef.h:4,
                    from include/uapi/linux/posix_types.h:4,
                    from include/uapi/linux/types.h:13,
                    from include/linux/types.h:5,
                    from include/linux/list.h:4,
                    from include/linux/module.h:9,
                    from net/ipv6/ndisc.c:32:
   include/net/ndisc.h: In function 'ndisc_is_useropt':
>> include/net/ndisc.h:201:16: error: 'const struct net_device' has no member named 'ndisc_ops'
     if (likely(dev->ndisc_ops->is_useropt))
                   ^
   include/linux/compiler.h:169:40: note: in definition of macro 'likely'
    # define likely(x) __builtin_expect(!!(x), 1)
                                           ^
   In file included from include/net/ipv6.h:20:0,
                    from net/ipv6/ndisc.c:57:
   include/net/ndisc.h:202:13: error: 'const struct net_device' has no member named 'ndisc_ops'
      return dev->ndisc_ops->is_useropt(opt);
                ^
   In file included from include/uapi/linux/stddef.h:1:0,
                    from include/linux/stddef.h:4,
                    from include/uapi/linux/posix_types.h:4,
                    from include/uapi/linux/types.h:13,
                    from include/linux/types.h:5,
                    from include/linux/list.h:4,
                    from include/linux/module.h:9,
                    from net/ipv6/ndisc.c:32:
   include/net/ndisc.h: In function 'ndisc_send_na':
>> include/net/ndisc.h:213:16: error: 'struct net_device' has no member named 'ndisc_ops'
     if (likely(dev->ndisc_ops->send_na))
                   ^
   include/linux/compiler.h:169:40: note: in definition of macro 'likely'
    # define likely(x) __builtin_expect(!!(x), 1)
                                           ^
   In file included from include/net/ipv6.h:20:0,
                    from net/ipv6/ndisc.c:57:
   include/net/ndisc.h:214:6: error: 'struct net_device' has no member named 'ndisc_ops'
      dev->ndisc_ops->send_na(dev, daddr, solicited_addr, router,
         ^
   In file included from include/uapi/linux/stddef.h:1:0,
                    from include/linux/stddef.h:4,
                    from include/uapi/linux/posix_types.h:4,
                    from include/uapi/linux/types.h:13,
                    from include/linux/types.h:5,
                    from include/linux/list.h:4,
                    from include/linux/module.h:9,
                    from net/ipv6/ndisc.c:32:
   include/net/ndisc.h: In function 'ndisc_recv_na':
   include/net/ndisc.h:220:21: error: 'struct net_device' has no member named 'ndisc_ops'
     if (likely(skb->dev->ndisc_ops->recv_na))
                        ^
   include/linux/compiler.h:169:40: note: in definition of macro 'likely'
    # define likely(x) __builtin_expect(!!(x), 1)
                                           ^
   In file included from include/net/ipv6.h:20:0,
                    from net/ipv6/ndisc.c:57:
   include/net/ndisc.h:221:11: error: 'struct net_device' has no member named 'ndisc_ops'
      skb->dev->ndisc_ops->recv_na(skb);
              ^
   In file included from include/uapi/linux/stddef.h:1:0,
                    from include/linux/stddef.h:4,
                    from include/uapi/linux/posix_types.h:4,
                    from include/uapi/linux/types.h:13,
                    from include/linux/types.h:5,
                    from include/linux/list.h:4,
                    from include/linux/module.h:9,
                    from net/ipv6/ndisc.c:32:
   include/net/ndisc.h: In function 'ndisc_send_ns':
   include/net/ndisc.h:229:16: error: 'struct net_device' has no member named 'ndisc_ops'
     if (likely(dev->ndisc_ops->send_ns))
                   ^
   include/linux/compiler.h:169:40: note: in definition of macro 'likely'
    # define likely(x) __builtin_expect(!!(x), 1)
                                           ^
   In file included from include/net/ipv6.h:20:0,
                    from net/ipv6/ndisc.c:57:
   include/net/ndisc.h:230:6: error: 'struct net_device' has no member named 'ndisc_ops'
      dev->ndisc_ops->send_ns(dev, solicit, daddr, saddr);
         ^
   In file included from include/uapi/linux/stddef.h:1:0,
                    from include/linux/stddef.h:4,
                    from include/uapi/linux/posix_types.h:4,
                    from include/uapi/linux/types.h:13,
                    from include/linux/types.h:5,
                    from include/linux/list.h:4,
                    from include/linux/module.h:9,
                    from net/ipv6/ndisc.c:32:
   include/net/ndisc.h: In function 'ndisc_recv_ns':
   include/net/ndisc.h:235:21: error: 'struct net_device' has no member named 'ndisc_ops'
     if (likely(skb->dev->ndisc_ops->recv_ns))
                        ^
   include/linux/compiler.h:169:40: note: in definition of macro 'likely'
    # define likely(x) __builtin_expect(!!(x), 1)
                                           ^
   In file included from include/net/ipv6.h:20:0,
                    from net/ipv6/ndisc.c:57:
   include/net/ndisc.h:236:11: error: 'struct net_device' has no member named 'ndisc_ops'
      skb->dev->ndisc_ops->recv_ns(skb);
              ^
   net/ipv6/ndisc.c: In function 'ip6_register_ndisc_ops':
>> net/ipv6/ndisc.c:1804:10: error: 'struct net_device' has no member named 'ndisc_ops'
      if (dev->ndisc_ops) {
             ^
   net/ipv6/ndisc.c:1809:7: error: 'struct net_device' has no member named 'ndisc_ops'
       dev->ndisc_ops = &ip6_ndisc_ops;
          ^
   In file included from include/net/ipv6.h:20:0,
                    from net/ipv6/ndisc.c:57:
   include/net/ndisc.h: In function 'ndisc_is_useropt':
>> include/net/ndisc.h:205:1: warning: control reaches end of non-void function [-Wreturn-type]
    }
    ^

vim +201 include/net/ndisc.h

   195		void	(*recv_ns)(struct sk_buff *skb);
   196	};
   197	
   198	static inline int ndisc_is_useropt(const struct net_device *dev,
   199					   struct nd_opt_hdr *opt)
   200	{
 > 201		if (likely(dev->ndisc_ops->is_useropt))
   202			return dev->ndisc_ops->is_useropt(opt);
   203		else
   204			return 0;
 > 205	}
   206	
   207	static inline void ndisc_send_na(struct net_device *dev,
   208					 const struct in6_addr *daddr,
   209					 const struct in6_addr *solicited_addr,
   210					 bool router, bool solicited, bool override,
   211					 bool inc_opt)
   212	{
 > 213		if (likely(dev->ndisc_ops->send_na))
   214			dev->ndisc_ops->send_na(dev, daddr, solicited_addr, router,
   215						solicited, override, inc_opt);
   216	}
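
A note on the error class above. The log does not show how ndisc_ops is declared in struct net_device, so the following is only a sketch of one common cause: a struct member declared under a config guard while the inline helpers that dereference it are compiled unconditionally. The macro HYPOTHETICAL_CONFIG_IPV6, the sample callback, and the cut-down struct definitions are assumptions made for illustration and are not taken from the patch. The NULL check on the ops pointer is also an addition of this sketch.

```c
#include <stddef.h>

/* Hypothetical stand-in for a Kconfig symbol; remove the define to build the stub path. */
#define HYPOTHETICAL_CONFIG_IPV6 1

/* Cut-down stand-ins for the kernel types, just enough to compile in userspace. */
struct nd_opt_hdr {
	unsigned char nd_opt_type;
	unsigned char nd_opt_len;
};

struct ndisc_ops {
	int (*is_useropt)(struct nd_opt_hdr *opt);
};

struct net_device {
	const char *name;
#ifdef HYPOTHETICAL_CONFIG_IPV6
	const struct ndisc_ops *ndisc_ops;	/* member exists only under the guard */
#endif
};

/* Illustrative callback: treats option type 25 as a user option. */
static int sample_is_useropt(struct nd_opt_hdr *opt)
{
	return opt->nd_opt_type == 25;
}

#ifdef HYPOTHETICAL_CONFIG_IPV6
/* Helper shares the member's guard, so it never references a missing field. */
static inline int ndisc_is_useropt(const struct net_device *dev,
				   struct nd_opt_hdr *opt)
{
	if (dev->ndisc_ops && dev->ndisc_ops->is_useropt)
		return dev->ndisc_ops->is_useropt(opt);
	return 0;
}
#else
/* Stub for builds where the member is compiled out. */
static inline int ndisc_is_useropt(const struct net_device *dev,
				   struct nd_opt_hdr *opt)
{
	(void)dev;
	(void)opt;
	return 0;
}
#endif
```

With both branches ending in an explicit return, the helper also avoids the -Wreturn-type warning, which in the report above appears to be a follow-on of the failed member access rather than a separate problem.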

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/octet-stream, Size: 54118 bytes --]

^ permalink raw reply

* Re: [PATCH net-next V2 05/11] net/mlx5e: Support RX multi-packet WQE (Striding RQ)
From: Eric Dumazet @ 2016-04-18 12:48 UTC (permalink / raw)
  To: Saeed Mahameed
  Cc: David S. Miller, netdev, Or Gerlitz, Tal Alon, Tariq Toukan,
	Eran Ben Elisha, Achiad Shochat
In-Reply-To: <1460939371.10638.97.camel@edumazet-glaptop3.roam.corp.google.com>

On Sun, 2016-04-17 at 17:29 -0700, Eric Dumazet wrote:

> 
> If really you need to allocate physically contiguous memory, have you
> considered converting the order-5 pages into 32 order-0 ones ?

Search for split_page() call sites for examples.

^ permalink raw reply

* Re: [patch -next] udp: fix if statement in SIOCINQ ioctl
From: Eric Dumazet @ 2016-04-18 12:19 UTC (permalink / raw)
  To: Dan Carpenter
  Cc: David S. Miller, Willem de Bruijn, Alexey Kuznetsov, James Morris,
	Hideaki YOSHIFUJI, Patrick McHardy, netdev, linux-kernel,
	kernel-janitors
In-Reply-To: <20160418084449.GA12410@mwanda>

On Mon, 2016-04-18 at 11:44 +0300, Dan Carpenter wrote:
> We deleted a line of code and accidentally made the "return put_user()"
> part of the if statement when it's supposed to be unconditional.
> 
> Fixes: 9f9a45beaa96 ('udp: do not expect udp headers on ioctl SIOCINQ')
> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>

Acked-by: Eric Dumazet <edumazet@google.com>
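
For readers following along, the pattern described in the commit message can be modelled in userspace roughly as below. This is not the udp ioctl code itself: put_user_model(), siocinq_buggy(), siocinq_fixed() and the -1 fall-through value are hypothetical names and values chosen for the sketch.

```c
/* Hypothetical stand-in for put_user(): stores the value, returns 0 on success. */
static int put_user_model(int value, int *dst)
{
	*dst = value;
	return 0;
}

/*
 * Buggy shape: the statement that used to be the body of the if was
 * deleted, so the following return silently became the body instead.
 */
static int siocinq_buggy(int amount, int *arg)
{
	if (amount)
		/* (deleted statement used to live here) */
		return put_user_model(amount, arg);

	return -1;	/* stand-in for falling through to unrelated code */
}

/* Fixed shape: the result is written back unconditionally. */
static int siocinq_fixed(int amount, int *arg)
{
	return put_user_model(amount, arg);
}
```

When amount is zero, the buggy variant never writes to arg and returns the fall-through value, while the fixed variant reports the zero length to the caller.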

^ permalink raw reply

* RE: Poorer networking performance in later kernels?
From: Butler, Peter @ 2016-04-18 12:02 UTC (permalink / raw)
  To: Rick Jones, netdev@vger.kernel.org
In-Reply-To: <57116D21.8000807@hpe.com>

Just a minor clarification to my last paragraph ("When I perform the tests in this setup the 3.4.2 and 4.4.0 kernels perform identically - just as you would expect.").  By this I don't mean that the 3.4.2 and 4.4.0 kernels on the VMs perform identically to the 3.4.2 and 4.4.0 kernels on the actual hardware; what I mean is that in VM-land the original problem is essentially gone, as I get the same throughput with either kernel.



-----Original Message-----
From: Butler, Peter 
Sent: April-18-16 7:28 AM
To: 'Rick Jones' <rick.jones2@hpe.com>; netdev@vger.kernel.org
Subject: RE: Poorer networking performance in later kernels?

Hi Rick

Thanks for the reply.

Here is some hardware information, as requested (the two systems are identical, and are communicating with one another over a 10Gb full-duplex Ethernet backplane):

- processor type: Intel(R) Xeon(R) CPU C5528  @ 2.13GHz
- NIC: Intel 82599EB 10GB XAUI/BX4
- NIC driver: ixgbe version 4.2.1-k (part of 4.4.0 kernel)

As for the buffer sizes, those rather large ones work fine for us with the 3.4.2 kernel.  However, for the sake of being complete, I have re-tried the tests with the 'standard' 4.4.0 kernel parameters for all /proc/sys/net/* values, and the results still were extremely poor in comparison to the 3.4.2 kernel.

Our MTU is actually just the standard 1500 bytes, however the message size was chosen to mimic actual traffic which will be segmented.

I ran ethtool -k (indeed I checked all ethtool parameters, not just those via -k) and the only real difference I could find was in "large-receive-offload" which was ON in 3.4.2 but OFF in 4.4.0 - so I used ethtool to change this to match the 3.4.2 settings and re-ran the tests.  Didn't help :-(   It's possible of course that I have missed a parameter here or there in comparing the 3.4.2 setup to the 4.4.0 setup.  I also tried running the ethtool config with the latest and greatest ethtool version (4.5) on the 4.4.0 kernel, as compared to the old 3.1 version on our 3.4.2 kernel.

I performed the TCP_RR test as requested and in that case, the results are much more comparable.  The old kernel is still better, but now only around 10% better as opposed to 2-3x better.

However I still contend that the *_STREAM tests are giving us more pertinent data, since our product application is only getting 1/3 to 1/2 of the performance on the 4.4.0 kernel, and this is the same thing I see when I use netperf to test.

One other note: I tried running our 3.4.2 and 4.4.0 kernels in a VM environment on my workstation, so as to take the 'real' production hardware out of the equation.  When I perform the tests in this setup the 3.4.2 and 4.4.0 kernels perform identically - just as you would expect.

Any other ideas?  What can I be missing here?

Peter




-----Original Message-----
From: Rick Jones [mailto:rick.jones2@hpe.com]
Sent: April-15-16 6:37 PM
To: Butler, Peter <pbutler@sonusnet.com>; netdev@vger.kernel.org
Subject: Re: Poorer networking performance in later kernels?

On 04/15/2016 02:02 PM, Butler, Peter wrote:
> (Please keep me CC'd to all comments/responses)
>
> I've tried a kernel upgrade from 3.4.2 to 4.4.0 and see a marked drop 
> in networking performance.  Nothing was changed on the test systems, 
> other than the kernel itself (and kernel modules).  The identical 
> .config used to build the 3.4.2 kernel was brought over into the
> 4.4.0 kernel source tree, and any configuration differences (e.g. new 
> parameters, etc.) were taken as default values.
>
> The testing was performed on the same actual hardware for both kernel 
> versions (i.e. take the existing 3.4.2 physical setup, simply boot 
> into the (new) kernel and run the same test).  The netperf utility was 
> used for benchmarking and the testing was always performed on idle 
> systems.
>
> TCP testing yielded the following results, where the 4.4.0 kernel only 
> got about 1/2 of the throughput:
>

>        Recv     Send       Send                          Utilization       Service Demand
>        Socket   Socket     Message Elapsed               Send     Recv     Send    Recv
>        Size     Size       Size    Time       Throughput local    remote   local   remote
>        bytes    bytes      bytes   secs.      10^6bits/s % S      % S      us/KB   us/KB
>
> 3.4.2 13631488 13631488   8952    30.01      9370.29    10.14    6.50     0.709   0.454
> 4.4.0 13631488 13631488   8952    30.02      5314.03    9.14     14.31    1.127   1.765
>
> SCTP testing yielded the following results, where the 4.4.0 kernel only got about 1/3 of the throughput:
>
>        Recv     Send       Send                          Utilization       Service Demand
>        Socket   Socket     Message Elapsed               Send     Recv     Send    Recv
>        Size     Size       Size    Time       Throughput local    remote   local   remote
>        bytes    bytes      bytes   secs.      10^6bits/s  % S     % S      us/KB   us/KB
>
> 3.4.2 13631488 13631488   8952    30.00      2306.22    13.87    13.19    3.941   3.747
> 4.4.0 13631488 13631488   8952    30.01       882.74    16.86    19.14    12.516  14.210
>
> The same tests were performed a multitude of times, and are always 
> consistent (within a few percent).  I've also tried playing with 
> various run-time kernel parameters (/proc/sys/kernel/net/...) on the
> 4.4.0 kernel to alleviate the issue but have had no success at all.
>
> I'm at a loss as to what could possibly account for such a discrepancy...
>

I suspect I am not alone in being curious about the CPU(s) present in the systems and the model/whatnot of the NIC being used.  I'm also curious as to why you have what at first glance seem like absurdly large socket buffer sizes.

That said, it looks like you have some Really Big (tm) increases in service demand.  Many more CPU cycles being consumed per KB of data transferred.
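
To put numbers on that, the ratios can be worked out directly from the quoted netperf rows. The helper below is only an arithmetic sketch; the struct and function names are made up for illustration, and the figures are copied from the tables above.

```c
/* One netperf result row: throughput in 10^6 bits/s, service demand in us/KB. */
struct netperf_row {
	double throughput;
	double sd_send;
	double sd_recv;
};

/* Figures copied from the quoted TCP and SCTP tables. */
static const struct netperf_row tcp_342  = { 9370.29,  0.709,  0.454 };
static const struct netperf_row tcp_440  = { 5314.03,  1.127,  1.765 };
static const struct netperf_row sctp_342 = { 2306.22,  3.941,  3.747 };
static const struct netperf_row sctp_440 = {  882.74, 12.516, 14.210 };

/* Ratio of the newer kernel's figure to the older kernel's figure. */
static double ratio(double new_val, double old_val)
{
	return new_val / old_val;
}
```

ratio(tcp_440.throughput, tcp_342.throughput) comes out at roughly 0.57 and the SCTP equivalent at roughly 0.38, while the receive-side service demand grows by a factor of close to 3.9 for TCP and 3.8 for SCTP. The send-side service demand grows by about 1.6x for TCP and 3.2x for SCTP.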

Your message size makes me wonder if you were using a 9000 byte MTU.

Perhaps in the move from 3.4.2 to 4.4.0 you lost some or all of the stateless offloads for your NIC(s)?  Running ethtool -k <interface> on both ends under both kernels might be good.

Also, if you did have a 9000 byte MTU under 3.4.2 are you certain you still had it under 4.4.0?

It would (at least to me) also be interesting to run a TCP_RR test comparing the two kernels.  TCP_RR (at least with the default request/response size of one byte) doesn't really care about stateless offloads or MTUs and could show how much difference there is in basic path length (or I suppose in interrupt coalescing behaviour if the NIC in question has a mildly dodgy heuristic for such things).

happy benchmarking,

rick jones

^ permalink raw reply

* Re: Poorer networking performance in later kernels?
From: Eric Dumazet @ 2016-04-18 12:16 UTC (permalink / raw)
  To: Butler, Peter; +Cc: netdev@vger.kernel.org
In-Reply-To: <1460759582.10638.79.camel@edumazet-glaptop3.roam.corp.google.com>

On Fri, 2016-04-15 at 15:33 -0700, Eric Dumazet wrote:
> On Fri, 2016-04-15 at 21:02 +0000, Butler, Peter wrote:
> > (Please keep me CC'd to all comments/responses)
> > 
> > I've tried a kernel upgrade from 3.4.2 to 4.4.0 and see a marked drop in networking performance.  Nothing was changed on the test systems, other than the kernel itself (and kernel modules).  The identical .config used to build the 3.4.2 kernel was brought over into the 4.4.0 kernel source tree, and any configuration differences (e.g. new parameters, etc.) were taken as default values.
> > 
> > The testing was performed on the same actual hardware for both kernel versions (i.e. take the existing 3.4.2 physical setup, simply boot into the (new) kernel and run the same test).  The netperf utility was used for benchmarking and the testing was always performed on idle systems.
> > 
> > TCP testing yielded the following results, where the 4.4.0 kernel only got about 1/2 of the throughput:
> > 
> >       Recv     Send       Send                          Utilization       Service Demand
> >       Socket   Socket     Message Elapsed               Send     Recv     Send    Recv
> >       Size     Size       Size    Time       Throughput local    remote   local   remote
> >       bytes    bytes      bytes   secs.      10^6bits/s % S      % S      us/KB   us/KB
> > 
> > 3.4.2 13631488 13631488   8952    30.01      9370.29    10.14    6.50     0.709   0.454
> > 4.4.0 13631488 13631488   8952    30.02      5314.03    9.14     14.31    1.127   1.765
> > 
> > SCTP testing yielded the following results, where the 4.4.0 kernel only got about 1/3 of the throughput:
> > 
> >       Recv     Send       Send                          Utilization       Service Demand
> >       Socket   Socket     Message Elapsed               Send     Recv     Send    Recv
> >       Size     Size       Size    Time       Throughput local    remote   local   remote
> >       bytes    bytes      bytes   secs.      10^6bits/s  % S     % S      us/KB   us/KB
> > 
> > 3.4.2 13631488 13631488   8952    30.00      2306.22    13.87    13.19    3.941   3.747
> > 4.4.0 13631488 13631488   8952    30.01       882.74    16.86    19.14    12.516  14.210
> > 
> > The same tests were performed a multitude of times, and are always consistent (within a few percent).  I've also tried playing with various run-time kernel parameters (/proc/sys/kernel/net/...) on the 4.4.0 kernel to alleviate the issue but have had no success at all.
> > 
> > I'm at a loss as to what could possibly account for such a discrepancy...
> 
> Maybe new kernel is faster and you have drops somewhere ?
> 
> nstat >/dev/null
> netperf -H ...
> nstat
> 
> Would help
> 

Are you receiving my mails, or simply ignoring them ?

Thanks.

^ permalink raw reply

