Netdev List
* Re: [PATCH] [PATCH v3 1/2] 8139too: Fix the lack of pci_disable_device
From: David Miller @ 2014-12-22 21:32 UTC (permalink / raw)
  To: baijiaju1990; +Cc: sergei.shtylyov, netdev, jgarzik
In-Reply-To: <1419208132-23657-1-git-send-email-baijiaju1990@163.com>

From: Jia-Ju Bai <baijiaju1990@163.com>
Date: Mon, 22 Dec 2014 08:28:52 +0800

> For linux-3.18.0
> When pci_request_regions fails in rtl8139_init_board, pci_disable_device
> is not called to disable the device that was enabled by pci_enable_device,
> because disable_dev_on_err is not set to 1.
> This patch fixes this problem.
> 
> Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com>

Applied.
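
The error path described in the quoted changelog can be modelled in user space as below. This is only a sketch: struct pci_dev_model and init_board_model() are hypothetical stand-ins, and it assumes the fix amounts to setting disable_dev_on_err right after the enable step succeeds, which the changelog implies but does not show.

```c
/* Hypothetical stand-in for struct pci_dev; tracks only the enable state. */
struct pci_dev_model {
	int enabled;
};

/*
 * Models the rtl8139_init_board() error path. request_regions_rc stands in
 * for the return value of pci_request_regions().
 */
int init_board_model(struct pci_dev_model *pdev, int request_regions_rc)
{
	int disable_dev_on_err = 0;
	int rc;

	pdev->enabled = 1;	/* stand-in for a successful pci_enable_device() */
	disable_dev_on_err = 1;	/* assumed fix: any later failure undoes the enable */

	rc = request_regions_rc;
	if (rc)
		goto err_out;
	return 0;

err_out:
	if (disable_dev_on_err)
		pdev->enabled = 0;	/* stand-in for pci_disable_device() */
	return rc;
}
```

With the flag set only after the region request, as the changelog describes for the old code, the err_out branch would leave the device enabled.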


* Re: [PATCH 10/28] net: ethernet: stmicro: stmmac: drop owner assignment from platform_drivers
From: David Miller @ 2014-12-22 21:31 UTC (permalink / raw)
  To: wsa; +Cc: linux-kernel, peppe.cavallaro, netdev
In-Reply-To: <1419196495-9626-11-git-send-email-wsa@the-dreams.de>

From: Wolfram Sang <wsa@the-dreams.de>
Date: Sun, 21 Dec 2014 22:14:31 +0100

> This platform_driver does not need to set an owner, it will be populated by the
> driver core.
> 
> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>

Applied, thanks.


* Re: [RESEND PATCH] net: s6gmac: remove driver
From: David Miller @ 2014-12-22 21:29 UTC (permalink / raw)
  To: dg; +Cc: netdev, linux-kernel
In-Reply-To: <1419190059-16501-1-git-send-email-dg@emlix.com>

From: Daniel Glöckner <dg@emlix.com>
Date: Sun, 21 Dec 2014 20:27:39 +0100

> The s6000 Xtensa support has been removed from the kernel in
> 4006e565e1500db4. There are no other chips using this driver.
> 
> While the Mentor/Alcatel PE-MCXMAC IP core is also used in other
> designs (Freescale Gianfar/UCC, QLogic NetXen, Solarflare, Agere
> ET-1310, Netlogic XLR/XLS), none of these use this driver as it
> heavily depends on the s6000 DMA engine. In fact, there is no
> code sharing across any of the aforementioned devices.
> 
> Signed-off-by: Daniel Glöckner <dg@emlix.com>

Applied, thanks.


* Re: [PATCH net] net/core: Handle csum for CHECKSUM_COMPLETE VXLAN forwarding
From: David Miller @ 2014-12-22 21:21 UTC (permalink / raw)
  To: jay.vosburgh; +Cc: netdev
In-Reply-To: <2983.1419031920@famine>

From: Jay Vosburgh <jay.vosburgh@canonical.com>
Date: Fri, 19 Dec 2014 15:32:00 -0800

> 	The receive code is careful to update the skb->csum, except in
> __dev_forward_skb, as called by dev_forward_skb.  __dev_forward_skb
> calls eth_type_trans, which in turn calls skb_pull_inline(skb, ETH_HLEN)
> to skip over the Ethernet header, but does not update skb->csum when
> doing so.

Hmmm, wasn't there some discussion about doing the skb_postpull_rcsum()
in eth_type_trans()?

But I guess that won't work for non-encapsulated ethernet cases?
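
The arithmetic behind a post-pull checksum fixup can be sketched in user space as follows. With CHECKSUM_COMPLETE the stored sum covers the whole packet, so skipping the Ethernet header means subtracting the one's-complement sum of those 14 bytes. All function names here are hypothetical models, not the kernel helpers, and the sketch assumes packet-sized buffers.

```c
#include <stddef.h>
#include <stdint.h>

#define ETH_HLEN_MODEL 14 /* Ethernet header length, same value as ETH_HLEN */

/* Fold a 32-bit accumulator into a 16-bit one's-complement sum. */
uint16_t csum_fold_model(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/*
 * One's-complement sum of len bytes, read as big-endian 16-bit words with
 * an odd trailing byte zero-padded. The 32-bit accumulator is enough for
 * packet-sized buffers only.
 */
uint16_t csum_partial_model(const uint8_t *buf, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
	if (len & 1)
		sum += (uint32_t)buf[len - 1] << 8;
	return csum_fold_model(sum);
}

/* One's-complement subtraction: a - b is a + ~b. */
uint16_t csum_sub_model(uint16_t a, uint16_t b)
{
	return csum_fold_model((uint32_t)a + (uint16_t)~b);
}

/*
 * Sum of the packet after pulling ETH_HLEN_MODEL bytes, derived from the
 * whole-packet sum instead of re-summing the payload.
 */
uint16_t csum_after_pull_model(const uint8_t *pkt, uint16_t full_csum)
{
	return csum_sub_model(full_csum,
			      csum_partial_model(pkt, ETH_HLEN_MODEL));
}
```

Because the header length is even, the payload keeps its 16-bit word alignment and the adjusted value equals a fresh sum over the payload for any packet with a non-zero payload.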


* Re: Stable fixes for batman-adv
From: David Miller @ 2014-12-22 21:14 UTC (permalink / raw)
  To: antonio; +Cc: sven, netdev
In-Reply-To: <54969541.2060500@meshcoding.com>

From: Antonio Quartulli <antonio@meshcoding.com>
Date: Sun, 21 Dec 2014 10:39:13 +0100

> David, please merge these fixes and queue them for stable even if this
> is not the standard pull request we usually do.

Fair enough, will do.


* Re: [PATCH net]tg3: tg3_disable_ints using uninitialized mailbox value to disable interrupts
From: David Miller @ 2014-12-22 21:13 UTC (permalink / raw)
  To: prashant
  Cc: netdev, linux-pci, nholland, marcelo.leitner, bhelgaas,
	rajatxjain, mchan
In-Reply-To: <1419106577-12891-1-git-send-email-prashant@broadcom.com>

From: Prashant Sreedharan <prashant@broadcom.com>
Date: Sat, 20 Dec 2014 12:16:17 -0800

> During driver load in tg3_init_one, if the driver detects DMA activity before
> initializing the chip, tg3_halt is called. As part of tg3_halt, interrupts are
> disabled using routine tg3_disable_ints. This routine was using mailbox value
> which was not initialized (default value is 0). As a result driver was writing
> 0x00000001 to pci config space register 0, which is the vendor id / device id.
> 
> This driver bug was exposed because of the commit a7877b17a667 (PCI: Check only
> the Vendor ID to identify Configuration Request Retry). Also this issue is only
> seen in older generation chipsets like 5722 because config space write to offset
> 0 from driver is possible. The newer generation chips ignore writes to offset 0.
> Also without commit a7877b17a667, for these older chips when a GRC reset is
> issued the Bootcode would reprogram the vendor id/device id, which is the reason
> this bug was masked earlier.
> 
> Fixed by initializing the interrupt mailbox registers before calling tg3_halt.
> 
> Please queue for -stable.
> 
> Reported-by: Nils Holland <nholland@tisys.org>
> Reported-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
> Signed-off-by: Prashant Sreedharan <prashant@broadcom.com>
> Signed-off-by: Michael Chan <mchan@broadcom.com>

Applied and queued up for -stable, thanks.


* Re: [PATCH net] in6: fix conflict with glibc
From: David Miller @ 2014-12-22 21:13 UTC (permalink / raw)
  To: stephen-OTpzqLSitTUnbdJkjeBofR2eb7JE58TQ
  Cc: hannes-tFNcAqjVMyqKXQKiL6tip0B+6BGkLq7r,
	florent.fourcot-Pj3lBMu8rt9bbU8NOSLlsg,
	netdev-u79uwXL29TY76Z2rM5mHXA, linux-api-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <20141220121549.7c1b8aad@urahara>

From: Stephen Hemminger <stephen-OTpzqLSitTUnbdJkjeBofR2eb7JE58TQ@public.gmane.org>
Date: Sat, 20 Dec 2014 12:15:49 -0800

> Resolve conflicts between glibc definition of IPV6 socket options
> and those defined in Linux headers. Looks like earlier efforts to
> solve this did not cover all the definitions.
> 
> It resolves warnings during iproute2 build. 
> Please consider for stable as well.
> 
> Signed-off-by: Stephen Hemminger <stephen-OTpzqLSitTUnbdJkjeBofR2eb7JE58TQ@public.gmane.org>
> 
> ---
> Patch against -net tree

Applied and queued up for -stable, thanks Stephen.


* Re: [PATCH net-next] hyperv: Fix some variable name typos in send-buffer init/revoke
From: David Miller @ 2014-12-22 21:11 UTC (permalink / raw)
  To: haiyangz; +Cc: olaf, netdev, jasowang, driverdev-devel, linux-kernel
In-Reply-To: <1419042318-7221-1-git-send-email-haiyangz@microsoft.com>

From: Haiyang Zhang <haiyangz@microsoft.com>
Date: Fri, 19 Dec 2014 18:25:18 -0800

> The changed names are union fields with the same size, so the existing code
> still works. But, we now update these variables to the correct names.
> 
> Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
> Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>

Applied.


* Re: virtio_net: Fix napi poll list corruption
From: David Miller @ 2014-12-22 21:10 UTC (permalink / raw)
  To: herbert
  Cc: david.vrabel, netdev, xen-devel, konrad.wilk, boris.ostrovsky,
	edumazet
In-Reply-To: <20141220002327.GA31975@gondor.apana.org.au>

From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Sat, 20 Dec 2014 11:23:27 +1100

> The commit d75b1ade567ffab085e8adbbdacf0092d10cd09c (net: less
> interrupt masking in NAPI) breaks virtio_net in an insidious way.
> 
> It is now required that if the entire budget is consumed when poll
> returns, the napi poll_list must remain empty.  However, like some
> other drivers virtio_net tries to do a last-ditch check and if
> there is more work it will call napi_schedule and then immediately
> process some of this new work.  Should the entire budget be consumed
> while processing such new work then we will violate the new caller
> contract.
> 
> This patch fixes this by not touching any work when we reschedule
> in virtio_net.
> 
> The worst part of this bug is that the list corruption causes other
> napi users to be moved off-list.  In my case I was chasing a stall
> in IPsec (IPsec uses netif_rx) and I only belatedly realised that it
> was virtio_net which caused the stall even though the virtio_net
> poll was still functioning perfectly after IPsec stalled.
> 
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

Applied, thanks Herbert.


* OOPS: unable to handle kernel NULL pointer dereference in unix_detach_fds
From: Wolfgang Walter @ 2014-12-22 20:52 UTC (permalink / raw)
  To: netdev

Hello,

today I saw this oops with kernel 3.18.1:

Dec 22 14:30:26 hobel kernel: [   37.476849] BUG: unable to handle kernel NULL pointer dereference at 0000000000000001
Dec 22 14:30:26 hobel kernel: [   37.476856] IP: [<ffffffff81a7c70c>] unix_detach_fds.isra.28+0x1c/0x50
Dec 22 14:30:26 hobel kernel: [   37.476862] PGD a90ab067 PUD a8d2c067 PMD 0 
Dec 22 14:30:26 hobel kernel: [   37.476866] Oops: 0000 [#1] PREEMPT SMP 
Dec 22 14:30:26 hobel kernel: [   37.476868] Modules linked in: bnep bluetooth rfkill vboxpci(O) vboxnetadp(O) vboxnetflt(O) vboxdrv(O) nfsd xt_conntrack xt_socket xt_helper nf_conntrack_tftp nf_conntrack_snmp nf_conntrack_broadcast nf_conntrack_sip nf_conntrack_irc nf_conntrack_h323 nf_conntrack_ftp nf_conntrack_ipv6 nf_defrag_ipv6 nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack uvcvideo videobuf2_vmalloc snd_usb_audio snd_hwdep snd_usbmidi_lib snd_rawmidi nvidia(PO) i2c_piix4
Dec 22 14:30:26 hobel kernel: [   37.476887] CPU: 4 PID: 2234 Comm: plasma-desktop Tainted: P           O   3.18.1-ei+6.9 #2
Dec 22 14:30:26 hobel kernel: [   37.476890] Hardware name: System manufacturer System Product Name/M4A89GTD-PRO/USB3, BIOS 1104    03/12/2010
Dec 22 14:30:26 hobel kernel: [   37.476892] task: ffff88022aa2e320 ti: ffff8800aebec000 task.ti: ffff8800aebec000
Dec 22 14:30:26 hobel kernel: [   37.476894] RIP: 0010:[<ffffffff81a7c70c>]  [<ffffffff81a7c70c>] unix_detach_fds.isra.28+0x1c/0x50
Dec 22 14:30:26 hobel kernel: [   37.476897] RSP: 0018:ffff8800aebefb68  EFLAGS: 00010202
Dec 22 14:30:26 hobel kernel: [   37.476898] RAX: 0000000000000001 RBX: 0000000040000040 RCX: 00000000000000f0
Dec 22 14:30:26 hobel kernel: [   37.476900] RDX: 0000000000000006 RSI: ffff880206110300 RDI: ffff8800aebefc18
Dec 22 14:30:26 hobel kernel: [   37.476901] RBP: ffff8800aebefb78 R08: 0000000000000000 R09: ffff88022f001700
Dec 22 14:30:26 hobel kernel: [   37.476902] R10: ffff880206110300 R11: 00000000000000f0 R12: ffff8800aebefc18
Dec 22 14:30:26 hobel kernel: [   37.476904] R13: 0000000000000000 R14: ffff8802076a2710 R15: ffff880206110300
Dec 22 14:30:26 hobel kernel: [   37.476906] FS:  00007f3a37991800(0000) GS:ffff880237d00000(0000) knlGS:0000000000000000
Dec 22 14:30:26 hobel kernel: [   37.476907] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Dec 22 14:30:26 hobel kernel: [   37.476909] CR2: 0000000000000001 CR3: 00000000a9190000 CR4: 00000000000007e0
Dec 22 14:30:26 hobel kernel: [   37.476910] Stack:
Dec 22 14:30:26 hobel kernel: [   37.476911]  0000000040000040 ffff8802076a2680 ffff8800aebefc88 ffffffff81a7caec
Dec 22 14:30:26 hobel kernel: [   37.476914]  ffff8800a8d0ba01 ffff88022aa2e320 ffff8800aebefc48 ffff88022aa2e320
Dec 22 14:30:26 hobel kernel: [   37.476916]  ffff8802076a2938 000000010000000d 0000000000000000 0000000700000000
Dec 22 14:30:26 hobel kernel: [   37.476918] Call Trace:
Dec 22 14:30:26 hobel kernel: [   37.476922]  [<ffffffff81a7caec>] unix_stream_recvmsg+0x32c/0x870
Dec 22 14:30:26 hobel kernel: [   37.476925]  [<ffffffff81a7dedb>] ? unix_stream_sendmsg+0x3fb/0x430
Dec 22 14:30:26 hobel kernel: [   37.476928]  [<ffffffff819416ae>] sock_recvmsg+0x6e/0x90
Dec 22 14:30:26 hobel kernel: [   37.476931]  [<ffffffff81201280>] ? poll_select_copy_remaining+0x130/0x130
Dec 22 14:30:26 hobel kernel: [   37.476933]  [<ffffffff81201280>] ? poll_select_copy_remaining+0x130/0x130
Dec 22 14:30:26 hobel kernel: [   37.476936]  [<ffffffff819501a2>] ? verify_iovec+0x42/0xd0
Dec 22 14:30:26 hobel kernel: [   37.476938]  [<ffffffff819427b6>] ___sys_recvmsg+0x106/0x2e0
Dec 22 14:30:26 hobel kernel: [   37.476941]  [<ffffffff81201280>] ? poll_select_copy_remaining+0x130/0x130
Dec 22 14:30:26 hobel kernel: [   37.476943]  [<ffffffff8123078b>] ? eventfd_ctx_read+0x1ab/0x220
Dec 22 14:30:26 hobel kernel: [   37.476946]  [<ffffffff81133e70>] ? wake_up_process+0x50/0x50
Dec 22 14:30:26 hobel kernel: [   37.476949]  [<ffffffff8120a704>] ? __fget+0x74/0xb0
Dec 22 14:30:26 hobel kernel: [   37.476952]  [<ffffffff8120a77f>] ? __fget_light+0x1f/0x80
Dec 22 14:30:26 hobel kernel: [   37.476954]  [<ffffffff81943e84>] __sys_recvmsg+0x44/0x80
Dec 22 14:30:26 hobel kernel: [   37.476957]  [<ffffffff81943ecd>] SyS_recvmsg+0xd/0x20
Dec 22 14:30:26 hobel kernel: [   37.476961]  [<ffffffff81b34c69>] system_call_fastpath+0x12/0x17
Dec 22 14:30:26 hobel kernel: [   37.476962] Code: ff ff 48 8b 5d f0 4c 8b 65 f8 c9 c3 0f 1f 00 55 48 8b 46 38 48 89 e5 41 54 49 89 fc 53 48 89 07 48 c7 46 38 00 00 00 00 48 8b 07 <0f> bf 18 ff cb 79 09 eb 1b 0f 1f 00 49 8b 04 24 48 63 d3 ff cb 
Dec 22 14:30:26 hobel kernel: [   37.476983] RIP  [<ffffffff81a7c70c>] unix_detach_fds.isra.28+0x1c/0x50
Dec 22 14:30:26 hobel kernel: [   37.476985]  RSP <ffff8800aebefb68>
Dec 22 14:30:26 hobel kernel: [   37.476986] CR2: 0000000000000001
Dec 22 14:30:26 hobel kernel: [   37.476988] ---[ end trace 05163e1048a933ce ]---

Regards,
-- 
Wolfgang Walter
Studentenwerk München
Anstalt des öffentlichen Rechts


* Re: [RFC PATCH net-next] tun: support retrieving multiple packets in a single read with IFF_MULTI_READ
From: Dave Taht @ 2014-12-22 20:51 UTC (permalink / raw)
  To: Alex Gartrell
  Cc: Herbert Xu, jasonwang, davem@davemloft.net,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	Michael S. Tsirkin, herbert, kernel-team
In-Reply-To: <54987C9F.5070103@fb.com>

On Mon, Dec 22, 2014 at 12:18 PM, Alex Gartrell <agartrell@fb.com> wrote:
> Hey Herbert,
>
> Thanks for getting back to me
>
> On 12/22/14 4:09 AM, Herbert Xu wrote:
>>
>> As tun already has a socket interface can we do this through
>> recvmmsg?
>
>
> This just presents an easier interface (IMHO) for accomplishing that. And I
> say easier because I was unable to figure out how to do it the recvmmsg way.

the recvmsg and recvmmsg calls and layers above them could use an abstraction
that allows for better passing of per packet header information to applications
in the QUIC and webrtc age.

> While fully aware that this makes me look like an idiot, I have to admit

I have lost several days of hair to *msg calls. So have the authors of
multipath mosh (which is WAY cool, btw: https://github.com/boutier/mosh).

So, no, trying and failing does not make you an idiot. Trying at all does
make you a mite crazy, however. :)

> that I've tried and failed to figure out how to get a socket fd out of the
> tun device.
>
> The regular fd doesn't work (which is obvious when you look at the
> implementation sock_from_file), there's a tun_get_socket function but it's
> only referenced by a single file, and none of the ioctl's jump out at me as
> doing anything to enable this behavior.  Additionally, tuntap.txt makes no
> mention of sockets specifically.
>
> FWIW, I don't feel strongly that IFF_MULTI_READ is the right way to do this
> either.

I have been thinking about how to implement multiple ways of eliminating
serialization dependencies in userspace vpns using fair queueing, and
multithreading...
(with splitting out the seqno + address across an entire /64)

... and excess latency with multipacket reads, and then codeling
internal queues (as many vpns bottleneck on the encap and encode step,
allowing for packets to accumulate in the OS recv buffer)

See:

http://www.tinc-vpn.org/pipermail/tinc-devel/2014-December/000680.html

And especially:

https://plus.google.com/u/0/107942175615993706558/posts/QWPWLoGMtrm

and after having just suffered through making that work with recvmsg,
was dreading trying to make it work with recvmmsg.

It appears that one of the core crazy ideas (listening on an entire
/64) doesn't work with the existing APIs, and this new interface would
help? Or recvmmsg could be generalized? Or?


-- 
Dave Täht

http://www.bufferbloat.net/projects/bloat/wiki/Upcoming_Talks


* Re: [PATCH RESEND] stmmac: Don't init ptp again when resume from suspend/hibernation
From: David Miller @ 2014-12-22 20:42 UTC (permalink / raw)
  To: chenhc; +Cc: peppe.cavallaro, srinivas.kandagatla, netdev
In-Reply-To: <1418999898-10298-1-git-send-email-chenhc@lemote.com>

From: Huacai Chen <chenhc@lemote.com>
Date: Fri, 19 Dec 2014 22:38:18 +0800

> Both stmmac_open() and stmmac_resume() call stmmac_hw_setup(), and
> stmmac_hw_setup() calls stmmac_init_ptp() unconditionally. However, only
> stmmac_release() calls stmmac_release_ptp(). Since stmmac_suspend()
> doesn't call stmmac_release_ptp(), stmmac_resume() also needn't call
> stmmac_init_ptp().
> 
> This patch also fixes a "scheduling while atomic" problem when resuming
> from suspend/hibernation, because stmmac_init_ptp() will trigger
> scheduling while stmmac_resume() holds a spinlock.
> 
> Callgraph of "scheduling while atomic":
> stmmac_resume() --> stmmac_hw_setup() --> stmmac_init_ptp() -->
> stmmac_ptp_register() --> ptp_clock_register() --> device_create() -->
> device_create_groups_vargs() --> device_add() --> devtmpfs_create_node()
> --> wait_for_common() --> schedule_timeout() --> __schedule()
> 
> Signed-off-by: Huacai Chen <chenhc@lemote.com>

Applied, thank you.
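
One way to picture the change is a setup routine that takes a flag saying whether PTP should be initialized. The user-space model below is a sketch under that assumption; the struct and all function names are hypothetical, and the applied patch may be structured differently.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the driver's private state. */
struct stmmac_model {
	int ptp_registered;	/* number of live PTP clock registrations */
};

/* Shared setup path; PTP registration is optional here. */
void hw_setup_model(struct stmmac_model *priv, bool init_ptp)
{
	/* MAC and DMA programming would happen here in a real driver. */
	if (init_ptp)
		priv->ptp_registered++;
}

/* open registers PTP, release is the only path that unregisters it. */
void open_model(struct stmmac_model *priv)
{
	hw_setup_model(priv, true);
}

void release_model(struct stmmac_model *priv)
{
	priv->ptp_registered--;
}

/* suspend leaves PTP registered, so resume must not register it again. */
void suspend_model(struct stmmac_model *priv)
{
	(void)priv;
}

void resume_model(struct stmmac_model *priv)
{
	hw_setup_model(priv, false);
}
```

After open, suspend and resume the model holds exactly one registration, and a single release brings it back to zero. Skipping the registration on resume also keeps the sleeping call chain from the changelog out of the spinlocked section.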


* Re: [PATCH] Fixed TPACKET V3 to signal poll when block is closed rather than every packet
From: David Miller @ 2014-12-22 20:41 UTC (permalink / raw)
  To: dan; +Cc: netdev, linux-kernel
In-Reply-To: <1418960965-29522-1-git-send-email-dan@dcollins.co.nz>

From: Dan Collins <dan@dcollins.co.nz>
Date: Fri, 19 Dec 2014 16:49:25 +1300

> Make TPACKET_V3 signal poll when block is closed rather than for every
> packet. Side effect is that poll will be signaled when block retire
> timer expires which didn't previously happen. Issue was visible when
> sending packets at a very low frequency such that all blocks are retired
> before packets are received by TPACKET_V3. This caused avoidable packet
> loss. The fix ensures that the signal is sent when blocks are closed
> which covers the normal path where the block is filled as well as the
> path where the timer expires. The case where a block is filled without
> moving to the next block (ie. all blocks are full) will still cause poll
> to be signaled.
> 
> Signed-off-by: Dan Collins <dan@dcollins.co.nz>

Applied, thanks.


* Re: [RFC PATCH net-next] tun: support retrieving multiple packets in a single read with IFF_MULTI_READ
From: Alex Gartrell @ 2014-12-22 20:18 UTC (permalink / raw)
  To: Herbert Xu
  Cc: jasonwang, davem, netdev, linux-kernel, mst, herbert, kernel-team
In-Reply-To: <20141222120957.GA21319@gondor.apana.org.au>

Hey Herbert,

Thanks for getting back to me

On 12/22/14 4:09 AM, Herbert Xu wrote:
> As tun already has a socket interface can we do this through
> recvmmsg?

This just presents an easier interface (IMHO) for accomplishing that.
And I say easier because I was unable to figure out how to do it the
recvmmsg way.

While fully aware that this makes me look like an idiot, I have to admit 
that I've tried and failed to figure out how to get a socket fd out of 
the tun device.

The regular fd doesn't work (which is obvious when you look at the 
implementation sock_from_file), there's a tun_get_socket function but 
it's only referenced by a single file, and none of the ioctl's jump out 
at me as doing anything to enable this behavior.  Additionally, 
tuntap.txt makes no mention of sockets specifically.

FWIW, I don't feel strongly that IFF_MULTI_READ is the right way to do 
this either.

Thanks,
-- 
Alex Gartrell <agartrell@fb.com>


* Re: [PATCH for 3.19] rtlwifi: Fix error when accessing unmapped memory in skb
From: Eric Biggers @ 2014-12-22 19:48 UTC (permalink / raw)
  To: Larry Finger; +Cc: kvalo, linux-wireless, netdev, Stable
In-Reply-To: <1419269826-12552-1-git-send-email-Larry.Finger@lwfinger.net>

Is this really the same behavior as 3.17?  In 3.17, allocating the new skb is
one of the first things the interrupt handler does, and if that fails it drops
the packet and keeps using the old skb.  In this proposal, it's only after the
packet has been received and the old skb has been freed that a new one is
allocated.  And if that fails --- well, what are you expecting to happen
exactly?


* Re: [RFC PATCH 01/17] fib_trie: Update usage stats to be percpu instead of global variables
From: Cong Wang @ 2014-12-22 19:21 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: netdev
In-Reply-To: <20141222174058.1119.55120.stgit@ahduyck-vm-fedora20>

On Mon, Dec 22, 2014 at 9:40 AM, Alexander Duyck
<alexander.h.duyck@redhat.com> wrote:
> @@ -1388,7 +1388,7 @@ static int check_leaf(struct fib_table *tb, struct trie *t, struct leaf *l,
>                 }
>
>  #ifdef CONFIG_IP_FIB_TRIE_STATS
> -               t->stats.semantic_match_miss++;
> +               this_cpu_ptr(t->stats->semantic_match_miss);


You mean this_cpu_inc() ?
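
The difference between the two helpers can be shown with a small user-space model: taking the address of a per-CPU slot changes nothing, while the increment form actually bumps the counter. Everything below is a hypothetical stand-in (a fixed array replaces real per-CPU storage, and the CPU index is passed explicitly).

```c
#define NR_CPUS_MODEL 4	/* hypothetical CPU count for the model */

struct trie_use_stats_model {
	unsigned int semantic_match_miss;
};

/* One stats slot per CPU; the kernel would use real per-CPU storage. */
struct trie_model {
	struct trie_use_stats_model stats[NR_CPUS_MODEL];
};

/* Models this_cpu_ptr(): yields the slot's address and modifies nothing. */
unsigned int *cpu_ptr_model(struct trie_model *t, int cpu)
{
	return &t->stats[cpu].semantic_match_miss;
}

/* Models this_cpu_inc(): bumps the counter in the given CPU's slot. */
void cpu_inc_model(struct trie_model *t, int cpu)
{
	t->stats[cpu].semantic_match_miss++;
}

/* Readers sum all slots to obtain the global count. */
unsigned int total_semantic_match_miss(const struct trie_model *t)
{
	unsigned int sum = 0;
	int cpu;

	for (cpu = 0; cpu < NR_CPUS_MODEL; cpu++)
		sum += t->stats[cpu].semantic_match_miss;
	return sum;
}
```

Calling cpu_ptr_model() alone, as the quoted hunk effectively does, leaves the total unchanged.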


* [PATCH iproute2] ip: allow ip address show to list addresses with certain flags not being set
From: Heiner Kallweit @ 2014-12-22 19:18 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev

Sometimes it's needed to have "ip address show" list only addresses
with certain flags not being set, e.g. in network scripts.
As an example one might want to exclude addresses in "tentative"
or "deprecated" state.

Support listing addresses with flags tentative, deprecated, dadfailed
not being set by prefixing the respective flag with a minus.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
---
 ip/ipaddress.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/ip/ipaddress.c b/ip/ipaddress.c
index 221ae1f..a071572 100644
--- a/ip/ipaddress.c
+++ b/ip/ipaddress.c
@@ -80,7 +80,7 @@ static void usage(void)
 	fprintf(stderr, "SCOPE-ID := [ host | link | global | NUMBER ]\n");
 	fprintf(stderr, "FLAG-LIST := [ FLAG-LIST ] FLAG\n");
 	fprintf(stderr, "FLAG  := [ permanent | dynamic | secondary | primary |\n");
-	fprintf(stderr, "           tentative | deprecated | dadfailed | temporary |\n");
+	fprintf(stderr, "           [-]tentative | [-]deprecated | [-]dadfailed | temporary |\n");
 	fprintf(stderr, "           CONFFLAG-LIST ]\n");
 	fprintf(stderr, "CONFFLAG-LIST := [ CONFFLAG-LIST ] CONFFLAG\n");
 	fprintf(stderr, "CONFFLAG  := [ home | nodad | mngtmpaddr | noprefixroute ]\n");
@@ -1261,9 +1261,15 @@ static int ipaddr_list_flush_or_save(int argc, char **argv, int action)
 		} else if (strcmp(*argv, "tentative") == 0) {
 			filter.flags |= IFA_F_TENTATIVE;
 			filter.flagmask |= IFA_F_TENTATIVE;
+		} else if (strcmp(*argv, "-tentative") == 0) {
+			filter.flags &= ~IFA_F_TENTATIVE;
+			filter.flagmask |= IFA_F_TENTATIVE;
 		} else if (strcmp(*argv, "deprecated") == 0) {
 			filter.flags |= IFA_F_DEPRECATED;
 			filter.flagmask |= IFA_F_DEPRECATED;
+		} else if (strcmp(*argv, "-deprecated") == 0) {
+			filter.flags &= ~IFA_F_DEPRECATED;
+			filter.flagmask |= IFA_F_DEPRECATED;
 		} else if (strcmp(*argv, "home") == 0) {
 			filter.flags |= IFA_F_HOMEADDRESS;
 			filter.flagmask |= IFA_F_HOMEADDRESS;
@@ -1279,6 +1285,9 @@ static int ipaddr_list_flush_or_save(int argc, char **argv, int action)
 		} else if (strcmp(*argv, "dadfailed") == 0) {
 			filter.flags |= IFA_F_DADFAILED;
 			filter.flagmask |= IFA_F_DADFAILED;
+		} else if (strcmp(*argv, "-dadfailed") == 0) {
+			filter.flags &= ~IFA_F_DADFAILED;
+			filter.flagmask |= IFA_F_DADFAILED;
 		} else if (strcmp(*argv, "label") == 0) {
 			NEXT_ARG();
 			filter.label = *argv;
-- 
2.2.1
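
The flags/flagmask pair set up by the hunks above can be read as "required value" and "bits that matter". The sketch below shows that reading in isolation. The struct and function names are hypothetical, the flag values are copied from <linux/if_addr.h> so the example stands alone, and the XOR-and-mask match test is an assumption about how the filter is consumed rather than something shown in this patch.

```c
#include <stdbool.h>

/* Values mirror <linux/if_addr.h>; redefined so the sketch stands alone. */
#define MODEL_IFA_F_DADFAILED  0x08
#define MODEL_IFA_F_DEPRECATED 0x20
#define MODEL_IFA_F_TENTATIVE  0x40

struct addr_filter_model {
	unsigned int flags;	/* required value of each flag we care about */
	unsigned int flagmask;	/* which flags we care about at all */
};

/* "tentative": the flag must be set. */
void require_flag(struct addr_filter_model *f, unsigned int flag)
{
	f->flags |= flag;
	f->flagmask |= flag;
}

/* "-tentative": the flag must be clear. */
void exclude_flag(struct addr_filter_model *f, unsigned int flag)
{
	f->flags &= ~flag;
	f->flagmask |= flag;
}

/* An address matches when every masked flag has the required value. */
bool addr_matches(const struct addr_filter_model *f, unsigned int ifa_flags)
{
	return ((f->flags ^ ifa_flags) & f->flagmask) == 0;
}
```

A filter built with exclude_flag() for tentative and deprecated accepts an address carrying only dadfailed, since that bit is outside the mask, and rejects any address with either excluded bit set.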


* Re: [PATCH RFC] ipw2200: select CFG80211_WEXT
From: Johannes Berg @ 2014-12-22 19:13 UTC (permalink / raw)
  To: Paul Bolle
  Cc: Stanislav Yakovlev, Kalle Valo, linux-wireless, netdev,
	linux-kernel
In-Reply-To: <1419271817.2317.12.camel@tiscali.nl>

On Mon, 2014-12-22 at 19:10 +0100, Paul Bolle wrote:
> Commit 24a0aa212ee2 ("cfg80211: make WEXT compatibility unselectable")
> made it impossible to depend on CFG80211_WEXT. It does still allow to
> select that symbol. (Yes, the commit summary is confusing.)
> 
> So make IPW2200 select CFG80211_WEXT, so that the ipw2200 driver can be
> built again.
> 
> Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
> ---
> Johannes,
> 
> Building v3.19-rc1 for an outdated ThinkPad X41 left me without the
> ipw2200 driver. It turns out this trivial patch is all that's needed to
> make ipw2200 buildable again.
> 
> (A similar patch would be needed for the drivers behind Kconfig symbol
> HERMES. Ie, orinoco and friends.)
> 
> I must admit that I do not fully understand your commit. (How was
> CFG80211_WEXT "marked for deprecation and removal for a little more than
> two years"?) There's some terminology confusion: what you call "select"
> I tend to call "set". Anyhow, your commit basically disables building
> ipw2200 (and apparently orinoco and friends)?
> 
> Was that your intention?
>  
>  drivers/net/wireless/ipw2x00/Kconfig | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/wireless/ipw2x00/Kconfig b/drivers/net/wireless/ipw2x00/Kconfig
> index 91c0cb3c368e..21de4fe6cf2d 100644
> --- a/drivers/net/wireless/ipw2x00/Kconfig
> +++ b/drivers/net/wireless/ipw2x00/Kconfig
> @@ -65,7 +65,8 @@ config IPW2100_DEBUG
>  
>  config IPW2200
>  	tristate "Intel PRO/Wireless 2200BG and 2915ABG Network Connection"
> -	depends on PCI && CFG80211 && CFG80211_WEXT
> +	depends on PCI && CFG80211
> +	select CFG80211_WEXT
>  	select WIRELESS_EXT

I didn't realize that this driver actually depended on this symbol - I
had been under the impression that those would still use regular wext
(WIRELESS_EXT) only.

So yeah - this makes sense. FWIW, by "selectable" I meant by the user.

johannes


* Re: [RFC PATCH 00/17] fib_trie: Reduce time spent in fib_table_lookup by 35 to 75%
From: Dave Taht @ 2014-12-22 18:59 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: netdev@vger.kernel.org
In-Reply-To: <54986514.1010502@redhat.com>

On Mon, Dec 22, 2014 at 10:38 AM, Alexander Duyck
<alexander.h.duyck@redhat.com> wrote:

>> impressive. I think. But I don't quite understand what you mean by a depth of 7?

> He means the deepest path in the fib_trie datastructure that is
> holding the routing table.

Thank you (dave and alexander)  for the clarification!

re:

cat /proc/net/fib_triestat
cat /proc/net/fib_trie

are these a newish feature or merely compiled out in openwrt?

I have to admit I would love to know what your improvements do for a
large (e.g. BGP) table in these regards. Regrettably they don't let me
near those with pre-production code....


-- 
Dave Täht

http://www.bufferbloat.net/projects/bloat/wiki/Upcoming_Talks


* Re: [RFC PATCH 02/17] fib_trie: Make leaf and tnode more uniform
From: Alexander Duyck @ 2014-12-22 18:55 UTC (permalink / raw)
  To: David Miller; +Cc: netdev
In-Reply-To: <20141222.133353.2244861758408916536.davem@davemloft.net>


On 12/22/2014 10:33 AM, David Miller wrote:
> From: Alexander Duyck <alexander.h.duyck@redhat.com>
> Date: Mon, 22 Dec 2014 09:41:05 -0800
>
>> -#define IS_TNODE(n) (!(n->parent & T_LEAF))
>> -#define IS_LEAF(n) (n->parent & T_LEAF)
>> +struct tnode {
>> +	t_key key;
>> +	unsigned char bits;		/* 2log(KEYLENGTH) bits needed */
>> +	unsigned char pos;		/* 2log(KEYLENGTH) bits needed */
>> +	struct tnode __rcu *parent;
>> +	union {
>> +		struct rcu_head rcu;
>> +		struct tnode *tnode_free;
>> +	};
>> +	unsigned int full_children;	/* KEYLENGTH bits needed */
>> +	unsigned int empty_children;	/* KEYLENGTH bits needed */
>> +	struct rt_trie_node __rcu *child[0];
>> +};
> I wonder if we can compress this even further.
>
> The full_children and empty_children can probably both be a u16, right?
> If so, you can stick at least one of them after 'bits' and 'pos' and
> thus save 4 bytes on 32b.

The thing is I don't think we would actually be saving any space. The 
slub allocator will round us up anyway.  On a 32b system the size is 28B 
if I recall correctly.  Dropping it to 24B would mean only a 2 child 
node could be allocated from the 32B slab.  Anything larger than that it 
wouldn't matter.
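
The sizes under discussion can be checked with a user-space model in which every pointer is replaced by a 32-bit integer. This is a sketch only: the struct names and ptr32_t are hypothetical, and the two-word array stands in for the rcu_head/tnode_free union.

```c
#include <stdint.h>

typedef uint32_t ptr32_t;	/* stand-in for a pointer on a 32-bit system */

/* Layout from the patch, with pointers modelled as 32-bit values. */
struct tnode32 {
	uint32_t key;
	unsigned char bits;
	unsigned char pos;
	ptr32_t parent;
	ptr32_t rcu[2];		/* models the union; rcu_head is two pointers */
	uint32_t full_children;
	uint32_t empty_children;
	ptr32_t child[];
};

/* Suggested variant: 16-bit counters, one placed in the hole after pos. */
struct tnode32_u16 {
	uint32_t key;
	unsigned char bits;
	unsigned char pos;
	uint16_t full_children;
	ptr32_t parent;
	ptr32_t rcu[2];
	uint16_t empty_children;
	ptr32_t child[];
};

/* Bytes needed for a node header plus 2^bits child pointers. */
unsigned int tnode_alloc_size(unsigned int header, unsigned int bits)
{
	return header + (1u << bits) * (unsigned int)sizeof(ptr32_t);
}
```

In this model the header shrinks from 28 to 24 bytes, and only the two-child node (bits = 1) lands exactly on 32 bytes with the smaller header.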

My real concern with all of this is the fact that we have to do 2 
separate memory reads per node, one for the key info and one for the 
child pointer.  I really think we need to get this down to 1 in order to 
get there, but the overhead is the tricky part for that. What I would 
look at doing is splitting the tnode into two parts. One would be a key 
vector (key, pos, bits, seq) paired with a pointer to either a 
tnode_info or leaf_info, the other would be something like a tnode_info 
(rcu, parent pointer, full_children, empty_children, key vector 
array[0]) that provides a means of backtracing and stores the nodes.  
The problem is that it makes insertion/deletion and backtracking more
complicated and doubles (64b) or quadruples (32b) the memory needed. As
such, I am still just throwing the idea around and haven't gotten into
implementation yet.

- Alex


* Re: [RFC PATCH 00/17] fib_trie: Reduce time spent in fib_table_lookup by 35 to 75%
From: David Miller @ 2014-12-22 18:53 UTC (permalink / raw)
  To: dave.taht; +Cc: alexander.h.duyck, netdev
In-Reply-To: <CAA93jw7Zz8_wToKOou42mXhgunZypQ6f=oGEHZk2K6CbrsrNqg@mail.gmail.com>

From: Dave Taht <dave.taht@gmail.com>
Date: Mon, 22 Dec 2014 10:08:09 -0800

> impressive. I think. But I don't quite understand what you mean by a depth of 7?

He means the deepest path in the fib_trie datastructure that is
holding the routing table.


* Re: [PATCH iproute2] ip lib: Added shorter timestamp option
From: Vadim Kochan @ 2014-12-22 18:37 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Vadim Kochan, netdev
In-Reply-To: <20141222101212.04877275@urahara>

On Mon, Dec 22, 2014 at 10:12:12AM -0800, Stephen Hemminger wrote:
> On Thu, 11 Dec 2014 10:12:06 +0200
> Vadim Kochan <vadim4j@gmail.com> wrote:
> 
> > From: Vadim Kochan <vadim4j@gmail.com>
> > 
> > Added another timestamp format to look like more logging info:
> > 
> > [Dec 01 01:46:20.675589] 2: enp0s25: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default
> >     link/ether 3c:97:0e:a3:86:2e brd ff:ff:ff:ff:ff:ff
> > 
> > Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
> 
> I would suggest supporting RFC3339 which is a standard for timestamps instead.
> 
> [2014-12-22T01:46:20.1012] ...
OK, thanks, I will look into it.
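
A minimal sketch of an RFC 3339 formatter in plain C, for reference. The function name and signature are hypothetical and not part of iproute2. It prints UTC with a trailing "Z" and microsecond precision.

```c
#include <stdio.h>
#include <time.h>

/*
 * Format a UTC timestamp as RFC 3339 with microseconds, for example
 * 2014-12-22T01:46:20.675589Z. Returns 0 on success, -1 on failure.
 */
int format_rfc3339_utc(time_t sec, long usec, char *buf, size_t len)
{
	char date[32];
	struct tm *tm = gmtime(&sec);	/* not thread-safe; fine for a sketch */
	int n;

	if (!tm || usec < 0 || usec > 999999)
		return -1;
	if (strftime(date, sizeof(date), "%Y-%m-%dT%H:%M:%S", tm) == 0)
		return -1;
	n = snprintf(buf, len, "%s.%06ldZ", date, usec);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

A caller that wants local time would swap gmtime() for localtime() and emit a numeric offset instead of "Z".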


* Re: [RFC PATCH 00/17] fib_trie: Reduce time spent in fib_table_lookup by 35 to 75%
From: Alexander Duyck @ 2014-12-22 18:38 UTC (permalink / raw)
  To: Dave Taht; +Cc: netdev@vger.kernel.org
In-Reply-To: <CAA93jw7Zz8_wToKOou42mXhgunZypQ6f=oGEHZk2K6CbrsrNqg@mail.gmail.com>

On 12/22/2014 10:08 AM, Dave Taht wrote:
> impressive. I think. But I don't quite understand what you mean by a depth of 7?
>
> What did your routing table actually look like?
>
> For example, my ipv4 routing table looks like this at the moment.
>
> What is depth? searching from /32, /31, /30, /29?

What I was referring to is the local trie since all routing ends up 
having to do a failed lookup there before we can look in the main trie.

What I did is populate a list of addresses such that I had 15 bits set 
in the lower 16 of the address.  By doing that it allowed me to stress 
the trie pretty hard since it can only inflate out to a tnode with 8 
children.

My routing test was from my ixgbe which was on a 10.0.0.X address to a 
dummy address at 192.168.255.253 which resulted in it being routed to 
the dummy interface.  The ixgbe to local receive was to address 
192.168.255.254.

Below is all the info for my trie.

- Alex

[root@ahduyck-vm-fedora20 net]# cat /proc/net/fib_triestat
Basic info: size of leaf: 40 bytes, size of tnode: 40 bytes.
Main:
     Aver depth:     2.80
     Max depth:      4
     Leaves:         15
     Prefixes:       15
     Internal nodes: 6
       2: 2  3: 4
     Pointers: 40
Null ptrs: 20
Total size: 2  kB

Local:
     Aver depth:     3.87
     Max depth:      7
     Leaves:         49
     Prefixes:       50
     Internal nodes: 24
       1: 10  2: 6  3: 7  4: 1
     Pointers: 116
Null ptrs: 44
Total size: 7  kB
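
As a cross-check of the numbers above: each "bits: count" pair under
"Internal nodes" is a histogram bucket, and a tnode that indexes b bits has
2^b child slots, so the "Pointers" line should equal the sum of
count * 2^b. A small sketch of that arithmetic, with struct and function
names that are purely illustrative:

```c
#include <assert.h>
#include <stddef.h>

/* One histogram entry: `count` tnodes that each index `bits` bits. */
struct tnode_bucket {
	unsigned int bits;
	unsigned int count;
};

/* Sum of child slots over all buckets: count * 2^bits. */
static unsigned long total_child_pointers(const struct tnode_bucket *b,
					  size_t n)
{
	unsigned long total = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		assert(b[i].bits < 32);
		total += (unsigned long)b[i].count << b[i].bits;
	}
	return total;
}
```

For Main, 2*4 + 4*8 = 40; for Local, 10*2 + 6*4 + 7*8 + 1*16 = 116. Both
match the reported "Pointers" values.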

[root@ahduyck-vm-fedora20 ~]# cat /proc/net/fib_trie
Main:
   +-- 0.0.0.0/0 3 0 5
      +-- 0.0.0.0/4 2 0 2
         |-- 0.0.0.0
            /0 universe UNICAST
         +-- 10.0.0.0/22 3 0 4
            |-- 10.0.0.0
               /24 link UNICAST
            |-- 10.0.1.0
               /24 link UNICAST
            |-- 10.0.2.0
               /24 link UNICAST
            |-- 10.0.3.0
               /24 link UNICAST
      |-- 169.254.0.0
         /16 link UNICAST
      +-- 192.168.0.0/16 3 1 4
         |-- 192.168.122.0
            /24 link UNICAST
         |-- 192.168.128.0
            /24 link UNICAST
         |-- 192.168.192.0
            /24 link UNICAST
         +-- 192.168.224.0/19 3 1 4
            |-- 192.168.224.0
               /24 link UNICAST
            |-- 192.168.240.0
               /24 link UNICAST
            |-- 192.168.248.0
               /24 link UNICAST
            +-- 192.168.252.0/22 2 0 1
               |-- 192.168.252.0
                  /24 link UNICAST
               |-- 192.168.254.0
                  /24 link UNICAST
               |-- 192.168.255.0
                  /24 link UNICAST
Local:
   +-- 0.0.0.0/0 3 0 5
      +-- 10.0.0.0/22 4 0 4
         |-- 10.0.0.0
            /32 link BROADCAST
         |-- 10.0.0.128
            /32 host LOCAL
         |-- 10.0.0.255
            /32 link BROADCAST
         |-- 10.0.1.0
            /32 link BROADCAST
         |-- 10.0.1.128
            /32 host LOCAL
         |-- 10.0.1.255
            /32 link BROADCAST
         |-- 10.0.2.0
            /32 link BROADCAST
         |-- 10.0.2.128
            /32 host LOCAL
         |-- 10.0.2.255
            /32 link BROADCAST
         |-- 10.0.3.0
            /32 link BROADCAST
         |-- 10.0.3.128
            /32 host LOCAL
         |-- 10.0.3.255
            /32 link BROADCAST
      +-- 127.0.0.0/8 2 0 2
         +-- 127.0.0.0/31 1 0 0
            |-- 127.0.0.0
               /32 link BROADCAST
               /8 host LOCAL
            |-- 127.0.0.1
               /32 host LOCAL
         |-- 127.255.255.255
            /32 link BROADCAST
      +-- 192.168.0.0/16 3 1 4
         +-- 192.168.122.0/24 2 0 1
            |-- 192.168.122.0
               /32 link BROADCAST
            |-- 192.168.122.173
               /32 host LOCAL
            |-- 192.168.122.255
               /32 link BROADCAST
         +-- 192.168.128.0/24 2 0 2
            +-- 192.168.128.0/31 1 0 0
               |-- 192.168.128.0
                  /32 link BROADCAST
               |-- 192.168.128.1
                  /32 host LOCAL
            |-- 192.168.128.255
               /32 link BROADCAST
         +-- 192.168.192.0/24 2 0 2
            +-- 192.168.192.0/31 1 0 0
               |-- 192.168.192.0
                  /32 link BROADCAST
               |-- 192.168.192.1
                  /32 host LOCAL
            |-- 192.168.192.255
               /32 link BROADCAST
         +-- 192.168.224.0/19 3 2 4
            +-- 192.168.224.0/24 2 0 2
               +-- 192.168.224.0/31 1 0 0
                  |-- 192.168.224.0
                     /32 link BROADCAST
                  |-- 192.168.224.1
                     /32 host LOCAL
               |-- 192.168.224.255
                  /32 link BROADCAST
            +-- 192.168.240.0/24 2 0 2
               +-- 192.168.240.0/31 1 0 0
                  |-- 192.168.240.0
                     /32 link BROADCAST
                  |-- 192.168.240.1
                     /32 host LOCAL
               |-- 192.168.240.255
                  /32 link BROADCAST
            +-- 192.168.248.0/22 3 0 6
               +-- 192.168.248.0/31 1 0 0
                  |-- 192.168.248.0
                     /32 link BROADCAST
                  |-- 192.168.248.1
                     /32 host LOCAL
               |-- 192.168.248.255
                  /32 link BROADCAST
            +-- 192.168.252.0/22 3 1 2
               +-- 192.168.252.0/31 1 0 0
                  |-- 192.168.252.0
                     /32 link BROADCAST
                  |-- 192.168.252.1
                     /32 host LOCAL
               |-- 192.168.252.255
                  /32 link BROADCAST
               +-- 192.168.254.0/31 1 0 0
                  |-- 192.168.254.0
                     /32 link BROADCAST
                  |-- 192.168.254.1
                     /32 host LOCAL
               |-- 192.168.254.255
                  /32 link BROADCAST
               +-- 192.168.255.0/31 1 0 0
                  |-- 192.168.255.0
                     /32 link BROADCAST
                  |-- 192.168.255.1
                     /32 host LOCAL
               +-- 192.168.255.128/25 3 1 4
                  |-- 192.168.255.128
                     /32 host LOCAL
                  |-- 192.168.255.192
                     /32 host LOCAL
                  |-- 192.168.255.224
                     /32 host LOCAL
                  +-- 192.168.255.240/28 3 1 4
                     |-- 192.168.255.240
                        /32 host LOCAL
                     |-- 192.168.255.248
                        /32 host LOCAL
                     |-- 192.168.255.252
                        /32 host LOCAL
                     +-- 192.168.255.254/31 1 0 0
                        |-- 192.168.255.254
                           /32 host LOCAL
                        |-- 192.168.255.255
                           /32 link BROADCAST

* [PATCH 1/1 net-next] netfilter: remove unnecessary sizeof(char)
From: Fabian Frederick @ 2014-12-22 18:36 UTC (permalink / raw)
  To: linux-kernel
  Cc: davem, joe, Fabian Frederick, Pablo Neira Ayuso, Patrick McHardy,
	Jozsef Kadlecsik, netfilter-devel, coreteam, netdev

sizeof(char) is always 1.

Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Fabian Frederick <fabf@skynet.be>
---
 net/netfilter/nf_log.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/netfilter/nf_log.c b/net/netfilter/nf_log.c
index 43c926c..1191f66 100644
--- a/net/netfilter/nf_log.c
+++ b/net/netfilter/nf_log.c
@@ -426,7 +426,7 @@ static int netfilter_log_sysctl_init(struct net *net)
 				nf_log_sysctl_fnames[i];
 			nf_log_sysctl_table[i].data = NULL;
 			nf_log_sysctl_table[i].maxlen =
-				NFLOGGER_NAME_LEN * sizeof(char);
+				NFLOGGER_NAME_LEN;
 			nf_log_sysctl_table[i].mode = 0644;
 			nf_log_sysctl_table[i].proc_handler =
 				nf_log_proc_dostring;
-- 
1.9.1
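
The C standard defines sizeof(char) as 1, so the multiplication removed by
this patch never changed the value. A compile-time check makes the point;
EXAMPLE_NAME_LEN below is a hypothetical stand-in for NFLOGGER_NAME_LEN and
its value is chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define EXAMPLE_NAME_LEN 64	/* hypothetical stand-in, value is arbitrary */

/* Holds on every conforming C implementation. */
_Static_assert(sizeof(char) == 1, "sizeof(char) is 1 by definition");

/* The old form of the expression: identical to EXAMPLE_NAME_LEN. */
static size_t name_buf_bytes(void)
{
	return EXAMPLE_NAME_LEN * sizeof(char);
}
```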

* Re: [RFC PATCH 00/17] fib_trie: Reduce time spent in fib_table_lookup by 35 to 75%
From: David Miller @ 2014-12-22 18:35 UTC (permalink / raw)
  To: alexander.h.duyck; +Cc: netdev
In-Reply-To: <20141222172632.1119.51469.stgit@ahduyck-vm-fedora20>

From: Alexander Duyck <alexander.h.duyck@redhat.com>
Date: Mon, 22 Dec 2014 09:40:52 -0800

> These patches are meant to address several performance issues I have seen 
> in the fib_trie implementation, and fib_table_lookup specifically.  With 
> these changes in place I have seen a reduction of 35 to 75% in the 
> total time spent in fib_table_lookup depending on the type of search being 
> performed.

Fantastic work Alexander.

I had a patch series, just for micro-benchmarking, that got rid of the
local table and just put everything in the global one.

Everything works and we always only do one probe into the FIB.

That speeds things up a lot.

The only problem is that we have to take into consideration cases
where userspace tries to directly modify and do things to the local
table.  Also we might have to pretend we have a local table in
dumps too.
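
To illustrate the difference in probe counts, here is a toy model in C. It
uses a linear longest-prefix match over small arrays rather than a trie, and
every name in it (struct route, lpm, lookup_two_tables) is hypothetical; it
only shows why a forwarded packet costs two probes with separate local and
main tables and one probe with a merged table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct route {
	uint32_t prefix;	/* host byte order */
	unsigned int plen;	/* prefix length, 0..32 */
	int id;			/* opaque tag for the example */
};

static uint32_t plen_mask(unsigned int plen)
{
	assert(plen <= 32);
	return plen ? (uint32_t)0xffffffffu << (32 - plen) : 0;
}

/* One probe: longest-prefix match in a single table. */
static const struct route *lpm(const struct route *t, size_t n,
			       uint32_t dst, unsigned int *probes)
{
	const struct route *best = NULL;
	size_t i;

	(*probes)++;
	for (i = 0; i < n; i++) {
		if ((dst & plen_mask(t[i].plen)) != t[i].prefix)
			continue;
		if (!best || t[i].plen > best->plen)
			best = &t[i];
	}
	return best;
}

/* Local table first, then main: a forwarded packet pays for both. */
static const struct route *lookup_two_tables(const struct route *local,
					     size_t nl,
					     const struct route *main_tbl,
					     size_t nm, uint32_t dst,
					     unsigned int *probes)
{
	const struct route *r = lpm(local, nl, dst, probes);

	return r ? r : lpm(main_tbl, nm, dst, probes);
}
```

With a /32 for 192.168.255.254 in the local table and a /24 for
192.168.255.0 in main, a lookup of 192.168.255.253 misses local and then
hits main, for two probes. Calling lpm() once on an array holding both
entries finds the same /24 in one probe. The sketch ignores the userspace
visibility and dump issues mentioned above.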