Netdev List
* RE: [PATCH] net-bnx2x: Fix byte order problem on NVRAM writes
From: Yuval Mintz @ 2013-10-22 13:30 UTC (permalink / raw)
  To: Nate Klein, netdev@vger.kernel.org
  Cc: Eilon Greenstein, linux-kernel@vger.kernel.org
In-Reply-To: <1382392621-8998-1-git-send-email-nxk@google.com>

> Tested:
>     ethtool -e eth0 raw on >first.nvram
>     ethtool -E eth0 <first.nvram
>     ethtool -e eth0 raw on >second.nvram
>     cmp first.nvram second.nvram || ethtool -E eth0 <second.nvram
>     (No output means pass.)

Hi Nate,

We've been aware of this `bug' for some time; we ran into it while
trying to fix the endianness sparse warnings in the driver.

Sadly, existing user applications already rely on this driver
behaviour: they prepare their buffers in a way that assumes the
endianness of the writes, so changing the write would break those tools.
That's why we haven't fixed the issue before, and why we cannot support
such a fix. We're more than willing to document the behaviour somewhere,
if that seems useful to anyone.

Thanks,
Yuval

> ---
>  drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c
> index 8213cc8..35671fb 100644
> --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c
> +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c
> @@ -1549,7 +1549,7 @@ static int bnx2x_nvram_write_dword(struct bnx2x *bp, u32 offset, u32 val,
>  	REG_WR(bp, MCP_REG_MCPR_NVM_COMMAND, MCPR_NVM_COMMAND_DONE);
> 
>  	/* write the data */
> -	REG_WR(bp, MCP_REG_MCPR_NVM_WRITE, val);
> +	REG_WR(bp, MCP_REG_MCPR_NVM_WRITE, cpu_to_be32(val));
> 
>  	/* address of the NVRAM to write to */
>  	REG_WR(bp, MCP_REG_MCPR_NVM_ADDR,
> --
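
To make the compatibility concern concrete, here is a small user-space
sketch (not driver code) of what wrapping a value in cpu_to_be32() does
to its in-memory layout. The helpers swab32_sketch(), first_stored_byte()
and to_be32_sketch() are hypothetical stand-ins written for this
illustration; only the general semantics of cpu_to_be32() (a byte swap on
little-endian hosts, a no-op on big-endian ones) are assumed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 32-bit byte swap: reverse the four bytes. */
uint32_t swab32_sketch(uint32_t v)
{
	return ((v & 0x000000ffu) << 24) |
	       ((v & 0x0000ff00u) << 8)  |
	       ((v & 0x00ff0000u) >> 8)  |
	       ((v & 0xff000000u) >> 24);
}

/* Byte that lands first in memory when this host stores v. */
uint8_t first_stored_byte(uint32_t v)
{
	uint8_t b;

	memcpy(&b, &v, 1);
	return b;
}

/* Returns 1 on a little-endian host, 0 on a big-endian one. */
int host_is_little_endian(void)
{
	return first_stored_byte(1u) == 1;
}

/* Hypothetical stand-in for cpu_to_be32(): swap only on little-endian. */
uint32_t to_be32_sketch(uint32_t v)
{
	uint32_t out = host_is_little_endian() ? swab32_sketch(v) : v;

	/* Postcondition: the most significant byte of v is first in memory. */
	assert(first_stored_byte(out) == (uint8_t)(v >> 24));
	return out;
}
```

On a little-endian host the converted value differs from the original
32-bit value, so a tool that pre-arranged its buffer for the old
behaviour would see its data written in a different byte order.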


* Re: [PATCH RFC 4/5] net:stmmac: fix jumbo frame handling.
From: Giuseppe CAVALLARO @ 2013-10-22 13:24 UTC (permalink / raw)
  To: Jimmy PERCHET; +Cc: netdev, jimmy.perchet
In-Reply-To: <52655640.4060405@parrot.com>

On 10/21/2013 6:28 PM, Jimmy PERCHET wrote:
> On 21/10/2013 15:40, Giuseppe CAVALLARO wrote:
>> On 10/16/2013 5:24 PM, Jimmy Perchet wrote:
>>> This patch addresses several issues which prevent jumbo frames from working properly :
>>> .jumbo frames' last descriptor was not closed
>>> .several confusion regarding descriptor's max buffer size
>>> .frags could not be jumbo
>>>
>>> Signed-off-by: Jimmy Perchet <jimmy.perchet@parrot.com>
>>
>>
>> Jimmy, thanks for this patch. Below are my first notes.
>
> Thanks a lot for this first review.

welcome

>
>> I'll continue to look at the patch to verify whether I missed
>> something. I kindly ask you, for the next version, to add
>> more comments, especially in the function that prepares the
>> tx desc, in order to help me review it.
>
> Sure ;)
>
> I hope to have v2 ready by next week.

ok thx, I'll try to help on reviewing for the v2 again.

>
> I'm OK with most of your comments. Some additional
> notes below:
>
>>>    }
>>> @@ -81,7 +81,7 @@ static inline void ndesc_end_tx_desc_on_ring(struct dma_desc *p, int ter)
>>>
>>>    static inline void norm_set_tx_desc_len_on_ring(struct dma_desc *p, int len)
>>>    {
>>> -    if (unlikely(len > BUF_SIZE_2KiB)) {
>>> +    if (unlikely(len >= BUF_SIZE_2KiB)) {
>>
>> we cannot manage a size of 2048 on normal desc
>>
>> Pls you should verify to not break the back-compatibility.
>
> IMHO, this actually fixes the problem you think I'm creating.
> In current code, if len is equal to 2048, buffer1_size is set to 2048,
> this is wrong because the max size is actually 2047...

IIRC, for normal descriptors, the TBS2/1 are just 11 bits
so the max programmable size is 2047 (0x7ff).
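
The limit discussed here can be sketched in user space as follows. This
is only an illustration of the arithmetic, assuming the 11-bit field
width recalled above; TBS_BITS_SKETCH, struct buf_split and
split_len_sketch() are hypothetical names, not driver symbols.

```c
#include <assert.h>

#define TBS_BITS_SKETCH 11                              /* assumed field width */
#define TBS_MAX_SKETCH  ((1u << TBS_BITS_SKETCH) - 1u)  /* 2047, 0x7ff */

struct buf_split {
	unsigned int buf1;
	unsigned int buf2;
};

/*
 * Split a length across the two buffer-size fields of one descriptor.
 * Neither field may exceed TBS_MAX_SKETCH, so a length of exactly 2048
 * must already spill into the second buffer.
 */
struct buf_split split_len_sketch(unsigned int len)
{
	struct buf_split s;

	/* Precondition: the length fits in the two fields combined. */
	assert(len <= 2u * TBS_MAX_SKETCH);

	if (len > TBS_MAX_SKETCH) {
		s.buf1 = TBS_MAX_SKETCH;
		s.buf2 = len - TBS_MAX_SKETCH;
	} else {
		s.buf1 = len;
		s.buf2 = 0;
	}
	return s;
}
```

With these assumptions, a length of 2048 is split as 2047 + 1, which is
the case the ">=" comparison in the patch is meant to cover.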

>
>>
>>>            p->des01.etx.buffer1_size = BUF_SIZE_2KiB - 1;
>>>            p->des01.etx.buffer2_size = len - p->des01.etx.buffer1_size;
>>>        } else
>
>
>
>>>
>>>    static void stmmac_refill_desc3(void *priv_ptr, struct dma_desc *p)
>>>    {
>>> @@ -103,13 +90,13 @@ static void stmmac_refill_desc3(void *priv_ptr, struct dma_desc *p)
>>>        if (unlikely(priv->plat->has_gmac))
>>>            /* Fill DES3 in case of RING mode */
>>>            if (priv->dma_buf_sz >= BUF_SIZE_8KiB)
>>> -            p->des3 = p->des2 + BUF_SIZE_8KiB;
>>> +            p->des3 = p->des2 + BUF_SIZE_8KiB - 1;
>>
>> is it correct? can you check?
>
> The actual buffer's max size is 8191, so, in ring mode,
> the second buffer must start at p->des2 + 8191.
>
>>> -    priv->cur_tx++;
>>> +    priv->cur_tx += nb_desc;
>>
>> can we avoid to use the nb_desc?
> Actually, it is a preparation for my 5th patch : I want to write cur_tx only once.
> I can split this.

ok

>
>
>
> Best Regards,
> Jimmy
>
>


* Re: [PATCH] Revert "bridge: only expire the mdb entry when query is received"
From: Vlad Yasevich @ 2013-10-22 13:13 UTC (permalink / raw)
  To: David Miller, linus.luessing
  Cc: stephen, netdev, bridge, linux-kernel, amwang
In-Reply-To: <20131021.184509.1933008514161772000.davem@davemloft.net>

On 10/21/2013 06:45 PM, David Miller wrote:
> From: Linus Lüssing <linus.luessing@web.de>
> Date: Sun, 20 Oct 2013 00:58:57 +0200
>
>> While this commit was a good attempt to fix issues occurring when no
>> multicast querier is present, this commit still has two more issues:
>>
>> 1) There are cases where mdb entries do not expire even if there is a
>> querier present. The bridge will unnecessarily continue flooding
>> multicast packets on the according ports.
>>
>> 2) Never removing an mdb entry could be exploited for a Denial of
>> Service by an attacker on the local link, slowly, but steadily eating up
>> all memory.
>>
>> Actually, this commit became obsolete with
>> "bridge: disable snooping if there is no querier" (b00589af3b)
>> which included fixes for a few more cases.
>>
>> Therefore reverting the following commits (the commit stated in the
>> commit message plus three of its follow up fixes):
>>
>> ---
>> Revert "bridge: update mdb expiration timer upon reports."
>> This reverts commit f144febd93d5ee534fdf23505ab091b2b9088edc.
>> Revert "bridge: do not call setup_timer() multiple times"
>> This reverts commit 1faabf2aab1fdaa1ace4e8c829d1b9cf7bfec2f1.
>> Revert "bridge: fix some kernel warning in multicast timer"
>> This reverts commit c7e8e8a8f7a70b343ca1e0f90a31e35ab2d16de1.
>> Revert "bridge: only expire the mdb entry when query is received"
>> This reverts commit 9f00b2e7cf241fa389733d41b615efdaa2cb0f5b.
>> ---
>
> Cong, and other bridge folks, please review this revert.
>

Makes sense, and it makes the implementation follow the spec more closely.
Looks like the issues seen before are resolved by the revert.

-vlad


* Re: [PATCH] Revert "bridge: only expire the mdb entry when query is received"
From: Vladislav Yasevich @ 2013-10-22 13:10 UTC (permalink / raw)
  To: David Miller
  Cc: amwang, netdev@vger.kernel.org, bridge, LKML, Stephen Hemminger,
	linus.luessing
In-Reply-To: <20131021.184509.1933008514161772000.davem@davemloft.net>

On Mon, Oct 21, 2013 at 6:45 PM, David Miller <davem@davemloft.net> wrote:

> From: Linus Lüssing <linus.luessing@web.de>
> Date: Sun, 20 Oct 2013 00:58:57 +0200
>
> > While this commit was a good attempt to fix issues occurring when no
> > multicast querier is present, this commit still has two more issues:
> >
> > 1) There are cases where mdb entries do not expire even if there is a
> > querier present. The bridge will unnecessarily continue flooding
> > multicast packets on the according ports.
> >
> > 2) Never removing an mdb entry could be exploited for a Denial of
> > Service by an attacker on the local link, slowly, but steadily eating up
> > all memory.
> >
> > Actually, this commit became obsolete with
> > "bridge: disable snooping if there is no querier" (b00589af3b)
> > which included fixes for a few more cases.
> >
> > Therefore reverting the following commits (the commit stated in the
> > commit message plus three of its follow up fixes):
> >
> > ---
> > Revert "bridge: update mdb expiration timer upon reports."
> > This reverts commit f144febd93d5ee534fdf23505ab091b2b9088edc.
> > Revert "bridge: do not call setup_timer() multiple times"
> > This reverts commit 1faabf2aab1fdaa1ace4e8c829d1b9cf7bfec2f1.
> > Revert "bridge: fix some kernel warning in multicast timer"
> > This reverts commit c7e8e8a8f7a70b343ca1e0f90a31e35ab2d16de1.
> > Revert "bridge: only expire the mdb entry when query is received"
> > This reverts commit 9f00b2e7cf241fa389733d41b615efdaa2cb0f5b.
> > ---
>
> Cong, and other bridge folks, please review this revert.
>

Makes sense, and it makes the implementation follow the spec more closely.
Looks like the issues seen before are resolved by the revert.

Reviewed-by: Vlad Yasevich <vyasevich@gmail.com>


* Re: [PATCH net-next 0/2] Removal of struct esp_data
From: Steffen Klassert @ 2013-10-22 13:08 UTC (permalink / raw)
  To: David Miller; +Cc: mathias.krause, netdev, herbert
In-Reply-To: <20131018.135536.686066381481925652.davem@davemloft.net>

On Fri, Oct 18, 2013 at 01:55:36PM -0400, David Miller wrote:
> From: Mathias Krause <mathias.krause@secunet.com>
> Date: Fri, 18 Oct 2013 12:09:03 +0200
> 
> > This series removes one level of indirection when accessing the aead
> > crypto algorithm in ESP transforms by simply removing struct esp_data.
> > This results in smaller code and less memory usage per xfrm state.
> > 
> > Please apply!
> 
> No objections from me, I'll let Steffen pick this up.

I'm a bit hesitant about removing the padlen field. We resisted
several attempts to remove it in the past. It is currently unused,
but it provides the infrastructure for ESP padding as defined
in RFC 4303. However, RFC 4303 recommends the use of TFC padding
instead to conceal the actual length of the packet. So I'm not
sure what the actual use case for ESP padding is. I'll reconsider
this next week when I'm back at the office.


* RE: [PATCH net] netpoll: fix rx_hook() interface by passing the skb
From: David Laight @ 2013-10-22 12:46 UTC (permalink / raw)
  To: Antonio Quartulli; +Cc: David S. Miller, netdev
In-Reply-To: <20131022101127.GJ1544@neomailbox.net>

> Subject: Re: [PATCH net] netpoll: fix rx_hook() interface by passing the skb
> 
> On Tue, Oct 22, 2013 at 10:09:00AM +0100, David Laight wrote:
> > > Subject: [PATCH net] netpoll: fix rx_hook() interface by passing the skb
> > >
> > > Right now skb->data is passed to rx_hook() even if the skb
> > > has not been linearised and without giving rx_hook() a way
> > > to linearise it.
> > >
> > > Change the rx_hook() interface and make it accept the skb
> > > as argument. In this way users implementing rx_hook() can
> > > perform all the needed operations to properly (and safely)
> > > access the skb data.
> > ...
> > > -	void (*rx_hook)(struct netpoll *, int, char *, int);
> > > +	void (*rx_hook)(struct netpoll *np, struct sk_buff *skb, int offset);
> >
> > You can't do that change without changing the way that hooks are registered
> > so that any existing modules will fail to register their hooks.
> 
> There is no hook registration in the kernel tree. All the users are outside.

Looking at __netpoll_rx() I notice that there isn't an skb_pull for the
udp header.

Actually, I think the alignment rules effectively imply that iph->ihl
(the second byte) will always be in the first skb fragment, so the
code could sensibly do a single skb_pull() that includes the udp header.

I can't remember which value you passed as 'offset' (and my mailer makes
it hard to find), but to ease the code changes the offset of the udp data
would make sense.
In that case you still need to pass the source port.
If you do rx_hook(np, source_port, skb, offset) then if anyone manages to
load an old module (or code that casts the assignment to rx_poll)
at least it won't go 'bang'.
Renaming the structure member is guaranteed to generate compile errors.

	David
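
The suggestion above can be sketched with user-space mock types. This is
a minimal illustration only: struct skb_sketch, struct netpoll_sketch,
the renamed member rx_skb_hook and record_hook() are all hypothetical and
do not correspond to real kernel definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Mock of a packet buffer: just a data pointer and a length. */
struct skb_sketch {
	const unsigned char *data;
	size_t len;
};

/*
 * Mock of the netpoll structure. The hook keeps the source port as its
 * second argument and takes the buffer plus the offset of the UDP
 * payload. The member has a new (hypothetical) name, so code that still
 * assigns to the old member name fails to compile.
 */
struct netpoll_sketch {
	void (*rx_skb_hook)(struct netpoll_sketch *np, int source_port,
			    struct skb_sketch *skb, int offset);
	int last_port;
	size_t last_payload_len;
};

/* Example hook: record the source port and the payload length. */
void record_hook(struct netpoll_sketch *np, int source_port,
		 struct skb_sketch *skb, int offset)
{
	/* Precondition: the payload offset lies inside the buffer. */
	assert(offset >= 0 && (size_t)offset <= skb->len);

	np->last_port = source_port;
	np->last_payload_len = skb->len - (size_t)offset;
}
```

A caller would set np.rx_skb_hook = record_hook and invoke it with the
offset of the UDP data, so the hook never needs the raw data pointer.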





* Re: BUG: scheduling while atomic dev_set_promiscuity->__dev_notify_flags
From: Nicolas Dichtel @ 2013-10-22 11:52 UTC (permalink / raw)
  To: Alexei Starovoitov; +Cc: netdev
In-Reply-To: <CAMEtUuy91zYJ=bj1dzfdqE8kqZ3rE1RgdR-PZYekSUg8_xoTBw@mail.gmail.com>

On 22/10/2013 03:04, Alexei Starovoitov wrote:
> Hi Nicolas,
>
> after commit 991fb3f74c "dev: always advertise rx_flags changes via netlink"
> I'm seeing 'sleeping in atomic' bug.
>
> Steps to reproduce:
> ip tuntap add dev tap1 mode tap
> ifconfig tap1 up
> tcpdump -nei tap1
> and in different terminal:
> ip tuntap del dev tap1 mode tap
>
> [  271.627994] device tap1 left promiscuous mode
> [  271.639897] BUG: sleeping function called from invalid context at
> mm/slub.c:940
> [  271.664491] in_atomic(): 1, irqs_disabled(): 0, pid: 3394, name: ip
> [  271.677525] INFO: lockdep is turned off.
> [  271.690503] CPU: 0 PID: 3394 Comm: ip Tainted: G        W    3.12.0-rc3+ #73
> [  271.703996] Hardware name: System manufacturer System Product
> Name/P8Z77 WS, BIOS 3007 07/26/2012
> [  271.731254]  ffffffff81a58506 ffff8807f0d57a58 ffffffff817544e5
> ffff88082fa0f428
> [  271.760261]  ffff8808071f5f40 ffff8807f0d57a88 ffffffff8108bad1
> ffffffff81110ff8
> [  271.790683]  0000000000000010 00000000000000d0 00000000000000d0
> ffff8807f0d57af8
> [  271.822332] Call Trace:
> [  271.838234]  [<ffffffff817544e5>] dump_stack+0x55/0x76
> [  271.854446]  [<ffffffff8108bad1>] __might_sleep+0x181/0x240
> [  271.870836]  [<ffffffff81110ff8>] ? rcu_irq_exit+0x68/0xb0
> [  271.887076]  [<ffffffff811a80be>] kmem_cache_alloc_node+0x4e/0x2a0
> [  271.903368]  [<ffffffff810b4ddc>] ? vprintk_emit+0x1dc/0x5a0
> [  271.919716]  [<ffffffff81614d67>] ? __alloc_skb+0x57/0x2a0
> [  271.936088]  [<ffffffff810b4de0>] ? vprintk_emit+0x1e0/0x5a0
> [  271.952504]  [<ffffffff81614d67>] __alloc_skb+0x57/0x2a0
> [  271.968902]  [<ffffffff8163a0b2>] rtmsg_ifinfo+0x52/0x100
> [  271.985302]  [<ffffffff8162ac6d>] __dev_notify_flags+0xad/0xc0
> [  272.001642]  [<ffffffff8162ad0c>] __dev_set_promiscuity+0x8c/0x1c0
> [  272.017917]  [<ffffffff81731ea5>] ? packet_notifier+0x5/0x380
> [  272.033961]  [<ffffffff8162b109>] dev_set_promiscuity+0x29/0x50
> [  272.049855]  [<ffffffff8172e937>] packet_dev_mc+0x87/0xc0
> [  272.065494]  [<ffffffff81732052>] packet_notifier+0x1b2/0x380
> [  272.080915]  [<ffffffff81731ea5>] ? packet_notifier+0x5/0x380
> [  272.096009]  [<ffffffff81761c66>] notifier_call_chain+0x66/0x150
> [  272.110803]  [<ffffffff8108503e>] __raw_notifier_call_chain+0xe/0x10
> [  272.125468]  [<ffffffff81085056>] raw_notifier_call_chain+0x16/0x20
> [  272.139984]  [<ffffffff81620190>] call_netdevice_notifiers_info+0x40/0x70
> [  272.154523]  [<ffffffff816201d6>] call_netdevice_notifiers+0x16/0x20
> [  272.168552]  [<ffffffff816224c5>] rollback_registered_many+0x145/0x240
> [  272.182263]  [<ffffffff81622641>] rollback_registered+0x31/0x40
> [  272.195369]  [<ffffffff816229c8>] unregister_netdevice_queue+0x58/0x90
> [  272.208230]  [<ffffffff81547ca0>] __tun_detach+0x140/0x340
> [  272.220686]  [<ffffffff81547ed6>] tun_chr_close+0x36/0x60
>
> packet_notifier() does rcu_read_lock() before calling into packet_dev_mc() .
>
> Not sure how to fix it cleanly, other than disabling a notify here.
> Any suggestion?
I can't reproduce it. Can you send me your .config?
I will look more closely at the code.


Regards,
Nicolas


* Re: [E1000-devel] [PATCH net-next] e1000: fix wrong queue idx calculation
From: Jeff Kirsher @ 2013-10-22 10:46 UTC (permalink / raw)
  To: Hong Zhiguo; +Cc: davem, e1000-devel, netdev, Hong Zhiguo
In-Reply-To: <1382256924-12598-1-git-send-email-zhiguohong@tencent.com>

On Sun, 2013-10-20 at 16:15 +0800, Hong Zhiguo wrote:
> From: Hong Zhiguo <zhiguohong@tencent.com>
> 
> tx_ring and adapter->tx_ring are already of type "struct e1000_tx_ring *"
> 
> Signed-off-by: Hong Zhiguo <zhiguohong@tencent.com>
> ---
>  drivers/net/ethernet/intel/e1000/e1000_main.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)

Thanks Hong, I have added your patch to my queue.


* Re: [PATCH] igb: Add EEPROM IO stubs for iNVM
From: Jeff Kirsher @ 2013-10-22 10:45 UTC (permalink / raw)
  To: Marek Vasut
  Cc: netdev, e1000-devel, Carolyn Wyborny, Aaron Brown,
	David S. Miller
In-Reply-To: <1382412123-4782-1-git-send-email-marex@denx.de>

On Tue, 2013-10-22 at 05:22 +0200, Marek Vasut wrote:
> Add stub functions for EEPROM operations in case where the i210 is
> used without external EEPROM. The EEPROM operations must not be set
> to NULL, since otherwise we will get a backtrace when attempting the
> command below. One such place that triggers this is igb_set_eeprom()
> in igb_ethtool.c, where hw->nvm.ops.write() is called without first
> checking if .write() is valid. Grepping through the code shows
> more such occasions which assume .write() to be always valid.
> Thus, instead of polluting the code with checks, add stubs. I believe
> it'd be preferable to possibly even implement those functions, but
> my knowledge of the adapter is still limited and, as far as I
> understand, the iNVM is programmable only once.
> 
> Command:
> 
> $ ethtool -E eth0 magic 0x157b8086 offset 6 value 0x1b
> 
> Backtrace:
> 
> Unable to handle kernel NULL pointer dereference at virtual address
> 00000000
> pgd = be7ac000
> [00000000] *pgd=4e6a6831, *pte=00000000, *ppte=00000000
> Internal error: Oops: 80000007 [#1] SMP ARM
> CPU: 2 PID: 59 Comm: ethtool Not tainted 3.12.0-rc6+ #8
> task: bf8f3600 ti: be73c000 task.ti: be73c000
> PC is at 0x0
> LR is at igb_set_eeprom+0x27c/0x3b4
> pc : [<00000000>]    lr : [<803bc780>]    psr: 20000013
> sp : be73dd80  ip : 00000000  fp : be73ddf4
> r10: 00000001  r9 : 00000003  r8 : be6d6000
> r7 : bfa64a38  r6 : be6d7000  r5 : bfa64000  r4 : be73de20
> r3 : be6d6000  r2 : 00000001  r1 : 00000003  r0 : bfa64a38
> Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
> Control: 10c53c7d  Table: 4e7ac04a  DAC: 00000015
> Process ethtool (pid: 59, stack limit = 0xbe73c240)
> Stack: (0xbe73dd80 to 0xbe73e000)
> dd80: 803c7e2c 803c8554 00000000 00000000 00000000 803c827c 00000004
> 00000000
> dda0: 00000000 00000000 00010800 00080008 00000008 be73ddc0 800d1a34
> 8054db6c
> ddc0: be6d6000 00000003 00ad97c8 00000001 be73c000 bfa64000 be6d7000
> 80584d58
> dde0: be73de20 00ad97d8 be73de7c be73ddf8 80465e00 803bc510 be73de14
> be6d7000
> de00: e4114bb3 00000000 be73de6c 0000000c 8055048c 80076b5c 00000002
> 00000000
> de20: 0000000c 157b8086 00000006 00000001 8094cac0 be73c000 00000000
> 00000000
> de40: be73de7c 00008946 8094cac0 7e8cfcf4 be73de98 00008946 8094cac0
> 7e8cfcf4
> de60: be73de98 be73c000 00000000 00000000 be73dee4 be73de80 80474f54
> 804656ac
> de80: 000000a8 00000200 be73dec4 be73de98 80077c7c 80076b5c 30687465
> 00000000
> dea0: 00000000 00000000 00ad97c8 00000000 00000000 00000000 be73c000
> 00008946
> dec0: fffffdfd 7e8cfcf4 7e8cfcf4 7e8cfcf4 bf18c020 00000000 be73df04
> be73dee8
> dee0: 80449c18 80474ac8 80449b94 00008946 be6b0600 00000003 be73df74
> be73df08
> df00: 800e6d10 80449ba0 be6b0600 00030002 be6b5f40 be6b5f40 be73df3c
> be73df28
> df20: 80554b2c 802b03e8 be6b5f6c be6b5f00 be73df5c be73c000 8000ea44
> be73c000
> df40: 8000eab0 bf8f3600 00000001 00008946 00000003 00000000 7e8cfcf4
> be6b0600
> df60: be73c000 00000000 be73dfa4 be73df78 800e72ac 800e6c98 be73df94
> 00000000
> df80: 80076b64 0002bd0c 00000000 0002bcc8 00000036 8000ebe4 00000000
> be73dfa8
> dfa0: 8000ea20 800e7278 0002bd0c 00000000 00000003 00008946 7e8cfcf4
> 7e8cfcf4
> dfc0: 0002bd0c 00000000 0002bcc8 00000036 00000000 00000000 00000000
> 7e8cfb84
> dfe0: 7e8cfe65 7e8cfb78 0001201c 0004535c 20000010 00000003 00000000
> 00000000
> Backtrace:
> [<803bc504>] (igb_set_eeprom+0x0/0x3b4) from [<80465e00>] (dev_ethtool
> +0x760/0x1f68)
> [<804656a0>] (dev_ethtool+0x0/0x1f68) from [<80474f54>] (dev_ioctl
> +0x498/0x86c)
> [<80474abc>] (dev_ioctl+0x0/0x86c) from [<80449c18>] (sock_ioctl
> +0x84/0x258)
> [<80449b94>] (sock_ioctl+0x0/0x258) from [<800e6d10>] (do_vfs_ioctl
> +0x84/0x5e0)
>  r6:00000003 r5:be6b0600 r4:00008946 r3:80449b94
> [<800e6c8c>] (do_vfs_ioctl+0x0/0x5e0) from [<800e72ac>] (SyS_ioctl
> +0x40/0x68)
> [<800e726c>] (SyS_ioctl+0x0/0x68) from [<8000ea20>] (ret_fast_syscall
> +0x0/0x48)
>  r8:8000ebe4 r7:00000036 r6:0002bcc8 r5:00000000 r4:0002bd0c
> Code: bad PC value
> ---[ end trace 59379e9bf8fc8437 ]---
> 
> Signed-off-by: Marek Vasut <marex@denx.de>
> Cc: Carolyn Wyborny <carolyn.wyborny@intel.com>
> Cc: Aaron Brown <aaron.f.brown@intel.com>
> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
> Cc: David S. Miller <davem@davemloft.net>
> ---
>  drivers/net/ethernet/intel/igb/e1000_i210.c | 46 +++++++++++++++++++++++++++--
>  1 file changed, 43 insertions(+), 3 deletions(-)

Thanks Marek, I have added the patch to my queue.
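
The stub approach described in the patch can be illustrated with a small
user-space sketch. It is not the code from the patch: struct
nvm_ops_sketch, nvm_write_stub(), set_eeprom_sketch() and the error value
SKETCH_ERR_NVM are hypothetical names chosen for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_ERR_NVM (-1)	/* hypothetical error code */

/* Mock ops table with a single write operation. */
struct nvm_ops_sketch {
	int (*write)(uint16_t offset, uint16_t words, const uint16_t *data);
};

/*
 * Stub installed when no writable EEPROM is present: it reports an
 * error instead of leaving the function pointer NULL.
 */
int nvm_write_stub(uint16_t offset, uint16_t words, const uint16_t *data)
{
	(void)offset;
	(void)words;
	(void)data;
	return SKETCH_ERR_NVM;
}

/*
 * Caller that invokes ops->write() without checking the pointer, as the
 * description says igb_set_eeprom() does. With a stub installed this
 * returns an error; with a NULL pointer it would jump to address 0.
 */
int set_eeprom_sketch(const struct nvm_ops_sketch *ops, uint16_t offset,
		      uint16_t value)
{
	assert(ops != NULL);
	return ops->write(offset, 1, &value);
}
```

The design choice is to pay for one small function per operation rather
than add a NULL check at every call site.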


* Re: [PATCH] ixgbe: Reduce memory consumption with larger page sizes
From: Jeff Kirsher @ 2013-10-22 10:23 UTC (permalink / raw)
  To: Anton Blanchard; +Cc: netdev, benh
In-Reply-To: <20131022103757.162f1a79@kryten>

On Tue, 2013-10-22 at 10:37 +1100, Anton Blanchard wrote:
> The ixgbe driver allocates pages for its receive rings. It currently
> uses 512 pages, regardless of page size. During receive handling it
> adds the unused part of the page back into the rx ring, avoiding the
> need for a new allocation.
> 
> On a ppc64 box with 64 threads and 64kB pages, we end up with
> 512 entries * 64 rx queues * 64kB = 2GB memory used. Even more of a
> concern is that we use up 2GB of IOMMU space in order to map all this
> memory.
> 
> The driver makes a number of decisions based on if PAGE_SIZE is less
> than 8kB, so use this as the breakpoint and only allocate 128 entries
> on 8kB or larger page sizes.
> 
> Signed-off-by: Anton Blanchard <anton@samba.org>
> ---
> 
> Jeff: The breakpoint and the ring size I chose was pretty arbitrary,
> feel free to adjust as you see fit. Our main concern is we get that
> 2GB consumption down to something more reasonable :)

Thanks Anton, I will add your patch to my queue.
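
The memory figure quoted in the patch description can be checked with a
short sketch. rx_ring_bytes_sketch() is a hypothetical helper that only
reproduces the multiplication from the description (entries * rx queues
* page size); it does not model how the driver actually allocates pages.

```c
#include <assert.h>
#include <stdint.h>

/* Total bytes pinned by the rx rings: entries * queues * page size. */
uint64_t rx_ring_bytes_sketch(uint64_t entries, uint64_t queues,
			      uint64_t page_size)
{
	/* Precondition: all three factors are non-zero. */
	assert(entries > 0 && queues > 0 && page_size > 0);
	return entries * queues * page_size;
}
```

With 512 entries, 64 queues and 64kB pages this gives 2^31 bytes (2GB);
with 128 entries it drops to 2^29 bytes (512MB).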


* Re: Stale IPv6 address accumulation on linux 3.2.17
From: Hannes Frederic Sowa @ 2013-10-22 10:18 UTC (permalink / raw)
  To: Templin, Fred L; +Cc: netdev@vger.kernel.org
In-Reply-To: <2134F8430051B64F815C691A62D9831813520C@XCH-BLV-504.nw.nos.boeing.com>

Hi Fred!

On Mon, Oct 21, 2013 at 03:50:24PM +0000, Templin, Fred L wrote:
> On linux 3.2.17, I have a host that configures IPv6 addresses on
> an eth0 interface based on Router Advertisements received from an
> on-link linux box configured as an IPv6 router and running radvd.
> When the host gets an RA, it configures both an EUI-64-based IPv6
> address and an IPv6 privacy address, so it has two IPv6 addresses.
> But, if I leave the host up for long periods of time, it seems to
> accumulate additional IPv6 addresses - perhaps these are stale
> IPv6 privacy addresses?
> 
> Is this known behavior, and if so is there a way to turn it off?
> Or, perhaps this was a known bug that has been corrected in more
> recent linux kernel versions?

Could you send me the output of ip -6 a l?

Greetings,

  Hannes


* Re: [PATCH net] netpoll: fix rx_hook() interface by passing the skb
From: Antonio Quartulli @ 2013-10-22 10:11 UTC (permalink / raw)
  To: David Laight; +Cc: David S. Miller, netdev
In-Reply-To: <AE90C24D6B3A694183C094C60CF0A2F6026B739B@saturn3.aculab.com>

On Tue, Oct 22, 2013 at 10:09:00AM +0100, David Laight wrote:
> > Subject: [PATCH net] netpoll: fix rx_hook() interface by passing the skb
> > 
> > Right now skb->data is passed to rx_hook() even if the skb
> > has not been linearised and without giving rx_hook() a way
> > to linearise it.
> > 
> > Change the rx_hook() interface and make it accept the skb
> > as argument. In this way users implementing rx_hook() can
> > perform all the needed operations to properly (and safely)
> > access the skb data.
> ...
> > -	void (*rx_hook)(struct netpoll *, int, char *, int);
> > +	void (*rx_hook)(struct netpoll *np, struct sk_buff *skb, int offset);
> 
> You can't do that change without changing the way that hooks are registered
> so that any existing modules will fail to register their hooks.

There is no hook registration in the kernel tree. All the users are outside.


-- 
Antonio Quartulli


* RE: [PATCH net] netpoll: fix rx_hook() interface by passing the skb
From: David Laight @ 2013-10-22  9:09 UTC (permalink / raw)
  To: Antonio Quartulli, David S. Miller; +Cc: netdev
In-Reply-To: <1382431715-3128-1-git-send-email-antonio@meshcoding.com>

> Subject: [PATCH net] netpoll: fix rx_hook() interface by passing the skb
> 
> Right now skb->data is passed to rx_hook() even if the skb
> has not been linearised and without giving rx_hook() a way
> to linearise it.
> 
> Change the rx_hook() interface and make it accept the skb
> as argument. In this way users implementing rx_hook() can
> perform all the needed operations to properly (and safely)
> access the skb data.
...
> -	void (*rx_hook)(struct netpoll *, int, char *, int);
> +	void (*rx_hook)(struct netpoll *np, struct sk_buff *skb, int offset);

You can't do that change without changing the way that hooks are registered
so that any existing modules will fail to register their hooks.

	David


* [PATCH net] netpoll: fix rx_hook() interface by passing the skb
From: Antonio Quartulli @ 2013-10-22  8:48 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Antonio Quartulli
In-Reply-To: <20131022.025038.1046903740187748879.davem@davemloft.net>

Right now skb->data is passed to rx_hook() even if the skb
has not been linearised and without giving rx_hook() a way
to linearise it.

Change the rx_hook() interface and make it accept the skb
as argument. In this way users implementing rx_hook() can
perform all the needed operations to properly (and safely)
access the skb data.

Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
---
 include/linux/netpoll.h |  2 +-
 net/core/netpoll.c      | 10 ++++------
 2 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/include/linux/netpoll.h b/include/linux/netpoll.h
index f3c7c24..5352160 100644
--- a/include/linux/netpoll.h
+++ b/include/linux/netpoll.h
@@ -24,7 +24,7 @@ struct netpoll {
 	struct net_device *dev;
 	char dev_name[IFNAMSIZ];
 	const char *name;
-	void (*rx_hook)(struct netpoll *, int, char *, int);
+	void (*rx_hook)(struct netpoll *np, struct sk_buff *skb, int offset);
 
 	union inet_addr local_ip, remote_ip;
 	bool ipv6;
diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index fc75c9e..b415437 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -834,9 +834,8 @@ int __netpoll_rx(struct sk_buff *skb, struct netpoll_info *npinfo)
 			if (np->local_port && np->local_port != ntohs(uh->dest))
 				continue;
 
-			np->rx_hook(np, ntohs(uh->source),
-				       (char *)(uh+1),
-				       ulen - sizeof(struct udphdr));
+			np->rx_hook(np, skb,
+				    (unsigned char *)(uh + 1) - skb->data);
 			hits++;
 		}
 	} else {
@@ -872,9 +871,8 @@ int __netpoll_rx(struct sk_buff *skb, struct netpoll_info *npinfo)
 			if (np->local_port && np->local_port != ntohs(uh->dest))
 				continue;
 
-			np->rx_hook(np, ntohs(uh->source),
-				       (char *)(uh+1),
-				       ulen - sizeof(struct udphdr));
+			np->rx_hook(np, skb,
+				    (unsigned char *)(uh + 1) - skb->data);
 			hits++;
 		}
 #endif
-- 
1.8.4


* [PATCH RESEND] packet: Deliver VLAN TPID to userspace
From: Atzm Watanabe @ 2013-10-22  8:39 UTC (permalink / raw)
  To: netdev; +Cc: Stephen Hemminger, Ben Hutchings

Since 802.1AD support was added, userspace packet receivers
(packet dumpers, software switches, and the like) need a way to know
the VLAN TPID in order to reassemble the original tagged frame.

Signed-off-by: Atzm Watanabe <atzm@stratosphere.co.jp>
---
It looks like struct tpacket_hdr_variant1 is allowed to grow,
as its length combined with struct tpacket3_hdr is explicit at
run time.

 include/uapi/linux/if_packet.h | 5 +++--
 net/packet/af_packet.c         | 8 ++++++--
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/include/uapi/linux/if_packet.h b/include/uapi/linux/if_packet.h
index dbf0666..6e36e0a 100644
--- a/include/uapi/linux/if_packet.h
+++ b/include/uapi/linux/if_packet.h
@@ -83,7 +83,7 @@ struct tpacket_auxdata {
 	__u16		tp_mac;
 	__u16		tp_net;
 	__u16		tp_vlan_tci;
-	__u16		tp_padding;
+	__u16		tp_vlan_tpid;
 };
 
 /* Rx ring - header status */
@@ -132,12 +132,13 @@ struct tpacket2_hdr {
 	__u32		tp_sec;
 	__u32		tp_nsec;
 	__u16		tp_vlan_tci;
-	__u16		tp_padding;
+	__u16		tp_vlan_tpid;
 };
 
 struct tpacket_hdr_variant1 {
 	__u32	tp_rxhash;
 	__u32	tp_vlan_tci;
+	__u32	tp_vlan_tpid;
 };
 
 struct tpacket3_hdr {
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 2e8286b..fbcc882 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -895,9 +895,11 @@ static void prb_fill_vlan_info(struct tpacket_kbdq_core *pkc,
 {
 	if (vlan_tx_tag_present(pkc->skb)) {
 		ppd->hv1.tp_vlan_tci = vlan_tx_tag_get(pkc->skb);
+		ppd->hv1.tp_vlan_tpid = (__force __u32)ntohs(pkc->skb->vlan_proto);
 		ppd->tp_status = TP_STATUS_VLAN_VALID;
 	} else {
 		ppd->hv1.tp_vlan_tci = 0;
+		ppd->hv1.tp_vlan_tpid = 0;
 		ppd->tp_status = TP_STATUS_AVAILABLE;
 	}
 }
@@ -1836,11 +1838,12 @@ static int tpacket_rcv(struct sk_buff *skb, struct net_device *dev,
 		h.h2->tp_nsec = ts.tv_nsec;
 		if (vlan_tx_tag_present(skb)) {
 			h.h2->tp_vlan_tci = vlan_tx_tag_get(skb);
+			h.h2->tp_vlan_tpid = ntohs(skb->vlan_proto);
 			status |= TP_STATUS_VLAN_VALID;
 		} else {
 			h.h2->tp_vlan_tci = 0;
+			h.h2->tp_vlan_tpid = 0;
 		}
-		h.h2->tp_padding = 0;
 		hdrlen = sizeof(*h.h2);
 		break;
 	case TPACKET_V3:
@@ -2788,11 +2791,12 @@ static int packet_recvmsg(struct kiocb *iocb, struct socket *sock,
 		aux.tp_net = skb_network_offset(skb);
 		if (vlan_tx_tag_present(skb)) {
 			aux.tp_vlan_tci = vlan_tx_tag_get(skb);
+			aux.tp_vlan_tpid = ntohs(skb->vlan_proto);
 			aux.tp_status |= TP_STATUS_VLAN_VALID;
 		} else {
 			aux.tp_vlan_tci = 0;
+			aux.tp_vlan_tpid = 0;
 		}
-		aux.tp_padding = 0;
 		put_cmsg(msg, SOL_PACKET, PACKET_AUXDATA, sizeof(aux), &aux);
 	}
 
-- 
1.8.1.5
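
As an illustration of the use case in the description, a userspace
receiver could rebuild the tagged frame roughly as sketched below. This
assumes the TPID and TCI are delivered in host byte order (the patch
applies ntohs() to skb->vlan_proto) and that the tag belongs directly
after the two MAC addresses; vlan_reinsert_sketch() and the two length
macros are hypothetical names, not part of any existing API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ADDRS_LEN_SKETCH 12	/* destination + source MAC */
#define VLAN_TAG_LEN_SKETCH   4	/* TPID (2 bytes) + TCI (2 bytes) */

/*
 * Copy the untagged frame `in` to `out`, re-inserting the VLAN tag after
 * the MAC addresses. tpid and tci are in host byte order and are written
 * in network (big-endian) order. Returns the tagged frame length, or 0
 * if the input is too short or the output buffer too small.
 */
size_t vlan_reinsert_sketch(const uint8_t *in, size_t in_len,
			    uint16_t tpid, uint16_t tci,
			    uint8_t *out, size_t out_cap)
{
	assert(in != NULL && out != NULL);

	if (in_len < ETH_ADDRS_LEN_SKETCH ||
	    out_cap < in_len + VLAN_TAG_LEN_SKETCH)
		return 0;

	memcpy(out, in, ETH_ADDRS_LEN_SKETCH);
	out[12] = (uint8_t)(tpid >> 8);
	out[13] = (uint8_t)(tpid & 0xff);
	out[14] = (uint8_t)(tci >> 8);
	out[15] = (uint8_t)(tci & 0xff);
	memcpy(out + ETH_ADDRS_LEN_SKETCH + VLAN_TAG_LEN_SKETCH,
	       in + ETH_ADDRS_LEN_SKETCH, in_len - ETH_ADDRS_LEN_SKETCH);

	return in_len + VLAN_TAG_LEN_SKETCH;
}
```

Without the TPID, a receiver cannot tell whether the stripped tag was an
802.1Q (0x8100) or an 802.1AD (0x88a8) tag, which is the gap the patch
aims to close.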


* Re: [virtio-net] BUG: sleeping function called from invalid context at kernel/mutex.c:616
From: Jason Wang @ 2013-10-22  8:35 UTC (permalink / raw)
  To: Fengguang Wu; +Cc: netdev, linux-kernel, virtualization
In-Reply-To: <20131020023418.GA6737@localhost>

On 10/20/2013 10:34 AM, Fengguang Wu wrote:
> Greetings,
>
> I got the below dmesg and the first bad commit is
>
> commit 3ab098df35f8b98b6553edc2e40234af512ba877
> Author: Jason Wang <jasowang@redhat.com>
> Date:   Tue Oct 15 11:18:58 2013 +0800
>
>     virtio-net: don't respond to cpu hotplug notifier if we're not ready
>     
>     We're trying to re-configure the affinity unconditionally in cpu hotplug
>     callback. This may lead to issues when resuming from s3/s4 since
>     
>     - virt queues haven't been allocated at that time.
>     - it's unnecessary since thaw method will re-configure the affinity.
>     
>     Fix this issue by checking config_enable and doing nothing if we're not ready.
>     
>     The bug was introduced by commit 8de4b2f3ae90c8fc0f17eeaab87d5a951b66ee17
>     (virtio-net: reset virtqueue affinity when doing cpu hotplug).
>     
>     Cc: Rusty Russell <rusty@rustcorp.com.au>
>     Cc: Michael S. Tsirkin <mst@redhat.com>
>     Cc: Wanlong Gao <gaowanlong@cn.fujitsu.com>
>     Acked-by: Michael S. Tsirkin <mst@redhat.com>
>     Reviewed-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
>     Signed-off-by: Jason Wang <jasowang@redhat.com>
>     Signed-off-by: David S. Miller <davem@davemloft.net>
>
> [  622.944441] CPU0 attaching NULL sched-domain.
> [  622.944446] CPU1 attaching NULL sched-domain.
> [  622.944485] CPU0 attaching NULL sched-domain.
> [  622.950795] BUG: sleeping function called from invalid context at kernel/mutex.c:616
> [  622.950796] in_atomic(): 1, irqs_disabled(): 1, pid: 10, name: migration/1
> [  622.950796] no locks held by migration/1/10.
> [  622.950798] CPU: 1 PID: 10 Comm: migration/1 Not tainted 3.12.0-rc5-wl-01249-gb91e82d #317
> [  622.950799] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
> [  622.950802]  0000000000000000 ffff88001d42dba0 ffffffff81a32f22 ffff88001bfb9c70
> [  622.950803]  ffff88001d42dbb0 ffffffff810edb02 ffff88001d42dc38 ffffffff81a396ed
> [  622.950805]  0000000000000046 ffff88001d42dbe8 ffffffff810e861d 0000000000000000
> [  622.950805] Call Trace:
> [  622.950810]  [<ffffffff81a32f22>] dump_stack+0x54/0x74
> [  622.950815]  [<ffffffff810edb02>] __might_sleep+0x112/0x114
> [  622.950817]  [<ffffffff81a396ed>] mutex_lock_nested+0x3c/0x3c6
> [  622.950818]  [<ffffffff810e861d>] ? up+0x39/0x3e
> [  622.950821]  [<ffffffff8153ea7c>] ? acpi_os_signal_semaphore+0x21/0x2d
> [  622.950824]  [<ffffffff81565ed1>] ? acpi_ut_release_mutex+0x5e/0x62
> [  622.950828]  [<ffffffff816d04ec>] virtnet_cpu_callback+0x33/0x87
> [  622.950830]  [<ffffffff81a42576>] notifier_call_chain+0x3c/0x5e
> [  622.950832]  [<ffffffff810e86a8>] __raw_notifier_call_chain+0xe/0x10
> [  622.950835]  [<ffffffff810c5556>] __cpu_notify+0x20/0x37
> [  622.950836]  [<ffffffff810c5580>] cpu_notify+0x13/0x15
> [  622.950838]  [<ffffffff81a237cd>] take_cpu_down+0x27/0x3a
> [  622.950841]  [<ffffffff81136289>] stop_machine_cpu_stop+0x93/0xf1
> [  622.950842]  [<ffffffff81136167>] cpu_stopper_thread+0xa0/0x12f
> [  622.950844]  [<ffffffff811361f6>] ? cpu_stopper_thread+0x12f/0x12f
> [  622.950847]  [<ffffffff81119710>] ? lock_release_holdtime.part.7+0xa3/0xa8
> [  622.950848]  [<ffffffff81135e4b>] ? cpu_stop_should_run+0x3f/0x47
> [  622.950850]  [<ffffffff810ea9b0>] smpboot_thread_fn+0x1c5/0x1e3
> [  622.950852]  [<ffffffff810ea7eb>] ? lg_global_unlock+0x67/0x67
> [  622.950854]  [<ffffffff810e36b7>] kthread+0xd8/0xe0
> [  622.950857]  [<ffffffff81a3bfad>] ? wait_for_common+0x12f/0x164
> [  622.950859]  [<ffffffff810e35df>] ? kthread_create_on_node+0x124/0x124
> [  622.950861]  [<ffffffff81a45ffc>] ret_from_fork+0x7c/0xb0
> [  622.950862]  [<ffffffff810e35df>] ? kthread_create_on_node+0x124/0x124
> [  622.950876] smpboot: CPU 1 is now offline
> [  623.194556] SMP alternatives: lockdep: fixing up alternatives
> [  623.194559] smpboot: Booting Node 0 Processor 1 APIC 0x1
 
Thanks for testing, Fengguang. Could you please try the attached
patch to see if it works?

[-- Attachment #2: 0001-virtio-net-fix.patch --]
[-- Type: text/x-patch, Size: 1464 bytes --]

>From 01e6c3f71c202aa02e4feda169e7cc9fb24193f5 Mon Sep 17 00:00:00 2001
From: Jason Wang <jasowang@redhat.com>
Date: Mon, 21 Oct 2013 20:39:09 +0800
Subject: [PATCH] virtio-net: fix

---
 drivers/net/virtio_net.c | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 9fbdfcd..bbc9cb8 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1118,11 +1118,6 @@ static int virtnet_cpu_callback(struct notifier_block *nfb,
 {
 	struct virtnet_info *vi = container_of(nfb, struct virtnet_info, nb);
 
-	mutex_lock(&vi->config_lock);
-
-	if (!vi->config_enable)
-		goto done;
-
 	switch(action & ~CPU_TASKS_FROZEN) {
 	case CPU_ONLINE:
 	case CPU_DOWN_FAILED:
@@ -1136,8 +1131,6 @@ static int virtnet_cpu_callback(struct notifier_block *nfb,
 		break;
 	}
 
-done:
-	mutex_unlock(&vi->config_lock);
 	return NOTIFY_OK;
 }
 
@@ -1699,6 +1692,8 @@ static int virtnet_freeze(struct virtio_device *vdev)
 	struct virtnet_info *vi = vdev->priv;
 	int i;
 
+	unregister_hotcpu_notifier(&vi->nb);
+
 	/* Prevent config work handler from accessing the device */
 	mutex_lock(&vi->config_lock);
 	vi->config_enable = false;
@@ -1747,6 +1742,10 @@ static int virtnet_restore(struct virtio_device *vdev)
 	virtnet_set_queues(vi, vi->curr_queue_pairs);
 	rtnl_unlock();
 
+	err = register_hotcpu_notifier(&vi->nb);
+	if (err)
+		return err;
+
 	return 0;
 }
 #endif
-- 
1.8.1.2
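
The attached patch drops the mutex from the CPU hotplug callback, which runs in atomic context according to the trace, and instead unregisters the notifier in virtnet_freeze() and registers it again in virtnet_restore(). The toy model below only sketches that idea in user space: a callback that is off the chain cannot be invoked, so it needs no lock to decide whether it is "ready". All names are hypothetical and none of this is kernel API.

```c
#include <stddef.h>

#define MAX_NOTIFIERS 8

/* Hypothetical stand-in for a kernel notifier block. */
struct toy_notifier {
	void (*call)(struct toy_notifier *nb);
	int calls;	/* how many events this block has seen */
};

static struct toy_notifier *chain[MAX_NOTIFIERS];

/* Add nb to the chain; returns 0 on success, -1 if the chain is full. */
int toy_register(struct toy_notifier *nb)
{
	for (size_t i = 0; i < MAX_NOTIFIERS; i++) {
		if (!chain[i]) {
			chain[i] = nb;
			return 0;
		}
	}
	return -1;
}

/* Remove nb from the chain; later events will not reach it. */
void toy_unregister(struct toy_notifier *nb)
{
	for (size_t i = 0; i < MAX_NOTIFIERS; i++)
		if (chain[i] == nb)
			chain[i] = NULL;
}

/* Deliver one event to every registered block. */
void toy_notify_all(void)
{
	for (size_t i = 0; i < MAX_NOTIFIERS; i++)
		if (chain[i])
			chain[i]->call(chain[i]);
}

/* Example callback: count the event and never block. */
void toy_count_event(struct toy_notifier *nb)
{
	nb->calls++;
}
```

In this model, "freeze" corresponds to toy_unregister() and "restore" to toy_register(); events delivered in between are simply not seen by the callback.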



^ permalink raw reply related

* [PATCH RESENT net-next] net: remove function sk_reset_txq()
From: ZHAO Gang @ 2013-10-22  8:23 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev

All sk_reset_txq() does is call sk_tx_queue_clear(), and its only
caller is dst_negative_advice() in sock.h. Make dst_negative_advice()
call sk_tx_queue_clear() directly so that the unneeded sk_reset_txq()
can be removed.

Signed-off-by: ZHAO Gang <gamerh2o@gmail.com>
---
Hope this time I don't mess it up. Sorry for the inconvenience.
---
 include/net/sock.h | 4 +---
 net/core/sock.c    | 6 ------
 2 files changed, 1 insertion(+), 9 deletions(-)

diff --git a/include/net/sock.h b/include/net/sock.h
index 86bb066..c93542f 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1746,8 +1746,6 @@ sk_dst_get(struct sock *sk)
 	return dst;
 }
 
-void sk_reset_txq(struct sock *sk);
-
 static inline void dst_negative_advice(struct sock *sk)
 {
 	struct dst_entry *ndst, *dst = __sk_dst_get(sk);
@@ -1757,7 +1755,7 @@ static inline void dst_negative_advice(struct sock *sk)
 
 		if (ndst != dst) {
 			rcu_assign_pointer(sk->sk_dst_cache, ndst);
-			sk_reset_txq(sk);
+			sk_tx_queue_clear(sk);
 		}
 	}
 }
diff --git a/net/core/sock.c b/net/core/sock.c
index 440afdc..ab20ed9 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -475,12 +475,6 @@ discard_and_relse:
 }
 EXPORT_SYMBOL(sk_receive_skb);
 
-void sk_reset_txq(struct sock *sk)
-{
-	sk_tx_queue_clear(sk);
-}
-EXPORT_SYMBOL(sk_reset_txq);
-
 struct dst_entry *__sk_dst_check(struct sock *sk, u32 cookie)
 {
 	struct dst_entry *dst = __sk_dst_get(sk);
-- 
1.8.3.1

^ permalink raw reply related

* Re: [PATCH net-next v2] bonding: move bond-specific init after enslave happens
From: Jiri Pirko @ 2013-10-22  7:50 UTC (permalink / raw)
  To: Veaceslav Falico; +Cc: netdev, dingtianhong, Jay Vosburgh, Andy Gospodarek
In-Reply-To: <1382348910-32724-1-git-send-email-vfalico@redhat.com>

Mon, Oct 21, 2013 at 11:48:30AM CEST, vfalico@redhat.com wrote:
>As Jiri noted, currently we first do all bonding-specific initialization
>(specifically - bond_select_active_slave(bond)) before we actually attach
>the slave (so that it becomes visible through bond_for_each_slave() and
>friends). This might result in bond_select_active_slave() not seeing the
>first/new slave and, thus, not actually selecting an active slave.
>
>Fix this by moving all the bond-related init part after we've actually
>completely initialized and linked (via bond_master_upper_dev_link()) the
>new slave.
>
>Also, remove the bond_(de/a)ttach_slave(), it's useless to have functions
>to ++/-- one int.
>
>After this we have all the initialization of the new slave *before*
>linking, and all the stuff that needs to be done on bonding *after* it. It
>has also a bonus effect - we can remove the locking on the new slave init
>completely, and only use it for bond_select_active_slave().
>
>Reported-by: Jiri Pirko <jiri@resnulli.us>
>CC: Jay Vosburgh <fubar@us.ibm.com>
>CC: Andy Gospodarek <andy@greyhouse.net>
>Signed-off-by: Veaceslav Falico <vfalico@redhat.com>

Reviewed-by: Jiri Pirko <jiri@resnulli.us>
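
The ordering problem described in the quoted commit message can be sketched with a toy model: if selection iterates only over slaves that are already linked, running it before the link step cannot find the new slave. This is a simplified illustration with hypothetical names, not the bonding driver's data structures or locking.

```c
#include <stddef.h>

#define TOY_MAX_SLAVES 4

struct toy_slave {
	int link_up;
};

struct toy_bond {
	struct toy_slave *slaves[TOY_MAX_SLAVES];
	size_t slave_cnt;
	struct toy_slave *active;
};

/* Pick the first linked slave whose link is up; NULL if there is none. */
void toy_select_active(struct toy_bond *bond)
{
	bond->active = NULL;
	for (size_t i = 0; i < bond->slave_cnt; i++) {
		if (bond->slaves[i]->link_up) {
			bond->active = bond->slaves[i];
			return;
		}
	}
}

/* Make the slave visible to iteration. Returns 0, or -1 if full. */
int toy_link_slave(struct toy_bond *bond, struct toy_slave *s)
{
	if (bond->slave_cnt == TOY_MAX_SLAVES)
		return -1;
	bond->slaves[bond->slave_cnt++] = s;
	return 0;
}
```

Calling toy_select_active() before toy_link_slave() leaves active as NULL for the first slave, which mirrors the behaviour the patch fixes by moving the selection after the link step.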

^ permalink raw reply

* Re: [PATCH nf-next] netfilter: xtables: lightweight process control group matching
From: Daniel Wagner @ 2013-10-22  7:45 UTC (permalink / raw)
  To: Ni, Xun, Daniel Borkmann
  Cc: Eric W. Biederman, pablo@netfilter.org,
	netfilter-devel@vger.kernel.org, netdev@vger.kernel.org,
	Tejun Heo, cgroups@vger.kernel.org
In-Reply-To: <91E2D863603AD4478F101CE81E76E45D01828D59@SHSMSX103.ccr.corp.intel.com>

Hi Xun,

On 10/22/2013 08:15 AM, Ni, Xun wrote:
> Hello, Daniel:
> can all your examples block early before doing network operations?

I was referring to the Linux Security Module (LSM) framework, which
lets you define access policies for an application, e.g. which ports
it is allowed to use.

If the goal is just to block those ports, you don't have to go through
half of the networking stack to find out via an iptables rule that
the access is not allowed.

> What's the whole netfilter universe? Can you give us more clear
> examples?

I am not sure if I understood your question correctly. In case you
are asking what netfilter is, I would like to point you to the
project page at http://www.netfilter.org/.

cheers,
daniel

^ permalink raw reply

* Re: [PATCH nf-next] netfilter: xtables: lightweight process control group matching
From: Daniel Borkmann @ 2013-10-22  7:42 UTC (permalink / raw)
  To: Ni, Xun
  Cc: Daniel Wagner, Eric W. Biederman, pablo@netfilter.org,
	netfilter-devel@vger.kernel.org, netdev@vger.kernel.org,
	Tejun Heo, cgroups@vger.kernel.org
In-Reply-To: <91E2D863603AD4478F101CE81E76E45D01828D59@SHSMSX103.ccr.corp.intel.com>

On 10/22/2013 09:15 AM, Ni, Xun wrote:
> Hello, Daniel:
>     can all your examples block early before doing network operations? What's the whole netfilter universe? Can you give us more clear examples?

As you can see from the code, the netfilter hooks are located
in NF_INET_LOCAL_OUT and NF_INET_POST_ROUTING.

> Thanks
> On 10/21/2013 05:09 PM, Daniel Wagner wrote:
>> On 10/19/2013 08:16 AM, Daniel Borkmann wrote:
>>> On 10/19/2013 01:21 AM, Eric W. Biederman wrote:
>>>
>>>> I am coming to this late.  But two concrete suggestions.
>>>>
>>>> 1) process groups and sessions don't change as frequently as pids.
>>>>
>>>> 2) It is possible to put a set of processes in their own network
>>>>      namespace and pipe just the packets you want those processes to
>>>>      use into that network namespace.  Using an ingress queueing filter
>>>>      makes that process very efficient even if you have to filter by port.
>>>
>>> Actually in our case we're filtering outgoing traffic, based on which
>>> local socket it originated from; so you wouldn't need all of that
>>> construct. Also, you wouldn't even need to have a priori knowledge
>>> of the application internals regarding their use of particular
>>> ports or protocols. I don't think that such a setup will have the
>>> same efficiency, ease of use, and power to distinguish the
>>> application the traffic came from in such a lightweight,
>>> protocol independent and easy way.
>>
>> Sorry for being late as well (and also stupid question)
>>
>> Couldn't you use something from the LSM? I mean you allow the
>> application to create the socket etc and then block later the traffic
>> originated from that socket. Wouldn't it make more sense to block
>> early?
>
> I gave one simple example for blocking in the commit message, that's true,
> but it is not limited to that, meaning we can have much different
> scenarios/policies that netfilter allows us than just blocking, e.g. fine
> grained settings where applications are allowed to connect/send traffic to,
> application traffic marking/conntracking, application-specific packet
> mangling, and so on, just think of the whole netfilter universe.
>

^ permalink raw reply

* Re: [PATCH net-next v2] bonding: move bond-specific init after enslave happens
From: Ding Tianhong @ 2013-10-22  7:38 UTC (permalink / raw)
  To: Veaceslav Falico; +Cc: netdev, jiri, Jay Vosburgh, Andy Gospodarek
In-Reply-To: <1382348910-32724-1-git-send-email-vfalico@redhat.com>

On 2013/10/21 17:48, Veaceslav Falico wrote:
> As Jiri noted, currently we first do all bonding-specific initialization
> (specifically - bond_select_active_slave(bond)) before we actually attach
> the slave (so that it becomes visible through bond_for_each_slave() and
> friends). This might result in bond_select_active_slave() not seeing the
> first/new slave and, thus, not actually selecting an active slave.
> 
> Fix this by moving all the bond-related init part after we've actually
> completely initialized and linked (via bond_master_upper_dev_link()) the
> new slave.
> 
> Also, remove the bond_(de/a)ttach_slave(), it's useless to have functions
> to ++/-- one int.
> 
> After this we have all the initialization of the new slave *before*
> linking, and all the stuff that needs to be done on bonding *after* it. It
> has also a bonus effect - we can remove the locking on the new slave init
> completely, and only use it for bond_select_active_slave().
> 
> Reported-by: Jiri Pirko <jiri@resnulli.us>
> CC: Jay Vosburgh <fubar@us.ibm.com>
> CC: Andy Gospodarek <andy@greyhouse.net>
> Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
> ---
> 
> Notes:
>     v1 -> v2:
>     Move the bond_(de/a)ttach_slave() functionality, and remove these
>     functions.
> 
>  drivers/net/bonding/bond_main.c | 65 +++++++++--------------------------------
>  1 file changed, 14 insertions(+), 51 deletions(-)
> 
> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> index d90734f..2daa066 100644
> --- a/drivers/net/bonding/bond_main.c
> +++ b/drivers/net/bonding/bond_main.c
> @@ -967,33 +967,6 @@ void bond_select_active_slave(struct bonding *bond)
>  	}
>  }
>  
> -/*--------------------------- slave list handling ---------------------------*/
> -
> -/*
> - * This function attaches the slave to the end of list.
> - *
> - * bond->lock held for writing by caller.
> - */
> -static void bond_attach_slave(struct bonding *bond, struct slave *new_slave)
> -{
> -	bond->slave_cnt++;
> -}
> -
> -/*
> - * This function detaches the slave from the list.
> - * WARNING: no check is made to verify if the slave effectively
> - * belongs to <bond>.
> - * Nothing is freed on return, structures are just unchained.
> - * If any slave pointer in bond was pointing to <slave>,
> - * it should be changed by the calling function.
> - *
> - * bond->lock held for writing by caller.
> - */
> -static void bond_detach_slave(struct bonding *bond, struct slave *slave)
> -{
> -	bond->slave_cnt--;
> -}
> -
>  #ifdef CONFIG_NET_POLL_CONTROLLER
>  static inline int slave_enable_netpoll(struct slave *slave)
>  {
> @@ -1471,22 +1444,13 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev)
>  		goto err_close;
>  	}
>  
> -	write_lock_bh(&bond->lock);
> -
>  	prev_slave = bond_last_slave(bond);
> -	bond_attach_slave(bond, new_slave);
>  
>  	new_slave->delay = 0;
>  	new_slave->link_failure_count = 0;
>  
> -	write_unlock_bh(&bond->lock);
> -
> -	bond_compute_features(bond);
> -
>  	bond_update_speed_duplex(new_slave);
>  
> -	read_lock(&bond->lock);
> -
>  	new_slave->last_arp_rx = jiffies -
>  		(msecs_to_jiffies(bond->params.arp_interval) + 1);
>  	for (i = 0; i < BOND_MAX_ARP_TARGETS; i++)
> @@ -1547,12 +1511,9 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev)
>  		}
>  	}
>  
> -	write_lock_bh(&bond->curr_slave_lock);
> -
>  	switch (bond->params.mode) {
>  	case BOND_MODE_ACTIVEBACKUP:
>  		bond_set_slave_inactive_flags(new_slave);
> -		bond_select_active_slave(bond);
>  		break;
>  	case BOND_MODE_8023AD:
>  		/* in 802.3ad mode, the internal mechanism
> @@ -1578,7 +1539,6 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev)
>  	case BOND_MODE_ALB:
>  		bond_set_active_slave(new_slave);
>  		bond_set_slave_inactive_flags(new_slave);
> -		bond_select_active_slave(bond);
>  		break;
>  	default:
>  		pr_debug("This slave is always active in trunk mode\n");
> @@ -1596,10 +1556,6 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev)
>  		break;
>  	} /* switch(bond_mode) */
>  
> -	write_unlock_bh(&bond->curr_slave_lock);
> -
> -	bond_set_carrier(bond);
> -
>  #ifdef CONFIG_NET_POLL_CONTROLLER
>  	slave_dev->npinfo = bond->dev->npinfo;
>  	if (slave_dev->npinfo) {
> @@ -1614,8 +1570,6 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev)
>  	}
>  #endif
>  
> -	read_unlock(&bond->lock);
> -
>  	res = netdev_rx_handler_register(slave_dev, bond_handle_frame,
>  					 new_slave);
>  	if (res) {
> @@ -1629,6 +1583,17 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev)
>  		goto err_unregister;
>  	}
>  
> +	bond->slave_cnt++;
> +	bond_compute_features(bond);
> +	bond_set_carrier(bond);
> +
> +	if (USES_PRIMARY(bond->params.mode)) {
> +		read_lock(&bond->lock);
> +		write_lock_bh(&bond->curr_slave_lock);
> +		bond_select_active_slave(bond);
> +		write_unlock_bh(&bond->curr_slave_lock);
> +		read_unlock(&bond->lock);
> +	}
>  
>  	pr_info("%s: enslaving %s as a%s interface with a%s link.\n",
>  		bond_dev->name, slave_dev->name,
> @@ -1648,7 +1613,6 @@ err_detach:
>  
>  	vlan_vids_del_by_dev(slave_dev, bond_dev);
>  	write_lock_bh(&bond->lock);
> -	bond_detach_slave(bond, new_slave);
>  	if (bond->primary_slave == new_slave)
>  		bond->primary_slave = NULL;
>  	if (bond->curr_active_slave == new_slave) {
> @@ -1686,7 +1650,6 @@ err_free:
>  	kfree(new_slave);
>  
>  err_undo_flags:
> -	bond_compute_features(bond);
>  	/* Enslave of first slave has failed and we need to fix master's mac */
>  	if (!bond_has_slaves(bond) &&
>  	    ether_addr_equal(bond_dev->dev_addr, slave_dev->dev_addr))
> @@ -1740,6 +1703,9 @@ static int __bond_release_one(struct net_device *bond_dev,
>  
>  	write_unlock_bh(&bond->lock);
>  
> +	/* release the slave from its bond */
> +	bond->slave_cnt--;
> +
>  	bond_upper_dev_unlink(bond_dev, slave_dev);
>  	/* unregister rx_handler early so bond_handle_frame wouldn't be called
>  	 * for this slave anymore.
> @@ -1764,9 +1730,6 @@ static int __bond_release_one(struct net_device *bond_dev,
>  
>  	bond->current_arp_slave = NULL;
>  
> -	/* release the slave from its bond */
> -	bond_detach_slave(bond, slave);
> -
>  	if (!all && !bond->params.fail_over_mac) {
>  		if (ether_addr_equal(bond_dev->dev_addr, slave->perm_hwaddr) &&
>  		    bond_has_slaves(bond))
> 

Acked-by: Ding Tianhong <dingtianhong@huawei.com>

^ permalink raw reply

* Re: [PATCH nf-next] netfilter: xtables: lightweight process control group matching
From: Daniel Wagner @ 2013-10-22  7:36 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Eric W. Biederman, pablo-Cap9r6Oaw4JrovVCs/uTlw,
	netfilter-devel-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA, Tejun Heo,
	cgroups-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <52654CE6.7030706-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

On 10/21/2013 04:48 PM, Daniel Borkmann wrote:
> On 10/21/2013 05:09 PM, Daniel Wagner wrote:
>> On 10/19/2013 08:16 AM, Daniel Borkmann wrote:
>>> On 10/19/2013 01:21 AM, Eric W. Biederman wrote:
>>>
>>>> I am coming to this late.  But two concrete suggestions.
>>>>
>>>> 1) process groups and sessions don't change as frequently as pids.
>>>>
>>>> 2) It is possible to put a set of processes in their own network
>>>>     namespace and pipe just the packets you want those processes to
>>>>     use into that network namespace.  Using an ingress queueing filter
>>>>     makes that process very efficient even if you have to filter by
>>>> port.
>>>
>>> Actually in our case we're filtering outgoing traffic, based on which
>>> local socket it originated from; so you wouldn't need all of that
>>> construct. Also, you wouldn't even need to have a priori knowledge of
>>> the application internals regarding their use of particular ports
>>> or protocols. I don't think that such a setup will have the same
>>> efficiency, ease of use, and power to distinguish the application the
>>> traffic came from in such a lightweight, protocol independent and
>>> easy way.
>>
>> Sorry for being late as well (and also stupid question)
>>
>> Couldn't you use something from the LSM? I mean you allow the
>> application to create the socket etc and then block later
>> the traffic originated from that socket. Wouldn't it make
>> more sense to block early?
>
> I gave one simple example for blocking in the commit message,
> that's true, but it is not limited to that, meaning we can have
> much different scenarios/policies that netfilter allows us than
> just blocking, e.g. fine grained settings where applications are
> allowed to connect/send traffic to, application traffic marking/
> conntracking, application-specific packet mangling, and so on,
> just think of the whole netfilter universe.

Oh, I didn't pay enough attention to the commit message. Sorry
about that. Obviously, if fine-grained settings are a must,
then blocking the write is not good enough.

cheers,
daniel

^ permalink raw reply

* Re: [PATCH net 1/2] sit: allow to use rtnl ops on fb tunnel
From: Nicolas Dichtel @ 2013-10-22  7:34 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: David Miller, netdev, steffen.klassert, pshelar, gregkh, stable
In-Reply-To: <CA+FuTScA26dW8bO5YdKTKp9tjSM-xVqkdkq8xzu-wH0cT=Wh8Q@mail.gmail.com>

Le 22/10/2013 01:30, Willem de Bruijn a écrit :
> On Wed, Oct 2, 2013 at 3:15 AM, Nicolas Dichtel
> <nicolas.dichtel@6wind.com> wrote:
>> Le 01/10/2013 18:59, David Miller a écrit :
>>
>>> From: Nicolas Dichtel <nicolas.dichtel@6wind.com>
>>> Date: Tue,  1 Oct 2013 18:04:59 +0200
>>>
>>>> rtnl ops were introduced by ba3e3f50a0e5 ("sit: advertise tunnel param via
>>>> rtnl"), but I forgot to assign rtnl ops to fb tunnels.
>>>>
>>>> Now that it is done, we must remove the explicit call to
>>>> unregister_netdevice_queue(), because the fallback tunnel is added to the
>>>> queue in sit_destroy_tunnels() when checking rtnl_link_ops of all
>>>> netdevices (this is valid since commit 5e6700b3bf98 ("sit: add support of
>>>> x-netns")).
>>>>
>>>> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
>>>
>>>
>>> Applied and queued up for -stable.
>>>
>>> But I imagine since the x-netns changes aren't in various -stable
>>> branches this will need to be adjusted a bit?
>>
>> Yes, it's what I've tried to say in the commit log ;-)
>>
>> In fact, before the x-netns changes, we must keep the
>> unregister_netdevice_queue() line.
>
> In 3.11 linux-stable, this patch was merged between 3.11.4 and 3.11.5
> in commit 3783100, after the x-netns changes in commit 5e6700b3bf, but
> the unregister_netdevice_queue was kept.
>
> I think that caused the following bug. In 3.11.6, a simple `modprobe
> sit && rmmod sit` hits the BUG at net/core/dev.c:5039:
>
>    BUG_ON(dev->reg_state != NETREG_REGISTERED);
>
> The device is actually NETREG_RELEASED at one point because the device
> is now unregistered twice. The issue goes away by porting the
> remainder of the original commit: the one liner that removes the call
> to unregister_netdevice_queue.
>
> +++ b/net/ipv6/sit.c
> @@ -1708,7 +1708,6 @@ static void __net_exit sit_exit_net(struct net *net)
>
>          sit_destroy_tunnels(sitn, &list);
> -       unregister_netdevice_queue(sitn->fb_tunnel_dev, &list);
>          unregister_netdevice_many(&list);
>
> If correct, let me know if I should send a proper one-line patch
> against 3.11.y. Since I haven't looked at this code before, I found it
> safer to report the issue first.
Yes, this line should be removed, like it was done in the original patch
(x-netns for sit is part of 3.11).

>
> 5e6700b3bf was not applied to 3.10 stable, so that branch is not affected.
Right.
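
The failure Willem describes comes from the fallback device being queued for unregistration twice: once by sit_destroy_tunnels() and once by the explicit unregister_netdevice_queue() call. The toy model below sketches why the second pass trips a state check. The names and the two-state enum are hypothetical simplifications; the real kernel uses BUG_ON() and more registration states.

```c
/* Simplified device registration states (hypothetical). */
enum toy_reg_state {
	TOY_REGISTERED,
	TOY_UNREGISTERING,
};

struct toy_netdev {
	enum toy_reg_state reg_state;
};

/*
 * Process one entry of an unregister queue.
 * Returns 0 on success, -1 where the kernel's state check would fire.
 */
int toy_unregister_one(struct toy_netdev *dev)
{
	if (dev->reg_state != TOY_REGISTERED)
		return -1;
	dev->reg_state = TOY_UNREGISTERING;
	return 0;
}

/* Unregister every queued device; returns the number of failed checks. */
int toy_unregister_many(struct toy_netdev **queue, int n)
{
	int failures = 0;

	for (int i = 0; i < n; i++)
		if (toy_unregister_one(queue[i]) != 0)
			failures++;
	return failures;
}
```

With the same device present twice in the queue, the second entry fails the check; queueing it once, as the one-line removal does, avoids that.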

^ permalink raw reply

* [PATCH 2/2] [PATCH] ax88179_178a: Add VID:DID for Samsung USB Ethernet Adapter
From: freddy @ 2013-10-22  7:32 UTC (permalink / raw)
  To: davem, linux-usb, linux-kernel, netdev, allan, louis; +Cc: Freddy Xin
In-Reply-To: <1382427131-2429-1-git-send-email-freddy@asix.com.tw>

From: Freddy Xin <freddy@asix.com.tw>

Add VID:DID for Samsung USB Ethernet Adapter.

Signed-off-by: Freddy Xin <freddy@asix.com.tw>
---
 drivers/net/usb/ax88179_178a.c |   19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/drivers/net/usb/ax88179_178a.c b/drivers/net/usb/ax88179_178a.c
index 3bcd0d9..846cc19 100644
--- a/drivers/net/usb/ax88179_178a.c
+++ b/drivers/net/usb/ax88179_178a.c
@@ -1406,6 +1406,19 @@ static const struct driver_info sitecom_info = {
 	.tx_fixup = ax88179_tx_fixup,
 };
 
+static const struct driver_info samsung_info = {
+	.description = "Samsung USB Ethernet Adapter",
+	.bind = ax88179_bind,
+	.unbind = ax88179_unbind,
+	.status = ax88179_status,
+	.link_reset = ax88179_link_reset,
+	.reset = ax88179_reset,
+	.stop = ax88179_stop,
+	.flags = FLAG_ETHER | FLAG_FRAMING_AX,
+	.rx_fixup = ax88179_rx_fixup,
+	.tx_fixup = ax88179_tx_fixup,
+};
+
 static const struct usb_device_id products[] = {
 {
 	/* ASIX AX88179 10/100/1000 */
@@ -1418,7 +1431,11 @@ static const struct usb_device_id products[] = {
 }, {
 	/* Sitecom USB 3.0 to Gigabit Adapter */
 	USB_DEVICE(0x0df6, 0x0072),
-	.driver_info = (unsigned long) &sitecom_info,
+	.driver_info = (unsigned long)&sitecom_info,
+}, {
+	/* Samsung USB Ethernet Adapter */
+	USB_DEVICE(0x04e8, 0xa100),
+	.driver_info = (unsigned long)&samsung_info,
 },
 	{ },
 };
-- 
1.7.10.4

^ permalink raw reply related

