Netdev List

* Re: ip_queue_xmit() used illegally
From: Vladislav Yasevich @ 2011-05-06 21:10 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, yjwei, jchapman
In-Reply-To: <20110506.132129.59693228.davem@davemloft.net>

On 05/06/2011 04:21 PM, David Miller wrote:
> From: David Miller <davem@davemloft.net>
> Date: Fri, 06 May 2011 12:26:56 -0700 (PDT)
> 
>> SCTP stores its binding information using transports and associations
>> and does not fill in the ->inet_{daddr,saddr} values.
>>
>> It tries to work around this route issue by checking dst->obsolete
>> directly in sctp_packet_transmit(), which just makes the race smaller
>> and does not eliminate it.  ip_queue_xmit() can still end up with
>> __sk_dst_check() returning NULL and then we end up emitting a
>> potentially bogus packet.
> 
> I take this back, we added this hack where things like SCTP can
> pre-route the packet by hooking up the route to the SKB before
> calling ->queue_xmit.
> 
> And L2TP does something similar.
> 
> So false alarm, nothing to see here :-)
> 
> I still want to clean this up so that this kind of stuff can be
> handled generically inside of ->queue_xmit() by passing in the correct
> addressing information.
> 

Wow, you had me scrambling there for a while.  I was just about to send a note
about the pre-hooked route, but you beat me to it.

The reason SCTP doesn't change the inet_addr is that the address can theoretically
change on every packet transmit due to the multi-homing nature of SCTP.

I'll take a look at ->queue_xmit() to see if SCTP can convert to using that.

-vlad

* Re: ip_queue_xmit() used illegally
From: David Miller @ 2011-05-06 20:21 UTC (permalink / raw)
  To: netdev; +Cc: vladislav.yasevich, yjwei, jchapman
In-Reply-To: <20110506.122656.189696988.davem@davemloft.net>

From: David Miller <davem@davemloft.net>
Date: Fri, 06 May 2011 12:26:56 -0700 (PDT)

> SCTP stores its binding information using transports and associations
> and does not fill in the ->inet_{daddr,saddr} values.
> 
> It tries to work around this route issue by checking dst->obsolete
> directly in sctp_packet_transmit(), which just makes the race smaller
> and does not eliminate it.  ip_queue_xmit() can still end up with
> __sk_dst_check() returning NULL and then we end up emitting a
> potentially bogus packet.

I take this back, we added this hack where things like SCTP can
pre-route the packet by hooking up the route to the SKB before
calling ->queue_xmit.

And L2TP does something similar.

So false alarm, nothing to see here :-)

I still want to clean this up so that this kind of stuff can be
handled generically inside of ->queue_xmit() by passing in the correct
addressing information.

* Re: [Pv-drivers] [PATCH] vmxnet3: Consistently disable irqs when taking adapter->cmd_lock
From: David Miller @ 2011-05-06 20:12 UTC (permalink / raw)
  To: scottjg; +Cc: sbhatewara, roland, pv-drivers, netdev
In-Reply-To: <F78BCF638F95D74A99D036114107EDB5028EFAA0F8@EXCH-MBX-3.vmware.com>

From: Scott Goldman <scottjg@vmware.com>
Date: Fri, 6 May 2011 13:10:29 -0700

>> 
>> On Fri, 6 May 2011, Roland Dreier wrote:
>> 
>> > From: Roland Dreier <roland@purestorage.com>
>> >
>> > Using the vmxnet3 driver produces a lockdep warning because
>> 
>> > Signed-off-by: Roland Dreier <roland@purestorage.com>
>> 
>> 
>> Roland, thanks for the analysis and the patch.
>> 
>> Signed-off-by: Shreyas N Bhatewara <sbhatewara@vmware.com>
> 
> Likewise, seems pretty sane to me. The command register operations are only control-path operations and disabling interrupts for the duration is probably not a big deal. Touching the cmd reg will result in a VMEXIT, where the guest won't be processing interrupts anyway.
> 
> Signed-off-by: Scott J. Goldman <scottjg@vmware.com>

Applied, thanks everyone.

* RE: [Pv-drivers] [PATCH] vmxnet3: Consistently disable irqs when taking adapter->cmd_lock
From: Scott Goldman @ 2011-05-06 20:10 UTC (permalink / raw)
  To: Shreyas Bhatewara, Roland Dreier
  Cc: pv-drivers@vmware.com, netdev@vger.kernel.org, David S. Miller
In-Reply-To: <alpine.LRH.2.00.1105061206500.26660@sbhatewara-dev1.eng.vmware.com>

> 
> On Fri, 6 May 2011, Roland Dreier wrote:
> 
> > From: Roland Dreier <roland@purestorage.com>
> >
> > Using the vmxnet3 driver produces a lockdep warning because
> 
> > Signed-off-by: Roland Dreier <roland@purestorage.com>
> 
> 
> Roland, thanks for the analysis and the patch.
> 
> Signed-off-by: Shreyas N Bhatewara <sbhatewara@vmware.com>

Likewise, seems pretty sane to me. The command register operations are only control-path operations and disabling interrupts for the duration is probably not a big deal. Touching the cmd reg will result in a VMEXIT, where the guest won't be processing interrupts anyway.

Signed-off-by: Scott J. Goldman <scottjg@vmware.com>

* Re: [PATCH] dccp: handle invalid feature options length
From: David Miller @ 2011-05-06 20:04 UTC (permalink / raw)
  To: gerrit; +Cc: drosenberg, dccp, netdev, linux-kernel, security
In-Reply-To: <20110506195733.GA3527@gerrit.erg.abdn.ac.uk>

From: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Date: Fri, 6 May 2011 21:57:33 +0200

> Quoting Dan Rosenberg:
> | A length of zero (after subtracting two for the type and len fields) for
> | the DCCPO_{CHANGE,CONFIRM}_{L,R} options will cause an underflow due to
> | the subtraction.  The subsequent code may read past the end of the
> | options value buffer when parsing.  I'm unsure of what the consequences
> | of this might be, but it's probably not good.
> | 
> Please disregard my earlier message, I erred.
> Dan is right, his patch is correct and definitively valid.
> A length of 0 would be cast to 0xff and then cause buffer overrun.
> 
> | Signed-off-by: Dan Rosenberg <drosenberg@vsecurity.com>
> | Cc: stable@kernel.org
> Acked-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>

Great, I'll apply this, thanks!

* Re: [PATCH] dccp: handle invalid feature options length
From: Gerrit Renker @ 2011-05-06 19:57 UTC (permalink / raw)
  To: Dan Rosenberg; +Cc: davem, dccp, netdev, linux-kernel, security
In-Reply-To: <1304688438.29544.16.camel@dan>

Quoting Dan Rosenberg:
| A length of zero (after subtracting two for the type and len fields) for
| the DCCPO_{CHANGE,CONFIRM}_{L,R} options will cause an underflow due to
| the subtraction.  The subsequent code may read past the end of the
| options value buffer when parsing.  I'm unsure of what the consequences
| of this might be, but it's probably not good.
| 
Please disregard my earlier message, I erred.
Dan is right, his patch is correct and definitively valid.
A length of 0 would be cast to 0xff and then cause buffer overrun.

| Signed-off-by: Dan Rosenberg <drosenberg@vsecurity.com>
| Cc: stable@kernel.org
Acked-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
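
For readers following along, the wrap-around described above can be reproduced in user space. This is a minimal sketch, not the DCCP parser: feat_val_len() and feat_len_is_valid() are hypothetical helpers, and the sketch assumes the value length is obtained by subtracting one more byte from the option length and storing the result in an 8-bit unsigned type.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of value bytes left once one more byte
 * has been consumed from the option.  opt_len is the option length
 * with the type and len fields already subtracted.
 */
static uint8_t feat_val_len(uint8_t opt_len)
{
	/*
	 * opt_len - 1 is evaluated as an int.  For opt_len == 0 the
	 * result is -1, and converting -1 to uint8_t wraps to 0xff.
	 */
	return (uint8_t)(opt_len - 1);
}

/* Hypothetical guard: reject an empty option before subtracting. */
static bool feat_len_is_valid(uint8_t opt_len)
{
	return opt_len != 0;
}
```

With these definitions, feat_val_len(0) yields 0xff, so a parser that trusts the result would walk 255 bytes past an empty value; checking feat_len_is_valid() first avoids the subtraction entirely.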

* future developments of usbnet
From: Oliver Neukum @ 2011-05-06 18:45 UTC (permalink / raw)
  To: netdev, linux-usb

Hi,

I'd like to get a feeling for what people out there are working on regarding usbnet.
So if you are doing something, or think something ought to be done, please
speak up now.

IMHO usbnet needs better support for

- batching protocols
- double buffering on the rx path

with the latter having higher priority.

Comments?

	Regards
		Oliver

* ip_queue_xmit() used illegally
From: David Miller @ 2011-05-06 19:26 UTC (permalink / raw)
  To: netdev; +Cc: vladislav.yasevich, yjwei, jchapman

Several users of ip_queue_xmit() use it illegally.

I've only audited L2TP and SCTP so far, and they both cannot use
ip_queue_xmit() with the way they operate currently.

The issue surrounds how the socket binding is maintained in
inet->inet_daddr, inet->inet_saddr etc.

TCP does things right, in that ip_queue_xmit() is only invoked with
inet->inet_daddr and inet->inet_saddr having fully resolved, final,
fully connected values.

This is an absolute requirement because if the socket's route
invalidates (which happens completely asynchronously) it's going to
look up a new route using whatever is stored in
inet->inet_{daddr,saddr} and then use those addresses to build the
packet.  Even if ->inet_{saddr,daddr} are both zero this will still
emit a packet (bonus points if you know what addresses will be picked,
no peeking at route.c :-).

SCTP stores its binding information using transports and associations
and does not fill in the ->inet_{daddr,saddr} values.

It tries to work around this route issue by checking dst->obsolete
directly in sctp_packet_transmit(), which just makes the race smaller
and does not eliminate it.  ip_queue_xmit() can still end up with
__sk_dst_check() returning NULL and then we end up emitting a
potentially bogus packet.

L2TP supports more of a datagram type socket semantic than a stream
one, it allows unconnected modes of operation.  And for this reason
it also cannot use ip_queue_xmit() legally.

After a quick cursory scan it seems like DCCP is OK.

I think SCTP could potentially be fixed by simply filling in the
inet->inet_{daddr,saddr} values when it makes an internal binding
of the transport via sctp_transport_route().

L2TP on the other hand will need to use another interface to send ipv4
packets because it allows disconnected operation.
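
The failure mode described in this message, and the pre-routing approach mentioned in the replies, can be modelled in plain C. This is a user-space sketch under stated assumptions, not kernel code: struct model_route, struct model_sock, struct model_skb, model_dst_check() and pick_route() are all hypothetical names, and the fallback route lookup is reduced to copying whatever addresses the socket has stored.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for a route, a socket and a packet. */
struct model_route {
	uint32_t saddr;
	uint32_t daddr;
	int obsolete;		/* non-zero once the route was invalidated */
};

struct model_sock {
	uint32_t inet_saddr;	/* zero if the protocol never filled it in */
	uint32_t inet_daddr;
	struct model_route *dst_cache;
};

struct model_skb {
	struct model_route *route;	/* non-NULL if the caller pre-routed */
};

/* Return the cached route, or NULL if it is missing or obsolete. */
static struct model_route *model_dst_check(struct model_sock *sk)
{
	struct model_route *rt = sk->dst_cache;

	if (rt && rt->obsolete) {
		sk->dst_cache = NULL;
		return NULL;
	}
	return rt;
}

/* Pick the route a packet would be built from. */
static struct model_route *pick_route(struct model_sock *sk,
				      struct model_skb *skb,
				      struct model_route *fresh)
{
	struct model_route *rt;

	/* A pre-routed packet never consults the socket addresses. */
	if (skb->route)
		return skb->route;

	rt = model_dst_check(sk);
	if (rt)
		return rt;

	/* Re-lookup: uses whatever the socket holds, even zeroes. */
	fresh->saddr = sk->inet_saddr;
	fresh->daddr = sk->inet_daddr;
	fresh->obsolete = 0;
	sk->dst_cache = fresh;
	return fresh;
}
```

In this model, a socket with zeroed addresses and an obsolete cached route gets a route built from zeroes, which corresponds to the bogus-packet case; attaching a route to the packet before calling pick_route() sidesteps the lookup.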

* Re: [PATCH] vmxnet3: Consistently disable irqs when taking adapter->cmd_lock
From: Shreyas Bhatewara @ 2011-05-06 19:21 UTC (permalink / raw)
  To: Roland Dreier
  Cc: David S. Miller, pv-drivers@vmware.com, netdev@vger.kernel.org
In-Reply-To: <1304706773-21348-1-git-send-email-roland@kernel.org>

On Fri, 6 May 2011, Roland Dreier wrote:

> From: Roland Dreier <roland@purestorage.com>
> 
> Using the vmxnet3 driver produces a lockdep warning because

> Signed-off-by: Roland Dreier <roland@purestorage.com>

Roland, thanks for the analysis and the patch.

Signed-off-by: Shreyas N Bhatewara <sbhatewara@vmware.com>

* [PATCH] hamachi: Delete TX checksumming code commented out since 1999
From: David Miller @ 2011-05-06 18:59 UTC (permalink / raw)
  To: netdev


TX checksumming support has been ifdef commented out of this driver
for more than 10 years, and it makes references to aspects of the IPv4
stack from back then as well.

If someone has one of these rare cards and wants to properly resurrect
TX checksumming support, they can still get at this code in the
version control history.

Signed-off-by: David S. Miller <davem@davemloft.net>
---

I stumbled over this cruft while auditing ip_queue_xmit() users.

 drivers/net/hamachi.c |   79 -------------------------------------------------
 1 files changed, 0 insertions(+), 79 deletions(-)

diff --git a/drivers/net/hamachi.c b/drivers/net/hamachi.c
index 80d25ed..f5fba73 100644
--- a/drivers/net/hamachi.c
+++ b/drivers/net/hamachi.c
@@ -132,14 +132,8 @@ static int tx_params[MAX_UNITS] = {-1, -1, -1, -1, -1, -1, -1, -1};
 /*
  * RX_CHECKSUM turns on card-generated receive checksum generation for
  *   TCP and UDP packets.  Otherwise the upper layers do the calculation.
- * TX_CHECKSUM won't do anything too useful, even if it works.  There's no
- *   easy mechanism by which to tell the TCP/UDP stack that it need not
- *   generate checksums for this device.  But if somebody can find a way
- *   to get that to work, most of the card work is in here already.
  * 3/10/1999 Pete Wyckoff <wyckoff@ca.sandia.gov>
  */
-#undef  TX_CHECKSUM
-#define RX_CHECKSUM
 
 /* Operational parameters that usually are not changed. */
 /* Time in jiffies before concluding the transmitter is hung. */
@@ -630,11 +624,6 @@ static int __devinit hamachi_init_one (struct pci_dev *pdev,
 
 	SET_NETDEV_DEV(dev, &pdev->dev);
 
-#ifdef TX_CHECKSUM
-	printk("check that skbcopy in ip_queue_xmit isn't happening\n");
-	dev->hard_header_len += 8;  /* for cksum tag */
-#endif
-
 	for (i = 0; i < 6; i++)
 		dev->dev_addr[i] = 1 ? read_eeprom(ioaddr, 4 + i)
 			: readb(ioaddr + StationAddr + i);
@@ -937,11 +926,7 @@ static int hamachi_open(struct net_device *dev)
 
 	/* always 1, takes no more time to do it */
 	writew(0x0001, ioaddr + RxChecksum);
-#ifdef TX_CHECKSUM
-	writew(0x0001, ioaddr + TxChecksum);
-#else
 	writew(0x0000, ioaddr + TxChecksum);
-#endif
 	writew(0x8000, ioaddr + MACCnfg); /* Soft reset the MAC */
 	writew(0x215F, ioaddr + MACCnfg);
 	writew(0x000C, ioaddr + FrameGap0);
@@ -1226,40 +1211,6 @@ static void hamachi_init_ring(struct net_device *dev)
 }
 
 
-#ifdef TX_CHECKSUM
-#define csum_add(it, val) \
-do { \
-    it += (u16) (val); \
-    if (it & 0xffff0000) { \
-	it &= 0xffff; \
-	++it; \
-    } \
-} while (0)
-    /* printk("add %04x --> %04x\n", val, it); \ */
-
-/* uh->len already network format, do not swap */
-#define pseudo_csum_udp(sum,ih,uh) do { \
-    sum = 0; \
-    csum_add(sum, (ih)->saddr >> 16); \
-    csum_add(sum, (ih)->saddr & 0xffff); \
-    csum_add(sum, (ih)->daddr >> 16); \
-    csum_add(sum, (ih)->daddr & 0xffff); \
-    csum_add(sum, cpu_to_be16(IPPROTO_UDP)); \
-    csum_add(sum, (uh)->len); \
-} while (0)
-
-/* swap len */
-#define pseudo_csum_tcp(sum,ih,len) do { \
-    sum = 0; \
-    csum_add(sum, (ih)->saddr >> 16); \
-    csum_add(sum, (ih)->saddr & 0xffff); \
-    csum_add(sum, (ih)->daddr >> 16); \
-    csum_add(sum, (ih)->daddr & 0xffff); \
-    csum_add(sum, cpu_to_be16(IPPROTO_TCP)); \
-    csum_add(sum, htons(len)); \
-} while (0)
-#endif
-
 static netdev_tx_t hamachi_start_xmit(struct sk_buff *skb,
 				      struct net_device *dev)
 {
@@ -1292,36 +1243,6 @@ static netdev_tx_t hamachi_start_xmit(struct sk_buff *skb,
 
 	hmp->tx_skbuff[entry] = skb;
 
-#ifdef TX_CHECKSUM
-	{
-	    /* tack on checksum tag */
-	    u32 tagval = 0;
-	    struct ethhdr *eh = (struct ethhdr *)skb->data;
-	    if (eh->h_proto == cpu_to_be16(ETH_P_IP)) {
-		struct iphdr *ih = (struct iphdr *)((char *)eh + ETH_HLEN);
-		if (ih->protocol == IPPROTO_UDP) {
-		    struct udphdr *uh
-		      = (struct udphdr *)((char *)ih + ih->ihl*4);
-		    u32 offset = ((unsigned char *)uh + 6) - skb->data;
-		    u32 pseudo;
-		    pseudo_csum_udp(pseudo, ih, uh);
-		    pseudo = htons(pseudo);
-		    printk("udp cksum was %04x, sending pseudo %04x\n",
-		      uh->check, pseudo);
-		    uh->check = 0;  /* zero out uh->check before card calc */
-		    /*
-		     * start at 14 (skip ethhdr), store at offset (uh->check),
-		     * use pseudo value given.
-		     */
-		    tagval = (14 << 24) | (offset << 16) | pseudo;
-		} else if (ih->protocol == IPPROTO_TCP) {
-		    printk("tcp, no auto cksum\n");
-		}
-	    }
-	    *(u32 *)skb_push(skb, 8) = tagval;
-	}
-#endif
-
         hmp->tx_ring[entry].addr = cpu_to_leXX(pci_map_single(hmp->pci_dev,
 		skb->data, skb->len, PCI_DMA_TODEVICE));
 
-- 
1.7.5.1


* [PATCH] vmxnet3: Consistently disable irqs when taking adapter->cmd_lock
From: Roland Dreier @ 2011-05-06 18:32 UTC (permalink / raw)
  To: Shreyas Bhatewara, David S. Miller; +Cc: pv-drivers, netdev

From: Roland Dreier <roland@purestorage.com>

Using the vmxnet3 driver produces a lockdep warning because
vmxnet3_set_mc(), which is called with mc->mca_lock held, takes
adapter->cmd_lock.  However, there are a couple of places where
adapter->cmd_lock is taken with softirqs enabled, so lockdep warns that a
softirq that tries to take mc->mca_lock could happen while
adapter->cmd_lock is held, leading to an AB-BA deadlock.

I'm not sure if this is a real potential deadlock or not, but the
simplest and best fix seems to be simply to make sure we take cmd_lock
with spin_lock_irqsave() everywhere -- the places with plain spin_lock
just look like oversights.
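
The rule behind the warning can be illustrated with a small user-space model. This is a deliberately simplified sketch, not how lockdep is implemented: struct lock_class_model and softirq_inversion() are hypothetical names, and the model considers only a direct dependency between two locks, whereas the real report follows a chain through an intermediate lock.

```c
#include <stdbool.h>

/* Hypothetical per-lock usage record, loosely modelled on lockdep states. */
struct lock_class_model {
	bool in_softirq;	/* taken from softirq context at least once */
	bool softirq_on;	/* taken with softirqs enabled at least once */
};

/*
 * Report whether taking 'inner' while holding 'outer' is unsafe:
 * 'outer' can be wanted by a softirq, and 'inner' can be held on a
 * CPU where that softirq is still allowed to run.
 */
static bool softirq_inversion(const struct lock_class_model *outer,
			      const struct lock_class_model *inner)
{
	return outer->in_softirq && inner->softirq_on;
}
```

Modelling mc->mca_lock as the outer lock (in_softirq set) and adapter->cmd_lock as the inner lock, the check fires while cmd_lock has softirq_on set and stops firing once every acquisition disables interrupts, which is the effect of using spin_lock_irqsave() everywhere.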

The full enormous lockdep warning is:

 =========================================================
 [ INFO: possible irq lock inversion dependency detected ]
 2.6.39-rc6+ #1
 ---------------------------------------------------------
 ifconfig/567 just changed the state of lock:
  (&(&mc->mca_lock)->rlock){+.-...}, at: [<ffffffff81531e9f>] mld_ifc_timer_expire+0xff/0x280
 but this lock took another, SOFTIRQ-unsafe lock in the past:
  (&(&adapter->cmd_lock)->rlock){+.+...}
 
 and interrupts could create inverse lock ordering between them.
 
 
 other info that might help us debug this:
 4 locks held by ifconfig/567:
  #0:  (rtnl_mutex){+.+.+.}, at: [<ffffffff8147d547>] rtnl_lock+0x17/0x20
  #1:  ((inetaddr_chain).rwsem){.+.+.+}, at: [<ffffffff810896cf>] __blocking_notifier_call_chain+0x5f/0xb0
  #2:  (&idev->mc_ifc_timer){+.-...}, at: [<ffffffff8106f21b>] run_timer_softirq+0xeb/0x3f0
  #3:  (&ndev->lock){++.-..}, at: [<ffffffff81531dd2>] mld_ifc_timer_expire+0x32/0x280
 
 the shortest dependencies between 2nd lock and 1st lock:
   -> (&(&adapter->cmd_lock)->rlock){+.+...} ops: 11 {
      HARDIRQ-ON-W at:
                                            [<ffffffff8109ad86>] __lock_acquire+0x7f6/0x1e10
                                            [<ffffffff8109ca4d>] lock_acquire+0x9d/0x130
                                            [<ffffffff81571156>] _raw_spin_lock+0x36/0x70
                                            [<ffffffffa000d212>] vmxnet3_alloc_intr_resources+0x22/0x230 [vmxnet3]
                                            [<ffffffffa0014031>] vmxnet3_probe_device+0x5f6/0x15c5 [vmxnet3]
                                            [<ffffffff812df67f>] local_pci_probe+0x5f/0xd0
                                            [<ffffffff812dfde9>] pci_device_probe+0x119/0x120
                                            [<ffffffff81373df6>] driver_probe_device+0x96/0x1c0
                                            [<ffffffff81373fcb>] __driver_attach+0xab/0xb0
                                            [<ffffffff81372a1e>] bus_for_each_dev+0x5e/0x90
                                            [<ffffffff81373a2e>] driver_attach+0x1e/0x20
                                            [<ffffffff813735b8>] bus_add_driver+0xc8/0x290
                                            [<ffffffff813745b6>] driver_register+0x76/0x140
                                            [<ffffffff812e0046>] __pci_register_driver+0x66/0xe0
                                            [<ffffffffa001b03a>] serio_raw_poll+0x3a/0x60 [serio_raw]
                                            [<ffffffff81002165>] do_one_initcall+0x45/0x190
                                            [<ffffffff810aa76b>] sys_init_module+0xfb/0x250
                                            [<ffffffff8157a142>] system_call_fastpath+0x16/0x1b
      SOFTIRQ-ON-W at:
                                            [<ffffffff8109adb7>] __lock_acquire+0x827/0x1e10
                                            [<ffffffff8109ca4d>] lock_acquire+0x9d/0x130
                                            [<ffffffff81571156>] _raw_spin_lock+0x36/0x70
                                            [<ffffffffa000d212>] vmxnet3_alloc_intr_resources+0x22/0x230 [vmxnet3]
                                            [<ffffffffa0014031>] vmxnet3_probe_device+0x5f6/0x15c5 [vmxnet3]
                                            [<ffffffff812df67f>] local_pci_probe+0x5f/0xd0
                                            [<ffffffff812dfde9>] pci_device_probe+0x119/0x120
                                            [<ffffffff81373df6>] driver_probe_device+0x96/0x1c0
                                            [<ffffffff81373fcb>] __driver_attach+0xab/0xb0
                                            [<ffffffff81372a1e>] bus_for_each_dev+0x5e/0x90
                                            [<ffffffff81373a2e>] driver_attach+0x1e/0x20
                                            [<ffffffff813735b8>] bus_add_driver+0xc8/0x290
                                            [<ffffffff813745b6>] driver_register+0x76/0x140
                                            [<ffffffff812e0046>] __pci_register_driver+0x66/0xe0
                                            [<ffffffffa001b03a>] serio_raw_poll+0x3a/0x60 [serio_raw]
                                            [<ffffffff81002165>] do_one_initcall+0x45/0x190
                                            [<ffffffff810aa76b>] sys_init_module+0xfb/0x250
                                            [<ffffffff8157a142>] system_call_fastpath+0x16/0x1b
      INITIAL USE at:
                                           [<ffffffff8109a9e9>] __lock_acquire+0x459/0x1e10
                                           [<ffffffff8109ca4d>] lock_acquire+0x9d/0x130
                                           [<ffffffff81571156>] _raw_spin_lock+0x36/0x70
                                           [<ffffffffa000d212>] vmxnet3_alloc_intr_resources+0x22/0x230 [vmxnet3]
                                           [<ffffffffa0014031>] vmxnet3_probe_device+0x5f6/0x15c5 [vmxnet3]
                                           [<ffffffff812df67f>] local_pci_probe+0x5f/0xd0
                                           [<ffffffff812dfde9>] pci_device_probe+0x119/0x120
                                           [<ffffffff81373df6>] driver_probe_device+0x96/0x1c0
                                           [<ffffffff81373fcb>] __driver_attach+0xab/0xb0
                                           [<ffffffff81372a1e>] bus_for_each_dev+0x5e/0x90
                                           [<ffffffff81373a2e>] driver_attach+0x1e/0x20
                                           [<ffffffff813735b8>] bus_add_driver+0xc8/0x290
                                           [<ffffffff813745b6>] driver_register+0x76/0x140
                                           [<ffffffff812e0046>] __pci_register_driver+0x66/0xe0
                                           [<ffffffffa001b03a>] serio_raw_poll+0x3a/0x60 [serio_raw]
                                           [<ffffffff81002165>] do_one_initcall+0x45/0x190
                                           [<ffffffff810aa76b>] sys_init_module+0xfb/0x250
                                           [<ffffffff8157a142>] system_call_fastpath+0x16/0x1b
    }
    ... key      at: [<ffffffffa0017590>] __key.42516+0x0/0xffffffffffffda70 [vmxnet3]
    ... acquired at:
    [<ffffffff8109ca4d>] lock_acquire+0x9d/0x130
    [<ffffffff81571bb5>] _raw_spin_lock_irqsave+0x55/0xa0
    [<ffffffffa000de27>] vmxnet3_set_mc+0x97/0x1a0 [vmxnet3]
    [<ffffffff8146ffa0>] __dev_set_rx_mode+0x40/0xb0
    [<ffffffff81470040>] dev_set_rx_mode+0x30/0x50
    [<ffffffff81470127>] __dev_open+0xc7/0x100
    [<ffffffff814703c1>] __dev_change_flags+0xa1/0x180
    [<ffffffff81470568>] dev_change_flags+0x28/0x70
    [<ffffffff814da960>] devinet_ioctl+0x730/0x800
    [<ffffffff814db508>] inet_ioctl+0x88/0xa0
    [<ffffffff814541f0>] sock_do_ioctl+0x30/0x70
    [<ffffffff814542a9>] sock_ioctl+0x79/0x2f0
    [<ffffffff81188798>] do_vfs_ioctl+0x98/0x570
    [<ffffffff81188d01>] sys_ioctl+0x91/0xa0
    [<ffffffff8157a142>] system_call_fastpath+0x16/0x1b
 
  -> (_xmit_ETHER){+.....} ops: 6 {
     HARDIRQ-ON-W at:
                                          [<ffffffff8109ad86>] __lock_acquire+0x7f6/0x1e10
                                          [<ffffffff8109ca4d>] lock_acquire+0x9d/0x130
                                          [<ffffffff8157124b>] _raw_spin_lock_bh+0x3b/0x70
                                          [<ffffffff81475618>] __dev_mc_add+0x38/0x90
                                          [<ffffffff814756a0>] dev_mc_add+0x10/0x20
                                          [<ffffffff81532c9e>] igmp6_group_added+0x10e/0x1b0
                                          [<ffffffff81533f2d>] ipv6_dev_mc_inc+0x2cd/0x430
                                          [<ffffffff81515e17>] ipv6_add_dev+0x357/0x450
                                          [<ffffffff81519f27>] addrconf_notify+0x2f7/0xb10
                                          [<ffffffff81575c1c>] notifier_call_chain+0x8c/0xc0
                                          [<ffffffff81089586>] raw_notifier_call_chain+0x16/0x20
                                          [<ffffffff814689b7>] call_netdevice_notifiers+0x37/0x70
                                          [<ffffffff8146a944>] register_netdevice+0x244/0x2d0
                                          [<ffffffff8146aa0f>] register_netdev+0x3f/0x60
                                          [<ffffffffa001419b>] vmxnet3_probe_device+0x760/0x15c5 [vmxnet3]
                                          [<ffffffff812df67f>] local_pci_probe+0x5f/0xd0
                                          [<ffffffff812dfde9>] pci_device_probe+0x119/0x120
                                          [<ffffffff81373df6>] driver_probe_device+0x96/0x1c0
                                          [<ffffffff81373fcb>] __driver_attach+0xab/0xb0
                                          [<ffffffff81372a1e>] bus_for_each_dev+0x5e/0x90
                                          [<ffffffff81373a2e>] driver_attach+0x1e/0x20
                                          [<ffffffff813735b8>] bus_add_driver+0xc8/0x290
                                          [<ffffffff813745b6>] driver_register+0x76/0x140
                                          [<ffffffff812e0046>] __pci_register_driver+0x66/0xe0
                                          [<ffffffffa001b03a>] serio_raw_poll+0x3a/0x60 [serio_raw]
                                          [<ffffffff81002165>] do_one_initcall+0x45/0x190
                                          [<ffffffff810aa76b>] sys_init_module+0xfb/0x250
                                          [<ffffffff8157a142>] system_call_fastpath+0x16/0x1b
     INITIAL USE at:
                                         [<ffffffff8109a9e9>] __lock_acquire+0x459/0x1e10
                                         [<ffffffff8109ca4d>] lock_acquire+0x9d/0x130
                                         [<ffffffff8157124b>] _raw_spin_lock_bh+0x3b/0x70
                                         [<ffffffff81475618>] __dev_mc_add+0x38/0x90
                                         [<ffffffff814756a0>] dev_mc_add+0x10/0x20
                                         [<ffffffff81532c9e>] igmp6_group_added+0x10e/0x1b0
                                         [<ffffffff81533f2d>] ipv6_dev_mc_inc+0x2cd/0x430
                                         [<ffffffff81515e17>] ipv6_add_dev+0x357/0x450
                                         [<ffffffff81519f27>] addrconf_notify+0x2f7/0xb10
                                         [<ffffffff81575c1c>] notifier_call_chain+0x8c/0xc0
                                         [<ffffffff81089586>] raw_notifier_call_chain+0x16/0x20
                                         [<ffffffff814689b7>] call_netdevice_notifiers+0x37/0x70
                                         [<ffffffff8146a944>] register_netdevice+0x244/0x2d0
                                         [<ffffffff8146aa0f>] register_netdev+0x3f/0x60
                                         [<ffffffffa001419b>] vmxnet3_probe_device+0x760/0x15c5 [vmxnet3]
                                         [<ffffffff812df67f>] local_pci_probe+0x5f/0xd0
                                         [<ffffffff812dfde9>] pci_device_probe+0x119/0x120
                                         [<ffffffff81373df6>] driver_probe_device+0x96/0x1c0
                                         [<ffffffff81373fcb>] __driver_attach+0xab/0xb0
                                         [<ffffffff81372a1e>] bus_for_each_dev+0x5e/0x90
                                         [<ffffffff81373a2e>] driver_attach+0x1e/0x20
                                         [<ffffffff813735b8>] bus_add_driver+0xc8/0x290
                                         [<ffffffff813745b6>] driver_register+0x76/0x140
                                         [<ffffffff812e0046>] __pci_register_driver+0x66/0xe0
                                         [<ffffffffa001b03a>] serio_raw_poll+0x3a/0x60 [serio_raw]
                                         [<ffffffff81002165>] do_one_initcall+0x45/0x190
                                         [<ffffffff810aa76b>] sys_init_module+0xfb/0x250
                                         [<ffffffff8157a142>] system_call_fastpath+0x16/0x1b
   }
   ... key      at: [<ffffffff827fd868>] netdev_addr_lock_key+0x8/0x1e0
   ... acquired at:
    [<ffffffff8109ca4d>] lock_acquire+0x9d/0x130
    [<ffffffff8157124b>] _raw_spin_lock_bh+0x3b/0x70
    [<ffffffff81475618>] __dev_mc_add+0x38/0x90
    [<ffffffff814756a0>] dev_mc_add+0x10/0x20
    [<ffffffff81532c9e>] igmp6_group_added+0x10e/0x1b0
    [<ffffffff81533f2d>] ipv6_dev_mc_inc+0x2cd/0x430
    [<ffffffff81515e17>] ipv6_add_dev+0x357/0x450
    [<ffffffff81519f27>] addrconf_notify+0x2f7/0xb10
    [<ffffffff81575c1c>] notifier_call_chain+0x8c/0xc0
    [<ffffffff81089586>] raw_notifier_call_chain+0x16/0x20
    [<ffffffff814689b7>] call_netdevice_notifiers+0x37/0x70
    [<ffffffff8146a944>] register_netdevice+0x244/0x2d0
    [<ffffffff8146aa0f>] register_netdev+0x3f/0x60
    [<ffffffffa001419b>] vmxnet3_probe_device+0x760/0x15c5 [vmxnet3]
    [<ffffffff812df67f>] local_pci_probe+0x5f/0xd0
    [<ffffffff812dfde9>] pci_device_probe+0x119/0x120
    [<ffffffff81373df6>] driver_probe_device+0x96/0x1c0
    [<ffffffff81373fcb>] __driver_attach+0xab/0xb0
    [<ffffffff81372a1e>] bus_for_each_dev+0x5e/0x90
    [<ffffffff81373a2e>] driver_attach+0x1e/0x20
    [<ffffffff813735b8>] bus_add_driver+0xc8/0x290
    [<ffffffff813745b6>] driver_register+0x76/0x140
    [<ffffffff812e0046>] __pci_register_driver+0x66/0xe0
    [<ffffffffa001b03a>] serio_raw_poll+0x3a/0x60 [serio_raw]
    [<ffffffff81002165>] do_one_initcall+0x45/0x190
    [<ffffffff810aa76b>] sys_init_module+0xfb/0x250
    [<ffffffff8157a142>] system_call_fastpath+0x16/0x1b
 
 -> (&(&mc->mca_lock)->rlock){+.-...} ops: 6 {
    HARDIRQ-ON-W at:
                                        [<ffffffff8109ad86>] __lock_acquire+0x7f6/0x1e10
                                        [<ffffffff8109ca4d>] lock_acquire+0x9d/0x130
                                        [<ffffffff8157124b>] _raw_spin_lock_bh+0x3b/0x70
                                        [<ffffffff81532bd5>] igmp6_group_added+0x45/0x1b0
                                        [<ffffffff81533f2d>] ipv6_dev_mc_inc+0x2cd/0x430
                                        [<ffffffff81515e17>] ipv6_add_dev+0x357/0x450
                                        [<ffffffff81ce0d16>] addrconf_init+0x4e/0x183
                                        [<ffffffff81ce0ba1>] inet6_init+0x191/0x2a6
                                        [<ffffffff81002165>] do_one_initcall+0x45/0x190
                                        [<ffffffff81ca4d3f>] kernel_init+0xe3/0x168
                                        [<ffffffff8157b2e4>] kernel_thread_helper+0x4/0x10
    IN-SOFTIRQ-W at:
                                        [<ffffffff8109ad5e>] __lock_acquire+0x7ce/0x1e10
                                        [<ffffffff8109ca4d>] lock_acquire+0x9d/0x130
                                        [<ffffffff8157124b>] _raw_spin_lock_bh+0x3b/0x70
                                        [<ffffffff81531e9f>] mld_ifc_timer_expire+0xff/0x280
                                        [<ffffffff8106f2a9>] run_timer_softirq+0x179/0x3f0
                                        [<ffffffff810666d0>] __do_softirq+0xc0/0x210
                                        [<ffffffff8157b3dc>] call_softirq+0x1c/0x30
                                        [<ffffffff8100d42d>] do_softirq+0xad/0xe0
                                        [<ffffffff81066afe>] irq_exit+0x9e/0xb0
                                        [<ffffffff8157bd40>] smp_apic_timer_interrupt+0x70/0x9b
                                        [<ffffffff8157ab93>] apic_timer_interrupt+0x13/0x20
                                        [<ffffffff8149d857>] rt_do_flush+0x87/0x2a0
                                        [<ffffffff814a16b6>] rt_cache_flush+0x46/0x60
                                        [<ffffffff814e36e0>] fib_disable_ip+0x40/0x60
                                        [<ffffffff814e5447>] fib_inetaddr_event+0xd7/0xe0
                                        [<ffffffff81575c1c>] notifier_call_chain+0x8c/0xc0
                                        [<ffffffff810896e8>] __blocking_notifier_call_chain+0x78/0xb0
                                        [<ffffffff81089736>] blocking_notifier_call_chain+0x16/0x20
                                        [<ffffffff814d8021>] __inet_del_ifa+0xf1/0x2e0
                                        [<ffffffff814d8223>] inet_del_ifa+0x13/0x20
                                        [<ffffffff814da731>] devinet_ioctl+0x501/0x800
                                        [<ffffffff814db508>] inet_ioctl+0x88/0xa0
                                        [<ffffffff814541f0>] sock_do_ioctl+0x30/0x70
                                        [<ffffffff814542a9>] sock_ioctl+0x79/0x2f0
                                        [<ffffffff81188798>] do_vfs_ioctl+0x98/0x570
                                        [<ffffffff81188d01>] sys_ioctl+0x91/0xa0
                                        [<ffffffff8157a142>] system_call_fastpath+0x16/0x1b
    INITIAL USE at:
                                       [<ffffffff8109a9e9>] __lock_acquire+0x459/0x1e10
                                       [<ffffffff8109ca4d>] lock_acquire+0x9d/0x130
                                       [<ffffffff8157124b>] _raw_spin_lock_bh+0x3b/0x70
                                       [<ffffffff81532bd5>] igmp6_group_added+0x45/0x1b0
                                       [<ffffffff81533f2d>] ipv6_dev_mc_inc+0x2cd/0x430
                                       [<ffffffff81515e17>] ipv6_add_dev+0x357/0x450
                                       [<ffffffff81ce0d16>] addrconf_init+0x4e/0x183
                                       [<ffffffff81ce0ba1>] inet6_init+0x191/0x2a6
                                       [<ffffffff81002165>] do_one_initcall+0x45/0x190
                                       [<ffffffff81ca4d3f>] kernel_init+0xe3/0x168
                                       [<ffffffff8157b2e4>] kernel_thread_helper+0x4/0x10
  }
  ... key      at: [<ffffffff82801be2>] __key.40877+0x0/0x8
  ... acquired at:
    [<ffffffff810997bc>] check_usage_forwards+0x9c/0x110
    [<ffffffff8109a32c>] mark_lock+0x19c/0x400
    [<ffffffff8109ad5e>] __lock_acquire+0x7ce/0x1e10
    [<ffffffff8109ca4d>] lock_acquire+0x9d/0x130
    [<ffffffff8157124b>] _raw_spin_lock_bh+0x3b/0x70
    [<ffffffff81531e9f>] mld_ifc_timer_expire+0xff/0x280
    [<ffffffff8106f2a9>] run_timer_softirq+0x179/0x3f0
    [<ffffffff810666d0>] __do_softirq+0xc0/0x210
    [<ffffffff8157b3dc>] call_softirq+0x1c/0x30
    [<ffffffff8100d42d>] do_softirq+0xad/0xe0
    [<ffffffff81066afe>] irq_exit+0x9e/0xb0
    [<ffffffff8157bd40>] smp_apic_timer_interrupt+0x70/0x9b
    [<ffffffff8157ab93>] apic_timer_interrupt+0x13/0x20
    [<ffffffff8149d857>] rt_do_flush+0x87/0x2a0
    [<ffffffff814a16b6>] rt_cache_flush+0x46/0x60
    [<ffffffff814e36e0>] fib_disable_ip+0x40/0x60
    [<ffffffff814e5447>] fib_inetaddr_event+0xd7/0xe0
    [<ffffffff81575c1c>] notifier_call_chain+0x8c/0xc0
    [<ffffffff810896e8>] __blocking_notifier_call_chain+0x78/0xb0
    [<ffffffff81089736>] blocking_notifier_call_chain+0x16/0x20
    [<ffffffff814d8021>] __inet_del_ifa+0xf1/0x2e0
    [<ffffffff814d8223>] inet_del_ifa+0x13/0x20
    [<ffffffff814da731>] devinet_ioctl+0x501/0x800
    [<ffffffff814db508>] inet_ioctl+0x88/0xa0
    [<ffffffff814541f0>] sock_do_ioctl+0x30/0x70
    [<ffffffff814542a9>] sock_ioctl+0x79/0x2f0
    [<ffffffff81188798>] do_vfs_ioctl+0x98/0x570
    [<ffffffff81188d01>] sys_ioctl+0x91/0xa0
    [<ffffffff8157a142>] system_call_fastpath+0x16/0x1b
 
 
 stack backtrace:
 Pid: 567, comm: ifconfig Not tainted 2.6.39-rc6+ #1
 Call Trace:
  <IRQ>  [<ffffffff810996f6>] print_irq_inversion_bug+0x146/0x170
  [<ffffffff81099720>] ? print_irq_inversion_bug+0x170/0x170
  [<ffffffff810997bc>] check_usage_forwards+0x9c/0x110
  [<ffffffff8109a32c>] mark_lock+0x19c/0x400
  [<ffffffff8109ad5e>] __lock_acquire+0x7ce/0x1e10
  [<ffffffff8109a383>] ? mark_lock+0x1f3/0x400
  [<ffffffff8109b497>] ? __lock_acquire+0xf07/0x1e10
  [<ffffffff81012255>] ? native_sched_clock+0x15/0x70
  [<ffffffff8109ca4d>] lock_acquire+0x9d/0x130
  [<ffffffff81531e9f>] ? mld_ifc_timer_expire+0xff/0x280
  [<ffffffff8109759d>] ? lock_release_holdtime+0x3d/0x1a0
  [<ffffffff8157124b>] _raw_spin_lock_bh+0x3b/0x70
  [<ffffffff81531e9f>] ? mld_ifc_timer_expire+0xff/0x280
  [<ffffffff8157170b>] ? _raw_spin_unlock+0x2b/0x40
  [<ffffffff81531e9f>] mld_ifc_timer_expire+0xff/0x280
  [<ffffffff8106f2a9>] run_timer_softirq+0x179/0x3f0
  [<ffffffff8106f21b>] ? run_timer_softirq+0xeb/0x3f0
  [<ffffffff810122b9>] ? sched_clock+0x9/0x10
  [<ffffffff81531da0>] ? mld_gq_timer_expire+0x30/0x30
  [<ffffffff810666d0>] __do_softirq+0xc0/0x210
  [<ffffffff8109455f>] ? tick_program_event+0x1f/0x30
  [<ffffffff8157b3dc>] call_softirq+0x1c/0x30
  [<ffffffff8100d42d>] do_softirq+0xad/0xe0
  [<ffffffff81066afe>] irq_exit+0x9e/0xb0
  [<ffffffff8157bd40>] smp_apic_timer_interrupt+0x70/0x9b
  [<ffffffff8157ab93>] apic_timer_interrupt+0x13/0x20
  <EOI>  [<ffffffff81571f14>] ? retint_restore_args+0x13/0x13
  [<ffffffff810974a7>] ? lock_is_held+0x17/0xd0
  [<ffffffff8149d857>] rt_do_flush+0x87/0x2a0
  [<ffffffff814a16b6>] rt_cache_flush+0x46/0x60
  [<ffffffff814e36e0>] fib_disable_ip+0x40/0x60
  [<ffffffff814e5447>] fib_inetaddr_event+0xd7/0xe0
  [<ffffffff81575c1c>] notifier_call_chain+0x8c/0xc0
  [<ffffffff810896e8>] __blocking_notifier_call_chain+0x78/0xb0
  [<ffffffff81089736>] blocking_notifier_call_chain+0x16/0x20
  [<ffffffff814d8021>] __inet_del_ifa+0xf1/0x2e0
  [<ffffffff814d8223>] inet_del_ifa+0x13/0x20
  [<ffffffff814da731>] devinet_ioctl+0x501/0x800
  [<ffffffff8108a3af>] ? local_clock+0x6f/0x80
  [<ffffffff81575898>] ? do_page_fault+0x268/0x560
  [<ffffffff814db508>] inet_ioctl+0x88/0xa0
  [<ffffffff814541f0>] sock_do_ioctl+0x30/0x70
  [<ffffffff814542a9>] sock_ioctl+0x79/0x2f0
  [<ffffffff810dfe87>] ? __call_rcu+0xa7/0x190
  [<ffffffff81188798>] do_vfs_ioctl+0x98/0x570
  [<ffffffff8117737e>] ? fget_light+0x33e/0x430
  [<ffffffff81571ef9>] ? retint_swapgs+0x13/0x1b
  [<ffffffff81188d01>] sys_ioctl+0x91/0xa0
  [<ffffffff8157a142>] system_call_fastpath+0x16/0x1b


Signed-off-by: Roland Dreier <roland@purestorage.com>
---
 drivers/net/vmxnet3/vmxnet3_drv.c |   10 ++++++----
 1 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/net/vmxnet3/vmxnet3_drv.c b/drivers/net/vmxnet3/vmxnet3_drv.c
index 0d47c3a..c16ed96 100644
--- a/drivers/net/vmxnet3/vmxnet3_drv.c
+++ b/drivers/net/vmxnet3/vmxnet3_drv.c
@@ -178,6 +178,7 @@ static void
 vmxnet3_process_events(struct vmxnet3_adapter *adapter)
 {
 	int i;
+	unsigned long flags;
 	u32 events = le32_to_cpu(adapter->shared->ecr);
 	if (!events)
 		return;
@@ -190,10 +191,10 @@ vmxnet3_process_events(struct vmxnet3_adapter *adapter)
 
 	/* Check if there is an error on xmit/recv queues */
 	if (events & (VMXNET3_ECR_TQERR | VMXNET3_ECR_RQERR)) {
-		spin_lock(&adapter->cmd_lock);
+		spin_lock_irqsave(&adapter->cmd_lock, flags);
 		VMXNET3_WRITE_BAR1_REG(adapter, VMXNET3_REG_CMD,
 				       VMXNET3_CMD_GET_QUEUE_STATUS);
-		spin_unlock(&adapter->cmd_lock);
+		spin_unlock_irqrestore(&adapter->cmd_lock, flags);
 
 		for (i = 0; i < adapter->num_tx_queues; i++)
 			if (adapter->tqd_start[i].status.stopped)
@@ -2733,13 +2734,14 @@ static void
 vmxnet3_alloc_intr_resources(struct vmxnet3_adapter *adapter)
 {
 	u32 cfg;
+	unsigned long flags;
 
 	/* intr settings */
-	spin_lock(&adapter->cmd_lock);
+	spin_lock_irqsave(&adapter->cmd_lock, flags);
 	VMXNET3_WRITE_BAR1_REG(adapter, VMXNET3_REG_CMD,
 			       VMXNET3_CMD_GET_CONF_INTR);
 	cfg = VMXNET3_READ_BAR1_REG(adapter, VMXNET3_REG_CMD);
-	spin_unlock(&adapter->cmd_lock);
+	spin_unlock_irqrestore(&adapter->cmd_lock, flags);
 	adapter->intr.type = cfg & 0x3;
 	adapter->intr.mask_mode = (cfg >> 2) & 0x3;
 

^ permalink raw reply related

* Re: [PATCH] bonding: convert to ndo_fix_features
From: Jay Vosburgh @ 2011-05-06 18:18 UTC (permalink / raw)
  To: Michał Mirosław; +Cc: netdev, Andy Gospodarek
In-Reply-To: <20110506175629.BC59D1389B@rere.qmqm.pl>

Michał Mirosław <mirq-linux@rere.qmqm.pl> wrote:

>This should also fix updating of vlan_features and propagating changes to
>VLAN devices on the bond.
>
>Side effect: it allows user to force-disable some offloads on the bond
>interface.
>
>Note: NETIF_F_VLAN_CHALLENGED is managed by bond_fix_features() now.
>
>BTW, what are the problems in creating VLAN devices on an empty bond
>(as stated in one of bond_setup() comments)?

	If there are no slaves, then the bond does not have a MAC
address assigned (because it gets its initial MAC from the first slave).
It's therefore impossible to pass a MAC address up to the VLAN
interface.

	So the limitation is that the bond must have at least one slave
before a VLAN may be configured above it.
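
	The rule can be modelled roughly as below. This is only a
standalone sketch, not the driver code: struct toy_bond,
toy_fix_features() and the TOY_F_VLAN_CHALLENGED bit value are made up
for illustration, and it assumes the bond is blocked whenever it is
empty or any slave is VLAN challenged.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bit, NOT the kernel's NETIF_F_VLAN_CHALLENGED value. */
#define TOY_F_VLAN_CHALLENGED (1u << 0)

/* Toy stand-in for a bond: just the feature words of its slaves. */
struct toy_bond {
	size_t slave_cnt;
	const uint32_t *slave_features;
};

/* Return the feature set the bond may expose for a requested set. */
static uint32_t toy_fix_features(const struct toy_bond *bond, uint32_t wanted)
{
	size_t i;

	/* Empty bond: no MAC address to pass up, so block VLAN creation. */
	if (bond->slave_cnt == 0)
		return wanted | TOY_F_VLAN_CHALLENGED;

	/* One challenged slave is enough to block VLANs on the bond. */
	for (i = 0; i < bond->slave_cnt; i++)
		if (bond->slave_features[i] & TOY_F_VLAN_CHALLENGED)
			wanted |= TOY_F_VLAN_CHALLENGED;

	return wanted;
}
```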

	-J


>Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
>---
>
>Note: This is only compile-tested so far.
>
> drivers/net/bonding/bond_main.c |  133 +++++++++++++++------------------------
> 1 files changed, 50 insertions(+), 83 deletions(-)
>
>diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>index 9a5feaf..04a2205 100644
>--- a/drivers/net/bonding/bond_main.c
>+++ b/drivers/net/bonding/bond_main.c
>@@ -344,32 +344,6 @@ out:
> }
>
> /**
>- * bond_has_challenged_slaves
>- * @bond: the bond we're working on
>- *
>- * Searches the slave list. Returns 1 if a vlan challenged slave
>- * was found, 0 otherwise.
>- *
>- * Assumes bond->lock is held.
>- */
>-static int bond_has_challenged_slaves(struct bonding *bond)
>-{
>-	struct slave *slave;
>-	int i;
>-
>-	bond_for_each_slave(bond, slave, i) {
>-		if (slave->dev->features & NETIF_F_VLAN_CHALLENGED) {
>-			pr_debug("found VLAN challenged slave - %s\n",
>-				 slave->dev->name);
>-			return 1;
>-		}
>-	}
>-
>-	pr_debug("no VLAN challenged slaves found\n");
>-	return 0;
>-}
>-
>-/**
>  * bond_next_vlan - safely skip to the next item in the vlans list.
>  * @bond: the bond we're working on
>  * @curr: item we're advancing from
>@@ -1406,52 +1380,61 @@ static int bond_sethwaddr(struct net_device *bond_dev,
> 	return 0;
> }
>
>-#define BOND_VLAN_FEATURES \
>-	(NETIF_F_VLAN_CHALLENGED | NETIF_F_HW_VLAN_RX | NETIF_F_HW_VLAN_TX | \
>-	 NETIF_F_HW_VLAN_FILTER)
>-
>-/*
>- * Compute the common dev->feature set available to all slaves.  Some
>- * feature bits are managed elsewhere, so preserve those feature bits
>- * on the master device.
>- */
>-static int bond_compute_features(struct bonding *bond)
>+static u32 bond_fix_features(struct net_device *dev, u32 features)
> {
> 	struct slave *slave;
>-	struct net_device *bond_dev = bond->dev;
>-	u32 features = bond_dev->features;
>-	u32 vlan_features = 0;
>-	unsigned short max_hard_header_len = max((u16)ETH_HLEN,
>-						bond_dev->hard_header_len);
>+	struct bonding *bond = netdev_priv(dev);
>+	u32 mask;
> 	int i;
>
>-	features &= ~(NETIF_F_ALL_CSUM | BOND_VLAN_FEATURES);
>-	features |=  NETIF_F_GSO_MASK | NETIF_F_NO_CSUM | NETIF_F_NOCACHE_COPY;
>-
> 	if (!bond->first_slave)
>-		goto done;
>+		/* Disable adding VLANs to empty bond. But why? --mq */
>+		return features | NETIF_F_VLAN_CHALLENGED;
>
>+	mask = features;
> 	features &= ~NETIF_F_ONE_FOR_ALL;
>+	features |= NETIF_F_ALL_FOR_ALL;
>
>-	vlan_features = bond->first_slave->dev->vlan_features;
> 	bond_for_each_slave(bond, slave, i) {
> 		features = netdev_increment_features(features,
> 						     slave->dev->features,
>-						     NETIF_F_ONE_FOR_ALL);
>+						     mask);
>+	}
>+
>+	return features;
>+}
>+
>+#define BOND_VLAN_FEATURES	(NETIF_F_ALL_TX_OFFLOADS | \
>+				 NETIF_F_SOFT_FEATURES | \
>+				 NETIF_F_LRO)
>+
>+static void bond_compute_features(struct bonding *bond)
>+{
>+	struct slave *slave;
>+	struct net_device *bond_dev = bond->dev;
>+	u32 old_features, vlan_features = BOND_VLAN_FEATURES;
>+	unsigned short max_hard_header_len = ETH_HLEN;
>+	int i;
>+
>+	if (!bond->first_slave)
>+		goto done;
>+
>+	bond_for_each_slave(bond, slave, i) {
> 		vlan_features = netdev_increment_features(vlan_features,
>-							slave->dev->vlan_features,
>-							NETIF_F_ONE_FOR_ALL);
>+			slave->dev->vlan_features, BOND_VLAN_FEATURES);
>+
> 		if (slave->dev->hard_header_len > max_hard_header_len)
> 			max_hard_header_len = slave->dev->hard_header_len;
> 	}
>
> done:
>-	features |= (bond_dev->features & BOND_VLAN_FEATURES);
>-	bond_dev->features = netdev_fix_features(bond_dev, features);
>-	bond_dev->vlan_features = netdev_fix_features(bond_dev, vlan_features);
>+	bond_dev->vlan_features = vlan_features;
> 	bond_dev->hard_header_len = max_hard_header_len;
>
>-	return 0;
>+	old_features = bond_dev->features;
>+	netdev_update_features(bond_dev);
>+	if (old_features == bond_dev->features)
>+		netdev_features_change(bond_dev);
> }
>
> static void bond_setup_by_slave(struct net_device *bond_dev,
>@@ -1544,7 +1527,6 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev)
> 	struct netdev_hw_addr *ha;
> 	struct sockaddr addr;
> 	int link_reporting;
>-	int old_features = bond_dev->features;
> 	int res = 0;
>
> 	if (!bond->params.use_carrier && slave_dev->ethtool_ops == NULL &&
>@@ -1577,16 +1559,9 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev)
> 			pr_warning("%s: Warning: enslaved VLAN challenged slave %s. Adding VLANs will be blocked as long as %s is part of bond %s\n",
> 				   bond_dev->name, slave_dev->name,
> 				   slave_dev->name, bond_dev->name);
>-			bond_dev->features |= NETIF_F_VLAN_CHALLENGED;
> 		}
> 	} else {
> 		pr_debug("%s: ! NETIF_F_VLAN_CHALLENGED\n", slave_dev->name);
>-		if (bond->slave_cnt == 0) {
>-			/* First slave, and it is not VLAN challenged,
>-			 * so remove the block of adding VLANs over the bond.
>-			 */
>-			bond_dev->features &= ~NETIF_F_VLAN_CHALLENGED;
>-		}
> 	}
>
> 	/*
>@@ -1958,7 +1933,7 @@ err_free:
> 	kfree(new_slave);
>
> err_undo_flags:
>-	bond_dev->features = old_features;
>+	bond_compute_features(bond);
>
> 	return res;
> }
>@@ -1979,6 +1954,7 @@ int bond_release(struct net_device *bond_dev, struct net_device *slave_dev)
> 	struct bonding *bond = netdev_priv(bond_dev);
> 	struct slave *slave, *oldcurrent;
> 	struct sockaddr addr;
>+	u32 old_features = bond_dev->features;
>
> 	/* slave is not a slave or master is not master of this slave */
> 	if (!(slave_dev->flags & IFF_SLAVE) ||
>@@ -2084,19 +2060,16 @@ int bond_release(struct net_device *bond_dev, struct net_device *slave_dev)
> 		 */
> 		memset(bond_dev->dev_addr, 0, bond_dev->addr_len);
>
>-		if (!bond->vlgrp) {
>-			bond_dev->features |= NETIF_F_VLAN_CHALLENGED;
>-		} else {
>+		if (bond->vlgrp) {
> 			pr_warning("%s: Warning: clearing HW address of %s while it still has VLANs.\n",
> 				   bond_dev->name, bond_dev->name);
> 			pr_warning("%s: When re-adding slaves, make sure the bond's HW address matches its VLANs'.\n",
> 				   bond_dev->name);
> 		}
>-	} else if ((bond_dev->features & NETIF_F_VLAN_CHALLENGED) &&
>-		   !bond_has_challenged_slaves(bond)) {
>+	} else if (!(bond_dev->features & NETIF_F_VLAN_CHALLENGED) &&
>+		   old_features & NETIF_F_VLAN_CHALLENGED) {
> 		pr_info("%s: last VLAN challenged slave %s left bond %s. VLAN blocking is removed\n",
> 			bond_dev->name, slave_dev->name, bond_dev->name);
>-		bond_dev->features &= ~NETIF_F_VLAN_CHALLENGED;
> 	}
>
> 	write_unlock_bh(&bond->lock);
>@@ -2269,9 +2242,7 @@ static int bond_release_all(struct net_device *bond_dev)
> 	 */
> 	memset(bond_dev->dev_addr, 0, bond_dev->addr_len);
>
>-	if (!bond->vlgrp) {
>-		bond_dev->features |= NETIF_F_VLAN_CHALLENGED;
>-	} else {
>+	if (bond->vlgrp) {
> 		pr_warning("%s: Warning: clearing HW address of %s while it still has VLANs.\n",
> 			   bond_dev->name, bond_dev->name);
> 		pr_warning("%s: When re-adding slaves, make sure the bond's HW address matches its VLANs'.\n",
>@@ -4347,11 +4318,6 @@ static void bond_ethtool_get_drvinfo(struct net_device *bond_dev,
> static const struct ethtool_ops bond_ethtool_ops = {
> 	.get_drvinfo		= bond_ethtool_get_drvinfo,
> 	.get_link		= ethtool_op_get_link,
>-	.get_tx_csum		= ethtool_op_get_tx_csum,
>-	.get_sg			= ethtool_op_get_sg,
>-	.get_tso		= ethtool_op_get_tso,
>-	.get_ufo		= ethtool_op_get_ufo,
>-	.get_flags		= ethtool_op_get_flags,
> };
>
> static const struct net_device_ops bond_netdev_ops = {
>@@ -4377,6 +4343,7 @@ static const struct net_device_ops bond_netdev_ops = {
> #endif
> 	.ndo_add_slave		= bond_enslave,
> 	.ndo_del_slave		= bond_release,
>+	.ndo_fix_features	= bond_fix_features,
> };
>
> static void bond_destructor(struct net_device *bond_dev)
>@@ -4432,14 +4399,14 @@ static void bond_setup(struct net_device *bond_dev)
> 	 * when there are slaves that are not hw accel
> 	 * capable
> 	 */
>-	bond_dev->features |= (NETIF_F_HW_VLAN_TX |
>-			       NETIF_F_HW_VLAN_RX |
>-			       NETIF_F_HW_VLAN_FILTER);
>
>-	/* By default, we enable GRO on bonding devices.
>-	 * Actual support requires lowlevel drivers are GRO ready.
>-	 */
>-	bond_dev->features |= NETIF_F_GRO;
>+	bond_dev->hw_features = BOND_VLAN_FEATURES |
>+				NETIF_F_HW_VLAN_TX |
>+				NETIF_F_HW_VLAN_RX |
>+				NETIF_F_HW_VLAN_FILTER;
>+
>+	bond_dev->hw_features &= ~(NETIF_F_ALL_CSUM & ~NETIF_F_NO_CSUM);
>+	bond_dev->features |= bond_dev->hw_features;
> }
>
> static void bond_work_cancel_all(struct bonding *bond)
>-- 
>1.7.2.5
>

---
	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com

^ permalink raw reply

* Re: [PATCH v6 BONUS 4/3] ipv4: Store rtable entries directly in FIB
From: David Miller @ 2011-05-06 17:57 UTC (permalink / raw)
  To: ja; +Cc: netdev, tgraf, jpirko, herbert, eric.dumazet
In-Reply-To: <alpine.LFD.2.00.1105060944230.1435@ja.ssi.bg>

From: Julian Anastasov <ja@ssi.bg>
Date: Fri, 6 May 2011 12:12:26 +0300 (EEST)

> 	Caching results of __mkroute_output in NH does
> not work well for RTN_MULTICAST because ip_check_mc_rcu
> wants to further restrict local delivery depending on
> the source address and protocol.

I understand that multicast needs special handling.

I'm concentrating on unicast/broadcast at the moment because
there is a predominantly clear path for making that work.

^ permalink raw reply

* [PATCH] net: Fix vlan_features propagation
From: Michał Mirosław @ 2011-05-06 17:56 UTC (permalink / raw)
  To: netdev; +Cc: Patrick McHardy

Fix VLAN features propagation for devices which change vlan_features.
For this to work, the driver needs to make sure netdev_features_change()
gets called after the change (as it is, e.g., after ndo_set_features()).

A side effect is that a user might request features that will never
be enabled on a VLAN device.
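
As a rough illustration of the masking this patch introduces, here is a
standalone sketch. It is not the kernel code: the TOY_F_* bit values and
the name toy_vlan_fix_features() are made up, and the RXCSUM handling is
left out.

```c
#include <stdint.h>

/* Hypothetical feature bits, NOT the kernel's NETIF_F_* values. */
#define TOY_F_SG   (1u << 0)
#define TOY_F_TSO  (1u << 1)
#define TOY_F_LLTX (1u << 2)

/*
 * A VLAN device may only keep a requested feature if the real device
 * currently has it enabled AND lists it in its vlan_features.
 * LLTX is always set on the VLAN device.
 */
static uint32_t toy_vlan_fix_features(uint32_t requested,
				      uint32_t real_features,
				      uint32_t real_vlan_features)
{
	requested &= real_features;
	requested &= real_vlan_features;
	return requested | TOY_F_LLTX;
}
```

A requested bit that the real device never advertises in vlan_features
is silently dropped here, which is the side effect mentioned above.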

Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
---
 net/8021q/vlan_dev.c |    6 ++++--
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c
index d174c31..526159a 100644
--- a/net/8021q/vlan_dev.c
+++ b/net/8021q/vlan_dev.c
@@ -531,7 +531,7 @@ static int vlan_dev_init(struct net_device *dev)
 					  (1<<__LINK_STATE_DORMANT))) |
 		      (1<<__LINK_STATE_PRESENT);
 
-	dev->hw_features = real_dev->vlan_features & NETIF_F_ALL_TX_OFFLOADS;
+	dev->hw_features = NETIF_F_ALL_TX_OFFLOADS;
 	dev->features |= real_dev->vlan_features | NETIF_F_LLTX;
 	dev->gso_max_size = real_dev->gso_max_size;
 
@@ -590,9 +590,11 @@ static u32 vlan_dev_fix_features(struct net_device *dev, u32 features)
 {
 	struct net_device *real_dev = vlan_dev_info(dev)->real_dev;
 
-	features &= (real_dev->features | NETIF_F_LLTX);
+	features &= real_dev->features;
+	features &= real_dev->vlan_features;
 	if (dev_ethtool_get_rx_csum(real_dev))
 		features |= NETIF_F_RXCSUM;
+	features |= NETIF_F_LLTX;
 
 	return features;
 }
-- 
1.7.2.5


^ permalink raw reply related

* [PATCH] bonding: convert to ndo_fix_features
From: Michał Mirosław @ 2011-05-06 17:56 UTC (permalink / raw)
  To: netdev; +Cc: Jay Vosburgh, Andy Gospodarek

This should also fix updating of vlan_features and propagating changes to
VLAN devices on the bond.

Side effect: it allows user to force-disable some offloads on the bond
interface.

Note: NETIF_F_VLAN_CHALLENGED is managed by bond_fix_features() now.

BTW, what are the problems in creating VLAN devices on an empty bond
(as stated in one of bond_setup() comments)?

Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
---

Note: This is only compile-tested so far.

 drivers/net/bonding/bond_main.c |  133 +++++++++++++++------------------------
 1 files changed, 50 insertions(+), 83 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 9a5feaf..04a2205 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -344,32 +344,6 @@ out:
 }
 
 /**
- * bond_has_challenged_slaves
- * @bond: the bond we're working on
- *
- * Searches the slave list. Returns 1 if a vlan challenged slave
- * was found, 0 otherwise.
- *
- * Assumes bond->lock is held.
- */
-static int bond_has_challenged_slaves(struct bonding *bond)
-{
-	struct slave *slave;
-	int i;
-
-	bond_for_each_slave(bond, slave, i) {
-		if (slave->dev->features & NETIF_F_VLAN_CHALLENGED) {
-			pr_debug("found VLAN challenged slave - %s\n",
-				 slave->dev->name);
-			return 1;
-		}
-	}
-
-	pr_debug("no VLAN challenged slaves found\n");
-	return 0;
-}
-
-/**
  * bond_next_vlan - safely skip to the next item in the vlans list.
  * @bond: the bond we're working on
  * @curr: item we're advancing from
@@ -1406,52 +1380,61 @@ static int bond_sethwaddr(struct net_device *bond_dev,
 	return 0;
 }
 
-#define BOND_VLAN_FEATURES \
-	(NETIF_F_VLAN_CHALLENGED | NETIF_F_HW_VLAN_RX | NETIF_F_HW_VLAN_TX | \
-	 NETIF_F_HW_VLAN_FILTER)
-
-/*
- * Compute the common dev->feature set available to all slaves.  Some
- * feature bits are managed elsewhere, so preserve those feature bits
- * on the master device.
- */
-static int bond_compute_features(struct bonding *bond)
+static u32 bond_fix_features(struct net_device *dev, u32 features)
 {
 	struct slave *slave;
-	struct net_device *bond_dev = bond->dev;
-	u32 features = bond_dev->features;
-	u32 vlan_features = 0;
-	unsigned short max_hard_header_len = max((u16)ETH_HLEN,
-						bond_dev->hard_header_len);
+	struct bonding *bond = netdev_priv(dev);
+	u32 mask;
 	int i;
 
-	features &= ~(NETIF_F_ALL_CSUM | BOND_VLAN_FEATURES);
-	features |=  NETIF_F_GSO_MASK | NETIF_F_NO_CSUM | NETIF_F_NOCACHE_COPY;
-
 	if (!bond->first_slave)
-		goto done;
+		/* Disable adding VLANs to empty bond. But why? --mq */
+		return features | NETIF_F_VLAN_CHALLENGED;
 
+	mask = features;
 	features &= ~NETIF_F_ONE_FOR_ALL;
+	features |= NETIF_F_ALL_FOR_ALL;
 
-	vlan_features = bond->first_slave->dev->vlan_features;
 	bond_for_each_slave(bond, slave, i) {
 		features = netdev_increment_features(features,
 						     slave->dev->features,
-						     NETIF_F_ONE_FOR_ALL);
+						     mask);
+	}
+
+	return features;
+}
+
+#define BOND_VLAN_FEATURES	(NETIF_F_ALL_TX_OFFLOADS | \
+				 NETIF_F_SOFT_FEATURES | \
+				 NETIF_F_LRO)
+
+static void bond_compute_features(struct bonding *bond)
+{
+	struct slave *slave;
+	struct net_device *bond_dev = bond->dev;
+	u32 old_features, vlan_features = BOND_VLAN_FEATURES;
+	unsigned short max_hard_header_len = ETH_HLEN;
+	int i;
+
+	if (!bond->first_slave)
+		goto done;
+
+	bond_for_each_slave(bond, slave, i) {
 		vlan_features = netdev_increment_features(vlan_features,
-							slave->dev->vlan_features,
-							NETIF_F_ONE_FOR_ALL);
+			slave->dev->vlan_features, BOND_VLAN_FEATURES);
+
 		if (slave->dev->hard_header_len > max_hard_header_len)
 			max_hard_header_len = slave->dev->hard_header_len;
 	}
 
 done:
-	features |= (bond_dev->features & BOND_VLAN_FEATURES);
-	bond_dev->features = netdev_fix_features(bond_dev, features);
-	bond_dev->vlan_features = netdev_fix_features(bond_dev, vlan_features);
+	bond_dev->vlan_features = vlan_features;
 	bond_dev->hard_header_len = max_hard_header_len;
 
-	return 0;
+	old_features = bond_dev->features;
+	netdev_update_features(bond_dev);
+	if (old_features == bond_dev->features)
+		netdev_features_change(bond_dev);
 }
 
 static void bond_setup_by_slave(struct net_device *bond_dev,
@@ -1544,7 +1527,6 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev)
 	struct netdev_hw_addr *ha;
 	struct sockaddr addr;
 	int link_reporting;
-	int old_features = bond_dev->features;
 	int res = 0;
 
 	if (!bond->params.use_carrier && slave_dev->ethtool_ops == NULL &&
@@ -1577,16 +1559,9 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev)
 			pr_warning("%s: Warning: enslaved VLAN challenged slave %s. Adding VLANs will be blocked as long as %s is part of bond %s\n",
 				   bond_dev->name, slave_dev->name,
 				   slave_dev->name, bond_dev->name);
-			bond_dev->features |= NETIF_F_VLAN_CHALLENGED;
 		}
 	} else {
 		pr_debug("%s: ! NETIF_F_VLAN_CHALLENGED\n", slave_dev->name);
-		if (bond->slave_cnt == 0) {
-			/* First slave, and it is not VLAN challenged,
-			 * so remove the block of adding VLANs over the bond.
-			 */
-			bond_dev->features &= ~NETIF_F_VLAN_CHALLENGED;
-		}
 	}
 
 	/*
@@ -1958,7 +1933,7 @@ err_free:
 	kfree(new_slave);
 
 err_undo_flags:
-	bond_dev->features = old_features;
+	bond_compute_features(bond);
 
 	return res;
 }
@@ -1979,6 +1954,7 @@ int bond_release(struct net_device *bond_dev, struct net_device *slave_dev)
 	struct bonding *bond = netdev_priv(bond_dev);
 	struct slave *slave, *oldcurrent;
 	struct sockaddr addr;
+	u32 old_features = bond_dev->features;
 
 	/* slave is not a slave or master is not master of this slave */
 	if (!(slave_dev->flags & IFF_SLAVE) ||
@@ -2084,19 +2060,16 @@ int bond_release(struct net_device *bond_dev, struct net_device *slave_dev)
 		 */
 		memset(bond_dev->dev_addr, 0, bond_dev->addr_len);
 
-		if (!bond->vlgrp) {
-			bond_dev->features |= NETIF_F_VLAN_CHALLENGED;
-		} else {
+		if (bond->vlgrp) {
 			pr_warning("%s: Warning: clearing HW address of %s while it still has VLANs.\n",
 				   bond_dev->name, bond_dev->name);
 			pr_warning("%s: When re-adding slaves, make sure the bond's HW address matches its VLANs'.\n",
 				   bond_dev->name);
 		}
-	} else if ((bond_dev->features & NETIF_F_VLAN_CHALLENGED) &&
-		   !bond_has_challenged_slaves(bond)) {
+	} else if (!(bond_dev->features & NETIF_F_VLAN_CHALLENGED) &&
+		   old_features & NETIF_F_VLAN_CHALLENGED) {
 		pr_info("%s: last VLAN challenged slave %s left bond %s. VLAN blocking is removed\n",
 			bond_dev->name, slave_dev->name, bond_dev->name);
-		bond_dev->features &= ~NETIF_F_VLAN_CHALLENGED;
 	}
 
 	write_unlock_bh(&bond->lock);
@@ -2269,9 +2242,7 @@ static int bond_release_all(struct net_device *bond_dev)
 	 */
 	memset(bond_dev->dev_addr, 0, bond_dev->addr_len);
 
-	if (!bond->vlgrp) {
-		bond_dev->features |= NETIF_F_VLAN_CHALLENGED;
-	} else {
+	if (bond->vlgrp) {
 		pr_warning("%s: Warning: clearing HW address of %s while it still has VLANs.\n",
 			   bond_dev->name, bond_dev->name);
 		pr_warning("%s: When re-adding slaves, make sure the bond's HW address matches its VLANs'.\n",
@@ -4347,11 +4318,6 @@ static void bond_ethtool_get_drvinfo(struct net_device *bond_dev,
 static const struct ethtool_ops bond_ethtool_ops = {
 	.get_drvinfo		= bond_ethtool_get_drvinfo,
 	.get_link		= ethtool_op_get_link,
-	.get_tx_csum		= ethtool_op_get_tx_csum,
-	.get_sg			= ethtool_op_get_sg,
-	.get_tso		= ethtool_op_get_tso,
-	.get_ufo		= ethtool_op_get_ufo,
-	.get_flags		= ethtool_op_get_flags,
 };
 
 static const struct net_device_ops bond_netdev_ops = {
@@ -4377,6 +4343,7 @@ static const struct net_device_ops bond_netdev_ops = {
 #endif
 	.ndo_add_slave		= bond_enslave,
 	.ndo_del_slave		= bond_release,
+	.ndo_fix_features	= bond_fix_features,
 };
 
 static void bond_destructor(struct net_device *bond_dev)
@@ -4432,14 +4399,14 @@ static void bond_setup(struct net_device *bond_dev)
 	 * when there are slaves that are not hw accel
 	 * capable
 	 */
-	bond_dev->features |= (NETIF_F_HW_VLAN_TX |
-			       NETIF_F_HW_VLAN_RX |
-			       NETIF_F_HW_VLAN_FILTER);
 
-	/* By default, we enable GRO on bonding devices.
-	 * Actual support requires lowlevel drivers are GRO ready.
-	 */
-	bond_dev->features |= NETIF_F_GRO;
+	bond_dev->hw_features = BOND_VLAN_FEATURES |
+				NETIF_F_HW_VLAN_TX |
+				NETIF_F_HW_VLAN_RX |
+				NETIF_F_HW_VLAN_FILTER;
+
+	bond_dev->hw_features &= ~(NETIF_F_ALL_CSUM & ~NETIF_F_NO_CSUM);
+	bond_dev->features |= bond_dev->hw_features;
 }
 
 static void bond_work_cancel_all(struct bonding *bond)
-- 
1.7.2.5


^ permalink raw reply related

* Re: [Bugme-new] [Bug 34322] New: No ECN marking in IPv6
From: Steinar H. Gunderson @ 2011-05-06 17:12 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Andrew Morton, netdev, bugzilla-daemon, bugme-daemon,
	YOSHIFUJI Hideaki
In-Reply-To: <1304694292.3066.29.camel@edumazet-laptop>

On Fri, May 06, 2011 at 05:04:52PM +0200, Eric Dumazet wrote:
> Analysis seems fine, but you also need to change INET_ECN_dontxmit() for
> retransmitted packets.
> 
> Any chance you can refine your patch ?

Sure, but is checking against NULL really the right way of checking for IPv6
sockets? I'd imagined I should check the address family or something
instead...

/* Steinar */
-- 
Homepage: http://www.sesse.net/

^ permalink raw reply

* Re: [PATCH] tcp_cubic: limit delayed_ack ratio to prevent divide error
From: TB @ 2011-05-06 17:39 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Brandeburg, Jesse, David Miller, Sangtae Ha, Injong Rhee,
	Valdis.Kletnieks@vt.edu, rdunlap@xenotime.net,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <20110506095359.57c4fb38@nehalam>

On 11-05-06 12:53 PM, Stephen Hemminger wrote:
> On Fri, 06 May 2011 12:15:46 -0400
> TB <lkml@techboom.com> wrote:
> 
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>> On 11-05-04 04:53 PM, Brandeburg, Jesse wrote:
>>>
>>>
>>> On Wed, 4 May 2011, Stephen Hemminger wrote:
>>>
>>>> TCP Cubic keeps a metric that estimates the amount of delayed
>>>> acknowledgements to use in adjusting the window. If an abnormally
>>>> large number of packets are acknowledged at once, then the update
>>>> could wrap and reach zero. This kind of ACK could only
>>>> happen when there was a large window and a huge number of
>>>> ACKs were lost.
>>>>
>>>> This patch limits the value of the delayed ack ratio. The choice of 32
>>>> is just a conservative value since normally it should be in the range of
>>>> 1 to 4 packets.
>>>>
>>>> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
>>>
>>> patch seems fine, but please credit the reporter (lkml@techboom.com) with 
>>> reporting the issue with logs, maybe even with Reported-by: and some kind 
>>> of reference to the panic message or the email thread in the text or 
>>> header?
>>
>> We're currently testing the patch on 6 production servers
> 
> Thank you. Was there any regularity to the earlier failures?

Not really; it was more likely to happen after a reboot and
during the night (when there is less traffic), for some weird reason.

As a workaround, we switched most of the servers to reno.
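
For readers following the thread, the clamp described in the quoted
patch description can be sketched in userspace as below. This is not
the tcp_cubic code: the 16-bit state, the shift of 4, and the names
update_ack_ratio(), ACK_RATIO_SHIFT and ACK_RATIO_LIMIT are assumptions
for illustration. Only the cap of 32 comes from the description.

```c
#include <stdint.h>

#define ACK_RATIO_SHIFT 4u                       /* assumed fixed-point shift */
#define ACK_RATIO_LIMIT (32u << ACK_RATIO_SHIFT) /* cap: 32 packets per ACK */

/*
 * Moving average of packets acknowledged per ACK, scaled by
 * 2^ACK_RATIO_SHIFT. The sum is computed in 32 bits and clamped before
 * it is narrowed, so one huge ACK cannot wrap the 16-bit state to zero.
 */
static uint16_t update_ack_ratio(uint16_t ratio, uint32_t acked)
{
	uint32_t next = ratio;

	next -= next >> ACK_RATIO_SHIFT; /* decay the old estimate */
	next += acked;                   /* fold in the new sample */
	if (next > ACK_RATIO_LIMIT)
		next = ACK_RATIO_LIMIT;

	return (uint16_t)next;
}
```

Without the clamp, a sample that pushes the sum to exactly 65536 would
be truncated to 0 in a 16-bit field, and any later division by the
ratio would fault.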

^ permalink raw reply

* Re: [RFC v3 02/10] Revert "lsm: Remove the socket_post_accept() hook"
From: Paul Moore @ 2011-05-06 17:27 UTC (permalink / raw)
  To: Samir Bellabes
  Cc: Tetsuo Handa, linux-security-module, linux-kernel, netdev,
	netfilter-devel, hadi, kaber, zbr, root
In-Reply-To: <87iptop4di.fsf@synack.fr>

On Friday, May 06, 2011 5:25:45 AM Samir Bellabes wrote:
> The main argument for socket_post_accept() is to learn information about
> the remote inet peer.
> 
> From socket_accept(), we have no clue who (inet->daddr and inet->saddr)
> is connecting to the local service. With socket_post_accept(), inet->daddr
> and inet->saddr are filled with the true remote information.
> 
> This information is useful for subsequent security operations on the
> socket (we know who we are talking to).

Looking at the snet_socket_post_accept() hook, I believe all of the 
information you are looking for should be available to you in the sock_graft() 
hook.

--
paul moore
linux @ hp

^ permalink raw reply

* For the netdev list
From: Tom Goetz @ 2011-05-06 16:57 UTC (permalink / raw)
  To: netdev

[-- Attachment #1: Type: text/plain, Size: 569 bytes --]

We recently obtained a new Lenovo Edge 0578-CTO. The r8169 driver makes the system unstable on this machine. The problem is that in rtl8169_rx_interrupt(), (status & 0x00001FFF) evaluates to values less than four on this machine. As a result, this line:

int pkt_size = (status & 0x00001FFF) - 4;

yields a huge packet size, which causes problems when the packet is copied. As a workaround we've added a patch to drop packets when we see this condition. I have attached the lspci -vvv output for this device and a patch for the workaround we're using.
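
To illustrate the arithmetic, here is a small userspace sketch (not driver code). The names RX_LEN_MASK, CRC_LEN and the three helpers are made up for this example, and reading the 4 as a trailing checksum length is an assumption. It shows how a length field below four turns into a negative int, what that looks like once it is used as an unsigned copy length, and a checked variant along the lines of the attached patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RX_LEN_MASK 0x00001FFFu /* low 13 bits of the descriptor status word */
#define CRC_LEN     4           /* assumed: trailing checksum bytes being stripped */

/* Mirrors the driver's expression: negative when the length field is < 4. */
static int pkt_size_unchecked(uint32_t status)
{
	return (int)(status & RX_LEN_MASK) - CRC_LEN;
}

/* What a copy routine sees once the negative int is converted to size_t. */
static size_t as_copy_length(int pkt_size)
{
	return (size_t)pkt_size;
}

/* Checked variant: returns 0 and stores the size, or -1 for a runt length. */
static int pkt_size_checked(uint32_t status, int *pkt_size)
{
	uint32_t len = status & RX_LEN_MASK;

	assert(pkt_size != NULL);
	if (len < CRC_LEN)
		return -1;      /* caller should drop the packet */
	*pkt_size = (int)(len - CRC_LEN);
	return 0;
}
```

With a status whose length field is 2, pkt_size_unchecked() returns -2, and as_copy_length() of that value is close to SIZE_MAX, while pkt_size_checked() rejects it.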

-Tom Goetz



[-- Attachment #2: r8168_lspci.log --]
[-- Type: application/octet-stream, Size: 3293 bytes --]

09:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 03)
	Subsystem: Lenovo Device 2131
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Interrupt: pin A routed to IRQ 299
	Region 0: I/O ports at 4000 [size=256]
	Region 2: Memory at f0904000 (64-bit, prefetchable) [size=4K]
	Region 4: Memory at f0900000 (64-bit, prefetchable) [size=16K]
	[virtual] Expansion ROM at f0920000 [disabled] [size=128K]
	Capabilities: [40] Power Management version 3
		Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=375mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit+
		Address: 00000000fee0200c  Data: 41c8
	Capabilities: [70] Express (v2) Endpoint, MSI 01
		DevCap:	MaxPayload 256 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us
			ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
			RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop-
			MaxPayload 128 bytes, MaxReadReq 4096 bytes
		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
		LnkCap:	Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Latency L0 <512ns, L1 <64us
			ClockPM+ Surprise- LLActRep- BwNot-
		LnkCtl:	ASPM L1 Enabled; RCB 64 bytes Disabled- Retrain- CommClk+
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
		DevCap2: Completion Timeout: Not Supported, TimeoutDis+
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-
		LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-, Selectable De-emphasis: -6dB
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance De-emphasis: -6dB
		LnkSta2: Current De-emphasis Level: -6dB
	Capabilities: [ac] MSI-X: Enable- Count=4 Masked-
		Vector table: BAR=4 offset=00000000
		PBA: BAR=4 offset=00000800
	Capabilities: [cc] Vital Product Data
		Unknown small resource type 00, will not decode more.
	Capabilities: [100 v1] Advanced Error Reporting
		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
		AERCap:	First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
	Capabilities: [140 v1] Virtual Channel
		Caps:	LPEVC=0 RefClk=100ns PATEntryBits=1
		Arb:	Fixed- WRR32- WRR64- WRR128-
		Ctrl:	ArbSelect=Fixed
		Status:	InProgress-
		VC0:	Caps:	PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
			Arb:	Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
			Ctrl:	Enable+ ID=0 ArbSelect=Fixed TC/VC=ff
			Status:	NegoPending- InProgress-
	Capabilities: [160 v1] Device Serial Number d4-00-00-00-68-4c-e0-00
	Kernel driver in use: r8169
	Kernel modules: r8169


[-- Attachment #3: rtl8169.patch --]
[-- Type: application/octet-stream, Size: 652 bytes --]

diff --git a/drivers/net/r8169.c b/drivers/net/r8169.c
index 7ffdb80..4c8ad2a 100644
--- a/drivers/net/r8169.c
+++ b/drivers/net/r8169.c
@@ -4579,6 +4579,14 @@ static int rtl8169_rx_interrupt(struct net_device *dev,
 			dma_addr_t addr = le64_to_cpu(desc->addr);
 			int pkt_size = (status & 0x00001FFF) - 4;
 
+			if ((status & 0x00001FFF) < 4) {
+				dev->stats.rx_dropped++;
+				dev->stats.rx_length_errors++;
+				rtl8169_mark_to_asic(desc, rx_buf_sz);
+				printk("%s: bad packet length!\n", __FUNCTION__);
+				continue;
+			}
+
 			/*
 			 * The driver does not support incoming fragmented
 			 * frames. They are seen as a symptom of over-mtu

^ permalink raw reply related

* Re: [PATCH] tcp_cubic: limit delayed_ack ratio to prevent divide error
From: Stephen Hemminger @ 2011-05-06 16:53 UTC (permalink / raw)
  To: TB
  Cc: Brandeburg, Jesse, David Miller, Sangtae Ha, Injong Rhee,
	Valdis.Kletnieks@vt.edu, rdunlap@xenotime.net,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <4DC41EB2.6070404@techboom.com>

On Fri, 06 May 2011 12:15:46 -0400
TB <lkml@techboom.com> wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> On 11-05-04 04:53 PM, Brandeburg, Jesse wrote:
> > 
> > 
> > On Wed, 4 May 2011, Stephen Hemminger wrote:
> > 
> >> TCP Cubic keeps a metric that estimates the amount of delayed
> >> acknowledgements to use in adjusting the window. If an abnormally
> >> large number of packets are acknowledged at once, then the update
> >> could wrap and reach zero. This kind of ACK could only
> >> happen when there was a large window and a huge number of
> >> ACKs were lost.
> >>
> >> This patch limits the value of the delayed ACK ratio. The choice of 32
> >> is just a conservative value, since normally it should be in the range of
> >> 1 to 4 packets.
> >>
> >> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
> > 
> > patch seems fine, but please credit the reporter (lkml@techboom.com) with 
> > reporting the issue with logs, maybe even with Reported-by: and some kind 
> > of reference to the panic message or the email thread in the text or 
> > header?
> 
> We're currently testing the patch on 6 production servers

Thank you. Was there any regularity to the previous failures?
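
For readers following the thread, a minimal userspace sketch of the failure mode described in the patch text. It is not the kernel code: the 16-bit field width, the shift of 4, and all names here (RATIO_SHIFT, RATIO_LIMIT, the helper functions) are assumptions made for illustration. Only the cap of 32 comes from the patch description.

```c
#include <assert.h>
#include <stdint.h>

#define RATIO_SHIFT 4                    /* assumed fixed-point scale: ratio << 4 */
#define RATIO_LIMIT (32u << RATIO_SHIFT) /* cap of 32 packets per ACK */

/* Unclamped moving-average update stored back into a 16-bit field. */
static uint16_t ratio_update_wrapping(uint16_t ratio, uint32_t acked)
{
	uint32_t next = ratio;

	next -= ratio >> RATIO_SHIFT;
	next += acked;
	return (uint16_t)next;  /* truncation: a huge 'acked' can wrap to 0 */
}

/* Same update, but the result is limited before it is stored. */
static uint16_t ratio_update_clamped(uint16_t ratio, uint32_t acked)
{
	uint32_t next = ratio;

	next -= ratio >> RATIO_SHIFT;
	next += acked;
	return (uint16_t)(next < RATIO_LIMIT ? next : RATIO_LIMIT);
}

/* A later division by the ratio is where a zero value would fault. */
static uint32_t scale_by_ratio(uint32_t cnt, uint16_t ratio)
{
	assert(ratio != 0);
	return (cnt << RATIO_SHIFT) / ratio;
}
```

Starting from a ratio of 32 (2 packets per ACK in this scale), a single ACK covering 65506 packets makes the unclamped update wrap to exactly 0, whereas the clamped update stops at 512.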

^ permalink raw reply

* [PATCH] NET: slip, fix ldisc->open retval
From: Matvejchikov Ilya @ 2011-05-06 16:23 UTC (permalink / raw)
  To: netdev

TTY layer expects 0 if the ldisc->open operation succeeded.

Signed-off-by: Matvejchikov Ilya <matvejchikov@gmail.com>
---
 drivers/net/slip.c |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/drivers/net/slip.c b/drivers/net/slip.c
index 86cbb9e..8ec1a9a 100644
--- a/drivers/net/slip.c
+++ b/drivers/net/slip.c
@@ -853,7 +853,9 @@ static int slip_open(struct tty_struct *tty)
 	/* Done.  We have linked the TTY line to a channel. */
 	rtnl_unlock();
 	tty->receive_room = 65536;	/* We don't flow control */
-	return sl->dev->base_addr;
+
+	/* TTY layer expects 0 on success */
+	return 0;

 err_free_bufs:
 	sl_free_bufs(sl);
-- 
1.7.5.1
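
A toy model of why the return value matters, assuming a caller that follows the "0 means success" convention stated above. struct ldisc_model and ldisc_attach() are hypothetical stand-ins, not the real TTY layer interfaces, and 0x3f8 is just an arbitrary nonzero value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a line discipline with an open callback. */
struct ldisc_model {
	int (*open)(void);
};

/* Caller that treats any nonzero return from open as a failure. */
static bool ldisc_attach(const struct ldisc_model *ld)
{
	assert(ld != NULL && ld->open != NULL);
	return ld->open() == 0;
}

/* Old behaviour modelled: open succeeds but returns some nonzero number. */
static int open_returns_nonzero(void)
{
	return 0x3f8;
}

/* New behaviour modelled: open succeeds and returns 0. */
static int open_returns_zero(void)
{
	return 0;
}
```

In this model, attaching with open_returns_nonzero() is reported as a failure even though the open itself worked, while open_returns_zero() is accepted.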

^ permalink raw reply related

* Re: ARM, AF_PACKET: caching problems on Marvell Kirkwood
From: Phil Sutter @ 2011-05-06 16:17 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: linux-arm-kernel, netdev, ne, Johann Baudy, Lennert Buytenhek,
	Nicolas Pitre
In-Reply-To: <20110505194601.GA10565@lunn.ch>

Hi,

On Thu, May 05, 2011 at 09:46:01PM +0200, Andrew Lunn wrote:
> I can reproduce it on a Kirkwood:
> 
> [    0.000000] CPU: Feroceon 88FR131 [56251311] revision 1 (ARMv5TE), cr=00053977

Thanks for the information. Seems like we have the same CPU:

| [    0.000000] CPU: Feroceon 88FR131 [56251311] revision 1 (ARMv5TE), cr=00053177
| [    0.000000] CPU: VIVT data cache, VIVT instruction cache

and it's actually VIVT, not VIPT as I wrote in an earlier mail.

Greetings, Phil

^ permalink raw reply

* Re: [PATCH] tcp_cubic: limit delayed_ack ratio to prevent divide error
From: TB @ 2011-05-06 16:15 UTC (permalink / raw)
  To: Brandeburg, Jesse
  Cc: Stephen Hemminger, David Miller, Sangtae Ha, Injong Rhee,
	Valdis.Kletnieks@vt.edu, rdunlap@xenotime.net,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <alpine.WNT.2.00.1105041352020.6048@JBRANDEB-DESK2.amr.corp.intel.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 11-05-04 04:53 PM, Brandeburg, Jesse wrote:
> 
> 
> On Wed, 4 May 2011, Stephen Hemminger wrote:
> 
>> TCP Cubic keeps a metric that estimates the amount of delayed
>> acknowledgements to use in adjusting the window. If an abnormally
>> large number of packets are acknowledged at once, then the update
>> could wrap and reach zero. This kind of ACK could only
>> happen when there was a large window and a huge number of
>> ACKs were lost.
>>
>> This patch limits the value of the delayed ACK ratio. The choice of 32
>> is just a conservative value, since normally it should be in the range of
>> 1 to 4 packets.
>>
>> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
> 
> patch seems fine, but please credit the reporter (lkml@techboom.com) with 
> reporting the issue with logs, maybe even with Reported-by: and some kind 
> of reference to the panic message or the email thread in the text or 
> header?

We're currently testing the patch on 6 production servers

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJNxB6yAAoJENOh8x1aI8Ye4ocH/3+6gjWWppgOwql0J4XGGD5R
wJX+u8A+YK2V+GBvxFgQs/qNa3IB/nnWwELolflO80twq2JrOq1I6g2n1VJhHjX4
b5jyROMe2gPHRKESibi84gNIuoImq4bqM/S1u7xWzcikTh8FxCevYQXTNilIKOOf
siuOIypFY7AyqSPjhq5/+HpTrrOQa097PAcVAr8RBO7niyrxAE75ACTolGAKBfvQ
HlOYKmxBT8SbnZ7YJNINopPdtpqz3iaraKWUoT44Wuv8Q8jt0cqB7YJWl0RG/C3y
ABK50Qihl1p6M+LL9jjR2YwVFkjiLyN3fO8g2pjVfn4wh0afFCyWtitN0OFd/4I=
=Vy5E
-----END PGP SIGNATURE-----

^ permalink raw reply

* Re: ARM, AF_PACKET: caching problems on Marvell Kirkwood
From: Phil Sutter @ 2011-05-06 16:12 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: linux-arm-kernel, netdev, ne, Johann Baudy, Lennert Buytenhek,
	Nicolas Pitre
In-Reply-To: <1304607362.3032.84.camel@edumazet-laptop>

Hi,

On Thu, May 05, 2011 at 04:56:02PM +0200, Eric Dumazet wrote:
> I assume you use latest linux-2.6 or net-next-2.6 ?

Well, initially we noticed the problem on 2.6.34.7, but I verified it
against both 2.6.37 and linux-2.6 from three days ago.

> Could you try to force vmalloc() use ?
> 
> diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
> index b5362e9..0b5a89c 100644
> --- a/net/packet/af_packet.c
> +++ b/net/packet/af_packet.c
> @@ -2383,7 +2383,7 @@ static inline char *alloc_one_pg_vec_page(unsigned long order)
>  	gfp_t gfp_flags = GFP_KERNEL | __GFP_COMP |
>  			  __GFP_ZERO | __GFP_NOWARN | __GFP_NORETRY;
>  
> -	buffer = (char *) __get_free_pages(gfp_flags, order);
> +	buffer = NULL;
>  
>  	if (buffer)
>  		return buffer;

Thanks for the hint. I tried that, but the problem persists.

Greetings, Phil

^ permalink raw reply

* Re: [PATCH 0/2] wireless: Make and use const struct ieee80211_channel
From: Joe Perches @ 2011-05-06 16:10 UTC (permalink / raw)
  To: John W. Linville
  Cc: libertas-dev, linux-wireless, orinoco-users, orinoco-devel,
	netdev, LKML
In-Reply-To: <20110506131922.GA2252@tuxdriver.com>

On Fri, 2011-05-06 at 09:19 -0400, John W. Linville wrote:
> On Thu, May 05, 2011 at 03:21:47PM -0700, Joe Perches wrote:
> > On Thu, 2011-05-05 at 14:49 -0400, John W. Linville wrote:
> > > These patches generated a lot of warnings in net/mac80211.  Did you
> > > actually build them?
> > Yes.
> > Did you apply patch 1/2 first?
> > It's a dependent patch.
> That's the one that caused most of the warnings...

Consider the 2 patches as a single patch.
Do you have new build warnings after applying both
patches 1 and 2?

^ permalink raw reply
