Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH] net: Adjust softirq raising in __napi_schedule
From: Jarek Poplawski @ 2009-10-21 21:39 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Tilman Schmidt, David Miller, hidave.darkstar, linux-kernel, tglx,
	linux-wireless, linux-ppp, netdev, paulus, Michael Buesch,
	Oliver Hartkopp
In-Reply-To: <1256160330.12174.2.camel@johannes.local>

On Thu, Oct 22, 2009 at 06:25:30AM +0900, Johannes Berg wrote:
> On Wed, 2009-10-21 at 23:19 +0200, Jarek Poplawski wrote:
...
> > --- a/net/core/dev.c
> > +++ b/net/core/dev.c
> > @@ -2728,7 +2728,7 @@ void __napi_schedule(struct napi_struct *n)
> >  
> >  	local_irq_save(flags);
> >  	list_add_tail(&n->poll_list, &__get_cpu_var(softnet_data).poll_list);
> > -	__raise_softirq_irqoff(NET_RX_SOFTIRQ);
> > +	raise_softirq_irqoff(NET_RX_SOFTIRQ);
> 
> This still doesn't make any sense.
> 
> There may or may not be a lot of code that assumes that everything else
> is run with other tasklets disabled, and that it cannot be interrupted
> by a tasklet and thus create a race.
> 
> Can you prove that is not the case, across the entire networking layer?

I'm not sure I can understand your question. This patch is mainly to
avoid using netif_rx()/netif_rx_ni() pair as a test of proper process
context handling; IMHO there're better tools for this (lockdep,
WARN_ON's).

Jarek P.

^ permalink raw reply

* Re: [PATCH net-next-2.6] rtnetlink: rtnl_setlink() and rtnl_getlink() changes
From: Stephen Hemminger @ 2009-10-21 21:48 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David S. Miller, Linux Netdev List
In-Reply-To: <4ADF7633.9050208@gmail.com>

On Wed, 21 Oct 2009 22:59:31 +0200
Eric Dumazet <eric.dumazet@gmail.com> wrote:

> Stephen, do you think we could change "ip link show dev ethX" to
> let it use rtnl_getlink() instead of rtnl_dump_ifinfo() ?
> 
> Thanks !
> 
> [PATCH net-next-2.6]rtnetlink: rtnl_setlink() and rtnl_getlink() changes
> 
> rtnl_getlink() & rtnl_setlink() run with RTNL held, we can use
> __dev_get_by_index() and __dev_get_by_name() variants and avoid
> dev_hold()/dev_put()
> 
> Adds to rtnl_getlink() the capability to find a device by its name,
> not only by its index.
> 
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> ---
>  net/core/rtnetlink.c |   38 +++++++++++++++++++-------------------
>  1 files changed, 19 insertions(+), 19 deletions(-)
> 
> diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
> index eb42873..ba13b09 100644
> --- a/net/core/rtnetlink.c
> +++ b/net/core/rtnetlink.c
> @@ -910,9 +910,9 @@ static int rtnl_setlink(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg)
>  	err = -EINVAL;
>  	ifm = nlmsg_data(nlh);
>  	if (ifm->ifi_index > 0)
> -		dev = dev_get_by_index(net, ifm->ifi_index);
> +		dev = __dev_get_by_index(net, ifm->ifi_index);
>  	else if (tb[IFLA_IFNAME])
> -		dev = dev_get_by_name(net, ifname);
> +		dev = __dev_get_by_name(net, ifname);
>  	else
>  		goto errout;
>  
> @@ -922,11 +922,9 @@ static int rtnl_setlink(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg)
>  	}
>  
>  	if ((err = validate_linkmsg(dev, tb)) < 0)
> -		goto errout_dev;
> +		goto errout;
>  
>  	err = do_setlink(dev, ifm, tb, ifname, 0);
> -errout_dev:
> -	dev_put(dev);
>  errout:
>  	return err;
>  }
> @@ -1154,6 +1152,7 @@ static int rtnl_getlink(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg)
>  {
>  	struct net *net = sock_net(skb->sk);
>  	struct ifinfomsg *ifm;
> +	char ifname[IFNAMSIZ];
>  	struct nlattr *tb[IFLA_MAX+1];
>  	struct net_device *dev = NULL;
>  	struct sk_buff *nskb;
> @@ -1163,19 +1162,23 @@ static int rtnl_getlink(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg)
>  	if (err < 0)
>  		return err;
>  
> +	if (tb[IFLA_IFNAME])
> +		nla_strlcpy(ifname, tb[IFLA_IFNAME], IFNAMSIZ);
> +
>  	ifm = nlmsg_data(nlh);
> -	if (ifm->ifi_index > 0) {
> -		dev = dev_get_by_index(net, ifm->ifi_index);
> -		if (dev == NULL)
> -			return -ENODEV;
> -	} else
> +	if (ifm->ifi_index > 0)
> +		dev = __dev_get_by_index(net, ifm->ifi_index);
> +	else if (tb[IFLA_IFNAME])
> +		dev = __dev_get_by_name(net, ifname);
> +	else
>  		return -EINVAL;
>  
> +	if (dev == NULL)
> +		return -ENODEV;
> +
>  	nskb = nlmsg_new(if_nlmsg_size(dev), GFP_KERNEL);
> -	if (nskb == NULL) {
> -		err = -ENOBUFS;
> -		goto errout;
> -	}
> +	if (nskb == NULL)
> +		return -ENOBUFS;
>  
>  	err = rtnl_fill_ifinfo(nskb, dev, RTM_NEWLINK, NETLINK_CB(skb).pid,
>  			       nlh->nlmsg_seq, 0, 0);
> @@ -1183,11 +1186,8 @@ static int rtnl_getlink(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg)
>  		/* -EMSGSIZE implies BUG in if_nlmsg_size */
>  		WARN_ON(err == -EMSGSIZE);
>  		kfree_skb(nskb);
> -		goto errout;
> -	}
> -	err = rtnl_unicast(nskb, net, NETLINK_CB(skb).pid);
> -errout:
> -	dev_put(dev);
> +	} else
> +		err = rtnl_unicast(nskb, net, NETLINK_CB(skb).pid);
>  
>  	return err;
>  }

Would work, but not sure what it gains.

^ permalink raw reply

* Re: [PATCH] [NIU] VLAN does not work with niu driver
From: David Miller @ 2009-10-21 23:09 UTC (permalink / raw)
  To: Joyce.Yu; +Cc: netdev
In-Reply-To: <4ADF4CBD.90104@Sun.COM>


You still forgot your Signed-off-by: tag...

^ permalink raw reply

* Re: [PATCH] [NIU] VLAN does not work with niu driver
From: Joyce Yu @ 2009-10-21 23:28 UTC (permalink / raw)
  To: David Miller; +Cc: netdev
In-Reply-To: <20091021.160923.26786943.davem@davemloft.net>


It was in the patch. Was an empty line needed after the body of the 
explanation? I will send out one with an empty line between body of the 
explanation and signed-off-by tag.

Thanks,
Joyce


 From f301748d3156437d65305f14288c7d5711861980 Mon Sep 17 00:00:00 2001
From: Joyce Yu <joyce.yu@sun.com>
Date: Wed, 21 Oct 2009 05:35:46 -0700
Subject: [PATCH] VLAN_ETH_HLEN should be used to make sure that the 
whole MAC header was copied to the head buffer in the Vlan packets case
Signed-off-by: Joyce Yu <joyce.yu@sun.com>  <=============== tag

---
  drivers/net/niu.c |    2 +-
  1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/niu.c b/drivers/net/niu.c
index f9364d0..d6c7ac6 100644
--- a/drivers/net/niu.c
+++ b/drivers/net/niu.c
@@ -3545,7 +3545,7 @@ static int niu_process_rx_pkt(struct napi_struct 
*napi, struct niu *np,
  	rp->rcr_index = index;

  	skb_reserve(skb, NET_IP_ALIGN);
-	__pskb_pull_tail(skb, min(len, NIU_RXPULL_MAX));
+	__pskb_pull_tail(skb, min(len, VLAN_ETH_HLEN));

  	rp->rx_packets++;
  	rp->rx_bytes += skb->len;
-- 
On 10/21/09 04:09 PM, David Miller wrote:
> You still forgot your Signed-off-by: tag...
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


-- 


^ permalink raw reply related

* Re: [PATCH] [NIU] VLAN does not work with niu driver
From: David Miller @ 2009-10-22  0:19 UTC (permalink / raw)
  To: Joyce.Yu; +Cc: netdev
In-Reply-To: <4ADF992D.40507@Sun.COM>

From: Joyce Yu <Joyce.Yu@Sun.COM>
Date: Wed, 21 Oct 2009 16:28:45 -0700

> It was in the patch. Was an empty line needed after the body of the
> explanation? I will send out one with an empty line between body of
> the explanation and signed-off-by tag.

It should be of the form:

--------------------
Subject: Summary description of patch.

Full patch description.

Signed-off-by: ...

The patch.
--------------------

^ permalink raw reply

* Re: possible circular locking dependency in ISDN PPP
From: Xiaotian Feng @ 2009-10-22  2:27 UTC (permalink / raw)
  To: Tilman Schmidt; +Cc: LKML, isdn4linux, Netdev, Karsten Keil
In-Reply-To: <4ADF35CA.5020207@imap.cc>

On Thu, Oct 22, 2009 at 12:24 AM, Tilman Schmidt <tilman@imap.cc> wrote:
> Thanks for your analysis. The patch you propose does indeed prevent the
> "circular locking dependency" message, with no noticeable ill effect.
>
> I cannot say why xmit_lock was taken around isdn_net_lp_busy() in the
> first place. The ISDN4Linux people would be the ones to comment on that.
> If none of them objects, I propose you add a Signed-off-by line to your
> patch and submit it to Karsten Keil, the ISDN maintainer, for inclusion.
> You can also add a "Tested-by: Tilman Schmidt <tilman@imap.cc>" line.
>

Sure, I'll prepare a patch and send to Karsten later, Thanks.

> Thanks,
> Tilman
>
> Am 19.10.2009 07:27 schrieb Xiaotian Feng:
>> So there's a circular locking dependency.. Looking into isdn_net_get_locked_lp()
> [...]
>> Why do we need to hold xmit_lock while using
>> isdn_net_lp_busy(nd->queue) ? Can following patch fix this bug?
>>
>> ---
>> diff --git a/drivers/isdn/i4l/isdn_net.h b/drivers/isdn/i4l/isdn_net.h
>> index 74032d0..7511f08 100644
>> --- a/drivers/isdn/i4l/isdn_net.h
>> +++ b/drivers/isdn/i4l/isdn_net.h
>> @@ -83,19 +83,19 @@ static __inline__ isdn_net_local *
>> isdn_net_get_locked_lp(isdn_net_dev *nd)
>>
>>         spin_lock_irqsave(&nd->queue_lock, flags);
>>         lp = nd->queue;         /* get lp on top of queue */
>> -       spin_lock(&nd->queue->xmit_lock);
>>         while (isdn_net_lp_busy(nd->queue)) {
>> -               spin_unlock(&nd->queue->xmit_lock);
>>                 nd->queue = nd->queue->next;
>>                 if (nd->queue == lp) { /* not found -- should never happen */
>>                         lp = NULL;
>>                         goto errout;
>>                 }
>> -               spin_lock(&nd->queue->xmit_lock);
>>         }
>>         lp = nd->queue;
>>         nd->queue = nd->queue->next;
>> +       spin_unlock_irqrestore(&nd->queue_lock, flags);
>> +       spin_lock(&lp->xmit_lock);
>>         local_bh_disable();
>> +       return lp;
>>  errout:
>>         spin_unlock_irqrestore(&nd->queue_lock, flags);
>>         return lp;
>>
>
> --
> Tilman Schmidt                    E-Mail: tilman@imap.cc
> Bonn, Germany
> Diese Nachricht besteht zu 100% aus wiederverwerteten Bits.
> Ungeöffnet mindestens haltbar bis: (siehe Rückseite)
>
>

^ permalink raw reply

* Re: [PATCH net-next-2.6] rtnetlink: rtnl_setlink() and rtnl_getlink() changes
From: Eric Dumazet @ 2009-10-22  2:59 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: David S. Miller, Linux Netdev List
In-Reply-To: <20091022064839.6b343125@s6510>

Stephen Hemminger a écrit :
> On Wed, 21 Oct 2009 22:59:31 +0200
> Eric Dumazet <eric.dumazet@gmail.com> wrote:
> 
>> Stephen, do you think we could change "ip link show dev ethX" to
>> let it use rtnl_getlink() instead of rtnl_dump_ifinfo() ?
>>
> 
> Would work, but not sure what it gains.

It takes about one second to dump the table when we have 25.000 devices

^ permalink raw reply

* Re: [PATCH v2 8/8] Document future removal of sysctl_tcp_* options
From: Bill Fink @ 2009-10-22  4:32 UTC (permalink / raw)
  To: William Allen Simpson; +Cc: netdev
In-Reply-To: <4ADF616B.1090405@gmail.com>

On Wed, 21 Oct 2009, William Allen Simpson wrote:

> Gilad Ben-Yossef wrote:
> > I have no issue with leaving those, if everyone thinks we're better off.
> > 
> > BTW, while we're talking about OS envy, I do believe that Windows do let
> > you specify on a per route basis. Not that this is really a good ground for
> > technical decision, but still... :-)
> > 
> I'm not concerned with "envy", I'm concerned with training operators, and
> consistency across platforms.
> 
> I'm in favor of per route configuration, it seems reasonably clean, as
> long as it's done consistently with other systems.  I don't permit Windows
> systems to be used here (except under controlled security circumstances), so
> I'm not familiar with their configuration.  However, doing things similarly
> across platforms will ease documentation and training.

And as mentioned previously, the global options can be quite useful
in certain test scenarios.  I also agree the per route settings are
a very useful addition.  I think the global and per route settings
are complementary and shouldn't be thought of as in conflict with
one another.

						-Bill

^ permalink raw reply

* Re: [net-next PATCH 1/4] qlge: Add ethtool get/set pause parameter.
From: David Miller @ 2009-10-22  4:46 UTC (permalink / raw)
  To: ron.mercer; +Cc: netdev
In-Reply-To: <1256159261-29151-1-git-send-email-ron.mercer@qlogic.com>

From: Ron Mercer <ron.mercer@qlogic.com>
Date: Wed, 21 Oct 2009 14:07:38 -0700

> Signed-off-by: Ron Mercer <ron.mercer@qlogic.com>

Applied.

^ permalink raw reply

* Re: [net-next PATCH 2/4] qlge: Add ethtool blink function.
From: David Miller @ 2009-10-22  4:46 UTC (permalink / raw)
  To: ron.mercer; +Cc: netdev
In-Reply-To: <1256159261-29151-2-git-send-email-ron.mercer@qlogic.com>

From: Ron Mercer <ron.mercer@qlogic.com>
Date: Wed, 21 Oct 2009 14:07:39 -0700

> Signed-off-by: Ron Mercer <ron.mercer@qlogic.com>

Applied.

^ permalink raw reply

* Re: [net-next PATCH 3/4] qlge: Add ethtool wake on LAN function.
From: David Miller @ 2009-10-22  4:46 UTC (permalink / raw)
  To: ron.mercer; +Cc: netdev
In-Reply-To: <1256159261-29151-3-git-send-email-ron.mercer@qlogic.com>

From: Ron Mercer <ron.mercer@qlogic.com>
Date: Wed, 21 Oct 2009 14:07:40 -0700

> 
> Signed-off-by: Ron Mercer <ron.mercer@qlogic.com>

Applied.

^ permalink raw reply

* Re: [net-next PATCH 4/4] qlge: Add ethtool register dump function.
From: David Miller @ 2009-10-22  4:46 UTC (permalink / raw)
  To: ron.mercer; +Cc: netdev
In-Reply-To: <1256159261-29151-4-git-send-email-ron.mercer@qlogic.com>

From: Ron Mercer <ron.mercer@qlogic.com>
Date: Wed, 21 Oct 2009 14:07:41 -0700

> Signed-off-by: Ron Mercer <ron.mercer@qlogic.com>

Applied.

^ permalink raw reply

* Re: [PATCH net-next-2.6] rtnetlink: rtnl_setlink() and rtnl_getlink() changes
From: Eric Dumazet @ 2009-10-22  4:48 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: David S. Miller, Linux Netdev List
In-Reply-To: <4ADFCA82.80901@gmail.com>

Eric Dumazet a écrit :
> Stephen Hemminger a écrit :
>> On Wed, 21 Oct 2009 22:59:31 +0200
>> Eric Dumazet <eric.dumazet@gmail.com> wrote:
>>
>>> Stephen, do you think we could change "ip link show dev ethX" to
>>> let it use rtnl_getlink() instead of rtnl_dump_ifinfo() ?
>>>
>> Would work, but not sure what it gains.
> 
> It takes about one second to dump the table when we have 25.000 devices

Adding new links takes also lot of time in rtnl_dump_ifinfo(), we could
optimize it using a 256 fanout (using the ifindex hash table instead
of the single list)

But IMHO rtnl_dump_ifinfo() should be used only when needed, not when
querying/adding a particular device.

------------------------------------------------------------------------------
   PerfTop:   42745 irqs/sec  kernel:88.0% [100000 cycles],  (all, 8 CPUs)
------------------------------------------------------------------------------

             samples    pcnt   kernel function
             _______   _____   _______________

           231146.00 - 52.4% : rtnl_dump_ifinfo
            18491.00 -  4.2% : __register_sysctl_paths
            17700.00 -  4.0% : mwait_idle
            12883.00 -  2.9% : rtnl_fill_ifinfo
            12661.00 -  2.9% : schedule
             6324.00 -  1.4% : find_busiest_group
             5911.00 -  1.3% : _spin_lock_irqsave
             4862.00 -  1.1% : dev_get_stats
             4726.00 -  1.1% : copy_to_user
             4547.00 -  1.0% : __nla_put
             4117.00 -  0.9% : sysfs_find_dirent
             4090.00 -  0.9% : sysenter_past_esp
             3789.00 -  0.9% : fput
             3735.00 -  0.8% : __nla_reserve
             3699.00 -  0.8% : read_tsc



^ permalink raw reply

* Re: [net-next-2.6 PATCH 2/3] ixgbe: Set MSI-X vectors to NOBALANCING and set affinity
From: David Miller @ 2009-10-22  4:50 UTC (permalink / raw)
  To: jeffrey.t.kirsher; +Cc: gospo, netdev, peter.p.waskiewicz.jr
In-Reply-To: <20091021022713.32449.54868.stgit@localhost.localdomain>

From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Date: Tue, 20 Oct 2009 19:27:14 -0700

> From: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
> 
> This patch will set each MSI-X vector to IRQF_NOBALANCING to
> prevent autobalance of the interrupts, then applies a CPU
> affinity.  This will only be done when Flow Director is enabled,
> which needs interrupts to be processed on the same CPUs where the
> applications are running.
> 
> Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

Just explain to me why irqbalanced in userspace cannot take care
of this issue.

Second, even if we cannot use irqbalanced for some reason, the last
thing I want to see is drivers directly fiddling with interrupt
states and attributes.  Every driver is going to do it every so
slightly differently, and often will get it wrong.

There is also no global policy or policy control available when
drivers do this stuff directly.  And that's how we end up with
situations where every driver behaves differently which results in a
terrible user experience.

^ permalink raw reply

* [GIT]: Networking
From: David Miller @ 2009-10-22  4:56 UTC (permalink / raw)
  To: torvalds; +Cc: akpm, netdev, linux-kernel


1) Add some diagnostics so we can try to track down the tcp MSG_PEEK
   WARN_ON that shows up in Arjan's kerneloops list a lot.  From
   Herbert Xu.

2) Bluetooth bug fixes from Dave Young.

3) Fix the DEFER_ACCEP ack counting issue in a better way, from Julian
   Anastasov.

4) be2net bug fixes from Sathya Perla.

5) ethoc bug fixes from Thomas Chou

6) AF_UNIX deadlock on half-close, fix fro Tomoki Sekiyama.

7) virtio_net fix for netconsole from Eric Dumazet.

8) IP_MULTICAST_IF uses __dev_get_by_index() without proper locking.
   From Eric Dumazet.

9) KS8851 bug fixes from Ben Dooks.

Please pull, thanks a lot!

The following changes since commit 2fdc246aaf9a7fa088451ad2a72e9119b5f7f029:
  Linus Torvalds (1):
        Merge branch 'for-linus' of git://git.kernel.org/.../bp/bp

are available in the git repository at:

  master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6.git master

Ben Dooks (3):
      KS8851: Add soft reset at probe time
      KS8851: Fix MAC address write order
      KS8851: Fix ks8851_set_rx_mode() for IFF_MULTICAST

Dave Young (2):
      bluetooth: scheduling while atomic bug fix
      bluetooth: static lock key fix

David S. Miller (1):
      Revert "tcp: fix tcp_defer_accept to consider the timeout"

Eric Dumazet (3):
      virtio_net: use dev_kfree_skb_any() in free_old_xmit_skbs()
      net: Fix IP_MULTICAST_IF
      net: Fix struct inet_timewait_sock bitfield annotation

Herbert Xu (1):
      tcp: Try to catch MSG_PEEK bug

Joyce Yu (1):
      niu: VLAN_ETH_HLEN should be used to make sure that the whole MAC header was copied to the head buffer in the Vlan packets case

Julian Anastasov (3):
      tcp: accept socket after TCP_DEFER_ACCEPT period
      tcp: reduce SYN-ACK retrans for TCP_DEFER_ACCEPT
      tcp: fix TCP_DEFER_ACCEPT retrans calculation

Randy Dunlap (1):
      vmxnet3: use dev_dbg, fix build for CONFIG_BLOCK=n

Sathya Perla (2):
      be2net: fix promiscuous and multicast promiscuous modes being enabled always
      be2net: fix support for PCI hot plug

Steven King (1):
      net: fix section mismatch in fec.c

Thomas Chou (2):
      ethoc: inline regs access
      ethoc: clear only pending irqs

Tomoki Sekiyama (1):
      AF_UNIX: Fix deadlock on connecting to shutdown socket

 drivers/net/benet/be_cmds.c       |   33 ++++++++++++++-------
 drivers/net/benet/be_cmds.h       |    5 ++-
 drivers/net/benet/be_main.c       |   27 +++++++++-------
 drivers/net/ethoc.c               |   21 +++++++------
 drivers/net/fec.c                 |    2 +-
 drivers/net/ks8851.c              |   42 +++++++++++++++++++++++---
 drivers/net/ks8851.h              |    1 +
 drivers/net/niu.c                 |    2 +-
 drivers/net/virtio_net.c          |    2 +-
 drivers/net/vmxnet3/vmxnet3_drv.c |   27 +++++++++++-----
 drivers/net/vmxnet3/vmxnet3_int.h |    2 +-
 include/net/inet_timewait_sock.h  |    8 ++--
 net/bluetooth/hci_sysfs.c         |    4 +-
 net/bluetooth/l2cap.c             |    9 ++++--
 net/ipv4/inet_connection_sock.c   |   34 +++++++++++++++++++--
 net/ipv4/ip_sockglue.c            |    7 ++--
 net/ipv4/tcp.c                    |   59 ++++++++++++++++++++++++++++--------
 net/ipv4/tcp_minisocks.c          |    5 +--
 net/ipv6/ipv6_sockglue.c          |    6 +++-
 net/unix/af_unix.c                |    2 +
 20 files changed, 212 insertions(+), 86 deletions(-)

^ permalink raw reply

* Re: [PATCH v2 8/8] Document future removal of sysctl_tcp_* options
From: Eric Dumazet @ 2009-10-22  4:57 UTC (permalink / raw)
  To: Bill Fink; +Cc: William Allen Simpson, netdev
In-Reply-To: <20091022003245.5cd4885c.billfink@mindspring.com>

Bill Fink a écrit :

> And as mentioned previously, the global options can be quite useful
> in certain test scenarios.  I also agree the per route settings are
> a very useful addition.  I think the global and per route settings
> are complementary and shouldn't be thought of as in conflict with
> one another.
> 

Absolutely, global setting is a must when an admin wants a quick path.

The more flexible would be to have two bits per route, plus
2 bits on the global configuration.

global conf:
00 : timestamps OFF, unless a route setting is not 00
01 : timestamps ON, unless a route setting is not 00
10 : Force timestamps OFF, ignore route settings (emergency sysadmin request)
11 : Force timestamps ON, ignore route settings 

Route settings (used *only* if global setting is 0Y)
00 : global conf is used
01 : Force timestamps being OFF for this route
10 : Force timestamps being ON for this route
11 : complement global conf

^ permalink raw reply

* [PATCH 1/3] netxen: fix i2c init
From: Dhananjay Phadke @ 2009-10-22  5:39 UTC (permalink / raw)
  To: davem; +Cc: netdev
In-Reply-To: <1256189943-20477-1-git-send-email-dhananjay@netxen.com>

Avoid resetting subsys ID in i2c block. Also remove duplicate
check for address tranlsation error.

Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
---
 drivers/net/netxen/netxen_nic_hdr.h  |    1 +
 drivers/net/netxen/netxen_nic_init.c |    8 ++------
 2 files changed, 3 insertions(+), 6 deletions(-)

diff --git a/drivers/net/netxen/netxen_nic_hdr.h b/drivers/net/netxen/netxen_nic_hdr.h
index 7a71774..1c46da6 100644
--- a/drivers/net/netxen/netxen_nic_hdr.h
+++ b/drivers/net/netxen/netxen_nic_hdr.h
@@ -419,6 +419,7 @@ enum {
 #define NETXEN_CRB_ROMUSB	\
 	NETXEN_PCI_CRB_WINDOW(NETXEN_HW_PX_MAP_CRB_ROMUSB)
 #define NETXEN_CRB_I2Q		NETXEN_PCI_CRB_WINDOW(NETXEN_HW_PX_MAP_CRB_I2Q)
+#define NETXEN_CRB_I2C0		NETXEN_PCI_CRB_WINDOW(NETXEN_HW_PX_MAP_CRB_I2C0)
 #define NETXEN_CRB_SMB		NETXEN_PCI_CRB_WINDOW(NETXEN_HW_PX_MAP_CRB_SMB)
 #define NETXEN_CRB_MAX		NETXEN_PCI_CRB_WINDOW(64)
 
diff --git a/drivers/net/netxen/netxen_nic_init.c b/drivers/net/netxen/netxen_nic_init.c
index 91c2bc6..e40b914 100644
--- a/drivers/net/netxen/netxen_nic_init.c
+++ b/drivers/net/netxen/netxen_nic_init.c
@@ -531,6 +531,8 @@ int netxen_pinit_from_rom(struct netxen_adapter *adapter, int verbose)
 			continue;
 
 		if (NX_IS_REVISION_P3(adapter->ahw.revision_id)) {
+			if (off == (NETXEN_CRB_I2C0 + 0x1c))
+				continue;
 			/* do not reset PCI */
 			if (off == (ROMUSB_GLB + 0xbc))
 				continue;
@@ -553,12 +555,6 @@ int netxen_pinit_from_rom(struct netxen_adapter *adapter, int verbose)
 				continue;
 		}
 
-		if (off == NETXEN_ADDR_ERROR) {
-			printk(KERN_ERR "%s: Err: Unknown addr: 0x%08x\n",
-					netxen_nic_driver_name, buf[i].addr);
-			continue;
-		}
-
 		init_delay = 1;
 		/* After writing this register, HW needs time for CRB */
 		/* to quiet down (else crb_window returns 0xffffffff) */
-- 
1.6.0.2


^ permalink raw reply related

* [PATCH 2/3] netxen: fix tx timeout handling on firmware hang
From: Dhananjay Phadke @ 2009-10-22  5:39 UTC (permalink / raw)
  To: davem; +Cc: netdev, Amit Kumar Salecha
In-Reply-To: <1256189943-20477-1-git-send-email-dhananjay@netxen.com>

From: Amit Kumar Salecha <amit.salecha@qlogic.com>

Clear NX_RESETING bit in netxen_tx_timeout_task() so that
the firmware watchdog task can catch need_reset request
from tx timeout.

Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com>
Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
---
 drivers/net/netxen/netxen_nic_main.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/drivers/net/netxen/netxen_nic_main.c b/drivers/net/netxen/netxen_nic_main.c
index 7fc15e9..0b4a56a 100644
--- a/drivers/net/netxen/netxen_nic_main.c
+++ b/drivers/net/netxen/netxen_nic_main.c
@@ -1919,6 +1919,7 @@ static void netxen_tx_timeout_task(struct work_struct *work)
 
 request_reset:
 	adapter->need_fw_reset = 1;
+	clear_bit(__NX_RESETTING, &adapter->state);
 }
 
 struct net_device_stats *netxen_nic_get_stats(struct net_device *netdev)
-- 
1.6.0.2


^ permalink raw reply related

* [PATCH 0/3] netxen: bug fixes
From: Dhananjay Phadke @ 2009-10-22  5:39 UTC (permalink / raw)
  To: davem; +Cc: netdev

Dave,

3 bug fixes for 2.6.32. Please apply to net-2.6.

Thanks,
	Dhananjay



^ permalink raw reply

* [PATCH 3/3] netxen: avoid undue board config check
From: Dhananjay Phadke @ 2009-10-22  5:39 UTC (permalink / raw)
  To: davem; +Cc: netdev
In-Reply-To: <1256189943-20477-1-git-send-email-dhananjay@netxen.com>

Old code assumed board config version in the flash to be 1.
When this will get changed by tools, driver just refuses to
attach. This is unnecessary since driver does not have to
parse board config structure directly (maintained by firmware).

Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
---
 drivers/net/netxen/netxen_nic_hw.c |   14 ++++----------
 1 files changed, 4 insertions(+), 10 deletions(-)

diff --git a/drivers/net/netxen/netxen_nic_hw.c b/drivers/net/netxen/netxen_nic_hw.c
index 3231400..3185a98 100644
--- a/drivers/net/netxen/netxen_nic_hw.c
+++ b/drivers/net/netxen/netxen_nic_hw.c
@@ -1901,22 +1901,16 @@ netxen_setup_hwops(struct netxen_adapter *adapter)
 
 int netxen_nic_get_board_info(struct netxen_adapter *adapter)
 {
-	int offset, board_type, magic, header_version;
+	int offset, board_type, magic;
 	struct pci_dev *pdev = adapter->pdev;
 
 	offset = NX_FW_MAGIC_OFFSET;
 	if (netxen_rom_fast_read(adapter, offset, &magic))
 		return -EIO;
 
-	offset = NX_HDR_VERSION_OFFSET;
-	if (netxen_rom_fast_read(adapter, offset, &header_version))
-		return -EIO;
-
-	if (magic != NETXEN_BDINFO_MAGIC ||
-			header_version != NETXEN_BDINFO_VERSION) {
-		dev_err(&pdev->dev,
-			"invalid board config, magic=%08x, version=%08x\n",
-			magic, header_version);
+	if (magic != NETXEN_BDINFO_MAGIC) {
+		dev_err(&pdev->dev, "invalid board config, magic=%08x\n",
+			magic);
 		return -EIO;
 	}
 
-- 
1.6.0.2


^ permalink raw reply related

* [PATCH] udev: create empty regular files to represent net interfaces
From: dann frazier @ 2009-10-22  6:36 UTC (permalink / raw)
  To: linux-hotplug
  Cc: Narendra_K, netdev, Matt_Domsch, Jordan_Hargrave, Charles_Rose,
	Ben Hutchings
In-Reply-To: <20091016214024.GA10091@ldl.fc.hp.com>

Here's a proof of concept to further the discussion..

The default filename uses the format:
  /dev/netdev/by-ifindex/$ifindex

This provides the infrastructure to permit udev rules to create aliases for
network devices using symlinks, for example:

  /dev/netdev/by-name/eth0 -> ../by-ifindex/1
  /dev/netdev/by-biosname/LOM0 -> ../by-ifindex/3

A library (such as the proposed libnetdevname) could use this information
to provide an alias->realname mapping for network utilities.

Tested with the following rule:

SUBSYSTEM=="net", PROGRAM=="/usr/local/bin/ifindex2name $attr{ifindex}", SYMLINK+="netdev/by-name/%c"

$ cat /usr/local/bin/ifindex2name
#!/bin/sh

set -e

ifindex="$1"

for d in /sys/class/net/*; do
    testindex="$(cat $d/ifindex)"
    if [ "$ifindex" = "$testindex" ]; then
	echo "$(basename $d)"
	exit 0
    fi
done

exit 1

---
 libudev/exported_symbols |    1 +
 libudev/libudev.c        |   29 ++++++++++++++++
 libudev/libudev.h        |    1 +
 udev/udev-event.c        |   82 ++++++++++++++++++++--------------------------
 udev/udev-node.c         |   41 ++++++++++++++++++++---
 udev/udev-rules.c        |    3 +-
 6 files changed, 105 insertions(+), 52 deletions(-)

diff --git a/libudev/exported_symbols b/libudev/exported_symbols
index 018463d..31c616a 100644
--- a/libudev/exported_symbols
+++ b/libudev/exported_symbols
@@ -8,6 +8,7 @@ udev_get_userdata
 udev_set_userdata
 udev_get_sys_path
 udev_get_dev_path
+udev_get_netdev_path
 udev_list_entry_get_next
 udev_list_entry_get_by_name
 udev_list_entry_get_name
diff --git a/libudev/libudev.c b/libudev/libudev.c
index 1909138..2a83417 100644
--- a/libudev/libudev.c
+++ b/libudev/libudev.c
@@ -42,6 +42,7 @@ struct udev {
 	void *userdata;
 	char *sys_path;
 	char *dev_path;
+	char *netdev_path;
 	char *rules_path;
 	struct udev_list_node properties_list;
 	int log_priority;
@@ -125,8 +126,10 @@ struct udev *udev_new(void)
 	udev->run = 1;
 	udev->dev_path = strdup("/dev");
 	udev->sys_path = strdup("/sys");
+	udev->netdev_path = strdup("/dev/netdev/by-ifindex");
 	config_file = strdup(SYSCONFDIR "/udev/udev.conf");
 	if (udev->dev_path == NULL ||
+	    udev->netdev_path == NULL ||
 	    udev->sys_path == NULL ||
 	    config_file == NULL)
 		goto err;
@@ -243,6 +246,14 @@ struct udev *udev_new(void)
 		udev_add_property(udev, "UDEV_ROOT", udev->dev_path);
 	}
 
+	env = getenv("NETDEV_ROOT");
+	if (env != NULL) {
+		free(udev->netdev_path);
+		udev->netdev_path = strdup(env);
+		util_remove_trailing_chars(udev->netdev_path, '/');
+		udev_add_property(udev, "NETDEV_ROOT", udev->netdev_path);
+	}
+
 	env = getenv("UDEV_LOG");
 	if (env != NULL)
 		udev_set_log_priority(udev, util_log_priority(env));
@@ -253,6 +264,7 @@ struct udev *udev_new(void)
 	dbg(udev, "log_priority=%d\n", udev->log_priority);
 	dbg(udev, "config_file='%s'\n", config_file);
 	dbg(udev, "dev_path='%s'\n", udev->dev_path);
+	dbg(udev, "netdev_path='%s'\n", udev->netdev_path);
 	dbg(udev, "sys_path='%s'\n", udev->sys_path);
 	if (udev->rules_path != NULL)
 		dbg(udev, "rules_path='%s'\n", udev->rules_path);
@@ -398,6 +410,23 @@ const char *udev_get_dev_path(struct udev *udev)
 	return udev->dev_path;
 }
 
+/**
+ * udev_get_netdev_path:
+ * @udev: udev library context
+ *
+ * Retrieve the device directory path. The default value is "/etc/udev/net",
+ * the actual value may be overridden in the udev configuration
+ * file.
+ *
+ * Returns: the device directory path
+ **/
+const char *udev_get_netdev_path(struct udev *udev)
+{
+	if (udev == NULL)
+		return NULL;
+	return udev->netdev_path;
+}
+
 struct udev_list_entry *udev_add_property(struct udev *udev, const char *key, const char *value)
 {
 	if (value == NULL) {
diff --git a/libudev/libudev.h b/libudev/libudev.h
index 4bcf442..5834781 100644
--- a/libudev/libudev.h
+++ b/libudev/libudev.h
@@ -77,6 +77,7 @@ struct udev_device *udev_device_get_parent_with_subsystem_devtype(struct udev_de
 								  const char *subsystem, const char *devtype);
 /* retrieve device properties */
 const char *udev_device_get_devpath(struct udev_device *udev_device);
+const char *udev_device_get_netdevpath(struct udev_device *udev_device);
 const char *udev_device_get_subsystem(struct udev_device *udev_device);
 const char *udev_device_get_devtype(struct udev_device *udev_device);
 const char *udev_device_get_syspath(struct udev_device *udev_device);
diff --git a/udev/udev-event.c b/udev/udev-event.c
index d5b4d09..953f87a 100644
--- a/udev/udev-event.c
+++ b/udev/udev-event.c
@@ -542,7 +542,7 @@ int udev_event_execute_rules(struct udev_event *event, struct udev_rules *rules)
 	}
 
 	/* add device node */
-	if (major(udev_device_get_devnum(dev)) != 0 &&
+	if ((major(udev_device_get_devnum(dev)) != 0 || strcmp(udev_device_get_subsystem(dev), "net") == 0) &&
 	    (strcmp(udev_device_get_action(dev), "add") == 0 || strcmp(udev_device_get_action(dev), "change") == 0)) {
 		char filename[UTIL_PATH_SIZE];
 		struct udev_device *dev_old;
@@ -603,10 +603,38 @@ int udev_event_execute_rules(struct udev_event *event, struct udev_rules *rules)
 			goto exit_add;
 		}
 
-		/* set device node name */
-		util_strscpyl(filename, sizeof(filename), udev_get_dev_path(event->udev), "/", event->name, NULL);
-		udev_device_set_devnode(dev, filename);
-
+		/* add netif */
+		if (strcmp(udev_device_get_subsystem(dev), "net") == 0 &&
+		    strcmp(udev_device_get_action(dev), "add") == 0) {
+			char syspath[UTIL_PATH_SIZE];
+			info(event->udev, "netif add '%s'\n", udev_device_get_devpath(dev));
+			/* look if we want to change the name of the netif */
+			if (strcmp(event->name, udev_device_get_sysname(dev)) != 0) {
+				char *pos;
+				err = rename_netif(event);
+				if (err != 0)
+					goto exit;
+				info(event->udev, "renamed netif to '%s'\n", event->name);
+				
+				/* remember old name */
+				udev_device_add_property(dev, "INTERFACE_OLD", udev_device_get_sysname(dev));
+				
+				/* now change the devpath, because the kernel device name has changed */
+				util_strscpy(syspath, sizeof(syspath), udev_device_get_syspath(dev));
+				pos = strrchr(syspath, '/');
+				if (pos != NULL) {
+					pos++;
+					util_strscpy(pos, sizeof(syspath) - (pos - syspath), event->name);
+					udev_device_set_syspath(event->dev, syspath);
+					udev_device_add_property(dev, "INTERFACE", udev_device_get_sysname(dev));
+					info(event->udev, "changed devpath to '%s'\n", udev_device_get_devpath(dev));
+				}
+			}
+			snprintf(syspath, sizeof(syspath), "%s/%s", udev_get_netdev_path(event->udev),
+				 udev_device_get_property_value(event->dev, "IFINDEX"));
+			udev_device_set_devnode(dev, syspath);
+		}
+		    
 		/* write current database entry */
 		udev_device_update_db(dev);
 
@@ -632,49 +660,11 @@ exit_add:
 		goto exit;
 	}
 
-	/* add netif */
-	if (strcmp(udev_device_get_subsystem(dev), "net") == 0 && strcmp(udev_device_get_action(dev), "add") == 0) {
-		dbg(event->udev, "netif add '%s'\n", udev_device_get_devpath(dev));
-		udev_device_delete_db(dev);
-
-		udev_rules_apply_to_event(rules, event);
-		if (event->ignore_device) {
-			info(event->udev, "device event will be ignored\n");
-			goto exit;
-		}
-		if (event->name == NULL)
-			goto exit;
-
-		/* look if we want to change the name of the netif */
-		if (strcmp(event->name, udev_device_get_sysname(dev)) != 0) {
-			char syspath[UTIL_PATH_SIZE];
-			char *pos;
-
-			err = rename_netif(event);
-			if (err != 0)
-				goto exit;
-			info(event->udev, "renamed netif to '%s'\n", event->name);
-
-			/* remember old name */
-			udev_device_add_property(dev, "INTERFACE_OLD", udev_device_get_sysname(dev));
-
-			/* now change the devpath, because the kernel device name has changed */
-			util_strscpy(syspath, sizeof(syspath), udev_device_get_syspath(dev));
-			pos = strrchr(syspath, '/');
-			if (pos != NULL) {
-				pos++;
-				util_strscpy(pos, sizeof(syspath) - (pos - syspath), event->name);
-				udev_device_set_syspath(event->dev, syspath);
-				udev_device_add_property(dev, "INTERFACE", udev_device_get_sysname(dev));
-				info(event->udev, "changed devpath to '%s'\n", udev_device_get_devpath(dev));
-			}
-		}
-		udev_device_update_db(dev);
-		goto exit;
-	}
 
 	/* remove device node */
-	if (major(udev_device_get_devnum(dev)) != 0 && strcmp(udev_device_get_action(dev), "remove") == 0) {
+	if ((major(udev_device_get_devnum(dev)) != 0 ||
+	     strcmp(udev_device_get_subsystem(dev), "net") == 0) &&
+	    strcmp(udev_device_get_action(dev), "remove") == 0) {
 		/* import database entry and delete it */
 		udev_device_read_db(dev);
 		udev_device_set_info_loaded(dev);
diff --git a/udev/udev-node.c b/udev/udev-node.c
index 39bec3e..da96a4a 100644
--- a/udev/udev-node.c
+++ b/udev/udev-node.c
@@ -32,6 +32,34 @@
 
 #define TMP_FILE_EXT		".udev-tmp"
 
+static bool udev_node_mode_matches(struct stat *stats, dev_t devnum, mode_t mode)
+{
+	if ((stats->st_mode & S_IFMT) != (mode & S_IFMT))
+		return false;
+
+	if ((S_ISCHR(mode) || S_ISBLK(mode)) && (stats->st_rdev != devnum))
+		return false;
+
+	return true;
+}
+
+static int udev_node_create_file(struct udev *udev, const char *path, dev_t devnum, mode_t mode)
+{
+	int fd, ret = 0;
+
+	if (S_ISCHR(mode) || S_ISBLK(mode))
+		ret = mknod(path, mode, devnum);
+	else {
+		fd = creat(path, mode);
+		if (fd < 0)
+			ret = fd;
+		else
+			close(fd);
+	}
+
+	return ret;
+}
+
 int udev_node_mknod(struct udev_device *dev, const char *file, dev_t devnum, mode_t mode, uid_t uid, gid_t gid)
 {
 	struct udev *udev = udev_device_get_udev(dev);
@@ -47,12 +75,15 @@ int udev_node_mknod(struct udev_device *dev, const char *file, dev_t devnum, mod
 	else
 		mode |= S_IFCHR;
 
+	if (strcmp(udev_device_get_subsystem(dev), "net") == 0)
+		mode = S_IFREG | S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH;
+
 	if (file == NULL)
 		file = udev_device_get_devnode(dev);
 
 	if (lstat(file, &stats) == 0) {
-		if (((stats.st_mode & S_IFMT) == (mode & S_IFMT)) && (stats.st_rdev == devnum)) {
-			info(udev, "preserve file '%s', because it has correct dev_t\n", file);
+		if (udev_node_mode_matches(&stats, devnum, mode)) {
+			info(udev, "preserve file '%s', because it has correct type\n", file);
 			preserve = 1;
 			udev_selinux_lsetfilecon(udev, file, mode);
 		} else {
@@ -62,10 +93,10 @@ int udev_node_mknod(struct udev_device *dev, const char *file, dev_t devnum, mod
 			util_strscpyl(file_tmp, sizeof(file_tmp), file, TMP_FILE_EXT, NULL);
 			unlink(file_tmp);
 			udev_selinux_setfscreatecon(udev, file_tmp, mode);
-			err = mknod(file_tmp, mode, devnum);
+			err = udev_node_create_file(udev, file_tmp, devnum, mode);
 			udev_selinux_resetfscreatecon(udev);
 			if (err != 0) {
-				err(udev, "mknod(%s, %#o, %u, %u) failed: %m\n",
+				err(udev, "udev_node_create_file(%s, %#o, %u, %u) failed: %m\n",
 				    file_tmp, mode, major(devnum), minor(devnum));
 				goto exit;
 			}
@@ -80,7 +111,7 @@ int udev_node_mknod(struct udev_device *dev, const char *file, dev_t devnum, mod
 		do {
 			util_create_path(udev, file);
 			udev_selinux_setfscreatecon(udev, file, mode);
-			err = mknod(file, mode, devnum);
+			err = udev_node_create_file(udev, file, devnum, mode);
 			if (err != 0)
 				err = errno;
 			udev_selinux_resetfscreatecon(udev);
diff --git a/udev/udev-rules.c b/udev/udev-rules.c
index ddb51de..a1fe991 100644
--- a/udev/udev-rules.c
+++ b/udev/udev-rules.c
@@ -2435,7 +2435,8 @@ int udev_rules_apply_to_event(struct udev_rules *rules, struct udev_event *event
 
 				if (event->devlink_final)
 					break;
-				if (major(udev_device_get_devnum(event->dev)) == 0)
+				if ((major(udev_device_get_devnum(event->dev)) == 0) &&
+				    (strcmp(udev_device_get_subsystem(event->dev), "net") != 0))
 					break;
 				if (cur->key.op == OP_ASSIGN_FINAL)
 					event->devlink_final = 1;
-- 
1.6.5




^ permalink raw reply related

* [PATCH 2.6.32-rc5] r8169: fix Ethernet Hangup for RTL8110SC rev d
From: Simon Wunderlich @ 2009-10-22  6:48 UTC (permalink / raw)
  To: netdev; +Cc: Francois Romieu, Bernhard Schmidt

The 8110SC rev d chip on our board shows a regression which the 8110SB chip
did not have. When inbound traffic is overflowing the receive descriptor queue,
"holes" in the ring buffer may occur which lead to a hangup until the buffer
is filled again. The packets are than completely processed, but the ring
remains porous and no packets are processed until the next overflow. Setting
the interface down and up can fix the problem temporary from userspace.

For some reason we don't know, this behaviour is not occuring if the RxVlan
bit for hardware VLAN untagging is set. There is another "Work around for
AMD plateform" in the current code which checks the VLAN status
word in receive descriptors, but does never come to effect when hardware
VLAN support is enabled. We assume that this is a bug in the chip.

The following patch fixes the problem. Without the patch we could reproduce
the hang within minutes (given other devices also generating lots of
interrupts), without we couldn't reproduce within a few days of long term
testing.

Signed-off-by: Bernhard Schmidt <bernhard.schmidt@saxnet.de>
Signed-off-by: Simon Wunderlich <simon.wunderlich@saxnet.de>
Acked-by: Francois Romieu <romieu@zoreil.com>

diff --git a/drivers/net/r8169.c b/drivers/net/r8169.c
index 83c47d9..0908c50 100644
--- a/drivers/net/r8169.c
+++ b/drivers/net/r8169.c
@@ -1029,7 +1029,10 @@ static void rtl8169_vlan_rx_register(struct
net_device *dev,

    spin_lock_irqsave(&tp->lock, flags);
    tp->vlgrp = grp;
-   if (tp->vlgrp)
+   /*
+    * Do not disable RxVlan on 8110SCd.
+    */
+   if (tp->vlgrp || (tp->mac_version == RTL_GIGA_MAC_VER_05))
        tp->cp_cmd |= RxVlan;
    else
        tp->cp_cmd &= ~RxVlan;
@@ -3197,6 +3200,15 @@ rtl8169_init_one(struct pci_dev *pdev, const
struct pci_device_id *ent)
    }

    rtl8169_init_phy(dev, tp);
+
+   /*
+    * Pretend we are using VLANs; This bypasses a nasty bug where
+    * Interrupts stop flowing on high load on 8110SCd controllers.
+    */
+   if (tp->mac_version == RTL_GIGA_MAC_VER_05)
+       RTL_W16(CPlusCmd, RTL_R16(CPlusCmd) | RxVlan);
+
+
    device_set_wakeup_enable(&pdev->dev, tp->features & RTL_FEATURE_WOL);

 out:

^ permalink raw reply related

* Re: [PATCH]bnx2x: remove duplication of the BCM_VLAN macro
From: Eilon Greenstein @ 2009-10-22  7:42 UTC (permalink / raw)
  To: kirjanov@gmail.com; +Cc: davem@davemloft.net, netdev@vger.kernel.org
In-Reply-To: <20091021203851.GA5311@coldcone>

On Wed, 2009-10-21 at 13:38 -0700, Denis Kirjanov 
> File bnx2.c already contains condition of the macro inclusion.
> So we can remove this.

It is true that BCM_VLAN is defined in bnx2.c, however it is not defined
in bnx2x_*.c files. The definition in bnx2x.h is needed for the bnx2x.ko
module.

Please do not remove it.

Thanks,
Eilon

> Signed-off-by: Denis Kirjanov <kirjanov@gmail.com>
> ---
> 
> diff --git a/drivers/net/bnx2x.h b/drivers/net/bnx2x.h
> index bbf8422..4b99fd2 100644
> --- a/drivers/net/bnx2x.h
> +++ b/drivers/net/bnx2x.h
> @@ -20,10 +20,6 @@
>   * (you will need to reboot afterwards) */
>  /* #define BNX2X_STOP_ON_ERROR */
>  
> -#if defined(CONFIG_VLAN_8021Q) || defined(CONFIG_VLAN_8021Q_MODULE)
> -#define BCM_VLAN			1
> -#endif
> -
>  
>  #define BNX2X_MULTI_QUEUE
>  




^ permalink raw reply

* Re: [net-next-2.6 PATCH 2/3] ixgbe: Set MSI-X vectors to NOBALANCING and set affinity
From: Peter P Waskiewicz Jr @ 2009-10-22  8:22 UTC (permalink / raw)
  To: David Miller; +Cc: Kirsher, Jeffrey T, gospo@redhat.com, netdev@vger.kernel.org
In-Reply-To: <20091021.215031.57955781.davem@davemloft.net>

On Wed, 2009-10-21 at 21:50 -0700, David Miller wrote:
> From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
> Date: Tue, 20 Oct 2009 19:27:14 -0700
> 
> > From: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
> > 
> > This patch will set each MSI-X vector to IRQF_NOBALANCING to
> > prevent autobalance of the interrupts, then applies a CPU
> > affinity.  This will only be done when Flow Director is enabled,
> > which needs interrupts to be processed on the same CPUs where the
> > applications are running.
> > 
> > Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
> > Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
> 
> Just explain to me why irqbalanced in userspace cannot take care
> of this issue.

The problem we have is when Flow Director is enabled, we want to try and
balance the applications across all CPUs.  irqbalance is going to fight
with the scheduler to balance things, and our tests show that irqbalance
only utilizes a few of the CPU cores, not all of them.  That fights
directly with Flow Director and what it's trying to do.

> Second, even if we cannot use irqbalanced for some reason, the last
> thing I want to see is drivers directly fiddling with interrupt
> states and attributes.  Every driver is going to do it every so
> slightly differently, and often will get it wrong.

The first thing any performance guide says is to disable irqbalance, and
affinitize the interrupts in /proc/irq/<irq>/smp_affinity.  This will
ensure the best distribution of work.  The major disadvantage in doing
this is disabling irqbalance affects the entire system.  What this
patchset is trying to do is make sure a single driver, trying to
optimize for performance, doesn't need to affect the entire system.
Setting no-balancing on a vector is the best approach for the entire
system.

I completely understand your concern that this opens precedent for other
drivers to potentially start doing crazy things with interrupts, but
with MSI-X, we're only impacting our driver.

> There is also no global policy or policy control available when
> drivers do this stuff directly.  And that's how we end up with
> situations where every driver behaves differently which results in a
> terrible user experience.

Again, I think the overall impact is worse where the normal approach to
performance tuning is to altogether disable irqbalancing.  The same
effect can be attained by a user disabling irqbalance, and assigning
whatever affinity they want, which could be even more devastating.  What
we're trying to do here is have the driver come as best tuned out of the
box as possible.

If there's something about this particular implementation you're not
comfortable with, I'm very willing to take any feedback on it.  We're
trying to do a specific thing, not lead poor design in drivers when
dealing with interrupts.

Regards,
-PJ Waskiewicz


^ permalink raw reply

* Re: [PATCH] net: Adjust softirq raising in __napi_schedule
From: Johannes Berg @ 2009-10-22  8:27 UTC (permalink / raw)
  To: Jarek Poplawski
  Cc: Tilman Schmidt, David Miller, hidave.darkstar, linux-kernel, tglx,
	linux-wireless, linux-ppp, netdev, paulus, Michael Buesch,
	Oliver Hartkopp
In-Reply-To: <20091021213947.GA12202@ami.dom.local>

[-- Attachment #1: Type: text/plain, Size: 1228 bytes --]

On Wed, 2009-10-21 at 23:39 +0200, Jarek Poplawski wrote:

> > > -	__raise_softirq_irqoff(NET_RX_SOFTIRQ);
> > > +	raise_softirq_irqoff(NET_RX_SOFTIRQ);
> > 
> > This still doesn't make any sense.
> > 
> > There may or may not be a lot of code that assumes that everything else
> > is run with other tasklets disabled, and that it cannot be interrupted
> > by a tasklet and thus create a race.
> > 
> > Can you prove that is not the case, across the entire networking layer?
> 
> I'm not sure I can understand your question. This patch is mainly to
> avoid using netif_rx()/netif_rx_ni() pair as a test of proper process
> context handling; IMHO there're better tools for this (lockdep,
> WARN_ON's).

And how exactly does that matter to the patch at hand?!

I'm saying that it seems to me, as indicated by the API (and without
proof otherwise that's how it is) the networking layer needs to have
packets handed to it with softirqs disabled. Therefore, this patch is
not needed. While it may not be _wrong_, it'll definitely introduce a
performance regression.

This really should be obvious. You're fixing the warning at the source
of the warning, rather than the source of the problem.

johannes

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 801 bytes --]

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox