Netdev List


* radvd 1.5 released
From: Pekka Savola @ 2009-09-10 12:00 UTC (permalink / raw)
  To: netdev, radvd-announce-l

Hello,

A new version of radvd has been released.  It fixes two regressions
introduced a couple of years ago: radvd might segfault or loop forever
when the cable is plugged in or unplugged, and if the cable is unplugged
at startup with IgnoreIfMissing configured, the interface might
continue to be ignored.

Special thanks to Reuben Hawkins and Teemu Torma for debugging 
these problems and working on patches.

Get it at: http://www.litech.org/radvd/

-- 
Pekka Savola                 "You each name yourselves king, yet the
Netcore Oy                    kingdom bleeds."
Systems. Networks. Security. -- George R.R. Martin: A Clash of Kings


* Re: igb bandwidth allocation configuration
From: Patrick McHardy @ 2009-09-10 11:55 UTC (permalink / raw)
  To: Simon Horman; +Cc: e1000-devel, netdev
In-Reply-To: <4AA8E2CE.2080707@trash.net>

[-- Attachment #1: Type: text/plain, Size: 1123 bytes --]

Patrick McHardy wrote:
> Simon Horman wrote:
>>
>> I have been looking into adding support for the 82576's per-PF/VF
>> bandwidth allocation to the igb driver. It seems that the trickiest
>> part is working out how to expose things to user-space.
>>
>> ...
>> Internally, it seems the limits are actually applied to HW Tx queues
>> rather than directly to VMs. There are 16 such queues. Accordingly it might
>> be useful to design an interface to set limits per-queue using ethtool.
>> But this would seem to also require exposing which queues are associated
>> with which PF/VF.
> 
> Just an idea since I don't know much about this stuff:
> 
> Since we now have the mq packet scheduler, which exposes the device
> queues as qdisc classes, how about adding driver-specific configuration
> attributes that are passed to the driver by the mq scheduler? This
> would allow configuring per-queue bandwidth limits using regular TC
> commands and also using those limits without VFs for any kind of traffic.
> Drivers not supporting this would refuse unsupported options.

Attached patch demonstrates the idea. Compile-tested only.


[-- Attachment #2: x --]
[-- Type: text/plain, Size: 3012 bytes --]

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index a44118b..388841c 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -178,6 +178,7 @@ enum {
 struct neighbour;
 struct neigh_parms;
 struct sk_buff;
+struct nlattr;
 
 struct netif_rx_stats
 {
@@ -636,6 +637,12 @@ struct net_device_ops {
 	int			(*ndo_fcoe_ddp_done)(struct net_device *dev,
 						     u16 xid);
 #endif
+	int			(*ndo_queue_config)(struct net_device *dev,
+						    unsigned int qnum,
+						    const struct nlattr *nla[]);
+	int			(*ndo_get_queue_config)(struct net_device *dev,
+							struct sk_buff *skb,
+							unsigned int qnum);
 };
 
 /*
diff --git a/include/linux/pkt_sched.h b/include/linux/pkt_sched.h
index d51a2b3..742db43 100644
--- a/include/linux/pkt_sched.h
+++ b/include/linux/pkt_sched.h
@@ -518,4 +518,14 @@ struct tc_drr_stats
 	__u32	deficit;
 };
 
+/* MQ */
+
+enum
+{
+	TCA_MQ_UNSPEC,
+	__TCA_MQ_MAX
+};
+
+#define TCA_MQ_MAX	(__TCA_MQ_MAX - 1)
+
 #endif
diff --git a/net/sched/sch_mq.c b/net/sched/sch_mq.c
index dd5ee02..13132b9 100644
--- a/net/sched/sch_mq.c
+++ b/net/sched/sch_mq.c
@@ -171,15 +171,61 @@ static void mq_put(struct Qdisc *sch, unsigned long cl)
 	return;
 }
 
+static const struct nla_policy mq_policy[TCA_MQ_MAX + 1] = {
+	/* nothing so far */
+};
+
+static int mq_change_class(struct Qdisc *sch, u32 classid, u32 parentid,
+			   struct nlattr **tca, unsigned long *arg)
+{
+	struct net_device *dev = qdisc_dev(sch);
+	struct nlattr *tb[TCA_MQ_MAX + 1];
+	unsigned long ntx;
+	int err;
+
+	if (*arg == 0)
+		return -EOPNOTSUPP;
+	if (!mq_queue_get(sch, *arg))
+		return -ENOENT;
+	ntx = *arg - 1;
+
+	if (tca == NULL)
+		return -EINVAL;
+
+	err = nla_parse_nested(tb, TCA_MQ_MAX, tca[TCA_OPTIONS], mq_policy);
+	if (err < 0)
+		return err;
+
+	if (dev->netdev_ops->ndo_queue_config == NULL)
+		return -EOPNOTSUPP;
+	return dev->netdev_ops->ndo_queue_config(dev, ntx, (void *)tb);
+}
+
 static int mq_dump_class(struct Qdisc *sch, unsigned long cl,
 			 struct sk_buff *skb, struct tcmsg *tcm)
 {
 	struct netdev_queue *dev_queue = mq_queue_get(sch, cl);
+	struct net_device *dev = qdisc_dev(sch);
+	struct nlattr *nest;
 
 	tcm->tcm_parent = TC_H_ROOT;
 	tcm->tcm_handle |= TC_H_MIN(cl);
 	tcm->tcm_info = dev_queue->qdisc_sleeping->handle;
-	return 0;
+
+	if (dev->netdev_ops->ndo_get_queue_config) {
+		nest = nla_nest_start(skb, TCA_OPTIONS);
+		if (nest == NULL)
+			goto nla_put_failure;
+		if (dev->netdev_ops->ndo_get_queue_config(dev, skb, cl - 1) < 0)
+			goto nla_put_failure;
+		nla_nest_end(skb, nest);
+	}
+
+	return skb->len;
+
+nla_put_failure:
+	nla_nest_cancel(skb, nest);
+	return -EMSGSIZE;
 }
 
 static int mq_dump_class_stats(struct Qdisc *sch, unsigned long cl,
@@ -214,6 +260,7 @@ static void mq_walk(struct Qdisc *sch, struct qdisc_walker *arg)
 
 static const struct Qdisc_class_ops mq_class_ops = {
 	.select_queue	= mq_select_queue,
+	.change		= mq_change_class,
 	.graft		= mq_graft,
 	.leaf		= mq_leaf,
 	.get		= mq_get,
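
The dispatch logic in mq_change_class() above can be modelled in user space.
The sketch below is not kernel code: struct toy_dev, struct toy_ops and the
rate_mbps argument are hypothetical stand-ins (the patch itself defines no
attributes yet), and the NULL-queue check follows what the lookup appears
intended to do. It only shows the three outcomes: class id 0 or a driver
without the hook gives -EOPNOTSUPP, an out-of-range queue gives -ENOENT,
anything else is forwarded to the driver.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins for net_device_ops and net_device. */
struct toy_dev;

struct toy_ops {
	/* Optional per-queue configuration hook; NULL if unsupported. */
	int (*queue_config)(struct toy_dev *dev, unsigned int qnum,
			    unsigned int rate_mbps);
};

struct toy_dev {
	const struct toy_ops *ops;
	unsigned int num_tx_queues;	/* must not exceed 16 in this model */
	unsigned int rate_mbps[16];	/* per-queue limit, 0 = unlimited */
};

/* Example driver hook: record the limit for the given queue. */
static int toy_driver_queue_config(struct toy_dev *dev, unsigned int qnum,
				   unsigned int rate_mbps)
{
	dev->rate_mbps[qnum] = rate_mbps;
	return 0;
}

/*
 * Mirrors the control flow of mq_change_class(): class ids are 1-based,
 * so class N maps to TX queue N - 1.
 */
static int toy_change_class(struct toy_dev *dev, unsigned long cl,
			    unsigned int rate_mbps)
{
	if (cl == 0)
		return -EOPNOTSUPP;	/* creating new classes is not supported */
	if (cl - 1 >= dev->num_tx_queues)
		return -ENOENT;		/* no such queue */
	if (dev->ops->queue_config == NULL)
		return -EOPNOTSUPP;	/* driver has no per-queue config */
	return dev->ops->queue_config(dev, (unsigned int)(cl - 1), rate_mbps);
}
```

Keeping the hook optional means drivers that never implement it are
unaffected, which matches the "drivers not supporting this would refuse
unsupported options" part of the proposal.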


* Re: net_sched 07/07: add classful multiqueue dummy scheduler
From: Patrick McHardy @ 2009-09-10 11:28 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: Eric Dumazet, netdev
In-Reply-To: <20090909195238.GA3043@ami.dom.local>

Jarek Poplawski wrote:
> On Wed, Sep 09, 2009 at 06:02:59PM +0200, Patrick McHardy wrote:
>>>>>>> +	for (ntx = 0; ntx < dev->num_tx_queues; ntx++) {
>>>>>>> +		qdisc = netdev_get_tx_queue(dev, ntx)->qdisc_sleeping;
>>>>>>> +		spin_lock_bh(qdisc_lock(qdisc));
>>>>>>> +		sch->q.qlen		+= qdisc->q.qlen;
>>>>>>> +		sch->bstats.bytes	+= qdisc->bstats.bytes;
>>>>>>> +		sch->bstats.packets	+= qdisc->bstats.packets;
>>>>>>> +		sch->qstats.qlen	+= qdisc->qstats.qlen;
>>>>>> Like in Christoph's case, we should probably use q.qlen instead.
>>>>> It's done a few lines above. This simply sums up all members of qstats.
>>>> AFAICS these members are updated only in tc_fill_qdisc, starting from
>>>> the root, so they might be not up-to-date at the moment, unless I miss
>>>> something.
>>> Yes, we might need a q->ops->update_stats(struct Qdisc *sch) method, and
>>> to recursively call it from mq_update_stats()
>> Unless I'm missing something, that shouldn't be necessary since
>> sch->q.qlen contains the correct sum of all child qdiscs and
>> this is used by tc_fill_qdisc to update qstats.qlen.
> 
> You're perfectly right! (And the code is perfectly misleading.;-)

I'll remove the misleading (and unnecessary) line of code, thanks Jarek.


* Re: igb bandwidth allocation configuration
From: Patrick McHardy @ 2009-09-10 11:28 UTC (permalink / raw)
  To: Simon Horman; +Cc: e1000-devel, netdev
In-Reply-To: <20090910081844.GA5421@verge.net.au>

Simon Horman wrote:
> Hi,
> 
> I have been looking into adding support for the 82576's per-PF/VF
> bandwidth allocation to the igb driver. It seems that the trickiest
> part is working out how to expose things to user-space.
> 
> I was thinking along the lines of an ethtool option as follows:
> 
> 	ethtool --bandwidth ethN LIMIT...
> 
> 	where:
> 		* There is one LIMIT per PF/VF.
> 		  The 82576 can have up to 7 VFs per PF,
> 		  so there would be up to 8 LIMITS
> 		* A keyword (none?) can be used to denote that
> 		  bandwidth allocation should be disabled for the
> 		  corresponding VM
> 		* Otherwise LIMITS are in Megabits/s
> 
> This may get a bit cumbersome if there are a lot of VFs per PF;
> perhaps a better syntax would be:
> 
> 	ethtool --bandwidth ethN M=LIMIT...
> 
> 	where:
> 		* LIMIT is as above
> 		* M is some key to denote which VF/PF is
> 		  having its limit set.
> 
> Internally, it seems the limits are actually applied to HW Tx queues
> rather than directly to VMs. There are 16 such queues. Accordingly it might
> be useful to design an interface to set limits per-queue using ethtool.
> But this would seem to also require exposing which queues are associated
> with which PF/VF.

Just an idea since I don't know much about this stuff:

Since we now have the mq packet scheduler, which exposes the device
queues as qdisc classes, how about adding driver-specific configuration
attributes that are passed to the driver by the mq scheduler? This
would allow configuring per-queue bandwidth limits using regular TC
commands and also using those limits without VFs for any kind of traffic.
Drivers not supporting this would refuse unsupported options.



* Re: TCP kernel tables overflowing after sustained 1000 new connections per second
From: Andi Kleen @ 2009-09-10  9:24 UTC (permalink / raw)
  To: David Miller; +Cc: paulsheer, linux-kernel, roque, netdev
In-Reply-To: <20090909.170824.141343404.davem@davemloft.net>

> On a gigabit local LAN I can set the timeouts very low to encourage
> port reuse. A well known configuration issue with all OS's - just search
> for MyOS+TIMED_WAIT on google. No problems here.

The timeouts are what they are for a reason: to detect old packets in
the network and prevent data corruption. That's why the RFCs require
them.

Unless you never run on WANs or have very strong data integrity checking
in your application (e.g. SSL), it's normally not a good idea to mess
with them.

When you run out of port space you should use more local IP addresses.

Possibly if you don't have problems with firewalls you could
also increase the port space, but that's still limited.

-Andi
-- 
ak@linux.intel.com -- Speaking for myself only.
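
The port-space limit mentioned above can be put into numbers. The sketch
below is an illustration only: the 32768-61000 ephemeral range and the 60
second TIME_WAIT period are assumptions (commonly cited Linux defaults of
this era, check your own system), and the function names are hypothetical.
It assumes the connecting side closes first, so each closed connection
holds its local port for the whole TIME_WAIT period, and that all
connections go to the same remote address and port.

```c
/* Number of ports in an inclusive ephemeral range [lo, hi]. */
static unsigned int port_count(unsigned int lo, unsigned int hi)
{
	return hi - lo + 1;
}

/*
 * Highest sustained rate of new connections (per second) that one local
 * address can support toward a single remote address:port when every
 * closed connection keeps its port busy for time_wait_s seconds.
 */
static double max_conn_rate(unsigned int ports, unsigned int time_wait_s)
{
	return (double)ports / time_wait_s;
}

/* Local addresses needed to sustain target_rate, rounded up. */
static unsigned int local_addrs_needed(unsigned int target_rate,
				       unsigned int ports,
				       unsigned int time_wait_s)
{
	unsigned long busy = (unsigned long)target_rate * time_wait_s;

	return (unsigned int)((busy + ports - 1) / ports);
}
```

Under those assumptions one local address tops out at roughly 470 new
connections per second, so a sustained 1000 per second would need three
local addresses rather than shorter timeouts.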


* Re: [PATCH RESEND] bonding: remap muticast addresses without using dev_close() and dev_open()
From: Moni Shoua @ 2009-09-10  8:47 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: Jay Vosburgh, David Miller, Jason Gunthorpe, netdev,
	bonding-devel
In-Reply-To: <4AA8B19F.2080704@voltaire.com>

Or Gerlitz wrote:
> Moni Shoua wrote:
>> This patch fixes commit e36b9d16c6a6d0f59803b3ef04ff3c22c3844c10. The
>> approach there is to call dev_close()/dev_open() whenever the device
>> type is changed in order to remap the device IP multicast addresses to
>> HW multicast addresses. This approach suffers from 2 drawbacks [...]
>> The fix here is to directly remap the IP multicast addresses to HW
>> multicast addresses for a bonding device that changes its type, and
>> nothing else.
> 
> Moni,
> 
> The approach and patch look good. First, I think it may be easier
> to review and maintain if you separate this into two patches, the first
> simply reverting e36b9d16c6a6d0f59803b3ef04ff3c22c3844c10 and the second
> the approach suggested by this patch. Second, I think you may be able to
> do well with only one event, see below
> 
I don't need to revert the entire patch. Only the dev_open() and dev_close() functions need to be removed and it is quite easy to review it in one patch.
>> @@ -1460,14 +1460,17 @@ int bond_enslave(struct net_device *bond_dev,
>> struct net_device *slave_dev)
>>       */
>>      if (bond->slave_cnt == 0) {
>>          if (bond_dev->type != slave_dev->type) {
>> -            dev_close(bond_dev);
>>              pr_debug("%s: change device type from %d to %d\n",
>>                  bond_dev->name, bond_dev->type, slave_dev->type);
>> +
>> +            netdev_bonding_change(bond_dev, NETDEV_BONDING_OLDTYPE);
>> +
>>              if (slave_dev->type != ARPHRD_ETHER)
>>                  bond_setup_by_slave(bond_dev, slave_dev);
>>              else
>>                  ether_setup(bond_dev);
>> -            dev_open(bond_dev);
>> +
>> +            netdev_bonding_change(bond_dev, NETDEV_BONDING_NEWTYPE);
>>          }
> Can't you achieve the same effect by just calling
> netdev_bonding_change(bond_dev, NETDEV_BONDING_NEWTYPE) after doing the
> setup_by_slave, and have the stack call ip_mc_unmap(...) and then
> ip_mc_map(...)?
> 
I thought about it, but arp_mc_map(), which is called both before and after the change in dev->type, relies on the value of dev->type. I could write the patch with a single event after the type has changed and pass the old device type somehow (a prev_type field in struct net_device?), but the resulting code would look clumsy (at least to me).
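
The ordering constraint described above can be illustrated with a small
user-space model. Everything here is hypothetical (struct toy_dev, the
toy_* helpers, and the InfiniBand-style key encoding are made up for the
example); only the Ethernet mapping of 01:00:5e plus the low 23 bits of the
group follows the usual IPv4 multicast rule. The point is that the key to
remove is derived from the device type, so removing it after the type has
changed looks up the wrong key and leaves a stale entry behind.

```c
#include <stdint.h>

/* Values follow ARPHRD_ETHER and ARPHRD_INFINIBAND. */
enum { TOY_ETHER = 1, TOY_INFINIBAND = 32 };

struct toy_dev {
	int type;
	uint64_t filter[8];	/* installed HW multicast keys, 0 = empty */
};

/* Type-dependent mapping from an IPv4 multicast group to a HW key. */
static uint64_t toy_mc_map(const struct toy_dev *dev, uint32_t group)
{
	if (dev->type == TOY_ETHER)
		return 0x01005e000000ULL | (group & 0x7fffff);
	return 0xff12000000000000ULL | (group & 0x0fffffff); /* made up */
}

static void toy_filter_add(struct toy_dev *dev, uint64_t key)
{
	int i;

	for (i = 0; i < 8; i++) {
		if (dev->filter[i] == 0) {
			dev->filter[i] = key;
			return;
		}
	}
}

/* Returns 0 if the key was installed and is now removed, -1 otherwise. */
static int toy_filter_del(struct toy_dev *dev, uint64_t key)
{
	int i;

	for (i = 0; i < 8; i++) {
		if (dev->filter[i] == key) {
			dev->filter[i] = 0;
			return 0;
		}
	}
	return -1;
}

static int toy_filter_count(const struct toy_dev *dev)
{
	int i, n = 0;

	for (i = 0; i < 8; i++)
		n += dev->filter[i] != 0;
	return n;
}

/* Two events: unmap under the old type, switch, map under the new type. */
static int change_type_two_events(struct toy_dev *dev, uint32_t group,
				  int new_type)
{
	int err = toy_filter_del(dev, toy_mc_map(dev, group));

	dev->type = new_type;
	toy_filter_add(dev, toy_mc_map(dev, group));
	return err;
}

/* One event after the switch: the unmap computes the new-type key. */
static int change_type_one_event(struct toy_dev *dev, uint32_t group,
				 int new_type)
{
	int err;

	dev->type = new_type;
	err = toy_filter_del(dev, toy_mc_map(dev, group));
	toy_filter_add(dev, toy_mc_map(dev, group));
	return err;
}
```

In this model the single-event variant only works if the old type is
carried along somewhere, which is the prev_type idea mentioned above.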

> Or.
> 
> 
> 



* igb bandwidth allocation configuration
From: Simon Horman @ 2009-09-10  8:18 UTC (permalink / raw)
  To: e1000-devel, netdev

Hi,

I have been looking into adding support for the 82576's per-PF/VF
bandwidth allocation to the igb driver. It seems that the trickiest
part is working out how to expose things to user-space.

I was thinking along the lines of an ethtool option as follows:

	ethtool --bandwidth ethN LIMIT...

	where:
		* There is one LIMIT per PF/VF.
		  The 82576 can have up to 7 VFs per PF,
		  so there would be up to 8 LIMITS
		* A keyword (none?) can be used to denote that
		  bandwidth allocation should be disabled for the
		  corresponding VM
		* Otherwise LIMITS are in Megabits/s

This may get a bit cumbersome if there are a lot of VFs per PF;
perhaps a better syntax would be:

	ethtool --bandwidth ethN M=LIMIT...

	where:
		* LIMIT is as above
		* M is some key to denote which VF/PF is
		  having its limit set.

Internally, it seems the limits are actually applied to HW Tx queues
rather than directly to VMs. There are 16 such queues. Accordingly it might
be useful to design an interface to set limits per-queue using ethtool.
But this would seem to also require exposing which queues are associated
with which PF/VF.
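
As a rough illustration of the M=LIMIT syntax proposed above, the sketch
below parses a single token. It is not ethtool code and no such option
exists there; toy_parse_limit(), TOY_MAX_FUNCS and TOY_LIMIT_NONE are
hypothetical names, M is assumed to be a numeric index (0 for the PF, 1-7
for the VFs), and "none" is assumed to be the keyword that disables the
limit.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define TOY_MAX_FUNCS	8	/* 1 PF plus up to 7 VFs, as described above */
#define TOY_LIMIT_NONE	0	/* hypothetical encoding for "none" */

/*
 * Parse one "M=LIMIT" token and store the result in limits[M].
 * LIMIT is either "none" or a positive number of Megabits/s.
 * Returns 0 on success or -EINVAL on malformed input.
 */
static int toy_parse_limit(const char *tok,
			   unsigned long limits[TOY_MAX_FUNCS])
{
	char *end;
	unsigned long idx, mbps;

	/* Require a leading digit so signs and whitespace are rejected. */
	if (*tok < '0' || *tok > '9')
		return -EINVAL;
	idx = strtoul(tok, &end, 10);
	if (*end != '=' || idx >= TOY_MAX_FUNCS)
		return -EINVAL;

	tok = end + 1;
	if (strcmp(tok, "none") == 0) {
		limits[idx] = TOY_LIMIT_NONE;
		return 0;
	}

	if (*tok < '0' || *tok > '9')
		return -EINVAL;
	mbps = strtoul(tok, &end, 10);
	if (*end != '\0' || mbps == 0)
		return -EINVAL;

	limits[idx] = mbps;
	return 0;
}
```

A per-queue variant would look the same with M ranging over the 16 Tx
queues, but as noted above that would also require exposing the mapping
between queues and PF/VFs.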



* Re: L2 switching in igb
From: Or Gerlitz @ 2009-09-10  8:04 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: Kirsher, Jeffrey T, Fischer, Anna, netdev, David Miller,
	Stephen Hemminger
In-Reply-To: <5f2db9d90909032135l26cfdba6n52329f6be75c16a5@mail.gmail.com>

Alexander Duyck wrote:
> The suggestion I received from Dave and Stephen was to consider an rtnl_link_ops for
> configuring the VFs, but I still have issues trying to visualize how that would work since I don't want the VFs spawning in the host/hypervisor OS as network devices.
Note that VEPA mode is a characteristic of the PF, correct? And the PF
resides in the host kernel. Also, as I wrote to you earlier, I do see many
schemes where a VF spawned in the host kernel IS very useful, and as
such I'd be happy to continue the discussion on the approach suggested
by Dave and Stephen. Can you provide a pointer? (Thanks.)

Or.



* Re: [PATCH RESEND] bonding: remap muticast addresses without using dev_close() and dev_open()
From: Or Gerlitz @ 2009-09-10  7:58 UTC (permalink / raw)
  To: Moni Shoua
  Cc: Jay Vosburgh, David Miller, Jason Gunthorpe, netdev,
	bonding-devel
In-Reply-To: <4AA39E42.9070702@Voltaire.COM>

Moni Shoua wrote:
> This patch fixes commit e36b9d16c6a6d0f59803b3ef04ff3c22c3844c10. The approach there is to call dev_close()/dev_open() whenever the device type is changed in order to remap the device IP multicast addresses to HW multicast addresses. This approach suffers from 2 drawbacks [...] The fix here is to directly remap the IP multicast addresses to HW multicast addresses for a bonding device that changes its type, and nothing else.

Moni,

The approach and patch look good. First, I think it may be easier
to review and maintain if you separate this into two patches, the first
simply reverting e36b9d16c6a6d0f59803b3ef04ff3c22c3844c10 and the second
the approach suggested by this patch. Second, I think you may be able to
do well with only one event, see below

> @@ -1460,14 +1460,17 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev)
>  	 */
>  	if (bond->slave_cnt == 0) {
>  		if (bond_dev->type != slave_dev->type) {
> -			dev_close(bond_dev);
>  			pr_debug("%s: change device type from %d to %d\n",
>  				bond_dev->name, bond_dev->type, slave_dev->type);
> +
> +			netdev_bonding_change(bond_dev, NETDEV_BONDING_OLDTYPE);
> +
>  			if (slave_dev->type != ARPHRD_ETHER)
>  				bond_setup_by_slave(bond_dev, slave_dev);
>  			else
>  				ether_setup(bond_dev);
> -			dev_open(bond_dev);
> +
> +			netdev_bonding_change(bond_dev, NETDEV_BONDING_NEWTYPE);
>  		}
Can't you achieve the same effect by just calling
netdev_bonding_change(bond_dev, NETDEV_BONDING_NEWTYPE) after doing the
setup_by_slave, and have the stack call ip_mc_unmap(...) and then
ip_mc_map(...)?

Or.




* Re: r8169 ethernet hangs after a pm-suspend (and resume)
From: Alex Bennee @ 2009-09-10  6:49 UTC (permalink / raw)
  To: Francois Romieu; +Cc: lkml, netdev
In-Reply-To: <20090909092822.GA18355@electric-eye.fr.zoreil.com>

2009/9/9 Francois Romieu <romieu@fr.zoreil.com>:
> Alex Bennee <kernel-hacker@bennee.com> :
> [...]
>> I've just recently gotten suspend working on my system. Unfortunately
>> after the resume event I lose access to the network.
>> As far as the system is concerned the network is configured properly
>> but every attempt to ping local nodes fails with "Host not reachable".
>
> Can the problem be described as "gigabit link setting does not survive
> suspend/resume" ?

Further experimentation shows the failure is intermittent. The
following dmesg shows a successful resume with working 'net:

[  475.800017] ACPI: Waking up from system sleep state S3
[  475.800726] HDA Intel 0000:00:1b.0: restoring config space at
offset 0x1 (was 0x100006, writing 0x100002)
[  475.800747] pcieport-driver 0000:00:1c.0: restoring config space at
offset 0xf (was 0x60100, writing 0x6010a)
[  475.800762] pcieport-driver 0000:00:1c.0: restoring config space at
offset 0x1 (was 0x100107, writing 0x100507)
[  475.800799] pci 0000:00:1d.0: restoring config space at offset 0x1
(was 0x2800005, writing 0x2800001)
[  475.800819] pci 0000:00:1d.1: restoring config space at offset 0x1
(was 0x2800005, writing 0x2800001)
[  475.800840] pci 0000:00:1d.2: restoring config space at offset 0x1
(was 0x2800005, writing 0x2800001)
[  475.800861] pci 0000:00:1d.3: restoring config space at offset 0x1
(was 0x2800005, writing 0x2800001)
[  475.800889] pci 0000:00:1d.7: restoring config space at offset 0x1
(was 0x2900006, writing 0x2900002)
[  475.800967] PIIX_IDE 0000:00:1f.1: restoring config space at offset
0x1 (was 0x2880005, writing 0x2800005)
[  475.801050] r8169 0000:02:00.0: restoring config space at offset
0x3 (was 0x4, writing 0x8)
[  475.801056] r8169 0000:02:00.0: restoring config space at offset
0x1 (was 0x100007, writing 0x100407)
[  475.803466] i915 0000:00:02.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
[  475.803470] i915 0000:00:02.0: setting latency timer to 64
[  475.864097] [drm] DAC-6: set mode 1440x900 2a
[  475.936922] [drm] TMDS-8: set mode 1680x1050 2b
[  476.108887] HDA Intel 0000:00:1b.0: PCI INT A -> GSI 19 (level,
low) -> IRQ 19
[  476.108892] HDA Intel 0000:00:1b.0: setting latency timer to 64
[  476.548200] pci 0000:00:1d.7: PME# disabled
[  476.548207] pci 0000:00:1e.0: setting latency timer to 64
[  476.548216] PIIX_IDE 0000:00:1f.1: PCI INT A -> GSI 18 (level, low) -> IRQ 18
[  476.548223] PIIX_IDE 0000:00:1f.1: setting latency timer to 64
[  476.548235] ata_piix 0000:00:1f.2: PCI INT B -> GSI 17 (level, low) -> IRQ 17
[  476.548248] ata_piix 0000:00:1f.2: setting latency timer to 64
[  476.548352] r8169 0000:02:00.0: PME# disabled
[  476.564404] r8169: eth0: link up

And now compare with a return from suspend that failed:

[12397.816024] ACPI: Waking up from system sleep state S3
[12397.816693] agpgart-intel 0000:00:00.0: restoring config space at
offset 0x1 (was 0x30900006, writing 0x20900006)
[12397.816737] HDA Intel 0000:00:1b.0: restoring config space at
offset 0x1 (was 0x100006, writing 0x100002)
[12397.816757] pcieport-driver 0000:00:1c.0: restoring config space at
offset 0xf (was 0x60100, writing 0x6010a)
[12397.816768] pcieport-driver 0000:00:1c.0: restoring config space at
offset 0x7 (was 0x2000e0e0, writing 0xe0e0)
[12397.816776] pcieport-driver 0000:00:1c.0: restoring config space at
offset 0x1 (was 0x100107, writing 0x100507)
[12397.816813] uhci_hcd 0000:00:1d.0: restoring config space at offset
0x1 (was 0x2800005, writing 0x2800001)
[12397.816835] uhci_hcd 0000:00:1d.1: restoring config space at offset
0x1 (was 0x2800005, writing 0x2800001)
[12397.816856] uhci_hcd 0000:00:1d.2: restoring config space at offset
0x1 (was 0x2800005, writing 0x2800001)
[12397.816877] uhci_hcd 0000:00:1d.3: restoring config space at offset
0x1 (was 0x2800005, writing 0x2800001)
[12397.816906] pci 0000:00:1d.7: restoring config space at offset 0x1
(was 0x2900006, writing 0x2900002)
[12397.816929] pci 0000:00:1e.0: restoring config space at offset 0x7
(was 0x2280d0d0, writing 0xa280d0d0)
[12397.816987] PIIX_IDE 0000:00:1f.1: restoring config space at offset
0x1 (was 0x2880005, writing 0x2800005)
[12397.832040] r8169 0000:02:00.0: restoring config space at offset
0xf (was 0xffffffff, writing 0x10a)
[12397.832045] r8169 0000:02:00.0: restoring config space at offset
0xe (was 0xffffffff, writing 0x0)
[12397.832050] r8169 0000:02:00.0: restoring config space at offset
0xd (was 0xffffffff, writing 0x40)
[12397.832055] r8169 0000:02:00.0: restoring config space at offset
0xc (was 0xffffffff, writing 0xdffc0000)
[12397.832061] r8169 0000:02:00.0: restoring config space at offset
0xb (was 0xffffffff, writing 0x81aa1043)
[12397.832066] r8169 0000:02:00.0: restoring config space at offset
0xa (was 0xffffffff, writing 0x0)
[12397.832071] r8169 0000:02:00.0: restoring config space at offset
0x9 (was 0xffffffff, writing 0x0)
[12397.832076] r8169 0000:02:00.0: restoring config space at offset
0x8 (was 0xffffffff, writing 0xdeff000c)
[12397.832081] r8169 0000:02:00.0: restoring config space at offset
0x7 (was 0xffffffff, writing 0x0)
[12397.832086] r8169 0000:02:00.0: restoring config space at offset
0x6 (was 0xffffffff, writing 0xdffff004)
[12397.832091] r8169 0000:02:00.0: restoring config space at offset
0x5 (was 0xffffffff, writing 0x0)
[12397.832096] r8169 0000:02:00.0: restoring config space at offset
0x4 (was 0xffffffff, writing 0xe801)
[12397.832101] r8169 0000:02:00.0: restoring config space at offset
0x3 (was 0xffffffff, writing 0x8)
[12397.832106] r8169 0000:02:00.0: restoring config space at offset
0x2 (was 0xffffffff, writing 0x2000002)
[12397.832111] r8169 0000:02:00.0: restoring config space at offset
0x1 (was 0xffffffff, writing 0x100407)
[12397.832117] r8169 0000:02:00.0: restoring config space at offset
0x0 (was 0xffffffff, writing 0x816810ec)
[12397.834527] i915 0000:00:02.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
[12397.834531] i915 0000:00:02.0: setting latency timer to 64
[12397.895209] [drm] DAC-6: set mode 1440x900 2a
[12397.968038] [drm] TMDS-8: set mode 1680x1050 2b
[12398.140006] HDA Intel 0000:00:1b.0: PCI INT A -> GSI 19 (level,
low) -> IRQ 19
[12398.140011] HDA Intel 0000:00:1b.0: setting latency timer to 64
[12398.580194] uhci_hcd 0000:00:1d.0: PCI INT A -> GSI 20 (level, low) -> IRQ 20
[12398.580200] uhci_hcd 0000:00:1d.0: setting latency timer to 64
[12398.580224] usb usb2: root hub lost power or was reset
[12398.580250] uhci_hcd 0000:00:1d.1: PCI INT B -> GSI 17 (level, low) -> IRQ 17
[12398.580255] uhci_hcd 0000:00:1d.1: setting latency timer to 64
[12398.580273] usb usb3: root hub lost power or was reset
[12398.580291] uhci_hcd 0000:00:1d.2: PCI INT C -> GSI 18 (level, low) -> IRQ 18
[12398.580296] uhci_hcd 0000:00:1d.2: setting latency timer to 64
[12398.580314] usb usb4: root hub lost power or was reset
[12398.580332] uhci_hcd 0000:00:1d.3: PCI INT D -> GSI 19 (level, low) -> IRQ 19
[12398.580337] uhci_hcd 0000:00:1d.3: setting latency timer to 64
[12398.580355] usb usb5: root hub lost power or was reset
[12398.580374] pci 0000:00:1d.7: PME# disabled
[12398.580380] pci 0000:00:1e.0: setting latency timer to 64
[12398.580387] PIIX_IDE 0000:00:1f.1: PCI INT A -> GSI 18 (level, low) -> IRQ 18
[12398.580394] PIIX_IDE 0000:00:1f.1: setting latency timer to 64
[12398.580403] ata_piix 0000:00:1f.2: PCI INT B -> GSI 17 (level, low) -> IRQ 17
[12398.580407] ata_piix 0000:00:1f.2: setting latency timer to 64
[12398.580512] r8169 0000:02:00.0: PME# disabled
[12398.660050] firewire_core: skipped bus generations, destroying all nodes
[12398.664833] hda: host max PIO4 wanted PIO255(auto-tune) selected PIO4
[12398.665625] hda: skipping word 93 validity check
[12398.665627] hda: UDMA/66 mode selected
[12398.687404] sd 0:0:0:0: [sda] Starting disk
[12399.419164] r8169: eth0: link up

which has an oops further on:

[12434.816100] ------------[ cut here ]------------
[12434.816111] WARNING: at net/sched/sch_generic.c:246
dev_watchdog+0x132/0x1da()
[12434.816114] Hardware name: System Product Name
[12434.816117] NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out
[12434.816120] Modules linked in: bridge stp llc bnep rfcomm l2cap
bluetooth ipv6 snd_pcm_oss snd_mixer_oss snd_seq_oss
snd_seq_midi_event snd_seq snd_seq_device kvm_intel kvm acpi_cpufreq
snd_hda_codec_analog snd_hda_intel uhci_hcd snd_hda_codec snd_hwdep
snd_pcm snd_timer ide_cd_mod firewire_ohci firewire_core snd soundcore
usbcore r8169 cdrom processor crc_itu_t nls_base snd_page_alloc mii
evdev thermal pcspkr unix [last unloaded: ehci_hcd]
[12434.816164] Pid: 0, comm: swapper Not tainted
2.6.31-rc9-ajb-00012-g3ff323f-dirty #86
[12434.816167] Call Trace:
[12434.816169]  <IRQ>  [<ffffffff812aa117>] ? dev_watchdog+0x132/0x1da
[12434.816180]  [<ffffffff8103eb72>] warn_slowpath_common+0x7c/0xa9
[12434.816185]  [<ffffffff8103ec1e>] warn_slowpath_fmt+0x69/0x6b
[12434.816190]  [<ffffffff81039e47>] ? default_wake_function+0x12/0x14
[12434.816195]  [<ffffffff8102c24c>] ? __wake_up_common+0x4b/0x7b
[12434.816200]  [<ffffffff8102f793>] ? __wake_up+0x48/0x54
[12434.816205]  [<ffffffff81298b7d>] ? netdev_drivername+0x48/0x4f
[12434.816209]  [<ffffffff812aa117>] dev_watchdog+0x132/0x1da
[12434.816214]  [<ffffffff810510f2>] ? __queue_work+0x3a/0x43
[12434.816218]  [<ffffffff812a9fe5>] ? dev_watchdog+0x0/0x1da
[12434.816223]  [<ffffffff81048d76>] run_timer_softirq+0x198/0x20d
[12434.816229]  [<ffffffff8101d0c6>] ? lapic_next_event+0x1d/0x21
[12434.816234]  [<ffffffff8104464f>] __do_softirq+0xd6/0x19a
[12434.816239]  [<ffffffff8100c19c>] call_softirq+0x1c/0x28
[12434.816242]  [<ffffffff8100d51d>] do_softirq+0x39/0x77
[12434.816246]  [<ffffffff8104430c>] irq_exit+0x44/0x7e
[12434.816252]  [<ffffffff81305914>] smp_apic_timer_interrupt+0x8d/0x9b
[12434.816258]  [<ffffffff8100bb73>] apic_timer_interrupt+0x13/0x20
[12434.816260]  <EOI>  [<ffffffff810117ac>] ? mwait_idle+0xb9/0xf0
[12434.816269]  [<ffffffff81303df5>] ? atomic_notifier_call_chain+0x13/0x15
[12434.816273]  [<ffffffff8100a30a>] ? cpu_idle+0x57/0x98
[12434.816278]  [<ffffffff812f0612>] ? rest_init+0x66/0x68
[12434.816283]  [<ffffffff815299da>] ? start_kernel+0x343/0x34e
[12434.816288]  [<ffffffff8152903a>] ? x86_64_start_reservations+0xaa/0xae
[12434.816292]  [<ffffffff8152911f>] ? x86_64_start_kernel+0xe1/0xe8
[12434.816295] ---[ end trace 1353478188007667 ]---
[12435.635167] r8169: eth0: link up

At this point even unloading and reloading the r8169 module couldn't
bring the network back. I even tried unloading the module, doing a
pm-hibernate and restore, then reloading it, and still nothing, which
was odd as I thought the power cycle should have un-wedged any hardware.

A couple of questions:

1. It seems the failure case has a lot more "restoring config space"
going on. Is this a wider range problem that just happens to hit r8169
harder?

2. Is the oops a red herring or could the failure to resume be because
the shutdown occurs before the hardware has flushed all in flight
packets?
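
One way to read the two logs above: in the failing resume every r8169
config dword was read back as 0xffffffff before being restored, and an
all-ones read is what PCI config space normally returns when no device
answers. The helpers below are a small sketch for decoding such values;
the function names are hypothetical, and whether this explains the hang
is not established here. The "offset" in these messages appears to count
32-bit words, so offset 0x0 is the vendor/device ID dword.

```c
#include <stdint.h>

/* An all-ones dword usually means no device answered the config read. */
static int cfg_read_failed(uint32_t dword)
{
	return dword == 0xffffffffu;
}

/* Config dword 0: vendor ID in the low 16 bits, device ID in the high 16. */
static uint16_t cfg_vendor_id(uint32_t dword0)
{
	return (uint16_t)(dword0 & 0xffff);
}

static uint16_t cfg_device_id(uint32_t dword0)
{
	return (uint16_t)(dword0 >> 16);
}
```

Applied to the value written at offset 0x0 in the failing log, 0x816810ec
decodes to vendor 0x10ec (Realtek) and device 0x8168, so the saved state
being written back looks sane even though the reads before it did not.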


-- 
Alex, homepage: http://www.bennee.com/~alex/
http://www.half-llama.co.uk


* Re: [iproute2] tc action mirred    question
From: Xiaofei Wu @ 2009-09-10  6:06 UTC (permalink / raw)
  To: hadi; +Cc: linux netdev
In-Reply-To: <1252534266.4119.5.camel@dogo.mojatatu.com>

>> After run 'tcpdump -i wlan1 -e', I can not capture any packets.

>Could it be related to the wireless driver?
Maybe. I will check it.

>Here's something i tried on my laptop
....
>

I tried your example.

-on window1  'ping 127.0.0.2'
....
2616 packets transmitted, 0 received, 100% packet loss

-on window2  'tcpdump -n -i eth0 -e' , i see
....
10:15:06.314420 00:23:cd:af:d0:74 > 00:23:cd:af:ec:da, ethertype IPv4 (0x0800), length 98: 127.0.0.2 > 127.0.0.2: ICMP echo request, id 17419, seq 234, length 64
....

-on window3  'tcpdump -i lo -e'
....
10:15:37.332527 00:23:cd:af:d0:74 (oui Unknown) > 00:23:cd:af:ec:da (oui Unknown), ethertype IPv4 (0x0800), length 98: 127.0.0.2 > 127.0.0.2: ICMP echo request, id 17419, seq 265, length 64
....

It seems that I modify the dst MAC and src MAC of the packets, then transmit them to 'lo' and mirror the packets to 'eth0'.  (On 'lo': '2616 packets transmitted, 0 received, 100% packet loss'.)  How can I let 'lo' receive the packets?

But I only want to modify the dst MAC and src MAC of the mirrored packets and transmit them to the next hop (without modifying the dst/src MAC of the packets sent to 'lo').  What should I do?

When I change 'lo' to 'eth1' (or wlan1 ...), node A will have two paths (A-B-C, A-D-C) over which to transmit the "same" (IP header, data) packets to node C simultaneously.

regards,
wu


* Re: [PATCH 00/12] Gigaset driver patches for 2.6.32
From: David Miller @ 2009-09-10  3:51 UTC (permalink / raw)
  To: dwalker; +Cc: tilman, linux-kernel, netdev, i4ldeveloper, hjlipp
In-Reply-To: <1252554477.30578.167.camel@desktop>

From: Daniel Walker <dwalker@fifo99.com>
Date: Wed, 09 Sep 2009 20:47:57 -0700

> On Thu, 2009-09-10 at 00:32 +0200, Tilman Schmidt wrote:
>> Daniel Walker wrote 07.09.09 16:30:
>> > Yeah, it looks like the whole file needs a checkpatch clean up.. Sounds
>> > like you're not willing to do that?
>> 
>> It's not a question of willingness. You may notice I did a lot of
>> cleanup work already. But it's very time consuming work, and there has
>> been more important work to attend to first.
>> 
>> > Usually if a checkpatch cleanup comes first, prior to all your other
>> > changes, it doesn't usually cloud the rest of the changes..
>> 
>> Sure. But that would mean postponing the merging of bugfixes until
>> someone finds the time to do a complete checkpatch cleanup of the
>> affected code. I don't think that's a sensible approach.
> 
> You shouldn't be adding any new checkpatch errors, but you currently
> are .. Just clean up the individual patches w/o the entire gigaset
> driver, that should be do-able (it's even a basic submission
> requirement). The other issue is that you're adding new files which aren't
> clean, those can certainly be cleaned up.

Right, this is a very reasonable request.


* Re: [PATCH 00/12] Gigaset driver patches for 2.6.32
From: Daniel Walker @ 2009-09-10  3:47 UTC (permalink / raw)
  To: Tilman Schmidt; +Cc: davem, linux-kernel, netdev, i4ldeveloper, Hansjoerg Lipp
In-Reply-To: <20090909223205.E9D632269516@fifo99.com>

On Thu, 2009-09-10 at 00:32 +0200, Tilman Schmidt wrote:
> Daniel Walker wrote 07.09.09 16:30:
> > Yeah, it looks like the whole file needs a checkpatch clean up.. Sounds
> > like you're not willing to do that?
> 
> It's not a question of willingness. You may notice I did a lot of
> cleanup work already. But it's very time consuming work, and there has
> been more important work to attend to first.
> 
> > Usually if a checkpatch cleanup comes first, prior to all your other
> > changes, it doesn't usually cloud the rest of the changes..
> 
> Sure. But that would mean postponing the merging of bugfixes until
> someone finds the time to do a complete checkpatch cleanup of the
> affected code. I don't think that's a sensible approach.

You shouldn't be adding any new checkpatch errors, but you currently
are .. Just clean up the individual patches w/o the entire gigaset
driver, that should be do-able (it's even a basic submission
requirement). The other issue is that you're adding new files which aren't
clean, those can certainly be cleaned up.

Daniel

^ permalink raw reply

* [PATCH 3/3] ucc_geth: Fix hangs after switching from full to half duplex
From: Anton Vorontsov @ 2009-09-10  2:01 UTC (permalink / raw)
  To: David Miller
  Cc: Andy Fleming, Timur Tabi, Li Yang, Kumar Gala, netdev,
	linuxppc-dev

MPC8360 QE UCC ethernet controllers hang when changing link duplex
under a load (a bit of NFS activity is enough).

  PHY: mdio@e0102120:00 - Link is Up - 1000/Full
  sh-3.00# ethtool -s eth0 speed 100 duplex half autoneg off
  PHY: mdio@e0102120:00 - Link is Down
  PHY: mdio@e0102120:00 - Link is Up - 100/Half
  NETDEV WATCHDOG: eth0 (ucc_geth): transmit queue 0 timed out
  ------------[ cut here ]------------
  Badness at c01fcbd0 [verbose debug info unavailable]
  NIP: c01fcbd0 LR: c01fcbd0 CTR: c0194e44
  ...

The cure is to disable the controller before changing speed/duplex
and enable it afterwards.

Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
---
 drivers/net/ucc_geth.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ucc_geth.c b/drivers/net/ucc_geth.c
index 2a2c973..9ad9015 100644
--- a/drivers/net/ucc_geth.c
+++ b/drivers/net/ucc_geth.c
@@ -1631,9 +1631,13 @@ static void adjust_link(struct net_device *dev)
 			ugeth->oldspeed = phydev->speed;
 		}
 
+		ugeth_disable(ugeth, COMM_DIR_RX_AND_TX);
+
 		out_be32(&ug_regs->maccfg2, tempval);
 		out_be32(&uf_regs->upsmr, upsmr);
 
+		ugeth_enable(ugeth, COMM_DIR_RX_AND_TX);
+
 		if (!ugeth->oldlink) {
 			new_state = 1;
 			ugeth->oldlink = 1;
-- 
1.6.3.3
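
For readers following along, the ordering this patch enforces can be
sketched in plain userspace C. This is only an illustration under stated
assumptions: struct mock_mac and the mock_* helpers are hypothetical
stand-ins, not the ucc_geth structures or the QE API, and the counter
exists only to show that no configuration write lands while the MAC is
running.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a MAC; not the real ucc_geth structures. */
struct mock_mac {
	int enabled;		/* 1 while Rx/Tx are running */
	uint32_t maccfg2;	/* speed/duplex configuration word */
	int unsafe_writes;	/* config writes made while enabled */
};

static void mock_disable(struct mock_mac *mac)
{
	mac->enabled = 0;
}

static void mock_enable(struct mock_mac *mac)
{
	mac->enabled = 1;
}

static void mock_write_cfg(struct mock_mac *mac, uint32_t val)
{
	/* Count writes that would hit a running controller. */
	if (mac->enabled)
		mac->unsafe_writes++;
	mac->maccfg2 = val;
}

/* Quiesce, reprogram, resume: the same order as in adjust_link(). */
static void mock_set_link_cfg(struct mock_mac *mac, uint32_t val)
{
	assert(mac != NULL);

	mock_disable(mac);
	mock_write_cfg(mac, val);
	mock_enable(mac);
}
```

Calling mock_set_link_cfg() on an enabled mock leaves unsafe_writes at
zero and the controller enabled again afterwards.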

^ permalink raw reply related

* [PATCH 2/3] ucc_geth: Rearrange some code to avoid forward declarations
From: Anton Vorontsov @ 2009-09-10  2:01 UTC (permalink / raw)
  To: David Miller
  Cc: Andy Fleming, Timur Tabi, Li Yang, Kumar Gala, netdev,
	linuxppc-dev

We'll need ugeth_disable() and ugeth_enable() calls earlier in the
file, so rearrange some code to avoid forward declarations.

The patch doesn't contain any functional changes.

Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
---
 drivers/net/ucc_geth.c |  300 ++++++++++++++++++++++++------------------------
 1 files changed, 149 insertions(+), 151 deletions(-)

diff --git a/drivers/net/ucc_geth.c b/drivers/net/ucc_geth.c
index 33ed69e..2a2c973 100644
--- a/drivers/net/ucc_geth.c
+++ b/drivers/net/ucc_geth.c
@@ -1411,6 +1411,155 @@ static int adjust_enet_interface(struct ucc_geth_private *ugeth)
 	return 0;
 }
 
+static int ugeth_graceful_stop_tx(struct ucc_geth_private *ugeth)
+{
+	struct ucc_fast_private *uccf;
+	u32 cecr_subblock;
+	u32 temp;
+	int i = 10;
+
+	uccf = ugeth->uccf;
+
+	/* Mask GRACEFUL STOP TX interrupt bit and clear it */
+	clrbits32(uccf->p_uccm, UCC_GETH_UCCE_GRA);
+	out_be32(uccf->p_ucce, UCC_GETH_UCCE_GRA);  /* clear by writing 1 */
+
+	/* Issue host command */
+	cecr_subblock =
+	    ucc_fast_get_qe_cr_subblock(ugeth->ug_info->uf_info.ucc_num);
+	qe_issue_cmd(QE_GRACEFUL_STOP_TX, cecr_subblock,
+		     QE_CR_PROTOCOL_ETHERNET, 0);
+
+	/* Wait for command to complete */
+	do {
+		msleep(10);
+		temp = in_be32(uccf->p_ucce);
+	} while (!(temp & UCC_GETH_UCCE_GRA) && --i);
+
+	uccf->stopped_tx = 1;
+
+	return 0;
+}
+
+static int ugeth_graceful_stop_rx(struct ucc_geth_private *ugeth)
+{
+	struct ucc_fast_private *uccf;
+	u32 cecr_subblock;
+	u8 temp;
+	int i = 10;
+
+	uccf = ugeth->uccf;
+
+	/* Clear acknowledge bit */
+	temp = in_8(&ugeth->p_rx_glbl_pram->rxgstpack);
+	temp &= ~GRACEFUL_STOP_ACKNOWLEDGE_RX;
+	out_8(&ugeth->p_rx_glbl_pram->rxgstpack, temp);
+
+	/* Keep issuing command and checking acknowledge bit until
+	it is asserted, according to spec */
+	do {
+		/* Issue host command */
+		cecr_subblock =
+		    ucc_fast_get_qe_cr_subblock(ugeth->ug_info->uf_info.
+						ucc_num);
+		qe_issue_cmd(QE_GRACEFUL_STOP_RX, cecr_subblock,
+			     QE_CR_PROTOCOL_ETHERNET, 0);
+		msleep(10);
+		temp = in_8(&ugeth->p_rx_glbl_pram->rxgstpack);
+	} while (!(temp & GRACEFUL_STOP_ACKNOWLEDGE_RX) && --i);
+
+	uccf->stopped_rx = 1;
+
+	return 0;
+}
+
+static int ugeth_restart_tx(struct ucc_geth_private *ugeth)
+{
+	struct ucc_fast_private *uccf;
+	u32 cecr_subblock;
+
+	uccf = ugeth->uccf;
+
+	cecr_subblock =
+	    ucc_fast_get_qe_cr_subblock(ugeth->ug_info->uf_info.ucc_num);
+	qe_issue_cmd(QE_RESTART_TX, cecr_subblock, QE_CR_PROTOCOL_ETHERNET, 0);
+	uccf->stopped_tx = 0;
+
+	return 0;
+}
+
+static int ugeth_restart_rx(struct ucc_geth_private *ugeth)
+{
+	struct ucc_fast_private *uccf;
+	u32 cecr_subblock;
+
+	uccf = ugeth->uccf;
+
+	cecr_subblock =
+	    ucc_fast_get_qe_cr_subblock(ugeth->ug_info->uf_info.ucc_num);
+	qe_issue_cmd(QE_RESTART_RX, cecr_subblock, QE_CR_PROTOCOL_ETHERNET,
+		     0);
+	uccf->stopped_rx = 0;
+
+	return 0;
+}
+
+static int ugeth_enable(struct ucc_geth_private *ugeth, enum comm_dir mode)
+{
+	struct ucc_fast_private *uccf;
+	int enabled_tx, enabled_rx;
+
+	uccf = ugeth->uccf;
+
+	/* check if the UCC number is in range. */
+	if (ugeth->ug_info->uf_info.ucc_num >= UCC_MAX_NUM) {
+		if (netif_msg_probe(ugeth))
+			ugeth_err("%s: ucc_num out of range.", __func__);
+		return -EINVAL;
+	}
+
+	enabled_tx = uccf->enabled_tx;
+	enabled_rx = uccf->enabled_rx;
+
+	/* Get Tx and Rx going again, in case this channel was actively
+	disabled. */
+	if ((mode & COMM_DIR_TX) && (!enabled_tx) && uccf->stopped_tx)
+		ugeth_restart_tx(ugeth);
+	if ((mode & COMM_DIR_RX) && (!enabled_rx) && uccf->stopped_rx)
+		ugeth_restart_rx(ugeth);
+
+	ucc_fast_enable(uccf, mode);	/* OK to do even if not disabled */
+
+	return 0;
+
+}
+
+static int ugeth_disable(struct ucc_geth_private *ugeth, enum comm_dir mode)
+{
+	struct ucc_fast_private *uccf;
+
+	uccf = ugeth->uccf;
+
+	/* check if the UCC number is in range. */
+	if (ugeth->ug_info->uf_info.ucc_num >= UCC_MAX_NUM) {
+		if (netif_msg_probe(ugeth))
+			ugeth_err("%s: ucc_num out of range.", __func__);
+		return -EINVAL;
+	}
+
+	/* Stop any transmissions */
+	if ((mode & COMM_DIR_TX) && uccf->enabled_tx && !uccf->stopped_tx)
+		ugeth_graceful_stop_tx(ugeth);
+
+	/* Stop any receptions */
+	if ((mode & COMM_DIR_RX) && uccf->enabled_rx && !uccf->stopped_rx)
+		ugeth_graceful_stop_rx(ugeth);
+
+	ucc_fast_disable(ugeth->uccf, mode); /* OK to do even if not enabled */
+
+	return 0;
+}
+
 /* Called every time the controller might need to be made
  * aware of new link state.  The PHY code conveys this
  * information through variables in the ugeth structure, and this
@@ -1586,157 +1735,6 @@ static int init_phy(struct net_device *dev)
 	return 0;
 }
 
-
-
-static int ugeth_graceful_stop_tx(struct ucc_geth_private *ugeth)
-{
-	struct ucc_fast_private *uccf;
-	u32 cecr_subblock;
-	u32 temp;
-	int i = 10;
-
-	uccf = ugeth->uccf;
-
-	/* Mask GRACEFUL STOP TX interrupt bit and clear it */
-	clrbits32(uccf->p_uccm, UCC_GETH_UCCE_GRA);
-	out_be32(uccf->p_ucce, UCC_GETH_UCCE_GRA);  /* clear by writing 1 */
-
-	/* Issue host command */
-	cecr_subblock =
-	    ucc_fast_get_qe_cr_subblock(ugeth->ug_info->uf_info.ucc_num);
-	qe_issue_cmd(QE_GRACEFUL_STOP_TX, cecr_subblock,
-		     QE_CR_PROTOCOL_ETHERNET, 0);
-
-	/* Wait for command to complete */
-	do {
-		msleep(10);
-		temp = in_be32(uccf->p_ucce);
-	} while (!(temp & UCC_GETH_UCCE_GRA) && --i);
-
-	uccf->stopped_tx = 1;
-
-	return 0;
-}
-
-static int ugeth_graceful_stop_rx(struct ucc_geth_private * ugeth)
-{
-	struct ucc_fast_private *uccf;
-	u32 cecr_subblock;
-	u8 temp;
-	int i = 10;
-
-	uccf = ugeth->uccf;
-
-	/* Clear acknowledge bit */
-	temp = in_8(&ugeth->p_rx_glbl_pram->rxgstpack);
-	temp &= ~GRACEFUL_STOP_ACKNOWLEDGE_RX;
-	out_8(&ugeth->p_rx_glbl_pram->rxgstpack, temp);
-
-	/* Keep issuing command and checking acknowledge bit until
-	it is asserted, according to spec */
-	do {
-		/* Issue host command */
-		cecr_subblock =
-		    ucc_fast_get_qe_cr_subblock(ugeth->ug_info->uf_info.
-						ucc_num);
-		qe_issue_cmd(QE_GRACEFUL_STOP_RX, cecr_subblock,
-			     QE_CR_PROTOCOL_ETHERNET, 0);
-		msleep(10);
-		temp = in_8(&ugeth->p_rx_glbl_pram->rxgstpack);
-	} while (!(temp & GRACEFUL_STOP_ACKNOWLEDGE_RX) && --i);
-
-	uccf->stopped_rx = 1;
-
-	return 0;
-}
-
-static int ugeth_restart_tx(struct ucc_geth_private *ugeth)
-{
-	struct ucc_fast_private *uccf;
-	u32 cecr_subblock;
-
-	uccf = ugeth->uccf;
-
-	cecr_subblock =
-	    ucc_fast_get_qe_cr_subblock(ugeth->ug_info->uf_info.ucc_num);
-	qe_issue_cmd(QE_RESTART_TX, cecr_subblock, QE_CR_PROTOCOL_ETHERNET, 0);
-	uccf->stopped_tx = 0;
-
-	return 0;
-}
-
-static int ugeth_restart_rx(struct ucc_geth_private *ugeth)
-{
-	struct ucc_fast_private *uccf;
-	u32 cecr_subblock;
-
-	uccf = ugeth->uccf;
-
-	cecr_subblock =
-	    ucc_fast_get_qe_cr_subblock(ugeth->ug_info->uf_info.ucc_num);
-	qe_issue_cmd(QE_RESTART_RX, cecr_subblock, QE_CR_PROTOCOL_ETHERNET,
-		     0);
-	uccf->stopped_rx = 0;
-
-	return 0;
-}
-
-static int ugeth_enable(struct ucc_geth_private *ugeth, enum comm_dir mode)
-{
-	struct ucc_fast_private *uccf;
-	int enabled_tx, enabled_rx;
-
-	uccf = ugeth->uccf;
-
-	/* check if the UCC number is in range. */
-	if (ugeth->ug_info->uf_info.ucc_num >= UCC_MAX_NUM) {
-		if (netif_msg_probe(ugeth))
-			ugeth_err("%s: ucc_num out of range.", __func__);
-		return -EINVAL;
-	}
-
-	enabled_tx = uccf->enabled_tx;
-	enabled_rx = uccf->enabled_rx;
-
-	/* Get Tx and Rx going again, in case this channel was actively
-	disabled. */
-	if ((mode & COMM_DIR_TX) && (!enabled_tx) && uccf->stopped_tx)
-		ugeth_restart_tx(ugeth);
-	if ((mode & COMM_DIR_RX) && (!enabled_rx) && uccf->stopped_rx)
-		ugeth_restart_rx(ugeth);
-
-	ucc_fast_enable(uccf, mode);	/* OK to do even if not disabled */
-
-	return 0;
-
-}
-
-static int ugeth_disable(struct ucc_geth_private * ugeth, enum comm_dir mode)
-{
-	struct ucc_fast_private *uccf;
-
-	uccf = ugeth->uccf;
-
-	/* check if the UCC number is in range. */
-	if (ugeth->ug_info->uf_info.ucc_num >= UCC_MAX_NUM) {
-		if (netif_msg_probe(ugeth))
-			ugeth_err("%s: ucc_num out of range.", __func__);
-		return -EINVAL;
-	}
-
-	/* Stop any transmissions */
-	if ((mode & COMM_DIR_TX) && uccf->enabled_tx && !uccf->stopped_tx)
-		ugeth_graceful_stop_tx(ugeth);
-
-	/* Stop any receptions */
-	if ((mode & COMM_DIR_RX) && uccf->enabled_rx && !uccf->stopped_rx)
-		ugeth_graceful_stop_rx(ugeth);
-
-	ucc_fast_disable(ugeth->uccf, mode); /* OK to do even if not enabled */
-
-	return 0;
-}
-
 static void ugeth_dump_regs(struct ucc_geth_private *ugeth)
 {
 #ifdef DEBUG
-- 
1.6.3.3
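
The graceful-stop helpers moved by this patch poll a status bit with a
bounded retry count and return 0 whether or not the bit was seen. A
minimal userspace sketch of the same bounded-poll idea follows, with one
deliberate difference: it reports a timeout. The names poll_bit_set,
read_status_fn and struct mock_reg are hypothetical, and the sleep
between reads (msleep(10) in the driver) is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status source: returns the current event register value. */
typedef uint32_t (*read_status_fn)(void *ctx);

/*
 * Read the status up to max_polls times and stop as soon as
 * (status & mask) is nonzero. Returns 0 if the bit was seen,
 * -ETIMEDOUT otherwise. No delay between reads in this sketch.
 */
static int poll_bit_set(read_status_fn read_status, void *ctx,
			uint32_t mask, int max_polls)
{
	assert(read_status != NULL && max_polls > 0);

	while (max_polls--) {
		if (read_status(ctx) & mask)
			return 0;
	}
	return -ETIMEDOUT;
}

/* Mock register: bit 0 reads as set from the set_after-th read onwards. */
struct mock_reg {
	int reads;
	int set_after;
};

static uint32_t mock_read(void *ctx)
{
	struct mock_reg *r = ctx;

	return (++r->reads >= r->set_after) ? 0x1u : 0x0u;
}
```

Whether the driver should propagate such a timeout is a separate
question; the patch itself makes no functional change.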


^ permalink raw reply related

* [PATCH 1/3] phy/marvell: Make non-aneg speed/duplex forcing work for 88E1111 PHYs
From: Anton Vorontsov @ 2009-09-10  2:01 UTC (permalink / raw)
  To: David Miller
  Cc: Andy Fleming, Timur Tabi, Li Yang, Kumar Gala, netdev,
	linuxppc-dev

According to specs, when auto-negotiation is disabled, Marvell PHYs need
a software reset after changing speed/duplex forcing bits. Otherwise,
the modified bits have no effect.

Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
---
 drivers/net/phy/marvell.c |   21 ++++++++++++++++++++-
 1 files changed, 20 insertions(+), 1 deletions(-)

diff --git a/drivers/net/phy/marvell.c b/drivers/net/phy/marvell.c
index dd6f54d..6f69b9b 100644
--- a/drivers/net/phy/marvell.c
+++ b/drivers/net/phy/marvell.c
@@ -155,8 +155,27 @@ static int marvell_config_aneg(struct phy_device *phydev)
 		return err;
 
 	err = genphy_config_aneg(phydev);
+	if (err < 0)
+		return err;
 
-	return err;
+	if (phydev->autoneg != AUTONEG_ENABLE) {
+		int bmcr;
+
+		/*
+		 * A write to speed/duplex bits (that is performed by
+		 * genphy_config_aneg() call above) must be followed by
+		 * a software reset. Otherwise, the write has no effect.
+		 */
+		bmcr = phy_read(phydev, MII_BMCR);
+		if (bmcr < 0)
+			return bmcr;
+
+		err = phy_write(phydev, MII_BMCR, bmcr | BMCR_RESET);
+		if (err < 0)
+			return err;
+	}
+
+	return 0;
 }
 
 static int m88e1121_config_aneg(struct phy_device *phydev)
-- 
1.6.3.3
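
The read-modify-write that the patch adds can be tried outside the
kernel with a mock register file. This is a sketch only: struct mock_phy
and the mock_* functions are hypothetical and do not model a real PHY
(in particular, real hardware clears the reset bit by itself once the
reset completes). The register number and bit values are the ones from
linux/mii.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Register number and bits as defined in linux/mii.h. */
#define MII_BMCR	0x00
#define BMCR_RESET	0x8000
#define BMCR_SPEED100	0x2000

/* Hypothetical PHY with a 32-entry register file; not struct phy_device. */
struct mock_phy {
	uint16_t regs[32];
	int autoneg;		/* nonzero when auto-negotiation is enabled */
};

static int mock_phy_read(struct mock_phy *phy, int reg)
{
	if (reg < 0 || reg >= 32)
		return -1;
	return phy->regs[reg];
}

static int mock_phy_write(struct mock_phy *phy, int reg, uint16_t val)
{
	if (reg < 0 || reg >= 32)
		return -1;
	phy->regs[reg] = val;
	return 0;
}

/* After forcing speed/duplex, set BMCR_RESET so the new bits take effect. */
static int mock_commit_forced_mode(struct mock_phy *phy)
{
	int bmcr;

	assert(phy != NULL);

	if (phy->autoneg)
		return 0;	/* nothing extra to do with autoneg on */

	bmcr = mock_phy_read(phy, MII_BMCR);
	if (bmcr < 0)
		return bmcr;

	return mock_phy_write(phy, MII_BMCR, (uint16_t)(bmcr | BMCR_RESET));
}
```

With autoneg off the BMCR value gains BMCR_RESET and keeps the forced
speed bit; with autoneg on the register is left untouched.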


^ permalink raw reply related

* Re: [PATCH] dm9000: Remove unnecessary memset of netdev private data
From: David Miller @ 2009-09-10  1:55 UTC (permalink / raw)
  To: tklauser; +Cc: netdev
In-Reply-To: <1252494464-4633-2-git-send-email-tklauser@distanz.ch>

From: Tobias Klauser <tklauser@distanz.ch>
Date: Wed,  9 Sep 2009 13:07:44 +0200

> The memory for the private data is allocated using kzalloc in
> alloc_etherdev (or alloc_netdev_mq respectively) so there is no need to
> set it to 0 again.
> 
> Signed-off-by: Tobias Klauser <tklauser@distanz.ch>

Applied.
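
As a userspace analogy for why the memset was redundant: calloc(), like
kzalloc() in the kernel, hands back zero-filled memory, so clearing it
again changes nothing. The struct and function names below are
hypothetical and unrelated to the dm9000 private data layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical private data; not the dm9000 board_info structure. */
struct mock_priv {
	int irq;
	unsigned char mac[6];
	void *io_addr;
};

/* calloc returns zero-filled memory, so no memset is needed afterwards. */
static struct mock_priv *mock_alloc_priv(void)
{
	return calloc(1, sizeof(struct mock_priv));
}

/* Returns 1 if every field still holds its zero value. */
static int mock_priv_is_zeroed(const struct mock_priv *p)
{
	size_t i;

	assert(p != NULL);

	if (p->irq != 0 || p->io_addr != NULL)
		return 0;
	for (i = 0; i < sizeof(p->mac); i++) {
		if (p->mac[i] != 0)
			return 0;
	}
	return 1;
}
```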

^ permalink raw reply

* Re: [PATCH] dm9000: Use resource_size instead of private macro
From: David Miller @ 2009-09-10  1:55 UTC (permalink / raw)
  To: tklauser; +Cc: netdev
In-Reply-To: <1252494464-4633-1-git-send-email-tklauser@distanz.ch>

From: Tobias Klauser <tklauser@distanz.ch>
Date: Wed,  9 Sep 2009 13:07:43 +0200

> The macro res_size in drivers/net/dm9000.c is a copy of resource_size in
> linux/ioport.h. Remove the macro and use resource_size instead.
> 
> Signed-off-by: Tobias Klauser <tklauser@distanz.ch>

Applied.
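
For reference, the computation that both the private macro and
resource_size() perform is the size of an inclusive address range. A
standalone sketch, assuming a simplified struct (mock_resource and
mock_resource_size are hypothetical names, not the kernel definitions):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for struct resource; end is inclusive. */
struct mock_resource {
	uint64_t start;
	uint64_t end;
};

/* Size of the inclusive range [start, end], hence the "+ 1". */
static uint64_t mock_resource_size(const struct mock_resource *res)
{
	assert(res != NULL && res->end >= res->start);

	return res->end - res->start + 1;
}
```

A region from 0x1000 to 0x1003 therefore has a size of 4, not 3, which
is the off-by-one that open-coded copies tend to get wrong.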

^ permalink raw reply

* Re: [PATCH] [NIU] VLAN does not work with niu driver
From: Matheos Worku @ 2009-09-10  1:48 UTC (permalink / raw)
  To: David Miller; +Cc: Joyce.Yu, netdev
In-Reply-To: <20090909.184410.64382338.davem@davemloft.net>



David Miller wrote:
> From: Matheos Worku <Matheos.Worku@Sun.COM>
> Date: Wed, 09 Sep 2009 18:19:00 -0700
> 
>> We can work on a version which implements HW header checking and does
>> the pullup accordingly.
> 
> I think using a constant based on the vlan header size would be
> sufficient to fix this.
We will have a patch based on vlan header size.

Regards
Matheos

> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH] [NIU] VLAN does not work with niu driver
From: David Miller @ 2009-09-10  1:44 UTC (permalink / raw)
  To: Matheos.Worku; +Cc: Joyce.Yu, netdev
In-Reply-To: <4AA85404.5020300@sun.com>

From: Matheos Worku <Matheos.Worku@Sun.COM>
Date: Wed, 09 Sep 2009 18:19:00 -0700

> We can work on a version which implements HW header checking and does
> the pullup accordingly.

I think using a constant based on the vlan header size would be
sufficient to fix this.
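
To make the suggestion concrete: an Ethernet header is 14 bytes and one
802.1Q tag adds 4, so a constant covering a VLAN-tagged header is 18
bytes rather than 64. The sketch below is an assumption about what such
a constant could look like; MOCK_RXPULL_MAX and mock_rx_pull_len are
hypothetical names, and the follow-up patch is not part of this thread.

```c
#include <assert.h>
#include <stddef.h>

#define ETH_HLEN	14	/* dst MAC + src MAC + EtherType */
#define VLAN_HLEN	4	/* one 802.1Q tag: TPID + TCI */

/* Hypothetical pull limit: just enough for a VLAN-tagged header. */
#define MOCK_RXPULL_MAX	(ETH_HLEN + VLAN_HLEN)

/* Number of bytes to pull into the linear area for a given frame. */
static size_t mock_rx_pull_len(size_t frame_len)
{
	return frame_len < MOCK_RXPULL_MAX ? frame_len : MOCK_RXPULL_MAX;
}
```

This keeps the pulled region well short of 64 bytes, which is the
property the cache line argument elsewhere in this thread cares about.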

^ permalink raw reply

* Re: [PATCH] [NIU] VLAN does not work with niu driver
From: Matheos Worku @ 2009-09-10  1:19 UTC (permalink / raw)
  To: David Miller; +Cc: Joyce.Yu, netdev
In-Reply-To: <20090909.181033.25374239.davem@davemloft.net>



David Miller wrote:
> From: Matheos Worku <Matheos.Worku@Sun.COM>
> Date: Wed, 09 Sep 2009 18:01:23 -0700
> 
>> The frame type in NIU HW is embedded in a HW header, so it is possible
>> to check the HW header and decide whether to pull up ETH_HLEN bytes or
>> the VLAN header size. However, considering the amount of work required
>> to get and examine the HW header (including endianness issues), we
>> thought pulling up 64 bytes by default (as used in cassini.c) would be
>> efficient.
> 
> Well, it was 64 in early versions of the driver, and I decreased it
> down to ETH_HLEN.
> 
> The less the better since for forwarding applications anything past
> the IPV4 header pulled is going to be a waste of CPU cache lines and
> thus negatively affect forwarding rates.
> 
> That's why I asked if this change was performance regression tested,
> because I know it's going to slow down forwarding rates for small
> packets.
Dave,

We did throughput testing (netperf) and didn't notice any performance 
degradation. We haven't done forwarding testing however.

We can work on a version which implements HW header checking and does
the pullup accordingly.


Regards,
Matheos


^ permalink raw reply

* Re: [PATCH NEXT 2/2] netxen: fix tx descriptor structure
From: David Miller @ 2009-09-10  1:13 UTC (permalink / raw)
  To: dhananjay; +Cc: netdev, amit
In-Reply-To: <1252526963-25621-2-git-send-email-dhananjay@netxen.com>

From: Dhananjay Phadke <dhananjay@netxen.com>
Date: Wed,  9 Sep 2009 13:09:23 -0700

> From: Amit Kumar Salecha <amit@qlogic.com>
> 
> Fix the offset of vlan_TCI field in cmd_desc_type0.
> 
> Signed-off-by: Amit Kumar Salecha <amit@qlogic.com>
> Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>

Also applied, thanks.

^ permalink raw reply

* Re: net_sched: fix estimator lock selection for mq child qdiscs
From: David Miller @ 2009-09-10  1:11 UTC (permalink / raw)
  To: kaber; +Cc: netdev
In-Reply-To: <4AA7D0C5.9080601@trash.net>

From: Patrick McHardy <kaber@trash.net>
Date: Wed, 09 Sep 2009 17:59:01 +0200

>     net_sched: fix estimator lock selection for mq child qdiscs
>     
>     When new child qdiscs are attached to the mq qdisc, they are actually
>     attached as root qdiscs to the device queues. The lock selection for
>     new estimators incorrectly picks the root lock of the existing and
>     to be replaced qdisc, which results in a use-after-free once the old
>     qdisc has been destroyed.
>     
>     Mark mq qdisc instances with a new flag and treat qdiscs attached to
>     mq as children similar to regular root qdiscs.
>     
>     Additionally prevent estimators from being attached to the mq qdisc
>     itself since it only updates its byte and packet counters during dumps.
>     
>     Signed-off-by: Patrick McHardy <kaber@trash.net>

Applied, thanks!

^ permalink raw reply

* Re: [PATCH] [NIU] VLAN does not work with niu driver
From: David Miller @ 2009-09-10  1:10 UTC (permalink / raw)
  To: Matheos.Worku; +Cc: Joyce.Yu, netdev
In-Reply-To: <4AA84FE3.6030407@sun.com>

From: Matheos Worku <Matheos.Worku@Sun.COM>
Date: Wed, 09 Sep 2009 18:01:23 -0700

> The frame type in NIU HW is embedded in a HW header, so it is possible
> to check the HW header and decide whether to pull up ETH_HLEN bytes or
> the VLAN header size. However, considering the amount of work required
> to get and examine the HW header (including endianness issues), we
> thought pulling up 64 bytes by default (as used in cassini.c) would be
> efficient.

Well, it was 64 in early versions of the driver, and I decreased it
down to ETH_HLEN.

The less the better since for forwarding applications anything past
the IPV4 header pulled is going to be a waste of CPU cache lines and
thus negatively affect forwarding rates.

That's why I asked if this change was performance regression tested,
because I know it's going to slow down forwarding rates for small
packets.

^ permalink raw reply

* Re: [PATCH] [NIU] VLAN does not work with niu driver
From: Matheos Worku @ 2009-09-10  1:01 UTC (permalink / raw)
  To: David Miller; +Cc: Joyce.Yu, netdev
In-Reply-To: <20090909.171517.34998160.davem@davemloft.net>



David Miller wrote:
> From: Joyce Yu <Joyce.Yu@sun.com>
> Date: Wed, 09 Sep 2009 14:10:48 -0700
> 
>> drivers/net/niu.h |    2 +-
>> 1 files changed, 1 insertions(+), 1 deletions(-)
> 
> Can I get a more verbose commit message than this?
> 
>> @@ -2700,7 +2700,7 @@ struct fcram_hash_ipv6 {
>> #define RCR_PKT_TYPE_UDP               0x2
>> #define RCR_PKT_TYPE_SCTP              0x3
>>
>> -#define NIU_RXPULL_MAX                 ETH_HLEN
>> +#define NIU_RXPULL_MAX                 64
>>
> 
> See, that's why I want a detailed commit message, because if you
> described things more clearly I'd understand why you choose the value
> '64' as opposed to, say, the size of a VLAN header which to me would
> be a more appropriate value to use here.

Dave,

The frame type in NIU HW is embedded in a HW header, so it is
possible to check the HW header and decide whether to pull up ETH_HLEN
bytes or the VLAN header size. However, considering the amount of work
required to get and examine the HW header (including endianness issues),
we thought pulling up 64 bytes by default (as used in cassini.c) would
be efficient.

Regards,
Matheos

> 
> You just seem to be reverting a change I made a while back, and it
> just so happens to fix your problem.  But '64' is too large a value
> to use here and it will impact performance.
> 
> You did check to see if there were any performance regressions
> resulting from your change, right?

^ permalink raw reply
