Netdev List
 help / color / mirror / Atom feed
* Re: [GIT] Networking
From: Linus Torvalds @ 2011-01-07 22:48 UTC (permalink / raw)
  To: Francois Romieu
  Cc: Ben Hutchings, David Miller, Hayes Wang, David Woodhouse, akpm,
	netdev, linux-kernel
In-Reply-To: <20110107215505.GA1892@electric-eye.fr.zoreil.com>

On Fri, Jan 7, 2011 at 1:55 PM, Francois Romieu <romieu@fr.zoreil.com> wrote:
> Linus Torvalds <torvalds@linux-foundation.org> :
> [...]
>> I just confirmed that building it as a module works
>
> I have just tried a non-modular build and it worked without firmware.
>
...
> [    4.340876] sd 1:0:0:0: [sda] Attached SCSI disk
> [   63.968081] r8169 0000:02:00.0: eth0: unable to apply firmware patch
>
> It's here. After a 60 seconds black-out.

Hmm. I never even waited for 60 seconds. Maybe my boot would have
continued after the delay.

                    Linus

^ permalink raw reply

* Re: [net-next-2.6 PATCH v6 1/2] net: implement mechanism for HW based QOS
From: John Fastabend @ 2011-01-07 22:48 UTC (permalink / raw)
  To: Jarek Poplawski
  Cc: davem@davemloft.net, hadi@cyberus.ca, eric.dumazet@gmail.com,
	shemminger@vyatta.com, tgraf@infradead.org,
	bhutchings@solarflare.com, nhorman@tuxdriver.com,
	netdev@vger.kernel.org
In-Reply-To: <20110107214645.GB2050@del.dom.local>

On 1/7/2011 1:46 PM, Jarek Poplawski wrote:
> On Thu, Jan 06, 2011 at 07:12:11PM -0800, John Fastabend wrote:
>> This patch provides a mechanism for lower layer devices to
>> steer traffic using skb->priority to tx queues. This allows
>> for hardware based QOS schemes to use the default qdisc without
>> incurring the penalties related to global state and the qdisc
>> lock. While reliably receiving skbs on the correct tx ring
>> to avoid head of line blocking resulting from shuffling in
>> the LLD. Finally, all the goodness from txq caching and xps/rps
>> can still be leveraged.
>>
>> Many drivers and hardware exist with the ability to implement
>> QOS schemes in the hardware but currently these drivers tend
>> to rely on firmware to reroute specific traffic, a driver
>> specific select_queue or the queue_mapping action in the
>> qdisc.
>>
>> By using select_queue for this drivers need to be updated for
>> each and every traffic type and we lose the goodness of much
>> of the upstream work. Firmware solutions are inherently
>> inflexible. And finally if admins are expected to build a
>> qdisc and filter rules to steer traffic this requires knowledge
>> of how the hardware is currently configured. The number of tx
>> queues and the queue offsets may change depending on resources.
>> Also this approach incurs all the overhead of a qdisc with filters.
>>
>> With the mechanism in this patch users can set skb priority using
>> expected methods ie setsockopt() or the stack can set the priority
>> directly. Then the skb will be steered to the correct tx queues
>> aligned with hardware QOS traffic classes. In the normal case with
>> a single traffic class and all queues in this class everything
>> works as is until the LLD enables multiple tcs.
>>
>> To steer the skb we mask out the lower 4 bits of the priority
>> and allow the hardware to configure upto 15 distinct classes
>> of traffic. This is expected to be sufficient for most applications
>> at any rate it is more then the 8021Q spec designates and is
>> equal to the number of prio bands currently implemented in
>> the default qdisc.
>>
>> This in conjunction with a userspace application such as
>> lldpad can be used to implement 8021Q transmission selection
>> algorithms one of these algorithms being the extended transmission
>> selection algorithm currently being used for DCB.
>>
>> Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
>> ---
>>
>>  include/linux/netdevice.h |   65 +++++++++++++++++++++++++++++++++++++++++++++
>>  net/core/dev.c            |   52 +++++++++++++++++++++++++++++++++++-
>>  2 files changed, 116 insertions(+), 1 deletions(-)
>>
>> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
>> index 0f6b1c9..12fff42 100644
>> --- a/include/linux/netdevice.h
>> +++ b/include/linux/netdevice.h
>> @@ -646,6 +646,14 @@ struct xps_dev_maps {
>>      (nr_cpu_ids * sizeof(struct xps_map *)))
>>  #endif /* CONFIG_XPS */
>>  
>> +#define TC_MAX_QUEUE	16
>> +#define TC_BITMASK	15
>> +/* HW offloaded queuing disciplines txq count and offset maps */
>> +struct netdev_tc_txq {
>> +	u16 count;
>> +	u16 offset;
>> +};
>> +
>>  /*
>>   * This structure defines the management hooks for network devices.
>>   * The following hooks can be defined; unless noted otherwise, they are
>> @@ -756,6 +764,7 @@ struct xps_dev_maps {
>>   * int (*ndo_set_vf_port)(struct net_device *dev, int vf,
>>   *			  struct nlattr *port[]);
>>   * int (*ndo_get_vf_port)(struct net_device *dev, int vf, struct sk_buff *skb);
>> + * void (*ndo_setup_tc)(struct net_device *dev, u8 tc)
> 
> ..., unsigned int txq) ?
> 
>>   */
>>  #define HAVE_NET_DEVICE_OPS
>>  struct net_device_ops {
>> @@ -814,6 +823,8 @@ struct net_device_ops {
>>  						   struct nlattr *port[]);
>>  	int			(*ndo_get_vf_port)(struct net_device *dev,
>>  						   int vf, struct sk_buff *skb);
>> +	int			(*ndo_setup_tc)(struct net_device *dev, u8 tc,
>> +						unsigned int txq);
> 
> ...
>> +/* netif_setup_tc - Handle tc mappings on real_num_tx_queues change
>> + * @dev: Network device
>> + * @txq: number of queues available
>> + *
>> + * If real_num_tx_queues is changed the tc mappings may no longer be
>> + * valid. To resolve this if the net_device supports ndo_setup_tc
>> + * call the ops routine with the new queue number. If the ops is not
>> + * available verify the tc mapping remains valid and if not NULL the
>> + * mapping. With no priorities mapping to this offset/count pair it
>> + * will no longer be used. In the worst case TC0 is invalid nothing
>> + * can be done so disable priority mappings.
>> + */
>> +void netif_setup_tc(struct net_device *dev, unsigned int txq)
>> +{
>> +	const struct net_device_ops *ops = dev->netdev_ops;
>> +
>> +	if (ops->ndo_setup_tc) {
>> +		ops->ndo_setup_tc(dev, dev->num_tc, txq);
>> +	} else {
>> +		int i;
>> +		struct netdev_tc_txq *tc = &dev->tc_to_txq[0];
>> +
>> +		/* If TC0 is invalidated disable TC mapping */
>> +		if (tc->offset + tc->count > txq) {
>> +			dev->num_tc = 0;
>> +			return;
>> +		}
>> +
>> +		/* Invalidated prio to tc mappings set to TC0 */
>> +		for (i = 1; i < TC_BITMASK + 1; i++) {
>> +			int q = netdev_get_prio_tc_map(dev, i);
> 
> (empty line)
> Btw, probably some warning should be logged on config change here.
> 

OK maybe I should see about making at least my local checkpatch script
look for this. Also added pr_warnings here.

>> +			tc = &dev->tc_to_txq[q];
>> +
>> +			if (tc->offset + tc->count > txq)
>> +				netdev_set_prio_tc_map(dev, i, 0);
>> +		}
>> +	}
>> +}
>> +
>>  /*
>>   * Routine to help set real_num_tx_queues. To avoid skbs mapped to queues
>>   * greater then real_num_tx_queues stale skbs on the qdisc must be flushed.
>> @@ -1614,6 +1653,9 @@ int netif_set_real_num_tx_queues(struct net_device *dev, unsigned int txq)
>>  
>>  		if (txq < dev->real_num_tx_queues)
>>  			qdisc_reset_all_tx_gt(dev, txq);
>> +
>> +		if (dev->num_tc)
>> +			netif_setup_tc(dev, txq);
> 
> Should be before qdisc_reset_all_tx_gt (above).
> 
> Jarek P.

I will fix this. Thanks!


^ permalink raw reply

* Re: [GIT] Networking
From: Francois Romieu @ 2011-01-07 21:13 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: David Miller, Hayes Wang, Ben Hutchings, David Woodhouse, akpm,
	netdev, linux-kernel
In-Reply-To: <AANLkTi=7=_FyVsjA3vKEpg8RwfC=m-8QGfqa6ctxQday@mail.gmail.com>

Linus Torvalds <torvalds@linux-foundation.org> :
> On Thu, Jan 6, 2011 at 12:20 PM, David Miller <davem@davemloft.net> wrote:
> >
> > Plus the usual spattering of wireless, bluetooth, and wired driver
> > updates.
> 
> Grr.

Oops.

[...]
> [torvalds@i5 linux]$ git bisect good
> bca03d5f32c8ee9b5cfa1d32640a63fded6cb3c0 is the first bad commit
> commit bca03d5f32c8ee9b5cfa1d32640a63fded6cb3c0
> Author: fran?ois romieu <romieu@fr.zoreil.com>
> Date:   Mon Jan 3 15:07:31 2011 +0000
> 
>     r8169: remove the firmware of RTL8111D.
> 
>     The binary file of the firmware is moved to linux-firmware repository.
>     The firmwares are rtl_nic/rtl8168d-1.fw and rtl_nic/rtl8168d-2.fw.
>     The driver goes along if the firmware couldn't be found. However, it
>     is suggested to be done with the suitable firmware.
> 
>     Some wrong PHY parameters are directly corrected in the driver.
> 
>     Simple firmware checking added per Ben Hutchings suggestion.
> 
>     Signed-off-by: Hayes Wang <hayeswang@realtek.com>
>     Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
>     Cc: Ben Hutchings <benh@debian.org>
>     Signed-off-by: David S. Miller <davem@davemloft.net>
> 
> and the behavior is very broken: it just hangs at boot-time. No
> messages from the driver (certainly not any messages about missing
> firmware), no nothing. The thing is just hung.
>
> On a working setup, I have
> 
> 
>   [    1.595071] r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded
>   [    1.595114] r8169 0000:01:00.0: PCI INT A -> GSI 19 (level, low) -> IRQ 19
>   [    1.595174] r8169 0000:01:00.0: setting latency timer to 64
>   [    1.595227] r8169 0000:01:00.0: irq 42 for MSI/MSI-X
>   [    1.595770] r8169 0000:01:00.0: eth0: RTL8168d/8111d at
> 0xffffc90000068000, e0:cb:4e:95:1a:d7, XID 083000c0 IRQ 42

>   ...
>   [    7.985917] r8169 0000:01:00.0: eth0: link up
>   [    7.987525] r8169 0000:01:00.0: eth0: link up
> 
> while on a non-working setup, I get that XID line, and after that a
> few other init routines continue to show up (probably just
> multi-threaded initcalls), but the box is dead.
> 
> Quite frankly, that commit looks broken anyway. It doesn't just switch
> to the firmware loader, it also seems to change other things (ie
> removed some mdio writes, added others).

It was described in the commit message as
[...]
    Some wrong PHY parameters are directly corrected in the driver.

> What's going on?

The firmware is supposed to be optional. The (relevant) 8168 have
been reported to work without it. I have tested it with my hardware
(RTL_GIGA_MAC_VER_26 (== your), RTL_GIGA_MAC_VER_25 + some others),
seen the "r8169 0000:02:00.0: eth0: unable to apply firmware patch"
message and noticed that the card was actually working. Though I did
not try a non-modular build, this was not exactly a five minutes
test.

I had no technical reason [*] to turn the firmware mandatory and screw
distro-maintainers who have decided that the firmware is not free enough.

[*] The small "I will regret it" voices in my head do not count as
    technical reasons.

[...]
> Not acceptable. I'm ok with an external firmware repository, but only
> if it _works_. Right now it doesn't. It just seems to cause silent
> failures: there were no warnings about missing firmware at ANY time.
> Not build-time, not run-time.

?

The driver is supposed to display a message at KERN_WARNING.

-- 
Ueimor

^ permalink raw reply

* Re: [PATCH] Export ACPI _DSM provided firmware instance number and string name to sysfs
From: Jesse Barnes @ 2011-01-07 22:27 UTC (permalink / raw)
  To: Narendra_K
  Cc: Matt_Domsch, linux-pci, linux-hotplug, netdev, Jordan_Hargrave,
	Charles_Rose, Vijay_Nijhawan
In-Reply-To: <20101223194927.GA2717@fedora14-r610.oslab.blr.amer.dell.com>

On Thu, 23 Dec 2010 11:24:36 -0800
<Narendra_K@Dell.com> wrote:

> On Thu, Dec 23, 2010 at 08:02:03PM +0530, Domsch, Matt wrote:
> > On Wed, Dec 22, 2010 at 08:42:39AM -0800, Narendra_K@Dell.com wrote:
> > > Hello,
> > > 
> > > This patch exports ACPI _DSM provided firmware instance number and
> > > string name to sysfs.
> > > 
> > > Please review -
> > 
> > There are now two different meanings for the 'index' file:
> > 
> > 1) SMBIOS-provided "type instance" value, which I've only seen in
> >    range [1..N] for N devices, monotonically stepwise increasing.
> > 
> > 2) ACPI-provided "index" value, which per spec only needs to be a
> >    "sort key", not starting at 0 or 1, and while monotonically
> >    increasing, not necessarily stepwise.  It's perfectly valid for the
> >    values to be (12, 16, 27, 29) if that's convenient for BIOS to
> >    generate.
> > 
> > Therefore, a consumer of this value (such as biosdevname) must know
> > which of the two it's dealing with, and either accept the value as-is,
> > or sort the value list.  While I suppose it could sort the value list
> > in either case, I'd prefer the ACPI value to be exposed in its own
> > file, perhaps 'acpi_index', to make this explicit rather than
> > implicit.
> > 
> > 'label' is fine for either case, with ACPI taking priority over
> > SMBIOS if both happen to be present.
> > 

Applied, thanks.

-- 
Jesse Barnes, Intel Open Source Technology Center

^ permalink raw reply

* Re: [net-next-2.6 PATCH v6 2/2] net_sched: implement a root container qdisc sch_mqprio
From: John Fastabend @ 2011-01-07 22:16 UTC (permalink / raw)
  To: Jarek Poplawski
  Cc: davem@davemloft.net, hadi@cyberus.ca, eric.dumazet@gmail.com,
	shemminger@vyatta.com, tgraf@infradead.org,
	bhutchings@solarflare.com, nhorman@tuxdriver.com,
	netdev@vger.kernel.org
In-Reply-To: <20110107212140.GA2050@del.dom.local>

On 1/7/2011 1:21 PM, Jarek Poplawski wrote:
> On Thu, Jan 06, 2011 at 07:12:16PM -0800, John Fastabend wrote:
>> This implements a mqprio queueing discipline that by default creates
>> a pfifo_fast qdisc per tx queue and provides the needed configuration
>> interface.
>>
>> Using the mqprio qdisc the number of tcs currently in use along
>> with the range of queues alloted to each class can be configured. By
>> default skbs are mapped to traffic classes using the skb priority.
>> This mapping is configurable.
>>
>> Configurable parameters,
>>
>> struct tc_mqprio_qopt {
>>         __u8    num_tc;
>>         __u8    prio_tc_map[TC_BITMASK + 1];
>>         __u8    hw;
>>         __u16   count[TC_MAX_QUEUE];
>>         __u16   offset[TC_MAX_QUEUE];
>> };
>>
>> Here the count/offset pairing give the queue alignment and the
>> prio_tc_map gives the mapping from skb->priority to tc.
>>
>> The hw bit determines if the hardware should configure the count
>> and offset values. If the hardware bit is set then the operation
>> will fail if the hardware does not implement the ndo_setup_tc
>> operation. This is to avoid undetermined states where the hardware
>> may or may not control the queue mapping. Also minimal bounds
>> checking is done on the count/offset to verify a queue does not
>> exceed num_tx_queues and that queue ranges do not overlap. Otherwise
>> it is left to user policy or hardware configuration to create
>> useful mappings.
>>
>> It is expected that hardware QOS schemes can be implemented by
>> creating appropriate mappings of queues in ndo_tc_setup().
>>
>> One expected use case is drivers will use the ndo_setup_tc to map
>> queue ranges onto 802.1Q traffic classes. This provides a generic
>> mechanism to map network traffic onto these traffic classes and
>> removes the need for lower layer drivers to know specifics about
>> traffic types.
>>
>> Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
>> ---
>>
>>  include/linux/pkt_sched.h |   12 +
>>  net/sched/Kconfig         |   12 +
>>  net/sched/Makefile        |    1 
>>  net/sched/sch_generic.c   |    4 
>>  net/sched/sch_mqprio.c    |  415 +++++++++++++++++++++++++++++++++++++++++++++
>>  5 files changed, 444 insertions(+), 0 deletions(-)
>>  create mode 100644 net/sched/sch_mqprio.c
>>
>> diff --git a/include/linux/pkt_sched.h b/include/linux/pkt_sched.h
>> index 2cfa4bc..776cd93 100644
>> --- a/include/linux/pkt_sched.h
>> +++ b/include/linux/pkt_sched.h
>> @@ -481,4 +481,16 @@ struct tc_drr_stats {
>>  	__u32	deficit;
>>  };
>>  
>> +/* MQPRIO */
>> +#define TC_QOPT_BITMASK 15
>> +#define TC_QOPT_MAX_QUEUE 16
>> +
>> +struct tc_mqprio_qopt {
>> +	__u8	num_tc;
>> +	__u8	prio_tc_map[TC_QOPT_BITMASK + 1];
>> +	__u8	hw;
>> +	__u16	count[TC_QOPT_MAX_QUEUE];
>> +	__u16	offset[TC_QOPT_MAX_QUEUE];
>> +};
> 
> ...
>> +static int mqprio_parse_opt(struct net_device *dev, struct tc_mqprio_qopt *qopt)
>> +{
>> +	int i, j;
>> +
>> +	/* Verify num_tc is not out of max range */
>> +	if (qopt->num_tc > TC_QOPT_MAX_QUEUE)
> 
> If these TC_QOPTs really couldn't be avoided you should probably check
> them with BUILD_BUG_ON() but use only TC_MAX_QUEUE/TC_BITMASK
> everywhere. Otherwise, it looks OK to me.
> 
> Jarek P.

I couldn't think of any better way. I'll add the BUILD_BUG_ON() macros.

Thanks 
John

^ permalink raw reply

* Re: [GIT] Networking
From: Francois Romieu @ 2011-01-07 21:55 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Ben Hutchings, David Miller, Hayes Wang, David Woodhouse, akpm,
	netdev, linux-kernel
In-Reply-To: <AANLkTikwrHoOY4wvUp1qn-OT=EjZesSNR29US1ERc0o2@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 2017 bytes --]

Linus Torvalds <torvalds@linux-foundation.org> :
[...]
> I just confirmed that building it as a module works

I have just tried a non-modular build and it worked without firmware.

However:
[    3.508989] scsi2 : pata_sis
[    3.515602] scsi3 : pata_sis
[    3.522815] ata3: PATA max UDMA/133 cmd 0x1f0 ctl 0x3f6 bmdma 0xffa0 irq 14
[    3.537408] ata4: PATA max UDMA/133 cmd 0x170 ctl 0x376 bmdma 0xffa8 irq 15
[    3.552204] Fixed MDIO Bus: probed
[    3.559902] r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded
[    3.571875] ACPI: PCI Interrupt Link [LNKE] enabled at IRQ 5
[    3.583869] PCI: setting IRQ 5 as level-triggered
[    3.583877] r8169 0000:02:00.0: PCI INT A -> Link[LNKE] -> GSI 5 (level, low) -> IRQ 5
[    3.600399] r8169 0000:02:00.0: enabling Mem-Wr-Inval
[    3.600440] r8169 0000:02:00.0: setting latency timer to 64
[    3.600453] r8169 0000:02:00.0: no MSI. Back to INTx.
[    3.611416] r8169 0000:02:00.0: eth0: RTL8168d/8111d at 0xf87ea000, 00:e0:4c:68:00:2c, XID 083000c0 IRQ 5

No firmware warning message ? 

[    3.842642] ata1: SATA link down (SStatus 0 SControl 300)
[    4.172056] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[    4.192318] ata2.00: ATA-7: SAMSUNG HD250HJ, FH100-05, max UDMA7
[    4.205162] ata2.00: 488397168 sectors, multi 16: LBA48 NCQ (depth 0/32)
[    4.240311] ata2.00: configured for UDMA/133
[    4.249869] scsi 1:0:0:0: Direct-Access     ATA      SAMSUNG HD250HJ  FH10 PQ: 0 ANSI: 5
[    4.267151] sd 1:0:0:0: [sda] 488397168 512-byte logical blocks: (250 GB/232 GiB)
[    4.283199] sd 1:0:0:0: Attached scsi generic sg0 type 0
[    4.295028] sd 1:0:0:0: [sda] Write Protect is off
[    4.305610] sd 1:0:0:0: [sda] Mode Sense: 00 3a 00 00
[    4.305637] sd 1:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[    4.334873]  sda: sda1
[    4.340876] sd 1:0:0:0: [sda] Attached SCSI disk
[   63.968081] r8169 0000:02:00.0: eth0: unable to apply firmware patch

It's here. After a 60 seconds black-out.

-- 
Ueimor

[-- Attachment #2: dmesg.gz --]
[-- Type: application/x-gzip, Size: 14038 bytes --]

^ permalink raw reply

* Re: [net-next-2.6 PATCH v6 1/2] net: implement mechanism for HW based QOS
From: Jarek Poplawski @ 2011-01-07 21:46 UTC (permalink / raw)
  To: John Fastabend
  Cc: davem, hadi, eric.dumazet, shemminger, tgraf, bhutchings, nhorman,
	netdev
In-Reply-To: <20110107031211.2446.35715.stgit@jf-dev1-dcblab>

On Thu, Jan 06, 2011 at 07:12:11PM -0800, John Fastabend wrote:
> This patch provides a mechanism for lower layer devices to
> steer traffic using skb->priority to tx queues. This allows
> for hardware based QOS schemes to use the default qdisc without
> incurring the penalties related to global state and the qdisc
> lock. While reliably receiving skbs on the correct tx ring
> to avoid head of line blocking resulting from shuffling in
> the LLD. Finally, all the goodness from txq caching and xps/rps
> can still be leveraged.
> 
> Many drivers and hardware exist with the ability to implement
> QOS schemes in the hardware but currently these drivers tend
> to rely on firmware to reroute specific traffic, a driver
> specific select_queue or the queue_mapping action in the
> qdisc.
> 
> By using select_queue for this drivers need to be updated for
> each and every traffic type and we lose the goodness of much
> of the upstream work. Firmware solutions are inherently
> inflexible. And finally if admins are expected to build a
> qdisc and filter rules to steer traffic this requires knowledge
> of how the hardware is currently configured. The number of tx
> queues and the queue offsets may change depending on resources.
> Also this approach incurs all the overhead of a qdisc with filters.
> 
> With the mechanism in this patch users can set skb priority using
> expected methods ie setsockopt() or the stack can set the priority
> directly. Then the skb will be steered to the correct tx queues
> aligned with hardware QOS traffic classes. In the normal case with
> a single traffic class and all queues in this class everything
> works as is until the LLD enables multiple tcs.
> 
> To steer the skb we mask out the lower 4 bits of the priority
> and allow the hardware to configure upto 15 distinct classes
> of traffic. This is expected to be sufficient for most applications
> at any rate it is more then the 8021Q spec designates and is
> equal to the number of prio bands currently implemented in
> the default qdisc.
> 
> This in conjunction with a userspace application such as
> lldpad can be used to implement 8021Q transmission selection
> algorithms one of these algorithms being the extended transmission
> selection algorithm currently being used for DCB.
> 
> Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
> ---
> 
>  include/linux/netdevice.h |   65 +++++++++++++++++++++++++++++++++++++++++++++
>  net/core/dev.c            |   52 +++++++++++++++++++++++++++++++++++-
>  2 files changed, 116 insertions(+), 1 deletions(-)
> 
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index 0f6b1c9..12fff42 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -646,6 +646,14 @@ struct xps_dev_maps {
>      (nr_cpu_ids * sizeof(struct xps_map *)))
>  #endif /* CONFIG_XPS */
>  
> +#define TC_MAX_QUEUE	16
> +#define TC_BITMASK	15
> +/* HW offloaded queuing disciplines txq count and offset maps */
> +struct netdev_tc_txq {
> +	u16 count;
> +	u16 offset;
> +};
> +
>  /*
>   * This structure defines the management hooks for network devices.
>   * The following hooks can be defined; unless noted otherwise, they are
> @@ -756,6 +764,7 @@ struct xps_dev_maps {
>   * int (*ndo_set_vf_port)(struct net_device *dev, int vf,
>   *			  struct nlattr *port[]);
>   * int (*ndo_get_vf_port)(struct net_device *dev, int vf, struct sk_buff *skb);
> + * void (*ndo_setup_tc)(struct net_device *dev, u8 tc)

..., unsigned int txq) ?

>   */
>  #define HAVE_NET_DEVICE_OPS
>  struct net_device_ops {
> @@ -814,6 +823,8 @@ struct net_device_ops {
>  						   struct nlattr *port[]);
>  	int			(*ndo_get_vf_port)(struct net_device *dev,
>  						   int vf, struct sk_buff *skb);
> +	int			(*ndo_setup_tc)(struct net_device *dev, u8 tc,
> +						unsigned int txq);

...
> +/* netif_setup_tc - Handle tc mappings on real_num_tx_queues change
> + * @dev: Network device
> + * @txq: number of queues available
> + *
> + * If real_num_tx_queues is changed the tc mappings may no longer be
> + * valid. To resolve this if the net_device supports ndo_setup_tc
> + * call the ops routine with the new queue number. If the ops is not
> + * available verify the tc mapping remains valid and if not NULL the
> + * mapping. With no priorities mapping to this offset/count pair it
> + * will no longer be used. In the worst case TC0 is invalid nothing
> + * can be done so disable priority mappings.
> + */
> +void netif_setup_tc(struct net_device *dev, unsigned int txq)
> +{
> +	const struct net_device_ops *ops = dev->netdev_ops;
> +
> +	if (ops->ndo_setup_tc) {
> +		ops->ndo_setup_tc(dev, dev->num_tc, txq);
> +	} else {
> +		int i;
> +		struct netdev_tc_txq *tc = &dev->tc_to_txq[0];
> +
> +		/* If TC0 is invalidated disable TC mapping */
> +		if (tc->offset + tc->count > txq) {
> +			dev->num_tc = 0;
> +			return;
> +		}
> +
> +		/* Invalidated prio to tc mappings set to TC0 */
> +		for (i = 1; i < TC_BITMASK + 1; i++) {
> +			int q = netdev_get_prio_tc_map(dev, i);

(empty line)
Btw, probably some warning should be logged on config change here.

> +			tc = &dev->tc_to_txq[q];
> +
> +			if (tc->offset + tc->count > txq)
> +				netdev_set_prio_tc_map(dev, i, 0);
> +		}
> +	}
> +}
> +
>  /*
>   * Routine to help set real_num_tx_queues. To avoid skbs mapped to queues
>   * greater then real_num_tx_queues stale skbs on the qdisc must be flushed.
> @@ -1614,6 +1653,9 @@ int netif_set_real_num_tx_queues(struct net_device *dev, unsigned int txq)
>  
>  		if (txq < dev->real_num_tx_queues)
>  			qdisc_reset_all_tx_gt(dev, txq);
> +
> +		if (dev->num_tc)
> +			netif_setup_tc(dev, txq);

Should be before qdisc_reset_all_tx_gt (above).

Jarek P.

^ permalink raw reply

* Re: [net-next-2.6 PATCH v6 2/2] net_sched: implement a root container qdisc sch_mqprio
From: Jarek Poplawski @ 2011-01-07 21:21 UTC (permalink / raw)
  To: John Fastabend
  Cc: davem, hadi, eric.dumazet, shemminger, tgraf, bhutchings, nhorman,
	netdev
In-Reply-To: <20110107031216.2446.35953.stgit@jf-dev1-dcblab>

On Thu, Jan 06, 2011 at 07:12:16PM -0800, John Fastabend wrote:
> This implements a mqprio queueing discipline that by default creates
> a pfifo_fast qdisc per tx queue and provides the needed configuration
> interface.
> 
> Using the mqprio qdisc the number of tcs currently in use along
> with the range of queues alloted to each class can be configured. By
> default skbs are mapped to traffic classes using the skb priority.
> This mapping is configurable.
> 
> Configurable parameters,
> 
> struct tc_mqprio_qopt {
>         __u8    num_tc;
>         __u8    prio_tc_map[TC_BITMASK + 1];
>         __u8    hw;
>         __u16   count[TC_MAX_QUEUE];
>         __u16   offset[TC_MAX_QUEUE];
> };
> 
> Here the count/offset pairing give the queue alignment and the
> prio_tc_map gives the mapping from skb->priority to tc.
> 
> The hw bit determines if the hardware should configure the count
> and offset values. If the hardware bit is set then the operation
> will fail if the hardware does not implement the ndo_setup_tc
> operation. This is to avoid undetermined states where the hardware
> may or may not control the queue mapping. Also minimal bounds
> checking is done on the count/offset to verify a queue does not
> exceed num_tx_queues and that queue ranges do not overlap. Otherwise
> it is left to user policy or hardware configuration to create
> useful mappings.
> 
> It is expected that hardware QOS schemes can be implemented by
> creating appropriate mappings of queues in ndo_tc_setup().
> 
> One expected use case is drivers will use the ndo_setup_tc to map
> queue ranges onto 802.1Q traffic classes. This provides a generic
> mechanism to map network traffic onto these traffic classes and
> removes the need for lower layer drivers to know specifics about
> traffic types.
> 
> Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
> ---
> 
>  include/linux/pkt_sched.h |   12 +
>  net/sched/Kconfig         |   12 +
>  net/sched/Makefile        |    1 
>  net/sched/sch_generic.c   |    4 
>  net/sched/sch_mqprio.c    |  415 +++++++++++++++++++++++++++++++++++++++++++++
>  5 files changed, 444 insertions(+), 0 deletions(-)
>  create mode 100644 net/sched/sch_mqprio.c
> 
> diff --git a/include/linux/pkt_sched.h b/include/linux/pkt_sched.h
> index 2cfa4bc..776cd93 100644
> --- a/include/linux/pkt_sched.h
> +++ b/include/linux/pkt_sched.h
> @@ -481,4 +481,16 @@ struct tc_drr_stats {
>  	__u32	deficit;
>  };
>  
> +/* MQPRIO */
> +#define TC_QOPT_BITMASK 15
> +#define TC_QOPT_MAX_QUEUE 16
> +
> +struct tc_mqprio_qopt {
> +	__u8	num_tc;
> +	__u8	prio_tc_map[TC_QOPT_BITMASK + 1];
> +	__u8	hw;
> +	__u16	count[TC_QOPT_MAX_QUEUE];
> +	__u16	offset[TC_QOPT_MAX_QUEUE];
> +};

...
> +static int mqprio_parse_opt(struct net_device *dev, struct tc_mqprio_qopt *qopt)
> +{
> +	int i, j;
> +
> +	/* Verify num_tc is not out of max range */
> +	if (qopt->num_tc > TC_QOPT_MAX_QUEUE)

If these TC_QOPTs really couldn't be avoided you should probably check
them with BUILD_BUG_ON() but use only TC_MAX_QUEUE/TC_BITMASK
everywhere. Otherwise, it looks OK to me.

Jarek P.

^ permalink raw reply

* [rfc] Creating drivers/net/ethernet
From: Joe Perches @ 2011-01-07 21:15 UTC (permalink / raw)
  To: netdev; +Cc: David Miller, Paul Gortmaker, Jan Engelhardt, Jeffrey Kirsher
In-Reply-To: <1294360199-9860-1-git-send-email-jeffrey.t.kirsher@intel.com>

Does anyone still think moving files around in drivers/net
would be sensible and a suitable candidate for 2.6.38?

http://vger.kernel.org/netconf2010_slides/netconf-jtk.pdf

Here's what I proposed.

http://www.spinics.net/lists/netdev/msg149717.html

I agree with Jan that the current 10/100, 1000, 10000 speed
selections mechanisms are less than ideal.

Perhaps it'd be worthwhile to remove it.


^ permalink raw reply

* Re: [PATCH] forcedeth: Do not use legacy PCI power management
From: Rafael J. Wysocki @ 2011-01-07 21:12 UTC (permalink / raw)
  To: netdev; +Cc: David Miller, LKML, Linux-pm mailing list
In-Reply-To: <201101070049.16833.rjw@sisk.pl>

On Friday, January 07, 2011, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rjw@sisk.pl>
> 
> The forcedeth driver uses the legacy PCI power management, so it has
> to do PCI-specific things in its ->suspend() and ->resume() callbacks
> and some of them are not done correctly.
> 
> Convert forcedeth to the new PCI power management framework and make
> it let the PCI subsystem take care of all the PCI-specific aspects of
> device handling during system power transitions.
> 
> Tested with nVidia Corporation MCP55 Ethernet (rev a2).

This version of the patch contains a mistake (nv_shutdown should be #defined
as NULL if !CONFIG_PM, not nv_resume), please disregard it.

Fixed patch is appended.

Thanks,
Rafael


---
From: Rafael J. Wysocki <rjw@sisk.pl>
Subject: forcedeth: Do not use legacy PCI power management (v2)

The forcedeth driver uses the legacy PCI power management, so it has
to do PCI-specific things in its ->suspend() and ->resume() callbacks
and some of them are not done correctly.

Convert forcedeth to the new PCI power management framework and make
it let the PCI subsystem take care of all the PCI-specific aspects of
device handling during system power transitions.

Tested with nVidia Corporation MCP55 Ethernet (rev a2).

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 drivers/net/forcedeth.c |   34 ++++++++++++----------------------
 1 file changed, 12 insertions(+), 22 deletions(-)

Index: linux-2.6/drivers/net/forcedeth.c
===================================================================
--- linux-2.6.orig/drivers/net/forcedeth.c
+++ linux-2.6/drivers/net/forcedeth.c
@@ -4082,6 +4082,7 @@ static int nv_set_wol(struct net_device
 		writel(flags, base + NvRegWakeUpFlags);
 		spin_unlock_irq(&np->lock);
 	}
+	device_set_wakeup_enable(&np->pci_dev->dev, np->wolenabled);
 	return 0;
 }
 
@@ -5643,14 +5644,10 @@ static int __devinit nv_probe(struct pci
 	/* set mac address */
 	nv_copy_mac_to_hw(dev);
 
-	/* Workaround current PCI init glitch:  wakeup bits aren't
-	 * being set from PCI PM capability.
-	 */
-	device_init_wakeup(&pci_dev->dev, 1);
-
 	/* disable WOL */
 	writel(0, base + NvRegWakeUpFlags);
 	np->wolenabled = 0;
+	device_set_wakeup_enable(&pci_dev->dev, false);
 
 	if (id->driver_data & DEV_HAS_POWER_CNTRL) {
 
@@ -5923,8 +5920,9 @@ static void __devexit nv_remove(struct p
 }
 
 #ifdef CONFIG_PM
-static int nv_suspend(struct pci_dev *pdev, pm_message_t state)
+static int nv_suspend(struct device *device)
 {
+	struct pci_dev *pdev = to_pci_dev(device);
 	struct net_device *dev = pci_get_drvdata(pdev);
 	struct fe_priv *np = netdev_priv(dev);
 	u8 __iomem *base = get_hwbase(dev);
@@ -5940,25 +5938,17 @@ static int nv_suspend(struct pci_dev *pd
 	for (i = 0;i <= np->register_size/sizeof(u32); i++)
 		np->saved_config_space[i] = readl(base + i*sizeof(u32));
 
-	pci_save_state(pdev);
-	pci_enable_wake(pdev, pci_choose_state(pdev, state), np->wolenabled);
-	pci_disable_device(pdev);
-	pci_set_power_state(pdev, pci_choose_state(pdev, state));
 	return 0;
 }
 
-static int nv_resume(struct pci_dev *pdev)
+static int nv_resume(struct device *device)
 {
+	struct pci_dev *pdev = to_pci_dev(device);
 	struct net_device *dev = pci_get_drvdata(pdev);
 	struct fe_priv *np = netdev_priv(dev);
 	u8 __iomem *base = get_hwbase(dev);
 	int i, rc = 0;
 
-	pci_set_power_state(pdev, PCI_D0);
-	pci_restore_state(pdev);
-	/* ack any pending wake events, disable PME */
-	pci_enable_wake(pdev, PCI_D0, 0);
-
 	/* restore non-pci configuration space */
 	for (i = 0;i <= np->register_size/sizeof(u32); i++)
 		writel(np->saved_config_space[i], base+i*sizeof(u32));
@@ -5977,6 +5967,9 @@ static int nv_resume(struct pci_dev *pde
 	return rc;
 }
 
+static SIMPLE_DEV_PM_OPS(nv_pm_ops, nv_suspend, nv_resume);
+#define NV_PM_OPS (&nv_pm_ops)
+
 static void nv_shutdown(struct pci_dev *pdev)
 {
 	struct net_device *dev = pci_get_drvdata(pdev);
@@ -6000,15 +5993,13 @@ static void nv_shutdown(struct pci_dev *
 	 * only put the device into D3 if we really go for poweroff.
 	 */
 	if (system_state == SYSTEM_POWER_OFF) {
-		if (pci_enable_wake(pdev, PCI_D3cold, np->wolenabled))
-			pci_enable_wake(pdev, PCI_D3hot, np->wolenabled);
+		pci_wake_from_d3(pdev, np->wolenabled);
 		pci_set_power_state(pdev, PCI_D3hot);
 	}
 }
 #else
-#define nv_suspend NULL
+#define NV_PM_OPS NULL
 #define nv_shutdown NULL
-#define nv_resume NULL
 #endif /* CONFIG_PM */
 
 static DEFINE_PCI_DEVICE_TABLE(pci_tbl) = {
@@ -6180,9 +6171,8 @@ static struct pci_driver driver = {
 	.id_table	= pci_tbl,
 	.probe		= nv_probe,
 	.remove		= __devexit_p(nv_remove),
-	.suspend	= nv_suspend,
-	.resume		= nv_resume,
 	.shutdown	= nv_shutdown,
+	.driver.pm	= NV_PM_OPS,
 };
 
 static int __init init_nic(void)

^ permalink raw reply

* [patch] phonet: some signedness bugs
From: Dan Carpenter @ 2011-01-07 20:37 UTC (permalink / raw)
  To: Remi Denis-Courmont
  Cc: David S. Miller, netdev, kernel-janitors, dan.j.rosenberg

Dan Rosenberg pointed out that there were some signed comparison bugs
in the phonet protocol.

http://marc.info/?l=full-disclosure&m=129424528425330&w=2

If you have already have CAP_SYS_ADMIN then you could use the bugs to
get root, or someone could cause an oops by mistake.

Signed-off-by: Dan Carpenter <error27@gmail.com>

diff --git a/include/net/phonet/phonet.h b/include/net/phonet/phonet.h
index d5df797..5395e09 100644
--- a/include/net/phonet/phonet.h
+++ b/include/net/phonet/phonet.h
@@ -107,8 +107,8 @@ struct phonet_protocol {
 	int			sock_type;
 };
 
-int phonet_proto_register(int protocol, struct phonet_protocol *pp);
-void phonet_proto_unregister(int protocol, struct phonet_protocol *pp);
+int phonet_proto_register(unsigned int protocol, struct phonet_protocol *pp);
+void phonet_proto_unregister(unsigned int protocol, struct phonet_protocol *pp);
 
 int phonet_sysctl_init(void);
 void phonet_sysctl_exit(void);
diff --git a/net/phonet/af_phonet.c b/net/phonet/af_phonet.c
index fd95beb..c627d2e 100644
--- a/net/phonet/af_phonet.c
+++ b/net/phonet/af_phonet.c
@@ -37,7 +37,7 @@
 /* Transport protocol registration */
 static struct phonet_protocol *proto_tab[PHONET_NPROTO] __read_mostly;
 
-static struct phonet_protocol *phonet_proto_get(int protocol)
+static struct phonet_protocol *phonet_proto_get(unsigned int protocol)
 {
 	struct phonet_protocol *pp;
 
@@ -60,8 +60,8 @@ static inline void phonet_proto_put(struct phonet_protocol *pp)
 
 /* protocol family functions */
 
-static int pn_socket_create(struct net *net, struct socket *sock, int protocol,
-			    int kern)
+static int pn_socket_create(struct net *net, struct socket *sock,
+			    unsigned int protocol, int kern)
 {
 	struct sock *sk;
 	struct pn_sock *pn;
@@ -458,7 +458,7 @@ static struct packet_type phonet_packet_type __read_mostly = {
 
 static DEFINE_MUTEX(proto_tab_lock);
 
-int __init_or_module phonet_proto_register(int protocol,
+int __init_or_module phonet_proto_register(unsigned int protocol,
 						struct phonet_protocol *pp)
 {
 	int err = 0;
@@ -481,7 +481,7 @@ int __init_or_module phonet_proto_register(int protocol,
 }
 EXPORT_SYMBOL(phonet_proto_register);
 
-void phonet_proto_unregister(int protocol, struct phonet_protocol *pp)
+void phonet_proto_unregister(unsigned int protocol, struct phonet_protocol *pp)
 {
 	mutex_lock(&proto_tab_lock);
 	BUG_ON(proto_tab[protocol] != pp);

^ permalink raw reply related

* Re: [GIT] Networking
From: David Woodhouse @ 2011-01-07 20:34 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Ben Hutchings, David Miller, Hayes Wang, Francois Romieu, akpm,
	netdev, linux-kernel
In-Reply-To: <AANLkTikwrHoOY4wvUp1qn-OT=EjZesSNR29US1ERc0o2@mail.gmail.com>

On Fri, 2011-01-07 at 11:49 -0800, Linus Torvalds wrote:
> [ Apart from the annoyance of also having to manually copying the
> firmware files: why the heck doesn't that firmware tree even have a
> "make firmware-install" makefile or something? The kernel has a "make
> firmware-install" thing, why doesn't the firmware tree itself have
> that?

Yeah, the main reason I haven't yet submitted a patch to remove the
legacy firmware/ directory from the kernel source, as discussed at the
Kernel Summit, is because I need to implement some makefiles for the
firmware tree first.

I'll do a 'make install' for the firmware tree which installs it all,
and lets you specify min/max kernel versions so you can omit stuff that
*really* isn't relevant.

I also want to preserve the CONFIG_FIRMWARE_IN_KERNEL option, but that
can't easily be based on the hardcoded knowledge of the config options;
it wants to be based on the MODULE_FIRMWARE tags of the built-in
drivers. And once we're enumerating those, it should be relatively
simple to warn about missing firmwares in the non-FIRMWARE_IN_KERNEL
case too.

I didn't find the time to do that for the 2.6.38 merge window, but I
should get it done for 2.6.39.

-- 
David Woodhouse                            Open Source Technology Centre
David.Woodhouse@intel.com                              Intel Corporation

^ permalink raw reply

* Re: [GIT] Networking
From: Linus Torvalds @ 2011-01-07 19:49 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: David Miller, Hayes Wang, Francois Romieu, David Woodhouse, akpm,
	netdev, linux-kernel
In-Reply-To: <20110107190656.GQ3702@decadent.org.uk>

On Fri, Jan 7, 2011 at 11:06 AM, Ben Hutchings <benh@debian.org> wrote:
>
> This is because the driver is requesting firmware during probe, before
> there is a firmware agent available (maybe even before / is mounted?)
> rather than when the interface is brought up, as many other network
> drivers do.

Yeah, doing it at device up time would fix the problem.

> At the very least, we ought to make firmware loading fail fast and
> noisily if this happens.

Absolutely.
>

> There were some tables of PHY registers to poke that were a mixture of
> control register writes and transfer of firmware to the microcontroller.
> Hayes and Francois have separated those out.

Ok.

>> I bet I can make it work by making it a module, and installing the
>> firmware manually. But it's supposed to work even without that.
>
> Right.  This is very vaguely based on a patch I applied in Debian
> where we build r8169 as a module, and I don't think anyone reported
> this behaviour.

I just confirmed that building it as a module works

[ Apart from the annoyance of also having to manually copying the
firmware files: why the heck doesn't that firmware tree even have a
"make firmware-install" makefile or something? The kernel has a "make
firmware-install" thing, why doesn't the firmware tree itself have
that? Gaah, this is almost as idiotic as trying to install a
self-built kernel under Ubuntu ]

                        Linus

^ permalink raw reply

* Re: [PATCH 3/3] offloading: Force software GSO for multiple vlan tags.
From: Michał Mirosław @ 2011-01-07 19:36 UTC (permalink / raw)
  To: Jesse Gross; +Cc: David Miller, netdev, Ben Hutchings
In-Reply-To: <1288390495-28923-3-git-send-email-jesse@nicira.com>

Hi,

Sorry for late reply, I noticed this patch only after it went to Linus' tree.

2010/10/30 Jesse Gross <jesse@nicira.com>:
> We currently use vlan_features to check for TSO support if there is
> a vlan tag.  However, it's quite likely that the NIC is not able to
> do TSO when there is an arbitrary number of tags.  Therefore if there
> is more than one tag (in-band or out-of-band), fall back to software
> emulation.
>
> Signed-off-by: Jesse Gross <jesse@nicira.com>
> CC: Ben Hutchings <bhutchings@solarflare.com>
> ---
>  include/linux/netdevice.h |    7 +++----
>  net/core/dev.c            |   16 ++++++++++++++++
>  2 files changed, 19 insertions(+), 4 deletions(-)
>
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index 072652d..980c752 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -2234,6 +2234,8 @@ unsigned long netdev_fix_features(unsigned long features, const char *name);
>  void netif_stacked_transfer_operstate(const struct net_device *rootdev,
>                                        struct net_device *dev);
>
> +int netif_get_vlan_features(struct sk_buff *skb, struct net_device *dev);
> +
>  static inline int net_gso_ok(int features, int gso_type)
>  {
>        int feature = gso_type << NETIF_F_GSO_SHIFT;
> @@ -2249,10 +2251,7 @@ static inline int skb_gso_ok(struct sk_buff *skb, int features)
>  static inline int netif_needs_gso(struct net_device *dev, struct sk_buff *skb)
>  {
>        if (skb_is_gso(skb)) {
> -               int features = dev->features;
> -
> -               if (skb->protocol == htons(ETH_P_8021Q) || skb->vlan_tci)
> -                       features &= dev->vlan_features;
> +               int features = netif_get_vlan_features(skb, dev);
>
>                return (!skb_gso_ok(skb, features) ||
>                        unlikely(skb->ip_summed != CHECKSUM_PARTIAL));
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 8bdda70..8d74988 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -1969,6 +1969,22 @@ static inline void skb_orphan_try(struct sk_buff *skb)
>        }
>  }
>
> +int netif_get_vlan_features(struct sk_buff *skb, struct net_device *dev)
> +{
> +       __be16 protocol = skb->protocol;
> +
> +       if (protocol == htons(ETH_P_8021Q)) {
> +               struct vlan_ethhdr *veh = (struct vlan_ethhdr *)skb->data;
> +               protocol = veh->h_vlan_encapsulated_proto;
> +       } else if (!skb->vlan_tci)
> +               return dev->features;
> +
> +       if (protocol != htons(ETH_P_8021Q))
> +               return dev->features & dev->vlan_features;
> +       else
> +               return 0;
> +}

This clears all features for multiply-tagged frames. At least SG,
FRAGLIST and HW_CSUM are perfectly valid for those frames.

This doesn't really matter if this function stays used only in
netif_needs_gso(). It's name and placement suggests otherwise, though.

Best Regards,
Michał Mirosław

^ permalink raw reply

* RE: [net-next 08/12] ixgb: convert to new VLAN model
From: Tantilov, Emil S @ 2011-01-07 19:30 UTC (permalink / raw)
  To: Jesse Gross, Kirsher, Jeffrey T
  Cc: davem@davemloft.net, netdev@vger.kernel.org, gosp@redhat.com,
	bphilips@novell.com
In-Reply-To: <AANLkTik1rCFRtBWov5f8yfU+4JZnbzLHgcmRC1y_+TDP@mail.gmail.com>

Jesse Gross wrote:
> On Thu, Jan 6, 2011 at 7:29 PM,  <jeffrey.t.kirsher@intel.com> wrote:
>> +static int ixgb_set_flags(struct net_device *netdev, u32 data) +{
>> +       struct ixgb_adapter *adapter = netdev_priv(netdev); +      
>> bool need_reset; +       int rc;
>> +
>> +       /*
>> +        * TX vlan insertion does not work per HW design when Rx
>> stripping is +        * disabled.  Disable txvlan when rxvlan is
>> off. +        */ +       if ((data & ETH_FLAG_RXVLAN) !=
>> (netdev->features & NETIF_F_HW_VLAN_RX)) +               data ^=
>> ETH_FLAG_TXVLAN; 
> 
> Does this really do the right thing?  If the RX vlan setting is
> changed, it will do the opposite of what the user requested for TX
> vlan?
> 
> So if I start with both on (the default) and turn them both off in one
> command (a valid setting), I will get RX off and TX on (an invalid
> setting).
> 
> Why not:
> 
> if (!(data & ETH_FLAG_RXVLAN))
>         data &= ~ETH_FLAG_TXVLAN;

Ah, you're right. We missed this in testing.

I will spin another patch.

Thanks for all your help.

Emil

^ permalink raw reply

* [RFC] sched: QFQ - quick fair queue scheduler (iproute2)
From: Stephen Hemminger @ 2011-01-07 19:24 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: David Miller, Eric Dumazet, Fabio Checconi, netdev, Luigi Rizzo
In-Reply-To: <20110106195614.20dbc402@nehalam>

This is the iproute2 interface for QFQ. It is spartan at this point
and needs a man page.

---
 include/linux/pkt_sched.h |   14 +++++
 tc/Makefile               |    1 +
 tc/q_qfq.c                |  123 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 138 insertions(+), 0 deletions(-)
 create mode 100644 tc/q_qfq.c

diff --git a/include/linux/pkt_sched.h b/include/linux/pkt_sched.h
index 2cfa4bc..c3ca324 100644
--- a/include/linux/pkt_sched.h
+++ b/include/linux/pkt_sched.h
@@ -481,4 +481,18 @@ struct tc_drr_stats {
 	__u32	deficit;
 };
 
+/* QFQ */
+enum {
+	TCA_QFQ_WEIGHT,
+	TCA_QFQ_LMAX,
+	__TCA_QFQ_MAX
+};
+
+#define TCA_QFQ_MAX	(__TCA_QFQ_MAX - 1)
+
+struct tc_qfq_stats {
+	__u32 weight;
+	__u32 lmax;
+};
+
 #endif
diff --git a/tc/Makefile b/tc/Makefile
index 101cc83..715b34e 100644
--- a/tc/Makefile
+++ b/tc/Makefile
@@ -29,6 +29,7 @@ TCMODULES += q_ingress.o
 TCMODULES += q_hfsc.o
 TCMODULES += q_htb.o
 TCMODULES += q_drr.o
+TCMODULES += q_qfq.o
 TCMODULES += m_gact.o
 TCMODULES += m_mirred.o
 TCMODULES += m_nat.o
diff --git a/tc/q_qfq.c b/tc/q_qfq.c
new file mode 100644
index 0000000..ad77f45
--- /dev/null
+++ b/tc/q_qfq.c
@@ -0,0 +1,123 @@
+/*
+ * q_qfq.c	QFQ.
+ *
+ *		This program is free software; you can redistribute it and/or
+ *		modify it under the terms of the GNU General Public License
+ *		as published by the Free Software Foundation; either version
+ *		2 of the License, or (at your option) any later version.
+ *
+ * Authors:	Stephen Hemminger <shemminger@vyatta.com>
+ *		Fabio Checconi <fabio@gandalf.sssup.it>
+ *
+ */
+#include <syslog.h>
+#include <fcntl.h>
+#include <sys/socket.h>
+#include <netinet/in.h>
+#include <arpa/inet.h>
+#include <string.h>
+
+#include "utils.h"
+#include "tc_util.h"
+
+static void explain(void)
+{
+	fprintf(stderr, "Usage: ... qfq\n");
+}
+
+static void explain1(const char *arg)
+{
+	fprintf(stderr, "Illegal \"%s\"\n", arg);
+}
+
+static void explain_class(void)
+{
+	fprintf(stderr, "Usage: ... qfq weight NUMBER maxpkt BYTES\n");
+}
+
+static int qfq_parse_opt(struct qdisc_util *qu, int argc, char **argv, 
+			 struct nlmsghdr *n)
+{
+	while (argc > 0) {
+		if (matches(*argv, "help") == 0) {
+			explain();
+			return -1;
+		} else {
+			fprintf(stderr, "What is \"%s\"?\n", *argv);
+			explain();
+			return -1;
+		}
+		argc--; argv++;
+	}
+
+	return 0;
+}
+
+static int qfq_parse_class_opt(struct qdisc_util *qu, int argc, char **argv,
+			       struct nlmsghdr *n)
+{
+	struct rtattr *tail;
+	__u32 tmp;
+
+	tail = NLMSG_TAIL(n);
+	addattr_l(n, 4096, TCA_OPTIONS, NULL, 0);
+
+	while (argc > 0) {
+		if (matches(*argv, "weight") == 0) {
+			NEXT_ARG();
+			if (get_u32(&tmp, *argv, 10)) {
+				explain1("weight"); return -1;
+			}
+			addattr32(n, 4096, TCA_QFQ_WEIGHT, tmp);
+		} else if (matches(*argv, "maxpkt") == 0) {
+			NEXT_ARG();
+			if (get_u32(&tmp, *argv, 10)) {
+				explain1("maxpkt"); return -1;
+			}
+			addattr32(n, 4096, TCA_QFQ_LMAX, tmp);
+		} else if (strcmp(*argv, "help") == 0) {
+			explain_class();
+			return -1;
+		} else {
+			fprintf(stderr, "What is \"%s\"?\n", *argv);
+			explain_class();
+			return -1;
+		}
+		argc--; argv++;
+	}
+
+	tail->rta_len = (void *)NLMSG_TAIL(n) - (void *)tail;
+
+	return 0;
+}
+
+static int qfq_print_opt(struct qdisc_util *qu, FILE *f, struct rtattr *opt)
+{
+	struct rtattr *tb[TCA_QFQ_MAX + 1];
+
+	if (opt == NULL)
+		return 0;
+
+	parse_rtattr_nested(tb, TCA_QFQ_MAX, opt);
+
+	if (tb[TCA_QFQ_WEIGHT]) {
+		fprintf(f, "weight %u ", 
+			*(__u32 *)RTA_DATA(tb[TCA_QFQ_WEIGHT]));
+	}
+
+	if (tb[TCA_QFQ_LMAX]) {
+		fprintf(f, "maxpkt %u ", 
+			*(__u32 *)RTA_DATA(tb[TCA_QFQ_LMAX]));
+	}
+
+	return 0;
+}
+
+struct qdisc_util qfq_qdisc_util = {
+	.id		= "qfq",
+	.parse_qopt	= qfq_parse_opt,
+	.print_qopt	= qfq_print_opt,
+	.parse_copt	= qfq_parse_class_opt,
+	.print_copt	= qfq_print_opt,
+};
+
-- 
1.7.1


^ permalink raw reply related

* Re: [GIT] Networking
From: Ben Hutchings @ 2011-01-07 19:06 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: David Miller, Hayes Wang, Francois Romieu, David Woodhouse, akpm,
	netdev, linux-kernel
In-Reply-To: <AANLkTi=7=_FyVsjA3vKEpg8RwfC=m-8QGfqa6ctxQday@mail.gmail.com>

On Fri, Jan 07, 2011 at 10:46:48AM -0800, Linus Torvalds wrote:
> On Thu, Jan 6, 2011 at 12:20 PM, David Miller <davem@davemloft.net> wrote:
> >
> > Plus the usual spattering of wireless, bluetooth, and wired driver
> > updates.
> 
> Grr.
> 
> This breaks booting for me.
> 
> [torvalds@i5 linux]$ git bisect good
> bca03d5f32c8ee9b5cfa1d32640a63fded6cb3c0 is the first bad commit
> commit bca03d5f32c8ee9b5cfa1d32640a63fded6cb3c0
> Author: françois romieu <romieu@fr.zoreil.com>
> Date:   Mon Jan 3 15:07:31 2011 +0000
> 
>     r8169: remove the firmware of RTL8111D.
> 
>     The binary file of the firmware is moved to linux-firmware repository.
>     The firmwares are rtl_nic/rtl8168d-1.fw and rtl_nic/rtl8168d-2.fw.
>     The driver goes along if the firmware couldn't be found. However, it
>     is suggested to be done with the suitable firmware.
> 
>     Some wrong PHY parameters are directly corrected in the driver.
> 
>     Simple firmware checking added per Ben Hutchings suggestion.
> 
>     Signed-off-by: Hayes Wang <hayeswang@realtek.com>
>     Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
>     Cc: Ben Hutchings <benh@debian.org>
>     Signed-off-by: David S. Miller <davem@davemloft.net>
> 
> and the behavior is very broken: it just hangs at boot-time. No
> messages from the driver (certainly not any messages about missing
> firmware), no nothing. The thing is just hung.

This is because the driver is requesting firmware during probe, before
there is a firmware agent available (maybe even before / is mounted?)
rather than when the interface is brought up, as many other network
drivers do.

At the very least, we ought to make firmware loading fail fast and
noisily if this happens.

[...]
> Quite frankly, that commit looks broken anyway. It doesn't just switch
> to the firmware loader, it also seems to change other things (ie
> removed some mdio writes, added others).

There were some tables of PHY registers to poke that were a mixture of
control register writes and transfer of firmware to the microcontroller.
Hayes and Francois have separated those out.

[...]
> I bet I can make it work by making it a module, and installing the
> firmware manually. But it's supposed to work even without that.
 
Right.  This is very vaguely based on a patch I applied in Debian
where we build r8169 as a module, and I don't think anyone reported
this behaviour.

Ben.

-- 
Ben Hutchings
We get into the habit of living before acquiring the habit of thinking.
                                                              - Albert Camus

^ permalink raw reply

* Re: [GIT] Networking
From: Linus Torvalds @ 2011-01-07 18:46 UTC (permalink / raw)
  To: David Miller, Hayes Wang, Francois Romieu, Ben Hutchings,
	David Woodhouse
  Cc: akpm, netdev, linux-kernel
In-Reply-To: <20110106.122003.233698077.davem@davemloft.net>

On Thu, Jan 6, 2011 at 12:20 PM, David Miller <davem@davemloft.net> wrote:
>
> Plus the usual spattering of wireless, bluetooth, and wired driver
> updates.

Grr.

This breaks booting for me.

[torvalds@i5 linux]$ git bisect good
bca03d5f32c8ee9b5cfa1d32640a63fded6cb3c0 is the first bad commit
commit bca03d5f32c8ee9b5cfa1d32640a63fded6cb3c0
Author: françois romieu <romieu@fr.zoreil.com>
Date:   Mon Jan 3 15:07:31 2011 +0000

    r8169: remove the firmware of RTL8111D.

    The binary file of the firmware is moved to linux-firmware repository.
    The firmwares are rtl_nic/rtl8168d-1.fw and rtl_nic/rtl8168d-2.fw.
    The driver goes along if the firmware couldn't be found. However, it
    is suggested to be done with the suitable firmware.

    Some wrong PHY parameters are directly corrected in the driver.

    Simple firmware checking added per Ben Hutchings suggestion.

    Signed-off-by: Hayes Wang <hayeswang@realtek.com>
    Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
    Cc: Ben Hutchings <benh@debian.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>

and the behavior is very broken: it just hangs at boot-time. No
messages from the driver (certainly not any messages about missing
firmware), no nothing. The thing is just hung.

On a working setup, I have


  [    1.595071] r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded
  [    1.595114] r8169 0000:01:00.0: PCI INT A -> GSI 19 (level, low) -> IRQ 19
  [    1.595174] r8169 0000:01:00.0: setting latency timer to 64
  [    1.595227] r8169 0000:01:00.0: irq 42 for MSI/MSI-X
  [    1.595770] r8169 0000:01:00.0: eth0: RTL8168d/8111d at
0xffffc90000068000, e0:cb:4e:95:1a:d7, XID 083000c0 IRQ 42
  ...
  [    7.985917] r8169 0000:01:00.0: eth0: link up
  [    7.987525] r8169 0000:01:00.0: eth0: link up

while on a non-working setup, I get that XID line, and after that a
few other init routines continue to show up (probably just
multi-threaded initcalls), but the box is dead.

Quite frankly, that commit looks broken anyway. It doesn't just switch
to the firmware loader, it also seems to change other things (ie
removed some mdio writes, added others).

What's going on? This is NOT how things are supposed to happen. Yes,
the firmware file is in the external linux-firmware repo, but that
doesn't help if the kernel build won't even notice and pick it up (or
warn about it not being found). Also, the external linux-firmware repo
isn't even support any sane installation scripts, nor does it work
with the "build automatically into the kernel" mode.

We also have those CONFIG_FIRMWARE_IN_KERNEL and CONFIG_STANDALONE
flags, and the build apparently failed both of those.

Not acceptable. I'm ok with an external firmware repository, but only
if it _works_. Right now it doesn't. It just seems to cause silent
failures: there were no warnings about missing firmware at ANY time.
Not build-time, not run-time.

I bet I can make it work by making it a module, and installing the
firmware manually. But it's supposed to work even without that.

                             Linus

^ permalink raw reply

* [ANNOUNCE] iproute2 2.6.37
From: Stephen Hemminger @ 2011-01-07 18:24 UTC (permalink / raw)
  To: netdev, linux-net, linux-kernel

This version of iproute2 utilities intended for use with 2.6.37 or
later kernel, but should be backward compatible with older releases.
Most of the changes are minor cleanups and fixes; new features
are:
   * support for flow matching based on rxhash
   * checksum update action

Source: (note old developer.osdl.org address works as well)
  http://devresources.linuxfoundation.org/dev/iproute2/download/iproute2-2.6.37.tar.bz2

Repository:
  git://git.kernel.org/pub/scm/linux/kernel/git/shemminger/iproute2.git

For more info on iproute2 see:
  http://www.linuxfoundation.org/collaborate/workgroups/networking/iproute2

Report problems (or enhancements) to the netdev@vger.kernel.org mailing list.

Andreas Schwab (1):
      iproute2: remove useless use of buffer

Ben Greear (2):
      iproute2: Fix filtering related to flushing IP addresses.
      Allow 'ip addr flush' to loop more than 10 times

Changli Gao (1):
      iproute2: tc: f_flow: add key rxhash

Dan Smith (1):
      Add ip route save/restore

Eric Dumazet (2):
      ip: add RTA_MARK support
      iproute2: add 64bit support to ifstat

Gerrit Renker (1):
      tc-red: typo in man page

Gregoire Baron (1):
      tc: add ACT_CSUM action support (csum)

Mike Frysinger (1):
      m_xt: stop using xtables_set_revision()

Octavian Purdila (1):
      iproute2: initialize the ll_map only once

Petr Sabata (3):
      ss(8) improvements by Jiri Popelka <jpopelka@redhat.com>
      ss: Change "do now" to "do not" in ss(8), -n option
      ip: Few typo and grammar errors fixes for ip(8) manpage

Sridhar Samudrala (1):
      Support 'mode' parameter when creating macvtap device

Stephen Hemminger (11):
      Snapshot for 2.6.35.1
      Update kernel headers to 2.6.36-rc2
      Use correct rt_link_statistics
      Fix GRED options clearing
      Update to 2.6.36 headers
      Workaround for repeated distclean
      Use standard routines for interface name to index etc
      Increase size of ifindex hash heads
      Cleanup ll_map
      Update to 2.6.37-rc8 headers
      v2.6.37

Timo Teräs (2):
      iproute2: treat gre key as number
      iproute2: support xfrm upper protocol gre key

Ulrich Weber (2):
      iproute2: dont filter cached routes on iproute_get
      iproute2: display xfrm socket policy direction


^ permalink raw reply

* Re: 2.6.37 vlans on bnx2 not functional, panic with tcpdump
From: Iain Paton @ 2011-01-07 18:01 UTC (permalink / raw)
  To: Michael Chan; +Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <1294422653.14051.11.camel@HP1>

Michael Chan wrote:

> The kernel has undergone major VLAN changes at this time, so there may
> be a problem depending on what stripping mode we're in.  The tg3 driver
> is having some similar problems that haven't been completely resolved
> yet.

It seems to be related to generic vlan changes, I've duplicated it on an Intel 82547L nic in a completely different system connected 
to a different switch.

So you can consider any bnx2 cause for this closed. I'll get back to the list if I can come up with any more useful info.

Thanks for your help,
Iain

^ permalink raw reply

* Re: 2.6.37 vlans on bnx2 not functional, panic with tcpdump
From: Michael Chan @ 2011-01-07 17:50 UTC (permalink / raw)
  To: selsinork; +Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <AANLkTi=PGPVRbS3XViskpZ0GSS8ouBcgVZYYG9EoM-Nz@mail.gmail.com>


On Fri, 2011-01-07 at 00:57 -0800, selsinork wrote:
> On Fri, Jan 7, 2011 at 12:46 AM, Michael Chan <mchan@broadcom.com>
> wrote:
>         
>         
>         May be you have management firmware running on your devices
>         that can
>         change the behavior.  Can you provide ethtool -i eth0 on both
>         bnx2
>         devices on your system?
> 
> This particular system has four onboard ports, a two port add-in card
> and a single port fiber card all using bnx2, so I have some options to
> try different devices if there's something different about them.
> Details of all of them below.
> 
> Booting the same system back to 2.6.36 with the patch I mentioned
> previously leaves me with a functioning network, so given it could be
> management firmware related, why does it work on .36 but not .37 ?  

Management firmware affects the stripping/unstripping of VLAN tags.  If
management firmware is enabled, we need to configure the chip to always
strip the VLAN tag, otherwise management firmware will not receive the
packets properly.

The kernel has undergone major VLAN changes at this time, so there may
be a problem depending on what stripping mode we're in.  The tg3 driver
is having some similar problems that haven't been completely resolved
yet.

Anyway, I'll wait for you to narrow this down further.  I'll also try to
do some additional testing myself.

You have management firmware (NCSI) enabled on eth0 and eth1, but not
the other devices (the firmware shown is NVRAM firmware).  If it's still
a problem, please see if there's a difference in behavior between the
devices with NCSI enabled and disabled.

Thanks.

>  The install is very stripped down, there's no udev or /sbin/hotplug
> loading firmware behind my back as neither are installed. So firmware
> is whatever comes with the kernel. Just having had a look, I see the
> firmware provided with the kernel has changed in .37, but the output
> of ethtool -i shows the same firmware being used with .36 and .37
> 
> Iain
> 
> 02:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709
> Gigabit Ethernet (rev 20)
> 02:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709
> Gigabit Ethernet (rev 20)
> 03:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709
> Gigabit Ethernet (rev 20)
> 03:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709
> Gigabit Ethernet (rev 20)
> 04:00.0 RAID bus controller: Hewlett-Packard Company Smart Array G6
> controllers (rev 01)
> 07:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709
> Gigabit Ethernet (rev 20)
> 07:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709
> Gigabit Ethernet (rev 20)
> 0a:00.0 PCI bridge: Broadcom EPB PCI-Express to PCI-X Bridge (rev c3)
> 0b:00.0 Ethernet controller: Broadcom Corporation NetXtreme II
> BCM5708S Gigabit Ethernet (rev 12)
> 
> from 2.6.37:
> 
> root@64bit:~# ethtool -i eth0
> driver: bnx2
> version: 2.0.17
> firmware-version: bc 5.2.2 NCSI 2.0.6
> bus-info: 0000:02:00.0
> root@64bit:~# ethtool -i eth1
> driver: bnx2
> version: 2.0.17
> firmware-version: bc 5.2.2 NCSI 2.0.6
> bus-info: 0000:02:00.1
> root@64bit:~# ethtool -i eth2
> driver: bnx2
> version: 2.0.17
> firmware-version: bc 5.2.2
> bus-info: 0000:03:00.0
> root@64bit:~# ethtool -i eth3
> driver: bnx2
> version: 2.0.17
> firmware-version: bc 5.2.2
> bus-info: 0000:03:00.1
> root@64bit:~# ethtool -i eth4
> driver: bnx2
> version: 2.0.17
> firmware-version: bc 5.2.2
> bus-info: 0000:07:00.0
> root@64bit:~# ethtool -i eth5
> driver: bnx2
> version: 2.0.17
> firmware-version: bc 5.2.2
> bus-info: 0000:07:00.1
> root@64bit:~# ethtool -i eth6
> driver: bnx2
> version: 2.0.17
> firmware-version: bc 1.9.6
> bus-info: 0000:0b:00.0
>  
> from 2.6.36:
> 
>  ethtool -i eth1
> driver: bnx2
> version: 2.0.17
> firmware-version: bc 5.2.2 NCSI 2.0.6
> bus-info: 0000:02:00.1
> 

^ permalink raw reply

* Re: [PATCH 2/2] sky2: convert to new VLAN model
From: Stephen Hemminger @ 2011-01-07 17:27 UTC (permalink / raw)
  To: Jesse Gross; +Cc: David Miller, netdev
In-Reply-To: <AANLkTikwGQFByOZGgCCjTJySPa8QYndZ903CFmOkS1Ha@mail.gmail.com>

On Fri, 7 Jan 2011 12:14:51 -0500
Jesse Gross <jesse@nicira.com> wrote:

> On Thu, Jan 6, 2011 at 11:41 PM, Stephen Hemminger
> <shemminger@vyatta.com> wrote:
> > +/* Features available on VLAN with transmit tag stripped */
> > +#define VLAN_FEAT (NETIF_F_SG | NETIF_F_IP_CSUM | NETIF_F_TSO)
> > +
> > +static void sky2_vlan_mode(struct net_device *dev)
> >  {
> > -       if (onoff) {
> > +       struct sky2_port *sky2 = netdev_priv(dev);
> > +       struct sky2_hw *hw = sky2->hw;
> > +       u16 port = sky2->port;
> > +
> > +       if (dev->features & NETIF_F_HW_VLAN_RX)
> >                sky2_write32(hw, SK_REG(port, RX_GMF_CTRL_T),
> >                             RX_VLAN_STRIP_ON);
> > +       else
> > +               sky2_write32(hw, SK_REG(port, RX_GMF_CTRL_T),
> > +                            RX_VLAN_STRIP_OFF);
> > +
> > +       if (dev->features & NETIF_F_HW_VLAN_TX) {
> >                sky2_write32(hw, SK_REG(port, TX_GMF_CTRL_T),
> >                             TX_VLAN_TAG_ON);
> > +               dev->vlan_features = dev->features & VLAN_FEAT;
> >        } else {
> > -               sky2_write32(hw, SK_REG(port, RX_GMF_CTRL_T),
> > -                            RX_VLAN_STRIP_OFF);
> >                sky2_write32(hw, SK_REG(port, TX_GMF_CTRL_T),
> >                             TX_VLAN_TAG_OFF);
> > +               dev->vlan_features = dev->features & NETIF_F_HIGHDMA;
> 
> Hmm, the chip supports SG only when TX vlan is on and HIGHDMA only
> when it is off?  Currently skb_needs_linearize() assumes that when not
> using vlan acceleration, the DMA engine doesn't care about the
> presence of a vlan tag and directly uses dev->features.

The chip supports checksum offload only if TX_VLAN is enabled.
Scatter/gather without checksum offload is not allowed by kernel
because checksum offload is needed for sendfile.

highdma should always be on

> Also, can't we enable NETIF_F_GRO in vlan_features?  It will be set
> initially by register_netdevice() but if we change the flags it gets
> blown away.

I will revise. GRO is independent of all this.


-- 

^ permalink raw reply

* Re: [PATCH 2/2] sky2: convert to new VLAN model
From: Jesse Gross @ 2011-01-07 17:14 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: David Miller, netdev
In-Reply-To: <20110106204146.16b3a328@nehalam>

On Thu, Jan 6, 2011 at 11:41 PM, Stephen Hemminger
<shemminger@vyatta.com> wrote:
> +/* Features available on VLAN with transmit tag stripped */
> +#define VLAN_FEAT (NETIF_F_SG | NETIF_F_IP_CSUM | NETIF_F_TSO)
> +
> +static void sky2_vlan_mode(struct net_device *dev)
>  {
> -       if (onoff) {
> +       struct sky2_port *sky2 = netdev_priv(dev);
> +       struct sky2_hw *hw = sky2->hw;
> +       u16 port = sky2->port;
> +
> +       if (dev->features & NETIF_F_HW_VLAN_RX)
>                sky2_write32(hw, SK_REG(port, RX_GMF_CTRL_T),
>                             RX_VLAN_STRIP_ON);
> +       else
> +               sky2_write32(hw, SK_REG(port, RX_GMF_CTRL_T),
> +                            RX_VLAN_STRIP_OFF);
> +
> +       if (dev->features & NETIF_F_HW_VLAN_TX) {
>                sky2_write32(hw, SK_REG(port, TX_GMF_CTRL_T),
>                             TX_VLAN_TAG_ON);
> +               dev->vlan_features = dev->features & VLAN_FEAT;
>        } else {
> -               sky2_write32(hw, SK_REG(port, RX_GMF_CTRL_T),
> -                            RX_VLAN_STRIP_OFF);
>                sky2_write32(hw, SK_REG(port, TX_GMF_CTRL_T),
>                             TX_VLAN_TAG_OFF);
> +               dev->vlan_features = dev->features & NETIF_F_HIGHDMA;

Hmm, the chip supports SG only when TX vlan is on and HIGHDMA only
when it is off?  Currently skb_needs_linearize() assumes that when not
using vlan acceleration, the DMA engine doesn't care about the
presence of a vlan tag and directly uses dev->features.

Also, can't we enable NETIF_F_GRO in vlan_features?  It will be set
initially by register_netdevice() but if we change the flags it gets
blown away.

^ permalink raw reply

* Re: [PATCH V8 12/13] ptp: Added a clock driver for the IXP46x.
From: Richard Cochran @ 2011-01-07 17:07 UTC (permalink / raw)
  To: Krzysztof Halasa
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA, netdev-u79uwXL29TY76Z2rM5mHXA,
	Alan Cox, Arnd Bergmann, Christoph Lameter, David Miller,
	John Stultz, Peter Zijlstra, Rodolfo Giometti, Thomas Gleixner
In-Reply-To: <m3ei8plpa0.fsf-gTjgKJgtlHj77SC2UrCW1FMQynFLKtET@public.gmane.org>

Krzysztof,

Thanks for your review. I have some responses, below.

But before that, I have a big favor to ask you. Can you please look at
the TODO in my patch and give me a hint? I don't know how to relate
the port instance (NPE A,B,C) to the two time stamping channels.

On Thu, Jan 06, 2011 at 10:01:59PM +0100, Krzysztof Halasa wrote:
> Richard Cochran <richardcochran-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:
> > +	u32 SrcUUIDHi;  /* 0x5C Sequence Identifier/Source UUID0 High */
> 
> I don't like these XxxYyyZzz either :-(

Okay, I'll change that.

> > +static void do_tx_timestamp(struct port *port, struct sk_buff *skb)
> > +{
> > +#ifdef __ARMEB__
...
> > +#endif
> > +}
> 
> And what if we're little-endian?

It is a NOOP in that case.

> Why does it depend on BE?

The time stamp code clones the skb, but the LE version frees the skb
too early. Perhaps we can move that dev_kfree_skb(skb) in the LE case
to be the last statement in eth_xmit(). What do you think?

> 
> > @@ -1171,6 +1357,11 @@ static int __devinit eth_init_one(struct platform_device *pdev)
> >  	char phy_id[MII_BUS_ID_SIZE + 3];
> >  	int err;
> >  
> > +	if (ptp_filter_init(ptp_filter, ARRAY_SIZE(ptp_filter))) {
> > +		pr_err("ixp4xx_eth: bad ptp filter\n");
> > +		return -EINVAL;
> > +	}
> > +
> >  	if (!(dev = alloc_etherdev(sizeof(struct port))))
> >  		return -ENOMEM;
> 
> Shouldn't it depend on CPU type?

It won't hurt to init the BPF unconditionally. The important bits are
checked with cpu_is_ixp46x().

> BTW which CPU is required? IXP46x (455/460/465)? Does it work on 43x?

IIRC, it does not work on 43x.

I don't know about 455 and 460, but I can find out...

> > +	if (NO_IRQ == irq)
> > +		return NO_IRQ;
> 
> Don't like these either :-(

Do you mean, you don't like the constant on the left hand side?

Is that prohibited by CodingStyle or similar?

I got into the habit of writing it that way to prevent a typo like:

	if (irq = NO_IRQ)

> Not showstoppers but...
> 
> Also I don't like the ixp_read/ixp_write() trivial macros. Why not
> simply call __raw_readl() and __raw_writel()?

Well, I have had the experience back in 2.4 days of having my drivers
ruined by the changing IO macros in the kernel. The wrappers are
supposed to help if that ever happens again. Seeing *two* leading
underscores in the macro names certainly makes me nervous.

Thanks again,

Richard

^ permalink raw reply

* Re: [PATCH V8 09/13] posix clocks: introduce dynamic clocks
From: Richard Cochran @ 2011-01-07 16:41 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: linux-kernel, linux-api, netdev, Alan Cox, Christoph Lameter,
	David Miller, John Stultz, Krzysztof Halasa, Peter Zijlstra,
	Rodolfo Giometti, Thomas Gleixner
In-Reply-To: <201101062056.22423.arnd@arndb.de>

On Thu, Jan 06, 2011 at 08:56:22PM +0100, Arnd Bergmann wrote:
> It should be just a trivial change and just affect how easy it is for
> other people to understand the code if they are already familiar
> with other kernel code.
 
Okay, I'll get right on it.

> Overall, your series looks really good now, it would be nice if this
> could still make it into 2.6.38.

Yes, that would be great.

Thanks,
Richard

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox