Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [PATCH net-next-2.6] net: sysfs: ethtool_ops can be NULL
From: David Miller @ 2009-10-28 11:02 UTC (permalink / raw)
  To: andy; +Cc: eric.dumazet, netdev
In-Reply-To: <20091026134033.GD1639@gospo.rdu.redhat.com>

From: Andy Gospodarek <andy@greyhouse.net>
Date: Mon, 26 Oct 2009 09:40:33 -0400

> On Mon, Oct 26, 2009 at 12:23:33PM +0100, Eric Dumazet wrote:
>> commit d519e17e2d01a0ee9abe083019532061b4438065
>> (net: export device speed and duplex via sysfs)
>> made the wrong assumption that netdev->ethtool_ops was always set.
>> 
>> This makes possible to crash kernel and let rtnl in locked state.
>> 
>> modprobe dummy
>> ip link set dummy0 up
>> (udev runs and crash)
>> 
>> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
 ...
> Nice catch, Eric.
> 
> Acked-by: Andy Gospodarek <andy@greyhouse.net>

Applied.

^ permalink raw reply

* Re: [PATCH] net: Corrected spelling error heurestics->heuristics
From: David Miller @ 2009-10-28 11:02 UTC (permalink / raw)
  To: apetlund; +Cc: netdev, trivial, linux-kernel, ilpo.jarvinen
In-Reply-To: <4AE6F539.1020107@simula.no>

From: Andreas Petlund <apetlund@simula.no>
Date: Tue, 27 Oct 2009 14:27:21 +0100

> Corrected a spelling error in a function name.
> 
> Signed-off-by: Andreas Petlund <apetlund@simula.no>

Applied to net-next-2.6, thanks.

^ permalink raw reply

* Re: [PATCH] virtio-net: fix data corruption with OOM
From: David Miller @ 2009-10-28 11:03 UTC (permalink / raw)
  To: rusty; +Cc: netdev, mst
In-Reply-To: <200910282126.58902.rusty@rustcorp.com.au>

From: Rusty Russell <rusty@rustcorp.com.au>
Date: Wed, 28 Oct 2009 21:26:58 +1030

> On Tue, 27 Oct 2009 11:57:20 am you wrote:
>> Anything in a reply to a patch that looks like a signoff or ACK,
>> patchwork adds to the commit message in the mbox blob it spits out for
>> me.
> 
> In case this got lost in the meta-discussion:

Applied, thanks.

^ permalink raw reply

* Re: [PATCH NEXT 0/6] netxen: changes for new chip
From: David Miller @ 2009-10-28 11:11 UTC (permalink / raw)
  To: dhananjay; +Cc: netdev
In-Reply-To: <1256436243-5736-1-git-send-email-dhananjay@netxen.com>

From: Dhananjay Phadke <dhananjay@netxen.com>
Date: Sat, 24 Oct 2009 19:03:57 -0700

> Series of 6 patches for net-next-2.6, please apply.

All applied, thanks.

^ permalink raw reply

* Re: [PATCH] vmxnet3: remove duplicated #include
From: David Miller @ 2009-10-28 11:13 UTC (permalink / raw)
  To: sbhatewara; +Cc: netdev, weiyi.huang, pv-drivers
In-Reply-To: <alpine.LRH.2.00.0910221634130.23769@sbhatewara-dev1.eng.vmware.com>

From: Shreyas Bhatewara <sbhatewara@vmware.com>
Date: Thu, 22 Oct 2009 16:58:33 -0700 (PDT)

> 
> 
> Remove duplicate headerfile includes from vmxnet3_int.h
> 
> Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com>
> Signed-off-by: Shreyas Bhatewara <sbhatewara@vmware.com>
> Signed-off-by: Bhavesh Davda <davda@vmware.com>

This patch doesn't apply to net-next-2.6, please resend.

^ permalink raw reply

* Re: [net-next-2.6 PATCH] e100: Fix to allow systems with FW based cards to resume from STD
From: David Miller @ 2009-10-28 11:14 UTC (permalink / raw)
  To: jeffrey.t.kirsher; +Cc: netdev, gospo, david.graham
In-Reply-To: <20091023025904.7057.58001.stgit@localhost.localdomain>

From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Date: Thu, 22 Oct 2009 19:59:29 -0700

> From: David Graham <david.graham@intel.com>
> 
> Devices with loadable firmware must have their firmware reloaded
> after the system resumes from sleep, but the request_firmare()
> API is not available at this point in the resume flow because
> tasks are not yet running, and the system will hang if it is
> called. Work around this issue by only calling request_firmware()
> for a device's first firmware load, and cache a copy of the pointer
> to the firmware blob for that device, so that we may reload firmware
> images even during resume.
> 
> Signed-off-by: David Graham <david.graham@intel.com>
> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

Applied, thanks.

^ permalink raw reply

* Re: [net-next-2.6 PATCH] be2net:Changes to update ethtool get_settings function to return appropriate values.
From: David Miller @ 2009-10-28 11:15 UTC (permalink / raw)
  To: sarveshwarb; +Cc: netdev
In-Reply-To: <20091022132949.GA23701@serverengines.com>

From: Sarveshwar Bandi <sarveshwarb@serverengines.com>
Date: Thu, 22 Oct 2009 19:00:00 +0530

> Update ethtool get_settings function to:
> - get current link speed settings from controller
> - get port transceiver type from controller
> - fill appropriate values for supported, phy_address
> 
> Signed-off-by: Sarveshwar Bandi <sarveshwarb@serverengines.com>

Applied, thanks.

^ permalink raw reply

* Re: [PATCH 0/5] Candidate fix for increased number of GFP_ATOMIC failures V2
From: Karol Lewandowski @ 2009-10-28 11:42 UTC (permalink / raw)
  To: Mel LKML
  Cc: Karol Lewandowski, Mel Gorman, Frans Pop, Jiri Kosina,
	Sven Geggus, Tobias Oetiker, Rafael J. Wysocki, David Miller,
	Reinette Chatre, Kalle Valo, David Rientjes, KOSAKI Motohiro,
	Mohamed Abbas, Jens Axboe, John W. Linville, Pekka Enberg,
	Bartlomiej Zolnierkiewicz, Greg Kroah-Hartman,
	Stephan von Krawczynski, Kernel Testers List,
	netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	"linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org" <li
In-Reply-To: <9ec2d7290910240646p75b93c68v6ea1648d628a9660-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

On Sat, Oct 24, 2009 at 02:46:56PM +0100, Mel LKML wrote:
> Hi,

Hi,

> On 10/23/09, Karol Lewandowski <karol.k.lewandowski-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> > On Fri, Oct 23, 2009 at 06:58:10PM +0200, Karol Lewandowski wrote:

> > Ok, I've tested patches 1+2+4 and bug, while very hard to trigger, is
> > still present. I'll test complete 1-4 patchset as time permits.

Sorry for silence, I've been quite busy lately.


> And also patch 5 please which is the revert. Patch 5 as pointed out is
> probably a red herring. Hwoever, it has changed the timing and made a
> difference for some testing so I'd like to know if it helps yours as
> well.

I've tested patches 1+2+3+4 in my normal usage scenario (do some work,
suspend, do work, suspend, ...) and it failed today after 4 days (== 4
suspend-resume cycles).

I'll test 1-5 now.


Thanks.

^ permalink raw reply

* Re: [PATCH 0/5] Candidate fix for increased number of GFP_ATOMIC failures V2
From: Mel Gorman @ 2009-10-28 11:59 UTC (permalink / raw)
  To: Karol Lewandowski
  Cc: Mel LKML, Frans Pop, Jiri Kosina, Sven Geggus, Tobias Oetiker,
	Rafael J. Wysocki, David Miller, Reinette Chatre, Kalle Valo,
	David Rientjes, KOSAKI Motohiro, Mohamed Abbas, Jens Axboe,
	John W. Linville, Pekka Enberg, Bartlomiej Zolnierkiewicz,
	Greg Kroah-Hartman, Stephan von Krawczynski, Kernel Testers List,
	netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org
In-Reply-To: <20091028114208.GA14476-nLtalAL5mPp2RxbNQum0x1nzlInOXLuq@public.gmane.org>

On Wed, Oct 28, 2009 at 12:42:08PM +0100, Karol Lewandowski wrote:
> On Sat, Oct 24, 2009 at 02:46:56PM +0100, Mel LKML wrote:
> > Hi,
> 
> Hi,
> 
> > On 10/23/09, Karol Lewandowski <karol.k.lewandowski-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> > > On Fri, Oct 23, 2009 at 06:58:10PM +0200, Karol Lewandowski wrote:
> 
> > > Ok, I've tested patches 1+2+4 and bug, while very hard to trigger, is
> > > still present. I'll test complete 1-4 patchset as time permits.
> 
> Sorry for silence, I've been quite busy lately.
> 
> 
> > And also patch 5 please which is the revert. Patch 5 as pointed out is
> > probably a red herring. Hwoever, it has changed the timing and made a
> > difference for some testing so I'd like to know if it helps yours as
> > well.
> 
> I've tested patches 1+2+3+4 in my normal usage scenario (do some work,
> suspend, do work, suspend, ...) and it failed today after 4 days (== 4
> suspend-resume cycles).
> 
> I'll test 1-5 now.
> 

I was digging through commits for suspend-related changes. Rafael, is
there any chance that some change to suspend is responsible for this
regression? This commit for example is a vague possibility;
c6f37f12197ac3bd2e5a35f2f0e195ae63d437de: PM/Suspend: Do not shrink memory before suspend

I say vague because FREE_PAGE_NUMBER is so small.

Also, what was the behaviour of the e100 driver when suspending before
this commit?

6905b1f1a03a48dcf115a2927f7b87dba8d5e566: Net / e100: Fix suspend of devices that cannot be power managed

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab

^ permalink raw reply

* RE: [PATCH v2 4/7] fsl_pq_mdio: Add Suport for etsec2.0 devices.
From: Kumar Gopalpet-B05799 @ 2009-10-28 12:00 UTC (permalink / raw)
  To: netdev; +Cc: David Miller
In-Reply-To: <20091028.024325.134189895.davem@davemloft.net>

 

>-----Original Message-----
>From: David Miller [mailto:davem@davemloft.net] 
>Sent: Wednesday, October 28, 2009 3:13 PM
>To: Kumar Gopalpet-B05799
>Cc: netdev@vger.kernel.org
>Subject: Re: [PATCH v2 4/7] fsl_pq_mdio: Add Suport for 
>etsec2.0 devices.
>
>From: Sandeep Gopalpet <sandeep.kumar@freescale.com>
>Date: Tue, 27 Oct 2009 22:25:18 +0530
>
>> This patch adds mdio support for etsec2.0 devices.
>> 
>> Modified the fsl_pq_mdio structure to include the new mdio members.
>> 
>> Signed-off-by: Sandeep Gopalpet <sandeep.kumar@freescale.com>
>
>This is the third time you've submitted this patch, and for 
>the third time it DOES NOT apply to net-next-2.6 at all when I 
>try to apply this gianfar patch series.
>
>You must be patching against another tree that has some 
>changes that conflict with this one.
>

I had rebased and tested on the master branch of the following tree
http://www.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6.git

(The git protocol clone was not working, hence we used http)
If this is not the right tree, kindly correct me.
I could not find any other net-next-2.6 tree/branch on kernel.org.


I have updated the tree again, (though there is no change)
the last commit on the tree being :

commit b37b62fea1d1bf68ca51818f8eb1035188efd030
Author: Ben Hutchings <bhutchings@solarflare.com>
Date:   Fri Oct 23 08:33:42 2009 +0000

    sfc: Rename 'xfp' file and functions to reflect reality

    The 'XFP' driver is really a driver for the QT2022C2 and QT2025C
PHYs,
    covering both more and less than XFP.  Rename its functions and
    constants to reflect reality and to reduce namespace pollution when
    sfc is a built-in driver.

    Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

 
Also, I have merged it with the latest linux-next tree 
(It is already updated with the net-next-2.6.git)

>Sort this out before submitting this again.

I am working on it.
>
>If you submit once more this same series, and it doesn't apply 
>properly to net-next-2.6, I will flat our ignore your 
>submissions for a week or so.
>

>You are wasting that much of my time by doing this over and over.
I apologize for that.
>
>Get your act together.
>


^ permalink raw reply

* Re: [PATCH V2]NET/KS8695: add support NAPI for Rx
From: Daniel Silverstone @ 2009-10-28 12:06 UTC (permalink / raw)
  To: Figo.zhang; +Cc: David S. Miller, netdev, Ben Dooks
In-Reply-To: <1256653422.2148.23.camel@myhost>

On Tue, Oct 27, 2009 at 10:23:42PM +0800, Figo.zhang wrote:
> for wan, irq = 29; for lan ,irq = 16.
> so we can do this read the interrupt status:
> 
> unsigned long mask_bit = 1 << ksp->rx_irq;
> status = readl(KS8695_IRQ_VA + KS8695_INTST);

I hate that there's no proper IRQ functions for managing these, as Ben has
commented.  Although I can understand that writing such is beyond the scope of
this patch.

>  #define MODULENAME	"ks8695_ether"
>  #define MODULEVERSION	"1.01"

You still didn't update the module version.  This is a pity because you've
potentially radically changed behaviour and you definitely have radically
changed implementation.

> +	struct napi_struct	napi;
> +	spinlock_t rx_lock;

You have not added documentation for these fields in the structure's
documentation string.

> + *	Use NAPI to receive packets.

"Inform NAPI that packet reception needs to be scheduled." might be better.

> +static int ks8695_rx(struct net_device *ndev, int budget)

This routine lacks a documentation string. Please write one.

> -	/* Kick the RX DMA engine, in case it became suspended */
> -	ks8695_writereg(ksp, KS8695_DRSC, 0);

I can't see where you have moved this to.  Without it, sometimes the KS8695's
RX DMA engine will falter and packets won't be transferred properly.

> +static int ks8695_poll(struct napi_struct *napi, int budget)

This routine also lacks a documentation string.

> +	netif_napi_add(ndev, &ksp->napi, ks8695_poll, 64);

This '64' seems quite arbitrary.  Is it a standard default? Did you work it out
from something else?  Some explanation would be nice.

I see that Dave Miller has accepted your patch into net-next-2.6.  I'd like to
see the above fixed before that gets merged any further.

Regards,

Daniel.

-- 
Daniel Silverstone                              http://www.simtec.co.uk/

^ permalink raw reply

* Re: [PATCH V2]NET/KS8695: add support NAPI for Rx
From: David Miller @ 2009-10-28 12:14 UTC (permalink / raw)
  To: dsilvers; +Cc: figo1802, netdev, ben
In-Reply-To: <20091028120643.GA7883@digital-scurf.org>

From: Daniel Silverstone <dsilvers@simtec.co.uk>
Date: Wed, 28 Oct 2009 12:06:44 +0000

> I see that Dave Miller has accepted your patch into net-next-2.6.
> I'd like to see the above fixed before that gets merged any further.

Any such change would need to be relative to what's already
in net-next-2.6

^ permalink raw reply

* [PATCH]udev:Extend udev to support move events
From: Narendra K @ 2009-10-28 12:46 UTC (permalink / raw)
  To: linux-hotplug
  Cc: netdev, matt_domsch, jordan_hargrave, charles_rose,
	sandeep_k_shandilya, dannf

As of now, udev does not support move events that are generated when
network interfaces are renamed. This patch extends udev to support move
events. With this patch udev would support rules like 

ACTION=="move", SUBSYSTEM=="net", PROGRAM="netif_id %k"

Signed-off-by: Narendra K <Narendra_K@dell.com>
---
 udev/udev-event.c |    7 +++++++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/udev/udev-event.c b/udev/udev-event.c
index f4d7121..4a77753 100644
--- a/udev/udev-event.c
+++ b/udev/udev-event.c
@@ -647,6 +647,13 @@ exit_add:
 		goto exit;
 	}
 
+	/* handle "move" event */
+	if (strcmp(udev_device_get_subsystem(dev), "net") == 0 && strcmp(udev_device_get_action(dev), "move") == 0) {
+		udev_rules_apply_to_event(rules, event);
+		udev_device_update_db(dev);
+		goto exit;
+	}
+
 	/* remove device node */
 	if (major(udev_device_get_devnum(dev)) != 0 && strcmp(udev_device_get_action(dev), "remove") == 0) {
 		/* import database entry and delete it */
-- 

With regards,
Narendra K


^ permalink raw reply related

* Re: [PATCH 0/5] Candidate fix for increased number of GFP_ATOMIC failures V2
From: Tobi Oetiker @ 2009-10-28 12:55 UTC (permalink / raw)
  To: Karol Lewandowski
  Cc: Mel LKML, Mel Gorman, Frans Pop, Jiri Kosina, Sven Geggus,
	Rafael J. Wysocki, David Miller, Reinette Chatre, Kalle Valo,
	David Rientjes, KOSAKI Motohiro, Mohamed Abbas, Jens Axboe,
	John W. Linville, Pekka Enberg, Bartlomiej Zolnierkiewicz,
	Greg Kroah-Hartman, Stephan von Krawczynski, Kernel Testers List,
	netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org
In-Reply-To: <20091028114208.GA14476-nLtalAL5mPp2RxbNQum0x1nzlInOXLuq@public.gmane.org>

 Today Karol Lewandowski wrote:

> On Sat, Oct 24, 2009 at 02:46:56PM +0100, Mel LKML wrote:
> > Hi,
>
> Hi,
>
> > On 10/23/09, Karol Lewandowski <karol.k.lewandowski-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> > > On Fri, Oct 23, 2009 at 06:58:10PM +0200, Karol Lewandowski wrote:
>
> > > Ok, I've tested patches 1+2+4 and bug, while very hard to trigger, is
> > > still present. I'll test complete 1-4 patchset as time permits.
>
> Sorry for silence, I've been quite busy lately.
>
>
> > And also patch 5 please which is the revert. Patch 5 as pointed out is
> > probably a red herring. Hwoever, it has changed the timing and made a
> > difference for some testing so I'd like to know if it helps yours as
> > well.
>
> I've tested patches 1+2+3+4 in my normal usage scenario (do some work,
> suspend, do work, suspend, ...) and it failed today after 4 days (== 4
> suspend-resume cycles).

I have been testing 1+2,1+2+3 as well as 3+4 and have been of the
assumption that 3+4 does help ... I have now been runing a modified
version of 4 which prints a warning instead of doing anything ... I
have now seen the allocation issue again without the warning being
printed. So in other words

1+2+3 make the problem less severe, but do not solve it
4 seems to be a red hering.

cheers
tobi


-- 
Tobi Oetiker, OETIKER+PARTNER AG, Aarweg 15 CH-4600 Olten, Switzerland
http://it.oetiker.ch tobi-7K0TWYW2a3pyDzI6CaY1VQ@public.gmane.org ++41 62 775 9902 / sb: -9900

^ permalink raw reply

* Re: [PATCH 2/3] net: TCP thin linear timeouts
From: Arnd Hannemann @ 2009-10-28 12:58 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Andreas Petlund, netdev, linux-kernel, shemminger, ilpo.jarvinen,
	davem
In-Reply-To: <4AE7262B.1060703@gmail.com>

Eric Dumazet schrieb:
> Andreas Petlund a écrit :
>> This patch will make TCP use only linear timeouts if the stream is thin. This will help to avoid the very high latencies that thin stream suffer because of exponential backoff. This mechanism is only active if enabled by iocontrol or syscontrol and the stream is identified as thin.
>>
> 
> Wont this reduce the session timeout to something very small, ie 15 retransmits, way under the minute ?

The session timeout no longer depends on the actual number of retransmits. Instead its a time interval,
which is roughly equivalent to the time a TCP, performing exponential backoff would need to perform
15 retransmits.

However, addressing the proposal:
I wonder how one can seriously suggest to just skip congestion response during timeout-based
loss recovery? I believe that in a heavily congested scenarios, this would lead to a goodput
goodput disaster... Not to mention that in a heavily congested scenario, suddenly every flow
will become "thin", so this will even amplify the problems. Or did I miss something?

Best regards,
Arnd

^ permalink raw reply

* Re: [PATCH] udev: create empty regular files to represent net interfaces
From: Matt Domsch @ 2009-10-28 13:03 UTC (permalink / raw)
  To: Kay Sievers
  Cc: dann frazier, linux-hotplug, Narendra_K, netdev, Jordan_Hargrave,
	Charles_Rose, Ben Hutchings
In-Reply-To: <ac3eb2510910280123g3c0e3d95wb38a239238906027@mail.gmail.com>

On Wed, Oct 28, 2009 at 09:23:57AM +0100, Kay Sievers wrote:
> On Tue, Oct 27, 2009 at 21:55, Matt Domsch <Matt_Domsch@dell.com> wrote:
> > On Thu, Oct 22, 2009 at 12:36:20AM -0600, dann frazier wrote:
> >> Here's a proof of concept to further the discussion..
> >>
> >> The default filename uses the format:
> >> ?? /dev/netdev/by-ifindex/$ifindex
> >>
> >> This provides the infrastructure to permit udev rules to create aliases for
> >> network devices using symlinks, for example:
> >>
> >> ?? /dev/netdev/by-name/eth0 -> ../by-ifindex/1
> >> ?? /dev/netdev/by-biosname/LOM0 -> ../by-ifindex/3
> >>
> >> A library (such as the proposed libnetdevname) could use this information
> >> to provide an alias->realname mapping for network utilities.
> >
> > yes, this could work, as IFINDEX is already exported in the uevents,
> > and that's the primary value udev needs to set up the mapping.
> >
> > While I like the little ifindex2name script you've got, I think udev
> > could simply call if_indextoname() to get this, and not call an
> > external program? ??I suppose it could be a really really simple
> > external program too.
> 
> What's the point of all this? Why would udev ever need to find the
> name of a device by the ifindex? The device name is the primary value
> for the kernel events udev acts on.

Ultimately, udev doesn't care.  I just want to use udev to keep track
of the pathname to device connections, like it does for all other
types of devices.

Applications such as net-tools, iproute, ethtool, etc.  take a kernel
device name.  I want to extend them to also take a path, and resolve
that path to a kernel device name.  libnetdevname currently is _one
small function_ which does this.  It need not even be in a library.
But whatever the mechanism, the path names need to be anchored
somewhere, so the library or all apps doing this kind of lookup know
where to look.

> That all sounds very much like something which will hit us back some
> day. I'm not sure, if udev should publish such dead text files in
> /dev, it does not seem to fit the usual APIs/assumptions where /sys
> and /dev match, and libudev provides access to both. It all sounds
> more like a database for a possible netdevname library, which does not
> need to be public in /dev, right?

Right, it doesn't need to be in /dev.  We could have udev rules that
simply call yet another program to maintain that database, in yet
another way.  But I really like how udev maintains the database of
symlinks for other device types, using symlinks in /dev/, and which
people are quite familiar with.  Why can't it be extended to do
likewise for network device names too?

There is a completely different approach possible here, if people
don't want to use something like /dev to track device name aliases.
We could put the whole name alias mechanism in the kernel, with new
netlink commands to add/remove/list aliases (and now we've overloaded that
term, as the old eth0:1 "alias" and dmz -> eth1 "alias" wouldn't be
the same thing).  But that idea hasn't met with a lot of interest
either.

-- 
Matt Domsch
Technology Strategist, Dell Office of the CTO
linux.dell.com & www.dell.com/linux

^ permalink raw reply

* Re: PATCH: Network Device Naming mechanism and policy
From: Narendra K @ 2009-10-28 13:06 UTC (permalink / raw)
  To: notting, scott
  Cc: netdev, linux-hotplug, matt_domsch, jordan_hargrave, rose_charles
In-Reply-To: <EDA0A4495861324DA2618B4C45DCB3EE58964E@blrx3m08.blr.amer.dell.com>

On Wed, Oct 28, 2009 at 06:21:49PM +0530, K, Narendra wrote:
> > At the moment, we do not appear to get the proper change uevents from 
> > things like 'ip link set dev <foo> address <bar>', so we can't 
> > currently maintain these symlinks.
> > 
> 
> I have observed that the kernel does generate a "move" event when
> interfaces are renamed. Looks like udev at present doesn't handle this
> event, but i suppose it could be extended to hanlde this event.
> 

With the patch "[PATCH]udev:Extend udev to support move events"
(http://marc.info/?l=linux-hotplug&m=125673399217656&w=2) udev would be
able to handle "move" events that are generated when interfaces are
renamed by commands like nameif. And we can maintain the symlinks by
having rules to handle this move event.

With regards,
Narendra K

^ permalink raw reply

* Re: [PATCH] udev: create empty regular files to represent net interfaces
From: Ben Hutchings @ 2009-10-28 13:06 UTC (permalink / raw)
  To: Kay Sievers
  Cc: Matt Domsch, dann frazier, linux-hotplug, Narendra_K, netdev,
	Jordan_Hargrave, Charles_Rose
In-Reply-To: <ac3eb2510910280123g3c0e3d95wb38a239238906027@mail.gmail.com>

On Wed, 2009-10-28 at 09:23 +0100, Kay Sievers wrote:
> On Tue, Oct 27, 2009 at 21:55, Matt Domsch <Matt_Domsch@dell.com> wrote:
> > On Thu, Oct 22, 2009 at 12:36:20AM -0600, dann frazier wrote:
> >> Here's a proof of concept to further the discussion..
> >>
> >> The default filename uses the format:
> >>   /dev/netdev/by-ifindex/$ifindex
> >>
> >> This provides the infrastructure to permit udev rules to create aliases for
> >> network devices using symlinks, for example:
> >>
> >>   /dev/netdev/by-name/eth0 -> ../by-ifindex/1
> >>   /dev/netdev/by-biosname/LOM0 -> ../by-ifindex/3
> >>
> >> A library (such as the proposed libnetdevname) could use this information
> >> to provide an alias->realname mapping for network utilities.
> >
> > yes, this could work, as IFINDEX is already exported in the uevents,
> > and that's the primary value udev needs to set up the mapping.
> >
> > While I like the little ifindex2name script you've got, I think udev
> > could simply call if_indextoname() to get this, and not call an
> > external program?  I suppose it could be a really really simple
> > external program too.
> 
> What's the point of all this? Why would udev ever need to find the
> name of a device by the ifindex? The device name is the primary value
> for the kernel events udev acts on.
[...]

Since net devices can be renamed, unlike other devices, the ifindex is
the proper stable identifier.  Using the name as an identifier opens up
race conditions.  If there are events that don't include the ifindex,
this should be fixed.

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply

* Re: [PATCH] Multicast packet reassembly can fail
From: Steve Chen @ 2009-10-28 13:29 UTC (permalink / raw)
  To: Rick Jones; +Cc: netdev
In-Reply-To: <4AE780CB.8070401@hp.com>

On Tue, 2009-10-27 at 16:22 -0700, Rick Jones wrote:
> Steve Chen wrote:
> > Multicast packet reassembly can fail
> > 
> > When multicast connections with multiple fragments are received by the same
> > node from more than one Ethernet ports, race condition between fragments
> > from each Ethernet port can cause fragment reassembly to fail leading to
> > packet drop.  This is because packets from each Ethernet port appears identical
> > to the the code that reassembles the Ethernet packet.
> > 
> > The solution is evaluate the Ethernet interface number in addition to all other
> > parameters so that every packet can be uniquely identified.  The existing
> > iif field in struct ipq is now used to generate the hash key, and iif is also
> > used for comparison in case of hash collision.
> > 
> > Please note that q->saddr ^ (q->iif << 5) is now being passed into
> > ipqhashfn to generate the hash key.  This is borrowed from the routing
> > code.
> > 
> > Signed-off-by: Steve Chen <schen@mvista.com>
> > Signed-off-by: Mark Huth <mhuth@mvista.com>
> 
> It has been hours since my last good Emily Litella moment so I'll ask - isn't 
> the combination of source and dest addr, protocol, IP ID and fragment offset 
> supposed to take care of this?  How does the ingress interface have anything to 
> do with it?

Here is the scenario this patch tries to address

<src node> ---->  <switch>  ----> <eth0 dest node>
                            \--->  <eth1 dest node>

For this specific case, src/dst address, protocol, IP ID and fragment
offset are all identical.  The only difference is the ingress interface.
A good follow up question would be why would anyone in their right mind
multicast to the same destination?  well, I don't know.  I can not get
the people who reported the problem to tell me either.   Since someone
found the need to do this,  perhaps others may find it useful too.

Regards,

Steve


^ permalink raw reply

* [PATCH V3]NET/KS8695: add support NAPI for Rx
From: Figo.zhang @ 2009-10-28 13:23 UTC (permalink / raw)
  To: Daniel Silverstone, David S. Miller; +Cc: netdev, Ben Dooks

Add support NAPI Rx API for KS8695NET driver.

v2, change the Rx function to NAPI.

in <KS8695X Integrated Multi-port Gateway Solution Register Description
 v1.0>:
Interrupt Enable Register (offset 0xE204)
Bit29 : WAN MAC Receive Interrupt Enable
Bit16 : LAN MAC Receive Interrupt Enable

Interrupt Status Register (Offset 0xF208)
Bit29: WAN MAC Receive Status
Bit16: LAN MAC Receive Status

see arch/arm/mach-ks8695/devices.c:
ks8695_wan_resources[] and ks8695_lan_resources[]
have IORESOURCE_IRQ , it have define the RX irq,
for wan, irq = 29; for lan ,irq = 16.
so we can do this read the interrupt status:

unsigned long mask_bit = 1 << ksp->rx_irq;
status = readl(KS8695_IRQ_VA + KS8695_INTST);

In v3, some changes that adviced by Daniel Silverstone
and Ben Dooks.

Add k8695_get_rx_enable_bit() for get Rx interrupt enable/status
bit.

Signed-off-by: Figo.zhang <figo1802@gmail.com>
--- 

 drivers/net/arm/ks8695net.c |  145 +++++++++++++++++++++++++++++++++++--------
 1 files changed, 120 insertions(+), 25 deletions(-)

diff --git a/drivers/net/arm/ks8695net.c b/drivers/net/arm/ks8695net.c
index 2a7b774..7051bcc 100644
--- a/drivers/net/arm/ks8695net.c
+++ b/drivers/net/arm/ks8695net.c
@@ -35,11 +35,13 @@
 
 #include <mach/regs-switch.h>
 #include <mach/regs-misc.h>
+#include <asm/mach/irq.h>
+#include <mach/regs-irq.h>
 
 #include "ks8695net.h"
 
 #define MODULENAME	"ks8695_ether"
-#define MODULEVERSION	"1.01"
+#define MODULEVERSION	"1.02"
 
 /*
  * Transmit and device reset timeout, default 5 seconds.
@@ -95,6 +97,9 @@ struct ks8695_skbuff {
 #define MAX_RX_DESC 16
 #define MAX_RX_DESC_MASK 0xf
 
+/*napi_weight have better more than rx DMA buffers*/
+#define NAPI_WEIGHT   64
+
 #define MAX_RXBUF_SIZE 0x700
 
 #define TX_RING_DMA_SIZE (sizeof(struct tx_ring_desc) * MAX_TX_DESC)
@@ -120,6 +125,7 @@ enum ks8695_dtype {
  *	@dev: The platform device object for this interface
  *	@dtype: The type of this device
  *	@io_regs: The ioremapped registers for this interface
+ *      @napi : Add support NAPI for Rx
  *	@rx_irq_name: The textual name of the RX IRQ from the platform data
  *	@tx_irq_name: The textual name of the TX IRQ from the platform data
  *	@link_irq_name: The textual name of the link IRQ from the
@@ -143,6 +149,7 @@ enum ks8695_dtype {
  *	@rx_ring_dma: The DMA mapped equivalent of rx_ring
  *	@rx_buffers: The sk_buff mappings for the RX ring
  *	@next_rx_desc_read: The next RX descriptor to read from on IRQ
+ *      @rx_lock: A lock to protect Rx irq function
  *	@msg_enable: The flags for which messages to emit
  */
 struct ks8695_priv {
@@ -152,6 +159,8 @@ struct ks8695_priv {
 	enum ks8695_dtype dtype;
 	void __iomem *io_regs;
 
+	struct napi_struct	napi;
+
 	const char *rx_irq_name, *tx_irq_name, *link_irq_name;
 	int rx_irq, tx_irq, link_irq;
 
@@ -172,6 +181,7 @@ struct ks8695_priv {
 	dma_addr_t rx_ring_dma;
 	struct ks8695_skbuff rx_buffers[MAX_RX_DESC];
 	int next_rx_desc_read;
+	spinlock_t rx_lock;
 
 	int msg_enable;
 };
@@ -392,29 +402,82 @@ ks8695_tx_irq(int irq, void *dev_id)
 }
 
 /**
+ *	k8695_get_rx_enable_bit - Get rx interrupt enable/status bit
+ *	@ksp: Private data for the KS8695 Ethernet
+ *
+ *    For KS8695 document:
+ *    Interrupt Enable Register (offset 0xE204)
+ *        Bit29 : WAN MAC Receive Interrupt Enable
+ *        Bit16 : LAN MAC Receive Interrupt Enable
+ *    Interrupt Status Register (Offset 0xF208)
+ *        Bit29: WAN MAC Receive Status
+ *        Bit16: LAN MAC Receive Status
+ *    So, this Rx interrrupt enable/status bit number is equal
+ *    as Rx IRQ number.
+ */
+static inline u32 k8695_get_rx_enable_bit(struct ks8695_priv *ksp)
+{
+	return ksp->rx_irq;
+}
+
+/**
  *	ks8695_rx_irq - Receive IRQ handler
  *	@irq: The IRQ which went off (ignored)
  *	@dev_id: The net_device for the interrupt
  *
- *	Process the RX ring, passing any received packets up to the
- *	host.  If we received anything other than errors, we then
- *	refill the ring.
+ *	Inform NAPI that packet reception needs to be scheduled
  */
+
 static irqreturn_t
 ks8695_rx_irq(int irq, void *dev_id)
 {
 	struct net_device *ndev = (struct net_device *)dev_id;
 	struct ks8695_priv *ksp = netdev_priv(ndev);
+	unsigned long status;
+
+	unsigned long mask_bit = 1 << k8695_get_rx_enable_bit();
+
+	spin_lock(&ksp->rx_lock);
+
+	status = readl(KS8695_IRQ_VA + KS8695_INTST);
+
+	/*clean rx status bit*/
+	writel(status | mask_bit , KS8695_IRQ_VA + KS8695_INTST);
+
+	if (status & mask_bit) {
+		if (napi_schedule_prep(&ksp->napi)) {
+			/*disable rx interrupt*/
+			status &= ~mask_bit;
+			writel(status , KS8695_IRQ_VA + KS8695_INTEN);
+			__napi_schedule(&ksp->napi);
+		}
+	}
+
+	spin_unlock(&ksp->rx_lock);
+	return IRQ_HANDLED;
+}
+
+/**
+ *	ks8695_rx - Receive packets  called by NAPI poll method
+ *	@ksp: Private data for the KS8695 Ethernet
+ *	@budget: The max packets would be receive
+ */
+
+static int ks8695_rx(struct ks8695_priv *ksp, int budget)
+{
+	struct net_device *ndev = ksp->ndev;
 	struct sk_buff *skb;
 	int buff_n;
 	u32 flags;
 	int pktlen;
 	int last_rx_processed = -1;
+	int received = 0;
 
 	buff_n = ksp->next_rx_desc_read;
-	do {
-		if (ksp->rx_buffers[buff_n].skb &&
-		    !(ksp->rx_ring[buff_n].status & cpu_to_le32(RDES_OWN))) {
+	while (received < budget
+			&& ksp->rx_buffers[buff_n].skb
+			&& (!(ksp->rx_ring[buff_n].status &
+					cpu_to_le32(RDES_OWN)))) {
 			rmb();
 			flags = le32_to_cpu(ksp->rx_ring[buff_n].status);
 			/* Found an SKB which we own, this means we
@@ -464,7 +527,7 @@ ks8695_rx_irq(int irq, void *dev_id)
 			/* Relinquish the SKB to the network layer */
 			skb_put(skb, pktlen);
 			skb->protocol = eth_type_trans(skb, ndev);
-			netif_rx(skb);
+			netif_receive_skb(skb);
 
 			/* Record stats */
 			ndev->stats.rx_packets++;
@@ -478,29 +541,57 @@ rx_failure:
 			/* Give the ring entry back to the hardware */
 			ksp->rx_ring[buff_n].status = cpu_to_le32(RDES_OWN);
 rx_finished:
+			received++;
 			/* And note this as processed so we can start
 			 * from here next time
 			 */
 			last_rx_processed = buff_n;
-		} else {
-			/* Ran out of things to process, stop now */
-			break;
-		}
-		buff_n = (buff_n + 1) & MAX_RX_DESC_MASK;
-	} while (buff_n != ksp->next_rx_desc_read);
-
-	/* And note which RX descriptor we last did anything with */
-	if (likely(last_rx_processed != -1))
-		ksp->next_rx_desc_read =
-			(last_rx_processed + 1) & MAX_RX_DESC_MASK;
-
-	/* And refill the buffers */
-	ks8695_refill_rxbuffers(ksp);
+			buff_n = (buff_n + 1) & MAX_RX_DESC_MASK;
+			/*And note which RX descriptor we last did */
+			if (likely(last_rx_processed != -1))
+				ksp->next_rx_desc_read =
+					(last_rx_processed + 1) &
+					MAX_RX_DESC_MASK;
+
+			/* And refill the buffers */
+			ks8695_refill_rxbuffers(ksp);
+
+			/* Kick the RX DMA engine, in case it became
+			 *  suspended */
+			ks8695_writereg(ksp, KS8695_DRSC, 0);
+	}
+	return received;
+}
 
-	/* Kick the RX DMA engine, in case it became suspended */
-	ks8695_writereg(ksp, KS8695_DRSC, 0);
 
-	return IRQ_HANDLED;
+/**
+ *	ks8695_poll - Receive packet by NAPI poll method
+ *	@ksp: Private data for the KS8695 Ethernet
+ *	@budget: The remaining number packets for network subsystem
+ *
+ *     Invoked by the network core when it requests for new
+ *     packets from the driver
+ */
+static int ks8695_poll(struct napi_struct *napi, int budget)
+{
+	struct ks8695_priv *ksp = container_of(napi, struct ks8695_priv, napi);
+	struct net_device *dev = ksp->ndev;
+	unsigned long  work_done;
+
+	unsigned long isr = readl(KS8695_IRQ_VA + KS8695_INTEN);
+	unsigned long mask_bit = 1 << k8695_get_rx_enable_bit();
+
+	work_done = ks8695_rx(ksp, budget);
+
+	if (work_done < budget) {
+		unsigned long flags;
+		spin_lock_irqsave(&ksp->rx_lock, flags);
+		/*enable rx interrupt*/
+		writel(isr | mask_bit, KS8695_IRQ_VA + KS8695_INTEN);
+		__napi_complete(napi);
+		spin_unlock_irqrestore(&ksp->rx_lock, flags);
+	}
+	return work_done;
 }
 
 /**
@@ -1472,6 +1563,8 @@ ks8695_probe(struct platform_device *pdev)
 	SET_ETHTOOL_OPS(ndev, &ks8695_ethtool_ops);
 	ndev->watchdog_timeo	 = msecs_to_jiffies(watchdog);
 
+	netif_napi_add(ndev, &ksp->napi, ks8695_poll, NAPI_WEIGHT);
+
 	/* Retrieve the default MAC addr from the chip. */
 	/* The bootloader should have left it in there for us. */
 
@@ -1505,6 +1598,7 @@ ks8695_probe(struct platform_device *pdev)
 
 	/* And initialise the queue's lock */
 	spin_lock_init(&ksp->txq_lock);
+	spin_lock_init(&ksp->rx_lock);
 
 	/* Specify the RX DMA ring buffer */
 	ksp->rx_ring = ksp->ring_base + TX_RING_DMA_SIZE;
@@ -1626,6 +1720,7 @@ ks8695_drv_remove(struct platform_device *pdev)
 	struct ks8695_priv *ksp = netdev_priv(ndev);
 
 	platform_set_drvdata(pdev, NULL);
+	netif_napi_del(&ksp->napi);
 
 	unregister_netdev(ndev);
 	ks8695_release_device(ksp);



^ permalink raw reply related

* Re: [PATCH] Multicast packet reassembly can fail
From: Steve Chen @ 2009-10-28 13:32 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev
In-Reply-To: <4AE81A70.5060307@gmail.com>

On Wed, 2009-10-28 at 11:18 +0100, Eric Dumazet wrote:
> Steve Chen a écrit :
> > Multicast packet reassembly can fail
> > 
> > When multicast connections with multiple fragments are received by the same
> > node from more than one Ethernet ports, race condition between fragments
> > from each Ethernet port can cause fragment reassembly to fail leading to
> > packet drop.  This is because packets from each Ethernet port appears identical
> > to the the code that reassembles the Ethernet packet.
> > 
> > The solution is evaluate the Ethernet interface number in addition to all other
> > parameters so that every packet can be uniquely identified.  The existing
> > iif field in struct ipq is now used to generate the hash key, and iif is also
> > used for comparison in case of hash collision.
> > 
> > Please note that q->saddr ^ (q->iif << 5) is now being passed into
> > ipqhashfn to generate the hash key.  This is borrowed from the routing
> > code.
> > 
> > Signed-off-by: Steve Chen <schen@mvista.com>
> > Signed-off-by: Mark Huth <mhuth@mvista.com>
> > 
> 
> This makes no sense to me, but I need to check the code.
> 
> How interface could matter in IP defragmentation ?
> And why multicast is part of the equation ?
> 
> If defrag fails, this must be for other reason,
> and probably needs another fix.
> 
> Check line 219 of net/ipv4/inet_fragment.c
> 
> #ifdef CONFIG_SMP
>         /* With SMP race we have to recheck hash table, because
>          * such entry could be created on other cpu, while we
>          * promoted read lock to write lock.
>          */
>         hlist_for_each_entry(qp, n, &f->hash[hash], list) {
>                 if (qp->net == nf && f->match(qp, arg)) {
>                         atomic_inc(&qp->refcnt);
>                         write_unlock(&f->lock);
>                         qp_in->last_in |= INET_FRAG_COMPLETE;   <<< HERE >>>
>                         inet_frag_put(qp_in, f);
>                         return qp;
>                 }
>         }
> #endif
> 
> I really wonder why we set INET_FRAG_COMPLETE here

I sent the specific scenario the patch tries to address to the list in
an earlier e-mail.  Would it be beneficial if I post the test code
somewhere so everyone can have access?

Regards,

Steve


^ permalink raw reply

* WAN device configuration, again...
From: Krzysztof Halasa @ 2009-10-28 13:28 UTC (permalink / raw)
  To: netdev

Hi,

I'm currently at final stages of "producing" two WAN drivers and there
is one thing to solve: they have really complex options. It's no longer
a V.35 with ca. 4 clock modes, a clock rate and few encodings etc. They
need many options unique to each driver/board. I think I need a more
capable interface to configure the devices than the current ioctl-based
one.

I think of something:
- using netlink or similar interface
- with potentially unlimited "payload" size (data may be transfered in
  smaller packets)
- the "command" and "response" should be variable-length ASCII-based,
  instead of fixed structures. This way I don't have to duplicate all
  option handling in userspace, only the specific driver has to know
  about them.

Comments? Perhaps there is already an example?
Should I use something else?

I also thought about using /sys read/write calls, but I'm not sure it's
a good idea.
-- 
Krzysztof Halasa

^ permalink raw reply

* Re: [PATCH V3]NET/KS8695: add support NAPI for Rx
From: David Miller @ 2009-10-28 13:37 UTC (permalink / raw)
  To: figo1802; +Cc: dsilvers, netdev, ben
In-Reply-To: <1256736189.2148.30.camel@myhost>

I said to send a relative patch against V2, so that I can
apply it on top what you've already sent.

Why are you sending a full fresh patch after being instructed
not to do that?

^ permalink raw reply

* Re: [PATCH] Multicast packet reassembly can fail
From: Eric Dumazet @ 2009-10-28 13:30 UTC (permalink / raw)
  To: Steve Chen; +Cc: netdev
In-Reply-To: <1256736757.3153.412.camel@linux-1lbu>

Steve Chen a écrit :
 
> I sent the specific scenario the patch tries to address to the list in
> an earlier e-mail.  Would it be beneficial if I post the test code
> somewhere so everyone can have access?
> 

Yes please, I cannot find your previous mail in my archives.

Thanks


^ permalink raw reply

* [PATCHv4 0/7] Per route TCP options support kill switches
From: Gilad Ben-Yossef @ 2009-10-28 14:15 UTC (permalink / raw)
  To: netdev; +Cc: ori

Allow selectively turning off support for specific TCP options
on a per route basis.

One normally want to disable SACK, DSACK, time stamp or window
scale if one got a piece of broken networking equipment somewhere
as a stop gap until you can bring a big enough hammer to deal with
the broken network equipment. It doesn't make sense to "punish" the
entire connections going through the machine to destinations not
related to the broken equipment.

This is doubly true when one is dealing with network containers
used to isolate several virtual domains.

Per route options implemented in free bits in the features route
entry property, which in some cases were reserved by name for these
options, so this does not inflate any structure.

Global sysctls for these options are still preserved and retain 
the exact original meaning (e.g. you have to have both the global 
sysctl turned on and not turn off the TCP option parsing in the
specific route to have it proccessed).

It is not possible to turn off globally an option but turn it on
per route, so as to not subtly change the meaning of current
establish sysctls (and this is a rare need anyway).

Tested on x86 using Qemu/KVM.

Working but crude matching patch to iproute2 sent earlier to the list.

Patchset based on original work by Ori Finkelman and Yony Amit
from ComSleep Ltd.

The author wishes to thank Eric Dumazaet, William Allen Simpson, 
Bill Fink and Ilpo Jarvinen for their feedback.

Gilad Ben-Yossef (7):
  Only parse time stamp TCP option in time wait sock
  Allow tcp_parse_options to consult dst entry
  Add dst_feature to query route entry features
  Add the no SACK route option feature
  Allow disabling TCP timestamp options per route
  Allow to turn off TCP window scale opt per route
  Allow disabling of DSACK TCP option per route

 include/linux/rtnetlink.h |    6 ++++--
 include/net/dst.h         |    8 +++++++-
 include/net/tcp.h         |    3 ++-
 net/ipv4/syncookies.c     |   27 ++++++++++++++-------------
 net/ipv4/tcp_input.c      |   26 ++++++++++++++++++--------
 net/ipv4/tcp_ipv4.c       |   21 ++++++++++++---------
 net/ipv4/tcp_minisocks.c  |    9 ++++++---
 net/ipv4/tcp_output.c     |   18 +++++++++++++-----
 net/ipv6/syncookies.c     |   28 +++++++++++++++-------------
 net/ipv6/tcp_ipv6.c       |    3 ++-
 10 files changed, 93 insertions(+), 56 deletions(-)

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox