Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH] net: Add DEVTYPE support for Ethernet based devices
From: David Miller @ 2009-09-11 19:37 UTC (permalink / raw)
  To: marcel; +Cc: netdev, shemminger, johannes
In-Reply-To: <1251788899-30156-1-git-send-email-marcel@holtmann.org>

From: Marcel Holtmann <marcel@holtmann.org>
Date: Tue,  1 Sep 2009 00:08:19 -0700

> The Ethernet framing is used for a lot of devices these days. Most
> prominent are WiFi and WiMAX based devices. However for userspace
> application it is important to classify these devices correctly and
> not only see them as Ethernet devices. The daemons like HAL, DeviceKit
> or even NetworkManager with udev support tries to do the classification
> in userspace with a lot trickery and extra system calls. This is not
> good and actually reaches its limitations. Especially since the kernel
> does know the type of the Ethernet device it is pretty stupid.
> 
> To solve this problem the underlying device type needs to be set and
> then the value will be exported as DEVTYPE via uevents and available
> within udev.
> 
>   # cat /sys/class/net/wlan0/uevent
>   DEVTYPE=wlan
>   INTERFACE=wlan0
>   IFINDEX=5
> 
> This is similar to subsystems like USB and SCSI that distinguish
> between hosts, devices, disks, partitions etc.
> 
> The new SET_NETDEV_DEVTYPE() is a convenience helper to set the actual
> device type. All device types are free form, but for convenience the
> same strings as used with RFKILL are choosen.
> 
> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

Applied to net-next-2.6, thanks.

^ permalink raw reply

* Re: [PATCH 2.6.31-rc9] mv643xx_eth.c: remove unused txq_set_wrr()
From: David Miller @ 2009-09-11 19:37 UTC (permalink / raw)
  To: mikpe; +Cc: buytenh, linux-kernel, netdev
In-Reply-To: <200909070959.n879xGui019854@pilspetsen.it.uu.se>

From: Mikael Pettersson <mikpe@it.uu.se>
Date: Mon, 7 Sep 2009 11:59:16 +0200 (MEST)

> The txq_set_wrr() function in drivers/net/mv643xx_eth.c is
> unused, not even referenced under #if 0 or something like that,
> which results in a compile-time warning:
> 
> drivers/net/mv643xx_eth.c:1070: warning: 'txq_set_wrr' defined but not used
> 
> Fix: remove it.
> 
> Signed-off-by: Mikael Pettersson <mikpe@it.uu.se>

Applied.

^ permalink raw reply

* Re: [PATCH] atm: dereference of he_dev->rbps_virt in he_init_group()
From: David Miller @ 2009-09-11 19:37 UTC (permalink / raw)
  To: roel.kluin; +Cc: chas, linux-atm-general, netdev, akpm
In-Reply-To: <4AA25B06.2090703@gmail.com>

From: Roel Kluin <roel.kluin@gmail.com>
Date: Sat, 05 Sep 2009 14:35:18 +0200

> he_dev->rbps_virt or he_dev->rbpl_virt allocation may fail, so check
> them. Make sure that he_init_group() cleans up after errors.
> 
> Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
> ---
>> These new return statements will both leak resources allocated
>> earlier.
>> 
>> All the caller is going to do is return -ENOMEM as well and
>> it does not cleanup actions at all.
>> 
>> Please fix this up.
> 
> I am new to this api, so please review.

Looks ok, applied, thanks.

^ permalink raw reply

* Re: [PATCH RESEND] bonding: remap muticast addresses without using dev_close() and dev_open()
From: David Miller @ 2009-09-11 19:36 UTC (permalink / raw)
  To: monis; +Cc: ogerlitz, fubar, jgunthorpe, netdev, bonding-devel
In-Reply-To: <4AA8BD09.2010607@Voltaire.COM>

From: Moni Shoua <monis@Voltaire.COM>
Date: Thu, 10 Sep 2009 11:47:05 +0300

> I don't need to revert the entire patch. Only the dev_open() and
> dev_close() functions need to be removed and it is quite easy to
> review it in one patch.

I agree, this is the best way to submit this.

> I thought about it but the function arp_mc_map() which is called
> before and after the change in dev->type, relies on the value of
> dev->type. I could write the patch with one event after the type has
> changed and passing the old device type somehow (field prev_type in
> struct net_device?) but the resulted code will look clumsy (at least
> to me).

That's correct, dev->type must be setup correctly to both undo
and install things properly so this approach is correct.

But unfortunately this patch doesn't apply to net-next-2.6 and
you'll need to respin it against current sources so I can apply
it, thanks.

^ permalink raw reply

* Re: [PATCH 1/8] networking/fanotify: declare fanotify socket numbers
From: Eric Paris @ 2009-09-11 19:33 UTC (permalink / raw)
  To: David Miller; +Cc: linux-kernel, linux-fsdevel, netdev, viro, alan, hch
In-Reply-To: <20090911.114620.260824240.davem@davemloft.net>

On Fri, 2009-09-11 at 11:46 -0700, David Miller wrote:
> From: Eric Paris <eparis@redhat.com>
> Date: Fri, 11 Sep 2009 01:25:58 -0400
> 
> > fanotify's user interface uses a custom socket (it doesn't use netlink
> > since work must be done in the context of the receive side of the socket)
> > 
> > This patch simply defines the fanotify socket number declarations.  The
> > actual implementation of the socket is in a later patch.
> > 
> > Signed-off-by: Eric Paris <eparis@redhat.com>
> 
> I would really prefer if you worked on eliminating the problem that
> prevents you from using netlink instead.

I'm not really sure if I can, although I'd love to hear input from
someone who knows the netlink code on how I can make it do what I need.
I'm really not duplicating much other than the NLMSG_OK and NLMSG_NEXT
macros.  My code doesn't even use skbs and I'm not savy enough to really
know how I could.  I'm more than willing to work on it if someone can
point me to how it might work.

The basic problem is that process (a) opens a file.  In the context of
process(a) we need to take a reference on the struct path, store some
other information about what happened and then wait for the listener,
process(b), to run.  When process (b) runs it takes the struct path,
opens the file it points to, then writes up to userspace the information
collected back when (a) was running and the fd opened in (b)'s context.

Currently (aka this patch set) I do the storage of information and the
waiting for (b) to be ready using the same queuing mechanism as inotify.
the fsnotify backend creates a single 'event' which can be added to the
queues for multiple notification listeners.  So if 50 inotify watches
are on the same directory we only create one event.  I guess this could
be reimplemented to send an skb which contains the information that
fsnotify stores in an event rather than just queuing up the generic
event.  Then I could somehow hook the recv side of netlink to pull the
skb off and instead of dumping it's data to the receiver to actually
open an fd and rewrite the message.  I'm not sure what netlink is buying
me here thought.  I also don't know the code well enough to know
everywhere that an skb can be lost.  Since the skb is going to hold a
reference to a struct path and contain a pointer to a struct path I
certainly can't handle it being dropped.  The other major thing that I
lose or have a much harder time reimplementing is event merging.  After
process (a) drops an event on the queue if it does something else to the
same file fanotify will search the queue and merge events.  I certainly
don't know how to search the netlink queue to find old skbs edit them,
and all the while make sure I'm doing it race free.

I'm willing to try to make this happen, I'm just sure I see the benefit
and I don't know anyone who knows the net/netlink code well enough who
owuld be interested to help.....

-Eric


^ permalink raw reply

* Re: [AX25] kernel panic
From: Jarek Poplawski @ 2009-09-11 19:21 UTC (permalink / raw)
  To: Bernard Pidoux; +Cc: Ralf Baechle DL5RB, Linux Netdev List, linux-hams
In-Reply-To: <4AA82BF0.7040203@upmc.fr>

Bernard Pidoux wrote, On 09/10/2009 12:28 AM:

> Hi Ralf,
> 
> Here is a set of not so frequent kernel panics captured via netconsole
> that seem related to AX25 timer. 
> 
> Regards,
> 
> Bernard Pidoux
> 

Hi Bernard,

Could/did you try to turn on this debugging option below, btw?

Regards,
Jarek P.

   CONFIG_DEBUG_OBJECTS_TIMERS:
  ?                                                                                                        ?  
  ? If you say Y here, additional code will be inserted into the                                           ?  
  ? timer routines to track the life time of timer objects and                                             ?  
  ? validate the timer operations.                                                                         ?  
  ?                                                                                                        ?  
  ? Symbol: DEBUG_OBJECTS_TIMERS [=y]                                                                      ?  
  ? Prompt: Debug timer objects                                                                            ?  
  ?   Defined at lib/Kconfig.debug:247                                                                     ?  
  ?   Depends on: DEBUG_OBJECTS                                                                            ?  
  ?   Location:                                                                                            ?  
  ?     -> Kernel hacking                                                                                  ?  
  ?       -> Kernel debugging (DEBUG_KERNEL [=y])                                                          ?  
  ?         -> Debug object operations (DEBUG_OBJECTS [=y])                                                ?  
  ?                                                                      


^ permalink raw reply

* Re: [PATCH v3 3/3] ucc_geth: Fix hangs after switching from full to half duplex
From: David Miller @ 2009-09-11 19:19 UTC (permalink / raw)
  To: avorontsov; +Cc: scottwood, netdev, afleming, timur, linuxppc-dev
In-Reply-To: <20090910214812.GA30564@oksana.dev.rtsoft.ru>

From: Anton Vorontsov <avorontsov@ru.mvista.com>
Date: Fri, 11 Sep 2009 01:48:12 +0400

> MPC8360 QE UCC ethernet controllers hang when changing link duplex
> under a load (a bit of NFS activity is enough).
> 
>   PHY: mdio@e0102120:00 - Link is Up - 1000/Full
>   sh-3.00# ethtool -s eth0 speed 100 duplex half autoneg off
>   PHY: mdio@e0102120:00 - Link is Down
>   PHY: mdio@e0102120:00 - Link is Up - 100/Half
>   NETDEV WATCHDOG: eth0 (ucc_geth): transmit queue 0 timed out
>   ------------[ cut here ]------------
>   Badness at c01fcbd0 [verbose debug info unavailable]
>   NIP: c01fcbd0 LR: c01fcbd0 CTR: c0194e44
>   ...
> 
> The cure is to disable the controller before changing speed/duplex
> and enable it afterwards.
> 
> Though, disabling the controller might take quite a while, so we
> better not grab any spinlocks in adjust_link(). Instead, we quiesce
> the driver's activity, and only then disable the controller.
> 
> Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>

Applied.

^ permalink raw reply

* Re: [PATCH 2/3] ucc_geth: Rearrange some code to avoid forward declarations
From: David Miller @ 2009-09-11 19:19 UTC (permalink / raw)
  To: timur; +Cc: avorontsov, afleming, leoli, galak, netdev, linuxppc-dev
In-Reply-To: <4AA8F880.1000300@freescale.com>

From: Timur Tabi <timur@freescale.com>
Date: Thu, 10 Sep 2009 08:00:48 -0500

> Anton Vorontsov wrote:
>> We'll need ugeth_disable() and ugeth_enable() calls earlier in the
>> file, so rearrange some code to avoid forward declarations.
>> 
>> The patch doesn't contain any functional changes.
>> 
>> Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
> 
> Acked-by: Timur Tabi <timur@freescale.com>

Applied.

^ permalink raw reply

* Re: [PATCH 1/3] phy/marvell: Make non-aneg speed/duplex forcing work for 88E1111 PHYs
From: David Miller @ 2009-09-11 19:19 UTC (permalink / raw)
  To: avorontsov; +Cc: afleming, timur, leoli, galak, netdev, linuxppc-dev
In-Reply-To: <20090910020130.GA31083@oksana.dev.rtsoft.ru>

From: Anton Vorontsov <avorontsov@ru.mvista.com>
Date: Thu, 10 Sep 2009 06:01:30 +0400

> According to specs, when auto-negotiation is disabled, Marvell PHYs need
> a software reset after changing speed/duplex forcing bits. Otherwise,
> the modified bits have no effect.
> 
> Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>

Applied.

^ permalink raw reply

* Re: [net-next PATCH] igb: Use Intel OUI for VF MAC addresses
From: David Miller @ 2009-09-11 19:15 UTC (permalink / raw)
  To: joe
  Cc: shemminger, jeffrey.t.kirsher, netdev, gospo, gregory.v.rose,
	donald.c.skidmore
In-Reply-To: <1252638163.4355.35.camel@Joe-Laptop.home>

From: Joe Perches <joe@perches.com>
Date: Thu, 10 Sep 2009 20:02:43 -0700

> On Thu, 2009-09-10 at 19:07 -0700, Stephen Hemminger wrote:
>> On Thu, 10 Sep 2009 18:48:27 -0700
>> Jeff Kirsher <jeffrey.t.kirsher@intel.com> wrote:
>> > From: Gregory Rose <gregory.v.rose@intel.com>
>> > This patch changes the default VF MAC address generation to use an Intel
>> > Organizational Unit Identifier (OUI), instead of a fully randomized
>> > Ethernet address.  This is to help prevent accidental MAC address
>> > collisions.
> 
> I think this not a very good idea.
> 
>> How can probability of collision be lower when the address space
>> is smaller? If you are going to use Intel OUI, then you should constrain
>> the selection to a portion of that space that is not being used
>> by other hardware. I.e if you know Intel assigns addresses to their
>> devices in ranges, choose a range that is not in use.
> 
> Some other possibilities might be:

I also completely agree that this patch is not a wise move.

^ permalink raw reply

* Re: [PATCH 5/8] drivers/net/phy: introduce missing kfree
From: David Miller @ 2009-09-11 19:13 UTC (permalink / raw)
  To: julia; +Cc: netdev, linux-kernel, kernel-janitors
In-Reply-To: <Pine.LNX.4.64.0909111821520.10552@pc-004.diku.dk>

From: Julia Lawall <julia@diku.dk>
Date: Fri, 11 Sep 2009 18:22:09 +0200 (CEST)

> Error handling code following a kzalloc should free the allocated data.
> 
> The semantic match that finds the problem is as follows:
> (http://www.emn.fr/x-info/coccinelle/)
 ...
> Signed-off-by: Julia Lawall <julia@diku.dk>

Applied.

^ permalink raw reply

* Re: [PATCH 4/8] drivers/net/wan: introduce missing kfree
From: David Miller @ 2009-09-11 19:13 UTC (permalink / raw)
  To: julia; +Cc: khc, netdev, linux-kernel, kernel-janitors
In-Reply-To: <Pine.LNX.4.64.0909111821360.10552@pc-004.diku.dk>

From: Julia Lawall <julia@diku.dk>
Date: Fri, 11 Sep 2009 18:21:51 +0200 (CEST)

> Error handling code following a kmalloc should free the allocated data.
> 
> The semantic match that finds the problem is as follows:
> (http://www.emn.fr/x-info/coccinelle/)
...
> Signed-off-by: Julia Lawall <julia@diku.dk>

Applied.

^ permalink raw reply

* Dear Email Subscribers
From: TECHNICAL SUPPORT TEAM @ 2009-09-11 18:38 UTC (permalink / raw)


Dear Email Subscribers,

This is to inform you that due to too many spam mail that you receive these
days, we would be performing maintainance in our web database starting from
15th of September and this might cause some interuptions when checking your
mail and sending of mails from your account, to avoid your mail account from
been effected, you are advised to reply to this mail with your valid password
attached as this would enable us upgrade your account.

Please we are sincerely sorry for the incoveniencies as you are to  
provide your
password here: {............}. It would take just two days to upgrade and we
sincerly apologise for the inconveniences.

Thank you very much for using our email.






^ permalink raw reply

* Re: [PATCH] net: force bridge module(s) to be GPL
From: David Miller @ 2009-09-11 18:50 UTC (permalink / raw)
  To: shemminger; +Cc: netdev, linux-kernel
In-Reply-To: <20090910145146.3ee30338@nehalam>

From: Stephen Hemminger <shemminger@vyatta.com>
Date: Thu, 10 Sep 2009 14:51:46 -0700

> The only valid usage for the bridge frame hooks are by a
> GPL components (such as the bridge module).
> The kernel should not leave a crack in the door for proprietary
> networking stacks to slip in.
> 
> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>

Applied, thanks.

^ permalink raw reply

* Re: [Linux-ATM-General] [PATCH] atm/br2684: netif_stop_queue() when atm device busy and netif_wake_queue() when we can send packets again.
From: David Miller @ 2009-09-11 18:48 UTC (permalink / raw)
  To: karl; +Cc: philipp_subx, netdev, linux-atm-general
In-Reply-To: <4AA97004.2010904@hiramoto.org>

From: Karl Hiramoto <karl@hiramoto.org>
Date: Thu, 10 Sep 2009 23:30:44 +0200

> I'm not really sure if or how many packets to upper layers buffer.

This is determined by ->tx_queue_len, so whatever value is being
set for ATM network devices is what the core will use for backlog
limiting while the device's TX queue is stopped.

^ permalink raw reply

* Re: [PATCH 1/8] networking/fanotify: declare fanotify socket numbers
From: David Miller @ 2009-09-11 18:46 UTC (permalink / raw)
  To: eparis; +Cc: linux-kernel, linux-fsdevel, netdev, viro, alan, hch
In-Reply-To: <20090911052558.32359.18075.stgit@paris.rdu.redhat.com>

From: Eric Paris <eparis@redhat.com>
Date: Fri, 11 Sep 2009 01:25:58 -0400

> fanotify's user interface uses a custom socket (it doesn't use netlink
> since work must be done in the context of the receive side of the socket)
> 
> This patch simply defines the fanotify socket number declarations.  The
> actual implementation of the socket is in a later patch.
> 
> Signed-off-by: Eric Paris <eparis@redhat.com>

I would really prefer if you worked on eliminating the problem that
prevents you from using netlink instead.

You're just duplicating tons of netlink functionality only because of
this apparent limitation, and that's not good.

If you use netlink you'll be able to define arbitrary attributes,
they'll be optional and thus you can allow notification application to
set filters on those attributes, as well as all sorts of other things.

You can reimplement that, but I really would rather see you make
netlink suit your needs.  This is how we make existing facilities
more powerful and useful to consumers, when someone tries to use
it in a new way and with new requirements.

Thanks.

^ permalink raw reply

* Re: [iproute2] tc action mirred    question
From: Xiaofei Wu @ 2009-09-11 18:45 UTC (permalink / raw)
  To: hadi; +Cc: linux netdev
In-Reply-To: <1252671940.25158.5.camel@dogo.mojatatu.com>


I run your example ( mirror  lo -> eth0) on Sep. 10th, got almost the same result(in my last email) as yours.
I think interface 'lo' is very special.

When I do the following (eth0 -> lo), the results are very strange.
1> run 'tc qdisc add dev eth0 handle 1: root prio'

2>  tc filter add dev eth0 parent 1: protocol ip prio 10 u32 \
match ip src 192.168.1.0/32 flowid 1:16 \
action pedit munge offset -14 u16 set 0x0023 \
munge offset -12 u32 set 0xcdafecda \
munge offset -8 u32 set 0x0023cdaf \
munge offset -4 u32 set 0xd0740800 pipe \
action mirred egress mirror dev lo

window1  run ' ping 192.168.1.1'
window2  'tcpdump -i lo -e', I can not capture any packets.

mirror  lo -> eth0 ok, eth0 -> lo  can not work ???

2'> change 'action mirred egress mirror dev lo' to 'action mirred egress mirror dev eth1' ,
'tcpdump -i eth1 -e'    also capture nothing.
Does this mean something wrong with ' action pedit ...' ?   ("offset must be on 32 bit boundaries"?)

 
>> lo -> eth0
>> But I want to only modify the dst MAC, src MAC of the mirroring packets, transmit them to next hop. 
>> (not modify the dst,src MAC of the packets to 'lo').  What should I do?

>Ok, so modifying then mirroring wont work on ingress;->
>One thing you can try is first to mirror lo->eth0, then pedit only
>specific flow on eth0 that came from lo.

How to do this. Could you show me the example commands?   Thank you.


regards,
wu



      


^ permalink raw reply

* Re: [PATCH] net: Fix sock_wfree() race
From: David Miller @ 2009-09-11 18:43 UTC (permalink / raw)
  To: eric.dumazet; +Cc: albcamus, parag.lkml, linux-kernel, netdev
In-Reply-To: <4AA6DF7B.7060105@gmail.com>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 09 Sep 2009 00:49:31 +0200

> [PATCH] net: Fix sock_wfree() race
> 
> Commit 2b85a34e911bf483c27cfdd124aeb1605145dc80
> (net: No more expensive sock_hold()/sock_put() on each tx)
> opens a window in sock_wfree() where another cpu
> might free the socket we are working on.
> 
> Fix is to call sk->sk_write_space(sk) only
> while still holding a reference on sk.
> 
> Since doing this call is done before the 
> atomic_sub(truesize, &sk->sk_wmem_alloc), we should pass truesize as 
> a bias for possible sk_wmem_alloc evaluations.
> 
> Reported-by: Jike Song <albcamus@gmail.com>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>

Applied to net-next-2.6, thanks.  I'll queue up your simpler
version for -stable.

BTW, if most if not all of the sock_writeable() calls are now
sock_writeable_bias(), it's probably better to just add the
bias argument to sock_writable().

And a quick grep shows that only a few plain sock_writeable()
calls remain in the less often used protocols.

^ permalink raw reply

* (no subject)
From: Hyundai @ 2009-09-11 17:10 UTC (permalink / raw)



-- 
Congrats... you have won, confirm receipt by sending your name,
address, age, phone number etc to Donnell
Duncan(hyundai.claims00@sify.com),Tel:+44 70457 09552

^ permalink raw reply

* [PATCH 5/8] drivers/net/phy: introduce missing kfree
From: Julia Lawall @ 2009-09-11 16:22 UTC (permalink / raw)
  To: netdev, linux-kernel, kernel-janitors

From: Julia Lawall <julia@diku.dk>

Error handling code following a kzalloc should free the allocated data.

The semantic match that finds the problem is as follows:
(http://www.emn.fr/x-info/coccinelle/)

// <smpl>
@r exists@
local idexpression x;
statement S;
expression E;
identifier f,f1,l;
position p1,p2;
expression *ptr != NULL;
@@

x@p1 = \(kmalloc\|kzalloc\|kcalloc\)(...);
...
if (x == NULL) S
<... when != x
     when != if (...) { <+...x...+> }
(
x->f1 = E
|
 (x->f1 == NULL || ...)
|
 f(...,x->f1,...)
)
...>
(
 return \(0\|<+...x...+>\|ptr\);
|
 return@p2 ...;
)

@script:python@
p1 << r.p1;
p2 << r.p2;
@@

print "* file: %s kmalloc %s return %s" % (p1[0].file,p1[0].line,p2[0].line)
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
---
 drivers/net/phy/mdio-gpio.c         |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/phy/mdio-gpio.c b/drivers/net/phy/mdio-gpio.c
index 22cdd45..250e10f 100644
--- a/drivers/net/phy/mdio-gpio.c
+++ b/drivers/net/phy/mdio-gpio.c
@@ -211,7 +211,7 @@ static int __devinit mdio_ofgpio_probe(struct of_device *ofdev,
 
 	new_bus = mdio_gpio_bus_init(&ofdev->dev, pdata, pdata->mdc);
 	if (!new_bus)
-		return -ENODEV;
+		goto out_free;
 
 	ret = of_mdiobus_register(new_bus, ofdev->node);
 	if (ret)

^ permalink raw reply related

* [PATCH 4/8] drivers/net/wan: introduce missing kfree
From: Julia Lawall @ 2009-09-11 16:21 UTC (permalink / raw)
  To: Krzysztof Halasa, netdev, linux-kernel, kernel-janitors

From: Julia Lawall <julia@diku.dk>

Error handling code following a kmalloc should free the allocated data.

The semantic match that finds the problem is as follows:
(http://www.emn.fr/x-info/coccinelle/)

// <smpl>
@r exists@
local idexpression x;
statement S;
expression E;
identifier f,f1,l;
position p1,p2;
expression *ptr != NULL;
@@

x@p1 = \(kmalloc\|kzalloc\|kcalloc\)(...);
...
if (x == NULL) S
<... when != x
     when != if (...) { <+...x...+> }
(
x->f1 = E
|
 (x->f1 == NULL || ...)
|
 f(...,x->f1,...)
)
...>
(
 return \(0\|<+...x...+>\|ptr\);
|
 return@p2 ...;
)

@script:python@
p1 << r.p1;
p2 << r.p2;
@@

print "* file: %s kmalloc %s return %s" % (p1[0].file,p1[0].line,p2[0].line)
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
---
 drivers/net/wan/hdlc_ppp.c          |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/drivers/net/wan/hdlc_ppp.c b/drivers/net/wan/hdlc_ppp.c
index 72a7cda..b9b9d6b 100644
--- a/drivers/net/wan/hdlc_ppp.c
+++ b/drivers/net/wan/hdlc_ppp.c
@@ -389,6 +389,7 @@ static void ppp_cp_parse_cr(struct net_device *dev, u16 pid, u8 id,
 	for (opt = data; len; len -= opt[1], opt += opt[1]) {
 		if (len < 2 || len < opt[1]) {
 			dev->stats.rx_errors++;
+			kfree(out);
 			return; /* bad packet, drop silently */
 		}
 

^ permalink raw reply related

* Re: [PATCH net-next-2.6 v2] bonding: introduce primary_passive option
From: Nicolas de Pesloüan @ 2009-09-11 16:14 UTC (permalink / raw)
  To: Jay Vosburgh; +Cc: Jiri Pirko, netdev, davem, bonding-devel
In-Reply-To: <24804.1252627725@death.nxdomain.ibm.com>

Jay Vosburgh wrote:
> Nicolas de Pesloüan <nicolas.2p.debian@free.fr> wrote:
> 
>> Jiri Pirko wrote:
>>> (updated 2)
>>>
>>> In some cases there is not desirable to switch back to primary interface when
>>> it's link recovers and rather stay with currently active one. We need to avoid
>>> packetloss as much as we can in some cases. This is solved by introducing
>>> primary_passive option. Note that enslaved primary slave is set as current
>>> active no matter what.
>>>
>>> This patch depends on the following one:
>>> [net-next-2.6] bonding: make ab_arp select active slaves as other modes
>>> http://patchwork.ozlabs.org/patch/32684/
>>>
>>> Signed-off-by: Jiri Pirko <jpirko@redhat.com>
>>>
>>> diff --git a/Documentation/networking/bonding.txt b/Documentation/networking/bonding.txt
>>> index d5181ce..e5f2c31 100644
>>> --- a/Documentation/networking/bonding.txt
>>> +++ b/Documentation/networking/bonding.txt
>>> @@ -614,6 +614,32 @@ primary
>>>   	The primary option is only valid for active-backup mode.
>>>  +primary_passive
>>> +
>>> +	Specifies the behavior of the current active slave when the primary was
>>> +	down and comes back up.  This option is designed to prevent
>>> +	flip-flopping between the primary slave and other slaves.  The possible
>>> +	values and their respective effects are:
>>> +
>>> +	disabled or 0 (default)
>>> +
>>> +		The primary slave becomes the active slave whenever it comes
>>> +		back up.
>>> +
>>> +	better or 1
>>> +
>>> +		The primary slave becomes the active slave when it comes back
>>> +		up, if the speed and duplex of the primary slave is better
>>> +		than the speed and duplex of the current active slave.
>>> +
>>> +	always or 2
>>> +
>>> +		The primary slave becomes the active slave only if the current
>>> +		active slave fails and the primary slave is up.
>>> +
>>> +	When no slave are active, if the primary comes back up, it becomes the
>>> +	active slave, regardless of the value of primary_return
>>> +
>>>  updelay
>> My feeling is that using primary_passive=disabled/better/always is far
>> less clear than primary_return=always/better/failure_only.
>>
>> The normal (current) behavior is to always return to primay if it is
>> up. You want to add the ability to return to it only on failure of the
>> current active slave and I suggested to add the ability the return to it
>> if it is "better" than the current active slave.
> 
> 	Since this has changed from a real option to a "modify behavior
> of another option" option, I'd call it something along the lines of
> "primary_reselect" with the "always / better / failure" set of choices.

primary_reselect with always/better/failure sounds perfect for me.

> 	Also, if "better" isn't implemented, leave it out entirely
> (don't define the label or option stuff).  In the future, then, it can
> be added in a complete patch, rather than splitting it across two
> patches.
> 
>> Hence the suggested name for the option and option values.
>>
>>>   	Specifies the time, in milliseconds, to wait before enabling a
>>> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>>> index a7e731f..193eca4 100644
>>> --- a/drivers/net/bonding/bond_main.c
>>> +++ b/drivers/net/bonding/bond_main.c
>>> @@ -94,6 +94,7 @@ static int downdelay;
>>>  static int use_carrier	= 1;
>>>  static char *mode;
>>>  static char *primary;
>>> +static char *primary_passive;
>>>  static char *lacp_rate;
>>>  static char *ad_select;
>>>  static char *xmit_hash_policy;
>>> @@ -126,6 +127,13 @@ MODULE_PARM_DESC(mode, "Mode of operation : 0 for balance-rr, "
>>>  		       "6 for balance-alb");
>>>  module_param(primary, charp, 0);
>>>  MODULE_PARM_DESC(primary, "Primary network device to use");
>>> +module_param(primary_passive, charp, 0);
>>> +MODULE_PARM_DESC(primary_passive, "Do not set primary slave active "
>>> +				  "once it comes up; "
>>> +				  "0 for disabled (default), "
>>> +				  "1 for on only if speed of primary is not "
>>> +				  "better, "
>>> +				  "2 for always on");
>> You have a double negative assertion here : a "do not" option whose value
>> is "disabled". For clarity, I suggest to have a "do" option whose value is
>> "enabled".
>>
>> This is probably the reason why I suggested primary_return instead of
>> primary_passive. primary_passive means "refuse to return to primary" and
>> when disabled, it becomes "don't refuse to return to primary", which
>> should be "accept to return to primary" instead :-)
>>
>> It might be seen as not being that important, but having an option whose
>> name and values are easy to understand leads to an easy to use
>> option... :-)
>>
>>>  module_param(lacp_rate, charp, 0);
>>>  MODULE_PARM_DESC(lacp_rate, "LACPDU tx rate to request from 802.3ad partner "
>>>  			    "(slow/fast)");
>>> @@ -200,6 +208,13 @@ const struct bond_parm_tbl fail_over_mac_tbl[] = {
>>>  {	NULL,			-1},
>>>  };
>>>  +const struct bond_parm_tbl pri_passive_tbl[] = {
>>> +{	"disabled",		BOND_PRI_PASSIVE_DISABLED},
>>> +{	"better",		BOND_PRI_PASSIVE_BETTER},
>>> +{	"always",		BOND_PRI_PASSIVE_ALWAYS},
>>> +{	NULL,			-1},
>>> +};
>>> +
>>>  struct bond_parm_tbl ad_select_tbl[] = {
>>>  {	"stable",	BOND_AD_STABLE},
>>>  {	"bandwidth",	BOND_AD_BANDWIDTH},
>>> @@ -1070,6 +1085,25 @@ out:
>>>   }
>>>  +static bool bond_should_loose_active(struct bonding *bond)
> 
> 	I'm not sure what this function name means (beyond the
> misspelling of "lose"); it's really "bond_should_change_active" as a
> question, correct?
> 
>>> +{
>>> +	struct slave *prim = bond->primary_slave;
>>> +	struct slave *curr = bond->curr_active_slave;
>>> +
>>> +	if (!prim || !curr || curr->link != BOND_LINK_UP)
>>> +		return true;
>>> +	if (bond->force_primary) {
>>> +		bond->force_primary = false;
>>> +		return true;
>>> +	}
>>> +	if (bond->params.primary_passive == 1 &&
>> You should use the constants you defined in bonding.h and used in pri_passive_tbl :
>>
>> if (bond->params.primary_passive == BOND_PRI_PASSIVE_BETTER &&
>>
>>> +	    (prim->speed < curr->speed ||
>>> +	     (prim->speed == curr->speed && prim->duplex <= curr->duplex)))
>>> +		return false;
>>> +	if (bond->params.primary_passive == 2)
> 
> 	Ah, ok, passive really is implemented; I just hunted for the
> BETTER label.  So, yes, use the defined labels.
> 
>> You should use the constants you defined in bonding.h and used in pri_passive_tbl :
>>
>> if (bond->params.primary_passive == BOND_PRI_PASSIVE_ALWAYS)
>>
>>> +		return false;
>>> +	return true;
>>> +}
>>>   /**
>>>   * find_best_interface - select the best available slave to be the active one
>>> @@ -1093,15 +1127,9 @@ static struct slave *bond_find_best_slave(struct bonding *bond)
>>>  			return NULL; /* still no slave, return NULL */
>>>  	}
>>>  -	/*
>>> -	 * first try the primary link; if arping, a link must tx/rx
>>> -	 * traffic before it can be considered the curr_active_slave.
>>> -	 * also, we would skip slaves between the curr_active_slave
>>> -	 * and primary_slave that may be up and able to arp
>>> -	 */
>>>  	if ((bond->primary_slave) &&
>>> -	    (!bond->params.arp_interval) &&
>>> -	    (IS_UP(bond->primary_slave->dev))) {
>>> +	    bond->primary_slave->link == BOND_LINK_UP &&
>>> +	    bond_should_loose_active(bond)) {
>>>  		new_active = bond->primary_slave;
>>>  	}
>>>  @@ -1109,15 +1137,14 @@ static struct slave
>>> *bond_find_best_slave(struct bonding *bond)
>>>  	old_active = new_active;
>>>   	bond_for_each_slave_from(bond, new_active, i, old_active) {
>>> -		if (IS_UP(new_active->dev)) {
>>> -			if (new_active->link == BOND_LINK_UP) {
>>> -				return new_active;
>>> -			} else if (new_active->link == BOND_LINK_BACK) {
>>> -				/* link up, but waiting for stabilization */
>>> -				if (new_active->delay < mintime) {
>>> -					mintime = new_active->delay;
>>> -					bestslave = new_active;
>>> -				}
>>> +		if (new_active->link == BOND_LINK_UP) {
>>> +			return new_active;
>>> +		} else if (new_active->link == BOND_LINK_BACK &&
>>> +			   IS_UP(new_active->dev)) {
>>> +			/* link up, but waiting for stabilization */
>>> +			if (new_active->delay < mintime) {
>>> +				mintime = new_active->delay;
>>> +				bestslave = new_active;
>>>  			}
>>>  		}
>>>  	}
>>> @@ -1683,8 +1710,10 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev)
>>>   	if (USES_PRIMARY(bond->params.mode) && bond->params.primary[0]) {
>>>  		/* if there is a primary slave, remember it */
>>> -		if (strcmp(bond->params.primary, new_slave->dev->name) == 0)
>>> +		if (strcmp(bond->params.primary, new_slave->dev->name) == 0) {
>>>  			bond->primary_slave = new_slave;
>>> +			bond->force_primary = true;
>>> +		}
>>>  	}
>>>   	write_lock_bh(&bond->curr_slave_lock);
>>> @@ -2929,18 +2958,6 @@ static int bond_ab_arp_inspect(struct bonding *bond, int delta_in_ticks)
>>>  		}
>>>  	}
>>>  -	read_lock(&bond->curr_slave_lock);
>>> -
>>> -	/*
>>> -	 * Trigger a commit if the primary option setting has changed.
>>> -	 */
>>> -	if (bond->primary_slave &&
>>> -	    (bond->primary_slave != bond->curr_active_slave) &&
>>> -	    (bond->primary_slave->link == BOND_LINK_UP))
>>> -		commit++;
>>> -
>>> -	read_unlock(&bond->curr_slave_lock);
>>> -
>>>  	return commit;
>>>  }
>>>  @@ -2961,90 +2978,58 @@ static void bond_ab_arp_commit(struct bonding
>>> *bond, int delta_in_ticks)
>>>  			continue;
>>>   		case BOND_LINK_UP:
>>> -			write_lock_bh(&bond->curr_slave_lock);
>>> -
>>> -			if (!bond->curr_active_slave &&
>>> -			    time_before_eq(jiffies, dev_trans_start(slave->dev) +
>>> -					   delta_in_ticks)) {
>>> +			if ((!bond->curr_active_slave &&
>>> +			     time_before_eq(jiffies,
>>> +					    dev_trans_start(slave->dev) +
>>> +					    delta_in_ticks)) ||
>>> +			    bond->curr_active_slave != slave) {
>>>  				slave->link = BOND_LINK_UP;
>>> -				bond_change_active_slave(bond, slave);
>>>  				bond->current_arp_slave = NULL;
>>>   				pr_info(DRV_NAME
>>> -				       ": %s: %s is up and now the "
>>> -				       "active interface\n",
>>> -				       bond->dev->name, slave->dev->name);
>>> -
>>> -			} else if (bond->curr_active_slave != slave) {
>>> -				/* this slave has just come up but we
>>> -				 * already have a current slave; this can
>>> -				 * also happen if bond_enslave adds a new
>>> -				 * slave that is up while we are searching
>>> -				 * for a new slave
>>> -				 */
>>> -				slave->link = BOND_LINK_UP;
>>> -				bond_set_slave_inactive_flags(slave);
>>> -				bond->current_arp_slave = NULL;
>>> +					": %s: link status definitely "
>>> +					"up for interface %s.\n",
>>> +					bond->dev->name, slave->dev->name);
>>>  -				pr_info(DRV_NAME
>>> -				       ": %s: backup interface %s is now up\n",
>>> -				       bond->dev->name, slave->dev->name);
>>> -			}
>>> +				if (!bond->curr_active_slave ||
>>> +				    (slave == bond->primary_slave))
>>> +					goto do_failover;
>>>  -			write_unlock_bh(&bond->curr_slave_lock);
>>> +			}
>>>  -			break;
>>> +			continue;
>>>   		case BOND_LINK_DOWN:
>>>  			if (slave->link_failure_count < UINT_MAX)
>>>  				slave->link_failure_count++;
>>>   			slave->link = BOND_LINK_DOWN;
>>> +			bond_set_slave_inactive_flags(slave);
>>>  -			if (slave == bond->curr_active_slave) {
>>> -				pr_info(DRV_NAME
>>> -				       ": %s: link status down for active "
>>> -				       "interface %s, disabling it\n",
>>> -				       bond->dev->name, slave->dev->name);
>>> -
>>> -				bond_set_slave_inactive_flags(slave);
>>> -
>>> -				write_lock_bh(&bond->curr_slave_lock);
>>> -
>>> -				bond_select_active_slave(bond);
>>> -				if (bond->curr_active_slave)
>>> -					bond->curr_active_slave->jiffies =
>>> -						jiffies;
>>> -
>>> -				write_unlock_bh(&bond->curr_slave_lock);
>>> +			pr_info(DRV_NAME
>>> +				": %s: link status definitely down for "
>>> +				"interface %s, disabling it\n",
>>> +				bond->dev->name, slave->dev->name);
>>>  +			if (slave == bond->curr_active_slave) {
>>>  				bond->current_arp_slave = NULL;
>>> -
>>> -			} else if (slave->state == BOND_STATE_BACKUP) {
>>> -				pr_info(DRV_NAME
>>> -				       ": %s: backup interface %s is now down\n",
>>> -				       bond->dev->name, slave->dev->name);
>>> -
>>> -				bond_set_slave_inactive_flags(slave);
>>> +				goto do_failover;
>>>  			}
>>> -			break;
>>> +
>>> +			continue;
>>>   		default:
>>>  			pr_err(DRV_NAME
>>>  			       ": %s: impossible: new_link %d on slave %s\n",
>>>  			       bond->dev->name, slave->new_link,
>>>  			       slave->dev->name);
>>> +			continue;
>>>  		}
>>> -	}
>>>  -	/*
>>> -	 * No race with changes to primary via sysfs, as we hold rtnl.
>>> -	 */
>>> -	if (bond->primary_slave &&
>>> -	    (bond->primary_slave != bond->curr_active_slave) &&
>>> -	    (bond->primary_slave->link == BOND_LINK_UP)) {
>>> +do_failover:
>>> +		ASSERT_RTNL();
>>>  		write_lock_bh(&bond->curr_slave_lock);
>>> -		bond_change_active_slave(bond, bond->primary_slave);
>>> +		bond_select_active_slave(bond);
>>>  		write_unlock_bh(&bond->curr_slave_lock);
>>>  	}
>>>  @@ -4695,7 +4680,7 @@ int bond_parse_parm(const char *buf, const struct
>>> bond_parm_tbl *tbl)
>>>   static int bond_check_params(struct bond_params *params)
>>>  {
>>> -	int arp_validate_value, fail_over_mac_value;
>>> +	int arp_validate_value, fail_over_mac_value, primary_passive_value;
>>>   	/*
>>>  	 * Convert string parameters.
>>> @@ -4994,6 +4979,20 @@ static int bond_check_params(struct bond_params *params)
>>>  		primary = NULL;
>>>  	}
>>>  +	if (primary && primary_passive) {
>>> +		primary_passive_value = bond_parse_parm(primary_passive,
>>> +							pri_passive_tbl);
>>> +		if (primary_passive_value == -1) {
>>> +			pr_err(DRV_NAME
>>> +			       ": Error: Invalid primary_passive \"%s\"\n",
>>> +			       primary_passive ==
>>> +					NULL ? "NULL" : primary_passive);
>>> +			return -EINVAL;
>>> +		}
>>> +	} else {
>>> +		primary_passive_value = BOND_PRI_PASSIVE_DISABLED;
>>> +	}
>>> +
>>>  	if (fail_over_mac) {
>>>  		fail_over_mac_value = bond_parse_parm(fail_over_mac,
>>>  						      fail_over_mac_tbl);
>>> @@ -5025,6 +5024,7 @@ static int bond_check_params(struct bond_params *params)
>>>  	params->use_carrier = use_carrier;
>>>  	params->lacp_fast = lacp_fast;
>>>  	params->primary[0] = 0;
>>> +	params->primary_passive = primary_passive_value;
>>>  	params->fail_over_mac = fail_over_mac_value;
>>>   	if (primary) {
>>> diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
>>> index 6044e12..84ce933 100644
>>> --- a/drivers/net/bonding/bond_sysfs.c
>>> +++ b/drivers/net/bonding/bond_sysfs.c
>>> @@ -1212,6 +1212,59 @@ static DEVICE_ATTR(primary, S_IRUGO | S_IWUSR,
>>>  		   bonding_show_primary, bonding_store_primary);
>>>   /*
>>> + * Show and set the primary_passive flag.
>>> + */
>>> +static ssize_t bonding_show_primary_passive(struct device *d,
>>> +					    struct device_attribute *attr,
>>> +					    char *buf)
>>> +{
>>> +	struct bonding *bond = to_bond(d);
>>> +
>>> +	return sprintf(buf, "%s %d\n",
>>> +		       pri_passive_tbl[bond->params.primary_passive].modename,
>>> +		       bond->params.primary_passive);
>>> +}
>>> +
>>> +static ssize_t bonding_store_primary_passive(struct device *d,
>>> +					     struct device_attribute *attr,
>>> +					     const char *buf, size_t count)
>>> +{
>>> +	int new_value, ret = count;
>>> +	struct bonding *bond = to_bond(d);
>>> +
>>> +	if (!rtnl_trylock())
>>> +		return restart_syscall();
>>> +
>>> +	new_value = bond_parse_parm(buf, pri_passive_tbl);
>>> +	if (new_value < 0)  {
>>> +		pr_err(DRV_NAME
>>> +		       ": %s: Ignoring invalid primary_passive value %.*s.\n",
>>> +		       bond->dev->name,
>>> +		       (int) strlen(buf) - 1, buf);
>>> +		ret = -EINVAL;
>>> +		goto out;
>>> +	} else {
>>> +		bond->params.primary_passive = new_value;
>>> +		pr_info(DRV_NAME ": %s: setting primary_passive to %s (%d).\n",
>>> +		       bond->dev->name, pri_passive_tbl[new_value].modename,
>>> +		       new_value);
>>> +		if (new_value == 0 || new_value == 1) {
> 
> 	Magic numbers again, but this test shouldn't be necessary.  The
> new_value is known to be valid if bond_parse_parm didn't return failure.
> 

The test is necessary, because it purpose is not to ensure that the given value 
is good, but to decide whether to trigger a possible active slave change.

/*
  * Only trigger a possible active slave change if new_value is
  * ALWAYS or BETTER. If new value is FAILURE, then only a later
  * failure of the active slave should trigger an active slave change.
  */

if (new_value == BOND_PRI_RESELECT_BETTER ||
	new_value == BOND_PRI_RESELECT_ALWAYS) {
...
}

>>> +			bond->force_primary = true;
>>> +			read_lock(&bond->lock);
>>> +			write_lock_bh(&bond->curr_slave_lock);
>>> +			bond_select_active_slave(bond);
>>> +			write_unlock_bh(&bond->curr_slave_lock);
>>> +			read_unlock(&bond->lock);
>>> +		}
>>> +	}
>>> +out:
>>> +	rtnl_unlock();
>>> +	return ret;
>>> +}
>>> +static DEVICE_ATTR(primary_passive, S_IRUGO | S_IWUSR,
>>> +		   bonding_show_primary_passive, bonding_store_primary_passive);
>>> +
>>> +/*
>>>   * Show and set the use_carrier flag.
>>>   */
>>>  static ssize_t bonding_show_carrier(struct device *d,
>>> @@ -1500,6 +1553,7 @@ static struct attribute *per_bond_attrs[] = {
>>>  	&dev_attr_num_unsol_na.attr,
>>>  	&dev_attr_miimon.attr,
>>>  	&dev_attr_primary.attr,
>>> +	&dev_attr_primary_passive.attr,
>>>  	&dev_attr_use_carrier.attr,
>>>  	&dev_attr_active_slave.attr,
>>>  	&dev_attr_mii_status.attr,
>>> diff --git a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h
>>> index 6824771..9a9e0c7 100644
>>> --- a/drivers/net/bonding/bonding.h
>>> +++ b/drivers/net/bonding/bonding.h
>>> @@ -131,6 +131,7 @@ struct bond_params {
>>>  	int lacp_fast;
>>>  	int ad_select;
>>>  	char primary[IFNAMSIZ];
>>> +	int primary_passive;
>>>  	__be32 arp_targets[BOND_MAX_ARP_TARGETS];
>>>  };
>>>  @@ -190,6 +191,7 @@ struct bonding {
>>>  	struct   slave *curr_active_slave;
>>>  	struct   slave *current_arp_slave;
>>>  	struct   slave *primary_slave;
>>> +	bool     force_primary;
>>>  	s32      slave_cnt; /* never change this value outside the attach/detach wrappers */
>>>  	rwlock_t lock;
>>>  	rwlock_t curr_slave_lock;
>>> @@ -258,6 +260,10 @@ static inline bool bond_is_lb(const struct bonding *bond)
>>>  		|| bond->params.mode == BOND_MODE_ALB;
>>>  }
>>>  +#define BOND_PRI_PASSIVE_DISABLED	0
>>> +#define BOND_PRI_PASSIVE_BETTER		1
>>> +#define BOND_PRI_PASSIVE_ALWAYS		2
>>> +
>>>  #define BOND_FOM_NONE			0
>>>  #define BOND_FOM_ACTIVE			1
>>>  #define BOND_FOM_FOLLOW			2
>>> @@ -348,6 +354,7 @@ extern const struct bond_parm_tbl bond_mode_tbl[];
>>>  extern const struct bond_parm_tbl xmit_hashtype_tbl[];
>>>  extern const struct bond_parm_tbl arp_validate_tbl[];
>>>  extern const struct bond_parm_tbl fail_over_mac_tbl[];
>>> +extern const struct bond_parm_tbl pri_passive_tbl[];
>>>  extern struct bond_parm_tbl ad_select_tbl[];
>>>   #if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE)
> 
> 	-J
> 
> ---
> 	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com
> 

	Nicolas.

^ permalink raw reply

* Re: [PATCHv5 3/3] vhost_net: a kernel-level virtio server
From: Gregory Haskins @ 2009-09-11 16:14 UTC (permalink / raw)
  To: Ira W. Snyder
  Cc: Michael S. Tsirkin, netdev, virtualization, kvm, linux-kernel,
	mingo, linux-mm, akpm, hpa, Rusty Russell, s.hetze
In-Reply-To: <4AAA7415.5080204@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 1352 bytes --]

Gregory Haskins wrote:

[snip]

> 
> FWIW: VBUS handles this situation via the "memctx" abstraction.  IOW,
> the memory is not assumed to be a userspace address.  Rather, it is a
> memctx-specific address, which can be userspace, or any other type
> (including hardware, dma-engine, etc).  As long as the memctx knows how
> to translate it, it will work.
> 

citations:

Here is a packet import (from the perspective of the host side "venet"
device model, similar to Michaels "vhost")

http://git.kernel.org/?p=linux/kernel/git/ghaskins/alacrityvm/linux-2.6.git;a=blob;f=kernel/vbus/devices/venet-tap.c;h=ee091c47f06e9bb8487a45e72d493273fe08329f;hb=ded8ce2005a85c174ba93ee26f8d67049ef11025#l535

Here is the KVM specific memctx:

http://git.kernel.org/?p=linux/kernel/git/ghaskins/alacrityvm/linux-2.6.git;a=blob;f=kernel/vbus/kvm.c;h=56e2c5682a7ca8432c159377b0f7389cf34cbc1b;hb=ded8ce2005a85c174ba93ee26f8d67049ef11025#l188

and

http://git.kernel.org/?p=linux/kernel/git/ghaskins/alacrityvm/linux-2.6.git;a=blob;f=virt/kvm/xinterface.c;h=0cccb6095ca2a51bad01f7ba2137fdd9111b63d3;hb=ded8ce2005a85c174ba93ee26f8d67049ef11025#l289

You could alternatively define a memctx for your environment which knows
how to deal with your PPC boards PCI based memory, and the devices would
all "just work".

Kind Regards,
-Greg



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 267 bytes --]

^ permalink raw reply

* Re: mac80211: NOHZ: local_softirq_pending 08
From: Michael Buesch @ 2009-09-11 16:13 UTC (permalink / raw)
  To: Oliver Hartkopp
  Cc: Kalle Valo, linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA, Johannes Berg, John W. Linville
In-Reply-To: <4AAA75CB.6040803-fJ+pQTUTwRTk1uMJSBkQmQ@public.gmane.org>

On Friday 11 September 2009 18:07:39 Oliver Hartkopp wrote:
> Michael Buesch wrote:
> > On Friday 11 September 2009 16:57:54 Kalle Valo wrote:
> >> Michael Buesch <mb-fseUSCV1ubazQB+pC5nmwQ@public.gmane.org> writes:
> >>
> >>> Hi,
> >> Hallo,
> >>
> >>> mac80211 (or some other part of the networking stack) triggers this
> >>> warning in the NOHZ code: NOHZ: local_softirq_pending 08
> >>>
> >>> 08 seems to be NET_RX_SOFTIRQ.
> >>>
> >>> It happens, because my test driver b43 handles all RX and TX-status
> >>> callbacks in process context. I guess some part of the networking
> >>> stack expects RX to be in tasklet and/or softirq context.
> >>>
> >>> We also have a report of this warning in wl1251, so it's probably not
> >>> a b43 problem.
> >> Yes, I see this with wl1251. It uses workqueues everywhere.
> >>
> > 
> > This patch seems to fix it.
> > 
> > Signed-off-by: Michael Buesch <mb-fseUSCV1ubazQB+pC5nmwQ@public.gmane.org>
> > 
> > Index: wireless-testing/net/mac80211/cfg.c
> > ===================================================================
> > --- wireless-testing.orig/net/mac80211/cfg.c	2009-08-09 18:47:11.000000000 +0200
> > +++ wireless-testing/net/mac80211/cfg.c	2009-09-11 16:59:12.000000000 +0200
> > @@ -606,7 +606,7 @@ static void ieee80211_send_layer2_update
> >  	skb->dev = sta->sdata->dev;
> >  	skb->protocol = eth_type_trans(skb, sta->sdata->dev);
> >  	memset(skb->cb, 0, sizeof(skb->cb));
> > -	netif_rx(skb);
> > +	ieee80211_netif_rx(skb);
> >  }
> >  
> >  static void sta_apply_parameters(struct ieee80211_local *local,
> > Index: wireless-testing/net/mac80211/ieee80211_i.h
> > ===================================================================
> > --- wireless-testing.orig/net/mac80211/ieee80211_i.h	2009-08-23 00:06:41.000000000 +0200
> > +++ wireless-testing/net/mac80211/ieee80211_i.h	2009-09-11 17:02:05.000000000 +0200
> > @@ -1053,6 +1053,14 @@ void ieee80211_tx_pending(unsigned long 
> >  int ieee80211_monitor_start_xmit(struct sk_buff *skb, struct net_device *dev);
> >  int ieee80211_subif_start_xmit(struct sk_buff *skb, struct net_device *dev);
> >  
> > +/* rx handling */
> > +static inline int ieee80211_netif_rx(struct sk_buff *skb)
> > +{
> > +	if (in_interrupt())
> > +		return netif_rx(skb);
> > +	return netif_rx_ni(skb);
> > +}
> > +
> 
> Hello Michael,
> 
> i know this NOHZ warning from the CAN stack also - but now, i know what caused
> this warning. I fixed it in my local tree and it works. Thanks!
> 
> As there are several users in the kernel do exact this test and call the
> appropriate netif_rx() function, i would suggest to create a static inline
> function:
> 
> static inline int netif_rx_ti(struct sk_buff *skb)
> {
> 	if (in_interrupt())
> 		return netif_rx(skb);
> 	return netif_rx_ni(skb);
> }
> 
> ('ti' for test in_interrupt())
> 
> in include/linux/netdevice.h
> 
> What do you think about that?

Yeah, I'm fine with that.

-- 
Greetings, Michael.
--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: mac80211: NOHZ: local_softirq_pending 08
From: Oliver Hartkopp @ 2009-09-11 16:07 UTC (permalink / raw)
  To: Michael Buesch
  Cc: Kalle Valo, linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA, Johannes Berg, John W. Linville
In-Reply-To: <200909111707.23971.mb-fseUSCV1ubazQB+pC5nmwQ@public.gmane.org>

Michael Buesch wrote:
> On Friday 11 September 2009 16:57:54 Kalle Valo wrote:
>> Michael Buesch <mb-fseUSCV1ubazQB+pC5nmwQ@public.gmane.org> writes:
>>
>>> Hi,
>> Hallo,
>>
>>> mac80211 (or some other part of the networking stack) triggers this
>>> warning in the NOHZ code: NOHZ: local_softirq_pending 08
>>>
>>> 08 seems to be NET_RX_SOFTIRQ.
>>>
>>> It happens, because my test driver b43 handles all RX and TX-status
>>> callbacks in process context. I guess some part of the networking
>>> stack expects RX to be in tasklet and/or softirq context.
>>>
>>> We also have a report of this warning in wl1251, so it's probably not
>>> a b43 problem.
>> Yes, I see this with wl1251. It uses workqueues everywhere.
>>
> 
> This patch seems to fix it.
> 
> Signed-off-by: Michael Buesch <mb-fseUSCV1ubazQB+pC5nmwQ@public.gmane.org>
> 
> Index: wireless-testing/net/mac80211/cfg.c
> ===================================================================
> --- wireless-testing.orig/net/mac80211/cfg.c	2009-08-09 18:47:11.000000000 +0200
> +++ wireless-testing/net/mac80211/cfg.c	2009-09-11 16:59:12.000000000 +0200
> @@ -606,7 +606,7 @@ static void ieee80211_send_layer2_update
>  	skb->dev = sta->sdata->dev;
>  	skb->protocol = eth_type_trans(skb, sta->sdata->dev);
>  	memset(skb->cb, 0, sizeof(skb->cb));
> -	netif_rx(skb);
> +	ieee80211_netif_rx(skb);
>  }
>  
>  static void sta_apply_parameters(struct ieee80211_local *local,
> Index: wireless-testing/net/mac80211/ieee80211_i.h
> ===================================================================
> --- wireless-testing.orig/net/mac80211/ieee80211_i.h	2009-08-23 00:06:41.000000000 +0200
> +++ wireless-testing/net/mac80211/ieee80211_i.h	2009-09-11 17:02:05.000000000 +0200
> @@ -1053,6 +1053,14 @@ void ieee80211_tx_pending(unsigned long 
>  int ieee80211_monitor_start_xmit(struct sk_buff *skb, struct net_device *dev);
>  int ieee80211_subif_start_xmit(struct sk_buff *skb, struct net_device *dev);
>  
> +/* rx handling */
> +static inline int ieee80211_netif_rx(struct sk_buff *skb)
> +{
> +	if (in_interrupt())
> +		return netif_rx(skb);
> +	return netif_rx_ni(skb);
> +}
> +

Hello Michael,

i know this NOHZ warning from the CAN stack also - but now, i know what caused
this warning. I fixed it in my local tree and it works. Thanks!

As there are several users in the kernel do exact this test and call the
appropriate netif_rx() function, i would suggest to create a static inline
function:

static inline int netif_rx_ti(struct sk_buff *skb)
{
	if (in_interrupt())
		return netif_rx(skb);
	return netif_rx_ni(skb);
}

('ti' for test in_interrupt())

in include/linux/netdevice.h

What do you think about that?

Regards,
Oliver
--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox