* RE: rtnetlink and many VFs
From: Rose, Gregory V @ 2011-04-21 17:02 UTC (permalink / raw)
To: Ben Hutchings, David Miller; +Cc: netdev, sf-linux-drivers
In-Reply-To: <1303396576.3165.13.camel@bwh-desktop>
> -----Original Message-----
> From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org]
> On Behalf Of Ben Hutchings
> Sent: Thursday, April 21, 2011 7:36 AM
> To: David Miller
> Cc: netdev; sf-linux-drivers
> Subject: rtnetlink and many VFs
>
> My colleagues have been working on SR-IOV support for sfc. The hardware
> supports up to 127 VFs per port.
>
> If we configure all 127 VFs through the net device, an RTM_GETLINK dump
> will need to include messages describing them, with a total size of:
>
> 127 * (sizeof(struct ifla_vf_mac) + sizeof(struct ifla_vf_vlan) +
> sizeof(struct ifla_vf_tx_rate) + protocol overhead)
> > 7112
>
> These messages are nested within the message describing the device as a
> whole, so they cannot be split. The maximum size of an outgoing netlink
> message, based on NLMSG_GOODSIZE, seems to be min(PAGE_SIZE, 8192). So
> when PAGE_SIZE = 4096 it is simply impossible to dump information about
> such a device!
>
> I think it needs to be made possible to grow a netlink skb during
> generation of the first message. Userspace may still be unable to
> receive the large message but at least it has a chance.
I've been looking at this one too. The limit seems to be about 40 VFs or so in the most common case. My netlink fu is weak, but I've been looking at the code in iproute2/ip and netlink to see what we can do about it.
As more VFs become possible it really needs a fix. I was thinking about something along the lines of this:
# ip link show eth(x) vf (n)
Where eth(x) is the physical function that owns the VFs and (n) is the specific VF you want information for. That way one could easily script something that loops through the VFs and gets the information for each. This really becomes necessary when we start adding additional MAC and VLAN filters for each VF that need to be displayed. In that case you can only show a few VFs before you run out of space.
In any case I've been working on an RFC patch for this and hope to have it soon. I consider this a pretty serious limitation and one could even view it as a bug.
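A rough sketch of why the limit lands near 40, under stated assumptions: each VF is assumed to cost 72 bytes (the 36 + 12 + 8 bytes of the three structs plus four 4-byte attribute headers), and `base_bytes` stands for whatever the rest of the link message already occupies. The names `demo_max_vfs` and `DEMO_BYTES_PER_VF` are hypothetical, and the figures are illustrative rather than measured.

```c
#include <stddef.h>

/* Assumed per-VF cost: 36 + 12 + 8 bytes of struct payload plus four
 * 4-byte attribute headers (one nest header, three member headers). */
#define DEMO_BYTES_PER_VF 72u

/* Number of whole VF entries that still fit once the rest of the link
 * message (base_bytes) has been written into a buffer of buf_bytes. */
static size_t demo_max_vfs(size_t buf_bytes, size_t base_bytes)
{
    if (base_bytes >= buf_bytes)
        return 0;
    return (buf_bytes - base_bytes) / DEMO_BYTES_PER_VF;
}
```

With a 4096-byte buffer and nothing else in the message this gives 56 VFs, already well short of 127. If roughly 1.2 KB is taken by the other link attributes (an assumed figure), the result drops to about 40.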
- Greg
Greg Rose
LAD Division
Intel Corp.
^ permalink raw reply
* Re: [PATCH] tg3: Convert u32 flag,flg2,flg3 uses to bitmap
From: Joe Perches @ 2011-04-21 16:49 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Matt Carlson, Michael Chan, netdev, linux-kernel
In-Reply-To: <1303374696.3685.14.camel@edumazet-laptop>
On Thu, 2011-04-21 at 10:31 +0200, Eric Dumazet wrote:
> On Wednesday, 20 April 2011 at 23:39 -0700, Joe Perches wrote:
> > Using a bitmap instead of separate u32 flags allows a consistent, simpler
[]
> Use an enum ?
No strong preference.
If it's an enum, the .c file will change.
> Why first value is 1 and not 0 ?
Should be 0.
> > +#define TG3_FLAGS 74 /* Set to number of flags */
> Also you need to make TG3_FLAGS be (last_flag_value + 1) or you could
> miss one long in bitmap.
Right. Thanks for comments Eric.
I'll wait for Matt to comment before resubmitting.
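A minimal userspace sketch of the points raised above, assuming an enum that starts at 0 and ends with a counting enumerator, so the bitmap size is always last_flag_value + 1. All `demo_`/`DEMO_` names are hypothetical and are not the tg3 identifiers; the kernel has its own bitmap helpers, which this sketch only imitates.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flags: numbering starts at 0 and the final enumerator
 * is the flag count, i.e. (last_flag_value + 1). */
enum demo_flag {
    DEMO_FLAG_FIRST = 0,
    DEMO_FLAG_SECOND,
    DEMO_FLAG_THIRD,
    DEMO_FLAG_NUMBER_OF_FLAGS   /* must stay last */
};

#define DEMO_BITS_PER_LONG    (sizeof(unsigned long) * CHAR_BIT)
#define DEMO_BITS_TO_LONGS(n) (((n) + DEMO_BITS_PER_LONG - 1) / DEMO_BITS_PER_LONG)

struct demo_dev {
    /* Sized from the count, so the highest flag always has a slot. */
    unsigned long flags[DEMO_BITS_TO_LONGS(DEMO_FLAG_NUMBER_OF_FLAGS)];
};

static void demo_flag_set(struct demo_dev *d, enum demo_flag f)
{
    d->flags[(size_t)f / DEMO_BITS_PER_LONG] |= 1UL << ((size_t)f % DEMO_BITS_PER_LONG);
}

static void demo_flag_clear(struct demo_dev *d, enum demo_flag f)
{
    d->flags[(size_t)f / DEMO_BITS_PER_LONG] &= ~(1UL << ((size_t)f % DEMO_BITS_PER_LONG));
}

static bool demo_flag_test(const struct demo_dev *d, enum demo_flag f)
{
    return (d->flags[(size_t)f / DEMO_BITS_PER_LONG] >> ((size_t)f % DEMO_BITS_PER_LONG)) & 1UL;
}
```

The "miss one long" case shows up at a word boundary: a bitmap declared for exactly one long's worth of bits gets one long, so a flag whose value equals that count would index past the array.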
^ permalink raw reply
* Re: rfkill-input to be removed
From: Marco Chiappero @ 2011-04-21 16:45 UTC (permalink / raw)
To: Johannes Berg; +Cc: netdev
In-Reply-To: <1303402951.3597.27.camel@jlt3.sipsolutions.net>
On 21/04/2011 18:22, Johannes Berg wrote:
> Yeah we noticed this before with some other drivers. The persistent
> stuff seems to only be suitable for a small number of semantics.
Sorry, what do you mean exactly? The persistent stuff seems to work well
with those notebooks.
> Frankly, I don't think we're ready for this yet, most distros don't yet
> ship the rfkill daemon.
Ok, in the meantime I'm going to avoid the SW_RFKILL_ALL switch event
forwarding, unless the master_switch_mode=1 is going to be changed to
honor the stored power state on drivers loading as well, not only when
moving the "kill all" switch (it seems that on boot every device is
always turned on, which is very annoying).
^ permalink raw reply
* Re: RPS will assign different smp_processor_id for the same packet?
From: zhou rui @ 2011-04-21 16:29 UTC (permalink / raw)
To: Eric Dumazet; +Cc: netdev@vger.kernel.org
In-Reply-To: <1303403112.3685.61.camel@edumazet-laptop>
On Friday, April 22, 2011, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Thursday, 21 April 2011 at 18:08 +0200, Eric Dumazet wrote:
>> On Thursday, 21 April 2011 at 23:50 +0800, zhou rui wrote:
>> > kernel 2.6.36.4
>> > CONFIG_RPS=y but not set the cpu mask
>> >
>> > /sys/class/net/eth1/queues/rx-0 # cat rps_cpus
>> > 00
>> >
>> > register a hook func:
>> > prot_hook.func = packet_rcv;
>> > prot_hook.type = htons(ETH_P_ALL);
>> > dev_add_pack(&prot_hook);
>> >
>> >
>> > replay the same traffic in very slow speed, printk the
>> > smp_processor_id in packet_rcv():
>> > first time:
>> > cpu=4
>> > cpu=3
>> > cpu=6
>> > cpu=7
>> >
>> > second time:
>> > cpu=7
>> > cpu=1
>> > cpu=5
>> > cpu=2
>> >
>> > is it normal?
>>
>> Yes it is.
>>
>> What would you expect ?
>>
>
> If rps_cpus contains only '0' bits, it basically means RPS is not active
> for this input queue.
>
> The CPU is therefore not changed: the CPU handling NAPI on your network
> device calls the upper Linux stack directly.
>
> Seeing your traces, it also means your device spreads its interrupts over
> many different CPUs, which might not be optimal.
>
> Check /proc/irq/{irq_number}/smp_affinity, it probably contains "ff"
>
>
>
>
Thanks, just saw this email.
^ permalink raw reply
* Re: RPS will assign different smp_processor_id for the same packet?
From: zhou rui @ 2011-04-21 16:27 UTC (permalink / raw)
To: Eric Dumazet; +Cc: netdev@vger.kernel.org
In-Reply-To: <1303402094.3685.54.camel@edumazet-laptop>
On Friday, April 22, 2011, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Thursday, 21 April 2011 at 23:50 +0800, zhou rui wrote:
>> kernel 2.6.36.4
>> CONFIG_RPS=y but not set the cpu mask
>>
>> /sys/class/net/eth1/queues/rx-0 # cat rps_cpus
>> 00
>>
>> register a hook func:
>> prot_hook.func = packet_rcv;
>> prot_hook.type = htons(ETH_P_ALL);
>> dev_add_pack(&prot_hook);
>>
>>
>> replay the same traffic in very slow speed, printk the
>> smp_processor_id in packet_rcv():
>> first time:
>> cpu=4
>> cpu=3
>> cpu=6
>> cpu=7
>>
>> second time:
>> cpu=7
>> cpu=1
>> cpu=5
>> cpu=2
>>
>> is it normal?
>
> Yes it is.
>
> What would you expect ?
>
>
>
I want the same CPU for the same packet.
If I echo ff > rps_cpus, will I get that?
And what is the design idea behind using different CPUs in RPS?
My understanding is that the NIC assigns the same rx queue to packets that have the same hash.
^ permalink raw reply
* Re: [PATCHv4] usbnet: Resubmit interrupt URB once if halted
From: Alan Stern @ 2011-04-21 16:27 UTC (permalink / raw)
To: Paul Stewart; +Cc: netdev, linux-usb, davem, greg
In-Reply-To: <BANLkTik7Da6K4Fn7=dpo8RwwctCp8kV_iw@mail.gmail.com>
On Thu, 21 Apr 2011, Paul Stewart wrote:
> I'm trying to handle two separate issues, one of which I can't say I
> fully understand. The first, which is the most straightforward, is
> for systems in which USB devices remain powered across
> suspend-resume. For this case for sure, we don't need a flag. The
> interrupt URBs are halted (either done so by the HCD as I've observed,
> or the driver can choose to kill them in usbnet_suspend()). On system
> resume, we're guaranteed URBs have stopped, and we can just submit
> one.
Okay, good.
> In a second scenario, for other systems USB devices go unpowered
> during suspend.
As happens during hibernation, for example.
> At resume time, there's a quick succession where the
> device appears to detach and reattach and enumerate.
Right. It's called reset-resume, and drivers have a special method for
it, distinct from regular resume. In theory it shouldn't make any
difference.
> This is where
> things get strange. It appears that since the enumeration happens in
> the course of system resume, when usbnet_open() gets called, and
> usb_autopm_get_interface(), there's a call path that leads to
> usbnet_resume().
Only if the interface was suspended when usbnet_open() was called. It
might have gotten suspended automatically following the system resume,
if it wasn't in use. But this should work out the same whether or not
there was a reset-resume.
> If there's no flag, then we submit the interrupt urb
> from usbnet_resume(), so the submit_urb() in usbnet_open() fails in an
> error. This makes actions like "ifconfig eth0 up" fail on the
> interface after resume from suspend.
The driver needs better coordination between open/stop and
resume/suspend. The interrupt and receive URBs are supposed to be
active whenever the interface is up and not suspended, right? Which
means that usbnet_resume() shouldn't submit anything if the interface
isn't up.
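The coordination rule can be sketched as a toy state model, assuming the invariant "the interrupt URB is active exactly when the interface is up and not suspended". This is not driver code: every `demo_` name is hypothetical, and how "up" maps onto real driver state is deliberately left open.

```c
#include <stdbool.h>

/* Toy model only. "urb_active" stands for the interrupt URB being
 * submitted; "double_submits" counts submissions of an active URB,
 * which is where a real submit would return an error. */
struct demo_usbnet {
    bool up;             /* interface has been opened */
    int  suspend_count;  /* > 0 while suspended */
    bool urb_active;
    int  double_submits;
};

static void demo_submit(struct demo_usbnet *d)
{
    if (d->urb_active)
        d->double_submits++;
    d->urb_active = true;
}

static void demo_suspend(struct demo_usbnet *d)
{
    if (d->suspend_count++ == 0)
        d->urb_active = false;          /* kill the URB on first suspend */
}

static void demo_resume(struct demo_usbnet *d)
{
    if (--d->suspend_count == 0 && d->up)
        demo_submit(d);                 /* nothing to do if not up */
}

static void demo_open(struct demo_usbnet *d)
{
    while (d->suspend_count > 0)        /* stands in for waking the device */
        demo_resume(d);
    d->up = true;
    demo_submit(d);                     /* the only submit on this path */
}

static void demo_stop(struct demo_usbnet *d)
{
    d->up = false;
    d->urb_active = false;
}

static bool demo_invariant_ok(const struct demo_usbnet *d)
{
    return d->urb_active == (d->up && d->suspend_count == 0);
}
```

In this model a resume triggered from inside open submits nothing, because the interface is not yet marked up, so the submit in open cannot collide with it.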
Alan Stern
^ permalink raw reply
* Re: RPS will assign different smp_processor_id for the same packet?
From: Eric Dumazet @ 2011-04-21 16:25 UTC (permalink / raw)
To: zhou rui; +Cc: netdev
In-Reply-To: <1303402094.3685.54.camel@edumazet-laptop>
On Thursday, 21 April 2011 at 18:08 +0200, Eric Dumazet wrote:
> On Thursday, 21 April 2011 at 23:50 +0800, zhou rui wrote:
> > kernel 2.6.36.4
> > CONFIG_RPS=y but not set the cpu mask
> >
> > /sys/class/net/eth1/queues/rx-0 # cat rps_cpus
> > 00
> >
> > register a hook func:
> > prot_hook.func = packet_rcv;
> > prot_hook.type = htons(ETH_P_ALL);
> > dev_add_pack(&prot_hook);
> >
> >
> > replay the same traffic in very slow speed, printk the
> > smp_processor_id in packet_rcv():
> > first time:
> > cpu=4
> > cpu=3
> > cpu=6
> > cpu=7
> >
> > second time:
> > cpu=7
> > cpu=1
> > cpu=5
> > cpu=2
> >
> > is it normal?
>
> Yes it is.
>
> What would you expect ?
>
If rps_cpus contains only '0' bits, it basically means RPS is not active
for this input queue.
The CPU is therefore not changed: the CPU handling NAPI on your network
device calls the upper Linux stack directly.
Seeing your traces, it also means your device spreads its interrupts over
many different CPUs, which might not be optimal.
Check /proc/irq/{irq_number}/smp_affinity, it probably contains "ff"
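A simplified model of the behaviour described above, assuming a CPU mask of at most 32 bits and a 32-bit flow hash. The kernel's actual selection logic lives in get_rps_cpu() and does considerably more; `demo_rps_pick_cpu` is a hypothetical name and only illustrates two points: an all-zero mask means no steering, and a non-zero mask maps a given hash to the same CPU every time.

```c
#include <stdint.h>

/* Pick a CPU from a bitmask of allowed CPUs using a 32-bit flow hash.
 * Returns -1 when the mask is empty, meaning steering is inactive and
 * the packet stays on whichever CPU handled the interrupt. */
static int demo_rps_pick_cpu(uint32_t cpu_mask, uint32_t flow_hash)
{
    int cpus[32];
    uint32_t n = 0;

    /* Collect the CPUs whose bit is set in the mask. */
    for (int cpu = 0; cpu < 32; cpu++)
        if (cpu_mask & (UINT32_C(1) << cpu))
            cpus[n++] = cpu;

    if (n == 0)
        return -1;

    /* Scale the hash into [0, n) without a modulo. */
    return cpus[((uint64_t)flow_hash * n) >> 32];
}
```

With mask 0x00 the function always returns -1, which matches the traces above: the CPU seen in packet_rcv() is then decided by interrupt affinity, not by the packet's hash.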
^ permalink raw reply
* Re: rfkill-input to be removed
From: Johannes Berg @ 2011-04-21 16:22 UTC (permalink / raw)
To: Marco Chiappero; +Cc: netdev
In-Reply-To: <4DAFEAA7.5090003@absence.it>
On Thu, 2011-04-21 at 10:28 +0200, Marco Chiappero wrote:
> While working on the sony-laptop driver, adding support for
> persistent rfkill state storing and adding the SW_RFKILL_ALL switch
> event forwarding to the input core to notify userspace, I realized that
> rfkill-input interferes with the correct behavior of the driver and
> defeats the storing of the hardware device state.
Yeah we noticed this before with some other drivers. The persistent
stuff seems to only be suitable for a small number of semantics.
> Then, looking at
> Documentation/feature-removal-schedule.txt I realized that rfkill-input
> was scheduled to be removed in 2.6.33, but it's still there in 2.6.39.
> Please remove that code as soon as possible, rfkill input events should
> be handled by user space tools.
Frankly, I don't think we're ready for this yet, most distros don't yet
ship the rfkill daemon.
johannes
^ permalink raw reply
* Re: RPS will assign different smp_processor_id for the same packet?
From: Eric Dumazet @ 2011-04-21 16:08 UTC (permalink / raw)
To: zhou rui; +Cc: netdev
In-Reply-To: <BANLkTimvuGKYuiV_ngTo4wtqhNix4iMfmA@mail.gmail.com>
On Thursday, 21 April 2011 at 23:50 +0800, zhou rui wrote:
> kernel 2.6.36.4
> CONFIG_RPS=y but not set the cpu mask
>
> /sys/class/net/eth1/queues/rx-0 # cat rps_cpus
> 00
>
> register a hook func:
> prot_hook.func = packet_rcv;
> prot_hook.type = htons(ETH_P_ALL);
> dev_add_pack(&prot_hook);
>
>
> replay the same traffic in very slow speed, printk the
> smp_processor_id in packet_rcv():
> first time:
> cpu=4
> cpu=3
> cpu=6
> cpu=7
>
> second time:
> cpu=7
> cpu=1
> cpu=5
> cpu=2
>
> is it normal?
Yes it is.
What would you expect ?
^ permalink raw reply
* Re: r8169 doesn't report link state correctly.
From: Dan Williams @ 2011-04-21 15:55 UTC (permalink / raw)
To: Ben Greear; +Cc: Francois Romieu, netdev
In-Reply-To: <4DAF569E.1040507@candelatech.com>
On Wed, 2011-04-20 at 14:56 -0700, Ben Greear wrote:
> On 04/20/2011 12:14 PM, Francois Romieu wrote:
> > François Romieu<romieu@fr.zoreil.com> :
> >> On Mon, Apr 11, 2011 at 01:09:19PM -0700, Ben Greear wrote:
> >>> I notice that in kernel 2.6.38-wl, the realtek 8169 NIC doesn't
> >>> report link down when in fact there is no cable connected. Instead,
> > [...]
> >> Thanks for the report. I'll try it tomorrow or friday.
> >
> > I have not been able to notice it with a current kernel.
> >
> > I'd welcome the XID of the 8169 NIC (see dmesg) and a short explanation
> > (no cable from boot ? cable removed after ifconfig up ? brand / ability
> > of the switch / hub ?).
>
> Well, as luck would have it, my system will boot today's upstream
> kernel (39-rc4+). And, I no longer see the problem in that release,
> so it seems it is fixed (or harder to reproduce than I thought).
>
> Basically, I was seeing it claim to have link in 'ethtool' output
> when there was no cable connected. It did go to 10Mbps/half duplex
> link speed when un-plugged. It showed full 1Gbps link when plugged
> in.
>
> I'll let you know if I see this again.
I just had a user run into this bug yesterday on 2.6.38.3 and he
described the exact symptoms in
https://bugzilla.kernel.org/show_bug.cgi?id=33782 . Essentially:
- cable plugged in
- ifconfig eth0 down
- wait a few seconds
- ifconfig eth0 up
- ifconfig eth0 (no LOWER_UP is shown)
If the cable is plugged in, and the device is UP, then we should expect
LOWER_UP indicating the device has a carrier. But that's not happening.
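The symptom can be expressed as a check on the interface flags word, for example the ifi_flags field of a link message. A minimal sketch follows; the two constants copy the values of IFF_UP and IFF_LOWER_UP from <linux/if.h> so that it builds anywhere, and `demo_up_without_carrier` is a hypothetical helper name.

```c
#include <stdbool.h>

/* Values of IFF_UP and IFF_LOWER_UP, copied from <linux/if.h>. */
#define DEMO_IFF_UP        0x1u
#define DEMO_IFF_LOWER_UP  0x10000u

/* True when the interface is administratively up but reports no
 * carrier. With a cable plugged in, that is the state described above. */
static bool demo_up_without_carrier(unsigned int ifi_flags)
{
    return (ifi_flags & DEMO_IFF_UP) && !(ifi_flags & DEMO_IFF_LOWER_UP);
}
```

A script running the down/up sequence above could poll the flags and report the bug whenever this predicate stays true after the link has had time to negotiate.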
If there's any way to isolate a fix and push that fix to stable@ that
would be great...
Dan
^ permalink raw reply
* RPS will assign different smp_processor_id for the same packet?
From: zhou rui @ 2011-04-21 15:50 UTC (permalink / raw)
To: netdev
kernel 2.6.36.4
CONFIG_RPS=y but not set the cpu mask
/sys/class/net/eth1/queues/rx-0 # cat rps_cpus
00
register a hook func:
prot_hook.func = packet_rcv;
prot_hook.type = htons(ETH_P_ALL);
dev_add_pack(&prot_hook);
replay the same traffic in very slow speed, printk the
smp_processor_id in packet_rcv():
first time:
cpu=4
cpu=3
cpu=6
cpu=7
second time:
cpu=7
cpu=1
cpu=5
cpu=2
is it normal?
thanks
rui
^ permalink raw reply
* Re: rfkill-input to be removed
From: Dan Williams @ 2011-04-21 15:51 UTC (permalink / raw)
To: Marco Chiappero; +Cc: netdev, johannes
In-Reply-To: <4DB03586.3040702@absence.it>
On Thu, 2011-04-21 at 15:47 +0200, Marco Chiappero wrote:
> On 21/04/2011 10:28, Marco Chiappero wrote:
> > Please remove that code as soon as possible, rfkill input events should
> > be handled by user space tools.
>
> About this topic, I've created a patch right now, you can find it here:
> http://www.absence.it/vaio-acpi/source/patches/rfkill-input.patch
> Does it look fine?
You'll want to follow the patch submission guidelines:
http://linux.yyz.us/patch-format.html
before people will look at the patch, because many of the people who
would look at it are quite busy. That means:
1) use a subject like "[PATCH] rfkill-input: remove deprecated module"
2) Add your Signed-off-by: Your Name <your email>
3) paste your patch *inline*, not as an attachment, and make *sure* to
use the "preformat" or whatever option when you do, so that your mailer
doesn't wrap long lines
Dan
> Moreover, using checkpatch.pl I've found 3 coding style errors, I'm
> attaching a patch to fix them (apply this one first).
>
> And just one last thing: as there is no configuration option inside the
> menu, shouldn't we change the "menuconfig RFKILL" line to "config
> RFKILL" inside net/rfkill/Kconfig?
^ permalink raw reply
* Re: ipqueue allocation failure.
From: Patrick McHardy @ 2011-04-21 15:13 UTC (permalink / raw)
To: Dave Jones; +Cc: netdev
In-Reply-To: <20110420014221.GC26949@redhat.com>
On 20.04.2011 03:42, Dave Jones wrote:
> Not catastrophic, but ipqueue seems to be too trusting of what it gets
> passed from userspace, and passes it on down to the page allocator,
> where it will spew warnings if the page order is too high.
>
> __ipq_rcv_skb has several checks for lengths too small, but doesn't
> seem to have any for oversized ones. I'm not sure what the maximum
> we should check for is. I'll code up a diff if anyone has any ideas
> on a sane maximum.
A sane maximum seems to be 2^16 - 1, the maximum size of an IPv4 packet.
Please also update ip6queue and nfnetlink_queue.
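A minimal sketch of the kind of bound being suggested, assuming the check runs on the userspace-supplied length before any allocation. `demo_check_len` and `DEMO_MAX_PAYLOAD` are hypothetical names; where exactly such a check belongs in __ipq_rcv_skb, ip6queue and nfnetlink_queue is not shown here.

```c
#include <errno.h>
#include <stddef.h>

#define DEMO_MAX_PAYLOAD 0xFFFFu  /* 2^16 - 1, the maximum IPv4 packet size */

/* Returns 0 if len lies within [min_len, 65535], -EINVAL otherwise,
 * so oversized requests never reach the page allocator. */
static int demo_check_len(size_t len, size_t min_len)
{
    if (len < min_len)
        return -EINVAL;
    if (len > DEMO_MAX_PAYLOAD)
        return -EINVAL;
    return 0;
}
```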
^ permalink raw reply
* Re: [PATCHv4] usbnet: Resubmit interrupt URB once if halted
From: Paul Stewart @ 2011-04-21 14:58 UTC (permalink / raw)
To: Alan Stern; +Cc: netdev, linux-usb, davem, greg
In-Reply-To: <Pine.LNX.4.44L0.1104210958560.1939-100000@iolanthe.rowland.org>
On Thu, Apr 21, 2011 at 7:03 AM, Alan Stern <stern@rowland.harvard.edu> wrote:
> On Tue, 19 Apr 2011, Paul Stewart wrote:
>
>> Set a flag if the interrupt URB completes with ENOENT as this
>> occurs legitimately during system suspend. When the
>> usbnet_resume is called, test this flag and try once to resubmit
>> the interrupt URB.
>
> I still don't think this is the best way to go.
>
>> This version of the patch moves the urb submit directly into
>> usbnet_resume. Is it okay to submit a GFP_KERNEL urb from
>> usbnet_resume()?
>
> Yes, it is.
>
>> Signed-off-by: Paul Stewart <pstew@chromium.org>
>> ---
>> drivers/net/usb/usbnet.c | 13 ++++++++++++-
>> include/linux/usb/usbnet.h | 1 +
>> 2 files changed, 13 insertions(+), 1 deletions(-)
>>
>> diff --git a/drivers/net/usb/usbnet.c b/drivers/net/usb/usbnet.c
>> index 02d25c7..3651a48 100644
>> --- a/drivers/net/usb/usbnet.c
>> +++ b/drivers/net/usb/usbnet.c
>> @@ -482,6 +482,7 @@ static void intr_complete (struct urb *urb)
>> case -ESHUTDOWN: /* hardware gone */
>> if (netif_msg_ifdown (dev))
>> devdbg (dev, "intr shutdown, code %d", status);
>> + set_bit(EVENT_INTR_HALT, &dev->flags);
>
> Is this new flag really needed?
>
>> return;
>>
>> /* NOTE: not throttling like RX/TX, since this endpoint
>> @@ -1294,9 +1295,19 @@ int usbnet_resume (struct usb_interface *intf)
>> {
>> struct usbnet *dev = usb_get_intfdata(intf);
>>
>> - if (!--dev->suspend_count)
>> + if (!--dev->suspend_count) {
>> tasklet_schedule (&dev->bh);
>>
>> + /* resubmit interrupt URB if it was halted by suspend */
>> + if (dev->interrupt && netif_running(dev->net) &&
>> + netif_device_present(dev->net) &&
>> + test_bit(EVENT_INTR_HALT, &dev->flags)) {
>
> Why do you need the test_bit()? If the other conditions are all true,
> don't you want to resubmit the interrupt URB regardless?
I'm trying to handle two separate issues, one of which I can't say I
fully understand. The first, which is the most straightforward, is
for systems in which USB devices remain powered across
suspend-resume. For this case for sure, we don't need a flag. The
interrupt URBs are halted (either done so by the HCD as I've observed,
or the driver can choose to kill them in usbnet_suspend()). On system
resume, we're guaranteed URBs have stopped, and we can just submit
one.
In a second scenario, for other systems USB devices go unpowered
during suspend. At resume time, there's a quick succession where the
device appears to detach and reattach and enumerate. This is where
things get strange. It appears that since the enumeration happens in
the course of system resume, when usbnet_open() gets called, and
usb_autopm_get_interface(), there's a call path that leads to
usbnet_resume(). If there's no flag, then we submit the interrupt urb
from usbnet_resume(), so the submit_urb() in usbnet_open() fails in an
error. This makes actions like "ifconfig eth0 up" fail on the
interface after resume from suspend.
>
>> + clear_bit(EVENT_INTR_HALT, &dev->flags);
>> + usb_submit_urb(dev->interrupt, GFP_KERNEL);
>> + }
>> + }
>> +}
>> +
>> return 0;
>> }
>
> Alan Stern
>
>
^ permalink raw reply
* Re: [PATCH v5] net: bnx2x: convert to hw_features
From: Eric Dumazet @ 2011-04-21 14:52 UTC (permalink / raw)
To: Michał Mirosław; +Cc: netdev, Vladislav Zolotarov, Eilon Greenstein
In-Reply-To: <20110412193823.0823213A65@rere.qmqm.pl>
On Tuesday, 12 April 2011 at 21:38 +0200, Michał Mirosław wrote:
> Since ndo_fix_features callback is postponing features change when
> bp->recovery_state != BNX2X_RECOVERY_DONE, netdev_update_features()
> has to be called again when this condition changes. Previously,
> ethtool_ops->set_flags callback returned -EBUSY in that case
> (it's not possible in the new model).
>
> Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
>
> v5: - don't delay set_features, as it's rtnl_locked - same as recovery process
> v4: - complete bp->rx_csum -> NETIF_F_RXCSUM conversion
> - add check for failed ndo_set_features in ndo_open callback
> v3: - include NETIF_F_LRO in hw_features
> - don't call netdev_update_features() if bnx2x_nic_load() failed
> v2: - comment in ndo_fix_features callback
> ---
Hi guys
I am not sure it's related to these changes, but I now have in
net-next-2.6 :
[ 23.674263] ------------[ cut here ]------------
[ 23.674266] WARNING: at net/core/dev.c:1318 dev_disable_lro+0x83/0x90()
[ 23.674270] Hardware name: ProLiant BL460c G6
[ 23.674273] Modules linked in: tg3 libphy sg
[ 23.674280] Pid: 3070, comm: sysctl Tainted: G W 2.6.39-rc2-01242-g3ef22b9-dirty #669
[ 23.674282] Call Trace:
[ 23.674285] [<ffffffff813b94f3>] ? dev_disable_lro+0x83/0x90
[ 23.674291] [<ffffffff81042c9b>] warn_slowpath_common+0x8b/0xc0
[ 23.674298] [<ffffffff81042ce5>] warn_slowpath_null+0x15/0x20
[ 23.674304] [<ffffffff813b94f3>] dev_disable_lro+0x83/0x90
[ 23.674309] [<ffffffff81429789>] devinet_sysctl_forward+0x199/0x210
[ 23.674313] [<ffffffff814296e4>] ? devinet_sysctl_forward+0xf4/0x210
[ 23.674318] [<ffffffff8104e712>] ? capable+0x12/0x20
[ 23.674324] [<ffffffff81168f45>] proc_sys_call_handler+0xb5/0xd0
[ 23.674328] [<ffffffff81168f6f>] proc_sys_write+0xf/0x20
[ 23.674334] [<ffffffff81105f39>] vfs_write+0xc9/0x170
[ 23.674339] [<ffffffff81106550>] sys_write+0x50/0x90
[ 23.674345] [<ffffffff814b95a0>] sysenter_dispatch+0x7/0x33
[ 23.674350] ---[ end trace 051ec497c66b228e ]---
Thanks
^ permalink raw reply
* Re: [PATCHv4] usbnet: Resubmit interrupt URB once if halted
From: Paul Stewart @ 2011-04-21 14:44 UTC (permalink / raw)
To: Alan Stern
Cc: netdev-u79uwXL29TY76Z2rM5mHXA, linux-usb-u79uwXL29TY76Z2rM5mHXA,
davem-fT/PcQaiUtIeIZ0/mPfg9Q, greg-U8xfFu+wG4EAvxtiuMwx3w
In-Reply-To: <Pine.LNX.4.44L0.1104210937360.1939-100000-IYeN2dnnYyZXsRXLowluHWD2FQJk+8+b@public.gmane.org>
On Thu, Apr 21, 2011 at 6:43 AM, Alan Stern <stern-nwvwT67g6+6dFdvTe/nMLpVzexx5G7lz@public.gmane.org> wrote:
> On Wed, 20 Apr 2011, Paul Stewart wrote:
>
>> On Wed, Apr 20, 2011 at 2:08 PM, Alan Stern <stern-nwvwT67g6+6dFdvTe/nMLrNAH6kLmebB@public.gmane.org> wrote:
>> > On Tue, 19 Apr 2011, Paul Stewart wrote:
>> >
>> >> Set a flag if the interrupt URB completes with ENOENT as this
>> >> occurs legitimately during system suspend. When the usbnet_bh
>> >> is called after resume, test this flag and try once to resubmit
>> >> the interrupt URB.
>> >
>> > No doubt there's a good reason for doing things this way, but it isn't
>> > clear. Why wait until usbnet_bh() is called after resume? Why not
>> > resubmit the interrupt URB _during_ usbnet_resume()?
>>
>> Actually, I was doing this in the bh because of feedback I had gained
>> early in this process about not doing submit_urb in the resume().
>
> Do you have a URL for that feedback? In general, there's no reason not
> to resubmit URBs during a resume callback; lots of drivers do it. But
> usbnet may have some special requirements of its own that I'm not aware
> of.
>
>> If
>> that issue doesn't exist, that makes my work a lot easier. In testing
>> I found that just setting this to happen in the bh might be problematic
>> due to firing too early, so this is good news.
>>
>> > This would seem
>> > to be the logical approach, seeing as how usbnet_suspend() kills the
>> > interrupt URB.
>>
>> Aha! But you'll see from the current version of my patch that we don't
>> actually ever kill the interrupt URB. It gets killed all on its own (by the
>> hcd?) and handed back to us in intr_complete(). This last bit about the
>> complete function being called was lost on me for a while which is why
>> in a previous iteration of the patch I was trying to kill the urb in suspend().
>
> Why not kill the interrupt URB while suspending? It's the proper thing
> to do. Otherwise you run the risk that an event might happen at just
> the wrong time, causing the interrupt URB to complete normally, but
> _after_ the driver has finished suspending. There's a good chance the
> driver would not process the event correctly.
I don't mind killing the URB. I'd want to set the halt flag as well
(more on why have a flag in response to your other email). You're
right that there may be a race between an interrupt URB arriving and
the onset of suspend, but I really can't imagine why I can't solve
that by setting the flag if a submit_urb() fails in intr_complete().
>
> Alan Stern
>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* rtnetlink and many VFs
From: Ben Hutchings @ 2011-04-21 14:36 UTC (permalink / raw)
To: David Miller; +Cc: netdev, sf-linux-drivers
My colleagues have been working on SR-IOV support for sfc. The hardware
supports up to 127 VFs per port.
If we configure all 127 VFs through the net device, an RTM_GETLINK dump
will need to include messages describing them, with a total size of:
127 * (sizeof(struct ifla_vf_mac) + sizeof(struct ifla_vf_vlan) +
sizeof(struct ifla_vf_tx_rate) + protocol overhead)
> 7112
These messages are nested within the message describing the device as a
whole, so they cannot be split. The maximum size of an outgoing netlink
message, based on NLMSG_GOODSIZE, seems to be min(PAGE_SIZE, 8192). So
when PAGE_SIZE = 4096 it is simply impossible to dump information about
such a device!
I think it needs to be made possible to grow a netlink skb during
generation of the first message. Userspace may still be unable to
receive the large message but at least it has a chance.
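The arithmetic above can be checked with a short sketch. The three structs are local mirrors of the VF structs as they looked in <linux/if_link.h> at the time, redefined under hypothetical `demo_` names so the example builds anywhere. The second function adds an assumed protocol overhead of one 4-byte nest header per VF plus one 4-byte attribute header per struct; the real overhead may differ.

```c
#include <stddef.h>
#include <stdint.h>

/* Local mirrors of the 2011-era VF structs (36, 12 and 8 bytes). */
struct demo_ifla_vf_mac     { uint32_t vf; uint8_t mac[32]; };
struct demo_ifla_vf_vlan    { uint32_t vf; uint32_t vlan; uint32_t qos; };
struct demo_ifla_vf_tx_rate { uint32_t vf; uint32_t rate; };

#define DEMO_NLA_HDRLEN 4u  /* netlink attribute header: u16 len + u16 type */

/* Struct payload only: the lower bound quoted in the message. */
static size_t demo_vf_payload_bytes(size_t num_vfs)
{
    return num_vfs * (sizeof(struct demo_ifla_vf_mac) +
                      sizeof(struct demo_ifla_vf_vlan) +
                      sizeof(struct demo_ifla_vf_tx_rate));
}

/* Payload plus the assumed attribute headers: one nest header per VF
 * and one header in front of each of the three structs. */
static size_t demo_vf_bytes_with_headers(size_t num_vfs)
{
    size_t per_vf = DEMO_NLA_HDRLEN +
        (DEMO_NLA_HDRLEN + sizeof(struct demo_ifla_vf_mac)) +
        (DEMO_NLA_HDRLEN + sizeof(struct demo_ifla_vf_vlan)) +
        (DEMO_NLA_HDRLEN + sizeof(struct demo_ifla_vf_tx_rate));

    return num_vfs * per_vf;
}
```

For 127 VFs the payload alone is 127 * 56 = 7112 bytes, and with the assumed headers it is 127 * 72 = 9144 bytes, so the nested block exceeds both a 4096-byte and an 8192-byte message.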
Ben.
--
Ben Hutchings, Senior Software Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
^ permalink raw reply
* Re: Hight speed data sending from custom IP out of kernel
From: zhou rui @ 2011-04-21 14:31 UTC (permalink / raw)
To: juice; +Cc: monstr, netdev
In-Reply-To: <45cb2254ff23a4977c95b0f9459e39a6.squirrel@www.liukuma.net>
On Wed, Apr 20, 2011 at 12:02 AM, juice <juice@swagman.org> wrote:
>
> Hi!
>
> I can see you are probably going to run into CPU performance problems, but
> it depends a lot on the type of traffic you are going to send.
>
> My system requires quite fast processor, but even more important is to
> have a network interface card that really supports the full speed of
> gigabit ethernet line. The reason for that is that my test traffic
> includes streams of very small packets that cause a lot of overhead in
> processing.
>
> Most of my test traffic is UDP, but it does not really matter what the
> higher layers of the traffic are, this scheme operates on the ethernet
> layer and does not care about payload structure.
>
> I tried several NIC:s before i settled using Intel 82576 cards with the
> igb driver. If you have less capable interface card, your small packet
> performance is going to be a lot poorer.
>
> Using that card I can get to full speed GE line rate even with 64byte
> packets, but if you want to send larger packets, say close to 1500byte
> then almost any NIC will work OK for you.
>
> You can download the module code and the userland seeding application from
> my svn server at https://toosa.swagman.org/svn/streamgen
> The streamseed userland application requires libpcap-dev to build
> correctly but the streamgen module is self-sufficent.
>
> There is not a lot of documentation, and the module is still "work in
> progress" as I am going to fix it to work with more than one interface at
> the same time when I get to do it. Currently it can only use one interface
> on the sending host machine.
>
> - Juice -
>
Does it use the same command/config file as pktgen,
or does it need special commands?
rui
>
>> Hi Juice,
>>
>> juice wrote:
>>> Hi Michal.
>>>
>>> How fast do you need to send the data?
>>
>> It sounds weird but as fast as possible. There is no specific limit
>> because I want to create demo and test it on various hw configuration
>> which I can easily create on FPGA. For now the bottleneck is Microblaze
>> cpu. It can run from 50MHz till 170-180MHz. We also support both endians
>> and have two hw IP cores(10/100/1000) which I can use.
>>
>>> I have an application where I send test stream out to GE line and can
>>> fill the total capacity of the ethernet regardless of the packet size.
>>
>> What cpu do you use?
>>
>>>
>>> The test stream I am sending is stored in kernel memory, and therefore
>>> is limited by the amount of free memory. 200M is no problem.
>>
>> Is it UDP or TCP?
>>
>>>
>>> The solution I am using is loosely based on the pktgen module, except
>>> that my module can load a wireshark capture from userland program and
>>> then send it from ethernet interface in wire speed.
>>
>> Sound good. Would it be possible to see it and test it?
>>
>> Thanks,
>> Michal
>>
>>
>>>
>>> - Juice -
>>>
>>>
>>>> Hi,
>>>> I would like to create demo for high speed data sending from custom IP
>>>> through the ethernet. I think the best description is that there are
>>>> dmaable memory mapped registers or just memory which store data I want
>>>> to send (for example 200MB).
>>>> Linux should handle all communication between target(probably server)
>>>> and host (client) but data in the packets should go from that custom IP
>>>> and can't go through the kernel because of performance issue.
>>>> Ethernet core have own DMA which I could use but the question is if
>>>> there is any option how to convince the kernel that data will go
>>>> directly from memory mapped registers and the kernel/driver/... just
>>>> setup dma BD for headers and second for data.
>>>> Do you have any experience with any solution with passing data
>>>> completely out of kernel?
>>>> Thanks,
>>>> Michal
>>>> --
>>>> Michal Simek, Ing. (M.Eng)
>>>> w: www.monstr.eu p: +42-0-721842854
>>>> Maintainer of Linux kernel 2.6 Microblaze Linux -
>>>> http://www.monstr.eu/fdt/
>>>> Microblaze U-BOOT custodian
>>>
>>>
>>>
>>>
>>
>>
>> --
>> Michal Simek, Ing. (M.Eng)
>> w: www.monstr.eu p: +42-0-721842854
>> Maintainer of Linux kernel 2.6 Microblaze Linux -
>> http://www.monstr.eu/fdt/
>> Microblaze U-BOOT custodian
>>
>
>
^ permalink raw reply
* Re: [PATCH V3 0/8] macvtap/vhost TX zero copy support
From: Jon Mason @ 2011-04-21 14:29 UTC (permalink / raw)
To: Shirley Ma
Cc: David Miller, mst, Eric Dumazet, Avi Kivity, Arnd Bergmann,
netdev, kvm, linux-kernel
In-Reply-To: <1303328216.19336.18.camel@localhost.localdomain>
On Wed, Apr 20, 2011 at 3:36 PM, Shirley Ma <mashirle@us.ibm.com> wrote:
> This patchset adds support for TX zero-copy between guest and host
> kernel through vhost. It significantly reduces CPU utilization on the
> local host on which the guest is located (It reduced 30-50% CPU usage
> for vhost thread for single stream test). The patchset is based on
> previous submission and comments from the community regarding when/how
> to handle guest kernel buffers to be released. This is the simplest
> approach I can think of after comparing with several other solutions.
>
> This patchset includes:
>
> 1/8: Add a new sock zero-copy flag, SOCK_ZEROCOPY;
>
> 2/8: Add a new device flag, NETIF_F_ZEROCOPY for lower level device
> support zero-copy;
>
> 3/8: Add a new struct skb_ubuf_info in skb_share_info for userspace
> buffers release callback when lower device DMA has done for that skb;
>
> 4/8: Add vhost zero-copy callback in vhost when skb last refcnt is gone;
> add vhost_zerocopy_add_used_and_signal to notify guest to release TX skb
> buffers.
>
> 5/8: Add macvtap zero-copy in lower device when sending packet is
> greater than 128 bytes.
>
> 6/8: Add Chelsio 10Gb NIC to zero copy feature flag
>
> 7/8: Add Intel 10Gb NIC zero copy feature flag
>
> 8/8: Add Emulex 10Gb NIC zero copy feature flag
Why are only these 3 drivers getting support? As far as I can tell,
the only requirement is HIGHDMA. If this is the case, is there really
a need for an additional flag to support this? If you can key off of
HIGHDMA, all devices that support this would get the benefit.
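
To make the alternative concrete, here is a minimal user-space sketch of keying the zero-copy decision off HIGHDMA alone instead of a new per-driver flag. It assumes, as the question above does, that HIGHDMA really is the only requirement. The struct, the helper name and the bit values are hypothetical stand-ins for illustration; they are not the kernel's struct net_device or its NETIF_F_* definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit values for illustration only; the real NETIF_F_*
 * values live in include/linux/netdevice.h and differ from these. */
#define SKETCH_F_SG      (1u << 0)
#define SKETCH_F_HIGHDMA (1u << 1)

/* Stand-in for the 'features' field of struct net_device. */
struct sketch_netdev {
	uint32_t features;
};

/* Decide whether zero-copy TX may be attempted on this device by
 * looking only at HIGHDMA, with no dedicated zero-copy flag. */
static bool sketch_can_zerocopy(const struct sketch_netdev *dev)
{
	assert(dev != NULL);
	return (dev->features & SKETCH_F_HIGHDMA) != 0;
}
```

With this shape, any driver that already advertises HIGHDMA would qualify without a driver-side change, which is the benefit being asked about.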
> The patchset is built against most recent linux 2.6.git. It has passed
> netperf/netserver multiple streams stress test on above NICs.
>
> The single stream test results from 2.6.37 kernel on Chelsio:
>
> 64K message size: copy_from_user dropped from 40% to 5%; vhost thread
> cpu utilization dropped from 76% to 28%
>
> I am collecting more test results against 2.6.39-rc3 kernel and will
> provide the test matrix later.
>
> Thanks
> Shirley
>
>
^ permalink raw reply
* Re: [PATCHv4] usbnet: Resubmit interrupt URB once if halted
From: Alan Stern @ 2011-04-21 14:03 UTC (permalink / raw)
To: Paul Stewart
Cc: netdev, linux-usb, davem, greg
In-Reply-To: <20110420214452.C599321126-6A69KNNYBwgF248FYctl9mCaruZE5nAUZeezCHUQhQ4@public.gmane.org>
On Tue, 19 Apr 2011, Paul Stewart wrote:
> Set a flag if the interrupt URB completes with ENOENT as this
> occurs legitimately during system suspend. When the
> usbnet_resume is called, test this flag and try once to resubmit
> the interrupt URB.
I still don't think this is the best way to go.
> This version of the patch moves the urb submit directly into
> usbnet_resume. Is it okay to submit a GFP_KERNEL urb from
> usbnet_resume()?
Yes, it is.
> Signed-off-by: Paul Stewart <pstew-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
> ---
> drivers/net/usb/usbnet.c | 13 ++++++++++++-
> include/linux/usb/usbnet.h | 1 +
> 2 files changed, 13 insertions(+), 1 deletions(-)
>
> diff --git a/drivers/net/usb/usbnet.c b/drivers/net/usb/usbnet.c
> index 02d25c7..3651a48 100644
> --- a/drivers/net/usb/usbnet.c
> +++ b/drivers/net/usb/usbnet.c
> @@ -482,6 +482,7 @@ static void intr_complete (struct urb *urb)
> case -ESHUTDOWN: /* hardware gone */
> if (netif_msg_ifdown (dev))
> devdbg (dev, "intr shutdown, code %d", status);
> + set_bit(EVENT_INTR_HALT, &dev->flags);
Is this new flag really needed?
> return;
>
> /* NOTE: not throttling like RX/TX, since this endpoint
> @@ -1294,9 +1295,19 @@ int usbnet_resume (struct usb_interface *intf)
> {
> struct usbnet *dev = usb_get_intfdata(intf);
>
> - if (!--dev->suspend_count)
> + if (!--dev->suspend_count) {
> tasklet_schedule (&dev->bh);
>
> + /* resubmit interrupt URB if it was halted by suspend */
> + if (dev->interrupt && netif_running(dev->net) &&
> + netif_device_present(dev->net) &&
> + test_bit(EVENT_INTR_HALT, &dev->flags)) {
Why do you need the test_bit()? If the other conditions are all true,
don't you want to resubmit the interrupt URB regardless?
> + clear_bit(EVENT_INTR_HALT, &dev->flags);
> + usb_submit_urb(dev->interrupt, GFP_KERNEL);
> + }
> + }
> +}
> +
> return 0;
> }
Alan Stern
^ permalink raw reply
* Re: Hight speed data sending from custom IP out of kernel
From: zhou rui @ 2011-04-21 13:56 UTC (permalink / raw)
To: juice; +Cc: monstr, netdev
In-Reply-To: <45cb2254ff23a4977c95b0f9459e39a6.squirrel@www.liukuma.net>
On Wed, Apr 20, 2011 at 12:02 AM, juice <juice@swagman.org> wrote:
>
> Hi!
>
> I can see you are probably going to run into CPU performance problems, but
> it depends a lot on the type of traffic you are going to send.
>
> My system requires quite fast processor, but even more important is to
> have a network interface card that really supports the full speed of
> gigabit ethernet line. The reason for that is that my test traffic
> includes streams of very small packets that cause a lot of overhead in
> processing.
>
> Most of my test traffic is UDP, but it does not really matter what the
> higher layers of the traffic are, this scheme operates on the ethernet
> layer and does not care about payload structure.
>
> I tried several NIC:s before I settled using Intel 82576 cards with the
> igb driver. If you have less capable interface card, your small packet
> performance is going to be a lot poorer.
>
> Using that card I can get to full speed GE line rate even with 64byte
> packets, but if you want to send larger packets, say close to 1500byte
> then almost any NIC will work OK for you.
>
> You can download the module code and the userland seeding application from
> my svn server at https://toosa.swagman.org/svn/streamgen
> The streamseed userland application requires libpcap-dev to build
> correctly but the streamgen module is self-sufficient.
>
> There is not a lot of documentation, and the module is still "work in
> progress" as I am going to fix it to work with more than one interface at
> the same time when I get to do it. Currently it can only use one interface
> on the sending host machine.
>
> - Juice -
>
Does it use the same command/config file as pktgen, or a special command?
>
>> Hi Juice,
>>
>> juice wrote:
>>> Hi Michal.
>>>
>>> How fast do you need to send the data?
>>
>> It sounds weird but as fast as possible. There is no specific limit
>> because I want to create demo and test it on various hw configuration
>> which I can easily create on FPGA. For now the bottleneck is Microblaze
>> cpu. It can run from 50MHz till 170-180MHz. We also support both endians
>> and have two hw IP cores(10/100/1000) which I can use.
>>
>>> I have an application where I send test stream out to GE line and can
>>> fill the total capacity of the ethernet regardless of the packet size.
>>
>> What cpu do you use?
>>
>>>
>>> The test stream I am sending is stored in kernel memory, and therefore
>>> is limited by the amount of free memory. 200M is no problem.
>>
>> Is it UDP or TCP?
>>
>>>
>>> The solution I am using is loosely based on the pktgen module, except
>>> that my module can load a wireshark capture from userland program and
>>> then send it from ethernet interface in wire speed.
>>
>> Sound good. Would it be possible to see it and test it?
>>
>> Thanks,
>> Michal
>>
>>
>>>
>>> - Juice -
>>>
>>>
>>>> Hi,
>>>> I would like to create demo for high speed data sending from custom IP
>>>> through the ethernet. I think the best description is that there are
>>>> dmaable memory mapped registers or just memory which store data I want
>>>> to send (for example 200MB).
>>>> Linux should handle all communication between target(probably server)
>>>> and host (client) but data in the packets should go from that custom IP
>>>> and can't go through the kernel because of performance issue.
>>>> Ethernet core have own DMA which I could use but the question is if
>>>> there is any option how to convince the kernel that data will go
>>>> directly from memory mapped registers and the kernel/driver/... just
>>>> setup dma BD for headers and second for data.
>>>> Do you have any experience with any solution with passing data
>>>> completely out of kernel?
>>>> Thanks,
>>>> Michal
>>>> --
>>>> Michal Simek, Ing. (M.Eng)
>>>> w: www.monstr.eu p: +42-0-721842854
>>>> Maintainer of Linux kernel 2.6 Microblaze Linux -
>>>> http://www.monstr.eu/fdt/
>>>> Microblaze U-BOOT custodian
>>>
>>>
>>>
>>>
>>
>>
>> --
>> Michal Simek, Ing. (M.Eng)
>> w: www.monstr.eu p: +42-0-721842854
>> Maintainer of Linux kernel 2.6 Microblaze Linux -
>> http://www.monstr.eu/fdt/
>> Microblaze U-BOOT custodian
>>
>
>
^ permalink raw reply
* Re: rfkill-input to be removed
From: Marco Chiappero @ 2011-04-21 13:47 UTC (permalink / raw)
To: netdev; +Cc: johannes
In-Reply-To: <4DAFEAA7.5090003@absence.it>
[-- Attachment #1: Type: text/plain, Size: 615 bytes --]
On 21/04/2011 10:28, Marco Chiappero wrote:
> Please remove that code as soon as possible, rfkill input events should
> be handled by user space tools.
About this topic, I've created a patch right now, you can find it here:
http://www.absence.it/vaio-acpi/source/patches/rfkill-input.patch
Does it look fine?
Moreover, using checkpatch.pl I've found 3 coding style errors, I'm
attaching a patch to fix them (apply this one first).
And just one last thing: as there is no configuration option inside the
menu, shouldn't we change the "menuconfig RFKILL" line to "config
RFKILL" inside net/rfkill/Kconfig?
[-- Attachment #2: rfkill-style.patch --]
[-- Type: text/x-patch, Size: 982 bytes --]
Signed-off-by: Marco Chiappero <marco@absence.it>
--- a/net/rfkill/core.c 2011-04-19 06:26:00.000000000 +0200
+++ b/net/rfkill/core.c 2011-04-21 15:33:21.970094489 +0200
@@ -621,7 +621,7 @@ static ssize_t rfkill_hard_show(struct d
{
struct rfkill *rfkill = to_rfkill(dev);
- return sprintf(buf, "%d\n", (rfkill->state & RFKILL_BLOCK_HW) ? 1 : 0 );
+ return sprintf(buf, "%d\n", (rfkill->state & RFKILL_BLOCK_HW) ? 1 : 0);
}
static ssize_t rfkill_soft_show(struct device *dev,
@@ -630,7 +630,7 @@ static ssize_t rfkill_soft_show(struct d
{
struct rfkill *rfkill = to_rfkill(dev);
- return sprintf(buf, "%d\n", (rfkill->state & RFKILL_BLOCK_SW) ? 1 : 0 );
+ return sprintf(buf, "%d\n", (rfkill->state & RFKILL_BLOCK_SW) ? 1 : 0);
}
static ssize_t rfkill_soft_store(struct device *dev,
@@ -648,7 +648,7 @@ static ssize_t rfkill_soft_store(struct
if (err)
return err;
- if (state > 1 )
+ if (state > 1)
return -EINVAL;
mutex_lock(&rfkill_global_mutex);
^ permalink raw reply
* Re: [PATCHv4] usbnet: Resubmit interrupt URB once if halted
From: Alan Stern @ 2011-04-21 13:43 UTC (permalink / raw)
To: Paul Stewart; +Cc: netdev, linux-usb, davem, greg
In-Reply-To: <BANLkTi=N3T-V8VNOcbKu6COKvbEHqMoAog@mail.gmail.com>
On Wed, 20 Apr 2011, Paul Stewart wrote:
> On Wed, Apr 20, 2011 at 2:08 PM, Alan Stern <stern@rowland.harvard.edu> wrote:
> > On Tue, 19 Apr 2011, Paul Stewart wrote:
> >
> >> Set a flag if the interrupt URB completes with ENOENT as this
> >> occurs legitimately during system suspend. When the usbnet_bh
> >> is called after resume, test this flag and try once to resubmit
> >> the interrupt URB.
> >
> > No doubt there's a good reason for doing things this way, but it isn't
> > clear. Why wait until usbnet_bh() is called after resume? Why not
> > resubmit the interrupt URB _during_ usbnet_resume()?
>
> Actually, I was doing this in the bh because of feedback I had gained
> early in this process about not doing submit_urb in the resume().
Do you have a URL for that feedback? In general, there's no reason not
to resubmit URBs during a resume callback; lots of drivers do it. But
usbnet may have some special requirements of its own that I'm not aware
of.
> If
> that issue doesn't exist, that makes my work a lot easier. In testing
> I found that just setting this to happen in the bh might be problematic
> due to firing too early, so this is good news.
>
> > This would seem
> > to be the logical approach, seeing as how usbnet_suspend() kills the
> > interrupt URB.
>
> Aha! But you'll see from the current version of my patch that we don't
> actually ever kill the interrupt URB. It gets killed all on its own (by the
> hcd?) and handed back to us in intr_complete(). This last bit about the
> complete function being called was lost on me for a while which is why
> in a previous iteration of the patch I was trying to kill the urb in suspend().
Why not kill the interrupt URB while suspending? It's the proper thing
to do. Otherwise you run the risk that an event might happen at just
the wrong time, causing the interrupt URB to complete normally, but
_after_ the driver has finished suspending. There's a good chance the
driver would not process the event correctly.
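
The ordering argued for above can be sketched as a small user-space model: kill the interrupt URB on the first suspend, and resubmit it on the last resume whenever the device is usable, with no extra "halted" flag. Everything here is a hypothetical stand-in (struct sketch_dev and both helpers are invented for illustration); the boolean assignments only model where usb_kill_urb() and usb_submit_urb() would be called, and this is not the usbnet code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver state discussed in this thread. */
struct sketch_dev {
	int  suspend_count;   /* nested suspend depth */
	bool has_interrupt;   /* device has an interrupt URB at all */
	bool running;         /* interface is up */
	bool present;         /* device is still attached */
	bool intr_in_flight;  /* interrupt URB is currently submitted */
};

/* First suspend: cancel the interrupt URB explicitly, so that no
 * completion can arrive after the driver has finished suspending. */
static void sketch_suspend(struct sketch_dev *dev)
{
	assert(dev != NULL);
	if (dev->suspend_count++ == 0)
		dev->intr_in_flight = false;  /* models usb_kill_urb() */
}

/* Last resume: resubmit whenever the device is usable; no separate
 * flag is consulted. Returns true if the URB was resubmitted. */
static bool sketch_resume(struct sketch_dev *dev)
{
	assert(dev != NULL);
	if (--dev->suspend_count != 0)
		return false;
	if (dev->has_interrupt && dev->running && dev->present) {
		dev->intr_in_flight = true;   /* models usb_submit_urb() */
		return true;
	}
	return false;
}
```

Because suspend always cancels the URB, resume needs no record of how the URB stopped, which is why the test_bit() check looks unnecessary under this scheme.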
Alan Stern
^ permalink raw reply
* [PATCH 1/2] net: Export dev_queue_xmit_nit for use by macvlan driver
From: David Ward @ 2011-04-21 13:31 UTC (permalink / raw)
To: netdev, kaber; +Cc: David Ward
In-Reply-To: <1303392693-1350-1-git-send-email-david.ward@ll.mit.edu>
Export dev_queue_xmit_nit for use by the macvlan virtual network device
driver. Also, use 'dev' instead of 'skb->dev' in this function.
Signed-off-by: David Ward <david.ward@ll.mit.edu>
---
include/linux/netdevice.h | 2 ++
net/core/dev.c | 14 +++++++++-----
2 files changed, 11 insertions(+), 5 deletions(-)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index cb8178a..b63e517 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2099,6 +2099,8 @@ extern int dev_hard_start_xmit(struct sk_buff *skb,
struct netdev_queue *txq);
extern int dev_forward_skb(struct net_device *dev,
struct sk_buff *skb);
+extern void dev_queue_xmit_nit(struct sk_buff *skb,
+ struct net_device *dev);
extern int netdev_budget;
diff --git a/net/core/dev.c b/net/core/dev.c
index 3871bf6..e851227 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1520,11 +1520,13 @@ static inline int deliver_skb(struct sk_buff *skb,
}
/*
- * Support routine. Sends outgoing frames to any network
- * taps currently in use.
+ * dev_queue_xmit_nit - send outgoing frame to AF_PACKET sockets
+ *
+ * @skb: buffer to send
+ * @dev: network device that AF_PACKET sockets are attached to (if any)
*/
-static void dev_queue_xmit_nit(struct sk_buff *skb, struct net_device *dev)
+void dev_queue_xmit_nit(struct sk_buff *skb, struct net_device *dev)
{
struct packet_type *ptype;
struct sk_buff *skb2 = NULL;
@@ -1539,7 +1541,8 @@ static void dev_queue_xmit_nit(struct sk_buff *skb, struct net_device *dev)
(ptype->af_packet_priv == NULL ||
(struct sock *)ptype->af_packet_priv != skb->sk)) {
if (pt_prev) {
- deliver_skb(skb2, pt_prev, skb->dev);
+ atomic_inc(&skb2->users);
+ pt_prev->func(skb2, dev, pt_prev, dev);
pt_prev = ptype;
continue;
}
@@ -1572,9 +1575,10 @@ static void dev_queue_xmit_nit(struct sk_buff *skb, struct net_device *dev)
}
}
if (pt_prev)
- pt_prev->func(skb2, skb->dev, pt_prev, skb->dev);
+ pt_prev->func(skb2, dev, pt_prev, dev);
rcu_read_unlock();
}
+EXPORT_SYMBOL(dev_queue_xmit_nit);
/* netif_setup_tc - Handle tc mappings on real_num_tx_queues change
* @dev: Network device
--
1.7.4
^ permalink raw reply related
* [PATCH 2/2] macvlan: Send frames to AF_PACKET sockets attached to lowerdev
From: David Ward @ 2011-04-21 13:31 UTC (permalink / raw)
To: netdev, kaber; +Cc: David Ward
In-Reply-To: <1303392693-1350-1-git-send-email-david.ward@ll.mit.edu>
In bridge mode, unicast frames can be forwarded directly between macvlan
interfaces attached to the same lowerdev without calling dev_queue_xmit.
These frames should still be sent to any AF_PACKET sockets (network taps)
attached to the lowerdev.
Signed-off-by: David Ward <david.ward@ll.mit.edu>
---
drivers/net/macvlan.c | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)
diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
index 3ad5425..2b1ee81 100644
--- a/drivers/net/macvlan.c
+++ b/drivers/net/macvlan.c
@@ -237,6 +237,7 @@ static int macvlan_queue_xmit(struct sk_buff *skb, struct net_device *dev)
dest = macvlan_hash_lookup(port, eth->h_dest);
if (dest && dest->mode == MACVLAN_MODE_BRIDGE) {
+ dev_queue_xmit_nit(skb, vlan->lowerdev);
unsigned int length = skb->len + ETH_HLEN;
int ret = dest->forward(dest->dev, skb);
macvlan_count_rx(dest, length,
--
1.7.4
^ permalink raw reply related
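
For readers following the 2/2 patch above, this is a minimal user-space model of the behaviour it describes: on the bridge-mode shortcut the frame never reaches dev_queue_xmit on the lowerdev, so the taps have to be fed explicitly. The counters, struct and function names are hypothetical and exist only for illustration; none of this is macvlan driver code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-lowerdev counters used only by this sketch. */
struct sketch_port {
	int tap_frames;        /* frames seen by AF_PACKET taps on lowerdev */
	int bridged_frames;    /* frames forwarded macvlan-to-macvlan */
	int lower_xmit_frames; /* frames handed to the lowerdev for transmit */
};

/* Models the explicit dev_queue_xmit_nit(skb, vlan->lowerdev) call. */
static void sketch_tap_deliver(struct sketch_port *port)
{
	port->tap_frames++;
}

/* Transmit one frame. If the destination is another bridge-mode
 * macvlan on the same lowerdev, take the shortcut but still feed
 * the taps; otherwise use the normal lowerdev transmit path. */
static void sketch_xmit(struct sketch_port *port, bool dest_is_local_bridge)
{
	assert(port != NULL);
	if (dest_is_local_bridge) {
		sketch_tap_deliver(port);  /* the call the patch adds */
		port->bridged_frames++;
		return;
	}
	/* Normal path: the core transmit code feeds the taps itself. */
	port->tap_frames++;
	port->lower_xmit_frames++;
}
```

Without the sketch_tap_deliver() call, a capture on the lowerdev would see only the frames that take the normal path.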