* Re: [PATCH] rfs: Receive Flow Steering
From: Eric Dumazet @ 2010-04-02 16:28 UTC (permalink / raw)
To: Tom Herbert; +Cc: davem, netdev
In-Reply-To: <1270211871.11099.2.camel@edumazet-laptop>
Some more thoughts ...
Do we really want to call inet_rps_record_flow(sk) from inet_sendmsg() &
inet_sendpage() ?
This seems not necessary to me...
This bit :
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index f4df5f9..3060c17 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1674,6 +1674,10 @@ process:
skb->dev = NULL;
+#ifdef CONFIG_RPS
+ inet_sk(sk)->rxhash = skb->rxhash;
+#endif
+
bh_lock_sock_nested(sk);
ret = 0;
if (!sock_owned_by_user(sk)) {
1) It's not pretty, could you define a helper ?
2) Why should we set inet_sk(sk)->rxhash at each packet ?
rxhash is a constant given the (src,dst,sport,dport) tuple and could be
set at connection time. This way we could set inet_sk(sk)->rxhash both
for TCP and UDP.
(Currently, you dont set inet_sk(sk)->rxhash on UDP)
^ permalink raw reply related
* Re: [PATCH] rfs: Receive Flow Steering
From: Rick Jones @ 2010-04-02 17:01 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Changli Gao, Tom Herbert, davem, netdev
In-Reply-To: <1270193393.1936.52.camel@edumazet-laptop>
Eric Dumazet wrote:
>
> Your claim of RPS being not good for applications is wrong, our test
> results show an improvement as is. Maybe your applications dont scale,
> because of bad habits, or collidings heuristics, I dont know.
The progression in HP-UX was IPS (10.20) (aka RPS) then TOPS (11.0) (aka RFS).
We found that IPS was great for single-flow-per-thread-of-execution stuff and
that TOPS was better for multiple-flow-per-thread-of-execution stuff. It was
long enough ago now that I can safely say for one system-level benchmark not
known to be a "networking" benchmark, and without a massive kernel component,
TOPS was a 10% win. Not too shabby.
It wasn't that IPS wasn't good in its context - just that TOPS was even better.
We also preferred the concept of the scheduler giving networking clues as to
where to process an application's packets rather than networking trying to tell
the scheduler. There was some discussion of out of order worries, but we were
willing to trust to the basic soundness of the scheduler - if it was moving
threads around willy nilly at a rate able to cause big packet reordering it had
fundamental problems that would have to be addressed anyway. And while it may
be incindiary to point this out :) I suspect (without concrete data :) that
bonding mode 0 is a much, Much, MUCH larger source of out-of-order traffic than
any plausible scheduler thrashing.
happy benchmarking,
rick jones
^ permalink raw reply
* Re: [PATCH 0/2] Fix ethernet driver for Octeon based Movidis hardware
From: Greg KH @ 2010-04-02 17:04 UTC (permalink / raw)
To: David Miller; +Cc: ddaney, ralf, linux-mips, netdev
In-Reply-To: <20100401.182045.106908204.davem@davemloft.net>
On Thu, Apr 01, 2010 at 06:20:45PM -0700, David Miller wrote:
> From: David Daney <ddaney@caviumnetworks.com>
> Date: Thu, 1 Apr 2010 18:17:53 -0700
>
> > The Movidis X16 bootloader doesn't enable the mdio bus. The first
> > patch fixes this by enabling the mdio bus when the driver is
> > initialized.
> >
> > Also the addresses of the PHYs was unknown for this device. The
> > second patch adds the corresponding PHY addresses.
> >
> > With both patches applied I can successfully use all eight Ethernet
> > ports.
> >
> > Please consider for 2.6.34. Since Octeon is an embedded MIPS SOC it
> > is unlikely to break the kernel for any workstations. Any or all of
> > these could be considered for merging via Ralf's linux-mips.org tree.
>
> Ralf please merge this via your MIPS tree, thanks:
>
> Acked-by: David S. Miller <davem@davemloft.net>
I agree, Ralf, please take these and you can add:
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
to them.
thanks,
greg k-h
^ permalink raw reply
* Re: [PATCHv3] drivers/net/usb: Add new driver ipheth
From: "L. Alberto Giménez" @ 2010-04-02 17:15 UTC (permalink / raw)
To: Ben Hutchings
Cc: linux-kernel, netdev, linux-usb, oliver, linville, j.dumon,
steve.glendinning, davem, gregkh, dgiagio, dborca
In-Reply-To: <1270077538.8653.484.camel@localhost>
On 04/01/2010 01:18 AM, Ben Hutchings wrote:
> On Wed, 2010-03-31 at 21:42 +0200, L. Alberto Giménez wrote:
> [...]
>> --- /dev/null
>> +++ b/drivers/net/usb/ipheth.c
> [...]
Hi Ben,
Upstream has fixed several errors pointed out by you and Oliver (thanks
for that), but some of them are still pending.
I will send patches on top of my last driver submission (if the proper
way would be resubmit the whole code, please tell me. Anyway I need to
clarify some doubts...
>> + usb_fill_bulk_urb(dev->tx_urb, udev,
>> + usb_sndbulkpipe(udev, dev->bulk_out),
>> + dev->tx_buf, IPHETH_BUF_SIZE,
>> + ipheth_sndbulk_callback,
>> + dev);
>> + dev->tx_urb->transfer_flags |= URB_NO_TRANSFER_DMA_MAP;
>> +
>> + retval = usb_submit_urb(dev->tx_urb, GFP_ATOMIC);
>> + if (retval) {
>> + err("%s: usb_submit_urb: %d", __func__, retval);
>> + dev->stats.tx_errors++;
>> + dev_kfree_skb_irq(skb);
>> + } else {
>> + net->trans_start = jiffies;
>
> No longer needed.
What is not longer needed? The assignment, the whole "else" branch? If
the assignment is what is not needed, can I just remove that line, right?
> [...]
>> +#ifdef HAVE_NET_DEVICE_OPS
>> +static const struct net_device_ops ipheth_netdev_ops = {
>> + .ndo_open = &ipheth_open,
>> + .ndo_stop = &ipheth_close,
>> + .ndo_start_xmit = &ipheth_tx,
>> + .ndo_tx_timeout = &ipheth_tx_timeout,
>> + .ndo_get_stats = &ipheth_stats,
>> +};
>> +#endif
>
> Remove the #ifdef, there is no question whether we have net_device_ops.
Ok, I will just remove both #ifdefs, but why is that? Maybe in previous
versions of the kernel the net_device_ops struct was introduced and now
it's present no matter how you configure your kernel?
> I have no idea about USB so I haven't checked the USB API usage at all.
I think that Greg is the maintainer for the USB subsystem, so if he has
no further commets, I will try to submit fixes for both your and
Oliver's comments along with the upstream developers.
Thanks for your comments.
Best regards,
--
L. Alberto Giménez
JabberID agimenez@jabber.sysvalve.es
GnuPG key ID 0x3BAABDE1
^ permalink raw reply
* Re: [PATCHv3] drivers/net/usb: Add new driver ipheth
From: Ben Hutchings @ 2010-04-02 17:21 UTC (permalink / raw)
To: "L. Alberto Giménez"
Cc: linux-kernel, netdev, linux-usb, oliver, linville, j.dumon,
steve.glendinning, davem, gregkh, dgiagio, dborca
In-Reply-To: <4BB62620.3070807@sysvalve.es>
On Fri, 2010-04-02 at 19:15 +0200, "L. Alberto Giménez" wrote:
> On 04/01/2010 01:18 AM, Ben Hutchings wrote:
> > On Wed, 2010-03-31 at 21:42 +0200, L. Alberto Giménez wrote:
> > [...]
> >> --- /dev/null
> >> +++ b/drivers/net/usb/ipheth.c
> > [...]
>
> Hi Ben,
>
> Upstream has fixed several errors pointed out by you and Oliver (thanks
> for that), but some of them are still pending.
>
> I will send patches on top of my last driver submission (if the proper
> way would be resubmit the whole code, please tell me. Anyway I need to
> clarify some doubts...
Since David Miller has not merged your original patch, you should send a
single new patch with the changes incorporated.
> >> + usb_fill_bulk_urb(dev->tx_urb, udev,
> >> + usb_sndbulkpipe(udev, dev->bulk_out),
> >> + dev->tx_buf, IPHETH_BUF_SIZE,
> >> + ipheth_sndbulk_callback,
> >> + dev);
> >> + dev->tx_urb->transfer_flags |= URB_NO_TRANSFER_DMA_MAP;
> >> +
> >> + retval = usb_submit_urb(dev->tx_urb, GFP_ATOMIC);
> >> + if (retval) {
> >> + err("%s: usb_submit_urb: %d", __func__, retval);
> >> + dev->stats.tx_errors++;
> >> + dev_kfree_skb_irq(skb);
> >> + } else {
> >> + net->trans_start = jiffies;
> >
> > No longer needed.
>
> What is not longer needed? The assignment, the whole "else" branch? If
> the assignment is what is not needed, can I just remove that line, right?
The assignment is not needed.
> > [...]
> >> +#ifdef HAVE_NET_DEVICE_OPS
> >> +static const struct net_device_ops ipheth_netdev_ops = {
> >> + .ndo_open = &ipheth_open,
> >> + .ndo_stop = &ipheth_close,
> >> + .ndo_start_xmit = &ipheth_tx,
> >> + .ndo_tx_timeout = &ipheth_tx_timeout,
> >> + .ndo_get_stats = &ipheth_stats,
> >> +};
> >> +#endif
> >
> > Remove the #ifdef, there is no question whether we have net_device_ops.
>
> Ok, I will just remove both #ifdefs, but why is that? Maybe in previous
> versions of the kernel the net_device_ops struct was introduced and now
> it's present no matter how you configure your kernel?
[...]
Correct, it is now mandatory.
Ben.
--
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
^ permalink raw reply
* Re: [PATCH] ethtool: RXHASH flag support (v2)
From: Jeff Garzik @ 2010-04-02 17:26 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: netdev
In-Reply-To: <20100402081515.1b963562@nehalam>
Both patches look great, thanks. Will apply after Easter holiday...
Jeff
^ permalink raw reply
* Receive steering and hash and cache misses
From: Stephen Hemminger @ 2010-04-02 17:26 UTC (permalink / raw)
To: Eric Dumazet; +Cc: netdev
Although Receive Packet Steering can use a hardware generated receive hash
the device driver still causes an unnecessary cache miss on the interrupt
processing CPU. The current Ethernet network device driver receive processing
has the device driver calling eth_type_trans() which causes a the
interrupt CPU to read the received frame header.
Is there some way the hardware receive hash value could be used to
steer to the receive CPU, then have the receive CPU find the Ethernet
type field (eth_type_trans)?
^ permalink raw reply
* [PATCH] vhost: Make it more scalable by creating a vhost thread per device.
From: Sridhar Samudrala @ 2010-04-02 17:31 UTC (permalink / raw)
To: Michael S. Tsirkin, Tom Lendacky; +Cc: netdev, kvm@vger.kernel.org
Make vhost scalable by creating a separate vhost thread per vhost
device. This provides better scaling across multiple guests and with
multiple interfaces in a guest.
I am seeing better aggregated througput/latency when running netperf
across multiple guests or multiple interfaces in a guest in parallel
with this patch.
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index a6a88df..29aa80f 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -339,8 +339,10 @@ static int vhost_net_open(struct inode *inode, struct file *f)
return r;
}
- vhost_poll_init(n->poll + VHOST_NET_VQ_TX, handle_tx_net, POLLOUT);
- vhost_poll_init(n->poll + VHOST_NET_VQ_RX, handle_rx_net, POLLIN);
+ vhost_poll_init(n->poll + VHOST_NET_VQ_TX, handle_tx_net, POLLOUT,
+ &n->dev);
+ vhost_poll_init(n->poll + VHOST_NET_VQ_RX, handle_rx_net, POLLIN,
+ &n->dev);
n->tx_poll_state = VHOST_NET_POLL_DISABLED;
f->private_data = n;
@@ -643,25 +645,14 @@ static struct miscdevice vhost_net_misc = {
int vhost_net_init(void)
{
- int r = vhost_init();
- if (r)
- goto err_init;
- r = misc_register(&vhost_net_misc);
- if (r)
- goto err_reg;
- return 0;
-err_reg:
- vhost_cleanup();
-err_init:
- return r;
-
+ return misc_register(&vhost_net_misc);
}
+
module_init(vhost_net_init);
void vhost_net_exit(void)
{
misc_deregister(&vhost_net_misc);
- vhost_cleanup();
}
module_exit(vhost_net_exit);
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 7bd7a1e..243f4d3 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -36,8 +36,6 @@ enum {
VHOST_MEMORY_F_LOG = 0x1,
};
-static struct workqueue_struct *vhost_workqueue;
-
static void vhost_poll_func(struct file *file, wait_queue_head_t *wqh,
poll_table *pt)
{
@@ -56,18 +54,19 @@ static int vhost_poll_wakeup(wait_queue_t *wait, unsigned mode, int sync,
if (!((unsigned long)key & poll->mask))
return 0;
- queue_work(vhost_workqueue, &poll->work);
+ queue_work(poll->dev->wq, &poll->work);
return 0;
}
/* Init poll structure */
void vhost_poll_init(struct vhost_poll *poll, work_func_t func,
- unsigned long mask)
+ unsigned long mask, struct vhost_dev *dev)
{
INIT_WORK(&poll->work, func);
init_waitqueue_func_entry(&poll->wait, vhost_poll_wakeup);
init_poll_funcptr(&poll->table, vhost_poll_func);
poll->mask = mask;
+ poll->dev = dev;
}
/* Start polling a file. We add ourselves to file's wait queue. The caller must
@@ -96,7 +95,7 @@ void vhost_poll_flush(struct vhost_poll *poll)
void vhost_poll_queue(struct vhost_poll *poll)
{
- queue_work(vhost_workqueue, &poll->work);
+ queue_work(poll->dev->wq, &poll->work);
}
static void vhost_vq_reset(struct vhost_dev *dev,
@@ -128,6 +127,11 @@ long vhost_dev_init(struct vhost_dev *dev,
struct vhost_virtqueue *vqs, int nvqs)
{
int i;
+
+ dev->wq = create_singlethread_workqueue("vhost");
+ if (!dev->wq)
+ return -ENOMEM;
+
dev->vqs = vqs;
dev->nvqs = nvqs;
mutex_init(&dev->mutex);
@@ -143,7 +147,7 @@ long vhost_dev_init(struct vhost_dev *dev,
if (dev->vqs[i].handle_kick)
vhost_poll_init(&dev->vqs[i].poll,
dev->vqs[i].handle_kick,
- POLLIN);
+ POLLIN, dev);
}
return 0;
}
@@ -216,6 +220,8 @@ void vhost_dev_cleanup(struct vhost_dev *dev)
if (dev->mm)
mmput(dev->mm);
dev->mm = NULL;
+
+ destroy_workqueue(dev->wq);
}
static int log_access_ok(void __user *log_base, u64 addr, unsigned long sz)
@@ -1095,16 +1101,3 @@ void vhost_disable_notify(struct vhost_virtqueue *vq)
vq_err(vq, "Failed to enable notification at %p: %d\n",
&vq->used->flags, r);
}
-
-int vhost_init(void)
-{
- vhost_workqueue = create_singlethread_workqueue("vhost");
- if (!vhost_workqueue)
- return -ENOMEM;
- return 0;
-}
-
-void vhost_cleanup(void)
-{
- destroy_workqueue(vhost_workqueue);
-}
diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index 44591ba..60fefd0 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -29,10 +29,11 @@ struct vhost_poll {
/* struct which will handle all actual work. */
struct work_struct work;
unsigned long mask;
+ struct vhost_dev *dev;
};
void vhost_poll_init(struct vhost_poll *poll, work_func_t func,
- unsigned long mask);
+ unsigned long mask, struct vhost_dev *dev);
void vhost_poll_start(struct vhost_poll *poll, struct file *file);
void vhost_poll_stop(struct vhost_poll *poll);
void vhost_poll_flush(struct vhost_poll *poll);
@@ -110,6 +111,7 @@ struct vhost_dev {
int nvqs;
struct file *log_file;
struct eventfd_ctx *log_ctx;
+ struct workqueue_struct *wq;
};
long vhost_dev_init(struct vhost_dev *, struct vhost_virtqueue *vqs, int nvqs);
@@ -136,9 +138,6 @@ bool vhost_enable_notify(struct vhost_virtqueue *);
int vhost_log_write(struct vhost_virtqueue *vq, struct vhost_log *log,
unsigned int log_num, u64 len);
-int vhost_init(void);
-void vhost_cleanup(void);
-
#define vq_err(vq, fmt, ...) do { \
pr_debug(pr_fmt(fmt), ##__VA_ARGS__); \
if ((vq)->error_ctx) \
^ permalink raw reply related
* Re: [PATCH 1/2] phylib: Support phy module autoloading
From: Andy Fleming @ 2010-04-02 17:51 UTC (permalink / raw)
To: David Woodhouse; +Cc: davem, netdev, ben
In-Reply-To: <1270206327.3101.2436.camel@macbook.infradead.org>
On Fri, Apr 2, 2010 at 6:05 AM, David Woodhouse <dwmw2@infradead.org> wrote:
> We don't use the normal hotplug mechanism because it doesn't work. It will
> load the module some time after the device appears, but that's not good
> enough for us -- we need the driver loaded _immediately_ because otherwise
> the NIC driver may just abort and then the phy 'device' goes away.
>
> [bwh: s/phy/mdio/ in module alias, kerneldoc for struct mdio_device_id]
>
> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Neat!
Acked-by: Andy Fleming <afleming@freescale.com>
^ permalink raw reply
* Re: [PATCHv3] drivers/net/usb: Add new driver ipheth
From: "L. Alberto Giménez" @ 2010-04-02 17:53 UTC (permalink / raw)
To: Ben Hutchings
Cc: "L. Alberto Giménez", linux-kernel, netdev,
linux-usb, oliver, linville, j.dumon, steve.glendinning, davem,
gregkh, dgiagio, Daniel Borca
In-Reply-To: <1270228883.12516.140.camel@localhost>
On 04/02/2010 07:21 PM, Ben Hutchings wrote:
> On Fri, 2010-04-02 at 19:15 +0200, "L. Alberto Giménez" wrote:
>> On 04/01/2010 01:18 AM, Ben Hutchings wrote:
>>> On Wed, 2010-03-31 at 21:42 +0200, L. Alberto Giménez wrote:
>>> [...]
>>>> --- /dev/null
>>>> +++ b/drivers/net/usb/ipheth.c
>>> [...]
>>>> + usb_fill_bulk_urb(dev->tx_urb, udev,
>>>> + usb_sndbulkpipe(udev, dev->bulk_out),
>>>> + dev->tx_buf, IPHETH_BUF_SIZE,
>>>> + ipheth_sndbulk_callback,
>>>> + dev);
>>>> + dev->tx_urb->transfer_flags |= URB_NO_TRANSFER_DMA_MAP;
>>>> +
>>>> + retval = usb_submit_urb(dev->tx_urb, GFP_ATOMIC);
>>>> + if (retval) {
>>>> + err("%s: usb_submit_urb: %d", __func__, retval);
>>>> + dev->stats.tx_errors++;
>>>> + dev_kfree_skb_irq(skb);
>>>> + } else {
>>>> + net->trans_start = jiffies;
>>> No longer needed.
>> What is not longer needed? The assignment, the whole "else" branch? If
>> the assignment is what is not needed, can I just remove that line, right?
>
> The assignment is not needed.
Hi,
I've been looking into this and it seems that the net_device.trans_start
field is now deprecated in favor of net_device.rx_queue.trans_start
-rx_queue is a struct netdev_queue- (file include/linux/netdevice.h), as
states the comment:
512 /*
513 * please use this field instead of dev->trans_start
514 */
515 unsigned long trans_start;
Reading LDD3 book, it says that the driver is reponsible for updating
trans_start (as well as trans_rx, but we're not talking about that one).
So, I guess that the LDD book is outdated (again ;) ), but what I don't
understand at all is wether the driver should keep updating the right
field (dev->rx_queue.trans_start) or if the fact that it's inside a
"queue" implies that the code that is responsible for that queue would
update the trans_start field by itself?
Thanks in advance.
Best regards,
--
L. Alberto Giménez
JabberID agimenez@jabber.sysvalve.es
GnuPG key ID 0x3BAABDE1
^ permalink raw reply
* Re: Receive steering and hash and cache misses
From: Tom Herbert @ 2010-04-02 17:59 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: Eric Dumazet, netdev
In-Reply-To: <20100402102650.5bdb5b52@nehalam>
On Fri, Apr 2, 2010 at 10:26 AM, Stephen Hemminger
<shemminger@vyatta.com> wrote:
>
> Although Receive Packet Steering can use a hardware generated receive hash
> the device driver still causes an unnecessary cache miss on the interrupt
> processing CPU. The current Ethernet network device driver receive processing
> has the device driver calling eth_type_trans() which causes a the
> interrupt CPU to read the received frame header.
>
It should be possible to deduce the values set by eth_type_trans from
the RX descriptor along with the RX hash. I'll post the patch getting
rxhash from bnx2x which does this.
> Is there some way the hardware receive hash value could be used to
> steer to the receive CPU, then have the receive CPU find the Ethernet
> type field (eth_type_trans)?
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* Re: Any ideas about a crash on reboot with igb and intel_iommu?
From: Roland Dreier @ 2010-04-02 18:13 UTC (permalink / raw)
To: netdev; +Cc: iommu, David Woodhouse
In-Reply-To: <adabpe3f3c4.fsf@roland-alpha.cisco.com>
So actually I found the following change that went into 2.6.31:
commit 91615f765a2935b6cbae424b9eee1585ed681ae6
Author: Jesse Brandeburg <jesse.brandeburg@intel.com>
Date: Tue Jun 30 12:45:15 2009 +0000
igb: fix unmap length bug
driver was mixing NET_IP_ALIGN count bytes in map/unmap calls
unevenly. Only map the bytes that the hardware might dma into
igb in 2.6.30.y is doing pci_map_single(<some length>) and doing
pci_unmap_single(<some other length>).
However I haven't been able to provoke the crash yet, even by bouncing
an igb interface (with VT-d turned on and with CONFIG_DMAR_DEFAULT_ON enabled).
Does this make sense as the sort of thing that might corrupt the iova
rbtree and lead to a crash in the __free_iova / rb_erase code?
Thanks,
Roland
--
Roland Dreier <rolandd@cisco.com> || For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/index.html
^ permalink raw reply
* Re: [PATCH 5/5] netfilter: xt_TEE: have cloned packet travel through Xtables too
From: Jan Engelhardt @ 2010-04-02 18:15 UTC (permalink / raw)
To: Patrick McHardy; +Cc: netfilter-devel, netdev
In-Reply-To: <4BB4A796.8010708@trash.net>
On Thursday 2010-04-01 16:03, Patrick McHardy wrote:
>>>>[detecting teed packets getting teed again by means of
>>>> iptables -A OUTPUT -j TEE]
>>>
>>> What's wrong with adding a reentrancy counter?
>>
>> Sounds like a plan.
Should we be using a percpu variable, or is a simplistic
array ok too?
static bool tee_active;
target(...)
{
if (tee_active[smp_processor_id()])
return XT_CONTINUE;
...
if (tee_tg4_route(...)) {
tee_active[cpu] = true;
ip_local_out(skb);
tee_active[cpu] = false;
}
}
^ permalink raw reply
* Re: [PATCHv3] drivers/net/usb: Add new driver ipheth
From: "L. Alberto Giménez" @ 2010-04-02 18:23 UTC (permalink / raw)
To: Oliver Neukum
Cc: linux-kernel, netdev, linux-usb, linville, j.dumon,
steve.glendinning, davem, gregkh, dgiagio, dborca
In-Reply-To: <201003312233.26130.oliver@neukum.org>
On 03/31/2010 10:33 PM, Oliver Neukum wrote:
> Am Mittwoch, 31. März 2010 21:42:07 schrieb L. Alberto Giménez:
Hi Oliver,
Just like with Ben's comments I still have a couple of doubts about your
comments.
>> +
>> +static int ipheth_open(struct net_device *net)
>> +{
>> + struct ipheth_device *dev = netdev_priv(net);
>> + struct usb_device *udev = dev->udev;
>> + int retval = 0;
>> +
>> + usb_set_interface(udev, IPHETH_INTFNUM, IPHETH_ALT_INTFNUM);
>> + usb_clear_halt(udev, usb_rcvbulkpipe(udev, dev->bulk_in));
>> + usb_clear_halt(udev, usb_sndbulkpipe(udev, dev->bulk_out));
>
> Is this really needed? If so, please add a comment.
I understand that usb_clear_halt is only needed when the device has
transmitted data, and as it is "open" time, we can assume that no
transmissions ere made, so we don't need to clear anything (aka: remove
both lines), am I right?
>> +
>> + retval = ipheth_carrier_set(dev);
>> + if (retval)
>> + goto error;
>> +
>> + retval = ipheth_rx_submit(dev, GFP_KERNEL);
>> + if (retval)
>> + goto error;
>> +
>> + schedule_delayed_work(&dev->carrier_work, IPHETH_CARRIER_CHECK_TIMEOUT);
>
> Does it make sense to start rx while you have no carrier?
Well, I have no clue about this one. I think that upstream developers
should take a look into this (Dario, Daniel, could you?) since I don't
have the knowledge to decide what to do about it.
But I assume that as with the previous one, we have just opened the
device and we aren't (yet) doing anything with it, so we shouldn't start rx?
>> +static void ipheth_disconnect(struct usb_interface *intf)
>> +{
>> + struct ipheth_device *dev;
>> +
>> + dev = usb_get_intfdata(intf);
>> + if (dev != NULL) {
>
> is this check needed?
Does usb_get_infdata always return not NULL? I haven't found anything
about it (just manual pages for the function, but can't spot if it
cannot return NULL). We disconnected the device, but I understand that
the kernel still has the information and the allocated memory, so the
cleanup code is still needed, isn't it?
>
>> +static struct usb_driver ipheth_driver = {
>> + .name = "ipheth",
>> + .probe = ipheth_probe,
>> + .disconnect = ipheth_disconnect,
>> + .id_table = ipheth_table,
>> + .supports_autosuspend = 0,
>
> redundant
Why?
Thanks a lot in advance.
Best regards,
--
L. Alberto Giménez
JabberID agimenez@jabber.sysvalve.es
GnuPG key ID 0x3BAABDE1
^ permalink raw reply
* Re: [PATCH] rfs: Receive Flow Steering
From: Rick Jones @ 2010-04-02 18:25 UTC (permalink / raw)
To: Tom Herbert; +Cc: Eric Dumazet, Changli Gao, davem, netdev
In-Reply-To: <g2i65634d661004021045uff7c0e25ge7dfd17929bc9ee9@mail.gmail.com>
Tom Herbert wrote:
>
>
> On Fri, Apr 2, 2010 at 10:01 AM, Rick Jones <rick.jones2@hp.com
> <mailto:rick.jones2@hp.com>> wrote:
>
> Eric Dumazet wrote:
>
>
> Your claim of RPS being not good for applications is wrong, our test
> results show an improvement as is. Maybe your applications dont
> scale,
> because of bad habits, or collidings heuristics, I dont know.
>
>
> The progression in HP-UX was IPS (10.20) (aka RPS) then TOPS (11.0)
> (aka RFS). We found that IPS was great for
> single-flow-per-thread-of-execution stuff and that TOPS was better
> for multiple-flow-per-thread-of-execution stuff. It was long enough
> ago now that I can safely say for one system-level benchmark not
> known to be a "networking" benchmark, and without a massive kernel
> component, TOPS was a 10% win. Not too shabby.
>
> It wasn't that IPS wasn't good in its context - just that TOPS was
> even better.
>
> I would assume that with IPS threads would migrate to where packets were
> being delivered thus giving the same sort of locality TOPS was
> providing? That would work great without any other constraints
> (multiple flows per thread, thread CPU bindings, etc.).
Well... that depended - at the time, and still, we were and are also encouraging
users and app designers to make copious use of processor/locality affinity (SMP
and NUMA going back far longer in the RISC et al space than the x86 space). So,
it was and is entirely possible that the application thread of execution is
hard-bound to a specific core/locality. Also, I do not recall if HP-UX was as
aggressive about waking a process/thread on the processor from which the wake-up
came vs on the processor on which it last ran.
> We also preferred the concept of the scheduler giving networking
> clues as to where to process an application's packets rather than
> networking trying to tell the scheduler. There was some discussion
> of out of order worries, but we were willing to trust to the basic
> soundness of the scheduler - if it was moving threads around willy
> nilly at a rate able to cause big packet reordering it had
> fundamental problems that would have to be addressed anyway.
>
>
> I also think scheduler leading networking, like in RPS, is generally
> more scalable. As for OOO packets, I've spent way to much time trying
> to convince the bean-counters that a small number of them aren't
> problematic :-), in the end it's just easier to not introduce new
> mechanisms that will cause them!
So long as it doesn't drive you to produce new mechanisms heavier than they
would have otherwise been.
The irony in the case of HP-UX IPS was that it was put in place in response to
the severe out of order packet problems in HP-UX in 10.X before 10.20 - there
were multiple netisr processes and only one netisr queue. The other little
tweak that came along in 10.20 with IPS, was inaddition to having a per
processor (well, per core in today's parlance) netisr queue, the netisr would
grab the entire queue under the one spinlock and work off of that. That was
nice because the code path became more efficient under load - more packets
processed per spinlock/unlock pair.
happy benchmarking,
rick jones
^ permalink raw reply
* Re: [PATCHv3] drivers/net/usb: Add new driver ipheth
From: Ben Hutchings @ 2010-04-02 18:35 UTC (permalink / raw)
To: "L. Alberto Giménez"
Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
netdev-u79uwXL29TY76Z2rM5mHXA, linux-usb-u79uwXL29TY76Z2rM5mHXA,
oliver-GvhC2dPhHPQdnm+yROfE0A, linville-2XuSBdqkA4R54TAoqtyWWQ,
j.dumon-x9gZzRpC1QbQT0dZR+AlfA, steve.glendinning-sdUf+H5yV5I,
davem-fT/PcQaiUtIeIZ0/mPfg9Q, gregkh-l3A5Bk7waGM,
dgiagio-Re5JQEeQqe8AvxtiuMwx3w, Daniel Borca
In-Reply-To: <4BB62F33.1020507-lqZFv/KUvpAxAGwisGp4zA@public.gmane.org>
On Fri, 2010-04-02 at 19:53 +0200, "L. Alberto Giménez" wrote:
> On 04/02/2010 07:21 PM, Ben Hutchings wrote:
> > On Fri, 2010-04-02 at 19:15 +0200, "L. Alberto Giménez" wrote:
> >> On 04/01/2010 01:18 AM, Ben Hutchings wrote:
> >>> On Wed, 2010-03-31 at 21:42 +0200, L. Alberto Giménez wrote:
> >>> [...]
> >>>> --- /dev/null
> >>>> +++ b/drivers/net/usb/ipheth.c
> >>> [...]
> >>>> + usb_fill_bulk_urb(dev->tx_urb, udev,
> >>>> + usb_sndbulkpipe(udev, dev->bulk_out),
> >>>> + dev->tx_buf, IPHETH_BUF_SIZE,
> >>>> + ipheth_sndbulk_callback,
> >>>> + dev);
> >>>> + dev->tx_urb->transfer_flags |= URB_NO_TRANSFER_DMA_MAP;
> >>>> +
> >>>> + retval = usb_submit_urb(dev->tx_urb, GFP_ATOMIC);
> >>>> + if (retval) {
> >>>> + err("%s: usb_submit_urb: %d", __func__, retval);
> >>>> + dev->stats.tx_errors++;
> >>>> + dev_kfree_skb_irq(skb);
> >>>> + } else {
> >>>> + net->trans_start = jiffies;
> >>> No longer needed.
> >> What is not longer needed? The assignment, the whole "else" branch? If
> >> the assignment is what is not needed, can I just remove that line, right?
> >
> > The assignment is not needed.
>
> Hi,
>
> I've been looking into this and it seems that the net_device.trans_start
> field is now deprecated in favor of net_device.rx_queue.trans_start
[...]
The networking core updates netdev_queue::trans_start by calling
txq_trans_update() where necessary. Drivers should not touch either
net_device::trans_start or netdev_queue::trans_start any more.
Ben.
--
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* Re: Receive steering and hash and cache misses
From: Stephen Hemminger @ 2010-04-02 18:54 UTC (permalink / raw)
To: Tom Herbert; +Cc: Eric Dumazet, netdev
In-Reply-To: <g2l65634d661004021059z94214a43v82409d15a0fb09b6@mail.gmail.com>
On Fri, 2 Apr 2010 10:59:43 -0700
Tom Herbert <therbert@google.com> wrote:
> On Fri, Apr 2, 2010 at 10:26 AM, Stephen Hemminger
> <shemminger@vyatta.com> wrote:
> >
> > Although Receive Packet Steering can use a hardware generated receive hash
> > the device driver still causes an unnecessary cache miss on the interrupt
> > processing CPU. The current Ethernet network device driver receive processing
> > has the device driver calling eth_type_trans() which causes a the
> > interrupt CPU to read the received frame header.
> >
>
> It should be possible to deduce the values set by eth_type_trans from
> the RX descriptor along with the RX hash. I'll post the patch getting
> rxhash from bnx2x which does this.
>
On sky2, I get only RSS, Checksum, and length from descriptor info.
^ permalink raw reply
* Re: Receive steering and hash and cache misses
From: Eric Dumazet @ 2010-04-02 19:36 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: Tom Herbert, netdev
In-Reply-To: <20100402115439.348575c9@nehalam>
Le vendredi 02 avril 2010 à 11:54 -0700, Stephen Hemminger a écrit :
> On Fri, 2 Apr 2010 10:59:43 -0700
> Tom Herbert <therbert@google.com> wrote:
>
> > On Fri, Apr 2, 2010 at 10:26 AM, Stephen Hemminger
> > <shemminger@vyatta.com> wrote:
> > >
> > > Although Receive Packet Steering can use a hardware generated receive hash
> > > the device driver still causes an unnecessary cache miss on the interrupt
> > > processing CPU. The current Ethernet network device driver receive processing
> > > has the device driver calling eth_type_trans() which causes a the
> > > interrupt CPU to read the received frame header.
> > >
> >
> > It should be possible to deduce the values set by eth_type_trans from
> > the RX descriptor along with the RX hash. I'll post the patch getting
> > rxhash from bnx2x which does this.
> >
>
> On sky2, I get only RSS, Checksum, and length from descriptor info.
Doesnt sky2 also provide vlan id (OP_RXVLAN/OP_RXCHKSVLAN) ?
A future version of hardware could provide more info perhaps...
Must eth_type_trans() be done *before* netif_receive_skb() ?
If a device provides a rxhash, maybe we can delay eth_type_trans() too.
If not, we need to access IP header anyway in the first cpu.
^ permalink raw reply
* Re: [PATCH] rfs: Receive Flow Steering
From: Eric Dumazet @ 2010-04-02 19:43 UTC (permalink / raw)
To: Tom Herbert; +Cc: davem, netdev
In-Reply-To: <1270225735.11099.20.camel@edumazet-laptop>
Le vendredi 02 avril 2010 à 18:28 +0200, Eric Dumazet a écrit :
> Some more thoughts ...
>
> Do we really want to call inet_rps_record_flow(sk) from inet_sendmsg() &
> inet_sendpage() ?
>
> This seems not necessary to me...
>
I think I get it, you want to catch unidirectional flows (apps that only
send data), and let ACK packets be processed by the sender cpu :=)
I did following patch to remove one conditional branch :
net/core/dev.c | 20 +++++---------------
1 file changed, 5 insertions(+), 15 deletions(-)
diff --git a/net/core/dev.c b/net/core/dev.c
index 0a9ced8..cfe46d8 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2225,8 +2225,6 @@ static int get_rps_cpu(struct net_device *dev, struct sk_buff *skb,
u16 tcpu;
u32 addr1, addr2, ports, ihl;
- *rflowp = NULL;
-
if (skb_rx_queue_recorded(skb)) {
u16 index = skb_get_rx_queue(skb);
if (unlikely(index >= dev->num_rx_queues)) {
@@ -2443,7 +2441,7 @@ int netif_rx(struct sk_buff *skb)
{
unsigned int qtail;
#ifdef CONFIG_RPS
- struct rps_dev_flow *rflow;
+ struct rps_dev_flow voidflow, *rflow = &voidflow;
int cpu, err;
#endif
@@ -2461,10 +2459,7 @@ int netif_rx(struct sk_buff *skb)
if (cpu < 0)
cpu = smp_processor_id();
- err = enqueue_to_backlog(skb, cpu, &qtail);
-
- if (rflow)
- rflow->last_qtail = qtail;
+ err = enqueue_to_backlog(skb, cpu, &rflow->last_qtail);
rcu_read_unlock();
@@ -2839,7 +2834,7 @@ out:
int netif_receive_skb(struct sk_buff *skb)
{
#ifdef CONFIG_RPS
- struct rps_dev_flow *rflow;
+ struct rps_dev_flow voidflow, *rflow = &voidflow;
int cpu, err;
rcu_read_lock();
@@ -2848,13 +2843,8 @@ int netif_receive_skb(struct sk_buff *skb)
if (cpu < 0)
err = __netif_receive_skb(skb);
- else {
- unsigned int qtail;
-
- err = enqueue_to_backlog(skb, cpu, &qtail);
- if (rflow)
- rflow->last_qtail = qtail;
- }
+ else
+ err = enqueue_to_backlog(skb, cpu, &rflow->last_qtail);
rcu_read_unlock();
^ permalink raw reply related
* Unaligned access in xfrm_user:copy_to_user_state
From: Jan Engelhardt @ 2010-04-02 20:18 UTC (permalink / raw)
To: netdev; +Cc: David S. Miller
Hi,
since we seem to be dealing with unaligned access quite recently, here's
my turn in reporting one:
22:09 ares:/etc # uname -a
Linux ares 2.6.34-rc1 #17 SMP Thu Mar 25 00:08:55 CET 2010 sparc64
sparc64 sparc64 GNU/Linux
(This is kaber/nf-next)
Apr 2 22:09:53 ares kernel: Kernel unaligned access at TPC[101a0c18]
copy_to_user_state+0x18/0x120 [xfrm_user]
0000000000000c00 <copy_to_user_state>:
c00: 9d e3 bf 50 save %sp, -176, %sp
c04: ce 5e 20 80 ldx [ %i0 + 0x80 ], %g7
c08: 86 06 20 80 add %i0, 0x80, %g3
c0c: 84 06 60 38 add %i1, 0x38, %g2
c10: 82 06 20 98 add %i0, 0x98, %g1
c14: 90 06 60 60 add %i1, 0x60, %o0
c18: ce 76 60 38 stx %g7, [ %i1 + 0x38 ]
That happens when strongswan is trying to handle a new incoming tunnel
request between two IPv6 endpoints (it does not seem to get triggered
for IPv4).
^ permalink raw reply
* Re: [BUG] latest net-next-2.6 doesnt fly
From: David Miller @ 2010-04-02 20:35 UTC (permalink / raw)
To: eric.dumazet; +Cc: netdev, fujita.tomonori
In-Reply-To: <1270202304.1989.14.camel@edumazet-laptop>
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 02 Apr 2010 11:58:24 +0200
> [PATCH net-next-2.6] net: illegal_highdma() fix
>
> Followup to commit 5acbbd428db47b12f137a8a2aa96b3c0a96b744e
> (net: change illegal_highdma to use dma_mask)
>
> If dev->dev.parent is NULL, we should not try to dereference it.
>
> Dont force inline illegal_highdma() as its pretty big now.
>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Applied, thanks for tracking this down.
^ permalink raw reply
* Re: Unaligned access in xfrm_user:copy_to_user_state
From: David Miller @ 2010-04-02 21:03 UTC (permalink / raw)
To: jengelh; +Cc: netdev
In-Reply-To: <alpine.LSU.2.01.1004022210270.30875@obet.zrqbmnf.qr>
From: Jan Engelhardt <jengelh@medozas.de>
Date: Fri, 2 Apr 2010 22:18:59 +0200 (CEST)
> since we seem to be dealing with unaligned access quite recently, here's
> my turn in reporting one:
>
> 22:09 ares:/etc # uname -a
> Linux ares 2.6.34-rc1 #17 SMP Thu Mar 25 00:08:55 CET 2010 sparc64
> sparc64 sparc64 GNU/Linux
> (This is kaber/nf-next)
>
> Apr 2 22:09:53 ares kernel: Kernel unaligned access at TPC[101a0c18]
> copy_to_user_state+0x18/0x120 [xfrm_user]
>
> 0000000000000c00 <copy_to_user_state>:
> c00: 9d e3 bf 50 save %sp, -176, %sp
> c04: ce 5e 20 80 ldx [ %i0 + 0x80 ], %g7
> c08: 86 06 20 80 add %i0, 0x80, %g3
> c0c: 84 06 60 38 add %i1, 0x38, %g2
> c10: 82 06 20 98 add %i0, 0x98, %g1
> c14: 90 06 60 60 add %i1, 0x60, %o0
> c18: ce 76 60 38 stx %g7, [ %i1 + 0x38 ]
>
> That happens when strongswan is trying to handle a new incoming tunnel
> request between two IPv6 endpoints (it does not seem to get triggered
> for IPv4).
Yes, we need to "void *" untype the arguments to memcpy so that
GCC doesn't inline the thing.
Patches welcome.
^ permalink raw reply
* Re: [PATCH 1/2] phylib: Support phy module autoloading
From: David Miller @ 2010-04-02 21:31 UTC (permalink / raw)
To: dwmw2; +Cc: netdev, ben
In-Reply-To: <1270206327.3101.2436.camel@macbook.infradead.org>
From: David Woodhouse <dwmw2@infradead.org>
Date: Fri, 02 Apr 2010 12:05:27 +0100
> We don't use the normal hotplug mechanism because it doesn't work. It will
> load the module some time after the device appears, but that's not good
> enough for us -- we need the driver loaded _immediately_ because otherwise
> the NIC driver may just abort and then the phy 'device' goes away.
>
> [bwh: s/phy/mdio/ in module alias, kerneldoc for struct mdio_device_id]
>
> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Applied.
^ permalink raw reply
* Re: [PATCH 2/2] phylib: Add module table to all existing phy drivers
From: David Miller @ 2010-04-02 21:31 UTC (permalink / raw)
To: dwmw2; +Cc: netdev, ben
In-Reply-To: <1270206356.3101.2437.camel@macbook.infradead.org>
From: David Woodhouse <dwmw2@infradead.org>
Date: Fri, 02 Apr 2010 12:05:56 +0100
> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Applied, thanks for doing the notes and providing backporting
notes :-)
^ permalink raw reply
* [PATCH] SCTP: Change to use ipv6_addr_copy()
From: Brian Haley @ 2010-04-02 21:38 UTC (permalink / raw)
To: vladislav.yasevich, davem; +Cc: netdev, linux-sctp
Change SCTP IPv6 code to use ipv6_addr_copy()
Signed-off-by: Brian Haley <brian.haley@hp.com>
---
net/sctp/ipv6.c | 8 ++++----
1 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/net/sctp/ipv6.c b/net/sctp/ipv6.c
index 216d88f..db1c767 100644
--- a/net/sctp/ipv6.c
+++ b/net/sctp/ipv6.c
@@ -364,7 +364,7 @@ static void sctp_v6_copy_addrlist(struct list_head *addrlist,
if (addr) {
addr->a.v6.sin6_family = AF_INET6;
addr->a.v6.sin6_port = 0;
- addr->a.v6.sin6_addr = ifp->addr;
+ ipv6_addr_copy(&addr->a.v6.sin6_addr, &ifp->addr);
addr->a.v6.sin6_scope_id = dev->ifindex;
addr->valid = 1;
INIT_LIST_HEAD(&addr->list);
@@ -405,7 +405,7 @@ static void sctp_v6_from_sk(union sctp_addr *addr, struct sock *sk)
{
addr->v6.sin6_family = AF_INET6;
addr->v6.sin6_port = 0;
- addr->v6.sin6_addr = inet6_sk(sk)->rcv_saddr;
+ ipv6_addr_copy(&addr->v6.sin6_addr, &inet6_sk(sk)->rcv_saddr);
}
/* Initialize sk->sk_rcv_saddr from sctp_addr. */
@@ -418,7 +418,7 @@ static void sctp_v6_to_sk_saddr(union sctp_addr *addr, struct sock *sk)
inet6_sk(sk)->rcv_saddr.s6_addr32[3] =
addr->v4.sin_addr.s_addr;
} else {
- inet6_sk(sk)->rcv_saddr = addr->v6.sin6_addr;
+ ipv6_addr_copy(&inet6_sk(sk)->rcv_saddr, &addr->v6.sin6_addr);
}
}
@@ -431,7 +431,7 @@ static void sctp_v6_to_sk_daddr(union sctp_addr *addr, struct sock *sk)
inet6_sk(sk)->daddr.s6_addr32[2] = htonl(0x0000ffff);
inet6_sk(sk)->daddr.s6_addr32[3] = addr->v4.sin_addr.s_addr;
} else {
- inet6_sk(sk)->daddr = addr->v6.sin6_addr;
+ ipv6_addr_copy(&inet6_sk(sk)->daddr, &addr->v6.sin6_addr);
}
}
--
1.5.4.3
^ permalink raw reply related
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox