From: Jason Wang <jasowang@redhat.com>
To: Ben Hutchings <bhutchings@solarflare.com>
Cc: krkumar2@in.ibm.com, kvm@vger.kernel.org, mst@redhat.com,
netdev@vger.kernel.org,
virtualization@lists.linux-foundation.org,
levinsasha928@gmail.com
Subject: Re: [net-next RFC PATCH 5/5] virtio-net: flow director support
Date: Tue, 06 Dec 2011 15:25:16 +0800
Message-ID: <4EDDC35C.2070100@redhat.com>
In-Reply-To: <1323117745.2887.31.camel@bwh-desktop>

On 12/06/2011 04:42 AM, Ben Hutchings wrote:
> On Mon, 2011-12-05 at 16:59 +0800, Jason Wang wrote:
>> In order to let the packets of a flow be passed to the desired
>> guest cpu, we can co-operate with the device by programming the flow
>> director, which is just a hash-to-queue table.
>>
>> This kind of co-operation is done through accelerated RFS support:
>> a device-specific flow steering method, virtnet_fd(), is used to modify
>> the flow director based on the rfs mapping. The desired queue is
>> calculated through reverse mapping of the irq affinity table. In order
>> to parallelize the ingress path, the irq affinity of each rx queue is
>> also provided by the driver.
>>
>> In addition to accelerated RFS, we can also use the guest scheduler to
>> balance the load of TX and reduce lock contention on the egress path,
>> so smp_processor_id() is used for tx queue selection.
> [...]
>> +#ifdef CONFIG_RFS_ACCEL
>> +
>> +int virtnet_fd(struct net_device *net_dev, const struct sk_buff *skb,
>> +	       u16 rxq_index, u32 flow_id)
>> +{
>> +	struct virtnet_info *vi = netdev_priv(net_dev);
>> +	u16 *table = NULL;
>> +
>> +	if (skb->protocol != htons(ETH_P_IP) || !skb->rxhash)
>> +		return -EPROTONOSUPPORT;
> Why only IPv4?
Oops, IPv6 should work as well.
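A minimal sketch of what the widened check could look like, written as userspace C so it can be compiled on its own. flow_is_steerable() is a hypothetical helper and not part of the patch; the EtherType values are the standard ones from <linux/if_ether.h>, redefined here only to keep the example self-contained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <arpa/inet.h>	/* htons() */

/* Standard EtherType values; guarded in case a system header defines them. */
#ifndef ETH_P_IP
#define ETH_P_IP	0x0800
#endif
#ifndef ETH_P_IPV6
#define ETH_P_IPV6	0x86DD
#endif

/*
 * Hypothetical helper (an assumption, not from the patch): returns true
 * when a packet with the given EtherType (network byte order) and
 * receive hash may be steered. Both IPv4 and IPv6 are accepted; a zero
 * hash means no hash was recorded, so the flow cannot be steered.
 */
static bool flow_is_steerable(uint16_t protocol_be, uint32_t rxhash)
{
	if (protocol_be != htons(ETH_P_IP) &&
	    protocol_be != htons(ETH_P_IPV6))
		return false;

	return rxhash != 0;
}
```

In the driver, the same condition would presumably sit in front of the -EPROTONOSUPPORT return in virtnet_fd().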
>> +	table = kmap_atomic(vi->fd_page);
>> +	table[skb->rxhash & TAP_HASH_MASK] = rxq_index;
>> +	kunmap_atomic(table);
>> +
>> +	return 0;
>> +}
>> +#endif
> This is not a proper implementation of ndo_rx_flow_steer. If you steer
> a flow by changing the RSS table this can easily cause packet reordering
> in other flows. The filtering should be more precise, ideally matching
> exactly a single flow by e.g. VID and IP 5-tuple.
>
> I think you need to add a second hash table which records exactly which
> flow is supposed to be steered. Also, you must call
> rps_may_expire_flow() to check whether an entry in this table may be
> replaced; otherwise you can cause packet reordering in the flow that was
> previously being steered.
>
> Finally, this function must return the table index it assigned, so that
> rps_may_expire_flow() works.
Thanks for the explanation. How about documenting this briefly in scaling.txt?
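To make sure I understand the suggestion, here is a rough userspace model of such a second table, not driver code. Everything in it is an assumption made for illustration: the table size, struct flow_key, struct flow_entry, flow_steer() and the may_expire callback, which stands in for the kernel's rps_may_expire_flow(dev, rxq_index, flow_id, filter_id). The point is only the shape of the logic: match a single flow exactly, refuse to replace an entry that may not expire yet, and return the table index that was assigned.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FLOW_TABLE_SIZE 16	/* hypothetical size, must be a power of two */

/* Exact-match key: VID plus the IPv4 5-tuple (illustrative layout). */
struct flow_key {
	uint32_t saddr, daddr;
	uint16_t sport, dport;
	uint16_t vid;
	uint8_t ip_proto;
};

struct flow_entry {
	bool in_use;
	struct flow_key key;
	uint16_t rxq_index;
	uint32_t flow_id;
};

static struct flow_entry flow_table[FLOW_TABLE_SIZE];

/* Stand-in for rps_may_expire_flow(); the real one also takes the netdev. */
typedef bool (*may_expire_fn)(uint16_t rxq_index, uint32_t flow_id,
			      uint16_t filter_id);

/* Two trivial policies so the sketch can be exercised without a kernel. */
static bool never_expire(uint16_t rxq, uint32_t flow_id, uint16_t filter_id)
{
	(void)rxq; (void)flow_id; (void)filter_id;
	return false;
}

static bool always_expire(uint16_t rxq, uint32_t flow_id, uint16_t filter_id)
{
	(void)rxq; (void)flow_id; (void)filter_id;
	return true;
}

/* Compare field by field; memcmp() would also compare padding bytes. */
static bool flow_key_equal(const struct flow_key *a, const struct flow_key *b)
{
	return a->saddr == b->saddr && a->daddr == b->daddr &&
	       a->sport == b->sport && a->dport == b->dport &&
	       a->vid == b->vid && a->ip_proto == b->ip_proto;
}

/*
 * Steer one flow to rxq_index. Returns the table index (the filter id)
 * on success, or -EBUSY when the slot holds a different flow that may
 * not be expired yet.
 */
static int flow_steer(const struct flow_key *key, uint32_t hash,
		      uint16_t rxq_index, uint32_t flow_id,
		      may_expire_fn may_expire)
{
	uint16_t idx;
	struct flow_entry *e;

	assert(key != NULL && may_expire != NULL);

	idx = hash & (FLOW_TABLE_SIZE - 1);
	e = &flow_table[idx];

	if (e->in_use && !flow_key_equal(&e->key, key) &&
	    !may_expire(e->rxq_index, e->flow_id, idx))
		return -EBUSY;

	e->in_use = true;
	e->key = *key;
	e->rxq_index = rxq_index;
	e->flow_id = flow_id;
	return idx;
}
```

If I read the explanation correctly, the returned index is what rps_may_expire_flow() later receives as filter_id, which is why returning 0 unconditionally from virtnet_fd() is wrong.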
>> +static u16 virtnet_select_queue(struct net_device *dev, struct sk_buff *skb)
>> +{
>> +	int txq = skb_rx_queue_recorded(skb) ? skb_get_rx_queue(skb) :
>> +		  smp_processor_id();
>> +
>> +	/* As we make use of accelerated RFS, which lets the scheduler
>> +	 * balance the load, does it make sense to also choose the tx queue
>> +	 * based on the processor id?
>> +	 */
>> +	while (unlikely(txq >= dev->real_num_tx_queues))
>> +		txq -= dev->real_num_tx_queues;
>> +	return txq;
>> +}
> [...]
>
> Don't do this, let XPS handle it.
>
> Ben.
>