From: Jason Wang <jasowang@redhat.com>
To: Ben Hutchings <bhutchings@solarflare.com>
Cc: krkumar2@in.ibm.com, kvm@vger.kernel.org, mst@redhat.com,
netdev@vger.kernel.org,
virtualization@lists.linux-foundation.org,
levinsasha928@gmail.com
Subject: Re: [net-next RFC PATCH 5/5] virtio-net: flow director support
Date: Tue, 06 Dec 2011 15:25:16 +0800
Message-ID: <4EDDC35C.2070100@redhat.com>
In-Reply-To: <1323117745.2887.31.camel@bwh-desktop>
On 12/06/2011 04:42 AM, Ben Hutchings wrote:
> On Mon, 2011-12-05 at 16:59 +0800, Jason Wang wrote:
>> To let the packets of a flow be delivered to the desired guest cpu,
>> we can co-operate with the device by programming its flow director,
>> which is just a hash-to-queue table.
>>
>> This co-operation is done through the accelerated RFS support: a
>> device-specific flow steering method, virtnet_fd(), modifies the
>> flow director based on the rfs mapping. The desired queue is
>> calculated through reverse mapping of the irq affinity table. To
>> parallelize the ingress path, the irq affinity of each rx queue is
>> also provided by the driver.
>>
>> In addition to accelerated RFS, we can also use the guest scheduler to
>> balance the TX load and reduce lock contention on the egress path,
>> so smp_processor_id() is used for tx queue selection.
> [...]
>> +#ifdef CONFIG_RFS_ACCEL
>> +
>> +int virtnet_fd(struct net_device *net_dev, const struct sk_buff *skb,
>> + u16 rxq_index, u32 flow_id)
>> +{
>> + struct virtnet_info *vi = netdev_priv(net_dev);
>> + u16 *table = NULL;
>> +
>> + if (skb->protocol != htons(ETH_P_IP) || !skb->rxhash)
>> + return -EPROTONOSUPPORT;
> Why only IPv4?
Oops, IPv6 should work as well.
>> + table = kmap_atomic(vi->fd_page);
>> + table[skb->rxhash & TAP_HASH_MASK] = rxq_index;
>> + kunmap_atomic(table);
>> +
>> + return 0;
>> +}
>> +#endif
> This is not a proper implementation of ndo_rx_flow_steer. If you steer
> a flow by changing the RSS table this can easily cause packet reordering
> in other flows. The filtering should be more precise, ideally matching
> exactly a single flow by e.g. VID and IP 5-tuple.
>
> I think you need to add a second hash table which records exactly which
> flow is supposed to be steered. Also, you must call
> rps_may_expire_flow() to check whether an entry in this table may be
> replaced; otherwise you can cause packet reordering in the flow that was
> previously being steered.
>
> Finally, this function must return the table index it assigned, so that
> rps_may_expire_flow() works.
Thanks for the explanation. How about documenting this briefly in scaling.txt?
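For reference, the three points above (an exact-match table, an expiry check before replacing an entry, and returning the table index) can be sketched in plain userspace C. This is only an illustration under assumptions, not driver code: struct flow_key, struct filter_table, steer_flow(), lookup_queue() and the table size are hypothetical names, and the may_expire callback is a stand-in for rps_may_expire_flow(), whose real kernel version also takes the net_device.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define FILTER_TABLE_SIZE 8 /* hypothetical size; must be a power of two */

/* Exact-match key: VLAN ID plus IPv4 5-tuple, as suggested in the review. */
struct flow_key {
	uint16_t vid;
	uint8_t ip_proto;
	uint32_t saddr, daddr;
	uint16_t sport, dport;
};

struct flow_filter {
	bool in_use;
	struct flow_key key;
	uint16_t rxq_index;
	uint32_t flow_id;
};

struct filter_table {
	struct flow_filter slot[FILTER_TABLE_SIZE];
	/* Stand-in for rps_may_expire_flow(); assumption for this sketch. */
	bool (*may_expire)(uint16_t rxq_index, uint32_t flow_id,
			   uint16_t filter_id);
};

/* Compare field by field so struct padding never affects the result. */
bool key_equal(const struct flow_key *a, const struct flow_key *b)
{
	return a->vid == b->vid && a->ip_proto == b->ip_proto &&
	       a->saddr == b->saddr && a->daddr == b->daddr &&
	       a->sport == b->sport && a->dport == b->dport;
}

/*
 * Install a filter for one flow. Returns the table index on success so the
 * caller can later pass it to the expiry check, or -EBUSY if the slot holds
 * a different flow that may not be expired yet.
 */
int steer_flow(struct filter_table *t, const struct flow_key *key,
	       uint32_t hash, uint16_t rxq_index, uint32_t flow_id)
{
	uint16_t idx = hash & (FILTER_TABLE_SIZE - 1);
	struct flow_filter *f = &t->slot[idx];

	assert(t->may_expire != NULL);

	if (f->in_use && !key_equal(&f->key, key) &&
	    !t->may_expire(f->rxq_index, f->flow_id, idx))
		return -EBUSY; /* replacing now could reorder the old flow */

	f->in_use = true;
	f->key = *key;
	f->rxq_index = rxq_index;
	f->flow_id = flow_id;
	return idx;
}

/* Returns the steered queue, or -1 when only default spreading applies. */
int lookup_queue(const struct filter_table *t, const struct flow_key *key,
		 uint32_t hash)
{
	const struct flow_filter *f = &t->slot[hash & (FILTER_TABLE_SIZE - 1)];

	if (f->in_use && key_equal(&f->key, key))
		return f->rxq_index;
	return -1;
}

/* Trivial expiry policies, used only to exercise the sketch. */
bool never_expire(uint16_t rxq, uint32_t flow_id, uint16_t filter_id)
{
	(void)rxq; (void)flow_id; (void)filter_id;
	return false;
}

bool always_expire(uint16_t rxq, uint32_t flow_id, uint16_t filter_id)
{
	(void)rxq; (void)flow_id; (void)filter_id;
	return true;
}
```

Unlike the plain hash-to-queue table in the patch, a second flow whose hash lands in the same slot is not silently re-steered: lookup_queue() only matches the exact key, and steer_flow() refuses to overwrite the slot until the expiry check allows it.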
>> +static u16 virtnet_select_queue(struct net_device *dev, struct sk_buff *skb)
>> +{
>> + int txq = skb_rx_queue_recorded(skb) ? skb_get_rx_queue(skb) :
>> + smp_processor_id();
>> +
>> + /* As we make use of accelerated RFS, which lets the scheduler
>> + * balance the load, does it make sense to also choose the tx queue
>> + * based on the processor id?
>> + */
>> + while (unlikely(txq >= dev->real_num_tx_queues))
>> + txq -= dev->real_num_tx_queues;
>> + return txq;
>> +}
> [...]
>
> Don't do this, let XPS handle it.
>
> Ben.
>
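As a rough illustration of why a driver-specific select_queue is unnecessary here: with XPS, each tx queue carries a CPU bitmask (set through the queue's xps_cpus attribute in sysfs), and the stack picks a queue from those mapped to the transmitting CPU. The sketch below is a simplified userspace model of that idea, not the kernel implementation; xps_pick_queue(), NUM_TXQ and the modulo tie-break are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_TXQ 4 /* hypothetical number of tx queues */

/*
 * xps_cpus[q] is a bitmask of the CPUs allowed to transmit on queue q.
 * Returns the chosen queue, or -1 if the CPU has no mapping and the caller
 * should fall back to its default (hash-based) selection.
 */
int xps_pick_queue(const uint32_t xps_cpus[NUM_TXQ], unsigned int cpu,
		   uint32_t flow_hash)
{
	int candidates[NUM_TXQ];
	int n = 0;

	assert(cpu < 32); /* the masks in this sketch are 32 bits wide */

	for (int q = 0; q < NUM_TXQ; q++)
		if (xps_cpus[q] & (UINT32_C(1) << cpu))
			candidates[n++] = q;

	if (n == 0)
		return -1;

	/* Several queues for one CPU: keep each flow on a single queue. */
	return candidates[flow_hash % (uint32_t)n];
}
```

With one queue per CPU this gives the same spreading as indexing by smp_processor_id(), but the mapping stays under administrator control instead of being hard-coded in the driver.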