* [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first
@ 2011-11-24 8:17 Krishna Kumar
2011-11-24 9:36 ` jasowang
2011-11-24 9:59 ` Michael S. Tsirkin
0 siblings, 2 replies; 27+ messages in thread
From: Krishna Kumar @ 2011-11-24 8:17 UTC (permalink / raw)
To: arnd; +Cc: Krishna Kumar, mst, netdev, virtualization, levinsasha928, davem
It was reported that the macvtap device selects a
different vhost (when used with multiqueue feature)
for incoming packets of a single connection. Use
packet hash first. Patch tested on MQ virtio_net.
Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
---
drivers/net/macvtap.c | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)
diff -ruNp org/drivers/net/macvtap.c new/drivers/net/macvtap.c
--- org/drivers/net/macvtap.c 2011-10-22 08:38:01.000000000 +0530
+++ new/drivers/net/macvtap.c 2011-11-16 18:34:51.000000000 +0530
@@ -175,6 +175,14 @@ static struct macvtap_queue *macvtap_get
if (!numvtaps)
goto out;
+ /* Check if we can use flow to select a queue */
+ rxq = skb_get_rxhash(skb);
+ if (rxq) {
+ tap = rcu_dereference(vlan->taps[rxq % numvtaps]);
+ if (tap)
+ goto out;
+ }
+
if (likely(skb_rx_queue_recorded(skb))) {
rxq = skb_get_rx_queue(skb);
@@ -186,14 +194,6 @@ static struct macvtap_queue *macvtap_get
goto out;
}
- /* Check if we can use flow to select a queue */
- rxq = skb_get_rxhash(skb);
- if (rxq) {
- tap = rcu_dereference(vlan->taps[rxq % numvtaps]);
- if (tap)
- goto out;
- }
-
/* Everything failed - find first available queue */
for (rxq = 0; rxq < MAX_MACVTAP_QUEUES; rxq++) {
tap = rcu_dereference(vlan->taps[rxq]);
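For readers skimming the hunks, here is a condensed sketch of the lookup
order macvtap_get_queue() implements once this patch is applied. It is
reconstructed from the hunks above and their context lines, so the body of
the recorded-queue branch is a paraphrase rather than verbatim kernel code:

static struct macvtap_queue *macvtap_get_queue(struct net_device *dev,
					       struct sk_buff *skb)
{
	struct macvlan_dev *vlan = netdev_priv(dev);
	struct macvtap_queue *tap = NULL;
	int numvtaps = vlan->numvtaps;
	__u32 rxq;

	if (!numvtaps)
		goto out;

	/* 1. Flow hash: stable for a connection no matter which
	 *    rx ring the packet arrived on. */
	rxq = skb_get_rxhash(skb);
	if (rxq) {
		tap = rcu_dereference(vlan->taps[rxq % numvtaps]);
		if (tap)
			goto out;
	}

	/* 2. Recorded rx queue: can change as the nic moves the
	 *    flow between rings (see the discussion below). */
	if (likely(skb_rx_queue_recorded(skb))) {
		rxq = skb_get_rx_queue(skb);
		while (unlikely(rxq >= numvtaps))
			rxq -= numvtaps;
		tap = rcu_dereference(vlan->taps[rxq]);
		if (tap)
			goto out;
	}

	/* 3. Everything failed - find first available queue. */
	for (rxq = 0; rxq < MAX_MACVTAP_QUEUES; rxq++) {
		tap = rcu_dereference(vlan->taps[rxq]);
		if (tap)
			break;
	}
out:
	return tap;
}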
* Re: [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first
2011-11-24 8:17 [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first Krishna Kumar
@ 2011-11-24 9:36 ` jasowang
2011-11-24 9:59 ` Michael S. Tsirkin
1 sibling, 0 replies; 27+ messages in thread
From: jasowang @ 2011-11-24 9:36 UTC (permalink / raw)
To: Krishna Kumar; +Cc: arnd, mst, netdev, virtualization, levinsasha928, davem
On 11/24/2011 04:17 PM, Krishna Kumar wrote:
> It was reported that the macvtap device selects a
> different vhost (when used with multiqueue feature)
> for incoming packets of a single connection. Use
> packet hash first. Patch tested on MQ virtio_net.
>
> Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
> ---
> drivers/net/macvtap.c | 16 ++++++++--------
> 1 file changed, 8 insertions(+), 8 deletions(-)
>
> diff -ruNp org/drivers/net/macvtap.c new/drivers/net/macvtap.c
> --- org/drivers/net/macvtap.c 2011-10-22 08:38:01.000000000 +0530
> +++ new/drivers/net/macvtap.c 2011-11-16 18:34:51.000000000 +0530
> @@ -175,6 +175,14 @@ static struct macvtap_queue *macvtap_get
> if (!numvtaps)
> goto out;
>
> + /* Check if we can use flow to select a queue */
> + rxq = skb_get_rxhash(skb);
> + if (rxq) {
> + tap = rcu_dereference(vlan->taps[rxq % numvtaps]);
> + if (tap)
> + goto out;
> + }
> +
> if (likely(skb_rx_queue_recorded(skb))) {
> rxq = skb_get_rx_queue(skb);
>
Looks reasonable, and we can improve this further by implementing
smarter queue selection with the co-operation of the guest. One first
step is to let the packets of a flow be handled by the same guest
cpu/vhost thread. This can be done by using accelerated RFS in the
virtio-net driver to 'tell' macvtap/tap which queue a packet should
go to. I've started working on a prototype of this; it should solve
the issue of packet steering in the guest.
One more thought: if a nic (such as ixgbe) with flow director or a
similar technology is used, it can make sure the packets belonging to
a flow are handled by the same host cpu. If we can make use of this
feature, more cache locality would be gained.
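As a rough illustration of the steering idea (the structure and field
names below are hypothetical, not the eventual implementation; types are
the usual <linux/types.h> ones):

/* Hypothetical per-flow steering entry: the guest publishes the
 * queue it prefers for a given flow hash, and the host-side
 * macvtap/tap consults it before falling back to
 * rxhash % numvtaps. */
struct tap_flow_entry {
	u32 rxhash;	/* flow identifier, as in skb_get_rxhash() */
	u16 queue;	/* tap queue index requested by the guest */
};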
> @@ -186,14 +194,6 @@ static struct macvtap_queue *macvtap_get
> goto out;
> }
>
> - /* Check if we can use flow to select a queue */
> - rxq = skb_get_rxhash(skb);
> - if (rxq) {
> - tap = rcu_dereference(vlan->taps[rxq % numvtaps]);
> - if (tap)
> - goto out;
> - }
> -
> /* Everything failed - find first available queue */
> for (rxq = 0; rxq < MAX_MACVTAP_QUEUES; rxq++) {
> tap = rcu_dereference(vlan->taps[rxq]);
>
* Re: [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first
2011-11-24 8:17 [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first Krishna Kumar
2011-11-24 9:36 ` jasowang
@ 2011-11-24 9:59 ` Michael S. Tsirkin
2011-11-24 10:13 ` jasowang
2011-11-24 11:14 ` Krishna Kumar2
1 sibling, 2 replies; 27+ messages in thread
From: Michael S. Tsirkin @ 2011-11-24 9:59 UTC (permalink / raw)
To: Krishna Kumar; +Cc: arnd, netdev, virtualization, levinsasha928, davem
On Thu, Nov 24, 2011 at 01:47:14PM +0530, Krishna Kumar wrote:
> It was reported that the macvtap device selects a
> different vhost (when used with multiqueue feature)
> for incoming packets of a single connection. Use
> packet hash first. Patch tested on MQ virtio_net.
So this is sure to address the problem, but why exactly does this happen?
Does your device spread a single flow across multiple RX queues? Would
not that cause trouble in the TCP layer?
It would seem that using the recorded queue should be faster with
fewer cache misses. Before we give up on that, I'd
like to understand why it's wrong. Do you know?
>
> Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
> ---
> drivers/net/macvtap.c | 16 ++++++++--------
> 1 file changed, 8 insertions(+), 8 deletions(-)
>
> diff -ruNp org/drivers/net/macvtap.c new/drivers/net/macvtap.c
> --- org/drivers/net/macvtap.c 2011-10-22 08:38:01.000000000 +0530
> +++ new/drivers/net/macvtap.c 2011-11-16 18:34:51.000000000 +0530
> @@ -175,6 +175,14 @@ static struct macvtap_queue *macvtap_get
> if (!numvtaps)
> goto out;
>
> + /* Check if we can use flow to select a queue */
> + rxq = skb_get_rxhash(skb);
> + if (rxq) {
> + tap = rcu_dereference(vlan->taps[rxq % numvtaps]);
> + if (tap)
> + goto out;
> + }
> +
> if (likely(skb_rx_queue_recorded(skb))) {
> rxq = skb_get_rx_queue(skb);
>
> @@ -186,14 +194,6 @@ static struct macvtap_queue *macvtap_get
> goto out;
> }
>
> - /* Check if we can use flow to select a queue */
> - rxq = skb_get_rxhash(skb);
> - if (rxq) {
> - tap = rcu_dereference(vlan->taps[rxq % numvtaps]);
> - if (tap)
> - goto out;
> - }
> -
> /* Everything failed - find first available queue */
> for (rxq = 0; rxq < MAX_MACVTAP_QUEUES; rxq++) {
> tap = rcu_dereference(vlan->taps[rxq]);
* Re: [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first
2011-11-24 9:59 ` Michael S. Tsirkin
@ 2011-11-24 10:13 ` jasowang
2011-11-24 10:34 ` Michael S. Tsirkin
2011-11-24 11:14 ` Krishna Kumar2
1 sibling, 1 reply; 27+ messages in thread
From: jasowang @ 2011-11-24 10:13 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Krishna Kumar, arnd, netdev, virtualization, levinsasha928, davem
On 11/24/2011 05:59 PM, Michael S. Tsirkin wrote:
> On Thu, Nov 24, 2011 at 01:47:14PM +0530, Krishna Kumar wrote:
>> It was reported that the macvtap device selects a
>> different vhost (when used with multiqueue feature)
>> for incoming packets of a single connection. Use
>> packet hash first. Patch tested on MQ virtio_net.
> So this is sure to address the problem, why exactly does this happen?
Ixgbe has a flow director and binds queues to host cpus, so it can make
sure the packets of a flow are handled by the same queue/cpu. So when a
vhost thread moves from one host cpu to another, ixgbe would therefore
send the packets to the new cpu/queue.
> Does your device spread a single flow across multiple RX queues? Would
> not that cause trouble in the TCP layer?
> It would seem that using the recorded queue should be faster with
> less cache misses. Before we give up on that, I'd
> like to understand why it's wrong. Do you know?
>
>> Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
>> ---
>> drivers/net/macvtap.c | 16 ++++++++--------
>> 1 file changed, 8 insertions(+), 8 deletions(-)
>>
>> diff -ruNp org/drivers/net/macvtap.c new/drivers/net/macvtap.c
>> --- org/drivers/net/macvtap.c 2011-10-22 08:38:01.000000000 +0530
>> +++ new/drivers/net/macvtap.c 2011-11-16 18:34:51.000000000 +0530
>> @@ -175,6 +175,14 @@ static struct macvtap_queue *macvtap_get
>> if (!numvtaps)
>> goto out;
>>
>> + /* Check if we can use flow to select a queue */
>> + rxq = skb_get_rxhash(skb);
>> + if (rxq) {
>> + tap = rcu_dereference(vlan->taps[rxq % numvtaps]);
>> + if (tap)
>> + goto out;
>> + }
>> +
>> if (likely(skb_rx_queue_recorded(skb))) {
>> rxq = skb_get_rx_queue(skb);
>>
>> @@ -186,14 +194,6 @@ static struct macvtap_queue *macvtap_get
>> goto out;
>> }
>>
>> - /* Check if we can use flow to select a queue */
>> - rxq = skb_get_rxhash(skb);
>> - if (rxq) {
>> - tap = rcu_dereference(vlan->taps[rxq % numvtaps]);
>> - if (tap)
>> - goto out;
>> - }
>> -
>> /* Everything failed - find first available queue */
>> for (rxq = 0; rxq < MAX_MACVTAP_QUEUES; rxq++) {
>> tap = rcu_dereference(vlan->taps[rxq]);
* Re: [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first
2011-11-24 10:13 ` jasowang
@ 2011-11-24 10:34 ` Michael S. Tsirkin
2011-11-24 12:56 ` jasowang
0 siblings, 1 reply; 27+ messages in thread
From: Michael S. Tsirkin @ 2011-11-24 10:34 UTC (permalink / raw)
To: jasowang
Cc: Krishna Kumar, arnd, netdev, virtualization, levinsasha928, davem
On Thu, Nov 24, 2011 at 06:13:41PM +0800, jasowang wrote:
> On 11/24/2011 05:59 PM, Michael S. Tsirkin wrote:
> >On Thu, Nov 24, 2011 at 01:47:14PM +0530, Krishna Kumar wrote:
> >>It was reported that the macvtap device selects a
> >>different vhost (when used with multiqueue feature)
> >>for incoming packets of a single connection. Use
> >>packet hash first. Patch tested on MQ virtio_net.
> >So this is sure to address the problem, why exactly does this happen?
>
> Ixgbe has flow director and bind queue to host cpu, so it can make
> sure the packet of a flow to be handled by the same queue/cpu. So
> when vhost thread moves from one host cpu to another, ixgbe would
> therefore send the packet to the new cpu/queue.
Confused. How does ixgbe know about vhost thread moving?
> >Does your device spread a single flow across multiple RX queues? Would
> >not that cause trouble in the TCP layer?
> >It would seem that using the recorded queue should be faster with
> >less cache misses. Before we give up on that, I'd
> >like to understand why it's wrong. Do you know?
> >
> >>Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
> >>---
> >> drivers/net/macvtap.c | 16 ++++++++--------
> >> 1 file changed, 8 insertions(+), 8 deletions(-)
> >>
> >>diff -ruNp org/drivers/net/macvtap.c new/drivers/net/macvtap.c
> >>--- org/drivers/net/macvtap.c 2011-10-22 08:38:01.000000000 +0530
> >>+++ new/drivers/net/macvtap.c 2011-11-16 18:34:51.000000000 +0530
> >>@@ -175,6 +175,14 @@ static struct macvtap_queue *macvtap_get
> >> if (!numvtaps)
> >> goto out;
> >>
> >>+ /* Check if we can use flow to select a queue */
> >>+ rxq = skb_get_rxhash(skb);
> >>+ if (rxq) {
> >>+ tap = rcu_dereference(vlan->taps[rxq % numvtaps]);
> >>+ if (tap)
> >>+ goto out;
> >>+ }
> >>+
> >> if (likely(skb_rx_queue_recorded(skb))) {
> >> rxq = skb_get_rx_queue(skb);
> >>
> >>@@ -186,14 +194,6 @@ static struct macvtap_queue *macvtap_get
> >> goto out;
> >> }
> >>
> >>- /* Check if we can use flow to select a queue */
> >>- rxq = skb_get_rxhash(skb);
> >>- if (rxq) {
> >>- tap = rcu_dereference(vlan->taps[rxq % numvtaps]);
> >>- if (tap)
> >>- goto out;
> >>- }
> >>-
> >> /* Everything failed - find first available queue */
> >> for (rxq = 0; rxq < MAX_MACVTAP_QUEUES; rxq++) {
> >> tap = rcu_dereference(vlan->taps[rxq]);
* Re: [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first
2011-11-24 9:59 ` Michael S. Tsirkin
2011-11-24 10:13 ` jasowang
@ 2011-11-24 11:14 ` Krishna Kumar2
2011-11-24 13:00 ` jasowang
1 sibling, 1 reply; 27+ messages in thread
From: Krishna Kumar2 @ 2011-11-24 11:14 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: arnd, davem, jasowang, levinsasha928, netdev, virtualization,
jeffrey.t.kirsher
"Michael S. Tsirkin" <mst@redhat.com> wrote on 11/24/2011 03:29:03 PM:
> Subject Re: [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first
>
> On Thu, Nov 24, 2011 at 01:47:14PM +0530, Krishna Kumar wrote:
> > It was reported that the macvtap device selects a
> > different vhost (when used with multiqueue feature)
> > for incoming packets of a single connection. Use
> > packet hash first. Patch tested on MQ virtio_net.
>
> So this is sure to address the problem, why exactly does this happen?
> Does your device spread a single flow across multiple RX queues? Would
> not that cause trouble in the TCP layer?
> It would seem that using the recorded queue should be faster with
> less cache misses. Before we give up on that, I'd
> like to understand why it's wrong. Do you know?
I am using ixgbe. From what I briefly saw, ixgbe_alloc_rx_buffers
calls skb_record_rx_queue when a skb is allocated, and the receive
path sets the rxhash. The recorded value is different for most skbs
when I ran a single-stream TCP test (do skbs move between the
rx_rings?). With this patch, macvtap selects the same device for each
incoming packet. I can add some debugs in ixgbe to see what is
happening as well. I am not sure if Sasha was using a different
device.
Cc'ing Jeffrey in case he can add something.
thanks,
- KK
* Re: [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first
2011-11-24 10:34 ` Michael S. Tsirkin
@ 2011-11-24 12:56 ` jasowang
2011-11-24 16:14 ` Michael S. Tsirkin
0 siblings, 1 reply; 27+ messages in thread
From: jasowang @ 2011-11-24 12:56 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Krishna Kumar, arnd, netdev, virtualization, levinsasha928, davem
On 11/24/2011 06:34 PM, Michael S. Tsirkin wrote:
> On Thu, Nov 24, 2011 at 06:13:41PM +0800, jasowang wrote:
>> On 11/24/2011 05:59 PM, Michael S. Tsirkin wrote:
>>> On Thu, Nov 24, 2011 at 01:47:14PM +0530, Krishna Kumar wrote:
>>>> It was reported that the macvtap device selects a
>>>> different vhost (when used with multiqueue feature)
>>>> for incoming packets of a single connection. Use
>>>> packet hash first. Patch tested on MQ virtio_net.
>>> So this is sure to address the problem, why exactly does this happen?
>> Ixgbe has flow director and bind queue to host cpu, so it can make
>> sure the packet of a flow to be handled by the same queue/cpu. So
>> when vhost thread moves from one host cpu to another, ixgbe would
>> therefore send the packet to the new cpu/queue.
> Confused. How does ixgbe know about vhost thread moving?
As far as I can see, ixgbe binds queues to physical cpus, so let's consider:
vhost thread transmits packets of flow A on processor M
during packet transmission, the ixgbe driver programs the card to deliver
the packets of flow A to queue/cpu M through the flow director (see ixgbe_atr())
vhost thread then receives packets of flow A from M
...
vhost thread transmits packets of flow A on processor N
the ixgbe driver programs the flow director to change the delivery of flow A
to queue N (cpu N)
vhost thread then receives packets of flow A from N
...
So, for a single flow A, we may get different queue mappings. Using the
rxhash instead may solve this issue.
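To make the contrast concrete, here is a minimal sketch of what the two
selectors would return for consecutive packets of one flow in the sequence
above (the hash value is hypothetical):

/* Packet 1 of flow A: vhost ran on cpu 2, so the flow director
 * steered the flow to rx ring 2. */
skb_get_rx_queue(skb);	/* -> 2 */
skb_get_rxhash(skb);	/* -> 0x9ae17c31, hash of the flow tuple */

/* Packet 2 of flow A: vhost moved to cpu 5 and the flow director
 * followed it. */
skb_get_rx_queue(skb);	/* -> 5, so taps[rxq % numvtaps] changes */
skb_get_rxhash(skb);	/* -> 0x9ae17c31, unchanged: same tap */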
>
>>> Does your device spread a single flow across multiple RX queues? Would
>>> not that cause trouble in the TCP layer?
>>> It would seem that using the recorded queue should be faster with
>>> less cache misses. Before we give up on that, I'd
>>> like to understand why it's wrong. Do you know?
>>>
> >>>>Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
>>>> ---
>>>> drivers/net/macvtap.c | 16 ++++++++--------
>>>> 1 file changed, 8 insertions(+), 8 deletions(-)
>>>>
>>>> diff -ruNp org/drivers/net/macvtap.c new/drivers/net/macvtap.c
>>>> --- org/drivers/net/macvtap.c 2011-10-22 08:38:01.000000000 +0530
>>>> +++ new/drivers/net/macvtap.c 2011-11-16 18:34:51.000000000 +0530
>>>> @@ -175,6 +175,14 @@ static struct macvtap_queue *macvtap_get
>>>> if (!numvtaps)
>>>> goto out;
>>>>
>>>> + /* Check if we can use flow to select a queue */
>>>> + rxq = skb_get_rxhash(skb);
>>>> + if (rxq) {
>>>> + tap = rcu_dereference(vlan->taps[rxq % numvtaps]);
>>>> + if (tap)
>>>> + goto out;
>>>> + }
>>>> +
>>>> if (likely(skb_rx_queue_recorded(skb))) {
>>>> rxq = skb_get_rx_queue(skb);
>>>>
>>>> @@ -186,14 +194,6 @@ static struct macvtap_queue *macvtap_get
>>>> goto out;
>>>> }
>>>>
>>>> - /* Check if we can use flow to select a queue */
>>>> - rxq = skb_get_rxhash(skb);
>>>> - if (rxq) {
>>>> - tap = rcu_dereference(vlan->taps[rxq % numvtaps]);
>>>> - if (tap)
>>>> - goto out;
>>>> - }
>>>> -
>>>> /* Everything failed - find first available queue */
> >>>> for (rxq = 0; rxq < MAX_MACVTAP_QUEUES; rxq++) {
>>>> tap = rcu_dereference(vlan->taps[rxq]);
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe netdev" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
* Re: [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first
2011-11-24 11:14 ` Krishna Kumar2
@ 2011-11-24 13:00 ` jasowang
2011-11-24 16:12 ` Michael S. Tsirkin
2011-11-25 2:58 ` Krishna Kumar2
0 siblings, 2 replies; 27+ messages in thread
From: jasowang @ 2011-11-24 13:00 UTC (permalink / raw)
To: Krishna Kumar2
Cc: arnd, Michael S. Tsirkin, netdev, virtualization, levinsasha928,
davem, jeffrey.t.kirsher
On 11/24/2011 07:14 PM, Krishna Kumar2 wrote:
> "Michael S. Tsirkin"<mst@redhat.com> wrote on 11/24/2011 03:29:03 PM:
>
>> Subject Re: [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first
>>
>> On Thu, Nov 24, 2011 at 01:47:14PM +0530, Krishna Kumar wrote:
>>> It was reported that the macvtap device selects a
>>> different vhost (when used with multiqueue feature)
>>> for incoming packets of a single connection. Use
>>> packet hash first. Patch tested on MQ virtio_net.
>> So this is sure to address the problem, why exactly does this happen?
>> Does your device spread a single flow across multiple RX queues? Would
>> not that cause trouble in the TCP layer?
>> It would seem that using the recorded queue should be faster with
>> less cache misses. Before we give up on that, I'd
>> like to understand why it's wrong. Do you know?
> I am using ixgbe. From what I briefly saw, ixgbe_alloc_rx_buffers
> calls skb_record_rx_queue when a skb is allocated. When a packet
> is received (ixgbe_alloc_rx_buffers), it sets rxhash. The
> recorded value is different for most skbs when I ran a single
> stream TCP stream test (does skbs move between the rx_rings?).
Yes, they move. It depends on the last processor or tx queue that
transmitted the packets of this stream. Because ixgbe selects the tx
queue based on the processor id, if the vhost thread transmits skbs on
different processors, the skbs of a single stream may come from
different rx rings.
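A minimal sketch of the cpu-based selection being described, paraphrased
from ixgbe's select-queue path (the real ixgbe_select_queue() has
additional cases, e.g. for FCoE):

/* With flow-director hashing enabled, the tx queue follows the cpu
 * the sender happens to run on, so a migrating vhost thread drags
 * its flows onto new tx queues (and, via the flow director, new rx
 * rings). */
static u16 cpu_based_select_queue(struct net_device *dev,
				  struct sk_buff *skb)
{
	unsigned int txq = smp_processor_id();

	while (unlikely(txq >= dev->real_num_tx_queues))
		txq -= dev->real_num_tx_queues;
	return txq;
}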
> With this patch, macvtap selects the same device for each
> incoming packet. I can add some debugs in ixgbe to see what is
> happening also. I am not sure if Sasha was using a different
> device.
>
> Cc'ing Jeffrey in case he can add something.
>
> thanks,
>
> - KK
>
* Re: [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first
2011-11-24 13:00 ` jasowang
@ 2011-11-24 16:12 ` Michael S. Tsirkin
2011-11-25 2:58 ` Krishna Kumar2
1 sibling, 0 replies; 27+ messages in thread
From: Michael S. Tsirkin @ 2011-11-24 16:12 UTC (permalink / raw)
To: jasowang
Cc: Krishna Kumar2, arnd, davem, levinsasha928, netdev,
virtualization, jeffrey.t.kirsher
On Thu, Nov 24, 2011 at 09:00:52PM +0800, jasowang wrote:
> On 11/24/2011 07:14 PM, Krishna Kumar2 wrote:
> >"Michael S. Tsirkin"<mst@redhat.com> wrote on 11/24/2011 03:29:03 PM:
> >
> >>Subject Re: [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first
> >>
> >>On Thu, Nov 24, 2011 at 01:47:14PM +0530, Krishna Kumar wrote:
> >>>It was reported that the macvtap device selects a
> >>>different vhost (when used with multiqueue feature)
> >>>for incoming packets of a single connection. Use
> >>>packet hash first. Patch tested on MQ virtio_net.
> >>So this is sure to address the problem, why exactly does this happen?
> >>Does your device spread a single flow across multiple RX queues? Would
> >>not that cause trouble in the TCP layer?
> >>It would seem that using the recorded queue should be faster with
> >>less cache misses. Before we give up on that, I'd
> >>like to understand why it's wrong. Do you know?
> >I am using ixgbe. From what I briefly saw, ixgbe_alloc_rx_buffers
> >calls skb_record_rx_queue when a skb is allocated. When a packet
> >is received (ixgbe_alloc_rx_buffers), it sets rxhash. The
> >recorded value is different for most skbs when I ran a single
> >stream TCP stream test (does skbs move between the rx_rings?).
>
> Yes, it moves. It depends on last processor or tx queue who
> transmits the packets of this stream. Because ixgbe select tx queue
> based on the processor id, so if vhost thread transmits skbs on
> different processors, the skb of a single stream may comes from
> different rx ring.
Is that so? In that case, it looks like there's a bug on the transmit
side, causing skbs to get scattered between ixgbe tx queues,
and the receive-side trouble follows.
> >With this patch, macvtap selects the same device for each
> >incoming packet. I can add some debugs in ixgbe to see what is
> >happening also. I am not sure if Sasha was using a different
> >device.
> >
> >Cc'ing Jeffrey in case he can add something.
> >
> >thanks,
> >
> >- KK
> >
* Re: [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first
2011-11-24 12:56 ` jasowang
@ 2011-11-24 16:14 ` Michael S. Tsirkin
2011-11-25 3:07 ` Krishna Kumar2
2011-11-25 3:09 ` Jason Wang
0 siblings, 2 replies; 27+ messages in thread
From: Michael S. Tsirkin @ 2011-11-24 16:14 UTC (permalink / raw)
To: jasowang
Cc: Krishna Kumar, arnd, netdev, virtualization, levinsasha928, davem
On Thu, Nov 24, 2011 at 08:56:45PM +0800, jasowang wrote:
> On 11/24/2011 06:34 PM, Michael S. Tsirkin wrote:
> >On Thu, Nov 24, 2011 at 06:13:41PM +0800, jasowang wrote:
> >>On 11/24/2011 05:59 PM, Michael S. Tsirkin wrote:
> >>>On Thu, Nov 24, 2011 at 01:47:14PM +0530, Krishna Kumar wrote:
> >>>>It was reported that the macvtap device selects a
> >>>>different vhost (when used with multiqueue feature)
> >>>>for incoming packets of a single connection. Use
> >>>>packet hash first. Patch tested on MQ virtio_net.
> >>>So this is sure to address the problem, why exactly does this happen?
> >>Ixgbe has flow director and bind queue to host cpu, so it can make
> >>sure the packet of a flow to be handled by the same queue/cpu. So
> >>when vhost thread moves from one host cpu to another, ixgbe would
> >>therefore send the packet to the new cpu/queue.
> >Confused. How does ixgbe know about vhost thread moving?
>
> As far as I can see, ixgbe binds queues to physical cpu, so let consider:
>
> vhost thread transmits packets of flow A on processor M
> during packet transmission, ixgbe driver programs the card to
> deliver the packet of flow A to queue/cpu M through flow director
> (see ixgbe_atr())
> vhost thread then receives packet of flow A from M
> ...
> vhost thread transmits packets of flow A on processor N
> ixgbe driver programs the flow director to change the delivery of
> flow A to queue N ( cpu N )
> vhost thread then receives packet of flow A from N
> ...
>
> So, for a single flow A, we may get different queue mappings. Using
> rxhash instead may solve this issue.
Or better, transmit a single flow from a single vhost thread.
If packets of a single flow get spread over different CPUs,
they will get reordered and things are not going to work well.
> >
> >>>Does your device spread a single flow across multiple RX queues? Would
> >>>not that cause trouble in the TCP layer?
> >>>It would seem that using the recorded queue should be faster with
> >>>less cache misses. Before we give up on that, I'd
> >>>like to understand why it's wrong. Do you know?
> >>>
> >>>>Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
> >>>>---
> >>>> drivers/net/macvtap.c | 16 ++++++++--------
> >>>> 1 file changed, 8 insertions(+), 8 deletions(-)
> >>>>
> >>>>diff -ruNp org/drivers/net/macvtap.c new/drivers/net/macvtap.c
> >>>>--- org/drivers/net/macvtap.c 2011-10-22 08:38:01.000000000 +0530
> >>>>+++ new/drivers/net/macvtap.c 2011-11-16 18:34:51.000000000 +0530
> >>>>@@ -175,6 +175,14 @@ static struct macvtap_queue *macvtap_get
> >>>> if (!numvtaps)
> >>>> goto out;
> >>>>
> >>>>+ /* Check if we can use flow to select a queue */
> >>>>+ rxq = skb_get_rxhash(skb);
> >>>>+ if (rxq) {
> >>>>+ tap = rcu_dereference(vlan->taps[rxq % numvtaps]);
> >>>>+ if (tap)
> >>>>+ goto out;
> >>>>+ }
> >>>>+
> >>>> if (likely(skb_rx_queue_recorded(skb))) {
> >>>> rxq = skb_get_rx_queue(skb);
> >>>>
> >>>>@@ -186,14 +194,6 @@ static struct macvtap_queue *macvtap_get
> >>>> goto out;
> >>>> }
> >>>>
> >>>>- /* Check if we can use flow to select a queue */
> >>>>- rxq = skb_get_rxhash(skb);
> >>>>- if (rxq) {
> >>>>- tap = rcu_dereference(vlan->taps[rxq % numvtaps]);
> >>>>- if (tap)
> >>>>- goto out;
> >>>>- }
> >>>>-
> >>>> /* Everything failed - find first available queue */
> >>>> for (rxq = 0; rxq < MAX_MACVTAP_QUEUES; rxq++) {
> >>>> tap = rcu_dereference(vlan->taps[rxq]);
* Re: [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first
2011-11-24 13:00 ` jasowang
2011-11-24 16:12 ` Michael S. Tsirkin
@ 2011-11-25 2:58 ` Krishna Kumar2
2011-11-25 3:18 ` Jason Wang
1 sibling, 1 reply; 27+ messages in thread
From: Krishna Kumar2 @ 2011-11-25 2:58 UTC (permalink / raw)
To: jasowang
Cc: arnd, davem, jeffrey.t.kirsher, levinsasha928, Michael S. Tsirkin,
netdev, virtualization
jasowang <jasowang@redhat.com> wrote on 11/24/2011 06:30:52 PM:
>
> >> On Thu, Nov 24, 2011 at 01:47:14PM +0530, Krishna Kumar wrote:
> >>> It was reported that the macvtap device selects a
> >>> different vhost (when used with multiqueue feature)
> >>> for incoming packets of a single connection. Use
> >>> packet hash first. Patch tested on MQ virtio_net.
> >> So this is sure to address the problem, why exactly does this happen?
> >> Does your device spread a single flow across multiple RX queues? Would
> >> not that cause trouble in the TCP layer?
> >> It would seem that using the recorded queue should be faster with
> >> less cache misses. Before we give up on that, I'd
> >> like to understand why it's wrong. Do you know?
> > I am using ixgbe. From what I briefly saw, ixgbe_alloc_rx_buffers
> > calls skb_record_rx_queue when a skb is allocated. When a packet
> > is received (ixgbe_alloc_rx_buffers), it sets rxhash. The
> > recorded value is different for most skbs when I ran a single
> > stream TCP stream test (does skbs move between the rx_rings?).
>
> Yes, it moves. It depends on last processor or tx queue who transmits
> the packets of this stream. Because ixgbe select tx queue based on the
> processor id, so if vhost thread transmits skbs on different processors,
> the skb of a single stream may comes from different rx ring.
But I don't see the transmits going out on different queues,
only the incoming ones changing.
- KK
* Re: [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first
2011-11-24 16:14 ` Michael S. Tsirkin
@ 2011-11-25 3:07 ` Krishna Kumar2
2011-11-25 3:21 ` Jason Wang
2011-11-25 3:09 ` Jason Wang
1 sibling, 1 reply; 27+ messages in thread
From: Krishna Kumar2 @ 2011-11-25 3:07 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: arnd, davem, jasowang, levinsasha928, netdev, virtualization
"Michael S. Tsirkin" <mst@redhat.com> wrote on 11/24/2011 09:44:31 PM:
> > As far as I can see, ixgbe binds queues to physical cpu, so let consider:
> >
> > vhost thread transmits packets of flow A on processor M
> > during packet transmission, ixgbe driver programs the card to
> > deliver the packet of flow A to queue/cpu M through flow director
> > (see ixgbe_atr())
> > vhost thread then receives packet of flow A from M
> > ...
> > vhost thread transmits packets of flow A on processor N
> > ixgbe driver programs the flow director to change the delivery of
> > flow A to queue N ( cpu N )
> > vhost thread then receives packet of flow A from N
> > ...
> >
> > So, for a single flow A, we may get different queue mappings. Using
> > rxhash instead may solve this issue.
>
> Or better, transmit a single flow from a single vhost thread.
>
> If packets of a single flow get spread over different CPUs,
> they will get reordered and things are not going to work well.
My testing so far shows that the guest sends on (e.g.) TXQ#2
only, which is handled by vhost#2, and this doesn't change
for the entire duration of the test. The incoming queue keeps
changing for different packets, but becomes the same with
this patch. To reiterate, I have not seen the following:
"
vhost thread transmits packets of flow A on processor M
...
vhost thread transmits packets of flow A on processor N
"
- KK
* Re: [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first
2011-11-24 16:14 ` Michael S. Tsirkin
2011-11-25 3:07 ` Krishna Kumar2
@ 2011-11-25 3:09 ` Jason Wang
1 sibling, 0 replies; 27+ messages in thread
From: Jason Wang @ 2011-11-25 3:09 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Krishna Kumar, arnd, netdev, virtualization, levinsasha928, davem
On 11/25/2011 12:14 AM, Michael S. Tsirkin wrote:
> On Thu, Nov 24, 2011 at 08:56:45PM +0800, jasowang wrote:
>> > On 11/24/2011 06:34 PM, Michael S. Tsirkin wrote:
>>> > >On Thu, Nov 24, 2011 at 06:13:41PM +0800, jasowang wrote:
>>>> > >>On 11/24/2011 05:59 PM, Michael S. Tsirkin wrote:
>>>>> > >>>On Thu, Nov 24, 2011 at 01:47:14PM +0530, Krishna Kumar wrote:
>>>>>> > >>>>It was reported that the macvtap device selects a
>>>>>> > >>>>different vhost (when used with multiqueue feature)
>>>>>> > >>>>for incoming packets of a single connection. Use
>>>>>> > >>>>packet hash first. Patch tested on MQ virtio_net.
>>>>> > >>>So this is sure to address the problem, why exactly does this happen?
>>>> > >>Ixgbe has flow director and bind queue to host cpu, so it can make
>>>> > >>sure the packet of a flow to be handled by the same queue/cpu. So
>>>> > >>when vhost thread moves from one host cpu to another, ixgbe would
>>>> > >>therefore send the packet to the new cpu/queue.
>>> > >Confused. How does ixgbe know about vhost thread moving?
>> >
>> > As far as I can see, ixgbe binds queues to physical cpu, so let consider:
>> >
>> > vhost thread transmits packets of flow A on processor M
>> > during packet transmission, ixgbe driver programs the card to
>> > deliver the packet of flow A to queue/cpu M through flow director
>> > (see ixgbe_atr())
>> > vhost thread then receives packet of flow A from M
>> > ...
>> > vhost thread transmits packets of flow A on processor N
>> > ixgbe driver programs the flow director to change the delivery of
>> > flow A to queue N ( cpu N )
>> > vhost thread then receives packet of flow A from N
>> > ...
>> >
>> > So, for a single flow A, we may get different queue mappings. Using
>> > rxhash instead may solve this issue.
> Or better, transmit a single flow from a single vhost thread.
It already works this way, as the tx queue is chosen based on the tx
hash in the guest, but the vhost thread can move among processors.
>
> If packets of a single flow get spread over different CPUs,
> they will get reordered and things are not going to work well.
>
The problem is that vhost does not handle TCP itself but the ixgbe driver
assumes it does, so the nic would deliver the packets of a single flow
to different CPUs when the vhost thread that does the transmission moves.
So, in conclusion, if we cannot assume features of the underlying nic,
using the rxhash instead of the queue mapping to identify a flow is better.
* Re: [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first
2011-11-25 2:58 ` Krishna Kumar2
@ 2011-11-25 3:18 ` Jason Wang
0 siblings, 0 replies; 27+ messages in thread
From: Jason Wang @ 2011-11-25 3:18 UTC (permalink / raw)
To: Krishna Kumar2
Cc: arnd, Michael S. Tsirkin, netdev, virtualization, levinsasha928,
davem, jeffrey.t.kirsher
On 11/25/2011 10:58 AM, Krishna Kumar2 wrote:
> jasowang <jasowang@redhat.com> wrote on 11/24/2011 06:30:52 PM:
>
>>>> On Thu, Nov 24, 2011 at 01:47:14PM +0530, Krishna Kumar wrote:
>>>>> It was reported that the macvtap device selects a
>>>>> different vhost (when used with multiqueue feature)
>>>>> for incoming packets of a single connection. Use
>>>>> packet hash first. Patch tested on MQ virtio_net.
>>>> So this is sure to address the problem, why exactly does this happen?
>>>> Does your device spread a single flow across multiple RX queues? Would
>>>> not that cause trouble in the TCP layer?
>>>> It would seem that using the recorded queue should be faster with
>>>> less cache misses. Before we give up on that, I'd
>>>> like to understand why it's wrong. Do you know?
>>> I am using ixgbe. From what I briefly saw, ixgbe_alloc_rx_buffers
>>> calls skb_record_rx_queue when a skb is allocated. When a packet
>>> is received (ixgbe_alloc_rx_buffers), it sets rxhash. The
>>> recorded value is different for most skbs when I ran a single
>>> stream TCP stream test (does skbs move between the rx_rings?).
>> Yes, it moves. It depends on last processor or tx queue who transmits
>> the packets of this stream. Because ixgbe select tx queue based on the
>> processor id, so if vhost thread transmits skbs on different processors,
>> the skb of a single stream may comes from different rx ring.
> But I don't see transmit going on different queues,
> only incoming.
>
> - KK
>
Maybe I'm not being clear enough: I mean the host processor and the tx
queue of ixgbe. So you would see that a single vhost thread, as it moves
among host cpus, uses different tx queues of ixgbe. I think if you pin
the vhost thread to a host cpu, you may get a consistent rx queue number.
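(In practice such pinning can be done from the shell, e.g. with
'taskset -p -c 2 <vhost-thread-pid>', where the cpu number and thread id
here are hypothetical, or with cgroup cpusets.)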
* Re: [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first
2011-11-25 3:07 ` Krishna Kumar2
@ 2011-11-25 3:21 ` Jason Wang
2011-11-25 4:09 ` Krishna Kumar2
0 siblings, 1 reply; 27+ messages in thread
From: Jason Wang @ 2011-11-25 3:21 UTC (permalink / raw)
To: Krishna Kumar2
Cc: arnd, Michael S. Tsirkin, netdev, virtualization, levinsasha928,
davem
On 11/25/2011 11:07 AM, Krishna Kumar2 wrote:
> "Michael S. Tsirkin"<mst@redhat.com> wrote on 11/24/2011 09:44:31 PM:
>
>>> As far as I can see, ixgbe binds queues to physical cpu, so let consider:
>>> vhost thread transmits packets of flow A on processor M
>>> during packet transmission, ixgbe driver programs the card to
>>> deliver the packet of flow A to queue/cpu M through flow director
>>> (see ixgbe_atr())
>>> vhost thread then receives packet of flow A from M
>>> ...
>>> vhost thread transmits packets of flow A on processor N
>>> ixgbe driver programs the flow director to change the delivery of
>>> flow A to queue N ( cpu N )
>>> vhost thread then receives packet of flow A from N
>>> ...
>>>
>>> So, for a single flow A, we may get different queue mappings. Using
>>> rxhash instead may solve this issue.
>> Or better, transmit a single flow from a single vhost thread.
>>
>> If packets of a single flow get spread over different CPUs,
>> they will get reordered and things are not going to work well.
> My testing so far shows that guest sends on (e.g.) TXQ#2
> only, which is handled by vhost#2; and this doesn't change
> for the entire duration of the test. Incoming keeps
> changing for different packets but become same with
> this patch. To iterate, I have not seen the following:
Yes, because the guest chooses the virtio-net txq based on the hash.
>
> "
> vhost thread transmits packets of flow A on processor M
> ...
> vhost thread transmits packets of flow A on processor N
> "
My description is not clear again :(
I mean the same vhost thread:
vhost thread #0 transmits packets of flow A on processor M
...
vhost thread #0 moves to another processor N and starts to transmit packets
of flow A
> - KK
>
* Re: [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first
2011-11-25 3:21 ` Jason Wang
@ 2011-11-25 4:09 ` Krishna Kumar2
2011-11-25 6:35 ` David Miller
` (2 more replies)
0 siblings, 3 replies; 27+ messages in thread
From: Krishna Kumar2 @ 2011-11-25 4:09 UTC (permalink / raw)
To: Jason Wang
Cc: arnd, Michael S. Tsirkin, netdev, virtualization, levinsasha928,
davem
Jason Wang <jasowang@redhat.com> wrote on 11/25/2011 08:51:57 AM:
>
> My description is not clear again :(
> I mean the same vhost thead:
>
> vhost thread #0 transmits packets of flow A on processor M
> ...
> vhost thread #0 move to another process N and start to transmit packets
> of flow A
Thanks for clarifying. Yes, binding vhosts to CPUs
makes the incoming packets go to the same vhost each
time. BTW, are you doing any binding and/or irqbalance
when you run your tests? I am not running either at
this time, but thought both might be useful.
- KK
* Re: [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first
2011-11-25 4:09 ` Krishna Kumar2
@ 2011-11-25 6:35 ` David Miller
2011-11-27 17:23 ` Michael S. Tsirkin
2011-12-07 16:10 ` Michael S. Tsirkin
2011-11-27 17:14 ` Michael S. Tsirkin
2011-11-28 4:25 ` Jason Wang
2 siblings, 2 replies; 27+ messages in thread
From: David Miller @ 2011-11-25 6:35 UTC (permalink / raw)
To: krkumar2; +Cc: arnd, mst, netdev, virtualization, levinsasha928
From: Krishna Kumar2 <krkumar2@in.ibm.com>
Date: Fri, 25 Nov 2011 09:39:11 +0530
> Jason Wang <jasowang@redhat.com> wrote on 11/25/2011 08:51:57 AM:
>>
>> My description is not clear again :(
>> I mean the same vhost thead:
>>
>> vhost thread #0 transmits packets of flow A on processor M
>> ...
>> vhost thread #0 move to another process N and start to transmit packets
>> of flow A
>
> Thanks for clarifying. Yes, binding vhosts to CPU's
> makes the incoming packet go to the same vhost each
> time. BTW, are you doing any binding and/or irqbalance
> when you run your tests? I am not running either at
> this time, but thought both might be useful.
So are we going with this patch or are we saying that vhost binding
is a requirement?
* Re: [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first
2011-11-25 4:09 ` Krishna Kumar2
2011-11-25 6:35 ` David Miller
@ 2011-11-27 17:14 ` Michael S. Tsirkin
2011-11-28 4:25 ` Jason Wang
2 siblings, 0 replies; 27+ messages in thread
From: Michael S. Tsirkin @ 2011-11-27 17:14 UTC (permalink / raw)
To: Krishna Kumar2; +Cc: arnd, netdev, virtualization, levinsasha928, davem
On Fri, Nov 25, 2011 at 09:39:11AM +0530, Krishna Kumar2 wrote:
> Jason Wang <jasowang@redhat.com> wrote on 11/25/2011 08:51:57 AM:
> >
> > My description is not clear again :(
> > I mean the same vhost thead:
> >
> > vhost thread #0 transmits packets of flow A on processor M
> > ...
> > vhost thread #0 move to another process N and start to transmit packets
> > of flow A
>
> Thanks for clarifying. Yes, binding vhosts to CPU's
> makes the incoming packet go to the same vhost each
> time.
Interesting, but still not sure why.
What if you bind the VCPUs but not the vhost thread?
> BTW, are you doing any binding and/or irqbalance
> when you run your tests? I am not running either at
> this time, but thought both might be useful.
>
> - KK
Either pinning or irqbalance is a good idea.
Doing neither means you get a random CPU handling
interrupts.
--
MST
* Re: [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first
2011-11-25 6:35 ` David Miller
@ 2011-11-27 17:23 ` Michael S. Tsirkin
2011-11-28 4:40 ` Jason Wang
2011-12-07 16:10 ` Michael S. Tsirkin
1 sibling, 1 reply; 27+ messages in thread
From: Michael S. Tsirkin @ 2011-11-27 17:23 UTC (permalink / raw)
To: David Miller; +Cc: krkumar2, arnd, netdev, virtualization, levinsasha928
On Fri, Nov 25, 2011 at 01:35:52AM -0500, David Miller wrote:
> From: Krishna Kumar2 <krkumar2@in.ibm.com>
> Date: Fri, 25 Nov 2011 09:39:11 +0530
>
> > Jason Wang <jasowang@redhat.com> wrote on 11/25/2011 08:51:57 AM:
> >>
> >> My description is not clear again :(
> >> I mean the same vhost thead:
> >>
> >> vhost thread #0 transmits packets of flow A on processor M
> >> ...
> >> vhost thread #0 move to another process N and start to transmit packets
> >> of flow A
> >
> > Thanks for clarifying. Yes, binding vhosts to CPU's
> > makes the incoming packet go to the same vhost each
> > time. BTW, are you doing any binding and/or irqbalance
> > when you run your tests? I am not running either at
> > this time, but thought both might be useful.
>
> So are we going with this patch or are we saying that vhost binding
> is a requirement?
I think it's a good idea to make sure we understand the problem's
root cause well before applying the patch. We still
have a bit of time before 3.2. In particular, why does
the vhost thread bounce between CPUs so much?
Long term it seems the best way is to expose the preferred mapping
from the guest and forward it to the device.
--
MST
* Re: [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first
2011-11-25 4:09 ` Krishna Kumar2
2011-11-25 6:35 ` David Miller
2011-11-27 17:14 ` Michael S. Tsirkin
@ 2011-11-28 4:25 ` Jason Wang
2011-11-28 17:42 ` Stephen Hemminger
2 siblings, 1 reply; 27+ messages in thread
From: Jason Wang @ 2011-11-28 4:25 UTC (permalink / raw)
To: Krishna Kumar2
Cc: arnd, Michael S. Tsirkin, netdev, virtualization, levinsasha928,
davem
On 11/25/2011 12:09 PM, Krishna Kumar2 wrote:
> Jason Wang <jasowang@redhat.com> wrote on 11/25/2011 08:51:57 AM:
>> My description is not clear again :(
>> I mean the same vhost thead:
>>
>> vhost thread #0 transmits packets of flow A on processor M
>> ...
>> vhost thread #0 move to another process N and start to transmit packets
>> of flow A
> Thanks for clarifying. Yes, binding vhosts to CPU's
> makes the incoming packet go to the same vhost each
> time. BTW, are you doing any binding and/or irqbalance
> when you run your tests? I am not running either at
> this time, but thought both might be useful.
I'm using ixgbe for testing also; on the host, its driver seems to
provide an irq affinity hint, so no binding or irqbalance is needed.
In the guest, irqbalance is used. I've tried binding irqs in the
guest, and it can improve the rx performance.
> - KK
>
* Re: [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first
2011-11-27 17:23 ` Michael S. Tsirkin
@ 2011-11-28 4:40 ` Jason Wang
0 siblings, 0 replies; 27+ messages in thread
From: Jason Wang @ 2011-11-28 4:40 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: krkumar2, arnd, netdev, virtualization, levinsasha928,
David Miller
On 11/28/2011 01:23 AM, Michael S. Tsirkin wrote:
> On Fri, Nov 25, 2011 at 01:35:52AM -0500, David Miller wrote:
>> From: Krishna Kumar2 <krkumar2@in.ibm.com>
>> Date: Fri, 25 Nov 2011 09:39:11 +0530
>>
>>> Jason Wang <jasowang@redhat.com> wrote on 11/25/2011 08:51:57 AM:
>>>> My description is not clear again :(
>>>> I mean the same vhost thead:
>>>>
>>>> vhost thread #0 transmits packets of flow A on processor M
>>>> ...
>>>> vhost thread #0 move to another process N and start to transmit packets
>>>> of flow A
>>> Thanks for clarifying. Yes, binding vhosts to CPU's
>>> makes the incoming packet go to the same vhost each
>>> time. BTW, are you doing any binding and/or irqbalance
>>> when you run your tests? I am not running either at
>>> this time, but thought both might be useful.
>> So are we going with this patch or are we saying that vhost binding
>> is a requirement?
> I think it's a good idea to make sure we understand the problem
> root cause well before applying the patch. We still
> have a bit of time before 3.2. In particular, why does
> the vhost thread bounce between CPUs so much?
Apart from this, since we cannot assume the behavior of the underlying
nic, using the rxhash to identify a flow is the more generic way.
>
> Long term it seems the best way is to expose the preferred mapping
> from the guest and forward it to the device.
>
I was working on this and hope to post it soon.
* Re: [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first
2011-11-28 4:25 ` Jason Wang
@ 2011-11-28 17:42 ` Stephen Hemminger
0 siblings, 0 replies; 27+ messages in thread
From: Stephen Hemminger @ 2011-11-28 17:42 UTC (permalink / raw)
To: Jason Wang
Cc: Krishna Kumar2, arnd, Michael S. Tsirkin, netdev, virtualization,
levinsasha928, davem
On Mon, 28 Nov 2011 12:25:15 +0800
Jason Wang <jasowang@redhat.com> wrote:
> I'm using ixgbe for testing also, for host, its driver seems provide irq
> affinity hint, so no binding or irqbalance is needed.
The hint is for irqbalance to use. You still need to do manual
affinity or use irqbalance.
* Re: [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first
2011-11-25 6:35 ` David Miller
2011-11-27 17:23 ` Michael S. Tsirkin
@ 2011-12-07 16:10 ` Michael S. Tsirkin
2011-12-07 18:52 ` David Miller
2011-12-08 9:46 ` Jason Wang
1 sibling, 2 replies; 27+ messages in thread
From: Michael S. Tsirkin @ 2011-12-07 16:10 UTC (permalink / raw)
To: David Miller; +Cc: krkumar2, arnd, netdev, virtualization, levinsasha928
On Fri, Nov 25, 2011 at 01:35:52AM -0500, David Miller wrote:
> From: Krishna Kumar2 <krkumar2@in.ibm.com>
> Date: Fri, 25 Nov 2011 09:39:11 +0530
>
> > Jason Wang <jasowang@redhat.com> wrote on 11/25/2011 08:51:57 AM:
> >>
> >> My description is not clear again :(
> >> I mean the same vhost thead:
> >>
> >> vhost thread #0 transmits packets of flow A on processor M
> >> ...
> >> vhost thread #0 move to another process N and start to transmit packets
> >> of flow A
> >
> > Thanks for clarifying. Yes, binding vhosts to CPU's
> > makes the incoming packet go to the same vhost each
> > time. BTW, are you doing any binding and/or irqbalance
> > when you run your tests? I am not running either at
> > this time, but thought both might be useful.
>
> So are we going with this patch or are we saying that vhost binding
> is a requirement?
OK, we didn't come to a conclusion, so I would be inclined
to merge this patch as-is for 3.2 and revisit later.
One question though: do these changes affect userspace
in any way? For example, will this commit us to
ensuring that a single flow gets a unique hash even
for strange configurations that transmit the same flow
from multiple cpus?
--
MST
* Re: [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first
2011-12-07 16:10 ` Michael S. Tsirkin
@ 2011-12-07 18:52 ` David Miller
2011-12-20 11:15 ` Michael S. Tsirkin
2011-12-08 9:46 ` Jason Wang
1 sibling, 1 reply; 27+ messages in thread
From: David Miller @ 2011-12-07 18:52 UTC (permalink / raw)
To: mst; +Cc: krkumar2, arnd, netdev, virtualization, levinsasha928
From: "Michael S. Tsirkin" <mst@redhat.com>
Date: Wed, 7 Dec 2011 18:10:02 +0200
> On Fri, Nov 25, 2011 at 01:35:52AM -0500, David Miller wrote:
>> From: Krishna Kumar2 <krkumar2@in.ibm.com>
>> Date: Fri, 25 Nov 2011 09:39:11 +0530
>>
>> > Jason Wang <jasowang@redhat.com> wrote on 11/25/2011 08:51:57 AM:
>> >>
>> >> My description is not clear again :(
>> >> I mean the same vhost thead:
>> >>
>> >> vhost thread #0 transmits packets of flow A on processor M
>> >> ...
>> >> vhost thread #0 move to another process N and start to transmit packets
>> >> of flow A
>> >
>> > Thanks for clarifying. Yes, binding vhosts to CPU's
>> > makes the incoming packet go to the same vhost each
>> > time. BTW, are you doing any binding and/or irqbalance
>> > when you run your tests? I am not running either at
>> > this time, but thought both might be useful.
>>
>> So are we going with this patch or are we saying that vhost binding
>> is a requirement?
>
> OK we didn't come to a conclusion so I would be inclined
> to merge this patch as is for 3.2, and revisit later.
> One question though: do these changes affect userspace
> in any way? For example, will this commit us to
> ensure that a single flow gets a unique hash even
> for strange configurations that transmit the same flow
> from multiple cpus?
Once you sort this out, reply with an Acked-by: for me, thanks.
* Re: [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first
2011-12-07 16:10 ` Michael S. Tsirkin
2011-12-07 18:52 ` David Miller
@ 2011-12-08 9:46 ` Jason Wang
1 sibling, 0 replies; 27+ messages in thread
From: Jason Wang @ 2011-12-08 9:46 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: krkumar2, arnd, netdev, virtualization, levinsasha928,
David Miller
On 12/08/2011 12:10 AM, Michael S. Tsirkin wrote:
> On Fri, Nov 25, 2011 at 01:35:52AM -0500, David Miller wrote:
>> From: Krishna Kumar2 <krkumar2@in.ibm.com>
>> Date: Fri, 25 Nov 2011 09:39:11 +0530
>>
>>> Jason Wang <jasowang@redhat.com> wrote on 11/25/2011 08:51:57 AM:
>>>> My description is not clear again :(
>>>> I mean the same vhost thead:
>>>>
>>>> vhost thread #0 transmits packets of flow A on processor M
>>>> ...
>>>> vhost thread #0 move to another process N and start to transmit packets
>>>> of flow A
>>> Thanks for clarifying. Yes, binding vhosts to CPU's
>>> makes the incoming packet go to the same vhost each
>>> time. BTW, are you doing any binding and/or irqbalance
>>> when you run your tests? I am not running either at
>>> this time, but thought both might be useful.
>> So are we going with this patch or are we saying that vhost binding
>> is a requirement?
> OK we didn't come to a conclusion so I would be inclined
> to merge this patch as is for 3.2, and revisit later.
> One question though: do these changes affect userspace
> in any way? For example, will this commit us to
> ensure that a single flow gets a unique hash even
> for strange configurations that transmit the same flow
> from multiple cpus?
>
The hash is generated by either the host kernel or the host nic, so I
think it's unique, except for a nic that would provide different hashes
for a flow. I wonder whether there is such a nic.
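For reference, skb_get_rxhash() at the time behaved essentially as
follows (paraphrased from include/linux/skbuff.h): it returns the
nic-provided hash when one was set, and otherwise computes a software
hash over the flow's addresses and ports:

static inline __u32 skb_get_rxhash(struct sk_buff *skb)
{
	/* Keep the hardware-provided (e.g. Toeplitz) hash when the
	 * nic supplied one; otherwise fall back to a software hash
	 * of the flow tuple. */
	if (!skb->rxhash)
		skb->rxhash = __skb_get_rxhash(skb);
	return skb->rxhash;
}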
* Re: [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first
2011-12-07 18:52 ` David Miller
@ 2011-12-20 11:15 ` Michael S. Tsirkin
2011-12-20 18:46 ` David Miller
0 siblings, 1 reply; 27+ messages in thread
From: Michael S. Tsirkin @ 2011-12-20 11:15 UTC (permalink / raw)
To: David Miller; +Cc: krkumar2, arnd, netdev, virtualization, levinsasha928
On Wed, Dec 07, 2011 at 01:52:35PM -0500, David Miller wrote:
> From: "Michael S. Tsirkin" <mst@redhat.com>
> Date: Wed, 7 Dec 2011 18:10:02 +0200
>
> > On Fri, Nov 25, 2011 at 01:35:52AM -0500, David Miller wrote:
> >> From: Krishna Kumar2 <krkumar2@in.ibm.com>
> >> Date: Fri, 25 Nov 2011 09:39:11 +0530
> >>
> >> > Jason Wang <jasowang@redhat.com> wrote on 11/25/2011 08:51:57 AM:
> >> >>
> >> >> My description is not clear again :(
> >> >> I mean the same vhost thead:
> >> >>
> >> >> vhost thread #0 transmits packets of flow A on processor M
> >> >> ...
> >> >> vhost thread #0 move to another process N and start to transmit packets
> >> >> of flow A
> >> >
> >> > Thanks for clarifying. Yes, binding vhosts to CPU's
> >> > makes the incoming packet go to the same vhost each
> >> > time. BTW, are you doing any binding and/or irqbalance
> >> > when you run your tests? I am not running either at
> >> > this time, but thought both might be useful.
> >>
> >> So are we going with this patch or are we saying that vhost binding
> >> is a requirement?
> >
> > OK we didn't come to a conclusion so I would be inclined
> > to merge this patch as is for 3.2, and revisit later.
> > One question though: do these changes affect userspace
> > in any way? For example, will this commit us to
> > ensure that a single flow gets a unique hash even
> > for strange configurations that transmit the same flow
> > from multiple cpus?
>
> Once you sort this out, reply with an Acked-by: for me, thanks.
Acked-by: Michael S. Tsirkin <mst@redhat.com>
* Re: [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first
2011-12-20 11:15 ` Michael S. Tsirkin
@ 2011-12-20 18:46 ` David Miller
0 siblings, 0 replies; 27+ messages in thread
From: David Miller @ 2011-12-20 18:46 UTC (permalink / raw)
To: mst; +Cc: krkumar2, arnd, netdev, virtualization, levinsasha928
From: "Michael S. Tsirkin" <mst@redhat.com>
Date: Tue, 20 Dec 2011 13:15:12 +0200
> On Wed, Dec 07, 2011 at 01:52:35PM -0500, David Miller wrote:
>> Once you sort this out, reply with an Acked-by: for me, thanks.
>
> Acked-by: Michael S. Tsirkin <mst@redhat.com>
Applied.