linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: John Fastabend <john.fastabend@gmail.com>
To: Vitaly Kuznetsov <vkuznets@redhat.com>,
	Simon Xiao <sixiao@microsoft.com>,
	Eric Dumazet <eric.dumazet@gmail.com>
Cc: Tom Herbert <tom@herbertland.com>,
	netdev@vger.kernel.org, "K. Y. Srinivasan" <kys@microsoft.com>,
	Haiyang Zhang <haiyangz@microsoft.com>,
	devel@linuxdriverproject.org, linux-kernel@vger.kernel.org,
	David Miller <davem@davemloft.net>
Subject: Re: [PATCH net-next] hv_netvsc: don't make assumptions on struct flow_keys layout
Date: Thu, 07 Jan 2016 17:02:08 -0800	[thread overview]
Message-ID: <568F0A90.8040109@gmail.com> (raw)
In-Reply-To: <877fjlfrid.fsf@vitty.brq.redhat.com>

On 16-01-07 05:28 AM, Vitaly Kuznetsov wrote:
> Eric Dumazet <eric.dumazet@gmail.com> writes:
> 
>> On Thu, 2016-01-07 at 10:33 +0100, Vitaly Kuznetsov wrote:
>>> Recent changes to 'struct flow_keys' (e.g commit d34af823ff40 ("net: Add
>>>  VLAN ID to flow_keys")) introduced a performance regression in netvsc
>>> driver. Is problem is, however, not the above mentioned commit but the
>>> fact that netvsc_set_hash() function did some assumptions on the struct
>>> flow_keys data layout and this is wrong. We need to extract the data we
>>> need (src/dst addresses and ports) after the dissect.
>>>
>>> The issue could also be solved in a completely different way: as suggested
>>> by Eric instead of our own homegrown netvsc_set_hash() we could use
>>> skb_get_hash() which does more or less the same. Unfortunately, the
>>> testing done by Simon showed that Hyper-V hosts are not happy with our
>>> Jenkins hash, selecting the output queue with the current algorithm based
>>> on Toeplitz hash works significantly better.
>>

Also can I ask the maybe naive question. It looks like the hypervisor
is populating some table via a mailbox msg and this is used to select
the queues I guess with some sort of weighting function?

What happens if you just remove select_queue altogether? Or maybe just
what is this 16 entry table doing? How does this work on my larger
systems with 64+ cores can I only use 16 cores? Sorry I really have
no experience with hyperV and this got me curious.

Thanks,
John

>> Were tests done on IPv6 traffic ?
>>
> 
> Simon, could you please test this patch for IPv6 and show us the numbers?
> 
>> Toeplitz hash takes at least 100 ns to hash 12 bytes (one iteration per
>> bit : 96 iterations)
>>
>> For IPv6 it is 3 times this, since we have to hash 36 bytes.
>>
>> I do not see how it can compete with skb_get_hash() that directly gives
>> skb->hash for local TCP flows.
>>
> 
> My guess is that this is not the bottleneck, something is happening
> behind the scene with out packets in Hyper-V host (e.g. re-distributing
> them to hardware queues?) but I don't know the internals, Microsoft
> folks could probably comment.
> 
> 
>> See commits b73c3d0e4f0e1961e15bec18720e48aabebe2109
>> ("net: Save TX flow hash in sock and set in skbuf on xmit")
>> and 877d1f6291f8e391237e324be58479a3e3a7407c
>> ("net: Set sk_txhash from a random number")
>>
>> I understand Microsoft loves Toeplitz, but this looks not well placed
>> here.
>>
>> I suspect there is another problem.
>>
>> Please share your numbers and test methodology, and the alternative
>> patch Simon tested so that we can double check it.
>>
> 
> Alternative patch which uses skb_get_hash() attached. Simon, could you
> please share the rest (environment, metodology, numbers) with us here?
> Thanks!
> 
>> Thanks.
>>
>> PS: For the time being this patch can probably be applied on -net tree,
>> as it fixes a real bug.
> 

  reply	other threads:[~2016-01-08  1:02 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-01-07  9:33 [PATCH net-next] hv_netvsc: don't make assumptions on struct flow_keys layout Vitaly Kuznetsov
2016-01-07 12:52 ` Eric Dumazet
2016-01-07 13:28   ` Vitaly Kuznetsov
2016-01-08  1:02     ` John Fastabend [this message]
2016-01-08  3:49       ` KY Srinivasan
2016-01-08  6:16         ` John Fastabend
2016-01-08 18:01           ` KY Srinivasan
2016-01-08 21:07     ` Haiyang Zhang
2016-01-09  0:17   ` Tom Herbert
2016-01-10 22:25 ` David Miller
2016-01-13 23:10   ` Haiyang Zhang
2016-01-14  4:56     ` David Miller
2016-01-14 17:14     ` Tom Herbert
2016-01-14 17:53       ` One Thousand Gnomes
2016-01-14 18:24         ` Eric Dumazet
2016-01-14 18:35           ` Haiyang Zhang
2016-01-14 18:48             ` Tom Herbert
2016-01-14 19:15               ` Haiyang Zhang
2016-01-14 19:41                 ` Tom Herbert
2016-01-14 20:23                   ` Haiyang Zhang
2016-01-14 21:44                     ` Tom Herbert
2016-01-14 22:06                       ` David Miller
2016-01-14 22:08                     ` Eric Dumazet
2016-01-14 22:29                       ` Haiyang Zhang
2016-01-14 17:53     ` Eric Dumazet

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=568F0A90.8040109@gmail.com \
    --to=john.fastabend@gmail.com \
    --cc=davem@davemloft.net \
    --cc=devel@linuxdriverproject.org \
    --cc=eric.dumazet@gmail.com \
    --cc=haiyangz@microsoft.com \
    --cc=kys@microsoft.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=sixiao@microsoft.com \
    --cc=tom@herbertland.com \
    --cc=vkuznets@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).