From: Hannes Frederic Sowa <hannes@stressinduktion.org>
To: Jay Vosburgh <jay.vosburgh@canonical.com>
Cc: netdev@vger.kernel.org, Herbert Xu <herbert@gondor.apana.org.au>,
	Thomas Graf <tgraf@suug.ch>,
	Daniel Borkmann <dborkman@redhat.com>,
	Eric Dumazet <eric.dumazet@gmail.com>
Subject: Re: [PATCH net-next] arch_fast_hash: avoid indirect function calls and implement hash in asm
Date: Thu, 04 Dec 2014 20:32:06 +0100
Message-ID: <1417721526.5386.39.camel@localhost>
In-Reply-To: <15333.1417721231@famine>

Hi Jay,

On Thu, 2014-12-04 at 11:27 -0800, Jay Vosburgh wrote:
> Hannes Frederic Sowa <hannes@stressinduktion.org> wrote:
> 
> >By default the arch_fast_hash hashing function pointers are initialized
> >to jhash(2). If during boot-up a CPU with SSE4.2 is detected they get
> >updated to the CRC32 ones. This dispatching scheme incurs a function
> >pointer lookup and indirect call for every hashing operation.
> >
> >To keep the number of clobbered registers small, the hashing primitives
> >are implemented in assembler. This makes it easier to do the dispatch
> >by alternative_call.
> 
> 	I have tested this on the same system that panicked with the
> original (now reverted) implementation (commit e5a2c8999576 "fast_hash:
> avoid indirect function calls"), and it functions correctly and does not
> panic.
> 
> 	I looked at the disassembly, and, as a data point, on a
> non-SSE4.2 system, the code generated is not as efficient as Hannes'
> original test patch, found here:
> 
> http://comments.gmane.org/gmane.linux.network/338430
> 
> 	which produced code as follows:
> 
> 0xffffffffa00b6bb9 <ovs_flow_tbl_insert+0xb9>:  mov    %r15,0x348(%r14)
> 0xffffffffa00b6bc0 <ovs_flow_tbl_insert+0xc0>:  movzwl 0x28(%r15),%ecx
> 0xffffffffa00b6bc5 <ovs_flow_tbl_insert+0xc5>:  movzwl 0x2a(%r15),%esi
> 0xffffffffa00b6bca <ovs_flow_tbl_insert+0xca>:  movzwl %cx,%eax
> 0xffffffffa00b6bcd <ovs_flow_tbl_insert+0xcd>:  sub    %ecx,%esi
> 0xffffffffa00b6bcf <ovs_flow_tbl_insert+0xcf>:  lea    0x38(%r14,%rax,1),%rdi
> 0xffffffffa00b6bd4 <ovs_flow_tbl_insert+0xd4>:  sar    $0x2,%esi
> 0xffffffffa00b6bd7 <ovs_flow_tbl_insert+0xd7>:  callq  0xffffffff813a7810 <__jhash2>
> 0xffffffffa00b6bdc <ovs_flow_tbl_insert+0xdc>:  mov    %eax,0x30(%r14)
> 0xffffffffa00b6be0 <ovs_flow_tbl_insert+0xe0>:  mov    (%rbx),%r13
> 0xffffffffa00b6be3 <ovs_flow_tbl_insert+0xe3>:  mov    %r14,%rsi
> 0xffffffffa00b6be6 <ovs_flow_tbl_insert+0xe6>:  mov    %r13,%rdi
> 0xffffffffa00b6be9 <ovs_flow_tbl_insert+0xe9>:  callq  0xffffffffa00b61a0 <table_instance_insert>
> 
> 	This patch's code ends up as follows:
> 
> 0xffffffffa01b5a57 <ovs_flow_tbl_insert+0xb7>:	mov    %r15,0x348(%rcx)
> 0xffffffffa01b5a5e <ovs_flow_tbl_insert+0xbe>:	movzwl 0x28(%r15),%eax
> 0xffffffffa01b5a63 <ovs_flow_tbl_insert+0xc3>:	movzwl 0x2a(%r15),%esi
> 0xffffffffa01b5a68 <ovs_flow_tbl_insert+0xc8>:	movzwl %ax,%edx
> 0xffffffffa01b5a6b <ovs_flow_tbl_insert+0xcb>:	sub    %eax,%esi
> 0xffffffffa01b5a6d <ovs_flow_tbl_insert+0xcd>:	lea    0x38(%rcx,%rdx,1),%rdi
> 0xffffffffa01b5a72 <ovs_flow_tbl_insert+0xd2>:	xor    %edx,%edx
> 0xffffffffa01b5a74 <ovs_flow_tbl_insert+0xd4>:	sar    $0x2,%esi
> 0xffffffffa01b5a77 <ovs_flow_tbl_insert+0xd7>:	callq  0xffffffff813ae9f0 <__jhash_trampoline>
> 0xffffffffa01b5a7c <ovs_flow_tbl_insert+0xdc>:	mov    %eax,0x30(%rcx)
> 0xffffffffa01b5a7f <ovs_flow_tbl_insert+0xdf>:	mov    (%rbx),%r13
> 0xffffffffa01b5a82 <ovs_flow_tbl_insert+0xe2>:	mov    %rcx,%rsi
> 0xffffffffa01b5a85 <ovs_flow_tbl_insert+0xe5>:	mov    %r13,%rdi
> 0xffffffffa01b5a88 <ovs_flow_tbl_insert+0xe8>:	callq  0xffffffffa01b5030 <table_instance_insert>
> 
> 0xffffffff813ae9f0 <__jhash_trampoline>:	push   %rcx
> 0xffffffff813ae9f1 <__jhash_trampoline+0x1>:	push   %r8
> 0xffffffff813ae9f3 <__jhash_trampoline+0x3>:	push   %r9
> 0xffffffff813ae9f5 <__jhash_trampoline+0x5>:	push   %r10
> 0xffffffff813ae9f7 <__jhash_trampoline+0x7>:	push   %r11
> 0xffffffff813ae9f9 <__jhash_trampoline+0x9>:	callq  0xffffffff813ae8a0 <__jhash>
> 0xffffffff813ae9fe <__jhash_trampoline+0xe>:	pop    %r11
> 0xffffffff813aea00 <__jhash_trampoline+0x10>:	pop    %r10
> 0xffffffff813aea02 <__jhash_trampoline+0x12>:	pop    %r9
> 0xffffffff813aea04 <__jhash_trampoline+0x14>:	pop    %r8
> 0xffffffff813aea06 <__jhash_trampoline+0x16>:	pop    %rcx
> 0xffffffff813aea07 <__jhash_trampoline+0x17>:	retq   
> 
> 	In any event, this new patch does work correctly in my test that
> originally failed, and it's debatable how much optimizing for old
> systems is worthwhile.

Yes, that is expected. I also don't have a good idea on how to improve
the hashing on non-SSE4.2 systems in a reasonable amount of time.

> 	I only tested the non-SSE4.2 (i.e., old system) portion on
> x86_64.

I tried every possible setup this time, especially with openvswitch. I
covered ia32 with and without SSE4.2 as well as x86_64, and it always
behaved correctly. Last time the problem was that in OVS the static
inline was actually inlined rather than emitted as a function, whereas
during testing with rhashtable it was compiled into a normal C function
call because of the indirect reference.

Thanks a lot,
Hannes
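
For readers following the thread, the function-pointer dispatch that the
quoted patch description wants to avoid can be sketched in userspace C
roughly as below. This is only an illustration under stated assumptions,
not the kernel code: generic_hash2(), crc32c_hash2(), fast_hash2_ptr,
fast_hash_setup() and fast_hash2() are hypothetical names, the generic
hash is a simple FNV-1a style stand-in rather than jhash2, and the
"accelerated" hash is a bitwise software CRC32C rather than the SSE4.2
instruction.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the generic fallback (NOT jhash2 itself):
 * an FNV-1a style mix over 32-bit words. */
static uint32_t generic_hash2(const uint32_t *data, size_t words, uint32_t seed)
{
    uint32_t h = seed ^ 2166136261u;

    for (size_t i = 0; i < words; i++) {
        h ^= data[i];
        h *= 16777619u;
    }
    return h;
}

/* Hypothetical stand-in for the accelerated variant: a bitwise software
 * CRC32C (reflected polynomial 0x82F63B78), one 32-bit word at a time. */
static uint32_t crc32c_hash2(const uint32_t *data, size_t words, uint32_t seed)
{
    uint32_t crc = seed;

    for (size_t i = 0; i < words; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 32; bit++)
            crc = (crc >> 1) ^ (0x82F63B78u & -(crc & 1u));
    }
    return crc;
}

typedef uint32_t (*hash2_fn)(const uint32_t *, size_t, uint32_t);

/* The pointer starts out at the generic implementation ... */
static hash2_fn fast_hash2_ptr = generic_hash2;

/* ... and is switched once, at "boot", if the CPU feature is present.
 * The caller supplies the detection result in this sketch. */
static void fast_hash_setup(bool cpu_has_sse42)
{
    if (cpu_has_sse42)
        fast_hash2_ptr = crc32c_hash2;
}

/* Every hash operation pays for a pointer load plus an indirect call. */
static uint32_t fast_hash2(const uint32_t *data, size_t words, uint32_t seed)
{
    return fast_hash2_ptr(data, words, seed);
}
```

Each call to fast_hash2() goes through fast_hash2_ptr, which is the
per-operation cost the patch description mentions. The alternative_call
approach instead patches a direct call target into the call site at
boot, which is why the number of clobbered registers matters.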
