From: Hannes Frederic Sowa <hannes@stressinduktion.org>
To: Jay Vosburgh <jay.vosburgh@canonical.com>
Cc: David Miller <davem@davemloft.net>,
eric.dumazet@gmail.com, netdev@vger.kernel.org,
ogerlitz@mellanox.com, pshelar@nicira.com, jesse@nicira.com,
discuss@openvswitch.org
Subject: Re: [PATCH net-next] fast_hash: clobber registers correctly for inline function use
Date: Fri, 14 Nov 2014 23:37:42 +0100 [thread overview]
Message-ID: <1416004662.15154.76.camel@localhost> (raw)
In-Reply-To: <18948.1416003002@famine>
Hi Jay,
On Fr, 2014-11-14 at 14:10 -0800, Jay Vosburgh wrote:
> Hannes Frederic Sowa <hannes@stressinduktion.org> wrote:
> [...]
> >I created it via the function calling convention documented in
> >arch/x86/include/asm/calling.h, so I specified each register which a
> >function is allowed to clobber with.
> >
> >I currently cannot see how I can resolve the invalid constraints error
> >easily. :(
> >
> >So either go with my first patch, which I puts the alternative_call
> >switch point into its own function without ever inlining or the patch
> >needs to be reverted. :/
>
> As a data point, I tested the first patch as well, and the
> system does not panic with it in place. Inspection shows that it's
> using %r14 in place of %r8 in the prior (crashing) implementation.
Yes, I also could reproduce your oops and the first unoffical patch and
the first offical one both fixed it. After that, I thought that just
adding more clobbers cannot introduce bugs, so I only did compile
testing until I hit a window where gcc got mad with the excessive use of
clobbered registers but haven't tested the inline call sites that much
(sorry). :(
> Disassembly of the call site (on the non-sse4_1 system) in
> ovs_flow_tbl_insert with the first patch applied looks like this:
>
> 0xffffffffa00b6bb9 <ovs_flow_tbl_insert+0xb9>: mov %r15,0x348(%r14)
> 0xffffffffa00b6bc0 <ovs_flow_tbl_insert+0xc0>: movzwl 0x28(%r15),%ecx
> 0xffffffffa00b6bc5 <ovs_flow_tbl_insert+0xc5>: movzwl 0x2a(%r15),%esi
> 0xffffffffa00b6bca <ovs_flow_tbl_insert+0xca>: movzwl %cx,%eax
> 0xffffffffa00b6bcd <ovs_flow_tbl_insert+0xcd>: sub %ecx,%esi
> 0xffffffffa00b6bcf <ovs_flow_tbl_insert+0xcf>: lea 0x38(%r14,%rax,1),%rdi
> 0xffffffffa00b6bd4 <ovs_flow_tbl_insert+0xd4>: sar $0x2,%esi
> 0xffffffffa00b6bd7 <ovs_flow_tbl_insert+0xd7>: callq 0xffffffff813a7810 <__jhash2>
> 0xffffffffa00b6bdc <ovs_flow_tbl_insert+0xdc>: mov %eax,0x30(%r14)
> 0xffffffffa00b6be0 <ovs_flow_tbl_insert+0xe0>: mov (%rbx),%r13
> 0xffffffffa00b6be3 <ovs_flow_tbl_insert+0xe3>: mov %r14,%rsi
> 0xffffffffa00b6be6 <ovs_flow_tbl_insert+0xe6>: mov %r13,%rdi
> 0xffffffffa00b6be9 <ovs_flow_tbl_insert+0xe9>: callq 0xffffffffa00b61a0 <table_instance_insert>
>
> Compared to the panicking version's function:
>
> 0xffffffffa01a55c9 <ovs_flow_tbl_insert+0xb9>: mov %r15,0x348(%r8)
> 0xffffffffa01a55d0 <ovs_flow_tbl_insert+0xc0>: movzwl 0x28(%r15),%ecx
> 0xffffffffa01a55d5 <ovs_flow_tbl_insert+0xc5>: movzwl 0x2a(%r15),%esi
> 0xffffffffa01a55da <ovs_flow_tbl_insert+0xca>: movzwl %cx,%eax
> 0xffffffffa01a55dd <ovs_flow_tbl_insert+0xcd>: sub %ecx,%esi
> 0xffffffffa01a55df <ovs_flow_tbl_insert+0xcf>: lea 0x38(%r8,%rax,1),%rdi
> 0xffffffffa01a55e4 <ovs_flow_tbl_insert+0xd4>: sar $0x2,%esi
> 0xffffffffa01a55e7 <ovs_flow_tbl_insert+0xd7>: callq 0xffffffff813a75c0 <__jhash2>
> 0xffffffffa01a55ec <ovs_flow_tbl_insert+0xdc>: mov %eax,0x30(%r8)
> 0xffffffffa01a55f0 <ovs_flow_tbl_insert+0xe0>: mov (%rbx),%r13
> 0xffffffffa01a55f3 <ovs_flow_tbl_insert+0xe3>: mov %r8,%rsi
> 0xffffffffa01a55f6 <ovs_flow_tbl_insert+0xe6>: mov %r13,%rdi
> 0xffffffffa01a55f9 <ovs_flow_tbl_insert+0xe9>: callq 0xffffffffa01a4ba0 <table_instance_insert>
>
> It appears to generate the same instructions, but allocates
> registers differently (using %r14 instead of %r8).
Exactly and that makes sense. While %r8 must be available for the callee
to be clobbered with, %r14 must be saved by the callee and restored
before returning. So you pass the responsibility down to the other
functions, which tries not to touch %r14 because it knows it will have
to generate code for saving and restoring.
That's the reason why I actually like the the static inline clobbering
approach so much, it gives gcc possibilities to move around the
save/restore cycles and decide itself just by aligning which registers
to use.
Also the first version does work flawlessly (which I didn't send as a
patch but only as a diff in the mail). Here gcc synthesizes a full
function call which has the same effect as the long clobber list, only
it does chain two calls right behind each other.
> The __jhash2 disassembly appears to be unchanged between the two
> versions.
Thanks for looking into this!
It is actually pretty hairy to come up with a good solution for this,
because with the alternative interface you are only allowed to alter one
instruction. jump_tables also don't work because I currently have the
opinion that they do the switch way too late. I absolutely don't want to
have inserts into a hashtable with different hashing functions depending
how early during boot they took place.
Bye,
Hannes
next prev parent reply other threads:[~2014-11-14 22:37 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-11-14 14:06 [PATCH net-next] fast_hash: clobber registers correctly for inline function use Hannes Frederic Sowa
2014-11-14 14:40 ` [PATCH net-next v2] " Hannes Frederic Sowa
2014-11-14 14:50 ` [PATCH net-next] " Eric Dumazet
2014-11-14 15:13 ` Hannes Frederic Sowa
2014-11-14 15:33 ` Eric Dumazet
2014-11-14 15:46 ` Hannes Frederic Sowa
2014-11-14 18:38 ` David Miller
2014-11-14 19:02 ` Cong Wang
2014-11-14 20:42 ` Hannes Frederic Sowa
2014-11-14 21:35 ` David Miller
2014-11-14 19:05 ` [PATCH net-next] Revert "fast_hash: avoid indirect function calls" Jay Vosburgh
2014-11-14 21:36 ` David Miller
2014-11-14 21:43 ` Hannes Frederic Sowa
2014-11-14 20:04 ` [PATCH net-next] fast_hash: clobber registers correctly for inline function use Hannes Frederic Sowa
2014-11-14 20:10 ` Hannes Frederic Sowa
2014-11-14 20:15 ` Jay Vosburgh
2014-11-14 20:35 ` Hannes Frederic Sowa
2014-11-14 22:10 ` Jay Vosburgh
2014-11-14 22:37 ` Hannes Frederic Sowa [this message]
2014-11-14 15:17 ` [PATCH net-next v3] " Hannes Frederic Sowa
2014-11-14 17:57 ` Jay Vosburgh
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1416004662.15154.76.camel@localhost \
--to=hannes@stressinduktion.org \
--cc=davem@davemloft.net \
--cc=discuss@openvswitch.org \
--cc=eric.dumazet@gmail.com \
--cc=jay.vosburgh@canonical.com \
--cc=jesse@nicira.com \
--cc=netdev@vger.kernel.org \
--cc=ogerlitz@mellanox.com \
--cc=pshelar@nicira.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).