From mboxrd@z Thu Jan 1 00:00:00 1970 From: Jay Vosburgh Subject: Re: [PATCH net-next] fast_hash: clobber registers correctly for inline function use Date: Fri, 14 Nov 2014 14:10:02 -0800 Message-ID: <18948.1416003002@famine> References: <1415978022.15154.31.camel@localhost> <1415979181.17262.45.camel@edumazet-glaptop2.roam.corp.google.com> <1415979978.15154.41.camel@localhost> <20141114.133829.1437047454714311242.davem@davemloft.net> <1415995451.15154.54.camel@localhost> <17658.1415996115@famine> <1415997309.15154.59.camel@localhost> Cc: David Miller , eric.dumazet@gmail.com, netdev@vger.kernel.org, ogerlitz@mellanox.com, pshelar@nicira.com, jesse@nicira.com, discuss@openvswitch.org To: Hannes Frederic Sowa Return-path: Received: from youngberry.canonical.com ([91.189.89.112]:51329 "EHLO youngberry.canonical.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754335AbaKNWKN (ORCPT ); Fri, 14 Nov 2014 17:10:13 -0500 In-reply-to: <1415997309.15154.59.camel@localhost> Sender: netdev-owner@vger.kernel.org List-ID: Hannes Frederic Sowa wrote: [...] >I created it via the function calling convention documented in >arch/x86/include/asm/calling.h, so I specified each register which a >function is allowed to clobber with. > >I currently cannot see how I can resolve the invalid constraints error >easily. :( > >So either go with my first patch, which I puts the alternative_call >switch point into its own function without ever inlining or the patch >needs to be reverted. :/ As a data point, I tested the first patch as well, and the system does not panic with it in place. Inspection shows that it's using %r14 in place of %r8 in the prior (crashing) implementation. Disassembly of the call site (on the non-sse4_1 system) in ovs_flow_tbl_insert with the first patch applied looks like this: 0xffffffffa00b6bb9 : mov %r15,0x348(%r14) 0xffffffffa00b6bc0 : movzwl 0x28(%r15),%ecx 0xffffffffa00b6bc5 : movzwl 0x2a(%r15),%esi 0xffffffffa00b6bca : movzwl %cx,%eax 0xffffffffa00b6bcd : sub %ecx,%esi 0xffffffffa00b6bcf : lea 0x38(%r14,%rax,1),%rdi 0xffffffffa00b6bd4 : sar $0x2,%esi 0xffffffffa00b6bd7 : callq 0xffffffff813a7810 <__jhash2> 0xffffffffa00b6bdc : mov %eax,0x30(%r14) 0xffffffffa00b6be0 : mov (%rbx),%r13 0xffffffffa00b6be3 : mov %r14,%rsi 0xffffffffa00b6be6 : mov %r13,%rdi 0xffffffffa00b6be9 : callq 0xffffffffa00b61a0 Compared to the panicking version's function: 0xffffffffa01a55c9 : mov %r15,0x348(%r8) 0xffffffffa01a55d0 : movzwl 0x28(%r15),%ecx 0xffffffffa01a55d5 : movzwl 0x2a(%r15),%esi 0xffffffffa01a55da : movzwl %cx,%eax 0xffffffffa01a55dd : sub %ecx,%esi 0xffffffffa01a55df : lea 0x38(%r8,%rax,1),%rdi 0xffffffffa01a55e4 : sar $0x2,%esi 0xffffffffa01a55e7 : callq 0xffffffff813a75c0 <__jhash2> 0xffffffffa01a55ec : mov %eax,0x30(%r8) 0xffffffffa01a55f0 : mov (%rbx),%r13 0xffffffffa01a55f3 : mov %r8,%rsi 0xffffffffa01a55f6 : mov %r13,%rdi 0xffffffffa01a55f9 : callq 0xffffffffa01a4ba0 It appears to generate the same instructions, but allocates registers differently (using %r14 instead of %r8). The __jhash2 disassembly appears to be unchanged between the two versions. -J --- -Jay Vosburgh, jay.vosburgh@canonical.com