From: David Hildenbrand
Date: Tue, 26 Feb 2019 19:45:54 +0100
Subject: Re: [Qemu-devel] [PATCH v1 03/33] s390x: Add one temporary vector register in CPU state for TCG
To: Richard Henderson, qemu-devel@nongnu.org
Cc: qemu-s390x@nongnu.org, Cornelia Huck, Thomas Huth, Richard Henderson
In-Reply-To: <59942bd6-49ca-504f-0d2a-910939eea09d@linaro.org>
References: <20190226113915.20150-1-david@redhat.com> <20190226113915.20150-4-david@redhat.com> <59942bd6-49ca-504f-0d2a-910939eea09d@linaro.org>

On 26.02.19 19:36, Richard Henderson wrote:
> On 2/26/19 3:38 AM, David Hildenbrand wrote:
>> We sometimes want to work on a temporary vector register instead of the
>> actual destination, because source and destination might overlap. An
>> alternative would be loading the vector into two i64 variables, but then
>> separate handling for accessing the vector elements would be needed.
>> This is easier. Add one for now as that seems to be enough.
>
> Hmm, I'll reserve judgment until I see how this is used.
>
> For ARM SVE, I would allocate this temporary on the stack within the helper,
> and move one of the operands out of the way. E.g.

Yes, I do the same for helpers. This, however, is for TCG translated code :)
E.g.
see

[PATCH v1 08/33] s390x/tcg: Implement VECTOR LOAD
[PATCH v1 19/33] s390x/tcg: Implement VECTOR MERGE (HIGH|LOW)
[PATCH v1 33/33] s390x/tcg: Implement VECTOR UNPACK *

>
> void helper(foo)(void *vd, void *vx, void *vy)
> {
>     VectorReg tmp;
>     TYPE *d = vd, *x, *y;
>
>     /* If an input aliases the destination, read it from a copy instead. */
>     if (vx == vd || vy == vd) {
>         tmp = *(VectorReg *)vd;
>         if (vx == vd) {
>             vx = &tmp;
>         }
>         if (vy == vd) {
>             vy = &tmp;
>         }
>     }
>     x = vx;
>     y = vy;
>
>     /* Process d, x, y as normal. */
> }
>
> This minimized the amount of code inline. However, SVE vectors are quite a bit
> larger, at 256 bytes, so the copy itself was out of line most of the time anyway.
>
> Provisionally,
> Reviewed-by: Richard Henderson
>
>
> r~
>

-- 

Thanks,

David / dhildenb
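
To make the overlap problem from the quoted commit message concrete, here is a minimal sketch in plain C. It is not taken from the series and it is not TCG code: VecReg, merge_high_naive and merge_high_safe are hypothetical names, and the "merge high" shown is a simplified interleave of the first two 32-bit elements of each source. It only models why writing the destination in place goes wrong when it aliases a source, and how building the result in a temporary avoids that.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 128-bit vector register viewed as four 32-bit elements. */
typedef struct {
    uint32_t e[4];
} VecReg;

/*
 * Simplified merge-high: interleave the first two elements of x and y.
 * The destination is written in place, so if d aliases x or y, later
 * reads see elements that were already overwritten.
 */
static void merge_high_naive(VecReg *d, const VecReg *x, const VecReg *y)
{
    d->e[0] = x->e[0];
    d->e[1] = y->e[0];
    d->e[2] = x->e[1];
    d->e[3] = y->e[1];
}

/*
 * Overlap-safe variant: build the result in a temporary and copy it to
 * the destination only after all source elements have been read.
 */
static void merge_high_safe(VecReg *d, const VecReg *x, const VecReg *y)
{
    VecReg tmp;

    assert(d && x && y);
    merge_high_naive(&tmp, x, y);   /* tmp never aliases x or y */
    *d = tmp;
}
```

With x = {1, 2, 3, 4} and y = {5, 6, 7, 8}, calling merge_high_safe(&x, &x, &y) leaves {1, 5, 2, 6} in x, while the naive version called the same way produces {1, 5, 5, 6} because x->e[1] is overwritten before it is read. In translated code there is no helper stack frame to hold such a temporary, which is the gap the extra vector register in the CPU state is meant to fill.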