From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.9 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,NICE_REPLY_A, SPF_HELO_NONE,SPF_PASS,USER_AGENT_SANE_1 autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5417FC433E2 for ; Sat, 29 Aug 2020 17:51:04 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 1FE6A2071B for ; Sat, 29 Aug 2020 17:51:04 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="dx6genw/" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 1FE6A2071B Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:39424 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kC50J-0001j8-7U for qemu-devel@archiver.kernel.org; Sat, 29 Aug 2020 13:51:03 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:41788) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kC4zh-0001Hm-Tk for qemu-devel@nongnu.org; Sat, 29 Aug 2020 13:50:25 -0400 Received: from mail-pg1-x52e.google.com ([2607:f8b0:4864:20::52e]:34120) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kC4zf-0007cV-N8 for qemu-devel@nongnu.org; Sat, 29 Aug 2020 13:50:25 -0400 Received: by mail-pg1-x52e.google.com with SMTP id i10so2028476pgk.1 for ; Sat, 29 Aug 2020 10:50:23 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=subject:to:cc:references:from:message-id:date:user-agent :mime-version:in-reply-to:content-language:content-transfer-encoding; bh=PdYrnXe1+8LDoG72dOpYZIwWCnbKltrMyhQ8WwPyKsA=; b=dx6genw/nRGsxAMj8Pw2Lcljjdag0YRJgpx6KYeL5ngkvzy63GN5AkBikmPYE3DKa8 EePietyA9pKWmHIsfqZqLWHrtfzyK9BuBm9Wi16r6gGZrHjUv3QVUBBApL2caD8cMvXr 7ZRmSi/y+9rDK5Ge23dBoJKBJAkkCHmXr4tiO2yvBc6TogOD+Qlu9pAC7NOOtq0ktbwH dDlA5roK7E3zLIbi8GdmedERCfgnucy5htZaLgewF6xXQDqrrYxkMLPRqnSIcTUb7Oqk GORHwS/VZ7nTeaepcMECh9soqcIeQwAQ6x7nzc8pvCQMctmaG5jRZWDOAqed54Nq4qjZ m5QQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:cc:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=PdYrnXe1+8LDoG72dOpYZIwWCnbKltrMyhQ8WwPyKsA=; b=j27SQWk+bkRLCfQleF+uzi82HQWWA6ZRXja3Ca8qwmMCpy8O6tT/a3qiQCHksKc6vc AuwYCWK861zdl2f1O+mpMMRPZw7omJxmHL06SHXDJJea99hJMfPTiMAls71VbTOwlLg9 xunzDM4+kq2ZTZv1JTV5baBQv7X/hD2Nq+OcV1nCmZ7M+vNpQ9uenx9g1imeoOlMSrYU J7PLnpdKjSEgVLsIQR3EdNZ5jWtExu+o2Gn6VL+UmDKyQd0tfBSNEug6d6CMBVde/im+ L0NTp2l+49pewTUA7XTlaNllZugzWPk2K0R97IN/VM74Is/U/N0DecYy5wIphLdnnZfm HQug== X-Gm-Message-State: AOAM530ORoY2VXUXz+2a8UWpSwbBWCO/eX0swSnBmZpOdqweG8xFvPN9 iGbMzykQ69SatSZ8FXn5PBJztQ== X-Google-Smtp-Source: ABdhPJymynG8BTMkXYMoA2EttKHfbUD/bwIZjclgjUumfxsrPq/7Q4lU6lUGe4kEq1G17YZiWQWzdw== X-Received: by 2002:a62:33c2:: with SMTP id z185mr3722432pfz.242.1598723421852; Sat, 29 Aug 2020 10:50:21 -0700 (PDT) Received: from [192.168.1.11] ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id gm3sm2545920pjb.31.2020.08.29.10.50.20 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Sat, 29 Aug 2020 10:50:20 -0700 (PDT) Subject: Re: [RFC v4 14/70] target/riscv: rvv-1.0: update check functions To: frank.chang@sifive.com, qemu-devel@nongnu.org, qemu-riscv@nongnu.org References: <20200817084955.28793-1-frank.chang@sifive.com> <20200817084955.28793-15-frank.chang@sifive.com> From: Richard Henderson Message-ID: Date: Sat, 29 Aug 2020 10:50:18 -0700 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0 MIME-Version: 1.0 In-Reply-To: <20200817084955.28793-15-frank.chang@sifive.com> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit Received-SPF: pass client-ip=2607:f8b0:4864:20::52e; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x52e.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -28 X-Spam_score: -2.9 X-Spam_bar: -- X-Spam_report: (-2.9 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, NICE_REPLY_A=-0.809, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Palmer Dabbelt , Alistair Francis , Sagar Karandikar , Bastian Koppelmann Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" On 8/17/20 1:48 AM, frank.chang@sifive.com wrote: > +static inline bool is_aligned(const uint8_t val, const uint8_t pos) > +{ > + return pos ? (val & (pos - 1)) == 0 : true; > +} The truncation to uint8_t from int is odd. Can we drop all of that and just use int? Looking at the uses, I think that you should pass lmul directly instead of requiring the callers to all compute 1 << lmul, and also verify that lmul is positive. That change makes this function look like return lmul <= 0 || extract32(val, 0, lmul) == 0; > +static inline bool is_overlapped(const uint8_t astart, uint8_t asize, > + const uint8_t bstart, uint8_t bsize) > +{ > + asize = asize == 0 ? 1 : asize; > + bsize = bsize == 0 ? 1 : bsize; This looks odd. Again, I think passing in lmul would be better than size. Then compute size here locally: int asize = amul <= 0 ? 1 : 1 << amul; > + > + const int aend = astart + asize; > + const int bend = bstart + bsize; > + > + return MAX(aend, bend) - MIN(astart, bstart) < asize + bsize; > +} > + > +static inline bool is_overlapped_widen(const uint8_t astart, uint8_t asize, > + const uint8_t bstart, uint8_t bsize) This needs more comments, I think. It's not obvious why this is (or needs to be) different from is_overlapped. I think you're trying to implement the * destination eew smaller than source eew, and overlap is allowed at the beginning. * destination eew larger than source eew, and overlap is allowed at the end. rule from section 5.2. But since you're not comparing asize vs bsize, that's not what you're doing. Anyway, I think all of these rules can be put into require_noover, and there need not be a separate require_noover_widen. > +static bool require_rvv(DisasContext *s) > +{ > + if (s->mstatus_vs == 0) { > + return false; > + } > + return true; return s->mstatus_vs != 0; > +static bool vext_check_sss(DisasContext *s, int vd, int vs1, > + int vs2, int vm, bool is_vs1) > +{ > + bool ret = require_vm(vm, vd); > + if (s->lmul > 0) { > + ret &= require_align(vd, 1 << s->lmul) && > + require_align(vs2, 1 << s->lmul); > + if (is_vs1) { > + ret &= require_align(vs1, 1 << s->lmul); > + } > + } > + return ret; > +} I think this (and similar function taking is_vs1) should be split. All callers pass a constant value, and thus can just as easily call a different function. Perhaps static bool vext_check_ss(DisasContext *s, int vd, int vs2, int vm) { return (require_vm(vm, vd) && require_align(vd, s->lmul) && require_align(vs2, s->lmul)); } static bool vext_check_sss(DisasContext *s, int vd, int vs1, int vs2, int vm) { return (vext_check_ss(s, vd, vs2, vm) && require_align(vs1, s->lmul)); } > +/* > + * Check function for maskable vector instruction with format: > + * single-width result and single-width sources (SEW = SEW op SEW) > + * > + * is_vs1: indicates whether insn[19:15] is a vs1 field or not. > + * > + * Rules to be checked here: > + * 1. Source (vs2, vs1) vector register number are multiples of LMUL. > + * (Section 3.3.2) > + * 2. Destination vector register cannot overlap a source vector > + * register (vs2, vs1) group. > + * (Section 5.2) > + */ > +static bool vext_check_mss(DisasContext *s, int vd, int vs1, > + int vs2, bool is_vs1) > { > + bool ret = require_align(vs2, 1 << s->lmul); > + if (vd != vs2) { > + ret &= require_noover(vd, 1, vs2, 1 << s->lmul); > + } > + if (is_vs1) { > + if (vd != vs1) { > + ret &= require_noover(vd, 1, vs1, 1 << s->lmul); > + } > + ret &= require_align(vs1, 1 << s->lmul); > + } > + return ret; > +} If require_noover implements all of the overlap rules, as suggested, this simplifies to static bool vext_check_ms(DisasContext *s, int vd, int vs2) { return (require_align(vs2, s->lmul) && require_noover(vd, 0, vs2, s->lmul); } static bool vext_check_mss(DisasContext *s, int vd, int vs1, int vs2) { return (vext_check_ms(s, vd, vs2) && require_align(vs1, s->lmul) && require_noover(vd, 0, vs1, s->lmul)); } > +/* > + * Common check function for vector widening instructions > + * of double-width result (2*SEW). > + * > + * Rules to be checked here: > + * 1. The largest vector register group used by an instruction > + * can not be greater than 8 vector registers (Section 5.2): > + * => LMUL < 8. > + * => SEW < 64. > + * 2. Destination vector register number is multiples of 2 * LMUL. > + * (Section 3.3.2, 11.2) > + * 3. Destination vector register group for a masked vector > + * instruction cannot overlap the source mask register (v0). > + * (Section 5.3) > + */ > +static bool vext_wide_check_common(DisasContext *s, int vd, int vm) > +{ > + return (s->lmul <= 2) && > + (s->sew < 3) && Use MO_64 here for clarity. > +static bool vext_narrow_check_common(DisasContext *s, int vd, int vs2, > + int vm) > +{ > + return (s->lmul <= 2) && > + (s->sew < 3) && Likewise. > +/* > + * Check function for vector instruction with format: > + * double-width result and single-width sources (2*SEW = SEW op SEW) > * > + * is_vs1: indicates whether insn[19:15] is a vs1 field or not. > * > + * Rules to be checked here: > + * 1. All rules in defined in widen common rules are applied. > + * 2. Source (vs2, vs1) vector register number are multiples of LMUL. > + * (Section 3.3.2) > + * 3. Destination vector register cannot overlap a source vector > + * register (vs2, vs1) group. > + * (Section 5.2) > */ > +static bool vext_check_dss(DisasContext *s, int vd, int vs1, int vs2, > + int vm, bool is_vs1) > { > + bool ret = (vext_wide_check_common(s, vd, vm) && > + require_align(vs2, 1 << s->lmul)); > + if (s->lmul < 0) { > + ret &= require_noover(vd, 1 << (s->lmul + 1), vs2, 1 << s->lmul); > + } else { > + ret &= require_noover_widen(vd, 1 << (s->lmul + 1), vs2, 1 << s->lmul); > + } This is buggy, with (1 << negative_number), and is exactly why I think require_noover needs to be passed the emul of each operand and implement all of the rules. This should just be static bool vext_check_ds(DisasContext *s, int vd, int vs2) { return (vext_wide_check_common(s, vd, vm) && require_align(vs2, s->lmul) && require_noover(vd, s->lmul + 1, vs2, s->lmul)); } static bool vext_check_dss(DisasContext *s, int vd, int vs1, int vs2) { return (vext_check_ds(s, vd, vs2) && require_align(vs1, s->lmul) && require_noover(vd, s->lmul + 1, vs1, s->lmul)); } static bool vext_check_dds(DisasContext *s, int vd, int vs1, int vs2) { return (vext_check_ds(s, vd, vs1) && require_align(vs2, s->lmul + 1) && require_noover(vd, s->lmul + 1, vs1, s->lmul + 1)); } > /* > + * Check function for vector reduction instructions. > + * > + * Rules to be checked here: > + * 1. Source 1 (vs2) vector register number is multiples of LMUL. > + * (Section 3.3.2) > + * 2. For widening reduction instructions, SEW < 64. > + * > + * TODO: Check vstart == 0 > */ > +static bool vext_check_reduction(DisasContext *s, int vs2, bool is_wide) > { > + bool ret = require_align(vs2, 1 << s->lmul); > + if (is_wide) { > + ret &= s->sew < 3; > + } > + return ret; > } Again, should be split. But in this case probably into the only callers... > +static bool reduction_widen_check(DisasContext *s, arg_rmrr *a) > +{ > + return require_rvv(s) && > + vext_check_isa_ill(s) && > + vext_check_reduction(s, a->rs2, true); > +} This could simplify to return reduction_check(s, a) && s->sew < MO_64; r~