From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 25DE8EB64D9 for ; Thu, 6 Jul 2023 09:42:50 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id EC1D910E4B9; Thu, 6 Jul 2023 09:42:49 +0000 (UTC) Received: from mga12.intel.com (mga12.intel.com [192.55.52.136]) by gabe.freedesktop.org (Postfix) with ESMTPS id 4094610E4B9 for ; Thu, 6 Jul 2023 09:42:49 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1688636569; x=1720172569; h=message-id:date:mime-version:subject:to:cc:references: from:in-reply-to:content-transfer-encoding; bh=0DV27QSee+aV51uLQC509c42EWC0ExcijAKY+pm688E=; b=kqpBElgib9kBFiHk/9jXKA9kHxeJORir0Vf7RJx/UKxHZwzRUDj3vC66 1zjZZdKa/zGAbZZyKseqnG/d0voRQZdvPhiw3AmqHO4u8C1P4hVgvQ3X5 utqKHueBT5x8f7NIO/viMaXmEBje8I43wtamlS2WacZwbrxKpoMZwuypa Mqa8+pgEyuu1AtDDLximDR6ILTQz1dE0xzA5wmswjo9jWYT2NqhFx9XWE tEgfCr7AjARGiZKoRRqMBuSDP9wY/IZ3CsaF+k4znE48T9h+nnvvUFgWN TcVyp0Kr21xctkaf4T+gEy7y3lmKix2iVd87nAbp395fC/zcz9047v5lQ A==; X-IronPort-AV: E=McAfee;i="6600,9927,10762"; a="343150320" X-IronPort-AV: E=Sophos;i="6.01,185,1684825200"; d="scan'208";a="343150320" Received: from fmsmga001.fm.intel.com ([10.253.24.23]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 06 Jul 2023 02:42:47 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10762"; a="864041013" X-IronPort-AV: E=Sophos;i="6.01,185,1684825200"; d="scan'208";a="864041013" Received: from fmahon-mobl.ger.corp.intel.com (HELO [10.252.26.229]) ([10.252.26.229]) by fmsmga001-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 06 Jul 2023 02:42:44 -0700 Message-ID: Date: Thu, 6 Jul 2023 10:42:37 +0100 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Firefox/102.0 Thunderbird/102.12.0 Content-Language: en-GB To: Matthew Brost References: <20230705160602.237213-9-matthew.auld@intel.com> <20230705160602.237213-12-matthew.auld@intel.com> From: Matthew Auld In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit Subject: Re: [Intel-xe] [PATCH v4 3/7] drm/xe/tlb: increment next seqno after successful CT send X-BeenThere: intel-xe@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel Xe graphics driver List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: intel-xe@lists.freedesktop.org Errors-To: intel-xe-bounces@lists.freedesktop.org Sender: "Intel-xe" On 06/07/2023 04:59, Matthew Brost wrote: > On Wed, Jul 05, 2023 at 05:06:06PM +0100, Matthew Auld wrote: >> If we are in the middle of a GT reset or similar the CT might be >> disabled, such that the CT send fails. However we already incremented >> gt->tlb_invalidation.seqno which might lead to warnings, since we >> effectively just skipped a seqno: >> >> 0000:00:02.0: drm_WARN_ON(expected_seqno != msg[0]) >> >> Signed-off-by: Matthew Auld >> Cc: Matthew Brost >> Cc: José Roberto de Souza >> --- >> drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c | 11 ++++++----- >> 1 file changed, 6 insertions(+), 5 deletions(-) >> >> diff --git a/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c b/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c >> index 2fcb477604e2..b38da572d268 100644 >> --- a/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c >> +++ b/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c >> @@ -124,10 +124,6 @@ static int send_tlb_invalidation(struct xe_guc *guc, >> trace_xe_gt_tlb_invalidation_fence_send(fence); >> } >> action[1] = seqno; >> - gt->tlb_invalidation.seqno = (gt->tlb_invalidation.seqno + 1) % >> - TLB_INVALIDATION_SEQNO_MAX; >> - if (!gt->tlb_invalidation.seqno) >> - gt->tlb_invalidation.seqno = 1; >> ret = xe_guc_ct_send_locked(&guc->ct, action, len, >> G2H_LEN_DW_TLB_INVALIDATE, 1); >> if (!ret && fence) { >> @@ -137,8 +133,13 @@ static int send_tlb_invalidation(struct xe_guc *guc, >> >->tlb_invalidation.fence_tdr, >> TLB_TIMEOUT); >> } >> - if (!ret) > > Do we now (after this entire series) have the another race where the > below warn could fire as the CT fast path executes before we update the > seqno value? Would it be better to just roll back the seqno on error? Ohh, so we do the CT send, we process the g2h message double quick on the fast-path, before we even had a chance to add the fence to the list? I missed that. Maybe easiest is just to sample seqno_recv here under the lock, and signal the fence directly if the seqno was already written. Thanks for catching that. I don't think there is any race with updating tlb_invalidation.seqno, the receiver side only considers the seqno_recv. So not sure it matters where we increment tlb_invalidation.seqno so long as we are under ct->lock and ideally don't leave it incremented on error for the next user. I can do the roll back instead if your prefer? > > Matt > >> + if (!ret) { >> + gt->tlb_invalidation.seqno = (gt->tlb_invalidation.seqno + 1) % >> + TLB_INVALIDATION_SEQNO_MAX; >> + if (!gt->tlb_invalidation.seqno) >> + gt->tlb_invalidation.seqno = 1; >> ret = seqno; >> + } >> if (ret < 0 && fence) >> invalidation_fence_signal(fence); >> mutex_unlock(&guc->ct.lock); >> -- >> 2.41.0 >>