Message-ID: <111ee61b-f699-4736-a0c1-e842cce618ff@linux.intel.com>
Date: Fri, 2 Feb 2024 09:37:34 +0100
Subject: Re: [PATCH 2/2] drm/xe: Pick correct userptr VMA to repin on REMAP op failure
From: Maarten Lankhorst
To: Matthew Brost
Cc: intel-xe@lists.freedesktop.org
References: <20240201004849.2219558-1-matthew.brost@intel.com> <20240201004849.2219558-3-matthew.brost@intel.com> <265682a3-b813-4f1f-8031-1d500c3d89af@linux.intel.com> <3914d37b-2f89-41bc-bf20-4f255d96d217@linux.intel.com>
In-Reply-To: <3914d37b-2f89-41bc-bf20-4f255d96d217@linux.intel.com>

On 2024-02-01 21:00, Maarten Lankhorst wrote:
> Hey,
>
> On 2024-02-01 20:26, Matthew Brost wrote:
>> On Thu, Feb 01, 2024 at 08:18:52PM +0100, Maarten Lankhorst wrote:
>>>
>>>
>>> On 2024-02-01 01:48, Matthew Brost wrote:
>>>> A REMAP op is composed of 3 VMA's - unmap, prev map, and next map. When
>>>> op_execute fails with -EAGAIN we need to update the local VMA pointer to
>>>> the current op state and then repin the VMA if it is a userptr.
>>>>
>>>> Fixes a failure seen in xe_vm.munmap-style-unbind-userptr-one-partial.
>>>>
>>>> Fixes: b06d47be7c83 ("drm/xe: Port Xe to GPUVA")
>>>> Signed-off-by: Matthew Brost
>>>> ---
>>>>    drivers/gpu/drm/xe/xe_vm.c | 22 +++++++++++++++++-----
>>>>    1 file changed, 17 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
>>>> index e55161136490..2ab863fe7d0a 100644
>>>> --- a/drivers/gpu/drm/xe/xe_vm.c
>>>> +++ b/drivers/gpu/drm/xe/xe_vm.c
>>>> @@ -2506,13 +2506,25 @@ static int __xe_vma_op_execute(struct xe_vm *vm, struct xe_vma *vma,
>>>>        }
>>>>        drm_exec_fini(&exec);
>>>> -    if (err == -EAGAIN && xe_vma_is_userptr(vma)) {
>>>> +    if (err == -EAGAIN) {
>>>>            lockdep_assert_held_write(&vm->lock);
>>>> -        err = xe_vma_userptr_pin_pages(vma);
>>>> -        if (!err)
>>>> -            goto retry_userptr;
>>>> -        trace_xe_vma_fail(vma);
>>>> +        if (op->base.op == DRM_GPUVA_OP_REMAP) {
>>>> +            if (!op->remap.unmap_done)
>>>> +                vma = gpuva_to_vma(op->base.remap.unmap->va);
>>>> +            else if (op->remap.prev)
>>>> +                vma = op->remap.prev;
>>>> +            else
>>>> +                vma = op->remap.next;
>>>> +        }
>>> I see this same vma picking in handling of DRM_GPUVA_OP_REMAP.
>>>
>>> Could the switch in xe_vma_op_execute() be moved to a separate pick_vma
>>> function instead, called from this place too?
>>>
>>> It might make the code slightly more readable.
>>>
>>
>> I would agree if this code wasn't going to get rewritten shortly in [1]. We
>> are transitioning to 1 job per VM bind IOCTL in [1]. I am currently
>> rebasing that code and found a few bugs in the current code. I want to
>> stabilize the code quickly so I can reliably test my larger changes.
>>
>> Would it help if I added a comment here saying this code is temporary?
> Oh that gives some context. The infinite loop fix in one patch is caused
> by the first patch in that series.
>
> I'd personally choose to fix it here, then when rebasing put a revert
> before 21/27 and squash?
>
> Cheers,
> ~Maarten

Reviewed-by: Maarten Lankhorst
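
For reference, the pick_vma helper suggested in the review could look roughly like the sketch below. It is a standalone illustration and not the driver code: the types `struct vma`, `struct remap_state`, `struct vma_op` and `enum op_type` are hypothetical stand-ins for the xe structures, and the only thing taken from the patch is the selection order in the quoted hunk (unmap first, then prev, then next). It also assumes that `prev` reads as NULL once it no longer needs handling, which the quoted hunk implies but does not show.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver types; not the real xe structs. */
enum op_type { OP_MAP, OP_REMAP, OP_UNMAP };

struct vma {
	const char *name;
	bool is_userptr;
};

struct remap_state {
	struct vma *unmap;	/* VMA being unmapped */
	struct vma *prev;	/* optional mapping kept before the hole */
	struct vma *next;	/* optional mapping kept after the hole */
	bool unmap_done;	/* set once the unmap step has completed */
};

struct vma_op {
	enum op_type type;
	struct remap_state remap;
};

/*
 * Return the VMA that the current step of the op is working on.
 * For anything other than a REMAP the caller's VMA is used unchanged.
 * For a REMAP the order mirrors the quoted hunk: unmap, then prev,
 * then next.
 */
static struct vma *pick_vma(const struct vma_op *op, struct vma *fallback)
{
	if (op->type != OP_REMAP)
		return fallback;

	if (!op->remap.unmap_done)
		return op->remap.unmap;
	if (op->remap.prev)
		return op->remap.prev;
	return op->remap.next;
}
```

With a helper of this shape, both the REMAP handling and the -EAGAIN retry path could ask for the current VMA in one place and only then check whether it is a userptr that needs repinning.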