Subject: Re: [PATCH v3 1/3] drm/gpusvm, drm/pagemap: Move migration functionality to drm_pagemap
From: Thomas Hellström
To: "Ghimiray, Himal Prasad", intel-xe@lists.freedesktop.org
Cc: Matthew Brost, dri-devel@lists.freedesktop.org, apopple@nvidia.com,
 airlied@gmail.com, Simona Vetter, Felix Kühling, Philip Yang,
 Christian König, dakr@kernel.org, "Mrozek, Michal", Joonas Lahtinen
Date: Tue, 17 Jun 2025 15:11:20 +0200
In-Reply-To: <93e663cf-01e7-4241-89ea-3bdda3d19437@intel.com>
References: <20250613140219.87479-1-thomas.hellstrom@linux.intel.com>
 <20250613140219.87479-2-thomas.hellstrom@linux.intel.com>
 <93e663cf-01e7-4241-89ea-3bdda3d19437@intel.com>
Organization: Intel Sweden AB, Registration Number: 556189-6027
List-Id: Intel Xe graphics driver

On Tue, 2025-06-17 at 18:25 +0530, Ghimiray, Himal Prasad wrote:
>
>
> On 13-06-2025 19:32, Thomas Hellström wrote:
> > From: Matthew Brost
> >
> > The migration functionality and track-keeping of per-pagemap VRAM
> > mapped to the CPU mm is not per GPU_vm, but rather per pagemap.
> > This is also reflected by the functions not needing the drm_gpusvm
> > structures. So move to drm_pagemap.
> >
> > With this, drm_gpusvm shouldn't really access the page zone-device-data
> > since its meaning is internal to drm_pagemap. Currently it's used to
> > reject mapping ranges backed by multiple drm_pagemap allocations.
> > For now, make the zone-device-data a void pointer.
> >
> > Alter the interface of drm_gpusvm_migrate_to_devmem() to ensure we don't
> > pass a gpusvm pointer.
> >
> > Rename CONFIG_DRM_XE_DEVMEM_MIRROR to CONFIG_DRM_XE_PAGEMAP.
> >
> > Matt is listed as author of this commit since he wrote most of the code,
> > and it makes sense to retain his git authorship.
> > Thomas mostly moved the code around.
>
> >
> > v3:
> > - Kerneldoc fixes (CI)
> > - Don't update documentation about how the drm_pagemap
> >    migration should be interpreted until upcoming
> >    patches where the functionality is implemented.
> >    (Matt Brost)
> >
> > Co-developed-by: Thomas Hellström
> > Signed-off-by: Thomas Hellström
> > ---
> >   Documentation/gpu/rfc/gpusvm.rst     |  12 +-
> >   drivers/gpu/drm/Makefile             |   6 +-
> >   drivers/gpu/drm/drm_gpusvm.c         | 759 +-------------------------
> >   drivers/gpu/drm/drm_pagemap.c        | 788 +++++++++++++++++++++++++++
> >   drivers/gpu/drm/xe/Kconfig           |  10 +-
> >   drivers/gpu/drm/xe/xe_bo_types.h     |   2 +-
> >   drivers/gpu/drm/xe/xe_device_types.h |   2 +-
> >   drivers/gpu/drm/xe/xe_svm.c          |  47 +-
> >   include/drm/drm_gpusvm.h             |  96 ----
> >   include/drm/drm_pagemap.h            | 101 ++++
> >   10 files changed, 950 insertions(+), 873 deletions(-)
> >   create mode 100644 drivers/gpu/drm/drm_pagemap.c
> >
> > diff --git a/Documentation/gpu/rfc/gpusvm.rst b/Documentation/gpu/rfc/gpusvm.rst
> > index bcf66a8137a6..469db1372f16 100644
> > --- a/Documentation/gpu/rfc/gpusvm.rst
> > +++ b/Documentation/gpu/rfc/gpusvm.rst
> > @@ -73,15 +73,21 @@ Overview of baseline design
> >   .. kernel-doc:: drivers/gpu/drm/drm_gpusvm.c
> >      :doc: Locking
> >
> > -.. kernel-doc:: drivers/gpu/drm/drm_gpusvm.c
> > -   :doc: Migration
> > -
> >   .. kernel-doc:: drivers/gpu/drm/drm_gpusvm.c
> >      :doc: Partial Unmapping of Ranges
> >
> >   .. kernel-doc:: drivers/gpu/drm/drm_gpusvm.c
> >      :doc: Examples
> >
> > +Overview of drm_pagemap design
> > +==============================
> > +
> > +.. kernel-doc:: drivers/gpu/drm/drm_pagemap.c
> > +   :doc: Overview
> > +
> > +.. kernel-doc:: drivers/gpu/drm/drm_pagemap.c
> > +   :doc: Migration
> > +
> >   Possible future design features
> >   ===============================
> >
> > diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
> > index 5050ac32bba2..4dafbdc8f86a 100644
> > --- a/drivers/gpu/drm/Makefile
> > +++ b/drivers/gpu/drm/Makefile
> > @@ -104,7 +104,11 @@ obj-$(CONFIG_DRM_PANEL_BACKLIGHT_QUIRKS) += drm_panel_backlight_quirks.o
> >   #
> >   obj-$(CONFIG_DRM_EXEC) += drm_exec.o
> >   obj-$(CONFIG_DRM_GPUVM) += drm_gpuvm.o
> > -obj-$(CONFIG_DRM_GPUSVM) += drm_gpusvm.o
> > +
> > +drm_gpusvm_helper-y := \
> > +	drm_gpusvm.o\
> > +	drm_pagemap.o
> > +obj-$(CONFIG_DRM_GPUSVM) += drm_gpusvm_helper.o
> >
> >   obj-$(CONFIG_DRM_BUDDY) += drm_buddy.o
> >
> > diff --git a/drivers/gpu/drm/drm_gpusvm.c b/drivers/gpu/drm/drm_gpusvm.c
> > index 7ff81aa0a1ca..ef81381609de 100644
> > --- a/drivers/gpu/drm/drm_gpusvm.c
> > +++ b/drivers/gpu/drm/drm_gpusvm.c
> > @@ -8,10 +8,9 @@
> >
> >   #include
> >   #include
> > +#include
> >   #include
> > -#include
> >   #include
> > -#include
> >   #include
> >
> >   #include
> > @@ -107,21 +106,6 @@
> >    * to add annotations to GPU SVM.
> >    */
> >
> > -/**
> > - * DOC: Migration
> > - *
> > - * The migration support is quite simple, allowing migration between RAM and
> > - * device memory at the range granularity. For example, GPU SVM currently does
> > - * not support mixing RAM and device memory pages within a range. This means
> > - * that upon GPU fault, the entire range can be migrated to device memory, and
> > - * upon CPU fault, the entire range is migrated to RAM. Mixed RAM and device
> > - * memory storage within a range could be added in the future if required.
> > - *
> > - * The reasoning for only supporting range granularity is as follows: it
> > - * simplifies the implementation, and range sizes are driver-defined and should
> > - * be relatively small.
> > - */
> > -
> >   /**
> >    * DOC: Partial Unmapping of Ranges
> >    *
> > @@ -193,10 +177,9 @@
> >    *	if (driver_migration_policy(range)) {
> >    *		mmap_read_lock(mm);
> >    *		devmem = driver_alloc_devmem();
> > - *		err = drm_gpusvm_migrate_to_devmem(gpusvm, range,
> > - *						   devmem_allocation,
> > - *						   &ctx);
> > - *		mmap_read_unlock(mm);
> > + *		err = drm_pagemap_migrate_to_devmem(devmem, gpusvm->mm, gpuva_start,
> > + *						    gpuva_end, driver_pgmap_owner());
>
>
> fix doc passing timeslice as parameter.

Will fix.

>

8<---------------------------------------------------------------------

> > +/**
> > + * drm_pagemap_migrate_to_devmem() - Migrate a struct mm_struct range to device memory
> > + * @devmem_allocation: The device memory allocation to migrate to.
> > + * The caller should hold a reference to the device memory allocation,
> > + * and the reference is consumed by this function unless it returns with
> > + * an error.
> > + * @mm: Pointer to the struct mm_struct.
> > + * @start: Start of the virtual address range to migrate.
> > + * @end: End of the virtual address range to migrate.
> > + * @timeslice_ms: The time requested for the migrated pages to
> > + * be present in the cpu memory map before migrated back.
>
> Shouldn't this be present in gpu or cpu memory map ?
> We are using this to ensure pagefault can be handled effectively by
> ensuring pages remain in vram here for prescribed time too.

So with this split, drm_pagemap is responsible only for migrating
memory and updating the CPU memory map, whereas drm_gpusvm is
responsible for setting up the GPU memory maps. So as long as the pages
remain in the CPU memory map, nothing will force the GPU VMs to
invalidate, unless the GPU driver decides to invalidate on its own.

But looking at this, I should probably rephrase "before migrated back"
to "before being allowed to be migrated back".

>
> > + * @pgmap_owner: Not used currently, since only system memory is considered.
> > + *
> > + * This function migrates the specified virtual address range to device memory.
> > + * It performs the necessary setup and invokes the driver-specific operations for
> > + * migration to device memory. Expected to be called while holding the mmap lock in
> > + * at least read mode.
> > + *
> > + * Return: %0 on success, negative error code on failure.
>
> s/%0/0

kerneldoc prefers "%" before constants, no?

Thanks,
Thomas
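
For illustration only: a minimal userspace sketch of the "before being
allowed to be migrated back" wording for @timeslice_ms discussed above.
This is not kernel code and not part of the patch. The names demo_alloc,
demo_mark_migrated() and demo_may_migrate_back() are made up for the
example; the only thing taken from the thread is the idea that a
migrate-back is not permitted until the requested timeslice has elapsed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bookkeeping for one device memory allocation. */
struct demo_alloc {
	uint64_t migrated_at_ms;	/* when the pages were migrated to device memory */
	uint64_t timeslice_ms;		/* requested minimum time before migrate-back */
};

/* Record a migration to device memory at time now_ms. */
static void demo_mark_migrated(struct demo_alloc *a, uint64_t now_ms,
			       uint64_t timeslice_ms)
{
	assert(a);
	a->migrated_at_ms = now_ms;
	a->timeslice_ms = timeslice_ms;
}

/*
 * Return true once the timeslice has elapsed, i.e. once a migrate-back
 * is allowed. It does not say that a migrate-back will happen.
 */
static bool demo_may_migrate_back(const struct demo_alloc *a, uint64_t now_ms)
{
	assert(a);
	/* A clock reading before the migration time never permits it. */
	if (now_ms < a->migrated_at_ms)
		return false;
	return now_ms - a->migrated_at_ms >= a->timeslice_ms;
}
```

With a 5 ms timeslice recorded at t = 1000 ms, demo_may_migrate_back()
stays false up to t = 1004 ms and is true from t = 1005 ms onwards.
Whether anything actually migrates the pages back after that point is a
separate decision, which this sketch deliberately leaves out.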