Message-ID: <5f369b087ed3ef53cf255c212b0aef6fe1a7e613.camel@linux.intel.com>
Subject: Re: [RFC PATCH] mm/hmm, mm/migrate_device: Allow p2p access and p2p migration
From: Thomas Hellström
To: intel-xe@lists.freedesktop.org
Cc: Matthew Brost, Jason Gunthorpe, Simona Vetter, DRI-devel, Linux Memory Management List, LKML
Date: Tue, 15 Oct 2024 13:21:25 +0200
In-Reply-To: <20241015111322.97514-1-thomas.hellstrom@linux.intel.com>
References: <20241015111322.97514-1-thomas.hellstrom@linux.intel.com>

On Tue, 2024-10-15 at 13:13 +0200, Thomas Hellström wrote:
> Introduce a way for hmm_range_fault() and migrate_vma_setup() to identify
> foreign devices with fast interconnect and thereby allow
> both direct access over the interconnect and p2p migration.
>
> The need for a callback arises because without it, the p2p ability would
> need to be static and determined at dev_pagemap creation time. With
> a callback it can be determined dynamically, and in the migrate case
> the callback could separate out local device pages.
>
> The hmm_range_fault() change has been tested internally, the
> migrate_vma_setup() change hasn't yet.
>
> Seeking early feedback. Any suggestions appreciated.
>
> Cc: Matthew Brost
> Cc: Jason Gunthorpe
> Cc: Simona Vetter
> Cc: DRI-devel
> Cc: Linux Memory Management List
> Cc: LKML
>
> Signed-off-by: Thomas Hellström
> ---
>  include/linux/hmm.h     |  2 ++
>  include/linux/migrate.h | 29 +++++++++++++++++++++++++++++
>  mm/hmm.c                | 13 +++++++++++--
>  mm/migrate_device.c     | 12 ++++++++++++
>  4 files changed, 54 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/hmm.h b/include/linux/hmm.h
> index 126a36571667..4de909a1e10a 100644
> --- a/include/linux/hmm.h
> +++ b/include/linux/hmm.h
> @@ -12,6 +12,7 @@
>  #include
>
>  struct mmu_interval_notifier;
> +struct p2p_allow;
>
>  /*
>   * On output:
> @@ -97,6 +98,7 @@ struct hmm_range {
>  	unsigned long default_flags;
>  	unsigned long pfn_flags_mask;
>  	void *dev_private_owner;
> +	struct p2p_allow        *p2p;
>  };
>
>  /*
> diff --git a/include/linux/migrate.h b/include/linux/migrate.h
> index 002e49b2ebd9..0ff085b633e3 100644
> --- a/include/linux/migrate.h
> +++ b/include/linux/migrate.h
> @@ -183,10 +183,37 @@ static inline unsigned long migrate_pfn(unsigned long pfn)
>  	return (pfn << MIGRATE_PFN_SHIFT) | MIGRATE_PFN_VALID;
>  }
>
> +struct p2p_allow;
> +
> +/**
> + * struct p2p_allow_ops - Functions for detailed cross-device access.
> + */
> +struct p2p_allow_ops {
> +	/**
> +	 * @p2p_allow: Whether to allow cross-device access to device_private pages.
> +	 * @allow: Pointer to a struct p2p_allow. Typically subclassed by the caller
> +	 * to provide needed information.
> +	 * @page: The page being queried.
> +	 */
> +	bool (*p2p_allow)(struct p2p_allow *allow, struct page *page);
> +};
> +
> +/**
> + * struct p2p_allow - Information needed to allow cross-device access.
> + * @ops: Pointer to a struct p2p_allow_ops.
> + *
> + * This struct is intended to be embedded / subclassed to provide additional
> + * information needed by the @ops p2p_allow() callback.
> + */
> +struct p2p_allow {
> +	const struct p2p_allow_ops *ops;
> +};
> +
>  enum migrate_vma_direction {
>  	MIGRATE_VMA_SELECT_SYSTEM = 1 << 0,
>  	MIGRATE_VMA_SELECT_DEVICE_PRIVATE = 1 << 1,
>  	MIGRATE_VMA_SELECT_DEVICE_COHERENT = 1 << 2,
> +	MIGRATE_VMA_SELECT_DEVICE_P2P = 1 << 3,
>  };
>
>  struct migrate_vma {
> @@ -222,6 +249,8 @@ struct migrate_vma {
>  	 * a migrate_to_ram() callback.
>  	 */
>  	struct page *fault_page;
> +	/* Optional identification of devices for p2p migration */
> +	struct p2p_allow        *p2p;
>  };
>
>  int migrate_vma_setup(struct migrate_vma *args);
> diff --git a/mm/hmm.c b/mm/hmm.c
> index 7e0229ae4a5a..8c28f9b22ed2 100644
> --- a/mm/hmm.c
> +++ b/mm/hmm.c
> @@ -19,6 +19,7 @@
>  #include
>  #include
>  #include
> +#include
>  #include
>  #include
>  #include
> @@ -220,6 +221,15 @@ static inline unsigned long pte_to_hmm_pfn_flags(struct hmm_range *range,
>  	return pte_write(pte) ? (HMM_PFN_VALID | HMM_PFN_WRITE) : HMM_PFN_VALID;
>  }
>
> +static bool hmm_allow_devmem(struct hmm_range *range, struct page *page)
> +{
> +	if (likely(page->pgmap->owner == range->dev_private_owner))
> +		return true;
> +	if (likely(!range->p2p))
> +		return false;
> +	return range->p2p->ops->p2p_allow(range->p2p, page);
> +}
> +
>  static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr,
>  			      unsigned long end, pmd_t *pmdp, pte_t *ptep,
>  			      unsigned long *hmm_pfn)
> @@ -248,8 +258,7 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr,
>  		 * just report the PFN.
>  		 */
>  		if (is_device_private_entry(entry) &&
> -		    pfn_swap_entry_to_page(entry)->pgmap->owner ==
> -		    range->dev_private_owner) {
> +		    hmm_allow_devmem(range, pfn_swap_entry_to_page(entry))) {
>  			cpu_flags = HMM_PFN_VALID;
>  			if (is_writable_device_private_entry(entry))
>  				cpu_flags |= HMM_PFN_WRITE;
> diff --git a/mm/migrate_device.c b/mm/migrate_device.c
> index 9cf26592ac93..8e643a3872c9 100644
> --- a/mm/migrate_device.c
> +++ b/mm/migrate_device.c
> @@ -54,6 +54,13 @@ static int migrate_vma_collect_hole(unsigned long start,
>  	return 0;
>  }
>
> +static bool migrate_vma_allow_p2p(struct migrate_vma *migrate, struct page *page)
> +{
> +	if (likely(!migrate->p2p))
> +		return false;
> +	return migrate->p2p->ops->p2p_allow(migrate->p2p, page);
> +}
> +
>  static int migrate_vma_collect_pmd(pmd_t *pmdp,
>  				   unsigned long start,
>  				   unsigned long end,
> @@ -138,6 +145,11 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
>  			    page->pgmap->owner != migrate->pgmap_owner)
>  				goto next;
>
> +			if (!(migrate->flags &
> +			      MIGRATE_VMA_SELECT_DEVICE_P2P) ||
> +			    !migrate_vma_allow_p2p(migrate, page))
> +				goto next;
> +

And obviously some inverted logic here, sigh, but hopefully the intent
is clear..

/Thomas

>  			mpfn = migrate_pfn(page_to_pfn(page)) |
>  					MIGRATE_PFN_MIGRATE;
>  			if (is_writable_device_private_entry(entry))
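
For illustration only, here is a minimal userspace sketch of the two ideas
in the quoted patch: embedding struct p2p_allow in a caller-defined struct so
the p2p_allow() callback can recover its private data, and one possible
reading of the non-inverted check in the migrate path. The struct page and
struct dev_pagemap below are simplified stand-ins, and struct my_p2p_allow,
peer_owner, struct collect_args, collect_device_page() and the SELECT_*
macros are hypothetical names that do not appear in the patch. The intended
logic is an assumption drawn from the commit message, not something the
patch confirms.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace equivalent of the kernel's container_of() helper. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Simplified stand-ins for the kernel types (not the real layouts). */
struct dev_pagemap { void *owner; };
struct page { struct dev_pagemap *pgmap; };

struct p2p_allow;

struct p2p_allow_ops {
	bool (*p2p_allow)(struct p2p_allow *allow, struct page *page);
};

struct p2p_allow {
	const struct p2p_allow_ops *ops;
};

/* Hypothetical caller-side subclass that embeds struct p2p_allow. */
struct my_p2p_allow {
	struct p2p_allow base;
	void *peer_owner;	/* hypothetical: owner reachable over a fast interconnect */
};

/* Callback: recover the subclass, then allow only pages owned by the peer. */
static bool my_p2p_allow_cb(struct p2p_allow *allow, struct page *page)
{
	struct my_p2p_allow *my = container_of(allow, struct my_p2p_allow, base);

	return page->pgmap->owner == my->peer_owner;
}

static const struct p2p_allow_ops my_p2p_ops = {
	.p2p_allow = my_p2p_allow_cb,
};

/* Hypothetical flag values mirroring the enum in the patch. */
#define SELECT_DEVICE_PRIVATE	(1u << 1)
#define SELECT_DEVICE_P2P	(1u << 3)

/* Hypothetical stand-in for the relevant struct migrate_vma fields. */
struct collect_args {
	unsigned int flags;
	void *pgmap_owner;
	struct p2p_allow *p2p;
};

/*
 * Assumed intent: collect a device-private page if it is local and
 * DEVICE_PRIVATE is selected, or if DEVICE_P2P is selected and the
 * callback accepts it. Skip the page only when neither holds.
 */
static bool collect_device_page(const struct collect_args *args,
				struct page *page)
{
	if ((args->flags & SELECT_DEVICE_PRIVATE) &&
	    page->pgmap->owner == args->pgmap_owner)
		return true;
	if ((args->flags & SELECT_DEVICE_P2P) && args->p2p) {
		assert(args->p2p->ops && args->p2p->ops->p2p_allow);
		return args->p2p->ops->p2p_allow(args->p2p, page);
	}
	return false;
}
```

In this sketch a caller would fill in a struct my_p2p_allow, point base.ops
at my_p2p_ops and pass &my.base as the p2p pointer. With no p2p pointer set,
only pages owned by the local device are collected.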