From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.14]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DC8A621A449; Tue, 4 Feb 2025 22:01:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.14 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738706518; cv=none; b=LVstRVTUIfYPbVeS91BSw9ypJhGJafZQPzEymDyT9KaXeNLlG66NL6t3XAwAdkB3lD9b8d28rixZz5yFgtzZLYMBF13PW56VViJq6SU9t3x6eWckMoBb42IuBQhYN3Wy2m33MXMSArey+/NCYFrvbzHgFrRZaQ987SBwLAc94fw= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738706518; c=relaxed/simple; bh=lbPHclbARFkQmvsu0zJqrdw8H+4xG9Vq8ZiG/BDW7Jo=; h=Message-ID:Subject:From:To:Cc:Date:In-Reply-To:References: Content-Type:MIME-Version; b=c2EcO/yKPn1u8zy4BaPeGI6CBXmKlDn4N/+yw+x60ymM2MHWDDtpuN1X7i6892W3AMc0A87UQDyxdqYSzNiXrmv7kdrs7eX/Tzp28ZJL7PdyoX+YLAGLS1dmXqL8eDYuv4235T/9k63k9rKOdPvgKgfJjWOFhEE6K0joJVME4Hw= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=SdO8o5t8; arc=none smtp.client-ip=192.198.163.14 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="SdO8o5t8" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1738706517; x=1770242517; h=message-id:subject:from:to:cc:date:in-reply-to: references:content-transfer-encoding:mime-version; bh=lbPHclbARFkQmvsu0zJqrdw8H+4xG9Vq8ZiG/BDW7Jo=; b=SdO8o5t8l25nWlUS7nRVK96gD06y5ImWdrlIRyj6biZZVYr2V8sVqGuL Tn9Iomr8JyF3p3eZimWCMD10jwov8R20ndKXh7cdvyAPG0SDufnBFZtcn Ycq8+jF2K8gWtTv8jK/wcxlRUEDsMdfaFNm0vROumUvkTWMaZzKAyua38 B2MDNmE3eD+iip6J+dwF7xNsNOZ+QX21Qyhrs/MQoTo2vn8F3CRxP9RKl lhLjJETaIfqsBFbXhhesBfZwgBB8zWYrMoCeu8VOfT2qrfVYbASSyjlMj hsIrFiugFuMFTFoysx2bML+DtydZrQJem8CBYOH8Ly18Fax5pZ+Pn5+ka w==; X-CSE-ConnectionGUID: etr2uh17TgG7yOsIH2xKXQ== X-CSE-MsgGUID: ASXSxsQJR/a2AsWhJhjDLQ== X-IronPort-AV: E=McAfee;i="6700,10204,11336"; a="39517117" X-IronPort-AV: E=Sophos;i="6.13,259,1732608000"; d="scan'208";a="39517117" Received: from fmviesa010.fm.intel.com ([10.60.135.150]) by fmvoesa108.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 04 Feb 2025 14:01:55 -0800 X-CSE-ConnectionGUID: NAx9oX9yRyOex+aeBj7lDg== X-CSE-MsgGUID: DNsEcHZsSNySIyoK+sYkqw== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.13,259,1732608000"; d="scan'208";a="111263801" Received: from lfiedoro-mobl.ger.corp.intel.com (HELO [10.245.246.144]) ([10.245.246.144]) by fmviesa010-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 04 Feb 2025 14:01:28 -0800 Message-ID: <60b7e29853cb33d45d10101e494c7ddbe6a5abd6.camel@linux.intel.com> Subject: Re: [RFC 1/5] mm/hmm: HMM API to enable P2P DMA for device private pages From: Thomas =?ISO-8859-1?Q?Hellstr=F6m?= To: Jason Gunthorpe Cc: Yonatan Maman , kherbst@redhat.com, lyude@redhat.com, dakr@redhat.com, airlied@gmail.com, simona@ffwll.ch, leon@kernel.org, jglisse@redhat.com, akpm@linux-foundation.org, GalShalom@nvidia.com, dri-devel@lists.freedesktop.org, nouveau@lists.freedesktop.org, linux-kernel@vger.kernel.org, linux-rdma@vger.kernel.org, linux-mm@kvack.org, linux-tegra@vger.kernel.org Date: Tue, 04 Feb 2025 23:01:25 +0100 In-Reply-To: <20250204191601.GK2296753@ziepe.ca> References: <20250129134757.GA2120662@ziepe.ca> <20250130132317.GG2120662@ziepe.ca> <20250130174217.GA2296753@ziepe.ca> <20250203150805.GC2296753@ziepe.ca> <7b7a15fb1f59acc60393eb01cefddf4dc1f32c00.camel@linux.intel.com> <20250204132615.GI2296753@ziepe.ca> <3e96aef8009be69858a69d3e49a2bd7fc7d06f5f.camel@linux.intel.com> <20250204191601.GK2296753@ziepe.ca> Organization: Intel Sweden AB, Registration Number: 556189-6027 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable User-Agent: Evolution 3.54.3 (3.54.3-1.fc41) Precedence: bulk X-Mailing-List: linux-rdma@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 On Tue, 2025-02-04 at 15:16 -0400, Jason Gunthorpe wrote: > On Tue, Feb 04, 2025 at 03:29:48PM +0100, Thomas Hellstr=C3=B6m wrote: > > On Tue, 2025-02-04 at 09:26 -0400, Jason Gunthorpe wrote: > > > On Tue, Feb 04, 2025 at 10:32:32AM +0100, Thomas Hellstr=C3=B6m wrote= : > > > >=20 > > >=20 > > > > 1) Existing users would never use the callback. They can still > > > > rely > > > > on > > > > the owner check, only if that fails we check for callback > > > > existence. > > > > 2) By simply caching the result from the last checked > > > > dev_pagemap, > > > > most > > > > callback calls could typically be eliminated. > > >=20 > > > But then you are not in the locked region so your cache is racy > > > and > > > invalid. > >=20 > > I'm not sure I follow? If a device private pfn handed back to the > > caller is dependent on dev_pagemap A having a fast interconnect to > > the > > client, then subsequent pfns in the same hmm_range_fault() call > > must be > > able to make the same assumption (pagemap A having a fast > > interconnect), else the whole result is invalid? >=20 > But what is the receiver going to do with this device private page? > Relock it again and check again if it is actually OK? Yuk. I'm still lost as to what would be the possible race-condition that can't be handled in the usual way using mmu invalidations + notifier seqno bump? Is it the fast interconnect being taken down? /Thomas >=20 > > > > 3) As mentioned before, a callback call would typically always > > > > be > > > > followed by either migration to ram or a page-table update. > > > > Compared to > > > > these, the callback overhead would IMO be unnoticeable. > > >=20 > > > Why? Surely the normal case should be a callback saying the > > > memory > > > can > > > be accessed? > >=20 > > Sure, but at least on the xe driver, that means page-table > > repopulation > > since the hmm_range_fault() typically originated from a page-fault. >=20 > Yes, I expect all hmm_range_fault()'s to be on page fault paths, and > we'd like it to be as fast as we can in the CPU present case.. >=20 > Jason