From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 6 Nov 2025 14:56:22 -0700
From: Alex Williamson
Subject: Re: [PATCH v5 3/3] vfio/nvgrace-gpu: register device memory for poison handling
Message-ID: <20251106145622.1610d306.alex@shazbot.org>
In-Reply-To: <20251102184434.2406-4-ankita@nvidia.com>
References: <20251102184434.2406-1-ankita@nvidia.com> <20251102184434.2406-4-ankita@nvidia.com>
X-Mailing-List: linux-edac@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit

On Sun, 2 Nov 2025 18:44:34 +0000 wrote:

> From: Ankit Agrawal
>
> The nvgrace-gpu-vfio-pci module [1] maps the device memory to the user VA
> (Qemu) using remap_pfn_range() without adding the memory to the kernel.
> The device memory pages are not backed by struct page.
> The previous patch implements the mechanism to handle ECC/poison on memory
> page without struct page. This new mechanism is being used here.
>
> The module registers its memory region and the address_space with the
> kernel MM for ECC handling using the register_pfn_address_space()
> registration API exposed by the kernel.
>
> Link: https://lore.kernel.org/all/20240220115055.23546-1-ankita@nvidia.com/ [1]
>
> Signed-off-by: Ankit Agrawal
> ---
>  drivers/vfio/pci/nvgrace-gpu/main.c | 45 ++++++++++++++++++++++++++++-
>  1 file changed, 44 insertions(+), 1 deletion(-)

LGTM.  I see Andrew has already picked this up in mm-new, if he
refreshes, here's another ack.

Acked-by: Alex Williamson

Thanks,
Alex

> diff --git a/drivers/vfio/pci/nvgrace-gpu/main.c b/drivers/vfio/pci/nvgrace-gpu/main.c
> index d95761dcdd58..80b3ed63c682 100644
> --- a/drivers/vfio/pci/nvgrace-gpu/main.c
> +++ b/drivers/vfio/pci/nvgrace-gpu/main.c
> @@ -8,6 +8,10 @@
>  #include
>  #include
>
> +#ifdef CONFIG_MEMORY_FAILURE
> +#include
> +#endif
> +
>  /*
>   * The device memory usable to the workloads running in the VM is cached
>   * and showcased as a 64b device BAR (comprising of BAR4 and BAR5 region)
> @@ -47,6 +51,9 @@ struct mem_region {
>  		void *memaddr;
>  		void __iomem *ioaddr;
>  	};                              /* Base virtual address of the region */
> +#ifdef CONFIG_MEMORY_FAILURE
> +	struct pfn_address_space pfn_address_space;
> +#endif
>  };
>
>  struct nvgrace_gpu_pci_core_device {
> @@ -60,6 +67,28 @@ struct nvgrace_gpu_pci_core_device {
>  	bool has_mig_hw_bug;
>  };
>
> +#ifdef CONFIG_MEMORY_FAILURE
> +
> +static int
> +nvgrace_gpu_vfio_pci_register_pfn_range(struct mem_region *region,
> +					struct vm_area_struct *vma)
> +{
> +	unsigned long nr_pages;
> +	int ret = 0;
> +
> +	nr_pages = region->memlength >> PAGE_SHIFT;
> +
> +	region->pfn_address_space.node.start = vma->vm_pgoff;
> +	region->pfn_address_space.node.last = vma->vm_pgoff + nr_pages - 1;
> +	region->pfn_address_space.mapping = vma->vm_file->f_mapping;
> +
> +	ret = register_pfn_address_space(&region->pfn_address_space);
> +
> +	return ret;
> +}
> +
> +#endif
> +
>  static void nvgrace_gpu_init_fake_bar_emu_regs(struct vfio_device *core_vdev)
>  {
>  	struct nvgrace_gpu_pci_core_device *nvdev =
> @@ -127,6 +156,13 @@ static void nvgrace_gpu_close_device(struct vfio_device *core_vdev)
>
>  	mutex_destroy(&nvdev->remap_lock);
>
> +#ifdef CONFIG_MEMORY_FAILURE
> +	if (nvdev->resmem.memlength)
> +		unregister_pfn_address_space(&nvdev->resmem.pfn_address_space);
> +
> +	unregister_pfn_address_space(&nvdev->usemem.pfn_address_space);
> +#endif
> +
>  	vfio_pci_core_close_device(core_vdev);
>  }
>
> @@ -202,7 +238,14 @@ static int nvgrace_gpu_mmap(struct vfio_device *core_vdev,
>
>  	vma->vm_pgoff = start_pfn;
>
> -	return 0;
> +#ifdef CONFIG_MEMORY_FAILURE
> +	if (nvdev->resmem.memlength && index == VFIO_PCI_BAR2_REGION_INDEX)
> +		ret = nvgrace_gpu_vfio_pci_register_pfn_range(&nvdev->resmem, vma);
> +	else if (index == VFIO_PCI_BAR4_REGION_INDEX)
> +		ret = nvgrace_gpu_vfio_pci_register_pfn_range(&nvdev->usemem, vma);
> +#endif
> +
> +	return ret;
>  }
>
>  static long
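
Editorial aside on the interval arithmetic in the quoted
nvgrace_gpu_vfio_pci_register_pfn_range() hunk: the registered range is
inclusive at both ends, which is why the last PFN is computed as
vm_pgoff + nr_pages - 1. The following is only a minimal user-space
sketch of that calculation, not kernel code. The names pfn_range,
pfn_range_init(), pfn_range_contains() and SKETCH_PAGE_SHIFT are
hypothetical stand-ins invented for illustration, and a 4 KiB page size
is assumed.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: 4 KiB pages. The kernel's PAGE_SHIFT depends on the config. */
#define SKETCH_PAGE_SHIFT 12

/*
 * Hypothetical stand-in for the start/last pair that the patch stores in
 * pfn_address_space.node. Both bounds are inclusive.
 */
struct pfn_range {
	unsigned long start;	/* first PFN covered */
	unsigned long last;	/* last PFN covered */
};

/*
 * Mirrors the arithmetic in the quoted hunk:
 *   nr_pages = memlength >> PAGE_SHIFT
 *   start    = pgoff
 *   last     = pgoff + nr_pages - 1
 * Assumes memlength is a non-zero multiple of the page size.
 */
struct pfn_range pfn_range_init(unsigned long pgoff, unsigned long memlength)
{
	unsigned long nr_pages = memlength >> SKETCH_PAGE_SHIFT;
	struct pfn_range r = {
		.start = pgoff,
		.last = pgoff + nr_pages - 1,
	};

	return r;
}

/* True if pfn lies inside the inclusive range [start, last]. */
bool pfn_range_contains(const struct pfn_range *r, unsigned long pfn)
{
	return pfn >= r->start && pfn <= r->last;
}
```

With a 16-page region starting at PFN 0x1000, the sketch covers PFNs
0x1000 through 0x100f, so a poisoned PFN of 0x1010 falls outside the
range.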