Re: [PATCH 6/7] nouveau/dmem: Evict device private memory during release

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Alistair Popple <apopple@nvidia.com>
To: Felix Kuehling <felix.kuehling@amd.com>
Cc: "Alex Sierra" <alex.sierra@amd.com>,
	"Karol Herbst" <kherbst@redhat.com>,
	"David Airlie" <airlied@linux.ie>,
	nouveau@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
	linux-mm@kvack.org, amd-gfx@lists.freedesktop.org,
	"Michael Ellerman" <mpe@ellerman.id.au>,
	"Matthew Wilcox (Oracle)" <willy@infradead.org>,
	"Ben Skeggs" <bskeggs@redhat.com>,
	"Jason Gunthorpe" <jgg@nvidia.com>,
	"Ralph Campbell" <rcampbell@nvidia.com>,
	"John Hubbard" <jhubbard@nvidia.com>,
	"Nicholas Piggin" <npiggin@gmail.com>,
	"Dan Williams" <dan.j.williams@intel.com>,
	"Pan, Xinhui" <Xinhui.Pan@amd.com>,
	linux-kernel@vger.kernel.org, "Daniel Vetter" <daniel@ffwll.ch>,
	"Alex Deucher" <alexander.deucher@amd.com>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	linuxppc-dev@lists.ozlabs.org,
	"Christian König" <christian.koenig@amd.com>
Subject: Re: [PATCH 6/7] nouveau/dmem: Evict device private memory during release
Date: Tue, 27 Sep 2022 11:39:57 +1000	[thread overview]
Message-ID: <87bkr1lh3a.fsf@nvdebian.thelocal> (raw)
In-Reply-To: <7ca6ec0c-7e5e-3b24-8f8d-650df357130c@amd.com>


Felix Kuehling <felix.kuehling@amd.com> writes:

> On 2022-09-26 17:35, Lyude Paul wrote:
>> On Mon, 2022-09-26 at 16:03 +1000, Alistair Popple wrote:
>>> When the module is unloaded or a GPU is unbound from the module it is
>>> possible for device private pages to be left mapped in currently running
>>> processes. This leads to a kernel crash when the pages are either freed
>>> or accessed from the CPU because the GPU and associated data structures
>>> and callbacks have all been freed.
>>>
>>> Fix this by migrating any mappings back to normal CPU memory prior to
>>> freeing the GPU memory chunks and associated device private pages.
>>>
>>> Signed-off-by: Alistair Popple <apopple@nvidia.com>
>>>
>>> ---
>>>
>>> I assume the AMD driver might have a similar issue. However I can't see
>>> where device private (or coherent) pages actually get unmapped/freed
>>> during teardown as I couldn't find any relevant calls to
>>> devm_memunmap(), memunmap(), devm_release_mem_region() or
>>> release_mem_region(). So it appears that ZONE_DEVICE pages are not being
>>> properly freed during module unload, unless I'm missing something?
>> I've got no idea, will poke Ben to see if they know the answer to this
>
> I guess we're relying on devm to release the region. Isn't the whole point of
> using devm_request_free_mem_region that we don't have to remember to explicitly
> release it when the device gets destroyed? I believe we had an explicit free
> call at some point by mistake, and that caused a double-free during module
> unload. See this commit for reference:

Argh, thanks for that pointer. I was not so familiar with
devm_request_free_mem_region()/devm_memremap_pages() as currently
Nouveau explicitly manages that itself.

> commit 22f4f4faf337d5fb2d2750aff13215726814273e
> Author: Philip Yang <Philip.Yang@amd.com>
> Date:   Mon Sep 20 17:25:52 2021 -0400
>
>     drm/amdkfd: fix svm_migrate_fini warning
>          Device manager releases device-specific resources when a driver
>     disconnects from a device, devm_memunmap_pages and
>     devm_release_mem_region calls in svm_migrate_fini are redundant.
>          It causes below warning trace after patch "drm/amdgpu: Split
>     amdgpu_device_fini into early and late", so remove function
>     svm_migrate_fini.
>          BUG: https://gitlab.freedesktop.org/drm/amd/-/issues/1718
>          WARNING: CPU: 1 PID: 3646 at drivers/base/devres.c:795
>     devm_release_action+0x51/0x60
>     Call Trace:
>         ? memunmap_pages+0x360/0x360
>         svm_migrate_fini+0x2d/0x60 [amdgpu]
>         kgd2kfd_device_exit+0x23/0xa0 [amdgpu]
>         amdgpu_amdkfd_device_fini_sw+0x1d/0x30 [amdgpu]
>         amdgpu_device_fini_sw+0x45/0x290 [amdgpu]
>         amdgpu_driver_release_kms+0x12/0x30 [amdgpu]
>         drm_dev_release+0x20/0x40 [drm]
>         release_nodes+0x196/0x1e0
>         device_release_driver_internal+0x104/0x1d0
>         driver_detach+0x47/0x90
>         bus_remove_driver+0x7a/0xd0
>         pci_unregister_driver+0x3d/0x90
>         amdgpu_exit+0x11/0x20 [amdgpu]
>          Signed-off-by: Philip Yang <Philip.Yang@amd.com>
>     Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
>     Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
>
> Furthermore, I guess we are assuming that nobody is using the GPU when the
> module is unloaded. As long as any processes have /dev/kfd open, you won't be
> able to unload the module (except by force-unload). I suppose with ZONE_DEVICE
> memory, we can have references to device memory pages even when user mode has
> closed /dev/kfd. We do have a cleanup handler that runs in an MMU-free-notifier.
> In theory that should run after all the pages in the mm_struct have been freed.
> It releases all sorts of other device resources and needs the driver to still be
> there. I'm not sure if there is anything preventing a module unload before the
> free-notifier runs. I'll look into that.

Right - module unload (or device unbind) is one of the other ways we can
hit this issue in Nouveau at least. You can end up with ZONE_DEVICE
pages mapped in a running process after the module has unloaded.
Although now you mention it that seems a bit wrong - the pgmap refcount
should provide some protection against that. Will have to look into
that too.

> Regards,
>   Felix
>
>
>>
>>> ---
>>>   drivers/gpu/drm/nouveau/nouveau_dmem.c | 48 +++++++++++++++++++++++++++-
>>>   1 file changed, 48 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c
>>> index 66ebbd4..3b247b8 100644
>>> --- a/drivers/gpu/drm/nouveau/nouveau_dmem.c
>>> +++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c
>>> @@ -369,6 +369,52 @@ nouveau_dmem_suspend(struct nouveau_drm *drm)
>>>   	mutex_unlock(&drm->dmem->mutex);
>>>   }
>>>   +/*
>>> + * Evict all pages mapping a chunk.
>>> + */
>>> +void
>>> +nouveau_dmem_evict_chunk(struct nouveau_dmem_chunk *chunk)
>>> +{
>>> +	unsigned long i, npages = range_len(&chunk->pagemap.range) >> PAGE_SHIFT;
>>> +	unsigned long *src_pfns, *dst_pfns;
>>> +	dma_addr_t *dma_addrs;
>>> +	struct nouveau_fence *fence;
>>> +
>>> +	src_pfns = kcalloc(npages, sizeof(*src_pfns), GFP_KERNEL);
>>> +	dst_pfns = kcalloc(npages, sizeof(*dst_pfns), GFP_KERNEL);
>>> +	dma_addrs = kcalloc(npages, sizeof(*dma_addrs), GFP_KERNEL);
>>> +
>>> +	migrate_device_range(src_pfns, chunk->pagemap.range.start >> PAGE_SHIFT,
>>> +			npages);
>>> +
>>> +	for (i = 0; i < npages; i++) {
>>> +		if (src_pfns[i] & MIGRATE_PFN_MIGRATE) {
>>> +			struct page *dpage;
>>> +
>>> +			/*
>>> +			 * _GFP_NOFAIL because the GPU is going away and there
>>> +			 * is nothing sensible we can do if we can't copy the
>>> +			 * data back.
>>> +			 */
>> You'll have to excuse me for a moment since this area of nouveau isn't one of
>> my strongpoints, but are we sure about this? IIRC __GFP_NOFAIL means infinite
>> retry, in the case of a GPU hotplug event I would assume we would rather just
>> stop trying to migrate things to the GPU and just drop the data instead of
>> hanging on infinite retries.
>>
>>> +			dpage = alloc_page(GFP_HIGHUSER | __GFP_NOFAIL);
>>> +			dst_pfns[i] = migrate_pfn(page_to_pfn(dpage));
>>> +			nouveau_dmem_copy_one(chunk->drm,
>>> +					migrate_pfn_to_page(src_pfns[i]), dpage,
>>> +					&dma_addrs[i]);
>>> +		}
>>> +	}
>>> +
>>> +	nouveau_fence_new(chunk->drm->dmem->migrate.chan, false, &fence);
>>> +	migrate_device_pages(src_pfns, dst_pfns, npages);
>>> +	nouveau_dmem_fence_done(&fence);
>>> +	migrate_device_finalize(src_pfns, dst_pfns, npages);
>>> +	kfree(src_pfns);
>>> +	kfree(dst_pfns);
>>> +	for (i = 0; i < npages; i++)
>>> +		dma_unmap_page(chunk->drm->dev->dev, dma_addrs[i], PAGE_SIZE, DMA_BIDIRECTIONAL);
>>> +	kfree(dma_addrs);
>>> +}
>>> +
>>>   void
>>>   nouveau_dmem_fini(struct nouveau_drm *drm)
>>>   {
>>> @@ -380,8 +426,10 @@ nouveau_dmem_fini(struct nouveau_drm *drm)
>>>   	mutex_lock(&drm->dmem->mutex);
>>>     	list_for_each_entry_safe(chunk, tmp, &drm->dmem->chunks, list) {
>>> +		nouveau_dmem_evict_chunk(chunk);
>>>   		nouveau_bo_unpin(chunk->bo);
>>>   		nouveau_bo_ref(NULL, &chunk->bo);
>>> +		WARN_ON(chunk->callocated);
>>>   		list_del(&chunk->list);
>>>   		memunmap_pages(&chunk->pagemap);
>>>   		release_mem_region(chunk->pagemap.range.start,

WARNING: multiple messages have this Message-ID (diff)

From: Alistair Popple <apopple@nvidia.com>
To: Felix Kuehling <felix.kuehling@amd.com>
Cc: "Alex Sierra" <alex.sierra@amd.com>,
	"David Airlie" <airlied@linux.ie>,
	nouveau@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
	linux-mm@kvack.org, amd-gfx@lists.freedesktop.org,
	"Michael Ellerman" <mpe@ellerman.id.au>,
	"Matthew Wilcox (Oracle)" <willy@infradead.org>,
	"Ben Skeggs" <bskeggs@redhat.com>,
	"Jason Gunthorpe" <jgg@nvidia.com>,
	"Ralph Campbell" <rcampbell@nvidia.com>,
	"Nicholas Piggin" <npiggin@gmail.com>,
	"Dan Williams" <dan.j.williams@intel.com>,
	"Pan, Xinhui" <Xinhui.Pan@amd.com>,
	linux-kernel@vger.kernel.org, "Daniel Vetter" <daniel@ffwll.ch>,
	"Alex Deucher" <alexander.deucher@amd.com>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	linuxppc-dev@lists.ozlabs.org,
	"Christian König" <christian.koenig@amd.com>
Subject: Re: [Nouveau] [PATCH 6/7] nouveau/dmem: Evict device private memory during release
Date: Tue, 27 Sep 2022 11:39:57 +1000	[thread overview]
Message-ID: <87bkr1lh3a.fsf@nvdebian.thelocal> (raw)
In-Reply-To: <7ca6ec0c-7e5e-3b24-8f8d-650df357130c@amd.com>


Felix Kuehling <felix.kuehling@amd.com> writes:

> On 2022-09-26 17:35, Lyude Paul wrote:
>> On Mon, 2022-09-26 at 16:03 +1000, Alistair Popple wrote:
>>> When the module is unloaded or a GPU is unbound from the module it is
>>> possible for device private pages to be left mapped in currently running
>>> processes. This leads to a kernel crash when the pages are either freed
>>> or accessed from the CPU because the GPU and associated data structures
>>> and callbacks have all been freed.
>>>
>>> Fix this by migrating any mappings back to normal CPU memory prior to
>>> freeing the GPU memory chunks and associated device private pages.
>>>
>>> Signed-off-by: Alistair Popple <apopple@nvidia.com>
>>>
>>> ---
>>>
>>> I assume the AMD driver might have a similar issue. However I can't see
>>> where device private (or coherent) pages actually get unmapped/freed
>>> during teardown as I couldn't find any relevant calls to
>>> devm_memunmap(), memunmap(), devm_release_mem_region() or
>>> release_mem_region(). So it appears that ZONE_DEVICE pages are not being
>>> properly freed during module unload, unless I'm missing something?
>> I've got no idea, will poke Ben to see if they know the answer to this
>
> I guess we're relying on devm to release the region. Isn't the whole point of
> using devm_request_free_mem_region that we don't have to remember to explicitly
> release it when the device gets destroyed? I believe we had an explicit free
> call at some point by mistake, and that caused a double-free during module
> unload. See this commit for reference:

Argh, thanks for that pointer. I was not so familiar with
devm_request_free_mem_region()/devm_memremap_pages() as currently
Nouveau explicitly manages that itself.

> commit 22f4f4faf337d5fb2d2750aff13215726814273e
> Author: Philip Yang <Philip.Yang@amd.com>
> Date:   Mon Sep 20 17:25:52 2021 -0400
>
>     drm/amdkfd: fix svm_migrate_fini warning
>          Device manager releases device-specific resources when a driver
>     disconnects from a device, devm_memunmap_pages and
>     devm_release_mem_region calls in svm_migrate_fini are redundant.
>          It causes below warning trace after patch "drm/amdgpu: Split
>     amdgpu_device_fini into early and late", so remove function
>     svm_migrate_fini.
>          BUG: https://gitlab.freedesktop.org/drm/amd/-/issues/1718
>          WARNING: CPU: 1 PID: 3646 at drivers/base/devres.c:795
>     devm_release_action+0x51/0x60
>     Call Trace:
>         ? memunmap_pages+0x360/0x360
>         svm_migrate_fini+0x2d/0x60 [amdgpu]
>         kgd2kfd_device_exit+0x23/0xa0 [amdgpu]
>         amdgpu_amdkfd_device_fini_sw+0x1d/0x30 [amdgpu]
>         amdgpu_device_fini_sw+0x45/0x290 [amdgpu]
>         amdgpu_driver_release_kms+0x12/0x30 [amdgpu]
>         drm_dev_release+0x20/0x40 [drm]
>         release_nodes+0x196/0x1e0
>         device_release_driver_internal+0x104/0x1d0
>         driver_detach+0x47/0x90
>         bus_remove_driver+0x7a/0xd0
>         pci_unregister_driver+0x3d/0x90
>         amdgpu_exit+0x11/0x20 [amdgpu]
>          Signed-off-by: Philip Yang <Philip.Yang@amd.com>
>     Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
>     Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
>
> Furthermore, I guess we are assuming that nobody is using the GPU when the
> module is unloaded. As long as any processes have /dev/kfd open, you won't be
> able to unload the module (except by force-unload). I suppose with ZONE_DEVICE
> memory, we can have references to device memory pages even when user mode has
> closed /dev/kfd. We do have a cleanup handler that runs in an MMU-free-notifier.
> In theory that should run after all the pages in the mm_struct have been freed.
> It releases all sorts of other device resources and needs the driver to still be
> there. I'm not sure if there is anything preventing a module unload before the
> free-notifier runs. I'll look into that.

Right - module unload (or device unbind) is one of the other ways we can
hit this issue in Nouveau at least. You can end up with ZONE_DEVICE
pages mapped in a running process after the module has unloaded.
Although now you mention it that seems a bit wrong - the pgmap refcount
should provide some protection against that. Will have to look into
that too.

> Regards,
>   Felix
>
>
>>
>>> ---
>>>   drivers/gpu/drm/nouveau/nouveau_dmem.c | 48 +++++++++++++++++++++++++++-
>>>   1 file changed, 48 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c
>>> index 66ebbd4..3b247b8 100644
>>> --- a/drivers/gpu/drm/nouveau/nouveau_dmem.c
>>> +++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c
>>> @@ -369,6 +369,52 @@ nouveau_dmem_suspend(struct nouveau_drm *drm)
>>>   	mutex_unlock(&drm->dmem->mutex);
>>>   }
>>>   +/*
>>> + * Evict all pages mapping a chunk.
>>> + */
>>> +void
>>> +nouveau_dmem_evict_chunk(struct nouveau_dmem_chunk *chunk)
>>> +{
>>> +	unsigned long i, npages = range_len(&chunk->pagemap.range) >> PAGE_SHIFT;
>>> +	unsigned long *src_pfns, *dst_pfns;
>>> +	dma_addr_t *dma_addrs;
>>> +	struct nouveau_fence *fence;
>>> +
>>> +	src_pfns = kcalloc(npages, sizeof(*src_pfns), GFP_KERNEL);
>>> +	dst_pfns = kcalloc(npages, sizeof(*dst_pfns), GFP_KERNEL);
>>> +	dma_addrs = kcalloc(npages, sizeof(*dma_addrs), GFP_KERNEL);
>>> +
>>> +	migrate_device_range(src_pfns, chunk->pagemap.range.start >> PAGE_SHIFT,
>>> +			npages);
>>> +
>>> +	for (i = 0; i < npages; i++) {
>>> +		if (src_pfns[i] & MIGRATE_PFN_MIGRATE) {
>>> +			struct page *dpage;
>>> +
>>> +			/*
>>> +			 * _GFP_NOFAIL because the GPU is going away and there
>>> +			 * is nothing sensible we can do if we can't copy the
>>> +			 * data back.
>>> +			 */
>> You'll have to excuse me for a moment since this area of nouveau isn't one of
>> my strongpoints, but are we sure about this? IIRC __GFP_NOFAIL means infinite
>> retry, in the case of a GPU hotplug event I would assume we would rather just
>> stop trying to migrate things to the GPU and just drop the data instead of
>> hanging on infinite retries.
>>
>>> +			dpage = alloc_page(GFP_HIGHUSER | __GFP_NOFAIL);
>>> +			dst_pfns[i] = migrate_pfn(page_to_pfn(dpage));
>>> +			nouveau_dmem_copy_one(chunk->drm,
>>> +					migrate_pfn_to_page(src_pfns[i]), dpage,
>>> +					&dma_addrs[i]);
>>> +		}
>>> +	}
>>> +
>>> +	nouveau_fence_new(chunk->drm->dmem->migrate.chan, false, &fence);
>>> +	migrate_device_pages(src_pfns, dst_pfns, npages);
>>> +	nouveau_dmem_fence_done(&fence);
>>> +	migrate_device_finalize(src_pfns, dst_pfns, npages);
>>> +	kfree(src_pfns);
>>> +	kfree(dst_pfns);
>>> +	for (i = 0; i < npages; i++)
>>> +		dma_unmap_page(chunk->drm->dev->dev, dma_addrs[i], PAGE_SIZE, DMA_BIDIRECTIONAL);
>>> +	kfree(dma_addrs);
>>> +}
>>> +
>>>   void
>>>   nouveau_dmem_fini(struct nouveau_drm *drm)
>>>   {
>>> @@ -380,8 +426,10 @@ nouveau_dmem_fini(struct nouveau_drm *drm)
>>>   	mutex_lock(&drm->dmem->mutex);
>>>     	list_for_each_entry_safe(chunk, tmp, &drm->dmem->chunks, list) {
>>> +		nouveau_dmem_evict_chunk(chunk);
>>>   		nouveau_bo_unpin(chunk->bo);
>>>   		nouveau_bo_ref(NULL, &chunk->bo);
>>> +		WARN_ON(chunk->callocated);
>>>   		list_del(&chunk->list);
>>>   		memunmap_pages(&chunk->pagemap);
>>>   		release_mem_region(chunk->pagemap.range.start,

WARNING: multiple messages have this Message-ID (diff)

From: Alistair Popple <apopple@nvidia.com>
To: Felix Kuehling <felix.kuehling@amd.com>
Cc: "Alex Sierra" <alex.sierra@amd.com>,
	"Karol Herbst" <kherbst@redhat.com>,
	"David Airlie" <airlied@linux.ie>,
	nouveau@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
	linux-mm@kvack.org, amd-gfx@lists.freedesktop.org,
	"Matthew Wilcox (Oracle)" <willy@infradead.org>,
	"Ben Skeggs" <bskeggs@redhat.com>,
	"Jason Gunthorpe" <jgg@nvidia.com>,
	"Ralph Campbell" <rcampbell@nvidia.com>,
	"Lyude Paul" <lyude@redhat.com>,
	"John Hubbard" <jhubbard@nvidia.com>,
	"Nicholas Piggin" <npiggin@gmail.com>,
	"Dan Williams" <dan.j.williams@intel.com>,
	"Pan, Xinhui" <Xinhui.Pan@amd.com>,
	linux-kernel@vger.kernel.org, "Daniel Vetter" <daniel@ffwll.ch>,
	"Alex Deucher" <alexander.deucher@amd.com>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	linuxppc-dev@lists.ozlabs.org,
	"Christian König" <christian.koenig@amd.com>
Subject: Re: [PATCH 6/7] nouveau/dmem: Evict device private memory during release
Date: Tue, 27 Sep 2022 11:39:57 +1000	[thread overview]
Message-ID: <87bkr1lh3a.fsf@nvdebian.thelocal> (raw)
In-Reply-To: <7ca6ec0c-7e5e-3b24-8f8d-650df357130c@amd.com>


Felix Kuehling <felix.kuehling@amd.com> writes:

> On 2022-09-26 17:35, Lyude Paul wrote:
>> On Mon, 2022-09-26 at 16:03 +1000, Alistair Popple wrote:
>>> When the module is unloaded or a GPU is unbound from the module it is
>>> possible for device private pages to be left mapped in currently running
>>> processes. This leads to a kernel crash when the pages are either freed
>>> or accessed from the CPU because the GPU and associated data structures
>>> and callbacks have all been freed.
>>>
>>> Fix this by migrating any mappings back to normal CPU memory prior to
>>> freeing the GPU memory chunks and associated device private pages.
>>>
>>> Signed-off-by: Alistair Popple <apopple@nvidia.com>
>>>
>>> ---
>>>
>>> I assume the AMD driver might have a similar issue. However I can't see
>>> where device private (or coherent) pages actually get unmapped/freed
>>> during teardown as I couldn't find any relevant calls to
>>> devm_memunmap(), memunmap(), devm_release_mem_region() or
>>> release_mem_region(). So it appears that ZONE_DEVICE pages are not being
>>> properly freed during module unload, unless I'm missing something?
>> I've got no idea, will poke Ben to see if they know the answer to this
>
> I guess we're relying on devm to release the region. Isn't the whole point of
> using devm_request_free_mem_region that we don't have to remember to explicitly
> release it when the device gets destroyed? I believe we had an explicit free
> call at some point by mistake, and that caused a double-free during module
> unload. See this commit for reference:

Argh, thanks for that pointer. I was not so familiar with
devm_request_free_mem_region()/devm_memremap_pages() as currently
Nouveau explicitly manages that itself.

> commit 22f4f4faf337d5fb2d2750aff13215726814273e
> Author: Philip Yang <Philip.Yang@amd.com>
> Date:   Mon Sep 20 17:25:52 2021 -0400
>
>     drm/amdkfd: fix svm_migrate_fini warning
>          Device manager releases device-specific resources when a driver
>     disconnects from a device, devm_memunmap_pages and
>     devm_release_mem_region calls in svm_migrate_fini are redundant.
>          It causes below warning trace after patch "drm/amdgpu: Split
>     amdgpu_device_fini into early and late", so remove function
>     svm_migrate_fini.
>          BUG: https://gitlab.freedesktop.org/drm/amd/-/issues/1718
>          WARNING: CPU: 1 PID: 3646 at drivers/base/devres.c:795
>     devm_release_action+0x51/0x60
>     Call Trace:
>         ? memunmap_pages+0x360/0x360
>         svm_migrate_fini+0x2d/0x60 [amdgpu]
>         kgd2kfd_device_exit+0x23/0xa0 [amdgpu]
>         amdgpu_amdkfd_device_fini_sw+0x1d/0x30 [amdgpu]
>         amdgpu_device_fini_sw+0x45/0x290 [amdgpu]
>         amdgpu_driver_release_kms+0x12/0x30 [amdgpu]
>         drm_dev_release+0x20/0x40 [drm]
>         release_nodes+0x196/0x1e0
>         device_release_driver_internal+0x104/0x1d0
>         driver_detach+0x47/0x90
>         bus_remove_driver+0x7a/0xd0
>         pci_unregister_driver+0x3d/0x90
>         amdgpu_exit+0x11/0x20 [amdgpu]
>          Signed-off-by: Philip Yang <Philip.Yang@amd.com>
>     Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
>     Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
>
> Furthermore, I guess we are assuming that nobody is using the GPU when the
> module is unloaded. As long as any processes have /dev/kfd open, you won't be
> able to unload the module (except by force-unload). I suppose with ZONE_DEVICE
> memory, we can have references to device memory pages even when user mode has
> closed /dev/kfd. We do have a cleanup handler that runs in an MMU-free-notifier.
> In theory that should run after all the pages in the mm_struct have been freed.
> It releases all sorts of other device resources and needs the driver to still be
> there. I'm not sure if there is anything preventing a module unload before the
> free-notifier runs. I'll look into that.

Right - module unload (or device unbind) is one of the other ways we can
hit this issue in Nouveau at least. You can end up with ZONE_DEVICE
pages mapped in a running process after the module has unloaded.
Although now you mention it that seems a bit wrong - the pgmap refcount
should provide some protection against that. Will have to look into
that too.

> Regards,
>   Felix
>
>
>>
>>> ---
>>>   drivers/gpu/drm/nouveau/nouveau_dmem.c | 48 +++++++++++++++++++++++++++-
>>>   1 file changed, 48 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c
>>> index 66ebbd4..3b247b8 100644
>>> --- a/drivers/gpu/drm/nouveau/nouveau_dmem.c
>>> +++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c
>>> @@ -369,6 +369,52 @@ nouveau_dmem_suspend(struct nouveau_drm *drm)
>>>   	mutex_unlock(&drm->dmem->mutex);
>>>   }
>>>   +/*
>>> + * Evict all pages mapping a chunk.
>>> + */
>>> +void
>>> +nouveau_dmem_evict_chunk(struct nouveau_dmem_chunk *chunk)
>>> +{
>>> +	unsigned long i, npages = range_len(&chunk->pagemap.range) >> PAGE_SHIFT;
>>> +	unsigned long *src_pfns, *dst_pfns;
>>> +	dma_addr_t *dma_addrs;
>>> +	struct nouveau_fence *fence;
>>> +
>>> +	src_pfns = kcalloc(npages, sizeof(*src_pfns), GFP_KERNEL);
>>> +	dst_pfns = kcalloc(npages, sizeof(*dst_pfns), GFP_KERNEL);
>>> +	dma_addrs = kcalloc(npages, sizeof(*dma_addrs), GFP_KERNEL);
>>> +
>>> +	migrate_device_range(src_pfns, chunk->pagemap.range.start >> PAGE_SHIFT,
>>> +			npages);
>>> +
>>> +	for (i = 0; i < npages; i++) {
>>> +		if (src_pfns[i] & MIGRATE_PFN_MIGRATE) {
>>> +			struct page *dpage;
>>> +
>>> +			/*
>>> +			 * _GFP_NOFAIL because the GPU is going away and there
>>> +			 * is nothing sensible we can do if we can't copy the
>>> +			 * data back.
>>> +			 */
>> You'll have to excuse me for a moment since this area of nouveau isn't one of
>> my strongpoints, but are we sure about this? IIRC __GFP_NOFAIL means infinite
>> retry, in the case of a GPU hotplug event I would assume we would rather just
>> stop trying to migrate things to the GPU and just drop the data instead of
>> hanging on infinite retries.
>>
>>> +			dpage = alloc_page(GFP_HIGHUSER | __GFP_NOFAIL);
>>> +			dst_pfns[i] = migrate_pfn(page_to_pfn(dpage));
>>> +			nouveau_dmem_copy_one(chunk->drm,
>>> +					migrate_pfn_to_page(src_pfns[i]), dpage,
>>> +					&dma_addrs[i]);
>>> +		}
>>> +	}
>>> +
>>> +	nouveau_fence_new(chunk->drm->dmem->migrate.chan, false, &fence);
>>> +	migrate_device_pages(src_pfns, dst_pfns, npages);
>>> +	nouveau_dmem_fence_done(&fence);
>>> +	migrate_device_finalize(src_pfns, dst_pfns, npages);
>>> +	kfree(src_pfns);
>>> +	kfree(dst_pfns);
>>> +	for (i = 0; i < npages; i++)
>>> +		dma_unmap_page(chunk->drm->dev->dev, dma_addrs[i], PAGE_SIZE, DMA_BIDIRECTIONAL);
>>> +	kfree(dma_addrs);
>>> +}
>>> +
>>>   void
>>>   nouveau_dmem_fini(struct nouveau_drm *drm)
>>>   {
>>> @@ -380,8 +426,10 @@ nouveau_dmem_fini(struct nouveau_drm *drm)
>>>   	mutex_lock(&drm->dmem->mutex);
>>>     	list_for_each_entry_safe(chunk, tmp, &drm->dmem->chunks, list) {
>>> +		nouveau_dmem_evict_chunk(chunk);
>>>   		nouveau_bo_unpin(chunk->bo);
>>>   		nouveau_bo_ref(NULL, &chunk->bo);
>>> +		WARN_ON(chunk->callocated);
>>>   		list_del(&chunk->list);
>>>   		memunmap_pages(&chunk->pagemap);
>>>   		release_mem_region(chunk->pagemap.range.start,

WARNING: multiple messages have this Message-ID (diff)

From: Alistair Popple <apopple@nvidia.com>
To: Felix Kuehling <felix.kuehling@amd.com>
Cc: "Alex Sierra" <alex.sierra@amd.com>,
	"Karol Herbst" <kherbst@redhat.com>,
	"David Airlie" <airlied@linux.ie>,
	nouveau@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
	linux-mm@kvack.org, amd-gfx@lists.freedesktop.org,
	"Michael Ellerman" <mpe@ellerman.id.au>,
	"Matthew Wilcox (Oracle)" <willy@infradead.org>,
	"Ben Skeggs" <bskeggs@redhat.com>,
	"Jason Gunthorpe" <jgg@nvidia.com>,
	"Ralph Campbell" <rcampbell@nvidia.com>,
	"John Hubbard" <jhubbard@nvidia.com>,
	"Nicholas Piggin" <npiggin@gmail.com>,
	"Dan Williams" <dan.j.williams@intel.com>,
	"Pan, Xinhui" <Xinhui.Pan@amd.com>,
	linux-kernel@vger.kernel.org,
	"Alex Deucher" <alexander.deucher@amd.com>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	linuxppc-dev@lists.ozlabs.org,
	"Christian König" <christian.koenig@amd.com>
Subject: Re: [PATCH 6/7] nouveau/dmem: Evict device private memory during release
Date: Tue, 27 Sep 2022 11:39:57 +1000	[thread overview]
Message-ID: <87bkr1lh3a.fsf@nvdebian.thelocal> (raw)
In-Reply-To: <7ca6ec0c-7e5e-3b24-8f8d-650df357130c@amd.com>


Felix Kuehling <felix.kuehling@amd.com> writes:

> On 2022-09-26 17:35, Lyude Paul wrote:
>> On Mon, 2022-09-26 at 16:03 +1000, Alistair Popple wrote:
>>> When the module is unloaded or a GPU is unbound from the module it is
>>> possible for device private pages to be left mapped in currently running
>>> processes. This leads to a kernel crash when the pages are either freed
>>> or accessed from the CPU because the GPU and associated data structures
>>> and callbacks have all been freed.
>>>
>>> Fix this by migrating any mappings back to normal CPU memory prior to
>>> freeing the GPU memory chunks and associated device private pages.
>>>
>>> Signed-off-by: Alistair Popple <apopple@nvidia.com>
>>>
>>> ---
>>>
>>> I assume the AMD driver might have a similar issue. However I can't see
>>> where device private (or coherent) pages actually get unmapped/freed
>>> during teardown as I couldn't find any relevant calls to
>>> devm_memunmap(), memunmap(), devm_release_mem_region() or
>>> release_mem_region(). So it appears that ZONE_DEVICE pages are not being
>>> properly freed during module unload, unless I'm missing something?
>> I've got no idea, will poke Ben to see if they know the answer to this
>
> I guess we're relying on devm to release the region. Isn't the whole point of
> using devm_request_free_mem_region that we don't have to remember to explicitly
> release it when the device gets destroyed? I believe we had an explicit free
> call at some point by mistake, and that caused a double-free during module
> unload. See this commit for reference:

Argh, thanks for that pointer. I was not so familiar with
devm_request_free_mem_region()/devm_memremap_pages() as currently
Nouveau explicitly manages that itself.

> commit 22f4f4faf337d5fb2d2750aff13215726814273e
> Author: Philip Yang <Philip.Yang@amd.com>
> Date:   Mon Sep 20 17:25:52 2021 -0400
>
>     drm/amdkfd: fix svm_migrate_fini warning
>          Device manager releases device-specific resources when a driver
>     disconnects from a device, devm_memunmap_pages and
>     devm_release_mem_region calls in svm_migrate_fini are redundant.
>          It causes below warning trace after patch "drm/amdgpu: Split
>     amdgpu_device_fini into early and late", so remove function
>     svm_migrate_fini.
>          BUG: https://gitlab.freedesktop.org/drm/amd/-/issues/1718
>          WARNING: CPU: 1 PID: 3646 at drivers/base/devres.c:795
>     devm_release_action+0x51/0x60
>     Call Trace:
>         ? memunmap_pages+0x360/0x360
>         svm_migrate_fini+0x2d/0x60 [amdgpu]
>         kgd2kfd_device_exit+0x23/0xa0 [amdgpu]
>         amdgpu_amdkfd_device_fini_sw+0x1d/0x30 [amdgpu]
>         amdgpu_device_fini_sw+0x45/0x290 [amdgpu]
>         amdgpu_driver_release_kms+0x12/0x30 [amdgpu]
>         drm_dev_release+0x20/0x40 [drm]
>         release_nodes+0x196/0x1e0
>         device_release_driver_internal+0x104/0x1d0
>         driver_detach+0x47/0x90
>         bus_remove_driver+0x7a/0xd0
>         pci_unregister_driver+0x3d/0x90
>         amdgpu_exit+0x11/0x20 [amdgpu]
>          Signed-off-by: Philip Yang <Philip.Yang@amd.com>
>     Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
>     Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
>
> Furthermore, I guess we are assuming that nobody is using the GPU when the
> module is unloaded. As long as any processes have /dev/kfd open, you won't be
> able to unload the module (except by force-unload). I suppose with ZONE_DEVICE
> memory, we can have references to device memory pages even when user mode has
> closed /dev/kfd. We do have a cleanup handler that runs in an MMU-free-notifier.
> In theory that should run after all the pages in the mm_struct have been freed.
> It releases all sorts of other device resources and needs the driver to still be
> there. I'm not sure if there is anything preventing a module unload before the
> free-notifier runs. I'll look into that.

Right - module unload (or device unbind) is one of the other ways we can
hit this issue in Nouveau at least. You can end up with ZONE_DEVICE
pages mapped in a running process after the module has unloaded.
Although now you mention it that seems a bit wrong - the pgmap refcount
should provide some protection against that. Will have to look into
that too.

> Regards,
>   Felix
>
>
>>
>>> ---
>>>   drivers/gpu/drm/nouveau/nouveau_dmem.c | 48 +++++++++++++++++++++++++++-
>>>   1 file changed, 48 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c
>>> index 66ebbd4..3b247b8 100644
>>> --- a/drivers/gpu/drm/nouveau/nouveau_dmem.c
>>> +++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c
>>> @@ -369,6 +369,52 @@ nouveau_dmem_suspend(struct nouveau_drm *drm)
>>>   	mutex_unlock(&drm->dmem->mutex);
>>>   }
>>>   +/*
>>> + * Evict all pages mapping a chunk.
>>> + */
>>> +void
>>> +nouveau_dmem_evict_chunk(struct nouveau_dmem_chunk *chunk)
>>> +{
>>> +	unsigned long i, npages = range_len(&chunk->pagemap.range) >> PAGE_SHIFT;
>>> +	unsigned long *src_pfns, *dst_pfns;
>>> +	dma_addr_t *dma_addrs;
>>> +	struct nouveau_fence *fence;
>>> +
>>> +	src_pfns = kcalloc(npages, sizeof(*src_pfns), GFP_KERNEL);
>>> +	dst_pfns = kcalloc(npages, sizeof(*dst_pfns), GFP_KERNEL);
>>> +	dma_addrs = kcalloc(npages, sizeof(*dma_addrs), GFP_KERNEL);
>>> +
>>> +	migrate_device_range(src_pfns, chunk->pagemap.range.start >> PAGE_SHIFT,
>>> +			npages);
>>> +
>>> +	for (i = 0; i < npages; i++) {
>>> +		if (src_pfns[i] & MIGRATE_PFN_MIGRATE) {
>>> +			struct page *dpage;
>>> +
>>> +			/*
>>> +			 * _GFP_NOFAIL because the GPU is going away and there
>>> +			 * is nothing sensible we can do if we can't copy the
>>> +			 * data back.
>>> +			 */
>> You'll have to excuse me for a moment since this area of nouveau isn't one of
>> my strongpoints, but are we sure about this? IIRC __GFP_NOFAIL means infinite
>> retry, in the case of a GPU hotplug event I would assume we would rather just
>> stop trying to migrate things to the GPU and just drop the data instead of
>> hanging on infinite retries.
>>
>>> +			dpage = alloc_page(GFP_HIGHUSER | __GFP_NOFAIL);
>>> +			dst_pfns[i] = migrate_pfn(page_to_pfn(dpage));
>>> +			nouveau_dmem_copy_one(chunk->drm,
>>> +					migrate_pfn_to_page(src_pfns[i]), dpage,
>>> +					&dma_addrs[i]);
>>> +		}
>>> +	}
>>> +
>>> +	nouveau_fence_new(chunk->drm->dmem->migrate.chan, false, &fence);
>>> +	migrate_device_pages(src_pfns, dst_pfns, npages);
>>> +	nouveau_dmem_fence_done(&fence);
>>> +	migrate_device_finalize(src_pfns, dst_pfns, npages);
>>> +	kfree(src_pfns);
>>> +	kfree(dst_pfns);
>>> +	for (i = 0; i < npages; i++)
>>> +		dma_unmap_page(chunk->drm->dev->dev, dma_addrs[i], PAGE_SIZE, DMA_BIDIRECTIONAL);
>>> +	kfree(dma_addrs);
>>> +}
>>> +
>>>   void
>>>   nouveau_dmem_fini(struct nouveau_drm *drm)
>>>   {
>>> @@ -380,8 +426,10 @@ nouveau_dmem_fini(struct nouveau_drm *drm)
>>>   	mutex_lock(&drm->dmem->mutex);
>>>     	list_for_each_entry_safe(chunk, tmp, &drm->dmem->chunks, list) {
>>> +		nouveau_dmem_evict_chunk(chunk);
>>>   		nouveau_bo_unpin(chunk->bo);
>>>   		nouveau_bo_ref(NULL, &chunk->bo);
>>> +		WARN_ON(chunk->callocated);
>>>   		list_del(&chunk->list);
>>>   		memunmap_pages(&chunk->pagemap);
>>>   		release_mem_region(chunk->pagemap.range.start,

WARNING: multiple messages have this Message-ID (diff)

From: Alistair Popple <apopple@nvidia.com>
To: Felix Kuehling <felix.kuehling@amd.com>
Cc: "Lyude Paul" <lyude@redhat.com>,
	linux-mm@kvack.org, "Andrew Morton" <akpm@linux-foundation.org>,
	"Michael Ellerman" <mpe@ellerman.id.au>,
	"Nicholas Piggin" <npiggin@gmail.com>,
	"Alex Deucher" <alexander.deucher@amd.com>,
	"Christian König" <christian.koenig@amd.com>,
	"Pan, Xinhui" <Xinhui.Pan@amd.com>,
	"David Airlie" <airlied@linux.ie>,
	"Daniel Vetter" <daniel@ffwll.ch>,
	"Ben Skeggs" <bskeggs@redhat.com>,
	"Karol Herbst" <kherbst@redhat.com>,
	"Ralph Campbell" <rcampbell@nvidia.com>,
	"Matthew Wilcox (Oracle)" <willy@infradead.org>,
	"Alex Sierra" <alex.sierra@amd.com>,
	"John Hubbard" <jhubbard@nvidia.com>,
	linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org,
	amd-gfx@lists.freedesktop.org, nouveau@lists.freedesktop.org,
	dri-devel@lists.freedesktop.org,
	"Jason Gunthorpe" <jgg@nvidia.com>,
	"Dan Williams" <dan.j.williams@intel.com>
Subject: Re: [PATCH 6/7] nouveau/dmem: Evict device private memory during release
Date: Tue, 27 Sep 2022 11:39:57 +1000	[thread overview]
Message-ID: <87bkr1lh3a.fsf@nvdebian.thelocal> (raw)
In-Reply-To: <7ca6ec0c-7e5e-3b24-8f8d-650df357130c@amd.com>


Felix Kuehling <felix.kuehling@amd.com> writes:

> On 2022-09-26 17:35, Lyude Paul wrote:
>> On Mon, 2022-09-26 at 16:03 +1000, Alistair Popple wrote:
>>> When the module is unloaded or a GPU is unbound from the module it is
>>> possible for device private pages to be left mapped in currently running
>>> processes. This leads to a kernel crash when the pages are either freed
>>> or accessed from the CPU because the GPU and associated data structures
>>> and callbacks have all been freed.
>>>
>>> Fix this by migrating any mappings back to normal CPU memory prior to
>>> freeing the GPU memory chunks and associated device private pages.
>>>
>>> Signed-off-by: Alistair Popple <apopple@nvidia.com>
>>>
>>> ---
>>>
>>> I assume the AMD driver might have a similar issue. However I can't see
>>> where device private (or coherent) pages actually get unmapped/freed
>>> during teardown as I couldn't find any relevant calls to
>>> devm_memunmap(), memunmap(), devm_release_mem_region() or
>>> release_mem_region(). So it appears that ZONE_DEVICE pages are not being
>>> properly freed during module unload, unless I'm missing something?
>> I've got no idea, will poke Ben to see if they know the answer to this
>
> I guess we're relying on devm to release the region. Isn't the whole point of
> using devm_request_free_mem_region that we don't have to remember to explicitly
> release it when the device gets destroyed? I believe we had an explicit free
> call at some point by mistake, and that caused a double-free during module
> unload. See this commit for reference:

Argh, thanks for that pointer. I was not so familiar with
devm_request_free_mem_region()/devm_memremap_pages() as currently
Nouveau explicitly manages that itself.

> commit 22f4f4faf337d5fb2d2750aff13215726814273e
> Author: Philip Yang <Philip.Yang@amd.com>
> Date:   Mon Sep 20 17:25:52 2021 -0400
>
>     drm/amdkfd: fix svm_migrate_fini warning
>          Device manager releases device-specific resources when a driver
>     disconnects from a device, devm_memunmap_pages and
>     devm_release_mem_region calls in svm_migrate_fini are redundant.
>          It causes below warning trace after patch "drm/amdgpu: Split
>     amdgpu_device_fini into early and late", so remove function
>     svm_migrate_fini.
>          BUG: https://gitlab.freedesktop.org/drm/amd/-/issues/1718
>          WARNING: CPU: 1 PID: 3646 at drivers/base/devres.c:795
>     devm_release_action+0x51/0x60
>     Call Trace:
>         ? memunmap_pages+0x360/0x360
>         svm_migrate_fini+0x2d/0x60 [amdgpu]
>         kgd2kfd_device_exit+0x23/0xa0 [amdgpu]
>         amdgpu_amdkfd_device_fini_sw+0x1d/0x30 [amdgpu]
>         amdgpu_device_fini_sw+0x45/0x290 [amdgpu]
>         amdgpu_driver_release_kms+0x12/0x30 [amdgpu]
>         drm_dev_release+0x20/0x40 [drm]
>         release_nodes+0x196/0x1e0
>         device_release_driver_internal+0x104/0x1d0
>         driver_detach+0x47/0x90
>         bus_remove_driver+0x7a/0xd0
>         pci_unregister_driver+0x3d/0x90
>         amdgpu_exit+0x11/0x20 [amdgpu]
>          Signed-off-by: Philip Yang <Philip.Yang@amd.com>
>     Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
>     Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
>
> Furthermore, I guess we are assuming that nobody is using the GPU when the
> module is unloaded. As long as any processes have /dev/kfd open, you won't be
> able to unload the module (except by force-unload). I suppose with ZONE_DEVICE
> memory, we can have references to device memory pages even when user mode has
> closed /dev/kfd. We do have a cleanup handler that runs in an MMU-free-notifier.
> In theory that should run after all the pages in the mm_struct have been freed.
> It releases all sorts of other device resources and needs the driver to still be
> there. I'm not sure if there is anything preventing a module unload before the
> free-notifier runs. I'll look into that.

Right - module unload (or device unbind) is one of the other ways we can
hit this issue in Nouveau at least. You can end up with ZONE_DEVICE
pages mapped in a running process after the module has unloaded.
Although now you mention it that seems a bit wrong - the pgmap refcount
should provide some protection against that. Will have to look into
that too.

> Regards,
>   Felix
>
>
>>
>>> ---
>>>   drivers/gpu/drm/nouveau/nouveau_dmem.c | 48 +++++++++++++++++++++++++++-
>>>   1 file changed, 48 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c
>>> index 66ebbd4..3b247b8 100644
>>> --- a/drivers/gpu/drm/nouveau/nouveau_dmem.c
>>> +++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c
>>> @@ -369,6 +369,52 @@ nouveau_dmem_suspend(struct nouveau_drm *drm)
>>>   	mutex_unlock(&drm->dmem->mutex);
>>>   }
>>>   +/*
>>> + * Evict all pages mapping a chunk.
>>> + */
>>> +void
>>> +nouveau_dmem_evict_chunk(struct nouveau_dmem_chunk *chunk)
>>> +{
>>> +	unsigned long i, npages = range_len(&chunk->pagemap.range) >> PAGE_SHIFT;
>>> +	unsigned long *src_pfns, *dst_pfns;
>>> +	dma_addr_t *dma_addrs;
>>> +	struct nouveau_fence *fence;
>>> +
>>> +	src_pfns = kcalloc(npages, sizeof(*src_pfns), GFP_KERNEL);
>>> +	dst_pfns = kcalloc(npages, sizeof(*dst_pfns), GFP_KERNEL);
>>> +	dma_addrs = kcalloc(npages, sizeof(*dma_addrs), GFP_KERNEL);
>>> +
>>> +	migrate_device_range(src_pfns, chunk->pagemap.range.start >> PAGE_SHIFT,
>>> +			npages);
>>> +
>>> +	for (i = 0; i < npages; i++) {
>>> +		if (src_pfns[i] & MIGRATE_PFN_MIGRATE) {
>>> +			struct page *dpage;
>>> +
>>> +			/*
>>> +			 * _GFP_NOFAIL because the GPU is going away and there
>>> +			 * is nothing sensible we can do if we can't copy the
>>> +			 * data back.
>>> +			 */
>> You'll have to excuse me for a moment since this area of nouveau isn't one of
>> my strongpoints, but are we sure about this? IIRC __GFP_NOFAIL means infinite
>> retry, in the case of a GPU hotplug event I would assume we would rather just
>> stop trying to migrate things to the GPU and just drop the data instead of
>> hanging on infinite retries.
>>
>>> +			dpage = alloc_page(GFP_HIGHUSER | __GFP_NOFAIL);
>>> +			dst_pfns[i] = migrate_pfn(page_to_pfn(dpage));
>>> +			nouveau_dmem_copy_one(chunk->drm,
>>> +					migrate_pfn_to_page(src_pfns[i]), dpage,
>>> +					&dma_addrs[i]);
>>> +		}
>>> +	}
>>> +
>>> +	nouveau_fence_new(chunk->drm->dmem->migrate.chan, false, &fence);
>>> +	migrate_device_pages(src_pfns, dst_pfns, npages);
>>> +	nouveau_dmem_fence_done(&fence);
>>> +	migrate_device_finalize(src_pfns, dst_pfns, npages);
>>> +	kfree(src_pfns);
>>> +	kfree(dst_pfns);
>>> +	for (i = 0; i < npages; i++)
>>> +		dma_unmap_page(chunk->drm->dev->dev, dma_addrs[i], PAGE_SIZE, DMA_BIDIRECTIONAL);
>>> +	kfree(dma_addrs);
>>> +}
>>> +
>>>   void
>>>   nouveau_dmem_fini(struct nouveau_drm *drm)
>>>   {
>>> @@ -380,8 +426,10 @@ nouveau_dmem_fini(struct nouveau_drm *drm)
>>>   	mutex_lock(&drm->dmem->mutex);
>>>     	list_for_each_entry_safe(chunk, tmp, &drm->dmem->chunks, list) {
>>> +		nouveau_dmem_evict_chunk(chunk);
>>>   		nouveau_bo_unpin(chunk->bo);
>>>   		nouveau_bo_ref(NULL, &chunk->bo);
>>> +		WARN_ON(chunk->callocated);
>>>   		list_del(&chunk->list);
>>>   		memunmap_pages(&chunk->pagemap);
>>>   		release_mem_region(chunk->pagemap.range.start,

next prev parent reply	other threads:[~2022-09-27  7:35 UTC|newest]

Thread overview: 130+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-26  6:03 [PATCH 0/7] Fix several device private page reference counting issues Alistair Popple
2022-09-26  6:03 ` Alistair Popple
2022-09-26  6:03 ` Alistair Popple
2022-09-26  6:03 ` Alistair Popple
2022-09-26  6:03 ` [Nouveau] " Alistair Popple
2022-09-26  6:03 ` [PATCH 1/7] mm/memory.c: Fix race when faulting a device private page Alistair Popple
2022-09-26  6:03   ` Alistair Popple
2022-09-26  6:03   ` Alistair Popple
2022-09-26  6:03   ` Alistair Popple
2022-09-26  6:03   ` [Nouveau] " Alistair Popple
2022-09-29  0:07   ` Michael Ellerman
2022-09-29  0:07     ` Michael Ellerman
2022-09-29  0:07     ` Michael Ellerman
2022-09-29  0:07     ` Michael Ellerman
2022-09-29  0:07     ` [Nouveau] " Michael Ellerman
2022-09-29  1:40     ` Alistair Popple
2022-09-29  1:40       ` Alistair Popple
2022-09-29  1:40       ` Alistair Popple
2022-09-29  1:40       ` Alistair Popple
2022-09-29  1:40       ` [Nouveau] " Alistair Popple
2022-09-29  5:07       ` Michael Ellerman
2022-09-29  5:07         ` Michael Ellerman
2022-09-29  5:07         ` Michael Ellerman
2022-09-29  5:07         ` Michael Ellerman
2022-09-29  5:07         ` [Nouveau] " Michael Ellerman
2022-09-26  6:03 ` [PATCH 2/7] mm: Free device private pages have zero refcount Alistair Popple
2022-09-26  6:03   ` Alistair Popple
2022-09-26  6:03   ` Alistair Popple
2022-09-26  6:03   ` Alistair Popple
2022-09-26  6:03   ` [Nouveau] " Alistair Popple
2022-09-26 14:36   ` Jason Gunthorpe
2022-09-26 14:36     ` Jason Gunthorpe
2022-09-26 14:36     ` Jason Gunthorpe
2022-09-26 14:36     ` Jason Gunthorpe
2022-09-26 14:36     ` [Nouveau] " Jason Gunthorpe
2022-09-27  2:06     ` Alistair Popple
2022-09-27  2:06       ` Alistair Popple
2022-09-27  2:06       ` Alistair Popple
2022-09-27  2:06       ` Alistair Popple
2022-09-27  2:06       ` [Nouveau] " Alistair Popple
2022-09-29 20:18       ` Dan Williams
2022-09-29 20:18         ` Dan Williams
2022-09-29 20:18         ` Dan Williams
2022-09-29 20:18         ` Dan Williams
2022-09-29 20:18         ` [Nouveau] " Dan Williams
2022-09-30  0:45         ` Alistair Popple
2022-09-30  0:45           ` Alistair Popple
2022-09-30  0:45           ` Alistair Popple
2022-09-30  0:45           ` Alistair Popple
2022-09-30  0:45           ` [Nouveau] " Alistair Popple
2022-09-30  1:49           ` Dan Williams
2022-09-30  1:49             ` Dan Williams
2022-09-30  1:49             ` Dan Williams
2022-09-30  1:49             ` Dan Williams
2022-09-30  1:49             ` [Nouveau] " Dan Williams
2022-09-26  6:03 ` [PATCH 3/7] mm/migrate_device.c: Refactor migrate_vma and migrate_deivce_coherent_page() Alistair Popple
2022-09-26  6:03   ` Alistair Popple
2022-09-26  6:03   ` Alistair Popple
2022-09-26  6:03   ` Alistair Popple
2022-09-26  6:03   ` [Nouveau] " Alistair Popple
2022-09-26  6:03 ` [PATCH 4/7] mm/migrate_device.c: Add migrate_device_range() Alistair Popple
2022-09-26  6:03   ` Alistair Popple
2022-09-26  6:03   ` Alistair Popple
2022-09-26  6:03   ` Alistair Popple
2022-09-26  6:03   ` [Nouveau] " Alistair Popple
2022-09-26  6:03 ` [PATCH 5/7] nouveau/dmem: Refactor nouveau_dmem_fault_copy_one() Alistair Popple
2022-09-26  6:03   ` Alistair Popple
2022-09-26  6:03   ` Alistair Popple
2022-09-26  6:03   ` Alistair Popple
2022-09-26  6:03   ` [Nouveau] " Alistair Popple
2022-09-26 21:29   ` Lyude Paul
2022-09-26 21:29     ` Lyude Paul
2022-09-26 21:29     ` Lyude Paul
2022-09-26 21:29     ` Lyude Paul
2022-09-26 21:29     ` [Nouveau] " Lyude Paul
2022-09-28 11:30     ` Alistair Popple
2022-09-28 11:30       ` Alistair Popple
2022-09-28 11:30       ` Alistair Popple
2022-09-28 11:30       ` Alistair Popple
2022-09-28 11:30       ` [Nouveau] " Alistair Popple
2022-09-26  6:03 ` [PATCH 6/7] nouveau/dmem: Evict device private memory during release Alistair Popple
2022-09-26  6:03   ` Alistair Popple
2022-09-26  6:03   ` Alistair Popple
2022-09-26  6:03   ` Alistair Popple
2022-09-26  6:03   ` [Nouveau] " Alistair Popple
2022-09-26 13:28   ` kernel test robot
2022-09-26 13:28     ` kernel test robot
2022-09-26 13:28     ` kernel test robot
2022-09-26 13:28     ` kernel test robot
2022-09-26 13:28     ` [Nouveau] " kernel test robot
2022-09-26 21:35   ` Lyude Paul
2022-09-26 21:35     ` Lyude Paul
2022-09-26 21:35     ` Lyude Paul
2022-09-26 21:35     ` Lyude Paul
2022-09-26 21:35     ` [Nouveau] " Lyude Paul
2022-09-26 22:14     ` John Hubbard
2022-09-26 22:14       ` John Hubbard
2022-09-26 22:14       ` John Hubbard
2022-09-26 22:14       ` John Hubbard
2022-09-26 22:14       ` [Nouveau] " John Hubbard
2022-09-26 23:45       ` Alistair Popple
2022-09-26 23:45         ` Alistair Popple
2022-09-26 23:45         ` Alistair Popple
2022-09-26 23:45         ` Alistair Popple
2022-09-26 23:45         ` [Nouveau] " Alistair Popple
2022-09-28 21:39         ` Lyude Paul
2022-09-28 21:39           ` Lyude Paul
2022-09-28 21:39           ` Lyude Paul
2022-09-28 21:39           ` Lyude Paul
2022-09-28 21:39           ` [Nouveau] " Lyude Paul
2022-09-26 23:07     ` Felix Kuehling
2022-09-26 23:07       ` Felix Kuehling
2022-09-26 23:07       ` Felix Kuehling
2022-09-26 23:07       ` Felix Kuehling
2022-09-26 23:07       ` [Nouveau] " Felix Kuehling
2022-09-27  1:39       ` Alistair Popple [this message]
2022-09-27  1:39         ` Alistair Popple
2022-09-27  1:39         ` Alistair Popple
2022-09-27  1:39         ` Alistair Popple
2022-09-27  1:39         ` [Nouveau] " Alistair Popple
2022-09-28 21:23         ` Lyude Paul
2022-09-28 21:23           ` Lyude Paul
2022-09-28 21:23           ` Lyude Paul
2022-09-28 21:23           ` Lyude Paul
2022-09-28 21:23           ` [Nouveau] " Lyude Paul
2022-09-26  6:03 ` [PATCH 7/7] hmm-tests: Add test for migrate_device_range() Alistair Popple
2022-09-26  6:03   ` Alistair Popple
2022-09-26  6:03   ` Alistair Popple
2022-09-26  6:03   ` Alistair Popple
2022-09-26  6:03   ` [Nouveau] " Alistair Popple

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87bkr1lh3a.fsf@nvdebian.thelocal \
    --to=apopple@nvidia.com \
    --cc=Xinhui.Pan@amd.com \
    --cc=airlied@linux.ie \
    --cc=akpm@linux-foundation.org \
    --cc=alex.sierra@amd.com \
    --cc=alexander.deucher@amd.com \
    --cc=amd-gfx@lists.freedesktop.org \
    --cc=bskeggs@redhat.com \
    --cc=christian.koenig@amd.com \
    --cc=dan.j.williams@intel.com \
    --cc=daniel@ffwll.ch \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=felix.kuehling@amd.com \
    --cc=jgg@nvidia.com \
    --cc=jhubbard@nvidia.com \
    --cc=kherbst@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=mpe@ellerman.id.au \
    --cc=nouveau@lists.freedesktop.org \
    --cc=npiggin@gmail.com \
    --cc=rcampbell@nvidia.com \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.