Am 17.05.2017 um 10:01 schrieb Michel Dänzer:I can re-paste the Vulkan APIs requiring ENODEV:
On 17/05/17 04:13 PM, zhoucm1 wrote:
On 2017年05月17日 14:57, Michel Dänzer wrote:Essentially, your patches only track VRAM contents being lost per file
On 17/05/17 01:28 PM, zhoucm1 wrote:When vram is lost, we treat all contexts need to reset.
On 2017年05月17日 11:15, Michel Dänzer wrote:Still not sure what you mean by that.
On 17/05/17 12:04 PM, zhoucm1 wrote:Yes, but it really depends on userspace requirement, if you need to
On 2017年05月17日 09:18, Michel Dänzer wrote:I'm not following. With vram_lost_counter in amdgpu_fpriv, if any
On 16/05/17 06:25 PM, Chunming Zhou wrote:struct amdgpu_fpriv for vram_lost_counter is proper place, especially
Change-Id: I8eb6d7f558da05510e429d3bf1d48c8cec6c1977How do you envision the UMDs using this? I can mostly think of them
Signed-off-by: Chunming Zhou <David1.Zhou-5C7GfCeVMHo@public.gmane.org>
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index bca1fb5..f3e7525 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -2547,6 +2547,9 @@ int amdgpu_vm_ioctl(struct drm_device *dev,
void *data, struct drm_file *filp)
case AMDGPU_VM_OP_UNRESERVE_VMID:
amdgpu_vm_free_reserved_vmid(adev, &fpriv->vm,
AMDGPU_GFXHUB);
break;
+ case AMDGPU_VM_OP_RESET:
+ fpriv->vram_lost_counter =
atomic_read(&adev->vram_lost_counter);
+ break;
calling this ioctl when a context is created or destroyed. But that
would also allow any other remaining contexts using the same DRM file
descriptor to use all ioctls again. So, I think there needs to be a
vram_lost_counter in struct amdgpu_ctx instead of in struct
amdgpu_fpriv.
for ioctl return value.
if you need to reset ctx one by one, we can mark all contexts of that
vm, and then reset by userspace.
context calls this ioctl, all other contexts using the same file
descriptor will also be considered safe again, right?
reset ctx one by one, we can mark all contexts of that vm to guilty, and
then reset one context by userspace.
E.g. what do you mean by "guilty"? I thought that refers to the context
which caused a hang. But it seems like you're using it to refer to any
context which hasn't reacted yet to VRAM contents being lost.
descriptor, not per context. I'm not sure (rather skeptical) that this
is suitable for OpenGL UMDs, since state is usually tracked per context.
Marek / Nicolai?
Oh, yeah that's a good point.
The problem with tracking it per context is that Vulkan also wants the ENODEV on the amdgpu_gem_va_ioct() and amdgpu_info_ioctl() which are context less.
But thinking more about this blocking those two doesn't make much sense. The VM content can be restored and why should be disallow reading GPU info?
The Vulkan APIs listed below could return
VK_ERROR_DEVICE_LOST according to the spec.
I tries to provide a list of u/k interfaces
that could be called for each vk API.
vkCreateDevice
- amdgpu_device_initialize.
- amdgpu_query_gpu_info
vkQueueSubmit
- amdgpu_cs_submit
vkWaitForFences
amdgpu_cs_wait_fences
vkGetEventStatus
vkQueueWaitIdle
vkDeviceWaitIdle
vkGetQueryPoolResults
amdgpu_cs_query_Fence_status
vkQueueBindSparse
amdgpu_bo_va_op
amdgpu_bo_va_op_raw
vkCreateSwapchainKHR
vkAcquireNextImageKHR
vkQueuePresentKHR
Not related with u/k interface.
Christian.