public inbox for amd-gfx@lists.freedesktop.org
From: "Christian König" <ckoenig.leichtzumerken-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
To: Felix Kuehling <felix.kuehling-5C7GfCeVMHo@public.gmane.org>,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
Subject: Re: [PATCH 04/12] drm/amdgpu: move IV prescreening into the GMC code
Date: Wed, 10 Oct 2018 09:08:01 +0200
Message-ID: <b9266d48-7327-165e-797d-2bdc5bda7f0b@gmail.com>
In-Reply-To: <dcf7fd50-e453-8a9d-cdaa-c452d4716e9a-5C7GfCeVMHo@public.gmane.org>

Yeah, exactly my thinking.

Basically, the long-term goal is to move most of the fault reporting and
handling into amdgpu_gmc.c. Otherwise we would duplicate a lot of that
handling for future hardware generations.

On the other hand, if the approach with the second IH ring buffer works
as expected, we most likely won't need the prescreening at all anymore.
But that needs more work before we can be 100% sure.

Christian.

On 10.10.2018 at 01:46, Felix Kuehling wrote:
> I realized that most of the code in gmc_v9_0_prescreen_iv is not
> actually hardware-specific. If it were not prescreening, but instead used
> an amdgpu_iv_entry that was already parsed, I think it could just be a
> generic function for processing retry faults:
>
>    * looking up the VM of a fault
>    * storing retry faults in a per-VM FIFO
>    * dropping faults that have already been seen
>
> In other words, it's just a generic top half interrupt handler for retry
> faults while the bottom half (worker thread) would use the per-VM FIFOs
> to handle those pending retry faults.
>
> Regards,
>    Felix
>
>
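
A minimal userspace sketch of the generic filtering described above: drop
faults that are already tracked, queue new ones in a per-VM FIFO, and let a
bottom half drain that FIFO. Everything here is hypothetical and only meant
as illustration: struct vm_faults, the table and FIFO sizes, the key layout
and the function names are assumptions, not the driver's actual data
structures (the patch uses amdgpu_vm_add_fault() and a kfifo instead).
Locking and the PASID-to-VM lookup are left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sizes, chosen only for this sketch. */
#define SEEN_SLOTS 8	/* capacity of the "already seen" table */
#define FIFO_SLOTS 4	/* capacity of the pending-fault FIFO (power of two) */

/* Hypothetical per-VM state; the driver's struct amdgpu_vm differs. */
struct vm_faults {
	uint64_t seen[SEEN_SLOTS];	/* keys being handled, 0 = free slot */
	uint64_t fifo[FIFO_SLOTS];	/* faults waiting for the bottom half */
	unsigned int head, tail;	/* head: next write, tail: next read */
};

/* Combine a non-zero PASID and a 48-bit address into one key (assumed layout). */
uint64_t fault_key(uint16_t pasid, uint64_t addr)
{
	return ((uint64_t)pasid << 48) | addr;
}

/* Top half: true if the fault should be processed now, false if it is a
 * duplicate or there is no room left to track it.
 */
bool retry_fault_filter(struct vm_faults *vm, uint64_t key)
{
	size_t i, free_slot = SEEN_SLOTS;

	for (i = 0; i < SEEN_SLOTS; i++) {
		if (vm->seen[i] == key)
			return false;	/* already being handled */
		if (vm->seen[i] == 0 && free_slot == SEEN_SLOTS)
			free_slot = i;
	}
	if (free_slot == SEEN_SLOTS)
		return false;		/* table is full */
	if (vm->head - vm->tail == FIFO_SLOTS)
		return false;		/* FIFO is full, record nothing */

	vm->seen[free_slot] = key;
	vm->fifo[vm->head++ % FIFO_SLOTS] = key;
	return true;
}

/* Bottom half: fetch one pending fault; false if nothing is queued. */
bool retry_fault_pop(struct vm_faults *vm, uint64_t *key)
{
	size_t i;

	if (vm->head == vm->tail)
		return false;
	*key = vm->fifo[vm->tail++ % FIFO_SLOTS];
	for (i = 0; i < SEEN_SLOTS; i++)
		if (vm->seen[i] == *key)
			vm->seen[i] = 0;	/* the address may fault again */
	return true;
}
```

With a zero-initialized struct vm_faults, the first retry_fault_filter()
call for a key returns true, a repeat for the same key returns false, and
the key is accepted again once retry_fault_pop() has handed it to the
worker.
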
> On 2018-09-26 09:53 AM, Christian König wrote:
>> The GMC/VM subsystem is causing the faults, so move the handling here as
>> well.
>>
>> Signed-off-by: Christian König <christian.koenig@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c  | 59 +++++++++++++++++++++++++++++
>>   drivers/gpu/drm/amd/amdgpu/vega10_ih.c | 69 ----------------------------------
>>   2 files changed, 59 insertions(+), 69 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>> index 729a2c230f91..f8d69ab85fc3 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>> @@ -244,6 +244,62 @@ static int gmc_v9_0_vm_fault_interrupt_state(struct amdgpu_device *adev,
>>   	return 0;
>>   }
>>   
>> +/**
>> + * gmc_v9_0_prescreen_iv - prescreen an interrupt vector
>> + *
>> + * @adev: amdgpu_device pointer
>> + *
>> + * Returns true if the interrupt vector should be further processed.
>> + */
>> +static bool gmc_v9_0_prescreen_iv(struct amdgpu_device *adev,
>> +				  struct amdgpu_iv_entry *entry,
>> +				  uint64_t addr)
>> +{
>> +	struct amdgpu_vm *vm;
>> +	u64 key;
>> +	int r;
>> +
>> +	/* No PASID, can't identify faulting process */
>> +	if (!entry->pasid)
>> +		return true;
>> +
>> +	/* Not a retry fault */
>> +	if (!(entry->src_data[1] & 0x80))
>> +		return true;
>> +
>> +	/* Track retry faults in per-VM fault FIFO. */
>> +	spin_lock(&adev->vm_manager.pasid_lock);
>> +	vm = idr_find(&adev->vm_manager.pasid_idr, entry->pasid);
>> +	if (!vm) {
>> +		/* VM not found, process it normally */
>> +		spin_unlock(&adev->vm_manager.pasid_lock);
>> +		return true;
>> +	}
>> +
>> +	key = AMDGPU_VM_FAULT(entry->pasid, addr);
>> +	r = amdgpu_vm_add_fault(vm->fault_hash, key);
>> +
>> +	/* Hash table is full or the fault is already being processed,
>> +	 * ignore further page faults
>> +	 */
>> +	if (r != 0) {
>> +		spin_unlock(&adev->vm_manager.pasid_lock);
>> +		return false;
>> +	}
>> +	/* No locking required with single writer and single reader */
>> +	r = kfifo_put(&vm->faults, key);
>> +	if (!r) {
>> +		/* FIFO is full. Ignore it until there is space */
>> +		amdgpu_vm_clear_fault(vm->fault_hash, key);
>> +		spin_unlock(&adev->vm_manager.pasid_lock);
>> +		return false;
>> +	}
>> +
>> +	spin_unlock(&adev->vm_manager.pasid_lock);
>> +	/* It's the first fault for this address, process it normally */
>> +	return true;
>> +}
>> +
>>   static int gmc_v9_0_process_interrupt(struct amdgpu_device *adev,
>>   				struct amdgpu_irq_src *source,
>>   				struct amdgpu_iv_entry *entry)
>> @@ -255,6 +311,9 @@ static int gmc_v9_0_process_interrupt(struct amdgpu_device *adev,
>>   	addr = (u64)entry->src_data[0] << 12;
>>   	addr |= ((u64)entry->src_data[1] & 0xf) << 44;
>>   
>> +	if (!gmc_v9_0_prescreen_iv(adev, entry, addr))
>> +		return 1;
>> +
>>   	if (!amdgpu_sriov_vf(adev)) {
>>   		status = RREG32(hub->vm_l2_pro_fault_status);
>>   		WREG32_P(hub->vm_l2_pro_fault_cntl, 1, ~1);
>> diff --git a/drivers/gpu/drm/amd/amdgpu/vega10_ih.c b/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
>> index 0f50bef87163..0f68a0cd1fbf 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
>> @@ -228,76 +228,7 @@ static u32 vega10_ih_get_wptr(struct amdgpu_device *adev)
>>    */
>>   static bool vega10_ih_prescreen_iv(struct amdgpu_device *adev)
>>   {
>> -	u32 ring_index = adev->irq.ih.rptr >> 2;
>> -	u32 dw0, dw3, dw4, dw5;
>> -	u16 pasid;
>> -	u64 addr, key;
>> -	struct amdgpu_vm *vm;
>> -	int r;
>> -
>> -	dw0 = le32_to_cpu(adev->irq.ih.ring[ring_index + 0]);
>> -	dw3 = le32_to_cpu(adev->irq.ih.ring[ring_index + 3]);
>> -	dw4 = le32_to_cpu(adev->irq.ih.ring[ring_index + 4]);
>> -	dw5 = le32_to_cpu(adev->irq.ih.ring[ring_index + 5]);
>> -
>> -	/* Filter retry page faults, let only the first one pass. If
>> -	 * there are too many outstanding faults, ignore them until
>> -	 * some faults get cleared.
>> -	 */
>> -	switch (dw0 & 0xff) {
>> -	case SOC15_IH_CLIENTID_VMC:
>> -	case SOC15_IH_CLIENTID_UTCL2:
>> -		break;
>> -	default:
>> -		/* Not a VM fault */
>> -		return true;
>> -	}
>> -
>> -	pasid = dw3 & 0xffff;
>> -	/* No PASID, can't identify faulting process */
>> -	if (!pasid)
>> -		return true;
>> -
>> -	/* Not a retry fault */
>> -	if (!(dw5 & 0x80))
>> -		return true;
>> -
>> -	/* Track retry faults in per-VM fault FIFO. */
>> -	spin_lock(&adev->vm_manager.pasid_lock);
>> -	vm = idr_find(&adev->vm_manager.pasid_idr, pasid);
>> -	addr = ((u64)(dw5 & 0xf) << 44) | ((u64)dw4 << 12);
>> -	key = AMDGPU_VM_FAULT(pasid, addr);
>> -	if (!vm) {
>> -		/* VM not found, process it normally */
>> -		spin_unlock(&adev->vm_manager.pasid_lock);
>> -		return true;
>> -	} else {
>> -		r = amdgpu_vm_add_fault(vm->fault_hash, key);
>> -
>> -		/* Hash table is full or the fault is already being processed,
>> -		 * ignore further page faults
>> -		 */
>> -		if (r != 0) {
>> -			spin_unlock(&adev->vm_manager.pasid_lock);
>> -			goto ignore_iv;
>> -		}
>> -	}
>> -	/* No locking required with single writer and single reader */
>> -	r = kfifo_put(&vm->faults, key);
>> -	if (!r) {
>> -		/* FIFO is full. Ignore it until there is space */
>> -		amdgpu_vm_clear_fault(vm->fault_hash, key);
>> -		spin_unlock(&adev->vm_manager.pasid_lock);
>> -		goto ignore_iv;
>> -	}
>> -
>> -	spin_unlock(&adev->vm_manager.pasid_lock);
>> -	/* It's the first fault for this address, process it normally */
>>   	return true;
>> -
>> -ignore_iv:
>> -	adev->irq.ih.rptr += 32;
>> -	return false;
>>   }
>>   
>>   /**
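
For reference, the address reconstruction and the retry-bit test that the
quoted patch applies to entry->src_data can be sketched in plain C as
follows. The helper names fault_addr() and is_retry_fault() are hypothetical;
only the shifts and masks are taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Page-aligned fault address: src_data[0] holds address bits 12..43,
 * the low nibble of src_data[1] holds bits 44..47.
 */
uint64_t fault_addr(const uint32_t src_data[2])
{
	uint64_t addr = (uint64_t)src_data[0] << 12;

	addr |= ((uint64_t)src_data[1] & 0xf) << 44;
	return addr;
}

/* Bit 7 of src_data[1] is the bit the patch tests to detect a retry fault. */
bool is_retry_fault(const uint32_t src_data[2])
{
	return (src_data[1] & 0x80) != 0;
}
```

The address and the PASID together then form the key passed to
amdgpu_vm_add_fault() in the patch.
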

