* [PATCH] drm/amdgpu: clear seq64 memory on free @ 2024-04-15 18:48 Arunpravin Paneer Selvam 2024-04-16 9:05 ` Christian König 0 siblings, 1 reply; 7+ messages in thread From: Arunpravin Paneer Selvam @ 2024-04-15 18:48 UTC (permalink / raw) To: amd-gfx; +Cc: christian.koenig, alexander.deucher, Arunpravin Paneer Selvam We should clear the memory on free. Otherwise, there is a chance that we will access the previous application data and this would leads to an abnormal behaviour in the current application. Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> --- drivers/gpu/drm/amd/amdgpu/amdgpu_seq64.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_seq64.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_seq64.c index 4b9afc4df031..9613992c9804 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_seq64.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_seq64.c @@ -189,10 +189,14 @@ int amdgpu_seq64_alloc(struct amdgpu_device *adev, u64 *va, u64 **cpu_addr) void amdgpu_seq64_free(struct amdgpu_device *adev, u64 va) { unsigned long bit_pos; + u64 *cpu_addr; bit_pos = (va - amdgpu_seq64_get_va_base(adev)) / sizeof(u64); - if (bit_pos < adev->seq64.num_sem) + if (bit_pos < adev->seq64.num_sem) { + cpu_addr = bit_pos + adev->seq64.cpu_base_addr; + memset(cpu_addr, 0, sizeof(u64)); __clear_bit(bit_pos, adev->seq64.used); + } } /** -- 2.25.1 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH] drm/amdgpu: clear seq64 memory on free 2024-04-15 18:48 [PATCH] drm/amdgpu: clear seq64 memory on free Arunpravin Paneer Selvam @ 2024-04-16 9:05 ` Christian König 2024-04-16 12:16 ` Paneer Selvam, Arunpravin 0 siblings, 1 reply; 7+ messages in thread From: Christian König @ 2024-04-16 9:05 UTC (permalink / raw) To: Arunpravin Paneer Selvam, amd-gfx; +Cc: christian.koenig, alexander.deucher Am 15.04.24 um 20:48 schrieb Arunpravin Paneer Selvam: > We should clear the memory on free. Otherwise, > there is a chance that we will access the previous > application data and this would leads to an abnormal > behaviour in the current application. Mhm, I would strongly expect that we initialize the seq number after allocation. It could be that we also have situations were the correct start value is 0xffffffff or something like that instead. Why does this matter? Regards, Christian. > > Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_seq64.c | 6 +++++- > 1 file changed, 5 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_seq64.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_seq64.c > index 4b9afc4df031..9613992c9804 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_seq64.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_seq64.c > @@ -189,10 +189,14 @@ int amdgpu_seq64_alloc(struct amdgpu_device *adev, u64 *va, u64 **cpu_addr) > void amdgpu_seq64_free(struct amdgpu_device *adev, u64 va) > { > unsigned long bit_pos; > + u64 *cpu_addr; > > bit_pos = (va - amdgpu_seq64_get_va_base(adev)) / sizeof(u64); > - if (bit_pos < adev->seq64.num_sem) > + if (bit_pos < adev->seq64.num_sem) { > + cpu_addr = bit_pos + adev->seq64.cpu_base_addr; > + memset(cpu_addr, 0, sizeof(u64)); > __clear_bit(bit_pos, adev->seq64.used); > + } > } > > /** ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] drm/amdgpu: clear seq64 memory on free 2024-04-16 9:05 ` Christian König @ 2024-04-16 12:16 ` Paneer Selvam, Arunpravin 2024-04-16 12:17 ` Christian König 0 siblings, 1 reply; 7+ messages in thread From: Paneer Selvam, Arunpravin @ 2024-04-16 12:16 UTC (permalink / raw) To: Christian König, amd-gfx; +Cc: christian.koenig, alexander.deucher Hi Christian, On 4/16/2024 2:35 PM, Christian König wrote: > Am 15.04.24 um 20:48 schrieb Arunpravin Paneer Selvam: >> We should clear the memory on free. Otherwise, >> there is a chance that we will access the previous >> application data and this would leads to an abnormal >> behaviour in the current application. > > Mhm, I would strongly expect that we initialize the seq number after > allocation. > > It could be that we also have situations were the correct start value > is 0xffffffff or something like that instead. > > Why does this matter? when the user queue A's u64 address (fence address) is allocated to the newly created user queue B, we see a problem. User queue B calls the signal IOCTL which creates a new fence having the wptr as the seq number, in amdgpu_userq_fence_create function we have a check where we are comparing the rptr and wptr value (rptr >= wptr). since the rptr value is read from the u64 address which has the user queue A's wptr data, this rptr >= wptr condition gets satisfied and we are dropping the reference before the actual command gets processed in the hardware. If we clear this u64 value on free, we dont see this problem. Thanks, Arun. > > Regards, > Christian. > >> >> Signed-off-by: Arunpravin Paneer Selvam >> <Arunpravin.PaneerSelvam@amd.com> >> --- >> drivers/gpu/drm/amd/amdgpu/amdgpu_seq64.c | 6 +++++- >> 1 file changed, 5 insertions(+), 1 deletion(-) >> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_seq64.c >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_seq64.c >> index 4b9afc4df031..9613992c9804 100644 >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_seq64.c >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_seq64.c >> @@ -189,10 +189,14 @@ int amdgpu_seq64_alloc(struct amdgpu_device >> *adev, u64 *va, u64 **cpu_addr) >> void amdgpu_seq64_free(struct amdgpu_device *adev, u64 va) >> { >> unsigned long bit_pos; >> + u64 *cpu_addr; >> bit_pos = (va - amdgpu_seq64_get_va_base(adev)) / sizeof(u64); >> - if (bit_pos < adev->seq64.num_sem) >> + if (bit_pos < adev->seq64.num_sem) { >> + cpu_addr = bit_pos + adev->seq64.cpu_base_addr; >> + memset(cpu_addr, 0, sizeof(u64)); >> __clear_bit(bit_pos, adev->seq64.used); >> + } >> } >> /** > ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] drm/amdgpu: clear seq64 memory on free 2024-04-16 12:16 ` Paneer Selvam, Arunpravin @ 2024-04-16 12:17 ` Christian König 2024-04-16 12:34 ` Paneer Selvam, Arunpravin 0 siblings, 1 reply; 7+ messages in thread From: Christian König @ 2024-04-16 12:17 UTC (permalink / raw) To: Paneer Selvam, Arunpravin, Christian König, amd-gfx Cc: alexander.deucher Am 16.04.24 um 14:16 schrieb Paneer Selvam, Arunpravin: > Hi Christian, > > On 4/16/2024 2:35 PM, Christian König wrote: >> Am 15.04.24 um 20:48 schrieb Arunpravin Paneer Selvam: >>> We should clear the memory on free. Otherwise, >>> there is a chance that we will access the previous >>> application data and this would leads to an abnormal >>> behaviour in the current application. >> >> Mhm, I would strongly expect that we initialize the seq number after >> allocation. >> >> It could be that we also have situations were the correct start value >> is 0xffffffff or something like that instead. >> >> Why does this matter? > when the user queue A's u64 address (fence address) is allocated to > the newly created user queue B, we see a problem. > User queue B calls the signal IOCTL which creates a new fence having > the wptr as the seq number, in > amdgpu_userq_fence_create function we have a check where we are > comparing the rptr and wptr value (rptr >= wptr). > since the rptr value is read from the u64 address which has the user > queue A's wptr data, this rptr >= wptr condition > gets satisfied and we are dropping the reference before the actual > command gets processed in the hardware. > If we clear this u64 value on free, we dont see this problem. Yeah, but that doesn't belongs into the seq64 handling. Instead the code which allocates the seq64 during userqueue created needs to clear it to 0. Regards, Christian. > > Thanks, > Arun. >> >> Regards, >> Christian. >> >>> >>> Signed-off-by: Arunpravin Paneer Selvam >>> <Arunpravin.PaneerSelvam@amd.com> >>> --- >>> drivers/gpu/drm/amd/amdgpu/amdgpu_seq64.c | 6 +++++- >>> 1 file changed, 5 insertions(+), 1 deletion(-) >>> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_seq64.c >>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_seq64.c >>> index 4b9afc4df031..9613992c9804 100644 >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_seq64.c >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_seq64.c >>> @@ -189,10 +189,14 @@ int amdgpu_seq64_alloc(struct amdgpu_device >>> *adev, u64 *va, u64 **cpu_addr) >>> void amdgpu_seq64_free(struct amdgpu_device *adev, u64 va) >>> { >>> unsigned long bit_pos; >>> + u64 *cpu_addr; >>> bit_pos = (va - amdgpu_seq64_get_va_base(adev)) / sizeof(u64); >>> - if (bit_pos < adev->seq64.num_sem) >>> + if (bit_pos < adev->seq64.num_sem) { >>> + cpu_addr = bit_pos + adev->seq64.cpu_base_addr; >>> + memset(cpu_addr, 0, sizeof(u64)); >>> __clear_bit(bit_pos, adev->seq64.used); >>> + } >>> } >>> /** >> > ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] drm/amdgpu: clear seq64 memory on free 2024-04-16 12:17 ` Christian König @ 2024-04-16 12:34 ` Paneer Selvam, Arunpravin 2024-04-16 12:54 ` Christian König 0 siblings, 1 reply; 7+ messages in thread From: Paneer Selvam, Arunpravin @ 2024-04-16 12:34 UTC (permalink / raw) To: Christian König, Christian König, amd-gfx; +Cc: alexander.deucher Hi Christian, On 4/16/2024 5:47 PM, Christian König wrote: > Am 16.04.24 um 14:16 schrieb Paneer Selvam, Arunpravin: >> Hi Christian, >> >> On 4/16/2024 2:35 PM, Christian König wrote: >>> Am 15.04.24 um 20:48 schrieb Arunpravin Paneer Selvam: >>>> We should clear the memory on free. Otherwise, >>>> there is a chance that we will access the previous >>>> application data and this would leads to an abnormal >>>> behaviour in the current application. >>> >>> Mhm, I would strongly expect that we initialize the seq number after >>> allocation. >>> >>> It could be that we also have situations were the correct start >>> value is 0xffffffff or something like that instead. >>> >>> Why does this matter? >> when the user queue A's u64 address (fence address) is allocated to >> the newly created user queue B, we see a problem. >> User queue B calls the signal IOCTL which creates a new fence having >> the wptr as the seq number, in >> amdgpu_userq_fence_create function we have a check where we are >> comparing the rptr and wptr value (rptr >= wptr). >> since the rptr value is read from the u64 address which has the user >> queue A's wptr data, this rptr >= wptr condition >> gets satisfied and we are dropping the reference before the actual >> command gets processed in the hardware. >> If we clear this u64 value on free, we dont see this problem. > > Yeah, but that doesn't belongs into the seq64 handling. > > Instead the code which allocates the seq64 during userqueue created > needs to clear it to 0. sure, got it. Thanks, Arun. > > Regards, > Christian. > >> >> Thanks, >> Arun. >>> >>> Regards, >>> Christian. >>> >>>> >>>> Signed-off-by: Arunpravin Paneer Selvam >>>> <Arunpravin.PaneerSelvam@amd.com> >>>> --- >>>> drivers/gpu/drm/amd/amdgpu/amdgpu_seq64.c | 6 +++++- >>>> 1 file changed, 5 insertions(+), 1 deletion(-) >>>> >>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_seq64.c >>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_seq64.c >>>> index 4b9afc4df031..9613992c9804 100644 >>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_seq64.c >>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_seq64.c >>>> @@ -189,10 +189,14 @@ int amdgpu_seq64_alloc(struct amdgpu_device >>>> *adev, u64 *va, u64 **cpu_addr) >>>> void amdgpu_seq64_free(struct amdgpu_device *adev, u64 va) >>>> { >>>> unsigned long bit_pos; >>>> + u64 *cpu_addr; >>>> bit_pos = (va - amdgpu_seq64_get_va_base(adev)) / sizeof(u64); >>>> - if (bit_pos < adev->seq64.num_sem) >>>> + if (bit_pos < adev->seq64.num_sem) { >>>> + cpu_addr = bit_pos + adev->seq64.cpu_base_addr; >>>> + memset(cpu_addr, 0, sizeof(u64)); >>>> __clear_bit(bit_pos, adev->seq64.used); >>>> + } >>>> } >>>> /** >>> >> > ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] drm/amdgpu: clear seq64 memory on free 2024-04-16 12:34 ` Paneer Selvam, Arunpravin @ 2024-04-16 12:54 ` Christian König 2024-04-16 12:57 ` Paneer Selvam, Arunpravin 0 siblings, 1 reply; 7+ messages in thread From: Christian König @ 2024-04-16 12:54 UTC (permalink / raw) To: Paneer Selvam, Arunpravin, Christian König, amd-gfx Cc: alexander.deucher Am 16.04.24 um 14:34 schrieb Paneer Selvam, Arunpravin: > Hi Christian, > > On 4/16/2024 5:47 PM, Christian König wrote: >> Am 16.04.24 um 14:16 schrieb Paneer Selvam, Arunpravin: >>> Hi Christian, >>> >>> On 4/16/2024 2:35 PM, Christian König wrote: >>>> Am 15.04.24 um 20:48 schrieb Arunpravin Paneer Selvam: >>>>> We should clear the memory on free. Otherwise, >>>>> there is a chance that we will access the previous >>>>> application data and this would leads to an abnormal >>>>> behaviour in the current application. >>>> >>>> Mhm, I would strongly expect that we initialize the seq number >>>> after allocation. >>>> >>>> It could be that we also have situations were the correct start >>>> value is 0xffffffff or something like that instead. >>>> >>>> Why does this matter? >>> when the user queue A's u64 address (fence address) is allocated to >>> the newly created user queue B, we see a problem. >>> User queue B calls the signal IOCTL which creates a new fence having >>> the wptr as the seq number, in >>> amdgpu_userq_fence_create function we have a check where we are >>> comparing the rptr and wptr value (rptr >= wptr). >>> since the rptr value is read from the u64 address which has the user >>> queue A's wptr data, this rptr >= wptr condition >>> gets satisfied and we are dropping the reference before the actual >>> command gets processed in the hardware. >>> If we clear this u64 value on free, we dont see this problem. >> >> Yeah, but that doesn't belongs into the seq64 handling. >> >> Instead the code which allocates the seq64 during userqueue created >> needs to clear it to 0. > sure, got it. Yeah, but fixing that aside. We should probably initialize the seq64 array to something like 0xdeadbeef or a similar pattern to find issues were we forget to initialize the allocated slots. Regards, Christian. > > Thanks, > Arun. >> >> Regards, >> Christian. >> >>> >>> Thanks, >>> Arun. >>>> >>>> Regards, >>>> Christian. >>>> >>>>> >>>>> Signed-off-by: Arunpravin Paneer Selvam >>>>> <Arunpravin.PaneerSelvam@amd.com> >>>>> --- >>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_seq64.c | 6 +++++- >>>>> 1 file changed, 5 insertions(+), 1 deletion(-) >>>>> >>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_seq64.c >>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_seq64.c >>>>> index 4b9afc4df031..9613992c9804 100644 >>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_seq64.c >>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_seq64.c >>>>> @@ -189,10 +189,14 @@ int amdgpu_seq64_alloc(struct amdgpu_device >>>>> *adev, u64 *va, u64 **cpu_addr) >>>>> void amdgpu_seq64_free(struct amdgpu_device *adev, u64 va) >>>>> { >>>>> unsigned long bit_pos; >>>>> + u64 *cpu_addr; >>>>> bit_pos = (va - amdgpu_seq64_get_va_base(adev)) / >>>>> sizeof(u64); >>>>> - if (bit_pos < adev->seq64.num_sem) >>>>> + if (bit_pos < adev->seq64.num_sem) { >>>>> + cpu_addr = bit_pos + adev->seq64.cpu_base_addr; >>>>> + memset(cpu_addr, 0, sizeof(u64)); >>>>> __clear_bit(bit_pos, adev->seq64.used); >>>>> + } >>>>> } >>>>> /** >>>> >>> >> > ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] drm/amdgpu: clear seq64 memory on free 2024-04-16 12:54 ` Christian König @ 2024-04-16 12:57 ` Paneer Selvam, Arunpravin 0 siblings, 0 replies; 7+ messages in thread From: Paneer Selvam, Arunpravin @ 2024-04-16 12:57 UTC (permalink / raw) To: Christian König, Christian König, amd-gfx; +Cc: alexander.deucher Hi Christian, On 4/16/2024 6:24 PM, Christian König wrote: > Am 16.04.24 um 14:34 schrieb Paneer Selvam, Arunpravin: >> Hi Christian, >> >> On 4/16/2024 5:47 PM, Christian König wrote: >>> Am 16.04.24 um 14:16 schrieb Paneer Selvam, Arunpravin: >>>> Hi Christian, >>>> >>>> On 4/16/2024 2:35 PM, Christian König wrote: >>>>> Am 15.04.24 um 20:48 schrieb Arunpravin Paneer Selvam: >>>>>> We should clear the memory on free. Otherwise, >>>>>> there is a chance that we will access the previous >>>>>> application data and this would leads to an abnormal >>>>>> behaviour in the current application. >>>>> >>>>> Mhm, I would strongly expect that we initialize the seq number >>>>> after allocation. >>>>> >>>>> It could be that we also have situations were the correct start >>>>> value is 0xffffffff or something like that instead. >>>>> >>>>> Why does this matter? >>>> when the user queue A's u64 address (fence address) is allocated to >>>> the newly created user queue B, we see a problem. >>>> User queue B calls the signal IOCTL which creates a new fence >>>> having the wptr as the seq number, in >>>> amdgpu_userq_fence_create function we have a check where we are >>>> comparing the rptr and wptr value (rptr >= wptr). >>>> since the rptr value is read from the u64 address which has the >>>> user queue A's wptr data, this rptr >= wptr condition >>>> gets satisfied and we are dropping the reference before the actual >>>> command gets processed in the hardware. >>>> If we clear this u64 value on free, we dont see this problem. >>> >>> Yeah, but that doesn't belongs into the seq64 handling. >>> >>> Instead the code which allocates the seq64 during userqueue created >>> needs to clear it to 0. >> sure, got it. > > Yeah, but fixing that aside. We should probably initialize the seq64 > array to something like 0xdeadbeef or a similar pattern to find issues > were we forget to initialize the allocated slots. yes, I will prepare a patch and send for the review. Thanks, Arun. > > Regards, > Christian. > >> >> Thanks, >> Arun. >>> >>> Regards, >>> Christian. >>> >>>> >>>> Thanks, >>>> Arun. >>>>> >>>>> Regards, >>>>> Christian. >>>>> >>>>>> >>>>>> Signed-off-by: Arunpravin Paneer Selvam >>>>>> <Arunpravin.PaneerSelvam@amd.com> >>>>>> --- >>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_seq64.c | 6 +++++- >>>>>> 1 file changed, 5 insertions(+), 1 deletion(-) >>>>>> >>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_seq64.c >>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_seq64.c >>>>>> index 4b9afc4df031..9613992c9804 100644 >>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_seq64.c >>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_seq64.c >>>>>> @@ -189,10 +189,14 @@ int amdgpu_seq64_alloc(struct amdgpu_device >>>>>> *adev, u64 *va, u64 **cpu_addr) >>>>>> void amdgpu_seq64_free(struct amdgpu_device *adev, u64 va) >>>>>> { >>>>>> unsigned long bit_pos; >>>>>> + u64 *cpu_addr; >>>>>> bit_pos = (va - amdgpu_seq64_get_va_base(adev)) / >>>>>> sizeof(u64); >>>>>> - if (bit_pos < adev->seq64.num_sem) >>>>>> + if (bit_pos < adev->seq64.num_sem) { >>>>>> + cpu_addr = bit_pos + adev->seq64.cpu_base_addr; >>>>>> + memset(cpu_addr, 0, sizeof(u64)); >>>>>> __clear_bit(bit_pos, adev->seq64.used); >>>>>> + } >>>>>> } >>>>>> /** >>>>> >>>> >>> >> > ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2024-04-16 12:58 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-04-15 18:48 [PATCH] drm/amdgpu: clear seq64 memory on free Arunpravin Paneer Selvam 2024-04-16 9:05 ` Christian König 2024-04-16 12:16 ` Paneer Selvam, Arunpravin 2024-04-16 12:17 ` Christian König 2024-04-16 12:34 ` Paneer Selvam, Arunpravin 2024-04-16 12:54 ` Christian König 2024-04-16 12:57 ` Paneer Selvam, Arunpravin
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.