Linux kernel -stable discussions
 help / color / mirror / Atom feed
* [PATCH v6] drm/i915: Fix NULL pointer dereference in capture_engine
@ 2024-12-03 14:54 Eugene Kobyak
  2024-12-03 17:06 ` Michal Wajdeczko
  2024-12-04 14:43 ` Andi Shyti
  0 siblings, 2 replies; 6+ messages in thread
From: Eugene Kobyak @ 2024-12-03 14:54 UTC (permalink / raw)
  To: intel-gfx; +Cc: John.C.Harrison, andi.shyti, jani.nikula, stable

When the intel_context structure contains NULL,
it raises a NULL pointer dereference error in drm_info().

Fixes: e8a3319c31a1 ("drm/i915: Allow error capture without a request")
Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/12309
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: <stable@vger.kernel.org> # v6.3+
Signed-off-by: Eugene Kobyak <eugene.kobyak@intel.com>
---
v2:
  - return drm_info to separate condition
v3:
  - create separate condition which generate string if intel_context exist
v4:
  - rollback and add check intel_context in log condition
v5:
  - create separate string with guc_id if intel_context exist
v6:
  - print changed log if intel_context exist

 drivers/gpu/drm/i915/i915_gpu_error.c | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 135ded17334e..d88cefb889c3 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -1643,9 +1643,21 @@ capture_engine(struct intel_engine_cs *engine,
 		return NULL;
 
 	intel_engine_get_hung_entity(engine, &ce, &rq);
-	if (rq && !i915_request_started(rq))
-		drm_info(&engine->gt->i915->drm, "Got hung context on %s with active request %lld:%lld [0x%04X] not yet started\n",
-			 engine->name, rq->fence.context, rq->fence.seqno, ce->guc_id.id);
+	if (rq && !i915_request_started(rq)) {
+		/*
+		* We want to know also what is the gcu_id of the context,
+		* but if we don't have the context reference, then skip
+		* printing it.
+		*/
+		if (ce)
+			drm_info(&engine->gt->i915->drm,
+				"Got hung context on %s with active request %lld:%lld [0x%04X] not yet started\n",
+				engine->name, rq->fence.context, rq->fence.seqno, ce->guc_id.id);
+		else
+			drm_info(&engine->gt->i915->drm,
+				"Got hung context on %s with active request %lld:%lld not yet started\n",
+				engine->name, rq->fence.context, rq->fence.seqno);
+	}
 
 	if (rq) {
 		capture = intel_engine_coredump_add_request(ee, rq, ATOMIC_MAYFAIL);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH v6] drm/i915: Fix NULL pointer dereference in capture_engine
  2024-12-03 14:54 [PATCH v6] drm/i915: Fix NULL pointer dereference in capture_engine Eugene Kobyak
@ 2024-12-03 17:06 ` Michal Wajdeczko
  2024-12-04 10:32   ` Andi Shyti
  2024-12-04 14:43 ` Andi Shyti
  1 sibling, 1 reply; 6+ messages in thread
From: Michal Wajdeczko @ 2024-12-03 17:06 UTC (permalink / raw)
  To: Eugene Kobyak, intel-gfx; +Cc: John.C.Harrison, andi.shyti, jani.nikula, stable



On 03.12.2024 15:54, Eugene Kobyak wrote:
> When the intel_context structure contains NULL,
> it raises a NULL pointer dereference error in drm_info().
> 
> Fixes: e8a3319c31a1 ("drm/i915: Allow error capture without a request")
> Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/12309
> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
> Cc: John Harrison <John.C.Harrison@Intel.com>
> Cc: <stable@vger.kernel.org> # v6.3+
> Signed-off-by: Eugene Kobyak <eugene.kobyak@intel.com>
> ---
> v2:
>   - return drm_info to separate condition
> v3:
>   - create separate condition which generate string if intel_context exist
> v4:
>   - rollback and add check intel_context in log condition
> v5:
>   - create separate string with guc_id if intel_context exist
> v6:
>   - print changed log if intel_context exist
> 
>  drivers/gpu/drm/i915/i915_gpu_error.c | 18 +++++++++++++++---
>  1 file changed, 15 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
> index 135ded17334e..d88cefb889c3 100644
> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> @@ -1643,9 +1643,21 @@ capture_engine(struct intel_engine_cs *engine,
>  		return NULL;
>  
>  	intel_engine_get_hung_entity(engine, &ce, &rq);
> -	if (rq && !i915_request_started(rq))
> -		drm_info(&engine->gt->i915->drm, "Got hung context on %s with active request %lld:%lld [0x%04X] not yet started\n",
> -			 engine->name, rq->fence.context, rq->fence.seqno, ce->guc_id.id);
> +	if (rq && !i915_request_started(rq)) {
> +		/*
> +		* We want to know also what is the gcu_id of the context,

typo: guc_id

> +		* but if we don't have the context reference, then skip
> +		* printing it.
> +		*/

but IMO this comment is redundant as it's quite obvious that without
context pointer you can't print guc_id member

> +		if (ce)
> +			drm_info(&engine->gt->i915->drm,
> +				"Got hung context on %s with active request %lld:%lld [0x%04X] not yet started\n",
> +				engine->name, rq->fence.context, rq->fence.seqno, ce->guc_id.id);
> +		else
> +			drm_info(&engine->gt->i915->drm,
> +				"Got hung context on %s with active request %lld:%lld not yet started\n",
> +				engine->name, rq->fence.context, rq->fence.seqno);

since you are touching drm_info() where we use engine->gt then maybe
it's good time to switch to gt_info() to get better per-GT message?

> +	}
>  
>  	if (rq) {
>  		capture = intel_engine_coredump_add_request(ee, rq, ATOMIC_MAYFAIL);


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v6] drm/i915: Fix NULL pointer dereference in capture_engine
  2024-12-03 17:06 ` Michal Wajdeczko
@ 2024-12-04 10:32   ` Andi Shyti
  2024-12-04 10:51     ` Jani Nikula
  0 siblings, 1 reply; 6+ messages in thread
From: Andi Shyti @ 2024-12-04 10:32 UTC (permalink / raw)
  To: Michal Wajdeczko
  Cc: Eugene Kobyak, intel-gfx, John.C.Harrison, andi.shyti,
	jani.nikula, stable

Hi Michal,

> > +	if (rq && !i915_request_started(rq)) {
> > +		/*
> > +		* We want to know also what is the gcu_id of the context,
> 
> typo: guc_id
> 
> > +		* but if we don't have the context reference, then skip
> > +		* printing it.
> > +		*/
> 
> but IMO this comment is redundant as it's quite obvious that without
> context pointer you can't print guc_id member

I recommended to add a comment because there is some code
redundancy that, I think, needs some explanation; someone might
not spot immediately the need for ce, unless goes a the end of
the drm_info parameter's list.

> > +		if (ce)
> > +			drm_info(&engine->gt->i915->drm,
> > +				"Got hung context on %s with active request %lld:%lld [0x%04X] not yet started\n",
> > +				engine->name, rq->fence.context, rq->fence.seqno, ce->guc_id.id);
> > +		else
> > +			drm_info(&engine->gt->i915->drm,
> > +				"Got hung context on %s with active request %lld:%lld not yet started\n",
> > +				engine->name, rq->fence.context, rq->fence.seqno);
> 
> since you are touching drm_info() where we use engine->gt then maybe
> it's good time to switch to gt_info() to get better per-GT message?

I think the original reason for using drm_info is because we are
outside the GT. But, because the engine belongs to the GT, it
makes also sense to use gt_info(), I don't oppose.

It would make more sense to move this function completely into
gt/.

Thanks,
Andi

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v6] drm/i915: Fix NULL pointer dereference in capture_engine
  2024-12-04 10:32   ` Andi Shyti
@ 2024-12-04 10:51     ` Jani Nikula
  2024-12-04 11:04       ` Andi Shyti
  0 siblings, 1 reply; 6+ messages in thread
From: Jani Nikula @ 2024-12-04 10:51 UTC (permalink / raw)
  To: Andi Shyti, Michal Wajdeczko
  Cc: Eugene Kobyak, intel-gfx, John.C.Harrison, andi.shyti, stable

On Wed, 04 Dec 2024, Andi Shyti <andi.shyti@linux.intel.com> wrote:
> Hi Michal,
>
>> > +	if (rq && !i915_request_started(rq)) {
>> > +		/*
>> > +		* We want to know also what is the gcu_id of the context,
>> 
>> typo: guc_id
>> 
>> > +		* but if we don't have the context reference, then skip
>> > +		* printing it.
>> > +		*/
>> 
>> but IMO this comment is redundant as it's quite obvious that without
>> context pointer you can't print guc_id member
>
> I recommended to add a comment because there is some code
> redundancy that, I think, needs some explanation; someone might
> not spot immediately the need for ce, unless goes a the end of
> the drm_info parameter's list.
>
>> > +		if (ce)
>> > +			drm_info(&engine->gt->i915->drm,
>> > +				"Got hung context on %s with active request %lld:%lld [0x%04X] not yet started\n",
>> > +				engine->name, rq->fence.context, rq->fence.seqno, ce->guc_id.id);
>> > +		else
>> > +			drm_info(&engine->gt->i915->drm,
>> > +				"Got hung context on %s with active request %lld:%lld not yet started\n",
>> > +				engine->name, rq->fence.context, rq->fence.seqno);
>> 
>> since you are touching drm_info() where we use engine->gt then maybe
>> it's good time to switch to gt_info() to get better per-GT message?
>
> I think the original reason for using drm_info is because we are
> outside the GT. But, because the engine belongs to the GT, it
> makes also sense to use gt_info(), I don't oppose.
>
> It would make more sense to move this function completely into
> gt/.

Can we converge on the patch instead of diverge, please?

It's a Cc: stable null pointer dereference fix.

It's already been iterated for two weeks to reach v6.

Fix the comment typo while applying, but there's nothing inherently
wrong here AFAICT. Merge it and move on.


BR,
Jani.


-- 
Jani Nikula, Intel

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v6] drm/i915: Fix NULL pointer dereference in capture_engine
  2024-12-04 10:51     ` Jani Nikula
@ 2024-12-04 11:04       ` Andi Shyti
  0 siblings, 0 replies; 6+ messages in thread
From: Andi Shyti @ 2024-12-04 11:04 UTC (permalink / raw)
  To: Jani Nikula
  Cc: Andi Shyti, Michal Wajdeczko, Eugene Kobyak, intel-gfx,
	John.C.Harrison, stable

Hi Jani,

On Wed, Dec 04, 2024 at 12:51:56PM +0200, Jani Nikula wrote:
> On Wed, 04 Dec 2024, Andi Shyti <andi.shyti@linux.intel.com> wrote:
> > Hi Michal,
> >
> >> > +	if (rq && !i915_request_started(rq)) {
> >> > +		/*
> >> > +		* We want to know also what is the gcu_id of the context,
> >> 
> >> typo: guc_id
> >> 
> >> > +		* but if we don't have the context reference, then skip
> >> > +		* printing it.
> >> > +		*/
> >> 
> >> but IMO this comment is redundant as it's quite obvious that without
> >> context pointer you can't print guc_id member
> >
> > I recommended to add a comment because there is some code
> > redundancy that, I think, needs some explanation; someone might
> > not spot immediately the need for ce, unless goes a the end of
> > the drm_info parameter's list.
> >
> >> > +		if (ce)
> >> > +			drm_info(&engine->gt->i915->drm,
> >> > +				"Got hung context on %s with active request %lld:%lld [0x%04X] not yet started\n",
> >> > +				engine->name, rq->fence.context, rq->fence.seqno, ce->guc_id.id);
> >> > +		else
> >> > +			drm_info(&engine->gt->i915->drm,
> >> > +				"Got hung context on %s with active request %lld:%lld not yet started\n",
> >> > +				engine->name, rq->fence.context, rq->fence.seqno);
> >> 
> >> since you are touching drm_info() where we use engine->gt then maybe
> >> it's good time to switch to gt_info() to get better per-GT message?
> >
> > I think the original reason for using drm_info is because we are
> > outside the GT. But, because the engine belongs to the GT, it
> > makes also sense to use gt_info(), I don't oppose.
> >
> > It would make more sense to move this function completely into
> > gt/.
> 
> Can we converge on the patch instead of diverge, please?
> 
> It's a Cc: stable null pointer dereference fix.
> 
> It's already been iterated for two weeks to reach v6.
> 
> Fix the comment typo while applying, but there's nothing inherently
> wrong here AFAICT. Merge it and move on.

Thanks for the feedback, will go ahead and merge.

All other gt/ adjustments can be done later.

Andi

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v6] drm/i915: Fix NULL pointer dereference in capture_engine
  2024-12-03 14:54 [PATCH v6] drm/i915: Fix NULL pointer dereference in capture_engine Eugene Kobyak
  2024-12-03 17:06 ` Michal Wajdeczko
@ 2024-12-04 14:43 ` Andi Shyti
  1 sibling, 0 replies; 6+ messages in thread
From: Andi Shyti @ 2024-12-04 14:43 UTC (permalink / raw)
  To: Eugene Kobyak; +Cc: intel-gfx, John.C.Harrison, andi.shyti, jani.nikula, stable

Hi Eugene,

On Tue, Dec 03, 2024 at 02:54:06PM +0000, Eugene Kobyak wrote:
> When the intel_context structure contains NULL,
> it raises a NULL pointer dereference error in drm_info().
> 
> Fixes: e8a3319c31a1 ("drm/i915: Allow error capture without a request")
> Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/12309
> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
> Cc: John Harrison <John.C.Harrison@Intel.com>
> Cc: <stable@vger.kernel.org> # v6.3+
> Signed-off-by: Eugene Kobyak <eugene.kobyak@intel.com>

merged to drm-intel-next.

Thank you,
Andi

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2024-12-04 14:43 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-12-03 14:54 [PATCH v6] drm/i915: Fix NULL pointer dereference in capture_engine Eugene Kobyak
2024-12-03 17:06 ` Michal Wajdeczko
2024-12-04 10:32   ` Andi Shyti
2024-12-04 10:51     ` Jani Nikula
2024-12-04 11:04       ` Andi Shyti
2024-12-04 14:43 ` Andi Shyti

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox