All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Dixit, Ashutosh" <ashutosh.dixit@intel.com>
To: <gregkh@linuxfoundation.org>
Cc: umesh.nerlige.ramappa@intel.com, jonathan.cavitt@intel.com,
	matthew.brost@intel.com, <stable@vger.kernel.org>
Subject: Re: FAILED: patch "[PATCH] xe/oa: Fix query mode of operation for OAR/OAC" failed to apply to 6.13-stable tree
Date: Mon, 10 Feb 2025 13:11:27 -0800	[thread overview]
Message-ID: <85ikph33pc.wl-ashutosh.dixit@intel.com> (raw)
In-Reply-To: <2025021014-cartridge-snooze-15bd@gregkh>

On Mon, 10 Feb 2025 05:02:14 -0800, <gregkh@linuxfoundation.org> wrote:
>

Hi Greg,

> The patch below does not apply to the 6.13-stable tree.
> If someone wants it applied there, or to any other stable or longterm
> tree, then please email the backport, including the original git commit
> id to <stable@vger.kernel.org>.
>
> To reproduce the conflict and resubmit, you may use the following commands:
>
> git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.13.y
> git checkout FETCH_HEAD
> git cherry-pick -x 55039832f98c7e05f1cf9e0d8c12b2490abd0f16
> # <resolve conflicts, build, test, etc.>
> git commit -s
> git send-email --to '<stable@vger.kernel.org>' --in-reply-to '2025021014-cartridge-snooze-15bd@gregkh' --subject-prefix 'PATCH 6.13.y' HEAD^..
>
> Possible dependencies:

This was a modified commit we had sent for 6.12. It will not apply to
6.13. Neither is it needed for 6.13, since the original commit

	55039832f98c ("xe/oa: Fix query mode of operation for OAR/OAC")

is already present in 6.13.

Thanks.
--
Ashutosh

>
>
>
> thanks,
>
> greg k-h
>
> ------------------ original commit in Linus's tree ------------------
>
> From 55039832f98c7e05f1cf9e0d8c12b2490abd0f16 Mon Sep 17 00:00:00 2001
> From: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
> Date: Fri, 20 Dec 2024 09:19:18 -0800
> Subject: [PATCH] xe/oa: Fix query mode of operation for OAR/OAC
>
> This is a set of squashed commits to facilitate smooth applying to
> stable. Each commit message is retained for reference.
>
> 1) Allow a GGTT mapped batch to be submitted to user exec queue
>
> For a OA use case, one of the HW registers needs to be modified by
> submitting an MI_LOAD_REGISTER_IMM command to the users exec queue, so
> that the register is modified in the user's hardware context. In order
> to do this a batch that is mapped in GGTT, needs to be submitted to the
> user exec queue. Since all user submissions use q->vm and hence PPGTT,
> add some plumbing to enable submission of batches mapped in GGTT.
>
> v2: ggtt is zero-initialized, so no need to set it false (Matt Brost)
>
> 2) xe/oa: Use MI_LOAD_REGISTER_IMMEDIATE to enable OAR/OAC
>
> To enable OAR/OAC, a bit in RING_CONTEXT_CONTROL needs to be set.
> Setting this bit cause the context image size to change and if not done
> correct, can cause undesired hangs.
>
> Current code uses a separate exec_queue to modify this bit and is
> error-prone. As per HW recommendation, submit MI_LOAD_REGISTER_IMM to
> the target hardware context to modify the relevant bit.
>
> In v2 version, an attempt to submit everything to the user-queue was
> made, but it failed the unprivileged-single-ctx-counters test. It
> appears that the OACTXCONTROL must be modified from a remote context.
>
> In v3 version, all context specific register configurations were moved
> to use LOAD_REGISTER_IMMEDIATE and that seems to work well. This is a
> cleaner way, since we can now submit all configuration to user
> exec_queue and the fence handling is simplified.
>
> v2:
> (Matt)
> - set job->ggtt to true if create job is successful
> - unlock vm on job error
>
> (Ashutosh)
> - don't wait on job submission
> - use kernel exec queue where possible
>
> v3:
> (Ashutosh)
> - Fix checkpatch issues
> - Remove extra spaces/new-lines
> - Add Fixes: and Cc: tags
> - Reset context control bit when OA stream is closed
> - Submit all config via MI_LOAD_REGISTER_IMMEDIATE
>
> (Umesh)
> - Update commit message for v3 experiment
> - Squash patches for easier port to stable
>
> v4:
> (Ashutosh)
> - No need to pass q to xe_oa_submit_bb
> - Do not support exec queues with width > 1
> - Fix disabling of CTX_CTRL_OAC_CONTEXT_ENABLE
>
> v5:
> (Ashutosh)
> - Drop reg_lri related comments
> - Use XE_OA_SUBMIT_NO_DEPS in xe_oa_load_with_lri
>
> Fixes: 8135f1c09dd2 ("drm/xe/oa: Don't reset OAC_CONTEXT_ENABLE on OA stream close")
> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
> Reviewed-by: Matthew Brost <matthew.brost@intel.com> # commit 1
> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
> Cc: stable@vger.kernel.org
> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
> Link: https://patchwork.freedesktop.org/patch/msgid/20241220171919.571528-2-umesh.nerlige.ramappa@intel.com
>
> diff --git a/drivers/gpu/drm/xe/xe_oa.c b/drivers/gpu/drm/xe/xe_oa.c
> index ae94490b0eac..9add60097ab5 100644
> --- a/drivers/gpu/drm/xe/xe_oa.c
> +++ b/drivers/gpu/drm/xe/xe_oa.c
> @@ -74,12 +74,6 @@ struct xe_oa_config {
>	struct rcu_head rcu;
>  };
>
> -struct flex {
> -	struct xe_reg reg;
> -	u32 offset;
> -	u32 value;
> -};
> -
>  struct xe_oa_open_param {
>	struct xe_file *xef;
>	u32 oa_unit_id;
> @@ -605,19 +599,38 @@ static __poll_t xe_oa_poll(struct file *file, poll_table *wait)
>	return ret;
>  }
>
> +static void xe_oa_lock_vma(struct xe_exec_queue *q)
> +{
> +	if (q->vm) {
> +		down_read(&q->vm->lock);
> +		xe_vm_lock(q->vm, false);
> +	}
> +}
> +
> +static void xe_oa_unlock_vma(struct xe_exec_queue *q)
> +{
> +	if (q->vm) {
> +		xe_vm_unlock(q->vm);
> +		up_read(&q->vm->lock);
> +	}
> +}
> +
>  static struct dma_fence *xe_oa_submit_bb(struct xe_oa_stream *stream, enum xe_oa_submit_deps deps,
>					 struct xe_bb *bb)
>  {
> +	struct xe_exec_queue *q = stream->exec_q ?: stream->k_exec_q;
>	struct xe_sched_job *job;
>	struct dma_fence *fence;
>	int err = 0;
>
> -	/* Kernel configuration is issued on stream->k_exec_q, not stream->exec_q */
> -	job = xe_bb_create_job(stream->k_exec_q, bb);
> +	xe_oa_lock_vma(q);
> +
> +	job = xe_bb_create_job(q, bb);
>	if (IS_ERR(job)) {
>		err = PTR_ERR(job);
>		goto exit;
>	}
> +	job->ggtt = true;
>
>	if (deps == XE_OA_SUBMIT_ADD_DEPS) {
>		for (int i = 0; i < stream->num_syncs && !err; i++)
> @@ -632,10 +645,13 @@ static struct dma_fence *xe_oa_submit_bb(struct xe_oa_stream *stream, enum xe_oa
>	fence = dma_fence_get(&job->drm.s_fence->finished);
>	xe_sched_job_push(job);
>
> +	xe_oa_unlock_vma(q);
> +
>	return fence;
>  err_put_job:
>	xe_sched_job_put(job);
>  exit:
> +	xe_oa_unlock_vma(q);
>	return ERR_PTR(err);
>  }
>
> @@ -684,65 +700,19 @@ static void xe_oa_free_configs(struct xe_oa_stream *stream)
>	dma_fence_put(stream->last_fence);
>  }
>
> -static void xe_oa_store_flex(struct xe_oa_stream *stream, struct xe_lrc *lrc,
> -			     struct xe_bb *bb, const struct flex *flex, u32 count)
> -{
> -	u32 offset = xe_bo_ggtt_addr(lrc->bo);
> -
> -	do {
> -		bb->cs[bb->len++] = MI_STORE_DATA_IMM | MI_SDI_GGTT |
> -				    MI_FORCE_WRITE_COMPLETION_CHECK |
> -				    MI_SDI_NUM_DW(1);
> -		bb->cs[bb->len++] = offset + flex->offset * sizeof(u32);
> -		bb->cs[bb->len++] = 0;
> -		bb->cs[bb->len++] = flex->value;
> -
> -	} while (flex++, --count);
> -}
> -
> -static int xe_oa_modify_ctx_image(struct xe_oa_stream *stream, struct xe_lrc *lrc,
> -				  const struct flex *flex, u32 count)
> +static int xe_oa_load_with_lri(struct xe_oa_stream *stream, struct xe_oa_reg *reg_lri, u32 count)
>  {
>	struct dma_fence *fence;
>	struct xe_bb *bb;
>	int err;
>
> -	bb = xe_bb_new(stream->gt, 4 * count, false);
> +	bb = xe_bb_new(stream->gt, 2 * count + 1, false);
>	if (IS_ERR(bb)) {
>		err = PTR_ERR(bb);
>		goto exit;
>	}
>
> -	xe_oa_store_flex(stream, lrc, bb, flex, count);
> -
> -	fence = xe_oa_submit_bb(stream, XE_OA_SUBMIT_NO_DEPS, bb);
> -	if (IS_ERR(fence)) {
> -		err = PTR_ERR(fence);
> -		goto free_bb;
> -	}
> -	xe_bb_free(bb, fence);
> -	dma_fence_put(fence);
> -
> -	return 0;
> -free_bb:
> -	xe_bb_free(bb, NULL);
> -exit:
> -	return err;
> -}
> -
> -static int xe_oa_load_with_lri(struct xe_oa_stream *stream, struct xe_oa_reg *reg_lri)
> -{
> -	struct dma_fence *fence;
> -	struct xe_bb *bb;
> -	int err;
> -
> -	bb = xe_bb_new(stream->gt, 3, false);
> -	if (IS_ERR(bb)) {
> -		err = PTR_ERR(bb);
> -		goto exit;
> -	}
> -
> -	write_cs_mi_lri(bb, reg_lri, 1);
> +	write_cs_mi_lri(bb, reg_lri, count);
>
>	fence = xe_oa_submit_bb(stream, XE_OA_SUBMIT_NO_DEPS, bb);
>	if (IS_ERR(fence)) {
> @@ -762,71 +732,55 @@ static int xe_oa_load_with_lri(struct xe_oa_stream *stream, struct xe_oa_reg *re
>  static int xe_oa_configure_oar_context(struct xe_oa_stream *stream, bool enable)
>  {
>	const struct xe_oa_format *format = stream->oa_buffer.format;
> -	struct xe_lrc *lrc = stream->exec_q->lrc[0];
> -	u32 regs_offset = xe_lrc_regs_offset(lrc) / sizeof(u32);
>	u32 oacontrol = __format_to_oactrl(format, OAR_OACONTROL_COUNTER_SEL_MASK) |
>		(enable ? OAR_OACONTROL_COUNTER_ENABLE : 0);
>
> -	struct flex regs_context[] = {
> +	struct xe_oa_reg reg_lri[] = {
>		{
>			OACTXCONTROL(stream->hwe->mmio_base),
> -			stream->oa->ctx_oactxctrl_offset[stream->hwe->class] + 1,
>			enable ? OA_COUNTER_RESUME : 0,
>		},
> +		{
> +			OAR_OACONTROL,
> +			oacontrol,
> +		},
>		{
>			RING_CONTEXT_CONTROL(stream->hwe->mmio_base),
> -			regs_offset + CTX_CONTEXT_CONTROL,
> -			_MASKED_BIT_ENABLE(CTX_CTRL_OAC_CONTEXT_ENABLE),
> +			_MASKED_FIELD(CTX_CTRL_OAC_CONTEXT_ENABLE,
> +				      enable ? CTX_CTRL_OAC_CONTEXT_ENABLE : 0)
>		},
>	};
> -	struct xe_oa_reg reg_lri = { OAR_OACONTROL, oacontrol };
> -	int err;
>
> -	/* Modify stream hwe context image with regs_context */
> -	err = xe_oa_modify_ctx_image(stream, stream->exec_q->lrc[0],
> -				     regs_context, ARRAY_SIZE(regs_context));
> -	if (err)
> -		return err;
> -
> -	/* Apply reg_lri using LRI */
> -	return xe_oa_load_with_lri(stream, &reg_lri);
> +	return xe_oa_load_with_lri(stream, reg_lri, ARRAY_SIZE(reg_lri));
>  }
>
>  static int xe_oa_configure_oac_context(struct xe_oa_stream *stream, bool enable)
>  {
>	const struct xe_oa_format *format = stream->oa_buffer.format;
> -	struct xe_lrc *lrc = stream->exec_q->lrc[0];
> -	u32 regs_offset = xe_lrc_regs_offset(lrc) / sizeof(u32);
>	u32 oacontrol = __format_to_oactrl(format, OAR_OACONTROL_COUNTER_SEL_MASK) |
>		(enable ? OAR_OACONTROL_COUNTER_ENABLE : 0);
> -	struct flex regs_context[] = {
> +	struct xe_oa_reg reg_lri[] = {
>		{
>			OACTXCONTROL(stream->hwe->mmio_base),
> -			stream->oa->ctx_oactxctrl_offset[stream->hwe->class] + 1,
>			enable ? OA_COUNTER_RESUME : 0,
>		},
> +		{
> +			OAC_OACONTROL,
> +			oacontrol
> +		},
>		{
>			RING_CONTEXT_CONTROL(stream->hwe->mmio_base),
> -			regs_offset + CTX_CONTEXT_CONTROL,
> -			_MASKED_BIT_ENABLE(CTX_CTRL_OAC_CONTEXT_ENABLE) |
> +			_MASKED_FIELD(CTX_CTRL_OAC_CONTEXT_ENABLE,
> +				      enable ? CTX_CTRL_OAC_CONTEXT_ENABLE : 0) |
>			_MASKED_FIELD(CTX_CTRL_RUN_ALONE, enable ? CTX_CTRL_RUN_ALONE : 0),
>		},
>	};
> -	struct xe_oa_reg reg_lri = { OAC_OACONTROL, oacontrol };
> -	int err;
>
>	/* Set ccs select to enable programming of OAC_OACONTROL */
>	xe_mmio_write32(&stream->gt->mmio, __oa_regs(stream)->oa_ctrl,
>			__oa_ccs_select(stream));
>
> -	/* Modify stream hwe context image with regs_context */
> -	err = xe_oa_modify_ctx_image(stream, stream->exec_q->lrc[0],
> -				     regs_context, ARRAY_SIZE(regs_context));
> -	if (err)
> -		return err;
> -
> -	/* Apply reg_lri using LRI */
> -	return xe_oa_load_with_lri(stream, &reg_lri);
> +	return xe_oa_load_with_lri(stream, reg_lri, ARRAY_SIZE(reg_lri));
>  }
>
>  static int xe_oa_configure_oa_context(struct xe_oa_stream *stream, bool enable)
> @@ -2110,8 +2064,8 @@ int xe_oa_stream_open_ioctl(struct drm_device *dev, u64 data, struct drm_file *f
>		if (XE_IOCTL_DBG(oa->xe, !param.exec_q))
>			return -ENOENT;
>
> -		if (param.exec_q->width > 1)
> -			drm_dbg(&oa->xe->drm, "exec_q->width > 1, programming only exec_q->lrc[0]\n");
> +		if (XE_IOCTL_DBG(oa->xe, param.exec_q->width > 1))
> +			return -EOPNOTSUPP;
>	}
>
>	/*
> diff --git a/drivers/gpu/drm/xe/xe_ring_ops.c b/drivers/gpu/drm/xe/xe_ring_ops.c
> index 3a75a08b6be9..c8ab37fa0d19 100644
> --- a/drivers/gpu/drm/xe/xe_ring_ops.c
> +++ b/drivers/gpu/drm/xe/xe_ring_ops.c
> @@ -223,7 +223,10 @@ static int emit_pipe_imm_ggtt(u32 addr, u32 value, bool stall_only, u32 *dw,
>
>  static u32 get_ppgtt_flag(struct xe_sched_job *job)
>  {
> -	return job->q->vm ? BIT(8) : 0;
> +	if (job->q->vm && !job->ggtt)
> +		return BIT(8);
> +
> +	return 0;
>  }
>
>  static int emit_copy_timestamp(struct xe_lrc *lrc, u32 *dw, int i)
> diff --git a/drivers/gpu/drm/xe/xe_sched_job_types.h b/drivers/gpu/drm/xe/xe_sched_job_types.h
> index f13f333f00be..d942b20a9f29 100644
> --- a/drivers/gpu/drm/xe/xe_sched_job_types.h
> +++ b/drivers/gpu/drm/xe/xe_sched_job_types.h
> @@ -56,6 +56,8 @@ struct xe_sched_job {
>	u32 migrate_flush_flags;
>	/** @ring_ops_flush_tlb: The ring ops need to flush TLB before payload. */
>	bool ring_ops_flush_tlb;
> +	/** @ggtt: mapped in ggtt. */
> +	bool ggtt;
>	/** @ptrs: per instance pointers. */
>	struct xe_job_ptrs ptrs[];
>  };
>

  reply	other threads:[~2025-02-10 21:11 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-02-10 13:02 FAILED: patch "[PATCH] xe/oa: Fix query mode of operation for OAR/OAC" failed to apply to 6.13-stable tree gregkh
2025-02-10 21:11 ` Dixit, Ashutosh [this message]
2025-02-11 10:27   ` Greg KH

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=85ikph33pc.wl-ashutosh.dixit@intel.com \
    --to=ashutosh.dixit@intel.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=jonathan.cavitt@intel.com \
    --cc=matthew.brost@intel.com \
    --cc=stable@vger.kernel.org \
    --cc=umesh.nerlige.ramappa@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.