From: Rodrigo Vivi <rodrigo.vivi@intel.com>
To: Riana Tauro <riana.tauro@intel.com>
Cc: <intel-xe@lists.freedesktop.org>, <anshuman.gupta@intel.com>,
<badal.nilawar@intel.com>, <lucas.demarchi@intel.com>,
<himal.prasad.ghimiray@intel.com>
Subject: Re: [PATCH v2 3/3] RFC drm/xe: add fault injection for lmem init check
Date: Tue, 19 Mar 2024 10:38:06 -0400
Message-ID: <ZfmjTgCTsppFxe6j@intel.com>
In-Reply-To: <d214b05e-2107-4ee5-9b4d-ebfb5b7dd0f0@intel.com>
On Tue, Mar 19, 2024 at 10:16:47AM +0530, Riana Tauro wrote:
> Hi Rodrigo
>
> On 3/19/2024 2:45 AM, Rodrigo Vivi wrote:
> > On Fri, Mar 15, 2024 at 03:35:30PM +0530, Riana Tauro wrote:
> > > add a boot time fault injection for lmem init check.
> > > This can be triggered by adding a modparam fail_lmem_init
> > >
> > > xe.fail_lmem_init=<interval>,<probability>,<space>,<times>
> >
> > Please let's avoid module parameters as much as we can.
> >
> > Let's use the CONFIG_FAULT_INJECTION_DEBUG_FS
> > similarly to
> >
> > fault_create_debugfs_attr("fail_gt_reset", root, &gt_reset_failure);
> >
> The lmem init check is done during early probe. We cannot set debugfs before
> probe completes, so I added the module parameter.
doh! indeed! sorry about that.
>
> I can try to set static values before injecting fault if module param is not
> needed.
>
> lmem_init_fail.times = 1;
> lmem_init_fail.probability = 100;
no, let's go with the module parameter. It would be good if we could have
something per-device, but there's no way to pass an argument to the bind/probe
operation...
hmm, unless we also require the PCI ID as part of the param input.
The downside is that we would need to parse the string, then build another
string for setup_fault_attr().
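To make the per-device idea concrete, here is a rough userspace sketch of the parsing step. Everything in it is an assumption for illustration: the "<devid-hex>:<interval>,<probability>,<space>,<times>" layout, the struct fault_spec type and the parse_fail_lmem_init() helper do not exist in the patch. If the PCI ID is a prefix, the remainder of the string can be handed on as-is, which might avoid building a second string.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical container; the last four fields mirror the modparam values. */
struct fault_spec {
	unsigned int pci_devid;		/* PCI device ID the fault applies to */
	unsigned long interval;
	unsigned long probability;
	int space;
	int times;
};

/*
 * Parse "<devid-hex>:<interval>,<probability>,<space>,<times>"
 * (an assumed format, not the one in the patch).
 *
 * On success, *attr_str points at the text after ':' inside @param, so the
 * fault attribute part can be reused without copying.
 * Returns 0 on success, -EINVAL on malformed input.
 */
static int parse_fail_lmem_init(const char *param, struct fault_spec *spec,
				const char **attr_str)
{
	unsigned long id;
	char *end;

	if (!param || !spec)
		return -EINVAL;

	/* The device ID is hex, with or without a leading "0x". */
	errno = 0;
	id = strtoul(param, &end, 16);
	if (errno || end == param || *end != ':' || id > 0xffff)
		return -EINVAL;

	/* All four comma-separated values must be present. */
	if (sscanf(end + 1, "%lu,%lu,%d,%d", &spec->interval,
		   &spec->probability, &spec->space, &spec->times) != 4)
		return -EINVAL;

	spec->pci_devid = (unsigned int)id;
	if (attr_str)
		*attr_str = end + 1;

	return 0;
}
```

In the driver, the probe path would presumably compare pci_devid against the device being probed and pass the remaining string to setup_fault_attr() only on a match. That wiring is not shown here and would need checking against the real code.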
Also, I agree with Himal: an IGT test case is important here.
Thanks,
Rodrigo.
>
> Thanks
> Riana
> > And then use it like this:
> >
> > https://lore.kernel.org/all/20240315010843.194335-1-rodrigo.vivi@intel.com/
> >
> > >
> > > Adding this causes the lmem init check to fail causing
> > > the probe to defer.
> > >
> > > v2: add fault injection (Lucas)
> > >
> > > Signed-off-by: Riana Tauro <riana.tauro@intel.com>
> > > ---
> > > drivers/gpu/drm/xe/xe_device.c | 21 +++++++++++++++++++++
> > > drivers/gpu/drm/xe/xe_module.c | 5 +++++
> > > drivers/gpu/drm/xe/xe_module.h | 3 +++
> > > 3 files changed, 29 insertions(+)
> > >
> > > diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
> > > index 50473329cce7..393610e95bd1 100644
> > > --- a/drivers/gpu/drm/xe/xe_device.c
> > > +++ b/drivers/gpu/drm/xe/xe_device.c
> > > @@ -51,6 +51,10 @@ struct lockdep_map xe_device_mem_access_lockdep_map = {
> > > };
> > > #endif
> > > +#ifdef CONFIG_FAULT_INJECTION
> > > +DECLARE_FAULT_ATTR(lmem_init_fail);
> > > +#endif
> > > +
> > > static int xe_file_open(struct drm_device *dev, struct drm_file *file)
> > > {
> > > struct xe_device *xe = to_xe_device(dev);
> > > @@ -431,6 +435,23 @@ static int wait_for_lmem_ready(struct xe_device *xe)
> > > if (IS_SRIOV_VF(xe))
> > > return 0;
> > > +#ifdef CONFIG_FAULT_INJECTION
> > > + /*
> > > + * use fault injection to cause a lmem init failure to validate
> > > + * deferred probe. Set the verbose to 0 to avoid dump stack
> > > + */
> > > + if (xe_modparam.fail_lmem_init) {
> > > + setup_fault_attr(&lmem_init_fail, xe_modparam.fail_lmem_init);
> > > + lmem_init_fail.verbose = 0;
> > > + if (should_fail(&lmem_init_fail, 1)) {
> > > + /* add delay to reduce the number of deferred probe attempts */
> > > + msleep(500);
> > > + drm_dbg(&xe->drm, "Fault Injection lmem init failure\n");
> > > + return -EPROBE_DEFER;
> > > + }
> > > + }
> > > +#endif
> > > +
> > > if (verify_lmem_ready(gt))
> > > return 0;
> > > diff --git a/drivers/gpu/drm/xe/xe_module.c b/drivers/gpu/drm/xe/xe_module.c
> > > index 110b69864656..c4efbab430a7 100644
> > > --- a/drivers/gpu/drm/xe/xe_module.c
> > > +++ b/drivers/gpu/drm/xe/xe_module.c
> > > @@ -48,6 +48,11 @@ module_param_named_unsafe(force_probe, xe_modparam.force_probe, charp, 0400);
> > > MODULE_PARM_DESC(force_probe,
> > > "Force probe options for specified devices. See CONFIG_DRM_XE_FORCE_PROBE for details.");
> > > +#ifdef CONFIG_FAULT_INJECTION
> > > +module_param_named_unsafe(fail_lmem_init, xe_modparam.fail_lmem_init, charp, 0400);
> > > +MODULE_PARM_DESC(fail_lmem_init, "Fault injection. fail_lmem_init=<interval>,<probability>,<space>,<times>");
> > > +#endif
> > > +
> > > struct init_funcs {
> > > int (*init)(void);
> > > void (*exit)(void);
> > > diff --git a/drivers/gpu/drm/xe/xe_module.h b/drivers/gpu/drm/xe/xe_module.h
> > > index 88ef0e8b2bfd..ccbeacbc3efb 100644
> > > --- a/drivers/gpu/drm/xe/xe_module.h
> > > +++ b/drivers/gpu/drm/xe/xe_module.h
> > > @@ -18,6 +18,9 @@ struct xe_modparam {
> > > char *huc_firmware_path;
> > > char *gsc_firmware_path;
> > > char *force_probe;
> > > +#if IS_ENABLED(CONFIG_FAULT_INJECTION)
> > > + char *fail_lmem_init;
> > > +#endif /* CONFIG_FAULT_INJECTION */
> > > };
> > > extern struct xe_modparam xe_modparam;
> > > --
> > > 2.40.0
> > >