* [PATCH v4 1/2] devcoredump: Add dev_coredump_put()
@ 2024-04-09 20:02 José Roberto de Souza
2024-04-09 20:02 ` [PATCH v4 2/2] drm/xe: Remove devcoredump during driver release José Roberto de Souza
2024-04-09 22:23 ` [PATCH v4 1/2] devcoredump: Add dev_coredump_put() Rodrigo Vivi
0 siblings, 2 replies; 5+ messages in thread
From: José Roberto de Souza @ 2024-04-09 20:02 UTC (permalink / raw)
To: linux-kernel, intel-xe
Cc: Rodrigo Vivi, Mukesh Ojha, Johannes Berg, Jonathan Cavitt,
José Roberto de Souza
It is useful for modules that do not want to keep coredump available
after its unload.
Otherwise, the coredump would only be removed after DEVCD_TIMEOUT
seconds.
v2:
- dev_coredump_put() documentation updated (Mukesh)
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Mukesh Ojha <quic_mojha@quicinc.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Acked-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
---
drivers/base/devcoredump.c | 23 +++++++++++++++++++++++
include/linux/devcoredump.h | 5 +++++
2 files changed, 28 insertions(+)
diff --git a/drivers/base/devcoredump.c b/drivers/base/devcoredump.c
index 7e2d1f0d903a6..82aeb09b3d1b5 100644
--- a/drivers/base/devcoredump.c
+++ b/drivers/base/devcoredump.c
@@ -304,6 +304,29 @@ static ssize_t devcd_read_from_sgtable(char *buffer, loff_t offset,
offset);
}
+/**
+ * dev_coredump_put - remove device coredump
+ * @dev: the struct device for the crashed device
+ *
+ * dev_coredump_put() removes coredump, if exists, for a given device from
+ * the file system and free its associated data otherwise, does nothing.
+ *
+ * It is useful for modules that do not want to keep coredump
+ * available after its unload.
+ */
+void dev_coredump_put(struct device *dev)
+{
+ struct device *existing;
+
+ existing = class_find_device(&devcd_class, NULL, dev,
+ devcd_match_failing);
+ if (existing) {
+ devcd_free(existing, NULL);
+ put_device(existing);
+ }
+}
+EXPORT_SYMBOL_GPL(dev_coredump_put);
+
/**
* dev_coredumpm - create device coredump with read/free methods
* @dev: the struct device for the crashed device
diff --git a/include/linux/devcoredump.h b/include/linux/devcoredump.h
index c008169ed2c6f..c8f7eb6cc1915 100644
--- a/include/linux/devcoredump.h
+++ b/include/linux/devcoredump.h
@@ -63,6 +63,8 @@ void dev_coredumpm(struct device *dev, struct module *owner,
void dev_coredumpsg(struct device *dev, struct scatterlist *table,
size_t datalen, gfp_t gfp);
+
+void dev_coredump_put(struct device *dev);
#else
static inline void dev_coredumpv(struct device *dev, void *data,
size_t datalen, gfp_t gfp)
@@ -85,6 +87,9 @@ static inline void dev_coredumpsg(struct device *dev, struct scatterlist *table,
{
_devcd_free_sgtable(table);
}
+static inline void dev_coredump_put(struct device *dev)
+{
+}
#endif /* CONFIG_DEV_COREDUMP */
#endif /* __DEVCOREDUMP_H */
--
2.44.0
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [PATCH v4 2/2] drm/xe: Remove devcoredump during driver release
2024-04-09 20:02 [PATCH v4 1/2] devcoredump: Add dev_coredump_put() José Roberto de Souza
@ 2024-04-09 20:02 ` José Roberto de Souza
2024-04-09 22:21 ` Rodrigo Vivi
2024-04-09 22:23 ` [PATCH v4 1/2] devcoredump: Add dev_coredump_put() Rodrigo Vivi
1 sibling, 1 reply; 5+ messages in thread
From: José Roberto de Souza @ 2024-04-09 20:02 UTC (permalink / raw)
To: linux-kernel, intel-xe
Cc: Rodrigo Vivi, Jonathan Cavitt, José Roberto de Souza
This will remove devcoredump from file system and free its resources
during driver unload.
This fix the driver unload after gpu hang happened, otherwise this
it would report that Xe KMD is still in use and it would leave the
kernel in a state that Xe KMD can't be unload without a reboot.
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Jonathan Cavitt <jonathan.cavitt@intel.com>
Acked-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
---
drivers/gpu/drm/xe/xe_devcoredump.c | 13 ++++++++++++-
drivers/gpu/drm/xe/xe_devcoredump.h | 6 ++++++
drivers/gpu/drm/xe/xe_device.c | 4 ++++
3 files changed, 22 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/xe/xe_devcoredump.c b/drivers/gpu/drm/xe/xe_devcoredump.c
index 283ca7518aff2..3d7980232be1c 100644
--- a/drivers/gpu/drm/xe/xe_devcoredump.c
+++ b/drivers/gpu/drm/xe/xe_devcoredump.c
@@ -9,6 +9,8 @@
#include <linux/devcoredump.h>
#include <generated/utsrelease.h>
+#include <drm/drm_managed.h>
+
#include "xe_device.h"
#include "xe_exec_queue.h"
#include "xe_force_wake.h"
@@ -235,5 +237,14 @@ void xe_devcoredump(struct xe_sched_job *job)
dev_coredumpm(xe->drm.dev, THIS_MODULE, coredump, 0, GFP_KERNEL,
xe_devcoredump_read, xe_devcoredump_free);
}
-#endif
+static void xe_driver_devcoredump_fini(struct drm_device *drm, void *arg)
+{
+ dev_coredump_put(drm->dev);
+}
+
+int xe_devcoredump_init(struct xe_device *xe)
+{
+ return drmm_add_action_or_reset(&xe->drm, xe_driver_devcoredump_fini, xe);
+}
+#endif
diff --git a/drivers/gpu/drm/xe/xe_devcoredump.h b/drivers/gpu/drm/xe/xe_devcoredump.h
index df8671f0b5eb2..e2fa65ce09322 100644
--- a/drivers/gpu/drm/xe/xe_devcoredump.h
+++ b/drivers/gpu/drm/xe/xe_devcoredump.h
@@ -11,10 +11,16 @@ struct xe_sched_job;
#ifdef CONFIG_DEV_COREDUMP
void xe_devcoredump(struct xe_sched_job *job);
+int xe_devcoredump_init(struct xe_device *xe);
#else
static inline void xe_devcoredump(struct xe_sched_job *job)
{
}
+
+static inline int xe_devcoredump_init(struct xe_device *xe)
+{
+ return 0;
+}
#endif
#endif
diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
index 9083f5e02dd9e..ce27d0d1bdb34 100644
--- a/drivers/gpu/drm/xe/xe_device.c
+++ b/drivers/gpu/drm/xe/xe_device.c
@@ -20,6 +20,7 @@
#include "regs/xe_regs.h"
#include "xe_bo.h"
#include "xe_debugfs.h"
+#include "xe_devcoredump.h"
#include "xe_dma_buf.h"
#include "xe_drm_client.h"
#include "xe_drv.h"
@@ -513,6 +514,9 @@ int xe_device_probe(struct xe_device *xe)
return err;
}
+ err = xe_devcoredump_init(xe);
+ if (err)
+ return err;
err = drmm_add_action_or_reset(&xe->drm, xe_driver_flr_fini, xe);
if (err)
return err;
--
2.44.0
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH v4 2/2] drm/xe: Remove devcoredump during driver release
2024-04-09 20:02 ` [PATCH v4 2/2] drm/xe: Remove devcoredump during driver release José Roberto de Souza
@ 2024-04-09 22:21 ` Rodrigo Vivi
0 siblings, 0 replies; 5+ messages in thread
From: Rodrigo Vivi @ 2024-04-09 22:21 UTC (permalink / raw)
To: José Roberto de Souza; +Cc: linux-kernel, intel-xe, Jonathan Cavitt
On Tue, Apr 09, 2024 at 01:02:06PM -0700, José Roberto de Souza wrote:
> This will remove devcoredump from file system and free its resources
> during driver unload.
>
> This fix the driver unload after gpu hang happened, otherwise this
> it would report that Xe KMD is still in use and it would leave the
> kernel in a state that Xe KMD can't be unload without a reboot.
>
> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> Cc: Jonathan Cavitt <jonathan.cavitt@intel.com>
> Acked-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
> Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> ---
> drivers/gpu/drm/xe/xe_devcoredump.c | 13 ++++++++++++-
> drivers/gpu/drm/xe/xe_devcoredump.h | 6 ++++++
> drivers/gpu/drm/xe/xe_device.c | 4 ++++
> 3 files changed, 22 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_devcoredump.c b/drivers/gpu/drm/xe/xe_devcoredump.c
> index 283ca7518aff2..3d7980232be1c 100644
> --- a/drivers/gpu/drm/xe/xe_devcoredump.c
> +++ b/drivers/gpu/drm/xe/xe_devcoredump.c
> @@ -9,6 +9,8 @@
> #include <linux/devcoredump.h>
> #include <generated/utsrelease.h>
>
> +#include <drm/drm_managed.h>
> +
> #include "xe_device.h"
> #include "xe_exec_queue.h"
> #include "xe_force_wake.h"
> @@ -235,5 +237,14 @@ void xe_devcoredump(struct xe_sched_job *job)
> dev_coredumpm(xe->drm.dev, THIS_MODULE, coredump, 0, GFP_KERNEL,
> xe_devcoredump_read, xe_devcoredump_free);
> }
> -#endif
>
> +static void xe_driver_devcoredump_fini(struct drm_device *drm, void *arg)
> +{
> + dev_coredump_put(drm->dev);
> +}
> +
> +int xe_devcoredump_init(struct xe_device *xe)
> +{
> + return drmm_add_action_or_reset(&xe->drm, xe_driver_devcoredump_fini, xe);
> +}
> +#endif
> diff --git a/drivers/gpu/drm/xe/xe_devcoredump.h b/drivers/gpu/drm/xe/xe_devcoredump.h
> index df8671f0b5eb2..e2fa65ce09322 100644
> --- a/drivers/gpu/drm/xe/xe_devcoredump.h
> +++ b/drivers/gpu/drm/xe/xe_devcoredump.h
> @@ -11,10 +11,16 @@ struct xe_sched_job;
>
> #ifdef CONFIG_DEV_COREDUMP
> void xe_devcoredump(struct xe_sched_job *job);
> +int xe_devcoredump_init(struct xe_device *xe);
> #else
> static inline void xe_devcoredump(struct xe_sched_job *job)
> {
> }
> +
> +static inline int xe_devcoredump_init(struct xe_device *xe)
> +{
> + return 0;
> +}
> #endif
>
> #endif
> diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
> index 9083f5e02dd9e..ce27d0d1bdb34 100644
> --- a/drivers/gpu/drm/xe/xe_device.c
> +++ b/drivers/gpu/drm/xe/xe_device.c
> @@ -20,6 +20,7 @@
> #include "regs/xe_regs.h"
> #include "xe_bo.h"
> #include "xe_debugfs.h"
> +#include "xe_devcoredump.h"
> #include "xe_dma_buf.h"
> #include "xe_drm_client.h"
> #include "xe_drv.h"
> @@ -513,6 +514,9 @@ int xe_device_probe(struct xe_device *xe)
> return err;
> }
>
> + err = xe_devcoredump_init(xe);
> + if (err)
> + return err;
> err = drmm_add_action_or_reset(&xe->drm, xe_driver_flr_fini, xe);
> if (err)
> return err;
> --
> 2.44.0
>
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH v4 1/2] devcoredump: Add dev_coredump_put()
2024-04-09 20:02 [PATCH v4 1/2] devcoredump: Add dev_coredump_put() José Roberto de Souza
2024-04-09 20:02 ` [PATCH v4 2/2] drm/xe: Remove devcoredump during driver release José Roberto de Souza
@ 2024-04-09 22:23 ` Rodrigo Vivi
2024-04-11 13:08 ` Greg KH
1 sibling, 1 reply; 5+ messages in thread
From: Rodrigo Vivi @ 2024-04-09 22:23 UTC (permalink / raw)
To: José Roberto de Souza, gregkh
Cc: linux-kernel, intel-xe, Mukesh Ojha, Johannes Berg,
Jonathan Cavitt
On Tue, Apr 09, 2024 at 01:02:05PM -0700, José Roberto de Souza wrote:
> It is useful for modules that do not want to keep coredump available
> after its unload.
> Otherwise, the coredump would only be removed after DEVCD_TIMEOUT
> seconds.
>
> v2:
> - dev_coredump_put() documentation updated (Mukesh)
>
> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> Cc: Mukesh Ojha <quic_mojha@quicinc.com>
> Cc: Johannes Berg <johannes@sipsolutions.net>
> Cc: Jonathan Cavitt <jonathan.cavitt@intel.com>
> Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
> Acked-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
> Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Greg, can I have you ack to merge this through the drm-next?
> ---
> drivers/base/devcoredump.c | 23 +++++++++++++++++++++++
> include/linux/devcoredump.h | 5 +++++
> 2 files changed, 28 insertions(+)
>
> diff --git a/drivers/base/devcoredump.c b/drivers/base/devcoredump.c
> index 7e2d1f0d903a6..82aeb09b3d1b5 100644
> --- a/drivers/base/devcoredump.c
> +++ b/drivers/base/devcoredump.c
> @@ -304,6 +304,29 @@ static ssize_t devcd_read_from_sgtable(char *buffer, loff_t offset,
> offset);
> }
>
> +/**
> + * dev_coredump_put - remove device coredump
> + * @dev: the struct device for the crashed device
> + *
> + * dev_coredump_put() removes coredump, if exists, for a given device from
> + * the file system and free its associated data otherwise, does nothing.
> + *
> + * It is useful for modules that do not want to keep coredump
> + * available after its unload.
> + */
> +void dev_coredump_put(struct device *dev)
> +{
> + struct device *existing;
> +
> + existing = class_find_device(&devcd_class, NULL, dev,
> + devcd_match_failing);
> + if (existing) {
> + devcd_free(existing, NULL);
> + put_device(existing);
> + }
> +}
> +EXPORT_SYMBOL_GPL(dev_coredump_put);
> +
> /**
> * dev_coredumpm - create device coredump with read/free methods
> * @dev: the struct device for the crashed device
> diff --git a/include/linux/devcoredump.h b/include/linux/devcoredump.h
> index c008169ed2c6f..c8f7eb6cc1915 100644
> --- a/include/linux/devcoredump.h
> +++ b/include/linux/devcoredump.h
> @@ -63,6 +63,8 @@ void dev_coredumpm(struct device *dev, struct module *owner,
>
> void dev_coredumpsg(struct device *dev, struct scatterlist *table,
> size_t datalen, gfp_t gfp);
> +
> +void dev_coredump_put(struct device *dev);
> #else
> static inline void dev_coredumpv(struct device *dev, void *data,
> size_t datalen, gfp_t gfp)
> @@ -85,6 +87,9 @@ static inline void dev_coredumpsg(struct device *dev, struct scatterlist *table,
> {
> _devcd_free_sgtable(table);
> }
> +static inline void dev_coredump_put(struct device *dev)
> +{
> +}
> #endif /* CONFIG_DEV_COREDUMP */
>
> #endif /* __DEVCOREDUMP_H */
> --
> 2.44.0
>
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH v4 1/2] devcoredump: Add dev_coredump_put()
2024-04-09 22:23 ` [PATCH v4 1/2] devcoredump: Add dev_coredump_put() Rodrigo Vivi
@ 2024-04-11 13:08 ` Greg KH
0 siblings, 0 replies; 5+ messages in thread
From: Greg KH @ 2024-04-11 13:08 UTC (permalink / raw)
To: Rodrigo Vivi
Cc: José Roberto de Souza, linux-kernel, intel-xe, Mukesh Ojha,
Johannes Berg, Jonathan Cavitt
On Tue, Apr 09, 2024 at 06:23:33PM -0400, Rodrigo Vivi wrote:
> On Tue, Apr 09, 2024 at 01:02:05PM -0700, José Roberto de Souza wrote:
> > It is useful for modules that do not want to keep coredump available
> > after its unload.
> > Otherwise, the coredump would only be removed after DEVCD_TIMEOUT
> > seconds.
> >
> > v2:
> > - dev_coredump_put() documentation updated (Mukesh)
> >
> > Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > Cc: Mukesh Ojha <quic_mojha@quicinc.com>
> > Cc: Johannes Berg <johannes@sipsolutions.net>
> > Cc: Jonathan Cavitt <jonathan.cavitt@intel.com>
> > Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
> > Acked-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
> > Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
>
> Greg, can I have you ack to merge this through the drm-next?
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2024-04-11 13:08 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-04-09 20:02 [PATCH v4 1/2] devcoredump: Add dev_coredump_put() José Roberto de Souza
2024-04-09 20:02 ` [PATCH v4 2/2] drm/xe: Remove devcoredump during driver release José Roberto de Souza
2024-04-09 22:21 ` Rodrigo Vivi
2024-04-09 22:23 ` [PATCH v4 1/2] devcoredump: Add dev_coredump_put() Rodrigo Vivi
2024-04-11 13:08 ` Greg KH
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).