* [PATCH v5 01/11] PM / devfreq: Drop unneeded locking to appease lockdep
2023-08-22 18:01 [PATCH v5 00/11] drm/msm+PM+icc: Make job_run() reclaim-safe Rob Clark
@ 2023-08-22 18:01 ` Rob Clark
2023-08-22 18:01 ` [PATCH v5 02/11] PM / devfreq: Teach lockdep about locking order Rob Clark
` (5 subsequent siblings)
6 siblings, 0 replies; 12+ messages in thread
From: Rob Clark @ 2023-08-22 18:01 UTC (permalink / raw)
To: dri-devel
Cc: linux-arm-msm, freedreno, Rob Clark, MyungJoo Ham, Kyungmin Park,
Chanwoo Choi, open list:DEVICE FREQUENCY (DEVFREQ), open list
From: Rob Clark <robdclark@chromium.org>
In the process of adding lockdep annotation for GPU job_run() path to
catch potential deadlocks against the shrinker/reclaim path, I turned
up this lockdep splat:
======================================================
WARNING: possible circular locking dependency detected
6.2.0-rc8-debug+ #556 Not tainted
------------------------------------------------------
ring0/123 is trying to acquire lock:
ffffff8087219078 (&devfreq->lock){+.+.}-{3:3}, at: devfreq_monitor_resume+0x3c/0xf0
but task is already holding lock:
ffffffd6f64e57e8 (dma_fence_map){++++}-{0:0}, at: msm_job_run+0x68/0x150
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #3 (dma_fence_map){++++}-{0:0}:
__dma_fence_might_wait+0x74/0xc0
dma_resv_lockdep+0x1f4/0x2f4
do_one_initcall+0x104/0x2bc
kernel_init_freeable+0x344/0x34c
kernel_init+0x30/0x134
ret_from_fork+0x10/0x20
-> #2 (mmu_notifier_invalidate_range_start){+.+.}-{0:0}:
fs_reclaim_acquire+0x80/0xa8
slab_pre_alloc_hook.constprop.0+0x40/0x25c
__kmem_cache_alloc_node+0x60/0x1cc
__kmalloc+0xd8/0x100
topology_parse_cpu_capacity+0x8c/0x178
get_cpu_for_node+0x88/0xc4
parse_cluster+0x1b0/0x28c
parse_cluster+0x8c/0x28c
init_cpu_topology+0x168/0x188
smp_prepare_cpus+0x24/0xf8
kernel_init_freeable+0x18c/0x34c
kernel_init+0x30/0x134
ret_from_fork+0x10/0x20
-> #1 (fs_reclaim){+.+.}-{0:0}:
__fs_reclaim_acquire+0x3c/0x48
fs_reclaim_acquire+0x54/0xa8
slab_pre_alloc_hook.constprop.0+0x40/0x25c
__kmem_cache_alloc_node+0x60/0x1cc
__kmalloc_node_track_caller+0xb8/0xe0
kstrdup+0x70/0x90
kstrdup_const+0x38/0x48
kvasprintf_const+0x48/0xbc
kobject_set_name_vargs+0x40/0xb0
dev_set_name+0x64/0x8c
devfreq_add_device+0x31c/0x55c
devm_devfreq_add_device+0x6c/0xb8
msm_devfreq_init+0xa8/0x16c
msm_gpu_init+0x38c/0x570
adreno_gpu_init+0x1b4/0x2b4
a6xx_gpu_init+0x15c/0x3e4
adreno_bind+0x218/0x254
component_bind_all+0x114/0x1ec
msm_drm_bind+0x2b8/0x608
try_to_bring_up_aggregate_device+0x88/0x1a4
__component_add+0xec/0x13c
component_add+0x1c/0x28
dsi_dev_attach+0x28/0x34
dsi_host_attach+0xdc/0x124
mipi_dsi_attach+0x30/0x44
devm_mipi_dsi_attach+0x2c/0x70
ti_sn_bridge_probe+0x298/0x2c4
auxiliary_bus_probe+0x7c/0x94
really_probe+0x158/0x290
__driver_probe_device+0xc8/0xe0
driver_probe_device+0x44/0x100
__device_attach_driver+0x64/0xdc
bus_for_each_drv+0xa0/0xc8
__device_attach+0xd8/0x168
device_initial_probe+0x1c/0x28
bus_probe_device+0x38/0xa0
deferred_probe_work_func+0xc8/0xe0
process_one_work+0x2d8/0x478
process_scheduled_works+0x4c/0x50
worker_thread+0x218/0x274
kthread+0xf0/0x100
ret_from_fork+0x10/0x20
-> #0 (&devfreq->lock){+.+.}-{3:3}:
__lock_acquire+0xe00/0x1060
lock_acquire+0x1e0/0x2f8
__mutex_lock+0xcc/0x3c8
mutex_lock_nested+0x30/0x44
devfreq_monitor_resume+0x3c/0xf0
devfreq_simple_ondemand_handler+0x54/0x7c
devfreq_resume_device+0xa4/0xe8
msm_devfreq_resume+0x78/0xa8
a6xx_pm_resume+0x110/0x234
adreno_runtime_resume+0x2c/0x38
pm_generic_runtime_resume+0x30/0x44
__rpm_callback+0x15c/0x174
rpm_callback+0x78/0x7c
rpm_resume+0x318/0x524
__pm_runtime_resume+0x78/0xbc
pm_runtime_get_sync.isra.0+0x14/0x20
msm_gpu_submit+0x58/0x178
msm_job_run+0x78/0x150
drm_sched_main+0x290/0x370
kthread+0xf0/0x100
ret_from_fork+0x10/0x20
other info that might help us debug this:
Chain exists of:
&devfreq->lock --> mmu_notifier_invalidate_range_start --> dma_fence_map
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(dma_fence_map);
lock(mmu_notifier_invalidate_range_start);
lock(dma_fence_map);
lock(&devfreq->lock);
*** DEADLOCK ***
2 locks held by ring0/123:
#0: ffffff8087201170 (&gpu->lock){+.+.}-{3:3}, at: msm_job_run+0x64/0x150
#1: ffffffd6f64e57e8 (dma_fence_map){++++}-{0:0}, at: msm_job_run+0x68/0x150
stack backtrace:
CPU: 6 PID: 123 Comm: ring0 Not tainted 6.2.0-rc8-debug+ #556
Hardware name: Google Lazor (rev1 - 2) with LTE (DT)
Call trace:
dump_backtrace.part.0+0xb4/0xf8
show_stack+0x20/0x38
dump_stack_lvl+0x9c/0xd0
dump_stack+0x18/0x34
print_circular_bug+0x1b4/0x1f0
check_noncircular+0x78/0xac
__lock_acquire+0xe00/0x1060
lock_acquire+0x1e0/0x2f8
__mutex_lock+0xcc/0x3c8
mutex_lock_nested+0x30/0x44
devfreq_monitor_resume+0x3c/0xf0
devfreq_simple_ondemand_handler+0x54/0x7c
devfreq_resume_device+0xa4/0xe8
msm_devfreq_resume+0x78/0xa8
a6xx_pm_resume+0x110/0x234
adreno_runtime_resume+0x2c/0x38
pm_generic_runtime_resume+0x30/0x44
__rpm_callback+0x15c/0x174
rpm_callback+0x78/0x7c
rpm_resume+0x318/0x524
__pm_runtime_resume+0x78/0xbc
pm_runtime_get_sync.isra.0+0x14/0x20
msm_gpu_submit+0x58/0x178
msm_job_run+0x78/0x150
drm_sched_main+0x290/0x370
kthread+0xf0/0x100
ret_from_fork+0x10/0x20
The issue is that we cannot hold any lock while doing memory
allocations if that lock is also needed in the job_run() path (which,
in the case of devfreq, means runpm_resume()), because lockdep sees
this as a potential dependency.
Fortunately there is really no reason to hold the devfreq lock when
we are creating the devfreq device, as it is not yet visible to any
other task. The only reason it was needed was for a lockdep assert
in devfreq_get_freq_range(). Instead, split this up into an internal
function that is used from devfreq_add_device(), where the lock is not
required.
Signed-off-by: Rob Clark <robdclark@chromium.org>
---
drivers/devfreq/devfreq.c | 46 ++++++++++++++++++---------------------
1 file changed, 21 insertions(+), 25 deletions(-)
diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
index e36cbb920ec8..e5558ec68ce8 100644
--- a/drivers/devfreq/devfreq.c
+++ b/drivers/devfreq/devfreq.c
@@ -111,23 +111,13 @@ static unsigned long find_available_max_freq(struct devfreq *devfreq)
return max_freq;
}
-/**
- * devfreq_get_freq_range() - Get the current freq range
- * @devfreq: the devfreq instance
- * @min_freq: the min frequency
- * @max_freq: the max frequency
- *
- * This takes into consideration all constraints.
- */
-void devfreq_get_freq_range(struct devfreq *devfreq,
- unsigned long *min_freq,
- unsigned long *max_freq)
+static void __get_freq_range(struct devfreq *devfreq,
+ unsigned long *min_freq,
+ unsigned long *max_freq)
{
unsigned long *freq_table = devfreq->freq_table;
s32 qos_min_freq, qos_max_freq;
- lockdep_assert_held(&devfreq->lock);
-
/*
* Initialize minimum/maximum frequency from freq table.
* The devfreq drivers can initialize this in either ascending or
@@ -158,6 +148,23 @@ void devfreq_get_freq_range(struct devfreq *devfreq,
if (*min_freq > *max_freq)
*min_freq = *max_freq;
}
+
+/**
+ * devfreq_get_freq_range() - Get the current freq range
+ * @devfreq: the devfreq instance
+ * @min_freq: the min frequency
+ * @max_freq: the max frequency
+ *
+ * This takes into consideration all constraints.
+ */
+void devfreq_get_freq_range(struct devfreq *devfreq,
+ unsigned long *min_freq,
+ unsigned long *max_freq)
+{
+ lockdep_assert_held(&devfreq->lock);
+
+ __get_freq_range(devfreq, min_freq, max_freq);
+}
EXPORT_SYMBOL(devfreq_get_freq_range);
/**
@@ -810,7 +817,6 @@ struct devfreq *devfreq_add_device(struct device *dev,
}
mutex_init(&devfreq->lock);
- mutex_lock(&devfreq->lock);
devfreq->dev.parent = dev;
devfreq->dev.class = devfreq_class;
devfreq->dev.release = devfreq_dev_release;
@@ -823,17 +829,14 @@ struct devfreq *devfreq_add_device(struct device *dev,
if (devfreq->profile->timer < 0
|| devfreq->profile->timer >= DEVFREQ_TIMER_NUM) {
- mutex_unlock(&devfreq->lock);
err = -EINVAL;
goto err_dev;
}
if (!devfreq->profile->max_state || !devfreq->profile->freq_table) {
- mutex_unlock(&devfreq->lock);
err = set_freq_table(devfreq);
if (err < 0)
goto err_dev;
- mutex_lock(&devfreq->lock);
} else {
devfreq->freq_table = devfreq->profile->freq_table;
devfreq->max_state = devfreq->profile->max_state;
@@ -841,19 +844,17 @@ struct devfreq *devfreq_add_device(struct device *dev,
devfreq->scaling_min_freq = find_available_min_freq(devfreq);
if (!devfreq->scaling_min_freq) {
- mutex_unlock(&devfreq->lock);
err = -EINVAL;
goto err_dev;
}
devfreq->scaling_max_freq = find_available_max_freq(devfreq);
if (!devfreq->scaling_max_freq) {
- mutex_unlock(&devfreq->lock);
err = -EINVAL;
goto err_dev;
}
- devfreq_get_freq_range(devfreq, &min_freq, &max_freq);
+ __get_freq_range(devfreq, &min_freq, &max_freq);
devfreq->suspend_freq = dev_pm_opp_get_suspend_opp_freq(dev);
devfreq->opp_table = dev_pm_opp_get_opp_table(dev);
@@ -865,7 +866,6 @@ struct devfreq *devfreq_add_device(struct device *dev,
dev_set_name(&devfreq->dev, "%s", dev_name(dev));
err = device_register(&devfreq->dev);
if (err) {
- mutex_unlock(&devfreq->lock);
put_device(&devfreq->dev);
goto err_out;
}
@@ -876,7 +876,6 @@ struct devfreq *devfreq_add_device(struct device *dev,
devfreq->max_state),
GFP_KERNEL);
if (!devfreq->stats.trans_table) {
- mutex_unlock(&devfreq->lock);
err = -ENOMEM;
goto err_devfreq;
}
@@ -886,7 +885,6 @@ struct devfreq *devfreq_add_device(struct device *dev,
sizeof(*devfreq->stats.time_in_state),
GFP_KERNEL);
if (!devfreq->stats.time_in_state) {
- mutex_unlock(&devfreq->lock);
err = -ENOMEM;
goto err_devfreq;
}
@@ -896,8 +894,6 @@ struct devfreq *devfreq_add_device(struct device *dev,
srcu_init_notifier_head(&devfreq->transition_notifier_list);
- mutex_unlock(&devfreq->lock);
-
err = dev_pm_qos_add_request(dev, &devfreq->user_min_freq_req,
DEV_PM_QOS_MIN_FREQUENCY, 0);
if (err < 0)
--
2.41.0
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH v5 02/11] PM / devfreq: Teach lockdep about locking order
2023-08-22 18:01 [PATCH v5 00/11] drm/msm+PM+icc: Make job_run() reclaim-safe Rob Clark
2023-08-22 18:01 ` [PATCH v5 01/11] PM / devfreq: Drop unneeded locking to appease lockdep Rob Clark
@ 2023-08-22 18:01 ` Rob Clark
2023-08-22 18:01 ` [PATCH v5 03/11] PM / QoS: Fix constraints alloc vs reclaim locking Rob Clark
` (4 subsequent siblings)
6 siblings, 0 replies; 12+ messages in thread
From: Rob Clark @ 2023-08-22 18:01 UTC (permalink / raw)
To: dri-devel
Cc: linux-arm-msm, freedreno, Rob Clark, MyungJoo Ham, Kyungmin Park,
Chanwoo Choi, open list:DEVICE FREQUENCY (DEVFREQ), open list
From: Rob Clark <robdclark@chromium.org>
This will make it easier to catch places doing allocations that can
trigger reclaim under devfreq->lock.
Because devfreq->lock is held over various devfreq_dev_profile
callbacks, there might be some fallout if those callbacks do allocations
that can trigger reclaim, but I've looked through the various callback
implementations and don't see anything obvious. If it does trigger any
lockdep splats, those should be fixed.
Signed-off-by: Rob Clark <robdclark@chromium.org>
---
drivers/devfreq/devfreq.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
index e5558ec68ce8..81add6064406 100644
--- a/drivers/devfreq/devfreq.c
+++ b/drivers/devfreq/devfreq.c
@@ -817,6 +817,12 @@ struct devfreq *devfreq_add_device(struct device *dev,
}
mutex_init(&devfreq->lock);
+
+ /* Teach lockdep about lock ordering wrt. shrinker: */
+ fs_reclaim_acquire(GFP_KERNEL);
+ might_lock(&devfreq->lock);
+ fs_reclaim_release(GFP_KERNEL);
+
devfreq->dev.parent = dev;
devfreq->dev.class = devfreq_class;
devfreq->dev.release = devfreq_dev_release;
--
2.41.0
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH v5 03/11] PM / QoS: Fix constraints alloc vs reclaim locking
2023-08-22 18:01 [PATCH v5 00/11] drm/msm+PM+icc: Make job_run() reclaim-safe Rob Clark
2023-08-22 18:01 ` [PATCH v5 01/11] PM / devfreq: Drop unneeded locking to appease lockdep Rob Clark
2023-08-22 18:01 ` [PATCH v5 02/11] PM / devfreq: Teach lockdep about locking order Rob Clark
@ 2023-08-22 18:01 ` Rob Clark
2023-08-22 18:47 ` Rafael J. Wysocki
2023-08-22 18:01 ` [PATCH v5 04/11] PM / QoS: Decouple request alloc from dev_pm_qos_mtx Rob Clark
` (3 subsequent siblings)
6 siblings, 1 reply; 12+ messages in thread
From: Rob Clark @ 2023-08-22 18:01 UTC (permalink / raw)
To: dri-devel
Cc: linux-arm-msm, freedreno, Rob Clark, Rafael J . Wysocki,
Pavel Machek, Len Brown, Greg Kroah-Hartman,
open list:HIBERNATION (aka Software Suspend, aka swsusp),
open list
From: Rob Clark <robdclark@chromium.org>
In the process of adding lockdep annotation for drm GPU scheduler's
job_run() to detect potential deadlock against shrinker/reclaim, I hit
this lockdep splat:
======================================================
WARNING: possible circular locking dependency detected
6.2.0-rc8-debug+ #558 Tainted: G W
------------------------------------------------------
ring0/125 is trying to acquire lock:
ffffffd6d6ce0f28 (dev_pm_qos_mtx){+.+.}-{3:3}, at: dev_pm_qos_update_request+0x38/0x68
but task is already holding lock:
ffffff8087239208 (&gpu->active_lock){+.+.}-{3:3}, at: msm_gpu_submit+0xec/0x178
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #4 (&gpu->active_lock){+.+.}-{3:3}:
__mutex_lock+0xcc/0x3c8
mutex_lock_nested+0x30/0x44
msm_gpu_submit+0xec/0x178
msm_job_run+0x78/0x150
drm_sched_main+0x290/0x370
kthread+0xf0/0x100
ret_from_fork+0x10/0x20
-> #3 (dma_fence_map){++++}-{0:0}:
__dma_fence_might_wait+0x74/0xc0
dma_resv_lockdep+0x1f4/0x2f4
do_one_initcall+0x104/0x2bc
kernel_init_freeable+0x344/0x34c
kernel_init+0x30/0x134
ret_from_fork+0x10/0x20
-> #2 (mmu_notifier_invalidate_range_start){+.+.}-{0:0}:
fs_reclaim_acquire+0x80/0xa8
slab_pre_alloc_hook.constprop.0+0x40/0x25c
__kmem_cache_alloc_node+0x60/0x1cc
__kmalloc+0xd8/0x100
topology_parse_cpu_capacity+0x8c/0x178
get_cpu_for_node+0x88/0xc4
parse_cluster+0x1b0/0x28c
parse_cluster+0x8c/0x28c
init_cpu_topology+0x168/0x188
smp_prepare_cpus+0x24/0xf8
kernel_init_freeable+0x18c/0x34c
kernel_init+0x30/0x134
ret_from_fork+0x10/0x20
-> #1 (fs_reclaim){+.+.}-{0:0}:
__fs_reclaim_acquire+0x3c/0x48
fs_reclaim_acquire+0x54/0xa8
slab_pre_alloc_hook.constprop.0+0x40/0x25c
__kmem_cache_alloc_node+0x60/0x1cc
kmalloc_trace+0x50/0xa8
dev_pm_qos_constraints_allocate+0x38/0x100
__dev_pm_qos_add_request+0xb0/0x1e8
dev_pm_qos_add_request+0x58/0x80
dev_pm_qos_expose_latency_limit+0x60/0x13c
register_cpu+0x12c/0x130
topology_init+0xac/0xbc
do_one_initcall+0x104/0x2bc
kernel_init_freeable+0x344/0x34c
kernel_init+0x30/0x134
ret_from_fork+0x10/0x20
-> #0 (dev_pm_qos_mtx){+.+.}-{3:3}:
__lock_acquire+0xe00/0x1060
lock_acquire+0x1e0/0x2f8
__mutex_lock+0xcc/0x3c8
mutex_lock_nested+0x30/0x44
dev_pm_qos_update_request+0x38/0x68
msm_devfreq_boost+0x40/0x70
msm_devfreq_active+0xc0/0xf0
msm_gpu_submit+0x10c/0x178
msm_job_run+0x78/0x150
drm_sched_main+0x290/0x370
kthread+0xf0/0x100
ret_from_fork+0x10/0x20
other info that might help us debug this:
Chain exists of:
dev_pm_qos_mtx --> dma_fence_map --> &gpu->active_lock
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&gpu->active_lock);
lock(dma_fence_map);
lock(&gpu->active_lock);
lock(dev_pm_qos_mtx);
*** DEADLOCK ***
3 locks held by ring0/123:
#0: ffffff8087251170 (&gpu->lock){+.+.}-{3:3}, at: msm_job_run+0x64/0x150
#1: ffffffd00b0e57e8 (dma_fence_map){++++}-{0:0}, at: msm_job_run+0x68/0x150
#2: ffffff8087251208 (&gpu->active_lock){+.+.}-{3:3}, at: msm_gpu_submit+0xec/0x178
stack backtrace:
CPU: 6 PID: 123 Comm: ring0 Not tainted 6.2.0-rc8-debug+ #559
Hardware name: Google Lazor (rev1 - 2) with LTE (DT)
Call trace:
dump_backtrace.part.0+0xb4/0xf8
show_stack+0x20/0x38
dump_stack_lvl+0x9c/0xd0
dump_stack+0x18/0x34
print_circular_bug+0x1b4/0x1f0
check_noncircular+0x78/0xac
__lock_acquire+0xe00/0x1060
lock_acquire+0x1e0/0x2f8
__mutex_lock+0xcc/0x3c8
mutex_lock_nested+0x30/0x44
dev_pm_qos_update_request+0x38/0x68
msm_devfreq_boost+0x40/0x70
msm_devfreq_active+0xc0/0xf0
msm_gpu_submit+0x10c/0x178
msm_job_run+0x78/0x150
drm_sched_main+0x290/0x370
kthread+0xf0/0x100
ret_from_fork+0x10/0x20
The issue is that dev_pm_qos_mtx is held in the runpm suspend/resume (or
freq change) path, but it is also held across allocations that could
recurse into shrinker.
Solve this by changing dev_pm_qos_constraints_allocate() into a function
that can be called unconditionally before the device qos object is
needed and before acquiring dev_pm_qos_mtx. This way the allocations can
be done without holding the mutex. In the case that we raced with
another thread to allocate the qos object, detect this *after* acquiring
the dev_pm_qos_mtx and simply free the redundant allocations.
Suggested-by: Rafael J. Wysocki <rafael@kernel.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>
---
drivers/base/power/qos.c | 76 +++++++++++++++++++++++++++++-----------
1 file changed, 56 insertions(+), 20 deletions(-)
diff --git a/drivers/base/power/qos.c b/drivers/base/power/qos.c
index 8e93167f1783..7e95760d16dc 100644
--- a/drivers/base/power/qos.c
+++ b/drivers/base/power/qos.c
@@ -185,27 +185,33 @@ static int apply_constraint(struct dev_pm_qos_request *req,
}
/*
- * dev_pm_qos_constraints_allocate
+ * dev_pm_qos_constraints_allocate: Allocate and initialize qos constraints
* @dev: device to allocate data for
*
- * Called at the first call to add_request, for constraint data allocation
- * Must be called with the dev_pm_qos_mtx mutex held
+ * Called to allocate constraints before dev_pm_qos_mtx mutex is held. Should
+ * be matched with a call to dev_pm_qos_constraints_set() once dev_pm_qos_mtx
+ * is held.
*/
-static int dev_pm_qos_constraints_allocate(struct device *dev)
+static struct dev_pm_qos *dev_pm_qos_constraints_allocate(struct device *dev)
{
struct dev_pm_qos *qos;
struct pm_qos_constraints *c;
struct blocking_notifier_head *n;
- qos = kzalloc(sizeof(*qos), GFP_KERNEL);
+ /*
+ * If constraints are already allocated, we can skip speculatively
+ * allocating a new one, as we don't have to worry about qos transitioning
+ * from non-null to null. The constraints are only freed on device
+ * removal.
+ */
+ if (dev->power.qos)
+ return NULL;
+
+ qos = kzalloc(sizeof(*qos) + 3 * sizeof(*n), GFP_KERNEL);
if (!qos)
- return -ENOMEM;
+ return NULL;
- n = kzalloc(3 * sizeof(*n), GFP_KERNEL);
- if (!n) {
- kfree(qos);
- return -ENOMEM;
- }
+ n = (struct blocking_notifier_head *)(qos + 1);
c = &qos->resume_latency;
plist_head_init(&c->list);
@@ -227,11 +233,29 @@ static int dev_pm_qos_constraints_allocate(struct device *dev)
INIT_LIST_HEAD(&qos->flags.list);
+ return qos;
+}
+
+/*
+ * dev_pm_qos_constraints_set: Ensure dev->power.qos is set
+ *
+ * If dev->power.qos is already set, free the newly allocated qos constraints.
+ * Otherwise set dev->power.qos. Must be called with dev_pm_qos_mtx held.
+ *
+ * This split of unsynchronized allocation and synchronized set moves allocation
+ * out from under dev_pm_qos_mtx, so that lockdep does not get angry about
+ * drivers which use dev_pm_qos in paths related to shrinker/reclaim.
+ */
+static void dev_pm_qos_constraints_set(struct device *dev, struct dev_pm_qos *qos)
+{
+ if (dev->power.qos) {
+ kfree(qos);
+ return;
+ }
+
spin_lock_irq(&dev->power.lock);
dev->power.qos = qos;
spin_unlock_irq(&dev->power.lock);
-
- return 0;
}
static void __dev_pm_qos_hide_latency_limit(struct device *dev);
@@ -309,7 +333,6 @@ void dev_pm_qos_constraints_destroy(struct device *dev)
dev->power.qos = ERR_PTR(-ENODEV);
spin_unlock_irq(&dev->power.lock);
- kfree(qos->resume_latency.notifiers);
kfree(qos);
out:
@@ -341,7 +364,7 @@ static int __dev_pm_qos_add_request(struct device *dev,
if (IS_ERR(dev->power.qos))
ret = -ENODEV;
else if (!dev->power.qos)
- ret = dev_pm_qos_constraints_allocate(dev);
+ ret = -ENOMEM;
trace_dev_pm_qos_add_request(dev_name(dev), type, value);
if (ret)
@@ -388,9 +411,11 @@ static int __dev_pm_qos_add_request(struct device *dev,
int dev_pm_qos_add_request(struct device *dev, struct dev_pm_qos_request *req,
enum dev_pm_qos_req_type type, s32 value)
{
+ struct dev_pm_qos *qos = dev_pm_qos_constraints_allocate(dev);
int ret;
mutex_lock(&dev_pm_qos_mtx);
+ dev_pm_qos_constraints_set(dev, qos);
ret = __dev_pm_qos_add_request(dev, req, type, value);
mutex_unlock(&dev_pm_qos_mtx);
return ret;
@@ -535,14 +560,15 @@ EXPORT_SYMBOL_GPL(dev_pm_qos_remove_request);
int dev_pm_qos_add_notifier(struct device *dev, struct notifier_block *notifier,
enum dev_pm_qos_req_type type)
{
+ struct dev_pm_qos *qos = dev_pm_qos_constraints_allocate(dev);
int ret = 0;
mutex_lock(&dev_pm_qos_mtx);
+ dev_pm_qos_constraints_set(dev, qos);
+
if (IS_ERR(dev->power.qos))
ret = -ENODEV;
- else if (!dev->power.qos)
- ret = dev_pm_qos_constraints_allocate(dev);
if (ret)
goto unlock;
@@ -903,12 +929,22 @@ s32 dev_pm_qos_get_user_latency_tolerance(struct device *dev)
*/
int dev_pm_qos_update_user_latency_tolerance(struct device *dev, s32 val)
{
- int ret;
+ struct dev_pm_qos *qos = dev_pm_qos_constraints_allocate(dev);
+ int ret = 0;
mutex_lock(&dev_pm_qos_mtx);
- if (IS_ERR_OR_NULL(dev->power.qos)
- || !dev->power.qos->latency_tolerance_req) {
+ dev_pm_qos_constraints_set(dev, qos);
+
+ if (IS_ERR(dev->power.qos))
+ ret = -ENODEV;
+ else if (!dev->power.qos)
+ ret = -ENOMEM;
+
+ if (ret)
+ goto out;
+
+ if (!dev->power.qos->latency_tolerance_req) {
struct dev_pm_qos_request *req;
if (val < 0) {
--
2.41.0
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH v5 03/11] PM / QoS: Fix constraints alloc vs reclaim locking
2023-08-22 18:01 ` [PATCH v5 03/11] PM / QoS: Fix constraints alloc vs reclaim locking Rob Clark
@ 2023-08-22 18:47 ` Rafael J. Wysocki
2023-08-22 19:41 ` Rob Clark
2023-08-23 21:01 ` Rob Clark
0 siblings, 2 replies; 12+ messages in thread
From: Rafael J. Wysocki @ 2023-08-22 18:47 UTC (permalink / raw)
To: Rob Clark
Cc: dri-devel, linux-arm-msm, freedreno, Rob Clark,
Rafael J . Wysocki, Pavel Machek, Len Brown, Greg Kroah-Hartman,
open list:HIBERNATION (aka Software Suspend, aka swsusp),
open list
On Tue, Aug 22, 2023 at 8:02 PM Rob Clark <robdclark@gmail.com> wrote:
>
> From: Rob Clark <robdclark@chromium.org>
>
> In the process of adding lockdep annotation for drm GPU scheduler's
> job_run() to detect potential deadlock against shrinker/reclaim, I hit
> this lockdep splat:
>
> ======================================================
> WARNING: possible circular locking dependency detected
> 6.2.0-rc8-debug+ #558 Tainted: G W
> ------------------------------------------------------
> ring0/125 is trying to acquire lock:
> ffffffd6d6ce0f28 (dev_pm_qos_mtx){+.+.}-{3:3}, at: dev_pm_qos_update_request+0x38/0x68
>
> but task is already holding lock:
> ffffff8087239208 (&gpu->active_lock){+.+.}-{3:3}, at: msm_gpu_submit+0xec/0x178
>
> which lock already depends on the new lock.
>
> the existing dependency chain (in reverse order) is:
>
> -> #4 (&gpu->active_lock){+.+.}-{3:3}:
> __mutex_lock+0xcc/0x3c8
> mutex_lock_nested+0x30/0x44
> msm_gpu_submit+0xec/0x178
> msm_job_run+0x78/0x150
> drm_sched_main+0x290/0x370
> kthread+0xf0/0x100
> ret_from_fork+0x10/0x20
>
> -> #3 (dma_fence_map){++++}-{0:0}:
> __dma_fence_might_wait+0x74/0xc0
> dma_resv_lockdep+0x1f4/0x2f4
> do_one_initcall+0x104/0x2bc
> kernel_init_freeable+0x344/0x34c
> kernel_init+0x30/0x134
> ret_from_fork+0x10/0x20
>
> -> #2 (mmu_notifier_invalidate_range_start){+.+.}-{0:0}:
> fs_reclaim_acquire+0x80/0xa8
> slab_pre_alloc_hook.constprop.0+0x40/0x25c
> __kmem_cache_alloc_node+0x60/0x1cc
> __kmalloc+0xd8/0x100
> topology_parse_cpu_capacity+0x8c/0x178
> get_cpu_for_node+0x88/0xc4
> parse_cluster+0x1b0/0x28c
> parse_cluster+0x8c/0x28c
> init_cpu_topology+0x168/0x188
> smp_prepare_cpus+0x24/0xf8
> kernel_init_freeable+0x18c/0x34c
> kernel_init+0x30/0x134
> ret_from_fork+0x10/0x20
>
> -> #1 (fs_reclaim){+.+.}-{0:0}:
> __fs_reclaim_acquire+0x3c/0x48
> fs_reclaim_acquire+0x54/0xa8
> slab_pre_alloc_hook.constprop.0+0x40/0x25c
> __kmem_cache_alloc_node+0x60/0x1cc
> kmalloc_trace+0x50/0xa8
> dev_pm_qos_constraints_allocate+0x38/0x100
> __dev_pm_qos_add_request+0xb0/0x1e8
> dev_pm_qos_add_request+0x58/0x80
> dev_pm_qos_expose_latency_limit+0x60/0x13c
> register_cpu+0x12c/0x130
> topology_init+0xac/0xbc
> do_one_initcall+0x104/0x2bc
> kernel_init_freeable+0x344/0x34c
> kernel_init+0x30/0x134
> ret_from_fork+0x10/0x20
>
> -> #0 (dev_pm_qos_mtx){+.+.}-{3:3}:
> __lock_acquire+0xe00/0x1060
> lock_acquire+0x1e0/0x2f8
> __mutex_lock+0xcc/0x3c8
> mutex_lock_nested+0x30/0x44
> dev_pm_qos_update_request+0x38/0x68
> msm_devfreq_boost+0x40/0x70
> msm_devfreq_active+0xc0/0xf0
> msm_gpu_submit+0x10c/0x178
> msm_job_run+0x78/0x150
> drm_sched_main+0x290/0x370
> kthread+0xf0/0x100
> ret_from_fork+0x10/0x20
>
> other info that might help us debug this:
>
> Chain exists of:
> dev_pm_qos_mtx --> dma_fence_map --> &gpu->active_lock
>
> Possible unsafe locking scenario:
>
> CPU0 CPU1
> ---- ----
> lock(&gpu->active_lock);
> lock(dma_fence_map);
> lock(&gpu->active_lock);
> lock(dev_pm_qos_mtx);
>
> *** DEADLOCK ***
>
> 3 locks held by ring0/123:
> #0: ffffff8087251170 (&gpu->lock){+.+.}-{3:3}, at: msm_job_run+0x64/0x150
> #1: ffffffd00b0e57e8 (dma_fence_map){++++}-{0:0}, at: msm_job_run+0x68/0x150
> #2: ffffff8087251208 (&gpu->active_lock){+.+.}-{3:3}, at: msm_gpu_submit+0xec/0x178
>
> stack backtrace:
> CPU: 6 PID: 123 Comm: ring0 Not tainted 6.2.0-rc8-debug+ #559
> Hardware name: Google Lazor (rev1 - 2) with LTE (DT)
> Call trace:
> dump_backtrace.part.0+0xb4/0xf8
> show_stack+0x20/0x38
> dump_stack_lvl+0x9c/0xd0
> dump_stack+0x18/0x34
> print_circular_bug+0x1b4/0x1f0
> check_noncircular+0x78/0xac
> __lock_acquire+0xe00/0x1060
> lock_acquire+0x1e0/0x2f8
> __mutex_lock+0xcc/0x3c8
> mutex_lock_nested+0x30/0x44
> dev_pm_qos_update_request+0x38/0x68
> msm_devfreq_boost+0x40/0x70
> msm_devfreq_active+0xc0/0xf0
> msm_gpu_submit+0x10c/0x178
> msm_job_run+0x78/0x150
> drm_sched_main+0x290/0x370
> kthread+0xf0/0x100
> ret_from_fork+0x10/0x20
>
> The issue is that dev_pm_qos_mtx is held in the runpm suspend/resume (or
> freq change) path, but it is also held across allocations that could
> recurse into shrinker.
>
> Solve this by changing dev_pm_qos_constraints_allocate() into a function
> that can be called unconditionally before the device qos object is
> needed and before aquiring dev_pm_qos_mtx. This way the allocations can
acquiring
> be done without holding the mutex. In the case that we raced with
> another thread to allocate the qos object, detect this *after* acquiring
> the dev_pm_qos_mtx and simply free the redundant allocations.
>
> Suggested-by: Rafael J. Wysocki <rafael@kernel.org>
> Signed-off-by: Rob Clark <robdclark@chromium.org>
Please feel free to add
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
to this patch and the next 2 PM QoS ones in this series.
Thanks!
> ---
> drivers/base/power/qos.c | 76 +++++++++++++++++++++++++++++-----------
> 1 file changed, 56 insertions(+), 20 deletions(-)
>
> diff --git a/drivers/base/power/qos.c b/drivers/base/power/qos.c
> index 8e93167f1783..7e95760d16dc 100644
> --- a/drivers/base/power/qos.c
> +++ b/drivers/base/power/qos.c
> @@ -185,27 +185,33 @@ static int apply_constraint(struct dev_pm_qos_request *req,
> }
>
> /*
> - * dev_pm_qos_constraints_allocate
> + * dev_pm_qos_constraints_allocate: Allocate and initialize qos constraints
> * @dev: device to allocate data for
> *
> - * Called at the first call to add_request, for constraint data allocation
> - * Must be called with the dev_pm_qos_mtx mutex held
> + * Called to allocate constraints before dev_pm_qos_mtx mutex is held. Should
> + * be matched with a call to dev_pm_qos_constraints_set() once dev_pm_qos_mtx
> + * is held.
> */
> -static int dev_pm_qos_constraints_allocate(struct device *dev)
> +static struct dev_pm_qos *dev_pm_qos_constraints_allocate(struct device *dev)
> {
> struct dev_pm_qos *qos;
> struct pm_qos_constraints *c;
> struct blocking_notifier_head *n;
>
> - qos = kzalloc(sizeof(*qos), GFP_KERNEL);
> + /*
> + * If constraints are already allocated, we can skip speculatively
> + * allocating a new one, as we don't have to worry about qos transitioning
> + * from non-null to null. The constraints are only freed on device
> + * removal.
> + */
> + if (dev->power.qos)
> + return NULL;
> +
> + qos = kzalloc(sizeof(*qos) + 3 * sizeof(*n), GFP_KERNEL);
> if (!qos)
> - return -ENOMEM;
> + return NULL;
>
> - n = kzalloc(3 * sizeof(*n), GFP_KERNEL);
> - if (!n) {
> - kfree(qos);
> - return -ENOMEM;
> - }
> + n = (struct blocking_notifier_head *)(qos + 1);
>
> c = &qos->resume_latency;
> plist_head_init(&c->list);
> @@ -227,11 +233,29 @@ static int dev_pm_qos_constraints_allocate(struct device *dev)
>
> INIT_LIST_HEAD(&qos->flags.list);
>
> + return qos;
> +}
> +
> +/*
> + * dev_pm_qos_constraints_set: Ensure dev->power.qos is set
> + *
> + * If dev->power.qos is already set, free the newly allocated qos constraints.
> + * Otherwise set dev->power.qos. Must be called with dev_pm_qos_mtx held.
> + *
> + * This split of unsynchronized allocation and synchronized set moves allocation
> + * out from under dev_pm_qos_mtx, so that lockdep does not get angry about
> + * drivers which use dev_pm_qos in paths related to shrinker/reclaim.
> + */
> +static void dev_pm_qos_constraints_set(struct device *dev, struct dev_pm_qos *qos)
> +{
> + if (dev->power.qos) {
> + kfree(qos);
> + return;
> + }
> +
> spin_lock_irq(&dev->power.lock);
> dev->power.qos = qos;
> spin_unlock_irq(&dev->power.lock);
> -
> - return 0;
> }
>
> static void __dev_pm_qos_hide_latency_limit(struct device *dev);
> @@ -309,7 +333,6 @@ void dev_pm_qos_constraints_destroy(struct device *dev)
> dev->power.qos = ERR_PTR(-ENODEV);
> spin_unlock_irq(&dev->power.lock);
>
> - kfree(qos->resume_latency.notifiers);
> kfree(qos);
>
> out:
> @@ -341,7 +364,7 @@ static int __dev_pm_qos_add_request(struct device *dev,
> if (IS_ERR(dev->power.qos))
> ret = -ENODEV;
> else if (!dev->power.qos)
> - ret = dev_pm_qos_constraints_allocate(dev);
> + ret = -ENOMEM;
>
> trace_dev_pm_qos_add_request(dev_name(dev), type, value);
> if (ret)
> @@ -388,9 +411,11 @@ static int __dev_pm_qos_add_request(struct device *dev,
> int dev_pm_qos_add_request(struct device *dev, struct dev_pm_qos_request *req,
> enum dev_pm_qos_req_type type, s32 value)
> {
> + struct dev_pm_qos *qos = dev_pm_qos_constraints_allocate(dev);
> int ret;
>
> mutex_lock(&dev_pm_qos_mtx);
> + dev_pm_qos_constraints_set(dev, qos);
> ret = __dev_pm_qos_add_request(dev, req, type, value);
> mutex_unlock(&dev_pm_qos_mtx);
> return ret;
> @@ -535,14 +560,15 @@ EXPORT_SYMBOL_GPL(dev_pm_qos_remove_request);
> int dev_pm_qos_add_notifier(struct device *dev, struct notifier_block *notifier,
> enum dev_pm_qos_req_type type)
> {
> + struct dev_pm_qos *qos = dev_pm_qos_constraints_allocate(dev);
> int ret = 0;
>
> mutex_lock(&dev_pm_qos_mtx);
>
> + dev_pm_qos_constraints_set(dev, qos);
> +
> if (IS_ERR(dev->power.qos))
> ret = -ENODEV;
> - else if (!dev->power.qos)
> - ret = dev_pm_qos_constraints_allocate(dev);
>
> if (ret)
> goto unlock;
> @@ -903,12 +929,22 @@ s32 dev_pm_qos_get_user_latency_tolerance(struct device *dev)
> */
> int dev_pm_qos_update_user_latency_tolerance(struct device *dev, s32 val)
> {
> - int ret;
> + struct dev_pm_qos *qos = dev_pm_qos_constraints_allocate(dev);
> + int ret = 0;
>
> mutex_lock(&dev_pm_qos_mtx);
>
> - if (IS_ERR_OR_NULL(dev->power.qos)
> - || !dev->power.qos->latency_tolerance_req) {
> + dev_pm_qos_constraints_set(dev, qos);
> +
> + if (IS_ERR(dev->power.qos))
> + ret = -ENODEV;
> + else if (!dev->power.qos)
> + ret = -ENOMEM;
> +
> + if (ret)
> + goto out;
> +
> + if (!dev->power.qos->latency_tolerance_req) {
> struct dev_pm_qos_request *req;
>
> if (val < 0) {
> --
> 2.41.0
>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v5 03/11] PM / QoS: Fix constraints alloc vs reclaim locking
2023-08-22 18:47 ` Rafael J. Wysocki
@ 2023-08-22 19:41 ` Rob Clark
2023-08-23 21:01 ` Rob Clark
1 sibling, 0 replies; 12+ messages in thread
From: Rob Clark @ 2023-08-22 19:41 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: dri-devel, linux-arm-msm, freedreno, Rob Clark, Pavel Machek,
Len Brown, Greg Kroah-Hartman,
open list:HIBERNATION (aka Software Suspend, aka swsusp),
open list
On Tue, Aug 22, 2023 at 11:48 AM Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> On Tue, Aug 22, 2023 at 8:02 PM Rob Clark <robdclark@gmail.com> wrote:
> >
> > From: Rob Clark <robdclark@chromium.org>
> >
> > In the process of adding lockdep annotation for drm GPU scheduler's
> > job_run() to detect potential deadlock against shrinker/reclaim, I hit
> > this lockdep splat:
> >
> > ======================================================
> > WARNING: possible circular locking dependency detected
> > 6.2.0-rc8-debug+ #558 Tainted: G W
> > ------------------------------------------------------
> > ring0/125 is trying to acquire lock:
> > ffffffd6d6ce0f28 (dev_pm_qos_mtx){+.+.}-{3:3}, at: dev_pm_qos_update_request+0x38/0x68
> >
> > but task is already holding lock:
> > ffffff8087239208 (&gpu->active_lock){+.+.}-{3:3}, at: msm_gpu_submit+0xec/0x178
> >
> > which lock already depends on the new lock.
> >
> > the existing dependency chain (in reverse order) is:
> >
> > -> #4 (&gpu->active_lock){+.+.}-{3:3}:
> > __mutex_lock+0xcc/0x3c8
> > mutex_lock_nested+0x30/0x44
> > msm_gpu_submit+0xec/0x178
> > msm_job_run+0x78/0x150
> > drm_sched_main+0x290/0x370
> > kthread+0xf0/0x100
> > ret_from_fork+0x10/0x20
> >
> > -> #3 (dma_fence_map){++++}-{0:0}:
> > __dma_fence_might_wait+0x74/0xc0
> > dma_resv_lockdep+0x1f4/0x2f4
> > do_one_initcall+0x104/0x2bc
> > kernel_init_freeable+0x344/0x34c
> > kernel_init+0x30/0x134
> > ret_from_fork+0x10/0x20
> >
> > -> #2 (mmu_notifier_invalidate_range_start){+.+.}-{0:0}:
> > fs_reclaim_acquire+0x80/0xa8
> > slab_pre_alloc_hook.constprop.0+0x40/0x25c
> > __kmem_cache_alloc_node+0x60/0x1cc
> > __kmalloc+0xd8/0x100
> > topology_parse_cpu_capacity+0x8c/0x178
> > get_cpu_for_node+0x88/0xc4
> > parse_cluster+0x1b0/0x28c
> > parse_cluster+0x8c/0x28c
> > init_cpu_topology+0x168/0x188
> > smp_prepare_cpus+0x24/0xf8
> > kernel_init_freeable+0x18c/0x34c
> > kernel_init+0x30/0x134
> > ret_from_fork+0x10/0x20
> >
> > -> #1 (fs_reclaim){+.+.}-{0:0}:
> > __fs_reclaim_acquire+0x3c/0x48
> > fs_reclaim_acquire+0x54/0xa8
> > slab_pre_alloc_hook.constprop.0+0x40/0x25c
> > __kmem_cache_alloc_node+0x60/0x1cc
> > kmalloc_trace+0x50/0xa8
> > dev_pm_qos_constraints_allocate+0x38/0x100
> > __dev_pm_qos_add_request+0xb0/0x1e8
> > dev_pm_qos_add_request+0x58/0x80
> > dev_pm_qos_expose_latency_limit+0x60/0x13c
> > register_cpu+0x12c/0x130
> > topology_init+0xac/0xbc
> > do_one_initcall+0x104/0x2bc
> > kernel_init_freeable+0x344/0x34c
> > kernel_init+0x30/0x134
> > ret_from_fork+0x10/0x20
> >
> > -> #0 (dev_pm_qos_mtx){+.+.}-{3:3}:
> > __lock_acquire+0xe00/0x1060
> > lock_acquire+0x1e0/0x2f8
> > __mutex_lock+0xcc/0x3c8
> > mutex_lock_nested+0x30/0x44
> > dev_pm_qos_update_request+0x38/0x68
> > msm_devfreq_boost+0x40/0x70
> > msm_devfreq_active+0xc0/0xf0
> > msm_gpu_submit+0x10c/0x178
> > msm_job_run+0x78/0x150
> > drm_sched_main+0x290/0x370
> > kthread+0xf0/0x100
> > ret_from_fork+0x10/0x20
> >
> > other info that might help us debug this:
> >
> > Chain exists of:
> > dev_pm_qos_mtx --> dma_fence_map --> &gpu->active_lock
> >
> > Possible unsafe locking scenario:
> >
> > CPU0 CPU1
> > ---- ----
> > lock(&gpu->active_lock);
> > lock(dma_fence_map);
> > lock(&gpu->active_lock);
> > lock(dev_pm_qos_mtx);
> >
> > *** DEADLOCK ***
> >
> > 3 locks held by ring0/123:
> > #0: ffffff8087251170 (&gpu->lock){+.+.}-{3:3}, at: msm_job_run+0x64/0x150
> > #1: ffffffd00b0e57e8 (dma_fence_map){++++}-{0:0}, at: msm_job_run+0x68/0x150
> > #2: ffffff8087251208 (&gpu->active_lock){+.+.}-{3:3}, at: msm_gpu_submit+0xec/0x178
> >
> > stack backtrace:
> > CPU: 6 PID: 123 Comm: ring0 Not tainted 6.2.0-rc8-debug+ #559
> > Hardware name: Google Lazor (rev1 - 2) with LTE (DT)
> > Call trace:
> > dump_backtrace.part.0+0xb4/0xf8
> > show_stack+0x20/0x38
> > dump_stack_lvl+0x9c/0xd0
> > dump_stack+0x18/0x34
> > print_circular_bug+0x1b4/0x1f0
> > check_noncircular+0x78/0xac
> > __lock_acquire+0xe00/0x1060
> > lock_acquire+0x1e0/0x2f8
> > __mutex_lock+0xcc/0x3c8
> > mutex_lock_nested+0x30/0x44
> > dev_pm_qos_update_request+0x38/0x68
> > msm_devfreq_boost+0x40/0x70
> > msm_devfreq_active+0xc0/0xf0
> > msm_gpu_submit+0x10c/0x178
> > msm_job_run+0x78/0x150
> > drm_sched_main+0x290/0x370
> > kthread+0xf0/0x100
> > ret_from_fork+0x10/0x20
> >
> > The issue is that dev_pm_qos_mtx is held in the runpm suspend/resume (or
> > freq change) path, but it is also held across allocations that could
> > recurse into shrinker.
> >
> > Solve this by changing dev_pm_qos_constraints_allocate() into a function
> > that can be called unconditionally before the device qos object is
> > needed and before aquiring dev_pm_qos_mtx. This way the allocations can
>
> acquiring
>
> > be done without holding the mutex. In the case that we raced with
> > another thread to allocate the qos object, detect this *after* acquiring
> > the dev_pm_qos_mtx and simply free the redundant allocations.
> >
> > Suggested-by: Rafael J. Wysocki <rafael@kernel.org>
> > Signed-off-by: Rob Clark <robdclark@chromium.org>
>
> Please feel free to add
>
> Acked-by: Rafael J. Wysocki <rafael@kernel.org>
>
> to this patch and the next 2 PM QoS ones in this series.
>
thanks
> Thanks!
>
> > ---
> > drivers/base/power/qos.c | 76 +++++++++++++++++++++++++++++-----------
> > 1 file changed, 56 insertions(+), 20 deletions(-)
> >
> > diff --git a/drivers/base/power/qos.c b/drivers/base/power/qos.c
> > index 8e93167f1783..7e95760d16dc 100644
> > --- a/drivers/base/power/qos.c
> > +++ b/drivers/base/power/qos.c
> > @@ -185,27 +185,33 @@ static int apply_constraint(struct dev_pm_qos_request *req,
> > }
> >
> > /*
> > - * dev_pm_qos_constraints_allocate
> > + * dev_pm_qos_constraints_allocate: Allocate and initialize qos constraints
> > * @dev: device to allocate data for
> > *
> > - * Called at the first call to add_request, for constraint data allocation
> > - * Must be called with the dev_pm_qos_mtx mutex held
> > + * Called to allocate constraints before dev_pm_qos_mtx mutex is held. Should
> > + * be matched with a call to dev_pm_qos_constraints_set() once dev_pm_qos_mtx
> > + * is held.
> > */
> > -static int dev_pm_qos_constraints_allocate(struct device *dev)
> > +static struct dev_pm_qos *dev_pm_qos_constraints_allocate(struct device *dev)
> > {
> > struct dev_pm_qos *qos;
> > struct pm_qos_constraints *c;
> > struct blocking_notifier_head *n;
> >
> > - qos = kzalloc(sizeof(*qos), GFP_KERNEL);
> > + /*
> > + * If constraints are already allocated, we can skip speculatively
> > + * allocating a new one, as we don't have to worry about qos transitioning
> > + * from non-null to null. The constraints are only freed on device
> > + * removal.
> > + */
> > + if (dev->power.qos)
> > + return NULL;
> > +
> > + qos = kzalloc(sizeof(*qos) + 3 * sizeof(*n), GFP_KERNEL);
> > if (!qos)
> > - return -ENOMEM;
> > + return NULL;
> >
> > - n = kzalloc(3 * sizeof(*n), GFP_KERNEL);
> > - if (!n) {
> > - kfree(qos);
> > - return -ENOMEM;
> > - }
> > + n = (struct blocking_notifier_head *)(qos + 1);
> >
> > c = &qos->resume_latency;
> > plist_head_init(&c->list);
> > @@ -227,11 +233,29 @@ static int dev_pm_qos_constraints_allocate(struct device *dev)
> >
> > INIT_LIST_HEAD(&qos->flags.list);
> >
> > + return qos;
> > +}
> > +
> > +/*
> > + * dev_pm_qos_constraints_set: Ensure dev->power.qos is set
> > + *
> > + * If dev->power.qos is already set, free the newly allocated qos constraints.
> > + * Otherwise set dev->power.qos. Must be called with dev_pm_qos_mtx held.
> > + *
> > + * This split unsynchronized allocation and synchronized set moves allocation
> > + * out from under dev_pm_qos_mtx, so that lockdep does not get angry about
> > + * drivers which use dev_pm_qos in paths related to shrinker/reclaim.
> > + */
> > +static void dev_pm_qos_constraints_set(struct device *dev, struct dev_pm_qos *qos)
> > +{
> > + if (dev->power.qos) {
> > + kfree(qos);
> > + return;
> > + }
> > +
> > spin_lock_irq(&dev->power.lock);
> > dev->power.qos = qos;
> > spin_unlock_irq(&dev->power.lock);
> > -
> > - return 0;
> > }
> >
> > static void __dev_pm_qos_hide_latency_limit(struct device *dev);
> > @@ -309,7 +333,6 @@ void dev_pm_qos_constraints_destroy(struct device *dev)
> > dev->power.qos = ERR_PTR(-ENODEV);
> > spin_unlock_irq(&dev->power.lock);
> >
> > - kfree(qos->resume_latency.notifiers);
> > kfree(qos);
> >
> > out:
> > @@ -341,7 +364,7 @@ static int __dev_pm_qos_add_request(struct device *dev,
> > if (IS_ERR(dev->power.qos))
> > ret = -ENODEV;
> > else if (!dev->power.qos)
> > - ret = dev_pm_qos_constraints_allocate(dev);
> > + ret = -ENOMEM;
> >
> > trace_dev_pm_qos_add_request(dev_name(dev), type, value);
> > if (ret)
> > @@ -388,9 +411,11 @@ static int __dev_pm_qos_add_request(struct device *dev,
> > int dev_pm_qos_add_request(struct device *dev, struct dev_pm_qos_request *req,
> > enum dev_pm_qos_req_type type, s32 value)
> > {
> > + struct dev_pm_qos *qos = dev_pm_qos_constraints_allocate(dev);
> > int ret;
> >
> > mutex_lock(&dev_pm_qos_mtx);
> > + dev_pm_qos_constraints_set(dev, qos);
> > ret = __dev_pm_qos_add_request(dev, req, type, value);
> > mutex_unlock(&dev_pm_qos_mtx);
> > return ret;
> > @@ -535,14 +560,15 @@ EXPORT_SYMBOL_GPL(dev_pm_qos_remove_request);
> > int dev_pm_qos_add_notifier(struct device *dev, struct notifier_block *notifier,
> > enum dev_pm_qos_req_type type)
> > {
> > + struct dev_pm_qos *qos = dev_pm_qos_constraints_allocate(dev);
> > int ret = 0;
> >
> > mutex_lock(&dev_pm_qos_mtx);
> >
> > + dev_pm_qos_constraints_set(dev, qos);
> > +
> > if (IS_ERR(dev->power.qos))
> > ret = -ENODEV;
> > - else if (!dev->power.qos)
> > - ret = dev_pm_qos_constraints_allocate(dev);
> >
> > if (ret)
> > goto unlock;
> > @@ -903,12 +929,22 @@ s32 dev_pm_qos_get_user_latency_tolerance(struct device *dev)
> > */
> > int dev_pm_qos_update_user_latency_tolerance(struct device *dev, s32 val)
> > {
> > - int ret;
> > + struct dev_pm_qos *qos = dev_pm_qos_constraints_allocate(dev);
> > + int ret = 0;
> >
> > mutex_lock(&dev_pm_qos_mtx);
> >
> > - if (IS_ERR_OR_NULL(dev->power.qos)
> > - || !dev->power.qos->latency_tolerance_req) {
> > + dev_pm_qos_constraints_set(dev, qos);
> > +
> > + if (IS_ERR(dev->power.qos))
> > + ret = -ENODEV;
> > + else if (!dev->power.qos)
> > + ret = -ENOMEM;
> > +
> > + if (ret)
> > + goto out;
> > +
> > + if (!dev->power.qos->latency_tolerance_req) {
> > struct dev_pm_qos_request *req;
> >
> > if (val < 0) {
> > --
> > 2.41.0
> >
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v5 03/11] PM / QoS: Fix constraints alloc vs reclaim locking
2023-08-22 18:47 ` Rafael J. Wysocki
2023-08-22 19:41 ` Rob Clark
@ 2023-08-23 21:01 ` Rob Clark
1 sibling, 0 replies; 12+ messages in thread
From: Rob Clark @ 2023-08-23 21:01 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: dri-devel, linux-arm-msm, freedreno, Rob Clark, Pavel Machek,
Len Brown, Greg Kroah-Hartman,
open list:HIBERNATION (aka Software Suspend, aka swsusp),
open list
On Tue, Aug 22, 2023 at 11:48 AM Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> On Tue, Aug 22, 2023 at 8:02 PM Rob Clark <robdclark@gmail.com> wrote:
> >
> > From: Rob Clark <robdclark@chromium.org>
> >
> > In the process of adding lockdep annotation for drm GPU scheduler's
> > job_run() to detect potential deadlock against shrinker/reclaim, I hit
> > this lockdep splat:
> >
> > ======================================================
> > WARNING: possible circular locking dependency detected
> > 6.2.0-rc8-debug+ #558 Tainted: G W
> > ------------------------------------------------------
> > ring0/125 is trying to acquire lock:
> > ffffffd6d6ce0f28 (dev_pm_qos_mtx){+.+.}-{3:3}, at: dev_pm_qos_update_request+0x38/0x68
> >
> > but task is already holding lock:
> > ffffff8087239208 (&gpu->active_lock){+.+.}-{3:3}, at: msm_gpu_submit+0xec/0x178
> >
> > which lock already depends on the new lock.
> >
> > the existing dependency chain (in reverse order) is:
> >
> > -> #4 (&gpu->active_lock){+.+.}-{3:3}:
> > __mutex_lock+0xcc/0x3c8
> > mutex_lock_nested+0x30/0x44
> > msm_gpu_submit+0xec/0x178
> > msm_job_run+0x78/0x150
> > drm_sched_main+0x290/0x370
> > kthread+0xf0/0x100
> > ret_from_fork+0x10/0x20
> >
> > -> #3 (dma_fence_map){++++}-{0:0}:
> > __dma_fence_might_wait+0x74/0xc0
> > dma_resv_lockdep+0x1f4/0x2f4
> > do_one_initcall+0x104/0x2bc
> > kernel_init_freeable+0x344/0x34c
> > kernel_init+0x30/0x134
> > ret_from_fork+0x10/0x20
> >
> > -> #2 (mmu_notifier_invalidate_range_start){+.+.}-{0:0}:
> > fs_reclaim_acquire+0x80/0xa8
> > slab_pre_alloc_hook.constprop.0+0x40/0x25c
> > __kmem_cache_alloc_node+0x60/0x1cc
> > __kmalloc+0xd8/0x100
> > topology_parse_cpu_capacity+0x8c/0x178
> > get_cpu_for_node+0x88/0xc4
> > parse_cluster+0x1b0/0x28c
> > parse_cluster+0x8c/0x28c
> > init_cpu_topology+0x168/0x188
> > smp_prepare_cpus+0x24/0xf8
> > kernel_init_freeable+0x18c/0x34c
> > kernel_init+0x30/0x134
> > ret_from_fork+0x10/0x20
> >
> > -> #1 (fs_reclaim){+.+.}-{0:0}:
> > __fs_reclaim_acquire+0x3c/0x48
> > fs_reclaim_acquire+0x54/0xa8
> > slab_pre_alloc_hook.constprop.0+0x40/0x25c
> > __kmem_cache_alloc_node+0x60/0x1cc
> > kmalloc_trace+0x50/0xa8
> > dev_pm_qos_constraints_allocate+0x38/0x100
> > __dev_pm_qos_add_request+0xb0/0x1e8
> > dev_pm_qos_add_request+0x58/0x80
> > dev_pm_qos_expose_latency_limit+0x60/0x13c
> > register_cpu+0x12c/0x130
> > topology_init+0xac/0xbc
> > do_one_initcall+0x104/0x2bc
> > kernel_init_freeable+0x344/0x34c
> > kernel_init+0x30/0x134
> > ret_from_fork+0x10/0x20
> >
> > -> #0 (dev_pm_qos_mtx){+.+.}-{3:3}:
> > __lock_acquire+0xe00/0x1060
> > lock_acquire+0x1e0/0x2f8
> > __mutex_lock+0xcc/0x3c8
> > mutex_lock_nested+0x30/0x44
> > dev_pm_qos_update_request+0x38/0x68
> > msm_devfreq_boost+0x40/0x70
> > msm_devfreq_active+0xc0/0xf0
> > msm_gpu_submit+0x10c/0x178
> > msm_job_run+0x78/0x150
> > drm_sched_main+0x290/0x370
> > kthread+0xf0/0x100
> > ret_from_fork+0x10/0x20
> >
> > other info that might help us debug this:
> >
> > Chain exists of:
> > dev_pm_qos_mtx --> dma_fence_map --> &gpu->active_lock
> >
> > Possible unsafe locking scenario:
> >
> > CPU0 CPU1
> > ---- ----
> > lock(&gpu->active_lock);
> > lock(dma_fence_map);
> > lock(&gpu->active_lock);
> > lock(dev_pm_qos_mtx);
> >
> > *** DEADLOCK ***
> >
> > 3 locks held by ring0/123:
> > #0: ffffff8087251170 (&gpu->lock){+.+.}-{3:3}, at: msm_job_run+0x64/0x150
> > #1: ffffffd00b0e57e8 (dma_fence_map){++++}-{0:0}, at: msm_job_run+0x68/0x150
> > #2: ffffff8087251208 (&gpu->active_lock){+.+.}-{3:3}, at: msm_gpu_submit+0xec/0x178
> >
> > stack backtrace:
> > CPU: 6 PID: 123 Comm: ring0 Not tainted 6.2.0-rc8-debug+ #559
> > Hardware name: Google Lazor (rev1 - 2) with LTE (DT)
> > Call trace:
> > dump_backtrace.part.0+0xb4/0xf8
> > show_stack+0x20/0x38
> > dump_stack_lvl+0x9c/0xd0
> > dump_stack+0x18/0x34
> > print_circular_bug+0x1b4/0x1f0
> > check_noncircular+0x78/0xac
> > __lock_acquire+0xe00/0x1060
> > lock_acquire+0x1e0/0x2f8
> > __mutex_lock+0xcc/0x3c8
> > mutex_lock_nested+0x30/0x44
> > dev_pm_qos_update_request+0x38/0x68
> > msm_devfreq_boost+0x40/0x70
> > msm_devfreq_active+0xc0/0xf0
> > msm_gpu_submit+0x10c/0x178
> > msm_job_run+0x78/0x150
> > drm_sched_main+0x290/0x370
> > kthread+0xf0/0x100
> > ret_from_fork+0x10/0x20
> >
> > The issue is that dev_pm_qos_mtx is held in the runpm suspend/resume (or
> > freq change) path, but it is also held across allocations that could
> > recurse into shrinker.
> >
> > Solve this by changing dev_pm_qos_constraints_allocate() into a function
> > that can be called unconditionally before the device qos object is
> > needed and before aquiring dev_pm_qos_mtx. This way the allocations can
>
> acquiring
>
> > be done without holding the mutex. In the case that we raced with
> > another thread to allocate the qos object, detect this *after* acquiring
> > the dev_pm_qos_mtx and simply free the redundant allocations.
> >
> > Suggested-by: Rafael J. Wysocki <rafael@kernel.org>
> > Signed-off-by: Rob Clark <robdclark@chromium.org>
>
> Please feel free to add
>
> Acked-by: Rafael J. Wysocki <rafael@kernel.org>
>
> to this patch and the next 2 PM QoS ones in this series.
btw, Georgi picked up the interconnect patches. I think it is fine if
you want to pick up the PM patches, as there are no dependencies
between these and other patches in the series. But lmk if you want to
handle it in a different way
BR,
-R
> Thanks!
>
> > ---
> > drivers/base/power/qos.c | 76 +++++++++++++++++++++++++++++-----------
> > 1 file changed, 56 insertions(+), 20 deletions(-)
> >
> > diff --git a/drivers/base/power/qos.c b/drivers/base/power/qos.c
> > index 8e93167f1783..7e95760d16dc 100644
> > --- a/drivers/base/power/qos.c
> > +++ b/drivers/base/power/qos.c
> > @@ -185,27 +185,33 @@ static int apply_constraint(struct dev_pm_qos_request *req,
> > }
> >
> > /*
> > - * dev_pm_qos_constraints_allocate
> > + * dev_pm_qos_constraints_allocate: Allocate and initialize qos constraints
> > * @dev: device to allocate data for
> > *
> > - * Called at the first call to add_request, for constraint data allocation
> > - * Must be called with the dev_pm_qos_mtx mutex held
> > + * Called to allocate constraints before dev_pm_qos_mtx mutex is held. Should
> > + * be matched with a call to dev_pm_qos_constraints_set() once dev_pm_qos_mtx
> > + * is held.
> > */
> > -static int dev_pm_qos_constraints_allocate(struct device *dev)
> > +static struct dev_pm_qos *dev_pm_qos_constraints_allocate(struct device *dev)
> > {
> > struct dev_pm_qos *qos;
> > struct pm_qos_constraints *c;
> > struct blocking_notifier_head *n;
> >
> > - qos = kzalloc(sizeof(*qos), GFP_KERNEL);
> > + /*
> > + * If constraints are already allocated, we can skip speculatively
> > + * allocating a new one, as we don't have to worry about qos transitioning
> > + * from non-null to null. The constraints are only freed on device
> > + * removal.
> > + */
> > + if (dev->power.qos)
> > + return NULL;
> > +
> > + qos = kzalloc(sizeof(*qos) + 3 * sizeof(*n), GFP_KERNEL);
> > if (!qos)
> > - return -ENOMEM;
> > + return NULL;
> >
> > - n = kzalloc(3 * sizeof(*n), GFP_KERNEL);
> > - if (!n) {
> > - kfree(qos);
> > - return -ENOMEM;
> > - }
> > + n = (struct blocking_notifier_head *)(qos + 1);
> >
> > c = &qos->resume_latency;
> > plist_head_init(&c->list);
> > @@ -227,11 +233,29 @@ static int dev_pm_qos_constraints_allocate(struct device *dev)
> >
> > INIT_LIST_HEAD(&qos->flags.list);
> >
> > + return qos;
> > +}
> > +
> > +/*
> > + * dev_pm_qos_constraints_set: Ensure dev->power.qos is set
> > + *
> > + * If dev->power.qos is already set, free the newly allocated qos constraints.
> > + * Otherwise set dev->power.qos. Must be called with dev_pm_qos_mtx held.
> > + *
> > + * This split unsynchronized allocation and synchronized set moves allocation
> > + * out from under dev_pm_qos_mtx, so that lockdep does not get angry about
> > + * drivers which use dev_pm_qos in paths related to shrinker/reclaim.
> > + */
> > +static void dev_pm_qos_constraints_set(struct device *dev, struct dev_pm_qos *qos)
> > +{
> > + if (dev->power.qos) {
> > + kfree(qos);
> > + return;
> > + }
> > +
> > spin_lock_irq(&dev->power.lock);
> > dev->power.qos = qos;
> > spin_unlock_irq(&dev->power.lock);
> > -
> > - return 0;
> > }
> >
> > static void __dev_pm_qos_hide_latency_limit(struct device *dev);
> > @@ -309,7 +333,6 @@ void dev_pm_qos_constraints_destroy(struct device *dev)
> > dev->power.qos = ERR_PTR(-ENODEV);
> > spin_unlock_irq(&dev->power.lock);
> >
> > - kfree(qos->resume_latency.notifiers);
> > kfree(qos);
> >
> > out:
> > @@ -341,7 +364,7 @@ static int __dev_pm_qos_add_request(struct device *dev,
> > if (IS_ERR(dev->power.qos))
> > ret = -ENODEV;
> > else if (!dev->power.qos)
> > - ret = dev_pm_qos_constraints_allocate(dev);
> > + ret = -ENOMEM;
> >
> > trace_dev_pm_qos_add_request(dev_name(dev), type, value);
> > if (ret)
> > @@ -388,9 +411,11 @@ static int __dev_pm_qos_add_request(struct device *dev,
> > int dev_pm_qos_add_request(struct device *dev, struct dev_pm_qos_request *req,
> > enum dev_pm_qos_req_type type, s32 value)
> > {
> > + struct dev_pm_qos *qos = dev_pm_qos_constraints_allocate(dev);
> > int ret;
> >
> > mutex_lock(&dev_pm_qos_mtx);
> > + dev_pm_qos_constraints_set(dev, qos);
> > ret = __dev_pm_qos_add_request(dev, req, type, value);
> > mutex_unlock(&dev_pm_qos_mtx);
> > return ret;
> > @@ -535,14 +560,15 @@ EXPORT_SYMBOL_GPL(dev_pm_qos_remove_request);
> > int dev_pm_qos_add_notifier(struct device *dev, struct notifier_block *notifier,
> > enum dev_pm_qos_req_type type)
> > {
> > + struct dev_pm_qos *qos = dev_pm_qos_constraints_allocate(dev);
> > int ret = 0;
> >
> > mutex_lock(&dev_pm_qos_mtx);
> >
> > + dev_pm_qos_constraints_set(dev, qos);
> > +
> > if (IS_ERR(dev->power.qos))
> > ret = -ENODEV;
> > - else if (!dev->power.qos)
> > - ret = dev_pm_qos_constraints_allocate(dev);
> >
> > if (ret)
> > goto unlock;
> > @@ -903,12 +929,22 @@ s32 dev_pm_qos_get_user_latency_tolerance(struct device *dev)
> > */
> > int dev_pm_qos_update_user_latency_tolerance(struct device *dev, s32 val)
> > {
> > - int ret;
> > + struct dev_pm_qos *qos = dev_pm_qos_constraints_allocate(dev);
> > + int ret = 0;
> >
> > mutex_lock(&dev_pm_qos_mtx);
> >
> > - if (IS_ERR_OR_NULL(dev->power.qos)
> > - || !dev->power.qos->latency_tolerance_req) {
> > + dev_pm_qos_constraints_set(dev, qos);
> > +
> > + if (IS_ERR(dev->power.qos))
> > + ret = -ENODEV;
> > + else if (!dev->power.qos)
> > + ret = -ENOMEM;
> > +
> > + if (ret)
> > + goto out;
> > +
> > + if (!dev->power.qos->latency_tolerance_req) {
> > struct dev_pm_qos_request *req;
> >
> > if (val < 0) {
> > --
> > 2.41.0
> >
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH v5 04/11] PM / QoS: Decouple request alloc from dev_pm_qos_mtx
2023-08-22 18:01 [PATCH v5 00/11] drm/msm+PM+icc: Make job_run() reclaim-safe Rob Clark
` (2 preceding siblings ...)
2023-08-22 18:01 ` [PATCH v5 03/11] PM / QoS: Fix constraints alloc vs reclaim locking Rob Clark
@ 2023-08-22 18:01 ` Rob Clark
2023-09-22 7:14 ` kernel test robot
2023-08-22 18:01 ` [PATCH v5 05/11] PM / QoS: Teach lockdep about dev_pm_qos_mtx locking order Rob Clark
` (2 subsequent siblings)
6 siblings, 1 reply; 12+ messages in thread
From: Rob Clark @ 2023-08-22 18:01 UTC (permalink / raw)
To: dri-devel
Cc: linux-arm-msm, freedreno, Rob Clark, Rafael J. Wysocki,
Pavel Machek, Len Brown, Greg Kroah-Hartman,
open list:HIBERNATION (aka Software Suspend, aka swsusp),
open list
From: Rob Clark <robdclark@chromium.org>
Similar to the previous patch, move the allocation out from under
dev_pm_qos_mtx, by speculatively doing the allocation and handling
any race after acquiring dev_pm_qos_mtx by freeing the redundant
allocation.
Signed-off-by: Rob Clark <robdclark@chromium.org>
---
drivers/base/power/qos.c | 13 +++++++------
1 file changed, 7 insertions(+), 6 deletions(-)
diff --git a/drivers/base/power/qos.c b/drivers/base/power/qos.c
index 7e95760d16dc..09834f3354d7 100644
--- a/drivers/base/power/qos.c
+++ b/drivers/base/power/qos.c
@@ -930,8 +930,12 @@ s32 dev_pm_qos_get_user_latency_tolerance(struct device *dev)
int dev_pm_qos_update_user_latency_tolerance(struct device *dev, s32 val)
{
struct dev_pm_qos *qos = dev_pm_qos_constraints_allocate(dev);
+ struct dev_pm_qos_request *req = NULL;
int ret = 0;
+ if (!qos->latency_tolerance_req)
+ req = kzalloc(sizeof(*req), GFP_KERNEL);
+
mutex_lock(&dev_pm_qos_mtx);
dev_pm_qos_constraints_set(dev, qos);
@@ -945,8 +949,6 @@ int dev_pm_qos_update_user_latency_tolerance(struct device *dev, s32 val)
goto out;
if (!dev->power.qos->latency_tolerance_req) {
- struct dev_pm_qos_request *req;
-
if (val < 0) {
if (val == PM_QOS_LATENCY_TOLERANCE_NO_CONSTRAINT)
ret = 0;
@@ -954,17 +956,15 @@ int dev_pm_qos_update_user_latency_tolerance(struct device *dev, s32 val)
ret = -EINVAL;
goto out;
}
- req = kzalloc(sizeof(*req), GFP_KERNEL);
if (!req) {
ret = -ENOMEM;
goto out;
}
ret = __dev_pm_qos_add_request(dev, req, DEV_PM_QOS_LATENCY_TOLERANCE, val);
- if (ret < 0) {
- kfree(req);
+ if (ret < 0)
goto out;
- }
dev->power.qos->latency_tolerance_req = req;
+ req = NULL;
} else {
if (val < 0) {
__dev_pm_qos_drop_user_request(dev, DEV_PM_QOS_LATENCY_TOLERANCE);
@@ -976,6 +976,7 @@ int dev_pm_qos_update_user_latency_tolerance(struct device *dev, s32 val)
out:
mutex_unlock(&dev_pm_qos_mtx);
+ kfree(req);
return ret;
}
EXPORT_SYMBOL_GPL(dev_pm_qos_update_user_latency_tolerance);
--
2.41.0
^ permalink raw reply related [flat|nested] 12+ messages in thread
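The request-side version in the patch above follows the same shape plus a deferred kfree(): allocate before taking the lock, NULL the local pointer when ownership transfers, and free whatever is left over only after dropping the lock. A minimal userspace sketch (names and sizes are stand-ins, not the kernel code):

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

static pthread_mutex_t mtx = PTHREAD_MUTEX_INITIALIZER;
static void *latency_req;  /* stand-in for qos->latency_tolerance_req */

static int update_tolerance(int val)
{
	void *req = calloc(1, 64);  /* speculative: lock not yet held */
	int ret = 0;

	(void)val;  /* value handling elided from this sketch */

	pthread_mutex_lock(&mtx);
	if (!latency_req) {
		if (!req) {
			ret = -1;  /* -ENOMEM in the real code */
			goto out;
		}
		latency_req = req;
		req = NULL;  /* ownership transferred: don't free below */
	}
out:
	pthread_mutex_unlock(&mtx);
	free(req);  /* frees the redundant copy, or NULL (a no-op) */
	return ret;
}
```

Because the single free() after unlock handles every path (consumed, redundant, or never allocated), no allocation or free ever happens with the mutex held.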
* Re: [PATCH v5 04/11] PM / QoS: Decouple request alloc from dev_pm_qos_mtx
2023-08-22 18:01 ` [PATCH v5 04/11] PM / QoS: Decouple request alloc from dev_pm_qos_mtx Rob Clark
@ 2023-09-22 7:14 ` kernel test robot
0 siblings, 0 replies; 12+ messages in thread
From: kernel test robot @ 2023-09-22 7:14 UTC (permalink / raw)
To: Rob Clark
Cc: oe-lkp, lkp, linux-pm, dri-devel, Rob Clark, Len Brown,
Rafael J. Wysocki, linux-arm-msm, open list, Pavel Machek,
Greg Kroah-Hartman, freedreno, oliver.sang
Hello,
kernel test robot noticed "canonical_address#:#[##]" on:
commit: d308a440bdf329cfa70cc5d35c565939d81ae73f ("[PATCH v5 04/11] PM / QoS: Decouple request alloc from dev_pm_qos_mtx")
url: https://github.com/intel-lab-lkp/linux/commits/Rob-Clark/PM-devfreq-Drop-unneed-locking-to-appease-lockdep/20230823-020443
base: git://anongit.freedesktop.org/drm/drm-misc drm-misc-next
patch link: https://lore.kernel.org/all/20230822180208.95556-5-robdclark@gmail.com/
patch subject: [PATCH v5 04/11] PM / QoS: Decouple request alloc from dev_pm_qos_mtx
in testcase: blktests
version: blktests-x86_64-e0bb3dc-1_20230912
with following parameters:
disk: 1SSD
test: nvme-group-01
nvme_trtype: rdma
compiler: gcc-12
test machine: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480+ (Sapphire Rapids) with 256G memory
(please refer to attached dmesg/kmsg for entire log/backtrace)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202309221426.fb0fe750-oliver.sang@intel.com
[ 79.616893][ T2311]
[ 79.634663][ T3447] run blktests nvme/032 at 2023-09-19 15:50:52
[ 83.369231][ T2313] /lkp/lkp/src/monitors/kmemleak: 19: echo: echo: I/O error
[ 83.369240][ T2313]
[ 85.082264][ T1434] nvme nvme0: 128/0/0 default/read/poll queues
[ 88.926272][ T3447] general protection fault, probably for non-canonical address 0xdffffc0000000024: 0000 [#1] PREEMPT SMP KASAN NOPTI
[ 88.941100][ T3447] KASAN: null-ptr-deref in range [0x0000000000000120-0x0000000000000127]
[ 88.951583][ T3447] CPU: 95 PID: 3447 Comm: check Tainted: G S 6.5.0-rc2-00514-gd308a440bdf3 #1
[ 88.964091][ T3447] Hardware name: Intel Corporation D50DNP1SBB/D50DNP1SBB, BIOS SE5C7411.86B.8118.D04.2206151341 06/15/2022
[ 88.977880][ T3447] RIP: 0010:dev_pm_qos_update_user_latency_tolerance (kbuild/src/consumer/drivers/base/power/qos.c:936)
[ 88.987504][ T3447] Code: 02 00 00 48 8b bb 08 02 00 00 e8 79 ea ff ff 48 8d b8 20 01 00 00 48 89 c5 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 3a 02 00 00 45 31 f6 48 83 bd 20 01 00 00 00 0f
All code
========
0: 02 00 add (%rax),%al
2: 00 48 8b add %cl,-0x75(%rax)
5: bb 08 02 00 00 mov $0x208,%ebx
a: e8 79 ea ff ff callq 0xffffffffffffea88
f: 48 8d b8 20 01 00 00 lea 0x120(%rax),%rdi
16: 48 89 c5 mov %rax,%rbp
19: 48 b8 00 00 00 00 00 movabs $0xdffffc0000000000,%rax
20: fc ff df
23: 48 89 fa mov %rdi,%rdx
26: 48 c1 ea 03 shr $0x3,%rdx
2a:* 80 3c 02 00 cmpb $0x0,(%rdx,%rax,1) <-- trapping instruction
2e: 0f 85 3a 02 00 00 jne 0x26e
34: 45 31 f6 xor %r14d,%r14d
37: 48 83 bd 20 01 00 00 cmpq $0x0,0x120(%rbp)
3e: 00
3f: 0f .byte 0xf
Code starting with the faulting instruction
===========================================
0: 80 3c 02 00 cmpb $0x0,(%rdx,%rax,1)
4: 0f 85 3a 02 00 00 jne 0x244
a: 45 31 f6 xor %r14d,%r14d
d: 48 83 bd 20 01 00 00 cmpq $0x0,0x120(%rbp)
14: 00
15: 0f .byte 0xf
[ 89.010647][ T3447] RSP: 0018:ffa0000017fe7b70 EFLAGS: 00010206
[ 89.018574][ T3447] RAX: dffffc0000000000 RBX: ff1100209b614298 RCX: 0000000000000000
[ 89.028658][ T3447] RDX: 0000000000000024 RSI: 00000000ffffffff RDI: 0000000000000120
[ 89.038735][ T3447] RBP: 0000000000000000 R08: 0000000000000000 R09: fff3fc0002ffcf64
[ 89.048812][ T3447] R10: 0000000000000003 R11: ff1100208a8624b0 R12: ff1100209b6144a0
[ 89.058895][ T3447] R13: 00000000ffffffff R14: ffffffffc08e3468 R15: ff110001273f4138
[ 89.068957][ T3447] FS: 00007fc6d8027740(0000) GS:ff11003fd3180000(0000) knlGS:0000000000000000
[ 89.080098][ T3447] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 89.088618][ T3447] CR2: 00007f5be5eeb120 CR3: 0000000263306002 CR4: 0000000000f71ee0
[ 89.098720][ T3447] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 89.108812][ T3447] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
[ 89.118899][ T3447] PKRU: 55555554
[ 89.123997][ T3447] Call Trace:
[ 89.128804][ T3447] <TASK>
[ 89.133218][ T3447] ? die_addr (kbuild/src/consumer/arch/x86/kernel/dumpstack.c:421 kbuild/src/consumer/arch/x86/kernel/dumpstack.c:460)
[ 89.139003][ T3447] ? exc_general_protection (kbuild/src/consumer/arch/x86/kernel/traps.c:786 kbuild/src/consumer/arch/x86/kernel/traps.c:728)
[ 89.146323][ T3447] ? asm_exc_general_protection (kbuild/src/consumer/arch/x86/include/asm/idtentry.h:564)
[ 89.153849][ T3447] ? dev_pm_qos_update_user_latency_tolerance (kbuild/src/consumer/drivers/base/power/qos.c:936)
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20230922/202309221426.fb0fe750-oliver.sang@intel.com
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 12+ messages in thread
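The fault above is consistent with the unguarded `if (!qos->latency_tolerance_req)` check added in patch 04: after patch 03, dev_pm_qos_constraints_allocate() deliberately returns NULL when the constraints already exist, so the speculative pointer cannot be dereferenced directly. A userspace sketch of the failure mode and a guarded check (all types and names here are illustrative stand-ins, not a proposed fix):

```c
#include <assert.h>
#include <stdlib.h>

struct qos { void *latency_tolerance_req; };  /* illustrative stand-in */

static struct qos *power_qos;  /* analogue of dev->power.qos */

/* NULL means "already allocated" as well as "out of memory". */
static struct qos *constraints_allocate(void)
{
	if (power_qos)
		return NULL;
	return calloc(1, sizeof(struct qos));
}

/* The unguarded `!spec->latency_tolerance_req` faults once power_qos is
 * set, because constraints_allocate() then returns NULL; the guarded
 * form checks the speculative pointer and the published one. */
static int need_request(struct qos *spec)
{
	struct qos *q = spec ? spec : power_qos;

	return !q || !q->latency_tolerance_req;
}
```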
* [PATCH v5 05/11] PM / QoS: Teach lockdep about dev_pm_qos_mtx locking order
2023-08-22 18:01 [PATCH v5 00/11] drm/msm+PM+icc: Make job_run() reclaim-safe Rob Clark
` (3 preceding siblings ...)
2023-08-22 18:01 ` [PATCH v5 04/11] PM / QoS: Decouple request alloc from dev_pm_qos_mtx Rob Clark
@ 2023-08-22 18:01 ` Rob Clark
2023-08-22 18:01 ` [PATCH v5 06/11] interconnect: Fix locking for runpm vs reclaim Rob Clark
2023-08-22 18:01 ` [PATCH v5 07/11] interconnect: Teach lockdep about icc_bw_lock order Rob Clark
6 siblings, 0 replies; 12+ messages in thread
From: Rob Clark @ 2023-08-22 18:01 UTC (permalink / raw)
To: dri-devel
Cc: linux-arm-msm, freedreno, Rob Clark, Rafael J. Wysocki,
Pavel Machek, Len Brown, Greg Kroah-Hartman,
open list:HIBERNATION (aka Software Suspend, aka swsusp),
open list
From: Rob Clark <robdclark@chromium.org>
Annotate dev_pm_qos_mtx to teach lockdep to scream about allocations
that could trigger reclaim under dev_pm_qos_mtx.
Signed-off-by: Rob Clark <robdclark@chromium.org>
---
drivers/base/power/qos.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/drivers/base/power/qos.c b/drivers/base/power/qos.c
index 09834f3354d7..2018c805a6f1 100644
--- a/drivers/base/power/qos.c
+++ b/drivers/base/power/qos.c
@@ -1017,3 +1017,14 @@ void dev_pm_qos_hide_latency_tolerance(struct device *dev)
pm_runtime_put(dev);
}
EXPORT_SYMBOL_GPL(dev_pm_qos_hide_latency_tolerance);
+
+static int __init dev_pm_qos_init(void)
+{
+ /* Teach lockdep about lock ordering wrt. shrinker: */
+ fs_reclaim_acquire(GFP_KERNEL);
+ might_lock(&dev_pm_qos_mtx);
+ fs_reclaim_release(GFP_KERNEL);
+
+ return 0;
+}
+early_initcall(dev_pm_qos_init);
--
2.41.0
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH v5 06/11] interconnect: Fix locking for runpm vs reclaim
2023-08-22 18:01 [PATCH v5 00/11] drm/msm+PM+icc: Make job_run() reclaim-safe Rob Clark
` (4 preceding siblings ...)
2023-08-22 18:01 ` [PATCH v5 05/11] PM / QoS: Teach lockdep about dev_pm_qos_mtx locking order Rob Clark
@ 2023-08-22 18:01 ` Rob Clark
2023-08-22 18:01 ` [PATCH v5 07/11] interconnect: Teach lockdep about icc_bw_lock order Rob Clark
6 siblings, 0 replies; 12+ messages in thread
From: Rob Clark @ 2023-08-22 18:01 UTC (permalink / raw)
To: dri-devel
Cc: linux-arm-msm, freedreno, Rob Clark, Georgi Djakov,
open list:INTERCONNECT API, open list
From: Rob Clark <robdclark@chromium.org>
For cases where icc_set_bw() can be called in call paths that could
deadlock against shrinker/reclaim, such as runpm resume, we need to
decouple the icc locking. Introduce a new icc_bw_lock for cases where
we need to serialize bw aggregation and update, decoupling that from
paths that require memory allocation such as node/link creation and
destruction.
Fixes this lockdep splat:
======================================================
WARNING: possible circular locking dependency detected
6.2.0-rc8-debug+ #554 Not tainted
------------------------------------------------------
ring0/132 is trying to acquire lock:
ffffff80871916d0 (&gmu->lock){+.+.}-{3:3}, at: a6xx_pm_resume+0xf0/0x234
but task is already holding lock:
ffffffdb5aee57e8 (dma_fence_map){++++}-{0:0}, at: msm_job_run+0x68/0x150
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #4 (dma_fence_map){++++}-{0:0}:
__dma_fence_might_wait+0x74/0xc0
dma_resv_lockdep+0x1f4/0x2f4
do_one_initcall+0x104/0x2bc
kernel_init_freeable+0x344/0x34c
kernel_init+0x30/0x134
ret_from_fork+0x10/0x20
-> #3 (mmu_notifier_invalidate_range_start){+.+.}-{0:0}:
fs_reclaim_acquire+0x80/0xa8
slab_pre_alloc_hook.constprop.0+0x40/0x25c
__kmem_cache_alloc_node+0x60/0x1cc
__kmalloc+0xd8/0x100
topology_parse_cpu_capacity+0x8c/0x178
get_cpu_for_node+0x88/0xc4
parse_cluster+0x1b0/0x28c
parse_cluster+0x8c/0x28c
init_cpu_topology+0x168/0x188
smp_prepare_cpus+0x24/0xf8
kernel_init_freeable+0x18c/0x34c
kernel_init+0x30/0x134
ret_from_fork+0x10/0x20
-> #2 (fs_reclaim){+.+.}-{0:0}:
__fs_reclaim_acquire+0x3c/0x48
fs_reclaim_acquire+0x54/0xa8
slab_pre_alloc_hook.constprop.0+0x40/0x25c
__kmem_cache_alloc_node+0x60/0x1cc
__kmalloc+0xd8/0x100
kzalloc.constprop.0+0x14/0x20
icc_node_create_nolock+0x4c/0xc4
icc_node_create+0x38/0x58
qcom_icc_rpmh_probe+0x1b8/0x248
platform_probe+0x70/0xc4
really_probe+0x158/0x290
__driver_probe_device+0xc8/0xe0
driver_probe_device+0x44/0x100
__driver_attach+0xf8/0x108
bus_for_each_dev+0x78/0xc4
driver_attach+0x2c/0x38
bus_add_driver+0xd0/0x1d8
driver_register+0xbc/0xf8
__platform_driver_register+0x30/0x3c
qnoc_driver_init+0x24/0x30
do_one_initcall+0x104/0x2bc
kernel_init_freeable+0x344/0x34c
kernel_init+0x30/0x134
ret_from_fork+0x10/0x20
-> #1 (icc_lock){+.+.}-{3:3}:
__mutex_lock+0xcc/0x3c8
mutex_lock_nested+0x30/0x44
icc_set_bw+0x88/0x2b4
_set_opp_bw+0x8c/0xd8
_set_opp+0x19c/0x300
dev_pm_opp_set_opp+0x84/0x94
a6xx_gmu_resume+0x18c/0x804
a6xx_pm_resume+0xf8/0x234
adreno_runtime_resume+0x2c/0x38
pm_generic_runtime_resume+0x30/0x44
__rpm_callback+0x15c/0x174
rpm_callback+0x78/0x7c
rpm_resume+0x318/0x524
__pm_runtime_resume+0x78/0xbc
adreno_load_gpu+0xc4/0x17c
msm_open+0x50/0x120
drm_file_alloc+0x17c/0x228
drm_open_helper+0x74/0x118
drm_open+0xa0/0x144
drm_stub_open+0xd4/0xe4
chrdev_open+0x1b8/0x1e4
do_dentry_open+0x2f8/0x38c
vfs_open+0x34/0x40
path_openat+0x64c/0x7b4
do_filp_open+0x54/0xc4
do_sys_openat2+0x9c/0x100
do_sys_open+0x50/0x7c
__arm64_sys_openat+0x28/0x34
invoke_syscall+0x8c/0x128
el0_svc_common.constprop.0+0xa0/0x11c
do_el0_svc+0xac/0xbc
el0_svc+0x48/0xa0
el0t_64_sync_handler+0xac/0x13c
el0t_64_sync+0x190/0x194
-> #0 (&gmu->lock){+.+.}-{3:3}:
__lock_acquire+0xe00/0x1060
lock_acquire+0x1e0/0x2f8
__mutex_lock+0xcc/0x3c8
mutex_lock_nested+0x30/0x44
a6xx_pm_resume+0xf0/0x234
adreno_runtime_resume+0x2c/0x38
pm_generic_runtime_resume+0x30/0x44
__rpm_callback+0x15c/0x174
rpm_callback+0x78/0x7c
rpm_resume+0x318/0x524
__pm_runtime_resume+0x78/0xbc
pm_runtime_get_sync.isra.0+0x14/0x20
msm_gpu_submit+0x58/0x178
msm_job_run+0x78/0x150
drm_sched_main+0x290/0x370
kthread+0xf0/0x100
ret_from_fork+0x10/0x20
other info that might help us debug this:
Chain exists of:
&gmu->lock --> mmu_notifier_invalidate_range_start --> dma_fence_map
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(dma_fence_map);
lock(mmu_notifier_invalidate_range_start);
lock(dma_fence_map);
lock(&gmu->lock);
*** DEADLOCK ***
2 locks held by ring0/132:
#0: ffffff8087191170 (&gpu->lock){+.+.}-{3:3}, at: msm_job_run+0x64/0x150
#1: ffffffdb5aee57e8 (dma_fence_map){++++}-{0:0}, at: msm_job_run+0x68/0x150
stack backtrace:
CPU: 7 PID: 132 Comm: ring0 Not tainted 6.2.0-rc8-debug+ #554
Hardware name: Google Lazor (rev1 - 2) with LTE (DT)
Call trace:
dump_backtrace.part.0+0xb4/0xf8
show_stack+0x20/0x38
dump_stack_lvl+0x9c/0xd0
dump_stack+0x18/0x34
print_circular_bug+0x1b4/0x1f0
check_noncircular+0x78/0xac
__lock_acquire+0xe00/0x1060
lock_acquire+0x1e0/0x2f8
__mutex_lock+0xcc/0x3c8
mutex_lock_nested+0x30/0x44
a6xx_pm_resume+0xf0/0x234
adreno_runtime_resume+0x2c/0x38
pm_generic_runtime_resume+0x30/0x44
__rpm_callback+0x15c/0x174
rpm_callback+0x78/0x7c
rpm_resume+0x318/0x524
__pm_runtime_resume+0x78/0xbc
pm_runtime_get_sync.isra.0+0x14/0x20
msm_gpu_submit+0x58/0x178
msm_job_run+0x78/0x150
drm_sched_main+0x290/0x370
kthread+0xf0/0x100
ret_from_fork+0x10/0x20
Signed-off-by: Rob Clark <robdclark@chromium.org>
---
drivers/interconnect/core.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/interconnect/core.c b/drivers/interconnect/core.c
index 5fac448c28fd..e15a92a79df1 100644
--- a/drivers/interconnect/core.c
+++ b/drivers/interconnect/core.c
@@ -28,6 +28,7 @@ static LIST_HEAD(icc_providers);
static int providers_count;
static bool synced_state;
static DEFINE_MUTEX(icc_lock);
+static DEFINE_MUTEX(icc_bw_lock);
static struct dentry *icc_debugfs_dir;
static void icc_summary_show_one(struct seq_file *s, struct icc_node *n)
@@ -631,7 +632,7 @@ int icc_set_bw(struct icc_path *path, u32 avg_bw, u32 peak_bw)
if (WARN_ON(IS_ERR(path) || !path->num_nodes))
return -EINVAL;
- mutex_lock(&icc_lock);
+ mutex_lock(&icc_bw_lock);
old_avg = path->reqs[0].avg_bw;
old_peak = path->reqs[0].peak_bw;
@@ -663,7 +664,7 @@ int icc_set_bw(struct icc_path *path, u32 avg_bw, u32 peak_bw)
apply_constraints(path);
}
- mutex_unlock(&icc_lock);
+ mutex_unlock(&icc_bw_lock);
trace_icc_set_bw_end(path, ret);
@@ -872,6 +873,7 @@ void icc_node_add(struct icc_node *node, struct icc_provider *provider)
return;
mutex_lock(&icc_lock);
+ mutex_lock(&icc_bw_lock);
node->provider = provider;
list_add_tail(&node->node_list, &provider->nodes);
@@ -900,6 +902,7 @@ void icc_node_add(struct icc_node *node, struct icc_provider *provider)
node->avg_bw = 0;
node->peak_bw = 0;
+ mutex_unlock(&icc_bw_lock);
mutex_unlock(&icc_lock);
}
EXPORT_SYMBOL_GPL(icc_node_add);
@@ -1025,6 +1028,7 @@ void icc_sync_state(struct device *dev)
return;
mutex_lock(&icc_lock);
+ mutex_lock(&icc_bw_lock);
synced_state = true;
list_for_each_entry(p, &icc_providers, provider_list) {
dev_dbg(p->dev, "interconnect provider is in synced state\n");
--
2.41.0
* [PATCH v5 07/11] interconnect: Teach lockdep about icc_bw_lock order
2023-08-22 18:01 [PATCH v5 00/11] drm/msm+PM+icc: Make job_run() reclaim-safe Rob Clark
` (5 preceding siblings ...)
2023-08-22 18:01 ` [PATCH v5 06/11] interconnect: Fix locking for runpm vs reclaim Rob Clark
@ 2023-08-22 18:01 ` Rob Clark
6 siblings, 0 replies; 12+ messages in thread
From: Rob Clark @ 2023-08-22 18:01 UTC (permalink / raw)
To: dri-devel
Cc: linux-arm-msm, freedreno, Rob Clark, Georgi Djakov,
open list:INTERCONNECT API, open list
From: Rob Clark <robdclark@chromium.org>
Teach lockdep about the icc_bw_lock ordering wrt reclaim, so that it
can flag code paths that could deadlock if they allocate (and hence
trigger reclaim) under icc_bw_lock.
Signed-off-by: Rob Clark <robdclark@chromium.org>
---
drivers/interconnect/core.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/drivers/interconnect/core.c b/drivers/interconnect/core.c
index e15a92a79df1..1afbc4f7c6e7 100644
--- a/drivers/interconnect/core.c
+++ b/drivers/interconnect/core.c
@@ -1041,13 +1041,21 @@ void icc_sync_state(struct device *dev)
}
}
}
+ mutex_unlock(&icc_bw_lock);
mutex_unlock(&icc_lock);
}
EXPORT_SYMBOL_GPL(icc_sync_state);
static int __init icc_init(void)
{
- struct device_node *root = of_find_node_by_path("/");
+ struct device_node *root;
+
+ /* Teach lockdep about lock ordering wrt. shrinker: */
+ fs_reclaim_acquire(GFP_KERNEL);
+ might_lock(&icc_bw_lock);
+ fs_reclaim_release(GFP_KERNEL);
+
+ root = of_find_node_by_path("/");
providers_count = of_count_icc_providers(root);
of_node_put(root);
--
2.41.0