The Linux Kernel Mailing List
 help / color / mirror / Atom feed
* [PATCH v2] fs/resctrl: Fix deadlock for errors during mount
@ 2026-05-04 22:01 Tony Luck
  2026-05-06 22:16 ` Reinette Chatre
  0 siblings, 1 reply; 6+ messages in thread
From: Tony Luck @ 2026-05-04 22:01 UTC (permalink / raw)
  To: Borislav Petkov, x86
  Cc: Fenghua Yu, Reinette Chatre, Maciej Wieczor-Retman, Peter Newman,
	James Morse, Babu Moger, Drew Fustini, Dave Martin, Chen Yu,
	linux-kernel, patches, Tony Luck

Sashiko noticed[1] a deadlock in the resctrl mount code.

rdt_get_tree() acquires rdtgroup_mutex before calling kernfs_get_tree(). If
superblock setup fails inside kernfs_get_tree(), the VFS calls kill_sb on
the same thread before the call returns.  rdt_kill_sb() unconditionally
attempts to acquire rdtgroup_mutex and deadlock occurs.

Move the call to kernfs_get_tree() outside of locks.

If kernfs_get_tree() fails and ctx->kfc.new_sb_created is set, then rdt_kill_sb()
has already been called and no further cleanup is needed.

Fixes: 5ff193fbde20 ("x86/intel_rdt: Add basic resctrl filesystem support")
Suggested-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Link: https://sashiko.dev/#/patchset/20260429184858.36423-1-tony.luck%40intel.com [1]
---

Changes since v1:
Link: https://lore.kernel.org/all/20260501185612.14442-1-tony.luck@intel.com/

AI solution from Claude abandoned after human (Reinette) and AI (Sashiko)
reviews found it lacking.

Reinette: Feel free to claim full authorship of this fix since you
provided direction for every aspect. At minimum the "Suggested-by:"
should be "Co-authored-by:" (with your sign-off).

 fs/resctrl/rdtgroup.c | 24 ++++++++++++++++++------
 1 file changed, 18 insertions(+), 6 deletions(-)

diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
index 5dfdaa6f9d8f..6709b74bd655 100644
--- a/fs/resctrl/rdtgroup.c
+++ b/fs/resctrl/rdtgroup.c
@@ -2855,10 +2855,6 @@ static int rdt_get_tree(struct fs_context *fc)
 	if (ret)
 		goto out_mondata;
 
-	ret = kernfs_get_tree(fc);
-	if (ret < 0)
-		goto out_psl;
-
 	if (resctrl_arch_alloc_capable())
 		resctrl_arch_enable_alloc();
 	if (resctrl_arch_mon_capable())
@@ -2874,10 +2870,26 @@ static int rdt_get_tree(struct fs_context *fc)
 						   RESCTRL_PICK_ANY_CPU);
 	}
 
-	goto out;
+	/* Release locks because kernfs_get_tree() may call rdt_kill_sb() */
+	mutex_unlock(&rdtgroup_mutex);
+	cpus_read_unlock();
+	ret = kernfs_get_tree(fc);
+	if (!ret || ctx->kfc.new_sb_created) {
+		/* mount succeeded, or failed and already cleaned up */
+		return ret;
+	}
+	cpus_read_lock();
+	mutex_lock(&rdtgroup_mutex);
+
+	if (resctrl_arch_alloc_capable())
+		resctrl_arch_disable_alloc();
+	if (resctrl_arch_mon_capable())
+		resctrl_arch_disable_mon();
+
+	resctrl_mounted = false;
 
-out_psl:
 	rdt_pseudo_lock_release();
+
 out_mondata:
 	if (resctrl_arch_mon_capable())
 		kernfs_remove(kn_mondata);
-- 
2.54.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH v2] fs/resctrl: Fix deadlock for errors during mount
  2026-05-04 22:01 [PATCH v2] fs/resctrl: Fix deadlock for errors during mount Tony Luck
@ 2026-05-06 22:16 ` Reinette Chatre
  2026-05-07 17:47   ` Luck, Tony
  0 siblings, 1 reply; 6+ messages in thread
From: Reinette Chatre @ 2026-05-06 22:16 UTC (permalink / raw)
  To: Tony Luck, Borislav Petkov, x86
  Cc: Fenghua Yu, Maciej Wieczor-Retman, Peter Newman, James Morse,
	Babu Moger, Drew Fustini, Dave Martin, Chen Yu, linux-kernel,
	patches

Hi Tony,

On 5/4/26 3:01 PM, Tony Luck wrote:
> diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
> index 5dfdaa6f9d8f..6709b74bd655 100644
> --- a/fs/resctrl/rdtgroup.c
> +++ b/fs/resctrl/rdtgroup.c
> @@ -2855,10 +2855,6 @@ static int rdt_get_tree(struct fs_context *fc)
>  	if (ret)
>  		goto out_mondata;
>  
> -	ret = kernfs_get_tree(fc);
> -	if (ret < 0)
> -		goto out_psl;
> -
>  	if (resctrl_arch_alloc_capable())
>  		resctrl_arch_enable_alloc();
>  	if (resctrl_arch_mon_capable())
> @@ -2874,10 +2870,26 @@ static int rdt_get_tree(struct fs_context *fc)
>  						   RESCTRL_PICK_ANY_CPU);
>  	}
>  
> -	goto out;
> +	/* Release locks because kernfs_get_tree() may call rdt_kill_sb() */

I neglected to add a rdt_last_cmd_clear() to this path now used for success and failure.
It is needed on success path to ensure that the last_cmd_status file does not show stale
information.

> +	mutex_unlock(&rdtgroup_mutex);
> +	cpus_read_unlock();
> +	ret = kernfs_get_tree(fc);
> +	if (!ret || ctx->kfc.new_sb_created) {
> +		/* mount succeeded, or failed and already cleaned up */
> +		return ret;
> +	}
> +	cpus_read_lock();
> +	mutex_lock(&rdtgroup_mutex);
> +
> +	if (resctrl_arch_alloc_capable())
> +		resctrl_arch_disable_alloc();
> +	if (resctrl_arch_mon_capable())
> +		resctrl_arch_disable_mon();
> +
> +	resctrl_mounted = false;

I find symmetrical code significantly easier to reason about while the above introduces
asymmetrical code in two ways:
- kernfs_get_tree() failures have identical resctrl state but the resctrl
  cleanup is done differently (rdt_kill_sb() or rdt_get_tree() error path) based on the
  specific failure that has nothing to do with resctrl. I think it will be simpler if
  resctrl state is cleaned up consistently irrespective of how kernfs_get_tree() fails.
- Above change creates inconsistency in how kernfs_get_tree() is called without
  locks but its "partner" function kernfs_kill_sb() is called with both
  CPU hotplug lock and rdtgroup_mutex held. Here too I believe it will be
  simpler to just keep locking consistent when interacting with these
  kernfs calls.

>  
> -out_psl:
>  	rdt_pseudo_lock_release();
> +
>  out_mondata:
>  	if (resctrl_arch_mon_capable())
>  		kernfs_remove(kn_mondata);

While comparing rdt_kill_sb() and this error exit path to make sure they do the same
I seem to have stumbled on a bug where out_mondata is missing mon_put_kn_priv(). Do you
agree?

Putting all comments together, how about something like below?

diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
index 5dfdaa6f9d8f..6d0d6ab34985 100644
--- a/fs/resctrl/rdtgroup.c
+++ b/fs/resctrl/rdtgroup.c
@@ -73,6 +73,11 @@ static char last_cmd_status_buf[512];
 static int rdtgroup_setup_root(struct rdt_fs_context *ctx);
 
 static void rdtgroup_destroy_root(void);
+/*
+ * Temporary forward declaration for testing only. Move functions instead.
+ */
+static void resctrl_unmount(void);
+static void mon_put_kn_priv(void);
 
 struct dentry *debugfs_resctrl;
 
@@ -2855,10 +2860,6 @@ static int rdt_get_tree(struct fs_context *fc)
 	if (ret)
 		goto out_mondata;
 
-	ret = kernfs_get_tree(fc);
-	if (ret < 0)
-		goto out_psl;
-
 	if (resctrl_arch_alloc_capable())
 		resctrl_arch_enable_alloc();
 	if (resctrl_arch_mon_capable())
@@ -2874,13 +2875,24 @@ static int rdt_get_tree(struct fs_context *fc)
 						   RESCTRL_PICK_ANY_CPU);
 	}
 
-	goto out;
+	rdt_last_cmd_clear();
+	mutex_unlock(&rdtgroup_mutex);
+	cpus_read_unlock();
+
+	ret = kernfs_get_tree(fc);
+	/*
+	 * resctrl can only be mounted once, new superblock only expected
+	 * to be created once.
+	 */
+	if (!ctx->kfc.new_sb_created)
+		resctrl_unmount();
+	return ret;
 
-out_psl:
-	rdt_pseudo_lock_release();
 out_mondata:
-	if (resctrl_arch_mon_capable())
+	if (resctrl_arch_mon_capable()) {
 		kernfs_remove(kn_mondata);
+		mon_put_kn_priv(); /* separate fix */
+	}
 out_mongrp:
 	if (resctrl_arch_mon_capable()) {
 		rdtgroup_unassign_cntrs(&rdtgroup_default);
@@ -2896,7 +2908,6 @@ static int rdt_get_tree(struct fs_context *fc)
 out_root:
 	rdtgroup_destroy_root();
 out:
-	rdt_last_cmd_clear();
 	mutex_unlock(&rdtgroup_mutex);
 	cpus_read_unlock();
 	return ret;
@@ -3169,7 +3180,7 @@ static void resctrl_fs_teardown(void)
 	rdtgroup_destroy_root();
 }
 
-static void rdt_kill_sb(struct super_block *sb)
+static void resctrl_unmount(void)
 {
 	struct rdt_resource *r;
 
@@ -3188,11 +3199,17 @@ static void rdt_kill_sb(struct super_block *sb)
 	if (resctrl_arch_mon_capable())
 		resctrl_arch_disable_mon();
 	resctrl_mounted = false;
-	kernfs_kill_sb(sb);
 	mutex_unlock(&rdtgroup_mutex);
 	cpus_read_unlock();
 }
 
+
+static void rdt_kill_sb(struct super_block *sb)
+{
+	resctrl_unmount();
+	kernfs_kill_sb(sb);
+}
+
 static struct file_system_type rdt_fs_type = {
 	.name			= "resctrl",
 	.init_fs_context	= rdt_init_fs_context,


Reinette

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH v2] fs/resctrl: Fix deadlock for errors during mount
  2026-05-06 22:16 ` Reinette Chatre
@ 2026-05-07 17:47   ` Luck, Tony
  2026-05-07 18:30     ` Reinette Chatre
  0 siblings, 1 reply; 6+ messages in thread
From: Luck, Tony @ 2026-05-07 17:47 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: Borislav Petkov, x86, Fenghua Yu, Maciej Wieczor-Retman,
	Peter Newman, James Morse, Babu Moger, Drew Fustini, Dave Martin,
	Chen Yu, linux-kernel, patches

Reinette,

This looks promising.

I'll split out the missing call to mon_put_kn_priv(); into its own
patch before the deadlock fix.

Going to re-run the tests I did before forcing kernfs_get_tree()
to fail early without setting new_sb_created, and also late.

> +/*
> + * Temporary forward declaration for testing only. Move functions instead.
> + */
> +static void resctrl_unmount(void);
> +static void mon_put_kn_priv(void);

Question: How much are forward declarations hated? And how to handle
this?

Moving the functions around in the same patch really obscures the
actual change. Is it OK to have a patch to make the functional
change including the forward declarations. Then a separate commit
that does the re-order (where it is obvious that functions are
being picked up and moved without any code changes)?

-Tony

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2] fs/resctrl: Fix deadlock for errors during mount
  2026-05-07 17:47   ` Luck, Tony
@ 2026-05-07 18:30     ` Reinette Chatre
  2026-05-07 23:45       ` Luck, Tony
  0 siblings, 1 reply; 6+ messages in thread
From: Reinette Chatre @ 2026-05-07 18:30 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Borislav Petkov, x86, Fenghua Yu, Maciej Wieczor-Retman,
	Peter Newman, James Morse, Babu Moger, Drew Fustini, Dave Martin,
	Chen Yu, linux-kernel, patches

Hi Tony,

On 5/7/26 10:47 AM, Luck, Tony wrote:
> Reinette,
> 
> This looks promising.
> 
> I'll split out the missing call to mon_put_kn_priv(); into its own
> patch before the deadlock fix.
> 
> Going to re-run the tests I did before forcing kernfs_get_tree()
> to fail early without setting new_sb_created, and also late.

Thank you very much.

> 
>> +/*
>> + * Temporary forward declaration for testing only. Move functions instead.
>> + */
>> +static void resctrl_unmount(void);
>> +static void mon_put_kn_priv(void);
> 
> Question: How much are forward declarations hated? And how to handle
> this?

I am not aware of forward declarations being hated and I am not aware of
documented tip rules about this. Even so, I do find it cleaner if they
can be avoided. To handle this a prep patch that just moves the code
without any functional change should work?

> Moving the functions around in the same patch really obscures the
> actual change. Is it OK to have a patch to make the functional
> change including the forward declarations. Then a separate commit
> that does the re-order (where it is obvious that functions are
> being picked up and moved without any code changes)?

My preference would be a prep patch that does the move with new
capability built on top. It seems unnecessary to me to add a forward
declaration in one patch just to remove it in a following patch.
I agree that moving code as part of functional change should be avoided.

Reinette

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2] fs/resctrl: Fix deadlock for errors during mount
  2026-05-07 18:30     ` Reinette Chatre
@ 2026-05-07 23:45       ` Luck, Tony
  2026-05-08  0:23         ` Reinette Chatre
  0 siblings, 1 reply; 6+ messages in thread
From: Luck, Tony @ 2026-05-07 23:45 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: Borislav Petkov, x86, Fenghua Yu, Maciej Wieczor-Retman,
	Peter Newman, James Morse, Babu Moger, Drew Fustini, Dave Martin,
	Chen Yu, linux-kernel, patches

On Thu, May 07, 2026 at 11:30:53AM -0700, Reinette Chatre wrote:
> Hi Tony,
> 
> On 5/7/26 10:47 AM, Luck, Tony wrote:
> > Reinette,
> > 
> > This looks promising.
> > 
> > I'll split out the missing call to mon_put_kn_priv(); into its own
> > patch before the deadlock fix.
> > 
> > Going to re-run the tests I did before forcing kernfs_get_tree()
> > to fail early without setting new_sb_created, and also late.
> 
> Thank you very much.

Tests pass.

> > 
> >> +/*
> >> + * Temporary forward declaration for testing only. Move functions instead.
> >> + */
> >> +static void resctrl_unmount(void);
> >> +static void mon_put_kn_priv(void);
> > 
> > Question: How much are forward declarations hated? And how to handle
> > this?
> 
> I am not aware of forward declarations being hated and I am not aware of
> documented tip rules about this. Even so, I do find it cleaner if they
> can be avoided. To handle this a prep patch that just moves the code
> without any functional change should work?
> 
> > Moving the functions around in the same patch really obscures the
> > actual change. Is it OK to have a patch to make the functional
> > change including the forward declarations. Then a separate commit
> > that does the re-order (where it is obvious that functions are
> > being picked up and moved without any code changes)?
> 
> My preference would be a prep patch that does the move with new
> capability built on top. It seems unnecessary to me to add a forward
> declaration in one patch just to remove it in a following patch.
> I agree that moving code as part of functional change should be avoided.

OK. I have a three patch series:

1) Pure code move, no changes [while I picked up a block of functions
and moved them earlier, "git show" presents the move in terms of a
different set of functions moving]

2) Add missing mon_put_kn_priv()

3) Fix the mount error deadlock

Internal AI scans have only two complaints. Both appear spurious;

First it says that:

	if (!ctx->kfc.new_sb_created)
		resctrl_unmount();
should be:

	if (ret && !ctx->kfc.new_sb_created)
		resctrl_unmount();

It admits that current code can't return ret == 0 (success) while
failing to create a super block. But argues in some twisted future
that might be possible.

Second:

Code used to call rdt_last_cmd_clear() unconditionally on success or
failure of the mount. Now the call for the end-of-function error path
is gone.

This is true, but it doesn't matter. If the filesystem did not mount
then there is no /sys/fs/resctrl/info/last_cmd_status file to leak an
old error string.

> 
> Reinette

I just need to write up the cover letter.

-Tony

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2] fs/resctrl: Fix deadlock for errors during mount
  2026-05-07 23:45       ` Luck, Tony
@ 2026-05-08  0:23         ` Reinette Chatre
  0 siblings, 0 replies; 6+ messages in thread
From: Reinette Chatre @ 2026-05-08  0:23 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Borislav Petkov, x86, Fenghua Yu, Maciej Wieczor-Retman,
	Peter Newman, James Morse, Babu Moger, Drew Fustini, Dave Martin,
	Chen Yu, linux-kernel, patches

Hi Tony,

On 5/7/26 4:45 PM, Luck, Tony wrote:
> On Thu, May 07, 2026 at 11:30:53AM -0700, Reinette Chatre wrote:
>> On 5/7/26 10:47 AM, Luck, Tony wrote:
>>> Reinette,
>>>
>>> This looks promising.
>>>
>>> I'll split out the missing call to mon_put_kn_priv(); into its own
>>> patch before the deadlock fix.
>>>
>>> Going to re-run the tests I did before forcing kernfs_get_tree()
>>> to fail early without setting new_sb_created, and also late.
>>
>> Thank you very much.
> 
> Tests pass.

Great. Thank you very much.

> 
>>>
>>>> +/*
>>>> + * Temporary forward declaration for testing only. Move functions instead.
>>>> + */
>>>> +static void resctrl_unmount(void);
>>>> +static void mon_put_kn_priv(void);
>>>
>>> Question: How much are forward declarations hated? And how to handle
>>> this?
>>
>> I am not aware of forward declarations being hated and I am not aware of
>> documented tip rules about this. Even so, I do find it cleaner if they
>> can be avoided. To handle this a prep patch that just moves the code
>> without any functional change should work?
>>
>>> Moving the functions around in the same patch really obscures the
>>> actual change. Is it OK to have a patch to make the functional
>>> change including the forward declarations. Then a separate commit
>>> that does the re-order (where it is obvious that functions are
>>> being picked up and moved without any code changes)?
>>
>> My preference would be a prep patch that does the move with new
>> capability built on top. It seems unnecessary to me to add a forward
>> declaration in one patch just to remove it in a following patch.
>> I agree that moving code as part of functional change should be avoided.
> 
> OK. I have a three patch series:
> 
> 1) Pure code move, no changes [while I picked up a block of functions
> and moved them earlier, "git show" presents the move in terms of a
> different set of functions moving]

If the patches seem awkward it may be worthwhile to try --patience or
--histogram to see if readability improves.

> 
> 2) Add missing mon_put_kn_priv()
> 
> 3) Fix the mount error deadlock
> 
> Internal AI scans have only two complaints. Both appear spurious;
> 
> First it says that:
> 
> 	if (!ctx->kfc.new_sb_created)
> 		resctrl_unmount();
> should be:
> 
> 	if (ret && !ctx->kfc.new_sb_created)
> 		resctrl_unmount();
> 
> It admits that current code can't return ret == 0 (success) while
> failing to create a super block. But argues in some twisted future
> that might be possible.

I think kernfs_get_tree() can indeed return 0 without creating a superblock
when this is a re-mount but since resctrl does not call kernfs_get_tree()
on a remount (instead returning -EBUSY) this should not happen. If indeed
something changes with kernfs API to cause this to happen then it would be
safer to return success with state than return success with all state removed.

While the change is not necessary in current flow it does improve robustness
while making the flow easier to understand so I am ok with it. I think a comment
above the check will be helpful though.

> 
> Second:
> 
> Code used to call rdt_last_cmd_clear() unconditionally on success or
> failure of the mount. Now the call for the end-of-function error path
> is gone.
> 
> This is true, but it doesn't matter. If the filesystem did not mount
> then there is no /sys/fs/resctrl/info/last_cmd_status file to leak an
> old error string.

ack. I see this similar to unmount where this state is not cleared
either.

> 
> I just need to write up the cover letter.

Thank you very much.

Reinette


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2026-05-08  0:23 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-04 22:01 [PATCH v2] fs/resctrl: Fix deadlock for errors during mount Tony Luck
2026-05-06 22:16 ` Reinette Chatre
2026-05-07 17:47   ` Luck, Tony
2026-05-07 18:30     ` Reinette Chatre
2026-05-07 23:45       ` Luck, Tony
2026-05-08  0:23         ` Reinette Chatre

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox