All of lore.kernel.org
 help / color / mirror / Atom feed
From: Philippe Gerum <rpm@xenomai.org>
To: Dave Rolenc <Dave.Rolenc@kratosdefense.com>
Cc: "xenomai@lists.linux.dev" <xenomai@lists.linux.dev>,
	Russell Johnson <russell.johnson@kratosdefense.com>
Subject: Re: [PATCH 0/1] rwsem_down_write_slowpath check if oob() before skipping schedule()
Date: Wed, 31 May 2023 08:22:34 +0200	[thread overview]
Message-ID: <87jzwpxa8l.fsf@xenomai.org> (raw)
In-Reply-To: <e5624d226deb4a49bbe28867e15eadf9@PH1P110MB1666.NAMP110.PROD.OUTLOOK.COM>


Dave Rolenc <Dave.Rolenc@kratosdefense.com> writes:

> This fixes a timing-dependent bug that would cause a tight loop without
> schedule() when running in-band. The rwsem would get released enough to
> allow this tight loop without schedule() to execute when the handoff
> bit got set, but it would never allow the previous on-CPU owner to
> finish releasing the rwsem. Checking to see if we are running oob before
> skipping the schedule() seems to fix the problem. This problem was nasty
> when the global kernfs_rwsem was the target semaphore since many tasks
> ended up in D state waiting on the kernfs_rwsem. 
>
> We would typically see this problem when repeatedly starting/stopping an
> EVL-enabled application, which would create and tear-down a lot of EVL
> elements. Eventually, the timing worked out to where we'd get a stuck
> CPU message and a bunch of tasks in D-state waiting on the kernfs_rwsem.
> The cuplrit task was stuck in a forever loop in
> rwsem_down_write_slowpath() with the handoff bit set and skipping the
> schedule() call. 
>
> This patch is based on 5.15.98.
>
> This is also my first patch submission to this group, so if you need me
> to follow some specific procedure or checklist for patch submission,
> please let me know and I'll comply.
>
> Dave Rolenc (1):
>   rwsem_down_write_slowpath check if oob() before skipping schedule()
>
>  kernel/locking/rwsem.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Mm, nasty. Thanks for digging into this. What's even worse and the root
cause of that issue is that no task on the oob stage should ever run
rwsem_down_write_slowpath() in the first place. Generally speaking, oob
execution is an NMI-like context, therefore taking regular locks and/or
(re)scheduling is not allowed, which means rwsem_down_write_slowpath()
should not be entered by a task running oob. In other words, Houston
we've had a problem here.

In order to figure out what happened, could you please check whether the
following patch causes a kernel splat when the bug triggers?  We are
trying to find out who calls rwsem_down_write_slowpath() from the wrong
context. The second hunk in this patch is making sure nothing weird
happens due to transitioning from inband to oob context - when Dovetail
is enabled, the tail code of the logic performing such transition lives
in the regular/inband schedule() code, but that should never leak into
the common in-band code.

diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
index f0287a16b4ec8..12f4956842fdb 100644
--- a/kernel/locking/rwsem.c
+++ b/kernel/locking/rwsem.c
@@ -1070,6 +1070,8 @@ rwsem_down_write_slowpath(struct rw_semaphore *sem, int state)
 	struct rwsem_waiter waiter;
 	DEFINE_WAKE_Q(wake_q);
 
+	WARN_ON_ONCE(running_oob());
+
 	/* do optimistic spinning and steal lock if possible */
 	if (rwsem_can_spin_on_owner(sem) && rwsem_optimistic_spin(sem)) {
 		/* rwsem_optimistic_spin() implies ACQUIRE on success */
@@ -1155,6 +1157,7 @@ rwsem_down_write_slowpath(struct rw_semaphore *sem, int state)
 		}
 
 		schedule();
+		WARN_ON_ONCE(running_oob());
 		lockevent_inc(rwsem_sleep_writer);
 		set_current_state(state);
 trylock_again:

-- 
Philippe.

  reply	other threads:[~2023-05-31  6:53 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-05-30 19:04 [PATCH 0/1] rwsem_down_write_slowpath check if oob() before skipping schedule() Dave Rolenc
2023-05-31  6:22 ` Philippe Gerum [this message]
2023-05-31 15:27   ` Dave Rolenc
2023-05-31 16:55     ` Philippe Gerum
2023-05-31 21:48       ` [External] - " Dave Rolenc

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87jzwpxa8l.fsf@xenomai.org \
    --to=rpm@xenomai.org \
    --cc=Dave.Rolenc@kratosdefense.com \
    --cc=russell.johnson@kratosdefense.com \
    --cc=xenomai@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.