Linux NILFS development
From: Ryusuke Konishi <konishi.ryusuke@gmail.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-nilfs@vger.kernel.org,
	syzbot <syzbot+e3973c409251e136fdd0@syzkaller.appspotmail.com>,
	syzkaller-bugs@googlegroups.com, linux-kernel@vger.kernel.org,
	sjb7183@psu.edu
Subject: [PATCH 2/3] nilfs2: fix unexpected freezing of nilfs_segctor_sync()
Date: Mon, 20 May 2024 22:26:20 +0900
Message-ID: <20240520132621.4054-3-konishi.ryusuke@gmail.com>
In-Reply-To: <20240520132621.4054-1-konishi.ryusuke@gmail.com>

A potential and reproducible race issue has been identified where
nilfs_segctor_sync() would block even after the log writer thread
writes a checkpoint, unless there is an interrupt or other trigger to
resume log writing.

This turned out to be because, depending on the execution timing
of the log writer thread running in parallel, the log writer thread
may skip responding to nilfs_segctor_sync(), which causes a call to
schedule() waiting for completion within nilfs_segctor_sync() to lose
the opportunity to wake up.

The reason why waking up the task waiting in nilfs_segctor_sync() may
be skipped is that two operations are not performed atomically:
updating the request generation, which is issued using a shared
sequence counter, and adding a wait queue entry to the log writer's
request wait queue.  Log writing and the request completion
notification by nilfs_segctor_wakeup() may occur between the two
operations.  In that case, the wait queue entry is not yet visible to
nilfs_segctor_wakeup(), and the wake-up of nilfs_segctor_sync() is
carried over until the next request occurs.

Fix this issue by performing these two operations together within the
sc_state_lock critical section.  Also, following the memory barrier
guidelines for event waiting loops, move the call to
set_current_state() at the same location into the event waiting loop
to ensure that a memory barrier is inserted just before the event
condition is checked.

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Fixes: 9ff05123e3bf ("nilfs2: segment constructor")
Tested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Cc: stable@vger.kernel.org
---
 fs/nilfs2/segment.c | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)
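
For readers who want to see the lost wake-up in isolation, the race
described in the changelog can be sketched as a small single-threaded
userspace model.  This is only an illustration under stated
assumptions, not the nilfs2 code: struct log_writer, struct wait_req,
bump_request(), enqueue(), writer_run() and simulate() are hypothetical
names, the wait queue is reduced to a single slot, and the interleaving
of the two threads is replayed by hand instead of being scheduled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; these are NOT the nilfs2 types. */
struct wait_req {
	unsigned int seq;	/* request generation this waiter needs */
	bool done;		/* set when the waiter has been notified */
};

struct log_writer {
	unsigned int seq_request;	/* models sc_seq_request */
	unsigned int seq_done;		/* last generation written out */
	struct wait_req *queue;		/* one-slot model of the wait queue */
};

/* Requester, step 1: take a new request generation. */
static void bump_request(struct log_writer *lw, struct wait_req *req)
{
	req->seq = ++lw->seq_request;
}

/* Requester, step 2: make the wait entry visible to the writer. */
static void enqueue(struct log_writer *lw, struct wait_req *req)
{
	lw->queue = req;
}

/*
 * Writer: if a request is pending, "write the log" and notify a queued
 * waiter whose generation is covered.  With no pending request it does
 * nothing, so a waiter queued too late is never notified.
 */
static void writer_run(struct log_writer *lw)
{
	if (lw->seq_done == lw->seq_request)
		return;
	lw->seq_done = lw->seq_request;
	if (lw->queue && lw->queue->seq <= lw->seq_done) {
		lw->queue->done = true;
		lw->queue = NULL;
	}
}

/* Returns true if the waiter received its completion notification. */
static bool simulate(bool atomic_enqueue)
{
	struct log_writer lw = { 0 };
	struct wait_req req = { 0 };

	bump_request(&lw, &req);
	if (atomic_enqueue) {
		enqueue(&lw, &req);	/* both steps in one critical section */
		writer_run(&lw);
	} else {
		writer_run(&lw);	/* writer slips in between the steps */
		enqueue(&lw, &req);
	}
	writer_run(&lw);		/* idle pass: no new request pending */
	assert(lw.seq_done == lw.seq_request);
	return req.done;
}
```

In this model, simulate(false) replays the pre-patch ordering and
leaves the waiter unnotified even though the log was written, while
simulate(true) replays the ordering this patch enforces and the waiter
is notified.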

diff --git a/fs/nilfs2/segment.c b/fs/nilfs2/segment.c
index 4e274bc8eb79..99c78a49e432 100644
--- a/fs/nilfs2/segment.c
+++ b/fs/nilfs2/segment.c
@@ -2168,19 +2168,28 @@ static int nilfs_segctor_sync(struct nilfs_sc_info *sci)
 	struct nilfs_segctor_wait_request wait_req;
 	int err = 0;
 
-	spin_lock(&sci->sc_state_lock);
 	init_wait(&wait_req.wq);
 	wait_req.err = 0;
 	atomic_set(&wait_req.done, 0);
+	init_waitqueue_entry(&wait_req.wq, current);
+
+	/*
+	 * To prevent a race issue where completion notifications from the
+	 * log writer thread are missed, increment the request sequence count
+	 * "sc_seq_request" and insert a wait queue entry using the current
+	 * sequence number into the "sc_wait_request" queue at the same time
+	 * within the lock section of "sc_state_lock".
+	 */
+	spin_lock(&sci->sc_state_lock);
 	wait_req.seq = ++sci->sc_seq_request;
+	add_wait_queue(&sci->sc_wait_request, &wait_req.wq);
 	spin_unlock(&sci->sc_state_lock);
 
-	init_waitqueue_entry(&wait_req.wq, current);
-	add_wait_queue(&sci->sc_wait_request, &wait_req.wq);
-	set_current_state(TASK_INTERRUPTIBLE);
 	wake_up(&sci->sc_wait_daemon);
 
 	for (;;) {
+		set_current_state(TASK_INTERRUPTIBLE);
+
 		if (atomic_read(&wait_req.done)) {
 			err = wait_req.err;
 			break;
-- 
2.34.1


Thread overview: 4+ messages
     [not found] <0000000000001a167a05ebc4f62b@google.com>
2024-05-20 13:26 ` [PATCH 0/3] nilfs2: fix log writer related issues Ryusuke Konishi
2024-05-20 13:26   ` [PATCH 1/3] nilfs2: fix use-after-free of timer for log writer thread Ryusuke Konishi
2024-05-20 13:26   ` Ryusuke Konishi [this message]
2024-05-20 13:26   ` [PATCH 3/3] nilfs2: fix potential hang in nilfs_detach_log_writer() Ryusuke Konishi
