From: Jens Axboe <axboe@kernel.dk>
To: Dave Chinner <david@fromorbit.com>, Andres Freund <andres@anarazel.de>
Cc: "Theodore Ts'o" <tytso@mit.edu>,
"Thorsten Leemhuis" <regressions@leemhuis.info>,
"Shreeya Patel" <shreeya.patel@collabora.com>,
linux-ext4@vger.kernel.org,
"Ricardo Cañuelo" <ricardo.canuelo@collabora.com>,
gustavo.padovan@collabora.com, zsm@google.com,
garrick@google.com,
"Linux regressions mailing list" <regressions@lists.linux.dev>,
io-uring@vger.kernel.org
Subject: Re: task hung in ext4_fallocate #2
Date: Tue, 24 Oct 2023 12:35:26 -0600
Message-ID: <ab4f311b-9700-4d3d-8f2e-09ccbcfb3df5@kernel.dk>
In-Reply-To: <74921cba-6237-4303-bb4c-baa22aaf497b@kernel.dk>
On 10/24/23 8:30 AM, Jens Axboe wrote:
> I don't think this is related to the io-wq workers doing non-blocking
> IO. The callback is eventually executed by the task that originally
> submitted the IO, which is the owner and not the async workers. But...
> If that original task is blocked in eg fallocate, then I can see how
> that would potentially be an issue.
>
> I'll take a closer look.

I think the best way to fix this is likely to have inode_dio_wait() be
interruptible, and return -ERESTARTSYS if it should be restarted. Now
the below is obviously not a full patch, but I suspect it'll make ext4
and xfs tick, because they should both be affected.

Andres, any chance you can throw this into the testing mix?

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 202c76996b62..0d946b6d36fe 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -4747,7 +4747,9 @@ long ext4_fallocate(struct file *file, int mode, loff_t offset, loff_t len)
 	}
 
 	/* Wait all existing dio workers, newcomers will block on i_rwsem */
-	inode_dio_wait(inode);
+	ret = inode_dio_wait(inode);
+	if (ret)
+		goto out;
 
 	ret = file_modified(file);
 	if (ret)
diff --git a/fs/inode.c b/fs/inode.c
index 84bc3c76e5cc..c4eca812b16b 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -2417,17 +2417,24 @@ EXPORT_SYMBOL(inode_owner_or_capable);
 /*
  * Direct i/o helper functions
  */
-static void __inode_dio_wait(struct inode *inode)
+static int __inode_dio_wait(struct inode *inode)
 {
 	wait_queue_head_t *wq = bit_waitqueue(&inode->i_state, __I_DIO_WAKEUP);
 	DEFINE_WAIT_BIT(q, &inode->i_state, __I_DIO_WAKEUP);
+	int ret = 0;
 
 	do {
-		prepare_to_wait(wq, &q.wq_entry, TASK_UNINTERRUPTIBLE);
-		if (atomic_read(&inode->i_dio_count))
-			schedule();
+		prepare_to_wait(wq, &q.wq_entry, TASK_INTERRUPTIBLE);
+		if (!atomic_read(&inode->i_dio_count))
+			break;
+		schedule();
+		if (signal_pending(current)) {
+			ret = -ERESTARTSYS;
+			break;
+		}
 	} while (atomic_read(&inode->i_dio_count));
 	finish_wait(wq, &q.wq_entry);
+	return ret;
 }
 
 /**
@@ -2440,10 +2447,11 @@ static void __inode_dio_wait(struct inode *inode)
  * Must be called under a lock that serializes taking new references
  * to i_dio_count, usually by inode->i_mutex.
  */
-void inode_dio_wait(struct inode *inode)
+int inode_dio_wait(struct inode *inode)
 {
 	if (atomic_read(&inode->i_dio_count))
-		__inode_dio_wait(inode);
+		return __inode_dio_wait(inode);
+	return 0;
 }
 EXPORT_SYMBOL(inode_dio_wait);
 
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 203700278ddb..8ea0c414b173 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -936,7 +936,9 @@ xfs_file_fallocate(
 	 * the on disk and in memory inode sizes, and the operations that follow
 	 * require the in-memory size to be fully up-to-date.
 	 */
-	inode_dio_wait(inode);
+	error = inode_dio_wait(inode);
+	if (error)
+		goto out_unlock;
 
 	/*
 	 * Now AIO and DIO has drained we flush and (if necessary) invalidate
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 4a40823c3c67..7dff3167cb0c 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2971,7 +2971,7 @@ static inline ssize_t blockdev_direct_IO(struct kiocb *iocb,
 }
 #endif
 
-void inode_dio_wait(struct inode *inode);
+int inode_dio_wait(struct inode *inode);
 
 /**
  * inode_dio_begin - signal start of a direct I/O requests
--
Jens Axboe