From: Jan Kara <jack@suse.cz>
To: "Theodore Y. Ts'o" <tytso@mit.edu>
Cc: Jan Kara <jack@suse.cz>,
linux-ext4@vger.kernel.org, stable@vger.kernel.org
Subject: Re: [PATCH 03/22] ext4: Do not iput inode under running transaction in ext4_mkdir()
Date: Thu, 24 Oct 2019 15:37:01 +0200
Message-ID: <20191024133701.GP31271@quack2.suse.cz>
In-Reply-To: <20191024120958.GC1124@mit.edu>

On Thu 24-10-19 08:09:58, Theodore Y. Ts'o wrote:
> On Thu, Oct 24, 2019 at 12:19:06PM +0200, Jan Kara wrote:
> > Correct on both points. Thanks for spotting this! Now I still don't think
> > that calling iput() with running transaction is good. It complicates
> > matters with revoke record reservation but it is also fragile for other
> > reasons - e.g. flush worker could find the allocated inode just before we
> > will call iput() on it, try to write it out, block on starting transaction
> > and we get a deadlock with inode_wait_for_writeback() inside evict(). Now
> > inode *probably* won't be dirty yet by the time we get to ext4_add_nondir()
> > or similar, that's why I say above it's just fragile, not an outright bug.
>
> But we don't ever write the inode itself via
> inode_wait_for_writeback(), because of how ext4 journalling works. (See
> the comments before ext4_mark_inode_dirty()). And for the special
> inodes (directories, device nodes, etc.) there's no data dirtiness to
> worry about. For regular files, we hit this code path when we have just
> created the inode, but were not able to add a link to the parent
> directory; the fd hasn't been released to userspace yet, so it can't
> be data dirty either.
>
> So unless I'm missing something, I don't think the deadlock described
> above is possible?

Actually, now that I look at it, large symlinks may be prone to this
deadlock. There we create an unlinked inode, add it to the orphan list, stop
the transaction, call __page_symlink(), which dirties the inode through
mark_inode_dirty(), then start a new transaction and call ext4_add_nondir(),
which may call iput() while that transaction is still running.
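
To make the cycle concrete, here is a toy userspace model of the wait-for
chain. It is only a sketch, not kernel code: the actors, the wait_graph
structure and all function names are made up for illustration, and the
"commit waits for open handles" edge is my assumption about why the flush
worker would block on starting a transaction.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical actors in the toy model; none of these are kernel symbols. */
enum actor {
	CREATE_TASK,	/* failed-create path: holds a handle, calls iput() */
	FLUSH_WORKER,	/* writing the inode out, wants to start a handle */
	COMMIT,		/* journal commit, waits for open handles to stop */
	NR_ACTORS
};

/* waits_for[a] is the actor that 'a' is blocked on, or -1 if runnable. */
struct wait_graph {
	int waits_for[NR_ACTORS];
};

static void graph_init(struct wait_graph *g)
{
	for (size_t i = 0; i < NR_ACTORS; i++)
		g->waits_for[i] = -1;
}

static void graph_block(struct wait_graph *g, enum actor waiter,
			enum actor holder)
{
	g->waits_for[waiter] = holder;
}

/* Follow the wait chain from 'start'; true if it leads back to 'start'. */
static bool graph_deadlocked(const struct wait_graph *g, enum actor start)
{
	int cur = g->waits_for[start];

	for (int steps = 0; cur != -1 && steps < NR_ACTORS; steps++) {
		if (cur == (int)start)
			return true;
		cur = g->waits_for[cur];
	}
	return false;
}

/* iput() called while the handle is still running. */
bool iput_under_transaction_deadlocks(void)
{
	struct wait_graph g;

	graph_init(&g);
	/* Flush worker cannot start a handle until the commit makes room. */
	graph_block(&g, FLUSH_WORKER, COMMIT);
	/* Commit waits for the create task to stop its handle (assumed). */
	graph_block(&g, COMMIT, CREATE_TASK);
	/* evict() waits for the flush worker to finish writeback. */
	graph_block(&g, CREATE_TASK, FLUSH_WORKER);
	return graph_deadlocked(&g, CREATE_TASK);
}

/* Handle stopped first, iput() called afterwards. */
bool iput_after_stop_deadlocks(void)
{
	struct wait_graph g;

	graph_init(&g);
	graph_block(&g, FLUSH_WORKER, COMMIT);
	/* No handle is held, so the commit is not waiting on the create task. */
	graph_block(&g, CREATE_TASK, FLUSH_WORKER);
	return graph_deadlocked(&g, CREATE_TASK);
}
```

In this model the first scenario closes the cycle and the second does not,
which is all that "do not iput under a running transaction" amounts to here.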

Granted, we could fix just ext4_symlink(), but it kind of demonstrates my
point that calling iput() under a transaction is fragile: some of the stuff
done on the last iput generally ranks above transaction start. It is just
that in the cases where we clean up a failed create, none of it happens to
block currently (except for the symlink case mentioned above). Also, lockdep
does not track dependencies like inode_wait_for_writeback(); otherwise it
would complain as well.

> We can certainly add it to the orphan list if it's necessary, but it's
> extra overhead and adds a global contention point. So if it's not
> necessary, I'd rather avoid it if possible, and I think it's safe to
> do so in this case.

As this is an error cleanup path (only EIO and ENOSPC are realistic failure
cases AFAICT), I don't think performance really matters here. I certainly
don't want to add the inode to the orphan list in the fast path; I agree that
would be a non-starter. I'll try to write a patch and we'll see how bad it
will be. If you still hate it, I can have a look into how bad it would be to
fix ext4_symlink() and somehow deal with the revoke reservation issues.

								Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR