From: Jaegeuk Kim <jaegeuk@kernel.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Tim Murray <timmurray@google.com>,
Waiman Long <longman@redhat.com>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Linux F2FS Dev Mailing List
<linux-f2fs-devel@lists.sourceforge.net>
Subject: Re: [GIT PULL] f2fs for 5.18
Date: Wed, 23 Mar 2022 09:26:44 -0700
Message-ID: <YjtKRAgFmBfgU0al@google.com>
In-Reply-To: <CAHk-=whGKUyJpi0dTQJjyJxdmG+WCeKkJJyycpOaUW0De17h_Q@mail.gmail.com>
On 03/22, Linus Torvalds wrote:
> On Tue, Mar 22, 2022 at 5:34 PM Tim Murray <timmurray@google.com> wrote:
> >
> > AFAICT, what's happening is that rwsem_down_read_slowpath
> > modifies sem->count to indicate that there's a pending reader while
> > f2fs_ckpt holds the write lock, and when f2fs_ckpt releases the write
> > lock, it wakes pending readers and hands the lock over to readers.
> > This means that any subsequent attempt to grab the write lock from
> > f2fs_ckpt will stall until the newly-awakened reader releases the read
> > lock, which depends on the readers' arbitrarily long scheduling
> > delays.
>
> Ugh.
>
> So I'm looking at some of this, and you have things like this:
>
>     f2fs_down_read(&F2FS_I(inode)->i_sem);
>     cp_reason = need_do_checkpoint(inode);
>     f2fs_up_read(&F2FS_I(inode)->i_sem);
>
> which really doesn't seem to want a sleeping lock at all.
>
> In fact, it's not clear that it has any business serializing with IO
> at all. It seems to just check very basic inode state. Very strange.
> It's the kind of thing that the VFS layer tends to use the i_lock
> *spinlock* for.
Um.. let me check this i_sem, introduced by
d928bfbfe77a ("f2fs: introduce fi->i_sem to protect fi's info").
OTOH, I suspected the major contention would be on
    f2fs_lock_op -> f2fs_down_read(&sbi->cp_rwsem);
which is used for most filesystem operations.
And, when we need to do a checkpoint, we'd like to block internal operations by
    f2fs_lock_all -> f2fs_down_write(&sbi->cp_rwsem);
So, what I expected was that the checkpoint thread would get the highest
priority by grabbing down_write, which blocks all the other readers.
>
> And perhaps equally oddly, then when you do f2fs_issue_checkpoint(),
> _that_ code uses fancy lockless lists.
>
> I'm probably mis-reading it.
>
> Linus