From: Ingo Molnar <mingo@elte.hu>
To: Frederic Weisbecker <fweisbec@gmail.com>
Cc: LKML <linux-kernel@vger.kernel.org>,
Jeff Mahoney <jeffm@suse.com>,
ReiserFS Development List <reiserfs-devel@vger.kernel.org>,
Chris Mason <chris.mason@oracle.com>,
Alexander Beregalov <a.beregalov@gmail.com>,
Alessio Igor Bogani <abogani@texware.it>,
Jonathan Corbet <corbet@lwn.net>,
Alexander Viro <viro@zeniv.linux.org.uk>
Subject: Re: [PATCH 6/6] kill-the-BKL/reiserfs: release the write lock on flush_commit_list()
Date: Fri, 1 May 2009 15:23:22 +0200 [thread overview]
Message-ID: <20090501132322.GA30055@elte.hu> (raw)
In-Reply-To: <20090501131340.GB6011@nowhere>
* Frederic Weisbecker <fweisbec@gmail.com> wrote:
> On Fri, May 01, 2009 at 07:42:47AM +0200, Ingo Molnar wrote:
> >
> > * Frederic Weisbecker <fweisbec@gmail.com> wrote:
> >
> > > flush_commit_list() uses ll_rw_block() to commit the pending log blocks.
> > > ll_rw_block() might sleep, and the bkl was released at this point. Then
> > > we can also relax the write lock at this point.
> > >
> > > [ Impact: release the reiserfs write lock when it is not needed ]
> > >
> > > Cc: Jeff Mahoney <jeffm@suse.com>
> > > Cc: Chris Mason <chris.mason@oracle.com>
> > > Cc: Alexander Beregalov <a.beregalov@gmail.com>
> > > Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
> > > ---
> > > fs/reiserfs/journal.c | 7 +++++--
> > > 1 files changed, 5 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/fs/reiserfs/journal.c b/fs/reiserfs/journal.c
> > > index 373d080..b1ebd5a 100644
> > > --- a/fs/reiserfs/journal.c
> > > +++ b/fs/reiserfs/journal.c
> > > @@ -1120,8 +1120,11 @@ static int flush_commit_list(struct super_block *s,
> > > SB_ONDISK_JOURNAL_SIZE(s);
> > > tbh = journal_find_get_block(s, bn);
> > > if (tbh) {
> > > - if (buffer_dirty(tbh))
> > > - ll_rw_block(WRITE, 1, &tbh) ;
> > > + if (buffer_dirty(tbh)) {
> > > + reiserfs_write_unlock(s);
> > > + ll_rw_block(WRITE, 1, &tbh);
> > > + reiserfs_write_lock(s);
> > > + }
> > > put_bh(tbh) ;
> > > }
> > > }
> >
> > there's 7 other instances of ll_rw_block():
> >
> > fs/reiserfs/journal.c- spin_unlock(lock);
> > fs/reiserfs/journal.c: ll_rw_block(WRITE, 1, &bh);
> > fs/reiserfs/journal.c- spin_lock(lock);
> > --
> > fs/reiserfs/journal.c- reiserfs_write_unlock(s);
> > fs/reiserfs/journal.c: ll_rw_block(WRITE, 1, &tbh);
> > fs/reiserfs/journal.c- reiserfs_write_lock(s);
> > --
> > fs/reiserfs/journal.c- /* read in the log blocks, memcpy to the corresponding real block */
> > fs/reiserfs/journal.c: ll_rw_block(READ, get_desc_trans_len(desc), log_blocks);
> > fs/reiserfs/journal.c- for (i = 0; i < get_desc_trans_len(desc); i++) {
> > --
> > fs/reiserfs/journal.c- set_buffer_dirty(real_blocks[i]);
> > fs/reiserfs/journal.c: ll_rw_block(SWRITE, 1, real_blocks + i);
> > fs/reiserfs/journal.c- }
> > --
> > fs/reiserfs/journal.c- }
> > fs/reiserfs/journal.c: ll_rw_block(READ, j, bhlist);
> > fs/reiserfs/journal.c- for (i = 1; i < j; i++)
> > --
> > fs/reiserfs/stree.c- if (!buffer_uptodate(bh[j]))
> > fs/reiserfs/stree.c: ll_rw_block(READA, 1, bh + j);
> > fs/reiserfs/stree.c- brelse(bh[j]);
> > --
> > fs/reiserfs/stree.c- reada_blocks, reada_count);
> > fs/reiserfs/stree.c: ll_rw_block(READ, 1, &bh);
> > fs/reiserfs/stree.c- reiserfs_write_unlock(sb);
> > --
> > fs/reiserfs/super.c-{
> > fs/reiserfs/super.c: ll_rw_block(READ, 1, &(SB_BUFFER_WITH_SB(s)));
> > fs/reiserfs/super.c- reiserfs_write_unlock(s);
> >
> > in particular the second stree.c one and the super.c has a
> > write-unlock straight before the lock-drop.
> >
> > I think the stree.c unlock could be moved to before the
> > ll_rw_block() call straight away.
>
>
> Indeed.
>
>
> >
> > The super.c one needs more care: first put &(SB_BUFFER_WITH_SB(s))
> > into a local variable, then unlock the wite-lock, then call
> > ll_rw_block(). (This is important because &(SB_BUFFER_WITH_SB(s)) is
> > global filesystem state that has to be read with the lock held.)
>
>
> Indeed &(SB_BUFFER_WITH_SB(s)) is a pointer to blocks that
> reflect the state of the filesystem but it was already not
> safe on the old code.
>
> ll_rw_block() may sleep, and wait_on_buffer() too. And this
> pointer could have changed already during these sleeps.
No, it was safe prior. This was the prior code:
fs/reiserfs/super.c: ll_rw_block(READ, 1, &(SB_BUFFER_WITH_SB(s)));
The argument is passed to ll_rw_block _before_ the scheduling. So
the dereference always happens with the BKL held, and is safe.
It is true that ll_rw_block() can sleep (and will most likely sleep
for a READ command), but that's not the issue: the issue is the
former dereference which can now get out from under the lock. (it
might still be safe in special circumstances - but has to be
reviewed carefully.)
Ingo
next prev parent reply other threads:[~2009-05-01 13:23 UTC|newest]
Thread overview: 39+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-05-01 2:44 [PATCH 0/6] kill-the-BKL/reiserfs3: performance improvements, faster than Bkl based scheme Frederic Weisbecker
2009-05-01 2:44 ` [PATCH 1/6] kill-the-BKL/reiserfs: release write lock on fs_changed() Frederic Weisbecker
2009-05-01 6:31 ` Andi Kleen
2009-05-01 13:28 ` Frederic Weisbecker
2009-05-01 13:44 ` Chris Mason
2009-05-01 14:01 ` Frederic Weisbecker
2009-05-01 14:14 ` Chris Mason
2009-05-02 1:19 ` Frederic Weisbecker
2009-05-01 2:44 ` [PATCH 2/6] kill-the-BKL/reiserfs: release the write lock before rescheduling on do_journal_end() Frederic Weisbecker
2009-05-01 7:09 ` Ingo Molnar
2009-05-01 13:31 ` Frederic Weisbecker
2009-05-01 22:20 ` Frederic Weisbecker
2009-05-01 2:44 ` [PATCH 3/6] kill-the-BKL/reiserfs: release write lock while rescheduling on prepare_for_delete_or_cut() Frederic Weisbecker
2009-05-01 2:44 ` [PATCH 4/6] kill-the-BKL/reiserfs: release the write lock inside get_neighbors() Frederic Weisbecker
2009-05-01 5:51 ` Ingo Molnar
2009-05-01 13:25 ` Frederic Weisbecker
2009-05-01 13:29 ` Chris Mason
2009-05-01 13:31 ` Ingo Molnar
2009-05-01 2:44 ` [PATCH 5/6] kill-the-BKL/reiserfs: release the write lock inside reiserfs_read_bitmap_block() Frederic Weisbecker
2009-05-01 5:47 ` Ingo Molnar
2009-05-01 13:19 ` Frederic Weisbecker
2009-05-01 13:30 ` Chris Mason
2009-05-01 13:51 ` Frederic Weisbecker
2009-05-01 2:44 ` [PATCH 6/6] kill-the-BKL/reiserfs: release the write lock on flush_commit_list() Frederic Weisbecker
2009-05-01 5:42 ` Ingo Molnar
2009-05-01 13:13 ` Frederic Weisbecker
2009-05-01 13:23 ` Ingo Molnar [this message]
2009-05-01 13:26 ` Chris Mason
2009-05-01 13:29 ` Ingo Molnar
2009-05-01 13:54 ` Chris Mason
2009-05-01 5:35 ` [PATCH 0/6] kill-the-BKL/reiserfs3: performance improvements, faster than Bkl based scheme Ingo Molnar
2009-05-01 12:18 ` Thomas Meyer
2009-05-01 14:12 ` Frederic Weisbecker
2009-05-01 19:59 ` Linus Torvalds
2009-05-01 20:33 ` Ingo Molnar
2009-05-01 20:36 ` Ingo Molnar
2009-05-01 21:11 ` Linus Torvalds
2009-05-01 21:32 ` Ingo Molnar
2009-05-02 1:39 ` Frederic Weisbecker
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20090501132322.GA30055@elte.hu \
--to=mingo@elte.hu \
--cc=a.beregalov@gmail.com \
--cc=abogani@texware.it \
--cc=chris.mason@oracle.com \
--cc=corbet@lwn.net \
--cc=fweisbec@gmail.com \
--cc=jeffm@suse.com \
--cc=linux-kernel@vger.kernel.org \
--cc=reiserfs-devel@vger.kernel.org \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.