From: Nick Piggin <nickpiggin@yahoo.com.au>
To: Jan Kara <jack@suse.cz>
Cc: linux-kernel@vger.kernel.org, Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH RESEND] Minimal fix for private_list handling races
Date: Thu, 24 Jan 2008 02:05:16 +1100 [thread overview]
Message-ID: <200801240205.17005.nickpiggin@yahoo.com.au> (raw)
In-Reply-To: <20080123133045.GC10144@duck.suse.cz>
On Thursday 24 January 2008 00:30, Jan Kara wrote:
> On Wed 23-01-08 12:00:02, Nick Piggin wrote:
> > On Wednesday 23 January 2008 04:10, Jan Kara wrote:
> > > Hi,
> > >
> > > as I got no answer for a week, I'm resending this fix for races in
> > > private_list handling. Andrew, do you like them more than the previous
> > > version?
> >
> > FWIW, I reviewed this, and it looks OK although I think some comments
> > would be in order.
>
> Thanks.
>
> > What would be really nice is to avoid the use of b_assoc_buffers
> > completely in this function like I've attempted (untested). I don't
> > know if you'd actually call that an improvement...?
>
> I thought about this solution as well. But main issue I had with this
> solution is that currently, you nicely submit all the metadata buffers at
> once, so that block layer can sort them and write them in nice order. With
> the array you submit buffers by 16 (or any other fixed amount) and in
> mostly random order... So IMHO fsync would become measurably slower.
Oh, I don't know the filesystems very well... which ones would
attach a large number of metadata buffers to the inode?
> > Couple of things I noticed while looking at this code.
> >
> > - What is osync_buffers_list supposed to do? I couldn't actually
> > work it out. Why do we care about waiting for these buffers on
> > here that were added while waiting for writeout of other buffers
> > to finish? Can we just remove it completely? I must be missing
> > something.
>
> The problem here is that mark_buffer_dirty_inode() can move the buffer
> from 'tmp' list back to private_list while we are waiting for another
> buffer...
Hmm, no not while we're waiting for another buffer because b_assoc_buffers
will not be empty. However it is possible between removing from the inode
list and insertion onto the temp list I think, because
if (list_empty(&bh->b_assoc_buffers)) {
check in mark_buffer_dirty_inode is done without private_lock held. Nice.
With that in mind, doesn't your first patch suffer from a race due to
exactly this unlocked list_empty check when you are removing clean buffers
from the queue?
if (!buffer_dirty(bh) && !buffer_locked(bh))
mark_buffer_dirty()
if (list_empty(&bh->b_assoc_buffers))
/* add */
__remove_assoc_queue(bh);
Which results in the buffer being dirty but not on the ->private_list,
doesn't it?
But let's see... there must be a memory ordering problem here in existing
code anyway, because I don't see any barriers. Between b_assoc_buffers and
b_state (via buffer_dirty); fsync_buffers_list vs mark_buffer_dirty_inode,
right?
> > - What are the get_bh(bh) things supposed to do? Protect the lifetime
> > of a given bh while "lock" is dropped? That's nice, ignoring the
> > fact that we brelse(bh) *before* taking the lock again... but isn't
> > every single other buffer that we _have't_ elevated its reference
> > exposed to exactly the same lifetime problem? IOW, either it is not
> > required at all, or it is required for _all_ buffers? (my patch
> > should fix this).
>
> I think this get_bh() should stop try_to_free_buffers() from removing the
> buffer. brelse() before taking the private_lock is fine, because the loop
> actually checks for while (!list_empty(tmp)) so we really don't care what
> happens with the buffer after we are done with it. So I think that logic is
> actually fine.
Oh, of course. I overlooked the important fact that the tmp list is
also actually subject to modification via other threads exactly the
same as private_list...
next prev parent reply other threads:[~2008-01-23 15:05 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-01-22 17:10 [PATCH RESEND] Minimal fix for private_list handling races Jan Kara
2008-01-23 1:00 ` Nick Piggin
2008-01-23 13:30 ` Jan Kara
2008-01-23 15:05 ` Nick Piggin [this message]
2008-01-23 15:48 ` Jan Kara
2008-01-25 8:34 ` Nick Piggin
2008-01-25 18:16 ` Jan Kara
2008-01-28 22:16 ` Jan Kara
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=200801240205.17005.nickpiggin@yahoo.com.au \
--to=nickpiggin@yahoo.com.au \
--cc=akpm@linux-foundation.org \
--cc=jack@suse.cz \
--cc=linux-kernel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox