From mboxrd@z Thu Jan 1 00:00:00 1970 From: Jan Kara Subject: Re: [PATCH 2/2] reiserfs: fix race with flush_used_journal_lists and flush_journal_list Date: Tue, 24 Sep 2013 11:11:43 +0200 Message-ID: <20130924091143.GC1821@quack.suse.cz> References: <5240A9A2.3040707@suse.com> Mime-Version: 1.0 Return-path: Content-Disposition: inline In-Reply-To: <5240A9A2.3040707@suse.com> Sender: reiserfs-devel-owner@vger.kernel.org List-ID: Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit To: Jeff Mahoney Cc: reiserfs-devel , Jan Kara On Mon 23-09-13 16:50:42, Jeff Mahoney wrote: > There are two locks involved in managing the journal lists. The general > reiserfs_write_lock and the journal->j_flush_mutex. > > While flush_journal_list is sleeping to acquire the j_flush_mutex or to > submit a block for write, it will drop the write lock. This allows > another thread to acquire the write lock and ultimately call > flush_used_journal_lists to traverse the list of journal lists and > select one for flushing. It can select the journal_list that has just > had flush_journal_list called on it in the original thread and call it > again with the same journal_list. > > The second thread then drops the write lock to acquire j_flush_mutex and > the first thread reacquires it and continues execution and eventually > clears and frees the journal list before dropping j_flush_mutex and > returning. > > The second thread acquires j_flush_mutex and ends up operating on a > journal_list that has already been released. If the memory hasn't > been reused, we'll soon after hit a BUG_ON because the transaction id > has already been cleared. If it's been reused, we'll crash in other > fun ways. > > Since flush_journal_list will synchronize on j_flush_mutex, we can fix > the race by taking a proper reference in flush_used_journal_lists > and checking to see if it's still valid after the mutex is taken. It's > safe to iterate the list of journal lists and pick a list with > just the write lock as long as a reference is taken on the journal list > before we drop the lock. We already have code to handle whether a > transaction has been flushed already so we can use that to handle the > race and get rid of the trans_id BUG_ON. Thanks. I've merged the patch into my tree. Honza > > Signed-off-by: Jeff Mahoney > --- > fs/reiserfs/journal.c | 13 +++++++++---- > 1 file changed, 9 insertions(+), 4 deletions(-) > > --- a/fs/reiserfs/journal.c > +++ b/fs/reiserfs/journal.c > @@ -1363,7 +1363,6 @@ static int flush_journal_list(struct sup > reiserfs_warning(s, "clm-2048", "called with wcount %d", > atomic_read(&journal->j_wcount)); > } > - BUG_ON(jl->j_trans_id == 0); > > /* if flushall == 0, the lock is already held */ > if (flushall) { > @@ -1818,6 +1817,8 @@ static int flush_used_journal_lists(stru > break; > tjl = JOURNAL_LIST_ENTRY(tjl->j_list.next); > } > + get_journal_list(jl); > + get_journal_list(flush_jl); > /* try to find a group of blocks we can flush across all the > ** transactions, but only bother if we've actually spanned > ** across multiple lists > @@ -1826,6 +1827,8 @@ static int flush_used_journal_lists(stru > ret = kupdate_transactions(s, jl, &tjl, &trans_id, len, i); > } > flush_journal_list(s, flush_jl, 1); > + put_journal_list(s, flush_jl); > + put_journal_list(s, jl); > return 0; > } > > > -- > Jeff Mahoney > SUSE Labs -- Jan Kara SUSE Labs, CR