From mboxrd@z Thu Jan 1 00:00:00 1970
From: Chris Mason
Subject: Re: btrfs panic - BUG: soft lockup - CPU#0 stuck for 61s! [fs_mark:4573]
Date: Sun, 8 Jun 2008 22:37:12 -0400
Message-ID: <20080608223712.73692933@think.oraclecorp.com>
References: <4844336F.1010807@redhat.com> <20080605143428.GA16999@think.oraclecorp.com> <484825D4.2010402@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Cc: linux-btrfs@vger.kernel.org
To: rwheeler@redhat.com
Return-path:
In-Reply-To: <484825D4.2010402@redhat.com>
List-ID:

On Thu, 05 Jun 2008 13:43:48 -0400
Ric Wheeler wrote:

> Chris Mason wrote:
> > On Mon, Jun 02, 2008 at 01:52:47PM -0400, Ric Wheeler wrote:
> >
> >> I can reliably get btrfs to panic by running my fs_mark code on a
> >> newly created file system with lots of threads on an 8-way box. If
> >> this is too aggressive, let me know ;-)
> >>
> >> Here is a summary of the panic:
> >>
> >
> > BTW, exactly how are you running fs_mark? Mingming reminded me that
> > strictly speaking this patch shouldn't be required, so there might
> > be other related problems.
> >
> > -chris
> >
> >
> It still crashes, Mingming is clearly correct ;-)
>

Grin, I never should have doubted her. So, the actual fix should be
below. It looks like the problem is that I've got a race in setting the
pointer to a new transaction, which makes the data=ordered code take a
spin lock that hasn't yet been set up.

Before this patch my test box got into an infinite loop with fs_mark.
Now it seems to run to completion.

-chris

diff -r 0b4ab489ffe1 transaction.c
--- a/transaction.c	Tue May 27 10:55:43 2008 -0400
+++ b/transaction.c	Sun Jun 08 22:23:50 2008 -0400
@@ -56,7 +56,6 @@ static noinline int join_transaction(str
 		total_trans++;
 		BUG_ON(!cur_trans);
 		root->fs_info->generation++;
-		root->fs_info->running_transaction = cur_trans;
 		root->fs_info->last_alloc = 0;
 		root->fs_info->last_data_alloc = 0;
 		cur_trans->num_writers = 1;
@@ -74,6 +73,9 @@ static noinline int join_transaction(str
 		extent_io_tree_init(&cur_trans->dirty_pages,
 				     root->fs_info->btree_inode->i_mapping,
 				     GFP_NOFS);
+		spin_lock(&root->fs_info->new_trans_lock);
+		root->fs_info->running_transaction = cur_trans;
+		spin_unlock(&root->fs_info->new_trans_lock);
 	} else {
 		cur_trans->num_writers++;
 		cur_trans->num_joined++;
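
A note on the ordering this patch enforces: the new transaction is fully initialized first, and only then published through the shared pointer, with the store done under new_trans_lock. The sketch below is a userspace analogue using pthread mutexes, not the btrfs code. The struct layouts and the names start_transaction() and get_running() are hypothetical, and it assumes readers also take new_trans_lock before loading the pointer, which the patch itself does not show.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; these are NOT the btrfs structures. */
struct transaction {
	pthread_mutex_t lock;	/* must be initialized before anyone can see us */
	int num_writers;
};

struct fs_info {
	pthread_mutex_t new_trans_lock;	/* guards running_transaction */
	struct transaction *running_transaction;
};

/*
 * Initialize every field of the new transaction first, then publish the
 * pointer under new_trans_lock. Publishing earlier would let another
 * thread find the transaction and take t->lock before it is set up.
 */
static struct transaction *start_transaction(struct fs_info *info)
{
	struct transaction *t = malloc(sizeof(*t));

	assert(t != NULL);	/* stands in for BUG_ON(!cur_trans) */
	pthread_mutex_init(&t->lock, NULL);
	t->num_writers = 1;

	pthread_mutex_lock(&info->new_trans_lock);
	info->running_transaction = t;
	pthread_mutex_unlock(&info->new_trans_lock);
	return t;
}

/*
 * Assumed reader side: load the pointer under the same lock, so the
 * caller sees either NULL or a fully initialized transaction.
 */
static struct transaction *get_running(struct fs_info *info)
{
	struct transaction *t;

	pthread_mutex_lock(&info->new_trans_lock);
	t = info->running_transaction;
	pthread_mutex_unlock(&info->new_trans_lock);
	return t;
}
```

With this ordering, a thread that obtains the pointer from get_running() can lock the transaction's own lock safely, which is the property the data=ordered path was missing before the patch.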