From mboxrd@z Thu Jan 1 00:00:00 1970
From: Li Zefan
Subject: Re: bug caused by removal of trans_mutex? (Was: Re: kernel BUG at fs/btrfs/extent-tree.c:6164!)
Date: Tue, 14 Jun 2011 13:44:45 +0800
Message-ID: <4DF6F54D.3050603@cn.fujitsu.com>
References: <4DEC90CB.4050609@jp.fujitsu.com> <4DEDB7AF.2060308@cn.fujitsu.com>
 <4DEDBE4B.2020403@jp.fujitsu.com> <4DEDC293.30105@jp.fujitsu.com>
 <4DEDE03B.9050907@jp.fujitsu.com> <4DEDE328.5060405@cn.fujitsu.com>
 <1307461229-sup-9822@shiny> <4DEF04CF.8010502@jp.fujitsu.com>
 <4DF5B889.5080202@cn.fujitsu.com> <1307994828-sup-6461@shiny>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Cc: Chris Mason, Tsutomu Itoh, liubo, Linux Btrfs, Josef Bacik
To: "Yan, Zheng"
Return-path:
In-Reply-To:
List-ID:

Yan, Zheng wrote:
> On Tue, Jun 14, 2011 at 3:55 AM, Chris Mason wrote:
>> Excerpts from Yan, Zheng's message of 2011-06-13 10:58:35 -0400:
>>> The usage of trans_mutex in relocation code is subtle. It controls
>>> interaction of relocation with transaction start, transaction commit
>>> and snapshot creation. Simple replacing trans_mutex with trans_lock
>>> is wrong.
>>
>> So, I've got a mutex around the reloc_root here and that was almost but
>> not quite enough. It looks like the biggest problem is that we need to
>> wait in btrfs_record_root_in_trans for anyone inside merge_reloc_roots.
>>
>> I'm surviving much longer with a patch in place that synchronizes
>> btrfs_record_root_in_trans better.
>>
>> Zheng if you have other comments on the locking please let me know.
>>
>
> following untested patch may help.

I've tested this patch, and the bug was triggered in minutes as usual.

Also I've tested a patch in an offline email from Chris, which survived
the test.