From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from cn.fujitsu.com ([222.73.24.84]:36567 "EHLO song.cn.fujitsu.com" rhost-flags-OK-FAIL-OK-OK) by vger.kernel.org with ESMTP id S1753857Ab3JYCNw (ORCPT ); Thu, 24 Oct 2013 22:13:52 -0400
Message-ID: <5269D41A.6040802@cn.fujitsu.com>
Date: Fri, 25 Oct 2013 10:14:50 +0800
From: Miao Xie
Reply-To: miaox@cn.fujitsu.com
MIME-Version: 1.0
To: Wang Shilong, Chris Mason
CC: Stefan Behrens, Bob Marley, Wang Shilong, linux-btrfs@vger.kernel.org
Subject: Re: [PATCH] Btrfs: fix race condition between writting and scrubing supers
References: <1382156250-2336-1-git-send-email-wangshilong1991@gmail.com> <526247E3.9000804@giantdisaster.de> <5262914D.7030306@giantdisaster.de> <52663960.4060905@giantdisaster.de> <5266AE1F.6030304@shiftmail.org> <5268059E.707@giantdisaster.de> <20131024100842.14051.45479@localhost.localdomain> <5269053F.3050906@cn.fujitsu.com>
In-Reply-To: <5269053F.3050906@cn.fujitsu.com>
Content-Type: text/plain; charset=UTF-8
Sender: linux-btrfs-owner@vger.kernel.org
List-ID:

On Thu, 24 Oct 2013 19:32:15 +0800, Wang Shilong wrote:
> On 10/24/2013 06:08 PM, Chris Mason wrote:
>> Quoting Stefan Behrens (2013-10-23 13:21:34)
>>> On Tue, 22 Oct 2013 18:55:59 +0200, Bob Marley wrote:
>>>> On 22/10/2013 10:37, Stefan Behrens wrote:
>>>>> I don't believe that this issue can ever happen. I don't believe that
>>>>> somewhere on the path to the flash memory, to the magnetic disc or to
>>>>> the drive's cache memory, someone interrupts a 4KB write in the middle
>>>>> of operation to read from this 4KB area. This is not an issue IMHO.
>>>> I think I have read that unfortunately it can happen.
>>>> SAS and SATA specs for disks do not mandate that if a write is in-flight
>>>> but still not completed, reads from the same sector should return the
>>>> value it is being written; they can return the old value.
>>>> I also think that Linux does not check either.
>>> If the _old_ 4KB block is returned, that's fine and won't cause a
>>> checksum error.
>>>
>>> The patch in question addresses the case that Btrfs submits a write
>>> request for a 4KB block, and a concurrent read request for that 4KB
>>> block reads partially the old block and partially the new block,
>>> resulting in a checksum error reported in the scrub statistic counters.
>> Concurrent reads and writes to the device are completely undefined, and
>> any combination of old, new, random memory corruption wouldn't
>> surprise me...I'd rather avoid them ;)
>>
>> Doing the transaction join during the super read is probably the least
>> complex choice.
> Yeah, by joining the transaction we can solve this problem, but it is a
> little confusing, because scrubbing supers doesn't involve writing.
>
> And the only race condition happens while committing the transaction. Miao
> also pointed out that maybe the best way is to move btrfs_scrub_continue()
> after write_ctree_super().

Sorry, my mistake. btrfs_scrub_continue() has always been called after
write_ctree_super(), so the above problem doesn't exist.

Thanks
Miao

>
> Thanks,
> Wang
>> -chris
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>
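
The torn-read case discussed in the thread above (a read that returns part of the old 4KB block and part of the new one) can be illustrated with a small user-space sketch. This is not Btrfs code. The block layout (a 4-byte checksum followed by the payload), the helper names make_block() and checksum_ok(), and the use of zlib.crc32 are all assumptions made for illustration; the real superblock uses a different layout and checksum.

```python
import zlib

BLOCK_SIZE = 4096  # hypothetical 4KB block, as in the discussion above


def make_block(fill: int) -> bytes:
    """Build a block: 4-byte little-endian CRC32 followed by the payload.

    The layout is hypothetical and only serves the illustration.
    """
    payload = bytes([fill]) * (BLOCK_SIZE - 4)
    return zlib.crc32(payload).to_bytes(4, "little") + payload


def checksum_ok(block: bytes) -> bool:
    """Return True if the stored checksum matches the payload."""
    stored = int.from_bytes(block[:4], "little")
    return stored == zlib.crc32(block[4:])


old = make_block(0xAA)  # block content before the write
new = make_block(0xBB)  # block content after the write

# A torn read: the first half comes from the new block,
# the second half still holds the old data.
torn = new[: BLOCK_SIZE // 2] + old[BLOCK_SIZE // 2 :]

print("old block verifies: ", checksum_ok(old))
print("new block verifies: ", checksum_ok(new))
print("torn block verifies:", checksum_ok(torn))
```

Either intact block verifies, which matches the point that returning the old block is harmless. Only the mixed block fails verification, and that is the kind of result a scrub would count as a checksum error.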