From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from cn.fujitsu.com ([222.73.24.84]:57161 "EHLO song.cn.fujitsu.com"
	rhost-flags-OK-FAIL-OK-OK) by vger.kernel.org with ESMTP
	id S1753844Ab3JXLfa (ORCPT ); Thu, 24 Oct 2013 07:35:30 -0400
Message-ID: <5269053F.3050906@cn.fujitsu.com>
Date: Thu, 24 Oct 2013 19:32:15 +0800
From: Wang Shilong
MIME-Version: 1.0
To: Chris Mason
CC: Stefan Behrens, Bob Marley, Wang Shilong, linux-btrfs@vger.kernel.org
Subject: Re: [PATCH] Btrfs: fix race condition between writting and scrubing supers
References: <1382156250-2336-1-git-send-email-wangshilong1991@gmail.com> <526247E3.9000804@giantdisaster.de> <5262914D.7030306@giantdisaster.de> <52663960.4060905@giantdisaster.de> <5266AE1F.6030304@shiftmail.org> <5268059E.707@giantdisaster.de> <20131024100842.14051.45479@localhost.localdomain>
In-Reply-To: <20131024100842.14051.45479@localhost.localdomain>
Content-Type: text/plain; charset=UTF-8; format=flowed
Sender: linux-btrfs-owner@vger.kernel.org
List-ID:

On 10/24/2013 06:08 PM, Chris Mason wrote:
> Quoting Stefan Behrens (2013-10-23 13:21:34)
>> On Tue, 22 Oct 2013 18:55:59 +0200, Bob Marley wrote:
>>> On 22/10/2013 10:37, Stefan Behrens wrote:
>>>> I don't believe that this issue can ever happen. I don't believe that
>>>> somewhere on the path to the flash memory, to the magnetic disc or to
>>>> the drive's cache memory, someone interrupts a 4KB write in the middle
>>>> of operation to read from this 4KB area. This is not an issue IMHO.
>>> I think I have read that unfortunately it can happen.
>>> SAS and SATA specs for disks do not mandate that if a write is in-flight
>>> but still not completed, reads from the same sector should return the
>>> value it is being written; they can return the old value.
>>> I also think that Linux does not check either.
>> If the _old_ 4KB block is returned, that's fine and won't cause a
>> checksum error.
>>
>> The patch in question addresses the case that Btrfs submits a write
>> request for a 4KB block, and a concurrent read request for that 4KB
>> block reads partially the old block and partially the new block,
>> resulting in a checksum error reported in the scrub statistic counters.
> Concurrent reads and writes to the device are completely undefined, and
> any combination of old, new, random memory corruption wouldn't
> surprise me...I'd rather avoid them ;)
>
> Doing the transaction join during the super read is probably the least
> complex choice.

Yeah, joining the transaction would solve this problem, but it is a
little confusing, because scrubbing the supers does not involve any
writing. The only race happens while committing a transaction. Miao
also pointed out that the best way may be to move
btrfs_scrub_continue() after write_ctree_super().

Thanks,
Wang

> -chris
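
For readers following the thread, here is a toy sketch of the ordering
being discussed. It is Python, not kernel code, and everything in it
(ToyDevice, make_super, csum_ok, commit, the CRC32-in-front layout, the
two-step write) is a made-up illustration, not the btrfs on-disk format
or API. It only assumes what the thread describes: a 4KB super write
that is not atomic with respect to a concurrent scrub read, and a
scrubber that is resumed either before or after the super write.

```python
import zlib

BLOCK_SIZE = 4096
CSUM_SIZE = 4  # toy layout: CRC32 in the first 4 bytes (not the btrfs format)


def make_super(generation):
    """Build a toy 4KB super block: checksum followed by the payload."""
    payload = bytes([generation % 256]) * (BLOCK_SIZE - CSUM_SIZE)
    return zlib.crc32(payload).to_bytes(CSUM_SIZE, "little") + payload


def csum_ok(block):
    """What the toy scrubber checks: stored CRC32 matches the payload."""
    stored = int.from_bytes(block[:CSUM_SIZE], "little")
    return stored == zlib.crc32(block[CSUM_SIZE:])


class ToyDevice:
    """Hypothetical device on which a 4KB write lands in two halves."""

    def __init__(self):
        self.data = bytearray(make_super(1))

    def write_in_two_steps(self, block):
        half = BLOCK_SIZE // 2
        self.data[:half] = block[:half]
        yield  # a concurrent read could be served at this point
        self.data[half:] = block[half:]


def commit(dev, generation, scrub_resumed_before_write):
    """Write a new super and return the checksum results the scrubber saw.

    scrub_resumed_before_write=True models the scrubber running while
    the super is still being written; False models resuming it only
    after the write has finished (the analogue of calling
    btrfs_scrub_continue after write_ctree_super()).
    """
    seen = []
    writer = dev.write_in_two_steps(make_super(generation))
    next(writer)  # first half of the new super is on the device
    if scrub_resumed_before_write:
        seen.append(csum_ok(bytes(dev.data)))  # half new, half old
    for _ in writer:  # finish the second half of the write
        pass
    if not scrub_resumed_before_write:
        seen.append(csum_ok(bytes(dev.data)))  # fully written block
    return seen


if __name__ == "__main__":
    print(commit(ToyDevice(), 2, scrub_resumed_before_write=True))
    print(commit(ToyDevice(), 2, scrub_resumed_before_write=False))
```

In this model the first call reports a checksum failure although
nothing on the device is actually corrupt, which is the spurious scrub
counter increment described above; the second call, with the scrubber
resumed only after the write, reports a clean block.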