From: Miao Xie <miaox@cn.fujitsu.com>
To: Chris Mason <chris.mason@fusionio.com>,
Stefan Behrens <sbehrens@giantdisaster.de>,
Bob Marley <bobmarley@shiftmail.org>
Cc: Wang Shilong <wangshilong1991@gmail.com>, linux-btrfs@vger.kernel.org
Subject: Re: [PATCH] Btrfs: fix race condition between writting and scrubing supers
Date: Thu, 24 Oct 2013 18:42:34 +0800
Message-ID: <5268F99A.8010907@cn.fujitsu.com>
In-Reply-To: <20131024100842.14051.45479@localhost.localdomain>
On Thu, 24 Oct 2013 06:08:42 -0400, Chris Mason wrote:
> Quoting Stefan Behrens (2013-10-23 13:21:34)
>> On Tue, 22 Oct 2013 18:55:59 +0200, Bob Marley wrote:
>>> On 22/10/2013 10:37, Stefan Behrens wrote:
>>>> I don't believe that this issue can ever happen. I don't believe that
>>>> somewhere on the path to the flash memory, to the magnetic disc or to
>>>> the drive's cache memory, someone interrupts a 4KB write in the middle
>>>> of operation to read from this 4KB area. This is not an issue IMHO.
>>>
>>> I think I have read that unfortunately it can happen.
>>> SAS and SATA specs for disks do not mandate that if a write is in-flight
>>> but still not completed, reads from the same sector should return the
>>> value it is being written; they can return the old value.
>>> I also think that Linux does not check either.
>>
>> If the _old_ 4KB block is returned, that's fine and won't cause a
>> checksum error.
>>
>> The patch in question addresses the case that Btrfs submits a write
>> request for a 4KB block, and a concurrent read request for that 4KB
>> block reads partially the old block and partially the new block,
>> resulting in a checksum error reported in the scrub statistic counters.
>
> Concurrent reads and writes to the device are completely undefined, and
> any combination of old, new, random memory corruption wouldn't
> surprise me...I'd rather avoid them ;)
>
> Doing the transaction join during the super read is probably the least
> complex choice.
But joining the transaction cannot block the log tree sync. I think using
device_list_mutex is better, since we have to acquire this mutex when writing
the super blocks, and once the write completes and we unlock the mutex we can
be sure that the super blocks are on non-volatile media.
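
To make the locking idea concrete, here is a minimal userspace sketch in
Python. It is not kernel code and not the actual patch. The names SuperBlock,
write, scrub and run are hypothetical, the lock only plays the role of
device_list_mutex, and zlib.crc32 stands in for the real super block checksum.
The writer updates the block in two halves to mimic a non-atomic write; because
the scrubber takes the same lock, it should never see a half-old, half-new
block. The sketch does not model flushing to non-volatile media.

```python
import threading
import zlib

BLOCK_SIZE = 4096  # size of the toy "super block"


class SuperBlock:
    """Toy stand-in for a super block: a payload plus its CRC32."""

    def __init__(self):
        # Assumption: this lock plays the role of device_list_mutex.
        self.lock = threading.Lock()
        self.data = bytearray(BLOCK_SIZE)
        self.csum = zlib.crc32(self.data)

    def write(self, generation):
        payload = bytes([generation % 256]) * BLOCK_SIZE
        half = BLOCK_SIZE // 2
        with self.lock:
            # Two separate steps mimic a write that is not atomic.
            self.data[:half] = payload[:half]
            self.data[half:] = payload[half:]
            self.csum = zlib.crc32(self.data)

    def scrub(self):
        # Reading under the same lock excludes a concurrent writer.
        with self.lock:
            return zlib.crc32(self.data) == self.csum


def run(iterations=2000):
    """Run one writer and one scrubber; return the checksum error count."""
    sb = SuperBlock()
    errors = 0

    def writer():
        for gen in range(1, iterations + 1):
            sb.write(gen)

    def scrubber():
        nonlocal errors
        for _ in range(iterations):
            if not sb.scrub():
                errors += 1

    threads = [threading.Thread(target=writer),
               threading.Thread(target=scrubber)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return errors
```

Calling run() starts both threads and returns the number of checksum
mismatches the scrubber saw; with both sides holding the lock, the scrubber
can only ever see a complete block together with its matching checksum.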
Thanks
Miao
>
> -chris