From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Chris Mason <clm@fb.com>, Qu Wenruo <quwenruo@cn.fujitsu.com>,
Hugo Mills <hugo@carfax.org.uk>,
Btrfs mailing list <linux-btrfs@vger.kernel.org>
Subject: Re: Bug/regression: Read-only mount not read-only
Date: Wed, 2 Dec 2015 07:47:35 +0800
Message-ID: <565E3197.8050209@gmx.com>
In-Reply-To: <20151201185420.GC8918@ret.masoncoding.com>
On 12/02/2015 02:54 AM, Chris Mason wrote:
> On Tue, Dec 01, 2015 at 02:46:32PM +0800, Qu Wenruo wrote:
>>
>>
>> Chris Mason wrote on 2015/11/30 11:48 -0500:
>>> On Sat, Nov 28, 2015 at 01:46:34PM +0000, Hugo Mills wrote:
>>>> We've just had someone on IRC with a problem mounting their FS. The
>>>> main problem is that they've got a corrupt log tree. That isn't the
>>>> subject of this email, though.
>>>>
>>>> The issue I'd like to raise is that even with -oro as a mount
>>>> option, the FS is trying to replay the log tree. The dmesg output from
>>>> mount -oro is at the end of the email.
>>>>
>>>> Now, my memory, experience and understanding is that the FS
>>>> doesn't, and shouldn't replay the log tree on a RO mount, because the
>>>> FS should still be consistent even without the replay, and
>>>> RO-means-actually-RO is possible and desirable. (Compared to a
>>>> journalling FS, where journal replay is required for a consistent,
>>>> usable FS).
>>>>
>>>> So, this looks to me like a regression that's come in somewhere.
>>>>
>>>> (Just for completeness, the system in question usually runs 4.2.5,
>>>> but the live CD the OP is using is 4.2.3).
>>>
>>> We do need to replay the log tree, even on readonly mounts. Otherwise
>>> files created and fsynced before crashing may not even exist.
>>>
>>> We'll bail out of the log replay on readonly media, but otherwise the
>>> replay always happens.
>>>
>>> -chris
>>
>> Or disable the log tree with the notreelog mount option (making fsync
>> as slow as sync).
>> Then there will be no log replay, making an RO mount really RO.
>> I think we can add this to the kernel btrfs documentation.
>
> True, without the log tree there's nothing to replay.
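To make the trade-off concrete, here is a minimal toy model in Python. It is
not btrfs code, and the class and method names are hypothetical. It assumes a
committed tree that only changes at transaction commit, plus a log that fsync
appends to. With the log disabled, fsync falls back to a full commit, so there
is nothing left to replay at mount time.

```python
class ToyFS:
    """Toy model of a committed tree plus an fsync log. Not real btrfs code."""

    def __init__(self, use_log=True):
        self.use_log = use_log
        self.committed = {}  # on disk: state as of the last transaction commit
        self.log = {}        # on disk: entries fsynced since the last commit
        self.pending = {}    # in memory only: lost on crash

    def write(self, name, data):
        self.pending[name] = data

    def commit(self):
        # Full transaction commit: all pending changes become durable
        # and the log is no longer needed.
        self.committed.update(self.pending)
        self.pending.clear()
        self.log.clear()

    def fsync(self, name):
        if name not in self.pending:
            return  # nothing dirty for this file
        if self.use_log:
            # Cheap path: persist only this file's change in the log.
            self.log[name] = self.pending[name]
        else:
            # Log disabled: fsync costs as much as a full sync.
            self.commit()

    def crash(self):
        self.pending.clear()  # only on-disk state survives

    def mount(self, replay=True):
        """Return the visible files. Replay writes to the committed tree."""
        if replay and self.log:
            self.committed.update(self.log)
            self.log.clear()
        return dict(self.committed)


fs = ToyFS(use_log=True)
fs.write("a", "1")
fs.fsync("a")
fs.crash()
without_replay = fs.mount(replay=False)  # the fsynced file is missing
with_replay = fs.mount(replay=True)      # the file is back, but we wrote to disk
```

In this model a mount that skips replay cannot see a file that was fsynced
before the crash, which is the case Chris describes above.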
>
>>
>>
>> Or, in my wildest dream, introduce a per-inode tree to record file
>> extents/dir items.
>>
>> Then fsync would only need to sync that inode's file extent/dir item
>> tree (and maybe its direct parent).
>> Random read/write performance would be better, too.
>>
>> Although that's just my dream....
>>
>> But I'm a little curious about why btrfs chose to pack dir items and file
>> extents into the same subvolume tree at design time,
>> unlike most other file systems (ext4, for example).
>>
>> Is it just designed for simplicity?
>
> It's partially simplicity, but it also helps with locality. When you're
> working with lots of files in a single directory, we're able to do many operations
> faster because we're not jumping around to other indexes for individual
> file extents.
>
> The cost is contention at the top of the btree, which I'm still hoping
> to fix without having to go all the way down to per-file trees.
>
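A rough sketch of the locality point, in Python rather than kernel code. It
assumes the usual btrfs key ordering of (objectid, type, offset) and uses a
sorted list as a stand-in for the subvolume b-tree; the helper names are
hypothetical. It also assumes the files were created one after another and so
received consecutive inode numbers. Because inode items, directory entries and
file extents share one key space, the items for neighbouring inodes sit in
adjacent keys, and one range scan reaches all of them.

```python
import bisect

# Item type values from the btrfs on-disk format.
INODE_ITEM = 1
DIR_INDEX = 96
EXTENT_DATA = 108


def build_tree(dir_ino, first_file_ino, nfiles):
    """Return sorted keys for one directory and nfiles small files in it."""
    keys = [(dir_ino, INODE_ITEM, 0)]
    for i in range(nfiles):
        ino = first_file_ino + i
        keys.append((dir_ino, DIR_INDEX, i + 2))  # directory entry for the file
        keys.append((ino, INODE_ITEM, 0))         # the file's inode
        keys.append((ino, EXTENT_DATA, 0))        # its first file extent
    keys.sort()
    return keys


def items_for_inodes(keys, lo_ino, hi_ino):
    """One contiguous range scan covering inodes lo_ino..hi_ino inclusive."""
    start = bisect.bisect_left(keys, (lo_ino, 0, 0))
    end = bisect.bisect_left(keys, (hi_ino + 1, 0, 0))
    return keys[start:end]


keys = build_tree(dir_ino=256, first_file_ino=257, nfiles=3)
file_items = items_for_inodes(keys, 257, 259)
```

With per-file trees, the same walk would need a separate tree lookup for each
file's extents instead of one scan over a single index.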
Thanks for the information.
I'll set aside the crazy idea of per-file trees unless it turns out we
have no better fix for the slow metadata operations.
Thanks,
Qu
> -chris
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>