From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Chris Mason <clm@fb.com>, Qu Wenruo <quwenruo@cn.fujitsu.com>,
	Hugo Mills <hugo@carfax.org.uk>,
	Btrfs mailing list <linux-btrfs@vger.kernel.org>
Subject: Re: Bug/regression: Read-only mount not read-only
Date: Wed, 2 Dec 2015 07:47:35 +0800
Message-ID: <565E3197.8050209@gmx.com>
In-Reply-To: <20151201185420.GC8918@ret.masoncoding.com>



On 12/02/2015 02:54 AM, Chris Mason wrote:
> On Tue, Dec 01, 2015 at 02:46:32PM +0800, Qu Wenruo wrote:
>>
>>
>> Chris Mason wrote on 2015/11/30 11:48 -0500:
>>> On Sat, Nov 28, 2015 at 01:46:34PM +0000, Hugo Mills wrote:
>>>>     We've just had someone on IRC with a problem mounting their FS. The
>>>> main problem is that they've got a corrupt log tree. That isn't the
>>>> subject of this email, though.
>>>>
>>>>     The issue I'd like to raise is that even with -oro as a mount
>>>> option, the FS is trying to replay the log tree. The dmesg output from
>>>> mount -oro is at the end of the email.
>>>>
>>>>     Now, my memory, experience and understanding is that the FS
>>>> doesn't, and shouldn't replay the log tree on a RO mount, because the
>>>> FS should still be consistent even without the replay, and
>>>> RO-means-actually-RO is possible and desirable. (Compared to a
>>>> journalling FS, where journal replay is required for a consistent,
>>>> usable FS).
>>>>
>>>>     So, this looks to me like a regression that's come in somewhere.
>>>>
>>>>     (Just for completeness, the system in question usually runs 4.2.5,
>>>> but the live CD the OP is using is 4.2.3).
>>>
>>> We do need to replay the log tree, even on readonly mounts.  Otherwise
>>> files created and fsynced before crashing may not even exist.
>>>
>>> We'll bail out of the log replay on readonly media, but otherwise the
>>> replay always happens.
>>>
>>> -chris
>>
>> Or disable the log tree (making fsync as slow as sync).
>> Then there will be no log replay, making an RO mount truly RO.
>> I think we can add this to the kernel btrfs documentation.
>
> True, without the log tree there's nothing to replay.
>
>>
>>
>> Or, in my wildest dream, introduce a per-inode tree to record file
>> extents/dir items.
>>
>> Then fsync will only need to sync the inode's file extent/dir item tree (and
>> maybe its direct parent).
>> That would also give better random read/write performance.
>>
>> Although that's just my dream....
>>
>> But I'm a little curious about why btrfs chose to pack dir items and file
>> extents into the same subvolume tree at design time,
>> unlike most other file systems (ext4, for example).
>>
>> Is it just designed for simplicity?
>
> It's partially simplicity, but it also helps with locality.  When you're
> working with lots of files in a single directory, we're able to do many operations
> faster because we're not jumping around to other indexes for individual
> file extents.
>
> The cost is contention at the top of the btree, which I'm still hoping
> to fix without having to go all the way down to per-file trees.
>

Thanks for the information.

I'll set aside the crazy per-file tree idea unless we can't find a
better fix for the slow metadata operations.
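
To make the locality point concrete, here is a rough sketch (not kernel
code) of how items sort in a single subvolume tree. It assumes the usual
btrfs key layout of (objectid, type, offset) and uses the on-disk key type
values; the inode numbers, the item list and the helper inode_items() are
made up for illustration. Because keys sort by objectid first, everything
belonging to one inode sits in one contiguous range, and files created one
after another in a directory usually get neighbouring inode numbers.

```python
from bisect import bisect_left

# Key type values from the btrfs on-disk format.
INODE_ITEM = 1
DIR_INDEX = 96
EXTENT_DATA = 108


def inode_items(sorted_keys, objectid):
    """Return the contiguous slice of keys that belong to one inode.

    Hypothetical helper: a real lookup walks the btree, but the ordering
    it relies on is the same tuple ordering used here.
    """
    lo = bisect_left(sorted_keys, (objectid, 0, 0))
    hi = bisect_left(sorted_keys, (objectid + 1, 0, 0))
    return sorted_keys[lo:hi]


# Made-up contents: directory inode 256 holding files 257 and 258.
keys = sorted([
    (257, EXTENT_DATA, 0),
    (256, DIR_INDEX, 2),
    (258, INODE_ITEM, 0),
    (256, INODE_ITEM, 0),
    (257, INODE_ITEM, 0),
    (258, EXTENT_DATA, 0),
    (256, DIR_INDEX, 3),
    (257, EXTENT_DATA, 4096),
])

# Items for the directory and its files end up next to each other.
for key in keys:
    print(key)
```

With per-file trees, the extent items for 257 and 258 would each live in a
separate tree, so touching many files in one directory would mean visiting
many tree roots instead of one neighbouring key range.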

Thanks,
Qu

> -chris
