From: Gao Xiang <hsiangkao@linux.alibaba.com>
To: Alexander Larsson <alexl@redhat.com>
Cc: Jingbo Xu <jefflexu@linux.alibaba.com>,
lsf-pc@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org,
Amir Goldstein <amir73il@gmail.com>,
Christian Brauner <brauner@kernel.org>,
Giuseppe Scrivano <gscrivan@redhat.com>,
Dave Chinner <david@fromorbit.com>,
Vivek Goyal <vgoyal@redhat.com>,
Miklos Szeredi <miklos@szeredi.hu>
Subject: Re: [LSF/MM/BFP TOPIC] Composefs vs erofs+overlay
Date: Tue, 7 Mar 2023 18:06:46 +0800
Message-ID: <3b58af0a-387c-fea5-cf04-72503c89f4bc@linux.alibaba.com>
In-Reply-To: <CAL7ro1GMAKrYG3gWJHx2UwVTQo=UjKWSH6iBbpoBO_a-ybbieQ@mail.gmail.com>
On 2023/3/7 17:56, Alexander Larsson wrote:
> On Tue, Mar 7, 2023 at 10:38 AM Gao Xiang <hsiangkao@linux.alibaba.com> wrote:
>>
>> On 2023/3/7 17:26, Gao Xiang wrote:
>>>
>>>
>>> On 2023/3/7 17:07, Alexander Larsson wrote:
>>>> On Tue, Mar 7, 2023 at 9:34 AM Gao Xiang <hsiangkao@linux.alibaba.com> wrote:
>>>>>
>>>>>
>>>>>
>>>>> On 2023/3/7 16:21, Alexander Larsson wrote:
>>>>>> On Mon, Mar 6, 2023 at 5:17 PM Gao Xiang <hsiangkao@linux.alibaba.com> wrote:
>>>>>>
>>>>>>>>> I tested the performance of "ls -lR" on the whole tree of
>>>>>>>>> cs9-developer-rootfs. It seems that the performance of erofs (generated
>>>>>>>>> from mkfs.erofs) is slightly better than that of composefs. While the
>>>>>>>>> performance of erofs generated from mkfs.composefs is slightly worse
>>>>>>>>> than that of composefs.
>>>>>>>>
>>>>>>>> I suspect that the reason for the lower performance of mkfs.composefs
>>>>>>>> is the added overlay.fs-verity xattr to all the files. It makes the
>>>>>>>> image larger, and that means more i/o.
>>>>>>>
>>>>>>> Actually you could move overlay.fs-verity to EROFS shared xattr area (or
>>>>>>> even overlay.redirect but it depends) if needed, which could save some
>>>>>>> I/Os for your workloads.
>>>>>>>
>>>>>>> Shared xattrs can be used in this way as well if you care about such a
>>>>>>> minor difference. Actually, I think inlined xattrs for your workload are
>>>>>>> only meaningful for selinux labels and capabilities.
>>>>>>
>>>>>> Really? Could you expand on this, because I would think it will be
>>>>>> sort of the opposite. In my usecase, the erofs fs will be read by
>>>>>> overlayfs, which will probably access overlay.* pretty often. At the
>>>>>> very least it will load overlay.metacopy and overlay.redirect for
>>>>>> every lookup.
>>>>>
>>>>> Really. In that way, it will behave very similarly to the composefs
>>>>> on-disk arrangement now (in the composefs vdata area).
>>>>>
>>>>> In that way, an extra I/O is needed for verification, but it only
>>>>> happens when actually opening the file (so "ls -lR" is not impacted),
>>>>> and on-disk inodes are more compact.
>>>>>
>>>>> All EROFS xattrs will be cached in memory so that accessing
>>>>> overlay.* pretty often is not greatly impacted due to no real I/Os
>>>>> (IOWs, only some CPU time is consumed).
>>>>
>>>> So, I tried moving the overlay.digest xattr to the shared area, but
>>>> actually this made the performance worse for the ls case. I have not
>>>
>>> That is quite strange. We'd like to open it up if needed. BTW, did you
>>> test EROFS with acl enabled all the time?
>>>
>>>> looked into the cause in detail, but my guess is that ls looks for the
>>>> acl xattr, and such a negative lookup will cause erofs to look at all
>>>> the shared xattrs for the inode, which means they all end up being
>>>> loaded anyway. Of course, this will only affect ls (or other cases
>>>> that read the acl), so it's perhaps a bit uncommon.
>>>
>>> Yeah, in addition to that, I guess real acls could be placed in inlined
>>> xattrs as well, if they exist...
>>>
>>>>
>>>> Did you ever consider putting a bloom filter in the h_reserved area of
>>>> erofs_xattr_ibody_header? Then it could return early without i/o
>>>> operations for keys that are not set for the inode. Not sure what the
>>>> computational cost of that would be though.
>>>
>>> Good idea! Let me think about it, but enabling "noacl" mount
>>> option isn't preferred if acl is not needed in your use cases.
>>
>> ^ is preferred.
>
> That is probably the right approach for the composefs usecase. But
> even when you want acls, typically only just a few files have acls
> set, so it might be interesting to handle the negative acl lookup case
> more efficiently.
Let me find time to improve this with bloom filters. It won't be hard,
and I'd also like to improve some other on-disk format details together
with this xattr enhancement. Thanks for your input!
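To illustrate the negative-lookup idea discussed above, here is a minimal
sketch of a per-inode xattr name filter. It is not actual EROFS code: the
32-bit filter width, the struct and function names, the single FNV-1a hash,
and the plain xattr name strings are all assumptions made for the example,
and the real layout of the reserved header area is not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-inode filter; imagined to live in reserved header space. */
struct xattr_filter {
	uint32_t bits;
};

/* FNV-1a over the xattr name. Any reasonably uniform hash would do here. */
static uint32_t xattr_name_hash(const char *name)
{
	uint32_t h = 2166136261u;

	assert(name != NULL);
	for (; *name; name++) {
		h ^= (uint8_t)*name;
		h *= 16777619u;
	}
	return h;
}

/* Image build side: record that the inode carries an xattr called `name`. */
static void xattr_filter_add(struct xattr_filter *f, const char *name)
{
	f->bits |= UINT32_C(1) << (xattr_name_hash(name) & 31);
}

/*
 * Lookup side: false means the name is definitely absent, so the caller
 * can return "no such attribute" without reading any xattr data. true
 * only means "maybe present" (false positives are possible), so the
 * caller falls back to the normal inline/shared xattr scan.
 */
static bool xattr_filter_may_contain(const struct xattr_filter *f,
				     const char *name)
{
	return f->bits & (UINT32_C(1) << (xattr_name_hash(name) & 31));
}
```

With a filter like this, a lookup for an acl xattr on an inode that has
no xattrs at all (filter bits all zero) would return early. A single hash
into 32 bits gives a noticeable false-positive rate once an inode has
several xattrs, so the width and hash count would need measuring.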
Thanks,
Gao Xiang
>