From: Mike Kravetz <mike.kravetz@oracle.com>
To: Miaohe Lin <linmiaohe@huawei.com>
Cc: akpm@linux-foundation.org, songmuchun@bytedance.com,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 5/5] hugetlbfs: fix confusing hugetlbfs stat
Date: Mon, 25 Jul 2022 16:40:43 -0700 [thread overview]
Message-ID: <Yt8p+5FrU3XpFlxv@monkey> (raw)
In-Reply-To: <e6e2de0d-ee72-4572-75c5-02cb27e7ae95@huawei.com>
On 07/23/22 10:56, Miaohe Lin wrote:
> On 2022/7/23 6:55, Mike Kravetz wrote:
> > On 07/22/22 14:38, Miaohe Lin wrote:
> >> On 2022/7/22 8:28, Mike Kravetz wrote:
> >>> On 07/21/22 21:16, Miaohe Lin wrote:
> >>>> When size option is not specified, f_blocks, f_bavail and f_bfree will be
> >>>> set to -1 instead of 0. Likewise, when nr_inodes is not specified, f_files
> >>>> and f_ffree will be set to -1 too. Check max_hpages and max_inodes against
> >>>> -1 first to make sure 0 is reported for max/free/used when no limit is set
> >>>> as the comment states.
> >>>
> >>> Just curious, where are you seeing values reported as -1? The check
> >>
> >> From the standard statvfs() function.
> >>
> >>> for sbinfo->spool was supposed to handle these cases. Seems like it
> >>
> >> sbinfo->spool could be created when ctx->max_hpages == -1 while
> >> ctx->min_hpages != -1 in hugetlbfs_fill_super.
> >>
> >>> should handle the max_hpages == -1 case. But, it doesn't look like it
> >>> considers the max_inodes == -1 case.
> >>>
> >>> If I create/mount a hugetlb filesystem without specifying size or nr_inodes,
> >>> df seems to report zero instead of -1.
> >>>
> >>> Just want to understand the reasoning behind the change.
> >
> > Thanks for the additional information (and test program)!
> >
> >>From the hugetlbfs documentation:
> > "If the ``size``, ``min_size`` or ``nr_inodes`` option is not provided on
> > command line then no limits are set."
> >
> > So, having those values set to -1 indicates there is no limit set.
> >
> > With this change, 0 is reported for the case where there is no limit set as
> > well as the case where the max value is 0.
>
> IMHO, 0 should not be a valid max value otherwise there will be no hugetlb pages
> to use. It should mean there's no limit. But maybe I'm wrong.
I agree that 0 as a max value makes little sense. However, it is allowed
today and from what I can tell it is file system specific. So, there is no
defined behavior.
>
> >
> > There may be some value in reporting -1 as is done today.
>
> There still be a inconsistency:
>
> If the ``size`` and ``min_size`` isn't specified, then reported max value is 0.
> But if ``min_size`` is specified while ``size`` isn't specified, the reported
> max value is -1.
>
Agree that this is inconsistent and confusing.
In the case where min_size and size is not specified, -1 for size still may
make sense. min_size specifies how many pages are reserved for use by the
filesystem. The only required relation between min_size and size is that if
size is specified, then min_size must be smaller. Otherwise, it makes no
sense to reserve pages (min_size) that can not be used.
> > To be honest, I am not sure what is the correct behavior here. Unless
> > there is a user visible issue/problem, I am hesitant to change. Other
> > opinions are welcome.
>
> Yes, it might be better to keep it as is. Maybe we could change the comment to
> reflect what the current behavior is like below?
>
> diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
> index 44da9828e171..f03b1a019cc0 100644
> --- a/fs/hugetlbfs/inode.c
> +++ b/fs/hugetlbfs/inode.c
> @@ -1080,7 +1080,7 @@ static int hugetlbfs_statfs(struct dentry *dentry, struct kstatfs *buf)
> buf->f_bsize = huge_page_size(h);
> if (sbinfo) {
> spin_lock(&sbinfo->stat_lock);
> - /* If no limits set, just report 0 for max/free/used
> + /* If no limits set, just report 0 or -1 for max/free/used
> * blocks, like simple_statfs() */
> if (sbinfo->spool) {
> spin_lock_irq(&sbinfo->spool->lock);
>
> >
>
> No strong opinion to keep this patch or above change. Many thanks for your comment and reply. :)
>
I am fine with the comment change. Thanks for reading through the code and
trying to make sense of it!
--
Mike Kravetz
next prev parent reply other threads:[~2022-07-25 23:41 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-07-21 13:16 [PATCH 0/5] A few cleanup and fixup patches for hugetlbfs Miaohe Lin
2022-07-21 13:16 ` [PATCH 1/5] hugetlbfs: use helper macro SZ_1{K,M} Miaohe Lin
2022-07-21 23:13 ` Mike Kravetz
2022-07-21 13:16 ` [PATCH 2/5] hugetlbfs: remove unneeded hugetlbfs_ops forward declaration Miaohe Lin
2022-07-21 23:14 ` Mike Kravetz
2022-07-21 13:16 ` [PATCH 3/5] hugetlbfs: remove unneeded header file Miaohe Lin
2022-07-21 23:18 ` Mike Kravetz
2022-07-22 6:12 ` Miaohe Lin
2022-07-21 13:16 ` [PATCH 4/5] hugetlbfs: cleanup some comments in inode.c Miaohe Lin
2022-07-21 23:23 ` Mike Kravetz
2022-07-22 6:19 ` Miaohe Lin
2022-07-22 21:38 ` Mike Kravetz
2022-07-21 13:16 ` [PATCH 5/5] hugetlbfs: fix confusing hugetlbfs stat Miaohe Lin
2022-07-22 0:28 ` Mike Kravetz
2022-07-22 6:38 ` Miaohe Lin
2022-07-22 22:55 ` Mike Kravetz
2022-07-23 2:56 ` Miaohe Lin
2022-07-25 23:40 ` Mike Kravetz [this message]
2022-07-26 2:01 ` Miaohe Lin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Yt8p+5FrU3XpFlxv@monkey \
--to=mike.kravetz@oracle.com \
--cc=akpm@linux-foundation.org \
--cc=linmiaohe@huawei.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=songmuchun@bytedance.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.