public inbox for linux-kernel@vger.kernel.org
From: Ryan Roberts <ryan.roberts@arm.com>
To: Baolin Wang <baolin.wang@linux.alibaba.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Hugh Dickins <hughd@google.com>, Jonathan Corbet <corbet@lwn.net>,
	"Matthew Wilcox (Oracle)" <willy@infradead.org>,
	David Hildenbrand <david@redhat.com>,
	Barry Song <baohua@kernel.org>, Lance Yang <ioworker0@gmail.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH v1 2/2] mm: mTHP stats for pagecache folio allocations
Date: Sun, 14 Jul 2024 10:05:57 +0100
Message-ID: <f21c97ea-426a-46e3-900a-42cc039acc6f@arm.com>
In-Reply-To: <b8d1dc3c-ee05-450e-961e-b13dded06a78@linux.alibaba.com>

On 13/07/2024 13:54, Baolin Wang wrote:
> 
> 
> On 2024/7/13 19:00, Ryan Roberts wrote:
>> [...]
>>
>>>> +static int thpsize_create(int order, struct kobject *parent)
>>>>    {
>>>>        unsigned long size = (PAGE_SIZE << order) / SZ_1K;
>>>> +    struct thpsize_child *stats;
>>>>        struct thpsize *thpsize;
>>>>        int ret;
>>>>    +    /*
>>>> +     * Each child object (currently only "stats" directory) holds a
>>>> +     * reference to the top-level thpsize object, so we can drop our ref to
>>>> +     * the top-level once stats is setup. Then we just need to drop a
>>>> +     * reference on any children to clean everything up. We can't just use
>>>> +     * the attr group name for the stats subdirectory because there may be
>>>> +     * multiple attribute groups to populate inside stats and overlaying
>>>> +     * using the name property isn't supported in that way; each attr group
>>>> +     * name, if provided, must be unique in the parent directory.
>>>> +     */
>>>> +
>>>>        thpsize = kzalloc(sizeof(*thpsize), GFP_KERNEL);
>>>> -    if (!thpsize)
>>>> -        return ERR_PTR(-ENOMEM);
>>>> +    if (!thpsize) {
>>>> +        ret = -ENOMEM;
>>>> +        goto err;
>>>> +    }
>>>> +    thpsize->order = order;
>>>>          ret = kobject_init_and_add(&thpsize->kobj, &thpsize_ktype, parent,
>>>>                       "hugepages-%lukB", size);
>>>>        if (ret) {
>>>>            kfree(thpsize);
>>>> -        return ERR_PTR(ret);
>>>> +        goto err;
>>>>        }
>>>>    -    ret = sysfs_create_group(&thpsize->kobj, &thpsize_attr_group);
>>>> -    if (ret) {
>>>> +    stats = kzalloc(sizeof(*stats), GFP_KERNEL);
>>>> +    if (!stats) {
>>>>            kobject_put(&thpsize->kobj);
>>>> -        return ERR_PTR(ret);
>>>> +        ret = -ENOMEM;
>>>> +        goto err;
>>>>        }
>>>>    -    ret = sysfs_create_group(&thpsize->kobj, &stats_attr_group);
>>>> +    ret = kobject_init_and_add(&stats->kobj, &thpsize_child_ktype,
>>>> +                   &thpsize->kobj, "stats");
>>>> +    kobject_put(&thpsize->kobj);
>>>>        if (ret) {
>>>> -        kobject_put(&thpsize->kobj);
>>>> -        return ERR_PTR(ret);
>>>> +        kfree(stats);
>>>> +        goto err;
>>>>        }
>>>>    -    thpsize->order = order;
>>>> -    return thpsize;
>>>> +    if (BIT(order) & THP_ORDERS_ALL_ANON) {
>>>> +        ret = sysfs_create_group(&thpsize->kobj, &thpsize_attr_group);
>>>> +        if (ret)
>>>> +            goto err_put;
>>>> +
>>>> +        ret = sysfs_create_group(&stats->kobj, &stats_attr_group);
>>>> +        if (ret)
>>>> +            goto err_put;
>>>> +    }
>>>> +
>>>> +    if (BIT(order) & PAGECACHE_LARGE_ORDERS) {
>>>> +        ret = sysfs_create_group(&stats->kobj, &file_stats_attr_group);
>>>> +        if (ret)
>>>> +            goto err_put;
>>>> +    }
>>>> +
>>>> +    list_add(&stats->node, &thpsize_child_list);
>>>> +    return 0;
>>>> +err_put:
>>>
>>> IIUC, I think you should call 'sysfs_remove_group' to remove the group before
>>> putting the kobject.
>>
>> Are you sure about that? As I understood it, sysfs_create_group() was
>> conceptually modifying the state of the kobj, so when the kobj gets destroyed,
>> all its state is tidied up. __kobject_del() (called on the last kobject_put())
>> calls sysfs_remove_groups() and tidies up the sysfs state as far as I can see?
> 
> IIUC, __kobject_del() only removes the ktype default groups by
> 'sysfs_remove_groups(kobj, ktype->default_groups)', but your created groups are
> not added into the ktype->default_groups. That means you should manually remove
> them, or am I missing something?

That was also putting doubt in my mind. But the sample at
samples/kobject/kobject-example.c does not call sysfs_remove_group(). It just
calls sysfs_create_group() in example_init() and calls kobject_put() in
example_exit(). So I think that's the correct pattern.

Looking at the code more closely, sysfs_create_group() just creates files for
each of the attributes in the group. __kobject_del() calls sysfs_remove_dir(),
whose comment states "we remove any files in the directory before we remove the
directory" so I'm pretty sure sysfs_remove_group() is not required.
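
To make the lifetime argument concrete, here is a minimal userspace sketch of
the behaviour described above. It is a toy model, not kernel code: every
toy_* name is hypothetical, and the only thing it tries to show is that when
the final reference is dropped, the directory is torn down together with every
file that was created inside it, so no separate "remove group" step is needed.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a sysfs directory entry; NOT the kernel's kernfs_node. */
struct toy_node {
	char name[32];
	struct toy_node *child;   /* first child */
	struct toy_node *sibling; /* next sibling */
};

/* Toy stand-in for a kobject: a refcount plus the directory it owns. */
struct toy_kobj {
	int refcount;
	struct toy_node *dir;
};

/* Bookkeeping so that the teardown is observable from a caller. */
static int toy_nodes_freed;

static struct toy_node *toy_node_new(const char *name)
{
	struct toy_node *n = calloc(1, sizeof(*n));

	if (n)
		strncpy(n->name, name, sizeof(n->name) - 1);
	return n;
}

/* Models kobject_init_and_add(): refcount starts at 1, directory exists. */
static struct toy_kobj *toy_kobj_create(const char *name)
{
	struct toy_kobj *k = calloc(1, sizeof(*k));

	if (!k)
		return NULL;
	k->dir = toy_node_new(name);
	if (!k->dir) {
		free(k);
		return NULL;
	}
	k->refcount = 1;
	return k;
}

/* Models sysfs_create_group() for an unnamed group: one file per attribute. */
static int toy_create_group(struct toy_kobj *k, const char *const *attrs)
{
	for (; *attrs; attrs++) {
		struct toy_node *f = toy_node_new(*attrs);

		if (!f)
			return -1;
		f->sibling = k->dir->child;
		k->dir->child = f;
	}
	return 0;
}

/* Models the recursive removal: children go first, then the node itself. */
static void toy_remove_dir(struct toy_node *n)
{
	while (n->child) {
		struct toy_node *c = n->child;

		n->child = c->sibling;
		toy_remove_dir(c);
	}
	free(n);
	toy_nodes_freed++;
}

/* Models kobject_put(): the final put removes the directory and its files. */
static void toy_kobj_put(struct toy_kobj *k)
{
	if (--k->refcount == 0) {
		toy_remove_dir(k->dir);
		free(k);
	}
}
```

A caller would create the object, add a group of attribute names, and then
only call toy_kobj_put(); the files disappear with the directory.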

By the way, if we do choose to only populate stats if that size can be used by
anon/shmem/file, then I've found sysfs_merge_group() which will simplify adding
named groups without needing to manually create the stats directory as I am in
this version of the patch. I'll migrate to using that approach in v2. Of course
if we decide to take the approach of populating all stats for all sizes, that
problem goes away anyway.
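
As a rough illustration of why merging helps, the following userspace sketch
models the difference between creating a named group and merging into one. It
is a toy model under my reading of the semantics (creating a named group fails
with -EEXIST if the name is already present in the parent, and merging fails
with -ENOENT if the named directory does not exist yet). All toy_* names are
hypothetical, and toy_add_group() is an assumed convenience helper, not an
existing kernel function.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define TOY_MAX_GROUPS 4
#define TOY_MAX_FILES  8

/* Toy named subdirectory under a parent; NOT a real sysfs structure. */
struct toy_group_dir {
	const char *name;
	const char *files[TOY_MAX_FILES];
	int nr_files;
};

struct toy_parent {
	struct toy_group_dir groups[TOY_MAX_GROUPS];
	int nr_groups;
};

static struct toy_group_dir *toy_find(struct toy_parent *p, const char *name)
{
	for (int i = 0; i < p->nr_groups; i++)
		if (strcmp(p->groups[i].name, name) == 0)
			return &p->groups[i];
	return NULL;
}

static int toy_add_files(struct toy_group_dir *g, const char *const *attrs)
{
	for (; *attrs; attrs++) {
		if (g->nr_files == TOY_MAX_FILES)
			return -ENOSPC;
		g->files[g->nr_files++] = *attrs;
	}
	return 0;
}

/* Models sysfs_create_group() with a named group: the name must be new. */
static int toy_create_group(struct toy_parent *p, const char *name,
			    const char *const *attrs)
{
	struct toy_group_dir *g;

	if (toy_find(p, name))
		return -EEXIST;
	if (p->nr_groups == TOY_MAX_GROUPS)
		return -ENOSPC;
	g = &p->groups[p->nr_groups++];
	g->name = name;
	g->nr_files = 0;
	return toy_add_files(g, attrs);
}

/* Models sysfs_merge_group(): the named directory must already exist. */
static int toy_merge_group(struct toy_parent *p, const char *name,
			   const char *const *attrs)
{
	struct toy_group_dir *g = toy_find(p, name);

	if (!g)
		return -ENOENT;
	return toy_add_files(g, attrs);
}

/* Hypothetical helper: merge into an existing directory, else create it. */
static int toy_add_group(struct toy_parent *p, const char *name,
			 const char *const *attrs)
{
	int ret = toy_merge_group(p, name, attrs);

	if (ret == -ENOENT)
		ret = toy_create_group(p, name, attrs);
	return ret;
}
```

With a helper along these lines, each conditional block can add its own
attributes to "stats" without caring whether an earlier block already created
the directory.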

Thanks,
Ryan





