From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
To: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>, linux-mm@kvack.org
Cc: linuxppc-dev@lists.ozlabs.org, paulus@samba.org, anton@samba.org,
nyc@holomorphy.com
Subject: Re: [RFC PATCH] hugetlb: ensure hugepage access is denied if hugepages are not supported
Date: Thu, 03 Apr 2014 21:49:46 +0530 [thread overview]
Message-ID: <87ioqqo065.fsf@linux.vnet.ibm.com> (raw)
In-Reply-To: <20140326155815.GB15234@linux.vnet.ibm.com>
Nishanth Aravamudan <nacc@linux.vnet.ibm.com> writes:
> On 24.03.2014 [16:02:56 -0700], Nishanth Aravamudan wrote:
>> In KVM guests on Power, if the guest is not backed by hugepages, we see
>> the following in the guest:
>>
>> AnonHugePages: 0 kB
>> HugePages_Total: 0
>> HugePages_Free: 0
>> HugePages_Rsvd: 0
>> HugePages_Surp: 0
>> Hugepagesize: 64 kB
>>
>> This seems like a configuration issue -- why is a hstate of 64k being
>> registered?
>>
>> I did some debugging and found that the following does trigger,
>> mm/hugetlb.c::hugetlb_init():
>>
>> /* Some platform decide whether they support huge pages at boot
>> * time. On these, such as powerpc, HPAGE_SHIFT is set to 0 when
>> * there is no such support
>> */
>> if (HPAGE_SHIFT == 0)
>> return 0;
>>
>> That check is only during init-time. So we don't support hugepages, but
>> none of the hugetlb APIs actually check this condition (HPAGE_SHIFT ==
>> 0), so /proc/meminfo above falsely indicates there is a valid hstate (at
>> least one). But note that there is no /sys/kernel/mm/hugepages meaning
>> no hstate was actually registered.
>>
>> Further, it turns out that huge_page_order(default_hstate) is 0, so
>> hugetlb_report_meminfo is doing:
>>
>> 1UL << (huge_page_order(h) + PAGE_SHIFT - 10)
>>
>> which ends up just doing 1 << (PAGE_SHIFT - 10) and since the base page
>> size is 64k, we report a hugepage size of 64k... And allow the user to
>> allocate hugepages via the sysctl, etc.
>>
>> What's the right thing to do here?
>>
>> 1) Should we add checks for HPAGE_SHIFT == 0 to all the hugetlb APIs? It
>> seems like HPAGE_SHIFT == 0 should be the equivalent, functionally, of
>> the config options being off. This seems like a lot of overhead, though,
>> to put everywhere, so maybe I can do it in an arch-specific macro, that
>> in asm-generic defaults to 0 (and so will hopefully be compiled out?).
>>
>> 2) What should hugetlbfs do when HPAGE_SHIFT == 0? Should it be
>> mountable? Obviously if it's mountable, we can't great files there
>> (since the fs will report insufficient space). [1]
>
> Here is my solution to this. Comments appreciated!
>
> In KVM guests on Power, in a guest not backed by hugepages, we see the
> following:
>
> AnonHugePages: 0 kB
> HugePages_Total: 0
> HugePages_Free: 0
> HugePages_Rsvd: 0
> HugePages_Surp: 0
> Hugepagesize: 64 kB
>
> HPAGE_SHIFT == 0 in this configuration, which indicates that hugepages
> are not supported at boot-time, but this is only checked in
> hugetlb_init(). Extract the check to a helper function, and use it in a
> few relevant places.
>
> This does make hugetlbfs not supported in this environment. I believe
> this is fine, as there are no valid hugepages and that won't change at
> runtime.
>
> Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
Looks good. Can you resubmit it as a proper patch ? You may also want to
capture in commit message saying hugetlbfs file system also will not be
registered.
>
> diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
> index d19b30a..c7aa477 100644
> --- a/fs/hugetlbfs/inode.c
> +++ b/fs/hugetlbfs/inode.c
> @@ -1017,6 +1017,11 @@ static int __init init_hugetlbfs_fs(void)
> int error;
> int i;
>
> + if (!hugepages_supported()) {
> + printk(KERN_ERR "hugetlbfs: Disabling because there are no supported page sizes\n");
> + return -ENOTSUPP;
> + }
> +
> error = bdi_init(&hugetlbfs_backing_dev_info);
> if (error)
> return error;
> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
> index 8c43cc4..0aea8de 100644
> --- a/include/linux/hugetlb.h
> +++ b/include/linux/hugetlb.h
> @@ -450,4 +450,14 @@ static inline spinlock_t *huge_pte_lock(struct hstate *h,
> return ptl;
> }
>
> +static inline bool hugepages_supported(void)
> +{
> + /*
> + * Some platform decide whether they support huge pages at boot
> + * time. On these, such as powerpc, HPAGE_SHIFT is set to 0 when
> + * there is no such support
> + */
> + return HPAGE_SHIFT != 0;
> +}
> +
> #endif /* _LINUX_HUGETLB_H */
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index c01cb9f..1c99585 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1949,11 +1949,7 @@ module_exit(hugetlb_exit);
>
> static int __init hugetlb_init(void)
> {
> - /* Some platform decide whether they support huge pages at boot
> - * time. On these, such as powerpc, HPAGE_SHIFT is set to 0 when
> - * there is no such support
> - */
> - if (HPAGE_SHIFT == 0)
> + if (!hugepages_supported())
> return 0;
>
> if (!size_to_hstate(default_hstate_size)) {
> @@ -2069,6 +2065,9 @@ static int hugetlb_sysctl_handler_common(bool obey_mempolicy,
> unsigned long tmp;
> int ret;
>
> + if (!hugepages_supported())
> + return -ENOTSUPP;
> +
> tmp = h->max_huge_pages;
>
> if (write && h->order >= MAX_ORDER)
> @@ -2122,6 +2121,9 @@ int hugetlb_overcommit_handler(struct ctl_table *table, int write,
> unsigned long tmp;
> int ret;
>
> + if (!hugepages_supported())
> + return -ENOTSUPP;
> +
> tmp = h->nr_overcommit_huge_pages;
>
> if (write && h->order >= MAX_ORDER)
> @@ -2147,6 +2149,8 @@ out:
> void hugetlb_report_meminfo(struct seq_file *m)
> {
> struct hstate *h = &default_hstate;
> + if (!hugepages_supported())
> + return;
> seq_printf(m,
> "HugePages_Total: %5lu\n"
> "HugePages_Free: %5lu\n"
> @@ -2163,6 +2167,8 @@ void hugetlb_report_meminfo(struct seq_file *m)
> int hugetlb_report_node_meminfo(int nid, char *buf)
> {
> struct hstate *h = &default_hstate;
> + if (!hugepages_supported())
> + return 0;
> return sprintf(buf,
> "Node %d HugePages_Total: %5u\n"
> "Node %d HugePages_Free: %5u\n"
> @@ -2177,6 +2183,9 @@ void hugetlb_show_meminfo(void)
> struct hstate *h;
> int nid;
>
> + if (!hugepages_supported())
> + return;
> +
> for_each_node_state(nid, N_MEMORY)
> for_each_hstate(h)
> pr_info("Node %d hugepages_total=%u hugepages_free=%u hugepages_surp=%u hugepages_size=%lukB\n",
>
> _______________________________________________
> Linuxppc-dev mailing list
> Linuxppc-dev@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev
WARNING: multiple messages have this Message-ID (diff)
From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
To: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>, linux-mm@kvack.org
Cc: paulus@samba.org, linuxppc-dev@lists.ozlabs.org, anton@samba.org,
nyc@holomorphy.com
Subject: Re: [RFC PATCH] hugetlb: ensure hugepage access is denied if hugepages are not supported
Date: Thu, 03 Apr 2014 21:49:46 +0530 [thread overview]
Message-ID: <87ioqqo065.fsf@linux.vnet.ibm.com> (raw)
In-Reply-To: <20140326155815.GB15234@linux.vnet.ibm.com>
Nishanth Aravamudan <nacc@linux.vnet.ibm.com> writes:
> On 24.03.2014 [16:02:56 -0700], Nishanth Aravamudan wrote:
>> In KVM guests on Power, if the guest is not backed by hugepages, we see
>> the following in the guest:
>>
>> AnonHugePages: 0 kB
>> HugePages_Total: 0
>> HugePages_Free: 0
>> HugePages_Rsvd: 0
>> HugePages_Surp: 0
>> Hugepagesize: 64 kB
>>
>> This seems like a configuration issue -- why is a hstate of 64k being
>> registered?
>>
>> I did some debugging and found that the following does trigger,
>> mm/hugetlb.c::hugetlb_init():
>>
>> /* Some platform decide whether they support huge pages at boot
>> * time. On these, such as powerpc, HPAGE_SHIFT is set to 0 when
>> * there is no such support
>> */
>> if (HPAGE_SHIFT == 0)
>> return 0;
>>
>> That check is only during init-time. So we don't support hugepages, but
>> none of the hugetlb APIs actually check this condition (HPAGE_SHIFT ==
>> 0), so /proc/meminfo above falsely indicates there is a valid hstate (at
>> least one). But note that there is no /sys/kernel/mm/hugepages meaning
>> no hstate was actually registered.
>>
>> Further, it turns out that huge_page_order(default_hstate) is 0, so
>> hugetlb_report_meminfo is doing:
>>
>> 1UL << (huge_page_order(h) + PAGE_SHIFT - 10)
>>
>> which ends up just doing 1 << (PAGE_SHIFT - 10) and since the base page
>> size is 64k, we report a hugepage size of 64k... And allow the user to
>> allocate hugepages via the sysctl, etc.
>>
>> What's the right thing to do here?
>>
>> 1) Should we add checks for HPAGE_SHIFT == 0 to all the hugetlb APIs? It
>> seems like HPAGE_SHIFT == 0 should be the equivalent, functionally, of
>> the config options being off. This seems like a lot of overhead, though,
>> to put everywhere, so maybe I can do it in an arch-specific macro, that
>> in asm-generic defaults to 0 (and so will hopefully be compiled out?).
>>
>> 2) What should hugetlbfs do when HPAGE_SHIFT == 0? Should it be
>> mountable? Obviously if it's mountable, we can't great files there
>> (since the fs will report insufficient space). [1]
>
> Here is my solution to this. Comments appreciated!
>
> In KVM guests on Power, in a guest not backed by hugepages, we see the
> following:
>
> AnonHugePages: 0 kB
> HugePages_Total: 0
> HugePages_Free: 0
> HugePages_Rsvd: 0
> HugePages_Surp: 0
> Hugepagesize: 64 kB
>
> HPAGE_SHIFT == 0 in this configuration, which indicates that hugepages
> are not supported at boot-time, but this is only checked in
> hugetlb_init(). Extract the check to a helper function, and use it in a
> few relevant places.
>
> This does make hugetlbfs not supported in this environment. I believe
> this is fine, as there are no valid hugepages and that won't change at
> runtime.
>
> Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
Looks good. Can you resubmit it as a proper patch ? You may also want to
capture in commit message saying hugetlbfs file system also will not be
registered.
>
> diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
> index d19b30a..c7aa477 100644
> --- a/fs/hugetlbfs/inode.c
> +++ b/fs/hugetlbfs/inode.c
> @@ -1017,6 +1017,11 @@ static int __init init_hugetlbfs_fs(void)
> int error;
> int i;
>
> + if (!hugepages_supported()) {
> + printk(KERN_ERR "hugetlbfs: Disabling because there are no supported page sizes\n");
> + return -ENOTSUPP;
> + }
> +
> error = bdi_init(&hugetlbfs_backing_dev_info);
> if (error)
> return error;
> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
> index 8c43cc4..0aea8de 100644
> --- a/include/linux/hugetlb.h
> +++ b/include/linux/hugetlb.h
> @@ -450,4 +450,14 @@ static inline spinlock_t *huge_pte_lock(struct hstate *h,
> return ptl;
> }
>
> +static inline bool hugepages_supported(void)
> +{
> + /*
> + * Some platform decide whether they support huge pages at boot
> + * time. On these, such as powerpc, HPAGE_SHIFT is set to 0 when
> + * there is no such support
> + */
> + return HPAGE_SHIFT != 0;
> +}
> +
> #endif /* _LINUX_HUGETLB_H */
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index c01cb9f..1c99585 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1949,11 +1949,7 @@ module_exit(hugetlb_exit);
>
> static int __init hugetlb_init(void)
> {
> - /* Some platform decide whether they support huge pages at boot
> - * time. On these, such as powerpc, HPAGE_SHIFT is set to 0 when
> - * there is no such support
> - */
> - if (HPAGE_SHIFT == 0)
> + if (!hugepages_supported())
> return 0;
>
> if (!size_to_hstate(default_hstate_size)) {
> @@ -2069,6 +2065,9 @@ static int hugetlb_sysctl_handler_common(bool obey_mempolicy,
> unsigned long tmp;
> int ret;
>
> + if (!hugepages_supported())
> + return -ENOTSUPP;
> +
> tmp = h->max_huge_pages;
>
> if (write && h->order >= MAX_ORDER)
> @@ -2122,6 +2121,9 @@ int hugetlb_overcommit_handler(struct ctl_table *table, int write,
> unsigned long tmp;
> int ret;
>
> + if (!hugepages_supported())
> + return -ENOTSUPP;
> +
> tmp = h->nr_overcommit_huge_pages;
>
> if (write && h->order >= MAX_ORDER)
> @@ -2147,6 +2149,8 @@ out:
> void hugetlb_report_meminfo(struct seq_file *m)
> {
> struct hstate *h = &default_hstate;
> + if (!hugepages_supported())
> + return;
> seq_printf(m,
> "HugePages_Total: %5lu\n"
> "HugePages_Free: %5lu\n"
> @@ -2163,6 +2167,8 @@ void hugetlb_report_meminfo(struct seq_file *m)
> int hugetlb_report_node_meminfo(int nid, char *buf)
> {
> struct hstate *h = &default_hstate;
> + if (!hugepages_supported())
> + return 0;
> return sprintf(buf,
> "Node %d HugePages_Total: %5u\n"
> "Node %d HugePages_Free: %5u\n"
> @@ -2177,6 +2183,9 @@ void hugetlb_show_meminfo(void)
> struct hstate *h;
> int nid;
>
> + if (!hugepages_supported())
> + return;
> +
> for_each_node_state(nid, N_MEMORY)
> for_each_hstate(h)
> pr_info("Node %d hugepages_total=%u hugepages_free=%u hugepages_surp=%u hugepages_size=%lukB\n",
>
> _______________________________________________
> Linuxppc-dev mailing list
> Linuxppc-dev@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2014-04-03 16:19 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-03-24 23:02 powerpc hugepage bug(s) when no valid hstates? Nishanth Aravamudan
2014-03-24 23:02 ` Nishanth Aravamudan
2014-03-26 15:58 ` [RFC PATCH] hugetlb: ensure hugepage access is denied if hugepages are not supported Nishanth Aravamudan
2014-03-26 15:58 ` Nishanth Aravamudan
2014-04-02 17:16 ` Nishanth Aravamudan
2014-04-02 17:16 ` Nishanth Aravamudan
2014-04-03 16:19 ` Aneesh Kumar K.V [this message]
2014-04-03 16:19 ` Aneesh Kumar K.V
2014-04-03 23:12 ` Nishanth Aravamudan
2014-04-03 23:12 ` Nishanth Aravamudan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87ioqqo065.fsf@linux.vnet.ibm.com \
--to=aneesh.kumar@linux.vnet.ibm.com \
--cc=anton@samba.org \
--cc=linux-mm@kvack.org \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=nacc@linux.vnet.ibm.com \
--cc=nyc@holomorphy.com \
--cc=paulus@samba.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.