* powerpc hugepage bug(s) when no valid hstates?
@ 2014-03-24 23:02 Nishanth Aravamudan
  2014-03-26 15:58 ` [RFC PATCH] hugetlb: ensure hugepage access is denied if hugepages are not supported Nishanth Aravamudan
  0 siblings, 1 reply; 5+ messages in thread
From: Nishanth Aravamudan @ 2014-03-24 23:02 UTC
  To: linux-mm; +Cc: linuxppc-dev, nyc, benh, paulus, anton

In KVM guests on Power, if the guest is not backed by hugepages, we see
the following in the guest:

AnonHugePages:         0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:         64 kB

This seems like a configuration issue -- why is a hstate of 64k being
registered?

I did some debugging and found that the following check does trigger in
mm/hugetlb.c::hugetlb_init():

        /* Some platform decide whether they support huge pages at boot
         * time. On these, such as powerpc, HPAGE_SHIFT is set to 0 when
         * there is no such support
         */
        if (HPAGE_SHIFT == 0)
                return 0;

That check is only done at init time. So we don't support hugepages, but
none of the hugetlb APIs actually check this condition (HPAGE_SHIFT ==
0), so /proc/meminfo above falsely indicates there is a valid hstate (at
least one). But note that there is no /sys/kernel/mm/hugepages, meaning
no hstate was actually registered.

Further, it turns out that huge_page_order(default_hstate) is 0, so
hugetlb_report_meminfo is doing:

        1UL << (huge_page_order(h) + PAGE_SHIFT - 10)

which ends up just doing 1 << (PAGE_SHIFT - 10), and since the base page
size is 64k, we report a hugepage size of 64k... and allow the user to
allocate hugepages via the sysctl, etc.

What's the right thing to do here?

1) Should we add checks for HPAGE_SHIFT == 0 to all the hugetlb APIs? It
seems like HPAGE_SHIFT == 0 should be the equivalent, functionally, of
the config options being off. This seems like a lot of overhead, though,
to put everywhere, so maybe I can do it in an arch-specific macro that
in asm-generic defaults to 0 (and so will hopefully be compiled out?).

2) What should hugetlbfs do when HPAGE_SHIFT == 0? Should it be
mountable? Obviously if it's mountable, we can't create files there
(since the fs will report insufficient space). [1]

Thanks,
Nish

[1] Currently, I am seeing the following when I `mount -t hugetlbfs
/none /dev/hugetlbfs`, and then simply do a `ls /dev/hugetlbfs`. I think
it's related to the fact that hugetlbfs is probably not setting itself
up correctly in this state:

Unable to handle kernel paging request for data at address 0x00000031
Faulting instruction address: 0xc000000000245710
Oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=2048 NUMA pSeries
Modules linked in: pseries_rng rng_core virtio_net virtio_pci virtio_ring virtio
CPU: 0 PID: 1807 Comm: ls Not tainted 3.14.0-rc7-00066-g774868c-dirty #14
task: c00000007e804520 ti: c00000007aed4000 task.ti: c00000007aed4000
NIP: c000000000245710 LR: c00000000024586c CTR: 0000000000000000
REGS: c00000007aed74f0 TRAP: 0300   Not tainted  (3.14.0-rc7-00066-g774868c-dirty)
MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 24002484  XER: 00000000
CFAR: 00003fff91037760 DAR: 0000000000000031 DSISR: 40000000 SOFTE: 1
GPR00: c00000000024586c c00000007aed7770 c000000000d85420 c00000007d7a0010
GPR04: c000000000abcf20 c000000000ed7c78 0000000000000020 c000000000cbc880
GPR08: 0000000000000000 0000000000000000 0000000080000000 0000000000000002
GPR12: 0000000044002484 c00000000fe40000 0000000000000000 00000000100232f0
GPR16: 0000000000000001 0000000000000000 0000000000000000 c00000007d794a40
GPR20: 0000000000000000 0000000000000024 c00000007a49a200 c00000007a2bd000
GPR24: c00000007aed7bb8 c00000007d7a0090 0000000000014800 0000000000000000
GPR28: c00000007d7a0010 c00000007a49a210 c00000007d7a0150 0000000000000001
NIP [c000000000245710] .time_out_leases+0x30/0x100
LR [c00000000024586c] .__break_lease+0x8c/0x480
Call Trace:
[c00000007aed7770] [c0000000002434c0] .lease_alloc+0x20/0xe0 (unreliable)
[c00000007aed77f0] [c00000000024586c] .__break_lease+0x8c/0x480
[c00000007aed78e0] [c0000000001e0374] .do_dentry_open.isra.14+0xf4/0x370
[c00000007aed7980] [c0000000001e0624] .finish_open+0x34/0x60
[c00000007aed7a00] [c0000000001f519c] .do_last+0x56c/0xe40
[c00000007aed7b20] [c0000000001f5b68] .path_openat+0xf8/0x800
[c00000007aed7c40] [c0000000001f7810] .do_filp_open+0x40/0xb0
[c00000007aed7d70] [c0000000001e1f08] .do_sys_open+0x198/0x2e0
[c00000007aed7e30] [c00000000000a158] syscall_exit+0x0/0x98
Instruction dump:

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
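
[Editor's sketch, not part of the original message.] The Hugepagesize
arithmetic described in the message above can be checked in userspace.
This is a minimal model, assuming only the shift expression quoted from
hugetlb_report_meminfo; the function name hugepagesize_kb and its two
parameters are hypothetical and do not exist in the kernel.

```c
#include <assert.h>

/*
 * Userspace model of the kB value printed as "Hugepagesize":
 *     1UL << (huge_page_order(h) + PAGE_SHIFT - 10)
 * 'order' stands in for huge_page_order(h), 'page_shift' for PAGE_SHIFT.
 * Both parameter names are assumptions made for this sketch.
 */
unsigned long hugepagesize_kb(unsigned int order, unsigned int page_shift)
{
	/* The "- 10" converts bytes to kB, so the sum must be at least 10. */
	assert(order + page_shift >= 10);
	return 1UL << (order + page_shift - 10);
}
```

With order 0 and a page shift of 16 (64k base pages), the function
returns 64, which matches the bogus "Hugepagesize: 64 kB" line. For
comparison, an order of 8 on the same base page size (chosen here purely
as an illustrative value) yields 16384.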
* [RFC PATCH] hugetlb: ensure hugepage access is denied if hugepages are not supported
  2014-03-24 23:02 powerpc hugepage bug(s) when no valid hstates? Nishanth Aravamudan
@ 2014-03-26 15:58 ` Nishanth Aravamudan
  2014-04-02 17:16   ` Nishanth Aravamudan
  2014-04-03 16:19   ` Aneesh Kumar K.V
  0 siblings, 2 replies; 5+ messages in thread
From: Nishanth Aravamudan @ 2014-03-26 15:58 UTC
  To: linux-mm; +Cc: linuxppc-dev, nyc, benh, paulus, anton

On 24.03.2014 [16:02:56 -0700], Nishanth Aravamudan wrote:
> What's the right thing to do here?
>
> 1) Should we add checks for HPAGE_SHIFT == 0 to all the hugetlb APIs? It
> seems like HPAGE_SHIFT == 0 should be the equivalent, functionally, of
> the config options being off. This seems like a lot of overhead, though,
> to put everywhere, so maybe I can do it in an arch-specific macro that
> in asm-generic defaults to 0 (and so will hopefully be compiled out?).
>
> 2) What should hugetlbfs do when HPAGE_SHIFT == 0? Should it be
> mountable? Obviously if it's mountable, we can't create files there
> (since the fs will report insufficient space). [1]

Here is my solution to this. Comments appreciated!

In KVM guests on Power, in a guest not backed by hugepages, we see the
following:

AnonHugePages:         0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:         64 kB

HPAGE_SHIFT == 0 in this configuration, which indicates that hugepages
are not supported at boot-time, but this is only checked in
hugetlb_init(). Extract the check to a helper function, and use it in a
few relevant places.

This does make hugetlbfs not supported in this environment. I believe
this is fine, as there are no valid hugepages and that won't change at
runtime.

Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index d19b30a..c7aa477 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -1017,6 +1017,11 @@ static int __init init_hugetlbfs_fs(void)
 	int error;
 	int i;
 
+	if (!hugepages_supported()) {
+		printk(KERN_ERR "hugetlbfs: Disabling because there are no supported page sizes\n");
+		return -ENOTSUPP;
+	}
+
 	error = bdi_init(&hugetlbfs_backing_dev_info);
 	if (error)
 		return error;
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 8c43cc4..0aea8de 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -450,4 +450,14 @@ static inline spinlock_t *huge_pte_lock(struct hstate *h,
 	return ptl;
 }
 
+static inline bool hugepages_supported(void)
+{
+	/*
+	 * Some platform decide whether they support huge pages at boot
+	 * time. On these, such as powerpc, HPAGE_SHIFT is set to 0 when
+	 * there is no such support
+	 */
+	return HPAGE_SHIFT != 0;
+}
+
 #endif /* _LINUX_HUGETLB_H */
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index c01cb9f..1c99585 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1949,11 +1949,7 @@ module_exit(hugetlb_exit);
 
 static int __init hugetlb_init(void)
 {
-	/* Some platform decide whether they support huge pages at boot
-	 * time. On these, such as powerpc, HPAGE_SHIFT is set to 0 when
-	 * there is no such support
-	 */
-	if (HPAGE_SHIFT == 0)
+	if (!hugepages_supported())
 		return 0;
 
 	if (!size_to_hstate(default_hstate_size)) {
@@ -2069,6 +2065,9 @@ static int hugetlb_sysctl_handler_common(bool obey_mempolicy,
 	unsigned long tmp;
 	int ret;
 
+	if (!hugepages_supported())
+		return -ENOTSUPP;
+
 	tmp = h->max_huge_pages;
 
 	if (write && h->order >= MAX_ORDER)
@@ -2122,6 +2121,9 @@ int hugetlb_overcommit_handler(struct ctl_table *table, int write,
 	unsigned long tmp;
 	int ret;
 
+	if (!hugepages_supported())
+		return -ENOTSUPP;
+
 	tmp = h->nr_overcommit_huge_pages;
 
 	if (write && h->order >= MAX_ORDER)
@@ -2147,6 +2149,8 @@ out:
 void hugetlb_report_meminfo(struct seq_file *m)
 {
 	struct hstate *h = &default_hstate;
+	if (!hugepages_supported())
+		return;
 	seq_printf(m,
 			"HugePages_Total:   %5lu\n"
 			"HugePages_Free:    %5lu\n"
@@ -2163,6 +2167,8 @@ void hugetlb_report_meminfo(struct seq_file *m)
 int hugetlb_report_node_meminfo(int nid, char *buf)
 {
 	struct hstate *h = &default_hstate;
+	if (!hugepages_supported())
+		return 0;
 	return sprintf(buf,
 		"Node %d HugePages_Total: %5u\n"
 		"Node %d HugePages_Free:  %5u\n"
@@ -2177,6 +2183,9 @@ void hugetlb_show_meminfo(void)
 	struct hstate *h;
 	int nid;
 
+	if (!hugepages_supported())
+		return;
+
 	for_each_node_state(nid, N_MEMORY)
 		for_each_hstate(h)
 			pr_info("Node %d hugepages_total=%u hugepages_free=%u hugepages_surp=%u hugepages_size=%lukB\n",
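
[Editor's sketch, not part of the original message.] The guard pattern
the patch applies (one helper, checked at the top of each entry point)
can be modeled in userspace roughly as follows. Everything except the
name hugepages_supported() and the != 0 test is an assumption made for
illustration: model_hpage_shift stands in for the arch-provided
HPAGE_SHIFT, and model_set_max_huge_pages is a hypothetical stand-in for
a sysctl handler, not a kernel function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Kernel-internal errno value (include/linux/errno.h); userspace <errno.h> lacks it. */
#ifndef ENOTSUPP
#define ENOTSUPP 524
#endif

/* Hypothetical stand-in for HPAGE_SHIFT; 0 means no hugepage support. */
unsigned int model_hpage_shift = 0;

/* Same test as the helper in the patch, applied to the stand-in variable. */
bool hugepages_supported(void)
{
	return model_hpage_shift != 0;
}

/* Hypothetical stand-in for a sysctl handler that resizes the hugepage pool. */
int model_set_max_huge_pages(unsigned long *max_huge_pages, unsigned long requested)
{
	assert(max_huge_pages != NULL);

	/* Bail out before touching any pool state, as the patch does. */
	if (!hugepages_supported())
		return -ENOTSUPP;

	*max_huge_pages = requested;
	return 0;
}
```

The point of the early return is that, with the stand-in shift at 0, the
caller gets an error and the pool counter is left untouched, rather than
the request being silently accepted against a default hstate that was
never registered.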
* Re: [RFC PATCH] hugetlb: ensure hugepage access is denied if hugepages are not supported
  2014-03-26 15:58 ` [RFC PATCH] hugetlb: ensure hugepage access is denied if hugepages are not supported Nishanth Aravamudan
@ 2014-04-02 17:16   ` Nishanth Aravamudan
  2014-04-03 16:19   ` Aneesh Kumar K.V
  1 sibling, 0 replies; 5+ messages in thread
From: Nishanth Aravamudan @ 2014-04-02 17:16 UTC
  To: linux-mm; +Cc: linuxppc-dev, nyc, benh, paulus, anton, akpm

On 26.03.2014 [08:58:15 -0700], Nishanth Aravamudan wrote:
> HPAGE_SHIFT == 0 in this configuration, which indicates that hugepages
> are not supported at boot-time, but this is only checked in
> hugetlb_init(). Extract the check to a helper function, and use it in a
> few relevant places.
>
> This does make hugetlbfs not supported in this environment. I believe
> this is fine, as there are no valid hugepages and that won't change at
> runtime.
>
> Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>

Ping on this? The patch fixes a pretty easy-to-reproduce bug in KVM
guests on Power.

Thanks,
Nish
* Re: [RFC PATCH] hugetlb: ensure hugepage access is denied if hugepages are not supported
  2014-03-26 15:58 ` [RFC PATCH] hugetlb: ensure hugepage access is denied if hugepages are not supported Nishanth Aravamudan
  2014-04-02 17:16   ` Nishanth Aravamudan
@ 2014-04-03 16:19   ` Aneesh Kumar K.V
  2014-04-03 23:12     ` Nishanth Aravamudan
  1 sibling, 1 reply; 5+ messages in thread
From: Aneesh Kumar K.V @ 2014-04-03 16:19 UTC
  To: Nishanth Aravamudan, linux-mm; +Cc: paulus, linuxppc-dev, anton, nyc

Nishanth Aravamudan <nacc@linux.vnet.ibm.com> writes:

> HPAGE_SHIFT == 0 in this configuration, which indicates that hugepages
> are not supported at boot-time, but this is only checked in
> hugetlb_init(). Extract the check to a helper function, and use it in a
> few relevant places.
>
> This does make hugetlbfs not supported in this environment. I believe
> this is fine, as there are no valid hugepages and that won't change at
> runtime.
>
> Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>

Looks good. Can you resubmit it as a proper patch? You may also want to
capture in the commit message that the hugetlbfs file system will also
not be registered.
* Re: [RFC PATCH] hugetlb: ensure hugepage access is denied if hugepages are not supported
  2014-04-03 16:19   ` Aneesh Kumar K.V
@ 2014-04-03 23:12     ` Nishanth Aravamudan
  0 siblings, 0 replies; 5+ messages in thread
From: Nishanth Aravamudan @ 2014-04-03 23:12 UTC
  To: Aneesh Kumar K.V; +Cc: linux-mm, paulus, linuxppc-dev, anton, nyc

On 03.04.2014 [21:49:46 +0530], Aneesh Kumar K.V wrote:
> Looks good. Can you resubmit it as a proper patch?

Will Cc you on that.

> You may also want to capture in the commit message that the hugetlbfs
> file system will also not be registered.

I did that already:

> > This does make hugetlbfs not supported in this environment. I
> > believe this is fine, as there are no valid hugepages and that won't
> > change at runtime.

Thanks,
Nish