stable.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Andrey Ryabinin <arbn@yandex-team.com>,
	Michal Hocko <mhocko@suse.com>,
	Mel Gorman <mgorman@techsingularity.net>,
	David Rientjes <rientjes@google.com>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: [PATCH 5.10 64/76] mm: mempolicy: fix THP allocations escaping mempolicy restrictions
Date: Mon, 27 Dec 2021 16:31:19 +0100	[thread overview]
Message-ID: <20211227151326.910063787@linuxfoundation.org> (raw)
In-Reply-To: <20211227151324.694661623@linuxfoundation.org>

From: Andrey Ryabinin <arbn@yandex-team.com>

commit 338635340669d5b317c7e8dcf4fff4a0f3651d87 upstream.

alloc_pages_vma() may try to allocate THP page on the local NUMA node
first:

	page = __alloc_pages_node(hpage_node,
		gfp | __GFP_THISNODE | __GFP_NORETRY, order);

And if the allocation fails it retries allowing remote memory:

	if (!page && (gfp & __GFP_DIRECT_RECLAIM))
    		page = __alloc_pages_node(hpage_node,
					gfp, order);

However, this retry allocation completely ignores memory policy nodemask
allowing allocation to escape restrictions.

The first appearance of this bug seems to be the commit ac5b2c18911f
("mm: thp: relax __GFP_THISNODE for MADV_HUGEPAGE mappings").

The bug disappeared later in the commit 89c83fb539f9 ("mm, thp:
consolidate THP gfp handling into alloc_hugepage_direct_gfpmask") and
reappeared again in slightly different form in the commit 76e654cc91bb
("mm, page_alloc: allow hugepage fallback to remote nodes when
madvised")

Fix this by passing correct nodemask to the __alloc_pages() call.

The demonstration/reproducer of the problem:

    $ mount -oremount,size=4G,huge=always /dev/shm/
    $ echo always > /sys/kernel/mm/transparent_hugepage/defrag
    $ cat mbind_thp.c
    #include <unistd.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <fcntl.h>
    #include <assert.h>
    #include <stdlib.h>
    #include <stdio.h>
    #include <numaif.h>

    #define SIZE 2ULL << 30
    int main(int argc, char **argv)
    {
        int fd;
        unsigned long long i;
        char *addr;
        pid_t pid;
        char buf[100];
        unsigned long nodemask = 1;

        fd = open("/dev/shm/test", O_RDWR|O_CREAT);
        assert(fd > 0);
        assert(ftruncate(fd, SIZE) == 0);

        addr = mmap(NULL, SIZE, PROT_READ|PROT_WRITE,
                           MAP_SHARED, fd, 0);

        assert(mbind(addr, SIZE, MPOL_BIND, &nodemask, 2, MPOL_MF_STRICT|MPOL_MF_MOVE)==0);
        for (i = 0; i < SIZE; i+=4096) {
          addr[i] = 1;
        }
        pid = getpid();
        snprintf(buf, sizeof(buf), "grep shm /proc/%d/numa_maps", pid);
        system(buf);
        sleep(10000);

        return 0;
    }
    $ gcc mbind_thp.c -o mbind_thp -lnuma
    $ numactl -H
    available: 2 nodes (0-1)
    node 0 cpus: 0 2
    node 0 size: 1918 MB
    node 0 free: 1595 MB
    node 1 cpus: 1 3
    node 1 size: 2014 MB
    node 1 free: 1731 MB
    node distances:
    node   0   1
      0:  10  20
      1:  20  10
    $ rm -f /dev/shm/test; taskset -c 0 ./mbind_thp
    7fd970a00000 bind:0 file=/dev/shm/test dirty=524288 active=0 N0=396800 N1=127488 kernelpagesize_kB=4

Link: https://lkml.kernel.org/r/20211208165343.22349-1-arbn@yandex-team.com
Fixes: ac5b2c18911f ("mm: thp: relax __GFP_THISNODE for MADV_HUGEPAGE mappings")
Signed-off-by: Andrey Ryabinin <arbn@yandex-team.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 mm/mempolicy.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -2222,8 +2222,8 @@ alloc_pages_vma(gfp_t gfp, int order, st
 			 * memory with both reclaim and compact as well.
 			 */
 			if (!page && (gfp & __GFP_DIRECT_RECLAIM))
-				page = __alloc_pages_node(hpage_node,
-								gfp, order);
+				page = __alloc_pages_nodemask(gfp, order,
+							hpage_node, nmask);
 
 			goto out;
 		}



  parent reply	other threads:[~2021-12-27 15:42 UTC|newest]

Thread overview: 94+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-12-27 15:30 [PATCH 5.10 00/76] 5.10.89-rc1 review Greg Kroah-Hartman
2021-12-27 15:30 ` [PATCH 5.10 01/76] arm64: vdso32: drop -no-integrated-as flag Greg Kroah-Hartman
2021-12-27 15:30 ` [PATCH 5.10 02/76] arm64: vdso32: require CROSS_COMPILE_COMPAT for gcc+bfd Greg Kroah-Hartman
2021-12-27 15:30 ` [PATCH 5.10 03/76] net: usb: lan78xx: add Allied Telesis AT29M2-AF Greg Kroah-Hartman
2021-12-27 15:30 ` [PATCH 5.10 04/76] ext4: prevent partial update of the extent blocks Greg Kroah-Hartman
2021-12-27 15:30 ` [PATCH 5.10 05/76] ext4: check for out-of-order index extents in ext4_valid_extent_entries() Greg Kroah-Hartman
2021-12-27 15:30 ` [PATCH 5.10 06/76] ext4: check for inconsistent extents between index and leaf block Greg Kroah-Hartman
2021-12-27 15:30 ` [PATCH 5.10 07/76] HID: holtek: fix mouse probing Greg Kroah-Hartman
2021-12-27 15:30 ` [PATCH 5.10 08/76] HID: potential dereference of null pointer Greg Kroah-Hartman
2021-12-27 15:30 ` [PATCH 5.10 09/76] arm64: dts: allwinner: orangepi-zero-plus: fix PHY mode Greg Kroah-Hartman
2021-12-27 15:30 ` [PATCH 5.10 10/76] spi: change clk_disable_unprepare to clk_unprepare Greg Kroah-Hartman
2021-12-27 15:30 ` [PATCH 5.10 11/76] ASoC: meson: aiu: fifo: Add missing dma_coerce_mask_and_coherent() Greg Kroah-Hartman
2021-12-27 15:30 ` [PATCH 5.10 12/76] IB/qib: Fix memory leak in qib_user_sdma_queue_pkts() Greg Kroah-Hartman
2021-12-27 15:30 ` [PATCH 5.10 13/76] RDMA/hns: Replace kfree() with kvfree() Greg Kroah-Hartman
2021-12-27 15:30 ` [PATCH 5.10 14/76] netfilter: fix regression in looped (broad|multi)casts MAC handling Greg Kroah-Hartman
2021-12-27 15:30 ` [PATCH 5.10 15/76] ARM: dts: imx6qdl-wandboard: Fix Ethernet support Greg Kroah-Hartman
2021-12-27 15:30 ` [PATCH 5.10 16/76] net: marvell: prestera: fix incorrect return of port_find Greg Kroah-Hartman
2021-12-27 15:30 ` [PATCH 5.10 17/76] qlcnic: potential dereference null pointer of rx_queue->page_ring Greg Kroah-Hartman
2021-12-27 15:30 ` [PATCH 5.10 18/76] net: accept UFOv6 packages in virtio_net_hdr_to_skb Greg Kroah-Hartman
2021-12-27 15:30 ` [PATCH 5.10 19/76] net: skip virtio_net_hdr_set_proto if protocol already set Greg Kroah-Hartman
2021-12-27 15:30 ` [PATCH 5.10 20/76] igb: fix deadlock caused by taking RTNL in RPM resume path Greg Kroah-Hartman
2021-12-27 15:30 ` [PATCH 5.10 21/76] ipmi: Fix UAF when uninstall ipmi_si and ipmi_msghandler module Greg Kroah-Hartman
2021-12-27 15:30 ` [PATCH 5.10 22/76] bonding: fix ad_actor_system option setting to default Greg Kroah-Hartman
2021-12-27 15:30 ` [PATCH 5.10 23/76] fjes: Check for error irq Greg Kroah-Hartman
2021-12-27 15:30 ` [PATCH 5.10 24/76] drivers: net: smc911x: " Greg Kroah-Hartman
2021-12-27 15:30 ` [PATCH 5.10 25/76] net: ks8851: " Greg Kroah-Hartman
2021-12-27 15:30 ` [PATCH 5.10 26/76] sfc: Check null pointer of rx_queue->page_ring Greg Kroah-Hartman
2021-12-29 11:17   ` Pavel Machek
2022-01-01 11:54     ` Martin Habets
2021-12-27 15:30 ` [PATCH 5.10 27/76] sfc: falcon: " Greg Kroah-Hartman
2021-12-27 15:30 ` [PATCH 5.10 28/76] Input: elantech - fix stack out of bound access in elantech_change_report_id() Greg Kroah-Hartman
2021-12-27 15:30 ` [PATCH 5.10 29/76] pinctrl: bcm2835: Change init order for gpio hogs Greg Kroah-Hartman
2021-12-31  9:52   ` Pavel Machek
2021-12-27 15:30 ` [PATCH 5.10 30/76] hwmon: (lm90) Fix usage of CONFIG2 register in detect function Greg Kroah-Hartman
2021-12-27 15:30 ` [PATCH 5.10 31/76] hwmon: (lm90) Add basic support for TI TMP461 Greg Kroah-Hartman
2021-12-27 15:30 ` [PATCH 5.10 32/76] hwmon: (lm90) Introduce flag indicating extended temperature support Greg Kroah-Hartman
2021-12-27 15:30 ` [PATCH 5.10 33/76] hwmon: (lm90) Drop critical attribute support for MAX6654 Greg Kroah-Hartman
2021-12-27 15:30 ` [PATCH 5.10 34/76] ALSA: jack: Check the return value of kstrdup() Greg Kroah-Hartman
2021-12-27 15:30 ` [PATCH 5.10 35/76] ALSA: drivers: opl3: Fix incorrect use of vp->state Greg Kroah-Hartman
2021-12-27 15:30 ` [PATCH 5.10 36/76] ALSA: hda/realtek: Amp init fixup for HP ZBook 15 G6 Greg Kroah-Hartman
2021-12-27 15:30 ` [PATCH 5.10 37/76] ALSA: hda/realtek: Add new alc285-hp-amp-init model Greg Kroah-Hartman
2021-12-27 15:30 ` [PATCH 5.10 38/76] ALSA: hda/realtek: Fix quirk for Clevo NJ51CU Greg Kroah-Hartman
2021-12-27 15:30 ` [PATCH 5.10 39/76] ASoC: meson: aiu: Move AIU_I2S_MISC hold setting to aiu-fifo-i2s Greg Kroah-Hartman
2021-12-27 15:30 ` [PATCH 5.10 40/76] Input: atmel_mxt_ts - fix double free in mxt_read_info_block Greg Kroah-Hartman
2021-12-27 15:30 ` [PATCH 5.10 41/76] ipmi: bail out if init_srcu_struct fails Greg Kroah-Hartman
2021-12-27 15:30 ` [PATCH 5.10 42/76] ipmi: ssif: initialize ssif_info->client early Greg Kroah-Hartman
2021-12-27 15:30 ` [PATCH 5.10 43/76] ipmi: fix initialization when workqueue allocation fails Greg Kroah-Hartman
2021-12-27 15:30 ` [PATCH 5.10 44/76] parisc: Correct completer in lws start Greg Kroah-Hartman
2021-12-27 15:31 ` [PATCH 5.10 45/76] parisc: Fix mask used to select futex spinlock Greg Kroah-Hartman
2021-12-27 15:31 ` [PATCH 5.10 46/76] tee: handle lookup of shm with reference count 0 Greg Kroah-Hartman
2021-12-27 15:31 ` [PATCH 5.10 47/76] x86/pkey: Fix undefined behaviour with PKRU_WD_BIT Greg Kroah-Hartman
2021-12-27 15:31 ` [PATCH 5.10 48/76] platform/x86: intel_pmc_core: fix memleak on registration failure Greg Kroah-Hartman
2021-12-31 10:04   ` Pavel Machek
2021-12-31 10:18     ` Greg Kroah-Hartman
2021-12-27 15:31 ` [PATCH 5.10 49/76] KVM: VMX: Wake vCPU when delivering posted IRQ even if vCPU == this vCPU Greg Kroah-Hartman
2021-12-27 15:31 ` [PATCH 5.10 50/76] pinctrl: stm32: consider the GPIO offset to expose all the GPIO lines Greg Kroah-Hartman
2021-12-27 15:31 ` [PATCH 5.10 51/76] gpio: dln2: Fix interrupts when replugging the device Greg Kroah-Hartman
2021-12-27 15:31 ` [PATCH 5.10 52/76] mmc: sdhci-tegra: Fix switch to HS400ES mode Greg Kroah-Hartman
2021-12-27 15:31 ` [PATCH 5.10 53/76] mmc: meson-mx-sdhc: Set MANUAL_STOP for multi-block SDIO commands Greg Kroah-Hartman
2021-12-27 15:31 ` [PATCH 5.10 54/76] mmc: core: Disable card detect during shutdown Greg Kroah-Hartman
2021-12-27 15:31 ` [PATCH 5.10 55/76] mmc: mmci: stm32: clear DLYB_CR after sending tuning command Greg Kroah-Hartman
2021-12-27 15:31 ` [PATCH 5.10 56/76] ARM: 9169/1: entry: fix Thumb2 bug in iWMMXt exception handling Greg Kroah-Hartman
2021-12-27 15:31 ` [PATCH 5.10 57/76] mac80211: fix locking in ieee80211_start_ap error path Greg Kroah-Hartman
2021-12-27 15:31 ` [PATCH 5.10 58/76] mm/hwpoison: clear MF_COUNT_INCREASED before retrying get_any_page() Greg Kroah-Hartman
2021-12-27 15:31 ` [PATCH 5.10 59/76] tee: optee: Fix incorrect page free bug Greg Kroah-Hartman
2021-12-27 15:31 ` [PATCH 5.10 60/76] f2fs: fix to do sanity check on last xattr entry in __f2fs_setxattr() Greg Kroah-Hartman
2022-01-03 21:11   ` Salvatore Bonaccorso
2022-01-04  9:29     ` Chao Yu
2022-01-04  9:56       ` Salvatore Bonaccorso
2022-01-04 10:22         ` Greg Kroah-Hartman
2022-01-04 21:10           ` Jaegeuk Kim
2022-01-05  8:10             ` Greg Kroah-Hartman
2021-12-27 15:31 ` [PATCH 5.10 61/76] ceph: fix up non-directory creation in SGID directories Greg Kroah-Hartman
2021-12-27 15:31 ` [PATCH 5.10 62/76] usb: gadget: u_ether: fix race in setting MAC address in setup phase Greg Kroah-Hartman
2021-12-27 15:31 ` [PATCH 5.10 63/76] KVM: VMX: Fix stale docs for kvm-intel.emulate_invalid_guest_state Greg Kroah-Hartman
2021-12-27 15:31 ` Greg Kroah-Hartman [this message]
2021-12-27 15:31 ` [PATCH 5.10 65/76] Input: elants_i2c - do not check Remark ID on eKTH3900/eKTH5312 Greg Kroah-Hartman
2021-12-27 15:31 ` [PATCH 5.10 66/76] Input: i8042 - enable deferred probe quirk for ASUS UM325UA Greg Kroah-Hartman
2021-12-27 15:31 ` [PATCH 5.10 67/76] Input: goodix - add id->model mapping for the "9111" model Greg Kroah-Hartman
2021-12-27 15:31 ` [PATCH 5.10 68/76] ASoC: tas2770: Fix setting of high sample rates Greg Kroah-Hartman
2021-12-27 15:31 ` [PATCH 5.10 69/76] ASoC: rt5682: fix the wrong jack type detected Greg Kroah-Hartman
2021-12-27 15:31 ` [PATCH 5.10 70/76] pinctrl: mediatek: fix global-out-of-bounds issue Greg Kroah-Hartman
2021-12-27 15:31 ` [PATCH 5.10 71/76] hwmom: (lm90) Fix citical alarm status for MAX6680/MAX6681 Greg Kroah-Hartman
2021-12-27 15:31 ` [PATCH 5.10 72/76] hwmon: (lm90) Do not report busy status bit as alarm Greg Kroah-Hartman
2021-12-27 15:31 ` [PATCH 5.10 73/76] ax25: NPD bug when detaching AX25 device Greg Kroah-Hartman
2021-12-27 15:31 ` [PATCH 5.10 74/76] hamradio: defer ax25 kfree after unregister_netdev Greg Kroah-Hartman
2021-12-27 15:31 ` [PATCH 5.10 75/76] hamradio: improve the incomplete fix to avoid NPD Greg Kroah-Hartman
2021-12-27 15:31 ` [PATCH 5.10 76/76] phonet/pep: refuse to enable an unbound pipe Greg Kroah-Hartman
2021-12-27 18:02 ` [PATCH 5.10 00/76] 5.10.89-rc1 review Florian Fainelli
2021-12-28  8:12 ` Naresh Kamboju
2021-12-28 13:22 ` Sudip Mukherjee
2021-12-28 17:07 ` Guenter Roeck
2021-12-28 21:27 ` Shuah Khan
2021-12-29  1:33 ` Samuel Zou

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20211227151326.910063787@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=arbn@yandex-team.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mgorman@techsingularity.net \
    --cc=mhocko@suse.com \
    --cc=rientjes@google.com \
    --cc=stable@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).