public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Doug Berger <opendmb@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Michal Hocko <mhocko@suse.com>, Jason Baron <jbaron@akamai.com>,
	David Rientjes <rientjes@google.com>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: [PATCH 4.9 13/39] mm: include CMA pages in lowmem_reserve at boot
Date: Mon, 24 Aug 2020 10:31:12 +0200	[thread overview]
Message-ID: <20200824082349.153225426@linuxfoundation.org> (raw)
In-Reply-To: <20200824082348.445866152@linuxfoundation.org>

From: Doug Berger <opendmb@gmail.com>

commit e08d3fdfe2dafa0331843f70ce1ff6c1c4900bf4 upstream.

The lowmem_reserve arrays provide a means of applying pressure against
allocations from lower zones that were targeted at higher zones.  Its
values are a function of the number of pages managed by higher zones and
are assigned by a call to the setup_per_zone_lowmem_reserve() function.

The function is initially called at boot time by the function
init_per_zone_wmark_min() and may be called later by accesses of the
/proc/sys/vm/lowmem_reserve_ratio sysctl file.

The function init_per_zone_wmark_min() was moved up from a module_init to
a core_initcall to resolve a sequencing issue with khugepaged.
Unfortunately this created a sequencing issue with CMA page accounting.

The CMA pages are added to the managed page count of a zone when
cma_init_reserved_areas() is called at boot also as a core_initcall.  This
makes it uncertain whether the CMA pages will be added to the managed page
counts of their zones before or after the call to
init_per_zone_wmark_min() as it becomes dependent on link order.  With the
current link order the pages are added to the managed count after the
lowmem_reserve arrays are initialized at boot.

This means the lowmem_reserve values at boot may be lower than the values
used later if /proc/sys/vm/lowmem_reserve_ratio is accessed even if the
ratio values are unchanged.

In many cases the difference is not significant, but for example
an ARM platform with 1GB of memory and the following memory layout

  cma: Reserved 256 MiB at 0x0000000030000000
  Zone ranges:
    DMA      [mem 0x0000000000000000-0x000000002fffffff]
    Normal   empty
    HighMem  [mem 0x0000000030000000-0x000000003fffffff]

would result in 0 lowmem_reserve for the DMA zone.  This would allow
userspace to deplete the DMA zone easily.

Funnily enough

  $ cat /proc/sys/vm/lowmem_reserve_ratio

would fix up the situation because as a side effect it forces
setup_per_zone_lowmem_reserve.

This commit breaks the link order dependency by invoking
init_per_zone_wmark_min() as a postcore_initcall so that the CMA pages
have the chance to be properly accounted in their zone(s) and allowing
the lowmem_reserve arrays to receive consistent values.

Fixes: bc22af74f271 ("mm: update min_free_kbytes from khugepaged after core initialization")
Signed-off-by: Doug Berger <opendmb@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Jason Baron <jbaron@akamai.com>
Cc: David Rientjes <rientjes@google.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/1597423766-27849-1-git-send-email-opendmb@gmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 mm/page_alloc.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6782,7 +6782,7 @@ int __meminit init_per_zone_wmark_min(vo
 
 	return 0;
 }
-core_initcall(init_per_zone_wmark_min)
+postcore_initcall(init_per_zone_wmark_min)
 
 /*
  * min_free_kbytes_sysctl_handler - just a wrapper around proc_dointvec() so



  parent reply	other threads:[~2020-08-24  8:52 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-08-24  8:30 [PATCH 4.9 00/39] 4.9.234-rc1 review Greg Kroah-Hartman
2020-08-24  8:31 ` [PATCH 4.9 01/39] x86/asm: Remove unnecessary \n\t in front of CC_SET() from asm templates Greg Kroah-Hartman
2020-08-24  8:31 ` [PATCH 4.9 02/39] x86/asm: Add instruction suffixes to bitops Greg Kroah-Hartman
2020-08-24  8:31 ` [PATCH 4.9 03/39] drm/imx: imx-ldb: Disable both channels for split mode in enc->disable() Greg Kroah-Hartman
2020-08-24  8:31 ` [PATCH 4.9 04/39] perf probe: Fix memory leakage when the probe point is not found Greg Kroah-Hartman
2020-08-24  8:31 ` [PATCH 4.9 05/39] tracing: Clean up the hwlat binding code Greg Kroah-Hartman
2020-08-24  8:31 ` [PATCH 4.9 06/39] tracing/hwlat: Honor the tracing_cpumask Greg Kroah-Hartman
2020-08-24  8:31 ` [PATCH 4.9 07/39] khugepaged: khugepaged_test_exit() check mmget_still_valid() Greg Kroah-Hartman
2020-08-24  8:31 ` [PATCH 4.9 08/39] khugepaged: adjust VM_BUG_ON_MM() in __khugepaged_enter() Greg Kroah-Hartman
2020-08-24  8:31 ` [PATCH 4.9 09/39] btrfs: export helpers for subvolume name/id resolution Greg Kroah-Hartman
2020-08-24  8:31 ` [PATCH 4.9 10/39] btrfs: dont show full path of bind mounts in subvol= Greg Kroah-Hartman
2020-08-24  8:31 ` [PATCH 4.9 11/39] romfs: fix uninitialized memory leak in romfs_dev_read() Greg Kroah-Hartman
2020-08-24  8:31 ` [PATCH 4.9 12/39] kernel/relay.c: fix memleak on destroy relay channel Greg Kroah-Hartman
2020-08-24  8:31 ` Greg Kroah-Hartman [this message]
2020-08-24  8:31 ` [PATCH 4.9 14/39] mm, page_alloc: fix core hung in free_pcppages_bulk() Greg Kroah-Hartman
2020-08-24  8:31 ` [PATCH 4.9 15/39] jbd2: add the missing unlock_buffer() in the error path of jbd2_write_superblock() Greg Kroah-Hartman
2020-08-24 11:15   ` zhangyi (F)
2020-08-24 15:38     ` Greg Kroah-Hartman
2020-08-24  8:31 ` [PATCH 4.9 16/39] ext4: clean up ext4_match() and callers Greg Kroah-Hartman
2020-08-24  8:31 ` [PATCH 4.9 17/39] ext4: fix checking of directory entry validity for inline directories Greg Kroah-Hartman
2020-08-24  8:31 ` [PATCH 4.9 18/39] scsi: ufs: Add DELAY_BEFORE_LPM quirk for Micron devices Greg Kroah-Hartman
2020-08-24  8:31 ` [PATCH 4.9 19/39] media: budget-core: Improve exception handling in budget_register() Greg Kroah-Hartman
2020-08-24  8:31 ` [PATCH 4.9 20/39] media: vpss: clean up resources in init Greg Kroah-Hartman
2020-08-24  8:31 ` [PATCH 4.9 21/39] Input: psmouse - add a newline when printing proto by sysfs Greg Kroah-Hartman
2020-08-24  8:31 ` [PATCH 4.9 22/39] m68knommu: fix overwriting of bits in ColdFire V3 cache control Greg Kroah-Hartman
2020-08-24  8:31 ` [PATCH 4.9 23/39] xfs: fix inode quota reservation checks Greg Kroah-Hartman
2020-08-24  8:31 ` [PATCH 4.9 24/39] jffs2: fix UAF problem Greg Kroah-Hartman
2020-08-24  8:31 ` [PATCH 4.9 25/39] scsi: libfc: Free skb in fc_disc_gpn_id_resp() for valid cases Greg Kroah-Hartman
2020-08-24  8:31 ` [PATCH 4.9 26/39] virtio_ring: Avoid loop when vq is broken in virtqueue_poll Greg Kroah-Hartman
2020-08-24  8:31 ` [PATCH 4.9 27/39] xfs: Fix UBSAN null-ptr-deref in xfs_sysfs_init Greg Kroah-Hartman
2020-08-24  8:31 ` [PATCH 4.9 28/39] alpha: fix annotation of io{read,write}{16,32}be() Greg Kroah-Hartman
2020-08-24  8:31 ` [PATCH 4.9 29/39] ext4: fix potential negative array index in do_split() Greg Kroah-Hartman
2020-08-24  8:31 ` [PATCH 4.9 30/39] i40e: Set RX_ONLY mode for unicast promiscuous on VLAN Greg Kroah-Hartman
2020-08-24  8:31 ` [PATCH 4.9 31/39] net: fec: correct the error path for regulator disable in probe Greg Kroah-Hartman
2020-08-24  8:31 ` [PATCH 4.9 32/39] ASoC: intel: Fix memleak in sst_media_open Greg Kroah-Hartman
2020-08-24  8:31 ` [PATCH 4.9 33/39] net: dsa: b53: check for timeout Greg Kroah-Hartman
2020-08-24  8:31 ` [PATCH 4.9 34/39] powerpc/pseries: Do not initiate shutdown when system is running on UPS Greg Kroah-Hartman
2020-08-24  8:31 ` [PATCH 4.9 35/39] powerpc: Allow 4224 bytes of stack expansion for the signal frame Greg Kroah-Hartman
2020-08-24  8:31 ` [PATCH 4.9 36/39] epoll: Keep a reference on files added to the check list Greg Kroah-Hartman
2020-08-24  8:31 ` [PATCH 4.9 37/39] do_epoll_ctl(): clean the failure exits up a bit Greg Kroah-Hartman
2020-08-24  8:31 ` [PATCH 4.9 38/39] mm/hugetlb: fix calculation of adjust_range_if_pmd_sharing_possible Greg Kroah-Hartman
2020-08-24  8:31 ` [PATCH 4.9 39/39] xen: dont reschedule in preemption off sections Greg Kroah-Hartman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200824082349.153225426@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=akpm@linux-foundation.org \
    --cc=jbaron@akamai.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mhocko@suse.com \
    --cc=opendmb@gmail.com \
    --cc=rientjes@google.com \
    --cc=stable@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox