stable.vger.kernel.org archive mirror
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Andrea Arcangeli <aarcange@redhat.com>,
	Michal Hocko <mhocko@suse.com>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Oleg Nesterov <oleg@redhat.com>, Jann Horn <jannh@google.com>,
	Hugh Dickins <hughd@google.com>,
	Mike Rapoport <rppt@linux.vnet.ibm.com>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	Peter Xu <peterx@redhat.com>, Jason Gunthorpe <jgg@mellanox.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: [PATCH 5.1 98/98] coredump: fix race condition between collapse_huge_page() and core dumping
Date: Thu, 20 Jun 2019 19:58:05 +0200
Message-ID: <20190620174354.374371652@linuxfoundation.org>
In-Reply-To: <20190620174349.443386789@linuxfoundation.org>

From: Andrea Arcangeli <aarcange@redhat.com>

commit 59ea6d06cfa9247b586a695c21f94afa7183af74 upstream.

When fixing the race conditions between the coredump and the mmap_sem
holders outside the context of the process, we focused on
mmget_not_zero()/get_task_mm() callers in 04f5866e41fb70 ("coredump: fix
race condition between mmget_not_zero()/get_task_mm() and core
dumping"), but those aren't the only cases where the mmap_sem can be
taken outside of the context of the process as Michal Hocko noticed
while backporting that commit to older -stable kernels.

If mmgrab() is called in the context of the process, but then the
mm_count reference is transferred outside the context of the process,
that can also be a problem if the mmap_sem has to be taken for writing
through that mm_count reference.

khugepaged registration calls mmgrab() in the context of the process,
but the mmap_sem for writing is taken later in the context of the
khugepaged kernel thread.

collapse_huge_page() after taking the mmap_sem for writing doesn't
modify any vma, so it's not obvious that it could cause a problem to the
coredump, but it happens to modify the pmd in a way that breaks an
invariant that pmd_trans_huge_lock() relies upon.  collapse_huge_page()
needs the mmap_sem for writing just to block concurrent page faults that
call pmd_trans_huge_lock().

Specifically the invariant that "!pmd_trans_huge()" cannot become a
"pmd_trans_huge()" doesn't hold while collapse_huge_page() runs.

The coredump will call __get_user_pages() without mmap_sem for reading,
which eventually can invoke a lockless page fault which will need a
functional pmd_trans_huge_lock().

So collapse_huge_page() needs to use mmget_still_valid() to check it's
not running concurrently with the coredump...  as long as the coredump
can invoke page faults without holding the mmap_sem for reading.

This has "Fixes: khugepaged" to facilitate backporting, but in my view
it's more a bug in the coredump code that will eventually have to be
rewritten to stop invoking page faults without the mmap_sem for reading.
So the long term plan is still to drop all mmget_still_valid().

Link: http://lkml.kernel.org/r/20190607161558.32104-1-aarcange@redhat.com
Fixes: ba76149f47d8 ("thp: khugepaged")
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Reported-by: Michal Hocko <mhocko@suse.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Jason Gunthorpe <jgg@mellanox.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 include/linux/sched/mm.h |    4 ++++
 mm/khugepaged.c          |    3 +++
 2 files changed, 7 insertions(+)
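
Note (illustration only, not part of the patch): the check described in
the changelog can be modelled in userspace roughly as below.  This is a
minimal sketch under stated assumptions: struct toy_mm, the toy_*
functions and the TOY_* result codes are hypothetical stand-ins, and
toy_mmget_still_valid() assumes the real helper amounts to testing that
no coredump is in progress on the mm.  It only shows the ordering the
patch introduces: take the lock for writing, check validity, bail out
before touching the pmd.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct mm_struct (not the kernel type). */
struct toy_mm {
	bool core_dumping;	/* models "a coredump is in progress on this mm" */
	bool write_locked;	/* models mmap_sem being held for writing */
	bool pmd_is_huge;	/* models the pmd that the collapse would change */
};

/* Hypothetical result codes, loosely mirroring the SCAN_* values in the diff. */
enum toy_result { TOY_SUCCEED, TOY_ANY_PROCESS };

/*
 * Assumed model of mmget_still_valid(): only meaningful once the write
 * lock is held, and false while a coredump is running.
 */
static bool toy_mmget_still_valid(const struct toy_mm *mm)
{
	assert(mm->write_locked);
	return !mm->core_dumping;
}

/* Model of the patched ordering in collapse_huge_page(). */
static enum toy_result toy_collapse(struct toy_mm *mm)
{
	enum toy_result result;

	mm->write_locked = true;		/* down_write(&mm->mmap_sem) */
	result = TOY_ANY_PROCESS;
	if (!toy_mmget_still_valid(mm))
		goto out;			/* coredump running: leave the pmd alone */

	mm->pmd_is_huge = true;			/* the collapse itself */
	result = TOY_SUCCEED;
out:
	mm->write_locked = false;		/* up_write(&mm->mmap_sem) */
	return result;
}
```

In this model, calling toy_collapse() on an mm with core_dumping set
returns TOY_ANY_PROCESS and leaves pmd_is_huge untouched, which is the
property the coredump's lockless page faults depend on.
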

--- a/include/linux/sched/mm.h
+++ b/include/linux/sched/mm.h
@@ -54,6 +54,10 @@ static inline void mmdrop(struct mm_stru
  * followed by taking the mmap_sem for writing before modifying the
  * vmas or anything the coredump pretends not to change from under it.
  *
+ * It also has to be called when mmgrab() is used in the context of
+ * the process, but then the mm_count refcount is transferred outside
+ * the context of the process to run down_write() on that pinned mm.
+ *
  * NOTE: find_extend_vma() called from GUP context is the only place
  * that can modify the "mm" (notably the vm_start/end) under mmap_sem
  * for reading and outside the context of the process, so it is also
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1004,6 +1004,9 @@ static void collapse_huge_page(struct mm
 	 * handled by the anon_vma lock + PG_lock.
 	 */
 	down_write(&mm->mmap_sem);
+	result = SCAN_ANY_PROCESS;
+	if (!mmget_still_valid(mm))
+		goto out;
 	result = hugepage_vma_revalidate(mm, address, &vma);
 	if (result)
 		goto out;
