From: Kamal Mostafa <kamal@canonical.com>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org,
	kernel-team@lists.ubuntu.com
Cc: Mel Gorman <mgorman@suse.de>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Srikar Dronamraju <srikar@linux.vnet.ibm.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@kernel.org>,
	Kamal Mostafa <kamal@canonical.com>
Subject: [PATCH 3.8 74/91] mm: Wait for THP migrations to complete during NUMA hinting faults
Date: Thu, 7 Nov 2013 18:15:29 -0800
Message-ID: <1383876946-2396-75-git-send-email-kamal@canonical.com>
In-Reply-To: <1383876946-2396-1-git-send-email-kamal@canonical.com>
3.8.13.13 -stable review patch. If anyone has any objections, please let me know.
------------------
From: Mel Gorman <mgorman@suse.de>
commit 42836f5f8baa33085f547098b74aa98991ee9216 upstream.
The locking for migrating THP is unusual. While normal page migration
prevents parallel accesses using a migration PTE, THP migration relies on
a combination of the page_table_lock, the page lock and the existence of
the NUMA hinting PTE to guarantee safety, but there is a bug in the scheme.

If a THP page is currently being migrated and another thread traps a
fault on the same page, it checks if the page is misplaced. If it is not,
then pmd_numa is cleared. The problem is that it checks if the page is
misplaced without holding the page lock, meaning that the racing thread
can be migrating the THP when the second thread clears the NUMA bit
and faults a stale page.

This patch checks if the page is potentially being migrated and, if so,
stalls using lock_page before checking whether the page is misplaced.
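
The ordering described above (try the page lock without sleeping, and only
on contention drop the outer lock, block, retake it and revalidate) can be
sketched in userspace. This is an illustration under stated assumptions, not
the kernel code: pthread mutexes stand in for page_table_lock and the page
lock, struct fake_page, misplaced() and hinting_fault() are hypothetical
names, and the generation counter is an assumed substitute for the recheck
the kernel does after retaking page_table_lock.

```c
#include <assert.h>
#include <pthread.h>

#define FAULT_RETRY (-2)	/* hypothetical: mapping changed, retry the fault */

/* Hypothetical stand-in for a huge page; not a kernel structure. */
struct fake_page {
	pthread_mutex_t lock;	/* plays the role of the page lock */
	int nid;		/* node the data currently lives on */
	int generation;		/* a migrator would bump this under table_lock */
};

/* Plays the role of mm->page_table_lock. */
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;

static void fake_page_init(struct fake_page *page, int nid)
{
	pthread_mutex_init(&page->lock, NULL);
	page->nid = nid;
	page->generation = 0;
}

/* Returns -1 if the page is on the wanted node, else the target node. */
static int misplaced(const struct fake_page *page, int wanted_nid)
{
	return page->nid == wanted_nid ? -1 : wanted_nid;
}

/*
 * Returns -1 if the page is well placed, the target node if it should be
 * migrated, or FAULT_RETRY if the page changed while we slept on its lock.
 */
static int hinting_fault(struct fake_page *page, int wanted_nid)
{
	int seen, target;

	assert(page != NULL);
	pthread_mutex_lock(&table_lock);
	seen = page->generation;

	if (pthread_mutex_trylock(&page->lock) != 0) {
		/* Contended: never sleep on the inner lock with the outer held. */
		pthread_mutex_unlock(&table_lock);
		pthread_mutex_lock(&page->lock);
		pthread_mutex_lock(&table_lock);
		if (page->generation != seen) {
			pthread_mutex_unlock(&page->lock);
			pthread_mutex_unlock(&table_lock);
			return FAULT_RETRY;
		}
	}

	/* Placement is only judged once the page lock is held. */
	target = misplaced(page, wanted_nid);
	pthread_mutex_unlock(&page->lock);
	pthread_mutex_unlock(&table_lock);
	return target;
}
```

The point of the sketch is the order of operations: the placement check
moves behind the lock acquisition, and the uncontended case never gives up
the outer lock.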
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1381141781-10992-6-git-send-email-mgorman@suse.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
---
mm/huge_memory.c | 23 ++++++++++++++++-------
1 file changed, 16 insertions(+), 7 deletions(-)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index a057a7d..f3868de 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1315,13 +1315,14 @@ int do_huge_pmd_numa_page(struct mm_struct *mm, struct vm_area_struct *vma,
 	if (current_nid == numa_node_id())
 		count_vm_numa_event(NUMA_HINT_FAULTS_LOCAL);
 
-	target_nid = mpol_misplaced(page, vma, haddr);
-	if (target_nid == -1) {
-		put_page(page);
-		goto clear_pmdnuma;
-	}
+	/*
+	 * Acquire the page lock to serialise THP migrations but avoid dropping
+	 * page_table_lock if at all possible
+	 */
+	if (trylock_page(page))
+		goto got_lock;
 
-	/* Acquire the page lock to serialise THP migrations */
+	/* Serialise against migration and check placement */
 	spin_unlock(&mm->page_table_lock);
 	lock_page(page);
 	page_locked = true;
@@ -1333,9 +1334,17 @@ int do_huge_pmd_numa_page(struct mm_struct *mm, struct vm_area_struct *vma,
 		put_page(page);
 		goto out_unlock;
 	}
-	spin_unlock(&mm->page_table_lock);
+
+got_lock:
+	target_nid = mpol_misplaced(page, vma, haddr);
+	if (target_nid == -1) {
+		unlock_page(page);
+		put_page(page);
+		goto clear_pmdnuma;
+	}
 
 	/* Migrate the THP to the requested node */
+	spin_unlock(&mm->page_table_lock);
 	migrated = migrate_misplaced_transhuge_page(mm, vma,
 				pmdp, pmd, addr,
 				page, target_nid);
--
1.8.1.2