From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Hugh Dickins <hughd@google.com>,
Christoph Lameter <cl@linux.com>,
"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
Rik van Riel <riel@redhat.com>, Vlastimil Babka <vbabka@suse.cz>,
Davidlohr Bueso <dave@stgolabs.net>,
Oleg Nesterov <oleg@redhat.com>,
Sasha Levin <sasha.levin@oracle.com>,
Dmitry Vyukov <dvyukov@google.com>,
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
Andrew Morton <akpm@linux-foundation.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
"Charles (Chas) Williams" <ciwillia@brocade.com>
Subject: [PATCH 3.14 02/29] mm: migrate dirty page without clear_page_dirty_for_io etc
Date: Sun, 14 Aug 2016 22:07:30 +0200 [thread overview]
Message-ID: <20160814200731.521604418@linuxfoundation.org> (raw)
In-Reply-To: <20160814200731.375346059@linuxfoundation.org>
3.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hugh Dickins <hughd@google.com>
commit 42cb14b110a5698ccf26ce59c4441722605a3743 upstream.
clear_page_dirty_for_io() has accumulated writeback and memcg subtleties
since v2.6.16 first introduced page migration; and the set_page_dirty()
which completed its migration of PageDirty, later had to be moderated to
__set_page_dirty_nobuffers(); then PageSwapBacked had to skip that too.
No actual problems seen with this procedure recently, but if you look into
what the clear_page_dirty_for_io(page)+set_page_dirty(newpage) is actually
achieving, it turns out to be nothing more than moving the PageDirty flag,
and its NR_FILE_DIRTY stat from one zone to another.
It would be good to avoid a pile of irrelevant decrementations and
incrementations, and improper event counting, and unnecessary descent of
the radix_tree under tree_lock (to set the PAGECACHE_TAG_DIRTY which
radix_tree_replace_slot() left in place anyway).
Do the NR_FILE_DIRTY movement, like the other stats movements, while
interrupts still disabled in migrate_page_move_mapping(); and don't even
bother if the zone is the same. Do the PageDirty movement there under
tree_lock too, where old page is frozen and newpage not yet visible:
bearing in mind that as soon as newpage becomes visible in radix_tree, an
un-page-locked set_page_dirty() might interfere (or perhaps that's just
not possible: anything doing so should already hold an additional
reference to the old page, preventing its migration; but play safe).
But we do still need to transfer PageDirty in migrate_page_copy(), for
those who don't go the mapping route through migrate_page_move_mapping().
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[ciwillia@brocade.com: backported to 3.14: adjusted context]
Signed-off-by: Charles (Chas) Williams <ciwillia@brocade.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
mm/migrate.c | 51 +++++++++++++++++++++++++++++++--------------------
1 file changed, 31 insertions(+), 20 deletions(-)
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -30,6 +30,7 @@
#include <linux/mempolicy.h>
#include <linux/vmalloc.h>
#include <linux/security.h>
+#include <linux/backing-dev.h>
#include <linux/memcontrol.h>
#include <linux/syscalls.h>
#include <linux/hugetlb.h>
@@ -344,6 +345,8 @@ int migrate_page_move_mapping(struct add
struct buffer_head *head, enum migrate_mode mode,
int extra_count)
{
+ struct zone *oldzone, *newzone;
+ int dirty;
int expected_count = 1 + extra_count;
void **pslot;
@@ -354,6 +357,9 @@ int migrate_page_move_mapping(struct add
return MIGRATEPAGE_SUCCESS;
}
+ oldzone = page_zone(page);
+ newzone = page_zone(newpage);
+
spin_lock_irq(&mapping->tree_lock);
pslot = radix_tree_lookup_slot(&mapping->page_tree,
@@ -394,6 +400,13 @@ int migrate_page_move_mapping(struct add
set_page_private(newpage, page_private(page));
}
+ /* Move dirty while page refs frozen and newpage not yet exposed */
+ dirty = PageDirty(page);
+ if (dirty) {
+ ClearPageDirty(page);
+ SetPageDirty(newpage);
+ }
+
radix_tree_replace_slot(pslot, newpage);
/*
@@ -403,6 +416,9 @@ int migrate_page_move_mapping(struct add
*/
page_unfreeze_refs(page, expected_count - 1);
+ spin_unlock(&mapping->tree_lock);
+ /* Leave irq disabled to prevent preemption while updating stats */
+
/*
* If moved to a different zone then also account
* the page for that zone. Other VM counters will be
@@ -413,13 +429,19 @@ int migrate_page_move_mapping(struct add
* via NR_FILE_PAGES and NR_ANON_PAGES if they
* are mapped to swap space.
*/
- __dec_zone_page_state(page, NR_FILE_PAGES);
- __inc_zone_page_state(newpage, NR_FILE_PAGES);
- if (!PageSwapCache(page) && PageSwapBacked(page)) {
- __dec_zone_page_state(page, NR_SHMEM);
- __inc_zone_page_state(newpage, NR_SHMEM);
+ if (newzone != oldzone) {
+ __dec_zone_state(oldzone, NR_FILE_PAGES);
+ __inc_zone_state(newzone, NR_FILE_PAGES);
+ if (PageSwapBacked(page) && !PageSwapCache(page)) {
+ __dec_zone_state(oldzone, NR_SHMEM);
+ __inc_zone_state(newzone, NR_SHMEM);
+ }
+ if (dirty && mapping_cap_account_dirty(mapping)) {
+ __dec_zone_state(oldzone, NR_FILE_DIRTY);
+ __inc_zone_state(newzone, NR_FILE_DIRTY);
+ }
}
- spin_unlock_irq(&mapping->tree_lock);
+ local_irq_enable();
return MIGRATEPAGE_SUCCESS;
}
@@ -544,20 +566,9 @@ void migrate_page_copy(struct page *newp
if (PageMappedToDisk(page))
SetPageMappedToDisk(newpage);
- if (PageDirty(page)) {
- clear_page_dirty_for_io(page);
- /*
- * Want to mark the page and the radix tree as dirty, and
- * redo the accounting that clear_page_dirty_for_io undid,
- * but we can't use set_page_dirty because that function
- * is actually a signal that all of the page has become dirty.
- * Whereas only part of our page may be dirty.
- */
- if (PageSwapBacked(page))
- SetPageDirty(newpage);
- else
- __set_page_dirty_nobuffers(newpage);
- }
+ /* Move dirty on pages not done by migrate_page_move_mapping() */
+ if (PageDirty(page))
+ SetPageDirty(newpage);
/*
* Copy NUMA information to the new page, to prevent over-eager
next prev parent reply other threads:[~2016-08-14 20:08 UTC|newest]
Thread overview: 35+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <CGME20160814200812uscas1p1ef0170d47bedbb472ff4f71fa6e71b1c@uscas1p1.samsung.com>
2016-08-14 20:07 ` [PATCH 3.14 00/29] 3.14.76-stable review Greg Kroah-Hartman
2016-08-14 20:07 ` [PATCH 3.14 01/29] USB: fix invalid memory access in hub_activate() Greg Kroah-Hartman
2016-08-14 20:07 ` Greg Kroah-Hartman [this message]
2016-08-14 20:07 ` [PATCH 3.14 03/29] printk: do cond_resched() between lines while outputting to consoles Greg Kroah-Hartman
2016-08-14 20:07 ` [PATCH 3.14 04/29] x86/mm: Add barriers and document switch_mm()-vs-flush synchronization Greg Kroah-Hartman
2016-08-14 20:07 ` Greg Kroah-Hartman
2016-08-14 20:07 ` [PATCH 3.14 05/29] sctp: Prevent soft lockup when sctp_accept() is called during a timeout event Greg Kroah-Hartman
2016-08-14 20:07 ` [PATCH 3.14 06/29] x86/mm: Improve switch_mm() barrier comments Greg Kroah-Hartman
2016-08-14 20:07 ` [PATCH 3.14 08/29] USB: fix up incorrect quirk Greg Kroah-Hartman
2016-08-14 20:07 ` [PATCH 3.14 09/29] arm: oabi compat: add missing access checks Greg Kroah-Hartman
2016-08-14 20:07 ` [PATCH 3.14 10/29] KEYS: 64-bit MIPS needs to use compat_sys_keyctl for 32-bit userspace Greg Kroah-Hartman
2016-08-14 20:07 ` [PATCH 3.14 11/29] apparmor: fix ref count leak when profile sha1 hash is read Greg Kroah-Hartman
2016-08-14 20:07 ` [PATCH 3.14 12/29] random: strengthen input validation for RNDADDTOENTCNT Greg Kroah-Hartman
2016-08-14 20:07 ` [PATCH 3.14 13/29] scsi: remove scsi_end_request Greg Kroah-Hartman
2016-08-14 20:07 ` [PATCH 3.14 14/29] scsi_lib: correctly retry failed zero length REQ_TYPE_FS commands Greg Kroah-Hartman
2016-08-14 20:07 ` [PATCH 3.14 15/29] IB/security: Restrict use of the write() interface Greg Kroah-Hartman
2016-08-14 20:07 ` [PATCH 3.14 16/29] block: fix use-after-free in seq file Greg Kroah-Hartman
2016-08-14 20:07 ` [PATCH 3.14 17/29] sysv, ipc: fix security-layer leaking Greg Kroah-Hartman
2016-08-21 11:49 ` Willy Tarreau
2016-08-29 9:23 ` Manfred Spraul
2016-08-29 11:49 ` Willy Tarreau
2016-08-14 20:07 ` [PATCH 3.14 18/29] fuse: fix wrong assignment of ->flags in fuse_send_init() Greg Kroah-Hartman
2016-08-14 20:07 ` [PATCH 3.14 19/29] crypto: gcm - Filter out async ghash if necessary Greg Kroah-Hartman
2016-08-14 20:07 ` [PATCH 3.14 20/29] crypto: scatterwalk - Fix test in scatterwalk_done Greg Kroah-Hartman
2016-08-14 20:07 ` [PATCH 3.14 21/29] ext4: check for extents that wrap around Greg Kroah-Hartman
2016-08-14 20:07 ` [PATCH 3.14 22/29] ext4: fix deadlock during page writeback Greg Kroah-Hartman
2016-08-14 20:07 ` [PATCH 3.14 23/29] ext4: dont call ext4_should_journal_data() on the journal inode Greg Kroah-Hartman
2016-08-14 20:07 ` [PATCH 3.14 24/29] ext4: short-cut orphan cleanup on error Greg Kroah-Hartman
2016-08-14 20:07 ` [PATCH 3.14 25/29] bonding: set carrier off for devices created through netlink Greg Kroah-Hartman
2016-08-14 20:07 ` [PATCH 3.14 26/29] net/irda: fix NULL pointer dereference on memory allocation failure Greg Kroah-Hartman
2016-08-14 20:07 ` [PATCH 3.14 27/29] tcp: consider recv buf for the initial window scale Greg Kroah-Hartman
2016-08-14 20:07 ` [PATCH 3.14 28/29] [PATCH 1/8] tcp: make challenge acks less predictable Greg Kroah-Hartman
2016-08-14 20:07 ` [PATCH 3.14 29/29] ext4: fix reference counting bug on block allocation error Greg Kroah-Hartman
2016-08-15 14:49 ` [PATCH 3.14 00/29] 3.14.76-stable review Guenter Roeck
2016-08-16 4:01 ` Shuah Khan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20160814200731.521604418@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=akpm@linux-foundation.org \
--cc=ciwillia@brocade.com \
--cc=cl@linux.com \
--cc=dave@stgolabs.net \
--cc=dvyukov@google.com \
--cc=hughd@google.com \
--cc=kirill.shutemov@linux.intel.com \
--cc=kosaki.motohiro@jp.fujitsu.com \
--cc=linux-kernel@vger.kernel.org \
--cc=oleg@redhat.com \
--cc=riel@redhat.com \
--cc=sasha.levin@oracle.com \
--cc=stable@vger.kernel.org \
--cc=torvalds@linux-foundation.org \
--cc=vbabka@suse.cz \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.