linux-mm.kvack.org archive mirror
From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
To: Andrea Arcangeli <aarcange@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>,
	Hugh Dickins <hughd@google.com>,
	Wu Fengguang <fengguang.wu@intel.com>, Jan Kara <jack@suse.cz>,
	Mel Gorman <mgorman@suse.de>,
	linux-mm@kvack.org, Andi Kleen <ak@linux.intel.com>,
	Matthew Wilcox <matthew.r.wilcox@intel.com>,
	"Kirill A. Shutemov" <kirill@shutemov.name>,
	Hillf Danton <dhillf@gmail.com>, Dave Hansen <dave@sr71.net>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Subject: [PATCHv4 06/39] thp, mm: avoid PageUnevictable on active/inactive lru lists
Date: Sun, 12 May 2013 04:23:03 +0300
Message-ID: <1368321816-17719-7-git-send-email-kirill.shutemov@linux.intel.com>
In-Reply-To: <1368321816-17719-1-git-send-email-kirill.shutemov@linux.intel.com>

From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>

The active/inactive LRU lists can contain unevictable pages (e.g. ramfs
pages that were placed on the LRU lists when first allocated), but these
pages must not have PageUnevictable set. Otherwise LRU shrinking hits a
BUG in isolate_lru_pages(), here reached from shrink_inactive_list():

kernel BUG at /home/space/kas/git/public/linux-next/mm/vmscan.c:1122!
invalid opcode: 0000 [#1] SMP
CPU 0
Pid: 293, comm: kswapd0 Not tainted 3.8.0-rc6-next-20130202+ #531
RIP: 0010:[<ffffffff81110478>]  [<ffffffff81110478>] isolate_lru_pages.isra.61+0x138/0x260
RSP: 0000:ffff8800796d9b28  EFLAGS: 00010082
RAX: 00000000ffffffea RBX: 0000000000000012 RCX: 0000000000000001
RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffffea0001de8040
RBP: ffff8800796d9b88 R08: ffff8800796d9df0 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000012
R13: ffffea0001de8060 R14: ffffffff818818e8 R15: ffff8800796d9bf8
FS:  0000000000000000(0000) GS:ffff88007a200000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f1bfc108000 CR3: 000000000180b000 CR4: 00000000000406f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process kswapd0 (pid: 293, threadinfo ffff8800796d8000, task ffff880079e0a6e0)
Stack:
 ffff8800796d9b48 ffffffff81881880 ffff8800796d9df0 ffff8800796d9be0
 0000000000000002 000000000000001f ffff8800796d9b88 ffffffff818818c8
 ffffffff81881480 ffff8800796d9dc0 0000000000000002 000000000000001f
Call Trace:
 [<ffffffff81111e98>] shrink_inactive_list+0x108/0x4a0
 [<ffffffff8109ce3d>] ? trace_hardirqs_off+0xd/0x10
 [<ffffffff8107b8bf>] ? local_clock+0x4f/0x60
 [<ffffffff8110ff5d>] ? shrink_slab+0x1fd/0x4c0
 [<ffffffff811125a1>] shrink_zone+0x371/0x610
 [<ffffffff8110ff75>] ? shrink_slab+0x215/0x4c0
 [<ffffffff81112dfc>] kswapd+0x5bc/0xb60
 [<ffffffff81112840>] ? shrink_zone+0x610/0x610
 [<ffffffff81066676>] kthread+0xd6/0xe0
 [<ffffffff810665a0>] ? __kthread_bind+0x40/0x40
 [<ffffffff814fed6c>] ret_from_fork+0x7c/0xb0
 [<ffffffff810665a0>] ? __kthread_bind+0x40/0x40
Code: 1f 40 00 49 8b 45 08 49 8b 75 00 48 89 46 08 48 89 30 49 8b 06 4c 89 68 08 49 89 45 00 4d 89 75 08 4d 89 2e eb 9c 0f 1f 44 00 00 <0f> 0b 66 0f 1f 44 00 00 31 db 45 31 e4 eb 9b 0f 0b 0f 0b 65 48
RIP  [<ffffffff81110478>] isolate_lru_pages.isra.61+0x138/0x260
 RSP <ffff8800796d9b28>

For lru_add_page_tail(), this means we should not set PageUnevictable()
on a tail page unless we are sure it will go to LRU_UNEVICTABLE.
Let's just copy PG_active and PG_unevictable from the head page in
__split_huge_page_refcount(); this also simplifies lru_add_page_tail().
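
A minimal userspace sketch of the flag inheritance described above, not
the kernel code itself. The sk_/SK_ names and the bit positions are
hypothetical (the real definitions live in include/linux/page-flags.h),
and the assumption is that the tail page ORs in the head page's flags
under a mask, as the huge_memory.c hunk below suggests:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit positions; the kernel's real values differ. */
enum sk_pageflags {
	SK_PG_referenced,
	SK_PG_uptodate,
	SK_PG_dirty,
	SK_PG_active,
	SK_PG_unevictable,
	SK_PG_swapbacked,
	SK_PG_mlocked,
};

/* Stand-in for struct page: only the flags word matters here. */
struct sk_page {
	unsigned long flags;
};

/* Flags a tail page inherits from the head page once the patch is applied. */
#define SK_INHERIT_MASK \
	((1UL << SK_PG_referenced) | (1UL << SK_PG_swapbacked) | \
	 (1UL << SK_PG_mlocked) | (1UL << SK_PG_uptodate) | \
	 (1UL << SK_PG_active) | (1UL << SK_PG_unevictable))

/* Copy the masked head flags into the tail, then mark the tail dirty. */
static void sk_inherit_flags(const struct sk_page *head, struct sk_page *tail)
{
	assert(head && tail);
	tail->flags |= head->flags & SK_INHERIT_MASK;
	tail->flags |= 1UL << SK_PG_dirty;
}

/* Return true if the given flag bit is set on the page. */
static bool sk_test_flag(const struct sk_page *page, int bit)
{
	return (page->flags & (1UL << bit)) != 0;
}
```

With this model, a tail split from an active, evictable head ends up with
PG_active set and PG_unevictable clear, so it can never carry
PageUnevictable onto an active/inactive list by itself.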

This also fixes one more bug in lru_add_page_tail():
if page_evictable(page_tail) is false and PageLRU(page) is true, page_tail
goes to the same LRU list as page, but nothing syncs page_tail's
active/inactive state with page. So we can end up with an inactive page on
the active LRU list.
The patch fixes that as well, since we now copy PG_active from the head page.
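
To illustrate why copying the flags keeps list and page state in sync,
here is a small userspace model. All demo_ names are hypothetical, and
demo_page_lru() only approximates what page_lru() does for an anonymous
page (choose the list purely from the page's own flags):

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for the kernel's anon LRU lists. */
enum demo_lru {
	DEMO_LRU_INACTIVE_ANON,
	DEMO_LRU_ACTIVE_ANON,
	DEMO_LRU_UNEVICTABLE,
};

struct demo_page {
	bool active;		/* models PG_active */
	bool unevictable;	/* models PG_unevictable */
};

/* Pick the list from the page's own flags only. */
static enum demo_lru demo_page_lru(const struct demo_page *page)
{
	if (page->unevictable)
		return DEMO_LRU_UNEVICTABLE;
	return page->active ? DEMO_LRU_ACTIVE_ANON : DEMO_LRU_INACTIVE_ANON;
}

/* Model of the patched split: the tail inherits both flags from the head. */
static struct demo_page demo_split_tail(const struct demo_page *head)
{
	assert(head);
	struct demo_page tail = {
		.active = head->active,
		.unevictable = head->unevictable,
	};
	return tail;
}
```

Because the tail's flags are a copy of the head's, demo_page_lru() returns
the same list for both, whichever of the two ways the tail is queued.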

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
 mm/huge_memory.c |    4 +++-
 mm/swap.c        |   20 ++------------------
 2 files changed, 5 insertions(+), 19 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 03a89a2..b39fa01 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1612,7 +1612,9 @@ static void __split_huge_page_refcount(struct page *page,
 				     ((1L << PG_referenced) |
 				      (1L << PG_swapbacked) |
 				      (1L << PG_mlocked) |
-				      (1L << PG_uptodate)));
+				      (1L << PG_uptodate) |
+				      (1L << PG_active) |
+				      (1L << PG_unevictable)));
 		page_tail->flags |= (1L << PG_dirty);
 
 		/* clear PageTail before overwriting first_page */
diff --git a/mm/swap.c b/mm/swap.c
index acd40bf..9b0a64b 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -739,8 +739,6 @@ EXPORT_SYMBOL(__pagevec_release);
 void lru_add_page_tail(struct page *page, struct page *page_tail,
 		       struct lruvec *lruvec, struct list_head *list)
 {
-	int uninitialized_var(active);
-	enum lru_list lru;
 	const int file = 0;
 
 	VM_BUG_ON(!PageHead(page));
@@ -752,20 +750,6 @@ void lru_add_page_tail(struct page *page, struct page *page_tail,
 	if (!list)
 		SetPageLRU(page_tail);
 
-	if (page_evictable(page_tail)) {
-		if (PageActive(page)) {
-			SetPageActive(page_tail);
-			active = 1;
-			lru = LRU_ACTIVE_ANON;
-		} else {
-			active = 0;
-			lru = LRU_INACTIVE_ANON;
-		}
-	} else {
-		SetPageUnevictable(page_tail);
-		lru = LRU_UNEVICTABLE;
-	}
-
 	if (likely(PageLRU(page)))
 		list_add_tail(&page_tail->lru, &page->lru);
 	else if (list) {
@@ -781,13 +765,13 @@ void lru_add_page_tail(struct page *page, struct page *page_tail,
 		 * Use the standard add function to put page_tail on the list,
 		 * but then correct its position so they all end up in order.
 		 */
-		add_page_to_lru_list(page_tail, lruvec, lru);
+		add_page_to_lru_list(page_tail, lruvec, page_lru(page_tail));
 		list_head = page_tail->lru.prev;
 		list_move_tail(&page_tail->lru, list_head);
 	}
 
 	if (!PageUnevictable(page))
-		update_page_reclaim_stat(lruvec, file, active);
+		update_page_reclaim_stat(lruvec, file, PageActive(page_tail));
 }
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
 
-- 
1.7.10.4
