Subject: FAILED: patch "[PATCH] mm: migrate: don't rely on __PageMovable() of newpage after" failed to apply to 4.4-stable tree
To: david@redhat.com, aarcange@redhat.com, akpm@linux-foundation.org,
    aquini@redhat.com, jack@suse.cz, k.khlebnikov@samsung.com,
    kirill.shutemov@linux.intel.com, linux@dominikbrodowski.net,
    mgorman@techsingularity.net, mhocko@suse.com, minchan@kernel.org,
    n-horiguchi@ah.jp.nec.com, stable@vger.kernel.org,
    torvalds@linux-foundation.org, vbendel@redhat.com, willy@infradead.org
Cc:
From:
Date: Mon, 04 Feb 2019 07:02:31 +0100
Message-ID: <154926015133103@kroah.com>
X-Mailing-List: stable@vger.kernel.org

The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to .

thanks,

greg k-h

------------------ original commit in Linus's tree ------------------

From e0a352fabce61f730341d119fbedf71ffdb8663f Mon Sep 17 00:00:00 2001
From: David Hildenbrand
Date: Fri, 1 Feb 2019 14:21:19 -0800
Subject: [PATCH] mm: migrate: don't rely on __PageMovable() of newpage after
 unlocking it

We had a race in the old balloon compaction code before b1123ea6d3b3
("mm: balloon: use general non-lru movable page feature") refactored it
that became visible after backporting 195a8c43e93d ("virtio-balloon:
deflate via a page list") without the refactoring.

The bug existed from commit d6d86c0a7f8d ("mm/balloon_compaction:
redesign ballooned pages management") till b1123ea6d3b3 ("mm: balloon:
use general non-lru movable page feature"). d6d86c0a7f8d
("mm/balloon_compaction: redesign ballooned pages management") was
backported to 3.12, so the broken kernels are stable kernels [3.12 -
4.7].

There was a subtle race between dropping the page lock of the newpage in
__unmap_and_move() and checking for __is_movable_balloon_page(newpage).

Just after dropping this page lock, virtio-balloon could go ahead and
deflate the newpage, effectively dequeueing it and clearing PageBalloon,
in turn making __is_movable_balloon_page(newpage) fail.

This resulted in dropping the reference of the newpage via
putback_lru_page(newpage) instead of put_page(newpage), leading to
page->lru getting modified and a !LRU page ending up in the LRU lists.
With 195a8c43e93d ("virtio-balloon: deflate via a page list")
backported, one would suddenly get corrupted lists in
release_pages_balloon():

- WARNING: CPU: 13 PID: 6586 at lib/list_debug.c:59 __list_del_entry+0xa1/0xd0
- list_del corruption. prev->next should be ffffe253961090a0, but was dead000000000100

Nowadays this race is no longer possible, but it is hidden behind very
ugly handling of __ClearPageMovable() and __PageMovable().

__ClearPageMovable() will not make __PageMovable() fail, only
PageMovable(). So the new check (__PageMovable(newpage)) will still
hold even after newpage was dequeued by virtio-balloon.

If anybody would ever change that special handling, the BUG would be
introduced again. So instead, make it explicit and use the information
of the original isolated page before migration.

This patch can be backported fairly easy to stable kernels (in contrast
to the refactoring).

Link: http://lkml.kernel.org/r/20190129233217.10747-1-david@redhat.com
Fixes: d6d86c0a7f8d ("mm/balloon_compaction: redesign ballooned pages management")
Signed-off-by: David Hildenbrand
Reported-by: Vratislav Bendel
Acked-by: Michal Hocko
Acked-by: Rafael Aquini
Cc: Mel Gorman
Cc: "Kirill A. Shutemov"
Cc: Michal Hocko
Cc: Naoya Horiguchi
Cc: Jan Kara
Cc: Andrea Arcangeli
Cc: Dominik Brodowski
Cc: Matthew Wilcox
Cc: Vratislav Bendel
Cc: Rafael Aquini
Cc: Konstantin Khlebnikov
Cc: Minchan Kim
Cc: [3.12 - 4.7]
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

diff --git a/mm/migrate.c b/mm/migrate.c
index 712b231a7376..d4fd680be3b0 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1130,10 +1130,13 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
 	 * If migration is successful, decrease refcount of the newpage
 	 * which will not free the page because new page owner increased
 	 * refcounter. As well, if it is LRU page, add the page to LRU
-	 * list in here.
+	 * list in here. Use the old state of the isolated source page to
+	 * determine if we migrated a LRU page. newpage was already unlocked
+	 * and possibly modified by its owner - don't rely on the page
+	 * state.
 	 */
 	if (rc == MIGRATEPAGE_SUCCESS) {
-		if (unlikely(__PageMovable(newpage)))
+		if (unlikely(!is_lru))
 			put_page(newpage);
 		else
 			putback_lru_page(newpage);
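
For anyone preparing a backport, the idea behind the hunk above can be
sketched in plain userspace C. This is only an illustration of the
pattern (decide from state recorded before the unlock, not from state
re-read afterwards), not kernel code. Everything prefixed toy_ is
hypothetical, the struct is not the kernel's struct page, and the sketch
assumes that is_lru was recorded from the isolated source page before
migration, as the new comment in the diff describes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a page; NOT the kernel's struct page. */
struct toy_page {
	bool movable;	/* flag the page owner may clear once the page is unlocked */
	int refcount;
	bool on_lru;	/* set when the page is handed to the (toy) LRU list */
};

/* Models the owner (e.g. a balloon driver) dequeueing the page after unlock. */
static void toy_owner_dequeue(struct toy_page *p)
{
	p->movable = false;
}

static void toy_put_page(struct toy_page *p)
{
	p->refcount--;
}

static void toy_putback_lru_page(struct toy_page *p)
{
	p->on_lru = true;
	p->refcount--;
}

/* Racy variant: re-reads newpage state that the owner may have changed. */
static void toy_finish_racy(struct toy_page *newpage)
{
	assert(newpage != NULL);
	if (newpage->movable)
		toy_put_page(newpage);
	else
		toy_putback_lru_page(newpage);
}

/* Fixed variant: relies only on the value recorded before migration. */
static void toy_finish_fixed(struct toy_page *newpage, bool is_lru)
{
	assert(newpage != NULL);
	if (!is_lru)
		toy_put_page(newpage);
	else
		toy_putback_lru_page(newpage);
}

/* Returns true if a non-LRU page wrongly ended up on the toy LRU list. */
static bool toy_run(bool use_fix)
{
	struct toy_page src = { .movable = true, .refcount = 1, .on_lru = false };
	struct toy_page dst = { .movable = true, .refcount = 2, .on_lru = false };
	bool is_lru = !src.movable;	/* recorded from the source page up front */

	toy_owner_dequeue(&dst);	/* race window: after unlock, before the check */

	if (use_fix)
		toy_finish_fixed(&dst, is_lru);
	else
		toy_finish_racy(&dst);
	return dst.on_lru;
}
```

In this toy model, toy_run(false) reproduces the misrouting described in
the commit message (the non-LRU page goes through the LRU putback path),
while toy_run(true) does not, because the decision no longer depends on
anything the owner can change after the unlock.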