From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 21 Mar 2019 16:37:08 -0600
From: Keith Busch
To: Zi Yan
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 linux-nvdimm@lists.01.org, Dave Hansen, Dan Williams,
 "Kirill A. Shutemov", John Hubbard, Michal Hocko, David Nellans
Subject: Re: [PATCH 0/5] Page demotion for memory reclaim
Message-ID: <20190321223706.GA29817@localhost.localdomain>
References: <20190321200157.29678-1-keith.busch@intel.com>
 <5B5EFBC2-2979-4B9F-A43A-1A14F16ACCE1@nvidia.com>
In-Reply-To: <5B5EFBC2-2979-4B9F-A43A-1A14F16ACCE1@nvidia.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
User-Agent: Mutt/1.9.1 (2017-09-22)

On Thu, Mar 21, 2019 at 02:20:51PM -0700, Zi Yan wrote:
> 1. The name of “page demotion” seems confusing to me, since I thought it was about large pages
> demote to small pages as opposite to promoting small pages to THPs. Am I the only
> one here?

If you have a THP, we'll skip the page migration and fall through to
split_huge_page_to_list(); the smaller pages can then be considered,
migrated, and reclaimed individually. Not that we couldn't try to migrate
a THP directly. It was just a simpler implementation for this first attempt.

> 2. For the demotion path, a common case would be from high-performance memory, like HBM
> or Multi-Channel DRAM, to DRAM, then to PMEM, and finally to disks, right? More general
> case for demotion path would be derived from the memory performance description from HMAT[1],
> right? Do you have any algorithm to form such a path from HMAT?

Yes, I have a PoC for the kernel setting up a demotion path based on
HMAT properties here:

  https://git.kernel.org/pub/scm/linux/kernel/git/kbusch/linux.git/commit/?h=mm-migrate&id=4d007659e1dd1b0dad49514348be4441fbe7cadb

The above is just from an experimental branch.

> 3. Do you have a plan for promoting pages from lower-level memory to higher-level memory,
> like from PMEM to DRAM? Will this one-way demotion make all pages sink to PMEM and disk?

Promoting previously demoted pages would require the application to do
something to make that happen if you turn demotion on with this series.
Kernel auto-promotion is still being investigated, and it's a little
trickier than reclaim.

If a page sinks to disk, though, the next access behaves the same as it
did before this series.

> 4. In your patch 3, you created a new method migrate_demote_mapping() to migrate pages to
> other memory node, is there any problem of reusing existing migrate_pages() interface?

Yes, we may not want to migrate every page that shrink_page_list() is
given. We might want to keep a page, so we have to do those checks first.
At the point we know we want to attempt migration, the page is already
locked and not on a list, so it is just easier to directly invoke the
new __unmap_and_move_locked() that migrate_pages() eventually also calls.

> 5. In addition, you only migrate base pages, is there any performance concern on migrating THPs?
> Is it too costly to migrate THPs?

It was just easier to consider single pages first, so we let a THP split
if possible. I'm not sure of the cost of migrating THPs directly.
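
For readers following along, the flow described in the answers above can be
modelled roughly in userspace. This is only a toy sketch, not code from the
series: DEMOTION_TARGET, Page, and reclaim_page() are hypothetical names, the
two-node topology (node 0 as DRAM, node 1 as PMEM) is an assumption, and the
order of the "keep" and THP checks is a guess made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical demotion map: node id -> next slower node.
# None marks the end of the path, where ordinary reclaim takes over.
# Assumed topology: node 0 is DRAM, node 1 is PMEM.
DEMOTION_TARGET: Dict[int, Optional[int]] = {0: 1, 1: None}

# A 2 MiB THP covers 512 base pages of 4 KiB (x86-64 sizes).
BASE_PAGES_PER_THP = 512


@dataclass
class Page:
    node: int
    is_thp: bool = False
    keep: bool = False  # stands in for the checks reclaim performs first


def reclaim_page(page: Page) -> List[str]:
    """Return the actions a toy reclaim pass takes for one page."""
    if page.keep:
        # Reclaim decided to keep this page, so no migration is attempted.
        return ["keep"]
    if page.is_thp:
        # The model never migrates a THP directly: split it, then treat
        # every resulting base page on its own.
        actions = ["split"]
        for _ in range(BASE_PAGES_PER_THP):
            actions += reclaim_page(Page(node=page.node))
        return actions
    target = DEMOTION_TARGET.get(page.node)
    if target is None:
        # End of the demotion path: same behaviour as without demotion.
        return ["reclaim"]
    page.node = target
    return [f"demote->{target}"]
```

In this model, a base page on node 0 is demoted to node 1, a base page on
node 1 is reclaimed as usual, and nothing ever moves a page back up, which
matches the one-way behaviour discussed under question 3.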