Date: Mon, 10 Jan 2022 15:04:54 -0700
From: Yu Zhao
To: Michal Hocko
Cc: Andrew Morton, Linus Torvalds, Andi Kleen, Catalin Marinas,
    Dave Hansen, Hillf Danton, Jens Axboe, Jesse Barnes, Johannes Weiner,
    Jonathan Corbet, Matthew Wilcox, Mel Gorman, Michael Larabel,
    Rik van Riel, Vlastimil Babka, Will Deacon, Ying Huang,
    linux-arm-kernel@lists.infradead.org, linux-doc@vger.kernel.org,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    page-reclaim@google.com, x86@kernel.org
Subject: Re: [PATCH v6 0/9] Multigenerational LRU Framework
References: <20220104202227.2903605-1-yuzhao@google.com>

On Mon, Jan 10, 2022 at 04:39:51PM +0100, Michal Hocko wrote:
> On Fri 07-01-22 11:45:40, Yu Zhao wrote:
> [...]
> > Next, I argue that the benefits of this patchset outweigh its risks,
> > because, drawing from my past experience,
> > 1. There have been many larger and/or riskier patchsets taken; I'll
> >    assemble a list if you disagree.
>
> No question about that. Changes in the reclaim path are paved with
> failures and reverts and fine tuning on top of existing fine tuning.
> The difference from your patchset is that they tend to be much much
> smaller and go incremental and therefore easier to review.

No argument here.

> >    And this patchset is fully guarded
> >    by #ifdef; Linus has also assessed on this point.
>
> I appreciate you made the new behavior an opt-in and therefore existing
> workloads are less likely to regress. I do not think ifdefs help
> all that much, though, because a) realistically the config will
> likely be enabled for most distribution kernels

There is also a runtime kill switch.

> b) the parallel
> reclaim implementation adds a maintenance overhead regardless of those
> ifdef.
> The later point is especially worrying because the memory reclaim
> is a complex and hard to review beast already. Any future changes would
> need to consider both reclaim algorithms of course.

A perfectly legitimate concern. If this patchset is taken:
1. There will be refactoring that makes long-term maintenance as
   affordable as possible, i.e., similar to the SL.B model, but with
   the ability to switch at runtime.
2. There will also be optimizations for mmu notifier (KVM), THP, etc.
3. Most importantly, Google will be committing more resources to this.
   And that's why we need to hear a decision -- our resource planning
   depends on it.

> Hence I argue we really need a wider consensus this is the right
> direction we want to pursue.

We've been doing our best to get this consensus -- we invited all the
stakeholders to meetings a long time ago -- but unfortunately we
couldn't move the needle.

I agree consensus is important. But, IMO, progress is even more
important. And personally, I'd rather try something wrong than do
nothing.

> > 2. There have been none that came with the testing/benchmarking
> >    coverage as this one did. Please point me to some if I'm mistaken,
> >    and I'll gladly match them.
>
> I do appreciate your numbers but you should realize that this is an area
> that is really hard to get any conclusive testing for.

Fully agreed. That's why we started a new initiative, and we hope more
people will follow these practices:
1. All results in this area should be reported with at least standard
   deviations, or preferably confidence intervals.
2. Real applications should be benchmarked (with synthetic load
   generators), not just synthetic benchmarks (not real applications).
3. A wide range of devices should be covered, i.e., servers, desktops,
   laptops and phones.

I can say with confidence that our benchmark reports were held to the
highest standards. We have worked with MariaDB (company), EnterpriseDB
(Postgres), Redis (company), etc. on these reports.
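As an illustration of the first practice listed above (standard
deviations or, preferably, confidence intervals), here is a minimal
Python sketch. It is not taken from the benchmark reports; the function
name `summarize` and the run values are hypothetical, and the interval
assumes independent, roughly normally distributed runs.

```python
import math
import statistics

# Two-sided 95% critical values of Student's t for 1..10 degrees of freedom.
T95 = [12.706, 4.303, 3.182, 2.776, 2.571, 2.447, 2.365, 2.306, 2.262, 2.228]


def summarize(samples):
    """Return (mean, sample standard deviation, 95% CI half-width)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two runs to estimate variability")
    mean = statistics.fmean(samples)
    stdev = statistics.stdev(samples)  # sample (n - 1) standard deviation
    df = n - 1
    # Use the t table for small samples; beyond it, fall back to the normal
    # quantile (about 1.96), which is slightly too narrow for moderate n.
    if df <= len(T95):
        crit = T95[df - 1]
    else:
        crit = statistics.NormalDist().inv_cdf(0.975)
    return mean, stdev, crit * stdev / math.sqrt(n)


# Hypothetical throughput numbers (ops/s) from five runs of one configuration.
runs = [1012.0, 998.0, 1005.0, 1021.0, 989.0]
mean, stdev, half = summarize(runs)
print(f"{mean:.1f} ops/s, stdev {stdev:.1f}, "
      f"95% CI [{mean - half:.1f}, {mean + half:.1f}]")
```

Reporting the interval alongside the mean lets a reader judge whether a
difference between two kernels is larger than the run-to-run noise.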
The companies above have copies of these reports (PDF version):
https://linux-mm.googlesource.com/benchmarks/

We welcome any expert in those applications to examine our reports,
and we'll be happy to run any other benchmarks, or the same benchmarks
with different configurations, that anybody thinks are important and
that we've missed.

> We keep learning
> about fallouts on workloads we haven't really anticipated or where the
> runtime effects happen to disagree with our intuition. So while those
> numbers are nice there are other important aspects to consider like the
> maintenance cost for example.

I assume we agree this is not an easy decision. Can I also assume we
agree that this decision should be made within a reasonable time
frame?

> > The numbers might not materialize in the real world; the code is not
> > perfect; and many other risks... But all the top eight open source
> > memory hogs were covered, which is unprecedented; memcached and fio
> > showed significant improvements and it only takes a few commands to
> > see for yourselves.
> >
> > Regarding the acks and the reviewed-bys, I certainly can ask people
> > who have reaped the benefits of this patchset to do them, if it's
> > required. But I see less fun in that. I prefer to provide empirical
> > evidence and convince people who are on the other side of the aisle.
>
> I like to hear from users who benefit from your work and that certainly
> gives more credit to it. But it will be the MM community to maintain the
> code and address future issues.

I'll ask downstream kernel maintainers (from different distros) that
have taken this patchset to ACK.

I'll ask credible testers who are professionals, researchers, and
contributors to other subsystems to provide Tested-by's.

There are many other individual testers whose efforts I may not be
able to acknowledge, e.g., my coworker just sent this to me:

  "Using that v5 for some time and confirm that difference under heavy
  load and memory pressure is significant."
  https://www.phoronix.com/forums/forum/software/general-linux-open-source/1301258-mglru-is-a-very-enticing-enhancement-for-linux-in-2022#post1301275

I'll leave the reviews in your capable hands. As I said, I prefer to
convince people with empirical evidence.

> We do not have a dedicated maintainer for the memory reclaim but
> certainly there are people who have helped shaping the existing code and
> have learned a lot from the past issues - like Johannes, Rik, Mel just
> to name few. If I were you I would be really looking into finding an
> agreement with them. I myself can help you with memcg and oom side of
> the things (we already have discussions about those).

Unfortunately people have different priorities. As I said, we tried to
get all the stakeholders in the same (conference) room so that we
could make some good progress. But we failed. Rest assured, we'll keep
trying. But please understand we need to do cost control and therefore
we can't keep investing in this effort forever.

So I think it's not unreasonable, after I've addressed all pending
comments, to ask for some clear instructions from the leadership:
  Yes
  No
  Or something specific

Thanks!