Date: Fri, 22 Dec 2023 13:29:17 +0000
From: Matthew Wilcox <willy@infradead.org>
To: Hannes Reinecke
Cc: Viacheslav Dubeyko , Bart Van Assche , lsf-pc@lists.linuxfoundation.org,
	linux-mm@kvack.org, linux-block@vger.kernel.org, linux-scsi@vger.kernel.org,
	"linux-nvme@lists.infradead.org"
Subject: Re: [LSF/MM/BPF TOPIC] Large block for I/O
References: <7970ad75-ca6a-34b9-43ea-c6f67fe6eae6@iogearbox.net>
	<4343d07b-b1b2-d43b-c201-a48e89145e5c@iogearbox.net>
	<03ebbc5f-2ff5-4f3c-8c5b-544413c55257@suse.de>
	<5c356222-fe9e-41b0-b7fe-218fbcde4573@acm.org>
	<4f03e599-2772-4eb3-afb2-efa788eb08c4@suse.de>
In-Reply-To: <4f03e599-2772-4eb3-afb2-efa788eb08c4@suse.de>

On Fri, Dec 22, 2023 at 01:29:18PM +0100, Hannes Reinecke wrote:
> And that is actually a very valid point; memory fragmentation will
> become an issue with larger block sizes.
>
> Theoretically it should be quite easily solved; just switch the memory
> subsystem to use the largest block size in the system, and run every
> smaller memory allocation via SLUB (or whatever the allocator-of-the-day
> currently is :-). Then trivially the system will never be fragmented,
> and I/O can always use large folios.
>
> However, that means to do away with alloc_page(), which is still in
> widespread use throughout the kernel. I would actually be in favour of
> it, but it might be that mm people have a different view.
>
> Matthew, worth a new topic?
> Handling memory fragmentation on large block I/O systems?

I think if we're going to do that as a topic (and I'm not opposed!), we
need data.  Various workloads, various block sizes, etc.  Right now
people discuss this topic with "feelings" and "intuition" and I think we
need more than vibes to have a productive discussion.
My laptop (rebooted last night due to an unfortunate upgrade that left
anything accessing the sound device hanging ...):

MemTotal:       16006344 kB
MemFree:         2353108 kB
Cached:          7957552 kB
AnonPages:       4271088 kB
Slab:             654896 kB

so ~50% of my 16GB of memory is in the page cache and ~25% is anon
memory.  If the page cache is all in 16kB chunks and we need to allocate
order-2 folios in order to read from a file, we can find it easily by
reclaiming other order-2 folios from the page cache.  We don't need to
resort to heroics like eliminating use of alloc_page().

We should eliminate use of alloc_page() across most of the kernel, but
that's a different topic and one that has not much relevance to LSF/MM
since it's drivers that need to change, not the MM ;-)

Now, other people "feel" differently.  And that's cool, but we're not
going to have a productive discussion without data that shows whose
feelings represent reality and for which kinds of workloads.
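[Editorial note: the percentages above come straight from /proc/meminfo
(Cached divided by MemTotal is just under 50%; AnonPages divided by MemTotal
is about a quarter), and the related question of whether the buddy allocator
can still hand out order-2 blocks (16kB, assuming a 4kB base page) under a
given workload can be read from /proc/buddyinfo. The sketch below is a
minimal userspace illustration of that measurement, not a tool referenced in
this thread; the program itself and its output format are assumptions.]

	/*
	 * Snapshot the numbers discussed above: what fraction of RAM is page
	 * cache vs. anonymous memory, and how many free order-2 (16kB) blocks
	 * the buddy allocator currently has, summed across zones.
	 */
	#include <stdio.h>
	#include <stdlib.h>
	#include <string.h>

	/* Return the value of @field from /proc/meminfo, in kB, or -1. */
	static long meminfo_kb(const char *field)
	{
		char line[256];
		long val = -1;
		size_t len = strlen(field);
		FILE *f = fopen("/proc/meminfo", "r");

		if (!f)
			return -1;
		while (fgets(line, sizeof(line), f)) {
			if (!strncmp(line, field, len) && line[len] == ':') {
				val = atol(line + len + 1);	/* reported in kB */
				break;
			}
		}
		fclose(f);
		return val;
	}

	/* Sum free blocks of @order across all zones in /proc/buddyinfo. */
	static long free_blocks(int order)
	{
		char line[512];
		long total = 0;
		FILE *f = fopen("/proc/buddyinfo", "r");

		if (!f)
			return -1;
		while (fgets(line, sizeof(line), f)) {
			/* "Node 0, zone   Normal   <order0> <order1> ..." */
			char *p = strstr(line, "zone");
			char *tok;
			if (!p)
				continue;
			p += strlen("zone");
			tok = strtok(p, " \t\n");	/* zone name */
			for (int i = 0; tok && i <= order; i++)
				tok = strtok(NULL, " \t\n");
			if (tok)
				total += atol(tok);
		}
		fclose(f);
		return total;
	}

	int main(void)
	{
		long total = meminfo_kb("MemTotal");
		long cached = meminfo_kb("Cached");
		long anon = meminfo_kb("AnonPages");

		if (total <= 0 || cached < 0 || anon < 0)
			return 1;

		/* e.g. 7957552/16006344 is just under 50%,
		 * 4271088/16006344 is about a quarter. */
		printf("page cache: %5.1f%% of RAM\n", 100.0 * cached / total);
		printf("anon:       %5.1f%% of RAM\n", 100.0 * anon / total);
		printf("free 16kB (order-2) blocks: %ld\n", free_blocks(2));
		return 0;
	}

[Run periodically while a workload churns the page cache, a snapshot like
this gives the kind of per-workload data the mail asks for: whether order-2
blocks remain available once the cache is held in 16kB folios.]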