Date: Sat, 27 Jan 2024 18:43:37 +0000
From: Matthew Wilcox
To: Kent Overstreet
Cc: lsf-pc@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-block@vger.kernel.org, linux-ide@vger.kernel.org, linux-scsi@vger.kernel.org, linux-nvme@lists.infradead.org, bpf@vger.kernel.org
Subject: Re: [LSF/MM/BPF TOPIC] State Of The Page

On Sat, Jan 27, 2024 at 12:57:45PM -0500, Kent Overstreet wrote:
> On Fri, Jan 19, 2024 at 04:24:29PM +0000, Matthew Wilcox wrote:
> > - What are we going to do about bio_vecs?
>
> For bios and biovecs, I think it's important to keep in mind the
> distinction between the code that owns and submits the bio, and the
> consumer underneath.
>
> The code underneath could just as easily work with pfns, and the code
> above got those pages from somewhere else, so it doesn't _need_ the bio
> for access to those pages/folios (it would be a lot of refactoring
> though).
>
> But I've been thinking about going in a different direction - what if we
> unified iov_iter and bio? We've got ~3 different scatter-gather types
> that an IO passes through down the stack, and it would be lovely if we
> could get it down to just one; e.g. for DIO, pinning pages right at the
> copy_from_user boundary.

Yes, but ...

One of the things that Xen can do and Linux can't is I/O to/from memory
that doesn't have an associated struct page. We have all kinds of hacks
in place to get around that right now, and I'd like to remove those.

Since we want that kind of memory (let's take, e.g., GPU memory as an
example) to be mappable to userspace, and we want to be able to do DIO
to that memory, that points us to using a non-page-based structure right
from the start. Yes, if it happens to be backed by pages we need to 'pin'
them in some way (I'd like to get away from per-page or even per-folio
pinning, but we'll see about that), but the data structure that we use
to represent that memory as it moves through the I/O subsystem needs to
be physical address based.

So my 40,000 foot view is that we do something like get_user_phyrs()
at the start of DIO and pass the phyr to the filesystem; the filesystem
then passes one or more phyrs to the block layer, and the block layer
gives the phyrs to the driver, which DMA maps them.

Yes, the IO completion path (for buffered IO) needs to figure out which
folios are described by this phyr, but that's a phys_to_folio() call
away.
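
To make the shape of that concrete, here is a rough userspace sketch in C,
not kernel code. Everything in it is an assumption made for illustration:
the layout of struct phyr (a physical start address plus a length), the
fixed FOLIO_SIZE, and the helpers phyr_split(), phys_to_folio_index() and
phyr_folio_count(). It models only two steps of the flow above: a
filesystem cutting one physical range into smaller ranges for the layer
below, and the completion path working out how many folios a range covers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: a physically contiguous byte range, not tied to struct page. */
struct phyr {
	uint64_t addr;	/* physical start address */
	size_t len;	/* length in bytes */
};

/* Toy model only: every folio is FOLIO_SIZE bytes and naturally aligned. */
#define FOLIO_SIZE 4096u

/*
 * Split 'src' into pieces of at most 'max_len' bytes, as a filesystem might
 * before handing ranges to the block layer. Stores at most 'cap' pieces in
 * 'out' and returns the number of pieces the split requires.
 */
static size_t phyr_split(struct phyr src, size_t max_len,
			 struct phyr *out, size_t cap)
{
	size_t n = 0;

	assert(max_len > 0);
	while (src.len > 0) {
		size_t chunk = src.len < max_len ? src.len : max_len;

		if (n < cap)
			out[n] = (struct phyr){ .addr = src.addr, .len = chunk };
		n++;
		src.addr += chunk;
		src.len -= chunk;
	}
	return n;
}

/* Toy stand-in for phys_to_folio(): index of the folio containing 'addr'. */
static uint64_t phys_to_folio_index(uint64_t addr)
{
	return addr / FOLIO_SIZE;
}

/* Number of folios touched by 'r' (0 for an empty range). */
static uint64_t phyr_folio_count(struct phyr r)
{
	if (r.len == 0)
		return 0;
	return phys_to_folio_index(r.addr + r.len - 1)
	       - phys_to_folio_index(r.addr) + 1;
}
```

In this toy model the range is described purely by address and length the
whole way down, and folios only come into it at completion time, via a
lookup from the physical address. Real folios vary in size, so the actual
lookup would have to walk the range folio by folio rather than divide by a
constant.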