Date: Thu, 28 May 2026 22:36:39 -0700
From: Christoph Hellwig
To: Jaegeuk Kim
Cc: Christoph Hellwig, Bart Van Assche, Theodore Tso, linux-api@vger.kernel.org, linux-kernel@vger.kernel.org, Matthew Wilcox, linux-f2fs-devel@lists.sourceforge.net, linux-mm@kvack.org, Akilesh Kailash, linux-fsdevel@vger.kernel.org, Christian Brauner
Subject: Re: [f2fs-dev] [PATCH v2] f2fs: another way to set large folio by remembering inode number
References: <20260521155748.GA79343@macsyma-wired.lan> <20260522141115.GA8258@macsyma-wired.lan> <20260522224108.GA18663@macsyma-wired.lan>

On Wed, May 27, 2026 at 03:59:35PM +0000, Jaegeuk Kim wrote:
> F2FS merges bios before submit_bio, regardless of small or large folios,
> since the block addresses are consecutive. So, I think IO subsystem was
> working in full speed.

As does every other remotely modern file system. But that merging is
surprisingly expensive, which is why using large folios gives really
major performance improvements. For one, doing these checks to merge
touches quite a few cache lines.
Second, devices are often a lot more efficient if they see fewer SGL
entries, i.e. a 1MB bio with a single SGL entry tends to work better
than one with 256 of them. The same is true in the kernel code itself,
both in the submission path (DMA mapping and co), and even more so in
the page cache handling, both before submitting and in the completion
path.

See Bart's patch about how long the walk of the bio_vecs in the f2fs
completion path can take. We had similar issues in XFS even in the
workqueue completion path due to lack of rescheduling, and these simply
go away when you do the folio manipulation in larger chunks
(LAZY_PREEMPT would avoid the need for explicit rescheduling these days,
but that just papers over the symptoms in this case).
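
To put rough numbers on the "1 vs 256" comparison above, here is a small userspace sketch of the arithmetic. It is not kernel code: nr_segments() is a made-up helper, the 4KiB base page size is an assumption, and it takes the best case where every folio in the I/O is physically contiguous and naturally aligned, so that one folio maps to one segment.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not a kernel API): number of segments needed to
 * describe io_bytes of data when each segment can cover at most
 * seg_bytes, i.e. one segment per folio in the best case.
 */
static size_t nr_segments(size_t io_bytes, size_t seg_bytes)
{
	assert(seg_bytes > 0);	/* guard against division by zero */

	/* Round up so a trailing partial folio still counts as a segment. */
	return (io_bytes + seg_bytes - 1) / seg_bytes;
}

/*
 * How many fewer per-segment steps (merge checks, bio_vec walks on
 * completion) the large-folio case has compared to the small-folio one.
 */
static size_t segments_saved(size_t io_bytes, size_t small, size_t large)
{
	return nr_segments(io_bytes, small) - nr_segments(io_bytes, large);
}
```

With an assumed 4KiB page size, nr_segments(1 << 20, 4096) is 256, while a single 1MiB folio gives nr_segments(1 << 20, 1 << 20) == 1, so every per-segment step in the submission and completion paths runs once instead of 256 times.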