Date: Sun, 21 Feb 2021 17:15:37 +0000
From: Matthew Wilcox
To: Erik Jensen
Cc: Qu Wenruo, Linux FS Devel, "linux-btrfs@vger.kernel.org"
Subject: Re: page->index limitation on 32bit system?
Message-ID: <20210221171537.GG2858050@casper.infradead.org>
References: <1783f16d-7a28-80e6-4c32-fdf19b705ed0@gmx.com> <20210218121503.GQ2858050@casper.infradead.org> <20210218133954.GR2858050@casper.infradead.org> <20210220232224.GF2858050@casper.infradead.org>

On Sat, Feb 20, 2021 at 04:01:17PM -0800, Erik Jensen wrote:
> On Sat, Feb 20, 2021 at 3:23 PM Matthew Wilcox wrote:
> > On Sat, Feb 20, 2021 at 03:02:26PM -0800, Erik Jensen wrote:
> > > Out of curiosity, would it be at all feasible to use 64-bits for the page
> > > offset *without* changing XArray, perhaps by indexing by the lower 32-bits,
> > > and evicting the page that's there if the top bits don't match (vaguely like
> > > how the CPU cache works)? Or, if there are cases where a page can't be
> > > evicted (I don't know if this can ever happen), use chaining?
> > >
> > > I would expect index contention to be extremely uncommon, and it could only
> > > happen for inodes larger than 16 TiB, which can't be used at all today. I
> > > don't know how many data structures store page offsets today, but it seems
> > > like this should significantly reduce the performance impact versus upping
> > > XArray to 64-bit indexes.
> >
> > Again, you're asking for significant development work for a dying
> > platform.
>
> Depending on how complex it would be, I'm not unwilling to give it a
> go myself, but I admittedly have no kernel development experience or
> knowledge of how locking works around the page cache.
> E.g., I have no idea if evicting the old page at an index before
> bringing in a new one is even possible without causing deadlocks right
> and left.

I wouldn't recommend the page cache as the ideal place to start learning
how to hack on the kernel. Not only is it complex, it affects almost
everything.

What might work is using "auxiliary" inodes for btrfs's special purpose.
Allocate an array of inodes and use inodes[index / (ULONG_MAX + 1)] and
look up the page at index % (ULONG_MAX + 1).
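
To make the arithmetic concrete, here is a minimal userspace sketch of that
index split. It is not btrfs or page cache code: the names pgoff32_t,
AUX_SPAN, struct aux_pos, aux_split() and aux_join() are hypothetical, and
uint32_t stands in for a 32-bit unsigned long. One detail matters in
practice: ULONG_MAX + 1 wraps to 0 when evaluated as unsigned long, so the
span has to be computed in a 64-bit type.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: models unsigned long (the page index type) on a 32-bit kernel. */
typedef uint32_t pgoff32_t;

/* ULONG_MAX + 1 for the 32-bit target, widened to 64 bits so it cannot wrap. */
#define AUX_SPAN ((uint64_t)UINT32_MAX + 1)

/* Hypothetical result type: which auxiliary inode, and where inside it. */
struct aux_pos {
	uint64_t slot;   /* position in the inode array: index / (ULONG_MAX + 1) */
	pgoff32_t index; /* page index within that inode: index % (ULONG_MAX + 1) */
};

/* Split a 64-bit page index into an inode slot and a 32-bit page index. */
static struct aux_pos aux_split(uint64_t index)
{
	struct aux_pos pos = {
		.slot = index / AUX_SPAN,
		.index = (pgoff32_t)(index % AUX_SPAN),
	};
	return pos;
}

/* Inverse of aux_split(): rebuild the 64-bit page index. */
static uint64_t aux_join(struct aux_pos pos)
{
	/* The slot must fit in the upper 32 bits of the result. */
	assert(pos.slot <= UINT32_MAX);
	return pos.slot * AUX_SPAN + pos.index;
}
```

With this split, every page index below 2^32 lands in slot 0 unchanged, so
only offsets past the current 32-bit limit would ever touch a second inode.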