Date: Wed, 6 Dec 2023 14:01:02 -0500
From: Brian Foster
To: Steve Smith
Cc: linux-bcachefs@vger.kernel.org
Subject: Re: [bug]: fiemap returns zero extents for all files
X-Mailing-List: linux-bcachefs@vger.kernel.org

On Wed, Dec 06, 2023 at 10:26:14AM +1100, Steve Smith wrote:
> Hi Brian,
>
> Thanks for the suggestion. Yes, the ioctl call that xfs_io makes is
> basically identical to the one the test makes, except it has the SYNC
> flag set. Adding this flag (or calling fsync) results in expected
> behaviour (mapped_extents > 0).
>

Ah..

> I assume that the bcachefs fiemap impl ignores unwritten data and only
> responds with flushed extents? I'm not sure if this is correct; for
> comparison, XFS, Btrfs, zfs, and ext4, all return the extent map
> immediately after write without syncing.
>

Ok, thanks for digging into the difference. I can reproduce what you
observe by just removing the sync flag from the xfs_io command.

I know that XFS basically looks up extents in the in-core extent tree,
which will include things like delayed allocation (i.e. buffered writes
that have not yet been physically allocated and flushed to disk). It
looks like bcachefs fiemap just walks the extents btree for associated
inode keys. I suspect the buffered write path is just not updating the
tree in any way, which means fiemap won't see extents until dirty data
is flushed out. FWIW, this can also be observed by doing buffered
overwrites and observing that the underlying block range does not change
until the file data is flushed.

I'll have to read through the buffered write path to grok it more
clearly and maybe think about if/how this could be improved. For the
time being I'd suggest to consider bcachefs fiemap without
FIEMAP_FLAG_SYNC as unsupported if you expect to see most recent inode
state. The most obvious solution may very well be to flush
unconditionally anyways.

Thanks for reporting this behavior.

Brian

> You're right that bcachefs seems to split extents that would be
> contiguous on some other FSs, but that's fine here.
>
> Thanks,
> Steve
>
> On Wed, 6 Dec 2023 at 01:53, Brian Foster wrote:
> >
> > On Tue, Dec 05, 2023 at 04:26:40PM +1100, Steve Smith wrote:
> > > Hi,
> > >
> > > I'm doing some testing of xcp[1] against bcachefs, and having issues
> > > with fiemap. Using fiemap on both sparse and non-sparse files always
> > > returns `mapped_extents` of 0. The same tests on other extent-based
> > > FSs return non-zero extents, with consistent values. This is against
> > > -rc4 and master.
> > >
> > > I've broken the relevant tests (in Rust) out into a standalone crate
> > > if you want to reproduce:
> > >
> > > https://github.com/tarka/bcachefs-test
> > >
> > > The FS in question was created with a simple `bcachefs -L xcp-test /dev/sdh1`.
> > >
> >
> > Hi Steve,
> >
> > Well I'm not really familiar with xcp or your test harness here, but
> > have you checked the results you're seeing against known working tools?
> > For example, one of the subtests just seems to perform a simple write
> > followed by an fiemap check/assert for a single extent. That is roughly
> > equivalent to:
> >
> > # xfs_io -fc "pwrite 0 128k" -c "fiemap -v" ./file
> > wrote 131072/131072 bytes at offset 0
> > 128 KiB, 32 ops; 0.0011 sec (108.790 MiB/sec and 27850.3046 ops/sec)
> > ./file:
> >  EXT: FILE-OFFSET      BLOCK-RANGE        TOTAL FLAGS
> >    0: [0..127]:        339968..340095       128   0x0
> >    1: [128..255]:      340096..340223       128   0x1
> >
> > ... which mostly seems to DTRT. I have noticed in the past that bcachefs
> > fiemap seems to break extents into bucket size or some such segments,
> > regardless of contiguity, but this doesn't seem to be the issue you are
> > reporting here. I.e. an strace of the above shows the following for the
> > fiemap ioctl():
> >
> > ioctl(3, FS_IOC_FIEMAP, {fm_start=0, fm_length=18446744073709551615, fm_flags=FIEMAP_FLAG_SYNC, fm_extent_count=32} => {fm_flags=FIEMAP_FLAG_SYNC, fm_mapped_extents=2, ...}) = 0
> >
> > ... which shows fm_mapped_extents == 2. Hm?
> >
> > Brian
> >
> > > Cheers,
> > > Steve
> > >
> > > [1]: https://github.com/tarka/xcp
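
For readers who want to compare the two behaviours discussed in this
thread (fiemap with and without FIEMAP_FLAG_SYNC), here is a minimal
sketch in Python. It is not taken from xcp, xfs_io, or bcachefs. The
helper names (build_request, mapped_extents, count_extents) are
hypothetical, and the constants are hard-coded on the assumption that
they match <linux/fiemap.h> and <linux/fs.h> on an architecture using
the generic ioctl encoding (e.g. x86-64). Setting fm_extent_count to 0
asks the kernel only for the number of extents, which is the value the
thread calls mapped_extents.

```python
import fcntl
import os
import struct

# Assumed values, copied by hand from <linux/fs.h> and <linux/fiemap.h>.
# FS_IOC_FIEMAP is _IOWR('f', 11, struct fiemap) with the generic ioctl
# encoding (e.g. x86-64); other architectures may encode it differently.
FS_IOC_FIEMAP = 0xC020660B
FIEMAP_FLAG_SYNC = 0x00000001
FIEMAP_MAX_OFFSET = 0xFFFFFFFFFFFFFFFF

# struct fiemap header without the trailing extent array:
# u64 fm_start, u64 fm_length,
# u32 fm_flags, u32 fm_mapped_extents, u32 fm_extent_count, u32 fm_reserved
FIEMAP_HEADER = struct.Struct("=QQIIII")


def build_request(sync):
    """Build a fiemap request covering the whole file (hypothetical helper)."""
    flags = FIEMAP_FLAG_SYNC if sync else 0
    # fm_extent_count == 0: the kernel fills in fm_mapped_extents only.
    return bytearray(FIEMAP_HEADER.pack(0, FIEMAP_MAX_OFFSET, flags, 0, 0, 0))


def mapped_extents(buf):
    """Extract fm_mapped_extents from a fiemap header buffer."""
    return FIEMAP_HEADER.unpack(bytes(buf))[3]


def count_extents(path, sync=True):
    """Return the number of extents the filesystem reports for path."""
    fd = os.open(path, os.O_RDONLY)
    try:
        buf = build_request(sync)
        fcntl.ioctl(fd, FS_IOC_FIEMAP, buf)  # the kernel updates buf in place
        return mapped_extents(buf)
    finally:
        os.close(fd)
```

Going by the reports above, count_extents("./file", sync=False) called
right after a buffered write would be expected to return 0 on bcachefs,
while sync=True should match what the xfs_io command shows. The sketch
raises OSError if the filesystem does not support fiemap.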