public inbox for fstests@vger.kernel.org
 help / color / mirror / Atom feed
From: Qu Wenruo <wqu@suse.com>
To: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>,
	"fstests@vger.kernel.org" <fstests@vger.kernel.org>
Cc: "linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>,
	"glaubitz@physik.fu-berlin.de" <glaubitz@physik.fu-berlin.de>,
	"frank.li@vivo.com" <frank.li@vivo.com>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>
Subject: Re: [RFC] Why generic/073 is generic but not btrfs specific?
Date: Fri, 7 Nov 2025 08:02:23 +1030	[thread overview]
Message-ID: <a43fd07d-88e6-473d-a0be-3ba3203785e6@suse.com> (raw)
In-Reply-To: <06b369cd4fdf2dfb1cfe0b43640dbe6b05be368a.camel@ibm.com>



在 2025/11/7 07:41, Viacheslav Dubeyko 写道:
> On Fri, 2025-11-07 at 07:28 +1030, Qu Wenruo wrote:
>>
>> 在 2025/11/7 07:10, Viacheslav Dubeyko 写道:
>>> Hello,
>>>
>>> Running generic/073 for the case of HFS+ finishes with volume corruption:
>>>
>>> sudo ./check generic/073
>>> FSTYP -- hfsplus
>>> PLATFORM -- Linux/x86_64 hfsplus-testing-0001 6.17.0-rc1+ #4 SMP PREEMPT_DYNAMIC
>>> Wed Oct 1 15:02:44 PDT 2025
>>> MKFS_OPTIONS -- /dev/loop51
>>> MOUNT_OPTIONS -- /dev/loop51 /mnt/scratch
>>>
>>> generic/073 _check_generic_filesystem: filesystem on /dev/loop51 is inconsistent
>>> (see XFSTESTS-2/xfstests-dev/results//generic/073.full for details)
>>>
>>> Ran: generic/073
>>> Failures: generic/073
>>> Failed 1 of 1 tests
>>>
>>> sudo fsck.hfsplus -d /dev/loop51
>>> ** /dev/loop51
>>> Using cacheBlockSize=32K cacheTotalBlock=1024 cacheSize=32768K.
>>> Executing fsck_hfs (version 540.1-Linux).
>>> ** Checking non-journaled HFS Plus Volume.
>>> The volume name is untitled
>>> ** Checking extents overflow file.
>>> ** Checking catalog file.
>>> ** Checking multi-linked files.
>>> ** Checking catalog hierarchy.
>>> Invalid directory item count
>>> (It should be 1 instead of 0)
>>> ** Checking extended attributes file.
>>> ** Checking volume bitmap.
>>> ** Checking volume information.
>>> Verify Status: VIStat = 0x0000, ABTStat = 0x0000 EBTStat = 0x0000
>>> CBTStat = 0x0000 CatStat = 0x00004000
>>> ** Repairing volume.
>>> ** Rechecking volume.
>>> ** Checking non-journaled HFS Plus Volume.
>>> The volume name is untitled
>>> ** Checking extents overflow file.
>>> ** Checking catalog file.
>>> ** Checking multi-linked files.
>>> ** Checking catalog hierarchy.
>>> ** Checking extended attributes file.
>>> ** Checking volume bitmap.
>>> ** Checking volume information.
>>> ** The volume untitled was repaired successfully.
>>>
>>> Initially, I considered that something is wrong with HFS+ driver logic. But
>>> after testing and debugging the issue, I believe that HFS+ logic is correct.
>>>
>>> As far as I can see, the generic/073 is checking specific btrfs related case:
>>>
>>> # Test file A fsync after moving one other unrelated file B between directories
>>> # and fsyncing B's old parent directory before fsyncing the file A. Check that
>>> # after a crash all the file A data we fsynced is available.
>>> #
>>> # This test is motivated by an issue discovered in btrfs which caused the file
>>> # data to be lost (despite fsync returning success to user space). That btrfs
>>> # bug was fixed by the following linux kernel patch:
>>> #
>>> #   Btrfs: fix data loss in the fast fsync path
>>>
>>> The test is doing these steps on final phase:
>>>
>>> mv $SCRATCH_MNT/testdir_1/bar $SCRATCH_MNT/testdir_2/bar
>>> $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/testdir_1
>>> $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/foo
>>>
>>> So, we move file bar from testdir_1 into testdir_2 folder. It means that HFS+
>>> logic decrements the number of entries in testdir_1 and increments number of
>>> entries in testdir_2. Finally, we do fsync only for testdir_1 and foo but not
>>> for testdir_2.
>>
>> If the fs is using journal, shouldn't the increments on the testdir_2
>> already be journaled? Thus after a power loss, the increments on
>> testdir_2/bar should be replayed thus the end user should still see that
>> inode.
>>
> 
> Technically speaking, HFS+ is journaling file system in Apple implementation.
> But we don't have this functionality implemented and fully supported on Linux
> kernel side. Potentially, it can be done but currently we haven't such
> functionality yet. So, HFS/HFS+ doesn't use journaling on Linux kernel side  and
> no journal replay could happen. :)

That's fine, btrfs doesn't support (traditional) journal either, since 
its metadata is already fully COW protected.

The journal-like part is called log-tree, which is only utilized for 
speeding up fsync() so no full fs sync is needed (which is super 
expensive for btrfs)

That's the reason why btrfs' fsync() is super complex and Filipe spent 
tons of his time on this field, and it is btrfs that exposed this problem.


But if HFS+ on linux doesn't support metadata journal, I'm afraid you 
will need to go the hard path just like btrfs, to manually check if an 
inode needs its parent inodes updated during fsync.
> 
>> To me this looks like a bug in HFS logic where something is not properly
>> journaled (the increment on the testdir_2/bar).
>>
>>
> 
> This statement could be correct if we will support journaling for HFS+. But HFS
> never supported the journaling technology.
> 
>> Finally, if you're asking an end user that if it is acceptable that
>> after moving an inode and fsync the old directory, the inode is no
>> longer reachable, I'm pretty sure no end user will think it's acceptable.
>>
> 
> As far as I can see, the fsck only corrects the number of entries for testdir_2
> folder. The rest is pretty OK.

So it means the nlink number on the child inode is correct. I guess 
since testdir_2 is not the directory where fsync is called, thus its 
number of references is not updated.

With journal it will be much easier, but without it, you have to iterate 
through the changed inode backrefs, and update the involved directories 
one by one during fsync(), just like btrfs...

> 
>>> As a result, this is the reason why fsck.hfsplus detects the
>>> volume corruption afterwards. As far as I can see, the HFS+ driver behavior is
>>> completely correct and nothing needs to be done for fixing in HFS+ logic here.
>>
>> Then I guess you may also want to ask why EXT4/XFS/Btrfs/F2fs all pass
>> the test case.
>>
> 
> So, ext4 and xfs support journaling. Maybe, it's interesting to check what is
> going on with F2FS, FAT, and other file system that hasn't journaling support.

IIRC F2FS also support journals.

For VFAT the test case is rejected due to the lack of journal support.

On the other hand, if HFS+ doesn't support journal at all, you may want 
to add hfs/hfs+ to _has_metadata_journaling(), so that the test case 
will be skipped.

But still, since this test case already leads to fsck error reports, it 
is worthy some improvement to avoid such problem.

Thanks,
Qu

> 
> Thanks,
> Slava.


  reply	other threads:[~2025-11-06 21:32 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-11-06 20:40 [RFC] Why generic/073 is generic but not btrfs specific? Viacheslav Dubeyko
2025-11-06 20:58 ` Qu Wenruo
2025-11-06 21:11   ` Viacheslav Dubeyko
2025-11-06 21:32     ` Qu Wenruo [this message]
2025-11-06 22:29       ` Viacheslav Dubeyko
2025-11-08 14:01         ` Theodore Ts'o
2025-11-10 19:41           ` Viacheslav Dubeyko
2025-11-11  1:53             ` Zorro Lang
2025-11-11  2:02               ` Viacheslav Dubeyko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a43fd07d-88e6-473d-a0be-3ba3203785e6@suse.com \
    --to=wqu@suse.com \
    --cc=Slava.Dubeyko@ibm.com \
    --cc=frank.li@vivo.com \
    --cc=fstests@vger.kernel.org \
    --cc=glaubitz@physik.fu-berlin.de \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox