From: Anders Blomdell <anders.blomdell@gmail.com>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-xfs@vger.kernel.org,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
Chandan Babu R <chandan.babu@oracle.com>,
"Darrick J. Wong" <djwong@kernel.org>,
Christoph Hellwig <hch@lst.de>
Subject: Re: XFS mount timeout in linux-6.9.11
Date: Sat, 10 Aug 2024 10:29:38 +0200
Message-ID: <252d91e2-282e-4af4-b99b-3b8147d98bc3@gmail.com>
In-Reply-To: <ZraeRdPmGXpbRM7V@dread.disaster.area>
On 2024-08-10 00:55, Dave Chinner wrote:
> On Fri, Aug 09, 2024 at 07:08:41PM +0200, Anders Blomdell wrote:
>> With a filesystem that contains a very large number of hardlinks
>> the time to mount the filesystem skyrockets to around 15 minutes
>> on 6.9.11-200.fc40.x86_64 as compared to around 1 second on
>> 6.8.10-300.fc40.x86_64,
>
> That sounds like the filesystem is not being cleanly unmounted on
> 6.9.11-200.fc40.x86_64 and so is having to run log recovery on the
> next mount and so is recovering lots of hardlink operations that
> weren't written back at unmount.
>
> Hence this smells like an unmount or OS shutdown process issue, not
> a mount issue. e.g. if something in the shutdown scripts hangs,
> systemd may time out the shutdown and power off/reboot the machine
> without completing the full shutdown process. The result of this is
> the filesystem has to perform recovery on the next mount and so you
> see a long mount time because of some other unrelated issue.
>
> What is the dmesg output for the mount operations? That will tell us
> if journal recovery is the difference for certain. Have you also
> checked to see what is happening in the shutdown/unmount process
> before the long mount times occur?
echo $(uname -r) $(date +%H:%M:%S) > /dev/kmsg
mount /dev/vg1/test /test
echo $(uname -r) $(date +%H:%M:%S) > /dev/kmsg
umount /test
echo $(uname -r) $(date +%H:%M:%S) > /dev/kmsg
mount /dev/vg1/test /test
echo $(uname -r) $(date +%H:%M:%S) > /dev/kmsg
[55581.470484] 6.8.0-rc4-00129-g14dd46cf31f4 09:17:20
[55581.492733] XFS (dm-7): Mounting V5 Filesystem e2159bbc-18fb-4d4b-a6c5-14c97b8e5380
[56048.292804] XFS (dm-7): Ending clean mount
[56516.433008] 6.8.0-rc4-00129-g14dd46cf31f4 09:32:55
[56516.434695] XFS (dm-7): Unmounting Filesystem e2159bbc-18fb-4d4b-a6c5-14c97b8e5380
[56516.925145] 6.8.0-rc4-00129-g14dd46cf31f4 09:32:56
[56517.039873] XFS (dm-7): Mounting V5 Filesystem e2159bbc-18fb-4d4b-a6c5-14c97b8e5380
[56986.017144] XFS (dm-7): Ending clean mount
[57454.876371] 6.8.0-rc4-00129-g14dd46cf31f4 09:48:34
And rebooting to the kernel before the offending commit:
[ 60.177951] 6.8.0-rc4-00128-g8541a7d9da2d 10:23:00
[ 61.009283] SGI XFS with ACLs, security attributes, realtime, scrub, quota, no debug enabled
[ 61.017422] XFS (dm-7): Mounting V5 Filesystem e2159bbc-18fb-4d4b-a6c5-14c97b8e5380
[ 61.351100] XFS (dm-7): Ending clean mount
[ 61.366359] 6.8.0-rc4-00128-g8541a7d9da2d 10:23:01
[ 61.367673] XFS (dm-7): Unmounting Filesystem e2159bbc-18fb-4d4b-a6c5-14c97b8e5380
[ 61.444552] 6.8.0-rc4-00128-g8541a7d9da2d 10:23:01
[ 61.459358] XFS (dm-7): Mounting V5 Filesystem e2159bbc-18fb-4d4b-a6c5-14c97b8e5380
[ 61.513938] XFS (dm-7): Ending clean mount
[ 61.524056] 6.8.0-rc4-00128-g8541a7d9da2d 10:23:01
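
For anyone who wants to turn the marker lines in the two excerpts above into per-command durations, here is a minimal sketch. It assumes each marker was written by the echo commands shown earlier, so its message ends in an HH:MM:SS wall-clock time; the function name marker_intervals is made up for illustration.

```shell
#!/bin/sh
# Print the seconds elapsed between consecutive marker lines in a dmesg
# excerpt read from stdin. Assumption: a marker is any line whose message
# ends in HH:MM:SS, as produced by
#   echo $(uname -r) $(date +%H:%M:%S) > /dev/kmsg
marker_intervals() {
    awk '
        {
            # Skip lines that do not start with a "[seconds.micros]" stamp.
            if (match($0, /^\[ *[0-9]+\.[0-9]+\]/) == 0) next
            ts  = substr($0, 2, RLENGTH - 2) + 0   # kernel timestamp in seconds
            msg = substr($0, RLENGTH + 2)          # text after the stamp
            # Only marker lines end in a wall-clock time such as 09:17:20.
            if (msg ~ /[0-9][0-9]:[0-9][0-9]:[0-9][0-9]$/) {
                if (seen) printf "%.2f\n", ts - prev
                prev = ts
                seen = 1
            }
        }
    '
}
```

Used as `dmesg | marker_intervals`, each output line is the time taken by the command issued between two markers. Fed the 14dd46cf31f4 excerpt, it gives roughly 935 s, 0.5 s and 938 s for mount, umount and mount; on 8541a7d9da2d all three intervals are a little over a second or less.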
>
>> this of course makes booting drop
>> into emergency mode if the filesystem is in /etc/fstab. A git bisect
>> nails the offending commit as 14dd46cf31f4aaffcf26b00de9af39d01ec8d547.
>
> Commit 14dd46cf31f4 ("xfs: split xfs_inobt_init_cursor") doesn't
> seem like a candidate for any sort of change of behaviour. It's just
> a refactoring patch that doesn't change any behaviour at all.
> Are you sure the reproducer you used for the bisect is reliable?
Yes.
>> The filesystem is a collection of daily snapshots of a live filesystem
>> collected over a number of years, organized as a storage of unique files,
>> that are reflinked to inodes that contain the actual {owner,group,permission,
>> mtime}, and these inodes are hardlinked into the daily snapshot trees.
>
> So it's reflinks and hardlinks. Recovering a reflink takes a lot
> more CPU time and journal traffic than recovering a hardlink, so
> that will also be a contributing factor.
>
>> The numbers for the filesystem are:
>>
>> Total file size: 3.6e+12 bytes
>
> 3.6TB, not a large data set by any measurement.
>
>> Unique files: 12.4e+06
>
> 12M files, not a lot.
>
>> Reflink inodes: 18.6e+06
>
> 18M inodes with shared extents, not a huge number, either.
>
>> Hardlinks: 15.7e+09
>
> Ok, 15.7 billion hardlinks is a *lot*.
:-)
>
> And by a lot, I mean that's the largest number of hardlinks in an
> XFS filesystem I've personally ever heard about in 20 years.
Glad to be of service.
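
To put that figure in proportion, here is a rough back-of-the-envelope sketch using only the numbers quoted above. It assumes, as the layout description says, that the hardlinks all point at the reflink inodes; the helper name ratio is made up for illustration.

```shell
#!/bin/sh
# Average of one quoted figure per unit of another, rounded to an integer.
# awk is used because the figures are given in exponent notation.
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.0f\n", a / b }'
}

# Hardlinks per reflink inode (15.7e+09 / 18.6e+06).
ratio 15.7e9 18.6e6

# Bytes per unique file (3.6e+12 / 12.4e+06).
ratio 3.6e12 12.4e6
```

That works out to an average in the region of 840 hardlinks per inode, and an average unique file of about 290 kB.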
>
> As a warning: hope like hell you never have a disaster with that
> storage and need to run xfs_repair on that filesystem. If you don't
> have many, many TBs of RAM, just checking the hardlinks resolve
> correctly could take billions of IOs...
I hope so as well :-), but it is not a critical system (used for testing
and statistics, will take about a month to rebuild though :-/).
>
> -Dave.