From: bugzilla-daemon@kernel.org
To: linux-xfs@vger.kernel.org
Subject: [Bug 216343] XFS: no space left in xlog cause system hang
Date: Sun, 14 Aug 2022 23:54:51 +0000
Message-ID: <bug-216343-201763-jdxdDvUKTL@https.bugzilla.kernel.org/>
In-Reply-To: <bug-216343-201763@https.bugzilla.kernel.org/>
https://bugzilla.kernel.org/show_bug.cgi?id=216343
--- Comment #1 from Dave Chinner (david@fromorbit.com) ---
[cc Amir, the 5.10 stable XFS maintainer]
On Tue, Aug 09, 2022 at 11:46:23AM +0000, bugzilla-daemon@kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=216343
>
> Bug ID: 216343
> Summary: XFS: no space left in xlog cause system hang
> Product: File System
> Version: 2.5
> Kernel Version: 5.10.38
> Hardware: ARM
> OS: Linux
> Tree: Mainline
> Status: NEW
> Severity: normal
> Priority: P1
> Component: XFS
> Assignee: filesystem_xfs@kernel-bugs.kernel.org
> Reporter: zhoukete@126.com
> Regression: No
>
> Created attachment 301539
> --> https://bugzilla.kernel.org/attachment.cgi?id=301539&action=edit
> stack
>
> 1. Cannot log in over ssh; the system hung and nothing could be done.
> 2. dmesg reports 'audit: audit_backlog=41349 > audit_backlog_limit=8192'.
> 3. I sent sysrq-crash and got a vmcore file. I don't know how to reproduce it.
>
> Following is my analysis from the vmcore:
>
> The reason the tty cannot log in is that pid 2021571 holds the acct_process
> mutex, and 2021571 cannot release the mutex because it is waiting for xlog
> to release space. See the stack info in the attachment stack.txt.
>
> So I tried to figure out what happened to xlog.
>
> crash> struct xfs_ail.ail_target_prev,ail_target,ail_head 0xffff00ff884f1000
> ail_target_prev = 0xe9200058600
> ail_target = 0xe9200058600
> ail_head = {
> next = 0xffff0340999a0a80,
> prev = 0xffff020013c66b40
> }
>
> There are 112 log items in the AIL list:
> crash> list 0xffff0340999a0a80 | wc -l
> 112
>
> 79 of them are xlog_inode_item
> 30 of them are xlog_buf_item
>
> crash> xfs_log_item.li_flags,li_lsn 0xffff0340999a0a80 -x
> li_flags = 0x1
> li_lsn = 0xe910005cc00 ===> first item lsn
>
> crash> xfs_log_item.li_flags,li_lsn ffff020013c66b40 -x
> li_flags = 0x1
> li_lsn = 0xe9200058600 ===> last item lsn
>
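For readers following the LSN values above: an XFS LSN packs a cycle number in the upper 32 bits and a block number in the lower 32 bits (the kernel's CYCLE_LSN() and BLOCK_LSN() macros). The Python sketch below splits the two li_lsn values from the report. The helper names split_lsn and wrap_gap are hypothetical, and the gap calculation assumes the AIL spans exactly one cycle wrap.

```python
def split_lsn(lsn):
    """Split a 64-bit XFS LSN into (cycle, block)."""
    return lsn >> 32, lsn & 0xFFFFFFFF


def wrap_gap(first_lsn, last_lsn):
    """Blocks NOT covered between the first and last AIL item.

    Only meaningful when the last item is exactly one cycle ahead of the
    first and at a lower block number (assumption); otherwise returns None.
    """
    first_cycle, first_block = split_lsn(first_lsn)
    last_cycle, last_block = split_lsn(last_lsn)
    if last_cycle == first_cycle + 1 and last_block < first_block:
        return first_block - last_block
    return None


FIRST_ITEM_LSN = 0xe910005cc00  # li_lsn of the first AIL item in the report
LAST_ITEM_LSN = 0xe9200058600   # li_lsn of the last AIL item (== ail_target)

for label, lsn in (("first", FIRST_ITEM_LSN), ("last", LAST_ITEM_LSN)):
    cycle, block = split_lsn(lsn)
    print(f"{label}: cycle={cycle:#x} block={block:#x}")

print(f"uncovered blocks: {wrap_gap(FIRST_ITEM_LSN, LAST_ITEM_LSN):#x}")
```

Under those assumptions, the first AIL item sits in cycle 0xe91 and the last in cycle 0xe92 at a lower block number, so the items span the whole log minus 0x4600 blocks. The report does not give the log size, so this cannot be turned into a percentage here.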
> crash> xfs_log_item.li_buf 0xffff0340999a0a80
> li_buf = 0xffff0200125b7180
>
> crash> xfs_buf.b_flags 0xffff0200125b7180 -x
> b_flags = 0x110032 (XBF_WRITE|XBF_ASYNC|XBF_DONE|_XBF_INODES|_XBF_PAGES)
>
> crash> xfs_buf.b_state 0xffff0200125b7180 -x
> b_state = 0x2 (XFS_BSTATE_IN_FLIGHT)
>
> crash> xfs_buf.b_last_error,b_retries,b_first_retry_time 0xffff0200125b7180 -x
> b_last_error = 0x0
> b_retries = 0x0
> b_first_retry_time = 0x0
>
> The buf flags show the I/O has completed (XBF_DONE is set).
> From my reading of xfs_buf_ioend, if XBF_DONE is set, xfs_buf_inode_iodone
> is called; it removes the log item from the AIL list and then releases
> xlog space by moving the tail_lsn.
>
> But this item is still in the AIL list, b_last_error = 0, and XBF_WRITE
> is set.
>
> The xfs buf log item is in the same state as the inode log item.
>
> crash> list -s xfs_log_item.li_buf 0xffff0340999a0a80
> ffff033f8d7c9de8
> li_buf = 0x0
> crash> xfs_buf_log_item.bli_buf ffff033f8d7c9de8
> bli_buf = 0xffff0200125b4a80
> crash> xfs_buf.b_flags 0xffff0200125b4a80 -x
> b_flags = 0x100032 (XBF_WRITE|XBF_ASYNC|XBF_DONE|_XBF_PAGES)
>
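The two b_flags values quoted above can be checked against their symbolic names with a small decoder. This is a minimal Python sketch: the bit positions are assumed to match the 5.10 definitions in fs/xfs/xfs_buf.h and cover only the five flags named in the report, and decode_b_flags is a hypothetical helper, not a crash command.

```python
# Bit values are an assumption taken from the 5.10 xfs_buf.h layout; only the
# flags that appear in the report are listed.
XBF_FLAGS = {
    1 << 1: "XBF_WRITE",
    1 << 4: "XBF_ASYNC",
    1 << 5: "XBF_DONE",
    1 << 16: "_XBF_INODES",
    1 << 20: "_XBF_PAGES",
}


def decode_b_flags(value):
    """Return a '|'-joined list of flag names set in value."""
    names = [name for bit, name in XBF_FLAGS.items() if value & bit]
    # Any bits outside the table are reported as raw hex instead of guessed.
    unknown = value & ~sum(XBF_FLAGS)
    if unknown:
        names.append(f"{unknown:#x}")
    return "|".join(names)


print(decode_b_flags(0x110032))  # inode cluster buffer from the report
print(decode_b_flags(0x100032))  # buffer behind the buf log item
```

Both values decode to the same WRITE, ASYNC and DONE state; the only difference is _XBF_INODES, which is set on the inode buffer alone.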
> I think it should be impossible for XBF_DONE to be set with
> b_last_error = 0 while the item is still in the AIL.
>
> Is my analysis correct?
> Why can't the xlog space be released?
>
> --
> You may reply to this email to add a comment.
>
> You are receiving this mail because:
> You are watching the assignee of the bug.