From: bugzilla-daemon@kernel.org
To: linux-xfs@vger.kernel.org
Subject: [Bug 216343] XFS: no space left in xlog cause system hang
Date: Sun, 14 Aug 2022 23:54:51 +0000
Message-ID: <bug-216343-201763-jdxdDvUKTL@https.bugzilla.kernel.org/>
In-Reply-To: <bug-216343-201763@https.bugzilla.kernel.org/>
https://bugzilla.kernel.org/show_bug.cgi?id=216343
--- Comment #1 from Dave Chinner (david@fromorbit.com) ---
[cc Amir, the 5.10 stable XFS maintainer]
On Tue, Aug 09, 2022 at 11:46:23AM +0000, bugzilla-daemon@kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=216343
>
> Bug ID: 216343
> Summary: XFS: no space left in xlog cause system hang
> Product: File System
> Version: 2.5
> Kernel Version: 5.10.38
> Hardware: ARM
> OS: Linux
> Tree: Mainline
> Status: NEW
> Severity: normal
> Priority: P1
> Component: XFS
> Assignee: filesystem_xfs@kernel-bugs.kernel.org
> Reporter: zhoukete@126.com
> Regression: No
>
> Created attachment 301539
> --> https://bugzilla.kernel.org/attachment.cgi?id=301539&action=edit
> stack
>
> 1. Cannot log in over ssh; the system hung and nothing could be done.
> 2. dmesg reports 'audit: audit_backlog=41349 > audit_backlog_limit=8192'
> 3. I sent sysrq-crash and got a vmcore file. I don't know how to reproduce it.
>
> Following is my analysis of the vmcore:
>
> Logging in on a tty fails because pid 2021571 holds the acct_process mutex
> and cannot release it: it is waiting for the xlog to release space. See the
> stack trace in the attached stack.txt.
>
> So I tried to figure out what happened to the xlog:
>
> crash> struct xfs_ail.ail_target_prev,ail_targe,ail_head 0xffff00ff884f1000
> ail_target_prev = 0xe9200058600
> ail_target = 0xe9200058600
> ail_head = {
> next = 0xffff0340999a0a80,
> prev = 0xffff020013c66b40
> }
>
> There are 112 log items in the AIL list:
> crash> list 0xffff0340999a0a80 | wc -l
> 112
>
> 79 of them are xlog_inode_item
> 30 of them are xlog_buf_item
>
> crash> xfs_log_item.li_flags,li_lsn 0xffff0340999a0a80 -x
> li_flags = 0x1
> li_lsn = 0xe910005cc00 ===> first item lsn
>
> crash> xfs_log_item.li_flags,li_lsn ffff020013c66b40 -x
> li_flags = 0x1
> li_lsn = 0xe9200058600 ===> last item lsn
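
For anyone following the numbers above: an XFS LSN is a 64-bit value with the log cycle number in the upper 32 bits and the block offset within the log in the lower 32 bits (this is what the kernel's CYCLE_LSN() and BLOCK_LSN() macros extract). A minimal sketch of that split, applied to the two li_lsn values quoted above; the helper name split_lsn is hypothetical and only for illustration:

```python
def split_lsn(lsn: int) -> tuple[int, int]:
    """Split a 64-bit XFS LSN into (cycle, block).

    Hypothetical helper: mirrors CYCLE_LSN() (upper 32 bits) and
    BLOCK_LSN() (lower 32 bits).
    """
    cycle = lsn >> 32
    block = lsn & 0xFFFFFFFF
    return cycle, block


# li_lsn values copied from the crash output in the report.
first_cycle, first_block = split_lsn(0xE910005CC00)
last_cycle, last_block = split_lsn(0xE9200058600)

print(f"first AIL item: cycle {first_cycle:#x}, block {first_block:#x}")
print(f"last AIL item:  cycle {last_cycle:#x}, block {last_block:#x}")

# The last item is one cycle ahead of the first but at a lower block,
# so the part of the log NOT spanned by the AIL is this many blocks:
gap_blocks = first_block - last_block
print(f"blocks not covered by the AIL span: {gap_blocks}")
```

Read this way, the first item sits in cycle 0xe91 at block 0x5cc00 and the last in cycle 0xe92 at block 0x58600, so the AIL covers all of the log except a 17920-block window. That would be consistent with the log being close to full, and ail_target is equal to the LSN of the last item.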
>
> crash>xfs_log_item.li_buf 0xffff0340999a0a80
> li_buf = 0xffff0200125b7180
>
> crash> xfs_buf.b_flags 0xffff0200125b7180 -x
> b_flags = 0x110032 (XBF_WRITE|XBF_ASYNC|XBF_DONE|_XBF_INODES|_XBF_PAGES)
>
> crash> xfs_buf.b_state 0xffff0200125b7180 -x
> b_state = 0x2 (XFS_BSTATE_IN_FLIGHT)
>
> crash> xfs_buf.b_last_error,b_retries,b_first_retry_time 0xffff0200125b7180 -x
> b_last_error = 0x0
> b_retries = 0x0
> b_first_retry_time = 0x0
>
> The buffer flags show the I/O has completed (XBF_DONE is set).
> From my reading of xfs_buf_ioend(), when XBF_DONE is set it calls
> xfs_buf_inode_iodone(), which removes the log item from the AIL and
> then releases xlog space by moving the tail_lsn forward.
>
> But this item is still in the AIL, even though b_last_error = 0 and
> XBF_WRITE is set.
>
> The xfs buf log items are in the same state as the inode log items:
>
> crash> list -s xfs_log_item.li_buf 0xffff0340999a0a80
> ffff033f8d7c9de8
> li_buf = 0x0
> crash> xfs_buf_log_item.bli_buf ffff033f8d7c9de8
> bli_buf = 0xffff0200125b4a80
> crash> xfs_buf.b_flags 0xffff0200125b4a80 -x
> b_flags = 0x100032 (XBF_WRITE|XBF_ASYNC|XBF_DONE|_XBF_PAGES)
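
The two b_flags values above can be cross-checked mechanically. The sketch below is an illustration only: the bit values in the table are an assumption, chosen to be consistent with the decoding given in the report, and should be verified against fs/xfs/xfs_buf.h for the kernel in question. Only the five flags named in the report are listed, and decode_b_flags is a hypothetical helper name.

```python
# Assumed bit values, consistent with the decoding in the report;
# verify against fs/xfs/xfs_buf.h before relying on them.
XBF_FLAGS = {
    0x000002: "XBF_WRITE",
    0x000010: "XBF_ASYNC",
    0x000020: "XBF_DONE",
    0x010000: "_XBF_INODES",
    0x100000: "_XBF_PAGES",
}


def decode_b_flags(value: int) -> str:
    """Render a b_flags value as FLAG|FLAG|..., lowest bit first.

    Bits that are not in the table are appended in hex so that
    nothing is silently dropped.
    """
    names = [name for bit, name in sorted(XBF_FLAGS.items()) if value & bit]
    known = sum(bit for bit in XBF_FLAGS if value & bit)
    unknown = value & ~known
    if unknown:
        names.append(hex(unknown))
    return "|".join(names)


# Values copied from the crash output in the report.
print(decode_b_flags(0x110032))  # inode cluster buffer
print(decode_b_flags(0x100032))  # buffer behind the buf log item
```

Under those assumed values, both buffers decode to the same flag names the report lists, with _XBF_INODES being the only difference between them.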
>
> I think it should be impossible for an item to still be in the AIL when
> XBF_DONE is set and b_last_error = 0.
>
> Is my analysis correct?
> Why can't the xlog release space?
>
> --
> You may reply to this email to add a comment.
>
> You are receiving this mail because:
> You are watching the assignee of the bug.