From: bugzilla-daemon@kernel.org
To: linux-xfs@vger.kernel.org
Subject: [Bug 216343] New: XFS: no space left in xlog cause system hang
Date: Tue, 09 Aug 2022 11:46:23 +0000
Message-ID: <bug-216343-201763@https.bugzilla.kernel.org/>

https://bugzilla.kernel.org/show_bug.cgi?id=216343

            Bug ID: 216343
           Summary: XFS: no space left in xlog cause system hang
           Product: File System
           Version: 2.5
    Kernel Version: 5.10.38
          Hardware: ARM
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: XFS
          Assignee: filesystem_xfs@kernel-bugs.kernel.org
          Reporter: zhoukete@126.com
        Regression: No

Created attachment 301539
  --> https://bugzilla.kernel.org/attachment.cgi?id=301539&action=edit
stack

1. Cannot log in over ssh; the system hung and nothing could be done on it.
2. dmesg reports 'audit: audit_backlog=41349 > audit_backlog_limit=8192'.
3. I sent sysrq-crash and got a vmcore file. I don't know how to reproduce
   the hang.

The following is my analysis of the vmcore:

The reason logins on the tty fail is that pid 2021571 holds the acct_process
mutex, and 2021571 cannot release the mutex because it is waiting for xlog to
release space. See the stack info in the attached stack.txt.

So I tried to figure out what happened to xlog.

crash> struct xfs_ail.ail_target_prev,ail_target,ail_head 0xffff00ff884f1000
  ail_target_prev = 0xe9200058600
  ail_target = 0xe9200058600
  ail_head = {
    next = 0xffff0340999a0a80, 
    prev = 0xffff020013c66b40
  }

There are 112 log items in the AIL list:
crash> list 0xffff0340999a0a80 | wc -l
112 

79 of them are xlog_inode_item
30 of them are xlog_buf_item

crash> xfs_log_item.li_flags,li_lsn 0xffff0340999a0a80 -x 
  li_flags = 0x1
  li_lsn = 0xe910005cc00 ===> first item lsn

crash> xfs_log_item.li_flags,li_lsn ffff020013c66b40 -x
  li_flags = 0x1
  li_lsn = 0xe9200058600 ===> last item lsn
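
To make the two li_lsn values above easier to compare, here is a minimal
sketch that splits each one into its cycle and block parts. It assumes the
usual XFS LSN layout (cycle number in the upper 32 bits, log block number in
the lower 32 bits); split_lsn and wrapped_gap are hypothetical helper names
used only for this illustration, not kernel or crash functions.

```python
def split_lsn(lsn):
    """Split a 64-bit XFS LSN into (cycle, block).

    Assumption: cycle is the upper 32 bits, block the lower 32 bits.
    """
    return lsn >> 32, lsn & 0xFFFFFFFF


def wrapped_gap(first_lsn, last_lsn):
    """Blocks between the newest and the oldest AIL item when the newest
    item is exactly one cycle ahead of the oldest; None otherwise."""
    first_cycle, first_block = split_lsn(first_lsn)
    last_cycle, last_block = split_lsn(last_lsn)
    if last_cycle == first_cycle + 1 and last_block <= first_block:
        return first_block - last_block
    return None


if __name__ == "__main__":
    first = 0xE910005CC00  # li_lsn of the first AIL item
    last = 0xE9200058600   # li_lsn of the last AIL item (equals ail_target)
    for name, lsn in (("first", first), ("last", last)):
        cycle, block = split_lsn(lsn)
        print(f"{name}: cycle={cycle:#x} block={block:#x}")
    gap = wrapped_gap(first, last)
    if gap is not None:
        print(f"newest item is {gap:#x} blocks behind the oldest item, "
              "one cycle later")
```

Under that assumption the last item is one cycle ahead of the first and only
0x4600 log blocks short of catching up with it, which would be consistent
with the log being almost completely pinned by the items in the AIL.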

crash> xfs_log_item.li_buf 0xffff0340999a0a80
  li_buf = 0xffff0200125b7180

crash> xfs_buf.b_flags 0xffff0200125b7180 -x
 b_flags = 0x110032  (XBF_WRITE|XBF_ASYNC|XBF_DONE|_XBF_INODES|_XBF_PAGES) 

crash> xfs_buf.b_state 0xffff0200125b7180 -x
  b_state = 0x2 (XFS_BSTATE_IN_FLIGHT)

crash> xfs_buf.b_last_error,b_retries,b_first_retry_time 0xffff0200125b7180 -x
  b_last_error = 0x0
  b_retries = 0x0
  b_first_retry_time = 0x0 

The buf flags show the I/O has completed (XBF_DONE is set).
From my reading of xfs_buf_ioend, if XBF_DONE is set, xfs_buf_inode_iodone
is called; it removes the log item from the AIL list and then releases the
xlog space by moving the tail_lsn forward.

But this item is still in the AIL list, with b_last_error = 0 and XBF_WRITE
set.
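
For anyone who wants to re-check the b_flags decoding by hand, here is a
small sketch. The bit positions in XBF_FLAGS are my assumption of the 5.10
definitions in fs/xfs/xfs_buf.h, limited to the five flags that appear in
this report; decode_b_flags is a hypothetical helper, not part of crash.

```python
# Assumed bit positions (5.10 fs/xfs/xfs_buf.h); only the flags seen in
# this report are listed, anything else is printed as raw hex.
XBF_FLAGS = {
    1 << 1: "XBF_WRITE",
    1 << 4: "XBF_ASYNC",
    1 << 5: "XBF_DONE",
    1 << 16: "_XBF_INODES",
    1 << 20: "_XBF_PAGES",
}


def decode_b_flags(value):
    """Return the names of the known flag bits set in value, joined by '|'."""
    names = [name for bit, name in sorted(XBF_FLAGS.items()) if value & bit]
    unknown = value & ~sum(XBF_FLAGS)  # bits not covered by the table
    if unknown:
        names.append(hex(unknown))
    return "|".join(names)


if __name__ == "__main__":
    for flags in (0x110032, 0x100032):
        print(f"{flags:#x} -> {decode_b_flags(flags)}")
```

With these assumed values, 0x110032 and 0x100032 decode to the same flag
names shown next to the crash output in this report; the only difference
between the two buffers is the _XBF_INODES bit.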

The xfs buf log item is in the same state as the inode log item.

crash> list -s xfs_log_item.li_buf 0xffff0340999a0a80
ffff033f8d7c9de8
  li_buf = 0x0
crash> xfs_buf_log_item.bli_buf  ffff033f8d7c9de8
  bli_buf = 0xffff0200125b4a80
crash> xfs_buf.b_flags 0xffff0200125b4a80 -x
  b_flags = 0x100032 (XBF_WRITE|XBF_ASYNC|XBF_DONE|_XBF_PAGES) 

I think it should be impossible for an item to still be in the AIL when
XBF_DONE is set and b_last_error = 0.

Is my analysis correct?
Why can't the xlog release space?

