public inbox for linux-xfs@vger.kernel.org
From: Matthew Wilcox <willy@infradead.org>
To: Dave Chinner <dgc@kernel.org>
Cc: linux-xfs@vger.kernel.org
Subject: Re: Hang with xfs/285 on 2026-03-02 kernel
Date: Sat, 4 Apr 2026 21:40:37 +0100	[thread overview]
Message-ID: <adF3RXLIzlp8SwZO@casper.infradead.org> (raw)
In-Reply-To: <adDgYCmgNsA9ff3e@dread>

On Sat, Apr 04, 2026 at 10:42:59PM +1100, Dave Chinner wrote:
> On Fri, Apr 03, 2026 at 04:35:46PM +0100, Matthew Wilcox wrote:
> > This is with commit 5619b098e2fb, so after 7.0-rc6
> > INFO: task fsstress:3762792 blocked on a semaphore likely last held by task fsstress:3762793
> > task:fsstress        state:D stack:0     pid:3762793 tgid:3762793 ppid:3762783 task_flags:0x440140 flags:0x00080800
> > Call Trace:
> >  <TASK>
> >  __schedule+0x560/0xfc0
> >  schedule+0x3e/0x140
> >  schedule_timeout+0x84/0x110
> >  ? __pfx_process_timeout+0x10/0x10
> >  io_schedule_timeout+0x5b/0x80
> >  xfs_buf_alloc+0x793/0x7d0
> 
> -ENOMEM.
> 
> It'll be looping here:
> 
> fallback:
>         for (;;) {
>                 bp->b_addr = __vmalloc(size, gfp_mask);
>                 if (bp->b_addr)
>                         break;
>                 if (flags & XBF_READ_AHEAD)
>                         return -ENOMEM;
>                 XFS_STATS_INC(bp->b_mount, xb_page_retries);
>                 memalloc_retry_wait(gfp_mask);
>         }
> 
> If it is looping here long enough to trigger the hang check timer,
> then the MM subsystem is not making progress reclaiming memory. This
> is probably a 16kB allocation (it's an inode cluster buffer), and
> the allocation context is NOFAIL because it is within a transaction
> (this loop pre-dates __vmalloc() supporting __GFP_NOFAIL)....

There may be something else going on.  I reproduced it again and ssh'd
into the VM.

# free
               total        used        free      shared  buff/cache   available
Mem:         3988260     1197132      240080         144     3147496     2791128
Swap:        2097148      258128     1839020

There are five instances of fsstress running.  They are still
accumulating CPU time, but only very slowly:

root@deadly-kvm:~# ps -aux |grep fsstress
root     3745227  0.0  0.0   2664  1476 ?        S    06:48   0:00 ./ltp/fsstress -p 4 -d /mnt/scratch -n 2000000
root     3745236  7.5  1.6 127928 65256 ?        D    06:48  42:54 ./ltp/fsstress -p 4 -d /mnt/scratch -n 2000000
root     3745237  7.6  1.5 124644 61308 ?        D    06:48  42:55 ./ltp/fsstress -p 4 -d /mnt/scratch -n 2000000
root     3745238  7.6  1.6 130844 65584 ?        D    06:48  43:01 ./ltp/fsstress -p 4 -d /mnt/scratch -n 2000000
root     3745239  7.6  1.6 126524 66536 ?        D    06:48  42:58 ./ltp/fsstress -p 4 -d /mnt/scratch -n 2000000
root@deadly-kvm:~# ps -aux |grep fsstress
root     3745227  0.0  0.0   2664  1476 ?        S    06:48   0:00 ./ltp/fsstress -p 4 -d /mnt/scratch -n 2000000
root     3745236  5.5  1.6 133116 66708 ?        R    06:48  45:44 ./ltp/fsstress -p 4 -d /mnt/scratch -n 2000000
root     3745237  5.5  1.5 130136 62516 ?        R    06:48  45:45 ./ltp/fsstress -p 4 -d /mnt/scratch -n 2000000
root     3745238  5.5  1.6 136520 65944 ?        R    06:48  45:52 ./ltp/fsstress -p 4 -d /mnt/scratch -n 2000000
root     3745239  5.5  1.7 131988 67884 ?        R    06:48  45:50 ./ltp/fsstress -p 4 -d /mnt/scratch -n 2000000

# cat /proc/3745239/stack
[<0>] xfs_buf_lock+0x4b/0x170
[<0>] xfs_buf_find_lock+0x69/0x140
[<0>] xfs_buf_get_map+0x265/0xbd0
[<0>] xfs_buf_read_map+0x59/0x2e0
[<0>] xfs_trans_read_buf_map+0x1bb/0x560
[<0>] xfs_read_agi+0xab/0x1a0
(...)

# cat /proc/3745238/stack
[<0>] xfs_buf_alloc+0x793/0x7d0
[<0>] xfs_buf_get_map+0x651/0xbd0
[<0>] xfs_buf_readahead_map+0x3b/0x1b0
[<0>] xfs_iwalk_ichunk_ra+0xe9/0x130
[<0>] xfs_iwalk_ag+0x185/0x2d0
(...)

It doesn't _seem_ like the system is struggling for memory.

# cat /proc/meminfo
MemTotal:        3988260 kB
MemFree:          241956 kB
MemAvailable:    2781960 kB
Buffers:            5184 kB
Cached:          2503020 kB
SwapCached:         4860 kB
Active:          2062948 kB
Inactive:         713828 kB
Active(anon):      85800 kB
Inactive(anon):   182968 kB
Active(file):    1977148 kB
Inactive(file):   530860 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:       2097148 kB
SwapFree:        1823052 kB
Dirty:                 0 kB
Writeback:             0 kB
AnonPages:        267836 kB
Mapped:            16280 kB
Shmem:               144 kB
KReclaimable:     628212 kB
Slab:             783840 kB
SReclaimable:     628212 kB
SUnreclaim:       155628 kB
KernelStack:        3536 kB
PageTables:         3680 kB
SecPageTables:         0 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:     4091276 kB
Committed_AS:     560852 kB
VmallocTotal:   34359738367 kB
VmallocUsed:       13004 kB
VmallocChunk:          0 kB
Percpu:             7360 kB
AnonHugePages:     12288 kB
ShmemHugePages:        0 kB
ShmemPmdMapped:        0 kB
FileHugePages:         0 kB
FilePmdMapped:         0 kB
Balloon:               0 kB
DirectMap4k:      153396 kB
DirectMap2M:     4040704 kB
DirectMap1G:     2097152 kB

and an excerpt of zoneinfo:

Node 0, zone   Normal
  pages free     27350
        boost    18939
        min      27357
        low      29461
        high     31565
        promo    33669
        spanned  524288
        present  524288
        managed  496128
        cma      0
        protection: (0, 0, 0, 0)
      nr_free_pages 27350
      nr_free_pages_blocks 0
      nr_zone_inactive_anon 21269
      nr_zone_active_anon 9703
      nr_zone_inactive_file 62769
      nr_zone_active_file 228878
      nr_zone_unevictable 0
      nr_zone_write_pending 0
      nr_mlock     0
      nr_free_cma  0


Thread overview: 9+ messages
2026-04-03 15:35 Hang with xfs/285 on 2026-03-02 kernel Matthew Wilcox
2026-04-04 11:42 ` Dave Chinner
2026-04-04 20:40   ` Matthew Wilcox [this message]
2026-04-05 22:29     ` Dave Chinner
2026-04-05  1:03   ` Ritesh Harjani
2026-04-05 22:16     ` Dave Chinner
2026-04-06  0:27       ` Ritesh Harjani
2026-04-06 21:45         ` Dave Chinner
2026-04-07  5:41 ` Christoph Hellwig
