public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
* xfs issue
       [not found] <MW4PR84MB1660F0405D5E26BD48C5DD7C88A82@MW4PR84MB1660.NAMPRD84.PROD.OUTLOOK.COM>
@ 2024-07-22 14:21 ` P M, Priya
  2024-07-22 22:15   ` Dave Chinner
  0 siblings, 1 reply; 5+ messages in thread
From: P M, Priya @ 2024-07-22 14:21 UTC (permalink / raw)
  To: linux-xfs@vger.kernel.org

Hi, 

Good Morning! 

We see an IO stall on the backing disk sdh when it hangs - literally no IO apart from a very few, per this sort of output in diskstats:
 
alslater@HPE-W5P7CGPQYL collectl % grep 21354078 sdhi.out | sed 's/.*disk//'|wc -l
   1003

alslater@HPE-W5P7CGPQYL collectl % grep 21354078 sdhi.out | sed 's/.*disk//'|uniq -c
  1     8     112 sdh 21354078 11338 20953907501 1972079123 18657407 324050 16530008823 580990600 0 17845212 2553245350
  1     8     112 sdh 21354078 11338 20953907501 1972079123 18657429 324051 16530009041 580990691 0 17845254 2553245441
  1     8     112 sdh 21354078 11338 20953907501 1972079123 18657431 324051 16530009044 580990691 0 17845254 2553245441
1000     8     112 sdh 21354078 11338 20953907501 1972079123 18657433 324051 16530009047 580990691 0 17845254 2553245441
^ /very/ slight changes in these write columns ->
(these are diskstats metrics in the 3.10-era layout: read metrics first, then writes)
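(For reference, a stall like this can be confirmed straight from /proc/diskstats; a minimal sketch, assuming the 3.10-era field layout and the device name sdh:)

```shell
# Sample the writes-completed counter for sdh twice, 5 seconds apart.
# In the 3.10-era /proc/diskstats layout, awk field $8 (after major,
# minor, name, and the four read fields) is writes completed.
# A near-zero delta during the hang confirms the stall at the device.
DEV=sdh
before=$(awk -v d="$DEV" '$3 == d { print $8 }' /proc/diskstats)
sleep 5
after=$(awk -v d="$DEV" '$3 == d { print $8 }' /proc/diskstats)
echo "write IOs completed in 5s: $((after - before))"
```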
 
And there is a spike in tasks sleeping on log space, concurrent with the failure.
 
Prior backtraces showed hung tasks in xlog_grant_head_check, but currently, with the scheduler
changed to noop (from deadline, which was our default) and xfssyncd dialled down to 10s, we get:
 
bc3:
/proc/25146  xfsaild/sdh
[<ffffffffc11aa9f7>] xfs_buf_iowait+0x27/0xc0 [xfs]
[<ffffffffc11ac320>] __xfs_buf_submit+0x130/0x250 [xfs]
[<ffffffffc11ac465>] _xfs_buf_read+0x25/0x30 [xfs]
[<ffffffffc11ac569>] xfs_buf_read_map+0xf9/0x160 [xfs]
[<ffffffffc11de299>] xfs_trans_read_buf_map+0xf9/0x2d0 [xfs]
[<ffffffffc119fe9e>] xfs_imap_to_bp+0x6e/0xe0 [xfs]
[<ffffffffc11c265a>] xfs_iflush+0xda/0x250 [xfs]
[<ffffffffc11d4f16>] xfs_inode_item_push+0x156/0x1a0 [xfs]
[<ffffffffc11dd1ef>] xfsaild+0x38f/0x780 [xfs]
[<ffffffff956c32b1>] kthread+0xd1/0xe0
[<ffffffff95d801dd>] ret_from_fork_nospec_begin+0x7/0x21
[<ffffffffffffffff>] 0xffffffffffffffff
 
bbm:
/proc/22022  xfsaild/sdh
[<ffffffffc12d09f7>] xfs_buf_iowait+0x27/0xc0 [xfs]
[<ffffffffc12d2320>] __xfs_buf_submit+0x130/0x250 [xfs]
[<ffffffffc12d2465>] _xfs_buf_read+0x25/0x30 [xfs]
[<ffffffffc12d2569>] xfs_buf_read_map+0xf9/0x160 [xfs]
[<ffffffffc1304299>] xfs_trans_read_buf_map+0xf9/0x2d0 [xfs]
[<ffffffffc12c5e9e>] xfs_imap_to_bp+0x6e/0xe0 [xfs]
[<ffffffffc12e865a>] xfs_iflush+0xda/0x250 [xfs]
[<ffffffffc12faf16>] xfs_inode_item_push+0x156/0x1a0 [xfs]
[<ffffffffc13031ef>] xfsaild+0x38f/0x780 [xfs]
[<ffffffffbe4c32b1>] kthread+0xd1/0xe0
[<ffffffffbeb801dd>] ret_from_fork_nospec_begin+0x7/0x21
[<ffffffffffffffff>] 0xffffffffffffffff
 
.. along with cofc threads in the isr waiting for data. What that doesn't tell us yet is which is the symptom
and which is the cause. It might be missing / lost interrupt handling, might be the xfs log not being
pushed out hard enough, might be a combination of timing aspects..
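(As a general debugging sketch, not something we have run here: dumping the kernel stacks of every task in uninterruptible sleep shows who is waiting where, which helps separate symptom from cause.)

```shell
# Dump the kernel stack of every task in uninterruptible (D) sleep,
# to tell stuck IO submitters apart from stuck completion-side threads.
for pid in $(ps -eo pid=,stat= | awk '$2 ~ /^D/ { print $1 }'); do
    comm=$(cat "/proc/$pid/comm" 2>/dev/null)
    echo "=== pid $pid ($comm) ==="
    cat "/proc/$pid/stack" 2>/dev/null
done
```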

If there is any patch which addresses this issue, please let me know.
Any pointers for further debugging would also help.

Thanks
Priya

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: xfs issue
  2024-07-22 14:21 ` xfs issue P M, Priya
@ 2024-07-22 22:15   ` Dave Chinner
  2024-07-23  6:18     ` P M, Priya
  0 siblings, 1 reply; 5+ messages in thread
From: Dave Chinner @ 2024-07-22 22:15 UTC (permalink / raw)
  To: P M, Priya; +Cc: linux-xfs@vger.kernel.org

On Mon, Jul 22, 2024 at 02:21:40PM +0000, P M, Priya wrote:
> Hi, 
> 
> Good Morning! 
> 
> We see an IO stall on the backing disk sdh when it hangs - literally no IO apart from a very few, per this sort of output in diskstats:
>  
> alslater@HPE-W5P7CGPQYL collectl % grep 21354078 sdhi.out | sed 's/.*disk//'|wc -l
>    1003
> 
> alslater@HPE-W5P7CGPQYL collectl % grep 21354078 sdhi.out | sed 's/.*disk//'|uniq -c
>   1     8     112 sdh 21354078 11338 20953907501 1972079123 18657407 324050 16530008823 580990600 0 17845212 2553245350
>   1     8     112 sdh 21354078 11338 20953907501 1972079123 18657429 324051 16530009041 580990691 0 17845254 2553245441
>   1     8     112 sdh 21354078 11338 20953907501 1972079123 18657431 324051 16530009044 580990691 0 17845254 2553245441
> 1000     8     112 sdh 21354078 11338 20953907501 1972079123 18657433 324051 16530009047 580990691 0 17845254 2553245441
> ^ /very/ slight changes in these write columns ->
> (these are diskstats metrics in the 3.10-era layout: read metrics first, then writes)

What kernel are you running? What's the storage stack look like
(i.e. storage hardware, lvm, md, xfs_info, etc).

> And there is a spike in tasks sleeping on log space, concurrent with the failure.
>  
> Prior backtraces showed hung tasks in xlog_grant_head_check

Which means the journal ran out of space, and the tasks were waiting
on metadata writeback to make progress to free up journal space.
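(One way to watch for that condition from userspace, as a sketch: the "log" line in /proc/fs/xfs/stat carries log-force counters. The field layout assumed below — writes, blocks, noiclogs, force, force_sleep — is for 3.10-era kernels and may differ on other versions.)

```shell
# Print the XFS log-force counters twice, 10s apart; a climbing
# force_sleep count while device IO stays flat suggests writers
# blocked waiting on journal space.
log_stats() { awk '$1 == "log" { print "force:", $5, "force_sleep:", $6 }' "$1"; }
log_stats /proc/fs/xfs/stat
sleep 10
log_stats /proc/fs/xfs/stat
```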

What do the block device stats tell you about inflight IOs
(/sys/block/*/inflight)?
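(A simple way to sample those counters, sketched here:)

```shell
# Sample the in-flight read/write request counts for every block
# device a few times. A device that pins a non-zero count while its
# diskstats counters barely move has requests stuck below the fs,
# i.e. in the driver or the hardware.
for i in 1 2 3; do
    for f in /sys/block/*/inflight; do
        dev=$(basename "$(dirname "$f")")
        printf '%s %s inflight: %s\n' "$(date +%T)" "$dev" "$(cat "$f")"
    done
    sleep 1
done
```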

> but currently, with the scheduler changed to noop
> (from deadline, which was our default) and xfssyncd dialled down to 10s, we get:
>  
> bc3:
> /proc/25146  xfsaild/sdh
> [<ffffffffc11aa9f7>] xfs_buf_iowait+0x27/0xc0 [xfs]
> [<ffffffffc11ac320>] __xfs_buf_submit+0x130/0x250 [xfs]
> [<ffffffffc11ac465>] _xfs_buf_read+0x25/0x30 [xfs]
> [<ffffffffc11ac569>] xfs_buf_read_map+0xf9/0x160 [xfs]
> [<ffffffffc11de299>] xfs_trans_read_buf_map+0xf9/0x2d0 [xfs]
> [<ffffffffc119fe9e>] xfs_imap_to_bp+0x6e/0xe0 [xfs]
> [<ffffffffc11c265a>] xfs_iflush+0xda/0x250 [xfs]
> [<ffffffffc11d4f16>] xfs_inode_item_push+0x156/0x1a0 [xfs]
> [<ffffffffc11dd1ef>] xfsaild+0x38f/0x780 [xfs]
> [<ffffffff956c32b1>] kthread+0xd1/0xe0
> [<ffffffff95d801dd>] ret_from_fork_nospec_begin+0x7/0x21
> [<ffffffffffffffff>] 0xffffffffffffffff
>  
> bbm:
> /proc/22022  xfsaild/sdh
> [<ffffffffc12d09f7>] xfs_buf_iowait+0x27/0xc0 [xfs]
> [<ffffffffc12d2320>] __xfs_buf_submit+0x130/0x250 [xfs]
> [<ffffffffc12d2465>] _xfs_buf_read+0x25/0x30 [xfs]
> [<ffffffffc12d2569>] xfs_buf_read_map+0xf9/0x160 [xfs]
> [<ffffffffc1304299>] xfs_trans_read_buf_map+0xf9/0x2d0 [xfs]
> [<ffffffffc12c5e9e>] xfs_imap_to_bp+0x6e/0xe0 [xfs]
> [<ffffffffc12e865a>] xfs_iflush+0xda/0x250 [xfs]
> [<ffffffffc12faf16>] xfs_inode_item_push+0x156/0x1a0 [xfs]
> [<ffffffffc13031ef>] xfsaild+0x38f/0x780 [xfs]
> [<ffffffffbe4c32b1>] kthread+0xd1/0xe0
> [<ffffffffbeb801dd>] ret_from_fork_nospec_begin+0x7/0x21
> [<ffffffffffffffff>] 0xffffffffffffffff

And that's metadata writeback waiting for IO completion to occur.

If this is where the filesystem is stuck, then that's why the
journal has no space and tasks get hung up in
xlog_grant_head_check(). i.e. these appear to be two symptoms of
the same problem.

> .. along with cofc threads in the isr waiting for data. What that doesn't
> tell us yet is which is the symptom and which is the cause. It might be missing
> / lost interrupt handling, might be the xfs log not being pushed out hard enough, might be a combination of timing aspects..

No idea as yet - what kernel you are running is kinda important to
know before we look much deeper.

-Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 5+ messages in thread

* RE: xfs issue
  2024-07-22 22:15   ` Dave Chinner
@ 2024-07-23  6:18     ` P M, Priya
  2024-07-23 21:51       ` Dave Chinner
  0 siblings, 1 reply; 5+ messages in thread
From: P M, Priya @ 2024-07-23  6:18 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-xfs@vger.kernel.org

Thanks, David, for the quick response. The kernel version is 3.10.0-1160.114.2. 


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: xfs issue
  2024-07-23  6:18     ` P M, Priya
@ 2024-07-23 21:51       ` Dave Chinner
  2024-07-24  3:51         ` P M, Priya
  0 siblings, 1 reply; 5+ messages in thread
From: Dave Chinner @ 2024-07-23 21:51 UTC (permalink / raw)
  To: P M, Priya; +Cc: linux-xfs@vger.kernel.org

On Tue, Jul 23, 2024 at 06:18:55AM +0000, P M, Priya wrote:
> Thanks, David, for the quick response. The kernel version is 3.10.0-1160.114.2. 

Ok, that's a highly modified downstream enterprise distro kernel and
far older than any kernel upstream actually supports. There's little
we can do as upstream developers here because we don't have access
to the source to help diagnose any problem, nor is there any way
we can fix any problem that might be found.

IOWs, you'll need to contact your distro vendor for further triage
and support of the problem you are having.

-Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 5+ messages in thread

* RE: xfs issue
  2024-07-23 21:51       ` Dave Chinner
@ 2024-07-24  3:51         ` P M, Priya
  0 siblings, 0 replies; 5+ messages in thread
From: P M, Priya @ 2024-07-24  3:51 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-xfs@vger.kernel.org

Sure. Thanks! 


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2024-07-24  3:51 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <MW4PR84MB1660F0405D5E26BD48C5DD7C88A82@MW4PR84MB1660.NAMPRD84.PROD.OUTLOOK.COM>
2024-07-22 14:21 ` xfs issue P M, Priya
2024-07-22 22:15   ` Dave Chinner
2024-07-23  6:18     ` P M, Priya
2024-07-23 21:51       ` Dave Chinner
2024-07-24  3:51         ` P M, Priya

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox