linux-raid.vger.kernel.org archive mirror
From: Dan Moulding <dan@danm.net>
To: song@kernel.org
Cc: dan@danm.net, gregkh@linuxfoundation.org, junxiao.bi@oracle.com,
	linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org,
	regressions@lists.linux.dev, stable@vger.kernel.org,
	yukuai1@huaweicloud.com
Subject: Re: [REGRESSION] 6.7.1: md: raid5 hang and unresponsive system; successfully bisected
Date: Tue, 23 Jan 2024 14:53:07 -0700	[thread overview]
Message-ID: <20240123215307.8083-1-dan@danm.net> (raw)
In-Reply-To: <CAPhsuW7-r=UAO8f7Ok08vCx2kdVx6mZADyZ-LknNE8csnX+L8g@mail.gmail.com>

> I think we still want d6e035aad6c0 in 6.7.2. We may need to revert
> 0de40f76d567 on top of that. Could you please test it out? (6.7.1 +
> d6e035aad6c0 + revert 0de40f76d567.)

I was operating under the assumption that the two commits were
intended to exist as a pair: the first reverts the old fix, because
the second has what is supposed to be a better fix. But since the
regression still exists even with both patches applied, the old fix
must be reapplied to resolve the current regression.

But, as you've requested, I have tested 6.7.1 + d6e035aad6c0 + revert
0de40f76d567, and it seems fine: the hang does not reproduce. So I
have no objection if you think it makes sense to accept d6e035aad6c0
on its own, even though that would break up the pair of commits.

> OTOH, I am not able to reproduce the issue. Could you please help
> get more information:
>   cat /proc/mdstat

Here is /proc/mdstat from one of the systems where I can reproduce it:

    $ cat /proc/mdstat
    Personalities : [raid6] [raid5] [raid4]
    md0 : active raid5 dm-0[4](J) sdc[3] sda[0] sdb[1]
          3906764800 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/3] [UUU]

    unused devices: <none>

dm-0 is an LVM logical volume backed by an NVMe SSD; it serves as the
array's journal device (the "(J)" flag above). The others are
run-of-the-mill SATA SSDs.
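
If more detail on the array layout would help, I can also send the
output of something like the following (device names taken from the
mdstat output above):

    # Full array detail, including each device's role
    mdadm --detail /dev/md0

    # Show which physical volume backs the dm-0 logical volume
    lvs -o +devices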

>  profile (perf, etc.) of the md thread

I could use a little more direction on exactly what to look for and
under what conditions (i.e. should I run perf while the thread is
stuck in the 100% CPU loop? What kind of report should I ask perf
for?). Also, are there any debug options I could enable in the kernel
configuration that might help gather more information? Maybe
something in debugfs? I currently get absolutely no warnings or
errors in dmesg when the problem occurs.
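
Unless you suggest otherwise, my rough plan is something like the
following once the thread is spinning (the thread name is my guess
based on the array name above, and <PID> is a placeholder):

    # Find the md kernel thread while it is stuck at 100% CPU
    pgrep md0_raid5

    # Sample its call graph for 30 seconds, then summarize
    perf record -g -p <PID> -- sleep 30
    perf report --stdio

    # Possibly also enable md's dynamic debug messages (requires
    # CONFIG_DYNAMIC_DEBUG and a mounted debugfs)
    echo 'module md_mod +p' > /sys/kernel/debug/dynamic_debug/control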

Cheers,

-- Dan

Thread overview: 53+ messages
2024-01-23  0:56 [REGRESSION] 6.7.1: md: raid5 hang and unresponsive system; successfully bisected Dan Moulding
2024-01-23  1:08 ` Song Liu
2024-01-23  1:35 ` Dan Moulding
2024-01-23  6:35   ` Song Liu
2024-01-23 21:53     ` Dan Moulding [this message]
2024-01-23 22:21       ` Song Liu
2024-01-23 23:58         ` Dan Moulding
2024-01-25  0:01           ` Song Liu
2024-01-25 16:44             ` junxiao.bi
2024-01-25 19:40               ` Song Liu
2024-01-25 20:31               ` Dan Moulding
2024-01-26  3:30                 ` Carlos Carvalho
2024-01-26 15:46                   ` Dan Moulding
2024-01-30 16:26                     ` Blazej Kucman
2024-01-30 20:21                       ` Song Liu
2024-01-31  1:26                       ` Song Liu
2024-01-31  2:13                         ` Yu Kuai
2024-01-31  2:41                       ` Yu Kuai
2024-01-31  4:55                         ` Song Liu
2024-01-31 13:36                           ` Blazej Kucman
2024-02-01  1:39                             ` Yu Kuai
2024-01-26 16:21                   ` Roman Mamedov
2024-01-31 17:37                 ` junxiao.bi
2024-02-06  8:07                 ` Song Liu
2024-02-06 20:56                   ` Dan Moulding
2024-02-06 21:34                     ` Song Liu
2024-02-20 23:06 ` Dan Moulding
2024-02-20 23:15   ` junxiao.bi
2024-02-21 14:50     ` Mateusz Kusiak
2024-02-21 19:15       ` junxiao.bi
2024-02-23 17:44     ` Dan Moulding
2024-02-23 19:18       ` junxiao.bi
2024-02-23 20:22         ` Dan Moulding
2024-02-23  8:07   ` Linux regression tracking (Thorsten Leemhuis)
2024-02-24  2:13     ` Song Liu
2024-03-01 20:26       ` junxiao.bi
2024-03-01 23:12         ` Dan Moulding
2024-03-02  0:05           ` Song Liu
2024-03-06  8:38             ` Linux regression tracking (Thorsten Leemhuis)
2024-03-06 17:13               ` Song Liu
2024-03-02 16:55         ` Dan Moulding
2024-03-07  3:34         ` Yu Kuai
2024-03-08 23:49         ` junxiao.bi
2024-03-10  5:13           ` Dan Moulding
2024-03-11  1:50           ` Yu Kuai
2024-03-12 22:56             ` junxiao.bi
2024-03-13  1:20               ` Yu Kuai
2024-03-14 18:20                 ` junxiao.bi
2024-03-14 22:36                   ` Song Liu
2024-03-15  1:30                   ` Yu Kuai
2024-03-14 16:12             ` Dan Moulding
2024-03-15  1:17               ` Yu Kuai
2024-03-19 14:16                 ` Dan Moulding
