Linux RAID subsystem development
From: Yu Kuai <yukuai1@huaweicloud.com>
To: Christian Theune <ct@flyingcircus.io>, Yu Kuai <yukuai1@huaweicloud.com>
Cc: "John Stoffel" <john@stoffel.org>,
	"linux-raid@vger.kernel.org" <linux-raid@vger.kernel.org>,
	dm-devel@lists.linux.dev,
	"Dragan Milivojević" <galileo@pkm-inc.com>,
	"yukuai (C)" <yukuai3@huawei.com>
Subject: Re: PROBLEM: repeatable lockup on RAID-6 with LUKS dm-crypt on NVMe devices when rsyncing many files
Date: Mon, 4 Nov 2024 20:18:21 +0800
Message-ID: <2b093abc-cd9a-0b84-bcba-baec689fa153@huaweicloud.com>
In-Reply-To: <5170f0d2-cb0f-2e0f-eb5e-31aa9d6ff65d@huawei.com>

Hi,

On 2024/11/04 19:40, Yu Kuai wrote:
> Hi,
> 
> On 2024/11/01 16:33, Christian Theune wrote:
>> I dug out a different one that goes back longer but even that one 
>> seems like something was missing early on when I didn’t have the 
>> serial console attached.
>>
>> I’m wondering whether this indicates an issue during initialization? 
>> I’m going to reboot the machine and make sure I get the early logs 
>> with those numbers.
>>
>> [  405.347345] handle_stripe_clean_event: md127: end 
>> ff2721beec8c2fa0(22301786792+8) 4294967259
> 
> For this log, let's assume the first start is from here.
>> [  432.542465] __add_stripe_bio: md127: start 
>> ff2721beec8c2fa0(22837701992+8) 4294967260
>> [  432.542469] __add_stripe_bio: md127: start 
>> ff2721beec8c2fa0(22837701992+8) 4294967261
>> [  434.272964] __add_stripe_bio: md127: start 
>> ff2721beec8c2fa0(22837701992+8) 4294967262
>> [  434.273175] __add_stripe_bio: md127: start 
>> ff2721beec8c2fa0(22837701992+8) 4294967263
>> [  434.273189] __add_stripe_bio: md127: start 
>> ff2721beec8c2fa0(22837701992+8) 4294967264
>> [  434.273285] __add_stripe_bio: md127: start 
>> ff2721beec8c2fa0(22837701992+8) 4294967265
>> [  434.274063] handle_stripe_clean_event: md127: end 
>> ff2721beec8c2fa0(22837701992+8) 4294967264
>> [  434.274066] handle_stripe_clean_event: md127: end 
>> ff2721beec8c2fa0(22837701992+8) 4294967263
>> [  434.274070] handle_stripe_clean_event: md127: end 
>> ff2721beec8c2fa0(22837701992+8) 4294967262
>> [  434.274073] handle_stripe_clean_event: md127: end 
>> ff2721beec8c2fa0(22837701992+8) 4294967261
>> [  434.274078] handle_stripe_clean_event: md127: end 
>> ff2721beec8c2fa0(22837701992+8) 4294967260
>> [  434.274083] handle_stripe_clean_event: md127: end 
>> ff2721beec8c2fa0(22837701992+8) 4294967259
>> [  434.276609] __add_stripe_bio: md127: start 
>> ff2721beec8c2fa0(23374951848+8) 4294967260
>> [  434.278939] __add_stripe_bio: md127: start 
>> ff2721beec8c2fa0(23374951848+8) 4294967261
>> [  464.922354] handle_stripe_clean_event: md127: end 
>> ff2721beec8c2fa0(23374951848+8) 4294967260
>> [  464.931833] handle_stripe_clean_event: md127: end 
>> ff2721beec8c2fa0(23374951848+8) 4294967259
>> [  466.964557] __add_stripe_bio: md127: start 
>> ff2721beec8c2fa0(23912715112+8) 4294967260
>> [  466.964616] __add_stripe_bio: md127: start 
>> ff2721beec8c2fa0(23912715112+8) 4294967261
>> [  474.399930] __add_stripe_bio: md127: start 
>> ff2721beec8c2fa0(23912715112+8) 4294967262
>> [  474.451451] __add_stripe_bio: md127: start 
>> ff2721beec8c2fa0(23912715112+8) 4294967263
>> [  489.447079] handle_stripe_clean_event: md127: end 
>> ff2721beec8c2fa0(23912715112+8) 4294967262
>> [  489.456574] handle_stripe_clean_event: md127: end 
>> ff2721beec8c2fa0(23912715112+8) 4294967261
>> [  489.466069] handle_stripe_clean_event: md127: end 
>> ff2721beec8c2fa0(23912715112+8) 4294967260
>> [  489.475565] handle_stripe_clean_event: md127: end 
>> ff2721beec8c2fa0(23912715112+8) 4294967259
>> [  491.235517] __add_stripe_bio: md127: start 
>> ff2721beec8c2fa0(24448073512+8) 4294967260
>> [  491.235602] __add_stripe_bio: md127: start 
>> ff2721beec8c2fa0(24448073512+8) 4294967261
>> [  498.153108] __add_stripe_bio: md127: start 
>> ff2721beec8c2fa0(24716445096+8) 4294967262
>> [  498.156307] __add_stripe_bio: md127: start 
>> ff2721beec8c2fa0(24716445096+8) 4294967263
>> [  530.332619] handle_stripe_clean_event: md127: end 
>> ff2721beec8c2fa0(24716445096+8) 4294967262
>> [  530.342110] handle_stripe_clean_event: md127: end 
>> ff2721beec8c2fa0(24716445096+8) 4294967261
>> [  530.351595] handle_stripe_clean_event: md127: end 
>> ff2721beec8c2fa0(24716445096+8) 4294967260
>> [  530.361082] handle_stripe_clean_event: md127: end 
>> ff2721beec8c2fa0(24716445096+8) 4294967259
>> [  535.176774] __add_stripe_bio: md127: start 
>> ff2721beec8c2fa0(24985208424+8) 4294967260
>> [  549.125326] handle_stripe_clean_event: md127: end 
>> ff2721beec8c2fa0(24985208424+8) 4294967259
> 
> Up to this point everything is fine: start and end are balanced for this
> stripe head.
>> [  549.635782] __add_stripe_bio: md127: start 
>> ff2721beec8c2fa0(25521770024+8) 4294967261
>> [  590.875593] handle_stripe_clean_event: md127: end 
>> ff2721beec8c2fa0(25521770024+8) 4294967260
>> [  590.885081] handle_stripe_clean_event: md127: end 
>> ff2721beec8c2fa0(25521770024+8) 4294967259
>> [  596.973863] handle_stripe_clean_event: md127: end 
>> ff2721beec8c2fa0(26057037928+8) 4294967263
>> [  596.973866] handle_stripe_clean_event: md127: end 
>> ff2721beec8c2fa0(26057037928+8) 4294967262
>> [  596.973869] handle_stripe_clean_event: md127: end 
>> ff2721beec8c2fa0(26057037928+8) 4294967261
>> [  596.973871] handle_stripe_clean_event: md127: end 
>> ff2721beec8c2fa0(26057037928+8) 4294967260
>> [  596.973881] handle_stripe_clean_event: md127: end 
>> ff2721beec8c2fa0(26057037928+8) 4294967259
> 
> Then, oops, this 'sh' starts just once here and ends many times. It's
> unlikely that those ends correspond to starts logged much earlier, so
> I'm almost convinced that this problem is caused by unbalanced start and
> end calls, and that the huge number is the result of an underflow.
> 
> Let me dig more. :)

I think I found a problem by code review. Can you test the following
patch? (Note that it is still based on the latest mainline.)

Thanks,
Kuai

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index dc2ea636d173..04f32173839a 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -4042,6 +4042,8 @@ static void handle_stripe_clean_event(struct r5conf *conf,
                              test_bit(R5_SkipCopy, &dev->flags))) {
                                 /* We can return any write requests */
                                 struct bio *wbi, *wbi2;
+                               bool written = false;
+
                                 pr_debug("Return write for disc %d\n", i);
                                 if (test_and_clear_bit(R5_Discard, &dev->flags))
                                         clear_bit(R5_UPTODATE, &dev->flags);
@@ -4054,6 +4056,9 @@ static void handle_stripe_clean_event(struct r5conf *conf,
                                 dev->page = dev->orig_page;
                                 wbi = dev->written;
                                 dev->written = NULL;
+                               if (wbi)
+                                       written = true;
+
                                 while (wbi && wbi->bi_iter.bi_sector <
                                         dev->sector + RAID5_STRIPE_SECTORS(conf)) {
                                         wbi2 = r5_next_bio(conf, wbi, dev->sector);
@@ -4061,10 +4066,13 @@ static void handle_stripe_clean_event(struct r5conf *conf,
                                         bio_endio(wbi);
                                         wbi = wbi2;
                                 }
-                               conf->mddev->bitmap_ops->endwrite(conf->mddev,
-                                       sh->sector, RAID5_STRIPE_SECTORS(conf),
-                                       !test_bit(STRIPE_DEGRADED, &sh->state),
-                                       false);
+
+                               if (written)
+                                       conf->mddev->bitmap_ops->endwrite(conf->mddev,
+                                               sh->sector, RAID5_STRIPE_SECTORS(conf),
+                                               !test_bit(STRIPE_DEGRADED, &sh->state),
+                                               false);
+
                                 if (head_sh->batch_head) {
                                         sh = list_first_entry(&sh->batch_list,
                                                               struct stripe_head,

> 
> Thanks,
> Kuai
> 

