From: Paul Menzel <pmenzel@molgen.mpg.de>
To: Roger Heflin <rogerheflin@gmail.com>
Cc: linux-raid@vger.kernel.org, linux-nfs@vger.kernel.org,
	linux-block@vger.kernel.org, linux-xfs@vger.kernel.org,
	it+linux-raid@molgen.mpg.de
Subject: Re: How to debug intermittent increasing md/inflight but no disk activity?
Date: Tue, 23 Jul 2024 12:33:33 +0200
Message-ID: <02ceb39e-e4fb-4f62-ac40-7afafbd620c1@molgen.mpg.de>
In-Reply-To: <CAAMCDedmjyyn93V+ScRTyqd1FbW5VJmbZHGMss3iwyqxwJL3Pg@mail.gmail.com>

Dear Roger,


Thank you for your reply.

On 10.07.24 at 13:54, Roger Heflin wrote:
> How long does it freeze this way?

It froze for up to five minutes, I’d say.
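
To put a number on such a freeze, the in-flight counters from the subject line could be sampled with timestamps. The following is only a minimal sketch: it assumes the array is `md0` (a placeholder, not taken from this thread) and that `/sys/block/<device>/inflight` holds two integers, reads and writes in flight. The function names are made up for illustration.

```python
import time
from pathlib import Path


def parse_inflight(text):
    """Parse the sysfs inflight file: two integers, reads and writes in flight."""
    reads, writes = (int(field) for field in text.split())
    return reads, writes


def sample_inflight(device="md0", interval=1.0, count=5, sysfs_root="/sys/block"):
    """Return a list of (timestamp, reads, writes) samples for one block device.

    The device name "md0" is a placeholder; pass the real array name.
    """
    path = Path(sysfs_root) / device / "inflight"
    samples = []
    for i in range(count):
        reads, writes = parse_inflight(path.read_text())
        samples.append((time.time(), reads, writes))
        if i < count - 1:
            time.sleep(interval)
    return samples
```

Calling `sample_inflight("md0", interval=1.0, count=300)` during a stall and looking for the span in which the counters stay non-zero would give a rough duration.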

> Disks getting bad blocks do show up as stopping activity for 3-60
> seconds (depending on the disk's internal settings).
> 
> smartctl --xall <device> | grep -iE 'sector|reall' should show the
> reallocation counters.

These are SAS disks, and none of the array members has any errors. Example:

```
@grele:~$ sudo smartctl --xall /dev/sdy
[…]
Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:          0        0         0         0          0     655487.372           0
write:         0        0         0         0          0      38289.771           0
```

> What kind of disks does the machine have?

Seagate ST16000NM004J (16 TB, SAS)

> On my home machine, a bad sector freezes it for 7 seconds (scterc
> defaults to 7).  On some large-disk, big-RAID machines at work, the hang
> lasts minutes.  The raw disk is set to 10 (that is what the vendor told
> us), and that 10, plus potentially a bunch of IOs against the bad
> sector, shows up as minutes.
> 
> I wrote a script that work uses that both times how long smartctl
> takes for each disk (the bad disk takes >5 seconds, and up to minutes)
> and also shows the reallocated count and saves a copy every hour, so one
> can see which disk incremented its counter in the last hour and replace
> that disk.

A colleague also wrote a Perl program, diskcheck, which is run regularly
to check all the disks. It shows nothing suspicious here.
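
Neither Roger's script nor diskcheck is shown in this thread. As a rough illustration of the idea Roger describes (time each smartctl call and keep the counter lines), here is a minimal sketch. It reuses the `smartctl --xall` invocation and the `sector|reall` pattern from the quoted message; the function names and the 5-second threshold are assumptions, and the hourly snapshotting is left out. The pattern may well match nothing useful on SAS disks, whose smartctl output is laid out differently from the ATA attribute table.

```python
import re
import subprocess
import time

# Same pattern as the grep in the quoted message (case-insensitive).
PATTERN = re.compile(r"sector|reall", re.IGNORECASE)


def matching_lines(smartctl_output):
    """Return the lines that the quoted grep would have printed."""
    return [line for line in smartctl_output.splitlines() if PATTERN.search(line)]


def timed_smartctl(device, runner=subprocess.run):
    """Run `smartctl --xall` on one device; return (elapsed seconds, matching lines)."""
    start = time.monotonic()
    result = runner(
        ["smartctl", "--xall", device],
        capture_output=True,
        text=True,
        check=False,  # smartctl uses non-zero exit codes for warnings as well
    )
    elapsed = time.monotonic() - start
    return elapsed, matching_lines(result.stdout)


def report(devices, slow_threshold=5.0):
    """Print one timing line per device, followed by its matching counter lines."""
    for device in devices:
        elapsed, lines = timed_smartctl(device)
        flag = "SLOW" if elapsed > slow_threshold else "ok"
        print(f"{device}: {elapsed:.1f}s {flag}")
        for line in lines:
            print("    " + line.strip())
```

Run as root, for example `report(["/dev/sdy"])`; saving the output once an hour and diffing consecutive copies would show which disk's counters moved.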


Kind regards,

Paul
