linux-block.vger.kernel.org archive mirror
* LVM kernel lockup scenario during lvcreate
@ 2023-06-09  7:29 Jaco Kroon
  2023-06-12 18:40 ` Bart Van Assche
  0 siblings, 1 reply; 26+ messages in thread
From: Jaco Kroon @ 2023-06-09  7:29 UTC (permalink / raw)
  To: linux-block@vger.kernel.org

[-- Attachment #1: Type: text/plain, Size: 3774 bytes --]

Hi All,

We've been tracking a scenario where IO on one of our hosts locks up.

We were hopeful that
https://cdn.kernel.org/pub/linux/kernel/v6.x/ChangeLog-6.2.15, and
specifically commit 62a4b137d8aff9ce9dc5e96685253190b4ec6e62 (aka
38d11da522aacaa05898c734a1cec86f1e611129), would fix this.

It did not.

What we consistently see, when we catch this before we can no longer
log in, is an lvcreate process stuck in D state; not always this
specific LV, but some variation of:

13469 ?        D<L    0:00  |       |       \_ /sbin/lvcreate -kn -An -s -n fsck_MDKServer /dev/lvm/backup_MDKServer

(The idea here is to create thin snapshots of the origin volumes so
that we can run an fsck on them, allowing recovery from filesystem
corruption on this server, which acts as a backup server for others.
Once the fsck is done, the snapshot is renamed to save_* after removing
the previous save_* LV.)
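
Roughly, the per-LV cycle is something like the following sketch (the
lvcreate line is the real one from above; the remaining steps are
paraphrased and the exact fsck options may differ):

# create the thin snapshot of the origin
lvcreate -kn -An -s -n fsck_MDKServer /dev/lvm/backup_MDKServer
# fsck the snapshot (preen mode)
fsck.ext4 -fp /dev/lvm/fsck_MDKServer
# drop the previous save_* snapshot and promote the freshly checked one
lvremove -f lvm/save_MDKServer
lvrename lvm fsck_MDKServer save_MDKServer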

This always seems to happen on lvcreate, never on lvremove or lvrename.

I'm attaching dmesg -T and ps axf output.  The dmesg in particular may
provide clues, as it contains a number of stack traces indicating tasks
stalling on IO.

Once this has triggered, even commands such as "lvs" go into
uninterruptible wait.  I unfortunately didn't test "dmsetup ls" this
time and have already triggered a reboot (the system needs to be up).
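
(For the record, the sort of thing that can still be captured at that
point, assuming a shell still responds, is roughly:)

# list tasks stuck in uninterruptible sleep and what they are waiting on
ps axo pid,stat,wchan:32,cmd | awk '$2 ~ /D/'
# dump kernel stacks of all blocked tasks into dmesg (needs sysrq enabled)
echo w > /proc/sysrq-trigger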

To recover we have to "echo b > /proc/sysrq-trigger", since a normal
reboot simply doesn't work (presumably because the filesystems can't be
unmounted; even an emergency sync never completes).
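
(This of course requires magic sysrq to be enabled; a quick check is
something like:)

# 1, or a bitmask that includes 128 (reboot/poweroff), is needed for "echo b"
sysctl kernel.sysrq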

In terms of disk setup, there are a number of mdraid arrays that serve
as PVs in the VG.  /proc/mdstat and some details from LVM are attached
as disksetup.txt.

The one thing from lvs that stands out for me is this one:

  66   fsck_MDKServer          lvm Vwi---tz--   4.00t thin_pool backup_MDKServer

To me this indicates that the lvcreate didn't complete (the LV isn't
marked active above, whereas all the other snapshots are active, even
if not open).
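
(For anyone cross-checking: the activation state is the 5th character
of the attr field; something along these lines, with current lvm2,
shows it explicitly:)

# show the activation state per LV instead of decoding lv_attr by hand
lvs -o lv_name,lv_attr,lv_active,pool_lv lvm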

Assistance would be greatly appreciated.  I notice that newer kernels
are out, so we're happy to upgrade if there are relevant changes that
might affect this.  Nothing at https://kernelnewbies.org/Linux_6.3#Block_layer
jumps out at me.

This host locks up approximately weekly.  It's worse if dmeventd is
also running (it won't survive a day).  We do have the following
"non-standard" udev rules, and I'm not sure whether they might be a
contributing factor:

crowsnest [10:24:37] /etc/udev/rules.d (master) # cat 98-md-stripe-cache.rules
SUBSYSTEM=="block", KERNEL=="md*", ACTION=="change|add", TEST=="md/stripe_cache_size", ATTR{md/stripe_cache_size}="32768"
SUBSYSTEM=="block", ACTION=="change|add", TEST=="bdi/read_ahead_kb", ATTR{bdi/read_ahead_kb}="512"
SUBSYSTEM=="block", ACTION=="change|add", TEST=="device/timeout", ATTR{device/timeout}="180"
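
(A quick way to confirm these actually stick after array assembly and
device add events is to spot-check the sysfs values, e.g.:)

# spot-check the values the rules should have applied
grep . /sys/block/md*/md/stripe_cache_size     # raid5/6 arrays only
grep . /sys/block/*/bdi/read_ahead_kb
grep . /sys/block/sd*/device/timeout           # SCSI/SATA members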

We're not aware of any other hosts being affected, at least not to this
extent.  This is the host where we make the heaviest use of both mdraid
and LVM (especially thin snapshots).  In comparison, most of our other
hosts are toys w.r.t. disks.

A few things I find strange (perhaps due to ignorance):

1.  It's not just the LV being snapshotted that locks up; it's always
more than just that one.

2.  Some filesystems usually remain usable (/var).  Once the lvcreate
has triggered the issue these get fewer over time, and once /var is
affected we can no longer log in.

To be noted:  when this host idles we see ~40-60 MB/s worth of IO;
under load it goes up to 400-800 MB/s, and we saw it peak around
1.3 GB/s about two weeks back while investigating possible causes
(Current DISK READ/WRITE as per iotop).

Filesystems are all ext4 with the dir_index feature disabled (standard
options otherwise).
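
(For completeness, whether dir_index is enabled on a given filesystem
can be confirmed with tune2fs; the device name here is just an example:)

# list the enabled features; dir_index should be absent
tune2fs -l /dev/lvm/backup_MDKServer | grep -i 'features'
# disabling it (on an unmounted filesystem) would be:
tune2fs -O ^dir_index /dev/lvm/backup_MDKServer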

Kind regards,
Jaco


[-- Attachment #2: disksetup.txt.gz --]
[-- Type: application/gzip, Size: 1517 bytes --]

[-- Attachment #3: dmesg.txt.gz --]
[-- Type: application/gzip, Size: 22155 bytes --]

[-- Attachment #4: ps.txt.gz --]
[-- Type: application/gzip, Size: 4214 bytes --]



Thread overview: 26+ messages
2023-06-09  7:29 LVM kernel lockup scenario during lvcreate Jaco Kroon
2023-06-12 18:40 ` Bart Van Assche
2023-06-12 19:36   ` Jaco Kroon
2023-06-18 23:28     ` Bart Van Assche
2023-06-18 19:34   ` Jaco Kroon
2023-06-18 23:56     ` Bart Van Assche
2023-06-19  6:06       ` Jaco Kroon
2023-06-26  8:30       ` Jaco Kroon
2023-06-26 16:42         ` Bart Van Assche
2023-06-26 23:29           ` Jaco Kroon
2023-07-11 13:22             ` Jaco Kroon
2023-07-11 14:45               ` Bart Van Assche
2023-07-12 10:12                 ` Jaco Kroon
2023-07-12 13:43                   ` Bart Van Assche
     [not found]                     ` <ef2812b4-7853-9dda-85dd-210636840a59@uls.co.za>
2023-08-24  7:29                       ` Jaco Kroon
2023-08-24 17:13                         ` Bart Van Assche
2023-08-24 20:16                           ` Jaco Kroon
2023-08-24 17:29   ` Laurence Oberman
2023-08-24 20:01     ` Jaco Kroon
2023-08-24 20:19       ` Laurence Oberman
2023-08-24 23:40         ` Jaco Kroon
2023-08-25 12:01           ` Laurence Oberman
2023-08-26 18:18             ` Jaco Kroon
2023-09-06 21:03             ` Jaco Kroon
2023-09-06 21:22               ` Jens Axboe
2023-09-06 22:05                 ` Jaco Kroon
