* [linux-lvm] Raid5 sync problem
@ 2001-05-24 11:24 Kaj-Michael Lang
2001-05-24 16:46 ` Austin Gonyou
2001-05-24 17:04 ` [linux-lvm] Raid5 sync problem Andreas Dilger
0 siblings, 2 replies; 9+ messages in thread
From: Kaj-Michael Lang @ 2001-05-24 11:24 UTC (permalink / raw)
To: linux-raid; +Cc: linux-lvm
I'm also mailing this to the LVM list, as LVM might be the cause of this. I
don't know.
Anyway, I have four 6 GB drives in a RAID5 configuration. On top of the md
device I have a VG with a couple of LVs in it. I'm running 2.4.4 with the
beta7 LVM patch and using ext2 on all the LVs.
One drive failed (actually just an IDE cable problem, but anyway) and the
machine became really, really slow. Load was up to 200 and almost every
command got stuck, so all I could do was reboot.
It came up OK, and I hot-added the drive back and it started to sync, but
after a while the sync was down to 0 KB/sec and the same slowdown started
to happen.
So I rebooted again, and now when it started to fsck the LVs it slowed down
again. So I booted with init=/bin/bash and let the sync finish without any
other access to the RAID/LVs, and after that everything worked fine again.
So the problem seems to be that accessing the RAID5 while it is syncing
locks up somewhere. Does anyone know of a fix, or know what I should do so
that syncing won't kill the machine?
Kaj-Michael Lang
milang@tal.org
Java? I've heard of it, it is what I drink while hacking PHP!
* Re: [linux-lvm] Raid5 sync problem
2001-05-24 11:24 [linux-lvm] Raid5 sync problem Kaj-Michael Lang
@ 2001-05-24 16:46 ` Austin Gonyou
2001-05-25 10:55 ` Kaj-Michael Lang
2001-05-24 17:04 ` [linux-lvm] Raid5 sync problem Andreas Dilger
1 sibling, 1 reply; 9+ messages in thread
From: Austin Gonyou @ 2001-05-24 16:46 UTC (permalink / raw)
To: linux-lvm; +Cc: linux-raid
You are running off of parity. Until you get that drive fixed, it will
hurt.
--
Austin Gonyou
Systems Architect, CCNA
Coremetrics, Inc.
Phone: 512-796-9023
email: austin@coremetrics.com
* Re: [linux-lvm] Raid5 sync problem
2001-05-24 11:24 [linux-lvm] Raid5 sync problem Kaj-Michael Lang
2001-05-24 16:46 ` Austin Gonyou
@ 2001-05-24 17:04 ` Andreas Dilger
2001-05-24 18:28 ` Austin Gonyou
2001-05-24 20:00 ` José Luis Domingo López
1 sibling, 2 replies; 9+ messages in thread
From: Andreas Dilger @ 2001-05-24 17:04 UTC (permalink / raw)
To: linux-lvm
Kaj-Michael Lang writes:
> Anyway, I have four 6 GB drives in a RAID5 configuration. On top of the md
> device I have a VG with a couple of LVs in it. I'm running 2.4.4 with the
> beta7 LVM patch and using ext2 on all the LVs.
> One drive failed (actually just an IDE cable problem, but anyway) and the
> machine became really, really slow. Load was up to 200 and almost every
> command got stuck, so all I could do was reboot.
> It came up OK, and I hot-added the drive back and it started to sync, but
> after a while the sync was down to 0 KB/sec and the same slowdown started
> to happen.
> So I rebooted again, and now when it started to fsck the LVs it slowed down
> again. So I booted with init=/bin/bash and let the sync finish without any
> other access to the RAID/LVs, and after that everything worked fine again.
>
> So the problem seems to be that accessing the RAID5 while it is syncing
> locks up somewhere. Does anyone know of a fix, or know what I should do so
> that syncing won't kill the machine?
This is the "RAID5 speed limit" which only kicks in if you have I/O to
the volume while you are resyncing. You can change a parameter somewhere
in the RAID code which increases the minimum speed at which resync is
done on active volumes (at the expense of performance for other apps).
If you have a single huge RAID 5 volume which is always in use while the
system is running, then you will always get the resync slowdown. I'm not
sure why this would cause your whole system to get so slow, unless it is
something like you are swapping to RAID5, and this kills performance.
Cheers, Andreas
--
Andreas Dilger \ "If a man ate a pound of pasta and a pound of antipasto,
\ would they cancel out, leaving him still hungry?"
http://www-mddsp.enel.ucalgary.ca/People/adilger/ -- Dogbert
* Re: [linux-lvm] Raid5 sync problem
2001-05-24 17:04 ` [linux-lvm] Raid5 sync problem Andreas Dilger
@ 2001-05-24 18:28 ` Austin Gonyou
2001-05-24 20:00 ` José Luis Domingo López
1 sibling, 0 replies; 9+ messages in thread
From: Austin Gonyou @ 2001-05-24 18:28 UTC (permalink / raw)
To: linux-lvm
Well, as noted, he's got one drive not in the RAID5 right now, so his RAID5
is running in a degraded state rather than optimal. This will cause some
pretty severe slowdowns with software RAID5, versus hardware dedicated to
the job.
--
Austin Gonyou
Systems Architect, CCNA
Coremetrics, Inc.
Phone: 512-796-9023
email: austin@coremetrics.com
On Thu, 24 May 2001, Andreas Dilger wrote:
> If you have a single huge RAID 5 volume which is always in use while the
> system is running, then you will always get the resync slowdown. I'm not
> sure why this would cause your whole system to get so slow, unless it is
> something like you are swapping to RAID5, and this kills performance.
* Re: [linux-lvm] Raid5 sync problem
2001-05-24 17:04 ` [linux-lvm] Raid5 sync problem Andreas Dilger
2001-05-24 18:28 ` Austin Gonyou
@ 2001-05-24 20:00 ` José Luis Domingo López
1 sibling, 0 replies; 9+ messages in thread
From: José Luis Domingo López @ 2001-05-24 20:00 UTC (permalink / raw)
To: linux-lvm
On Thursday, 24 May 2001, at 11:04:27 -0600,
Andreas Dilger wrote:
> Kaj-Michael Lang writes:
> [...]
> This is the "RAID5 speed limit" which only kicks in if you have I/O to
> the volume while you are resyncing. You can change a parameter somewhere
> in the RAID code which increases the minimum speed at which resync is
> done on active volumes (at the expense of performance for other apps).
>
Under 2.4.x there are two "tunables" in /proc, namely:
/proc/sys/dev/raid/speed_limit_min
/proc/sys/dev/raid/speed_limit_max
Both are expressed in KB/s. The latter sets the maximum array
reconstruction speed for RAID-1,4,5 (taking available I/O), and the former
sets the minimum "guaranteed" reconstruction speed.
At least on my 2.4.4, the minimum speed is set to 100 KB/s and the maximum
speed to 100000 KB/s, but it seems that array reconstruction always
operates at the minimum guaranteed speed, 100 KB/s.
So echo "5000" > /proc/sys/dev/raid/speed_limit_min should help
determine whether this speed limit is causing the problems described.
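A minimal sketch of that experiment, assuming the 2.4-style
/proc/sys/dev/raid directory named above. The function names and the
optional directory argument are made up for illustration and are not part
of any md tool; writing the limit needs root.

```shell
#!/bin/sh
# Sketch only: helper names and the directory argument are illustrative.
# The default path is the one quoted in the message above.
RAID_SYSCTL_DIR=${RAID_SYSCTL_DIR:-/proc/sys/dev/raid}

# Print the current minimum and maximum resync speed limits (KB/s).
show_raid_limits() {
    dir=${1:-$RAID_SYSCTL_DIR}
    printf 'min=%s max=%s\n' \
        "$(cat "$dir/speed_limit_min")" \
        "$(cat "$dir/speed_limit_max")"
}

# Set the guaranteed minimum resync speed (KB/s) and print the previous
# value so it can be restored later. Rejects anything but a whole number.
set_raid_min_speed() {
    new=$1
    dir=${2:-$RAID_SYSCTL_DIR}
    case $new in
        ''|*[!0-9]*)
            echo "usage: set_raid_min_speed KB_PER_SEC [DIR]" >&2
            return 1
            ;;
    esac
    old=$(cat "$dir/speed_limit_min") || return 1
    echo "$new" > "$dir/speed_limit_min" || return 1
    echo "$old"
}
```

Typical use would be something like old=$(set_raid_min_speed 5000), then
watching /proc/mdstat for a while, then set_raid_min_speed "$old" to put
the previous limit back.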
--
José Luis Domingo López
Linux Registered User #189436 Debian GNU/Linux Potato (P166 64 MB RAM)
jdomingo EN internautas PUNTO org => ¿ Spam ? Atente a las consecuencias
jdomingo AT internautas DOT org => Spam at your own risk
* Re: [linux-lvm] Raid5 sync problem
2001-05-24 16:46 ` Austin Gonyou
@ 2001-05-25 10:55 ` Kaj-Michael Lang
2001-05-25 19:46 ` Andreas Dilger
0 siblings, 1 reply; 9+ messages in thread
From: Kaj-Michael Lang @ 2001-05-25 10:55 UTC (permalink / raw)
To: linux-lvm; +Cc: linux-raid
> You are running off of parity. Until you get that drive fixed, it will
> hurt.
>
I *think* I've found out what happened (not sure, not tested): fsck was
trying to scan all of the LVs on the same RAID5 at the same time. And as the
RAID5 code had to calculate everything while also trying to sync in the
background... instant high load. But I'm not sure, and I really don't want
to test it.
Could that be it?
* Re: [linux-lvm] Raid5 sync problem
2001-05-25 10:55 ` Kaj-Michael Lang
@ 2001-05-25 19:46 ` Andreas Dilger
2001-05-25 22:36 ` [linux-lvm] lvm_chr_ioctl Justin Booth
0 siblings, 1 reply; 9+ messages in thread
From: Andreas Dilger @ 2001-05-25 19:46 UTC (permalink / raw)
To: linux-lvm; +Cc: linux-raid
Kaj-Michael writes:
> I *think* I've found out what happened (not sure, not tested): fsck was
> trying to scan all of the LVs on the same RAID5 at the same time. And as the
> RAID5 code had to calculate everything while also trying to sync in the
> background... instant high load. But I'm not sure, and I really don't want
> to test it.
Yes, there was a bug in fsck where if it couldn't determine if two devices
were on the same disk, it assumed they were NOT and did fsck in parallel.
In the MOST recent fsck (i.e. 1 week old, CVS only), this has been fixed
so that it will run the two fscks in serial. Whether that was your problem
or not, I'm not sure.
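With an fsck that lacks the fix, one possible workaround is to check the
LVs one at a time instead of letting them run in parallel. A rough sketch,
in which the function name, the FSCK_CMD override, and the device paths in
the usage note are all hypothetical:

```shell
#!/bin/sh
# Sketch only: run one filesystem check at a time so that several LVs on
# the same RAID5 array are never checked in parallel.
# FSCK_CMD is an assumption made for illustration; it defaults to e2fsck
# in "preen" mode (-p), which fits the ext2 filesystems in this thread.
FSCK_CMD=${FSCK_CMD:-e2fsck -p}

check_serially() {
    status=0
    for dev in "$@"; do
        echo "checking $dev"
        # $FSCK_CMD is unquoted on purpose so its options split into words.
        if $FSCK_CMD "$dev"; then
            rc=0
        else
            rc=$?
        fi
        # fsck exit statuses are bit flags, so OR them together.
        status=$((status | rc))
    done
    return "$status"
}
```

It would be called as, say, check_serially /dev/vg0/home /dev/vg0/var. The
fsck wrapper also documents an -s option to serialize checks; whether a
given installed version supports it is worth confirming in its man page.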
Cheers, Andreas
--
Andreas Dilger \ "If a man ate a pound of pasta and a pound of antipasto,
\ would they cancel out, leaving him still hungry?"
http://www-mddsp.enel.ucalgary.ca/People/adilger/ -- Dogbert
* [linux-lvm] lvm_chr_ioctl
2001-05-25 19:46 ` Andreas Dilger
@ 2001-05-25 22:36 ` Justin Booth
2001-05-25 23:00 ` Joe Harvell
0 siblings, 1 reply; 9+ messages in thread
From: Justin Booth @ 2001-05-25 22:36 UTC (permalink / raw)
To: linux-lvm
I seem to be getting some errors like
lvm_chr_ioctl: unknown command 4004fe0a
when I do a vgcreate. I also noticed that if a partition is under heavy
strain and I take a snapshot, the snapshot segfaults, and all subsequent
lvscans will too until I lvremove the volume and re-create it.
Should I be worried about the lvm_chr_ioctl message?
Thanks in advance, Justin Booth
* Re: [linux-lvm] lvm_chr_ioctl
2001-05-25 22:36 ` [linux-lvm] lvm_chr_ioctl Justin Booth
@ 2001-05-25 23:00 ` Joe Harvell
0 siblings, 0 replies; 9+ messages in thread
From: Joe Harvell @ 2001-05-25 23:00 UTC (permalink / raw)
To: linux-lvm
Justin:
I saw this exact same message (either when doing a vgscan or vgchange -a
y). Well, I'm not 100% sure about the number reported in the error message,
but it looks the same.
In my case it was because I was running the beta2 version of the kernel code
and the beta7 version of the user-space software. I had recompiled my
kernel and forgotten to re-apply the kernel patch. I fixed the problem by
patching the kernel code up to the beta7 version.