* [linux-lvm] snapshots on RAID5 blow up machine ("switching cache buffer size")
@ 2003-07-02 2:38 Scott Mcdermott
2003-07-04 8:28 ` Jason H. Smith
0 siblings, 1 reply; 2+ messages in thread
From: Scott Mcdermott @ 2003-07-02 2:38 UTC (permalink / raw)
To: linux-lvm
Today my NFS-mounted mail spool slowed to a crawl in my mail
agent; I logged into the server and noticed the load average
over 20. I saw that a developer had a build going on (from
another machine) off one of the NFS exports (LVM on RAID5).
This particular LV was snapshotted at the time and
presumably the build caused lots of snapshot activity, but a
load over 20 is obviously abnormal.
I saw these in the logs:
kernel: raid5: switching cache buffer size, 4096 --> 1024
kernel: raid5: switching cache buffer size, 1024 --> 4096
kernel: raid5: switching cache buffer size, 4096 --> 1024
kernel: raid5: switching cache buffer size, 0 --> 1024
last message repeated 3 times
kernel: raid5: switching cache buffer size, 1024 --> 4096
kernel: raid5: switching cache buffer size, 0 --> 1024
kernel: raid5: switching cache buffer size, 0 --> 4096
last message repeated 2 times
kernel: raid5: switching cache buffer size, 4096 --> 1024
Mostly the transitions were 512 to 4k, then back again (the
excerpt above also shows 1024 and 0 as sizes), hundreds of
times per second.
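To put a number on the churn, the switch messages can be tallied straight from the log. The following is a minimal sketch, not part of the original report: `tally_switches` and `SAMPLE_LOG` are hypothetical names, and it assumes that a syslog "last message repeated N times" line refers to the switch message immediately before it.

```python
import re
from collections import Counter

# Matches the raid5 message quoted above and captures old and new sizes.
SWITCH = re.compile(r"raid5: switching cache buffer size, (\d+) --> (\d+)")
# Matches syslog's repeat-compression line.
REPEAT = re.compile(r"last message repeated (\d+) times?")

# The excerpt from the report, used here as sample input.
SAMPLE_LOG = """\
kernel: raid5: switching cache buffer size, 4096 --> 1024
kernel: raid5: switching cache buffer size, 1024 --> 4096
kernel: raid5: switching cache buffer size, 4096 --> 1024
kernel: raid5: switching cache buffer size, 0 --> 1024
last message repeated 3 times
kernel: raid5: switching cache buffer size, 1024 --> 4096
kernel: raid5: switching cache buffer size, 0 --> 1024
kernel: raid5: switching cache buffer size, 0 --> 4096
last message repeated 2 times
kernel: raid5: switching cache buffer size, 4096 --> 1024
"""


def tally_switches(lines):
    """Count (old_size, new_size) transitions, expanding repeat lines."""
    counts = Counter()
    last = None  # most recent switch seen, if the previous line was one
    for line in lines:
        switch = SWITCH.search(line)
        if switch:
            last = (int(switch.group(1)), int(switch.group(2)))
            counts[last] += 1
            continue
        repeat = REPEAT.search(line)
        if repeat and last is not None:
            # Assumption: the repeat line applies to the preceding switch.
            counts[last] += int(repeat.group(1))
        else:
            last = None  # unrelated line; a later repeat is not ours
    return counts


if __name__ == "__main__":
    for (old, new), n in sorted(tally_switches(SAMPLE_LOG.splitlines()).items()):
        print(f"{old:>5} --> {new:<5} {n}")
```

Feeding it the full system log instead of the excerpt, and dividing by the time span covered, would give the actual switch rate.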
I've searched the archives and see that this is related to
the filesystem using 4k blocks whereas snapshot IO uses 1k
blocks, so RAID5 code gets confused, but I don't understand
the internals of the filesystem to be able to say this is
expected behavior.
My questions are these:
- Is this a RAID5 problem or an LVM problem, or both?
I'm using an SMP kernel 2.4.22-pre2. In other words,
am I asking the wrong list about this problem because
it's a perfectly fair use of the backing store by the
LVM subsystem?
- Is this problem nonexistent on RAID1 backed or
RAID10 backed VGs (especially the latter since I am
contemplating a switch thereto)?
- Does the problem depend on the snapshot extents
residing on the same PV as the snapshotted LVs? If
so, how can I force snapshot extents onto particular
PVs when the PV that holds the LVs in question still
has unallocated extents?
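On the placement part of the last question: lvcreate(8) documents an optional trailing list of physical volume paths that restricts where the new volume's extents are allocated. The following is a minimal sketch, assuming that form applies to snapshot creation on the LVM version in use. The function name `snapshot_command` and all device and volume names are hypothetical, and the command is only built and printed, never executed.

```python
def snapshot_command(origin, name, size, pvs=()):
    """Build an lvcreate argument list for a snapshot of `origin`.

    Any paths in `pvs` are appended after the origin LV. This relies on
    the assumption that lvcreate restricts allocation to the listed PVs.
    """
    cmd = ["lvcreate", "--snapshot", "--size", size, "--name", name, origin]
    cmd.extend(pvs)
    return cmd


if __name__ == "__main__":
    # Hypothetical names: origin LV on one PV, snapshot extents on another.
    cmd = snapshot_command("/dev/vg0/home", "home_snap", "1G", ["/dev/sdc1"])
    print(" ".join(cmd))
```

Whether keeping snapshot extents on a separate PV avoids the buffer-size switching is not established in this thread; the sketch only shows how such a layout might be requested.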
I am planning to make extensive use of snapshots for backup
purposes (I plan to keep the last seven days of data online
as daily exported snapshots, to let users easily retrieve
things without going to tape), so I need to try to
understand this problem better.
Thanks for any comments.
* Re: [linux-lvm] snapshots on RAID5 blow up machine ("switching cache buffer size")
From: Jason H. Smith @ 2003-07-04 8:28 UTC (permalink / raw)
To: linux-lvm
On Wednesday 02 July 2003 02:37 pm, Scott Mcdermott wrote:
> I've searched the archives and see that this is related to
> the filesystem using 4k blocks whereas snapshot IO uses 1k
> blocks, so RAID5 code gets confused, but I don't understand
> the internals of the filesystem to be able to say this is
> expected behavior.
>
> My questions are these:
>
> - Is this a RAID5 problem or an LVM problem, or both?
I ran into this a while ago and I eventually decided that, currently,
software RAID 5 and LVM are, to put it nicely, "incompatible" if you want
long-term snapshots. I'm no expert, but IMO, these days, the RAID5
subsystem should not assume i/o of equal chunk sizes, so it is md's
problem. IIRC, the 2.6 kernel is supposed to address this; but there's
not much you can do at the moment.
Somebody please correct me if I'm wrong.
> - Is this problem nonexistent on RAID1 backed or
> RAID10 backed VGs (especially the latter since I am
> contemplating a switch thereto)?
I believe that's correct, but please don't quote me. Testing this should
be simple enough.
--
Jason Smith
Open Enterprise Systems
Bangkok, Thailand
http://www.oes.co.th