* [linux-lvm] cmp of inactive mirrored LV fails
From: starlight @ 2011-12-02 17:20 UTC
To: linux-lvm

Hello,

Had a drive report via 'smartd' and 'smartctl'
that it was having trouble reading a sector.
For about two days "197 Current_Pending_Sector"
indicated one.  Then the drive appears to have
recovered itself and the count went to zero.

Strangely a 'dd iflag=direct' of the drive to
/dev/null was successful during the period
where the unreadable sector count was non-zero.
'smartctl -l error' reports no errors.

Concerned, I ran a 'cmp' of the two mirror image
LVs (_mimage_0 vs _mimage_1) for each of the
LVs on the drive.  Two compare differently;
in particular one that is presently not "open"
according to the 'lvs' command failed to compare.

Does anyone know how to deal with this?
A concern is that the _mimage_0 LV is
on the drive experiencing the problem.

Versions are

lvm2-2.02.84-6.el5_7.1
kernel-2.6.18-274.el5

Thanks

* Re: [linux-lvm] cmp of inactive mirrored LV fails
From: starlight @ 2011-12-02 18:34 UTC
To: LVM general discussion and development

After a little digging discovered and ran 'debugfs'
and used the 'testb' command to determine that the
mirror mismatch blocks are "not in use".

So that's good.

However I am rather disturbed that LVM
mirroring appears to have bugs that allow
images to become out-of-sync.

Have read that MD is the only way to go
with any kind of RAID and now I see that
is true.  If anyone can explain what
happened here in any positive light I'd
be interested in hearing about it.  For
now I see LVM mirroring as a turkey that
should be avoided.

Additional details:

* both LVs with discrepancies are "root"
  file system LVs where one or the other
  is selected in differing 'grub' boot
  configuration lines

* both LVs have an associated mirror log

* in the past have experienced system
  lockups due to a mirrored swap volume;
  reported it to RH Bugzilla and was
  told there are deadlock scenarios in
  the kernel and that mirrored swap
  volumes are not supported.  This and
  today's discovery leads me to the
  conclusion that LVM mirroring is
  a seriously bad idea.

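For reference, a minimal sketch of the check described in the two messages above: turn the byte offsets reported by 'cmp -l' into filesystem block numbers that can then be handed to the 'testb' command of 'debugfs'. The function name, the default block size of 4096 and the device paths in the comments are assumptions for illustration, not taken from the thread; the sketch also assumes each mirror image maps 1:1 onto the LV, so an offset in the image equals an offset in the filesystem.

```shell
#!/bin/sh
# Print the unique block numbers in which two images differ.
# Usage: mismatch_blocks IMAGE_A IMAGE_B [BLOCK_SIZE]
# BLOCK_SIZE defaults to 4096 (an assumption); take the real value from
# the "Block size" line of 'tune2fs -l' for the filesystem in question.
mismatch_blocks() {
    # 'cmp -l' prints one line per differing byte: a 1-based decimal
    # byte offset followed by the two differing byte values in octal.
    cmp -l "$1" "$2" |
        awk -v bs="${3:-4096}" '
            BEGIN { last = -1 }
            {
                b = int(($1 - 1) / bs)          # 0-based block number
                if (b != last) printf "%.0f\n", b
                last = b
            }'
}

# Hypothetical usage (placeholder device names, run as root):
#   for b in $(mismatch_blocks /dev/mapper/VG-LV_mimage_0 \
#                              /dev/mapper/VG-LV_mimage_1 4096); do
#       debugfs -R "testb $b" /dev/mapper/VG-LV
#   done
```

'debugfs' reports for each block whether it is marked in use, which is how mismatches confined to free space can be told apart from ones that touch live data.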
* Re: [linux-lvm] cmp of inactive mirrored LV fails
From: Stuart D. Gathman @ 2011-12-02 21:01 UTC
To: LVM general discussion and development

Centuries ago, Nostradamus foresaw that on Dec 2, starlight@binnacle.cx...:

> After a little digging discovered and ran 'debugfs'
> and used the 'testb' command to determine that the
> mirror mismatch blocks are "not in use".
>
> So that's good.
>
> However I am rather disturbed that LVM
> mirroring appears to have bugs that allow
> images to become out-of-sync.

This is not necessarily a bug.  Both MD and LVM support DISCARD, and
discarded blocks are not necessarily synced between mirror legs.  The
same thing happens with MD, and the MD "cmp" operation reports the
number of blocks out of sync, but not which blocks!  (So LVM is actually
an improvement on that score.)  Swap volumes make heavy use of DISCARD
and are especially likely to have blocks out of sync.

> Have read that MD is the only way to go
> with any kind of RAID and now I see that
> is true.  If anyone can explain what
> happened here in any positive light I'd
> be interested in hearing about it.  For
> now I see LVM mirroring as a turkey that
> should be avoided.

Yes, I still consider LVM mirroring as experimental (but not a "turkey").
I especially don't like the way the mirror log works.  I was spoiled by
AIX LVM mirroring.

--
	      Stuart D. Gathman <stuart@bmsi.com>
    Business Management Systems Inc.  Phone: 703 591-0911 Fax: 703 591-6154
"Confutatis maledictis, flammis acribus addictis" - background song for
a Microsoft sponsored "Where do you want to go from here?" commercial.

* Re: [linux-lvm] cmp of inactive mirrored LV fails
From: starlight @ 2011-12-02 21:14 UTC
To: LVM general discussion and development

By coincidence I've just set up my first MD
RAID0 partitions on a different system.
Discovered

   echo "check" >>/sys/block/md?/md/sync_action

which triggers a mirror check.  I'm guessing that
this operation takes account of the DISCARD
discrepancies and therefore does not report
false errors.  (correct me if I'm wrong)

Does LVM have anything similar to this feature?

At 04:01 PM 12/2/2011 -0500, Stuart D. Gathman wrote:
>This is not necessarily a bug.  Both MD and LVM support
>DISCARD, and discarded blocks are not necessarily synced
>between mirror legs.  The same thing happens with MD,
>and the MD "cmp" operation reports the number of blocks
>out of sync, but not which blocks!  (So LVM is actually
>an improvement on that score.)  Swap volumes make
>heavy use of DISCARD and are especially likely to
>have blocks out of sync.

* Re: [linux-lvm] cmp of inactive mirrored LV fails
From: Stuart D. Gathman @ 2011-12-02 21:54 UTC
To: LVM general discussion and development

Centuries ago, Nostradamus foresaw that on Dec 2, starlight@binnacle.cx...:

> By coincidence I've just set up my first MD
> RAID0 partitions on a different system.
> Discovered
>
>    echo "check" >>/sys/block/md?/md/sync_action
>
> which triggers a mirror check.  I'm guessing that
> this operation takes account of the DISCARD
> discrepancies and therefore does not report
> false errors.  (correct me if I'm wrong)

At least as of EL5.7, MD sync check does not take into account DISCARD,
and does not report individual blocks out of sync - just a count.

BTW, RAID0 does *not* provide mirroring!  You want RAID1 or RAID10 for that.

--
	      Stuart D. Gathman <stuart@bmsi.com>
    Business Management Systems Inc.  Phone: 703 591-0911 Fax: 703 591-6154
"Confutatis maledictis, flammis acribus addictis" - background song for
a Microsoft sponsored "Where do you want to go from here?" commercial.

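For reference, a rough sketch of how the MD check discussed above can be started and its count read back. The 'sync_action' file is the one quoted in the thread; 'mismatch_cnt' is not named there and is, to my understanding, the md sysfs attribute that holds the count Stuart refers to. The function names and the overridable SYSFS variable are assumptions added for illustration (the override only exists so the helpers can be tried against a scratch directory).

```shell
#!/bin/sh
# Root of sysfs; overridable so the helpers can be exercised
# without a real MD array (illustrative convenience only).
SYSFS="${SYSFS:-/sys}"

# Start a consistency check on an MD array, e.g. 'md_check md0'.
md_check() {
    echo check > "$SYSFS/block/$1/md/sync_action"
}

# Show the current action and the mismatch count from the last check.
# The result is only a count; it does not say which blocks differ.
md_report() {
    action=$(cat "$SYSFS/block/$1/md/sync_action")
    count=$(cat "$SYSFS/block/$1/md/mismatch_cnt")
    echo "$1: sync_action=$action mismatch_cnt=$count"
}

# Hypothetical usage (as root, on a RAID1 array named md0):
#   md_check md0
#   md_report md0     # repeat until sync_action returns to "idle"
```

As noted in the message above, a non-zero count by itself does not distinguish harmless differences in unused blocks from real divergence.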
* Re: [linux-lvm] cmp of inactive mirrored LV fails
From: starlight @ 2011-12-02 22:02 UTC
To: LVM general discussion and development

At 04:54 PM 12/2/2011 -0500, Stuart D. Gathman wrote:
>At least as of EL5.7, MD sync check does not take
>into account DISCARD, and does not report
>individual blocks out of sync - just a count.

Ugly.  Will surely avoid SSD mirrors till
that one is figured out.

>BTW, RAID0 does *not* provide mirroring!  You want
>RAID1 or RAID10 for that.

Ya.  Meant RAID1--was a goof.

* Re: [linux-lvm] cmp of inactive mirrored LV fails
From: starlight @ 2011-12-02 21:27 UTC
To: LVM general discussion and development

At 04:01 PM 12/2/2011 -0500, Stuart D. Gathman wrote:
>Both MD and LVM support
>DISCARD, and discarded blocks are not necessarily synced
>between mirror legs.

Googled it--seems like DISCARD (aka TRIM) is
about telling SSDs that blocks are no longer
in use so that the SSDs can optimally release
tracking of the storage.

I can see how this might result in random
data appearing in the "discarded" region.
I don't see how the 'mdadm' mirror check
will cope with that result unless SSDs
return zeros or some other constant
value when TRIMed blocks are subsequently
read.

However this is a hard-drive scenario--no
SSDs.  It would seem that hard drives
generally will either ignore TRIM operations
or via drive flags inform the kernel to not
attempt them.

So this leaves me with two LVs where it's
clear that LVM mirroring failed to maintain
synchronization at some point in the last
year or two.

Also based on this thread

http://www.issociate.de/board/post/507507/SSD_-_TRIM_command.html

it would seem TRIM is very much a bleeding-edge
feature.  Unlikely it appears in the CentOS 5.7
kernel in use on the affected system.

So I'm sticking with "turkey" as the
appropriate characterization of LVM
mirroring.

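For reference, a hedged sketch of one way to see whether the kernel believes a block device accepts DISCARD at all. It relies on the 'queue/discard_max_bytes' sysfs attribute, which, as far as I know, exists only on kernels considerably newer than the 2.6.18 kernel in this thread, so on that system the sketch would most likely report "unknown". The function name and the SYSFS override are assumptions for illustration.

```shell
#!/bin/sh
# Report whether the kernel advertises discard support for a block device.
# Usage: discard_supported sda
# SYSFS is overridable only so the function can be tried on a scratch tree.
discard_supported() {
    f="${SYSFS:-/sys}/block/$1/queue/discard_max_bytes"
    if [ ! -r "$f" ]; then
        # Attribute missing: kernel too old to expose it, or no such device.
        echo "$1: unknown (no discard_max_bytes attribute)"
    elif [ "$(cat "$f")" -gt 0 ]; then
        echo "$1: discard supported"
    else
        # A value of 0 is taken here to mean the device does not do discard.
        echo "$1: discard not supported"
    fi
}
```

A rotating hard drive would normally be expected to show up as "discard not supported" on a kernel that has the attribute, which fits the reasoning in the message above.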
* Re: [linux-lvm] cmp of inactive mirrored LV fails
From: starlight @ 2011-12-02 21:47 UTC
To: LVM general discussion and development

It just occurred to me that the system
in question is a development box that
is abused in stress testing and other
scenarios.  It crashes in horrible ways
fairly often.

So I realize I should soften my criticism
of LVM mirroring somewhat.  Probably
the differences resulted during one
of the numerous kernel deaths.
Block-by-block mirror re-synchronization
has been suppressed by the existence
of persistent mlogs.

Then again, I suppose the mirror log
logic is where the failure resides.
That fact and that LVM mirroring reliably
hangs the kernel if a swap mirror and
only moderate system stress are present
leaves me with the conviction that LVM
mirrors will be entirely avoided
here going forward.

https://bugzilla.redhat.com/show_bug.cgi?id=559959

* Re: [linux-lvm] cmp of inactive mirrored LV fails
From: Ron Johnson @ 2011-12-02 23:57 UTC
To: LVM general discussion and development

On 12/02/2011 03:47 PM, starlight@binnacle.cx wrote:
> It just occurred to me that the system
> in question is a development box that
> is abused in stress testing and other
> scenarios.  It crashes in horrible ways
> fairly often.
>

Unless you're testing the kernel or libc, crashes aren't really
supposed to happen, even when putting the system under heavy load.

--
Vegetarians eat vegetables,
Humanitarians frighten me.

* Re: [linux-lvm] cmp of inactive mirrored LV fails
From: starlight @ 2011-12-03 3:44 UTC
To: LVM general discussion and development

On 12/02/2011 17:47 PM, Ron Johnson wrote:
>On 12/02/2011 03:47 PM, starlight binnacle cx wrote:
>>It just occurred to me that the system
>>in question is a development box that
>>is abused in stress testing and other
>>scenarios.  It crashes in horrible ways
>>fairly often.
>
>Unless you're testing the kernel or libc, crashes aren't really
>supposed to happen, even when putting the system under heavy
>load.
>

:-D   !!!!!!

You're joking, right?

I know at least half a dozen ways to crash
or deadlock linux.  Perhaps some of the
bugs have been fixed in newer kernels
(I generally report them), but like
any OS out there Linux has its holes.

The LVM bug where a system can be hung
with a mirrored swap LV is just one.
Reported it for RHEL 4 but I'll bet
it's still in there (it was obvious
that RH had no intention of fixing it).

https://bugzilla.redhat.com/show_bug.cgi?id=559959

* Re: [linux-lvm] cmp of inactive mirrored LV fails
From: Lars Ellenberg @ 2012-01-02 13:38 UTC
To: linux-lvm

On Fri, Dec 02, 2011 at 01:34:24PM -0500, starlight@binnacle.cx wrote:
> After a little digging discovered and ran 'debugfs'
> and used the 'testb' command to determine that the
> mirror mismatch blocks are "not in use".
>
> So that's good.
>
> However I am rather disturbed that LVM
> mirroring appears to have bugs that allow
> images to become out-of-sync.

I'd like to point to "unstable pages".

The Problem:
http://lwn.net/Articles/429305/
http://thread.gmane.org/gmane.linux.kernel/1103571
http://thread.gmane.org/gmane.linux.scsi/59259

And many many more older threads on various ML,
some of them misleading, some of them mixing this issue of
in-flight modifications with actual (hardware caused) data corruption.

In short: you do something like continuously append to some (log) file,
*not* doing fsync, while also having some background writeout (global sync).

  perl -le '$|=1; print scalar localtime while !select undef,undef,undef,0.001;' >> log &
  while sleep 1; do sync ; done

Other variants involve mmap, but anything that keeps modifying a buffer will do.

Global writeout causes dirty pages to be flushed to disk,
the continuous append changes the page while it is being written out.

These inconsistencies are usually short lived, not persistent, because,
at some point, the "changing spot" will move to some other page,
the page has been redirtied by the last change,
and eventually will be written out one last time.

But.

Consider the case of already unlinked temporary files, such as used by
many data bases and other applications.

It is a valid optimization for file systems to skip implicit write-out of
"deleted" pages.

In which case you may end up with persistent data divergence on disk,
supposedly only in the "deleted" area -- which matches your observation.
(such tmp files often live in /tmp, which may be on your root fs,
which would then also match your observation).

Also for swap it is legal to start swapout, then recognize suddenly the
page is needed after all, mark its on-disk location as invalid and
continue to use it (which will change it while it is "in-flight").

So unless your file system does ensure "stable pages",
anything that submits a bio to more than one location
without first copying the data to private (thus supposedly stable) pages,
will suffer from that problem.

ext4 (and others) in recent kernels are supposed to provide stable pages.
afaik (means: I may be wrong), ext3 does not yet fully guarantee
"stable pages", though it has gotten much better.

> Have read that MD is the only way to go
> with any kind of RAID and now I see that
> is true.  If anyone can explain what
> happened here in any positive light I'd
> be interested in hearing about it.  For
> now I see LVM mirroring as a turkey that
> should be avoided.
>
> Additional details:
>
> * both LVs with discrepancies are "root"
>   file system LVs where one or the other
>   is selected in differing 'grub' boot
>   configuration lines
>
> * both LVs have an associated mirror log
>
> * in the past have experienced system
>   lockups due to a mirrored swap volume;
>   reported it to RH Bugzilla and was
>   told there are deadlock scenarios in
>   the kernel and that mirrored swap
>   volumes are not supported.  This and
>   today's discovery leads me to the
>   conclusion that LVM mirroring is
>   a seriously bad idea.

--
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.

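As a toy illustration of the "unstable pages" argument above (a sketch only, not how the kernel does I/O): plain files stand in for a page-cache page and the two mirror legs. In the first function the "page" changes between the two writes, so the legs diverge; in the second the data is first copied to a private snapshot, which is the "stable" variant the message describes. All names here are made up for illustration.

```shell
#!/bin/sh
# Toy model: files stand in for a page and two mirror legs.

# The page is modified while the mirrored write is still "in flight".
simulate_unstable_page() {
    dir=$1
    printf 'AAAA' > "$dir/page"      # page contents when writeout starts
    cp "$dir/page" "$dir/leg0"       # first leg receives the original data
    printf 'AABB' > "$dir/page"      # application modifies the page
    cp "$dir/page" "$dir/leg1"       # second leg receives the modified data
    if cmp -s "$dir/leg0" "$dir/leg1"; then echo "legs match"; else echo "legs differ"; fi
}

# The data is copied to a private buffer before being sent to either leg.
simulate_stable_page() {
    dir=$1
    printf 'AAAA' > "$dir/page"
    cp "$dir/page" "$dir/private"    # private copy taken before submission
    cp "$dir/private" "$dir/leg0"
    printf 'AABB' > "$dir/page"      # later modification no longer matters
    cp "$dir/private" "$dir/leg1"
    if cmp -s "$dir/leg0" "$dir/leg1"; then echo "legs match"; else echo "legs differ"; fi
}
```

In the real case the modified page is redirtied and written again, which repairs the divergence unless that final write is skipped, as with already deleted files or invalidated swap slots.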
Thread overview: 11+ messages

2011-12-02 17:20 [linux-lvm] cmp of inactive mirrored LV fails starlight
2011-12-02 18:34 ` starlight
2011-12-02 21:01   ` Stuart D. Gathman
2011-12-02 21:14     ` starlight
2011-12-02 21:54       ` Stuart D. Gathman
2011-12-02 22:02         ` starlight
2011-12-02 21:27     ` starlight
2011-12-02 21:47     ` starlight
2011-12-02 23:57       ` Ron Johnson
2011-12-03  3:44       ` starlight
2012-01-02 13:38   ` Lars Ellenberg