* [linux-lvm] LVM seems to be causing SCSI errors.
@ 2001-10-05 16:58 Karl
2001-10-05 21:31 ` Andreas Dilger
2001-10-08 10:07 ` Heinz J . Mauelshagen
0 siblings, 2 replies; 15+ messages in thread
From: Karl @ 2001-10-05 16:58 UTC (permalink / raw)
To: linux-lvm
This is a really odd problem, and searching the mailing lists and
docs has not shed any light on it.
I created 3 logical volumes (with lvm 1.0.1r2, r3 and r4) across 4 disks
on a dual 300 MHz PII running 2.4.6 (same behavior under 2.4.9) and
Red Hat 7.1. After this I noticed I/O errors while reading some files
and SCSI errors when accessing lvol2. The errors were similar to the
following:
Sep 28 17:11:21 gar kernel: scsi0: ERROR on channel 0, id 11, lun 0, CDB: Read (10) 00 00 ba e1 97 00 00 80 00
Sep 28 17:11:21 gar kernel: Info fld=0xbae1e6, Current sd08:21: sense key Medium Error
Sep 28 17:11:21 gar kernel: Additional sense indicates Unrecovered read error
Sep 28 17:11:21 gar kernel: I/O error: dev 08:21, sector 12247456
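A minimal sketch of how the numbers in the log lines above can be decoded, assuming the kernel prints the nine CDB bytes that follow the opcode and that they use the standard READ(10) layout (4-byte big-endian block address, 2-byte transfer length). The function name `decode_read10` is made up for illustration, and reading "Info fld" as the address of the failing block is an assumption based on how medium errors are usually reported.

```python
def decode_read10(cdb_tail_hex):
    """Decode the bytes printed after 'Read (10)' in the kernel log line."""
    tail = bytes.fromhex(cdb_tail_hex.replace(" ", ""))
    if len(tail) != 9:
        raise ValueError("expected the 9 bytes that follow the opcode")
    return {
        # CDB bytes 2..5: logical block address, big-endian
        "lba": int.from_bytes(tail[1:5], "big"),
        # CDB bytes 7..8: number of blocks to transfer, big-endian
        "blocks": int.from_bytes(tail[6:8], "big"),
    }


cdb = decode_read10("00 00 ba e1 97 00 00 80 00")
info_field = 0xBAE1E6  # "Info fld" from the sense data (assumed: failing block)

print(cdb["lba"], cdb["blocks"], info_field)  # 12247447 128 12247526

# Does the reported block lie inside the range the READ(10) asked for?
print(cdb["lba"] <= info_field < cdb["lba"] + cdb["blocks"])  # True
```

Under those assumptions the drive reports a failing block inside the 128-block range that was requested, which is what a genuine medium error would look like from the host side.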
lvol2 spanned two disks and all the errors were on only one of them. I
naturally thought that the problem was a hardware problem with the
disk. The only odd thing was that lvol1 which spanned the same two disks
did not have any errors.
The first thing I tried was an mke2fs with bad block checking on
lvol2. It found several errors, and when I copied the files back, new
errors cropped up and more files were lost.
The next thing I did was to move all the data off that disk (pvmove)
and then do a low-level format of it. No errors were detected during
the low-level format or the verify I ran afterwards. I then did a
mke2fs with bad block checking on the disk without LVM. No errors. I
then did a pvcreate, re-created lvol2 and ran mke2fs with bad block
checking. Once again the SCSI errors. Removing that drive from LVM
and formatting yet again without LVM showed no errors.
Moving lvol2 to a different physical volume results in the SCSI errors
moving to that volume.
I am now convinced that the problem is with LVM and not with the
hardware. I still have two other logical volumes (lvol1 and lvol3) that
are not having ANY problems at all. Does anyone have any suggestion on
how to solve this one?
Thanks.
-----------------------------------------------------
Protect yourself from spam, use http://sneakemail.com
* [linux-lvm] LVM seems to be causing SCSI errors.
@ 2001-10-05 18:21 Karl
2001-10-05 22:58 ` Andreas Dilger
0 siblings, 1 reply; 15+ messages in thread
From: Karl @ 2001-10-05 18:21 UTC (permalink / raw)
To: linux-lvm
> People have recently reported similar "hardware" errors on IDE drives
> as well. Can you verify if the sector in question is really within the
> boundaries of the device (this appears to be /dev/sdc1, sector 12247456)?
> If not, it is possible that LVM is doing math incorrectly somewhere.
Good point. That's a pretty big number. It would also explain the bad noise from the drive...
From fdisk on that drive I count 17767890 sectors, so it does seem to fall within the boundaries:
fdisk /dev/sdc
The number of cylinders for this disk is set to 1106.
There is nothing wrong with that, but this is larger than 1024,
and could in certain setups cause problems with:
1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs
(e.g., DOS FDISK, OS/2 FDISK)
Command (m for help): p
Disk /dev/sdc: 255 heads, 63 sectors, 1106 cylinders
Units = cylinders of 16065 * 512 bytes
Device Boot Start End Blocks Id System
/dev/sdc1 1 1106 8883913+ 83 Linux
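The sector count above can be reproduced from the fdisk geometry. The sketch below is only an arithmetic check; it assumes 512-byte sectors, 1 KiB units in the "Blocks" column, and that the trailing "+" marks one extra odd sector, as older fdisk versions print it.

```python
HEADS, SECTORS_PER_TRACK, CYLINDERS = 255, 63, 1106
PARTITION_1K_BLOCKS = 8883913  # fdisk "Blocks" column for /dev/sdc1
FAILING_SECTOR = 12247456      # from "I/O error: dev 08:21, sector 12247456"

# Whole-disk size implied by the reported geometry.
disk_sectors = HEADS * SECTORS_PER_TRACK * CYLINDERS

# Two 512-byte sectors per 1 KiB block, plus one for the "+" (assumption).
partition_sectors = PARTITION_1K_BLOCKS * 2 + 1

print(disk_sectors)                        # 17767890
print(partition_sectors)                   # 17767827
print(FAILING_SECTOR < partition_sectors)  # True
```

The failing sector is well inside both the disk and the partition, so an out-of-range request is not what this particular log line shows.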
* [linux-lvm] LVM seems to be causing SCSI errors.
@ 2001-10-05 19:52 Karl
2001-10-08 10:10 ` Heinz J . Mauelshagen
2001-10-08 16:20 ` Joe Thornber
0 siblings, 2 replies; 15+ messages in thread
From: Karl @ 2001-10-05 19:52 UTC (permalink / raw)
To: linux-lvm
> Well, if it is within the size of the drive, and you get a "bad noise"
> when trying to access it, then it may be that you have a bad drive. If
> you are doing low-level formats of the drive, this may do bad block
> relocation, and hide the fact that there are bad spots.
That's what I thought at first, but as I explained earlier, the errors
ONLY occur when running under LVM and go away as soon as I use the raw
disk. I went back and forth several times. I'm pretty sure it is an LVM
thing, but no idea what it is.
* Re: [linux-lvm] LVM seems to be causing SCSI errors.
2001-10-05 16:58 Karl
@ 2001-10-05 21:31 ` Andreas Dilger
2001-10-08 10:07 ` Heinz J . Mauelshagen
1 sibling, 0 replies; 15+ messages in thread
From: Andreas Dilger @ 2001-10-05 21:31 UTC (permalink / raw)
To: linux-lvm
On Oct 05, 2001 16:58 -0000, Karl wrote:
> I created 3 logical volumes (with lvm 1.0.1r2, r3 and r4) across 4 disks
> on a dual 300 MHz PII running 2.4.6 (same behavior under 2.4.9) and
> readhat 7.1. After this I noticed IO errors while reading some files
> and SCSI errors when accessing lvol2. The errors were similar to the
> following
>
>
> Sep 28 17:11:21 gar kernel: scsi0: ERROR on channel 0, id 11, lun 0, CDB: Read (10) 00 00 ba e1 97 00 00 80 00
> Sep 28 17:11:21 gar kernel: Info fld=0xbae1e6, Current sd08:21: sense key Medium Error
> Sep 28 17:11:21 gar kernel: Additional sense indicates Unrecovered read error
> Sep 28 17:11:21 gar kernel: I/O error: dev 08:21, sector 12247456
People have recently reported similar "hardware" errors on IDE drives
as well. Can you verify if the sector in question is really within the
boundaries of the device (this appears to be /dev/sdc1, sector 12247456)?
If not, it is possible that LVM is doing math incorrectly somewhere.
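For reference, a short sketch of how "dev 08:21" maps to /dev/sdc1. It assumes the kernel prints major and minor in hexadecimal and that SCSI disks on major 8 get 16 minors each (minor 0 of each group is the whole disk). The helper name `sd_name` is hypothetical.

```python
def sd_name(major, minor):
    """Map a (major, minor) pair on SCSI disk major 8 to a /dev/sdXN name."""
    if major != 8 or not 0 <= minor < 256:
        raise ValueError("only SCSI disk major 8 (sda..sdp) is handled here")
    disk, partition = divmod(minor, 16)  # 16 minors per disk
    name = "/dev/sd" + chr(ord("a") + disk)
    return name + str(partition) if partition else name


major, minor = (int(part, 16) for part in "08:21".split(":"))
print(major, minor)           # 8 33
print(sd_name(major, minor))  # /dev/sdc1
```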
Cheers, Andreas
--
Andreas Dilger \ "If a man ate a pound of pasta and a pound of antipasto,
\ would they cancel out, leaving him still hungry?"
http://www-mddsp.enel.ucalgary.ca/People/adilger/ -- Dogbert
* Re: [linux-lvm] LVM seems to be causing SCSI errors.
2001-10-05 18:21 Karl
@ 2001-10-05 22:58 ` Andreas Dilger
0 siblings, 0 replies; 15+ messages in thread
From: Andreas Dilger @ 2001-10-05 22:58 UTC (permalink / raw)
To: linux-lvm
On Oct 05, 2001 18:21 -0000, Karl wrote:
> > People have recently reported similar "hardware" errors on IDE drives
> > as well. Can you verify if the sector in question is really within the
> > boundaries of the device (this appears to be /dev/sdc1, sector 12247456)?
> > If not, it is possible that LVM is doing math incorrectly somewhere.
>
> Good point. That's a pretty big number. It would also explain the bad noise
> from the drive...
>
> From fdisk on that drive I count 17767890 sectors so it does seem to fall
> within the boundaries
Well, if it is within the size of the drive, and you get a "bad noise"
when trying to access it, then it may be that you have a bad drive. If
you are doing low-level formats of the drive, this may do bad block
relocation, and hide the fact that there are bad spots.
One way to test this theory is to do "dd if=/dev/sdc of=/dev/null" and
see if it generates errors. If so your drive is bad.
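A rough Python equivalent of the dd test, for anyone who wants the failing offsets listed rather than stopping at the first error. This is a sketch only: the function name `scan_for_read_errors` and the 64 KiB chunk size are arbitrary choices, it needs read permission on the device, and it assumes a Unix system where `os.pread` is available.

```python
import os


def scan_for_read_errors(path, chunk_size=64 * 1024):
    """Read `path` from start to end; return (bytes_read, failed_offsets)."""
    failed_offsets = []
    bytes_read = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.lseek(fd, 0, os.SEEK_END)  # also works for block devices on Linux
        offset = 0
        while offset < size:
            try:
                data = os.pread(fd, chunk_size, offset)
            except OSError:
                # Remember where the read failed and skip past this chunk.
                failed_offsets.append(offset)
                offset += chunk_size
                continue
            if not data:
                break
            bytes_read += len(data)
            offset += len(data)
    finally:
        os.close(fd)
    return bytes_read, failed_offsets


# Example (run as root, device name as in this thread):
#   total, bad = scan_for_read_errors("/dev/sdc")
#   print(total, [offset // 512 for offset in bad])
```

Reads that go through the page cache may be served from memory on a second run, so a freshly booted system gives the most honest result.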
Cheers, Andreas
* Re: [linux-lvm] LVM seems to be causing SCSI errors.
2001-10-05 16:58 Karl
2001-10-05 21:31 ` Andreas Dilger
@ 2001-10-08 10:07 ` Heinz J . Mauelshagen
1 sibling, 0 replies; 15+ messages in thread
From: Heinz J . Mauelshagen @ 2001-10-08 10:07 UTC (permalink / raw)
To: linux-lvm
Karl,
did you check the filesystem?
Ages ago there were a couple of problems like this, caused by filesystem
corruption that the checker could fix.
Please report whether this helped.
Regards,
Heinz -- The LVM Guy --
On Fri, Oct 05, 2001 at 04:58:15PM -0000, Karl wrote:
> This is a real odd problem and my searching of the mailing lists and
> docs has not shed any light on it.
>
> I created 3 logical volumes (with lvm 1.0.1r2, r3 and r4) across 4 disks
> on a dual 300 MHz PII running 2.4.6 (same behavior under 2.4.9) and
> readhat 7.1. After this I noticed IO errors while reading some files
> and SCSI errors when accessing lvol2. The errors were similar to the
> following
>
>
> Sep 28 17:11:21 gar kernel: scsi0: ERROR on channel 0, id 11, lun 0, CDB: Read (10) 00 00 ba e1 97 00 00 80 00
> Sep 28 17:11:21 gar kernel: Info fld=0xbae1e6, Current sd08:21: sense key Medium Error
> Sep 28 17:11:21 gar kernel: Additional sense indicates Unrecovered read error
> Sep 28 17:11:21 gar kernel: I/O error: dev 08:21, sector 12247456
>
>
> lvol2 spanned two disks and all the errors were on only one of them. I
> naturally thought that the problem was a hardware problem with the
> disk. The only odd thing was that lvol1 which spanned the same two disks
> did not have any errors.
>
> The first thing I tried was an mke2fs with bad block checking on
> lvol2. It found several errors and when I copied the files back on new
> errors cropped up and more files were lost.
>
> The next thing I did was to move all the data off of that disk (pvmove)
> and then did a low level format on that disk. No errors were detected
> during the low lever format nor the verify I did after that. I then did
> a mke2fs with bad block checking on the disk without LVM. No errors. I
> then did a pvcreate and re-created lvol2 and then did a mke2fs with bad
> block checking. Once again the SCSI errors. Removing that drive from LVM
> and formatting yet again without LVM showed no errors.
>
> Moving lvol2 to a different physical volume results in the SCSI errors
> moving to that volume.
>
> I am now convinced that the problem is with LVM and not with the
> hardware. I still have two other logical volumes (lvol1 and lvol3) that
> are not having ANY problems at all. Does anyone have any suggestion on
> how to solve this one?
>
> Thanks.
>
> _______________________________________________
> linux-lvm mailing list
> linux-lvm@sistina.com
> http://lists.sistina.com/mailman/listinfo/linux-lvm
> read the LVM HOW-TO at http://www.sistina.com/lvm/Pages/howto.html
*** Software bugs are stupid.
Nevertheless it needs not so stupid people to solve them ***
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Heinz Mauelshagen Sistina Software Inc.
Senior Consultant/Developer Am Sonnenhang 11
56242 Marienrachdorf
Germany
Mauelshagen@Sistina.com +49 2626 141200
FAX 924446
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
* Re: [linux-lvm] LVM seems to be causing SCSI errors.
2001-10-05 19:52 Karl
@ 2001-10-08 10:10 ` Heinz J . Mauelshagen
2001-10-08 16:20 ` Joe Thornber
1 sibling, 0 replies; 15+ messages in thread
From: Heinz J . Mauelshagen @ 2001-10-08 10:10 UTC (permalink / raw)
To: linux-lvm
IIRC you said, that your test LVs spanned physical drives.
Another known problem is flaky SCSI subsystem behaviour when LVM causes
more load in configurations like this.
Can you trigger the problem with just one LV allocated on one PV?
What did the filesystem consistency check I mentioned in my other mail say?
On Fri, Oct 05, 2001 at 07:52:50PM -0000, Karl wrote:
> > Well, if it is within the size of the drive, and you get a "bad noise"
> > when trying to access it, then it may be that you have a bad drive. If
> > you are doing low-level formats of the drive, this may do bad block
> > relocation, and hide the fact that there are bad spots.
>
> That's what I thought at first, but as I explained earlier, the errors
> ONLY occur when running under LVM and go away as soon as I use the raw
> disk. I went back and forth several times. I'm pretty sure it is an LVM
> thing, but no idea what it is.
--
Regards,
Heinz -- The LVM Guy --
* Re: [linux-lvm] LVM seems to be causing SCSI errors.
@ 2001-10-08 10:46 Karl
0 siblings, 0 replies; 15+ messages in thread
From: Karl @ 2001-10-08 10:46 UTC (permalink / raw)
To: linux-lvm
> did you check the filesystem.
> Ages ago there was a couple of problems like this caused by filesystem
> corruptions being fixable by the checker.
>
> Please report, if this helped.
Since I see this problem while creating the filesystem with mke2fs -c,
I don't think that is the problem. I first had the problem right after
creating the file system, when copying files to it. That time, I did not
do the bad block checking on the file system.
* Re: [linux-lvm] LVM seems to be causing SCSI errors.
@ 2001-10-08 10:51 Karl
2001-10-09 9:21 ` Heinz J . Mauelshagen
0 siblings, 1 reply; 15+ messages in thread
From: Karl @ 2001-10-08 10:51 UTC (permalink / raw)
To: linux-lvm
> IIRC you said, that your test LVs spanned physical drives.
> Another known problem is flaky SCSI subsystem behaviour when LVM causes
> more load in configurations like this.
>
> Can you trigger the problem, if you've got just one LV being allocated
> on one PV?
The problem is still there when the LV is on only a single SCSI volume.
In fact, moving the volume from one drive to the other produces SCSI
errors on the new drive, and the old drive, now used for a non-LVM file
system, stops having errors.
> What did the file system consistency check I mentioned in my other mail say?
When I ran an fsck on the file system with the errors (I did not re-do
this experiment recently) it found bad blocks on the file system, but
no inconsistencies in the file system itself.
* Re: [linux-lvm] LVM seems to be causing SCSI errors.
2001-10-05 19:52 Karl
2001-10-08 10:10 ` Heinz J . Mauelshagen
@ 2001-10-08 16:20 ` Joe Thornber
1 sibling, 0 replies; 15+ messages in thread
From: Joe Thornber @ 2001-10-08 16:20 UTC (permalink / raw)
To: linux-lvm
On Fri, Oct 05, 2001 at 07:52:50PM -0000, Karl wrote:
> > Well, if it is within the size of the drive, and you get a "bad noise"
> > when trying to access it, then it may be that you have a bad drive. If
> > you are doing low-level formats of the drive, this may do bad block
> > relocation, and hide the fact that there are bad spots.
>
> That's what I thought at first, but as I explained earlier, the errors
> ONLY occur when running under LVM and go away as soon as I use the raw
> disk. I went back and forth several times. I'm pretty sure it is an LVM
> thing, but no idea what it is.
The only thing I can think is that the SCSI device is not being
opened/initialised properly by LVM. This certainly wasn't happening
before ~July when I put the open/close_pv() functions in. But should
be working now. Puzzled. Are you really using the new tools and not
old binaries in strange places? I've been caught a couple of times
by this ...
- Joe
* Re: [linux-lvm] LVM seems to be causing SCSI errors.
@ 2001-10-08 17:37 Karl
0 siblings, 0 replies; 15+ messages in thread
From: Karl @ 2001-10-08 17:37 UTC (permalink / raw)
To: linux-lvm
> The only thing I can think is that the SCSI device is not being
> opened/initialised properly by LVM. This certainly wasn't happening
> before ~July when I put the open/close_pv() functions in. But should
> be working now. Puzzled. You really are using the new tools and not
> old binaries in strange places ? I've been caught a couple of times
> by this ...
I'm pretty sure I'm using the correct binaries. Date/time stamps on the
only ones I find on my system agree with when I compiled the latest
release.
I'm using the static links so the libraries should not be an issue
either. The only binaries I run on my tests are pvcreate, vgextend,
vgreduce, lvcreate, and lvremove. The patch of the kernel and kernel
recompile went without errors. LVM is compiled as a module, not in the
kernel itself.
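The check for stray binaries described above can be sketched as follows. It lists every executable copy of the five tools named in this message, with its modification time, across a set of directories. The function name `find_copies` is hypothetical, and searching only the directories on PATH is an assumption; old copies could also sit in places such as /usr/local/sbin that are not on PATH.

```python
import os
import time

LVM_TOOLS = ("pvcreate", "vgextend", "vgreduce", "lvcreate", "lvremove")


def find_copies(names, search_dirs):
    """Return {tool: [(path, mtime), ...]} for each executable copy found."""
    found = {name: [] for name in names}
    for directory in search_dirs:
        if not directory:
            continue  # skip empty PATH entries
        for name in names:
            candidate = os.path.join(directory, name)
            if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
                found[name].append((candidate, os.path.getmtime(candidate)))
    return found


path_dirs = os.environ.get("PATH", "").split(os.pathsep)
for tool, copies in find_copies(LVM_TOOLS, path_dirs).items():
    for path, mtime in copies:
        print(tool, path, time.strftime("%Y-%m-%d %H:%M", time.localtime(mtime)))
```

More than one line for the same tool, or a timestamp older than the last build, would point to the "old binaries in strange places" case.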
--
Karl Hakimian
* Re: [linux-lvm] LVM seems to be causing SCSI errors.
2001-10-08 10:51 Karl
@ 2001-10-09 9:21 ` Heinz J . Mauelshagen
0 siblings, 0 replies; 15+ messages in thread
From: Heinz J . Mauelshagen @ 2001-10-09 9:21 UTC (permalink / raw)
To: linux-lvm
On Mon, Oct 08, 2001 at 10:51:42AM -0000, Karl wrote:
> > IIRC you said, that your test LVs spanned physical drives.
> > Another known problem is flaky SCSI subsystem behaviour when LVM causes
> > more load in configurations like this.
> >
> > Can you trigger the problem, if you've got just one LV being allocated
> > on one PV?
>
> The problem is still there when the LVM is only on a single SCSI volume.
> In fact, changing the volume from one drive to the other shows SCSI
> errors on the new drive and the old drive used on a non LVM file system
> stops having errors.
>
> > What did the file system consistency check I mentioned in my other mail say?
>
> When I ran an fsck on the file system with the errors (I did not re-do
> this experiment recently) it found bad blocks on the files system, but
> no inconsistencies on the file system itself.
Strange, running out of ideas here.
We don't have similar reports, do we?
--
Regards,
Heinz -- The LVM Guy --
* Re: [linux-lvm] LVM seems to be causing SCSI errors.
@ 2001-10-09 13:55 Karl
2001-10-09 15:12 ` Joe Thornber
0 siblings, 1 reply; 15+ messages in thread
From: Karl @ 2001-10-09 13:55 UTC (permalink / raw)
To: linux-lvm
> > When I ran an fsck on the file system with the errors (I did not re-do
> > this experiment recently) it found bad blocks on the files system, but
> > no inconsistencies on the file system itself.
>
> Strange, running out of ideas here.
> We don't have similar reports, do we?
I did not see anything when checking the archives. Is there anything
else I can do to provide info on this problem?
Could the SCSI ID have anything to do with it? Size of drives? Phase of
the moon? :-)
It is odd that this is only lvol2 that has the problem. My lvol1 and
lvol3 have been working perfectly. Maybe I should add an lvol2 and lvol4
in and see if the even numbers have problems...
--
Karl Hakimian
* Re: [linux-lvm] LVM seems to be causing SCSI errors.
2001-10-09 13:55 Karl
@ 2001-10-09 15:12 ` Joe Thornber
0 siblings, 0 replies; 15+ messages in thread
From: Joe Thornber @ 2001-10-09 15:12 UTC (permalink / raw)
To: linux-lvm
On Tue, Oct 09, 2001 at 06:55:54AM -0700, Karl wrote:
> > > When I ran an fsck on the file system with the errors (I did not re-do
> > > this experiment recently) it found bad blocks on the files system, but
> > > no inconsistencies on the file system itself.
> >
> > Strange, running out of ideas here.
> > We don't have similar reports, do we?
>
> I did not see anything when checking the archives. Any thing else I can
> do to provide info on this problem?
>
> Could the SCSI ID have anything to do with it? Size of drives? Phase of
> the moon? :-)
>
> It is odd that this is only lvol2 that has the problem. My lvol1 and
> lvol3 have been working perfectly. Maybe I should add an lvol2 and lvol4
> in and see if the even numbers have problems...
Have you tried running md across these disks?
- Joe
* Re: [linux-lvm] LVM seems to be causing SCSI errors.
@ 2001-10-09 16:57 Karl
0 siblings, 0 replies; 15+ messages in thread
From: Karl @ 2001-10-09 16:57 UTC (permalink / raw)
To: linux-lvm
> Have you tried running md across these disks ?
Not yet. I can probably try that this weekend.
--
Karl Hakimian