* RE: [linux-lvm] SCSI/LVM problems after power outage
@ 2001-05-08 21:05 Day, Evan
2001-05-09 9:35 ` Heinz J. Mauelshagen
2001-05-11 8:00 ` Steven Lembark
0 siblings, 2 replies; 16+ messages in thread
From: Day, Evan @ 2001-05-08 21:05 UTC (permalink / raw)
To: 'linux-lvm@sistina.com'
It looks like device 2 is having issues:
SCSI disk error : host 2 channel 0 id 2 lun 0 return code = 18000002
However, it has been several years since I worked with Sun machines, so I
could be wrong. Regardless, it doesn't sound like an LVM issue, but a
hardware issue. Unfortunately, I can't offer much recovery advice - most of
my LVM experience is with HP-UX, and we use mirroring (RAID-1) at work -
just unplug the bad drive, plug in a new one, and do a vgsync. I think you
can add a replacement drive to the VG and use pvmove to try and move the PEs
from the bad drive to the new drive, but I wouldn't take my word for it...
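
[Editor's sketch] The "return code = 18000002" value can be split into its component bytes. The snippet below is a minimal illustration, assuming the Linux 2.4 layout of the SCSI result word (driver byte, host byte, message byte, status byte, from most to least significant) and the DRIVER_SENSE / SUGGEST_RETRY values from the kernel's scsi.h. The function name is hypothetical and not part of any tool mentioned in this thread.

```python
# Assumed constants (from the 2.4-era include/scsi/scsi.h, quoted from memory).
DRIVER_SENSE = 0x08      # driver byte: sense data is available
SUGGEST_RETRY = 0x10     # driver byte: mid-layer suggests a retry
CHECK_CONDITION = 0x02   # raw SCSI status byte for CHECK CONDITION


def split_scsi_result(result: int) -> dict:
    """Split a 32-bit SCSI result word into its four bytes."""
    return {
        "driver": (result >> 24) & 0xFF,
        "host": (result >> 16) & 0xFF,
        "msg": (result >> 8) & 0xFF,
        "status": result & 0xFF,
    }


# The kernel prints the return code in hexadecimal.
fields = split_scsi_result(0x18000002)
print(fields)
print("sense available:", bool(fields["driver"] & DRIVER_SENSE))
print("check condition:", fields["status"] == CHECK_CONDITION)
```

Read this way, the host byte is zero (no host adapter error code), and the device itself reported CHECK CONDITION with sense data, which fits the "sense key Aborted Command" lines in the log quoted below.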
-----Original Message-----
From: Dave Wapstra [mailto:dave@xs4all.nl]
Sent: Tuesday, May 08, 2001 9:33 AM
To: linux-lvm@sistina.com
Subject: [linux-lvm] SCSI/LVM problems after power outage
Hi all,
After a power outage, our ftp server with LVM has not survived very well.
System: UltraSparc II, Linux 2.4(2) + io.path + LVM 0.9.1beta6
All normal ext2 partitions were fsck'd fine, however, the volume is having
problems.
The syslog has a lot of SCSI errors:
May 7 19:15:52 ftp kernel: sym53c8xx_reset: pid=0 reset_flags=1
serial_number=0 serial_number_at_timeout=0
May 7 19:15:52 ftp kernel: scsi2: device driver called scsi_done() for a
synchronous reset.
May 7 19:15:53 ftp kernel: sym53c875-2-<0,*>: FAST-20 WIDE SCSI 40.0 MB/s
(50.0 ns, offset 15)
May 7 19:15:53 ftp kernel: sym53c875-2-<1,*>: FAST-20 WIDE SCSI 40.0 MB/s
(50.0 ns, offset 15)
May 7 19:15:53 ftp kernel: sym53c875-2-<2,*>: FAST-20 WIDE SCSI 40.0 MB/s
(50.0 ns, offset 16)
May 7 19:15:53 ftp kernel: sym53c875-2-<3,*>: FAST-20 WIDE SCSI 40.0 MB/s
(50.0 ns, offset 15)
May 7 19:15:53 ftp kernel: sym53c875-2: SCSI parity error detected: SCR1=3
DBC=11000c00 SBCL=ae
May 7 19:15:53 ftp kernel: sym53c875-2-<2,*>: FAST-20 WIDE SCSI 40.0 MB/s
(50.0 ns, offset 16)
May 7 19:15:53 ftp kernel: sym53c875-2: SCSI parity error detected: SCR1=3
DBC=11000c00 SBCL=ae
May 7 19:15:53 ftp kernel: SCSI disk error : host 2 channel 0 id 2 lun 0
return code = 18000002
May 7 19:15:53 ftp kernel: [valid=0] Info fld=0x0, Current sd08:20: sense
key Aborted Command
May 7 19:15:53 ftp kernel: Additional sense indicates Initiator detected
error message received
May 7 19:15:53 ftp kernel: I/O error: dev 08:20, sector 2239227
May 7 19:15:53 ftp kernel: EXT2-fs error (device lvm(58,0)):
ext2_read_inode: unable to read inode block - inode=1245185, block=2490377
May 7 19:15:53 ftp kernel: sym53c875-2-<2,*>: FAST-20 WIDE SCSI 40.0 MB/s
(50.0 ns, offset 16)
May 7 19:15:53 ftp kernel: sym53c875-2: SCSI parity error detected: SCR1=3
DBC=11000c00 SBCL=ae
May 7 19:15:53 ftp kernel: sym53c875-2-<2,*>: FAST-20 WIDE SCSI 40.0 MB/s
(50.0 ns, offset 16)
May 7 19:15:56 ftp kernel: sym53c875-2: SCSI parity error detected: SCR1=3
DBC=11000c00 SBCL=ae
May 7 19:15:56 ftp kernel: sym53c875-2-<2,*>: FAST-20 WIDE SCSI 40.0 MB/s
(50.0 ns, offset 16)
May 7 19:15:56 ftp kernel: sym53c875-2: SCSI parity error detected: SCR1=3
DBC=11000c00 SBCL=ae
May 7 19:15:57 ftp kernel: scsi2 channel 0 : resetting for second half of
retries.
May 7 19:15:57 ftp kernel: SCSI bus is being reset for host 2 channel 0.
May 7 19:15:57 ftp kernel: sym53c8xx_reset: pid=0 reset_flags=1
serial_number=0 serial_number_at_timeout=0
May 7 19:15:57 ftp kernel: scsi2: device driver called scsi_done() for a
synchronous reset.
May 7 19:15:58 ftp kernel: sym53c875-2-<0,*>: FAST-20 WIDE SCSI 40.0 MB/s
(50.0 ns, offset 15)
May 7 19:15:58 ftp kernel: sym53c875-2-<3,*>: FAST-20 WIDE SCSI 40.0 MB/s
(50.0 ns, offset 15)
May 7 19:15:58 ftp kernel: sym53c875-2-<2,*>: FAST-20 WIDE SCSI 40.0 MB/s
(50.0 ns, offset 16)
May 7 19:15:58 ftp kernel: sym53c875-2: SCSI parity error detected: SCR1=3
DBC=11000c00 SBCL=ae
May 7 19:15:58 ftp kernel: sym53c875-2-<2,*>: FAST-20 WIDE SCSI 40.0 MB/s
(50.0 ns, offset 16)
May 7 19:15:58 ftp kernel: sym53c875-2: SCSI parity error detected: SCR1=3
DBC=11000c00 SBCL=ae
May 7 19:15:58 ftp kernel: SCSI disk error : host 2 channel 0 id 2 lun 0
return code = 18000002
May 7 19:15:58 ftp kernel: [valid=0] Info fld=0x0, Current sd08:20: sense
key Aborted Command
May 7 19:15:58 ftp kernel: Additional sense indicates Initiator detected
error message received
May 7 19:15:58 ftp kernel: I/O error: dev 08:20, sector 17443579
May 7 19:15:58 ftp kernel: EXT2-fs error (device lvm(58,0)):
ext2_read_inode: unable to read inode block - inode=2195457, block=4390921
May 7 19:15:58 ftp kernel: sym53c875-2-<2,*>: FAST-20 WIDE SCSI 40.0 MB/s
(50.0 ns, offset 16)
May 7 19:16:00 ftp kernel: sym53c875-2: SCSI parity error detected: SCR1=3
DBC=11000c00 SBCL=ae
May 7 19:16:00 ftp kernel: sym53c875-2-<2,*>: FAST-20 WIDE SCSI 40.0 MB/s
(50.0 ns, offset 16)
May 7 19:16:01 ftp kernel: sym53c875-2: SCSI parity error detected: SCR1=3
DBC=11000c00 SBCL=ae
May 7 19:16:01 ftp kernel: sym53c875-2-<2,*>: FAST-20 WIDE SCSI 40.0 MB/s
(50.0 ns, offset 16)
May 7 19:16:01 ftp kernel: sym53c875-2: SCSI parity error detected: SCR1=3
DBC=11000c00 SBCL=ae
May 7 19:16:02 ftp kernel: scsi2 channel 0 : resetting for second half of
retries.
May 7 19:16:02 ftp kernel: SCSI bus is being reset for host 2 channel 0.
May 7 19:16:02 ftp kernel: sym53c8xx_reset: pid=0 reset_flags=1
serial_number=0 serial_number_at_timeout=0
May 7 19:16:02 ftp kernel: scsi2: device driver called scsi_done() for a
synchronous reset.
May 7 19:16:02 ftp kernel: sym53c875-2-<0,*>: FAST-20 WIDE SCSI 40.0 MB/s
(50.0 ns, offset 15)
May 7 19:16:03 ftp kernel: sym53c875-2-<2,*>: FAST-20 WIDE SCSI 40.0 MB/s
(50.0 ns, offset 16)
May 7 19:16:03 ftp kernel: sym53c875-2: SCSI parity error detected: SCR1=3
DBC=11000c00 SBCL=ae
May 7 19:16:03 ftp kernel: sym53c875-2-<2,*>: FAST-20 WIDE SCSI 40.0 MB/s
(50.0 ns, offset 16)
May 7 19:16:03 ftp kernel: sym53c875-2: SCSI parity error detected: SCR1=3
DBC=11000c00 SBCL=ae
May 7 19:16:03 ftp kernel: sym53c875-2-<2,*>: FAST-20 WIDE SCSI 40.0 MB/s
(50.0 ns, offset 16)
May 7 19:16:03 ftp kernel: sym53c875-2: SCSI parity error detected: SCR1=3
DBC=11000c00 SBCL=ae
May 7 19:16:04 ftp kernel: scsi2 channel 0 : resetting for second half of
retries.
May 7 19:16:04 ftp kernel: SCSI bus is being reset for host 2 channel 0.
May 7 19:16:04 ftp kernel: sym53c8xx_reset: pid=0 reset_flags=1
serial_number=0 serial_number_at_timeout=0
May 7 19:16:04 ftp kernel: scsi2: device driver called scsi_done() for a
synchronous reset.
May 7 19:16:04 ftp kernel: sym53c875-2-<0,*>: FAST-20 WIDE SCSI 40.0 MB/s
(50.0 ns, offset 15)
May 7 19:16:05 ftp kernel: sym53c875-2-<2,*>: FAST-20 WIDE SCSI 40.0 MB/s
(50.0 ns, offset 16)
May 7 19:16:05 ftp kernel: sym53c875-2: SCSI parity error detected: SCR1=3
DBC=11000c00 SBCL=ae
May 7 19:16:05 ftp kernel: sym53c875-2-<2,*>: FAST-20 WIDE SCSI 40.0 MB/s
(50.0 ns, offset 16)
May 7 19:16:05 ftp kernel: sym53c875-2: SCSI parity error detected: SCR1=3
DBC=11000c00 SBCL=ae
May 7 19:16:05 ftp kernel: SCSI disk error : host 2 channel 0 id 2 lun 0
return code = 18000002
May 7 19:16:05 ftp kernel: [valid=0] Info fld=0x0, Current sd08:20: sense
key Aborted Command
May 8 10:20:50 ftp kernel: scsi2 channel 0 : resetting for second half of
retries.
May 8 10:20:50 ftp kernel: SCSI bus is being reset for host 2 channel 0.
May 8 10:20:50 ftp kernel: sym53c8xx_reset: pid=0 reset_flags=1
serial_number=0 serial_number_at_timeout=0
May 8 10:20:50 ftp kernel: scsi2: device driver called scsi_done() for a
synchronous reset.
May 8 10:20:50 ftp kernel: sym53c876-2: restart (scsi reset).
May 8 10:20:50 ftp kernel: sym53c876-2: Downloading SCSI SCRIPTS.
May 8 10:20:50 ftp kernel: sym53c876-2-<0,0>: wide msgout: 1-2-3-1.
May 8 10:20:50 ftp kernel: sym53c876-2-<2,0>: wide msgout: 1-2-3-1.
May 8 10:20:50 ftp kernel: sym53c876-2-<0,0>: wide msgin: 1-2-3-1.
May 8 10:20:50 ftp kernel: sym53c876-2-<0,0>: wide: wide=1 chg=0.
May 8 10:20:50 ftp kernel: sym53c876-2-<0,0>: wide msgout: 1-2-3-1.
May 8 10:20:50 ftp kernel: sym53c876-2-<2,0>: wide msgin: 1-2-3-1.
May 8 10:20:50 ftp kernel: sym53c876-2-<2,0>: wide: wide=1 chg=0.
May 8 10:20:50 ftp kernel: sym53c876-2-<0,0>: wide msgin: 1-2-3-1.
May 8 10:20:50 ftp kernel: sym53c876-2-<0,0>: wide: wide=1 chg=0.
May 8 10:20:50 ftp kernel: sym53c876-2-<0,0>: sync msgout: 1-3-1-c-10.
May 8 10:20:50 ftp kernel: sym53c876-2-<2,0>: wide msgout: 1-2-3-1.
May 8 10:20:50 ftp kernel: sym53c876-2-<0,0>: sync msg in: 1-3-1-c-f.
May 8 10:20:50 ftp kernel: sym53c876-2-<0,0>: sync: per=12 scntl3=0x90
scntl4=0x0 ofs=15 fak=0 chg=0.
May 8 10:20:50 ftp kernel: sym53c876-2-<0,*>: FAST-20 WIDE SCSI 40.0 MB/s
(50 ns, offset 15)
May 8 10:20:50 ftp kernel: sym53c876-2-<2,0>: wide msgin: 1-2-3-1.
May 8 10:20:50 ftp kernel: sym53c876-2-<2,0>: wide: wide=1 chg=0.
May 8 10:20:50 ftp kernel: sym53c876-2-<2,0>: sync msgout: 1-3-1-c-10.
May 8 10:20:50 ftp kernel: sym53c876-2-<2,0>: sync msg in: 1-3-1-c-10.
May 8 10:20:50 ftp kernel: sym53c876-2-<2,0>: sync: per=12 scntl3=0x90
scntl4=0x0 ofs=16 fak=0 chg=0.
May 8 10:20:50 ftp kernel: sym53c876-2-<2,*>: FAST-20 WIDE SCSI 40.0 MB/s
(50 ns, offset 16)
May 8 10:20:50 ftp kernel: sym53c876-2: SCSI parity error detected: SCR1=3
DBC=11000c00 SBCL=ae
May 8 10:20:50 ftp kernel: sym53c876-2-<2,0>: wide msgout: 1-2-3-1.
May 8 10:20:50 ftp kernel: sym53c876-2-<2,0>: wide msgin: 1-2-3-1.
May 8 10:20:50 ftp kernel: sym53c876-2-<2,0>: wide: wide=1 chg=0.
May 8 10:20:50 ftp kernel: sym53c876-2-<2,0>: sync msgout: 1-3-1-c-10.
May 8 10:20:50 ftp kernel: sym53c876-2-<2,0>: sync msg in: 1-3-1-c-10.
May 8 10:20:50 ftp kernel: sym53c876-2-<2,0>: sync: per=12 scntl3=0x90
scntl4=0x0 ofs=16 fak=0 chg=0.
May 8 10:20:50 ftp kernel: sym53c876-2-<2,*>: FAST-20 WIDE SCSI 40.0 MB/s
(50 ns, offset 16)
May 8 10:20:50 ftp kernel: sym53c876-2: SCSI parity error detected: SCR1=3
DBC=11000c00 SBCL=ae
May 8 10:20:50 ftp kernel: sym53c876-2-<2,0>: wide msgout: 1-2-3-1.
May 8 10:20:50 ftp kernel: sym53c876-2-<2,0>: wide msgin: 1-2-3-1.
May 8 10:20:51 ftp kernel: sym53c876-2-<2,0>: wide: wide=1 chg=0.
May 8 10:20:51 ftp kernel: SCSI disk error : host 2 channel 0 id 2 lun 0
return code = 18000002
May 8 10:20:51 ftp kernel: [valid=0] Info fld=0x0, Current sd08:20: sense
key Aborted Command
May 8 10:20:51 ftp kernel: Additional sense indicates Initiator detected
error message received
May 8 10:20:51 ftp kernel: I/O error: dev 08:20, sector 6695675
May 8 10:20:51 ftp kernel: sym53c876-2-<2,0>: sync msgout: 1-3-1-c-10.
May 8 10:20:51 ftp kernel: sym53c876-2-<2,0>: sync msg in: 1-3-1-c-10.
May 8 10:20:51 ftp kernel: sym53c876-2-<2,0>: sync: per=12 scntl3=0x90
scntl4=0x0 ofs=16 fak=0 chg=0.
May 8 10:20:51 ftp kernel: sym53c876-2-<2,*>: FAST-20 WIDE SCSI 40.0 MB/s
(50 ns, offset 16)
May 8 10:20:52 ftp kernel: sym53c876-2: SCSI parity error detected: SCR1=3
DBC=11007c00 SBCL=ae
May 8 10:20:52 ftp kernel: sym53c876-2-<2,0>: wide msgout: 1-2-3-1.
How can I find out which disks are having problems, and what are
the options for fixing this?
# pvscan
pvscan -- reading all physical volumes (this may take a while...)
pvscan -- ACTIVE PV "/dev/sdb" of VG "ftp" [8.43 GB / 0 free]
pvscan -- ACTIVE PV "/dev/sdc" of VG "ftp" [8.43 GB / 0 free]
pvscan -- ACTIVE PV "/dev/sdd" of VG "ftp" [8.43 GB / 0 free]
pvscan -- ACTIVE PV "/dev/sde" of VG "ftp" [8.43 GB / 0 free]
pvscan -- ACTIVE PV "/dev/sdf" of VG "ftp" [8.43 GB / 0 free]
pvscan -- ACTIVE PV "/dev/sdg" of VG "ftp" [8.43 GB / 0 free]
pvscan -- ACTIVE PV "/dev/sdh" of VG "ftp" [8.43 GB / 0 free]
pvscan -- ACTIVE PV "/dev/sdi" of VG "ftp" [8.43 GB / 0 free]
pvscan -- ACTIVE PV "/dev/sdj" of VG "ftp" [8.43 GB / 0 free]
pvscan -- ACTIVE PV "/dev/sdk" of VG "ftp" [8.43 GB / 0 free]
pvscan -- ACTIVE PV "/dev/sdl" of VG "ftp" [8.43 GB / 0 free]
pvscan -- total: 11 [92.78 GB] / in use: 11 [92.78 GB] / in no VG: 0 [0]
# vgdisplay ftp
--- Volume group ---
VG Name ftp
VG Access read/write
VG Status available/resizable
VG # 0
MAX LV 256
Cur LV 1
Open LV 1
MAX LV Size 255.99 GB
Max PV 256
Cur PV 11
Act PV 11
VG Size 92.77 GB
PE Size 4 MB
Total PE 23749
Alloc PE / Size 23749 / 92.77 GB
Free PE / Size 0 / 0
VG UUID eFMVIQ-SMBr-ecsj-TDkc-bGvw-0fyE-jibMIU
# lvdisplay /dev/ftp/pub
--- Logical volume ---
LV Name /dev/ftp/pub
VG Name ftp
LV Write Access read/write
LV Status available
LV # 1
# open 1
LV Size 92.77 GB
Current LE 23749
Allocated LE 23749
Allocation next free
Read ahead sectors 120
Block device 58:0
-Dave
--
Dave Wapstra
dave@xs4all.nl
_______________________________________________
linux-lvm mailing list
linux-lvm@sistina.com
http://lists.sistina.com/mailman/listinfo/linux-lvm
* Re: [linux-lvm] SCSI/LVM problems after power outage
2001-05-08 21:05 [linux-lvm] SCSI/LVM problems after power outage Day, Evan
@ 2001-05-09 9:35 ` Heinz J. Mauelshagen
2001-05-09 18:14 ` Dave Wapstra
2001-05-11 8:03 ` Steven Lembark
2001-05-11 8:00 ` Steven Lembark
1 sibling, 2 replies; 16+ messages in thread
From: Heinz J. Mauelshagen @ 2001-05-09 9:35 UTC (permalink / raw)
To: linux-lvm
On Tue, May 08, 2001 at 02:05:16PM -0700, Day, Evan wrote:
> It looks like device 2 is having issues:
>
> SCSI disk error : host 2 channel 0 id 2 lun 0 return code = 18000002
>
> However, it has been several years since I worked with Sun machines, so I
> could be wrong. Regardless, it doesn't sound like an LVM issue, but a
> hardware issue. Unfortunately, I can't offer much recovery advice - most of
> my LVM experience is with HP-UX, and we use mirroring (RAID-1) at work -
> just unplug the bad drive, plug in a new one, and do a vgsync. I think you
> can add a replacement drive to the VG and use pvmove to try and move the PEs
> from the bad drive to the new drive, but I wouldn't take my word for it...
In case the PEs on the flaky device are still readable:
extending your VG and using pvmove is the way to go.
Steps needed (crossing your fingers that the flaky drive can still stand this):
- install or use an additional drive of at least the size of the flaky one
- pvcreate that drive
- vgextend the VG by it
- pvmove /dev/FlakyDrive
- wait until pvmove is done
- vgreduce VG /dev/FlakyDrive
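
[Editor's sketch] The steps above can be written out as an ordered command plan. This is a dry run only: it prints the commands and executes nothing. The helper name and the device paths are placeholders (the thread itself uses /dev/FlakyDrive), and "ftp" is the VG name from the pvscan output earlier in the thread.

```python
import shlex


def replacement_plan(vg: str, flaky_pv: str, new_pv: str) -> list[list[str]]:
    """Return the commands for swapping out a flaky PV, in order."""
    return [
        ["pvcreate", new_pv],        # initialise the replacement drive as a PV
        ["vgextend", vg, new_pv],    # add it to the volume group
        ["pvmove", flaky_pv],        # move all PEs off the flaky drive
        ["vgreduce", vg, flaky_pv],  # drop the now-empty flaky drive from the VG
    ]


# Placeholder device names; substitute the real ones before running anything.
plan = replacement_plan("ftp", "/dev/FlakyDrive", "/dev/NewDrive")
for cmd in plan:
    print(shlex.join(cmd))
```

Keeping the plan as data makes it easy to review the order before typing anything; vgreduce should only be run once pvmove has finished without errors.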
Take your LVs offline first if you are running a version older than
LVM 0.9.1 Beta 7, because those versions had a pvmove-related bug that could
cause oopses when moving PEs that were in use.
BTW: you need to configure MD (RAID 1 or 5) or use hardware raid subsystems
to avoid suffering from such flaky disk problems.
Regards,
Heinz -- The LVM Guy --
>
> -----Original Message-----
> From: Dave Wapstra [mailto:dave@xs4all.nl]
> Sent: Tuesday, May 08, 2001 9:33 AM
> To: linux-lvm@sistina.com
> Subject: [linux-lvm] SCSI/LVM problems after power outage
>
>
> Hi all,
>
> After a power outage, our ftp server with LVM has not survived very well.
>
> System: UltraSparc II, Linux 2.4(2) + io.path + LVM 0.9.1beta6
>
> All normal ext2 partitions were fsck'd fine, however, the volume is having
> problems.
>
> The syslog has a lot of SCSI errors:
>
> May 7 19:15:52 ftp kernel: sym53c8xx_reset: pid=0 reset_flags=1
> serial_number=0 serial_number_at_timeout=0
<SNIP>
> Read ahead sectors 120
> Block device 58:0
>
>
> -Dave
>
> --
> Dave Wapstra
> dave@xs4all.nl
*** Software bugs are stupid.
Nevertheless it needs not so stupid people to solve them ***
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Heinz Mauelshagen Sistina Software Inc.
Senior Consultant/Developer Am Sonnenhang 11
56242 Marienrachdorf
Germany
Mauelshagen@Sistina.com +49 2626 141200
FAX 924446
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
* Re: [linux-lvm] SCSI/LVM problems after power outage
2001-05-09 9:35 ` Heinz J. Mauelshagen
@ 2001-05-09 18:14 ` Dave Wapstra
2001-05-09 21:26 ` Heinz J. Mauelshagen
2001-05-11 8:03 ` Steven Lembark
1 sibling, 1 reply; 16+ messages in thread
From: Dave Wapstra @ 2001-05-09 18:14 UTC (permalink / raw)
To: linux-lvm
On 09 May, 2001 at 09:35:10 +0000, Heinz J. Mauelshagen <Mauelshagen@sistina.com> wrote:
>
> On Tue, May 08, 2001 at 02:05:16PM -0700, Day, Evan wrote:
> > It looks like device 2 is having issues:
> >
> > SCSI disk error : host 2 channel 0 id 2 lun 0 return code = 18000002
> >
Ok. Sounds reasonable ;) It seems /dev/sdc is the one causing trouble.
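
[Editor's sketch] The "I/O error: dev 08:20" lines in the log point at the same disk. A minimal sketch of the mapping, assuming the kernel prints major:minor in hexadecimal, that major 8 is the first SCSI disk major, and that each disk owns 16 minors (whole disk plus 15 partitions). The function name is hypothetical.

```python
import string

SD_MAJOR = 8          # assumed: first SCSI disk major number
MINORS_PER_DISK = 16  # assumed: whole disk + 15 partitions per disk


def sd_name(dev: str) -> str:
    """Map a kernel 'major:minor' pair (hex) to a /dev/sdX[N] name."""
    major_hex, minor_hex = dev.split(":")
    major, minor = int(major_hex, 16), int(minor_hex, 16)
    if major != SD_MAJOR:
        raise ValueError(f"major {major} is not the first SCSI disk major")
    disk, part = divmod(minor, MINORS_PER_DISK)
    name = "/dev/sd" + string.ascii_lowercase[disk]
    return name if part == 0 else f"{name}{part}"


print(sd_name("08:20"))  # /dev/sdc
```

Minor 0x20 is 32, which is the third disk (index 2) with partition number 0, i.e. the whole disk /dev/sdc. The SCSI target id in the error line happens to be 2 as well, but the device number is the more direct evidence.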
> In case the PEs on the flaky device are still readable:
> extending your VG and using pvmove is the way to go.
>
> Steps needed (crossing your fingers that the flaky drive can still stand this):
>
> - install or use an additional drive of at least the size of the flaky one
> - pvcreate that drive
> - vgextend the VG by it
> - pvmove /dev/FlakyDrive
> - wait until pvmove is done
> - vgreduce VG /dev/FlakyDrive
I'm trying to backup the data that is still available (backup was a
thing that was still in progress... :) )
pvmove is not really happy with the current situation.
Can I just do "vgreduce ftp /dev/sdc" after I finished rescuing most of the data?
> You should have your LVs offline in case you have an older version than
> LVM 0.9.1 Beta 7, because they had a pvmove related bug which could cause
> oopses when you moved PEs in use.
>
> BTW: you need to configure MD (RAID 1 or 5) or use hardware raid subsystems
> to avoid suffering from such flaky disk problems.
I haven't looked at RAID yet. The striping option of LVM is not really
an option, since I cannot add any disks to the volume.
-Dave
--
Dave Wapstra
dave@xs4all.nl
* Re: [linux-lvm] SCSI/LVM problems after power outage
2001-05-09 18:14 ` Dave Wapstra
@ 2001-05-09 21:26 ` Heinz J. Mauelshagen
2001-05-10 20:30 ` Dave Wapstra
0 siblings, 1 reply; 16+ messages in thread
From: Heinz J. Mauelshagen @ 2001-05-09 21:26 UTC (permalink / raw)
To: linux-lvm
On Wed, May 09, 2001 at 08:14:16PM +0200, Dave Wapstra wrote:
> On 09 May, 2001 at 09:35:10 +0000, Heinz J. Mauelshagen <Mauelshagen@sistina.com> wrote:
> >
> > On Tue, May 08, 2001 at 02:05:16PM -0700, Day, Evan wrote:
> > > It looks like device 2 is having issues:
> > >
> > > SCSI disk error : host 2 channel 0 id 2 lun 0 return code = 18000002
> > >
>
> Ok. Sounds reasonable ;) It seems /dev/sdc is giving troubles.
>
> > In case the PEs on the flaky device are still readable:
> > extending your VG and using pvmove is the way to go.
> >
> > Steps needed (crossing your fingers that the flaky drive can still stand this):
> >
> > - install or use an additional drive of at least the size of the flaky one
> > - pvcreate that drive
> > - vgextend the VG by it
> > - pvmove /dev/FlakyDrive
> > - wait until pvmove is done
> > - vgreduce VG /dev/FlakyDrive
>
> I'm trying to backup the data that is still available (backup was a
> thing that was still in progress... :) )
>
> pvmove is not really happy with the current situation.
I don't understand you here.
Did you try the above steps and something failed?
>
> Can I just do "vgreduce ftp /dev/sdc" after I finished rescuing most of the data?
>
Sure.
In case you've got a backup, you could remove or reduce LVs so that *no*
PEs are allocated on /dev/sdc.
You can check with "pvdisplay -v /dev/sdc" which LV(s) have PEs allocated
on that PV and run lvremove/lvreduce in order to free that PV.
Once pvdisplay tells you that no PEs are allocated (Allocated PE 0) you
can run "vgreduce VGName /dev/sdc".
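
[Editor's sketch] The "Allocated PE 0" check can be automated before running vgreduce. The snippet below is only an illustration: it parses text in the shape of the pvdisplay output Dave posts later in this thread (a few of those lines are copied into the sample), and the function names are hypothetical. Real pvdisplay output may differ in spacing between versions.

```python
import re


def allocated_pe(pvdisplay_text: str) -> int:
    """Extract the 'Allocated PE' count from pvdisplay output."""
    match = re.search(r"^\s*Allocated PE\s+(\d+)\s*$", pvdisplay_text, re.MULTILINE)
    if match is None:
        raise ValueError("no 'Allocated PE' line found")
    return int(match.group(1))


def safe_to_vgreduce(pvdisplay_text: str) -> bool:
    """A PV may be removed from its VG only when no PEs are allocated on it."""
    return allocated_pe(pvdisplay_text) == 0


SAMPLE = """\
--- Physical volume ---
PV Name /dev/sdc
VG Name ftp
Total PE 2159
Free PE 0
Allocated PE 2159
"""

print(safe_to_vgreduce(SAMPLE))  # False
```

With 2159 PEs still allocated, the sample PV is not ready for vgreduce; the LVs using it would first have to be removed, reduced, or moved.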
> > You should have your LVs offline in case you have an older version than
> > LVM 0.9.1 Beta 7, because they had a pvmove related bug which could cause
> > oopses when you moved PEs in use.
> >
> > BTW: you need to configure MD (RAID 1 or 5) or use hardware raid subsystems
> > to avoid suffering from such flaky disk problems.
>
> I haven't looked at RAID yet. The striping option of LVM is not really
> an option, since I cannot add any disks to the volume.
That's true today.
There's a tiny item on the TODO list to enhance this ;-)
>
> -Dave
>
> --
> Dave Wapstra
> dave@xs4all.nl
Regards,
Heinz -- The LVM Guy --
* Re: [linux-lvm] SCSI/LVM problems after power outage
2001-05-09 21:26 ` Heinz J. Mauelshagen
@ 2001-05-10 20:30 ` Dave Wapstra
2001-05-11 12:54 ` Heinz J. Mauelshagen
0 siblings, 1 reply; 16+ messages in thread
From: Dave Wapstra @ 2001-05-10 20:30 UTC (permalink / raw)
To: linux-lvm
On 09 May, 2001 at 21:26:29 +0000, Heinz J. Mauelshagen <Mauelshagen@sistina.com> wrote:
> On Wed, May 09, 2001 at 08:14:16PM +0200, Dave Wapstra wrote:
> > On 09 May, 2001 at 09:35:10 +0000, Heinz J. Mauelshagen <Mauelshagen@sistina.com> wrote:
> > >
> > > Steps needed (crossing your fingers that the flaky drive can still stand this):
> > >
> > > - install or use an additional drive of at least the size of the flaky one
> > > - pvcreate that drive
> > > - vgextend the VG by it
> > > - pvmove /dev/FlakyDrive
> > > - wait until pvmove is done
> > > - vgreduce VG /dev/FlakyDrive
> >
> > I'm trying to backup the data that is still available (backup was a
> > thing that was still in progress... :) )
> >
> > pvmove is not really happy with the current situation.
>
> I don't understand you here.
> Did you try the above steps and something failed?
Yes, the pvmove command didn't finish successfully. Unfortunately, I
didn't save the output, so I cannot show you what the exact messages were.
> > Can I just do "vgreduce ftp /dev/sdc" after I finished rescuing most of the data?
> >
>
> Sure.
> In case you've got a backup, you could remove or reduce LVs so that *no*
> PEs are allocated on /dev/sdc.
> You can check with "pvdisplay -v /dev/sdc" which LV(s) have PEs allocated
> on that PV and run lvremove/lvreduce in order to free that PV.
>
> Once pvdisplay tells you that no PEs are allocated (Allocated PE 0) you
> can run "vgreduce VGName /dev/sdc".
Actually, my colleague already removed the complete volume configuration
and started from scratch.
The thing is, I've had this problem before. It seemed that a power
failure caused us some troubles. However, after re-initialising the
disks (pv/vg/lv), creating a new filesystem and restoring the backup,
everything is fine again.
Even /dev/sdc is not complaining at all.
# pvdisplay /dev/sdc
--- Physical volume ---
PV Name /dev/sdc
VG Name ftp
PV Size 8.43 GB / NOT usable 1.34 MB [LVM: 129 KB]
PV# 2
PV Status available
Allocatable yes (but full)
Cur LV 1
PE Size (KByte) 4096
Total PE 2159
Free PE 0
Allocated PE 2159
PV UUID none
Seems the disk is full, and no SCSI messages in the log files (?)
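
[Editor's sketch] "Allocatable yes (but full)" only means that every physical extent on the PV is assigned to an LV; it says nothing about disk health. The arithmetic below cross-checks the figures posted in this thread. It assumes all 11 PVs carry the same number of PEs, which matches the identical sizes in the earlier pvscan output but is not stated outright.

```python
PE_SIZE_KB = 4096   # "PE Size (KByte) 4096"
PES_PER_PV = 2159   # "Total PE 2159" on /dev/sdc
PV_COUNT = 11       # "Cur PV 11" in the earlier vgdisplay output

KB_PER_GB = 1024 ** 2

pv_gb = PES_PER_PV * PE_SIZE_KB / KB_PER_GB   # usable size of one PV
vg_pes = PES_PER_PV * PV_COUNT                # extents in the whole VG
vg_gb = vg_pes * PE_SIZE_KB / KB_PER_GB       # size of the whole VG

print(f"{pv_gb:.2f} GB per PV")   # 8.43 GB per PV
print(vg_pes)                     # 23749
print(f"{vg_gb:.2f} GB in VG")    # 92.77 GB in VG
```

The results agree with the "8.43 GB" reported per PV and with the "Total PE 23749" and "VG Size 92.77 GB" lines from vgdisplay, so the single 92.77 GB LV accounts for every extent on every disk.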
After the new filesystem had been created, the system was powered down
(switched off) twice to check if this triggers anything. In both
cases fsck successfully repaired any errors.
I'm not sure if this is a hardware problem or not. The disk seems to
work fine, however in some situations SCSI errors occur.
Any suggestions on testing the disk for hardware problems are appreciated.
(dd if=/dev/sdc of=/dev/null ?)
-Dave
--
Dave Wapstra
dave@xs4all.nl
* Re: [linux-lvm] SCSI/LVM problems after power outage
2001-05-10 20:30 ` Dave Wapstra
@ 2001-05-11 12:54 ` Heinz J. Mauelshagen
0 siblings, 0 replies; 16+ messages in thread
From: Heinz J. Mauelshagen @ 2001-05-11 12:54 UTC (permalink / raw)
To: linux-lvm
On Thu, May 10, 2001 at 10:30:37PM +0200, Dave Wapstra wrote:
> On 09 May, 2001 at 21:26:29 +0000, Heinz J. Mauelshagen <Mauelshagen@sistina.com> wrote:
> > On Wed, May 09, 2001 at 08:14:16PM +0200, Dave Wapstra wrote:
> > > On 09 May, 2001 at 09:35:10 +0000, Heinz J. Mauelshagen <Mauelshagen@sistina.com> wrote:
> > > >
> > > > Steps needed (crossing your fingers that the flaky drive can still stand this):
> > > >
> > > > - install or use an additional drive of at least the size of the flaky one
> > > > - pvcreate that drive
> > > > - vgextend the VG by it
> > > > - pvmove /dev/FlakyDrive
> > > > - wait until pvmove is done
> > > > - vgreduce VG /dev/FlakyDrive
> > >
> > > I'm trying to backup the data that is still available (backup was a
> > > thing that was still in progress... :) )
> > >
> > > pvmove is not really happy with the current situation.
> >
> > I don't understand you here.
> > Did you try the above steps and something failed?
>
> Yes, the pvmove command didn't finish successfully. Unfortunately, I
> haven't saved the output, so I cannot show you what the exact messages were.
:-(
>
> > > Can I just do "vgreduce ftp /dev/sdc" after I finished rescuing most of the data?
> > >
> >
> > Sure.
> > In case you've got a backup, you could remove or reduce LVs so that *no*
> > PEs are allocated on /dev/sdc.
> > You can check with "pvdisplay -v /dev/sdc" which LV(s) have PEs allocated
> > on that PV and run lvremove/lvreduce in order to free that PV.
> >
> > Once pvdisplay tells you that no PEs are allocated (Allocated PE 0) you
> > can run "vgreduce VGName /dev/sdc".
>
> Actually, my colleague already removed the complete volume configuration
> and started from scratch.
Oops, he was faster than us ;-)
>
> The thing is, I've had this problem before. It seemed that a power
> failure caused us some troubles. However, after re-initialising the
> disks (pv/vg/lv), creating a new filesystem and restoring the backup,
> everything is fine again.
>
> Even /dev/sdc is not complaining at all.
I wouldn't count on that disk, because SCSI errors were displayed before.
>
> # pvdisplay /dev/sdc
> --- Physical volume ---
> PV Name /dev/sdc
> VG Name ftp
> PV Size 8.43 GB / NOT usable 1.34 MB [LVM: 129 KB]
> PV# 2
> PV Status available
> Allocatable yes (but full)
> Cur LV 1
> PE Size (KByte) 4096
> Total PE 2159
> Free PE 0
> Allocated PE 2159
> PV UUID none
>
> Seems the disk is full, and no SCSI messages in the log files (?)
>
> After the new filesystem has been created, the system was powered down
> (switched off) twice to check if this is triggering anything. In both
> cases fsck successfully restored any errors.
>
> I'm not sure if this is a hardware problem or not. The disk seems to
> work fine, however in some situations SCSI errors occur.
As I said: I believe this is a hardware problem.
You could follow piles of old threads in regard to flaky cables et al.
>
> Any suggestions on testing the disk for hardware problems are appreciated.
> (dd if=/dev/sdc of=/dev/null ?)
You need to thrash the disk with both reads and writes.
Create a test LV with all the PEs on sdc, run bonnie with a huge file
all day and see if you can trigger SCSI errors.
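
[Editor's sketch] For comparison, the read-only pass Dave proposes (dd if=/dev/sdc of=/dev/null) can be approximated as below. This covers only the read half; as noted above, a real test also needs sustained writes, for example with bonnie on a test LV. The function name, chunk size, and error limit are assumptions for illustration, and the demo reads a temporary file rather than a real device.

```python
import os
import tempfile


def read_scan(path: str, chunk_size: int = 1 << 20, max_errors: int = 100):
    """Read path sequentially; return (bytes_read, offsets of failed chunks)."""
    bad_offsets = []
    total = 0
    offset = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        while len(bad_offsets) < max_errors:
            try:
                data = os.pread(fd, chunk_size, offset)
            except OSError:
                # Record the failing chunk and skip past it, like dd conv=noerror.
                bad_offsets.append(offset)
                offset += chunk_size
                continue
            if not data:          # end of file or device
                break
            total += len(data)
            offset += len(data)
    finally:
        os.close(fd)
    return total, bad_offsets


# Demo on a scratch file; point it at the device (as root) for a real scan.
with tempfile.NamedTemporaryFile() as tmp:
    tmp.write(b"\0" * 3_000_000)
    tmp.flush()
    demo_total, demo_bad = read_scan(tmp.name)

print(demo_total, demo_bad)
```

A clean read pass does not clear the disk: the parity errors in the log were reported on the bus, so cabling and termination stay suspect even if every chunk reads back.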
>
>
> -Dave
>
> --
> Dave Wapstra
> dave@xs4all.nl
--
Regards,
Heinz -- The LVM Guy --
* Re: [linux-lvm] SCSI/LVM problems after power outage
2001-05-09 9:35 ` Heinz J. Mauelshagen
2001-05-09 18:14 ` Dave Wapstra
@ 2001-05-11 8:03 ` Steven Lembark
2001-05-11 8:37 ` Adrian Phillips
2001-05-11 15:56 ` Luca Berra
1 sibling, 2 replies; 16+ messages in thread
From: Steven Lembark @ 2001-05-11 8:03 UTC (permalink / raw)
To: linux-lvm
> Steps needed (crossing your fingers that the flaky drive can still stand this):
q: any known way to just dump the lv's out to tape? problem
w/ pvmove is that you have to update data stored on the flakey
device.
another out might be to:
- grab a spare disk.
- create an equally-sized partition.
- try to software mirror it.
if the mirror succeeds then you can just pretend the flakey
drive died and replace it normally.
--
Steven Lembark 2930 W. Palmer St.
Chicago, IL 60647
lembark@wrkhors.com 800-762-1582
* Re: [linux-lvm] SCSI/LVM problems after power outage
2001-05-11 8:03 ` Steven Lembark
@ 2001-05-11 8:37 ` Adrian Phillips
2001-05-11 14:05 ` Terje Kvernes
2001-05-11 15:56 ` Luca Berra
1 sibling, 1 reply; 16+ messages in thread
From: Adrian Phillips @ 2001-05-11 8:37 UTC (permalink / raw)
To: linux-lvm
>>>>> "Steven" == Steven Lembark <lembark@wrkhors.com> writes:
>> Steps needed (crossing your fingers that the flaky drive can
>> still stand this):
Steven> q: any known way to just dump the lv's out to tape?
Steven> problem w/ pvmove is that you have to update data stored
Steven> on the flakey device.
I might have misunderstood but : dd if=<lv> of=<tape>
Sincerely,
Adrian Phillips
--
Your mouse has moved.
Windows NT must be restarted for the change to take effect.
Reboot now? [OK]
* Re: [linux-lvm] SCSI/LVM problems after power outage
2001-05-11 8:03 ` Steven Lembark
2001-05-11 8:37 ` Adrian Phillips
@ 2001-05-11 15:56 ` Luca Berra
1 sibling, 0 replies; 16+ messages in thread
From: Luca Berra @ 2001-05-11 15:56 UTC (permalink / raw)
To: linux-lvm
On Fri, May 11, 2001 at 03:03:00AM -0500, Steven Lembark wrote:
> - grab a spare disk.
> - create an equally-sized partition.
> - try to software mirror it.
if by software mirror you mean md, be warned that md
writes a raid superblock at the end of the partition
overwriting existing data.
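
[Editor's sketch] To see which part of a partition is at risk, the superblock position can be computed. This assumes the classic md 0.90 superblock layout, where the superblock sits in the last 64 KiB-aligned 64 KiB block of the device. The partition size used here is made up for illustration.

```python
MD_RESERVED_SECTORS = 128  # 64 KiB expressed in 512-byte sectors (md 0.90, assumed)


def md_superblock_sector(device_sectors: int) -> int:
    """Return the first sector of the md 0.90 superblock area."""
    aligned = device_sectors & ~(MD_RESERVED_SECTORS - 1)  # round down to 64 KiB
    return aligned - MD_RESERVED_SECTORS


size = 17_680_100  # hypothetical partition size in sectors
sb = md_superblock_sector(size)
tail = size - sb   # sectors from the superblock start to the end of the partition

print(sb, tail)
```

Everything from that sector to the end of the partition (between 64 KiB and just under 128 KiB) should be treated as lost to md, so a partition already filled by an LVM PV cannot simply be turned into a mirror half in place.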
L.
--
Luca Berra -- bluca@comedia.it
Communication Media & Services S.r.l.
/"\
\ / ASCII RIBBON CAMPAIGN
X AGAINST HTML MAIL
/ \
* Re: [linux-lvm] SCSI/LVM problems after power outage
2001-05-08 21:05 [linux-lvm] SCSI/LVM problems after power outage Day, Evan
2001-05-09 9:35 ` Heinz J. Mauelshagen
@ 2001-05-11 8:00 ` Steven Lembark
1 sibling, 0 replies; 16+ messages in thread
From: Steven Lembark @ 2001-05-11 8:00 UTC (permalink / raw)
To: linux-lvm
"Day, Evan" wrote:
>
> It looks like device 2 is having issues:
>
> SCSI disk error : host 2 channel 0 id 2 lun 0 return code = 18000002
>
> However, it has been several years since I worked with Sun machines, so I
> could be wrong. Regardless, it doesn't sound like an LVM issue, but a
> hardware issue. Unfortunately, I can't offer much recovery advice - most of
> my LVM experience is with HP-UX, and we use mirroring (RAID-1) at work -
> just unplug the bad drive, plug in a new one, and do a vgsync. I think you
> can add a replacement drive to the VG and use pvmove to try and move the PEs
> from the bad drive to the new drive, but I wouldn't take my word for it...
LVM -- or raid -- shouldn't normally be able to cause scsi-level
errors. ext2 might complain about hoked up data, but the scsi
stuff happens at the circuit level. looks like your disk got fried.
you might want to try running a low-level scsi check on the disk
(non-destructive, hopefully). this would also be a really good
time to verify your backups...
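For what it's worth, the "return code = 18000002" in the quoted log can be split into its four bytes. This is a rough sketch that assumes the usual Linux SCSI layout of driver byte, host byte, message byte and status byte from high to low; the function name is made up, and the meanings in the comments are my reading of the 2.4 headers, not something confirmed in this thread.

```python
# Sketch only: assumes result = driver<<24 | host<<16 | msg<<8 | status.
def split_scsi_result(result: int) -> dict:
    """Split a 32-bit SCSI result word into its four byte-sized fields."""
    return {
        "status": result & 0xFF,
        "msg": (result >> 8) & 0xFF,
        "host": (result >> 16) & 0xFF,
        "driver": (result >> 24) & 0xFF,
    }


if __name__ == "__main__":
    fields = split_scsi_result(0x18000002)
    for name, value in fields.items():
        print(f"{name:>6} = {value:#04x}")
    # status 0x02 is CHECK CONDITION in the SCSI standard.
    # host 0x00 would mean the adapter itself reported no error.
    # driver 0x18 is, as far as I can tell, "sense data present" (0x08)
    # plus "retry suggested" (0x10); treat that as an assumption.
```

If that reading is right, the host adapter is fine with the transfer and the complaint comes back as sense data for target id 2, which fits the "Aborted Command" sense key in the log.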
--
Steven Lembark 2930 W. Palmer St.
Chicago, IL 60647
lembark@wrkhors.com 800-762-1582
* [linux-lvm] SCSI/LVM problems after power outage
@ 2001-05-08 16:33 Dave Wapstra
2001-05-08 16:59 ` Hugo Lombard
0 siblings, 1 reply; 16+ messages in thread
From: Dave Wapstra @ 2001-05-08 16:33 UTC (permalink / raw)
To: linux-lvm
Hi all,
After a power outage, our ftp server with LVM has not survived very well.
System: UltraSparc II, Linux 2.4(2) + io.path + LVM 0.9.1beta6
All normal ext2 partitions were fsck'd fine; however, the volume is having problems.
The syslog has a lot of SCSI errors:
May 7 19:15:52 ftp kernel: sym53c8xx_reset: pid=0 reset_flags=1 serial_number=0 serial_number_at_timeout=0
May 7 19:15:52 ftp kernel: scsi2: device driver called scsi_done() for a synchronous reset.
May 7 19:15:53 ftp kernel: sym53c875-2-<0,*>: FAST-20 WIDE SCSI 40.0 MB/s (50.0 ns, offset 15)
May 7 19:15:53 ftp kernel: sym53c875-2-<1,*>: FAST-20 WIDE SCSI 40.0 MB/s (50.0 ns, offset 15)
May 7 19:15:53 ftp kernel: sym53c875-2-<2,*>: FAST-20 WIDE SCSI 40.0 MB/s (50.0 ns, offset 16)
May 7 19:15:53 ftp kernel: sym53c875-2-<3,*>: FAST-20 WIDE SCSI 40.0 MB/s (50.0 ns, offset 15)
May 7 19:15:53 ftp kernel: sym53c875-2: SCSI parity error detected: SCR1=3 DBC=11000c00 SBCL=ae
May 7 19:15:53 ftp kernel: sym53c875-2-<2,*>: FAST-20 WIDE SCSI 40.0 MB/s (50.0 ns, offset 16)
May 7 19:15:53 ftp kernel: sym53c875-2: SCSI parity error detected: SCR1=3 DBC=11000c00 SBCL=ae
May 7 19:15:53 ftp kernel: SCSI disk error : host 2 channel 0 id 2 lun 0 return code = 18000002
May 7 19:15:53 ftp kernel: [valid=0] Info fld=0x0, Current sd08:20: sense key Aborted Command
May 7 19:15:53 ftp kernel: Additional sense indicates Initiator detected error message received
May 7 19:15:53 ftp kernel: I/O error: dev 08:20, sector 2239227
May 7 19:15:53 ftp kernel: EXT2-fs error (device lvm(58,0)): ext2_read_inode: unable to read inode block - inode=1245185, block=2490377
May 7 19:15:53 ftp kernel: sym53c875-2-<2,*>: FAST-20 WIDE SCSI 40.0 MB/s (50.0 ns, offset 16)
May 7 19:15:53 ftp kernel: sym53c875-2: SCSI parity error detected: SCR1=3 DBC=11000c00 SBCL=ae
May 7 19:15:53 ftp kernel: sym53c875-2-<2,*>: FAST-20 WIDE SCSI 40.0 MB/s (50.0 ns, offset 16)
May 7 19:15:56 ftp kernel: sym53c875-2: SCSI parity error detected: SCR1=3 DBC=11000c00 SBCL=ae
May 7 19:15:56 ftp kernel: sym53c875-2-<2,*>: FAST-20 WIDE SCSI 40.0 MB/s (50.0 ns, offset 16)
May 7 19:15:56 ftp kernel: sym53c875-2: SCSI parity error detected: SCR1=3 DBC=11000c00 SBCL=ae
May 7 19:15:57 ftp kernel: scsi2 channel 0 : resetting for second half of retries.
May 7 19:15:57 ftp kernel: SCSI bus is being reset for host 2 channel 0.
May 7 19:15:57 ftp kernel: sym53c8xx_reset: pid=0 reset_flags=1 serial_number=0 serial_number_at_timeout=0
May 7 19:15:57 ftp kernel: scsi2: device driver called scsi_done() for a synchronous reset.
May 7 19:15:58 ftp kernel: sym53c875-2-<0,*>: FAST-20 WIDE SCSI 40.0 MB/s (50.0 ns, offset 15)
May 7 19:15:58 ftp kernel: sym53c875-2-<3,*>: FAST-20 WIDE SCSI 40.0 MB/s (50.0 ns, offset 15)
May 7 19:15:58 ftp kernel: sym53c875-2-<2,*>: FAST-20 WIDE SCSI 40.0 MB/s (50.0 ns, offset 16)
May 7 19:15:58 ftp kernel: sym53c875-2: SCSI parity error detected: SCR1=3 DBC=11000c00 SBCL=ae
May 7 19:15:58 ftp kernel: sym53c875-2-<2,*>: FAST-20 WIDE SCSI 40.0 MB/s (50.0 ns, offset 16)
May 7 19:15:58 ftp kernel: sym53c875-2: SCSI parity error detected: SCR1=3 DBC=11000c00 SBCL=ae
May 7 19:15:58 ftp kernel: SCSI disk error : host 2 channel 0 id 2 lun 0 return code = 18000002
May 7 19:15:58 ftp kernel: [valid=0] Info fld=0x0, Current sd08:20: sense key Aborted Command
May 7 19:15:58 ftp kernel: Additional sense indicates Initiator detected error message received
May 7 19:15:58 ftp kernel: I/O error: dev 08:20, sector 17443579
May 7 19:15:58 ftp kernel: EXT2-fs error (device lvm(58,0)): ext2_read_inode: unable to read inode block - inode=2195457, block=4390921
May 7 19:15:58 ftp kernel: sym53c875-2-<2,*>: FAST-20 WIDE SCSI 40.0 MB/s (50.0 ns, offset 16)
May 7 19:16:00 ftp kernel: sym53c875-2: SCSI parity error detected: SCR1=3 DBC=11000c00 SBCL=ae
May 7 19:16:00 ftp kernel: sym53c875-2-<2,*>: FAST-20 WIDE SCSI 40.0 MB/s (50.0 ns, offset 16)
May 7 19:16:01 ftp kernel: sym53c875-2: SCSI parity error detected: SCR1=3 DBC=11000c00 SBCL=ae
May 7 19:16:01 ftp kernel: sym53c875-2-<2,*>: FAST-20 WIDE SCSI 40.0 MB/s (50.0 ns, offset 16)
May 7 19:16:01 ftp kernel: sym53c875-2: SCSI parity error detected: SCR1=3 DBC=11000c00 SBCL=ae
May 7 19:16:02 ftp kernel: scsi2 channel 0 : resetting for second half of retries.
May 7 19:16:02 ftp kernel: SCSI bus is being reset for host 2 channel 0.
May 7 19:16:02 ftp kernel: sym53c8xx_reset: pid=0 reset_flags=1 serial_number=0 serial_number_at_timeout=0
May 7 19:16:02 ftp kernel: scsi2: device driver called scsi_done() for a synchronous reset.
May 7 19:16:02 ftp kernel: sym53c875-2-<0,*>: FAST-20 WIDE SCSI 40.0 MB/s (50.0 ns, offset 15)
May 7 19:16:03 ftp kernel: sym53c875-2-<2,*>: FAST-20 WIDE SCSI 40.0 MB/s (50.0 ns, offset 16)
May 7 19:16:03 ftp kernel: sym53c875-2: SCSI parity error detected: SCR1=3 DBC=11000c00 SBCL=ae
May 7 19:16:03 ftp kernel: sym53c875-2-<2,*>: FAST-20 WIDE SCSI 40.0 MB/s (50.0 ns, offset 16)
May 7 19:16:03 ftp kernel: sym53c875-2: SCSI parity error detected: SCR1=3 DBC=11000c00 SBCL=ae
May 7 19:16:03 ftp kernel: sym53c875-2-<2,*>: FAST-20 WIDE SCSI 40.0 MB/s (50.0 ns, offset 16)
May 7 19:16:03 ftp kernel: sym53c875-2: SCSI parity error detected: SCR1=3 DBC=11000c00 SBCL=ae
May 7 19:16:04 ftp kernel: scsi2 channel 0 : resetting for second half of retries.
May 7 19:16:04 ftp kernel: SCSI bus is being reset for host 2 channel 0.
May 7 19:16:04 ftp kernel: sym53c8xx_reset: pid=0 reset_flags=1 serial_number=0 serial_number_at_timeout=0
May 7 19:16:04 ftp kernel: scsi2: device driver called scsi_done() for a synchronous reset.
May 7 19:16:04 ftp kernel: sym53c875-2-<0,*>: FAST-20 WIDE SCSI 40.0 MB/s (50.0 ns, offset 15)
May 7 19:16:05 ftp kernel: sym53c875-2-<2,*>: FAST-20 WIDE SCSI 40.0 MB/s (50.0 ns, offset 16)
May 7 19:16:05 ftp kernel: sym53c875-2: SCSI parity error detected: SCR1=3 DBC=11000c00 SBCL=ae
May 7 19:16:05 ftp kernel: sym53c875-2-<2,*>: FAST-20 WIDE SCSI 40.0 MB/s (50.0 ns, offset 16)
May 7 19:16:05 ftp kernel: sym53c875-2: SCSI parity error detected: SCR1=3 DBC=11000c00 SBCL=ae
May 7 19:16:05 ftp kernel: SCSI disk error : host 2 channel 0 id 2 lun 0 return code = 18000002
May 7 19:16:05 ftp kernel: [valid=0] Info fld=0x0, Current sd08:20: sense key Aborted Command
May 8 10:20:50 ftp kernel: scsi2 channel 0 : resetting for second half of retries.
May 8 10:20:50 ftp kernel: SCSI bus is being reset for host 2 channel 0.
May 8 10:20:50 ftp kernel: sym53c8xx_reset: pid=0 reset_flags=1 serial_number=0 serial_number_at_timeout=0
May 8 10:20:50 ftp kernel: scsi2: device driver called scsi_done() for a synchronous reset.
May 8 10:20:50 ftp kernel: sym53c876-2: restart (scsi reset).
May 8 10:20:50 ftp kernel: sym53c876-2: Downloading SCSI SCRIPTS.
May 8 10:20:50 ftp kernel: sym53c876-2-<0,0>: wide msgout: 1-2-3-1.
May 8 10:20:50 ftp kernel: sym53c876-2-<2,0>: wide msgout: 1-2-3-1.
May 8 10:20:50 ftp kernel: sym53c876-2-<0,0>: wide msgin: 1-2-3-1.
May 8 10:20:50 ftp kernel: sym53c876-2-<0,0>: wide: wide=1 chg=0.
May 8 10:20:50 ftp kernel: sym53c876-2-<0,0>: wide msgout: 1-2-3-1.
May 8 10:20:50 ftp kernel: sym53c876-2-<2,0>: wide msgin: 1-2-3-1.
May 8 10:20:50 ftp kernel: sym53c876-2-<2,0>: wide: wide=1 chg=0.
May 8 10:20:50 ftp kernel: sym53c876-2-<0,0>: wide msgin: 1-2-3-1.
May 8 10:20:50 ftp kernel: sym53c876-2-<0,0>: wide: wide=1 chg=0.
May 8 10:20:50 ftp kernel: sym53c876-2-<0,0>: sync msgout: 1-3-1-c-10.
May 8 10:20:50 ftp kernel: sym53c876-2-<2,0>: wide msgout: 1-2-3-1.
May 8 10:20:50 ftp kernel: sym53c876-2-<0,0>: sync msg in: 1-3-1-c-f.
May 8 10:20:50 ftp kernel: sym53c876-2-<0,0>: sync: per=12 scntl3=0x90 scntl4=0x0 ofs=15 fak=0 chg=0.
May 8 10:20:50 ftp kernel: sym53c876-2-<0,*>: FAST-20 WIDE SCSI 40.0 MB/s (50 ns, offset 15)
May 8 10:20:50 ftp kernel: sym53c876-2-<2,0>: wide msgin: 1-2-3-1.
May 8 10:20:50 ftp kernel: sym53c876-2-<2,0>: wide: wide=1 chg=0.
May 8 10:20:50 ftp kernel: sym53c876-2-<2,0>: sync msgout: 1-3-1-c-10.
May 8 10:20:50 ftp kernel: sym53c876-2-<2,0>: sync msg in: 1-3-1-c-10.
May 8 10:20:50 ftp kernel: sym53c876-2-<2,0>: sync: per=12 scntl3=0x90 scntl4=0x0 ofs=16 fak=0 chg=0.
May 8 10:20:50 ftp kernel: sym53c876-2-<2,*>: FAST-20 WIDE SCSI 40.0 MB/s (50 ns, offset 16)
May 8 10:20:50 ftp kernel: sym53c876-2: SCSI parity error detected: SCR1=3 DBC=11000c00 SBCL=ae
May 8 10:20:50 ftp kernel: sym53c876-2-<2,0>: wide msgout: 1-2-3-1.
May 8 10:20:50 ftp kernel: sym53c876-2-<2,0>: wide msgin: 1-2-3-1.
May 8 10:20:50 ftp kernel: sym53c876-2-<2,0>: wide: wide=1 chg=0.
May 8 10:20:50 ftp kernel: sym53c876-2-<2,0>: sync msgout: 1-3-1-c-10.
May 8 10:20:50 ftp kernel: sym53c876-2-<2,0>: sync msg in: 1-3-1-c-10.
May 8 10:20:50 ftp kernel: sym53c876-2-<2,0>: sync: per=12 scntl3=0x90 scntl4=0x0 ofs=16 fak=0 chg=0.
May 8 10:20:50 ftp kernel: sym53c876-2-<2,*>: FAST-20 WIDE SCSI 40.0 MB/s (50 ns, offset 16)
May 8 10:20:50 ftp kernel: sym53c876-2: SCSI parity error detected: SCR1=3 DBC=11000c00 SBCL=ae
May 8 10:20:50 ftp kernel: sym53c876-2-<2,0>: wide msgout: 1-2-3-1.
May 8 10:20:50 ftp kernel: sym53c876-2-<2,0>: wide msgin: 1-2-3-1.
May 8 10:20:51 ftp kernel: sym53c876-2-<2,0>: wide: wide=1 chg=0.
May 8 10:20:51 ftp kernel: SCSI disk error : host 2 channel 0 id 2 lun 0 return code = 18000002
May 8 10:20:51 ftp kernel: [valid=0] Info fld=0x0, Current sd08:20: sense key Aborted Command
May 8 10:20:51 ftp kernel: Additional sense indicates Initiator detected error message received
May 8 10:20:51 ftp kernel: I/O error: dev 08:20, sector 6695675
May 8 10:20:51 ftp kernel: sym53c876-2-<2,0>: sync msgout: 1-3-1-c-10.
May 8 10:20:51 ftp kernel: sym53c876-2-<2,0>: sync msg in: 1-3-1-c-10.
May 8 10:20:51 ftp kernel: sym53c876-2-<2,0>: sync: per=12 scntl3=0x90 scntl4=0x0 ofs=16 fak=0 chg=0.
May 8 10:20:51 ftp kernel: sym53c876-2-<2,*>: FAST-20 WIDE SCSI 40.0 MB/s (50 ns, offset 16)
May 8 10:20:52 ftp kernel: sym53c876-2: SCSI parity error detected: SCR1=3 DBC=11007c00 SBCL=ae
May 8 10:20:52 ftp kernel: sym53c876-2-<2,0>: wide msgout: 1-2-3-1.
How can I find out which disks are having problems, and what are possibilities to fix this?
# pvscan
pvscan -- reading all physical volumes (this may take a while...)
pvscan -- ACTIVE PV "/dev/sdb" of VG "ftp" [8.43 GB / 0 free]
pvscan -- ACTIVE PV "/dev/sdc" of VG "ftp" [8.43 GB / 0 free]
pvscan -- ACTIVE PV "/dev/sdd" of VG "ftp" [8.43 GB / 0 free]
pvscan -- ACTIVE PV "/dev/sde" of VG "ftp" [8.43 GB / 0 free]
pvscan -- ACTIVE PV "/dev/sdf" of VG "ftp" [8.43 GB / 0 free]
pvscan -- ACTIVE PV "/dev/sdg" of VG "ftp" [8.43 GB / 0 free]
pvscan -- ACTIVE PV "/dev/sdh" of VG "ftp" [8.43 GB / 0 free]
pvscan -- ACTIVE PV "/dev/sdi" of VG "ftp" [8.43 GB / 0 free]
pvscan -- ACTIVE PV "/dev/sdj" of VG "ftp" [8.43 GB / 0 free]
pvscan -- ACTIVE PV "/dev/sdk" of VG "ftp" [8.43 GB / 0 free]
pvscan -- ACTIVE PV "/dev/sdl" of VG "ftp" [8.43 GB / 0 free]
pvscan -- total: 11 [92.78 GB] / in use: 11 [92.78 GB] / in no VG: 0 [0]
# vgdisplay ftp
--- Volume group ---
VG Name ftp
VG Access read/write
VG Status available/resizable
VG # 0
MAX LV 256
Cur LV 1
Open LV 1
MAX LV Size 255.99 GB
Max PV 256
Cur PV 11
Act PV 11
VG Size 92.77 GB
PE Size 4 MB
Total PE 23749
Alloc PE / Size 23749 / 92.77 GB
Free PE / Size 0 / 0
VG UUID eFMVIQ-SMBr-ecsj-TDkc-bGvw-0fyE-jibMIU
# lvdisplay /dev/ftp/pub
--- Logical volume ---
LV Name /dev/ftp/pub
VG Name ftp
LV Write Access read/write
LV Status available
LV # 1
# open 1
LV Size 92.77 GB
Current LE 23749
Allocated LE 23749
Allocation next free
Read ahead sectors 120
Block device 58:0
-Dave
--
Dave Wapstra
dave@xs4all.nl
* Re: [linux-lvm] SCSI/LVM problems after power outage
2001-05-08 16:33 Dave Wapstra
@ 2001-05-08 16:59 ` Hugo Lombard
2001-05-08 18:16 ` Bart-Jan Vrielink
0 siblings, 1 reply; 16+ messages in thread
From: Hugo Lombard @ 2001-05-08 16:59 UTC (permalink / raw)
To: linux-lvm; +Cc: Dave Wapstra
On Tue, May 08, 2001 at 06:33:26PM +0200, Dave Wapstra wrote:
> Hi all,
>
> After a power outage, our ftp server with LVM has not survived very well.
>
> System: UltraSparc II, Linux 2.4(2) + io.path + LVM 0.9.1beta6
>
> All normal ext2 partitions were fsck'd fine; however, the volume is having problems.
>
> The syslog has a lot of SCSI errors:
>
> May 7 19:15:53 ftp kernel: SCSI disk error : host 2 channel 0 id 2 lun 0 return code = 18000002
Second controller, id 2..? Do you have such a disk?
> May 7 19:15:53 ftp kernel: I/O error: dev 08:20, sector 2239227
$ ls -l /dev/sdb4
brw-rw---- 1 root disk 8, 20 May 5 1998 /dev/sdb4
seems to be the culprit, IMHO.
--
The goal of science is to build better mousetraps. The goal of nature
is to build better mice.
---------------------------------------------------------------------------
Hugo Lombard Infoline (Pty) Ltd
System Administrator
---------------------------------------------------------------------------
* Re: [linux-lvm] SCSI/LVM problems after power outage
2001-05-08 16:59 ` Hugo Lombard
@ 2001-05-08 18:16 ` Bart-Jan Vrielink
0 siblings, 0 replies; 16+ messages in thread
From: Bart-Jan Vrielink @ 2001-05-08 18:16 UTC (permalink / raw)
To: linux-lvm; +Cc: Dave Wapstra
On Tue, 8 May 2001, Hugo Lombard wrote:
> On Tue, May 08, 2001 at 06:33:26PM +0200, Dave Wapstra wrote:
> > Hi all,
> >
> > After a power outage, our ftp server with LVM has not survived very well.
> >
> > System: UltraSparc II, Linux 2.4(2) + io.path + LVM 0.9.1beta6
> >
> > All normal ext2 partitions were fsck'd fine; however, the volume is having problems.
> >
> > The syslog has a lot of SCSI errors:
> >
> > May 7 19:15:53 ftp kernel: SCSI disk error : host 2 channel 0 id 2 lun 0 return code = 18000002
>
> Second controller, id 2..? Do you have such a disk?
>
> > May 7 19:15:53 ftp kernel: I/O error: dev 08:20, sector 2239227
>
> $ ls -l /dev/sdb4
> brw-rw---- 1 root disk 8, 20 May 5 1998 /dev/sdb4
>
> seems to be the culprit, IMHO.
As far as I know, kernel messages like these always show the device numbers
in hex. Hex 08:20 is major 8, minor 32, so it's /dev/sdc, not /dev/sdb4.
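The conversion can be sketched in a few lines. This assumes the standard Linux numbering for SCSI disks on major 8, where each disk gets 16 minors (minor 0 is sda, 16 is sdb, 32 is sdc, and the low four bits are the partition number). The function names are made up for illustration.

```python
# Sketch only: covers major 8 (sda..sdp), 16 minors per disk.
def parse_kernel_dev(dev: str):
    """Turn a hex 'major:minor' string from a kernel message into integers."""
    major_hex, minor_hex = dev.split(":")
    return int(major_hex, 16), int(minor_hex, 16)


def sd_name(major: int, minor: int) -> str:
    """Map a major/minor pair on SCSI disk major 8 to its /dev/sd* name."""
    if major != 8 or not 0 <= minor <= 255:
        raise ValueError("only major 8 with minors 0..255 is handled here")
    disk, part = divmod(minor, 16)
    name = "/dev/sd" + chr(ord("a") + disk)
    # Partition 0 means the whole disk, which has no numeric suffix.
    return name + (str(part) if part else "")


if __name__ == "__main__":
    print(sd_name(*parse_kernel_dev("08:20")))  # hex 20 is minor 32
    print(sd_name(8, 20))                       # decimal 20, as shown by ls -l
```

Reading 08:20 as hex gives the whole disk /dev/sdc, while reading the 20 as decimal gives /dev/sdb4, which is where the two answers above diverge.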
--
Tot ziens,
Bart-Jan