* [linux-lvm] lvm2 *TEMPORARY* PV failure - what happens?
From: Ty! Boyack @ 2006-04-25 19:08 UTC
To: linux-lvm
I've been intrigued by the discussion of what happens when a PV fails,
and have begun to wonder what would happen in the case of a transient
failure of a PV.
The design I'm thinking of is a SAN environment with several
multi-terabyte iSCSI arrays as PVs, being grouped together into a single
VG, and then carving LVs out of that. We plan on using the CLVM tools
to fit into a clustered environment.
The arrays themselves are robust (RAID 5/6, redundant power supplies,
etc.) and I grant that if we lose the actual array (for example, if
multiple disks fail), then we are in the situation of a true and
possibly total failure of the PV and loss of its data blocks.
But there is always the possibility that we could lose the CPU, memory,
bus, etc. in the iSCSI controller portion of the array, which will cause
downtime, but no true loss of data. Or someone may hit the wrong power
switch and just reboot the thing, taking it offline for a short time.
Yes, that someone would probably be me. Shame on me.
The key point is that the iSCSI disk will come back in a few
minutes/hours/days depending on the failure type, and all blocks will be
intact when it comes back up. I suppose the analogous situation would
be using LVM on a group of hot swap drives and pulling one of the disks,
waiting a while, and then re-inserting it.
Can someone please walk me through the resulting steps that would happen
within LVM2 (or a GFS filesystem on top of that LV) in this situation?
Thanks,
-Ty!
--
-===========================-
Ty! Boyack
NREL Unix Network Manager
ty@nrel.colostate.edu
(970) 491-1186
-===========================-
* Re: [linux-lvm] lvm2 *TEMPORARY* PV failure - what happens?
From: Ming Zhang @ 2006-04-25 19:57 UTC
To: LVM general discussion and development
My 2c; correct me if I'm wrong.

You could either activate the VG partially, and then all LVs on the
other PVs are still accessible. As I recall, those LVs will only have
read-only access, though I have no idea why.

Or use dm-zero to generate a fake PV and add it to the VG, which then
allows the VG to activate and those LVs to be accessed. But I do not
know what will happen if you access an LV that is partially or fully on
the missing PV.
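For reference, a rough sketch of that dm-zero trick (the device size,
VG name, and UUID below are placeholders, not anything from this
thread):

  # Create a zero-backed device the same size as the missing PV
  # (dmsetup takes a sector count; the value here is hypothetical).
  dmsetup create fakepv --table "0 4294967296 zero"

  # Stamp it with the missing PV's UUID from the VG metadata backup so
  # that LVM accepts it in place of the lost PV (placeholder UUID).
  pvcreate --uuid "<missing-pv-uuid>" \
           --restorefile /etc/lvm/backup/<vg> /dev/mapper/fakepv

  # Restore the VG metadata and activate.
  vgcfgrestore <vg>
  vgchange -ay <vg>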
Ming
On Tue, 2006-04-25 at 13:08 -0600, Ty! Boyack wrote:
> [...]
* Re: [linux-lvm] lvm2 *TEMPORARY* PV failure - what happens?
From: Jonathan E Brassow @ 2006-04-25 20:21 UTC
To: LVM general discussion and development
It is simple to play with this type of scenario by doing:
echo offline > /sys/block/<sd dev>/device/state
and later
echo running > /sys/block/<sd dev>/device/state
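A fuller test loop along those lines might look like this (a sketch
only; the device, VG, and LV names are placeholders):

  # Take the iSCSI-backed disk offline to simulate the failure.
  echo offline > /sys/block/<sd dev>/device/state

  # Drive I/O against an LV with extents on that PV; this should
  # eventually surface as I/O errors.
  dd if=/dev/<vg>/<lv> of=/dev/null bs=1M count=10

  # Bring the device back, re-scan, and check what LVM reports.
  echo running > /sys/block/<sd dev>/device/state
  pvscan
  lvs <vg>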
I know this doesn't answer your question directly.
brassow
On Apr 25, 2006, at 2:57 PM, Ming Zhang wrote:
> [...]
* Re: [linux-lvm] lvm2 *TEMPORARY* PV failure - what happens?
From: Ming Zhang @ 2006-04-25 20:39 UTC
To: LVM general discussion and development
Assume two scenarios:

1) The PV is in use when it is disconnected temporarily. Then it will
eventually return r/w errors to applications, but other LVs are still
accessible.

2) The system is shut down and boots up again. In this case the system
will complain that the PV with UUID ... is not found, so the only way
is to partially activate the VG (sketch below).
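By "partially activate" I mean something like the following (the VG
name is a placeholder):

  # Activate whatever LVs can still be brought up despite the missing PV.
  vgchange -ay --partial <vg>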
Am I correct here?
ming
On Tue, 2006-04-25 at 15:21 -0500, Jonathan E Brassow wrote:
> [...]
* Re: [linux-lvm] lvm2 *TEMPORARY* PV failure - what happens?
From: Jonathan E Brassow @ 2006-04-25 21:34 UTC
To: LVM general discussion and development, mingz
Yes, sounds right. It's pretty much what I get.
brassow
On Apr 25, 2006, at 3:39 PM, Ming Zhang wrote:
> [...]
* Re: [linux-lvm] lvm2 *TEMPORARY* PV failure - what happens?
From: Ming Zhang @ 2006-04-25 21:44 UTC
To: Jonathan E Brassow; +Cc: LVM general discussion and development
Thanks.

Any idea why partially activated VGs allow only read-only access to LVs
even on good PVs? I think that as long as we do not make LVM-level
changes like creating, resizing, or deleting LVs, write access to LV
data should still be allowed, right?
ming
On Tue, 2006-04-25 at 16:34 -0500, Jonathan E Brassow wrote:
> [...]
* Re: [linux-lvm] lvm2 *TEMPORARY* PV failure - what happens?
From: Ty! Boyack @ 2006-04-25 22:13 UTC
To: LVM general discussion and development
The first scenario you give is the most likely one to occur. I'm
thinking the server is active, has active volumes which use the iSCSI
array as PVs, and may or may not have applications accessing it at the
time of the failure. I'm glad to hear that the other volumes should be
accessible (assuming we don't stripe across the devices). It also makes
sense that the user will get a r/w error or I/O error.

I'm still wondering, if the disk comes back, whether the LV will be
available again, or whether LVM will mark it as failed and assume that
all blocks on it are invalid now and forevermore. Or is this a function
of the filesystem? I'll be building a test case and will certainly have
fun breaking it like this. I'm glad to know of the testing method
Jonathan pointed out; that will make for a simpler test bed.
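One quick way to see which PVs each LV actually touches before testing
(standard LVM2 reporting; the VG name is a placeholder):

  # List each LV together with the devices its extents live on, to
  # spot LVs that are striped or span across more than one PV.
  lvs -o +devices <vg>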
Good info - thanks folks!
-Ty!
Ming Zhang wrote:
> [...]
--
-===========================-
Ty! Boyack
NREL Unix Network Manager
ty@nrel.colostate.edu
(970) 491-1186
-===========================-
* Re: [linux-lvm] lvm2 *TEMPORARY* PV failure - what happens?
From: Ming Zhang @ 2006-04-25 22:54 UTC
To: LVM general discussion and development
On Tue, 2006-04-25 at 16:13 -0600, Ty! Boyack wrote:
> The first scenario you give is the most likely one to occur. I'm
> thinking the server is active, has active volumes which use the iSCSI
> array as PVs, and may or may not have applications accessing it at the
> time of the failure. I'm glad to hear that the other volumes should be
> accessible (assuming we don't stripe across the devices). It also makes
> sense that the user will get a r/w error or I/O error.
Maybe you can consider a RAID 5 on top of these iSCSI disks, if your
applications are unhappy or too buggy to cope with these r/w errors
(sketch below).
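A minimal sketch of that idea using md RAID (device names are
placeholders; the array, not the individual disks, then becomes the
PV):

  # Build a RAID 5 array from three iSCSI-backed block devices.
  mdadm --create /dev/md0 --level=5 --raid-devices=3 \
        /dev/sdb /dev/sdc /dev/sdd

  # Put the PV on the array, so a single offline disk degrades the
  # array instead of surfacing I/O errors to the LVs.
  pvcreate /dev/md0
  vgextend <vg> /dev/md0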
>
> I'm still wondering, if the disk comes back, whether the LV will be
> available again, or whether LVM will mark it as failed and assume that
> all blocks on
It will not automatically become available, I think, but with that
"echo ...", you might be able to see it again.
> it are invalid now and forevermore. Or is this a function of the
> filesystem? I'll be building a test case and will certainly have fun
> breaking it like this. I'm glad to know of the testing method
> Jonathan pointed out; that will make for a simpler test bed.
Keep us updated. Thanks.
* Re: [linux-lvm] lvm2 *TEMPORARY* PV failure - what happens?
From: Jonathan E Brassow @ 2006-04-26 16:08 UTC
To: LVM general discussion and development
On Apr 25, 2006, at 5:54 PM, Ming Zhang wrote:
>> I'm still wondering, if the disk comes back, whether the LV will be
>> available again, or whether LVM will mark it as failed and assume
>> that all blocks on
>
> It will not automatically become available, I think, but with that
> "echo ...", you might be able to see it again.
>
Actually, I think it is available once the dev comes back.
brassow