* [linux-lvm] LVM in shared parallel SCSI environment
From: Jesse Sipprell @ 2000-11-11 13:46 UTC
To: linux-lvm

Hi all,

I'm currently in the process of planning a large(ish)-scale Linux deployment.
As part of the planning process, I am considering using LVM in a shared SCSI
environment; potentially with GFS as the file system in the future, but
starting with ext2.

To that end I have a test "cluster" set up. It consists of:

One Winchester Systems OpenRAID chassis with 100GB total storage, running in
RAID5. The OpenRAID is a SCSI-to-SCSI solution, with four host SCSI channels.
The hardware can be configured so that the channels can share physical storage
space.

Four test Linux boxes, each with a small (9GB) boot/root/swap/emergency SCSI
disk. Obviously, the four boxes are connected to the four host channels on
the OpenRAID.

The OpenRAID is configured such that all available disk space is visible
identically to the servers. Obviously, because the initial test uses ext2
instead of a filesystem that supports shared media (i.e. GFS), we won't be
actually sharing filesystems between boxes. However, LVM seemed like a pretty
good fit in this situation because it allows for relatively dynamic
allocation of storage from a large shared "pool." The catch, as with all
shared SCSI solutions, is that each participating host must maintain a
consistent view of LVM metadata.

So far, it's been fairly successful. Once the VG and LVs are created, and each
host has the metadata in core, all is well. The only problem I am seeing is
that when an LV (for example) is extended or reduced, the other systems are
unaware of the change in LVM metadata. This makes sense of course, because
each host is operating with its own "notion" of the LVM in core. The solution
is rather awkward; one has to unmount all LVM filesystems on each host,
vgchange the VG to inactive, vgchange it back to active and then remount the
filesystems.

My question is this: Is there any way to "refresh" the in-core metadata from
disk? If not, does anyone think this might be a good idea for the future?

Regards,

--
Jesse Sipprell
Technical Operations Director
Evolution Communications, Inc.
800.496.4736

* Finger jss@evcom.net for my PGP Public Key *
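The unmount / deactivate / reactivate / remount cycle described above can be
sketched as a small Python helper. It only builds the command list for one
host and runs nothing. The VG name, device path and mount point in the usage
lines are hypothetical; `vgchange -a n` and `vgchange -a y` are the usual
deactivate and activate switches, and everything else is an illustration
under those assumptions rather than a tested procedure.

```python
def bounce_plan(vg, mounts):
    """Build the commands for one host's deactivate/reactivate cycle.

    vg     -- volume group name (hypothetical example: "vg00")
    mounts -- list of (device, mountpoint) pairs for filesystems on that VG
    Nothing is executed; the caller decides how (and whether) to run the plan.
    """
    plan = []
    # Unmount in reverse order so nested mount points are released first.
    for _device, mountpoint in reversed(mounts):
        plan.append(["umount", mountpoint])
    # Deactivating the VG discards this host's in-core metadata ...
    plan.append(["vgchange", "-a", "n", vg])
    # ... and reactivating it reads the current metadata back from disk.
    plan.append(["vgchange", "-a", "y", vg])
    for device, mountpoint in mounts:
        plan.append(["mount", device, mountpoint])
    return plan


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for command in bounce_plan("vg00", [("/dev/vg00/lv1", "/data1")]):
        print(" ".join(command))
```

Returning a plan instead of executing it keeps the sketch safe to run
anywhere and makes the ordering easy to review before it is applied on every
other attached host.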
* Re: [linux-lvm] LVM in shared parallel SCSI environment
From: Jos Visser @ 2000-11-14 6:44 UTC
To: Jesse Sipprell; +Cc: linux-lvm

Hi Jesse,

Let's see if I get this right: You have an LVM configuration on a shared
SCSI disk set. I understand from your description that you have some
VGs active on more than one node at a time. Is this right? If so, I
wonder if it's supported (others are better equipped to determine this
than I am). However, most volume managers on other Unix platforms do not
allow this.

As far as I know the vgchange inactive/active bounce is the only thing
that will refresh the metadata.

However, if you have the VG truly active on more than one node (if
that's possible), you have a recipe for disaster! What if you by mistake
change the VG (e.g. adding an LV) from one node, and then perform a
similar action from another node? I would not be surprised if there
would be corruptions, oopses and panics all over the place.

++Jos

--
Success and happiness can not be pursued; it must ensue as the
unintended side-effect of one's personal dedication to a course greater
than oneself.
* Re: [linux-lvm] LVM in shared parallel SCSI environment
From: Jesse Sipprell @ 2000-11-14 14:38 UTC
To: Jos Visser; +Cc: linux-lvm

On Tue, Nov 14, 2000 at 07:44:25AM +0100, Jos Visser wrote:
> Hi Jesse,
>
> Let's see if I get this right: You have an LVM configuration on a shared
> SCSI disk set. I understand from your description that you have some
> VGs active on more than one node at a time. Is this right? If so, I
> wonder if it's supported (others are better equipped to determine this
> than I am). However, most volume managers on other Unix platforms do not
> allow this.
>
> As far as I know the vgchange inactive/active bounce is the only thing
> that will refresh the metadata.
>
> However, if you have the VG truly active on more than one node (if
> that's possible), you have a recipe for disaster! What if you by mistake
> change the VG (e.g. adding an LV) from one node, and then perform a
> similar action from another node? I would not be surprised if there
> would be corruptions, oopses and panics all over the place.

You are correct, sir. ;)

It is exceedingly important that each node's view of the LVM metadata be
consistent. I understand that the addition of LVM clustering features
(including this issue and others) is currently in the works.

In the meantime, I'll just have to do things the old-fashioned way. I'll put
a procedure in place that any LVM changes done from a particular node require
the bouncing of VGs on all other attached nodes. Fortunately, after initial
cluster setup, manipulation of LVs won't really be performed on a routine
basis.

--
Jesse Sipprell
Technical Operations Director
Evolution Communications, Inc.
800.496.4736

* Finger jss@evcom.net for my PGP Public Key *
* Re: [linux-lvm] LVM in shared parallel SCSI environment
From: Paul Jakma @ 2000-11-14 16:09 UTC
To: Jesse Sipprell; +Cc: Jos Visser, linux-lvm

On Tue, 14 Nov 2000, Jesse Sipprell wrote:

> In the mean time, I'll just have to do things the old fashioned
> way. I'll put a procedure in place that any LVM changes done from
> a particular node require the bouncing of VGs on all other
> attached nodes. Fortunately, after initial cluster setup,
> manipulation of LVs won't really be performed on a routine basis.

and so what do you do with these LVs? The filesystem/application you
run on them has to be aware of the shared-access nature of the
device.. so that rules out all but GFS - which IIRC already has some
LVM-like features.

--paulj
* Re: [linux-lvm] LVM in shared parallel SCSI environment
From: Jesse Sipprell @ 2000-11-14 19:29 UTC
To: Paul Jakma; +Cc: Jos Visser, linux-lvm

On Tue, Nov 14, 2000 at 04:09:47PM +0000, Paul Jakma wrote:
> On Tue, 14 Nov 2000, Jesse Sipprell wrote:
>
> > In the mean time, I'll just have to do things the old fashioned
> > way. I'll put a procedure in place that any LVM changes done from
> > a particular node require the bouncing of VGs on all other
> > attached nodes. Fortunately, after initial cluster setup,
> > manipulation of LVs won't really be performed on a routine basis.
>
> and so what do you do with these LV's? The filesystem/application you
> run on them has to be aware of the shared-access nature of the
> device.. so that rules out all but GFS - which IIRC already has some
> LVM like features.

Actually, it's entirely possible to run a non-shared-media-aware filesystem as
long as no more than one cluster node has a given file system mounted at a
time.

To illustrate:

|-------- VG --------|
||====== LV0 =======||
||      (ext2)      || --> Mounted on Cluster Node 1
||==================||
||====== LV1 =======||
||      (ext2)      || --> Mounted on Cluster Node 2
||==================||
||====== LV2 =======||
||      (ext2)      || --> Mounted on Cluster Node 3
||==================||
||====== LV3 =======||
||      (ext2)      || --> Mounted on Cluster Node 4
||==================||
|                    |
|  Free Space in VG  |
|                    |
|====================|

Because none of the cluster nodes are attempting to share access to the actual
blocks where each filesystem is stored, there are no concurrency issues.

One can use the benefits of LVM to unmount LV0's fs on Cluster Node 1, resize
the LV, resize the fs and remount. Now, Cluster Nodes 2, 3 and 4 need to
have their in-core LVM metadata updated in order to see the new size of LV0.
Once this is done via the vgchange bounce, everything is consistent.

--
Jesse Sipprell
Technical Operations Director
Evolution Communications, Inc.
800.496.4736

* Finger jss@evcom.net for my PGP Public Key *
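The staleness described above can be modelled in a few lines of Python. This
is a toy model, not LVM code: the class names, the LV sizes and the idea of
metadata as a plain dictionary are all assumptions made for illustration. It
only shows why a node that has not bounced the VG keeps seeing the old size.

```python
class SharedDisk:
    """Stand-in for the on-disk VG metadata every node can read."""

    def __init__(self, lv_sizes_mb):
        self.metadata = dict(lv_sizes_mb)


class Node:
    """A cluster node that caches the metadata when the VG is activated."""

    def __init__(self, name, disk):
        self.name = name
        self.disk = disk
        self.in_core = dict(disk.metadata)  # snapshot taken at activation

    def resize(self, lv, new_size_mb):
        # The resizing node updates both its own view and the disk.
        self.in_core[lv] = new_size_mb
        self.disk.metadata[lv] = new_size_mb

    def bounce(self):
        # Models vgchange inactive/active: re-read everything from disk.
        self.in_core = dict(self.disk.metadata)

    def is_stale(self):
        return self.in_core != self.disk.metadata


if __name__ == "__main__":
    disk = SharedDisk({"LV0": 20000, "LV1": 20000})  # hypothetical sizes in MB
    node1, node2 = Node("node1", disk), Node("node2", disk)
    node1.resize("LV0", 30000)
    print(node2.is_stale())  # node2 still holds the old size for LV0
    node2.bounce()
    print(node2.is_stale())
```

Nothing tells node2 that the disk changed, which is why the refresh has to be
driven by procedure on every other node.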
[parent not found: <200011142115.eAELFHV11698@webber.adilger.net>]
* Re: [linux-lvm] LVM in shared parallel SCSI environment
From: Jesse Sipprell @ 2000-11-14 22:03 UTC
To: Andreas Dilger; +Cc: Paul Jakma, Jos Visser, linux-lvm

On Tue, Nov 14, 2000 at 02:15:16PM -0700, Andreas Dilger wrote:
> Jesse Sipprell writes:
> > Actually, it's entirely possible to run a non-shared-media-aware filesystem as
> > long as no more than one cluster node has a given file system mounted at a
> > time.
> >
> > To illustrate:
> >
> > |-------- VG --------|
> > ||====== LV0 =======||
> > ||      (ext2)      || --> Mounted on Cluster Node 1
> > ||==================||
> > ||====== LV1 =======||
> > ||      (ext2)      || --> Mounted on Cluster Node 2
> > ||==================||
> > ||====== LV2 =======||
> > ||      (ext2)      || --> Mounted on Cluster Node 3
> > ||==================||
> > ||====== LV3 =======||
> > ||      (ext2)      || --> Mounted on Cluster Node 4
> > ||==================||
> > |                    |
> > |  Free Space in VG  |
> > |                    |
> > |====================|
>
> Far safer to simply have a separate VG for each node, and import it on the
> backup node when you do a failover. Since you have to have some sort of
> external control for the filesystems, doing the import at the same time is
> not any more overhead.

Safer, certainly, but won't this make extending/reducing an LV problematic?
For example, say that in 6 months from now it becomes necessary to move 100MB
of capacity from LV0 to LV1; now per your assumption that LV0 and LV1 are
on separate VGs, one can only extend and/or reduce the involved VGs by this
amount if the increment/decrement size is a multiple of the PV sizes.

> > One can use the benefits of LVM to unmount LV0's fs on Cluster Node 1, resize
> > the LV, resize the fs and remount. Now, Cluster Node's 2, 3 and 4 need to
> > have their in-core LVM metadata updated in order to see the new size of LV0.
> > Once this is done via the vgchange bounce, everything is consistant.
>
> Yes, except when someone also does a resize on another filesystem before
> they have synced up the VG, you have corrupt filesystems and LVM. You've
> made something that looks to be "highly available" into something that
> facilitates destroying your important data. In many critical systems, it
> is very rare that you would be able to take filesystems 2, 3, 4 offline
> in order to sync the LVM back up.

Absolutely. This is the reasoning behind my original question regarding
"refreshing" the in-core LVM metadata on cluster nodes. LVM is still a win
without clustering features; however, the potential exists for exactly what
you described (and other horrors).

The only solution (until LVM clustering comes to fruition), in my
organization, is to procedurally make sure that only one person is in charge
of a cluster, resizing LVs, filesystems, etc., as well as to make sure that
person is fully aware of the dangers involved with current LVM in a shared
media environment.

> PS - no need to unmount the ext2 filesystem to do the resize with the
> right tools.

Certainly. ;)

--
Jesse Sipprell
Technical Operations Director
Evolution Communications, Inc.
800.496.4736

* Finger jss@evcom.net for my PGP Public Key *
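The corruption scenario raised in this exchange (a second resize issued from
a node that has not yet synced) can be illustrated with a toy extent
allocator in Python. The extent numbers and the "take the lowest free
extents" policy are assumptions chosen for the sketch; it does not claim to
reproduce how LVM actually picks extents, only why two stale views of free
space can hand out the same blocks.

```python
def allocate(free_view, count):
    """Pick the `count` lowest-numbered extents from a node's view of free space."""
    return set(sorted(free_view)[:count])


def simulate(refresh_between):
    """Extend LV0 from node 1, then LV1 from node 2; return the shared extents.

    refresh_between -- True if node 2 re-reads the metadata (bounces the VG)
                       before doing its own resize.
    """
    on_disk_free = set(range(100, 110))  # hypothetical free extents 100..109
    node1_view = set(on_disk_free)
    node2_view = set(on_disk_free)

    # Node 1 extends LV0 by four extents and writes the result to disk.
    lv0_new = allocate(node1_view, 4)
    node1_view -= lv0_new
    on_disk_free -= lv0_new

    if refresh_between:
        node2_view = set(on_disk_free)  # node 2 now sees what node 1 did

    # Node 2 extends LV1 by four extents using whatever view it holds.
    lv1_new = allocate(node2_view, 4)
    return lv0_new & lv1_new


if __name__ == "__main__":
    print(sorted(simulate(refresh_between=False)))  # extents given to both LVs
    print(sorted(simulate(refresh_between=True)))   # no overlap after a refresh
```

Any extent in the returned set would back two different filesystems at once,
which is the "corrupt filesystems and LVM" outcome the reply warns about.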
* Re: [linux-lvm] LVM in shared parallel SCSI environment
From: Andreas Dilger @ 2000-11-14 22:40 UTC
To: Jesse Sipprell; +Cc: Andreas Dilger, Paul Jakma, Jos Visser, linux-lvm

Jesse Sipprell writes:
> > Far safer to simply have a separate VG for each node, and import it on the
> > backup node when you do a failover. Since you have to have some sort of
> > external control for the filesystems, doing the import at the same time is
> > not any more overhead.
>
> Safer, certainly, but won't this make extending/reducing an LV problematic?
> For example, say that in 6 months from now it becomes necessary to move 100MB
> of capacity from LV0 to LV1; now per your assumption that LV0 and LV1 on are
> on separate VGs, one can only extend and/or reduce the involved VGs by this
> amount if the increment/decrement size is a multiple of the PV sizes.

I guess it depends on how your cluster is set up. If you really need to move
that much storage from one node to another, you could simply vgreduce the one
VG by a few disks, and vgextend the other - only affecting the two nodes you
are working on, and also able to do it without taking either VG offline and
with much less risk.

This may be problematic if you have only a single giant RAID device or
similar, unless you carved the RAID into multiple logical disks of a
convenient size (e.g. 10GB or so). For lesser amounts of storage, I assume
that there is at least some extra space within a VG on any given node.

Of course it's not quite as good as cluster LVM, but much much safer.

Cheers, Andreas
--
Andreas Dilger  \ "If a man ate a pound of pasta and a pound of antipasto,
                 \  would they cancel out, leaving him still hungry?"
http://www-mddsp.enel.ucalgary.ca/People/adilger/               -- Dogbert
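The granularity point in this exchange comes down to simple arithmetic:
capacity moves between VGs in whole PVs. The Python sketch below works that
out and builds (without running) the `vgreduce` / `vgextend` pair suggested
above. The VG names, device names and the 10GB PV size are hypothetical, and
the sketch assumes the PVs being moved hold no allocated extents, since
`vgreduce` only removes unused PVs.

```python
def pvs_to_move(needed_mb, pv_size_mb):
    """Smallest whole number of PVs that covers `needed_mb` (ceiling division)."""
    return (needed_mb + pv_size_mb - 1) // pv_size_mb


def move_plan(src_vg, dst_vg, pvs):
    """Commands to hand whole PVs from one VG to another; nothing is executed.

    Assumes the listed PVs are already empty in src_vg.
    """
    return [
        ["vgreduce", src_vg] + list(pvs),
        ["vgextend", dst_vg] + list(pvs),
    ]


if __name__ == "__main__":
    pv_size_mb = 10 * 1024               # RAID carved into 10GB logical disks
    count = pvs_to_move(100, pv_size_mb)
    print(count, count * pv_size_mb)     # PVs moved, MB actually transferred
    for command in move_plan("vg_node1", "vg_node2", ["/dev/sdc"]):
        print(" ".join(command))
```

With 10GB PVs, a 100MB requirement still moves a full 10GB disk, which is why
carving the array into conveniently sized logical disks matters for the
separate-VG layout.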
* Re: [linux-lvm] LVM in shared parallel SCSI environment
From: Jos Visser @ 2000-11-15 7:04 UTC
To: Jesse Sipprell; +Cc: Paul Jakma, linux-lvm

Hi,

Though most has already been said in this thread, just a small followup
with some notes and thoughts.

The traditional volume managers on HP-UX, Solaris (VxVM) and AIX do not
usually support shared access to a volume group from two or more nodes,
even if the nodes access different logical volumes. This is done
explicitly to prevent the kind of problems that have been pointed out in
this thread (the chance that two nodes have different in-core metadata
about the VG). HP's LVM supports a read-only vgchange that allows only
read-only access to the VG and its LVs, but I've never used it.

In these traditional environments, the clustering software exports and
imports the VGs as necessary, and runs some clusterwide resource manager
that takes care of who currently "owns" the VG. Veritas has a special
Cluster Volume Manager (CVM) that allows shared access to volume groups,
but AFAIK it is only used with parallel databases such as Oracle
Parallel Server.

For myself, I would not choose a solution like Jesse's. However, the fun
and power of Unix is that everyone can handcraft his/her own optimal
environment. As long as you're aware of the consequences of what you're
doing: please be my guest :-)

I must admit that I have not looked at what LVM 0.9 will bring to the
table, but some added features in the clustering arena would be very
welcome.

++Jos

--
Success and happiness can not be pursued; it must ensue as the
unintended side-effect of one's personal dedication to a course greater
than oneself.
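The "who currently owns the VG" bookkeeping mentioned above can be pictured
as a tiny registry. The Python sketch below is purely illustrative: the class
and method names are invented, it keeps state in memory on one machine, and a
real clusterwide resource manager would also need fencing and agreement
between nodes, none of which is modelled here.

```python
class VGOwnership:
    """Toy registry allowing at most one owning node per volume group."""

    def __init__(self):
        self._owner = {}

    def owner(self, vg):
        return self._owner.get(vg)

    def acquire(self, vg, node):
        # Corresponds to importing/activating the VG on `node`.
        current = self._owner.get(vg)
        if current is not None and current != node:
            raise RuntimeError(f"{vg} is already owned by {current}")
        self._owner[vg] = node

    def release(self, vg, node):
        # Corresponds to deactivating/exporting the VG on `node`.
        if self._owner.get(vg) != node:
            raise RuntimeError(f"{node} does not own {vg}")
        del self._owner[vg]

    def failover(self, vg, from_node, to_node):
        self.release(vg, from_node)
        self.acquire(vg, to_node)


if __name__ == "__main__":
    registry = VGOwnership()
    registry.acquire("vg_app", "node1")          # hypothetical names
    registry.failover("vg_app", "node1", "node2")
    print(registry.owner("vg_app"))
```

Refusing the second `acquire` is the whole point: a VG never ends up active
on two nodes holding different in-core metadata.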
* Re: [linux-lvm] LVM in shared parallel SCSI environment
From: Matthew O'Keefe @ 2000-11-21 13:44 UTC
To: Jesse Sipprell, Paul Jakma, linux-lvm; +Cc: Matthew O'Keefe, mauelshagen

Hi,

Heinz and his LVM team (we've hired two new LVM developers)
as well as the GFS team have worked
out a preliminary design for cluster LVM. The plan is to
include it in the 1.0 release.

I totally agree with Jos: a cluster volume manager is very
useful, and should stand alone from (but also be compatible with)
a cluster file system like GFS. There is a tremendous amount
of commercial activity in the area of volume management
software for shared SAN storage. Imagine you have two
$3 million EMC Symmetrix disk arrays, each attached
to independent servers. If one of these Symmetrix arrays fills up,
you have to buy another for just that server alone, even if
the other server's Symmetrix has lots of free space.

If instead you share these two Symmetrix boxen across a SAN,
then you can expand the PV for one machine into the other
Symmetrix with free space, and there is no need to buy
another array. This is a key reason why shared SAN storage is
taking off.

Matt O'Keefe
Sistina Software, Inc.
* Re: [linux-lvm] LVM in shared parallel SCSI environment
From: John DeFranco @ 2000-11-21 18:01 UTC
To: Matthew O'Keefe; +Cc: Jesse Sipprell, Paul Jakma, linux-lvm, mauelshagen

Hi,

So when is the 1.0 release tentatively scheduled for?

Matthew O'Keefe wrote:
>
> Hi,
>
> Heinz and his LVM team (we've hired two new LVM developers)
> as well as the GFS team have worked
> out a preliminary design for cluster LVM. The plan is to
> include it in the 1.0 release.
>
> [...]

--
==========
John DeFranco 408-447-7543
Hewlett-Packard Company
19111 Pruneridge Avenue, MS 44UB
Cupertino, CA 95014
* Re: [linux-lvm] LVM in shared parallel SCSI environment
  2000-11-21 13:44 ` Matthew O'Keefe
  2000-11-21 18:01   ` John DeFranco
@ 2000-11-21 22:25   ` Jos Visser
  2000-11-22  0:42     ` Matthew O'Keefe
  1 sibling, 1 reply; 13+ messages in thread
From: Jos Visser @ 2000-11-21 22:25 UTC
To: Matthew O'Keefe; +Cc: Jesse Sipprell, Paul Jakma, linux-lvm, mauelshagen

Hi,

Are the plans public? Are comments invited?

++Jos

And thus it came to pass that Matthew O'Keefe wrote:
(on Tue, Nov 21, 2000 at 07:44:52AM -0600 to be exact)

> Hi,
>
> Heinz and his LVM team (we've hired two new LVM developers)
> as well as the GFS team have worked
> out a preliminary design for cluster LVM. The plan is to
> include it in the 1.0 release.
>
> I totally agree with Jos: a cluster volume manager is very
> useful, and should stand alone from (but also be compatible with)
> a cluster file system like GFS. There is a tremendous amount
> of commercial activity in the area of volume management
> software for shared SAN storage. Imagine you have 2
> $3 million EMC Symmetrix disk arrays, each attached
> to independent servers. If one of these Symmetrix arrays fills up,
> you have to buy another for just that server alone, even if
> the other server's Symmetrix has lots of free space.
>
> If instead you share these 2 Symmetrix boxen across a SAN,
> then you can expand the PV for one machine into the other
> Symmetrix with free space, and there is no need to buy
> another array. This is a key reason why shared SAN storage is
> taking off.
>
> Matt O'Keefe
> Sistina Software, Inc.

--
Success and happiness can not be pursued; it must ensue as the
unintended side-effect of one's personal dedication to a course greater
than oneself.
* Re: [linux-lvm] LVM in shared parallel SCSI environment
  2000-11-21 22:25 ` Jos Visser
@ 2000-11-22  0:42   ` Matthew O'Keefe
  2000-11-22 11:40     ` Heinz J. Mauelshagen
  0 siblings, 1 reply; 13+ messages in thread
From: Matthew O'Keefe @ 2000-11-22 0:42 UTC
To: Matthew O'Keefe, Jesse Sipprell, Paul Jakma, linux-lvm, mauelshagen

Hi,

On Tue, Nov 21, 2000 at 11:25:15PM +0100, Jos Visser wrote:
> Hi,
>
> Are the plans public? Are comments invited?

Heinz and his team are working on a draft for this and will post it
"soon": I'll let Heinz define "soon" :-)

Of course comments are welcome. I think we are talking about
1.0 being released in Q1 2000, but again, Heinz and others
should make that prediction.

Regards,
Matt

Matthew O'Keefe
Sistina Software, Inc.
* Re: [linux-lvm] LVM in shared parallel SCSI environment
  2000-11-22  0:42 ` Matthew O'Keefe
@ 2000-11-22 11:40   ` Heinz J. Mauelshagen
  0 siblings, 0 replies; 13+ messages in thread
From: Heinz J. Mauelshagen @ 2000-11-22 11:40 UTC
To: Matthew O'Keefe; +Cc: linux-lvm

On Tue, Nov 21, 2000 at 06:42:32PM -0600, Matthew O'Keefe wrote:
>
> Hi,
>
> On Tue, Nov 21, 2000 at 11:25:15PM +0100, Jos Visser wrote:
> > Hi,
> >
> > Are the plans public? Are comments invited?
>
> Heinz and his team are working on a draft for this and will post it
> "soon": I'll let Heinz define "soon" :-)

We are writing the spec this week. We can probably post it in early December.

> Of course comments are welcome. I think we are talking about
> 1.0 being released in Q1 2000, but again, Heinz and others
> should make that prediction.

Our plan is Q1 2001.

Cheers,
Heinz

> Regards,
> Matt
>
> Matthew O'Keefe
> Sistina Software, Inc.

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Heinz Mauelshagen                                 Sistina Software Inc.
Senior Consultant/Developer                       Bartningstr. 12
                                                  64289 Darmstadt
                                                  Germany
Mauelshagen@Sistina.com                           +49 6151 7103 86
                                                       FAX 7103 96
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Thread overview: 13+ messages
2000-11-11 13:46 [linux-lvm] LVM in shared parallel SCSI environment Jesse Sipprell
2000-11-14 6:44 ` Jos Visser
2000-11-14 14:38 ` Jesse Sipprell
2000-11-14 16:09 ` Paul Jakma
2000-11-14 19:29 ` Jesse Sipprell
[not found] ` <200011142115.eAELFHV11698@webber.adilger.net>
2000-11-14 22:03 ` Jesse Sipprell
2000-11-14 22:40 ` Andreas Dilger
2000-11-15 7:04 ` Jos Visser
2000-11-21 13:44 ` Matthew O'Keefe
2000-11-21 18:01 ` John DeFranco
2000-11-21 22:25 ` Jos Visser
2000-11-22 0:42 ` Matthew O'Keefe
2000-11-22 11:40 ` Heinz J. Mauelshagen