linux-lvm.redhat.com archive mirror
* [linux-lvm] Cannot get rid of root filesystem LVM snapshot
@ 2006-09-09 20:49 Martin Dvořák
  2006-09-09 23:17 ` Alasdair G Kergon
  0 siblings, 1 reply; 7+ messages in thread
From: Martin Dvořák @ 2006-09-09 20:49 UTC
  To: linux-lvm

Hi all,

I am writing here in the hope that somebody has already run into and
successfully resolved this issue. I am on CentOS 4.4 with kernel
kernel-smp-2.6.9-200.0.2.EL.iworx.x86_64.rpm. I've been in the process
of creating a backup script which uses the LVM snapshotting feature to
back up a snapshot of the root filesystem instead of the live filesystem.

During the testing phase on a test CentOS installation, I was able to
successfully create, mount, unmount and remove root filesystem
snapshots. On the live production system, I also successfully created,
mounted and unmounted a root filesystem snapshot, but I CANNOT REMOVE IT
in any way.

lvremove -f gives the following output:
[CODE]File descriptor 3 left open
File descriptor 5 left open
File descriptor 7 left open
  /dev/cdrom: open failed: Read-only file system
  Can't remove open logical volume "SSsystem"[/CODE]

I also tried to deactivate the snapshot using the lvchange -an command:
[CODE]File descriptor 3 left open
File descriptor 5 left open
File descriptor 7 left open
  /dev/cdrom: open failed: Read-only file system
  Can't change snapshot logical volume "SSsystem"[/CODE]

lvdisplay outputs this:
[CODE]File descriptor 3 left open
File descriptor 5 left open
File descriptor 7 left open
  --- Logical volume ---
  LV Name                /dev/VGdb/LVdb
  VG Name                VGdb
  LV UUID                R4mtQ4-Idyo-SRiU-EfLg-TFM6-NZN2-Cd1kgv
  LV Write Access        read/write
  LV Status              available
  # open                 1
  LV Size                14.66 GB
  Current LE             469
  Segments               1
  Allocation             inherit
  Read ahead sectors     0
  Block device           253:1

  --- Logical volume ---
  LV Name                /dev/VGsystem/LVsystem
  VG Name                VGsystem
  LV UUID                HusBAX-tbxn-6283-zap5-Cihe-K1AX-jKDbk7
  LV Write Access        read/write
  LV snapshot status     source of
                         /dev/VGsystem/SSsystem [active]
  LV Status              available
  # open                 1
  LV Size                31.25 GB
  Current LE             1000
  Segments               1
  Allocation             inherit
  Read ahead sectors     0
  Block device           253:0

  --- Logical volume ---
  LV Name                /dev/VGsystem/SSsystem
  VG Name                VGsystem
  LV UUID                qohvbp-yo1c-SFGv-suM1-9QC9-OwA3-TpX1i0
  LV Write Access        read/write
  LV snapshot status     active destination for /dev/VGsystem/LVsystem
  LV Status              available
  # open                 1
  LV Size                31.25 GB
  Current LE             1000
  COW-table size         2.66 GB
  COW-table LE           85
  Allocated to snapshot  2.32%
  Snapshot chunk size    8.00 KB
  Segments               1
  Allocation             inherit
  Read ahead sectors     0
  Block device           253:4[/CODE]

Right now the snapshot space is slowly filling up, and I am afraid to
reboot, to fill the snapshot manually (in the hope that it will
deactivate automatically) or to try anything else, because this is a
live production system in a data center and I currently have no
physical access to it, only remote access through SSH2.
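
While waiting for a fix, the fill level of the snapshot can at least be
watched. The following is a minimal Python sketch, not part of the
original report, that pulls the "Allocated to snapshot" figure out of
lvdisplay output like the listing above. The function name
snapshot_usage and the SAMPLE text (a shortened copy of that listing)
are illustrative assumptions; on a real system the text would come from
running lvdisplay.

```python
import re


def snapshot_usage(lvdisplay_text):
    """Return {snapshot LV name: percent of COW space used}."""
    usage = {}
    current_lv = None
    for line in lvdisplay_text.splitlines():
        line = line.strip()
        name = re.match(r"LV Name\s+(\S+)", line)
        if name:
            # Remember which LV the following attribute lines belong to.
            current_lv = name.group(1)
            continue
        used = re.match(r"Allocated to snapshot\s+([\d.]+)%", line)
        if used and current_lv is not None:
            usage[current_lv] = float(used.group(1))
    return usage


# Shortened copy of the lvdisplay listing from this message.
SAMPLE = """\
  --- Logical volume ---
  LV Name                /dev/VGsystem/LVsystem
  LV snapshot status     source of
                         /dev/VGsystem/SSsystem [active]
  --- Logical volume ---
  LV Name                /dev/VGsystem/SSsystem
  LV snapshot status     active destination for /dev/VGsystem/LVsystem
  COW-table size         2.66 GB
  Allocated to snapshot  2.32%
"""

for lv, percent in snapshot_usage(SAMPLE).items():
    print(f"{lv}: {percent:.2f}% of COW space used")
```

Only LVs that report an "Allocated to snapshot" line appear in the
result, so origins and ordinary LVs are skipped.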

I'll be grateful for any ideas which might help to resolve this issue
without losing any data. Thanks very much.

Martin


Here is output of uname -a:
[CODE]Linux xxxx.xxx.xxx 2.6.9-200.0.2.EL.iworxsmp #1 SMP Fri Aug 4
15:25:06 EDT 2006 x86_64 x86_64 x86_64 GNU/Linux[/CODE]


* Re: [linux-lvm] Cannot get rid of root filesystem LVM snapshot
  2006-09-09 20:49 [linux-lvm] Cannot get rid of root filesystem LVM snapshot Martin Dvořák
@ 2006-09-09 23:17 ` Alasdair G Kergon
  2006-09-09 23:53   ` Martin Dvořák
  0 siblings, 1 reply; 7+ messages in thread
From: Alasdair G Kergon @ 2006-09-09 23:17 UTC
  To: LVM general discussion and development

The tools do not support root snapshots properly yet - whether or not they
work is down to chance.  To remove one reliably you need lvremove from a
rescue environment - you're lucky lvremove is refusing to proceed - if it
got further it could hang your machine.  In principle you can recover
without rebooting but you'll need to write a custom script to run various
dmsetup commands from a ramdisk, for example, and then fix up the lvm2
metadata with activation disabled.  [If you learn how snapshots are
created/destroyed you can work out what to do: sorry, it's too much for me
to explain here.  This is what the tools will want enhancing to do to
provide proper support.]

Older kernels like yours however can get into a state where recovery is only
possible by writing directly to kernel memory, if you already ran certain
commands in the wrong sequence.
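
As background for "how snapshots are created/destroyed": LVM2 normally
builds a snapshot from four device-mapper devices, namely the origin
(a snapshot-origin target), a hidden "-real" device holding the origin's
data, the snapshot itself (a snapshot target) and a hidden "-cow" device
holding the copy-on-write store. The Python sketch below is an editorial
illustration, not the recovery script described above: it only reads
text in the format printed by dmsetup table and labels each device's
role. The function names and the SAMPLE_TABLE contents (device numbers,
sizes, offsets) are made up for the example, and the suffix test is
only a heuristic.

```python
def parse_dm_table(text):
    """Map device name -> list of (start, length, target, args)."""
    devices = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        name, rest = line.split(":", 1)
        fields = rest.split()
        # Each table line is: <start sector> <length> <target> [args...]
        if len(fields) < 3 or not (fields[0].isdigit() and fields[1].isdigit()):
            continue
        segment = (int(fields[0]), int(fields[1]), fields[2], fields[3:])
        devices.setdefault(name.strip(), []).append(segment)
    return devices


def snapshot_roles(devices):
    """Return {device name: role} for devices involved in a snapshot."""
    roles = {}
    for name, segments in devices.items():
        targets = {segment[2] for segment in segments}
        if "snapshot-origin" in targets:
            roles[name] = "origin"
        elif "snapshot" in targets:
            roles[name] = "snapshot"
        elif name.endswith("-real"):
            roles[name] = "origin data (hidden)"
        elif name.endswith("-cow"):
            roles[name] = "COW store (hidden)"
    return roles


# Hypothetical table, loosely modelled on the volumes in this thread.
SAMPLE_TABLE = """\
VGsystem-LVsystem: 0 65536000 snapshot-origin 253:2
VGsystem-LVsystem-real: 0 65536000 linear 8:2 384
VGsystem-SSsystem: 0 65536000 snapshot 253:2 253:3 P 16
VGsystem-SSsystem-cow: 0 5570560 linear 8:2 65536384
VGdb-LVdb: 0 30736384 linear 8:3 384
"""

for name, role in snapshot_roles(parse_dm_table(SAMPLE_TABLE)).items():
    print(f"{name}: {role}")
```

The sketch changes nothing on the system; it is only meant to make the
device layout easier to read before any manual dmsetup work is planned.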

Alasdair
-- 
agk@redhat.com


* Re: [linux-lvm] Cannot get rid of root filesystem LVM snapshot
  2006-09-09 23:17 ` Alasdair G Kergon
@ 2006-09-09 23:53   ` Martin Dvořák
  2006-09-10 15:41     ` Alasdair G Kergon
  0 siblings, 1 reply; 7+ messages in thread
From: Martin Dvořák @ 2006-09-09 23:53 UTC
  To: LVM general discussion and development

Thanks for the info. I suggest stressing somewhere in the documentation,
HOWTOs etc. that snapshotting of the root filesystem is not yet
supported. Many people might not be as lucky as me and find out only
after they are left with a frozen or even corrupted system.

Martin



* Re: [linux-lvm] Cannot get rid of root filesystem LVM snapshot
  2006-09-09 23:53   ` Martin Dvořák
@ 2006-09-10 15:41     ` Alasdair G Kergon
  2006-09-10 17:35       ` Nix
  0 siblings, 1 reply; 7+ messages in thread
From: Alasdair G Kergon @ 2006-09-10 15:41 UTC
  To: LVM general discussion and development

On Sun, Sep 10, 2006 at 01:53:34AM +0200, Martin Dvořák wrote:
> Thanks for the info. I suggest stressing somewhere in the documentation,
> HOWTOs etc. that snapshotting of the root filesystem is not yet
> supported. Many people might not be as lucky as me and find out only
> after they are left with a frozen or even corrupted system.
 
Upstream does not support using lvm2 for the root filesystem because, as you
discovered, the tools do not have complete support for it yet.

However people tried it and found it worked for them and so distributions
started shipping this as an option despite incomplete support for the
configuration in the upstream tools.  And so far nobody's considered it a
high enough priority to get that support completed:-(

Alasdair
-- 
agk@redhat.com


* Re: [linux-lvm] Cannot get rid of root filesystem LVM snapshot
  2006-09-10 15:41     ` Alasdair G Kergon
@ 2006-09-10 17:35       ` Nix
  2006-09-10 21:30         ` Alasdair G Kergon
  0 siblings, 1 reply; 7+ messages in thread
From: Nix @ 2006-09-10 17:35 UTC
  To: LVM general discussion and development

On Sun, 10 Sep 2006, Alasdair G. Kergon gibbered uncontrollably:
> Upstream does not support using lvm2 for the root filesystem because, as you
> discovered, the tools do not have complete support for it yet.

This is also nowhere documented: I've been using LVM2 as a root filesystem
for years without incident, so of course I thought it was meant to work.

(So much for backing up the whole system from snapshots, too...)

I must admit I can't see what's so special about the root filesystem.
From the kernel's POV it's just another vfsmnt: you can even unmount it
if you have no open sessions and you have a rootfs still mounted under
it.

Is it because the VG cfg backups are stored there?

> However people tried it and found it worked for them and so distributions
> started shipping this as an option despite incomplete support for the
> configuration in the upstream tools.  And so far nobody's considered it a
> high enough priority to get that support completed:-(

The tools work for me: even doing things like pvmoving root filesystems in
active use works fine. I'd vaguely guess that only snapshot support is
broken?

-- 
`In typical emacs fashion, it is both absurdly ornate and
 still not really what one wanted.' --- jdev


* Re: [linux-lvm] Cannot get rid of root filesystem LVM snapshot
  2006-09-10 17:35       ` Nix
@ 2006-09-10 21:30         ` Alasdair G Kergon
  2006-09-13 19:02           ` Nix
  0 siblings, 1 reply; 7+ messages in thread
From: Alasdair G Kergon @ 2006-09-10 21:30 UTC
  To: LVM general discussion and development

On Sun, Sep 10, 2006 at 06:35:00PM +0100, Nix wrote:
> I must admit I can't see what's so special about the root filesystem.

The tools you are running may be stored on it and access files on it.
Some operations the tools do temporarily suspend I/O to the LVs involved.
If a tool suspends I/O to the root then does anything that causes an access
to it the system can lock up.  Caches/flushing/scheduling can make this
indeterminate.  Even the filesystem type matters (e.g. the timing of
journal transactions, such as lazy inode time updates).
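
Whether a device is currently suspended shows up in the State field
printed by dmsetup info. The Python sketch below is an editorial
illustration that scans such output for suspended devices; the function
name and the SAMPLE_INFO text are invented for the example. Note the
catch implied by the paragraph above: if the suspended device is the
root filesystem, running dmsetup (or Python) from that filesystem may
itself block, so a check like this is only of use for other devices or
from a ramdisk copy.

```python
def suspended_devices(dmsetup_info_text):
    """Return the names of devices whose State starts with SUSPENDED."""
    suspended = []
    name = None
    for line in dmsetup_info_text.splitlines():
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Name":
            name = value
        elif key == "State" and name is not None:
            # The state may carry a suffix such as "(READ-ONLY)".
            if value.startswith("SUSPENDED"):
                suspended.append(name)
    return suspended


# Hypothetical dmsetup info output for two devices.
SAMPLE_INFO = """\
Name:              VGsystem-LVsystem
State:             SUSPENDED
Tables present:    LIVE
Open count:        1
Event number:      0
Major, minor:      253, 0
Number of targets: 1

Name:              VGdb-LVdb
State:             ACTIVE
Tables present:    LIVE
Open count:        1
Event number:      0
Major, minor:      253, 1
Number of targets: 1
"""

print(suspended_devices(SAMPLE_INFO))
```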

> The tools work for me: even doing things like pvmoving root filesystems in
> active use works fine. I'd vaguely guess that only snapshot support is
> broken?
 
pvmove also may fail.

Workaround is to copy required components into ramdisk and run from
there.  I have never done a complete audit, but:
  lvm binary (& libraries) should not be on root filesystem
    [use lvm.static in ramdisk]
  /dev must not be part of root filesystem being changed [ramdisk copy]
  log/activation = 0 in lvm.conf (the default)
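
The first two items in the list above come down to asking whether a path
lives on the root filesystem. A minimal Python sketch of that check
follows, comparing the device number that stat() reports for a path with
the one reported for /. It is an editorial illustration, not an audit
tool: the function names are invented, and the paths in CANDIDATES are
only typical locations assumed for the example.

```python
import os


def on_same_filesystem(path, reference="/"):
    """True if path is on the same mounted filesystem as reference."""
    return os.stat(path).st_dev == os.stat(reference).st_dev


def report(paths, reference="/"):
    """Return {path: True, False, or None if the path does not exist}."""
    result = {}
    for path in paths:
        try:
            result[path] = on_same_filesystem(path, reference)
        except FileNotFoundError:
            result[path] = None
    return result


# Assumed typical locations; adjust for the system being checked.
CANDIDATES = ["/sbin/lvm", "/dev", "/etc/lvm/lvm.conf"]

for path, same in report(CANDIDATES).items():
    if same is None:
        print(f"{path}: not found")
    elif same:
        print(f"{path}: on the root filesystem")
    else:
        print(f"{path}: on a separate filesystem")
```

A result of "on the root filesystem" for the lvm binary or /dev
corresponds to the situation the list warns about. The check says
nothing about shared libraries, lock files or other files a command may
open while it runs.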

Depending on the particular command and configuration there may be more
restrictions.  (E.g. some commands may still require config files and lock
files not to be in the root filesystem.)  dmeventd is also unsafe as
currently implemented - one reason it is not enabled by default upstream.

Upstream support entails a full review and changes to deal with as many
problems as possible (regardless of the user's configuration and
compile-time options) and then detecting the remaining problems and issuing
warning messages or refusing to proceed as appropriate.

Alasdair
-- 
agk@redhat.com


* Re: [linux-lvm] Cannot get rid of root filesystem LVM snapshot
  2006-09-10 21:30         ` Alasdair G Kergon
@ 2006-09-13 19:02           ` Nix
  0 siblings, 0 replies; 7+ messages in thread
From: Nix @ 2006-09-13 19:02 UTC
  To: LVM general discussion and development

On Sun, 10 Sep 2006, Alasdair G. Kergon stipulated:
> On Sun, Sep 10, 2006 at 06:35:00PM +0100, Nix wrote:
>> I must admit I can't see what's so special about the root filesystem.

(Well, *that* was a stupid thing for me to say.)

> The tools you are running may be stored on it and access files on it.
> Some operations the tools do temporarily suspend I/O to the LVs involved.
> If a tool suspends I/O to the root then does anything that causes an access
> to it the system can lock up.

Ah. Isn't a possible solution to this to mlock() tools that may do that
into memory, so that they're not going to need to be paged in while
access is suspended?
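
For illustration only, here is what the call looks like from Python via
ctypes, using mlockall(), which locks the whole address space rather
than a single range as mlock() does. This is a sketch of the idea, not a
description of what the lvm2 tools do. A real tool would make the call
directly from C; the function names are invented, and the MCL_* values
are the Linux x86/x86_64 ones and are an assumption on other platforms.
The call commonly fails without sufficient privileges or a high enough
RLIMIT_MEMLOCK.

```python
import ctypes
import os

# Values from <sys/mman.h> on Linux x86/x86_64; other platforms differ.
MCL_CURRENT = 1
MCL_FUTURE = 2

# Handle on the C library already loaded into this process.
_libc = ctypes.CDLL(None, use_errno=True)


def lock_process_memory():
    """Try to pin all current and future pages of this process in RAM.

    Returns 0 on success, otherwise the errno set by mlockall().
    """
    if _libc.mlockall(MCL_CURRENT | MCL_FUTURE) == 0:
        return 0
    return ctypes.get_errno()


def unlock_process_memory():
    """Undo lock_process_memory(); returns 0 on success, else errno."""
    if _libc.munlockall() == 0:
        return 0
    return ctypes.get_errno()


err = lock_process_memory()
if err:
    print("mlockall failed:", os.strerror(err))
else:
    print("memory locked")
    unlock_process_memory()
```

Locking memory would only keep the tool's own pages from having to be
read back in while I/O is suspended; it would not by itself cover other
accesses a tool might make, such as opening config or lock files.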

>                                Caches/flushing/scheduling can make this
> indeterminate.  Even the filesystem type matters (e.g. the timing of
> journal transactions, such as lazy inode time updates).

Ick! A delicate area? :(

>> The tools work for me: even doing things like pvmoving root filesystems in
>> active use works fine. I'd vaguely guess that only snapshot support is
>> broken?
>  
> pvmove also may fail.

The only times I've ever had pvmove fail were when moving active swap
partitions: a related problem? They'd start moving, then disk I/O would
cease: lvs showed the swap LV in SUSPENDED state. (This was back in
2.6.14 and 2.6.15 days: it may be fixed by now.)

The attention to disaster recovery/move resumption in pvmove was a
godsend: reboot, `pvmove' and it ran to completion happily.

> Workaround is to copy required components into ramdisk and run from
> there.  I have never done a complete audit, but:
>   lvm binary (& libraries) should not be on root filesystem [use lvm.static
> in ramdisk] 

Yeah. My most recent root pvmove I did using the copy of lvm on my
initramfs (statically linked against uClibc). I guess that would avoid
all the problems (only, of course, the system wasn't exactly very far
up when I did that).

>   /dev must not be part of root filesystem being changed [ramdisk copy]

That's pretty much the standard by now.

>   log/activation = 0 in lvm.conf (the default)

Hm, the message above that says that low-memory situations can deadlock,
not that all situations can.

> Depending on the particular command and configuration there may be more
> restrictions.  (E.g. some commands may still require config files and lock
> files not to be in the root filesystem.)  dmeventd is also unsafe as
> currently implemented - one reason it is not enabled by default upstream.

It's also totally undocumented :( but then I'm not even sure what clvmd
is for (something to do with GFS or SANs? Why else would you want to
distribute a VG over multiple machines?) so I'm not really fit to comment
(or to document it).

> Upstream support entails a full review and changes to deal with as many
> problems as possible (regardless of the user's configuration and
> compile-time options) and then detecting the remaining problems and issuing
> warning messages or refusing to proceed as appropriate.

Oh yes, paranoia is good in this area.

But it would be nice to document the root-is-not-really-supported stuff
somewhere (with a caveat perhaps that it might work if operations on /'s
LV, PVs and VG are done with great care).

-- 
`In typical emacs fashion, it is both absurdly ornate and
 still not really what one wanted.' --- jdev

