From: "Stephen C. Tweedie" <sct@redhat.com>
To: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Arjan van de Ven <arjanv@redhat.com>,
Joe Thornber <thornber@fib011235813.fsnet.co.uk>,
lvm-devel@sistina.com, Jim McDonald <Jim@mcdee.net>,
Andreas Dilger <adilger@turbolabs.com>,
linux-lvm@sistina.com, linux-kernel@vger.kernel.org,
evms-devel@lists.sourceforge.net
Subject: [linux-lvm] Re: [Evms-devel] Re: [ANNOUNCE] LVM reimplementation ready for beta testing
Date: Fri Feb 1 08:59:02 2002
Message-ID: <20020201145855.G2149@redhat.com>
In-Reply-To: <E16WevQ-0005H0-00@the-village.bc.nu>; from alan@lxorguk.ukuu.org.uk on Fri, Feb 01, 2002 at 02:44:24PM +0000
Hi,
On Fri, Feb 01, 2002 at 02:44:24PM +0000, Alan Cox wrote:
> > But "flushes all pending io" is *far* from trivial. there's no current
> > kernel functionality for this, so you'll have to do "weird shit" that will
> > break easy and often.
> >
> > Also "suspending" is rather dangerous because it can deadlock the machine
> > (think about the VM needing to write back dirty data on this LV in order to
> > make memory available for your move)...
>
> I don't think you need to suspend I/O except momentarily. I don't use LVM and
> while I can't resize volumes I migrate them like this
LVM1 has some problems here. First, when it needs to flush IO as
part of its locking it does so with fsync_dev, which is not a valid
way of flushing certain types of IO. Second, its copy is done in user
space, so there is no cache coherence with the logical device contents
and there is enough VM pressure to give a good chance of deadlocking.
However, it _does_ do its locking at a finer granularity than the
whole disk (it locks an extent --- 4MB by default --- at a time), so
even with LVM1 it is possible to do the move on a live volume without
locking up all IO for the duration of the entire copy.
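
The extent-at-a-time locking described above can be sketched as a toy model. This is not LVM1 code: the `ExtentVolume` class, its method names, and the in-memory byte arrays standing in for the source and destination disks are all assumptions made for illustration, and each read or write is assumed to stay within one extent.

```python
import threading


class ExtentVolume:
    """Toy logical volume that is moved one extent at a time (hypothetical)."""

    def __init__(self, n_extents, extent_size):
        self.extent_size = extent_size
        self.src = bytearray(n_extents * extent_size)  # old physical location
        self.dst = bytearray(n_extents * extent_size)  # new physical location
        self.moved = [False] * n_extents               # which extents were remapped
        # One lock per extent, so copying extent N never blocks IO to extent M.
        self.locks = [threading.Lock() for _ in range(n_extents)]

    def _backing(self, idx):
        return self.dst if self.moved[idx] else self.src

    def write(self, offset, payload):
        idx = offset // self.extent_size
        with self.locks[idx]:  # waits only if this extent is being copied
            self._backing(idx)[offset:offset + len(payload)] = payload

    def read(self, offset, length):
        idx = offset // self.extent_size
        with self.locks[idx]:
            return bytes(self._backing(idx)[offset:offset + length])

    def move_extent(self, idx):
        start = idx * self.extent_size
        end = start + self.extent_size
        with self.locks[idx]:  # IO to this one extent is held during the copy
            self.dst[start:end] = self.src[start:end]
            self.moved[idx] = True  # later IO is routed to the new location
```

Calling `move_extent` for each index in turn moves the whole volume while IO to every other extent proceeds normally. The sketch leaves out the part the rest of this message is about: waiting for IO that was already submitted before the lock was taken.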
LVM2's device-mirror code is much closer to the raid1 mechanism in
design, so it doesn't even have to lock down an extent during the
copy.
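
As a loose illustration of the raid1-like idea (not LVM2's actual device-mirror code), the sketch below sends every write to both legs while a background copier walks the extents. The `MirroredMove` class and its methods are hypothetical, and the model is single-threaded: it deliberately ignores the synchronization a real implementation needs between the copier and in-flight writes.

```python
class MirroredMove:
    """Toy mirror-style move: writes hit both legs, a copier fills in the rest."""

    def __init__(self, src, extent_size):
        self.src = src                      # existing data (bytearray)
        self.dst = bytearray(len(src))      # new leg, initially empty
        self.extent_size = extent_size
        self.next_extent = 0                # next extent the copier will sync

    def write(self, offset, payload):
        # Every write goes to both legs, so regions already copied stay in sync
        # and regions not yet copied are simply overwritten again by the copier.
        end = offset + len(payload)
        self.src[offset:end] = payload
        self.dst[offset:end] = payload

    def copy_next_extent(self):
        """Copy one extent; return False once the whole volume is synced."""
        start = self.next_extent * self.extent_size
        if start >= len(self.src):
            return False
        end = min(start + self.extent_size, len(self.src))
        self.dst[start:end] = self.src[start:end]
        self.next_extent += 1
        return True
```

Because writes are duplicated rather than held, no extent has to be locked against new IO for the length of its copy; once `copy_next_extent` returns False the two legs are identical and the old one can be dropped.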
> the situation here seems analogous. You never need to suspend I/O to the
> volume until you actually kill it, by which time you can just skip the write
> to the dead volume.
Right. LVM1 doesn't actually suspend IO to the volume, just to an
extent. What it does volume-wide is to flush IO, which is different.
The problem is that when we come to copy a chunk of the volume,
however large that chunk is, we need to make sure both that no new IOs
arrive on it, AND that we have waited for all outstanding IOs against
that chunk. It's the latter part which is the problem. It is
expensive to keep track of all outstanding IOs on a per-stripe basis,
so when we place a lock on a stripe and come to wait for
already-submitted IOs to complete, it is much easier just to do that
flush volume-wide. It's not a complete lock on the whole volume, just
a temporary mutex to ensure that there are no IOs left outstanding on
the stripe we're locking.
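
The trade-off above (one cheap volume-wide count of outstanding IOs instead of per-stripe tracking) can be sketched as follows. This is one reading of the description, not LVM1 source: the `VolumeIOGate` class, its method names, and the choice to hold new submissions briefly while the flush drains are all assumptions.

```python
import threading


class VolumeIOGate:
    """Toy gate: per-extent lock for new IO, volume-wide drain for old IO."""

    def __init__(self):
        self._cond = threading.Condition()
        self._in_flight = 0      # single counter for the whole volume
        self._locked = set()     # extents currently being copied
        self._flushing = False   # brief volume-wide hold while draining

    def _admissible(self, extent):
        return not self._flushing and extent not in self._locked

    def can_submit(self, extent):
        with self._cond:
            return self._admissible(extent)

    def submit(self, extent):
        """Called before issuing an IO; blocks while the extent is locked."""
        with self._cond:
            self._cond.wait_for(lambda: self._admissible(extent))
            self._in_flight += 1

    def complete(self):
        """Called when an IO finishes."""
        with self._cond:
            self._in_flight -= 1
            self._cond.notify_all()

    def lock_extent(self, extent):
        with self._cond:
            self._cond.wait_for(lambda: not self._flushing)  # one flush at a time
            self._locked.add(extent)      # step 1: no new IO arrives on it
            self._flushing = True
            # Step 2: we do not know which extent each outstanding IO targets,
            # so wait for every outstanding IO on the volume to finish.
            self._cond.wait_for(lambda: self._in_flight == 0)
            self._flushing = False        # the rest of the volume resumes
            self._cond.notify_all()

    def unlock_extent(self, extent):
        with self._cond:
            self._locked.discard(extent)
            self._cond.notify_all()
```

After `lock_extent` returns, only the one extent stays blocked for the duration of the copy; the volume-wide pause lasts just long enough for already-submitted IOs to complete.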
Cheers,
Stephen