Could receive allow updating an existing subvolume?

linux-btrfs.vger.kernel.org archive mirror

* Could receive allow updating an existing subvolume?
@ 2016-11-08 22:48 Ian Kelling
  2016-11-08 23:00 ` Hugo Mills
  0 siblings, 1 reply; 5+ messages in thread
From: Ian Kelling @ 2016-11-08 22:48 UTC
  To: linux-btrfs

It seems to be an artificially imposed limitation which hurts its
usefulness. Let me know if this makes sense. If so, perhaps it
can be implemented eventually. It seems a bit obvious but I couldn't
find any existing discussion of it.

Say you have this situation:
a/1, a/2 (parent is a/1)
b/1 (received from a/1)
Currently, you can (abbreviated) "send -p a/1 a/2 | receive" to create
b/2 (received from a/2, parent is b/1).
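
For reference, the abbreviated pipeline above can be spelled out as
follows. This is only a sketch that builds the two argument lists and
runs nothing; the destination directory "b" is taken from the example
layout, and the helper name is made up for illustration.

```python
from typing import List, Tuple


def incremental_send_receive(
    parent: str, snapshot: str, dest_dir: str
) -> Tuple[List[str], List[str]]:
    """Build argv lists for: btrfs send -p PARENT SNAPSHOT | btrfs receive DEST_DIR.

    Nothing is executed here; the caller decides how to run the pipeline.
    """
    send_cmd = ["btrfs", "send", "-p", parent, snapshot]
    receive_cmd = ["btrfs", "receive", dest_dir]
    return send_cmd, receive_cmd


# Example layout from this message: a/1 is the parent, a/2 the new snapshot,
# and b is the directory that already holds b/1.
send_cmd, receive_cmd = incremental_send_receive("a/1", "a/2", "b")
print(" ".join(send_cmd), "|", " ".join(receive_cmd))
```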

b/2 must start out as a rw snapshot of b/1, so we start out with 2
subvolumes that are identical except for some metadata and rw status.
Why not have an option to update the existing subvol instead of
creating a new one?
For example, when receive starts, rename b/1 to b/2, take an ro snapshot
of b/2 named b/1, set b/2 to rw, update b/2's metadata, then update b/2
with the new incremental data just as before.
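
The sequence above could be written down roughly like this. It is a
sketch of the proposal only: receive has no such mode today, so the
last two steps have no command attached, and the helper name is made
up. The first three steps use existing commands (mv, btrfs subvolume
snapshot -r, btrfs property set), but nothing is executed.

```python
from typing import List, Optional, Tuple

# (description, argv) pairs; argv is None where no existing command applies.
Step = Tuple[str, Optional[List[str]]]


def proposed_update_plan(old: str, new: str) -> List[Step]:
    """List the steps of the proposed in-place update, without running them."""
    return [
        ("rename the existing subvolume", ["mv", old, new]),
        ("keep a read-only snapshot under the old name",
         ["btrfs", "subvolume", "snapshot", "-r", new, old]),
        ("make the renamed subvolume writable",
         ["btrfs", "property", "set", "-ts", new, "ro", "false"]),
        # The two steps below are hypothetical: they describe what receive
        # would have to do, not something it can do today.
        ("update the metadata of the renamed subvolume", None),
        ("apply the incremental stream to it in place", None),
    ]


plan = proposed_update_plan("b/1", "b/2")
for description, argv in plan:
    print(description + ":", " ".join(argv) if argv else "(no existing command)")
```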

The current situation severely limits one use case: a host holding
read-only data which is incrementally updated and which is read/served
by standard programs, not just backup restore tools. To keep any
persistent paths, you either need to remount using the new subvolume
(which generally means killing the programs reading from it), or use
paths that begin like /dir/b{1,2}, rename subvolumes, and then ask all
reading programs to reopen their files, because they will still hold
references into the old subvolume (again, this often means killing
programs).
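
The "references in the old subvolume" problem can be reproduced without
btrfs at all. In the sketch below, ordinary directories stand in for
subvolumes (an assumption made purely for illustration), and a POSIX
system is assumed: a file opened before the rename keeps pointing at
the old data until it is reopened.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    live = os.path.join(root, "b")  # the persistent path programs read from
    os.mkdir(live)
    with open(os.path.join(live, "data.txt"), "w") as f:
        f.write("old")

    # A long-running reader opens its file before the update happens.
    reader = open(os.path.join(live, "data.txt"))

    # Swap in the new tree under the same path, as the rename approach would.
    os.rename(live, os.path.join(root, "b.old"))
    os.mkdir(live)
    with open(os.path.join(live, "data.txt"), "w") as f:
        f.write("new")

    # The already-open handle still refers to the file in the old tree.
    seen_by_open_handle = reader.read()
    reader.close()

    # Only a reopen through the persistent path picks up the new data.
    with open(os.path.join(live, "data.txt")) as f:
        seen_after_reopen = f.read()

print(seen_by_open_handle, seen_after_reopen)
```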


* Re: Could receive allow updating an existing subvolume?
  2016-11-08 22:48 Could receive allow updating an existing subvolume? Ian Kelling
@ 2016-11-08 23:00 ` Hugo Mills
  2016-11-08 23:15   ` Ian Kelling
  0 siblings, 1 reply; 5+ messages in thread
From: Hugo Mills @ 2016-11-08 23:00 UTC
  To: Ian Kelling; +Cc: linux-btrfs


On Tue, Nov 08, 2016 at 02:48:56PM -0800, Ian Kelling wrote:
> It seems to be an artificially imposed limitation which hurts its
> usefulness. Let me know if this makes sense. If so, perhaps it
> can be implemented eventually. It seems a bit obvious but I couldn't
> find any existing discussion of it.

   It's not artificial -- it's ensuring safety of operation.

   If the sender sends an incremental stream, that assumes an *exact*
subvol state on the receiving side. If the subvol on the receiving
side is modified, then the receive can fail.
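
   A toy model may make the "exact state" requirement concrete. This is
not how the send stream is actually encoded: a plain dict stands in for
a subvolume and the operation names are invented for illustration. It
only shows that a delta computed against one base applies cleanly to an
identical base and can fail on a modified one.

```python
from typing import Dict, List, Tuple

Tree = Dict[str, str]          # path -> file content
Op = Tuple[str, str, str]      # (kind, path, payload)


def make_incremental(parent: Tree, child: Tree) -> List[Op]:
    """Describe child purely as changes relative to parent."""
    ops: List[Op] = []
    for path in sorted(parent.keys() - child.keys()):
        ops.append(("unlink", path, ""))
    for path in sorted(child):
        if path not in parent:
            ops.append(("create", path, child[path]))
        elif parent[path] != child[path]:
            ops.append(("write", path, child[path]))
    return ops


def apply_incremental(base: Tree, ops: List[Op]) -> Tree:
    """Apply the changes to a copy of base; fail if base is not as expected."""
    result = dict(base)
    for kind, path, payload in ops:
        if kind == "create":
            if path in result:
                raise FileExistsError(path)
            result[path] = payload
        elif kind == "unlink":
            if path not in result:
                raise FileNotFoundError(path)
            del result[path]
        elif kind == "write":
            if path not in result:
                raise FileNotFoundError(path)
            result[path] = payload
    return result


a1 = {"index.html": "v1", "old.txt": "x"}
a2 = {"index.html": "v2", "new.txt": "y"}
stream = make_incremental(a1, a2)

# An untouched copy of a/1 on the receiving side: the stream applies cleanly.
b1 = dict(a1)
b2 = apply_incremental(b1, stream)

# The same base after someone deleted old.txt: the stream no longer fits.
modified_b1 = {"index.html": "v1"}
try:
    apply_incremental(modified_b1, stream)
    failed = False
except FileNotFoundError:
    failed = True

print(b2 == a2, failed)
```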

   So, the assumption is that the reference subvol on the receiving
side (equivalent to the -p subvol on the sending side) hasn't been
changed since it was received. The same assumption applies to the -p
subvol on the sending side.

   Now, receive is a fully userspace tool, so it would have to set the
subvol to RW, then update it, then set it to RO. The subvol risks
being modified by other processes during that window -- *particularly*
if it's actively being read by those other processes.

   Note that this is still an issue with the current situation, but
the expectation is that nothing's going to be actively reading that
location at the time the receive is running. But, if something does go
wrong with the receive, it's possible to abort and restart the
process. If you're modifying an existing subvol, there's no
recoverability if something goes wrong halfway through.

   Hugo.

> Say you have this situation:
> a/1, a/2 (parent is a/1)
> b/1 (received from a/1)
> Currently, you can (abbreviated) "send -p a/1 a/2 | receive" to create
> b/2 (received from a/2, parent is b/1).
> 
> b/2 must start out as a rw snapshot from b/1, so we start out with 2
> identical subvolumes except for some metadata and rw status, why not
> have an option to update the existing subvol instead of the new one?
> For example, when receive starts, rename b/1 to b/2, take an ro snapshot
> of b/2 named b/1, set b/2 to rw, update b/2's metadata, then update b/2
> with the new incremental data just as before.
> 
> The current situation severely limits the use case of having a host
> which has read-only data which is incrementally updated, which is
> read/served by standard programs, not just backup restore tools, because
> to have any persistent paths, you need to remount using the new
> subvolume (generally means killing programs reading from it), or using
> paths that begin like /dir/b{1,2} and then renaming subvolumes, and then
> requesting that all reading programs reopen their files because they
> will still have references in the old subvolume (again, often means
> killing programs).

-- 
Hugo Mills             | I believe that it's closely correlated with the
hugo@... carfax.org.uk | aeroswine coefficient
http://carfax.org.uk/  |
PGP: E2AB1DE4          |                                       Adrian Bridgett



* Re: Could receive allow updating an existing subvolume?
  2016-11-08 23:00 ` Hugo Mills
@ 2016-11-08 23:15   ` Ian Kelling
  2016-11-08 23:17     ` Ian Kelling
  2016-11-09 12:26     ` Austin S. Hemmelgarn
  0 siblings, 2 replies; 5+ messages in thread
From: Ian Kelling @ 2016-11-08 23:15 UTC
  To: Hugo Mills; +Cc: linux-btrfs

On Tue, Nov 8, 2016, at 03:00 PM, Hugo Mills wrote:
> On Tue, Nov 08, 2016 at 02:48:56PM -0800, Ian Kelling wrote:
> > It seems to be an artificially imposed limitation which hurts its
> > usefulness. Let me know if this makes sense. If so, perhaps it
> > can be implemented eventually. It seems a bit obvious but I couldn't
> > find any existing discussion of it.
> 
>    It's not artificial -- it's ensuring safety of operation.

No, it doesn't ensure the subvolume is not modified, so it IS
artificial. I can still set the subvolume to rw before or probably
during the send and modify a file and mess things up.

> 
>    If the sender sends an incremental stream, that assumes an *exact*
> subvol state on the receiving side. If the subvol on the receiving
> side is modified, then the receive can fail.

No. The reading program never needs to have access to rw files if it's
reading from a read-only mountpoint while the subvolume is rw and
mounted as such elsewhere. And a reading program does not magically risk
writes.

> 
>    So, the assumption is that the reference subvol on the receiving
> side (equivalent to the -p subvol on the sending side) hasn't been
> changed since it was received. The same assumption applies to the -p
> subvol on the sending side.
> 
>    Now, receive is a fully userspace tool, so it would have to set the
> subvol to RW, then update it, then set it to RO. The subvol risks
> being modified by other processes during that window -- *particularly*
> if it's actively being read by those other processes.

No. The reading program never needs to have access to rw files if it's
reading from a read-only mountpoint while the subvolume is rw and
mounted as such elsewhere. And a reading program does not magically risk
writes.

> 
>    Note that this is still an issue with the current situation, but
> the expectation is that nothing's going to be actively reading that
> location at the time the receive is running. But, if something does go
> wrong with the receive, it's possible to abort and restart the
> process. If you're modifying an existing subvol, there's no
> recoverability if something goes wrong halfway through.

No. You could recover using the snapshot that I mentioned.

>    Hugo.

So my question still stands.


* Re: Could receive allow updating an existing subvolume?
  2016-11-08 23:15   ` Ian Kelling
@ 2016-11-08 23:17     ` Ian Kelling
  2016-11-09 12:26     ` Austin S. Hemmelgarn
  1 sibling, 0 replies; 5+ messages in thread
From: Ian Kelling @ 2016-11-08 23:17 UTC
  To: Hugo Mills; +Cc: linux-btrfs



On Tue, Nov 8, 2016, at 03:15 PM, Ian Kelling wrote:
> On Tue, Nov 8, 2016, at 03:00 PM, Hugo Mills wrote:
> > 
> >    If the sender sends an incremental stream, that assumes an *exact*
> > subvol state on the receiving side. If the subvol on the receiving
> > side is modified, then the receive can fail.
> 
> No. The reading program never needs to have access to rw files if it's
> reading from a read-only mountpoint while the subvolume is rw and
> mounted as such elsewhere. And a reading program does not magically risk
> writes.

Quoted text was accidentally duplicated, ignore it.


* Re: Could receive allow updating an existing subvolume?
  2016-11-08 23:15   ` Ian Kelling
  2016-11-08 23:17     ` Ian Kelling
@ 2016-11-09 12:26     ` Austin S. Hemmelgarn
  1 sibling, 0 replies; 5+ messages in thread
From: Austin S. Hemmelgarn @ 2016-11-09 12:26 UTC
  To: Ian Kelling, Hugo Mills; +Cc: linux-btrfs

On 2016-11-08 18:15, Ian Kelling wrote:
> On Tue, Nov 8, 2016, at 03:00 PM, Hugo Mills wrote:
>> On Tue, Nov 08, 2016 at 02:48:56PM -0800, Ian Kelling wrote:
>>> It seems to be an artificially imposed limitation which hurts its
>>> usefulness. Let me know if this makes sense. If so, perhaps it
>>> can be implemented eventually. It seems a bit obvious but I couldn't
>>> find any existing discussion of it.
>>
>>    It's not artificial -- it's ensuring safety of operation.
>
> No, it doesn't ensure the subvolume is not modified, so it IS
> artificial. I can still set the subvolume to rw before or probably
> during the send and modify a file and mess things up.
>
>>
>>    If the sender sends an incremental stream, that assumes an *exact*
>> subvol state on the receiving side. If the subvol on the receiving
>> side is modified, then the receive can fail.
>
> No. The reading program never needs to have access to rw files if it's
> reading from a read-only mountpoint while the subvolume is rw and
> mounted as such elsewhere. And a reading program does not magically risk
> writes.
That assumes that the reading program is bug free (and perfectly 
secured) running on 100% reliable hardware with no chance of any kind of 
failure, which is a pretty significant requirement that's functionally 
impossible to enforce.

There's also the fact that things which have files opened read-only 
generally expect that the file will not change under them, and that 
you'd need to restart most software anyway so that it would pick up on 
any renames and new or deleted paths.
>
>>
>>    So, the assumption is that the reference subvol on the receiving
>> side (equivalent to the -p subvol on the sending side) hasn't been
>> changed since it was received. The same assumption applies to the -p
>> subvol on the sending side.
>>
>>    Now, receive is a fully userspace tool, so it would have to set the
>> subvol to RW, then update it, then set it to RO. The subvol risks
>> being modified by other processes during that window -- *particularly*
>> if it's actively being read by those other processes.
>
> No. The reading program never needs to have access to rw files if it's
> reading from a read-only mountpoint while the subvolume is rw and
> mounted as such elsewhere. And a reading program does not magically risk
> writes.
>
>>
>>    Note that this is still an issue with the current situation, but
>> the expectation is that nothing's going to be actively reading that
>> location at the time the receive is running. But, if something does go
>> wrong with the receive, it's possible to abort and restart the
>> process. If you're modifying an existing subvol, there's no
>> recoverability if something goes wrong halfway through.
>
> No. You could recover using the snapshot that I mentioned.
>
>>    Hugo.
>
> So my question still stands.
Given the use case you're describing, it sounds like `rsync --inplace` 
plus snapshots is a better fit for what you want to do than 
send/receive.  It's worth pointing out though that this is _NOT_ a safe 
way to handle things if you're actually serving data based on the contents 
of those files, because any read while the file is being updated will 
likely return half-updated data.  The only case where I would ever 
consider doing something like this is on a system which is an 
active-backup for another system, because it's pretty much guaranteed to 
not be serving data if it's syncing it from the primary system.
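
As a rough sketch of that alternative, the two steps could look like
this. Only `--inplace` comes from the suggestion above; the `-a` and
`--delete` options, the host name and all paths are assumptions chosen
for illustration, and the helper only builds the commands without
running them.

```python
from typing import List


def rsync_then_snapshot(
    source: str, live_dir: str, snapshot_path: str
) -> List[List[str]]:
    """Build the commands for an in-place sync followed by a read-only snapshot."""
    return [
        # Update files in place so the paths under live_dir stay the same.
        ["rsync", "-a", "--inplace", "--delete",
         source.rstrip("/") + "/", live_dir],
        # Then keep a read-only snapshot of the synced state.
        ["btrfs", "subvolume", "snapshot", "-r", live_dir, snapshot_path],
    ]


# Hypothetical names: "primary" is the source host, /srv/b the live subvolume.
commands = rsync_then_snapshot(
    "primary:/srv/data", "/srv/b", "/srv/snapshots/b-2016-11-09"
)
for argv in commands:
    print(" ".join(argv))
```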


