Re: [PATCH 0/3] btrfs-safe implementation of -oloop

Util-Linux package development
 help / color / mirror / Atom feed

From: Stanislav Brabec <sbrabec@suse.cz>
To: Karel Zak <kzak@redhat.com>
Cc: util-linux@vger.kernel.org
Subject: Re: [PATCH 0/3] btrfs-safe implementation of -oloop
Date: Wed, 13 Apr 2016 14:52:07 +0200	[thread overview]
Message-ID: <570E40F7.3020902@suse.cz> (raw)
In-Reply-To: <20160413110857.ftvekbq2fnbb2fau@ws.net.home>

Karel Zak wrote:
> On Tue, Apr 12, 2016 at 08:21:59PM +0200, Stanislav Brabec wrote:

>> - If the same backing file is already used for a read-only loop device,
>>    there is no safe way to continue.
>
> Why? Just re-use the read-only loopdev. We don't have have to promise
> read-write device if the backend is read only. This is generic
> behavior used in another cases too. For example if you try to mount
> NFS exported by server in read-only mode than your "rw" is ignored and
> result is "ro", the same is with read-only media (CDROM, etc.).

There is a problem that mount itself initializes loop as R/O:

# ./mount -oro,loop /btrfs.img /mnt/1
# ./mount -oloop,subvol=/ /btrfs.img /mnt/2
mount: /btrfs.img is used as read only loop, mounting read-only

> IMHO all we need is a warning, "backing file already used as
> read-only, mounting read-only".

This is part of patch 3. See above. Feel free to improve the warning.

>>    - There is no way to turn R/O loop to R/W in kernel.
>>    - If another loop is initialized, changes will not propagate to R/O
>>      volume.
>>    - One would need to umount all R/O devices, initialize loop R/W, and
>>      then everything mount again.
>>    I can imagine partial solution: Introduce looprw option. Such option
>>    would cause to initialize loop device R/W even for R/O mount.
>
> Sounds like over-engineering :-)

I think that mounting / as R/W and /snapshots as R/O could be a legal 
use. If mount mounts /snapshots first, there is no way to mount / R/W.

>> - If the same backing file is already used for a loop device with
>>    correct offset, but incorrect sizelimit, there is no solution. The
>>    current implementation does not check for it.
>
> IMHO the current "check start offset only" is a poor solution if we want
> to re-use loopdevs. It would be better to think about offset+limit
> as about partition and about backing-file as about whole-disk.

I am aware that it is unsafe. But it was easy, and will work in most 
"normal" cases. Nobody has a reason to initialize loop device that 
contains only half of the partition.

> It means check if the area specified by offset+limit does not overlap
> any existing backing-file mapping. It should be possible to re-use
> loopdev (by mount(8)) only if offset and size matches.
 >
> It should be also forbidden to create a new (non re-used) loopdev if
> area specified by offset and sizelimit overlap any existing backing
> file mapping. This is important!
>
> Maybe we can also improve losetup to warn about overlapping loop
> devices.

This sounds as a 100% safe solution.

>> - There exists a change for a race condition between device lookup and
>>    mount syscall.
>
> Sure.

I was thinking about LO_FLAGS_OPEN_FD flag for 
loopcxt_find_by_backing_file(). It will use open() as soon as possible, 
and if it fails, it repeats the iteration.
loopctxt_close_fd() would then close it.

But it would need more work, as mount does not have access to loopctxt.

>> - The implementation does not check for crypto. I think it is not a big
>>    problem, as it makes no sense to initialize the same backing file as
>>    encrypted and non-encrypted at once.
>
> There is nothing like loop devices encryption :-)

Good to know. It is still a part of kernel structures.

-- 
Best Regards / S pozdravem,

Stanislav Brabec
software developer
---------------------------------------------------------------------
SUSE LINUX, s. r. o.                         e-mail: sbrabec@suse.com
Lihovarská 1060/12                            tel: +49 911 7405384547
190 00 Praha 9                                 fax:  +420 284 084 001
Czech Republic                                    http://www.suse.cz/
PGP: 830B 40D5 9E05 35D8 5E27 6FA3 717C 209F A04F CD76

next prev parent reply	other threads:[~2016-04-13 12:52 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-04-12 18:21 [PATCH 0/3] btrfs-safe implementation of -oloop Stanislav Brabec
2016-04-13 11:08 ` Karel Zak
2016-04-13 11:29   ` Ruediger Meier
2016-04-13 11:41     ` Karel Zak
2016-04-13 12:52   ` Stanislav Brabec [this message]
2016-04-22 11:22 ` Karel Zak
2016-04-22 12:42   ` Stanislav Brabec

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=570E40F7.3020902@suse.cz \
    --to=sbrabec@suse.cz \
    --cc=kzak@redhat.com \
    --cc=util-linux@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox