From: Fam Zheng <famz@redhat.com>
To: "Daniel P. Berrange" <berrange@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>,
	qemu-block@nongnu.org, rjones@redhat.com,
	John Snow <jsnow@redhat.com>, Jeff Cody <jcody@redhat.com>,
	Markus Armbruster <armbru@redhat.com>,
	qemu-devel@nongnu.org, stefanha@redhat.com, den@openvz.org,
	pbonzini@redhat.com, Max Reitz <mreitz@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v4 08/27] osdep: Add qemu_lock_fd and qemu_unlock_fd
Date: Wed, 11 May 2016 08:48:18 +0800
Message-ID: <20160511004818.GA14074@ad.usersys.redhat.com>
In-Reply-To: <20160510085748.GD13377@redhat.com>

On Tue, 05/10 09:57, Daniel P. Berrange wrote:
> On Tue, May 10, 2016 at 10:50:40AM +0800, Fam Zheng wrote:
> > They are wrappers of POSIX fcntl file locking, with the additional
> > interception of open/close (through qemu_open and qemu_close) to offer
> > better semantics that preserve the locks across multiple life cycles of
> > different fds on the same file.  The reason for making this semantic
> > change over the fcntl locks is to make the API cleaner for QEMU
> > subsystems.
> > 
> > More precisely, we remove this "feature" of fcntl lock:
> > 
> >     If a process closes any file descriptor referring to a file, then
> >     all of the process's locks on that file are released, regardless of
> >     the file descriptor(s) on which the locks were obtained.
> > 
> > as long as the fd is always open/closed through qemu_open and
> > qemu_close.
> 
> You're not really removing that problem - this is just hacking
> around it in a manner which is racy & susceptible to silent failure.
> 
> 
> > +static int qemu_fd_close(int fd)
> > +{
> > +    LockEntry *ent, *tmp;
> > +    LockRecord *rec;
> > +    char *path;
> > +    int ret;
> > +
> > +    assert(fd_to_path);
> > +    path = g_hash_table_lookup(fd_to_path, GINT_TO_POINTER(fd));
> > +    assert(path);
> > +    g_hash_table_remove(fd_to_path, GINT_TO_POINTER(fd));
> > +    rec = g_hash_table_lookup(lock_map, path);
> > +    ret = close(fd);
> > +
> > +    if (rec) {
> > +        QLIST_FOREACH_SAFE(ent, &rec->records, next, tmp) {
> > +            int r;
> > +            if (ent->fd == fd) {
> > +                QLIST_REMOVE(ent, next);
> > +                g_free(ent);
> > +                continue;
> > +            }
> > +            r = qemu_lock_do(ent->fd, ent->start, ent->len,
> > +                             ent->readonly ? F_RDLCK : F_WRLCK);
> > +            if (r == -1) {
> > +                fprintf(stderr, "Failed to acquire lock on fd %d: %s\n",
> > +                        ent->fd, strerror(errno));
> > +            }
> 
> If another app has acquired the lock between the close and this attempt
> to re-acquire the lock, then QEMU is silently carrying on without any
> lock being held. For something that's intending to provide protection
> against concurrent use I think this is not an acceptable failure
> scenario.
> 
> 
> > +int qemu_open(const char *name, int flags, ...)
> > +{
> > +    int mode = 0;
> > +    int ret;
> > +
> > +    if (flags & O_CREAT) {
> > +        va_list ap;
> > +
> > +        va_start(ap, flags);
> > +        mode = va_arg(ap, int);
> > +        va_end(ap);
> > +    }
> > +    ret = qemu_fd_open(name, flags, mode);
> > +    if (ret >= 0) {
> > +        qemu_fd_add_record(ret, name);
> > +    }
> > +    return ret;
> > +}
> 
> I think the approach you have here is fundamentally not usable with
> fcntl locks, because it is still using the pattern
> 
>    fd = open(path)
>    lock(fd)
>    if failed lock
>       close(fd)
>    ...do stuff.
> 
> 
> As mentioned in previous posting I believe the block layer must be
> checking whether it already has a lock held against "path" *before*
> even attempting to open the file. Once QEMU has the file descriptor
> open it is too late, because closing the FD will release pre-existing
> locks QEMU has.
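
For reference, the fcntl behaviour under discussion can be reproduced with a
small stand-alone program. This is only a sketch: the helper names
(locked_by_other_process, close_drops_lock) are made up for illustration and
are not part of the patch. The lock is probed from a forked child because
fcntl locks are per-process, so a process never conflicts with its own locks.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: ask a child process whether a whole-file write lock
 * would conflict with a lock held by another process.
 * Returns 1 if locked, 0 if free, -1 on error. */
static int locked_by_other_process(const char *path)
{
    int status;
    pid_t pid = fork();

    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        struct flock fl;
        int fd = open(path, O_RDWR);

        if (fd < 0) {
            _exit(2);
        }
        memset(&fl, 0, sizeof(fl));
        fl.l_type = F_WRLCK;
        fl.l_whence = SEEK_SET; /* l_start = 0, l_len = 0: whole file */
        if (fcntl(fd, F_GETLK, &fl) < 0) {
            _exit(2);
        }
        _exit(fl.l_type == F_UNLCK ? 0 : 1);
    }
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status) ||
        WEXITSTATUS(status) > 1) {
        return -1;
    }
    return WEXITSTATUS(status);
}

/* Hypothetical helper: returns 1 if closing a second, never-locked fd
 * dropped the lock taken through the first fd, 0 if the lock survived,
 * -1 on error. */
static int close_drops_lock(const char *path)
{
    struct flock fl;
    int before, after;
    int fd_locked = open(path, O_RDWR);
    int fd_other = open(path, O_RDONLY);

    if (fd_locked < 0 || fd_other < 0) {
        if (fd_locked >= 0) {
            close(fd_locked);
        }
        if (fd_other >= 0) {
            close(fd_other);
        }
        return -1;
    }
    memset(&fl, 0, sizeof(fl));
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    if (fcntl(fd_locked, F_SETLK, &fl) < 0) {
        close(fd_other);
        close(fd_locked);
        return -1;
    }
    before = locked_by_other_process(path);
    close(fd_other); /* this also releases the lock taken via fd_locked */
    after = locked_by_other_process(path);
    close(fd_locked);
    if (before < 0 || after < 0) {
        return -1;
    }
    return before == 1 && after == 0;
}
```

Calling close_drops_lock() on an existing, writable file should return 1 on a
system with POSIX record locks: the lock is visible to the child before
close(fd_other) and gone afterwards, although fd_locked is still open.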

There will still be valid close() calls that temporarily release locks we
need.  You are right that the "close + re-acquire" sequence in
qemu_fd_close() is racy.  Any suggestions for how this could be fixed?
Something like introducing a "close2" syscall that doesn't drop locks on
other fds?
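
As a point of comparison, Linux 3.15 and later provide open file description
(OFD) locks through F_OFD_SETLK, which belong to the open file description
rather than to the process, so closing a different fd on the same file does
not release them. The sketch below assumes a Linux host with a glibc that
exposes F_OFD_*; it is not something this series uses, and the function name
ofd_lock_survives_close is hypothetical.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* glibc exposes F_OFD_* only with this defined */
#endif
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if an OFD write lock taken via one fd is
 * still held after an unrelated fd on the same file is closed, 0 if it was
 * lost, -1 on error or when OFD locks are not available at build time. */
static int ofd_lock_survives_close(const char *path)
{
#ifdef F_OFD_SETLK
    struct flock fl;
    int ret = -1;
    int fd_probe = -1;
    int fd_locked = open(path, O_RDWR);
    int fd_other = open(path, O_RDONLY);

    if (fd_locked < 0 || fd_other < 0) {
        goto out;
    }
    memset(&fl, 0, sizeof(fl)); /* l_pid must be 0 for OFD commands */
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;     /* l_start = 0, l_len = 0: whole file */
    if (fcntl(fd_locked, F_OFD_SETLK, &fl) < 0) {
        goto out;
    }

    /* With a classic F_SETLK lock this close would release the lock. */
    close(fd_other);
    fd_other = -1;

    /* A separate open file description conflicts with the OFD lock even
     * inside the same process, so the probe needs no fork(). */
    fd_probe = open(path, O_RDWR);
    if (fd_probe < 0) {
        goto out;
    }
    memset(&fl, 0, sizeof(fl));
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    if (fcntl(fd_probe, F_OFD_GETLK, &fl) < 0) {
        goto out;
    }
    ret = (fl.l_type != F_UNLCK);

out:
    if (fd_probe >= 0) {
        close(fd_probe);
    }
    if (fd_other >= 0) {
        close(fd_other);
    }
    if (fd_locked >= 0) {
        close(fd_locked);
    }
    return ret;
#else
    (void)path;
    return -1;
#endif
}
```

On a kernel with OFD lock support, ofd_lock_survives_close() on an existing,
writable file should return 1. Note that closing fd_probe at the end is itself
a close of another fd on the same file, which is the case that breaks classic
fcntl locks.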

Fam