linux-api.vger.kernel.org archive mirror
From: "J. Bruce Fields" <bfields-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
To: Pavel Emelyanov <xemul-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
Cc: Jeff Layton <jlayton-vpEMnDpepFuMZCB2o+C8xQ@public.gmane.org>,
	Alexander Viro
	<viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn@public.gmane.org>,
	linux-fsdevel
	<linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	Linux Kernel Mailing List
	<linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: [PATCH] locks: Ability to test for flock presence on fd
Date: Wed, 3 Sep 2014 11:44:34 -0400
Message-ID: <20140903154434.GC22731@fieldses.org>
In-Reply-To: <540727E0.6030005-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>

On Wed, Sep 03, 2014 at 06:38:24PM +0400, Pavel Emelyanov wrote:
> On 09/02/2014 11:53 PM, Jeff Layton wrote:
> > On Tue, 2 Sep 2014 15:43:00 -0400
> > "J. Bruce Fields" <bfields-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> wrote:
> > 
> >> On Tue, Sep 02, 2014 at 11:07:14PM +0400, Pavel Emelyanov wrote:
> >>> On 09/02/2014 10:44 PM, J. Bruce Fields wrote:
> >>>> On Tue, Sep 02, 2014 at 09:17:34PM +0400, Pavel Emelyanov wrote:
> >>>>> Hi,
> >>>>>
> >>>>> There's a problem with getting information about who has a flock on
> >>>>> a specific file. The "owner" field shown in /proc/locks is the pid
> >>>>> of the task that created the flock, not the one that _may_ hold it.
> >>>>>
> >>>>> If the flock creator shared the file with some other task (by forking
> >>>>> or via scm_rights) and then died or closed the file, the information
> >>>>> shown in proc no longer corresponds to reality.
> >>>>>
> >>>>> This is critical for the CRIU project, which tries to dump (and
> >>>>> restore) the state of running tasks. For example, take two tasks A
> >>>>> and B that have both opened a file "/foo". One of the tasks places a
> >>>>> LOCK_SH lock on the file and then "obfuscates" the owner field in
> >>>>> /proc/locks. After this we have no way to find out who the lock
> >>>>> holder is.
> >>>>>
> >>>>> I'd like to note that for LOCK_EX this problem is not critical: we
> >>>>> may go to both tasks and "ask" them to LOCK_EX the file again (we can
> >>>>> do it in CRIU, I can tell more if required). The one who succeeds is
> >>>>> the lock holder.
> >>>>
> >>>> It could be both, actually, right?
> >>>
> >>> Two succeeding with LOCK_EX? AFAIU no. Am I missing something?
> >>
> >> After a fork, two processes "own" the lock, right?:
> >>
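> >> 	#include <err.h>
> >> 	#include <fcntl.h>
> >> 	#include <stdio.h>
> >> 	#include <sys/file.h>
> >> 	#include <unistd.h>
> >>
> >> 	int main(int argc, char *argv[])
> >> 	{
> >> 		int fd, ret;
> >>
> >> 		fd = open(argv[1], O_RDWR);
> >> 		if (fd == -1)
> >> 			err(1, "open");
> >> 		/* Take the lock before forking so the child inherits it. */
> >> 		ret = flock(fd, LOCK_EX);
> >> 		if (ret)
> >> 			err(1, "flock");
> >> 		ret = fork();
> >> 		if (ret == -1)
> >> 			err(1, "fork");
> >> 		/* Parent and child both re-request the lock on the shared fd. */
> >> 		ret = flock(fd, LOCK_EX);
> >> 		if (ret)
> >> 			err(1, "flock");
> >> 		printf("%d got exclusive lock\n", getpid());
> >> 		sleep(1000);
> >> 		return 0;
> >> 	}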
> >>
> >> 	$ touch TMP
> >> 	$ ./test TMP
> >> 	15882 got exclusive lock
> >> 	15883 got exclusive lock
> >> 	^C
> >>
> >> I may misunderstand what you're doing.
> >>
> > 
> > Yeah, I don't understand either.
> > 
> > Flock locks are owned by the file description. The task that set
> > them is really irrelevant once they are set.
> > 
> > In the second flock() call there, you're just "modifying" an existing
> > lock (which turns out to be a noop here).
> > 
> > So, I don't quite understand the problem this solves. I get that you're
> > trying to reestablish the flock "state" after a checkpoint/restore
> > event, but why does it matter what task actually sets the locks as long
> > as they're set on the correct set of fds?
> 
> Sorry for confusion. Let me try to explain it more clearly.
> 
> First, what I meant when talking about two LOCK_EX locks. Let's
> consider this scenario:
> 
> pid = fork()
> fd = open("/foo"); /* both parent and child have _different_ files */
> if (pid == 0)
> 	/* child only */
> 	flock(fd, LOCK_EX);
> 
> At this point we have two different files pointing to "/foo", and
> only one of them has LOCK_EX on it. So if we try to LOCK_EX again,
> this would succeed only on the child's file. So we can distinguish
> which file is locked using this method.
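
The LOCK_EX probing trick described above can be sketched in a single
process, since flock locks belong to the open file description rather
than to the task. This is only an illustration under assumptions: the
names probe_lock_ex() and demo_probe() and the scratch file under /tmp
are hypothetical, and LOCK_NB is used so that the probe fails instead
of blocking.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/file.h>
#include <unistd.h>

/*
 * Hypothetical helper: ask for LOCK_EX without blocking.
 * Returns 1 if the request was granted, 0 if it would block because of
 * a conflicting lock, and -1 on any other error.
 */
int probe_lock_ex(int fd)
{
	if (flock(fd, LOCK_EX | LOCK_NB) == 0)
		return 1;
	return errno == EWOULDBLOCK ? 0 : -1;
}

/*
 * Hypothetical demo: open one scratch file twice (two file descriptions),
 * lock only one of them, then probe both. Returns 0 on success.
 */
int demo_probe(int *other, int *holder)
{
	char path[] = "/tmp/flock-probe-XXXXXX";
	int fd_holder, fd_other;

	assert(other != NULL && holder != NULL);

	fd_holder = mkstemp(path);
	if (fd_holder < 0)
		return -1;
	fd_other = open(path, O_RDWR);	/* a second, separate description */
	unlink(path);
	if (fd_other < 0) {
		close(fd_holder);
		return -1;
	}
	if (flock(fd_holder, LOCK_EX) != 0) {
		close(fd_other);
		close(fd_holder);
		return -1;
	}

	*other = probe_lock_ex(fd_other);	/* conflicts with the holder */
	*holder = probe_lock_ex(fd_holder);	/* re-request on the same description */

	close(fd_other);
	close(fd_holder);
	return 0;
}
```

Here the probe is refused on the description that does not hold the
lock and granted on the one that does. If nobody held the lock at all,
the probe itself would take it, so a real tool would have to release
it again afterwards.
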
> 
> 
> 
> Now, the problem this patch is trying to solve. It's quite tricky,
> but still. Let's imagine this scenario:
> 
> pid = fork();
> fd = open("/foo"); /* yet again -- two different files */
> if (pid == 0) {
> 	flock(fd, LOCK_SH);
> 	pid2 = fork();
> 	if (pid2 != 0)
> 		exit(0);
> }
> 
> at this point we have:
> 
> task A -- the original task with file "/foo" opened
> task B -- the first child, that exited at the end
> task C -- the 2nd child, that "inherited" a file with the lock from B
> 
> Note that the file at A and the file at C are two different files
> (struct file-s), and only C's is locked.
> 
> The problem is that /proc/locks shows the pid of B in this lock's
> owner field, and we have no clue as to who the real lock owner is
> from /proc/locks alone.
> 
> If we try to do the trickery like the one we did with LOCK_EX above,
> this is what we would get.
> 
> If we put a 2nd LOCK_SH from A and from C, both attempts would succeed,
> so this is not the solution.
> 
> If we try to LOCK_EX from A and C, only C would succeed, so this seems
> to be the solution, but it's actually not. If there's another pair of
> tasks A' and C' holding the same "/foo" with a LOCK_SH on C', this
> trick would stop working, as none of the tasks would be able to put
> such a lock on this file.
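
The shared-lock case above can be sketched the same way. In this
illustration fd_a stands for A's unlocked file, and fd_c and fd_c2
stand for the locked files of C and C'. The names probe() and
demo_shared_probe() and the scratch file under /tmp are hypothetical;
this is a sketch of the scenario, not of the patch under discussion.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/file.h>
#include <unistd.h>

/* Non-blocking probe: 1 = granted, 0 = would block, -1 = other error. */
int probe(int fd, int op)
{
	if (flock(fd, op | LOCK_NB) == 0)
		return 1;
	return errno == EWOULDBLOCK ? 0 : -1;
}

/*
 * Hypothetical demo. fd_a holds no lock; fd_c and fd_c2 each hold LOCK_SH.
 * res[0], res[1]: LOCK_SH probe of A and of C.
 * res[2], res[3]: LOCK_EX probe of A and of C.
 * Returns 0 on success.
 */
int demo_shared_probe(int res[4])
{
	char path[] = "/tmp/flock-shared-XXXXXX";
	int fd_a, fd_c, fd_c2, ret = -1;

	assert(res != NULL);

	fd_c = mkstemp(path);
	if (fd_c < 0)
		return -1;
	fd_a = open(path, O_RDWR);	/* three separate file descriptions */
	fd_c2 = open(path, O_RDWR);
	unlink(path);
	if (fd_a < 0 || fd_c2 < 0)
		goto out;
	if (flock(fd_c, LOCK_SH) != 0 || flock(fd_c2, LOCK_SH) != 0)
		goto out;

	res[0] = probe(fd_a, LOCK_SH);
	if (res[0] == 1)
		flock(fd_a, LOCK_UN);	/* the probe took a real lock: undo it */
	res[1] = probe(fd_c, LOCK_SH);

	res[2] = probe(fd_a, LOCK_EX);	/* blocked by C and C' */
	res[3] = probe(fd_c, LOCK_EX);	/* blocked by C' */
	ret = 0;
out:
	if (fd_a >= 0)
		close(fd_a);
	if (fd_c2 >= 0)
		close(fd_c2);
	close(fd_c);
	return ret;
}
```

Both LOCK_SH probes are granted and both LOCK_EX probes are refused,
so neither probe tells A's file apart from C's. Note also that
flock(2) documents lock conversion as not guaranteed to be atomic, so
the failed LOCK_EX probe on C may leave C without its original shared
lock.
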
> 
> 
> Thus, we need some way to find out whether a task X has a lock on file F.
> This patch is one of the ways of doing this.
> 
> Hope this explanation is clearer.

Thanks, I think I understand.

Remind me how you figure out which file descriptors point to the same
file description (struct file)?

--b.
