All of lore.kernel.org
 help / color / mirror / Atom feed
From: Oleg Nesterov <oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
To: Andrey Vagin <avagin-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>,
	Linus Torvalds
	<torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	criu-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	Alexander Viro
	<viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn@public.gmane.org>,
	"Paul E. McKenney"
	<paulmck-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>,
	David Howells <dhowells-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	Dave Jones <davej-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	Michael Kerrisk
	<mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
	Pavel Emelyanov <xemul-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>,
	Cyrill Gorcunov
	<gorcunov-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>
Subject: Re: [PATCH 3/3] signalfd: add ability to read siginfo-s without dequeuing signals (v3)
Date: Fri, 28 Dec 2012 15:32:00 +0100	[thread overview]
Message-ID: <20121228143200.GB24229@redhat.com> (raw)
In-Reply-To: <1356690181-1796-4-git-send-email-avagin-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>

On 12/28, Andrey Vagin wrote:
>
> pread(fd, buf, size, pos) with non-zero pos returns siginfo-s
> without dequeuing signals.
>
> A sequence number and a queue are encoded in pos.
>
> pos = seq + SFD_*_OFFSET
>
> seq is a sequence number of a signal in a queue.
>
> SFD_PER_THREAD_QUEUE_OFFSET - read signals from a per-thread queue.
> SFD_SHARED_QUEUE_OFFSET - read signals from a shared (process wide) queue.
>
> This functionality is required for checkpointing pending signals.
>
> v2: llseek() can't be used here, because peek_offset/f_pos/whatever
> has to be shared with all processes which have this file opened.
>
> Suppose that the task forks after sys_signalfd(). Now if parent or child
> do llseek this affects them both. This is insane because signalfd is
> "strange" to say at least, fork/dup/etc inherits signalfd_ctx but not
> the" source" of the data. // Oleg Nesterov

I think we should cc Linus.

This patch adds the hack and it makes signalfd even more strange.

Yes, this hack was suggested by me because I can't suggest something
better. But if Linus dislikes this user-visible API it would be better
to get his nack right now.

> +static ssize_t signalfd_peek(struct signalfd_ctx *ctx,
> +				siginfo_t *info, loff_t *ppos)
> +{
> +	struct sigpending *pending;
> +	struct sigqueue *q;
> +	loff_t seq;
> +	int ret = 0;
> +
> +	spin_lock_irq(&current->sighand->siglock);
> +
> +	if (*ppos >= SFD_SHARED_QUEUE_OFFSET) {
> +		pending = &current->signal->shared_pending;
> +		seq = *ppos - SFD_SHARED_QUEUE_OFFSET;
> +	} else {
> +		pending = &current->pending;
> +		seq = *ppos - SFD_PER_THREAD_QUEUE_OFFSET;
> +	}

You can do this outside of spin_lock_irq().

And I think it would be better to check SFD_PRIVATE_QUEUE_OFFSET too
although this is not strictly necessary. Otherwise this code assumes
that sys_pread() cheks pos >= 0 and SFD_PRIVATE_QUEUE_OFFSET == 1.

> +	list_for_each_entry(q, &pending->list, list) {
> +		if (sigismember(&ctx->sigmask, q->info.si_signo))
> +			continue;
> +
> +		if (seq-- == 0) {
> +			copy_siginfo(info, &q->info);
> +			ret = info->si_signo;
> +			break;
> +		}
> +	}
> +
> +	spin_unlock_irq(&current->sighand->siglock);
> +
> +	if (ret)
> +		(*ppos)++;

We can change it unconditionally but I won't argue.

> @@ -338,6 +379,7 @@ SYSCALL_DEFINE4(signalfd4, int, ufd, sigset_t __user *, user_mask,
>  		}
>
>  		file->f_flags |= flags & SFD_RAW;
> +		file->f_mode |= FMODE_PREAD;

Again, this is not needed or the code was broken by the previous patch.

Given that 2/3 passes O_RDWR to anon_inode_getfile() I think FMODE_PREAD
should be already set. Note OPEN_FMODE(flags) in anon_inode_getfile().

Oleg.

WARNING: multiple messages have this Message-ID (diff)
From: Oleg Nesterov <oleg@redhat.com>
To: Andrey Vagin <avagin@openvz.org>,
	Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org, criu@openvz.org,
	linux-fsdevel@vger.kernel.org, linux-api@vger.kernel.org,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	David Howells <dhowells@redhat.com>,
	Dave Jones <davej@redhat.com>,
	Michael Kerrisk <mtk.manpages@gmail.com>,
	Pavel Emelyanov <xemul@parallels.com>,
	Cyrill Gorcunov <gorcunov@openvz.org>
Subject: Re: [PATCH 3/3] signalfd: add ability to read siginfo-s without dequeuing signals (v3)
Date: Fri, 28 Dec 2012 15:32:00 +0100	[thread overview]
Message-ID: <20121228143200.GB24229@redhat.com> (raw)
In-Reply-To: <1356690181-1796-4-git-send-email-avagin@openvz.org>

On 12/28, Andrey Vagin wrote:
>
> pread(fd, buf, size, pos) with non-zero pos returns siginfo-s
> without dequeuing signals.
>
> A sequence number and a queue are encoded in pos.
>
> pos = seq + SFD_*_OFFSET
>
> seq is a sequence number of a signal in a queue.
>
> SFD_PER_THREAD_QUEUE_OFFSET - read signals from a per-thread queue.
> SFD_SHARED_QUEUE_OFFSET - read signals from a shared (process wide) queue.
>
> This functionality is required for checkpointing pending signals.
>
> v2: llseek() can't be used here, because peek_offset/f_pos/whatever
> has to be shared with all processes which have this file opened.
>
> Suppose that the task forks after sys_signalfd(). Now if parent or child
> do llseek this affects them both. This is insane because signalfd is
> "strange" to say at least, fork/dup/etc inherits signalfd_ctx but not
> the" source" of the data. // Oleg Nesterov

I think we should cc Linus.

This patch adds the hack and it makes signalfd even more strange.

Yes, this hack was suggested by me because I can't suggest something
better. But if Linus dislikes this user-visible API it would be better
to get his nack right now.

> +static ssize_t signalfd_peek(struct signalfd_ctx *ctx,
> +				siginfo_t *info, loff_t *ppos)
> +{
> +	struct sigpending *pending;
> +	struct sigqueue *q;
> +	loff_t seq;
> +	int ret = 0;
> +
> +	spin_lock_irq(&current->sighand->siglock);
> +
> +	if (*ppos >= SFD_SHARED_QUEUE_OFFSET) {
> +		pending = &current->signal->shared_pending;
> +		seq = *ppos - SFD_SHARED_QUEUE_OFFSET;
> +	} else {
> +		pending = &current->pending;
> +		seq = *ppos - SFD_PER_THREAD_QUEUE_OFFSET;
> +	}

You can do this outside of spin_lock_irq().

And I think it would be better to check SFD_PRIVATE_QUEUE_OFFSET too
although this is not strictly necessary. Otherwise this code assumes
that sys_pread() cheks pos >= 0 and SFD_PRIVATE_QUEUE_OFFSET == 1.

> +	list_for_each_entry(q, &pending->list, list) {
> +		if (sigismember(&ctx->sigmask, q->info.si_signo))
> +			continue;
> +
> +		if (seq-- == 0) {
> +			copy_siginfo(info, &q->info);
> +			ret = info->si_signo;
> +			break;
> +		}
> +	}
> +
> +	spin_unlock_irq(&current->sighand->siglock);
> +
> +	if (ret)
> +		(*ppos)++;

We can change it unconditionally but I won't argue.

> @@ -338,6 +379,7 @@ SYSCALL_DEFINE4(signalfd4, int, ufd, sigset_t __user *, user_mask,
>  		}
>
>  		file->f_flags |= flags & SFD_RAW;
> +		file->f_mode |= FMODE_PREAD;

Again, this is not needed or the code was broken by the previous patch.

Given that 2/3 passes O_RDWR to anon_inode_getfile() I think FMODE_PREAD
should be already set. Note OPEN_FMODE(flags) in anon_inode_getfile().

Oleg.


  parent reply	other threads:[~2012-12-28 14:32 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-12-28 10:22 [PATCH 0/3] signalfd: a kernel interface for dumping/restoring pending signals (v2) Andrey Vagin
2012-12-28 10:22 ` Andrey Vagin
2012-12-28 10:22 ` [PATCH 1/3] signal: allow to send any siginfo to itself Andrey Vagin
2012-12-28 10:23 ` [PATCH 2/3] signalfd: add ability to return siginfo in a raw format (v2) Andrey Vagin
     [not found]   ` <1356690181-1796-3-git-send-email-avagin-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>
2012-12-28 16:14     ` Oleg Nesterov
2012-12-28 16:14       ` Oleg Nesterov
2013-01-10  9:47       ` Andrey Wagin
2013-01-10 22:45         ` Michael Kerrisk (man-pages)
2013-01-12 18:55         ` Oleg Nesterov
2012-12-28 10:23 ` [PATCH 3/3] signalfd: add ability to read siginfo-s without dequeuing signals (v3) Andrey Vagin
     [not found]   ` <1356690181-1796-4-git-send-email-avagin-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>
2012-12-28 14:32     ` Oleg Nesterov [this message]
2012-12-28 14:32       ` Oleg Nesterov
     [not found]       ` <20121228143200.GB24229-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2012-12-28 15:40         ` Oleg Nesterov
2012-12-28 15:40           ` Oleg Nesterov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20121228143200.GB24229@redhat.com \
    --to=oleg-h+wxahxf7alqt0dzr+alfa@public.gmane.org \
    --cc=avagin-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org \
    --cc=criu-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org \
    --cc=davej-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org \
    --cc=dhowells-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org \
    --cc=gorcunov-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org \
    --cc=linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org \
    --cc=paulmck-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org \
    --cc=torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org \
    --cc=viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn@public.gmane.org \
    --cc=xemul-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.