linux-block.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Al Viro <viro@zeniv.linux.org.uk>
To: Miklos Szeredi <miklos@szeredi.hu>
Cc: Jens Axboe <axboe@kernel.dk>, Jann Horn <jannh@google.com>,
	linux-aio@kvack.org, linux-block@vger.kernel.org,
	Linux API <linux-api@vger.kernel.org>,
	hch@lst.de, jmoyer@redhat.com, avi@scylladb.com,
	linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH 13/18] io_uring: add file set registration
Date: Thu, 7 Feb 2019 13:31:35 +0000	[thread overview]
Message-ID: <20190207133135.GZ2217@ZenIV.linux.org.uk> (raw)
In-Reply-To: <20190207092253.GD19821@veci.piliscsaba.redhat.com>

On Thu, Feb 07, 2019 at 10:22:53AM +0100, Miklos Szeredi wrote:
> On Thu, Feb 07, 2019 at 04:00:59AM +0000, Al Viro wrote:
> 
> > So in theory it would be possible to have
> > 	* thread A: sendmsg() has SCM_RIGHTS created and populated,
> > complete with file refcount and ->inflight increments implied,
> > at which point it gets preempted and loses the timeslice.
> > 	* thread B: gets to run and removes all references
> > from descriptor table it shares with thread A.
> > 	* on another CPU we have garbage collector triggered;
> > it determines the set of potentially unreachable unix_sock and
> > everything in our SCM_RIGHTS _is_ in that set, now that no
> > other references remain.
> > 	* on the first CPU, thread A regains the timeslice
> > and inserts its SCM_RIGHTS into queue.  And it does contain
> > references to sockets from the candidate set of running
> > garbage collector, confusing the hell out of it.
> 
> Reminds me: long time ago there was a bug report, and based on that I found a
> bug in MSG_PEEK handling (not confirmed to have fixed the reported bug).  This
> fix, although pretty simple, got lost somehow.  While unix gc code is in your
> head, can you please review and I'll resend through davem?

Umm...  I think the bug is real (something that looks like an eviction
candidate, but actually is referenced from the reachable queue might
get peeked via that queue, then have _its_ queue modified via new
external reference, all between two passes over that queue, confusing the
fuck out of unix_gc()), but I think the fix is an overkill...

Am I right assuming that this queue-modifying operation is accept(), removing
an embryo unix_sock from the queue of listener and thus hiding SCM_RIGHTS in
_its_ queue from scan_children()?

Let me think of it a bit, OK?  While we are at it, some questions from digging
through the current net/unix/garbage.c:
	1) is there any need for ->inflight to be atomic?  All accesses are under
unix_gc_lock, after all...
	2) pumping unix_gc on each sodding reference in SCM_RIGHTS (within
unix_notinflight()/unix_inflight()) looks atrocious... wouldn't it be better to
hold it over that loop?
	3) unix_get_socket() probably ought to be static nowadays...
	4) I wonder if in scan_inflight()/scan_children() we would be better
off with explicit switch (by enum argument) instead of an indirect call.
	5) do we really need UNIX_GC_MAYBE_CYCLE?  Looks like the only
real use is in inc_inflight_move_tail(), and AFAICS it could bloody well
have been
	u->inflight++;
	if (u->inflight == 1)	// just got from zero to non-zero
		list_move_tail(&u->link, &gc_candidates);
The logics there is "we'd found a reference to something that still was
a candidate for eviction in a reachable SCM_RIGHTS, so it's actually
reachable and needs to be scanned (and removed from the set of candidates);
move to the end of list, so that the main loop gets around to it".
If it *was* past the cursor in the list, there's no need to move it; if
we got past it, it must've had zero ->inflight (or we would've removed
it from the set back when we got past it).  Note that it's only called if
UNIX_GC_CANDIDATE is set (i.e. if it's in the initial candidate set),
so for this one ->inflight is guaranteed to mean the number of SCM_RIGHTS
refs from outside of the current candidate set...
	6) unix_get_socket() looks like it might benefit from another
FMODE bit; not sure where it's best set, though - the obvious way would
be SOCK_GC_CARES in sock->flags, set by e.g. unix_create1(), with
sock_alloc_file() propagating it into file->f_mode.  Then unix_get_socket()
would be able to bugger off with NULL for most of the references in
SCM_RIGHTS, without looking into inode...

Comments?  IIRC, you'd done the last serious round of rewriting unix_gc()
and friends...

  reply	other threads:[~2019-02-07 13:31 UTC|newest]

Thread overview: 80+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-29 19:26 [PATCHSET v9] io_uring IO interface Jens Axboe
2019-01-29 19:26 ` [PATCH 01/18] fs: add an iopoll method to struct file_operations Jens Axboe
2019-01-29 19:26 ` [PATCH 02/18] block: wire up block device iopoll method Jens Axboe
2019-01-29 19:26 ` [PATCH 03/18] block: add bio_set_polled() helper Jens Axboe
2019-01-29 19:26 ` [PATCH 04/18] iomap: wire up the iopoll method Jens Axboe
2019-01-29 19:26 ` [PATCH 05/18] Add io_uring IO interface Jens Axboe
2019-01-29 19:26 ` [PATCH 06/18] io_uring: add fsync support Jens Axboe
2019-01-29 19:26 ` [PATCH 07/18] io_uring: support for IO polling Jens Axboe
2019-01-29 20:47   ` Jann Horn
2019-01-29 20:56     ` Jens Axboe
2019-01-29 21:10       ` Jann Horn
2019-01-29 21:33         ` Jens Axboe
2019-01-29 19:26 ` [PATCH 08/18] fs: add fget_many() and fput_many() Jens Axboe
2019-01-29 19:26 ` [PATCH 09/18] io_uring: use fget/fput_many() for file references Jens Axboe
2019-01-29 23:31   ` Jann Horn
2019-01-29 23:44     ` Jens Axboe
2019-01-30 15:33       ` Jens Axboe
2019-01-29 19:26 ` [PATCH 10/18] io_uring: batch io_kiocb allocation Jens Axboe
2019-01-29 19:26 ` [PATCH 11/18] block: implement bio helper to add iter bvec pages to bio Jens Axboe
2019-01-29 19:26 ` [PATCH 12/18] io_uring: add support for pre-mapped user IO buffers Jens Axboe
2019-01-29 22:44   ` Jann Horn
2019-01-29 22:56     ` Jens Axboe
2019-01-29 23:03       ` Jann Horn
2019-01-29 23:06         ` Jens Axboe
2019-01-29 23:08           ` Jann Horn
2019-01-29 23:14             ` Jens Axboe
2019-01-29 23:42               ` Jann Horn
2019-01-29 23:51                 ` Jens Axboe
2019-01-29 19:26 ` [PATCH 13/18] io_uring: add file set registration Jens Axboe
2019-01-30  1:29   ` Jann Horn
2019-01-30 15:35     ` Jens Axboe
2019-02-04  2:56     ` Al Viro
2019-02-05  2:19       ` Jens Axboe
2019-02-05 17:57         ` Jens Axboe
2019-02-05 19:08           ` Jens Axboe
2019-02-06  0:27             ` Jens Axboe
2019-02-06  1:01               ` Al Viro
2019-02-06 17:56                 ` Jens Axboe
2019-02-07  4:05                   ` Al Viro
2019-02-07 16:14                     ` Jens Axboe
2019-02-07 16:30                       ` Al Viro
2019-02-07 16:35                         ` Jens Axboe
2019-02-07 16:51                         ` Al Viro
2019-02-06  0:56             ` Al Viro
2019-02-06 13:41               ` Jens Axboe
2019-02-07  4:00                 ` Al Viro
2019-02-07  9:22                   ` Miklos Szeredi
2019-02-07 13:31                     ` Al Viro [this message]
2019-02-07 14:20                       ` Miklos Szeredi
2019-02-07 15:20                         ` Al Viro
2019-02-07 15:27                           ` Miklos Szeredi
2019-02-07 16:26                             ` Al Viro
2019-02-07 19:08                               ` Miklos Szeredi
2019-02-07 18:45                   ` Jens Axboe
2019-02-07 18:58                     ` Jens Axboe
2019-02-11 15:55                     ` Jonathan Corbet
2019-02-11 17:35                       ` Al Viro
2019-02-11 20:33                         ` Jonathan Corbet
2019-01-29 19:26 ` [PATCH 14/18] io_uring: add submission polling Jens Axboe
2019-01-29 19:26 ` [PATCH 15/18] io_uring: add io_kiocb ref count Jens Axboe
2019-01-29 19:27 ` [PATCH 16/18] io_uring: add support for IORING_OP_POLL Jens Axboe
2019-01-29 19:27 ` [PATCH 17/18] io_uring: allow workqueue item to handle multiple buffered requests Jens Axboe
2019-01-29 19:27 ` [PATCH 18/18] io_uring: add io_uring_event cache hit information Jens Axboe
  -- strict thread matches above, loose matches on Subject: below --
2019-02-07 19:55 [PATCHSET v12] io_uring IO interface Jens Axboe
2019-02-07 19:55 ` [PATCH 13/18] io_uring: add file set registration Jens Axboe
2019-02-08 12:17   ` Alan Jenkins
2019-02-08 12:57     ` Jens Axboe
2019-02-08 14:02       ` Alan Jenkins
2019-02-08 15:13         ` Jens Axboe
2019-02-12 12:29           ` Alan Jenkins
2019-02-12 15:17             ` Jens Axboe
2019-02-12 17:21               ` Alan Jenkins
2019-02-12 17:33                 ` Jens Axboe
2019-02-12 20:23                   ` Alan Jenkins
2019-02-12 21:10                     ` Jens Axboe
2019-02-01 15:23 [PATCHSET v11] io_uring IO interface Jens Axboe
2019-02-01 15:24 ` [PATCH 13/18] io_uring: add file set registration Jens Axboe
2019-01-30 21:55 [PATCHSET v10] io_uring IO interface Jens Axboe
2019-01-30 21:55 ` [PATCH 13/18] io_uring: add file set registration Jens Axboe
2019-01-28 21:35 [PATCHSET v8] io_uring IO interface Jens Axboe
2019-01-28 21:35 ` [PATCH 13/18] io_uring: add file set registration Jens Axboe
2019-01-29 16:36   ` Jann Horn
2019-01-29 18:13     ` Jens Axboe
2019-01-23 15:35 [PATCHSET v7] io_uring IO interface Jens Axboe
2019-01-23 15:35 ` [PATCH 13/18] io_uring: add file set registration Jens Axboe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190207133135.GZ2217@ZenIV.linux.org.uk \
    --to=viro@zeniv.linux.org.uk \
    --cc=avi@scylladb.com \
    --cc=axboe@kernel.dk \
    --cc=hch@lst.de \
    --cc=jannh@google.com \
    --cc=jmoyer@redhat.com \
    --cc=linux-aio@kvack.org \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=miklos@szeredi.hu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).