Re: [PATCH 1/3] io_uring: add support for async work inheriting files table

From: Jens Axboe <axboe@kernel.dk>
To: Jann Horn <jannh@google.com>
Cc: linux-block@vger.kernel.org,
	"David S. Miller" <davem@davemloft.net>,
	Network Development <netdev@vger.kernel.org>
Subject: Re: [PATCH 1/3] io_uring: add support for async work inheriting files table
Date: Thu, 24 Oct 2019 18:35:53 -0600
Message-ID: <947c74b9-e828-e190-19fc-449c72a20798@kernel.dk>
In-Reply-To: <CAG48ez00zr2P1WCznnXmTvq+FQ4Ji8kDnuNqbeeMvOh_MhXeTg@mail.gmail.com>

On 10/24/19 5:13 PM, Jann Horn wrote:
> On Fri, Oct 25, 2019 at 12:04 AM Jens Axboe <axboe@kernel.dk> wrote:
>> On 10/24/19 2:31 PM, Jann Horn wrote:
>>> On Thu, Oct 24, 2019 at 9:41 PM Jens Axboe <axboe@kernel.dk> wrote:
>>>> On 10/18/19 12:50 PM, Jann Horn wrote:
>>>>> On Fri, Oct 18, 2019 at 8:16 PM Jens Axboe <axboe@kernel.dk> wrote:
>>>>>> On 10/18/19 12:06 PM, Jann Horn wrote:
>>>>>>> But actually, by the way: Is this whole files_struct thing creating a
>>>>>>> reference loop? The files_struct has a reference to the uring file,
>>>>>>> and the uring file has ACCEPT work that has a reference to the
>>>>>>> files_struct. If the task gets killed and the accept work blocks, the
>>>>>>> entire files_struct will stay alive, right?
>>>>>>
>>>>>> Yes, for the lifetime of the request, it does create a loop. So if the
>>>>>> application goes away, I think you're right, the files_struct will stay.
>>>>>> And so will the io_uring, for that matter, as we depend on the closing
>>>>>> of the files to do the final reap.
>>>>>>
>>>>>> Hmm, not sure how best to handle that, to be honest. We need some way to
>>>>>> break the loop, if the request never finishes.
>>>>>
>>>>> A wacky and dubious approach would be to, instead of taking a
>>>>> reference to the files_struct, abuse f_op->flush() to synchronously
>>>>> flush out pending requests with references to the files_struct... But
>>>>> it's probably a bad idea, given that in f_op->flush(), you can't
>>>>> easily tell which files_struct the close is coming from. I suppose you
>>>>> could keep a list of (fdtable, fd) pairs through which ACCEPT requests
>>>>> have come in and then let f_op->flush() probe whether the file
>>>>> pointers are gone from them...
>>>>
>>>> Got back to this after finishing the io-wq stuff, which we need for the
>>>> cancel.
>>>>
>>>> Here's an updated patch:
>>>>
>>>> http://git.kernel.dk/cgit/linux-block/commit/?h=for-5.5/io_uring-test&id=1ea847edc58d6a54ca53001ad0c656da57257570
>>>>
>>>> that seems to work for me (lightly tested), we correctly find and cancel
>>>> work that is holding on to the file table.
>>>>
>>>> The full series sits on top of my for-5.5/io_uring-wq branch, and can be
>>>> viewed here:
>>>>
>>>> http://git.kernel.dk/cgit/linux-block/log/?h=for-5.5/io_uring-test
>>>>
>>>> Let me know what you think!
>>>
>>> Ah, I didn't realize that the second argument to f_op->flush is a
>>> pointer to the files_struct. That's neat.
>>>
>>>
>>> Security: There is no guarantee that ->flush() will run after the last
>>> io_uring_enter() finishes. You can race like this, with threads A and
>>> B in one process and C in another one:
>>>
>>> A: sends uring fd to C via unix domain socket
>>> A: starts syscall io_uring_enter(fd, ...)
>>> A: calls fdget(fd), takes reference to file
>>> B: starts syscall close(fd)
>>> B: fd table entry is removed
>>> B: f_op->flush is invoked and finds no pending transactions
>>> B: syscall close() returns
>>> A: continues io_uring_enter(), grabbing current->files
>>> A: io_uring_enter() returns
>>> A and B: exit
>>> worker: use-after-free access to files_struct
>>>
>>> I think the solution to this would be (unless you're fine with adding
>>> some broad global read-write mutex) something like this in
>>> __io_queue_sqe(), where "fd" and "f" are the variables from
>>> io_uring_enter(), plumbed through the stack somehow:
>>>
>>> if (req->flags & REQ_F_NEED_FILES) {
>>>     rcu_read_lock();
>>>     spin_lock_irq(&ctx->inflight_lock);
>>>     if (fcheck(fd) == f) {
>>>       list_add(&req->inflight_list,
>>>         &ctx->inflight_list);
>>>       req->work.files = current->files;
>>>       ret = 0;
>>>     } else {
>>>       ret = -EBADF;
>>>     }
>>>     spin_unlock_irq(&ctx->inflight_lock);
>>>     rcu_read_unlock();
>>>     if (ret)
>>>       goto put_req;
>>> }
>>
>> First of all, thanks for the thorough look at this! We already have f
>> available here, it's req->file. And we just made a copy of the sqe, so
>> we have sqe->fd available as well. I fixed this up.
> 
> sqe->fd is the file descriptor we're doing I/O on, not the file
> descriptor of the uring file, right? Same thing for req->file. This
> check only detects whether the fd we're doing I/O on was closed, which
> is irrelevant.

Duh, yes, I'm an idiot. That's easily fixable; I'll update the check to
use the ring fd instead.
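
To spell out the difference, here is a rough userspace sketch of the
check, assuming a toy fd table. This is not the actual patch: every
toy_* name is made up for illustration, and the locking (inflight_lock,
RCU) from the snippet quoted above is left out. The point is only that
the comparison has to use the fd and file that io_uring_enter() was
called with, not the sqe's I/O target.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_FDS 8
#define TOY_EBADF 9

/* Toy stand-ins for struct file and the per-process fd table. */
struct toy_file { int id; };
struct toy_fdtable { struct toy_file *slot[TOY_MAX_FDS]; };

/* Rough analogue of fcheck(): the file currently installed at fd, or NULL. */
static struct toy_file *toy_fcheck(const struct toy_fdtable *t, int fd)
{
	if (fd < 0 || fd >= TOY_MAX_FDS)
		return NULL;
	return t->slot[fd];
}

/*
 * Decide whether a request may pin the file table. ring_fd and ring_file
 * are the values io_uring_enter() was called with, not the sqe's I/O fd.
 * Returns 0 if ring_fd still refers to ring_file, -TOY_EBADF otherwise.
 */
static int toy_may_grab_files(const struct toy_fdtable *t, int ring_fd,
			      const struct toy_file *ring_file)
{
	assert(ring_file != NULL);
	return toy_fcheck(t, ring_fd) == ring_file ? 0 : -TOY_EBADF;
}
```

With the ring installed at fd 3 and a socket at fd 4, clearing slot 3
(the close() racing in) makes toy_may_grab_files() fail, even though a
lookup of fd 4 would still find the socket. That is why checking sqe->fd
does not detect the race.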

>>> Security + Correctness: If there is more than one io_wqe, it seems to
>>> me that io_uring_flush() calls io_wq_cancel_work(), which calls
>>> io_wqe_cancel_work(), which may return IO_WQ_CANCEL_OK if the first
>>> request it looks at is pending. In that case, io_wq_cancel_work() will
>>> immediately return, and io_uring_flush() will also immediately return.
>>> It looks like any other requests will continue running?
>>
>> Ah good point, I missed that. We need to keep looping until we get
>> NOTFOUND returned. Fixed as well.
>>
>> Also added cancellation if the task is going away. Here's the
>> incremental patch, I'll resend with the full version.
> [...]
>> +static int io_uring_flush(struct file *file, void *data)
>> +{
>> +       struct io_ring_ctx *ctx = file->private_data;
>> +
>> +       if (fatal_signal_pending(current) || (current->flags & PF_EXITING))
>> +               io_wq_cancel_all(ctx->io_wq);
> 
> Looking at io_wq_cancel_all(), this will just send a signal to the
> task without waiting for anything, right? Isn't that unsafe?

Yes, that's a logic error. We should always do the
io_uring_cancel_files(), and only additionally cancel everything if the
task is going away. Something like:

	io_uring_cancel_files();
	if (fatal_signal_pending(current) || (current->flags & PF_EXITING))
		io_wq_cancel_all(ctx->io_wq);
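
As a rough illustration of that ordering, here is a small userspace
model. It is a sketch only: the toy_* types and functions are invented
for this example and do not correspond to the io-wq API, and unlike a
real cancel it does not wait for anything to complete. It shows two
things: cancel-by-files keeps looping until NOTFOUND instead of stopping
at the first hit, and cancel-all only happens in addition to it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_cancel { TOY_CANCEL_OK, TOY_CANCEL_NOTFOUND };

struct toy_work {
	const void *files;	/* file table this work pins, or NULL */
	bool cancelled;
};

struct toy_ctx {
	struct toy_work *work;
	size_t nr_work;
};

/* Cancel at most one not-yet-cancelled work item that pins 'files'. */
static enum toy_cancel toy_cancel_one(struct toy_ctx *ctx, const void *files)
{
	for (size_t i = 0; i < ctx->nr_work; i++) {
		struct toy_work *w = &ctx->work[i];

		if (!w->cancelled && w->files == files) {
			w->cancelled = true;
			return TOY_CANCEL_OK;
		}
	}
	return TOY_CANCEL_NOTFOUND;
}

/* Keep going until nothing pinning 'files' is left; returns the count. */
static size_t toy_cancel_files(struct toy_ctx *ctx, const void *files)
{
	size_t n = 0;

	while (toy_cancel_one(ctx, files) == TOY_CANCEL_OK)
		n++;
	return n;
}

/* Mark every work item cancelled, regardless of which table it pins. */
static void toy_cancel_all(struct toy_ctx *ctx)
{
	for (size_t i = 0; i < ctx->nr_work; i++)
		ctx->work[i].cancelled = true;
}

/* Flush: always cancel by files first, then everything if exiting. */
static size_t toy_flush(struct toy_ctx *ctx, const void *files, bool exiting)
{
	assert(ctx != NULL);

	size_t n = toy_cancel_files(ctx, files);

	if (exiting)
		toy_cancel_all(ctx);
	return n;
}
```

With three work items, two of them pinning the same table, a
non-exiting toy_flush() on that table cancels exactly those two and
leaves the third alone; the exiting case takes out the rest as well.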

Thanks!

-- 
Jens Axboe
