rust-for-linux.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Benno Lossin <benno.lossin@proton.me>
To: Andreas Hindborg <nmi@metaspace.dk>
Cc: "Jens Axboe" <axboe@kernel.dk>, "Christoph Hellwig" <hch@lst.de>,
	"Keith Busch" <kbusch@kernel.org>,
	"Damien Le Moal" <Damien.LeMoal@wdc.com>,
	"Bart Van Assche" <bvanassche@acm.org>,
	"Hannes Reinecke" <hare@suse.de>,
	"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
	"Andreas Hindborg" <a.hindborg@samsung.com>,
	"Wedson Almeida Filho" <wedsonaf@gmail.com>,
	"Niklas Cassel" <Niklas.Cassel@wdc.com>,
	"Greg KH" <gregkh@linuxfoundation.org>,
	"Matthew Wilcox" <willy@infradead.org>,
	"Miguel Ojeda" <ojeda@kernel.org>,
	"Alex Gaynor" <alex.gaynor@gmail.com>,
	"Boqun Feng" <boqun.feng@gmail.com>,
	"Gary Guo" <gary@garyguo.net>,
	"Björn Roy Baron" <bjorn3_gh@protonmail.com>,
	"Alice Ryhl" <aliceryhl@google.com>,
	"Chaitanya Kulkarni" <chaitanyak@nvidia.com>,
	"Luis Chamberlain" <mcgrof@kernel.org>,
	"Yexuan Yang" <1182282462@bupt.edu.cn>,
	"Sergio González Collado" <sergio.collado@gmail.com>,
	"Joel Granados" <j.granados@samsung.com>,
	"Pankaj Raghav (Samsung)" <kernel@pankajraghav.com>,
	"Daniel Gomez" <da.gomez@samsung.com>,
	"open list" <linux-kernel@vger.kernel.org>,
	"rust-for-linux@vger.kernel.org" <rust-for-linux@vger.kernel.org>,
	"lsf-pc@lists.linux-foundation.org"
	<lsf-pc@lists.linux-foundation.org>,
	"gost.dev@samsung.com" <gost.dev@samsung.com>,
	"Ming Lei" <ming.lei@redhat.com>
Subject: Re: [RFC PATCH 1/5] rust: block: introduce `kernel::block::mq` module
Date: Thu, 04 Apr 2024 08:46:52 +0000	[thread overview]
Message-ID: <3478efc2-7e81-4688-a115-5d33c70a24aa@proton.me> (raw)
In-Reply-To: <87zfu9r8be.fsf@metaspace.dk>

On 04.04.24 07:44, Andreas Hindborg wrote:
> Benno Lossin <benno.lossin@proton.me> writes:
> 
>> On 03.04.24 10:46, Andreas Hindborg wrote:
>>> Benno Lossin <benno.lossin@proton.me> writes:
>>>
>>>> On 23.03.24 07:32, Andreas Hindborg wrote:
>>>>> Benno Lossin <benno.lossin@proton.me> writes:
>>>>>> On 3/13/24 12:05, Andreas Hindborg wrote:
>>>>>>> +//! implementations of the `Operations` trait.
>>>>>>> +//!
>>>>>>> +//! IO requests are passed to the driver as [`Request`] references. The
>>>>>>> +//! `Request` type is a wrapper around the C `struct request`. The driver must
>>>>>>> +//! mark start of request processing by calling [`Request::start`] and end of
>>>>>>> +//! processing by calling one of the [`Request::end`], methods. Failure to do so
>>>>>>> +//! can lead to IO failures.
>>>>>>
>>>>>> I am unfamiliar with this, what are "IO failures"?
>>>>>> Do you think that it might be better to change the API to use a
>>>>>> callback? So instead of calling start and end, you would do
>>>>>>
>>>>>>         request.handle(|req| {
>>>>>>             // do the stuff that would be done between start and end
>>>>>>         });
>>>>>>
>>>>>> I took a quick look at the rnull driver and there you are calling
>>>>>> `Request::end_ok` from a different function. So my suggestion might not
>>>>>> be possible, since you really need the freedom.
>>>>>>
>>>>>> Do you think that a guard approach might work better? ie `start` returns
>>>>>> a guard that when dropped will call `end` and you need the guard to
>>>>>> operate on the request.
>>>>>
>>>>> I don't think that would fit, since the driver might not complete the
>>>>> request immediately. We might be able to call `start` on behalf of the
>>>>> driver.
>>>>>
>>>>> At any rate, since the request is reference counted now, we can
>>>>> automatically fail a request when the last reference is dropped and it
>>>>> was not marked successfully completed. I would need to measure the
>>>>> performance implications of such a feature.
>>>>
>>>> Are there cases where you still need access to the request after you
>>>> have called `end`?
>>>
>>> In general no, there is no need to handle the request after calling end.
>>> C drivers are not allowed to, because this transfers ownership of the
>>> request back to the block layer. This patch series defer the transfer of
>>> ownership to the point when the ARef<Request> refcount goes to zero, so
>>> there should be no danger associated with touching the `Request` after
>>> end.
>>>
>>>> If no, I think it would be better for the request to
>>>> be consumed by the `end` function.
>>>> This is a bit difficult with `ARef`, since the user can just clone it
>>>> though... Do you think that it might be necessary to clone requests?
>>>
>>> Looking into the details now I see that calling `Request::end` more than
>>> once will trigger UAF, because C code decrements the refcount on the
>>> request. When we have `ARef<Request>` around, that is a problem. It
>>> probably also messes with other things in C land. Good catch.
>>>
>>> I did implement `Request::end` to consume the request at one point
>>> before I fell back on reference counting. It works fine for simple
>>> drivers. However, most drivers will need to use the block layer tag set
>>> service, that allows conversion of an integer id to a request pointer.
>>> The abstraction for this feature is not part of this patch set. But the
>>> block layer manages a mapping of integer to request mapping, and drivers
>>> typically use this to identify the request that corresponds to
>>> completion messages that arrive from hardware. When drivers are able to
>>> turn integers into requests like this, consuming the request in the call
>>> to `end` makes little sense (because we can just construct more).
>>
>> How do you ensure that this is fine?:
>>
>>       let r1 = tagset.get(0);
>>       let r2 = tagset.get(0);
>>       r1.end_ok();
>>       r2.do_something_that_would_only_be_done_while_active();
>>
>> One thing that comes to my mind would be to only give out `&Request`
>> from the tag set. And to destroy, you could have a separate operation
>> that also removes the request from the tag set. (I am thinking of a tag
>> set as a `HashMap<u64, Request>`.
> 
> This would be similar to
> 
>    let r1 = tagset.get(0)?;
>    ler r2 = r1.clone();
>    r1.end_ok();
>    r2.do_something_requires_active();
> 
> but it is not a problem because we do not implement any actions that are
> illegal in that position (outside of `end` - that _is_ a problem).

Makes sense, but I think it's a bit weird to still be able to access it
after `end`ing.

> 
> 
>>>
>>> What I do now is issue the an `Option<ARef<Request>>` with
>>> `bindings::req_ref_inc_not_zero(rq_ptr)`, to make sure that the request
>>> is currently owned by the driver.
>>>
>>> I guess we can check the absolute value of the refcount, and only issue
>>> a request handle if the count matches what we expect. Then we can be certain
>>> that the handle is unique, and we can require transfer of ownership of
>>> the handle to `Request::end` to make sure it can never be called more
>>> than once.
>>>
>>> Another option is to error out in `Request::end` if the
>>> refcount is not what we expect.
>>
>> I am a bit confused, why does the refcount matter in this case? Can't
>> the user just have multiple `ARef`s?
> 
> Because we want to assert that we are consuming the last handle to the
> request. After we do that, the user cannot call `Request::end` again.
> `TagSet::get` will not issue a request reference if the request is not
> in flight. Although there might be a race condition to watch out for.
> 
> When the block layer hands over ownership to Rust, the reference count
> is 1. The first `ARef<Request>` we create increments the count to 2. To
> complete the request, we must have ownership of all reference counts
> above 1. The block layer takes the last reference count when it takes
> back ownership of the request.
> 
>> I think it would be weird to use `ARef<Request>` if you expect the
>> refcount to be 1.
> 
> Yes, that would require a custom smart pointer with a `try_into_unique`
> method that succeeds when the refcount is exactly 2. It would consume
> the instance and decrement the refcount to 1. But as I said, there is a
> potential race with `TagSet::get` when the refcount is 1 that needs to
> be handled.
> 
>> Maybe the API should be different?
> 
> I needs to change a little, yes.
> 
>> As I understand it, a request has the following life cycle (please
>> correct me if I am wrong):
>> 1. A new request is created, it is given to the driver via `queue_rq`.
>> 2. The driver can now decide what to do with it (theoretically it can
>>      store it somewhere and later do something with it), but it should at
>>      some point call `Request::start`.
>> 3. Work happens and eventually the driver calls `Request::end`.
>>
>> To me this does not seem like something where we need a refcount (we
>> still might need one for safety, but it does not need to be exposed to
>> the user).
> 
> It would not need to be exposed to the user, other than a) ending a request
> can fail OR b) `TagSet::get` can fail.
> 
> a) would require that ending a request must be done with a unique
> reference. This could be done by the user by the user calling
> `try_into_unique` or by making the `end` method fallible.
> 
> b) would make the reference handle `!Clone` and add a failure mode to
> `TagSet::get`, so it fails to construct a `Request` handle if there are
> already one in existence.
> 
> I gravitate towards a) because it allows the user to clone the Request
> reference without adding an additional `Arc`.

This confuses me a little, since I thought that `TagSet::get` returns
`Option<ARef<Request>>`. (I tried to find the abstractions in your
github, but I did not find them)

I think that this could work: `queue_rq` takes a `OwnedRequest`, which
the user can store in a `TagSet`, transferring ownership. `TagSet::get`
returns `Option<&Request>` and you can call `TagSet::remove` to get
`Option<OwnedRequest>`. `OwnedRequest::end` consumes `self`.
With this pattern we also do not need to take an additional refcount.

-- 
Cheers,
Benno


  reply	other threads:[~2024-04-04  8:47 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-22 23:40 [RFC PATCH 1/5] rust: block: introduce `kernel::block::mq` module Benno Lossin
2024-03-23  6:32 ` Andreas Hindborg
2024-04-02 23:09   ` Benno Lossin
2024-04-03  8:46     ` Andreas Hindborg
2024-04-03 19:37       ` Benno Lossin
2024-04-04  5:44         ` Andreas Hindborg
2024-04-04  8:46           ` Benno Lossin [this message]
2024-04-04  9:30             ` Andreas Hindborg
2024-04-04 13:20               ` Benno Lossin
2024-04-05  8:43                 ` Andreas Hindborg
2024-04-05  9:40                   ` Benno Lossin
  -- strict thread matches above, loose matches on Subject: below --
2024-03-13 11:05 [RFC PATCH 0/5] Rust block device driver API and null block driver Andreas Hindborg
2024-03-13 11:05 ` [RFC PATCH 1/5] rust: block: introduce `kernel::block::mq` module Andreas Hindborg
2024-03-13 23:55   ` Boqun Feng
2024-03-14  8:58     ` Andreas Hindborg
2024-03-14 18:55       ` Miguel Ojeda
2024-03-14 19:22         ` Andreas Hindborg
2024-03-14 19:41           ` Andreas Hindborg
2024-03-14 19:41           ` Miguel Ojeda
2024-03-14 20:56             ` Miguel Ojeda
2024-03-15  7:52             ` Andreas Hindborg
2024-03-15 12:17               ` Ming Lei
2024-03-15 12:46                 ` Andreas Hindborg
2024-03-15 15:24                   ` Ming Lei
2024-03-15 17:49                     ` Andreas Hindborg
2024-03-16 14:48                       ` Ming Lei
2024-03-16 17:27                         ` Andreas Hindborg

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3478efc2-7e81-4688-a115-5d33c70a24aa@proton.me \
    --to=benno.lossin@proton.me \
    --cc=1182282462@bupt.edu.cn \
    --cc=Damien.LeMoal@wdc.com \
    --cc=Niklas.Cassel@wdc.com \
    --cc=a.hindborg@samsung.com \
    --cc=alex.gaynor@gmail.com \
    --cc=aliceryhl@google.com \
    --cc=axboe@kernel.dk \
    --cc=bjorn3_gh@protonmail.com \
    --cc=boqun.feng@gmail.com \
    --cc=bvanassche@acm.org \
    --cc=chaitanyak@nvidia.com \
    --cc=da.gomez@samsung.com \
    --cc=gary@garyguo.net \
    --cc=gost.dev@samsung.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=hare@suse.de \
    --cc=hch@lst.de \
    --cc=j.granados@samsung.com \
    --cc=kbusch@kernel.org \
    --cc=kernel@pankajraghav.com \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lsf-pc@lists.linux-foundation.org \
    --cc=mcgrof@kernel.org \
    --cc=ming.lei@redhat.com \
    --cc=nmi@metaspace.dk \
    --cc=ojeda@kernel.org \
    --cc=rust-for-linux@vger.kernel.org \
    --cc=sergio.collado@gmail.com \
    --cc=wedsonaf@gmail.com \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).