From: Benno Lossin <benno.lossin@proton.me>
To: Andreas Hindborg <nmi@metaspace.dk>
Cc: "Jens Axboe" <axboe@kernel.dk>, "Christoph Hellwig" <hch@lst.de>,
"Keith Busch" <kbusch@kernel.org>,
"Damien Le Moal" <dlemoal@kernel.org>,
"Bart Van Assche" <bvanassche@acm.org>,
"Hannes Reinecke" <hare@suse.de>,
"Ming Lei" <ming.lei@redhat.com>,
"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
"Andreas Hindborg" <a.hindborg@samsung.com>,
"Wedson Almeida Filho" <wedsonaf@gmail.com>,
"Greg KH" <gregkh@linuxfoundation.org>,
"Matthew Wilcox" <willy@infradead.org>,
"Miguel Ojeda" <ojeda@kernel.org>,
"Alex Gaynor" <alex.gaynor@gmail.com>,
"Boqun Feng" <boqun.feng@gmail.com>,
"Gary Guo" <gary@garyguo.net>,
"Björn Roy Baron" <bjorn3_gh@protonmail.com>,
"Alice Ryhl" <aliceryhl@google.com>,
"Chaitanya Kulkarni" <chaitanyak@nvidia.com>,
"Luis Chamberlain" <mcgrof@kernel.org>,
"Yexuan Yang" <1182282462@bupt.edu.cn>,
"Sergio González Collado" <sergio.collado@gmail.com>,
"Joel Granados" <j.granados@samsung.com>,
"Pankaj Raghav (Samsung)" <kernel@pankajraghav.com>,
"Daniel Gomez" <da.gomez@samsung.com>,
"Niklas Cassel" <Niklas.Cassel@wdc.com>,
"Philipp Stanner" <pstanner@redhat.com>,
"Conor Dooley" <conor@kernel.org>,
"Johannes Thumshirn" <Johannes.Thumshirn@wdc.com>,
"Matias Bjørling" <m@bjorling.me>,
"open list" <linux-kernel@vger.kernel.org>,
"rust-for-linux@vger.kernel.org" <rust-for-linux@vger.kernel.org>,
"lsf-pc@lists.linux-foundation.org"
<lsf-pc@lists.linux-foundation.org>,
"gost.dev@samsung.com" <gost.dev@samsung.com>
Subject: Re: [PATCH v4 1/3] rust: block: introduce `kernel::block::mq` module
Date: Mon, 03 Jun 2024 18:26:08 +0000
Message-ID: <925fe0fe-9303-4f49-b473-c3a3ecc5e2e6@proton.me>
In-Reply-To: <87mso2me0p.fsf@metaspace.dk>
On 03.06.24 14:01, Andreas Hindborg wrote:
> Benno Lossin <benno.lossin@proton.me> writes:
>> On 01.06.24 15:40, Andreas Hindborg wrote:
>>> +impl seal::Sealed for Initialized {}
>>> +impl GenDiskState for Initialized {
>>> + const DELETE_ON_DROP: bool = false;
>>> +}
>>> +impl seal::Sealed for Added {}
>>> +impl GenDiskState for Added {
>>> + const DELETE_ON_DROP: bool = true;
>>> +}
>>> +
>>> +impl<T: Operations> GenDisk<T, Initialized> {
>>> + /// Try to create a new `GenDisk`.
>>> + pub fn try_new(tagset: Arc<TagSet<T>>) -> Result<Self> {
>>
>> Since there is no non-try `new` function, I think we should name this
>> function just `new`.
>
> Right, I am still getting used to the new naming scheme. Do you know if
> it is documented anywhere?
I don't think it is documented; it might only be a verbal convention at
the moment. That said, [1] suggests `new` for general constructors.
Since this is the only constructor, one could argue that the
recommendation is to use `new` (which I personally think is a good idea).
[1]: https://rust-lang.github.io/api-guidelines/naming.html
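To illustrate the convention, here is a minimal userspace sketch of a
fallible constructor that is simply called `new`. The `Disk` type, its
`queue_depth` field, and `DiskError` are hypothetical stand-ins made up
for this example; they are not the `GenDisk` API from the patch.

```rust
/// Hypothetical error type, for illustration only.
#[derive(Debug, PartialEq)]
pub struct DiskError;

/// Hypothetical stand-in for a type with a single, fallible constructor.
#[derive(Debug)]
pub struct Disk {
    queue_depth: u32,
}

impl Disk {
    /// Creates a new `Disk`.
    ///
    /// There is no infallible constructor, so the fallible one is named
    /// `new` rather than `try_new`.
    pub fn new(queue_depth: u32) -> Result<Self, DiskError> {
        if queue_depth == 0 {
            return Err(DiskError);
        }
        Ok(Self { queue_depth })
    }

    /// Returns the queue depth this disk was created with.
    pub fn queue_depth(&self) -> u32 {
        self.queue_depth
    }
}
```

Callers handle the `Result` as usual, e.g. `let disk = Disk::new(4)?;`.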
[...]
>>> +impl<T: Operations> OperationsVTable<T> {
>>> + /// This function is called by the C kernel. A pointer to this function is
>>> + /// installed in the `blk_mq_ops` vtable for the driver.
>>> + ///
>>> + /// # Safety
>>> + ///
>>> + /// - The caller of this function must ensure `bd` is valid
>>> + /// and initialized. The pointees must outlive this function.
>>
>> Until when do the pointees have to be alive? "must outlive this
>> function" could also be the case if the pointees die immediately after
>> this function returns.
>
> It should not be plural. What I intended to communicate is that what
> `bd` points to must be valid for read for the duration of the function
> call. I think that is what "The pointee must outlive this function"
> states? Although when we talk about lifetime of an object pointed to by
> a pointer, I am not sure about the correct way to word this. Do we talk
> about the lifetime of the pointer or the lifetime of the pointed to
> object (the pointee). We should not use the same wording for the pointer
> and the pointee.
>
> How about:
>
> /// - The caller of this function must ensure that the pointee of `bd` is
> /// valid for read for the duration of this function.
But this is not enough for it to be sound, right? You create an `ARef`
from `bd.rq`, which potentially lives forever. You somehow need to
require that the pointer `bd` stays valid for reads and (synchronized)
writes until the request is ended (probably via `blk_mq_end_request`).
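Roughly what I have in mind, as a userspace sketch only: `Request`,
`QueueData`, `RequestRef` and `queue_rq` below are simplified,
hypothetical stand-ins (not the kernel types), and the sketch does not
model `blk_mq_end_request` itself. The point is the second safety
requirement, which covers the time after the function has returned.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical stand-in for `struct request` plus its Rust-side refcount.
pub struct Request {
    pub refcount: AtomicU64,
}

/// Hypothetical stand-in for `struct blk_mq_queue_data`.
#[repr(C)]
pub struct QueueData {
    pub rq: *mut Request,
}

/// Handle to a request that may outlive the callback that created it.
pub struct RequestRef {
    ptr: *const Request,
}

/// # Safety
///
/// - `bd` must be valid for reads for the duration of this call.
/// - `(*bd).rq` must point to a valid `Request` that stays valid for reads
///   and synchronized writes until the request is ended, because the
///   returned handle can outlive this call.
pub unsafe fn queue_rq(bd: *const QueueData) -> RequestRef {
    // SAFETY: `bd` is valid for reads by the first safety requirement.
    let rq: *mut Request = unsafe { (*bd).rq };
    // SAFETY: `rq` points to a valid `Request` by the second requirement.
    let request: &Request = unsafe { &*rq };
    // One count for the handle, one for being in flight.
    request.refcount.store(2, Ordering::Relaxed);
    RequestRef { ptr: rq }
}

impl RequestRef {
    /// Reads the current refcount through the handle.
    pub fn refcount(&self) -> u64 {
        // SAFETY: the caller of `queue_rq` promised that the pointee stays
        // valid until the request is ended, so it outlives this handle.
        let request: &Request = unsafe { &*self.ptr };
        request.refcount.load(Ordering::Relaxed)
    }
}
```

With only the "valid for the duration of this function" requirement, the
`unsafe` block in `RequestRef::refcount` could not be justified.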
>>> + /// - This function must not be called with a `hctx` for which
>>> + /// `Self::exit_hctx_callback()` has been called.
>>> + /// - (*bd).rq must point to a valid `bindings:request` for which
>>> + /// `OperationsVTable<T>::init_request_callback` was called
>>
>> Missing `.` at the end.
>
> Thanks.
>
>>
>>> + unsafe extern "C" fn queue_rq_callback(
>>> + _hctx: *mut bindings::blk_mq_hw_ctx,
>>> + bd: *const bindings::blk_mq_queue_data,
>>> + ) -> bindings::blk_status_t {
>>> + // SAFETY: `bd.rq` is valid as required by the safety requirement for
>>> + // this function.
>>> + let request = unsafe { &*(*bd).rq.cast::<Request<T>>() };
>>> +
>>> + // One refcount for the ARef, one for being in flight
>>> + request.wrapper_ref().refcount().store(2, Ordering::Relaxed);
>>> +
>>> + // SAFETY: We own a refcount that we took above. We pass that to `ARef`.
>>> + // By the safety requirements of this function, `request` is a valid
>>> + // `struct request` and the private data is properly initialized.
>>> + let rq = unsafe { Request::aref_from_raw((*bd).rq) };
>>
>> I think that you need to require that the request is alive at least
>> until `blk_mq_end_request` is called for the request (since at that
>> point all `ARef`s will be gone).
>> Also if this is not guaranteed, the safety requirements of
>> `AlwaysRefCounted` are violated (since the object can just disappear
>> even if it has refcount > 0 [the refcount refers to the Rust refcount in
>> the `RequestDataWrapper`, not the one in C]).
>
> Yea, for the last invariant of `Request`:
>
> /// * `self` is reference counted by atomic modification of
> /// self.wrapper_ref().refcount().
>
> I will add this to the safety comment at the call site:
>
> // - `rq` will be alive until `blk_mq_end_request` is called and is
> // reference counted by `ARef` until then.
Seems like you already want to use this here :)
[...]
>>> + /// This function is called by the C kernel. A pointer to this function is
>>> + /// installed in the `blk_mq_ops` vtable for the driver.
>>> + ///
>>> + /// # Safety
>>> + ///
>>> + /// This function may only be called by blk-mq C infrastructure. `set` must
`set` doesn't exist (`_set` does). You are also not using this
requirement.
>>> + /// point to an initialized `TagSet<T>`.
>>> + unsafe extern "C" fn init_request_callback(
>>> + _set: *mut bindings::blk_mq_tag_set,
>>> + rq: *mut bindings::request,
>>> + _hctx_idx: core::ffi::c_uint,
>>> + _numa_node: core::ffi::c_uint,
>>> + ) -> core::ffi::c_int {
>>> + from_result(|| {
>>> + // SAFETY: The `blk_mq_tag_set` invariants guarantee that all
>>> + // requests are allocated with extra memory for the request data.
>>
>> What guarantees that the right amount of memory has been allocated?
>> AFAIU that is guaranteed by the `TagSet` (but there is no invariant).
>
> It is by C API contract. `TagSet::try_new` (now `new`) writes
> `cmd_size` into the `struct blk_mq_tag_set`. That is picked up by
> `blk_mq_alloc_tag_set` to allocate the right amount of space for each request.
>
> The invariant here is on the C type. Perhaps the wording is wrong. I am
> not exactly sure how to express this. How about this:
>
> // SAFETY: We instructed `blk_mq_alloc_tag_set` to allocate requests
> // with extra memory for the request data when we called it in
> // `TagSet::new`.
I think you need a safety requirement on the function: `rq` points to a
valid `Request`. Then you could just use `Request::wrapper_ptr` instead
of the line below.
>>> + let pdu = unsafe { bindings::blk_mq_rq_to_pdu(rq) }.cast::<RequestDataWrapper>();
>>> +
>>> + // SAFETY: The refcount field is allocated but not initialized, this
>>> + // valid for write.
>>> + unsafe { RequestDataWrapper::refcount_ptr(pdu).write(AtomicU64::new(0)) };
>>> +
>>> + Ok(0)
>>> + })
>>> + }
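As a rough userspace sketch of the safety requirement suggested above
for `init_request_callback`: `RawRequest`, `Wrapper`, `Slot`,
`wrapper_ptr` and `init_request` are hypothetical simplifications, and
the `Slot` layout only mimics the idea that the private data directly
follows the request in the same allocation. It is not the real block
layer layout.

```rust
use std::sync::atomic::AtomicU64;

/// Hypothetical stand-in for the C `struct request`.
#[repr(C)]
pub struct RawRequest {
    pub tag: u64,
}

/// Hypothetical stand-in for `RequestDataWrapper`.
#[repr(C)]
pub struct Wrapper {
    pub refcount: AtomicU64,
}

/// A request allocated together with its private data.
#[repr(C)]
pub struct Slot {
    pub rq: RawRequest,
    pub pdu: Wrapper,
}

/// Returns a pointer to the private data that follows the request.
///
/// # Safety
///
/// `rq` must point to the start of a live `Slot` allocation and must be
/// derived from a pointer to the whole `Slot`.
pub unsafe fn wrapper_ptr(rq: *mut RawRequest) -> *mut Wrapper {
    // SAFETY: by the safety requirement, a `Wrapper` directly follows
    // `*rq` within the same allocation.
    unsafe { rq.add(1).cast::<Wrapper>() }
}

/// # Safety
///
/// `rq` must satisfy the requirements of `wrapper_ptr`. The private data
/// may still be uninitialized.
pub unsafe fn init_request(rq: *mut RawRequest) -> i32 {
    // SAFETY: forwarded from this function's safety requirement.
    let pdu = unsafe { wrapper_ptr(rq) };
    // SAFETY: the refcount field is allocated but not yet initialized, so
    // it is valid for writes; `write` does not read the old value.
    unsafe { std::ptr::addr_of_mut!((*pdu).refcount).write(AtomicU64::new(0)) };
    0
}
```

With the requirement stated on the function, the body only has to refer
back to it instead of arguing about how the tag set was allocated.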
>>
>> [...]
>>
>>> + /// Notify the block layer that a request is going to be processed now.
>>> + ///
>>> + /// The block layer uses this hook to do proper initializations such as
>>> + /// starting the timeout timer. It is a requirement that block device
>>> + /// drivers call this function when starting to process a request.
>>> + ///
>>> + /// # Safety
>>> + ///
>>> + /// The caller must have exclusive ownership of `self`, that is
>>> + /// `self.wrapper_ref().refcount() == 2`.
>>> + pub(crate) unsafe fn start_unchecked(this: &ARef<Self>) {
>>> + // SAFETY: By type invariant, `self.0` is a valid `struct request`. By
>>> + // existence of `&mut self` we have exclusive access.
>>
>> We don't have a `&mut self`. But the safety requirements ask for a
>> unique `ARef`.
>
> Thanks, I'll rephrase to:
>
> // SAFETY: By type invariant, `self.0` is a valid `struct request` and
> // we have exclusive access.
>
>>
>>> + unsafe { bindings::blk_mq_start_request(this.0.get()) };
>>> + }
>>> +
>>> + fn try_set_end(this: ARef<Self>) -> Result<ARef<Self>, ARef<Self>> {
>>> + // We can race with `TagSet::tag_to_rq`
>>> + match this.wrapper_ref().refcount().compare_exchange(
>>> + 2,
>>> + 0,
>>> + Ordering::Relaxed,
>>> + Ordering::Relaxed,
>>> + ) {
>>> + Err(_old) => Err(this),
>>> + Ok(_) => Ok(this),
>>> + }
>>> + }
>>> +
>>> + /// Notify the block layer that the request has been completed without errors.
>>> + ///
>>> + /// This function will return `Err` if `this` is not the only `ARef`
>>> + /// referencing the request.
>>> + pub fn end_ok(this: ARef<Self>) -> Result<(), ARef<Self>> {
>>> + let this = Self::try_set_end(this)?;
>>> + let request_ptr = this.0.get();
>>> + core::mem::forget(this);
>>> +
>>> + // SAFETY: By type invariant, `self.0` is a valid `struct request`. By
>>> + // existence of `&mut self` we have exclusive access.
>>
>> Same here, but in this case, the `ARef` is unique, since you called
>> `try_set_end`. You could make it a `# Guarantee` of `try_set_end`: "If
>> `Ok(aref)` is returned, then the `aref` is unique."
>
> Makes sense. I have not seen `# Guarantee` used anywhere. Do you have a link for that use?
Alice used it a couple of times, e.g. in [2]. I plan on putting it in the
safety standard.
[2]: https://lore.kernel.org/rust-for-linux/20230601134946.3887870-2-aliceryhl@google.com/
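For `try_set_end`, such a section could look something like the sketch
below. This is a simplified userspace version: the `Request` struct and
the `bool` return type are assumptions made for the example (the patch
passes an `ARef<Self>` through a `Result` instead), and the exact
wording of the guarantee is only a suggestion.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical stand-in for a request with a Rust-side refcount.
pub struct Request {
    pub refcount: AtomicU64,
}

/// Tries to mark the request as ended.
///
/// # Guarantee
///
/// If `true` is returned, the refcount was exactly 2 (one for the
/// caller's handle, one for being in flight), so the caller's handle was
/// unique, and the refcount is now 0. If `false` is returned, the
/// refcount is left unchanged.
pub fn try_set_end(rq: &Request) -> bool {
    rq.refcount
        .compare_exchange(2, 0, Ordering::Relaxed, Ordering::Relaxed)
        .is_ok()
}
```

A caller such as `end_ok` can then cite the guarantee in its `SAFETY`
comment instead of referring to a `&mut self` that does not exist.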
---
Cheers,
Benno