* fuse-io-uring: We need to keep the tag/index
@ 2024-10-02 21:54 Bernd Schubert
2024-10-03 9:50 ` Miklos Szeredi
0 siblings, 1 reply; 10+ messages in thread
From: Bernd Schubert @ 2024-10-02 21:54 UTC (permalink / raw)
To: linux-fsdevel@vger.kernel.org, Miklos Szeredi; +Cc: Joanne Koong, Josef Bacik
Hi Miklos,
in the discussion we had you asked to avoid an entry tag/index;
in the meantime I don't think we can.
In fuse_uring_cmd(), i.e. the function that gets
'struct io_uring_cmd *cmd', we need to identify the corresponding fuse
data structure ('struct fuse_ring_ent'). Basically the same as in
do_write(), which calls request_find() and does a list search.
With a large queue size that would be an issue (and even for smaller
queue sizes not beautiful, imho).
I'm now rewriting the code to create an index in
FUSE_URING_REQ_FETCH and return that later on with the request
to userspace. That needs to realloc the array, as we don't know the
exact queue size anymore.
Please let me know if you have another suggestion.
Thanks,
Bernd
PS: In code review Joanne didn't like 'tag' too much (I took that
over from ublk). I personally find index/idx too generic - any naming
suggestion is appreciated :)
^ permalink raw reply [flat|nested] 10+ messages in thread

* Re: fuse-io-uring: We need to keep the tag/index
  2024-10-02 21:54 fuse-io-uring: We need to keep the tag/index Bernd Schubert
@ 2024-10-03  9:50 ` Miklos Szeredi
  2024-10-03 10:10   ` Bernd Schubert
  0 siblings, 1 reply; 10+ messages in thread

From: Miklos Szeredi @ 2024-10-03 9:50 UTC (permalink / raw)
To: Bernd Schubert; +Cc: linux-fsdevel@vger.kernel.org, Joanne Koong, Josef Bacik

On Wed, 2 Oct 2024 at 23:54, Bernd Schubert <bernd.schubert@fastmail.fm> wrote:
>
> Hi Miklos,
>
> in the discussion we had you asked to avoid an entry tag/index;
> in the meantime I don't think we can.
> In fuse_uring_cmd(), i.e. the function that gets
> 'struct io_uring_cmd *cmd', we need to identify the corresponding fuse
> data structure ('struct fuse_ring_ent'). Basically the same as in
> do_write(), which calls request_find() and does a list search.
> With a large queue size that would be an issue (and even for smaller
> queue sizes not beautiful, imho).

I don't really understand the problem.

Is efficiency the issue? I agree that that would need to be
addressed. But that's not an interface question, it's an
implementation question, and there are plenty of potential solutions
for that (hash table, rbtree, etc.)

> I'm now rewriting the code to create an index in
> FUSE_URING_REQ_FETCH and return that later on with the request
> to userspace. That needs to realloc the array, as we don't know the
> exact queue size anymore.

It should not be an array, because dynamic arrays are complex and
inefficient. Rbtree sounds right, but I haven't thought much about
it.

Thanks,
Miklos
* Re: fuse-io-uring: We need to keep the tag/index
  2024-10-03  9:50 ` Miklos Szeredi
@ 2024-10-03 10:10   ` Bernd Schubert
  2024-10-03 12:02     ` Miklos Szeredi
  0 siblings, 1 reply; 10+ messages in thread

From: Bernd Schubert @ 2024-10-03 10:10 UTC (permalink / raw)
To: Miklos Szeredi; +Cc: linux-fsdevel@vger.kernel.org, Joanne Koong, Josef Bacik

On 10/3/24 11:50, Miklos Szeredi wrote:
> On Wed, 2 Oct 2024 at 23:54, Bernd Schubert <bernd.schubert@fastmail.fm> wrote:
>>
>> Hi Miklos,
>>
>> in the discussion we had you asked to avoid an entry tag/index;
>> in the meantime I don't think we can.
>> In fuse_uring_cmd(), i.e. the function that gets
>> 'struct io_uring_cmd *cmd', we need to identify the corresponding fuse
>> data structure ('struct fuse_ring_ent'). Basically the same as in
>> do_write(), which calls request_find() and does a list search.
>> With a large queue size that would be an issue (and even for smaller
>> queue sizes not beautiful, imho).
>
> I don't really understand the problem.
>
> Is efficiency the issue? I agree that that would need to be
> addressed. But that's not an interface question, it's an
> implementation question, and there are plenty of potential solutions
> for that (hash table, rbtree, etc.)
>
>> I'm now rewriting the code to create an index in
>> FUSE_URING_REQ_FETCH and return that later on with the request
>> to userspace. That needs to realloc the array, as we don't know the
>> exact queue size anymore.
>
> It should not be an array, because dynamic arrays are complex and
> inefficient. Rbtree sounds right, but I haven't thought much about
> it.

What I mean is that you wanted to get rid of the 'tag' - using any kind of
search means we still need it. I.e. we cannot just take the last list head
or tail and use that.
The array is only dynamic at initialization time. And why spend O(log N)
on a search instead of O(1)?
And I know that it is an implementation detail, I just would like to avoid
many rebasing rounds on these details.

Thanks,
Bernd
* Re: fuse-io-uring: We need to keep the tag/index
  2024-10-03 10:10 ` Bernd Schubert
@ 2024-10-03 12:02   ` Miklos Szeredi
  2024-10-03 13:19     ` Bernd Schubert
  0 siblings, 1 reply; 10+ messages in thread

From: Miklos Szeredi @ 2024-10-03 12:02 UTC (permalink / raw)
To: Bernd Schubert; +Cc: linux-fsdevel@vger.kernel.org, Joanne Koong, Josef Bacik

On Thu, 3 Oct 2024 at 12:10, Bernd Schubert <bernd.schubert@fastmail.fm> wrote:

> What I mean is that you wanted to get rid of the 'tag' - using any kind of
> search means we still need it. I.e. we cannot just take the last list head
> or tail and use that.
> The array is only dynamic at initialization time. And why spend O(log N)
> on a search instead of O(1)?

Because for sane queue depths they are essentially the same. This is
not where we can gain or lose any significant performance.

> And I know that it is an implementation detail, I just would like to avoid
> many rebasing rounds on these details.

I think the logical interface would be:

 - pass a userspace buffer to FETCH (you told me, but I don't remember
   why sqe->addr isn't suitable)

 - set sqe->user_data to an implementation dependent value, this could
   be just the userspace buffer, but it could be a request object

 - kernel allocates an idle request and queues it

 - request comes in, kernel takes a request from the idle queue and fills it

 - cqe->user_data is returned with the original sqe->user_data, which
   should be sufficient for the server to identify the request

 - process request, send COMMIT_AND_FETCH with the userspace buffer
   and user data

 - the kernel reads the header from the userspace buffer, finds
   outh->unique, finds and completes the request

 - then queues the request on the idle queue

 ...

What's wrong with that?

Thanks,
Miklos
* Re: fuse-io-uring: We need to keep the tag/index
  2024-10-03 12:02 ` Miklos Szeredi
@ 2024-10-03 13:19   ` Bernd Schubert
  2024-10-03 13:56     ` Bernd Schubert
  0 siblings, 1 reply; 10+ messages in thread

From: Bernd Schubert @ 2024-10-03 13:19 UTC (permalink / raw)
To: Miklos Szeredi; +Cc: linux-fsdevel@vger.kernel.org, Joanne Koong, Josef Bacik

On 10/3/24 14:02, Miklos Szeredi wrote:
> On Thu, 3 Oct 2024 at 12:10, Bernd Schubert <bernd.schubert@fastmail.fm> wrote:
>
>> What I mean is that you wanted to get rid of the 'tag' - using any kind of
>> search means we still need it. I.e. we cannot just take the last list head
>> or tail and use that.
>> The array is only dynamic at initialization time. And why spend O(log N)
>> on a search instead of O(1)?
>
> Because for sane queue depths they are essentially the same. This is
> not where we can gain or lose any significant performance.
>
>> And I know that it is an implementation detail, I just would like to avoid
>> many rebasing rounds on these details.
>
> I think the logical interface would be:
>
>  - pass a userspace buffer to FETCH (you told me, but I don't remember
>    why sqe->addr isn't suitable)

I think we could change to that now.

>  - set sqe->user_data to an implementation dependent value, this could
>    be just the userspace buffer, but it could be a request object

Libfuse already does this, it points to 'struct fuse_ring_ent', which then
points to the buffer. Maybe that could be optimized to have
'struct fuse_ring_ent' as part of the buffer.

>  - kernel allocates an idle request and queues it
>
>  - request comes in, kernel takes a request from the idle queue and fills it
>
>  - cqe->user_data is returned with the original sqe->user_data, which
>    should be sufficient for the server to identify the request
>
>  - process request, send COMMIT_AND_FETCH with the userspace buffer
>    and user data
>
>  - the kernel reads the header from the userspace buffer, finds
>    outh->unique, finds and completes the request
>
>  - then queues the request on the idle queue
>
>  ...
>
> What's wrong with that?

In my opinion it is a bit sad that we have to search
instead of just doing an array[ID] access. I certainly don't want to
rely on the current hashed list search; that only works reasonably
well because with the current threading model the number of requests
in flight is typically small.
And then rb entries also have pointers - it won't take
any less memory, rather the contrary.

At best one could argue that tear-down races are avoided, as one
has to do a search now. Although that was already handled by the
tear-down checks in fuse_uring_cmd().

Well, if you really prefer, I'm going to use an rbtree or maybe
better an xarray search. No reason for anything like that in libfuse, it
can just continue to type cast 'user_data'.

Thanks,
Bernd
* Re: fuse-io-uring: We need to keep the tag/index
  2024-10-03 13:19 ` Bernd Schubert
@ 2024-10-03 13:56   ` Bernd Schubert
  2024-10-03 14:02     ` Miklos Szeredi
  0 siblings, 1 reply; 10+ messages in thread

From: Bernd Schubert @ 2024-10-03 13:56 UTC (permalink / raw)
To: Miklos Szeredi; +Cc: linux-fsdevel@vger.kernel.org, Joanne Koong, Josef Bacik

On 10/3/24 15:19, Bernd Schubert wrote:
> On 10/3/24 14:02, Miklos Szeredi wrote:
>> On Thu, 3 Oct 2024 at 12:10, Bernd Schubert <bernd.schubert@fastmail.fm> wrote:
>>
>>> What I mean is that you wanted to get rid of the 'tag' - using any kind of
>>> search means we still need it. I.e. we cannot just take the last list head
>>> or tail and use that.
>>> The array is only dynamic at initialization time. And why spend O(log N)
>>> on a search instead of O(1)?
>>
>> Because for sane queue depths they are essentially the same. This is
>> not where we can gain or lose any significant performance.
>>
>>> And I know that it is an implementation detail, I just would like to avoid
>>> many rebasing rounds on these details.
>>
>> I think the logical interface would be:
>>
>>  - pass a userspace buffer to FETCH (you told me, but I don't remember
>>    why sqe->addr isn't suitable)
>
> I think we could change to that now.
>
>>  - set sqe->user_data to an implementation dependent value, this could
>>    be just the userspace buffer, but it could be a request object
>
> Libfuse already does this, it points to 'struct fuse_ring_ent', which then
> points to the buffer. Maybe that could be optimized to have
> 'struct fuse_ring_ent' as part of the buffer.
>
>>  - kernel allocates an idle request and queues it
>>
>>  - request comes in, kernel takes a request from the idle queue and fills it
>>
>>  - cqe->user_data is returned with the original sqe->user_data, which
>>    should be sufficient for the server to identify the request
>>
>>  - process request, send COMMIT_AND_FETCH with the userspace buffer
>>    and user data
>>
>>  - the kernel reads the header from the userspace buffer, finds
>>    outh->unique, finds and completes the request
>>
>>  - then queues the request on the idle queue
>>
>>  ...
>>
>> What's wrong with that?
>
> In my opinion it is a bit sad that we have to search
> instead of just doing an array[ID] access. I certainly don't want to
> rely on the current hashed list search; that only works reasonably
> well because with the current threading model the number of requests
> in flight is typically small.
> And then rb entries also have pointers - it won't take
> any less memory, rather the contrary.
>
> At best one could argue that tear-down races are avoided, as one
> has to do a search now. Although that was already handled by the
> tear-down checks in fuse_uring_cmd().
>
> Well, if you really prefer, I'm going to use an rbtree or maybe
> better an xarray search. No reason for anything like that in libfuse, it
> can just continue to type cast 'user_data'.

I'm inclined to do xarray, but still to assign an index (in the order of
received FUSE_URING_REQ_FETCH per queue) - should still give O(1).

Thanks,
Bernd
* Re: fuse-io-uring: We need to keep the tag/index
  2024-10-03 13:56 ` Bernd Schubert
@ 2024-10-03 14:02   ` Miklos Szeredi
  2024-10-03 14:04     ` Miklos Szeredi
  0 siblings, 1 reply; 10+ messages in thread

From: Miklos Szeredi @ 2024-10-03 14:02 UTC (permalink / raw)
To: Bernd Schubert; +Cc: linux-fsdevel@vger.kernel.org, Joanne Koong, Josef Bacik

On Thu, 3 Oct 2024 at 15:56, Bernd Schubert <bernd.schubert@fastmail.fm> wrote:

> I'm inclined to do xarray, but still to assign an index (in the order of
> received FUSE_URING_REQ_FETCH per queue) - should still give O(1).

xarray index is unsigned long, while req->unique is u64, which on 32
bit doesn't fit.

Thanks,
Miklos
* Re: fuse-io-uring: We need to keep the tag/index
  2024-10-03 14:02 ` Miklos Szeredi
@ 2024-10-03 14:04   ` Miklos Szeredi
  2024-10-03 14:09     ` Bernd Schubert
  0 siblings, 1 reply; 10+ messages in thread

From: Miklos Szeredi @ 2024-10-03 14:04 UTC (permalink / raw)
To: Bernd Schubert; +Cc: linux-fsdevel@vger.kernel.org, Joanne Koong, Josef Bacik

On Thu, 3 Oct 2024 at 16:02, Miklos Szeredi <miklos@szeredi.hu> wrote:
>
> On Thu, 3 Oct 2024 at 15:56, Bernd Schubert <bernd.schubert@fastmail.fm> wrote:
>
>> I'm inclined to do xarray, but still to assign an index (in the order of
>> received FUSE_URING_REQ_FETCH per queue) - should still give O(1).
>
> xarray index is unsigned long, while req->unique is u64, which on 32
> bit doesn't fit.

I suggest leaving this optimization to a later time. Less code to
review and less chance for bugs to creep in.

Thanks,
Miklos
* Re: fuse-io-uring: We need to keep the tag/index
  2024-10-03 14:04 ` Miklos Szeredi
@ 2024-10-03 14:09   ` Bernd Schubert
  2024-10-03 14:26     ` Miklos Szeredi
  0 siblings, 1 reply; 10+ messages in thread

From: Bernd Schubert @ 2024-10-03 14:09 UTC (permalink / raw)
To: Miklos Szeredi; +Cc: linux-fsdevel@vger.kernel.org, Joanne Koong, Josef Bacik

On 10/3/24 16:04, Miklos Szeredi wrote:
> On Thu, 3 Oct 2024 at 16:02, Miklos Szeredi <miklos@szeredi.hu> wrote:
>>
>> On Thu, 3 Oct 2024 at 15:56, Bernd Schubert <bernd.schubert@fastmail.fm> wrote:
>>
>>> I'm inclined to do xarray, but still to assign an index (in the order of
>>> received FUSE_URING_REQ_FETCH per queue) - should still give O(1).
>>
>> xarray index is unsigned long, while req->unique is u64, which on 32
>> bit doesn't fit.
>
> I suggest leaving this optimization to a later time. Less code to
> review and less chance for bugs to creep in.

Well, we need to search - either rbtree or xarray, please not the
existing request_find() with its list search - totally unsuitable for a
large number of requests in flight (think of a ring with 32768 entries).
I.e. that points to not using req->unique, but our own index.

Thanks,
Bernd
* Re: fuse-io-uring: We need to keep the tag/index
  2024-10-03 14:09 ` Bernd Schubert
@ 2024-10-03 14:26   ` Miklos Szeredi
  0 siblings, 0 replies; 10+ messages in thread

From: Miklos Szeredi @ 2024-10-03 14:26 UTC (permalink / raw)
To: Bernd Schubert; +Cc: linux-fsdevel@vger.kernel.org, Joanne Koong, Josef Bacik

On Thu, 3 Oct 2024 at 16:09, Bernd Schubert <bernd.schubert@fastmail.fm> wrote:

> Well, we need to search - either rbtree or xarray, please not the
> existing request_find() with its list search - totally unsuitable for a
> large number of requests in flight (think of a ring with 32768 entries).
> I.e. that points to not using req->unique, but our own index.

Even then it won't add a significant overhead (that's a tight loop with
~128 iterations, with likely everything in cache). But I won't stop you
from trying to improve this; if you can justify a separate index with
some great performance numbers, then I'll accept that. I just think it's
unlikely to be the case.

Thanks,
Miklos
end of thread, other threads: [~2024-10-03 14:26 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed
-- links below jump to the message on this page --)
2024-10-02 21:54 fuse-io-uring: We need to keep the tag/index Bernd Schubert
2024-10-03  9:50 ` Miklos Szeredi
2024-10-03 10:10   ` Bernd Schubert
2024-10-03 12:02     ` Miklos Szeredi
2024-10-03 13:19       ` Bernd Schubert
2024-10-03 13:56         ` Bernd Schubert
2024-10-03 14:02           ` Miklos Szeredi
2024-10-03 14:04             ` Miklos Szeredi
2024-10-03 14:09               ` Bernd Schubert
2024-10-03 14:26                 ` Miklos Szeredi
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).