* Re: [PATCH 2/4] io_uring/cmd: Introduce SOCKET_URING_OP_GETSOCKOPT
[not found] ` <20230724142237.358769-3-leitao@debian.org>
@ 2023-07-24 17:31 ` Stanislav Fomichev
2023-07-25 9:27 ` Breno Leitao
0 siblings, 1 reply; 8+ messages in thread
From: Stanislav Fomichev @ 2023-07-24 17:31 UTC (permalink / raw)
To: Breno Leitao
Cc: asml.silence, axboe, io-uring, netdev, davem, kuba, edumazet,
pabeni, linux-kernel, leit, bpf
On 07/24, Breno Leitao wrote:
> Add support for getsockopt command (SOCKET_URING_OP_GETSOCKOPT), where
> level is SOL_SOCKET. This is leveraging the sockptr_t infrastructure,
> where a sockptr_t is either userspace or kernel space, and handled as
> such.
>
> Function io_uring_cmd_getsockopt() is inspired by __sys_getsockopt().
We probably need to also have bpf bits in the new
io_uring_cmd_getsockopt?
> Differently from the getsockopt(2), the optlen field is not a userspace
> pointers. In getsockopt(2), userspace provides optlen pointer, which is
> overwritten by the kernel. In this implementation, userspace passes a
> u32, and the new value is returned in cqe->res. I.e., optlen is not a
> pointer.
>
> Important to say that userspace needs to keep the pointer alive until
> the CQE is completed.
>
> Signed-off-by: Breno Leitao <leitao@debian.org>
> ---
> include/uapi/linux/io_uring.h | 7 ++++++
> io_uring/uring_cmd.c | 43 +++++++++++++++++++++++++++++++++++
> 2 files changed, 50 insertions(+)
>
> diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
> index 9fc7195f25df..8152151080db 100644
> --- a/include/uapi/linux/io_uring.h
> +++ b/include/uapi/linux/io_uring.h
> @@ -43,6 +43,10 @@ struct io_uring_sqe {
> union {
> __u64 addr; /* pointer to buffer or iovecs */
> __u64 splice_off_in;
> + struct {
> + __u32 level;
> + __u32 optname;
> + };
> };
> __u32 len; /* buffer size or number of iovecs */
> union {
> @@ -79,6 +83,7 @@ struct io_uring_sqe {
> union {
> __s32 splice_fd_in;
> __u32 file_index;
> + __u32 optlen;
> struct {
> __u16 addr_len;
> __u16 __pad3[1];
> @@ -89,6 +94,7 @@ struct io_uring_sqe {
> __u64 addr3;
> __u64 __pad2[1];
> };
> + __u64 optval;
> /*
> * If the ring is initialized with IORING_SETUP_SQE128, then
> * this field is used for 80 bytes of arbitrary command data
> @@ -729,6 +735,7 @@ struct io_uring_recvmsg_out {
> enum {
> SOCKET_URING_OP_SIOCINQ = 0,
> SOCKET_URING_OP_SIOCOUTQ,
> + SOCKET_URING_OP_GETSOCKOPT,
> };
>
> #ifdef __cplusplus
> diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c
> index 8e7a03c1b20e..16c857cbf3b0 100644
> --- a/io_uring/uring_cmd.c
> +++ b/io_uring/uring_cmd.c
> @@ -166,6 +166,47 @@ int io_uring_cmd_import_fixed(u64 ubuf, unsigned long len, int rw,
> }
> EXPORT_SYMBOL_GPL(io_uring_cmd_import_fixed);
>
> +static inline int io_uring_cmd_getsockopt(struct socket *sock,
> + struct io_uring_cmd *cmd)
> +{
> + void __user *optval = u64_to_user_ptr(READ_ONCE(cmd->sqe->optval));
> + int optname = READ_ONCE(cmd->sqe->optname);
> + int optlen = READ_ONCE(cmd->sqe->optlen);
> + int level = READ_ONCE(cmd->sqe->level);
> + void *koptval;
> + int err;
> +
> + err = security_socket_getsockopt(sock, level, optname);
> + if (err)
> + return err;
> +
> + koptval = kmalloc(optlen, GFP_KERNEL);
> + if (!koptval)
> + return -ENOMEM;
> +
> + err = copy_from_user(koptval, optval, optlen);
> + if (err)
> + goto fail;
> +
> + err = -EOPNOTSUPP;
> + if (level == SOL_SOCKET) {
> + err = sk_getsockopt(sock->sk, level, optname,
> + KERNEL_SOCKPTR(koptval),
> + KERNEL_SOCKPTR(&optlen));
> + if (err)
> + goto fail;
> + }
> +
> + err = copy_to_user(optval, koptval, optlen);
> +
> +fail:
> + kfree(koptval);
> + if (err)
> + return err;
> + else
> + return optlen;
> +}
> +
> int io_uring_cmd_sock(struct io_uring_cmd *cmd, unsigned int issue_flags)
> {
> struct socket *sock = cmd->file->private_data;
> @@ -187,6 +228,8 @@ int io_uring_cmd_sock(struct io_uring_cmd *cmd, unsigned int issue_flags)
> if (ret)
> return ret;
> return arg;
> + case SOCKET_URING_OP_GETSOCKOPT:
> + return io_uring_cmd_getsockopt(sock, cmd);
> default:
> return -EOPNOTSUPP;
> }
> --
> 2.34.1
>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 2/4] io_uring/cmd: Introduce SOCKET_URING_OP_GETSOCKOPT
2023-07-24 17:31 ` [PATCH 2/4] io_uring/cmd: Introduce SOCKET_URING_OP_GETSOCKOPT Stanislav Fomichev
@ 2023-07-25 9:27 ` Breno Leitao
2023-07-25 17:02 ` Stanislav Fomichev
0 siblings, 1 reply; 8+ messages in thread
From: Breno Leitao @ 2023-07-25 9:27 UTC (permalink / raw)
To: Stanislav Fomichev
Cc: asml.silence, axboe, io-uring, netdev, davem, kuba, edumazet,
pabeni, linux-kernel, leit, bpf
On Mon, Jul 24, 2023 at 10:31:28AM -0700, Stanislav Fomichev wrote:
> On 07/24, Breno Leitao wrote:
> > Add support for getsockopt command (SOCKET_URING_OP_GETSOCKOPT), where
> > level is SOL_SOCKET. This is leveraging the sockptr_t infrastructure,
> > where a sockptr_t is either userspace or kernel space, and handled as
> > such.
> >
> > Function io_uring_cmd_getsockopt() is inspired by __sys_getsockopt().
>
> We probably need to also have bpf bits in the new
> io_uring_cmd_getsockopt?
It might be interesting to have the BPF hook for this function as
well, but I would like to do it in a following patch, so, I can
experiment with it better, if that is OK.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 2/4] io_uring/cmd: Introduce SOCKET_URING_OP_GETSOCKOPT
2023-07-25 9:27 ` Breno Leitao
@ 2023-07-25 17:02 ` Stanislav Fomichev
2023-07-25 17:56 ` Martin KaFai Lau
2023-07-28 17:03 ` Breno Leitao
0 siblings, 2 replies; 8+ messages in thread
From: Stanislav Fomichev @ 2023-07-25 17:02 UTC (permalink / raw)
To: Breno Leitao
Cc: asml.silence, axboe, io-uring, netdev, davem, kuba, edumazet,
pabeni, linux-kernel, leit, bpf, ast, martin.lau
On 07/25, Breno Leitao wrote:
> On Mon, Jul 24, 2023 at 10:31:28AM -0700, Stanislav Fomichev wrote:
> > On 07/24, Breno Leitao wrote:
> > > Add support for getsockopt command (SOCKET_URING_OP_GETSOCKOPT), where
> > > level is SOL_SOCKET. This is leveraging the sockptr_t infrastructure,
> > > where a sockptr_t is either userspace or kernel space, and handled as
> > > such.
> > >
> > > Function io_uring_cmd_getsockopt() is inspired by __sys_getsockopt().
> >
> > We probably need to also have bpf bits in the new
> > io_uring_cmd_getsockopt?
>
> It might be interesting to have the BPF hook for this function as
> well, but I would like to do it in a following patch, so, I can
> experiment with it better, if that is OK.
We are not using io_uring, so fine with me. However, having a way to bypass
get/setsockopt bpf might be problematic for some other heavy io_uring
users.
Lemme CC a bunch of Meta folks explicitly. I'm not sure what that state
of bpf support in io_uring.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 2/4] io_uring/cmd: Introduce SOCKET_URING_OP_GETSOCKOPT
2023-07-25 17:02 ` Stanislav Fomichev
@ 2023-07-25 17:56 ` Martin KaFai Lau
2023-07-26 9:26 ` Breno Leitao
2023-07-28 17:03 ` Breno Leitao
1 sibling, 1 reply; 8+ messages in thread
From: Martin KaFai Lau @ 2023-07-25 17:56 UTC (permalink / raw)
To: Stanislav Fomichev, Breno Leitao
Cc: asml.silence, axboe, io-uring, netdev, davem, kuba, edumazet,
pabeni, linux-kernel, leit, bpf, ast
On 7/25/23 10:02 AM, Stanislav Fomichev wrote:
> On 07/25, Breno Leitao wrote:
>> On Mon, Jul 24, 2023 at 10:31:28AM -0700, Stanislav Fomichev wrote:
>>> On 07/24, Breno Leitao wrote:
>>>> Add support for getsockopt command (SOCKET_URING_OP_GETSOCKOPT), where
>>>> level is SOL_SOCKET. This is leveraging the sockptr_t infrastructure,
>>>> where a sockptr_t is either userspace or kernel space, and handled as
>>>> such.
>>>>
>>>> Function io_uring_cmd_getsockopt() is inspired by __sys_getsockopt().
>>>
>>> We probably need to also have bpf bits in the new
>>> io_uring_cmd_getsockopt?
I also think this inconsistency behavior should be avoided.
>>
>> It might be interesting to have the BPF hook for this function as
>> well, but I would like to do it in a following patch, so, I can
>> experiment with it better, if that is OK.
>
> We are not using io_uring, so fine with me. However, having a way to bypass
> get/setsockopt bpf might be problematic for some other heavy io_uring
> users.
>
> Lemme CC a bunch of Meta folks explicitly. I'm not sure what that state
> of bpf support in io_uring.
We have use cases on the "cgroup/{g,s}etsockopt". It will be a surprise when the
user moves from the syscall {g,s}etsockopt to SOCKET_URING_OP_*SOCKOPT and
figured that the bpf handling is skipped.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 2/4] io_uring/cmd: Introduce SOCKET_URING_OP_GETSOCKOPT
2023-07-25 17:56 ` Martin KaFai Lau
@ 2023-07-26 9:26 ` Breno Leitao
0 siblings, 0 replies; 8+ messages in thread
From: Breno Leitao @ 2023-07-26 9:26 UTC (permalink / raw)
To: Martin KaFai Lau
Cc: Stanislav Fomichev, asml.silence, axboe, io-uring, netdev, davem,
kuba, edumazet, pabeni, linux-kernel, leit, bpf, ast
On Tue, Jul 25, 2023 at 10:56:23AM -0700, Martin KaFai Lau wrote:
> On 7/25/23 10:02 AM, Stanislav Fomichev wrote:
> > On 07/25, Breno Leitao wrote:
> > > On Mon, Jul 24, 2023 at 10:31:28AM -0700, Stanislav Fomichev wrote:
> > > > On 07/24, Breno Leitao wrote:
> > > > > Add support for getsockopt command (SOCKET_URING_OP_GETSOCKOPT), where
> > > > > level is SOL_SOCKET. This is leveraging the sockptr_t infrastructure,
> > > > > where a sockptr_t is either userspace or kernel space, and handled as
> > > > > such.
> > > > >
> > > > > Function io_uring_cmd_getsockopt() is inspired by __sys_getsockopt().
> > > >
> > > > We probably need to also have bpf bits in the new
> > > > io_uring_cmd_getsockopt?
>
> I also think this inconsistency behavior should be avoided.
>
> > >
> > > It might be interesting to have the BPF hook for this function as
> > > well, but I would like to do it in a following patch, so, I can
> > > experiment with it better, if that is OK.
> >
> > We are not using io_uring, so fine with me. However, having a way to bypass
> > get/setsockopt bpf might be problematic for some other heavy io_uring
> > users.
> >
> > Lemme CC a bunch of Meta folks explicitly. I'm not sure what that state
> > of bpf support in io_uring.
>
> We have use cases on the "cgroup/{g,s}etsockopt". It will be a surprise when
> the user moves from the syscall {g,s}etsockopt to SOCKET_URING_OP_*SOCKOPT
> and figured that the bpf handling is skipped.
Ok, I will add the BPF bits in the next revision then. Thanks for
clarifying it.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 2/4] io_uring/cmd: Introduce SOCKET_URING_OP_GETSOCKOPT
2023-07-25 17:02 ` Stanislav Fomichev
2023-07-25 17:56 ` Martin KaFai Lau
@ 2023-07-28 17:03 ` Breno Leitao
2023-07-28 18:07 ` Stanislav Fomichev
1 sibling, 1 reply; 8+ messages in thread
From: Breno Leitao @ 2023-07-28 17:03 UTC (permalink / raw)
To: Stanislav Fomichev
Cc: asml.silence, axboe, io-uring, netdev, davem, kuba, edumazet,
pabeni, linux-kernel, leit, bpf, ast, martin.lau
Hello Stanislav,
On Tue, Jul 25, 2023 at 10:02:40AM -0700, Stanislav Fomichev wrote:
> On 07/25, Breno Leitao wrote:
> > On Mon, Jul 24, 2023 at 10:31:28AM -0700, Stanislav Fomichev wrote:
> > > On 07/24, Breno Leitao wrote:
> > > > Add support for getsockopt command (SOCKET_URING_OP_GETSOCKOPT), where
> > > > level is SOL_SOCKET. This is leveraging the sockptr_t infrastructure,
> > > > where a sockptr_t is either userspace or kernel space, and handled as
> > > > such.
> > > >
> > > > Function io_uring_cmd_getsockopt() is inspired by __sys_getsockopt().
> > >
> > > We probably need to also have bpf bits in the new
> > > io_uring_cmd_getsockopt?
> >
> > It might be interesting to have the BPF hook for this function as
> > well, but I would like to do it in a following patch, so, I can
> > experiment with it better, if that is OK.
I spent smoe time looking at the problem, and I understand we want to
call something as BPF_CGROUP_RUN_PROG_{G,S}ETSOCKOPT() into
io_uring_cmd_{g,s}etsockopt().
Per the previous conversation with Williem,
io_uring_cmd_{g,s}etsockopt() should use optval as a user pointer (void __user
*optval), and optlen as a kernel integer (it comes as from the io_uring
SQE), such as:
void __user *optval = u64_to_user_ptr(READ_ONCE(cmd->sqe->optval));
int optlen = READ_ONCE(cmd->sqe->optlen);
Function BPF_CGROUP_RUN_PROG_GETSOCKOPT() calls
__cgroup_bpf_run_filter_getsockopt() which expects userpointer for
optlen and optval.
At the same time BPF_CGROUP_RUN_PROG_GETSOCKOPT_KERN() expects kernel
pointers for both optlen and optval.
In this current patchset, it has user pointer for optval and kernel value
for optlen. I.e., a third combination. So, none of the functions would
work properly, and we probably do not want to create another function.
I am wondering if it is a good idea to move
__cgroup_bpf_run_filter_getsockopt() to use sockptr_t, so, it will be
able to adapt to any combination.
Any feedback is appreciate.
Thanks!
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 2/4] io_uring/cmd: Introduce SOCKET_URING_OP_GETSOCKOPT
2023-07-28 17:03 ` Breno Leitao
@ 2023-07-28 18:07 ` Stanislav Fomichev
2023-07-31 10:13 ` Breno Leitao
0 siblings, 1 reply; 8+ messages in thread
From: Stanislav Fomichev @ 2023-07-28 18:07 UTC (permalink / raw)
To: Breno Leitao
Cc: asml.silence, axboe, io-uring, netdev, davem, kuba, edumazet,
pabeni, linux-kernel, leit, bpf, ast, martin.lau
On Fri, Jul 28, 2023 at 10:03 AM Breno Leitao <leitao@debian.org> wrote:
>
> Hello Stanislav,
>
> On Tue, Jul 25, 2023 at 10:02:40AM -0700, Stanislav Fomichev wrote:
> > On 07/25, Breno Leitao wrote:
> > > On Mon, Jul 24, 2023 at 10:31:28AM -0700, Stanislav Fomichev wrote:
> > > > On 07/24, Breno Leitao wrote:
> > > > > Add support for getsockopt command (SOCKET_URING_OP_GETSOCKOPT), where
> > > > > level is SOL_SOCKET. This is leveraging the sockptr_t infrastructure,
> > > > > where a sockptr_t is either userspace or kernel space, and handled as
> > > > > such.
> > > > >
> > > > > Function io_uring_cmd_getsockopt() is inspired by __sys_getsockopt().
> > > >
> > > > We probably need to also have bpf bits in the new
> > > > io_uring_cmd_getsockopt?
> > >
> > > It might be interesting to have the BPF hook for this function as
> > > well, but I would like to do it in a following patch, so, I can
> > > experiment with it better, if that is OK.
>
> I spent smoe time looking at the problem, and I understand we want to
> call something as BPF_CGROUP_RUN_PROG_{G,S}ETSOCKOPT() into
> io_uring_cmd_{g,s}etsockopt().
>
> Per the previous conversation with Williem,
> io_uring_cmd_{g,s}etsockopt() should use optval as a user pointer (void __user
> *optval), and optlen as a kernel integer (it comes as from the io_uring
> SQE), such as:
>
> void __user *optval = u64_to_user_ptr(READ_ONCE(cmd->sqe->optval));
> int optlen = READ_ONCE(cmd->sqe->optlen);
>
> Function BPF_CGROUP_RUN_PROG_GETSOCKOPT() calls
> __cgroup_bpf_run_filter_getsockopt() which expects userpointer for
> optlen and optval.
>
> At the same time BPF_CGROUP_RUN_PROG_GETSOCKOPT_KERN() expects kernel
> pointers for both optlen and optval.
>
> In this current patchset, it has user pointer for optval and kernel value
> for optlen. I.e., a third combination. So, none of the functions would
> work properly, and we probably do not want to create another function.
>
> I am wondering if it is a good idea to move
> __cgroup_bpf_run_filter_getsockopt() to use sockptr_t, so, it will be
> able to adapt to any combination.
Yeah, I think it makes sense. However, note that the intent of that
optlen being a __user pointer is to possibly write some (updated)
value back into the userspace.
Presumably, you'll pass that updated optlen into some io_uring
completion queue? (maybe a stupid question, not super familiar with
io_uring)
> Any feedback is appreciate.
> Thanks!
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 2/4] io_uring/cmd: Introduce SOCKET_URING_OP_GETSOCKOPT
2023-07-28 18:07 ` Stanislav Fomichev
@ 2023-07-31 10:13 ` Breno Leitao
0 siblings, 0 replies; 8+ messages in thread
From: Breno Leitao @ 2023-07-31 10:13 UTC (permalink / raw)
To: Stanislav Fomichev
Cc: asml.silence, axboe, io-uring, netdev, davem, kuba, edumazet,
pabeni, linux-kernel, leit, bpf, ast, martin.lau
On Fri, Jul 28, 2023 at 11:07:10AM -0700, Stanislav Fomichev wrote:
> On Fri, Jul 28, 2023 at 10:03 AM Breno Leitao <leitao@debian.org> wrote:
> >
> > Hello Stanislav,
> >
> > On Tue, Jul 25, 2023 at 10:02:40AM -0700, Stanislav Fomichev wrote:
> > > On 07/25, Breno Leitao wrote:
> > > > On Mon, Jul 24, 2023 at 10:31:28AM -0700, Stanislav Fomichev wrote:
> > > > > On 07/24, Breno Leitao wrote:
> > > > > > Add support for getsockopt command (SOCKET_URING_OP_GETSOCKOPT), where
> > > > > > level is SOL_SOCKET. This is leveraging the sockptr_t infrastructure,
> > > > > > where a sockptr_t is either userspace or kernel space, and handled as
> > > > > > such.
> > > > > >
> > > > > > Function io_uring_cmd_getsockopt() is inspired by __sys_getsockopt().
> > > > >
> > > > > We probably need to also have bpf bits in the new
> > > > > io_uring_cmd_getsockopt?
> > > >
> > > > It might be interesting to have the BPF hook for this function as
> > > > well, but I would like to do it in a following patch, so, I can
> > > > experiment with it better, if that is OK.
> >
> > I spent smoe time looking at the problem, and I understand we want to
> > call something as BPF_CGROUP_RUN_PROG_{G,S}ETSOCKOPT() into
> > io_uring_cmd_{g,s}etsockopt().
> >
> > Per the previous conversation with Williem,
> > io_uring_cmd_{g,s}etsockopt() should use optval as a user pointer (void __user
> > *optval), and optlen as a kernel integer (it comes as from the io_uring
> > SQE), such as:
> >
> > void __user *optval = u64_to_user_ptr(READ_ONCE(cmd->sqe->optval));
> > int optlen = READ_ONCE(cmd->sqe->optlen);
> >
> > Function BPF_CGROUP_RUN_PROG_GETSOCKOPT() calls
> > __cgroup_bpf_run_filter_getsockopt() which expects userpointer for
> > optlen and optval.
> >
> > At the same time BPF_CGROUP_RUN_PROG_GETSOCKOPT_KERN() expects kernel
> > pointers for both optlen and optval.
> >
> > In this current patchset, it has user pointer for optval and kernel value
> > for optlen. I.e., a third combination. So, none of the functions would
> > work properly, and we probably do not want to create another function.
> >
> > I am wondering if it is a good idea to move
> > __cgroup_bpf_run_filter_getsockopt() to use sockptr_t, so, it will be
> > able to adapt to any combination.
>
> Yeah, I think it makes sense. However, note that the intent of that
> optlen being a __user pointer is to possibly write some (updated)
> value back into the userspace.
> Presumably, you'll pass that updated optlen into some io_uring
> completion queue? (maybe a stupid question, not super familiar with
> io_uring)
On io_uring proposal, the optlen is part of the SQE for setsockopt().
You give a userpointer (optval) and set the optlen in the SQE->optlen.
For getsockopt(), the optlen is returned as a result of the operation,
in the CQE->res.
If you need more detail about it, I documented this behaviour in the
cover-letter (PS1):
https://lore.kernel.org/all/20230724142237.358769-1-leitao@debian.org/
Thanks for the feedback!
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2023-07-31 10:13 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20230724142237.358769-1-leitao@debian.org>
[not found] ` <20230724142237.358769-3-leitao@debian.org>
2023-07-24 17:31 ` [PATCH 2/4] io_uring/cmd: Introduce SOCKET_URING_OP_GETSOCKOPT Stanislav Fomichev
2023-07-25 9:27 ` Breno Leitao
2023-07-25 17:02 ` Stanislav Fomichev
2023-07-25 17:56 ` Martin KaFai Lau
2023-07-26 9:26 ` Breno Leitao
2023-07-28 17:03 ` Breno Leitao
2023-07-28 18:07 ` Stanislav Fomichev
2023-07-31 10:13 ` Breno Leitao
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).