* io_uring, IORING_OP_RECVMSG and ancillary data @ 2020-04-25 17:29 Andreas Smas 2020-04-25 20:23 ` Jens Axboe 0 siblings, 1 reply; 9+ messages in thread From: Andreas Smas @ 2020-04-25 17:29 UTC (permalink / raw) To: axboe, linux-fsdevel Hi, Tried to use io_uring with OP_RECVMSG with ancillary buffers (for my particular use case I'm using SO_TIMESTAMP for incoming UDP packets). These submissions fail with EINVAL due to the check in __sys_recvmsg_sock(). The following hack fixes the problem for me and I get valid timestamps back. Not suggesting this is the real fix as I'm not sure what the implications of this is. Any insight into this would be much appreciated. Thanks, Andreas diff --git a/net/socket.c b/net/socket.c index 2dd739fba866..689f41f4156e 100644 --- a/net/socket.c +++ b/net/socket.c @@ -2637,10 +2637,6 @@ long __sys_recvmsg_sock(struct socket *sock, struct msghdr *msg, struct user_msghdr __user *umsg, struct sockaddr __user *uaddr, unsigned int flags) { - /* disallow ancillary data requests from this path */ - if (msg->msg_control || msg->msg_controllen) - return -EINVAL; - return ____sys_recvmsg(sock, msg, umsg, uaddr, flags, 0); } ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: io_uring, IORING_OP_RECVMSG and ancillary data 2020-04-25 17:29 io_uring, IORING_OP_RECVMSG and ancillary data Andreas Smas @ 2020-04-25 20:23 ` Jens Axboe 2020-04-27 19:20 ` Jann Horn 0 siblings, 1 reply; 9+ messages in thread From: Jens Axboe @ 2020-04-25 20:23 UTC (permalink / raw) To: Andreas Smas, linux-fsdevel, io-uring, Jann Horn On 4/25/20 11:29 AM, Andreas Smas wrote: > Hi, > > Tried to use io_uring with OP_RECVMSG with ancillary buffers (for my > particular use case I'm using SO_TIMESTAMP for incoming UDP packets). > > These submissions fail with EINVAL due to the check in __sys_recvmsg_sock(). > > The following hack fixes the problem for me and I get valid timestamps > back. Not suggesting this is the real fix as I'm not sure what the > implications of this is. > > Any insight into this would be much appreciated. It was originally disabled because of a security issue, but I do think it's safe to enable again. Adding the io-uring list and Jann as well, leaving patch intact below. > diff --git a/net/socket.c b/net/socket.c > index 2dd739fba866..689f41f4156e 100644 > --- a/net/socket.c > +++ b/net/socket.c > @@ -2637,10 +2637,6 @@ long __sys_recvmsg_sock(struct socket *sock, > struct msghdr *msg, > struct user_msghdr __user *umsg, > struct sockaddr __user *uaddr, unsigned int flags) > { > - /* disallow ancillary data requests from this path */ > - if (msg->msg_control || msg->msg_controllen) > - return -EINVAL; > - > return ____sys_recvmsg(sock, msg, umsg, uaddr, flags, 0); > } > -- Jens Axboe ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: io_uring, IORING_OP_RECVMSG and ancillary data 2020-04-25 20:23 ` Jens Axboe @ 2020-04-27 19:20 ` Jann Horn 2020-04-27 19:29 ` Jens Axboe 0 siblings, 1 reply; 9+ messages in thread From: Jann Horn @ 2020-04-27 19:20 UTC (permalink / raw) To: Jens Axboe; +Cc: Andreas Smas, linux-fsdevel, io-uring On Sat, Apr 25, 2020 at 10:23 PM Jens Axboe <axboe@kernel.dk> wrote: > On 4/25/20 11:29 AM, Andreas Smas wrote: > > Hi, > > > > Tried to use io_uring with OP_RECVMSG with ancillary buffers (for my > > particular use case I'm using SO_TIMESTAMP for incoming UDP packets). > > > > These submissions fail with EINVAL due to the check in __sys_recvmsg_sock(). > > > > The following hack fixes the problem for me and I get valid timestamps > > back. Not suggesting this is the real fix as I'm not sure what the > > implications of this is. > > > > Any insight into this would be much appreciated. > > It was originally disabled because of a security issue, but I do think > it's safe to enable again. > > Adding the io-uring list and Jann as well, leaving patch intact below. > > > diff --git a/net/socket.c b/net/socket.c > > index 2dd739fba866..689f41f4156e 100644 > > --- a/net/socket.c > > +++ b/net/socket.c > > @@ -2637,10 +2637,6 @@ long __sys_recvmsg_sock(struct socket *sock, > > struct msghdr *msg, > > struct user_msghdr __user *umsg, > > struct sockaddr __user *uaddr, unsigned int flags) > > { > > - /* disallow ancillary data requests from this path */ > > - if (msg->msg_control || msg->msg_controllen) > > - return -EINVAL; > > - > > return ____sys_recvmsg(sock, msg, umsg, uaddr, flags, 0); > > } I think that's hard to get right. In particular, unix domain sockets can currently pass file descriptors in control data - so you'd need to set the file_table flag for recvmsg and sendmsg. And I'm not sure whether, to make this robust, there should be a whitelist of types of control messages that are permitted to be used with io_uring, or something like that... I think of ancillary buffers as being kind of like ioctl handlers in this regard. ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: io_uring, IORING_OP_RECVMSG and ancillary data 2020-04-27 19:20 ` Jann Horn @ 2020-04-27 19:29 ` Jens Axboe 2020-04-27 19:53 ` Jens Axboe 0 siblings, 1 reply; 9+ messages in thread From: Jens Axboe @ 2020-04-27 19:29 UTC (permalink / raw) To: Jann Horn; +Cc: Andreas Smas, linux-fsdevel, io-uring On 4/27/20 1:20 PM, Jann Horn wrote: > On Sat, Apr 25, 2020 at 10:23 PM Jens Axboe <axboe@kernel.dk> wrote: >> On 4/25/20 11:29 AM, Andreas Smas wrote: >>> Hi, >>> >>> Tried to use io_uring with OP_RECVMSG with ancillary buffers (for my >>> particular use case I'm using SO_TIMESTAMP for incoming UDP packets). >>> >>> These submissions fail with EINVAL due to the check in __sys_recvmsg_sock(). >>> >>> The following hack fixes the problem for me and I get valid timestamps >>> back. Not suggesting this is the real fix as I'm not sure what the >>> implications of this is. >>> >>> Any insight into this would be much appreciated. >> >> It was originally disabled because of a security issue, but I do think >> it's safe to enable again. >> >> Adding the io-uring list and Jann as well, leaving patch intact below. >> >>> diff --git a/net/socket.c b/net/socket.c >>> index 2dd739fba866..689f41f4156e 100644 >>> --- a/net/socket.c >>> +++ b/net/socket.c >>> @@ -2637,10 +2637,6 @@ long __sys_recvmsg_sock(struct socket *sock, >>> struct msghdr *msg, >>> struct user_msghdr __user *umsg, >>> struct sockaddr __user *uaddr, unsigned int flags) >>> { >>> - /* disallow ancillary data requests from this path */ >>> - if (msg->msg_control || msg->msg_controllen) >>> - return -EINVAL; >>> - >>> return ____sys_recvmsg(sock, msg, umsg, uaddr, flags, 0); >>> } > > I think that's hard to get right. In particular, unix domain sockets > can currently pass file descriptors in control data - so you'd need to > set the file_table flag for recvmsg and sendmsg. And I'm not sure > whether, to make this robust, there should be a whitelist of types of > control messages that are permitted to be used with io_uring, or > something like that... > > I think of ancillary buffers as being kind of like ioctl handlers in > this regard. Good point. I'll send out something that hopefully will be enough to be useful, whole not allowing anything randomly. -- Jens Axboe ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: io_uring, IORING_OP_RECVMSG and ancillary data 2020-04-27 19:29 ` Jens Axboe @ 2020-04-27 19:53 ` Jens Axboe 2020-04-27 20:03 ` Jann Horn 0 siblings, 1 reply; 9+ messages in thread From: Jens Axboe @ 2020-04-27 19:53 UTC (permalink / raw) To: Jann Horn; +Cc: Andreas Smas, linux-fsdevel, io-uring On 4/27/20 1:29 PM, Jens Axboe wrote: > On 4/27/20 1:20 PM, Jann Horn wrote: >> On Sat, Apr 25, 2020 at 10:23 PM Jens Axboe <axboe@kernel.dk> wrote: >>> On 4/25/20 11:29 AM, Andreas Smas wrote: >>>> Hi, >>>> >>>> Tried to use io_uring with OP_RECVMSG with ancillary buffers (for my >>>> particular use case I'm using SO_TIMESTAMP for incoming UDP packets). >>>> >>>> These submissions fail with EINVAL due to the check in __sys_recvmsg_sock(). >>>> >>>> The following hack fixes the problem for me and I get valid timestamps >>>> back. Not suggesting this is the real fix as I'm not sure what the >>>> implications of this is. >>>> >>>> Any insight into this would be much appreciated. >>> >>> It was originally disabled because of a security issue, but I do think >>> it's safe to enable again. >>> >>> Adding the io-uring list and Jann as well, leaving patch intact below. >>> >>>> diff --git a/net/socket.c b/net/socket.c >>>> index 2dd739fba866..689f41f4156e 100644 >>>> --- a/net/socket.c >>>> +++ b/net/socket.c >>>> @@ -2637,10 +2637,6 @@ long __sys_recvmsg_sock(struct socket *sock, >>>> struct msghdr *msg, >>>> struct user_msghdr __user *umsg, >>>> struct sockaddr __user *uaddr, unsigned int flags) >>>> { >>>> - /* disallow ancillary data requests from this path */ >>>> - if (msg->msg_control || msg->msg_controllen) >>>> - return -EINVAL; >>>> - >>>> return ____sys_recvmsg(sock, msg, umsg, uaddr, flags, 0); >>>> } >> >> I think that's hard to get right. In particular, unix domain sockets >> can currently pass file descriptors in control data - so you'd need to >> set the file_table flag for recvmsg and sendmsg. And I'm not sure >> whether, to make this robust, there should be a whitelist of types of >> control messages that are permitted to be used with io_uring, or >> something like that... >> >> I think of ancillary buffers as being kind of like ioctl handlers in >> this regard. > > Good point. I'll send out something that hopefully will be enough to > be useful, whole not allowing anything randomly. That things is a bit of a mess... How about something like this for starters? diff --git a/fs/io_uring.c b/fs/io_uring.c index 084dfade5cda..40aa5b38367e 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -3570,6 +3570,37 @@ static int io_sendmsg_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe) return ret; } +static bool __io_net_allow_cmsg(struct cmsghdr *cmsg) +{ + switch (cmsg->cmsg_level) { + case SOL_SOCKET: + if (cmsg->cmsg_type != SCM_RIGHTS && + cmsg->cmsg_type != SCM_CREDENTIALS) + return true; + return false; + case SOL_IP: + case SOL_TCP: + case SOL_IPV6: + case SOL_ICMPV6: + case SOL_SCTP: + return true; + default: + return false; + } +} + +static bool io_net_allow_cmsg(struct msghdr *msg) +{ + struct cmsghdr *cmsg; + + for_each_cmsghdr(cmsg, msg) { + if (!__io_net_allow_cmsg(cmsg)) + return false; + } + + return true; +} + static int io_sendmsg(struct io_kiocb *req, bool force_nonblock) { struct io_async_msghdr *kmsg = NULL; @@ -3604,6 +3635,11 @@ static int io_sendmsg(struct io_kiocb *req, bool force_nonblock) return ret; } + if (!io_net_allow_cmsg(&kmsg->msg)) { + ret = -EINVAL; + goto err; + } + flags = req->sr_msg.msg_flags; if (flags & MSG_DONTWAIT) req->flags |= REQ_F_NOWAIT; @@ -3617,6 +3653,7 @@ static int io_sendmsg(struct io_kiocb *req, bool force_nonblock) ret = -EINTR; } +err: if (kmsg && kmsg->iov != kmsg->fast_iov) kfree(kmsg->iov); req->flags &= ~REQ_F_NEED_CLEANUP; @@ -3840,6 +3877,11 @@ static int io_recvmsg(struct io_kiocb *req, bool force_nonblock) return ret; } + if (!io_net_allow_cmsg(&kmsg->msg)) { + ret = -EINVAL; + goto err; + } + kbuf = io_recv_buffer_select(req, &cflags, !force_nonblock); if (IS_ERR(kbuf)) { return PTR_ERR(kbuf); @@ -3863,6 +3905,7 @@ static int io_recvmsg(struct io_kiocb *req, bool force_nonblock) ret = -EINTR; } +err: if (kmsg && kmsg->iov != kmsg->fast_iov) kfree(kmsg->iov); req->flags &= ~REQ_F_NEED_CLEANUP; diff --git a/net/socket.c b/net/socket.c index 2dd739fba866..78cdf9a8cf73 100644 --- a/net/socket.c +++ b/net/socket.c @@ -2425,10 +2425,6 @@ static int ___sys_sendmsg(struct socket *sock, struct user_msghdr __user *msg, long __sys_sendmsg_sock(struct socket *sock, struct msghdr *msg, unsigned int flags) { - /* disallow ancillary data requests from this path */ - if (msg->msg_control || msg->msg_controllen) - return -EINVAL; - return ____sys_sendmsg(sock, msg, flags, NULL, 0); } @@ -2637,10 +2633,6 @@ long __sys_recvmsg_sock(struct socket *sock, struct msghdr *msg, struct user_msghdr __user *umsg, struct sockaddr __user *uaddr, unsigned int flags) { - /* disallow ancillary data requests from this path */ - if (msg->msg_control || msg->msg_controllen) - return -EINVAL; - return ____sys_recvmsg(sock, msg, umsg, uaddr, flags, 0); } -- Jens Axboe ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: io_uring, IORING_OP_RECVMSG and ancillary data 2020-04-27 19:53 ` Jens Axboe @ 2020-04-27 20:03 ` Jann Horn 2020-04-27 20:08 ` Jens Axboe 0 siblings, 1 reply; 9+ messages in thread From: Jann Horn @ 2020-04-27 20:03 UTC (permalink / raw) To: Jens Axboe; +Cc: Andreas Smas, linux-fsdevel, io-uring On Mon, Apr 27, 2020 at 9:53 PM Jens Axboe <axboe@kernel.dk> wrote: > On 4/27/20 1:29 PM, Jens Axboe wrote: > > On 4/27/20 1:20 PM, Jann Horn wrote: > >> On Sat, Apr 25, 2020 at 10:23 PM Jens Axboe <axboe@kernel.dk> wrote: > >>> On 4/25/20 11:29 AM, Andreas Smas wrote: > >>>> Hi, > >>>> > >>>> Tried to use io_uring with OP_RECVMSG with ancillary buffers (for my > >>>> particular use case I'm using SO_TIMESTAMP for incoming UDP packets). > >>>> > >>>> These submissions fail with EINVAL due to the check in __sys_recvmsg_sock(). > >>>> > >>>> The following hack fixes the problem for me and I get valid timestamps > >>>> back. Not suggesting this is the real fix as I'm not sure what the > >>>> implications of this is. > >>>> > >>>> Any insight into this would be much appreciated. > >>> > >>> It was originally disabled because of a security issue, but I do think > >>> it's safe to enable again. > >>> > >>> Adding the io-uring list and Jann as well, leaving patch intact below. > >>> > >>>> diff --git a/net/socket.c b/net/socket.c > >>>> index 2dd739fba866..689f41f4156e 100644 > >>>> --- a/net/socket.c > >>>> +++ b/net/socket.c > >>>> @@ -2637,10 +2637,6 @@ long __sys_recvmsg_sock(struct socket *sock, > >>>> struct msghdr *msg, > >>>> struct user_msghdr __user *umsg, > >>>> struct sockaddr __user *uaddr, unsigned int flags) > >>>> { > >>>> - /* disallow ancillary data requests from this path */ > >>>> - if (msg->msg_control || msg->msg_controllen) > >>>> - return -EINVAL; > >>>> - > >>>> return ____sys_recvmsg(sock, msg, umsg, uaddr, flags, 0); > >>>> } > >> > >> I think that's hard to get right. In particular, unix domain sockets > >> can currently pass file descriptors in control data - so you'd need to > >> set the file_table flag for recvmsg and sendmsg. And I'm not sure > >> whether, to make this robust, there should be a whitelist of types of > >> control messages that are permitted to be used with io_uring, or > >> something like that... > >> > >> I think of ancillary buffers as being kind of like ioctl handlers in > >> this regard. > > > > Good point. I'll send out something that hopefully will be enough to > > be useful, whole not allowing anything randomly. > > That things is a bit of a mess... How about something like this for > starters? [...] > +static bool io_net_allow_cmsg(struct msghdr *msg) > +{ > + struct cmsghdr *cmsg; > + > + for_each_cmsghdr(cmsg, msg) { Isn't this going to dereference a userspace pointer? ->msg_control has not been copied into the kernel at this point, right? > + if (!__io_net_allow_cmsg(cmsg)) > + return false; > + } [...] > @@ -3604,6 +3635,11 @@ static int io_sendmsg(struct io_kiocb *req, bool force_nonblock) [...] > + if (!io_net_allow_cmsg(&kmsg->msg)) { > + ret = -EINVAL; > + goto err; > + } [...] > @@ -3840,6 +3877,11 @@ static int io_recvmsg(struct io_kiocb *req, bool force_nonblock) [...] > + if (!io_net_allow_cmsg(&kmsg->msg)) { > + ret = -EINVAL; > + goto err; > + } > + ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: io_uring, IORING_OP_RECVMSG and ancillary data 2020-04-27 20:03 ` Jann Horn @ 2020-04-27 20:08 ` Jens Axboe 2020-04-27 20:10 ` Christoph Hellwig 0 siblings, 1 reply; 9+ messages in thread From: Jens Axboe @ 2020-04-27 20:08 UTC (permalink / raw) To: Jann Horn; +Cc: Andreas Smas, linux-fsdevel, io-uring On 4/27/20 2:03 PM, Jann Horn wrote: > On Mon, Apr 27, 2020 at 9:53 PM Jens Axboe <axboe@kernel.dk> wrote: >> On 4/27/20 1:29 PM, Jens Axboe wrote: >>> On 4/27/20 1:20 PM, Jann Horn wrote: >>>> On Sat, Apr 25, 2020 at 10:23 PM Jens Axboe <axboe@kernel.dk> wrote: >>>>> On 4/25/20 11:29 AM, Andreas Smas wrote: >>>>>> Hi, >>>>>> >>>>>> Tried to use io_uring with OP_RECVMSG with ancillary buffers (for my >>>>>> particular use case I'm using SO_TIMESTAMP for incoming UDP packets). >>>>>> >>>>>> These submissions fail with EINVAL due to the check in __sys_recvmsg_sock(). >>>>>> >>>>>> The following hack fixes the problem for me and I get valid timestamps >>>>>> back. Not suggesting this is the real fix as I'm not sure what the >>>>>> implications of this is. >>>>>> >>>>>> Any insight into this would be much appreciated. >>>>> >>>>> It was originally disabled because of a security issue, but I do think >>>>> it's safe to enable again. >>>>> >>>>> Adding the io-uring list and Jann as well, leaving patch intact below. >>>>> >>>>>> diff --git a/net/socket.c b/net/socket.c >>>>>> index 2dd739fba866..689f41f4156e 100644 >>>>>> --- a/net/socket.c >>>>>> +++ b/net/socket.c >>>>>> @@ -2637,10 +2637,6 @@ long __sys_recvmsg_sock(struct socket *sock, >>>>>> struct msghdr *msg, >>>>>> struct user_msghdr __user *umsg, >>>>>> struct sockaddr __user *uaddr, unsigned int flags) >>>>>> { >>>>>> - /* disallow ancillary data requests from this path */ >>>>>> - if (msg->msg_control || msg->msg_controllen) >>>>>> - return -EINVAL; >>>>>> - >>>>>> return ____sys_recvmsg(sock, msg, umsg, uaddr, flags, 0); >>>>>> } >>>> >>>> I think that's hard to get right. In particular, unix domain sockets >>>> can currently pass file descriptors in control data - so you'd need to >>>> set the file_table flag for recvmsg and sendmsg. And I'm not sure >>>> whether, to make this robust, there should be a whitelist of types of >>>> control messages that are permitted to be used with io_uring, or >>>> something like that... >>>> >>>> I think of ancillary buffers as being kind of like ioctl handlers in >>>> this regard. >>> >>> Good point. I'll send out something that hopefully will be enough to >>> be useful, whole not allowing anything randomly. >> >> That things is a bit of a mess... How about something like this for >> starters? > [...] >> +static bool io_net_allow_cmsg(struct msghdr *msg) >> +{ >> + struct cmsghdr *cmsg; >> + >> + for_each_cmsghdr(cmsg, msg) { > > Isn't this going to dereference a userspace pointer? ->msg_control has > not been copied into the kernel at this point, right? Possibly... Totally untested, maybe I forgot to mention that :-) I'll check. The question was more "in principle" if this was a viable approach. The whole cmsg_type and cmsg_level is really a mess. -- Jens Axboe ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: io_uring, IORING_OP_RECVMSG and ancillary data 2020-04-27 20:08 ` Jens Axboe @ 2020-04-27 20:10 ` Christoph Hellwig 2020-04-27 20:13 ` Jens Axboe 0 siblings, 1 reply; 9+ messages in thread From: Christoph Hellwig @ 2020-04-27 20:10 UTC (permalink / raw) To: Jens Axboe; +Cc: Jann Horn, Andreas Smas, linux-fsdevel, io-uring On Mon, Apr 27, 2020 at 02:08:25PM -0600, Jens Axboe wrote: > Possibly... Totally untested, maybe I forgot to mention that :-) > I'll check. > > The question was more "in principle" if this was a viable approach. The > whole cmsg_type and cmsg_level is really a mess. FYI, I have a series in the works to sort out the set_fs and casting to/from __user mess around msg_control. It needs a little more work, but hopefully I can find some time the next days. ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: io_uring, IORING_OP_RECVMSG and ancillary data 2020-04-27 20:10 ` Christoph Hellwig @ 2020-04-27 20:13 ` Jens Axboe 0 siblings, 0 replies; 9+ messages in thread From: Jens Axboe @ 2020-04-27 20:13 UTC (permalink / raw) To: Christoph Hellwig; +Cc: Jann Horn, Andreas Smas, linux-fsdevel, io-uring On 4/27/20 2:10 PM, Christoph Hellwig wrote: > On Mon, Apr 27, 2020 at 02:08:25PM -0600, Jens Axboe wrote: >> Possibly... Totally untested, maybe I forgot to mention that :-) >> I'll check. >> >> The question was more "in principle" if this was a viable approach. The >> whole cmsg_type and cmsg_level is really a mess. > > FYI, I have a series in the works to sort out the set_fs and > casting to/from __user mess around msg_control. It needs a little > more work, but hopefully I can find some time the next days. That'd be great - looking at it right now, I'd need to refactor ____sys_sendmsg (and ditto receive side) to poke at cmsg before allowing us to proceed. I'll wait for you to post your series and base it on top of that, if appropriate. -- Jens Axboe ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2020-04-27 20:13 UTC | newest] Thread overview: 9+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2020-04-25 17:29 io_uring, IORING_OP_RECVMSG and ancillary data Andreas Smas 2020-04-25 20:23 ` Jens Axboe 2020-04-27 19:20 ` Jann Horn 2020-04-27 19:29 ` Jens Axboe 2020-04-27 19:53 ` Jens Axboe 2020-04-27 20:03 ` Jann Horn 2020-04-27 20:08 ` Jens Axboe 2020-04-27 20:10 ` Christoph Hellwig 2020-04-27 20:13 ` Jens Axboe
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).