* Re: [RFC PATCH v2 1/8] landlock: Fix non-TCP sockets restriction [not found] ` <62336067-18c2-3493-d0ec-6dd6a6d3a1b5@huawei-partners.com> @ 2024-12-12 18:43 ` Mickaël Salaün 2024-12-13 18:19 ` Mikhail Ivanov 0 siblings, 1 reply; 18+ messages in thread From: Mickaël Salaün @ 2024-12-12 18:43 UTC (permalink / raw) To: Mikhail Ivanov Cc: Matthieu Baerts, gnoack, willemdebruijn.kernel, matthieu, linux-security-module, netdev, netfilter-devel, yusongping, artem.kuzin, konstantin.meskhidze, MPTCP Linux, linux-nfs On Thu, Oct 31, 2024 at 07:21:44PM +0300, Mikhail Ivanov wrote: > On 10/18/2024 9:08 PM, Mickaël Salaün wrote: > > On Thu, Oct 17, 2024 at 02:59:48PM +0200, Matthieu Baerts wrote: > > > Hi Mikhail and Landlock maintainers, > > > > > > +cc MPTCP list. > > > > Thanks, we should include this list in the next series. > > > > > > > > On 17/10/2024 13:04, Mikhail Ivanov wrote: > > > > Do not check TCP access right if socket protocol is not IPPROTO_TCP. > > > > LANDLOCK_ACCESS_NET_BIND_TCP and LANDLOCK_ACCESS_NET_CONNECT_TCP > > > > should not restrict bind(2) and connect(2) for non-TCP protocols > > > > (SCTP, MPTCP, SMC). > > > > > > Thank you for the patch! > > > > > > I'm part of the MPTCP team, and I'm wondering if MPTCP should not be > > > treated like TCP here. MPTCP is an extension to TCP: on the wire, we can > > > see TCP packets with extra TCP options. On Linux, there is indeed a > > > dedicated MPTCP socket (IPPROTO_MPTCP), but that's just internal, > > > because we needed such dedicated socket to talk to the userspace. > > > > > > I don't know Landlock well, but I think it is important to know that an > > > MPTCP socket can be used to discuss with "plain" TCP packets: the kernel > > > will do a fallback to "plain" TCP if MPTCP is not supported by the other > > > peer or by a middlebox. It means that with this patch, if TCP is blocked > > > by Landlock, someone can simply force an application to create an MPTCP > > > socket -- e.g. 
via LD_PRELOAD -- and bypass the restrictions. It will > > > certainly work, even when connecting to a peer not supporting MPTCP. > > > > > > Please note that I'm not against this modification -- especially here > > > when we remove restrictions around MPTCP sockets :) -- I'm just saying > > > it might be less confusing for users if MPTCP is considered as being > > > part of TCP. A bit similar to what someone would do with a firewall: if > > > TCP is blocked, MPTCP is blocked as well. > > > > Good point! I don't know well MPTCP but I think you're right. Given > > it's close relationship with TCP and the fallback mechanism, it would > > make sense for users to not make a difference and it would avoid bypass > > of misleading restrictions. Moreover the Landlock rules are simple and > > only control TCP ports, not peer addresses, which seems to be the main > > evolution of MPTCP. > > > > > > > I understand that a future goal might probably be to have dedicated > > > restrictions for MPTCP and the other stream protocols (and/or for all > > > stream protocols like it was before this patch), but in the meantime, it > > > might be less confusing considering MPTCP as being part of TCP (I'm not > > > sure about the other stream protocols). > > > > We need to take a closer look at the other stream protocols indeed. > Hello! Sorry for the late reply, I was on a small business trip. > > Thanks a lot for this catch, without doubt MPTCP should be controlled > with TCP access rights. > > In that case, we should reconsider current semantics of TCP control. > > Currently, it looks like this: > * LANDLOCK_ACCESS_NET_BIND_TCP: Bind a TCP socket to a local port. > * LANDLOCK_ACCESS_NET_CONNECT_TCP: Connect an active TCP socket to a > remote port. > > According to these definitions only TCP sockets should be restricted and > this is already provided by Landlock (considering observing commit) > (assuming that "TCP socket" := user space socket of IPPROTO_TCP > protocol). 
> > AFAICS the two objectives of TCP access rights are to control > (1) which ports can be used for sending or receiving TCP packets > (including SYN, ACK or other service packets). > (2) which ports can be used to establish TCP connection (performed by > kernel network stack on server or client side). > > In most cases denying (2) cause denying (1). Sending or receiving TCP > packets without initial 3-way handshake is only possible on RAW [1] or > PACKET [2] sockets. Usage of such sockets requires root privilligies, so > there is no point to control them with Landlock. I agree. > > Therefore Landlock should only take care about case (2). For now > (please correct me if I'm wrong), we only considered control of > connection performed on user space plain TCP sockets (created with > IPPROTO_TCP). Correct. Landlock is dedicated to sandbox user space processes and the related access rights should focus on restricting what is possible through syscalls (mainly). > > TCP kernel sockets are generally used in the following ways: > * in a couple of other user space protocols (MPTCP, SMC, RDS) > * in a few network filesystems (e.g. NFS communication over TCP) > > For the second case TCP connection is currently not restricted by > Landlock. This approach is may be correct, since NFS should not have > access to a plain TCP communication and TCP restriction of NFS may > be too implicit. Nevertheless, I think that restriction via current > access rights should be considered. I'm not sure what you mean here. I'm not familiar with NFS in the kernel. AFAIK there is no socket type for NFS. > > For the first case, each protocol use TCP differently, so they should > be considered separately. Yes, for user-accessible protocols. > > In the case of MPTCP TCP internal sockets are used to establish > connection and exchange data between two network interfaces. MPTCP > allows to have multiple TCP connections between two MPTCP sockets by > connecting different network interfaces (e.g. 
WIFI and 3G). > > Shared Memory Communication is a protocol that allows TCP applications > transparently use RDMA for communication [3]. TCP internal socket is > used to exchange service CLC messages when establishing SMC connection > (which seems harmless for sandboxing) and for communication in the case > of fallback. Fallback happens only if RDMA communication became > impossible (e.g. if RDMA capable RNIC card went down on host or peer > side). So, preventing TCP communication may be achieved by controlling > fallback mechanism. > > Reliable Datagram Socket is connectionless protocol implemented by > Oracle [4]. It uses TCP stack or Infiniband to reliably deliever > datagrams. For every sendmsg(2), recvmsg(2) it establishes TCP > connection and use it to deliever splitted message. > > In comparison with previous protocols, RDS sockets cannot be binded or > connected to special TCP ports (e.g. with bind(2), connect(2)). 16385 > port is assigned to receiving side and sending side is binded to the > port allocated by the kernel (by using zero as port number). > > It may be useful to restrict RDS-over-TCP with current access rights, > since it allows to perform TCP communication from user-space. But it > would be only possible to fully allow or deny sending/receiving > (since used ports are not controlled from user space). Thanks for these explanations. The ability to fine-control specific protocol operations (e.g. connect, bind) can be useful for widely used protocol such as TCP and UDP (or if someone wants to implement it for another protocol), but this approach would not scale with all protocols because of their own semantic and the development efforts. The Landlock access rights should be explicit, and we should also be able to deny access to a whole set of protocols. This should be partially possible with your socket creation patch series. I guess the remaining cases would be to cover transformation of one socket type to another. 
I think we could control such transformation by building on top of the socket creation control foundation: instead of controlling socket creation, add a new access right to control socket transformation. What do you think? > > Restricting any TCP connection in the kernel is probably simplest > design, but we should consider above cases to provide the most useful > one. > > [1] https://man7.org/linux/man-pages/man7/raw.7.html > [2] https://man7.org/linux/man-pages/man7/packet.7.html > [3] https://datatracker.ietf.org/doc/html/rfc7609 > [4] https://oss.oracle.com/projects/rds/dist/documentation/rds-3.1-spec.html > > > > > > > > > > > > > sk_is_tcp() is used for this to check address family of the socket > > > > before doing INET-specific address length validation. This is required > > > > for error consistency. > > > > > > > > Closes: https://github.com/landlock-lsm/linux/issues/40 > > > > Fixes: fff69fb03dde ("landlock: Support network rules with TCP bind and connect") > > > > > > I don't know how fixes are considered in Landlock, but should this patch > > > be considered as a fix? It might be surprising for someone who thought > > > all "stream" connections were blocked to have them unblocked when > > > updating to a minor kernel version, no? > > > > Indeed. The main issue was with the semantic/definition of > > LANDLOCK_ACCESS_FS_NET_{CONNECT,BIND}_TCP. We need to synchronize the > > code with the documentation, one way or the other, preferably following > > the principle of least astonishment. > > > > > > > > (Personally, I would understand such behaviour change when upgrading to > > > a major version, and still, maybe only if there were alternatives to > > > > This "fix" needs to be backported, but we're not clear yet on what it > > should be. :) > > > > > continue having the same behaviour, e.g. a way to restrict all stream > > > sockets the same way, or something per stream socket. 
But that's just me
> > > :) )
> >
> > The documentation and the initial idea were to control TCP bind and
> > connect. The kernel implementation does more than that, so we need to
> > synchronize somehow.
> >
> > >
> > > Cheers,
> > > Matt
> > > --
> > > Sponsored by the NGI0 Core fund.
> > >
> > >
>

^ permalink raw reply [flat|nested] 18+ messages in thread
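
The semantic question raised in this message, namely whether the TCP access rights apply only to IPPROTO_TCP sockets or also to MPTCP, can be illustrated with a small user-space model. This is only a sketch under assumptions: the model_* helpers are hypothetical names, not Landlock or kernel functions, and they mimic the kind of family/type/protocol test discussed for the patch rather than reproduce its code. The fallback value 262 for IPPROTO_MPTCP is the Linux UAPI value, used only when the libc headers do not provide it.

```c
#include <assert.h>
#include <stdbool.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Older libc headers may lack IPPROTO_MPTCP; 262 is the Linux UAPI value. */
#ifndef IPPROTO_MPTCP
#define IPPROTO_MPTCP 262
#endif

/* Hypothetical helper: true for IPv4/IPv6 stream sockets. */
static inline bool model_is_inet_stream(int family, int type)
{
	return (family == AF_INET || family == AF_INET6) && type == SOCK_STREAM;
}

/*
 * Hypothetical model of the semantics proposed by the patch: only plain TCP
 * sockets are subject to the TCP access rights. Protocol 0 selects the
 * default protocol of an inet stream socket, which is TCP. "type" is
 * expected without SOCK_NONBLOCK or SOCK_CLOEXEC flags.
 */
static inline bool model_is_plain_tcp(int family, int type, int protocol)
{
	return model_is_inet_stream(family, type) &&
	       (protocol == 0 || protocol == IPPROTO_TCP);
}

/* Hypothetical variant following the review comment: MPTCP counts as TCP. */
static inline bool model_is_tcp_or_mptcp(int family, int type, int protocol)
{
	return model_is_plain_tcp(family, type, protocol) ||
	       (model_is_inet_stream(family, type) && protocol == IPPROTO_MPTCP);
}

/* Usage example: the two models differ only for MPTCP sockets. */
static inline void model_self_check(void)
{
	assert(model_is_plain_tcp(AF_INET, SOCK_STREAM, IPPROTO_TCP));
	assert(!model_is_plain_tcp(AF_INET, SOCK_STREAM, IPPROTO_MPTCP));
	assert(model_is_tcp_or_mptcp(AF_INET, SOCK_STREAM, IPPROTO_MPTCP));
}
```

Under the first model an MPTCP socket escapes the TCP rules, which is the bypass described above; under the second it is covered like any TCP socket.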
* Re: [RFC PATCH v2 1/8] landlock: Fix non-TCP sockets restriction 2024-12-12 18:43 ` [RFC PATCH v2 1/8] landlock: Fix non-TCP sockets restriction Mickaël Salaün @ 2024-12-13 18:19 ` Mikhail Ivanov 2025-01-24 15:02 ` Mickaël Salaün 0 siblings, 1 reply; 18+ messages in thread From: Mikhail Ivanov @ 2024-12-13 18:19 UTC (permalink / raw) To: Mickaël Salaün Cc: Matthieu Baerts, gnoack, willemdebruijn.kernel, matthieu, linux-security-module, netdev, netfilter-devel, yusongping, artem.kuzin, konstantin.meskhidze, MPTCP Linux, linux-nfs On 12/12/2024 9:43 PM, Mickaël Salaün wrote: > On Thu, Oct 31, 2024 at 07:21:44PM +0300, Mikhail Ivanov wrote: >> On 10/18/2024 9:08 PM, Mickaël Salaün wrote: >>> On Thu, Oct 17, 2024 at 02:59:48PM +0200, Matthieu Baerts wrote: >>>> Hi Mikhail and Landlock maintainers, >>>> >>>> +cc MPTCP list. >>> >>> Thanks, we should include this list in the next series. >>> >>>> >>>> On 17/10/2024 13:04, Mikhail Ivanov wrote: >>>>> Do not check TCP access right if socket protocol is not IPPROTO_TCP. >>>>> LANDLOCK_ACCESS_NET_BIND_TCP and LANDLOCK_ACCESS_NET_CONNECT_TCP >>>>> should not restrict bind(2) and connect(2) for non-TCP protocols >>>>> (SCTP, MPTCP, SMC). >>>> >>>> Thank you for the patch! >>>> >>>> I'm part of the MPTCP team, and I'm wondering if MPTCP should not be >>>> treated like TCP here. MPTCP is an extension to TCP: on the wire, we can >>>> see TCP packets with extra TCP options. On Linux, there is indeed a >>>> dedicated MPTCP socket (IPPROTO_MPTCP), but that's just internal, >>>> because we needed such dedicated socket to talk to the userspace. >>>> >>>> I don't know Landlock well, but I think it is important to know that an >>>> MPTCP socket can be used to discuss with "plain" TCP packets: the kernel >>>> will do a fallback to "plain" TCP if MPTCP is not supported by the other >>>> peer or by a middlebox. 
It means that with this patch, if TCP is blocked >>>> by Landlock, someone can simply force an application to create an MPTCP >>>> socket -- e.g. via LD_PRELOAD -- and bypass the restrictions. It will >>>> certainly work, even when connecting to a peer not supporting MPTCP. >>>> >>>> Please note that I'm not against this modification -- especially here >>>> when we remove restrictions around MPTCP sockets :) -- I'm just saying >>>> it might be less confusing for users if MPTCP is considered as being >>>> part of TCP. A bit similar to what someone would do with a firewall: if >>>> TCP is blocked, MPTCP is blocked as well. >>> >>> Good point! I don't know well MPTCP but I think you're right. Given >>> it's close relationship with TCP and the fallback mechanism, it would >>> make sense for users to not make a difference and it would avoid bypass >>> of misleading restrictions. Moreover the Landlock rules are simple and >>> only control TCP ports, not peer addresses, which seems to be the main >>> evolution of MPTCP. > >>>> >>>> I understand that a future goal might probably be to have dedicated >>>> restrictions for MPTCP and the other stream protocols (and/or for all >>>> stream protocols like it was before this patch), but in the meantime, it >>>> might be less confusing considering MPTCP as being part of TCP (I'm not >>>> sure about the other stream protocols). >>> >>> We need to take a closer look at the other stream protocols indeed. >> Hello! Sorry for the late reply, I was on a small business trip. >> >> Thanks a lot for this catch, without doubt MPTCP should be controlled >> with TCP access rights. >> >> In that case, we should reconsider current semantics of TCP control. >> >> Currently, it looks like this: >> * LANDLOCK_ACCESS_NET_BIND_TCP: Bind a TCP socket to a local port. >> * LANDLOCK_ACCESS_NET_CONNECT_TCP: Connect an active TCP socket to a >> remote port. 
>> >> According to these definitions only TCP sockets should be restricted and >> this is already provided by Landlock (considering observing commit) >> (assuming that "TCP socket" := user space socket of IPPROTO_TCP >> protocol). >> >> AFAICS the two objectives of TCP access rights are to control >> (1) which ports can be used for sending or receiving TCP packets >> (including SYN, ACK or other service packets). >> (2) which ports can be used to establish TCP connection (performed by >> kernel network stack on server or client side). >> >> In most cases denying (2) cause denying (1). Sending or receiving TCP >> packets without initial 3-way handshake is only possible on RAW [1] or >> PACKET [2] sockets. Usage of such sockets requires root privilligies, so >> there is no point to control them with Landlock. > > I agree. > >> >> Therefore Landlock should only take care about case (2). For now >> (please correct me if I'm wrong), we only considered control of >> connection performed on user space plain TCP sockets (created with >> IPPROTO_TCP). > > Correct. Landlock is dedicated to sandbox user space processes and the > related access rights should focus on restricting what is possible > through syscalls (mainly). > >> >> TCP kernel sockets are generally used in the following ways: >> * in a couple of other user space protocols (MPTCP, SMC, RDS) >> * in a few network filesystems (e.g. NFS communication over TCP) >> >> For the second case TCP connection is currently not restricted by >> Landlock. This approach is may be correct, since NFS should not have >> access to a plain TCP communication and TCP restriction of NFS may >> be too implicit. Nevertheless, I think that restriction via current >> access rights should be considered. > > I'm not sure what you mean here. I'm not familiar with NFS in the > kernel. AFAIK there is no socket type for NFS. NFS client makes RPC requests to perform remote file operations on the NFS server. 
RPC requests can be sent using TCP, UDP, or RDMA sockets at
the transport layer.

Call trace for creating a TCP socket for client->server communication:
nfs_create_rpc_client()
rpc_create()
xprt_create_transport()
xs_setup_tcp()
xs_tcp_setup_socket()
xs_create_sock()

The RPC request is then forwarded to the TCP stack by calling
xs_tcp_send_request().

>
>>
>> For the first case, each protocol uses TCP differently, so they should
>> be considered separately.
>
> Yes, for user-accessible protocols.
>
>>
>> In the case of MPTCP, internal TCP sockets are used to establish
>> connection and exchange data between two network interfaces. MPTCP
>> allows having multiple TCP connections between two MPTCP sockets by
>> connecting different network interfaces (e.g. WIFI and 3G).
>>
>> Shared Memory Communication is a protocol that allows TCP applications
>> to transparently use RDMA for communication [3]. An internal TCP socket is
>> used to exchange service CLC messages when establishing an SMC connection
>> (which seems harmless for sandboxing) and for communication in the case
>> of fallback. Fallback happens only if RDMA communication becomes
>> impossible (e.g. if the RDMA-capable RNIC goes down on the host or peer
>> side). So, preventing TCP communication may be achieved by controlling
>> the fallback mechanism.
>>
>> Reliable Datagram Socket is a connectionless protocol implemented by
>> Oracle [4]. It uses the TCP stack or InfiniBand to reliably deliver
>> datagrams. For every sendmsg(2), recvmsg(2) it establishes a TCP
>> connection and uses it to deliver the split message.
>>
>> In comparison with the previous protocols, RDS sockets cannot be bound or
>> connected to specific TCP ports (e.g. with bind(2), connect(2)). Port
>> 16385 is assigned to the receiving side, and the sending side is bound to
>> a port allocated by the kernel (by using zero as port number).
>>
>> It may be useful to restrict RDS-over-TCP with current access rights,
>> since it allows performing TCP communication from user space.
But it
>> would be only possible to fully allow or deny sending/receiving
>> (since the ports used are not controlled from user space).
>
> Thanks for these explanations. The ability to fine-control specific
> protocol operations (e.g. connect, bind) can be useful for widely used
> protocols such as TCP and UDP (or if someone wants to implement it for
> another protocol), but this approach would not scale to all protocols
> because of their own semantics and the development effort. The Landlock
> access rights should be explicit, and we should also be able to deny
> access to a whole set of protocols. This should be partially possible
> with your socket creation patch series. I guess the remaining cases
> would be to cover transformation of one socket type to another. I think
> we could control such transformation by building on top of the socket
> creation control foundation: instead of controlling socket creation, add
> a new access right to control socket transformation. What do you think?

I agree that implementing fine-grained network access rights for other
protocols only to be able to completely restrict TCP operations seems
excessive.

Do you mean the implementation of 2 access rights: for creating and
transforming sockets?

If so, there are only 2 socket protocols that can be transformed to TCP
(in the fallback path): MPTCP and SMC. Recall that in the case of RDS,
a TCP socket can be used implicitly to deliver an RDS datagram. Let's
assume that the process of configuring TCP as a transport for RDS is
also included in the socket transformation control.

Socket creation control is sufficient to restrict the implicit use of a
TCP connection. Theoretically, separate socket transformation
control is only required if the user wants to use (for example) SMC
sockets with restricted (partially or completely) TCP bind(2) and
connect(2) actions. But SMC (or MPTCP) applications should rely on TCP
communication in case of fallback.
I think they are unlikely to have any TCP restrictions.

However, control of fallback to TCP by applying socket creation rules
is too implicit and inconvenient.

Initially, I thought that users could expect TCP access rights to
completely restrict the corresponding TCP actions without additional
rules for sockets. I have concerns that socket transformation control
would not be explicit enough for such a purpose.

It would probably be more correct to apply rules that deny creation of
SMC, MPTCP and RDS sockets (or their transformation to TCP) in
landlock_restrict_self() if TCP actions are not fully allowed?

>
>>
>> Restricting any TCP connection in the kernel is probably the simplest
>> design, but we should consider the above cases to provide the most useful
>> one.
>>
>> [1] https://man7.org/linux/man-pages/man7/raw.7.html
>> [2] https://man7.org/linux/man-pages/man7/packet.7.html
>> [3] https://datatracker.ietf.org/doc/html/rfc7609
>> [4] https://oss.oracle.com/projects/rds/dist/documentation/rds-3.1-spec.html
>>
>>>
>>>>
>>>>
>>>>> sk_is_tcp() is used for this to check address family of the socket
>>>>> before doing INET-specific address length validation. This is required
>>>>> for error consistency.
>>>>>
>>>>> Closes: https://github.com/landlock-lsm/linux/issues/40
>>>>> Fixes: fff69fb03dde ("landlock: Support network rules with TCP bind and connect")
>>>>
>>>> I don't know how fixes are considered in Landlock, but should this patch
>>>> be considered as a fix? It might be surprising for someone who thought
>>>> all "stream" connections were blocked to have them unblocked when
>>>> updating to a minor kernel version, no?
>>>
>>> Indeed. The main issue was with the semantic/definition of
>>> LANDLOCK_ACCESS_NET_{CONNECT,BIND}_TCP. We need to synchronize the
>>> code with the documentation, one way or the other, preferably following
>>> the principle of least astonishment.
>>>
>>>>
>>>> (Personally, I would understand such behaviour change when upgrading to
>>>> a major version, and still, maybe only if there were alternatives to
>>>
>>> This "fix" needs to be backported, but we're not clear yet on what it
>>> should be. :)
>>>
>>>> continue having the same behaviour, e.g. a way to restrict all stream
>>>> sockets the same way, or something per stream socket. But that's just me
>>>> :) )
>>>
>>> The documentation and the initial idea were to control TCP bind and
>>> connect. The kernel implementation does more than that, so we need to
>>> synchronize somehow.
>>>
>>>>
>>>> Cheers,
>>>> Matt
>>>> --
>>>> Sponsored by the NGI0 Core fund.
>>>>
>>>>
>>

^ permalink raw reply [flat|nested] 18+ messages in thread
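
The question asked at the end of this reply (deny creation of SMC, MPTCP and RDS sockets when TCP actions are not fully allowed) can be pictured with the following user-space sketch. It is a model under assumptions, not a Landlock interface: the model_* names and the tcp_fully_allowed flag are hypothetical, and the classification merely restates the thread (MPTCP and SMC may fall back to TCP, RDS may use TCP implicitly as a transport). The fallback values for IPPROTO_MPTCP, AF_RDS and AF_SMC are the Linux ones, used only if the libc headers lack them.

```c
#include <assert.h>
#include <stdbool.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Linux values, defined here only when the libc headers lack them. */
#ifndef IPPROTO_MPTCP
#define IPPROTO_MPTCP 262
#endif
#ifndef AF_RDS
#define AF_RDS 21
#endif
#ifndef AF_SMC
#define AF_SMC 43
#endif

/* Hypothetical classification of how a socket may end up using TCP. */
enum model_tcp_path {
	MODEL_TCP_NONE,     /* no hidden use of TCP known from the thread */
	MODEL_TCP_FALLBACK, /* may fall back to plain TCP (MPTCP, SMC) */
	MODEL_TCP_IMPLICIT, /* may use TCP as a transport (RDS) */
};

static inline enum model_tcp_path model_tcp_path_of(int family, int protocol)
{
	if ((family == AF_INET || family == AF_INET6) &&
	    protocol == IPPROTO_MPTCP)
		return MODEL_TCP_FALLBACK;
	if (family == AF_SMC)
		return MODEL_TCP_FALLBACK;
	if (family == AF_RDS)
		return MODEL_TCP_IMPLICIT;
	return MODEL_TCP_NONE;
}

/*
 * Hypothetical policy from the question above: if some TCP action is
 * restricted, refuse sockets that could reach TCP by another path.
 */
static inline bool model_deny_creation(int family, int protocol,
				       bool tcp_fully_allowed)
{
	return !tcp_fully_allowed &&
	       model_tcp_path_of(family, protocol) != MODEL_TCP_NONE;
}

/* Usage example: SMC is refused only while TCP is restricted. */
static inline void model_self_check(void)
{
	assert(model_deny_creation(AF_SMC, 0, false));
	assert(!model_deny_creation(AF_SMC, 0, true));
}
```

Whether such an implicit rule is desirable is exactly what the reply asks; the sketch takes no position on it.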
* Re: [RFC PATCH v2 1/8] landlock: Fix non-TCP sockets restriction 2024-12-13 18:19 ` Mikhail Ivanov @ 2025-01-24 15:02 ` Mickaël Salaün 2025-01-27 12:40 ` Mikhail Ivanov 0 siblings, 1 reply; 18+ messages in thread From: Mickaël Salaün @ 2025-01-24 15:02 UTC (permalink / raw) To: Mikhail Ivanov Cc: Matthieu Baerts, gnoack, willemdebruijn.kernel, matthieu, linux-security-module, netdev, netfilter-devel, yusongping, artem.kuzin, konstantin.meskhidze, MPTCP Linux, linux-nfs On Fri, Dec 13, 2024 at 09:19:10PM +0300, Mikhail Ivanov wrote: > On 12/12/2024 9:43 PM, Mickaël Salaün wrote: > > On Thu, Oct 31, 2024 at 07:21:44PM +0300, Mikhail Ivanov wrote: > > > On 10/18/2024 9:08 PM, Mickaël Salaün wrote: > > > > On Thu, Oct 17, 2024 at 02:59:48PM +0200, Matthieu Baerts wrote: > > > > > Hi Mikhail and Landlock maintainers, > > > > > > > > > > +cc MPTCP list. > > > > > > > > Thanks, we should include this list in the next series. > > > > > > > > > > > > > > On 17/10/2024 13:04, Mikhail Ivanov wrote: > > > > > > Do not check TCP access right if socket protocol is not IPPROTO_TCP. > > > > > > LANDLOCK_ACCESS_NET_BIND_TCP and LANDLOCK_ACCESS_NET_CONNECT_TCP > > > > > > should not restrict bind(2) and connect(2) for non-TCP protocols > > > > > > (SCTP, MPTCP, SMC). > > > > > > > > > > Thank you for the patch! > > > > > > > > > > I'm part of the MPTCP team, and I'm wondering if MPTCP should not be > > > > > treated like TCP here. MPTCP is an extension to TCP: on the wire, we can > > > > > see TCP packets with extra TCP options. On Linux, there is indeed a > > > > > dedicated MPTCP socket (IPPROTO_MPTCP), but that's just internal, > > > > > because we needed such dedicated socket to talk to the userspace. 
> > > > > > > > > > I don't know Landlock well, but I think it is important to know that an > > > > > MPTCP socket can be used to discuss with "plain" TCP packets: the kernel > > > > > will do a fallback to "plain" TCP if MPTCP is not supported by the other > > > > > peer or by a middlebox. It means that with this patch, if TCP is blocked > > > > > by Landlock, someone can simply force an application to create an MPTCP > > > > > socket -- e.g. via LD_PRELOAD -- and bypass the restrictions. It will > > > > > certainly work, even when connecting to a peer not supporting MPTCP. > > > > > > > > > > Please note that I'm not against this modification -- especially here > > > > > when we remove restrictions around MPTCP sockets :) -- I'm just saying > > > > > it might be less confusing for users if MPTCP is considered as being > > > > > part of TCP. A bit similar to what someone would do with a firewall: if > > > > > TCP is blocked, MPTCP is blocked as well. > > > > > > > > Good point! I don't know well MPTCP but I think you're right. Given > > > > it's close relationship with TCP and the fallback mechanism, it would > > > > make sense for users to not make a difference and it would avoid bypass > > > > of misleading restrictions. Moreover the Landlock rules are simple and > > > > only control TCP ports, not peer addresses, which seems to be the main > > > > evolution of MPTCP. > > > > > > > > > > > I understand that a future goal might probably be to have dedicated > > > > > restrictions for MPTCP and the other stream protocols (and/or for all > > > > > stream protocols like it was before this patch), but in the meantime, it > > > > > might be less confusing considering MPTCP as being part of TCP (I'm not > > > > > sure about the other stream protocols). > > > > > > > > We need to take a closer look at the other stream protocols indeed. > > > Hello! Sorry for the late reply, I was on a small business trip. 
> > > > > > Thanks a lot for this catch, without doubt MPTCP should be controlled > > > with TCP access rights. > > > > > > In that case, we should reconsider current semantics of TCP control. > > > > > > Currently, it looks like this: > > > * LANDLOCK_ACCESS_NET_BIND_TCP: Bind a TCP socket to a local port. > > > * LANDLOCK_ACCESS_NET_CONNECT_TCP: Connect an active TCP socket to a > > > remote port. > > > > > > According to these definitions only TCP sockets should be restricted and > > > this is already provided by Landlock (considering observing commit) > > > (assuming that "TCP socket" := user space socket of IPPROTO_TCP > > > protocol). > > > > > > AFAICS the two objectives of TCP access rights are to control > > > (1) which ports can be used for sending or receiving TCP packets > > > (including SYN, ACK or other service packets). > > > (2) which ports can be used to establish TCP connection (performed by > > > kernel network stack on server or client side). > > > > > > In most cases denying (2) cause denying (1). Sending or receiving TCP > > > packets without initial 3-way handshake is only possible on RAW [1] or > > > PACKET [2] sockets. Usage of such sockets requires root privilligies, so > > > there is no point to control them with Landlock. > > > > I agree. > > > > > > > > Therefore Landlock should only take care about case (2). For now > > > (please correct me if I'm wrong), we only considered control of > > > connection performed on user space plain TCP sockets (created with > > > IPPROTO_TCP). > > > > Correct. Landlock is dedicated to sandbox user space processes and the > > related access rights should focus on restricting what is possible > > through syscalls (mainly). > > > > > > > > TCP kernel sockets are generally used in the following ways: > > > * in a couple of other user space protocols (MPTCP, SMC, RDS) > > > * in a few network filesystems (e.g. 
NFS communication over TCP)
> > >
> > > For the second case, the TCP connection is currently not restricted by
> > > Landlock. This approach may be correct, since NFS should not have
> > > access to plain TCP communication and TCP restriction of NFS may
> > > be too implicit. Nevertheless, I think that restriction via current
> > > access rights should be considered.
> >
> > I'm not sure what you mean here. I'm not familiar with NFS in the
> > kernel. AFAIK there is no socket type for NFS.
>
> NFS client makes RPC requests to perform remote file operations on the
> NFS server. RPC requests can be sent using TCP, UDP, or RDMA sockets at
> the transport layer.
>
> Call trace for creating a TCP socket for client->server communication:
> nfs_create_rpc_client()
> rpc_create()
> xprt_create_transport()
> xs_setup_tcp()
> xs_tcp_setup_socket()
> xs_create_sock()
>
> The RPC request is then forwarded to the TCP stack by calling
> xs_tcp_send_request().

OK, but it looks like these are connections made on behalf of the kernel,
which only the kernel can use. In other words, when these functions are
called, I guess current_cred() doesn't point to user space credentials.
Because the kernel cannot be restricted by Landlock, we should be good.

>
> >
> > >
> > > For the first case, each protocol uses TCP differently, so they should
> > > be considered separately.
> >
> > Yes, for user-accessible protocols.
> >
> > >
> > > In the case of MPTCP, internal TCP sockets are used to establish
> > > connection and exchange data between two network interfaces. MPTCP
> > > allows having multiple TCP connections between two MPTCP sockets by
> > > connecting different network interfaces (e.g. WIFI and 3G).
> > >
> > > Shared Memory Communication is a protocol that allows TCP applications
> > > to transparently use RDMA for communication [3].
TCP internal socket is > > > used to exchange service CLC messages when establishing SMC connection > > > (which seems harmless for sandboxing) and for communication in the case > > > of fallback. Fallback happens only if RDMA communication became > > > impossible (e.g. if RDMA capable RNIC card went down on host or peer > > > side). So, preventing TCP communication may be achieved by controlling > > > fallback mechanism. > > > > > > Reliable Datagram Socket is connectionless protocol implemented by > > > Oracle [4]. It uses TCP stack or Infiniband to reliably deliever > > > datagrams. For every sendmsg(2), recvmsg(2) it establishes TCP > > > connection and use it to deliever splitted message. > > > > > > In comparison with previous protocols, RDS sockets cannot be binded or > > > connected to special TCP ports (e.g. with bind(2), connect(2)). 16385 > > > port is assigned to receiving side and sending side is binded to the > > > port allocated by the kernel (by using zero as port number). > > > > > > It may be useful to restrict RDS-over-TCP with current access rights, > > > since it allows to perform TCP communication from user-space. But it > > > would be only possible to fully allow or deny sending/receiving > > > (since used ports are not controlled from user space). > > > > Thanks for these explanations. The ability to fine-control specific > > protocol operations (e.g. connect, bind) can be useful for widely used > > protocol such as TCP and UDP (or if someone wants to implement it for > > another protocol), but this approach would not scale with all protocols > > because of their own semantic and the development efforts. The Landlock > > access rights should be explicit, and we should also be able to deny > > access to a whole set of protocols. This should be partially possible > > with your socket creation patch series. I guess the remaining cases > > would be to cover transformation of one socket type to another. 
I think
> > we could control such transformation by building on top of the socket
> > creation control foundation: instead of controlling socket creation, add
> > a new access right to control socket transformation. What do you think?
>
> I agree that implementing fine-grained network access rights for other
> protocols only to be able to completely restrict TCP operations seems
> excessive.
>
> Do you mean the implementation of 2 access rights: for creating and
> transforming sockets?

Yes, but if it's not too complex I think it would make sense to have only
one access right that covers these two cases. I'm not sure there is
one common point at which to check these socket transformations, though.

>
> If so, there are only 2 socket protocols that can be transformed to TCP
> (in the fallback path): MPTCP and SMC. Recall that in the case of RDS,
> a TCP socket can be used implicitly to deliver an RDS datagram.

Hmm, interesting. Then we'll also need an access right to use a
protocol? I'm worried that this kind of check would have a significant
performance impact. I think we could tag a socket at creation time with
the allowed protocol transitions.

> Let's
> assume that the process of configuring TCP as a transport for RDS is
> also included in the socket transformation control.
>
> Socket creation control is sufficient to restrict the implicit use of a
> TCP connection. Theoretically, separate socket transformation
> control is only required if the user wants to use (for example) SMC
> sockets with restricted (partially or completely) TCP bind(2) and
> connect(2) actions. But SMC (or MPTCP) applications should rely on TCP
> communication in case of fallback. I think they are unlikely to have any
> TCP restrictions.
>
> However, control of fallback to TCP by applying socket creation rules
> is too implicit and inconvenient.
> > Initially, I thought that users could expect TCP access rights to > completely restrict the corresponding TCP actions without additional > rules for sockets. I have concerns that socket transformation control > would not be explicit enough for such purpose. > > Probably, it will be more correctly to apply rules that deny creation of > SMC, MPTCP and RDS sockets (or their transformation to TCP) in > landlock_restrict_self() if TCP actions are not fully allowed? That should be achieved with your socket creation control patch series right? I'm not sure to understand the use of landlock_restrict_self() here. Rulesets should fully define an access control on their own. > > > > > > > > > Restricting any TCP connection in the kernel is probably simplest > > > design, but we should consider above cases to provide the most useful > > > one. > > > > > > [1] https://man7.org/linux/man-pages/man7/raw.7.html > > > [2] https://man7.org/linux/man-pages/man7/packet.7.html > > > [3] https://datatracker.ietf.org/doc/html/rfc7609 > > > [4] https://oss.oracle.com/projects/rds/dist/documentation/rds-3.1-spec.html > > > > > > > > > > > > > > > > > > > > > > > sk_is_tcp() is used for this to check address family of the socket > > > > > > before doing INET-specific address length validation. This is required > > > > > > for error consistency. Could you please send a new patch series for this specific fix, including minimal tests? I'd like to merge that as soon as possible, and it will be backported to all kernel versions. > > > > > > > > > > > > Closes: https://github.com/landlock-lsm/linux/issues/40 > > > > > > Fixes: fff69fb03dde ("landlock: Support network rules with TCP bind and connect") > > > > > > > > > > I don't know how fixes are considered in Landlock, but should this patch > > > > > be considered as a fix? 
It might be surprising for someone who thought > > > > > all "stream" connections were blocked to have them unblocked when > > > > > updating to a minor kernel version, no? > > > > > > > > Indeed. The main issue was with the semantic/definition of > > > > LANDLOCK_ACCESS_NET_{CONNECT,BIND}_TCP. We need to synchronize the > > > > code with the documentation, one way or the other, preferably following > > > > the principle of least astonishment. > > > > > > > > > > > > > > (Personally, I would understand such behaviour change when upgrading to > > > > > a major version, and still, maybe only if there were alternatives to > > > > > > > > This "fix" needs to be backported, but we're not clear yet on what it > > > > should be. :) > > > > > > > > > continue having the same behaviour, e.g. a way to restrict all stream > > > > > sockets the same way, or something per stream socket. But that's just me > > > > > :) ) > > > > > > > > The documentation and the initial idea was to control TCP bind and > > > > connect. The kernel implementation does more than that, so we need to > > > > synchronize somehow. > > > > > > > > > > > > > > Cheers, > > > > > Matt > > > > > -- > > > > > Sponsored by the NGI0 Core fund. > > > > > > > > > > > > > > ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [RFC PATCH v2 1/8] landlock: Fix non-TCP sockets restriction 2025-01-24 15:02 ` Mickaël Salaün @ 2025-01-27 12:40 ` Mikhail Ivanov 2025-01-27 19:48 ` Mickaël Salaün 0 siblings, 1 reply; 18+ messages in thread From: Mikhail Ivanov @ 2025-01-27 12:40 UTC (permalink / raw) To: Mickaël Salaün Cc: Matthieu Baerts, gnoack, willemdebruijn.kernel, matthieu, linux-security-module, netdev, netfilter-devel, yusongping, artem.kuzin, konstantin.meskhidze, MPTCP Linux, linux-nfs On 1/24/2025 6:02 PM, Mickaël Salaün wrote: > On Fri, Dec 13, 2024 at 09:19:10PM +0300, Mikhail Ivanov wrote: >> On 12/12/2024 9:43 PM, Mickaël Salaün wrote: >>> On Thu, Oct 31, 2024 at 07:21:44PM +0300, Mikhail Ivanov wrote: >>>> On 10/18/2024 9:08 PM, Mickaël Salaün wrote: >>>>> On Thu, Oct 17, 2024 at 02:59:48PM +0200, Matthieu Baerts wrote: >>>>>> Hi Mikhail and Landlock maintainers, >>>>>> >>>>>> +cc MPTCP list. >>>>> >>>>> Thanks, we should include this list in the next series. >>>>> >>>>>> >>>>>> On 17/10/2024 13:04, Mikhail Ivanov wrote: >>>>>>> Do not check TCP access right if socket protocol is not IPPROTO_TCP. >>>>>>> LANDLOCK_ACCESS_NET_BIND_TCP and LANDLOCK_ACCESS_NET_CONNECT_TCP >>>>>>> should not restrict bind(2) and connect(2) for non-TCP protocols >>>>>>> (SCTP, MPTCP, SMC). >>>>>> >>>>>> Thank you for the patch! >>>>>> >>>>>> I'm part of the MPTCP team, and I'm wondering if MPTCP should not be >>>>>> treated like TCP here. MPTCP is an extension to TCP: on the wire, we can >>>>>> see TCP packets with extra TCP options. On Linux, there is indeed a >>>>>> dedicated MPTCP socket (IPPROTO_MPTCP), but that's just internal, >>>>>> because we needed such dedicated socket to talk to the userspace. >>>>>> >>>>>> I don't know Landlock well, but I think it is important to know that an >>>>>> MPTCP socket can be used to discuss with "plain" TCP packets: the kernel >>>>>> will do a fallback to "plain" TCP if MPTCP is not supported by the other >>>>>> peer or by a middlebox. 
It means that with this patch, if TCP is blocked >>>>>> by Landlock, someone can simply force an application to create an MPTCP >>>>>> socket -- e.g. via LD_PRELOAD -- and bypass the restrictions. It will >>>>>> certainly work, even when connecting to a peer not supporting MPTCP. >>>>>> >>>>>> Please note that I'm not against this modification -- especially here >>>>>> when we remove restrictions around MPTCP sockets :) -- I'm just saying >>>>>> it might be less confusing for users if MPTCP is considered as being >>>>>> part of TCP. A bit similar to what someone would do with a firewall: if >>>>>> TCP is blocked, MPTCP is blocked as well. >>>>> >>>>> Good point! I don't know well MPTCP but I think you're right. Given >>>>> it's close relationship with TCP and the fallback mechanism, it would >>>>> make sense for users to not make a difference and it would avoid bypass >>>>> of misleading restrictions. Moreover the Landlock rules are simple and >>>>> only control TCP ports, not peer addresses, which seems to be the main >>>>> evolution of MPTCP. > >>>>>> >>>>>> I understand that a future goal might probably be to have dedicated >>>>>> restrictions for MPTCP and the other stream protocols (and/or for all >>>>>> stream protocols like it was before this patch), but in the meantime, it >>>>>> might be less confusing considering MPTCP as being part of TCP (I'm not >>>>>> sure about the other stream protocols). >>>>> >>>>> We need to take a closer look at the other stream protocols indeed. >>>> Hello! Sorry for the late reply, I was on a small business trip. >>>> >>>> Thanks a lot for this catch, without doubt MPTCP should be controlled >>>> with TCP access rights. >>>> >>>> In that case, we should reconsider current semantics of TCP control. >>>> >>>> Currently, it looks like this: >>>> * LANDLOCK_ACCESS_NET_BIND_TCP: Bind a TCP socket to a local port. >>>> * LANDLOCK_ACCESS_NET_CONNECT_TCP: Connect an active TCP socket to a >>>> remote port. 
>>>> >>>> According to these definitions only TCP sockets should be restricted and >>>> this is already provided by Landlock (considering observing commit) >>>> (assuming that "TCP socket" := user space socket of IPPROTO_TCP >>>> protocol). >>>> >>>> AFAICS the two objectives of TCP access rights are to control >>>> (1) which ports can be used for sending or receiving TCP packets >>>> (including SYN, ACK or other service packets). >>>> (2) which ports can be used to establish TCP connection (performed by >>>> kernel network stack on server or client side). >>>> >>>> In most cases denying (2) cause denying (1). Sending or receiving TCP >>>> packets without initial 3-way handshake is only possible on RAW [1] or >>>> PACKET [2] sockets. Usage of such sockets requires root privilligies, so >>>> there is no point to control them with Landlock. >>> >>> I agree. >>> >>>> >>>> Therefore Landlock should only take care about case (2). For now >>>> (please correct me if I'm wrong), we only considered control of >>>> connection performed on user space plain TCP sockets (created with >>>> IPPROTO_TCP). >>> >>> Correct. Landlock is dedicated to sandbox user space processes and the >>> related access rights should focus on restricting what is possible >>> through syscalls (mainly). >>> >>>> >>>> TCP kernel sockets are generally used in the following ways: >>>> * in a couple of other user space protocols (MPTCP, SMC, RDS) >>>> * in a few network filesystems (e.g. NFS communication over TCP) >>>> >>>> For the second case TCP connection is currently not restricted by >>>> Landlock. This approach is may be correct, since NFS should not have >>>> access to a plain TCP communication and TCP restriction of NFS may >>>> be too implicit. Nevertheless, I think that restriction via current >>>> access rights should be considered. >>> >>> I'm not sure what you mean here. I'm not familiar with NFS in the >>> kernel. AFAIK there is no socket type for NFS. 
>> >> NFS client makes RPC requests to perform remote file operations on the >> NFS server. RPC requests can be sent using TCP, UDP, or RDMA sockets at >> the transport layer. >> >> Call trace of creating TCP socket for client->server communication: >> nfs_create_rpc_client() >> rpc_create() >> xprt_create_transport() >> xs_setup_tcp() >> xs_tcp_setup_socket() >> xs_create_sock() >> >> And RPC request is forwarded to TCP stack by calling >> xs_tcp_send_request(). > > OK, but it looks like this is connections on behalf of the kernel, that > only the kernel can use. In other words, when these functions are > called, I guess current_cred() doesn't point to user space credentials. > Because the kernel cannot be restricted by Landlock, we should be good. Agreed, only NFS can establish and use its connections directly. NFS uses kernel_{bind, connect}() methods on kernel sockets, so TCP operations are not checked by LSM. > >> >>> >>>> >>>> For the first case, each protocol use TCP differently, so they should >>>> be considered separately. >>> >>> Yes, for user-accessible protocols. >>> >>>> >>>> In the case of MPTCP TCP internal sockets are used to establish >>>> connection and exchange data between two network interfaces. MPTCP >>>> allows to have multiple TCP connections between two MPTCP sockets by >>>> connecting different network interfaces (e.g. WIFI and 3G). >>>> >>>> Shared Memory Communication is a protocol that allows TCP applications >>>> transparently use RDMA for communication [3]. TCP internal socket is >>>> used to exchange service CLC messages when establishing SMC connection >>>> (which seems harmless for sandboxing) and for communication in the case >>>> of fallback. Fallback happens only if RDMA communication became >>>> impossible (e.g. if RDMA capable RNIC card went down on host or peer >>>> side). So, preventing TCP communication may be achieved by controlling >>>> fallback mechanism. 
>>>> >>>> Reliable Datagram Socket is connectionless protocol implemented by >>>> Oracle [4]. It uses TCP stack or Infiniband to reliably deliever >>>> datagrams. For every sendmsg(2), recvmsg(2) it establishes TCP >>>> connection and use it to deliever splitted message. >>>> >>>> In comparison with previous protocols, RDS sockets cannot be binded or >>>> connected to special TCP ports (e.g. with bind(2), connect(2)). 16385 >>>> port is assigned to receiving side and sending side is binded to the >>>> port allocated by the kernel (by using zero as port number). >>>> >>>> It may be useful to restrict RDS-over-TCP with current access rights, >>>> since it allows to perform TCP communication from user-space. But it >>>> would be only possible to fully allow or deny sending/receiving >>>> (since used ports are not controlled from user space). >>> >>> Thanks for these explanations. The ability to fine-control specific >>> protocol operations (e.g. connect, bind) can be useful for widely used >>> protocol such as TCP and UDP (or if someone wants to implement it for >>> another protocol), but this approach would not scale with all protocols >>> because of their own semantic and the development efforts. The Landlock >>> access rights should be explicit, and we should also be able to deny >>> access to a whole set of protocols. This should be partially possible >>> with your socket creation patch series. I guess the remaining cases >>> would be to cover transformation of one socket type to another. I think >>> we could control such transformation by building on top of the socket >>> creation control foundation: instead of controlling socket creation, add >>> a new access right to control socket transformation. What do you think? >> >> I agree that implementing fine-control network access rights for other >> protocols only to be able to completely restrict TCP operations seems >> excessive. 
>> >> Do you mean the implementation of 2 access rights: for creating and >> transforming sockets? > > Yes, but if it's not too complex I think it would make sense to only > have one access right that will cover these two cases. I'm not sure > there is one common point where to check these socket transformation > though. There are at least 3 different places where some kind of transformation is taking place. > >> >> If so, there are only 2 socket protocols that can be transformed to TCP >> (in the fallback path) - MPTCP and SMC. Recall that in the case of RDS, >> a TCP socket can be used implicitly to deliver an RDS datagram. > > Hmm, interesting. Then we'll also need an access right to use a > protocol? I'm worried that this kind of check would have a significant > performance impact. I think we could tag a socket at creation time with > the allowed protocol transitions. What do you mean by "to use a protocol"? > >> Let's >> assume that the process of configuring TCP as a transport for RDS is >> also included in the socket transformation control. >> >> Socket creation control is sufficient to restrict the implicit use of a >> TCP connection. Theoretically, separate socket transformation >> control is only required if the user wants to use (for example) SMC >> sockets with restricted (partially or completely) TCP bind(2) and >> connect(2) actions. But SMC (or MPTCP) applications should rely on TCP >> communication in case of fallback. I think they are unlikely to have any >> TCP restrictions. >> >> However, control of fallback to TCP by applying socket creation rules >> is too implicit and inconvenient. >> >> Initially, I thought that users could expect TCP access rights to >> completely restrict the corresponding TCP actions without additional >> rules for sockets. I have concerns that socket transformation control >> would not be explicit enough for such purpose. 
>> >> Probably, it will be more correctly to apply rules that deny creation of >> SMC, MPTCP and RDS sockets (or their transformation to TCP) in >> landlock_restrict_self() if TCP actions are not fully allowed? > > That should be achieved with your socket creation control patch series > right? That's correct. I was just a little worried about a possible unawareness on the part of the user about the sockets transformation. I'll better just make a note in the documentation about this. > > I'm not sure to understand the use of landlock_restrict_self() here. > Rulesets should fully define an access control on their own. You're right, landlock_restrict_self() can not define any additional rules. > >> >>> >>>> >>>> Restricting any TCP connection in the kernel is probably simplest >>>> design, but we should consider above cases to provide the most useful >>>> one. >>>> >>>> [1] https://man7.org/linux/man-pages/man7/raw.7.html >>>> [2] https://man7.org/linux/man-pages/man7/packet.7.html >>>> [3] https://datatracker.ietf.org/doc/html/rfc7609 >>>> [4] https://oss.oracle.com/projects/rds/dist/documentation/rds-3.1-spec.html >>>> >>>>> >>>>>> >>>>>> >>>>>>> sk_is_tcp() is used for this to check address family of the socket >>>>>>> before doing INET-specific address length validation. This is required >>>>>>> for error consistency. > > Could you please send a new patch series for this specific fix, > including minimal tests? I'd like to merge that as soon as possible, > and it will be backported to all kernel versions. Ok, I'll do it ASAP. > >>>>>>> >>>>>>> Closes: https://github.com/landlock-lsm/linux/issues/40 >>>>>>> Fixes: fff69fb03dde ("landlock: Support network rules with TCP bind and connect") >>>>>> >>>>>> I don't know how fixes are considered in Landlock, but should this patch >>>>>> be considered as a fix? 
It might be surprising for someone who thought >>>>>> all "stream" connections were blocked to have them unblocked when >>>>>> updating to a minor kernel version, no? >>>>> >>>>> Indeed. The main issue was with the semantic/definition of >>>>> LANDLOCK_ACCESS_NET_{CONNECT,BIND}_TCP. We need to synchronize the >>>>> code with the documentation, one way or the other, preferably following >>>>> the principle of least astonishment. >>>>> >>>>>> >>>>>> (Personally, I would understand such behaviour change when upgrading to >>>>>> a major version, and still, maybe only if there were alternatives to >>>>> >>>>> This "fix" needs to be backported, but we're not clear yet on what it >>>>> should be. :) >>>>> >>>>>> continue having the same behaviour, e.g. a way to restrict all stream >>>>>> sockets the same way, or something per stream socket. But that's just me >>>>>> :) ) >>>>> >>>>> The documentation and the initial idea was to control TCP bind and >>>>> connect. The kernel implementation does more than that, so we need to >>>>> synchronize somehow. >>>>> >>>>>> >>>>>> Cheers, >>>>>> Matt >>>>>> -- >>>>>> Sponsored by the NGI0 Core fund. >>>>>> >>>>>> >>>> >> ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [RFC PATCH v2 1/8] landlock: Fix non-TCP sockets restriction 2025-01-27 12:40 ` Mikhail Ivanov @ 2025-01-27 19:48 ` Mickaël Salaün 2025-01-28 10:56 ` Mikhail Ivanov 0 siblings, 1 reply; 18+ messages in thread From: Mickaël Salaün @ 2025-01-27 19:48 UTC (permalink / raw) To: Mikhail Ivanov Cc: Matthieu Baerts, gnoack, willemdebruijn.kernel, matthieu, linux-security-module, netdev, netfilter-devel, yusongping, artem.kuzin, konstantin.meskhidze, MPTCP Linux, linux-nfs, Paul Moore On Mon, Jan 27, 2025 at 03:40:33PM +0300, Mikhail Ivanov wrote: > On 1/24/2025 6:02 PM, Mickaël Salaün wrote: > > On Fri, Dec 13, 2024 at 09:19:10PM +0300, Mikhail Ivanov wrote: > > > On 12/12/2024 9:43 PM, Mickaël Salaün wrote: > > > > On Thu, Oct 31, 2024 at 07:21:44PM +0300, Mikhail Ivanov wrote: > > > > > On 10/18/2024 9:08 PM, Mickaël Salaün wrote: > > > > > > On Thu, Oct 17, 2024 at 02:59:48PM +0200, Matthieu Baerts wrote: > > > > > > > Hi Mikhail and Landlock maintainers, > > > > > > > > > > > > > > +cc MPTCP list. > > > > > > > > > > > > Thanks, we should include this list in the next series. > > > > > > > > > > > > > > > > > > > > On 17/10/2024 13:04, Mikhail Ivanov wrote: > > > > > > > > Do not check TCP access right if socket protocol is not IPPROTO_TCP. > > > > > > > > LANDLOCK_ACCESS_NET_BIND_TCP and LANDLOCK_ACCESS_NET_CONNECT_TCP > > > > > > > > should not restrict bind(2) and connect(2) for non-TCP protocols > > > > > > > > (SCTP, MPTCP, SMC). > > > > > > > > > > > > > > Thank you for the patch! > > > > > > > > > > > > > > I'm part of the MPTCP team, and I'm wondering if MPTCP should not be > > > > > > > treated like TCP here. MPTCP is an extension to TCP: on the wire, we can > > > > > > > see TCP packets with extra TCP options. On Linux, there is indeed a > > > > > > > dedicated MPTCP socket (IPPROTO_MPTCP), but that's just internal, > > > > > > > because we needed such dedicated socket to talk to the userspace. 
> > > > > > > > > > > > > > I don't know Landlock well, but I think it is important to know that an > > > > > > > MPTCP socket can be used to discuss with "plain" TCP packets: the kernel > > > > > > > will do a fallback to "plain" TCP if MPTCP is not supported by the other > > > > > > > peer or by a middlebox. It means that with this patch, if TCP is blocked > > > > > > > by Landlock, someone can simply force an application to create an MPTCP > > > > > > > socket -- e.g. via LD_PRELOAD -- and bypass the restrictions. It will > > > > > > > certainly work, even when connecting to a peer not supporting MPTCP. > > > > > > > > > > > > > > Please note that I'm not against this modification -- especially here > > > > > > > when we remove restrictions around MPTCP sockets :) -- I'm just saying > > > > > > > it might be less confusing for users if MPTCP is considered as being > > > > > > > part of TCP. A bit similar to what someone would do with a firewall: if > > > > > > > TCP is blocked, MPTCP is blocked as well. > > > > > > > > > > > > Good point! I don't know well MPTCP but I think you're right. Given > > > > > > it's close relationship with TCP and the fallback mechanism, it would > > > > > > make sense for users to not make a difference and it would avoid bypass > > > > > > of misleading restrictions. Moreover the Landlock rules are simple and > > > > > > only control TCP ports, not peer addresses, which seems to be the main > > > > > > evolution of MPTCP. > > > > > > > > > > > > > > > I understand that a future goal might probably be to have dedicated > > > > > > > restrictions for MPTCP and the other stream protocols (and/or for all > > > > > > > stream protocols like it was before this patch), but in the meantime, it > > > > > > > might be less confusing considering MPTCP as being part of TCP (I'm not > > > > > > > sure about the other stream protocols). > > > > > > > > > > > > We need to take a closer look at the other stream protocols indeed. > > > > > Hello! 
Sorry for the late reply, I was on a small business trip. > > > > > > > > > > Thanks a lot for this catch, without doubt MPTCP should be controlled > > > > > with TCP access rights. > > > > > > > > > > In that case, we should reconsider current semantics of TCP control. > > > > > > > > > > Currently, it looks like this: > > > > > * LANDLOCK_ACCESS_NET_BIND_TCP: Bind a TCP socket to a local port. > > > > > * LANDLOCK_ACCESS_NET_CONNECT_TCP: Connect an active TCP socket to a > > > > > remote port. > > > > > > > > > > According to these definitions only TCP sockets should be restricted and > > > > > this is already provided by Landlock (considering observing commit) > > > > > (assuming that "TCP socket" := user space socket of IPPROTO_TCP > > > > > protocol). > > > > > > > > > > AFAICS the two objectives of TCP access rights are to control > > > > > (1) which ports can be used for sending or receiving TCP packets > > > > > (including SYN, ACK or other service packets). > > > > > (2) which ports can be used to establish TCP connection (performed by > > > > > kernel network stack on server or client side). > > > > > > > > > > In most cases denying (2) cause denying (1). Sending or receiving TCP > > > > > packets without initial 3-way handshake is only possible on RAW [1] or > > > > > PACKET [2] sockets. Usage of such sockets requires root privilligies, so > > > > > there is no point to control them with Landlock. > > > > > > > > I agree. > > > > > > > > > > > > > > Therefore Landlock should only take care about case (2). For now > > > > > (please correct me if I'm wrong), we only considered control of > > > > > connection performed on user space plain TCP sockets (created with > > > > > IPPROTO_TCP). > > > > > > > > Correct. Landlock is dedicated to sandbox user space processes and the > > > > related access rights should focus on restricting what is possible > > > > through syscalls (mainly). 
> > > > > > > > > > > > > > TCP kernel sockets are generally used in the following ways: > > > > > * in a couple of other user space protocols (MPTCP, SMC, RDS) > > > > > * in a few network filesystems (e.g. NFS communication over TCP) > > > > > > > > > > For the second case TCP connection is currently not restricted by > > > > > Landlock. This approach is may be correct, since NFS should not have > > > > > access to a plain TCP communication and TCP restriction of NFS may > > > > > be too implicit. Nevertheless, I think that restriction via current > > > > > access rights should be considered. > > > > > > > > I'm not sure what you mean here. I'm not familiar with NFS in the > > > > kernel. AFAIK there is no socket type for NFS. > > > > > > NFS client makes RPC requests to perform remote file operations on the > > > NFS server. RPC requests can be sent using TCP, UDP, or RDMA sockets at > > > the transport layer. > > > > > > Call trace of creating TCP socket for client->server communication: > > > nfs_create_rpc_client() > > > rpc_create() > > > xprt_create_transport() > > > xs_setup_tcp() > > > xs_tcp_setup_socket() > > > xs_create_sock() > > > > > > And RPC request is forwarded to TCP stack by calling > > > xs_tcp_send_request(). > > > > OK, but it looks like this is connections on behalf of the kernel, that > > only the kernel can use. In other words, when these functions are > > called, I guess current_cred() doesn't point to user space credentials. > > Because the kernel cannot be restricted by Landlock, we should be good. > > Agreed, only NFS can establish and use its connections directly. > NFS uses kernel_{bind, connect}() methods on kernel sockets, so TCP > operations are not checked by LSM. > > > > > > > > > > > > > > > > > > > > For the first case, each protocol use TCP differently, so they should > > > > > be considered separately. > > > > > > > > Yes, for user-accessible protocols. 
> > > > > > > > > > > > > > In the case of MPTCP TCP internal sockets are used to establish > > > > > connection and exchange data between two network interfaces. MPTCP > > > > > allows to have multiple TCP connections between two MPTCP sockets by > > > > > connecting different network interfaces (e.g. WIFI and 3G). > > > > > > > > > > Shared Memory Communication is a protocol that allows TCP applications > > > > > transparently use RDMA for communication [3]. TCP internal socket is > > > > > used to exchange service CLC messages when establishing SMC connection > > > > > (which seems harmless for sandboxing) and for communication in the case > > > > > of fallback. Fallback happens only if RDMA communication became > > > > > impossible (e.g. if RDMA capable RNIC card went down on host or peer > > > > > side). So, preventing TCP communication may be achieved by controlling > > > > > fallback mechanism. > > > > > > > > > > Reliable Datagram Socket is connectionless protocol implemented by > > > > > Oracle [4]. It uses TCP stack or Infiniband to reliably deliever > > > > > datagrams. For every sendmsg(2), recvmsg(2) it establishes TCP > > > > > connection and use it to deliever splitted message. > > > > > > > > > > In comparison with previous protocols, RDS sockets cannot be binded or > > > > > connected to special TCP ports (e.g. with bind(2), connect(2)). 16385 > > > > > port is assigned to receiving side and sending side is binded to the > > > > > port allocated by the kernel (by using zero as port number). > > > > > > > > > > It may be useful to restrict RDS-over-TCP with current access rights, > > > > > since it allows to perform TCP communication from user-space. But it > > > > > would be only possible to fully allow or deny sending/receiving > > > > > (since used ports are not controlled from user space). > > > > > > > > Thanks for these explanations. The ability to fine-control specific > > > > protocol operations (e.g. 
connect, bind) can be useful for widely used > > > > protocol such as TCP and UDP (or if someone wants to implement it for > > > > another protocol), but this approach would not scale with all protocols > > > > because of their own semantic and the development efforts. The Landlock > > > > access rights should be explicit, and we should also be able to deny > > > > access to a whole set of protocols. This should be partially possible > > > > with your socket creation patch series. I guess the remaining cases > > > > would be to cover transformation of one socket type to another. I think > > > > we could control such transformation by building on top of the socket > > > > creation control foundation: instead of controlling socket creation, add > > > > a new access right to control socket transformation. What do you think? > > > > > > I agree that implementing fine-control network access rights for other > > > protocols only to be able to completely restrict TCP operations seems > > > excessive. > > > > > > Do you mean the implementation of 2 access rights: for creating and > > > transforming sockets? > > > > Yes, but if it's not too complex I think it would make sense to only > > have one access right that will cover these two cases. I'm not sure > > there is one common point where to check these socket transformation > > though. > > There are at least 3 different places where some kind of transformation > is taking place. I'm a bit worried that we miss some of these places (now or in future kernel versions). We'll need a new LSM hook for that. Could you list the current locations? > > > > > > > > > If so, there are only 2 socket protocols that can be transformed to TCP > > > (in the fallback path) - MPTCP and SMC. Recall that in the case of RDS, > > > a TCP socket can be used implicitly to deliver an RDS datagram. > > > > Hmm, interesting. Then we'll also need an access right to use a > > protocol? 
I'm worried that this kind of check would have a significant > > performance impact. I think we could tag a socket at creation time with > > the allowed protocol transitions. > > What do you mean by "to use a protocol"? To use a socket with a specific protocol. Until now, I thought being able to control socket creation would be enough, but being able to use one kind of socket with different protocols would be an issue if users want to control the use of protocols (which makes sense from an access control point of view). > > > > > > Let's > > > assume that the process of configuring TCP as a transport for RDS is > > > also included in the socket transformation control. > > > > > > Socket creation control is sufficient to restrict the implicit use of a > > > TCP connection. Theoretically, separate socket transformation > > > control is only required if the user wants to use (for example) SMC > > > sockets with restricted (partially or completely) TCP bind(2) and > > > connect(2) actions. But SMC (or MPTCP) applications should rely on TCP > > > communication in case of fallback. I think they are unlikely to have any > > > TCP restrictions. > > > > > > However, control of fallback to TCP by applying socket creation rules > > > is too implicit and inconvenient. > > > > > > Initially, I thought that users could expect TCP access rights to > > > completely restrict the corresponding TCP actions without additional > > > rules for sockets. I have concerns that socket transformation control > > > would not be explicit enough for such purpose. > > > > > > Probably, it will be more correctly to apply rules that deny creation of > > > SMC, MPTCP and RDS sockets (or their transformation to TCP) in > > > landlock_restrict_self() if TCP actions are not fully allowed? > > > > That should be achieved with your socket creation control patch series > > right? > > That's correct.
I was just a little worried about a possible unawareness > on the part of the user about the sockets transformation. I'll better > just make a note in the documentation about this. That's why I was talking about a dedicated access right to get clear semantics (socket creation vs. socket use/transition). However, I don't really see use cases where one should be used and not the other, and that could also be misleading to users, which means we should probably only have one access right and consider protocol transitions as a kind of socket creation (and find a more appropriate name). > > > > > I'm not sure to understand the use of landlock_restrict_self() here. > > Rulesets should fully define an access control on their own. > > You're right, landlock_restrict_self() can not define any additional > rules. > > > > > > > > > > > > > > > > > > > > Restricting any TCP connection in the kernel is probably simplest > > > > > design, but we should consider above cases to provide the most useful > > > > > one. > > > > > > > > > > [1] https://man7.org/linux/man-pages/man7/raw.7.html > > > > > [2] https://man7.org/linux/man-pages/man7/packet.7.html > > > > > [3] https://datatracker.ietf.org/doc/html/rfc7609 > > > > > [4] https://oss.oracle.com/projects/rds/dist/documentation/rds-3.1-spec.html > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > sk_is_tcp() is used for this to check address family of the socket > > > > > > > > before doing INET-specific address length validation. This is required > > > > > > > > for error consistency. > > > > Could you please send a new patch series for this specific fix, > > including minimal tests? I'd like to merge that as soon as possible, > > and it will be backported to all kernel versions. > > Ok, I'll do it ASAP.
Great > > > > > > > > > > > > > > > > > > > Closes: https://github.com/landlock-lsm/linux/issues/40 > > > > > > > > Fixes: fff69fb03dde ("landlock: Support network rules with TCP bind and connect") > > > > > > > > > > > > > > I don't know how fixes are considered in Landlock, but should this patch > > > > > > > be considered as a fix? It might be surprising for someone who thought > > > > > > > all "stream" connections were blocked to have them unblocked when > > > > > > > updating to a minor kernel version, no? > > > > > > > > > > > > Indeed. The main issue was with the semantic/definition of > > > > > > LANDLOCK_ACCESS_FS_NET_{CONNECT,BIND}_TCP. We need to synchronize the > > > > > > code with the documentation, one way or the other, preferably following > > > > > > the principle of least astonishment. > > > > > > > > > > > > > > > > > > > > (Personally, I would understand such behaviour change when upgrading to > > > > > > > a major version, and still, maybe only if there were alternatives to > > > > > > > > > > > > This "fix" needs to be backported, but we're not clear yet on what it > > > > > > should be. :) > > > > > > > > > > > > > continue having the same behaviour, e.g. a way to restrict all stream > > > > > > > sockets the same way, or something per stream socket. But that's just me > > > > > > > :) ) > > > > > > > > > > > > The documentation and the initial idea was to control TCP bind and > > > > > > connect. The kernel implementation does more than that, so we need to > > > > > > synthronize somehow. > > > > > > > > > > > > > > > > > > > > Cheers, > > > > > > > Matt > > > > > > > -- > > > > > > > Sponsored by the NGI0 Core fund. > > > > > > > > > > > > > > > > > > > > > > > ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [RFC PATCH v2 1/8] landlock: Fix non-TCP sockets restriction 2025-01-27 19:48 ` Mickaël Salaün @ 2025-01-28 10:56 ` Mikhail Ivanov 2025-01-28 18:14 ` Matthieu Baerts 0 siblings, 1 reply; 18+ messages in thread From: Mikhail Ivanov @ 2025-01-28 10:56 UTC (permalink / raw) To: Mickaël Salaün Cc: Matthieu Baerts, gnoack, willemdebruijn.kernel, matthieu, linux-security-module, netdev, netfilter-devel, yusongping, artem.kuzin, konstantin.meskhidze, MPTCP Linux, linux-nfs, Paul Moore On 1/27/2025 10:48 PM, Mickaël Salaün wrote: > On Mon, Jan 27, 2025 at 03:40:33PM +0300, Mikhail Ivanov wrote: >> On 1/24/2025 6:02 PM, Mickaël Salaün wrote: >>> On Fri, Dec 13, 2024 at 09:19:10PM +0300, Mikhail Ivanov wrote: >>>> On 12/12/2024 9:43 PM, Mickaël Salaün wrote: >>>>> On Thu, Oct 31, 2024 at 07:21:44PM +0300, Mikhail Ivanov wrote: >>>>>> On 10/18/2024 9:08 PM, Mickaël Salaün wrote: >>>>>>> On Thu, Oct 17, 2024 at 02:59:48PM +0200, Matthieu Baerts wrote: >>>>>>>> Hi Mikhail and Landlock maintainers, >>>>>>>> >>>>>>>> +cc MPTCP list. >>>>>>> >>>>>>> Thanks, we should include this list in the next series. >>>>>>> >>>>>>>> >>>>>>>> On 17/10/2024 13:04, Mikhail Ivanov wrote: >>>>>>>>> Do not check TCP access right if socket protocol is not IPPROTO_TCP. >>>>>>>>> LANDLOCK_ACCESS_NET_BIND_TCP and LANDLOCK_ACCESS_NET_CONNECT_TCP >>>>>>>>> should not restrict bind(2) and connect(2) for non-TCP protocols >>>>>>>>> (SCTP, MPTCP, SMC). >>>>>>>> >>>>>>>> Thank you for the patch! >>>>>>>> >>>>>>>> I'm part of the MPTCP team, and I'm wondering if MPTCP should not be >>>>>>>> treated like TCP here. MPTCP is an extension to TCP: on the wire, we can >>>>>>>> see TCP packets with extra TCP options. On Linux, there is indeed a >>>>>>>> dedicated MPTCP socket (IPPROTO_MPTCP), but that's just internal, >>>>>>>> because we needed such dedicated socket to talk to the userspace. 
>>>>>>>> >>>>>>>> I don't know Landlock well, but I think it is important to know that an >>>>>>>> MPTCP socket can be used to discuss with "plain" TCP packets: the kernel >>>>>>>> will do a fallback to "plain" TCP if MPTCP is not supported by the other >>>>>>>> peer or by a middlebox. It means that with this patch, if TCP is blocked >>>>>>>> by Landlock, someone can simply force an application to create an MPTCP >>>>>>>> socket -- e.g. via LD_PRELOAD -- and bypass the restrictions. It will >>>>>>>> certainly work, even when connecting to a peer not supporting MPTCP. >>>>>>>> >>>>>>>> Please note that I'm not against this modification -- especially here >>>>>>>> when we remove restrictions around MPTCP sockets :) -- I'm just saying >>>>>>>> it might be less confusing for users if MPTCP is considered as being >>>>>>>> part of TCP. A bit similar to what someone would do with a firewall: if >>>>>>>> TCP is blocked, MPTCP is blocked as well. >>>>>>> >>>>>>> Good point! I don't know well MPTCP but I think you're right. Given >>>>>>> it's close relationship with TCP and the fallback mechanism, it would >>>>>>> make sense for users to not make a difference and it would avoid bypass >>>>>>> of misleading restrictions. Moreover the Landlock rules are simple and >>>>>>> only control TCP ports, not peer addresses, which seems to be the main >>>>>>> evolution of MPTCP. > >>>>>>>> >>>>>>>> I understand that a future goal might probably be to have dedicated >>>>>>>> restrictions for MPTCP and the other stream protocols (and/or for all >>>>>>>> stream protocols like it was before this patch), but in the meantime, it >>>>>>>> might be less confusing considering MPTCP as being part of TCP (I'm not >>>>>>>> sure about the other stream protocols). >>>>>>> >>>>>>> We need to take a closer look at the other stream protocols indeed. >>>>>> Hello! Sorry for the late reply, I was on a small business trip. 
>>>>>> >>>>>> Thanks a lot for this catch, without doubt MPTCP should be controlled >>>>>> with TCP access rights. >>>>>> >>>>>> In that case, we should reconsider current semantics of TCP control. >>>>>> >>>>>> Currently, it looks like this: >>>>>> * LANDLOCK_ACCESS_NET_BIND_TCP: Bind a TCP socket to a local port. >>>>>> * LANDLOCK_ACCESS_NET_CONNECT_TCP: Connect an active TCP socket to a >>>>>> remote port. >>>>>> >>>>>> According to these definitions only TCP sockets should be restricted and >>>>>> this is already provided by Landlock (considering observing commit) >>>>>> (assuming that "TCP socket" := user space socket of IPPROTO_TCP >>>>>> protocol). >>>>>> >>>>>> AFAICS the two objectives of TCP access rights are to control >>>>>> (1) which ports can be used for sending or receiving TCP packets >>>>>> (including SYN, ACK or other service packets). >>>>>> (2) which ports can be used to establish TCP connection (performed by >>>>>> kernel network stack on server or client side). >>>>>> >>>>>> In most cases denying (2) cause denying (1). Sending or receiving TCP >>>>>> packets without initial 3-way handshake is only possible on RAW [1] or >>>>>> PACKET [2] sockets. Usage of such sockets requires root privilligies, so >>>>>> there is no point to control them with Landlock. >>>>> >>>>> I agree. >>>>> >>>>>> >>>>>> Therefore Landlock should only take care about case (2). For now >>>>>> (please correct me if I'm wrong), we only considered control of >>>>>> connection performed on user space plain TCP sockets (created with >>>>>> IPPROTO_TCP). >>>>> >>>>> Correct. Landlock is dedicated to sandbox user space processes and the >>>>> related access rights should focus on restricting what is possible >>>>> through syscalls (mainly). >>>>> >>>>>> >>>>>> TCP kernel sockets are generally used in the following ways: >>>>>> * in a couple of other user space protocols (MPTCP, SMC, RDS) >>>>>> * in a few network filesystems (e.g. 
NFS communication over TCP) >>>>>> >>>>>> For the second case TCP connection is currently not restricted by >>>>>> Landlock. This approach is may be correct, since NFS should not have >>>>>> access to a plain TCP communication and TCP restriction of NFS may >>>>>> be too implicit. Nevertheless, I think that restriction via current >>>>>> access rights should be considered. >>>>> >>>>> I'm not sure what you mean here. I'm not familiar with NFS in the >>>>> kernel. AFAIK there is no socket type for NFS. >>>> >>>> NFS client makes RPC requests to perform remote file operations on the >>>> NFS server. RPC requests can be sent using TCP, UDP, or RDMA sockets at >>>> the transport layer. >>>> >>>> Call trace of creating TCP socket for client->server communication: >>>> nfs_create_rpc_client() >>>> rpc_create() >>>> xprt_create_transport() >>>> xs_setup_tcp() >>>> xs_tcp_setup_socket() >>>> xs_create_sock() >>>> >>>> And RPC request is forwarded to TCP stack by calling >>>> xs_tcp_send_request(). >>> >>> OK, but it looks like this is connections on behalf of the kernel, that >>> only the kernel can use. In other words, when these functions are >>> called, I guess current_cred() doesn't point to user space credentials. >>> Because the kernel cannot be restricted by Landlock, we should be good. >> >> Agreed, only NFS can establish and use its connections directly. >> NFS uses kernel_{bind, connect}() methods on kernel sockets, so TCP >> operations are not checked by LSM. >> >>> >>>> >>>>> >>>>>> >>>>>> For the first case, each protocol use TCP differently, so they should >>>>>> be considered separately. >>>>> >>>>> Yes, for user-accessible protocols. >>>>> >>>>>> >>>>>> In the case of MPTCP TCP internal sockets are used to establish >>>>>> connection and exchange data between two network interfaces. MPTCP >>>>>> allows to have multiple TCP connections between two MPTCP sockets by >>>>>> connecting different network interfaces (e.g. WIFI and 3G). 
>>>>>> >>>>>> Shared Memory Communication is a protocol that allows TCP applications >>>>>> transparently use RDMA for communication [3]. TCP internal socket is >>>>>> used to exchange service CLC messages when establishing SMC connection >>>>>> (which seems harmless for sandboxing) and for communication in the case >>>>>> of fallback. Fallback happens only if RDMA communication became >>>>>> impossible (e.g. if RDMA capable RNIC card went down on host or peer >>>>>> side). So, preventing TCP communication may be achieved by controlling >>>>>> fallback mechanism. >>>>>> >>>>>> Reliable Datagram Socket is connectionless protocol implemented by >>>>>> Oracle [4]. It uses TCP stack or Infiniband to reliably deliever >>>>>> datagrams. For every sendmsg(2), recvmsg(2) it establishes TCP >>>>>> connection and use it to deliever splitted message. >>>>>> >>>>>> In comparison with previous protocols, RDS sockets cannot be binded or >>>>>> connected to special TCP ports (e.g. with bind(2), connect(2)). 16385 >>>>>> port is assigned to receiving side and sending side is binded to the >>>>>> port allocated by the kernel (by using zero as port number). >>>>>> >>>>>> It may be useful to restrict RDS-over-TCP with current access rights, >>>>>> since it allows to perform TCP communication from user-space. But it >>>>>> would be only possible to fully allow or deny sending/receiving >>>>>> (since used ports are not controlled from user space). >>>>> >>>>> Thanks for these explanations. The ability to fine-control specific >>>>> protocol operations (e.g. connect, bind) can be useful for widely used >>>>> protocol such as TCP and UDP (or if someone wants to implement it for >>>>> another protocol), but this approach would not scale with all protocols >>>>> because of their own semantic and the development efforts. The Landlock >>>>> access rights should be explicit, and we should also be able to deny >>>>> access to a whole set of protocols. 
>>>>> This should be partially possible
>>>>> with your socket creation patch series. I guess the remaining cases
>>>>> would be to cover transformation of one socket type to another. I think
>>>>> we could control such transformation by building on top of the socket
>>>>> creation control foundation: instead of controlling socket creation, add
>>>>> a new access right to control socket transformation. What do you think?
>>>>
>>>> I agree that implementing fine-control network access rights for other
>>>> protocols only to be able to completely restrict TCP operations seems
>>>> excessive.
>>>>
>>>> Do you mean the implementation of 2 access rights: for creating and
>>>> transforming sockets?
>>>
>>> Yes, but if it's not too complex I think it would make sense to only
>>> have one access right that will cover these two cases. I'm not sure
>>> there is one common point where to check these socket transformation
>>> though.
>>
>> There are at least 3 different places where some kind of transformation
>> is taking place.
>
> I'm a bit worried that we miss some of these places (now or in future
> kernel versions). We'll need a new LSM hook for that.
>
> Could you list the current locations?

Currently, I know only about TCP-related transformations:

* SMC can fall back to TCP during connection. A TCP connection is used
  (1) to exchange CLC control messages in the default case and (2) for
  the communication in the case of fallback. Once the socket has been
  connected, or the connection has failed, the socket cannot be
  reconnected. There is no existing security hook to control the
  fallback case.

* MPTCP uses TCP for communication between two network interfaces in the
  default case and can fall back to plain TCP if the remote peer does not
  support MPTCP. AFAICS, there is also no security hook to control the
  fallback transformation.

* IPv6 -> IPv4 transformation for TCP and UDP sockets with
  IPV6_ADDRFORM. This can be controlled with the setsockopt() security
  hook.
As I said before, I wonder whether a user would want to use SMC or MPTCP
while denying TCP communication, since in the common case they have to
rely on the fallback transformation during the connection. It may be
unexpected for connect(2) to fail during the fallback because of the
security policy.

Theoretically, any TCP restriction should cause a similar SMC and MPTCP
restriction. If we deny creation of TCP sockets, we should also deny
creation of SMC and MPTCP sockets. I thought that such dependencies may
be too complex and that it would be better to leave them to the user and
not provide any transformation control at all. What do you think?

The IPV6_ADDRFORM case is simple and should be covered by the "socket
creation" access right.

>
>>
>>>
>>>>
>>>> If so, there are only 2 socket protocols that can be transformed to TCP
>>>> (in the fallback path) - MPTCP and SMC. Recall that in the case of RDS,
>>>> a TCP socket can be used implicitly to deliver an RDS datagram.
>>>
>>> Hmm, interesting. Then we'll also need an access right to use a
>>> protocol? I'm worried that this kind of check would have a significant
>>> performance impact. I think we could tag a socket at creation time with
>>> the allowed protocol transitions.
>>
>> What do you mean by "to use a protocol"?
>
> To use a socket with a specific protocol. Until now, I thought being
> able to control socket creation would be enough, but being able to use
> one kind of socket with different protocols would be an issue if users
> want to control the use of protocols (which makes sense from an access
> control point of view).

Got it, thanks!

>
>>
>>>
>>>> Let's
>>>> assume that the process of configuring TCP as a transport for RDS is
>>>> also included in the socket transformation control.
>>>>
>>>> Socket creation control is sufficient to restrict the implicit use of a
>>>> TCP connection.
Theoretically, separate socket transformation >>>> control is only required if the user wants to use (for example) SMC >>>> sockets with restricted (partially or completely) TCP bind(2) and >>>> connect(2) actions. But SMC (or MPTCP) applications should rely on TCP >>>> communication in case of fallback. I think they are unlikely to have any >>>> TCP restrictions. >>>> >>>> However, control of fallback to TCP by applying socket creation rules >>>> is too implicit and inconvenient. >>>> >>>> Initially, I thought that users could expect TCP access rights to >>>> completely restrict the corresponding TCP actions without additional >>>> rules for sockets. I have concerns that socket transformation control >>>> would not be explicit enough for such purpose. >>>> >>>> Probably, it will be more correctly to apply rules that deny creation of >>>> SMC, MPTCP and RDS sockets (or their transformation to TCP) in >>>> landlock_restrict_self() if TCP actions are not fully allowed? >>> >>> That should be achieved with your socket creation control patch series >>> right? >> >> That's correct. I was just a little worried about a possible unawareness >> on the part of the user about the sockets transformation. I'll better >> just make a note in the documentation about this. > > That's why I was talking about a dedicated access right to get a clear > semantic (socket creation vs. and socket use/transition). However, I > don't really see use cases where one should be used and not the other, > and that could also misleading to users, which means we should probably > only have one access right and consider protocol transitions as a kind > of socket creation (and find a more appropriate name). Agreed. There is no point to control socket transformation with a separate right. > >> >>> >>> I'm not sure to understand the use of landlock_restrict_self() here. >>> Rulesets should fully define an access control on their own. 
>> >> You're right, landlock_restrict_self() can not define any additional >> rules. >> >>> >>>> >>>>> >>>>>> >>>>>> Restricting any TCP connection in the kernel is probably simplest >>>>>> design, but we should consider above cases to provide the most useful >>>>>> one. >>>>>> >>>>>> [1] https://man7.org/linux/man-pages/man7/raw.7.html >>>>>> [2] https://man7.org/linux/man-pages/man7/packet.7.html >>>>>> [3] https://datatracker.ietf.org/doc/html/rfc7609 >>>>>> [4] https://oss.oracle.com/projects/rds/dist/documentation/rds-3.1-spec.html >>>>>> >>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>> sk_is_tcp() is used for this to check address family of the socket >>>>>>>>> before doing INET-specific address length validation. This is required >>>>>>>>> for error consistency. >>> >>> Could you please send a new patch series for this specific fix, >>> including minimal tests? I'd like to merge that as soon as possible, >>> and it will be backported to all kernel versions. >> >> Ok, I'll do it ASAP. > > Great > >> >>> >>>>>>>>> >>>>>>>>> Closes: https://github.com/landlock-lsm/linux/issues/40 >>>>>>>>> Fixes: fff69fb03dde ("landlock: Support network rules with TCP bind and connect") >>>>>>>> >>>>>>>> I don't know how fixes are considered in Landlock, but should this patch >>>>>>>> be considered as a fix? It might be surprising for someone who thought >>>>>>>> all "stream" connections were blocked to have them unblocked when >>>>>>>> updating to a minor kernel version, no? >>>>>>> >>>>>>> Indeed. The main issue was with the semantic/definition of >>>>>>> LANDLOCK_ACCESS_FS_NET_{CONNECT,BIND}_TCP. We need to synchronize the >>>>>>> code with the documentation, one way or the other, preferably following >>>>>>> the principle of least astonishment. 
>>>>>>> >>>>>>>> >>>>>>>> (Personally, I would understand such behaviour change when upgrading to >>>>>>>> a major version, and still, maybe only if there were alternatives to >>>>>>> >>>>>>> This "fix" needs to be backported, but we're not clear yet on what it >>>>>>> should be. :) >>>>>>> >>>>>>>> continue having the same behaviour, e.g. a way to restrict all stream >>>>>>>> sockets the same way, or something per stream socket. But that's just me >>>>>>>> :) ) >>>>>>> >>>>>>> The documentation and the initial idea was to control TCP bind and >>>>>>> connect. The kernel implementation does more than that, so we need to >>>>>>> synthronize somehow. >>>>>>> >>>>>>>> >>>>>>>> Cheers, >>>>>>>> Matt >>>>>>>> -- >>>>>>>> Sponsored by the NGI0 Core fund. >>>>>>>> >>>>>>>> >>>>>> >>>> >> ^ permalink raw reply [flat|nested] 18+ messages in thread
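The dependency raised in the message above (a TCP restriction implying the same restriction for the protocols that can fall back to TCP) can be illustrated with a toy model. This is only a sketch of the idea under discussion: protocols are plain strings, and `FALLS_BACK_TO` and `effective_denied()` are hypothetical names, not part of Landlock or of any kernel API.

```python
# Toy model: protocols whose sockets may end up speaking plain TCP,
# as described in this thread (SMC and MPTCP fallback).
FALLS_BACK_TO = {
    "mptcp": {"tcp"},
    "smc": {"tcp"},
}


def effective_denied(denied):
    """Return the denied set, extended with every protocol that can
    fall back to a protocol that is already denied."""
    result = set(denied)
    changed = True
    while changed:
        changed = False
        for proto, targets in FALLS_BACK_TO.items():
            if proto not in result and targets & result:
                result.add(proto)
                changed = True
    return result


print(sorted(effective_denied({"tcp"})))  # ['mptcp', 'smc', 'tcp']
print(sorted(effective_denied({"udp"})))  # ['udp']
```

Whether such a closure should be computed implicitly or left to the user writing the ruleset is exactly the open question in the thread; the sketch only shows what the implicit variant would mean.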
* Re: [RFC PATCH v2 1/8] landlock: Fix non-TCP sockets restriction 2025-01-28 10:56 ` Mikhail Ivanov @ 2025-01-28 18:14 ` Matthieu Baerts 2025-01-29 9:52 ` Mikhail Ivanov 0 siblings, 1 reply; 18+ messages in thread From: Matthieu Baerts @ 2025-01-28 18:14 UTC (permalink / raw) To: Mikhail Ivanov, Mickaël Salaün Cc: gnoack, willemdebruijn.kernel, matthieu, linux-security-module, netdev, netfilter-devel, yusongping, artem.kuzin, konstantin.meskhidze, MPTCP Linux, linux-nfs, Paul Moore Hi Mikhail, Sorry, I didn't follow all the discussions in this thread, but here are some comments, hoping this can help to clarify the MPTCP case. On 28/01/2025 11:56, Mikhail Ivanov wrote: > On 1/27/2025 10:48 PM, Mickaël Salaün wrote: (...) >> I'm a bit worried that we miss some of these places (now or in future >> kernel versions). We'll need a new LSM hook for that. >> >> Could you list the current locations? > > Currently, I know only about TCP-related transformations: > > * SMC can fallback to TCP during connection. TCP connection is used > (1) to exchange CLC control messages in default case and (2) for the > communication in the case of fallback. If socket was connected or > connection failed, socket can not be reconnected again. There is no > existing security hook to control the fallback case, > > * MPTCP uses TCP for communication between two network interfaces in the > default case and can fallback to plain TCP if remote peer does not > support MPTCP. AFAICS, there is also no security hook to control the > fallback transformation, There are security hooks to control the path creation, but not to control the "fallback transformation". Technically, with MPTCP, the userspace will create an IPPROTO_MPTCP socket. This is only used "internally": to communicate between the userspace and the kernelspace, but not directly used between network interfaces. This "external" communication is done via one or multiple kernel TCP sockets carrying extra TCP options for the mapping. 
The userspace cannot directly control these sockets created by the kernel.

In case of fallback, the kernel TCP socket "simply" drops the extra TCP
options needed for MPTCP, and carries on like normal TCP. So on the wire
and in the Linux network stack, it is the same TCP connection, without
the MPTCP options in the TCP header. The userspace continues to
communicate with the same socket.

I'm not sure if there is a need to block the fallback: it means only one
path can be used at a time.

> * IPv6 -> IPv4 transformation for TCP and UDP sockets with
> IPV6_ADDRFORM. Can be controlled with setsockopt() security hook.
>
> As I said before, I wonder if user may want to use SMC or MPTCP and deny
> TCP communication, since he should rely on fallback transformation
> during the connection in the common case. It may be unexpected for
> connect(2) to fail during the fallback due to security policy.

With MPTCP, fallbacks can happen at the beginning of a connection, when
there is only one path. This is done after the userspace's connect(). If
the fallback is blocked, I guess the userspace will get the same errors
as when an open connection is reset.

(Note that on the listener side, the fallback can happen before the
userspace's accept(), which can even get an IPPROTO_TCP socket in return.)

> Theoretically, any TCP restriction should cause similar SMC and MPTCP
> restriction. If we deny creation of TCP sockets, we should also deny
> creation of SMC and MPTCP sockets. I thought that such dependencies may
> be too complex and it will be better to leave them for the user and not
> provide any transformation control at all. What do you think?

I guess the creation of "kernel" TCP sockets used by MPTCP (and SMC?)
can be restricted; it depends on where this hook is placed, I suppose.

(...)

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.

^ permalink raw reply [flat|nested] 18+ messages in thread
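From user space, the difference between the IPPROTO_MPTCP socket and a plain TCP one mentioned above should be observable with SO_PROTOCOL. The following is a minimal Python sketch, assuming a Linux kernel with MPTCP available; `open_mptcp_socket()` and `socket_protocol()` are illustrative helper names, and the literal 262 is only used when the running Python does not export `socket.IPPROTO_MPTCP`.

```python
import socket

# socket.IPPROTO_MPTCP is exported by Python >= 3.10 on Linux;
# 262 is the Linux value and serves as a fallback here (assumption).
IPPROTO_MPTCP = getattr(socket, "IPPROTO_MPTCP", 262)


def open_mptcp_socket():
    """Try to create an MPTCP socket; return None if the kernel refuses."""
    try:
        return socket.socket(socket.AF_INET, socket.SOCK_STREAM, IPPROTO_MPTCP)
    except OSError:
        # For example: MPTCP not built in, or disabled on this host.
        return None


def socket_protocol(sock):
    """Return the protocol number the kernel reports for this socket."""
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_PROTOCOL)


if __name__ == "__main__":
    sock = open_mptcp_socket()
    if sock is None:
        print("MPTCP sockets are not available here")
    else:
        with sock:
            kind = "MPTCP" if socket_protocol(sock) == IPPROTO_MPTCP else "TCP"
            print("kernel reports:", kind)
```

The same `socket_protocol()` check could be applied to a socket returned by accept() on an MPTCP listener, to see whether the fallback described in the note above has already happened.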
* Re: [RFC PATCH v2 1/8] landlock: Fix non-TCP sockets restriction 2025-01-28 18:14 ` Matthieu Baerts @ 2025-01-29 9:52 ` Mikhail Ivanov 2025-01-29 10:25 ` Matthieu Baerts 0 siblings, 1 reply; 18+ messages in thread From: Mikhail Ivanov @ 2025-01-29 9:52 UTC (permalink / raw) To: Matthieu Baerts, Mickaël Salaün Cc: gnoack, willemdebruijn.kernel, matthieu, linux-security-module, netdev, netfilter-devel, yusongping, artem.kuzin, konstantin.meskhidze, MPTCP Linux, linux-nfs, Paul Moore On 1/28/2025 9:14 PM, Matthieu Baerts wrote: > Hi Mikhail, > > Sorry, I didn't follow all the discussions in this thread, but here are > some comments, hoping this can help to clarify the MPTCP case. Thanks a lot for sharing your knowledge, Matthieu! > > On 28/01/2025 11:56, Mikhail Ivanov wrote: >> On 1/27/2025 10:48 PM, Mickaël Salaün wrote: > > (...) > >>> I'm a bit worried that we miss some of these places (now or in future >>> kernel versions). We'll need a new LSM hook for that. >>> >>> Could you list the current locations? >> >> Currently, I know only about TCP-related transformations: >> >> * SMC can fallback to TCP during connection. TCP connection is used >> (1) to exchange CLC control messages in default case and (2) for the >> communication in the case of fallback. If socket was connected or >> connection failed, socket can not be reconnected again. There is no >> existing security hook to control the fallback case, >> >> * MPTCP uses TCP for communication between two network interfaces in the >> default case and can fallback to plain TCP if remote peer does not >> support MPTCP. AFAICS, there is also no security hook to control the >> fallback transformation, > > There are security hooks to control the path creation, but not to > control the "fallback transformation". > > Technically, with MPTCP, the userspace will create an IPPROTO_MPTCP > socket. 
> This is only used "internally": to communicate between the
> userspace and the kernelspace, but not directly used between network
> interfaces. This "external" communication is done via one or multiple
> kernel TCP sockets carrying extra TCP options for the mapping. The
> userspace cannot directly control these sockets created by the kernel.
>
> In case of fallback, the kernel TCP socket "simply" drop the extra TCP
> options needed for MPTCP, and carry on like normal TCP. So on the wire
> and in the Linux network stack, it is the same TCP connection, without
> the MPTCP options in the TCP header. The userspace continue to
> communicate with the same socket.
>
> I'm not sure if there is a need to block the fallback: it means only one
> path can be used at a time.

You mean that users always rely on plain TCP communication if
establishing the MPTCP multipath communication fails?

>
>> * IPv6 -> IPv4 transformation for TCP and UDP sockets with
>> IPV6_ADDRFORM. Can be controlled with setsockopt() security hook.
>>
>> As I said before, I wonder if user may want to use SMC or MPTCP and deny
>> TCP communication, since he should rely on fallback transformation
>> during the connection in the common case. It may be unexpected for
>> connect(2) to fail during the fallback due to security policy.
>
> With MPTCP, fallbacks can happen at the beginning of a connection, when
> there is only one path. This is done after the userspace's connect(). If
> the fallback is blocked, I guess the userspace will get the same errors
> as when an open connection is reset.

If the fallback is blocked by the security policy, userspace should get
-EACCES. I mean, users might not expect the fallback path to be blocked
during the connection if they have allowed only MPTCP communication
using the Landlock policy.
>
> (Note that on the listener side, the fallback can happen before the
> userspace's accept() which can even get an IPPROTO_TCP socket in return)

Indeed, fallback can happen on the server side as well.

>
>> Theoretically, any TCP restriction should cause similar SMC and MPTCP
>> restriction. If we deny creation of TCP sockets, we should also deny
>> creation of SMC and MPTCP sockets. I thought that such dependencies may
>> be too complex and it will be better to leave them for the user and not
>> provide any transformation control at all. What do you think?
> I guess the creation of "kernel" TCP sockets used by MPTCP (and SMC?)
> can be restricted, it depends on where this hook is placed I suppose.

Calling
	socket(AF_INET, SOCK_STREAM, IPPROTO_MPTCP)
causes creation of a kernel TCP socket, so we can use the
security_socket_create() hook for this purpose.

>
> (...)
>
> Cheers,
> Matt

^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [RFC PATCH v2 1/8] landlock: Fix non-TCP sockets restriction 2025-01-29 9:52 ` Mikhail Ivanov @ 2025-01-29 10:25 ` Matthieu Baerts 2025-01-29 11:02 ` Mikhail Ivanov 0 siblings, 1 reply; 18+ messages in thread From: Matthieu Baerts @ 2025-01-29 10:25 UTC (permalink / raw) To: Mikhail Ivanov, Mickaël Salaün Cc: gnoack, willemdebruijn.kernel, matthieu, linux-security-module, netdev, netfilter-devel, yusongping, artem.kuzin, konstantin.meskhidze, MPTCP Linux, linux-nfs, Paul Moore Hi Mikhail, On 29/01/2025 10:52, Mikhail Ivanov wrote: > On 1/28/2025 9:14 PM, Matthieu Baerts wrote: >> Hi Mikhail, >> >> Sorry, I didn't follow all the discussions in this thread, but here are >> some comments, hoping this can help to clarify the MPTCP case. > > Thanks a lot for sharing your knowledge, Matthieu! > >> >> On 28/01/2025 11:56, Mikhail Ivanov wrote: >>> On 1/27/2025 10:48 PM, Mickaël Salaün wrote: >> >> (...) >> >>>> I'm a bit worried that we miss some of these places (now or in future >>>> kernel versions). We'll need a new LSM hook for that. >>>> >>>> Could you list the current locations? >>> >>> Currently, I know only about TCP-related transformations: >>> >>> * SMC can fallback to TCP during connection. TCP connection is used >>> (1) to exchange CLC control messages in default case and (2) for the >>> communication in the case of fallback. If socket was connected or >>> connection failed, socket can not be reconnected again. There is no >>> existing security hook to control the fallback case, >>> >>> * MPTCP uses TCP for communication between two network interfaces in the >>> default case and can fallback to plain TCP if remote peer does not >>> support MPTCP. AFAICS, there is also no security hook to control the >>> fallback transformation, >> >> There are security hooks to control the path creation, but not to >> control the "fallback transformation". >> >> Technically, with MPTCP, the userspace will create an IPPROTO_MPTCP >> socket. 
>> This is only used "internally": to communicate between the
>> userspace and the kernelspace, but not directly used between network
>> interfaces. This "external" communication is done via one or multiple
>> kernel TCP sockets carrying extra TCP options for the mapping. The
>> userspace cannot directly control these sockets created by the kernel.
>>
>> In case of fallback, the kernel TCP socket "simply" drop the extra TCP
>> options needed for MPTCP, and carry on like normal TCP. So on the wire
>> and in the Linux network stack, it is the same TCP connection, without
>> the MPTCP options in the TCP header. The userspace continue to
>> communicate with the same socket.
>>
>> I'm not sure if there is a need to block the fallback: it means only one
>> path can be used at a time.
>
> You mean that users always rely on a plain TCP communication in the case
> the connection of MPTCP multipath communication fails?

Yes, that's the same TCP connection, just without the extra bits needed
to use multiple TCP connections associated with the same MPTCP one.

>>> * IPv6 -> IPv4 transformation for TCP and UDP sockets with
>>> IPV6_ADDRFORM. Can be controlled with setsockopt() security hook.
>>>
>>> As I said before, I wonder if user may want to use SMC or MPTCP and deny
>>> TCP communication, since he should rely on fallback transformation
>>> during the connection in the common case. It may be unexpected for
>>> connect(2) to fail during the fallback due to security policy.
>>
>> With MPTCP, fallbacks can happen at the beginning of a connection, when
>> there is only one path. This is done after the userspace's connect(). If
>> the fallback is blocked, I guess the userspace will get the same errors
>> as when an open connection is reset.
>
> In the case of blocking due to security policy, userspace should get
> -EACCES. I mean, the user might not expect the fallback path to be
> blocked during the connection if he has allowed only MPTCP communication
> using the Landlock policy.
A "fallback" can happen on different occasions as mentioned in the RFC8684 [1], e.g. - The client asks to use MPTCP, but the other peer doesn't support it: Client Server | SYN + MP_CAPABLE | |------------------------->| | SYN/ACK | |<-------------------------| => Fallback on the client side | ACK | |------------------------->| - A middle box doesn't touch the 3WHS, but intercept the communication just after: Client Server | SYN + MP_CAPABLE | |------------------------->| | SYN/ACK + MP_CAPABLE | |<-------------------------| | ACK + MP_CAPABLE | |------------------------->| | DSS + data | => but the server doesn't receive the DSS |------------------------->| => So fallback on the server side | ACK | |<-------------------------| => Fallback on the client side - etc. So the connect(), even in blocking mode, can be OK, but the "fallback" will happen later. Again, once the "fallback" has been done, it just means there will be no more MPTCP options in the TCP headers, and these TCP connections, created and controlled by the kernel, will continue as "plain" TCP connections. It simply means that the MPTCP connection will be restricted to one path, because it will not be possible to create additional paths any more without these MPTCP options in the initial path. [1] https://datatracker.ietf.org/doc/html/rfc8684#name-fallback >> (Note that on the listener side, the fallback can happen before the >> userspace's accept() which can even get an IPPROTO_TCP socket in return) > > Indeed, fallback can happen on a server side as well. Same here, this fallback can happen at different stages of the connection, e.g. the server, supporting MPTCP, can receive a SYN without MP_CAPABLE option ; or the 3WHS is OK, but the MPTCP options are stripped later. >>> Theoretically, any TCP restriction should cause similar SMC and MPTCP >>> restriction. If we deny creation of TCP sockets, we should also deny >>> creation of SMC and MPTCP sockets. 
I thought that such dependencies may >>> be too complex and it will be better to leave them for the user and not >>> provide any transformation control at all. What do you think? >> I guess the creation of "kernel" TCP sockets used by MPTCP (and SMC?) >> can be restricted, it depends on where this hook is placed I suppose. > > Calling > socket(AF_INET, SOCK_STREAM, IPPROTO_MPTCP) > causes creation of kernel TCP socket, so we can use > security_socket_create() hook for this purpose. That's good if you use this hook then! Cheers, Matt -- Sponsored by the NGI0 Core fund. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [RFC PATCH v2 1/8] landlock: Fix non-TCP sockets restriction 2025-01-29 10:25 ` Matthieu Baerts @ 2025-01-29 11:02 ` Mikhail Ivanov 2025-01-29 11:33 ` Matthieu Baerts 0 siblings, 1 reply; 18+ messages in thread From: Mikhail Ivanov @ 2025-01-29 11:02 UTC (permalink / raw) To: Matthieu Baerts, Mickaël Salaün Cc: gnoack, willemdebruijn.kernel, matthieu, linux-security-module, netdev, netfilter-devel, yusongping, artem.kuzin, konstantin.meskhidze, MPTCP Linux, linux-nfs, Paul Moore On 1/29/2025 1:25 PM, Matthieu Baerts wrote: > Hi Mikhail, > > On 29/01/2025 10:52, Mikhail Ivanov wrote: >> On 1/28/2025 9:14 PM, Matthieu Baerts wrote: >>> Hi Mikhail, >>> >>> Sorry, I didn't follow all the discussions in this thread, but here are >>> some comments, hoping this can help to clarify the MPTCP case. >> >> Thanks a lot for sharing your knowledge, Matthieu! >> >>> >>> On 28/01/2025 11:56, Mikhail Ivanov wrote: >>>> On 1/27/2025 10:48 PM, Mickaël Salaün wrote: >>> >>> (...) >>> >>>>> I'm a bit worried that we miss some of these places (now or in future >>>>> kernel versions). We'll need a new LSM hook for that. >>>>> >>>>> Could you list the current locations? >>>> >>>> Currently, I know only about TCP-related transformations: >>>> >>>> * SMC can fallback to TCP during connection. TCP connection is used >>>> (1) to exchange CLC control messages in default case and (2) for the >>>> communication in the case of fallback. If socket was connected or >>>> connection failed, socket can not be reconnected again. There is no >>>> existing security hook to control the fallback case, >>>> >>>> * MPTCP uses TCP for communication between two network interfaces in the >>>> default case and can fallback to plain TCP if remote peer does not >>>> support MPTCP. AFAICS, there is also no security hook to control the >>>> fallback transformation, >>> >>> There are security hooks to control the path creation, but not to >>> control the "fallback transformation". 
>>> >>> Technically, with MPTCP, the userspace will create an IPPROTO_MPTCP >>> socket. This is only used "internally": to communicate between the >>> userspace and the kernelspace, but not directly used between network >>> interfaces. This "external" communication is done via one or multiple >>> kernel TCP sockets carrying extra TCP options for the mapping. The >>> userspace cannot directly control these sockets created by the kernel. >>> >>> In case of fallback, the kernel TCP socket "simply" drop the extra TCP >>> options needed for MPTCP, and carry on like normal TCP. So on the wire >>> and in the Linux network stack, it is the same TCP connection, without >>> the MPTCP options in the TCP header. The userspace continue to >>> communicate with the same socket. >>> >>> I'm not sure if there is a need to block the fallback: it means only one >>> path can be used at a time. >> >> You mean that users always rely on a plain TCP communication in the case >> the connection of MPTCP multipath communication fails? > > Yes, that's the same TCP connection, just without extra bit to be able > to use multiple TCP connections associated to the same MPTCP one. Indeed, so MPTCP communication should be restricted the same way as TCP. AFAICS this should be intuitive for MPTCP users and it'll be better to let userland define this dependency. > >>>> * IPv6 -> IPv4 transformation for TCP and UDP sockets withon >>>> IPV6_ADDRFORM. Can be controlled with setsockopt() security hook. >>>> >>>> As I said before, I wonder if user may want to use SMC or MPTCP and deny >>>> TCP communication, since he should rely on fallback transformation >>>> during the connection in the common case. It may be unexpected for >>>> connect(2) to fail during the fallback due to security politics. >>> >>> With MPTCP, fallbacks can happen at the beginning of a connection, when >>> there is only one path. This is done after the userspace's connect(). 
If >>> the fallback is blocked, I guess the userspace will get the same errors >>> as when an open connection is reset. >> >> In the case of blocking due to security policy, userspace should get >> -EACESS. I mean, the user might not expect the fallback path to be >> blocked during the connection if he has allowed only MPTCP communication >> using the Landlock policy. > > A "fallback" can happen on different occasions as mentioned in the > RFC8684 [1], e.g. > > - The client asks to use MPTCP, but the other peer doesn't support it: > > Client Server > | SYN + MP_CAPABLE | > |------------------------->| > | SYN/ACK | > |<-------------------------| => Fallback on the client side > | ACK | > |------------------------->| > > - A middle box doesn't touch the 3WHS, but intercept the communication > just after: > > Client Server > | SYN + MP_CAPABLE | > |------------------------->| > | SYN/ACK + MP_CAPABLE | > |<-------------------------| > | ACK + MP_CAPABLE | > |------------------------->| > | DSS + data | => but the server doesn't receive the DSS > |------------------------->| => So fallback on the server side > | ACK | > |<-------------------------| => Fallback on the client side > > - etc. > > So the connect(), even in blocking mode, can be OK, but the "fallback" > will happen later. Thanks! Theoretical "socket transformation" control should cover all these cases. You mean that it might be reasonable for a Landlock policy to block MPTCP fallback when establishing first sublflow (when client does not receive MP_CAPABLE)? > > Again, once the "fallback" has been done, it just means there will be no > more MPTCP options in the TCP headers, and these TCP connections, > created and controlled by the kernel, will continue as "plain" TCP > connections. It simply means that the MPTCP connection will be > restricted to one path, because it will not be possible to create > additional paths any more without these MPTCP options in the initial path. 
Correct, thanks > > [1] https://datatracker.ietf.org/doc/html/rfc8684#name-fallback > >>> (Note that on the listener side, the fallback can happen before the >>> userspace's accept() which can even get an IPPROTO_TCP socket in return) >> >> Indeed, fallback can happen on a server side as well. > > Same here, this fallback can happen at different stages of the > connection, e.g. the server, supporting MPTCP, can receive a SYN without > MP_CAPABLE option ; or the 3WHS is OK, but the MPTCP options are > stripped later. > >>>> Theoretically, any TCP restriction should cause similar SMC and MPTCP >>>> restriction. If we deny creation of TCP sockets, we should also deny >>>> creation of SMC and MPTCP sockets. I thought that such dependencies may >>>> be too complex and it will be better to leave them for the user and not >>>> provide any transformation control at all. What do you think? >>> I guess the creation of "kernel" TCP sockets used by MPTCP (and SMC?) >>> can be restricted, it depends on where this hook is placed I suppose. >> >> Calling >> socket(AF_INET, SOCK_STREAM, IPPROTO_MPTCP) >> causes creation of kernel TCP socket, so we can use >> security_socket_create() hook for this purpose. > > That's good if you use this hook then! > > Cheers, > Matt ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [RFC PATCH v2 1/8] landlock: Fix non-TCP sockets restriction 2025-01-29 11:02 ` Mikhail Ivanov @ 2025-01-29 11:33 ` Matthieu Baerts 2025-01-29 11:47 ` Mikhail Ivanov 0 siblings, 1 reply; 18+ messages in thread From: Matthieu Baerts @ 2025-01-29 11:33 UTC (permalink / raw) To: Mikhail Ivanov, Mickaël Salaün Cc: gnoack, willemdebruijn.kernel, matthieu, linux-security-module, netdev, netfilter-devel, yusongping, artem.kuzin, konstantin.meskhidze, MPTCP Linux, linux-nfs, Paul Moore On 29/01/2025 12:02, Mikhail Ivanov wrote: > On 1/29/2025 1:25 PM, Matthieu Baerts wrote: >> Hi Mikhail, >> >> On 29/01/2025 10:52, Mikhail Ivanov wrote: >>> On 1/28/2025 9:14 PM, Matthieu Baerts wrote: >>>> Hi Mikhail, >>>> >>>> Sorry, I didn't follow all the discussions in this thread, but here are >>>> some comments, hoping this can help to clarify the MPTCP case. >>> >>> Thanks a lot for sharing your knowledge, Matthieu! >>> >>>> >>>> On 28/01/2025 11:56, Mikhail Ivanov wrote: >>>>> On 1/27/2025 10:48 PM, Mickaël Salaün wrote: >>>> >>>> (...) >>>> >>>>>> I'm a bit worried that we miss some of these places (now or in future >>>>>> kernel versions). We'll need a new LSM hook for that. >>>>>> >>>>>> Could you list the current locations? >>>>> >>>>> Currently, I know only about TCP-related transformations: >>>>> >>>>> * SMC can fallback to TCP during connection. TCP connection is used >>>>> (1) to exchange CLC control messages in default case and (2) >>>>> for the >>>>> communication in the case of fallback. If socket was connected or >>>>> connection failed, socket can not be reconnected again. There >>>>> is no >>>>> existing security hook to control the fallback case, >>>>> >>>>> * MPTCP uses TCP for communication between two network interfaces >>>>> in the >>>>> default case and can fallback to plain TCP if remote peer does not >>>>> support MPTCP. 
AFAICS, there is also no security hook to >>>>> control the >>>>> fallback transformation, >>>> >>>> There are security hooks to control the path creation, but not to >>>> control the "fallback transformation". >>>> >>>> Technically, with MPTCP, the userspace will create an IPPROTO_MPTCP >>>> socket. This is only used "internally": to communicate between the >>>> userspace and the kernelspace, but not directly used between network >>>> interfaces. This "external" communication is done via one or multiple >>>> kernel TCP sockets carrying extra TCP options for the mapping. The >>>> userspace cannot directly control these sockets created by the kernel. >>>> >>>> In case of fallback, the kernel TCP socket "simply" drop the extra TCP >>>> options needed for MPTCP, and carry on like normal TCP. So on the wire >>>> and in the Linux network stack, it is the same TCP connection, without >>>> the MPTCP options in the TCP header. The userspace continue to >>>> communicate with the same socket. >>>> >>>> I'm not sure if there is a need to block the fallback: it means only >>>> one >>>> path can be used at a time. >>> >>> You mean that users always rely on a plain TCP communication in the case >>> the connection of MPTCP multipath communication fails? >> >> Yes, that's the same TCP connection, just without extra bit to be able >> to use multiple TCP connections associated to the same MPTCP one. > > Indeed, so MPTCP communication should be restricted the same way as TCP. > AFAICS this should be intuitive for MPTCP users and it'll be better > to let userland define this dependency. Yes, I think that would make more sense. I guess we can look at MPTCP as TCP with extra features. So if TCP is blocked, MPTCP should be blocked as well. (And eventually having the possibility to block only TCP but not MPTCP and the opposite, but that's a different topic: a possible new feature, but not a bug-fix) >>>>> * IPv6 -> IPv4 transformation for TCP and UDP sockets withon >>>>> IPV6_ADDRFORM. 
Can be controlled with setsockopt() security hook. >>>>> >>>>> As I said before, I wonder if user may want to use SMC or MPTCP and >>>>> deny >>>>> TCP communication, since he should rely on fallback transformation >>>>> during the connection in the common case. It may be unexpected for >>>>> connect(2) to fail during the fallback due to security politics. >>>> >>>> With MPTCP, fallbacks can happen at the beginning of a connection, when >>>> there is only one path. This is done after the userspace's >>>> connect(). If >>>> the fallback is blocked, I guess the userspace will get the same errors >>>> as when an open connection is reset. >>> >>> In the case of blocking due to security policy, userspace should get >>> -EACESS. I mean, the user might not expect the fallback path to be >>> blocked during the connection if he has allowed only MPTCP communication >>> using the Landlock policy. >> >> A "fallback" can happen on different occasions as mentioned in the >> RFC8684 [1], e.g. >> >> - The client asks to use MPTCP, but the other peer doesn't support it: >> >> Client Server >> | SYN + MP_CAPABLE | >> |------------------------->| >> | SYN/ACK | >> |<-------------------------| => Fallback on the client side >> | ACK | >> |------------------------->| >> >> - A middle box doesn't touch the 3WHS, but intercept the communication >> just after: >> >> Client Server >> | SYN + MP_CAPABLE | >> |------------------------->| >> | SYN/ACK + MP_CAPABLE | >> |<-------------------------| >> | ACK + MP_CAPABLE | >> |------------------------->| >> | DSS + data | => but the server doesn't receive the DSS >> |------------------------->| => So fallback on the server side >> | ACK | >> |<-------------------------| => Fallback on the client side >> >> - etc. >> >> So the connect(), even in blocking mode, can be OK, but the "fallback" >> will happen later. > > Thanks! Theoretical "socket transformation" control should cover all > these cases. 
>
> You mean that it might be reasonable for a Landlock policy to block
> MPTCP fallback when establishing first subflow (when client does not
> receive MP_CAPABLE)?

Personally, I don't even know if there is really a need for such
policies. The fallback is there to avoid blocking a connection if the
other peer doesn't support MPTCP, or if a middlebox decides to mess with
MPTCP options. So instead of an error, the connection continues but is
"degraded" by not being able to create multiple paths later on.

Maybe best to wait for a concrete use-case before implementing this?

(...)

Cheers,
Matt
--
Sponsored by the NGI0 Core fund.

^ permalink raw reply [flat|nested] 18+ messages in thread
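The two fallback cases from the diagrams quoted in this message (a SYN/ACK without MP_CAPABLE, and a first data segment whose DSS option never reaches the peer) can be modelled with a few lines of plain C. This is only an illustrative sketch, not kernel code: the enum, the struct and both function names are hypothetical, and the model deliberately ignores the other fallback situations described in RFC 8684.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one MPTCP connection, as seen by one endpoint. */
enum conn_mode {
	MODE_MPTCP,     /* MPTCP options still in use, extra paths possible */
	MODE_PLAIN_TCP, /* fell back: same TCP connection, single path only */
};

/* Hypothetical summary of what happened on the initial path. */
struct handshake_events {
	bool synack_has_mp_capable; /* did the SYN/ACK carry MP_CAPABLE? */
	bool peer_saw_dss;          /* did the first DSS + data arrive intact? */
};

/*
 * Decide in which mode the connection continues. In both fallback
 * cases the connection is kept; it only loses the MPTCP options.
 */
static enum conn_mode mode_after_handshake(const struct handshake_events *ev)
{
	/* Case 1: the peer does not support MPTCP. */
	if (!ev->synack_has_mp_capable)
		return MODE_PLAIN_TCP;
	/* Case 2: a middlebox stripped the options after the 3WHS. */
	if (!ev->peer_saw_dss)
		return MODE_PLAIN_TCP;
	return MODE_MPTCP;
}

/* Additional paths need the MPTCP options on the initial path. */
static bool can_add_subflow(enum conn_mode mode)
{
	return mode == MODE_MPTCP;
}
```

Note that neither function ever reports an error: in this model, as in the description above, a fallback degrades the connection to one path instead of failing it.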
* Re: [RFC PATCH v2 1/8] landlock: Fix non-TCP sockets restriction 2025-01-29 11:33 ` Matthieu Baerts @ 2025-01-29 11:47 ` Mikhail Ivanov 2025-01-29 11:57 ` Matthieu Baerts 2025-01-29 14:51 ` Mickaël Salaün 0 siblings, 2 replies; 18+ messages in thread From: Mikhail Ivanov @ 2025-01-29 11:47 UTC (permalink / raw) To: Matthieu Baerts, Mickaël Salaün Cc: gnoack, willemdebruijn.kernel, matthieu, linux-security-module, netdev, netfilter-devel, yusongping, artem.kuzin, konstantin.meskhidze, MPTCP Linux, linux-nfs, Paul Moore On 1/29/2025 2:33 PM, Matthieu Baerts wrote: > On 29/01/2025 12:02, Mikhail Ivanov wrote: >> On 1/29/2025 1:25 PM, Matthieu Baerts wrote: >>> Hi Mikhail, >>> >>> On 29/01/2025 10:52, Mikhail Ivanov wrote: >>>> On 1/28/2025 9:14 PM, Matthieu Baerts wrote: >>>>> Hi Mikhail, >>>>> >>>>> Sorry, I didn't follow all the discussions in this thread, but here are >>>>> some comments, hoping this can help to clarify the MPTCP case. >>>> >>>> Thanks a lot for sharing your knowledge, Matthieu! >>>> >>>>> >>>>> On 28/01/2025 11:56, Mikhail Ivanov wrote: >>>>>> On 1/27/2025 10:48 PM, Mickaël Salaün wrote: >>>>> >>>>> (...) >>>>> >>>>>>> I'm a bit worried that we miss some of these places (now or in future >>>>>>> kernel versions). We'll need a new LSM hook for that. >>>>>>> >>>>>>> Could you list the current locations? >>>>>> >>>>>> Currently, I know only about TCP-related transformations: >>>>>> >>>>>> * SMC can fallback to TCP during connection. TCP connection is used >>>>>> (1) to exchange CLC control messages in default case and (2) >>>>>> for the >>>>>> communication in the case of fallback. If socket was connected or >>>>>> connection failed, socket can not be reconnected again. There >>>>>> is no >>>>>> existing security hook to control the fallback case, >>>>>> >>>>>> * MPTCP uses TCP for communication between two network interfaces >>>>>> in the >>>>>> default case and can fallback to plain TCP if remote peer does not >>>>>> support MPTCP. 
AFAICS, there is also no security hook to >>>>>> control the >>>>>> fallback transformation, >>>>> >>>>> There are security hooks to control the path creation, but not to >>>>> control the "fallback transformation". >>>>> >>>>> Technically, with MPTCP, the userspace will create an IPPROTO_MPTCP >>>>> socket. This is only used "internally": to communicate between the >>>>> userspace and the kernelspace, but not directly used between network >>>>> interfaces. This "external" communication is done via one or multiple >>>>> kernel TCP sockets carrying extra TCP options for the mapping. The >>>>> userspace cannot directly control these sockets created by the kernel. >>>>> >>>>> In case of fallback, the kernel TCP socket "simply" drop the extra TCP >>>>> options needed for MPTCP, and carry on like normal TCP. So on the wire >>>>> and in the Linux network stack, it is the same TCP connection, without >>>>> the MPTCP options in the TCP header. The userspace continue to >>>>> communicate with the same socket. >>>>> >>>>> I'm not sure if there is a need to block the fallback: it means only >>>>> one >>>>> path can be used at a time. >>>> >>>> You mean that users always rely on a plain TCP communication in the case >>>> the connection of MPTCP multipath communication fails? >>> >>> Yes, that's the same TCP connection, just without extra bit to be able >>> to use multiple TCP connections associated to the same MPTCP one. >> >> Indeed, so MPTCP communication should be restricted the same way as TCP. >> AFAICS this should be intuitive for MPTCP users and it'll be better >> to let userland define this dependency. > > Yes, I think that would make more sense. > > I guess we can look at MPTCP as TCP with extra features. Yeap > > So if TCP is blocked, MPTCP should be blocked as well. (And eventually > having the possibility to block only TCP but not MPTCP and the opposite, > but that's a different topic: a possible new feature, but not a bug-fix) What do you mean by the "bug fix"? 
> >>>>>> * IPv6 -> IPv4 transformation for TCP and UDP sockets withon >>>>>> IPV6_ADDRFORM. Can be controlled with setsockopt() security hook. >>>>>> >>>>>> As I said before, I wonder if user may want to use SMC or MPTCP and >>>>>> deny >>>>>> TCP communication, since he should rely on fallback transformation >>>>>> during the connection in the common case. It may be unexpected for >>>>>> connect(2) to fail during the fallback due to security politics. >>>>> >>>>> With MPTCP, fallbacks can happen at the beginning of a connection, when >>>>> there is only one path. This is done after the userspace's >>>>> connect(). If >>>>> the fallback is blocked, I guess the userspace will get the same errors >>>>> as when an open connection is reset. >>>> >>>> In the case of blocking due to security policy, userspace should get >>>> -EACESS. I mean, the user might not expect the fallback path to be >>>> blocked during the connection if he has allowed only MPTCP communication >>>> using the Landlock policy. >>> >>> A "fallback" can happen on different occasions as mentioned in the >>> RFC8684 [1], e.g. >>> >>> - The client asks to use MPTCP, but the other peer doesn't support it: >>> >>> Client Server >>> | SYN + MP_CAPABLE | >>> |------------------------->| >>> | SYN/ACK | >>> |<-------------------------| => Fallback on the client side >>> | ACK | >>> |------------------------->| >>> >>> - A middle box doesn't touch the 3WHS, but intercept the communication >>> just after: >>> >>> Client Server >>> | SYN + MP_CAPABLE | >>> |------------------------->| >>> | SYN/ACK + MP_CAPABLE | >>> |<-------------------------| >>> | ACK + MP_CAPABLE | >>> |------------------------->| >>> | DSS + data | => but the server doesn't receive the DSS >>> |------------------------->| => So fallback on the server side >>> | ACK | >>> |<-------------------------| => Fallback on the client side >>> >>> - etc. 
>>>
>>> So the connect(), even in blocking mode, can be OK, but the "fallback"
>>> will happen later.
>>
>> Thanks! Theoretical "socket transformation" control should cover all
>> these cases.
>>
>> You mean that it might be reasonable for a Landlock policy to block
>> MPTCP fallback when establishing first subflow (when client does not
>> receive MP_CAPABLE)?
>
> Personally, I don't even know if there is really a need for such
> policies. The fallback is there not to block a connection if the other
> peer doesn't support MPTCP, or if a middlebox decides to mess-up with
> MPTCP options. So instead of an error, the connection continues but is
> "degraded" by not being able to create multiple paths later on.
>
> Maybe best to wait for a concrete use-case before implementing this?

Ok, got it! I agree that such policies do not seem to be useful.

>
> (...)
>
> Cheers,
> Matt

^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [RFC PATCH v2 1/8] landlock: Fix non-TCP sockets restriction
  2025-01-29 11:47 ` Mikhail Ivanov
@ 2025-01-29 11:57 ` Matthieu Baerts
  2025-01-29 14:51 ` Mickaël Salaün
  1 sibling, 0 replies; 18+ messages in thread
From: Matthieu Baerts @ 2025-01-29 11:57 UTC (permalink / raw)
  To: Mikhail Ivanov, Mickaël Salaün
  Cc: gnoack, willemdebruijn.kernel, matthieu, linux-security-module,
	netdev, netfilter-devel, yusongping, artem.kuzin,
	konstantin.meskhidze, MPTCP Linux, linux-nfs, Paul Moore

On 29/01/2025 12:47, Mikhail Ivanov wrote:
> On 1/29/2025 2:33 PM, Matthieu Baerts wrote:
>> So if TCP is blocked, MPTCP should be blocked as well. (And eventually
>> having the possibility to block only TCP but not MPTCP and the opposite,
>> but that's a different topic: a possible new feature, but not a bug-fix)
>
> What do you mean by the "bug fix"?

I mean that to me, adding the possibility to block one but not the other
might be seen as a new feature. But in the end, that's up to the
Landlock maintainers to decide! So feel free to ignore this previous
comment :)

Cheers,
Matt
--
Sponsored by the NGI0 Core fund.

^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [RFC PATCH v2 1/8] landlock: Fix non-TCP sockets restriction 2025-01-29 11:47 ` Mikhail Ivanov 2025-01-29 11:57 ` Matthieu Baerts @ 2025-01-29 14:51 ` Mickaël Salaün 2025-01-29 15:44 ` Matthieu Baerts 2025-01-31 11:04 ` Mikhail Ivanov 1 sibling, 2 replies; 18+ messages in thread From: Mickaël Salaün @ 2025-01-29 14:51 UTC (permalink / raw) To: Mikhail Ivanov Cc: Matthieu Baerts, gnoack, willemdebruijn.kernel, matthieu, linux-security-module, netdev, netfilter-devel, yusongping, artem.kuzin, konstantin.meskhidze, MPTCP Linux, linux-nfs, Paul Moore On Wed, Jan 29, 2025 at 02:47:19PM +0300, Mikhail Ivanov wrote: > On 1/29/2025 2:33 PM, Matthieu Baerts wrote: > > On 29/01/2025 12:02, Mikhail Ivanov wrote: > > > On 1/29/2025 1:25 PM, Matthieu Baerts wrote: > > > > Hi Mikhail, > > > > > > > > On 29/01/2025 10:52, Mikhail Ivanov wrote: > > > > > On 1/28/2025 9:14 PM, Matthieu Baerts wrote: > > > > > > Hi Mikhail, > > > > > > > > > > > > Sorry, I didn't follow all the discussions in this thread, but here are > > > > > > some comments, hoping this can help to clarify the MPTCP case. > > > > > > > > > > Thanks a lot for sharing your knowledge, Matthieu! > > > > > > > > > > > > > > > > > On 28/01/2025 11:56, Mikhail Ivanov wrote: > > > > > > > On 1/27/2025 10:48 PM, Mickaël Salaün wrote: > > > > > > > > > > > > (...) > > > > > > > > > > > > > > I'm a bit worried that we miss some of these places (now or in future > > > > > > > > kernel versions). We'll need a new LSM hook for that. > > > > > > > > > > > > > > > > Could you list the current locations? > > > > > > > > > > > > > > Currently, I know only about TCP-related transformations: > > > > > > > > > > > > > > * SMC can fallback to TCP during connection. TCP connection is used > > > > > > > (1) to exchange CLC control messages in default case and (2) > > > > > > > for the > > > > > > > communication in the case of fallback. 
If socket was connected or
> > > > > > > connection failed, socket can not be reconnected again. There
> > > > > > > is no
> > > > > > > existing security hook to control the fallback case,
> > > > > > >
> > > > > > > * MPTCP uses TCP for communication between two network interfaces
> > > > > > > in the
> > > > > > > default case and can fallback to plain TCP if remote peer does not
> > > > > > > support MPTCP. AFAICS, there is also no security hook to
> > > > > > > control the
> > > > > > > fallback transformation,
> > > > > >
> > > > > > There are security hooks to control the path creation, but not to
> > > > > > control the "fallback transformation".
> > > > > >
> > > > > > Technically, with MPTCP, the userspace will create an IPPROTO_MPTCP
> > > > > > socket. This is only used "internally": to communicate between the
> > > > > > userspace and the kernelspace, but not directly used between network
> > > > > > interfaces. This "external" communication is done via one or multiple
> > > > > > kernel TCP sockets carrying extra TCP options for the mapping. The
> > > > > > userspace cannot directly control these sockets created by the kernel.
> > > > > >
> > > > > > In case of fallback, the kernel TCP socket "simply" drop the extra TCP
> > > > > > options needed for MPTCP, and carry on like normal TCP. So on the wire
> > > > > > and in the Linux network stack, it is the same TCP connection, without
> > > > > > the MPTCP options in the TCP header. The userspace continue to
> > > > > > communicate with the same socket.
> > > > > >
> > > > > > I'm not sure if there is a need to block the fallback: it means only
> > > > > > one
> > > > > > path can be used at a time.

Thanks Matthieu.

So user space needs to specify IPPROTO_MPTCP to use MPTCP, but on the
network this socket can translate to "augmented" or plain TCP.

From Landlock's point of view, what matters is to have a consistent
policy that maps to user space code.
The fear was that a malicious user space that is only allowed to use
MPTCP could still transform an MPTCP socket to a TCP socket, while it
wasn't allowed to create a TCP socket in the first place. I now think
this should not be an issue because:
1. MPTCP is kind of a superset of TCP
2. user space legitimately using MPTCP should not get any error related
   to a Landlock policy because of TCP/any automatic fallback. To say
   it another way, such fallback is independent of user space requests
   and cannot be predicted because it is related to the current network
   path. This follows the principle of least astonishment (at least
   from the user space point of view).

So, if I understand correctly, this should be simple for the Landlock
socket creation control: we only check socket properties at creation
time and we ignore potential fallbacks. This should be documented
though.

As an example, if a Landlock policy only allows MPTCP:
socket(..., IPPROTO_MPTCP) should be allowed and any legitimate use of
the returned socket (according to MPTCP) should be allowed, including
TCP fallback. However, socket(..., IPPROTO_TCP/0) should only be
allowed if TCP is explicitly allowed. This means that we might end up
with an MPTCP socket only using TCP, which is OK.

I guess this should be the same for other protocols, except if user
space can explicitly transform a specific socket type to use an
*arbitrary* protocol, but I think this is not possible.

> > > > >
> > > > > You mean that users always rely on a plain TCP communication in the case
> > > > > the connection of MPTCP multipath communication fails?
> > > >
> > > > Yes, that's the same TCP connection, just without extra bit to be able
> > > > to use multiple TCP connections associated to the same MPTCP one.
> > >
> > > Indeed, so MPTCP communication should be restricted the same way as TCP.
> > > AFAICS this should be intuitive for MPTCP users and it'll be better
> > > to let userland define this dependency.
> > > > Yes, I think that would make more sense. > > > > I guess we can look at MPTCP as TCP with extra features. > > Yeap > > > > > So if TCP is blocked, MPTCP should be blocked as well. (And eventually > > having the possibility to block only TCP but not MPTCP and the opposite, > > but that's a different topic: a possible new feature, but not a bug-fix) > What do you mean by the "bug fix"? > > > > > > > > > > * IPv6 -> IPv4 transformation for TCP and UDP sockets withon > > > > > > > IPV6_ADDRFORM. Can be controlled with setsockopt() security hook. According to the man page: "It is allowed only for IPv6 sockets that are connected and bound to a v4-mapped-on-v6 address." This compatibility feature makes sense from user space point of view and should not result in an error because of Landlock. > > > > > > > > > > > > > > As I said before, I wonder if user may want to use SMC or MPTCP and > > > > > > > deny > > > > > > > TCP communication, since he should rely on fallback transformation > > > > > > > during the connection in the common case. It may be unexpected for > > > > > > > connect(2) to fail during the fallback due to security politics. > > > > > > > > > > > > With MPTCP, fallbacks can happen at the beginning of a connection, when > > > > > > there is only one path. This is done after the userspace's > > > > > > connect(). If A remaining question is then, can we repurpose an MPTCP socket that did fallback to TCP, to (re)connect to another destination (this time directly with TCP)? I guess this is possible. If it is the case, I think it should be OK anyway. That could be used by an attacker, but that should not give more access because of the MPTCP fallback mechanism anyway. We should see MPTCP as a superset of TCP. At the end, security policy is in the hands of user space. > > > > > > the fallback is blocked, I guess the userspace will get the same errors > > > > > > as when an open connection is reset. 
> > > > > > > > > > In the case of blocking due to security policy, userspace should get > > > > > -EACESS. I mean, the user might not expect the fallback path to be > > > > > blocked during the connection if he has allowed only MPTCP communication > > > > > using the Landlock policy. > > > > > > > > A "fallback" can happen on different occasions as mentioned in the > > > > RFC8684 [1], e.g. > > > > > > > > - The client asks to use MPTCP, but the other peer doesn't support it: > > > > > > > > Client Server > > > > | SYN + MP_CAPABLE | > > > > |------------------------->| > > > > | SYN/ACK | > > > > |<-------------------------| => Fallback on the client side > > > > | ACK | > > > > |------------------------->| > > > > > > > > - A middle box doesn't touch the 3WHS, but intercept the communication > > > > just after: > > > > > > > > Client Server > > > > | SYN + MP_CAPABLE | > > > > |------------------------->| > > > > | SYN/ACK + MP_CAPABLE | > > > > |<-------------------------| > > > > | ACK + MP_CAPABLE | > > > > |------------------------->| > > > > | DSS + data | => but the server doesn't receive the DSS > > > > |------------------------->| => So fallback on the server side > > > > | ACK | > > > > |<-------------------------| => Fallback on the client side > > > > > > > > - etc. > > > > > > > > So the connect(), even in blocking mode, can be OK, but the "fallback" > > > > will happen later. > > > > > > Thanks! Theoretical "socket transformation" control should cover all > > > these cases. > > > > > > You mean that it might be reasonable for a Landlock policy to block > > > MPTCP fallback when establishing first sublflow (when client does not > > > receive MP_CAPABLE)? > > > > Personally, I don't even know if there is really a need for such > > policies. The fallback is there not to block a connection if the other > > peer doesn't support MPTCP, or if a middlebox decides to mess-up with > > MPTCP options. 
So instead of an error, the connection continues but is > > "degraded" by not being able to create multiple paths later on. I agree, this kind of compatibility feature should not be denied. > > > > Maybe best to wait for a concrete use-case before implementing this? > > Ok, got it! I agree that such policies does not seem to be useful. > > > > > (...) > > > > Cheers, > > Matt > ^ permalink raw reply [flat|nested] 18+ messages in thread
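The behavior described in this message (check the socket properties once, at creation time, and ignore any later fallback) can be modelled in a few lines. This is only a sketch under stated assumptions: ModelSocket, create_socket() and kernel_fallback() are hypothetical names made up for illustration, not Landlock or kernel interfaces, and the "policy" is simply a set of allowed protocol numbers.

```python
import socket
from dataclasses import dataclass

# IPPROTO_MPTCP is 262 on Linux; older Python builds do not export the constant.
IPPROTO_MPTCP = getattr(socket, "IPPROTO_MPTCP", 262)


def effective_protocol(sock_type: int, protocol: int) -> int:
    """Resolve protocol 0 to TCP, the default for stream sockets."""
    if protocol == 0 and sock_type == socket.SOCK_STREAM:
        return socket.IPPROTO_TCP
    return protocol


@dataclass
class ModelSocket:
    """Hypothetical stand-in for a user space socket (not a kernel object)."""
    protocol: int            # what user space requested at socket(2) time
    fell_back: bool = False  # set when the connection degrades to plain TCP


def create_socket(allowed_protocols: frozenset, sock_type: int, protocol: int) -> ModelSocket:
    """Creation-time check: deny unless the requested protocol is allowed."""
    proto = effective_protocol(sock_type, protocol)
    if proto not in allowed_protocols:
        raise PermissionError("socket creation denied by policy")
    return ModelSocket(protocol=proto)


def kernel_fallback(sock: ModelSocket) -> None:
    """MPTCP fallback: no policy check, and the socket keeps its protocol."""
    if sock.protocol == IPPROTO_MPTCP:
        sock.fell_back = True
```

With a policy of frozenset({IPPROTO_MPTCP}), creating an IPPROTO_MPTCP socket succeeds and a later fallback is never denied, while requests for IPPROTO_TCP or protocol 0 raise PermissionError.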
* Re: [RFC PATCH v2 1/8] landlock: Fix non-TCP sockets restriction 2025-01-29 14:51 ` Mickaël Salaün @ 2025-01-29 15:44 ` Matthieu Baerts 2025-01-30 9:51 ` Mickaël Salaün 2025-01-31 11:04 ` Mikhail Ivanov 1 sibling, 1 reply; 18+ messages in thread From: Matthieu Baerts @ 2025-01-29 15:44 UTC (permalink / raw) To: Mickaël Salaün, Mikhail Ivanov Cc: gnoack, willemdebruijn.kernel, matthieu, linux-security-module, netdev, netfilter-devel, yusongping, artem.kuzin, konstantin.meskhidze, MPTCP Linux, linux-nfs, Paul Moore Hi Mickaël, On 29/01/2025 15:51, Mickaël Salaün wrote: > On Wed, Jan 29, 2025 at 02:47:19PM +0300, Mikhail Ivanov wrote: >> On 1/29/2025 2:33 PM, Matthieu Baerts wrote: >>> On 29/01/2025 12:02, Mikhail Ivanov wrote: >>>> On 1/29/2025 1:25 PM, Matthieu Baerts wrote: >>>>> Hi Mikhail, >>>>> >>>>> On 29/01/2025 10:52, Mikhail Ivanov wrote: >>>>>> On 1/28/2025 9:14 PM, Matthieu Baerts wrote: >>>>>>> Hi Mikhail, >>>>>>> >>>>>>> Sorry, I didn't follow all the discussions in this thread, but here are >>>>>>> some comments, hoping this can help to clarify the MPTCP case. >>>>>> >>>>>> Thanks a lot for sharing your knowledge, Matthieu! >>>>>> >>>>>>> >>>>>>> On 28/01/2025 11:56, Mikhail Ivanov wrote: >>>>>>>> On 1/27/2025 10:48 PM, Mickaël Salaün wrote: >>>>>>> >>>>>>> (...) >>>>>>> >>>>>>>>> I'm a bit worried that we miss some of these places (now or in future >>>>>>>>> kernel versions). We'll need a new LSM hook for that. >>>>>>>>> >>>>>>>>> Could you list the current locations? >>>>>>>> >>>>>>>> Currently, I know only about TCP-related transformations: >>>>>>>> >>>>>>>> * SMC can fallback to TCP during connection. TCP connection is used >>>>>>>> (1) to exchange CLC control messages in default case and (2) >>>>>>>> for the >>>>>>>> communication in the case of fallback. If socket was connected or >>>>>>>> connection failed, socket can not be reconnected again. 
There >>>>>>>> is no >>>>>>>> existing security hook to control the fallback case, >>>>>>>> >>>>>>>> * MPTCP uses TCP for communication between two network interfaces >>>>>>>> in the >>>>>>>> default case and can fallback to plain TCP if remote peer does not >>>>>>>> support MPTCP. AFAICS, there is also no security hook to >>>>>>>> control the >>>>>>>> fallback transformation, >>>>>>> >>>>>>> There are security hooks to control the path creation, but not to >>>>>>> control the "fallback transformation". >>>>>>> >>>>>>> Technically, with MPTCP, the userspace will create an IPPROTO_MPTCP >>>>>>> socket. This is only used "internally": to communicate between the >>>>>>> userspace and the kernelspace, but not directly used between network >>>>>>> interfaces. This "external" communication is done via one or multiple >>>>>>> kernel TCP sockets carrying extra TCP options for the mapping. The >>>>>>> userspace cannot directly control these sockets created by the kernel. >>>>>>> >>>>>>> In case of fallback, the kernel TCP socket "simply" drop the extra TCP >>>>>>> options needed for MPTCP, and carry on like normal TCP. So on the wire >>>>>>> and in the Linux network stack, it is the same TCP connection, without >>>>>>> the MPTCP options in the TCP header. The userspace continue to >>>>>>> communicate with the same socket. >>>>>>> >>>>>>> I'm not sure if there is a need to block the fallback: it means only >>>>>>> one >>>>>>> path can be used at a time. > > Thanks Matthieu. > > So user space needs to specific IPPROTO_MPTCP to use MPTCP, but on the > network this socket can translate to "augmented" or plain TCP. Correct. On the wire, you will only see packet with the IPPROTO_TCP protocol. When MPTCP is used, extra MPTCP options will be present in the TCP headers, but the protocol is still IPPROTO_TCP on the network. > From Landlock point of view, what matters is to have a consistent policy > that maps to user space code. 
The fear was that a malicious user space
> that is only allowed to use MPTCP could still transform an MPTCP socket
> to a TCP socket, while it wasn't allowed to create a TCP socket in the
> first place. I now think this should not be an issue because:
> 1. MPTCP is kind of a superset of TCP
> 2. user space legitimately using MPTCP should not get any error related
>    to a Landlock policy because of TCP/any automatic fallback. To say
>    it another way, such fallback is independent of user space requests
>    and may not be predicted because it is related to the current network
>    path. This follows the principle of least astonishment (at least
>    from user space point of view).
>
> So, if I understand correctly, this should be simple for the Landlock
> socket creation control: we only check socket properties at creation
> time and we ignore potential fallbacks. This should be documented
> though.

It depends on the restrictions that are put in place: are the user and
kernel sockets treated the same way? If yes, blocking TCP means that
even if it is possible for the userspace to create an IPPROTO_MPTCP
socket, the kernel will not be allowed to use IPPROTO_TCP ones to
communicate with the outside world. So blocking TCP will implicitly
block MPTCP.

On the other hand, if only TCP user sockets are blocked, then it will be
possible to use MPTCP to communicate with any TCP socket: with an
IPPROTO_MPTCP socket, it is possible to communicate with any IPPROTO_TCP
socket, but without the extra features supported by MPTCP.

> As an example, if a Landlock policies only allows MPTCP: socket(...,
> IPPROTO_MPTCP) should be allowed and any legitimate use of the returned
> socket (according to MPTCP) should be allowed, including TCP fallback.
> However, socket(..., IPPROTO_TCP/0), should only be allowed if TCP is
> explicitly allowed. This means that we might end up with an MPTCP
> socket only using TCP, which is OK.

Would it not be confusing for the person who set the Landlock policies?
Especially for the ones who had policies to block TCP, and thought they
were "safe", no?

If only TCP is blocked on the userspace side, simply using IPPROTO_MPTCP
instead of IPPROTO_TCP will allow any user to continue to talk with the
outside world. Also, it is easy to force apps to use IPPROTO_MPTCP
instead of IPPROTO_TCP, e.g. using 'mptcpize', which sets LD_PRELOAD in
order to change the parameters of the socket() call.

  mptcpize run curl https://check.mptcp.dev

> I guess this should be the same for other protocols, except if user
> space can explicitly transform a specific socket type to use an
> *arbitrary* protocol, but I think this is not possible.

I'm sorry, I don't know what is possible with the other ones. But again,
blocking both user and kernel sockets the same way might make more sense
here.

>>>>>>
>>>>>> You mean that users always rely on a plain TCP communication in the case
>>>>>> the connection of MPTCP multipath communication fails?
>>>>>
>>>>> Yes, that's the same TCP connection, just without extra bit to be able
>>>>> to use multiple TCP connections associated to the same MPTCP one.
>>>>
>>>> Indeed, so MPTCP communication should be restricted the same way as TCP.
>>>> AFAICS this should be intuitive for MPTCP users and it'll be better
>>>> to let userland define this dependency.
>>>
>>> Yes, I think that would make more sense.
>>>
>>> I guess we can look at MPTCP as TCP with extra features.
>>
>> Yeap
>>
>>>
>>> So if TCP is blocked, MPTCP should be blocked as well. (And eventually
>>> having the possibility to block only TCP but not MPTCP and the opposite,
>>> but that's a different topic: a possible new feature, but not a bug-fix)
>> What do you mean by the "bug fix"?
>>
>>>
>>>>>>>> * IPv6 -> IPv4 transformation for TCP and UDP sockets withon
>>>>>>>> IPV6_ADDRFORM. Can be controlled with setsockopt() security hook.
>
> According to the man page: "It is allowed only for IPv6 sockets that are
> connected and bound to a v4-mapped-on-v6 address."
>
> This compatibility feature makes sense from user space point of view and
> should not result in an error because of Landlock.
>
>>>>>>>>
>>>>>>>> As I said before, I wonder if user may want to use SMC or MPTCP and
>>>>>>>> deny
>>>>>>>> TCP communication, since he should rely on fallback transformation
>>>>>>>> during the connection in the common case. It may be unexpected for
>>>>>>>> connect(2) to fail during the fallback due to security politics.
>>>>>>>
>>>>>>> With MPTCP, fallbacks can happen at the beginning of a connection, when
>>>>>>> there is only one path. This is done after the userspace's
>>>>>>> connect(). If
>
> A remaining question is then, can we repurpose an MPTCP socket that did
> fallback to TCP, to (re)connect to another destination (this time
> directly with TCP)?

If the socket was created with the IPPROTO_MPTCP protocol, the protocol
will not change after a disconnection. But still, with an MPTCP socket,
it is by design possible to connect to a TCP one no matter how the socket
was used before.

> I guess this is possible. If it is the case, I think it should be OK
> anyway. That could be used by an attacker, but that should not give
> more access because of the MPTCP fallback mechanism anyway. We should
> see MPTCP as a superset of TCP. At the end, security policy is in the
> hands of user space.

As long as it is documented and not seen as a regression :)

To me, it sounds strange to have to add extra rules for MPTCP if TCP is
blocked, but that's certainly because I see MPTCP like it is seen on the
wire: as an extension to TCP, not as a different protocol.

(...)

Cheers,
Matt
--
Sponsored by the NGI0 Core fund.

^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [RFC PATCH v2 1/8] landlock: Fix non-TCP sockets restriction 2025-01-29 15:44 ` Matthieu Baerts @ 2025-01-30 9:51 ` Mickaël Salaün 2025-01-30 10:18 ` Matthieu Baerts 0 siblings, 1 reply; 18+ messages in thread From: Mickaël Salaün @ 2025-01-30 9:51 UTC (permalink / raw) To: Matthieu Baerts Cc: Mikhail Ivanov, gnoack, willemdebruijn.kernel, matthieu, linux-security-module, netdev, netfilter-devel, yusongping, artem.kuzin, konstantin.meskhidze, MPTCP Linux, linux-nfs, Paul Moore On Wed, Jan 29, 2025 at 04:44:18PM +0100, Matthieu Baerts wrote: > Hi Mickaël, > > On 29/01/2025 15:51, Mickaël Salaün wrote: > > On Wed, Jan 29, 2025 at 02:47:19PM +0300, Mikhail Ivanov wrote: > >> On 1/29/2025 2:33 PM, Matthieu Baerts wrote: > >>> On 29/01/2025 12:02, Mikhail Ivanov wrote: > >>>> On 1/29/2025 1:25 PM, Matthieu Baerts wrote: > >>>>> Hi Mikhail, > >>>>> > >>>>> On 29/01/2025 10:52, Mikhail Ivanov wrote: > >>>>>> On 1/28/2025 9:14 PM, Matthieu Baerts wrote: > >>>>>>> Hi Mikhail, > >>>>>>> > >>>>>>> Sorry, I didn't follow all the discussions in this thread, but here are > >>>>>>> some comments, hoping this can help to clarify the MPTCP case. > >>>>>> > >>>>>> Thanks a lot for sharing your knowledge, Matthieu! > >>>>>> > >>>>>>> > >>>>>>> On 28/01/2025 11:56, Mikhail Ivanov wrote: > >>>>>>>> On 1/27/2025 10:48 PM, Mickaël Salaün wrote: > >>>>>>> > >>>>>>> (...) > >>>>>>> > >>>>>>>>> I'm a bit worried that we miss some of these places (now or in future > >>>>>>>>> kernel versions). We'll need a new LSM hook for that. > >>>>>>>>> > >>>>>>>>> Could you list the current locations? > >>>>>>>> > >>>>>>>> Currently, I know only about TCP-related transformations: > >>>>>>>> > >>>>>>>> * SMC can fallback to TCP during connection. TCP connection is used > >>>>>>>> (1) to exchange CLC control messages in default case and (2) > >>>>>>>> for the > >>>>>>>> communication in the case of fallback. 
If socket was connected or > >>>>>>>> connection failed, socket can not be reconnected again. There > >>>>>>>> is no > >>>>>>>> existing security hook to control the fallback case, > >>>>>>>> > >>>>>>>> * MPTCP uses TCP for communication between two network interfaces > >>>>>>>> in the > >>>>>>>> default case and can fallback to plain TCP if remote peer does not > >>>>>>>> support MPTCP. AFAICS, there is also no security hook to > >>>>>>>> control the > >>>>>>>> fallback transformation, > >>>>>>> > >>>>>>> There are security hooks to control the path creation, but not to > >>>>>>> control the "fallback transformation". > >>>>>>> > >>>>>>> Technically, with MPTCP, the userspace will create an IPPROTO_MPTCP > >>>>>>> socket. This is only used "internally": to communicate between the > >>>>>>> userspace and the kernelspace, but not directly used between network > >>>>>>> interfaces. This "external" communication is done via one or multiple > >>>>>>> kernel TCP sockets carrying extra TCP options for the mapping. The > >>>>>>> userspace cannot directly control these sockets created by the kernel. > >>>>>>> > >>>>>>> In case of fallback, the kernel TCP socket "simply" drop the extra TCP > >>>>>>> options needed for MPTCP, and carry on like normal TCP. So on the wire > >>>>>>> and in the Linux network stack, it is the same TCP connection, without > >>>>>>> the MPTCP options in the TCP header. The userspace continue to > >>>>>>> communicate with the same socket. > >>>>>>> > >>>>>>> I'm not sure if there is a need to block the fallback: it means only > >>>>>>> one > >>>>>>> path can be used at a time. > > > > Thanks Matthieu. > > > > So user space needs to specific IPPROTO_MPTCP to use MPTCP, but on the > > network this socket can translate to "augmented" or plain TCP. > > Correct. On the wire, you will only see packet with the IPPROTO_TCP > protocol. 
When MPTCP is used, extra MPTCP options will be present in the
> TCP headers, but the protocol is still IPPROTO_TCP on the network.
>
> > From Landlock point of view, what matters is to have a consistent policy
> > that maps to user space code. The fear was that a malicious user space
> > that is only allowed to use MPTCP could still transform an MPTCP socket
> > to a TCP socket, while it wasn't allowed to create a TCP socket in the
> > first place. I now think this should not be an issue because:
> > 1. MPTCP is kind of a superset of TCP
> > 2. user space legitimately using MPTCP should not get any error related
> >    to a Landlock policy because of TCP/any automatic fallback. To say
> >    it another way, such fallback is independent of user space requests
> >    and may not be predicted because it is related to the current network
> >    path. This follows the principle of least astonishment (at least
> >    from user space point of view).
> >
> > So, if I understand correctly, this should be simple for the Landlock
> > socket creation control: we only check socket properties at creation
> > time and we ignore potential fallbacks. This should be documented
> > though.
>
> It depends on the restrictions that are put in place: are the user and
> kernel sockets treated the same way? If yes, blocking TCP means that
> even if it will be possible for the userspace to create an IPPROTO_MPTCP
> socket, the kernel will not be allowed to IPPROTO_TCP ones to
> communicate with the outside world. So blocking TCP will implicitly
> block MPTCP.
>
> On the other hand, if only TCP user sockets are blocked, then it will be
> possible to use MPTCP to communicate to any TCP sockets: with an
> IPPROTO_MPTCP socket, it is possible to communicate with any IPPROTO_TCP
> sockets, but without the extra features supported by MPTCP.

Yes, that's how Landlock works: it only enforces a security policy defined
by user space on user space. The kernel on its own is never restricted.
>
> > As an example, if a Landlock policies only allows MPTCP: socket(...,
> > IPPROTO_MPTCP) should be allowed and any legitimate use of the returned
> > socket (according to MPTCP) should be allowed, including TCP fallback.
> > However, socket(..., IPPROTO_TCP/0), should only be allowed if TCP is
> > explicitly allowed. This means that we might end up with an MPTCP
> > socket only using TCP, which is OK.
>
> Would it not be confusing for the person who set the Landlock policies?
> Especially for the ones who had policies to block TCP, and thought they
> were "safe", no?

There are two kinds of users for Landlock:
1. developers sandboxing their applications;
2. sysadmins or security experts sandboxing execution environments (e.g.
   with container runtimes, service managers, sandboxing tools...).

It would make sense for developers to allow what their code requests,
whatever fallback the kernel might use instead. In this case, they
should not care about MPTCP being TCP with some flags underneath.
Moreover, developers might not be aware of the system on which their
application is running, and their concern should mainly be about
compatibility.

For security or network experts, the fact that allowing MPTCP implies
allowing the fallback to TCP might be a bit surprising at first, but they
should have the knowledge to know how MPTCP works underneath, including
this fallback mechanism. Moreover, this kind of user can (and should)
also rely on system-wide security policies such as Netfilter, which
give more control.

In a nutshell, Landlock should favor compatibility at the sandboxing/app
layers and we should rely on system-wide security policies (taking into
account the running system's context) for more fine-grained control.
These compatibility behaviors should be explained in the Landlock
documentation though.

>
> If only TCP is blocked on the userspace side, simply using IPPROTO_MPTCP
> instead of IPPROTO_TCP will allow any users to continue to talk with the
> outside world.
Also, it is easy to force apps to use IPPROTO_MPTCP
> instead of IPPROTO_TCP, e.g. using 'mptcpize' which set LD_PRELOAD in
> order to change the parameters of the socket() call.
>
>   mptcpize run curl https://check.mptcp.dev

Landlock restrictions are enforced at a specific time for a process and
all its future children. LD_PRELOAD is not an issue because a security
policy cannot be disabled once enforced. If a sandboxed program uses
MPTCP (because of LD_PRELOAD) instead of TCP, the previously enforced
policy will still be enforced the same way (either to allow or deny the
use of MPTCP).

The only issue with LD_PRELOAD could be when e.g. curl sandboxes itself
and denies itself the use of MPTCP, whereas mptcpize would "patch" the
curl process to use MPTCP. In this case, connections would fail. A
solution would be for mptcpize to "patch" the Landlock security policy
as well, or for curl to be more permissive. If the sandboxing happens
before calling mptcpize, or if it is enforced by mptcpize, then it would
work as expected.

>
> > I guess this should be the same for other protocols, except if user
> > space can explicitly transform a specific socket type to use an
> > *arbitrary* protocol, but I think this is not possible.
> I'm sorry, I don't know what is possible with the other ones. But again,
> blocking both user and kernel sockets the same way might make more sense
> here.
>
> >>>>>>
> >>>>>> You mean that users always rely on a plain TCP communication in the case
> >>>>>> the connection of MPTCP multipath communication fails?
> >>>>>
> >>>>> Yes, that's the same TCP connection, just without extra bit to be able
> >>>>> to use multiple TCP connections associated to the same MPTCP one.
> >>>>
> >>>> Indeed, so MPTCP communication should be restricted the same way as TCP.
> >>>> AFAICS this should be intuitive for MPTCP users and it'll be better
> >>>> to let userland define this dependency.
> >>>
> >>> Yes, I think that would make more sense.
> >>> > >>> I guess we can look at MPTCP as TCP with extra features. > >> > >> Yeap > >> > >>> > >>> So if TCP is blocked, MPTCP should be blocked as well. (And eventually > >>> having the possibility to block only TCP but not MPTCP and the opposite, > >>> but that's a different topic: a possible new feature, but not a bug-fix) > >> What do you mean by the "bug fix"? > >> > >>> > >>>>>>>> * IPv6 -> IPv4 transformation for TCP and UDP sockets withon > >>>>>>>> IPV6_ADDRFORM. Can be controlled with setsockopt() security hook. > > > > According to the man page: "It is allowed only for IPv6 sockets that are > > connected and bound to a v4-mapped-on-v6 address." > > > > This compatibility feature makes sense from user space point of view and > > should not result in an error because of Landlock. > > > >>>>>>>> > >>>>>>>> As I said before, I wonder if user may want to use SMC or MPTCP and > >>>>>>>> deny > >>>>>>>> TCP communication, since he should rely on fallback transformation > >>>>>>>> during the connection in the common case. It may be unexpected for > >>>>>>>> connect(2) to fail during the fallback due to security politics. > >>>>>>> > >>>>>>> With MPTCP, fallbacks can happen at the beginning of a connection, when > >>>>>>> there is only one path. This is done after the userspace's > >>>>>>> connect(). If > > > > A remaining question is then, can we repurpose an MPTCP socket that did > > fallback to TCP, to (re)connect to another destination (this time > > directly with TCP)? > > If the socket was created with the IPPROTO_MPTCP protocol, the protocol > will not change after a disconnection. But still, with an MPTCP socket, > it is by design possible to connect to a TCP one no mater how the socket > was used before. OK, this makes sense if we see MPTCP as a superset of TCP. > > > I guess this is possible. If it is the case, I think it should be OK > > anyway. 
That could be used by an attacker, but that should not give > > more access because of the MPTCP fallback mechanism anyway. We should > > see MPTCP as a superset of TCP. At the end, security policy is in the > > hands of user space. > > As long as it is documented and not seen as a regression :) > > To me, it sounds strange to have to add extra rules for MPTCP if TCP is > blocked, but that's certainly because I see MPTCP like it is seen on the > wire: as an extension to TCP, not as a different protocol. I understand. For Landlock, I'd prefer to not add exceptions according to protocol implementations, but to define a security policy that could easily map to user space code. The current proposal is to map the Landlock API to (a superset of) the socket(2) API, and then being able to specify restrictions on a domain, a type, or a protocol. However, we could document and encourage users to only specify AF_INET/AF_INET6 + SOCK_STREAM but without specifying any protocol (not "0" but a wildcard "(u64)-1"), which would then implicitly allow TCP and MPTCP. > > (...) > > Cheers, > Matt > -- > Sponsored by the NGI0 Core fund. > ^ permalink raw reply [flat|nested] 18+ messages in thread
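The wildcard idea proposed at the end of this message can be illustrated with a small matching routine. This is a minimal sketch, assuming a rule is a (family, type, protocol) triple and that the wildcard is the all-ones 64-bit value written "(u64)-1" above; SocketRule, PROTO_WILDCARD, rule_matches() and creation_allowed() are hypothetical names, since the interface is only a proposal at this point in the thread.

```python
import socket
from typing import Iterable, NamedTuple

# IPPROTO_MPTCP is 262 on Linux; older Python builds do not export the constant.
IPPROTO_MPTCP = getattr(socket, "IPPROTO_MPTCP", 262)

# Models the "(u64)-1" wildcard from the proposal (hypothetical encoding).
PROTO_WILDCARD = (1 << 64) - 1


class SocketRule(NamedTuple):
    """Hypothetical rule mirroring the three arguments of socket(2)."""
    family: int
    sock_type: int
    protocol: int


def rule_matches(rule: SocketRule, family: int, sock_type: int, protocol: int) -> bool:
    """A rule matches on exact family and type; the protocol may be a wildcard."""
    return (
        rule.family == family
        and rule.sock_type == sock_type
        and rule.protocol in (PROTO_WILDCARD, protocol)
    )


def creation_allowed(rules: Iterable[SocketRule], family: int, sock_type: int, protocol: int) -> bool:
    """Socket creation is allowed if at least one rule matches the request."""
    return any(rule_matches(r, family, sock_type, protocol) for r in rules)
```

In this model a single rule SocketRule(AF_INET, SOCK_STREAM, PROTO_WILDCARD) admits requests for IPPROTO_TCP, IPPROTO_MPTCP and protocol 0 alike, whereas a rule naming only IPPROTO_MPTCP rejects a plain TCP request. Note that protocol 0 in a rule is treated as an exact value here, not as a wildcard, which follows the 'not "0"' remark above.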
* Re: [RFC PATCH v2 1/8] landlock: Fix non-TCP sockets restriction 2025-01-30 9:51 ` Mickaël Salaün @ 2025-01-30 10:18 ` Matthieu Baerts 0 siblings, 0 replies; 18+ messages in thread From: Matthieu Baerts @ 2025-01-30 10:18 UTC (permalink / raw) To: Mickaël Salaün Cc: Mikhail Ivanov, gnoack, willemdebruijn.kernel, matthieu, linux-security-module, netdev, netfilter-devel, yusongping, artem.kuzin, konstantin.meskhidze, MPTCP Linux, linux-nfs, Paul Moore Hi Mickaël, On 30/01/2025 10:51, Mickaël Salaün wrote: > On Wed, Jan 29, 2025 at 04:44:18PM +0100, Matthieu Baerts wrote: >> Hi Mickaël, >> >> On 29/01/2025 15:51, Mickaël Salaün wrote: >>> On Wed, Jan 29, 2025 at 02:47:19PM +0300, Mikhail Ivanov wrote: >>>> On 1/29/2025 2:33 PM, Matthieu Baerts wrote: >>>>> On 29/01/2025 12:02, Mikhail Ivanov wrote: >>>>>> On 1/29/2025 1:25 PM, Matthieu Baerts wrote: >>>>>>> Hi Mikhail, >>>>>>> >>>>>>> On 29/01/2025 10:52, Mikhail Ivanov wrote: >>>>>>>> On 1/28/2025 9:14 PM, Matthieu Baerts wrote: >>>>>>>>> Hi Mikhail, >>>>>>>>> >>>>>>>>> Sorry, I didn't follow all the discussions in this thread, but here are >>>>>>>>> some comments, hoping this can help to clarify the MPTCP case. >>>>>>>> >>>>>>>> Thanks a lot for sharing your knowledge, Matthieu! >>>>>>>> >>>>>>>>> >>>>>>>>> On 28/01/2025 11:56, Mikhail Ivanov wrote: >>>>>>>>>> On 1/27/2025 10:48 PM, Mickaël Salaün wrote: >>>>>>>>> >>>>>>>>> (...) >>>>>>>>> >>>>>>>>>>> I'm a bit worried that we miss some of these places (now or in future >>>>>>>>>>> kernel versions). We'll need a new LSM hook for that. >>>>>>>>>>> >>>>>>>>>>> Could you list the current locations? >>>>>>>>>> >>>>>>>>>> Currently, I know only about TCP-related transformations: >>>>>>>>>> >>>>>>>>>> * SMC can fallback to TCP during connection. TCP connection is used >>>>>>>>>> (1) to exchange CLC control messages in default case and (2) >>>>>>>>>> for the >>>>>>>>>> communication in the case of fallback. 
If socket was connected or >>>>>>>>>> connection failed, socket can not be reconnected again. There >>>>>>>>>> is no >>>>>>>>>> existing security hook to control the fallback case, >>>>>>>>>> >>>>>>>>>> * MPTCP uses TCP for communication between two network interfaces >>>>>>>>>> in the >>>>>>>>>> default case and can fallback to plain TCP if remote peer does not >>>>>>>>>> support MPTCP. AFAICS, there is also no security hook to >>>>>>>>>> control the >>>>>>>>>> fallback transformation, >>>>>>>>> >>>>>>>>> There are security hooks to control the path creation, but not to >>>>>>>>> control the "fallback transformation". >>>>>>>>> >>>>>>>>> Technically, with MPTCP, the userspace will create an IPPROTO_MPTCP >>>>>>>>> socket. This is only used "internally": to communicate between the >>>>>>>>> userspace and the kernelspace, but not directly used between network >>>>>>>>> interfaces. This "external" communication is done via one or multiple >>>>>>>>> kernel TCP sockets carrying extra TCP options for the mapping. The >>>>>>>>> userspace cannot directly control these sockets created by the kernel. >>>>>>>>> >>>>>>>>> In case of fallback, the kernel TCP socket "simply" drop the extra TCP >>>>>>>>> options needed for MPTCP, and carry on like normal TCP. So on the wire >>>>>>>>> and in the Linux network stack, it is the same TCP connection, without >>>>>>>>> the MPTCP options in the TCP header. The userspace continue to >>>>>>>>> communicate with the same socket. >>>>>>>>> >>>>>>>>> I'm not sure if there is a need to block the fallback: it means only >>>>>>>>> one >>>>>>>>> path can be used at a time. >>> >>> Thanks Matthieu. >>> >>> So user space needs to specific IPPROTO_MPTCP to use MPTCP, but on the >>> network this socket can translate to "augmented" or plain TCP. >> >> Correct. On the wire, you will only see packet with the IPPROTO_TCP >> protocol. 
When MPTCP is used, extra MPTCP options will be present in the >> TCP headers, but the protocol is still IPPROTO_TCP on the network. >> >>> From Landlock point of view, what matters is to have a consistent policy >>> that maps to user space code. The fear was that a malicious user space >>> that is only allowed to use MPTCP could still transform an MPTCP socket >>> to a TCP socket, while it wasn't allowed to create a TCP socket in the >>> first place. I now think this should not be an issue because: >>> 1. MPTCP is kind of a superset of TCP >>> 2. user space legitimately using MPTCP should not get any error related >>> to a Landlock policy because of TCP/any automatic fallback. To say >>> it another way, such fallback is independent of user space requests >>> and may not be predicted because it is related to the current network >>> path. This follows the principle of least astonishment (at least >>> from user space point of view). >>> >>> So, if I understand correctly, this should be simple for the Landlock >>> socket creation control: we only check socket properties at creation >>> time and we ignore potential fallbacks. This should be documented >>> though. >> >> It depends on the restrictions that are put in place: are the user and >> kernel sockets treated the same way? If yes, blocking TCP means that >> even if it will be possible for the userspace to create an IPPROTO_MPTCP >> socket, the kernel will not be allowed to IPPROTO_TCP ones to >> communicate with the outside world. So blocking TCP will implicitly >> block MPTCP. >> >> On the other hand, if only TCP user sockets are blocked, then it will be >> possible to use MPTCP to communicate to any TCP sockets: with an >> IPPROTO_MPTCP socket, it is possible to communicate with any IPPROTO_TCP >> sockets, but without the extra features supported by MPTCP. > > Yes, that how Landlock works, it only enforces a security policy defined > by user space on user space. The kernel on its own is never restricted. 
OK, thank you, that's clearer. >>> As an example, if a Landlock policies only allows MPTCP: socket(..., >>> IPPROTO_MPTCP) should be allowed and any legitimate use of the returned >>> socket (according to MPTCP) should be allowed, including TCP fallback. >>> However, socket(..., IPPROTO_TCP/0), should only be allowed if TCP is >>> explicitly allowed. This means that we might end up with an MPTCP >>> socket only using TCP, which is OK. >> >> Would it not be confusing for the person who set the Landlock policies? >> Especially for the ones who had policies to block TCP, and thought they >> were "safe", no? > > There are two kind of users for Landlock: > 1. developers sandboxing their applications; > 2. sysadmins or security experts sandboxing execution environments (e.g. > with container runtimes, service managers, sandboxing tools...). > > It would make sense for developers to allow what their code request, > whatever fallback the kernel might use instead. In this case, they > should not care about MPTCP being TCP with some flags underneath. > Moreover, developers might not be aware of the system on which their > application is running, and their concern should mainly be about > compatibility. > > For security or network experts, implying that allowing MPTCP means that > fallback to TCP is allowed might be a bit surprising at first, but they > should have the knowledge to know how MPTCP works underneath, including > this fallback mechanism. Moreover, this kind of users can (and should) > also rely on system-wide security policies such as Netfilter, which > give more control. > > In a nutshell, Landlock should favor compatibility at the sandboxing/app > layers and we should rely on system-wide security policies (taking into > account the running system's context) for more fine-grained control. > This compatibility behaviors should be explained in the Landlock > documentation though. Thank you, also clearer! 

In my mind, Landlock would be used to get a sort of "jail" so that "any"
users could use it to run untrusted apps for example. In that case, I
was thinking not everybody will know that MPTCP can be used to bypass
some restrictions only applied to TCP sockets.

>> If only TCP is blocked on the userspace side, simply using IPPROTO_MPTCP
>> instead of IPPROTO_TCP will allow any users to continue to talk with the
>> outside world. Also, it is easy to force apps to use IPPROTO_MPTCP
>> instead of IPPROTO_TCP, e.g. using 'mptcpize' which sets LD_PRELOAD in
>> order to change the parameters of the socket() call.
>>
>>   mptcpize run curl https://check.mptcp.dev
>
> Landlock restrictions are enforced at a specific time for a process and
> all its future children. LD_PRELOAD is not an issue because a security
> policy cannot be disabled once enforced. If a sandboxed program uses
> MPTCP (because of LD_PRELOAD) instead of TCP, the previously enforced
> policy will be enforced the same (either to allow or deny the use of
> MPTCP).
>
> The only issue with LD_PRELOAD could be when e.g. curl sandboxes itself
> and denies itself the use of MPTCP, whereas mptcpize would "patch" the
> curl process to use MPTCP. In this case, connections would fail. A
> solution would be for mptcpize to "patch" the Landlock security as well,
> or for curl to be more permissive. If the sandboxing happens before
> calling mptcpize, or if it is enforced by mptcpize, then it would work
> as expected.

OK, it is clearer for me now that I understand apps can sandbox
themselves!

>>> I guess this should be the same for other protocols, except if user
>>> space can explicitly transform a specific socket type to use an
>>> *arbitrary* protocol, but I think this is not possible.
>> I'm sorry, I don't know what is possible with the other ones. But again,
>> blocking both user and kernel sockets the same way might make more sense
>> here.
>>
>>>>>>>> You mean that users always rely on a plain TCP communication in the case
>>>>>>>> the connection of MPTCP multipath communication fails?
>>>>>>>
>>>>>>> Yes, that's the same TCP connection, just without extra bit to be able
>>>>>>> to use multiple TCP connections associated to the same MPTCP one.
>>>>>>
>>>>>> Indeed, so MPTCP communication should be restricted the same way as TCP.
>>>>>> AFAICS this should be intuitive for MPTCP users and it'll be better
>>>>>> to let userland define this dependency.
>>>>>
>>>>> Yes, I think that would make more sense.
>>>>>
>>>>> I guess we can look at MPTCP as TCP with extra features.
>>>>
>>>> Yeap
>>>>
>>>>>
>>>>> So if TCP is blocked, MPTCP should be blocked as well. (And eventually
>>>>> having the possibility to block only TCP but not MPTCP and the opposite,
>>>>> but that's a different topic: a possible new feature, but not a bug-fix)
>>>> What do you mean by the "bug fix"?
>>>>
>>>>>
>>>>>>>>>> * IPv6 -> IPv4 transformation for TCP and UDP sockets with
>>>>>>>>>> IPV6_ADDRFORM. Can be controlled with setsockopt() security hook.
>>>
>>> According to the man page: "It is allowed only for IPv6 sockets that are
>>> connected and bound to a v4-mapped-on-v6 address."
>>>
>>> This compatibility feature makes sense from user space point of view and
>>> should not result in an error because of Landlock.
>>>
>>>>>>>>>>
>>>>>>>>>> As I said before, I wonder if user may want to use SMC or MPTCP and
>>>>>>>>>> deny TCP communication, since he should rely on fallback transformation
>>>>>>>>>> during the connection in the common case. It may be unexpected for
>>>>>>>>>> connect(2) to fail during the fallback due to security policy.
>>>>>>>>>
>>>>>>>>> With MPTCP, fallbacks can happen at the beginning of a connection, when
>>>>>>>>> there is only one path. This is done after the userspace's
>>>>>>>>> connect().
If
>>>
>>> A remaining question is then, can we repurpose an MPTCP socket that did
>>> fallback to TCP, to (re)connect to another destination (this time
>>> directly with TCP)?
>>
>> If the socket was created with the IPPROTO_MPTCP protocol, the protocol
>> will not change after a disconnection. But still, with an MPTCP socket,
>> it is by design possible to connect to a TCP one no matter how the socket
>> was used before.
>
> OK, this makes sense if we see MPTCP as a superset of TCP.
>
>>
>>> I guess this is possible. If it is the case, I think it should be OK
>>> anyway. That could be used by an attacker, but that should not give
>>> more access because of the MPTCP fallback mechanism anyway. We should
>>> see MPTCP as a superset of TCP. At the end, security policy is in the
>>> hands of user space.
>>
>> As long as it is documented and not seen as a regression :)
>>
>> To me, it sounds strange to have to add extra rules for MPTCP if TCP is
>> blocked, but that's certainly because I see MPTCP like it is seen on the
>> wire: as an extension to TCP, not as a different protocol.
>
> I understand. For Landlock, I'd prefer to not add exceptions according
> to protocol implementations, but to define a security policy that could
> easily map to user space code. The current proposal is to map the
> Landlock API to (a superset of) the socket(2) API, and then being able
> to specify restrictions on a domain, a type, or a protocol. However, we
> could document and encourage users to only specify AF_INET/AF_INET6 +
> SOCK_STREAM but without specifying any protocol (not "0" but a wildcard
> "(u64)-1"), which would then implicitly allow TCP and MPTCP.

Good idea!

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.

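The wildcard idea that closes the message above can be modelled in a few
lines. This is only a sketch of the matching semantics discussed in the
thread, not the Landlock API: SocketRule, ANY_PROTOCOL and
creation_allowed() are hypothetical names, and the socket creation
control itself was still an RFC proposal at this point. The check looks
only at the socket(2) arguments at creation time and ignores any later
fallback, as described earlier in the message.

```python
import socket
from dataclasses import dataclass

# 262 is the Linux value of IPPROTO_MPTCP; some Python builds do not export it.
IPPROTO_MPTCP = getattr(socket, "IPPROTO_MPTCP", 262)

# Hypothetical wildcard marker, mirroring the "(u64)-1" suggestion in the thread.
ANY_PROTOCOL = (1 << 64) - 1


@dataclass(frozen=True)
class SocketRule:
    """One allowed (family, type, protocol) triple; protocol may be a wildcard."""
    family: int
    sock_type: int
    protocol: int = ANY_PROTOCOL


def creation_allowed(rules, family, sock_type, protocol):
    """Return True if at least one rule matches the socket(2) arguments.

    Protocols are compared literally: this sketch does not map protocol 0
    to the default protocol of the family/type pair.
    """
    return any(
        rule.family == family
        and rule.sock_type == sock_type
        and rule.protocol in (ANY_PROTOCOL, protocol)
        for rule in rules
    )


if __name__ == "__main__":
    stream_any = [SocketRule(socket.AF_INET, socket.SOCK_STREAM)]
    mptcp_only = [SocketRule(socket.AF_INET, socket.SOCK_STREAM, IPPROTO_MPTCP)]
    for name, rules in (("wildcard", stream_any), ("MPTCP only", mptcp_only)):
        for proto in (socket.IPPROTO_TCP, IPPROTO_MPTCP):
            ok = creation_allowed(rules, socket.AF_INET, socket.SOCK_STREAM, proto)
            print(name, proto, "allowed" if ok else "denied")
```

With the wildcard rule both TCP and MPTCP stream sockets pass the
creation check. With the MPTCP-only rule a plain TCP socket is refused
at creation, while an MPTCP socket that later falls back to TCP is not
revisited by this check.
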
* Re: [RFC PATCH v2 1/8] landlock: Fix non-TCP sockets restriction
  2025-01-29 14:51 ` Mickaël Salaün
  2025-01-29 15:44   ` Matthieu Baerts
@ 2025-01-31 11:04   ` Mikhail Ivanov
  1 sibling, 0 replies; 18+ messages in thread
From: Mikhail Ivanov @ 2025-01-31 11:04 UTC
To: Mickaël Salaün
Cc: Matthieu Baerts, gnoack, willemdebruijn.kernel, matthieu,
    linux-security-module, netdev, netfilter-devel, yusongping,
    artem.kuzin, konstantin.meskhidze, MPTCP Linux, linux-nfs, Paul Moore

On 1/29/2025 5:51 PM, Mickaël Salaün wrote:
>>>>>>> On 28/01/2025 11:56, Mikhail Ivanov wrote:
[...]
>>>>>>>> * IPv6 -> IPv4 transformation for TCP and UDP sockets with
>>>>>>>> IPV6_ADDRFORM. Can be controlled with setsockopt() security hook.
>
> According to the man page: "It is allowed only for IPv6 sockets that are
> connected and bound to a v4-mapped-on-v6 address."
>
> This compatibility feature makes sense from user space point of view and
> should not result in an error because of Landlock.

IPV6_ADDRFORM is useful to pass IPv6 sockets bound and connected to
v4-mapped-on-v6 addresses to pure IPv4 applications [1].

I just realized we first need to consider restriction of IPv4 access for
IPv4/v6 dual stack. It's possible to communicate with an IPv4 peer using
an IPv6 socket (on client or server side) that is mapped to a
v4-mapped-on-v6 address (RFC 3493 [2]). If socket access rights provide
separate control over IPv6 and IPv4, v4-mapped-on-v6 looks like a
possible bypass of the IPv4 restriction and a violation of the least
astonishment principle.

This can be controlled with the IPV6_V6ONLY socket option or with the
net.ipv6.bindv6only sysctl knob. Restriction with the sysctl knob is
applied globally and may break some dual-stack dependent applications.

I'm currently trying to collect real-world examples in which a user may
want to allow IPv6-only communication in a sandboxed environment.
Theoretically, this can be seen as an unprivileged reduction of attack
surface for IPv6-only programs in a dual-stack network (disallow opening
IPv4 connections and communicating with loopback via the IPv4 stack).
Earlier, possible security issues on the userland side related to
different address representation and address filtering were also
discussed [3]. But I don't really think these are good examples for the
motivation.

If the v4-mapped-on-v6 addressing control is deemed reasonable, it would
be better implemented with a new access right for LANDLOCK_RULE_NET_PORT
rather than as a part of socket creation control.

[1] https://man7.org/linux/man-pages/man7/ipv6.7.html
[2] https://datatracker.ietf.org/doc/html/rfc3493#section-3.7
[3] https://lwn.net/Articles/688462/

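To make the dual-stack concern in the message above concrete, here is a
small user space sketch. It assumes nothing about Landlock: it only
shows how a v4-mapped-on-v6 address hides an IPv4 peer, and how a single
socket can opt out of that mapping with IPV6_V6ONLY. The helper names
ipv4_behind() and open_v6only_stream_socket() are made up for
illustration.

```python
import ipaddress
import socket


def ipv4_behind(addr: str):
    """Return the IPv4 address embedded in a v4-mapped IPv6 address, else None."""
    return ipaddress.IPv6Address(addr).ipv4_mapped


def open_v6only_stream_socket() -> socket.socket:
    """Create an IPv6 TCP socket that does not accept v4-mapped addresses."""
    sock = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
    # IPV6_V6ONLY=1 applies to this socket only; the net.ipv6.bindv6only
    # sysctl changes the default for every new socket on the system.
    sock.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 1)
    return sock


if __name__ == "__main__":
    # 192.0.2.1 and 2001:db8::1 are documentation-only example addresses.
    print(ipv4_behind("::ffff:192.0.2.1"))
    print(ipv4_behind("2001:db8::1"))
```

A filter that only compares address families would treat
::ffff:192.0.2.1 as IPv6 even though the traffic reaches an IPv4 peer,
which is the kind of mismatch the message describes.
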
Thread overview: 18+ messages
[not found] <20241017110454.265818-1-ivanov.mikhail1@huawei-partners.com>
[not found] ` <20241017110454.265818-2-ivanov.mikhail1@huawei-partners.com>
[not found] ` <49bc2227-d8e1-4233-8bc4-4c2f0a191b7c@kernel.org>
[not found] ` <20241018.Kahdeik0aaCh@digikod.net>
[not found] ` <62336067-18c2-3493-d0ec-6dd6a6d3a1b5@huawei-partners.com>
2024-12-12 18:43 ` [RFC PATCH v2 1/8] landlock: Fix non-TCP sockets restriction Mickaël Salaün
2024-12-13 18:19 ` Mikhail Ivanov
2025-01-24 15:02 ` Mickaël Salaün
2025-01-27 12:40 ` Mikhail Ivanov
2025-01-27 19:48 ` Mickaël Salaün
2025-01-28 10:56 ` Mikhail Ivanov
2025-01-28 18:14 ` Matthieu Baerts
2025-01-29 9:52 ` Mikhail Ivanov
2025-01-29 10:25 ` Matthieu Baerts
2025-01-29 11:02 ` Mikhail Ivanov
2025-01-29 11:33 ` Matthieu Baerts
2025-01-29 11:47 ` Mikhail Ivanov
2025-01-29 11:57 ` Matthieu Baerts
2025-01-29 14:51 ` Mickaël Salaün
2025-01-29 15:44 ` Matthieu Baerts
2025-01-30 9:51 ` Mickaël Salaün
2025-01-30 10:18 ` Matthieu Baerts
2025-01-31 11:04 ` Mikhail Ivanov