From: Alejandro Colomar <alx@kernel.org>
To: Matthew House <mattlloydhouse@gmail.com>
Cc: linux-man@vger.kernel.org, linux-api@vger.kernel.org,
netdev@vger.kernel.org, Ulrich Drepper <drepper@gmail.com>,
Ulrich Drepper <drepper@redhat.com>,
"David S. Miller" <davem@davemloft.net>,
Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH v2] recv.2: Document MSG_CMSG_CLOEXEC as returned in msg_flags
Date: Tue, 18 Jul 2023 14:03:02 +0200 [thread overview]
Message-ID: <05f6395d-4ee2-ce87-253a-9dcbfe227d42@kernel.org> (raw)
In-Reply-To: <20230718060121.934187-1-mattlloydhouse@gmail.com>
[-- Attachment #1.1: Type: text/plain, Size: 5155 bytes --]
Hi Matthew,
On 2023-07-18 08:00, Matthew House wrote:
> On Mon, Jul 17, 2023 at 7:10 PM Alejandro Colomar <alx@kernel.org> wrote:
>> Hi Matthew,
>>
>> I don't understand what's the purpose of this. The kernel sets a bit
>> just to report to the caller that it set a bit? No other purpose?
>> It feels very weird. Of course, the caller already has that info,
>> doesn't it?
>
> The main reason I posted this patch was because I was confused by the
> flag's presence in the msg_flags when I was looking at some strace logs, so
> I figured that it would be a good idea to document it.
Makes sense.
> As for the original
> purpose of the behavior, it's not really clear, and it may well have been
> an implementation artifact that got enshrined in the user space ABI. (Even
> io_uring is careful to replicate this behavior!)
This is what worries me. I've CCd a bunch of people to see if they can
bring some light.
>
> This behavior began when the MSG_CMSG_CLOEXEC flag was first added in Linux
> 2.6.23, with Ulrich Drepper's commit 4a19542e5f69 ("O_CLOEXEC for
> SCM_RIGHTS"). Per the commit message, the flag was designed to be
> "passe[d]... just like the existing MSG_CMSG_COMPAT flag". Since it was
> added to the msg_flags at the start of sys_recvmsg(), the
> scm_detach_fds[_compat]() functions in net/core/scm.c and net/compat.c
> could read the flag off of msg->msg_flags without having to thread the
> recvmsg() flags through.
>
> This was indeed similar to the behavior of MSG_CMSG_COMPAT. That flag was
> added in Linux 2.5.65, with commit 3225fc8a85f4 ("[NET]: Simplify scm
> handling and sendmsg/recvmsg invocation, consolidate net compat
> syscalls."), in which put_cmsg() and scm_detach_fds() in net/core/scm.c
> read it off of msg->msg_flags. (It wouldn't actually be set in msg_flags
> until Linux 2.5.67, with commit 7e8d06bc1d90, "[COMPAT]: Fix
> MSG_CMSG_COMPAT flag passing, kill cmsg_compat_recvmsg_fixup." Both of
> these commits are from history/history.git.)
>
> However, the MSG_CMSG_COMPAT flag has been scrubbed from the output
> msg_flags since Linux 2.6.14, with commit 37f7f421cce1 ("[NET]: Do not leak
> MSG_CMSG_COMPAT into userspace."). That's what I find so unclear:
> MSG_CMSG_CLOEXEC was added after the kernel started scrubbing
> MSG_CMSG_COMPAT from the output, but the new flag was never written to be
> similarly scrubbed.
>
> Later, in Linux 3.10, with commits 1be374a0518a ("net: Block
> MSG_CMSG_COMPAT in send(m)msg and recv(m)msg") and a7526eb5d06b ("net:
> Unbreak compat_sys_{send,recv}msg"), MSG_CMSG_COMPAT was banned from being
> passed to the *msg() syscalls' flags from user space, with the rationale
> that they were "not intended to be part of the API". Then, in Linux 4.0, we
> reached the current status quo with commit d720d8cec563 ("net: compat:
> Ignore MSG_CMSG_COMPAT in compat_sys_{send, recv}msg"), where
> MSG_CMSG_COMPAT is allowed (and a no-op) in compat syscalls, but banned
> from non-compat syscalls.
>
> So I agree that it's very weird that this flag gets returned to user space,
> even while the internal flag that it's modeled after doesn't. I suppose I
> could spin up a nice story, where the user-space function calling recvmsg()
> is totally separate from the function processing the returned struct
> msghdr, and the latter function would really like to know whether the fds
> in that message are close-on-exec without having to call fcntl(F_GETFD).
> But that's all just a total guess. If you want to know for sure, perhaps
> cc'ing Drepper may be worthwhile?
>
> A cursory look hasn't shown me any existing user-space code that depends on
> this behavior. Though one library appears to be aware of this behavior,
> actively filtering MSG_CMSG_CLOEXEC out of the result flags:
> <https://github.com/dutchanddutch/node-socket-calls/blob/ca759a0da87cb112875d158f4a81b45b31f4a871/src/socket_calls.cc#L417>
>
> Also, only somewhat relatedly, some libraries incorrectly attempt to
> request MSG_CMSG_CLOEXEC by passing it into the msg_flags field instead of
> the flags argument:
> <https://git.samba.org/samba.git/?p=samba.git;a=blob;f=lib/messaging/messages_dgm.c;hb=refs/tags/samba-4.17.9#l1272>
> <https://github.com/genodelabs/genode/blob/23.05/repos/base-linux/src/lib/base/ipc.cc#L132>
> <https://github.com/proxmox/pve-lxc-syscalld/blob/a14430f3e75c2b695332ad712164e599464177fc/src/io/seq_packet.rs#L123>
In any case, all of this mail has been very interesting,
and it would be useful to have it in the commit message of the patch.
Please send an v2 with it, and add 'Cc:' tags for all these people:
Cc: <linux-api@vger.kernel.org>
Cc: <netdev@vger.kernel.org>
Cc: Ulrich Drepper <drepper@gmail.com>
Cc: Ulrich Drepper <drepper@redhat.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
(I don't know if <drepper@redhat.com> still works. Does anyone know?)
Thanks!
Alex
>
> Thank you,
> Matthew House
--
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
next prev parent reply other threads:[~2023-07-18 12:03 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-07-09 21:33 [PATCH] recv.2: Document MSG_CMSG_CLOEXEC as returned in msg_flags Matthew House
2023-07-15 15:36 ` Alejandro Colomar
2023-07-16 23:47 ` [PATCH v2] " Matthew House
2023-07-17 23:10 ` Alejandro Colomar
2023-07-18 6:00 ` Matthew House
2023-07-18 12:03 ` Alejandro Colomar [this message]
2023-07-18 17:26 ` [PATCH v3] " Matthew House
2023-07-18 21:39 ` Alejandro Colomar
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=05f6395d-4ee2-ce87-253a-9dcbfe227d42@kernel.org \
--to=alx@kernel.org \
--cc=akpm@linux-foundation.org \
--cc=davem@davemloft.net \
--cc=drepper@gmail.com \
--cc=drepper@redhat.com \
--cc=linux-api@vger.kernel.org \
--cc=linux-man@vger.kernel.org \
--cc=mattlloydhouse@gmail.com \
--cc=netdev@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox