netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Vadim Fedorenko <vadim.fedorenko@linux.dev>
To: Willem de Bruijn <willemdebruijn.kernel@gmail.com>,
	Willem de Bruijn <willemb@google.com>
Cc: netdev@vger.kernel.org, Jakub Kicinski <kuba@kernel.org>,
	Jason Xing <kerneljasonxing@gmail.com>,
	Paolo Abeni <pabeni@redhat.com>, David Ahern <dsahern@kernel.org>
Subject: Re: [PATCH net-next v2 1/2] net_tstamp: add SCM_TS_OPT_ID to provide OPT_ID in control message
Date: Mon, 2 Sep 2024 22:07:44 +0100	[thread overview]
Message-ID: <932f7ff3-6087-47cf-91ec-2601da38ebec@linux.dev> (raw)
In-Reply-To: <66d5cbbba9669_6138829497@willemb.c.googlers.com.notmuch>

On 02/09/2024 15:29, Willem de Bruijn wrote:
> Vadim Fedorenko wrote:
>> SOF_TIMESTAMPING_OPT_ID socket option flag gives a way to correlate TX
>> timestamps and packets sent via socket. Unfortunately, there is no way
>> to reliably predict socket timestamp ID value in case of error returned
>> by sendmsg. For UDP sockets it's impossible because of lockless
>> nature of UDP transmit, several threads may send packets in parallel. In
>> case of RAW sockets MSG_MORE option makes things complicated. More
>> details are in the conversation [1].
>> This patch adds new control message type to give user-space
>> software an opportunity to control the mapping between packets and
>> values by providing ID with each sendmsg. This works fine for UDP
>> sockets only, and explicit check is added to control message parser.
>>
>> [1] https://lore.kernel.org/netdev/CALCETrU0jB+kg0mhV6A8mrHfTE1D1pr1SD_B9Eaa9aDPfgHdtA@mail.gmail.com/
>>
>> Signed-off-by: Vadim Fedorenko <vadfed@meta.com>
>> ---
>>   Documentation/networking/timestamping.rst | 14 ++++++++++++++
>>   arch/alpha/include/uapi/asm/socket.h      |  4 +++-
>>   arch/mips/include/uapi/asm/socket.h       |  2 ++
>>   arch/parisc/include/uapi/asm/socket.h     |  2 ++
>>   arch/sparc/include/uapi/asm/socket.h      |  2 ++
>>   include/net/inet_sock.h                   |  4 +++-
>>   include/net/sock.h                        |  1 +
>>   include/uapi/asm-generic/socket.h         |  2 ++
>>   include/uapi/linux/net_tstamp.h           |  3 ++-
>>   net/core/sock.c                           | 12 ++++++++++++
>>   net/ethtool/common.c                      |  1 +
>>   net/ipv4/ip_output.c                      | 16 ++++++++++++----
>>   net/ipv6/ip6_output.c                     | 16 ++++++++++++----
>>   13 files changed, 68 insertions(+), 11 deletions(-)
>>
>> diff --git a/Documentation/networking/timestamping.rst b/Documentation/networking/timestamping.rst
>> index 5e93cd71f99f..93b0901e4e8e 100644
>> --- a/Documentation/networking/timestamping.rst
>> +++ b/Documentation/networking/timestamping.rst
>> @@ -193,6 +193,20 @@ SOF_TIMESTAMPING_OPT_ID:
>>     among all possibly concurrently outstanding timestamp requests for
>>     that socket.
>>   
>> +  With this option enabled user-space application can provide custom
>> +  ID for each message sent via UDP socket with control message with
>> +  type set to SCM_TS_OPT_ID::
>> +
>> +    struct msghdr *msg;
>> +    ...
>> +    cmsg			 = CMSG_FIRSTHDR(msg);
>> +    cmsg->cmsg_level		 = SOL_SOCKET;
>> +    cmsg->cmsg_type		 = SO_TIMESTAMPING;
>> +    cmsg->cmsg_len		 = CMSG_LEN(sizeof(__u32));
>> +    *((__u32 *) CMSG_DATA(cmsg)) = opt_id;
>> +    err = sendmsg(fd, msg, 0);
>> +
> 
> Please make it clear that this CMSG is optional.
> 
> The process can optionally override the default generated ID, by
> passing a specific ID with control message SCM_TS_OPT_ID:

Ok, I'll re-phrase it this way, thanks!


>>   SOF_TIMESTAMPING_OPT_ID_TCP:
>>     Pass this modifier along with SOF_TIMESTAMPING_OPT_ID for new TCP
>>     timestamping applications. SOF_TIMESTAMPING_OPT_ID defines how the
>> diff --git a/arch/alpha/include/uapi/asm/socket.h b/arch/alpha/include/uapi/asm/socket.h
>> index e94f621903fe..0698e6662cdf 100644
>> --- a/arch/alpha/include/uapi/asm/socket.h
>> +++ b/arch/alpha/include/uapi/asm/socket.h
>> @@ -10,7 +10,7 @@
>>    * Note: we only bother about making the SOL_SOCKET options
>>    * same as OSF/1, as that's all that "normal" programs are
>>    * likely to set.  We don't necessarily want to be binary
>> - * compatible with _everything_.
>> + * compatible with _everything_.
> 
> Is this due to a checkpatch warning? If so, please add a brief comment
> to the commit message to show that this change is intentional. If not,
> please don't touch unrelated code.

I'll remove it, because it looks like it was some unhappy linter...

>>    */
>>   #define SOL_SOCKET	0xffff
>>   
>> @@ -140,6 +140,8 @@
>>   #define SO_PASSPIDFD		76
>>   #define SO_PEERPIDFD		77
>>   
>> +#define SCM_TS_OPT_ID		78
>> +
>>   #if !defined(__KERNEL__)
>>   
>>   #if __BITS_PER_LONG == 64
>> diff --git a/arch/mips/include/uapi/asm/socket.h b/arch/mips/include/uapi/asm/socket.h
>> index 60ebaed28a4c..bb3dc8feb205 100644
>> --- a/arch/mips/include/uapi/asm/socket.h
>> +++ b/arch/mips/include/uapi/asm/socket.h
>> @@ -151,6 +151,8 @@
>>   #define SO_PASSPIDFD		76
>>   #define SO_PEERPIDFD		77
>>   
>> +#define SCM_TS_OPT_ID		78
>> +
>>   #if !defined(__KERNEL__)
>>   
>>   #if __BITS_PER_LONG == 64
>> diff --git a/arch/parisc/include/uapi/asm/socket.h b/arch/parisc/include/uapi/asm/socket.h
>> index be264c2b1a11..c3ab3b3289eb 100644
>> --- a/arch/parisc/include/uapi/asm/socket.h
>> +++ b/arch/parisc/include/uapi/asm/socket.h
>> @@ -132,6 +132,8 @@
>>   #define SO_PASSPIDFD		0x404A
>>   #define SO_PEERPIDFD		0x404B
>>   
>> +#define SCM_TS_OPT_ID		0x404C
>> +
>>   #if !defined(__KERNEL__)
>>   
>>   #if __BITS_PER_LONG == 64
>> diff --git a/arch/sparc/include/uapi/asm/socket.h b/arch/sparc/include/uapi/asm/socket.h
>> index 682da3714686..9b40f0a57fbc 100644
>> --- a/arch/sparc/include/uapi/asm/socket.h
>> +++ b/arch/sparc/include/uapi/asm/socket.h
>> @@ -133,6 +133,8 @@
>>   #define SO_PASSPIDFD             0x0055
>>   #define SO_PEERPIDFD             0x0056
>>   
>> +#define SCM_TS_OPT_ID            0x0057
>> +
>>   #if !defined(__KERNEL__)
>>   
>>   
>> diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h
>> index 394c3b66065e..2161d50cf0fd 100644
>> --- a/include/net/inet_sock.h
>> +++ b/include/net/inet_sock.h
>> @@ -174,6 +174,7 @@ struct inet_cork {
>>   	__s16			tos;
>>   	char			priority;
>>   	__u16			gso_size;
>> +	u32			ts_opt_id;
>>   	u64			transmit_time;
>>   	u32			mark;
>>   };
>> @@ -241,7 +242,8 @@ struct inet_sock {
>>   	struct inet_cork_full	cork;
>>   };
>>   
>> -#define IPCORK_OPT	1	/* ip-options has been held in ipcork.opt */
>> +#define IPCORK_OPT		1	/* ip-options has been held in ipcork.opt */
>> +#define IPCORK_TS_OPT_ID	2	/* timestmap opt id has been provided in cmsg */
> 
> typo: timestamp
> 
> And maybe more relevant:  /* ts_opt_id field is valid, overriding sk_tskey */

I'll change it

>>   enum {
>>   	INET_FLAGS_PKTINFO	= 0,
>> diff --git a/include/net/sock.h b/include/net/sock.h
>> index f51d61fab059..73e21dad5660 100644
>> --- a/include/net/sock.h
>> +++ b/include/net/sock.h
>> @@ -1794,6 +1794,7 @@ struct sockcm_cookie {
>>   	u64 transmit_time;
>>   	u32 mark;
>>   	u32 tsflags;
>> +	u32 ts_opt_id;
>>   };
>>   
>>   static inline void sockcm_init(struct sockcm_cookie *sockc,
>> diff --git a/include/uapi/asm-generic/socket.h b/include/uapi/asm-generic/socket.h
>> index 8ce8a39a1e5f..db3df3e74b01 100644
>> --- a/include/uapi/asm-generic/socket.h
>> +++ b/include/uapi/asm-generic/socket.h
>> @@ -135,6 +135,8 @@
>>   #define SO_PASSPIDFD		76
>>   #define SO_PEERPIDFD		77
>>   
>> +#define SCM_TS_OPT_ID		78
> 
> nit: different indentation

Hmm... that's interesting, it's ok in the code, there no spaces before
#define. I'll re-check it in the patch in v3.

>> +
>>   #if !defined(__KERNEL__)
>>   
>>   #if __BITS_PER_LONG == 64 || (defined(__x86_64__) && defined(__ILP32__))
>> diff --git a/include/uapi/linux/net_tstamp.h b/include/uapi/linux/net_tstamp.h
>> index a2c66b3d7f0f..e2f145e3f3a1 100644
>> --- a/include/uapi/linux/net_tstamp.h
>> +++ b/include/uapi/linux/net_tstamp.h
>> @@ -32,8 +32,9 @@ enum {
>>   	SOF_TIMESTAMPING_OPT_TX_SWHW = (1<<14),
>>   	SOF_TIMESTAMPING_BIND_PHC = (1 << 15),
>>   	SOF_TIMESTAMPING_OPT_ID_TCP = (1 << 16),
>> +	SOF_TIMESTAMPING_OPT_ID_CMSG = (1 << 17),
>>   
>> -	SOF_TIMESTAMPING_LAST = SOF_TIMESTAMPING_OPT_ID_TCP,
>> +	SOF_TIMESTAMPING_LAST = SOF_TIMESTAMPING_OPT_ID_CMSG,
>>   	SOF_TIMESTAMPING_MASK = (SOF_TIMESTAMPING_LAST - 1) |
>>   				 SOF_TIMESTAMPING_LAST
>>   };
>> diff --git a/net/core/sock.c b/net/core/sock.c
>> index 468b1239606c..560b075765fa 100644
>> --- a/net/core/sock.c
>> +++ b/net/core/sock.c
>> @@ -2859,6 +2859,18 @@ int __sock_cmsg_send(struct sock *sk, struct cmsghdr *cmsg,
>>   			return -EINVAL;
>>   		sockc->transmit_time = get_unaligned((u64 *)CMSG_DATA(cmsg));
>>   		break;
>> +	case SCM_TS_OPT_ID:
>> +		/* allow this option for UDP sockets only */
>> +		if (!sk_is_udp(sk))
>> +			return -EINVAL;
> 
> Let's relax the restriction that this is only for UDP.
> 
> At least to also support SOCK_RAW. I don't think that requires any
> additional code at all?

RAW sockets use skb_setup_tx_timestamps which does atomic operation of
incrementing sk_tskey when in _sock_tx_timestamp. So I'll have to
convert all spots (can, ipv4/raw, ipv6/raw, 3 x af_packet) to use the
same logic as in udp path (sock_tx_timestamp) and add conditions.
Or change skb_setup_tx_timestamps to do the logic and take different
arguments. It may look as a big refactoring, so I would like to make
it as a follow-up series.

> Extending to TCP should be straightforward too, just a branch
> on sockc in tcp_tx_timestamp.

TCP part looks a bit easier, as you said, I have to adjust
tcp_tx_timestamp and the logic is straightforward. I still have to
provide a pointer to sock coockie instead of flags, but there is only
one caller of this function and it's much easier than with RAW sockets.

> If so, let's support all. It makes for a simpler API if it is
> supported uniformly wherever OPT_ID is.

If you think that the way I explained for RAW sockets is good enough,
I can send all of them as a single patcheset. Otherwise I would like to
add RAW sockets in follow-up series.

>> +		tsflags = READ_ONCE(sk->sk_tsflags);
>> +		if (!(tsflags & SOF_TIMESTAMPING_OPT_ID))
>> +			return -EINVAL;
>> +		if (cmsg->cmsg_len != CMSG_LEN(sizeof(u32)))
>> +			return -EINVAL;
>> +		sockc->ts_opt_id = *(u32 *)CMSG_DATA(cmsg);
>> +		sockc->tsflags |= SOF_TIMESTAMPING_OPT_ID_CMSG;
>> +		break;
>>   	/* SCM_RIGHTS and SCM_CREDENTIALS are semantically in SOL_UNIX. */
>>   	case SCM_RIGHTS:
>>   	case SCM_CREDENTIALS:


  reply	other threads:[~2024-09-02 21:07 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-09-02 13:09 [PATCH net-next v2 1/2] net_tstamp: add SCM_TS_OPT_ID to provide OPT_ID in control message Vadim Fedorenko
2024-09-02 13:09 ` [PATCH net-next v2 2/2] selftests: txtimestamp: add SCM_TS_OPT_ID test Vadim Fedorenko
2024-09-02 14:24   ` Jason Xing
2024-09-08 20:04     ` Vadim Fedorenko
2024-09-08 23:36       ` Jason Xing
2024-09-02 14:29 ` [PATCH net-next v2 1/2] net_tstamp: add SCM_TS_OPT_ID to provide OPT_ID in control message Willem de Bruijn
2024-09-02 21:07   ` Vadim Fedorenko [this message]
2024-09-03 20:40     ` Willem de Bruijn
2024-09-02 15:19 ` Jason Xing
2024-09-02 15:51   ` Willem de Bruijn
2024-09-02 19:40     ` Vadim Fedorenko
2024-09-02 20:59       ` Willem de Bruijn
2024-09-02 21:10         ` Vadim Fedorenko
2024-09-03 16:01     ` Vadim Fedorenko
2024-09-03 20:46       ` Willem de Bruijn
2024-09-02 18:38 ` Simon Horman
2024-09-02 19:35   ` Vadim Fedorenko
2024-09-03 15:23     ` Simon Horman
2024-09-03  8:06 ` Dan Carpenter
2024-09-03  8:16   ` Vadim Fedorenko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=932f7ff3-6087-47cf-91ec-2601da38ebec@linux.dev \
    --to=vadim.fedorenko@linux.dev \
    --cc=dsahern@kernel.org \
    --cc=kerneljasonxing@gmail.com \
    --cc=kuba@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=willemb@google.com \
    --cc=willemdebruijn.kernel@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).