* [PATCH net-next 0/3] net-timestamp: add some trivial
@ 2024-09-30 9:24 Jason Xing
2024-09-30 9:24 ` [PATCH net-next 1/3] net-timestamp: add strict check when setting tx flags Jason Xing
` (2 more replies)
0 siblings, 3 replies; 20+ messages in thread
From: Jason Xing @ 2024-09-30 9:24 UTC (permalink / raw)
To: davem, edumazet, kuba, pabeni, dsahern, willemdebruijn.kernel,
shuah, willemb
Cc: linux-kselftest, netdev, Jason Xing
From: Jason Xing <kernelxing@tencent.com>
When reading through the whole feature, I feel we can do these things to
make it more robust. They are trivial changes, not big ones.
Jason Xing (3):
net-timestamp: add strict check when setting tx flags
net-timestamp: add OPT_ID_TCP test in selftests
net-timestamp: namespacify the sysctl_tstamp_allow_data
include/net/netns/core.h | 1 +
include/net/sock.h | 2 --
net/core/net_namespace.c | 1 +
net/core/skbuff.c | 2 +-
net/core/sock.c | 4 ++++
net/core/sysctl_net_core.c | 18 +++++++++---------
tools/testing/selftests/net/txtimestamp.c | 6 ++++++
7 files changed, 22 insertions(+), 12 deletions(-)
--
2.37.3
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH net-next 1/3] net-timestamp: add strict check when setting tx flags
2024-09-30 9:24 [PATCH net-next 0/3] net-timestamp: add some trivial Jason Xing
@ 2024-09-30 9:24 ` Jason Xing
2024-09-30 10:39 ` Willem de Bruijn
2024-09-30 10:48 ` Vadim Fedorenko
2024-09-30 9:24 ` [PATCH net-next 2/3] net-timestamp: add OPT_ID_TCP test in selftests Jason Xing
2024-09-30 9:24 ` [PATCH net-next 3/3] net-timestamp: namespacify the sysctl_tstamp_allow_data Jason Xing
2 siblings, 2 replies; 20+ messages in thread
From: Jason Xing @ 2024-09-30 9:24 UTC (permalink / raw)
To: davem, edumazet, kuba, pabeni, dsahern, willemdebruijn.kernel,
shuah, willemb
Cc: linux-kselftest, netdev, Jason Xing
From: Jason Xing <kernelxing@tencent.com>
Even though this case is unlikely to happen, we have to avoid such
a case occurring at an earlier point: the sk_rmem_alloc could get
increased because of inserting more and more skbs into the errqueue
when calling __skb_complete_tx_timestamp(). This bad case would stop
the socket transmitting soon.
Signed-off-by: Jason Xing <kernelxing@tencent.com>
---
net/core/sock.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/net/core/sock.c b/net/core/sock.c
index fe87f9bd8f16..4bddd6f62e4f 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -905,6 +905,10 @@ int sock_set_timestamping(struct sock *sk, int optname,
if (val & ~SOF_TIMESTAMPING_MASK)
return -EINVAL;
+ if (val & SOF_TIMESTAMPING_TX_RECORD_MASK &&
+ !(val & SOF_TIMESTAMPING_SOFTWARE))
+ return -EINVAL;
+
if (val & SOF_TIMESTAMPING_OPT_ID_TCP &&
!(val & SOF_TIMESTAMPING_OPT_ID))
return -EINVAL;
--
2.37.3
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH net-next 2/3] net-timestamp: add OPT_ID_TCP test in selftests
2024-09-30 9:24 [PATCH net-next 0/3] net-timestamp: add some trivial Jason Xing
2024-09-30 9:24 ` [PATCH net-next 1/3] net-timestamp: add strict check when setting tx flags Jason Xing
@ 2024-09-30 9:24 ` Jason Xing
2024-09-30 10:42 ` Willem de Bruijn
2024-09-30 9:24 ` [PATCH net-next 3/3] net-timestamp: namespacify the sysctl_tstamp_allow_data Jason Xing
2 siblings, 1 reply; 20+ messages in thread
From: Jason Xing @ 2024-09-30 9:24 UTC (permalink / raw)
To: davem, edumazet, kuba, pabeni, dsahern, willemdebruijn.kernel,
shuah, willemb
Cc: linux-kselftest, netdev, Jason Xing
From: Jason Xing <kernelxing@tencent.com>
Introduce a test for SOF_TIMESTAMPING_OPT_ID_TCP for TCP proto so
that we can get aware of whether using write_seq as an initial key
value works as expected.
Signed-off-by: Jason Xing <kernelxing@tencent.com>
---
tools/testing/selftests/net/txtimestamp.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/tools/testing/selftests/net/txtimestamp.c b/tools/testing/selftests/net/txtimestamp.c
index d626f22f9550..5f5de79d0e51 100644
--- a/tools/testing/selftests/net/txtimestamp.c
+++ b/tools/testing/selftests/net/txtimestamp.c
@@ -893,6 +893,12 @@ static void do_main(int family)
do_test(family, SOF_TIMESTAMPING_TX_SCHED |
SOF_TIMESTAMPING_TX_SOFTWARE |
SOF_TIMESTAMPING_TX_ACK);
+
+ fprintf(stderr, "\ntest ENQ + SND + ACK with tcp tskey setting\n");
+ do_test(family, SOF_TIMESTAMPING_TX_SCHED |
+ SOF_TIMESTAMPING_TX_SOFTWARE |
+ SOF_TIMESTAMPING_TX_ACK |
+ SOF_TIMESTAMPING_OPT_ID_TCP);
}
}
--
2.37.3
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH net-next 3/3] net-timestamp: namespacify the sysctl_tstamp_allow_data
2024-09-30 9:24 [PATCH net-next 0/3] net-timestamp: add some trivial Jason Xing
2024-09-30 9:24 ` [PATCH net-next 1/3] net-timestamp: add strict check when setting tx flags Jason Xing
2024-09-30 9:24 ` [PATCH net-next 2/3] net-timestamp: add OPT_ID_TCP test in selftests Jason Xing
@ 2024-09-30 9:24 ` Jason Xing
2024-09-30 10:47 ` Willem de Bruijn
2 siblings, 1 reply; 20+ messages in thread
From: Jason Xing @ 2024-09-30 9:24 UTC (permalink / raw)
To: davem, edumazet, kuba, pabeni, dsahern, willemdebruijn.kernel,
shuah, willemb
Cc: linux-kselftest, netdev, Jason Xing
From: Jason Xing <kernelxing@tencent.com>
Let it be tuned in per netns by admins.
Signed-off-by: Jason Xing <kernelxing@tencent.com>
---
include/net/netns/core.h | 1 +
include/net/sock.h | 2 --
net/core/net_namespace.c | 1 +
net/core/skbuff.c | 2 +-
net/core/sysctl_net_core.c | 18 +++++++++---------
5 files changed, 12 insertions(+), 12 deletions(-)
diff --git a/include/net/netns/core.h b/include/net/netns/core.h
index 78214f1b43a2..ef8b3105c632 100644
--- a/include/net/netns/core.h
+++ b/include/net/netns/core.h
@@ -23,6 +23,7 @@ struct netns_core {
#if IS_ENABLED(CONFIG_RPS) && IS_ENABLED(CONFIG_SYSCTL)
struct cpumask *rps_default_mask;
#endif
+ int sysctl_tstamp_allow_data;
};
#endif
diff --git a/include/net/sock.h b/include/net/sock.h
index c58ca8dd561b..4f31be0fd671 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -2808,8 +2808,6 @@ void sk_get_meminfo(const struct sock *sk, u32 *meminfo);
extern __u32 sysctl_wmem_max;
extern __u32 sysctl_rmem_max;
-extern int sysctl_tstamp_allow_data;
-
extern __u32 sysctl_wmem_default;
extern __u32 sysctl_rmem_default;
diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index e39479f1c9a4..e78c01912c64 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -317,6 +317,7 @@ static __net_init void preinit_net_sysctl(struct net *net)
*/
net->core.sysctl_optmem_max = 128 * 1024;
net->core.sysctl_txrehash = SOCK_TXREHASH_ENABLED;
+ net->core.sysctl_tstamp_allow_data = 1;
}
/* init code that must occur even if setup_net() is not called. */
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 74149dc4ee31..ad727d924f73 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -5506,7 +5506,7 @@ static bool skb_may_tx_timestamp(struct sock *sk, bool tsonly)
{
bool ret;
- if (likely(READ_ONCE(sysctl_tstamp_allow_data) || tsonly))
+ if (likely(READ_ONCE(sock_net(sk)->core.sysctl_tstamp_allow_data) || tsonly))
return true;
read_lock_bh(&sk->sk_callback_lock);
diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c
index 86a2476678c4..83622799eb80 100644
--- a/net/core/sysctl_net_core.c
+++ b/net/core/sysctl_net_core.c
@@ -491,15 +491,6 @@ static struct ctl_table net_core_table[] = {
.mode = 0644,
.proc_handler = proc_dointvec,
},
- {
- .procname = "tstamp_allow_data",
- .data = &sysctl_tstamp_allow_data,
- .maxlen = sizeof(int),
- .mode = 0644,
- .proc_handler = proc_dointvec_minmax,
- .extra1 = SYSCTL_ZERO,
- .extra2 = SYSCTL_ONE
- },
#ifdef CONFIG_RPS
{
.procname = "rps_sock_flow_entries",
@@ -665,6 +656,15 @@ static struct ctl_table netns_core_table[] = {
.extra2 = SYSCTL_ONE,
.proc_handler = proc_dou8vec_minmax,
},
+ {
+ .procname = "tstamp_allow_data",
+ .data = &init_net.core.sysctl_tstamp_allow_data,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .proc_handler = proc_dointvec_minmax,
+ .extra1 = SYSCTL_ZERO,
+ .extra2 = SYSCTL_ONE
+ },
/* sysctl_core_net_init() will set the values after this
* to readonly in network namespaces
*/
--
2.37.3
^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH net-next 1/3] net-timestamp: add strict check when setting tx flags
2024-09-30 9:24 ` [PATCH net-next 1/3] net-timestamp: add strict check when setting tx flags Jason Xing
@ 2024-09-30 10:39 ` Willem de Bruijn
2024-09-30 11:29 ` Jason Xing
2024-09-30 10:48 ` Vadim Fedorenko
1 sibling, 1 reply; 20+ messages in thread
From: Willem de Bruijn @ 2024-09-30 10:39 UTC (permalink / raw)
To: Jason Xing, davem, edumazet, kuba, pabeni, dsahern,
willemdebruijn.kernel, shuah, willemb
Cc: linux-kselftest, netdev, Jason Xing
Jason Xing wrote:
> From: Jason Xing <kernelxing@tencent.com>
>
> Even though this case is unlikely to happen, we have to avoid such
> a case occurring at an earlier point: the sk_rmem_alloc could get
> increased because of inserting more and more skbs into the errqueue
> when calling __skb_complete_tx_timestamp(). This bad case would stop
> the socket transmitting soon.
It is up to the application to read from the error queue frequently
enough and/or increase SO_RCVBUF.
> Signed-off-by: Jason Xing <kernelxing@tencent.com>
> ---
> net/core/sock.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/net/core/sock.c b/net/core/sock.c
> index fe87f9bd8f16..4bddd6f62e4f 100644
> --- a/net/core/sock.c
> +++ b/net/core/sock.c
> @@ -905,6 +905,10 @@ int sock_set_timestamping(struct sock *sk, int optname,
> if (val & ~SOF_TIMESTAMPING_MASK)
> return -EINVAL;
>
> + if (val & SOF_TIMESTAMPING_TX_RECORD_MASK &&
> + !(val & SOF_TIMESTAMPING_SOFTWARE))
> + return -EINVAL;
> +
This breaks hardware timestamping
> if (val & SOF_TIMESTAMPING_OPT_ID_TCP &&
> !(val & SOF_TIMESTAMPING_OPT_ID))
> return -EINVAL;
> --
> 2.37.3
>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net-next 2/3] net-timestamp: add OPT_ID_TCP test in selftests
2024-09-30 9:24 ` [PATCH net-next 2/3] net-timestamp: add OPT_ID_TCP test in selftests Jason Xing
@ 2024-09-30 10:42 ` Willem de Bruijn
2024-09-30 11:49 ` Jason Xing
0 siblings, 1 reply; 20+ messages in thread
From: Willem de Bruijn @ 2024-09-30 10:42 UTC (permalink / raw)
To: Jason Xing, davem, edumazet, kuba, pabeni, dsahern,
willemdebruijn.kernel, shuah, willemb
Cc: linux-kselftest, netdev, Jason Xing
Jason Xing wrote:
> From: Jason Xing <kernelxing@tencent.com>
>
> Introduce a test for SOF_TIMESTAMPING_OPT_ID_TCP for TCP proto so
> that we can get aware of whether using write_seq as an initial key
> value works as expected.
Does the test behave different with this flag set?
>
> Signed-off-by: Jason Xing <kernelxing@tencent.com>
> ---
> tools/testing/selftests/net/txtimestamp.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/tools/testing/selftests/net/txtimestamp.c b/tools/testing/selftests/net/txtimestamp.c
> index d626f22f9550..5f5de79d0e51 100644
> --- a/tools/testing/selftests/net/txtimestamp.c
> +++ b/tools/testing/selftests/net/txtimestamp.c
> @@ -893,6 +893,12 @@ static void do_main(int family)
> do_test(family, SOF_TIMESTAMPING_TX_SCHED |
> SOF_TIMESTAMPING_TX_SOFTWARE |
> SOF_TIMESTAMPING_TX_ACK);
> +
> + fprintf(stderr, "\ntest ENQ + SND + ACK with tcp tskey setting\n");
> + do_test(family, SOF_TIMESTAMPING_TX_SCHED |
> + SOF_TIMESTAMPING_TX_SOFTWARE |
> + SOF_TIMESTAMPING_TX_ACK |
> + SOF_TIMESTAMPING_OPT_ID_TCP);
> }
> }
>
> --
> 2.37.3
>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net-next 3/3] net-timestamp: namespacify the sysctl_tstamp_allow_data
2024-09-30 9:24 ` [PATCH net-next 3/3] net-timestamp: namespacify the sysctl_tstamp_allow_data Jason Xing
@ 2024-09-30 10:47 ` Willem de Bruijn
2024-09-30 11:14 ` Jason Xing
0 siblings, 1 reply; 20+ messages in thread
From: Willem de Bruijn @ 2024-09-30 10:47 UTC (permalink / raw)
To: Jason Xing, davem, edumazet, kuba, pabeni, dsahern,
willemdebruijn.kernel, shuah, willemb
Cc: linux-kselftest, netdev, Jason Xing
Jason Xing wrote:
> From: Jason Xing <kernelxing@tencent.com>
>
> Let it be tuned in per netns by admins.
>
> Signed-off-by: Jason Xing <kernelxing@tencent.com>
+1 on the idea
> ---
> include/net/netns/core.h | 1 +
> include/net/sock.h | 2 --
also remove the static global from sock.c
> net/core/net_namespace.c | 1 +
> net/core/skbuff.c | 2 +-
> net/core/sysctl_net_core.c | 18 +++++++++---------
> 5 files changed, 12 insertions(+), 12 deletions(-)
>
> diff --git a/include/net/netns/core.h b/include/net/netns/core.h
> index 78214f1b43a2..ef8b3105c632 100644
> --- a/include/net/netns/core.h
> +++ b/include/net/netns/core.h
> @@ -23,6 +23,7 @@ struct netns_core {
> #if IS_ENABLED(CONFIG_RPS) && IS_ENABLED(CONFIG_SYSCTL)
> struct cpumask *rps_default_mask;
> #endif
> + int sysctl_tstamp_allow_data;
> };
>
> #endif
> diff --git a/include/net/sock.h b/include/net/sock.h
> index c58ca8dd561b..4f31be0fd671 100644
> --- a/include/net/sock.h
> +++ b/include/net/sock.h
> @@ -2808,8 +2808,6 @@ void sk_get_meminfo(const struct sock *sk, u32 *meminfo);
> extern __u32 sysctl_wmem_max;
> extern __u32 sysctl_rmem_max;
>
> -extern int sysctl_tstamp_allow_data;
> -
> extern __u32 sysctl_wmem_default;
> extern __u32 sysctl_rmem_default;
>
> diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
> index e39479f1c9a4..e78c01912c64 100644
> --- a/net/core/net_namespace.c
> +++ b/net/core/net_namespace.c
> @@ -317,6 +317,7 @@ static __net_init void preinit_net_sysctl(struct net *net)
> */
> net->core.sysctl_optmem_max = 128 * 1024;
> net->core.sysctl_txrehash = SOCK_TXREHASH_ENABLED;
> + net->core.sysctl_tstamp_allow_data = 1;
> }
>
> /* init code that must occur even if setup_net() is not called. */
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 74149dc4ee31..ad727d924f73 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -5506,7 +5506,7 @@ static bool skb_may_tx_timestamp(struct sock *sk, bool tsonly)
> {
> bool ret;
>
> - if (likely(READ_ONCE(sysctl_tstamp_allow_data) || tsonly))
> + if (likely(READ_ONCE(sock_net(sk)->core.sysctl_tstamp_allow_data) || tsonly))
> return true;
Let's switch order of the tests here too
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net-next 1/3] net-timestamp: add strict check when setting tx flags
2024-09-30 9:24 ` [PATCH net-next 1/3] net-timestamp: add strict check when setting tx flags Jason Xing
2024-09-30 10:39 ` Willem de Bruijn
@ 2024-09-30 10:48 ` Vadim Fedorenko
2024-09-30 11:24 ` Jason Xing
1 sibling, 1 reply; 20+ messages in thread
From: Vadim Fedorenko @ 2024-09-30 10:48 UTC (permalink / raw)
To: Jason Xing, davem, edumazet, kuba, pabeni, dsahern,
willemdebruijn.kernel, shuah, willemb
Cc: linux-kselftest, netdev, Jason Xing
On 30/09/2024 10:24, Jason Xing wrote:
> From: Jason Xing <kernelxing@tencent.com>
>
> Even though this case is unlikely to happen, we have to avoid such
> a case occurring at an earlier point: the sk_rmem_alloc could get
> increased because of inserting more and more skbs into the errqueue
> when calling __skb_complete_tx_timestamp(). This bad case would stop
> the socket transmitting soon.
>
> Signed-off-by: Jason Xing <kernelxing@tencent.com>
> ---
> net/core/sock.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/net/core/sock.c b/net/core/sock.c
> index fe87f9bd8f16..4bddd6f62e4f 100644
> --- a/net/core/sock.c
> +++ b/net/core/sock.c
> @@ -905,6 +905,10 @@ int sock_set_timestamping(struct sock *sk, int optname,
> if (val & ~SOF_TIMESTAMPING_MASK)
> return -EINVAL;
>
> + if (val & SOF_TIMESTAMPING_TX_RECORD_MASK &&
> + !(val & SOF_TIMESTAMPING_SOFTWARE))
> + return -EINVAL;
SOF_TIMESTAMPING_TX_RECORD_MASK contains SOF_TIMESTAMPING_TX_HARDWARE.
That means that there will be no option to enable HW TX timestamping
without enabling software timestamping. I believe this is wrong
restriction.
> +
> if (val & SOF_TIMESTAMPING_OPT_ID_TCP &&
> !(val & SOF_TIMESTAMPING_OPT_ID))
> return -EINVAL;
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net-next 3/3] net-timestamp: namespacify the sysctl_tstamp_allow_data
2024-09-30 10:47 ` Willem de Bruijn
@ 2024-09-30 11:14 ` Jason Xing
0 siblings, 0 replies; 20+ messages in thread
From: Jason Xing @ 2024-09-30 11:14 UTC (permalink / raw)
To: Willem de Bruijn
Cc: davem, edumazet, kuba, pabeni, dsahern, shuah, willemb,
linux-kselftest, netdev, Jason Xing
On Mon, Sep 30, 2024 at 6:47 PM Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
>
> Jason Xing wrote:
> > From: Jason Xing <kernelxing@tencent.com>
> >
> > Let it be tuned in per netns by admins.
> >
> > Signed-off-by: Jason Xing <kernelxing@tencent.com>
>
> +1 on the idea
>
> > ---
> > include/net/netns/core.h | 1 +
> > include/net/sock.h | 2 --
>
> also remove the static global from sock.c
Thanks for pointing this out.
>
> > net/core/net_namespace.c | 1 +
> > net/core/skbuff.c | 2 +-
> > net/core/sysctl_net_core.c | 18 +++++++++---------
> > 5 files changed, 12 insertions(+), 12 deletions(-)
> >
> > diff --git a/include/net/netns/core.h b/include/net/netns/core.h
> > index 78214f1b43a2..ef8b3105c632 100644
> > --- a/include/net/netns/core.h
> > +++ b/include/net/netns/core.h
> > @@ -23,6 +23,7 @@ struct netns_core {
> > #if IS_ENABLED(CONFIG_RPS) && IS_ENABLED(CONFIG_SYSCTL)
> > struct cpumask *rps_default_mask;
> > #endif
> > + int sysctl_tstamp_allow_data;
> > };
> >
> > #endif
> > diff --git a/include/net/sock.h b/include/net/sock.h
> > index c58ca8dd561b..4f31be0fd671 100644
> > --- a/include/net/sock.h
> > +++ b/include/net/sock.h
> > @@ -2808,8 +2808,6 @@ void sk_get_meminfo(const struct sock *sk, u32 *meminfo);
> > extern __u32 sysctl_wmem_max;
> > extern __u32 sysctl_rmem_max;
> >
> > -extern int sysctl_tstamp_allow_data;
> > -
> > extern __u32 sysctl_wmem_default;
> > extern __u32 sysctl_rmem_default;
> >
> > diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
> > index e39479f1c9a4..e78c01912c64 100644
> > --- a/net/core/net_namespace.c
> > +++ b/net/core/net_namespace.c
> > @@ -317,6 +317,7 @@ static __net_init void preinit_net_sysctl(struct net *net)
> > */
> > net->core.sysctl_optmem_max = 128 * 1024;
> > net->core.sysctl_txrehash = SOCK_TXREHASH_ENABLED;
> > + net->core.sysctl_tstamp_allow_data = 1;
> > }
> >
> > /* init code that must occur even if setup_net() is not called. */
> > diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> > index 74149dc4ee31..ad727d924f73 100644
> > --- a/net/core/skbuff.c
> > +++ b/net/core/skbuff.c
> > @@ -5506,7 +5506,7 @@ static bool skb_may_tx_timestamp(struct sock *sk, bool tsonly)
> > {
> > bool ret;
> >
> > - if (likely(READ_ONCE(sysctl_tstamp_allow_data) || tsonly))
> > + if (likely(READ_ONCE(sock_net(sk)->core.sysctl_tstamp_allow_data) || tsonly))
> > return true;
>
> Let's switch order of the tests here too
Got it. Will do it.
Thanks,
Jason
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net-next 1/3] net-timestamp: add strict check when setting tx flags
2024-09-30 10:48 ` Vadim Fedorenko
@ 2024-09-30 11:24 ` Jason Xing
0 siblings, 0 replies; 20+ messages in thread
From: Jason Xing @ 2024-09-30 11:24 UTC (permalink / raw)
To: Vadim Fedorenko
Cc: davem, edumazet, kuba, pabeni, dsahern, willemdebruijn.kernel,
shuah, willemb, linux-kselftest, netdev, Jason Xing
On Mon, Sep 30, 2024 at 6:48 PM Vadim Fedorenko
<vadim.fedorenko@linux.dev> wrote:
>
> On 30/09/2024 10:24, Jason Xing wrote:
> > From: Jason Xing <kernelxing@tencent.com>
> >
> > Even though this case is unlikely to happen, we have to avoid such
> > a case occurring at an earlier point: the sk_rmem_alloc could get
> > increased because of inserting more and more skbs into the errqueue
> > when calling __skb_complete_tx_timestamp(). This bad case would stop
> > the socket transmitting soon.
> >
> > Signed-off-by: Jason Xing <kernelxing@tencent.com>
> > ---
> > net/core/sock.c | 4 ++++
> > 1 file changed, 4 insertions(+)
> >
> > diff --git a/net/core/sock.c b/net/core/sock.c
> > index fe87f9bd8f16..4bddd6f62e4f 100644
> > --- a/net/core/sock.c
> > +++ b/net/core/sock.c
> > @@ -905,6 +905,10 @@ int sock_set_timestamping(struct sock *sk, int optname,
> > if (val & ~SOF_TIMESTAMPING_MASK)
> > return -EINVAL;
> >
> > + if (val & SOF_TIMESTAMPING_TX_RECORD_MASK &&
> > + !(val & SOF_TIMESTAMPING_SOFTWARE))
> > + return -EINVAL;
>
> SOF_TIMESTAMPING_TX_RECORD_MASK contains SOF_TIMESTAMPING_TX_HARDWARE.
> That means that there will be no option to enable HW TX timestamping
> without enabling software timestamping. I believe this is wrong
> restriction.
Thanks! You are right. I should have realized that. I need to get rid
of this one.
Thanks,
Jason
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net-next 1/3] net-timestamp: add strict check when setting tx flags
2024-09-30 10:39 ` Willem de Bruijn
@ 2024-09-30 11:29 ` Jason Xing
2024-09-30 11:49 ` Willem de Bruijn
0 siblings, 1 reply; 20+ messages in thread
From: Jason Xing @ 2024-09-30 11:29 UTC (permalink / raw)
To: Willem de Bruijn
Cc: davem, edumazet, kuba, pabeni, dsahern, shuah, willemb,
linux-kselftest, netdev, Jason Xing
On Mon, Sep 30, 2024 at 6:39 PM Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
>
> Jason Xing wrote:
> > From: Jason Xing <kernelxing@tencent.com>
> >
> > Even though this case is unlikely to happen, we have to avoid such
> > a case occurring at an earlier point: the sk_rmem_alloc could get
> > increased because of inserting more and more skbs into the errqueue
> > when calling __skb_complete_tx_timestamp(). This bad case would stop
> > the socket transmitting soon.
>
> It is up to the application to read from the error queue frequently
> enough and/or increase SO_RCVBUF.
Sure thing. If we test it without setting SOF_TIMESTAMPING_SOFTWARE on
the loopback, it will soon stop. That's the reason why I tried to add
the restriction just in case.
>
> > Signed-off-by: Jason Xing <kernelxing@tencent.com>
> > ---
> > net/core/sock.c | 4 ++++
> > 1 file changed, 4 insertions(+)
> >
> > diff --git a/net/core/sock.c b/net/core/sock.c
> > index fe87f9bd8f16..4bddd6f62e4f 100644
> > --- a/net/core/sock.c
> > +++ b/net/core/sock.c
> > @@ -905,6 +905,10 @@ int sock_set_timestamping(struct sock *sk, int optname,
> > if (val & ~SOF_TIMESTAMPING_MASK)
> > return -EINVAL;
> >
> > + if (val & SOF_TIMESTAMPING_TX_RECORD_MASK &&
> > + !(val & SOF_TIMESTAMPING_SOFTWARE))
> > + return -EINVAL;
> > +
>
> This breaks hardware timestamping
Yes, and sorry about that. I'll fix this.
>
> > if (val & SOF_TIMESTAMPING_OPT_ID_TCP &&
> > !(val & SOF_TIMESTAMPING_OPT_ID))
> > return -EINVAL;
> > --
> > 2.37.3
> >
>
>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net-next 1/3] net-timestamp: add strict check when setting tx flags
2024-09-30 11:29 ` Jason Xing
@ 2024-09-30 11:49 ` Willem de Bruijn
2024-09-30 12:42 ` Jason Xing
0 siblings, 1 reply; 20+ messages in thread
From: Willem de Bruijn @ 2024-09-30 11:49 UTC (permalink / raw)
To: Jason Xing, Willem de Bruijn
Cc: davem, edumazet, kuba, pabeni, dsahern, shuah, willemb,
linux-kselftest, netdev, Jason Xing
Jason Xing wrote:
> On Mon, Sep 30, 2024 at 6:39 PM Willem de Bruijn
> <willemdebruijn.kernel@gmail.com> wrote:
> >
> > Jason Xing wrote:
> > > From: Jason Xing <kernelxing@tencent.com>
> > >
> > > Even though this case is unlikely to happen, we have to avoid such
> > > a case occurring at an earlier point: the sk_rmem_alloc could get
> > > increased because of inserting more and more skbs into the errqueue
> > > when calling __skb_complete_tx_timestamp(). This bad case would stop
> > > the socket transmitting soon.
> >
> > It is up to the application to read from the error queue frequently
> > enough and/or increase SO_RCVBUF.
>
> Sure thing. If we test it without setting SOF_TIMESTAMPING_SOFTWARE on
> the loopback, it will soon stop. That's the reason why I tried to add
> the restriction just in case.
I don't follow at all.
That bit does not affect the core issue: that the application is not
clearing its error queue quickly enough.
> >
> > > Signed-off-by: Jason Xing <kernelxing@tencent.com>
> > > ---
> > > net/core/sock.c | 4 ++++
> > > 1 file changed, 4 insertions(+)
> > >
> > > diff --git a/net/core/sock.c b/net/core/sock.c
> > > index fe87f9bd8f16..4bddd6f62e4f 100644
> > > --- a/net/core/sock.c
> > > +++ b/net/core/sock.c
> > > @@ -905,6 +905,10 @@ int sock_set_timestamping(struct sock *sk, int optname,
> > > if (val & ~SOF_TIMESTAMPING_MASK)
> > > return -EINVAL;
> > >
> > > + if (val & SOF_TIMESTAMPING_TX_RECORD_MASK &&
> > > + !(val & SOF_TIMESTAMPING_SOFTWARE))
> > > + return -EINVAL;
> > > +
> >
> > This breaks hardware timestamping
>
> Yes, and sorry about that. I'll fix this.
As is I don't understand the purpose of this patch. Please do not
just resubmit with a change, but explain the problem and suggested
solution first.
> >
> > > if (val & SOF_TIMESTAMPING_OPT_ID_TCP &&
> > > !(val & SOF_TIMESTAMPING_OPT_ID))
> > > return -EINVAL;
> > > --
> > > 2.37.3
> > >
> >
> >
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net-next 2/3] net-timestamp: add OPT_ID_TCP test in selftests
2024-09-30 10:42 ` Willem de Bruijn
@ 2024-09-30 11:49 ` Jason Xing
2024-09-30 11:54 ` Willem de Bruijn
0 siblings, 1 reply; 20+ messages in thread
From: Jason Xing @ 2024-09-30 11:49 UTC (permalink / raw)
To: Willem de Bruijn
Cc: davem, edumazet, kuba, pabeni, dsahern, shuah, willemb,
linux-kselftest, netdev, Jason Xing
On Mon, Sep 30, 2024 at 6:42 PM Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
>
> Jason Xing wrote:
> > From: Jason Xing <kernelxing@tencent.com>
> >
> > Introduce a test for SOF_TIMESTAMPING_OPT_ID_TCP for TCP proto so
> > that we can get aware of whether using write_seq as an initial key
> > value works as expected.
>
> Does the test behave different with this flag set?
>
Sorry, my mistake, the last email is not open to the mailing list. So
I copy that here.
Not that much, only at the very beginning, this new test will use
write_seq directly.
I once thought and wondered if I need to setsockopt() when one or two
sendmsg() are already done, then we check the behaviour of subsequent
sendmsg() calls. Then I changed my mind because it's a bit complex. Do
you think it's a good way to test?
Thanks,
Jason
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net-next 2/3] net-timestamp: add OPT_ID_TCP test in selftests
2024-09-30 11:49 ` Jason Xing
@ 2024-09-30 11:54 ` Willem de Bruijn
2024-09-30 12:17 ` Jason Xing
0 siblings, 1 reply; 20+ messages in thread
From: Willem de Bruijn @ 2024-09-30 11:54 UTC (permalink / raw)
To: Jason Xing, Willem de Bruijn
Cc: davem, edumazet, kuba, pabeni, dsahern, shuah, willemb,
linux-kselftest, netdev, Jason Xing
Jason Xing wrote:
> On Mon, Sep 30, 2024 at 6:42 PM Willem de Bruijn
> <willemdebruijn.kernel@gmail.com> wrote:
> >
> > Jason Xing wrote:
> > > From: Jason Xing <kernelxing@tencent.com>
> > >
> > > Introduce a test for SOF_TIMESTAMPING_OPT_ID_TCP for TCP proto so
> > > that we can get aware of whether using write_seq as an initial key
> > > value works as expected.
> >
> > Does the test behave different with this flag set?
> >
>
> Sorry, my mistake, the last email is not open to the mailing list. So
> I copy that here.
>
> Not that much, only at the very beginning, this new test will use
> write_seq directly.
The kernel will act differently. But the test does not detect this.
> I once thought and wondered if I need to setsockopt() when one or two
> sendmsg() are already done, then we check the behaviour of subsequent
> sendmsg() calls. Then I changed my mind because it's a bit complex. Do
> you think it's a good way to test?
Packetdrill is more suitable for deterministically testing such subtle
differences.
I have a packetdrill test for OPT_ID with and without OPT_ID_TCP. It
is not public yet. As part of upstreaming our packetdrill tests, this
will eventually also be available.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net-next 2/3] net-timestamp: add OPT_ID_TCP test in selftests
2024-09-30 11:54 ` Willem de Bruijn
@ 2024-09-30 12:17 ` Jason Xing
0 siblings, 0 replies; 20+ messages in thread
From: Jason Xing @ 2024-09-30 12:17 UTC (permalink / raw)
To: Willem de Bruijn
Cc: davem, edumazet, kuba, pabeni, dsahern, shuah, willemb,
linux-kselftest, netdev, Jason Xing
On Mon, Sep 30, 2024 at 7:54 PM Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
>
> Jason Xing wrote:
> > On Mon, Sep 30, 2024 at 6:42 PM Willem de Bruijn
> > <willemdebruijn.kernel@gmail.com> wrote:
> > >
> > > Jason Xing wrote:
> > > > From: Jason Xing <kernelxing@tencent.com>
> > > >
> > > > Introduce a test for SOF_TIMESTAMPING_OPT_ID_TCP for TCP proto so
> > > > that we can get aware of whether using write_seq as an initial key
> > > > value works as expected.
> > >
> > > Does the test behave different with this flag set?
> > >
> >
> > Sorry, my mistake, the last email is not open to the mailing list. So
> > I copy that here.
> >
> > Not that much, only at the very beginning, this new test will use
> > write_seq directly.
>
> The kernel will act differently. But the test does not detect this.
No, it will not cover this.
>
> > I once thought and wondered if I need to setsockopt() when one or two
> > sendmsg() are already done, then we check the behaviour of subsequent
> > sendmsg() calls. Then I changed my mind because it's a bit complex. Do
> > you think it's a good way to test?
>
> Packetdrill is more suitable for deterministically testing such subtle
> differences.
>
> I have a packetdrill test for OPT_ID with and without OPT_ID_TCP. It
> is not public yet. As part of upstreaming our packetdrill tests, this
> will eventually also be available.
Good to hear that. Now I think I will drop this patch.
Thanks,
Jason
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net-next 1/3] net-timestamp: add strict check when setting tx flags
2024-09-30 11:49 ` Willem de Bruijn
@ 2024-09-30 12:42 ` Jason Xing
2024-09-30 17:14 ` Willem de Bruijn
0 siblings, 1 reply; 20+ messages in thread
From: Jason Xing @ 2024-09-30 12:42 UTC (permalink / raw)
To: Willem de Bruijn
Cc: davem, edumazet, kuba, pabeni, dsahern, shuah, willemb,
linux-kselftest, netdev, Jason Xing
On Mon, Sep 30, 2024 at 7:49 PM Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
>
> Jason Xing wrote:
> > On Mon, Sep 30, 2024 at 6:39 PM Willem de Bruijn
> > <willemdebruijn.kernel@gmail.com> wrote:
> > >
> > > Jason Xing wrote:
> > > > From: Jason Xing <kernelxing@tencent.com>
> > > >
> > > > Even though this case is unlikely to happen, we have to avoid such
> > > > a case occurring at an earlier point: the sk_rmem_alloc could get
> > > > increased because of inserting more and more skbs into the errqueue
> > > > when calling __skb_complete_tx_timestamp(). This bad case would stop
> > > > the socket transmitting soon.
> > >
> > > It is up to the application to read from the error queue frequently
> > > enough and/or increase SO_RCVBUF.
> >
> > Sure thing. If we test it without setting SOF_TIMESTAMPING_SOFTWARE on
> > the loopback, it will soon stop. That's the reason why I tried to add
> > the restriction just in case.
>
> I don't follow at all.
>
> That bit does not affect the core issue: that the application is not
> clearing its error queue quickly enough.
>
> > >
> > > > Signed-off-by: Jason Xing <kernelxing@tencent.com>
> > > > ---
> > > > net/core/sock.c | 4 ++++
> > > > 1 file changed, 4 insertions(+)
> > > >
> > > > diff --git a/net/core/sock.c b/net/core/sock.c
> > > > index fe87f9bd8f16..4bddd6f62e4f 100644
> > > > --- a/net/core/sock.c
> > > > +++ b/net/core/sock.c
> > > > @@ -905,6 +905,10 @@ int sock_set_timestamping(struct sock *sk, int optname,
> > > > if (val & ~SOF_TIMESTAMPING_MASK)
> > > > return -EINVAL;
> > > >
> > > > + if (val & SOF_TIMESTAMPING_TX_RECORD_MASK &&
> > > > + !(val & SOF_TIMESTAMPING_SOFTWARE))
> > > > + return -EINVAL;
> > > > +
> > >
> > > This breaks hardware timestamping
> >
> > Yes, and sorry about that. I'll fix this.
>
> As is I don't understand the purpose of this patch. Please do not
> just resubmit with a change, but explain the problem and suggested
> solution first.
>
I will drop this patch because I just tested with my program in the
local machine and found there is one mistake I made about calculating
the diff between those two . Sorry for the noise.
Well, I only need to send a V2 patch of patch [3/3] in the next few days.
BTW, please allow me to ask one question unrelated to this patch
again. I do wonder: if we batch the recv skbs from the errqueue when
calling tcp_recvmsg() -> inet_recv_error(), it could break users,
right?
Thanks,
Jason
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net-next 1/3] net-timestamp: add strict check when setting tx flags
2024-09-30 12:42 ` Jason Xing
@ 2024-09-30 17:14 ` Willem de Bruijn
2024-09-30 17:56 ` Jason Xing
0 siblings, 1 reply; 20+ messages in thread
From: Willem de Bruijn @ 2024-09-30 17:14 UTC (permalink / raw)
To: Jason Xing, Willem de Bruijn
Cc: davem, edumazet, kuba, pabeni, dsahern, shuah, willemb,
linux-kselftest, netdev, Jason Xing
Jason Xing wrote:
> On Mon, Sep 30, 2024 at 7:49 PM Willem de Bruijn
> <willemdebruijn.kernel@gmail.com> wrote:
> >
> > Jason Xing wrote:
> > > On Mon, Sep 30, 2024 at 6:39 PM Willem de Bruijn
> > > <willemdebruijn.kernel@gmail.com> wrote:
> > > >
> > > > Jason Xing wrote:
> > > > > From: Jason Xing <kernelxing@tencent.com>
> > > > >
> > > > > Even though this case is unlikely to happen, we have to avoid such
> > > > > a case occurring at an earlier point: the sk_rmem_alloc could get
> > > > > increased because of inserting more and more skbs into the errqueue
> > > > > when calling __skb_complete_tx_timestamp(). This bad case would stop
> > > > > the socket transmitting soon.
> > > >
> > > > It is up to the application to read from the error queue frequently
> > > > enough and/or increase SO_RCVBUF.
> > >
> > > Sure thing. If we test it without setting SOF_TIMESTAMPING_SOFTWARE on
> > > the loopback, it will soon stop. That's the reason why I tried to add
> > > the restriction just in case.
> >
> > I don't follow at all.
> >
> > That bit does not affect the core issue: that the application is not
> > clearing its error queue quickly enough.
> >
> > > >
> > > > > Signed-off-by: Jason Xing <kernelxing@tencent.com>
> > > > > ---
> > > > > net/core/sock.c | 4 ++++
> > > > > 1 file changed, 4 insertions(+)
> > > > >
> > > > > diff --git a/net/core/sock.c b/net/core/sock.c
> > > > > index fe87f9bd8f16..4bddd6f62e4f 100644
> > > > > --- a/net/core/sock.c
> > > > > +++ b/net/core/sock.c
> > > > > @@ -905,6 +905,10 @@ int sock_set_timestamping(struct sock *sk, int optname,
> > > > > if (val & ~SOF_TIMESTAMPING_MASK)
> > > > > return -EINVAL;
> > > > >
> > > > > + if (val & SOF_TIMESTAMPING_TX_RECORD_MASK &&
> > > > > + !(val & SOF_TIMESTAMPING_SOFTWARE))
> > > > > + return -EINVAL;
> > > > > +
> > > >
> > > > This breaks hardware timestamping
> > >
> > > Yes, and sorry about that. I'll fix this.
> >
> > As is I don't understand the purpose of this patch. Please do not
> > just resubmit with a change, but explain the problem and suggested
> > solution first.
> >
>
> I will drop this patch because I just tested with my program in the
> local machine and found there is one mistake I made about calculating
> the diff between those two . Sorry for the noise.
>
> Well, I only need to send a V2 patch of patch [3/3] in the next few days.
>
> BTW, please allow me to ask one question unrelated to this patch
> again. I do wonder: if we batch the recv skbs from the errqueue when
> calling tcp_recvmsg() -> inet_recv_error(), it could break users,
> right?
Analogous to __msg_zerocopy_callback with __msg_zerocopy_callback.
Only here we cannot return range-based results and thus cannot just
expand the range of the one outstanding notification.
This would mean in ip(v6)_recv_error calling sock_dequeue_err_skb,
sock_recv_timestamp and put_cmsg IP_RECVERR multiple times. And
ip_cmsg_recv if needed.
Existing applications do not have to expect multiple results per
single recvmsg call. So enabling that unconditionally could break
them.
Adding this will require a new flag. An sk_tsflag is the obvious
approach.
Interpreting a MSG_* flag passed to recvmsg would be
another option. If there is a bit that can be set with MSG_ERRQUEUE
and cannot already be set currently. But I don't think that's the
case. We allow all bits and ignore any undefined ones.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net-next 1/3] net-timestamp: add strict check when setting tx flags
2024-09-30 17:14 ` Willem de Bruijn
@ 2024-09-30 17:56 ` Jason Xing
2024-09-30 18:15 ` Willem de Bruijn
0 siblings, 1 reply; 20+ messages in thread
From: Jason Xing @ 2024-09-30 17:56 UTC (permalink / raw)
To: Willem de Bruijn
Cc: davem, edumazet, kuba, pabeni, dsahern, shuah, willemb,
linux-kselftest, netdev, Jason Xing
On Tue, Oct 1, 2024 at 1:14 AM Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
>
> Jason Xing wrote:
> > On Mon, Sep 30, 2024 at 7:49 PM Willem de Bruijn
> > <willemdebruijn.kernel@gmail.com> wrote:
> > >
> > > Jason Xing wrote:
> > > > On Mon, Sep 30, 2024 at 6:39 PM Willem de Bruijn
> > > > <willemdebruijn.kernel@gmail.com> wrote:
> > > > >
> > > > > Jason Xing wrote:
> > > > > > From: Jason Xing <kernelxing@tencent.com>
> > > > > >
> > > > > > Even though this case is unlikely to happen, we have to avoid such
> > > > > > a case occurring at an earlier point: the sk_rmem_alloc could get
> > > > > > increased because of inserting more and more skbs into the errqueue
> > > > > > when calling __skb_complete_tx_timestamp(). This bad case would stop
> > > > > > the socket transmitting soon.
> > > > >
> > > > > It is up to the application to read from the error queue frequently
> > > > > enough and/or increase SO_RCVBUF.
> > > >
> > > > Sure thing. If we test it without setting SOF_TIMESTAMPING_SOFTWARE on
> > > > the loopback, it will soon stop. That's the reason why I tried to add
> > > > the restriction just in case.
> > >
> > > I don't follow at all.
> > >
> > > That bit does not affect the core issue: that the application is not
> > > clearing its error queue quickly enough.
> > >
> > > > >
> > > > > > Signed-off-by: Jason Xing <kernelxing@tencent.com>
> > > > > > ---
> > > > > > net/core/sock.c | 4 ++++
> > > > > > 1 file changed, 4 insertions(+)
> > > > > >
> > > > > > diff --git a/net/core/sock.c b/net/core/sock.c
> > > > > > index fe87f9bd8f16..4bddd6f62e4f 100644
> > > > > > --- a/net/core/sock.c
> > > > > > +++ b/net/core/sock.c
> > > > > > @@ -905,6 +905,10 @@ int sock_set_timestamping(struct sock *sk, int optname,
> > > > > > if (val & ~SOF_TIMESTAMPING_MASK)
> > > > > > return -EINVAL;
> > > > > >
> > > > > > + if (val & SOF_TIMESTAMPING_TX_RECORD_MASK &&
> > > > > > + !(val & SOF_TIMESTAMPING_SOFTWARE))
> > > > > > + return -EINVAL;
> > > > > > +
> > > > >
> > > > > This breaks hardware timestamping
> > > >
> > > > Yes, and sorry about that. I'll fix this.
> > >
> > > As is I don't understand the purpose of this patch. Please do not
> > > just resubmit with a change, but explain the problem and suggested
> > > solution first.
> > >
> >
> > I will drop this patch because I just tested with my program in the
> > local machine and found there is one mistake I made about calculating
> > the diff between those two . Sorry for the noise.
> >
> > Well, I only need to send a V2 patch of patch [3/3] in the next few days.
> >
> > BTW, please allow me to ask one question unrelated to this patch
> > again. I do wonder: if we batch the recv skbs from the errqueue when
> > calling tcp_recvmsg() -> inet_recv_error(), it could break users,
> > right?
>
> Analogous to __msg_zerocopy_callback with __msg_zerocopy_callback.
>
> Only here we cannot return range-based results and thus cannot just
> expand the range of the one outstanding notification.
>
> This would mean in ip(v6)_recv_error calling sock_dequeue_err_skb,
> sock_recv_timestamp and put_cmsg IP_RECVERR multiple times. And
> ip_cmsg_recv if needed.
>
> Existing applications do not have to expect multiple results per
> single recvmsg call. So enabling that unconditionally could break
> them.
Thanks for your explanation! I was unsure because I read some use
cases in github and txtimestamp.c, they can only handle one err skb at
one time.
>
> Adding this will require a new flag. An sk_tsflag is the obvious
> approach.
>
> Interpreting a MSG_* flag passed to recvmsg would be
> another option. If there is a bit that can be set with MSG_ERRQUEUE
> and cannot already be set currently. But I don't think that's the
> case. We allow all bits and ignore any undefined ones.
Do you feel it is necessary that we can implement this idea to
optimize it, saving 2 or 3 syscalls at most at one time? IIRC, it's
you who proposed that we can batch them when applying the tracepoints
mechanism after I gave a presentation at netconf :) It's really good.
That inspires me a lot and makes me keep wondering if we can do this
these days.
Since I've already finished the bpf for timestamping feature locally
which bypasses receiving skbs from errqueue, I believe it could be
helpful for those applications that still have tendency to use the
"traditional way" to trace.
What are your thoughts on this? If you agree, do you want to do this
on your own or allow me to give it a try?
Thanks,
Jason
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net-next 1/3] net-timestamp: add strict check when setting tx flags
2024-09-30 17:56 ` Jason Xing
@ 2024-09-30 18:15 ` Willem de Bruijn
2024-10-01 0:42 ` Jason Xing
0 siblings, 1 reply; 20+ messages in thread
From: Willem de Bruijn @ 2024-09-30 18:15 UTC (permalink / raw)
To: Jason Xing, Willem de Bruijn
Cc: davem, edumazet, kuba, pabeni, dsahern, shuah, willemb,
linux-kselftest, netdev, Jason Xing
Jason Xing wrote:
> On Tue, Oct 1, 2024 at 1:14 AM Willem de Bruijn
> <willemdebruijn.kernel@gmail.com> wrote:
> >
> > Jason Xing wrote:
> > > On Mon, Sep 30, 2024 at 7:49 PM Willem de Bruijn
> > > <willemdebruijn.kernel@gmail.com> wrote:
> > > >
> > > > Jason Xing wrote:
> > > > > On Mon, Sep 30, 2024 at 6:39 PM Willem de Bruijn
> > > > > <willemdebruijn.kernel@gmail.com> wrote:
> > > > > >
> > > > > > Jason Xing wrote:
> > > > > > > From: Jason Xing <kernelxing@tencent.com>
> > > > > > >
> > > > > > > Even though this case is unlikely to happen, we have to avoid such
> > > > > > > a case occurring at an earlier point: the sk_rmem_alloc could get
> > > > > > > increased because of inserting more and more skbs into the errqueue
> > > > > > > when calling __skb_complete_tx_timestamp(). This bad case would stop
> > > > > > > the socket transmitting soon.
> > > > > >
> > > > > > It is up to the application to read from the error queue frequently
> > > > > > enough and/or increase SO_RCVBUF.
> > > > >
> > > > > Sure thing. If we test it without setting SOF_TIMESTAMPING_SOFTWARE on
> > > > > the loopback, it will soon stop. That's the reason why I tried to add
> > > > > the restriction just in case.
> > > >
> > > > I don't follow at all.
> > > >
> > > > That bit does not affect the core issue: that the application is not
> > > > clearing its error queue quickly enough.
> > > >
> > > > > >
> > > > > > > Signed-off-by: Jason Xing <kernelxing@tencent.com>
> > > > > > > ---
> > > > > > > net/core/sock.c | 4 ++++
> > > > > > > 1 file changed, 4 insertions(+)
> > > > > > >
> > > > > > > diff --git a/net/core/sock.c b/net/core/sock.c
> > > > > > > index fe87f9bd8f16..4bddd6f62e4f 100644
> > > > > > > --- a/net/core/sock.c
> > > > > > > +++ b/net/core/sock.c
> > > > > > > @@ -905,6 +905,10 @@ int sock_set_timestamping(struct sock *sk, int optname,
> > > > > > > if (val & ~SOF_TIMESTAMPING_MASK)
> > > > > > > return -EINVAL;
> > > > > > >
> > > > > > > + if (val & SOF_TIMESTAMPING_TX_RECORD_MASK &&
> > > > > > > + !(val & SOF_TIMESTAMPING_SOFTWARE))
> > > > > > > + return -EINVAL;
> > > > > > > +
> > > > > >
> > > > > > This breaks hardware timestamping
> > > > >
> > > > > Yes, and sorry about that. I'll fix this.
> > > >
> > > > As is I don't understand the purpose of this patch. Please do not
> > > > just resubmit with a change, but explain the problem and suggested
> > > > solution first.
> > > >
> > >
> > > I will drop this patch because I just tested with my program in the
> > > local machine and found there is one mistake I made about calculating
> > > the diff between those two . Sorry for the noise.
> > >
> > > Well, I only need to send a V2 patch of patch [3/3] in the next few days.
> > >
> > > BTW, please allow me to ask one question unrelated to this patch
> > > again. I do wonder: if we batch the recv skbs from the errqueue when
> > > calling tcp_recvmsg() -> inet_recv_error(), it could break users,
> > > right?
> >
> > Analogous to __msg_zerocopy_callback with __msg_zerocopy_callback.
> >
> > Only here we cannot return range-based results and thus cannot just
> > expand the range of the one outstanding notification.
> >
> > This would mean in ip(v6)_recv_error calling sock_dequeue_err_skb,
> > sock_recv_timestamp and put_cmsg IP_RECVERR multiple times. And
> > ip_cmsg_recv if needed.
> >
> > Existing applications do not have to expect multiple results per
> > single recvmsg call. So enabling that unconditionally could break
> > them.
>
> Thanks for your explanation! I was unsure because I read some use
> cases in github and txtimestamp.c, they can only handle one err skb at
> one time.
>
> >
> > Adding this will require a new flag. An sk_tsflag is the obvious
> > approach.
> >
> > Interpreting a MSG_* flag passed to recvmsg would be
> > another option. If there is a bit that can be set with MSG_ERRQUEUE
> > and cannot already be set currently. But I don't think that's the
> > case. We allow all bits and ignore any undefined ones.
>
> Do you feel it is necessary that we can implement this idea to
> optimize it, saving 2 or 3 syscalls at most at one time? IIRC, it's
> you who proposed that we can batch them when applying the tracepoints
> mechanism after I gave a presentation at netconf :) It's really good.
> That inspires me a lot and makes me keep wondering if we can do this
> these days.
>
> Since I've already finished the bpf for timestamping feature locally
> which bypasses receiving skbs from errqueue,
That's great!
> I believe it could be
> helpful for those applications that still have tendency to use the
> "traditional way" to trace.
>
> What are your thoughts on this? If you agree, do you want to do this
> on your own or allow me to give it a try?
I'd focus on the workload that you care about most, which is the
administrator driven interface, which will use BPF.
This micro optimization would need some benchmarks that show that it
has a measurable effect.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net-next 1/3] net-timestamp: add strict check when setting tx flags
2024-09-30 18:15 ` Willem de Bruijn
@ 2024-10-01 0:42 ` Jason Xing
0 siblings, 0 replies; 20+ messages in thread
From: Jason Xing @ 2024-10-01 0:42 UTC (permalink / raw)
To: Willem de Bruijn
Cc: davem, edumazet, kuba, pabeni, dsahern, shuah, willemb,
linux-kselftest, netdev, Jason Xing
On Tue, Oct 1, 2024 at 2:15 AM Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
>
> Jason Xing wrote:
> > On Tue, Oct 1, 2024 at 1:14 AM Willem de Bruijn
> > <willemdebruijn.kernel@gmail.com> wrote:
> > >
> > > Jason Xing wrote:
> > > > On Mon, Sep 30, 2024 at 7:49 PM Willem de Bruijn
> > > > <willemdebruijn.kernel@gmail.com> wrote:
> > > > >
> > > > > Jason Xing wrote:
> > > > > > On Mon, Sep 30, 2024 at 6:39 PM Willem de Bruijn
> > > > > > <willemdebruijn.kernel@gmail.com> wrote:
> > > > > > >
> > > > > > > Jason Xing wrote:
> > > > > > > > From: Jason Xing <kernelxing@tencent.com>
> > > > > > > >
> > > > > > > > Even though this case is unlikely to happen, we have to avoid such
> > > > > > > > a case occurring at an earlier point: the sk_rmem_alloc could get
> > > > > > > > increased because of inserting more and more skbs into the errqueue
> > > > > > > > when calling __skb_complete_tx_timestamp(). This bad case would stop
> > > > > > > > the socket transmitting soon.
> > > > > > >
> > > > > > > It is up to the application to read from the error queue frequently
> > > > > > > enough and/or increase SO_RCVBUF.
> > > > > >
> > > > > > Sure thing. If we test it without setting SOF_TIMESTAMPING_SOFTWARE on
> > > > > > the loopback, it will soon stop. That's the reason why I tried to add
> > > > > > the restriction just in case.
> > > > >
> > > > > I don't follow at all.
> > > > >
> > > > > That bit does not affect the core issue: that the application is not
> > > > > clearing its error queue quickly enough.
> > > > >
> > > > > > >
> > > > > > > > Signed-off-by: Jason Xing <kernelxing@tencent.com>
> > > > > > > > ---
> > > > > > > > net/core/sock.c | 4 ++++
> > > > > > > > 1 file changed, 4 insertions(+)
> > > > > > > >
> > > > > > > > diff --git a/net/core/sock.c b/net/core/sock.c
> > > > > > > > index fe87f9bd8f16..4bddd6f62e4f 100644
> > > > > > > > --- a/net/core/sock.c
> > > > > > > > +++ b/net/core/sock.c
> > > > > > > > @@ -905,6 +905,10 @@ int sock_set_timestamping(struct sock *sk, int optname,
> > > > > > > > if (val & ~SOF_TIMESTAMPING_MASK)
> > > > > > > > return -EINVAL;
> > > > > > > >
> > > > > > > > + if (val & SOF_TIMESTAMPING_TX_RECORD_MASK &&
> > > > > > > > + !(val & SOF_TIMESTAMPING_SOFTWARE))
> > > > > > > > + return -EINVAL;
> > > > > > > > +
> > > > > > >
> > > > > > > This breaks hardware timestamping
> > > > > >
> > > > > > Yes, and sorry about that. I'll fix this.
> > > > >
> > > > > As is I don't understand the purpose of this patch. Please do not
> > > > > just resubmit with a change, but explain the problem and suggested
> > > > > solution first.
> > > > >
> > > >
> > > > I will drop this patch because I just tested with my program in the
> > > > local machine and found there is one mistake I made about calculating
> > > > the diff between those two . Sorry for the noise.
> > > >
> > > > Well, I only need to send a V2 patch of patch [3/3] in the next few days.
> > > >
> > > > BTW, please allow me to ask one question unrelated to this patch
> > > > again. I do wonder: if we batch the recv skbs from the errqueue when
> > > > calling tcp_recvmsg() -> inet_recv_error(), it could break users,
> > > > right?
> > >
> > > Analogous to __msg_zerocopy_callback with __msg_zerocopy_callback.
> > >
> > > Only here we cannot return range-based results and thus cannot just
> > > expand the range of the one outstanding notification.
> > >
> > > This would mean in ip(v6)_recv_error calling sock_dequeue_err_skb,
> > > sock_recv_timestamp and put_cmsg IP_RECVERR multiple times. And
> > > ip_cmsg_recv if needed.
> > >
> > > Existing applications do not have to expect multiple results per
> > > single recvmsg call. So enabling that unconditionally could break
> > > them.
> >
> > Thanks for your explanation! I was unsure because I read some use
> > cases in github and txtimestamp.c, they can only handle one err skb at
> > one time.
> >
> > >
> > > Adding this will require a new flag. An sk_tsflag is the obvious
> > > approach.
> > >
> > > Interpreting a MSG_* flag passed to recvmsg would be
> > > another option. If there is a bit that can be set with MSG_ERRQUEUE
> > > and cannot already be set currently. But I don't think that's the
> > > case. We allow all bits and ignore any undefined ones.
> >
> > Do you feel it is necessary that we can implement this idea to
> > optimize it, saving 2 or 3 syscalls at most at one time? IIRC, it's
> > you who proposed that we can batch them when applying the tracepoints
> > mechanism after I gave a presentation at netconf :) It's really good.
> > That inspires me a lot and makes me keep wondering if we can do this
> > these days.
> >
> > Since I've already finished the bpf for timestamping feature locally
> > which bypasses receiving skbs from errqueue,
>
> That's great!
>
> > I believe it could be
> > helpful for those applications that still have tendency to use the
> > "traditional way" to trace.
> >
> > What are your thoughts on this? If you agree, do you want to do this
> > on your own or allow me to give it a try?
>
> I'd focus on the workload that you care about most, which is the
> administrator driven interface, which will use BPF.
>
> This micro optimization would need some benchmarks that show that it
> has a measurable effect.
Got it. I will post that series soon.
Thanks,
Jason
^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread, other threads:[~2024-10-01 0:43 UTC | newest]
Thread overview: 20+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-09-30 9:24 [PATCH net-next 0/3] net-timestamp: add some trivial Jason Xing
2024-09-30 9:24 ` [PATCH net-next 1/3] net-timestamp: add strict check when setting tx flags Jason Xing
2024-09-30 10:39 ` Willem de Bruijn
2024-09-30 11:29 ` Jason Xing
2024-09-30 11:49 ` Willem de Bruijn
2024-09-30 12:42 ` Jason Xing
2024-09-30 17:14 ` Willem de Bruijn
2024-09-30 17:56 ` Jason Xing
2024-09-30 18:15 ` Willem de Bruijn
2024-10-01 0:42 ` Jason Xing
2024-09-30 10:48 ` Vadim Fedorenko
2024-09-30 11:24 ` Jason Xing
2024-09-30 9:24 ` [PATCH net-next 2/3] net-timestamp: add OPT_ID_TCP test in selftests Jason Xing
2024-09-30 10:42 ` Willem de Bruijn
2024-09-30 11:49 ` Jason Xing
2024-09-30 11:54 ` Willem de Bruijn
2024-09-30 12:17 ` Jason Xing
2024-09-30 9:24 ` [PATCH net-next 3/3] net-timestamp: namespacify the sysctl_tstamp_allow_data Jason Xing
2024-09-30 10:47 ` Willem de Bruijn
2024-09-30 11:14 ` Jason Xing
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).