* [RFC 0/4] net: enable timestamps on a per-socket basis
@ 2008-04-21 5:34 Jason Uhlenkott
2008-04-21 5:34 ` [RFC 1/4] net core: move timestamp functions Jason Uhlenkott
` (4 more replies)
0 siblings, 5 replies; 11+ messages in thread
From: Jason Uhlenkott @ 2008-04-21 5:34 UTC (permalink / raw)
To: David S. Miller; +Cc: netdev, Stephen Hemminger
Currently, if anyone enables the SOCK_TIMESTAMP flag on any socket
(via the SO_TIMESTAMP socket option or the SIOCGSTAMP ioctl), we start
recording timestamps for *every* packet coming into the system. We do
this because timestamps are recorded very early by the net core,
before the upper protocol layers have a chance to look up the
socket(s) associated with each incoming packet.
A number of common apps enable timestamps, including ping, tcpdump,
and named. named probably has the biggest impact -- most people don't
leave ping running all the time, and expect that leaving tcpdump
running will cost a bit of performance, but it's unfortunate that the
mere presence of a local caching nameserver has a global negative
effect on incoming packet overhead.
This patchset creates a mechanism for protocols to opt in to handling
timestamps at a higher layer, and uses this mechanism to implement
per-socket SOCK_TIMESTAMP support for several common protocols.
* [RFC 1/4] net core: move timestamp functions
2008-04-21 5:34 [RFC 0/4] net: enable timestamps on a per-socket basis Jason Uhlenkott
@ 2008-04-21 5:34 ` Jason Uhlenkott
2008-04-21 5:35 ` [RFC 2/4] net core: let protocols implement SOCK_TIMESTAMP efficiently Jason Uhlenkott
` (3 subsequent siblings)
4 siblings, 0 replies; 11+ messages in thread
From: Jason Uhlenkott @ 2008-04-21 5:34 UTC (permalink / raw)
To: David S. Miller; +Cc: netdev, Stephen Hemminger
Reorganize some timestamp related functions which were scattered
around sock.c.
This is pure code motion with no changes.
Signed-off-by: Jason Uhlenkott <juhlenko@akamai.com>
---
net/core/sock.c | 80 ++++++++++++++++++++++++++++----------------------------
1 file changed, 40 insertions(+), 40 deletions(-)
Index: linux/net/core/sock.c
===================================================================
--- linux.orig/net/core/sock.c 2008-04-18 17:18:11.000000000 -0700
+++ linux/net/core/sock.c 2008-04-20 21:03:54.000000000 -0700
@@ -255,6 +255,46 @@
}
}
+int sock_get_timestamp(struct sock *sk, struct timeval __user *userstamp)
+{
+ struct timeval tv;
+ if (!sock_flag(sk, SOCK_TIMESTAMP))
+ sock_enable_timestamp(sk);
+ tv = ktime_to_timeval(sk->sk_stamp);
+ if (tv.tv_sec == -1)
+ return -ENOENT;
+ if (tv.tv_sec == 0) {
+ sk->sk_stamp = ktime_get_real();
+ tv = ktime_to_timeval(sk->sk_stamp);
+ }
+ return copy_to_user(userstamp, &tv, sizeof(tv)) ? -EFAULT : 0;
+}
+EXPORT_SYMBOL(sock_get_timestamp);
+
+int sock_get_timestampns(struct sock *sk, struct timespec __user *userstamp)
+{
+ struct timespec ts;
+ if (!sock_flag(sk, SOCK_TIMESTAMP))
+ sock_enable_timestamp(sk);
+ ts = ktime_to_timespec(sk->sk_stamp);
+ if (ts.tv_sec == -1)
+ return -ENOENT;
+ if (ts.tv_sec == 0) {
+ sk->sk_stamp = ktime_get_real();
+ ts = ktime_to_timespec(sk->sk_stamp);
+ }
+ return copy_to_user(userstamp, &ts, sizeof(ts)) ? -EFAULT : 0;
+}
+EXPORT_SYMBOL(sock_get_timestampns);
+
+void sock_enable_timestamp(struct sock *sk)
+{
+ if (!sock_flag(sk, SOCK_TIMESTAMP)) {
+ sock_set_flag(sk, SOCK_TIMESTAMP);
+ net_enable_timestamp();
+ }
+}
+
static void sock_disable_timestamp(struct sock *sk)
{
if (sock_flag(sk, SOCK_TIMESTAMP)) {
@@ -1765,46 +1805,6 @@
}
EXPORT_SYMBOL(release_sock);
-int sock_get_timestamp(struct sock *sk, struct timeval __user *userstamp)
-{
- struct timeval tv;
- if (!sock_flag(sk, SOCK_TIMESTAMP))
- sock_enable_timestamp(sk);
- tv = ktime_to_timeval(sk->sk_stamp);
- if (tv.tv_sec == -1)
- return -ENOENT;
- if (tv.tv_sec == 0) {
- sk->sk_stamp = ktime_get_real();
- tv = ktime_to_timeval(sk->sk_stamp);
- }
- return copy_to_user(userstamp, &tv, sizeof(tv)) ? -EFAULT : 0;
-}
-EXPORT_SYMBOL(sock_get_timestamp);
-
-int sock_get_timestampns(struct sock *sk, struct timespec __user *userstamp)
-{
- struct timespec ts;
- if (!sock_flag(sk, SOCK_TIMESTAMP))
- sock_enable_timestamp(sk);
- ts = ktime_to_timespec(sk->sk_stamp);
- if (ts.tv_sec == -1)
- return -ENOENT;
- if (ts.tv_sec == 0) {
- sk->sk_stamp = ktime_get_real();
- ts = ktime_to_timespec(sk->sk_stamp);
- }
- return copy_to_user(userstamp, &ts, sizeof(ts)) ? -EFAULT : 0;
-}
-EXPORT_SYMBOL(sock_get_timestampns);
-
-void sock_enable_timestamp(struct sock *sk)
-{
- if (!sock_flag(sk, SOCK_TIMESTAMP)) {
- sock_set_flag(sk, SOCK_TIMESTAMP);
- net_enable_timestamp();
- }
-}
-
/*
* Get a socket option on an socket.
*
* [RFC 2/4] net core: let protocols implement SOCK_TIMESTAMP efficiently
2008-04-21 5:34 [RFC 0/4] net: enable timestamps on a per-socket basis Jason Uhlenkott
2008-04-21 5:34 ` [RFC 1/4] net core: move timestamp functions Jason Uhlenkott
@ 2008-04-21 5:35 ` Jason Uhlenkott
2008-04-21 5:35 ` [RFC 3/4] ipv4: efficient SOCK_TIMESTAMP support for TCP, UDP, and raw sockets Jason Uhlenkott
` (2 subsequent siblings)
4 siblings, 0 replies; 11+ messages in thread
From: Jason Uhlenkott @ 2008-04-21 5:35 UTC (permalink / raw)
To: David S. Miller; +Cc: netdev, Stephen Hemminger
Classically, if anyone enabled the SOCK_TIMESTAMP flag on any socket
(via the SO_TIMESTAMP socket option or the SIOCGSTAMP ioctl), we'd
start recording timestamps for *every* packet coming into the system.
We did this because timestamps were recorded very early by the net
core, before the upper protocol layers had a chance to look up the
socket(s) associated with each incoming packet.
This gives protocols a way to announce (via a new flag,
PROTO_HAS_SOCK_TIMESTAMP) that they'll record packet timestamps
themselves, *after* the socket lookup has happened, so timestamps on
supported socket types can be enabled on a per-socket basis, instead
of globally in the net core.
For most protocols, supporting this is trivial: it just requires a
call to sock_timestamp(sk, skb) somewhere in the rcv path, typically
immediately after the socket lookup.
Signed-off-by: Jason Uhlenkott <juhlenko@akamai.com>
---
include/net/sock.h | 13 +++++++++++++
net/core/sock.c | 21 +++++++++++++++++++--
2 files changed, 32 insertions(+), 2 deletions(-)
Index: linux/include/net/sock.h
===================================================================
--- linux.orig/include/net/sock.h 2008-04-20 21:03:54.000000000 -0700
+++ linux/include/net/sock.h 2008-04-20 21:04:03.000000000 -0700
@@ -475,6 +475,12 @@
skb->next = NULL;
}
+static inline void sock_timestamp(struct sock *sk, struct sk_buff *skb)
+{
+ if (sock_flag(sk, SOCK_TIMESTAMP) && !skb->tstamp.tv64)
+ __net_timestamp(skb);
+}
+
#define sk_wait_event(__sk, __timeo, __condition) \
({ int __rc; \
release_sock(__sk); \
@@ -587,6 +593,8 @@
char name[32];
struct list_head node;
+
+ unsigned long flags;
#ifdef SOCK_REFCNT_DEBUG
atomic_t socks;
#endif
@@ -595,6 +603,11 @@
extern int proto_register(struct proto *prot, int alloc_slab);
extern void proto_unregister(struct proto *prot);
+enum proto_flags {
+ /* proto handles per-socket timestamp flag efficiently */
+ PROTO_HAS_SOCK_TIMESTAMP = 0,
+};
+
#ifdef SOCK_REFCNT_DEBUG
static inline void sk_refcnt_debug_inc(struct sock *sk)
{
Index: linux/net/core/sock.c
===================================================================
--- linux.orig/net/core/sock.c 2008-04-20 21:03:54.000000000 -0700
+++ linux/net/core/sock.c 2008-04-20 21:04:03.000000000 -0700
@@ -287,11 +287,27 @@
}
EXPORT_SYMBOL(sock_get_timestampns);
+/*
+ * Does this type of socket implement SOCK_TIMESTAMP efficiently (by
+ * calling sock_timestamp() in the upper protocol layers, after we've
+ * looked up the skb's socket)?
+ */
+static inline int sock_supports_timestamps(struct sock *sk)
+{
+ return sk->sk_prot->flags & (1 << PROTO_HAS_SOCK_TIMESTAMP);
+}
+
void sock_enable_timestamp(struct sock *sk)
{
if (!sock_flag(sk, SOCK_TIMESTAMP)) {
sock_set_flag(sk, SOCK_TIMESTAMP);
- net_enable_timestamp();
+ if (!sock_supports_timestamps(sk)) {
+ /*
+ * Ugh, we can't enable timestamps for just
+ * this socket. We'll have to do it globally.
+ */
+ net_enable_timestamp();
+ }
}
}
@@ -299,7 +315,8 @@
{
if (sock_flag(sk, SOCK_TIMESTAMP)) {
sock_reset_flag(sk, SOCK_TIMESTAMP);
- net_disable_timestamp();
+ if (!sock_supports_timestamps(sk))
+ net_disable_timestamp();
}
}
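The enable/disable logic in this patch can be tried out in isolation with a
small userspace model. This is only a sketch: struct proto_model, struct
sock_model, and the global_stamp_users counter are hypothetical stand-ins
for struct proto, struct sock, and whatever net_enable_timestamp() and
net_disable_timestamp() track internally; only the PROTO_HAS_SOCK_TIMESTAMP
bit test mirrors the patch directly.

```c
#include <stdbool.h>

/* Same bit number as in the patch. */
enum proto_flags { PROTO_HAS_SOCK_TIMESTAMP = 0 };

/* Hypothetical stand-ins; not the real kernel layouts. */
struct proto_model {
	unsigned long flags;
};

struct sock_model {
	const struct proto_model *prot;
	bool timestamp_flag;		/* models SOCK_TIMESTAMP */
};

/* Models how many sockets force timestamps on for every packet. */
int global_stamp_users;

static bool sock_supports_timestamps(const struct sock_model *sk)
{
	return (sk->prot->flags & (1UL << PROTO_HAS_SOCK_TIMESTAMP)) != 0;
}

void model_enable_timestamp(struct sock_model *sk)
{
	if (sk->timestamp_flag)
		return;
	sk->timestamp_flag = true;
	/* Only protocols that have not opted in fall back to global mode. */
	if (!sock_supports_timestamps(sk))
		global_stamp_users++;
}

void model_disable_timestamp(struct sock_model *sk)
{
	if (!sk->timestamp_flag)
		return;
	sk->timestamp_flag = false;
	if (!sock_supports_timestamps(sk))
		global_stamp_users--;
}
```

In this model, enabling timestamps on a socket whose protocol sets the flag
leaves global_stamp_users untouched, while a socket of any other protocol
bumps it exactly once no matter how often enable is called.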
* [RFC 3/4] ipv4: efficient SOCK_TIMESTAMP support for TCP, UDP, and raw sockets
2008-04-21 5:34 [RFC 0/4] net: enable timestamps on a per-socket basis Jason Uhlenkott
2008-04-21 5:34 ` [RFC 1/4] net core: move timestamp functions Jason Uhlenkott
2008-04-21 5:35 ` [RFC 2/4] net core: let protocols implement SOCK_TIMESTAMP efficiently Jason Uhlenkott
@ 2008-04-21 5:35 ` Jason Uhlenkott
2008-04-21 5:36 ` [RFC 4/4] af_packet: efficient SOCK_TIMESTAMP support Jason Uhlenkott
2008-04-21 6:03 ` [RFC 0/4] net: enable timestamps on a per-socket basis David Miller
4 siblings, 0 replies; 11+ messages in thread
From: Jason Uhlenkott @ 2008-04-21 5:35 UTC (permalink / raw)
To: David S. Miller; +Cc: netdev, Stephen Hemminger
Implement SOCK_TIMESTAMP support for TCP, UDP, and raw sockets by
calling sock_timestamp() on each received skb, after we've figured out
which socket(s) it belongs to.
By implementing SOCK_TIMESTAMP at this layer, we make it possible for
timestamps to be enabled on a per-socket basis for these types of
sockets; we avoid the performance penalty of enabling timestamps
globally in the net core.
Signed-off-by: Jason Uhlenkott <juhlenko@akamai.com>
---
net/ipv4/raw.c | 3 +++
net/ipv4/tcp_ipv4.c | 3 +++
net/ipv4/udp.c | 4 ++++
3 files changed, 10 insertions(+)
Index: linux/net/ipv4/tcp_ipv4.c
===================================================================
--- linux.orig/net/ipv4/tcp_ipv4.c 2008-04-20 21:03:53.000000000 -0700
+++ linux/net/ipv4/tcp_ipv4.c 2008-04-20 21:04:05.000000000 -0700
@@ -1663,6 +1663,8 @@
skb->dev = NULL;
+ sock_timestamp(sk, skb);
+
bh_lock_sock_nested(sk);
ret = 0;
if (!sock_owned_by_user(sk)) {
@@ -2440,6 +2442,7 @@
.compat_setsockopt = compat_tcp_setsockopt,
.compat_getsockopt = compat_tcp_getsockopt,
#endif
+ .flags = (1 << PROTO_HAS_SOCK_TIMESTAMP),
REF_PROTO_INUSE(tcp)
};
Index: linux/net/ipv4/udp.c
===================================================================
--- linux.orig/net/ipv4/udp.c 2008-04-20 21:03:53.000000000 -0700
+++ linux/net/ipv4/udp.c 2008-04-20 21:04:05.000000000 -0700
@@ -1085,6 +1085,8 @@
do {
struct sk_buff *skb1 = skb;
+ sock_timestamp(sk, skb);
+
sknext = udp_v4_mcast_next(sk_next(sk), uh->dest, daddr,
uh->source, saddr, dif);
if (sknext)
@@ -1193,6 +1195,7 @@
if (sk != NULL) {
int ret = 0;
+ sock_timestamp(sk, skb);
bh_lock_sock_nested(sk);
if (!sock_owned_by_user(sk))
ret = udp_queue_rcv_skb(sk, skb);
@@ -1502,6 +1505,7 @@
.compat_setsockopt = compat_udp_setsockopt,
.compat_getsockopt = compat_udp_getsockopt,
#endif
+ .flags = (1 << PROTO_HAS_SOCK_TIMESTAMP),
REF_PROTO_INUSE(udp)
};
Index: linux/net/ipv4/raw.c
===================================================================
--- linux.orig/net/ipv4/raw.c 2008-04-20 21:03:53.000000000 -0700
+++ linux/net/ipv4/raw.c 2008-04-20 21:04:05.000000000 -0700
@@ -320,6 +320,8 @@
skb_push(skb, skb->data - skb_network_header(skb));
+ sock_timestamp(sk, skb);
+
raw_rcv_skb(sk, skb);
return 0;
}
@@ -848,6 +850,7 @@
.compat_setsockopt = compat_raw_setsockopt,
.compat_getsockopt = compat_raw_getsockopt,
#endif
+ .flags = (1 << PROTO_HAS_SOCK_TIMESTAMP),
REF_PROTO_INUSE(raw)
};
* [RFC 4/4] af_packet: efficient SOCK_TIMESTAMP support
2008-04-21 5:34 [RFC 0/4] net: enable timestamps on a per-socket basis Jason Uhlenkott
` (2 preceding siblings ...)
2008-04-21 5:35 ` [RFC 3/4] ipv4: efficient SOCK_TIMESTAMP support for TCP, UDP, and raw sockets Jason Uhlenkott
@ 2008-04-21 5:36 ` Jason Uhlenkott
2008-04-21 6:03 ` [RFC 0/4] net: enable timestamps on a per-socket basis David Miller
4 siblings, 0 replies; 11+ messages in thread
From: Jason Uhlenkott @ 2008-04-21 5:36 UTC (permalink / raw)
To: David S. Miller; +Cc: netdev, Stephen Hemminger
Implement SOCK_TIMESTAMP support for AF_PACKET sockets by recording a
timestamp for each skb which makes it past the filter.
By implementing SOCK_TIMESTAMP at this layer, we avoid the overhead of
recording a timestamp in the net core for every packet before we
know whether the packet will be filtered.
This preserves the existing behavior of recording a timestamp in the
tpacket_hdr of every packet on an mmaped packet socket, regardless of
whether SOCK_TIMESTAMP is set (but at least now we'll always record
timestamps *after* the filter runs).
Signed-off-by: Jason Uhlenkott <juhlenko@akamai.com>
---
net/packet/af_packet.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
Index: linux/net/packet/af_packet.c
===================================================================
--- linux.orig/net/packet/af_packet.c 2008-04-20 21:03:53.000000000 -0700
+++ linux/net/packet/af_packet.c 2008-04-20 21:04:07.000000000 -0700
@@ -484,6 +484,8 @@
(unsigned)sk->sk_rcvbuf)
goto drop_n_acct;
+ sock_timestamp(sk, skb);
+
if (skb_shared(skb)) {
struct sk_buff *nskb = skb_clone(skb, GFP_ATOMIC);
if (nskb == NULL)
@@ -638,10 +640,9 @@
h->tp_snaplen = snaplen;
h->tp_mac = macoff;
h->tp_net = netoff;
- if (skb->tstamp.tv64)
- tv = ktime_to_timeval(skb->tstamp);
- else
- do_gettimeofday(&tv);
+ if (!skb->tstamp.tv64)
+ __net_timestamp(skb);
+ tv = ktime_to_timeval(skb->tstamp);
h->tp_sec = tv.tv_sec;
h->tp_usec = tv.tv_usec;
@@ -957,6 +958,7 @@
.name = "PACKET",
.owner = THIS_MODULE,
.obj_size = sizeof(struct packet_sock),
+ .flags = (1 << PROTO_HAS_SOCK_TIMESTAMP),
};
/*
* Re: [RFC 0/4] net: enable timestamps on a per-socket basis
2008-04-21 5:34 [RFC 0/4] net: enable timestamps on a per-socket basis Jason Uhlenkott
` (3 preceding siblings ...)
2008-04-21 5:36 ` [RFC 4/4] af_packet: efficient SOCK_TIMESTAMP support Jason Uhlenkott
@ 2008-04-21 6:03 ` David Miller
2008-04-21 7:28 ` Jason Uhlenkott
2008-04-21 10:44 ` Andi Kleen
4 siblings, 2 replies; 11+ messages in thread
From: David Miller @ 2008-04-21 6:03 UTC (permalink / raw)
To: juhlenko; +Cc: netdev, shemminger
From: Jason Uhlenkott <juhlenko@akamai.com>
Date: Sun, 20 Apr 2008 22:34:02 -0700
> This patchset creates a mechanism for protocols to opt in to handling
> timestamps at a higher layer, and uses this mechanism to implement
> per-socket SOCK_TIMESTAMP support for several common protocols.
Moving the timestamp up to a higher level takes away some of the
frequent use cases of timestamps, which is to detect things like the
fact that it is taking a long time for packets to get from the
top-level packet receive down to the actual protocol processing.
With your patch, it can't be used that way any more.
In fact, people are desiring timestamps which are _closer_ to when the
device actually receives the frame rather than further away.
I understand the reasons why this change is desirable from a
performance standpoint, but we really can't do this, sorry.
It's probably more tenable to do the work necessary to get named to
not enable timestamps by default. If it really wants packet data
timestamps, it can do a gettimeofday() call right after recvmsg() as
gettimeofday() is merely a function call into the VSYSCALL page on
most platforms.
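As a rough illustration of the suggestion above, the application takes its
own timestamp immediately after recvmsg() returns instead of setting
SO_TIMESTAMP. The helper name recv_then_stamp() is hypothetical. Note that
the value recorded is the time userspace picked the packet up, not the time
the kernel received it.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Receive one message, then record the wall-clock time in userspace.
 * Returns the payload length, or -1 on error.
 */
ssize_t recv_then_stamp(int fd, void *buf, size_t len, struct timeval *ts)
{
	struct iovec iov = { .iov_base = buf, .iov_len = len };
	struct msghdr msg;
	ssize_t n;

	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;

	n = recvmsg(fd, &msg, 0);
	if (n < 0)
		return -1;

	/* Stamp as soon as possible after the receive call returns. */
	if (gettimeofday(ts, NULL) < 0)
		return -1;
	return n;
}
```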
* Re: [RFC 0/4] net: enable timestamps on a per-socket basis
2008-04-21 6:03 ` [RFC 0/4] net: enable timestamps on a per-socket basis David Miller
@ 2008-04-21 7:28 ` Jason Uhlenkott
2008-04-21 10:44 ` Andi Kleen
1 sibling, 0 replies; 11+ messages in thread
From: Jason Uhlenkott @ 2008-04-21 7:28 UTC (permalink / raw)
To: David Miller; +Cc: netdev, shemminger
On Sun, Apr 20, 2008 at 23:03:56 -0700, David Miller wrote:
> Moving the timestamp up to a higher level takes away some of the
> frequent use cases of timestamps, which is to detect things like the
> fact that it is taking a long time for packets to get from the
> top-level packet receive down to the actual protocol processing.
>
> With your patch, it can't be used that way any more.
>
> In fact, people are desiring timestamps which are _closer_ to when the
> device actually receives the frame rather than further away.
Fair enough.
It's too bad that SO_TIMESTAMP has a scope of impact which really has
nothing to do with a socket, though. It almost seems like it should
have been an interface flag instead of a socket option.
* Re: [RFC 0/4] net: enable timestamps on a per-socket basis
2008-04-21 6:03 ` [RFC 0/4] net: enable timestamps on a per-socket basis David Miller
2008-04-21 7:28 ` Jason Uhlenkott
@ 2008-04-21 10:44 ` Andi Kleen
2008-04-21 10:59 ` David Miller
1 sibling, 1 reply; 11+ messages in thread
From: Andi Kleen @ 2008-04-21 10:44 UTC (permalink / raw)
To: David Miller; +Cc: juhlenko, netdev, shemminger
David Miller <davem@davemloft.net> writes:
> Moving the timestamp up to a higher level takes away some of the
> frequent use cases of timestamps, which is to detect things like the
> fact that it is taking a long time for packets to get from the
> top-level packet receive down to the actual protocol processing.
Is that really a frequent use case? It sounds more like a specialized
debugging situation. Most users are not network stack hackers :)
How about a sysctl to switch between the two behaviours?
My guess would be that Jason's semantics are better for most systems.
> In fact, people are desiring timestamps which are _closer_ to when the
> device actually receives the frame rather than further away.
The other alternative was always to support looser time stamps especially
for networking. People mostly complain about time stamping when they use
x86 systems with no globally reliable TSC, which have to fall back to
slower southbridge timers, and then it hurts.
But if you are willing to give away some of the guarantees of standard
gettimeofday (like global monotonicity between CPUs) then you
could actually still use TSC even on those systems. And I don't
think global monotonicity is really needed for a packet
time stamp ...
-Andi
* Re: [RFC 0/4] net: enable timestamps on a per-socket basis
2008-04-21 10:44 ` Andi Kleen
@ 2008-04-21 10:59 ` David Miller
2008-04-21 11:43 ` Andi Kleen
0 siblings, 1 reply; 11+ messages in thread
From: David Miller @ 2008-04-21 10:59 UTC (permalink / raw)
To: andi; +Cc: juhlenko, netdev, shemminger
From: Andi Kleen <andi@firstfloor.org>
Date: Mon, 21 Apr 2008 12:44:56 +0200
> David Miller <davem@davemloft.net> writes:
>
> > Moving the timestamp up to a higher level takes away some of the
> > frequent use cases of timestamps, which is to detect things like the
> > fact that it is taking a long time for packets to get from the
> > top-level packet receive down to the actual protocol processing.
>
> Is that really a frequent use case? It sounds more like a specialized
> debugging situation. Most users are not network stack hackers :)
Ask a financial service industry shop what the implications of
inaccurate transaction timestamps can be. It is possible for it to be
measured in the millions if not billions of euros.
> But if you are willing to give away some of the guarantees of standard
> gettimeofday (like global monotonicity between CPUs) then you
> could actually still use TSC even on those systems. And I don't
> think global monotonicity is really needed for a packet
> time stamp ...
So if tcpdump gets rescheduled on another cpu, or the multiqueue flow
hashing algorithm changes, the appearance of the ordering of packets
changes.
No thanks.
Nobody wants half-working timestamps. That's why it's such an
enormous issue that x86 screwed this up so badly for such a long
period of time.
* Re: [RFC 0/4] net: enable timestamps on a per-socket basis
2008-04-21 10:59 ` David Miller
@ 2008-04-21 11:43 ` Andi Kleen
2008-04-21 11:51 ` David Miller
0 siblings, 1 reply; 11+ messages in thread
From: Andi Kleen @ 2008-04-21 11:43 UTC (permalink / raw)
To: David Miller; +Cc: juhlenko, netdev, shemminger
David Miller wrote:
> From: Andi Kleen <andi@firstfloor.org>
> Date: Mon, 21 Apr 2008 12:44:56 +0200
>
>> David Miller <davem@davemloft.net> writes:
>>
>>> Moving the timestamp up to a higher level takes away some of the
>>> frequent use cases of timestamps, which is to detect things like the
>>> fact that it is taking a long time for packets to get from the
>>> top-level packet receive down to the actual protocol processing.
>> Is that really a frequent use case? It sounds more like a specialized
>> debugging situation. Most users are not network stack hackers :)
>
> Ask a financial service industry shop what the implications of
> inaccurate transaction timestamps can be. It is possible for it to be
> measured in the millions if not billions of euros.
Are you sure they don't just need end2end timestamps as in from sendmsg
to recvmsg()? Imagine the packet is stuck for some time in the
kernel for whatever reason and you only process it later in user space:
wouldn't you consider that older time stamp "inaccurate" too? I would,
unless I was debugging the network stack.
It is hard to imagine they really care about excluding one set of queues
(kernel queue) and not other queues (nic rx/tx queues, switch queues
etc.) for their time stamps as you imply.
>> But if you are willing to give away some of the guarantees of standard
>> gettimeofday (like global monotonicity between CPUs) then you
>> could actually still use TSC even on those systems. And I don't
>> think global monotonicity is really needed for a packet
>> time stamp ...
>
> So if tcpdump gets rescheduled on another cpu, or the multiqueue flow
> hashing algorithm changes, the appearance of the ordering of packets
> changes.
If that is the problem you could always cut off some lower bits in the
time stamp and put a sequence counter in there. Ok not serious. It would
probably still work for most people though.
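Purely to make the (admittedly unserious) idea above concrete, here is a
sketch of replacing the low bits of a nanosecond timestamp with a sequence
counter. SEQ_BITS, the 64-bit nanosecond representation, and the function
name stamp_with_seq() are all assumptions chosen for the illustration;
nothing like this appears in the patches.

```c
#include <stdint.h>

/* Assumed width of the embedded counter; 8 bits covers a 256 ns window. */
#define SEQ_BITS 8
#define SEQ_MASK ((UINT64_C(1) << SEQ_BITS) - 1)

/*
 * Drop the low SEQ_BITS of a nanosecond timestamp and substitute the next
 * value of a shared sequence counter.  Stamps whose raw times fall into
 * the same truncated window then compare in arrival order.
 */
uint64_t stamp_with_seq(uint64_t ns, uint64_t *seq)
{
	uint64_t s = (*seq)++ & SEQ_MASK;

	return (ns & ~SEQ_MASK) | s;
}
```

This only orders stamps within one truncated window: it does nothing when
the clock skew between CPUs exceeds the window, and ordering is lost when
the counter wraps. A shared counter would also need atomic updates, which
this sketch leaves out.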
> No thanks.
>
> Nobody wants half-working timestamps.
I'm not so sure. When I was still working on this I had some
conversations with various application people about this, and when
prodded they generally were supportive of the relaxed time stamp idea
when presented with the alternatives (either slow timers or relaxed timers).
Very few applications really need the full time stamp guarantees.
For your debugging example relaxed time stamps would work great I would
think.
But yes as always when changing some existing semantics it is hard to be
sure it won't be a problem for somebody. The only really safe way
is to use different interfaces but that circumvents Jason's idea
of fixing the existing binaries.
> That's why it's such an
> enormous issue that x86 screwed this up so badly for such a long
> period of time.
Well, pretty much all older CPUs (x86 and non x86) with aggressive power
saving have broken internal timers. The one usually comes with the
other. Only recently have hardware people learned to avoid that.
-Andi
* Re: [RFC 0/4] net: enable timestamps on a per-socket basis
2008-04-21 11:43 ` Andi Kleen
@ 2008-04-21 11:51 ` David Miller
0 siblings, 0 replies; 11+ messages in thread
From: David Miller @ 2008-04-21 11:51 UTC (permalink / raw)
To: andi; +Cc: juhlenko, netdev, shemminger
From: Andi Kleen <andi@firstfloor.org>
Date: Mon, 21 Apr 2008 13:43:31 +0200
> Are you sure they don't just need end2end timestamps as in from sendmsg
> to recvmsg()? Imagine the packet is stuck for some time in the
> kernel for whatever reason and you only process it later in user space:
> wouldn't you consider that older time stamp "inaccurate" too? I would,
> unless I was debugging the network stack.
They need the exact timestamp when the packet was received at the
physical machine (and this pretty much means the network card) so that
they know precisely when customer X's trade order arrived for
scheduling and prioritizing purposes.
> It is hard to imagine they really care about excluding one set of queues
> (kernel queue) and not other queues (nic rx/tx queues, switch queues
> etc.) for their time stamps as you imply.
They do.