* [RFC][PATCH 0/3] net: per skb control messages
@ 2008-07-23 22:01 Octavian Purdila
2008-07-23 22:01 ` [RFC][PATCH 1/3] " Octavian Purdila
` (2 more replies)
0 siblings, 3 replies; 19+ messages in thread
From: Octavian Purdila @ 2008-07-23 22:01 UTC (permalink / raw)
To: netdev; +Cc: Octavian Purdila
Hi everybody,
These patches are a follow up on two previous [rfc] threads:
- new sk_buff member: hwstamp [1]
- support for IEEE 1588 [2]
[1] http://www.spinics.net/lists/netdev/msg68804.html
[2] http://www.spinics.net/lists/netdev/msg67799.html
The idea is to associate control messages with skbs - for both
out-going and in-going traffic.
The RX case seems pretty straight forward.
For the TX case, I am queueing an empty (or a clone of the original)
skb (with control messages attached) to the error queue of the socket
the TX skb belongs to.
To make an idea about how would this be used from userspace, I've
added a simple userspace demo for getting TX hardware timestamps.
I still ended up adding a new skb member but since this is a bit more
generic maybe it can offset the cost?
Thanks,
tavi
---
#include <stdlib.h>
#include <string.h>
#include <error.h>
#include <stdio.h>
#include <errno.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/poll.h>
#include <sys/uio.h>
#include <netinet/in.h>
typedef unsigned long long __u64;
typedef unsigned long long __u32;
#define SOL_SKB 275
#include "include/linux/skb_cmsg.h"
struct cmsg_tx_tstamp {
struct cmsghdr hdr;
struct skb_tx_tstamp stt __attribute__ ((packed));
};
int main()
{
char buffer[100];
struct iovec iovec = {
.iov_base = buffer,
.iov_len = sizeof(buffer)
};
struct cmsg_tx_tstamp ctt = {
.hdr = {
.cmsg_len = sizeof(ctt),
.cmsg_level = SOL_SKB,
.cmsg_type = SKB_TX_GET_TSTAMP
},
.stt = {
.cookie = 0x0bad0badbeefbeefULL,
}
};
struct sockaddr_in daddr = {
.sin_family = AF_INET,
.sin_port = 8888,
.sin_addr = { htonl(0x01010101) }
};
struct msghdr msg = {
.msg_iov = &iovec,
.msg_iovlen = 1,
.msg_control = &ctt,
.msg_controllen = sizeof(ctt),
.msg_name = &daddr,
.msg_namelen = sizeof(daddr)
};
int s=socket(PF_INET, SOCK_DGRAM, 0);
struct pollfd pollfd = {
.fd = s,
.events = POLLERR,
};
printf("%d %d %d %llx %llx\n", ctt.hdr.cmsg_len, ctt.hdr.cmsg_level,
ctt.hdr.cmsg_type, ctt.stt.cookie, ctt.stt.tstamp);
if (sendmsg(s, &msg, 0) < 0)
perror("sendmsg");
poll(&pollfd, 1, -1);
if (recvmsg(s, &msg, MSG_ERRQUEUE) < 0)
perror("recvmsg");
return printf("%d %d %d %llx %llx\n", ctt.hdr.cmsg_len,
ctt.hdr.cmsg_level, ctt.hdr.cmsg_type, ctt.stt.tstamp,
ctt.stt.cookie);
}
---
^ permalink raw reply [flat|nested] 19+ messages in thread
* [RFC][PATCH 1/3] net: per skb control messages
2008-07-23 22:01 [RFC][PATCH 0/3] net: per skb control messages Octavian Purdila
@ 2008-07-23 22:01 ` Octavian Purdila
2008-07-24 12:46 ` Herbert Xu
2008-07-23 22:01 ` [RFC][PATCH 2/3] ip: support for SOL_SKB control messages for UDP/RAW sockets Octavian Purdila
2008-07-23 22:01 ` [RFC][PATCH 3/3] net: add SKB_SOL control messages Octavian Purdila
2 siblings, 1 reply; 19+ messages in thread
From: Octavian Purdila @ 2008-07-23 22:01 UTC (permalink / raw)
To: netdev; +Cc: Octavian Purdila
This patch introduces per skb control messages that can be used to
directly exchange information between the application and the
hardware, at a per packet level. Examples of usecases are: RX/TX
hardware timestamps, TX scheduling (request hardware to send packets
at a future time).
Signed-off-by: Octavian Purdila <opurdila@ixiacom.com>
---
include/linux/skbuff.h | 62 +++++++++++++++++++++++++
include/linux/socket.h | 1 +
include/net/sock.h | 2 +
net/core/skbuff.c | 118 +++++++++++++++++++++++++++++++++++++++++++++++-
net/core/sock.c | 46 +++++++++++++++++++
5 files changed, 227 insertions(+), 2 deletions(-)
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index f24d261..f2988d1 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -135,6 +135,17 @@ struct skb_frag_struct {
__u32 size;
};
+/*
+ * Control message queue; used to exchange additional information between the
+ * application and hardware (e.g. RX/TX hardware timestamps, TX scheduling,
+ * etc.)
+ */
+struct skb_cmsg {
+ struct skb_cmsg *next;
+ int type, len;
+ char data[0];
+};
+
/* This data is invariant across clones and lives at
* the end of the header data, ie. at skb->end.
*/
@@ -148,6 +159,7 @@ struct skb_shared_info {
__be32 ip6_frag_id;
struct sk_buff *frag_list;
skb_frag_t frags[MAX_SKB_FRAGS];
+ struct skb_cmsg *cmsg;
};
/* We divide dataref into two halves. The higher 16 bits hold references
@@ -1719,5 +1731,55 @@ static inline void skb_forward_csum(struct sk_buff *skb)
}
bool skb_partial_csum_set(struct sk_buff *skb, u16 start, u16 off);
+
+
+#define skb_cmsg_queue(skb) (skb_shinfo(skb)->cmsg)
+
+/**
+ * skb_cmsg_for_each - iterate over a control message list
+ * @i: the loop cursor (a struct skb_cmsg *)
+ * @head: the head (a struct skb_cmsg **)
+ */
+#define skb_cmsg_for_each(i, head) for (i = *head; i != NULL; \
+ i = i->next)
+
+void skb_cmsg_free(struct skb_cmsg *head);
+
+struct skb_cmsg *skb_cmsg_alloc(int type, int len, int gfp);
+
+/**
+ * skb_add_cmsg - add a new control message to a list
+ * @head - head of the list
+ * @sc - the control message
+ *
+ * This function should only be called from:
+ * - the device driver, before sending the skb up the network stack
+ * - the higher levels of the network stack, before sending the skb down the
+ * network stack
+ *
+ * Because of these conventions, no synchonization is needed on the cmsg queue
+ * of the skb.
+ */
+static inline void skb_cmsg_add(struct skb_cmsg **head, struct skb_cmsg *sc)
+{
+ sc->next = (*head)->next;
+ *head = sc;
+}
+
+int __skb_cmsg_send(struct skb_cmsg **head, struct msghdr *msg);
+
+/**
+ * skb_cmsg_send - appends the skb control messages sent by the
+ * application to an skb
+ * @msg: the message header passed by the application
+ * @skb: the skb to append the control messages to
+ */
+static inline int skb_cmsg_send(struct sk_buff *skb, struct msghdr *msg)
+{
+ return __skb_cmsg_send(&skb_cmsg_queue(skb), msg);
+}
+
+void skb_cmsg_recv(struct msghdr *msg, struct sk_buff *skb);
+
#endif /* __KERNEL__ */
#endif /* _LINUX_SKBUFF_H */
diff --git a/include/linux/socket.h b/include/linux/socket.h
index bd2b30a..21687a7 100644
--- a/include/linux/socket.h
+++ b/include/linux/socket.h
@@ -293,6 +293,7 @@ struct ucred {
#define SOL_RXRPC 272
#define SOL_PPPOL2TP 273
#define SOL_BLUETOOTH 274
+#define SOL_SKB 275
/* IPX options */
#define IPX_TYPE 1
diff --git a/include/net/sock.h b/include/net/sock.h
index dc42b44..8cc7598 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1365,4 +1365,6 @@ extern int sysctl_optmem_max;
extern __u32 sysctl_wmem_default;
extern __u32 sysctl_rmem_default;
+int sock_queue_skb_cmsg(struct sk_buff *skb, struct skb_cmsg *sc, int clone);
+
#endif /* _SOCK_H */
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 61f3d1f..6426ee9 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -220,6 +220,7 @@ struct sk_buff *__alloc_skb(unsigned int size, gfp_t gfp_mask,
shinfo->gso_type = 0;
shinfo->ip6_frag_id = 0;
shinfo->frag_list = NULL;
+ shinfo->cmsg = NULL;
if (fclone) {
struct sk_buff *child = skb + 1;
@@ -313,6 +314,48 @@ static void skb_clone_fraglist(struct sk_buff *skb)
skb_get(list);
}
+/*
+ * FIXME: Do we need a full copy, or is it ok to just prune the control message
+ * list for copied skb?
+ */
+static struct sk_buff *skb_cmsg_copy(struct sk_buff *skb, int gfp)
+{
+#if 0
+ struct skb_cmsg *i, *head = NULL;
+
+ skb_cmsg_for_each(i, skb_cmsg_queue(skb)) {
+ struct skb_cmsg *sc = skb_cmsg_alloc(i->type, i->len, gfp);
+
+ if (!sc)
+ goto abort;
+
+ memcpy(sc->data, i->data, i->len);
+ skb_cmsg_add(&head, sc);
+ }
+
+ skb_cmsg_queue(skb) = head;
+
+ return skb;
+
+abort:
+ skb_cmsg_free(head);
+ dev_kfree_skb_any(skb);
+ return NULL;
+#else
+ skb_cmsg_queue(skb) = NULL;
+ return skb;
+#endif
+}
+
+void skb_cmsg_free(struct skb_cmsg *i)
+{
+ while (i) {
+ struct skb_cmsg *sc = i;
+ i = i->next;
+ kfree(sc);
+ }
+}
+
static void skb_release_data(struct sk_buff *skb)
{
if (!skb->cloned ||
@@ -328,6 +371,8 @@ static void skb_release_data(struct sk_buff *skb)
skb_drop_fraglist(skb);
kfree(skb->head);
+
+ skb_cmsg_free(skb_cmsg_queue(skb));
}
}
@@ -567,6 +612,8 @@ static void copy_skb_header(struct sk_buff *new, const struct sk_buff *old)
skb_shinfo(new)->gso_size = skb_shinfo(old)->gso_size;
skb_shinfo(new)->gso_segs = skb_shinfo(old)->gso_segs;
skb_shinfo(new)->gso_type = skb_shinfo(old)->gso_type;
+
+ skb_shinfo(new)->cmsg = skb_shinfo(old)->cmsg;
}
/**
@@ -610,7 +657,8 @@ struct sk_buff *skb_copy(const struct sk_buff *skb, gfp_t gfp_mask)
BUG();
copy_skb_header(n, skb);
- return n;
+
+ return skb_cmsg_copy(n, gfp_mask);
}
@@ -835,7 +883,7 @@ struct sk_buff *skb_copy_expand(const struct sk_buff *skb,
n->mac_header += off;
#endif
- return n;
+ return skb_cmsg_copy(n, gfp_mask);
}
/**
@@ -2594,6 +2642,72 @@ bool skb_partial_csum_set(struct sk_buff *skb, u16 start, u16 off)
return true;
}
+/**
+ * skb_cmsg_recv - copies the control messages from an skb to the
+ * application
+ * @msg: the message header passed by the application
+ * @skb: the skb
+ *
+ * Control messages are freed when we free the skb.
+ */
+void skb_cmsg_recv(struct msghdr *msg, struct sk_buff *skb)
+{
+ struct skb_cmsg *sc = skb_shinfo(skb)->cmsg;
+
+ while (sc) {
+ put_cmsg(msg, SOL_SKB, sc->type, sc->len, sc->data);
+ sc = sc->next;
+ }
+}
+EXPORT_SYMBOL(skb_cmsg_recv);
+
+/**
+ * __skb_cmsg_recv - appends the skb control messages sent by the
+ * application to a control message list
+ * @msg: the message header passed by the application
+ * @head: the head of the list to append the control messages to
+ *
+ */
+int __skb_cmsg_send(struct skb_cmsg **head, struct msghdr *msg)
+{
+ struct cmsghdr *cmsg;
+
+ for (cmsg = CMSG_FIRSTHDR(msg); cmsg; cmsg = CMSG_NXTHDR(msg, cmsg)) {
+ int len;
+ struct skb_cmsg *sc;
+
+ if (!CMSG_OK(msg, cmsg))
+ return -EINVAL;
+ if (cmsg->cmsg_level != SOL_SKB)
+ continue;
+ len = cmsg->cmsg_len - sizeof(struct cmsghdr);
+
+ sc = skb_cmsg_alloc(cmsg->cmsg_type, len, GFP_KERNEL);
+ if (!sc)
+ return -ENOMEM;
+ memcpy(sc->data, CMSG_DATA(cmsg), len);
+
+ skb_cmsg_add(head, sc);
+ }
+
+ return 0;
+}
+EXPORT_SYMBOL(__skb_cmsg_send);
+
+struct skb_cmsg *skb_cmsg_alloc(int type, int len, int gfp)
+{
+ struct skb_cmsg *sc = kmalloc(sizeof(*sc) + len, gfp);
+
+ if (!sc)
+ return NULL;
+
+ sc->type = type;
+ sc->len = len;
+
+ return sc;
+}
+EXPORT_SYMBOL(skb_cmsg_alloc);
+
EXPORT_SYMBOL(___pskb_trim);
EXPORT_SYMBOL(__kfree_skb);
EXPORT_SYMBOL(kfree_skb);
diff --git a/net/core/sock.c b/net/core/sock.c
index 88094cb..8a63898 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -307,6 +307,52 @@ out:
}
EXPORT_SYMBOL(sock_queue_rcv_skb);
+/**
+ * sock_queue_skb_cmsg - queue a cloned or empty skb together with a list of
+ * control messages to the error socket of a given skb
+ * @skb: the skb for which the control messages are being send
+ * @sc: a control message list to be sent
+ * @clone: whether the control messages will be transported with an empty skb
+ * or with a clone of the original skb
+ *
+ * This function can be used to get post-send skb control messages to the
+ * application (e.g. TX timestamps).
+ *
+ * Control messages of the original skb will be deleted, the driver and stack
+ * should read them before calling this function.
+ */
+int sock_queue_skb_cmsg(struct sk_buff *skb, struct skb_cmsg *sc, int clone)
+{
+ struct sock *sk = skb->sk;
+ struct sk_buff *n;
+ int err = -ENOMEM;
+
+ if (!sk)
+ return -ENOTSOCK;
+
+ if (clone) {
+ n = skb_clone(skb, GFP_ATOMIC);
+ skb_cmsg_free(skb_cmsg_queue(skb));
+ } else {
+ n = dev_alloc_skb(0);
+ skb_reset_network_header(n);
+ skb_reset_transport_header(n);
+ }
+
+ if (!n)
+ return err;
+
+ n->dev = NULL;
+ skb_cmsg_add(&skb_cmsg_queue(n), sc);
+
+ err = sock_queue_err_skb(sk, n);
+ if (err)
+ kfree_skb(n);
+
+ return err;
+}
+EXPORT_SYMBOL(sock_queue_skb_cmsg);
+
int sk_receive_skb(struct sock *sk, struct sk_buff *skb, const int nested)
{
int rc = NET_RX_SUCCESS;
--
1.5.6.2
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [RFC][PATCH 2/3] ip: support for SOL_SKB control messages for UDP/RAW sockets
2008-07-23 22:01 [RFC][PATCH 0/3] net: per skb control messages Octavian Purdila
2008-07-23 22:01 ` [RFC][PATCH 1/3] " Octavian Purdila
@ 2008-07-23 22:01 ` Octavian Purdila
2008-07-23 22:01 ` [RFC][PATCH 3/3] net: add SKB_SOL control messages Octavian Purdila
2 siblings, 0 replies; 19+ messages in thread
From: Octavian Purdila @ 2008-07-23 22:01 UTC (permalink / raw)
To: netdev; +Cc: Octavian Purdila
Signed-off-by: Octavian Purdila <opurdila@ixiacom.com>
---
include/linux/skbuff.h | 2 +-
include/net/ip.h | 1 +
net/ipv4/icmp.c | 2 ++
net/ipv4/ip_output.c | 2 ++
net/ipv4/ip_sockglue.c | 6 ++++++
net/ipv4/raw.c | 1 +
net/ipv4/udp.c | 1 +
7 files changed, 14 insertions(+), 1 deletions(-)
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index f2988d1..2d394ed 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -1762,7 +1762,7 @@ struct skb_cmsg *skb_cmsg_alloc(int type, int len, int gfp);
*/
static inline void skb_cmsg_add(struct skb_cmsg **head, struct skb_cmsg *sc)
{
- sc->next = (*head)->next;
+ sc->next = *head;
*head = sc;
}
diff --git a/include/net/ip.h b/include/net/ip.h
index 3b40bc2..ccb8be7 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -54,6 +54,7 @@ struct ipcm_cookie
__be32 addr;
int oif;
struct ip_options *opt;
+ struct skb_cmsg *skb_cmsg;
};
#define IPCB(skb) ((struct inet_skb_parm*)((skb)->cb))
diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c
index 8739735..68da287 100644
--- a/net/ipv4/icmp.c
+++ b/net/ipv4/icmp.c
@@ -377,6 +377,7 @@ static void icmp_reply(struct icmp_bxm *icmp_param, struct sk_buff *skb)
inet->tos = ip_hdr(skb)->tos;
daddr = ipc.addr = rt->rt_src;
ipc.opt = NULL;
+ ipc.skb_cmsg = NULL;
if (icmp_param->replyopts.optlen) {
ipc.opt = &icmp_param->replyopts;
if (ipc.opt->srr)
@@ -534,6 +535,7 @@ void icmp_send(struct sk_buff *skb_in, int type, int code, __be32 info)
inet_sk(sk)->tos = tos;
ipc.addr = iph->saddr;
ipc.opt = &icmp_param.replyopts;
+ ipc.skb_cmsg = NULL;
{
struct flowi fl = {
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index e527628..da85d42 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -941,6 +941,7 @@ alloc_new_skb:
skb->ip_summed = csummode;
skb->csum = 0;
skb_reserve(skb, hh_len);
+ skb_shinfo(skb)->cmsg = ipc->skb_cmsg;
/*
* Find where to start putting bytes.
@@ -1354,6 +1355,7 @@ void ip_send_reply(struct sock *sk, struct sk_buff *skb, struct ip_reply_arg *ar
daddr = ipc.addr = rt->rt_src;
ipc.opt = NULL;
+ ipc.skb_cmsg = NULL;
if (replyopts.opt.optlen) {
ipc.opt = &replyopts.opt;
diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index e0514e8..599cfe6 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -169,6 +169,10 @@ int ip_cmsg_send(struct net *net, struct msghdr *msg, struct ipcm_cookie *ipc)
int err;
struct cmsghdr *cmsg;
+ err = __skb_cmsg_send(&ipc->skb_cmsg, msg);
+ if (err)
+ return err;
+
for (cmsg = CMSG_FIRSTHDR(msg); cmsg; cmsg = CMSG_NXTHDR(msg, cmsg)) {
if (!CMSG_OK(msg, cmsg))
return -EINVAL;
@@ -353,6 +357,8 @@ int ip_recv_error(struct sock *sk, struct msghdr *msg, int len)
if (err)
goto out_free_skb;
+ skb_cmsg_recv(msg, skb);
+
sock_recv_timestamp(msg, sk, skb);
serr = SKB_EXT_ERR(skb);
diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
index 37a1ecd..75f0122 100644
--- a/net/ipv4/raw.c
+++ b/net/ipv4/raw.c
@@ -494,6 +494,7 @@ static int raw_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
ipc.addr = inet->saddr;
ipc.opt = NULL;
+ ipc.skb_cmsg = NULL;
ipc.oif = sk->sk_bound_dev_if;
if (msg->msg_controllen) {
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 56fcda3..e9b0f51 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -560,6 +560,7 @@ int udp_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
return -EOPNOTSUPP;
ipc.opt = NULL;
+ ipc.skb_cmsg = NULL;
if (up->pending) {
/*
--
1.5.6.2
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [RFC][PATCH 3/3] net: add SKB_SOL control messages
2008-07-23 22:01 [RFC][PATCH 0/3] net: per skb control messages Octavian Purdila
2008-07-23 22:01 ` [RFC][PATCH 1/3] " Octavian Purdila
2008-07-23 22:01 ` [RFC][PATCH 2/3] ip: support for SOL_SKB control messages for UDP/RAW sockets Octavian Purdila
@ 2008-07-23 22:01 ` Octavian Purdila
2 siblings, 0 replies; 19+ messages in thread
From: Octavian Purdila @ 2008-07-23 22:01 UTC (permalink / raw)
To: netdev; +Cc: Octavian Purdila
SKB_RX_GET_TSTAMP - get hardware timestamp at receive time
SKB_TX_GET_TSTAMP - get hardware timestamp at transmit time
SKB_TX_SCHED - schedule packet to be transmited at hardware timestamp
SKB_TX_EMBED_TSTAMP - embed hardware timestamp into packet at
transmit time
Signed-off-by: Octavian Purdila <opurdila@ixiacom.com>
---
include/linux/skb_cmsg.h | 28 ++++++++++++++++++++++++++++
1 files changed, 28 insertions(+), 0 deletions(-)
create mode 100644 include/linux/skb_cmsg.h
diff --git a/include/linux/skb_cmsg.h b/include/linux/skb_cmsg.h
new file mode 100644
index 0000000..c5b8a9b
--- /dev/null
+++ b/include/linux/skb_cmsg.h
@@ -0,0 +1,28 @@
+#ifndef _LINUX_SKB_CMSG_H
+#define _LINUX_SKB_CMSG_H
+
+#define SKB_RX_GET_TSTAMP 1
+#define SKB_TX_GET_TSTAMP 2
+#define SKB_TX_SCHED 3
+#define SKB_TX_EMBED_TSTAMP 4
+
+struct skb_rx_tstamp {
+ __u64 tstamp;
+};
+
+struct skb_tx_tstamp {
+ __u64 tstamp, cookie;
+};
+
+struct skb_tx_sched {
+ __u64 tstamp;
+};
+
+#define SKB_TX_EMBED_TSTAMP_OFFSET_EOP 0x00010000
+
+struct skb_tx_embed_tstamp {
+ __u64 mask;
+ __u32 offset;
+};
+
+#endif
--
1.5.6.2
^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [RFC][PATCH 1/3] net: per skb control messages
2008-07-23 22:01 ` [RFC][PATCH 1/3] " Octavian Purdila
@ 2008-07-24 12:46 ` Herbert Xu
2008-07-24 13:34 ` Octavian Purdila
0 siblings, 1 reply; 19+ messages in thread
From: Herbert Xu @ 2008-07-24 12:46 UTC (permalink / raw)
To: Octavian Purdila; +Cc: netdev, opurdila
Octavian Purdila <opurdila@ixiacom.com> wrote:
> This patch introduces per skb control messages that can be used to
> directly exchange information between the application and the
> hardware, at a per packet level. Examples of usecases are: RX/TX
> hardware timestamps, TX scheduling (request hardware to send packets
> at a future time).
Putting it in the share info area makes its life-cycle management
difficult. This is also a major change to how we store skb control
data.
> @@ -567,6 +612,8 @@ static void copy_skb_header(struct sk_buff *new, const struct sk_buff *old)
> skb_shinfo(new)->gso_size = skb_shinfo(old)->gso_size;
> skb_shinfo(new)->gso_segs = skb_shinfo(old)->gso_segs;
> skb_shinfo(new)->gso_type = skb_shinfo(old)->gso_type;
> +
> + skb_shinfo(new)->cmsg = skb_shinfo(old)->cmsg;
> }
>
> /**
> @@ -610,7 +657,8 @@ struct sk_buff *skb_copy(const struct sk_buff *skb, gfp_t gfp_mask)
> BUG();
>
> copy_skb_header(n, skb);
> - return n;
> +
> + return skb_cmsg_copy(n, gfp_mask);
What about the other callers of copy_skb_header, like pskb_copy?
In fact why isn't this in copy_skb_header?
Also what about pskb_expand_head?
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFC][PATCH 1/3] net: per skb control messages
2008-07-24 12:46 ` Herbert Xu
@ 2008-07-24 13:34 ` Octavian Purdila
2008-07-24 15:01 ` Herbert Xu
0 siblings, 1 reply; 19+ messages in thread
From: Octavian Purdila @ 2008-07-24 13:34 UTC (permalink / raw)
To: Herbert Xu; +Cc: netdev
On Thursday 24 July 2008, Herbert Xu wrote:
Hi Herbert,
Thanks for taking the time for the review.
>
> Putting it in the share info area makes its life-cycle management
> difficult. This is also a major change to how we store skb control
> data.
>
Initially I tried putting the new member directly into the skb, but then I
run into issues with cloned packets, since I needed a reference
counter to know when to delete the cmsg queue.
I've noticed that the shared info already has a reference counter and I've
though that taking advantage of it would be better then doing it myself.
The shinfo seems to be the right place also because the cmsg queue can not be
modified once the skb has been pushed into the network stack. It sort of is
like data, only with special meaning.
Is this approach wrong? Or is there a better way of doing it?
> > @@ -567,6 +612,8 @@ static void copy_skb_header(struct sk_buff *new,
> > const struct sk_buff *old) skb_shinfo(new)->gso_size =
> > skb_shinfo(old)->gso_size;
> > skb_shinfo(new)->gso_segs = skb_shinfo(old)->gso_segs;
> > skb_shinfo(new)->gso_type = skb_shinfo(old)->gso_type;
> > +
> > + skb_shinfo(new)->cmsg = skb_shinfo(old)->cmsg;
> > }
> >
> > /**
> > @@ -610,7 +657,8 @@ struct sk_buff *skb_copy(const struct sk_buff *skb,
> > gfp_t gfp_mask) BUG();
> >
> > copy_skb_header(n, skb);
> > - return n;
> > +
> > + return skb_cmsg_copy(n, gfp_mask);
>
> What about the other callers of copy_skb_header, like pskb_copy?
>From what I understand, the shinfo remains shared, thus there is no need to do
the copy here.
> In fact why isn't this in copy_skb_header?
>
In copy_skb_header we do a pointer copy. This should be enough for packets
that are cloned or partially cloned.
The only problem is when we do a full copy of the packet, where shinfo should
be also copied. skb_cmsg_copy is called only in these cases.
> Also what about pskb_expand_head?
>
I think this is only touching the skb header, not the shinfo. Thus we should
be safe?
BTW: I have a dilemma with regard to fully copied skbs: should we copy the
cmsg queue as well, or should we just prune it in the copy? I don't see a
real usecase for doing the copy at this point, but since these is at core
level, maybe it is a good idea to be conservative and do the copying?
Thanks,
tavi
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFC][PATCH 1/3] net: per skb control messages
2008-07-24 13:34 ` Octavian Purdila
@ 2008-07-24 15:01 ` Herbert Xu
2008-07-24 16:22 ` Octavian Purdila
0 siblings, 1 reply; 19+ messages in thread
From: Herbert Xu @ 2008-07-24 15:01 UTC (permalink / raw)
To: Octavian Purdila; +Cc: netdev
On Thu, Jul 24, 2008 at 04:34:31PM +0300, Octavian Purdila wrote:
>
> > What about the other callers of copy_skb_header, like pskb_copy?
>
> From what I understand, the shinfo remains shared, thus there is no need to do
> the copy here.
No pskb_copy does a full copy on the head portion of the packet
which is where your cmsg is.
> > Also what about pskb_expand_head?
> >
>
> I think this is only touching the skb header, not the shinfo. Thus we should
> be safe?
But you're freeing the cmsg in skb_release_data which is called
by pskb_expand_head. This can't possibly work.
> BTW: I have a dilemma with regard to fully copied skbs: should we copy the
> cmsg queue as well, or should we just prune it in the copy? I don't see a
> real usecase for doing the copy at this point, but since these is at core
> level, maybe it is a good idea to be conservative and do the copying?
Do we need to make it so generic? If we do then can you please
come up with other uses for it too. For example, find out what
existing fields can be better suited by storing them as a cmsg.
Only then can we design this properly.
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFC][PATCH 1/3] net: per skb control messages
2008-07-24 15:01 ` Herbert Xu
@ 2008-07-24 16:22 ` Octavian Purdila
2008-07-24 20:28 ` David Miller
0 siblings, 1 reply; 19+ messages in thread
From: Octavian Purdila @ 2008-07-24 16:22 UTC (permalink / raw)
To: Herbert Xu; +Cc: netdev
On Thursday 24 July 2008, Herbert Xu wrote:
> > From what I understand, the shinfo remains shared, thus there is no need
> > to do the copy here.
>
> No pskb_copy does a full copy on the head portion of the packet
> which is where your cmsg is.
>
> > > Also what about pskb_expand_head?
> >
> > I think this is only touching the skb header, not the shinfo. Thus we
> > should be safe?
>
> But you're freeing the cmsg in skb_release_data which is called
> by pskb_expand_head. This can't possibly work.
Right, I've missed the fact that skb_shinfo is stored in the header part.
> Do we need to make it so generic? If we do then can you please
> come up with other uses for it too. For example, find out what
> existing fields can be better suited by storing them as a cmsg.
> Only then can we design this properly.
It looks like we can switch a few members from the skb to cmsg: secmark,
priority, tstamp, thus saving some space in the skb, but we will have a
performance overhead for accessing these since we now have to search for them
in the cmesg list.
So, the more I think about this the more I am inclined to think that we don't
really need it. For our hw timestamp usecases, the reason of this patch, we
can get away by adding these new fields in the skb:
u16 flags; /* request / signal presence of RX/TX hw stamps, TX scheduling, TX
embed hw timestamp in packet */
u8 hwstamp[6];
and of course the mechanism of enabling hw stamps requests on per skb, with
msghdr.msg_control stuff.
Would that be acceptable (with proper CONFIG_HW_TSTAMPS of course)?
Thanks,
tavi
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFC][PATCH 1/3] net: per skb control messages
2008-07-24 16:22 ` Octavian Purdila
@ 2008-07-24 20:28 ` David Miller
2008-07-24 21:49 ` Octavian Purdila
0 siblings, 1 reply; 19+ messages in thread
From: David Miller @ 2008-07-24 20:28 UTC (permalink / raw)
To: opurdila; +Cc: herbert, netdev
From: Octavian Purdila <opurdila@ixiacom.com>
Date: Thu, 24 Jul 2008 19:22:57 +0300
> Would that be acceptable (with proper CONFIG_HW_TSTAMPS of course)?
Adding new fields to struct sk_buff that take up space is generally
not allowed unless the new field adds substantially to the benefit of
a large group of users of Linus.
I don't think that applied here for this hw-tstamps stuff.
And yes, this is even with the config option protection, because every
distribution is going to turn all the options on, which makes them
essentially pointless from a sk_buff overhead perspective.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFC][PATCH 1/3] net: per skb control messages
2008-07-24 20:28 ` David Miller
@ 2008-07-24 21:49 ` Octavian Purdila
2008-07-24 21:56 ` David Miller
0 siblings, 1 reply; 19+ messages in thread
From: Octavian Purdila @ 2008-07-24 21:49 UTC (permalink / raw)
To: David Miller; +Cc: herbert, netdev
On Thursday 24 July 2008, David Miller wrote:
> Adding new fields to struct sk_buff that take up space is generally
> not allowed unless the new field adds substantially to the benefit of
> a large group of users of Linus.
>
> I don't think that applied here for this hw-tstamps stuff.
What about the approach proposed in the patch? Is it ok to add a pointer which
may resolve other future similar issues?
Thanks,
tavi
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFC][PATCH 1/3] net: per skb control messages
2008-07-24 21:49 ` Octavian Purdila
@ 2008-07-24 21:56 ` David Miller
2008-07-24 21:58 ` Stephen Hemminger
2008-07-24 22:12 ` Octavian Purdila
0 siblings, 2 replies; 19+ messages in thread
From: David Miller @ 2008-07-24 21:56 UTC (permalink / raw)
To: opurdila; +Cc: herbert, netdev
From: Octavian Purdila <opurdila@ixiacom.com>
Date: Fri, 25 Jul 2008 00:49:46 +0300
> On Thursday 24 July 2008, David Miller wrote:
>
> > Adding new fields to struct sk_buff that take up space is generally
> > not allowed unless the new field adds substantially to the benefit of
> > a large group of users of Linus.
> >
> > I don't think that applied here for this hw-tstamps stuff.
>
> What about the approach proposed in the patch? Is it ok to add a pointer which
> may resolve other future similar issues?
Traversing the list and maintaining the reference counting and sharing
issues is expensive and error prone.
This is what OpenBSD uses for their IPSEC state attached to MBUFs and
it's a nightmare.
We have a timestamp in the SKB already, why don't you simply override
it when your feature is enable and set a single flag bit that
indicates you used a HW timestamp to set that timestamp?
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFC][PATCH 1/3] net: per skb control messages
2008-07-24 21:56 ` David Miller
@ 2008-07-24 21:58 ` Stephen Hemminger
2008-07-24 22:35 ` Octavian Purdila
2008-07-24 22:12 ` Octavian Purdila
1 sibling, 1 reply; 19+ messages in thread
From: Stephen Hemminger @ 2008-07-24 21:58 UTC (permalink / raw)
To: David Miller; +Cc: opurdila, herbert, netdev
On Thu, 24 Jul 2008 14:56:00 -0700 (PDT)
David Miller <davem@davemloft.net> wrote:
> From: Octavian Purdila <opurdila@ixiacom.com>
> Date: Fri, 25 Jul 2008 00:49:46 +0300
>
> > On Thursday 24 July 2008, David Miller wrote:
> >
> > > Adding new fields to struct sk_buff that take up space is generally
> > > not allowed unless the new field adds substantially to the benefit of
> > > a large group of users of Linus.
> > >
> > > I don't think that applied here for this hw-tstamps stuff.
> >
> > What about the approach proposed in the patch? Is it ok to add a pointer which
> > may resolve other future similar issues?
>
> Traversing the list and maintaining the reference counting and sharing
> issues is expensive and error prone.
>
> This is what OpenBSD uses for their IPSEC state attached to MBUFs and
> it's a nightmare.
>
> We have a timestamp in the SKB already, why don't you simply override
> it when your feature is enable and set a single flag bit that
> indicates you used a HW timestamp to set that timestamp?
How hard would it be to add PLL support to use same kind of timestamp.
Enlist the help of some NTP/clock experts to help.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFC][PATCH 1/3] net: per skb control messages
2008-07-24 21:56 ` David Miller
2008-07-24 21:58 ` Stephen Hemminger
@ 2008-07-24 22:12 ` Octavian Purdila
2008-07-24 22:17 ` David Miller
1 sibling, 1 reply; 19+ messages in thread
From: Octavian Purdila @ 2008-07-24 22:12 UTC (permalink / raw)
To: David Miller; +Cc: herbert, netdev
On Friday 25 July 2008, David Miller wrote:
> Traversing the list and maintaining the reference counting and sharing
> issues is expensive and error prone.
I've notice that :)
> We have a timestamp in the SKB already, why don't you simply override
> it when your feature is enable and set a single flag bit that
> indicates you used a HW timestamp to set that timestamp?
I thought of something similar, but I am not sure if I can to so, as it seems
that the skb->tstamp requires current gettimeofday semantics at least in
netfilter's ipt_time module.
Thanks,
tavi
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFC][PATCH 1/3] net: per skb control messages
2008-07-24 22:12 ` Octavian Purdila
@ 2008-07-24 22:17 ` David Miller
2008-07-24 23:14 ` Octavian Purdila
0 siblings, 1 reply; 19+ messages in thread
From: David Miller @ 2008-07-24 22:17 UTC (permalink / raw)
To: opurdila; +Cc: herbert, netdev
From: Octavian Purdila <opurdila@ixiacom.com>
Date: Fri, 25 Jul 2008 01:12:03 +0300
> On Friday 25 July 2008, David Miller wrote:
>
> > We have a timestamp in the SKB already, why don't you simply override
> > it when your feature is enable and set a single flag bit that
> > indicates you used a HW timestamp to set that timestamp?
>
> I thought of something similar, but I am not sure if I can to so, as it seems
> that the skb->tstamp requires current gettimeofday semantics at least in
> netfilter's ipt_time module.
Can your timestamp format at least be converted to
gettimeofday() format?
I thought we had a ton of accessor functions that code uses to access
the timestamp? You should be able to do your translation in those
routines.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFC][PATCH 1/3] net: per skb control messages
2008-07-24 21:58 ` Stephen Hemminger
@ 2008-07-24 22:35 ` Octavian Purdila
2008-07-24 23:05 ` Stephen Hemminger
0 siblings, 1 reply; 19+ messages in thread
From: Octavian Purdila @ 2008-07-24 22:35 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: David Miller, herbert, netdev
On Friday 25 July 2008, Stephen Hemminger wrote:
> How hard would it be to add PLL support to use same kind of timestamp.
> Enlist the help of some NTP/clock experts to help.
Let me see if I got this correctly: modify the hardware so that the CPU is
synchronized with the NIC timestamping unit. If so, I will agree that this is
the cleanest possible solution.
Thanks,
tavi
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFC][PATCH 1/3] net: per skb control messages
2008-07-24 22:35 ` Octavian Purdila
@ 2008-07-24 23:05 ` Stephen Hemminger
0 siblings, 0 replies; 19+ messages in thread
From: Stephen Hemminger @ 2008-07-24 23:05 UTC (permalink / raw)
To: Octavian Purdila; +Cc: David Miller, herbert, netdev
On Fri, 25 Jul 2008 01:35:52 +0300
Octavian Purdila <opurdila@ixiacom.com> wrote:
> On Friday 25 July 2008, Stephen Hemminger wrote:
>
> > How hard would it be to add PLL support to use same kind of timestamp.
> > Enlist the help of some NTP/clock experts to help.
>
> Let me see if I got this correctly: modify the hardware so that the CPU is
> synchronized with the NIC timestamping unit. If so, I will agree that this is
> the cleanest possible solution.
>
> Thanks,
> tavi
Perodically, sample the hardware clock and the system time of day,
and resynchronize the clocks. I did something like this in a prototype
that never got integrated...
------------------------
Subject: sky2: hardware receive timestamp counter
Use hardware timestamp counter to stamp receive packets.
It allows for higher resolution values without the hardware overhead
of doing gettimeofday.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
---
drivers/net/sky2.c | 159 +++++++++++++++++++++++++++++++++++++++--------------
drivers/net/sky2.h | 5 +
2 files changed, 124 insertions(+), 40 deletions(-)
--- a/drivers/net/sky2.c 2007-08-29 11:41:16.000000000 -0700
+++ b/drivers/net/sky2.c 2007-08-30 13:05:39.000000000 -0700
@@ -26,6 +26,7 @@
#include <linux/kernel.h>
#include <linux/version.h>
#include <linux/module.h>
+#include <linux/reciprocal_div.h>
#include <linux/netdevice.h>
#include <linux/dma-mapping.h>
#include <linux/etherdevice.h>
@@ -2186,6 +2187,36 @@ error:
goto resubmit;
}
+static inline u32 sky2_tist2ns(const struct sky2_hw *hw, u32 clk)
+{
+ return reciprocal_divide(clk * 1000, hw->tist_rate);
+}
+
+/*
+ * Convert hardware timestamp clock into a real time value
+ */
+static void sky2_set_timestamp(struct sk_buff *skb,
+ const struct sky2_hw *hw, u32 stamp)
+{
+ unsigned long seq;
+
+ do {
+ s32 delta;
+
+ seq = read_seqbegin(&hw->tist_lock);
+
+ /* ticks since last synchronization */
+ delta = stamp - hw->tist_base;
+
+ if (delta > 0)
+ skb->tstamp = ktime_add_ns(hw->tist_real,
+ sky2_tist2ns(hw, delta));
+ else
+ skb->tstamp = ktime_sub_ns(hw->tist_real,
+ sky2_tist2ns(hw, -delta));
+ } while (read_seqretry(&hw->tist_lock, seq));
+}
+
/* Transmit complete */
static inline void sky2_tx_done(struct net_device *dev, u16 last)
{
@@ -2262,6 +2293,16 @@ static int sky2_status_intr(struct sky2_
break;
#ifdef SKY2_VLAN_TAG_USED
+ case OP_RXTIMEVLAN:
+ sky2->rx_tag = length;
+ /* fall through */
+#endif
+ case OP_RXTIMESTAMP:
+ skb = sky2->rx_ring[sky2->rx_next].skb;
+ sky2_set_timestamp(skb, hw, status);
+ break;
+
+#ifdef SKY2_VLAN_TAG_USED
case OP_RXVLAN:
sky2->rx_tag = length;
break;
@@ -2375,9 +2416,6 @@ static void sky2_hw_intr(struct sky2_hw
{
struct pci_dev *pdev = hw->pdev;
u32 status = sky2_read32(hw, B0_HWE_ISRC);
- u32 hwmsk = sky2_read32(hw, B0_HWE_IMSK);
-
- status &= hwmsk;
if (status & Y2_IS_TIST_OV)
sky2_write8(hw, GMAC_TI_ST_CTRL, GMT_ST_CLR_IRQ);
@@ -2457,11 +2495,15 @@ static void sky2_le_error(struct sky2_hw
sky2_write32(hw, Q_ADDR(q, Q_CSR), BMU_CLR_IRQ_CHK);
}
-/* Check for lost IRQ once a second */
+/* Once a second timer for safety checking and polling for timestamp
+ *
+ * Note: receive and timer processing both happen under softirq
+ */
static void sky2_watchdog(unsigned long arg)
{
struct sky2_hw *hw = (struct sky2_hw *) arg;
+ /* Look for lost IRQ */
if (sky2_read32(hw, B0_ISRC)) {
struct net_device *dev = hw->dev[0];
@@ -2469,6 +2511,14 @@ static void sky2_watchdog(unsigned long
__netif_rx_schedule(dev);
}
+ /* Snapshot current system realtime at current timestamp value
+ * @ 150Mhz counter wraps in 28.6 secs
+ */
+ write_seqlock(&hw->tist_lock);
+ hw->tist_real = ktime_get_real();
+ hw->tist_base = sky2_read32(hw, GMAC_TI_ST_VAL);
+ write_sequnlock(&hw->tist_lock);
+
if (hw->active > 0)
mod_timer(&hw->watchdog_timer, round_jiffies(jiffies + HZ));
}
@@ -2476,9 +2526,6 @@ static void sky2_watchdog(unsigned long
/* Hardware/software error handling */
static void sky2_err_intr(struct sky2_hw *hw, u32 status)
{
- if (net_ratelimit())
- dev_warn(&hw->pdev->dev, "error interrupt status=%#x\n", status);
-
if (status & Y2_IS_HW_ERR)
sky2_hw_intr(hw);
@@ -2707,9 +2754,9 @@ static void sky2_reset(struct sky2_hw *h
/* Turn off descriptor polling */
sky2_write32(hw, B28_DPT_CTRL, DPT_STOP);
- /* Turn off receive timestamp */
- sky2_write8(hw, GMAC_TI_ST_CTRL, GMT_ST_STOP);
- sky2_write8(hw, GMAC_TI_ST_CTRL, GMT_ST_CLR_IRQ);
+ /* Turn on receive timestamp */
+ sky2_write32(hw, GMAC_TI_ST_VAL, 0);
+ sky2_write8(hw, GMAC_TI_ST_CTRL, GMT_ST_CLR_IRQ|GMT_ST_START);
/* enable the Tx Arbiters */
for (i = 0; i < hw->ports; i++)
@@ -4061,6 +4108,9 @@ static int __devinit sky2_probe(struct p
sky2_show_addr(dev1);
}
+ seqlock_init(&hw->tist_lock);
+ hw->tist_rate = reciprocal_value(sky2_mhz(hw));
+
setup_timer(&hw->watchdog_timer, sky2_watchdog, (unsigned long) hw);
INIT_WORK(&hw->restart_work, sky2_restart);
--- a/drivers/net/sky2.h 2007-08-29 11:41:16.000000000 -0700
+++ b/drivers/net/sky2.h 2007-08-30 13:01:50.000000000 -0700
@@ -352,7 +352,7 @@ enum {
/* Hardware error interrupt mask for Yukon 2 */
enum {
- Y2_IS_TIST_OV = 1<<29,/* Time Stamp Timer overflow interrupt */
+ Y2_IS_TIST_OV = 1<<29, /* Time Stamp Timer overflow interrupt */
Y2_IS_SENSOR = 1<<28, /* Sensor interrupt */
Y2_IS_MST_ERR = 1<<27, /* Master error interrupt */
Y2_IS_IRQ_STAT = 1<<26, /* Status exception interrupt */
@@ -2032,6 +2032,11 @@ struct sky2_hw {
u8 ports;
u8 active;
+ seqlock_t tist_lock;
+ ktime_t tist_real;
+ u32 tist_base;
+ u32 tist_rate;
+
struct sky2_status_le *st_le;
u32 st_idx;
dma_addr_t st_dma;
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFC][PATCH 1/3] net: per skb control messages
2008-07-24 22:17 ` David Miller
@ 2008-07-24 23:14 ` Octavian Purdila
2008-07-24 23:18 ` David Miller
0 siblings, 1 reply; 19+ messages in thread
From: Octavian Purdila @ 2008-07-24 23:14 UTC (permalink / raw)
To: David Miller; +Cc: herbert, netdev
On Friday 25 July 2008, David Miller wrote:
> From: Octavian Purdila <opurdila@ixiacom.com>
> Date: Fri, 25 Jul 2008 01:12:03 +0300
>
> > On Friday 25 July 2008, David Miller wrote:
> > > We have a timestamp in the SKB already, why don't you simply override
> > > it when your feature is enable and set a single flag bit that
> > > indicates you used a HW timestamp to set that timestamp?
> >
> > I thought of something similar, but I am not sure if I can to so, as it
> > seems that the skb->tstamp requires current gettimeofday semantics at
> > least in netfilter's ipt_time module.
>
> Can your timestamp format at least be converted to
> gettimeofday() format?
>
> I thought we had a ton of accessor functions that code uses to access
> the timestamp? You should be able to do your translation in those
> routines.
Sure but the problem is that the NIC hw timestamp is not synced with the CPU
time. In that case I think that the netfilter rules which are looking at the
timestamp will be messed up.
[
A bit of general context about this annoying hw timestamp thing I keep
bringing up here:)
I know that this is a very specific thing and there is probably not a clean
solution to this and we will probably have to go with an in internal patch
approach.
But the thing is that we accumulated a lot such internal patches to the point
that makes it very hard to upgrade and track a recent Linux version. And I
feel that we need to stop adding new stuff in this pile, otherwise we will
not be able to keep up.
Thus my fixation with this very specific and not so significant thing in the
great Linux ecosystem. Probably I just chosen the wrong patch to battle.
]
Thanks,
tavi
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFC][PATCH 1/3] net: per skb control messages
2008-07-24 23:14 ` Octavian Purdila
@ 2008-07-24 23:18 ` David Miller
2008-07-24 23:26 ` Octavian Purdila
0 siblings, 1 reply; 19+ messages in thread
From: David Miller @ 2008-07-24 23:18 UTC (permalink / raw)
To: opurdila; +Cc: herbert, netdev
From: Octavian Purdila <opurdila@ixiacom.com>
Date: Fri, 25 Jul 2008 02:14:11 +0300
> Sure but the problem is that the NIC hw timestamp is not synced with
> the CPU time. In that case I think that the netfilter rules which
> are looking at the timestamp will be messed up.
Right, but you can maintain a suitable "delta" between the two.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFC][PATCH 1/3] net: per skb control messages
2008-07-24 23:18 ` David Miller
@ 2008-07-24 23:26 ` Octavian Purdila
0 siblings, 0 replies; 19+ messages in thread
From: Octavian Purdila @ 2008-07-24 23:26 UTC (permalink / raw)
To: David Miller; +Cc: herbert, netdev
On Friday 25 July 2008, David Miller wrote:
> > Sure but the problem is that the NIC hw timestamp is not synced with
> > the CPU time. In that case I think that the netfilter rules which
> > are looking at the timestamp will be messed up.
>
> Right, but you can maintain a suitable "delta" between the two.
OK, I think I got the idea now:
- use a bit in the timestamp to specify that this is a hw timestamp
- use a new net_device method to convert a hw timestamp to a gettimeofday
timestamp
This is great, thanks a lot !
tavi
^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread, other threads:[~2008-07-24 23:29 UTC | newest]
Thread overview: 19+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-07-23 22:01 [RFC][PATCH 0/3] net: per skb control messages Octavian Purdila
2008-07-23 22:01 ` [RFC][PATCH 1/3] " Octavian Purdila
2008-07-24 12:46 ` Herbert Xu
2008-07-24 13:34 ` Octavian Purdila
2008-07-24 15:01 ` Herbert Xu
2008-07-24 16:22 ` Octavian Purdila
2008-07-24 20:28 ` David Miller
2008-07-24 21:49 ` Octavian Purdila
2008-07-24 21:56 ` David Miller
2008-07-24 21:58 ` Stephen Hemminger
2008-07-24 22:35 ` Octavian Purdila
2008-07-24 23:05 ` Stephen Hemminger
2008-07-24 22:12 ` Octavian Purdila
2008-07-24 22:17 ` David Miller
2008-07-24 23:14 ` Octavian Purdila
2008-07-24 23:18 ` David Miller
2008-07-24 23:26 ` Octavian Purdila
2008-07-23 22:01 ` [RFC][PATCH 2/3] ip: support for SOL_SKB control messages for UDP/RAW sockets Octavian Purdila
2008-07-23 22:01 ` [RFC][PATCH 3/3] net: add SKB_SOL control messages Octavian Purdila
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).