* Re: [PATCH 7/8]: Handle timestamps on Request/Response exchange
2007-09-25 14:30 [PATCH 7/8]: Handle timestamps on Request/Response exchange separately Gerrit Renker
@ 2007-09-25 18:50 ` Arnaldo Carvalho de Melo
2007-09-26 9:46 ` [PATCH 7/8]: Handle timestamps on Request/Response exchange separately Gerrit Renker
` (4 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: Arnaldo Carvalho de Melo @ 2007-09-25 18:50 UTC (permalink / raw)
To: dccp
Em Tue, Sep 25, 2007 at 03:30:48PM +0100, Gerrit Renker escreveu:
> [DCCP]: Handle timestamps on Request/Response exchange separately
>
> In DCCP, timestamps can occur on packets anytime, CCID3 uses a timestamp(/echo) on the Request/Response
> exchange. This patch addresses the following situation:
> * timestamps are recorded on the listening socket;
I noticed this, I think it is unaceptable to do that, therefore the best
thing is to combine the two patches, so as not to introduce a problem
that is fixed in the following patch.
Look below for some other considerations.
> * Responses are sent from dccp_request_sockets;
> * suppose two connections reach the listening socket with very small time in between:
> * the first timestamp value gets overwritten by the second connection request.
>
> It may seem unlikely that this bug will occur, since the Response is sent out immediately, but to me
> this does not seem right. I am further not sure whether the socket locks provide sufficient protection
> against overwriting timestamp values on the listening socket.
It is just wrong, I cannot see anything that would prevent this window
from being hit.
> This patch therefore
> * makes memory use for timestamps dynamic (to use less when no timestamps are present);
> * separates `handshake timestamps' from `established timestamps';
I didn't understood this one. Care to further explain? Anyway, I think
that adding yet another allocation in a connection lifetime is not good.
One of the most pressing things for me after merging all the patches
that are pending (THANK YOU! :-) ) is to work on DCCP memory consuption
and accounting, the code has to be made more robust :-\
I think that we should just add dreq_{time,echo} to dccp_request_sock
and keep dccp_sock as is.
Now it is:
[acme@mica net-2.6.24]$ pahole -C dccp_request_sock net/dccp/minisocks.o
struct dccp_request_sock {
struct inet_request_sock dreq_inet_rsk; /* 0 56 */
__u64 dreq_iss; /* 56 8 */
/* --- cacheline 1 boundary (64 bytes) --- */
__u64 dreq_isr; /* 64 8 */
__be32 dreq_service; /* 72 4 */
/* size: 76, cachelines: 2 */
/* last cacheline: 12 bytes */
};
I suggest it to become:
[acme@mica net-2.6.24]$ pahole -C dccp_request_sock net/dccp/minisocks.o
struct dccp_request_sock {
struct inet_request_sock dreq_inet_rsk; /* 0 56 */
__u64 dreq_iss; /* 56 8 */
/* --- cacheline 1 boundary (64 bytes) --- */
__u64 dreq_isr; /* 64 8 */
__be32 dreq_service; /* 72 4 */
__u32 dreq_tstamp_echo; /* 76 4 */
ktime_t dreq_tstamp_time; /* 80 8 */
/* size: 88, cachelines: 2 */
/* last cacheline: 24 bytes */
};
Humm, these minisocks are getting fat... another thing for my TODO list,
request_sock::ts_recent seems to be used only by the TCP machinery, ripe
for the picking....
Anyway, I'll move along the patch queue looking for more easily
cherrypickable patches.
> * de-allocates in request socket destructor if previous de-allocation has failed.
>
> Furthermore, inserting the Timestamp Echo option has been taken out of the block starting with
> '!dccp_packet_without_ack()', since Timestamp Echo can be carried on any packet (5.8 and 13.3).
Well, not really, a timestamp echo on a request packet would make no
sense :-) But yeah, the code right now its wrong as it doesn't puts
timestamp echo options in data packets, and that is allowed.
> Lastly, with sampling RTTs in mind, the earliest-unacknowledged timestamp is always kept on the
> socket (mimicking RFC 1323). This is not fully worked out, to do a RFC1323-style algorithm requires
> more work, and possibly some changes; but in can in principle benefit from the provided code.
>
> Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
> ---
> include/linux/dccp.h | 25 +++++++++++++++++++++----
> net/dccp/dccp.h | 2 +-
> net/dccp/ipv4.c | 4 ++++
> net/dccp/ipv6.c | 4 ++++
> net/dccp/minisocks.c | 4 ++++
> net/dccp/options.c | 41 +++++++++++++++++++++++++++--------------
> net/dccp/proto.c | 5 +++++
> 7 files changed, 66 insertions(+), 19 deletions(-)
>
> --- a/include/linux/dccp.h
> +++ b/include/linux/dccp.h
> @@ -407,11 +407,30 @@ struct dccp_opt_pend {
>
> extern void dccp_minisock_init(struct dccp_minisock *dmsk);
>
> +/**
> + * struct dccp_ts_echo - Record incoming timestamp to echo it later
> + * @ts_time: arrival time of timestamp
> + * @ts_echo: the timestamp value recorded at @ts_time
> + */
> +struct dccp_ts_echo {
> + ktime_t ts_time;
> + __u32 ts_echo;
> +};
> +
> +/**
> + * struct dccp_request_sock - represent DCCP-specific connection request
> + * @dreq_inet_rsk: structure inherited from
> + * @dreq_iss: initial sequence number sent on the Response (RFC 4340, 7.1)
> + * @dreq_isr: initial sequence number received on the Request
> + * @dreq_service: service code present on the Request (there is just one)
> + * @dreq_tstamp: most recent timestamp received during connection setup
> + */
> struct dccp_request_sock {
> struct inet_request_sock dreq_inet_rsk;
> __u64 dreq_iss;
> __u64 dreq_isr;
> __be32 dreq_service;
> + struct dccp_ts_echo *dreq_tstamp;
> };
>
> static inline struct dccp_request_sock *dccp_rsk(const struct request_sock *req)
> @@ -477,8 +496,7 @@ struct dccp_ackvec;
> * @dccps_gar - greatest valid ack number received on a non-Sync; initialized to %dccps_iss
> * @dccps_service - first (passive sock) or unique (active sock) service code
> * @dccps_service_list - second .. last service code on passive socket
> - * @dccps_timestamp_time - time of latest TIMESTAMP option
> - * @dccps_timestamp_echo - latest timestamp received on a TIMESTAMP option
> + * @dccps_tstamp - most recently received timestamp to echo (RFC 4340, 13.1)
> * @dccps_l_ack_ratio - feature-local Ack Ratio
> * @dccps_r_ack_ratio - feature-remote Ack Ratio
> * @dccps_pcslen - sender partial checksum coverage (via sockopt)
> @@ -514,8 +532,7 @@ struct dccp_sock {
> __u64 dccps_gar;
> __be32 dccps_service;
> struct dccp_service_list *dccps_service_list;
> - ktime_t dccps_timestamp_time;
> - __u32 dccps_timestamp_echo;
> + struct dccp_ts_echo *dccps_tstamp;
> __u16 dccps_l_ack_ratio;
> __u16 dccps_r_ack_ratio;
> __u16 dccps_pcslen;
> --- a/net/dccp/options.c
> +++ b/net/dccp/options.c
> @@ -56,6 +56,7 @@ int dccp_parse_options(struct sock *sk,
> const unsigned char *opt_end = (unsigned char *)dh +
> (dh->dccph_doff * 4);
> struct dccp_options_received *opt_recv = &dp->dccps_options_received;
> + struct dccp_ts_echo **tse;
> unsigned char opt, len;
> unsigned char *value;
> u32 elapsed_time, opt_val;
> @@ -159,12 +160,25 @@ int dccp_parse_options(struct sock *sk,
> if (len != 4)
> goto out_invalid_option;
>
> + tse = dreq? &dreq->dreq_tstamp : dp->dccps_tstamp;
huh? both dreq_tstamp and dccps_tstamp are pointers to dccp_ts_echo, you
are mixing pointer types here by using '&' on one and not on the other
:)
> + /*
> + * Keep the earliest received timestamp on the socket,
> + * until echoing to the peer frees it. This policy is
> + * useful for doing RTT measurements (eg. RFC 1323, 3.4)
> + */
> + if (*tse != NULL)
> + break;
> +
> + *tse = kmalloc(sizeof(struct dccp_ts_echo), GFP_ATOMIC);
> + if (*tse = NULL) {
> + DCCP_CRIT("can not store received timestamp");
we avoid this situation by having the dccp_ts_echo fields in the
minisock and keeping dccp_sock as is now.
> + break;
> + }
> +
> + (*tse)->ts_time = ktime_get_real();
> opt_val = get_unaligned((u32 *)value);
> opt_recv->dccpor_timestamp = ntohl(opt_val);
> -
> - /* FIXME: if dreq != NULL, don't store this on listening socket */
> - dp->dccps_timestamp_echo = opt_recv->dccpor_timestamp;
> - dp->dccps_timestamp_time = ktime_get_real();
> + (*tse)->ts_echo = opt_recv->dccpor_timestamp;
>
> dccp_pr_debug("%s rx opt: TIMESTAMP=%u, ackno=%llu\n",
> dccp_role(sk), opt_recv->dccpor_timestamp,
> @@ -384,15 +398,14 @@ int dccp_insert_option_timestamp(struct
>
> EXPORT_SYMBOL_GPL(dccp_insert_option_timestamp);
>
> -static int dccp_insert_option_timestamp_echo(struct sock *sk,
> +static int dccp_insert_option_timestamp_echo(struct dccp_ts_echo **tse,
> struct sk_buff *skb)
> {
> - struct dccp_sock *dp = dccp_sk(sk);
> __be32 tstamp_echo;
> int len, elapsed_time_len;
> unsigned char *to;
> const suseconds_t delta = ktime_us_delta(ktime_get_real(),
> - dp->dccps_timestamp_time);
> + (*tse)->ts_time);
> u32 elapsed_time = delta / 10;
> elapsed_time_len = dccp_elapsed_time_len(elapsed_time);
> len = 6 + elapsed_time_len;
> @@ -406,7 +419,7 @@ static int dccp_insert_option_timestamp_
> *to++ = DCCPO_TIMESTAMP_ECHO;
> *to++ = len;
>
> - tstamp_echo = htonl(dp->dccps_timestamp_echo);
> + tstamp_echo = htonl((*tse)->ts_echo);
> memcpy(to, &tstamp_echo, 4);
> to += 4;
>
> @@ -418,8 +431,8 @@ static int dccp_insert_option_timestamp_
> memcpy(to, &var32, 4);
> }
>
> - dp->dccps_timestamp_echo = 0;
> - dp->dccps_timestamp_time = ktime_set(0, 0);
> + kfree(*tse);
> + *tse = NULL;
> return 0;
> }
>
> @@ -528,10 +541,6 @@ int dccp_insert_options(struct sock *sk,
> dccp_ackvec_pending(dp->dccps_hc_rx_ackvec) &&
> dccp_insert_option_ackvec(sk, skb))
> return -1;
> -
> - if (dp->dccps_timestamp_echo != 0 &&
> - dccp_insert_option_timestamp_echo(sk, skb))
> - return -1;
> }
>
> if (dp->dccps_hc_rx_insert_options) {
> @@ -555,6 +564,10 @@ int dccp_insert_options(struct sock *sk,
> dccp_insert_option_timestamp(sk, skb))
> return -1;
>
> + if (dp->dccps_tstamp != NULL &&
> + dccp_insert_option_timestamp_echo(&dp->dccps_tstamp, skb))
> + return -1;
> +
> /* XXX: insert other options when appropriate */
>
> if (DCCP_SKB_CB(skb)->dccpd_opt_len != 0) {
> --- a/net/dccp/dccp.h
> +++ b/net/dccp/dccp.h
> @@ -413,7 +413,7 @@ static inline void dccp_update_gss(struc
> static inline int dccp_ack_pending(const struct sock *sk)
> {
> const struct dccp_sock *dp = dccp_sk(sk);
> - return dp->dccps_timestamp_echo != 0 ||
> + return dp->dccps_tstamp != NULL ||
> #ifdef CONFIG_IP_DCCP_ACKVEC
> (dccp_msk(sk)->dccpms_send_ack_vector &&
> dccp_ackvec_pending(dp->dccps_hc_rx_ackvec)) ||
> --- a/net/dccp/minisocks.c
> +++ b/net/dccp/minisocks.c
> @@ -120,6 +120,7 @@ struct sock *dccp_create_openreq_child(s
> newdp->dccps_role = DCCP_ROLE_SERVER;
> newdp->dccps_hc_rx_ackvec = NULL;
> newdp->dccps_service_list = NULL;
> + newdp->dccps_tstamp = NULL;
> newdp->dccps_service = dreq->dreq_service;
> newicsk->icsk_rto = DCCP_TIMEOUT_INIT;
>
> @@ -303,9 +304,12 @@ EXPORT_SYMBOL_GPL(dccp_reqsk_send_ack);
>
> void dccp_reqsk_init(struct request_sock *req, struct sk_buff *skb)
> {
> + struct dccp_request_sock *dreq = dccp_rsk(req);
> +
> inet_rsk(req)->rmt_port = dccp_hdr(skb)->dccph_sport;
> inet_rsk(req)->acked = 0;
> req->rcv_wnd = sysctl_dccp_feat_sequence_window;
> + dreq->dreq_tstamp = NULL;
> }
>
> EXPORT_SYMBOL_GPL(dccp_reqsk_init);
> --- a/net/dccp/proto.c
> +++ b/net/dccp/proto.c
> @@ -254,6 +254,11 @@ int dccp_destroy_sock(struct sock *sk)
> kfree(dp->dccps_service_list);
> dp->dccps_service_list = NULL;
>
> + if (dp->dccps_tstamp != NULL) {
> + kfree(dp->dccps_tstamp);
> + dp->dccps_tstamp = NULL;
> + }
> +
> if (dmsk->dccpms_send_ack_vector) {
> dccp_ackvec_free(dp->dccps_hc_rx_ackvec);
> dp->dccps_hc_rx_ackvec = NULL;
> --- a/net/dccp/ipv4.c
> +++ b/net/dccp/ipv4.c
> @@ -551,6 +551,10 @@ out:
>
> static void dccp_v4_reqsk_destructor(struct request_sock *req)
> {
> + struct dccp_request_sock *dreq = dccp_rsk(req);
> +
> + if (dreq->dreq_tstamp != NULL)
> + kfree(dreq->dreq_tstamp);
> kfree(inet_rsk(req)->opt);
> }
>
> --- a/net/dccp/ipv6.c
> +++ b/net/dccp/ipv6.c
> @@ -295,6 +295,10 @@ done:
>
> static void dccp_v6_reqsk_destructor(struct request_sock *req)
> {
> + struct dccp_request_sock *dreq = dccp_rsk(req);
> +
> + if (dreq->dreq_tstamp != NULL)
> + kfree(dreq->dreq_tstamp);
> if (inet6_rsk(req)->pktopts != NULL)
> kfree_skb(inet6_rsk(req)->pktopts);
> }
> -
> To unsubscribe from this list: send the line "unsubscribe dccp" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [PATCH 7/8]: Handle timestamps on Request/Response exchange separately
2007-09-25 14:30 [PATCH 7/8]: Handle timestamps on Request/Response exchange separately Gerrit Renker
2007-09-25 18:50 ` [PATCH 7/8]: Handle timestamps on Request/Response exchange Arnaldo Carvalho de Melo
@ 2007-09-26 9:46 ` Gerrit Renker
2007-09-26 9:51 ` Gerrit Renker
` (3 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: Gerrit Renker @ 2007-09-26 9:46 UTC (permalink / raw)
To: dccp
| > This patch addresses the following situation:
| > * timestamps are recorded on the listening socket;
|
| I noticed this, I think it is unaceptable to do that, therefore the best
| thing is to combine the two patches, so as not to introduce a problem
| that is fixed in the following patch.
OK, so we have at least agreement that something needs to be done. Sorry I don't understand the
above remark: do you refer to the hunk which fixed a pointer assignment (tse = ...) and slipped
into the wrong patch?
I am happy to combine the two patches if that is what you are asking. I had kept the next patch
("Support inserting options during the 3-way handshake") separate since it is mainly concerned
with inserting options, which is called by feature-negotiation (see below for related issue).
| > This patch therefore
| > * makes memory use for timestamps dynamic (to use less when no timestamps are present);
| > * separates `handshake timestamps' from `established timestamps';
|
| I didn't understood this one. Care to further explain?
Yes - awkward sentence. What I meant (and that becomes only obvious further down in the patch set) is:
1) First point: if no timestamp is present then there is no memory (apart from a pointer) associated
with this. I was concerned that otherwise the request_socks get too big - which is exactly your
comment from below. This is not the end of the story - there is also a list_head on request_socks
to store the feature-negotiation list. Here I had the same concerns: initially I wanted to use a
table for all features, and that would have been a fat use of memory (9 features x 2 locations
(local/remote) x 64bit (maximum option size) gives 1152 bytes - which I found utterly inacceptable
to put on a request_sock. Instead, the feature negotiation uses a list which is allocated at most
once (even when the Request is retransmitted, it is not allocated again).
And for the timestamps I thought it would be good practice to do the same - if no timestamps are
present during the handshake, don't allocate memory for them.
2) Separating `handshake timestamps' from `established timestamps': I don't know where my mind was,
this just means that timestamps can appear during the handshake (which needs special considerations
on the server) and when the connection is fully established (then it uses the sk fields like before).
| Anyway, I think that adding yet another allocation in a connection lifetime is not good.
Please see below - I can see your point, but other issues are involved also.
| One of the most pressing things for me after merging all the patches
| that are pending (THANK YOU! :-) ) is to work on DCCP memory consuption
| and accounting, the code has to be made more robust :-\
The coming feature negotiation patch set removes lots of kmallocing and kmemduping which is frequent in
the current feature negotiation code. I don't know if that is what you mean, but I also observed problems
with the cache when I did the tests.
| I think that we should just add dreq_{time,echo} to dccp_request_sock
| and keep dccp_sock as is.
|
<snip>
< ...
| Humm, these minisocks are getting fat... another thing for my TODO list,
That is my concern, too. So what shall we do: add two fields which may not be used most of the
time (at the moment only to support an erratum to CCID3), or keep the memory allocation.
| request_sock::ts_recent seems to be used only by the TCP machinery, ripe
| for the picking....
Is there any use of ts_recent for CCID2 - in an earlier email you pointed out that CCID2
could do timestamping, and that would be a place to use.
|
| Anyway, I'll move along the patch queue looking for more easily
| cherrypickable patches.
Please don't at the moment consider the feature negotiation patches for cherry picking -
there are more to come; I split them into batches to make reviewing easier (and your input
is appreciated).
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [PATCH 7/8]: Handle timestamps on Request/Response exchange separately
2007-09-25 14:30 [PATCH 7/8]: Handle timestamps on Request/Response exchange separately Gerrit Renker
` (3 preceding siblings ...)
2007-09-27 13:36 ` Gerrit Renker
@ 2007-10-05 10:25 ` Gerrit Renker
2007-10-05 17:11 ` [PATCH 7/8]: Handle timestamps on Request/Response exchange Arnaldo Carvalho de Melo
5 siblings, 0 replies; 7+ messages in thread
From: Gerrit Renker @ 2007-10-05 10:25 UTC (permalink / raw)
To: dccp
Arnaldo,
please disregard the earlier suggestion from below regarding ts_recent and feel free
to do with the structure as you see fit.
To me it seems that the main problems using a RFC1323-like algorithm are
* the ts_recent field is not enough, the algorithm requires other information (e.g. whether
an Ack advances the send window) to deal robustly with delays, holes,
* it is hard to get right (e.g. omments above tcp_ack_saw_tstamp() in tcp_input.c)
* the current solution of timing both send time and Ack arrival is the simplest
and has the advantage of being responsive to receiver behaviour (as in CCID3).
An additional advantage is that the current code already provides Elapsed Time information
on each Ack Vector, so that dccp_sample_rtt() can be used.
Maybe CCID2 could benefit by upgrading from jiffies to ktime_t, as this enables to
better determine whether multiple losses belong to the same RTT (with 1ms resolution
and Gbps speed this does not work so well).
Please can you let me know whether:
* the outlined "struct dccp_request_sock" below is still the preferred format;
* whether as an alternative the dreq_tstamp_{echo,time} fields can be combined, i.e.
use a fixed member of type
struct dccp_ts_echo {
ktime_t ts_time;
__u32 ts_echo;
};
or similar - but without the mallocing, and with overriding each time a new timestamp arrives;
* or whether a different solution is planned.
I'd need to know so that I can rework the patches and resubmit them accordingly.
Quoting Gerrit Renker:
| Quoting Arnaldo Carvalho de Melo:
| | I suggest it to become:
| |
| | [acme@mica net-2.6.24]$ pahole -C dccp_request_sock net/dccp/minisocks.o
| |
| | struct dccp_request_sock {
| | struct inet_request_sock dreq_inet_rsk; /* 0 56 */
| | __u64 dreq_iss; /* 56 8 */
| | /* --- cacheline 1 boundary (64 bytes) --- */
| | __u64 dreq_isr; /* 64 8 */
| | __be32 dreq_service; /* 72 4 */
| | __u32 dreq_tstamp_echo; /* 76 4 */
| | ktime_t dreq_tstamp_time; /* 80 8 */
| |
| | /* size: 88, cachelines: 2 */
| | /* last cacheline: 24 bytes */
| | };
| |
| | Humm, these minisocks are getting fat... another thing for my TODO list,
| | request_sock::ts_recent seems to be used only by the TCP machinery, ripe
| | for the picking....
|
| I have thought about this: do you think the following solution is better -
| the difference between kmallocing and fixed is now between pointer to struct
| and u64 (ktime_t).
|
|
| struct dccp_request_sock {
| struct inet_request_sock dreq_inet_rsk;
| __u64 dreq_iss,
| dreq_isr;
| __be32 dreq_service;
| #define dreq_tstamp_echo dreq_inet_rsk.req.ts_recent
| ktime_t dreq_tstamp_time;
| };
|
|
| The only other thing that is required is then to change the insertion routine to
|
| dccp_insert_option_timestamp_echo(struct sock *sk, struct dccp_request_sock *dreq,
| struct sk_buff *skb);
| /* when @dreq is NULL, @sk is used */
|
|
|
| On another note I think that the CCID2 code could benefit from using such timestamps also, in particular
| for high-speed networks.
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [PATCH 7/8]: Handle timestamps on Request/Response exchange
2007-09-25 14:30 [PATCH 7/8]: Handle timestamps on Request/Response exchange separately Gerrit Renker
` (4 preceding siblings ...)
2007-10-05 10:25 ` Gerrit Renker
@ 2007-10-05 17:11 ` Arnaldo Carvalho de Melo
5 siblings, 0 replies; 7+ messages in thread
From: Arnaldo Carvalho de Melo @ 2007-10-05 17:11 UTC (permalink / raw)
To: dccp
Em Fri, Oct 05, 2007 at 11:25:24AM +0100, Gerrit Renker escreveu:
> Arnaldo,
>
> please disregard the earlier suggestion from below regarding ts_recent and feel free
> to do with the structure as you see fit.
>
> To me it seems that the main problems using a RFC1323-like algorithm are
> * the ts_recent field is not enough, the algorithm requires other information (e.g. whether
> an Ack advances the send window) to deal robustly with delays, holes,
> * it is hard to get right (e.g. omments above tcp_ack_saw_tstamp() in tcp_input.c)
> * the current solution of timing both send time and Ack arrival is the simplest
> and has the advantage of being responsive to receiver behaviour (as in CCID3).
> An additional advantage is that the current code already provides Elapsed Time information
> on each Ack Vector, so that dccp_sample_rtt() can be used.
> Maybe CCID2 could benefit by upgrading from jiffies to ktime_t, as this enables to
> better determine whether multiple losses belong to the same RTT (with 1ms resolution
> and Gbps speed this does not work so well).
CCID2 needs a lot of love and care, yes :-\
> Please can you let me know whether:
>
> * the outlined "struct dccp_request_sock" below is still the preferred format;
Please use the outlined one. I haven't checked, but if we use a struct
like in your second option (below) we can end up with struct holes on
64-bit arches.
> * whether as an alternative the dreq_tstamp_{echo,time} fields can be combined, i.e.
> use a fixed member of type
> struct dccp_ts_echo {
> ktime_t ts_time;
> __u32 ts_echo;
> };
> or similar - but without the mallocing, and with overriding each time a new timestamp arrives;
> * or whether a different solution is planned.
>
> I'd need to know so that I can rework the patches and resubmit them accordingly.
>
>
> Quoting Gerrit Renker:
> | Quoting Arnaldo Carvalho de Melo:
> | | I suggest it to become:
> | |
> | | [acme@mica net-2.6.24]$ pahole -C dccp_request_sock net/dccp/minisocks.o
> | |
> | | struct dccp_request_sock {
> | | struct inet_request_sock dreq_inet_rsk; /* 0 56 */
> | | __u64 dreq_iss; /* 56 8 */
> | | /* --- cacheline 1 boundary (64 bytes) --- */
> | | __u64 dreq_isr; /* 64 8 */
> | | __be32 dreq_service; /* 72 4 */
> | | __u32 dreq_tstamp_echo; /* 76 4 */
> | | ktime_t dreq_tstamp_time; /* 80 8 */
> | |
> | | /* size: 88, cachelines: 2 */
> | | /* last cacheline: 24 bytes */
> | | };
> | |
> | | Humm, these minisocks are getting fat... another thing for my TODO list,
> | | request_sock::ts_recent seems to be used only by the TCP machinery, ripe
> | | for the picking....
> |
> | I have thought about this: do you think the following solution is better -
> | the difference between kmallocing and fixed is now between pointer to struct
> | and u64 (ktime_t).
> |
> |
> | struct dccp_request_sock {
> | struct inet_request_sock dreq_inet_rsk;
> | __u64 dreq_iss,
> | dreq_isr;
> | __be32 dreq_service;
> | #define dreq_tstamp_echo dreq_inet_rsk.req.ts_recent
> | ktime_t dreq_tstamp_time;
> | };
> |
> |
> | The only other thing that is required is then to change the insertion routine to
> |
> | dccp_insert_option_timestamp_echo(struct sock *sk, struct dccp_request_sock *dreq,
> | struct sk_buff *skb);
> | /* when @dreq is NULL, @sk is used */
> |
> |
> |
> | On another note I think that the CCID2 code could benefit from using such timestamps also, in particular
> | for high-speed networks.
^ permalink raw reply [flat|nested] 7+ messages in thread