All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: Test tree patch inventory - update
@ 2007-11-20 20:10 Arnaldo Carvalho de Melo
  2007-11-21  0:21 ` Arnaldo Carvalho de Melo
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: Arnaldo Carvalho de Melo @ 2007-11-20 20:10 UTC (permalink / raw)
  To: dccp

Em Sat, Nov 17, 2007 at 03:42:52PM +0000, Gerrit Renker escreveu:
>                         Test Tree Inventory
> 		        =========> 
> This is the list of current small batches (from oldest to latest) as of Sat 17 Nov.
> The entire test tree is fully bisectable, so batches can be combined freely.
> 
> Spaces in between sets indicate separate blocks.
> 
> The CCID3 set has been completely re-arranged and recombined to give self-contained and
> independent patches. Within the set there are further, self-contained blocks.
> 
> 
> 1. Miscellaneous CCID3 fixes
> ----------------------------
> A small set of various fixes, independent of the main CCID3 patch set.
> 
> 	[CCID3]: Revert use of MSS instead of s

Applied.

> 	[CCID3]: Ignore trivial amounts of elapsed time

Applied, but I think we can revisit elapsed time later by taking a
timestamp earlier, at dccp_v4_rcv, because then there can be time spent
in sk_backlog, etc. We could then store a timestamp on skb->cb and the
ccids would set a flag on dccp_sock to tell that the timestamp is
wanted.

Ah, and there is more space to save, after this patch the layout is
this, on x86_64:

struct ccid3_hc_rx_sock {
	struct tfrc_rx_info     ccid3hcrx_tfrc;                 /*   0  12 */

	/* XXX 4 bytes hole, try to pack */

	u64                     ccid3hcrx_seqno_nonloss:48;     /*  16   8 */
	u64                     ccid3hcrx_ccval_nonloss:4;      /*  16   8 */
	u64                     ccid3hcrx_ccval_last_counter:4; /*  16   8 */
	enum ccid3_hc_rx_states ccid3hcrx_state:8;              /*  20   4 */
	u32                     ccid3hcrx_bytes_recv;           /*  24   4 */

	/* XXX 4 bytes hole, try to pack */

	ktime_t                 ccid3hcrx_tstamp_last_feedback; /* 32   8 */
	ktime_t                 ccid3hcrx_tstamp_last_ack;      /* 40   8 */
	struct list_head        ccid3hcrx_hist;                 /* 48  16 */
	/* --- cacheline 1 boundary (64 bytes) --- */
	struct list_head        ccid3hcrx_li_hist;              /* 64  16 */
	u16                     ccid3hcrx_s;                    /* 80   2 */

	/* XXX 2 bytes hole, try to pack */

	u32                     ccid3hcrx_pinv;                 /* 84   4 */

	/* size: 88, cachelines: 2 */
	/* sum members: 78, holes: 3, sum holes: 10 */
	/* last cacheline: 24 bytes */
};

Just moving ccid3hcrx_bytes_recv to after ccid3hcrx_tfrc we can save 8
bytes.

> 	[CCID3]: Accurately determine idle & application-limited periods

Applied.

> 	[CCID3]: Inline for moving average

Applied
 
> 2. Main CCID3 patch set
> -----------------------
> This is the original CCID3 patch set, developed in Feb/Mar, significantly reordered to
> make the test tree fully bisectable. 
> The patch set does three things (reflected in the order of batches):
> 
> 	(1) new TX history for TFRC (packet_history.{c,h}),

Why do we need DECLARE_TFRC_TX_CACHE?

> 	(2) new RX history for TFRC (packet_history.{c,h}),
> 	(3) new Loss Intervals database (loss_interval.{c,h}).
> 
> (Actually it is 3+1/2 things, the third patch introduces tfrc_module.c).
> 
> 	[TFRC]: Migrate TX history to singly-linked list
> 	[TFRC]: Remove old (doubly-linked list) TX history 
> 
> 	[TFRC]: Provide central source file and debug facility
> 
> 	[TFRC]: New RX history implementation


+static struct kmem_cache *tfrcxh;

Why "xh"?

> 	[CCID3]: Hook up with new RX history interface
> 	[TFRC]: Remove old RX history interface

Will try to continue later today.

- Arnaldo

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Test tree patch inventory - update
  2007-11-20 20:10 Test tree patch inventory - update Arnaldo Carvalho de Melo
@ 2007-11-21  0:21 ` Arnaldo Carvalho de Melo
  2007-11-21 11:52 ` Gerrit Renker
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Arnaldo Carvalho de Melo @ 2007-11-21  0:21 UTC (permalink / raw)
  To: dccp

Em Sat, Nov 17, 2007 at 03:42:52PM +0000, Gerrit Renker escreveu:
>                         Test Tree Inventory
> 		        ========= 
> 4. Support for passive-close without flushing unread data
> ---------------------------------------------------------
> This set actually involves only 5 patches, the last one is according to your suggestion.
> 
> 	[DCCP]: Separate protocol states into general/specific

I dropped this one, see next explanation.

> 	[DCCP]: Make PARTOPEN an autonomous state

Here I made DCCP_PARTOPEN = TCP_MAX_STATES, that DCCP_INTRINSECS was not
used anywhere and this just means that states >= TCP_MAX_STATES are
DCCP only states.
 
> 	[DCCP]: Dedicated auxiliary states to support passive-close

I'll combine this with the next one.

> 	[DCCP]: Basic support for passive-close

What happens if we receive a second CloseReq or a second Close packet or
if the server performed an active close?

In all these cases we're leaking an skb, no? As dccp_rcv_close{req}
before always used dccp_fin there wasn't a need for returning a value
from these functions, but now we better return if we used the skb or
not, so that when back to __dccp_rcv_established and
dccp_rcv_state_process we can discard the packet.

- Arnaldo

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Test tree patch inventory - update
  2007-11-20 20:10 Test tree patch inventory - update Arnaldo Carvalho de Melo
  2007-11-21  0:21 ` Arnaldo Carvalho de Melo
@ 2007-11-21 11:52 ` Gerrit Renker
  2007-11-21 12:46 ` Arnaldo Carvalho de Melo
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Gerrit Renker @ 2007-11-21 11:52 UTC (permalink / raw)
  To: dccp

| >                         Test Tree Inventory
| > 		            =========| >
| > 	[CCID3]: Ignore trivial amounts of elapsed time
| 
| Applied, but I think we can revisit elapsed time later by taking a
| timestamp earlier, at dccp_v4_rcv, because then there can be time spent
| in sk_backlog, etc. We could then store a timestamp on skb->cb and the
| ccids would set a flag on dccp_sock to tell that the timestamp is
| wanted.
| 
The elapsed time is needed for RTT measurements and you are right, the
hole RTT measurement issue needs more work. And this includes CCID2
which goes into the other extreme and uses a coarse-grained HZ
resolution of RTTs. But ever since there was the issue with interface
timestamps (skb_get_timestamp) I got convinced that it is not useful
to increase the resolution below, say, 1 millisecond. (There was a
discussion with Ian about above patch where this came up.) The reason
is that when we do such fine-grained resolution, we also get a lot of
noise: much of the TCP code works very well with just the jiffies
resolution.
With regard to your point, I think it would be ideal to use timestamps
in the TCP High-Speed way (RFC 1323), where we could use a resolution of
0.00001 seconds (dccp_timestamp() and RFC 4340, 13.). That is still
very high-resolution when considering that the RFC2988 remarks regarding
clock granularity in part referred back to that old BSD timer which had
a 0.5 second granularity.
I agree fully with your suggestion: all CCID-specific RTT measurement
code can be eliminated once there is a working alternative. The
challenge that needs to be overcome is to make the timestamp usage robust
against packet duplication and re-ordering.

| Just moving ccid3hcrx_bytes_recv to after ccid3hcrx_tfrc we can save 8
| bytes.
This is good news - with regard to the above I still have in the back of
my mind that there are unused icsk fields which may also be used for
storing timestamps; this could help (in a future effort) to reduce further.

  
| > 2. Main CCID3 patch set
| > -----------------------
| > This is the original CCID3 patch set, developed in Feb/Mar, significantly reordered to
| > make the test tree fully bisectable. 
| > The patch set does three things (reflected in the order of batches):
| > 
| > 	(1) new TX history for TFRC (packet_history.{c,h}),
| 
| Why do we need DECLARE_TFRC_TX_CACHE?
| 
It is not needed for compilation. When I first did this patch set there were three different
kmem_caches defined in ccid3.c:
	* one for the TX history (packet_history.c; for which the above macro was declared),
	* one for the RX history (packet_history.c)
	* one for the Loss History (loss_interval.c)
The loss interval history has since gone, this was done when you worked on Ian's patch set
around March/April. When the patch set is fully applied, the RX history cache also goes,
since the new patch set uses at most 4 entries per socket which are supplied by a static
kmem_cache in packet_history.c.
Long story, short conclusion: if you think this is an eyesore, feel free to take it out. I
left it in since I thought it would make the purpose clearer (since ctags then automatically
jumps to packet_history.h).


| > 	[TFRC]: New RX history implementation
| 
| 
| +static struct kmem_cache *tfrcxh;
| 
| Why "xh"?
| 
This is a typo, my mistake. I had meant to use the suffix `rxh' to indicate that
this static cache is for the RX history, but it somehow merged with `r'
from tfrc. Are you okay with tfrc_rxh? I will uploaded a revised patch
which uses this identifier instead. Thank you for pointing this out.

Gerrit

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Test tree patch inventory - update
  2007-11-20 20:10 Test tree patch inventory - update Arnaldo Carvalho de Melo
  2007-11-21  0:21 ` Arnaldo Carvalho de Melo
  2007-11-21 11:52 ` Gerrit Renker
@ 2007-11-21 12:46 ` Arnaldo Carvalho de Melo
  2007-11-21 13:18 ` Gerrit Renker
  2007-11-22 10:29 ` Gerrit Renker
  4 siblings, 0 replies; 6+ messages in thread
From: Arnaldo Carvalho de Melo @ 2007-11-21 12:46 UTC (permalink / raw)
  To: dccp

Em Wed, Nov 21, 2007 at 11:52:55AM +0000, Gerrit Renker escreveu:
> | >                         Test Tree Inventory
> | > 		            =========> | >
> | > 	[CCID3]: Ignore trivial amounts of elapsed time
> | 
> | Applied, but I think we can revisit elapsed time later by taking a
> | timestamp earlier, at dccp_v4_rcv, because then there can be time spent
> | in sk_backlog, etc. We could then store a timestamp on skb->cb and the
> | ccids would set a flag on dccp_sock to tell that the timestamp is
> | wanted.
> | 
> The elapsed time is needed for RTT measurements and you are right, the
> hole RTT measurement issue needs more work. And this includes CCID2
> which goes into the other extreme and uses a coarse-grained HZ
> resolution of RTTs. But ever since there was the issue with interface
> timestamps (skb_get_timestamp) I got convinced that it is not useful
> to increase the resolution below, say, 1 millisecond. (There was a
> discussion with Ian about above patch where this came up.) The reason
> is that when we do such fine-grained resolution, we also get a lot of
> noise: much of the TCP code works very well with just the jiffies
> resolution.
> With regard to your point, I think it would be ideal to use timestamps
> in the TCP High-Speed way (RFC 1323), where we could use a resolution of
> 0.00001 seconds (dccp_timestamp() and RFC 4340, 13.). That is still
> very high-resolution when considering that the RFC2988 remarks regarding
> clock granularity in part referred back to that old BSD timer which had
> a 0.5 second granularity.
> I agree fully with your suggestion: all CCID-specific RTT measurement
> code can be eliminated once there is a working alternative. The
> challenge that needs to be overcome is to make the timestamp usage robust
> against packet duplication and re-ordering.

I'll try at some point to measure the maximum time a packet stays on
sk_backlog on a heavily loaded server. But agreed, this is not urgent,
just to keep on the back of our minds for later.
 
> | Just moving ccid3hcrx_bytes_recv to after ccid3hcrx_tfrc we can save 8
> | bytes.
> This is good news - with regard to the above I still have in the back of
> my mind that there are unused icsk fields which may also be used for
> storing timestamps; this could help (in a future effort) to reduce further.
> 
>   
> | > 2. Main CCID3 patch set
> | > -----------------------
> | > This is the original CCID3 patch set, developed in Feb/Mar, significantly reordered to
> | > make the test tree fully bisectable. 
> | > The patch set does three things (reflected in the order of batches):
> | > 
> | > 	(1) new TX history for TFRC (packet_history.{c,h}),
> | 
> | Why do we need DECLARE_TFRC_TX_CACHE?
> | 
> It is not needed for compilation. When I first did this patch set there were three different
> kmem_caches defined in ccid3.c:
> 	* one for the TX history (packet_history.c; for which the above macro was declared),
> 	* one for the RX history (packet_history.c)
> 	* one for the Loss History (loss_interval.c)
> The loss interval history has since gone, this was done when you worked on Ian's patch set
> around March/April. When the patch set is fully applied, the RX history cache also goes,
> since the new patch set uses at most 4 entries per socket which are supplied by a static
> kmem_cache in packet_history.c.
> Long story, short conclusion: if you think this is an eyesore, feel free to take it out. I
> left it in since I thought it would make the purpose clearer (since ctags then automatically
> jumps to packet_history.h).
> 
> 
> | > 	[TFRC]: New RX history implementation
> | 
> | 
> | +static struct kmem_cache *tfrcxh;
> | 
> | Why "xh"?
> | 
> This is a typo, my mistake. I had meant to use the suffix `rxh' to indicate that
> this static cache is for the RX history, but it somehow merged with `r'
> from tfrc. Are you okay with tfrc_rxh? I will uploaded a revised patch

tfrc_rx_hist would be better, its not overly long and for the casual
reader provides more info.

> which uses this identifier instead. Thank you for pointing this out.
> 
> Gerrit

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Test tree patch inventory - update
  2007-11-20 20:10 Test tree patch inventory - update Arnaldo Carvalho de Melo
                   ` (2 preceding siblings ...)
  2007-11-21 12:46 ` Arnaldo Carvalho de Melo
@ 2007-11-21 13:18 ` Gerrit Renker
  2007-11-22 10:29 ` Gerrit Renker
  4 siblings, 0 replies; 6+ messages in thread
From: Gerrit Renker @ 2007-11-21 13:18 UTC (permalink / raw)
  To: dccp

| > | > 	[TFRC]: New RX history implementation
| > | 
| > | 
| > | +static struct kmem_cache *tfrcxh;
| > | 
| > | Why "xh"?
| > | 
| > This is a typo, my mistake. I had meant to use the suffix `rxh' to indicate that
| > this static cache is for the RX history, but it somehow merged with `r'
| > from tfrc. Are you okay with tfrc_rxh? I will uploaded a revised patch
| 
| tfrc_rx_hist would be better, its not overly long and for the casual
| reader provides more info.
| 
I just tried this, the problem is that it clashes with 

	struct tfrc_rx_hist {
        	struct tfrc_rx_hist_entry       *ring[NDUPACK + 1];
		/* ... */
	};

Thus changed it to

   	static struct kmem_cache *tfrc_rxh_cache;

   and uploaded a revised patch. If you are not ok with the naming
   scheme, let me know and I'll revise it.

I am working on the leaked skbs issue, either there will be a revised
patch later or I'll fix this after work.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Test tree patch inventory - update
  2007-11-20 20:10 Test tree patch inventory - update Arnaldo Carvalho de Melo
                   ` (3 preceding siblings ...)
  2007-11-21 13:18 ` Gerrit Renker
@ 2007-11-22 10:29 ` Gerrit Renker
  4 siblings, 0 replies; 6+ messages in thread
From: Gerrit Renker @ 2007-11-22 10:29 UTC (permalink / raw)
  To: dccp

Arnaldo,

| > 	[DCCP]: Basic support for passive-close
| 
| What happens if we receive a second CloseReq or a second Close packet or
| if the server performed an active close?
| 
Just to say I am still working on this. First, thanks a lot for pointing
out the bug. This forced me to look again at this patch and ... ecce,
there is another bug: PASSIVE_1/2 may only be entered from OPEN/PARTOPEN, 
the code allowed to enter these states from other states, too. This is
clearly wrong. 
Will send the revised patch once I am clearer what is going on, but this
may not be today.

Gerrit

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2007-11-22 10:29 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-11-20 20:10 Test tree patch inventory - update Arnaldo Carvalho de Melo
2007-11-21  0:21 ` Arnaldo Carvalho de Melo
2007-11-21 11:52 ` Gerrit Renker
2007-11-21 12:46 ` Arnaldo Carvalho de Melo
2007-11-21 13:18 ` Gerrit Renker
2007-11-22 10:29 ` Gerrit Renker

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.