public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 2/2] TTY: tty flip buffer optimisation.
@ 2011-11-14 12:47 Ilya Zykov
  2011-11-14 14:25 ` Jiri Slaby
  0 siblings, 1 reply; 5+ messages in thread
From: Ilya Zykov @ 2011-11-14 12:47 UTC (permalink / raw)
  To: Alan Cox; +Cc: linux-kernel, Greg Kroah-Hartman, Ilya Zykov

Currently, free flip buffer (tty->buf.free) reserve memory for further used,
only if driver send to ldisc less 257 bytes in one time.
If driver send more, flip buffer reserve(kmalloc()) and then
free(kfree()) every chunk more 256 bytes every time.


This patch allow reserve more than 256 bytes chunks in free buffer.
And have self-regulation chunk size ability(very useful for pty).
Also we can control memory used(at the most TTY_BUFFER_MAX).
And avoiding useless looking up buffer more then 256 byte size.

With this patch I get follow results:

ilya@serh:~/src/pty$ time ./bench_pty_buf 253
chunk = 253. loop = 3952569.
real	0m21.017s
user	0m0.436s
sys	0m17.749s

ilya@serh:~/src/pty$ time ./bench_pty_buf 255
chunk = 255. loop = 3921568.
real	0m21.276s
user	0m0.544s
sys	0m17.329s

ilya@serh:~/src/pty$ time ./bench_pty_buf 257
chunk = 257. loop = 3891050.
real	0m21.652s
user	0m0.512s
sys	0m18.593s

Signed-off-by: Ilya Zykov <ilya@ilyx.ru>
---
diff -uprN -X ../../dontdiff a/drivers/tty/tty_buffer.c b/drivers/tty/tty_buffer.c
--- a/drivers/tty/tty_buffer.c	2011-11-12 00:19:27.000000000 +0400
+++ b/drivers/tty/tty_buffer.c	2011-11-14 15:55:45.000000000 +0400
@@ -58,8 +58,21 @@ static struct tty_buffer *tty_buffer_all
 {
 	struct tty_buffer *p;
 -	if (tty->buf.memory_used + size > 65536)
-		return NULL;
+	if (tty->buf.memory_reserve + size > TTY_BUFFER_MAX) {
+		/* If possible, free small chunks
+				have been had total size = "size". */
+		if (tty->buf.memory_reserve - tty->buf.memory_used >= size)
+			while (tty->buf.memory_reserve + size >
+							TTY_BUFFER_MAX) {
+				p = tty->buf.free;
+				tty->buf.free = p->next;
+				tty->buf.memory_reserve -= p->size;
+				WARN_ON(tty->buf.memory_reserve < 0);
+				kfree(p);
+			}
+		else
+			return NULL;
+	}
 	p = kmalloc(sizeof(struct tty_buffer) + 2 * size, GFP_ATOMIC);
 	if (p == NULL)
 		return NULL;
@@ -71,6 +84,7 @@ static struct tty_buffer *tty_buffer_all
 	p->char_buf_ptr = (char *)(p->data);
 	p->flag_buf_ptr = (unsigned char *)p->char_buf_ptr + size;
 	tty->buf.memory_used += size;
+	tty->buf.memory_reserve += size;
 	return p;
 }
 @@ -91,11 +105,12 @@ static void tty_buffer_free(struct tty_s
 	tty->buf.memory_used -= b->size;
 	WARN_ON(tty->buf.memory_used < 0);
 -	if (b->size >= 512)
-		kfree(b);
-	else {
+	if (tty->buf.free != NULL) {
 		b->next = tty->buf.free;
 		tty->buf.free = b;
+	} else {
+		tty->buf.free = b;
+		b->next = NULL;
 	}
 }
 @@ -517,6 +532,7 @@ void tty_buffer_init(struct tty_struct *
 	tty->buf.tail = NULL;
 	tty->buf.free = NULL;
 	tty->buf.memory_used = 0;
+	tty->buf.memory_reserve = 0;
 	INIT_WORK(&tty->buf.work, flush_to_ldisc);
 }
 diff -uprN -X ../../dontdiff a/include/linux/tty.h b/include/linux/tty.h
--- a/include/linux/tty.h	2011-11-12 00:19:27.000000000 +0400
+++ b/include/linux/tty.h	2011-11-14 14:51:21.000000000 +0400
@@ -82,7 +82,7 @@ struct tty_buffer {
  */
  #define TTY_BUFFER_PAGE	(((PAGE_SIZE - sizeof(struct tty_buffer)) / 2) & ~0xFF)
-
+#define TTY_BUFFER_MAX		65536
  struct tty_bufhead {
 	struct work_struct work;
@@ -90,6 +90,7 @@ struct tty_bufhead {
 	struct tty_buffer *head;	/* Queue head */
 	struct tty_buffer *tail;	/* Active buffer */
 	struct tty_buffer *free;	/* Free queue head */
+	int memory_reserve;		/* Buffer space used */
 	int memory_used;		/* Buffer space used excluding
 								free queue */
 };

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v3 2/2] TTY: tty flip buffer optimisation.
  2011-11-14 12:47 [PATCH v3 2/2] TTY: tty flip buffer optimisation Ilya Zykov
@ 2011-11-14 14:25 ` Jiri Slaby
  2011-11-14 15:05   ` Alan Cox
  0 siblings, 1 reply; 5+ messages in thread
From: Jiri Slaby @ 2011-11-14 14:25 UTC (permalink / raw)
  To: Ilya Zykov; +Cc: Alan Cox, linux-kernel, Greg Kroah-Hartman, Jiri Slaby

On 11/14/2011 01:47 PM, Ilya Zykov wrote:
> Currently, free flip buffer (tty->buf.free) reserve memory for further used,
> only if driver send to ldisc less 257 bytes in one time.
> If driver send more, flip buffer reserve(kmalloc()) and then
> free(kfree()) every chunk more 256 bytes every time.
> 
> 
> This patch allow reserve more than 256 bytes chunks in free buffer.
> And have self-regulation chunk size ability(very useful for pty).
> Also we can control memory used(at the most TTY_BUFFER_MAX).
> And avoiding useless looking up buffer more then 256 byte size.
> 
> With this patch I get follow results:
> 
> ilya@serh:~/src/pty$ time ./bench_pty_buf 253
> chunk = 253. loop = 3952569.
> real	0m21.017s
> user	0m0.436s
> sys	0m17.749s
> 
> ilya@serh:~/src/pty$ time ./bench_pty_buf 255
> chunk = 255. loop = 3921568.
> real	0m21.276s
> user	0m0.544s
> sys	0m17.329s
> 
> ilya@serh:~/src/pty$ time ./bench_pty_buf 257
> chunk = 257. loop = 3891050.
> real	0m21.652s
> user	0m0.512s
> sys	0m18.593s

Hi, the results are indeed nice. However is there any *real* load other
than this tailor-made microbenchmark where the added code complexity is
worth it?

BTW am I guessing correctly that PATCH 1/2 should contain *no* patch in
fact? That it's just to present the benchmark sources?

> Signed-off-by: Ilya Zykov <ilya@ilyx.ru>
> ---
> diff -uprN -X ../../dontdiff a/drivers/tty/tty_buffer.c b/drivers/tty/tty_buffer.c
> --- a/drivers/tty/tty_buffer.c	2011-11-12 00:19:27.000000000 +0400
> +++ b/drivers/tty/tty_buffer.c	2011-11-14 15:55:45.000000000 +0400
> @@ -58,8 +58,21 @@ static struct tty_buffer *tty_buffer_all
>  {
>  	struct tty_buffer *p;
>  -	if (tty->buf.memory_used + size > 65536)
> -		return NULL;
> +	if (tty->buf.memory_reserve + size > TTY_BUFFER_MAX) {
> +		/* If possible, free small chunks
> +				have been had total size = "size". */
> +		if (tty->buf.memory_reserve - tty->buf.memory_used >= size)
> +			while (tty->buf.memory_reserve + size >
> +							TTY_BUFFER_MAX) {
> +				p = tty->buf.free;
> +				tty->buf.free = p->next;
> +				tty->buf.memory_reserve -= p->size;
> +				WARN_ON(tty->buf.memory_reserve < 0);
> +				kfree(p);
> +			}
> +		else
> +			return NULL;
> +	}
>  	p = kmalloc(sizeof(struct tty_buffer) + 2 * size, GFP_ATOMIC);
>  	if (p == NULL)
>  		return NULL;
> @@ -71,6 +84,7 @@ static struct tty_buffer *tty_buffer_all
>  	p->char_buf_ptr = (char *)(p->data);
>  	p->flag_buf_ptr = (unsigned char *)p->char_buf_ptr + size;
>  	tty->buf.memory_used += size;
> +	tty->buf.memory_reserve += size;
>  	return p;
>  }
>  @@ -91,11 +105,12 @@ static void tty_buffer_free(struct tty_s
>  	tty->buf.memory_used -= b->size;
>  	WARN_ON(tty->buf.memory_used < 0);
>  -	if (b->size >= 512)
> -		kfree(b);
> -	else {
> +	if (tty->buf.free != NULL) {
>  		b->next = tty->buf.free;
>  		tty->buf.free = b;
> +	} else {
> +		tty->buf.free = b;
> +		b->next = NULL;
>  	}
>  }
>  @@ -517,6 +532,7 @@ void tty_buffer_init(struct tty_struct *
>  	tty->buf.tail = NULL;
>  	tty->buf.free = NULL;
>  	tty->buf.memory_used = 0;
> +	tty->buf.memory_reserve = 0;
>  	INIT_WORK(&tty->buf.work, flush_to_ldisc);
>  }
>  diff -uprN -X ../../dontdiff a/include/linux/tty.h b/include/linux/tty.h
> --- a/include/linux/tty.h	2011-11-12 00:19:27.000000000 +0400
> +++ b/include/linux/tty.h	2011-11-14 14:51:21.000000000 +0400
> @@ -82,7 +82,7 @@ struct tty_buffer {
>   */
>   #define TTY_BUFFER_PAGE	(((PAGE_SIZE - sizeof(struct tty_buffer)) / 2) & ~0xFF)
> -
> +#define TTY_BUFFER_MAX		65536
>   struct tty_bufhead {
>  	struct work_struct work;
> @@ -90,6 +90,7 @@ struct tty_bufhead {
>  	struct tty_buffer *head;	/* Queue head */
>  	struct tty_buffer *tail;	/* Active buffer */
>  	struct tty_buffer *free;	/* Free queue head */
> +	int memory_reserve;		/* Buffer space used */
>  	int memory_used;		/* Buffer space used excluding
>  								free queue */
>  };

thanks,
-- 
js
suse labs

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v3 2/2] TTY: tty flip buffer optimisation.
  2011-11-14 14:25 ` Jiri Slaby
@ 2011-11-14 15:05   ` Alan Cox
  2011-11-14 15:47     ` Ilya Zykov
  0 siblings, 1 reply; 5+ messages in thread
From: Alan Cox @ 2011-11-14 15:05 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: Ilya Zykov, Alan Cox, linux-kernel, Greg Kroah-Hartman,
	Jiri Slaby

> Hi, the results are indeed nice. However is there any *real* load other
> than this tailor-made microbenchmark where the added code complexity is
> worth it?

I'm wondering if we need the complexity in the first place. Certainly 256
does seem a bit small for pty/tty traffic. A 'real world' benchmark would
be an ls -lR / on a machine with a fast graphics card or in console mode

ie

ls -lR /		# prime cache
time ls -lR /

and there are cases where people do a lot of traffic over a pty like this
so I don't think it's entirely fake.

I don't like the complexity but we could certainly go from using 256 byte
buffers to "tty->buf.bufsize" and make it configurable without
that complexity.

Alan

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v3 2/2] TTY: tty flip buffer optimisation.
  2011-11-14 15:05   ` Alan Cox
@ 2011-11-14 15:47     ` Ilya Zykov
  2011-11-14 16:08       ` Alan Cox
  0 siblings, 1 reply; 5+ messages in thread
From: Ilya Zykov @ 2011-11-14 15:47 UTC (permalink / raw)
  To: Alan Cox; +Cc: Jiri Slaby, Alan Cox, linux-kernel, Greg Kroah-Hartman,
	Jiri Slaby

Alan Cox wrote:

>> Hi, the results are indeed nice. However is there any *real* load other
>> than this tailor-made microbenchmark where the added code complexity is
>> worth it?
> 
> I'm wondering if we need the complexity in the first place. Certainly 256
> does seem a bit small for pty/tty traffic. A 'real world' benchmark would
> be an ls -lR / on a machine with a fast graphics card or in console mode
> 
> ie
> 
> ls -lR /		# prime cache
> time ls -lR /
> 
> and there are cases where people do a lot of traffic over a pty like this
> so I don't think it's entirely fake.
> 
> I don't like the complexity but we could certainly go from using 256 byte
> buffers to "tty->buf.bufsize" and make it configurable without
> that complexity.
> 
> Alan
> 


For avoid complexity we need remove free buffer at all.
And use kmalloc()-kfree() for every chunk. Don't need tty_buffer_find().
It will be fast and easy.
Ilya.


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v3 2/2] TTY: tty flip buffer optimisation.
  2011-11-14 15:47     ` Ilya Zykov
@ 2011-11-14 16:08       ` Alan Cox
  0 siblings, 0 replies; 5+ messages in thread
From: Alan Cox @ 2011-11-14 16:08 UTC (permalink / raw)
  To: Ilya Zykov
  Cc: Jiri Slaby, Alan Cox, linux-kernel, Greg Kroah-Hartman,
	Jiri Slaby

> For avoid complexity we need remove free buffer at all.
> And use kmalloc()-kfree() for every chunk. Don't need tty_buffer_find().
> It will be fast and easy.

The buffer cycling was added because it speeds up some cases (as you
found), and because its needed for high speed serial on some very low end
devices. From your benchmarks it's also speeding up pty/tty so it's
clearly relevant.

Alan

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2011-11-14 16:07 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-11-14 12:47 [PATCH v3 2/2] TTY: tty flip buffer optimisation Ilya Zykov
2011-11-14 14:25 ` Jiri Slaby
2011-11-14 15:05   ` Alan Cox
2011-11-14 15:47     ` Ilya Zykov
2011-11-14 16:08       ` Alan Cox

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox