linux-input.vger.kernel.org archive mirror
* [PATCH] HID: debug: improve hid_debug_event()
@ 2015-11-24 12:33 Rasmus Villemoes
  2015-11-25 20:31 ` Joe Perches
  2015-11-26 23:05 ` Jiri Kosina
  0 siblings, 2 replies; 4+ messages in thread
From: Rasmus Villemoes @ 2015-11-24 12:33 UTC
  To: Jiri Kosina; +Cc: Rasmus Villemoes, linux-input, linux-kernel

The code in hid_debug_event() causes horrible code generation. First,
we do a strlen() call for every byte we copy (we're doing a store to
global memory, so gcc has no way of proving that strlen(buf) doesn't
change). Second, since i, list->tail and HID_DEBUG_BUFSIZE all have
signed type, the modulo computation has to take into account the
possibility that list->tail+i is negative, so it's not just a simple
bitwise AND.

Fix the former by simply not doing strlen() at all (we have to load
buf[i] anyway, so testing it is almost free) and the latter by
changing i to unsigned. This cuts 29% (69 bytes) of the size of the
function.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
---
 drivers/hid/hid-debug.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/hid/hid-debug.c b/drivers/hid/hid-debug.c
index 2886b645ced7..acfb522a432a 100644
--- a/drivers/hid/hid-debug.c
+++ b/drivers/hid/hid-debug.c
@@ -659,13 +659,13 @@ EXPORT_SYMBOL_GPL(hid_dump_device);
 /* enqueue string to 'events' ring buffer */
 void hid_debug_event(struct hid_device *hdev, char *buf)
 {
-	int i;
+	unsigned i;
 	struct hid_debug_list *list;
 	unsigned long flags;
 
 	spin_lock_irqsave(&hdev->debug_list_lock, flags);
 	list_for_each_entry(list, &hdev->debug_list, node) {
-		for (i = 0; i < strlen(buf); i++)
+		for (i = 0; buf[i]; i++)
 			list->hid_debug_buf[(list->tail + i) % HID_DEBUG_BUFSIZE] =
 				buf[i];
 		list->tail = (list->tail + i) % HID_DEBUG_BUFSIZE;
-- 
2.6.1
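
Editorial sketch of the two points in the commit message, written as standalone C rather than kernel code. RING_SIZE, wrap_signed(), wrap_unsigned() and ring_append() are hypothetical names, and 512 is an assumed buffer size; none of them come from the patch itself.

```c
/* Assumed stand-in for HID_DEBUG_BUFSIZE (hypothetical value). */
#define RING_SIZE 512u

/*
 * Signed index: C's % truncates toward zero, so a negative sum gives a
 * negative remainder. The compiler must preserve that behavior and
 * cannot reduce the expression to a single AND.
 */
int wrap_signed(int tail, int i)
{
	return (tail + i) % (int)RING_SIZE;
}

/* Unsigned index: the sum can never be negative, so no fix-up is needed. */
unsigned wrap_unsigned(unsigned tail, unsigned i)
{
	return (tail + i) % RING_SIZE;
}

/*
 * Copy a NUL-terminated string into the ring. The loop tests buf[i]
 * directly instead of re-evaluating strlen(buf) on every iteration.
 * Returns the new tail index.
 */
unsigned ring_append(char *ring, unsigned tail, const char *buf)
{
	unsigned i;

	for (i = 0; buf[i]; i++)
		ring[(tail + i) % RING_SIZE] = buf[i];
	return (tail + i) % RING_SIZE;
}
```

Calling ring_append() with a tail near the end of the ring shows the wrap-around: the bytes that do not fit before index RING_SIZE land at the start of the buffer.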



* Re: [PATCH] HID: debug: improve hid_debug_event()
  2015-11-24 12:33 [PATCH] HID: debug: improve hid_debug_event() Rasmus Villemoes
@ 2015-11-25 20:31 ` Joe Perches
  2015-11-26 21:03   ` Rasmus Villemoes
  2015-11-26 23:05 ` Jiri Kosina
  1 sibling, 1 reply; 4+ messages in thread
From: Joe Perches @ 2015-11-25 20:31 UTC
  To: Rasmus Villemoes, Jiri Kosina; +Cc: linux-input, linux-kernel

On Tue, 2015-11-24 at 13:33 +0100, Rasmus Villemoes wrote:
> The code in hid_debug_event() causes horrible code generation. First,
> we do a strlen() call for every byte we copy (we're doing a store to
> global memory, so gcc has no way of proving that strlen(buf) doesn't
> change). Second, since i, list->tail and HID_DEBUG_BUFSIZE all have
> signed type, the modulo computation has to take into account the
> possibility that list->tail+i is negative, so it's not just a simple
> bitwise AND.
> 
> Fix the former by simply not doing strlen() at all (we have to load
> buf[i] anyway, so testing it is almost free) and the latter by
> changing i to unsigned. This cuts 29% (69 bytes) of the size of the
> function.
[]
> diff --git a/drivers/hid/hid-debug.c b/drivers/hid/hid-debug.c
[]
> @@ -659,13 +659,13 @@ EXPORT_SYMBOL_GPL(hid_dump_device);
>  /* enqueue string to 'events' ring buffer */
>  void hid_debug_event(struct hid_device *hdev, char *buf)
>  {
> -	int i;
> +	unsigned i;
>  	struct hid_debug_list *list;
>  	unsigned long flags;
>  
>  	spin_lock_irqsave(&hdev->debug_list_lock, flags);
>  	list_for_each_entry(list, &hdev->debug_list, node) {
> -		for (i = 0; i < strlen(buf); i++)
> +		for (i = 0; buf[i]; i++)
>  			list->hid_debug_buf[(list->tail + i) % HID_DEBUG_BUFSIZE] =
>  				buf[i];
>  		list->tail = (list->tail + i) % HID_DEBUG_BUFSIZE;

trivia:

The code might look nicer if (list->tail + i) % HID_DEBUG_BUFSIZE
was stored into a temporary.

Maybe use an if >= BUFSIZE to avoid a %
Something like:

		int pos = list->tail;
		for (i = 0; buf[i]; i++) {
			list->hid_debug_buf[pos++] = buf[i];
			if (pos >= HID_DEBUG_BUFSIZE)
				pos = 0;
		}
		list->tail = pos;
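
Editorial sketch: the compare-and-reset idea above, pulled out into a standalone C function so it can be compiled and tried in isolation. RING_SIZE and ring_append_cmp() are hypothetical names, 512 is an assumed size, and the function assumes the caller passes a tail that is already below RING_SIZE.

```c
/* Assumed stand-in for HID_DEBUG_BUFSIZE (hypothetical value). */
#define RING_SIZE 512u

/*
 * Append a NUL-terminated string to the ring using a running position
 * that is reset to 0 when it reaches the end, instead of computing a
 * remainder for every byte. Precondition: tail < RING_SIZE.
 * Returns the new tail index.
 */
unsigned ring_append_cmp(char *ring, unsigned tail, const char *buf)
{
	unsigned pos = tail;
	unsigned i;

	for (i = 0; buf[i]; i++) {
		ring[pos++] = buf[i];
		if (pos >= RING_SIZE)
			pos = 0;
	}
	return pos;
}
```

Unlike a mask, this form does not depend on the buffer size being a power of two, at the cost of a compare per byte.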


* Re: [PATCH] HID: debug: improve hid_debug_event()
  2015-11-25 20:31 ` Joe Perches
@ 2015-11-26 21:03   ` Rasmus Villemoes
  0 siblings, 0 replies; 4+ messages in thread
From: Rasmus Villemoes @ 2015-11-26 21:03 UTC
  To: Joe Perches; +Cc: Jiri Kosina, linux-input, linux-kernel

On Wed, Nov 25 2015, Joe Perches <joe@perches.com> wrote:

>>  	spin_lock_irqsave(&hdev->debug_list_lock, flags);
>>  	list_for_each_entry(list, &hdev->debug_list, node) {
>> -		for (i = 0; i < strlen(buf); i++)
>> +		for (i = 0; buf[i]; i++)
>>  			list->hid_debug_buf[(list->tail + i) % HID_DEBUG_BUFSIZE] =
>>  				buf[i];
>>  		list->tail = (list->tail + i) % HID_DEBUG_BUFSIZE;
>
> trivia:
>
> The code might look nicer if (list->tail + i) % HID_DEBUG_BUFSIZE
> was stored into a temporary.

Maybe.

> Maybe use an if >= BUFSIZE to avoid a %

Nah, that would likely be worse; both a cmov and a conditional jump are
probably more expensive than a simple '& 0x1ff' which the % should compile to
(provided the expression is unsigned).

Rasmus
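
Editorial sketch of the '& 0x1ff' remark: for an unsigned operand and a power-of-two size, the remainder and the mask give the same value. RING_SIZE (assumed to be 512, which is what the 0x1ff mask implies), wrap_mod(), wrap_mask() and wraps_agree() are hypothetical names used only for illustration. The sketch checks equivalence of results; it makes no claim about the instructions any particular compiler emits.

```c
/* Assumed stand-in for HID_DEBUG_BUFSIZE; 512 - 1 == 0x1ff. */
#define RING_SIZE 512u

/* Wrap an unsigned index with the remainder operator. */
unsigned wrap_mod(unsigned x)
{
	return x % RING_SIZE;
}

/* Wrap the same index with a mask; valid because RING_SIZE is 2^9. */
unsigned wrap_mask(unsigned x)
{
	return x & (RING_SIZE - 1u);
}

/* Return 1 if both forms agree for every x in [0, limit), else 0. */
int wraps_agree(unsigned limit)
{
	unsigned x;

	for (x = 0; x < limit; x++)
		if (wrap_mod(x) != wrap_mask(x))
			return 0;
	return 1;
}
```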


* Re: [PATCH] HID: debug: improve hid_debug_event()
  2015-11-24 12:33 [PATCH] HID: debug: improve hid_debug_event() Rasmus Villemoes
  2015-11-25 20:31 ` Joe Perches
@ 2015-11-26 23:05 ` Jiri Kosina
  1 sibling, 0 replies; 4+ messages in thread
From: Jiri Kosina @ 2015-11-26 23:05 UTC
  To: Rasmus Villemoes; +Cc: linux-input, linux-kernel

On Tue, 24 Nov 2015, Rasmus Villemoes wrote:

> The code in hid_debug_event() causes horrible code generation. First,
> we do a strlen() call for every byte we copy (we're doing a store to
> global memory, so gcc has no way of proving that strlen(buf) doesn't
> change). Second, since i, list->tail and HID_DEBUG_BUFSIZE all have
> signed type, the modulo computation has to take into account the
> possibility that list->tail+i is negative, so it's not just a simple
> bitwise AND.
> 
> Fix the former by simply not doing strlen() at all (we have to load
> buf[i] anyway, so testing it is almost free) and the latter by
> changing i to unsigned. This cuts 29% (69 bytes) of the size of the
> function.
> 
> Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>

Agreed, this is much better. Applied to for-4.5/core, thanks Rasmus.

-- 
Jiri Kosina
SUSE Labs


