* [PATCH] HID: debug: improve hid_debug_event()
@ 2015-11-24 12:33 Rasmus Villemoes
2015-11-25 20:31 ` Joe Perches
2015-11-26 23:05 ` Jiri Kosina
0 siblings, 2 replies; 4+ messages in thread
From: Rasmus Villemoes @ 2015-11-24 12:33 UTC
To: Jiri Kosina; +Cc: Rasmus Villemoes, linux-input, linux-kernel
The code in hid_debug_event() causes horrible code generation. First,
we do a strlen() call for every byte we copy (we're doing a store to
global memory, so gcc has no way of proving that strlen(buf) doesn't
change). Second, since both i, list->tail and HID_DEBUG_BUFSIZE have
signed type, the modulo computation has to take into account the
possibility that list->tail+i is negative, so it's not just a simple
and.

Fix the former by simply not doing strlen() at all (we have to load
buf[i] anyway, so testing it is almost free) and the latter by
changing i to unsigned. This cuts 29% (69 bytes) of the size of the
function.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
---
drivers/hid/hid-debug.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/hid/hid-debug.c b/drivers/hid/hid-debug.c
index 2886b645ced7..acfb522a432a 100644
--- a/drivers/hid/hid-debug.c
+++ b/drivers/hid/hid-debug.c
@@ -659,13 +659,13 @@ EXPORT_SYMBOL_GPL(hid_dump_device);
 /* enqueue string to 'events' ring buffer */
 void hid_debug_event(struct hid_device *hdev, char *buf)
 {
-	int i;
+	unsigned i;
 	struct hid_debug_list *list;
 	unsigned long flags;
 
 	spin_lock_irqsave(&hdev->debug_list_lock, flags);
 	list_for_each_entry(list, &hdev->debug_list, node) {
-		for (i = 0; i < strlen(buf); i++)
+		for (i = 0; buf[i]; i++)
 			list->hid_debug_buf[(list->tail + i) % HID_DEBUG_BUFSIZE] =
 				buf[i];
 		list->tail = (list->tail + i) % HID_DEBUG_BUFSIZE;
--
2.6.1
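
For readers following along outside the kernel tree, here is a minimal userspace sketch of the patched loop. It is not the kernel code: `demo_ring`, `demo_enqueue` and `DEMO_BUFSIZE` are hypothetical names, the locking and list walk are omitted, and the size of 512 is an assumption inferred from the '& 0x1ff' mask mentioned later in the thread.

```c
/* Assumption: 512 matches the '& 0x1ff' mask mentioned in the thread. */
#define DEMO_BUFSIZE 512u

/* Hypothetical stand-in for struct hid_debug_list; only the fields used here. */
struct demo_ring {
	char buf[DEMO_BUFSIZE];
	unsigned tail;
};

/* Copy the NUL-terminated string s into the ring, wrapping at the end. */
static void demo_enqueue(struct demo_ring *r, const char *s)
{
	unsigned i;

	/* Test s[i] directly instead of calling strlen(s) on every pass. */
	for (i = 0; s[i]; i++)
		r->buf[(r->tail + i) % DEMO_BUFSIZE] = s[i];

	/* i now holds the number of bytes copied; advance the tail by it. */
	r->tail = (r->tail + i) % DEMO_BUFSIZE;
}
```

Calling `demo_enqueue()` with `tail` at 510 and a four-byte string writes two bytes at the end of the buffer and two at the start, leaving `tail` at 2.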
* Re: [PATCH] HID: debug: improve hid_debug_event()
2015-11-24 12:33 [PATCH] HID: debug: improve hid_debug_event() Rasmus Villemoes
@ 2015-11-25 20:31 ` Joe Perches
2015-11-26 21:03 ` Rasmus Villemoes
2015-11-26 23:05 ` Jiri Kosina
1 sibling, 1 reply; 4+ messages in thread
From: Joe Perches @ 2015-11-25 20:31 UTC
To: Rasmus Villemoes, Jiri Kosina; +Cc: linux-input, linux-kernel
On Tue, 2015-11-24 at 13:33 +0100, Rasmus Villemoes wrote:
> The code in hid_debug_event() causes horrible code generation. First,
> we do a strlen() call for every byte we copy (we're doing a store to
> global memory, so gcc has no way of proving that strlen(buf) doesn't
> change). Second, since both i, list->tail and HID_DEBUG_BUFSIZE have
> signed type, the modulo computation has to take into account the
> possibility that list->tail+i is negative, so it's not just a simple
> and.
>
> Fix the former by simply not doing strlen() at all (we have to load
> buf[i] anyway, so testing it is almost free) and the latter by
> changing i to unsigned. This cuts 29% (69 bytes) of the size of the
> function.
[]
> diff --git a/drivers/hid/hid-debug.c b/drivers/hid/hid-debug.c
[]
> @@ -659,13 +659,13 @@ EXPORT_SYMBOL_GPL(hid_dump_device);
>  /* enqueue string to 'events' ring buffer */
>  void hid_debug_event(struct hid_device *hdev, char *buf)
>  {
> -	int i;
> +	unsigned i;
>  	struct hid_debug_list *list;
>  	unsigned long flags;
>
>  	spin_lock_irqsave(&hdev->debug_list_lock, flags);
>  	list_for_each_entry(list, &hdev->debug_list, node) {
> -		for (i = 0; i < strlen(buf); i++)
> +		for (i = 0; buf[i]; i++)
>  			list->hid_debug_buf[(list->tail + i) % HID_DEBUG_BUFSIZE] =
>  				buf[i];
>  		list->tail = (list->tail + i) % HID_DEBUG_BUFSIZE;

trivia:

The code might look nicer if (list->tail + i) % HID_DEBUG_BUFSIZE
was stored into a temporary.

Maybe use an if >= BUFSIZE to avoid a %

Something like:

	int pos = list->tail;
	for (i = 0; buf[i]; i++) {
		list->hid_debug_buf[pos++] = buf[i];
		if (pos >= HID_DEBUG_BUFSIZE)
			pos = 0;
	}
	list->tail = pos;
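
To make the comparison concrete, the following userspace sketch puts the modulo indexing from the patch next to the compare-and-reset variant suggested here. The names `put_modulo`, `put_compare` and `CMP_BUFSIZE` are hypothetical, the size of 512 is an assumption, and both helpers assume the incoming tail is already below the buffer size. It only illustrates that the two forms store the same bytes and yield the same tail; it says nothing about which one generates better code.

```c
/* Assumption: a power-of-two ring size of 512 bytes. */
#define CMP_BUFSIZE 512u

/* Variant from the patch: wrap every index with '%'. Returns the new tail. */
static unsigned put_modulo(char *ring, unsigned tail, const char *s)
{
	unsigned i;

	for (i = 0; s[i]; i++)
		ring[(tail + i) % CMP_BUFSIZE] = s[i];
	return (tail + i) % CMP_BUFSIZE;
}

/* Variant from the suggestion: keep a running position and reset it at the end. */
static unsigned put_compare(char *ring, unsigned tail, const char *s)
{
	unsigned i, pos = tail;	/* requires tail < CMP_BUFSIZE */

	for (i = 0; s[i]; i++) {
		ring[pos++] = s[i];
		if (pos >= CMP_BUFSIZE)
			pos = 0;
	}
	return pos;
}
```

Starting both helpers at the same tail with the same string should leave the two rings byte-for-byte identical.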
* Re: [PATCH] HID: debug: improve hid_debug_event()
2015-11-25 20:31 ` Joe Perches
@ 2015-11-26 21:03 ` Rasmus Villemoes
0 siblings, 0 replies; 4+ messages in thread
From: Rasmus Villemoes @ 2015-11-26 21:03 UTC
To: Joe Perches; +Cc: Jiri Kosina, linux-input, linux-kernel
On Wed, Nov 25 2015, Joe Perches <joe@perches.com> wrote:
>>  	spin_lock_irqsave(&hdev->debug_list_lock, flags);
>>  	list_for_each_entry(list, &hdev->debug_list, node) {
>> -		for (i = 0; i < strlen(buf); i++)
>> +		for (i = 0; buf[i]; i++)
>>  			list->hid_debug_buf[(list->tail + i) % HID_DEBUG_BUFSIZE] =
>>  				buf[i];
>>  		list->tail = (list->tail + i) % HID_DEBUG_BUFSIZE;
>
> trivia:
>
> The code might look nicer if (list->tail + i) % HID_DEBUG_BUFSIZE
> was stored into a temporary.

Maybe.

> Maybe use an if >= BUFSIZE to avoid a %

Nah, that would likely be worse; both a cmov and a conditional jump are
probably more expensive than a simple '& 0x1ff' which the % should compile to
(provided the expression is unsigned).

Rasmus
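
The "provided the expression is unsigned" caveat can be illustrated with a small sketch. The helper names and `MOD_BUFSIZE` are hypothetical, and 512 is an assumption inferred from the 0x1ff mask. With a signed operand, C99 and later define '%' to truncate toward zero, so a negative input gives a negative result and a plain bit mask would be wrong; with an unsigned operand, '%' by a power of two and the mask agree for every input.

```c
/* Assumption: 512 is inferred from the '& 0x1ff' mask. */
#define MOD_BUFSIZE 512

/* Signed operand: the result takes the sign of x, so it can be negative. */
static int wrap_signed(int x)
{
	return x % MOD_BUFSIZE;
}

/* Unsigned operand: equivalent to x & (MOD_BUFSIZE - 1). */
static unsigned wrap_unsigned(unsigned x)
{
	return x % MOD_BUFSIZE;
}
```

For example, `wrap_signed(-1)` is -1, whereas `wrap_unsigned((unsigned)-1)` is 511, the same value as `(unsigned)-1 & 0x1ff`.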
* Re: [PATCH] HID: debug: improve hid_debug_event()
2015-11-24 12:33 [PATCH] HID: debug: improve hid_debug_event() Rasmus Villemoes
2015-11-25 20:31 ` Joe Perches
@ 2015-11-26 23:05 ` Jiri Kosina
1 sibling, 0 replies; 4+ messages in thread
From: Jiri Kosina @ 2015-11-26 23:05 UTC
To: Rasmus Villemoes; +Cc: linux-input, linux-kernel
On Tue, 24 Nov 2015, Rasmus Villemoes wrote:
> The code in hid_debug_event() causes horrible code generation. First,
> we do a strlen() call for every byte we copy (we're doing a store to
> global memory, so gcc has no way of proving that strlen(buf) doesn't
> change). Second, since both i, list->tail and HID_DEBUG_BUFSIZE have
> signed type, the modulo computation has to take into account the
> possibility that list->tail+i is negative, so it's not just a simple
> and.
>
> Fix the former by simply not doing strlen() at all (we have to load
> buf[i] anyway, so testing it is almost free) and the latter by
> changing i to unsigned. This cuts 29% (69 bytes) of the size of the
> function.
>
> Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>

Agreed, this is much better. Applied to for-4.5/core, thanks Rasmus.

--
Jiri Kosina
SUSE Labs