From: Helmut Schaa <helmut.schaa@googlemail.com>
To: "Rafał Miłecki" <zajec5@gmail.com>
Cc: Ivo Van Doorn <ivdoorn@gmail.com>,
"John W. Linville" <linville@tuxdriver.com>,
linux-wireless@vger.kernel.org, users@rt2x00.serialmonkey.com
Subject: Re: [PATCH 21/23] rt2x00: Optimize register access in rt2800usb
Date: Mon, 18 Apr 2011 16:48:43 +0200
Message-ID: <201104181648.43967.helmut.schaa@googlemail.com>
In-Reply-To: <BANLkTimOsO1RSO-+n97a0WiQR__f_sgcUQ@mail.gmail.com>
Hi,
On Monday, 18 April 2011, Ivo Van Doorn wrote:
> > Wouldn't this be better to create two pointers in struct rt2x00_dev.
> > One for writing function and one for reading function? Am I right
> > thinking calling functions by pointers is quite fast? Or is this still
> > noticeably slower than using proper functions directly?
>
> We already have the pointer inside struct rt2x00_dev which references
> the register access functions for rt2800pci/usb. These pointers are used
> by rt2800lib to access the common registers. What this patch does, is
> optimize the case where we exactly know which function we need, because
> we are in the actual driver.
>
> As for the performance, I'll let Helmut comment on that as he created patch 20,
> which introduced this change to rt2800pci. :)
Sure, I was comparing some assembly in the rt2800pci hotpaths (on a 380 MHz
MIPS CPU, btw). A register read/write on PCI is just a readl or writel,
nothing more, but with the indirect wrappers we get something like the
listing below (this is x86_64, as I didn't want to cross-compile right now).
As an example, take the register read + write in rt2800pci_enable_interrupt
(which is called in every tasklet invocation, which can happen for every
rx'ed frame and every tx'ed frame):
movq 8(%rbx), %rax # rt2x00dev_1(D)->ops, rt2x00dev_1(D)->ops
leaq -36(%rbp), %rdx #, tmp82
movq %rbx, %rdi # rt2x00dev,
movq 72(%rax), %rax # D.47612_27->drv, D.47612_27->drv
movl $516, %esi #,
call *(%rax) # rt2800ops_29->register_read
movb %r14b, %cl #,
movq 8(%rbx), %rax # rt2x00dev_1(D)->ops, rt2x00dev_1(D)->ops
movq %rbx, %rdi # rt2x00dev,
movq 72(%rax), %rax # D.47619_31->drv, D.47619_31->drv
movl $516, %esi #,
movl $1, %edx #, reg.119
sall %cl, %edx #, reg.119
andl %r13d, %edx # irq_field$bit_mask, reg.119
notl %r13d # tmp89
andl -36(%rbp), %r13d # reg, tmp89
orl %r13d, %edx # tmp89, reg.119
movl %edx, -36(%rbp) # reg.119, reg
call *16(%rax) # rt2800ops_33->register_write
In addition, the indirect call ends up in rt2x00pci_register_read:
pushq %rbp #
mov %esi, %esi # offset, addr.27
movq %rsp, %rbp #,
addq 1056(%rdi), %rsi # rt2x00dev_1(D)->csr.base, addr.27
movl %eax, (%rdx) # ret,* value
And rt2x00pci_register_write:
pushq %rbp #
mov %esi, %esi # offset, addr.26
movq %rsp, %rbp #,
addq 1056(%rdi), %rsi # rt2x00dev_1(D)->csr.base, addr.26
movl %edx,(%rsi) # value,* addr.26
And here the same when using rt2x00pci_register_read/write directly:
movq 1056(%rbx), %rax # rt2x00dev_1(D)->csr.base, rt2x00dev_1(D)->csr.base
movl 516(%rax),%eax #, reg.119
movl %r13d, %edx # irq_field$bit_mask, tmp80
movb %r14b, %cl #,
notl %edx # tmp80
andl %edx, %eax # tmp80, reg.119
movl $1, %edx #, tmp85
sall %cl, %edx #, tmp85
andl %r13d, %edx # irq_field$bit_mask, tmp85
orl %edx, %eax # tmp85, reg.119
movq 1056(%rbx), %rdx # rt2x00dev_1(D)->csr.base, rt2x00dev_1(D)->csr.base
movl %eax,516(%rdx) # reg.119,
As you can see, we save more than just one indirect function call:
17 movs -> 7 movs
2 calls -> 0 calls
2 adds -> 0 adds
This happens because the compiler is able to apply a number of optimizations
that are only possible by inlining rt2x00pci_register_read/write. When using
the indirect function call the compiler is not able to inline them.
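To make the difference concrete, here is a minimal userspace sketch of the two call styles. It is not the driver code: struct demo_dev, struct demo_ops and all demo_* functions are hypothetical stand-ins, the register space is a plain array instead of an ioremap()ed BAR, and the read-modify-write is simplified to setting a single bit. Only the offset 516 (0x0204) is taken from the listings above.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_CSR_WORDS    256
#define DEMO_INT_MASK_CSR 0x0204 /* 516, the offset seen in the listings */

struct demo_dev;

/* Hypothetical ops table, standing in for the pointers used by rt2800lib. */
struct demo_ops {
	void (*register_read)(struct demo_dev *dev, unsigned int offset,
			      uint32_t *value);
	void (*register_write)(struct demo_dev *dev, unsigned int offset,
			       uint32_t value);
};

/* Hypothetical device: an array replaces the memory-mapped registers. */
struct demo_dev {
	const struct demo_ops *ops;
	volatile uint32_t csr[DEMO_CSR_WORDS];
};

/* Stand-in for rt2x00pci_register_read: a single load. */
static inline void demo_pci_register_read(struct demo_dev *dev,
					  unsigned int offset, uint32_t *value)
{
	assert(offset % 4 == 0 && offset / 4 < DEMO_CSR_WORDS);
	*value = dev->csr[offset / 4];
}

/* Stand-in for rt2x00pci_register_write: a single store. */
static inline void demo_pci_register_write(struct demo_dev *dev,
					   unsigned int offset, uint32_t value)
{
	assert(offset % 4 == 0 && offset / 4 < DEMO_CSR_WORDS);
	dev->csr[offset / 4] = value;
}

static const struct demo_ops demo_pci_ops = {
	.register_read = demo_pci_register_read,
	.register_write = demo_pci_register_write,
};

/* Indirect: the target is only known at run time, so the compiler has to
 * emit two real calls and pass the value through memory. */
static void demo_enable_irq_indirect(struct demo_dev *dev, unsigned int bit)
{
	uint32_t reg;

	dev->ops->register_read(dev, DEMO_INT_MASK_CSR, &reg);
	reg |= (uint32_t)1 << bit;
	dev->ops->register_write(dev, DEMO_INT_MASK_CSR, reg);
}

/* Direct: the accessors are visible and inlinable, so the compiler is free
 * to reduce this to a load, the bit operation and a store. */
static void demo_enable_irq_direct(struct demo_dev *dev, unsigned int bit)
{
	uint32_t reg;

	demo_pci_register_read(dev, DEMO_INT_MASK_CSR, &reg);
	reg |= (uint32_t)1 << bit;
	demo_pci_register_write(dev, DEMO_INT_MASK_CSR, reg);
}
```

Both functions leave the same value in the register; comparing the output of gcc -O2 -S for the two should show a difference of the same kind as in the listings above, although the exact instructions depend on compiler and architecture.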
So, I first thought about using direct calls only in the interrupt handler
and the RX/TX hotpaths, but since mixing rt2800_register_read and
rt2x00pci_register_read in different locations in rt2800pci would be even
more confusing, I just replaced every rt2800_register_read with
rt2x00pci_register_read in rt2800pci.
One way to keep the abstraction and still improve the register_read/write
operations would be to introduce an inlined rt2800pci_register_read/write
which directly calls rt2x00pci_register_read/write, and to provide that via
rt2800_ops to rt2800lib. That way all calls in rt2800pci can directly
inline rt2x00pci_register_read/write while rt2800lib will still use indirect
calls to do the same.
However, I didn't see any need for this.
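For reference, that alternative could look roughly like the sketch below. Again this is userspace C with hypothetical names only (demo_dev, demo_rt2800_ops, demo_bus_*, demo_drv_*, demo_lib_*); it is not what the patch does and not the real rt2800_ops layout, just an illustration of a driver-local inline wrapper that is also exported through an ops table.

```c
#include <stdint.h>

struct demo_dev;

/* Hypothetical ops table handed to the shared library code. */
struct demo_rt2800_ops {
	void (*register_read)(struct demo_dev *dev, unsigned int offset,
			      uint32_t *value);
	void (*register_write)(struct demo_dev *dev, unsigned int offset,
			       uint32_t value);
};

/* Hypothetical device: an array replaces the memory-mapped registers. */
struct demo_dev {
	const struct demo_rt2800_ops *ops;
	volatile uint32_t csr[256];
};

/* Bus-level accessors, standing in for rt2x00pci_register_read/write. */
static inline void demo_bus_register_read(struct demo_dev *dev,
					  unsigned int offset, uint32_t *value)
{
	*value = dev->csr[offset / 4];
}

static inline void demo_bus_register_write(struct demo_dev *dev,
					   unsigned int offset, uint32_t value)
{
	dev->csr[offset / 4] = value;
}

/* Driver-local wrappers: inlined when called by name inside the driver,
 * emitted as real functions because their address is taken below. */
static inline void demo_drv_register_read(struct demo_dev *dev,
					  unsigned int offset, uint32_t *value)
{
	demo_bus_register_read(dev, offset, value);
}

static inline void demo_drv_register_write(struct demo_dev *dev,
					   unsigned int offset, uint32_t value)
{
	demo_bus_register_write(dev, offset, value);
}

static const struct demo_rt2800_ops demo_drv_ops = {
	.register_read = demo_drv_register_read,
	.register_write = demo_drv_register_write,
};

/* Shared library code: only knows the ops table, so calls stay indirect. */
static void demo_lib_set_bits(struct demo_dev *dev, unsigned int offset,
			      uint32_t bits)
{
	uint32_t reg;

	dev->ops->register_read(dev, offset, &reg);
	dev->ops->register_write(dev, offset, reg | bits);
}

/* Driver hotpath: uses the wrapper by name, so it can be inlined. */
static void demo_drv_set_bits(struct demo_dev *dev, unsigned int offset,
			      uint32_t bits)
{
	uint32_t reg;

	demo_drv_register_read(dev, offset, &reg);
	demo_drv_register_write(dev, offset, reg | bits);
}
```

The driver keeps one naming scheme for register access everywhere, while only the library path pays for the function pointer.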
Helmut