linux-gpio.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Lukas Wunner <lukas@wunner.de>
To: David Daney <ddaney@caviumnetworks.com>
Cc: Linus Walleij <linus.walleij@linaro.org>,
	William Breathitt Gray <vilhelm.gray@gmail.com>,
	Geert Uytterhoeven <geert+renesas@glider.be>,
	Phil Reid <preid@electromag.com.au>,
	David Daney <david.daney@cavium.com>,
	Iban Rodriguez <irodriguez@cemitec.com>,
	Mathias Duckeck <m.duckeck@kunbus.de>,
	Phil Elwell <phil@raspberrypi.org>,
	linux-gpio@vger.kernel.org
Subject: Re: [PATCH v2 5/5] gpio: Add driver for Maxim MAX3191x industrial serializer
Date: Fri, 13 Oct 2017 23:29:07 +0200	[thread overview]
Message-ID: <20171013212907.GA25710@wunner.de> (raw)
In-Reply-To: <797c3efc-c466-1371-bb3e-176915f81895@caviumnetworks.com>

On Fri, Oct 13, 2017 at 09:53:06AM -0700, David Daney wrote:
> On 10/13/2017 04:11 AM, Lukas Wunner wrote:
> > The drivers
> >
> >        gpio-104-dio-48e.c, gpio-74x164.c, gpio-gpio-mm.c,
> >        gpio-pca953x.c, gpio-thunderx.c, gpio-ws16c48.c,
> >        gpio-xilinx.c
> >
> > currently use an inefficient algorithm for their ->set_multiple
> > callback:  They iterate over every chip (or bank or gpio pin),
> > check if any bits are set in the mask for this particular chip,
> > and if so, update the affected GPIOs.  If the mask is sparsely
> > populated, you'll waste CPU time checking chips even though
> > they're not affected by the operation at all.
> > 
> > Iterating over the chips is backwards, it is more efficient to
> > iterate over the bits set in the mask, identify the corresponding
> > chip and update its affected GPIOs.  The gpio-max3191x.c driver
> > I posted yesterday contains an example for such an algorithm and
> > you may want to improve your ->set_mutiple implementation accordingly:
>
> Do you have any profiling results that demonstrate significant system-wide
> performance improvements by making such a change?

Sorry, no.


> For the gpio-thunderx.c driver, the words in the bits/mask exactly match the
> banks, so the number of iterations doesn't change with the approach you
> suggest.

No, I was specifically referring to a sparsely populated mask above.
Let's say the mask consists of several 64-bit words and only the last
word has set bits.  With the existing algorithm you have as many
iterations as you have banks:

	for (bank = 0; bank <= chip->ngpio / 64; bank++) {
		set_bits = bits[bank] & mask[bank];
		clear_bits = ~bits[bank] & mask[bank];
		writeq(set_bits, txgpio->register_base + (bank * GPIO_2ND_BANK) + GPIO_TX_SET);
		writeq(clear_bits, txgpio->register_base + (bank * GPIO_2ND_BANK) + GPIO_TX_CLR);
	}

Whereas with the algorithm I'm proposing you only have a single iteration:

	int bit = 0;
	while ((bit = find_next_bit(mask, chip->ngpio, bit)) != chip->ngpio) {
		unsigned int bank = bit / 64;
		set_bits = bits[bank] & mask[bank];
		clear_bits = ~bits[bank] & mask[bank];
		writeq(set_bits, txgpio->register_base + (bank * GPIO_2ND_BANK) + GPIO_TX_SET);
		writeq(clear_bits, txgpio->register_base + (bank * GPIO_2ND_BANK) + GPIO_TX_CLR);
		bit = (bank + 1) * 64; /* continue search from next bank */
	}

find_next_bit() uses the clz instruction on ARM, similar instructions
exist on most arches, so the search is very fast.

Or did you mean that find_next_bit() iterates over the words to find
the first one with set bits?

BTW, why isn't there a check in your code whether mask[bank] != 0,
n which case the mmio writes can be skipped?


> In fact, an argument could be made in the other direction.  The increased
> ICache pressure from code bloat and branch prediction misses resulting from
> extra testing of the mask bits could easily make the system slower using
> your suggestion.

I don't quite follow, where should the branch prediction misses come from,
the number of branch instructions is the same with both algorithms, and
fewer with the proposed algorithm on sparsely populated masks.

Also, 2 additional LoC is hardly code bloat.

Are you referring to the call to find_next_bit() specifically?

Thanks,

Lukas

  reply	other threads:[~2017-10-13 21:29 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-10-12 10:40 [PATCH v2 0/5] GPIO driver for Maxim MAX3191x Lukas Wunner
2017-10-12 10:40 ` [PATCH v2 2/5] gpio: Introduce ->get_multiple callback Lukas Wunner
2017-10-13 12:46   ` Linus Walleij
2017-10-12 10:40 ` [PATCH v2 1/5] bitops: Introduce assign_bit() Lukas Wunner
2017-10-12 18:30   ` Andrew Morton
2017-10-13 12:44   ` Linus Walleij
2017-10-12 10:40 ` [PATCH v2 3/5] dt-bindings: Document common property for daisy-chained devices Lukas Wunner
     [not found]   ` <f0c3b0c5514f74717c5783360b60062dfe9b8c0f.1507797496.git.lukas-JFq808J9C/izQB+pC5nmwQ@public.gmane.org>
2017-10-15 11:30     ` Jonathan Cameron
2017-10-17 20:32     ` Rob Herring
2017-10-19 20:35     ` Linus Walleij
     [not found] ` <cover.1507797496.git.lukas-JFq808J9C/izQB+pC5nmwQ@public.gmane.org>
2017-10-12 10:40   ` [PATCH v2 5/5] gpio: Add driver for Maxim MAX3191x industrial serializer Lukas Wunner
2017-10-13 11:11     ` Lukas Wunner
2017-10-13 16:53       ` David Daney
2017-10-13 21:29         ` Lukas Wunner [this message]
2017-10-13 22:12           ` David Daney
2017-10-14 11:24             ` Lukas Wunner
2017-10-19 20:41     ` Linus Walleij
2017-10-12 10:40   ` [PATCH v2 4/5] dt-bindings: gpio: max3191x: Document new driver Lukas Wunner
2017-10-17 20:30     ` Rob Herring
     [not found]     ` <57660f421f2080914940806394a9365ac959f5a8.1507797496.git.lukas-JFq808J9C/izQB+pC5nmwQ@public.gmane.org>
2017-10-19 20:36       ` Linus Walleij
2017-10-13 12:48 ` [PATCH v2 0/5] GPIO driver for Maxim MAX3191x Linus Walleij

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20171013212907.GA25710@wunner.de \
    --to=lukas@wunner.de \
    --cc=david.daney@cavium.com \
    --cc=ddaney@caviumnetworks.com \
    --cc=geert+renesas@glider.be \
    --cc=irodriguez@cemitec.com \
    --cc=linus.walleij@linaro.org \
    --cc=linux-gpio@vger.kernel.org \
    --cc=m.duckeck@kunbus.de \
    --cc=phil@raspberrypi.org \
    --cc=preid@electromag.com.au \
    --cc=vilhelm.gray@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).