From: Tao Ren <rentao.bupt@gmail.com>
To: linux-aspeed@lists.ozlabs.org
Subject: Re: [PATCH] usb: gadget: aspeed: fixup vhub port irq handling
Date: Mon, 17 Aug 2020 15:56:04 -0700
Message-ID: <20200817225603.GA21228@taoren-ubuntu-R90MNF91>
In-Reply-To: <875z9hz8k3.fsf@kernel.org>

On Mon, Aug 17, 2020 at 04:49:32PM +0300, Felipe Balbi wrote:
> 
> Hi,
> 
> rentao.bupt@gmail.com writes:
> > From: Tao Ren <rentao.bupt@gmail.com>
> >
> > This is a follow-on patch for commit a23be4ed8f48 ("usb: gadget: aspeed:
> > improve vhub port irq handling"): for_each_set_bit() is replaced with
> > a simple for() loop because the for() loop runs faster on ASPEED BMCs.
> >
> > Signed-off-by: Tao Ren <rentao.bupt@gmail.com>
> > ---
> >  drivers/usb/gadget/udc/aspeed-vhub/core.c | 10 +++-------
> >  drivers/usb/gadget/udc/aspeed-vhub/vhub.h |  3 +++
> >  2 files changed, 6 insertions(+), 7 deletions(-)
> >
> > diff --git a/drivers/usb/gadget/udc/aspeed-vhub/core.c b/drivers/usb/gadget/udc/aspeed-vhub/core.c
> > index cdf96911e4b1..be7bb64e3594 100644
> > --- a/drivers/usb/gadget/udc/aspeed-vhub/core.c
> > +++ b/drivers/usb/gadget/udc/aspeed-vhub/core.c
> > @@ -135,13 +135,9 @@ static irqreturn_t ast_vhub_irq(int irq, void *data)
> >  
> >  	/* Handle device interrupts */
> >  	if (istat & vhub->port_irq_mask) {
> > -		unsigned long bitmap = istat;
> > -		int offset = VHUB_IRQ_DEV1_BIT;
> > -		int size = VHUB_IRQ_DEV1_BIT + vhub->max_ports;
> > -
> > -		for_each_set_bit_from(offset, &bitmap, size) {
> > -			i = offset - VHUB_IRQ_DEV1_BIT;
> > -			ast_vhub_dev_irq(&vhub->ports[i].dev);
> > +		for (i = 0; i < vhub->max_ports; i++) {
> > +			if (istat & VHUB_DEV_IRQ(i))
> > +				ast_vhub_dev_irq(&vhub->ports[i].dev);
> 
> How did you measure the statement above? for_each_set_bit() does
> exactly what you did, unless your architecture has an instruction that
> helps find the next set bit (like cls on ARM), which then makes it
> much faster.
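
To make the comparison concrete, here is a minimal user-space sketch of the
two dispatch styles under discussion. It is not the driver code and not the
kernel's find_next_bit() implementation: the value of VHUB_IRQ_DEV1_BIT is
assumed for illustration, the helper names are hypothetical, the call to
ast_vhub_dev_irq() is replaced by recording the port in a mask, and the bit
scan relies on the GCC/Clang builtin __builtin_ctz(). It only shows that both
styles visit the same ports.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed for illustration only; the driver's real values live in vhub.h. */
#define VHUB_IRQ_DEV1_BIT 9
#define VHUB_DEV_IRQ(n)   (UINT32_C(1) << (VHUB_IRQ_DEV1_BIT + (n)))

/*
 * Hypothetical helper: the plain for() loop from the patch. Instead of
 * calling ast_vhub_dev_irq(), it records each port it would service.
 */
static uint32_t ports_plain_loop(uint32_t istat, unsigned int max_ports)
{
	uint32_t handled = 0;
	unsigned int i;

	assert(max_ports <= 32u - VHUB_IRQ_DEV1_BIT);
	for (i = 0; i < max_ports; i++) {
		if (istat & VHUB_DEV_IRQ(i))
			handled |= UINT32_C(1) << i;
	}
	return handled;
}

/*
 * Hypothetical helper: a model of the "jump to the next set bit" style.
 * __builtin_ctz() stands in for whatever bit-search the platform provides.
 */
static uint32_t ports_bit_scan(uint32_t istat, unsigned int max_ports)
{
	uint32_t handled = 0;
	uint32_t pending;

	assert(max_ports <= 32u - VHUB_IRQ_DEV1_BIT);
	/* Keep only the port bits: move DEV1 to bit 0, mask to max_ports. */
	pending = (istat >> VHUB_IRQ_DEV1_BIT) &
		  ((UINT32_C(1) << max_ports) - 1);
	while (pending) {
		unsigned int i = (unsigned int)__builtin_ctz(pending);

		handled |= UINT32_C(1) << i;
		pending &= pending - 1;	/* clear the lowest set bit */
	}
	return handled;
}
```

Both helpers return the same mask for any istat, so the open question in the
thread is only the per-iteration cost on a given CPU, not the behaviour.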

I did some testing, and the results show that the for() loop runs faster
than the for_each_set_bit() loop. Please refer to the details below (from my
discussion with Benjamin on the original patch) and kindly let me know your
suggestions.

> On Mon, 2020-04-06 at 23:02 -0700, Tao Ren wrote:
> > I ran some tests on my ast2400 and ast2500 BMCs, and it looks like the
> > for() loop runs faster than the for_each_set_bit_from() loop in my
> > environment. I'm not sure whether something needs to be revised in my
> > test code, so please kindly share your suggestions:
> >
> > I used get_cycles() to measure the execution time of the 2 loops, and
> > replaced ast_vhub_dev_irq() with barrier() to avoid "noise"; below
> > are the results:
> >
> >   - When the number of downstream ports is 5 and only 1 irq bit is set,
> >     the for_each_set_bit() loop takes ~30 cycles and the for() loop
> >     takes 20-25 cycles.
> >
> >   - When the number of downstream ports is 5 and all 5 bits are set,
> >     the for_each_set_bit() loop takes ~50 cycles and the for() loop
> >     takes ~25 cycles.
> >
> >   - When I increase the number of downstream ports to 16 and set 1 irq
> >     bit, the for_each_set_bit() loop takes ~30 cycles and the for() loop
> >     takes 25 cycles. This surprised me a little, because I expected the
> >     for() loop to cost 60+ cycles (3 times the value with 5 ports).
> >
> >   - When the number of downstream ports is 16 and all irq status bits
> >     are set, the for_each_set_bit() loop takes 60-70 cycles and the
> >     for() loop takes 30+ cycles.
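
For readers who want to try a similar comparison, below is a rough user-space
analogue of the measurement described above. It is a sketch under stated
assumptions, not the test code used in the thread: get_cycles() and barrier()
are kernel facilities, so this version substitutes clock_gettime() and a
GCC/Clang-style compiler barrier, and it averages over many calls instead of
timing a single pass. All names (avg_ns_per_call, count_set_ports, and so on)
are hypothetical, and numbers from a development machine say nothing about an
ast2400 or ast2500.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Signature shared by the loops under test (hypothetical). */
typedef uint32_t (*dispatch_fn)(uint32_t istat, unsigned int max_ports);

/* User-space stand-in for the kernel's barrier(). */
#define compiler_barrier() __asm__ __volatile__("" ::: "memory")

static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Example loop under test: a plain for() loop over port bits 0..max_ports-1.
 * The barrier replaces the real handler, as in the measurement above.
 */
static uint32_t count_set_ports(uint32_t istat, unsigned int max_ports)
{
	uint32_t n = 0;
	unsigned int i;

	for (i = 0; i < max_ports; i++) {
		if (istat & (UINT32_C(1) << i)) {
			compiler_barrier();
			n++;
		}
	}
	return n;
}

/*
 * Call fn() iters times and return the average wall-clock cost per call in
 * nanoseconds. The accumulated result is written to *sink so the compiler
 * cannot discard the calls.
 */
static double avg_ns_per_call(dispatch_fn fn, uint32_t istat,
			      unsigned int max_ports, unsigned long iters,
			      uint32_t *sink)
{
	uint32_t acc = 0;
	uint64_t start, end;
	unsigned long k;

	assert(iters > 0);
	start = now_ns();
	for (k = 0; k < iters; k++) {
		acc += fn(istat, max_ports);
		compiler_barrier();
	}
	end = now_ns();
	*sink = acc;
	return (double)(end - start) / (double)iters;
}
```

A caller would pass each loop variant in turn with the same istat and
max_ports (for example 5 ports with 1 bit set, then 16 ports with all bits
set) and compare the returned averages.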


Cheers,

Tao
