linux-spi.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: David Jander <david@protonic.nl>
To: Andrew Lunn <andrew@lunn.ch>
Cc: Mark Brown <broonie@kernel.org>,
	linux-spi@vger.kernel.org, Marc Kleine-Budde <mkl@pengutronix.de>
Subject: Re: [PATCH] drivers: spi: spi.c: Convert statistics to per-cpu u64_stats_t
Date: Tue, 24 May 2022 08:25:06 +0200	[thread overview]
Message-ID: <20220524082506.0aa4fde6@erd992> (raw)
In-Reply-To: <YoukjLDGvz5kN5fp@lunn.ch>


Hi Andrew,

Thanks a lot for the review.

On Mon, 23 May 2022 17:13:16 +0200
Andrew Lunn <andrew@lunn.ch> wrote:

> On Mon, May 23, 2022 at 04:20:09PM +0200, David Jander wrote:
> > This change gives a dramatic performance improvement in the hot path,
> > since many costly spin_lock_irqsave() calls can be avoided.  
> 
> It is normal to back up such claims with numbers. Please include your
> numbers here.

Ok, will add my measurement results here.

> > Suggested-by: Andrew Lunn <andrew@lunn.ch>
> > Signed-off-by: David Jander <david@protonic.nl>
> > ---
> >  drivers/spi/spi.c       | 95 ++++++++++++++++++++++++-----------------
> >  include/linux/spi/spi.h | 80 ++++++++++++++++++++++++----------
> >  2 files changed, 114 insertions(+), 61 deletions(-)
> > 
> > diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
> > index c4dd1200fe99..edc290e67b92 100644
> > --- a/drivers/spi/spi.c
> > +++ b/drivers/spi/spi.c
> > @@ -33,6 +33,7 @@
> >  #include <linux/idr.h>
> >  #include <linux/platform_data/x86/apple.h>
> >  #include <linux/ptp_clock_kernel.h>
> > +#include <linux/percpu.h>
> >  
> >  #define CREATE_TRACE_POINTS
> >  #include <trace/events/spi.h>
> > @@ -111,6 +112,25 @@ static ssize_t driver_override_show(struct device *dev,
> >  }
> >  static DEVICE_ATTR_RW(driver_override);
> >  
> > +#define spi_pcpu_stats_totalize(ret, in, field)				\
> > +do {									\
> > +	int i;								\
> > +	ret = 0;							\
> > +	for_each_possible_cpu(i) {					\
> > +		const struct spi_statistics *pcpu_stats;		\
> > +		u64 inc;						\
> > +		unsigned int start;					\
> > +		pcpu_stats = per_cpu_ptr(in, i);			\
> > +		do {							\
> > +			start = u64_stats_fetch_begin_irq(		\
> > +					&pcpu_stats->syncp);		\
> > +			inc = u64_stats_read(&pcpu_stats->field);	\
> > +		} while (u64_stats_fetch_retry_irq(			\
> > +					&pcpu_stats->syncp, start));	\
> > +		ret += inc;						\
> > +	}								\
> > +} while (0)  
> 
> Looking at this from the perspective of a network reviewer, that is a
> pretty big macro. netdev would probably not allow this. But macro
> magic does seem normal in this file.
> 
> What also makes a difference here is the API to user space. With
> netdev, it is not so much the absolute values of the counters, but the
> relationship with other counters that is interesting. So there is a
> method to get all the counters in one atomic snapshot. For that to
> work, the loop doing the addition adds up all the counters, not just
> one specific counter. So there is no need for 'field' which is the
> source of all this macro magic. But it seems like SPI only ever
> exports individual counters.

Yes, I initially started out just totalling all the members in one loop, but
after noticing they are never used together, I changed it.
Going to go with Marks comment here and leave it as is.

> >  #define SPI_STATISTICS_ATTRS(field, file)				\
> >  static ssize_t spi_controller_##field##_show(struct device *dev,	\
> >  					     struct device_attribute *attr, \
> > @@ -118,7 +138,7 @@ static ssize_t spi_controller_##field##_show(struct device *dev,	\
> >  {									\
> >  	struct spi_controller *ctlr = container_of(dev,			\
> >  					 struct spi_controller, dev);	\
> > -	return spi_statistics_##field##_show(&ctlr->statistics, buf);	\
> > +	return spi_statistics_##field##_show(ctlr->pcpu_statistics, buf); \
> >  }									\
> >  static struct device_attribute dev_attr_spi_controller_##field = {	\
> >  	.attr = { .name = file, .mode = 0444 },				\
> > @@ -129,7 +149,7 @@ static ssize_t spi_device_##field##_show(struct device *dev,		\
> >  					char *buf)			\
> >  {									\
> >  	struct spi_device *spi = to_spi_device(dev);			\
> > -	return spi_statistics_##field##_show(&spi->statistics, buf);	\
> > +	return spi_statistics_##field##_show(spi->pcpu_statistics, buf); \
> >  }									\
> >  static struct device_attribute dev_attr_spi_device_##field = {		\
> >  	.attr = { .name = file, .mode = 0444 },				\  
> 
> Not a criticism of this patch, but this is the old way to do this. It
> is better to use DEVICE_ATTR_RO(), but that is a bigger rework.

I suppose it is better to do this in a separate patch? @Mark, should I leave
it as is, or do you require a "modernization" for this first?

> > @@ -140,11 +160,10 @@ static struct device_attribute dev_attr_spi_device_##field = {		\
> >  static ssize_t spi_statistics_##name##_show(struct spi_statistics *stat, \
> >  					    char *buf)			\
> >  {									\
> > -	unsigned long flags;						\
> >  	ssize_t len;							\
> > -	spin_lock_irqsave(&stat->lock, flags);				\
> > -	len = sysfs_emit(buf, format_string "\n", stat->field);		\
> > -	spin_unlock_irqrestore(&stat->lock, flags);			\
> > +	u64 val;							\
> > +	spi_pcpu_stats_totalize(val, stat, field);			\
> > +	len = sysfs_emit(buf, format_string "\n", val);			\  
> 
> If you have made all the statistics u64, you don't need format_string.

Ack. Good point. I wasn't convinced to leave it all u64 till the end, so I
didn't think of removing the format strings. Will fix that.

> > @@ -543,7 +562,7 @@ struct spi_device *spi_alloc_device(struct spi_controller *ctlr)
> >  	spi->dev.release = spidev_release;
> >  	spi->mode = ctlr->buswidth_override_bits;
> >  
> > -	spin_lock_init(&spi->statistics.lock);
> > +	spi->pcpu_statistics = spi_alloc_pcpu_stats(struct spi_statistics);  
> 
> It looks like you are missing error checking here.

Oops. Will fix that.

> > @@ -3042,7 +3061,7 @@ int spi_register_controller(struct spi_controller *ctlr)
> >  		}
> >  	}
> >  	/* add statistics */
> > -	spin_lock_init(&ctlr->statistics.lock);
> > +	ctlr->pcpu_statistics = devm_spi_alloc_pcpu_stats(dev, struct spi_statistics);  
> 
> This could fail, so you should check for NULL pointer.

Thanks. Will fix.

> > +#define devm_spi_alloc_pcpu_stats(dev, type)				\
> > +({									\
> > +	typeof(type) __percpu *pcpu_stats = devm_alloc_percpu(dev, type);\
> > +	if (pcpu_stats) {						\
> > +		int __cpu;						\
> > +		for_each_possible_cpu(__cpu) {				\
> > +			typeof(type) *stat;				\
> > +			stat = per_cpu_ptr(pcpu_stats, __cpu);		\
> > +			u64_stats_init(&stat->syncp);			\
> > +		}							\
> > +	}								\
> > +	pcpu_stats;							\
> > +})
> > +
> > +#define spi_alloc_pcpu_stats(type)					\
> > +({									\
> > +	typeof(type) __percpu *pcpu_stats = alloc_percpu_gfp(type, GFP_KERNEL);\
> > +	if (pcpu_stats) {						\
> > +		int __cpu;						\
> > +		for_each_possible_cpu(__cpu) {				\
> > +			typeof(type) *stat;				\
> > +			stat = per_cpu_ptr(pcpu_stats, __cpu);		\
> > +			u64_stats_init(&stat->syncp);			\
> > +		}							\
> > +	}								\
> > +	pcpu_stats;							\
> > +})  
> 
> Do these need to be macros? How many different types are used?

Good question. Probably not. I initially had just one macro that I intended to
use twice (the devm variant), but later realized that there is a
chicken-and-egg problem in the code, where the bus (and thus the statistics)
need to be operational before the client device is fully registered, so device
resource allocations aren't allowed at that point.
But this reminds me that not being a device resource, it needs explicit
freeing and I haven't done that. Will fix this memory leak in the next version.

Best regards,

-- 
David Jander


  parent reply	other threads:[~2022-05-24  6:25 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-23 14:20 [PATCH] drivers: spi: spi.c: Convert statistics to per-cpu u64_stats_t David Jander
2022-05-23 15:13 ` Andrew Lunn
2022-05-23 18:56   ` Mark Brown
2022-05-24  6:25   ` David Jander [this message]
2022-05-24 11:46     ` Mark Brown

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220524082506.0aa4fde6@erd992 \
    --to=david@protonic.nl \
    --cc=andrew@lunn.ch \
    --cc=broonie@kernel.org \
    --cc=linux-spi@vger.kernel.org \
    --cc=mkl@pengutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).