linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
From: Bjorn Helgaas <helgaas@kernel.org>
To: Mohamed Khalfella <mkhalfella@purestorage.com>
Cc: "open list:PCI SUBSYSTEM" <linux-pci@vger.kernel.org>,
	open list <linux-kernel@vger.kernel.org>,
	Meeta Saggi <msaggi@purestorage.com>,
	Eric Badger <ebadger@purestorage.com>,
	Oliver O'Halloran <oohall@gmail.com>,
	stable@vger.kernel.org, Bjorn Helgaas <bhelgaas@google.com>,
	Rajat Jain <rajatja@google.com>,
	"open list:PCI ENHANCED ERROR HANDLING \(EEH\) FOR POWERPC"
	<linuxppc-dev@lists.ozlabs.org>
Subject: Re: [PATCH] PCI/AER: Iterate over error counters instead of error strings
Date: Tue, 10 May 2022 11:43:05 -0500	[thread overview]
Message-ID: <20220510164305.GA678149@bhelgaas> (raw)
In-Reply-To: <20220509181441.31884-1-mkhalfella@purestorage.com>

[+cc Rajat]

On Mon, May 09, 2022 at 06:14:41PM +0000, Mohamed Khalfella wrote:
> PCI AER stats counters sysfs attributes need to iterate over
> stats counters instead of stats names. 

Thanks for catching this; it definitely looks like a real issue!  I
guess you're probably seeing junk in the sysfs files?

It would be helpful to reviewers if this said *why* we need to iterate
over the counters instead of the names.  I think the problem is that
the current code reads past the end of the stats counters.

There are parallel arrays here:

  #define AER_MAX_TYPEOF_COR_ERRS         16
  #define AER_MAX_TYPEOF_UNCOR_ERRS       27

  aer_correctable_error_string[32]                               # 32
  pdev->aer_stats->dev_cor_errs[AER_MAX_TYPEOF_COR_ERRS]         # 16
  aer_uncorrectable_error_string[32]                             # 32
  pdev->aer_stats->dev_fatal_errs[AER_MAX_TYPEOF_UNCOR_ERRS]     # 27
  pdev->aer_stats->dev_nonfatal_errs[AER_MAX_TYPEOF_UNCOR_ERRS]  # 27

And here's the current use of them:

  #define aer_stats_dev_attr(..., stats_array, strings_array, ...)
    for (i = 0; i < ARRAY_SIZE(strings_array); i++) {
      if (strings_array[i])
	sysfs_emit_at(..., strings_array[i], stats[i]);          (1)
      else if (stats[i])
	sysfs_emit_at(..., stats[i]);                            (2)

  aer_stats_dev_attr(..., dev_cor_errs, aer_correctable_error_string,
  aer_stats_dev_attr(..., dev_fatal_errs, aer_uncorrectable_error_string,
  aer_stats_dev_attr(..., dev_nonfatal_errs, aer_uncorrectable_error_string,

The current loop iterates over 0..31, which is safe at (1) because the
non-NULL strings are at aer_correctable_error_string[0..15] and
aer_uncorrectable_error_string[0..26].

But it is unsafe at (2) because it references dev_cor_errs[16..31],
dev_fatal_errs[27..31], and dev_nonfatal_errs[27..31], which are past
the end of the arrays.

> Also, added a build time check to make sure all counters have
> entries in strings array.
>
> Fixes: 0678e3109a3c ("PCI/AER: Simplify __aer_print_error()")

Yep, I blew it there.  Rajat did it correctly when he added this with
81aa5206f9a7 ("PCI/AER: Add sysfs attributes to provide AER stats and
breakdown"), and I broke it by extending the string arrays to 32
entries.

> Cc: stable@vger.kernel.org
> Reported-by: Meeta Saggi <msaggi@purestorage.com>
> Signed-off-by: Mohamed Khalfella <mkhalfella@purestorage.com>
> Reviewed-by: Meeta Saggi <msaggi@purestorage.com>
> Reviewed-by: Eric Badger <ebadger@purestorage.com>
> ---
>  drivers/pci/pcie/aer.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index 9fa1f97e5b27..ce99a6d44786 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -533,7 +533,7 @@ static const char *aer_agent_string[] = {
>  	u64 *stats = pdev->aer_stats->stats_array;			\
>  	size_t len = 0;							\
>  									\
> -	for (i = 0; i < ARRAY_SIZE(strings_array); i++) {		\
> +	for (i = 0; i < ARRAY_SIZE(pdev->aer_stats->stats_array); i++) {\
>  		if (strings_array[i])					\
>  			len += sysfs_emit_at(buf, len, "%s %llu\n",	\
>  					     strings_array[i],		\

I think maybe we should populate the currently NULL entries in the
string[] arrays and simplify the code here, e.g.,

  static const char *aer_correctable_error_string[] = {
        "RxErr",                        /* Bit Position 0       */
        "dev_cor_errs_bit[1]",
	...

  if (stats[i])
    len += sysfs_emit_at(buf, len, "%s %llu\n", strings_array[i], stats[i]);

It's a little more data space, but easier to verify.

> @@ -1342,6 +1342,11 @@ static int aer_probe(struct pcie_device *dev)
>  	struct device *device = &dev->device;
>  	struct pci_dev *port = dev->port;
>  
> +	BUILD_BUG_ON(ARRAY_SIZE(aer_correctable_error_string) <
> +		     AER_MAX_TYPEOF_COR_ERRS);
> +	BUILD_BUG_ON(ARRAY_SIZE(aer_uncorrectable_error_string) <
> +		     AER_MAX_TYPEOF_UNCOR_ERRS);

And make these check for "!=" instead of "<".

  reply	other threads:[~2022-05-10 16:43 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-09 18:14 [PATCH] PCI/AER: Iterate over error counters instead of error strings Mohamed Khalfella
2022-05-10 16:43 ` Bjorn Helgaas [this message]
2022-05-10 21:17   ` [PATCH] PCI/AER: Iterate over error counters instead of error Mohamed Khalfella
2022-06-03 22:12     ` Mohamed Khalfella
2022-06-03 23:58       ` Bjorn Helgaas
2022-06-04 20:30         ` Mohamed Khalfella
2022-07-11 22:54 ` [PATCH] PCI/AER: Iterate over error counters instead of error strings Bjorn Helgaas
2022-07-11 23:02   ` Mohamed Khalfella

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220510164305.GA678149@bhelgaas \
    --to=helgaas@kernel.org \
    --cc=bhelgaas@google.com \
    --cc=ebadger@purestorage.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=mkhalfella@purestorage.com \
    --cc=msaggi@purestorage.com \
    --cc=oohall@gmail.com \
    --cc=rajatja@google.com \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).