From: Josef Bacik <jbacik@fb.com>
To: "Luck, Tony" <tony.luck@intel.com>
Cc: <bp@alien8.de>, <tglx@linutronix.de>, <mingo@redhat.com>,
	<hpa@zytor.com>, <x86@kernel.org>, <linux-edac@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] mce-apei: do not rely on ACPI_ERST_GET_RECORD_ID for record id
Date: Thu, 3 Mar 2016 16:53:16 -0500
Message-ID: <56D8B24C.9080003@fb.com>
In-Reply-To: <20160303214945.GA11233@intel.com>

On 03/03/2016 04:49 PM, Luck, Tony wrote:
>>   retry:
>> -	rc = erst_get_record_id_next(&pos, record_id);
>> -	if (rc)
>> -		goto out;
>> +	/*
>> +	 * Some hardware is broken and doesn't actually advance the record id
>
> I'd blame this on firmware rather than hardware.

Yup, sorry, I misspoke.

>
>> +	 * returned by ACPI_ERST_GET_RECORD_ID when we read a record like the
>> +	 * spec says it is supposed to.  So instead use record_id == 0 to just
>> +	 * grab the first record in the erst, and fall back only if we trip over
>> +	 * a record that isn't a MCE record.
>> +	 */
>> +	if (lookup_record) {
>> +		rc = erst_get_record_id_next(&pos, record_id);
>> +		if (rc)
>> +			goto out;
>> +	} else {
>> +		*record_id = 0;
>> +	}
>>   	/* no more record */
>>   	if (*record_id == APEI_ERST_INVALID_RECORD_ID)
>>   		goto out;
>>   	rc = erst_read(*record_id, &rcd.hdr, sizeof(rcd));
>> -	/* someone else has cleared the record, try next one */
>> -	if (rc == -ENOENT)
>> -		goto retry;
>> -	else if (rc < 0)
>> +	/*
>> +	 * someone else has cleared the record, try next one if we are looking
>> +	 * up records.  If we aren't looking up the record id's then just bail
>> +	 * since this means we have an empty table.
>> +	 */
>> +	if (rc == -ENOENT) {
>> +		if (lookup_record)
>> +			goto retry;
>> +		rc = 0;
>> +		goto out;
>> +	} else if (rc < 0) {
>>   		goto out;
>> -	/* try to skip other type records in storage */
>> -	else if (rc != sizeof(rcd) ||
>> -		 uuid_le_cmp(rcd.hdr.creator_id, CPER_CREATOR_MCE))
>> +	} else if (rc != sizeof(rcd) ||
>> +		 uuid_le_cmp(rcd.hdr.creator_id, CPER_CREATOR_MCE)) {
>> +		/* try to skip other type records in storage */
>> +		lookup_record = true;
>>   		goto retry;
>
> Are you still doomed by the buggy firmware if we take this "goto"?
> You'd be back at the top of the loop expecting erst_get_record_id_next()
> to move on.  Does this just not happen in practice (finding non-MCE
> records in amongst the MCE ones)?
>

One of our boxes did have a non-MCE record mixed in with the MCE records,
but yes, that box also had the broken firmware.  I'm not too worried about
that case for us; I just want a fallback so that firmware that isn't broken
in this way can still skip over non-MCE records.  Thanks,

Josef
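
To make the concern raised above concrete, here is a toy user-space model of
the patched retry loop.  It is only a sketch under stated assumptions, not the
kernel code: struct store, store_next_id(), store_read(), store_clear() and
read_one_mce() are hypothetical stand-ins for the ERST backend and the patched
read path, the "buggy" flag simply models a record-id query that never
advances, and max_tries exists only so the demonstration terminates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define INVALID_ID UINT64_MAX	/* stand-in for APEI_ERST_INVALID_RECORD_ID */
#define ERR_NOENT  (-2)		/* stand-in for -ENOENT */

enum creator { CREATOR_MCE, CREATOR_OTHER };

struct record {
	uint64_t id;		/* non-zero; 0 means "first record" on read */
	enum creator creator;
	bool present;		/* false once the record has been cleared */
};

/* Hypothetical model of the record store behind the firmware interface. */
struct store {
	struct record recs[4];
	size_t n;
	size_t cursor;		/* position used by store_next_id() */
	bool buggy;		/* true: the id query never advances */
};

/* Return the next record id, or INVALID_ID when there are no more. */
uint64_t store_next_id(struct store *s)
{
	while (s->cursor < s->n && !s->recs[s->cursor].present)
		s->cursor++;
	if (s->cursor >= s->n)
		return INVALID_ID;
	uint64_t id = s->recs[s->cursor].id;
	if (!s->buggy)
		s->cursor++;	/* sane firmware moves on to the next record */
	return id;
}

/* Read a record by id; id 0 selects the first record still present. */
int store_read(const struct store *s, uint64_t id, struct record *out)
{
	for (size_t i = 0; i < s->n; i++) {
		if (!s->recs[i].present)
			continue;
		if (id == 0 || s->recs[i].id == id) {
			*out = s->recs[i];
			return 0;
		}
	}
	return ERR_NOENT;
}

/* Mark the record with the given id as cleared. */
void store_clear(struct store *s, uint64_t id)
{
	for (size_t i = 0; i < s->n; i++)
		if (s->recs[i].id == id)
			s->recs[i].present = false;
}

/*
 * Mirrors the control flow of the quoted patch.
 * Returns 1 if an MCE record was found (*record_id is set to its id),
 * 0 if there is nothing to read, -1 if max_tries was exhausted.
 */
int read_one_mce(struct store *s, uint64_t *record_id, int max_tries)
{
	bool lookup_record = false;
	struct record rcd;

	assert(record_id != NULL);
	for (int tries = 0; tries < max_tries; tries++) {
		*record_id = lookup_record ? store_next_id(s) : 0;
		if (*record_id == INVALID_ID)
			return 0;		/* no more records */
		if (store_read(s, *record_id, &rcd) == ERR_NOENT) {
			if (lookup_record)
				continue;	/* cleared by someone else */
			return 0;		/* empty table */
		}
		if (rcd.creator != CREATOR_MCE) {
			lookup_record = true;	/* fall back to id lookup */
			continue;
		}
		*record_id = rcd.id;
		return 1;
	}
	return -1;
}
```

In this model, a store holding only MCE records is drained correctly even with
buggy set, as long as the caller clears each record after reading it, because
every read uses record id 0.  A store whose first record is not an MCE record
gets stuck in the fallback path when buggy is set (read_one_mce() runs out of
tries), while with buggy unset the fallback skips the foreign record and
returns the MCE record behind it.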

Thread overview: 3+ messages
2016-03-03 19:03 [PATCH] mce-apei: do not rely on ACPI_ERST_GET_RECORD_ID for record id Josef Bacik
2016-03-03 21:49 ` Luck, Tony
2016-03-03 21:53   ` Josef Bacik [this message]
