From: Stewart Smith <stewart@linux.vnet.ibm.com>
To: Michael Ellerman <mpe@ellerman.id.au>,
Vipin K Parashar <vipin@linux.vnet.ibm.com>,
linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH] powernv/opal: Handle OPAL_WRONG_STATE error from OPAL fails
Date: Thu, 23 Feb 2017 14:52:33 +1100
Message-ID: <871supftry.fsf@linux.vnet.ibm.com>
In-Reply-To: <87r32th2rt.fsf@concordia.ellerman.id.au>

Michael Ellerman <mpe@ellerman.id.au> writes:
> Stewart Smith <stewart@linux.vnet.ibm.com> writes:
>
>> Vipin K Parashar <vipin@linux.vnet.ibm.com> writes:
>>> On Monday 13 February 2017 06:13 AM, Michael Ellerman wrote:
>>>> Vipin K Parashar <vipin@linux.vnet.ibm.com> writes:
>>>>
>>>>> OPAL returns OPAL_WRONG_STATE for XSCOM operations
>>>>> done to read any core FIR which is sleeping, offline.
>>>> OK.
>>>>
>>>> Do we know why Linux is causing that to happen?
>>>
>>> This issue was originally seen when running STAF (Software Test
>>> Automation Framework) stress tests and off-lining some cores
>>> with the stress tests running.
>>>
>>> It can also be re-created by off-lining a few cores and then using
>>> one of the methods below.
>>> 1. Executing the Linux "sensors" command
>>> 2. Reading the contents of the file /sys/class/hwmon/hwmon0/tempX_input,
>>> where X is an offline CPU.
>>>
>>> It's the "opal_get_sensor_data" Linux API that triggers the
>>> OPAL call "opal_sensor_read", which performs XSCOM ops here.
>>> If the core is found sleeping/offline, Linux prints the
>>> "opal_error_code: Unexpected OPAL error" error on the console.
>>>
>>> Currently Linux isn't aware of the OPAL_WRONG_STATE return code
>>> from OPAL. Thus it prints the "Unexpected OPAL error" message, the
>>> same as it would log for any unknown OPAL return code.
>>>
>>> Seeing this error on the console has been a concern for Test and
>>> would puzzle a real user as well. This patch makes Linux aware of the
>>> OPAL_WRONG_STATE return code from OPAL and stops printing the
>>> "Unexpected OPAL error" message on the console for OPAL failures
>>> with OPAL_WRONG_STATE.
>>
>> Ahh... so this is a DTS sensor, which indeed is just XSCOMs and we
>> return the xscom_read return code in event of error.
>>
>> I would argue that converting to EIO in that instance is probably
>> correct... or EAGAIN? EAGAIN may be more correct in the situation where
>> the core is just sleeping.
>>
>> What kind of offlining are you doing?
>>
>> Arguably, the correct behaviour would be to remove said sensors when the
>> core is offline.
>
> Right, that would be ideal. There appear to be at least two other hwmon
> drivers that are CPU hotplug aware (coretemp and via-cputemp).
>
> But perhaps it's not possible to work out which sensors are attached to
> which CPU etc., I haven't looked in detail.

Each core-temp@ sensor has an ibm,pir property, so linking back to the
corresponding core shouldn't be too hard. For mem-temp@ sensors, we have
the chip-id.

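As a rough illustration of that lookup, here is a userspace-only sketch,
not kernel code. The struct cpu_state table and both function names are
made up for the example, and it assumes the sensor's ibm,pir value can be
compared directly against a per-CPU hardware id, which would need
checking against the real device tree:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for whatever per-CPU state the driver can see. */
struct cpu_state {
	uint32_t pir;	/* hardware id, assumed comparable to ibm,pir */
	bool online;
};

/* Return the index of the CPU whose PIR matches, or -1 if there is none. */
static int cpu_for_pir(const struct cpu_state *cpus, size_t n, uint32_t pir)
{
	for (size_t i = 0; i < n; i++)
		if (cpus[i].pir == pir)
			return (int)i;
	return -1;
}

/* A core-temp sensor would only be exposed while its core is online. */
static bool sensor_visible(const struct cpu_state *cpus, size_t n,
			   uint32_t sensor_pir)
{
	int cpu = cpu_for_pir(cpus, n, sensor_pir);

	return cpu >= 0 && cpus[cpu].online;
}
```

A hotplug-aware driver would re-evaluate something like sensor_visible()
from its CPU online/offline callbacks rather than on every read.
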
> In that case changing just opal_get_sensor_data() to handle
> OPAL_WRONG_STATE would be OK, with a comment explaining that we might be
> asked to read a sensor on an offline CPU and we aren't able to detect
> that.

Agree.

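For what it's worth, a minimal userspace sketch of that approach might
look like the following. It is not the patch under discussion: the
constants are stand-ins for the definitions in the kernel's opal-api.h,
the function-pointer parameter and the fake_* stubs are hypothetical
replacements for the real opal_sensor_read() call, -EIO is just one of
the options floated earlier in this thread, and the default branch is a
placeholder for the existing opal_error_code() path:

```c
#include <errno.h>
#include <stdint.h>

/* Stand-in constants; the real definitions live in opal-api.h. */
#define OPAL_SUCCESS		0
#define OPAL_WRONG_STATE	(-14)

/* Hypothetical replacement for the opal_sensor_read() firmware call. */
typedef int64_t (*sensor_read_fn)(uint32_t handle, uint32_t *data);

static int get_sensor_data_sketch(sensor_read_fn read_fn, uint32_t handle,
				  uint32_t *out)
{
	uint32_t data = 0;
	int64_t rc = read_fn(handle, &data);

	switch (rc) {
	case OPAL_SUCCESS:
		*out = data;
		return 0;
	case OPAL_WRONG_STATE:
		/*
		 * We might be asked to read a sensor on an offline or
		 * sleeping core and we aren't able to detect that up
		 * front, so fail quietly instead of logging an
		 * "Unexpected OPAL error".
		 */
		return -EIO;
	default:
		/* Placeholder: the kernel would call opal_error_code(). */
		return -EINVAL;
	}
}

/* Example stubs: one successful read, one read of an offline core. */
static int64_t fake_read_ok(uint32_t handle, uint32_t *data)
{
	(void)handle;
	*data = 42;
	return OPAL_SUCCESS;
}

static int64_t fake_read_offline(uint32_t handle, uint32_t *data)
{
	(void)handle;
	(void)data;
	return OPAL_WRONG_STATE;
}
```

The point of handling it here rather than in opal_error_code() is that
only the sensor path knows OPAL_WRONG_STATE is an expected outcome; other
callers still get the warning for a code they don't anticipate.
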
--
Stewart Smith
OPAL Architect, IBM.
Thread overview: 11+ messages
2016-12-20 14:16 [PATCH] powernv/opal: Handle OPAL_WRONG_STATE error from OPAL fails Vipin K Parashar
2016-12-21 5:24 ` Mukesh Ojha
2017-01-27 0:17 ` Michael Ellerman
2017-01-27 6:48 ` Vipin K Parashar
2017-02-13 0:43 ` Michael Ellerman
2017-02-15 5:01 ` Stewart Smith
2017-02-15 20:12 ` Vipin K Parashar
2017-02-16 0:52 ` Stewart Smith
2017-02-20 5:03 ` Michael Ellerman
2017-02-23 3:52 ` Stewart Smith [this message]
2017-02-28 9:20 ` Vipin K Parashar