From: Anthony Krowiak <akrowiak@linux.ibm.com>
To: Halil Pasic <pasic@linux.ibm.com>,
Harald Freudenberger <freude@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>,
linux-s390@vger.kernel.org, linux-kernel@vger.kernel.org,
kvm@vger.kernel.org, jjherne@linux.ibm.com,
alex.williamson@redhat.com, kwankhede@nvidia.com,
frankja@linux.ibm.com, imbrenda@linux.ibm.com, david@redhat.com,
Reinhard Buendgen <BUENDGEN@de.ibm.com>
Subject: Re: [PATCH] s390/vfio-ap: handle response code 01 on queue reset
Date: Thu, 7 Dec 2023 10:31:06 -0500
Message-ID: <483f23b2-0c88-49e2-8b40-7b17cd2b46cc@linux.ibm.com>
In-Reply-To: <20231206181727.376c3d67.pasic@linux.ibm.com>
On 12/6/23 12:17 PM, Halil Pasic wrote:
> On Tue, 05 Dec 2023 09:04:23 +0100
> Harald Freudenberger <freude@linux.ibm.com> wrote:
>
>> On 2023-12-04 17:15, Halil Pasic wrote:
>>> On Mon, 4 Dec 2023 16:16:31 +0100
>>> Christian Borntraeger <borntraeger@linux.ibm.com> wrote:
>>>
>>>> Am 04.12.23 um 15:53 schrieb Tony Krowiak:
>>>>>
>>>>> On 11/29/23 12:12, Christian Borntraeger wrote:
>>>>>> Am 29.11.23 um 15:35 schrieb Tony Krowiak:
>>>>>>> In the current implementation, response code 01 (AP queue number not valid)
>>>>>>> is handled as a default case along with other response codes returned from
>>>>>>> a queue reset operation that are not handled specifically. Barring a bug,
>>>>>>> response code 01 will occur only when a queue has been externally removed
>>>>>>> from the host's AP configuration; in this case, the queue must
>>>>>>> be reset by the machine in order to avoid leaking crypto data if/when the
>>>>>>> queue is returned to the host's configuration. The response code 01 case
>>>>>>> will be handled specifically by logging a WARN message followed by cleaning
>>>>>>> up the IRQ resources.
>>>>>>>
>>>>>> To me it looks like this can be triggered by the LPAR admin, correct? So it
>>>>>> is not desirable but possible.
>>>>>> In that case I prefer to not use WARN, maybe use dev_warn or dev_err instead.
>>>>>> WARN can be a disruptive event if panic_on_warn is set.
>>>>> Yes, it can be triggered by the LPAR admin. I can't use dev_warn here because we don't have a reference to any device, but I can use pr_warn if that suffices.
>>>> Ok, please use pr_warn then.
>>> Shouldn't we rather make this an 'info'? I mean we probably do not want
>>> people complaining about this condition. Yes it should be a best
>>> practice to coordinate such things with the guest, and ideally remove the
>>> resource from the guest first. But AFAIU our stack is supposed to be able to
>>> handle something like this. IMHO issuing a warning is an excessive
>>> measure.
>>> I know Reinhard and Tony probably disagree with the last sentence
>>> though.
>> Halil, Tony, the thing about info versus warning versus error is our
>> own stuff. Keep in mind that these messages end up in the "debug feature"
>> as FFDC data. So it comes to the point which FFDC data do you/Tony want to
>> see there? It should be enough to explain to a customer what happened
>> without the need to "recreate with higher debug level" if something serious
>> happened. So my private decision table is:
>> 1) is it something serious, something exceptional, something which may not
>>    come up again if tried to recreate? Yes -> make it visible on the first
>>    occurrence as error msg.
>> 2) is it something you want to read when a customer hits it and you tell him
>>    to extract and examine the debug feature data? Yes -> make it a warning
>>    and make sure your debug feature by default records warnings.
>> 3) still serious, but may flood the debug feature. Good enough and high
>>    probability to reappear on a recreate? Yes -> make it an info message
>>    and live with the risk that you may not be able to explain to a customer
>>    what happened without a recreate and higher debug level.
>> 4) not 1-3 -> maybe a debug msg but still think about what happens when a
>>    customer enables "debug feature" with highest level. Does it squeeze out
>>    more important stuff? Maybe make it dynamic debug with pr_debug() (see
>>    kernel docu admin-guide/dynamic-debug-howto.rst).
> AFAIU the default log level of the S390 Debug Feature is 3, that is,
> error. So warnings do not help us there by default. And if we are
> already asking the reporter to crank up the loglevel of the debug
> feature, we can ask the reporter to crank it up to 5, assuming there
> is not too much stuff at log level 5 in that area... How much
> info stuff do we have for the 'ap' debug facility (I hope
> that is the facility used by vfio_ap)?
No info logging is done via the S390 Debug Feature in vfio_ap. There are
a few warning messages logged solely in the handle_pqap and
vfio_ap_irq_enable functions. The question is, why are we talking about
the S390 Debug Feature given the discussion is about using pr_warn
versus pr_info? What am I missing here?
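
To make the question concrete, here is a rough userspace sketch of the
choice being discussed. It is not the patch and not kernel code: the enum
names, the reset_rc_log_level() helper and the rc01_is_warning flag are
made up for illustration. Only the value 01 for "AP queue number not
valid" comes from the patch description; the value 00 for a normal
completion is an assumption.

```c
/* Response codes for a queue reset; names are hypothetical, values assumed. */
enum sketch_rc {
	SKETCH_RC_NORMAL      = 0x00, /* assumed: reset completed normally */
	SKETCH_RC_Q_NOT_AVAIL = 0x01, /* "AP queue number not valid" */
};

/* Hypothetical stand-ins for the printk severities under discussion. */
enum sketch_log { SKETCH_LOG_NONE, SKETCH_LOG_INFO, SKETCH_LOG_WARN };

/*
 * Hypothetical helper: pick the severity used to report a reset response
 * code. rc01_is_warning selects between the two proposals in this thread
 * (pr_warn versus pr_info) for response code 01.
 */
static enum sketch_log reset_rc_log_level(unsigned int rc, int rc01_is_warning)
{
	switch (rc) {
	case SKETCH_RC_NORMAL:
		return SKETCH_LOG_NONE;
	case SKETCH_RC_Q_NOT_AVAIL:
		/* Queue externally removed from the host's AP configuration. */
		return rc01_is_warning ? SKETCH_LOG_WARN : SKETCH_LOG_INFO;
	default:
		/* Any response code not handled specifically. */
		return SKETCH_LOG_WARN;
	}
}
```

Either way the IRQ resources would still be cleaned up; the flag only
changes how loudly the condition is reported.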
>
> I think log levels are supposed to be primarily about severity, and
> I'm not sure that a queue becoming unavailable in G1 without
> first re-configuring the G2 so that it no longer has access to the
> given queue is really a warning severity thing. IMHO if we
> really do want people complaining about this should they ever see it,
> yes it should be a warning. If not then probably not.
>
> Regards,
> Halil
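
The point quoted above about the default level of the S390 Debug Feature
can be sketched as a simple filter. This takes from the thread that level
3 corresponds to error and level 5 to info. It additionally assumes that
warnings sit at level 4 and that an entry is recorded only when its level
is less than or equal to the active level. The names below are
hypothetical and are not the debug feature API.

```c
/* Levels as described in this thread; the value for warning is assumed. */
enum sketch_dbf_level {
	SKETCH_LVL_ERR  = 3,
	SKETCH_LVL_WARN = 4, /* assumption: between error and info */
	SKETCH_LVL_INFO = 5,
};

/*
 * Hypothetical filter: returns 1 if an entry logged at entry_level would
 * be recorded while the facility runs at active_level, 0 otherwise.
 */
static int sketch_would_record(int active_level, int entry_level)
{
	return entry_level <= active_level;
}
```

Under these assumptions a warning is dropped at the default level of 3 and
only shows up once the level is raised to 4 or higher.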
Thread overview: 13+ messages
2023-11-29 14:35 [PATCH] s390/vfio-ap: handle response code 01 on queue reset Tony Krowiak
2023-11-29 17:12 ` Christian Borntraeger
2023-12-04 14:53 ` Tony Krowiak
2023-12-04 15:16 ` Christian Borntraeger
2023-12-04 16:15 ` Halil Pasic
2023-12-04 17:05 ` Tony Krowiak
2023-12-05 8:04 ` Harald Freudenberger
2023-12-06 17:17 ` Halil Pasic
2023-12-07 15:31 ` Anthony Krowiak [this message]
2023-12-04 12:10 ` Halil Pasic
2023-12-04 17:51 ` Tony Krowiak
2023-12-04 22:05 ` Halil Pasic
2024-01-09 17:02 ` Anthony Krowiak