From: Cornelia Huck <cohuck@redhat.com>
To: Pierre Morel <pmorel@linux.ibm.com>
Cc: kvm@vger.kernel.org, frankja@linux.ibm.com, david@redhat.com,
thuth@redhat.com, imbrenda@linux.ibm.com
Subject: Re: [kvm-unit-tests PATCH v1 3/6] s390x: lib: css: upgrading IRQ handling
Date: Fri, 19 Mar 2021 12:01:05 +0100 [thread overview]
Message-ID: <20210319120105.182c8684.cohuck@redhat.com> (raw)
In-Reply-To: <1616073988-10381-4-git-send-email-pmorel@linux.ibm.com>
On Thu, 18 Mar 2021 14:26:25 +0100
Pierre Morel <pmorel@linux.ibm.com> wrote:
> Until now we had very few usage of interrupts, to be able to handle
> several interrupts coming up asynchronously we need to take care
> to save the previous interrupt before handling the next one.
An alternative would be to keep I/O interrupts disabled until you are
done with processing any information that might be overwritten.
>
> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
> ---
> lib/s390x/css.h | 29 +++++++++++
> lib/s390x/css_lib.c | 117 ++++++++++++++++++++++++++++++++++----------
> 2 files changed, 120 insertions(+), 26 deletions(-)
>
> diff --git a/lib/s390x/css.h b/lib/s390x/css.h
> index 460b0bd..65fc335 100644
> --- a/lib/s390x/css.h
> +++ b/lib/s390x/css.h
> @@ -425,4 +425,33 @@ struct measurement_block_format1 {
> uint32_t irq_prio_delay_time;
> };
>
> +struct irq_entry {
> + struct irq_entry *next;
> + struct irb irb;
> + uint32_t sid;
I'm wondering whether that set of information make sense for saving.
We basically have some things in the lowcore that get overwritten by
subsequent I/O interrupts (in addition to the sid the intparm and the
interrupt identification word which contains the isc), and the irb,
which only gets overwritten if you do a tsch into the same memory area.
So, if you need to save some things, I'd suggest to add the intparm and
the interrupt identification word to it. Not sure whether the irb can
be handled independently? Need to read code first :)
> +};
(...)
> @@ -422,38 +464,38 @@ static struct irb irb;
> void css_irq_io(void)
> {
> int ret = 0;
> - char *flags;
> - int sid;
> + struct irq_entry *irq;
>
> report_prefix_push("Interrupt");
> - sid = lowcore_ptr->subsys_id_word;
> + irq = alloc_irq();
> + assert(irq);
> +
> + irq->sid = lowcore_ptr->subsys_id_word;
> /* Lowlevel set the SID as interrupt parameter. */
> - if (lowcore_ptr->io_int_param != sid) {
> + if (lowcore_ptr->io_int_param != irq->sid) {
> report(0,
> "io_int_param: %x differs from subsys_id_word: %x",
> - lowcore_ptr->io_int_param, sid);
> + lowcore_ptr->io_int_param, irq->sid);
> goto pop;
> }
> report_prefix_pop();
>
> report_prefix_push("tsch");
> - ret = tsch(sid, &irb);
> + ret = tsch(irq->sid, &irq->irb);
> switch (ret) {
> case 1:
> - dump_irb(&irb);
> - flags = dump_scsw_flags(irb.scsw.ctrl);
> - report(0,
> - "I/O interrupt, but tsch returns CC 1 for subchannel %08x. SCSW flags: %s",
> - sid, flags);
> + report_info("no status pending on %08x : %s", irq->sid,
> + dump_scsw_flags(irq->irb.scsw.ctrl));
This is not what you are looking at here, though?
The problem is that the hypervisor gave you cc 1 (stored, not status
pending) while you just got an interrupt; the previous message logged
that, while the new one does not. (The scsw flags are still
interesting, as it gives further information about the mismatch.)
> break;
> case 2:
> report(0, "tsch returns unexpected CC 2");
> break;
> case 3:
> - report(0, "tsch reporting sch %08x as not operational", sid);
> + report(0, "tsch reporting sch %08x as not operational", irq->sid);
> break;
> case 0:
> /* Stay humble on success */
> + save_irq(irq);
> break;
> }
> pop:
> @@ -498,47 +540,70 @@ struct ccw1 *ccw_alloc(int code, void *data, int count, unsigned char flags)
> int wait_and_check_io_completion(int schid)
> {
> int ret = 0;
> -
> - wait_for_interrupt(PSW_MASK_IO);
> + struct irq_entry *irq = NULL;
>
> report_prefix_push("check I/O completion");
>
> - if (lowcore_ptr->io_int_param != schid) {
> + disable_io_irq();
> + irq = get_irq();
> + while (!irq) {
> + wait_for_interrupt(PSW_MASK_IO);
> + disable_io_irq();
Isn't the disable_io_irq() redundant here?
(In general, I'm a bit confused about the I/O interrupt handling here.
Might need to read through the whole thing again.)
> + irq = get_irq();
> + report_info("next try");
> + }
> + enable_io_irq();
> +
> + assert(irq);
> +
> + if (irq->sid != schid) {
> report(0, "interrupt parameter: expected %08x got %08x",
> - schid, lowcore_ptr->io_int_param);
> + schid, irq->sid);
> ret = -1;
> goto end;
You're still expecting that there's only one subchannel enabled for I/O
interrupts at the same time, right?
> }
>
> /* Verify that device status is valid */
> - if (!(irb.scsw.ctrl & SCSW_SC_PENDING)) {
> - report(0, "No status pending after interrupt. Subch Ctrl: %08x",
> - irb.scsw.ctrl);
> - ret = -1;
> + if (!(irq->irb.scsw.ctrl & SCSW_SC_PENDING)) {
Confused. An I/O interrupt for a subchannel that is not status pending
is surely an issue?
> + ret = 0;
> goto end;
> }
>
> - if (!(irb.scsw.ctrl & (SCSW_SC_SECONDARY | SCSW_SC_PRIMARY))) {
> + /* clear and halt pending are valid even without secondary or primary status */
> + if (irq->irb.scsw.ctrl & (SCSW_FC_CLEAR | SCSW_FC_HALT)) {
Can you factor out the new/changed checks here into a separate patch?
Would make the change easier to follow.
Also, you might want to check other things for halt/clear as well?
> + ret = 0;
> + goto end;
> + }
> +
> + /* For start pending we need at least one of primary or secondary status */
> + if (!(irq->irb.scsw.ctrl & (SCSW_SC_SECONDARY | SCSW_SC_PRIMARY))) {
> report(0, "Primary or secondary status missing. Subch Ctrl: %08x",
> - irb.scsw.ctrl);
> + irq->irb.scsw.ctrl);
I'm wondering whether that is actually true. Maybe need to double check
what happens with deferred ccs etc.
> ret = -1;
> goto end;
> }
>
> - if (!(irb.scsw.dev_stat & (SCSW_DEVS_DEV_END | SCSW_DEVS_SCH_END))) {
> + /* For start pending we also need to have device or channel end information */
> + if (!(irq->irb.scsw.dev_stat & (SCSW_DEVS_DEV_END | SCSW_DEVS_SCH_END))) {
> report(0, "No device end or sch end. Dev. status: %02x",
> - irb.scsw.dev_stat);
> + irq->irb.scsw.dev_stat);
Again, not sure whether that is true in any case (surely for the good
path, and I think for unit check as well; but ISTR that there can be
error conditions where we won't get another interrupt for the same I/O,
but device end is not set, because the error occurred before we even
reached the device... should those be logged?)
> ret = -1;
> goto end;
> }
>
> - if (irb.scsw.sch_stat & ~SCSW_SCHS_IL) {
> - report_info("Unexpected Subch. status %02x", irb.scsw.sch_stat);
> + /* We only accept the SubCHannel Status for Illegal Length */
It's more like that we just don't deal with any of the other subchannel
status flags, right?
> + if (irq->irb.scsw.sch_stat & ~SCSW_SCHS_IL) {
> + report_info("Unexpected Subch. status %02x",
> + irq->irb.scsw.sch_stat);
> ret = -1;
> goto end;
> }
>
> end:
> + if (ret)
> + dump_irb(&irq->irb);
> +
> + put_irq(irq);
> report_prefix_pop();
> return ret;
> }
next prev parent reply other threads:[~2021-03-19 11:02 UTC|newest]
Thread overview: 25+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-03-18 13:26 [kvm-unit-tests PATCH v1 0/6] Testing SSCH, CSCH and HSCH for errors Pierre Morel
2021-03-18 13:26 ` [kvm-unit-tests PATCH v1 1/6] s390x: lib: css: disabling a subchannel Pierre Morel
2021-03-19 9:02 ` Janosch Frank
2021-03-19 9:11 ` Pierre Morel
2021-03-19 10:03 ` Cornelia Huck
2021-03-19 10:11 ` Pierre Morel
2021-03-19 15:29 ` Pierre Morel
2021-03-18 13:26 ` [kvm-unit-tests PATCH v1 2/6] s390x: lib: css: SCSW bit definitions Pierre Morel
2021-03-19 10:16 ` Cornelia Huck
2021-03-19 15:30 ` Pierre Morel
2021-03-18 13:26 ` [kvm-unit-tests PATCH v1 3/6] s390x: lib: css: upgrading IRQ handling Pierre Morel
2021-03-18 14:22 ` Pierre Morel
2021-03-19 11:01 ` Cornelia Huck [this message]
2021-03-19 15:55 ` Pierre Morel
2021-03-19 16:09 ` Cornelia Huck
2021-03-19 16:34 ` Pierre Morel
2021-03-18 13:26 ` [kvm-unit-tests PATCH v1 4/6] s390x: lib: css: add expectations to wait for interrupt Pierre Morel
2021-03-19 9:09 ` Janosch Frank
2021-03-19 9:50 ` Pierre Morel
2021-03-19 11:23 ` Cornelia Huck
2021-03-19 16:18 ` Pierre Morel
2021-03-18 13:26 ` [kvm-unit-tests PATCH v1 5/6] s390x: css: testing ssch error response Pierre Morel
2021-03-19 9:18 ` Janosch Frank
2021-03-19 10:02 ` Pierre Morel
2021-03-18 13:26 ` [kvm-unit-tests PATCH v1 6/6] s390x: css: testing clear and halt subchannel Pierre Morel
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20210319120105.182c8684.cohuck@redhat.com \
--to=cohuck@redhat.com \
--cc=david@redhat.com \
--cc=frankja@linux.ibm.com \
--cc=imbrenda@linux.ibm.com \
--cc=kvm@vger.kernel.org \
--cc=pmorel@linux.ibm.com \
--cc=thuth@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox