Re: [PATCH RFC] vfio-ccw: forward halt/clear errors

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Cornelia Huck <cohuck@redhat.com>
To: Eric Farman <farman@linux.ibm.com>
Cc: Halil Pasic <pasic@linux.ibm.com>,
	qemu-s390x@nongnu.org, qemu-devel@nongnu.org,
	Matthew Rosato <mjrosato@linux.ibm.com>
Subject: Re: [PATCH RFC] vfio-ccw: forward halt/clear errors
Date: Mon, 17 May 2021 19:31:45 +0200	[thread overview]
Message-ID: <20210517193145.5a13ae1c.cohuck@redhat.com> (raw)
In-Reply-To: <12fd64d4d368230b69f47a6ed67049b67553717b.camel@linux.ibm.com>

On Wed, 12 May 2021 16:19:15 -0400
Eric Farman <farman@linux.ibm.com> wrote:

> On Tue, 2021-05-11 at 17:11 +0200, Cornelia Huck wrote:
> > hsch and csch basically have two parts: execute the command,
> > and perform the halt/clear function. For fully emulated
> > subchannels, it is pretty clear how it will work: check the
> > subchannel state, and actually 'perform the halt/clear function'
> > and set cc 0 if everything looks good.
> > 
> > For passthrough subchannels, some of the checking is done
> > within QEMU, but some has to be done within the kernel. QEMU's
> > subchannel state may be such that we can perform the async
> > function, but the kernel may still get a cc != 0 when it is
> > actually executing the instruction. In that case, we need to
> > set the condition actually encountered by the kernel; if we
> > set cc 0 on error, we would actually need to inject an interrupt
> > as well.
> > 
> > Signed-off-by: Cornelia Huck <cohuck@redhat.com>
> > ---
> > 
> > Stumbled over this during the vfio-ccw kernel locking discussions.
> > 
> > This is probably a corner case, and I'm not sure how I can actually
> > get this path excercised, but it passes my smoke tests.  
> 
> I'll see if I can hammer my way into some of this.

Thanks, that would be cool.

> 
> > 
> > Not sure whether this is the way to go.   
> 
> I think it seems reasonable.
> 
> > The unit exceptions in the
> > halt/clear error paths also seem slightly fishy.  
> 
> It is peculiar. Looking at the old POPS references, the unit exception
> states that the --device-- detected something unusual, not really the
> subchannel (which is how vfio-ccw is behaving). But, providing some
> indication that something went seriously wrong is good. Which I guess
> was the point of the UE code, even though it's obviously set up to be
> generated after a failure on the START.

Yes, I had copied it over when I wired up halt/clear.

> 
> I guess at the least, we need to clean up the FCTL based on the
> function that failed, rather than only cleaning up the START function.

Makes sense.

> The UE itself may just be an extra "hey this is busted" indicator.

It still feels a bit odd, but I'm not sure that there's a better
alternative.

> 
> > 
> > ---
> >  hw/s390x/css.c | 34 ++++++++++++++++++++++++++++++----
> >  hw/vfio/ccw.c  |  4 ++--
> >  2 files changed, 32 insertions(+), 6 deletions(-)
> > 
> > diff --git a/hw/s390x/css.c b/hw/s390x/css.c
> > index bed46f5ec3a2..ce2e903ca25a 100644
> > --- a/hw/s390x/css.c
> > +++ b/hw/s390x/css.c
> > @@ -1206,23 +1206,49 @@ static void
> > sch_handle_start_func_virtual(SubchDev *sch)
> >  
> >  }
> >  
> > -static void sch_handle_halt_func_passthrough(SubchDev *sch)
> > +static IOInstEnding sch_handle_halt_func_passthrough(SubchDev *sch)
> >  {
> >      int ret;
> >  
> >      ret = s390_ccw_halt(sch);
> >      if (ret == -ENOSYS) {
> >          sch_handle_halt_func(sch);
> > +        return IOINST_CC_EXPECTED;  
> 
> This is the fallback, so makes sense. You could fold it into the switch
> table, but since that's for "stuff from the kernel" versus -ENOSYS says
> "there's no way to call the kernel" I guess this is fine too.

Yes, that was the reason I kept that separate. (I don't expect it to be
a heavily exercised path nowadays anyway.)

> 
> > +    }
> > +    /*
> > +     * Some conditions may have been detected prior to starting the
> > halt
> > +     * function; map them to the correct cc.
> > +     */
> > +    switch (ret) {
> > +    case -EBUSY:
> > +        return IOINST_CC_BUSY;
> > +    case -ENODEV:
> > +    case -EACCES:
> > +        return IOINST_CC_NOT_OPERATIONAL;
> > +    default:
> > +        return IOINST_CC_EXPECTED;
> >      }
> >  }

(...)

> > diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c
> > index e752c845e9e4..39275a917bd2 100644
> > --- a/hw/vfio/ccw.c
> > +++ b/hw/vfio/ccw.c
> > @@ -199,7 +199,7 @@ again:  
> 
> // This is for CLEAR
> 
> >      case 0:
> >      case -ENODEV:
> >      case -EACCES:
> > -        return 0;
> > +        return ret;
> >      case -EFAULT:
> >      default:
> >          sch_gen_unit_exception(sch);
> > @@ -240,7 +240,7 @@ again:  
> 
> // This is for HALT
> 
> >      case -EBUSY:
> >      case -ENODEV:
> >      case -EACCES:
> > -        return 0;
> > +        return ret;  
> 
> Aside: How could we get EACCES for either HALT or CLEAR? I only see
> that set in the normal request path, if we got a CC3 on the SSCH.

We should only get -ENODEV, which basically indicates cc3 on the kernel
side. For start, the kernel distinguishes between "you didn't specify a
path in the orb that's actually available" and "device not
operational", but we map everything to cc 3 in QEMU (and I don't think
there's anything else we could do with that information anyway.)

> 
> Can we scrub them, or do we need to update kernel
> Documentation/s390/vfio-ccw.rst ? :)

We could remove them, or log something if they come up unexpectedly;
but their presence does not really hurt, either.

     prev parent reply	other threads:[~2021-05-17 17:37 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-11 15:11 [PATCH RFC] vfio-ccw: forward halt/clear errors Cornelia Huck
2021-05-12 20:19 ` Eric Farman
2021-05-17 17:31   ` Cornelia Huck [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210517193145.5a13ae1c.cohuck@redhat.com \
    --to=cohuck@redhat.com \
    --cc=farman@linux.ibm.com \
    --cc=mjrosato@linux.ibm.com \
    --cc=pasic@linux.ibm.com \
    --cc=qemu-devel@nongnu.org \
    --cc=qemu-s390x@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.