linux-kernel.vger.kernel.org archive mirror
From: Pierre Morel <pmorel@linux.ibm.com>
To: Cornelia Huck <cohuck@redhat.com>
Cc: Pierre Morel <pmorel@linux.vnet.ibm.com>,
	pasic@linux.vnet.ibm.com, bjsdjshi@linux.vnet.ibm.com,
	linux-s390@vger.kernel.org, linux-kernel@vger.kernel.org,
	kvm@vger.kernel.org
Subject: Re: [PATCH v2 08/10] vfio: ccw: Handling reset and shutdown with states
Date: Tue, 5 Jun 2018 18:40:48 +0200
Message-ID: <c03e0ce2-7f14-fcf1-4fa9-f67df4b8046b@linux.ibm.com>
In-Reply-To: <20180605172708.24bb7af2.cohuck@redhat.com>

On 05/06/2018 17:27, Cornelia Huck wrote:
> On Tue, 5 Jun 2018 16:10:52 +0200
> Pierre Morel <pmorel@linux.ibm.com> wrote:
>
>> On 05/06/2018 14:18, Cornelia Huck wrote:
>>> On Fri, 25 May 2018 12:21:16 +0200
>>> Pierre Morel <pmorel@linux.vnet.ibm.com> wrote:
>>>> +static int fsm_online(struct vfio_ccw_private *private)
>>>> +{
>>>> +	struct subchannel *sch = private->sch;
>>>> +	int ret = VFIO_CCW_STATE_IDLE;
>>>> +
>>>> +	spin_lock_irq(sch->lock);
>>>> +	if (cio_enable_subchannel(sch, (u32)(unsigned long)sch))
>>>> +		ret = VFIO_CCW_STATE_NOT_OPER;
>>>> +	spin_unlock_irq(sch->lock);
>>>> +
>>>> +	return ret;
>>>> +}
>>>> +static int fsm_offline(struct vfio_ccw_private *private)
>>>> +{
>>>> +	struct subchannel *sch = private->sch;
>>>> +	int ret = VFIO_CCW_STATE_STANDBY;
>>>> +
>>>> +	spin_lock_irq(sch->lock);
>>>> +	if (cio_disable_subchannel(sch))
>>>> +		ret = VFIO_CCW_STATE_NOT_OPER;
>>> So, what about a subchannel that is busy? Why should it go to the not
>>> oper state?
>> right, thanks.
>>
>>> (And you should try to flush pending I/O and then try again in that
>>> case. Otherwise, you may have a still-enabled subchannel which may
>>> throw an interrupt.)
>> What about letting the guest do this?
>> After giving it the right information on what happened, of course.
> Why should the guest know anything about this? Getting the device to a
> usable state respectively cleaning up is the responsibility of the host
> code. This processing will happen before the guest gets use of the
> device or after it has lost use of it already (or it is some internal
> handling like reset, which the guest should not be made aware of).

Hum, I am not inspired today,
sorry, I should have taken a day to recover from holidays. :)
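
To make sure I understood the point, here is a minimal userspace sketch
of keeping the busy case inside the host. It is not driver code:
sim_disable() and sim_flush_one() are hypothetical stand-ins for
cio_disable_subchannel() and for flushing pending I/O, and the state
values are arbitrary. It only assumes that a busy subchannel reports
-EBUSY and a missing one reports -ENODEV.

```c
#include <assert.h>
#include <errno.h>

/* State names follow the patch; the numeric values are arbitrary here. */
enum sim_state {
	SIM_STATE_NOT_OPER,
	SIM_STATE_STANDBY,
	SIM_STATE_IDLE,
};

/* Hypothetical stand-in for a subchannel that may have pending work. */
struct sim_sch {
	int operational;	/* 0: the device is gone */
	int pending;		/* outstanding I/O or status to flush */
	int enabled;
};

/* Stub for cio_disable_subchannel(): -EBUSY while work is pending. */
static int sim_disable(struct sim_sch *sch)
{
	if (!sch->operational)
		return -ENODEV;
	if (sch->pending)
		return -EBUSY;
	sch->enabled = 0;
	return 0;
}

/* Stub for flushing: each call retires one pending item. */
static void sim_flush_one(struct sim_sch *sch)
{
	if (sch->pending > 0)
		sch->pending--;
}

/*
 * Offline handling kept entirely in the host: a busy subchannel is
 * flushed and the disable is retried. Real code would wait for the
 * interrupt between attempts instead of looping directly.
 */
static enum sim_state sim_offline(struct sim_sch *sch)
{
	int ret = sim_disable(sch);

	while (ret == -EBUSY) {
		sim_flush_one(sch);
		ret = sim_disable(sch);
	}
	return ret ? SIM_STATE_NOT_OPER : SIM_STATE_STANDBY;
}
```

With this shape only a subchannel that is really gone ends up in
NOT_OPER; a busy one is retried until it can be disabled and then lands
in STANDBY, without the guest being involved.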

>
>>>   
>>>> +	spin_unlock_irq(sch->lock);
>>>> +	if (private->completion)
>>>> +		complete(private->completion);
>>>> +
>>>> +	return ret;
>>>> +}
>>>> +static int fsm_quiescing(struct vfio_ccw_private *private)
>>>> +{
>>>> +	struct subchannel *sch = private->sch;
>>>> +	int ret = VFIO_CCW_STATE_STANDBY;
>>>> +	int iretry = 255;
>>>> +
>>>> +	spin_lock_irq(sch->lock);
>>>> +	ret = cio_cancel_halt_clear(sch, &iretry);
>>>> +	if (ret == -EBUSY)
>>>> +		ret = VFIO_CCW_STATE_QUIESCING;
>>>> +	else if (private->completion)
>>>> +		complete(private->completion);
>>>> +	spin_unlock_irq(sch->lock);
>>>> +	return ret;
>>> If I read this correctly, you're calling cio_cancel_halt_clear() only
>>> once. What happened to the retry loop?
>> Same as above, what about letting the guest do this?
> See my reply above.
>
>> And there are already 255 retries as part of the interface to cio.
> From the kerneldoc comment for cio_cancel_halt_clear():
>
>   * This should be called repeatedly since halt/clear are asynchronous
>   * operations. We do one try with cio_cancel, three tries with cio_halt,
>   * 255 tries with cio_clear. The caller should initialize @iretry with
>   * the value 255 for its first call to this, and keep using the same
>   * @iretry in the subsequent calls until it gets a non -EBUSY return.

OK thanks, I will do so.
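
As a reminder to myself, the calling convention from that kerneldoc
comment could be sketched like this in userspace. The function
fake_cancel_halt_clear() is a hypothetical, much simplified stand-in
for cio_cancel_halt_clear(): it only models "report -EBUSY while the
operation is pending, consume one retry per call, give up when the
budget is used up". The -EIO on exhaustion is an assumption of the
sketch.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical simulated subchannel, not the kernel structure. */
struct fake_subchannel {
	int busy_rounds;	/* calls that will still report -EBUSY */
	int calls;		/* how often the stub was invoked */
};

/* Simplified stand-in for cio_cancel_halt_clear(). */
static int fake_cancel_halt_clear(struct fake_subchannel *sch, int *iretry)
{
	sch->calls++;
	if (*iretry <= 0)
		return -EIO;		/* retry budget exhausted */
	(*iretry)--;
	if (sch->busy_rounds > 0) {
		sch->busy_rounds--;
		return -EBUSY;		/* asynchronous operation pending */
	}
	return 0;
}

/*
 * Initialize the budget once with 255, then keep passing the same
 * counter until the return value is no longer -EBUSY.
 */
static int quiesce_with_retries(struct fake_subchannel *sch)
{
	int iretry = 255;
	int ret;

	do {
		ret = fake_cancel_halt_clear(sch, &iretry);
		/* Real code would wait for the interrupt here, not spin. */
	} while (ret == -EBUSY);

	return ret;
}
```

The important detail is that iretry lives outside the loop, so the
budget is shared by all calls and a subchannel that never settles
eventually ends the loop with an error instead of hanging.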

>
>>>   
>>>> +}
>>>> +static int fsm_quiescing_done(struct vfio_ccw_private *private)
>>>> +{
>>>> +	if (private->completion)
>>>> +		complete(private->completion);
>>>> +	return VFIO_CCW_STATE_STANDBY;
>>>> +}
>>>>    /*
>>>>     * No operation action.
>>>>     */
>>>> @@ -178,15 +225,10 @@ static int fsm_sch_event(struct vfio_ccw_private *private)
>>>>    static int fsm_init(struct vfio_ccw_private *private)
>>>>    {
>>>>    	struct subchannel *sch = private->sch;
>>>> -	int ret = VFIO_CCW_STATE_STANDBY;
>>>>    
>>>> -	spin_lock_irq(sch->lock);
>>>>    	sch->isc = VFIO_CCW_ISC;
>>>> -	if (cio_enable_subchannel(sch, (u32)(unsigned long)sch))
>>>> -		ret = VFIO_CCW_STATE_NOT_OPER;
>>>> -	spin_unlock_irq(sch->lock);
>>>>    
>>>> -	return ret;
>>>> +	return VFIO_CCW_STATE_STANDBY;
>>> Doesn't that change the semantic of the standby state?
>> It changes the FSM: NOT_OPER and STANDBY are clearly different.
>> Part of the initialization is now done when putting the device online.
> Hm, I think the changes to the fsm semantics are a bit mixed up between
> patches. I'll wait for an outline of how this is supposed to look in
> the end before commenting further :)

Yes, I will do this in the next cover letter.
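
This is not that outline, only a rough userspace sketch of the
convention the quoted functions follow: each action returns the next
state and the dispatcher stores it. All demo_* names are hypothetical.
The results of the actions (init gives STANDBY, online gives IDLE or
NOT_OPER, offline gives STANDBY) are read off the functions quoted
above; which state accepts which event is purely an assumption of the
sketch.

```c
#include <assert.h>

enum demo_state { DEMO_NOT_OPER, DEMO_STANDBY, DEMO_IDLE, DEMO_NR_STATES };
enum demo_event { DEMO_EV_INIT, DEMO_EV_ONLINE, DEMO_EV_OFFLINE,
		  DEMO_NR_EVENTS };

struct demo_private {
	enum demo_state state;
	int enable_fails;	/* simulate a failing subchannel enable */
};

typedef enum demo_state (*demo_action)(struct demo_private *);

/* Event is ignored in this state: stay where we are. */
static enum demo_state demo_nop(struct demo_private *p)
{
	return p->state;
}

static enum demo_state demo_init(struct demo_private *p)
{
	(void)p;
	return DEMO_STANDBY;
}

static enum demo_state demo_online(struct demo_private *p)
{
	return p->enable_fails ? DEMO_NOT_OPER : DEMO_IDLE;
}

static enum demo_state demo_offline(struct demo_private *p)
{
	(void)p;
	return DEMO_STANDBY;
}

/* Jump table indexed by [current state][event]. */
static const demo_action demo_table[DEMO_NR_STATES][DEMO_NR_EVENTS] = {
	[DEMO_NOT_OPER] = { demo_init, demo_nop, demo_nop },
	[DEMO_STANDBY]  = { demo_nop, demo_online, demo_nop },
	[DEMO_IDLE]     = { demo_nop, demo_nop, demo_offline },
};

/* The dispatcher is the only place that writes the state. */
static void demo_fsm_event(struct demo_private *p, enum demo_event ev)
{
	p->state = demo_table[p->state][ev](p);
}
```

Keeping the state write in one place is what makes it possible to see
from the table alone which transitions exist.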

>
>>> Your idea here seems to be to go to either disabling the subchannel
>>> directly or flushing out I/O first, depending on the state you're in.
>>> The problem is that you may need retries in any case (the subchannel
>>> may be status pending if it is enabled; not necessarily by any I/O that
>>> had been started, but also from an unsolicited notification.)
>> I wanted to let the guest do the retries as it wants to.
>> Somehow we must give the right response back to the guest
>> and take care of the error number we give back.
> As described above, we need to be clear on what should be guest-visible
> and what is just internal handling e.g. during initialization/removal.

Yes.

>
>> I will get a better look at this.
>>
>>>   
>>>>    };
>>>> diff --git a/drivers/s390/cio/vfio_ccw_ops.c b/drivers/s390/cio/vfio_ccw_ops.c
>>>> index ea8fd64..b202e73 100644
>>>> --- a/drivers/s390/cio/vfio_ccw_ops.c
>>>> +++ b/drivers/s390/cio/vfio_ccw_ops.c
>>>> @@ -21,21 +21,14 @@ static int vfio_ccw_mdev_reset(struct mdev_device *mdev)
>>>>    
>>>>    	private = dev_get_drvdata(mdev_parent_dev(mdev));
>>>>    	sch = private->sch;
>>>> -	/*
>>>> -	 * TODO:
>>>> -	 * In the cureent stage, some things like "no I/O running" and "no
>>>> -	 * interrupt pending" are clear, but we are not sure what other state
>>>> -	 * we need to care about.
>>>> -	 * There are still a lot more instructions need to be handled. We
>>>> -	 * should come back here later.
>>>> -	 */
>>> This is still true, no? I'm thinking about things like channel monitors
>>> and the like (even if we don't support them yet).
>> I think that this is not the place to put this remark since here
>> we should send an event to the FSM, having new states
>> will be handled as FSM states.
>> I will put it back, here or wherever I think it belongs if I find
>> another place after resolving the RESET problem.
> The comment basically refers to "we aren't quite sure whether there is
> more stuff we need to reset", so I think this is indeed the correct
> place.

OK

>

-- 
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany
