Discussion of the VIRTIO specification
 help / color / mirror / Atom feed
* [virtio-comment] Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state
  2023-09-14 11:37 ` [virtio-comment] " Michael S. Tsirkin
@ 2023-09-15  4:41   ` Zhu, Lingshan
  0 siblings, 0 replies; 29+ messages in thread
From: Zhu, Lingshan @ 2023-09-15  4:41 UTC (permalink / raw)
  To: Michael S. Tsirkin, Zhu Lingshan
  Cc: jasowang, eperezma, cohuck, stefanha, virtio-comment, virtio-dev



On 9/14/2023 7:37 PM, Michael S. Tsirkin wrote:
> On Wed, Sep 06, 2023 at 04:16:32PM +0800, Zhu Lingshan wrote:
>> This series introduces
>> 1)a new SUSPEND bit in the device status
>> Which is used to suspend the device, so that the device states
>> and virtqueue states are stabilized.
>>
>> 2)virtqueue state and its accessor, to get and set last_avail_idx
>> and last_used_idx of virtqueues.
>>
>> The main usecase of these new facilities is Live Migration.
>>
>> Future work: dirty page tracking and in-flight descriptors.
>>
>> This series addresses many comments from Jason, Stefan and Eugenio
>> from RFC series.
>
> after going over this in detail, it is like I worried: this
> tries to do too much through a single register and
> the ownership is muddied significantly.
Not sure about what ownership, device usually STOPPED after
guest freezes, so the hypervisor owns the device status
and LM facilities at that moment.
>
> I feel a separate capability for suspend/resume that would
> be independent of device status would be preferable.
The implementation of the live migration basic facilities are transport 
specific, for PCI:
1)Dirty page tracking will have its own capability
2)In-flight descriptors tracker will have its own capability
3)vq states stored in common config space

Only SUSPEND is implemented in the device status, and this is a valid 
device status.
There are already 6 device status bits, and IMHO this series 
implementing SUSPEND does not
introduce more complexities.
>
>> Zhu Lingshan (5):
>>    virtio: introduce vq state as basic facility
>>    virtio: introduce SUSPEND bit in device status
>>    virtqueue: constraints for virtqueue state
>>    virtqueue: ignore resetting vqs when SUSPEND
>>    virtio-pci: implement VIRTIO_F_QUEUE_STATE
>>
>>   content.tex       | 118 ++++++++++++++++++++++++++++++++++++++++++++++
>>   transport-pci.tex |  18 +++++++
>>   2 files changed, 136 insertions(+)
>>
>> -- 
>> 2.35.3
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
>


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 29+ messages in thread

* [virtio-comment] Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state
       [not found]                                   ` <20230921004957-mutt-send-email-mst@kernel.org>
@ 2023-09-21  9:06                                     ` Zhu, Lingshan
  0 siblings, 0 replies; 29+ messages in thread
From: Zhu, Lingshan @ 2023-09-21  9:06 UTC (permalink / raw)
  To: Michael S. Tsirkin, Parav Pandit, eperezma@redhat.com,
	Cornelia Huck, Stefan Hajnoczi, Stefano Garzarella, Jason Wang
  Cc: virtio-dev@lists.oasis-open.org,
	virtio-comment@lists.oasis-open.org



On 9/21/2023 1:41 PM, Michael S. Tsirkin wrote:
> On Thu, Sep 21, 2023 at 03:43:12AM +0000, Parav Pandit wrote:
>>
>>> From: Michael S. Tsirkin <mst@redhat.com>
>>> Sent: Thursday, September 21, 2023 1:34 AM
>>>
>>> On Wed, Sep 20, 2023 at 05:21:52PM +0000, Parav Pandit wrote:
>>>>> OK so with this summary in mind, can you find any advantages to
>>>>> inband+mediation that are real or do you just see disadvantages? And
>>>>> it's a tricky question because I can see some advantages ;)
>>>> inband + mediation may be useful for nested case.
>>> Hint: there's more.
>> Can you please list down?
>>
>> The starting point of discussion is, there is passthrough member device without mediation in virtio interface layers.
>> How shall device migration should work for it?
> I was attempting to have each of you see other's point of view.
> It seems clear I was right, at least one way communication was
> not getting through. Let me try to help.
>
>
> First, clearly Zhu Lingshan cares about the mediation use-case, not the
> un-mediated one.  Mediation is clearly heavier but also more powerful
> in many use-cases - is that obvious or do I need to list the reasons?
> To mention one example, it supports cross-vendor migration. Which the unmediated
> variant maybe can in theory support too, and when it does maybe in a better and
> more structured way - but that will require standartization effort that
> didn't happen yet. With mediation it was already demonstrated more than
> once.
>
> 1. For mediation something that works within existing mediation framework -
> e.g. reusing as he does feature bits - will require less support
> than a completely separate facility.
> I think Zhu Lingshan also believes that since there will be less code ->
> less security issues.
>
> 2. With or without mediation, the mapping of commands to VFs is simpler,
> allowing more control - for example, let's say you want to reset a VF -
> you do not need to flush the queue of existing commands, which might
> potentially take a long time because some other VFs are very busy - you
> just reset the VF which any unmap flow will already do.
>
>
Thanks, I agree
> But Zhu Lingshan, all this will be pointless if you also do not try to
> do this and list what are reasonable points that Parav made. Please do
> not mistake what I'm doing here for taking sides I just want the
> communication to start working. And that means everyone tries to take
> all use-cases into account even if working for a vendor that does not
> care about this use-case. Otherwise we will just keep getting into these
> flamewars.
I think admin vq live migration surely work for some scenarios and can 
meet specific customers requirements,
that uses cases are reasonable for sure.

Jason, Eugenio and me, et al spent a lot of efforts on this live 
migration proposal in the past two years,
this series are based on the joint work, directly carry on previous 
series sent by Jason and Eugenio.

This series introduces basic facilities for live migration, and the 
implementation is transport specific.

I agree we should cooperate, at least the basic facilities can be used 
by admin vq, for example the
dirty page tracking facility and even forward suspend command to device 
status.

Thanks,
Zhu Lingshan


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 29+ messages in thread

* [virtio-comment] Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state
       [not found]                         ` <PH0PR12MB548110E53C924C984FA24A49DCF9A@PH0PR12MB5481.namprd12.prod.outlook.com>
       [not found]                           ` <20230920101402-mutt-send-email-mst@kernel.org>
@ 2023-09-21  9:18                           ` Zhu, Lingshan
  2023-09-21  9:26                             ` [virtio-comment] " Parav Pandit
  1 sibling, 1 reply; 29+ messages in thread
From: Zhu, Lingshan @ 2023-09-21  9:18 UTC (permalink / raw)
  To: Parav Pandit, Michael S. Tsirkin, eperezma@redhat.com,
	Stefan Hajnoczi, Cornelia Huck, Jason Wang
  Cc: virtio-dev@lists.oasis-open.org,
	virtio-comment@lists.oasis-open.org



On 9/20/2023 9:41 PM, Parav Pandit wrote:
>> From: Michael S. Tsirkin <mst@redhat.com>
>> Sent: Wednesday, September 20, 2023 6:12 PM
>> And Parav same goes for you - can you summarize Zhu Lingshan's position?
> Below is my summary about Zhu Lingshan's position:
>
> One line summary of his position in my view:
>
> 0. Use inband device migration only, use mediation, mediation is secure, but AQ is not secure.
>
> Details of his position in my view:
>
> 1. Device migration must be done through VF itself by suspending specific vqs and the VF device both.
Not exactly, my series implements basic facilities for live migration, 
admin vq solution can reuse them
for sure. admin vq solution can work for some use cases, but for others, 
you still need to resolve
the issues we talked before.
> 2. When device migration is done using #1, it must be done using mediation approach in hypervisor.
for fundamentals of virtualization, it is trap and emulate, I think 
Jason have told you many times.
>
> 3. When migration is done using inband mediation it is more secure than AQ approach.
> (as opposed to AQ of the owner device who enables/disables SR-IOV).
VF owns it and the hypervisor owns the VF, so no side channel.
>
> 4. AQ is not secure.
> But,
so many times discussions....
> 5. AQ and admin commands can be built on top of his proposal #1, even if AQ is less secure. Opposing statements...
The security leaks and attacking surface are introduced by AQ, not the 
basic facilities,
>
> 6. Dirty page tracking and inflight descriptors tracking to be done in his v1. but he does not want to review such coverage in [1].
Will be done in V2, and they are still config space solution, with help 
of the hypervisor.
>
> 8. Since his series does not cover any device context migration and does not talk anything about it,
> I deduce that he plans to use cvq for setting ups RSS and other fields using inband CVQ of the VF.
> This further limit the solution to only net device, ignoring rest of the other 20+ device types, where all may not have the CVQ.
Any difference from current vhost solution?
>
> 9. trapping and emulation of following objects: AQ, CVQ, virtio config space, PCI FLR flow in hypervisor is secure, but when if AQ of the PF do far small work of it, AQ is not secure.
for cvq, you should read Eugenio's patcheset, it is secure. For others, 
we have discussed for many times, no need to repeat.
>
> 10. Any traps proposed in #9 mostly do not work with future TDISP as TDISP do not bifurcate the device, so ignore them for now to promote inband migration.
TDISP devices can not be migrated for now, and the TDISP spec make clear 
examples of attacking models, your admin vq LM on the PF exactly match 
the model.

Sorry I have to repeat this again, this is the last time.
>
> 11. He do not show interest in collaboration (even after requesting few times) to see if we can produce common commands that may work for both passthrough (without mediation) and using mediation for nested case.
as repeated for many times, we are implementing basic facilities, and 
you can reuse the basic facilities for live migration in admin vq 
design, do you want to cooperate?
>
> 12. Some how register access on single physical card for the PFs and VFs gives better QoS guarantee than virtqueue as registers can scale infinitely no matter how many VFs or for multiple VQs because it is per VF.
that is per-device facilities.
>
> [1] https://lore.kernel.org/virtio-comment/20230909142911.524407-7-parav@nvidia.com/T/#md9fcfa1ba997463de8c7fb8c6d1786b224b0bead


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 29+ messages in thread

* [virtio-comment] RE: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state
  2023-09-21  9:18                           ` Zhu, Lingshan
@ 2023-09-21  9:26                             ` Parav Pandit
  2023-09-21  9:55                               ` [virtio-comment] " Zhu, Lingshan
  0 siblings, 1 reply; 29+ messages in thread
From: Parav Pandit @ 2023-09-21  9:26 UTC (permalink / raw)
  To: Zhu, Lingshan, Michael S. Tsirkin, eperezma@redhat.com,
	Stefan Hajnoczi, Cornelia Huck, Jason Wang
  Cc: virtio-dev@lists.oasis-open.org,
	virtio-comment@lists.oasis-open.org



> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> Sent: Thursday, September 21, 2023 2:49 PM

> TDISP devices can not be migrated for now, and the TDISP spec make clear
> examples of attacking models, your admin vq LM on the PF exactly match the
> model.

I gave hint yesterday to you to consult Ravi at Intel who showed TDISP migration using a dedicated TVM using similar mechanism as admin command.
But you sadly ignored...

So let me make another attempt to explain,

When in future TDISP device migration to be supported, the admin command will be done through a dedicated PF or a VF that resides in another trust domain, for example another TVM.
Such admin virtio device will not be located in the hypervisor.
Thereby, it will be secure.
The admin commands pave the road to make this happen. Only thing changes is delegation of admin commands to another admin device instead of a PF.

There are other solutions too that will arise.
I have seen another one too, may be DPU.

In all the 2 approaches, TDISP is migratable and spec will evolve as multiple vendors including Intel, AMD and others showed the path towards it without mediation.
Virtio will be able to leverage that as well using admin commands.

I want to emphasize again, do not keep repeating AQ in your comments.
It is admin commands in proposal [1].

As Michael also requested, I kindly request to co-operate on doing join technical work, shared ideas, knowledge and improve the spec.

[1] https://lore.kernel.org/virtio-comment/20230909142911.524407-7-parav@nvidia.com/T/#mf15b68617f772770c6bf79f70e8ddc6fea834cfa


^ permalink raw reply	[flat|nested] 29+ messages in thread

* [virtio-comment] Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state
  2023-09-21  9:26                             ` [virtio-comment] " Parav Pandit
@ 2023-09-21  9:55                               ` Zhu, Lingshan
  2023-09-21 11:28                                 ` [virtio-comment] " Parav Pandit
  0 siblings, 1 reply; 29+ messages in thread
From: Zhu, Lingshan @ 2023-09-21  9:55 UTC (permalink / raw)
  To: Parav Pandit, Michael S. Tsirkin, eperezma@redhat.com,
	Stefan Hajnoczi, Cornelia Huck, Jason Wang
  Cc: virtio-dev@lists.oasis-open.org,
	virtio-comment@lists.oasis-open.org



On 9/21/2023 5:26 PM, Parav Pandit wrote:
>
>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>> Sent: Thursday, September 21, 2023 2:49 PM
>> TDISP devices can not be migrated for now, and the TDISP spec make clear
>> examples of attacking models, your admin vq LM on the PF exactly match the
>> model.
> I gave hint yesterday to you to consult Ravi at Intel who showed TDISP migration using a dedicated TVM using similar mechanism as admin command.
> But you sadly ignored...
>
> So let me make another attempt to explain,
>
> When in future TDISP device migration to be supported, the admin command will be done through a dedicated PF or a VF that resides in another trust domain, for example another TVM.
> Such admin virtio device will not be located in the hypervisor.
> Thereby, it will be secure.
> The admin commands pave the road to make this happen. Only thing changes is delegation of admin commands to another admin device instead of a PF.
if you plan to do it in future, then lets discuss in the future.

And TDISP can be migrated in future does not mean admin vq LM is secure, 
I have repeated for so many times of the attacking model. and I will not 
repeat again.
>
> There are other solutions too that will arise.
> I have seen another one too, may be DPU.
>
> In all the 2 approaches, TDISP is migratable and spec will evolve as multiple vendors including Intel, AMD and others showed the path towards it without mediation.
> Virtio will be able to leverage that as well using admin commands.
>
> I want to emphasize again, do not keep repeating AQ in your comments.
> It is admin commands in proposal [1].
we are discussing LM, right? Can TDISP help you here? TDISP spec gives 
examples of attacking models, and your admin vq matches it, I gave you 
quote of the spec yesterday.

This thread is about live migration anyway, not TDISP.
>
> As Michael also requested, I kindly request to co-operate on doing join technical work, shared ideas, knowledge and improve the spec.
>
> [1] https://lore.kernel.org/virtio-comment/20230909142911.524407-7-parav@nvidia.com/T/#mf15b68617f772770c6bf79f70e8ddc6fea834cfa
see other threads, I propose to reuse the basic facilities of live 
migration in admin vq.
>


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 29+ messages in thread

* [virtio-comment] RE: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state
  2023-09-21  9:55                               ` [virtio-comment] " Zhu, Lingshan
@ 2023-09-21 11:28                                 ` Parav Pandit
  2023-09-22  2:40                                   ` [virtio-comment] " Zhu, Lingshan
  0 siblings, 1 reply; 29+ messages in thread
From: Parav Pandit @ 2023-09-21 11:28 UTC (permalink / raw)
  To: Zhu, Lingshan, Michael S. Tsirkin, eperezma@redhat.com,
	Stefan Hajnoczi, Cornelia Huck, Jason Wang
  Cc: virtio-dev@lists.oasis-open.org,
	virtio-comment@lists.oasis-open.org


> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> Sent: Thursday, September 21, 2023 3:25 PM
> 
> On 9/21/2023 5:26 PM, Parav Pandit wrote:
> >
> >> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> >> Sent: Thursday, September 21, 2023 2:49 PM TDISP devices can not be
> >> migrated for now, and the TDISP spec make clear examples of attacking
> >> models, your admin vq LM on the PF exactly match the model.
> > I gave hint yesterday to you to consult Ravi at Intel who showed TDISP
> migration using a dedicated TVM using similar mechanism as admin command.
> > But you sadly ignored...
> >
> > So let me make another attempt to explain,
> >
> > When in future TDISP device migration to be supported, the admin command
> will be done through a dedicated PF or a VF that resides in another trust
> domain, for example another TVM.
> > Such admin virtio device will not be located in the hypervisor.
> > Thereby, it will be secure.
> > The admin commands pave the road to make this happen. Only thing changes
> is delegation of admin commands to another admin device instead of a PF.
> if you plan to do it in future, then lets discuss in the future.
> 
> And TDISP can be migrated in future does not mean admin vq LM is secure, I
> have repeated for so many times of the attacking model. and I will not repeat
> again.

> > There are other solutions too that will arise.
> > I have seen another one too, may be DPU.
> >
> > In all the 2 approaches, TDISP is migratable and spec will evolve as multiple
> vendors including Intel, AMD and others showed the path towards it without
> mediation.
> > Virtio will be able to leverage that as well using admin commands.
> >
> > I want to emphasize again, do not keep repeating AQ in your comments.
> > It is admin commands in proposal [1].
> we are discussing LM, right? Can TDISP help you here? TDISP spec gives
> examples of attacking models, and your admin vq matches it, I gave you quote
> of the spec yesterday.
> 
> This thread is about live migration anyway, not TDISP.
> >
> > As Michael also requested, I kindly request to co-operate on doing join
> technical work, shared ideas, knowledge and improve the spec.
> >
> > [1]
> > https://lore.kernel.org/virtio-comment/20230909142911.524407-7-parav@n
> > vidia.com/T/#mf15b68617f772770c6bf79f70e8ddc6fea834cfa
> see other threads, I propose to reuse the basic facilities of live migration in
> admin vq.
> >

I don’t see a point in repeating anything anymore with your constant repetitions and ignorance to ideas.

I am happy to collaborate to driver virtio spec when you can give thoughts with an open mind to address two use cases to converge and discuss.

1. virtio device migration using mediation approach
2. virtio member passthrough device migration

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [virtio-comment] Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state
  2023-09-21 11:28                                 ` [virtio-comment] " Parav Pandit
@ 2023-09-22  2:40                                   ` Zhu, Lingshan
  0 siblings, 0 replies; 29+ messages in thread
From: Zhu, Lingshan @ 2023-09-22  2:40 UTC (permalink / raw)
  To: Parav Pandit, Michael S. Tsirkin, eperezma@redhat.com,
	Stefan Hajnoczi, Cornelia Huck, Jason Wang
  Cc: virtio-dev@lists.oasis-open.org,
	virtio-comment@lists.oasis-open.org



On 9/21/2023 7:28 PM, Parav Pandit wrote:
>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>> Sent: Thursday, September 21, 2023 3:25 PM
>>
>> On 9/21/2023 5:26 PM, Parav Pandit wrote:
>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>> Sent: Thursday, September 21, 2023 2:49 PM TDISP devices can not be
>>>> migrated for now, and the TDISP spec make clear examples of attacking
>>>> models, your admin vq LM on the PF exactly match the model.
>>> I gave hint yesterday to you to consult Ravi at Intel who showed TDISP
>> migration using a dedicated TVM using similar mechanism as admin command.
>>> But you sadly ignored...
>>>
>>> So let me make another attempt to explain,
>>>
>>> When in future TDISP device migration to be supported, the admin command
>> will be done through a dedicated PF or a VF that resides in another trust
>> domain, for example another TVM.
>>> Such admin virtio device will not be located in the hypervisor.
>>> Thereby, it will be secure.
>>> The admin commands pave the road to make this happen. Only thing changes
>> is delegation of admin commands to another admin device instead of a PF.
>> if you plan to do it in future, then lets discuss in the future.
>>
>> And TDISP can be migrated in future does not mean admin vq LM is secure, I
>> have repeated for so many times of the attacking model. and I will not repeat
>> again.
>>> There are other solutions too that will arise.
>>> I have seen another one too, may be DPU.
>>>
>>> In all the 2 approaches, TDISP is migratable and spec will evolve as multiple
>> vendors including Intel, AMD and others showed the path towards it without
>> mediation.
>>> Virtio will be able to leverage that as well using admin commands.
>>>
>>> I want to emphasize again, do not keep repeating AQ in your comments.
>>> It is admin commands in proposal [1].
>> we are discussing LM, right? Can TDISP help you here? TDISP spec gives
>> examples of attacking models, and your admin vq matches it, I gave you quote
>> of the spec yesterday.
>>
>> This thread is about live migration anyway, not TDISP.
>>> As Michael also requested, I kindly request to co-operate on doing join
>> technical work, shared ideas, knowledge and improve the spec.
>>> [1]
>>> https://lore.kernel.org/virtio-comment/20230909142911.524407-7-parav@n
>>> vidia.com/T/#mf15b68617f772770c6bf79f70e8ddc6fea834cfa
>> see other threads, I propose to reuse the basic facilities of live migration in
>> admin vq.
> I don’t see a point in repeating anything anymore with your constant repetitions and ignorance to ideas.
>
> I am happy to collaborate to driver virtio spec when you can give thoughts with an open mind to address two use cases to converge and discuss.
>
> 1. virtio device migration using mediation approach
As Jason and I have told you many times, basic and fundamental of 
virtualization is trap and emulate,
and this series work for trap and emulate.

And for mediation, do you see any troubles?

Can't vDPA migrate devices by this solution?
> 2. virtio member passthrough device migration
if you want, you can build admin vq LM on the basic facilities. But 
still admin vq LM will not work for nested.


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 29+ messages in thread

* [virtio-comment] Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state
       [not found]                           ` <CACGkMEtW88zJkDQL58NqLzzudq=f+SmzJ8bha55Dd2fd=FRGBQ@mail.gmail.com>
@ 2023-09-22  3:39                             ` Zhu, Lingshan
       [not found]                             ` <PH0PR12MB5481573D6EE3BE03FB7C3D70DCFCA@PH0PR12MB5481.namprd12.prod.outlook.com>
  1 sibling, 0 replies; 29+ messages in thread
From: Zhu, Lingshan @ 2023-09-22  3:39 UTC (permalink / raw)
  To: Jason Wang, Parav Pandit, eperezma@redhat.com, Cornelia Huck,
	Michael S. Tsirkin, virtio-comment@lists.oasis-open.org,
	Stefan Hajnoczi
  Cc: virtio-dev@lists.oasis-open.org



On 9/22/2023 11:08 AM, Jason Wang wrote:
> On Thu, Sep 21, 2023 at 12:19 PM Parav Pandit <parav@nvidia.com> wrote:
>>
>>> From: Jason Wang <jasowang@redhat.com>
>>> Sent: Thursday, September 21, 2023 9:39 AM
>>>
>>> On Thu, Sep 21, 2023 at 12:01 PM Parav Pandit <parav@nvidia.com> wrote:
>>>>
>>>>
>>>>> From: Jason Wang <jasowang@redhat.com>
>>>>> Sent: Thursday, September 21, 2023 8:48 AM
>>>>> As replied in another thread, the issues for BAR are:
>>>>>
>>>>> 1) Not sure it can have an efficient interface, it would be
>>>>> something like VIRTIO_PCI_CAP_PCI_CFG which is very slow compared to
>>>>> single register accessing
>>>>> 2) There's no owner/group/member for MMIO, most of the time, we only
>>>>> need a single MMIO device. If we want the owner to manage itself, it
>>>>> seems redundant as is implied in all the existing transports (without admin
>>> commands).
>>>>> Even if we had, it might still suffer from bootstrap issues.
>>>>> 3) For live migration, it means the admin commands needs to start
>>>>> from duplicating every existing transport specific interface it can
>>>>> give us. One example is that we may end up with two interfaces to
>>>>> access virtqueue addresses etc. This results in extra complicity and
>>>>> it is actually a full transport (driver can just use admin commands to drive
>>> the device).
>>>> In [1] there is no duplication. The live migration driver never parses the device
>>> context either while reading or write.
>>>> Hence no code and no complexity in driver and no duplicate work.
>>>> Therefore, those admin commands are not to drive the guest device either.
>>> I'm not sure how this is related to the duplication issue.
>>>
>> You commented that admin virtqueue duplicates somethings.
>> And I explained above that it does not.
>>
>>>>> 4) Admin commands itself may not be capable of doing things like
>>>>> dirty page logging, it requires the assistance from the transport
>>>>>
>>>> Admin command in [1] is capable of dirty page logging.
>>> In your design, the logging is done via DMA not the virtqueue.
>>>
>> No. it is done via admin command, not DMA in [2].
>>
>> [2] https://lore.kernel.org/virtio-comment/20230909142911.524407-7-parav@nvidia.com/T/#m17b09acd8c73d374e98ad84764b315afa94f59c9
>>
>>> The only job for virtqueue is to initiate the DMA. But if DMA can be initiated via
>>> virtqueue, it can be done in other ways.
>>>
>> Lets first establish 4 things in alternative way, any motivation to do so with 5th point without giant registers need in device.
>>
>>>>> 1) Parav's proposal does several couplings: couple basic build
>>>>> blocks (suspend, dirty page tracking) with live migration, couple
>>>>> live migration with admin commands.
>>>> In which use case you find dirty page tracking useful without migration for
>>> which you like to see it detached from device migration flow?
>>>
>>> Is it only the dirty page tracking? It's the combinations of
>>>
>>> 1) suspending
>>> 2) device states
>>> 3) dirty page tracking
>>>
>>> Eeah of those will have use cases other than live migration: VM stop, power
>>> management in VM, profiling and monitoring, failover etc.
>>>
>> Suspend/resume with different power state is driven by the guest directly.
> And there's hibernation actually where device states might be useful.
>
>> So it may find some overlap.
>>
>> Device context has no overlap.
> I can give you one example, e.g debugging.
>
>> Dirty page tracking has no overlap. What do you want to profile and monitor? In case if you want to profile, it can be used without migration command anyway?
> It works like a dirty bit of PTE. We all know it has a broader use
> case than logging. For example, tracking working set and do
> optimization on IOMMU/IOTLB or even device IOTLB.
>
> 1) Try to prove your facility can only work for one specific cases
> 2) Try to prove your facility can work for more than one cases
>
> Which one is easier and more beneficial to virtio?
>
>
>> If you describe, may be we I can split "device migration" chapter to two pieces,
>> Device management and device migration.
>>
>> Device migration will use these basic facility.
>> Would that help you?
> Definitely, but it needs to be done by not making it under the
> subsection of admin commands, that's it.
>
> Let me repeat once again here for the possible steps to collaboration:
>
> 1) define virtqueue state, inflight descriptors in the section of
> basic facility but not under the admin commands
> 2) define the dirty page tracking, device context/states in the
> section of basic facility but not under the admin commands
> 3) define transport specific interfaces or admin commands to access them
I totally agree with this proposal.
>
> Does this work? It seems you refused such steps in the past.
>
> Actually, I would like to leave 2) as it's very complicated which
> might not converge easily.
>
> Thanks
>


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 29+ messages in thread

* [virtio-comment] Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state
       [not found]                                 ` <PH0PR12MB5481C11F6D68A892A091E2AFDCC3A@PH0PR12MB5481.namprd12.prod.outlook.com>
@ 2023-09-26  5:36                                   ` Zhu, Lingshan
  2023-09-26  6:03                                     ` [virtio-comment] " Parav Pandit
  0 siblings, 1 reply; 29+ messages in thread
From: Zhu, Lingshan @ 2023-09-26  5:36 UTC (permalink / raw)
  To: Parav Pandit, Jason Wang, Michael S. Tsirkin, eperezma@redhat.com,
	Stefan Hajnoczi, virtio-comment@lists.oasis-open.org
  Cc: virtio-dev@lists.oasis-open.org



On 9/26/2023 11:40 AM, Parav Pandit wrote:
>
>> From: Jason Wang <jasowang@redhat.com>
>> Sent: Tuesday, September 26, 2023 8:16 AM
>>
>> On Mon, Sep 25, 2023 at 6:41 PM Parav Pandit <parav@nvidia.com> wrote:
>>>
>>>
>>>> From: Jason Wang <jasowang@redhat.com>
>>>> Sent: Friday, September 22, 2023 8:38 AM
>>>>> Device context has no overlap.
>>>> I can give you one example, e.g debugging.
>>>>
>>> Almost every feature needs debugging. :) So I am omitting it for time
>>> being.
>> Well, I don't think so. We have a lot of handy tools for that (ethtool -d?).
> Sure add something specific for debug and explicitly mention that it is for debug like -d.
> Every feature and functionality needs debug, not specifically device context.
> So add infra for debug. Device migration series is not the vehicle to piggy back on.
>
>>>> 1) define virtqueue state, inflight descriptors in the section of
>>>> basic facility but not under the admin commands
>>> It will be part of the device context such a way that so that one can only read
>> the vq state only instead of full device context.
>>> This will work.
>> I'm not sure what it looks like, but I think they are well decoupled in this series.
>> E.g driver can choose to just read e.g last_avail_idx and report to ethtool or
>> watchdog.
>>
> Once its done it will be visible how it looks like.
> The key is it needs to cover BOTH use cases.
>
>> As I replied in other thread, I see several problems:
>>
>> 1) layer violation, PCI specific state were mixed into the basic facilities
> After agreeing to see merged donctext, now you are hinting that you don’t agree to merge two.
> I disagree if you are leaning towards that direction.
> I hope my deduction from your above comment is incorrect.
>
> There is no violation. PCI specific device context will be captured in PCI specific section.
> Device type contexts will be captured in those device type sections.
>
> TLVs will cover many of the device context information.
>
>> 2) I don't see a good definition on "device context"
>> 3) TLV works fine for queue but not registers
>>
> Please definition of device context in [1].
>
> [1] https://lists.oasis-open.org/archives/virtio-comment/202309/msg00064.html
>
>> What needs to be done first is to describe what device context means and what
>> it contains. Not the actual data structure since it may vary.
>>
> Sure, it is already defined in device migration theory of operation section in [2].
> I will try to take it out and put in device management section, so that device migration can refer to it and
> Some other basic facility also can refer to it (which must need to explain a use case beyond silly debug point).
>
> [2] https://lists.oasis-open.org/archives/virtio-comment/202309/msg00062.html
>
>>>> 3) define transport specific interfaces or admin commands to access
>>>> them
>>>>
>>> As you also envisioned, it is done using admin commands to access it.
>> That's fine, but it should allow other ways.
>>
> For passthrough admin commands fits the design.
> Sure, you need to draft it how to do other ways..
>   > >
>>>> Does this work? It seems you refused such steps in the past.
>>>>
>>> I didn’t.
>>> There must be some confusion in many emails we both exchanged, because
>> already posted v0 has it split such as device context.
>>> For dirty page tracking I couldn’t find a solid use case without device
>> migration, so I asked which you already replied above.
>>
>> First, you never explain why such coupling gives us any benefit.
> It is not coupled. Device migration uses this facility. So it is matter of text organization in the spec.
> Not the design.
>
>> Second, I've given you sufficient examples but you tend to ignore them. Why
>> not go through Qemu codes then you will see the answer.
> You gave example of debug, and profiling. You didn’t explain the use case of how to actually connect and how to profile etc.
>
>>>> Actually, I would like to leave 2) as it's very complicated which
>>>> might not converge easily.
>>>>
>>> I will split current "device migration" section to two.
>>> 1. device management
>>> 2. device migration
>>>
>>> Device management covers device mode, device context and dirty page
>> tracking.
>>
>> I don't see any connection between "management" and "device context".
>>
> You have better name than device management?
> May be device operation.
> May be it is just better to have the device context just the way it is in [1] under basic facility.
>   
>>> Device migration refers to device management section.
>>>
>>> We can omit the dirty page tracking commands for a moment and first close
>> on the device mode and device context as first step.
>>> Since it is already part of v0 and it is needed, I will keep it in subsequent v1 but
>> moved to device management section.
>>
>> Could you please answer what's wrong with the first 4 patches in this series?
>>
> 1. cover letter is missing the problem statement and use case
I only reply to this section of comments, this does not mean I agree 
with you on your other statements, Instead I agree with
Jason on his replies to you.

In my cover letter:
"The main usecase of these new facilities is Live Migration."

Did you miss it?
> 2. why queue suspend and both device suspend are introduced, only one should be there. The design description is missing.
there are no queue suspend, they are device suspend and vq state 
accessors. please read the patch if you want to comment.
> 3. Even though it claims under some random basic facility, cover letter clearly states the main use case is "live migration".
it is not random, they are precisely defined virtio basic facilities.
> 4. Patch 4 is not needed at all. When device is suspended, it is _suspended_. It does not do any bifurcation.
The device should not accept vq reset and the driver should reset vqs, 
please read previous discussions with MST
and please don't ignore the conclusions
> 5. only suspend bit of patch2 is not enough to cover P2P. One needs suspend and freeze both covered in series [1].
we have discussed this for many times, P2P is out of virtio spec, do you 
want to mediate every PCI state/functionality?
> 6. Finally the whole description of 1 to 4 need to be split in the device operation, so that both passthrough and medication can utilize it using admin cmd and otherwise.
Do you see any reasons this solution can not be used for passthrough and 
mediation?
Or does features_OK work for passthrough or mediation? Any difference?
> Since Zhu, told that dirty tracking and inflight descriptors will be done, I presume he will propose to do over admin q or command interface.
> And since all can run over the admin commands, the plumbing done in 1 to 4 can be made using admin commands.
No
>
> Until now we could not establish creating yet another DMA interface that is better than q interface.
> So ...
> To me both the methods will start looking more converged to me over admin command and queues.
I don't think so, again, we are introducing basic facilities and these 
facilities don't
depend on or rely on admin vq.
>
> Passthrough will use them over owner device.
> Mediation somehow need to do over member device.
> Mediation will not use any device suspend command because it needs to keep bisecting everything.
please read QEMU vhost live migration solution


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 29+ messages in thread

* [virtio-comment] RE: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state
  2023-09-26  5:36                                   ` Zhu, Lingshan
@ 2023-09-26  6:03                                     ` Parav Pandit
  2023-09-26  9:25                                       ` [virtio-comment] " Zhu, Lingshan
  0 siblings, 1 reply; 29+ messages in thread
From: Parav Pandit @ 2023-09-26  6:03 UTC (permalink / raw)
  To: Zhu, Lingshan, Jason Wang, Michael S. Tsirkin,
	eperezma@redhat.com, Stefan Hajnoczi,
	virtio-comment@lists.oasis-open.org
  Cc: virtio-dev@lists.oasis-open.org



> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> Sent: Tuesday, September 26, 2023 11:07 AM

> > 1. cover letter is missing the problem statement and use case
> I only reply to this section of comments, this does not mean I agree with you on
> your other statements, Instead I agree with Jason on his replies to you.
> 
> In my cover letter:
> "The main usecase of these new facilities is Live Migration."
>
:)
Two letter word do not explain the use case of why is asking to mediating a native virtio device.

And yet you call is the basic facilities.
Anyways you know the misaligned response in email and cover letter is evident.

> Did you miss it?
No.
It misses the detail as I described in the theory of operation described in [1].

[1] https://lists.oasis-open.org/archives/virtio-comment/202309/msg00061.html

> > 2. why queue suspend and both device suspend are introduced, only one
> should be there. The design description is missing.
> there are no queue suspend, they are device suspend and vq state accessors.
> please read the patch if you want to comment.
> > 3. Even though it claims under some random basic facility, cover letter clearly
> states the main use case is "live migration".
> it is not random, they are precisely defined virtio basic facilities.
> > 4. Patch 4 is not needed at all. When device is suspended, it is _suspended_. It
> does not do any bifurcation.
> The device should not accept vq reset and the driver should reset vqs, please
> read previous discussions with MST and please don't ignore the conclusions
> > 5. only suspend bit of patch2 is not enough to cover P2P. One needs suspend
> and freeze both covered in series [1].
> we have discussed this for many times, P2P is out of virtio spec, do you want to
> mediate every PCI state/functionality?
> > 6. Finally the whole description of 1 to 4 need to be split in the device
> operation, so that both passthrough and medication can utilize it using admin
> cmd and otherwise.
> Do you see any reasons this solution can not be used for passthrough and
> mediation?
Right. Proposed solution does not meeting following requirements addressed in [1].

[1] https://lore.kernel.org/virtio-comment/20230906081637.32185-1-lingshan.zhu@intel.com/T/#m7efbaadbc73f033c2793d9eb1eb0afa210aae4be

[1] I replied to Jason in previous email.
I will repeat here. They are covered in [1].

1. Missing P2P support
2. Missing dirty page tracking
3. Incremental device context framework for short downtime
3.a Ability to do inflight descriptor tracking
4. Ability to do the work for multiple member devices in parallel.


> Or does features_OK work for passthrough or mediation? Any difference?
It does not work. Passthrough devices are not trapped by the hypervisor.

> > Since Zhu, told that dirty tracking and inflight descriptors will be done, I
> presume he will propose to do over admin q or command interface.
> > And since all can run over the admin commands, the plumbing done in 1 to 4
> can be made using admin commands.
> No
Such negative assertion does not help.
Explain why part, like how I explained above.

> >
> > Until now we could not establish creating yet another DMA interface that is
> better than q interface.
> > So ...
> > To me both the methods will start looking more converged to me over admin
> command and queues.
> I don't think so, again, we are introducing basic facilities and these facilities
> don't depend on or rely on admin vq.
If so, stop the work "live migration" from the cover letter.
Reliance of admin command (again not vq, be careful what you constantly claim).
Not reliance on admin queue or admin command does/does not make it basic facility.

Admin commands and queues are already in basic facilities section today.
So claiming that hey one is using admin commands that means it is non_basic facility is not correct.

> >
> > Passthrough will use them over owner device.
> > Mediation somehow need to do over member device.
> > Mediation will not use any device suspend command because it needs to
> keep bisecting everything.
> please read QEMU vhost live migration solution
Can you please share the pointer to it?

I am familiar with [2] and it does not require device suspend flow as things are bisected.

[2] https://qemu-project.gitlab.io/qemu/interop/vhost-user.html#introduction

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [virtio-comment] Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state
  2023-09-26  6:03                                     ` [virtio-comment] " Parav Pandit
@ 2023-09-26  9:25                                       ` Zhu, Lingshan
  2023-09-26 10:48                                         ` Michael S. Tsirkin
  0 siblings, 1 reply; 29+ messages in thread
From: Zhu, Lingshan @ 2023-09-26  9:25 UTC (permalink / raw)
  To: Parav Pandit, Jason Wang, Michael S. Tsirkin, eperezma@redhat.com,
	Stefan Hajnoczi, virtio-comment@lists.oasis-open.org
  Cc: virtio-dev@lists.oasis-open.org



On 9/26/2023 2:03 PM, Parav Pandit wrote:
>
>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>> Sent: Tuesday, September 26, 2023 11:07 AM
>>> 1. cover letter is missing the problem statement and use case
>> I only reply to this section of comments, this does not mean I agree with you on
>> your other statements, Instead I agree with Jason on his replies to you.
>>
>> In my cover letter:
>> "The main usecase of these new facilities is Live Migration."
>>
> :)
> Two letter word do not explain the use case of why is asking to mediating a native virtio device.
this solution work for fundamental virtualization: trap and emulate, 
just like other virtio config space
fields.

Do you know how device status or vq_enable work? I suggest to read QEMU 
code.
>
> And yet you call is the basic facilities.
> Anyways you know the misaligned response in email and cover letter is evident.
No, I don't, they are basic facilities as you can see, this is loud and 
clear.
>
>> Did you miss it?
> No.
> It misses the detail as I described in the theory of operation described in [1].
>
> [1] https://lists.oasis-open.org/archives/virtio-comment/202309/msg00061.html
Details are in the following patch in this series
>
>>> 2. why queue suspend and both device suspend are introduced, only one
>> should be there. The design description is missing.
>> there are no queue suspend, they are device suspend and vq state accessors.
>> please read the patch if you want to comment.
>>> 3. Even though it claims under some random basic facility, cover letter clearly
>> states the main use case is "live migration".
>> it is not random, they are precisely defined virtio basic facilities.
>>> 4. Patch 4 is not needed at all. When device is suspended, it is _suspended_. It
>> does not do any bifurcation.
>> The device should not accept vq reset and the driver should reset vqs, please
>> read previous discussions with MST and please don't ignore the conclusions
>>> 5. only suspend bit of patch2 is not enough to cover P2P. One needs suspend
>> and freeze both covered in series [1].
>> we have discussed this for many times, P2P is out of virtio spec, do you want to
>> mediate every PCI state/functionality?
>>> 6. Finally the whole description of 1 to 4 need to be split in the device
>> operation, so that both passthrough and medication can utilize it using admin
>> cmd and otherwise.
>> Do you see any reasons this solution can not be used for passthrough and
>> mediation?
> Right. Proposed solution does not meeting following requirements addressed in [1].
>
> [1] https://lore.kernel.org/virtio-comment/20230906081637.32185-1-lingshan.zhu@intel.com/T/#m7efbaadbc73f033c2793d9eb1eb0afa210aae4be
>
> [1] I replied to Jason in previous email.
> I will repeat here. They are covered in [1].
>
> 1. Missing P2P support
As I asked before, please don't ignore, please answer:
"we have discussed this for many times, P2P is out of virtio spec,
do you want to mediate every PCI state/functionality?"
> 2. Missing dirty page tracking
This will be included in V2, as we have repeated for many times,
we want this series to be small and focus, that is why dirty page tracking
and in-flight descriptors are not here. but they will in V2.
> 3. Incremental device context framework for short downtime
Do you observe significant downtime in QEMU/vhost?

Why do you think this series can introduce more downtime,

Do you know this series can re-use QEMU/vhost?

Have you really read QEMU live migration code?

Jason has ever suggested you read it.
> 3.a Ability to do inflight descriptor tracking
as told you many times, in V2
> 4. Ability to do the work for multiple member devices in parallel.
As told you many times, they are per-device facilities, for example,
per-vf device, that means, migrate the VF by its own facilities.

Is that clear enough for you?
>
>
>> Or does features_OK work for passthrough or mediation? Any difference?
> It does not work. Passthrough devices are not trapped by the hypervisor.
Really? Features_ok does not work for passthrough? Seriously?
>
>>> Since Zhu, told that dirty tracking and inflight descriptors will be done, I
>> presume he will propose to do over admin q or command interface.
>>> And since all can run over the admin commands, the plumbing done in 1 to 4
>> can be made using admin commands.
>> No
> Such negative assertion does not help.
> Explain why part, like how I explained above.
OK, we can repeat:

Again! They are self-contained basic facilities, they should better not 
depend on others like admin vq.

And please refer to previous discussions, where Jason and I pointed out 
admin vq is not a qualified
solution for live migration because of: 1)nested 2)baremetal LM 3)QOS 
4)security.

We don't want to repeat the discussions, it looks like endless circle 
with no direction.
>
>>> Until now we could not establish creating yet another DMA interface that is
>> better than q interface.
>>> So ...
>>> To me both the methods will start looking more converged to me over admin
>> command and queues.
>> I don't think so, again, we are introducing basic facilities and these facilities
>> don't depend on or rely on admin vq.
> If so, stop the work "live migration" from the cover letter.
> Reliance of admin command (again not vq, be careful what you constantly claim).
> Not reliance on admin queue or admin command does/does not make it basic facility.
why admin commands are must? These facilities are self contained, right?
>
> Admin commands and queues are already in basic facilities section today.
> So claiming that hey one is using admin commands that means it is non_basic facility is not correct.
Still why you think admin command is a must? It is clear that this 
proposal can work without admin vq,
and even better!
>
>>> Passthrough will use them over owner device.
>>> Mediation somehow need to do over member device.
>>> Mediation will not use any device suspend command because it needs to
>> keep bisecting everything.
>> please read QEMU vhost live migration solution
> Can you please share the pointer to it?
>
> I am familiar with [2] and it does not require device suspend flow as things are bisected.
>
> [2] https://qemu-project.gitlab.io/qemu/interop/vhost-user.html#introduction
If so, I believe you may find out that this solution can work perfect 
with vhost, right?


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 29+ messages in thread

* [virtio-comment] Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state
  2023-09-26  9:25                                       ` [virtio-comment] " Zhu, Lingshan
@ 2023-09-26 10:48                                         ` Michael S. Tsirkin
  2023-09-27  8:20                                           ` Zhu, Lingshan
  0 siblings, 1 reply; 29+ messages in thread
From: Michael S. Tsirkin @ 2023-09-26 10:48 UTC (permalink / raw)
  To: Zhu, Lingshan
  Cc: Parav Pandit, Jason Wang, eperezma@redhat.com, Stefan Hajnoczi,
	virtio-comment@lists.oasis-open.org,
	virtio-dev@lists.oasis-open.org

On Tue, Sep 26, 2023 at 05:25:42PM +0800, Zhu, Lingshan wrote:
> We don't want to repeat the discussions, it looks like endless circle with
> no direction.

OK let me try to direct this discussion.
You guys were speaking past each other, no dialog is happening.
And as long as it goes on no progress will be made and you
will keep going in circles.

Parav here made an effort and attempted to summarize
use-cases addressed by your proposal but not his.
He couldn't resist adding "a yes but" in there oh well.
But now I hope you know he knows about your use-cases?

So please do the same. Do you see any advantages to Parav's
proposal as compared to yours? Try to list them and
if possible try not to accompany the list with "yes but"
(put it in a separate mail if you must ;) ).
If you won't be able to see any, let me know and I'll try to help.

Once each of you and Parav have finally heard the other and
the other also knows he's been heard, that's when we can
try to make progress by looking for something that addresses
all use-cases as opposed to endlessly repeating same arguments.

-- 
MST


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 29+ messages in thread

* [virtio-comment] Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state
  2023-09-26 10:48                                         ` Michael S. Tsirkin
@ 2023-09-27  8:20                                           ` Zhu, Lingshan
  2023-09-27 10:39                                             ` [virtio-comment] " Parav Pandit
  2023-09-27 15:40                                             ` [virtio-comment] " Michael S. Tsirkin
  0 siblings, 2 replies; 29+ messages in thread
From: Zhu, Lingshan @ 2023-09-27  8:20 UTC (permalink / raw)
  To: Michael S. Tsirkin, Cornelia Huck
  Cc: Parav Pandit, Jason Wang, eperezma@redhat.com, Stefan Hajnoczi,
	virtio-comment@lists.oasis-open.org,
	virtio-dev@lists.oasis-open.org



On 9/26/2023 6:48 PM, Michael S. Tsirkin wrote:
> On Tue, Sep 26, 2023 at 05:25:42PM +0800, Zhu, Lingshan wrote:
>> We don't want to repeat the discussions, it looks like endless circle with
>> no direction.
> OK let me try to direct this discussion.
> You guys were speaking past each other, no dialog is happening.
> And as long as it goes on no progress will be made and you
> will keep going in circles.
>
> Parav here made an effort and attempted to summarize
> use-cases addressed by your proposal but not his.
> He couldn't resist adding "a yes but" in there oh well.
> But now I hope you know he knows about your use-cases?
>
> So please do the same. Do you see any advantages to Parav's
> proposal as compared to yours? Try to list them and
> if possible try not to accompany the list with "yes but"
> (put it in a separate mail if you must ;) ).
> If you won't be able to see any, let me know and I'll try to help.
>
> Once each of you and Parav have finally heard the other and
> the other also knows he's been heard, that's when we can
> try to make progress by looking for something that addresses
> all use-cases as opposed to endlessly repeating same arguments.
Sure Michael, I will not say "yes but" here.

 From Parav's proposal, he intends to migrate a member device by its 
owner device through the admin vq,
thus necessary admin vq commands are introduced in his series.


I see his proposal can:
1) meet some customers requirements without nested and bare-metal
2) align with Nvidia production
3) easier to emulate by onboard SOC

The general purpose of his proposal and mine are aligned: migrate virtio 
devices.

Jason has ever proposed to collaborate, please allow me quote his proposal:

"
Let me repeat once again here for the possible steps to collaboration:

1) define virtqueue state, inflight descriptors in the section of
basic facility but not under the admin commands
2) define the dirty page tracking, device context/states in the
section of basic facility but not under the admin commands
3) define transport specific interfaces or admin commands to access them
"

I totally agree with his proposal.

Does this work for you Michael?

Thanks
Zhu Lingshan

This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 29+ messages in thread

* [virtio-comment] RE: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state
  2023-09-27  8:20                                           ` Zhu, Lingshan
@ 2023-09-27 10:39                                             ` Parav Pandit
  2023-10-09 10:05                                               ` [virtio-comment] " Zhu, Lingshan
  2023-09-27 15:40                                             ` [virtio-comment] " Michael S. Tsirkin
  1 sibling, 1 reply; 29+ messages in thread
From: Parav Pandit @ 2023-09-27 10:39 UTC (permalink / raw)
  To: Zhu, Lingshan, Michael S. Tsirkin, Cornelia Huck
  Cc: Jason Wang, eperezma@redhat.com, Stefan Hajnoczi,
	virtio-comment@lists.oasis-open.org,
	virtio-dev@lists.oasis-open.org



> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> Sent: Wednesday, September 27, 2023 1:50 PM

> I see his proposal can:
> 1) meet some customers requirements without nested and bare-metal
> 2) align with Nvidia production
Slightly inaccurate.
The work produced is for the virtio spec update for the users.

I have missed adding other contributors Sign-off who also share similar use cases, which I will add in v1.

> 3) easier to emulate by onboard SOC
> 
> The general purpose of his proposal and mine are aligned: migrate virtio	
> devices.
> 
Great.

> Jason has ever proposed to collaborate, please allow me quote his proposal:
> 
> "
> Let me repeat once again here for the possible steps to collaboration:
> 
> 1) define virtqueue state, inflight descriptors in the section of
> basic facility but not under the admin commands
> 2) define the dirty page tracking, device context/states in the
> section of basic facility but not under the admin commands
> 3) define transport specific interfaces or admin commands to access them
> "
> 
> I totally agree with his proposal.

We started discussing some of the it.
If I draw parallels, one should not say "detach descriptors from virtqueue" for the infrastructure that exists in the basic facilities.
If so, one should explain the technical design reason and it would make sense.

So let's discuss it.
I like to better understand the _real_ technical reason for detaching it.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [virtio-comment] Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state
  2023-09-27  8:20                                           ` Zhu, Lingshan
  2023-09-27 10:39                                             ` [virtio-comment] " Parav Pandit
@ 2023-09-27 15:40                                             ` Michael S. Tsirkin
  2023-10-09 10:01                                               ` Zhu, Lingshan
  1 sibling, 1 reply; 29+ messages in thread
From: Michael S. Tsirkin @ 2023-09-27 15:40 UTC (permalink / raw)
  To: Zhu, Lingshan
  Cc: Cornelia Huck, Parav Pandit, Jason Wang, eperezma@redhat.com,
	Stefan Hajnoczi, virtio-comment@lists.oasis-open.org,
	virtio-dev@lists.oasis-open.org

On Wed, Sep 27, 2023 at 04:20:01PM +0800, Zhu, Lingshan wrote:
> 
> 
> On 9/26/2023 6:48 PM, Michael S. Tsirkin wrote:
> > On Tue, Sep 26, 2023 at 05:25:42PM +0800, Zhu, Lingshan wrote:
> > > We don't want to repeat the discussions, it looks like endless circle with
> > > no direction.
> > OK let me try to direct this discussion.
> > You guys were speaking past each other, no dialog is happening.
> > And as long as it goes on no progress will be made and you
> > will keep going in circles.
> > 
> > Parav here made an effort and attempted to summarize
> > use-cases addressed by your proposal but not his.
> > He couldn't resist adding "a yes but" in there oh well.
> > But now I hope you know he knows about your use-cases?
> > 
> > So please do the same. Do you see any advantages to Parav's
> > proposal as compared to yours? Try to list them and
> > if possible try not to accompany the list with "yes but"
> > (put it in a separate mail if you must ;) ).
> > If you won't be able to see any, let me know and I'll try to help.
> > 
> > Once each of you and Parav have finally heard the other and
> > the other also knows he's been heard, that's when we can
> > try to make progress by looking for something that addresses
> > all use-cases as opposed to endlessly repeating same arguments.
> Sure Michael, I will not say "yes but" here.
> 
> From Parav's proposal, he intends to migrate a member device by its owner
> device through the admin vq,
> thus necessary admin vq commands are introduced in his series.
> 
> 
> I see his proposal can:
> 1) meet some customers requirements without nested and bare-metal
> 2) align with Nvidia production
> 3) easier to emulate by onboard SOC

Is that all you can see?

Hint: there's more.





> The general purpose of his proposal and mine are aligned: migrate virtio
> devices.
> 
> Jason has ever proposed to collaborate, please allow me quote his proposal:
> 
> "
> Let me repeat once again here for the possible steps to collaboration:
> 
> 1) define virtqueue state, inflight descriptors in the section of
> basic facility but not under the admin commands
> 2) define the dirty page tracking, device context/states in the
> section of basic facility but not under the admin commands
> 3) define transport specific interfaces or admin commands to access them
> "
> 
> I totally agree with his proposal.
> 
> Does this work for you Michael?
> 
> Thanks
> Zhu Lingshan

I just doubt very much this will work.  What will "define" mean then -
not an interface, just a description in english? I think you
underestimate the difficulty of creating such definitions that
are robust and precise.


Instead I suggest you define a way to submit admin commands that works
for nested and bare-metal (i.e. not admin vq, and not with sriov group
type). And work with Parav to make live migration admin commands work
reasonably will through this interface and with this type.

-- 
MST


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 29+ messages in thread

* [virtio-comment] Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state
  2023-09-27 15:40                                             ` [virtio-comment] " Michael S. Tsirkin
@ 2023-10-09 10:01                                               ` Zhu, Lingshan
  2023-10-11 10:20                                                 ` Michael S. Tsirkin
  0 siblings, 1 reply; 29+ messages in thread
From: Zhu, Lingshan @ 2023-10-09 10:01 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Cornelia Huck, Parav Pandit, Jason Wang, eperezma@redhat.com,
	Stefan Hajnoczi, virtio-comment@lists.oasis-open.org,
	virtio-dev@lists.oasis-open.org



On 9/27/2023 11:40 PM, Michael S. Tsirkin wrote:
> On Wed, Sep 27, 2023 at 04:20:01PM +0800, Zhu, Lingshan wrote:
>>
>> On 9/26/2023 6:48 PM, Michael S. Tsirkin wrote:
>>> On Tue, Sep 26, 2023 at 05:25:42PM +0800, Zhu, Lingshan wrote:
>>>> We don't want to repeat the discussions, it looks like endless circle with
>>>> no direction.
>>> OK let me try to direct this discussion.
>>> You guys were speaking past each other, no dialog is happening.
>>> And as long as it goes on no progress will be made and you
>>> will keep going in circles.
>>>
>>> Parav here made an effort and attempted to summarize
>>> use-cases addressed by your proposal but not his.
>>> He couldn't resist adding "a yes but" in there oh well.
>>> But now I hope you know he knows about your use-cases?
>>>
>>> So please do the same. Do you see any advantages to Parav's
>>> proposal as compared to yours? Try to list them and
>>> if possible try not to accompany the list with "yes but"
>>> (put it in a separate mail if you must ;) ).
>>> If you won't be able to see any, let me know and I'll try to help.
>>>
>>> Once each of you and Parav have finally heard the other and
>>> the other also knows he's been heard, that's when we can
>>> try to make progress by looking for something that addresses
>>> all use-cases as opposed to endlessly repeating same arguments.
>> Sure Michael, I will not say "yes but" here.
>>
>>  From Parav's proposal, he intends to migrate a member device by its owner
>> device through the admin vq,
>> thus necessary admin vq commands are introduced in his series.
>>
>>
>> I see his proposal can:
>> 1) meet some customers requirements without nested and bare-metal
>> 2) align with Nvidia production
>> 3) easier to emulate by onboard SOC
> Is that all you can see?
>
> Hint: there's more.
please help provide more.
>
>
>
>
>
>> The general purpose of his proposal and mine are aligned: migrate virtio
>> devices.
>>
>> Jason has ever proposed to collaborate, please allow me quote his proposal:
>>
>> "
>> Let me repeat once again here for the possible steps to collaboration:
>>
>> 1) define virtqueue state, inflight descriptors in the section of
>> basic facility but not under the admin commands
>> 2) define the dirty page tracking, device context/states in the
>> section of basic facility but not under the admin commands
>> 3) define transport specific interfaces or admin commands to access them
>> "
>>
>> I totally agree with his proposal.
>>
>> Does this work for you Michael?
>>
>> Thanks
>> Zhu Lingshan
> I just doubt very much this will work.  What will "define" mean then -
> not an interface, just a description in english? I think you
> underestimate the difficulty of creating such definitions that
> are robust and precise.
I think we can review the patch to correct the words.
>
>
> Instead I suggest you define a way to submit admin commands that works
> for nested and bare-metal (i.e. not admin vq, and not with sriov group
> type). And work with Parav to make live migration admin commands work
> reasonably will through this interface and with this type.
why admin commands are better than registers?
>


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 29+ messages in thread

* [virtio-comment] Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state
  2023-09-27 10:39                                             ` [virtio-comment] " Parav Pandit
@ 2023-10-09 10:05                                               ` Zhu, Lingshan
  2023-10-09 10:07                                                 ` [virtio-comment] " Parav Pandit
  0 siblings, 1 reply; 29+ messages in thread
From: Zhu, Lingshan @ 2023-10-09 10:05 UTC (permalink / raw)
  To: Parav Pandit, Michael S. Tsirkin, Cornelia Huck
  Cc: Jason Wang, eperezma@redhat.com, Stefan Hajnoczi,
	virtio-comment@lists.oasis-open.org,
	virtio-dev@lists.oasis-open.org



On 9/27/2023 6:39 PM, Parav Pandit wrote:
>
>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>> Sent: Wednesday, September 27, 2023 1:50 PM
>> I see his proposal can:
>> 1) meet some customers requirements without nested and bare-metal
>> 2) align with Nvidia production
> Slightly inaccurate.
> The work produced is for the virtio spec update for the users.
>
> I have missed adding other contributors Sign-off who also share similar use cases, which I will add in v1.
>
>> 3) easier to emulate by onboard SOC
>>
>> The general purpose of his proposal and mine are aligned: migrate virtio	
>> devices.
>>
> Great.
>
>> Jason has ever proposed to collaborate, please allow me quote his proposal:
>>
>> "
>> Let me repeat once again here for the possible steps to collaboration:
>>
>> 1) define virtqueue state, inflight descriptors in the section of
>> basic facility but not under the admin commands
>> 2) define the dirty page tracking, device context/states in the
>> section of basic facility but not under the admin commands
>> 3) define transport specific interfaces or admin commands to access them
>> "
>>
>> I totally agree with his proposal.
> We started discussing some of the it.
> If I draw parallels, one should not say "detach descriptors from virtqueue" for the infrastructure that exists in the basic facilities.
> If so, one should explain the technical design reason and it would make sense.
not sure what is  "detach descriptors from virtqueue", but admin vq 
carries commands anyway.
>
> So let's discuss it.
> I like to better understand the _real_ technical reason for detaching it.
so please to or cc me in your series.


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 29+ messages in thread

* [virtio-comment] RE: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state
  2023-10-09 10:05                                               ` [virtio-comment] " Zhu, Lingshan
@ 2023-10-09 10:07                                                 ` Parav Pandit
  0 siblings, 0 replies; 29+ messages in thread
From: Parav Pandit @ 2023-10-09 10:07 UTC (permalink / raw)
  To: Zhu, Lingshan, Michael S. Tsirkin, Cornelia Huck
  Cc: Jason Wang, eperezma@redhat.com, Stefan Hajnoczi,
	virtio-comment@lists.oasis-open.org,
	virtio-dev@lists.oasis-open.org


> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> Sent: Monday, October 9, 2023 3:36 PM
> 
> On 9/27/2023 6:39 PM, Parav Pandit wrote:
> >
> >> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> >> Sent: Wednesday, September 27, 2023 1:50 PM I see his proposal can:
> >> 1) meet some customers requirements without nested and bare-metal
> >> 2) align with Nvidia production
> > Slightly inaccurate.
> > The work produced is for the virtio spec update for the users.
> >
> > I have missed adding other contributors Sign-off who also share similar use
> cases, which I will add in v1.
> >
> >> 3) easier to emulate by onboard SOC
> >>
> >> The general purpose of his proposal and mine are aligned: migrate virtio
> 
> >> devices.
> >>
> > Great.
> >
> >> Jason has ever proposed to collaborate, please allow me quote his proposal:
> >>
> >> "
> >> Let me repeat once again here for the possible steps to collaboration:
> >>
> >> 1) define virtqueue state, inflight descriptors in the section of
> >> basic facility but not under the admin commands
> >> 2) define the dirty page tracking, device context/states in the
> >> section of basic facility but not under the admin commands
> >> 3) define transport specific interfaces or admin commands to access
> >> them "
> >>
> >> I totally agree with his proposal.
> > We started discussing some of the it.
> > If I draw parallels, one should not say "detach descriptors from virtqueue" for
> the infrastructure that exists in the basic facilities.
> > If so, one should explain the technical design reason and it would make sense.
> not sure what is  "detach descriptors from virtqueue", but admin vq carries
> commands anyway.
> >
> > So let's discuss it.
> > I like to better understand the _real_ technical reason for detaching it.
> so please to or cc me in your series.
Sure, will do it in the v2.
Jason already added you in the v1.
It is already in the virtio-comment mailing list, so for now you can respond to v1.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [virtio-comment] Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state
  2023-10-09 10:01                                               ` Zhu, Lingshan
@ 2023-10-11 10:20                                                 ` Michael S. Tsirkin
  2023-10-11 10:38                                                   ` Zhu, Lingshan
  0 siblings, 1 reply; 29+ messages in thread
From: Michael S. Tsirkin @ 2023-10-11 10:20 UTC (permalink / raw)
  To: Zhu, Lingshan
  Cc: Cornelia Huck, Parav Pandit, Jason Wang, eperezma@redhat.com,
	Stefan Hajnoczi, virtio-comment@lists.oasis-open.org,
	virtio-dev@lists.oasis-open.org

On Mon, Oct 09, 2023 at 06:01:42PM +0800, Zhu, Lingshan wrote:
> 
> 
> On 9/27/2023 11:40 PM, Michael S. Tsirkin wrote:
> > On Wed, Sep 27, 2023 at 04:20:01PM +0800, Zhu, Lingshan wrote:
> > > 
> > > On 9/26/2023 6:48 PM, Michael S. Tsirkin wrote:
> > > > On Tue, Sep 26, 2023 at 05:25:42PM +0800, Zhu, Lingshan wrote:
> > > > > We don't want to repeat the discussions, it looks like endless circle with
> > > > > no direction.
> > > > OK let me try to direct this discussion.
> > > > You guys were speaking past each other, no dialog is happening.
> > > > And as long as it goes on no progress will be made and you
> > > > will keep going in circles.
> > > > 
> > > > Parav here made an effort and attempted to summarize
> > > > use-cases addressed by your proposal but not his.
> > > > He couldn't resist adding "a yes but" in there oh well.
> > > > But now I hope you know he knows about your use-cases?
> > > > 
> > > > So please do the same. Do you see any advantages to Parav's
> > > > proposal as compared to yours? Try to list them and
> > > > if possible try not to accompany the list with "yes but"
> > > > (put it in a separate mail if you must ;) ).
> > > > If you won't be able to see any, let me know and I'll try to help.
> > > > 
> > > > Once each of you and Parav have finally heard the other and
> > > > the other also knows he's been heard, that's when we can
> > > > try to make progress by looking for something that addresses
> > > > all use-cases as opposed to endlessly repeating same arguments.
> > > Sure Michael, I will not say "yes but" here.
> > > 
> > >  From Parav's proposal, he intends to migrate a member device by its owner
> > > device through the admin vq,
> > > thus necessary admin vq commands are introduced in his series.
> > > 
> > > 
> > > I see his proposal can:
> > > 1) meet some customers requirements without nested and bare-metal
> > > 2) align with Nvidia production
> > > 3) easier to emulate by onboard SOC
> > Is that all you can see?
> > 
> > Hint: there's more.
> please help provide more.

Just a small subset off the top of my head:
Error handling.
Extendable to other group types such as SIOV.
Batching of commands
less pci transactioons
Support for keeping some data off-device

which does not mean it's better unconditionally.
are above points clear?

as long as you guys keep not hearing each other we will keep
seeing these flame wars. if you expect everyone on virtio-comment
to follow a 300 message thread you are imo very much mistaken.

> > 
> > 
> > 
> > 
> > 
> > > The general purpose of his proposal and mine are aligned: migrate virtio
> > > devices.
> > > 
> > > Jason has ever proposed to collaborate, please allow me quote his proposal:
> > > 
> > > "
> > > Let me repeat once again here for the possible steps to collaboration:
> > > 
> > > 1) define virtqueue state, inflight descriptors in the section of
> > > basic facility but not under the admin commands
> > > 2) define the dirty page tracking, device context/states in the
> > > section of basic facility but not under the admin commands
> > > 3) define transport specific interfaces or admin commands to access them
> > > "
> > > 
> > > I totally agree with his proposal.
> > > 
> > > Does this work for you Michael?
> > > 
> > > Thanks
> > > Zhu Lingshan
> > I just doubt very much this will work.  What will "define" mean then -
> > not an interface, just a description in english? I think you
> > underestimate the difficulty of creating such definitions that
> > are robust and precise.
> I think we can review the patch to correct the words.
> > 
> > 
> > Instead I suggest you define a way to submit admin commands that works
> > for nested and bare-metal (i.e. not admin vq, and not with sriov group
> > type). And work with Parav to make live migration admin commands work
> > reasonably will through this interface and with this type.
> why admin commands are better than registers?
> > 
> 
> 
> This publicly archived list offers a means to provide input to the
> OASIS Virtual I/O Device (VIRTIO) TC.
> 
> In order to verify user consent to the Feedback License terms and
> to minimize spam in the list archive, subscription is required
> before posting.
> 
> Subscribe: virtio-comment-subscribe@lists.oasis-open.org
> Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
> List help: virtio-comment-help@lists.oasis-open.org
> List archive: https://lists.oasis-open.org/archives/virtio-comment/
> Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
> List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
> Committee: https://www.oasis-open.org/committees/virtio/
> Join OASIS: https://www.oasis-open.org/join/
> 


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [virtio-comment] Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state
  2023-10-11 10:20                                                 ` Michael S. Tsirkin
@ 2023-10-11 10:38                                                   ` Zhu, Lingshan
  2023-10-11 11:52                                                     ` Parav Pandit
  2023-10-12  9:59                                                     ` Michael S. Tsirkin
  0 siblings, 2 replies; 29+ messages in thread
From: Zhu, Lingshan @ 2023-10-11 10:38 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Cornelia Huck, Parav Pandit, Jason Wang, eperezma@redhat.com,
	Stefan Hajnoczi, virtio-comment@lists.oasis-open.org,
	virtio-dev@lists.oasis-open.org



On 10/11/2023 6:20 PM, Michael S. Tsirkin wrote:
> On Mon, Oct 09, 2023 at 06:01:42PM +0800, Zhu, Lingshan wrote:
>>
>> On 9/27/2023 11:40 PM, Michael S. Tsirkin wrote:
>>> On Wed, Sep 27, 2023 at 04:20:01PM +0800, Zhu, Lingshan wrote:
>>>> On 9/26/2023 6:48 PM, Michael S. Tsirkin wrote:
>>>>> On Tue, Sep 26, 2023 at 05:25:42PM +0800, Zhu, Lingshan wrote:
>>>>>> We don't want to repeat the discussions, it looks like endless circle with
>>>>>> no direction.
>>>>> OK let me try to direct this discussion.
>>>>> You guys were speaking past each other, no dialog is happening.
>>>>> And as long as it goes on no progress will be made and you
>>>>> will keep going in circles.
>>>>>
>>>>> Parav here made an effort and attempted to summarize
>>>>> use-cases addressed by your proposal but not his.
>>>>> He couldn't resist adding "a yes but" in there oh well.
>>>>> But now I hope you know he knows about your use-cases?
>>>>>
>>>>> So please do the same. Do you see any advantages to Parav's
>>>>> proposal as compared to yours? Try to list them and
>>>>> if possible try not to accompany the list with "yes but"
>>>>> (put it in a separate mail if you must ;) ).
>>>>> If you won't be able to see any, let me know and I'll try to help.
>>>>>
>>>>> Once each of you and Parav have finally heard the other and
>>>>> the other also knows he's been heard, that's when we can
>>>>> try to make progress by looking for something that addresses
>>>>> all use-cases as opposed to endlessly repeating same arguments.
>>>> Sure Michael, I will not say "yes but" here.
>>>>
>>>>   From Parav's proposal, he intends to migrate a member device by its owner
>>>> device through the admin vq,
>>>> thus necessary admin vq commands are introduced in his series.
>>>>
>>>>
>>>> I see his proposal can:
>>>> 1) meet some customers requirements without nested and bare-metal
>>>> 2) align with Nvidia production
>>>> 3) easier to emulate by onboard SOC
>>> Is that all you can see?
>>>
>>> Hint: there's more.
>> please help provide more.
> Just a small subset off the top of my head:
> Error handling.
handle failed live migration? how?

and for other errors, we have mature error handling solutions
in virtio for years, like re-read, NEEDS_RESET.

If that is not good enough, then the corollary is:
admin vq is better than config space,
then the further corollary could be:
we should refactor virito-pci interfaces to admin vq commands,
like how we handle features

Is that true?
> Extendable to other group types such as SIOV.
For SIOV, the admin vq is a transport, but for SR-IOV
the admin vq is a control channel, that is different,
and admin vq can be a side channel.

For example, for SIOV, we config and migrate MSIX through
admin vq. For SRIOV, they are in config space.
> Batching of commands
> less pci transactioons
so this can still be a QOS issue.
If batching, others to starve?
> Support for keeping some data off-device
I don't get it, what is off-device?
The live migration facilities need to fetch data from the device anyway
>
> which does not mean it's better unconditionally.
> are above points clear?
The thing is, what blocks the config space solution?
Why admin vq is a must for live migration?
What's wrong in config space solution?
Shall we refactor everything in virtio-pci to use admin vq?
>
> as long as you guys keep not hearing each other we will keep
> seeing these flame wars. if you expect everyone on virtio-comment
> to follow a 300 message thread you are imo very much mistaken.
I am sure I have not ignored any questions.
I am saying admin vq is problematic for live migration,
at least it doesn't work for nested, so why admin vq is a must for live 
migration?
>
>>>
>>>
>>>
>>>
>>>> The general purpose of his proposal and mine are aligned: migrate virtio
>>>> devices.
>>>>
>>>> Jason has ever proposed to collaborate, please allow me quote his proposal:
>>>>
>>>> "
>>>> Let me repeat once again here for the possible steps to collaboration:
>>>>
>>>> 1) define virtqueue state, inflight descriptors in the section of
>>>> basic facility but not under the admin commands
>>>> 2) define the dirty page tracking, device context/states in the
>>>> section of basic facility but not under the admin commands
>>>> 3) define transport specific interfaces or admin commands to access them
>>>> "
>>>>
>>>> I totally agree with his proposal.
>>>>
>>>> Does this work for you Michael?
>>>>
>>>> Thanks
>>>> Zhu Lingshan
>>> I just doubt very much this will work.  What will "define" mean then -
>>> not an interface, just a description in english? I think you
>>> underestimate the difficulty of creating such definitions that
>>> are robust and precise.
>> I think we can review the patch to correct the words.
>>>
>>> Instead I suggest you define a way to submit admin commands that works
>>> for nested and bare-metal (i.e. not admin vq, and not with sriov group
>>> type). And work with Parav to make live migration admin commands work
>>> reasonably will through this interface and with this type.
>> why admin commands are better than registers?
>>
>> This publicly archived list offers a means to provide input to the
>> OASIS Virtual I/O Device (VIRTIO) TC.
>>
>> In order to verify user consent to the Feedback License terms and
>> to minimize spam in the list archive, subscription is required
>> before posting.
>>
>> Subscribe: virtio-comment-subscribe@lists.oasis-open.org
>> Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
>> List help: virtio-comment-help@lists.oasis-open.org
>> List archive: https://lists.oasis-open.org/archives/virtio-comment/
>> Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
>> List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
>> Committee: https://www.oasis-open.org/committees/virtio/
>> Join OASIS: https://www.oasis-open.org/join/
>>


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 29+ messages in thread

* RE: [virtio-comment] Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state
  2023-10-11 10:38                                                   ` Zhu, Lingshan
@ 2023-10-11 11:52                                                     ` Parav Pandit
  2023-10-12 10:57                                                       ` Zhu, Lingshan
  2023-10-12  9:59                                                     ` Michael S. Tsirkin
  1 sibling, 1 reply; 29+ messages in thread
From: Parav Pandit @ 2023-10-11 11:52 UTC (permalink / raw)
  To: Zhu, Lingshan, Michael S. Tsirkin
  Cc: Cornelia Huck, Jason Wang, eperezma@redhat.com, Stefan Hajnoczi,
	virtio-comment@lists.oasis-open.org,
	virtio-dev@lists.oasis-open.org


> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> Sent: Wednesday, October 11, 2023 4:09 PM

> I am sure I have not ignored any questions.
What about below one?

https://lore.kernel.org/virtio-dev/20230921011221-mutt-send-email-mst@kernel.org/


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [virtio-comment] Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state
  2023-10-11 10:38                                                   ` Zhu, Lingshan
  2023-10-11 11:52                                                     ` Parav Pandit
@ 2023-10-12  9:59                                                     ` Michael S. Tsirkin
  2023-10-12 10:49                                                       ` Zhu, Lingshan
  1 sibling, 1 reply; 29+ messages in thread
From: Michael S. Tsirkin @ 2023-10-12  9:59 UTC (permalink / raw)
  To: Zhu, Lingshan
  Cc: Cornelia Huck, Parav Pandit, Jason Wang, eperezma@redhat.com,
	Stefan Hajnoczi, virtio-comment@lists.oasis-open.org,
	virtio-dev@lists.oasis-open.org

On Wed, Oct 11, 2023 at 06:38:32PM +0800, Zhu, Lingshan wrote:
> 
> 
> On 10/11/2023 6:20 PM, Michael S. Tsirkin wrote:
> > On Mon, Oct 09, 2023 at 06:01:42PM +0800, Zhu, Lingshan wrote:
> > > 
> > > On 9/27/2023 11:40 PM, Michael S. Tsirkin wrote:
> > > > On Wed, Sep 27, 2023 at 04:20:01PM +0800, Zhu, Lingshan wrote:
> > > > > On 9/26/2023 6:48 PM, Michael S. Tsirkin wrote:
> > > > > > On Tue, Sep 26, 2023 at 05:25:42PM +0800, Zhu, Lingshan wrote:
> > > > > > > We don't want to repeat the discussions, it looks like endless circle with
> > > > > > > no direction.
> > > > > > OK let me try to direct this discussion.
> > > > > > You guys were speaking past each other, no dialog is happening.
> > > > > > And as long as it goes on no progress will be made and you
> > > > > > will keep going in circles.
> > > > > > 
> > > > > > Parav here made an effort and attempted to summarize
> > > > > > use-cases addressed by your proposal but not his.
> > > > > > He couldn't resist adding "a yes but" in there oh well.
> > > > > > But now I hope you know he knows about your use-cases?
> > > > > > 
> > > > > > So please do the same. Do you see any advantages to Parav's
> > > > > > proposal as compared to yours? Try to list them and
> > > > > > if possible try not to accompany the list with "yes but"
> > > > > > (put it in a separate mail if you must ;) ).
> > > > > > If you won't be able to see any, let me know and I'll try to help.
> > > > > > 
> > > > > > Once each of you and Parav have finally heard the other and
> > > > > > the other also knows he's been heard, that's when we can
> > > > > > try to make progress by looking for something that addresses
> > > > > > all use-cases as opposed to endlessly repeating same arguments.
> > > > > Sure Michael, I will not say "yes but" here.
> > > > > 
> > > > >   From Parav's proposal, he intends to migrate a member device by its owner
> > > > > device through the admin vq,
> > > > > thus necessary admin vq commands are introduced in his series.
> > > > > 
> > > > > 
> > > > > I see his proposal can:
> > > > > 1) meet some customers requirements without nested and bare-metal
> > > > > 2) align with Nvidia production
> > > > > 3) easier to emulate by onboard SOC
> > > > Is that all you can see?
> > > > 
> > > > Hint: there's more.
> > > please help provide more.
> > Just a small subset off the top of my head:
> > Error handling.
> handle failed live migration? how?

For example you can try restarting VM on source.
Or at least report an error to hypervisor.


> and for other errors, we have mature error handling solutions
> in virtio for years, like re-read, NEEDS_RESET.

facepalm

Are you aware of the fact that Linux still doesn't support
it since it turned out to be an extremely awkward interface
to use?

> If that is not good enough, then the corollary is:
> admin vq is better than config space,


You keep confusing admin vq with admin commands.


> then the further corollary could be:
> we should refactor virito-pci interfaces to admin vq commands,
> like how we handle features
> 
> Is that true?
> > Extendable to other group types such as SIOV.
> For SIOV, the admin vq is a transport, but for SR-IOV
> the admin vq is a control channel, that is different,
> and admin vq can be a side channel.
> 
> For example, for SIOV, we config and migrate MSIX through
> admin vq. For SRIOV, they are in config space.

And that's a mess. FYI we already got feedback from Linux devs
who are wondering why we can't come up with a consistent
interface that does everything.


> > Batching of commands
> > less pci transactioons
> so this can still be a QOS issue.
> If batching, others to starve?

And if you block CPU since you are not accepting
a posted write this is better?

> > Support for keeping some data off-device
> I don't get it, what is off-device?
> The live migration facilities need to fetch data from the device anyway

Heh this is what was driving nvidia to use DMA so heavily all this time.
no - if data is not in registers, device can fetch the data from
across pci express link, presumably with a local cache.


> > 
> > which does not mean it's better unconditionally.
> > are above points clear?
> The thing is, what blocks the config space solution?
> Why admin vq is a must for live migration?
> What's wrong in config space solution?

Whan you say what's wrong do you mean you still see no
advantages to doing DMA at all? config space is just better
with no drawbacks?

> Shall we refactor everything in virtio-pci to use admin vq?

> > 
> > as long as you guys keep not hearing each other we will keep
> > seeing these flame wars. if you expect everyone on virtio-comment
> > to follow a 300 message thread you are imo very much mistaken.
> I am sure I have not ignored any questions.
> I am saying admin vq is problematic for live migration,
> at least it doesn't work for nested, so why admin vq is a must for live
> migration?


My suggestion for you was to add admin command support to
VF memory, as an alternative to admin vq. It looks like that
will address the nested virt usecase.

> > 
> > > > 
> > > > 
> > > > 
> > > > 
> > > > > The general purpose of his proposal and mine are aligned: migrate virtio
> > > > > devices.
> > > > > 
> > > > > Jason has ever proposed to collaborate, please allow me quote his proposal:
> > > > > 
> > > > > "
> > > > > Let me repeat once again here for the possible steps to collaboration:
> > > > > 
> > > > > 1) define virtqueue state, inflight descriptors in the section of
> > > > > basic facility but not under the admin commands
> > > > > 2) define the dirty page tracking, device context/states in the
> > > > > section of basic facility but not under the admin commands
> > > > > 3) define transport specific interfaces or admin commands to access them
> > > > > "
> > > > > 
> > > > > I totally agree with his proposal.
> > > > > 
> > > > > Does this work for you Michael?
> > > > > 
> > > > > Thanks
> > > > > Zhu Lingshan
> > > > I just doubt very much this will work.  What will "define" mean then -
> > > > not an interface, just a description in english? I think you
> > > > underestimate the difficulty of creating such definitions that
> > > > are robust and precise.
> > > I think we can review the patch to correct the words.
> > > > 
> > > > Instead I suggest you define a way to submit admin commands that works
> > > > for nested and bare-metal (i.e. not admin vq, and not with sriov group
> > > > type). And work with Parav to make live migration admin commands work
> > > > reasonably will through this interface and with this type.
> > > why admin commands are better than registers?
> > > 
> > > This publicly archived list offers a means to provide input to the
> > > OASIS Virtual I/O Device (VIRTIO) TC.
> > > 
> > > In order to verify user consent to the Feedback License terms and
> > > to minimize spam in the list archive, subscription is required
> > > before posting.
> > > 
> > > Subscribe: virtio-comment-subscribe@lists.oasis-open.org
> > > Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
> > > List help: virtio-comment-help@lists.oasis-open.org
> > > List archive: https://lists.oasis-open.org/archives/virtio-comment/
> > > Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
> > > List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
> > > Committee: https://www.oasis-open.org/committees/virtio/
> > > Join OASIS: https://www.oasis-open.org/join/
> > > 


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [virtio-comment] Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state
  2023-10-12  9:59                                                     ` Michael S. Tsirkin
@ 2023-10-12 10:49                                                       ` Zhu, Lingshan
  2023-10-12 11:12                                                         ` Michael S. Tsirkin
  2023-10-12 14:38                                                         ` Michael S. Tsirkin
  0 siblings, 2 replies; 29+ messages in thread
From: Zhu, Lingshan @ 2023-10-12 10:49 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Cornelia Huck, Parav Pandit, Jason Wang, eperezma@redhat.com,
	Stefan Hajnoczi, virtio-comment@lists.oasis-open.org,
	virtio-dev@lists.oasis-open.org



On 10/12/2023 5:59 PM, Michael S. Tsirkin wrote:
> On Wed, Oct 11, 2023 at 06:38:32PM +0800, Zhu, Lingshan wrote:
>>
>> On 10/11/2023 6:20 PM, Michael S. Tsirkin wrote:
>>> On Mon, Oct 09, 2023 at 06:01:42PM +0800, Zhu, Lingshan wrote:
>>>> On 9/27/2023 11:40 PM, Michael S. Tsirkin wrote:
>>>>> On Wed, Sep 27, 2023 at 04:20:01PM +0800, Zhu, Lingshan wrote:
>>>>>> On 9/26/2023 6:48 PM, Michael S. Tsirkin wrote:
>>>>>>> On Tue, Sep 26, 2023 at 05:25:42PM +0800, Zhu, Lingshan wrote:
>>>>>>>> We don't want to repeat the discussions, it looks like endless circle with
>>>>>>>> no direction.
>>>>>>> OK let me try to direct this discussion.
>>>>>>> You guys were speaking past each other, no dialog is happening.
>>>>>>> And as long as it goes on no progress will be made and you
>>>>>>> will keep going in circles.
>>>>>>>
>>>>>>> Parav here made an effort and attempted to summarize
>>>>>>> use-cases addressed by your proposal but not his.
>>>>>>> He couldn't resist adding "a yes but" in there oh well.
>>>>>>> But now I hope you know he knows about your use-cases?
>>>>>>>
>>>>>>> So please do the same. Do you see any advantages to Parav's
>>>>>>> proposal as compared to yours? Try to list them and
>>>>>>> if possible try not to accompany the list with "yes but"
>>>>>>> (put it in a separate mail if you must ;) ).
>>>>>>> If you won't be able to see any, let me know and I'll try to help.
>>>>>>>
>>>>>>> Once each of you and Parav have finally heard the other and
>>>>>>> the other also knows he's been heard, that's when we can
>>>>>>> try to make progress by looking for something that addresses
>>>>>>> all use-cases as opposed to endlessly repeating same arguments.
>>>>>> Sure Michael, I will not say "yes but" here.
>>>>>>
>>>>>>    From Parav's proposal, he intends to migrate a member device by its owner
>>>>>> device through the admin vq,
>>>>>> thus necessary admin vq commands are introduced in his series.
>>>>>>
>>>>>>
>>>>>> I see his proposal can:
>>>>>> 1) meet some customers requirements without nested and bare-metal
>>>>>> 2) align with Nvidia production
>>>>>> 3) easier to emulate by onboard SOC
>>>>> Is that all you can see?
>>>>>
>>>>> Hint: there's more.
>>>> please help provide more.
>>> Just a small subset off the top of my head:
>>> Error handling.
>> handle failed live migration? how?
> For example you can try restarting VM on source.
> Or at least report an error to hypervisor.
I am not sure resetting a VM due to failed live migration is
a good idea, should we resume the VM instead? Then try other
convergence algorithm?

And I think current live migration solution already implements error
detector, like sees a time out?
>
>
>> and for other errors, we have mature error handling solutions
>> in virtio for years, like re-read, NEEDS_RESET.
> facepalm
>
> Are you aware of the fact that Linux still doesn't support
> it since it turned out to be an extremely awkward interface
> to use?
I think we have implemented this in virtio driver,
like re-read to check FEATURES.
>
>> If that is not good enough, then the corollary is:
>> admin vq is better than config space,
>
> You keep confusing admin vq with admin commands.
OK, so are admin commands better than registers?
>
>
>> then the further corollary could be:
>> we should refactor virito-pci interfaces to admin vq commands,
>> like how we handle features
>>
>> Is that true?
>>> Extendable to other group types such as SIOV.
>> For SIOV, the admin vq is a transport, but for SR-IOV
>> the admin vq is a control channel, that is different,
>> and admin vq can be a side channel.
>>
>> For example, for SIOV, we config and migrate MSIX through
>> admin vq. For SRIOV, they are in config space.
> And that's a mess. FYI we already got feedback from Linux devs
> who are wondering why we can't come up with a consistent
> interface that does everything.
I believe config space is a consistent interface for PCI.
For SIOV, we need a new transport layer anyway.
>
>
>>> Batching of commands
>>> less pci transactioons
>> so this can still be a QOS issue.
>> If batching, others to starve?
> And if you block CPU since you are not accepting
> a posted write this is better?
I don't get it, block guest CPU?
>
>>> Support for keeping some data off-device
>> I don't get it, what is off-device?
>> The live migration facilities need to fetch data from the device anyway
> Heh this is what was driving nvidia to use DMA so heavily all this time.
> no - if data is not in registers, device can fetch the data from
> across pci express link, presumably with a local cache.
For PCI based configuration, like MSI, we need to fetch from config 
space anyway.
For others like dirty page, we can store the bitmap in host memory, and use
PASID for isolation.
>
>
>>> which does not mean it's better unconditionally.
>>> are above points clear?
>> The thing is, what blocks the config space solution?
>> Why admin vq is a must for live migration?
>> What's wrong in config space solution?
> Whan you say what's wrong do you mean you still see no
> advantages to doing DMA at all? config space is just better
> with no drawbacks?
still, if admin vq or admin commands are better than config space,
we should refactor the whole virtio-pci interfaces to admin vq.

And Jason has ever proposed to build admin vq LM on our basic
facilities, but I see this has been rejected.
>
>> Shall we refactor everything in virtio-pci to use admin vq?
>>> as long as you guys keep not hearing each other we will keep
>>> seeing these flame wars. if you expect everyone on virtio-comment
>>> to follow a 300 message thread you are imo very much mistaken.
>> I am sure I have not ignored any questions.
>> I am saying admin vq is problematic for live migration,
>> at least it doesn't work for nested, so why admin vq is a must for live
>> migration?
>
> My suggestion for you was to add admin command support to
> VF memory, as an alternative to admin vq. It looks like that
> will address the nested virt usecase.
If you mean carrying some big bulk of data like dirty page information,
we implemented a facility in host memory which is isolated by PASID.

I should send a new series soon, so we can work on the patch.

Thanks for your suggestions and efforts anyway.
>
>>>>>
>>>>>
>>>>>
>>>>>> The general purpose of his proposal and mine are aligned: migrate virtio
>>>>>> devices.
>>>>>>
>>>>>> Jason has ever proposed to collaborate, please allow me quote his proposal:
>>>>>>
>>>>>> "
>>>>>> Let me repeat once again here for the possible steps to collaboration:
>>>>>>
>>>>>> 1) define virtqueue state, inflight descriptors in the section of
>>>>>> basic facility but not under the admin commands
>>>>>> 2) define the dirty page tracking, device context/states in the
>>>>>> section of basic facility but not under the admin commands
>>>>>> 3) define transport specific interfaces or admin commands to access them
>>>>>> "
>>>>>>
>>>>>> I totally agree with his proposal.
>>>>>>
>>>>>> Does this work for you Michael?
>>>>>>
>>>>>> Thanks
>>>>>> Zhu Lingshan
>>>>> I just doubt very much this will work.  What will "define" mean then -
>>>>> not an interface, just a description in english? I think you
>>>>> underestimate the difficulty of creating such definitions that
>>>>> are robust and precise.
>>>> I think we can review the patch to correct the words.
>>>>> Instead I suggest you define a way to submit admin commands that works
>>>>> for nested and bare-metal (i.e. not admin vq, and not with sriov group
>>>>> type). And work with Parav to make live migration admin commands work
>>>>> reasonably will through this interface and with this type.
>>>> why admin commands are better than registers?
>>>>
>>>> This publicly archived list offers a means to provide input to the
>>>> OASIS Virtual I/O Device (VIRTIO) TC.
>>>>
>>>> In order to verify user consent to the Feedback License terms and
>>>> to minimize spam in the list archive, subscription is required
>>>> before posting.
>>>>
>>>> Subscribe: virtio-comment-subscribe@lists.oasis-open.org
>>>> Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
>>>> List help: virtio-comment-help@lists.oasis-open.org
>>>> List archive: https://lists.oasis-open.org/archives/virtio-comment/
>>>> Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
>>>> List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
>>>> Committee: https://www.oasis-open.org/committees/virtio/
>>>> Join OASIS: https://www.oasis-open.org/join/
>>>>


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [virtio-comment] Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state
  2023-10-11 11:52                                                     ` Parav Pandit
@ 2023-10-12 10:57                                                       ` Zhu, Lingshan
  2023-10-12 11:13                                                         ` Michael S. Tsirkin
  0 siblings, 1 reply; 29+ messages in thread
From: Zhu, Lingshan @ 2023-10-12 10:57 UTC (permalink / raw)
  To: Parav Pandit, Michael S. Tsirkin
  Cc: Cornelia Huck, Jason Wang, eperezma@redhat.com, Stefan Hajnoczi,
	virtio-comment@lists.oasis-open.org,
	virtio-dev@lists.oasis-open.org



On 10/11/2023 7:52 PM, Parav Pandit wrote:
>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>> Sent: Wednesday, October 11, 2023 4:09 PM
>> I am sure I have not ignored any questions.
> What about below one?
>
> https://lore.kernel.org/virtio-dev/20230921011221-mutt-send-email-mst@kernel.org/
This is to discuss a attacking model, I have given the answer in another 
thread,
I have even provide an example of how malicious SW can dump guest 
security through admin vq
>


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [virtio-comment] Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state
  2023-10-12 10:49                                                       ` Zhu, Lingshan
@ 2023-10-12 11:12                                                         ` Michael S. Tsirkin
  2023-10-13 10:18                                                           ` [virtio-comment] Re: [virtio-dev] " Zhu, Lingshan
  2023-10-12 14:38                                                         ` Michael S. Tsirkin
  1 sibling, 1 reply; 29+ messages in thread
From: Michael S. Tsirkin @ 2023-10-12 11:12 UTC (permalink / raw)
  To: Zhu, Lingshan
  Cc: Cornelia Huck, Parav Pandit, Jason Wang, eperezma@redhat.com,
	Stefan Hajnoczi, virtio-comment@lists.oasis-open.org,
	virtio-dev@lists.oasis-open.org

On Thu, Oct 12, 2023 at 06:49:51PM +0800, Zhu, Lingshan wrote:
> 
> 
> On 10/12/2023 5:59 PM, Michael S. Tsirkin wrote:
> > On Wed, Oct 11, 2023 at 06:38:32PM +0800, Zhu, Lingshan wrote:
> > > 
> > > On 10/11/2023 6:20 PM, Michael S. Tsirkin wrote:
> > > > On Mon, Oct 09, 2023 at 06:01:42PM +0800, Zhu, Lingshan wrote:
> > > > > On 9/27/2023 11:40 PM, Michael S. Tsirkin wrote:
> > > > > > On Wed, Sep 27, 2023 at 04:20:01PM +0800, Zhu, Lingshan wrote:
> > > > > > > On 9/26/2023 6:48 PM, Michael S. Tsirkin wrote:
> > > > > > > > On Tue, Sep 26, 2023 at 05:25:42PM +0800, Zhu, Lingshan wrote:
> > > > > > > > > We don't want to repeat the discussions, it looks like endless circle with
> > > > > > > > > no direction.
> > > > > > > > OK let me try to direct this discussion.
> > > > > > > > You guys were speaking past each other, no dialog is happening.
> > > > > > > > And as long as it goes on no progress will be made and you
> > > > > > > > will keep going in circles.
> > > > > > > > 
> > > > > > > > Parav here made an effort and attempted to summarize
> > > > > > > > use-cases addressed by your proposal but not his.
> > > > > > > > He couldn't resist adding "a yes but" in there oh well.
> > > > > > > > But now I hope you know he knows about your use-cases?
> > > > > > > > 
> > > > > > > > So please do the same. Do you see any advantages to Parav's
> > > > > > > > proposal as compared to yours? Try to list them and
> > > > > > > > if possible try not to accompany the list with "yes but"
> > > > > > > > (put it in a separate mail if you must ;) ).
> > > > > > > > If you won't be able to see any, let me know and I'll try to help.
> > > > > > > > 
> > > > > > > > Once each of you and Parav have finally heard the other and
> > > > > > > > the other also knows he's been heard, that's when we can
> > > > > > > > try to make progress by looking for something that addresses
> > > > > > > > all use-cases as opposed to endlessly repeating same arguments.
> > > > > > > Sure Michael, I will not say "yes but" here.
> > > > > > > 
> > > > > > >    From Parav's proposal, he intends to migrate a member device by its owner
> > > > > > > device through the admin vq,
> > > > > > > thus necessary admin vq commands are introduced in his series.
> > > > > > > 
> > > > > > > 
> > > > > > > I see his proposal can:
> > > > > > > 1) meet some customers requirements without nested and bare-metal
> > > > > > > 2) align with Nvidia production
> > > > > > > 3) easier to emulate by onboard SOC
> > > > > > Is that all you can see?
> > > > > > 
> > > > > > Hint: there's more.
> > > > > please help provide more.
> > > > Just a small subset off the top of my head:
> > > > Error handling.
> > > handle failed live migration? how?
> > For example you can try restarting VM on source.
> > Or at least report an error to hypervisor.
> I am not sure resetting a VM due to failed live migration is
> a good idea, should we resume the VM instead?

Yes - when I said restarting I meant resuming not resetting.

> Then try other
> convergence algorithm?

Talking about device failures here nothing to do with convergence.
But yes, can e.g. try a different destination.

> 
> And I think current live migration solution already implements error
> detector, like sees a time out?

it is extremely hard to predict how
long will it take a random piece of hardware from a random
vendor to respond. even if you do timeouts break nested
don't they ;) and finally, they provide no indication
of what went wrong whatsoever.

> > 
> > 
> > > and for other errors, we have mature error handling solutions
> > > in virtio for years, like re-read, NEEDS_RESET.
> > facepalm
> > 
> > Are you aware of the fact that Linux still doesn't support
> > it since it turned out to be an extremely awkward interface
> > to use?
> I think we have implemented this in virtio driver,
> like re-read to check FEATURES.

grep for NEEDS_RESET in drivers/virtio and weep.

> > 
> > > If that is not good enough, then the corollary is:
> > > admin vq is better than config space,
> > 
> > You keep confusing admin vq with admin commands.
> OK, so are admin commands better than registers?

They have more functionality for sure.

> > 
> > 
> > > then the further corollary could be:
> > > we should refactor virito-pci interfaces to admin vq commands,
> > > like how we handle features
> > > 
> > > Is that true?
> > > > Extendable to other group types such as SIOV.
> > > For SIOV, the admin vq is a transport, but for SR-IOV
> > > the admin vq is a control channel, that is different,
> > > and admin vq can be a side channel.
> > > 
> > > For example, for SIOV, we config and migrate MSIX through
> > > admin vq. For SRIOV, they are in config space.
> > And that's a mess. FYI we already got feedback from Linux devs
> > who are wondering why we can't come up with a consistent
> > interface that does everything.
> I believe config space is a consistent interface for PCI.
> For SIOV, we need a new transport layer anyway.
> > 
> > 
> > > > Batching of commands
> > > > less pci transactioons
> > > so this can still be a QOS issue.
> > > If batching, others to starve?
> > And if you block CPU since you are not accepting
> > a posted write this is better?
> I don't get it, block guest CPU?

host cpu in fact. if you flood pci expess with transactions
this is exactly what happens.

> > 
> > > > Support for keeping some data off-device
> > > I don't get it, what is off-device?
> > > The live migration facilities need to fetch data from the device anyway
> > Heh this is what was driving nvidia to use DMA so heavily all this time.
> > no - if data is not in registers, device can fetch the data from
> > across pci express link, presumably with a local cache.
> For PCI based configuration, like MSI, we need to fetch from config space
> anyway.
> For others like dirty page, we can store the bitmap in host memory, and use
> PASID for isolation.

Oh really?  What do we get by not using same mechanism for
device state then? This begins to look exactly like admin vq.

> > 
> > 
> > > > which does not mean it's better unconditionally.
> > > > are above points clear?
> > > The thing is, what blocks the config space solution?
> > > Why admin vq is a must for live migration?
> > > What's wrong in config space solution?
> > Whan you say what's wrong do you mean you still see no
> > advantages to doing DMA at all? config space is just better
> > with no drawbacks?
> still, if admin vq or admin commands are better than config space,
> we should refactor the whole virtio-pci interfaces to admin vq.

mixing admin vq and command up again apparently.
We want to support virtio over admin commands for SIOV, yes.
And once that's supported nothing should prevent using that
for SRIOV too.

> And Jason has ever proposed to build admin vq LM on our basic
> facilities, but I see this has been rejected.

Please do not conclude that you just need to resubmit.

> > 
> > > Shall we refactor everything in virtio-pci to use admin vq?
> > > > as long as you guys keep not hearing each other we will keep
> > > > seeing these flame wars. if you expect everyone on virtio-comment
> > > > to follow a 300 message thread you are imo very much mistaken.
> > > I am sure I have not ignored any questions.
> > > I am saying admin vq is problematic for live migration,
> > > at least it doesn't work for nested, so why admin vq is a must for live
> > > migration?
> > 
> > My suggestion for you was to add admin command support to
> > VF memory, as an alternative to admin vq. It looks like that
> > will address the nested virt usecase.
> If you mean carrying some big bulk of data like dirty page information,
> we implemented a facility in host memory which is isolated by PASID.
> 
> I should send a new series soon, so we can work on the patch.

I hope that one does not just restart the same flame war.
As it will if people keep talking past each other and
not listening.

> Thanks for your suggestions and efforts anyway.
> > 
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > > > The general purpose of his proposal and mine are aligned: migrate virtio
> > > > > > > devices.
> > > > > > > 
> > > > > > > Jason has ever proposed to collaborate, please allow me quote his proposal:
> > > > > > > 
> > > > > > > "
> > > > > > > Let me repeat once again here for the possible steps to collaboration:
> > > > > > > 
> > > > > > > 1) define virtqueue state, inflight descriptors in the section of
> > > > > > > basic facility but not under the admin commands
> > > > > > > 2) define the dirty page tracking, device context/states in the
> > > > > > > section of basic facility but not under the admin commands
> > > > > > > 3) define transport specific interfaces or admin commands to access them
> > > > > > > "
> > > > > > > 
> > > > > > > I totally agree with his proposal.
> > > > > > > 
> > > > > > > Does this work for you Michael?
> > > > > > > 
> > > > > > > Thanks
> > > > > > > Zhu Lingshan
> > > > > > I just doubt very much this will work.  What will "define" mean then -
> > > > > > not an interface, just a description in english? I think you
> > > > > > underestimate the difficulty of creating such definitions that
> > > > > > are robust and precise.
> > > > > I think we can review the patch to correct the words.
> > > > > > Instead I suggest you define a way to submit admin commands that works
> > > > > > for nested and bare-metal (i.e. not admin vq, and not with sriov group
> > > > > > type). And work with Parav to make live migration admin commands work
> > > > > > reasonably will through this interface and with this type.
> > > > > why admin commands are better than registers?
> > > > > 
> > > > > This publicly archived list offers a means to provide input to the
> > > > > OASIS Virtual I/O Device (VIRTIO) TC.
> > > > > 
> > > > > In order to verify user consent to the Feedback License terms and
> > > > > to minimize spam in the list archive, subscription is required
> > > > > before posting.
> > > > > 
> > > > > Subscribe: virtio-comment-subscribe@lists.oasis-open.org
> > > > > Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
> > > > > List help: virtio-comment-help@lists.oasis-open.org
> > > > > List archive: https://lists.oasis-open.org/archives/virtio-comment/
> > > > > Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
> > > > > List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
> > > > > Committee: https://www.oasis-open.org/committees/virtio/
> > > > > Join OASIS: https://www.oasis-open.org/join/
> > > > > 


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [virtio-comment] Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state
  2023-10-12 10:57                                                       ` Zhu, Lingshan
@ 2023-10-12 11:13                                                         ` Michael S. Tsirkin
  0 siblings, 0 replies; 29+ messages in thread
From: Michael S. Tsirkin @ 2023-10-12 11:13 UTC (permalink / raw)
  To: Zhu, Lingshan
  Cc: Parav Pandit, Cornelia Huck, Jason Wang, eperezma@redhat.com,
	Stefan Hajnoczi, virtio-comment@lists.oasis-open.org,
	virtio-dev@lists.oasis-open.org

On Thu, Oct 12, 2023 at 06:57:35PM +0800, Zhu, Lingshan wrote:
> 
> 
> On 10/11/2023 7:52 PM, Parav Pandit wrote:
> > > From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > > Sent: Wednesday, October 11, 2023 4:09 PM
> > > I am sure I have not ignored any questions.
> > What about below one?
> > 
> > https://lore.kernel.org/virtio-dev/20230921011221-mutt-send-email-mst@kernel.org/
> This is to discuss a attacking model, I have given the answer in another
> thread,
> I have even provide an example of how malicious SW can dump guest security
> through admin vq

No one cares, without encryption hypervisor is in control anyway.


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [virtio-comment] Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state
  2023-10-12 10:49                                                       ` Zhu, Lingshan
  2023-10-12 11:12                                                         ` Michael S. Tsirkin
@ 2023-10-12 14:38                                                         ` Michael S. Tsirkin
  2023-10-13 10:23                                                           ` Zhu, Lingshan
  1 sibling, 1 reply; 29+ messages in thread
From: Michael S. Tsirkin @ 2023-10-12 14:38 UTC (permalink / raw)
  To: Zhu, Lingshan
  Cc: Cornelia Huck, Parav Pandit, Jason Wang, eperezma@redhat.com,
	Stefan Hajnoczi, virtio-comment@lists.oasis-open.org,
	virtio-dev@lists.oasis-open.org

On Thu, Oct 12, 2023 at 06:49:51PM +0800, Zhu, Lingshan wrote:
> For PCI based configuration, like MSI, we need to fetch from config space
> anyway.
> For others like dirty page, we can store the bitmap in host memory, and use
> PASID for isolation.

Ok. So how a about a simple interface along the lines of
u64 cmd_address
u8 ready_flags:1

For this kind of stuff?

-- 
MST


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 29+ messages in thread

* [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state
  2023-10-12 11:12                                                         ` Michael S. Tsirkin
@ 2023-10-13 10:18                                                           ` Zhu, Lingshan
  0 siblings, 0 replies; 29+ messages in thread
From: Zhu, Lingshan @ 2023-10-13 10:18 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Cornelia Huck, Parav Pandit, Jason Wang, eperezma@redhat.com,
	Stefan Hajnoczi, virtio-comment@lists.oasis-open.org,
	virtio-dev@lists.oasis-open.org



On 10/12/2023 7:12 PM, Michael S. Tsirkin wrote:
> On Thu, Oct 12, 2023 at 06:49:51PM +0800, Zhu, Lingshan wrote:
>>
>> On 10/12/2023 5:59 PM, Michael S. Tsirkin wrote:
>>> On Wed, Oct 11, 2023 at 06:38:32PM +0800, Zhu, Lingshan wrote:
>>>> On 10/11/2023 6:20 PM, Michael S. Tsirkin wrote:
>>>>> On Mon, Oct 09, 2023 at 06:01:42PM +0800, Zhu, Lingshan wrote:
>>>>>> On 9/27/2023 11:40 PM, Michael S. Tsirkin wrote:
>>>>>>> On Wed, Sep 27, 2023 at 04:20:01PM +0800, Zhu, Lingshan wrote:
>>>>>>>> On 9/26/2023 6:48 PM, Michael S. Tsirkin wrote:
>>>>>>>>> On Tue, Sep 26, 2023 at 05:25:42PM +0800, Zhu, Lingshan wrote:
>>>>>>>>>> We don't want to repeat the discussions, it looks like endless circle with
>>>>>>>>>> no direction.
>>>>>>>>> OK let me try to direct this discussion.
>>>>>>>>> You guys were speaking past each other, no dialog is happening.
>>>>>>>>> And as long as it goes on no progress will be made and you
>>>>>>>>> will keep going in circles.
>>>>>>>>>
>>>>>>>>> Parav here made an effort and attempted to summarize
>>>>>>>>> use-cases addressed by your proposal but not his.
>>>>>>>>> He couldn't resist adding "a yes but" in there oh well.
>>>>>>>>> But now I hope you know he knows about your use-cases?
>>>>>>>>>
>>>>>>>>> So please do the same. Do you see any advantages to Parav's
>>>>>>>>> proposal as compared to yours? Try to list them and
>>>>>>>>> if possible try not to accompany the list with "yes but"
>>>>>>>>> (put it in a separate mail if you must ;) ).
>>>>>>>>> If you won't be able to see any, let me know and I'll try to help.
>>>>>>>>>
>>>>>>>>> Once each of you and Parav have finally heard the other and
>>>>>>>>> the other also knows he's been heard, that's when we can
>>>>>>>>> try to make progress by looking for something that addresses
>>>>>>>>> all use-cases as opposed to endlessly repeating same arguments.
>>>>>>>> Sure Michael, I will not say "yes but" here.
>>>>>>>>
>>>>>>>>     From Parav's proposal, he intends to migrate a member device by its owner
>>>>>>>> device through the admin vq,
>>>>>>>> thus necessary admin vq commands are introduced in his series.
>>>>>>>>
>>>>>>>>
>>>>>>>> I see his proposal can:
>>>>>>>> 1) meet some customers requirements without nested and bare-metal
>>>>>>>> 2) align with Nvidia production
>>>>>>>> 3) easier to emulate by onboard SOC
>>>>>>> Is that all you can see?
>>>>>>>
>>>>>>> Hint: there's more.
>>>>>> please help provide more.
>>>>> Just a small subset off the top of my head:
>>>>> Error handling.
>>>> handle failed live migration? how?
>>> For example you can try restarting VM on source.
>>> Or at least report an error to hypervisor.
>> I am not sure resetting a VM due to failed live migration is
>> a good idea, should we resume the VM instead?
> Yes - when I said restarting I meant resuming not resetting.
OK, we have implemented the interface to resume the device, to clear 
suspend.
>
>> Then try other
>> convergence algorithm?
> Talking about device failures here nothing to do with convergence.
> But yes, can e.g. try a different destination.
OK
>
>> And I think current live migration solution already implements error
>> detector, like sees a time out?
> it is extremely hard to predict how
> long will it take a random piece of hardware from a random
> vendor to respond. even if you do timeouts break nested
> don't they ;) and finally, they provide no indication
> of what went wrong whatsoever.
the hypervisor would not complete the live migration
process before device migration done.

I think the hypervisor or the orchestration layer
know the LM status anyway.
>
>>>
>>>> and for other errors, we have mature error handling solutions
>>>> in virtio for years, like re-read, NEEDS_RESET.
>>> facepalm
>>>
>>> Are you aware of the fact that Linux still doesn't support
>>> it since it turned out to be an extremely awkward interface
>>> to use?
>> I think we have implemented this in virtio driver,
>> like re-read to check FEATURES.
> grep for NEEDS_RESET in drivers/virtio and weep.
that is interesting, virito driver lives so many years
without handling NEEDS_RESET, so good device quality and
layers of error handlers.

what prevent implementing NEEDS_RESET? Is it because of how to reinitialize?
It looks like we should do that.

For now, re-read working well at least.
>
>>>> If that is not good enough, then the corollary is:
>>>> admin vq is better than config space,
>>> You keep confusing admin vq with admin commands.
>> OK, so are admin commands better than registers?
> They have more functionality for sure.
yes they are powerful than registers.

However, to suspend, resume, config dirty page facility,
registers are low hanging fruits.
>
>>>
>>>> then the further corollary could be:
>>>> we should refactor virito-pci interfaces to admin vq commands,
>>>> like how we handle features
>>>>
>>>> Is that true?
>>>>> Extendable to other group types such as SIOV.
>>>> For SIOV, the admin vq is a transport, but for SR-IOV
>>>> the admin vq is a control channel, that is different,
>>>> and admin vq can be a side channel.
>>>>
>>>> For example, for SIOV, we config and migrate MSIX through
>>>> admin vq. For SRIOV, they are in config space.
>>> And that's a mess. FYI we already got feedback from Linux devs
>>> who are wondering why we can't come up with a consistent
>>> interface that does everything.
>> I believe config space is a consistent interface for PCI.
>> For SIOV, we need a new transport layer anyway.
>>>
>>>>> Batching of commands
>>>>> less pci transactioons
>>>> so this can still be a QOS issue.
>>>> If batching, others to starve?
>>> And if you block CPU since you are not accepting
>>> a posted write this is better?
>> I don't get it, block guest CPU?
> host cpu in fact. if you flood pci expess with transactions
> this is exactly what happens.
Not sure hypervisor will implement this just because adapting to admin 
vq live migration.
>
>>>>> Support for keeping some data off-device
>>>> I don't get it, what is off-device?
>>>> The live migration facilities need to fetch data from the device anyway
>>> Heh this is what was driving nvidia to use DMA so heavily all this time.
>>> no - if data is not in registers, device can fetch the data from
>>> across pci express link, presumably with a local cache.
>> For PCI based configuration, like MSI, we need to fetch from config space
>> anyway.
>> For others like dirty page, we can store the bitmap in host memory, and use
>> PASID for isolation.
> Oh really?  What do we get by not using same mechanism for
> device state then? This begins to look exactly like admin vq.
implementing a register to config a logging address in host memory and 
isolated by PASID.
Also there are other few registers to control the facility, like 
enable/disable.
>
>>>
>>>>> which does not mean it's better unconditionally.
>>>>> are above points clear?
>>>> The thing is, what blocks the config space solution?
>>>> Why admin vq is a must for live migration?
>>>> What's wrong in config space solution?
>>> Whan you say what's wrong do you mean you still see no
>>> advantages to doing DMA at all? config space is just better
>>> with no drawbacks?
>> still, if admin vq or admin commands are better than config space,
>> we should refactor the whole virtio-pci interfaces to admin vq.
> mixing admin vq and command up again apparently.
> We want to support virtio over admin commands for SIOV, yes.
> And once that's supported nothing should prevent using that
> for SRIOV too.
admin commands work for SRIOV, but overkill for live migration.

For example, to suspend a device, what is the benefit using a
admin command than just a register?

And if we want a bar to process admin commands, do we need
to implement some fields like data_length, total_length and
etc, much more complex than a register.
>
>> And Jason has ever proposed to build admin vq LM on our basic
>> facilities, but I see this has been rejected.
> Please do not conclude that you just need to resubmit.
>
>>>> Shall we refactor everything in virtio-pci to use admin vq?
>>>>> as long as you guys keep not hearing each other we will keep
>>>>> seeing these flame wars. if you expect everyone on virtio-comment
>>>>> to follow a 300 message thread you are imo very much mistaken.
>>>> I am sure I have not ignored any questions.
>>>> I am saying admin vq is problematic for live migration,
>>>> at least it doesn't work for nested, so why admin vq is a must for live
>>>> migration?
>>> My suggestion for you was to add admin command support to
>>> VF memory, as an alternative to admin vq. It looks like that
>>> will address the nested virt usecase.
>> If you mean carrying some big bulk of data like dirty page information,
>> we implemented a facility in host memory which is isolated by PASID.
>>
>> I should send a new series soon, so we can work on the patch.
> I hope that one does not just restart the same flame war.
> As it will if people keep talking past each other and
> not listening.
V2 will include dirty page tracking, so we can review the design.

Yes I hope no flame wars.
>
>> Thanks for your suggestions and efforts anyway.
>>>>>>>
>>>>>>>
>>>>>>>> The general purpose of his proposal and mine are aligned: migrate virtio
>>>>>>>> devices.
>>>>>>>>
>>>>>>>> Jason has ever proposed to collaborate, please allow me quote his proposal:
>>>>>>>>
>>>>>>>> "
>>>>>>>> Let me repeat once again here for the possible steps to collaboration:
>>>>>>>>
>>>>>>>> 1) define virtqueue state, inflight descriptors in the section of
>>>>>>>> basic facility but not under the admin commands
>>>>>>>> 2) define the dirty page tracking, device context/states in the
>>>>>>>> section of basic facility but not under the admin commands
>>>>>>>> 3) define transport specific interfaces or admin commands to access them
>>>>>>>> "
>>>>>>>>
>>>>>>>> I totally agree with his proposal.
>>>>>>>>
>>>>>>>> Does this work for you Michael?
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Zhu Lingshan
>>>>>>> I just doubt very much this will work.  What will "define" mean then -
>>>>>>> not an interface, just a description in english? I think you
>>>>>>> underestimate the difficulty of creating such definitions that
>>>>>>> are robust and precise.
>>>>>> I think we can review the patch to correct the words.
>>>>>>> Instead I suggest you define a way to submit admin commands that works
>>>>>>> for nested and bare-metal (i.e. not admin vq, and not with sriov group
>>>>>>> type). And work with Parav to make live migration admin commands work
>>>>>>> reasonably will through this interface and with this type.
>>>>>> why admin commands are better than registers?
>>>>>>
>>>>>> This publicly archived list offers a means to provide input to the
>>>>>> OASIS Virtual I/O Device (VIRTIO) TC.
>>>>>>
>>>>>> In order to verify user consent to the Feedback License terms and
>>>>>> to minimize spam in the list archive, subscription is required
>>>>>> before posting.
>>>>>>
>>>>>> Subscribe: virtio-comment-subscribe@lists.oasis-open.org
>>>>>> Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
>>>>>> List help: virtio-comment-help@lists.oasis-open.org
>>>>>> List archive: https://lists.oasis-open.org/archives/virtio-comment/
>>>>>> Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
>>>>>> List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
>>>>>> Committee: https://www.oasis-open.org/committees/virtio/
>>>>>> Join OASIS: https://www.oasis-open.org/join/
>>>>>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
>


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [virtio-comment] Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state
  2023-10-12 14:38                                                         ` Michael S. Tsirkin
@ 2023-10-13 10:23                                                           ` Zhu, Lingshan
  0 siblings, 0 replies; 29+ messages in thread
From: Zhu, Lingshan @ 2023-10-13 10:23 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Cornelia Huck, Parav Pandit, Jason Wang, eperezma@redhat.com,
	Stefan Hajnoczi, virtio-comment@lists.oasis-open.org,
	virtio-dev@lists.oasis-open.org



On 10/12/2023 10:38 PM, Michael S. Tsirkin wrote:
> On Thu, Oct 12, 2023 at 06:49:51PM +0800, Zhu, Lingshan wrote:
>> For PCI based configuration, like MSI, we need to fetch from config space
>> anyway.
>> For others like dirty page, we can store the bitmap in host memory, and use
>> PASID for isolation.
> Ok. So how a about a simple interface along the lines of
> u64 cmd_address
> u8 ready_flags:1
>
> For this kind of stuff?
Yes, something like this, log dirty pages in host memory.
The is the draft, not finished yet.

  84 +\begin{lstlisting}
  85 +struct virtio_pci_dity_page_track {
  86 +        u8 enable;               /* Read-Write */
  87 +        u8 gra_power;            /* Read-Write */
  88 +        u8 reserved[2];
  89 +        le32 {
  90 +            pasid: 20;           /* Read-Write */
  91 +            reserved: 12;
  92 +        };
  93 +        le64 bitmap_addr;        /* Read-Write */
  94 +        le64 bitmap_length;      /* Read-Write */
  95 +};
  96 +\end{lstlisting}


>


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2023-10-13 10:24 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <PH0PR12MB5481C41D4F32DA26D5831471DCFBA@PH0PR12MB5481.namprd12.prod.outlook.com>
     [not found] ` <213a0f94-cee2-d8c5-3c5d-d2d7fc920e75@intel.com>
     [not found]   ` <PH0PR12MB5481AE8C6E5EAFE9A4ADA6F6DCFBA@PH0PR12MB5481.namprd12.prod.outlook.com>
     [not found]     ` <f1fe513e-d42d-03a6-348c-5e58aac0a759@intel.com>
     [not found]       ` <PH0PR12MB5481323A3B4D67EB66328476DCFBA@PH0PR12MB5481.namprd12.prod.outlook.com>
     [not found]         ` <5f01772f-eb27-bfe0-7f69-b83fbd90dda0@intel.com>
     [not found]           ` <PH0PR12MB5481ABF4B9BC2082BC54AB1DDCFBA@PH0PR12MB5481.namprd12.prod.outlook.com>
     [not found]             ` <20230918144312-mutt-send-email-mst@kernel.org>
     [not found]               ` <bb402c34-ad07-063a-f4f5-d8c6b3a642c7@intel.com>
     [not found]                 ` <20230920054836-mutt-send-email-mst@kernel.org>
     [not found]                   ` <2f67fb85-2238-9c34-a265-b0f97b7ab7e1@intel.com>
     [not found]                     ` <20230920075243-mutt-send-email-mst@kernel.org>
     [not found]                       ` <20230920084058-mutt-send-email-mst@kernel.org>
     [not found]                         ` <PH0PR12MB548110E53C924C984FA24A49DCF9A@PH0PR12MB5481.namprd12.prod.outlook.com>
     [not found]                           ` <20230920101402-mutt-send-email-mst@kernel.org>
     [not found]                             ` <PH0PR12MB54816F8341A77E63831316DBDCF9A@PH0PR12MB5481.namprd12.prod.outlook.com>
     [not found]                               ` <20230920160218-mutt-send-email-mst@kernel.org>
     [not found]                                 ` <PH0PR12MB54814A89793CF796412D6A27DCF8A@PH0PR12MB5481.namprd12.prod.outlook.com>
     [not found]                                   ` <20230921004957-mutt-send-email-mst@kernel.org>
2023-09-21  9:06                                     ` [virtio-comment] Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state Zhu, Lingshan
2023-09-21  9:18                           ` Zhu, Lingshan
2023-09-21  9:26                             ` [virtio-comment] " Parav Pandit
2023-09-21  9:55                               ` [virtio-comment] " Zhu, Lingshan
2023-09-21 11:28                                 ` [virtio-comment] " Parav Pandit
2023-09-22  2:40                                   ` [virtio-comment] " Zhu, Lingshan
     [not found]                   ` <CACGkMEsn+9AqgmurN8-GXkcu8UxAr62_woJn3XZN+oUkTQNPUg@mail.gmail.com>
     [not found]                     ` <PH0PR12MB5481830365A80EF4583A3FFADCF8A@PH0PR12MB5481.namprd12.prod.outlook.com>
     [not found]                       ` <CACGkMEshxAU3Mjo7vczBNRb=P=FnenO4mPb9HJL0Ma3ZRbe-oA@mail.gmail.com>
     [not found]                         ` <PH0PR12MB548172996D62E37714A9776BDCF8A@PH0PR12MB5481.namprd12.prod.outlook.com>
     [not found]                           ` <CACGkMEtW88zJkDQL58NqLzzudq=f+SmzJ8bha55Dd2fd=FRGBQ@mail.gmail.com>
2023-09-22  3:39                             ` Zhu, Lingshan
     [not found]                             ` <PH0PR12MB5481573D6EE3BE03FB7C3D70DCFCA@PH0PR12MB5481.namprd12.prod.outlook.com>
     [not found]                               ` <CACGkMEvOxraeVB-5g7dJ-KBN=63kpmDfDdqaiQVyDh8egDjsGw@mail.gmail.com>
     [not found]                                 ` <PH0PR12MB5481C11F6D68A892A091E2AFDCC3A@PH0PR12MB5481.namprd12.prod.outlook.com>
2023-09-26  5:36                                   ` Zhu, Lingshan
2023-09-26  6:03                                     ` [virtio-comment] " Parav Pandit
2023-09-26  9:25                                       ` [virtio-comment] " Zhu, Lingshan
2023-09-26 10:48                                         ` Michael S. Tsirkin
2023-09-27  8:20                                           ` Zhu, Lingshan
2023-09-27 10:39                                             ` [virtio-comment] " Parav Pandit
2023-10-09 10:05                                               ` [virtio-comment] " Zhu, Lingshan
2023-10-09 10:07                                                 ` [virtio-comment] " Parav Pandit
2023-09-27 15:40                                             ` [virtio-comment] " Michael S. Tsirkin
2023-10-09 10:01                                               ` Zhu, Lingshan
2023-10-11 10:20                                                 ` Michael S. Tsirkin
2023-10-11 10:38                                                   ` Zhu, Lingshan
2023-10-11 11:52                                                     ` Parav Pandit
2023-10-12 10:57                                                       ` Zhu, Lingshan
2023-10-12 11:13                                                         ` Michael S. Tsirkin
2023-10-12  9:59                                                     ` Michael S. Tsirkin
2023-10-12 10:49                                                       ` Zhu, Lingshan
2023-10-12 11:12                                                         ` Michael S. Tsirkin
2023-10-13 10:18                                                           ` [virtio-comment] Re: [virtio-dev] " Zhu, Lingshan
2023-10-12 14:38                                                         ` Michael S. Tsirkin
2023-10-13 10:23                                                           ` Zhu, Lingshan
2023-09-06  8:16 [virtio-comment] " Zhu Lingshan
2023-09-14 11:37 ` [virtio-comment] " Michael S. Tsirkin
2023-09-15  4:41   ` [virtio-comment] Re: [virtio-dev] " Zhu, Lingshan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox