public inbox for linux-nvme@lists.infradead.org
From: Max Gurtovoy <mgurtovoy@nvidia.com>
To: Christoph Hellwig <hch@lst.de>
Cc: Jason Gunthorpe <jgg@ziepe.ca>, Lei Rao <lei.rao@intel.com>,
	kbusch@kernel.org, axboe@fb.com, kch@nvidia.com,
	sagi@grimberg.me, alex.williamson@redhat.com, cohuck@redhat.com,
	yishaih@nvidia.com, shameerali.kolothum.thodi@huawei.com,
	kevin.tian@intel.com, mjrosato@linux.ibm.com,
	linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org,
	kvm@vger.kernel.org, eddie.dong@intel.com, yadong.li@intel.com,
	yi.l.liu@intel.com, Konrad.wilk@oracle.com,
	stephen@eideticom.com, hang.yuan@intel.com
Subject: Re: [RFC PATCH 1/5] nvme-pci: add function nvme_submit_vf_cmd to issue admin commands for VF driver.
Date: Wed, 7 Dec 2022 16:50:00 +0200
Message-ID: <d28a7848-b284-6c86-a2ae-ab79de3675d4@nvidia.com>
In-Reply-To: <20221207134644.GB21691@lst.de>


On 12/7/2022 3:46 PM, Christoph Hellwig wrote:
> On Wed, Dec 07, 2022 at 12:59:00PM +0200, Max Gurtovoy wrote:
>> Why is it preferred that the migration SW will talk directly to the PF and
>> not via VFIO interface ?
> It should never talk directly to any hardware, but through a kernel
> interface, and that's probably vfio.  But that interface needs to
> be centered around the controlling function for all the reasons I've
> written down multiple times now.
>
>> It's just an implementation detail.
> No, it's not.  While you could come up with awkward ways to map how
> the hardware interface must work to a completely contrary kernel
> interface, that's just going to create the need for lots of boilerplate
> code _and_ confuse users.  The function that is being migrated can
> fundamentally not be in control of itself.  Any interface that pretends
> it is, is broken and a long-term nightmare for users and implementers.

We're defining the SPEC and interfaces now :)

Below is one possible direction I can think of.

>> I feel like it even sounds more reasonable to have a common API like we
>> have today to save_state/resume_state/quiesce_device/freeze_device and each
>> device implementation will translate this functionality to its own SPEC.
> Absolutely.
>
>> If I understand correctly, your direction is to have QEMU code talk to
>> nvmecli/new_mlx5cli/my_device_cli to do that and I'm not sure it's needed.
> No.
Great.
>
>> The controlled device is not aware of any of the migration process. Only
>> the migration SW, system admin and controlling device.
> Exactly.
>
>> So in the source:
>>
>> 1. We enable SRIOV on the NVMe driver
> Again.  Nothing in live migration is tied to SR-IOV at all.  SR-IOV
> is just one way to get multiple functions.

Sure.

It's just an example. It can be some mdev.

>
>> 2. We list all the secondary controllers: nvme1, nvme2, nvme3
>>
>> 3. We allow migrating nvme1, nvme2, nvme3 - now these VFs are migratable
>> (controlling to controlled).
>>
>> 4. We bind nvme1, nvme2, nvme3 to VFIO NVMe driver
>>
>> 5. We pass these functions to VM
> And you need to pass the controlling function (or rather a handle for
> it), because there is absolutely no sane way to discover that from
> the controlled function as it can't have that information by the
> fact that it is being passed to unprivileged VMs.

Just thinking out loud:

When we perform step #3 we are narrowing the controlled function's scope,
and maybe some of the caps that you're concerned about. After this setting,
the controlled function is in LM mode (we should define what that means in
order to be able to migrate it correctly) and the controlling function is
its migration master. Both can be aware of that. In LM mode, the only one
that can master the controlled function is the controlling function. Thus,
it will be easy to keep that handle inside the kernel for VFs and for MDEVs
as well.

Although I'm not against passing this handle to the migration SW somehow,
e.g. on the QEMU command line, I still can't completely agree that it's
necessary.
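
To make the idea of keeping the handle inside the kernel concrete, here is a
minimal user-space sketch in C. Everything in it is hypothetical: the
struct lm_func type, the lm_enable() and lm_quiesce() helpers and the
cmds_issued counter are illustrative names only, not existing kernel or
VFIO interfaces. It only models the relationship described above: step #3
records the controlling function on the controlled one, and a later
migration operation that names only the controlled function is routed
through that stored handle.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of a function (VF or mdev); not a real kernel type. */
struct lm_func {
	int id;
	struct lm_func *controller;	/* migration master, NULL if not in LM mode */
	int quiesced;			/* set once the function has been quiesced */
	int cmds_issued;		/* commands this function issued as a master */
};

/*
 * Step #3: put the controlled function in LM mode and record its
 * controlling function. Only one master is allowed per function.
 */
int lm_enable(struct lm_func *controlling, struct lm_func *controlled)
{
	if (!controlling || !controlled || controlling == controlled)
		return -EINVAL;
	if (controlled->controller)
		return -EBUSY;
	controlled->controller = controlling;
	return 0;
}

/*
 * Common migration entry point: the caller names only the controlled
 * function; the stored handle decides who actually issues the command.
 */
int lm_quiesce(struct lm_func *controlled)
{
	if (!controlled)
		return -EINVAL;
	if (!controlled->controller)
		return -EPERM;	/* not in LM mode, nobody may master it */

	/* A real driver would submit a device command via the master here. */
	controlled->controller->cmds_issued++;
	controlled->quiesced = 1;
	return 0;
}
```

In this model the migration SW never sees the controlling function. Whether
that is acceptable, or whether the handle must be passed explicitly, is
exactly the open question in this thread.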


