From mboxrd@z Thu Jan 1 00:00:00 1970
Mailing-List: contact virtio-comment-help@lists.oasis-open.org; run by ezmlm
Precedence: bulk
Message-ID: <91c3e7ec-d702-ee61-c420-59ddc8dac6dc@intel.com>
Date: Wed, 20 Sep 2023 15:27:30 +0800
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101 Firefox/102.0 Thunderbird/102.15.1
Content-Language: en-US
From: "Zhu, Lingshan"
To: Parav Pandit, "Chen, Jiqian", "Michael S. Tsirkin"
Cc: Gerd Hoffmann, Jason Wang, Xuan Zhuo, David Airlie, Gurchetan Singh,
 Chia-I Wu, Marc-André Lureau, Robert Beckett, Mikhail Golubev-Ciuchea,
 "virtio-comment@lists.oasis-open.org", "virtio-dev@lists.oasis-open.org",
 "qemu-devel@nongnu.org", "linux-kernel@vger.kernel.org", Stefano Stabellini,
 Roger Pau Monné, "Deucher, Alexander", "Koenig, Christian",
 "Hildebrand, Stewart", Xenia Ragiadakou, "Huang, Honglei1", "Zhang, Julia",
 "Huang, Ray"
References: <20230919114242.2283646-1-Jiqian.Chen@amd.com>
 <20230919114242.2283646-2-Jiqian.Chen@amd.com>
 <20230919082802-mutt-send-email-mst@kernel.org>
 <701bb67c-c52d-4eb3-a6ed-f73bd5d0ff33@intel.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Subject: [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] Re: [VIRTIO PCI PATCH v5 1/1] transport-pci: Add freeze_mode to virtio_pci_common_cfg

On 9/20/2023 3:10 PM, Parav Pandit wrote:
>> From: Zhu, Lingshan
>> Sent: Wednesday, September 20, 2023 12:37 PM
>>> The problem to overcome in [1] is, resume operation needs to be synchronous
>>> as it involves large part of context to resume back, and hence just
>>> asynchronously setting DRIVER_OK is not enough.
>>> The sw must verify back that device has resumed the operation and ready to
>>> answer requests.
>> this is not live migration, all device status and other information still stay in the
>> device, no need to "resume" context, just resume running.
>>
> I am aware that it is not live migration. :)
>
> "Just resuming" involves lot of device setup task. The device implementation does not know for how long a device is suspended.
> So for example, a VM is suspended for 6 hours, hence the device context could be saved in a slow disk.
> Hence, when the resume is done, it needs to setup things again and driver got to verify before accessing more from the device.

The restore procedures should be performed by the hypervisor and
completed before it sets DRIVER_OK and wakes up the guest. The
hypervisor/driver then needs to check the device status by re-reading it.

>
>> Like resume from a failed LM.
>>> This is slightly different flow than setting the DRIVER_OK for the first time
>>> device initialization sequence as it does not involve large restoration.
>>> So, to merge two ideas, instead of doing DRIVER_OK to resume, the driver
>>> should clear the SUSPEND bit and verify that it is out of SUSPEND.
>>> Because driver is still in _OK_ driving the device flipping the SUSPEND bit.
>> Please read the spec, it says:
>> The driver MUST NOT clear a device status bit
>>
> Yes, this is why either DRIVER_OK validation by the driver is needed or Jiqian's synchronous new register..

So re-read the status.

>

This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/