From: Jason Wang
Date: Mon, 31 May 2021 12:38:47 +0800
Subject: Re: [PATCH v7 11/12] vduse: Introduce VDUSE - vDPA Device in Userspace
To: Yongji Xie
Cc: "Michael S. Tsirkin", Stefan Hajnoczi, Stefano Garzarella,
    Parav Pandit, Christoph Hellwig, Christian Brauner, Randy Dunlap,
    Matthew Wilcox, Al Viro, Jens Axboe, bcrl@kvack.org, Jonathan Corbet,
    Mika Penttilä, Dan Carpenter, joro@8bytes.org, virtualization,
    netdev@vger.kernel.org, kvm, linux-fsdevel@vger.kernel.org,
    iommu@lists.linux-foundation.org, linux-kernel
References: <20210517095513.850-1-xieyongji@bytedance.com>
    <20210517095513.850-12-xieyongji@bytedance.com>
    <3740c7eb-e457-07f3-5048-917c8606275d@redhat.com>
    <5a68bb7c-fd05-ce02-cd61-8a601055c604@redhat.com>
    <00ded99f-91b6-ba92-5d92-2366b163f129@redhat.com>
X-Mailing-List: linux-fsdevel@vger.kernel.org

On 2021/5/31 12:27 PM, Yongji Xie wrote:
> On Fri, May 28, 2021 at 10:31 AM Jason Wang wrote:
>>
>> On 2021/5/27 9:17 PM, Yongji Xie wrote:
>>> On Thu, May 27, 2021 at 4:41 PM Jason Wang wrote:
>>>> On 2021/5/27 3:34 PM, Yongji Xie wrote:
>>>>> On Thu, May 27, 2021 at 1:40 PM Jason Wang wrote:
>>>>>> On 2021/5/27 1:08 PM, Yongji Xie wrote:
>>>>>>> On Thu, May 27, 2021 at 1:00 PM Jason Wang wrote:
>>>>>>>> On 2021/5/27 12:57 PM, Yongji Xie wrote:
>>>>>>>>> On Thu, May 27, 2021 at 12:13 PM Jason Wang wrote:
>>>>>>>>>> On 2021/5/17 5:55 PM, Xie Yongji wrote:
>>>>>>>>>>> +
>>>>>>>>>>> +static int vduse_dev_msg_sync(struct vduse_dev *dev,
>>>>>>>>>>> +                              struct vduse_dev_msg *msg)
>>>>>>>>>>> +{
>>>>>>>>>>> +        init_waitqueue_head(&msg->waitq);
>>>>>>>>>>> +        spin_lock(&dev->msg_lock);
>>>>>>>>>>> +        vduse_enqueue_msg(&dev->send_list, msg);
>>>>>>>>>>> +        wake_up(&dev->waitq);
>>>>>>>>>>> +        spin_unlock(&dev->msg_lock);
>>>>>>>>>>> +        wait_event_killable(msg->waitq, msg->completed);
>>>>>>>>>> What happens if a (malicious) userspace never gives a response?
>>>>>>>>>>
>>>>>>>>>> It looks like a DoS. If yes, we need to consider a way to fix that.
>>>>>>>>>>
>>>>>>>>> How about using wait_event_killable_timeout() instead?
>>>>>>>> Probably, and then we need to choose a suitable timeout and, more
>>>>>>>> importantly, need to report the failure to virtio.
>>>>>>>>
>>>>>>> Makes sense to me. But it looks like some
>>>>>>> vdpa_config_ops/virtio_config_ops such as set_status() don't have a
>>>>>>> return value. Now I add a WARN_ON() for the failure. Do you mean we
>>>>>>> need to add some change to the virtio core to handle the failure?
>>>>>> Maybe, but I'm not sure how hard that would be.
>>>>>>
>>>>> We would need to change all virtio device drivers in this way.
>>>> Probably.
>>>>
>>>>
>>>>>> We had NEEDS_RESET but it looks like we don't implement it.
>>>>>>
>>>>> Could it handle the failure of get_feature() and get/set_config()?
>>>> It looks like not:
>>>>
>>>> "
>>>>
>>>> The device SHOULD set DEVICE_NEEDS_RESET when it enters an error state
>>>> that a reset is needed. If DRIVER_OK is set, after it sets
>>>> DEVICE_NEEDS_RESET, the device MUST send a device configuration change
>>>> notification to the driver.
>>>>
>>>> "
>>>>
>>>> This seems to imply that NEEDS_RESET may only work after the device is
>>>> probed. But in the current design, even reset() is not reliable.
>>>>
>>>>
>>>>>> Or a rough idea: maybe we need some relaxing so that the device is
>>>>>> loosely coupled with userspace. E.g. the device (control path) is
>>>>>> implemented in the kernel but the datapath is implemented in
>>>>>> userspace, like TUN/TAP.
>>>>>>
>>>>> I think it can work for most cases. One problem is that set_config
>>>>> might change the behavior of the data path at runtime, e.g.
>>>>> virtnet_set_mac_address() in the virtio-net driver and
>>>>> cache_type_store() in the virtio-blk driver. Not sure if this path is
>>>>> able to return before the datapath is aware of this change.
>>>> Good point.
>>>>
>>>> But set_config() should be rare:
>>>>
>>>> E.g. in the case of virtio-net with VERSION_1, config space is read only,
>>>> and it was set via control vq.
>>>>
>>>> For block, we can
>>>>
>>>> 1) start without WCE or
>>>> 2) add a config change notification to userspace or
>>> I prefer this way. And I think we also need to do similar things for
>>> set/get_vq_state().
>>
>> Yes, I agree.
>>
> Hi Jason,
>
> Now I'm working on this. But I found the config change notification
> must be synchronous in the virtio-blk case, which means the kernel
> still needs to wait for the response from userspace in set_config().
> Otherwise, some I/Os might still run the old way after we change the
> cache_type in sysfs.
>
> The simple ways to solve this problem are:
>
> 1. Only support read-only config space, disable WCE as you suggested
> 2. Add a return value to set_config() and handle the failure only in
>    the virtio-blk driver
> 3. Print some warnings after timeout since it only affects the
>    dataplane which is under userspace's control
>
> Any suggestions?

Let's go without WCE and make VDUSE work first. We can then think of a
solution for WCE on top.

Thanks

>
> Thanks,
> Yongji
>
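
The idea raised earlier in the thread (bound the wait with a timeout and
report the failure to the caller instead of blocking forever on an
unresponsive peer) can be illustrated with a small userspace sketch. This
is only an analogy built on POSIX threads, not the VDUSE code and not the
kernel wait_event_killable_timeout() API. The names msg_slot, msg_init,
msg_complete and msg_wait_timeout are hypothetical and exist only for this
example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical message slot for illustration; not a VDUSE structure. */
struct msg_slot {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool completed;
};

static void msg_init(struct msg_slot *m)
{
	pthread_mutex_init(&m->lock, NULL);
	pthread_cond_init(&m->cond, NULL);
	m->completed = false;
}

/* Called by the responder once it has handled the message. */
static void msg_complete(struct msg_slot *m)
{
	pthread_mutex_lock(&m->lock);
	m->completed = true;
	pthread_cond_signal(&m->cond);
	pthread_mutex_unlock(&m->lock);
}

/*
 * Wait for completion for at most timeout_ms milliseconds.
 * Returns 0 if the message was completed, or a negative errno value
 * (-ETIMEDOUT on timeout) so that the caller can report the failure.
 */
static int msg_wait_timeout(struct msg_slot *m, long timeout_ms)
{
	struct timespec deadline;
	int ret = 0;

	assert(timeout_ms >= 0);

	/* pthread_cond_timedwait() takes an absolute CLOCK_REALTIME deadline
	 * when the condition variable uses default attributes. */
	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += timeout_ms / 1000;
	deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}

	pthread_mutex_lock(&m->lock);
	while (!m->completed) {
		int err = pthread_cond_timedwait(&m->cond, &m->lock, &deadline);

		if (err != 0) {
			ret = -err;
			break;
		}
	}
	/* A completion that raced with the timeout still counts as success. */
	if (m->completed)
		ret = 0;
	pthread_mutex_unlock(&m->lock);

	return ret;
}
```

A caller would check the return value of msg_wait_timeout() and propagate
the error, which is the part that is hard for void config ops such as
set_status() as noted in the thread.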