From: Maxime Coquelin
Subject: Re: [PATCH v3 17/19] vhost-user: iommu: postpone device creation until ring are mapped
Date: Fri, 3 Nov 2017 16:54:05 +0100
Message-ID: <041c6163-f4fc-a6e7-7142-8ea6594800b5@redhat.com>
In-Reply-To: <20171103171230-mutt-send-email-mst@kernel.org>
References: <20171005083627.27828-1-maxime.coquelin@redhat.com>
 <20171005083627.27828-18-maxime.coquelin@redhat.com>
 <2DBBFF226F7CF64BAFCA79B681719D953A2CD10A@shsmsx102.ccr.corp.intel.com>
 <8d27d7d9-9567-1a57-5a5f-760f2f117b73@redhat.com>
 <4054d863-8909-23a7-aba2-b8675cdbb4ba@redhat.com>
 <20171103171230-mutt-send-email-mst@kernel.org>
To: "Michael S. Tsirkin", "Yao, Lei A"
Cc: "dev@dpdk.org", "yliu@fridaylinux.org", "Horton, Remy", "Bie, Tiwei", "jfreiman@redhat.com", "vkaplans@redhat.com", "jasowang@redhat.com"
List-Id: DPDK patches and discussions

On 11/03/2017 04:15 PM, Michael S. Tsirkin wrote:
> On Fri, Nov 03, 2017 at 09:25:58AM +0100, Maxime Coquelin wrote:
>>
>>
>> On 11/02/2017 05:02 PM, Maxime Coquelin wrote:
>>>
>>>
>>> On 11/02/2017 09:21 AM, Maxime Coquelin wrote:
>>>> Hi Lei,
>>>>
>>>> On 11/02/2017 08:21 AM, Yao, Lei A wrote:
>>>>>
>>>> ...
>>>>> Hi, Maxime
>>>>>
>>>>> I met one issue with your patch set during the v17.11 test.
>>>>
>>>> Is it with v17.11-rc2 or -rc1?
>>>>
>>>>> The test scenario is the following:
>>>>> 1. Bind one NIC and use testpmd to set up vhost-user with 2 queues:
>>>>>    usertools/dpdk-devbind.py --bind=igb_uio 0000:05:00.0
>>>>>    ./x86_64-native-linuxapp-gcc/app/testpmd -c 0xe -n 4 \
>>>>>      --socket-mem 1024,1024 \
>>>>>      --vdev 'net_vhost0,iface=vhost-net,queues=2' -- -i \
>>>>>      --rxq=2 --txq=2 --nb-cores=2 --rss-ip
>>>>> 2. Launch QEMU with a virtio device that has 2 queues.
>>>>> 3. In the VM, launch testpmd with virtio-pmd using only 1 queue:
>>>>>    x86_64-native-linuxapp-gcc/app/testpmd -c 0x07 -n 3 -- -i \
>>>>>      --txqflags=0xf01 --rxq=1 --txq=1 --rss-ip --nb-cores=1
>>>>>
>>>>> First,
>>>>> commit 09927b5249694bad1c094d3068124673722e6b8f
>>>>> vhost: translate ring addresses when IOMMU enabled
>>>>> This patch causes no traffic in the PVP test, but the link status
>>>>> is still up in vhost-user.
>>>>>
>>>>> Second,
>>>>> eefac9536a901a1f0bb52aa3b6fec8f375f09190
>>>>> vhost: postpone device creation until rings are mapped
>>>>> This patch causes link status "down" in vhost-user.
>>>
>>> I reproduced this one, and understand why the link status remains down.
>>> My series did fix a potential issue Michael raised, namely that the
>>> vring addresses should only be interpreted once the ring is enabled.
>>> When VHOST_USER_F_PROTOCOL_FEATURES is negotiated, the ring addresses
>>> are translated when the ring is enabled via VHOST_USER_SET_VRING_ENABLE.
>>> When it is not negotiated, the ring is considered to start enabled, so
>>> the translation is done at VHOST_USER_SET_VRING_KICK time.
>>>
>>> In your case, protocol features are negotiated, so the ring addresses
>>> are translated at enable time. The problem is that the code considers
>>> the device ready once the addresses for all the rings are translated.
>>> But since only the first pair of rings is used, it never happens, and
>>> the link remains down.
>>>
>>> One of the reasons this check is done is to avoid starting the PMD
>>> threads before the addresses are translated in case of NUMA
>>> reallocation, as the virtqueues and the virtio-net device struct can
>>> be reallocated on a different node.
>>>
>>> I think the right fix would be to only perform NUMA reallocation for
>>> vring 0, as today we would end up reallocating the virtio-net struct
>>> multiple times if the VQs are on different NUMA nodes.
>>>
>>> Doing that, we could then consider the device ready if vring 0 is
>>> enabled and its ring addresses are translated, and if the other
>>> vrings have been kicked.
>>>
>>> I'll post a patch shortly implementing this idea.
>>
>> The proposed solution doesn't work, because disabled queues get
>> accessed at device start time:
>>
>> int
>> rte_vhost_enable_guest_notification(int vid, uint16_t queue_id, int enable)
>> {
>>     ..
>>     dev->virtqueue[queue_id]->used->flags = VRING_USED_F_NO_NOTIFY;
>>     return 0;
>> }
>>
>> The above function is called by the Vhost PMD for every queue, enabled
>> or not. While we could fix the PMD, it could break other applications
>> using the Vhost lib API directly, so we cannot reliably translate at
>> enable time.
>>
>> I think we may be a bit less conservative, and postpone address
>> translation to kick time, whether VHOST_USER_F_PROTOCOL_FEATURES is
>> negotiated or not.
>>
>> Regards,
>> Maxime
>>
>>> Thanks,
>>> Maxime
>
> I agree, enabling has nothing to do with it.
>
> The spec is quite explicit:
>
> Client must only process each ring when it is started.
>
> and
>
> Client must start ring upon receiving a kick (that is, detecting that
> file descriptor is readable) on the descriptor specified by
> VHOST_USER_SET_VRING_KICK, and stop ring upon receiving
> VHOST_USER_GET_VRING_BASE.

Thanks for the confirmation, Michael. Fix posted.

Maxime
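
Below is a minimal sketch of the kick-time translation idea discussed in
this thread. It is only an illustration: the structure and helper names
(struct vq, struct vdev, translate_ring_addresses(), handle_set_vring_kick())
are simplified placeholders, not the actual DPDK vhost-user code, which
lives in lib/librte_vhost.

#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-ins for the vhost library structures; the real DPDK
 * definitions are larger and named differently. */
struct vq {
    uint64_t desc_gpa;   /* ring addresses as sent by the master */
    uint64_t avail_gpa;
    uint64_t used_gpa;
    void *desc;          /* host-virtual pointers, valid once translated */
    void *avail;
    void *used;
    int kickfd;
    bool enabled;
};

struct vdev {
    uint64_t negotiated_features;
    struct vq vqs[2];
};

/* Hypothetical helper standing in for the library's address translation,
 * which would go through the IOTLB when an IOMMU is in use. */
static int
translate_ring_addresses(struct vdev *dev, unsigned int idx)
{
    /* GPA/IOVA -> HVA lookups for desc, avail and used would go here. */
    (void)dev;
    (void)idx;
    return 0;
}

/* Sketch of a VHOST_USER_SET_VRING_KICK handler following the thread's
 * conclusion: the kick is what starts the ring, so its addresses are
 * translated here, whether or not VHOST_USER_F_PROTOCOL_FEATURES was
 * negotiated, instead of waiting for VHOST_USER_SET_VRING_ENABLE. */
static int
handle_set_vring_kick(struct vdev *dev, unsigned int idx, int kickfd)
{
    struct vq *vq = &dev->vqs[idx];

    vq->kickfd = kickfd;

    /* Translation no longer depends on the ring's enable state. */
    return translate_ring_addresses(dev, idx);
}

Translating at kick time matches the spec excerpt Michael quotes above: a
ring is started by the kick and must only be processed once started, so the
device can be considered ready without every ring having gone through
VHOST_USER_SET_VRING_ENABLE.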