From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:35565) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1bZ37w-0002fA-49 for qemu-devel@nongnu.org; Sun, 14 Aug 2016 17:39:29 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1bZ37s-0007ro-TU for qemu-devel@nongnu.org; Sun, 14 Aug 2016 17:39:28 -0400 Received: from mx1.redhat.com ([209.132.183.28]:44232) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1bZ37s-0007rk-L1 for qemu-devel@nongnu.org; Sun, 14 Aug 2016 17:39:24 -0400 Date: Mon, 15 Aug 2016 00:39:19 +0300 From: "Michael S. Tsirkin" Message-ID: <20160815003520-mutt-send-email-mst@kernel.org> References: <1470842980-32481-1-git-send-email-mst@redhat.com> <1470842980-32481-4-git-send-email-mst@redhat.com> <20160812063828.GG2759@al.usersys.redhat.com> <20160814054808-mutt-send-email-mst@kernel.org> <09A27EBF-F644-45E7-949D-A5D55AE3BCB5@nutanix.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <09A27EBF-F644-45E7-949D-A5D55AE3BCB5@nutanix.com> Content-Transfer-Encoding: quoted-printable Subject: Re: [Qemu-devel] [PULL 3/3] vhost-user: Attempt to fix a race with set_mem_table. List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Prerna Saxena Cc: Fam Zheng , "qemu-devel@nongnu.org" , Peter Maydell , "marcandre.lureau@redhat.com" On Sun, Aug 14, 2016 at 09:42:11AM +0000, Prerna Saxena wrote: > On 14/08/16 8:21 am, "Michael S. Tsirkin" wrote: >=20 >=20 > >On Fri, Aug 12, 2016 at 07:16:34AM +0000, Prerna Saxena wrote: > >>=20 > >> On 12/08/16 12:08 pm, "Fam Zheng" wrote: > >>=20 > >>=20 > >>=20 > >>=20 > >>=20 > >> >On Wed, 08/10 18:30, Michael S. Tsirkin wrote: > >> >> From: Prerna Saxena > >> >>=20 > >> >> The set_mem_table command currently does not seek a reply. Hence,= there is > >> >> no easy way for a remote application to notify to QEMU when it fi= nished > >> >> setting up memory, or if there were errors doing so. > >> >>=20 > >> >> As an example: > >> >> (1) Qemu sends a SET_MEM_TABLE to the backend (eg, a vhost-user n= et > >> >> application). SET_MEM_TABLE does not require a reply according to= the spec. > >> >> (2) Qemu commits the memory to the guest. > >> >> (3) Guest issues an I/O operation over a new memory region which = was configured on (1). > >> >> (4) The application has not yet remapped the memory, but it sees = the I/O request. > >> >> (5) The application cannot satisfy the request because it does no= t know about those GPAs. > >> >>=20 > >> >> While a guaranteed fix would require a protocol extension (commit= ted separately), > >> >> a best-effort workaround for existing applications is to send a G= ET_FEATURES > >> >> message before completing the vhost_user_set_mem_table() call. > >> >> Since GET_FEATURES requires a reply, an application that processe= s vhost-user > >> >> messages synchronously would probably have completed the SET_MEM_= TABLE before replying. > >> >>=20 > >> >> Signed-off-by: Prerna Saxena > >> >> Reviewed-by: Michael S. Tsirkin > >> >> Signed-off-by: Michael S. Tsirkin > >> > > >> >Sporadic hangs are seen with test-vhost-user after this patch: > >> > > >> >https://travis-ci.org/qemu/qemu/builds > >> > > >> >Reverting seems to fix it for me. > >> > > >> >Is this a known problem? > >> > > >> >Fam > >>=20 > >> Hi Fam, > >> Thanks for reporting the sporadic hangs. I had seen =E2=80=98make ch= eck=E2=80=99 pass on my Centos 6 environment, so missed this. > >> I am setting up the docker test env to repro this, but I think I can= guess the problem : > >>=20 > >> In tests/vhost-user-test.c:=20 > >>=20 > >> static void chr_read(void *opaque, const uint8_t *buf, int size) > >> { > >> ..[snip].. > >>=20 > >> case VHOST_USER_SET_MEM_TABLE: > >> /* received the mem table */ > >> memcpy(&s->memory, &msg.payload.memory, sizeof(msg.payload.me= mory)); > >> s->fds_num =3D qemu_chr_fe_get_msgfds(chr, s->fds, G_N_ELEMEN= TS(s->fds)); > >>=20 > >>=20 > >> /* signal the test that it can continue */ > >> g_cond_signal(&s->data_cond); > >> break; > >> ..[snip].. > >> } > >>=20 > >>=20 > >> The test seems to be marked complete as soon as mem_table is copied.= =20 > >> However, this patch 3/3 changes the behaviour of the SET_MEM_TABLE v= host command implementation with qemu. SET_MEM_TABLE now sends out a new = message GET_FEATURES, and the call is only completed once it receives fea= tures from the remote application. (or the test framework, as is the case= here.) > > > >Hmm but why does it matter that data_cond is woken up? >=20 > Michael, sorry, I didn=E2=80=99t quite understand that. Could you pls e= xplain ? We do g_cond_signal but I don't see where does it prevent test from responding to GET_FEATURES. Except if test exited signaling success - but then why do we care? > > > > > >> While the test itself can be modified (Do not signal completion unti= l we=E2=80=99ve sent a follow-up response to GET_FEATURES), I am now wond= ering if this patch may break existing vhost applications too ? If so, re= verting it possibly better. > > > >What bothers me is that the new feature might cause the same > >issue once we enable it in the test. >=20 > No it wont. The new feature is a protocol extension, and only works if = it has been negotiated with. If not negotiated, that part of code is neve= r executed. Absolutely - this reduces the risk - but how do we know that whatever problem causes the test failures is not there with the new feature? > > > >How about a patch to tests/vhost-user-test.c adding the new > >protocol feature? I would be quite interested to see what > >is going on with it. >=20 > Yes that can be done. But you can see that the protocol extension patch= will not change the behaviour of the _existing_ test. If that passes on travis, then we'll be more confident - after all it is the travis failures that cause us to reconsider the work-around. > > > > > >> What confuses me is why it doesn=E2=80=99t fail all the time, but on= ly about 20% to 30% time as Fam reports.=20 > > > >And succeeds every time on my systems :( >=20 > +1 to that :( I have had no luck repro=E2=80=99ing it. >=20 > > > >>=20 > >> Thoughts : Michael, Fam, MarcAndre ? > >>=20 > >> Regards, > >> >=20 > Prerna