From mboxrd@z Thu Jan 1 00:00:00 1970 From: Andrew Cooper Subject: Re: [PATCH] oxenstored: fix short-write issue Date: Wed, 28 Oct 2015 14:00:02 +0000 Message-ID: <5630D4E2.8070201@citrix.com> References: <1445965809-5144-1-git-send-email-wei.liu2@citrix.com> <562FB2A3.8070603@citrix.com> <20151027172810.GQ2483@var.bordeaux.inria.fr> <562FB4F9.20100@citrix.com> Mime-Version: 1.0 Content-Type: text/plain; charset="windows-1252" Content-Transfer-Encoding: quoted-printable Return-path: Received: from mail6.bemta5.messagelabs.com ([195.245.231.135]) by lists.xen.org with esmtp (Exim 4.72) (envelope-from ) id 1ZrRGp-000232-E4 for xen-devel@lists.xenproject.org; Wed, 28 Oct 2015 14:00:07 +0000 In-Reply-To: List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: David Scott Cc: Ian Jackson , Samuel Thibault , Wei Liu , Ian Campbell , Xen-devel List-Id: xen-devel@lists.xenproject.org On 28/10/15 13:34, David Scott wrote: > >> On 27 Oct 2015, at 17:31, Andrew Cooper wrot= e: >> >> On 27/10/15 17:28, Samuel Thibault wrote: >>> Andrew Cooper, le Tue 27 Oct 2015 17:21:39 +0000, a =E9crit : >>>> as the second attempted write could return short as well. >>> That is fine. The second attempt will only return a short write if there >>> was really not enough room for it, which is what we want. >> Then surely the bug is that Xs_ring.write returns short when it shouldn= =92t? > >> On 27 Oct 2015, at 17:31, Samuel Thibault = wrote: >> >> Another way would be to make ml_interface_write write both pieces >> instead of just a contiguous one. The caml code would then look less >> suspicious :) >> > I agree with both Andrew and Samuel that it would be a better fix if Xs_r= ing.write (=3D=3D =93ml_interface_write=94) wrote both pieces at once. > > However I believe the =91suspicious=92 OCaml patch fixes the specific iss= ue (=97 or have I missed something?). Actually the suspicious OCaml is still buggy in the case that there was a genuine short write followed by a false short write from wrapping the ring. > > Last time I meddled with the C-level ring reading/writing code I didn=92t= get it quite right. Does anyone have time to prototype what a C-level fix = would look like? If we=92re short of time I could live with the OCaml-level= fix, especially since Wei has done some stress-testing (assuming everyone = believes it fixes the issue) There are several bugs in that function (the remaining size calculations are plain wrong), and Samuels patch sadly doesn't address all of them. = I will throw a different patch together addressing all the issues I can spot. ~Andrew