* [PATCH] oxenstored: fix short-write issue @ 2015-10-27 17:10 Wei Liu 2015-10-27 17:21 ` Samuel Thibault ` (3 more replies) 0 siblings, 4 replies; 15+ messages in thread From: Wei Liu @ 2015-10-27 17:10 UTC (permalink / raw) To: Xen-devel Cc: Ian Jackson, Samuel Thibault, Wei Liu, Ian Campbell, David Scott When oxenstored wrote to the ring, it wrote a chunk of contiguous data. Originally when it tried to write across ring boundary, it returned a short-write when there is still room. That led to stalling mini-os's xenstore thread at times. Fix this by calling write function for a second time when the first write completes partially. Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org> Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- Cc: David Scott <dave@recoil.org> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Ian Jackson <ian.jackson@eu.citrix.com> Ian, backport candidate for as far as you can manage. --- tools/ocaml/libs/xb/xb.ml | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/tools/ocaml/libs/xb/xb.ml b/tools/ocaml/libs/xb/xb.ml index 50944b5..0730d13 100644 --- a/tools/ocaml/libs/xb/xb.ml +++ b/tools/ocaml/libs/xb/xb.ml @@ -91,10 +91,12 @@ let write_fd back con s len = Unix.write back.fd s 0 len let write_mmap back con s len = - let ws = Xs_ring.write back.mmap s len in - if ws > 0 then + let ws = ref (Xs_ring.write back.mmap s len) in + if !ws < len then + ws := !ws + Xs_ring.write back.mmap (String.sub s !ws (len - !ws)) (len - !ws); + if !ws > 0 then back.eventchn_notify (); - ws + !ws let write con s len = match con.backend with -- 2.1.4 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH] oxenstored: fix short-write issue 2015-10-27 17:10 [PATCH] oxenstored: fix short-write issue Wei Liu @ 2015-10-27 17:21 ` Samuel Thibault 2015-10-27 17:21 ` Andrew Cooper ` (2 subsequent siblings) 3 siblings, 0 replies; 15+ messages in thread From: Samuel Thibault @ 2015-10-27 17:21 UTC (permalink / raw) To: Wei Liu; +Cc: Xen-devel, Ian Jackson, Ian Campbell, David Scott That being said, Wei Liu, le Tue 27 Oct 2015 17:10:09 +0000, a écrit : > @@ -91,10 +91,12 @@ let write_fd back con s len = > Unix.write back.fd s 0 len > > let write_mmap back con s len = > - let ws = Xs_ring.write back.mmap s len in > - if ws > 0 then > + let ws = ref (Xs_ring.write back.mmap s len) in > + if !ws < len then This could also be if !ws > 0 && !ws < len then And also perhaps add a comment: (* Xs_ring.write returned a short-write. Perhaps it just hit the ring boundary and there is actually still room. Call it a second time to try to finish writing at the beginning of the ring. *) > + ws := !ws + Xs_ring.write back.mmap (String.sub s !ws (len - !ws)) (len - !ws); > + if !ws > 0 then > back.eventchn_notify (); > - ws > + !ws > > let write con s len = > match con.backend with > -- > 2.1.4 > -- Samuel <N> un driver qui fait quoi, alors ? <y> ben pour les bips <s> pour passer les oops en morse -+- #ens-mim - vive les rapports de bug -+- ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] oxenstored: fix short-write issue 2015-10-27 17:10 [PATCH] oxenstored: fix short-write issue Wei Liu 2015-10-27 17:21 ` Samuel Thibault @ 2015-10-27 17:21 ` Andrew Cooper 2015-10-27 17:26 ` Samuel Thibault 2015-10-27 17:28 ` Samuel Thibault 2015-10-28 14:04 ` David Vrabel 2015-11-02 13:44 ` Ian Campbell 3 siblings, 2 replies; 15+ messages in thread From: Andrew Cooper @ 2015-10-27 17:21 UTC (permalink / raw) To: Wei Liu, Xen-devel Cc: Samuel Thibault, Ian Jackson, Ian Campbell, David Scott On 27/10/15 17:10, Wei Liu wrote: > When oxenstored wrote to the ring, it wrote a chunk of contiguous data. > Originally when it tried to write across ring boundary, it returned a > short-write when there is still room. That led to stalling mini-os's > xenstore thread at times. > > Fix this by calling write function for a second time when the first > write completes partially. > > Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org> > Signed-off-by: Wei Liu <wei.liu2@citrix.com> > --- > Cc: David Scott <dave@recoil.org> > Cc: Ian Campbell <ian.campbell@citrix.com> > Cc: Ian Jackson <ian.jackson@eu.citrix.com> > > Ian, backport candidate for as far as you can manage. > --- > tools/ocaml/libs/xb/xb.ml | 8 +++++--- > 1 file changed, 5 insertions(+), 3 deletions(-) > > diff --git a/tools/ocaml/libs/xb/xb.ml b/tools/ocaml/libs/xb/xb.ml > index 50944b5..0730d13 100644 > --- a/tools/ocaml/libs/xb/xb.ml > +++ b/tools/ocaml/libs/xb/xb.ml > @@ -91,10 +91,12 @@ let write_fd back con s len = > Unix.write back.fd s 0 len > > let write_mmap back con s len = > - let ws = Xs_ring.write back.mmap s len in > - if ws > 0 then > + let ws = ref (Xs_ring.write back.mmap s len) in > + if !ws < len then > + ws := !ws + Xs_ring.write back.mmap (String.sub s !ws (len - !ws)) (len - !ws); This is surely a TOCTOU, as the second attempted write could return short as well. The correct behaviour would be for Xs_ring.write to return the actual number of bytes it put into the ring, even if this is shorter than len. ~Andrew ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] oxenstored: fix short-write issue 2015-10-27 17:21 ` Andrew Cooper @ 2015-10-27 17:26 ` Samuel Thibault 2015-10-27 17:28 ` Samuel Thibault 1 sibling, 0 replies; 15+ messages in thread From: Samuel Thibault @ 2015-10-27 17:26 UTC (permalink / raw) To: Andrew Cooper; +Cc: Xen-devel, David Scott, Wei Liu, Ian Jackson, Ian Campbell Andrew Cooper, le Tue 27 Oct 2015 17:21:39 +0000, a écrit : > The correct behaviour would be for Xs_ring.write to return the actual > number of bytes it put into the ring, even if this is shorter than len. That is already what it does. Samuel ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] oxenstored: fix short-write issue 2015-10-27 17:21 ` Andrew Cooper 2015-10-27 17:26 ` Samuel Thibault @ 2015-10-27 17:28 ` Samuel Thibault 2015-10-27 17:31 ` Samuel Thibault 2015-10-27 17:31 ` Andrew Cooper 1 sibling, 2 replies; 15+ messages in thread From: Samuel Thibault @ 2015-10-27 17:28 UTC (permalink / raw) To: Andrew Cooper; +Cc: Xen-devel, David Scott, Wei Liu, Ian Jackson, Ian Campbell Andrew Cooper, le Tue 27 Oct 2015 17:21:39 +0000, a écrit : > as the second attempted write could return short as well. That is fine. The second attempt will only return a short write if there was really not enough room for it, which is what we want. Samuel ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] oxenstored: fix short-write issue 2015-10-27 17:28 ` Samuel Thibault @ 2015-10-27 17:31 ` Samuel Thibault 2015-10-27 17:31 ` Andrew Cooper 1 sibling, 0 replies; 15+ messages in thread From: Samuel Thibault @ 2015-10-27 17:31 UTC (permalink / raw) To: Andrew Cooper, Wei Liu, Xen-devel, Ian Jackson, Ian Campbell, David Scott Another way would be to make ml_interface_write write both pieces instead of just a contiguous one. The caml code would then look less suspicious :) Samuel ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] oxenstored: fix short-write issue 2015-10-27 17:28 ` Samuel Thibault 2015-10-27 17:31 ` Samuel Thibault @ 2015-10-27 17:31 ` Andrew Cooper 2015-10-28 13:34 ` David Scott 1 sibling, 1 reply; 15+ messages in thread From: Andrew Cooper @ 2015-10-27 17:31 UTC (permalink / raw) To: Samuel Thibault, Wei Liu, Xen-devel, Ian Jackson, Ian Campbell, David Scott On 27/10/15 17:28, Samuel Thibault wrote: > Andrew Cooper, le Tue 27 Oct 2015 17:21:39 +0000, a écrit : >> as the second attempted write could return short as well. > That is fine. The second attempt will only return a short write if there > was really not enough room for it, which is what we want. Then surely the bug is that Xs_ring.write returns short when it shouldn't? ~Andrew ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] oxenstored: fix short-write issue 2015-10-27 17:31 ` Andrew Cooper @ 2015-10-28 13:34 ` David Scott 2015-10-28 13:47 ` Samuel Thibault 2015-10-28 14:00 ` Andrew Cooper 0 siblings, 2 replies; 15+ messages in thread From: David Scott @ 2015-10-28 13:34 UTC (permalink / raw) To: Andrew Cooper Cc: Samuel Thibault, Ian Campbell, Wei Liu, Ian Jackson, Xen-devel > On 27 Oct 2015, at 17:31, Andrew Cooper <andrew.cooper3@citrix.com> wrote: > > On 27/10/15 17:28, Samuel Thibault wrote: >> Andrew Cooper, le Tue 27 Oct 2015 17:21:39 +0000, a écrit : >>> as the second attempted write could return short as well. >> That is fine. The second attempt will only return a short write if there >> was really not enough room for it, which is what we want. > > Then surely the bug is that Xs_ring.write returns short when it shouldn’t? > On 27 Oct 2015, at 17:31, Samuel Thibault <samuel.thibault@ens-lyon.org> wrote: > > Another way would be to make ml_interface_write write both pieces > instead of just a contiguous one. The caml code would then look less > suspicious :) > I agree with both Andrew and Samuel that it would be a better fix if Xs_ring.write (== “ml_interface_write”) wrote both pieces at once. However I believe the ‘suspicious’ OCaml patch fixes the specific issue (— or have I missed something?). Last time I meddled with the C-level ring reading/writing code I didn’t get it quite right. Does anyone have time to prototype what a C-level fix would look like? If we’re short of time I could live with the OCaml-level fix, especially since Wei has done some stress-testing (assuming everyone believes it fixes the issue) Cheers, Dave ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] oxenstored: fix short-write issue 2015-10-28 13:34 ` David Scott @ 2015-10-28 13:47 ` Samuel Thibault 2015-10-28 14:00 ` Andrew Cooper 1 sibling, 0 replies; 15+ messages in thread From: Samuel Thibault @ 2015-10-28 13:47 UTC (permalink / raw) To: David Scott; +Cc: Andrew Cooper, Ian Campbell, Wei Liu, Ian Jackson, Xen-devel [-- Attachment #1: Type: text/plain, Size: 2121 bytes --] David Scott, le Wed 28 Oct 2015 13:34:10 +0000, a écrit : > However I believe the ‘suspicious’ OCaml patch fixes the specific issue (— or have I missed something?). It does. > Does anyone have time to prototype what a C-level fix would look like? Here is a try, untested though. Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org> diff --git a/tools/ocaml/libs/xb/xs_ring_stubs.c b/tools/ocaml/libs/xb/xs_ring_stubs.c index fd561a2..5c771d6 100644 --- a/tools/ocaml/libs/xb/xs_ring_stubs.c +++ b/tools/ocaml/libs/xb/xs_ring_stubs.c @@ -99,12 +99,12 @@ CAMLprim value ml_interface_write(value ml_interface, int result; struct xenstore_domain_interface *intf = interface->addr; - XENSTORE_RING_IDX cons, prod; + XENSTORE_RING_IDX cons, old_prod, prod; int can_write; uint32_t connection; cons = *(volatile uint32_t*)&intf->rsp_cons; - prod = *(volatile uint32_t*)&intf->rsp_prod; + old_prod = prod = *(volatile uint32_t*)&intf->rsp_prod; connection = *(volatile uint32_t*)&intf->connection; if (connection != XENSTORE_CONNECTED) @@ -116,15 +116,30 @@ CAMLprim value ml_interface_write(value ml_interface, goto exit; } if (MASK_XENSTORE_IDX(prod) >= MASK_XENSTORE_IDX(cons)) + { can_write = XENSTORE_RING_SIZE - MASK_XENSTORE_IDX(prod); - else + if (can_write < len) + { + /* Not enough room at end of ring. Copy what we can first. */ + memcpy(intf->rsp + MASK_XENSTORE_IDX(prod), buffer, can_write); + prod += can_write; + len -= can_write; + buffer += can_write; + + /* We are now back at beginning of ring. */ + can_write = MASK_XENSTORE_IDX(cons) - MASK_XENSTORE_IDX(prod); + } + } + else can_write = MASK_XENSTORE_IDX(cons) - MASK_XENSTORE_IDX(prod); if (can_write < len) + /* Real short write: not enough room. */ len = can_write; memcpy(intf->rsp + MASK_XENSTORE_IDX(prod), buffer, len); + prod += len; xen_mb(); - intf->rsp_prod += len; - result = len; + intf->rsp_prod = prod; + result = prod - old_prod; exit: ml_result = Val_int(result); CAMLreturn(ml_result); [-- Attachment #2: patch --] [-- Type: text/plain, Size: 1708 bytes --] diff --git a/tools/ocaml/libs/xb/xs_ring_stubs.c b/tools/ocaml/libs/xb/xs_ring_stubs.c index fd561a2..5c771d6 100644 --- a/tools/ocaml/libs/xb/xs_ring_stubs.c +++ b/tools/ocaml/libs/xb/xs_ring_stubs.c @@ -99,12 +99,12 @@ CAMLprim value ml_interface_write(value ml_interface, int result; struct xenstore_domain_interface *intf = interface->addr; - XENSTORE_RING_IDX cons, prod; + XENSTORE_RING_IDX cons, old_prod, prod; int can_write; uint32_t connection; cons = *(volatile uint32_t*)&intf->rsp_cons; - prod = *(volatile uint32_t*)&intf->rsp_prod; + old_prod = prod = *(volatile uint32_t*)&intf->rsp_prod; connection = *(volatile uint32_t*)&intf->connection; if (connection != XENSTORE_CONNECTED) @@ -116,15 +116,30 @@ CAMLprim value ml_interface_write(value ml_interface, goto exit; } if (MASK_XENSTORE_IDX(prod) >= MASK_XENSTORE_IDX(cons)) + { can_write = XENSTORE_RING_SIZE - MASK_XENSTORE_IDX(prod); - else + if (can_write < len) + { + /* Not enough room at end of ring. Copy what we can first. */ + memcpy(intf->rsp + MASK_XENSTORE_IDX(prod), buffer, can_write); + prod += can_write; + len -= can_write; + buffer += can_write; + + /* We are now back at beginning of ring. */ + can_write = MASK_XENSTORE_IDX(cons) - MASK_XENSTORE_IDX(prod); + } + } + else can_write = MASK_XENSTORE_IDX(cons) - MASK_XENSTORE_IDX(prod); if (can_write < len) + /* Real short write: not enough room. */ len = can_write; memcpy(intf->rsp + MASK_XENSTORE_IDX(prod), buffer, len); + prod += len; xen_mb(); - intf->rsp_prod += len; - result = len; + intf->rsp_prod = prod; + result = prod - old_prod; exit: ml_result = Val_int(result); CAMLreturn(ml_result); [-- Attachment #3: Type: text/plain, Size: 126 bytes --] _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH] oxenstored: fix short-write issue 2015-10-28 13:34 ` David Scott 2015-10-28 13:47 ` Samuel Thibault @ 2015-10-28 14:00 ` Andrew Cooper 1 sibling, 0 replies; 15+ messages in thread From: Andrew Cooper @ 2015-10-28 14:00 UTC (permalink / raw) To: David Scott Cc: Ian Jackson, Samuel Thibault, Wei Liu, Ian Campbell, Xen-devel On 28/10/15 13:34, David Scott wrote: > >> On 27 Oct 2015, at 17:31, Andrew Cooper <andrew.cooper3@citrix.com> wrote: >> >> On 27/10/15 17:28, Samuel Thibault wrote: >>> Andrew Cooper, le Tue 27 Oct 2015 17:21:39 +0000, a écrit : >>>> as the second attempted write could return short as well. >>> That is fine. The second attempt will only return a short write if there >>> was really not enough room for it, which is what we want. >> Then surely the bug is that Xs_ring.write returns short when it shouldn’t? > >> On 27 Oct 2015, at 17:31, Samuel Thibault <samuel.thibault@ens-lyon.org> wrote: >> >> Another way would be to make ml_interface_write write both pieces >> instead of just a contiguous one. The caml code would then look less >> suspicious :) >> > I agree with both Andrew and Samuel that it would be a better fix if Xs_ring.write (== “ml_interface_write”) wrote both pieces at once. > > However I believe the ‘suspicious’ OCaml patch fixes the specific issue (— or have I missed something?). Actually the suspicious OCaml is still buggy in the case that there was a genuine short write followed by a false short write from wrapping the ring. > > Last time I meddled with the C-level ring reading/writing code I didn’t get it quite right. Does anyone have time to prototype what a C-level fix would look like? If we’re short of time I could live with the OCaml-level fix, especially since Wei has done some stress-testing (assuming everyone believes it fixes the issue) There are several bugs in that function (the remaining size calculations are plain wrong), and Samuels patch sadly doesn't address all of them. I will throw a different patch together addressing all the issues I can spot. ~Andrew ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] oxenstored: fix short-write issue 2015-10-27 17:10 [PATCH] oxenstored: fix short-write issue Wei Liu 2015-10-27 17:21 ` Samuel Thibault 2015-10-27 17:21 ` Andrew Cooper @ 2015-10-28 14:04 ` David Vrabel 2015-11-02 13:44 ` Ian Campbell 3 siblings, 0 replies; 15+ messages in thread From: David Vrabel @ 2015-10-28 14:04 UTC (permalink / raw) To: Wei Liu, Xen-devel Cc: Samuel Thibault, Ian Jackson, Ian Campbell, David Scott On 27/10/15 17:10, Wei Liu wrote: > When oxenstored wrote to the ring, it wrote a chunk of contiguous data. > Originally when it tried to write across ring boundary, it returned a > short-write when there is still room. That led to stalling mini-os's > xenstore thread at times. > > Fix this by calling write function for a second time when the first > write completes partially. What happens if the 1st write is short because there is not enough space, and the 2nd write is short because of the ring wraparound? You will still stall. David ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] oxenstored: fix short-write issue 2015-10-27 17:10 [PATCH] oxenstored: fix short-write issue Wei Liu ` (2 preceding siblings ...) 2015-10-28 14:04 ` David Vrabel @ 2015-11-02 13:44 ` Ian Campbell 2015-11-02 14:10 ` Andrew Cooper 3 siblings, 1 reply; 15+ messages in thread From: Ian Campbell @ 2015-11-02 13:44 UTC (permalink / raw) To: Wei Liu, Xen-devel; +Cc: Samuel Thibault, Ian Jackson, David Scott On Tue, 2015-10-27 at 17:10 +0000, Wei Liu wrote: > When oxenstored wrote to the ring, it wrote a chunk of contiguous data. > Originally when it tried to write across ring boundary, it returned a > short-write when there is still room. That led to stalling mini-os's > xenstore thread at times. What is a "short-write" in this context? Given data bytes 0..M I assumed it is only writing bytes 0..N and not N+1..M because the ring boundary is at N. But what is it writing to the ->prod ring pointer N or M? AIUI writing N should be allowed by the ring protocol, the client should keep looking for more data until it has a complete request. Writing M would be a server error. Ian. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] oxenstored: fix short-write issue 2015-11-02 13:44 ` Ian Campbell @ 2015-11-02 14:10 ` Andrew Cooper 2015-11-02 14:24 ` Ian Campbell 0 siblings, 1 reply; 15+ messages in thread From: Andrew Cooper @ 2015-11-02 14:10 UTC (permalink / raw) To: Ian Campbell, Wei Liu, Xen-devel Cc: Samuel Thibault, Ian Jackson, David Scott On 02/11/15 13:44, Ian Campbell wrote: > On Tue, 2015-10-27 at 17:10 +0000, Wei Liu wrote: >> When oxenstored wrote to the ring, it wrote a chunk of contiguous data. >> Originally when it tried to write across ring boundary, it returned a >> short-write when there is still room. That led to stalling mini-os's >> xenstore thread at times. > What is a "short-write" in this context? > > Given data bytes 0..M I assumed it is only writing bytes 0..N and not > N+1..M because the ring boundary is at N. But what is it writing to the > ->prod ring pointer N or M? Prod gets incremented by N in this case. > > AIUI writing N should be allowed by the ring protocol, the client should > keep looking for more data until it has a complete request. > > Writing M would be a server error. Correct, and this is what is happening. The server (believes) that the ring is full, when it is not. It waits for the client to make more space in the ring, while the client is waiting for the server to complete its message in the ring, thus stalling. ~Andrew ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] oxenstored: fix short-write issue 2015-11-02 14:10 ` Andrew Cooper @ 2015-11-02 14:24 ` Ian Campbell 2015-11-02 14:27 ` Andrew Cooper 0 siblings, 1 reply; 15+ messages in thread From: Ian Campbell @ 2015-11-02 14:24 UTC (permalink / raw) To: Andrew Cooper, Wei Liu, Xen-devel Cc: Samuel Thibault, Ian Jackson, David Scott On Mon, 2015-11-02 at 14:10 +0000, Andrew Cooper wrote: > On 02/11/15 13:44, Ian Campbell wrote: > > On Tue, 2015-10-27 at 17:10 +0000, Wei Liu wrote: > > > When oxenstored wrote to the ring, it wrote a chunk of contiguous > > > data. > > > Originally when it tried to write across ring boundary, it returned a > > > short-write when there is still room. That led to stalling mini-os's > > > xenstore thread at times. > > What is a "short-write" in this context? > > > > Given data bytes 0..M I assumed it is only writing bytes 0..N and not > > N+1..M because the ring boundary is at N. But what is it writing to the > > ->prod ring pointer N or M? > > Prod gets incremented by N in this case. > > > > > AIUI writing N should be allowed by the ring protocol, the client > > should > > keep looking for more data until it has a complete request. > > > > Writing M would be a server error. > > Correct, and this is what is happening. The first or second? Your first comment suggests the first, but your second binds most closely to the second. > The server (believes) that the ring is full, when it is not. It waits > for the client to make more space in the ring, while the client is > waiting for the server to complete its message in the ring, thus > stalling. That makes sense, thanks. I think this needs to be spelled out more fully in the commit message, in particular "server thinks ring is full when it is not" is the most relevant thing, the short write is just how we arrived there. Ian. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] oxenstored: fix short-write issue 2015-11-02 14:24 ` Ian Campbell @ 2015-11-02 14:27 ` Andrew Cooper 0 siblings, 0 replies; 15+ messages in thread From: Andrew Cooper @ 2015-11-02 14:27 UTC (permalink / raw) To: Ian Campbell, Wei Liu, Xen-devel Cc: Samuel Thibault, Ian Jackson, David Scott On 02/11/15 14:24, Ian Campbell wrote: > On Mon, 2015-11-02 at 14:10 +0000, Andrew Cooper wrote: >> On 02/11/15 13:44, Ian Campbell wrote: >>> On Tue, 2015-10-27 at 17:10 +0000, Wei Liu wrote: >>>> When oxenstored wrote to the ring, it wrote a chunk of contiguous >>>> data. >>>> Originally when it tried to write across ring boundary, it returned a >>>> short-write when there is still room. That led to stalling mini-os's >>>> xenstore thread at times. >>> What is a "short-write" in this context? >>> >>> Given data bytes 0..M I assumed it is only writing bytes 0..N and not >>> N+1..M because the ring boundary is at N. But what is it writing to the >>> ->prod ring pointer N or M? >> Prod gets incremented by N in this case. >> >>> AIUI writing N should be allowed by the ring protocol, the client >>> should >>> keep looking for more data until it has a complete request. >>> >>> Writing M would be a server error. >> Correct, and this is what is happening. > The first or second? Your first comment suggests the first, but your second > binds most closely to the second. Oops yes. Writing M would be an error. Prod currently gets incremented by N. > >> The server (believes) that the ring is full, when it is not. It waits >> for the client to make more space in the ring, while the client is >> waiting for the server to complete its message in the ring, thus >> stalling. > That makes sense, thanks. > > I think this needs to be spelled out more fully in the commit message, in > particular "server thinks ring is full when it is not" is the most relevant > thing, the short write is just how we arrived there. This patch here is buggy, and superseeded by one of mine which attempts to fix the C stubs. I need to post a v2. ~Andrew ^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2015-11-02 14:27 UTC | newest] Thread overview: 15+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2015-10-27 17:10 [PATCH] oxenstored: fix short-write issue Wei Liu 2015-10-27 17:21 ` Samuel Thibault 2015-10-27 17:21 ` Andrew Cooper 2015-10-27 17:26 ` Samuel Thibault 2015-10-27 17:28 ` Samuel Thibault 2015-10-27 17:31 ` Samuel Thibault 2015-10-27 17:31 ` Andrew Cooper 2015-10-28 13:34 ` David Scott 2015-10-28 13:47 ` Samuel Thibault 2015-10-28 14:00 ` Andrew Cooper 2015-10-28 14:04 ` David Vrabel 2015-11-02 13:44 ` Ian Campbell 2015-11-02 14:10 ` Andrew Cooper 2015-11-02 14:24 ` Ian Campbell 2015-11-02 14:27 ` Andrew Cooper
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).