git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Jiří Hruška" <jirka@fud.cz>
To: git@vger.kernel.org
Cc: Jeff King <peff@peff.net>,
	Jonathan Tan <jonathantanmy@google.com>,
	 Junio C Hamano <gitster@pobox.com>
Subject: [PATCH v2 4/5] remote-curl: simplify rpc_out() - less nesting and rename
Date: Tue, 14 Nov 2023 19:34:54 -0800	[thread overview]
Message-ID: <CAGE_+C7CEh63ThpJx2BvMQLb==v=h79pMKHC14t_wYeRowMfwQ@mail.gmail.com> (raw)
In-Reply-To: <20231115033121.939-1-jirka@fud.cz>

- Remove one indentation level in `rpc_out()`, as the conditions can be
  combined into just one `if` statement without losing readability.

  Skipping the resetting of `initial_buffer`/`len`/`pos` revealed a bug
  in `stateless_connect()`, where `rpc.len` is reset to 0 after each
  request, but not `rpc.pos`. Relying on `rpc_out()` always doing this
  before has never been safe (it might have not finished cleanly, for
  example). So better reset it there, just like `rpc.len`.

- Rename `flush_read_but_not_sent` to `read_from_out_done`. The name is
  slightly misleading, because the "flush" might never be really "sent"
  (depends on `write_line_lengths`), and this is not the most important
  part anyway. The primary role of the flag is rather to signal that
  `read_from_out()` is "done" and must not be called for this particular
  RPC exchange anymore.

- Update/add some related comments.

Signed-off-by: Jiri Hruska <jirka@fud.cz>
---
 remote-curl.c | 46 ++++++++++++++++++++++++++++------------------
 1 file changed, 28 insertions(+), 18 deletions(-)

diff --git a/remote-curl.c b/remote-curl.c
index 690df2a43e..d5aa66a44c 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -606,13 +606,14 @@ struct rpc_state {
 	unsigned write_line_lengths : 1;

 	/*
-	 * Used by rpc_out; initialize to 0. This is true if a flush has been
-	 * read, but the corresponding line length (if write_line_lengths is
-	 * true) and EOF have not been sent to libcurl. Since each flush marks
-	 * the end of a request, each flush must be completely sent before any
-	 * further reading occurs.
+	 * Used by rpc_out; initialize to 0. This is true if a flush packet
+	 * has been read from the child process, signaling the end of the
+	 * current data to send. There might be still some bytes pending in
+	 * 'buf' (e.g. the corresponding line length, if write_line_lengths
+	 * is true), but no more reads can be performed on the 'out' pipe as
+	 * part of the current RPC exchange.
 	 */
-	unsigned flush_read_but_not_sent : 1;
+	unsigned read_from_out_done : 1;
 };

 #define RPC_STATE_INIT { 0 }
@@ -690,21 +691,29 @@ static size_t rpc_out(void *ptr, size_t eltsize,
 	size_t avail = rpc->len - rpc->pos;
 	enum packet_read_status status;

-	if (!avail) {
+	/*
+	 * If there is no more data available in our buffer and the child
+	 * process is not done sending yet, read the next packet.
+	 */
+	if (!avail && !rpc->read_from_out_done) {
 		rpc->initial_buffer = 0;
 		rpc->len = 0;
 		rpc->pos = 0;
-		if (!rpc->flush_read_but_not_sent) {
-			if (!rpc_read_from_out(rpc, 0, &avail, &status))
-				BUG("The entire rpc->buf should be larger than LARGE_PACKET_MAX");
-			if (status == PACKET_READ_FLUSH)
-				rpc->flush_read_but_not_sent = 1;
-		}
+		if (!rpc_read_from_out(rpc, 0, &avail, &status))
+			BUG("The entire rpc->buf should be larger than LARGE_PACKET_MAX");
+
 		/*
-		 * If flush_read_but_not_sent is true, we have already read one
-		 * full request but have not fully sent it + EOF, which is why
-		 * we need to refrain from reading.
+		 * If a flush packet was read, it means the child process is
+		 * done sending this request. The buffer might be fully empty
+		 * at this point or contain a flush packet too, depending on
+		 * rpc->write_line_lengths.
+		 * In any case, we must refrain from reading any more, because
+		 * the child process already expects to receive a response back
+		 * instead. If both sides would try to read at once, they would
+		 * just hang waiting for each other.
 		 */
+		if (status == PACKET_READ_FLUSH)
+			rpc->read_from_out_done = 1;
 	}

 	/*
@@ -967,7 +976,7 @@ static int post_rpc(struct rpc_state *rpc, int
stateless_connect, int flush_rece
 		 */
 		headers = curl_slist_append(headers, "Transfer-Encoding: chunked");
 		rpc->initial_buffer = 1;
-		rpc->flush_read_but_not_sent = 0;
+		rpc->read_from_out_done = 0;
 		curl_easy_setopt(slot->curl, CURLOPT_READFUNCTION, rpc_out);
 		curl_easy_setopt(slot->curl, CURLOPT_INFILE, rpc);
 		curl_easy_setopt(slot->curl, CURLOPT_SEEKFUNCTION, rpc_seek);
@@ -1487,7 +1496,7 @@ static int stateless_connect(const char *service_name)
 	rpc.gzip_request = 1;
 	rpc.initial_buffer = 0;
 	rpc.write_line_lengths = 1;
-	rpc.flush_read_but_not_sent = 0;
+	rpc.read_from_out_done = 0;

 	/*
 	 * Dump the capability listing that we got from the server earlier
@@ -1510,6 +1519,7 @@ static int stateless_connect(const char *service_name)
 			break;
 		/* Reset the buffer for next request */
 		rpc.len = 0;
+		rpc.pos = 0;
 	}

 	free(rpc.service_url);
-- 
2.42.1.5.g2f21867bd5

  parent reply	other threads:[~2023-11-15  3:34 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-11-13 11:25 [PATCH] remote-curl: avoid hang if curl asks for more data after eof Jiří Hruška
2023-11-13 21:22 ` Jonathan Tan
2023-11-14  1:36   ` Junio C Hamano
2023-11-14 23:33     ` Jiří Hruška
2023-11-27 13:19       ` Jiří Hruška
2023-11-14 23:16   ` Jiří Hruška
2023-11-15  3:34 ` [PATCH v2 0/5] Avoid hang if curl needs eof twice + minor related improvements Jiří Hruška
2023-11-15 23:28   ` Jonathan Tan
2023-11-27 13:39     ` Jiří Hruška
2023-11-27 18:26       ` Jonathan Tan
     [not found] ` <20231115033121.939-1-jirka@fud.cz>
2023-11-15  3:34   ` [PATCH v2 1/5] remote-curl: avoid hang if curl asks for more data after eof Jiří Hruška
2023-11-15 19:20     ` Jonathan Tan
2023-11-15  3:34   ` [PATCH v2 2/5] remote-curl: improve readability of curl callbacks Jiří Hruška
2023-11-15  3:34   ` [PATCH v2 3/5] remote-curl: simplify rpc_out() - remove superfluous ifs Jiří Hruška
2023-11-15  3:34   ` Jiří Hruška [this message]
2023-11-15  3:34   ` [PATCH v2 5/5] http: reset CURLOPT_POSTFIELDSIZE_LARGE between requests Jiří Hruška
2023-11-15  6:44     ` Patrick Steinhardt
2023-11-27 13:21       ` Jiří Hruška

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAGE_+C7CEh63ThpJx2BvMQLb==v=h79pMKHC14t_wYeRowMfwQ@mail.gmail.com' \
    --to=jirka@fud.cz \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jonathantanmy@google.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).