* [PATCHv2 01/10] pkt-line: move a misplaced comment
2013-02-18 9:12 ` [PATCHv2 0/10] pkt-line and remote-curl cleanups server Jeff King
@ 2013-02-18 9:14 ` Jeff King
2013-02-18 9:15 ` [PATCHv2 02/10] pkt-line: drop safe_write function Jeff King
` (10 subsequent siblings)
11 siblings, 0 replies; 50+ messages in thread
From: Jeff King @ 2013-02-18 9:14 UTC (permalink / raw)
To: Jonathan Nieder; +Cc: git, Shawn O. Pearce
The comment describing the packet writing interface was
originally written above packet_write, but migrated to be
above safe_write in f3a3214, probably because it is meant to
generally describe the packet writing interface and not a
single function. Let's move it into the header file, where
users of the interface are more likely to see it.
Signed-off-by: Jeff King <peff@peff.net>
---
I just left the comment intact as I moved it. It kind of implies to me
that you hand a big buffer to these functions and they would packetize
it for you, which is not true. I don't know if anybody else sees that;
it might be worth tweaking the text.
pkt-line.c | 15 ---------------
pkt-line.h | 14 +++++++++++++-
2 files changed, 13 insertions(+), 16 deletions(-)
diff --git a/pkt-line.c b/pkt-line.c
index eaba15f..5138f47 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -46,21 +46,6 @@ static void packet_trace(const char *buf, unsigned int len, int write)
strbuf_release(&out);
}
-/*
- * Write a packetized stream, where each line is preceded by
- * its length (including the header) as a 4-byte hex number.
- * A length of 'zero' means end of stream (and a length of 1-3
- * would be an error).
- *
- * This is all pretty stupid, but we use this packetized line
- * format to make a streaming format possible without ever
- * over-running the read buffers. That way we'll never read
- * into what might be the pack data (which should go to another
- * process entirely).
- *
- * The writing side could use stdio, but since the reading
- * side can't, we stay with pure read/write interfaces.
- */
ssize_t safe_write(int fd, const void *buf, ssize_t n)
{
ssize_t nn = n;
diff --git a/pkt-line.h b/pkt-line.h
index 8cfeb0c..7a67e9c 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -5,7 +5,19 @@
#include "strbuf.h"
/*
- * Silly packetized line writing interface
+ * Write a packetized stream, where each line is preceded by
+ * its length (including the header) as a 4-byte hex number.
+ * A length of 'zero' means end of stream (and a length of 1-3
+ * would be an error).
+ *
+ * This is all pretty stupid, but we use this packetized line
+ * format to make a streaming format possible without ever
+ * over-running the read buffers. That way we'll never read
+ * into what might be the pack data (which should go to another
+ * process entirely).
+ *
+ * The writing side could use stdio, but since the reading
+ * side can't, we stay with pure read/write interfaces.
*/
void packet_flush(int fd);
void packet_write(int fd, const char *fmt, ...) __attribute__((format (printf, 2, 3)));
--
1.8.1.20.g7078b03
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCHv2 02/10] pkt-line: drop safe_write function
2013-02-18 9:12 ` [PATCHv2 0/10] pkt-line and remote-curl cleanups server Jeff King
2013-02-18 9:14 ` [PATCHv2 01/10] pkt-line: move a misplaced comment Jeff King
@ 2013-02-18 9:15 ` Jeff King
2013-02-18 9:56 ` Jonathan Nieder
2013-02-18 9:16 ` [PATCHv2 03/10] pkt-line: clean up "gentle" reading function Jeff King
` (9 subsequent siblings)
11 siblings, 1 reply; 50+ messages in thread
From: Jeff King @ 2013-02-18 9:15 UTC (permalink / raw)
To: Jonathan Nieder; +Cc: git, Shawn O. Pearce
This is just write_or_die by another name.
Signed-off-by: Jeff King <peff@peff.net>
---
Actually, they are not quite the same. write_or_die will exit(0) when it
sees EPIPE. Which makes me a little nervous.
builtin/receive-pack.c | 2 +-
builtin/send-pack.c | 2 +-
fetch-pack.c | 2 +-
http-backend.c | 8 ++++----
pkt-line.c | 21 ++-------------------
remote-curl.c | 4 ++--
send-pack.c | 2 +-
sideband.c | 9 +++++----
upload-pack.c | 3 ++-
9 files changed, 19 insertions(+), 34 deletions(-)
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index e8878de..48cd5dc 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -908,7 +908,7 @@ static void report(struct command *commands, const char *unpack_status)
if (use_sideband)
send_sideband(1, 1, buf.buf, buf.len, use_sideband);
else
- safe_write(1, buf.buf, buf.len);
+ write_or_die(1, buf.buf, buf.len);
strbuf_release(&buf);
}
diff --git a/builtin/send-pack.c b/builtin/send-pack.c
index 57a46b2..8778519 100644
--- a/builtin/send-pack.c
+++ b/builtin/send-pack.c
@@ -79,7 +79,7 @@ static void print_helper_status(struct ref *ref)
}
strbuf_addch(&buf, '\n');
- safe_write(1, buf.buf, buf.len);
+ write_or_die(1, buf.buf, buf.len);
}
strbuf_release(&buf);
}
diff --git a/fetch-pack.c b/fetch-pack.c
index 6d8926a..b90dadf 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -245,7 +245,7 @@ static void send_request(struct fetch_pack_args *args,
send_sideband(fd, -1, buf->buf, buf->len, LARGE_PACKET_MAX);
packet_flush(fd);
} else
- safe_write(fd, buf->buf, buf->len);
+ write_or_die(fd, buf->buf, buf->len);
}
static void insert_one_alternate_ref(const struct ref *ref, void *unused)
diff --git a/http-backend.c b/http-backend.c
index f50e77f..8144f3a 100644
--- a/http-backend.c
+++ b/http-backend.c
@@ -70,7 +70,7 @@ static void format_write(int fd, const char *fmt, ...)
if (n >= sizeof(buffer))
die("protocol error: impossibly long line");
- safe_write(fd, buffer, n);
+ write_or_die(fd, buffer, n);
}
static void http_status(unsigned code, const char *msg)
@@ -111,7 +111,7 @@ static void end_headers(void)
static void end_headers(void)
{
- safe_write(1, "\r\n", 2);
+ write_or_die(1, "\r\n", 2);
}
__attribute__((format (printf, 1, 2)))
@@ -157,7 +157,7 @@ static void send_strbuf(const char *type, struct strbuf *buf)
hdr_int(content_length, buf->len);
hdr_str(content_type, type);
end_headers();
- safe_write(1, buf->buf, buf->len);
+ write_or_die(1, buf->buf, buf->len);
}
static void send_local_file(const char *the_type, const char *name)
@@ -185,7 +185,7 @@ static void send_local_file(const char *the_type, const char *name)
die_errno("Cannot read '%s'", p);
if (!n)
break;
- safe_write(1, buf, n);
+ write_or_die(1, buf, n);
}
close(fd);
free(buf);
diff --git a/pkt-line.c b/pkt-line.c
index 5138f47..699c2dd 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -46,23 +46,6 @@ static void packet_trace(const char *buf, unsigned int len, int write)
strbuf_release(&out);
}
-ssize_t safe_write(int fd, const void *buf, ssize_t n)
-{
- ssize_t nn = n;
- while (n) {
- int ret = xwrite(fd, buf, n);
- if (ret > 0) {
- buf = (char *) buf + ret;
- n -= ret;
- continue;
- }
- if (!ret)
- die("write error (disk full?)");
- die_errno("write error");
- }
- return nn;
-}
-
/*
* If we buffered things up above (we don't, but we should),
* we'd flush it here
@@ -70,7 +53,7 @@ void packet_flush(int fd)
void packet_flush(int fd)
{
packet_trace("0000", 4, 1);
- safe_write(fd, "0000", 4);
+ write_or_die(fd, "0000", 4);
}
void packet_buf_flush(struct strbuf *buf)
@@ -106,7 +89,7 @@ void packet_write(int fd, const char *fmt, ...)
va_start(args, fmt);
n = format_packet(fmt, args);
va_end(args);
- safe_write(fd, buffer, n);
+ write_or_die(fd, buffer, n);
}
void packet_buf_write(struct strbuf *buf, const char *fmt, ...)
diff --git a/remote-curl.c b/remote-curl.c
index 933c69a..7be4b53 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -685,7 +685,7 @@ static int fetch_git(struct discovery *heads,
err = rpc_service(&rpc, heads);
if (rpc.result.len)
- safe_write(1, rpc.result.buf, rpc.result.len);
+ write_or_die(1, rpc.result.buf, rpc.result.len);
strbuf_release(&rpc.result);
strbuf_release(&preamble);
free(depth_arg);
@@ -805,7 +805,7 @@ static int push_git(struct discovery *heads, int nr_spec, char **specs)
err = rpc_service(&rpc, heads);
if (rpc.result.len)
- safe_write(1, rpc.result.buf, rpc.result.len);
+ write_or_die(1, rpc.result.buf, rpc.result.len);
strbuf_release(&rpc.result);
free(argv);
return err;
diff --git a/send-pack.c b/send-pack.c
index 97ab336..a99a1e4 100644
--- a/send-pack.c
+++ b/send-pack.c
@@ -281,7 +281,7 @@ int send_pack(struct send_pack_args *args,
send_sideband(out, -1, req_buf.buf, req_buf.len, LARGE_PACKET_MAX);
}
} else {
- safe_write(out, req_buf.buf, req_buf.len);
+ write_or_die(out, req_buf.buf, req_buf.len);
packet_flush(out);
}
strbuf_release(&req_buf);
diff --git a/sideband.c b/sideband.c
index d5ffa1c..8f7b25b 100644
--- a/sideband.c
+++ b/sideband.c
@@ -1,3 +1,4 @@
+#include "cache.h"
#include "pkt-line.h"
#include "sideband.h"
@@ -108,7 +109,7 @@ int recv_sideband(const char *me, int in_stream, int out)
} while (len);
continue;
case 1:
- safe_write(out, buf + pf+1, len);
+ write_or_die(out, buf + pf+1, len);
continue;
default:
fprintf(stderr, "%s: protocol error: bad band #%d\n",
@@ -138,12 +139,12 @@ ssize_t send_sideband(int fd, int band, const char *data, ssize_t sz, int packet
if (0 <= band) {
sprintf(hdr, "%04x", n + 5);
hdr[4] = band;
- safe_write(fd, hdr, 5);
+ write_or_die(fd, hdr, 5);
} else {
sprintf(hdr, "%04x", n + 4);
- safe_write(fd, hdr, 4);
+ write_or_die(fd, hdr, 4);
}
- safe_write(fd, p, n);
+ write_or_die(fd, p, n);
p += n;
sz -= n;
}
diff --git a/upload-pack.c b/upload-pack.c
index 7c05b15..04d6bd7 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -69,7 +69,8 @@ static ssize_t send_client_data(int fd, const char *data, ssize_t sz)
xwrite(fd, data, sz);
return sz;
}
- return safe_write(fd, data, sz);
+ write_or_die(fd, data, sz);
+ return sz;
}
static FILE *pack_pipe = NULL;
--
1.8.1.20.g7078b03
^ permalink raw reply related [flat|nested] 50+ messages in thread
* Re: [PATCHv2 02/10] pkt-line: drop safe_write function
2013-02-18 9:15 ` [PATCHv2 02/10] pkt-line: drop safe_write function Jeff King
@ 2013-02-18 9:56 ` Jonathan Nieder
2013-02-18 10:24 ` Jeff King
0 siblings, 1 reply; 50+ messages in thread
From: Jonathan Nieder @ 2013-02-18 9:56 UTC (permalink / raw)
To: Jeff King; +Cc: git, Shawn O. Pearce
Jeff King wrote:
> This is just write_or_die by another name.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> Actually, they are not quite the same. write_or_die will exit(0) when it
> sees EPIPE.
That information definitely belongs in the commit message.
> Which makes me a little nervous.
Me, too. Let's see:
[...]
> --- a/builtin/receive-pack.c
> +++ b/builtin/receive-pack.c
> @@ -908,7 +908,7 @@ static void report(struct command *commands, const char *unpack_status)
> if (use_sideband)
> send_sideband(1, 1, buf.buf, buf.len, use_sideband);
> else
> - safe_write(1, buf.buf, buf.len);
> + write_or_die(1, buf.buf, buf.len);
If the connection to send-pack is lost and stdout becomes a broken
pipe and I am updating enough refs to overflow the pipe buffer,
receive-pack will die with SIGPIPE. So unless the sadistic caller has
set the inherited SIGPIPE action to SIG_IGN (for example by wrapping
git with an uncautious Python wrapper that uses subprocess.Popen), the
change to EPIPE handling is not a behavior change.
Since the pipe is closed, presumably the calling send-pack has hung up
and won't notice the exit status, so this should be safe.
Arguably it would be more friendly to stay alive to run the
post-receive and post-update hooks, though, given that a ref update
has occurred. Maybe transport commands like this one should always
set the disposition of SIGPIPE to SIG_IGN.
[...]
> --- a/builtin/send-pack.c
> +++ b/builtin/send-pack.c
> @@ -79,7 +79,7 @@ static void print_helper_status(struct ref *ref)
> }
> strbuf_addch(&buf, '\n');
>
> - safe_write(1, buf.buf, buf.len);
> + write_or_die(1, buf.buf, buf.len);
A signal will kill send-pack before write_or_die has a chance to
intervene so this change is a no-op unless the caller is sadistic
(as in the [1] case). In the signal(SIGPIPE, SIG_IGN) case, it might
be a regression, since "git push" should not declare success when its
connection to receive-pack closes early.
[1] http://www.chiark.greenend.org.uk/ucgi/~cjwatson/blosxom/2009-07-02-python-sigpipe.html
[...]
> --- a/fetch-pack.c
> +++ b/fetch-pack.c
> @@ -245,7 +245,7 @@ static void send_request(struct fetch_pack_args *args,
> send_sideband(fd, -1, buf->buf, buf->len, LARGE_PACKET_MAX);
> packet_flush(fd);
> } else
> - safe_write(fd, buf->buf, buf->len);
> + write_or_die(fd, buf->buf, buf->len);
Also a no-op except when the parent process is insane enough to let us
inherit signal(SIGPIPE, SIG_IGN).
In that case, if triggerable this looks like a bad change: if
upload-pack has gone missing, the fetch should not be considered a
success.
[...]
> --- a/http-backend.c
> +++ b/http-backend.c
> @@ -70,7 +70,7 @@ static void format_write(int fd, const char *fmt, ...)
> if (n >= sizeof(buffer))
> die("protocol error: impossibly long line");
>
> - safe_write(fd, buffer, n);
> + write_or_die(fd, buffer, n);
Etc. I'm stopping here.
I'm thinking before a patch like this we should make the following
change:
1. at startup, set the signal action of SIGPIPE to SIG_DFL, to make
the behavior a little more predictable.
Perhaps the following as well:
2. in write_or_die(), when encountering EPIPE, set the signal action
of SIGPIPE to SIG_DFL and raise(SIGPIPE), ensuring the exit status
reflects the broken pipe. If the parent process is unnecessarily
noisy about that, that's a bug in the parent process (hopefully
uncommon).
Or alternatively:
2b. never set SIGPIPE to SIG_IGN except in short blocks of code that
do not call write_or_die()
What do you think?
Jonathan
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCHv2 02/10] pkt-line: drop safe_write function
2013-02-18 9:56 ` Jonathan Nieder
@ 2013-02-18 10:24 ` Jeff King
0 siblings, 0 replies; 50+ messages in thread
From: Jeff King @ 2013-02-18 10:24 UTC (permalink / raw)
To: Jonathan Nieder; +Cc: git, Shawn O. Pearce
On Mon, Feb 18, 2013 at 01:56:45AM -0800, Jonathan Nieder wrote:
> Jeff King wrote:
>
> > This is just write_or_die by another name.
> >
> > Signed-off-by: Jeff King <peff@peff.net>
> > ---
> > Actually, they are not quite the same. write_or_die will exit(0) when it
> > sees EPIPE.
>
> That information definitely belongs in the commit message.
Yeah, this one was more RFC; I was hoping for some input on the EPIPE
thing, so I could know _what_ to put in the commit message. Is it safe,
and if so, why? And if not, we should drop the patch.
> If the connection to send-pack is lost and stdout becomes a broken
> pipe and I am updating enough refs to overflow the pipe buffer,
> receive-pack will die with SIGPIPE. So unless the sadistic caller has
> set the inherited SIGPIPE action to SIG_IGN (for example by wrapping
> git with an uncautious Python wrapper that uses subprocess.Popen), the
> change to EPIPE handling is not a behavior change.
Yeah, but I don't want to count on always catching SIGPIPE. There's the
inherited signal handler thing, but there's also the fact that we may
end up ignoring SIGPIPE from backend programs like upload-pack and
receive-pack; they check their writes anyway, and we have already run
into issues with getting SIGPIPE when we don't necessarily expect or
care about it.
The nice thing about write_or_die is that it still _exits_ on EPIPE.
It's just that it doesn't print an error (which is really not a big
deal) and exit with a 0 return code. I really wonder if we should just
change the latter. For programs which are creating copious output (e.g.,
"git log"), the return value is not important anyway. For backend
programs, an unexpected EPIPE from something like write_or_die should
probably involve a non-successful return code.
> Arguably it would be more friendly to stay alive to run the
> post-receive and post-update hooks, though, given that a ref update
> has occurred. Maybe transport commands like this one should always
> set the disposition of SIGPIPE to SIG_IGN.
Yeah, I've suggested that in the past. And I do think it's sane, because
if you took a ref update, you almost certainly want to run the
post-receive, even if the client is no longer around (e.g., if it is
going to email out the changeset).
> [...]
> > --- a/builtin/send-pack.c
> > +++ b/builtin/send-pack.c
> > @@ -79,7 +79,7 @@ static void print_helper_status(struct ref *ref)
> > }
> > strbuf_addch(&buf, '\n');
> >
> > - safe_write(1, buf.buf, buf.len);
> > + write_or_die(1, buf.buf, buf.len);
>
> A signal will kill send-pack before write_or_die has a chance to
> intervene so this change is a no-op unless the caller is sadistic
> (as in the [1] case). In the signal(SIGPIPE, SIG_IGN) case, it might
> be a regression, since "git push" should not declare success when its
> connection to receive-pack closes early.
But that isn't going to receive-pack, is it? Send-pack's stdout is
really just going to the user (or wherever). So it would have an effect
more for:
(git push && echo >&2 OK) | grep -m1 foo
which might print "OK" even if we failed. That's quite contrived, but it
is at least a measurable change. And anyway...
> In that case, if triggerable this looks like a bad change: if
> upload-pack has gone missing, the fetch should not be considered a
> success.
> [...]
> Etc. I'm stopping here.
Yeah, there are definitely some bad ones.
> I'm thinking before a patch like this we should make the following
> change:
>
> 1. at startup, set the signal action of SIGPIPE to SIG_DFL, to make
> the behavior a little more predictable.
I'm lukewarm on that, just because we may want to ignore SIGPIPE
ourselves at some point.
> Perhaps the following as well:
>
> 2. in write_or_die(), when encountering EPIPE, set the signal action
> of SIGPIPE to SIG_DFL and raise(SIGPIPE), ensuring the exit status
> reflects the broken pipe. If the parent process is unnecessarily
> noisy about that, that's a bug in the parent process (hopefully
> uncommon).
I like this. My suggestion would be to just exit(1) instead of exit(0).
But really, raising SIGPIPE makes the most sense, because it
communicates to the parent what happened (and the shell will wisely not
print a message, but careful parents like "git fetch" and "git push"
will check it properly and notice that the child did not succeed).
> Or alternatively:
>
> 2b. never set SIGPIPE to SIG_IGN except in short blocks of code that
> do not call write_or_die()
Yuck. :)
> What do you think?
I really like option 2. That exit(0) when we see SIGPIPE bugs me.
Because we _are_ dying due to our write failing, and I think the only
reason to exit(0) was to avoid unnecessary complaints from parents. But
raising SIGPIPE seems like the best of both worlds to me.
-Peff
^ permalink raw reply [flat|nested] 50+ messages in thread
* [PATCHv2 03/10] pkt-line: clean up "gentle" reading function
2013-02-18 9:12 ` [PATCHv2 0/10] pkt-line and remote-curl cleanups server Jeff King
2013-02-18 9:14 ` [PATCHv2 01/10] pkt-line: move a misplaced comment Jeff King
2013-02-18 9:15 ` [PATCHv2 02/10] pkt-line: drop safe_write function Jeff King
@ 2013-02-18 9:16 ` Jeff King
2013-02-18 10:12 ` Jonathan Nieder
2013-02-18 9:22 ` [PATCHv2 04/10] pkt-line: change error message for oversized packet Jeff King
` (8 subsequent siblings)
11 siblings, 1 reply; 50+ messages in thread
From: Jeff King @ 2013-02-18 9:16 UTC (permalink / raw)
To: Jonathan Nieder; +Cc: git, Shawn O. Pearce
Originally we had a single function for reading packetized
data: packet_read_line. Commit 46284dd grew a more "gentle"
form that would return an error instead of dying upon
reading a truncated input stream. However:
1. The two functions were called "packet_read" and
"packet_read_line", with no indication that the only
difference is in the error handling.
2. There was no documentation about which error conditions
were handled in the gentle form, and which still caused
a death.
3. The internal variable to trigger the gentle mode was
called "return_line_fail", which was not very
expressive.
This patch converts packet_line to packet_read_line_gently
to more clearly indicate its relationship to
packet_read_line, and renames the internal variable to
"gently". This is also not incredibly expressive, but it is
at least a convention within the git code. And finally, we
document the exact behavior for the gentle and non-gentle
modes.
While we are cleaning up the names, we can drop the
"return_line_fail" checks in packet_read_internal entirely.
They look like this:
ret = safe_read(..., return_line_fail);
if (return_line_fail && ret < 0)
...
The check for return_line_fail is a no-op; safe_read will
only ever return an error value if we passed it
return_line_fail in the first place.
Signed-off-by: Jeff King <peff@peff.net>
---
Obviously this one is a matter of taste, but I think the result is much
better. Certainly the documentation bits are hard to argue with. :)
connect.c | 2 +-
pkt-line.c | 16 ++++++++--------
pkt-line.h | 21 ++++++++++++++++++++-
3 files changed, 29 insertions(+), 10 deletions(-)
diff --git a/connect.c b/connect.c
index 49e56ba..7e0920d 100644
--- a/connect.c
+++ b/connect.c
@@ -76,7 +76,7 @@ struct ref **get_remote_heads(int in, struct ref **list,
char *name;
int len, name_len;
- len = packet_read(in, buffer, sizeof(buffer));
+ len = packet_read_line_gently(in, buffer, sizeof(buffer));
if (len < 0)
die_initial_contact(got_at_least_one_head);
diff --git a/pkt-line.c b/pkt-line.c
index 699c2dd..62479d3 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -103,13 +103,13 @@ static int safe_read(int fd, void *buffer, unsigned size, int return_line_fail)
strbuf_add(buf, buffer, n);
}
-static int safe_read(int fd, void *buffer, unsigned size, int return_line_fail)
+static int safe_read(int fd, void *buffer, unsigned size, int gently)
{
ssize_t ret = read_in_full(fd, buffer, size);
if (ret < 0)
die_errno("read error");
else if (ret < size) {
- if (return_line_fail)
+ if (gently)
return -1;
die("The remote end hung up unexpectedly");
@@ -143,13 +143,13 @@ static int packet_read_internal(int fd, char *buffer, unsigned size, int return_
return len;
}
-static int packet_read_internal(int fd, char *buffer, unsigned size, int return_line_fail)
+static int packet_read_internal(int fd, char *buffer, unsigned size, int gently)
{
int len, ret;
char linelen[4];
- ret = safe_read(fd, linelen, 4, return_line_fail);
- if (return_line_fail && ret < 0)
+ ret = safe_read(fd, linelen, 4, gently);
+ if (ret < 0)
return ret;
len = packet_length(linelen);
if (len < 0)
@@ -161,15 +161,15 @@ static int packet_read_internal(int fd, char *buffer, unsigned size, int return_
len -= 4;
if (len >= size)
die("protocol error: bad line length %d", len);
- ret = safe_read(fd, buffer, len, return_line_fail);
- if (return_line_fail && ret < 0)
+ ret = safe_read(fd, buffer, len, gently);
+ if (ret < 0)
return ret;
buffer[len] = 0;
packet_trace(buffer, len, 0);
return len;
}
-int packet_read(int fd, char *buffer, unsigned size)
+int packet_read_line_gently(int fd, char *buffer, unsigned size)
{
return packet_read_internal(fd, buffer, size, 1);
}
diff --git a/pkt-line.h b/pkt-line.h
index 7a67e9c..31bd069 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -24,8 +24,27 @@ int packet_read_line(int fd, char *buffer, unsigned size);
void packet_buf_flush(struct strbuf *buf);
void packet_buf_write(struct strbuf *buf, const char *fmt, ...) __attribute__((format (printf, 2, 3)));
+/*
+ * Read a packetized line from the descriptor into the buffer. We will die
+ * under any of the following conditions:
+ *
+ * 1. Read error from descriptor.
+ *
+ * 2. Protocol error from the remote (e.g., bogus length characters).
+ *
+ * 3. Receiving a packet larger than "size" bytes.
+ *
+ * 4. Truncated output from the remote (e.g., we expected a packet but got
+ * EOF, or we got a partial packet followed by EOF).
+ */
int packet_read_line(int fd, char *buffer, unsigned size);
-int packet_read(int fd, char *buffer, unsigned size);
+
+/*
+ * Same as packet_read_line, but do not die on condition 4 (truncated input);
+ * instead return -1. We still die for the other conditions.
+ */
+int packet_read_line_gently(int fd, char *buffer, unsigned size);
+
int packet_get_line(struct strbuf *out, char **src_buf, size_t *src_len);
ssize_t safe_write(int, const void *, ssize_t);
--
1.8.1.20.g7078b03
^ permalink raw reply related [flat|nested] 50+ messages in thread
* Re: [PATCHv2 03/10] pkt-line: clean up "gentle" reading function
2013-02-18 9:16 ` [PATCHv2 03/10] pkt-line: clean up "gentle" reading function Jeff King
@ 2013-02-18 10:12 ` Jonathan Nieder
2013-02-18 10:25 ` Jeff King
0 siblings, 1 reply; 50+ messages in thread
From: Jonathan Nieder @ 2013-02-18 10:12 UTC (permalink / raw)
To: Jeff King; +Cc: git, Shawn O. Pearce
Jeff King wrote:
> Originally we had a single function for reading packetized
> data: packet_read_line. Commit 46284dd grew a more "gentle"
> form that would return an error instead of dying upon
> reading a truncated input stream. However:
In other words:
Based on the names of two functions "packet_read" and
"packet_read_line", it is not obvious which to use and what the
ramifications of that choice are.
Rename packet_read to packet_read_line_gently and add a comment
explaining that the latter is a "gentler" form that returns an
error instead of dying upon reading a truncated input stream.
While at it:
* Rename the internal argument triggering the gentle mode to
"gentle" instead of "return_line_fail".
* Drop the redundant "return_line_fail &&" in checks like
"if (return_line_fail && ret < 0)". safe_read() never
returns an error when !gentle.
No functional change intended.
FWIW, the patch itself is
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCHv2 03/10] pkt-line: clean up "gentle" reading function
2013-02-18 10:12 ` Jonathan Nieder
@ 2013-02-18 10:25 ` Jeff King
0 siblings, 0 replies; 50+ messages in thread
From: Jeff King @ 2013-02-18 10:25 UTC (permalink / raw)
To: Jonathan Nieder; +Cc: git, Shawn O. Pearce
On Mon, Feb 18, 2013 at 02:12:09AM -0800, Jonathan Nieder wrote:
> Jeff King wrote:
>
> > Originally we had a single function for reading packetized
> > data: packet_read_line. Commit 46284dd grew a more "gentle"
> > form that would return an error instead of dying upon
> > reading a truncated input stream. However:
>
> In other words:
Hmph. I had originally written a commit message organized more like
yours, then I changed it to try to be more clear. I guess that didn't
work.
But yes, you get the intent exactly.
-Peff
^ permalink raw reply [flat|nested] 50+ messages in thread
* [PATCHv2 04/10] pkt-line: change error message for oversized packet
2013-02-18 9:12 ` [PATCHv2 0/10] pkt-line and remote-curl cleanups server Jeff King
` (2 preceding siblings ...)
2013-02-18 9:16 ` [PATCHv2 03/10] pkt-line: clean up "gentle" reading function Jeff King
@ 2013-02-18 9:22 ` Jeff King
2013-02-18 9:40 ` Junio C Hamano
2013-02-18 10:15 ` Jonathan Nieder
2013-02-18 9:22 ` [PATCHv2 05/10] pkt-line: rename s/packet_read_line/packet_read/ Jeff King
` (7 subsequent siblings)
11 siblings, 2 replies; 50+ messages in thread
From: Jeff King @ 2013-02-18 9:22 UTC (permalink / raw)
To: Jonathan Nieder; +Cc: git, Shawn O. Pearce
If we get a packet from the remote side that is too large to
fit in our buffer, we currently complain "protocol error:
bad line length". This is a bit vague. The line length the
other side sent is not "bad" per se; the actual problem is
that it exceeded our expectation for buffer length.
This will generally not happen between two git-core
implementations, because the sender limits themselves to
either 1000, or to LARGE_PACKET_MAX (depending on what is
being sent, sideband-64k negotiation, etc), and the receiver
uses a buffer of the appropriate size.
The protocol document indicates the LARGE_PACKET_MAX limit
(of 65520), but does not actually specify the 1000-byte
limit for ref lines. It is likely that other implementations
just create a packet as large as they need, and this doesn't
come up often because nobody has 1000-character ref names
(or close to it, counting sha1 and other boilerplate).
We may want to increase the size of our receive buffers for
ref lines to prepare for such writers (whether they are
other implementations, or if we eventually want to bump the
write size in git-core). In the meantime, this patch tries
to give a more clear message in case it does come up.
Signed-off-by: Jeff King <peff@peff.net>
---
I'm really tempted to bump all of our 1000-byte buffers to just use
LARGE_PACKET_MAX. If we provided a packet_read variant that used a
static buffer (which is fine for all but one or two callers), then it
would not take much memory (right now we stick some LARGE_PACKET_MAX
buffers on the stack, which is slightly questionable for
stack-restricted systems). But I left that for a different topic (and
even if we do, we would still want this message to catch anything over
the bizarre 65520 limit).
Out of curiosity, I grepped the list archives, and found only one
instance of this message. And it was somebody whose data stream was tainted
with random crud that happened to be numbers (much more common is "bad line
length character", when the crud does not look like a packet length).
pkt-line.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/pkt-line.c b/pkt-line.c
index 62479d3..f2a7575 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -160,7 +160,8 @@ static int packet_read_internal(int fd, char *buffer, unsigned size, int gently)
}
len -= 4;
if (len >= size)
- die("protocol error: bad line length %d", len);
+ die("protocol error: line too large: (expected %u, got %d)",
+ size, len);
ret = safe_read(fd, buffer, len, gently);
if (ret < 0)
return ret;
--
1.8.1.20.g7078b03
^ permalink raw reply related [flat|nested] 50+ messages in thread
* Re: [PATCHv2 04/10] pkt-line: change error message for oversized packet
2013-02-18 9:22 ` [PATCHv2 04/10] pkt-line: change error message for oversized packet Jeff King
@ 2013-02-18 9:40 ` Junio C Hamano
2013-02-18 9:49 ` Jeff King
2013-02-18 10:15 ` Jonathan Nieder
1 sibling, 1 reply; 50+ messages in thread
From: Junio C Hamano @ 2013-02-18 9:40 UTC (permalink / raw)
To: Jeff King; +Cc: Jonathan Nieder, git, Shawn O. Pearce
Jeff King <peff@peff.net> writes:
> I'm really tempted to bump all of our 1000-byte buffers to just use
> LARGE_PACKET_MAX. If we provided a packet_read variant that used a
> static buffer (which is fine for all but one or two callers), then it
> would not take much memory...
I thought that 1000-byte limit was kept when we introduced the 64k
interface to interoperate with other side that does not yet support
the 64k interface. Is your justification that such an old version of
Git no longer matters in the real world (which is true, I think), or
we use 1000-byte limit in some codepaths even when we know that we
are talking with a 64k-capable version of Git on the other side?
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCHv2 04/10] pkt-line: change error message for oversized packet
2013-02-18 9:40 ` Junio C Hamano
@ 2013-02-18 9:49 ` Jeff King
2013-02-18 21:25 ` Junio C Hamano
0 siblings, 1 reply; 50+ messages in thread
From: Jeff King @ 2013-02-18 9:49 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Jonathan Nieder, git, Shawn O. Pearce
On Mon, Feb 18, 2013 at 01:40:17AM -0800, Junio C Hamano wrote:
> Jeff King <peff@peff.net> writes:
>
> > I'm really tempted to bump all of our 1000-byte buffers to just use
> > LARGE_PACKET_MAX. If we provided a packet_read variant that used a
> > static buffer (which is fine for all but one or two callers), then it
> > would not take much memory...
>
> I thought that 1000-byte limit was kept when we introduced the 64k
> interface to interoperate with other side that does not yet support
> the 64k interface. Is your justification that such an old version of
> Git no longer matters in the real world (which is true, I think), or
> we use 1000-byte limit in some codepaths even when we know that we
> are talking with a 64k-capable version of Git on the other side?
I should have been more clear that I want to bump only the _reading_
side, not the writing.
The sideband-64k capability bumps the sideband chunk size. But the size
for packetized ref advertisement lines has remained at 1000. Which it
must, since we start outputting them before doing capability
negotiation. The sender will die before writing a longer ref line (see
pkg-line.c:format_packet), and most of the reading callsites feed a
1000-byte buffer to packet_read_line (which will die if we get a larger
packet). So the upgrade path for that would be:
1. Git bumps up all reading buffers to LARGE_PACKET_MAX, just in case.
This immediately covers us for any alternate implementations that
send larger ref packets (I do not know if any exist, but the
documentation does not mention this limitation at all, so I would
not be surprised if other implementations just use LARGE_PACKET_MAX
as a maximum).
2. Many years pass. We decide that Git v1.8.2 and older are ancient
history and not worth caring about.
3. We bump the 1000-byte limit for format_packet to LARGE_PACKET_MAX.
This is not pressing at all; I wouldn't have even noticed it if I hadn't
been wondering how large to make the new buffer I was adding in a later
patch. I have not heard of anyone running up against this limit in
practice. But it's easy to do (1), and it starts the clock ticking for
the 1000-byte readers to become obsolete.
-Peff
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCHv2 04/10] pkt-line: change error message for oversized packet
2013-02-18 9:49 ` Jeff King
@ 2013-02-18 21:25 ` Junio C Hamano
2013-02-18 21:33 ` Jeff King
0 siblings, 1 reply; 50+ messages in thread
From: Junio C Hamano @ 2013-02-18 21:25 UTC (permalink / raw)
To: Jeff King; +Cc: Jonathan Nieder, git, Shawn O. Pearce
Jeff King <peff@peff.net> writes:
> But it's easy to do (1), and it starts the clock ticking for
> the 1000-byte readers to become obsolete.
Yup, I agree with that goal.
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCHv2 04/10] pkt-line: change error message for oversized packet
2013-02-18 21:25 ` Junio C Hamano
@ 2013-02-18 21:33 ` Jeff King
2013-02-20 8:47 ` Jeff King
0 siblings, 1 reply; 50+ messages in thread
From: Jeff King @ 2013-02-18 21:33 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Jonathan Nieder, git, Shawn O. Pearce
On Mon, Feb 18, 2013 at 01:25:35PM -0800, Junio C Hamano wrote:
> Jeff King <peff@peff.net> writes:
>
> > But it's easy to do (1), and it starts the clock ticking for
> > the 1000-byte readers to become obsolete.
>
> Yup, I agree with that goal.
Having just looked at the pkt-line callers a lot, I think most of them
could go for something like:
char *packet_read(int fd, unsigned *len_p)
{
static char buffer[LARGE_PACKET_MAX];
int len;
len = packet_read_to_buf(fd, buffer, sizeof(buffer));
if (len < 0)
return NULL;
*len_p = len;
return buffer;
}
That would actually simplify the callers a bit, and would harmonize the
buffer sizes at the same time. I'll look into doing a series tonight.
-Peff
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCHv2 04/10] pkt-line: change error message for oversized packet
2013-02-18 21:33 ` Jeff King
@ 2013-02-20 8:47 ` Jeff King
2013-02-20 8:54 ` Junio C Hamano
0 siblings, 1 reply; 50+ messages in thread
From: Jeff King @ 2013-02-20 8:47 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Jonathan Nieder, git, Shawn O. Pearce
On Mon, Feb 18, 2013 at 04:33:31PM -0500, Jeff King wrote:
> On Mon, Feb 18, 2013 at 01:25:35PM -0800, Junio C Hamano wrote:
>
> > Jeff King <peff@peff.net> writes:
> >
> > > But it's easy to do (1), and it starts the clock ticking for
> > > the 1000-byte readers to become obsolete.
> >
> > Yup, I agree with that goal.
>
> Having just looked at the pkt-line callers a lot, I think most of them
> could go for something like:
> [...]
>
> That would actually simplify the callers a bit, and would harmonize the
> buffer sizes at the same time. I'll look into doing a series tonight.
Just a quick update on this series. It ended up taking more nights than
I thought. :) The result looks much better than what I posted before, and
I found several other corner cases and bugs in packet parsing, too.
I'm going to hold off on posting it tonight, as I'm now up to 19
patches, and after several rounds of "rebase -i", I need to give it a
final read-through myself before inflicting it on anyone else. I'll do
that and post it tomorrow.
In the meantime, please hold off on what I've posted so far (that
includes the jk/smart-http-robustify topic).
-Peff
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCHv2 04/10] pkt-line: change error message for oversized packet
2013-02-20 8:47 ` Jeff King
@ 2013-02-20 8:54 ` Junio C Hamano
0 siblings, 0 replies; 50+ messages in thread
From: Junio C Hamano @ 2013-02-20 8:54 UTC (permalink / raw)
To: Jeff King; +Cc: Jonathan Nieder, git, Shawn O. Pearce
Jeff King <peff@peff.net> wrote:
>
>In the meantime, please hold off on what I've posted so far (that
>includes the jk/smart-http-robustify topic).
Surely. I'm done for the night already. Looking forward to see the reroll tomorrow.
Thanks.
--
Pardon terseness, typo and HTML from a tablet.
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCHv2 04/10] pkt-line: change error message for oversized packet
2013-02-18 9:22 ` [PATCHv2 04/10] pkt-line: change error message for oversized packet Jeff King
2013-02-18 9:40 ` Junio C Hamano
@ 2013-02-18 10:15 ` Jonathan Nieder
2013-02-18 10:26 ` Jeff King
1 sibling, 1 reply; 50+ messages in thread
From: Jonathan Nieder @ 2013-02-18 10:15 UTC (permalink / raw)
To: Jeff King; +Cc: git, Shawn O. Pearce
Jeff King wrote:
> --- a/pkt-line.c
> +++ b/pkt-line.c
> @@ -160,7 +160,8 @@ static int packet_read_internal(int fd, char *buffer, unsigned size, int gently)
> }
> len -= 4;
> if (len >= size)
> - die("protocol error: bad line length %d", len);
> + die("protocol error: line too large: (expected %u, got %d)",
> + size, len);
Makes sense. I think this should say "expected < %u, got %d", since we
don't actually expect most lines to be 1004 bytes in practice.
With or without such a change,
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCHv2 04/10] pkt-line: change error message for oversized packet
2013-02-18 10:15 ` Jonathan Nieder
@ 2013-02-18 10:26 ` Jeff King
0 siblings, 0 replies; 50+ messages in thread
From: Jeff King @ 2013-02-18 10:26 UTC (permalink / raw)
To: Jonathan Nieder; +Cc: git, Shawn O. Pearce
On Mon, Feb 18, 2013 at 02:15:23AM -0800, Jonathan Nieder wrote:
> Jeff King wrote:
>
> > --- a/pkt-line.c
> > +++ b/pkt-line.c
> > @@ -160,7 +160,8 @@ static int packet_read_internal(int fd, char *buffer, unsigned size, int gently)
> > }
> > len -= 4;
> > if (len >= size)
> > - die("protocol error: bad line length %d", len);
> > + die("protocol error: line too large: (expected %u, got %d)",
> > + size, len);
>
> Makes sense. I think this should say "expected < %u, got %d", since we
> don't actually expect most lines to be 1004 bytes in practice.
Yeah, I had toyed with writing "expected max %u" for the same reason.
I'll tweak it in the re-roll.
-Peff
^ permalink raw reply [flat|nested] 50+ messages in thread
* [PATCHv2 05/10] pkt-line: rename s/packet_read_line/packet_read/
2013-02-18 9:12 ` [PATCHv2 0/10] pkt-line and remote-curl cleanups server Jeff King
` (3 preceding siblings ...)
2013-02-18 9:22 ` [PATCHv2 04/10] pkt-line: change error message for oversized packet Jeff King
@ 2013-02-18 9:22 ` Jeff King
2013-02-18 10:19 ` Jonathan Nieder
2013-02-18 9:26 ` [PATCHv2 06/10] pkt-line: share buffer/descriptor reading implementation Jeff King
` (6 subsequent siblings)
11 siblings, 1 reply; 50+ messages in thread
From: Jeff King @ 2013-02-18 9:22 UTC (permalink / raw)
To: Jonathan Nieder; +Cc: git, Shawn O. Pearce
Originally packets were used just for the line-oriented ref
advertisement and negotiation. These days, we also stuff
packfiles and sidebands into them, and they do not
necessarily represent a line. Drop the "_line" suffix, as it
is not informative and makes the function names quite long
(especially as we add "_gently" and other variants).
Signed-off-by: Jeff King <peff@peff.net>
---
Again, this is a taste issue. Can be optional.
builtin/archive.c | 4 ++--
builtin/fetch-pack.c | 2 +-
builtin/receive-pack.c | 2 +-
builtin/upload-archive.c | 2 +-
connect.c | 2 +-
daemon.c | 2 +-
fetch-pack.c | 6 +++---
pkt-line.c | 4 ++--
pkt-line.h | 6 +++---
remote-curl.c | 6 +++---
send-pack.c | 4 ++--
sideband.c | 2 +-
upload-pack.c | 4 ++--
13 files changed, 23 insertions(+), 23 deletions(-)
diff --git a/builtin/archive.c b/builtin/archive.c
index 9a1cfd3..bf600f5 100644
--- a/builtin/archive.c
+++ b/builtin/archive.c
@@ -53,7 +53,7 @@ static int run_remote_archiver(int argc, const char **argv,
packet_write(fd[1], "argument %s\n", argv[i]);
packet_flush(fd[1]);
- len = packet_read_line(fd[0], buf, sizeof(buf));
+ len = packet_read(fd[0], buf, sizeof(buf));
if (!len)
die(_("git archive: expected ACK/NAK, got EOF"));
if (buf[len-1] == '\n')
@@ -66,7 +66,7 @@ static int run_remote_archiver(int argc, const char **argv,
die(_("git archive: protocol error"));
}
- len = packet_read_line(fd[0], buf, sizeof(buf));
+ len = packet_read(fd[0], buf, sizeof(buf));
if (len)
die(_("git archive: expected a flush"));
diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index 940ae35..b59e60e 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -102,7 +102,7 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
*/
static char line[1000];
for (;;) {
- int n = packet_read_line(0, line, sizeof(line));
+ int n = packet_read(0, line, sizeof(line));
if (!n)
break;
if (line[n-1] == '\n')
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 48cd5dc..3f58ce8 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -736,7 +736,7 @@ static struct command *read_head_info(void)
char *refname;
int len, reflen;
- len = packet_read_line(0, line, sizeof(line));
+ len = packet_read(0, line, sizeof(line));
if (!len)
break;
if (line[len-1] == '\n')
diff --git a/builtin/upload-archive.c b/builtin/upload-archive.c
index b928beb..2ecf461 100644
--- a/builtin/upload-archive.c
+++ b/builtin/upload-archive.c
@@ -40,7 +40,7 @@ int cmd_upload_archive_writer(int argc, const char **argv, const char *prefix)
sent_argv[0] = "git-upload-archive";
for (p = buf;;) {
/* This will die if not enough free space in buf */
- len = packet_read_line(0, p, (buf + sizeof buf) - p);
+ len = packet_read(0, p, (buf + sizeof buf) - p);
if (len == 0)
break; /* got a flush */
if (sent_argc > MAX_ARGS - 2)
diff --git a/connect.c b/connect.c
index 7e0920d..59266b1 100644
--- a/connect.c
+++ b/connect.c
@@ -76,7 +76,7 @@ struct ref **get_remote_heads(int in, struct ref **list,
char *name;
int len, name_len;
- len = packet_read_line_gently(in, buffer, sizeof(buffer));
+ len = packet_read_gently(in, buffer, sizeof(buffer));
if (len < 0)
die_initial_contact(got_at_least_one_head);
diff --git a/daemon.c b/daemon.c
index 4602b46..ac26b93 100644
--- a/daemon.c
+++ b/daemon.c
@@ -612,7 +612,7 @@ static int execute(void)
loginfo("Connection from %s:%s", addr, port);
alarm(init_timeout ? init_timeout : timeout);
- pktlen = packet_read_line(0, line, sizeof(line));
+ pktlen = packet_read(0, line, sizeof(line));
alarm(0);
len = strlen(line);
diff --git a/fetch-pack.c b/fetch-pack.c
index b90dadf..d89c9d0 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -173,7 +173,7 @@ static void consume_shallow_list(struct fetch_pack_args *args, int fd)
* is a block of have lines exchanged.
*/
char line[1000];
- while (packet_read_line(fd, line, sizeof(line))) {
+ while (packet_read(fd, line, sizeof(line))) {
if (!prefixcmp(line, "shallow "))
continue;
if (!prefixcmp(line, "unshallow "))
@@ -216,7 +216,7 @@ static enum ack_type get_ack(int fd, unsigned char *result_sha1)
static enum ack_type get_ack(int fd, unsigned char *result_sha1)
{
static char line[1000];
- int len = packet_read_line(fd, line, sizeof(line));
+ int len = packet_read(fd, line, sizeof(line));
if (!len)
die("git fetch-pack: expected ACK/NAK, got EOF");
@@ -350,7 +350,7 @@ static int find_common(struct fetch_pack_args *args,
unsigned char sha1[20];
send_request(args, fd[1], &req_buf);
- while (packet_read_line(fd[0], line, sizeof(line))) {
+ while (packet_read(fd[0], line, sizeof(line))) {
if (!prefixcmp(line, "shallow ")) {
if (get_sha1_hex(line + 8, sha1))
die("invalid shallow line: %s", line);
diff --git a/pkt-line.c b/pkt-line.c
index f2a7575..85faf73 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -170,12 +170,12 @@ int packet_read_line_gently(int fd, char *buffer, unsigned size)
return len;
}
-int packet_read_line_gently(int fd, char *buffer, unsigned size)
+int packet_read_gently(int fd, char *buffer, unsigned size)
{
return packet_read_internal(fd, buffer, size, 1);
}
-int packet_read_line(int fd, char *buffer, unsigned size)
+int packet_read(int fd, char *buffer, unsigned size)
{
return packet_read_internal(fd, buffer, size, 0);
}
diff --git a/pkt-line.h b/pkt-line.h
index 31bd069..2dc4941 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -37,13 +37,13 @@ int packet_read_line(int fd, char *buffer, unsigned size);
* 4. Truncated output from the remote (e.g., we expected a packet but got
* EOF, or we got a partial packet followed by EOF).
*/
-int packet_read_line(int fd, char *buffer, unsigned size);
+int packet_read(int fd, char *buffer, unsigned size);
/*
- * Same as packet_read_line, but do not die on condition 4 (truncated input);
+ * Same as packet_read, but do not die on condition 4 (truncated input);
* instead return -1. We still die for the other conditions.
*/
-int packet_read_line_gently(int fd, char *buffer, unsigned size);
+int packet_read_gently(int fd, char *buffer, unsigned size);
int packet_get_line(struct strbuf *out, char **src_buf, size_t *src_len);
ssize_t safe_write(int, const void *, ssize_t);
diff --git a/remote-curl.c b/remote-curl.c
index 7be4b53..99cc016 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -308,7 +308,7 @@ static size_t rpc_out(void *ptr, size_t eltsize,
if (!avail) {
rpc->initial_buffer = 0;
- avail = packet_read_line(rpc->out, rpc->buf, rpc->alloc);
+ avail = packet_read(rpc->out, rpc->buf, rpc->alloc);
if (!avail)
return 0;
rpc->pos = 0;
@@ -425,7 +425,7 @@ static int post_rpc(struct rpc_state *rpc)
break;
}
- n = packet_read_line(rpc->out, buf, left);
+ n = packet_read(rpc->out, buf, left);
if (!n)
break;
rpc->len += n;
@@ -579,7 +579,7 @@ static int rpc_service(struct rpc_state *rpc, struct discovery *heads)
rpc->hdr_accept = strbuf_detach(&buf, NULL);
while (!err) {
- int n = packet_read_line(rpc->out, rpc->buf, rpc->alloc);
+ int n = packet_read(rpc->out, rpc->buf, rpc->alloc);
if (!n)
break;
rpc->pos = 0;
diff --git a/send-pack.c b/send-pack.c
index a99a1e4..7b22235 100644
--- a/send-pack.c
+++ b/send-pack.c
@@ -108,7 +108,7 @@ static int receive_status(int in, struct ref *refs)
struct ref *hint;
char line[1000];
int ret = 0;
- int len = packet_read_line(in, line, sizeof(line));
+ int len = packet_read(in, line, sizeof(line));
if (len < 10 || memcmp(line, "unpack ", 7))
return error("did not receive remote status");
if (memcmp(line, "unpack ok\n", 10)) {
@@ -122,7 +122,7 @@ static int receive_status(int in, struct ref *refs)
while (1) {
char *refname;
char *msg;
- len = packet_read_line(in, line, sizeof(line));
+ len = packet_read(in, line, sizeof(line));
if (!len)
break;
if (len < 3 ||
diff --git a/sideband.c b/sideband.c
index 8f7b25b..4a6446d 100644
--- a/sideband.c
+++ b/sideband.c
@@ -38,7 +38,7 @@ int recv_sideband(const char *me, int in_stream, int out)
while (1) {
int band, len;
- len = packet_read_line(in_stream, buf + pf, LARGE_PACKET_MAX);
+ len = packet_read(in_stream, buf + pf, LARGE_PACKET_MAX);
if (len == 0)
break;
if (len < 1) {
diff --git a/upload-pack.c b/upload-pack.c
index 04d6bd7..04231d7 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -428,7 +428,7 @@ static int get_common_commits(void)
save_commit_buffer = 0;
for (;;) {
- int len = packet_read_line(0, line, sizeof(line));
+ int len = packet_read(0, line, sizeof(line));
reset_timeout();
if (!len) {
@@ -589,7 +589,7 @@ static void receive_needs(void)
struct object *o;
const char *features;
unsigned char sha1_buf[20];
- len = packet_read_line(0, line, sizeof(line));
+ len = packet_read(0, line, sizeof(line));
reset_timeout();
if (!len)
break;
--
1.8.1.20.g7078b03
^ permalink raw reply related [flat|nested] 50+ messages in thread
* Re: [PATCHv2 05/10] pkt-line: rename s/packet_read_line/packet_read/
2013-02-18 9:22 ` [PATCHv2 05/10] pkt-line: rename s/packet_read_line/packet_read/ Jeff King
@ 2013-02-18 10:19 ` Jonathan Nieder
2013-02-18 10:29 ` Jeff King
0 siblings, 1 reply; 50+ messages in thread
From: Jonathan Nieder @ 2013-02-18 10:19 UTC (permalink / raw)
To: Jeff King; +Cc: git, Shawn O. Pearce
Jeff King wrote:
> Originally packets were used just for the line-oriented ref
> advertisement and negotiation. These days, we also stuff
> packfiles and sidebands into them, and they do not
> necessarily represent a line. Drop the "_line" suffix, as it
> is not informative and makes the function names quite long
> (especially as we add "_gently" and other variants).
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> Again, this is a taste issue. Can be optional.
In combination with patch 3, this changes the meaning of packet_read()
without changing its signature, which could make other patches
cherry-picked on top change behavior in unpredictable ways. :(
So I'd be all for this if the signature changes (for example to put
the fd at the end or something), but not so if not.
Thanks,
Jonathan
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCHv2 05/10] pkt-line: rename s/packet_read_line/packet_read/
2013-02-18 10:19 ` Jonathan Nieder
@ 2013-02-18 10:29 ` Jeff King
2013-02-18 11:05 ` Jonathan Nieder
0 siblings, 1 reply; 50+ messages in thread
From: Jeff King @ 2013-02-18 10:29 UTC (permalink / raw)
To: Jonathan Nieder; +Cc: git, Shawn O. Pearce
On Mon, Feb 18, 2013 at 02:19:15AM -0800, Jonathan Nieder wrote:
> Jeff King wrote:
>
> > Originally packets were used just for the line-oriented ref
> > advertisement and negotiation. These days, we also stuff
> > packfiles and sidebands into them, and they do not
> > necessarily represent a line. Drop the "_line" suffix, as it
> > is not informative and makes the function names quite long
> > (especially as we add "_gently" and other variants).
> >
> > Signed-off-by: Jeff King <peff@peff.net>
> > ---
> > Again, this is a taste issue. Can be optional.
>
> In combination with patch 3, this changes the meaning of packet_read()
> without changing its signature, which could make other patches
> cherry-picked on top change behavior in unpredictable ways. :(
>
> So I'd be all for this if the signature changes (for example to put
> the fd at the end or something), but not so if not.
True. Though packet_read has only existed since last June, only had one
callsite (which would now conflict, since I'm touching it in this
series), and has no new calls in origin..origin/pu. So it's relatively
low risk for such a problem. I don't know how careful we want to be.
-Peff
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCHv2 05/10] pkt-line: rename s/packet_read_line/packet_read/
2013-02-18 10:29 ` Jeff King
@ 2013-02-18 11:05 ` Jonathan Nieder
0 siblings, 0 replies; 50+ messages in thread
From: Jonathan Nieder @ 2013-02-18 11:05 UTC (permalink / raw)
To: Jeff King; +Cc: git, Shawn O. Pearce
Jeff King wrote:
> On Mon, Feb 18, 2013 at 02:19:15AM -0800, Jonathan Nieder wrote:
>> In combination with patch 3, this changes the meaning of packet_read()
>> without changing its signature, which could make other patches
>> cherry-picked on top change behavior in unpredictable ways. :(
>>
>> So I'd be all for this if the signature changes (for example to put
>> the fd at the end or something), but not so if not.
>
> True. Though packet_read has only existed since last June, only had one
> callsite (which would now conflict, since I'm touching it in this
> series), and has no new calls in origin..origin/pu. So it's relatively
> low risk for such a problem. I don't know how careful we want to be.
I was unclear. What I am worried about is that someone using a
version of git without this patch will try some yet-to-be-written
patch using packet_read from the mailing list and not notice that they
are using the wrong function. For example, if someone is using
1.7.12.y or 1.8.1.y and wants to try a patch from after the above,
they would get subtly different and wrong results.
The rule "change the name or signature when breaking the ABI of a
global function" is easy to remember and follow. I think we want not
to have to be careful at all, and such rules can help with that. :)
Thanks,
Jonathan
^ permalink raw reply [flat|nested] 50+ messages in thread
* [PATCHv2 06/10] pkt-line: share buffer/descriptor reading implementation
2013-02-18 9:12 ` [PATCHv2 0/10] pkt-line and remote-curl cleanups server Jeff King
` (4 preceding siblings ...)
2013-02-18 9:22 ` [PATCHv2 05/10] pkt-line: rename s/packet_read_line/packet_read/ Jeff King
@ 2013-02-18 9:26 ` Jeff King
2013-02-18 10:43 ` Jonathan Nieder
2013-02-18 9:27 ` [PATCHv2 07/10] teach get_remote_heads to read from a memory buffer Jeff King
` (5 subsequent siblings)
11 siblings, 1 reply; 50+ messages in thread
From: Jeff King @ 2013-02-18 9:26 UTC (permalink / raw)
To: Jonathan Nieder; +Cc: git, Shawn O. Pearce
The packet_read function reads from a descriptor. The
packet_get_line function is similar, but reads from an
in-memory buffer, and uses a completely separate
implementation. This patch teaches the internal function
backing packet_read to accept either source, and use the
appropriate one. As a result:
1. The function has been renamed to packet_read_from_buf,
which more clearly indicates its relationship to the
packet_read function.
2. The original packet_get_line wrote to a strbuf; the new
function, like its descriptor counterpart, reads into a
buffer provided by the caller.
3. The original function did not die on any errors, but
instead returned an error code. Now we have the usual
"normal" and "gently" forms.
There are only two existing calls to packet_get_line which
have to be converted, and both are in remote-curl. The first
reads and checks the "# service=git-foo" line from a smart
http server. The second just reads past any additional
smart headers, without bothering to look at them.
This patch converts both to the new form, with a few
implications:
1. Because we use the non-gentle form, the first caller
can drop its own error checking. As a result, we will get
more accurate error reporting about protocol breakage,
since the errors come from inside the protocol code. We
will no longer print the URL as part of the error, but
that's OK. Protocol breakages should be rare (and we
are pretty sure at this point in the code that it is a
real smart server, so we won't be confused by dumb
servers), and the first debugging step would probably
be GIT_CURL_VERBOSE, anyway.
2. The second caller did not error check at all, and now
does. This can help us catch broken or truncated input
close to the source.
3. Since we are no longer using a strbuf, we now have a
1000-byte limit on the smart-http headers. That should
be fine, as the only header that has ever been sent
here is the short "service=git-foo" header.
Signed-off-by: Jeff King <peff@peff.net>
---
The diffstat shows more lines appearing, but it is mainly from comments
and from the various parse_line{_from_buf,}{,_gently} variants; we
really do get rid of a duplicate parsing implementation, and we
harmonize all of the error conditions and messages.
We can also make "gently" a parameter to avoid the proliferation of
related functions, but would mean all but one callsite would have to
pass an extra "0". Choose your poison, I guess.
pkt-line.c | 69 ++++++++++++++++++++++++++++-------------------------------
pkt-line.h | 11 +++++++++-
remote-curl.c | 19 ++++++++--------
3 files changed, 53 insertions(+), 46 deletions(-)
diff --git a/pkt-line.c b/pkt-line.c
index 85faf73..7ee91e0 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -103,12 +103,26 @@ static int safe_read(int fd, void *buffer, unsigned size, int gently)
strbuf_add(buf, buffer, n);
}
-static int safe_read(int fd, void *buffer, unsigned size, int gently)
+static int get_packet_data(int fd, char **src_buf, size_t *src_size,
+ void *dst, unsigned size, int gently)
{
- ssize_t ret = read_in_full(fd, buffer, size);
- if (ret < 0)
- die_errno("read error");
- else if (ret < size) {
+ ssize_t ret;
+
+ /* Read up to "size" bytes from our source, whatever it is. */
+ if (src_buf) {
+ ret = size < *src_size ? size : *src_size;
+ memcpy(dst, *src_buf, ret);
+ *src_buf += size;
+ *src_size -= size;
+ }
+ else {
+ ret = read_in_full(fd, dst, size);
+ if (ret < 0)
+ die_errno("read error");
+ }
+
+ /* And complain if we didn't get enough bytes to satisfy the read. */
+ if (ret < size) {
if (gently)
return -1;
@@ -143,12 +157,13 @@ static int packet_read_internal(int fd, char *buffer, unsigned size, int gently)
return len;
}
-static int packet_read_internal(int fd, char *buffer, unsigned size, int gently)
+static int packet_read_internal(int fd, char **src_buf, size_t *src_len,
+ char *buffer, unsigned size, int gently)
{
int len, ret;
char linelen[4];
- ret = safe_read(fd, linelen, 4, gently);
+ ret = get_packet_data(fd, src_buf, src_len, linelen, 4, gently);
if (ret < 0)
return ret;
len = packet_length(linelen);
@@ -162,7 +177,7 @@ static int packet_read_internal(int fd, char *buffer, unsigned size, int gently)
if (len >= size)
die("protocol error: line too large: (expected %u, got %d)",
size, len);
- ret = safe_read(fd, buffer, len, gently);
+ ret = get_packet_data(fd, src_buf, src_len, buffer, len, gently);
if (ret < 0)
return ret;
buffer[len] = 0;
@@ -172,40 +187,22 @@ int packet_get_line(struct strbuf *out,
int packet_read_gently(int fd, char *buffer, unsigned size)
{
- return packet_read_internal(fd, buffer, size, 1);
+ return packet_read_internal(fd, NULL, 0, buffer, size, 1);
}
int packet_read(int fd, char *buffer, unsigned size)
{
- return packet_read_internal(fd, buffer, size, 0);
+ return packet_read_internal(fd, NULL, 0, buffer, size, 0);
}
-int packet_get_line(struct strbuf *out,
- char **src_buf, size_t *src_len)
+int packet_read_from_buf(char *dst, unsigned dst_len,
+ char **src_buf, size_t *src_len)
{
- int len;
-
- if (*src_len < 4)
- return -1;
- len = packet_length(*src_buf);
- if (len < 0)
- return -1;
- if (!len) {
- *src_buf += 4;
- *src_len -= 4;
- packet_trace("0000", 4, 0);
- return 0;
- }
- if (*src_len < len)
- return -2;
-
- *src_buf += 4;
- *src_len -= 4;
- len -= 4;
+ return packet_read_internal(-1, src_buf, src_len, dst, dst_len, 0);
+}
- strbuf_add(out, *src_buf, len);
- *src_buf += len;
- *src_len -= len;
- packet_trace(out->buf, out->len, 0);
- return len;
+int packet_read_from_buf_gently(char *dst, unsigned dst_len,
+ char **src_buf, size_t *src_len)
+{
+ return packet_read_internal(-1, src_buf, src_len, dst, dst_len, 1);
}
diff --git a/pkt-line.h b/pkt-line.h
index 2dc4941..287a391 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -45,7 +45,16 @@ int packet_read_gently(int fd, char *buffer, unsigned size);
*/
int packet_read_gently(int fd, char *buffer, unsigned size);
-int packet_get_line(struct strbuf *out, char **src_buf, size_t *src_len);
+/*
+ * Same as packet_read above, but read from an in-memory buffer
+ * instead of a file descriptor. The src_buf and src_len are modified
+ * to iterate past the consumed data.
+ */
+int packet_read_from_buf(char *dst, unsigned dst_len,
+ char **src_buf, size_t *src_len);
+int packet_read_from_buf_gently(char *dst, unsigned dst_len,
+ char **src_buf, size_t *src_len);
+
ssize_t safe_write(int, const void *, ssize_t);
#endif
diff --git a/remote-curl.c b/remote-curl.c
index 99cc016..2ec5854 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -138,28 +138,29 @@ static struct discovery* discover_refs(const char *service)
if (maybe_smart &&
(5 <= last->len && last->buf[4] == '#') &&
!strbuf_cmp(&exp, &type)) {
+ char line[1000];
+ int len;
+
/*
* smart HTTP response; validate that the service
* pkt-line matches our request.
*/
- if (packet_get_line(&buffer, &last->buf, &last->len) <= 0)
- die("%s has invalid packet header", refs_url);
- if (buffer.len && buffer.buf[buffer.len - 1] == '\n')
- strbuf_setlen(&buffer, buffer.len - 1);
+ len = packet_read_from_buf(line, sizeof(line), &last->buf, &last->len);
+ if (len && line[len - 1] == '\n')
+ len--;
strbuf_reset(&exp);
strbuf_addf(&exp, "# service=%s", service);
- if (strbuf_cmp(&exp, &buffer))
- die("invalid server response; got '%s'", buffer.buf);
+ if (len != exp.len || memcmp(exp.buf, line, len))
+ die("invalid server response; got '%s'", line);
strbuf_release(&exp);
/* The header can include additional metadata lines, up
* until a packet flush marker. Ignore these now, but
* in the future we might start to scan them.
*/
- strbuf_reset(&buffer);
- while (packet_get_line(&buffer, &last->buf, &last->len) > 0)
- strbuf_reset(&buffer);
+ while (packet_read_from_buf(line, sizeof(line), &last->buf, &last->len) > 0)
+ ;
last->proto_git = 1;
}
--
1.8.1.20.g7078b03
^ permalink raw reply related [flat|nested] 50+ messages in thread
* Re: [PATCHv2 06/10] pkt-line: share buffer/descriptor reading implementation
2013-02-18 9:26 ` [PATCHv2 06/10] pkt-line: share buffer/descriptor reading implementation Jeff King
@ 2013-02-18 10:43 ` Jonathan Nieder
2013-02-18 10:48 ` Jeff King
0 siblings, 1 reply; 50+ messages in thread
From: Jonathan Nieder @ 2013-02-18 10:43 UTC (permalink / raw)
To: Jeff King; +Cc: git, Shawn O. Pearce
Jeff King wrote:
> The packet_read function reads from a descriptor.
Ah, so this introduces a new analagous helper that reads from
a strbuf, to avoid the copy-from-async-procedure hack?
[...]
> --- a/pkt-line.c
> +++ b/pkt-line.c
> @@ -103,12 +103,26 @@ static int safe_read(int fd, void *buffer, unsigned size, int gently)
> strbuf_add(buf, buffer, n);
> }
>
> -static int safe_read(int fd, void *buffer, unsigned size, int gently)
> +static int get_packet_data(int fd, char **src_buf, size_t *src_size,
> + void *dst, unsigned size, int gently)
> {
> - ssize_t ret = read_in_full(fd, buffer, size);
> - if (ret < 0)
> - die_errno("read error");
> - else if (ret < size) {
> + ssize_t ret;
> +
> + /* Read up to "size" bytes from our source, whatever it is. */
> + if (src_buf) {
> + ret = size < *src_size ? size : *src_size;
> + memcpy(dst, *src_buf, ret);
> + *src_buf += size;
> + *src_size -= size;
> + }
> + else {
Style: git cuddles its "else"s.
assert(src_buf ? fd < 0 : fd >= 0);
if (src_buf) {
...
} else {
...
}
> + ret = read_in_full(fd, dst, size);
> + if (ret < 0)
> + die_errno("read error");
This is noisy about upstream pipe gone missing, which makes sense
since this is transport-related. Maybe that deserves a comment.
[...]
> --- a/remote-curl.c
> +++ b/remote-curl.c
> @@ -138,28 +138,29 @@ static struct discovery* discover_refs(const char *service)
> if (maybe_smart &&
> (5 <= last->len && last->buf[4] == '#') &&
> !strbuf_cmp(&exp, &type)) {
> + char line[1000];
> + int len;
> +
> /*
> * smart HTTP response; validate that the service
> * pkt-line matches our request.
> */
> - if (packet_get_line(&buffer, &last->buf, &last->len) <= 0)
> - die("%s has invalid packet header", refs_url);
> - if (buffer.len && buffer.buf[buffer.len - 1] == '\n')
> - strbuf_setlen(&buffer, buffer.len - 1);
> + len = packet_read_from_buf(line, sizeof(line), &last->buf, &last->len);
> + if (len && line[len - 1] == '\n')
> + len--;
Was anything guaranteeing that buffer.len < 1000 before this change?
The rest looks good from a quick glance.
Jonathan
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCHv2 06/10] pkt-line: share buffer/descriptor reading implementation
2013-02-18 10:43 ` Jonathan Nieder
@ 2013-02-18 10:48 ` Jeff King
2013-02-18 10:54 ` Jonathan Nieder
0 siblings, 1 reply; 50+ messages in thread
From: Jeff King @ 2013-02-18 10:48 UTC (permalink / raw)
To: Jonathan Nieder; +Cc: git, Shawn O. Pearce
On Mon, Feb 18, 2013 at 02:43:50AM -0800, Jonathan Nieder wrote:
> Jeff King wrote:
>
> > The packet_read function reads from a descriptor.
>
> Ah, so this introduces a new analagous helper that reads from
> a strbuf, to avoid the copy-from-async-procedure hack?
Not from a strbuf, but basically, yes.
> > + ret = read_in_full(fd, dst, size);
> > + if (ret < 0)
> > + die_errno("read error");
>
> This is noisy about upstream pipe gone missing, which makes sense
> since this is transport-related. Maybe that deserves a comment.
That is not new code; it is just reindented from the original safe_read.
But is it noisy about a missing pipe? We do not get EPIPE for reading.
We should just get a short read or EOF, both of which is handled later.
> > + len = packet_read_from_buf(line, sizeof(line), &last->buf, &last->len);
> > + if (len && line[len - 1] == '\n')
> > + len--;
>
> Was anything guaranteeing that buffer.len < 1000 before this change?
No. That's discussed in point (3) of the "implications" in the commit
message.
-Peff
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCHv2 06/10] pkt-line: share buffer/descriptor reading implementation
2013-02-18 10:48 ` Jeff King
@ 2013-02-18 10:54 ` Jonathan Nieder
0 siblings, 0 replies; 50+ messages in thread
From: Jonathan Nieder @ 2013-02-18 10:54 UTC (permalink / raw)
To: Jeff King; +Cc: git, Shawn O. Pearce
Jeff King wrote:
> But is it noisy about a missing pipe? We do not get EPIPE for reading.
Ah, that makes more sense.
[...]
>>> + len = packet_read_from_buf(line, sizeof(line), &last->buf, &last->len);
>>> + if (len && line[len - 1] == '\n')
>>> + len--;
>>
>> Was anything guaranteeing that buffer.len < 1000 before this change?
>
> No. That's discussed in point (3) of the "implications" in the commit
> message.
Thanks. Sorry I missed it the first time.
^ permalink raw reply [flat|nested] 50+ messages in thread
* [PATCHv2 07/10] teach get_remote_heads to read from a memory buffer
2013-02-18 9:12 ` [PATCHv2 0/10] pkt-line and remote-curl cleanups server Jeff King
` (5 preceding siblings ...)
2013-02-18 9:26 ` [PATCHv2 06/10] pkt-line: share buffer/descriptor reading implementation Jeff King
@ 2013-02-18 9:27 ` Jeff King
2013-02-18 9:29 ` [PATCHv2 08/10] remote-curl: pass buffer straight to get_remote_heads Jeff King
` (4 subsequent siblings)
11 siblings, 0 replies; 50+ messages in thread
From: Jeff King @ 2013-02-18 9:27 UTC (permalink / raw)
To: Jonathan Nieder; +Cc: git, Shawn O. Pearce
Now that we can read packet data from memory as easily as a
descriptor, get_remote_heads can take either one as a
source. This will allow further refactoring in remote-curl.
Signed-off-by: Jeff King <peff@peff.net>
---
There aren't that many callers of get_remote_heads, so I just added the
optional parameters and tweaked each callsite. We can do it as a
get_remote_heads_from_buf wrapper function and leave them be, if we
want.
builtin/fetch-pack.c | 2 +-
builtin/send-pack.c | 2 +-
cache.h | 4 +++-
connect.c | 10 +++++++---
remote-curl.c | 2 +-
transport.c | 6 +++---
6 files changed, 16 insertions(+), 10 deletions(-)
diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index b59e60e..bf850d0 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -128,7 +128,7 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
args.verbose ? CONNECT_VERBOSE : 0);
}
- get_remote_heads(fd[0], &ref, 0, NULL);
+ get_remote_heads(fd[0], NULL, 0, &ref, 0, NULL);
ref = fetch_pack(&args, fd, conn, ref, dest,
&sought, pack_lockfile_ptr);
diff --git a/builtin/send-pack.c b/builtin/send-pack.c
index 8778519..152c4ea 100644
--- a/builtin/send-pack.c
+++ b/builtin/send-pack.c
@@ -207,7 +207,7 @@ int cmd_send_pack(int argc, const char **argv, const char *prefix)
memset(&extra_have, 0, sizeof(extra_have));
- get_remote_heads(fd[0], &remote_refs, REF_NORMAL, &extra_have);
+ get_remote_heads(fd[0], NULL, 0, &remote_refs, REF_NORMAL, &extra_have);
transport_verify_remote_names(nr_refspecs, refspecs);
diff --git a/cache.h b/cache.h
index e493563..db646a2 100644
--- a/cache.h
+++ b/cache.h
@@ -1049,7 +1049,9 @@ struct extra_have_objects {
int nr, alloc;
unsigned char (*array)[20];
};
-extern struct ref **get_remote_heads(int in, struct ref **list, unsigned int flags, struct extra_have_objects *);
+extern struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
+ struct ref **list, unsigned int flags,
+ struct extra_have_objects *);
extern int server_supports(const char *feature);
extern int parse_feature_request(const char *features, const char *feature);
extern const char *server_feature_value(const char *feature, int *len_ret);
diff --git a/connect.c b/connect.c
index 59266b1..28f21d0 100644
--- a/connect.c
+++ b/connect.c
@@ -62,8 +62,8 @@ static void die_initial_contact(int got_at_least_one_head)
/*
* Read all the refs from the other end
*/
-struct ref **get_remote_heads(int in, struct ref **list,
- unsigned int flags,
+struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
+ struct ref **list, unsigned int flags,
struct extra_have_objects *extra_have)
{
int got_at_least_one_head = 0;
@@ -76,7 +76,11 @@ struct ref **get_remote_heads(int in, struct ref **list,
char *name;
int len, name_len;
- len = packet_read_gently(in, buffer, sizeof(buffer));
+ if (src_buf)
+ len = packet_read_from_buf_gently(buffer, sizeof(buffer),
+ &src_buf, &src_len);
+ else
+ len = packet_read_gently(in, buffer, sizeof(buffer));
if (len < 0)
die_initial_contact(got_at_least_one_head);
diff --git a/remote-curl.c b/remote-curl.c
index 2ec5854..6e43463 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -195,7 +195,7 @@ static struct ref *parse_git_refs(struct discovery *heads, int for_push)
if (start_async(&async))
die("cannot start thread to parse advertised refs");
- get_remote_heads(async.out, &list,
+ get_remote_heads(async.out, NULL, 0, &list,
for_push ? REF_NORMAL : 0, NULL);
close(async.out);
if (finish_async(&async))
diff --git a/transport.c b/transport.c
index 886ffd8..62df466 100644
--- a/transport.c
+++ b/transport.c
@@ -507,7 +507,7 @@ static struct ref *get_refs_via_connect(struct transport *transport, int for_pus
struct ref *refs;
connect_setup(transport, for_push, 0);
- get_remote_heads(data->fd[0], &refs,
+ get_remote_heads(data->fd[0], NULL, 0, &refs,
for_push ? REF_NORMAL : 0, &data->extra_have);
data->got_remote_heads = 1;
@@ -541,7 +541,7 @@ static int fetch_refs_via_pack(struct transport *transport,
if (!data->got_remote_heads) {
connect_setup(transport, 0, 0);
- get_remote_heads(data->fd[0], &refs_tmp, 0, NULL);
+ get_remote_heads(data->fd[0], NULL, 0, &refs_tmp, 0, NULL);
data->got_remote_heads = 1;
}
@@ -799,7 +799,7 @@ static int git_transport_push(struct transport *transport, struct ref *remote_re
struct ref *tmp_refs;
connect_setup(transport, 1, 0);
- get_remote_heads(data->fd[0], &tmp_refs, REF_NORMAL, NULL);
+ get_remote_heads(data->fd[0], NULL, 0, &tmp_refs, REF_NORMAL, NULL);
data->got_remote_heads = 1;
}
--
1.8.1.20.g7078b03
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCHv2 08/10] remote-curl: pass buffer straight to get_remote_heads
2013-02-18 9:12 ` [PATCHv2 0/10] pkt-line and remote-curl cleanups server Jeff King
` (6 preceding siblings ...)
2013-02-18 9:27 ` [PATCHv2 07/10] teach get_remote_heads to read from a memory buffer Jeff King
@ 2013-02-18 9:29 ` Jeff King
2013-02-18 10:47 ` Jonathan Nieder
2013-02-18 9:29 ` [PATCHv2 0/10] pkt-line and remote-curl cleanups server Junio C Hamano
` (3 subsequent siblings)
11 siblings, 1 reply; 50+ messages in thread
From: Jeff King @ 2013-02-18 9:29 UTC (permalink / raw)
To: Jonathan Nieder; +Cc: git, Shawn O. Pearce
Until recently, get_remote_heads only knew how to read refs
from a file descriptor. To hack around this, we spawned a
thread (or forked a process) to write the buffer back to us.
Now that we can just pass it our buffer directly, we don't
have to use this hack anymore.
Signed-off-by: Jeff King <peff@peff.net>
---
I don't know that this code was hurting anything, but it has always
struck me as ugly and a possible source of error. And now it's gone.
remote-curl.c | 26 ++------------------------
1 file changed, 2 insertions(+), 24 deletions(-)
diff --git a/remote-curl.c b/remote-curl.c
index 6e43463..f049da2 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -173,33 +173,11 @@ static struct ref *parse_git_refs(struct discovery *heads, int for_push)
return last;
}
-static int write_discovery(int in, int out, void *data)
-{
- struct discovery *heads = data;
- int err = 0;
- if (write_in_full(out, heads->buf, heads->len) != heads->len)
- err = 1;
- close(out);
- return err;
-}
-
static struct ref *parse_git_refs(struct discovery *heads, int for_push)
{
struct ref *list = NULL;
- struct async async;
-
- memset(&async, 0, sizeof(async));
- async.proc = write_discovery;
- async.data = heads;
- async.out = -1;
-
- if (start_async(&async))
- die("cannot start thread to parse advertised refs");
- get_remote_heads(async.out, NULL, 0, &list,
- for_push ? REF_NORMAL : 0, NULL);
- close(async.out);
- if (finish_async(&async))
- die("ref parsing thread failed");
+ get_remote_heads(-1, heads->buf, heads->len, &list,
+ for_push ? REF_NORMAL : 0, NULL);
return list;
}
--
1.8.1.20.g7078b03
^ permalink raw reply related [flat|nested] 50+ messages in thread
* Re: [PATCHv2 0/10] pkt-line and remote-curl cleanups server
2013-02-18 9:12 ` [PATCHv2 0/10] pkt-line and remote-curl cleanups server Jeff King
` (7 preceding siblings ...)
2013-02-18 9:29 ` [PATCHv2 08/10] remote-curl: pass buffer straight to get_remote_heads Jeff King
@ 2013-02-18 9:29 ` Junio C Hamano
2013-02-18 9:33 ` Jeff King
2013-02-18 9:29 ` [PATCHv2 09/10] remote-curl: move ref-parsing code up in file Jeff King
` (2 subsequent siblings)
11 siblings, 1 reply; 50+ messages in thread
From: Junio C Hamano @ 2013-02-18 9:29 UTC (permalink / raw)
To: Jeff King; +Cc: Jonathan Nieder, git, Shawn O. Pearce
Jeff King <peff@peff.net> writes:
> On Sun, Feb 17, 2013 at 05:41:13PM -0800, Jonathan Nieder wrote:
>
>> > I don't think so. Don't ERR lines appear inside their own packets?
>>
>> Yes, I misread get_remote_heads for some reason. Thanks for checking.
>
> Thanks for bringing it up. I had not even thought about ERR at all. So
> it was luck rather than skill that I was right. :)
>
>> I'm not sure whether servers are expected to send a flush after an
>> ERR packet. The only codepath I know of in git itself that sends
>> such packets is git-daemon, which does not flush after the error (but
>> is not used in the stateless-rpc case). http-backend uses HTTP error
>> codes for its errors.
>
> I just checked, and GitHub also does not send flush packets after ERR.
> Which makes sense; ERR is supposed to end the conversation.
Hmph. A flush packet was supposed to be a mark to say "all the
packets before this one can be buffered and kept without getting
passed to write(2), but this and all such buffered data _must_ go on
the wire _now_". So in the sense, ERR not followed by a flush may
not even have a chance to be seen on the other end, no? That is
what the comment before the implementation of packet_flush() is all
about.
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCHv2 0/10] pkt-line and remote-curl cleanups server
2013-02-18 9:29 ` [PATCHv2 0/10] pkt-line and remote-curl cleanups server Junio C Hamano
@ 2013-02-18 9:33 ` Jeff King
2013-02-18 9:49 ` Junio C Hamano
0 siblings, 1 reply; 50+ messages in thread
From: Jeff King @ 2013-02-18 9:33 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Jonathan Nieder, git, Shawn O. Pearce
On Mon, Feb 18, 2013 at 01:29:16AM -0800, Junio C Hamano wrote:
> > I just checked, and GitHub also does not send flush packets after ERR.
> > Which makes sense; ERR is supposed to end the conversation.
>
> Hmph. A flush packet was supposed to be a mark to say "all the
> packets before this one can be buffered and kept without getting
> passed to write(2), but this and all such buffered data _must_ go on
> the wire _now_". So in the sense, ERR not followed by a flush may
> not even have a chance to be seen on the other end, no? That is
> what the comment before the implementation of packet_flush() is all
> about.
Despite the name, I always thought of packet_flush as more of a signal
for the _receiver_, who then knows that they have gotten all of a
particular list. In other words, we seem to use it as a sequence point
as much as anything (mostly because we immediately write() out any other
packets anyway, so there is no flushing to be done; but perhaps there
were originally thoughts that we could do more buffering on the writing
side).
-Peff
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCHv2 0/10] pkt-line and remote-curl cleanups server
2013-02-18 9:33 ` Jeff King
@ 2013-02-18 9:49 ` Junio C Hamano
2013-02-18 9:55 ` Jeff King
2013-02-20 7:14 ` Shawn Pearce
0 siblings, 2 replies; 50+ messages in thread
From: Junio C Hamano @ 2013-02-18 9:49 UTC (permalink / raw)
To: Jeff King; +Cc: Jonathan Nieder, git, Shawn O. Pearce
Jeff King <peff@peff.net> writes:
> On Mon, Feb 18, 2013 at 01:29:16AM -0800, Junio C Hamano wrote:
>
>> > I just checked, and GitHub also does not send flush packets after ERR.
>> > Which makes sense; ERR is supposed to end the conversation.
>>
>> Hmph. A flush packet was supposed to be a mark to say "all the
>> packets before this one can be buffered and kept without getting
>> passed to write(2), but this and all such buffered data _must_ go on
>> the wire _now_". So in the sense, ERR not followed by a flush may
>> not even have a chance to be seen on the other end, no? That is
>> what the comment before the implementation of packet_flush() is all
>> about.
>
> Despite the name, I always thought of packet_flush as more of a signal
> for the _receiver_, who then knows that they have gotten all of a
> particular list. In other words, we seem to use it as a sequence point
> as much as anything (mostly because we immediately write() out any other
> packets anyway, so there is no flushing to be done; but perhaps there
> were originally thoughts that we could do more buffering on the writing
> side).
Exactly.
The logic behind the comment "we do not buffer, but we should" is
that flush tells the receiver that the sending end is "done" feeding
it a class of data, and the data before flush do not have to reach
the receiver immediately, hence we can afford to buffer on the
sending end if we can measure that doing so would give us better
performance. We haven't measure and turned buffering on yet.
But when dying, we may of course have data before flushing. We may
disconnect (by dying) without showing flush (or preceding ERR) to
the other side, and for that reason, not relying on the flush packet
on the receiving end is a good change. Once we turn buffering on, we
probably need to be careful when sending an ERR indicator by making
it always drain any buffered data and show the ERR indicator without
buffering, or something.
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCHv2 0/10] pkt-line and remote-curl cleanups server
2013-02-18 9:49 ` Junio C Hamano
@ 2013-02-18 9:55 ` Jeff King
2013-02-20 7:14 ` Shawn Pearce
1 sibling, 0 replies; 50+ messages in thread
From: Jeff King @ 2013-02-18 9:55 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Jonathan Nieder, git, Shawn O. Pearce
On Mon, Feb 18, 2013 at 01:49:37AM -0800, Junio C Hamano wrote:
> The logic behind the comment "we do not buffer, but we should" is
> that flush tells the receiver that the sending end is "done" feeding
> it a class of data, and the data before flush do not have to reach
> the receiver immediately, hence we can afford to buffer on the
> sending end if we can measure that doing so would give us better
> performance. We haven't measure and turned buffering on yet.
>
> But when dying, we may of course have data before flushing. We may
> disconnect (by dying) without showing flush (or preceding ERR) to
> the other side, and for that reason, not relying on the flush packet
> on the receiving end is a good change. Once we turn buffering on, we
> probably need to be careful when sending an ERR indicator by making
> it always drain any buffered data and show the ERR indicator without
> buffering, or something.
Yeah, I'd agree. An ERR packet should always be accompanied by a flush
(whether an actual flush packet or not doesn't really matter, but
definitely a buffer flush on the sender). But I think that is really the
sender's problem, and we can deal with it there if and when we buffer.
>From the receiver's end, they simply want to be liberal in interpreting
an ERR as the end of the data stream. They do not have to be picky about
whether it is followed by more data, or by a flush packet, or whatever.
But that is not a change I am introducing; that is how get_remote_heads
has always worked. I am merely correcting the proposed change from v1 of
the series that did not correctly take that into account (and therefore
regressed the error message in the ERR case).
-Peff
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCHv2 0/10] pkt-line and remote-curl cleanups server
2013-02-18 9:49 ` Junio C Hamano
2013-02-18 9:55 ` Jeff King
@ 2013-02-20 7:14 ` Shawn Pearce
1 sibling, 0 replies; 50+ messages in thread
From: Shawn Pearce @ 2013-02-20 7:14 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Jeff King, Jonathan Nieder, git
On Mon, Feb 18, 2013 at 1:49 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Jeff King <peff@peff.net> writes:
>
>> On Mon, Feb 18, 2013 at 01:29:16AM -0800, Junio C Hamano wrote:
>>
>>> > I just checked, and GitHub also does not send flush packets after ERR.
>>> > Which makes sense; ERR is supposed to end the conversation.
>>>
>>> Hmph. A flush packet was supposed to be a mark to say "all the
>>> packets before this one can be buffered and kept without getting
>>> passed to write(2), but this and all such buffered data _must_ go on
>>> the wire _now_". So in the sense, ERR not followed by a flush may
>>> not even have a chance to be seen on the other end, no? That is
>>> what the comment before the implementation of packet_flush() is all
>>> about.
>>
>> Despite the name, I always thought of packet_flush as more of a signal
>> for the _receiver_, who then knows that they have gotten all of a
>> particular list. In other words, we seem to use it as a sequence point
>> as much as anything (mostly because we immediately write() out any other
>> packets anyway, so there is no flushing to be done; but perhaps there
>> were originally thoughts that we could do more buffering on the writing
>> side).
>
> Exactly.
This is also my understanding of the flush-pkt ("0000"). Its an
end-of-list/end-of-section marker to let the peer know the protocol is
moving to the next state. Except where the protocol can move to the
next state without a flush-pkt of course (see below).
> The logic behind the comment "we do not buffer, but we should" is
> that flush tells the receiver that the sending end is "done" feeding
> it a class of data, and the data before flush do not have to reach
> the receiver immediately, hence we can afford to buffer on the
> sending end if we can measure that doing so would give us better
> performance. We haven't measure and turned buffering on yet.
So funny story. JGit actually buffers the pkt-line writes in memory
and does flushes to the network socket when any of the following
happen:
- fixed size in-memory buffer is full (8k or 32k by default)
- flush-pkt is needed in the protocol
- JGit forces a flush without a flush-pkt
There are places in the protocol where data needs to be shared with
the peer *despite* not having a flush-pkt present in the data stream
to do that. ERR is one of these places. "done\n" at the end of the
negotiation in a client is another. Sending ACK/NAK in a multi_ack
from upload-pack is another. I may be missing more.
JGit had to define three methods to make the pkt-line protocol work correctly:
- writeString: format a string as a single pkt-line, insert into buffer.
- end: write "000" into the buffer, flush the buffer.
- flush: flush the buffer if it has any content.
I always found that comment in front of that function funny. Its
totally not true. Fixing it is harder than just stuffing a buffer in
there and hoping for the best. The callers need work. At which point
that function isn't really what its author was trying to do.
> But when dying, we may of course have data before flushing. We may
> disconnect (by dying) without showing flush (or preceding ERR) to
> the other side, and for that reason, not relying on the flush packet
> on the receiving end is a good change. Once we turn buffering on, we
> probably need to be careful when sending an ERR indicator by making
> it always drain any buffered data and show the ERR indicator without
> buffering, or something.
Yes.
^ permalink raw reply [flat|nested] 50+ messages in thread
* [PATCHv2 09/10] remote-curl: move ref-parsing code up in file
2013-02-18 9:12 ` [PATCHv2 0/10] pkt-line and remote-curl cleanups server Jeff King
` (8 preceding siblings ...)
2013-02-18 9:29 ` [PATCHv2 0/10] pkt-line and remote-curl cleanups server Junio C Hamano
@ 2013-02-18 9:29 ` Jeff King
2013-02-18 9:30 ` [PATCHv2 10/10] remote-curl: always parse incoming refs Jeff King
2013-02-20 7:05 ` [PATCHv2 0/10] pkt-line and remote-curl cleanups server Shawn Pearce
11 siblings, 0 replies; 50+ messages in thread
From: Jeff King @ 2013-02-18 9:29 UTC (permalink / raw)
To: Jonathan Nieder; +Cc: git, Shawn O. Pearce
The ref-parsing functions are static. Let's move them up in
the file to be available to more functions, which will help
us with later refactoring.
Signed-off-by: Jeff King <peff@peff.net>
---
Just a cleanup for the next patch. We could also just do extra
declarations at the top.
remote-curl.c | 117 +++++++++++++++++++++++++++++-----------------------------
1 file changed, 59 insertions(+), 58 deletions(-)
diff --git a/remote-curl.c b/remote-curl.c
index f049da2..62f82d1 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -80,6 +80,65 @@ static struct discovery *last_discovery;
};
static struct discovery *last_discovery;
+static struct ref *parse_git_refs(struct discovery *heads, int for_push)
+{
+ struct ref *list = NULL;
+ get_remote_heads(-1, heads->buf, heads->len, &list,
+ for_push ? REF_NORMAL : 0, NULL);
+ return list;
+}
+
+static struct ref *parse_info_refs(struct discovery *heads)
+{
+ char *data, *start, *mid;
+ char *ref_name;
+ int i = 0;
+
+ struct ref *refs = NULL;
+ struct ref *ref = NULL;
+ struct ref *last_ref = NULL;
+
+ data = heads->buf;
+ start = NULL;
+ mid = data;
+ while (i < heads->len) {
+ if (!start) {
+ start = &data[i];
+ }
+ if (data[i] == '\t')
+ mid = &data[i];
+ if (data[i] == '\n') {
+ if (mid - start != 40)
+ die("%sinfo/refs not valid: is this a git repository?", url);
+ data[i] = 0;
+ ref_name = mid + 1;
+ ref = xmalloc(sizeof(struct ref) +
+ strlen(ref_name) + 1);
+ memset(ref, 0, sizeof(struct ref));
+ strcpy(ref->name, ref_name);
+ get_sha1_hex(start, ref->old_sha1);
+ if (!refs)
+ refs = ref;
+ if (last_ref)
+ last_ref->next = ref;
+ last_ref = ref;
+ start = NULL;
+ }
+ i++;
+ }
+
+ ref = alloc_ref("HEAD");
+ if (!http_fetch_ref(url, ref) &&
+ !resolve_remote_symref(ref, refs)) {
+ ref->next = refs;
+ refs = ref;
+ } else {
+ free(ref);
+ }
+
+ return refs;
+}
+
static void free_discovery(struct discovery *d)
{
if (d) {
@@ -173,64 +232,6 @@ static struct discovery* discover_refs(const char *service)
return last;
}
-static struct ref *parse_git_refs(struct discovery *heads, int for_push)
-{
- struct ref *list = NULL;
- get_remote_heads(-1, heads->buf, heads->len, &list,
- for_push ? REF_NORMAL : 0, NULL);
- return list;
-}
-
-static struct ref *parse_info_refs(struct discovery *heads)
-{
- char *data, *start, *mid;
- char *ref_name;
- int i = 0;
-
- struct ref *refs = NULL;
- struct ref *ref = NULL;
- struct ref *last_ref = NULL;
-
- data = heads->buf;
- start = NULL;
- mid = data;
- while (i < heads->len) {
- if (!start) {
- start = &data[i];
- }
- if (data[i] == '\t')
- mid = &data[i];
- if (data[i] == '\n') {
- if (mid - start != 40)
- die("%sinfo/refs not valid: is this a git repository?", url);
- data[i] = 0;
- ref_name = mid + 1;
- ref = xmalloc(sizeof(struct ref) +
- strlen(ref_name) + 1);
- memset(ref, 0, sizeof(struct ref));
- strcpy(ref->name, ref_name);
- get_sha1_hex(start, ref->old_sha1);
- if (!refs)
- refs = ref;
- if (last_ref)
- last_ref->next = ref;
- last_ref = ref;
- start = NULL;
- }
- i++;
- }
-
- ref = alloc_ref("HEAD");
- if (!http_fetch_ref(url, ref) &&
- !resolve_remote_symref(ref, refs)) {
- ref->next = refs;
- refs = ref;
- } else {
- free(ref);
- }
-
- return refs;
-}
static struct ref *get_refs(int for_push)
{
--
1.8.1.20.g7078b03
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCHv2 10/10] remote-curl: always parse incoming refs
2013-02-18 9:12 ` [PATCHv2 0/10] pkt-line and remote-curl cleanups server Jeff King
` (9 preceding siblings ...)
2013-02-18 9:29 ` [PATCHv2 09/10] remote-curl: move ref-parsing code up in file Jeff King
@ 2013-02-18 9:30 ` Jeff King
2013-02-18 10:50 ` Jonathan Nieder
2013-02-20 7:05 ` [PATCHv2 0/10] pkt-line and remote-curl cleanups server Shawn Pearce
11 siblings, 1 reply; 50+ messages in thread
From: Jeff King @ 2013-02-18 9:30 UTC (permalink / raw)
To: Jonathan Nieder; +Cc: git, Shawn O. Pearce
When remote-curl receives a list of refs from a server, it
keeps the whole buffer intact. When we get a "list" command,
we feed the result to get_remote_heads, and when we get a
"fetch" or "push" command, we feed it to fetch-pack or
send-pack, respectively.
If the HTTP response from the server is truncated for any
reason, we will get an incomplete ref advertisement. If we
then feed this incomplete list to fetch-pack, one of a few
things may happen:
1. If the truncation is in a packet header, fetch-pack
will notice the bogus line and complain.
2. If the truncation is inside a packet, fetch-pack will
keep waiting for us to send the rest of the packet,
which we never will.
3. If the truncation is at a packet boundary, fetch-pack
will keep waiting for us to send the next packet, which
we never will.
As a result, fetch-pack hangs, waiting for input. However,
remote-curl believes it has sent all of the advertisement,
and therefore waits for fetch-pack to speak. The two
processes end up in a deadlock.
We do notice the broken ref list if we feed it to
get_remote_heads. So if git asks the helper to do a "list"
followed by a "fetch", we are safe; we'll abort during the
list operation, which parses the refs.
This patch teaches remote-curl to always parse and save the
incoming ref list when we read the ref advertisement from a
server. That means that we will always verify and abort
before even running fetch-pack (or send-pack) when reading a
corrupted list, even if we do not run the "list" command
explicitly.
Since we save the result, in the common case of running
"list" then "fetch", we do not do any extra parsing at all.
In the case of just a "fetch", we do an extra round of
parsing, but only once.
Note also that the "fetch" case will now also initialize
server_capabilities from the remote (in remote-curl; we
already would do so inside fetch-pack). Doing "list+fetch"
already does this. It doesn't actually matter now, but the
new behavior is arguably more correct, should remote-curl
ever start caring about the server's capability list.
Signed-off-by: Jeff King <peff@peff.net>
---
And this does the equivalent of patch 3/3 from the first series, but I
think this is much more robust (certainly it solves the ERR problem, but
more importantly, it uses the exact same function that other code paths
do, so we do not have to worry about it diverging).
remote-curl.c | 23 +++++++++++++----------
1 file changed, 13 insertions(+), 10 deletions(-)
diff --git a/remote-curl.c b/remote-curl.c
index 62f82d1..33af198 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -76,6 +76,7 @@ struct discovery {
char *buf_alloc;
char *buf;
size_t len;
+ struct ref *refs;
unsigned proto_git : 1;
};
static struct discovery *last_discovery;
@@ -145,11 +146,12 @@ static void free_discovery(struct discovery *d)
if (d == last_discovery)
last_discovery = NULL;
free(d->buf_alloc);
+ free_refs(d->refs);
free(d);
}
}
-static struct discovery* discover_refs(const char *service)
+static struct discovery* discover_refs(const char *service, int for_push)
{
struct strbuf exp = STRBUF_INIT;
struct strbuf type = STRBUF_INIT;
@@ -224,6 +226,11 @@ static struct discovery* discover_refs(const char *service)
last->proto_git = 1;
}
+ if (last->proto_git)
+ last->refs = parse_git_refs(last, for_push);
+ else
+ last->refs = parse_info_refs(last);
+
free(refs_url);
strbuf_release(&exp);
strbuf_release(&type);
@@ -232,19 +239,16 @@ static struct ref *get_refs(int for_push)
return last;
}
-
static struct ref *get_refs(int for_push)
{
struct discovery *heads;
if (for_push)
- heads = discover_refs("git-receive-pack");
+ heads = discover_refs("git-receive-pack", for_push);
else
- heads = discover_refs("git-upload-pack");
+ heads = discover_refs("git-upload-pack", for_push);
- if (heads->proto_git)
- return parse_git_refs(heads, for_push);
- return parse_info_refs(heads);
+ return heads->refs;
}
static void output_refs(struct ref *refs)
@@ -258,7 +262,6 @@ static void output_refs(struct ref *refs)
}
printf("\n");
fflush(stdout);
- free_refs(refs);
}
struct rpc_state {
@@ -674,7 +677,7 @@ static int fetch(int nr_heads, struct ref **to_fetch)
static int fetch(int nr_heads, struct ref **to_fetch)
{
- struct discovery *d = discover_refs("git-upload-pack");
+ struct discovery *d = discover_refs("git-upload-pack", 0);
if (d->proto_git)
return fetch_git(d, nr_heads, to_fetch);
else
@@ -793,7 +796,7 @@ static int push(int nr_spec, char **specs)
static int push(int nr_spec, char **specs)
{
- struct discovery *heads = discover_refs("git-receive-pack");
+ struct discovery *heads = discover_refs("git-receive-pack", 1);
int ret;
if (heads->proto_git)
--
1.8.1.20.g7078b03
^ permalink raw reply related [flat|nested] 50+ messages in thread
* Re: [PATCHv2 10/10] remote-curl: always parse incoming refs
2013-02-18 9:30 ` [PATCHv2 10/10] remote-curl: always parse incoming refs Jeff King
@ 2013-02-18 10:50 ` Jonathan Nieder
2013-02-20 7:41 ` Shawn Pearce
0 siblings, 1 reply; 50+ messages in thread
From: Jonathan Nieder @ 2013-02-18 10:50 UTC (permalink / raw)
To: Jeff King; +Cc: git, Shawn O. Pearce
Jeff King wrote:
> remote-curl.c | 23 +++++++++++++----------
> 1 file changed, 13 insertions(+), 10 deletions(-)
I like.
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCHv2 10/10] remote-curl: always parse incoming refs
2013-02-18 10:50 ` Jonathan Nieder
@ 2013-02-20 7:41 ` Shawn Pearce
0 siblings, 0 replies; 50+ messages in thread
From: Shawn Pearce @ 2013-02-20 7:41 UTC (permalink / raw)
To: Jonathan Nieder; +Cc: Jeff King, git
On Mon, Feb 18, 2013 at 1:30 AM, Jeff King <peff@peff.net> wrote:
> When remote-curl receives a list of refs from a server, it
> keeps the whole buffer intact. When we get a "list" command,
> we feed the result to get_remote_heads, and when we get a
> "fetch" or "push" command, we feed it to fetch-pack or
> send-pack, respectively.
>
> If the HTTP response from the server is truncated for any
> reason,
...
> As a result, fetch-pack hangs, waiting for input. However,
> remote-curl believes it has sent all of the advertisement,
> and therefore waits for fetch-pack to speak. The two
> processes end up in a deadlock.
Eek. Thanks for fixing this.
On Mon, Feb 18, 2013 at 2:50 AM, Jonathan Nieder <jrnieder@gmail.com> wrote:
> Jeff King wrote:
>
>> remote-curl.c | 23 +++++++++++++----------
>> 1 file changed, 13 insertions(+), 10 deletions(-)
>
> I like.
Me too.
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCHv2 0/10] pkt-line and remote-curl cleanups server
2013-02-18 9:12 ` [PATCHv2 0/10] pkt-line and remote-curl cleanups server Jeff King
` (10 preceding siblings ...)
2013-02-18 9:30 ` [PATCHv2 10/10] remote-curl: always parse incoming refs Jeff King
@ 2013-02-20 7:05 ` Shawn Pearce
11 siblings, 0 replies; 50+ messages in thread
From: Shawn Pearce @ 2013-02-20 7:05 UTC (permalink / raw)
To: Jeff King; +Cc: Jonathan Nieder, git
On Mon, Feb 18, 2013 at 1:12 AM, Jeff King <peff@peff.net> wrote:
> On Sun, Feb 17, 2013 at 05:41:13PM -0800, Jonathan Nieder wrote:
>
>> > I don't think so. Don't ERR lines appear inside their own packets?
>>
>> Yes, I misread get_remote_heads for some reason. Thanks for checking.
>
> Thanks for bringing it up. I had not even thought about ERR at all. So
> it was luck rather than skill that I was right. :)
>
>> I'm not sure whether servers are expected to send a flush after an
>> ERR packet. The only codepath I know of in git itself that sends
>> such packets is git-daemon, which does not flush after the error (but
>> is not used in the stateless-rpc case). http-backend uses HTTP error
>> codes for its errors.
>
> I just checked, and GitHub also does not send flush packets after ERR.
> Which makes sense; ERR is supposed to end the conversation. I can change
> GitHub, of course, but who knows what other implementations exist (e.g.,
> I do not know off-hand whether gitolite has custom ERR responses). So it
> seems pretty clear that just checking for a flush packet is not the
> right thing, and we need to actually parse the packet contents (at least
> to some degree).
JGit (and by extension Gerrit Code Review, android.googlesource.com)
sends ERR with no flush-pkt. I would like to sort of keep the protocol
this way, given how many servers in the wild are running Gerrit and
currently use ERR with no flush-pkt. IMHO its a little late to be
closing that door and stuffing a flush-pkt after the ERR that ends the
conversation.
^ permalink raw reply [flat|nested] 50+ messages in thread