* [PATCH] config: do not ungetc EOF
@ 2015-02-05 6:53 Jeff King
2015-02-05 21:00 ` Jeff King
0 siblings, 1 reply; 6+ messages in thread
From: Jeff King @ 2015-02-05 6:53 UTC
To: git; +Cc: Junio C Hamano, Heiko Voigt
When we are parsing a config value, if we see a carriage
return, we fgetc the next character to see if it is a
line feed (in which case we silently drop the CR). If it
isn't, we then ungetc the character, and take the literal
CR.
But we never check whether we in fact got a character at
all. If the config file ends in CR, we will get EOF here,
and try to ungetc EOF. This works OK for a real stdio
stream. The ungetc returns an error, and the next fgetc will
then return EOF again.
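
A minimal sketch of the stdio behavior described above, assuming a
system where tmpfile() succeeds. The helper name
check_stdio_ungetc_eof is hypothetical and appears nowhere in git:

```c
#include <stdio.h>

/*
 * Hypothetical helper: returns 1 if ungetc(EOF) is rejected and the
 * stream still reports EOF afterwards, 0 if it behaves otherwise, and
 * -1 if no temporary file could be created.
 */
static int check_stdio_ungetc_eof(void)
{
	FILE *f = tmpfile();
	int ok;

	if (!f)
		return -1;
	fputc('\r', f);		/* the file content is a lone CR */
	rewind(f);

	ok = fgetc(f) == '\r'		/* the CR itself */
	  && fgetc(f) == EOF		/* nothing follows it */
	  && ungetc(EOF, f) == EOF	/* pushing back EOF fails */
	  && fgetc(f) == EOF;		/* the stream is left unchanged */

	fclose(f);
	return ok;
}
```

The C standard specifies that ungetc() fails and leaves the stream
unchanged when it is handed EOF, which is why the stdio-backed path
is unaffected.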
However, our custom buffer-based stream is not so fortunate.
It happily rewinds the position of the stream by one
character, ignoring the fact that we fed it EOF. The next
fgetc call returns the final CR again, over and over, and we
end up in an infinite loop.
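
The loop can be reproduced with a simplified model of the
buffer-backed stream. This is only a sketch: struct buf_stream,
buf_fgetc, buf_ungetc, next_char and count_chars are made-up names,
and the "guard" flag merely switches the EOF check on or off so both
behaviors can be compared.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical stand-in for the buffer-backed config source. */
struct buf_stream {
	const char *buf;
	size_t len;
	size_t pos;
};

static int buf_fgetc(struct buf_stream *s)
{
	if (s->pos < s->len)
		return (unsigned char)s->buf[s->pos++];
	return EOF;
}

/* Rewinds by one position and ignores c, as described above. */
static int buf_ungetc(int c, struct buf_stream *s)
{
	(void)c;
	if (s->pos > 0)
		return (unsigned char)s->buf[--s->pos];
	return EOF;
}

/* CR handling; with guard set, EOF is never pushed back. */
static int next_char(struct buf_stream *s, int guard)
{
	int c = buf_fgetc(s);

	if (c == '\r') {
		c = buf_fgetc(s);
		if (c != '\n') {
			if (!guard || c != EOF)
				buf_ungetc(c, s);
			c = '\r';
		}
	}
	return c;
}

/*
 * Counts characters until EOF. Returns -1 if more than "limit"
 * characters come back, which here signals a stream that never ends.
 */
static int count_chars(const char *data, size_t len, int limit, int guard)
{
	struct buf_stream s = { data, len, 0 };
	int n = 0;

	while (next_char(&s, guard) != EOF) {
		if (++n > limit)
			return -1;
	}
	return n;
}
```

For the two-byte input "v\r", count_chars("v\r", 2, 100, 1) returns
2, while count_chars("v\r", 2, 100, 0) hits the limit and returns -1
because the trailing CR is handed back on every read.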
Signed-off-by: Jeff King <peff@peff.net>
---
Looks like this weirdness has been there for a long time, but we only
added the `git config --blob` code in v1.8.4.
I also notice that config_buf_ungetc does not actually ungetc the
character we give it; it just rewinds one character in the stream. This
is fine, because we always feed the last-retrieved character. I dunno if
it is worth fixing (it also would have fixed this infinite loop, but for
the wrong reason; we would have stuck "-1" back into the stream, and
retrieved it on the next fgetc rather than the same '\r' over and over).
config.c | 3 ++-
t/t1307-config-blob.sh | 9 +++++++++
2 files changed, 11 insertions(+), 1 deletion(-)
diff --git a/config.c b/config.c
index 752e2e2..2c63099 100644
--- a/config.c
+++ b/config.c
@@ -235,7 +235,8 @@ static int get_next_char(void)
 		/* DOS like systems */
 		c = cf->do_fgetc(cf);
 		if (c != '\n') {
-			cf->do_ungetc(c, cf);
+			if (c != EOF)
+				cf->do_ungetc(c, cf);
 			c = '\r';
 		}
 	}
diff --git a/t/t1307-config-blob.sh b/t/t1307-config-blob.sh
index fdc257e..3c6791e 100755
--- a/t/t1307-config-blob.sh
+++ b/t/t1307-config-blob.sh
@@ -67,4 +67,13 @@ test_expect_success 'parse errors in blobs are properly attributed' '
 	grep "HEAD:config" err
 '
 
+test_expect_success 'can parse blob ending with CR' '
+	printf "[some]key = value\\r" >config &&
+	git add config &&
+	git commit -m CR &&
+	echo value >expect &&
+	git config --blob=HEAD:config some.key >actual &&
+	test_cmp expect actual
+'
+
 test_done
--
2.3.0.rc1.287.g761fd19
* Re: [PATCH] config: do not ungetc EOF
2015-02-05 6:53 [PATCH] config: do not ungetc EOF Jeff King
@ 2015-02-05 21:00 ` Jeff King
2015-02-05 21:16 ` Junio C Hamano
2015-02-08 21:13 ` Heiko Voigt
0 siblings, 2 replies; 6+ messages in thread
From: Jeff King @ 2015-02-05 21:00 UTC
To: git; +Cc: Junio C Hamano, Heiko Voigt
On Thu, Feb 05, 2015 at 01:53:27AM -0500, Jeff King wrote:
> I also notice that config_buf_ungetc does not actually ungetc the
> character we give it; it just rewinds one character in the stream. This
> is fine, because we always feed the last-retrieved character. I dunno if
> it is worth fixing (it also would have fixed this infinite loop, but for
> the wrong reason; we would have stuck "-1" back into the stream, and
> retrieved it on the next fgetc rather than the same '\r' over and over).
Here's a patch to deal with that. I'm not sure if it's worth doing or
not.
-- >8 --
Subject: [PATCH] config_buf_ungetc: warn when pushing back a random character
Our config code simulates a stdio stream around a buffer,
but our fake ungetc() does not behave quite like the real
one. In particular, we only rewind the position by one
character, but do _not_ actually put the character from the
caller into position.
It turns out that this does not matter, because we only ever
push back the character we just read. In other words, such
an assignment would be a noop. But because the function is
called ungetc, and because it takes a character parameter,
it is a mistake waiting to happen.
Actually assigning the character into the buffer would be
ideal, but our pointer is actually a "const" copy of the
buffer. We do not know who the real owner of the buffer is
in this code, and would not want to munge their contents.
Instead, we can simply add an assertion that matches what
the current caller does, and will let us know if new callers
are added that violate the contract.
Signed-off-by: Jeff King <peff@peff.net>
---
config.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/config.c b/config.c
index 2c63099..b74cc47 100644
--- a/config.c
+++ b/config.c
@@ -73,8 +73,12 @@ static int config_buf_fgetc(struct config_source *conf)
 
 static int config_buf_ungetc(int c, struct config_source *conf)
 {
-	if (conf->u.buf.pos > 0)
-		return conf->u.buf.buf[--conf->u.buf.pos];
+	if (conf->u.buf.pos > 0) {
+		conf->u.buf.pos--;
+		if (conf->u.buf.buf[conf->u.buf.pos] != c)
+			die("BUG: config_buf can only ungetc the same character");
+		return c;
+	}
 
 	return EOF;
 }
--
2.3.0.rc1.287.g761fd19
* Re: [PATCH] config: do not ungetc EOF
2015-02-05 21:00 ` Jeff King
@ 2015-02-05 21:16 ` Junio C Hamano
2015-02-05 21:28 ` Jeff King
2015-02-08 21:13 ` Heiko Voigt
1 sibling, 1 reply; 6+ messages in thread
From: Junio C Hamano @ 2015-02-05 21:16 UTC
To: Jeff King; +Cc: git, Heiko Voigt
Jeff King <peff@peff.net> writes:
> On Thu, Feb 05, 2015 at 01:53:27AM -0500, Jeff King wrote:
>
>> I also notice that config_buf_ungetc does not actually ungetc the
>> character we give it; it just rewinds one character in the stream. This
>> is fine, because we always feed the last-retrieved character. I dunno if
>> it is worth fixing (it also would have fixed this infinite loop, but for
>> the wrong reason; we would have stuck "-1" back into the stream, and
>> retrieved it on the next fgetc rather than the same '\r' over and over).
>
> Here's a patch to deal with that. I'm not sure if it's worth doing or
> not.
I am not sure, either. If this were to become a stdio emulator over
random in-core data used throughout the system, perhaps.
But in its current form it is tied to the implementation of config.c
very strongly, so...
> -- >8 --
> Subject: [PATCH] config_buf_ungetc: warn when pushing back a random character
>
> Our config code simulates a stdio stream around a buffer,
> but our fake ungetc() does not behave quite like the real
> one. In particular, we only rewind the position by one
> character, but do _not_ actually put the character from the
> caller into position.
>
> It turns out that this does not matter, because we only ever
> push back the character we just read. In other words, such
> an assignment would be a noop. But because the function is
> called ungetc, and because it takes a character parameter,
> it is a mistake waiting to happen.
>
> Actually assigning the character into the buffer would be
> ideal, but our pointer is actually a "const" copy of the
> buffer. We do not know who the real owner of the buffer is
> in this code, and would not want to munge their contents.
>
> Instead, we can simply add an assertion that matches what
> the current caller does, and will let us know if new callers
> are added that violate the contract.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> config.c | 8 ++++++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/config.c b/config.c
> index 2c63099..b74cc47 100644
> --- a/config.c
> +++ b/config.c
> @@ -73,8 +73,12 @@ static int config_buf_fgetc(struct config_source *conf)
>
>  static int config_buf_ungetc(int c, struct config_source *conf)
>  {
> -	if (conf->u.buf.pos > 0)
> -		return conf->u.buf.buf[--conf->u.buf.pos];
> +	if (conf->u.buf.pos > 0) {
> +		conf->u.buf.pos--;
> +		if (conf->u.buf.buf[conf->u.buf.pos] != c)
> +			die("BUG: config_buf can only ungetc the same character");
> +		return c;
> +	}
>
>  	return EOF;
>  }
* Re: [PATCH] config: do not ungetc EOF
2015-02-05 21:16 ` Junio C Hamano
@ 2015-02-05 21:28 ` Jeff King
2015-02-08 21:22 ` Heiko Voigt
0 siblings, 1 reply; 6+ messages in thread
From: Jeff King @ 2015-02-05 21:28 UTC
To: Junio C Hamano; +Cc: git, Heiko Voigt
On Thu, Feb 05, 2015 at 01:16:36PM -0800, Junio C Hamano wrote:
> Jeff King <peff@peff.net> writes:
>
> > On Thu, Feb 05, 2015 at 01:53:27AM -0500, Jeff King wrote:
> >
> >> I also notice that config_buf_ungetc does not actually ungetc the
> >> character we give it; it just rewinds one character in the stream. This
> >> is fine, because we always feed the last-retrieved character. I dunno if
> >> it is worth fixing (it also would have fixed this infinite loop, but for
> >> the wrong reason; we would have stuck "-1" back into the stream, and
> >> retrieved it on the next fgetc rather than the same '\r' over and over).
> >
> > Here's a patch to deal with that. I'm not sure if it's worth doing or
> > not.
>
> I am not sure, either. If this were to become a stdio emulator over
> random in-core data used throughout the system, perhaps.
>
> But in its current form it is tied to the implementation of config.c
> very strongly, so...
Yeah, that was my thinking, and why I have doubts. Maybe a comment would
make more sense, like the patch below. I am also OK with just leaving
it as-is.
-- >8 --
Subject: [PATCH] config_buf_ungetc: document quirks in a comment
Our config_buf_ungetc implements just enough for the config
code to work. That's OK, but we would not want anyone to
mistakenly move it elsewhere as a general purpose ungetc.
Signed-off-by: Jeff King <peff@peff.net>
---
config.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/config.c b/config.c
index 2c63099..089a94f 100644
--- a/config.c
+++ b/config.c
@@ -71,6 +71,12 @@ static int config_buf_fgetc(struct config_source *conf)
 	return EOF;
 }
 
+/*
+ * Note that this is not a real ungetc replacement. It only rewinds
+ * the position, and ignores the "c" parameter, rather than
+ * putting it into our (const) buffer. That's good enough for
+ * the callers here, though.
+ */
 static int config_buf_ungetc(int c, struct config_source *conf)
 {
 	if (conf->u.buf.pos > 0)
--
2.3.0.rc1.287.g761fd19
* Re: [PATCH] config: do not ungetc EOF
2015-02-05 21:00 ` Jeff King
2015-02-05 21:16 ` Junio C Hamano
@ 2015-02-08 21:13 ` Heiko Voigt
1 sibling, 0 replies; 6+ messages in thread
From: Heiko Voigt @ 2015-02-08 21:13 UTC
To: Jeff King; +Cc: git, Junio C Hamano
On Thu, Feb 05, 2015 at 04:00:24PM -0500, Jeff King wrote:
> On Thu, Feb 05, 2015 at 01:53:27AM -0500, Jeff King wrote:
>
> > I also notice that config_buf_ungetc does not actually ungetc the
> > character we give it; it just rewinds one character in the stream. This
> > is fine, because we always feed the last-retrieved character. I dunno if
> > it is worth fixing (it also would have fixed this infinite loop, but for
> > the wrong reason; we would have stuck "-1" back into the stream, and
> > retrieved it on the next fgetc rather than the same '\r' over and over).
>
> Here's a patch to deal with that. I'm not sure if it's worth doing or
> not.
>
> -- >8 --
> Subject: [PATCH] config_buf_ungetc: warn when pushing back a random character
Thanks for noticing and fixing both. I think it is worth adding this
assertion. If someone in the future comes along and uses our fake
ungetc() wrong, it might save some trouble figuring out what's wrong.
Cheers Heiko
* Re: [PATCH] config: do not ungetc EOF
2015-02-05 21:28 ` Jeff King
@ 2015-02-08 21:22 ` Heiko Voigt
0 siblings, 0 replies; 6+ messages in thread
From: Heiko Voigt @ 2015-02-08 21:22 UTC
To: Jeff King; +Cc: Junio C Hamano, git
On Thu, Feb 05, 2015 at 04:28:47PM -0500, Jeff King wrote:
> On Thu, Feb 05, 2015 at 01:16:36PM -0800, Junio C Hamano wrote:
>
> > Jeff King <peff@peff.net> writes:
> >
> > > On Thu, Feb 05, 2015 at 01:53:27AM -0500, Jeff King wrote:
> > >
> > >> I also notice that config_buf_ungetc does not actually ungetc the
> > >> character we give it; it just rewinds one character in the stream. This
> > >> is fine, because we always feed the last-retrieved character. I dunno if
> > >> it is worth fixing (it also would have fixed this infinite loop, but for
> > >> the wrong reason; we would have stuck "-1" back into the stream, and
> > >> retrieved it on the next fgetc rather than the same '\r' over and over).
> > >
> > > Here's a patch to deal with that. I'm not sure if it's worth doing or
> > > not.
> >
> > > I am not sure, either. If this were to become a stdio emulator over
> > random in-core data used throughout the system, perhaps.
> >
> > But in its current form it is tied to the implementation of config.c
> > very strongly, so...
>
> Yeah, that was my thinking, and why I have doubts. Maybe a comment would
> make more sense, like the patch below. I am also OK with just leaving
> it as-is.
>
> -- >8 --
> Subject: [PATCH] config_buf_ungetc: document quirks in a comment
I think a comment would be fine as well. Either helps to quickly find
the reason why our ungetc() might not behave as the caller expects. But
I think one of the two would be good, so that we document that this
behavior is in fact intentional.
Cheers Heiko