From: "Carlos Rica" <jasampler@gmail.com>
To: "Junio C Hamano" <gitster@pobox.com>
Cc: git@vger.kernel.org,
"Johannes Schindelin" <Johannes.Schindelin@gmx.de>,
"Kristian Høgsberg" <krh@redhat.com>
Subject: Re: [PATCH 1/2] Function stripspace now gets a buffer instead file descriptors.
Date: Thu, 12 Jul 2007 02:14:53 +0200
Message-ID: <1b46aba20707111714v63bf921dh15000e2629bcf260@mail.gmail.com>
In-Reply-To: <7vd4yy1svw.fsf@assigned-by-dhcp.cox.net>
2007/7/12, Junio C Hamano <gitster@pobox.com>:
> Carlos Rica <jasampler@gmail.com> writes:
> > @@ -28,52 +26,67 @@ static int cleanup(char *line, int len)
> > * Remove empty lines from the beginning and end
> > * and also trailing spaces from every line.
> > *
> > + * Note that the buffer will not be null-terminated.
> > + *
>
> The name of the sentinel character '\0' is NUL, not null (which
> is a different word, used to call a pointer that points
> nowhere). The buffer will not be "NUL-terminated".
Thank you, Junio. I will use that term in the future.
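To make the distinction concrete, here is a minimal sketch of what it means for callers that the buffer is not NUL-terminated: the length has to travel with the pointer, and a NUL byte must be added explicitly before any str*() function is used. The helper name dup_with_nul is hypothetical and not part of the patch.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the patch: copy a buffer that is
 * NOT NUL-terminated into a fresh allocation and append the NUL
 * character ('\0') so it can be used as a C string.
 * Returns a null pointer (NULL) if the allocation fails.
 */
static char *dup_with_nul(const char *buf, size_t len)
{
	char *copy = malloc(len + 1);

	if (!copy)
		return NULL;          /* null: a pointer that points nowhere */
	memcpy(copy, buf, len);
	copy[len] = '\0';             /* NUL: the sentinel character */
	return copy;
}
```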
> > int cmd_stripspace(int argc, const char **argv, const char *prefix)
> > {
> > - stripspace(stdin, stdout, 0);
> > + char *buffer;
> > + unsigned long size;
> > +
> > + size = 1024;
> > + buffer = xmalloc(size);
> > + if (read_pipe(0, &buffer, &size))
> > + die("could not read the input");
>
> The command used to be capable of streaming and filtering a few
> hundred gigabytes of text on a machine with small address space,
> as it operated one line at a time, but now it cannot as it has
> to hold everything in core before starting.
>
> I do not think we miss that loss of capability too much, but I
> wonder if we can be a bit more clever about it, perhaps feeding
> a chunk at a time. Not a very strong request, but just
> wondering if it is an easy change.
I made those changes because I needed the tests that I had written
earlier in order to develop the function. With that done, we can now
restore the previous version of the function that works on file
descriptors, so that it can again filter a few hundred gigabytes of
text, provided that the text does not contain very long lines.
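A minimal sketch of such a line-at-a-time filter follows. It is not the code from the patch: the name stream_stripspace is hypothetical, it relies on POSIX getline(), and it collapses runs of empty lines into a single one, which I am assuming is acceptable here. Memory use is bounded by the longest line, which is why very long lines remain the limiting case.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Length of line[0..len) once trailing whitespace and the newline are dropped. */
static size_t trimmed_length(const char *line, size_t len)
{
	while (len > 0) {
		char c = line[len - 1];

		if (c != ' ' && c != '\t' && c != '\n' && c != '\r')
			break;
		len--;
	}
	return len;
}

/*
 * Hypothetical streaming filter: copy "in" to "out" one line at a time,
 * removing trailing whitespace, dropping empty lines at the beginning
 * and end, and collapsing runs of empty lines into one.
 * Returns 0 on success, -1 on an I/O error.
 */
static int stream_stripspace(FILE *in, FILE *out)
{
	char *line = NULL;
	size_t cap = 0;
	ssize_t got;
	int seen_text = 0;      /* has a non-empty line been written yet? */
	int pending_blank = 0;  /* is an empty line waiting for more text? */

	while ((got = getline(&line, &cap, in)) != -1) {
		size_t len = trimmed_length(line, (size_t)got);

		if (len == 0) {
			/* Remember it only if text came before; emit it later. */
			pending_blank = seen_text;
			continue;
		}
		if (pending_blank)
			fputc('\n', out);
		pending_blank = 0;
		seen_text = 1;
		fwrite(line, 1, len, out);
		fputc('\n', out);
	}
	free(line);
	return (ferror(in) || ferror(out)) ? -1 : 0;
}
```

Because an empty line is only written once a later non-empty line shows up, trailing empty lines are never emitted and nothing needs to be buffered beyond the current line.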
Indeed, the code that composes a tag (header, cleaned message and
optional signature) in "builtin-tag.c" now passes it to the function
write_sha1_file as an in-memory buffer, so it won't support sizes
larger than the memory available on the system. Messages should not
be that big, but I don't know how to limit them.
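One possible way to limit them, sketched below under my own assumptions, is to stop reading as soon as the input grows past a fixed cap. The function read_limited and its "max" parameter are hypothetical; this is not the read_pipe function used in the patch, and where such a limit should come from is left open. Note that the returned buffer is not NUL-terminated.

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: read everything from fd into a malloc'd buffer,
 * but give up as soon as more than "max" bytes have arrived.
 * On success returns 0 and stores the buffer (not NUL-terminated) and
 * its length; on a read error, allocation failure, or when the limit
 * is exceeded, returns -1 and frees the buffer.
 */
static int read_limited(int fd, char **out, size_t *out_len, size_t max)
{
	size_t cap = 1024, len = 0;
	char *buf = malloc(cap);

	if (!buf)
		return -1;
	for (;;) {
		ssize_t n;

		if (len == cap) {
			/* Buffer is full: double it before reading more. */
			char *tmp = realloc(buf, cap * 2);

			if (!tmp) {
				free(buf);
				return -1;
			}
			buf = tmp;
			cap *= 2;
		}
		n = read(fd, buf + len, cap - len);
		if (n < 0) {
			if (errno == EINTR)
				continue;   /* interrupted: retry the read */
			free(buf);
			return -1;
		}
		if (n == 0)
			break;              /* end of input */
		len += (size_t)n;
		if (len > max) {
			free(buf);          /* message too big: refuse it */
			return -1;
		}
	}
	*out = buf;
	*out_len = len;
	return 0;
}
```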
Thread overview: 7+ messages
2007-07-11 18:50 [PATCH 1/2] Function stripspace now gets a buffer instead file descriptors Carlos Rica
2007-07-11 19:17 ` Johannes Schindelin
2007-07-11 22:24 ` Junio C Hamano
2007-07-11 23:20 ` Bill Lear
2007-07-11 23:41 ` Carlos Rica
2007-07-12 0:03 ` Junio C Hamano
2007-07-12 0:14 ` Carlos Rica [this message]