From: Junio C Hamano <gitster@pobox.com>
To: "Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH 02/11] Factor out and export large blob writing code to arbitrary file handle
Date: Mon, 27 Feb 2012 13:50:10 -0800
Message-ID: <7vaa4454kt.fsf@alter.siamese.dyndns.org>
In-Reply-To: <7v4nucb2xl.fsf@alter.siamese.dyndns.org> (Junio C. Hamano's message of "Mon, 27 Feb 2012 09:29:10 -0800")

Junio C Hamano <gitster@pobox.com> writes:

> So I think the external declaration and the definition should move to a
> more generic place, namely streaming.[ch]. It does not belong to entry.c
> anymore.
>
> Thanks for working on this.

In other words, I think the result should look more like this.

The original logic in entry.c is that the caller should try to get a
filter and call streaming_write_entry(), but either of them is allowed to
return a failure when the blob is not suitable for the streaming codepath
to tell the caller to try their traditional codepath.

We might want to add another helper function for callers to use to decide
if they should use the streaming interface, or the traditional one, before
actually making a call to streaming_write_entry(). With the original (and
current) API, they have to retry even when the streaming codepath truly
failed (e.g. no such blob object), in which case it is very likely that
the traditional codepath in the caller will fail the same way. Retrying is
a wasted effort in such a case.

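A minimal sketch of what such a pre-check could look like follows. Everything in it is hypothetical and not part of the patch: the names `sketch_type` and `want_streaming()`, and the policy that only blobs at or above a size threshold are streamed, are assumptions made purely for illustration. The point is only that the caller learns up front which codepath to take, so a later failure from the streaming call can be treated as a real error instead of a cue to retry.

```c
/* Hypothetical object kinds; git's real enum object_type has more members. */
enum sketch_type { SKETCH_BLOB, SKETCH_OTHER };

/*
 * Hypothetical pre-check: return 1 if the caller should use the
 * streaming interface, 0 if it should go straight to its traditional
 * codepath.  The "blob at or above threshold" policy is an assumption.
 */
static int want_streaming(enum sketch_type type, unsigned long size,
			  unsigned long threshold)
{
	if (type != SKETCH_BLOB)
		return 0;
	return size >= threshold;
}
```

A caller would test `want_streaming()` first and fall back to the traditional codepath only when it returns 0.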
-- >8 --
Subject: [PATCH] streaming: make streaming-write-entry to be more reusable

The static function in entry.c takes a cache entry and streams its blob
contents to a file in the working tree. Refactor the logic to a new API
function stream_blob_to_fd() that takes an object name and an open file
descriptor, so that it can be reused by other callers.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 entry.c     |   53 +++++------------------------------------------------
 streaming.c |   55 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 streaming.h |    2 ++
 3 files changed, 62 insertions(+), 48 deletions(-)

diff --git a/entry.c b/entry.c
index 852fea1..17a6bcc 100644
--- a/entry.c
+++ b/entry.c
@@ -120,58 +120,15 @@ static int streaming_write_entry(struct cache_entry *ce, char *path,
 				 const struct checkout *state, int to_tempfile,
 				 int *fstat_done, struct stat *statbuf)
 {
-	struct git_istream *st;
-	enum object_type type;
-	unsigned long sz;
 	int result = -1;
-	ssize_t kept = 0;
-	int fd = -1;
-
-	st = open_istream(ce->sha1, &type, &sz, filter);
-	if (!st)
-		return -1;
-	if (type != OBJ_BLOB)
-		goto close_and_exit;
+	int fd;
 
 	fd = open_output_fd(path, ce, to_tempfile);
-	if (fd < 0)
-		goto close_and_exit;
-
-	for (;;) {
-		char buf[1024 * 16];
-		ssize_t wrote, holeto;
-		ssize_t readlen = read_istream(st, buf, sizeof(buf));
-
-		if (!readlen)
-			break;
-		if (sizeof(buf) == readlen) {
-			for (holeto = 0; holeto < readlen; holeto++)
-				if (buf[holeto])
-					break;
-			if (readlen == holeto) {
-				kept += holeto;
-				continue;
-			}
-		}
-
-		if (kept && lseek(fd, kept, SEEK_CUR) == (off_t) -1)
-			goto close_and_exit;
-		else
-			kept = 0;
-		wrote = write_in_full(fd, buf, readlen);
-
-		if (wrote != readlen)
-			goto close_and_exit;
-	}
-	if (kept && (lseek(fd, kept - 1, SEEK_CUR) == (off_t) -1 ||
-		     write(fd, "", 1) != 1))
-		goto close_and_exit;
-	*fstat_done = fstat_output(fd, state, statbuf);
-
-close_and_exit:
-	close_istream(st);
-	if (0 <= fd)
+	if (0 <= fd) {
+		result = stream_blob_to_fd(fd, ce->sha1, filter, 1);
+		*fstat_done = fstat_output(fd, state, statbuf);
 		result = close(fd);
+	}
 	if (result && 0 <= fd)
 		unlink(path);
 	return result;
diff --git a/streaming.c b/streaming.c
index 71072e1..7e7ee2b 100644
--- a/streaming.c
+++ b/streaming.c
@@ -489,3 +489,58 @@ static open_method_decl(incore)
 
 	return st->u.incore.buf ? 0 : -1;
 }
+
+
+/****************************************************************
+ * Users of streaming interface
+ ****************************************************************/
+
+int stream_blob_to_fd(int fd, unsigned const char *sha1, struct stream_filter *filter,
+		      int can_seek)
+{
+	struct git_istream *st;
+	enum object_type type;
+	unsigned long sz;
+	ssize_t kept = 0;
+	int result = -1;
+
+	st = open_istream(sha1, &type, &sz, filter);
+	if (!st)
+		return result;
+	if (type != OBJ_BLOB)
+		goto close_and_exit;
+	for (;;) {
+		char buf[1024 * 16];
+		ssize_t wrote, holeto;
+		ssize_t readlen = read_istream(st, buf, sizeof(buf));
+
+		if (!readlen)
+			break;
+		if (can_seek && sizeof(buf) == readlen) {
+			for (holeto = 0; holeto < readlen; holeto++)
+				if (buf[holeto])
+					break;
+			if (readlen == holeto) {
+				kept += holeto;
+				continue;
+			}
+		}
+
+		if (kept && lseek(fd, kept, SEEK_CUR) == (off_t) -1)
+			goto close_and_exit;
+		else
+			kept = 0;
+		wrote = write_in_full(fd, buf, readlen);
+
+		if (wrote != readlen)
+			goto close_and_exit;
+	}
+	if (kept && (lseek(fd, kept - 1, SEEK_CUR) == (off_t) -1 ||
+		     write(fd, "", 1) != 1))
+		goto close_and_exit;
+	result = 0;
+
+ close_and_exit:
+	close_istream(st);
+	return result;
+}
diff --git a/streaming.h b/streaming.h
index 589e857..3e82770 100644
--- a/streaming.h
+++ b/streaming.h
@@ -12,4 +12,6 @@ extern struct git_istream *open_istream(const unsigned char *, enum object_type
 extern int close_istream(struct git_istream *);
 extern ssize_t read_istream(struct git_istream *, char *, size_t);
 
+extern int stream_blob_to_fd(int fd, const unsigned char *, struct stream_filter *, int can_seek);
+
 #endif /* STREAMING_H */
--
1.7.9.2.312.g1abc3