git.vger.kernel.org archive mirror
From: Junio C Hamano <gitster@pobox.com>
To: git@vger.kernel.org
Subject: [PATCH v3 09/12] streaming_write_entry(): support files with holes
Date: Fri, 20 May 2011 23:56:32 -0700
Message-ID: <1305960995-25738-10-git-send-email-gitster@pobox.com>
In-Reply-To: <1305960995-25738-1-git-send-email-gitster@pobox.com>

One typical use of a large binary file is to hold a sparse on-disk hash
table with a lot of holes. Help preserve the holes by lseek()ing over
full buffers of NUL bytes instead of writing them out.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 entry.c |   21 +++++++++++++++++++--
 1 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/entry.c b/entry.c
index da37d01..e2dc16c 100644
--- a/entry.c
+++ b/entry.c
@@ -123,6 +123,7 @@ static int streaming_write_entry(struct cache_entry *ce, char *path,
 	enum object_type type;
 	unsigned long sz;
 	int result = -1;
+	ssize_t kept = 0;
 	int fd = -1;
 
 	st = open_istream(ce->sha1, &type, &sz);
@@ -136,18 +137,34 @@ static int streaming_write_entry(struct cache_entry *ce, char *path,
 		goto close_and_exit;
 
 	for (;;) {
-		char buf[10240];
-		ssize_t wrote;
+		char buf[1024 * 16];
+		ssize_t wrote, holeto;
 		ssize_t readlen = read_istream(st, buf, sizeof(buf));
 
 		if (!readlen)
 			break;
+		if (sizeof(buf) == readlen) {
+			for (holeto = 0; holeto < readlen; holeto++)
+				if (buf[holeto])
+					break;
+			if (readlen == holeto) {
+				kept += holeto;
+				continue;
+			}
+		}
 
+		if (kept && lseek(fd, kept, SEEK_CUR) == (off_t) -1)
+			goto close_and_exit;
+		else
+			kept = 0;
 		wrote = write_in_full(fd, buf, readlen);
 
 		if (wrote != readlen)
 			goto close_and_exit;
 	}
+	if (kept && (lseek(fd, kept - 1, SEEK_CUR) == (off_t) -1 ||
+		     write(fd, "", 1) != 1))
+		goto close_and_exit;
 	*fstat_done = fstat_output(fd, state, statbuf);
 
 close_and_exit:
-- 
1.7.5.2.369.g8fc017

Thread overview: 13+ messages
2011-05-21  6:56 [PATCH v3 00/12] writing out a huge blob to working tree Junio C Hamano
2011-05-21  6:56 ` [PATCH v3 01/12] packed_object_info_detail(): do not return a string Junio C Hamano
2011-05-21  6:56 ` [PATCH v3 02/12] sha1_object_info_extended(): expose a bit more info Junio C Hamano
2011-05-21  6:56 ` [PATCH v3 03/12] sha1_object_info_extended(): hint about objects in delta-base cache Junio C Hamano
2011-05-21  6:56 ` [PATCH v3 04/12] unpack_object_header(): make it public Junio C Hamano
2011-05-21  6:56 ` [PATCH v3 05/12] write_entry(): separate two helper functions out Junio C Hamano
2011-05-21  6:56 ` [PATCH v3 06/12] streaming: a new API to read from the object store Junio C Hamano
2011-05-21  6:56 ` [PATCH v3 07/12] streaming_write_entry(): use streaming API in write_entry() Junio C Hamano
2011-05-21  6:56 ` [PATCH v3 08/12] convert: CRLF_INPUT is a no-op in the output codepath Junio C Hamano
2011-05-21  6:56 ` [PATCH v3 09/12] streaming_write_entry(): support files with holes Junio C Hamano [this message]
2011-05-21  6:56 ` [PATCH v3 10/12] streaming: read non-delta incrementally from a pack Junio C Hamano
2011-05-21  6:56 ` [PATCH v3 11/12] sha1_file.c: expose helpers to read loose objects Junio C Hamano
2011-05-21  6:56 ` [PATCH v3 12/12] streaming: read loose objects incrementally Junio C Hamano
