From mboxrd@z Thu Jan 1 00:00:00 1970 From: David Turner Subject: [PATCH v4 07/16] read-cache: add watchman 'WAMA' extension Date: Tue, 12 Apr 2016 20:33:00 -0400 Message-ID: <1460507589-25525-8-git-send-email-dturner@twopensource.com> References: <1460507589-25525-1-git-send-email-dturner@twopensource.com> Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: QUOTED-PRINTABLE Cc: =?UTF-8?q?Nguy=E1=BB=85n=20Th=C3=A1i=20Ng=E1=BB=8Dc=20Duy?= , David Turner To: git@vger.kernel.org, pclouds@gmail.co, =?UTF-8?q?=C3=86var=20Arnfj=C3=B6r=C3=B0=20Bjarmason?= , Ramsay Jones X-From: git-owner@vger.kernel.org Wed Apr 13 02:33:43 2016 Return-path: Envelope-to: gcvg-git-2@plane.gmane.org Received: from vger.kernel.org ([209.132.180.67]) by plane.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1aq8kX-0002jC-A3 for gcvg-git-2@plane.gmane.org; Wed, 13 Apr 2016 02:33:41 +0200 Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1758933AbcDMAdi convert rfc822-to-quoted-printable (ORCPT ); Tue, 12 Apr 2016 20:33:38 -0400 Received: from mail-qk0-f177.google.com ([209.85.220.177]:35574 "EHLO mail-qk0-f177.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1758903AbcDMAd3 (ORCPT ); Tue, 12 Apr 2016 20:33:29 -0400 Received: by mail-qk0-f177.google.com with SMTP id o6so13626632qkc.2 for ; Tue, 12 Apr 2016 17:33:29 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=twopensource-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=WxudQDjkTqm9Nhgne5oZN7qKEmyKStEEi6ciI9xMdEY=; b=J+Mu2ejIaOG2wxknGz40e1qCW1WcU+D/hgfsRo/dKTJUqnbpyH8/4SA51G5RWb4ViO wWiZ2z/DFhLnXEb+juRWtagBfDd6c/Vi1E/GpAwJyw0Vu79Zb5Fdae24SXCl9Xy1bprI fbXthJYWSckoDJdwxF7d49DF0FnRtZEBfHB597PoZpkLFvQrwm9TtwJhYuigZxVvNyjK wDu+q/+4KEBOLp0rjjZSRY7J2o/tOTDLSreM4Gm3FO5OYnMzIJaX0xKyqpTia4wnzbzd qNovU1gdc3AO4AYRKnggcPUhM2YrlpXjOELRsbNJUrqSYp4HWMof5cv4K15c9M+WuSNM ahWA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=WxudQDjkTqm9Nhgne5oZN7qKEmyKStEEi6ciI9xMdEY=; b=mET72bFpOnS/b7csp04JYwOiW0D9UxJQTCGnAmC2A0piDXxi7C279scLLDxR3HZF13 f8lda+UtDFJ3rIijBz/58Uxxv9SIWvyHbU+cvktyAAtHvzRp12b+DIQhKGmu2HB3TgNI QiNPG5YTLIURKVudLc2IyEgT3Odz1W/xhW0odVGZedatknf5rCXTZHm2nFKK4GIaPYsX veHnHYGchaZZOv01iGgdSN2cHI53ZTBg5M5QuSCgtYzK4G7Ejtwu4Prgv6iz94NmxXz1 xYHoOtBDZbr03myo8+Be0GJXcWgzGZaFFAIQeJ62Td6GG+ZTejEvScY0JLtFSBJzeu9+ Dszw== X-Gm-Message-State: AOPr4FV+a3WGx0i9C1kos+K++i/tG5CI1uUCc6eXU8MHzmg69oZfLDZgj/C8uVMfp/IVIQ== X-Received: by 10.55.73.74 with SMTP id w71mr7360963qka.60.1460507608528; Tue, 12 Apr 2016 17:33:28 -0700 (PDT) Received: from ubuntu.twitter.biz ([192.133.79.145]) by smtp.gmail.com with ESMTPSA id v65sm14677604qhc.6.2016.04.12.17.33.27 (version=TLSv1/SSLv3 cipher=OTHER); Tue, 12 Apr 2016 17:33:27 -0700 (PDT) X-Mailer: git-send-email 2.4.2.767.g62658d5-twtrsrc In-Reply-To: <1460507589-25525-1-git-send-email-dturner@twopensource.com> Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Archived-At: =46rom: Nguy=E1=BB=85n Th=C3=A1i Ng=E1=BB=8Dc Duy The extension contains a bitmap, one bit for each entry in the index. If the n-th bit is zero, the n-th entry is considered unchanged, we can ce_mark_uptodate() it without refreshing. If the bit is non-zero and we found out the corresponding file is clean after refresh, we can clear the bit. In addition, there's a list of directories in the untracked-cache to invalidate (because they have new or modified entries). The 'skipping refresh' bit is not in this patch yet as we would need watchman. More details in later patches. Signed-off-by: Nguy=E1=BB=85n Th=C3=A1i Ng=E1=BB=8Dc Duy Signed-off-by: David Turner --- cache.h | 4 ++ dir.h | 3 ++ read-cache.c | 117 +++++++++++++++++++++++++++++++++++++++++++++++++++= +++++++- 3 files changed, 122 insertions(+), 2 deletions(-) diff --git a/cache.h b/cache.h index 0aeb994..f4f7eef 100644 --- a/cache.h +++ b/cache.h @@ -182,6 +182,8 @@ struct cache_entry { #define CE_VALID (0x8000) #define CE_STAGESHIFT 12 =20 +#define CE_WATCHMAN_DIRTY (0x0001) + /* * Range 0xFFFF0FFF in ce_flags is divided into * two parts: in-memory flags and on-disk ones. @@ -320,6 +322,7 @@ static inline unsigned int canon_mode(unsigned int = mode) #define CACHE_TREE_CHANGED (1 << 5) #define SPLIT_INDEX_ORDERED (1 << 6) #define UNTRACKED_CHANGED (1 << 7) +#define WATCHMAN_CHANGED (1 << 8) =20 struct split_index; struct untracked_cache; @@ -344,6 +347,7 @@ struct index_state { struct untracked_cache *untracked; void *mmap; size_t mmap_size; + char *last_update; }; =20 extern struct index_state the_index; diff --git a/dir.h b/dir.h index 3ec3fb0..3d540de 100644 --- a/dir.h +++ b/dir.h @@ -142,6 +142,9 @@ struct untracked_cache { int gitignore_invalidated; int dir_invalidated; int dir_opened; + /* watchman invalidation data */ + unsigned int use_watchman : 1; + struct string_list invalid_untracked; }; =20 struct dir_struct { diff --git a/read-cache.c b/read-cache.c index e7d2688..19b0c85 100644 --- a/read-cache.c +++ b/read-cache.c @@ -19,6 +19,7 @@ #include "split-index.h" #include "utf8.h" #include "unix-socket.h" +#include "ewah/ewok.h" =20 static struct cache_entry *refresh_cache_entry(struct cache_entry *ce, unsigned int options); @@ -41,11 +42,13 @@ static struct cache_entry *refresh_cache_entry(stru= ct cache_entry *ce, #define CACHE_EXT_RESOLVE_UNDO 0x52455543 /* "REUC" */ #define CACHE_EXT_LINK 0x6c696e6b /* "link" */ #define CACHE_EXT_UNTRACKED 0x554E5452 /* "UNTR" */ +#define CACHE_EXT_WATCHMAN 0x57414D41 /* "WAMA" */ =20 /* changes that can be kept in $GIT_DIR/index (basically all extension= s) */ #define EXTMASK (RESOLVE_UNDO_CHANGED | CACHE_TREE_CHANGED | \ CE_ENTRY_ADDED | CE_ENTRY_REMOVED | CE_ENTRY_CHANGED | \ - SPLIT_INDEX_ORDERED | UNTRACKED_CHANGED) + SPLIT_INDEX_ORDERED | UNTRACKED_CHANGED | \ + WATCHMAN_CHANGED) =20 struct index_state the_index; static const char *alternate_index_output; @@ -1221,8 +1224,13 @@ int refresh_index(struct index_state *istate, un= signed int flags, continue; =20 new =3D refresh_cache_ent(istate, ce, options, &cache_errno, &change= d); - if (new =3D=3D ce) + if (new =3D=3D ce) { + if (ce->ce_flags & CE_WATCHMAN_DIRTY) { + ce->ce_flags &=3D ~CE_WATCHMAN_DIRTY; + istate->cache_changed |=3D WATCHMAN_CHANGED; + } continue; + } if (!new) { const char *fmt; =20 @@ -1366,6 +1374,94 @@ static int verify_hdr(const struct cache_header = *hdr, unsigned long size) return 0; } =20 +static void mark_no_watchman(size_t pos, void *data) +{ + struct index_state *istate =3D data; + assert(pos < istate->cache_nr); + istate->cache[pos]->ce_flags |=3D CE_WATCHMAN_DIRTY; +} + +static int read_watchman_ext(struct index_state *istate, const void *d= ata, + unsigned long sz) +{ + struct ewah_bitmap *bitmap; + int ret, len; + uint32_t bitmap_size; + uint32_t untracked_nr; + + if (memchr(data, 0, sz) =3D=3D NULL) + return error("invalid extension"); + + len =3D strlen(data) + 1; + memcpy(&bitmap_size, (const char *)data + len, 4); + memcpy(&untracked_nr, (const char *)data + len + 4, 4); + untracked_nr =3D ntohl(untracked_nr); + bitmap_size =3D ntohl(bitmap_size); + + bitmap =3D ewah_new(); + ret =3D ewah_read_mmap(bitmap, (const char *)data + len + 8, bitmap_s= ize); + if (ret !=3D bitmap_size) { + ewah_free(bitmap); + return error("failed to parse ewah bitmap reading watchman index ext= ension"); + } + istate->last_update =3D xstrdup(data); + ewah_each_bit(bitmap, mark_no_watchman, istate); + ewah_free(bitmap); + + /* + * TODO: update the untracked cache from the untracked data in this + * extension. + */ + return 0; +} + +static int untracked_entry_append(struct string_list_item *item, void = *sbvoid) +{ + struct strbuf *sb =3D sbvoid; + + strbuf_addstr(sb, item->string); + strbuf_addch(sb, 0); + return 0; +} + +void write_watchman_ext(struct strbuf *sb, struct index_state* istate) +{ + struct ewah_bitmap *bitmap; + int i; + int ewah_start; + int ewah_size =3D 0; + int fixup =3D 0; + + strbuf_add(sb, istate->last_update, strlen(istate->last_update) + 1); + fixup =3D sb->len; + strbuf_add(sb, &ewah_size, 4); /* we'll fix this up later */ + if (istate->untracked) { + uint32_t nr =3D istate->untracked->invalid_untracked.nr; + nr =3D htonl(nr); + strbuf_add(sb, &nr, 4); + } else { + /* zero */ + strbuf_add(sb, &ewah_size, 4); + } + + ewah_start =3D sb->len; + bitmap =3D ewah_new(); + for (i =3D 0; i < istate->cache_nr; i++) + if (istate->cache[i]->ce_flags & CE_WATCHMAN_DIRTY) + ewah_set(bitmap, i); + ewah_serialize_strbuf(bitmap, sb); + ewah_free(bitmap); + + /* fix up size field */ + ewah_size =3D sb->len - ewah_start; + ewah_size =3D htonl(ewah_size); + memcpy(sb->buf + fixup, &ewah_size, 4); + + if (istate->untracked) + for_each_string_list(&istate->untracked->invalid_untracked, + untracked_entry_append, sb); +} + static int read_index_extension(struct index_state *istate, const char *ext, void *data, unsigned long sz) { @@ -1383,6 +1479,11 @@ static int read_index_extension(struct index_sta= te *istate, case CACHE_EXT_UNTRACKED: istate->untracked =3D read_untracked_extension(data, sz); break; + + case CACHE_EXT_WATCHMAN: + read_watchman_ext(istate, data, sz); + break; + default: if (*ext < 'A' || 'Z' < *ext) return error("index uses %.4s extension, which we do not understand= ", @@ -1864,6 +1965,8 @@ int discard_index(struct index_state *istate) istate->untracked =3D NULL; istate->from_shm =3D 0; istate->to_shm =3D 0; + free(istate->last_update); + istate->last_update =3D NULL; return 0; } =20 @@ -2261,6 +2364,16 @@ static int do_write_index(struct index_state *is= tate, int newfd, if (err) return -1; } + if (!strip_extensions && istate->last_update) { + struct strbuf sb =3D STRBUF_INIT; + + write_watchman_ext(&sb, istate); + err =3D write_index_ext_header(&c, newfd, CACHE_EXT_WATCHMAN, sb.len= ) < 0 + || ce_write(&c, newfd, sb.buf, sb.len) < 0; + strbuf_release(&sb); + if (err) + return -1; + } =20 if (ce_flush(&c, newfd, istate->sha1) || fstat(newfd, &st)) return -1; --=20 2.4.2.767.g62658d5-twtrsrc