From: "Victoria Dye via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Eric Sunshine <sunshine@sunshineco.com>,
	Patrick Steinhardt <ps@pks.im>, Victoria Dye <vdye@github.com>,
	Victoria Dye <vdye@github.com>
Subject: [PATCH v2 14/17] mktree: add directory-file conflict hashmap
Date: Wed, 19 Jun 2024 21:58:02 +0000
Message-ID: <6f6d78ae7acb35991afbeaef9b61af892af93ca1.1718834285.git.gitgitgadget@gmail.com>
In-Reply-To: <pull.1746.v2.git.1718834285.gitgitgadget@gmail.com>

From: Victoria Dye <vdye@github.com>

Add a hashmap member to 'struct tree_entry_array' that contains all of the
(de-duplicated) provided tree entries, indexed by the hash of their path
with *no* trailing slash. A later commit will use this hashmap to avoid
adding a file to an existing tree when a directory already occupies the
same path, or vice versa.

Signed-off-by: Victoria Dye <vdye@github.com>
---
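Note (illustration only, not part of the patch): the keying scheme
described above can be sketched in standalone C. The 'sketch_*' names are
hypothetical, the 'is_dir' flag stands in for S_ISDIR(mode), and the
FNV-1a loop stands in for memhash(). The sketch also assumes that
directory names are stored with a trailing '/', which is what makes
stripping it necessary. With those assumptions, a directory "foo/" and a
file "foo" produce the same key, so they land in the same bucket and
compare as equal.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for 'struct tree_entry': only the key fields. */
struct sketch_entry {
	const char *name; /* assumption: directories end in '/' */
	size_t len;       /* length of name in bytes */
	int is_dir;       /* stand-in for S_ISDIR(mode) */
};

/* Key length: drop the trailing slash of a directory entry. */
static size_t sketch_df_path_len(const struct sketch_entry *e)
{
	assert(!e->is_dir || e->len > 0);
	return e->is_dir ? e->len - 1 : e->len;
}

/* FNV-1a over the key bytes; a stand-in for memhash(). */
static unsigned int sketch_hash(const struct sketch_entry *e)
{
	unsigned int h = 2166136261u;
	size_t n = sketch_df_path_len(e);

	for (size_t i = 0; i < n; i++) {
		h ^= (unsigned char)e->name[i];
		h *= 16777619u;
	}
	return h;
}

/* Returns 0 when the keys match, non-zero otherwise. */
static int sketch_df_cmp(const struct sketch_entry *a,
			 const struct sketch_entry *b)
{
	size_t a_len = sketch_df_path_len(a);
	size_t b_len = sketch_df_path_len(b);

	return a_len != b_len || memcmp(a->name, b->name, a_len);
}
```

Comparing the lengths first mirrors df_name_hash_cmp() in the diff below:
two keys of different length can never be the same path, so the byte
comparison only runs when the lengths agree.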
 builtin/mktree.c | 38 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 38 insertions(+)

diff --git a/builtin/mktree.c b/builtin/mktree.c
index 344c9b9b6fe..b4d71dcdd02 100644
--- a/builtin/mktree.c
+++ b/builtin/mktree.c
@@ -16,6 +16,8 @@
 #include "object-store-ll.h"
 
 struct tree_entry {
+	struct hashmap_entry ent;
+
 	/* Internal */
 	size_t order;
 
@@ -33,8 +35,33 @@ static inline size_t df_path_len(size_t pathlen, unsigned int mode)
 struct tree_entry_array {
 	size_t nr, alloc;
 	struct tree_entry **entries;
+
+	struct hashmap df_name_hash;
 };
 
+static int df_name_hash_cmp(const void *cmp_data UNUSED,
+			    const struct hashmap_entry *eptr,
+			    const struct hashmap_entry *entry_or_key,
+			    const void *keydata UNUSED)
+{
+	const struct tree_entry *e1, *e2;
+	size_t e1_len, e2_len;
+
+	e1 = container_of(eptr, const struct tree_entry, ent);
+	e2 = container_of(entry_or_key, const struct tree_entry, ent);
+
+	e1_len = df_path_len(e1->len, e1->mode);
+	e2_len = df_path_len(e2->len, e2->mode);
+
+	return e1_len != e2_len ||
+	       name_compare(e1->name, e1_len, e2->name, e2_len);
+}
+
+static void tree_entry_array_init(struct tree_entry_array *arr)
+{
+	hashmap_init(&arr->df_name_hash, df_name_hash_cmp, NULL, 0);
+}
+
 static void tree_entry_array_push(struct tree_entry_array *arr, struct tree_entry *ent)
 {
 	ALLOC_GROW(arr->entries, arr->nr + 1, arr->alloc);
@@ -48,6 +75,7 @@ static void tree_entry_array_clear(struct tree_entry_array *arr, int free_entrie
 			FREE_AND_NULL(arr->entries[i]);
 	}
 	arr->nr = 0;
+	hashmap_clear(&arr->df_name_hash);
 }
 
 static void tree_entry_array_release(struct tree_entry_array *arr, int free_entries)
@@ -137,6 +165,14 @@ static void sort_and_dedup_tree_entry_array(struct tree_entry_array *arr)
 	/* Sort again to order the entries for tree insertion */
 	ignore_mode = 0;
 	QSORT_S(arr->entries, arr->nr, ent_compare, &ignore_mode);
+
+	/* Finally, populate the directory-file conflict hashmap */
+	for (size_t i = 0; i < count; i++) {
+		struct tree_entry *curr = arr->entries[i];
+		hashmap_entry_init(&curr->ent,
+				   memhash(curr->name, df_path_len(curr->len, curr->mode)));
+		hashmap_put(&arr->df_name_hash, &curr->ent);
+	}
 }
 
 struct tree_entry_iterator {
@@ -311,6 +347,8 @@ int cmd_mktree(int ac, const char **av, const char *prefix)
 
 	ac = parse_options(ac, av, prefix, option, mktree_usage, 0);
 
+	tree_entry_array_init(&arr);
+
 	do {
 		ret = read_index_info(nul_term_line, mktree_line, &mktree_line_data, &line);
 		if (ret < 0)
-- 
gitgitgadget

