* [PATCH] git-apply doesn't handle same name patches well
@ 2008-06-13 16:55 Don Zickus
2008-06-13 20:32 ` Johannes Schindelin
2008-06-16 9:01 ` [PATCH] git-apply doesn't handle same name patches well Mike Ralphson
0 siblings, 2 replies; 15+ messages in thread
From: Don Zickus @ 2008-06-13 16:55 UTC (permalink / raw)
To: git; +Cc: Don Zickus
When working with a lot of people who backport patches all day long, every
once in a while I get a patch that modifies the same file more than once
inside the same patch. git-apply either fails if the second change relies
on the first change or silently drops the first change if the second change
is independent.
The silent part is the scary scenario for us. Also this behaviour is
different from the patch-utils.
I have modified git-apply to cache the filenames of files it modifies such
that if a later patch chunk modifies a file in the cache it will buffer the
previously changed file instead of reading the original file from disk.
Logic has been put in to handle creations/deletions/renames/copies. All the
relevant tests of git-apply succeed.
A new test has been added to cover the two cases I addressed.
The fix is relatively straight-forward. But I'm not sure if this new
behaviour is something the git community wants.
Signed-off-by: Don Zickus <dzickus@redhat.com>
---
builtin-apply.c | 69 +++++++++++++++++++++++++++++++++++++++++++++-
t/t4127-apply-same-fn.sh | 40 ++++++++++++++++++++++++++
2 files changed, 108 insertions(+), 1 deletions(-)
create mode 100755 t/t4127-apply-same-fn.sh
diff --git a/builtin-apply.c b/builtin-apply.c
index c497889..8330517 100644
--- a/builtin-apply.c
+++ b/builtin-apply.c
@@ -185,6 +185,18 @@ struct image {
struct line *line;
};
+/*
+ * Caches patch filenames to handle the case where a
+ * patch chunk reuses a filename
+ */
+struct fn_cache {
+ char *name;
+ struct patch *patch;
+ struct fn_cache *next;
+};
+
+struct fn_cache *fn_cache_top = NULL;
+
static uint32_t hash_line(const char *cp, size_t len)
{
size_t i;
@@ -2176,6 +2188,51 @@ static int read_file_or_gitlink(struct cache_entry *ce, struct strbuf *buf)
return 0;
}
+struct patch *in_fn_cache(char *name)
+{
+ struct fn_cache *p;
+
+ for (p=fn_cache_top; p; p=p->next) {
+ if (!strcmp(name, p->name))
+ return p->patch;
+ }
+
+ return NULL;
+}
+
+void add_to_fn_cache(char *name, struct patch *patch)
+{
+ struct fn_cache *fn_cache;
+
+ /* Always add new_name unless patch is a deletion */
+ if (name != NULL) {
+ fn_cache = xmalloc(sizeof(*fn_cache));
+
+ /* assuming the pointer to filename won't disappear */
+ fn_cache->name = name;
+ fn_cache->patch = patch;
+ fn_cache->next = fn_cache_top;
+
+ fn_cache_top = fn_cache;
+ }
+
+ /* skip normal diffs, creations and copies */
+ /*
+ * store a failure on rename/deletion cases because
+ * later chunks shouldn't patch old names
+ */
+ if ((name == NULL) || (patch->is_rename)) {
+ fn_cache = xmalloc(sizeof(*fn_cache));
+
+ /* assuming the pointer to filename won't disappear */
+ fn_cache->name = patch->old_name;
+ fn_cache->patch = (struct patch *) -1;
+ fn_cache->next = fn_cache_top;
+
+ fn_cache_top = fn_cache;
+ }
+}
+
static int apply_data(struct patch *patch, struct stat *st, struct cache_entry *ce)
{
struct strbuf buf;
@@ -2188,7 +2245,16 @@ static int apply_data(struct patch *patch, struct stat *st, struct cache_entry *
if (read_file_or_gitlink(ce, &buf))
return error("read of %s failed", patch->old_name);
} else if (patch->old_name) {
- if (S_ISGITLINK(patch->old_mode)) {
+ struct patch *tpatch = in_fn_cache(patch->old_name);
+
+ if (tpatch != NULL) {
+ if (tpatch == (struct patch *) -1) {
+ return error("patch %s has been renamed/deleted",
+ patch->old_name);
+ }
+ /* We have a patched copy in memory use that */
+ strbuf_add(&buf, tpatch->result, tpatch->resultsize);
+ } else if (S_ISGITLINK(patch->old_mode)) {
if (ce) {
read_file_or_gitlink(ce, &buf);
} else {
@@ -2211,6 +2277,7 @@ static int apply_data(struct patch *patch, struct stat *st, struct cache_entry *
return -1; /* note with --reject this succeeds. */
patch->result = image.buf;
patch->resultsize = image.len;
+ add_to_fn_cache(patch->new_name, patch);
free(image.line_allocated);
if (0 < patch->is_delete && patch->resultsize)
diff --git a/t/t4127-apply-same-fn.sh b/t/t4127-apply-same-fn.sh
new file mode 100755
index 0000000..a20bd8e
--- /dev/null
+++ b/t/t4127-apply-same-fn.sh
@@ -0,0 +1,40 @@
+#!/bin/sh
+
+test_description='apply same filename'
+
+. ./test-lib.sh
+
+test_expect_success setup '
+ for i in a b c d e f g h i j k l m
+ do
+ echo $i
+ done >same_fn &&
+ git add same_fn &&
+ git commit -m initial
+'
+test_expect_success 'apply same filename with independent changes' '
+ sed -i -e "s/^d/z/" same_fn &&
+ git diff > patch0 &&
+ git add same_fn &&
+ sed -i -e "s/^i/y/" same_fn &&
+ git diff >> patch0 &&
+ cp same_fn same_fn2 &&
+ git reset --hard &&
+ git-apply patch0 &&
+ diff same_fn same_fn2
+'
+
+test_expect_success 'apply same filename with overlapping changes' '
+ git reset --hard
+ sed -i -e "s/^d/z/" same_fn &&
+ git diff > patch0 &&
+ git add same_fn &&
+ sed -i -e "s/^e/y/" same_fn &&
+ git diff >> patch0 &&
+ cp same_fn same_fn2 &&
+ git reset --hard &&
+ git-apply patch0 &&
+ diff same_fn same_fn2
+'
+
+test_done
--
1.5.6.rc2.45.gdc92c
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH] git-apply doesn't handle same name patches well
2008-06-13 16:55 [PATCH] git-apply doesn't handle same name patches well Don Zickus
@ 2008-06-13 20:32 ` Johannes Schindelin
2008-06-13 20:42 ` Don Zickus
2008-06-13 22:48 ` [PATCH] path-list documentation: document all functions and data structures Miklos Vajna
2008-06-16 9:01 ` [PATCH] git-apply doesn't handle same name patches well Mike Ralphson
1 sibling, 2 replies; 15+ messages in thread
From: Johannes Schindelin @ 2008-06-13 20:32 UTC (permalink / raw)
To: Don Zickus; +Cc: git
Hi,
On Fri, 13 Jun 2008, Don Zickus wrote:
> When working with a lot of people who backport patches all day long,
> every once in a while I get a patch that modifies the same file more
> than once inside the same patch. git-apply either fails if the second
> change relies on the first change or silently drops the first change if
> the second change is independent.
>
> The silent part is the scary scenario for us. Also this behaviour is
> different from the patch-utils.
>
> I have modified git-apply to cache the filenames of files it modifies
> such that if a later patch chunk modifies a file in the cache it will
> buffer the previously changed file instead of reading the original file
> from disk.
>
> Logic has been put in to handle creations/deletions/renames/copies. All the
> relevant tests of git-apply succeed.
>
> A new test has been added to cover the two cases I addressed.
>
> The fix is relatively straight-forward. But I'm not sure if this new
> behaviour is something the git community wants.
The scary part is about adding a linked list for file names you want to
look up.
Not that performance matters here, I guess, but we _already_ have
something much more efficient in Git, namely path-lists.
You could use that, and end up with a substantially smaller patch.
Ciao,
Dscho
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] git-apply doesn't handle same name patches well
2008-06-13 20:32 ` Johannes Schindelin
@ 2008-06-13 20:42 ` Don Zickus
2008-06-13 22:48 ` [PATCH] path-list documentation: document all functions and data structures Miklos Vajna
1 sibling, 0 replies; 15+ messages in thread
From: Don Zickus @ 2008-06-13 20:42 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: git
On Fri, Jun 13, 2008 at 09:32:52PM +0100, Johannes Schindelin wrote:
> Hi,
>
> On Fri, 13 Jun 2008, Don Zickus wrote:
>
> > When working with a lot of people who backport patches all day long,
> > every once in a while I get a patch that modifies the same file more
> > than once inside the same patch. git-apply either fails if the second
> > change relies on the first change or silently drops the first change if
> > the second change is independent.
> >
> > The silent part is the scary scenario for us. Also this behaviour is
> > different from the patch-utils.
> >
> > I have modified git-apply to cache the filenames of files it modifies
> > such that if a later patch chunk modifies a file in the cache it will
> > buffer the previously changed file instead of reading the original file
> > from disk.
> >
> > Logic has been put in to handle creations/deletions/renames/copies. All the
> > relevant tests of git-apply succeed.
> >
> > A new test has been added to cover the two cases I addressed.
> >
> > The fix is relatively straight-forward. But I'm not sure if this new
> > behaviour is something the git community wants.
>
> The scary part is about adding a linked list for file names you want to
> look up.
>
> Not that performance matters here, I guess, but we _already_ have
> something much more efficient in Git, namely path-lists.
>
> You could use that, and end up with a substantially smaller patch.
Thanks for the feedback. I was unaware of path-lists. I'll try to find
an example and implement it if it works.
Cheers,
Don
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH] path-list documentation: document all functions and data structures
2008-06-13 20:32 ` Johannes Schindelin
2008-06-13 20:42 ` Don Zickus
@ 2008-06-13 22:48 ` Miklos Vajna
2008-06-13 23:30 ` Olivier Marin
1 sibling, 1 reply; 15+ messages in thread
From: Miklos Vajna @ 2008-06-13 22:48 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: Junio C Hamano, Don Zickus, git
Signed-off-by: Miklos Vajna <vmiklos@frugalware.org>
---
On Fri, Jun 13, 2008 at 09:32:52PM +0100, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> Not that performance matters here, I guess, but we _already_ have
> something much more efficient in Git, namely path-lists.
>
> You could use that, and end up with a substantially smaller patch.
I just noticed that Documentation/technical/api-path-list.txt is almost
empty. Here is an attempt to document the path-list API.
Documentation/technical/api-path-list.txt | 92 +++++++++++++++++++++++++++-
1 files changed, 88 insertions(+), 4 deletions(-)
diff --git a/Documentation/technical/api-path-list.txt b/Documentation/technical/api-path-list.txt
index d077683..844eee9 100644
--- a/Documentation/technical/api-path-list.txt
+++ b/Documentation/technical/api-path-list.txt
@@ -1,9 +1,93 @@
path-list API
=============
-Talk about <path-list.h>, things like
+The path_list API offers a data structure and functions to handle sorted
+and unsorted string lists.
-* it is not just paths but strings in general;
-* the calling sequence.
+The name is a bit misleading, a path_list may store not only paths but
+strings in general.
-(Dscho)
+The caller:
+
+. Allocates and clears (`memset(&list, '0', sizeof(path_list));`) a
+ `struct path_list` variable.
+
+. Initializes the members. You can manually set the `items` member, but
+ then you have to set `nr`, accordingly. Also don't forget to set
+ `strdup_paths` if you need it.
+
+. Adds new items to the list, using `path_list_append` or `path_list_insert`.
+
+. Can check if a string is in the list using `path_list_has_path` or
+ `unsorted_path_list_has_path` and get it from the list using
+ `path_list_lookup` for sorted lists.
+
+. Can sort an unsorted list using `sort_path_list`.
+
+. Finally it should free the list using `path_list_clear`.
+
+Functions
+---------
+
+* General ones (works with sorted and unsorted lists as well)
+
+`print_path_list`::
+
+ Dump a path_list to stdout, useful mainly for debugging purposes. It
+ can take an optional header argument and it writes out the
+ string-pointer pairs of the path_list, each one in its own line.
+
+`path_list_clear`::
+
+ Free a path_list. The `path` pointer of the items will be freed in case
+ the `strdup_paths` member of the path_list is set. The second parameter
+ controls if the `util` pointer of the items should be freed or not.
+
+* Functions for sorted lists only
+
+`path_list_has_path`::
+
+ Determine if the path_list has a given string or not.
+
+`path_list_insert`::
+
+ Insert a new element to the path_list. The returned pointer can be handy
+ if you want to write something to the `util` pointer of the
+ path_list_item containing the just added string.
+
+`path_list_lookup`::
+
+ Look up a given string in the path_list, returning the containing
+ path_list_item. If the string is not found, NULL is returned.
+
+* Functions for unsorted lists only
+
+`path_list_append`::
+
+ Append a new string to the end of the path_list.
+
+`sort_path_list`::
+
+ Make an unsorted list sorted.
+
+`unsorted_path_list_has_path`::
+
+ It's like `path_list_has_path()` but for unsorted lists.
+
+Data structures
+---------------
+
+* `struct path_list_item`
+
+Represent an item of the list. The `path` member is a pointer to the
+string, and you may use the `util` member for any purpose, if you want.
+
+* `struct path_list`
+
+Represents the list itself.
+
+. The array of items are available via the `items` member.
+. The `nr` member contains the number of items stored in the list.
+. The `alloc` member is used for `ALLOC_GROW()`.
+. Setting the `strdup_paths` member to 1 means that the added paths are
+ copied to the path list and not just a pointer to them is stored.
--
1.5.6.rc2.dirty
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH] path-list documentation: document all functions and data structures
2008-06-13 22:48 ` [PATCH] path-list documentation: document all functions and data structures Miklos Vajna
@ 2008-06-13 23:30 ` Olivier Marin
2008-06-14 0:13 ` Miklos Vajna
2008-06-14 0:56 ` Miklos Vajna
0 siblings, 2 replies; 15+ messages in thread
From: Olivier Marin @ 2008-06-13 23:30 UTC (permalink / raw)
To: Miklos Vajna; +Cc: Johannes Schindelin, Junio C Hamano, Don Zickus, git
Hi,
Nice work!
Miklos Vajna a écrit :
>
> +The caller:
> +
> +. Allocates and clears (`memset(&list, '0', sizeof(path_list));`) a
> + `struct path_list` variable.
Don't you mean sizeof(list) here?
Olivier.
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH] path-list documentation: document all functions and data structures
2008-06-13 23:30 ` Olivier Marin
@ 2008-06-14 0:13 ` Miklos Vajna
2008-06-14 0:56 ` Miklos Vajna
1 sibling, 0 replies; 15+ messages in thread
From: Miklos Vajna @ 2008-06-14 0:13 UTC (permalink / raw)
To: Olivier Marin; +Cc: Johannes Schindelin, Junio C Hamano, Don Zickus, git
Signed-off-by: Miklos Vajna <vmiklos@frugalware.org>
---
On Sat, Jun 14, 2008 at 01:30:57AM +0200, Olivier Marin <dkr+ml.git@free.fr> wrote:
> > +. Allocates and clears (`memset(&list, '0', sizeof(path_list));`) a
> > + `struct path_list` variable.
>
> Don't you mean sizeof(list) here?
Right, it was a typo. Thanks for the correction.
Also I just noticed that the '0' did not render properly in asciidoc,
using \'0' fixes the issue.
Updated patch below.
Documentation/technical/api-path-list.txt | 92 +++++++++++++++++++++++++++-
1 files changed, 88 insertions(+), 4 deletions(-)
diff --git a/Documentation/technical/api-path-list.txt b/Documentation/technical/api-path-list.txt
index d077683..313d088 100644
--- a/Documentation/technical/api-path-list.txt
+++ b/Documentation/technical/api-path-list.txt
@@ -1,9 +1,93 @@
path-list API
=============
-Talk about <path-list.h>, things like
+The path_list API offers a data structure and functions to handle sorted
+and unsorted string lists.
-* it is not just paths but strings in general;
-* the calling sequence.
+The name is a bit misleading, a path_list may store not only paths but
+strings in general.
-(Dscho)
+The caller:
+
+. Allocates and clears (`memset(&list, \'0', sizeof(list));`) a
+ `struct path_list` variable.
+
+. Initializes the members. You can manually set the `items` member, but
+ then you have to set `nr`, accordingly. Also don't forget to set
+ `strdup_paths` if you need it.
+
+. Adds new items to the list, using `path_list_append` or `path_list_insert`.
+
+. Can check if a string is in the list using `path_list_has_path` or
+ `unsorted_path_list_has_path` and get it from the list using
+ `path_list_lookup` for sorted lists.
+
+. Can sort an unsorted list using `sort_path_list`.
+
+. Finally it should free the list using `path_list_clear`.
+
+Functions
+---------
+
+* General ones (works with sorted and unsorted lists as well)
+
+`print_path_list`::
+
+ Dump a path_list to stdout, useful mainly for debugging purposes. It
+ can take an optional header argument and it writes out the
+ string-pointer pairs of the path_list, each one in its own line.
+
+`path_list_clear`::
+
+ Free a path_list. The `path` pointer of the items will be freed in case
+ the `strdup_paths` member of the path_list is set. The second parameter
+ controls if the `util` pointer of the items should be freed or not.
+
+* Functions for sorted lists only
+
+`path_list_has_path`::
+
+ Determine if the path_list has a given string or not.
+
+`path_list_insert`::
+
+ Insert a new element to the path_list. The returned pointer can be handy
+ if you want to write something to the `util` pointer of the
+ path_list_item containing the just added string.
+
+`path_list_lookup`::
+
+ Look up a given string in the path_list, returning the containing
+ path_list_item. If the string is not found, NULL is returned.
+
+* Functions for unsorted lists only
+
+`path_list_append`::
+
+ Append a new string to the end of the path_list.
+
+`sort_path_list`::
+
+ Make an unsorted list sorted.
+
+`unsorted_path_list_has_path`::
+
+ It's like `path_list_has_path()` but for unsorted lists.
+
+Data structures
+---------------
+
+* `struct path_list_item`
+
+Represent an item of the list. The `path` member is a pointer to the
+string, and you may use the `util` member for any purpose, if you want.
+
+* `struct path_list`
+
+Represents the list itself.
+
+. The array of items are available via the `items` member.
+. The `nr` member contains the number of items stored in the list.
+. The `alloc` member is used for `ALLOC_GROW()`.
+. Setting the `strdup_paths` member to 1 means that the added paths are
+ copied to the path list and not just a pointer to them is stored.
--
1.5.6.rc2.dirty
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH] path-list documentation: document all functions and data structures
2008-06-13 23:30 ` Olivier Marin
2008-06-14 0:13 ` Miklos Vajna
@ 2008-06-14 0:56 ` Miklos Vajna
2008-06-14 12:50 ` Olivier Marin
2008-06-14 18:08 ` Johannes Schindelin
1 sibling, 2 replies; 15+ messages in thread
From: Miklos Vajna @ 2008-06-14 0:56 UTC (permalink / raw)
To: Olivier Marin; +Cc: Johannes Schindelin, Junio C Hamano, Don Zickus, git
Signed-off-by: Miklos Vajna <vmiklos@frugalware.org>
---
Updated patch, obviously we want to init the struct with '\0', not '0'.
Documentation/technical/api-path-list.txt | 92 +++++++++++++++++++++++++++-
1 files changed, 88 insertions(+), 4 deletions(-)
diff --git a/Documentation/technical/api-path-list.txt b/Documentation/technical/api-path-list.txt
index d077683..313d088 100644
--- a/Documentation/technical/api-path-list.txt
+++ b/Documentation/technical/api-path-list.txt
@@ -1,9 +1,93 @@
path-list API
=============
-Talk about <path-list.h>, things like
+The path_list API offers a data structure and functions to handle sorted
+and unsorted string lists.
-* it is not just paths but strings in general;
-* the calling sequence.
+The name is a bit misleading, a path_list may store not only paths but
+strings in general.
-(Dscho)
+The caller:
+
+. Allocates and clears (`memset(&list, \'\0', sizeof(list));`) a
+ `struct path_list` variable.
+
+. Initializes the members. You can manually set the `items` member, but
+ then you have to set `nr`, accordingly. Also don't forget to set
+ `strdup_paths` if you need it.
+
+. Adds new items to the list, using `path_list_append` or `path_list_insert`.
+
+. Can check if a string is in the list using `path_list_has_path` or
+ `unsorted_path_list_has_path` and get it from the list using
+ `path_list_lookup` for sorted lists.
+
+. Can sort an unsorted list using `sort_path_list`.
+
+. Finally it should free the list using `path_list_clear`.
+
+Functions
+---------
+
+* General ones (works with sorted and unsorted lists as well)
+
+`print_path_list`::
+
+ Dump a path_list to stdout, useful mainly for debugging purposes. It
+ can take an optional header argument and it writes out the
+ string-pointer pairs of the path_list, each one in its own line.
+
+`path_list_clear`::
+
+ Free a path_list. The `path` pointer of the items will be freed in case
+ the `strdup_paths` member of the path_list is set. The second parameter
+ controls if the `util` pointer of the items should be freed or not.
+
+* Functions for sorted lists only
+
+`path_list_has_path`::
+
+ Determine if the path_list has a given string or not.
+
+`path_list_insert`::
+
+ Insert a new element to the path_list. The returned pointer can be handy
+ if you want to write something to the `util` pointer of the
+ path_list_item containing the just added string.
+
+`path_list_lookup`::
+
+ Look up a given string in the path_list, returning the containing
+ path_list_item. If the string is not found, NULL is returned.
+
+* Functions for unsorted lists only
+
+`path_list_append`::
+
+ Append a new string to the end of the path_list.
+
+`sort_path_list`::
+
+ Make an unsorted list sorted.
+
+`unsorted_path_list_has_path`::
+
+ It's like `path_list_has_path()` but for unsorted lists.
+
+Data structures
+---------------
+
+* `struct path_list_item`
+
+Represent an item of the list. The `path` member is a pointer to the
+string, and you may use the `util` member for any purpose, if you want.
+
+* `struct path_list`
+
+Represents the list itself.
+
+. The array of items are available via the `items` member.
+. The `nr` member contains the number of items stored in the list.
+. The `alloc` member is used for `ALLOC_GROW()`.
+. Setting the `strdup_paths` member to 1 means that the added paths are
+ copied to the path list and not just a pointer to them is stored.
--
1.5.6.rc2.dirty
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH] path-list documentation: document all functions and data structures
2008-06-14 0:56 ` Miklos Vajna
@ 2008-06-14 12:50 ` Olivier Marin
2008-06-14 22:50 ` Miklos Vajna
2008-06-14 18:08 ` Johannes Schindelin
1 sibling, 1 reply; 15+ messages in thread
From: Olivier Marin @ 2008-06-14 12:50 UTC (permalink / raw)
To: Miklos Vajna; +Cc: Johannes Schindelin, Junio C Hamano, Don Zickus, git
Miklos Vajna a écrit :
>
> +. Allocates and clears (`memset(&list, \'\0', sizeof(list));`) a
> + `struct path_list` variable.
What about just `memset(&list, 0, sizeof(list))` instead?
It's readable in the text format, clean in html and this is the way
memset() is used.
Olivier.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] path-list documentation: document all functions and data structures
2008-06-14 12:50 ` Olivier Marin
@ 2008-06-14 22:50 ` Miklos Vajna
0 siblings, 0 replies; 15+ messages in thread
From: Miklos Vajna @ 2008-06-14 22:50 UTC (permalink / raw)
To: Olivier Marin; +Cc: Johannes Schindelin, Junio C Hamano, Don Zickus, git
[-- Attachment #1: Type: text/plain, Size: 282 bytes --]
On Sat, Jun 14, 2008 at 02:50:22PM +0200, Olivier Marin <dkr+ml.git@free.fr> wrote:
> What about just `memset(&list, 0, sizeof(list))` instead?
>
> It's readable in the text format, clean in html and this is the way
> memset() is used.
Good idea, thanks. Will do in a bit.
[-- Attachment #2: Type: application/pgp-signature, Size: 197 bytes --]
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] path-list documentation: document all functions and data structures
2008-06-14 0:56 ` Miklos Vajna
2008-06-14 12:50 ` Olivier Marin
@ 2008-06-14 18:08 ` Johannes Schindelin
2008-06-14 23:22 ` Miklos Vajna
2008-06-15 5:04 ` Junio C Hamano
1 sibling, 2 replies; 15+ messages in thread
From: Johannes Schindelin @ 2008-06-14 18:08 UTC (permalink / raw)
To: Miklos Vajna; +Cc: Olivier Marin, Junio C Hamano, Don Zickus, git
Hi,
On Sat, 14 Jun 2008, Miklos Vajna wrote:
> Signed-off-by: Miklos Vajna <vmiklos@frugalware.org>
Thanks for doing this... I meant to document it after pushing the
path_list -> string_list patch.
Speaking of which: Junio, could you give me any clue how you would like to
proceed with that patch?
> @@ -1,9 +1,93 @@
> path-list API
> =============
>
> -Talk about <path-list.h>, things like
> +The path_list API offers a data structure and functions to handle sorted
> +and unsorted string lists.
>
> -* it is not just paths but strings in general;
> -* the calling sequence.
> +The name is a bit misleading, a path_list may store not only paths but
> +strings in general.
>
> -(Dscho)
> +The caller:
> +
> +. Allocates and clears (`memset(&list, \'\0', sizeof(list));`) a
> + `struct path_list` variable.
Some callers use global variables; these are already initialized. Also, I
would not write code here, but later in a concise example.
> +. Initializes the members. You can manually set the `items` member, but
> + then you have to set `nr`, accordingly. Also don't forget to set
> + `strdup_paths` if you need it.
I would not promote the manual setting of the items member, until later.
This is advanced usage, and you have to malloc() the list if you add
things later, and you should set the `alloc` member in that case, too.
Further, I would like to have an explanation of the variable
"strdup_paths" first.
Something like: "You might want to set the flag `strdup_paths` if the
strings should be strdup()ed. For example, this is necessary when you add
something like git_path("..."), since that function returns a static
buffer that will change with the next call to git_path()."
> +. Adds new items to the list, using `path_list_append` or `path_list_insert`.
> +
> +. Can check if a string is in the list using `path_list_has_path` or
> + `unsorted_path_list_has_path` and get it from the list using
> + `path_list_lookup` for sorted lists.
> +
> +. Can sort an unsorted list using `sort_path_list`.
> +
> +. Finally it should free the list using `path_list_clear`.
Here, you should add a note that it is more efficient to build an unsorted
list and sort it afterwards, instead of building a sorted list (O(n log n)
instead of O(n^2)).
However, if you use the list to check if a certain string was added
already, you should not do that (using unsorted_path_list_has_path()),
because the complexity would be quadratic again (but with a worse factor).
> +`path_list_insert`::
> +
> + Insert a new element to the path_list. The returned pointer can be handy
> + if you want to write something to the `util` pointer of the
> + path_list_item containing the just added string.
Since this function uses xrealloc() (which die()s if it fails) if
the list needs to grow, it is safe not to check the pointer. I.e.
you may write "path_list_insert(...)->util = ...;"
> +`unsorted_path_list_has_path`::
> +
> + It's like `path_list_has_path()` but for unsorted lists.
Obviously, this function needs to look through all items, as
opposed to its counterpart for sorted lists, which performs a
binary search.
> +Data structures
> +---------------
> +
> +* `struct path_list_item`
> +
> +Represent an item of the list. The `path` member is a pointer to the
s/sent/&s/
> +string, and you may use the `util` member for any purpose, if you want.
> +
> +* `struct path_list`
> +
> +Represents the list itself.
> +
> +. The array of items are available via the `items` member.
> +. The `nr` member contains the number of items stored in the list.
> +. The `alloc` member is used for `ALLOC_GROW()`.
Maybe "The `alloc` member is used to avoid reallocating at every
insertion. You should not tamper with it."
> +. Setting the `strdup_paths` member to 1 means that the added paths are
> + copied to the path list and not just a pointer to them is stored.
Like I said, I would say that it will strdup() the strings before adding
them, and then motivate it (by presenting a case where it helps, e.g.
git_path()).
Of course, a short and concise example how to use path_lists would be
nice... ;-)
Ciao,
Dscho
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH] path-list documentation: document all functions and data structures
2008-06-14 18:08 ` Johannes Schindelin
@ 2008-06-14 23:22 ` Miklos Vajna
2008-06-15 9:01 ` Jakub Narebski
2008-06-15 5:04 ` Junio C Hamano
1 sibling, 1 reply; 15+ messages in thread
From: Miklos Vajna @ 2008-06-14 23:22 UTC (permalink / raw)
To: git
Signed-off-by: Miklos Vajna <vmiklos@frugalware.org>
---
On Sat, Jun 14, 2008 at 07:08:19PM +0100, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> Thanks for doing this... I meant to document it after pushing the
> path_list -> string_list patch.
Here is an updated version, hopefully I added all your suggestion, and
appended a short example as well.
(Sending to the list only, as I accidently removed the list from Cc, sorry for that.)
Documentation/technical/api-path-list.txt | 125 ++++++++++++++++++++++++++++-
1 files changed, 121 insertions(+), 4 deletions(-)
diff --git a/Documentation/technical/api-path-list.txt b/Documentation/technical/api-path-list.txt
index d077683..654470d 100644
--- a/Documentation/technical/api-path-list.txt
+++ b/Documentation/technical/api-path-list.txt
@@ -1,9 +1,126 @@
path-list API
=============
-Talk about <path-list.h>, things like
+The path_list API offers a data structure and functions to handle sorted
+and unsorted string lists.
-* it is not just paths but strings in general;
-* the calling sequence.
+The name is a bit misleading, a path_list may store not only paths but
+strings in general.
-(Dscho)
+The caller:
+
+. Allocates and clears a `struct path_list` variable.
+
+. Initializes the members. You might want to set the flag `strdup_paths`
+ if the strings should be strdup()ed. For example, this is necessary
+ when you add something like git_path("..."), since that function returns
+ a static buffer that will change with the next call to git_path().
++
+If you need something advanced, you can manually malloc() the `items`
+member (you need this if you add things later) and you should set the
+`nr` and `alloc` members in that case, too.
+
+. Adds new items to the list, using `path_list_append` or `path_list_insert`.
+
+. Can check if a string is in the list using `path_list_has_path` or
+ `unsorted_path_list_has_path` and get it from the list using
+ `path_list_lookup` for sorted lists.
+
+. Can sort an unsorted list using `sort_path_list`.
+
+. Finally it should free the list using `path_list_clear`.
+
+Example:
+
+----
+struct path_list list;
+int i;
+
+memset(&list, 0, sizeof(struct path_list));
+path_list_append("foo", &list);
+path_list_append("bar", &list);
+for (i = 0; i < list.nr; i++)
+ printf("%s\n", list.items[i].path)
+----
+
+NOTE: It is more efficient to build an unsorted list and sort it
+afterwards, instead of building a sorted list `(O(n log n)` instead of
+`O(n^2))`.
++
+However, if you use the list to check if a certain string was added
+already, you should not do that (using unsorted_path_list_has_path()),
+because the complexity would be quadratic again (but with a worse factor).
+
+Functions
+---------
+
+* General ones (works with sorted and unsorted lists as well)
+
+`print_path_list`::
+
+ Dump a path_list to stdout, useful mainly for debugging purposes. It
+ can take an optional header argument and it writes out the
+ string-pointer pairs of the path_list, each one in its own line.
+
+`path_list_clear`::
+
+ Free a path_list. The `path` pointer of the items will be freed in case
+ the `strdup_paths` member of the path_list is set. The second parameter
+ controls if the `util` pointer of the items should be freed or not.
+
+* Functions for sorted lists only
+
+`path_list_has_path`::
+
+ Determine if the path_list has a given string or not.
+
+`path_list_insert`::
+
+ Insert a new element to the path_list. The returned pointer can be handy
+ if you want to write something to the `util` pointer of the
+ path_list_item containing the just added string.
++
+Since this function uses xrealloc() (which die()s if it fails) if the
+list needs to grow, it is safe not to check the pointer. I.e. you may
+write `path_list_insert(...)->util = ...;`.
+
+`path_list_lookup`::
+
+ Look up a given string in the path_list, returning the containing
+ path_list_item. If the string is not found, NULL is returned.
+
+* Functions for unsorted lists only
+
+`path_list_append`::
+
+ Append a new string to the end of the path_list.
+
+`sort_path_list`::
+
+ Make an unsorted list sorted.
+
+`unsorted_path_list_has_path`::
+
+ It's like `path_list_has_path()` but for unsorted lists.
++
+This function needs to look through all items, as opposed to its
+counterpart for sorted lists, which performs a binary search.
+
+Data structures
+---------------
+
+* `struct path_list_item`
+
+Represents an item of the list. The `path` member is a pointer to the
+string, and you may use the `util` member for any purpose, if you want.
+
+* `struct path_list`
+
+Represents the list itself.
+
+. The array of items are available via the `items` member.
+. The `nr` member contains the number of items stored in the list.
+. The `alloc` member is used to avoid reallocating at every insertion.
+ You should not tamper with it.
+. Setting the `strdup_paths` member to 1 will strdup() the strings
+ before adding them, see above.
--
1.5.6.rc2.dirty
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH] path-list documentation: document all functions and data structures
2008-06-14 23:22 ` Miklos Vajna
@ 2008-06-15 9:01 ` Jakub Narebski
2008-06-15 12:06 ` Miklos Vajna
0 siblings, 1 reply; 15+ messages in thread
From: Jakub Narebski @ 2008-06-15 9:01 UTC (permalink / raw)
To: Miklos Vajna; +Cc: git
Miklos Vajna <vmiklos@frugalware.org> writes:
> +NOTE: It is more efficient to build an unsorted list and sort it
> +afterwards, instead of building a sorted list `(O(n log n)` instead of
> +`O(n^2))`.
I think there is typo here (misplaced backticks '`' on the wrong side
of enclosing parentheses), and this fragment should read:
+afterwards, instead of building a sorted list (`O(n log n)` instead of
+`O(n^2)`).
--
Jakub Narebski
Poland
ShadeHawk on #git
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH] path-list documentation: document all functions and data structures
2008-06-15 9:01 ` Jakub Narebski
@ 2008-06-15 12:06 ` Miklos Vajna
0 siblings, 0 replies; 15+ messages in thread
From: Miklos Vajna @ 2008-06-15 12:06 UTC (permalink / raw)
To: Jakub Narebski
Cc: Johannes Schindelin, Olivier Marin, Junio C Hamano, Don Zickus,
git
Signed-off-by: Miklos Vajna <vmiklos@frugalware.org>
---
On Sun, Jun 15, 2008 at 02:01:19AM -0700, Jakub Narebski <jnareb@gmail.com> wrote:
> > +NOTE: It is more efficient to build an unsorted list and sort it
> > +afterwards, instead of building a sorted list `(O(n log n)` instead
> > of
> > +`O(n^2))`.
>
> I think there is typo here (misplaced backticks '`' on the wrong side
> of enclosing parentheses), and this fragment should read:
>
> +afterwards, instead of building a sorted list (`O(n log n)` instead
> of
> +`O(n^2)`).
Exactly, thanks for pointing out. Updated patch below.
Documentation/technical/api-path-list.txt | 125 ++++++++++++++++++++++++++++-
1 files changed, 121 insertions(+), 4 deletions(-)
diff --git a/Documentation/technical/api-path-list.txt b/Documentation/technical/api-path-list.txt
index d077683..9dbedd0 100644
--- a/Documentation/technical/api-path-list.txt
+++ b/Documentation/technical/api-path-list.txt
@@ -1,9 +1,126 @@
path-list API
=============
-Talk about <path-list.h>, things like
+The path_list API offers a data structure and functions to handle sorted
+and unsorted string lists.
-* it is not just paths but strings in general;
-* the calling sequence.
+The name is a bit misleading, a path_list may store not only paths but
+strings in general.
-(Dscho)
+The caller:
+
+. Allocates and clears a `struct path_list` variable.
+
+. Initializes the members. You might want to set the flag `strdup_paths`
+ if the strings should be strdup()ed. For example, this is necessary
+ when you add something like git_path("..."), since that function returns
+ a static buffer that will change with the next call to git_path().
++
+If you need something advanced, you can manually malloc() the `items`
+member (you need this if you add things later) and you should set the
+`nr` and `alloc` members in that case, too.
+
+. Adds new items to the list, using `path_list_append` or `path_list_insert`.
+
+. Can check if a string is in the list using `path_list_has_path` or
+ `unsorted_path_list_has_path` and get it from the list using
+ `path_list_lookup` for sorted lists.
+
+. Can sort an unsorted list using `sort_path_list`.
+
+. Finally it should free the list using `path_list_clear`.
+
+Example:
+
+----
+struct path_list list;
+int i;
+
+memset(&list, 0, sizeof(struct path_list));
+path_list_append("foo", &list);
+path_list_append("bar", &list);
+for (i = 0; i < list.nr; i++)
+ printf("%s\n", list.items[i].path)
+----
+
+NOTE: It is more efficient to build an unsorted list and sort it
+afterwards, instead of building a sorted list (`O(n log n)` instead of
+`O(n^2)`).
++
+However, if you use the list to check if a certain string was added
+already, you should not do that (using unsorted_path_list_has_path()),
+because the complexity would be quadratic again (but with a worse factor).
+
+Functions
+---------
+
+* General ones (works with sorted and unsorted lists as well)
+
+`print_path_list`::
+
+ Dump a path_list to stdout, useful mainly for debugging purposes. It
+ can take an optional header argument and it writes out the
+ string-pointer pairs of the path_list, each one in its own line.
+
+`path_list_clear`::
+
+ Free a path_list. The `path` pointer of the items will be freed in case
+ the `strdup_paths` member of the path_list is set. The second parameter
+ controls if the `util` pointer of the items should be freed or not.
+
+* Functions for sorted lists only
+
+`path_list_has_path`::
+
+ Determine if the path_list has a given string or not.
+
+`path_list_insert`::
+
+ Insert a new element to the path_list. The returned pointer can be handy
+ if you want to write something to the `util` pointer of the
+ path_list_item containing the just added string.
++
+Since this function uses xrealloc() (which die()s if it fails) if the
+list needs to grow, it is safe not to check the pointer. I.e. you may
+write `path_list_insert(...)->util = ...;`.
+
+`path_list_lookup`::
+
+ Look up a given string in the path_list, returning the containing
+ path_list_item. If the string is not found, NULL is returned.
+
+* Functions for unsorted lists only
+
+`path_list_append`::
+
+ Append a new string to the end of the path_list.
+
+`sort_path_list`::
+
+ Make an unsorted list sorted.
+
+`unsorted_path_list_has_path`::
+
+ It's like `path_list_has_path()` but for unsorted lists.
++
+This function needs to look through all items, as opposed to its
+counterpart for sorted lists, which performs a binary search.
+
+Data structures
+---------------
+
+* `struct path_list_item`
+
+Represents an item of the list. The `path` member is a pointer to the
+string, and you may use the `util` member for any purpose, if you want.
+
+* `struct path_list`
+
+Represents the list itself.
+
+. The array of items are available via the `items` member.
+. The `nr` member contains the number of items stored in the list.
+. The `alloc` member is used to avoid reallocating at every insertion.
+ You should not tamper with it.
+. Setting the `strdup_paths` member to 1 will strdup() the strings
+ before adding them, see above.
--
1.5.6.rc2.dirty
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH] path-list documentation: document all functions and data structures
2008-06-14 18:08 ` Johannes Schindelin
2008-06-14 23:22 ` Miklos Vajna
@ 2008-06-15 5:04 ` Junio C Hamano
1 sibling, 0 replies; 15+ messages in thread
From: Junio C Hamano @ 2008-06-15 5:04 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: Miklos Vajna, Olivier Marin, Don Zickus, git
Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> Speaking of which: Junio, could you give me any clue how you would like to
> proceed with that patch?
It would be most convenient to do so when
git diff master pu | grep path.list
shrinks to the minimum. I think very early after 1.5.6 would be the best,
as there is nothing outstanding that adds new use or removes existing use
of path_list.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] git-apply doesn't handle same name patches well
2008-06-13 16:55 [PATCH] git-apply doesn't handle same name patches well Don Zickus
2008-06-13 20:32 ` Johannes Schindelin
@ 2008-06-16 9:01 ` Mike Ralphson
1 sibling, 0 replies; 15+ messages in thread
From: Mike Ralphson @ 2008-06-16 9:01 UTC (permalink / raw)
To: Don Zickus; +Cc: git, Johannes Schindelin
2008/6/13 Don Zickus <dzickus@redhat.com>:
> When working with a lot of people who backport patches all day long, every
> once in a while I get a patch that modifies the same file more than once
> inside the same patch...
>
> I have modified git-apply to cache the filenames of files it modifies such
> that if a later patch chunk modifies a file in the cache it will buffer the
> previously changed file instead of reading the original file from disk.
Excellent spot. A couple of things you might want to add to your new
test cases would be examples where the first patch renames or removes
a file (or two files are swapped) and a subsequent patch then touches
the same path(s).
Mike
^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2008-06-16 9:02 UTC | newest]
Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-06-13 16:55 [PATCH] git-apply doesn't handle same name patches well Don Zickus
2008-06-13 20:32 ` Johannes Schindelin
2008-06-13 20:42 ` Don Zickus
2008-06-13 22:48 ` [PATCH] path-list documentation: document all functions and data structures Miklos Vajna
2008-06-13 23:30 ` Olivier Marin
2008-06-14 0:13 ` Miklos Vajna
2008-06-14 0:56 ` Miklos Vajna
2008-06-14 12:50 ` Olivier Marin
2008-06-14 22:50 ` Miklos Vajna
2008-06-14 18:08 ` Johannes Schindelin
2008-06-14 23:22 ` Miklos Vajna
2008-06-15 9:01 ` Jakub Narebski
2008-06-15 12:06 ` Miklos Vajna
2008-06-15 5:04 ` Junio C Hamano
2008-06-16 9:01 ` [PATCH] git-apply doesn't handle same name patches well Mike Ralphson
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).