* [PATCH] string_list API: document what "sorted" means.
@ 2012-09-17 15:21 Michael Haggerty
2012-09-17 21:17 ` Junio C Hamano
0 siblings, 1 reply; 7+ messages in thread
From: Michael Haggerty @ 2012-09-17 15:21 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Michael Haggerty
Junio pointed out that the sort order currently used by string_list
could be considered to be an implementation detail internal to
string_list. But the sort order is already visible to the outside
world (e.g., via iteration or via print_string_list()), so it
shouldn't be changed willy-nilly. Therefore, document the current
sort order as part of the API's contract.
(If, at some future time, somebody wants a string_list that is sorted
by a different criterion, then the order should be made specifiable
via a callback function specified by the user.)
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
Documentation/technical/api-string-list.txt | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/Documentation/technical/api-string-list.txt b/Documentation/technical/api-string-list.txt
index 155ac8c..94d7a2b 100644
--- a/Documentation/technical/api-string-list.txt
+++ b/Documentation/technical/api-string-list.txt
@@ -1,8 +1,9 @@
string-list API
===============
-The string_list API offers a data structure and functions to handle sorted
-and unsorted string lists.
+The string_list API offers a data structure and functions to handle
+sorted and unsorted string lists. A "sorted" list is one whose
+entries are sorted by string value in `strcmp()` order.
The 'string_list' struct used to be called 'path_list', but was renamed
because it is not specific to paths.
@@ -143,7 +144,7 @@ write `string_list_insert(...)->util = ...;`.
`sort_string_list`::
- Make an unsorted list sorted.
+ Sort the list's entries by string value in `strcmp()` order.
`unsorted_string_list_has_string`::
--
1.7.11.3
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH] string_list API: document what "sorted" means.
2012-09-17 15:21 [PATCH] string_list API: document what "sorted" means Michael Haggerty
@ 2012-09-17 21:17 ` Junio C Hamano
2012-09-18 7:58 ` Michael Haggerty
0 siblings, 1 reply; 7+ messages in thread
From: Junio C Hamano @ 2012-09-17 21:17 UTC (permalink / raw)
To: Michael Haggerty; +Cc: git
Michael Haggerty <mhagger@alum.mit.edu> writes:
> Junio pointed out that the sort order currently used by string_list
> could be considered to be an implementation detail internal to
> string_list. But the sort order is already visible to the outside
> world (e.g., via iteration or via print_string_list()), so it
> shouldn't be changed willy-nilly. Therefore, document the current
> sort order as part of the API's contract.
>
> (If, at some future time, somebody wants a string_list that is sorted
> by a different criterion, then the order should be made specifiable
> via a callback function specified by the user.)
As I said in a separate message, we do not give any documented
guarantee on the order the strings comes out of print_string_list(),
other than "if you are using the unsorted list function to insert or
append strings, we won't muck with the order of the items and you
will get your strings back in the order you gave us". Until we add
a callback that the user can specify at the time of string list
initialization, I do not think it is wise to cast it in stone and
declare it as "shouldn't be changed willy-nilly", unnecessarily
painting us into a corner that we need to expend extra effort to get
out of.
Besides, nobody calls print_string_list() in the first place. I have
always considered it as a debugging aid.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] string_list API: document what "sorted" means.
2012-09-17 21:17 ` Junio C Hamano
@ 2012-09-18 7:58 ` Michael Haggerty
2012-09-18 8:19 ` Junio C Hamano
0 siblings, 1 reply; 7+ messages in thread
From: Michael Haggerty @ 2012-09-18 7:58 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
On 09/17/2012 11:17 PM, Junio C Hamano wrote:
> Michael Haggerty <mhagger@alum.mit.edu> writes:
>
>> Junio pointed out that the sort order currently used by string_list
>> could be considered to be an implementation detail internal to
>> string_list. But the sort order is already visible to the outside
>> world (e.g., via iteration or via print_string_list()), so it
>> shouldn't be changed willy-nilly. Therefore, document the current
>> sort order as part of the API's contract.
>>
>> (If, at some future time, somebody wants a string_list that is sorted
>> by a different criterion, then the order should be made specifiable
>> via a callback function specified by the user.)
>
> As I said in a separate message, we do not give any documented
> guarantee on the order the strings comes out of print_string_list(),
> other than "if you are using the unsorted list function to insert or
> append strings, we won't muck with the order of the items and you
> will get your strings back in the order you gave us". Until we add
> a callback that the user can specify at the time of string list
> initialization, I do not think it is wise to cast it in stone and
> declare it as "shouldn't be changed willy-nilly", unnecessarily
> painting us into a corner that we need to expend extra effort to get
> out of.
>
> Besides, nobody calls print_string_list() in the first place. I have
> always considered it as a debugging aid.
As you pointed out, mh/fetch-filter-refs depends on the sort order of
the "sought" reference array matching the sort order of the linked list
of refs received from the remote.
The linked list has been sorted since 9e8e704f using mergesort based on
strcmp().
Prior to 8bee93dd2, the sought array was sorted using qsort() and an
explicit strcmp()-based comparison function. 8bee93dd2 switched to
sorting the "sought" array using sort_string_list(), which is also based
on strcmp().
If (after mh/fetch-filter-refs) somebody would decide to change the sort
order of string_list, then it would break fetch. So I see three
possibilities:
1. Document that string_list sorts entries according to their strcmp()
order, as proposed in this patch. Then fetch can rely on this ordering.
If somebody wants a different ordering in the future, it is easy to
make the sort order a parameter.
2. Leave string_list's "default" sort order unspecified, but already
implement a way to let string_list users specify a particular sort
order. Change the fetch machinery to specify a strcmp()-based ordering
explicitly. This approach has the advantage of letting the default sort
order of string_list to be changed, though I don't really see how this
would be helpful.
3. Change fetch back to doing its own sorting again, rather than relying
on sort_string_list(). This isn't a lot of work (inline the one line of
sort_string_list(), then either make string-list.c:cmp_items() public or
duplicate that function too).
I prefer (1) because it is nearly as flexible as (2) and doesn't require
us to do work now that might not be needed (namely, parameterizing the
compare function), nor does it require the overhead of keeping a pointer
to the compare function in the string_list header that might be needed
if string_lists with non-default sort orders should be usable with the
"functions for sorted lists only".
Michael
--
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] string_list API: document what "sorted" means.
2012-09-18 7:58 ` Michael Haggerty
@ 2012-09-18 8:19 ` Junio C Hamano
2012-09-18 8:55 ` Michael Haggerty
0 siblings, 1 reply; 7+ messages in thread
From: Junio C Hamano @ 2012-09-18 8:19 UTC (permalink / raw)
To: Michael Haggerty; +Cc: git
Michael Haggerty <mhagger@alum.mit.edu> writes:
> 1. Document that string_list sorts entries according to their strcmp()
> order, as proposed in this patch. Then fetch can rely on this ordering.
> If somebody wants a different ordering in the future, it is easy to
> make the sort order a parameter.
>
> 2. Leave string_list's "default" sort order unspecified, but already
> implement a way to let string_list users specify a particular sort
> order. Change the fetch machinery to specify a strcmp()-based ordering
> explicitly. This approach has the advantage of letting the default sort
> order of string_list to be changed, though I don't really see how this
> would be helpful.
>
> 3. Change fetch back to doing its own sorting again, rather than relying
> on sort_string_list(). This isn't a lot of work (inline the one line of
> sort_string_list(), then either make string-list.c:cmp_items() public or
> duplicate that function too).
I haven't looked at non-users of string-list API, but my gut feeling
has been that lack of 2. made the API less useful for current
non-users, possible callers that could benefit from something like
string-list that lets them specify their own sort order.
Also, I actually am more worried about us wanting to change the
order in which ref-list is sorted, rather than somebody randomly
deciding to change the default (and only) order string-list is
sorted on. When that happens, we would have even less useful
string-list left behind, with a documented invariant that is not
helping anybody if we choose to do 1 now.
So to me, if we really wanted to avoid doing 2 prematurely (which is
a sensible concern, by the way), the more sensible option between 1
and 3 would be 3. That makes my preference 2, 3 and then 1 (but I
do not care too deeply between 2 and 3).
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] string_list API: document what "sorted" means.
2012-09-18 8:19 ` Junio C Hamano
@ 2012-09-18 8:55 ` Michael Haggerty
2012-09-18 17:21 ` Junio C Hamano
0 siblings, 1 reply; 7+ messages in thread
From: Michael Haggerty @ 2012-09-18 8:55 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
On 09/18/2012 10:19 AM, Junio C Hamano wrote:
> Michael Haggerty <mhagger@alum.mit.edu> writes:
>
>> 1. Document that string_list sorts entries according to their strcmp()
>> order, as proposed in this patch. Then fetch can rely on this ordering.
>> If somebody wants a different ordering in the future, it is easy to
>> make the sort order a parameter.
>>
>> 2. Leave string_list's "default" sort order unspecified, but already
>> implement a way to let string_list users specify a particular sort
>> order. Change the fetch machinery to specify a strcmp()-based ordering
>> explicitly. This approach has the advantage of letting the default sort
>> order of string_list to be changed, though I don't really see how this
>> would be helpful.
>>
>> 3. Change fetch back to doing its own sorting again, rather than relying
>> on sort_string_list(). This isn't a lot of work (inline the one line of
>> sort_string_list(), then either make string-list.c:cmp_items() public or
>> duplicate that function too).
>
> I haven't looked at non-users of string-list API, but my gut feeling
> has been that lack of 2. made the API less useful for current
> non-users, possible callers that could benefit from something like
> string-list that lets them specify their own sort order.
>
> Also, I actually am more worried about us wanting to change the
> order in which ref-list is sorted, rather than somebody randomly
> deciding to change the default (and only) order string-list is
> sorted on. When that happens, we would have even less useful
> string-list left behind, with a documented invariant that is not
> helping anybody if we choose to do 1 now.
If another sort order is needed, then we will either have to audit
existing string_list users to make sure that they don't rely on strcmp()
ordering, or we will have to implement strcmp() ordering *plus* the new
ordering. In that case the simplest approach would be to convert all
existing users to explicit strcmp() ordering and only use the new
ordering in places where there is a specific need for it. This process
would be no more difficult if we document strcmp() ordering now.
When/if we reach that bridge, we can easily change the documentation from
A "sorted" list is one whose entries are sorted by string value in
`strcmp()` order.
to
A "sorted" list is one whose entries are sorted according to the
configured `compare()` function. The default `compare()` function
orders entries in the `strcmp()` order of their string values.
It's not that I'm unwilling to implement 2; it's just that I still don't
see any advantage to doing so before there is a demonstrated need for it.
Michael
--
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] string_list API: document what "sorted" means.
2012-09-18 8:55 ` Michael Haggerty
@ 2012-09-18 17:21 ` Junio C Hamano
2012-09-19 7:35 ` Michael Haggerty
0 siblings, 1 reply; 7+ messages in thread
From: Junio C Hamano @ 2012-09-18 17:21 UTC (permalink / raw)
To: Michael Haggerty; +Cc: git
Michael Haggerty <mhagger@alum.mit.edu> writes:
> If another sort order is needed, then we will either have to audit
> existing string_list users to make sure that they don't rely on strcmp()
> ordering, or we will have to implement strcmp() ordering *plus* the new
> ordering.
What I was envisioning was to pass in an optional custom comparison
when you instantiate a string_list object. Existing callers that
rely on (or can live with, because they do not care about the exact
order) the default order will continue to use the byte-value ordering,
so there is no need for auditing. Only the new callers that want
different ordering would set custom comparison routine.
But now I wrote it down, I realize that there is no _harm_ in
documenting "we sort in byte-value order, so expect iterations on
sorted string list to give them in that order to you" at all.
So let's go with your documentation patch as-is.
> It's not that I'm unwilling to implement 2; it's just that I still don't
> see any advantage to doing so before there is a demonstrated need for it.
As I said, I have this suspicion that the lack of demonstrated need
is largely because the existing code that do _not_ use string-list
don't do so because the interface is limited, so the argument is
sort of self-fulfilling.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] string_list API: document what "sorted" means.
2012-09-18 17:21 ` Junio C Hamano
@ 2012-09-19 7:35 ` Michael Haggerty
0 siblings, 0 replies; 7+ messages in thread
From: Michael Haggerty @ 2012-09-19 7:35 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
On 09/18/2012 07:21 PM, Junio C Hamano wrote:
> Michael Haggerty <mhagger@alum.mit.edu> writes:
>
>> If another sort order is needed, then we will either have to audit
>> existing string_list users to make sure that they don't rely on strcmp()
>> ordering, or we will have to implement strcmp() ordering *plus* the new
>> ordering.
>
> What I was envisioning was to pass in an optional custom comparison
> when you instantiate a string_list object. Existing callers that
> rely on (or can live with, because they do not care about the exact
> order) the default order will continue to use the byte-value ordering,
> so there is no need for auditing. Only the new callers that want
> different ordering would set custom comparison routine.
>
> But now I wrote it down, I realize that there is no _harm_ in
> documenting "we sort in byte-value order, so expect iterations on
> sorted string list to give them in that order to you" at all.
>
> So let's go with your documentation patch as-is.
>
>> It's not that I'm unwilling to implement 2; it's just that I still don't
>> see any advantage to doing so before there is a demonstrated need for it.
>
> As I said, I have this suspicion that the lack of demonstrated need
> is largely because the existing code that do _not_ use string-list
> don't do so because the interface is limited, so the argument is
> sort of self-fulfilling.
I've been looking for places in the code where string_list can be used,
so hopefully I'll get a feeling for what new functionality will increase
its applicability.
Now that my string_list enhancements are in master, I will start
trickling in patches to convert other code to using string_list when it
seems to benefit code length and/or understandability.
Michael
--
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2012-09-19 7:36 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-09-17 15:21 [PATCH] string_list API: document what "sorted" means Michael Haggerty
2012-09-17 21:17 ` Junio C Hamano
2012-09-18 7:58 ` Michael Haggerty
2012-09-18 8:19 ` Junio C Hamano
2012-09-18 8:55 ` Michael Haggerty
2012-09-18 17:21 ` Junio C Hamano
2012-09-19 7:35 ` Michael Haggerty
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).