public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/8] lib/sort: Heapsort implementation of sort()
  2005-01-31  7:34 [PATCH 0/8] lib/sort: Add generic sort to lib/ Matt Mackall
@ 2005-01-31  7:34 ` Matt Mackall
  2005-01-31 17:16   ` Andreas Gruenbacher
                     ` (2 more replies)
  0 siblings, 3 replies; 24+ messages in thread
From: Matt Mackall @ 2005-01-31  7:34 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel

This patch adds a generic array sorting library routine. This is meant
to replace qsort, which has two problem areas for kernel use.

The first issue is quadratic worst-case performance. While quicksort
worst-case datasets are rarely encountered in normal scenarios, it is
in fact quite easy to construct worst cases for almost all quicksort
algorithms given source or access to an element comparison callback.
This could allow attackers to cause sorts that would otherwise take
less than a millisecond to take seconds and sorts that should take
less than a second to take weeks or months. Fixing this problem
requires randomizing pivot selection with a secure random number
generator, which is rather expensive.

The second is that quicksort's recursion tracking requires either
nontrivial amounts of stack space or dynamic memory allocation and out
of memory error handling.

By comparison, heapsort has both O(n log n) average and worst-case
performance and practically no extra storage requirements. This
version runs within 70-90% of the average performance of optimized
quicksort so it should be an acceptable replacement wherever quicksort
would be used in the kernel.

Note that this function has an extra parameter for passing in an
optimized swapping function. This is worth 10% or more over the
typical byte-by-byte exchange functions.

Benchmarks:

qsort:     glibc variant            1189 bytes (+ 256/1024 stack)
qsort_3f:  my simplified variant     459 bytes (+ 256/1024 stack)
heapsort:  the version below         346 bytes
shellsort: an optimized shellsort    196 bytes

                         P4 1.8GHz        Opteron 1.4GHz (32-bit)
size   algorithm      cycles relative        cycles relative
100:
           qsort:      38682 100.00%	      27631 100.00%
        qsort_3f:      36277 106.63%	      22406 123.32%
        heapsort:      43574  88.77%	      30301  91.19%
       shellsort:      39087  98.97%	      25139 109.91%
200:									    
           qsort:      86468 100.00%	      61148 100.00%
        qsort_3f:      78918 109.57%	      48959 124.90%
        heapsort:      98040  88.20%	      68235  89.61%
       shellsort:      95688  90.36%	      62279  98.18%
400:									    
           qsort:     187720 100.00%	     131313 100.00%
        qsort_3f:     174905 107.33%	     107954 121.64%
        heapsort:     223896  83.84%	     154241  85.13%
       shellsort:     223037  84.17%	     148990  88.14%
800:									    
           qsort:     407060 100.00%	     287460 100.00%
        qsort_3f:     385106 105.70%	     239131 120.21%
        heapsort:     484662  83.99%	     340099  84.52%
       shellsort:     537110  75.79%	     354755  81.03%
1600:									    
           qsort:     879596 100.00%	     621331 100.00%
        qsort_3f:     861568 102.09%	     522013 119.03%
        heapsort:    1079750  81.46%	     746677  83.21%
       shellsort:    1234243  71.27%	     820782  75.70%
3200:									    
           qsort:    1903902 100.00%	    1342126 100.00%
        qsort_3f:    1908816  99.74%	    1131496 118.62%
        heapsort:    2515493  75.69%	    1630333  82.32%
       shellsort:    2985339  63.78%	    1964794  68.31%
6400:									    
           qsort:    4046370 100.00%	    2909215 100.00%
        qsort_3f:    4164468  97.16%	    2468393 117.86%
        heapsort:    5150659  78.56%	    3533585  82.33%
       shellsort:    6650225  60.85%	    4429849  65.67%
12800:									    
           qsort:    8729730 100.00%	    6185097 100.00%
        qsort_3f:    8776885  99.46%	    5288826 116.95%
        heapsort:   11064224  78.90%	    7603061  81.35%
       shellsort:   15487905  56.36%	   10305163  60.02%
25600:									    
           qsort:   18357770 100.00%	   13172205 100.00%
        qsort_3f:   18687842  98.23%	   11337115 116.19%
        heapsort:   24121241  76.11%	   16612122  79.29%
       shellsort:   35552814  51.64%	   24106987  54.64%
51200:									    
           qsort:   38658883 100.00%	   28008505 100.00%
        qsort_3f:   39498463  97.87%	   24339675 115.07%
        heapsort:   50553552  76.47%	   37013828  75.67%
       shellsort:   82602416  46.80%	   56201889  49.84%
102400:									    
           qsort:   81197794 100.00%	   58918933 100.00%
        qsort_3f:   84257930  96.37%	   51986219 113.34%
        heapsort:  110540577  73.46%	   81419675  72.36%
       shellsort:  191303132  42.44%	  129786472  45.40%


Signed-off-by: Matt Mackall <mpm@selenic.com>

Index: mm2/lib/Makefile
===================================================================
--- mm2.orig/lib/Makefile	2005-01-30 21:26:28.000000000 -0800
+++ mm2/lib/Makefile	2005-01-30 22:37:49.000000000 -0800
@@ -6,7 +6,7 @@
 	 bust_spinlocks.o rbtree.o radix-tree.o dump_stack.o \
 	 kobject.o kref.o idr.o div64.o parser.o int_sqrt.o \
 	 bitmap.o extable.o kobject_uevent.o prio_tree.o \
-	 sha1.o halfmd4.o
+	 sha1.o halfmd4.o sort.o
 
 ifeq ($(CONFIG_DEBUG_KOBJECT),y)
 CFLAGS_kobject.o += -DDEBUG
Index: mm2/include/linux/sort.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ mm2/include/linux/sort.h	2005-01-30 22:06:37.000000000 -0800
@@ -0,0 +1,8 @@
+#ifndef _LINUX_SORT_H
+#define _LINUX_SORT_H
+
+void sort(void *base, size_t num, size_t size,
+	  int (*cmp)(const void *, const void *),
+	  void (*swap)(void *, void *, int));
+
+#endif
Index: mm2/lib/sort.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ mm2/lib/sort.c	2005-01-30 22:37:28.000000000 -0800
@@ -0,0 +1,121 @@
+/*
+ * A fast, small, non-recursive O(nlog n) sort for the Linux kernel
+ *
+ * Jan 23 2005  Matt Mackall <mpm@selenic.com>
+ */
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+
+void u32_swap(void *a, void *b, int size)
+{
+	u32 t = *(u32 *)a;
+	*(u32 *)a = *(u32 *)b;
+	*(u32 *)b = t;
+}
+
+void generic_swap(void *a, void *b, int size)
+{
+	char t;
+
+	do {
+		t = *(char *)a;
+		*(char *)a++ = *(char *)b;
+		*(char *)b++ = t;
+	} while (--size > 0);
+}
+
+/*
+ * sort - sort an array of elements
+ * @base: pointer to data to sort
+ * @num: number of elements
+ * @size: size of each element
+ * @cmp: pointer to comparison function
+ * @swap: pointer to swap function or NULL
+ *
+ * This function does a heapsort on the given array. You may provide a
+ * swap function optimized to your element type.
+ *
+ * Sorting time is O(n log n) both on average and worst-case. While
+ * qsort is about 20% faster on average, it suffers from exploitable
+ * O(n*n) worst-case behavior and extra memory requirements that make
+ * it less suitable for kernel use.
+ */
+
+void sort(void *base, size_t num, size_t size,
+	  int (*cmp)(const void *, const void *),
+	  void (*swap)(void *, void *, int size))
+{
+	/* pre-scale counters for performance */
+	int i = (num/2 - 1) * size, n = num * size, c, r;
+
+	if (!swap)
+		swap = (size == 4 ? u32_swap : generic_swap);
+
+	/* heapify */
+	for ( ; i >= 0; i -= size) {
+		for (r = i; r * 2 < n; r  = c) {
+			c = r * 2;
+			if (c < n - size && cmp(base + c, base + c + size) < 0)
+				c += size;
+			if (cmp(base + r, base + c) >= 0)
+				break;
+			swap(base + r, base + c, size);
+		}
+	}
+
+	/* sort */
+	for (i = n - size; i >= 0; i -= size) {
+		swap(base, base + i, size);
+		for (r = 0; r * 2 < i; r = c) {
+			c = r * 2;
+			if (c < i - size && cmp(base + c, base + c + size) < 0)
+				c += size;
+			if (cmp(base + r, base + c) >= 0)
+				break;
+			swap(base + r, base + c, size);
+		}
+	}
+}
+
+EXPORT_SYMBOL_GPL(sort);
+
+MODULE_LICENSE("GPL");
+
+#if 1
+/* a simple boot-time regression test */
+
+int cmpint(const void *a, const void *b)
+{
+	return *(int *)a - *(int *)b;
+}
+
+static int sort_test(void)
+{
+	int *a, i, r = 0;
+
+	a = kmalloc(1000 * sizeof(int), GFP_KERNEL);
+	BUG_ON(!a);
+
+	printk("testing sort()\n");
+
+	for (i = 0; i < 1000; i++) {
+		r = (r * 725861) % 6599;
+		a[i] = r;
+	}
+
+	sort(a, 1000, sizeof(int), cmpint, 0);
+
+	for (i = 0; i < 999; i++)
+		if (a[i] > a[i+1]) {
+			printk("sort() failed!\n");
+			break;
+		}
+
+	kfree(a);
+
+	return 0;
+}
+
+module_init(sort_test);
+#endif

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/8] lib/sort: Heapsort implementation of sort()
@ 2005-01-31 11:52 Alexey Dobriyan
  2005-01-31 16:53 ` Matt Mackall
  0 siblings, 1 reply; 24+ messages in thread
From: Alexey Dobriyan @ 2005-01-31 11:52 UTC (permalink / raw)
  To: Matt Mackall; +Cc: Andrew Morton, linux-kernel

On Mon, 31 Jan 2005 01:34:59 -0600, Matt Mackall wrote:

> This patch adds a generic array sorting library routine. This is meant
> to replace qsort, which has two problem areas for kernel use.

> --- /dev/null
> +++ mm2/include/linux/sort.h
> @@ -0,0 +1,8 @@

> +void sort(void *base, size_t num, size_t size,
> +	  int (*cmp)(const void *, const void *),
> +	  void (*swap)(void *, void *, int));

extern void sort(...) ?

	Alexey

P. S.: Andrew's email ends with ".org".

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/8] lib/sort: Heapsort implementation of sort()
  2005-01-31 11:52 [PATCH 1/8] lib/sort: Heapsort implementation of sort() Alexey Dobriyan
@ 2005-01-31 16:53 ` Matt Mackall
  0 siblings, 0 replies; 24+ messages in thread
From: Matt Mackall @ 2005-01-31 16:53 UTC (permalink / raw)
  To: Alexey Dobriyan; +Cc: Andrew Morton, linux-kernel

On Mon, Jan 31, 2005 at 01:52:57PM +0200, Alexey Dobriyan wrote:
> On Mon, 31 Jan 2005 01:34:59 -0600, Matt Mackall wrote:
> 
> > This patch adds a generic array sorting library routine. This is meant
> > to replace qsort, which has two problem areas for kernel use.
> 
> > --- /dev/null
> > +++ mm2/include/linux/sort.h
> > @@ -0,0 +1,8 @@
> 
> > +void sort(void *base, size_t num, size_t size,
> > +	  int (*cmp)(const void *, const void *),
> > +	  void (*swap)(void *, void *, int));
> 
> extern void sort(...) ?

Extern is 100% redundant on function declarations.

> P. S.: Andrew's email ends with ".org".

I should not mail off patches just before going to sleep.

-- 
Mathematics is the supreme nostalgia of our time.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/8] lib/sort: Heapsort implementation of sort()
  2005-01-31  7:34 ` [PATCH 1/8] lib/sort: Heapsort implementation of sort() Matt Mackall
@ 2005-01-31 17:16   ` Andreas Gruenbacher
  2005-01-31 17:30     ` Paulo Marques
                       ` (2 more replies)
  2005-02-01  2:10   ` Horst von Brand
  2005-02-27 13:17   ` Andreas Gruenbacher
  2 siblings, 3 replies; 24+ messages in thread
From: Andreas Gruenbacher @ 2005-01-31 17:16 UTC (permalink / raw)
  To: Matt Mackall; +Cc: Andrew Morton, linux-kernel@vger.kernel.org

Hello,

On Mon, 2005-01-31 at 08:34, Matt Mackall wrote:
> This patch adds a generic array sorting library routine. This is meant
> to replace qsort, which has two problem areas for kernel use.

looks reasonable.

> Note that this function has an extra parameter for passing in an
> optimized swapping function. This is worth 10% or more over the
> typical byte-by-byte exchange functions.

I would appreciate a version without the swap callback. The optimized
version of swap should use the machine word size instead of u32. How
about this approach instead, if you think we must really optimize
swapping?

static inline void swap(void *a, void *b, int size)
{
        if (size % sizeof(long)) {
                char t;
                do {
                        t = *(char *)a;
                        *(char *)a++ = *(char *)b;
                        *(char *)b++ = t;
                } while (--size > 0);
        } else {
                long t;
                do {
                        t = *(long *)a;
                        *(long *)a = *(long *)b;
                        *(long *)b = t;
                        size -= sizeof(long);
                } while (size > sizeof(long));
        }
}

static inline
void __sort(void *base, size_t num, size_t size,
          int (*cmp)(const void *, const void *))
{
...
}

void sort(void *base, size_t num, size_t size,
          int (*cmp)(const void *, const void *)) {
        if (size == sizeof(long)) {
                __sort(base, num, size, cmp);
        } else {
                __sort(base, num, size, cmp);
        }
}

The code size doubles, but it's still hardly an issue. gcc will refuse
to inline things fully without __attribute((allways_inline)). (Note that
__builtin_constant_p doesn't work for inline functions; else we could do
better.)


Cheers,
-- 
Andreas Gruenbacher <agruen@suse.de>
SUSE Labs, SUSE LINUX GMBH


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/8] lib/sort: Heapsort implementation of sort()
  2005-01-31 17:16   ` Andreas Gruenbacher
@ 2005-01-31 17:30     ` Paulo Marques
  2005-02-01 17:54       ` Andreas Gruenbacher
  2005-01-31 19:30     ` Matt Mackall
  2005-02-02 10:50     ` Herbert Xu
  2 siblings, 1 reply; 24+ messages in thread
From: Paulo Marques @ 2005-01-31 17:30 UTC (permalink / raw)
  To: Andreas Gruenbacher
  Cc: Matt Mackall, Andrew Morton, linux-kernel@vger.kernel.org

Andreas Gruenbacher wrote:
> [...]
> 
> static inline void swap(void *a, void *b, int size)
> {
>         if (size % sizeof(long)) {
>                 char t;
>                 do {
>                         t = *(char *)a;
>                         *(char *)a++ = *(char *)b;
>                         *(char *)b++ = t;
>                 } while (--size > 0);
>         } else {
>                 long t;
>                 do {
>                         t = *(long *)a;
>                         *(long *)a = *(long *)b;
>                         *(long *)b = t;
>                         size -= sizeof(long);
>                 } while (size > sizeof(long));

You forgot to increment a and b, and this should be "while (size);", no?

>         }
> }

Or better yet,

static inline void swap(void *a, void *b, int size)
{
	long tl;
         char t;

	while (size >= sizeof(long)) {
                 tl = *(long *)a;
                 *(long *)a = *(long *)b;
                 *(long *)b = tl;
		a += sizeof(long);
		b += sizeof(long);
                 size -= sizeof(long);
	}
	while (size) {
                 t = *(char *)a;
                 *(char *)a++ = *(char *)b;
                 *(char *)b++ = t;
		size--;
         }
}

This works better if the size is not a multiple of sizeof(long), but is 
bigger than a long.

However it seems that this should be put in a generic library function...

-- 
Paulo Marques - www.grupopie.com

All that is necessary for the triumph of evil is that good men do nothing.
Edmund Burke (1729 - 1797)

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/8] lib/sort: Heapsort implementation of sort()
  2005-01-31 17:16   ` Andreas Gruenbacher
  2005-01-31 17:30     ` Paulo Marques
@ 2005-01-31 19:30     ` Matt Mackall
  2005-02-01 17:50       ` Andreas Gruenbacher
  2005-02-02 10:50     ` Herbert Xu
  2 siblings, 1 reply; 24+ messages in thread
From: Matt Mackall @ 2005-01-31 19:30 UTC (permalink / raw)
  To: Andreas Gruenbacher; +Cc: Andrew Morton, linux-kernel@vger.kernel.org

On Mon, Jan 31, 2005 at 06:16:23PM +0100, Andreas Gruenbacher wrote:
> Hello,
> 
> On Mon, 2005-01-31 at 08:34, Matt Mackall wrote:
> > This patch adds a generic array sorting library routine. This is meant
> > to replace qsort, which has two problem areas for kernel use.
> 
> looks reasonable.
> 
> > Note that this function has an extra parameter for passing in an
> > optimized swapping function. This is worth 10% or more over the
> > typical byte-by-byte exchange functions.
> 
> I would appreciate a version without the swap callback.

Why? To eliminate an argument?

> The optimized version of swap should use the machine word size
> instead of u32.

That occurred to me, but most instances I found in my audit were using
u32 rather than long.

> How about this approach instead, if you think we
> must really optimize swapping?
>
> static inline void swap(void *a, void *b, int size)
> {
>         if (size % sizeof(long)) {
>                 char t;
>                 do {
>                         t = *(char *)a;
>                         *(char *)a++ = *(char *)b;
>                         *(char *)b++ = t;
>                 } while (--size > 0);
>         } else {
>                 long t;
>                 do {
>                         t = *(long *)a;
>                         *(long *)a = *(long *)b;
>                         *(long *)b = t;
>                         size -= sizeof(long);
>                 } while (size > sizeof(long));
>         }
> }

This makes things worse. Sort isn't inlined, so we don't know size
until we're called and then we're branching in the innermost loop and
growing the code footprint to boot. Function pointer wins in my
benchmarks.

Note that there are callers like IA64 extable that want a custom swap already:

- * stack may be more than 2GB away from the exception-table).
+ * Sort the exception table. It's usually already sorted, but there
+ * may be unordered entries due to multiple text sections (such as the
+ * .init text section). Note that the exception-table-entries contain
+ * location-relative addresses, which requires a bit of care during
+ * sorting to avoid overflows in the offset members (e.g., it would
+ * not be safe to make a temporary copy of an exception-table entry on
+ * the stack, because the stack may be more than 2GB away from the
+ * exception-table).
  */

There are a bunch of other potential users that sort parallel arrays
a[] and b[] with keys in a[] that want this too.

-- 
Mathematics is the supreme nostalgia of our time.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/8] lib/sort: Heapsort implementation of sort()
  2005-01-31  7:34 ` [PATCH 1/8] lib/sort: Heapsort implementation of sort() Matt Mackall
  2005-01-31 17:16   ` Andreas Gruenbacher
@ 2005-02-01  2:10   ` Horst von Brand
  2005-02-27 13:17   ` Andreas Gruenbacher
  2 siblings, 0 replies; 24+ messages in thread
From: Horst von Brand @ 2005-02-01  2:10 UTC (permalink / raw)
  To: Matt Mackall; +Cc: Andrew Morton, linux-kernel

Matt Mackall <mpm@selenic.com> said:
> This patch adds a generic array sorting library routine. This is meant
> to replace qsort, which has two problem areas for kernel use.

Another problem is the GPL license. It will certainly be wanted from
non-GPL (e.g., binary) modules. Please just EXPORT_SYMBOL it.
-- 
Dr. Horst H. von Brand                   User #22616 counter.li.org
Departamento de Informatica                     Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria              +56 32 654239
Casilla 110-V, Valparaiso, Chile                Fax:  +56 32 797513

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/8] lib/sort: Heapsort implementation of sort()
  2005-01-31 19:30     ` Matt Mackall
@ 2005-02-01 17:50       ` Andreas Gruenbacher
  2005-02-02  1:00         ` Horst von Brand
  0 siblings, 1 reply; 24+ messages in thread
From: Andreas Gruenbacher @ 2005-02-01 17:50 UTC (permalink / raw)
  To: Matt Mackall; +Cc: Andrew Morton, linux-kernel@vger.kernel.org

On Mon, 2005-01-31 at 20:30, Matt Mackall wrote:
> On Mon, Jan 31, 2005 at 06:16:23PM +0100, Andreas Gruenbacher wrote:
> > Hello,
> > 
> > On Mon, 2005-01-31 at 08:34, Matt Mackall wrote:
> > > This patch adds a generic array sorting library routine. This is meant
> > > to replace qsort, which has two problem areas for kernel use.
> > 
> > looks reasonable.
> > 
> > > Note that this function has an extra parameter for passing in an
> > > optimized swapping function. This is worth 10% or more over the
> > > typical byte-by-byte exchange functions.
> > 
> > I would appreciate a version without the swap callback.
> 
> Why? To eliminate an argument?

Yes, because a custom swap routine isn't very useful generally. It's
over-engineered IMHO.

> > The optimized version of swap should use the machine word size
> > instead of u32.
> 
> That occurred to me, but most instances I found in my audit were using
> u32 rather than long.

Alright, that may be the case.

> > How about this approach instead, if you think we
> > must really optimize swapping?
> >
> > static inline void swap(void *a, void *b, int size)
> > {
> >         if (size % sizeof(long)) {
> >                 char t;
> >                 do {
> >                         t = *(char *)a;
> >                         *(char *)a++ = *(char *)b;
> >                         *(char *)b++ = t;
> >                 } while (--size > 0);
> >         } else {
> >                 long t;
> >                 do {
> >                         t = *(long *)a;
> >                         *(long *)a = *(long *)b;
> >                         *(long *)b = t;
> >                         size -= sizeof(long);
> >                 } while (size > sizeof(long));
> >         }
> > }
> 
> This makes things worse. Sort isn't inlined, so we don't know size
> until we're called and then we're branching in the innermost loop

Well, the example code I referred to had an inlined __sort, and the size
== sizeof(long) case ended up being perfectly optimized. It bloats the
code somewhat though, but it's still tiny.

> and growing the code footprint to boot.

You mean the object footprint? I don't care much whether we use some 350
bytes more or not.

> Function pointer wins in my benchmarks.

I had a slowdown in the low percentage range when doing bytewise swap
instead of wordwise swap as well, but function pointers were about as
fast as wordwise swap. So no significant gain.

> Note that there are callers like IA64 extable that want a custom swap already:
> 
> - * stack may be more than 2GB away from the exception-table).
> + * Sort the exception table. It's usually already sorted, but there
> + * may be unordered entries due to multiple text sections (such as the
> + * .init text section). Note that the exception-table-entries contain
> + * location-relative addresses, which requires a bit of care during
> + * sorting to avoid overflows in the offset members (e.g., it would
> + * not be safe to make a temporary copy of an exception-table entry on
> + * the stack, because the stack may be more than 2GB away from the
> + * exception-table).
>   */

We could either leave the current insertion sort in place in
arch/ia64/mm/extable.c, or transform the exception table into sortable
form first, as in the below mock-up.

+       /* Exception-table-entries contain location-relative addresses. Convert
+          to addresses relative to START before sorting, and convert back to
+          location-relative addresses afterwards. This allows us to use the
+          general-purpose sort function. */
+       for (p = start+1; p < finish; p++) {
+               int n = (void *)p - (void *)start;
+               p->addr += n;
+               p->cont += n;
+       }
+       sort(start, finish - start, sizeof(struct exception_table_entry),
+            compare_entries);
+       for (p = start+1; p < finish; p++) {
+               int n = (void *)p - (void *)start;
+               p->addr -= n;
+               p->cont -= n;
+       }

Switching to the generic sort probably isn't worth it in this case.

> There are a bunch of other potential users that sort parallel arrays
> a[] and b[] with keys in a[] that want this too.

Where?

Thanks,
-- 
Andreas Gruenbacher <agruen@suse.de>
SUSE Labs, SUSE LINUX GMBH


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/8] lib/sort: Heapsort implementation of sort()
  2005-01-31 17:30     ` Paulo Marques
@ 2005-02-01 17:54       ` Andreas Gruenbacher
  2005-02-01 18:11         ` linux-os
  0 siblings, 1 reply; 24+ messages in thread
From: Andreas Gruenbacher @ 2005-02-01 17:54 UTC (permalink / raw)
  To: Paulo Marques; +Cc: Matt Mackall, Andrew Morton, linux-kernel@vger.kernel.org

On Mon, 2005-01-31 at 18:30, Paulo Marques wrote:
> Andreas Gruenbacher wrote:
> > [...]
> > 
> > static inline void swap(void *a, void *b, int size)
> > {
> >         if (size % sizeof(long)) {
> >                 char t;
> >                 do {
> >                         t = *(char *)a;
> >                         *(char *)a++ = *(char *)b;
> >                         *(char *)b++ = t;
> >                 } while (--size > 0);
> >         } else {
> >                 long t;
> >                 do {
> >                         t = *(long *)a;
> >                         *(long *)a = *(long *)b;
> >                         *(long *)b = t;
> >                         size -= sizeof(long);
> >                 } while (size > sizeof(long));
> 
> You forgot to increment a and b, and this should be "while (size);", no?

Correct, yes.

> Or better yet,
> 
> static inline void swap(void *a, void *b, int size)
> {
> 	long tl;
>          char t;
> 
> 	while (size >= sizeof(long)) {
>                  tl = *(long *)a;
>                  *(long *)a = *(long *)b;
>                  *(long *)b = tl;
> 		a += sizeof(long);
> 		b += sizeof(long);
>                  size -= sizeof(long);
> 	}
> 	while (size) {
>                  t = *(char *)a;
>                  *(char *)a++ = *(char *)b;
>                  *(char *)b++ = t;
> 		size--;
>          }
> }
> 
> This works better if the size is not a multiple of sizeof(long), but is 
> bigger than a long.

In case size is not a multiple of sizeof(long) here, you'll have
unaligned memory accesses most of the time anyway, so it probably
doesn't really matter.

Thanks,
-- 
Andreas Gruenbacher <agruen@suse.de>
SUSE Labs, SUSE LINUX GMBH


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/8] lib/sort: Heapsort implementation of sort()
  2005-02-01 17:54       ` Andreas Gruenbacher
@ 2005-02-01 18:11         ` linux-os
  2005-02-01 19:04           ` linux-os
  2005-02-01 19:47           ` Andreas Gruenbacher
  0 siblings, 2 replies; 24+ messages in thread
From: linux-os @ 2005-02-01 18:11 UTC (permalink / raw)
  To: Andreas Gruenbacher
  Cc: Paulo Marques, Matt Mackall, Andrew Morton, linux-kernel

On Tue, 1 Feb 2005, Andreas Gruenbacher wrote:

> On Mon, 2005-01-31 at 18:30, Paulo Marques wrote:
>> Andreas Gruenbacher wrote:
>>> [...]
>>>
>>> static inline void swap(void *a, void *b, int size)
>>> {
>>>         if (size % sizeof(long)) {
>>>                 char t;
>>>                 do {
>>>                         t = *(char *)a;
>>>                         *(char *)a++ = *(char *)b;
>>>                         *(char *)b++ = t;
>>>                 } while (--size > 0);
>>>         } else {
>>>                 long t;
>>>                 do {
>>>                         t = *(long *)a;
>>>                         *(long *)a = *(long *)b;
>>>                         *(long *)b = t;
>>>                         size -= sizeof(long);
>>>                 } while (size > sizeof(long));
>>
>> You forgot to increment a and b, and this should be "while (size);", no?
>
> Correct, yes.
>
>> Or better yet,
>>
>> static inline void swap(void *a, void *b, int size)
>> {
>> 	long tl;
>>          char t;
>>
>> 	while (size >= sizeof(long)) {
>>                  tl = *(long *)a;
>>                  *(long *)a = *(long *)b;
>>                  *(long *)b = tl;
>> 		a += sizeof(long);
>> 		b += sizeof(long);
>>                  size -= sizeof(long);
>> 	}
>> 	while (size) {
>>                  t = *(char *)a;
>>                  *(char *)a++ = *(char *)b;
>>                  *(char *)b++ = t;
>> 		size--;
>>          }
>> }
>>
>> This works better if the size is not a multiple of sizeof(long), but is
>> bigger than a long.
>
> In case size is not a multiple of sizeof(long) here, you'll have
> unaligned memory accesses most of the time anyway, so it probably
> doesn't really matter.
>
> Thanks,
> -- 
> Andreas Gruenbacher <agruen@suse.de>
> SUSE Labs, SUSE LINUX GMBH

This uses an GNU-ISM where you are doing pointer arithmetic
on a pointer-to-void. NotGood(tm) this is an example of
where there is absolutely no rationale whatsoever for
writing such code.


Cheers,
Dick Johnson
Penguin : Linux version 2.6.10 on an i686 machine (5537.79 BogoMips).
  Notice : All mail here is now cached for review by Dictator Bush.
                  98.36% of all statistics are fiction.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/8] lib/sort: Heapsort implementation of sort()
  2005-02-01 18:11         ` linux-os
@ 2005-02-01 19:04           ` linux-os
  2005-02-01 19:47           ` Andreas Gruenbacher
  1 sibling, 0 replies; 24+ messages in thread
From: linux-os @ 2005-02-01 19:04 UTC (permalink / raw)
  To: Andreas Gruenbacher
  Cc: Paulo Marques, Matt Mackall, Andrew Morton, Linux kernel

On Tue, 1 Feb 2005, linux-os wrote:

> On Tue, 1 Feb 2005, Andreas Gruenbacher wrote:
>
>> On Mon, 2005-01-31 at 18:30, Paulo Marques wrote:
>>> Andreas Gruenbacher wrote:
>>>> [...]
>>>> 
>>>> static inline void swap(void *a, void *b, int size)
>>>> {
>>>>         if (size % sizeof(long)) {
>>>>                 char t;
>>>>                 do {
>>>>                         t = *(char *)a;
>>>>                         *(char *)a++ = *(char *)b;
>>>>                         *(char *)b++ = t;
>>>>                 } while (--size > 0);
>>>>         } else {
>>>>                 long t;
>>>>                 do {
>>>>                         t = *(long *)a;
>>>>                         *(long *)a = *(long *)b;
>>>>                         *(long *)b = t;
>>>>                         size -= sizeof(long);
>>>>                 } while (size > sizeof(long));
>>> 
>>> You forgot to increment a and b, and this should be "while (size);", no?
>> 
>> Correct, yes.
>> 
>>> Or better yet,
>>> 
>>> static inline void swap(void *a, void *b, int size)
>>> {
>>> 	long tl;
>>>          char t;
>>> 
>>> 	while (size >= sizeof(long)) {
>>>                  tl = *(long *)a;
>>>                  *(long *)a = *(long *)b;
>>>                  *(long *)b = tl;
>>> 		a += sizeof(long);
>>> 		b += sizeof(long);
>>>                  size -= sizeof(long);
>>> 	}
>>> 	while (size) {
>>>                  t = *(char *)a;
>>>                  *(char *)a++ = *(char *)b;
>>>                  *(char *)b++ = t;
>>> 		size--;
>>>          }
>>> }
>>> 
>>> This works better if the size is not a multiple of sizeof(long), but is
>>> bigger than a long.
>> 
>> In case size is not a multiple of sizeof(long) here, you'll have
>> unaligned memory accesses most of the time anyway, so it probably
>> doesn't really matter.
>> 
>> Thanks,
>> -- 
>> Andreas Gruenbacher <agruen@suse.de>
>> SUSE Labs, SUSE LINUX GMBH
>
> This uses an GNU-ISM where you are doing pointer arithmetic
> on a pointer-to-void. NotGood(tm) this is an example of
> where there is absolutely no rationale whatsoever for
> writing such code.
>

Here is swap with no games. Plus it handles the case where
sizeof(long) is not the same as sizeof(int).


void swap(void *a, void *b, size_t len)
{
     while(len >= sizeof(long))
     {
         long one, two;
         long *p0 = a;
         long *p1 = b;
         one = *p0;
         two = *p1;
         *p1++ = one;
         *p0++ = two;
         len -= sizeof(long);
     }
     while(len >= sizeof(int))	// Compiler may even ignore
     {
         int one, two;
         int *p0 = a;
         int *p1 = b;
         one = *p0;
         two = *p1;
         *p1++ = one;
         *p0++ = two;
         len -= sizeof(int);
     }
     while(len--)
     {
         char one, two;
         char *p0 = a;
         char *p1 = b;
         one = *p0;
         two = *p1;
         *p1++ = one;
         *p0++ = two;
     }
}

//----------------------------------------------


And here is the output. You can make it in-line of you want,
but you really need to make in ANSI first.


 	.file	"xxx.c"
 	.text
 	.p2align 2,,3
.globl swap
 	.type	swap, @function
swap:
 	pushl	%ebp
 	movl	%esp, %ebp
 	movl	16(%ebp), %ecx
 	pushl	%esi
 	cmpl	$3, %ecx
 	pushl	%ebx
 	movl	8(%ebp), %esi
 	movl	12(%ebp), %ebx
 	jbe	.L17
 	.p2align 2,,3
.L5:
 	subl	$4, %ecx
 	movl	(%esi), %eax
 	movl	(%ebx), %edx
 	cmpl	$3, %ecx
 	movl	%eax, (%ebx)
 	movl	%edx, (%esi)
 	ja	.L5
.L17:
 	decl	%ecx
 	cmpl	$-1, %ecx
 	je	.L19
 	.p2align 2,,3
.L13:
 	decl	%ecx
 	movb	(%esi), %al
 	movb	(%ebx), %dl
 	cmpl	$-1, %ecx
 	movb	%al, (%ebx)
 	movb	%dl, (%esi)
 	jne	.L13
.L19:
 	popl	%ebx
 	popl	%esi
 	leave
 	ret
 	.size	swap, .-swap
 	.section	.note.GNU-stack,"",@progbits
 	.ident	"GCC: (GNU) 3.3.3 20040412 (Red Hat Linux 3.3.3-7)"

A lot of folks don't realilize that adding a new program-unit
after a conditional expression doesn't necessarily add any code.
In this case, it simply tells the compiler what we want to do.

Cheers,
Dick Johnson
Penguin : Linux version 2.6.10 on an i686 machine (5537.79 BogoMips).
  Notice : All mail here is now cached for review by Dictator Bush.
                  98.36% of all statistics are fiction.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/8] lib/sort: Heapsort implementation of sort()
  2005-02-01 18:11         ` linux-os
  2005-02-01 19:04           ` linux-os
@ 2005-02-01 19:47           ` Andreas Gruenbacher
  1 sibling, 0 replies; 24+ messages in thread
From: Andreas Gruenbacher @ 2005-02-01 19:47 UTC (permalink / raw)
  To: linux-os; +Cc: linux-kernel@vger.kernel.org

On Tue, 2005-02-01 at 19:11, linux-os wrote:
> This uses an GNU-ISM where you are doing pointer arithmetic
> on a pointer-to-void. NotGood(tm) [...]

That's perfectly fine inside the kernel.

-- Andreas.


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/8] lib/sort: Heapsort implementation of sort()
  2005-02-01 17:50       ` Andreas Gruenbacher
@ 2005-02-02  1:00         ` Horst von Brand
  0 siblings, 0 replies; 24+ messages in thread
From: Horst von Brand @ 2005-02-02  1:00 UTC (permalink / raw)
  To: Andreas Gruenbacher
  Cc: Matt Mackall, Andrew Morton, linux-kernel@vger.kernel.org

Andreas Gruenbacher <agruen@suse.de> said:

[...]

> Yes, because a custom swap routine isn't very useful generally. It's
> over-engineered IMHO.

It shouldn't swap, but juggle elements around like so:

    t --------------->+
    ^                 |
    |                 v
    x <-- x <-- x <-- x

Sure, this needs a temporary element, but reduces the data copying to
around 1/3 (1 swap == 3 copies, this makes a bit more than 1 copy for each
element moved). This is probably much more important than
microoptimizations in swap.

My tuned heapsort for doubles is:

/*
 * heapsort.c: Heap sort
 */

#include "sort.h"

void
sort(double a[], int n)

{
  double tmp;
  int i, j, k;
    
  /* Make heap */
  for(i = n / 2 - 1; i >= 0; i--) {
    /* downheap(a, i, n); */
    j = i;
    tmp = a[j];
    for(;;) {
      k = 2 * j + 1;
      if(k >= n)
	break;
      if(k + 1 < n && a[k + 1] > a[k])
	k++;
      if(tmp > a[k])
	break;
      a[j] = a[k];
      j = k;
    }
    a[j] = tmp;
  }
    
  /* Sort */
  for(i = n - 1; i >= 1; i--) {
    /* downheap(a, 1, i); swap(a[1], a[n]); */
    j = 0;
    tmp = a[j];
    for(;;) {
      k = 2 * j + 1;
      if(k > i)
	break;
      if(k + 1 <= i && a[k + 1] > a[k])
	k++;
      if(tmp > a[k])
	break;
      a[j] = a[k];
      j = k;
    }
    a[j] = tmp;

    tmp = a[0]; a[0] = a[i]; a[i] = tmp;
  }
}

Hack on it as you wish.
-- 
Dr. Horst H. von Brand                   User #22616 counter.li.org
Departamento de Informatica                     Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria              +56 32 654239
Casilla 110-V, Valparaiso, Chile                Fax:  +56 32 797513

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/8] lib/sort: Heapsort implementation of sort()
  2005-01-31 17:16   ` Andreas Gruenbacher
  2005-01-31 17:30     ` Paulo Marques
  2005-01-31 19:30     ` Matt Mackall
@ 2005-02-02 10:50     ` Herbert Xu
  2005-02-02 11:14       ` Andreas Gruenbacher
  2 siblings, 1 reply; 24+ messages in thread
From: Herbert Xu @ 2005-02-02 10:50 UTC (permalink / raw)
  To: Andreas Gruenbacher; +Cc: mpm, akpm, linux-kernel

Andreas Gruenbacher <agruen@suse.de> wrote:
> 
> static inline void swap(void *a, void *b, int size)
> {
>        if (size % sizeof(long)) {
>                char t;
>                do {
>                        t = *(char *)a;
>                        *(char *)a++ = *(char *)b;
>                        *(char *)b++ = t;
>                } while (--size > 0);
>        } else {
>                long t;
>                do {
>                        t = *(long *)a;
>                        *(long *)a = *(long *)b;
>                        *(long *)b = t;
>                        size -= sizeof(long);
>                } while (size > sizeof(long));
>        }
> }

What if a/b aren't aligned?
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/8] lib/sort: Heapsort implementation of sort()
  2005-02-02 10:50     ` Herbert Xu
@ 2005-02-02 11:14       ` Andreas Gruenbacher
  2005-02-03 23:19         ` Junio C Hamano
  0 siblings, 1 reply; 24+ messages in thread
From: Andreas Gruenbacher @ 2005-02-02 11:14 UTC (permalink / raw)
  To: Herbert Xu; +Cc: mpm, Andrew Morton, linux-kernel@vger.kernel.org

On Wed, 2005-02-02 at 11:50, Herbert Xu wrote:
> Andreas Gruenbacher <agruen@suse.de> wrote:
> > 
> > static inline void swap(void *a, void *b, int size)
> > {
> >        if (size % sizeof(long)) {
> >                char t;
> >                do {
> >                        t = *(char *)a;
> >                        *(char *)a++ = *(char *)b;
> >                        *(char *)b++ = t;
> >                } while (--size > 0);
> >        } else {
> >                long t;
> >                do {
> >                        t = *(long *)a;
> >                        *(long *)a = *(long *)b;
> >                        *(long *)b = t;
> >                        size -= sizeof(long);
> >                } while (size > sizeof(long));
> >        }
> > }
> 
> What if a/b aren't aligned?

That would be the case if the entire array was unaligned, or (size %
sizeof(long)) != 0. If people sort arrays that are unaligned even though
the element size is a multiple of sizeof(long) (or sizeof(u32) as Matt
proposes), they are just begging for bad performance. Otherwise, we're
doing byte-wise swap anyway.

-- 
Andreas Gruenbacher <agruen@suse.de>
SUSE Labs, SUSE LINUX GMBH


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/8] lib/sort: Heapsort implementation of sort()
  2005-02-02 11:14       ` Andreas Gruenbacher
@ 2005-02-03 23:19         ` Junio C Hamano
  0 siblings, 0 replies; 24+ messages in thread
From: Junio C Hamano @ 2005-02-03 23:19 UTC (permalink / raw)
  To: Andreas Gruenbacher
  Cc: Herbert Xu, mpm, Andrew Morton, linux-kernel@vger.kernel.org

>>>>> "AG" == Andreas Gruenbacher <agruen@suse.de> writes:

AG> On Wed, 2005-02-02 at 11:50, Herbert Xu wrote:
>> What if a/b aren't aligned?

AG> If people sort arrays that are unaligned even though the
AG> element size is a multiple of sizeof(long) (or sizeof(u32)
AG> as Matt proposes), they are just begging for bad
AG> performance.

If the user of your sort routine is dealing with an array of
this shape:

    struct { char e[8]; } a[]

the alignment for the normal access (e.g. a[ix].e[iy]) would not
matter and they are not begging for bad performance, are they?
It is only this swap routine, which is internal to the
implementation of sort, that is begging for it, methinks.  

Is unaligned access inside of the kernel get fixed up on all
architectures?  If not, this would not just be a performance
issue but becomes a correctness issue.

How about something like this to be a bit more defensive:

static inline void swap(void *a, void *b, int size)
{
       if (((unsigned long)a | (unsigned long)b | size) % sizeof(long)) {
	       char t;
	       do {
		       t = *(char *)a;
		       *(char *)a++ = *(char *)b;
		       *(char *)b++ = t;
	       } while (--size > 0);
       } else {
	       long t;
	       do {
		       t = *(long *)a;
		       *(long *)a = *(long *)b;
		       *(long *)b = t;
		       size -= sizeof(long);
	       } while (size > sizeof(long));
       }
}


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/8] lib/sort: Heapsort implementation of sort()
  2005-01-31  7:34 ` [PATCH 1/8] lib/sort: Heapsort implementation of sort() Matt Mackall
  2005-01-31 17:16   ` Andreas Gruenbacher
  2005-02-01  2:10   ` Horst von Brand
@ 2005-02-27 13:17   ` Andreas Gruenbacher
  2005-02-27 21:25     ` Matt Mackall
  2 siblings, 1 reply; 24+ messages in thread
From: Andreas Gruenbacher @ 2005-02-27 13:17 UTC (permalink / raw)
  To: Matt Mackall; +Cc: Andrew Morton, linux-kernel

Matt,

On Monday 31 January 2005 08:34, Matt Mackall wrote:
> This patch adds a generic array sorting library routine. This is meant
> to replace qsort, which has two problem areas for kernel use.

the sort function is broken. When sorting the integer array {1, 2, 3, 4, 5}, 
I'm getting {2, 3, 4, 5, 1} as a result. Can you please have a look?

Thanks,
-- 
Andreas Gruenbacher <agruen@suse.de>
SUSE Labs, SUSE LINUX PRODUCTS GMBH


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/8] lib/sort: Heapsort implementation of sort()
  2005-02-27 13:17   ` Andreas Gruenbacher
@ 2005-02-27 21:25     ` Matt Mackall
  2005-02-27 21:53       ` Andreas Gruenbacher
                         ` (2 more replies)
  0 siblings, 3 replies; 24+ messages in thread
From: Matt Mackall @ 2005-02-27 21:25 UTC (permalink / raw)
  To: Andreas Gruenbacher; +Cc: Andrew Morton, linux-kernel

On Sun, Feb 27, 2005 at 02:17:51PM +0100, Andreas Gruenbacher wrote:
> Matt,
> 
> On Monday 31 January 2005 08:34, Matt Mackall wrote:
> > This patch adds a generic array sorting library routine. This is meant
> > to replace qsort, which has two problem areas for kernel use.
> 
> the sort function is broken. When sorting the integer array {1, 2, 3, 4, 5}, 
> I'm getting {2, 3, 4, 5, 1} as a result. Can you please have a look?

Which kernel? There was an off-by-one for odd array sizes in the
original posted version that was quickly spotted:

http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.11-rc4/2.6.11-rc4-mm1/broken-out/sort-fix.patch

I've since tested all sizes 1 - 1000 with 100 random arrays each, so
I'm fairly confident it's now fixed.

-- 
Mathematics is the supreme nostalgia of our time.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/8] lib/sort: Heapsort implementation of sort()
  2005-02-27 21:25     ` Matt Mackall
@ 2005-02-27 21:53       ` Andreas Gruenbacher
  2005-02-27 22:10         ` Andreas Gruenbacher
  2005-03-01 13:23       ` Andreas Gruenbacher
  2005-03-01 19:06       ` Christophe Saout
  2 siblings, 1 reply; 24+ messages in thread
From: Andreas Gruenbacher @ 2005-02-27 21:53 UTC (permalink / raw)
  To: Matt Mackall; +Cc: Andrew Morton, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1064 bytes --]

On Sunday 27 February 2005 22:25, Matt Mackall wrote:
> On Sun, Feb 27, 2005 at 02:17:51PM +0100, Andreas Gruenbacher wrote:
> > Matt,
> >
> > On Monday 31 January 2005 08:34, Matt Mackall wrote:
> > > This patch adds a generic array sorting library routine. This is meant
> > > to replace qsort, which has two problem areas for kernel use.
> >
> > the sort function is broken. When sorting the integer array {1, 2, 3, 4,
> > 5}, I'm getting {2, 3, 4, 5, 1} as a result. Can you please have a look?
>
> Which kernel? There was an off-by-one for odd array sizes in the
> original posted version that was quickly spotted:
>
> http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.11-rc4/2
>.6.11-rc4-mm1/broken-out/sort-fix.patch

Okay, I didn't notice the off-by-one fix. It's still broken though; see the 
attached user-space test.

> I've since tested all sizes 1 - 1000 with 100 random arrays each, so
> I'm fairly confident it's now fixed.

Famous last words ;)

Thanks,
-- 
Andreas Gruenbacher <agruen@suse.de>
SUSE Labs, SUSE LINUX PRODUCTS GMBH

[-- Attachment #2: sort.c --]
[-- Type: text/x-csrc, Size: 2121 bytes --]

/*
 * A fast, small, non-recursive O(nlog n) sort for the Linux kernel
 *
 * Jan 23 2005  Matt Mackall <mpm@selenic.com>
 */

#include <stdlib.h>
#include <stdio.h>

void generic_swap(void *a, void *b, int size)
{
	char t;

	do {
		t = *(char *)a;
		*(char *)a++ = *(char *)b;
		*(char *)b++ = t;
	} while (--size > 0);
}

/*
 * sort - sort an array of elements
 * @base: pointer to data to sort
 * @num: number of elements
 * @size: size of each element
 * @cmp: pointer to comparison function
 * @swap: pointer to swap function or NULL
 *
 * This function does a heapsort on the given array. You may provide a
 * swap function optimized to your element type.
 *
 * Sorting time is O(n log n) both on average and worst-case. While
 * qsort is about 20% faster on average, it suffers from exploitable
 * O(n*n) worst-case behavior and extra memory requirements that make
 * it less suitable for kernel use.
 */

void sort(void *base, size_t num, size_t size,
	  int (*cmp)(const void *, const void *),
	  void (*swap)(void *, void *, int size))
{
	/* pre-scale counters for performance */
	int i = (num/2) * size, n = num * size, c, r;

	if (!swap)
		swap = generic_swap;

	/* heapify */
	for ( ; i >= 0; i -= size) {
		for (r = i; r * 2 < n; r  = c) {
			c = r * 2;
			if (c < n - size && cmp(base + c, base + c + size) < 0)
				c += size;
			if (cmp(base + r, base + c) >= 0)
				break;
			swap(base + r, base + c, size);
		}
	}

	/* sort */
	for (i = n - size; i >= 0; i -= size) {
		swap(base, base + i, size);
		for (r = 0; r * 2 < i; r = c) {
			c = r * 2;
			if (c < i - size && cmp(base + c, base + c + size) < 0)
				c += size;
			if (cmp(base + r, base + c) >= 0)
				break;
			swap(base + r, base + c, size);
		}
	}
}

int cmp(const int *a, const int *b)
{
	return b - a;
}

int main(void)
{
	int a[] = { 1, 2, 3, 4, 5 };
	size_t n;

	for (n = 0; n < sizeof(a)/sizeof(a[0]); n++)
		printf("%d ", a[n]);
	puts("");
	sort(a, sizeof(a)/sizeof(a[0]), sizeof(a[0]),
	     (int (*)(const void *, const void *))cmp, NULL);
	for (n = 0; n < sizeof(a)/sizeof(a[0]); n++)
		printf("%d ", a[n]);
	puts("");

	return 0;
}

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/8] lib/sort: Heapsort implementation of sort()
  2005-02-27 21:53       ` Andreas Gruenbacher
@ 2005-02-27 22:10         ` Andreas Gruenbacher
  0 siblings, 0 replies; 24+ messages in thread
From: Andreas Gruenbacher @ 2005-02-27 22:10 UTC (permalink / raw)
  To: Matt Mackall; +Cc: Andrew Morton, linux-kernel

On Sunday 27 February 2005 22:53, Andreas Gruenbacher wrote:
> Okay, I didn't notice the off-by-one fix. It's still broken though; see the
> attached user-space test.

Found it. I didn't have the off-by-one fix and the bug triggered; then the 
test case had a useless comparison function:

int cmp(const int *a, const int *b)
{
        return b - a;
}

Works fine now.

Thanks,
-- 
Andreas Gruenbacher <agruen@suse.de>
SUSE Labs, SUSE LINUX PRODUCTS GMBH

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/8] lib/sort: Heapsort implementation of sort()
  2005-02-27 21:25     ` Matt Mackall
  2005-02-27 21:53       ` Andreas Gruenbacher
@ 2005-03-01 13:23       ` Andreas Gruenbacher
  2005-03-01 19:06       ` Christophe Saout
  2 siblings, 0 replies; 24+ messages in thread
From: Andreas Gruenbacher @ 2005-03-01 13:23 UTC (permalink / raw)
  To: linux-kernel

On Sun, 2005-02-27 at 22:25, Matt Mackall wrote:
> On Sun, Feb 27, 2005 at 02:17:51PM +0100, Andreas Gruenbacher wrote:
> > Matt,
> > 
> > On Monday 31 January 2005 08:34, Matt Mackall wrote:
> > > This patch adds a generic array sorting library routine. This is meant
> > > to replace qsort, which has two problem areas for kernel use.
> > 
> > the sort function is broken. When sorting the integer array {1, 2, 3, 4, 5}, 
> > I'm getting {2, 3, 4, 5, 1} as a result. Can you please have a look?
> 
> Which kernel? There was an off-by-one for odd array sizes in the
> original posted version that was quickly spotted:
> 
> http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.11-rc4/2.6.11-rc4-mm1/broken-out/sort-fix.patch

Just for the record: This fixed it.

Thanks,
-- 
Andreas Gruenbacher <agruen@suse.de>
SUSE Labs, SUSE LINUX GMBH


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/8] lib/sort: Heapsort implementation of sort()
  2005-02-27 21:25     ` Matt Mackall
  2005-02-27 21:53       ` Andreas Gruenbacher
  2005-03-01 13:23       ` Andreas Gruenbacher
@ 2005-03-01 19:06       ` Christophe Saout
  2005-03-01 20:12         ` Matt Mackall
  2 siblings, 1 reply; 24+ messages in thread
From: Christophe Saout @ 2005-03-01 19:06 UTC (permalink / raw)
  To: Matt Mackall; +Cc: Andreas Gruenbacher, Andrew Morton, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 3473 bytes --]

Am Sonntag, den 27.02.2005, 13:25 -0800 schrieb Matt Mackall:

> Which kernel? There was an off-by-one for odd array sizes in the
> original posted version that was quickly spotted:
> 
> http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.11-rc4/2.6.11-rc4-mm1/broken-out/sort-fix.patch
> 
> I've since tested all sizes 1 - 1000 with 100 random arrays each, so
> I'm fairly confident it's now fixed.

-	int i = (num/2 - 1) * size, n = num * size, c, r;
+	int i = (num/2) * size, n = num * size, c, r;

What's probably meant is: ((num - 1)/2) * size

`i' must cover half of the array or a little bit more (not less, in case
of odd numbers). `i' (before my patch) is the highest address to start
with, so that's why it should be ((num + 1)/2 - 1) * size or simpler:
((num - 1)/2) * size

Anyway, I was wondering, is there a specific reason you are not using
size_t for the offset variables? size is a size_t and the only purpose
of the variables i, n, c and r is to be compared or added to the start
pointer (also I think it's just ugly to cast size_t down to an int).

On system where int is 32 bit but pointers are 64 bit the compiler might
need to extend to adjust the size of the operands for the address
calculation. Right?

Since size_t is unsigned I also had to modify the two loops since I
can't check for any of the variables becoming negative. Tested with all
kinds of array sizes.

Signed-off-by: Christophe Saout <christophe@saout.de>

diff -Nur linux-2.6.11-rc4-mm1.orig/include/linux/sort.h linux-2.6.11-rc4-mm1/include/linux/sort.h
--- linux-2.6.11-rc4-mm1.orig/include/linux/sort.h	2005-03-01 19:45:11.000000000 +0100
+++ linux-2.6.11-rc4-mm1/include/linux/sort.h	2005-03-01 19:47:13.456097896 +0100
@@ -5,6 +5,6 @@
 
 void sort(void *base, size_t num, size_t size,
 	  int (*cmp)(const void *, const void *),
-	  void (*swap)(void *, void *, int));
+	  void (*swap)(void *, void *, size_t));
 
 #endif
diff -Nur linux-2.6.11-rc4-mm1.orig/lib/sort.c linux-2.6.11-rc4-mm1/lib/sort.c
--- linux-2.6.11-rc4-mm1.orig/lib/sort.c	2005-03-01 19:46:05.000000000 +0100
+++ linux-2.6.11-rc4-mm1/lib/sort.c	2005-03-01 19:47:55.688677568 +0100
@@ -7,14 +7,14 @@
 #include <linux/kernel.h>
 #include <linux/module.h>
 
-void u32_swap(void *a, void *b, int size)
+void u32_swap(void *a, void *b, size_t size)
 {
 	u32 t = *(u32 *)a;
 	*(u32 *)a = *(u32 *)b;
 	*(u32 *)b = t;
 }
 
-void generic_swap(void *a, void *b, int size)
+void generic_swap(void *a, void *b, size_t size)
 {
 	char t;
 
@@ -44,17 +44,19 @@
 
 void sort(void *base, size_t num, size_t size,
 	  int (*cmp)(const void *, const void *),
-	  void (*swap)(void *, void *, int size))
+	  void (*swap)(void *, void *, size_t size))
 {
 	/* pre-scale counters for performance */
-	int i = (num/2) * size, n = num * size, c, r;
+	size_t i = ((num + 1)/2) * size, n = num * size, c, r;
 
 	if (!swap)
 		swap = (size == 4 ? u32_swap : generic_swap);
 
 	/* heapify */
-	for ( ; i >= 0; i -= size) {
-		for (r = i; r * 2 < n; r  = c) {
+	while(i > 0) {
+		i -= size;
+
+		for (r = i; r * 2 < n; r = c) {
 			c = r * 2;
 			if (c < n - size && cmp(base + c, base + c + size) < 0)
 				c += size;
@@ -65,7 +67,9 @@
 	}
 
 	/* sort */
-	for (i = n - size; i >= 0; i -= size) {
+	for (i = n; i > 0;) {
+		i -= size;
+
 		swap(base, base + i, size);
 		for (r = 0; r * 2 < i; r = c) {
 			c = r * 2;


[-- Attachment #2: Dies ist ein digital signierter Nachrichtenteil --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/8] lib/sort: Heapsort implementation of sort()
  2005-03-01 19:06       ` Christophe Saout
@ 2005-03-01 20:12         ` Matt Mackall
  2005-03-01 21:47           ` Andrew Morton
  0 siblings, 1 reply; 24+ messages in thread
From: Matt Mackall @ 2005-03-01 20:12 UTC (permalink / raw)
  To: Christophe Saout; +Cc: Andreas Gruenbacher, Andrew Morton, linux-kernel

On Tue, Mar 01, 2005 at 08:06:22PM +0100, Christophe Saout wrote:
> Am Sonntag, den 27.02.2005, 13:25 -0800 schrieb Matt Mackall:
> 
> > Which kernel? There was an off-by-one for odd array sizes in the
> > original posted version that was quickly spotted:
> > 
> > http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.11-rc4/2.6.11-rc4-mm1/broken-out/sort-fix.patch
> > 
> > I've since tested all sizes 1 - 1000 with 100 random arrays each, so
> > I'm fairly confident it's now fixed.
> 
> -	int i = (num/2 - 1) * size, n = num * size, c, r;
> +	int i = (num/2) * size, n = num * size, c, r;
> 
> What's probably meant is: ((num - 1)/2) * size
> 
> `i' must cover half of the array or a little bit more (not less, in case
> of odd numbers). `i' (before my patch) is the highest address to start
> with, so that's why it should be ((num + 1)/2 - 1) * size or simpler:
> ((num - 1)/2) * size

Argh. Yes, you're right. This probably deserves a comment since it's
been gotten wrong twice. I'll add something..
 
> Anyway, I was wondering, is there a specific reason you are not using
> size_t for the offset variables? size is a size_t and the only purpose
> of the variables i, n, c and r is to be compared or added to the start
> pointer (also I think it's just ugly to cast size_t down to an int).
> 
> On system where int is 32 bit but pointers are 64 bit the compiler might
> need to extend to adjust the size of the operands for the address
> calculation. Right?
> 
> Since size_t is unsigned I also had to modify the two loops since I
> can't check for any of the variables becoming negative. Tested with all
> kinds of array sizes.

This is good, but I suspect you'll have Andrew pulling his hair out as
I'll then have to go adjust all the callers and this is already a huge
mess because of the ACL bits from Andreas. As the current code
correctly (but slightly suboptimally) sorts any array size less than a
2G, I think it's safe to hold off on this for a bit. I'll queue this
up for after the sort and ACL stuff gets merged.

-- 
Mathematics is the supreme nostalgia of our time.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/8] lib/sort: Heapsort implementation of sort()
  2005-03-01 20:12         ` Matt Mackall
@ 2005-03-01 21:47           ` Andrew Morton
  0 siblings, 0 replies; 24+ messages in thread
From: Andrew Morton @ 2005-03-01 21:47 UTC (permalink / raw)
  To: Matt Mackall; +Cc: christophe, agruen, linux-kernel

Matt Mackall <mpm@selenic.com> wrote:
>
> I'll queue this
> up for after the sort and ACL stuff gets merged.

Whew!

I don't know how long the ACL changes will take to get merged up - is up to
Trond and he had quite a lot of rather robust comments on the first
iteration.


^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2005-03-01 21:42 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2005-01-31 11:52 [PATCH 1/8] lib/sort: Heapsort implementation of sort() Alexey Dobriyan
2005-01-31 16:53 ` Matt Mackall
  -- strict thread matches above, loose matches on Subject: below --
2005-01-31  7:34 [PATCH 0/8] lib/sort: Add generic sort to lib/ Matt Mackall
2005-01-31  7:34 ` [PATCH 1/8] lib/sort: Heapsort implementation of sort() Matt Mackall
2005-01-31 17:16   ` Andreas Gruenbacher
2005-01-31 17:30     ` Paulo Marques
2005-02-01 17:54       ` Andreas Gruenbacher
2005-02-01 18:11         ` linux-os
2005-02-01 19:04           ` linux-os
2005-02-01 19:47           ` Andreas Gruenbacher
2005-01-31 19:30     ` Matt Mackall
2005-02-01 17:50       ` Andreas Gruenbacher
2005-02-02  1:00         ` Horst von Brand
2005-02-02 10:50     ` Herbert Xu
2005-02-02 11:14       ` Andreas Gruenbacher
2005-02-03 23:19         ` Junio C Hamano
2005-02-01  2:10   ` Horst von Brand
2005-02-27 13:17   ` Andreas Gruenbacher
2005-02-27 21:25     ` Matt Mackall
2005-02-27 21:53       ` Andreas Gruenbacher
2005-02-27 22:10         ` Andreas Gruenbacher
2005-03-01 13:23       ` Andreas Gruenbacher
2005-03-01 19:06       ` Christophe Saout
2005-03-01 20:12         ` Matt Mackall
2005-03-01 21:47           ` Andrew Morton

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox