* [PATCH 0/2] lib/sort: Optimize the number of swaps and comparisons
@ 2024-01-13 3:13 Kuan-Wei Chiu
2024-01-13 3:13 ` [PATCH 1/2] lib/sort: Optimize heapsort for equal elements in sift-down path Kuan-Wei Chiu
2024-01-13 3:13 ` [PATCH 2/2] lib/sort: Optimize heapsort with double-pop variation Kuan-Wei Chiu
0 siblings, 2 replies; 20+ messages in thread
From: Kuan-Wei Chiu @ 2024-01-13 3:13 UTC (permalink / raw)
To: akpm; +Cc: lkml, jserv, linux-kernel, Kuan-Wei Chiu
Hello,
This patch series aims to optimize the heapsort algorithm, specifically
targeting a reduction in the number of swaps and comparisons required.
Thanks,
Kuan-Wei Chiu
Kuan-Wei Chiu (2):
lib/sort: Optimize heapsort for equal elements in sift-down path
lib/sort: Optimize heapsort with double-pop variation
lib/sort.c | 20 +++++++++++++++-----
1 file changed, 15 insertions(+), 5 deletions(-)
--
2.25.1
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH 1/2] lib/sort: Optimize heapsort for equal elements in sift-down path
2024-01-13 3:13 [PATCH 0/2] lib/sort: Optimize the number of swaps and comparisons Kuan-Wei Chiu
@ 2024-01-13 3:13 ` Kuan-Wei Chiu
2024-01-13 3:13 ` [PATCH 2/2] lib/sort: Optimize heapsort with double-pop variation Kuan-Wei Chiu
1 sibling, 0 replies; 20+ messages in thread
From: Kuan-Wei Chiu @ 2024-01-13 3:13 UTC (permalink / raw)
To: akpm; +Cc: lkml, jserv, linux-kernel, Kuan-Wei Chiu
Currently, when searching for the sift-down path and encountering equal
elements, the algorithm chooses the left child. However, considering
that the height of the right subtree may be one less than that of the
left subtree, selecting the right child in such cases can potentially
reduce the number of comparisons and swaps.
For instance, when sorting an array of 10,000 identical elements, the
current implementation requires 247,209 comparisons. With this patch,
the number of comparisons can be reduced to 227,241.
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
---
lib/sort.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/lib/sort.c b/lib/sort.c
index b399bf10d675..fe4efd4a1410 100644
--- a/lib/sort.c
+++ b/lib/sort.c
@@ -262,7 +262,7 @@ void sort_r(void *base, size_t num, size_t size,
* average, 3/4 worst-case.)
*/
for (b = a; c = 2*b + size, (d = c + size) < n;)
- b = do_cmp(base + c, base + d, cmp_func, priv) >= 0 ? c : d;
+ b = do_cmp(base + c, base + d, cmp_func, priv) > 0 ? c : d;
if (d == n) /* Special case last leaf with no sibling */
b = c;
--
2.25.1
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH 2/2] lib/sort: Optimize heapsort with double-pop variation
2024-01-13 3:13 [PATCH 0/2] lib/sort: Optimize the number of swaps and comparisons Kuan-Wei Chiu
2024-01-13 3:13 ` [PATCH 1/2] lib/sort: Optimize heapsort for equal elements in sift-down path Kuan-Wei Chiu
@ 2024-01-13 3:13 ` Kuan-Wei Chiu
2024-06-20 15:36 ` Julian Sikorski
1 sibling, 1 reply; 20+ messages in thread
From: Kuan-Wei Chiu @ 2024-01-13 3:13 UTC (permalink / raw)
To: akpm; +Cc: lkml, jserv, linux-kernel, Kuan-Wei Chiu
Instead of popping only the maximum element from the heap during each
iteration, we now pop the two largest elements at once. Although this
introduces an additional comparison to determine the second largest
element, it enables a reduction in the height of the tree by one during
the heapify operations starting from root's left/right child. This
reduction in tree height by one leads to a decrease of one comparison
and one swap.
This optimization results in saving approximately 0.5 * n swaps without
increasing the number of comparisons. Additionally, the heap size
during heapify is now one less than the original size, offering a
chance for further reduction in comparisons and swaps.
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
---
The following experimental data is based on the array generated using
get_random_u32().
| N | swaps (old) | swaps (new) | comparisons (old) | comparisons (new) |
|-------|-------------|-------------|-------------------|-------------------|
| 1000 | 9054 | 8569 | 10328 | 10320 |
| 2000 | 20137 | 19182 | 22634 | 22587 |
| 3000 | 32062 | 30623 | 35833 | 35752 |
| 4000 | 44274 | 42282 | 49332 | 49306 |
| 5000 | 57195 | 54676 | 63300 | 63294 |
| 6000 | 70205 | 67202 | 77599 | 77557 |
| 7000 | 83276 | 79831 | 92113 | 92032 |
| 8000 | 96630 | 92678 | 106635 | 106617 |
| 9000 | 110349 | 105883 | 121505 | 121404 |
| 10000 | 124165 | 119202 | 136628 | 136617 |
lib/sort.c | 18 ++++++++++++++----
1 file changed, 14 insertions(+), 4 deletions(-)
diff --git a/lib/sort.c b/lib/sort.c
index fe4efd4a1410..a0509088f82a 100644
--- a/lib/sort.c
+++ b/lib/sort.c
@@ -215,6 +215,7 @@ void sort_r(void *base, size_t num, size_t size,
/* pre-scale counters for performance */
size_t n = num * size, a = (num/2) * size;
const unsigned int lsbit = size & -size; /* Used to find parent */
+ size_t shift = 0;
if (!a) /* num < 2 || size == 0 */
return;
@@ -242,12 +243,21 @@ void sort_r(void *base, size_t num, size_t size,
for (;;) {
size_t b, c, d;
- if (a) /* Building heap: sift down --a */
- a -= size;
- else if (n -= size) /* Sorting: Extract root to --n */
+ if (a) /* Building heap: sift down a */
+ a -= size << shift;
+ else if (n > 3 * size) { /* Sorting: Extract two largest elements */
+ n -= size;
do_swap(base, base + n, size, swap_func, priv);
- else /* Sort complete */
+ shift = do_cmp(base + size, base + 2 * size, cmp_func, priv) <= 0;
+ a = size << shift;
+ n -= size;
+ do_swap(base + a, base + n, size, swap_func, priv);
+ } else if (n > size) { /* Sorting: Extract root */
+ n -= size;
+ do_swap(base, base + n, size, swap_func, priv);
+ } else { /* Sort complete */
break;
+ }
/*
* Sift element at "a" down into heap. This is the
--
2.25.1
^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH 2/2] lib/sort: Optimize heapsort with double-pop variation
2024-01-13 3:13 ` [PATCH 2/2] lib/sort: Optimize heapsort with double-pop variation Kuan-Wei Chiu
@ 2024-06-20 15:36 ` Julian Sikorski
2024-06-20 20:17 ` Mario Limonciello
` (2 more replies)
0 siblings, 3 replies; 20+ messages in thread
From: Julian Sikorski @ 2024-06-20 15:36 UTC (permalink / raw)
To: visitorckw; +Cc: akpm, jserv, linux-kernel, lkml, alexdeucher
Hello,
it appears that this patch has caused suspend-to-idle regression:
https://gitlab.freedesktop.org/drm/amd/-/issues/3436
In brief, my laptop fails to suspend completely with the following error
in the log:
Jun 18 12:42:20 kernel: amd_pmc AMDI0005:00: Last suspend didn't reach
deepest state
Power consumption remains high enough that my battery has already
unexpectedly drained twice before I noticed something was off.
I am not entirely sure how changes to sorting function can influence
suspend, but it is what it is. 6.9.5 as shipped by Fedora 40 exhibits
the issue, 6.9.5 as shipped by Fedora with the patch reverted does not.
Best regards,
Juliam
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/2] lib/sort: Optimize heapsort with double-pop variation
2024-06-20 15:36 ` Julian Sikorski
@ 2024-06-20 20:17 ` Mario Limonciello
2024-06-28 15:15 ` Linux regression tracking (Thorsten Leemhuis)
2025-01-15 3:27 ` [PATCH 2/2] lib/sort: Optimize heapsort with double-pop variation Luke Jones
2 siblings, 0 replies; 20+ messages in thread
From: Mario Limonciello @ 2024-06-20 20:17 UTC (permalink / raw)
To: Julian Sikorski, visitorckw, Linux kernel regressions list
Cc: akpm, jserv, linux-kernel, lkml, alexdeucher
On 6/20/2024 10:36, Julian Sikorski wrote:
> Hello,
>
> it appears that this patch has caused suspend-to-idle regression:
>
> https://gitlab.freedesktop.org/drm/amd/-/issues/3436
>
> In brief, my laptop fails to suspend completely with the following error
> in the log:
>
> Jun 18 12:42:20 kernel: amd_pmc AMDI0005:00: Last suspend didn't reach
> deepest state
>
> Power consumption remains high enough that my battery has already
> unexpectedly drained twice before I noticed something was off.
> I am not entirely sure how changes to sorting function can influence
> suspend, but it is what it is. 6.9.5 as shipped by Fedora 40 exhibits
> the issue, 6.9.5 as shipped by Fedora with the patch reverted does not.
>
> Best regards,
> Juliam
>
+regressions M/L
#regzbot ^introduced: 0e02ca29a563
#regzbot from: Kuan-Wei Chiu <visitorckw@gmail.com>
#regzbot monitor: https://gitlab.freedesktop.org/drm/amd/-/issues/3436
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/2] lib/sort: Optimize heapsort with double-pop variation
2024-06-20 15:36 ` Julian Sikorski
2024-06-20 20:17 ` Mario Limonciello
@ 2024-06-28 15:15 ` Linux regression tracking (Thorsten Leemhuis)
2024-06-28 17:10 ` Kuan-Wei Chiu
2025-01-15 3:27 ` [PATCH 2/2] lib/sort: Optimize heapsort with double-pop variation Luke Jones
2 siblings, 1 reply; 20+ messages in thread
From: Linux regression tracking (Thorsten Leemhuis) @ 2024-06-28 15:15 UTC (permalink / raw)
To: visitorckw, akpm
Cc: jserv, linux-kernel, lkml, alexdeucher,
Linux kernel regressions list, Julian Sikorski
On 20.06.24 17:36, Julian Sikorski wrote:
>
> it appears that this patch has caused suspend-to-idle regression:
>
> https://gitlab.freedesktop.org/drm/amd/-/issues/3436
>
> In brief, my laptop fails to suspend completely with the following error
> in the log:
>
> Jun 18 12:42:20 kernel: amd_pmc AMDI0005:00: Last suspend didn't reach
> deepest state
>
> Power consumption remains high enough that my battery has already
> unexpectedly drained twice before I noticed something was off.
> I am not entirely sure how changes to sorting function can influence
> suspend, but it is what it is. 6.9.5 as shipped by Fedora 40 exhibits
> the issue, 6.9.5 as shipped by Fedora with the patch reverted does not.
Andrew, could you maybe help out here or point us in the direction of
somewhat that might be able to help? I'm asking, as from a quick search
on lore it seems Kuan-Wei Chiu has not posted anything since nearly four
weeks and thus also did not reply to Julian's regression report.
Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.
#regzbot poke
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/2] lib/sort: Optimize heapsort with double-pop variation
2024-06-28 15:15 ` Linux regression tracking (Thorsten Leemhuis)
@ 2024-06-28 17:10 ` Kuan-Wei Chiu
2024-06-29 5:03 ` Linux regression tracking (Thorsten Leemhuis)
0 siblings, 1 reply; 20+ messages in thread
From: Kuan-Wei Chiu @ 2024-06-28 17:10 UTC (permalink / raw)
To: Linux regressions mailing list
Cc: akpm, jserv, linux-kernel, lkml, alexdeucher, Julian Sikorski
On Fri, Jun 28, 2024 at 05:15:15PM +0200, Linux regression tracking (Thorsten Leemhuis) wrote:
> On 20.06.24 17:36, Julian Sikorski wrote:
> >
> > it appears that this patch has caused suspend-to-idle regression:
> >
> > https://gitlab.freedesktop.org/drm/amd/-/issues/3436
> >
> > In brief, my laptop fails to suspend completely with the following error
> > in the log:
> >
> > Jun 18 12:42:20 kernel: amd_pmc AMDI0005:00: Last suspend didn't reach
> > deepest state
> >
> > Power consumption remains high enough that my battery has already
> > unexpectedly drained twice before I noticed something was off.
> > I am not entirely sure how changes to sorting function can influence
> > suspend, but it is what it is. 6.9.5 as shipped by Fedora 40 exhibits
> > the issue, 6.9.5 as shipped by Fedora with the patch reverted does not.
>
> Andrew, could you maybe help out here or point us in the direction of
> somewhat that might be able to help? I'm asking, as from a quick search
> on lore it seems Kuan-Wei Chiu has not posted anything since nearly four
> weeks and thus also did not reply to Julian's regression report.
>
> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
> --
> Everything you wanna know about Linux kernel regression tracking:
> https://linux-regtracking.leemhuis.info/about/#tldr
> If I did something stupid, please tell me, as explained on that page.
>
> #regzbot poke
Sorry for the late reply.
I apologize for the regression caused by the patch. I am not familiar
with the power management domain. I would guess there might be some
side effects in the compare or swap functions that I was not aware of.
While reviewing the sort calls that could potentially cause the error,
I noticed that the compare function used in the sort call within
drivers/acpi/processor_idle.c might not satisfy the transitive
relation. Although I'm not sure if this is the root cause of the
problem, specifically, if there are two valid acpi_processor_cx
elements A and B, and an invalid element C, there could be a scenario
where A < B but simultaneously A = C and B = C. However, if I
understand correctly, this issue should have existed before this patch
but might not have been triggered previously. My patch might have
exposed this issue by changing the order of comparisons and swaps.
Regards,
Kuan-Wei
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/2] lib/sort: Optimize heapsort with double-pop variation
2024-06-28 17:10 ` Kuan-Wei Chiu
@ 2024-06-29 5:03 ` Linux regression tracking (Thorsten Leemhuis)
[not found] ` <20240630210809.37550-1-visitorckw@gmail.com>
0 siblings, 1 reply; 20+ messages in thread
From: Linux regression tracking (Thorsten Leemhuis) @ 2024-06-29 5:03 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: akpm, jserv, linux-kernel, lkml, alexdeucher, Julian Sikorski,
ACPI Devel Maling List, Kuan-Wei Chiu,
Linux regressions mailing list
On 28.06.24 19:10, Kuan-Wei Chiu wrote:
> On Fri, Jun 28, 2024 at 05:15:15PM +0200, Linux regression tracking (Thorsten Leemhuis) wrote:
>> On 20.06.24 17:36, Julian Sikorski wrote:
>>>
>>> it appears that this patch has caused suspend-to-idle regression:
>>>
>>> https://gitlab.freedesktop.org/drm/amd/-/issues/3436
>>>
>>> In brief, my laptop fails to suspend completely with the following error
>>> in the log:
>>>
>>> Jun 18 12:42:20 kernel: amd_pmc AMDI0005:00: Last suspend didn't reach
>>> deepest state
>>>
>>> Power consumption remains high enough that my battery has already
>>> unexpectedly drained twice before I noticed something was off.
>>> I am not entirely sure how changes to sorting function can influence
>>> suspend, but it is what it is. 6.9.5 as shipped by Fedora 40 exhibits
>>> the issue, 6.9.5 as shipped by Fedora with the patch reverted does not.
> [...]
> I apologize for the regression caused by the patch. I am not familiar
> with the power management domain. I would guess there might be some
> side effects in the compare or swap functions that I was not aware of.
>
> While reviewing the sort calls that could potentially cause the error,
> I noticed that the compare function used in the sort call within
> drivers/acpi/processor_idle.c might not satisfy the transitive
> relation. Although I'm not sure if this is the root cause of the
> problem, specifically, if there are two valid acpi_processor_cx
> elements A and B, and an invalid element C, there could be a scenario
> where A < B but simultaneously A = C and B = C.
Let's bring in Rafael, he might be able to help us out here.
> However, if I
> understand correctly, this issue should have existed before this patch
> but might not have been triggered previously. My patch might have
> exposed this issue by changing the order of comparisons and swaps.
Yeah, these things happen. But when it comes to the Linux kernel's "no
regressions" rule that does not matter much; what matters more is which
change exposed the problem.
Ciao, Thorsten
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] ACPI: processor_idle: Fix invalid comparison with insertion sort for latency
[not found] ` <20240630210809.37550-1-visitorckw@gmail.com>
@ 2024-06-30 21:13 ` Kuan-Wei Chiu
2024-07-01 4:42 ` [PATCH v2] " Kuan-Wei Chiu
0 siblings, 1 reply; 20+ messages in thread
From: Kuan-Wei Chiu @ 2024-06-30 21:13 UTC (permalink / raw)
To: rafael
Cc: lenb, mario.limonciello, akpm, jserv, alexdeucher, belegdol,
linux-acpi, regressions, regressions, linux-kernel
+ linux-kernel@vger.kernel.org
On Mon, Jul 01, 2024 at 05:08:09AM +0800, Kuan-Wei Chiu wrote:
> The acpi_cst_latency_cmp comparison function currently used for sorting
> C-state latencies does not satisfy transitivity, causing incorrect
> sorting results. Specifically, if there are two valid acpi_processor_cx
> elements A and B and one invalid element C, it may occur that A < B,
> A = C, and B = C. Sorting algorithms assume that if A < B and A = C,
> then C < B, leading to incorrect ordering.
>
> Given the small size of the array (<=8), we replace the library sort
> function with a simple insertion sort that properly ignores invalid
> elements and sorts valid ones based on latency. This change ensures
> correct ordering of the C-state latencies.
>
> Fixes: 65ea8f2c6e23 ("ACPI: processor idle: Fix up C-state latency if not ordered")
> Reported-by: Julian Sikorski <belegdol@gmail.com>
> Closes: https://lore.kernel.org/lkml/70674dc7-5586-4183-8953-8095567e73df@gmail.com/
> Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
> ---
> I do not have the appropriate AMD hardware to reproduce this issue and
> test the patch. However, if the aforementioned reason is indeed the
> source of the problem, I believe this patch might help.
>
> drivers/acpi/processor_idle.c | 33 ++++++++++++---------------------
> 1 file changed, 12 insertions(+), 21 deletions(-)
>
> diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
> index bd6a7857ce05..d58a7c64d80b 100644
> --- a/drivers/acpi/processor_idle.c
> +++ b/drivers/acpi/processor_idle.c
> @@ -386,25 +386,19 @@ static void acpi_processor_power_verify_c3(struct acpi_processor *pr,
> acpi_write_bit_register(ACPI_BITREG_BUS_MASTER_RLD, 1);
> }
>
> -static int acpi_cst_latency_cmp(const void *a, const void *b)
> +static void acpi_cst_latency_sort(struct acpi_processor_cx *arr, size_t length)
> {
> - const struct acpi_processor_cx *x = a, *y = b;
> + int i, j, k;
>
> - if (!(x->valid && y->valid))
> - return 0;
> - if (x->latency > y->latency)
> - return 1;
> - if (x->latency < y->latency)
> - return -1;
> - return 0;
> -}
> -static void acpi_cst_latency_swap(void *a, void *b, int n)
> -{
> - struct acpi_processor_cx *x = a, *y = b;
> -
> - if (!(x->valid && y->valid))
> - return;
> - swap(x->latency, y->latency);
> + for (i = 1; i < length; i++) {
> + for (j = i - 1, k = i; j >= 0; j--) {
> + if (!arr[j].valid)
> + continue;
> + if (arr[j].latency > arr[k].latency)
> + swap(arr[j].latency, arr[k].latency);
> + k = j;
> + }
> + }
> }
>
> static int acpi_processor_power_verify(struct acpi_processor *pr)
> @@ -449,10 +443,7 @@ static int acpi_processor_power_verify(struct acpi_processor *pr)
>
> if (buggy_latency) {
> pr_notice("FW issue: working around C-state latencies out of order\n");
> - sort(&pr->power.states[1], max_cstate,
> - sizeof(struct acpi_processor_cx),
> - acpi_cst_latency_cmp,
> - acpi_cst_latency_swap);
> + acpi_cst_latency_sort(&pr->power.states[1], max_cstate);
> }
>
> lapic_timer_propagate_broadcast(pr);
> --
> 2.34.1
>
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH v2] ACPI: processor_idle: Fix invalid comparison with insertion sort for latency
2024-06-30 21:13 ` [PATCH] ACPI: processor_idle: Fix invalid comparison with insertion sort for latency Kuan-Wei Chiu
@ 2024-07-01 4:42 ` Kuan-Wei Chiu
2024-07-01 5:06 ` Greg KH
2024-07-01 15:17 ` Mario Limonciello
0 siblings, 2 replies; 20+ messages in thread
From: Kuan-Wei Chiu @ 2024-07-01 4:42 UTC (permalink / raw)
To: rafael
Cc: lenb, mario.limonciello, akpm, jserv, alexdeucher, belegdol,
regressions, linux-acpi, regressions, linux-kernel, Kuan-Wei Chiu
The acpi_cst_latency_cmp comparison function currently used for sorting
C-state latencies does not satisfy transitivity, causing incorrect
sorting results. Specifically, if there are two valid acpi_processor_cx
elements A and B and one invalid element C, it may occur that A < B,
A = C, and B = C. Sorting algorithms assume that if A < B and A = C,
then C < B, leading to incorrect ordering.
Given the small size of the array (<=8), we replace the library sort
function with a simple insertion sort that properly ignores invalid
elements and sorts valid ones based on latency. This change ensures
correct ordering of the C-state latencies.
Fixes: 65ea8f2c6e23 ("ACPI: processor idle: Fix up C-state latency if not ordered")
Reported-by: Julian Sikorski <belegdol@gmail.com>
Closes: https://lore.kernel.org/lkml/70674dc7-5586-4183-8953-8095567e73df@gmail.com/
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
---
v1 -> v2:
- Avoid swapping if arr[i] is an invalid element.
I do not have the appropriate AMD hardware to reproduce this issue and
test the patch. However, if the aforementioned reason is indeed the
source of the problem, I believe this patch might help.
drivers/acpi/processor_idle.c | 35 ++++++++++++++---------------------
1 file changed, 14 insertions(+), 21 deletions(-)
diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
index bd6a7857ce05..813c718b9108 100644
--- a/drivers/acpi/processor_idle.c
+++ b/drivers/acpi/processor_idle.c
@@ -386,25 +386,21 @@ static void acpi_processor_power_verify_c3(struct acpi_processor *pr,
acpi_write_bit_register(ACPI_BITREG_BUS_MASTER_RLD, 1);
}
-static int acpi_cst_latency_cmp(const void *a, const void *b)
+static void acpi_cst_latency_sort(struct acpi_processor_cx *arr, size_t length)
{
- const struct acpi_processor_cx *x = a, *y = b;
+ int i, j, k;
- if (!(x->valid && y->valid))
- return 0;
- if (x->latency > y->latency)
- return 1;
- if (x->latency < y->latency)
- return -1;
- return 0;
-}
-static void acpi_cst_latency_swap(void *a, void *b, int n)
-{
- struct acpi_processor_cx *x = a, *y = b;
-
- if (!(x->valid && y->valid))
- return;
- swap(x->latency, y->latency);
+ for (i = 1; i < length; i++) {
+ if (!arr[i].valid)
+ continue;
+ for (j = i - 1, k = i; j >= 0; j--) {
+ if (!arr[j].valid)
+ continue;
+ if (arr[j].latency > arr[k].latency)
+ swap(arr[j].latency, arr[k].latency);
+ k = j;
+ }
+ }
}
static int acpi_processor_power_verify(struct acpi_processor *pr)
@@ -449,10 +445,7 @@ static int acpi_processor_power_verify(struct acpi_processor *pr)
if (buggy_latency) {
pr_notice("FW issue: working around C-state latencies out of order\n");
- sort(&pr->power.states[1], max_cstate,
- sizeof(struct acpi_processor_cx),
- acpi_cst_latency_cmp,
- acpi_cst_latency_swap);
+ acpi_cst_latency_sort(&pr->power.states[1], max_cstate);
}
lapic_timer_propagate_broadcast(pr);
--
2.34.1
^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH v2] ACPI: processor_idle: Fix invalid comparison with insertion sort for latency
2024-07-01 4:42 ` [PATCH v2] " Kuan-Wei Chiu
@ 2024-07-01 5:06 ` Greg KH
2024-07-01 15:17 ` Mario Limonciello
1 sibling, 0 replies; 20+ messages in thread
From: Greg KH @ 2024-07-01 5:06 UTC (permalink / raw)
To: Kuan-Wei Chiu
Cc: rafael, lenb, mario.limonciello, akpm, jserv, alexdeucher,
belegdol, regressions, linux-acpi, regressions, linux-kernel
On Mon, Jul 01, 2024 at 12:42:32PM +0800, Kuan-Wei Chiu wrote:
> The acpi_cst_latency_cmp comparison function currently used for sorting
> C-state latencies does not satisfy transitivity, causing incorrect
> sorting results. Specifically, if there are two valid acpi_processor_cx
> elements A and B and one invalid element C, it may occur that A < B,
> A = C, and B = C. Sorting algorithms assume that if A < B and A = C,
> then C < B, leading to incorrect ordering.
>
> Given the small size of the array (<=8), we replace the library sort
> function with a simple insertion sort that properly ignores invalid
> elements and sorts valid ones based on latency. This change ensures
> correct ordering of the C-state latencies.
>
> Fixes: 65ea8f2c6e23 ("ACPI: processor idle: Fix up C-state latency if not ordered")
> Reported-by: Julian Sikorski <belegdol@gmail.com>
> Closes: https://lore.kernel.org/lkml/70674dc7-5586-4183-8953-8095567e73df@gmail.com/
> Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
> ---
> v1 -> v2:
> - Avoid swapping if arr[i] is an invalid element.
>
> I do not have the appropriate AMD hardware to reproduce this issue and
> test the patch. However, if the aforementioned reason is indeed the
> source of the problem, I believe this patch might help.
>
> drivers/acpi/processor_idle.c | 35 ++++++++++++++---------------------
> 1 file changed, 14 insertions(+), 21 deletions(-)
>
> diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
> index bd6a7857ce05..813c718b9108 100644
> --- a/drivers/acpi/processor_idle.c
> +++ b/drivers/acpi/processor_idle.c
> @@ -386,25 +386,21 @@ static void acpi_processor_power_verify_c3(struct acpi_processor *pr,
> acpi_write_bit_register(ACPI_BITREG_BUS_MASTER_RLD, 1);
> }
>
> -static int acpi_cst_latency_cmp(const void *a, const void *b)
> +static void acpi_cst_latency_sort(struct acpi_processor_cx *arr, size_t length)
> {
> - const struct acpi_processor_cx *x = a, *y = b;
> + int i, j, k;
>
> - if (!(x->valid && y->valid))
> - return 0;
> - if (x->latency > y->latency)
> - return 1;
> - if (x->latency < y->latency)
> - return -1;
> - return 0;
> -}
> -static void acpi_cst_latency_swap(void *a, void *b, int n)
> -{
> - struct acpi_processor_cx *x = a, *y = b;
> -
> - if (!(x->valid && y->valid))
> - return;
> - swap(x->latency, y->latency);
> + for (i = 1; i < length; i++) {
> + if (!arr[i].valid)
> + continue;
> + for (j = i - 1, k = i; j >= 0; j--) {
> + if (!arr[j].valid)
> + continue;
> + if (arr[j].latency > arr[k].latency)
> + swap(arr[j].latency, arr[k].latency);
> + k = j;
> + }
> + }
> }
>
> static int acpi_processor_power_verify(struct acpi_processor *pr)
> @@ -449,10 +445,7 @@ static int acpi_processor_power_verify(struct acpi_processor *pr)
>
> if (buggy_latency) {
> pr_notice("FW issue: working around C-state latencies out of order\n");
> - sort(&pr->power.states[1], max_cstate,
> - sizeof(struct acpi_processor_cx),
> - acpi_cst_latency_cmp,
> - acpi_cst_latency_swap);
> + acpi_cst_latency_sort(&pr->power.states[1], max_cstate);
> }
>
> lapic_timer_propagate_broadcast(pr);
> --
> 2.34.1
>
>
Hi,
This is the friendly patch-bot of Greg Kroah-Hartman. You have sent him
a patch that has triggered this response. He used to manually respond
to these common problems, but in order to save his sanity (he kept
writing the same thing over and over, yet to different people), I was
created. Hopefully you will not take offence and will fix the problem
in your patch and resubmit it so that it can be accepted into the Linux
kernel tree.
You are receiving this message because of the following common error(s)
as indicated below:
- You have marked a patch with a "Fixes:" tag for a commit that is in an
older released kernel, yet you do not have a cc: stable line in the
signed-off-by area at all, which means that the patch will not be
applied to any older kernel releases. To properly fix this, please
follow the documented rules in the
Documentation/process/stable-kernel-rules.rst file for how to resolve
this.
If you wish to discuss this problem further, or you have questions about
how to resolve this issue, please feel free to respond to this email and
Greg will reply once he has dug out from the pending patches received
from other developers.
thanks,
greg k-h's patch email bot
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v2] ACPI: processor_idle: Fix invalid comparison with insertion sort for latency
2024-07-01 4:42 ` [PATCH v2] " Kuan-Wei Chiu
2024-07-01 5:06 ` Greg KH
@ 2024-07-01 15:17 ` Mario Limonciello
2024-07-01 16:10 ` [PATCH v3] " Kuan-Wei Chiu
1 sibling, 1 reply; 20+ messages in thread
From: Mario Limonciello @ 2024-07-01 15:17 UTC (permalink / raw)
To: Kuan-Wei Chiu, rafael
Cc: lenb, akpm, jserv, alexdeucher, belegdol, regressions, linux-acpi,
regressions, linux-kernel
On 6/30/2024 23:42, Kuan-Wei Chiu wrote:
> The acpi_cst_latency_cmp comparison function currently used for sorting
> C-state latencies does not satisfy transitivity, causing incorrect
> sorting results. Specifically, if there are two valid acpi_processor_cx
> elements A and B and one invalid element C, it may occur that A < B,
> A = C, and B = C. Sorting algorithms assume that if A < B and A = C,
> then C < B, leading to incorrect ordering.
>
> Given the small size of the array (<=8), we replace the library sort
> function with a simple insertion sort that properly ignores invalid
> elements and sorts valid ones based on latency. This change ensures
> correct ordering of the C-state latencies.
>
> Fixes: 65ea8f2c6e23 ("ACPI: processor idle: Fix up C-state latency if not ordered")
> Reported-by: Julian Sikorski <belegdol@gmail.com>
> Closes: https://lore.kernel.org/lkml/70674dc7-5586-4183-8953-8095567e73df@gmail.com/
> Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
> ---
> v1 -> v2:
> - Avoid swapping if arr[i] is an invalid element.
>
> I do not have the appropriate AMD hardware to reproduce this issue and
> test the patch.
Even if you had hardware, the behavior stems from a BIOS bug with the
_CST entries. I know it's been fixed in the BIOS on most platforms in
that era, but if I recall correctly from a few years back Julian's
system was EoL already.
> However, if the aforementioned reason is indeed the
> source of the problem, I believe this patch might help.
Thanks!
Assuming this patch works for Julian, I believe you can also drop the
#include <linux/sort.h>
from this file as well.
I do think this should be cc to @stable too in that case.
>
> drivers/acpi/processor_idle.c | 35 ++++++++++++++---------------------
> 1 file changed, 14 insertions(+), 21 deletions(-)
>
> diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
> index bd6a7857ce05..813c718b9108 100644
> --- a/drivers/acpi/processor_idle.c
> +++ b/drivers/acpi/processor_idle.c
> @@ -386,25 +386,21 @@ static void acpi_processor_power_verify_c3(struct acpi_processor *pr,
> acpi_write_bit_register(ACPI_BITREG_BUS_MASTER_RLD, 1);
> }
>
> -static int acpi_cst_latency_cmp(const void *a, const void *b)
> +static void acpi_cst_latency_sort(struct acpi_processor_cx *arr, size_t length)
> {
> - const struct acpi_processor_cx *x = a, *y = b;
> + int i, j, k;
>
> - if (!(x->valid && y->valid))
> - return 0;
> - if (x->latency > y->latency)
> - return 1;
> - if (x->latency < y->latency)
> - return -1;
> - return 0;
> -}
> -static void acpi_cst_latency_swap(void *a, void *b, int n)
> -{
> - struct acpi_processor_cx *x = a, *y = b;
> -
> - if (!(x->valid && y->valid))
> - return;
> - swap(x->latency, y->latency);
> + for (i = 1; i < length; i++) {
> + if (!arr[i].valid)
> + continue;
> + for (j = i - 1, k = i; j >= 0; j--) {
> + if (!arr[j].valid)
> + continue;
> + if (arr[j].latency > arr[k].latency)
> + swap(arr[j].latency, arr[k].latency);
> + k = j;
> + }
> + }
> }
>
> static int acpi_processor_power_verify(struct acpi_processor *pr)
> @@ -449,10 +445,7 @@ static int acpi_processor_power_verify(struct acpi_processor *pr)
>
> if (buggy_latency) {
> pr_notice("FW issue: working around C-state latencies out of order\n");
> - sort(&pr->power.states[1], max_cstate,
> - sizeof(struct acpi_processor_cx),
> - acpi_cst_latency_cmp,
> - acpi_cst_latency_swap);
> + acpi_cst_latency_sort(&pr->power.states[1], max_cstate);
> }
>
> lapic_timer_propagate_broadcast(pr);
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH v3] ACPI: processor_idle: Fix invalid comparison with insertion sort for latency
2024-07-01 15:17 ` Mario Limonciello
@ 2024-07-01 16:10 ` Kuan-Wei Chiu
2024-07-01 17:36 ` Rafael J. Wysocki
0 siblings, 1 reply; 20+ messages in thread
From: Kuan-Wei Chiu @ 2024-07-01 16:10 UTC (permalink / raw)
To: rafael
Cc: lenb, mario.limonciello, akpm, jserv, alexdeucher, belegdol,
regressions, linux-acpi, regressions, linux-kernel, Kuan-Wei Chiu,
stable
The acpi_cst_latency_cmp comparison function currently used for sorting
C-state latencies does not satisfy transitivity, causing incorrect
sorting results. Specifically, if there are two valid acpi_processor_cx
elements A and B and one invalid element C, it may occur that A < B,
A = C, and B = C. Sorting algorithms assume that if A < B and A = C,
then C < B, leading to incorrect ordering.
Given the small size of the array (<=8), we replace the library sort
function with a simple insertion sort that properly ignores invalid
elements and sorts valid ones based on latency. This change ensures
correct ordering of the C-state latencies.
Fixes: 65ea8f2c6e23 ("ACPI: processor idle: Fix up C-state latency if not ordered")
Cc: stable@vger.kernel.org
Reported-by: Julian Sikorski <belegdol@gmail.com>
Closes: https://lore.kernel.org/lkml/70674dc7-5586-4183-8953-8095567e73df@gmail.com/
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
---
v2 -> v3:
- Remove #include <linux/sort.h>
- Cc @stable
Note: I only performed a build test and a simple unit test to ensure
the latency of valid elements is correctly sorted in the randomly
generated data.
drivers/acpi/processor_idle.c | 36 ++++++++++++++---------------------
1 file changed, 14 insertions(+), 22 deletions(-)
diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
index bd6a7857ce05..17cc81340b4b 100644
--- a/drivers/acpi/processor_idle.c
+++ b/drivers/acpi/processor_idle.c
@@ -16,7 +16,6 @@
#include <linux/acpi.h>
#include <linux/dmi.h>
#include <linux/sched.h> /* need_resched() */
-#include <linux/sort.h>
#include <linux/tick.h>
#include <linux/cpuidle.h>
#include <linux/cpu.h>
@@ -386,25 +385,21 @@ static void acpi_processor_power_verify_c3(struct acpi_processor *pr,
acpi_write_bit_register(ACPI_BITREG_BUS_MASTER_RLD, 1);
}
-static int acpi_cst_latency_cmp(const void *a, const void *b)
+static void acpi_cst_latency_sort(struct acpi_processor_cx *arr, size_t length)
{
- const struct acpi_processor_cx *x = a, *y = b;
+ int i, j, k;
- if (!(x->valid && y->valid))
- return 0;
- if (x->latency > y->latency)
- return 1;
- if (x->latency < y->latency)
- return -1;
- return 0;
-}
-static void acpi_cst_latency_swap(void *a, void *b, int n)
-{
- struct acpi_processor_cx *x = a, *y = b;
-
- if (!(x->valid && y->valid))
- return;
- swap(x->latency, y->latency);
+ for (i = 1; i < length; i++) {
+ if (!arr[i].valid)
+ continue;
+ for (j = i - 1, k = i; j >= 0; j--) {
+ if (!arr[j].valid)
+ continue;
+ if (arr[j].latency > arr[k].latency)
+ swap(arr[j].latency, arr[k].latency);
+ k = j;
+ }
+ }
}
static int acpi_processor_power_verify(struct acpi_processor *pr)
@@ -449,10 +444,7 @@ static int acpi_processor_power_verify(struct acpi_processor *pr)
if (buggy_latency) {
pr_notice("FW issue: working around C-state latencies out of order\n");
- sort(&pr->power.states[1], max_cstate,
- sizeof(struct acpi_processor_cx),
- acpi_cst_latency_cmp,
- acpi_cst_latency_swap);
+ acpi_cst_latency_sort(&pr->power.states[1], max_cstate);
}
lapic_timer_propagate_broadcast(pr);
--
2.34.1
^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH v3] ACPI: processor_idle: Fix invalid comparison with insertion sort for latency
2024-07-01 16:10 ` [PATCH v3] " Kuan-Wei Chiu
@ 2024-07-01 17:36 ` Rafael J. Wysocki
2024-07-01 20:56 ` [PATCH v4] " Kuan-Wei Chiu
0 siblings, 1 reply; 20+ messages in thread
From: Rafael J. Wysocki @ 2024-07-01 17:36 UTC (permalink / raw)
To: Kuan-Wei Chiu
Cc: rafael, lenb, mario.limonciello, akpm, jserv, alexdeucher,
belegdol, regressions, linux-acpi, regressions, linux-kernel,
stable
On Mon, Jul 1, 2024 at 6:10 PM Kuan-Wei Chiu <visitorckw@gmail.com> wrote:
>
> The acpi_cst_latency_cmp comparison function currently used for sorting
> C-state latencies does not satisfy transitivity, causing incorrect
> sorting results. Specifically, if there are two valid acpi_processor_cx
> elements A and B and one invalid element C, it may occur that A < B,
> A = C, and B = C. Sorting algorithms assume that if A < B and A = C,
> then C < B, leading to incorrect ordering.
>
> Given the small size of the array (<=8), we replace the library sort
> function with a simple insertion sort that properly ignores invalid
> elements and sorts valid ones based on latency. This change ensures
> correct ordering of the C-state latencies.
>
> Fixes: 65ea8f2c6e23 ("ACPI: processor idle: Fix up C-state latency if not ordered")
> Cc: stable@vger.kernel.org
> Reported-by: Julian Sikorski <belegdol@gmail.com>
> Closes: https://lore.kernel.org/lkml/70674dc7-5586-4183-8953-8095567e73df@gmail.com/
> Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
> ---
> v2 -> v3:
> - Remove #include <linux/sort.h>
> - Cc @stable
>
> Note: I only performed a build test and a simple unit test to ensure
> the latency of valid elements is correctly sorted in the randomly
> generated data.
>
> drivers/acpi/processor_idle.c | 36 ++++++++++++++---------------------
> 1 file changed, 14 insertions(+), 22 deletions(-)
>
> diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
> index bd6a7857ce05..17cc81340b4b 100644
> --- a/drivers/acpi/processor_idle.c
> +++ b/drivers/acpi/processor_idle.c
> @@ -16,7 +16,6 @@
> #include <linux/acpi.h>
> #include <linux/dmi.h>
> #include <linux/sched.h> /* need_resched() */
> -#include <linux/sort.h>
> #include <linux/tick.h>
> #include <linux/cpuidle.h>
> #include <linux/cpu.h>
> @@ -386,25 +385,21 @@ static void acpi_processor_power_verify_c3(struct acpi_processor *pr,
> acpi_write_bit_register(ACPI_BITREG_BUS_MASTER_RLD, 1);
> }
>
> -static int acpi_cst_latency_cmp(const void *a, const void *b)
> +static void acpi_cst_latency_sort(struct acpi_processor_cx *arr, size_t length)
s/arr/states/ please.
> {
> - const struct acpi_processor_cx *x = a, *y = b;
> + int i, j, k;
>
> - if (!(x->valid && y->valid))
> - return 0;
> - if (x->latency > y->latency)
> - return 1;
> - if (x->latency < y->latency)
> - return -1;
> - return 0;
> -}
> -static void acpi_cst_latency_swap(void *a, void *b, int n)
> -{
> - struct acpi_processor_cx *x = a, *y = b;
> -
> - if (!(x->valid && y->valid))
> - return;
> - swap(x->latency, y->latency);
> + for (i = 1; i < length; i++) {
> + if (!arr[i].valid)
> + continue;
Please add an empty line here (and analogously below).
> + for (j = i - 1, k = i; j >= 0; j--) {
> + if (!arr[j].valid)
> + continue;
> + if (arr[j].latency > arr[k].latency)
> + swap(arr[j].latency, arr[k].latency);
And here.
> + k = j;
> + }
> + }
> }
>
> static int acpi_processor_power_verify(struct acpi_processor *pr)
> @@ -449,10 +444,7 @@ static int acpi_processor_power_verify(struct acpi_processor *pr)
>
> if (buggy_latency) {
> pr_notice("FW issue: working around C-state latencies out of order\n");
> - sort(&pr->power.states[1], max_cstate,
> - sizeof(struct acpi_processor_cx),
> - acpi_cst_latency_cmp,
> - acpi_cst_latency_swap);
> + acpi_cst_latency_sort(&pr->power.states[1], max_cstate);
> }
>
> lapic_timer_propagate_broadcast(pr);
> --
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH v4] ACPI: processor_idle: Fix invalid comparison with insertion sort for latency
2024-07-01 17:36 ` Rafael J. Wysocki
@ 2024-07-01 20:56 ` Kuan-Wei Chiu
2024-07-02 7:28 ` Julian Sikorski
2024-07-02 18:38 ` Rafael J. Wysocki
0 siblings, 2 replies; 20+ messages in thread
From: Kuan-Wei Chiu @ 2024-07-01 20:56 UTC (permalink / raw)
To: rafael
Cc: lenb, mario.limonciello, akpm, jserv, alexdeucher, belegdol,
regressions, linux-acpi, regressions, linux-kernel, Kuan-Wei Chiu,
stable
The acpi_cst_latency_cmp comparison function currently used for sorting
C-state latencies does not satisfy transitivity, causing incorrect
sorting results. Specifically, if there are two valid acpi_processor_cx
elements A and B and one invalid element C, it may occur that A < B,
A = C, and B = C. Sorting algorithms assume that if A < B and A = C,
then C < B, leading to incorrect ordering.
Given the small size of the array (<=8), we replace the library sort
function with a simple insertion sort that properly ignores invalid
elements and sorts valid ones based on latency. This change ensures
correct ordering of the C-state latencies.
Fixes: 65ea8f2c6e23 ("ACPI: processor idle: Fix up C-state latency if not ordered")
Cc: stable@vger.kernel.org
Reported-by: Julian Sikorski <belegdol@gmail.com>
Closes: https://lore.kernel.org/lkml/70674dc7-5586-4183-8953-8095567e73df@gmail.com/
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
---
v3 -> v4:
- Rename the parameter 'arr' to 'states'.
- Add empty lines to enhance readability.
Note: I only performed a build test and a simple unit test to ensure
the latency of valid elements is correctly sorted in the randomly
generated data.
drivers/acpi/processor_idle.c | 37 +++++++++++++++--------------------
1 file changed, 16 insertions(+), 21 deletions(-)
diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
index bd6a7857ce05..831fa4a12159 100644
--- a/drivers/acpi/processor_idle.c
+++ b/drivers/acpi/processor_idle.c
@@ -16,7 +16,6 @@
#include <linux/acpi.h>
#include <linux/dmi.h>
#include <linux/sched.h> /* need_resched() */
-#include <linux/sort.h>
#include <linux/tick.h>
#include <linux/cpuidle.h>
#include <linux/cpu.h>
@@ -386,25 +385,24 @@ static void acpi_processor_power_verify_c3(struct acpi_processor *pr,
acpi_write_bit_register(ACPI_BITREG_BUS_MASTER_RLD, 1);
}
-static int acpi_cst_latency_cmp(const void *a, const void *b)
+static void acpi_cst_latency_sort(struct acpi_processor_cx *states, size_t length)
{
- const struct acpi_processor_cx *x = a, *y = b;
+ int i, j, k;
- if (!(x->valid && y->valid))
- return 0;
- if (x->latency > y->latency)
- return 1;
- if (x->latency < y->latency)
- return -1;
- return 0;
-}
-static void acpi_cst_latency_swap(void *a, void *b, int n)
-{
- struct acpi_processor_cx *x = a, *y = b;
+ for (i = 1; i < length; i++) {
+ if (!states[i].valid)
+ continue;
- if (!(x->valid && y->valid))
- return;
- swap(x->latency, y->latency);
+ for (j = i - 1, k = i; j >= 0; j--) {
+ if (!states[j].valid)
+ continue;
+
+ if (states[j].latency > states[k].latency)
+ swap(states[j].latency, states[k].latency);
+
+ k = j;
+ }
+ }
}
static int acpi_processor_power_verify(struct acpi_processor *pr)
@@ -449,10 +447,7 @@ static int acpi_processor_power_verify(struct acpi_processor *pr)
if (buggy_latency) {
pr_notice("FW issue: working around C-state latencies out of order\n");
- sort(&pr->power.states[1], max_cstate,
- sizeof(struct acpi_processor_cx),
- acpi_cst_latency_cmp,
- acpi_cst_latency_swap);
+ acpi_cst_latency_sort(&pr->power.states[1], max_cstate);
}
lapic_timer_propagate_broadcast(pr);
--
2.34.1
^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH v4] ACPI: processor_idle: Fix invalid comparison with insertion sort for latency
2024-07-01 20:56 ` [PATCH v4] " Kuan-Wei Chiu
@ 2024-07-02 7:28 ` Julian Sikorski
2024-07-02 12:59 ` Mario Limonciello
2024-07-02 18:38 ` Rafael J. Wysocki
1 sibling, 1 reply; 20+ messages in thread
From: Julian Sikorski @ 2024-07-02 7:28 UTC (permalink / raw)
To: Kuan-Wei Chiu, rafael
Cc: lenb, mario.limonciello, akpm, jserv, alexdeucher, regressions,
linux-acpi, regressions, linux-kernel, stable
Am 01.07.24 um 22:56 schrieb Kuan-Wei Chiu:
> The acpi_cst_latency_cmp comparison function currently used for sorting
> C-state latencies does not satisfy transitivity, causing incorrect
> sorting results. Specifically, if there are two valid acpi_processor_cx
> elements A and B and one invalid element C, it may occur that A < B,
> A = C, and B = C. Sorting algorithms assume that if A < B and A = C,
> then C < B, leading to incorrect ordering.
>
> Given the small size of the array (<=8), we replace the library sort
> function with a simple insertion sort that properly ignores invalid
> elements and sorts valid ones based on latency. This change ensures
> correct ordering of the C-state latencies.
>
> Fixes: 65ea8f2c6e23 ("ACPI: processor idle: Fix up C-state latency if not ordered")
> Cc: stable@vger.kernel.org
> Reported-by: Julian Sikorski <belegdol@gmail.com>
> Closes: https://lore.kernel.org/lkml/70674dc7-5586-4183-8953-8095567e73df@gmail.com/
> Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
> ---
> v3 -> v4:
> - Rename the parameter 'arr' to 'states'.
> - Add empty lines to enhance readability.
>
> Note: I only performed a build test and a simple unit test to ensure
> the latency of valid elements is correctly sorted in the randomly
> generated data.
>
Hello,
thanks for the patch. I have tested this applied on top of Fedora 6.9.7
kernel on my Asus laptop and the message about suspend not reaching the
deepest state is gone. Thank you.
I wonder whether this will also fix random S3 suspend issues I have been
seeing on my 5600x since 6.9 kernel too. I will definitely try.
Best regards,
Julian
Tested-by: Julian Sikorski <belegdol@gmail.com>
> drivers/acpi/processor_idle.c | 37 +++++++++++++++--------------------
> 1 file changed, 16 insertions(+), 21 deletions(-)
>
> diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
> index bd6a7857ce05..831fa4a12159 100644
> --- a/drivers/acpi/processor_idle.c
> +++ b/drivers/acpi/processor_idle.c
> @@ -16,7 +16,6 @@
> #include <linux/acpi.h>
> #include <linux/dmi.h>
> #include <linux/sched.h> /* need_resched() */
> -#include <linux/sort.h>
> #include <linux/tick.h>
> #include <linux/cpuidle.h>
> #include <linux/cpu.h>
> @@ -386,25 +385,24 @@ static void acpi_processor_power_verify_c3(struct acpi_processor *pr,
> acpi_write_bit_register(ACPI_BITREG_BUS_MASTER_RLD, 1);
> }
>
> -static int acpi_cst_latency_cmp(const void *a, const void *b)
> +static void acpi_cst_latency_sort(struct acpi_processor_cx *states, size_t length)
> {
> - const struct acpi_processor_cx *x = a, *y = b;
> + int i, j, k;
>
> - if (!(x->valid && y->valid))
> - return 0;
> - if (x->latency > y->latency)
> - return 1;
> - if (x->latency < y->latency)
> - return -1;
> - return 0;
> -}
> -static void acpi_cst_latency_swap(void *a, void *b, int n)
> -{
> - struct acpi_processor_cx *x = a, *y = b;
> + for (i = 1; i < length; i++) {
> + if (!states[i].valid)
> + continue;
>
> - if (!(x->valid && y->valid))
> - return;
> - swap(x->latency, y->latency);
> + for (j = i - 1, k = i; j >= 0; j--) {
> + if (!states[j].valid)
> + continue;
> +
> + if (states[j].latency > states[k].latency)
> + swap(states[j].latency, states[k].latency);
> +
> + k = j;
> + }
> + }
> }
>
> static int acpi_processor_power_verify(struct acpi_processor *pr)
> @@ -449,10 +447,7 @@ static int acpi_processor_power_verify(struct acpi_processor *pr)
>
> if (buggy_latency) {
> pr_notice("FW issue: working around C-state latencies out of order\n");
> - sort(&pr->power.states[1], max_cstate,
> - sizeof(struct acpi_processor_cx),
> - acpi_cst_latency_cmp,
> - acpi_cst_latency_swap);
> + acpi_cst_latency_sort(&pr->power.states[1], max_cstate);
> }
>
> lapic_timer_propagate_broadcast(pr);
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v4] ACPI: processor_idle: Fix invalid comparison with insertion sort for latency
2024-07-02 7:28 ` Julian Sikorski
@ 2024-07-02 12:59 ` Mario Limonciello
0 siblings, 0 replies; 20+ messages in thread
From: Mario Limonciello @ 2024-07-02 12:59 UTC (permalink / raw)
To: Julian Sikorski, Kuan-Wei Chiu, rafael
Cc: lenb, akpm, jserv, alexdeucher, regressions, linux-acpi,
regressions, linux-kernel, stable
On 7/2/2024 2:28, Julian Sikorski wrote:
> Am 01.07.24 um 22:56 schrieb Kuan-Wei Chiu:
>> The acpi_cst_latency_cmp comparison function currently used for sorting
>> C-state latencies does not satisfy transitivity, causing incorrect
>> sorting results. Specifically, if there are two valid acpi_processor_cx
>> elements A and B and one invalid element C, it may occur that A < B,
>> A = C, and B = C. Sorting algorithms assume that if A < B and A = C,
>> then C < B, leading to incorrect ordering.
>>
>> Given the small size of the array (<=8), we replace the library sort
>> function with a simple insertion sort that properly ignores invalid
>> elements and sorts valid ones based on latency. This change ensures
>> correct ordering of the C-state latencies.
>>
>> Fixes: 65ea8f2c6e23 ("ACPI: processor idle: Fix up C-state latency if
>> not ordered")
>> Cc: stable@vger.kernel.org
>> Reported-by: Julian Sikorski <belegdol@gmail.com>
>> Closes:
>> https://lore.kernel.org/lkml/70674dc7-5586-4183-8953-8095567e73df@gmail.com/
>> Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
>> ---
>> v3 -> v4:
>> - Rename the parameter 'arr' to 'states'.
>> - Add empty lines to enhance readability.
>>
>> Note: I only performed a build test and a simple unit test to ensure
>> the latency of valid elements is correctly sorted in the randomly
>> generated data.
>>
>
> Hello,
>
> thanks for the patch. I have tested this applied on top of Fedora 6.9.7
> kernel on my Asus laptop and the message about suspend not reaching the
> deepest state is gone. Thank you.
That's great news.
> I wonder whether this will also fix random S3 suspend issues I have been
> seeing on my 5600x since 6.9 kernel too. I will definitely try.
Does your 5600x also sort C states? You'll see message in the logs. If
so yes it could help. If not; you probably will need to bisect that
separately.
>
> Best regards,
> Julian
>
> Tested-by: Julian Sikorski <belegdol@gmail.com>
>
>> drivers/acpi/processor_idle.c | 37 +++++++++++++++--------------------
>> 1 file changed, 16 insertions(+), 21 deletions(-)
>>
>> diff --git a/drivers/acpi/processor_idle.c
>> b/drivers/acpi/processor_idle.c
>> index bd6a7857ce05..831fa4a12159 100644
>> --- a/drivers/acpi/processor_idle.c
>> +++ b/drivers/acpi/processor_idle.c
>> @@ -16,7 +16,6 @@
>> #include <linux/acpi.h>
>> #include <linux/dmi.h>
>> #include <linux/sched.h> /* need_resched() */
>> -#include <linux/sort.h>
>> #include <linux/tick.h>
>> #include <linux/cpuidle.h>
>> #include <linux/cpu.h>
>> @@ -386,25 +385,24 @@ static void
>> acpi_processor_power_verify_c3(struct acpi_processor *pr,
>> acpi_write_bit_register(ACPI_BITREG_BUS_MASTER_RLD, 1);
>> }
>> -static int acpi_cst_latency_cmp(const void *a, const void *b)
>> +static void acpi_cst_latency_sort(struct acpi_processor_cx *states,
>> size_t length)
>> {
>> - const struct acpi_processor_cx *x = a, *y = b;
>> + int i, j, k;
>> - if (!(x->valid && y->valid))
>> - return 0;
>> - if (x->latency > y->latency)
>> - return 1;
>> - if (x->latency < y->latency)
>> - return -1;
>> - return 0;
>> -}
>> -static void acpi_cst_latency_swap(void *a, void *b, int n)
>> -{
>> - struct acpi_processor_cx *x = a, *y = b;
>> + for (i = 1; i < length; i++) {
>> + if (!states[i].valid)
>> + continue;
>> - if (!(x->valid && y->valid))
>> - return;
>> - swap(x->latency, y->latency);
>> + for (j = i - 1, k = i; j >= 0; j--) {
>> + if (!states[j].valid)
>> + continue;
>> +
>> + if (states[j].latency > states[k].latency)
>> + swap(states[j].latency, states[k].latency);
>> +
>> + k = j;
>> + }
>> + }
>> }
>> static int acpi_processor_power_verify(struct acpi_processor *pr)
>> @@ -449,10 +447,7 @@ static int acpi_processor_power_verify(struct
>> acpi_processor *pr)
>> if (buggy_latency) {
>> pr_notice("FW issue: working around C-state latencies out of
>> order\n");
>> - sort(&pr->power.states[1], max_cstate,
>> - sizeof(struct acpi_processor_cx),
>> - acpi_cst_latency_cmp,
>> - acpi_cst_latency_swap);
>> + acpi_cst_latency_sort(&pr->power.states[1], max_cstate);
>> }
>> lapic_timer_propagate_broadcast(pr);
>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v4] ACPI: processor_idle: Fix invalid comparison with insertion sort for latency
2024-07-01 20:56 ` [PATCH v4] " Kuan-Wei Chiu
2024-07-02 7:28 ` Julian Sikorski
@ 2024-07-02 18:38 ` Rafael J. Wysocki
1 sibling, 0 replies; 20+ messages in thread
From: Rafael J. Wysocki @ 2024-07-02 18:38 UTC (permalink / raw)
To: Kuan-Wei Chiu
Cc: rafael, lenb, mario.limonciello, akpm, jserv, alexdeucher,
belegdol, regressions, linux-acpi, regressions, linux-kernel,
stable
On Mon, Jul 1, 2024 at 10:56 PM Kuan-Wei Chiu <visitorckw@gmail.com> wrote:
>
> The acpi_cst_latency_cmp comparison function currently used for sorting
> C-state latencies does not satisfy transitivity, causing incorrect
> sorting results. Specifically, if there are two valid acpi_processor_cx
> elements A and B and one invalid element C, it may occur that A < B,
> A = C, and B = C. Sorting algorithms assume that if A < B and A = C,
> then C < B, leading to incorrect ordering.
>
> Given the small size of the array (<=8), we replace the library sort
> function with a simple insertion sort that properly ignores invalid
> elements and sorts valid ones based on latency. This change ensures
> correct ordering of the C-state latencies.
>
> Fixes: 65ea8f2c6e23 ("ACPI: processor idle: Fix up C-state latency if not ordered")
> Cc: stable@vger.kernel.org
> Reported-by: Julian Sikorski <belegdol@gmail.com>
> Closes: https://lore.kernel.org/lkml/70674dc7-5586-4183-8953-8095567e73df@gmail.com/
> Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
> ---
> v3 -> v4:
> - Rename the parameter 'arr' to 'states'.
> - Add empty lines to enhance readability.
>
> Note: I only performed a build test and a simple unit test to ensure
> the latency of valid elements is correctly sorted in the randomly
> generated data.
>
> drivers/acpi/processor_idle.c | 37 +++++++++++++++--------------------
> 1 file changed, 16 insertions(+), 21 deletions(-)
>
> diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
> index bd6a7857ce05..831fa4a12159 100644
> --- a/drivers/acpi/processor_idle.c
> +++ b/drivers/acpi/processor_idle.c
> @@ -16,7 +16,6 @@
> #include <linux/acpi.h>
> #include <linux/dmi.h>
> #include <linux/sched.h> /* need_resched() */
> -#include <linux/sort.h>
> #include <linux/tick.h>
> #include <linux/cpuidle.h>
> #include <linux/cpu.h>
> @@ -386,25 +385,24 @@ static void acpi_processor_power_verify_c3(struct acpi_processor *pr,
> acpi_write_bit_register(ACPI_BITREG_BUS_MASTER_RLD, 1);
> }
>
> -static int acpi_cst_latency_cmp(const void *a, const void *b)
> +static void acpi_cst_latency_sort(struct acpi_processor_cx *states, size_t length)
> {
> - const struct acpi_processor_cx *x = a, *y = b;
> + int i, j, k;
>
> - if (!(x->valid && y->valid))
> - return 0;
> - if (x->latency > y->latency)
> - return 1;
> - if (x->latency < y->latency)
> - return -1;
> - return 0;
> -}
> -static void acpi_cst_latency_swap(void *a, void *b, int n)
> -{
> - struct acpi_processor_cx *x = a, *y = b;
> + for (i = 1; i < length; i++) {
> + if (!states[i].valid)
> + continue;
>
> - if (!(x->valid && y->valid))
> - return;
> - swap(x->latency, y->latency);
> + for (j = i - 1, k = i; j >= 0; j--) {
> + if (!states[j].valid)
> + continue;
> +
> + if (states[j].latency > states[k].latency)
> + swap(states[j].latency, states[k].latency);
> +
> + k = j;
> + }
> + }
> }
>
> static int acpi_processor_power_verify(struct acpi_processor *pr)
> @@ -449,10 +447,7 @@ static int acpi_processor_power_verify(struct acpi_processor *pr)
>
> if (buggy_latency) {
> pr_notice("FW issue: working around C-state latencies out of order\n");
> - sort(&pr->power.states[1], max_cstate,
> - sizeof(struct acpi_processor_cx),
> - acpi_cst_latency_cmp,
> - acpi_cst_latency_swap);
> + acpi_cst_latency_sort(&pr->power.states[1], max_cstate);
> }
>
> lapic_timer_propagate_broadcast(pr);
> --
Applied as 6.10-rc material, thanks!
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/2] lib/sort: Optimize heapsort with double-pop variation
2024-06-20 15:36 ` Julian Sikorski
2024-06-20 20:17 ` Mario Limonciello
2024-06-28 15:15 ` Linux regression tracking (Thorsten Leemhuis)
@ 2025-01-15 3:27 ` Luke Jones
2025-01-15 12:49 ` Kuan-Wei Chiu
2 siblings, 1 reply; 20+ messages in thread
From: Luke Jones @ 2025-01-15 3:27 UTC (permalink / raw)
To: Julian Sikorski, visitorckw; +Cc: akpm, jserv, linux-kernel, lkml, alexdeucher
On Thu, 2024-06-20 at 17:36 +0200, Julian Sikorski wrote:
> Hello,
>
> it appears that this patch has caused suspend-to-idle regression:
>
> https://gitlab.freedesktop.org/drm/amd/-/issues/3436
>
Another regression from this has been reported here
https://bugzilla.kernel.org/show_bug.cgi?id=219158
Regards,
Luke.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/2] lib/sort: Optimize heapsort with double-pop variation
2025-01-15 3:27 ` [PATCH 2/2] lib/sort: Optimize heapsort with double-pop variation Luke Jones
@ 2025-01-15 12:49 ` Kuan-Wei Chiu
0 siblings, 0 replies; 20+ messages in thread
From: Kuan-Wei Chiu @ 2025-01-15 12:49 UTC (permalink / raw)
To: Luke Jones
Cc: Julian Sikorski, akpm, jserv, linux-kernel, lkml, alexdeucher,
perex, tiwai, linux-sound, chuang, hui.wang
(+Cc sound developers)
Hi Luke,
On Wed, Jan 15, 2025 at 04:27:52PM +1300, Luke Jones wrote:
> On Thu, 2024-06-20 at 17:36 +0200, Julian Sikorski wrote:
> > Hello,
> >
> > it appears that this patch has caused suspend-to-idle regression:
> >
> > https://gitlab.freedesktop.org/drm/amd/-/issues/3436
> >
>
> Another regression from this has been reported here
> https://bugzilla.kernel.org/show_bug.cgi?id=219158
>
Thank you for reporting this regression!
From a quick look, this seems to be caused by yet another broken
compare function. In sound/pci/hda/hda_auto_parser.c, the
compare_input_type() function can lead to sorting issues when both
is_headset_mic and is_headphone_mic are true for a and b. This can
result in a situation where both a < b and b < a hold true, violating
the antisymmetry and transitivity required by sort().
Additionally, the comments about "swap" and "don't swap" seem to make
incorrect assumptions about how sort() works. Regardless of this
optimization patch, sort() may swap a and b without comparing them.
Could you help test the following code? If it works, I'll submit it as
an official patch.
Regards,
Kuan-Wei
diff --git a/sound/pci/hda/hda_auto_parser.c b/sound/pci/hda/hda_auto_parser.c
index 84393f4f429d..5502ec09b584 100644
--- a/sound/pci/hda/hda_auto_parser.c
+++ b/sound/pci/hda/hda_auto_parser.c
@@ -73,6 +73,8 @@ static int compare_input_type(const void *ap, const void *bp)
return (int)(a->type - b->type);
/* If has both hs_mic and hp_mic, pick the hs_mic ahead of hp_mic. */
+ if (a->is_headset_mic && b->is_headset_mic && a->is_headphone_mic && b->is_headphone_mic)
+ return (int)(b->has_boost_on_pin - a->has_boost_on_pin);
if (a->is_headset_mic && b->is_headphone_mic)
return -1; /* don't swap */
else if (a->is_headphone_mic && b->is_headset_mic)
^ permalink raw reply related [flat|nested] 20+ messages in thread
end of thread, other threads:[~2025-01-15 12:49 UTC | newest]
Thread overview: 20+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-01-13 3:13 [PATCH 0/2] lib/sort: Optimize the number of swaps and comparisons Kuan-Wei Chiu
2024-01-13 3:13 ` [PATCH 1/2] lib/sort: Optimize heapsort for equal elements in sift-down path Kuan-Wei Chiu
2024-01-13 3:13 ` [PATCH 2/2] lib/sort: Optimize heapsort with double-pop variation Kuan-Wei Chiu
2024-06-20 15:36 ` Julian Sikorski
2024-06-20 20:17 ` Mario Limonciello
2024-06-28 15:15 ` Linux regression tracking (Thorsten Leemhuis)
2024-06-28 17:10 ` Kuan-Wei Chiu
2024-06-29 5:03 ` Linux regression tracking (Thorsten Leemhuis)
[not found] ` <20240630210809.37550-1-visitorckw@gmail.com>
2024-06-30 21:13 ` [PATCH] ACPI: processor_idle: Fix invalid comparison with insertion sort for latency Kuan-Wei Chiu
2024-07-01 4:42 ` [PATCH v2] " Kuan-Wei Chiu
2024-07-01 5:06 ` Greg KH
2024-07-01 15:17 ` Mario Limonciello
2024-07-01 16:10 ` [PATCH v3] " Kuan-Wei Chiu
2024-07-01 17:36 ` Rafael J. Wysocki
2024-07-01 20:56 ` [PATCH v4] " Kuan-Wei Chiu
2024-07-02 7:28 ` Julian Sikorski
2024-07-02 12:59 ` Mario Limonciello
2024-07-02 18:38 ` Rafael J. Wysocki
2025-01-15 3:27 ` [PATCH 2/2] lib/sort: Optimize heapsort with double-pop variation Luke Jones
2025-01-15 12:49 ` Kuan-Wei Chiu
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox