* [PATCH 0/2] lib/sort: Optimize the number of swaps and comparisons @ 2024-01-13 3:13 Kuan-Wei Chiu 2024-01-13 3:13 ` [PATCH 1/2] lib/sort: Optimize heapsort for equal elements in sift-down path Kuan-Wei Chiu 2024-01-13 3:13 ` [PATCH 2/2] lib/sort: Optimize heapsort with double-pop variation Kuan-Wei Chiu 0 siblings, 2 replies; 20+ messages in thread From: Kuan-Wei Chiu @ 2024-01-13 3:13 UTC (permalink / raw) To: akpm; +Cc: lkml, jserv, linux-kernel, Kuan-Wei Chiu Hello, This patch series aims to optimize the heapsort algorithm, specifically targeting a reduction in the number of swaps and comparisons required. Thanks, Kuan-Wei Chiu Kuan-Wei Chiu (2): lib/sort: Optimize heapsort for equal elements in sift-down path lib/sort: Optimize heapsort with double-pop variation lib/sort.c | 20 +++++++++++++++----- 1 file changed, 15 insertions(+), 5 deletions(-) -- 2.25.1 ^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH 1/2] lib/sort: Optimize heapsort for equal elements in sift-down path 2024-01-13 3:13 [PATCH 0/2] lib/sort: Optimize the number of swaps and comparisons Kuan-Wei Chiu @ 2024-01-13 3:13 ` Kuan-Wei Chiu 2024-01-13 3:13 ` [PATCH 2/2] lib/sort: Optimize heapsort with double-pop variation Kuan-Wei Chiu 1 sibling, 0 replies; 20+ messages in thread From: Kuan-Wei Chiu @ 2024-01-13 3:13 UTC (permalink / raw) To: akpm; +Cc: lkml, jserv, linux-kernel, Kuan-Wei Chiu Currently, when searching for the sift-down path and encountering equal elements, the algorithm chooses the left child. However, considering that the height of the right subtree may be one less than that of the left subtree, selecting the right child in such cases can potentially reduce the number of comparisons and swaps. For instance, when sorting an array of 10,000 identical elements, the current implementation requires 247,209 comparisons. With this patch, the number of comparisons can be reduced to 227,241. Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com> --- lib/sort.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/lib/sort.c b/lib/sort.c index b399bf10d675..fe4efd4a1410 100644 --- a/lib/sort.c +++ b/lib/sort.c @@ -262,7 +262,7 @@ void sort_r(void *base, size_t num, size_t size, * average, 3/4 worst-case.) */ for (b = a; c = 2*b + size, (d = c + size) < n;) - b = do_cmp(base + c, base + d, cmp_func, priv) >= 0 ? c : d; + b = do_cmp(base + c, base + d, cmp_func, priv) > 0 ? c : d; if (d == n) /* Special case last leaf with no sibling */ b = c; -- 2.25.1 ^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH 2/2] lib/sort: Optimize heapsort with double-pop variation 2024-01-13 3:13 [PATCH 0/2] lib/sort: Optimize the number of swaps and comparisons Kuan-Wei Chiu 2024-01-13 3:13 ` [PATCH 1/2] lib/sort: Optimize heapsort for equal elements in sift-down path Kuan-Wei Chiu @ 2024-01-13 3:13 ` Kuan-Wei Chiu 2024-06-20 15:36 ` Julian Sikorski 1 sibling, 1 reply; 20+ messages in thread From: Kuan-Wei Chiu @ 2024-01-13 3:13 UTC (permalink / raw) To: akpm; +Cc: lkml, jserv, linux-kernel, Kuan-Wei Chiu Instead of popping only the maximum element from the heap during each iteration, we now pop the two largest elements at once. Although this introduces an additional comparison to determine the second largest element, it enables a reduction in the height of the tree by one during the heapify operations starting from root's left/right child. This reduction in tree height by one leads to a decrease of one comparison and one swap. This optimization results in saving approximately 0.5 * n swaps without increasing the number of comparisons. Additionally, the heap size during heapify is now one less than the original size, offering a chance for further reduction in comparisons and swaps. Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com> --- The following experimental data is based on the array generated using get_random_u32(). | N | swaps (old) | swaps (new) | comparisons (old) | comparisons (new) | |-------|-------------|-------------|-------------------|-------------------| | 1000 | 9054 | 8569 | 10328 | 10320 | | 2000 | 20137 | 19182 | 22634 | 22587 | | 3000 | 32062 | 30623 | 35833 | 35752 | | 4000 | 44274 | 42282 | 49332 | 49306 | | 5000 | 57195 | 54676 | 63300 | 63294 | | 6000 | 70205 | 67202 | 77599 | 77557 | | 7000 | 83276 | 79831 | 92113 | 92032 | | 8000 | 96630 | 92678 | 106635 | 106617 | | 9000 | 110349 | 105883 | 121505 | 121404 | | 10000 | 124165 | 119202 | 136628 | 136617 | lib/sort.c | 18 ++++++++++++++---- 1 file changed, 14 insertions(+), 4 deletions(-) diff --git a/lib/sort.c b/lib/sort.c index fe4efd4a1410..a0509088f82a 100644 --- a/lib/sort.c +++ b/lib/sort.c @@ -215,6 +215,7 @@ void sort_r(void *base, size_t num, size_t size, /* pre-scale counters for performance */ size_t n = num * size, a = (num/2) * size; const unsigned int lsbit = size & -size; /* Used to find parent */ + size_t shift = 0; if (!a) /* num < 2 || size == 0 */ return; @@ -242,12 +243,21 @@ void sort_r(void *base, size_t num, size_t size, for (;;) { size_t b, c, d; - if (a) /* Building heap: sift down --a */ - a -= size; - else if (n -= size) /* Sorting: Extract root to --n */ + if (a) /* Building heap: sift down a */ + a -= size << shift; + else if (n > 3 * size) { /* Sorting: Extract two largest elements */ + n -= size; do_swap(base, base + n, size, swap_func, priv); - else /* Sort complete */ + shift = do_cmp(base + size, base + 2 * size, cmp_func, priv) <= 0; + a = size << shift; + n -= size; + do_swap(base + a, base + n, size, swap_func, priv); + } else if (n > size) { /* Sorting: Extract root */ + n -= size; + do_swap(base, base + n, size, swap_func, priv); + } else { /* Sort complete */ break; + } /* * Sift element at "a" down into heap. This is the -- 2.25.1 ^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH 2/2] lib/sort: Optimize heapsort with double-pop variation 2024-01-13 3:13 ` [PATCH 2/2] lib/sort: Optimize heapsort with double-pop variation Kuan-Wei Chiu @ 2024-06-20 15:36 ` Julian Sikorski 2024-06-20 20:17 ` Mario Limonciello ` (2 more replies) 0 siblings, 3 replies; 20+ messages in thread From: Julian Sikorski @ 2024-06-20 15:36 UTC (permalink / raw) To: visitorckw; +Cc: akpm, jserv, linux-kernel, lkml, alexdeucher Hello, it appears that this patch has caused suspend-to-idle regression: https://gitlab.freedesktop.org/drm/amd/-/issues/3436 In brief, my laptop fails to suspend completely with the following error in the log: Jun 18 12:42:20 kernel: amd_pmc AMDI0005:00: Last suspend didn't reach deepest state Power consumption remains high enough that my battery has already unexpectedly drained twice before I noticed something was off. I am not entirely sure how changes to sorting function can influence suspend, but it is what it is. 6.9.5 as shipped by Fedora 40 exhibits the issue, 6.9.5 as shipped by Fedora with the patch reverted does not. Best regards, Juliam ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/2] lib/sort: Optimize heapsort with double-pop variation 2024-06-20 15:36 ` Julian Sikorski @ 2024-06-20 20:17 ` Mario Limonciello 2024-06-28 15:15 ` Linux regression tracking (Thorsten Leemhuis) 2025-01-15 3:27 ` [PATCH 2/2] lib/sort: Optimize heapsort with double-pop variation Luke Jones 2 siblings, 0 replies; 20+ messages in thread From: Mario Limonciello @ 2024-06-20 20:17 UTC (permalink / raw) To: Julian Sikorski, visitorckw, Linux kernel regressions list Cc: akpm, jserv, linux-kernel, lkml, alexdeucher On 6/20/2024 10:36, Julian Sikorski wrote: > Hello, > > it appears that this patch has caused suspend-to-idle regression: > > https://gitlab.freedesktop.org/drm/amd/-/issues/3436 > > In brief, my laptop fails to suspend completely with the following error > in the log: > > Jun 18 12:42:20 kernel: amd_pmc AMDI0005:00: Last suspend didn't reach > deepest state > > Power consumption remains high enough that my battery has already > unexpectedly drained twice before I noticed something was off. > I am not entirely sure how changes to sorting function can influence > suspend, but it is what it is. 6.9.5 as shipped by Fedora 40 exhibits > the issue, 6.9.5 as shipped by Fedora with the patch reverted does not. > > Best regards, > Juliam > +regressions M/L #regzbot ^introduced: 0e02ca29a563 #regzbot from: Kuan-Wei Chiu <visitorckw@gmail.com> #regzbot monitor: https://gitlab.freedesktop.org/drm/amd/-/issues/3436 ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/2] lib/sort: Optimize heapsort with double-pop variation 2024-06-20 15:36 ` Julian Sikorski 2024-06-20 20:17 ` Mario Limonciello @ 2024-06-28 15:15 ` Linux regression tracking (Thorsten Leemhuis) 2024-06-28 17:10 ` Kuan-Wei Chiu 2025-01-15 3:27 ` [PATCH 2/2] lib/sort: Optimize heapsort with double-pop variation Luke Jones 2 siblings, 1 reply; 20+ messages in thread From: Linux regression tracking (Thorsten Leemhuis) @ 2024-06-28 15:15 UTC (permalink / raw) To: visitorckw, akpm Cc: jserv, linux-kernel, lkml, alexdeucher, Linux kernel regressions list, Julian Sikorski On 20.06.24 17:36, Julian Sikorski wrote: > > it appears that this patch has caused suspend-to-idle regression: > > https://gitlab.freedesktop.org/drm/amd/-/issues/3436 > > In brief, my laptop fails to suspend completely with the following error > in the log: > > Jun 18 12:42:20 kernel: amd_pmc AMDI0005:00: Last suspend didn't reach > deepest state > > Power consumption remains high enough that my battery has already > unexpectedly drained twice before I noticed something was off. > I am not entirely sure how changes to sorting function can influence > suspend, but it is what it is. 6.9.5 as shipped by Fedora 40 exhibits > the issue, 6.9.5 as shipped by Fedora with the patch reverted does not. Andrew, could you maybe help out here or point us in the direction of somewhat that might be able to help? I'm asking, as from a quick search on lore it seems Kuan-Wei Chiu has not posted anything since nearly four weeks and thus also did not reply to Julian's regression report. Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat) -- Everything you wanna know about Linux kernel regression tracking: https://linux-regtracking.leemhuis.info/about/#tldr If I did something stupid, please tell me, as explained on that page. #regzbot poke ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/2] lib/sort: Optimize heapsort with double-pop variation 2024-06-28 15:15 ` Linux regression tracking (Thorsten Leemhuis) @ 2024-06-28 17:10 ` Kuan-Wei Chiu 2024-06-29 5:03 ` Linux regression tracking (Thorsten Leemhuis) 0 siblings, 1 reply; 20+ messages in thread From: Kuan-Wei Chiu @ 2024-06-28 17:10 UTC (permalink / raw) To: Linux regressions mailing list Cc: akpm, jserv, linux-kernel, lkml, alexdeucher, Julian Sikorski On Fri, Jun 28, 2024 at 05:15:15PM +0200, Linux regression tracking (Thorsten Leemhuis) wrote: > On 20.06.24 17:36, Julian Sikorski wrote: > > > > it appears that this patch has caused suspend-to-idle regression: > > > > https://gitlab.freedesktop.org/drm/amd/-/issues/3436 > > > > In brief, my laptop fails to suspend completely with the following error > > in the log: > > > > Jun 18 12:42:20 kernel: amd_pmc AMDI0005:00: Last suspend didn't reach > > deepest state > > > > Power consumption remains high enough that my battery has already > > unexpectedly drained twice before I noticed something was off. > > I am not entirely sure how changes to sorting function can influence > > suspend, but it is what it is. 6.9.5 as shipped by Fedora 40 exhibits > > the issue, 6.9.5 as shipped by Fedora with the patch reverted does not. > > Andrew, could you maybe help out here or point us in the direction of > somewhat that might be able to help? I'm asking, as from a quick search > on lore it seems Kuan-Wei Chiu has not posted anything since nearly four > weeks and thus also did not reply to Julian's regression report. > > Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat) > -- > Everything you wanna know about Linux kernel regression tracking: > https://linux-regtracking.leemhuis.info/about/#tldr > If I did something stupid, please tell me, as explained on that page. > > #regzbot poke Sorry for the late reply. I apologize for the regression caused by the patch. I am not familiar with the power management domain. I would guess there might be some side effects in the compare or swap functions that I was not aware of. While reviewing the sort calls that could potentially cause the error, I noticed that the compare function used in the sort call within drivers/acpi/processor_idle.c might not satisfy the transitive relation. Although I'm not sure if this is the root cause of the problem, specifically, if there are two valid acpi_processor_cx elements A and B, and an invalid element C, there could be a scenario where A < B but simultaneously A = C and B = C. However, if I understand correctly, this issue should have existed before this patch but might not have been triggered previously. My patch might have exposed this issue by changing the order of comparisons and swaps. Regards, Kuan-Wei ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/2] lib/sort: Optimize heapsort with double-pop variation 2024-06-28 17:10 ` Kuan-Wei Chiu @ 2024-06-29 5:03 ` Linux regression tracking (Thorsten Leemhuis) [not found] ` <20240630210809.37550-1-visitorckw@gmail.com> 0 siblings, 1 reply; 20+ messages in thread From: Linux regression tracking (Thorsten Leemhuis) @ 2024-06-29 5:03 UTC (permalink / raw) To: Rafael J. Wysocki Cc: akpm, jserv, linux-kernel, lkml, alexdeucher, Julian Sikorski, ACPI Devel Maling List, Kuan-Wei Chiu, Linux regressions mailing list On 28.06.24 19:10, Kuan-Wei Chiu wrote: > On Fri, Jun 28, 2024 at 05:15:15PM +0200, Linux regression tracking (Thorsten Leemhuis) wrote: >> On 20.06.24 17:36, Julian Sikorski wrote: >>> >>> it appears that this patch has caused suspend-to-idle regression: >>> >>> https://gitlab.freedesktop.org/drm/amd/-/issues/3436 >>> >>> In brief, my laptop fails to suspend completely with the following error >>> in the log: >>> >>> Jun 18 12:42:20 kernel: amd_pmc AMDI0005:00: Last suspend didn't reach >>> deepest state >>> >>> Power consumption remains high enough that my battery has already >>> unexpectedly drained twice before I noticed something was off. >>> I am not entirely sure how changes to sorting function can influence >>> suspend, but it is what it is. 6.9.5 as shipped by Fedora 40 exhibits >>> the issue, 6.9.5 as shipped by Fedora with the patch reverted does not. > [...] > I apologize for the regression caused by the patch. I am not familiar > with the power management domain. I would guess there might be some > side effects in the compare or swap functions that I was not aware of. > > While reviewing the sort calls that could potentially cause the error, > I noticed that the compare function used in the sort call within > drivers/acpi/processor_idle.c might not satisfy the transitive > relation. Although I'm not sure if this is the root cause of the > problem, specifically, if there are two valid acpi_processor_cx > elements A and B, and an invalid element C, there could be a scenario > where A < B but simultaneously A = C and B = C. Let's bring in Rafael, he might be able to help us out here. > However, if I > understand correctly, this issue should have existed before this patch > but might not have been triggered previously. My patch might have > exposed this issue by changing the order of comparisons and swaps. Yeah, these things happen. But when it comes to the Linux kernel's "no regressions" rule that does not matter much; what matters more is which change exposed the problem. Ciao, Thorsten ^ permalink raw reply [flat|nested] 20+ messages in thread
[parent not found: <20240630210809.37550-1-visitorckw@gmail.com>]
* Re: [PATCH] ACPI: processor_idle: Fix invalid comparison with insertion sort for latency [not found] ` <20240630210809.37550-1-visitorckw@gmail.com> @ 2024-06-30 21:13 ` Kuan-Wei Chiu 2024-07-01 4:42 ` [PATCH v2] " Kuan-Wei Chiu 0 siblings, 1 reply; 20+ messages in thread From: Kuan-Wei Chiu @ 2024-06-30 21:13 UTC (permalink / raw) To: rafael Cc: lenb, mario.limonciello, akpm, jserv, alexdeucher, belegdol, linux-acpi, regressions, regressions, linux-kernel + linux-kernel@vger.kernel.org On Mon, Jul 01, 2024 at 05:08:09AM +0800, Kuan-Wei Chiu wrote: > The acpi_cst_latency_cmp comparison function currently used for sorting > C-state latencies does not satisfy transitivity, causing incorrect > sorting results. Specifically, if there are two valid acpi_processor_cx > elements A and B and one invalid element C, it may occur that A < B, > A = C, and B = C. Sorting algorithms assume that if A < B and A = C, > then C < B, leading to incorrect ordering. > > Given the small size of the array (<=8), we replace the library sort > function with a simple insertion sort that properly ignores invalid > elements and sorts valid ones based on latency. This change ensures > correct ordering of the C-state latencies. > > Fixes: 65ea8f2c6e23 ("ACPI: processor idle: Fix up C-state latency if not ordered") > Reported-by: Julian Sikorski <belegdol@gmail.com> > Closes: https://lore.kernel.org/lkml/70674dc7-5586-4183-8953-8095567e73df@gmail.com/ > Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com> > --- > I do not have the appropriate AMD hardware to reproduce this issue and > test the patch. However, if the aforementioned reason is indeed the > source of the problem, I believe this patch might help. > > drivers/acpi/processor_idle.c | 33 ++++++++++++--------------------- > 1 file changed, 12 insertions(+), 21 deletions(-) > > diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c > index bd6a7857ce05..d58a7c64d80b 100644 > --- a/drivers/acpi/processor_idle.c > +++ b/drivers/acpi/processor_idle.c > @@ -386,25 +386,19 @@ static void acpi_processor_power_verify_c3(struct acpi_processor *pr, > acpi_write_bit_register(ACPI_BITREG_BUS_MASTER_RLD, 1); > } > > -static int acpi_cst_latency_cmp(const void *a, const void *b) > +static void acpi_cst_latency_sort(struct acpi_processor_cx *arr, size_t length) > { > - const struct acpi_processor_cx *x = a, *y = b; > + int i, j, k; > > - if (!(x->valid && y->valid)) > - return 0; > - if (x->latency > y->latency) > - return 1; > - if (x->latency < y->latency) > - return -1; > - return 0; > -} > -static void acpi_cst_latency_swap(void *a, void *b, int n) > -{ > - struct acpi_processor_cx *x = a, *y = b; > - > - if (!(x->valid && y->valid)) > - return; > - swap(x->latency, y->latency); > + for (i = 1; i < length; i++) { > + for (j = i - 1, k = i; j >= 0; j--) { > + if (!arr[j].valid) > + continue; > + if (arr[j].latency > arr[k].latency) > + swap(arr[j].latency, arr[k].latency); > + k = j; > + } > + } > } > > static int acpi_processor_power_verify(struct acpi_processor *pr) > @@ -449,10 +443,7 @@ static int acpi_processor_power_verify(struct acpi_processor *pr) > > if (buggy_latency) { > pr_notice("FW issue: working around C-state latencies out of order\n"); > - sort(&pr->power.states[1], max_cstate, > - sizeof(struct acpi_processor_cx), > - acpi_cst_latency_cmp, > - acpi_cst_latency_swap); > + acpi_cst_latency_sort(&pr->power.states[1], max_cstate); > } > > lapic_timer_propagate_broadcast(pr); > -- > 2.34.1 > ^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH v2] ACPI: processor_idle: Fix invalid comparison with insertion sort for latency 2024-06-30 21:13 ` [PATCH] ACPI: processor_idle: Fix invalid comparison with insertion sort for latency Kuan-Wei Chiu @ 2024-07-01 4:42 ` Kuan-Wei Chiu 2024-07-01 5:06 ` Greg KH 2024-07-01 15:17 ` Mario Limonciello 0 siblings, 2 replies; 20+ messages in thread From: Kuan-Wei Chiu @ 2024-07-01 4:42 UTC (permalink / raw) To: rafael Cc: lenb, mario.limonciello, akpm, jserv, alexdeucher, belegdol, regressions, linux-acpi, regressions, linux-kernel, Kuan-Wei Chiu The acpi_cst_latency_cmp comparison function currently used for sorting C-state latencies does not satisfy transitivity, causing incorrect sorting results. Specifically, if there are two valid acpi_processor_cx elements A and B and one invalid element C, it may occur that A < B, A = C, and B = C. Sorting algorithms assume that if A < B and A = C, then C < B, leading to incorrect ordering. Given the small size of the array (<=8), we replace the library sort function with a simple insertion sort that properly ignores invalid elements and sorts valid ones based on latency. This change ensures correct ordering of the C-state latencies. Fixes: 65ea8f2c6e23 ("ACPI: processor idle: Fix up C-state latency if not ordered") Reported-by: Julian Sikorski <belegdol@gmail.com> Closes: https://lore.kernel.org/lkml/70674dc7-5586-4183-8953-8095567e73df@gmail.com/ Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com> --- v1 -> v2: - Avoid swapping if arr[i] is an invalid element. I do not have the appropriate AMD hardware to reproduce this issue and test the patch. However, if the aforementioned reason is indeed the source of the problem, I believe this patch might help. drivers/acpi/processor_idle.c | 35 ++++++++++++++--------------------- 1 file changed, 14 insertions(+), 21 deletions(-) diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c index bd6a7857ce05..813c718b9108 100644 --- a/drivers/acpi/processor_idle.c +++ b/drivers/acpi/processor_idle.c @@ -386,25 +386,21 @@ static void acpi_processor_power_verify_c3(struct acpi_processor *pr, acpi_write_bit_register(ACPI_BITREG_BUS_MASTER_RLD, 1); } -static int acpi_cst_latency_cmp(const void *a, const void *b) +static void acpi_cst_latency_sort(struct acpi_processor_cx *arr, size_t length) { - const struct acpi_processor_cx *x = a, *y = b; + int i, j, k; - if (!(x->valid && y->valid)) - return 0; - if (x->latency > y->latency) - return 1; - if (x->latency < y->latency) - return -1; - return 0; -} -static void acpi_cst_latency_swap(void *a, void *b, int n) -{ - struct acpi_processor_cx *x = a, *y = b; - - if (!(x->valid && y->valid)) - return; - swap(x->latency, y->latency); + for (i = 1; i < length; i++) { + if (!arr[i].valid) + continue; + for (j = i - 1, k = i; j >= 0; j--) { + if (!arr[j].valid) + continue; + if (arr[j].latency > arr[k].latency) + swap(arr[j].latency, arr[k].latency); + k = j; + } + } } static int acpi_processor_power_verify(struct acpi_processor *pr) @@ -449,10 +445,7 @@ static int acpi_processor_power_verify(struct acpi_processor *pr) if (buggy_latency) { pr_notice("FW issue: working around C-state latencies out of order\n"); - sort(&pr->power.states[1], max_cstate, - sizeof(struct acpi_processor_cx), - acpi_cst_latency_cmp, - acpi_cst_latency_swap); + acpi_cst_latency_sort(&pr->power.states[1], max_cstate); } lapic_timer_propagate_broadcast(pr); -- 2.34.1 ^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH v2] ACPI: processor_idle: Fix invalid comparison with insertion sort for latency 2024-07-01 4:42 ` [PATCH v2] " Kuan-Wei Chiu @ 2024-07-01 5:06 ` Greg KH 2024-07-01 15:17 ` Mario Limonciello 1 sibling, 0 replies; 20+ messages in thread From: Greg KH @ 2024-07-01 5:06 UTC (permalink / raw) To: Kuan-Wei Chiu Cc: rafael, lenb, mario.limonciello, akpm, jserv, alexdeucher, belegdol, regressions, linux-acpi, regressions, linux-kernel On Mon, Jul 01, 2024 at 12:42:32PM +0800, Kuan-Wei Chiu wrote: > The acpi_cst_latency_cmp comparison function currently used for sorting > C-state latencies does not satisfy transitivity, causing incorrect > sorting results. Specifically, if there are two valid acpi_processor_cx > elements A and B and one invalid element C, it may occur that A < B, > A = C, and B = C. Sorting algorithms assume that if A < B and A = C, > then C < B, leading to incorrect ordering. > > Given the small size of the array (<=8), we replace the library sort > function with a simple insertion sort that properly ignores invalid > elements and sorts valid ones based on latency. This change ensures > correct ordering of the C-state latencies. > > Fixes: 65ea8f2c6e23 ("ACPI: processor idle: Fix up C-state latency if not ordered") > Reported-by: Julian Sikorski <belegdol@gmail.com> > Closes: https://lore.kernel.org/lkml/70674dc7-5586-4183-8953-8095567e73df@gmail.com/ > Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com> > --- > v1 -> v2: > - Avoid swapping if arr[i] is an invalid element. > > I do not have the appropriate AMD hardware to reproduce this issue and > test the patch. However, if the aforementioned reason is indeed the > source of the problem, I believe this patch might help. > > drivers/acpi/processor_idle.c | 35 ++++++++++++++--------------------- > 1 file changed, 14 insertions(+), 21 deletions(-) > > diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c > index bd6a7857ce05..813c718b9108 100644 > --- a/drivers/acpi/processor_idle.c > +++ b/drivers/acpi/processor_idle.c > @@ -386,25 +386,21 @@ static void acpi_processor_power_verify_c3(struct acpi_processor *pr, > acpi_write_bit_register(ACPI_BITREG_BUS_MASTER_RLD, 1); > } > > -static int acpi_cst_latency_cmp(const void *a, const void *b) > +static void acpi_cst_latency_sort(struct acpi_processor_cx *arr, size_t length) > { > - const struct acpi_processor_cx *x = a, *y = b; > + int i, j, k; > > - if (!(x->valid && y->valid)) > - return 0; > - if (x->latency > y->latency) > - return 1; > - if (x->latency < y->latency) > - return -1; > - return 0; > -} > -static void acpi_cst_latency_swap(void *a, void *b, int n) > -{ > - struct acpi_processor_cx *x = a, *y = b; > - > - if (!(x->valid && y->valid)) > - return; > - swap(x->latency, y->latency); > + for (i = 1; i < length; i++) { > + if (!arr[i].valid) > + continue; > + for (j = i - 1, k = i; j >= 0; j--) { > + if (!arr[j].valid) > + continue; > + if (arr[j].latency > arr[k].latency) > + swap(arr[j].latency, arr[k].latency); > + k = j; > + } > + } > } > > static int acpi_processor_power_verify(struct acpi_processor *pr) > @@ -449,10 +445,7 @@ static int acpi_processor_power_verify(struct acpi_processor *pr) > > if (buggy_latency) { > pr_notice("FW issue: working around C-state latencies out of order\n"); > - sort(&pr->power.states[1], max_cstate, > - sizeof(struct acpi_processor_cx), > - acpi_cst_latency_cmp, > - acpi_cst_latency_swap); > + acpi_cst_latency_sort(&pr->power.states[1], max_cstate); > } > > lapic_timer_propagate_broadcast(pr); > -- > 2.34.1 > > Hi, This is the friendly patch-bot of Greg Kroah-Hartman. You have sent him a patch that has triggered this response. He used to manually respond to these common problems, but in order to save his sanity (he kept writing the same thing over and over, yet to different people), I was created. Hopefully you will not take offence and will fix the problem in your patch and resubmit it so that it can be accepted into the Linux kernel tree. You are receiving this message because of the following common error(s) as indicated below: - You have marked a patch with a "Fixes:" tag for a commit that is in an older released kernel, yet you do not have a cc: stable line in the signed-off-by area at all, which means that the patch will not be applied to any older kernel releases. To properly fix this, please follow the documented rules in the Documentation/process/stable-kernel-rules.rst file for how to resolve this. If you wish to discuss this problem further, or you have questions about how to resolve this issue, please feel free to respond to this email and Greg will reply once he has dug out from the pending patches received from other developers. thanks, greg k-h's patch email bot ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v2] ACPI: processor_idle: Fix invalid comparison with insertion sort for latency 2024-07-01 4:42 ` [PATCH v2] " Kuan-Wei Chiu 2024-07-01 5:06 ` Greg KH @ 2024-07-01 15:17 ` Mario Limonciello 2024-07-01 16:10 ` [PATCH v3] " Kuan-Wei Chiu 1 sibling, 1 reply; 20+ messages in thread From: Mario Limonciello @ 2024-07-01 15:17 UTC (permalink / raw) To: Kuan-Wei Chiu, rafael Cc: lenb, akpm, jserv, alexdeucher, belegdol, regressions, linux-acpi, regressions, linux-kernel On 6/30/2024 23:42, Kuan-Wei Chiu wrote: > The acpi_cst_latency_cmp comparison function currently used for sorting > C-state latencies does not satisfy transitivity, causing incorrect > sorting results. Specifically, if there are two valid acpi_processor_cx > elements A and B and one invalid element C, it may occur that A < B, > A = C, and B = C. Sorting algorithms assume that if A < B and A = C, > then C < B, leading to incorrect ordering. > > Given the small size of the array (<=8), we replace the library sort > function with a simple insertion sort that properly ignores invalid > elements and sorts valid ones based on latency. This change ensures > correct ordering of the C-state latencies. > > Fixes: 65ea8f2c6e23 ("ACPI: processor idle: Fix up C-state latency if not ordered") > Reported-by: Julian Sikorski <belegdol@gmail.com> > Closes: https://lore.kernel.org/lkml/70674dc7-5586-4183-8953-8095567e73df@gmail.com/ > Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com> > --- > v1 -> v2: > - Avoid swapping if arr[i] is an invalid element. > > I do not have the appropriate AMD hardware to reproduce this issue and > test the patch. Even if you had hardware, the behavior stems from a BIOS bug with the _CST entries. I know it's been fixed in the BIOS on most platforms in that era, but if I recall correctly from a few years back Julian's system was EoL already. > However, if the aforementioned reason is indeed the > source of the problem, I believe this patch might help. Thanks! Assuming this patch works for Julian, I believe you can also drop the #include <linux/sort.h> from this file as well. I do think this should be cc to @stable too in that case. > > drivers/acpi/processor_idle.c | 35 ++++++++++++++--------------------- > 1 file changed, 14 insertions(+), 21 deletions(-) > > diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c > index bd6a7857ce05..813c718b9108 100644 > --- a/drivers/acpi/processor_idle.c > +++ b/drivers/acpi/processor_idle.c > @@ -386,25 +386,21 @@ static void acpi_processor_power_verify_c3(struct acpi_processor *pr, > acpi_write_bit_register(ACPI_BITREG_BUS_MASTER_RLD, 1); > } > > -static int acpi_cst_latency_cmp(const void *a, const void *b) > +static void acpi_cst_latency_sort(struct acpi_processor_cx *arr, size_t length) > { > - const struct acpi_processor_cx *x = a, *y = b; > + int i, j, k; > > - if (!(x->valid && y->valid)) > - return 0; > - if (x->latency > y->latency) > - return 1; > - if (x->latency < y->latency) > - return -1; > - return 0; > -} > -static void acpi_cst_latency_swap(void *a, void *b, int n) > -{ > - struct acpi_processor_cx *x = a, *y = b; > - > - if (!(x->valid && y->valid)) > - return; > - swap(x->latency, y->latency); > + for (i = 1; i < length; i++) { > + if (!arr[i].valid) > + continue; > + for (j = i - 1, k = i; j >= 0; j--) { > + if (!arr[j].valid) > + continue; > + if (arr[j].latency > arr[k].latency) > + swap(arr[j].latency, arr[k].latency); > + k = j; > + } > + } > } > > static int acpi_processor_power_verify(struct acpi_processor *pr) > @@ -449,10 +445,7 @@ static int acpi_processor_power_verify(struct acpi_processor *pr) > > if (buggy_latency) { > pr_notice("FW issue: working around C-state latencies out of order\n"); > - sort(&pr->power.states[1], max_cstate, > - sizeof(struct acpi_processor_cx), > - acpi_cst_latency_cmp, > - acpi_cst_latency_swap); > + acpi_cst_latency_sort(&pr->power.states[1], max_cstate); > } > > lapic_timer_propagate_broadcast(pr); ^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH v3] ACPI: processor_idle: Fix invalid comparison with insertion sort for latency 2024-07-01 15:17 ` Mario Limonciello @ 2024-07-01 16:10 ` Kuan-Wei Chiu 2024-07-01 17:36 ` Rafael J. Wysocki 0 siblings, 1 reply; 20+ messages in thread From: Kuan-Wei Chiu @ 2024-07-01 16:10 UTC (permalink / raw) To: rafael Cc: lenb, mario.limonciello, akpm, jserv, alexdeucher, belegdol, regressions, linux-acpi, regressions, linux-kernel, Kuan-Wei Chiu, stable The acpi_cst_latency_cmp comparison function currently used for sorting C-state latencies does not satisfy transitivity, causing incorrect sorting results. Specifically, if there are two valid acpi_processor_cx elements A and B and one invalid element C, it may occur that A < B, A = C, and B = C. Sorting algorithms assume that if A < B and A = C, then C < B, leading to incorrect ordering. Given the small size of the array (<=8), we replace the library sort function with a simple insertion sort that properly ignores invalid elements and sorts valid ones based on latency. This change ensures correct ordering of the C-state latencies. Fixes: 65ea8f2c6e23 ("ACPI: processor idle: Fix up C-state latency if not ordered") Cc: stable@vger.kernel.org Reported-by: Julian Sikorski <belegdol@gmail.com> Closes: https://lore.kernel.org/lkml/70674dc7-5586-4183-8953-8095567e73df@gmail.com/ Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com> --- v2 -> v3: - Remove #include <linux/sort.h> - Cc @stable Note: I only performed a build test and a simple unit test to ensure the latency of valid elements is correctly sorted in the randomly generated data. drivers/acpi/processor_idle.c | 36 ++++++++++++++--------------------- 1 file changed, 14 insertions(+), 22 deletions(-) diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c index bd6a7857ce05..17cc81340b4b 100644 --- a/drivers/acpi/processor_idle.c +++ b/drivers/acpi/processor_idle.c @@ -16,7 +16,6 @@ #include <linux/acpi.h> #include <linux/dmi.h> #include <linux/sched.h> /* need_resched() */ -#include <linux/sort.h> #include <linux/tick.h> #include <linux/cpuidle.h> #include <linux/cpu.h> @@ -386,25 +385,21 @@ static void acpi_processor_power_verify_c3(struct acpi_processor *pr, acpi_write_bit_register(ACPI_BITREG_BUS_MASTER_RLD, 1); } -static int acpi_cst_latency_cmp(const void *a, const void *b) +static void acpi_cst_latency_sort(struct acpi_processor_cx *arr, size_t length) { - const struct acpi_processor_cx *x = a, *y = b; + int i, j, k; - if (!(x->valid && y->valid)) - return 0; - if (x->latency > y->latency) - return 1; - if (x->latency < y->latency) - return -1; - return 0; -} -static void acpi_cst_latency_swap(void *a, void *b, int n) -{ - struct acpi_processor_cx *x = a, *y = b; - - if (!(x->valid && y->valid)) - return; - swap(x->latency, y->latency); + for (i = 1; i < length; i++) { + if (!arr[i].valid) + continue; + for (j = i - 1, k = i; j >= 0; j--) { + if (!arr[j].valid) + continue; + if (arr[j].latency > arr[k].latency) + swap(arr[j].latency, arr[k].latency); + k = j; + } + } } static int acpi_processor_power_verify(struct acpi_processor *pr) @@ -449,10 +444,7 @@ static int acpi_processor_power_verify(struct acpi_processor *pr) if (buggy_latency) { pr_notice("FW issue: working around C-state latencies out of order\n"); - sort(&pr->power.states[1], max_cstate, - sizeof(struct acpi_processor_cx), - acpi_cst_latency_cmp, - acpi_cst_latency_swap); + acpi_cst_latency_sort(&pr->power.states[1], max_cstate); } lapic_timer_propagate_broadcast(pr); -- 2.34.1 ^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH v3] ACPI: processor_idle: Fix invalid comparison with insertion sort for latency 2024-07-01 16:10 ` [PATCH v3] " Kuan-Wei Chiu @ 2024-07-01 17:36 ` Rafael J. Wysocki 2024-07-01 20:56 ` [PATCH v4] " Kuan-Wei Chiu 0 siblings, 1 reply; 20+ messages in thread From: Rafael J. Wysocki @ 2024-07-01 17:36 UTC (permalink / raw) To: Kuan-Wei Chiu Cc: rafael, lenb, mario.limonciello, akpm, jserv, alexdeucher, belegdol, regressions, linux-acpi, regressions, linux-kernel, stable On Mon, Jul 1, 2024 at 6:10 PM Kuan-Wei Chiu <visitorckw@gmail.com> wrote: > > The acpi_cst_latency_cmp comparison function currently used for sorting > C-state latencies does not satisfy transitivity, causing incorrect > sorting results. Specifically, if there are two valid acpi_processor_cx > elements A and B and one invalid element C, it may occur that A < B, > A = C, and B = C. Sorting algorithms assume that if A < B and A = C, > then C < B, leading to incorrect ordering. > > Given the small size of the array (<=8), we replace the library sort > function with a simple insertion sort that properly ignores invalid > elements and sorts valid ones based on latency. This change ensures > correct ordering of the C-state latencies. > > Fixes: 65ea8f2c6e23 ("ACPI: processor idle: Fix up C-state latency if not ordered") > Cc: stable@vger.kernel.org > Reported-by: Julian Sikorski <belegdol@gmail.com> > Closes: https://lore.kernel.org/lkml/70674dc7-5586-4183-8953-8095567e73df@gmail.com/ > Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com> > --- > v2 -> v3: > - Remove #include <linux/sort.h> > - Cc @stable > > Note: I only performed a build test and a simple unit test to ensure > the latency of valid elements is correctly sorted in the randomly > generated data. > > drivers/acpi/processor_idle.c | 36 ++++++++++++++--------------------- > 1 file changed, 14 insertions(+), 22 deletions(-) > > diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c > index bd6a7857ce05..17cc81340b4b 100644 > --- a/drivers/acpi/processor_idle.c > +++ b/drivers/acpi/processor_idle.c > @@ -16,7 +16,6 @@ > #include <linux/acpi.h> > #include <linux/dmi.h> > #include <linux/sched.h> /* need_resched() */ > -#include <linux/sort.h> > #include <linux/tick.h> > #include <linux/cpuidle.h> > #include <linux/cpu.h> > @@ -386,25 +385,21 @@ static void acpi_processor_power_verify_c3(struct acpi_processor *pr, > acpi_write_bit_register(ACPI_BITREG_BUS_MASTER_RLD, 1); > } > > -static int acpi_cst_latency_cmp(const void *a, const void *b) > +static void acpi_cst_latency_sort(struct acpi_processor_cx *arr, size_t length) s/arr/states/ please. > { > - const struct acpi_processor_cx *x = a, *y = b; > + int i, j, k; > > - if (!(x->valid && y->valid)) > - return 0; > - if (x->latency > y->latency) > - return 1; > - if (x->latency < y->latency) > - return -1; > - return 0; > -} > -static void acpi_cst_latency_swap(void *a, void *b, int n) > -{ > - struct acpi_processor_cx *x = a, *y = b; > - > - if (!(x->valid && y->valid)) > - return; > - swap(x->latency, y->latency); > + for (i = 1; i < length; i++) { > + if (!arr[i].valid) > + continue; Please add an empty line here (and analogously below). > + for (j = i - 1, k = i; j >= 0; j--) { > + if (!arr[j].valid) > + continue; > + if (arr[j].latency > arr[k].latency) > + swap(arr[j].latency, arr[k].latency); And here. > + k = j; > + } > + } > } > > static int acpi_processor_power_verify(struct acpi_processor *pr) > @@ -449,10 +444,7 @@ static int acpi_processor_power_verify(struct acpi_processor *pr) > > if (buggy_latency) { > pr_notice("FW issue: working around C-state latencies out of order\n"); > - sort(&pr->power.states[1], max_cstate, > - sizeof(struct acpi_processor_cx), > - acpi_cst_latency_cmp, > - acpi_cst_latency_swap); > + acpi_cst_latency_sort(&pr->power.states[1], max_cstate); > } > > lapic_timer_propagate_broadcast(pr); > -- ^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH v4] ACPI: processor_idle: Fix invalid comparison with insertion sort for latency 2024-07-01 17:36 ` Rafael J. Wysocki @ 2024-07-01 20:56 ` Kuan-Wei Chiu 2024-07-02 7:28 ` Julian Sikorski 2024-07-02 18:38 ` Rafael J. Wysocki 0 siblings, 2 replies; 20+ messages in thread From: Kuan-Wei Chiu @ 2024-07-01 20:56 UTC (permalink / raw) To: rafael Cc: lenb, mario.limonciello, akpm, jserv, alexdeucher, belegdol, regressions, linux-acpi, regressions, linux-kernel, Kuan-Wei Chiu, stable The acpi_cst_latency_cmp comparison function currently used for sorting C-state latencies does not satisfy transitivity, causing incorrect sorting results. Specifically, if there are two valid acpi_processor_cx elements A and B and one invalid element C, it may occur that A < B, A = C, and B = C. Sorting algorithms assume that if A < B and A = C, then C < B, leading to incorrect ordering. Given the small size of the array (<=8), we replace the library sort function with a simple insertion sort that properly ignores invalid elements and sorts valid ones based on latency. This change ensures correct ordering of the C-state latencies. Fixes: 65ea8f2c6e23 ("ACPI: processor idle: Fix up C-state latency if not ordered") Cc: stable@vger.kernel.org Reported-by: Julian Sikorski <belegdol@gmail.com> Closes: https://lore.kernel.org/lkml/70674dc7-5586-4183-8953-8095567e73df@gmail.com/ Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com> --- v3 -> v4: - Rename the parameter 'arr' to 'states'. - Add empty lines to enhance readability. Note: I only performed a build test and a simple unit test to ensure the latency of valid elements is correctly sorted in the randomly generated data. drivers/acpi/processor_idle.c | 37 +++++++++++++++-------------------- 1 file changed, 16 insertions(+), 21 deletions(-) diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c index bd6a7857ce05..831fa4a12159 100644 --- a/drivers/acpi/processor_idle.c +++ b/drivers/acpi/processor_idle.c @@ -16,7 +16,6 @@ #include <linux/acpi.h> #include <linux/dmi.h> #include <linux/sched.h> /* need_resched() */ -#include <linux/sort.h> #include <linux/tick.h> #include <linux/cpuidle.h> #include <linux/cpu.h> @@ -386,25 +385,24 @@ static void acpi_processor_power_verify_c3(struct acpi_processor *pr, acpi_write_bit_register(ACPI_BITREG_BUS_MASTER_RLD, 1); } -static int acpi_cst_latency_cmp(const void *a, const void *b) +static void acpi_cst_latency_sort(struct acpi_processor_cx *states, size_t length) { - const struct acpi_processor_cx *x = a, *y = b; + int i, j, k; - if (!(x->valid && y->valid)) - return 0; - if (x->latency > y->latency) - return 1; - if (x->latency < y->latency) - return -1; - return 0; -} -static void acpi_cst_latency_swap(void *a, void *b, int n) -{ - struct acpi_processor_cx *x = a, *y = b; + for (i = 1; i < length; i++) { + if (!states[i].valid) + continue; - if (!(x->valid && y->valid)) - return; - swap(x->latency, y->latency); + for (j = i - 1, k = i; j >= 0; j--) { + if (!states[j].valid) + continue; + + if (states[j].latency > states[k].latency) + swap(states[j].latency, states[k].latency); + + k = j; + } + } } static int acpi_processor_power_verify(struct acpi_processor *pr) @@ -449,10 +447,7 @@ static int acpi_processor_power_verify(struct acpi_processor *pr) if (buggy_latency) { pr_notice("FW issue: working around C-state latencies out of order\n"); - sort(&pr->power.states[1], max_cstate, - sizeof(struct acpi_processor_cx), - acpi_cst_latency_cmp, - acpi_cst_latency_swap); + acpi_cst_latency_sort(&pr->power.states[1], max_cstate); } lapic_timer_propagate_broadcast(pr); -- 2.34.1 ^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH v4] ACPI: processor_idle: Fix invalid comparison with insertion sort for latency 2024-07-01 20:56 ` [PATCH v4] " Kuan-Wei Chiu @ 2024-07-02 7:28 ` Julian Sikorski 2024-07-02 12:59 ` Mario Limonciello 2024-07-02 18:38 ` Rafael J. Wysocki 1 sibling, 1 reply; 20+ messages in thread From: Julian Sikorski @ 2024-07-02 7:28 UTC (permalink / raw) To: Kuan-Wei Chiu, rafael Cc: lenb, mario.limonciello, akpm, jserv, alexdeucher, regressions, linux-acpi, regressions, linux-kernel, stable Am 01.07.24 um 22:56 schrieb Kuan-Wei Chiu: > The acpi_cst_latency_cmp comparison function currently used for sorting > C-state latencies does not satisfy transitivity, causing incorrect > sorting results. Specifically, if there are two valid acpi_processor_cx > elements A and B and one invalid element C, it may occur that A < B, > A = C, and B = C. Sorting algorithms assume that if A < B and A = C, > then C < B, leading to incorrect ordering. > > Given the small size of the array (<=8), we replace the library sort > function with a simple insertion sort that properly ignores invalid > elements and sorts valid ones based on latency. This change ensures > correct ordering of the C-state latencies. > > Fixes: 65ea8f2c6e23 ("ACPI: processor idle: Fix up C-state latency if not ordered") > Cc: stable@vger.kernel.org > Reported-by: Julian Sikorski <belegdol@gmail.com> > Closes: https://lore.kernel.org/lkml/70674dc7-5586-4183-8953-8095567e73df@gmail.com/ > Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com> > --- > v3 -> v4: > - Rename the parameter 'arr' to 'states'. > - Add empty lines to enhance readability. > > Note: I only performed a build test and a simple unit test to ensure > the latency of valid elements is correctly sorted in the randomly > generated data. > Hello, thanks for the patch. I have tested this applied on top of Fedora 6.9.7 kernel on my Asus laptop and the message about suspend not reaching the deepest state is gone. Thank you. I wonder whether this will also fix random S3 suspend issues I have been seeing on my 5600x since 6.9 kernel too. I will definitely try. Best regards, Julian Tested-by: Julian Sikorski <belegdol@gmail.com> > drivers/acpi/processor_idle.c | 37 +++++++++++++++-------------------- > 1 file changed, 16 insertions(+), 21 deletions(-) > > diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c > index bd6a7857ce05..831fa4a12159 100644 > --- a/drivers/acpi/processor_idle.c > +++ b/drivers/acpi/processor_idle.c > @@ -16,7 +16,6 @@ > #include <linux/acpi.h> > #include <linux/dmi.h> > #include <linux/sched.h> /* need_resched() */ > -#include <linux/sort.h> > #include <linux/tick.h> > #include <linux/cpuidle.h> > #include <linux/cpu.h> > @@ -386,25 +385,24 @@ static void acpi_processor_power_verify_c3(struct acpi_processor *pr, > acpi_write_bit_register(ACPI_BITREG_BUS_MASTER_RLD, 1); > } > > -static int acpi_cst_latency_cmp(const void *a, const void *b) > +static void acpi_cst_latency_sort(struct acpi_processor_cx *states, size_t length) > { > - const struct acpi_processor_cx *x = a, *y = b; > + int i, j, k; > > - if (!(x->valid && y->valid)) > - return 0; > - if (x->latency > y->latency) > - return 1; > - if (x->latency < y->latency) > - return -1; > - return 0; > -} > -static void acpi_cst_latency_swap(void *a, void *b, int n) > -{ > - struct acpi_processor_cx *x = a, *y = b; > + for (i = 1; i < length; i++) { > + if (!states[i].valid) > + continue; > > - if (!(x->valid && y->valid)) > - return; > - swap(x->latency, y->latency); > + for (j = i - 1, k = i; j >= 0; j--) { > + if (!states[j].valid) > + continue; > + > + if (states[j].latency > states[k].latency) > + swap(states[j].latency, states[k].latency); > + > + k = j; > + } > + } > } > > static int acpi_processor_power_verify(struct acpi_processor *pr) > @@ -449,10 +447,7 @@ static int acpi_processor_power_verify(struct acpi_processor *pr) > > if (buggy_latency) { > pr_notice("FW issue: working around C-state latencies out of order\n"); > - sort(&pr->power.states[1], max_cstate, > - sizeof(struct acpi_processor_cx), > - acpi_cst_latency_cmp, > - acpi_cst_latency_swap); > + acpi_cst_latency_sort(&pr->power.states[1], max_cstate); > } > > lapic_timer_propagate_broadcast(pr); ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v4] ACPI: processor_idle: Fix invalid comparison with insertion sort for latency 2024-07-02 7:28 ` Julian Sikorski @ 2024-07-02 12:59 ` Mario Limonciello 0 siblings, 0 replies; 20+ messages in thread From: Mario Limonciello @ 2024-07-02 12:59 UTC (permalink / raw) To: Julian Sikorski, Kuan-Wei Chiu, rafael Cc: lenb, akpm, jserv, alexdeucher, regressions, linux-acpi, regressions, linux-kernel, stable On 7/2/2024 2:28, Julian Sikorski wrote: > Am 01.07.24 um 22:56 schrieb Kuan-Wei Chiu: >> The acpi_cst_latency_cmp comparison function currently used for sorting >> C-state latencies does not satisfy transitivity, causing incorrect >> sorting results. Specifically, if there are two valid acpi_processor_cx >> elements A and B and one invalid element C, it may occur that A < B, >> A = C, and B = C. Sorting algorithms assume that if A < B and A = C, >> then C < B, leading to incorrect ordering. >> >> Given the small size of the array (<=8), we replace the library sort >> function with a simple insertion sort that properly ignores invalid >> elements and sorts valid ones based on latency. This change ensures >> correct ordering of the C-state latencies. >> >> Fixes: 65ea8f2c6e23 ("ACPI: processor idle: Fix up C-state latency if >> not ordered") >> Cc: stable@vger.kernel.org >> Reported-by: Julian Sikorski <belegdol@gmail.com> >> Closes: >> https://lore.kernel.org/lkml/70674dc7-5586-4183-8953-8095567e73df@gmail.com/ >> Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com> >> --- >> v3 -> v4: >> - Rename the parameter 'arr' to 'states'. >> - Add empty lines to enhance readability. >> >> Note: I only performed a build test and a simple unit test to ensure >> the latency of valid elements is correctly sorted in the randomly >> generated data. >> > > Hello, > > thanks for the patch. I have tested this applied on top of Fedora 6.9.7 > kernel on my Asus laptop and the message about suspend not reaching the > deepest state is gone. Thank you. That's great news. > I wonder whether this will also fix random S3 suspend issues I have been > seeing on my 5600x since 6.9 kernel too. I will definitely try. Does your 5600x also sort C states? You'll see message in the logs. If so yes it could help. If not; you probably will need to bisect that separately. > > Best regards, > Julian > > Tested-by: Julian Sikorski <belegdol@gmail.com> > >> drivers/acpi/processor_idle.c | 37 +++++++++++++++-------------------- >> 1 file changed, 16 insertions(+), 21 deletions(-) >> >> diff --git a/drivers/acpi/processor_idle.c >> b/drivers/acpi/processor_idle.c >> index bd6a7857ce05..831fa4a12159 100644 >> --- a/drivers/acpi/processor_idle.c >> +++ b/drivers/acpi/processor_idle.c >> @@ -16,7 +16,6 @@ >> #include <linux/acpi.h> >> #include <linux/dmi.h> >> #include <linux/sched.h> /* need_resched() */ >> -#include <linux/sort.h> >> #include <linux/tick.h> >> #include <linux/cpuidle.h> >> #include <linux/cpu.h> >> @@ -386,25 +385,24 @@ static void >> acpi_processor_power_verify_c3(struct acpi_processor *pr, >> acpi_write_bit_register(ACPI_BITREG_BUS_MASTER_RLD, 1); >> } >> -static int acpi_cst_latency_cmp(const void *a, const void *b) >> +static void acpi_cst_latency_sort(struct acpi_processor_cx *states, >> size_t length) >> { >> - const struct acpi_processor_cx *x = a, *y = b; >> + int i, j, k; >> - if (!(x->valid && y->valid)) >> - return 0; >> - if (x->latency > y->latency) >> - return 1; >> - if (x->latency < y->latency) >> - return -1; >> - return 0; >> -} >> -static void acpi_cst_latency_swap(void *a, void *b, int n) >> -{ >> - struct acpi_processor_cx *x = a, *y = b; >> + for (i = 1; i < length; i++) { >> + if (!states[i].valid) >> + continue; >> - if (!(x->valid && y->valid)) >> - return; >> - swap(x->latency, y->latency); >> + for (j = i - 1, k = i; j >= 0; j--) { >> + if (!states[j].valid) >> + continue; >> + >> + if (states[j].latency > states[k].latency) >> + swap(states[j].latency, states[k].latency); >> + >> + k = j; >> + } >> + } >> } >> static int acpi_processor_power_verify(struct acpi_processor *pr) >> @@ -449,10 +447,7 @@ static int acpi_processor_power_verify(struct >> acpi_processor *pr) >> if (buggy_latency) { >> pr_notice("FW issue: working around C-state latencies out of >> order\n"); >> - sort(&pr->power.states[1], max_cstate, >> - sizeof(struct acpi_processor_cx), >> - acpi_cst_latency_cmp, >> - acpi_cst_latency_swap); >> + acpi_cst_latency_sort(&pr->power.states[1], max_cstate); >> } >> lapic_timer_propagate_broadcast(pr); > ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v4] ACPI: processor_idle: Fix invalid comparison with insertion sort for latency 2024-07-01 20:56 ` [PATCH v4] " Kuan-Wei Chiu 2024-07-02 7:28 ` Julian Sikorski @ 2024-07-02 18:38 ` Rafael J. Wysocki 1 sibling, 0 replies; 20+ messages in thread From: Rafael J. Wysocki @ 2024-07-02 18:38 UTC (permalink / raw) To: Kuan-Wei Chiu Cc: rafael, lenb, mario.limonciello, akpm, jserv, alexdeucher, belegdol, regressions, linux-acpi, regressions, linux-kernel, stable On Mon, Jul 1, 2024 at 10:56 PM Kuan-Wei Chiu <visitorckw@gmail.com> wrote: > > The acpi_cst_latency_cmp comparison function currently used for sorting > C-state latencies does not satisfy transitivity, causing incorrect > sorting results. Specifically, if there are two valid acpi_processor_cx > elements A and B and one invalid element C, it may occur that A < B, > A = C, and B = C. Sorting algorithms assume that if A < B and A = C, > then C < B, leading to incorrect ordering. > > Given the small size of the array (<=8), we replace the library sort > function with a simple insertion sort that properly ignores invalid > elements and sorts valid ones based on latency. This change ensures > correct ordering of the C-state latencies. > > Fixes: 65ea8f2c6e23 ("ACPI: processor idle: Fix up C-state latency if not ordered") > Cc: stable@vger.kernel.org > Reported-by: Julian Sikorski <belegdol@gmail.com> > Closes: https://lore.kernel.org/lkml/70674dc7-5586-4183-8953-8095567e73df@gmail.com/ > Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com> > --- > v3 -> v4: > - Rename the parameter 'arr' to 'states'. > - Add empty lines to enhance readability. > > Note: I only performed a build test and a simple unit test to ensure > the latency of valid elements is correctly sorted in the randomly > generated data. > > drivers/acpi/processor_idle.c | 37 +++++++++++++++-------------------- > 1 file changed, 16 insertions(+), 21 deletions(-) > > diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c > index bd6a7857ce05..831fa4a12159 100644 > --- a/drivers/acpi/processor_idle.c > +++ b/drivers/acpi/processor_idle.c > @@ -16,7 +16,6 @@ > #include <linux/acpi.h> > #include <linux/dmi.h> > #include <linux/sched.h> /* need_resched() */ > -#include <linux/sort.h> > #include <linux/tick.h> > #include <linux/cpuidle.h> > #include <linux/cpu.h> > @@ -386,25 +385,24 @@ static void acpi_processor_power_verify_c3(struct acpi_processor *pr, > acpi_write_bit_register(ACPI_BITREG_BUS_MASTER_RLD, 1); > } > > -static int acpi_cst_latency_cmp(const void *a, const void *b) > +static void acpi_cst_latency_sort(struct acpi_processor_cx *states, size_t length) > { > - const struct acpi_processor_cx *x = a, *y = b; > + int i, j, k; > > - if (!(x->valid && y->valid)) > - return 0; > - if (x->latency > y->latency) > - return 1; > - if (x->latency < y->latency) > - return -1; > - return 0; > -} > -static void acpi_cst_latency_swap(void *a, void *b, int n) > -{ > - struct acpi_processor_cx *x = a, *y = b; > + for (i = 1; i < length; i++) { > + if (!states[i].valid) > + continue; > > - if (!(x->valid && y->valid)) > - return; > - swap(x->latency, y->latency); > + for (j = i - 1, k = i; j >= 0; j--) { > + if (!states[j].valid) > + continue; > + > + if (states[j].latency > states[k].latency) > + swap(states[j].latency, states[k].latency); > + > + k = j; > + } > + } > } > > static int acpi_processor_power_verify(struct acpi_processor *pr) > @@ -449,10 +447,7 @@ static int acpi_processor_power_verify(struct acpi_processor *pr) > > if (buggy_latency) { > pr_notice("FW issue: working around C-state latencies out of order\n"); > - sort(&pr->power.states[1], max_cstate, > - sizeof(struct acpi_processor_cx), > - acpi_cst_latency_cmp, > - acpi_cst_latency_swap); > + acpi_cst_latency_sort(&pr->power.states[1], max_cstate); > } > > lapic_timer_propagate_broadcast(pr); > -- Applied as 6.10-rc material, thanks! ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/2] lib/sort: Optimize heapsort with double-pop variation 2024-06-20 15:36 ` Julian Sikorski 2024-06-20 20:17 ` Mario Limonciello 2024-06-28 15:15 ` Linux regression tracking (Thorsten Leemhuis) @ 2025-01-15 3:27 ` Luke Jones 2025-01-15 12:49 ` Kuan-Wei Chiu 2 siblings, 1 reply; 20+ messages in thread From: Luke Jones @ 2025-01-15 3:27 UTC (permalink / raw) To: Julian Sikorski, visitorckw; +Cc: akpm, jserv, linux-kernel, lkml, alexdeucher On Thu, 2024-06-20 at 17:36 +0200, Julian Sikorski wrote: > Hello, > > it appears that this patch has caused suspend-to-idle regression: > > https://gitlab.freedesktop.org/drm/amd/-/issues/3436 > Another regression from this has been reported here https://bugzilla.kernel.org/show_bug.cgi?id=219158 Regards, Luke. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/2] lib/sort: Optimize heapsort with double-pop variation 2025-01-15 3:27 ` [PATCH 2/2] lib/sort: Optimize heapsort with double-pop variation Luke Jones @ 2025-01-15 12:49 ` Kuan-Wei Chiu 0 siblings, 0 replies; 20+ messages in thread From: Kuan-Wei Chiu @ 2025-01-15 12:49 UTC (permalink / raw) To: Luke Jones Cc: Julian Sikorski, akpm, jserv, linux-kernel, lkml, alexdeucher, perex, tiwai, linux-sound, chuang, hui.wang (+Cc sound developers) Hi Luke, On Wed, Jan 15, 2025 at 04:27:52PM +1300, Luke Jones wrote: > On Thu, 2024-06-20 at 17:36 +0200, Julian Sikorski wrote: > > Hello, > > > > it appears that this patch has caused suspend-to-idle regression: > > > > https://gitlab.freedesktop.org/drm/amd/-/issues/3436 > > > > Another regression from this has been reported here > https://bugzilla.kernel.org/show_bug.cgi?id=219158 > Thank you for reporting this regression! From a quick look, this seems to be caused by yet another broken compare function. In sound/pci/hda/hda_auto_parser.c, the compare_input_type() function can lead to sorting issues when both is_headset_mic and is_headphone_mic are true for a and b. This can result in a situation where both a < b and b < a hold true, violating the antisymmetry and transitivity required by sort(). Additionally, the comments about "swap" and "don't swap" seem to make incorrect assumptions about how sort() works. Regardless of this optimization patch, sort() may swap a and b without comparing them. Could you help test the following code? If it works, I'll submit it as an official patch. Regards, Kuan-Wei diff --git a/sound/pci/hda/hda_auto_parser.c b/sound/pci/hda/hda_auto_parser.c index 84393f4f429d..5502ec09b584 100644 --- a/sound/pci/hda/hda_auto_parser.c +++ b/sound/pci/hda/hda_auto_parser.c @@ -73,6 +73,8 @@ static int compare_input_type(const void *ap, const void *bp) return (int)(a->type - b->type); /* If has both hs_mic and hp_mic, pick the hs_mic ahead of hp_mic. */ + if (a->is_headset_mic && b->is_headset_mic && a->is_headphone_mic && b->is_headphone_mic) + return (int)(b->has_boost_on_pin - a->has_boost_on_pin); if (a->is_headset_mic && b->is_headphone_mic) return -1; /* don't swap */ else if (a->is_headphone_mic && b->is_headset_mic) ^ permalink raw reply related [flat|nested] 20+ messages in thread
end of thread, other threads:[~2025-01-15 12:49 UTC | newest]
Thread overview: 20+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-01-13 3:13 [PATCH 0/2] lib/sort: Optimize the number of swaps and comparisons Kuan-Wei Chiu
2024-01-13 3:13 ` [PATCH 1/2] lib/sort: Optimize heapsort for equal elements in sift-down path Kuan-Wei Chiu
2024-01-13 3:13 ` [PATCH 2/2] lib/sort: Optimize heapsort with double-pop variation Kuan-Wei Chiu
2024-06-20 15:36 ` Julian Sikorski
2024-06-20 20:17 ` Mario Limonciello
2024-06-28 15:15 ` Linux regression tracking (Thorsten Leemhuis)
2024-06-28 17:10 ` Kuan-Wei Chiu
2024-06-29 5:03 ` Linux regression tracking (Thorsten Leemhuis)
[not found] ` <20240630210809.37550-1-visitorckw@gmail.com>
2024-06-30 21:13 ` [PATCH] ACPI: processor_idle: Fix invalid comparison with insertion sort for latency Kuan-Wei Chiu
2024-07-01 4:42 ` [PATCH v2] " Kuan-Wei Chiu
2024-07-01 5:06 ` Greg KH
2024-07-01 15:17 ` Mario Limonciello
2024-07-01 16:10 ` [PATCH v3] " Kuan-Wei Chiu
2024-07-01 17:36 ` Rafael J. Wysocki
2024-07-01 20:56 ` [PATCH v4] " Kuan-Wei Chiu
2024-07-02 7:28 ` Julian Sikorski
2024-07-02 12:59 ` Mario Limonciello
2024-07-02 18:38 ` Rafael J. Wysocki
2025-01-15 3:27 ` [PATCH 2/2] lib/sort: Optimize heapsort with double-pop variation Luke Jones
2025-01-15 12:49 ` Kuan-Wei Chiu
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox