* [PATCH v2] mm, compaction: properly signal and act upon lock and need_sched() contention
2014-05-23 8:34 ` Vlastimil Babka
@ 2014-05-23 10:49 ` Shawn Guo
2014-05-23 15:07 ` Kevin Hilman
` (2 subsequent siblings)
3 siblings, 0 replies; 9+ messages in thread
From: Shawn Guo @ 2014-05-23 10:49 UTC (permalink / raw)
To: linux-arm-kernel
On Fri, May 23, 2014 at 10:34:55AM +0200, Vlastimil Babka wrote:
> From: Vlastimil Babka <vbabka@suse.cz>
> Date: Fri, 23 May 2014 10:18:56 +0200
> Subject: mm-compaction-properly-signal-and-act-upon-lock-and-need_sched-contention-fix2
>
> Step 1: Change function name and comment between v1 and v2 so that the return
> value signals the opposite thing.
> Step 2: Change the call sites to reflect the opposite return value.
> Step 3: ???
> Step 4: Make a complete fool of yourself.
>
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Tested-by: Shawn Guo <shawn.guo@linaro.org>
> ---
> mm/compaction.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/mm/compaction.c b/mm/compaction.c
> index a525cd4..5175019 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -237,13 +237,13 @@ static inline bool compact_should_abort(struct compact_control *cc)
> if (need_resched()) {
> if (cc->mode == MIGRATE_ASYNC) {
> cc->contended = true;
> - return false;
> + return true;
> }
>
> cond_resched();
> }
>
> - return true;
> + return false;
> }
>
> /* Returns true if the page is within a block suitable for migration to */
> --
> 1.8.4.5
>
>
^ permalink raw reply [flat|nested] 9+ messages in thread* [PATCH v2] mm, compaction: properly signal and act upon lock and need_sched() contention
2014-05-23 8:34 ` Vlastimil Babka
2014-05-23 10:49 ` Shawn Guo
@ 2014-05-23 15:07 ` Kevin Hilman
2014-05-30 16:59 ` Stephen Warren
2014-06-02 13:35 ` Fabio Estevam
3 siblings, 0 replies; 9+ messages in thread
From: Kevin Hilman @ 2014-05-23 15:07 UTC (permalink / raw)
To: linux-arm-kernel
Vlastimil Babka <vbabka@suse.cz> writes:
> On 05/23/2014 04:48 AM, Shawn Guo wrote:
>> On 23 May 2014 07:49, Kevin Hilman <khilman@linaro.org> wrote:
>>> On Fri, May 16, 2014 at 2:47 AM, Vlastimil Babka <vbabka@suse.cz> wrote:
>>>> Compaction uses compact_checklock_irqsave() function to periodically check for
>>>> lock contention and need_resched() to either abort async compaction, or to
>>>> free the lock, schedule and retake the lock. When aborting, cc->contended is
>>>> set to signal the contended state to the caller. Two problems have been
>>>> identified in this mechanism.
>>>
>>> This patch (or later version) has hit next-20140522 (in the form
>>> commit 645ceea9331bfd851bc21eea456dda27862a10f4) and according to my
>>> bisect, appears to be the culprit of several boot failures on ARM
>>> platforms.
>>
>> On i.MX6 where CMA is enabled, the commit causes the drivers calling
>> dma_alloc_coherent() fail to probe. Tracing it a little bit, it seems
>> dma_alloc_from_contiguous() always return page as NULL after this
>> commit.
>>
>> Shawn
>>
>
> Really sorry, guys :/
>
> -----8<-----
> From: Vlastimil Babka <vbabka@suse.cz>
> Date: Fri, 23 May 2014 10:18:56 +0200
> Subject: mm-compaction-properly-signal-and-act-upon-lock-and-need_sched-contention-fix2
>
> Step 1: Change function name and comment between v1 and v2 so that the return
> value signals the opposite thing.
> Step 2: Change the call sites to reflect the opposite return value.
> Step 3: ???
> Step 4: Make a complete fool of yourself.
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Tested-by: Kevin Hilman <khilman@linaro.org>
I verified that this fixes the boot failures I've seen on ARM (i.MX6 and
Marvell Armada 370).
Thanks for the quick fix.
Kevin
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH v2] mm, compaction: properly signal and act upon lock and need_sched() contention
2014-05-23 8:34 ` Vlastimil Babka
2014-05-23 10:49 ` Shawn Guo
2014-05-23 15:07 ` Kevin Hilman
@ 2014-05-30 16:59 ` Stephen Warren
2014-06-02 13:35 ` Fabio Estevam
3 siblings, 0 replies; 9+ messages in thread
From: Stephen Warren @ 2014-05-30 16:59 UTC (permalink / raw)
To: linux-arm-kernel
On 05/23/2014 02:34 AM, Vlastimil Babka wrote:
> On 05/23/2014 04:48 AM, Shawn Guo wrote:
>> On 23 May 2014 07:49, Kevin Hilman <khilman@linaro.org> wrote:
>>> On Fri, May 16, 2014 at 2:47 AM, Vlastimil Babka <vbabka@suse.cz> wrote:
>>>> Compaction uses compact_checklock_irqsave() function to periodically check for
>>>> lock contention and need_resched() to either abort async compaction, or to
>>>> free the lock, schedule and retake the lock. When aborting, cc->contended is
>>>> set to signal the contended state to the caller. Two problems have been
>>>> identified in this mechanism.
>>>
>>> This patch (or later version) has hit next-20140522 (in the form
>>> commit 645ceea9331bfd851bc21eea456dda27862a10f4) and according to my
>>> bisect, appears to be the culprit of several boot failures on ARM
>>> platforms.
>>
>> On i.MX6 where CMA is enabled, the commit causes the drivers calling
>> dma_alloc_coherent() fail to probe. Tracing it a little bit, it seems
>> dma_alloc_from_contiguous() always return page as NULL after this
>> commit.
>>
>> Shawn
>>
>
> Really sorry, guys :/
>
> -----8<-----
> From: Vlastimil Babka <vbabka@suse.cz>
> Date: Fri, 23 May 2014 10:18:56 +0200
> Subject: mm-compaction-properly-signal-and-act-upon-lock-and-need_sched-contention-fix2
>
> Step 1: Change function name and comment between v1 and v2 so that the return
> value signals the opposite thing.
> Step 2: Change the call sites to reflect the opposite return value.
> Step 3: ???
> Step 4: Make a complete fool of yourself.
Tested-by: Stephen Warren <swarren@nvidia.com>
This fix doesn't seem to be in linux-next yet:-(
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH v2] mm, compaction: properly signal and act upon lock and need_sched() contention
2014-05-23 8:34 ` Vlastimil Babka
` (2 preceding siblings ...)
2014-05-30 16:59 ` Stephen Warren
@ 2014-06-02 13:35 ` Fabio Estevam
2014-06-02 14:33 ` [PATCH -mm] mm, compaction: properly signal and act upon lock and need_sched() contention - fix Vlastimil Babka
3 siblings, 1 reply; 9+ messages in thread
From: Fabio Estevam @ 2014-06-02 13:35 UTC (permalink / raw)
To: linux-arm-kernel
Vlastimil,
On Fri, May 23, 2014 at 5:34 AM, Vlastimil Babka <vbabka@suse.cz> wrote:
> Really sorry, guys :/
>
> -----8<-----
> From: Vlastimil Babka <vbabka@suse.cz>
> Date: Fri, 23 May 2014 10:18:56 +0200
> Subject: mm-compaction-properly-signal-and-act-upon-lock-and-need_sched-contention-fix2
>
> Step 1: Change function name and comment between v1 and v2 so that the return
> value signals the opposite thing.
> Step 2: Change the call sites to reflect the opposite return value.
> Step 3: ???
> Step 4: Make a complete fool of yourself.
>
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> ---
> mm/compaction.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/mm/compaction.c b/mm/compaction.c
> index a525cd4..5175019 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -237,13 +237,13 @@ static inline bool compact_should_abort(struct compact_control *cc)
> if (need_resched()) {
> if (cc->mode == MIGRATE_ASYNC) {
> cc->contended = true;
> - return false;
> + return true;
> }
>
> cond_resched();
> }
>
> - return true;
> + return false;
> }
This patch is still not in linux-next.
Could you please submit it formally?
^ permalink raw reply [flat|nested] 9+ messages in thread* [PATCH -mm] mm, compaction: properly signal and act upon lock and need_sched() contention - fix
2014-06-02 13:35 ` Fabio Estevam
@ 2014-06-02 14:33 ` Vlastimil Babka
2014-06-02 15:18 ` Fabio Estevam
2014-06-02 20:09 ` David Rientjes
0 siblings, 2 replies; 9+ messages in thread
From: Vlastimil Babka @ 2014-06-02 14:33 UTC (permalink / raw)
To: linux-arm-kernel
compact_should_abort() returns true instead of false and vice versa
due to changes between v1 and v2 of the patch. This makes both async
and sync compaction abort with high probability, and has been reported
to cause e.g. soft lockups on some ARM boards, or drivers calling
dma_alloc_coherent() fail to probe with CMA enabled on different boards.
This patch fixes the return value to match comments and callers expecations.
Reported-and-tested-by: Kevin Hilman <khilman@linaro.org>
Reported-and-tested-by: Shawn Guo <shawn.guo@linaro.org>
Tested-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
mm/compaction.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/mm/compaction.c b/mm/compaction.c
index a525cd4..5175019 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -237,13 +237,13 @@ static inline bool compact_should_abort(struct compact_control *cc)
if (need_resched()) {
if (cc->mode == MIGRATE_ASYNC) {
cc->contended = true;
- return false;
+ return true;
}
cond_resched();
}
- return true;
+ return false;
}
/* Returns true if the page is within a block suitable for migration to */
-- 1.8.4.5
^ permalink raw reply related [flat|nested] 9+ messages in thread* [PATCH -mm] mm, compaction: properly signal and act upon lock and need_sched() contention - fix
2014-06-02 14:33 ` [PATCH -mm] mm, compaction: properly signal and act upon lock and need_sched() contention - fix Vlastimil Babka
@ 2014-06-02 15:18 ` Fabio Estevam
2014-06-02 20:09 ` David Rientjes
1 sibling, 0 replies; 9+ messages in thread
From: Fabio Estevam @ 2014-06-02 15:18 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Jun 2, 2014 at 11:33 AM, Vlastimil Babka <vbabka@suse.cz> wrote:
> compact_should_abort() returns true instead of false and vice versa
> due to changes between v1 and v2 of the patch. This makes both async
> and sync compaction abort with high probability, and has been reported
> to cause e.g. soft lockups on some ARM boards, or drivers calling
> dma_alloc_coherent() fail to probe with CMA enabled on different boards.
>
> This patch fixes the return value to match comments and callers expecations.
>
> Reported-and-tested-by: Kevin Hilman <khilman@linaro.org>
> Reported-and-tested-by: Shawn Guo <shawn.guo@linaro.org>
> Tested-by: Stephen Warren <swarren@nvidia.com>
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Tested-by: Fabio Estevam <fabio.estevam@freescale.com>
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH -mm] mm, compaction: properly signal and act upon lock and need_sched() contention - fix
2014-06-02 14:33 ` [PATCH -mm] mm, compaction: properly signal and act upon lock and need_sched() contention - fix Vlastimil Babka
2014-06-02 15:18 ` Fabio Estevam
@ 2014-06-02 20:09 ` David Rientjes
1 sibling, 0 replies; 9+ messages in thread
From: David Rientjes @ 2014-06-02 20:09 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, 2 Jun 2014, Vlastimil Babka wrote:
> compact_should_abort() returns true instead of false and vice versa
> due to changes between v1 and v2 of the patch. This makes both async
> and sync compaction abort with high probability, and has been reported
> to cause e.g. soft lockups on some ARM boards, or drivers calling
> dma_alloc_coherent() fail to probe with CMA enabled on different boards.
>
> This patch fixes the return value to match comments and callers expecations.
>
> Reported-and-tested-by: Kevin Hilman <khilman@linaro.org>
> Reported-and-tested-by: Shawn Guo <shawn.guo@linaro.org>
> Tested-by: Stephen Warren <swarren@nvidia.com>
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: David Rientjes <rientjes@google.com>
^ permalink raw reply [flat|nested] 9+ messages in thread