* dma_alloc_coherent fails in framebuffer
@ 2012-10-14 5:28 Tony Prisk
2012-10-14 20:34 ` Tony Prisk
0 siblings, 1 reply; 15+ messages in thread
From: Tony Prisk @ 2012-10-14 5:28 UTC (permalink / raw)
To: linux-arm-kernel
Up until 07 Oct, drivers/video/wm8505-fb.c was working fine, but on
11 Oct, when I did another pull from Linus, dma_alloc_coherent
suddenly started failing to allocate the framebuffer.
I did a quick look back and found this:
ARM: add coherent dma ops
arch_is_coherent is problematic as it is a global symbol. This
doesn't work for multi-platform kernels or platforms which can support
per device coherent DMA.
This adds arm_coherent_dma_ops to be used for devices which connected
coherently (i.e. to the ACP port on Cortex-A9 or A15). The arm_dma_ops
are modified at boot when arch_is_coherent is true.
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
This is the only patch lately that I could find (not that I would claim
to be any good at finding things) that is related to the problem. Could
it have caused the allocations to fail?
Regards
Tony P
^ permalink raw reply [flat|nested] 15+ messages in thread
* dma_alloc_coherent fails in framebuffer
2012-10-14 5:28 dma_alloc_coherent fails in framebuffer Tony Prisk
@ 2012-10-14 20:34 ` Tony Prisk
2012-10-14 22:26 ` Tony Prisk
2012-10-15 9:45 ` Mel Gorman
0 siblings, 2 replies; 15+ messages in thread
From: Tony Prisk @ 2012-10-14 20:34 UTC (permalink / raw)
To: linux-arm-kernel
On Sun, 2012-10-14 at 18:28 +1300, Tony Prisk wrote:
> Up until 07 Oct, drivers/video/wm8505-fb.c was working fine, but on the
> 11 Oct when I did another pull from linus all of a sudden
> dma_alloc_coherent is failing to allocate the framebuffer any longer.
>
> I did a quick look back and found this:
>
> ARM: add coherent dma ops
>
> arch_is_coherent is problematic as it is a global symbol. This
> doesn't work for multi-platform kernels or platforms which can support
> per device coherent DMA.
>
> This adds arm_coherent_dma_ops to be used for devices which connected
> coherently (i.e. to the ACP port on Cortex-A9 or A15). The arm_dma_ops
> are modified at boot when arch_is_coherent is true.
>
> Signed-off-by: Rob Herring <rob.herring@calxeda.com>
> Cc: Russell King <linux@arm.linux.org.uk>
> Cc: Marek Szyprowski <m.szyprowski@samsung.com>
> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
>
>
> This is the only patch lately that I could find (not that I would claim
> to be any good at finding things) that is related to the problem. Could
> it have caused the allocations to fail?
>
> Regards
> Tony P
Have done a bit more digging and found the cause - not Rob's patch so
apologies.
The cause of the regression is this patch:
From f40d1e42bb988d2a26e8e111ea4c4c7bac819b7e Mon Sep 17 00:00:00 2001
From: Mel Gorman <mgorman@suse.de>
Date: Mon, 8 Oct 2012 16:32:36 -0700
Subject: [PATCH 2/3] mm: compaction: acquire the zone->lock as late as
possible
Up until then, the framebuffer allocation with dma_alloc_coherent(...)
was fine. From this patch onwards, allocations fail.
I don't know how this patch would affect CMA allocations, but it seems
to be causing the issue (or at least, it has caused an error in
arch-vt8500 to become visible).
Perhaps someone who understands -mm could explain the best way to
troubleshoot the cause of this problem?
Regards
Tony P
^ permalink raw reply [flat|nested] 15+ messages in thread
* dma_alloc_coherent fails in framebuffer
2012-10-14 20:34 ` Tony Prisk
@ 2012-10-14 22:26 ` Tony Prisk
2012-10-15 6:42 ` Tomasz Figa
2012-10-15 9:45 ` Mel Gorman
1 sibling, 1 reply; 15+ messages in thread
From: Tony Prisk @ 2012-10-14 22:26 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, 2012-10-15 at 09:34 +1300, Tony Prisk wrote:
> On Sun, 2012-10-14 at 18:28 +1300, Tony Prisk wrote:
> > Up until 07 Oct, drivers/video/wm8505-fb.c was working fine, but on the
> > 11 Oct when I did another pull from linus all of a sudden
> > dma_alloc_coherent is failing to allocate the framebuffer any longer.
> >
> > I did a quick look back and found this:
> >
> > ARM: add coherent dma ops
> >
> > arch_is_coherent is problematic as it is a global symbol. This
> > doesn't work for multi-platform kernels or platforms which can support
> > per device coherent DMA.
> >
> > This adds arm_coherent_dma_ops to be used for devices which connected
> > coherently (i.e. to the ACP port on Cortex-A9 or A15). The arm_dma_ops
> > are modified at boot when arch_is_coherent is true.
> >
> > Signed-off-by: Rob Herring <rob.herring@calxeda.com>
> > Cc: Russell King <linux@arm.linux.org.uk>
> > Cc: Marek Szyprowski <m.szyprowski@samsung.com>
> > Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
> >
> >
> > This is the only patch lately that I could find (not that I would claim
> > to be any good at finding things) that is related to the problem. Could
> > it have caused the allocations to fail?
> >
> > Regards
> > Tony P
>
> Have done a bit more digging and found the cause - not Rob's patch so
> apologies.
>
> The cause of the regression is this patch:
>
> From f40d1e42bb988d2a26e8e111ea4c4c7bac819b7e Mon Sep 17 00:00:00 2001
> From: Mel Gorman <mgorman@suse.de>
> Date: Mon, 8 Oct 2012 16:32:36 -0700
> Subject: [PATCH 2/3] mm: compaction: acquire the zone->lock as late as
> possible
>
> Up until then, the framebuffer allocation with dma_alloc_coherent(...)
> was fine. From this patch onwards, allocations fail.
>
> I don't know how this patch would effect CMA allocations, but it seems
> to be causing the issue (or at least, it's caused an error in
> arch-vt8500 to become visible).
>
> Perhaps someone who understand -mm could explain the best way to
> troubleshoot the cause of this problem?
>
>
> Regards
> Tony P
Have done a bit more testing.
Disabling memory compaction makes no difference.
Disabling CMA fixes/hides the problem?!
Regards
Tony P
^ permalink raw reply [flat|nested] 15+ messages in thread
* dma_alloc_coherent fails in framebuffer
2012-10-14 22:26 ` Tony Prisk
@ 2012-10-15 6:42 ` Tomasz Figa
2012-10-15 8:03 ` Tony Prisk
0 siblings, 1 reply; 15+ messages in thread
From: Tomasz Figa @ 2012-10-15 6:42 UTC (permalink / raw)
To: linux-arm-kernel
Hi Tony,
On Monday 15 of October 2012 11:26:31 Tony Prisk wrote:
> On Mon, 2012-10-15 at 09:34 +1300, Tony Prisk wrote:
> > On Sun, 2012-10-14 at 18:28 +1300, Tony Prisk wrote:
> > > Up until 07 Oct, drivers/video/wm8505-fb.c was working fine, but on
> > > the
> > > 11 Oct when I did another pull from linus all of a sudden
> > > dma_alloc_coherent is failing to allocate the framebuffer any
> > > longer.
> > >
> > > I did a quick look back and found this:
> > >
> > > ARM: add coherent dma ops
> > >
> > > arch_is_coherent is problematic as it is a global symbol. This
> > > doesn't work for multi-platform kernels or platforms which can
> > > support
> > > per device coherent DMA.
> > >
> > > This adds arm_coherent_dma_ops to be used for devices which
> > > connected
> > > coherently (i.e. to the ACP port on Cortex-A9 or A15). The
> > > arm_dma_ops
> > > are modified at boot when arch_is_coherent is true.
> > >
> > > Signed-off-by: Rob Herring <rob.herring@calxeda.com>
> > > Cc: Russell King <linux@arm.linux.org.uk>
> > > Cc: Marek Szyprowski <m.szyprowski@samsung.com>
> > > Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
> > >
> > >
> > > This is the only patch lately that I could find (not that I would
> > > claim
> > > to be any good at finding things) that is related to the problem.
> > > Could
> > > it have caused the allocations to fail?
> > >
> > > Regards
> > > Tony P
> >
> > Have done a bit more digging and found the cause - not Rob's patch so
> > apologies.
> >
> > The cause of the regression is this patch:
> >
> > From f40d1e42bb988d2a26e8e111ea4c4c7bac819b7e Mon Sep 17 00:00:00 2001
> > From: Mel Gorman <mgorman@suse.de>
> > Date: Mon, 8 Oct 2012 16:32:36 -0700
> > Subject: [PATCH 2/3] mm: compaction: acquire the zone->lock as late as
> > possible
> >
> > Up until then, the framebuffer allocation with dma_alloc_coherent(...)
> > was fine. From this patch onwards, allocations fail.
> >
> > I don't know how this patch would effect CMA allocations, but it seems
> > to be causing the issue (or at least, it's caused an error in
> > arch-vt8500 to become visible).
> >
> > Perhaps someone who understand -mm could explain the best way to
> > troubleshoot the cause of this problem?
> >
> >
> > Regards
> > Tony P
>
> Have done a bit more testing..
>
> Disabling Memory Compaction makes no difference.
> Disabling CMA fixes/hides the problem. ?!?!?!
Could you post your kernel log when it isn't working?
Do you have the default CMA reserved pool in Kconfig set big enough to
serve this allocation?
I'm not sure what kind of allocation this framebuffer driver does, but if
it needs to allocate memory from atomic context then possibly this patch
series has something to do with it:
http://thread.gmane.org/gmane.linux.ports.arm.kernel/182697/focus=182699
CC'ing Marek Szyprowski <m.szyprowski@samsung.com>
Best regards,
Tomasz Figa
^ permalink raw reply [flat|nested] 15+ messages in thread
* dma_alloc_coherent fails in framebuffer
2012-10-15 6:42 ` Tomasz Figa
@ 2012-10-15 8:03 ` Tony Prisk
2012-10-15 13:35 ` Marek Szyprowski
0 siblings, 1 reply; 15+ messages in thread
From: Tony Prisk @ 2012-10-15 8:03 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, 2012-10-15 at 08:42 +0200, Tomasz Figa wrote:
> Hi Tony,
>
> On Monday 15 of October 2012 11:26:31 Tony Prisk wrote:
> > On Mon, 2012-10-15 at 09:34 +1300, Tony Prisk wrote:
> > > On Sun, 2012-10-14 at 18:28 +1300, Tony Prisk wrote:
> > > > Up until 07 Oct, drivers/video/wm8505-fb.c was working fine, but on
> > > > the
> > > > 11 Oct when I did another pull from linus all of a sudden
> > > > dma_alloc_coherent is failing to allocate the framebuffer any
> > > > longer.
> > > >
> > > > I did a quick look back and found this:
> > > >
> > > > ARM: add coherent dma ops
> > > >
> > > > arch_is_coherent is problematic as it is a global symbol. This
> > > > doesn't work for multi-platform kernels or platforms which can
> > > > support
> > > > per device coherent DMA.
> > > >
> > > > This adds arm_coherent_dma_ops to be used for devices which
> > > > connected
> > > > coherently (i.e. to the ACP port on Cortex-A9 or A15). The
> > > > arm_dma_ops
> > > > are modified at boot when arch_is_coherent is true.
> > > >
> > > > Signed-off-by: Rob Herring <rob.herring@calxeda.com>
> > > > Cc: Russell King <linux@arm.linux.org.uk>
> > > > Cc: Marek Szyprowski <m.szyprowski@samsung.com>
> > > > Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
> > > >
> > > >
> > > > This is the only patch lately that I could find (not that I would
> > > > claim
> > > > to be any good at finding things) that is related to the problem.
> > > > Could
> > > > it have caused the allocations to fail?
> > > >
> > > > Regards
> > > > Tony P
> > >
> > > Have done a bit more digging and found the cause - not Rob's patch so
> > > apologies.
> > >
> > > The cause of the regression is this patch:
> > >
> > > From f40d1e42bb988d2a26e8e111ea4c4c7bac819b7e Mon Sep 17 00:00:00 2001
> > > From: Mel Gorman <mgorman@suse.de>
> > > Date: Mon, 8 Oct 2012 16:32:36 -0700
> > > > Subject: [PATCH 2/3] mm: compaction: acquire the zone->lock as late as
> > > > possible
> > >
> > > Up until then, the framebuffer allocation with dma_alloc_coherent(...)
> > > was fine. From this patch onwards, allocations fail.
> > >
> > > I don't know how this patch would effect CMA allocations, but it seems
> > > to be causing the issue (or at least, it's caused an error in
> > > arch-vt8500 to become visible).
> > >
> > > Perhaps someone who understand -mm could explain the best way to
> > > troubleshoot the cause of this problem?
> > >
> > >
> > > Regards
> > > Tony P
> > >
> > >
> > > _______________________________________________
> > > linux-arm-kernel mailing list
> > > linux-arm-kernel at lists.infradead.org
> > > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> >
> > Have done a bit more testing..
> >
> > Disabling Memory Compaction makes no difference.
> > Disabling CMA fixes/hides the problem. ?!?!?!
>
> Could you post your kernel log when it isn't working?
>
> Do you have the default CMA reserved pool in Kconfig set big enough to
> serve this allocation?
>
> I'm not sure what kind of allocation this framebuffer driver does, but if
> it needs to allocate memory from atomic context then possibly this patch
> series has something to do with it:
> http://thread.gmane.org/gmane.linux.ports.arm.kernel/182697/focus=182699
>
> CC'ing Marek Szyprowski <m.szyprowski@samsung.com>
>
> Best regards,
> Tomasz Figa
I set CMA to 16MB (and also tried 32MB) in Kconfig, but it made no
difference. The framebuffer is only trying to allocate ~1.5MB via
dma_alloc_coherent().
I haven't captured a kernel log, but will do.
Thanks for the feedback.
Regards
Tony P
^ permalink raw reply [flat|nested] 15+ messages in thread
* dma_alloc_coherent fails in framebuffer
2012-10-14 20:34 ` Tony Prisk
2012-10-14 22:26 ` Tony Prisk
@ 2012-10-15 9:45 ` Mel Gorman
2012-10-15 18:28 ` Tony Prisk
1 sibling, 1 reply; 15+ messages in thread
From: Mel Gorman @ 2012-10-15 9:45 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Oct 15, 2012 at 09:34:55AM +1300, Tony Prisk wrote:
> On Sun, 2012-10-14 at 18:28 +1300, Tony Prisk wrote:
> > Up until 07 Oct, drivers/video/wm8505-fb.c was working fine, but on the
> > 11 Oct when I did another pull from linus all of a sudden
> > dma_alloc_coherent is failing to allocate the framebuffer any longer.
> >
> > I did a quick look back and found this:
> >
> > ARM: add coherent dma ops
> >
> > arch_is_coherent is problematic as it is a global symbol. This
> > doesn't work for multi-platform kernels or platforms which can support
> > per device coherent DMA.
> >
> > This adds arm_coherent_dma_ops to be used for devices which connected
> > coherently (i.e. to the ACP port on Cortex-A9 or A15). The arm_dma_ops
> > are modified at boot when arch_is_coherent is true.
> >
> > Signed-off-by: Rob Herring <rob.herring@calxeda.com>
> > Cc: Russell King <linux@arm.linux.org.uk>
> > Cc: Marek Szyprowski <m.szyprowski@samsung.com>
> > Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
> >
> >
> > This is the only patch lately that I could find (not that I would claim
> > to be any good at finding things) that is related to the problem. Could
> > it have caused the allocations to fail?
> >
> > Regards
> > Tony P
>
> Have done a bit more digging and found the cause - not Rob's patch so
> apologies.
>
> The cause of the regression is this patch:
>
> From f40d1e42bb988d2a26e8e111ea4c4c7bac819b7e Mon Sep 17 00:00:00 2001
> From: Mel Gorman <mgorman@suse.de>
> Date: Mon, 8 Oct 2012 16:32:36 -0700
> Subject: [PATCH 2/3] mm: compaction: acquire the zone->lock as late as
> possible
>
> Up until then, the framebuffer allocation with dma_alloc_coherent(...)
> was fine. From this patch onwards, allocations fail.
>
Was this found through bisection or some other means?
There was a bug in that series that broke CMA but it was commit bb13ffeb
(mm: compaction: cache if a pageblock was scanned and no pages were
isolated) and it was fixed by 62726059 (mm: compaction: fix bit ranges
in {get,clear,set}_pageblock_skip()). So it should have been fixed by
3.7-rc1 and probably was included by the time you pulled in October 11th
but bisection would be a pain. There were problems with that series during
development but tests were completing for other people.
Just in case, is this still broken in 3.7-rc1?
> I don't know how this patch would effect CMA allocations, but it seems
> to be causing the issue (or at least, it's caused an error in
> arch-vt8500 to become visible).
>
> Perhaps someone who understand -mm could explain the best way to
> troubleshoot the cause of this problem?
>
If you are comfortable with ftrace, it can be used to narrow down where
the exact failure is occurring but if you're not comfortable with that
then the easiest is a bunch of printks starting in alloc_contig_range()
to see at what point and why it returns failure.
It's not obvious at the moment why that patch would cause an allocation
problem. It's the type of patch that if it was wrong it would fail every
time for everyone, not just for a single driver.
--
Mel Gorman
SUSE Labs
^ permalink raw reply [flat|nested] 15+ messages in thread
* dma_alloc_coherent fails in framebuffer
2012-10-15 8:03 ` Tony Prisk
@ 2012-10-15 13:35 ` Marek Szyprowski
0 siblings, 0 replies; 15+ messages in thread
From: Marek Szyprowski @ 2012-10-15 13:35 UTC (permalink / raw)
To: linux-arm-kernel
Hello,
On Monday, October 15, 2012 10:04 AM Tony Prisk wrote:
> On Mon, 2012-10-15 at 08:42 +0200, Tomasz Figa wrote:
> > Hi Tony,
> >
> > On Monday 15 of October 2012 11:26:31 Tony Prisk wrote:
> > > On Mon, 2012-10-15 at 09:34 +1300, Tony Prisk wrote:
> > > > On Sun, 2012-10-14 at 18:28 +1300, Tony Prisk wrote:
> > > > > Up until 07 Oct, drivers/video/wm8505-fb.c was working fine, but on
> > > > > the
> > > > > 11 Oct when I did another pull from linus all of a sudden
> > > > > dma_alloc_coherent is failing to allocate the framebuffer any
> > > > > longer.
> > > > >
> > > > > I did a quick look back and found this:
> > > > >
> > > > > ARM: add coherent dma ops
> > > > >
> > > > > arch_is_coherent is problematic as it is a global symbol. This
> > > > > doesn't work for multi-platform kernels or platforms which can
> > > > > support
> > > > > per device coherent DMA.
> > > > >
> > > > > This adds arm_coherent_dma_ops to be used for devices which
> > > > > connected
> > > > > coherently (i.e. to the ACP port on Cortex-A9 or A15). The
> > > > > arm_dma_ops
> > > > > are modified at boot when arch_is_coherent is true.
> > > > >
> > > > > Signed-off-by: Rob Herring <rob.herring@calxeda.com>
> > > > > Cc: Russell King <linux@arm.linux.org.uk>
> > > > > Cc: Marek Szyprowski <m.szyprowski@samsung.com>
> > > > > Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
> > > > >
> > > > >
> > > > > This is the only patch lately that I could find (not that I would
> > > > > claim
> > > > > to be any good at finding things) that is related to the problem.
> > > > > Could
> > > > > it have caused the allocations to fail?
> > > > >
> > > > > Regards
> > > > > Tony P
> > > >
> > > > Have done a bit more digging and found the cause - not Rob's patch so
> > > > apologies.
> > > >
> > > > The cause of the regression is this patch:
> > > >
> > > > From f40d1e42bb988d2a26e8e111ea4c4c7bac819b7e Mon Sep 17 00:00:00 2001
> > > > From: Mel Gorman <mgorman@suse.de>
> > > > Date: Mon, 8 Oct 2012 16:32:36 -0700
> > > > > Subject: [PATCH 2/3] mm: compaction: acquire the zone->lock as late as
> > > > > possible
> > > >
> > > > Up until then, the framebuffer allocation with dma_alloc_coherent(...)
> > > > was fine. From this patch onwards, allocations fail.
> > > >
> > > > I don't know how this patch would effect CMA allocations, but it seems
> > > > to be causing the issue (or at least, it's caused an error in
> > > > arch-vt8500 to become visible).
> > > >
> > > > Perhaps someone who understand -mm could explain the best way to
> > > > troubleshoot the cause of this problem?
> > > >
> > > >
> > > > Regards
> > > > Tony P
> > > >
> > > Have done a bit more testing..
> > >
> > > Disabling Memory Compaction makes no difference.
> > > Disabling CMA fixes/hides the problem. ?!?!?!
> >
> > Could you post your kernel log when it isn't working?
> >
> > Do you have the default CMA reserved pool in Kconfig set big enough to
> > serve this allocation?
> >
> > I'm not sure what kind of allocation this framebuffer driver does, but if
> > it needs to allocate memory from atomic context then possibly this patch
> > series has something to do with it:
> > http://thread.gmane.org/gmane.linux.ports.arm.kernel/182697/focus=182699
> >
> > CC'ing Marek Szyprowski <m.szyprowski@samsung.com>
>
> I set CMA to 16MB (and also tried 32MB) in Kconfig, but it made no
> difference. The framebuffer is only trying to allocate ~1.5MB via
> dma_alloc_coherent().
>
> I haven't captured a kernel log, but will do.
>
> Thanks for the feedback.
Commit 627260595ca6abcb16d68a3732bac6b547e112d6 "mm: compaction: fix bit ranges in
{get,clear,set}_pageblock_skip()" should fix the issues with broken CMA allocations.
Please check whether today's v3.7-rc1 works for you.
For more information, please see the thread at
http://thread.gmane.org/gmane.linux.kernel/1365503/
Best regards
--
Marek Szyprowski
Samsung Poland R&D Center
^ permalink raw reply [flat|nested] 15+ messages in thread
* dma_alloc_coherent fails in framebuffer
2012-10-15 9:45 ` Mel Gorman
@ 2012-10-15 18:28 ` Tony Prisk
2012-10-16 2:17 ` Bob Liu
0 siblings, 1 reply; 15+ messages in thread
From: Tony Prisk @ 2012-10-15 18:28 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, 2012-10-15 at 10:45 +0100, Mel Gorman wrote:
> On Mon, Oct 15, 2012 at 09:34:55AM +1300, Tony Prisk wrote:
> > On Sun, 2012-10-14 at 18:28 +1300, Tony Prisk wrote:
> > > Up until 07 Oct, drivers/video/wm8505-fb.c was working fine, but on the
> > > 11 Oct when I did another pull from linus all of a sudden
> > > dma_alloc_coherent is failing to allocate the framebuffer any longer.
> > >
> > > I did a quick look back and found this:
> > >
> > > ARM: add coherent dma ops
> > >
> > > arch_is_coherent is problematic as it is a global symbol. This
> > > doesn't work for multi-platform kernels or platforms which can support
> > > per device coherent DMA.
> > >
> > > This adds arm_coherent_dma_ops to be used for devices which connected
> > > coherently (i.e. to the ACP port on Cortex-A9 or A15). The arm_dma_ops
> > > are modified at boot when arch_is_coherent is true.
> > >
> > > Signed-off-by: Rob Herring <rob.herring@calxeda.com>
> > > Cc: Russell King <linux@arm.linux.org.uk>
> > > Cc: Marek Szyprowski <m.szyprowski@samsung.com>
> > > Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
> > >
> > >
> > > This is the only patch lately that I could find (not that I would claim
> > > to be any good at finding things) that is related to the problem. Could
> > > it have caused the allocations to fail?
> > >
> > > Regards
> > > Tony P
> >
> > Have done a bit more digging and found the cause - not Rob's patch so
> > apologies.
> >
> > The cause of the regression is this patch:
> >
> > From f40d1e42bb988d2a26e8e111ea4c4c7bac819b7e Mon Sep 17 00:00:00 2001
> > From: Mel Gorman <mgorman@suse.de>
> > Date: Mon, 8 Oct 2012 16:32:36 -0700
> > Subject: [PATCH 2/3] mm: compaction: acquire the zone->lock as late as
> > possible
> >
> > Up until then, the framebuffer allocation with dma_alloc_coherent(...)
> > was fine. From this patch onwards, allocations fail.
> >
>
> Was this found through bisection or some other means?
>
> There was a bug in that series that broke CMA but it was commit bb13ffeb
> (mm: compaction: cache if a pageblock was scanned and no pages were
> isolated) and it was fixed by 62726059 (mm: compaction: fix bit ranges
> in {get,clear,set}_pageblock_skip()). So it should have been fixed by
> 3.7-rc1 and probably was included by the time you pulled in October 11th
> but bisection would be a pain. There were problems with that series during
> development but tests were completing for other people.
>
> Just in case, is this still broken in 3.7-rc1?
Still broken, although the printks might have cleared things up a bit.
>
> > I don't know how this patch would effect CMA allocations, but it seems
> > to be causing the issue (or at least, it's caused an error in
> > arch-vt8500 to become visible).
> >
> > Perhaps someone who understand -mm could explain the best way to
> > troubleshoot the cause of this problem?
> >
>
> If you are comfortable with ftrace, it can be used to narrow down where
> the exact failure is occurring but if you're not comfortable with that
> then the easiest is a bunch of printks starting in alloc_contig_range()
> to see at what point and why it returns failure.
>
> It's not obvious at the moment why that patch would cause an allocation
> problem. It's the type of patch that if it was wrong it would fail every
> time for everyone, not just for a single driver.
>
I added some printks to see what was happening.
In arch/arm/mm/dma-mapping.c, arm_dma_alloc() calls out to
dma_alloc_from_coherent().
This returns 0, because:
    mem = dev->dma_mem;
    if (!mem)
        return 0;
and then arm_dma_alloc() falls back on __dma_alloc().
I suspect the reason this fault is a bit 'weird' is that it's
effectively not using dma_alloc_from_coherent() at all, but falling back on
__dma_alloc() all the time, and sometimes that fails.
Why it caused a problem on that particular commit I don't know - but it
was reproducible by adding/removing it.
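The control flow described above can be modelled in user space. This is
only a minimal sketch, not the kernel code: struct device, struct
coherent_pool, alloc_from_coherent() and model_dma_alloc() are hypothetical
stand-ins, and malloc() merely takes the place of the __dma_alloc()
fallback. It illustrates that a device whose dma_mem is NULL never uses a
per-device pool and always ends up on the fallback path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a per-device coherent pool. */
struct coherent_pool {
	unsigned char *base;	/* start of the pool */
	size_t size;		/* total bytes in the pool */
	size_t used;		/* bytes already handed out */
};

/* Hypothetical stand-in for struct device; only dma_mem matters here. */
struct device {
	struct coherent_pool *dma_mem;	/* NULL when no pool was declared */
};

/*
 * Models the check quoted above: returns 0 when the device has no pool,
 * telling the caller to fall back. Returns 1 when the pool handled the
 * request, with *ret set to the buffer, or to NULL if the pool is full.
 */
static int alloc_from_coherent(struct device *dev, size_t size, void **ret)
{
	struct coherent_pool *mem = dev->dma_mem;

	if (!mem)
		return 0;

	if (size > mem->size - mem->used) {
		*ret = NULL;
		return 1;
	}
	*ret = mem->base + mem->used;
	mem->used += size;
	return 1;
}

/* Models the caller: try the per-device pool, otherwise use the fallback. */
static void *model_dma_alloc(struct device *dev, size_t size, int *used_fallback)
{
	void *buf = NULL;

	assert(dev != NULL && used_fallback != NULL);

	if (alloc_from_coherent(dev, size, &buf)) {
		*used_fallback = 0;
		return buf;
	}
	*used_fallback = 1;
	return malloc(size);	/* stand-in for the __dma_alloc() fallback */
}
```

With dev.dma_mem left NULL, model_dma_alloc() sets used_fallback to 1 for
every request, which is the behaviour the printks showed.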
Regards
Tony P
^ permalink raw reply [flat|nested] 15+ messages in thread
* dma_alloc_coherent fails in framebuffer
2012-10-15 18:28 ` Tony Prisk
@ 2012-10-16 2:17 ` Bob Liu
2012-10-16 5:54 ` Tony Prisk
2012-10-16 14:41 ` James Bottomley
0 siblings, 2 replies; 15+ messages in thread
From: Bob Liu @ 2012-10-16 2:17 UTC (permalink / raw)
To: linux-arm-kernel
On Tue, Oct 16, 2012 at 2:28 AM, Tony Prisk <linux@prisktech.co.nz> wrote:
> On Mon, 2012-10-15 at 10:45 +0100, Mel Gorman wrote:
>> On Mon, Oct 15, 2012 at 09:34:55AM +1300, Tony Prisk wrote:
>> > On Sun, 2012-10-14 at 18:28 +1300, Tony Prisk wrote:
>> > > Up until 07 Oct, drivers/video/wm8505-fb.c was working fine, but on the
>> > > 11 Oct when I did another pull from linus all of a sudden
>> > > dma_alloc_coherent is failing to allocate the framebuffer any longer.
>> > >
>> > > I did a quick look back and found this:
>> > >
>> > > ARM: add coherent dma ops
>> > >
>> > > arch_is_coherent is problematic as it is a global symbol. This
>> > > doesn't work for multi-platform kernels or platforms which can support
>> > > per device coherent DMA.
>> > >
>> > > This adds arm_coherent_dma_ops to be used for devices which connected
>> > > coherently (i.e. to the ACP port on Cortex-A9 or A15). The arm_dma_ops
>> > > are modified at boot when arch_is_coherent is true.
>> > >
>> > > Signed-off-by: Rob Herring <rob.herring@calxeda.com>
>> > > Cc: Russell King <linux@arm.linux.org.uk>
>> > > Cc: Marek Szyprowski <m.szyprowski@samsung.com>
>> > > Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
>> > >
>> > >
>> > > This is the only patch lately that I could find (not that I would claim
>> > > to be any good at finding things) that is related to the problem. Could
>> > > it have caused the allocations to fail?
>> > >
>> > > Regards
>> > > Tony P
>> >
>> > Have done a bit more digging and found the cause - not Rob's patch so
>> > apologies.
>> >
>> > The cause of the regression is this patch:
>> >
>> > From f40d1e42bb988d2a26e8e111ea4c4c7bac819b7e Mon Sep 17 00:00:00 2001
>> > From: Mel Gorman <mgorman@suse.de>
>> > Date: Mon, 8 Oct 2012 16:32:36 -0700
>> > Subject: [PATCH 2/3] mm: compaction: acquire the zone->lock as late as
>> > possible
>> >
>> > Up until then, the framebuffer allocation with dma_alloc_coherent(...)
>> > was fine. From this patch onwards, allocations fail.
>> >
>>
>> Was this found through bisection or some other means?
>>
>> There was a bug in that series that broke CMA but it was commit bb13ffeb
>> (mm: compaction: cache if a pageblock was scanned and no pages were
>> isolated) and it was fixed by 62726059 (mm: compaction: fix bit ranges
>> in {get,clear,set}_pageblock_skip()). So it should have been fixed by
>> 3.7-rc1 and probably was included by the time you pulled in October 11th
>> but bisection would be a pain. There were problems with that series during
>> development but tests were completing for other people.
>>
>> Just in case, is this still broken in 3.7-rc1?
>
> Still broken. Although the printk's might have cleared it up a bit.
>>
>> > I don't know how this patch would effect CMA allocations, but it seems
>> > to be causing the issue (or at least, it's caused an error in
>> > arch-vt8500 to become visible).
>> >
>> > Perhaps someone who understand -mm could explain the best way to
>> > troubleshoot the cause of this problem?
>> >
>>
>> If you are comfortable with ftrace, it can be used to narrow down where
>> the exact failure is occurring but if you're not comfortable with that
>> then the easiest is a bunch of printks starting in alloc_contig_range()
>> to see at what point and why it returns failure.
>>
>> It's not obvious at the moment why that patch would cause an allocation
>> problem. It's the type of patch that if it was wrong it would fail every
>> time for everyone, not just for a single driver.
>>
>
> I added some printk's to see what was happening.
>
> from arch/arm/mm/dma-mapping.c: arm_dma_alloc(..) it calls out to:
> dma_alloc_from_coherent().
>
> This returns 0, because:
> mem = dev->dma_mem
> if (!mem) return 0;
>
> and then arm_dma_alloc() falls back on __dma_alloc(..)
>
>
> I suspect the reason this fault is a bit 'weird' is because its
> effectively not using alloc_from_coherent at all, but falling back on
> __dma_alloc all the time, and sometimes it fails.
>
I think you need to declare that memory using
dma_declare_coherent_memory() before dma_alloc_from_coherent() can
allocate from it.
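As a rough user-space illustration of that suggestion (a sketch only:
declare_pool(), pool_alloc(), struct device and struct coherent_pool are
hypothetical names, and declare_pool() does not mirror the real signature
of dma_declare_coherent_memory()), declaring a pool amounts to giving the
device a non-NULL dma_mem before its first allocation:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures. */
struct coherent_pool {
	unsigned char *base;
	size_t size;
	size_t used;
};

struct device {
	struct coherent_pool *dma_mem;	/* NULL until a pool is declared */
};

/*
 * Attach a pool to the device. Returns 0 on success, -1 if the region is
 * empty or the device already has a pool.
 */
static int declare_pool(struct device *dev, struct coherent_pool *pool,
			unsigned char *base, size_t size)
{
	assert(dev != NULL && pool != NULL);

	if (!base || size == 0 || dev->dma_mem)
		return -1;

	pool->base = base;
	pool->size = size;
	pool->used = 0;
	dev->dma_mem = pool;
	return 0;
}

/* Returns a buffer from the pool, or NULL if there is no pool or no room. */
static void *pool_alloc(struct device *dev, size_t size)
{
	struct coherent_pool *mem = dev->dma_mem;
	void *buf;

	if (!mem || size > mem->size - mem->used)
		return NULL;

	buf = mem->base + mem->used;
	mem->used += size;
	return buf;
}
```

Before declare_pool() is called, pool_alloc() has nothing to allocate from;
afterwards requests are served from the declared region. Whether this is
the right approach for wm8505-fb is not settled at this point in the thread.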
> Why it caused a problem on that particular commit I don't know - but it
> was reproducible by adding/removing it.
>
>
> Regards
> Tony P
--
Regards,
--Bob
^ permalink raw reply [flat|nested] 15+ messages in thread
* dma_alloc_coherent fails in framebuffer
2012-10-16 2:17 ` Bob Liu
@ 2012-10-16 5:54 ` Tony Prisk
2012-10-16 6:50 ` Tony Prisk
2012-10-16 14:41 ` James Bottomley
1 sibling, 1 reply; 15+ messages in thread
From: Tony Prisk @ 2012-10-16 5:54 UTC (permalink / raw)
To: linux-arm-kernel
On Tue, 2012-10-16 at 10:17 +0800, Bob Liu wrote:
> On Tue, Oct 16, 2012 at 2:28 AM, Tony Prisk <linux@prisktech.co.nz> wrote:
> > On Mon, 2012-10-15 at 10:45 +0100, Mel Gorman wrote:
> >> On Mon, Oct 15, 2012 at 09:34:55AM +1300, Tony Prisk wrote:
> >> > On Sun, 2012-10-14 at 18:28 +1300, Tony Prisk wrote:
> >> > > Up until 07 Oct, drivers/video/wm8505-fb.c was working fine, but on the
> >> > > 11 Oct when I did another pull from linus all of a sudden
> >> > > dma_alloc_coherent is failing to allocate the framebuffer any longer.
> >> > >
> >> > > I did a quick look back and found this:
> >> > >
> >> > > ARM: add coherent dma ops
> >> > >
> >> > > arch_is_coherent is problematic as it is a global symbol. This
> >> > > doesn't work for multi-platform kernels or platforms which can support
> >> > > per device coherent DMA.
> >> > >
> >> > > This adds arm_coherent_dma_ops to be used for devices which connected
> >> > > coherently (i.e. to the ACP port on Cortex-A9 or A15). The arm_dma_ops
> >> > > are modified at boot when arch_is_coherent is true.
> >> > >
> >> > > Signed-off-by: Rob Herring <rob.herring@calxeda.com>
> >> > > Cc: Russell King <linux@arm.linux.org.uk>
> >> > > Cc: Marek Szyprowski <m.szyprowski@samsung.com>
> >> > > Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
> >> > >
> >> > >
> >> > > This is the only patch lately that I could find (not that I would claim
> >> > > to be any good at finding things) that is related to the problem. Could
> >> > > it have caused the allocations to fail?
> >> > >
> >> > > Regards
> >> > > Tony P
> >> >
> >> > Have done a bit more digging and found the cause - not Rob's patch so
> >> > apologies.
> >> >
> >> > The cause of the regression is this patch:
> >> >
> >> > From f40d1e42bb988d2a26e8e111ea4c4c7bac819b7e Mon Sep 17 00:00:00 2001
> >> > From: Mel Gorman <mgorman@suse.de>
> >> > Date: Mon, 8 Oct 2012 16:32:36 -0700
> >> > Subject: [PATCH 2/3] mm: compaction: acquire the zone->lock as late as
> >> > possible
> >> >
> >> > Up until then, the framebuffer allocation with dma_alloc_coherent(...)
> >> > was fine. From this patch onwards, allocations fail.
> >> >
> >>
> >> Was this found through bisection or some other means?
> >>
> >> There was a bug in that series that broke CMA but it was commit bb13ffeb
> >> (mm: compaction: cache if a pageblock was scanned and no pages were
> >> isolated) and it was fixed by 62726059 (mm: compaction: fix bit ranges
> >> in {get,clear,set}_pageblock_skip()). So it should have been fixed by
> >> 3.7-rc1 and probably was included by the time you pulled in October 11th
> >> but bisection would be a pain. There were problems with that series during
> >> development but tests were completing for other people.
> >>
> >> Just in case, is this still broken in 3.7-rc1?
> >
> > Still broken. Although the printk's might have cleared it up a bit.
> >>
> >> > I don't know how this patch would effect CMA allocations, but it seems
> >> > to be causing the issue (or at least, it's caused an error in
> >> > arch-vt8500 to become visible).
> >> >
> >> > Perhaps someone who understand -mm could explain the best way to
> >> > troubleshoot the cause of this problem?
> >> >
> >>
> >> If you are comfortable with ftrace, it can be used to narrow down where
> >> the exact failure is occurring but if you're not comfortable with that
> >> then the easiest is a bunch of printks starting in alloc_contig_range()
> >> to see at what point and why it returns failure.
> >>
> >> It's not obvious at the moment why that patch would cause an allocation
> >> problem. It's the type of patch that if it was wrong it would fail every
> >> time for everyone, not just for a single driver.
> >>
> >
> > I added some printk's to see what was happening.
> >
> > from arch/arm/mm/dma-mapping.c: arm_dma_alloc(..) it calls out to:
> > dma_alloc_from_coherent().
> >
> > This returns 0, because:
> > mem = dev->dma_mem
> > if (!mem) return 0;
> >
> > and then arm_dma_alloc() falls back on __dma_alloc(..)
> >
> >
> > I suspect the reason this fault is a bit 'weird' is because its
> > effectively not using alloc_from_coherent at all, but falling back on
> > __dma_alloc all the time, and sometimes it fails.
> >
>
> I think you need to declare that memory using
> dma_declare_coherent_memory() before
> alloc_from_coherent.
>
> > Why it caused a problem on that particular commit I don't know - but it
> > was reproducible by adding/removing it.
> >
> >
> > Regards
> > Tony P
> >
>
I finally found the link to this patch which caused the problem - and
may still be the cause of my problems :)
> >>>
> >>> From f40d1e42bb988d2a26e8e111ea4c4c7bac819b7e Mon Sep 17 00:00:00 2001
> >>> From: Mel Gorman <mgorman@suse.de>
> >>> Date: Mon, 8 Oct 2012 16:32:36 -0700
> >>> Subject: [PATCH 2/3] mm: compaction: acquire the zone->lock as late as
> >>> possible
In mm/page_alloc.c:alloc_contig_range()

	...
	outer_end = isolate_freepages_range(&cc, outer_start, end);
	if (!outer_end) {
		ret = -EBUSY;
		goto done;
	}
	..

It is always returning via the !outer_end test with -EBUSY.

isolate_freepages_range() was one of the functions modified by
the above mentioned patch.

Around in a big circle and back to the start :)
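
A rough userspace sketch of the control flow in the excerpt, assuming only
what the excerpt shows: a zero return from the isolation step is reported
as -EBUSY. model_isolate_range(), model_alloc_contig_range() and the
block_ok flag are hypothetical stand-ins for illustration, not kernel
functions.

```c
#include <errno.h>

/*
 * Hypothetical stand-in for isolate_freepages_range(): returns the end
 * pfn on success, or 0 if isolation failed. block_ok is an assumption
 * used to drive the model; the real function inspects the pages.
 */
unsigned long model_isolate_range(unsigned long start, unsigned long end,
				  int block_ok)
{
	(void)start;
	return block_ok ? end : 0;
}

/* Mirrors the excerpt: a zero outer_end is turned into -EBUSY. */
int model_alloc_contig_range(unsigned long start, unsigned long end,
			     int block_ok)
{
	unsigned long outer_end = model_isolate_range(start, end, block_ok);

	if (!outer_end)
		return -EBUSY;
	return 0;
}
```

Calling model_alloc_contig_range() with block_ok set to 0 takes the same
exit as the failing framebuffer allocation.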
Regards
Tony P
^ permalink raw reply [flat|nested] 15+ messages in thread
* dma_alloc_coherent fails in framebuffer
2012-10-16 5:54 ` Tony Prisk
@ 2012-10-16 6:50 ` Tony Prisk
2012-10-16 7:58 ` Mel Gorman
0 siblings, 1 reply; 15+ messages in thread
From: Tony Prisk @ 2012-10-16 6:50 UTC (permalink / raw)
To: linux-arm-kernel
On Tue, 2012-10-16 at 18:54 +1300, Tony Prisk wrote:
> I finally found the link to this patch which caused the problem - and
> may still be the cause of my problems :)
>
> > >>>
> > >>> From f40d1e42bb988d2a26e8e111ea4c4c7bac819b7e Mon Sep 17 00:00:00 2001
> > >>> From: Mel Gorman <mgorman@suse.de>
> > >>> Date: Mon, 8 Oct 2012 16:32:36 -0700
> > >>> Subject: [PATCH 2/3] mm: compaction: acquire the zone->lock as late as
> > >>> possible
>
> In mm/page_alloc.c:alloc_contig_range()
>
> ...
> outer_end = isolate_freepages_range(&cc, outer_start, end);
> if (!outer_end) {
> ret = -EBUSY;
> goto done;
> }
> ..
>
> It is always returning via the !outer_end test with -EBUSY.
>
> isolate_freepages_range() was one of the functions modified by
> the above mentioned patch.
>
> Around in a big circle and back to the start :)
>
> Regards
> Tony P
Found the new code which has produced the problem, but perhaps someone
with more knowledge can explain why. Commit
f40d1e42bb988d2a26e8e111ea4c4c7bac819b7e introduced an additional check
which wasn't previously there.

+	/*
+	 * If strict isolation is requested by CMA then check that all the
+	 * pages requested were isolated. If there were any failures, 0 is
+	 * returned and CMA will fail.
+	 */
+	if (strict && nr_strict_required != total_isolated)
+		total_isolated = 0;

At the moment, my platform always fails to allocate the framebuffer
because nr_strict_required != total_isolated.

total_isolated exceeds nr_strict_required, and the check treats that as
an error.

For all the other drivers using dma_alloc_coherent, it seems to keep
increasing nr_strict_required until it gets a match with total_isolated:
For the EHCI driver:
[ 4.740000] total_isolated = 512, nr_strict_required=1
[ 4.740000] FINAL! total_isolated = 512, nr_strict_required=1
[ 4.740000] strict && nr_strict_required != total_isolated
[ 4.750000] total_isolated = 512, nr_strict_required=2
[ 4.750000] FINAL! total_isolated = 512, nr_strict_required=2
[ 4.750000] strict && nr_strict_required != total_isolated
...
[ 13.220000] total_isolated = 512, nr_strict_required=511
[ 13.220000] FINAL! total_isolated = 512, nr_strict_required=511
[ 13.220000] strict && nr_strict_required != total_isolated
[ 13.230000] total_isolated = 512, nr_strict_required=512
[ 13.230000] FINAL! total_isolated = 512, nr_strict_required=512
The framebuffer gives up trying when:
[ 0.730000] total_isolated = 1024, nr_strict_required=983
[ 0.730000] FINAL! total_isolated = 1024, nr_strict_required=983
[ 0.730000] strict && nr_strict_required != total_isolated
[ 0.730000] total_isolated = 1024, nr_strict_required=999
[ 0.730000] FINAL! total_isolated = 1024, nr_strict_required=999
[ 0.730000] strict && nr_strict_required != total_isolated
[ 0.740000] total_isolated = 1024, nr_strict_required=1015
[ 0.740000] FINAL! total_isolated = 1024, nr_strict_required=1015
[ 0.740000] strict && nr_strict_required != total_isolated
Given that the next step, nr_strict_required + 16 = 1031, would be larger
than total_isolated (1024), I guess it gives up trying and fails.
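
The logged pairs can be replayed against the check quoted above. This is a
minimal userspace sketch, not the kernel function: old_strict_result() is a
hypothetical wrapper that contains only the quoted condition.

```c
/*
 * The condition as introduced by the commit quoted above: any mismatch,
 * in either direction, zeroes the result. The wrapper function itself
 * is hypothetical and exists only to make the condition callable.
 */
unsigned long old_strict_result(int strict, unsigned long nr_strict_required,
				unsigned long total_isolated)
{
	if (strict && nr_strict_required != total_isolated)
		total_isolated = 0;
	return total_isolated;
}
```

With the values from the logs, only the exact match (512, 512) from the
EHCI run survives; every framebuffer pair such as (1015, 1024) is zeroed
even though more pages were isolated than required.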
Any suggestions on how to fix this?
Regards
Tony P
^ permalink raw reply [flat|nested] 15+ messages in thread
* dma_alloc_coherent fails in framebuffer
2012-10-16 6:50 ` Tony Prisk
@ 2012-10-16 7:58 ` Mel Gorman
2012-10-16 8:13 ` Tony Prisk
0 siblings, 1 reply; 15+ messages in thread
From: Mel Gorman @ 2012-10-16 7:58 UTC (permalink / raw)
To: linux-arm-kernel
On Tue, Oct 16, 2012 at 07:50:07PM +1300, Tony Prisk wrote:
> > > > Why it caused a problem on that particular commit I don't know - but it
> > > > was reproducible by adding/removing it.
> > > >
> >
> > I finally found the link to this patch which caused the problem - and
> > may still be the cause of my problems :)
> >
Blast, thanks. This was already identified as being a problem and "fixed"
in https://lkml.org/lkml/2012/10/5/164 but I missed that the fix did not
get picked up before RC1 after all the patches got collapsed together. I'm
very sorry about that, I should have spotted that it didn't make it through.
> Any suggestions on how to fix this?
>
Can you test this to be sure and if it's fine I'll push it to Andrew.
---8<---
mm: compaction: Correct the strict_isolated check for CMA
Thierry reported that the "iron out" patch for isolate_freepages_block()
had problems due to the strict check being too strict with "mm: compaction:
Iron out isolate_freepages_block() and isolate_freepages_range() -fix1".
It's possible that more pages than necessary are isolated but the check
still fails and I missed that this fix was not picked up before RC1. This
has also been identified in RC1 by Tony Prisk and should be addressed by
the following patch.
Signed-off-by: Mel Gorman <mgorman@suse.de>
---
 compaction.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index 2c4ce17..9eef558 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -346,7 +346,7 @@ static unsigned long isolate_freepages_block(struct compact_control *cc,
 	 * pages requested were isolated. If there were any failures, 0 is
 	 * returned and CMA will fail.
 	 */
-	if (strict && nr_strict_required != total_isolated)
+	if (strict && nr_strict_required > total_isolated)
 		total_isolated = 0;

 	if (locked)
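
For illustration only, the patched condition can be exercised in userspace
with the framebuffer values reported earlier in the thread. The struct and
both function names below are hypothetical; only the comparison itself
comes from the patch.

```c
#include <stddef.h>

/* One logged (nr_strict_required, total_isolated) pair. */
struct strict_sample {
	unsigned long required;
	unsigned long isolated;
};

/* Patched condition: only isolating too few pages counts as failure. */
unsigned long new_strict_result(int strict, unsigned long nr_strict_required,
				unsigned long total_isolated)
{
	if (strict && nr_strict_required > total_isolated)
		total_isolated = 0;
	return total_isolated;
}

/* Illustrative helper: how many samples survive the patched check. */
size_t count_accepted(const struct strict_sample *s, size_t n)
{
	size_t i, ok = 0;

	for (i = 0; i < n; i++)
		if (new_strict_result(1, s[i].required, s[i].isolated) != 0)
			ok++;
	return ok;
}
```

Feeding it the pairs (983, 1024), (999, 1024) and (1015, 1024) accepts all
three, while a genuine shortfall such as (1024, 1015) is still rejected.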
^ permalink raw reply related [flat|nested] 15+ messages in thread
* dma_alloc_coherent fails in framebuffer
2012-10-16 7:58 ` Mel Gorman
@ 2012-10-16 8:13 ` Tony Prisk
0 siblings, 0 replies; 15+ messages in thread
From: Tony Prisk @ 2012-10-16 8:13 UTC (permalink / raw)
To: linux-arm-kernel
On Tue, 2012-10-16 at 08:58 +0100, Mel Gorman wrote:
> On Tue, Oct 16, 2012 at 07:50:07PM +1300, Tony Prisk wrote:
> > > > > Why it caused a problem on that particular commit I don't know - but it
> > > > > was reproducible by adding/removing it.
> > > > >
> > >
> > > I finally found the link to this patch which caused the problem - and
> > > may still be the cause of my problems :)
> > >
>
> Blast, thanks. This was already identified as being a problem and "fixed"
> in https://lkml.org/lkml/2012/10/5/164 but I missed that the fix did not
> get picked up before RC1 after all the patches got collapsed together. I'm
> very sorry about that, I should have spotted that it didn't make it through.
>
> > Any suggestions on how to fix this?
> >
>
> Can you test this to be sure and if it's fine I'll push it to Andrew.
>
> ---8<---
> mm: compaction: Correct the strict_isolated check for CMA
>
> Thierry reported that the "iron out" patch for isolate_freepages_block()
> had problems due to the strict check being too strict with "mm: compaction:
> Iron out isolate_freepages_block() and isolate_freepages_range() -fix1".
> It's possible that more pages than necessary are isolated but the check
> still fails and I missed that this fix was not picked up before RC1. This
> has also been identified in RC1 by Tony Prisk and should be addressed by
> the following patch.
>
> Signed-off-by: Mel Gorman <mgorman@suse.de>
> ---
> compaction.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/compaction.c b/mm/compaction.c
> index 2c4ce17..9eef558 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -346,7 +346,7 @@ static unsigned long isolate_freepages_block(struct compact_control *cc,
> * pages requested were isolated. If there were any failures, 0 is
> * returned and CMA will fail.
> */
> - if (strict && nr_strict_required != total_isolated)
> + if (strict && nr_strict_required > total_isolated)
> total_isolated = 0;
>
> if (locked)
I don't need to test that again; that's exactly what I did to fix it
myself :)
Tested-by: Tony Prisk <linux@prisktech.co.nz>
.. if needed.
Nice to know I'm not completely bonkers.
Thanks for your help
Tony P
^ permalink raw reply [flat|nested] 15+ messages in thread
* dma_alloc_coherent fails in framebuffer
2012-10-16 2:17 ` Bob Liu
2012-10-16 5:54 ` Tony Prisk
@ 2012-10-16 14:41 ` James Bottomley
2012-10-17 2:26 ` Bob Liu
1 sibling, 1 reply; 15+ messages in thread
From: James Bottomley @ 2012-10-16 14:41 UTC (permalink / raw)
To: linux-arm-kernel
On Tue, 2012-10-16 at 10:17 +0800, Bob Liu wrote:
> I think you need to declare that memory using
> dma_declare_coherent_memory() before
> alloc_from_coherent.
This isn't true. Almost every platform has a mechanism for
manufacturing coherent memory (in the worst case, they just turn off the
CPU cache on a page and hand it out). The purpose of
dma_declare_coherent_memory() is to allow a per device declaration of
preferred regions ... usually because they reside either on the fast
path to the device or sometimes on the device itself. There are only a
handful of devices which need it, so in the ordinary course of events,
dma_alloc_coherent() is used without any memory declaration.
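
The fallback described in this thread can be modelled in userspace as
follows. This is a simplified sketch under stated assumptions: struct
model_device, model_alloc_from_coherent() and model_dma_alloc() are
hypothetical names, and malloc() merely stands in for whatever the
platform does to produce coherent memory.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified device: only the field relevant here. */
struct model_device {
	void *dma_mem;	/* non-NULL only if a per-device region was declared */
};

/*
 * Stand-in for the per-device pool lookup. Returns 1 if the declared
 * region handled the request (*ret is then the result), 0 if no region
 * was declared and the caller should use the generic allocator.
 */
int model_alloc_from_coherent(struct model_device *dev, size_t size,
			      void **ret)
{
	(void)size;
	if (!dev->dma_mem)
		return 0;
	*ret = dev->dma_mem;	/* model: hand out the declared region */
	return 1;
}

/* Generic path: malloc() stands in for the platform allocator. */
void *model_dma_alloc(struct model_device *dev, size_t size)
{
	void *mem;

	if (model_alloc_from_coherent(dev, size, &mem))
		return mem;
	return malloc(size);
}
```

A device with dma_mem left NULL, which is the ordinary case, still gets
memory from the generic path; the declared region is only a preference.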
James
^ permalink raw reply [flat|nested] 15+ messages in thread
* dma_alloc_coherent fails in framebuffer
2012-10-16 14:41 ` James Bottomley
@ 2012-10-17 2:26 ` Bob Liu
0 siblings, 0 replies; 15+ messages in thread
From: Bob Liu @ 2012-10-17 2:26 UTC (permalink / raw)
To: linux-arm-kernel
On Tue, Oct 16, 2012 at 10:41 PM, James Bottomley
<James.Bottomley@hansenpartnership.com> wrote:
> On Tue, 2012-10-16 at 10:17 +0800, Bob Liu wrote:
>> I think you need to declare that memory using
>> dma_declare_coherent_memory() before
>> alloc_from_coherent.
>
> This isn't true. Almost every platform has a mechanism for
> manufacturing coherent memory (in the worst case, they just turn off the
> CPU cache on a page and hand it out). The purpose of
> dma_declare_coherent_memory() is to allow a per device declaration of
> preferred regions ... usually because they reside either on the fast
> path to the device or sometimes on the device itself. There are only a
> handful of devices which need it, so in the ordinary course of events,
> dma_alloc_coherent() is used without any memory declaration.
>
Sorry for my ambiguity.
It is obviously true that we can use dma_alloc_coherent() without any
memory declaration.
I thought Tony's original idea was to make dma_alloc_from_coherent()
return success.
But the dev->dma_mem check can't pass, so I suggested that he use
dma_declare_coherent_memory() to declare a per-device area first.
Thanks,
--Bob
^ permalink raw reply [flat|nested] 15+ messages in thread
Thread overview: 15+ messages
2012-10-14 5:28 dma_alloc_coherent fails in framebuffer Tony Prisk
2012-10-14 20:34 ` Tony Prisk
2012-10-14 22:26 ` Tony Prisk
2012-10-15 6:42 ` Tomasz Figa
2012-10-15 8:03 ` Tony Prisk
2012-10-15 13:35 ` Marek Szyprowski
2012-10-15 9:45 ` Mel Gorman
2012-10-15 18:28 ` Tony Prisk
2012-10-16 2:17 ` Bob Liu
2012-10-16 5:54 ` Tony Prisk
2012-10-16 6:50 ` Tony Prisk
2012-10-16 7:58 ` Mel Gorman
2012-10-16 8:13 ` Tony Prisk
2012-10-16 14:41 ` James Bottomley
2012-10-17 2:26 ` Bob Liu