From: linux@arm.linux.org.uk (Russell King - ARM Linux)
To: linux-arm-kernel@lists.infradead.org
Subject: kmalloc memory slower than malloc
Date: Tue, 10 Sep 2013 13:50:48 +0100
Message-ID: <20130910125048.GD12758@n2100.arm.linux.org.uk>
In-Reply-To: <alpine.DEB.2.02.1309101435430.1305@kelly.ryd.net>
On Tue, Sep 10, 2013 at 02:42:17PM +0200, Thommy Jakobsson wrote:
> Using pgprot_dmacoherent() in mmap they look more similar. Still
> ~10-15% difference, but maybe that is normal for kernel/userspace.
>
> dma_alloc_coherent in kernel 4.257s (s=0)
> kmalloc in kernel 0.126s (s=81370000)
> dma_alloc_coherent userspace 4.907s (s=0)
> kmalloc in userspace 1.815s (s=81370000)
> malloc in userspace 0.566s (s=0)
>
> Note that I was lazy and used the same pgprot for all mappings now, which
> I guess is a violation.
What it means is that the results you end up with are documented to be
"unpredictable" which gives scope to manufacturers to come up with any
behaviour they desire in that situation - and it doesn't have to be
consistent.
What that means is that if you have an area of physical memory mapped as
"normal memory cacheable" and it's also mapped "strongly ordered" elsewhere,
it is entirely legal for one implementation to hit the cache on an access
via the strongly ordered mapping if a cache line exists, whereas another
implementation may bypass the cache even though the line exists.
Furthermore, with such mappings (and this has been true since ARMv3 days)
if you have two such mappings - one cacheable and one non-cacheable, and
the cacheable mapping has dirty cache lines, the dirty cache lines can be
evicted at any moment, overwriting whatever you're doing via the non-
cacheable mapping.
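
To make that hazard concrete, here is a toy userspace model of it. It is
only a sketch: the names (toy_system, write_cacheable, write_uncached,
evict, run_hazard) and the 32-byte line size are made up for illustration,
it is not kernel code, and it assumes a write-back, write-allocate cache
with an uncached alias that always bypasses it. Real implementations are
free to behave differently, which is the whole problem.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define LINE_SIZE 32u /* arbitrary toy cache line size (assumption) */

/* One line of RAM plus one cache line that may shadow it. */
struct toy_system {
    uint8_t ram[LINE_SIZE];   /* what a DMA master such as a GPU reads */
    uint8_t cache[LINE_SIZE]; /* CPU copy held via the cacheable alias */
    bool valid;
    bool dirty;
};

void toy_init(struct toy_system *s)
{
    memset(s, 0, sizeof(*s));
}

/* Store via the cacheable alias: fill the line on a miss, then dirty it. */
void write_cacheable(struct toy_system *s, unsigned off, uint8_t val)
{
    assert(off < LINE_SIZE);
    if (!s->valid) {
        memcpy(s->cache, s->ram, LINE_SIZE); /* line fill from RAM */
        s->valid = true;
    }
    s->cache[off] = val;
    s->dirty = true;
}

/* Store via the non-cacheable alias: in this model it goes straight to
 * RAM and never looks at the cache. */
void write_uncached(struct toy_system *s, unsigned off, uint8_t val)
{
    assert(off < LINE_SIZE);
    s->ram[off] = val;
}

/* Eviction, which hardware may perform at any moment: a dirty line is
 * written back whole, including bytes the CPU never touched. */
void evict(struct toy_system *s)
{
    if (s->valid && s->dirty)
        memcpy(s->ram, s->cache, LINE_SIZE);
    s->valid = false;
    s->dirty = false;
}

/* Returns the byte a device would read at offset 0 after the sequence. */
uint8_t run_hazard(void)
{
    struct toy_system s;

    toy_init(&s);
    write_cacheable(&s, 1, 0xAA); /* dirties the line; offset 0 is stale */
    write_uncached(&s, 0, 0x55);  /* e.g. a command byte written to RAM */
    evict(&s);                    /* write-back clobbers the 0x55 */
    return s.ram[0];
}
```

In this model run_hazard() returns the stale 0x00 rather than the 0x55
written through the uncached alias, because the write-back covers the whole
line and not only the byte the cacheable store touched.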
I've recently had a hard-to-track bug doing exactly that in a non-mainline
kernel on ARMv7 because someone decided it was a good idea to bypass my
test in arch/arm/mm/ioremap.c preventing system RAM being ioremap()d. It
led to one boot in 20ish locking up because a GPU command stream was
being overwritten by the dirty cache lines being evicted after the GPU
had started to read from that memory - or, if you typed "reboot" at the
right moment during a previous boot, you could get it to occur 100% of
the time.
I notice you turn off VM_IO - you don't want to do that...