From: Russell King <rmk+lkml@arm.linux.org.uk>
To: James Bottomley <James.Bottomley@SteelEye.com>,
Dave Miller <davem@redhat.com>
Cc: Tejun Heo <htejun@gmail.com>,
axboe@suse.de, bzolnier@gmail.com,
james.steward@dynamicratings.com, jgarzik@pobox.com,
linux-kernel@vger.kernel.org
Subject: Re: [PATCHSET] block: fix PIO cache coherency bug
Date: Fri, 13 Jan 2006 18:20:35 +0000
Message-ID: <20060113182035.GC25849@flint.arm.linux.org.uk>
In-Reply-To: <1137167419.3365.5.camel@mulgrave>
On Fri, Jan 13, 2006 at 09:50:19AM -0600, James Bottomley wrote:
> Actually, this doesn't look to be the correct thing to do. The
> dma_map/unmap don't make the data coherent with respect to the user
> space, only with respect to the kernel space. I've never liked this
> (and indeed I wrote an OLS paper in 2004 trying to explain how we could
> fix it) but that's our current model.
>
> Our classic path for data on machines is that the driver makes the
> kernel coherent and then whatever's transferring from the page cache to
> the user makes user space coherent. It sounds, therefore, like
> whatever's broken (what is the problem, by the way?) is broken in the
> second half (page cache to user) not in the first half (driver to
> kernel).
I think you're misunderstanding the issue. I'll give you essentially
my understanding of the explanation that Dave Miller gave me a number
of years ago. This is from memory, so Dave may wish to correct it.
1. When a driver DMAs data into a page cache page, it is written directly
   to RAM and is made visible to the kernel mapping via the DMA API. As
   a result, there will be no cache lines associated with the kernel
   mapping at the point when the driver hands the page back to the page
   cache.

   However, in the PIO case, there is the possibility that the data read
   from the device into the kernel mapping results in cache lines
   associated with the page. Moreover, if the cache is write-allocate,
   you _will_ have cache lines.

   Therefore, you have two completely differing system states depending
   on how the driver decided to transfer data from the device to the page
   cache.

   As such, drivers must ensure that PIO data transfers have the same
   system state guarantees as DMA data transfers.

   ISTR davem recommended flush_dcache_page() be used for this.
2. (this is my own) The cachetlb document specifies quite clearly what
   is required whenever a page cache page is written to - that is,
   flush_dcache_page() is called. The situation when a driver uses PIO
   quite clearly violates the requirements set out in that document.
From (2), it is quite clear that flush_dcache_page() is the correct
function to use, otherwise we would end up with a random set of cache
states for pages in the page cache. (1) merely reinforces that the
driver is the correct place for the decision to be made. In fact, it's
the only part of the kernel which _knows_ what needs to be done.
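
To make the two system states in (1) concrete, here is a minimal
userspace sketch. It is not kernel code and not taken from the patchset.
Everything prefixed toy_ is hypothetical: toy_flush_dcache_page() only
stands in for the real flush_dcache_page(), and the model assumes a
write-back, write-allocate cache where a reader using a different
mapping of the page (toy_user_read()) sees RAM only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_PAGE_BYTES 64	/* toy size, unrelated to the real PAGE_SIZE */

/* Toy model of one page cache page (all names here are hypothetical). */
struct toy_page {
	unsigned char ram[TOY_PAGE_BYTES];	/* backing RAM */
	unsigned char kcache[TOY_PAGE_BYTES];	/* lines cached via the kernel mapping */
	bool kcache_dirty;			/* dirty kernel-mapping lines exist */
};

/* DMA path: the device writes RAM directly; no kernel cache lines remain. */
static void toy_dma_read(struct toy_page *p, const unsigned char *dev)
{
	memcpy(p->ram, dev, TOY_PAGE_BYTES);
	p->kcache_dirty = false;
}

/* PIO path: the CPU stores through the kernel mapping, so with a
 * write-allocate cache the data sits in cache lines, not yet in RAM. */
static void toy_pio_read(struct toy_page *p, const unsigned char *dev)
{
	memcpy(p->kcache, dev, TOY_PAGE_BYTES);
	p->kcache_dirty = true;
}

/* Stand-in for flush_dcache_page(): write dirty lines back and drop them. */
static void toy_flush_dcache_page(struct toy_page *p)
{
	if (p->kcache_dirty) {
		memcpy(p->ram, p->kcache, TOY_PAGE_BYTES);
		p->kcache_dirty = false;
	}
}

/* Assumption: a reader using another mapping of the page sees RAM only. */
static unsigned char toy_user_read(const struct toy_page *p, size_t off)
{
	assert(off < TOY_PAGE_BYTES);
	return p->ram[off];
}
```

In this model, toy_pio_read() followed by toy_flush_dcache_page() leaves
the page in the same state as toy_dma_read(): no dirty kernel-mapping
lines, and the data in RAM. Without the flush, toy_user_read() can
return stale data after a PIO transfer.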
--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of: 2.6 Serial core