From: andreiw@motorola.com (Andrei Warkentin)
To: linux-arm-kernel@lists.infradead.org
Subject: MMC and reliable write - was: since when does ARM map the kernel memory in sections?
Date: Wed, 27 Apr 2011 14:18:16 -0500
Message-ID: <BANLkTimMvGfkmcXKrM=vtFNGqdoUJpqwcw@mail.gmail.com>
In-Reply-To: <20110427130719.GE5832@shareable.org>

On Wed, Apr 27, 2011 at 8:07 AM, Jamie Lokier <jamie@shareable.org> wrote:
> Andrei Warkentin wrote:
>> I think this basically says - don't end up with corrupt flash if I
>> pull the power when doing this MMC transaction.
>> If you pull power during a regular write, you could end up with ALL
>> erase units affected being wiped.
>>
>> Note, that the new definition of reliable writes provides a guarantee
>> to a sector boundary. So if you interrupt
>> the transaction, you will end up with [new data] followed by [old
>> data]. The old definition guaranteed the entire range,
>> but the transaction was only reliable when done over a sector or erase unit.
>
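To make the new-definition guarantee concrete, here is a toy model in C. It is only a sketch under stated assumptions: the names (toy_card, toy_reliable_write, the "budget" of sectors committed before power loss) are made up for illustration, the card is a flat array of sectors with no erase units, and sectors are assumed to commit in order. It is not MMC driver code.

```c
#include <assert.h>
#include <string.h>

#define SECTOR_SIZE 512
#define NUM_SECTORS 4

/* Toy card for this sketch: a flat array of sectors, no erase units. */
struct toy_card {
	unsigned char data[NUM_SECTORS][SECTOR_SIZE];
};

/* Fill the whole card with one byte value, e.g. 'o' for "old data". */
void toy_card_fill(struct toy_card *c, unsigned char value)
{
	memset(c->data, value, sizeof(c->data));
}

/*
 * Model a reliable write of `count` sectors starting at `first`, with
 * power lost once `budget` sectors have been committed (hypothetical
 * parameter). Each sector is either fully new or fully old, so an
 * interrupted write leaves [new data] followed by [old data].
 * Returns the number of sectors committed.
 */
int toy_reliable_write(struct toy_card *c, int first, int count,
		       unsigned char value, int budget)
{
	int done = 0;

	assert(first >= 0 && count >= 0 && first + count <= NUM_SECTORS);
	while (done < count && done < budget) {
		memset(c->data[first + done], value, SECTOR_SIZE);
		done++;
	}
	return done;
}
```

With a card full of old data, writing four sectors with a budget of two leaves sectors 0-1 entirely new and sectors 2-3 entirely old. No sector is torn, and nothing outside the written range is touched.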
> The old definition might not have been implemented in practice, or
> might have caused performance problems -- or maybe it just wasn't that
> useful, because it's so different from what hard-disk-like filesystems
> expect of a block device.
>
>> This means I jumped the gun on implementing REQ_FUA as reliable write,
>> as REQ_FUA says nothing about atomicity.
>> OTOH, I don't think anything in the block layer expects massive data
>> corruption on power loss. In my defence, I saw REQ_FUA
>> as being "prevent data corruption during power loss", hence the
>> reliable write via REQ_FUA in mmc layer.
>>
>> So my question -
>> a) how should reliable writes be handled?
>
> If your understanding is this:
>
>   - "Reliable Write" only affects the range being written
>
>   - "Normal Write" can corrupt ANY random part of the flash
>     (because you don't know where the physical erase blocks are, or
>     what reorganising it might provoke.)
>
> Then the answer's pretty clear.
> You have to use "Reliable Write" for everything.
>
>> REQ_META?
>
> No, that's a scheduling hint; you can't assume filesystems
> consistently label "metadata needed for filesystem integrity" with
> that flag.  (And databases and VMs have similar needs, but don't get
> to choose REQ_ flags).
>
> But even if they did, wouldn't a single normal write, from the above
> description, potentially corrupt all previously written metadata
> anyway, making it pointless?
Gah... yes.
>
>> b) how do we make sure to not wind up with data corruption and MMCs
>> for work loads where you know power can be removed at any moment?
>
>> We could always turn on reliable writes (not good perf wise). We could
>> turn on reliable writes for a particular range (enhanced user
>> partition).  We could also turn on reliable writes for a specific
>> hardware partition.
>
> It might have to be simply a mount option - let the user decide their
> priorities.
So basically add a new REQ_ flag - something like REQ_SAFE, which
would ensure that data on block storage is not corrupted by
interrupting this write (or even after the write, if the card does
some optimizations). We already have a flag that ensures corruption
doesn't occur because of local-to-disk caches - REQ_FUA, so this would
just mean thinking about what effects REQ_FUA already has that haven't
been considered. On a (spinning) disk, I can't imagine that
interrupting a REQ_FUA write would cause data loss anywhere other than
where the data was being written.

Then it would be as simple as a mount flag that makes all (write)
accesses FUA accesses, to ensure the desired behavior on platforms
where power could be cut at any moment.

What do you think?
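
A rough sketch of that idea in C, to show the shape of it rather than a patch. Everything here is hypothetical: the sk_ names, the always_fua mount option, and the flag bit values are made up for illustration and are not the kernel's definitions. The only part taken from this thread is the mapping of a FUA write to a reliable write in the mmc layer.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag bits for this sketch only; NOT the kernel's values. */
enum {
	SK_REQ_WRITE = 1u << 0,
	SK_REQ_FUA   = 1u << 1,
};

/* Hypothetical per-mount option: treat every write as a FUA write. */
struct sk_mount {
	bool always_fua;
};

/* Promote writes to FUA when the mount asks for it; reads are untouched. */
unsigned int sk_prepare_flags(const struct sk_mount *m, unsigned int flags)
{
	assert(m);
	if (m->always_fua && (flags & SK_REQ_WRITE))
		flags |= SK_REQ_FUA;
	return flags;
}

/* Driver-side model: a FUA write is issued as a reliable write. */
bool sk_use_reliable_write(unsigned int flags)
{
	return (flags & SK_REQ_WRITE) && (flags & SK_REQ_FUA);
}
```

With always_fua set, every write comes out of sk_prepare_flags() with the FUA bit and is issued as a reliable write. Without it, only writes the filesystem explicitly marked FUA are, which keeps the performance cost opt-in.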
Yes, all write transactions for MMC are contiguous.
A