From: Mitsuhiro Tanino <mitsuhiro.tanino.gm@hitachi.com>
To: Andi Kleen <andi@firstfloor.org>,
Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: linux-kernel <linux-kernel@vger.kernel.org>,
linux-mm <linux-mm@kvack.org>
Subject: Re: [RFC Patch 0/2] mm: Add parameters to make kernel behavior at memory error on dirty cache selectable
Date: Fri, 12 Apr 2013 22:38:43 +0900
Message-ID: <51680E63.3070100@hitachi.com>
In-Reply-To: <20130411181004.GK16732@two.firstfloor.org>
(2013/04/12 3:10), Andi Kleen wrote:
> On Thu, Apr 11, 2013 at 11:23:08AM -0400, Naoya Horiguchi wrote:
>> On Thu, Apr 11, 2013 at 03:49:16PM +0200, Andi Kleen wrote:
>>>> As a result, if the dirty cache includes user data, the data is lost,
>>>> and data corruption occurs if an application uses old data.
>>>
>>> The application cannot use old data, the kernel code kills it if it
>>> would do that. And if it's IO data there is an EIO triggered.
>>>
>>> iirc the only concern in the past was that the application may miss
>>> the asynchronous EIO because it's cleared on any fd access.
>>>
>>> This is a general problem not specific to memory error handling,
>>> as these asynchronous IO errors can happen due to other reason
>>> (bad disk etc.)
>>>
>>> If you're really concerned about this case I think the solution
>>> is to make the EIO more sticky so that there is a higher chance
>>> than it gets returned. This will make your data much more safe,
>>> as it will cover all kinds of IO errors, not just the obscure memory
>>> errors.
I agree with Andi. We need to handle both memory errors and asynchronous
I/O errors.
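To illustrate the application side of this concern, here is a minimal
userspace sketch, not kernel code. It assumes a buffered write whose
writeback error, if any, is reported at fsync() or close() rather than at
write(). The helper names write_durably and save_or_report are hypothetical
and exist only for this example.

```python
import errno
import os
import tempfile


def write_durably(path: str, data: bytes) -> None:
    """Write data to path and force it to stable storage.

    Raises OSError if write(), fsync(), or close() reports a failure.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        remaining = memoryview(data)
        while len(remaining) > 0:
            # os.write may write fewer bytes than requested, so loop.
            written = os.write(fd, remaining)
            remaining = remaining[written:]
        # Asynchronous writeback errors usually surface here, not at write().
        os.fsync(fd)
    finally:
        # close() can also report an error; let it propagate.
        os.close(fd)


def save_or_report(path: str, data: bytes) -> bool:
    """Return True on success; print the error and return False otherwise."""
    try:
        write_durably(path, data)
    except OSError as exc:
        if exc.errno == errno.EIO:
            print(f"I/O error on {path}: data may be lost")
        else:
            print(f"write failed on {path}: {exc}")
        return False
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(save_or_report(os.path.join(tmp, "data.bin"), b"payload"))
```

An application that skips the fsync() check, or ignores its result, is the
kind of "not fully correct" program discussed in this thread.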
>> I'm interested in this topic, and in previous discussion, what I was said
>> is that we can't expect user applications to change their behaviors when
>> they get EIO, so globally changing EIO's stickiness is not a great approach.
>
> Not sure. Some of the current behavior may be dubious and it may
> be possible to change it. But would need more analysis.
>
> I don't think we're concerned that much about "correct" applications,
> but more ones that do not check everything. So returning more
> errors should be safer.
>
> For example you could have a sysctl that enables always stick
> IO error -- that keeps erroring until it is closed.
>
>> I'm working on a new pagecache tag based mechanism to solve this.
>> But it needs time and more discussions.
>> So I guess Tanino-san suggests giving up on dirty pagecache errors
>> as a quick solution.
>
> A quick solution would be enabling panic for any asynchronous IO error.
> I don't think the memory error code is the right point to hook into.
Yes. I think both a short-term solution and a long-term solution are
necessary in order to enable the hwpoison feature for Linux as a KVM
hypervisor.
So my proposal is as follows.
Short-term solution, covering both memory errors and I/O errors:
- I will resend a panic knob to handle loss of dirty cache data
  caused by either a memory error or an I/O error.
Long-term solution:
- Andi's proposal or Horiguchi-san's new pagecache tag based mechanism
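As a rough illustration of the difference between the report-once behavior
described earlier in this thread and the sticky behavior Andi suggests, here
is a toy model. It is not kernel code and does not claim to match the actual
implementation; the class ToyFile and its methods are hypothetical names
used only for this sketch.

```python
import errno


class ToyFile:
    """Toy model of the asynchronous error state of one open file."""

    def __init__(self, sticky: bool) -> None:
        self.sticky = sticky
        self.pending_error = 0  # 0 means no error is pending

    def record_async_error(self, err: int = errno.EIO) -> None:
        """Simulate a failed writeback or a memory error on a dirty page."""
        self.pending_error = err

    def check(self) -> int:
        """Return the pending errno, or 0 if there is none.

        In non-sticky mode the error is cleared once it has been reported,
        so a later caller never sees it. In sticky mode it stays set until
        the object is discarded (modelling "until the file is closed").
        """
        err = self.pending_error
        if not self.sticky:
            self.pending_error = 0
        return err


if __name__ == "__main__":
    for sticky in (False, True):
        f = ToyFile(sticky=sticky)
        f.record_async_error()
        first, second = f.check(), f.check()
        print(f"sticky={sticky}: first={first} second={second}")
```

In the non-sticky case only the first check sees EIO, so an application that
misses that one report continues with stale data. In the sticky case every
check keeps failing.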
Regards,
Mitsuhiro Tanino