From: Mitsuhiro Tanino <mitsuhiro.tanino.gm@hitachi.com>
To: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
	Andi Kleen <andi@firstfloor.org>,
	Kosaki Motohiro <kosaki.motohiro@gmail.com>
Cc: linux-kernel <linux-kernel@vger.kernel.org>,
	linux-mm <linux-mm@kvack.org>
Subject: Re: [RFC Patch 2/2] mm: Add parameters to limit a rate of outputting memory error messages
Date: Fri, 12 Apr 2013 22:30:28 +0900
Message-ID: <51680C74.9010000@hitachi.com>
In-Reply-To: <1365691626-w2h428s2-mutt-n-horiguchi@ah.jp.nec.com>

(2013/04/11 23:47), Naoya Horiguchi wrote:
> On Thu, Apr 11, 2013 at 04:00:12PM +0200, Andi Kleen wrote:
>>> I don't think it's enough to do ratelimit only for me_pagecache_dirty().
>>> When tons of memory errors flood, all of printk()s in memory error handler
>>> can print out tons of messages.
>>
>> Note that when you really have a flood of uncorrected errors you'll
>> likely die soon anyways as something unrecoverable is very likely to
>> happen. Error memory recovery cannot fix large scale memory corruptions,
>> just the rare events that slip through all the other memory error correction
>> schemes.
>>
>> So I wouldn't worry too much about that.
> 
> I agree.
> My previous comment is valid only when we assume the flooding can happen
> (and I personally don't believe that can happen except for in testing.)
> 
> And for paranoid users, we can suggest that they set up mcelog script
> triggering to turn off vm.memory_failure_recovery when memory errors flood.
> Such users don't expect that memory error handling works fine in flooding,
> so just suppressing kernel messages is pointless.
> 
> Thanks,
> Naoya

Hi Andi, Horiguchi-san, Kosaki-san

Thank you for your comments. I agree with your opinions.
I also think that an uncorrected error is a rare event.

I added the ratelimit-based limiting to my patch in light of the
discussion half a year ago, in which Andrew-san raised a concern about
a flood of uncorrected errors in response to the patch proposed by
Horiguchi-san.

I think the ratelimit can be removed so that all "important messages" are output.

I will try to resend the patches separately:
one for outputting error messages related to a corrupted file,
and the other for adding a panic knob to handle loss of dirty page cache data,
which can be caused by both memory errors and I/O errors.

Regards,
Mitsuhiro Tanino