From: Andrew Morton <akpm@linux-foundation.org>
To: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Tony Luck <tony.luck@intel.com>,
	Andi Kleen <andi.kleen@intel.com>,
	Wu Fengguang <fengguang.wu@intel.com>,
	Ingo Molnar <mingo@elte.hu>,
	Jun'ichi Nomura <j-nomura@ce.jp.nec.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Jan Kara <jack@suse.cz>
Subject: Re: [PATCH 2/2 v2] mm: print out information of file affected by memory error
Date: Mon, 5 Nov 2012 14:01:54 -0800
Message-ID: <20121105140154.fce89f05.akpm@linux-foundation.org>
In-Reply-To: <1351873993-9373-3-git-send-email-n-horiguchi@ah.jp.nec.com>

On Fri,  2 Nov 2012 12:33:13 -0400
Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> wrote:

> Printing out the information about which file can be affected by a
> memory error in generic_error_remove_page() is helpful for user to
> estimate the impact of the error.
> 
> Changelog v2:
>   - dereference mapping->host after if (!mapping) check for robustness
> 
> ...
>
> --- v3.7-rc3.orig/mm/truncate.c
> +++ v3.7-rc3/mm/truncate.c
> @@ -151,14 +151,20 @@ int truncate_inode_page(struct address_space *mapping, struct page *page)
>   */
>  int generic_error_remove_page(struct address_space *mapping, struct page *page)
>  {
> +	struct inode *inode;
> +
>  	if (!mapping)
>  		return -EINVAL;
> +	inode = mapping->host;
>  	/*
>  	 * Only punch for normal data pages for now.
>  	 * Handling other types like directories would need more auditing.
>  	 */
> -	if (!S_ISREG(mapping->host->i_mode))
> +	if (!S_ISREG(inode->i_mode))
>  		return -EIO;
> +	pr_info("MCE %#lx: file info pgoff:%lu, inode:%lu, dev:%s\n",
> +		page_to_pfn(page), page_index(page),
> +		inode->i_ino, inode->i_sb->s_id);
>  	return truncate_inode_page(mapping, page);
>  }
>  EXPORT_SYMBOL(generic_error_remove_page);

A couple of things.

- I worry that if a hardware error occurs, it might affect a large
  amount of memory all at the same time.  For example, if a 4G memory
  block goes bad, this message will be printed a million times?

- hard-wiring "MCE" in here seems a bit of a layering violation? 
  What right does the generic, core .error_remove_page() implementation
  have to assume that it was called because of an MCE?  Many CPU types
  don't even have such a thing?
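
To put a rough number on the first point, here is a minimal userspace
sketch, not kernel code.  It assumes 4 KiB pages and models an
interval/burst limiter loosely in the spirit of the kernel's
pr_info_ratelimited().  The names struct ratelimit, pages_in(),
ratelimit_ok() and report_burst() are made up for illustration, and
the interval and burst values are arbitrary.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace model of an interval/burst rate limiter. */
struct ratelimit {
	uint64_t interval;	/* window length, in arbitrary ticks */
	uint64_t burst;		/* messages allowed per window */
	uint64_t begin;		/* tick at which the current window started */
	uint64_t printed;	/* messages emitted in the current window */
	uint64_t missed;	/* messages suppressed so far */
};

/* Number of whole pages covered by a bad region of `bytes` bytes. */
static uint64_t pages_in(uint64_t bytes, uint64_t page_size)
{
	assert(page_size != 0);
	return bytes / page_size;
}

/* Return 1 if a message may be emitted at tick `now`, 0 if suppressed. */
static int ratelimit_ok(struct ratelimit *rs, uint64_t now)
{
	if (now - rs->begin >= rs->interval) {
		/* The window expired: start a new one. */
		rs->begin = now;
		rs->printed = 0;
	}
	if (rs->printed < rs->burst) {
		rs->printed++;
		return 1;
	}
	rs->missed++;
	return 0;
}

/*
 * Report `npages` bad pages, all within tick 0, and return how many
 * messages got past the limiter.
 */
static uint64_t report_burst(struct ratelimit *rs, uint64_t npages)
{
	uint64_t i, emitted = 0;

	for (i = 0; i < npages; i++)
		emitted += ratelimit_ok(rs, 0);
	return emitted;
}
```

Under those assumptions, pages_in(4ULL << 30, 4096) is 1048576, which
is where "a million times" comes from.  Feeding that count through
report_burst() with a limiter initialised to interval 5 and burst 10
lets 10 messages through and counts the other 1048566 as missed.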
