From: Wu Fengguang <fengguang.wu@intel.com>
To: Christoph Hellwig <hch@infradead.org>
Cc: Andi Kleen <andi@firstfloor.org>, "tytso@mit.edu" <tytso@mit.edu>,
"mfasheh@suse.com" <mfasheh@suse.com>,
"aia21@cantab.net" <aia21@cantab.net>,
"hugh.dickins@tiscali.co.uk" <hugh.dickins@tiscali.co.uk>,
"swhiteho@redhat.com" <swhiteho@redhat.com>,
"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
"npiggin@suse.de" <npiggin@suse.de>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
"hidehiro.kawai.ez@hitachi.com" <hidehiro.kawai.ez@hitachi.com>
Subject: Re: [PATCH] [16/19] HWPOISON: Enable .remove_error_page for migration aware file systems
Date: Wed, 5 Aug 2009 19:52:42 +0800
Message-ID: <20090805115242.GB6737@localhost>
In-Reply-To: <20090805111231.GA19532@infradead.org>
On Wed, Aug 05, 2009 at 07:12:31PM +0800, Christoph Hellwig wrote:
> On Wed, Aug 05, 2009 at 11:36:43AM +0200, Andi Kleen wrote:
> >
> > Enable removing of corrupted pages through truncation
> > for a bunch of file systems: ext*, xfs, gfs2, ocfs2, ntfs
> > These should cover most server needs.
> >
> > I chose the set of migration aware file systems for this
> > for now, assuming they have been especially audited.
> > But in general it should be safe for all file systems
> > on the data area that support read/write and truncate.
> >
> > Caveat: the hardware error handler does not take i_mutex
> > for now before calling the truncate function. Is that ok?
>
> It will probably need locking, e.g. the iolock in XFS. I'll
> need to take a look at the actual implementation of
> generic_error_remove_page to make sense of this.
In patch 13, generic_error_remove_page() simply calls
truncate_inode_page() for S_ISREG inodes.
Nick suggests calling truncate_inode_page() with i_mutex held. Sure, we
can do mutex_trylock(i_mutex), but we'd appreciate it if some fs gurus
could point out concrete bad consequences of not doing so. Thanks!
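For illustration only, the trylock idea could look roughly like the
userspace sketch below. It is not kernel code: pthread_mutex_trylock()
stands in for mutex_trylock(), struct demo_inode and
demo_error_remove_page() are made-up names, and clearing a flag stands in
for the truncate_inode_page() call.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for an inode: a lock plus one cached "page". */
struct demo_inode {
    pthread_mutex_t i_mutex;   /* plays the role of inode->i_mutex */
    bool page_cached;          /* true while the page is in the cache */
};

/*
 * Try to drop the corrupted page under the lock.
 * Returns 0 if the page was removed, -1 if the lock was busy and the
 * page was left alone (the caller decides whether to retry or give up).
 */
int demo_error_remove_page(struct demo_inode *inode)
{
    /* Lock is busy: back off instead of blocking on it. */
    if (pthread_mutex_trylock(&inode->i_mutex) != 0)
        return -1;

    /* Assumption: the real handler would call truncate_inode_page() here. */
    inode->page_cached = false;

    pthread_mutex_unlock(&inode->i_mutex);
    return 0;
}
```

With the lock free the call returns 0 and the page is gone. If another
path already holds the lock, it returns -1 and the page stays cached,
which is the case where we would have to decide between retrying and
leaving the poisoned page in place.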
> Is there any way for us to test this functionality without introducing
> real hardware problems?
We have some additional patches (ugly, but they work for now) that export
interfaces for injecting hwpoison into selected types of pages. They can
guarantee that only data/metadata pages of the selected fs will be
poisoned. Based on that, we can do all kinds of stress testing in user
space.
Thanks,
Fengguang