linux-mm.kvack.org archive mirror
* HWPoison fixes for 2.6.36
@ 2010-10-06 20:48 Andi Kleen
  2010-10-06 20:48 ` [PATCH 1/4] page-types.c: fix name of unpoison interface Andi Kleen
                   ` (3 more replies)
  0 siblings, 4 replies; 18+ messages in thread
From: Andi Kleen @ 2010-10-06 20:48 UTC (permalink / raw)
  To: linux-kernel; +Cc: fengguang.wu, linux-mm

Here are some hwpoison fixes I plan to send to Linus in a day or two
for 2.6.36. Any review appreciated.

-Andi

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

* [PATCH 1/4] page-types.c: fix name of unpoison interface
  2010-10-06 20:48 HWPoison fixes for 2.6.36 Andi Kleen
@ 2010-10-06 20:48 ` Andi Kleen
  2010-10-06 20:48 ` [PATCH 2/4] HWPOISON: Copy si_addr_lsb to user Andi Kleen
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 18+ messages in thread
From: Andi Kleen @ 2010-10-06 20:48 UTC (permalink / raw)
  To: linux-kernel; +Cc: fengguang.wu, linux-mm, Naoya Horiguchi, Andi Kleen

From: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>

The page-types utility still uses an out-of-date name for the
unpoison interface: debugfs:hwpoison/renew-pfn
This patch switches it to the current name, unpoison-pfn.

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Acked-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 Documentation/vm/page-types.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/Documentation/vm/page-types.c b/Documentation/vm/page-types.c
index ccd951f..cc96ee2 100644
--- a/Documentation/vm/page-types.c
+++ b/Documentation/vm/page-types.c
@@ -478,7 +478,7 @@ static void prepare_hwpoison_fd(void)
 	}
 
 	if (opt_unpoison && !hwpoison_forget_fd) {
-		sprintf(buf, "%s/renew-pfn", hwpoison_debug_fs);
+		sprintf(buf, "%s/unpoison-pfn", hwpoison_debug_fs);
 		hwpoison_forget_fd = checked_open(buf, O_WRONLY);
 	}
 }
-- 
1.7.1
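
For illustration, the path construction in the hunk above can be sketched
with a bounds-checked snprintf(). This is a minimal sketch, not code from
page-types.c: hwpoison_unpoison_path() is a hypothetical helper name, and
the debugfs directory passed in is whatever the caller has discovered.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* File name used by the patched page-types.c for the unpoison interface. */
#define UNPOISON_FILE "unpoison-pfn"

/*
 * Build "<debugfs_dir>/unpoison-pfn" into buf.
 * Returns 0 on success, -1 if the result would not fit in len bytes.
 * Hypothetical helper for illustration; not part of page-types.c.
 */
static int hwpoison_unpoison_path(char *buf, size_t len, const char *debugfs_dir)
{
	int n;

	assert(buf != NULL && debugfs_dir != NULL);
	n = snprintf(buf, len, "%s/%s", debugfs_dir, UNPOISON_FILE);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

A caller would pass the hwpoison debugfs directory, for example
"/sys/kernel/debug/hwpoison" when debugfs is mounted at its usual location
(an assumption), and open the resulting path write-only.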

* [PATCH 2/4] HWPOISON: Copy si_addr_lsb to user
  2010-10-06 20:48 HWPoison fixes for 2.6.36 Andi Kleen
  2010-10-06 20:48 ` [PATCH 1/4] page-types.c: fix name of unpoison interface Andi Kleen
@ 2010-10-06 20:48 ` Andi Kleen
  2010-10-07  0:27   ` Naoya Horiguchi
                     ` (2 more replies)
  2010-10-06 20:49 ` [PATCH 3/4] HWPOISON: Report correct address granuality for AO huge page errors Andi Kleen
  2010-10-06 20:49 ` [PATCH 4/4] HWPOISON: Stop shrinking at right page count Andi Kleen
  3 siblings, 3 replies; 18+ messages in thread
From: Andi Kleen @ 2010-10-06 20:48 UTC (permalink / raw)
  To: linux-kernel; +Cc: fengguang.wu, linux-mm, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

The original hwpoison code added a new siginfo field si_addr_lsb to
pass the granularity of the fault address to user space. Unfortunately
this field was never copied to user space. Fix this here.

I added explicit checks for the MCEERR codes to avoid having
to patch all potential callers to initialize the field.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 kernel/signal.c |    8 ++++++++
 1 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/kernel/signal.c b/kernel/signal.c
index bded651..919562c 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2215,6 +2215,14 @@ int copy_siginfo_to_user(siginfo_t __user *to, siginfo_t *from)
 #ifdef __ARCH_SI_TRAPNO
 		err |= __put_user(from->si_trapno, &to->si_trapno);
 #endif
+#ifdef BUS_MCEERR_AO
+		/* 
+		 * Other callers might not initialize the si_lsb field,
+	 	 * so check explicitely for the right codes here.
+		 */
+		if (from->si_code == BUS_MCEERR_AR || from->si_code == BUS_MCEERR_AO)
+			err |= __put_user(from->si_addr_lsb, &to->si_addr_lsb);
+#endif
 		break;
 	case __SI_CHLD:
 		err |= __put_user(from->si_pid, &to->si_pid);
-- 
1.7.1
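
A minimal sketch of how a user-space SIGBUS consumer might use the field
once it reaches user space, assuming a libc whose siginfo_t exposes
si_addr_lsb. mce_range() is a hypothetical helper name and is not part of
the patch. It only trusts si_addr_lsb for the two MCEERR codes, mirroring
the check the kernel side makes before copying the field.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Decode a machine-check SIGBUS. Returns 1 and fills in the affected
 * range when si_code is BUS_MCEERR_AR or BUS_MCEERR_AO, otherwise 0,
 * because si_addr_lsb is only meaningful for those two codes.
 * Hypothetical helper for illustration.
 */
static int mce_range(const siginfo_t *info, uintptr_t *start, size_t *len)
{
	uintptr_t addr;

	assert(info != NULL && start != NULL && len != NULL);
	if (info->si_signo != SIGBUS)
		return 0;
	if (info->si_code != BUS_MCEERR_AR && info->si_code != BUS_MCEERR_AO)
		return 0;

	addr = (uintptr_t)info->si_addr;
	/* si_addr_lsb is the log2 of the corrupted region's size. */
	*len = (size_t)1 << info->si_addr_lsb;
	/* Round the reported address down to the start of that region. */
	*start = addr & ~((uintptr_t)*len - 1);
	return 1;
}
```

In a handler installed with SA_SIGINFO, the siginfo_t pointer passed to
the handler can be handed to mce_range() directly.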

* [PATCH 3/4] HWPOISON: Report correct address granuality for AO huge page errors
  2010-10-06 20:48 HWPoison fixes for 2.6.36 Andi Kleen
  2010-10-06 20:48 ` [PATCH 1/4] page-types.c: fix name of unpoison interface Andi Kleen
  2010-10-06 20:48 ` [PATCH 2/4] HWPOISON: Copy si_addr_lsb to user Andi Kleen
@ 2010-10-06 20:49 ` Andi Kleen
  2010-10-07  0:31   ` Naoya Horiguchi
  2010-10-07  1:50   ` Wu Fengguang
  2010-10-06 20:49 ` [PATCH 4/4] HWPOISON: Stop shrinking at right page count Andi Kleen
  3 siblings, 2 replies; 18+ messages in thread
From: Andi Kleen @ 2010-10-06 20:49 UTC (permalink / raw)
  To: linux-kernel; +Cc: fengguang.wu, linux-mm, Andi Kleen, Naoya Horiguchi

From: Andi Kleen <ak@linux.intel.com>

The SIGBUS user space signalling is supposed to report the
address granularity of a corruption. Pass this information correctly
for huge pages by querying the hpage order.

Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: fengguang.wu@intel.com
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 mm/memory-failure.c |   15 +++++++++------
 1 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 9c26eec..886144b 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -183,10 +183,11 @@ EXPORT_SYMBOL_GPL(hwpoison_filter);
  * signal.
  */
 static int kill_proc_ao(struct task_struct *t, unsigned long addr, int trapno,
-			unsigned long pfn)
+			unsigned long pfn, struct page *page)
 {
 	struct siginfo si;
 	int ret;
+	unsigned order;
 
 	printk(KERN_ERR
 		"MCE %#lx: Killing %s:%d early due to hardware memory corruption\n",
@@ -198,7 +199,8 @@ static int kill_proc_ao(struct task_struct *t, unsigned long addr, int trapno,
 #ifdef __ARCH_SI_TRAPNO
 	si.si_trapno = trapno;
 #endif
-	si.si_addr_lsb = PAGE_SHIFT;
+	order = PageCompound(page) ? huge_page_order(page) : PAGE_SHIFT;
+	si.si_addr_lsb = order;
 	/*
 	 * Don't use force here, it's convenient if the signal
 	 * can be temporarily blocked.
@@ -327,7 +329,7 @@ static void add_to_kill(struct task_struct *tsk, struct page *p,
  * wrong earlier.
  */
 static void kill_procs_ao(struct list_head *to_kill, int doit, int trapno,
-			  int fail, unsigned long pfn)
+			  int fail, struct page *page, unsigned long pfn)
 {
 	struct to_kill *tk, *next;
 
@@ -341,7 +343,8 @@ static void kill_procs_ao(struct list_head *to_kill, int doit, int trapno,
 			if (fail || tk->addr_valid == 0) {
 				printk(KERN_ERR
 		"MCE %#lx: forcibly killing %s:%d because of failure to unmap corrupted page\n",
-					pfn, tk->tsk->comm, tk->tsk->pid);
+					pfn,	
+					tk->tsk->comm, tk->tsk->pid);
 				force_sig(SIGKILL, tk->tsk);
 			}
 
@@ -352,7 +355,7 @@ static void kill_procs_ao(struct list_head *to_kill, int doit, int trapno,
 			 * process anyways.
 			 */
 			else if (kill_proc_ao(tk->tsk, tk->addr, trapno,
-					      pfn) < 0)
+					      pfn, page) < 0)
 				printk(KERN_ERR
 		"MCE %#lx: Cannot send advisory machine check signal to %s:%d\n",
 					pfn, tk->tsk->comm, tk->tsk->pid);
@@ -928,7 +931,7 @@ static int hwpoison_user_mappings(struct page *p, unsigned long pfn,
 	 * any accesses to the poisoned memory.
 	 */
 	kill_procs_ao(&tokill, !!PageDirty(hpage), trapno,
-		      ret != SWAP_SUCCESS, pfn);
+		      ret != SWAP_SUCCESS, p, pfn);
 
 	return ret;
 }
-- 
1.7.1

* [PATCH 4/4] HWPOISON: Stop shrinking at right page count
  2010-10-06 20:48 HWPoison fixes for 2.6.36 Andi Kleen
                   ` (2 preceding siblings ...)
  2010-10-06 20:49 ` [PATCH 3/4] HWPOISON: Report correct address granuality for AO huge page errors Andi Kleen
@ 2010-10-06 20:49 ` Andi Kleen
  2010-10-07  1:53   ` Wu Fengguang
  3 siblings, 1 reply; 18+ messages in thread
From: Andi Kleen @ 2010-10-06 20:49 UTC (permalink / raw)
  To: linux-kernel; +Cc: fengguang.wu, linux-mm, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

When we call the slab shrinker to free a page we need to stop at
page count one because the caller always holds a single reference, not zero.

This avoids useless looping over slab shrinkers and freeing too much
memory.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 mm/memory-failure.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 886144b..7c1af9b 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -237,7 +237,7 @@ void shake_page(struct page *p, int access)
 		int nr;
 		do {
 			nr = shrink_slab(1000, GFP_KERNEL, 1000);
-			if (page_count(p) == 0)
+			if (page_count(p) == 1)
 				break;
 		} while (nr > 10);
 	}
-- 
1.7.1
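
The effect of the changed stop condition can be illustrated with a toy
user-space model. Everything below is an assumption made for illustration:
struct fake_state, fake_shrink_slab() and shake() are hypothetical stand-ins
that only imitate the shape of the do/while loop in shake_page(), with the
caller's own reference included in page_count.

```c
#include <assert.h>

/* Toy state: only the reference count and the shrinkable pool matter. */
struct fake_state {
	int page_count;	/* references on the page, including the caller's */
	int freeable;	/* objects the toy shrinker could still free */
};

/* Toy shrinker: frees up to 1000 objects and drops one non-caller reference. */
static int fake_shrink_slab(struct fake_state *s)
{
	int freed = s->freeable < 1000 ? s->freeable : 1000;

	assert(s->freeable >= 0);
	s->freeable -= freed;
	if (s->page_count > 1)
		s->page_count--;	/* the slab lets go of its reference */
	return freed;
}

/* Same loop shape as shake_page(); returns the number of shrinker calls. */
static int shake(struct fake_state *s, int stop_count)
{
	int nr, calls = 0;

	do {
		nr = fake_shrink_slab(s);
		calls++;
		if (s->page_count == stop_count)
			break;
	} while (nr > 10);
	return calls;
}
```

Starting from page_count 2 and 5000 freeable objects, stopping at count 1
ends after one shrinker call, while stopping at count 0 can never trigger
and the loop runs until the toy shrinker has nothing left to free.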

* Re: [PATCH 2/4] HWPOISON: Copy si_addr_lsb to user
  2010-10-06 20:48 ` [PATCH 2/4] HWPOISON: Copy si_addr_lsb to user Andi Kleen
@ 2010-10-07  0:27   ` Naoya Horiguchi
  2010-10-07  6:31   ` Hidetoshi Seto
  2010-10-08 17:09   ` Ralf Baechle
  2 siblings, 0 replies; 18+ messages in thread
From: Naoya Horiguchi @ 2010-10-07  0:27 UTC (permalink / raw)
  To: Andi Kleen; +Cc: linux-kernel, fengguang.wu, linux-mm, Andi Kleen

Just nitpicking...

> @@ -2215,6 +2215,14 @@ int copy_siginfo_to_user(siginfo_t __user *to, siginfo_t *from)
>  #ifdef __ARCH_SI_TRAPNO
>  		err |= __put_user(from->si_trapno, &to->si_trapno);
>  #endif
> +#ifdef BUS_MCEERR_AO
> +		/* 
                  ^
                  trailing white space
> +		 * Other callers might not initialize the si_lsb field,
> +	 	 * so check explicitely for the right codes here.
        ^                   ^^^^^^^^^^^
        white space         explicitly

> +		 */
> +		if (from->si_code == BUS_MCEERR_AR || from->si_code == BUS_MCEERR_AO)
> +			err |= __put_user(from->si_addr_lsb, &to->si_addr_lsb);
> +#endif
>  		break;
>  	case __SI_CHLD:
>  		err |= __put_user(from->si_pid, &to->si_pid);

Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>

Thanks,
Naoya Horiguchi

* Re: [PATCH 3/4] HWPOISON: Report correct address granuality for AO huge page errors
  2010-10-06 20:49 ` [PATCH 3/4] HWPOISON: Report correct address granuality for AO huge page errors Andi Kleen
@ 2010-10-07  0:31   ` Naoya Horiguchi
  2010-10-07  7:38     ` Andi Kleen
  2010-10-07  1:50   ` Wu Fengguang
  1 sibling, 1 reply; 18+ messages in thread
From: Naoya Horiguchi @ 2010-10-07  0:31 UTC (permalink / raw)
  To: Andi Kleen; +Cc: linux-kernel, fengguang.wu, linux-mm, Andi Kleen

> @@ -198,7 +199,8 @@ static int kill_proc_ao(struct task_struct *t, unsigned long addr, int trapno,
>  #ifdef __ARCH_SI_TRAPNO
>  	si.si_trapno = trapno;
>  #endif
> -	si.si_addr_lsb = PAGE_SHIFT;
> +	order = PageCompound(page) ? huge_page_order(page) : PAGE_SHIFT;
                                                     ^^^^
                                     huge_page_order(page_hstate(page)) ?

> +	si.si_addr_lsb = order;
>  	/*
>  	 * Don't use force here, it's convenient if the signal
>  	 * can be temporarily blocked.

...

> @@ -341,7 +343,8 @@ static void kill_procs_ao(struct list_head *to_kill, int doit, int trapno,
>  			if (fail || tk->addr_valid == 0) {
>  				printk(KERN_ERR
>  		"MCE %#lx: forcibly killing %s:%d because of failure to unmap corrupted page\n",
> -					pfn, tk->tsk->comm, tk->tsk->pid);
> +					pfn,	
> +					tk->tsk->comm, tk->tsk->pid);

What's the point of this change?

Thanks,
Naoya Horiguchi

* Re: [PATCH 3/4] HWPOISON: Report correct address granuality for AO huge page errors
  2010-10-06 20:49 ` [PATCH 3/4] HWPOISON: Report correct address granuality for AO huge page errors Andi Kleen
  2010-10-07  0:31   ` Naoya Horiguchi
@ 2010-10-07  1:50   ` Wu Fengguang
  1 sibling, 0 replies; 18+ messages in thread
From: Wu Fengguang @ 2010-10-07  1:50 UTC (permalink / raw)
  To: Andi Kleen
  Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, Andi Kleen,
	Naoya Horiguchi

On Thu, Oct 07, 2010 at 04:49:00AM +0800, Andi Kleen wrote:
> From: Andi Kleen <ak@linux.intel.com>
> 
> The SIGBUS user space signalling is supposed to report the
> address granularity of a corruption. Pass this information correctly
> for huge pages by querying the hpage order.
> 
> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
> Cc: fengguang.wu@intel.com
> Signed-off-by: Andi Kleen <ak@linux.intel.com>
> ---
>  mm/memory-failure.c |   15 +++++++++------
>  1 files changed, 9 insertions(+), 6 deletions(-)
> 
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index 9c26eec..886144b 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -183,10 +183,11 @@ EXPORT_SYMBOL_GPL(hwpoison_filter);
>   * signal.
>   */
>  static int kill_proc_ao(struct task_struct *t, unsigned long addr, int trapno,
> -			unsigned long pfn)
> +			unsigned long pfn, struct page *page)
>  {
>  	struct siginfo si;
>  	int ret;
> +	unsigned order;
>  
>  	printk(KERN_ERR
>  		"MCE %#lx: Killing %s:%d early due to hardware memory corruption\n",
> @@ -198,7 +199,8 @@ static int kill_proc_ao(struct task_struct *t, unsigned long addr, int trapno,
>  #ifdef __ARCH_SI_TRAPNO
>  	si.si_trapno = trapno;
>  #endif
> -	si.si_addr_lsb = PAGE_SHIFT;
> +	order = PageCompound(page) ? huge_page_order(page) : PAGE_SHIFT;

huge_page_order() expects a struct hstate *h, not a struct page *. This
should be compound_order(compound_head(page)), or compound_order(page) if
it's already a head page.

btw, I notice that force_sig_info_fault() sets 

        info.si_addr_lsb = si_code == BUS_MCEERR_AR ? PAGE_SHIFT : 0;

What's the intention of conditional 0 here?

> +	si.si_addr_lsb = order;
>  	/*
>  	 * Don't use force here, it's convenient if the signal
>  	 * can be temporarily blocked.
> @@ -327,7 +329,7 @@ static void add_to_kill(struct task_struct *tsk, struct page *p,
>   * wrong earlier.
>   */
>  static void kill_procs_ao(struct list_head *to_kill, int doit, int trapno,
> -			  int fail, unsigned long pfn)
> +			  int fail, struct page *page, unsigned long pfn)
>  {
>  	struct to_kill *tk, *next;
>  
> @@ -341,7 +343,8 @@ static void kill_procs_ao(struct list_head *to_kill, int doit, int trapno,
>  			if (fail || tk->addr_valid == 0) {
>  				printk(KERN_ERR
>  		"MCE %#lx: forcibly killing %s:%d because of failure to unmap corrupted page\n",
> -					pfn, tk->tsk->comm, tk->tsk->pid);
> +					pfn,	
> +					tk->tsk->comm, tk->tsk->pid);
>  				force_sig(SIGKILL, tk->tsk);
>  			}
>  
> @@ -352,7 +355,7 @@ static void kill_procs_ao(struct list_head *to_kill, int doit, int trapno,
>  			 * process anyways.
>  			 */
>  			else if (kill_proc_ao(tk->tsk, tk->addr, trapno,
> -					      pfn) < 0)
> +					      pfn, page) < 0)
>  				printk(KERN_ERR
>  		"MCE %#lx: Cannot send advisory machine check signal to %s:%d\n",
>  					pfn, tk->tsk->comm, tk->tsk->pid);
> @@ -928,7 +931,7 @@ static int hwpoison_user_mappings(struct page *p, unsigned long pfn,
>  	 * any accesses to the poisoned memory.
>  	 */
>  	kill_procs_ao(&tokill, !!PageDirty(hpage), trapno,
> -		      ret != SWAP_SUCCESS, pfn);
> +		      ret != SWAP_SUCCESS, p, pfn);

It seems a bit better to pass "hpage" (the head page) instead of "p",
since the function only references the head page, and "p" is
redundant with "pfn".

Reviewed-by: Wu Fengguang <fengguang.wu@intel.com>

Thanks,
Fengguang

* Re: [PATCH 4/4] HWPOISON: Stop shrinking at right page count
  2010-10-06 20:49 ` [PATCH 4/4] HWPOISON: Stop shrinking at right page count Andi Kleen
@ 2010-10-07  1:53   ` Wu Fengguang
  0 siblings, 0 replies; 18+ messages in thread
From: Wu Fengguang @ 2010-10-07  1:53 UTC (permalink / raw)
  To: Andi Kleen; +Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, Andi Kleen

On Thu, Oct 07, 2010 at 04:49:01AM +0800, Andi Kleen wrote:
> From: Andi Kleen <ak@linux.intel.com>
> 
> When we call the slab shrinker to free a page we need to stop at
> page count one because the caller always holds a single reference, not zero.
> 
> This avoids useless looping over slab shrinkers and freeing too much
> memory.
> 
> Signed-off-by: Andi Kleen <ak@linux.intel.com>

Good catch!

Reviewed-by: Wu Fengguang <fengguang.wu@intel.com>

> ---
>  mm/memory-failure.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index 886144b..7c1af9b 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -237,7 +237,7 @@ void shake_page(struct page *p, int access)
>  		int nr;
>  		do {
>  			nr = shrink_slab(1000, GFP_KERNEL, 1000);
> -			if (page_count(p) == 0)
> +			if (page_count(p) == 1)
>  				break;
>  		} while (nr > 10);
>  	}
> -- 
> 1.7.1

* Re: [PATCH 2/4] HWPOISON: Copy si_addr_lsb to user
  2010-10-06 20:48 ` [PATCH 2/4] HWPOISON: Copy si_addr_lsb to user Andi Kleen
  2010-10-07  0:27   ` Naoya Horiguchi
@ 2010-10-07  6:31   ` Hidetoshi Seto
  2010-10-07  7:36     ` Andi Kleen
  2010-10-08 17:09   ` Ralf Baechle
  2 siblings, 1 reply; 18+ messages in thread
From: Hidetoshi Seto @ 2010-10-07  6:31 UTC (permalink / raw)
  To: Andi Kleen; +Cc: linux-kernel, fengguang.wu, linux-mm, Andi Kleen

(2010/10/07 5:48), Andi Kleen wrote:
> From: Andi Kleen <ak@linux.intel.com>
> 
> The original hwpoison code added a new siginfo field si_addr_lsb to
> pass the granularity of the fault address to user space. Unfortunately
> this field was never copied to user space. Fix this here.
> 
> I added explicit checks for the MCEERR codes to avoid having
> to patch all potential callers to initialize the field.

QEMU now uses signalfd to catch the SIGBUS delivered to the
main thread, so I think a similar fix to copy the lsb to user space is
required for signalfd too.


Thanks,
H.Seto

=====

From: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Subject: [PATCH] signalfd: add support addr_lsb

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
---
 fs/signalfd.c            |   10 ++++++++++
 include/linux/signalfd.h |    3 ++-
 2 files changed, 12 insertions(+), 1 deletions(-)

diff --git a/fs/signalfd.c b/fs/signalfd.c
index 1c5a6ad..3e28173 100644
--- a/fs/signalfd.c
+++ b/fs/signalfd.c
@@ -99,6 +99,16 @@ static int signalfd_copyinfo(struct signalfd_siginfo __user *uinfo,
 #ifdef __ARCH_SI_TRAPNO
 		err |= __put_user(kinfo->si_trapno, &uinfo->ssi_trapno);
 #endif
+#ifdef BUS_MCEERR_AO
+		/* 
+		 * Other callers might not initialize the si_lsb field,
+		 * so check explicitely for the right codes here.
+		 */
+		if (kinfo->si_code == BUS_MCEERR_AR ||
+		    kinfo->si_code == BUS_MCEERR_AO)
+			err |= __put_user((short) kinfo->si_addr_lsb,
+					  &uinfo->ssi_addr_lsb);
+#endif
 		break;
 	case __SI_CHLD:
 		err |= __put_user(kinfo->si_pid, &uinfo->ssi_pid);
diff --git a/include/linux/signalfd.h b/include/linux/signalfd.h
index b363b91..3ff4961 100644
--- a/include/linux/signalfd.h
+++ b/include/linux/signalfd.h
@@ -33,6 +33,7 @@ struct signalfd_siginfo {
 	__u64 ssi_utime;
 	__u64 ssi_stime;
 	__u64 ssi_addr;
+	__u16 ssi_addr_lsb;
 
 	/*
 	 * Pad strcture to 128 bytes. Remember to update the
@@ -43,7 +44,7 @@ struct signalfd_siginfo {
 	 * comes out of a read(2) and we really don't want to have
 	 * a compat on read(2).
 	 */
-	__u8 __pad[48];
+	__u8 __pad[46];
 };
 
 
-- 
1.7.3.1
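
A minimal sketch of the consumer side of this patch, assuming libc headers
recent enough to declare ssi_addr_lsb in struct signalfd_siginfo.
sfd_mce_len() is a hypothetical helper name. The static assertion restates
the layout constraint behind the __pad change: the new 2-byte field is paid
for by shrinking the padding, so the record read from the descriptor stays
128 bytes.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/signalfd.h>

/* The record returned by read(2) on a signalfd has a fixed size. */
_Static_assert(sizeof(struct signalfd_siginfo) == 128,
	       "signalfd_siginfo must stay 128 bytes");

/*
 * Return the size in bytes of the region reported by a machine-check
 * SIGBUS record, or 0 if the record is not BUS_MCEERR_AR/BUS_MCEERR_AO.
 * Hypothetical helper for illustration.
 */
static uint64_t sfd_mce_len(const struct signalfd_siginfo *ssi)
{
	assert(ssi != NULL);
	if ((int)ssi->ssi_signo != SIGBUS)
		return 0;
	if (ssi->ssi_code != BUS_MCEERR_AR && ssi->ssi_code != BUS_MCEERR_AO)
		return 0;
	/* ssi_addr_lsb is the log2 of the affected region's size. */
	return (uint64_t)1 << ssi->ssi_addr_lsb;
}
```

A reader loop would read(2) one struct signalfd_siginfo at a time from the
descriptor and pass each record to sfd_mce_len(); the start of the region
follows from masking ssi_addr with the returned length minus one.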

* Re: [PATCH 2/4] HWPOISON: Copy si_addr_lsb to user
  2010-10-07  6:31   ` Hidetoshi Seto
@ 2010-10-07  7:36     ` Andi Kleen
  0 siblings, 0 replies; 18+ messages in thread
From: Andi Kleen @ 2010-10-07  7:36 UTC (permalink / raw)
  To: Hidetoshi Seto
  Cc: Andi Kleen, linux-kernel, fengguang.wu, linux-mm, Andi Kleen

On Thu, Oct 07, 2010 at 03:31:31PM +0900, Hidetoshi Seto wrote:
> (2010/10/07 5:48), Andi Kleen wrote:
> > From: Andi Kleen <ak@linux.intel.com>
> > 
> > The original hwpoison code added a new siginfo field si_addr_lsb to
> > pass the granularity of the fault address to user space. Unfortunately
> > this field was never copied to user space. Fix this here.
> > 
> > I added explicit checks for the MCEERR codes to avoid having
> > to patch all potential callers to initialize the field.
> 
> Now QEMU uses signalfd to catch the SIGBUS delivered to the
> main thread, so I think similar fix to copy lsb to user is
> required for signalfd too. 

Good catch. I don't think qemu uses this today, but it should
be fixed there too for .37 at least.

-Andi

* Re: [PATCH 3/4] HWPOISON: Report correct address granuality for AO huge page errors
  2010-10-07  0:31   ` Naoya Horiguchi
@ 2010-10-07  7:38     ` Andi Kleen
  2010-10-07  8:41       ` Naoya Horiguchi
  0 siblings, 1 reply; 18+ messages in thread
From: Andi Kleen @ 2010-10-07  7:38 UTC (permalink / raw)
  To: Naoya Horiguchi
  Cc: Andi Kleen, linux-kernel, fengguang.wu, linux-mm, Andi Kleen

On Thu, Oct 07, 2010 at 09:31:20AM +0900, Naoya Horiguchi wrote:
> > @@ -198,7 +199,8 @@ static int kill_proc_ao(struct task_struct *t, unsigned long addr, int trapno,
> >  #ifdef __ARCH_SI_TRAPNO
> >  	si.si_trapno = trapno;
> >  #endif
> > -	si.si_addr_lsb = PAGE_SHIFT;
> > +	order = PageCompound(page) ? huge_page_order(page) : PAGE_SHIFT;
>                                                      ^^^^
>                                      huge_page_order(page_hstate(page)) ?

Ok.

> >  				printk(KERN_ERR
> >  		"MCE %#lx: forcibly killing %s:%d because of failure to unmap corrupted page\n",
> > -					pfn, tk->tsk->comm, tk->tsk->pid);
> > +					pfn,	
> > +					tk->tsk->comm, tk->tsk->pid);
> 
> What's the point of this change?

Probably left over from an earlier version; I will drop that hunk, thanks.


-Andi
-- 
ak@linux.intel.com -- Speaking for myself only.

* Re: [PATCH 3/4] HWPOISON: Report correct address granuality for AO huge page errors
  2010-10-07  7:38     ` Andi Kleen
@ 2010-10-07  8:41       ` Naoya Horiguchi
  2010-10-07  8:45         ` Andi Kleen
  0 siblings, 1 reply; 18+ messages in thread
From: Naoya Horiguchi @ 2010-10-07  8:41 UTC (permalink / raw)
  To: Andi Kleen; +Cc: linux-kernel, fengguang.wu, linux-mm, Andi Kleen

On Thu, Oct 07, 2010 at 09:38:48AM +0200, Andi Kleen wrote:
> On Thu, Oct 07, 2010 at 09:31:20AM +0900, Naoya Horiguchi wrote:
> > > @@ -198,7 +199,8 @@ static int kill_proc_ao(struct task_struct *t, unsigned long addr, int trapno,
> > >  #ifdef __ARCH_SI_TRAPNO
> > >  	si.si_trapno = trapno;
> > >  #endif
> > > -	si.si_addr_lsb = PAGE_SHIFT;
> > > +	order = PageCompound(page) ? huge_page_order(page) : PAGE_SHIFT;
> >                                                      ^^^^
> >                                      huge_page_order(page_hstate(page)) ?
> 
> Ok.

order seems to represent the least significant bit of the corrupted address,
so is huge_page_order() + PAGE_SHIFT or huge_page_shift() correct?
And since @page can be a tail page, compound_head() is needed, as Wu-san pointed out.
So huge_page_shift(page_hstate(compound_head(page))) looks good to me.

Thanks,
Naoya Horiguchi

* Re: [PATCH 3/4] HWPOISON: Report correct address granuality for AO huge page errors
  2010-10-07  8:41       ` Naoya Horiguchi
@ 2010-10-07  8:45         ` Andi Kleen
  2010-10-07  8:48           ` Naoya Horiguchi
  0 siblings, 1 reply; 18+ messages in thread
From: Andi Kleen @ 2010-10-07  8:45 UTC (permalink / raw)
  To: Naoya Horiguchi
  Cc: Andi Kleen, linux-kernel, fengguang.wu, linux-mm, Andi Kleen

On Thu, Oct 07, 2010 at 05:41:01PM +0900, Naoya Horiguchi wrote:
> On Thu, Oct 07, 2010 at 09:38:48AM +0200, Andi Kleen wrote:
> > On Thu, Oct 07, 2010 at 09:31:20AM +0900, Naoya Horiguchi wrote:
> > > > @@ -198,7 +199,8 @@ static int kill_proc_ao(struct task_struct *t, unsigned long addr, int trapno,
> > > >  #ifdef __ARCH_SI_TRAPNO
> > > >  	si.si_trapno = trapno;
> > > >  #endif
> > > > -	si.si_addr_lsb = PAGE_SHIFT;
> > > > +	order = PageCompound(page) ? huge_page_order(page) : PAGE_SHIFT;
> > >                                                      ^^^^
> > >                                      huge_page_order(page_hstate(page)) ?
> > 
> > Ok.
> 
> order seems to represent a least significant bit of corrupted address,
> so is huge_page_order() + PAGE_SHIFT or huge_page_shift() correct?

Both I guess.

> And since @page can be a tail page, compound_head() is needed as Wu-san pointed out.
> So huge_page_shift(page_hstate(compound_head(page))) looks good for me.

I used compound_order(compound_head(page)) + PAGE_SHIFT now.
This even works for non-compound pages, so the special case check
can be dropped.

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.
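
The arithmetic behind compound_order(compound_head(page)) + PAGE_SHIFT can
be sketched in user space. This is a model under stated assumptions, not
kernel code: MODEL_PAGE_SHIFT assumes 4 KiB base pages, and
addr_lsb_for_order() is a hypothetical helper taking the compound order of
the head page (0 for a normal page).

```c
#include <assert.h>

/* Assumption: 4 KiB base pages. PAGE_SHIFT differs on other configurations. */
#define MODEL_PAGE_SHIFT 12

/*
 * User-space model of compound_order(compound_head(page)) + PAGE_SHIFT.
 * 'order' is the compound order of the head page; a non-compound page
 * has order 0, which is why no special case is needed.
 */
static unsigned int addr_lsb_for_order(unsigned int order)
{
	assert(order < 32);	/* keep the model within a sane shift range */
	return order + MODEL_PAGE_SHIFT;
}
```

Under these assumptions a normal page reports an lsb of 12 (4 KiB), and a
2 MiB huge page, which has order 9, reports 21. Using the order alone would
report 9 and understate the affected range.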

* Re: [PATCH 3/4] HWPOISON: Report correct address granuality for AO huge page errors
  2010-10-07  8:45         ` Andi Kleen
@ 2010-10-07  8:48           ` Naoya Horiguchi
  2010-10-07  8:58             ` Andi Kleen
  0 siblings, 1 reply; 18+ messages in thread
From: Naoya Horiguchi @ 2010-10-07  8:48 UTC (permalink / raw)
  To: Andi Kleen; +Cc: linux-kernel, fengguang.wu, linux-mm, Andi Kleen

On Thu, Oct 07, 2010 at 10:45:29AM +0200, Andi Kleen wrote:
> On Thu, Oct 07, 2010 at 05:41:01PM +0900, Naoya Horiguchi wrote:
> > On Thu, Oct 07, 2010 at 09:38:48AM +0200, Andi Kleen wrote:
> > > On Thu, Oct 07, 2010 at 09:31:20AM +0900, Naoya Horiguchi wrote:
> > > > > @@ -198,7 +199,8 @@ static int kill_proc_ao(struct task_struct *t, unsigned long addr, int trapno,
> > > > >  #ifdef __ARCH_SI_TRAPNO
> > > > >  	si.si_trapno = trapno;
> > > > >  #endif
> > > > > -	si.si_addr_lsb = PAGE_SHIFT;
> > > > > +	order = PageCompound(page) ? huge_page_order(page) : PAGE_SHIFT;
> > > >                                                      ^^^^
> > > >                                      huge_page_order(page_hstate(page)) ?
> > > 
> > > Ok.
> > 
> > order seems to represent a least significant bit of corrupted address,
> > so is huge_page_order() + PAGE_SHIFT or huge_page_shift() correct?
> 
> Both I guess.
> 
> > And since @page can be a tail page, compound_head() is needed as Wu-san pointed out.
> > So huge_page_shift(page_hstate(compound_head(page))) looks good for me.
> 
> I used compound_order(compound_head(page)) + PAGE_SHIFT now.
> This even works for non compound, so the special case check
> can be dropped.

OK.

Thanks,
Naoya Horiguchi

* Re: [PATCH 3/4] HWPOISON: Report correct address granuality for AO huge page errors
  2010-10-07  8:48           ` Naoya Horiguchi
@ 2010-10-07  8:58             ` Andi Kleen
  0 siblings, 0 replies; 18+ messages in thread
From: Andi Kleen @ 2010-10-07  8:58 UTC (permalink / raw)
  To: Naoya Horiguchi
  Cc: Andi Kleen, linux-kernel, fengguang.wu, linux-mm, Andi Kleen

> > I used compound_order(compound_head(page)) + PAGE_SHIFT now.
> > This even works for non compound, so the special case check
> > can be dropped.
> 
> OK.

BTW it would be nice if mce-test checked this for the huge page case too.
(I fixed this for small pages)

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.

* Re: [PATCH 2/4] HWPOISON: Copy si_addr_lsb to user
  2010-10-06 20:48 ` [PATCH 2/4] HWPOISON: Copy si_addr_lsb to user Andi Kleen
  2010-10-07  0:27   ` Naoya Horiguchi
  2010-10-07  6:31   ` Hidetoshi Seto
@ 2010-10-08 17:09   ` Ralf Baechle
  2010-10-08 17:32     ` Andi Kleen
  2 siblings, 1 reply; 18+ messages in thread
From: Ralf Baechle @ 2010-10-08 17:09 UTC (permalink / raw)
  To: Andi Kleen, Linus Torvalds
  Cc: linux-kernel, fengguang.wu, linux-mm, Andi Kleen, Manuel Lauss,
	linux-mips

On Wed, Oct 06, 2010 at 10:48:59PM +0200, Andi Kleen wrote:

> The original hwpoison code added a new siginfo field si_addr_lsb to
> pass the granularity of the fault address to user space. Unfortunately
> this field was never copied to user space. Fix this here.
> 
> I added explicit checks for the MCEERR codes to avoid having
> to patch all potential callers to initialize the field.

That doesn't fly, see below.

> --- a/kernel/signal.c
> +++ b/kernel/signal.c
> @@ -2215,6 +2215,14 @@ int copy_siginfo_to_user(siginfo_t __user *to, siginfo_t *from)
>  #ifdef __ARCH_SI_TRAPNO
>  		err |= __put_user(from->si_trapno, &to->si_trapno);
>  #endif
> +#ifdef BUS_MCEERR_AO
> +		/* 
> +		 * Other callers might not initialize the si_lsb field,
> +	 	 * so check explicitely for the right codes here.
> +		 */
> +		if (from->si_code == BUS_MCEERR_AR || from->si_code == BUS_MCEERR_AO)
> +			err |= __put_user(from->si_addr_lsb, &to->si_addr_lsb);
> +#endif

include/asm-generic/siginfo.h defines BUS_MCEERR_AR unconditionally and is
included by every <asm/siginfo.h>, so that #ifdef condition is always
true.  struct siginfo.si_addr_lsb is defined only for the generic struct
siginfo.  The architectures that define HAVE_ARCH_SIGINFO_T (MIPS and
IA-64) do not define this field, so the build breaks.
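
The mismatch can be modelled in userspace as follows. This is only a sketch: mock_siginfo, MOCK_ARCH_SIGINFO_T and MOCK_HAVE_SI_ADDR_LSB are hypothetical names, and guarding on a macro tied to the field is an illustration, not something proposed in this thread. The values 4 and 5 mirror the generic BUS_MCEERR_AR and BUS_MCEERR_AO codes.

```c
#include <assert.h>

/* Defined unconditionally, like the codes in the generic header. */
#define MOCK_BUS_MCEERR_AR 4
#define MOCK_BUS_MCEERR_AO 5

/*
 * Hypothetical model: an architecture with its own siginfo layout
 * lacks the field, while the generic layout has it.
 */
#ifdef MOCK_ARCH_SIGINFO_T
struct mock_siginfo {
	int si_code;
	void *si_addr;
};
#else
struct mock_siginfo {
	int si_code;
	void *si_addr;
	short si_addr_lsb;
};
#define MOCK_HAVE_SI_ADDR_LSB 1
#endif

static int mock_copy_siginfo(struct mock_siginfo *to,
			     const struct mock_siginfo *from)
{
	to->si_code = from->si_code;
	to->si_addr = from->si_addr;
	/*
	 * Testing "#ifdef MOCK_BUS_MCEERR_AO" here would always be true
	 * and would fail to compile when the field is absent. The guard
	 * has to follow the field, not the code constant.
	 */
#ifdef MOCK_HAVE_SI_ADDR_LSB
	if (from->si_code == MOCK_BUS_MCEERR_AR ||
	    from->si_code == MOCK_BUS_MCEERR_AO)
		to->si_addr_lsb = from->si_addr_lsb;
#endif
	return 0;
}
```

Building the same file with -DMOCK_ARCH_SIGINFO_T drops the field and the copy together, so it still compiles.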

  Ralf

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/4] HWPOISON: Copy si_addr_lsb to user
  2010-10-08 17:09   ` Ralf Baechle
@ 2010-10-08 17:32     ` Andi Kleen
  0 siblings, 0 replies; 18+ messages in thread
From: Andi Kleen @ 2010-10-08 17:32 UTC (permalink / raw)
  To: Ralf Baechle
  Cc: Andi Kleen, Linus Torvalds, linux-kernel, fengguang.wu, linux-mm,
	Andi Kleen, Manuel Lauss, linux-mips, tony.luck

On Fri, Oct 08, 2010 at 06:09:41PM +0100, Ralf Baechle wrote:
> > --- a/kernel/signal.c
> > +++ b/kernel/signal.c
> > @@ -2215,6 +2215,14 @@ int copy_siginfo_to_user(siginfo_t __user *to, siginfo_t *from)
> >  #ifdef __ARCH_SI_TRAPNO
> >  		err |= __put_user(from->si_trapno, &to->si_trapno);
> >  #endif
> > +#ifdef BUS_MCEERR_AO
> > +		/* 
> > +		 * Other callers might not initialize the si_lsb field,
> > +	 	 * so check explicitely for the right codes here.
> > +		 */
> > +		if (from->si_code == BUS_MCEERR_AR || from->si_code == BUS_MCEERR_AO)
> > +			err |= __put_user(from->si_addr_lsb, &to->si_addr_lsb);
> > +#endif
> 
> include/asm-generic/siginfo.h defines BUS_MCEERR_AR unconditionally and is
> included by every <asm/siginfo.h>, so that #ifdef condition is always
> true.  struct siginfo.si_addr_lsb is defined only for the generic struct
> siginfo.  The architectures that define HAVE_ARCH_SIGINFO_T (MIPS and
> IA-64) do not define this field, so the build breaks.

Oops. I see two possible solutions:

- #undef BUS_MCEERR_AR in the ia64 and mips siginfo.h, or
- simply add the si_addr_lsb field there too (it just sits over padding
  and should be harmless).
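
The "sits over padding" argument can be checked with a small userspace model. The union layout and the pad size below are hypothetical and only mimic the general shape of a padded siginfo union; they are not the ia64 or mips definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical pad size; the real value differs per architecture. */
#define MOCK_SI_PAD_INTS 28

/* Fault member before and after adding the extra field. */
struct mock_sigfault_old {
	void *addr;
};

struct mock_sigfault_new {
	void *addr;
	short addr_lsb;	/* least significant valid bit of addr */
};

/* The pad array fixes the union size; members overlay it. */
union mock_fields_old {
	int pad[MOCK_SI_PAD_INTS];
	struct mock_sigfault_old sigfault;
};

union mock_fields_new {
	int pad[MOCK_SI_PAD_INTS];
	struct mock_sigfault_new sigfault;
};

/* Adding the field must not grow the structure seen by user space. */
_Static_assert(sizeof(union mock_fields_old) == sizeof(union mock_fields_new),
	       "new field must fit inside the existing padding");

/* Offset of the new field; it lands after the address. */
static size_t mock_lsb_offset(void)
{
	return offsetof(struct mock_sigfault_new, addr_lsb);
}
```

In this model the new member occupies bytes that were already reserved by the pad array, so the overall size and the offsets of existing members stay the same.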

What do you prefer?

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread (newest message: 2010-10-08 17:32 UTC)

Thread overview: 18+ messages
2010-10-06 20:48 HWPoison fixes for 2.6.36 Andi Kleen
2010-10-06 20:48 ` [PATCH 1/4] page-types.c: fix name of unpoison interface Andi Kleen
2010-10-06 20:48 ` [PATCH 2/4] HWPOISON: Copy si_addr_lsb to user Andi Kleen
2010-10-07  0:27   ` Naoya Horiguchi
2010-10-07  6:31   ` Hidetoshi Seto
2010-10-07  7:36     ` Andi Kleen
2010-10-08 17:09   ` Ralf Baechle
2010-10-08 17:32     ` Andi Kleen
2010-10-06 20:49 ` [PATCH 3/4] HWPOISON: Report correct address granuality for AO huge page errors Andi Kleen
2010-10-07  0:31   ` Naoya Horiguchi
2010-10-07  7:38     ` Andi Kleen
2010-10-07  8:41       ` Naoya Horiguchi
2010-10-07  8:45         ` Andi Kleen
2010-10-07  8:48           ` Naoya Horiguchi
2010-10-07  8:58             ` Andi Kleen
2010-10-07  1:50   ` Wu Fengguang
2010-10-06 20:49 ` [PATCH 4/4] HWPOISON: Stop shrinking at right page count Andi Kleen
2010-10-07  1:53   ` Wu Fengguang
