linux-raid.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC] super1: error handling for super-block loading
@ 2016-05-12 17:38 Gioh Kim
  2016-05-12 18:16 ` Jes Sorensen
  0 siblings, 1 reply; 4+ messages in thread
From: Gioh Kim @ 2016-05-12 17:38 UTC (permalink / raw)
  To: Jes Sorensen, linux-raid; +Cc: Elmar Gerdes, Jinpu Wang

Hi,

I found a segfault of mdadm.
mdadm[59966]: segfault at 5c ip 000000000042c1e1 sp 00007ffe52745390 
error 4 in mdadm[400000+65000]

I disassembled mdadm.

   42c170:       41 55                   push   %r13
   42c172:       49 89 f8                mov    %rdi,%r8
   42c175:       41 54                   push   %r12
   42c177:       49 89 d4                mov    %rdx,%r12
   42c17a:       55                      push   %rbp
   42c17b:       53                      push   %rbx
   42c17c:       48 89 f3                mov    %rsi,%rbx
   42c17f:       48 83 ec 08             sub    $0x8,%rsp
   42c183:       f6 c3 01                test   $0x1,%bl
   42c186:       48 8b 6f 18             mov    0x18(%rdi),%rbp
   42c18a:       44 8b 6e 1c             mov    0x1c(%rsi),%r13d
   42c18e:       48 89 f7                mov    %rsi,%rdi
   42c191:       be 88 01 00 00          mov    $0x188,%esi
   42c196:       0f 85 b8 02 00 00       jne    42c454 <socket@plt+0x29724>
   42c19c:       40 f6 c7 02             test   $0x2,%dil
   42c1a0:       0f 85 c2 02 00 00       jne    42c468 <socket@plt+0x29738>
   42c1a6:       40 f6 c7 04             test   $0x4,%dil
   42c1aa:       0f 85 ce 02 00 00       jne    42c47e <socket@plt+0x2974e>
   42c1b0:       89 f1                   mov    %esi,%ecx
   42c1b2:       31 c0                   xor    %eax,%eax
   42c1b4:       c1 e9 03                shr    $0x3,%ecx
   42c1b7:       40 f6 c6 04             test   $0x4,%sil
   42c1bb:       f3 48 ab                rep stos %rax,%es:(%rdi)
   42c1be:       74 0a                   je     42c1ca <socket@plt+0x2949a>
   42c1c0:       c7 07 00 00 00 00       movl   $0x0,(%rdi)
   42c1c6:       48 83 c7 04             add    $0x4,%rdi
   42c1ca:       40 f6 c6 02             test   $0x2,%sil
   42c1ce:       74 09                   je     42c1d9 <socket@plt+0x294a9>
   42c1d0:       66 c7 07 00 00          movw   $0x0,(%rdi)
   42c1d5:       48 83 c7 02             add    $0x2,%rdi
   42c1d9:       83 e6 01                and    $0x1,%esi
   42c1dc:       74 03                   je     42c1e1 <socket@plt+0x294b1>
   42c1de:       c6 07 00                movb   $0x0,(%rdi)
   42c1e1:       8b 45 5c                mov    0x5c(%rbp),%eax

I think this is getinfo_super1() because I rebuilt mdadm with debugging 
symbol and found exactly the same assemble code.
A NULL pointer referencing is generated by reading st->sb->raid_disks 
(%rdi=st, %rbp=sb=NULL).
I found there is no error handling if Grow_addbitmap fails to load 
super-block.

I'm not sure yet why mdadm failed to load super-block of disks.
I checked the kernel log and found I/O error from disks.
Anyway mdadm needs to handle that error case.

Please review following patch.

------------------------------------------- 8< 
-------------------------------------------------------------
 From 8cacf56b2d630c7e74bad942779ff7ed5f516d26 Mon Sep 17 00:00:00 2001
From: Gioh Kim <gi-oh.kim@profitbricks.com>
Date: Thu, 12 May 2016 19:09:45 +0200
Subject: [PATCH] super1: error handling for super-block loading

Loading super-block can fail if all sub-devices are faulty
or have I/O errors.

Signed-off-by: Gioh Kim <gi-oh.kim@profitbricks.com>
---
  Grow.c | 7 ++++++-
  1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/Grow.c b/Grow.c
index f58c753..fa08522 100755
--- a/Grow.c
+++ b/Grow.c
@@ -389,7 +389,7 @@ int Grow_addbitmap(char *devname, int fd, struct 
context *c, struct shape *s)
      }
      if (strcmp(s->bitmap_file, "internal") == 0 ||
          strcmp(s->bitmap_file, "clustered") == 0) {
-        int rv;
+        int rv = 0;
          int d;
          int offset_setable = 0;
          struct mdinfo *mdi;
@@ -419,6 +419,7 @@ int Grow_addbitmap(char *devname, int fd, struct 
context *c, struct shape *s)
                  if (fd2 < 0)
                      continue;
                  if (st->ss->load_super(st, fd2, NULL)==0) {
+                    rv++;
                      if (st->ss->add_internal_bitmap(
                              st,
                              &s->bitmap_chunk, c->delay, s->write_behind,
@@ -435,6 +436,10 @@ int Grow_addbitmap(char *devname, int fd, struct 
context *c, struct shape *s)
                  close(fd2);
              }
          }
+        if (rv == 0) {
+            pr_err("failed to load super-block.\n");
+            return 1;
+        }
          if (offset_setable) {
              st->ss->getinfo_super(st, mdi, NULL);
              sysfs_init(mdi, fd, NULL);
-- 
2.5.0



^ permalink raw reply related	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2016-05-13 15:15 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-05-12 17:38 [RFC] super1: error handling for super-block loading Gioh Kim
2016-05-12 18:16 ` Jes Sorensen
2016-05-12 19:43   ` Jes Sorensen
2016-05-13 15:15     ` Gioh Kim

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).