Linux RAID subsystem development


* Re: [PATCHv2 0/2] mdadm: setting device role of raid1 disk with failfast
From: Gioh Kim @ 2017-03-27 10:59 UTC (permalink / raw)
  To: jes.sorensen; +Cc: neilb, linux-raid, linux-kernel
In-Reply-To: <1490003517-4216-1-git-send-email-gi-oh.kim@profitbricks.com>

Hi,
Is nobody interested in those patches?

On Mon, Mar 20, 2017 at 10:51:55AM +0100, Gioh Kim wrote:
> Hi,
> 
> I've found a case where the failfast option of mdadm wrongly sets a disk faulty.
> The following is my test case.
> 
> mdadm --create /dev/md100 -l 1 --failfast -e 1.2 -n 2 /dev/vdb /dev/vdc
> mdadm /dev/md100 -a --failfast /dev/vdd
> 
> If I use the failfast option, the vdd disk is wrongly marked faulty.
> If not, it becomes a spare.
> 
> This patch fixes a corner case for setting device role and
> prints device role if it's faulty.
> This patch is based on "mdadm - v4.0-8-g72b616a - 2017-03-07".
> 
> v2: fix a typo of v1
> 
> Gioh Kim (1):
>   super1: ignore failfast flag for setting device role
> 
> Jack Wang (1):
>   super1: check and output faulty dev role
> 
>  super1.c | 14 +++++++++-----
>  1 file changed, 9 insertions(+), 5 deletions(-)
> 
> -- 
> 2.5.0
> 

-- 
Best regards,
Gi-Oh Kim
TEL: 0176 2697 8962


* Re: [PATCH 0/5] Fix dead URLs to ftp.kernel.org
From: Michal Marek @ 2017-03-27 11:54 UTC (permalink / raw)
  To: SeongJae Park, rml, axboe, shli, raven, yamada.masahito
  Cc: linux-raid, autofs, linux-kbuild, linux-kernel
In-Reply-To: <20170327054731.31882-1-sj38.park@gmail.com>

On 27.3.2017 at 07:47, SeongJae Park wrote:
> URLs to `ftp.kernel.org` exist here and there though `ftp.kernel.org` is
> already dead [0].  This patchset fixes those URLs to use `www.kernel.org`
> instead.
> 
> The change is split into multiple patches for independent review and merge
> by each maintainer, though the change is trivial.
> 
> [0] https://www.kernel.org/shutting-down-ftp-services.html
> 
> SeongJae Park (5):
>   MAINTAINERS: Fix a dead URL to ftp.kernel.org
>   drivers/block: Fix a dead URL to ftp.kernel.org
>   drivers/md: Fix a dead URL to ftp.kernel.org
>   fs/autofs4: Fix a dead URL to ftp.kernel.org
>   scripts: Fix dead URLs to ftp.kernel.org

I guess the change can be sent in a single patch to the trivial tree.

Michal


* Re: [PATCH 0/5] Fix dead URLs to ftp.kernel.org
From: SeongJae Park @ 2017-03-27 12:42 UTC (permalink / raw)
  To: Michal Marek
  Cc: rml, axboe, shli, Ian Kent, yamada.masahito, linux-raid, autofs,
	linux-kbuild, linux-kernel@vger.kernel.org
In-Reply-To: <fabba6b8-8f8d-8f8f-1660-c579e6515563@suse.com>

On Mon, Mar 27, 2017 at 8:54 PM, Michal Marek <mmarek@suse.com> wrote:
> On 27.3.2017 at 07:47, SeongJae Park wrote:
>> URLs to `ftp.kernel.org` exist here and there though `ftp.kernel.org` is
>> already dead [0].  This patchset fixes those URLs to use `www.kernel.org`
>> instead.
>>
>> The change is split into multiple patches for independent review and merge
>> by each maintainer, though the change is trivial.
>>
>> [0] https://www.kernel.org/shutting-down-ftp-services.html
>>
>> SeongJae Park (5):
>>   MAINTAINERS: Fix a dead URL to ftp.kernel.org
>>   drivers/block: Fix a dead URL to ftp.kernel.org
>>   drivers/md: Fix a dead URL to ftp.kernel.org
>>   fs/autofs4: Fix a dead URL to ftp.kernel.org
>>   scripts: Fix dead URLs to ftp.kernel.org
>
> I guess the change can be sent in a single patch to the trivial tree.

Sounds reasonable.  I will post the single patch as a reply to this mail.


Thanks,
SeongJae Park

>
> Michal
>


* [PATCH] Fix dead URLs to ftp.kernel.org
From: SeongJae Park @ 2017-03-27 12:44 UTC (permalink / raw)
  To: trivial, rml, axboe, shli, raven, yamada.masahito, mmarek
  Cc: linux-raid, autofs, linux-kbuild, linux-kernel, SeongJae Park
In-Reply-To: <CAEjAshqR0Nhp4Wkmkg_HrswsiuUb3rYJ=TOUSBSvoUT3FSWTjg@mail.gmail.com>

URLs to ftp.kernel.org still exist though the service is closed [0].
This commit fixes the URLs to use www.kernel.org instead.

[0] https://www.kernel.org/shutting-down-ftp-services.html

Signed-off-by: SeongJae Park <sj38.park@gmail.com>
---
 MAINTAINERS              | 2 +-
 drivers/block/Kconfig    | 2 +-
 drivers/md/Kconfig       | 2 +-
 fs/autofs4/Kconfig       | 2 +-
 scripts/ksymoops/README  | 5 ++---
 scripts/package/builddeb | 4 ++--
 6 files changed, 8 insertions(+), 9 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index c45c02bc6082..0a4df79c0823 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -10068,7 +10068,7 @@ W:	http://sourceforge.net/projects/accel-pptp
 PREEMPTIBLE KERNEL
 M:	Robert Love <rml@tech9.net>
 L:	kpreempt-tech@lists.sourceforge.net
-W:	ftp://ftp.kernel.org/pub/linux/kernel/people/rml/preempt-kernel
+W:	https://www.kernel.org/pub/linux/kernel/people/rml/preempt-kernel
 S:	Supported
 F:	Documentation/preempt-locking.txt
 F:	include/linux/preempt.h
diff --git a/drivers/block/Kconfig b/drivers/block/Kconfig
index f744de7a0f9b..7b608cf2516f 100644
--- a/drivers/block/Kconfig
+++ b/drivers/block/Kconfig
@@ -219,7 +219,7 @@ config BLK_DEV_LOOP
 
 	  To use the loop device, you need the losetup utility, found in the
 	  util-linux package, see
-	  <ftp://ftp.kernel.org/pub/linux/utils/util-linux/>.
+	  <https://www.kernel.org/pub/linux/utils/util-linux/>.
 
 	  The loop device driver can also be used to "hide" a file system in
 	  a disk partition, floppy, or regular file, either using encryption
diff --git a/drivers/md/Kconfig b/drivers/md/Kconfig
index b7767da50c26..585ff3284bf5 100644
--- a/drivers/md/Kconfig
+++ b/drivers/md/Kconfig
@@ -115,7 +115,7 @@ config MD_RAID10
 
 	  RAID-10 requires mdadm-1.7.0 or later, available at:
 
-	  ftp://ftp.kernel.org/pub/linux/utils/raid/mdadm/
+	  https://www.kernel.org/pub/linux/utils/raid/mdadm/
 
 	  If unsure, say Y.
 
diff --git a/fs/autofs4/Kconfig b/fs/autofs4/Kconfig
index 1204d6384d39..44727bf18297 100644
--- a/fs/autofs4/Kconfig
+++ b/fs/autofs4/Kconfig
@@ -7,7 +7,7 @@ config AUTOFS4_FS
 	  automounter (amd), which is a pure user space daemon.
 
 	  To use the automounter you need the user-space tools from
-	  <ftp://ftp.kernel.org/pub/linux/daemons/autofs/v4/>; you also
+	  <https://www.kernel.org/pub/linux/daemons/autofs/v4/>; you also
 	  want to answer Y to "NFS file system support", below.
 
 	  To compile this support as a module, choose M here: the module will be
diff --git a/scripts/ksymoops/README b/scripts/ksymoops/README
index f6cb06e3f30e..413043980127 100644
--- a/scripts/ksymoops/README
+++ b/scripts/ksymoops/README
@@ -1,8 +1,7 @@
 ksymoops has been removed from the kernel.  It was always meant to be a
 free standing utility, not linked to any particular kernel version.
 The latest version can be found in
-ftp://ftp.<country>.kernel.org/pub/linux/utils/kernel/ksymoops together
-with patches to other utilities in order to give more accurate Oops
-debugging.
+https://www.kernel.org/pub/linux/utils/kernel/ksymoops together with patches to
+other utilities in order to give more accurate Oops debugging.
 
 Keith Owens <kaos@ocs.com.au> Sat Jun 19 10:30:34 EST 1999
diff --git a/scripts/package/builddeb b/scripts/package/builddeb
index 3c575cd07888..676fc10c9514 100755
--- a/scripts/package/builddeb
+++ b/scripts/package/builddeb
@@ -262,8 +262,8 @@ EOF
 cat <<EOF > debian/copyright
 This is a packacked upstream version of the Linux kernel.
 
-The sources may be found at most Linux ftp sites, including:
-ftp://ftp.kernel.org/pub/linux/kernel
+The sources may be found at most Linux archive sites, including:
+https://www.kernel.org/pub/linux/kernel
 
 Copyright: 1991 - 2015 Linus Torvalds and others.
 
-- 
2.10.0



* Re-assembling array after double device failure
From: Andy Smith @ 2017-03-27 13:38 UTC (permalink / raw)
  To: linux-raid

Hi,

I'm attempting to clean up after what is most likely a
timeout-related double device failure (yes, I know).

I just want to check I have the right procedure here.

So, initial situation was a two device RAID-10 (sdc, sdd). sdc saw
some I/O errors and was kicked. Contents of /proc/mdstat after that:

md4 : active raid10 sdc[0](F) sdd[1]
      3906886656 blocks super 1.2 512K chunks 2 far-copies [2/1] [_U]
      bitmap: 7/30 pages [28KB], 65536KB chunk

A couple of hours later, sdd also saw some I/O errors and was
similarly kicked. Neither /dev/sdc nor sdd appear as device nodes in
the system any more at this point and the controller doesn't see
them.

sdd was re-plugged and re-appeared as sdg.

A mdadm --examine /dev/sdg looks like:

/dev/sdg:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : 4100ddce:8edf6082:ba50427e:60da0a42
           Name : elephant:4  (local to host elephant)
  Creation Time : Fri Nov 18 22:53:10 2016
     Raid Level : raid10
   Raid Devices : 2

 Avail Dev Size : 7813775024 (3725.90 GiB 4000.65 GB)
     Array Size : 3906886656 (3725.90 GiB 4000.65 GB)
  Used Dev Size : 7813773312 (3725.90 GiB 4000.65 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
   Unused Space : before=262056 sectors, after=1712 sectors
          State : active
    Device UUID : d9c9d81d:c487599a:3d3e3a30:0c512610

Internal Bitmap : 8 sectors from superblock
    Update Time : Sun Mar 26 00:00:01 2017
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : ec70d450 - correct
         Events : 298824

         Layout : far=2
     Chunk Size : 512K

   Device Role : Active device 1
   Array State : .A ('A' == active, '.' == missing, 'R' == replacing)

mdadm config:

$ grep -v '^#' /etc/mdadm/mdadm.conf | grep -v '^$'
DEVICE /dev/sd*
CREATE owner=root group=disk mode=0660 auto=yes
HOMEHOST <system>
MAILADDR root
ARRAY /dev/md/0  metadata=1.2 UUID=400bac1d:e2c5d6ef:fea3b8c8:bcb70f8f
ARRAY /dev/md/1  metadata=1.2 UUID=e29c8b89:705f0116:d888f77e:2b6e32f5
ARRAY /dev/md/2  metadata=1.2 UUID=039b3427:4be5157a:6e2d53bd:fe898803
ARRAY /dev/md/3  metadata=1.2 UUID=30f745ce:7ed41b53:4df72181:7406ea1d
ARRAY /dev/md/4  metadata=1.2 UUID=4100ddce:8edf6082:ba50427e:60da0a42
ARRAY /dev/md/5  metadata=1.2 UUID=957030cf:c09f023d:ceaebb27:e546f095

(other arrays are on different devices and are not involved here)

So, I think I need to:

- Increase /sys/block/sdg/device/timeout to 180 (already done). TLER
  not supported.

- Stop md4.

  mdadm --stop /dev/md4

- Assemble it again.

  mdadm --assemble /dev/md4

 Theory being that there is at least one good device (sdg that was
 sdd).

- If that complains, I would then have to consider re-creating the
  array with something like:

  mdadm --create --assume-clean --level=10 --layout=f2 missing /dev/sdd

- Once it's up and running, add sdc back in and let it sync

- Make timeout changes permanent.

Does that seem correct?

I'm fairly confident that the drives themselves are actually okay -
nothing untoward in SMART data - so I'm not going to replace them at
this stage.

Cheers,
Andy
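
Before forcing an assembly it can help to compare the Events counter and
Device Role that `mdadm --examine` reports for each member. The following is
a minimal sketch, assuming the "Field : value" layout shown in the output
above; the helper name examine_field is hypothetical and is not part of mdadm.

```shell
# Print the value of one "Field : value" line from `mdadm --examine` output
# read on stdin. examine_field is a hypothetical helper, not an mdadm feature.
examine_field() {
    # Split on the first " : " so values that contain colons (UUIDs, times)
    # are returned intact.
    awk -v want="$1" '
        {
            i = index($0, " : ")
            if (i == 0) next
            key = substr($0, 1, i - 1)
            gsub(/^[ \t]+|[ \t]+$/, "", key)
            if (key == want) {
                print substr($0, i + 3)
                exit
            }
        }'
}
```

It would be used as `mdadm --examine /dev/sdg | examine_field Events`, once
per member device, so the counters can be compared side by side.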


* Re: RFC: always use REQ_OP_WRITE_ZEROES for zeroing offload
From: Mike Snitzer @ 2017-03-27 14:03 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: axboe-tSWWG44O7X1aa/9Udqfwiw, linux-raid-u79uwXL29TY76Z2rM5mHXA,
	linux-scsi-u79uwXL29TY76Z2rM5mHXA,
	martin.petersen-QHcLZuEGTsvQT0dZR+AlfA,
	philipp.reisner-63ez5xqkn6DQT0dZR+AlfA,
	linux-block-u79uwXL29TY76Z2rM5mHXA,
	dm-devel-H+wXaHxf7aLQT0dZR+AlfA, shli-DgEjT+Ai2ygdnm+yROfE0A,
	Christoph Hellwig, agk-H+wXaHxf7aLQT0dZR+AlfA,
	drbd-dev-cunTk1MwBs8qoQakbn7OcQ
In-Reply-To: <20170327091056.GB6879-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>

On Mon, Mar 27 2017 at  5:10am -0400,
Christoph Hellwig <hch-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org> wrote:

> It sounds like you don't want to support traditional discard at all,
> but only WRITE ZEROES.  So in many ways this series is the right way
> forward.  It would be nice if we could do a full blown
> REQ_OP_WRITE_ZEROES for dm_thinp that zeroes out partial blocks,
> similar to what hardware that implements WRITE SAME of zeroes
> or WRITE ZEROES would do.  I'll see if I could include that in my
> series.

By "you" I assume you're referring to Lars?  Lars' approach for discard,
when drbd is layered on dm-thinp, feels over-engineered.  Not his fault,
the way discard and zeroing got conflated certainly lends itself to
these ugly hacks.  SO I do appreciate that for anything to leverage
discard_zeroes_data it needs to be reliable.  Which runs counter to how
discard was implemented (discard may get silently dropped!)  But that is
why dm-thinp doesn't advertise dzd.  Anyway...

As for the blkdev_issue_zeroout() resorting to manually zeroing the
range, if the discard fails or dzd not supported, that certainly
requires DM thinp to implement manual zeroing of the head and tail of
the range if partial blocks are being zeroed.  So I welcome any advances
there.  It is probably something that is best left to Joe or myself to
tackle.  But I'll gladly accept patches ;)
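
The head/tail arithmetic discussed above can be illustrated with a small
sketch. It only shows how a range splits at block boundaries; it is not
dm-thinp code, and the name split_zero_range and the sample numbers are
assumptions made for the illustration.

```shell
# Split the range [start, start+len) at multiples of the block size bs.
# Prints three "offset length" pairs on one line: head, middle, tail.
# split_zero_range is a hypothetical name; units are whatever the caller uses.
split_zero_range() {
    start=$1 len=$2 bs=$3
    end=$((start + len))
    first=$(( (start + bs - 1) / bs * bs ))   # first block boundary at or after start
    last=$(( end / bs * bs ))                 # last block boundary at or before end
    if [ "$first" -ge "$last" ]; then
        # No whole block is covered: the entire range needs manual zeroing.
        echo "$start $len $end 0 $end 0"
    else
        echo "$start $((first - start)) $first $((last - first)) $last $((end - last))"
    fi
}
```

For example, `split_zero_range 100 1000 512` separates a partial head and a
partial tail, which would need explicit zero writes, from the whole block in
the middle.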


* Re: Re-assembling array after double device failure
From: Andreas Klauer @ 2017-03-27 14:31 UTC (permalink / raw)
  To: linux-raid
In-Reply-To: <20170327133813.GQ4349@bitfolk.com>

On Mon, Mar 27, 2017 at 01:38:13PM +0000, Andy Smith wrote:
> I'm fairly confident that the drives themselves are actually okay -
> nothing untoward in SMART data - so I'm not going to replace them at
> this stage.

You did not show any logs or SMART output. There is literally nothing 
in your mail that points at timeouts. If your confidence is based on 
the frequent "disk got kicked. must be timeouts!!1" mails on this list, 
then I wish you all the best. Praying works for some people, right...?

If you get two disks kicked, chances are something is seriously wrong.
If there is any doubt at all, and no backups exist, ddrescue both drives.
Better to make a copy you don't need than need a copy you didn't make.

Be very careful with mdadm --create. Defaults change over time and 
rescue systems might give you old mdadm versions, so you have to 
specify everything (metadata version, data offsets, ...).

Consider using overlays for experiments:

https://raid.wiki.kernel.org/index.php/Recovering_a_failed_software_RAID#Making_the_harddisks_read-only_using_an_overlay_file

(But not on faulty drives.)

Regards
Andreas Klauer


* Re: RFC: always use REQ_OP_WRITE_ZEROES for zeroing offload
From: Christoph Hellwig @ 2017-03-27 14:57 UTC (permalink / raw)
  To: Mike Snitzer
  Cc: axboe-tSWWG44O7X1aa/9Udqfwiw, linux-block-u79uwXL29TY76Z2rM5mHXA,
	linux-raid-u79uwXL29TY76Z2rM5mHXA,
	linux-scsi-u79uwXL29TY76Z2rM5mHXA,
	martin.petersen-QHcLZuEGTsvQT0dZR+AlfA,
	philipp.reisner-63ez5xqkn6DQT0dZR+AlfA, Christoph Hellwig,
	dm-devel-H+wXaHxf7aLQT0dZR+AlfA, shli-DgEjT+Ai2ygdnm+yROfE0A,
	Christoph Hellwig, agk-H+wXaHxf7aLQT0dZR+AlfA,
	drbd-dev-cunTk1MwBs8qoQakbn7OcQ
In-Reply-To: <20170327140307.GA13020-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

On Mon, Mar 27, 2017 at 10:03:07AM -0400, Mike Snitzer wrote:
> By "you" I assume you're referring to Lars?

Yes.

> Lars' approach for discard,
> when drbd is layered on dm-thinp, feels over-engineered.  Not his fault,
> the way discard and zeroing got conflated certainly lends itself to
> these ugly hacks.  SO I do appreciate that for anything to leverage
> discard_zeroes_data it needs to be reliable.  Which runs counter to how
> discard was implemented (discard may get silently dropped!)  But that is
> why dm-thinp doesn't advertise dzd.  Anyway...

That's exactly what this series does - remove discard_zeroes_data and
use the new REQ_OP_WRITE_ZEROES for anything that wants zeroing offload.

> As for the blkdev_issue_zeroout() resorting to manually zeroing the
> range, if the discard fails or dzd not supported, that certainly
> requires DM thinp to implement manual zeroing of the head and tail of
> the range if partial blocks are being zeroed.  So I welcome any advances
> there.  It is probably something that is best left to Joe or myself to
> tackle.  But I'll gladly accept patches ;)

Ok, I'll happily leave this to the two of you..


* Re: [Drbd-dev] RFC: always use REQ_OP_WRITE_ZEROES for zeroing offload
From: Bart Van Assche @ 2017-03-27 15:08 UTC (permalink / raw)
  To: hch@infradead.org, snitzer@redhat.com
  Cc: linux-block@vger.kernel.org, agk@redhat.com,
	linux-raid@vger.kernel.org, hch@lst.de,
	martin.petersen@oracle.com, philipp.reisner@linbit.com,
	axboe@kernel.dk, linux-scsi@vger.kernel.org,
	drbd-dev@lists.linbit.com, shli@kernel.org, dm-devel@redhat.com
In-Reply-To: <20170327140307.GA13020@redhat.com>

On Mon, 2017-03-27 at 10:03 -0400, Mike Snitzer wrote:
> As for the blkdev_issue_zeroout() resorting to manually zeroing the
> range, if the discard fails or dzd not supported, that certainly
> requires DM thinp to implement manual zeroing of the head and tail of
> the range if partial blocks are being zeroed.  So I welcome any advances
> there.  It is probably something that is best left to Joe or myself to
> tackle.  But I'll gladly accept patches ;)

Some time ago I posted a patch series for the block layer that zeroes
start and tail for nonaligned discard requests if dzd has been set. See
also "[PATCH v3 0/5] Make blkdev_issue_discard() submit aligned discard
requests" (https://www.spinics.net/lists/linux-block/msg02360.html).

Bart.


* Re: Re-assembling array after double device failure
From: Anthony Youngman @ 2017-03-27 15:23 UTC (permalink / raw)
  To: linux-raid
In-Reply-To: <20170327133813.GQ4349@bitfolk.com>



On 27/03/17 14:38, Andy Smith wrote:
> Hi,
>
> I'm attempting to clean up after what is most likely a
> timeout-related double device failure (yes, I know).
>
> I just want to check I have the right procedure here.
>
> So, initial situation was a two device RAID-10 (sdc, sdd). sdc saw
> some I/O errors and was kicked. Contents of /proc/mdstat after that:
>
> md4 : active raid10 sdc[0](F) sdd[1]
>       3906886656 blocks super 1.2 512K chunks 2 far-copies [2/1] [_U]
>       bitmap: 7/30 pages [28KB], 65536KB chunk
>
> A couple of hours later, sdd also saw some I/O errors and was
> similarly kicked. Neither /dev/sdc nor sdd appear as device nodes in
> the system any more at this point and the controller doesn't see
> them.
>
> sdd was re-plugged and re-appeared as sdg.
>
> A mdadm --examine /dev/sdg looks like:
>
> /dev/sdg:
>           Magic : a92b4efc
>         Version : 1.2
>     Feature Map : 0x1
>      Array UUID : 4100ddce:8edf6082:ba50427e:60da0a42
>            Name : elephant:4  (local to host elephant)
>   Creation Time : Fri Nov 18 22:53:10 2016
>      Raid Level : raid10
>    Raid Devices : 2
>
>  Avail Dev Size : 7813775024 (3725.90 GiB 4000.65 GB)
>      Array Size : 3906886656 (3725.90 GiB 4000.65 GB)
>   Used Dev Size : 7813773312 (3725.90 GiB 4000.65 GB)
>     Data Offset : 262144 sectors
>    Super Offset : 8 sectors
>    Unused Space : before=262056 sectors, after=1712 sectors
>           State : active
>     Device UUID : d9c9d81d:c487599a:3d3e3a30:0c512610
>
> Internal Bitmap : 8 sectors from superblock
>     Update Time : Sun Mar 26 00:00:01 2017
>   Bad Block Log : 512 entries available at offset 72 sectors
>        Checksum : ec70d450 - correct
>          Events : 298824
>
>          Layout : far=2
>      Chunk Size : 512K
>
>    Device Role : Active device 1
>    Array State : .A ('A' == active, '.' == missing, 'R' == replacing)
>
> mdadm config:
>
> $ grep -v '^#' /etc/mdadm/mdadm.conf | grep -v '^$'
> DEVICE /dev/sd*
> CREATE owner=root group=disk mode=0660 auto=yes
> HOMEHOST <system>
> MAILADDR root
> ARRAY /dev/md/0  metadata=1.2 UUID=400bac1d:e2c5d6ef:fea3b8c8:bcb70f8f
> ARRAY /dev/md/1  metadata=1.2 UUID=e29c8b89:705f0116:d888f77e:2b6e32f5
> ARRAY /dev/md/2  metadata=1.2 UUID=039b3427:4be5157a:6e2d53bd:fe898803
> ARRAY /dev/md/3  metadata=1.2 UUID=30f745ce:7ed41b53:4df72181:7406ea1d
> ARRAY /dev/md/4  metadata=1.2 UUID=4100ddce:8edf6082:ba50427e:60da0a42
> ARRAY /dev/md/5  metadata=1.2 UUID=957030cf:c09f023d:ceaebb27:e546f095
>
> (other arrays are on different devices and are not involved here)
>
> So, I think I need to:
>
> - Increase /sys/block/sdg/device/timeout to 180 (already done). TLER
>   not supported.
>
> - Stop md4.
>
>   mdadm --stop /dev/md4
>
> - Assemble it again.
>
>   mdadm --assemble /dev/md4
>
>  Theory being that there is at least one good device (sdg that was
>  sdd).
>
> - If that complains, I would then have to consider re-creating the
>   array with something like:

NEVER NEVER NEVER use --create except as a last resort. Try --assemble 
--force. And if you are going to try it, as an absolute minimum, read 
the kernel raid wiki, get lsdrv, run it AND MAKE SURE THE OUTPUT IS SAFE 
SOMEWHERE.

https://raid.wiki.kernel.org/index.php/Asking_for_help

Snag is, you might end up with a non-functional array with two spare 
drives. I'll have to step back and let the experts handle that if it 
happens.
>
>   mdadm --create --assume-clean --level=10 --layout=f2 missing /dev/sdd
>
> - Once it's up and running, add sdc back in and let it sync
>
> - Make timeout changes permanent.

I'd do this as the very first step - I think you need to put a script in 
your run-level. There's a good sample script on the wiki.

That way it'll get done as the system boots, and should prevent any 
problems. Oh - and do scheduled scrubs, as the fact you're getting 
timeout errors indicates that something is wrong - a scrub is probably 
sufficient to clean it up.
>
> Does that seem correct?

Hopefully fixing the timeout, followed by a --assemble --force, then a 
scrub, will be all that's required.

>
> I'm fairly confident that the drives themselves are actually okay -
> nothing untoward in SMART data - so I'm not going to replace them at
> this stage.
>
Cheers,
Wol
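
Making the timeout change persistent could look roughly like the sketch
below, run from a boot script. This is not the sample script from the wiki;
the function name set_disk_timeouts and the optional sysfs root argument
(there only so the function can be tried against a scratch directory) are
assumptions.

```shell
# Write a command timeout (in seconds) to every
# <sysfs_root>/block/sd*/device/timeout file.
# set_disk_timeouts is a hypothetical helper; it applies the same value to
# all sd* devices and does not check whether a drive supports SCT ERC.
set_disk_timeouts() {
    secs=$1
    root=${2:-/sys}
    for f in "$root"/block/sd*/device/timeout; do
        [ -w "$f" ] || continue     # skip an unmatched glob or unwritable entry
        echo "$secs" > "$f"
    done
}
```

Called as `set_disk_timeouts 180` by root, it applies the value used earlier
in this thread to all sd* devices present at that moment; drives plugged in
later would need it run again.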


* Re: Re-assembling array after double device failure
From: Anthony Youngman @ 2017-03-27 15:27 UTC (permalink / raw)
  To: Andreas Klauer, linux-raid
In-Reply-To: <20170327143113.GA19680@metamorpher.de>



On 27/03/17 15:31, Andreas Klauer wrote:
> You did not show any logs or SMART output. There is literally nothing
> in your mail that points at timeouts. If your confidence is based on
> the frequent "disk got kicked. must be timeouts!!1" mails on this list,
> then I wish you all the best. Praying works for some people, right...?

To quote the OP's original email ...

"- Increase /sys/block/sdg/device/timeout to 180 (already done). TLER
   not supported."

Cheers,
Wol


* Re: [PATCH v3] block: trace completion of all bios.
From: Christoph Hellwig @ 2017-03-27 17:14 UTC (permalink / raw)
  To: NeilBrown
  Cc: Jens Axboe, linux-block, linux-raid, Martin K . Petersen,
	Mike Snitzer, Ming Lei, linux-kernel, Christoph Hellwig, dm-devel,
	Shaohua Li, Alasdair Kergon
In-Reply-To: <87zig76oca.fsf@notabene.neil.brown.name>

On Mon, Mar 27, 2017 at 08:49:57PM +1100, NeilBrown wrote:
> On Mon, Mar 27 2017, Christoph Hellwig wrote:
> 
> > I don't really like the flag at all.  I'd much prefer a __bio_endio
> > with a 'bool trace' flag.  Also please remove the manual tracing in
> > dm.c.  Once that is done I suspect we can also remove the
> > block_bio_complete export.
> 
> Can you say why you don't like it?

It uses up a precious bit in the bio for something that should be state
that can be determined in the caller at compile time.



* [PATCH v3] md/raid5: use consistency_policy to remove journal feature
From: Song Liu @ 2017-03-27 17:51 UTC (permalink / raw)
  To: linux-raid
  Cc: shli, neilb, kernel-team, dan.j.williams, hch, jes.sorensen,
	Song Liu

When journal device of an array fails, the array is forced into read-only
mode. To make the array normal without adding another journal device, we
need to remove journal _feature_ from the array.

This patch allows removing the journal _feature_ from an array. For this,
the existing journal should be either missing or faulty.

To remove journal feature, it is necessary to remove the journal device
first:

  mdadm --fail /dev/md0 /dev/sdb
  mdadm: set /dev/sdb faulty in /dev/md0
  mdadm --remove /dev/md0 /dev/sdb
  mdadm: hot removed /dev/sdb from /dev/md0

Then the journal feature can be removed by echoing into the sysfs file:

 cat /sys/block/md0/md/consistency_policy
 journal

 echo resync > /sys/block/md0/md/consistency_policy
 cat /sys/block/md0/md/consistency_policy
 resync

Signed-off-by: Song Liu <songliubraving@fb.com>
---
 drivers/md/raid5.c | 46 ++++++++++++++++++++++++++++++++++++----------
 1 file changed, 36 insertions(+), 10 deletions(-)

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 266d661..6036d5e 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -8292,17 +8292,41 @@ static int raid5_change_consistency_policy(struct mddev *mddev, const char *buf)
 	}
 
 	if (strncmp(buf, "ppl", 3) == 0 && !raid5_has_ppl(conf)) {
-		mddev_suspend(mddev);
-		set_bit(MD_HAS_PPL, &mddev->flags);
-		err = log_init(conf, NULL);
-		if (!err)
+		/* ppl only works with RAID 5 */
+		if (conf->level == 5) {
+			mddev_suspend(mddev);
+			set_bit(MD_HAS_PPL, &mddev->flags);
+			err = log_init(conf, NULL);
+			if (!err)
+				raid5_reset_stripe_cache(mddev);
+			mddev_resume(mddev);
+		} else
+			err = -EINVAL;
+	} else if (strncmp(buf, "resync", 6) == 0) {
+		if (raid5_has_ppl(conf)) {
+			mddev_suspend(mddev);
+			log_exit(conf);
 			raid5_reset_stripe_cache(mddev);
-		mddev_resume(mddev);
-	} else if (strncmp(buf, "resync", 6) == 0 && raid5_has_ppl(conf)) {
-		mddev_suspend(mddev);
-		log_exit(conf);
-		raid5_reset_stripe_cache(mddev);
-		mddev_resume(mddev);
+			mddev_resume(mddev);
+		} else if (test_bit(MD_HAS_JOURNAL, &conf->mddev->flags) &&
+			   r5l_log_disk_error(conf)) {
+			bool journal_dev_exists = false;
+			struct md_rdev *rdev;
+
+			rdev_for_each(rdev, mddev)
+				if (test_bit(Journal, &rdev->flags)) {
+					journal_dev_exists = true;
+					break;
+				}
+
+			if (!journal_dev_exists) {
+				mddev_suspend(mddev);
+				clear_bit(MD_HAS_JOURNAL, &mddev->flags);
+				mddev_resume(mddev);
+			} else  /* need remove journal device first */
+				err = -EBUSY;
+		} else
+			err = -EINVAL;
 	} else {
 		err = -EINVAL;
 	}
@@ -8337,6 +8361,7 @@ static struct md_personality raid6_personality =
 	.quiesce	= raid5_quiesce,
 	.takeover	= raid6_takeover,
 	.congested	= raid5_congested,
+	.change_consistency_policy = raid5_change_consistency_policy,
 };
 static struct md_personality raid5_personality =
 {
@@ -8385,6 +8410,7 @@ static struct md_personality raid4_personality =
 	.quiesce	= raid5_quiesce,
 	.takeover	= raid4_takeover,
 	.congested	= raid5_congested,
+	.change_consistency_policy = raid5_change_consistency_policy,
 };
 
 static int __init raid5_init(void)
-- 
2.9.3
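
The sysfs steps from the commit message above can be wrapped in a small
helper. This is a sketch only, assuming a kernel with this patch applied;
the function name drop_journal_feature is hypothetical, and the md sysfs
directory is passed in as an argument.

```shell
# Switch an array from the "journal" to the "resync" consistency policy.
# $1 is the array's md sysfs directory, e.g. /sys/block/md0/md.
# drop_journal_feature is a hypothetical helper name.
drop_journal_feature() {
    policy_file=$1/consistency_policy
    current=$(cat "$policy_file") || return 1
    if [ "$current" != "journal" ]; then
        echo "consistency_policy is '$current', nothing to do" >&2
        return 0
    fi
    # With this patch the write is rejected (EBUSY or EINVAL) unless the
    # journal device has failed and has already been removed from the array.
    echo resync > "$policy_file" || return 1
    cat "$policy_file"
}
```

After failing and removing the journal device with mdadm as shown above,
`drop_journal_feature /sys/block/md0/md` would be expected to print "resync".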



* Re: constant array_state active after specific jobs
From: Shaohua Li @ 2017-03-27 18:08 UTC (permalink / raw)
  To: NeilBrown; +Cc: pdi, linux-raid
In-Reply-To: <8737e39rg0.fsf@notabene.neil.brown.name>

On Fri, Mar 24, 2017 at 04:25:35PM +1100, Neil Brown wrote:
> On Thu, Mar 23 2017, pdi wrote:
> 
> > Greetings all,
> >
> > The problem in a nutshell is that an array is clean after boot, until
> > some specific jobs switch it to active where it remains until reboot.
> >
> > A similar problem was discussed, and solved, in 
> > https://www.spinics.net/lists/raid/msg46450.html. However, AFAICT,
> > it is not the same issue.
> >
> > I would be grateful for any insights as to why this happens and/or how
> > to prevent it.
> >
> > The relevant info follows, please let me know if anything further might
> > help.
> >
> > Many thanks in advance.
> >
> > - uname -a
> >   Linux hostname 4.4.38 #1 SMP Sun Dec 11 16:03:41 CST 2016 x86_64
> >   Intel(R) Core(TM)2 Duo CPU E8400 @ 3.00GHz GenuineIntel GNU/Linux
> > - mdadm -V
> >   mdadm - v3.3.4 - 3rd August 2015
> > - Desktop drives without sct/erc,
> >   with timeout mismatch correction as per
> >   https://raid.wiki.kernel.org/index.php/Timeout_Mismatch
> > - /dev/md9 is a raid10 array, 4 devices, far=2,
> >   with various dirs used as samba and nfs shares
> > - The array is in *constant* array_state active
> > - mdadm -D /dev/md9 | grep 'State :'
> >   State : active
> > - cat /sys/block/md9/md/array_state
> >   active
> > - watch -d 'grep md9 /proc/diskstats'
> >   remain unchanged
> > - uptime
> >   load average: 0.00, 0.00, 0.00
> > - cat /sys/block/md9/md/safe_mode_delay
> >   0.201
> > - echo 0.1 > /sys/block/md9/md/safe_mode_delay
> >   array_state remains active
> > - echo clean > /sys/block/md9/md/array_state
> >   echo: write error: Device or resource busy
> > - reboot (with or without prior check)
> >   array_state clean
> > - After reboot, array remains clean until some specific
> >   jobs put it in constant active state. Such jobs so far
> >   identified:
> >   - echo check > /sys/block/md9/md/sync_action
> >   - run an rsnapshot job
> >   - start a qemu/kvm vm
> > - Other jobs, like text/doc editing, multimedia playback,
> >   etc retain array_state clean
> 
> This bug was introduced by
> Commit: 20d0189b1012 ("block: Introduce new bio_split()")
> in 3.14, and fixed by
> Commit: 9b622e2bbcf0 ("raid10: increment write counter after bio is split")
> in 4.8.
> 
> Maybe the latter patch should be sent to -stable ??

Sure, looks suitable, will do it now.

Thanks,
Shaohua
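
Whether a given mainline version falls between the two commits named above
(introduced in 3.14, fixed in 4.8) can be checked with a sketch like the
following. The function name in_affected_range is hypothetical, and the
check looks only at the major.minor number, so it cannot know about stable
or distribution backports of the fix.

```shell
# Return 0 if a version string such as "4.4.38" lies in the range [3.14, 4.8),
# i.e. at or after the release that introduced the bug and before the fix.
# in_affected_range is a hypothetical name; backports are not considered.
in_affected_range() {
    major=${1%%.*}              # text before the first dot
    rest=${1#*.}                # text after the first dot
    minor=${rest%%[!0-9]*}      # leading digits of the remainder
    n=$((major * 100 + minor))
    [ "$n" -ge 314 ] && [ "$n" -lt 408 ]
}
```

For example, `in_affected_range "$(uname -r)"` succeeds on the 4.4.38 kernel
from the original report.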


* Re: [PATCH] md:array cannot be opened again after 'md_set_readonly'
From: Shaohua Li @ 2017-03-27 18:22 UTC (permalink / raw)
  To: Zhilong Liu; +Cc: neilb, shli, linux-raid, Guoqing Jiang
In-Reply-To: <1490601145-5865-1-git-send-email-zlliu@suse.com>

On Mon, Mar 27, 2017 at 03:52:25PM +0800, Zhilong Liu wrote:
> This fixes a bug where an array cannot be opened again after 'md_set_readonly',
> because the MD_CLOSING bit is still set and has not been cleared.
> MD_CLOSING should only be set for a short period of time to avoid certain
> races. After the operation that set it completes, it should be cleared.

Where is the bit set? Why not clear it after the operation instead of
clearing it in set_readonly?
 
> Reviewed-by: NeilBrown <neilb@suse.com>
> Cc: Guoqing Jiang <gqjiang@suse.com>
> Signed-off-by: Zhilong Liu <zlliu@suse.com>
> ---
>  drivers/md/md.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index f6ae1d6..7f2db7c 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -5588,6 +5588,7 @@ static int md_set_readonly(struct mddev *mddev, struct block_device *bdev)
>  	int err = 0;
>  	int did_freeze = 0;
>  
> +	test_and_clear_bit(MD_CLOSING, &mddev->flags);

I don't understand why this must be a test_and_clear.

Thanks,
Shaohua


* Re: [PATCH v3] md/raid5: use consistency_policy to remove journal feature
From: Shaohua Li @ 2017-03-27 19:02 UTC (permalink / raw)
  To: Song Liu
  Cc: linux-raid, shli, neilb, kernel-team, dan.j.williams, hch,
	jes.sorensen
In-Reply-To: <20170327175133.3607211-1-songliubraving@fb.com>

On Mon, Mar 27, 2017 at 10:51:33AM -0700, Song Liu wrote:
> When journal device of an array fails, the array is forced into read-only
> mode. To make the array normal without adding another journal device, we
> need to remove journal _feature_ from the array.
> 
> This patch allows removing the journal _feature_ from an array. For this,
> the existing journal should be either missing or faulty.
> 
> To remove journal feature, it is necessary to remove the journal device
> first:
> 
>   mdadm --fail /dev/md0 /dev/sdb
>   mdadm: set /dev/sdb faulty in /dev/md0
>   mdadm --remove /dev/md0 /dev/sdb
>   mdadm: hot removed /dev/sdb from /dev/md0
> 
> Then the journal feature can be removed by echoing into the sysfs file:
> 
>  cat /sys/block/md0/md/consistency_policy
>  journal
> 
>  echo resync > /sys/block/md0/md/consistency_policy
>  cat /sys/block/md0/md/consistency_policy
>  resync

Looks good, applied, thanks!
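
For readers skimming the diff below, the decision logic of the new handler can be modelled in userspace roughly as follows. This is only a sketch: struct array_model and model_change_policy() are hypothetical, the suspend/resume and log_init()/log_exit() calls are left out, and log_init() is assumed to succeed.

```c
#include <errno.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical, reduced view of the state the handler inspects. */
struct array_model {
	int level;			/* 4, 5 or 6 */
	bool has_ppl;			/* models raid5_has_ppl() */
	bool has_journal;		/* models MD_HAS_JOURNAL */
	bool journal_failed;		/* models r5l_log_disk_error() */
	bool journal_dev_present;	/* an rdev with the Journal flag exists */
};

/* Returns 0 on success or a negative errno, mirroring the patch. */
static int model_change_policy(struct array_model *a, const char *buf)
{
	if (strncmp(buf, "ppl", 3) == 0 && !a->has_ppl) {
		if (a->level != 5)
			return -EINVAL;	/* ppl only works with RAID 5 */
		a->has_ppl = true;
		return 0;
	}
	if (strncmp(buf, "resync", 6) == 0) {
		if (a->has_ppl) {
			a->has_ppl = false;
			return 0;
		}
		if (a->has_journal && a->journal_failed) {
			if (a->journal_dev_present)
				return -EBUSY;	/* remove the journal device first */
			a->has_journal = false;
			return 0;
		}
	}
	return -EINVAL;
}
```

In this model, writing "resync" while the failed journal device is still attached yields -EBUSY, which matches the mdadm --fail / --remove steps required in the description.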
 
> Signed-off-by: Song Liu <songliubraving@fb.com>
> ---
>  drivers/md/raid5.c | 46 ++++++++++++++++++++++++++++++++++++----------
>  1 file changed, 36 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index 266d661..6036d5e 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -8292,17 +8292,41 @@ static int raid5_change_consistency_policy(struct mddev *mddev, const char *buf)
>  	}
>  
>  	if (strncmp(buf, "ppl", 3) == 0 && !raid5_has_ppl(conf)) {
> -		mddev_suspend(mddev);
> -		set_bit(MD_HAS_PPL, &mddev->flags);
> -		err = log_init(conf, NULL);
> -		if (!err)
> +		/* ppl only works with RAID 5 */
> +		if (conf->level == 5) {
> +			mddev_suspend(mddev);
> +			set_bit(MD_HAS_PPL, &mddev->flags);
> +			err = log_init(conf, NULL);
> +			if (!err)
> +				raid5_reset_stripe_cache(mddev);
> +			mddev_resume(mddev);
> +		} else
> +			err = -EINVAL;
> +	} else if (strncmp(buf, "resync", 6) == 0) {
> +		if (raid5_has_ppl(conf)) {
> +			mddev_suspend(mddev);
> +			log_exit(conf);
>  			raid5_reset_stripe_cache(mddev);
> -		mddev_resume(mddev);
> -	} else if (strncmp(buf, "resync", 6) == 0 && raid5_has_ppl(conf)) {
> -		mddev_suspend(mddev);
> -		log_exit(conf);
> -		raid5_reset_stripe_cache(mddev);
> -		mddev_resume(mddev);
> +			mddev_resume(mddev);
> +		} else if (test_bit(MD_HAS_JOURNAL, &conf->mddev->flags) &&
> +			   r5l_log_disk_error(conf)) {
> +			bool journal_dev_exists = false;
> +			struct md_rdev *rdev;
> +
> +			rdev_for_each(rdev, mddev)
> +				if (test_bit(Journal, &rdev->flags)) {
> +					journal_dev_exists = true;
> +					break;
> +				}
> +
> +			if (!journal_dev_exists) {
> +				mddev_suspend(mddev);
> +				clear_bit(MD_HAS_JOURNAL, &mddev->flags);
> +				mddev_resume(mddev);
> +			} else  /* need remove journal device first */
> +				err = -EBUSY;
> +		} else
> +			err = -EINVAL;
>  	} else {
>  		err = -EINVAL;
>  	}
> @@ -8337,6 +8361,7 @@ static struct md_personality raid6_personality =
>  	.quiesce	= raid5_quiesce,
>  	.takeover	= raid6_takeover,
>  	.congested	= raid5_congested,
> +	.change_consistency_policy = raid5_change_consistency_policy,
>  };
>  static struct md_personality raid5_personality =
>  {
> @@ -8385,6 +8410,7 @@ static struct md_personality raid4_personality =
>  	.quiesce	= raid5_quiesce,
>  	.takeover	= raid4_takeover,
>  	.congested	= raid5_congested,
> +	.change_consistency_policy = raid5_change_consistency_policy,
>  };
>  
>  static int __init raid5_init(void)
> -- 
> 2.9.3
> 

^ permalink raw reply

* Re: [PATCH] mdadm:checking --level once mode has been set
From: jes.sorensen @ 2017-03-27 21:59 UTC (permalink / raw)
  To: Zhilong Liu; +Cc: linux-raid
In-Reply-To: <1489987046-22496-1-git-send-email-zlliu@suse.com>

Zhilong Liu <zlliu@suse.com> writes:
> mdadm: it would be better to check --level earlier,
> because a different prompt is shown if the user
> forgets to specify --level, such as:
> ./mdadm --build /dev/md0 -n2 /dev/loop[0-1]
>
> Signed-off-by: Zhilong Liu <zlliu@suse.com>
> ---
>  Create.c | 4 ----
>  mdadm.c  | 6 ++++++
>  2 files changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/Create.c b/Create.c
> index 2721884..50ec85e 100644
> --- a/Create.c
> +++ b/Create.c
> @@ -125,10 +125,6 @@ int Create(struct supertype *st, char *mddev,
>  	memset(&info, 0, sizeof(info));
>  	if (s->level == UnSet && st && st->ss->default_geometry)
>  		st->ss->default_geometry(st, &s->level, NULL, NULL);
> -	if (s->level == UnSet) {
> -		pr_err("a RAID level is needed to create an array.\n");
> -		return 1;
> -	}
>  	if (s->raiddisks < 4 && s->level == 6) {
>  		pr_err("at least 4 raid-devices needed for level 6\n");
>  		return 1;
> diff --git a/mdadm.c b/mdadm.c
> index d6ad8dc..fcb33d1 100644
> --- a/mdadm.c
> +++ b/mdadm.c
> @@ -349,6 +349,12 @@ int main(int argc, char *argv[])
>  				pr_err("Must give -a/--add for devices to add: %s\n", optarg);
>  				exit(2);
>  			}
> +			if (devs_found > 0 && s.level == UnSet && !devmode) {
> +				if (mode == CREATE || mode == BUILD) {
> +					pr_err("a RAID level is needed to create or build an array.\n");
> +					exit(2);
> +				}
> +			}
>  			dv = xmalloc(sizeof(*dv));
>  			dv->devname = optarg;
>  			dv->disposition = devmode;

So I am not sure I like this solution. I don't like the attempted catch
all global error handling, where we hope we catch all the cases in the
calling function. I would really prefer to move towards a model where
errors are caught in the function and we instead do better handling of
error return values as we return.

Second, your patch changes the failure return code for Create() from
exit(1) to exit(2).
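
As a rough illustration of the model I have in mind, here is a sketch in which each function validates its own input and the caller only propagates the return value. All names (create_array(), build_array(), dispatch(), UNSET) are hypothetical stand-ins and not the real mdadm functions.

```c
#include <stdio.h>

#define UNSET (-1)	/* stand-in for mdadm's UnSet sentinel */

enum mode { MODE_CREATE, MODE_BUILD };

/* Hypothetical reduced Create(): checks its own preconditions. */
static int create_array(int level, int raiddisks)
{
	if (level == UNSET) {
		fprintf(stderr, "mdadm: a RAID level is needed to create an array.\n");
		return 1;
	}
	if (raiddisks < 4 && level == 6) {
		fprintf(stderr, "mdadm: at least 4 raid-devices needed for level 6\n");
		return 1;
	}
	return 0;
}

/* Hypothetical reduced Build(): same idea, own message. */
static int build_array(int level)
{
	if (level == UNSET) {
		fprintf(stderr, "mdadm: a RAID level is needed to build an array.\n");
		return 1;
	}
	return 0;
}

/* The caller does no validation; it hands the result back as exit status. */
static int dispatch(enum mode mode, int level, int raiddisks)
{
	switch (mode) {
	case MODE_CREATE:
		return create_array(level, raiddisks);
	case MODE_BUILD:
		return build_array(level);
	}
	return 2;	/* unknown mode */
}
```

With this shape the exit status for a missing level stays 1, and new callers cannot forget the check.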

Jes

^ permalink raw reply

* Re: [PATCH] mdadm:fixed some trivial typos in comments of mdadm.h
From: jes.sorensen @ 2017-03-27 22:00 UTC (permalink / raw)
  To: Zhilong Liu; +Cc: linux-raid
In-Reply-To: <1489987206-22578-1-git-send-email-zlliu@suse.com>

Zhilong Liu <zlliu@suse.com> writes:
> mdadm.h: fixed some trivial typos in comments
>
> Signed-off-by: Zhilong Liu <zlliu@suse.com>
> ---
>  mdadm.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

Applied!

Thanks,
Jes

^ permalink raw reply

* Re: [PATCH] mdadm/grow: reshape would be stuck from raid1 to raid5
From: jes.sorensen @ 2017-03-27 22:10 UTC (permalink / raw)
  To: Zhilong Liu; +Cc: linux-raid, Harald Hoyer
In-Reply-To: <1489987229-22631-1-git-send-email-zlliu@suse.com>

Zhilong Liu <zlliu@suse.com> writes:
> the reshape gets stuck at the very beginning
> when growing an array from raid1 to raid5; correct the name
> of mdadm-grow-continue@.service in continue_via_systemd.
>
> reproduce steps:
> ./mdadm -CR /dev/md0 -l1 -b internal -n2 /dev/loop[0-1]
> ./mdadm --grow /dev/md0 -l5 -n3 -a /dev/loop2
>
> Signed-off-by: Zhilong Liu <zlliu@suse.com>
> ---
>  Grow.c | 6 ++----
>  1 file changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/Grow.c b/Grow.c
> index 455c5f9..10c02a1 100755
> --- a/Grow.c
> +++ b/Grow.c
> @@ -2808,13 +2808,11 @@ static int continue_via_systemd(char *devnm)
>  		 */
>  		close(2);
>  		open("/dev/null", O_WRONLY);
> -		snprintf(pathbuf, sizeof(pathbuf), "mdadm-grow-continue@%s.service",
> -			 devnm);
> +		snprintf(pathbuf, sizeof(pathbuf), "mdadm-grow-continue@.service");

My memory is rusty here, isn't systemctl interpreting the device name in
mdadm-grow-continue@<device>.service as an argument?

>  		status = execl("/usr/bin/systemctl", "systemctl",
>  			       "start",
>  			       pathbuf, NULL);
> -		status = execl("/bin/systemctl", "systemctl", "start",
> -			       pathbuf, NULL);
> +		pr_err("/usr/bin/systemctl %s got failure\n", pathbuf);
>  		exit(1);

This assumes systemctl is located in /usr/bin only - you removed the
fallback case for it being located in /bin.

In addition, instead of saying 'got failure' let's do something with the
errno value so the user gets a more descriptive error message.
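
Something along these lines is what I would expect; treat it as a sketch, not a drop-in replacement. grow_unit_name() and start_unit() are hypothetical helpers. As far as I recall, systemd treats the text between '@' and '.service' as the instance name and makes it available to the template unit through the %i / %I specifiers, which is why the device name belongs in the unit name.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Build the instance name of the template unit, e.g.
 * "mdadm-grow-continue@md0.service". Returns -1 on truncation. */
static int grow_unit_name(char *buf, size_t len, const char *devnm)
{
	int n = snprintf(buf, len, "mdadm-grow-continue@%s.service", devnm);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Try each candidate path in turn. execl() only returns on failure,
 * so falling out of the loop means every attempt failed. The errno of
 * the last attempt is returned so the caller can report it too. */
static int start_unit(const char *const *paths, const char *unit)
{
	int saved = 0;

	for (; *paths; paths++) {
		execl(*paths, "systemctl", "start", unit, (char *)NULL);
		saved = errno;
		fprintf(stderr, "mdadm: cannot exec %s: %s\n",
			*paths, strerror(saved));
	}
	return saved;
}
```

In continue_via_systemd() the path list would be { "/usr/bin/systemctl", "/bin/systemctl", NULL }, called from the forked child as before.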

Cheers,
Jes

^ permalink raw reply

* Re: [PATCH] mdadm:it doesn't make sense to set --bitmap twice
From: jes.sorensen @ 2017-03-27 22:14 UTC (permalink / raw)
  To: Zhilong Liu; +Cc: linux-raid
In-Reply-To: <1489987263-22735-1-git-send-email-zlliu@suse.com>

Zhilong Liu <zlliu@suse.com> writes:
> mdadm.c: it doesn't make sense to set --bitmap twice.
>
> Signed-off-by: Zhilong Liu <zlliu@suse.com>
> ---
>  mdadm.c | 4 ++++
>  1 file changed, 4 insertions(+)

Applied!

Thanks,
Jes

^ permalink raw reply

* Re: [PATCH] mdadm/mdmon:deleted the abort_reshape never invoked
From: jes.sorensen @ 2017-03-27 22:16 UTC (permalink / raw)
  To: Zhilong Liu; +Cc: linux-raid
In-Reply-To: <1489987284-22786-1-git-send-email-zlliu@suse.com>

Zhilong Liu <zlliu@suse.com> writes:
> mdmon.c: abort_reshape() is already implemented in Grow.c,
> so this function doesn't make a lot of sense here.
>
> Signed-off-by: Zhilong Liu <zlliu@suse.com>
> ---
>  mdmon.c | 5 -----
>  1 file changed, 5 deletions(-)

This used to be there for super-intel in mdmon, but is no longer needed.

Applied!

Thanks,
Jes

^ permalink raw reply

* Re: [PATCH] mdadm/Monitor:triggers core dump when stat2devnm return NULL
From: jes.sorensen @ 2017-03-27 22:25 UTC (permalink / raw)
  To: Zhilong Liu; +Cc: linux-raid
In-Reply-To: <1489987301-22836-1-git-send-email-zlliu@suse.com>

Zhilong Liu <zlliu@suse.com> writes:
> Monitor: ensure that the device is a block device
> when the --wait parameter is used; a regular file ('f')
> or a directory ('d') would otherwise trigger a core dump,
> such as: ./mdadm --wait /dev/md/

I modified the patch description here to make it easier to read.

> Reviewed-by: NeilBrown <neilb@suse.com>
> Signed-off-by: Zhilong Liu <zlliu@suse.com>
> ---
>  Monitor.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/Monitor.c b/Monitor.c
> index 802a9d9..f8850d3 100644
> --- a/Monitor.c
> +++ b/Monitor.c
> @@ -1002,7 +1002,12 @@ int Wait(char *dev)
>  			strerror(errno));
>  		return 2;
>  	}
> -	strcpy(devnm, stat2devnm(&stb));
> +	char *tmp = stat2devnm(&stb);

Please do not declare variables in the middle of the code flow; that is
extremely bad programming practice.

I fixed this up and applied the patch.
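
For reference, the shape I mean looks roughly like the sketch below: the declaration sits at the top of the block and the NULL result is handled before the copy. lookup_devnm() is a hypothetical stand-in for stat2devnm() and wait_devnm() is not the code that was committed.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for stat2devnm(): returns the md device name,
 * or NULL for anything that is not /dev/md<name> in this sketch. */
static const char *lookup_devnm(const char *dev)
{
	if (strncmp(dev, "/dev/md", 7) != 0)
		return NULL;
	if (dev[7] == '\0' || dev[7] == '/')
		return NULL;
	return dev + 5;	/* skip "/dev/" */
}

static int wait_devnm(const char *dev, char *devnm, size_t len)
{
	const char *tmp;	/* declared at the top of the block */

	tmp = lookup_devnm(dev);
	if (!tmp) {
		fprintf(stderr, "mdadm: %s is not a block device.\n", dev);
		return 2;
	}
	/* bounded copy instead of strcpy() on a possibly NULL pointer */
	snprintf(devnm, len, "%s", tmp);
	return 0;
}
```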

Thanks,
Jes

^ permalink raw reply

* Re: [PATCH] mdadm/Monitor:dev should be a block file when use --waitclean
From: jes.sorensen @ 2017-03-27 22:29 UTC (permalink / raw)
  To: Zhilong Liu; +Cc: linux-raid
In-Reply-To: <1489987317-22895-1-git-send-email-zlliu@suse.com>

Zhilong Liu <zlliu@suse.com> writes:
> Monitor: for mdadm --wait-clean /dev/mdX, the dev should
> be a block device; otherwise fd2devnm returns NULL, which
> then triggers a core dump.
>
> Signed-off-by: Zhilong Liu <zlliu@suse.com>
> ---
>  Monitor.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
>
> diff --git a/Monitor.c b/Monitor.c
> index f8850d3..5a2b5ca 100644
> --- a/Monitor.c
> +++ b/Monitor.c
> @@ -1065,7 +1065,17 @@ int WaitClean(char *dev, int sock, int verbose)
>  	struct mdinfo *mdi;
>  	int rv = 1;
>  	char devnm[32];
> +	struct stat stb;
>  
> +	if (stat(dev, &stb) != 0) {
> +		pr_err("Cannot find %s: %s\n", dev,
> +			strerror(errno));

Please use the full 80 characters of the line.

> +		return 2;
> +	}
> +	if ((S_IFMT & stb.st_mode) != S_IFBLK) {
> +		pr_err("%s is not a block device.\n", dev);
> +		return 2;
> +	}
>  	fd = open(dev, O_RDONLY);
>  	if (fd < 0) {
>  		if (verbose)

We have 7-8 instances of this throughout the code (stat and fstat).
Maybe we should make it a utility function instead of duplicating it
further.
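
A possible shape for such a helper, as a sketch only: is_block_device() is a hypothetical name and the messages are copied from the hunk above. It returns 1 for a block device and 0 otherwise, printing the reason.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Returns 1 if 'dev' exists and is a block device, 0 otherwise. */
static int is_block_device(const char *dev)
{
	struct stat stb;

	if (stat(dev, &stb) != 0) {
		fprintf(stderr, "mdadm: Cannot find %s: %s\n", dev, strerror(errno));
		return 0;
	}
	if (!S_ISBLK(stb.st_mode)) {
		fprintf(stderr, "mdadm: %s is not a block device.\n", dev);
		return 0;
	}
	return 1;
}
```

Callers such as Wait() and WaitClean() would then reduce to "if (!is_block_device(dev)) return 2;". An fstat() based twin would cover the call sites that already hold a file descriptor.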

Cheers,
Jes

^ permalink raw reply

* Re: [dm-devel] [PATCH v3] block: trace completion of all bios.
From: NeilBrown @ 2017-03-27 23:42 UTC (permalink / raw)
  Cc: Jens Axboe, linux-block, linux-raid, Martin K . Petersen,
	Mike Snitzer, Ming Lei, linux-kernel, Christoph Hellwig, dm-devel,
	Shaohua Li, Alasdair Kergon
In-Reply-To: <20170327171421.GA464@infradead.org>

[-- Attachment #1: Type: text/plain, Size: 1192 bytes --]

On Mon, Mar 27 2017, Christoph Hellwig wrote:

> On Mon, Mar 27, 2017 at 08:49:57PM +1100, NeilBrown wrote:
>> On Mon, Mar 27 2017, Christoph Hellwig wrote:
>> 
>> > I don't really like the flag at all.  I'd much prefer a __bio_endio
>> > with a 'bool trace' flag.  Also please remove the manual tracing in
> >> > dm.c.  Once that is done I suspect we can also remove the
>> > block_bio_complete export.
>> 
>> Can you say why you don't like it?
>
> It uses up a precious bit in the bio for something that should be state
> that can be determined in the caller at compile time.

I've already demonstrated that the bit is not "precious" at all.  I have
shown how I could easily give you 20 unused flag bits without increasing
the size of struct bio.
Yes, the state could be determined in the caller at compile time.  That
would require developers to make the correct choice between two very
similar interfaces, where the consequences of an incorrect choice are not
immediately obvious.
I think that spending one bit (out of 20) to relieve developers of the
burden of choice (and to spare us all the consequences of a wrong
choice) is a price worth paying.
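
To illustrate the trade-off in isolation, here is a small userspace sketch. It is not the block-layer code; struct xbio, XBIO_TRACE_COMPLETION and the three functions are hypothetical names. Option A makes every completion call site pass the right boolean. Option B records the state once when the bio is queued, so there is a single completion interface that cannot be called the wrong way.

```c
#include <stdbool.h>

#define XBIO_TRACE_COMPLETION (1u << 0)	/* hypothetical flag bit */

struct xbio {
	unsigned int flags;
	int completions_traced;	/* stands in for emitting a trace event */
};

/* Option A: the caller decides, at every call site. */
static void xbio_endio_explicit(struct xbio *b, bool trace)
{
	if (trace)
		b->completions_traced++;
}

/* Option B: queueing records that a completion trace is owed ... */
static void xbio_queue(struct xbio *b)
{
	b->flags |= XBIO_TRACE_COMPLETION;
}

/* ... and the single completion path consumes the flag exactly once. */
static void xbio_endio(struct xbio *b)
{
	if (b->flags & XBIO_TRACE_COMPLETION) {
		b->completions_traced++;
		b->flags &= ~XBIO_TRACE_COMPLETION;
	}
}
```

With option B a bio that was never queued is not traced, and a bio completed twice is traced only once, without the caller having to know either fact.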

NeilBrown

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

^ permalink raw reply

* Re: [PATCH 4/5] fs/autofs4: Fix a dead URL to ftp.kernel.org
From: Ian Kent @ 2017-03-28  2:02 UTC (permalink / raw)
  To: SeongJae Park, rml, axboe, shli, yamada.masahito, mmarek
  Cc: linux-raid, autofs, linux-kbuild, linux-kernel
In-Reply-To: <20170327054731.31882-5-sj38.park@gmail.com>

On Mon, 2017-03-27 at 14:47 +0900, SeongJae Park wrote:
> As ftp.kernel.org is closed [0], this commit fixes a dead URL to
> ftp.kernel.org in fs/autofs4/ to use www.kernel.org instead.
> 
> [0] https://www.kernel.org/shutting-down-ftp-services.html
> 
> Signed-off-by: SeongJae Park <sj38.park@gmail.com>

ACK, and thanks for fixing this.

> ---
>  fs/autofs4/Kconfig | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/autofs4/Kconfig b/fs/autofs4/Kconfig
> index 1204d6384d39..44727bf18297 100644
> --- a/fs/autofs4/Kconfig
> +++ b/fs/autofs4/Kconfig
> @@ -7,7 +7,7 @@ config AUTOFS4_FS
>  	  automounter (amd), which is a pure user space daemon.
>  
>  	  To use the automounter you need the user-space tools from
> -	  <ftp://ftp.kernel.org/pub/linux/daemons/autofs/v4/>; you also
> +	  <https://www.kernel.org/pub/linux/daemons/autofs/v4/>; you also
>  	  want to answer Y to "NFS file system support", below.
>  
>  	  To compile this support as a module, choose M here: the module will be

^ permalink raw reply
