Linux RAID subsystem development

Linux RAID subsystem development
 help / color / mirror / Atom feed

* Re: Why not just return an error?
From: Phil Turmel @ 2016-10-07 18:41 UTC (permalink / raw)
  To: Dark Penguin, Andreas Klauer, Rudy Zijlstra, keld; +Cc: linux-raid
In-Reply-To: <57F7DF05.8090605@yandex.ru>

On 10/07/2016 01:44 PM, Dark Penguin wrote:
> On 07/10/16 19:52, Phil Turmel wrote:

>> MD raid has no idea what is at any given sector.  And with a
>> near-infinite variety of layering choices, there's no way it's going to.
>>   That's why *you* have to do this.  You trimmed my description of the
>> only "easy option" actually trustable.
> 
> I actually wanted to ask about that. Can you really ddrescue a drive
> with a "hole" in it, re-add it and expect it to work?.. What happens if
> you try to read from that "hole" again? And while I'm talking about
> re-adding, when does it become impossible to "re-add" a drive?..

Yes, ddrescue replaces unreadable areas with zeroes.  If those blocks
were part of a file, then the file will have zeroes in it.  But they
might have been where an inode or dirent were stored, in which case you
get orphaned data elsewhere.  You need fsck to minimize that.

ddrescue can provide a listing of the sectors it replaced so you can use
filesystem forensic tools to pinpoint the problems (which file, etc).

Note that all of the above are manual operations -- mdadm has no
knowledge of the upper layers.

None of the above uses --re-add.  Just assembly or forced assembly.
Re-add is only to return a kicked drive to a *functional* array when the
failure reason isn't really the drive.  (Controller, cable, power
supply, etc.)  And re-add is only helpful if the array members have
write-intent bitmaps so MD can figure out which parts of the re-added
disk are out of date.  Re-add can be used if a drive is kicked for
timeout mismatch, but is only helpful if the mismatch is addressed first.

Phil

^ permalink raw reply

* Re: Why not just return an error?
From: Dark Penguin @ 2016-10-07 20:39 UTC (permalink / raw)
  To: Phil Turmel, Andreas Klauer, Rudy Zijlstra, keld, linux-raid
In-Reply-To: <a39cdc87-394e-008c-2e50-6df1bd7394da@turmel.org>

>> On 07/10/16 19:52, Phil Turmel wrote:
>
>>> MD raid has no idea what is at any given sector.  And with a
>>> near-infinite variety of layering choices, there's no way it's going to.
>>>    That's why *you* have to do this.  You trimmed my description of the
>>> only "easy option" actually trustable.
>>
>> I actually wanted to ask about that. Can you really ddrescue a drive
>> with a "hole" in it, re-add it and expect it to work?.. What happens if
>> you try to read from that "hole" again? And while I'm talking about
>> re-adding, when does it become impossible to "re-add" a drive?..
>
> Yes, ddrescue replaces unreadable areas with zeroes.  If those blocks
> were part of a file, then the file will have zeroes in it.  But they
> might have been where an inode or dirent were stored, in which case you
> get orphaned data elsewhere.  You need fsck to minimize that.

Ah, yes - in this case it's the only drive with this piece of 
information, and md doesn't keep any checksums or anything, so it will 
simply return those zeroes. Thanks for explaining this!

> ddrescue can provide a listing of the sectors it replaced so you can use
> filesystem forensic tools to pinpoint the problems (which file, etc).
>
> Note that all of the above are manual operations -- mdadm has no
> knowledge of the upper layers.
>
> None of the above uses --re-add.  Just assembly or forced assembly.
> Re-add is only to return a kicked drive to a *functional* array when the
> failure reason isn't really the drive.  (Controller, cable, power
> supply, etc.)  And re-add is only helpful if the array members have
> write-intent bitmaps so MD can figure out which parts of the re-added
> disk are out of date.  Re-add can be used if a drive is kicked for
> timeout mismatch, but is only helpful if the mismatch is addressed first.

"Forced assembly"... That's one thing I've missed. So forced-assembling 
a faulty drive back into a collapsed array after each failure would 
basically do what I wanted to do - and with no inconsistencies, because 
the array stops the moment the drive was kicked; but I can see why this 
is not a good idea. %)

So, "re-adding" is only possible with a functional array, and only when 
a write-intent bitmap is used. But I remember clearly that not long ago, 
one of my drives failed (most likely due to a cable popping off) and 
refused to re-add into a mirror with a bitmap, so I'm still wondering 
why was it not possible. At least in theory, as long as there is a 
bitmap, it should be possible to re-add, no matter how much later, right?..

-- 
darkpenguin

^ permalink raw reply

* Re: Why not just return an error?
From: Edward Kuns @ 2016-10-07 23:11 UTC (permalink / raw)
  To: Phil Turmel; +Cc: Dark Penguin, Andreas Klauer, Rudy Zijlstra, keld, Linux-RAID
In-Reply-To: <a39cdc87-394e-008c-2e50-6df1bd7394da@turmel.org>

On Fri, Oct 7, 2016 at 9:43 AM, Phil Turmel <philip@turmel.org> wrote:
> I want to see the relocations growing in my daily logwatch  reports so
> I can use mdadm --replace to maintain the array without *any* loss
> of redundancy.

Is this report something you added to base logwatch?  I don't see any
such report in my daily logwatch.  Maybe you only see this if the
relocations are non-zero?

Let's say one has a RAID-5 and two drives develop bad sectors (in
different areas) at the same time that for whatever reason are not
repairable.  (Or someone ignores errors on one drive long enough that
another develops problems, or someone never did scrubbing so didn't
find out about bad sectors in one drive until multiple drives had bad
sectors.)  OK, you ddrescue the two drives and reassemble the array.
What will a RAID scrub do when it gets to the area that was zeroed?
Clearly the checksum will be invalid for those sections.

        Thanks,

            Eddie

^ permalink raw reply

* Re: kernel BUG at block/bio.c:1785! observed on 4.8.0-rc6
From: Yi Zhang @ 2016-10-08  6:42 UTC (permalink / raw)
  To: Shaohua Li; +Cc: linux-raid, Shaohua Li, Xiaotian Zhang
In-Reply-To: <20160926165257.GA129281@kernel.org>

Hi Shaohua

Sorry for the late response, I did the same test and cannot be reproduced on 4.8.0-rc8, thanks.

Best Regards,
  Yi Zhang


----- Original Message -----
From: "Shaohua Li" <shli@kernel.org>
To: "Yi Zhang" <yizhan@redhat.com>
Cc: linux-raid@vger.kernel.org, "Shaohua Li" <shli@fb.com>, "Xiaotian Zhang" <xiaotzha@redhat.com>
Sent: Tuesday, September 27, 2016 12:52:57 AM
Subject: Re: kernel BUG at block/bio.c:1785! observed on 4.8.0-rc6

On Mon, Sep 26, 2016 at 01:17:37AM -0400, Yi Zhang wrote:
> Hello 
> 
> I observed below bug during my MD RAID testing on 4.8.0-rc6, anyone could help check it? Thanks.
> 
> [22535.847193] md: bind<loop0>
> [22535.850414] md: bind<loop1>
> [22535.853638] md: bind<loop2>
> [22535.856861] md: bind<loop3>
> [22535.860056] md: bind<loop5>
> [22535.863278] md: bind<loop4>
> [22535.872061] md/raid:md0: device loop3 operational as raid disk 3
> [22535.878783] md/raid:md0: device loop2 operational as raid disk 2
> [22535.885495] md/raid:md0: device loop1 operational as raid disk 1
> [22535.892206] md/raid:md0: device loop0 operational as raid disk 0
> [22535.899761] md/raid:md0: allocated 5432kB
> [22535.904381] md/raid:md0: raid level 5 active with 4 out of 5 devices, algorithm 2
> [22535.912785] md/raid456: discard support disabled due to uncertainty.
> [22535.919885] Set raid456.devices_handle_discard_safely=Y to override.
> [22535.927016] md0: detected capacity change from 0 to 8384413696
> [22535.933796] md: recovery of RAID array md0
> [22535.938386] md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
> [22535.944906] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for recovery.
> [22535.955670] md: using 128k window, over a total of 2046976k.
> [22565.627129] md: md0: recovery done.
> [22569.183047] EXT4-fs (md0): mounted filesystem with ordered data mode. Opts: (null)
> [22570.376773] md: bind<loop7>
> [22570.508870] md: reshape of RAID array md0
> [22570.513358] md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
> [22570.519874] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for reshape.
> [22570.530545] md: using 128k window, over a total of 2046976k.
> [22691.448933] md: md0: reshape done.
> [22709.108706] md0: detected capacity change from 8384413696 to 10480517120
> [22709.144385] VFS: busy inodes on changed media or resized disk md0
> [22709.312043] ------------[ cut here ]------------
> [22709.317198] kernel BUG at block/bio.c:1785!
> [22709.321866] invalid opcode: 0000 [#1] SMP
> [22709.326337] Modules linked in: ext4 jbd2 mbcache loop rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm mlx4_ib ib_core intel_rapl sb_edac edac_core x86_pkg_temp_thermal intel_powerclamp coretemp raid456 kvm_intel async_raid6_recov kvm async_memcpy async_pq async_xor xor async_tx irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel raid6_pq aesni_intel lrw iTCO_wdt gf128mul iTCO_vendor_support glue_helper ablk_helper ipmi_devintf ipmi_ssif cryptd dcdbas mei_me sg pcspkr mei lpc_ich ipmi_si ipmi_msghandler shpchp wmi acpi_power_meter nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c mlx4_en sd_mod mgag200 i2c_algo_bit drm_kms_he
 lper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm ahci drm mlx4_core libahci tg3 crc32c_intel libata ptp i2c_core megaraid_sas devlink fjes pps_core dm_mirror dm_region_hash dm_log dm_!
 mod
> [22709.423707] CPU: 4 PID: 11012 Comm: md0_raid5 Not tainted 4.8.0-rc6 #2
> [22709.430990] Hardware name: Dell Inc. PowerEdge R730/0599V5, BIOS 1.0.4 08/28/2014
> [22709.439342] task: ffff8810f8850000 task.stack: ffff88102379c000
> [22709.445947] RIP: 0010:[<ffffffff81328a8a>]  [<ffffffff81328a8a>] bio_split+0x8a/0x90
> [22709.454607] RSP: 0018:ffff88102379f930  EFLAGS: 00010246
> [22709.460527] RAX: 0000000000000080 RBX: 0000000000001000 RCX: ffff8810386bfd00
> [22709.468489] RDX: 0000000002400000 RSI: 0000000000000000 RDI: ffff88203a604178
> [22709.476452] RBP: ffff88102379f948 R08: 0000000000000000 R09: ffff88203a604178
> [22709.484413] R10: 00058000ffffffff R11: 0000000000000000 R12: 0000000000000000
> [22709.492376] R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000080
> [22709.500339] FS:  0000000000000000(0000) GS:ffff88103ec80000(0000) knlGS:0000000000000000
> [22709.509574] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [22709.515987] CR2: 00007f1460629000 CR3: 0000000001c06000 CR4: 00000000001406e0
> [22709.523951] Stack:
> [22709.526193]  0000000000001000 0000000000000000 0000000000000000 ffff88102379f9f0
> [22709.534489]  ffffffff81335ca0 ffff88103ecd9000 0000000000000001 ffff8810386bfd00
> [22709.542776]  0000000000000000 ffff8810372b2c60 ffff88102379fa28 00000080810c6dac
> [22709.551074] Call Trace:
> [22709.553808]  [<ffffffff81335ca0>] blk_queue_split+0x480/0x640
> [22709.560223]  [<ffffffff8133b9d5>] blk_sq_make_request+0x95/0x490
> [22709.566922]  [<ffffffff8132cec4>] ? generic_make_request_checks+0x234/0x4f0
> [22709.574698]  [<ffffffffa04e51c3>] ? async_xor+0x1c3/0x5b0 [async_xor]
> [22709.581888]  [<ffffffff8132f903>] generic_make_request+0x103/0x1d0
> [22709.588788]  [<ffffffffa0998286>] ops_run_io+0x376/0x960 [raid456]
> [22709.595678]  [<ffffffffa09a0e3b>] handle_stripe+0xbdb/0x23f0 [raid456]
> [22709.602967]  [<ffffffffa09a2a3c>] handle_active_stripes.isra.52+0x3ec/0x4c0 [raid456]
> [22709.611708]  [<ffffffffa0995f69>] ? do_release_stripe+0x99/0x180 [raid456]
> [22709.619382]  [<ffffffffa0996065>] ? __release_stripe+0x15/0x20 [raid456]
> [22709.626862]  [<ffffffffa09a2fb8>] raid5d+0x4a8/0x750 [raid456]
> [22709.633381]  [<ffffffff815756c6>] md_thread+0x136/0x150
> [22709.639218]  [<ffffffff810d2330>] ? prepare_to_wait_event+0xf0/0xf0
> [22709.646214]  [<ffffffff81575590>] ? find_pers+0x70/0x70
> [22709.652045]  [<ffffffff810acca8>] kthread+0xd8/0xf0
> [22709.657490]  [<ffffffff810b515f>] ? finish_task_switch+0x7f/0x240
> [22709.664292]  [<ffffffff816ff13f>] ret_from_fork+0x1f/0x40
> [22709.670309]  [<ffffffff810acbd0>] ? kthread_park+0x60/0x60
> [22709.676430] Code: df e8 eb 29 03 00 8b 73 28 4c 89 e7 e8 80 de ff ff 48 89 d8 5b 41 5c 41 5d 5d c3 e8 61 fc ff ff 48 89 c3 eb b9 31 c0 eb eb 0f 0b <0f> 0b 0f 1f 40 00 0f 1f 44 00 00 48 8b 07 55 48 89 e5 48 85 c0 
> [22709.698111] RIP  [<ffffffff81328a8a>] bio_split+0x8a/0x90
> [22709.704146]  RSP <ffff88102379f930>
> [22709.714624] ---[ end trace 47f4294978ff2bd0 ]---
> [22709.788366] Kernel panic - not syncing: Fatal exception
> [22709.794278] Kernel Offset: disabled
> [22709.867270] ---[ end Kernel panic - not syncing: Fatal exception
> [22709.873997] ------------[ cut here ]------------
> [22709.879159] WARNING: CPU: 4 PID: 11012 at arch/x86/kernel/smp.c:125 native_smp_send_reschedule+0x3f/0x50
> [22709.889740] Modules linked in: ext4 jbd2 mbcache loop rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm mlx4_ib ib_core intel_rapl sb_edac edac_core x86_pkg_temp_thermal intel_powerclamp coretemp raid456 kvm_intel async_raid6_recov kvm async_memcpy async_pq async_xor xor async_tx irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel raid6_pq aesni_intel lrw iTCO_wdt gf128mul iTCO_vendor_support glue_helper ablk_helper ipmi_devintf ipmi_ssif cryptd dcdbas mei_me sg pcspkr mei lpc_ich ipmi_si ipmi_msghandler shpchp wmi acpi_power_meter nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c mlx4_en sd_mod mgag200 i2c_algo_bit drm_kms_he
 lper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm ahci drm mlx4_core libahci tg3 crc32c_intel libata ptp i2c_core megaraid_sas devlink fjes pps_core dm_mirror dm_region_hash dm_log dm_!
 mod
> [22709.987252] CPU: 4 PID: 11012 Comm: md0_raid5 Tainted: G      D         4.8.0-rc6 #2
> [22709.995894] Hardware name: Dell Inc. PowerEdge R730/0599V5, BIOS 1.0.4 08/28/2014
> [22710.004469]  0000000000000086 0000000003b679dd ffff88103ec83bb0 ffffffff8135ce3c
> [22710.012757]  0000000000000000 0000000000000000 ffff88103ec83bf0 ffffffff8108d7a1
> [22710.021051]  0000007d3ec190c0 0000000000000000 ffff88203903da00 ffff88103ec190c0
> [22710.029344] Call Trace:
> [22710.032072]  <IRQ>  [<ffffffff8135ce3c>] dump_stack+0x63/0x87
> [22710.038505]  [<ffffffff8108d7a1>] __warn+0xd1/0xf0
> [22710.043851]  [<ffffffff8108d8dd>] warn_slowpath_null+0x1d/0x20
> [22710.050357]  [<ffffffff81050c2f>] native_smp_send_reschedule+0x3f/0x50
> [22710.057647]  [<ffffffff810b6928>] resched_curr+0xa8/0xd0
> [22710.063573]  [<ffffffff810b7685>] check_preempt_curr+0x75/0x90
> [22710.070080]  [<ffffffff810b76b9>] ttwu_do_wakeup+0x19/0xe0
> [22710.076201]  [<ffffffff810b77ef>] ttwu_do_activate+0x6f/0x80
> [22710.082515]  [<ffffffff810b841e>] try_to_wake_up+0x1ae/0x3c0
> [22710.088830]  [<ffffffff810b86e2>] default_wake_function+0x12/0x20
> [22710.095630]  [<ffffffff810d1be5>] __wake_up_common+0x55/0x90
> [22710.101944]  [<ffffffff810d1c33>] __wake_up_locked+0x13/0x20
> [22710.108263]  [<ffffffff81275419>] ep_poll_callback+0xb9/0x200
> [22710.114672]  [<ffffffff810d1be5>] __wake_up_common+0x55/0x90
> [22710.120986]  [<ffffffff810d1d39>] __wake_up+0x39/0x50
> [22710.126626]  [<ffffffff810e9470>] wake_up_klogd_work_func+0x40/0x60
> [22710.133624]  [<ffffffff8118101d>] irq_work_run_list+0x4d/0x70
> [22710.140040]  [<ffffffff8110de30>] ? tick_sched_do_timer+0x50/0x50
> [22710.146837]  [<ffffffff811811d0>] irq_work_tick+0x40/0x50
> [22710.152867]  [<ffffffff810fdca2>] update_process_times+0x42/0x60
> [22710.159567]  [<ffffffff8110d775>] tick_sched_handle.isra.16+0x25/0x60
> [22710.166756]  [<ffffffff8110de6d>] tick_sched_timer+0x3d/0x70
> [22710.173072]  [<ffffffff810fe9c3>] __hrtimer_run_queues+0xf3/0x280
> [22710.179869]  [<ffffffff810feea8>] hrtimer_interrupt+0xa8/0x1a0
> [22710.186380]  [<ffffffff810535d5>] local_apic_timer_interrupt+0x35/0x60
> [22710.193669]  [<ffffffff81701aad>] smp_apic_timer_interrupt+0x3d/0x50
> [22710.200761]  [<ffffffff81700c6c>] apic_timer_interrupt+0x8c/0xa0
> [22710.207461]  <EOI>  [<ffffffff811987da>] ? panic+0x1f1/0x232
> [22710.213786]  [<ffffffff81030ba8>] oops_end+0xb8/0xd0
> [22710.219331]  [<ffffffff8103110b>] die+0x4b/0x70
> [22710.224384]  [<ffffffff8102df20>] do_trap+0x140/0x150
> [22710.230017]  [<ffffffff8102e2a9>] do_error_trap+0x89/0x110
> [22710.236142]  [<ffffffff81328a8a>] ? bio_split+0x8a/0x90
> [22710.241970]  [<ffffffff810b7692>] ? check_preempt_curr+0x82/0x90
> [22710.248671]  [<ffffffff810b76b9>] ? ttwu_do_wakeup+0x19/0xe0
> [22710.254989]  [<ffffffff810c0cc3>] ? update_cfs_rq_load_avg+0x233/0x440
> [22710.262272]  [<ffffffff8102e7e0>] do_invalid_op+0x20/0x30
> [22710.268297]  [<ffffffff816ffd3e>] invalid_op+0x1e/0x30
> [22710.274029]  [<ffffffff81328a8a>] ? bio_split+0x8a/0x90
> [22710.279861]  [<ffffffff81335ca0>] blk_queue_split+0x480/0x640
> [22710.286273]  [<ffffffff8133b9d5>] blk_sq_make_request+0x95/0x490
> [22710.292976]  [<ffffffff8132cec4>] ? generic_make_request_checks+0x234/0x4f0
> [22710.300750]  [<ffffffffa04e51c3>] ? async_xor+0x1c3/0x5b0 [async_xor]
> [22710.307939]  [<ffffffff8132f903>] generic_make_request+0x103/0x1d0
> [22710.314839]  [<ffffffffa0998286>] ops_run_io+0x376/0x960 [raid456]
> [22710.321737]  [<ffffffffa09a0e3b>] handle_stripe+0xbdb/0x23f0 [raid456]
> [22710.329021]  [<ffffffffa09a2a3c>] handle_active_stripes.isra.52+0x3ec/0x4c0 [raid456]
> [22710.337759]  [<ffffffffa0995f69>] ? do_release_stripe+0x99/0x180 [raid456]
> [22710.345429]  [<ffffffffa0996065>] ? __release_stripe+0x15/0x20 [raid456]
> [22710.352907]  [<ffffffffa09a2fb8>] raid5d+0x4a8/0x750 [raid456]
> [22710.359418]  [<ffffffff815756c6>] md_thread+0x136/0x150
> [22710.365248]  [<ffffffff810d2330>] ? prepare_to_wait_event+0xf0/0xf0
> [22710.372240]  [<ffffffff81575590>] ? find_pers+0x70/0x70
> [22710.378071]  [<ffffffff810acca8>] kthread+0xd8/0xf0
> [22710.383513]  [<ffffffff810b515f>] ? finish_task_switch+0x7f/0x240
> [22710.390314]  [<ffffffff816ff13f>] ret_from_fork+0x1f/0x40
> [22710.396337]  [<ffffffff810acbd0>] ? kthread_park+0x60/0x60
> [22710.402457] ---[ end trace 47f4294978ff2bd1 ]---

There is one bug fixed in 4.8-rc7, c94455558337eece474, can you try that?

Thanks,
Shaohua
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: md-cluster - Assemble/Scan During Resync
From: Guoqing Jiang @ 2016-10-08  6:57 UTC (permalink / raw)
  To: Marc Smith, linux-raid
In-Reply-To: <CAHkw+Lc3XGqTL5pLX=UhQ+-kdHiVDPO8Z+EbTkWjyk6RgW5tug@mail.gmail.com>

Hi Marc,

On 10/05/2016 04:43 PM, Marc Smith wrote:
> Hi,
>
> First, I believe this issue may have been reported/solved with this
> thread ("[PATCH 3/3] MD: hold mddev lock for md-cluster receive
> thread"):
> http://www.spinics.net/lists/raid/msg53121.html
>
> But I'm not totally sure, and I'm looking for confirmation, or maybe
> this is a new one... I'm trying to hold out for Linux 4.9 in my
> project, and I am hoping to just cherry pick any patches until then.
>
> Testing md-cluster with Linux 4.5.2 (yes, I know its dated)... two
> nodes connected to shared SAS storage, and I'm using DM Multipath in
> front of the individual SAS disks (two I/O modules with dual-domain
> SAS disks).

For cluster-md, there are lots of changes since v4.5, and there are some
fixes which are just merged in 4.9 window (I personally think it should
be stable).

> On tgtnode2 I created the array like this: mdadm --create --verbose
> --run /dev/md/test4 --name=test4 --level=raid1 --raid-devices=2
> --chunk=64 --bitmap=clustered /dev/dm-4 /dev/dm-5
>
> And then, without waiting for the resync to complete, on the second
> node (tgtnode1) I do this: mdadm --assemble --scan

I can't reproduce it with latest code, pls let me know if you see it again
with v4.9 (also send the stack of md related process which are in 'D'
state).

[snip]

> So, again, this may already be fixed, just looking for confirmation if
> the aforementioned patch / thread is related to this bug (or maybe
> another).

Not the one you mentioned (it was not merged), actually there are lots
of changes for resync, you can bisect them to find the related commit.

Thanks,
Guoqing

^ permalink raw reply

* Re: How to fix mistake on raid: mdadm create instead of assemble?
From: Andreas Klauer @ 2016-10-08 12:30 UTC (permalink / raw)
  To: Santiago DIEZ; +Cc: Linux Raid LIST
In-Reply-To: <CAJh8RqUaT_D3GEkj9dWGY5d4e4icUKzyidV2JTVToKN=MpCRyQ@mail.gmail.com>

On Fri, Oct 07, 2016 at 05:37:32PM +0200, Santiago DIEZ wrote:
> First thing I did is ddrescue the remaining partitions sd[abc]10 .
> ddrescue did not stumble into any read error so I assume all remaining
> partitions are perfectly safe.

So ... don't you still have a good copy?

You only killed one of them, right? Did not make same mistake twice?

> There comes my mistake: I ran the --create command instead of --assemble :
> 
> ================================================================================
> # mdadm --create --verbose /dev/md1 --raid-devices=4 --level=raid5
> --run --readonly /dev/loop0 /dev/loop1 /dev/loop2 missing
> 
> mdadm: layout defaults to left-symmetric
> mdadm: layout defaults to left-symmetric
> mdadm: chunk size defaults to 512K
> mdadm: /dev/loop0 appears to contain an ext2fs file system
>        size=5778741888K  mtime=Sat Sep  3 11:00:22 2016
> mdadm: /dev/loop0 appears to be part of a raid array:
>        level=raid5 devices=4 ctime=Wed Jan 25 09:08:11 2012
> mdadm: /dev/loop1 appears to be part of a raid array:
>        level=raid5 devices=4 ctime=Wed Jan 25 09:08:11 2012
> mdadm: /dev/loop2 appears to be part of a raid array:
>        level=raid5 devices=4 ctime=Wed Jan 25 09:08:11 2012
> mdadm: size set to 1926115840K
> mdadm: automatically enabling write-intent bitmap on large array
> mdadm: creation continuing despite oddities due to --run
> mdadm: Defaulting to version 1.2 metadata
> mdadm: array /dev/md1 started.
> ================================================================================

You had 0.90 metadata before, that is metadata at the end of the device. 
1.2 metadata is at the start of the device (4K from start). So you have 
overwritten your filesystem superblock...

You can --create again with --metadata=0.90 --chunk=64K options and 
see what is left to the rescue.

But it would be much better if you still had your good copy of the original.

Regards
Andreas Klauer

^ permalink raw reply

* Bitmap in RAM?
From: Dark Penguin @ 2016-10-08 12:54 UTC (permalink / raw)
  To: linux-raid

After researching write-intent bitmaps for a while, my understanding is 
that they are used only to speed up "re-adding" drives by avoiding a 
full resync, and to enable --write-mostly --write-behind. However, it 
does introduce some pretty heavy load on whatever device it's on, 
especially if it's an internal bitmap, because the head would have to 
fly all the way to the superblock twice per each write. If it's an 
external bitmap, then the device it's on would be too busy just serving 
it to do anything else.

So if I were to place it on a tmpfs, I could eliminate this problem only 
at the expense of being unable to re-add drives after a reboot, right?.. 
I've read somewhere that bitmaps only work correctly on ext2 or ext3 
filesystems, but that probably means that it's not a good idea to put it 
on a filesystem with delayed allocation like ext4 of zfs, otherwise I 
don't understand why - and so I don't know if there would be any problem 
with it running on tmpfs. Is there?..

By the way, Phil, you are a hero! :) I remember that it was you who 
taught me about the "timeouts and scrubbing" problem a year ago, and you 
always explain things so well! You must have a lot of patience and love 
for all people! :)

-- 
darkpenguin

^ permalink raw reply

* Re: Bitmap in RAM?
From: Roman Mamedov @ 2016-10-08 16:02 UTC (permalink / raw)
  To: Dark Penguin; +Cc: linux-raid
In-Reply-To: <57F8EC82.9010804@yandex.ru>

[-- Attachment #1: Type: text/plain, Size: 722 bytes --]

On Sat, 8 Oct 2016 15:54:26 +0300
Dark Penguin <darkpenguin@yandex.ru> wrote:

> So if I were to place it on a tmpfs, I could eliminate this problem only 
> at the expense of being unable to re-add drives after a reboot, right?.. 

If you don't need that ability, you can just remove bitmap entirely, it's not
mandatory. Run:

  mdadm --grow --bitmap=none /dev/mdX

However I'd say being able to re-add drives is very useful, so first consider
switching to a higher bitmap granularity, 

  mdadm --grow --bitmap=none /dev/mdX
  mdadm --grow --bitmap=internal --bitmap-chunk=131072 /dev/mdX

(or even 262144, 524288) as that will reduce the performance impact of the
bitmap.

-- 
With respect,
Roman

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply

* Re: Bitmap in RAM?
From: Dark Penguin @ 2016-10-08 16:23 UTC (permalink / raw)
  To: Roman Mamedov; +Cc: linux-raid
In-Reply-To: <20161008210216.1a3fead2@natsu>

If I remove it entirely, I won't be able to re-add drives at all. If I 
move it to a tmpfs, then I can re-add them, but not across reboots - and 
with no downsides, which I want to confirm. So this would be better than 
removing it completely.

I've thought about it and switched the external bitmaps to the 
chunk-size of 65536, which apparently is the default for intermal 
bitmaps. They've become much smaller, which means the default size 
selected for them before was indeed much higher. I'll see if I notice 
any difference the next time I'm moving data around; maybe the load will 
indeed be negligible.

On 08/10/16 19:02, Roman Mamedov wrote:
> On Sat, 8 Oct 2016 15:54:26 +0300
> Dark Penguin <darkpenguin@yandex.ru> wrote:
>
>> So if I were to place it on a tmpfs, I could eliminate this problem only
>> at the expense of being unable to re-add drives after a reboot, right?..
>
> If you don't need that ability, you can just remove bitmap entirely, it's not
> mandatory. Run:
>
>    mdadm --grow --bitmap=none /dev/mdX
>
> However I'd say being able to re-add drives is very useful, so first consider
> switching to a higher bitmap granularity,
>
>    mdadm --grow --bitmap=none /dev/mdX
>    mdadm --grow --bitmap=internal --bitmap-chunk=131072 /dev/mdX
>
> (or even 262144, 524288) as that will reduce the performance impact of the
> bitmap.
>

-- 
darkpenguin

^ permalink raw reply

* io lockup on cp
From: Glenn Enright @ 2016-10-09 22:01 UTC (permalink / raw)
  To: mdraid

We are seeing cp processes getting stuck in state D. it mayt be
related to md. Or perhaps it is dm_mod.

What seems to happen generally is the cp command gets to D state, then
a dmeventd command sleeps in a polling state and can not be roused or
safely killed. And then eventually other io commands that depend on
dmeventd queue up enough that the server hangs.

Ive seen this on raid1,6 and 10. running xen 4.6 with a 4.4.19 kernel
from xen.crc.id.au. On Centos6 (mdadm 3.3)

I have lots of other 'stuff' from the system, if anyone wants more
ouptut from anything specific please let me know.

Thanks
--Glenn

# the cp command
[~]# cat /proc/30923/stack
[<ffffffff8133ff53>] call_rwsem_down_write_failed+0x13/0x20
[<ffffffffa030bc20>] snapshot_map+0x90/0x490 [dm_snapshot]
[<ffffffffa000427a>] __map_bio+0x4a/0x130 [dm_mod]
[<ffffffffa0004867>] __split_and_process_bio+0x327/0x3f0 [dm_mod]
[<ffffffffa00049a4>] dm_make_request+0x74/0xe0 [dm_mod]
[<ffffffff8130922f>] generic_make_request+0xff/0x1d0
[<ffffffff81309370>] submit_bio+0x70/0x140
[<ffffffff8120eed4>] mpage_bio_submit+0x34/0x50
[<ffffffff8120f2c3>] do_mpage_readpage+0x2b3/0x6d0
[<ffffffff8120f874>] mpage_readpages+0x114/0x160
[<ffffffff81209fdd>] blkdev_readpages+0x1d/0x20
[<ffffffff81168d60>] __do_page_cache_readahead+0x1a0/0x240
[<ffffffff81168f4d>] ondemand_readahead+0x14d/0x250
[<ffffffff811690c2>] page_cache_async_readahead+0x72/0x80
[<ffffffff8115c62e>] generic_file_read_iter+0x40e/0x5e0
[<ffffffff81209ac7>] blkdev_read_iter+0x37/0x40
[<ffffffff811d0f7c>] __vfs_read+0xcc/0xf0
[<ffffffff811d122e>] vfs_read+0x8e/0xe0
[<ffffffff811d1ae6>] SyS_read+0x56/0xc0
[<ffffffff816856ee>] entry_SYSCALL_64_fastpath+0x12/0x71
[<ffffffffffffffff>] 0xffffffffffffffff

# dmeventd
[~]# cat /proc/30757/stack
[<ffffffff811e3e19>] poll_schedule_timeout+0x49/0x70
[<ffffffff811e45fa>] do_select+0x5ba/0x750
[<ffffffff811e5052>] core_sys_select+0x1c2/0x2b0
[<ffffffff811e5657>] SyS_select+0x47/0x110
[<ffffffff816856ee>] entry_SYSCALL_64_fastpath+0x12/0x71
[<ffffffffffffffff>] 0xffffffffffffffff

^ permalink raw reply

* Re: How to fix mistake on raid: mdadm create instead of assemble?
From: Wols Lists @ 2016-10-09 22:39 UTC (permalink / raw)
  To: Andreas Klauer, Santiago DIEZ; +Cc: Linux Raid LIST
In-Reply-To: <20161008123040.GA4603@metamorpher.de>

On 08/10/16 13:30, Andreas Klauer wrote:
> On Fri, Oct 07, 2016 at 05:37:32PM +0200, Santiago DIEZ wrote:
>> > First thing I did is ddrescue the remaining partitions sd[abc]10 .
>> > ddrescue did not stumble into any read error so I assume all remaining
>> > partitions are perfectly safe.
> So ... don't you still have a good copy?
> 
> You only killed one of them, right? Did not make same mistake twice?
> 
>> > There comes my mistake: I ran the --create command instead of --assemble :
>> > 
>> > ================================================================================
>> > # mdadm --create --verbose /dev/md1 --raid-devices=4 --level=raid5
>> > --run --readonly /dev/loop0 /dev/loop1 /dev/loop2 missing

One oddity I've noticed. You've created the array using loop devices.
What are these?

The reason I ask is that using loopback devices is a standard technique
for rebuilding a damaged array, specifically to prevent md from actually
writing to the drive. So is it possible that "mdadm --create" only wrote
to ram, and a reboot will recover your ddrescue copies untouched?

My raid-fu isn't enough to tell me whether I'm right or not ... :-)

If necessary you'll have to do another ddrescue from the original
drives, and you should then be able to assemble the array from the
copies. Don't use "missing", use "--force" and you should get a working,
degraded, array to which you can add a new drive and rebuild the array.

mdadm --assemble /dev/md0 /dev/sd[efg]10 --force

if I'm right ... so long as it's the copies, you can always recover
again from the original disks, and if there's a problem with the copies
mdadm should complain when it assembles the array.

Cheers,
Wol

^ permalink raw reply

* Re: [PATCH 24/54] md/raid1: Improve another size determination in setup_conf()
From: Dan Carpenter @ 2016-10-10 11:12 UTC (permalink / raw)
  To: Richard Weinberger
  Cc: SF Markus Elfring, linux-raid@vger.kernel.org, Christoph Hellwig,
	Guoqing Jiang, Jens Axboe, Mike Christie, Neil Brown, Shaohua Li,
	Tomasz Majchrzak, LKML, kernel-janitors@vger.kernel.org,
	Julia Lawall
In-Reply-To: <CAFLxGvx0Fr-WNvKYACdvNiZwYzHW4ocna9HPH8J_WPSXWc+ngg@mail.gmail.com>

On Thu, Oct 06, 2016 at 11:29:20AM +0200, Richard Weinberger wrote:
> On Thu, Oct 6, 2016 at 11:22 AM, SF Markus Elfring
> <elfring@users.sourceforge.net> wrote:
> > From: Markus Elfring <elfring@users.sourceforge.net>
> > Date: Tue, 4 Oct 2016 21:46:18 +0200
> >
> > Replace the specification of a data structure by a pointer dereference
> > as the parameter for the operator "sizeof" to make the corresponding size
> > determination a bit safer.
> 
> Isn't this pure matter of taste?
> Some developers prefer sizeof(*ptr) because it is easier to type, other
> developers prefer sizeof(struct foo) because you can determine the type
> at first sight and makes review more easy.
> 

I am ignoring Markus patches and have told him that he should focus on
bug fixes.  These patches don't add any value and regularly introduce
bugs.

That said, "sizeof(*ptr)" is sort of official style.  It's slightly
more obvious and easier to review because all the information you need
is on that one line.  Also if we change the datatype of ptr then that
format is slightly more future proof.

regards,
dan carpenter

^ permalink raw reply

* Re: md/raid1: Improve another size determination in setup_conf()
From: SF Markus Elfring @ 2016-10-10 12:28 UTC (permalink / raw)
  To: Dan Carpenter
  Cc: Richard Weinberger, linux-raid, Christoph Hellwig, Guoqing Jiang,
	Jens Axboe, Mike Christie, Neil Brown, Shaohua Li,
	Tomasz Majchrzak, LKML, kernel-janitors, Julia Lawall
In-Reply-To: <20161010110050.GA5687@mwanda>

> I am ignoring Markus patches

It's a pity that you chose such a reaction.

> and have told him that he should focus on bug fixes.

I find that I suggest to improve something. Could you admit a few times
that I found a "bug" you care also about at other source code places?

> These patches don't add any value

Can it be that you express a lower value for the Linux coding style here
than desired as there might be other concerns behind such negative feedback?

> and regularly introduce bugs.

How do you think about to discuss corresponding facts further?

> That said, "sizeof(*ptr)" is sort of official style.

When this implementation detail is so official, I wonder then why some
software developers can become "special" about the proposed update step
like for this module.

Regards,
Markus

^ permalink raw reply

* [PATCH] dm: free io_barrier after blk_cleanup_queue call
From: Tahsin Erdogan @ 2016-10-10 12:35 UTC (permalink / raw)
  To: Alasdair Kergon, Mike Snitzer, dm-devel, Shaohua Li
  Cc: linux-raid, linux-kernel, Tahsin Erdogan

dm_old_request_fn() has paths that access md->io_barrier. The party
destroying io_barrier should ensure that no future execution
of dm_old_request_fn() is possible. Move destruction to below
blk_cleanup_queue() to ensure this.

Signed-off-by: Tahsin Erdogan <tahsin@google.com>
---
 drivers/md/dm.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index be35258324c1..ec513ee864f2 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1423,8 +1423,6 @@ static void cleanup_mapped_device(struct mapped_device *md)
 	if (md->bs)
 		bioset_free(md->bs);
 
-	cleanup_srcu_struct(&md->io_barrier);
-
 	if (md->disk) {
 		spin_lock(&_minor_lock);
 		md->disk->private_data = NULL;
@@ -1436,6 +1434,8 @@ static void cleanup_mapped_device(struct mapped_device *md)
 	if (md->queue)
 		blk_cleanup_queue(md->queue);
 
+	cleanup_srcu_struct(&md->io_barrier);
+
 	if (md->bdev) {
 		bdput(md->bdev);
 		md->bdev = NULL;
-- 
2.8.0.rc3.226.g39d4020


^ permalink raw reply related

* Re: md/raid1: Improve another size determination in setup_conf()
From: Jes Sorensen @ 2016-10-10 13:06 UTC (permalink / raw)
  To: SF Markus Elfring
  Cc: Dan Carpenter, Richard Weinberger, linux-raid@vger.kernel.org,
	Christoph Hellwig, Guoqing Jiang, Jens Axboe, Mike Christie,
	Neil Brown, Shaohua Li, Tomasz Majchrzak, LKML,
	kernel-janitors@vger.kernel.org, Julia Lawall
In-Reply-To: <77d68bcd-1ae4-4808-fc0b-6183ae5fb6c4@users.sourceforge.net>

SF Markus Elfring <elfring@users.sourceforge.net> writes:
>>>> Replace the specification of a data structure by a pointer dereference
>>>> as the parameter for the operator "sizeof" to make the corresponding size
>>>> determination a bit safer.
>>>
>>> Isn't this pure matter of taste?
>>> Some developers prefer sizeof(*ptr) because it is easier to type, other
>>> developers prefer sizeof(struct foo) because you can determine the type
>>> at first sight and makes review more easy.
>> 
>> sizeof(*ptr) is more future proof and normally more obvious and easier
>> to review.
>
> Is it interesting to see how different the software development opinions
> can be for such an implementation detail?
>
>> That said, I've tried to tell Markus to only send bugfix patches
>
> Can any deviations from the Linux coding style become "bugs" also in
> your view of the software situation?
>
>> because these are a waste of time
>
> How do you value compliance with coding styles?

The Linux Coding Style is not a law, nor is it at all perfect. You
clearly misunderstood how Linux development work and you are doing a
great job wasting everyone's time with this patchset.

Jes

^ permalink raw reply

* Re: [PATCH 24/54] md/raid1: Improve another size determination in setup_conf()
From: Jes Sorensen @ 2016-10-10 13:09 UTC (permalink / raw)
  To: Dan Carpenter
  Cc: Richard Weinberger, SF Markus Elfring, linux-raid@vger.kernel.org,
	Christoph Hellwig, Guoqing Jiang, Jens Axboe, Mike Christie,
	Neil Brown, Shaohua Li, Tomasz Majchrzak, LKML,
	kernel-janitors@vger.kernel.org, Julia Lawall
In-Reply-To: <20161007075345.GB6039@mwanda>

Dan Carpenter <dan.carpenter@oracle.com> writes:
> On Thu, Oct 06, 2016 at 11:29:20AM +0200, Richard Weinberger wrote:
>> On Thu, Oct 6, 2016 at 11:22 AM, SF Markus Elfring
>> <elfring@users.sourceforge.net> wrote:
>> > From: Markus Elfring <elfring@users.sourceforge.net>
>> > Date: Tue, 4 Oct 2016 21:46:18 +0200
>> >
>> > Replace the specification of a data structure by a pointer dereference
>> > as the parameter for the operator "sizeof" to make the corresponding size
>> > determination a bit safer.
>> 
>> Isn't this pure matter of taste?
>> Some developers prefer sizeof(*ptr) because it is easier to type, other
>> developers prefer sizeof(struct foo) because you can determine the type
>> at first sight and makes review more easy.
>
> sizeof(*ptr) is more future proof and normally more obvious and easier
> to review.  That said, I've tried to tell Markus to only send bugfix
> patches because these are a waste of time and regularly introduce bugs.

This is totally a matter of taste. I for one find it way easier to
review something which says 'sizeof(struct ....)' because it stands out
more. I am curious what you mean by it being more future proof - if the
code says 'struct foo' in the sizeof argument, what is the problem?

The one area where there is a higher risk is if the type is changed, but
that is outweighed by the fact the spelled out version is easier to
review.

Jes

^ permalink raw reply

* Re: md/raid1: Improve another size determination in setup_conf()
From: Jes Sorensen @ 2016-10-10 13:11 UTC (permalink / raw)
  To: SF Markus Elfring
  Cc: Jiri Kosina, Austin S. Hemmelgarn, Dan Carpenter,
	Richard Weinberger, linux-raid, Christoph Hellwig, Guoqing Jiang,
	Jens Axboe, Mike Christie, Neil Brown, Shaohua Li,
	Tomasz Majchrzak, LKML, kernel-janitors, Julia Lawall
In-Reply-To: <dd69629f-abf4-cc03-dddb-b2f3e7cb92fd@users.sourceforge.net>

SF Markus Elfring <elfring@users.sourceforge.net> writes:
>>>> but patches that just fix coding style are a bad thing
>>>
>>> When you find such a change opportunity so "bad", are there any 
>>> circumstances left over where you would dare to touch the corresponding 
>>> source code line.
>> 
>> If you actually rewrite the code or fix some real bug there.
>
> Do the proposed update steps 12 - 16 for the function "setup_conf"
> (in this software module) fit to your condition?
>
> Do you reject this update step?

I do - those changes do nothing to improve the code and simply hides a
lot of history.

Jes

^ permalink raw reply

* Re: md/raid1: Improve another size determination in setup_conf()
From: SF Markus Elfring @ 2016-10-10 13:20 UTC (permalink / raw)
  To: Jes Sorensen
  Cc: Dan Carpenter, Richard Weinberger, linux-raid, Christoph Hellwig,
	Guoqing Jiang, Jens Axboe, Mike Christie, Neil Brown, Shaohua Li,
	Tomasz Majchrzak, LKML, kernel-janitors, Julia Lawall
In-Reply-To: <wrfjzimcbbft.fsf@redhat.com>

>> How do you value compliance with coding styles?
>
> The Linux Coding Style is not a law,

How serious can such guidelines become for software developers?

> nor is it at all perfect.

I got a similar impression. But are there enough items where a mostly clear
guidance is specified?

> You clearly misunderstood how Linux development work

I got an other impression.

> and you are doing a great job wasting everyone's time with this patchset.

Would you like to reject any update steps for the affected source files
from this patch series?

Can it "accidentally" happen that some of them will be really worth
also for your precious software development attention?

Regards,
Markus

^ permalink raw reply

* Re: md/raid1: Improve another size determination in setup_conf()
From: Adam Goryachev @ 2016-10-10 13:23 UTC (permalink / raw)
  To: SF Markus Elfring; +Cc: linux-raid
In-Reply-To: <526621dc-ce01-77eb-035d-fd884bf31d2c@users.sourceforge.net>

Hi Markus,

I'm mostly a lurker on this, sometimes trying to help out other users 
when I can. Either way, I'm not a kernel coder, so take my comments with 
a grain of salt...

On 10/10/2016 23:28, SF Markus Elfring wrote:
>> I am ignoring Markus patches
> It's a pity that you chose such a reaction.
Different people will have different priorities, and that's OK.
>> and have told him that he should focus on bug fixes.
> I find that I suggest to improve something. Could you admit a few times
> that I found a "bug" you care also about at other source code places?
Different people will describe a bug differently. Has *any* of your 
patches been accepted on this list yet? Has *any* of them fixed a 
problem that a user encountered?
>> These patches don't add any value
> Can it be that you express a lower value for the Linux coding style here
> than desired as there might be other concerns behind such negative feedback?
Desired by who? Everyone is different. Sure, the coding style is there, 
and if you see a proposed patch which doesn't comply with the coding 
style, then advise everyone involved, and let's fix it before the patch 
goes in.
After the code is already in, most people are more concerned with other 
things (different priorities and all)...
>> and regularly introduce bugs.
> How do you think about to discuss corresponding facts further?
You haven't proposed any facts.... These are just different priorities 
from different people...

>> That said, "sizeof(*ptr)" is sort of official style.
> When this implementation detail is so official, I wonder then why some
> software developers can become "special" about the proposed update step
> like for this module.
It's not special, like I said above, if there is a new patch proposed by 
someone which doesn't meet the coding guidelines, then speak up.....

The problem here is that you don't have a history of providing "useful" 
patches (ie, where other kernel developers can see that you have fixed 
something that was broken/not working). Currently, all they see is that 
you have run some tool over the code, and then thrown together 50 
patches to change a bunch of code that doesn't really make any 
difference....

So, in short, I would guess you are beating a dead horse.Trying to 
create arguments over what is or is not right, better, or whatever is 
also nothing short of divisive and plain rude (whether intended or not). 
I would guess that if you had found one small style issue, and brought 
it up, explained that you were just starting out and found this code 
which doesn't meet the guidelines, you were confused about how it works, 
you think you understand now, but could it be changed to this which is 
easier to understand and matches the guidelines. Then it would probably 
have been accepted.
You would probably then be tempted to go and find the other 200 similar 
patches, and we would be back at this spot, but hopefully, you might 
move onto finding and fixing "harder" problems rather than another 100 
easy ones, and progressively your skills would improve, and everyone on 
the list would be delighted to receive your patches, and discuss them in 
detail.
Leave the other 100 easy patches to the next 100 people who come after 
you and need to start with an easy patch, before they progress toward 
the harder code/logic issues.

Like I said, I don't speak for anyone, and I myself don't like to see 
code with spelling/grammar mistakes, but with open source, while we can 
all help by submitting patches, "someone" needs to verify those patches, 
and so they must meet a certain usefulness criteria for them to "waste" 
their time on them.

PPS, deleted a bunch of CC's that didn't seem relevant, both lists and 
individuals....

Regards,
Adam

^ permalink raw reply

* Re: dm: free io_barrier after blk_cleanup_queue call
From: Mike Snitzer @ 2016-10-10 13:25 UTC (permalink / raw)
  To: Tahsin Erdogan
  Cc: Alasdair Kergon, dm-devel, Shaohua Li, linux-raid, linux-kernel
In-Reply-To: <1476102919-25112-1-git-send-email-tahsin@google.com>

On Mon, Oct 10 2016 at  8:35am -0400,
Tahsin Erdogan <tahsin@google.com> wrote:

> dm_old_request_fn() has paths that access md->io_barrier. The party
> destroying io_barrier should ensure that no future execution
> of dm_old_request_fn() is possible. Move destruction to below
> blk_cleanup_queue() to ensure this.

I have to believe this was born out of code inspection rather than
actual need (due to crash, etc)?

The cleanup order isn't relevant.  The reference counting of a DM device
governs whether a DM device (and its associated 'struct mapped_device')
can be destroyed.  Please have a look at  __dm_destroy, particularly:
        * No one should increment the reference count of the mapped_device,
        * after the mapped_device state becomes DMF_FREEING.

Mike

^ permalink raw reply

* Re: [PATCH 24/54] md/raid1: Improve another size determination in setup_conf()
From: Bjørn Mork @ 2016-10-10 13:57 UTC (permalink / raw)
  To: Dan Carpenter
  Cc: Richard Weinberger, SF Markus Elfring, linux-raid@vger.kernel.org,
	Christoph Hellwig, Guoqing Jiang, Jens Axboe, Mike Christie,
	Neil Brown, Shaohua Li, Tomasz Majchrzak, LKML,
	kernel-janitors@vger.kernel.org, Julia Lawall
In-Reply-To: <20161010110050.GA5687@mwanda>

Dan Carpenter <dan.carpenter@oracle.com> writes:

> I am ignoring Markus patches and have told him that he should focus on
> bug fixes.  These patches don't add any value and regularly introduce
> bugs.

I think there should be a big fat warning in CodingStyle:

 THIS DOCUMENT DOES NOT APPLY TO ANY EXISTING KERNEL CODE.

Preserving existing style is more important than any minor style issue
anyway.  Style changes for the only purpose of style change should be
considered harmful and destructive behaviour. You'd think that is bloody
obvious, but evidently it isn't.  Not only are the untested, buggy,
churning patches still being submitted - some of them are even applied!

Adding such a statement will not prevent style changes, even changes
to bring code more in line with CodingStyle, as long as the patches are
part of some work of substance.  E.g. cleaning up before fixing bugs. If
you submit new code, then you can of course fix the style of any touched
context.

Oh well, I can dream..

Bjørn

^ permalink raw reply

* Re: md/raid1: Improve another size determination in setup_conf()
From: Jes Sorensen @ 2016-10-10 14:01 UTC (permalink / raw)
  To: SF Markus Elfring
  Cc: Dan Carpenter, Richard Weinberger, linux-raid, Christoph Hellwig,
	Guoqing Jiang, Jens Axboe, Mike Christie, Neil Brown, Shaohua Li,
	Tomasz Majchrzak, LKML, kernel-janitors, Julia Lawall
In-Reply-To: <10ff33b4-44f0-eaea-8549-79a6ba250a2a@users.sourceforge.net>

SF Markus Elfring <elfring@users.sourceforge.net> writes:
>>> How do you value compliance with coding styles?
>>
>> The Linux Coding Style is not a law,
>
> How serious can such guidelines become for software developers?
>
>> nor is it at all perfect.
>
> I got a similar impression. But are there enough items where a mostly clear
> guidance is specified?
>
>> You clearly misunderstood how Linux development work
>
> I got an other impression.
>
>> and you are doing a great job wasting everyone's time with this patchset.
>
> Would you like to reject any update steps for the affected source files
> from this patch series?
>
> Can it "accidentally" happen that some of them will be really worth
> also for your precious software development attention?

Given that none of your patches fix any real bugs and you do your best
to ignore any guidance you have been given, I do reject your entire
patchset and you can consider this a NACK for this entire series.

I get the impression you obtain your response to any email from M-x
doctor.

Jes

^ permalink raw reply

* Re: md/raid1: Improve another size determination in setup_conf()
From: SF Markus Elfring @ 2016-10-10 14:20 UTC (permalink / raw)
  To: Jes Sorensen
  Cc: Dan Carpenter, Richard Weinberger, linux-raid, Christoph Hellwig,
	Guoqing Jiang, Jens Axboe, Mike Christie, Neil Brown, Shaohua Li,
	Tomasz Majchrzak, LKML, kernel-janitors, Julia Lawall
In-Reply-To: <wrfjeg3ob8vp.fsf@redhat.com>

>> Can it "accidentally" happen that some of them will be really worth
>> also for your precious software development attention?
> 
> Given that none of your patches fix any real bugs

Are there any ones which would eventually become "real" also for you?


> and you do your best to ignore any guidance you have been given,

I dare occasionally to find reasons out for a specific disagreement.


> I do reject your entire patchset and you can consider this a NACK for this entire series.

Thanks for your feedback.

I am still curious if any other software developers or source code reviewers
would dare to express an other opinion for one of the shown update possibilities.

Regards,
Markus

^ permalink raw reply

* Re: dm: free io_barrier after blk_cleanup_queue call
From: Tahsin Erdogan @ 2016-10-10 14:52 UTC (permalink / raw)
  To: Mike Snitzer
  Cc: Alasdair Kergon, dm-devel, Shaohua Li, linux-raid, linux-kernel
In-Reply-To: <20161010132542.GA17958@redhat.com>

On Mon, Oct 10, 2016 at 6:25 AM, Mike Snitzer <snitzer@redhat.com> wrote:
> I have to believe this was born out of code inspection rather than
> actual need (due to crash, etc)?

This got originated from several crashes I have seen with 4.3 kernel.
The crashes
were caused by null dereferencing of io_barrier->per_cpu_ref.

The issue may no longer be relevant after commit c91852ff0815
("dm: optimize dm_request_fn()") because conditions for accessing
io_barrier may not longer exist. But fix should be considered for
forked stable trees.

^ permalink raw reply

* Re: RAID6 - CPU At 100% Usage After Reassembly
From: Anthony Youngman @ 2016-10-10 19:52 UTC (permalink / raw)
  To: Francisco Parada, Shaohua Li; +Cc: mdraid
In-Reply-To: <CAOW94uvxPH9dhpd=-t+eMfOc3YPLxasViz29idZmbQn-GkP5vg@mail.gmail.com>

On 07/10/16 12:23, Francisco Parada wrote:
> I've got WD 3TB Green
> EZRX drives, which I recently found out via the RAID Wiki, that they
> didn't have error correction (after I spent over a thousand dollars on
> the drives in 3 years) ... Had I known better, I would have opted for
> different models.

There's more to greens than that. Apart from not having error recovery, 
they also apparently have a habit of going to sleep. Search the list for 
greens, but you need to do something else to stop them going to sleep, 
as that will also interfere with the raid.

I'll probably be adding that to the wiki at some point - I want to add a 
section on drives readily available, with that sort of information.

Cheers,
Wol

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox