Linux RAID subsystem development

* Re: Linux raid wiki
From: Phil Turmel @ 2016-09-26 21:19 UTC
  To: Wols Lists, linux-raid
In-Reply-To: <57E95054.4020903@youngman.org.uk>

On 09/26/2016 12:44 PM, Wols Lists wrote:
> The next section -
> 
> https://raid.wiki.kernel.org/index.php/Assemble_Run
> 
> addresses what to do if the array is messed up in some way. Would you
> mind taking a look at that now too :-)

Hmmm.  The last bit is less than ideal.  If all drives are faulty, but
runnable in the array with at least one drive of redundancy, the best
way to put good drives in service is one-by-one mdadm --replace.  That
lets the redundancy fix any errors, and doesn't load down the problem
drive any more than ddrescue would.  And it has the benefit of
increasing reliability as you go.

If you don't have any redundancy left, then ddrescue of all readable
drives is reasonable.
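
The one-by-one approach can be sketched as a dry-run helper that only prints the commands for review. This is a minimal sketch, not a tested procedure: `plan_replace` is a hypothetical helper, all device names are placeholders, and it assumes mdadm's `--add`, `--replace`/`--with` and `--wait` options, with the new drive added as a spare before it is named in `--with`.

```shell
# Dry run: print, but do not execute, the mdadm commands that swap
# array members one at a time. All device names are placeholders.
# Usage: plan_replace ARRAY OLD:NEW [OLD:NEW ...]
plan_replace() {
    array=$1
    shift
    for pair in "$@"; do
        old=${pair%%:*}    # member to retire
        new=${pair#*:}     # fresh drive taking its place
        echo "mdadm $array --add $new"
        echo "mdadm $array --replace $old --with $new"
        echo "mdadm --wait $array"    # let this copy finish before the next one
    done
}

plan_replace /dev/md0 /dev/sda:/dev/sde /dev/sdb:/dev/sdf
```

Printing the plan first keeps the redundancy argument visible: each old member stays in the array until its replacement has been fully written.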

Phil

* Re: WARNING: mismatch_cnt is not 0 on <array device>
From: Phil Turmel @ 2016-09-26 21:15 UTC
  To: Benjammin2068, Linux-RAID
In-Reply-To: <c72d3567-7c9c-39dc-c2e1-722bc267bdd6@gmail.com>

On 09/26/2016 03:47 PM, Benjammin2068 wrote:
> Well that instills fear and doubt...
> 
> the mismatch_cnt was 8.
> 
> I did a repair and then a check and now it's 10704....
> 
> :(

Danger Will Robinson!

Seriously.  You very likely have a hardware problem corrupting your
data.  Do you have ECC RAM, and if not, when was the last time you did
an exhaustive memtest?

Recheck all of your data cables and if using an add-on controller, check
for a secure install in the PCIe slot.
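
After the hardware has been checked, the counter can be re-read following a fresh scrub. A minimal sketch, assuming the standard md sysfs files `sync_action` and `mismatch_cnt`; `mismatch_report` is a hypothetical helper, and the sysfs directory is passed as an argument so the function can be tried against a scratch directory first.

```shell
# Report the mismatch count left by the last check/repair pass.
# $1 is the array's md sysfs directory, e.g. /sys/block/md0/md.
mismatch_report() {
    md_dir=$1
    count=$(cat "$md_dir/mismatch_cnt") || return 2
    if [ "$count" -eq 0 ]; then
        echo "no mismatches"
    else
        echo "mismatch_cnt=$count"
        return 1
    fi
}

# Typical use, as root, on a real array (md0 is a placeholder):
#   echo check > /sys/block/md0/md/sync_action   # start a read-only scrub
#   ...wait until sync_action reads "idle" again...
#   mismatch_report /sys/block/md0/md
```

A count that grows between passes, as reported in this thread, points at the data path rather than at stale parity.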

Phil

* Re: WARNING: mismatch_cnt is not 0 on <array device>
From: Benjammin2068 @ 2016-09-26 19:47 UTC
  To: Linux-RAID
In-Reply-To: <409d9f5f-6f72-a399-93ab-2b10323f4122@fnarfbargle.com>

Well that instills fear and doubt...

the mismatch_cnt was 8.

I did a repair and then a check and now it's 10704....

:(

 -Ben

* Re: [PATCH v2] raid6/test/test.c: bug fix: Specify aligned(alignment) attributes to the char arrays
From: Shaohua Li @ 2016-09-26 17:47 UTC
  To: Gayatri Kammela
  Cc: linux-raid, linux-kernel, h.peter.anvin, ravi.v.shankar,
	fenghua.yu, H . Peter Anvin, Yu-cheng Yu
In-Reply-To: <1474589275-12045-1-git-send-email-gayatri.kammela@intel.com>

On Thu, Sep 22, 2016 at 05:07:55PM -0700, Gayatri Kammela wrote:
> Specifying the aligned attributes to the char recovi[PAGE_SIZE]
> and char recovj[PAGE_SIZE] arrays, so that all malloc memory is page
> boundary aligned.
> 
> Without these alignment attributes, the test causes a segfault in
> userspace when the NDISKS are changed to 4 from 16.
> 
> Cc: H. Peter Anvin <hpa@zytor.com>
> Cc: Yu-cheng Yu <yu-cheng.yu@intel.com>
> Signed-off-by: Gayatri Kammela <gayatri.kammela@intel.com>
> ---
>  lib/raid6/test/test.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/raid6/test/test.c b/lib/raid6/test/test.c
> index 3bebbabdb510..32a00f11ac50 100644
> --- a/lib/raid6/test/test.c
> +++ b/lib/raid6/test/test.c
> @@ -21,12 +21,13 @@
>  
>  #define NDISKS		16	/* Including P and Q */
>  
> -const char raid6_empty_zero_page[PAGE_SIZE] __attribute__((aligned(256)));
> +const char raid6_empty_zero_page[PAGE_SIZE] __attribute__((aligned(PAGE_SIZE)));
>  struct raid6_calls raid6_call;
>  
>  char *dataptrs[NDISKS];
>  char data[NDISKS][PAGE_SIZE];

Shouldn't this one be page aligned too?

> -char recovi[PAGE_SIZE], recovj[PAGE_SIZE];
> +char recovi[PAGE_SIZE] __attribute__((aligned(PAGE_SIZE)));
> +char recovj[PAGE_SIZE] __attribute__((aligned(PAGE_SIZE)));
>  
>  static void makedata(int start, int stop)
>  {
> -- 
> 2.7.4
> 

* Re: kernel BUG at block/bio.c:1785! observed on 4.8.0-rc6
From: Shaohua Li @ 2016-09-26 16:52 UTC
  To: Yi Zhang; +Cc: linux-raid, Shaohua Li, Xiaotian Zhang
In-Reply-To: <1653300934.1741807.1474867057556.JavaMail.zimbra@redhat.com>

On Mon, Sep 26, 2016 at 01:17:37AM -0400, Yi Zhang wrote:
> Hello 
> 
> I observed the bug below during my MD RAID testing on 4.8.0-rc6. Could anyone help check it? Thanks.
> 
> [22535.847193] md: bind<loop0>
> [22535.850414] md: bind<loop1>
> [22535.853638] md: bind<loop2>
> [22535.856861] md: bind<loop3>
> [22535.860056] md: bind<loop5>
> [22535.863278] md: bind<loop4>
> [22535.872061] md/raid:md0: device loop3 operational as raid disk 3
> [22535.878783] md/raid:md0: device loop2 operational as raid disk 2
> [22535.885495] md/raid:md0: device loop1 operational as raid disk 1
> [22535.892206] md/raid:md0: device loop0 operational as raid disk 0
> [22535.899761] md/raid:md0: allocated 5432kB
> [22535.904381] md/raid:md0: raid level 5 active with 4 out of 5 devices, algorithm 2
> [22535.912785] md/raid456: discard support disabled due to uncertainty.
> [22535.919885] Set raid456.devices_handle_discard_safely=Y to override.
> [22535.927016] md0: detected capacity change from 0 to 8384413696
> [22535.933796] md: recovery of RAID array md0
> [22535.938386] md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
> [22535.944906] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for recovery.
> [22535.955670] md: using 128k window, over a total of 2046976k.
> [22565.627129] md: md0: recovery done.
> [22569.183047] EXT4-fs (md0): mounted filesystem with ordered data mode. Opts: (null)
> [22570.376773] md: bind<loop7>
> [22570.508870] md: reshape of RAID array md0
> [22570.513358] md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
> [22570.519874] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for reshape.
> [22570.530545] md: using 128k window, over a total of 2046976k.
> [22691.448933] md: md0: reshape done.
> [22709.108706] md0: detected capacity change from 8384413696 to 10480517120
> [22709.144385] VFS: busy inodes on changed media or resized disk md0
> [22709.312043] ------------[ cut here ]------------
> [22709.317198] kernel BUG at block/bio.c:1785!
> [22709.321866] invalid opcode: 0000 [#1] SMP
> [22709.326337] Modules linked in: ext4 jbd2 mbcache loop rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm mlx4_ib ib_core intel_rapl sb_edac edac_core x86_pkg_temp_thermal intel_powerclamp coretemp raid456 kvm_intel async_raid6_recov kvm async_memcpy async_pq async_xor xor async_tx irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel raid6_pq aesni_intel lrw iTCO_wdt gf128mul iTCO_vendor_support glue_helper ablk_helper ipmi_devintf ipmi_ssif cryptd dcdbas mei_me sg pcspkr mei lpc_ich ipmi_si ipmi_msghandler shpchp wmi acpi_power_meter nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c mlx4_en sd_mod mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm ahci drm mlx4_core libahci tg3 crc32c_intel libata ptp i2c_core megaraid_sas devlink fjes pps_core dm_mirror dm_region_hash dm_log dm_mod
> [22709.423707] CPU: 4 PID: 11012 Comm: md0_raid5 Not tainted 4.8.0-rc6 #2
> [22709.430990] Hardware name: Dell Inc. PowerEdge R730/0599V5, BIOS 1.0.4 08/28/2014
> [22709.439342] task: ffff8810f8850000 task.stack: ffff88102379c000
> [22709.445947] RIP: 0010:[<ffffffff81328a8a>]  [<ffffffff81328a8a>] bio_split+0x8a/0x90
> [22709.454607] RSP: 0018:ffff88102379f930  EFLAGS: 00010246
> [22709.460527] RAX: 0000000000000080 RBX: 0000000000001000 RCX: ffff8810386bfd00
> [22709.468489] RDX: 0000000002400000 RSI: 0000000000000000 RDI: ffff88203a604178
> [22709.476452] RBP: ffff88102379f948 R08: 0000000000000000 R09: ffff88203a604178
> [22709.484413] R10: 00058000ffffffff R11: 0000000000000000 R12: 0000000000000000
> [22709.492376] R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000080
> [22709.500339] FS:  0000000000000000(0000) GS:ffff88103ec80000(0000) knlGS:0000000000000000
> [22709.509574] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [22709.515987] CR2: 00007f1460629000 CR3: 0000000001c06000 CR4: 00000000001406e0
> [22709.523951] Stack:
> [22709.526193]  0000000000001000 0000000000000000 0000000000000000 ffff88102379f9f0
> [22709.534489]  ffffffff81335ca0 ffff88103ecd9000 0000000000000001 ffff8810386bfd00
> [22709.542776]  0000000000000000 ffff8810372b2c60 ffff88102379fa28 00000080810c6dac
> [22709.551074] Call Trace:
> [22709.553808]  [<ffffffff81335ca0>] blk_queue_split+0x480/0x640
> [22709.560223]  [<ffffffff8133b9d5>] blk_sq_make_request+0x95/0x490
> [22709.566922]  [<ffffffff8132cec4>] ? generic_make_request_checks+0x234/0x4f0
> [22709.574698]  [<ffffffffa04e51c3>] ? async_xor+0x1c3/0x5b0 [async_xor]
> [22709.581888]  [<ffffffff8132f903>] generic_make_request+0x103/0x1d0
> [22709.588788]  [<ffffffffa0998286>] ops_run_io+0x376/0x960 [raid456]
> [22709.595678]  [<ffffffffa09a0e3b>] handle_stripe+0xbdb/0x23f0 [raid456]
> [22709.602967]  [<ffffffffa09a2a3c>] handle_active_stripes.isra.52+0x3ec/0x4c0 [raid456]
> [22709.611708]  [<ffffffffa0995f69>] ? do_release_stripe+0x99/0x180 [raid456]
> [22709.619382]  [<ffffffffa0996065>] ? __release_stripe+0x15/0x20 [raid456]
> [22709.626862]  [<ffffffffa09a2fb8>] raid5d+0x4a8/0x750 [raid456]
> [22709.633381]  [<ffffffff815756c6>] md_thread+0x136/0x150
> [22709.639218]  [<ffffffff810d2330>] ? prepare_to_wait_event+0xf0/0xf0
> [22709.646214]  [<ffffffff81575590>] ? find_pers+0x70/0x70
> [22709.652045]  [<ffffffff810acca8>] kthread+0xd8/0xf0
> [22709.657490]  [<ffffffff810b515f>] ? finish_task_switch+0x7f/0x240
> [22709.664292]  [<ffffffff816ff13f>] ret_from_fork+0x1f/0x40
> [22709.670309]  [<ffffffff810acbd0>] ? kthread_park+0x60/0x60
> [22709.676430] Code: df e8 eb 29 03 00 8b 73 28 4c 89 e7 e8 80 de ff ff 48 89 d8 5b 41 5c 41 5d 5d c3 e8 61 fc ff ff 48 89 c3 eb b9 31 c0 eb eb 0f 0b <0f> 0b 0f 1f 40 00 0f 1f 44 00 00 48 8b 07 55 48 89 e5 48 85 c0 
> [22709.698111] RIP  [<ffffffff81328a8a>] bio_split+0x8a/0x90
> [22709.704146]  RSP <ffff88102379f930>
> [22709.714624] ---[ end trace 47f4294978ff2bd0 ]---
> [22709.788366] Kernel panic - not syncing: Fatal exception
> [22709.794278] Kernel Offset: disabled
> [22709.867270] ---[ end Kernel panic - not syncing: Fatal exception
> [22709.873997] ------------[ cut here ]------------
> [22709.879159] WARNING: CPU: 4 PID: 11012 at arch/x86/kernel/smp.c:125 native_smp_send_reschedule+0x3f/0x50
> [22709.889740] Modules linked in: ext4 jbd2 mbcache loop rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm mlx4_ib ib_core intel_rapl sb_edac edac_core x86_pkg_temp_thermal intel_powerclamp coretemp raid456 kvm_intel async_raid6_recov kvm async_memcpy async_pq async_xor xor async_tx irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel raid6_pq aesni_intel lrw iTCO_wdt gf128mul iTCO_vendor_support glue_helper ablk_helper ipmi_devintf ipmi_ssif cryptd dcdbas mei_me sg pcspkr mei lpc_ich ipmi_si ipmi_msghandler shpchp wmi acpi_power_meter nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c mlx4_en sd_mod mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm ahci drm mlx4_core libahci tg3 crc32c_intel libata ptp i2c_core megaraid_sas devlink fjes pps_core dm_mirror dm_region_hash dm_log dm_mod
> [22709.987252] CPU: 4 PID: 11012 Comm: md0_raid5 Tainted: G      D         4.8.0-rc6 #2
> [22709.995894] Hardware name: Dell Inc. PowerEdge R730/0599V5, BIOS 1.0.4 08/28/2014
> [22710.004469]  0000000000000086 0000000003b679dd ffff88103ec83bb0 ffffffff8135ce3c
> [22710.012757]  0000000000000000 0000000000000000 ffff88103ec83bf0 ffffffff8108d7a1
> [22710.021051]  0000007d3ec190c0 0000000000000000 ffff88203903da00 ffff88103ec190c0
> [22710.029344] Call Trace:
> [22710.032072]  <IRQ>  [<ffffffff8135ce3c>] dump_stack+0x63/0x87
> [22710.038505]  [<ffffffff8108d7a1>] __warn+0xd1/0xf0
> [22710.043851]  [<ffffffff8108d8dd>] warn_slowpath_null+0x1d/0x20
> [22710.050357]  [<ffffffff81050c2f>] native_smp_send_reschedule+0x3f/0x50
> [22710.057647]  [<ffffffff810b6928>] resched_curr+0xa8/0xd0
> [22710.063573]  [<ffffffff810b7685>] check_preempt_curr+0x75/0x90
> [22710.070080]  [<ffffffff810b76b9>] ttwu_do_wakeup+0x19/0xe0
> [22710.076201]  [<ffffffff810b77ef>] ttwu_do_activate+0x6f/0x80
> [22710.082515]  [<ffffffff810b841e>] try_to_wake_up+0x1ae/0x3c0
> [22710.088830]  [<ffffffff810b86e2>] default_wake_function+0x12/0x20
> [22710.095630]  [<ffffffff810d1be5>] __wake_up_common+0x55/0x90
> [22710.101944]  [<ffffffff810d1c33>] __wake_up_locked+0x13/0x20
> [22710.108263]  [<ffffffff81275419>] ep_poll_callback+0xb9/0x200
> [22710.114672]  [<ffffffff810d1be5>] __wake_up_common+0x55/0x90
> [22710.120986]  [<ffffffff810d1d39>] __wake_up+0x39/0x50
> [22710.126626]  [<ffffffff810e9470>] wake_up_klogd_work_func+0x40/0x60
> [22710.133624]  [<ffffffff8118101d>] irq_work_run_list+0x4d/0x70
> [22710.140040]  [<ffffffff8110de30>] ? tick_sched_do_timer+0x50/0x50
> [22710.146837]  [<ffffffff811811d0>] irq_work_tick+0x40/0x50
> [22710.152867]  [<ffffffff810fdca2>] update_process_times+0x42/0x60
> [22710.159567]  [<ffffffff8110d775>] tick_sched_handle.isra.16+0x25/0x60
> [22710.166756]  [<ffffffff8110de6d>] tick_sched_timer+0x3d/0x70
> [22710.173072]  [<ffffffff810fe9c3>] __hrtimer_run_queues+0xf3/0x280
> [22710.179869]  [<ffffffff810feea8>] hrtimer_interrupt+0xa8/0x1a0
> [22710.186380]  [<ffffffff810535d5>] local_apic_timer_interrupt+0x35/0x60
> [22710.193669]  [<ffffffff81701aad>] smp_apic_timer_interrupt+0x3d/0x50
> [22710.200761]  [<ffffffff81700c6c>] apic_timer_interrupt+0x8c/0xa0
> [22710.207461]  <EOI>  [<ffffffff811987da>] ? panic+0x1f1/0x232
> [22710.213786]  [<ffffffff81030ba8>] oops_end+0xb8/0xd0
> [22710.219331]  [<ffffffff8103110b>] die+0x4b/0x70
> [22710.224384]  [<ffffffff8102df20>] do_trap+0x140/0x150
> [22710.230017]  [<ffffffff8102e2a9>] do_error_trap+0x89/0x110
> [22710.236142]  [<ffffffff81328a8a>] ? bio_split+0x8a/0x90
> [22710.241970]  [<ffffffff810b7692>] ? check_preempt_curr+0x82/0x90
> [22710.248671]  [<ffffffff810b76b9>] ? ttwu_do_wakeup+0x19/0xe0
> [22710.254989]  [<ffffffff810c0cc3>] ? update_cfs_rq_load_avg+0x233/0x440
> [22710.262272]  [<ffffffff8102e7e0>] do_invalid_op+0x20/0x30
> [22710.268297]  [<ffffffff816ffd3e>] invalid_op+0x1e/0x30
> [22710.274029]  [<ffffffff81328a8a>] ? bio_split+0x8a/0x90
> [22710.279861]  [<ffffffff81335ca0>] blk_queue_split+0x480/0x640
> [22710.286273]  [<ffffffff8133b9d5>] blk_sq_make_request+0x95/0x490
> [22710.292976]  [<ffffffff8132cec4>] ? generic_make_request_checks+0x234/0x4f0
> [22710.300750]  [<ffffffffa04e51c3>] ? async_xor+0x1c3/0x5b0 [async_xor]
> [22710.307939]  [<ffffffff8132f903>] generic_make_request+0x103/0x1d0
> [22710.314839]  [<ffffffffa0998286>] ops_run_io+0x376/0x960 [raid456]
> [22710.321737]  [<ffffffffa09a0e3b>] handle_stripe+0xbdb/0x23f0 [raid456]
> [22710.329021]  [<ffffffffa09a2a3c>] handle_active_stripes.isra.52+0x3ec/0x4c0 [raid456]
> [22710.337759]  [<ffffffffa0995f69>] ? do_release_stripe+0x99/0x180 [raid456]
> [22710.345429]  [<ffffffffa0996065>] ? __release_stripe+0x15/0x20 [raid456]
> [22710.352907]  [<ffffffffa09a2fb8>] raid5d+0x4a8/0x750 [raid456]
> [22710.359418]  [<ffffffff815756c6>] md_thread+0x136/0x150
> [22710.365248]  [<ffffffff810d2330>] ? prepare_to_wait_event+0xf0/0xf0
> [22710.372240]  [<ffffffff81575590>] ? find_pers+0x70/0x70
> [22710.378071]  [<ffffffff810acca8>] kthread+0xd8/0xf0
> [22710.383513]  [<ffffffff810b515f>] ? finish_task_switch+0x7f/0x240
> [22710.390314]  [<ffffffff816ff13f>] ret_from_fork+0x1f/0x40
> [22710.396337]  [<ffffffff810acbd0>] ? kthread_park+0x60/0x60
> [22710.402457] ---[ end trace 47f4294978ff2bd1 ]---

One bug was fixed in 4.8-rc7 (commit c94455558337eece474). Can you try that?

Thanks,
Shaohua

* Re: Linux raid wiki
From: Wols Lists @ 2016-09-26 16:44 UTC
  To: Phil Turmel, linux-raid
In-Reply-To: <c6d53f0f-4434-9703-cfab-b148dd8dd68b@turmel.org>

On 26/09/16 15:01, Phil Turmel wrote:
> Hi Wol,
> 
> A few comments below.

Thank you very much.
> 
> On 09/24/2016 09:18 AM, Wols Lists wrote:
>> On 23/09/16 00:31, Wols Lists wrote:
>>> I've added the "When Things Go Wrogn" section, but so far only the first
>>> two pages - "Asking for help" and "Timeout Mismatch" - are all my work.
>>> The other three pages were already there, but I moved them here because
>>> I felt they belonged here.
>>>
>>> Please feel free to criticize it (or offer bouquets :-), and give advice
>>> as to how to improve things, either in private email or on the list.
>>
>> Replying to myself, but I'm reasonably happy with the first three
>> sections in "When Things Go Wrogn". But it's important that they're
>> correct! Would a couple of experts mind looking them over and sending a
>> critique to the list? Just a simple "Looks good" would be great and set
>> my mind at rest that I have understood things properly and I'm not
>> giving out bad advice.
>>
>> Note that the next section is going to be along the lines of "My array
>> won't assemble / run"
>>
>> https://raid.wiki.kernel.org/index.php/Asking_for_help
> 
> "smartctl --all" doesn't report ERC settings.  --xall is required, or
> for a somewhat shorter report, I find "smartctl -H -i -l scterc" ideal.
> 
Thanks. Noted and updated.

>> https://raid.wiki.kernel.org/index.php/Timeout_Mismatch
> 
> Very good.
> 
>> https://raid.wiki.kernel.org/index.php/Replacing_a_failed_drive
> 
> You should note that USB connections are not suitable for permanent use.
> Copying a drive or doing a --replace, fine, but don't leave it set up
> that way.  USB disconnects, even if only for sleep, will scramble the MD
> code.

Noted. I've added a bit to say don't use USB for raid but it's okay for
salvaging a drive.
> 
> Also, any time ddrescue is used, the unreadable sectors are replaced
> with zeros and there is no longer any indication that that sector is
> bad.  That means assembling an array from ddrescued components will
> certainly have some corrupt spots.  fsck is mandatory, and there may be
> corrupt file content.  ddrescue is only appropriate if there's no
> redundancy left in the array to use to fix UREs.
> 
Or if there are no errors in the copy ...

That section tries to stress that it only applies if there are no
errors. And if you complete it successfully, you won't lose any data.

> Overall, very good.
> 
The next section -

https://raid.wiki.kernel.org/index.php/Assemble_Run

addresses what to do if the array is messed up in some way. Would you
mind taking a look at that now too :-)

Cheers,
Wol


* Re: Linux raid wiki - setting up a system - advice wanted :-)
From: Wols Lists @ 2016-09-26 15:48 UTC
  To: Phil Turmel; +Cc: linux-raid
In-Reply-To: <e5651061-1404-30be-2777-9ae02a640f42@turmel.org>

On 26/09/16 15:13, Phil Turmel wrote:
> On 09/26/2016 02:50 AM, Wols Lists wrote:
> 
>> Bare metal -> raid [-> lvm] -> /
>>
>> Is there any room on the disk to install grub?
> 
> No.

Ten out of ten. Thanks.

(Hint to examinees - please answer the question on the paper, not the
question you want to answer :-)
> 
>> (Note that - and I know you shouldn't believe everything you read on the
>> internet - apparently Neil Brown prefers passing the entire
>> unpartitioned disk to raid ...)
> 
> I'm with Neil for my large arrays.  I partition a pair of SSDs for UEFI
> boot and a raid mirror holding an LVM volume group for the OS.  All
> other drives are unpartitioned, given entirely to a raid6 w/ a small
> chunk size (16k lately).  A separate LVM volume group on top of that.
> 
> ( I no longer use any bootloader, either, as UEFI will boot a kernel
> directly that's been built with EFI_STUB and a nested initramfs. )
> 
That's good to know. I don't have any experience with UEFI as yet, so I
can document that as a hint to others.

Cheers,
Wol


* Re: RAID6 - CPU At 100% Usage After Reassembly
From: Francisco Parada @ 2016-09-26 14:29 UTC
  To: Shaohua Li, mdraid, Michael J. Shaver
In-Reply-To: <CAOW94utVBcLz191ifzKkjn+nsSthPWDAQF8R-PabcS2uPareag@mail.gmail.com>

Hi all,

It doesn't seem like my response from last night made it to the list:


Hi Shaohua and all,

I was finally able to upgrade my Ubuntu server to a newer version of
the kernel and mdadm:
==========================
$ uname -r; mdadm -V

4.8.0-rc7-custom

mdadm - v3.4 - 28th January 2016
==========================


I rebuilt the kernel with the options that Shaohua asked me to build it with:
======================================
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 5883ef0..db484ca 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -62,6 +62,9 @@
 #include "raid0.h"
 #include "bitmap.h"

+#undef pr_debug
+#define pr_debug trace_printk
+
 #define cpu_to_group(cpu) cpu_to_node(cpu)
 #define ANY_GROUP NUMA_NO_NODE
======================================


Here's how things look so far, nothing different yet:
======================================
$ cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md127 : inactive sdd[10](S) sdk[0](S) sdj[2](S) sdh[3](S) sde[11](S) sdg[9](S) sdf[7](S) sdb[13](S) sdc[12](S)
      26371219608 blocks super 1.2

unused devices: <none>
======================================


Here's an event snapshot of my array, just keep in mind that
"/dev/sdi" is my failed drive, so I omitted it from the examination:
======================================
# mdadm -E /dev/sd[b-h,j,k] |grep Events
         Events : 280033
         Events : 280033
         Events : 280033
         Events : 280033
         Events : 280033
         Events : 280033
         Events : 280011
         Events : 280033
         Events : 280033
======================================
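
With `grep Events` alone the counts lose their device names. A minimal sketch that keeps them together and lists only the members that lag the highest count; `stale_members` is a hypothetical helper, and it assumes each device's section of `mdadm -E` output starts with a `/dev/...:` line followed by an `Events : N` line.

```shell
# Read `mdadm -E` output for several members on stdin and list the
# devices whose event count is lower than the highest one seen.
stale_members() {
    awk '
        /^\/dev\//                  { dev = $1; sub(/:$/, "", dev) }
        $1 == "Events" && $2 == ":" { ev[dev] = $3 + 0
                                      if (ev[dev] > max) max = ev[dev] }
        END {
            for (d in ev)
                if (ev[d] < max)
                    print d, ev[d], "behind by", max - ev[d]
        }
    '
}

# Usage (as root): mdadm -E /dev/sd[b-h] /dev/sd[jk] | stale_members
```

In the listing above the seventh device examined is the one at 280011, which matches the member that mdadm later reports as "possibly out of date".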


Note that since I haven't done anything yet, my CPU is idle:
======================================
top - 20:22:00 up  5:56,  2 users,  load average: 0.04, 0.03, 0.00
Tasks: 221 total,   1 running, 220 sleeping,   0 stopped,   0 zombie
%Cpu(s):  1.0 us,  1.0 sy,  0.0 ni, 97.5 id,  0.5 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem :  1525400 total,   103836 free,   696208 used,   725356 buff/cache
KiB Swap: 25153532 total, 25117380 free,    36152 used.   454808 avail Mem

 PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 2093 cisco     20   0 1761112 153108  55640 S   1.0 10.0   0:12.61 gnome-shell
 4322 root      20   0   40520   3684   3100 R   1.0  0.2   0:00.22 top
    1 root      20   0  119692   5540   3992 S   0.0  0.4   0:02.44 systemd
    2 root      20   0       0      0      0 S   0.0  0.0   0:00.00 kthreadd
    3 root      20   0       0      0      0 S   0.0  0.0   0:00.09 ksoftirqd/0
    5 root       0 -20       0      0      0 S   0.0  0.0   0:00.00 kworker/0:
======================================


Now onto the fun part.  I stopped "/dev/md127":
======================================
# mdadm --stop /dev/md127
mdadm: stopped /dev/md127
# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
unused devices: <none>
======================================


For completeness, here's the trace output after stopping the array and
before reassembling:
======================================
# tracer: nop
#
# entries-in-buffer/entries-written: 0/0   #P:2
#
#                              _-----=> irqs-off
#                             / _----=> need-resched
#                            | / _---=> hardirq/softirq
#                            || / _--=> preempt-depth
#                            ||| /     delay
#           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
#              | |       |   ||||       |         |
/sys/kernel/debug/tracing/trace (END)
======================================


Then I reassembled the array:
======================================
# mdadm -Afv /dev/md127 /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdj /dev/sdk
mdadm: looking for devices for /dev/md127
mdadm: /dev/sdb is busy - skipping
mdadm: Merging with already-assembled /dev/md/en1
mdadm: /dev/sdb is identified as a member of /dev/md/en1, slot 7.
mdadm: /dev/sdc is identified as a member of /dev/md/en1, slot 8.
mdadm: /dev/sdd is identified as a member of /dev/md/en1, slot 6.
mdadm: /dev/sde is identified as a member of /dev/md/en1, slot 9.
mdadm: /dev/sdf is identified as a member of /dev/md/en1, slot 4.
mdadm: /dev/sdg is identified as a member of /dev/md/en1, slot 1.
mdadm: /dev/sdh is identified as a member of /dev/md/en1, slot 3.
mdadm: /dev/sdj is identified as a member of /dev/md/en1, slot 2.
mdadm: /dev/sdk is identified as a member of /dev/md/en1, slot 0.
mdadm: Marking array /dev/md/en1 as 'clean'
mdadm: /dev/md/en1 has an active reshape - checking if critical section needs to be restored
mdadm: No backup metadata on device-7
mdadm: No backup metadata on device-8
mdadm: No backup metadata on device-9
mdadm: added /dev/sdg to /dev/md/en1 as 1
mdadm: added /dev/sdj to /dev/md/en1 as 2
mdadm: added /dev/sdh to /dev/md/en1 as 3 (possibly out of date)
mdadm: added /dev/sdf to /dev/md/en1 as 4
mdadm: no uptodate device for slot 5 of /dev/md/en1
mdadm: added /dev/sdd to /dev/md/en1 as 6
mdadm: /dev/sdb is already in /dev/md/en1 as 7
mdadm: added /dev/sdc to /dev/md/en1 as 8
mdadm: added /dev/sde to /dev/md/en1 as 9
mdadm: added /dev/sdk to /dev/md/en1 as 0
======================================


And of course, CPU shoots to 100%:
======================================
top - 20:38:44 up  6:13,  3 users,  load average: 5.05, 3.25, 1.41
Tasks: 239 total,   3 running, 236 sleeping,   0 stopped,   0 zombie
%Cpu(s):  5.9 us, 52.7 sy,  0.0 ni,  0.0 id, 41.4 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem :  1525400 total,    73124 free,   739576 used,   712700 buff/cache
KiB Swap: 25153532 total, 25111140 free,    42392 used.   415840 avail Mem

 PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 6423 root      20   0       0      0      0 R  99.0  0.0   4:43.51 md127_raid6
 1166 root      20   0  280588   8780   6192 S   3.0  0.6   0:06.56 polkitd
 4022 cisco     20   0  394756  32884  26064 S   3.0  2.2   0:08.77 gnome-disks
 1903 cisco     20   0  256660  34060  26280 S   2.0  2.2   0:29.56 Xorg
 2093 cisco     20   0 1760364 153572  55572 S   2.0 10.1   0:17.96 gnome-shell
======================================


Then, sure enough, the array reshape speed goes back down to nothing:
======================================
# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md127 : active raid6 sdk[0] sde[11] sdc[12] sdd[10] sdf[7] sdj[2] sdg[9] sdb[13]
      14650675200 blocks super 1.2 level 6, 512k chunk, algorithm 2 [10/8] [UUU_U_UUUU]
      [=======>.............]  reshape = 39.1% (1146348512/2930135040) finish=22057575.1min speed=1K/sec
      bitmap: 0/22 pages [0KB], 65536KB chunk

unused devices: <none>
======================================
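
To watch whether a stalled reshape is moving at all, the progress figures can be pulled out of `/proc/mdstat` and compared between samples. A minimal sketch; `reshape_remaining` is a hypothetical helper, and it assumes the kernel prints the progress bar, the `(done/total)` pair and `speed=` on a single line, as `/proc/mdstat` does when it is not wrapped by a mail client.

```shell
# Pull the reshape progress out of /proc/mdstat-style text on stdin and
# print how many blocks are still to be processed, plus the speed.
reshape_remaining() {
    awk '
        /reshape =/ {
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^\([0-9]+\/[0-9]+\)$/) {
                    # strip the parentheses, then split done/total
                    split(substr($i, 2, length($i) - 2), n, "/")
                    remaining = n[2] - n[1]
                }
                if ($i ~ /^speed=/) speed = substr($i, 7)
            }
            printf "%.0f blocks left at %s\n", remaining, speed
        }
    '
}

# Usage: reshape_remaining < /proc/mdstat
```

If the remaining count is identical across samples taken minutes apart, the reshape is stuck rather than merely slow.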


The size of the trace file is gigantic, so hopefully it doesn't get
trimmed in the email, but any help would be appreciated, thanks in
advance:
================
# tracer: nop
#
# entries-in-buffer/entries-written: 44739/81554230   #P:2
#
#                              _-----=> irqs-off
#                             / _----=> need-resched
#                            | / _---=> hardirq/softirq
#                            || / _--=> preempt-depth
#                            ||| /     delay
#           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
#              | |       |   ||||       |         |
     md127_raid6-6423  [001] .... 22228.159918: analyse_stripe: check 7: state 0x13 read           (null) write           (null) written           (null)
     md127_raid6-6423  [001] .... 22228.159919: analyse_stripe: check 6: state 0xa01 read           (null) write           (null) written           (null)
     md127_raid6-6423  [001] .... 22228.159919: analyse_stripe: check 5: state 0x801 read           (null) write           (null) written           (null)
     md127_raid6-6423  [001] .... 22228.159920: analyse_stripe: check 4: state 0x811 read           (null) write           (null) written           (null)
     md127_raid6-6423  [001] .... 22228.159921: analyse_stripe: check 3: state 0x801 read           (null) write           (null) written           (null)
     md127_raid6-6423  [001] .... 22228.159921: analyse_stripe: check 2: state 0x811 read           (null) write           (null) written           (null)
     md127_raid6-6423  [001] .... 22228.159922: analyse_stripe: check 1: state 0xa01 read           (null) write           (null) written           (null)
     md127_raid6-6423  [001] .... 22228.159923: analyse_stripe: check 0: state 0x811 read           (null) write           (null) written           (null)
     md127_raid6-6423  [001] .... 22228.159924: handle_stripe: locked=2 uptodate=10 to_read=0 to_write=0 failed=4 failed_num=6,5
     md127_raid6-6423  [001] .... 22228.159925: schedule_reconstruction: schedule_reconstruction: stripe 2292697672 locked: 4 ops_request: 10
     md127_raid6-6423  [001] .... 22228.159925: raid_run_ops: ops_run_reconstruct6: stripe 2292697672
     md127_raid6-6423  [001] .... 22228.159943: ops_complete_reconstruct: ops_complete_reconstruct: stripe 2292697672
     md127_raid6-6423  [001] .... 22228.159944: handle_stripe: handling stripe 2292697680, state=0x1401 cnt=1, pd_idx=7, qd_idx=8, check:0, reconstruct:6
===========================================

I trimmed it because of the failure to send issue.  However, if
someone needs a lengthier snip of the trace, let me know.
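
When the full trace is too large to mail, a per-function count is a compact alternative. A minimal sketch; `trace_summary` is a hypothetical helper, and it assumes the column layout announced in the trace header (task-PID, CPU, flags, timestamp, function) with one entry per line, as in the file itself rather than in a wrapped email.

```shell
# Count trace entries per function from ftrace text on stdin.
# Assumed layout:  TASK-PID  [CPU]  flags  TIMESTAMP:  FUNCTION:  message
trace_summary() {
    awk '
        /^#/ || NF < 5 { next }            # skip header and blank lines
        { fn = $5; sub(/:$/, "", fn); count[fn]++ }
        END { for (f in count) print f, count[f] }
    ' | sort
}

# Usage (as root): trace_summary < /sys/kernel/debug/tracing/trace
```

A summary dominated by the same few functions for the same stripe numbers would support the picture of the raid6 thread spinning on a small set of stripes.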

Thanks,

'Cisco

* Re: Linux raid wiki - setting up a system - advice wanted :-)
From: Phil Turmel @ 2016-09-26 14:13 UTC
  To: Wols Lists, Andreas Klauer; +Cc: linux-raid
In-Reply-To: <57E8C550.8000509@youngman.org.uk>

On 09/26/2016 02:50 AM, Wols Lists wrote:

> Bare metal -> raid [-> lvm] -> /
> 
> Is there any room on the disk to install grub?

No.

> (Note that - and I know you shouldn't believe everything you read on the
> internet - apparently Neil Brown prefers passing the entire
> unpartitioned disk to raid ...)

I'm with Neil for my large arrays.  I partition a pair of SSDs for UEFI
boot and a raid mirror holding an LVM volume group for the OS.  All
other drives are unpartitioned, given entirely to a raid6 w/ a small
chunk size (16k lately).  A separate LVM volume group on top of that.
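
The bulk-storage half of that layout can be sketched as a dry run that prints the commands instead of running them. `plan_bulk_array` is a hypothetical helper and every device and volume group name is a placeholder; the sketch assumes mdadm's `--chunk` value is given in KiB and uses the stock LVM `pvcreate`/`vgcreate` tools.

```shell
# Dry run: print the commands for a whole-disk RAID6 with a 16 KiB chunk
# and an LVM volume group on top. All names are placeholders.
# Usage: plan_bulk_array MD_DEVICE VG_NAME DISK [DISK ...]
plan_bulk_array() {
    md=$1
    vg=$2
    shift 2
    # $# is now the number of member disks, $* the disks themselves
    echo "mdadm --create $md --level=6 --chunk=16 --raid-devices=$# $*"
    echo "pvcreate $md"
    echo "vgcreate $vg $md"
}

plan_bulk_array /dev/md1 vg_bulk /dev/sdc /dev/sdd /dev/sde /dev/sdf
```

Note that whole disks are passed as members, so nothing on those drives is available for a bootloader; booting comes from the separately partitioned SSD pair.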

( I no longer use any bootloader, either, as UEFI will boot a kernel
directly that's been built with EFI_STUB and a nested initramfs. )

Phil

* Re: Linux raid wiki
From: Phil Turmel @ 2016-09-26 14:01 UTC
  To: Wols Lists, linux-raid
In-Reply-To: <57E67D32.30108@youngman.org.uk>

Hi Wol,

A few comments below.

On 09/24/2016 09:18 AM, Wols Lists wrote:
> On 23/09/16 00:31, Wols Lists wrote:
>> I've added the "When Things Go Wrogn" section, but so far only the first
>> two pages - "Asking for help" and "Timeout Mismatch" - are all my work.
>> The other three pages were already there, but I moved them here because
>> I felt they belonged here.
>>
>> Please feel free to criticize it (or offer bouquets :-), and give advice
>> as to how to improve things, either in private email or on the list.
> 
> Replying to myself, but I'm reasonably happy with the first three
> sections in "When Things Go Wrogn". But it's important that they're
> correct! Would a couple of experts mind looking them over and sending a
> critique to the list? Just a simple "Looks good" would be great and set
> my mind at rest that I have understood things properly and I'm not
> giving out bad advice.
> 
> Note that the next section is going to be along the lines of "My array
> won't assemble / run"
> 
> https://raid.wiki.kernel.org/index.php/Asking_for_help

"smartctl --all" doesn't report ERC settings.  --xall is required, or
for a somewhat shorter report, I find "smartctl -H -i -l scterc" ideal.
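
A minimal sketch of gathering that shorter report for several drives in
one go. Only the smartctl flags come from the text above; the helper
names and the device paths are invented for illustration.

```python
import subprocess

# Flags quoted above: overall health (-H), identity (-i), and the
# SCT Error Recovery Control log (-l scterc).
ERC_FLAGS = ["-H", "-i", "-l", "scterc"]


def erc_report_command(device):
    """Build the smartctl command line for one drive (hypothetical helper)."""
    return ["smartctl", *ERC_FLAGS, device]


def collect_erc_reports(devices):
    """Run smartctl on each device and return {device: report text}.

    Needs root and smartmontools installed. smartctl's exit status is a
    bit mask, so a non-zero status is not treated as a failure here.
    """
    reports = {}
    for device in devices:
        result = subprocess.run(
            erc_report_command(device),
            capture_output=True, text=True, check=False,
        )
        reports[device] = result.stdout
    return reports


# Example call (placeholder device names, substitute your array members):
#   for dev, text in collect_erc_reports(["/dev/sda", "/dev/sdb"]).items():
#       print(f"=== {dev} ===\n{text}")
```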

> https://raid.wiki.kernel.org/index.php/Timeout_Mismatch

Very good.

> https://raid.wiki.kernel.org/index.php/Replacing_a_failed_drive

You should note that USB connections are not suitable for permanent use.
 Copying a drive or doing a --replace, fine, but don't leave it set up
that way.  USB disconnects, even if only for sleep, will scramble the MD
code.

Also, any time ddrescue is used, the unreadable sectors are replaced
with zeros and there is no longer any indication that that sector is
bad.  That means assembling an array from ddrescued components will
certainly have some corrupt spots.  fsck is mandatory, and there may be
corrupt file content.  ddrescue is only appropriate if there's no
redundancy left in the array to use to fix UREs.

Overall, very good.

Phil


* Re: Linux raid wiki - setting up a system - advice wanted :-)
From: keld @ 2016-09-26  9:30 UTC (permalink / raw)
  To: Wols Lists; +Cc: linux-raid
In-Reply-To: <57E83EA8.9060809@youngman.org.uk>

On Sun, Sep 25, 2016 at 10:16:24PM +0100, Wols Lists wrote:
> This is a great way for learning lots about raid :-)
> 
> I'm planning a section on setting up a new system, and I need to know
> what will happen if you give entire drives to mdadm.
> 
> Does it leave the first 2 megs empty? Basically, what I'm asking is if I do
> 
> mdadm --create /dev/md/bigarray -add /dev/sda /dev/sdb /dev/sdc
> 
> (note I am passing the entire drive, not the first partition) and then I
> install grub on those drives, will I trash the array?

There is already a section in the wiki:

https://raid.wiki.kernel.org/index.php/Preventing_against_a_failing_disk

Which should be updated for 2TB+ disks.

best regards
Keld


* Re: Linux raid wiki - setting up a system - advice wanted :-)
From: keld @ 2016-09-26  8:17 UTC (permalink / raw)
  To: Francisco Parada; +Cc: Wols Lists, linux-raid
In-Reply-To: <CAOW94uvBEWZP4aN3jJT7ya68As6XGXPO1oJWFFPQnRmGg41ffg@mail.gmail.com>

Hi

If you do not have a partition table, you cannot have different partitions
on the disks.

In many cases it is a good idea to have different types of RAID and
partitions for different purposes, and this is where MD RAID has some
advantages over HW RAID. For instance, you may want a /boot, a /root, a swap
and one or more data partitions.

Different RAID types then suit the different purposes, like RAID1, RAID10 and RAID5.

Best regards
Keld

On Sun, Sep 25, 2016 at 08:59:27PM -0400, Francisco Parada wrote:
> Hi Wols,
> 
> Based on my own experience, you can do it without trashing the array.
> However, I should note that I have never done it this way with an array
> that I was booting from.  That holds as long as you've set up GPT to
> account for the 2MB boundary.  If you use "parted" or its graphical
> equivalent, "gparted", you can account for that on newer drives above
> 2 terabytes anyway, with 1MB instead of 2MB.  So if you add
> a little extra padding, say 3MB (2MB for your grub, 1MB for a blank
> section that all drives above 2TB require), you should be in good
> shape.
> 
> Perhaps someone else can chime in also, to confirm.
> 
> On Sun, Sep 25, 2016 at 5:16 PM, Wols Lists <antlists@youngman.org.uk> wrote:
> > This is a great way for learning lots about raid :-)
> >
> > I'm planning a section on setting up a new system, and I need to know
> > what will happen if you give entire drives to mdadm.
> >
> > Does it leave the first 2 megs empty? Basically, what I'm asking is if I do
> >
> > mdadm --create /dev/md/bigarray -add /dev/sda /dev/sdb /dev/sdc
> >
> > (note I am passing the entire drive, not the first partition) and then I
> > install grub on those drives, will I trash the array?
> >
> > Cheers,
> > Wol
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html

* Re: WARNING: mismatch_cnt is not 0 on <array device>
From: Brad Campbell @ 2016-09-26  7:40 UTC (permalink / raw)
  To: Benjammin2068, Linux-RAID
In-Reply-To: <4118791e-e1d8-1261-2920-560f22ec7b6f@gmail.com>

On 26/09/16 15:19, Benjammin2068 wrote:
>
>
>
> I'll go run a check (and repair)...
>

Don't run a repair until you've got it sussed. Check is read-only, 
repair isn't.
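
A rough sketch of that distinction, assuming the usual md sysfs layout
(/sys/block/<array>/md/sync_action). The function name and the
allow_writes guard are invented for illustration; they are not part of
mdadm or the kernel.

```python
from pathlib import Path

READ_ONLY_ACTIONS = {"check"}    # reads every stripe and counts mismatches
WRITING_ACTIONS = {"repair"}     # rewrites redundancy data where it disagrees


def request_sync_action(array, action, allow_writes=False, sysfs_root="/sys/block"):
    """Write `action` to <sysfs_root>/<array>/md/sync_action.

    "repair" is refused unless allow_writes=True, so a scrub cannot
    modify the array by accident while the cause is still unknown.
    """
    if action not in READ_ONLY_ACTIONS | WRITING_ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    if action in WRITING_ACTIONS and not allow_writes:
        raise PermissionError(f"{action!r} writes to the array; pass allow_writes=True")
    path = Path(sysfs_root) / array / "md" / "sync_action"
    path.write_text(action + "\n")
    return path


# Example (needs root; "md127" is the array name used in this thread):
#   request_sync_action("md127", "check")
```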

Brad


* Re: WARNING: mismatch_cnt is not 0 on <array device>
From: Benjammin2068 @ 2016-09-26  7:19 UTC (permalink / raw)
  To: Linux-RAID
In-Reply-To: <409d9f5f-6f72-a399-93ab-2b10323f4122@fnarfbargle.com>



On 09/25/2016 10:42 PM, Brad Campbell wrote:
>
> You best find out what it is to start with. Example from mine :
>
> brad@srv:~$ cat /sys/block/md2/md/mismatch_cnt
> 8171392
>
> This is a bad example because it's a RAID10 of 6 SSDs. 3 support zero after TRIM and the other 3 don't, so after a TRIM of the filesystem the mismatch count is through the roof.
>
> Unless you are swapping to an array or you have some known issue like the one I mention above, a mismatch count of non-zero is not good.
>
> I lost most of a RAID6 due to a faulty SIL SATA controller and it was the high mismatch counts that alerted me. Unfortunately I was about 6 months in and had only checked for the first time. It was silently corrupting reads under load, and so the read-modify-write cycles were quietly corrupting the array.
>
> Check the mismatch count, run a "check" on the array and check it again. If they vary wildly something odd is going on. If they are the same then you might want to figure out what might have caused it.

Mine is 8.

Besides moving from a RAID5 to a RAID6, I also recently installed a new HD controller (Marvell 88SE9485) from SuperMicro (onto a SuperMicro motherboard).

Previously, the RAID was contained with 4 SATA ports all on the motherboard (also SuperMicro) -- only with this move to RAID6, I needed the extra ports, so this is the card I got.

I'll go run a check (and repair)...

Will report back when it's all done.

 -Ben


* Re: Linux raid wiki - setting up a system - advice wanted :-)
From: Wols Lists @ 2016-09-26  6:50 UTC (permalink / raw)
  To: Andreas Klauer; +Cc: linux-raid
In-Reply-To: <20160926021622.GA25056@metamorpher.de>

On 26/09/16 03:16, Andreas Klauer wrote:
> On Sun, Sep 25, 2016 at 10:16:24PM +0100, Wols Lists wrote:
>> > I need to know what will happen if you give entire drives to mdadm.
> Installers will pick unpartitioned disks first. Forget just trashing 
> the metadata, easy to accidentally write across the entire disk.
> 
> This is what it looks like when installing Windows: http://imgur.com/a/GtcR2 
> 
> Same can happen with Linux installers. Unpartitioned disks are just unusual. 
> 
> Not sure why this is a thing anyway. There's no downside to partitions. 
> Adds a safety margin, is yet another place that has metadata (with GPT 
> you can use mdnumber-role as partition name / partlabel), doesn't harm 
> performance in any way...

Actually, there IS a downside, which is what I'm getting at.

Bare metal -> partitions -> raid -> lvm -> partitions ...

I'm a DB guy by trade. I hate relational DBs with a vengeance - because
they are necessarily complex thanks to relational theory, but also because
they would be so totally UNnecessary if people weren't wedded to the
(totally impractical in the real world) maths!

I know what I'm doing here. I'm very bright. And I'm trying to work out
how to explain myself without leaving your "bear of very little brain"
in charge of sys-adminning a server scratching his head in confusion
trying to work out what goes where.

As far as linux is concerned, a block device is a block device. But the
poor sysadmin has got to get his head round what goes where, and my
experience with DBs tells me that most people probably aren't as bright
as us ...

At the end of the day, I don't want to recommend anything. I simply want
to know - is it *possible*. Unfortunately, I don't have the hardware to
try it myself :-( The setup I want to know about is

Bare metal -> raid [-> lvm] -> /

Is there any room on the disk to install grub?

(Note that - and I know you shouldn't believe everything you read on the
internet - apparently Neil Brown prefers passing the entire
unpartitioned disk to raid ...)

Cheers,
Wol


* kernel BUG at block/bio.c:1785! observed on 4.8.0-rc6
From: Yi Zhang @ 2016-09-26  5:17 UTC (permalink / raw)
  To: linux-raid; +Cc: Shaohua Li, Xiaotian Zhang
In-Reply-To: <2021001709.1741646.1474866867952.JavaMail.zimbra@redhat.com>

Hello 

I observed the bug below during my MD RAID testing on 4.8.0-rc6. Could anyone help check it? Thanks.

[22535.847193] md: bind<loop0>
[22535.850414] md: bind<loop1>
[22535.853638] md: bind<loop2>
[22535.856861] md: bind<loop3>
[22535.860056] md: bind<loop5>
[22535.863278] md: bind<loop4>
[22535.872061] md/raid:md0: device loop3 operational as raid disk 3
[22535.878783] md/raid:md0: device loop2 operational as raid disk 2
[22535.885495] md/raid:md0: device loop1 operational as raid disk 1
[22535.892206] md/raid:md0: device loop0 operational as raid disk 0
[22535.899761] md/raid:md0: allocated 5432kB
[22535.904381] md/raid:md0: raid level 5 active with 4 out of 5 devices, algorithm 2
[22535.912785] md/raid456: discard support disabled due to uncertainty.
[22535.919885] Set raid456.devices_handle_discard_safely=Y to override.
[22535.927016] md0: detected capacity change from 0 to 8384413696
[22535.933796] md: recovery of RAID array md0
[22535.938386] md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
[22535.944906] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for recovery.
[22535.955670] md: using 128k window, over a total of 2046976k.
[22565.627129] md: md0: recovery done.
[22569.183047] EXT4-fs (md0): mounted filesystem with ordered data mode. Opts: (null)
[22570.376773] md: bind<loop7>
[22570.508870] md: reshape of RAID array md0
[22570.513358] md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
[22570.519874] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for reshape.
[22570.530545] md: using 128k window, over a total of 2046976k.
[22691.448933] md: md0: reshape done.
[22709.108706] md0: detected capacity change from 8384413696 to 10480517120
[22709.144385] VFS: busy inodes on changed media or resized disk md0
[22709.312043] ------------[ cut here ]------------
[22709.317198] kernel BUG at block/bio.c:1785!
[22709.321866] invalid opcode: 0000 [#1] SMP
[22709.326337] Modules linked in: ext4 jbd2 mbcache loop rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm mlx4_ib ib_core intel_rapl sb_edac edac_core x86_pkg_temp_thermal intel_powerclamp coretemp raid456 kvm_intel async_raid6_recov kvm async_memcpy async_pq async_xor xor async_tx irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel raid6_pq aesni_intel lrw iTCO_wdt gf128mul iTCO_vendor_support glue_helper ablk_helper ipmi_devintf ipmi_ssif cryptd dcdbas mei_me sg pcspkr mei lpc_ich ipmi_si ipmi_msghandler shpchp wmi acpi_power_meter nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c mlx4_en sd_mod mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm ahci drm mlx4_core libahci tg3 crc32c_intel libata ptp i2c_core megaraid_sas devlink fjes pps_core dm_mirror dm_region_hash dm_log dm_mod
[22709.423707] CPU: 4 PID: 11012 Comm: md0_raid5 Not tainted 4.8.0-rc6 #2
[22709.430990] Hardware name: Dell Inc. PowerEdge R730/0599V5, BIOS 1.0.4 08/28/2014
[22709.439342] task: ffff8810f8850000 task.stack: ffff88102379c000
[22709.445947] RIP: 0010:[<ffffffff81328a8a>]  [<ffffffff81328a8a>] bio_split+0x8a/0x90
[22709.454607] RSP: 0018:ffff88102379f930  EFLAGS: 00010246
[22709.460527] RAX: 0000000000000080 RBX: 0000000000001000 RCX: ffff8810386bfd00
[22709.468489] RDX: 0000000002400000 RSI: 0000000000000000 RDI: ffff88203a604178
[22709.476452] RBP: ffff88102379f948 R08: 0000000000000000 R09: ffff88203a604178
[22709.484413] R10: 00058000ffffffff R11: 0000000000000000 R12: 0000000000000000
[22709.492376] R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000080
[22709.500339] FS:  0000000000000000(0000) GS:ffff88103ec80000(0000) knlGS:0000000000000000
[22709.509574] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[22709.515987] CR2: 00007f1460629000 CR3: 0000000001c06000 CR4: 00000000001406e0
[22709.523951] Stack:
[22709.526193]  0000000000001000 0000000000000000 0000000000000000 ffff88102379f9f0
[22709.534489]  ffffffff81335ca0 ffff88103ecd9000 0000000000000001 ffff8810386bfd00
[22709.542776]  0000000000000000 ffff8810372b2c60 ffff88102379fa28 00000080810c6dac
[22709.551074] Call Trace:
[22709.553808]  [<ffffffff81335ca0>] blk_queue_split+0x480/0x640
[22709.560223]  [<ffffffff8133b9d5>] blk_sq_make_request+0x95/0x490
[22709.566922]  [<ffffffff8132cec4>] ? generic_make_request_checks+0x234/0x4f0
[22709.574698]  [<ffffffffa04e51c3>] ? async_xor+0x1c3/0x5b0 [async_xor]
[22709.581888]  [<ffffffff8132f903>] generic_make_request+0x103/0x1d0
[22709.588788]  [<ffffffffa0998286>] ops_run_io+0x376/0x960 [raid456]
[22709.595678]  [<ffffffffa09a0e3b>] handle_stripe+0xbdb/0x23f0 [raid456]
[22709.602967]  [<ffffffffa09a2a3c>] handle_active_stripes.isra.52+0x3ec/0x4c0 [raid456]
[22709.611708]  [<ffffffffa0995f69>] ? do_release_stripe+0x99/0x180 [raid456]
[22709.619382]  [<ffffffffa0996065>] ? __release_stripe+0x15/0x20 [raid456]
[22709.626862]  [<ffffffffa09a2fb8>] raid5d+0x4a8/0x750 [raid456]
[22709.633381]  [<ffffffff815756c6>] md_thread+0x136/0x150
[22709.639218]  [<ffffffff810d2330>] ? prepare_to_wait_event+0xf0/0xf0
[22709.646214]  [<ffffffff81575590>] ? find_pers+0x70/0x70
[22709.652045]  [<ffffffff810acca8>] kthread+0xd8/0xf0
[22709.657490]  [<ffffffff810b515f>] ? finish_task_switch+0x7f/0x240
[22709.664292]  [<ffffffff816ff13f>] ret_from_fork+0x1f/0x40
[22709.670309]  [<ffffffff810acbd0>] ? kthread_park+0x60/0x60
[22709.676430] Code: df e8 eb 29 03 00 8b 73 28 4c 89 e7 e8 80 de ff ff 48 89 d8 5b 41 5c 41 5d 5d c3 e8 61 fc ff ff 48 89 c3 eb b9 31 c0 eb eb 0f 0b <0f> 0b 0f 1f 40 00 0f 1f 44 00 00 48 8b 07 55 48 89 e5 48 85 c0 
[22709.698111] RIP  [<ffffffff81328a8a>] bio_split+0x8a/0x90
[22709.704146]  RSP <ffff88102379f930>
[22709.714624] ---[ end trace 47f4294978ff2bd0 ]---
[22709.788366] Kernel panic - not syncing: Fatal exception
[22709.794278] Kernel Offset: disabled
[22709.867270] ---[ end Kernel panic - not syncing: Fatal exception
[22709.873997] ------------[ cut here ]------------
[22709.879159] WARNING: CPU: 4 PID: 11012 at arch/x86/kernel/smp.c:125 native_smp_send_reschedule+0x3f/0x50
[22709.889740] Modules linked in: ext4 jbd2 mbcache loop rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm mlx4_ib ib_core intel_rapl sb_edac edac_core x86_pkg_temp_thermal intel_powerclamp coretemp raid456 kvm_intel async_raid6_recov kvm async_memcpy async_pq async_xor xor async_tx irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel raid6_pq aesni_intel lrw iTCO_wdt gf128mul iTCO_vendor_support glue_helper ablk_helper ipmi_devintf ipmi_ssif cryptd dcdbas mei_me sg pcspkr mei lpc_ich ipmi_si ipmi_msghandler shpchp wmi acpi_power_meter nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c mlx4_en sd_mod mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm ahci drm mlx4_core libahci tg3 crc32c_intel libata ptp i2c_core megaraid_sas devlink fjes pps_core dm_mirror dm_region_hash dm_log dm_mod
[22709.987252] CPU: 4 PID: 11012 Comm: md0_raid5 Tainted: G      D         4.8.0-rc6 #2
[22709.995894] Hardware name: Dell Inc. PowerEdge R730/0599V5, BIOS 1.0.4 08/28/2014
[22710.004469]  0000000000000086 0000000003b679dd ffff88103ec83bb0 ffffffff8135ce3c
[22710.012757]  0000000000000000 0000000000000000 ffff88103ec83bf0 ffffffff8108d7a1
[22710.021051]  0000007d3ec190c0 0000000000000000 ffff88203903da00 ffff88103ec190c0
[22710.029344] Call Trace:
[22710.032072]  <IRQ>  [<ffffffff8135ce3c>] dump_stack+0x63/0x87
[22710.038505]  [<ffffffff8108d7a1>] __warn+0xd1/0xf0
[22710.043851]  [<ffffffff8108d8dd>] warn_slowpath_null+0x1d/0x20
[22710.050357]  [<ffffffff81050c2f>] native_smp_send_reschedule+0x3f/0x50
[22710.057647]  [<ffffffff810b6928>] resched_curr+0xa8/0xd0
[22710.063573]  [<ffffffff810b7685>] check_preempt_curr+0x75/0x90
[22710.070080]  [<ffffffff810b76b9>] ttwu_do_wakeup+0x19/0xe0
[22710.076201]  [<ffffffff810b77ef>] ttwu_do_activate+0x6f/0x80
[22710.082515]  [<ffffffff810b841e>] try_to_wake_up+0x1ae/0x3c0
[22710.088830]  [<ffffffff810b86e2>] default_wake_function+0x12/0x20
[22710.095630]  [<ffffffff810d1be5>] __wake_up_common+0x55/0x90
[22710.101944]  [<ffffffff810d1c33>] __wake_up_locked+0x13/0x20
[22710.108263]  [<ffffffff81275419>] ep_poll_callback+0xb9/0x200
[22710.114672]  [<ffffffff810d1be5>] __wake_up_common+0x55/0x90
[22710.120986]  [<ffffffff810d1d39>] __wake_up+0x39/0x50
[22710.126626]  [<ffffffff810e9470>] wake_up_klogd_work_func+0x40/0x60
[22710.133624]  [<ffffffff8118101d>] irq_work_run_list+0x4d/0x70
[22710.140040]  [<ffffffff8110de30>] ? tick_sched_do_timer+0x50/0x50
[22710.146837]  [<ffffffff811811d0>] irq_work_tick+0x40/0x50
[22710.152867]  [<ffffffff810fdca2>] update_process_times+0x42/0x60
[22710.159567]  [<ffffffff8110d775>] tick_sched_handle.isra.16+0x25/0x60
[22710.166756]  [<ffffffff8110de6d>] tick_sched_timer+0x3d/0x70
[22710.173072]  [<ffffffff810fe9c3>] __hrtimer_run_queues+0xf3/0x280
[22710.179869]  [<ffffffff810feea8>] hrtimer_interrupt+0xa8/0x1a0
[22710.186380]  [<ffffffff810535d5>] local_apic_timer_interrupt+0x35/0x60
[22710.193669]  [<ffffffff81701aad>] smp_apic_timer_interrupt+0x3d/0x50
[22710.200761]  [<ffffffff81700c6c>] apic_timer_interrupt+0x8c/0xa0
[22710.207461]  <EOI>  [<ffffffff811987da>] ? panic+0x1f1/0x232
[22710.213786]  [<ffffffff81030ba8>] oops_end+0xb8/0xd0
[22710.219331]  [<ffffffff8103110b>] die+0x4b/0x70
[22710.224384]  [<ffffffff8102df20>] do_trap+0x140/0x150
[22710.230017]  [<ffffffff8102e2a9>] do_error_trap+0x89/0x110
[22710.236142]  [<ffffffff81328a8a>] ? bio_split+0x8a/0x90
[22710.241970]  [<ffffffff810b7692>] ? check_preempt_curr+0x82/0x90
[22710.248671]  [<ffffffff810b76b9>] ? ttwu_do_wakeup+0x19/0xe0
[22710.254989]  [<ffffffff810c0cc3>] ? update_cfs_rq_load_avg+0x233/0x440
[22710.262272]  [<ffffffff8102e7e0>] do_invalid_op+0x20/0x30
[22710.268297]  [<ffffffff816ffd3e>] invalid_op+0x1e/0x30
[22710.274029]  [<ffffffff81328a8a>] ? bio_split+0x8a/0x90
[22710.279861]  [<ffffffff81335ca0>] blk_queue_split+0x480/0x640
[22710.286273]  [<ffffffff8133b9d5>] blk_sq_make_request+0x95/0x490
[22710.292976]  [<ffffffff8132cec4>] ? generic_make_request_checks+0x234/0x4f0
[22710.300750]  [<ffffffffa04e51c3>] ? async_xor+0x1c3/0x5b0 [async_xor]
[22710.307939]  [<ffffffff8132f903>] generic_make_request+0x103/0x1d0
[22710.314839]  [<ffffffffa0998286>] ops_run_io+0x376/0x960 [raid456]
[22710.321737]  [<ffffffffa09a0e3b>] handle_stripe+0xbdb/0x23f0 [raid456]
[22710.329021]  [<ffffffffa09a2a3c>] handle_active_stripes.isra.52+0x3ec/0x4c0 [raid456]
[22710.337759]  [<ffffffffa0995f69>] ? do_release_stripe+0x99/0x180 [raid456]
[22710.345429]  [<ffffffffa0996065>] ? __release_stripe+0x15/0x20 [raid456]
[22710.352907]  [<ffffffffa09a2fb8>] raid5d+0x4a8/0x750 [raid456]
[22710.359418]  [<ffffffff815756c6>] md_thread+0x136/0x150
[22710.365248]  [<ffffffff810d2330>] ? prepare_to_wait_event+0xf0/0xf0
[22710.372240]  [<ffffffff81575590>] ? find_pers+0x70/0x70
[22710.378071]  [<ffffffff810acca8>] kthread+0xd8/0xf0
[22710.383513]  [<ffffffff810b515f>] ? finish_task_switch+0x7f/0x240
[22710.390314]  [<ffffffff816ff13f>] ret_from_fork+0x1f/0x40
[22710.396337]  [<ffffffff810acbd0>] ? kthread_park+0x60/0x60
[22710.402457] ---[ end trace 47f4294978ff2bd1 ]---

Best Regards,
  Yi Zhang




* Re: WARNING: mismatch_cnt is not 0 on <array device>
From: Brad Campbell @ 2016-09-26  3:42 UTC (permalink / raw)
  To: Benjammin2068, Linux-RAID
In-Reply-To: <26b91420-97c9-f405-aa71-16cd5cda3a67@gmail.com>

On 26/09/16 10:43, Benjammin2068 wrote:
> Hey all,
>
>
>  So the RAID5 which I upgraded to RAID6 was humming along all week just fine (I did the change last weekend) and this weekend I got this:
>
> WARNING: mismatch_cnt is not 0 on /dev/md127
>
> The array seems happy and clean:
>

You best find out what it is to start with. Example from mine :

brad@srv:~$ cat /sys/block/md2/md/mismatch_cnt
8171392

This is a bad example because it's a RAID10 of 6 SSDs. 3 support zero 
after TRIM and the other 3 don't, so after a TRIM of the filesystem the 
mismatch count is through the roof.

Unless you are swapping to an array or you have some known issue like 
the one I mention above, a mismatch count of non-zero is not good.

I lost most of a RAID6 due to a faulty SIL SATA controller and it was 
the high mismatch counts that alerted me. Unfortunately I was about 6 
months in and had only checked for the first time. It was silently 
corrupting reads under load, and so the read-modify-write cycles were 
quietly corrupting the array.

Check the mismatch count, run a "check" on the array and check it again. 
If they vary wildly something odd is going on. If they are the same then 
you might want to figure out what might have caused it.
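
The before/after comparison can be sketched roughly as follows. The
function names, the three labels and the tolerance parameter are
illustrative assumptions; how big a change counts as "wildly" is a
judgment call that this sketch does not settle.

```python
from pathlib import Path


def read_mismatch_cnt(array, sysfs_root="/sys/block"):
    """Read <sysfs_root>/<array>/md/mismatch_cnt as an integer."""
    return int((Path(sysfs_root) / array / "md" / "mismatch_cnt").read_text().strip())


def compare_counts(before, after, tolerance=0):
    """Classify two mismatch_cnt readings taken either side of a "check".

    "clean"   - both readings are zero
    "stable"  - non-zero, but the two readings agree within `tolerance`
    "varying" - the readings differ, which suggests something is still
                changing the data between scrubs
    """
    if before == 0 and after == 0:
        return "clean"
    if abs(after - before) <= tolerance:
        return "stable"
    return "varying"


# Example (needs a real md array; "md2" is the array from the message above):
#   before = read_mismatch_cnt("md2")
#   ... run a "check" and wait for it to finish ...
#   print(compare_counts(before, read_mismatch_cnt("md2")))
```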

Regards,
Brad.


* Re: Linux raid wiki - setting up a system - advice wanted :-)
From: Adam Goryachev @ 2016-09-26  3:40 UTC (permalink / raw)
  To: Andreas Klauer, Wols Lists; +Cc: linux-raid
In-Reply-To: <20160926021622.GA25056@metamorpher.de>

On 26/09/16 12:16, Andreas Klauer wrote:
> On Sun, Sep 25, 2016 at 10:16:24PM +0100, Wols Lists wrote:
>> I need to know what will happen if you give entire drives to mdadm.
> Installers will pick unpartitioned disks first. Forget just trashing
> the metadata, easy to accidentally write across the entire disk.
>
> This is what it looks like when installing Windows: http://imgur.com/a/GtcR2
>
> Same can happen with Linux installers. Unpartitioned disks are just unusual.
>
> Not sure why this is a thing anyway. There's no downside to partitions.
> Adds a safety margin, is yet another place that has metadata (with GPT
> you can use mdnumber-role as partition name / partlabel), doesn't harm
> performance in any way...
>
> People panic too much about partition alignment? But alignment is something
> you need to provide through all layers, all the way down to the filesystem,
> not just partitions. Besides, MiB alignment has been standard for years now,
> so this shouldn't be a problem.
>
> The only other obscure issue with partition tables I can think of is
> enclosures for USB-HDD that emulate the wrong sector size (4K vs 512)
> and unfortunately GPT still depends on the sector size; and Linux is
> not flexible/smart enough to support alien sector size GPT partitions.
>
> So if you switch HDD enclosures you might be forced to recreate
> the partition table before you can access your data.
>
Personally, I agree: avoiding a partition table has almost zero benefit.
Having a partition table can help massively (i.e. it clearly identifies the
drive as in use, shows the content of the drive/partition (RAID), etc.).
I would hope that using a USB-interfaced drive in a raid array is not
common, and changing the enclosure should be even less common, though
perhaps likely when dealing with failures. Can you comment on the
behaviour of removing the drive from the enclosure and connecting it
directly? What is the worst case scenario here?
When you say forced to recreate the partition table, I assume in the
majority of cases it is just delete and re-create using 100% of
available space, or is there some other difference (e.g. the gap at the
beginning of the drive that might require some searching for the right
value)?

Regards,
Adam



-- 
Adam Goryachev Website Managers www.websitemanagers.com.au


* WARNING: mismatch_cnt is not 0 on <array device>
From: Benjammin2068 @ 2016-09-26  2:43 UTC (permalink / raw)
  To: Linux-RAID

Hey all,


 So the RAID5 which I upgraded to RAID6 was humming along all week just fine (I did the change last weekend) and this weekend I got this:

WARNING: mismatch_cnt is not 0 on /dev/md127

The array seems happy and clean:


> /dev/md127:
>         Version : 1.2
>   Creation Time : Tue Aug 23 03:06:46 2011
>      Raid Level : raid6
>      Array Size : 2930276352 (2794.53 GiB 3000.60 GB)
>   Used Dev Size : 976758784 (931.51 GiB 1000.20 GB)
>    Raid Devices : 5
>   Total Devices : 6
>     Persistence : Superblock is persistent
>
>   Intent Bitmap : Internal
>
>     Update Time : Sun Sep 25 21:42:55 2016
>           State : clean
>  Active Devices : 5
> Working Devices : 6
>  Failed Devices : 0
>   Spare Devices : 1
>
>          Layout : left-symmetric
>      Chunk Size : 512K
>
>            Name : :BigRAID
>            UUID : 97b17840:3eaff079:d8e384d0:bfdbda42
>          Events : 905701
>
>     Number   Major   Minor   RaidDevice State
>        6       8       81        0      active sync   /dev/sdf1
>        1       8       33        1      active sync   /dev/sdc1
>        5       8       49        2      active sync   /dev/sdd1
>        4       8       65        3      active sync   /dev/sde1
>        8       8       97        4      active sync   /dev/sdg1
>
>        7       8      113        -      spare   /dev/sdh1

I don't think I've ever got a message before about mismatch_cnt
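
As a quick sanity check on the figures in that output, here is a small
sketch of the RAID6 capacity arithmetic. It assumes mdadm reports sizes
in KiB; the function names are made up for this example.

```python
KIB = 1024  # mdadm's Array Size / Used Dev Size are taken to be in KiB


def raid6_array_size_kib(raid_devices, used_dev_size_kib):
    """RAID6 spends two members' worth of space on parity: (n - 2) * size."""
    if raid_devices < 4:
        raise ValueError("RAID6 needs at least 4 devices")
    return (raid_devices - 2) * used_dev_size_kib


def kib_to_gib(kib):
    return kib * KIB / 1024**3


def kib_to_gb(kib):
    return kib * KIB / 1000**3


size = raid6_array_size_kib(5, 976758784)
print(size)                         # 2930276352, matching "Array Size"
print(round(kib_to_gib(size), 2))   # 2794.53 GiB
print(round(kib_to_gb(size), 2))    # 3000.6 GB
```

The spare (/dev/sdh1) does not count towards "Raid Devices", so it
contributes no capacity.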

 -Ben



* Re: Posting on RISKS - hacked NAS's
From: Benjammin2068 @ 2016-09-26  2:35 UTC (permalink / raw)
  To: linux-raid
In-Reply-To: <de6e2183-1e89-2495-1546-e71bf4bc27e9@websitemanagers.com.au>



On 09/25/2016 06:40 PM, Adam Goryachev wrote:

I read the article too.

What Adam said.

:D


* Re: Linux raid wiki - setting up a system - advice wanted :-)
From: Andreas Klauer @ 2016-09-26  2:16 UTC (permalink / raw)
  To: Wols Lists; +Cc: linux-raid
In-Reply-To: <57E83EA8.9060809@youngman.org.uk>

On Sun, Sep 25, 2016 at 10:16:24PM +0100, Wols Lists wrote:
> I need to know what will happen if you give entire drives to mdadm.

Installers will pick unpartitioned disks first. Forget just trashing 
the metadata, easy to accidentally write across the entire disk.

This is what it looks like when installing Windows: http://imgur.com/a/GtcR2 

Same can happen with Linux installers. Unpartitioned disks are just unusual. 

Not sure why this is a thing anyway. There's no downside to partitions. 
Adds a safety margin, is yet another place that has metadata (with GPT 
you can use mdnumber-role as partition name / partlabel), doesn't harm 
performance in any way...

People panic too much about partition alignment? But alignment is something 
you need to provide through all layers, all the way down to the filesystem, 
not just partitions. Besides, MiB alignment has been standard for years now, 
so this shouldn't be a problem.

The only other obscure issue with partition tables I can think of is 
enclosures for USB-HDD that emulate the wrong sector size (4K vs 512) 
and unfortunately GPT still depends on the sector size; and Linux is 
not flexible/smart enough to support alien sector size GPT partitions.

So if you switch HDD enclosures you might be forced to recreate 
the partition table before you can access your data.
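
To illustrate the sector size dependence: the primary GPT header sits in
LBA 1, so its byte offset is whatever logical sector size the table was
written with. The sketch below (function name invented, only 512 and 4096
considered) looks for the "EFI PART" signature at both offsets in a disk
or image opened for binary reading.

```python
import io

GPT_SIGNATURE = b"EFI PART"
CANDIDATE_SECTOR_SIZES = (512, 4096)


def guess_gpt_sector_size(disk):
    """Return the sector size at which a GPT header is found, or None.

    `disk` is any seekable binary file object (a block device opened
    with open(path, "rb"), an image file, or an in-memory buffer).
    """
    for size in CANDIDATE_SECTOR_SIZES:
        disk.seek(size)  # LBA 1 starts one sector into the device
        if disk.read(len(GPT_SIGNATURE)) == GPT_SIGNATURE:
            return size
    return None


# Fake image: one empty 4096-byte sector, then a header-like sector.
image = io.BytesIO(bytes(4096) + GPT_SIGNATURE + bytes(504))
print(guess_gpt_sector_size(image))  # 4096
```

If the result differs from the sector size the current enclosure
reports, that mismatch would explain why the partitions are not found.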

Regards
Andreas Klauer


* Re: Linux raid wiki - setting up a system - advice wanted :-)
From: Francisco Parada @ 2016-09-26  0:59 UTC (permalink / raw)
  To: Wols Lists; +Cc: linux-raid
In-Reply-To: <57E83EA8.9060809@youngman.org.uk>

Hi Wols,

Based on my own experience, you can do it without trashing the array.
However, I should note that I have never done it this way with an array
that I was booting from.  That holds as long as you've set up GPT to
account for the 2MB boundary.  If you use "parted" or its graphical
equivalent, "gparted", you can account for that on newer drives above
2 terabytes anyway, with 1MB instead of 2MB.  So if you add
a little extra padding, say 3MB (2MB for your grub, 1MB for a blank
section that all drives above 2TB require), you should be in good
shape.

Perhaps someone else can chime in also, to confirm.

On Sun, Sep 25, 2016 at 5:16 PM, Wols Lists <antlists@youngman.org.uk> wrote:
> This is a great way for learning lots about raid :-)
>
> I'm planning a section on setting up a new system, and I need to know
> what will happen if you give entire drives to mdadm.
>
> Does it leave the first 2 megs empty? Basically, what I'm asking is if I do
>
> mdadm --create /dev/md/bigarray -add /dev/sda /dev/sdb /dev/sdc
>
> (note I am passing the entire drive, not the first partition) and then I
> install grub on those drives, will I trash the array?
>
> Cheers,
> Wol

* Re: Posting on RISKS - hacked NAS's
From: Adam Goryachev @ 2016-09-25 23:40 UTC (permalink / raw)
  To: Wols Lists, linux-raid
In-Reply-To: <57E842C7.9000302@youngman.org.uk>

I strongly suspect that this article is talking about a NAS (Network
Attached Storage), i.e. a mini-computer with hard drives attached and
open to the network. It is not about firmware on drives that you would
connect to your own Linux computer.

Questions about the accuracy of the article:
1) Seagate has only sold 7000 of this product? Seems like a very small 
run for a major manufacturer...
2) 70% have been hacked? Did the hacker themselves reveal this, or did 
Seagate, or how does this source know?

I would strongly suspect a much higher number of devices sold, and would
strongly suspect that almost all of these devices sit behind a simple
NAT router. Unless Seagate have done something really stupid (like using
UPnP to ask the router to port forward from outside directly to it *by
default*), this should provide a reasonably decent level of protection.

PS: that is not to dismiss the article's advice. You should change
passwords, you should have backups, you should NOT allow direct
connections to your backend storage, etc.

Never mind, reading deeper:
http://www.infoworld.com/article/3118792/malware/thousands-of-seagate-nas-boxes-host-cryptocurrency-mining-malware.html
We see that they looked for all open FTP servers with publicly writeable
directories (7,263), and of those a large majority were Seagate NAS
(5,137). So Seagate have almost certainly sold more than 7000 of their
NAS; 7000 has absolutely no correlation to the number of Seagate NAS
sold or connected.
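
For what it's worth, those two numbers reproduce the headline figure. A
throwaway calculation (variable names are mine):

```python
open_ftp_servers = 7263   # open FTP servers with writeable public directories
seagate_nas = 5137        # of those, the ones identified as Seagate NAS

share = seagate_nas / open_ftp_servers
print(f"{share:.1%}")  # 70.7%
```

So the "70 percent" looks like the Seagate share of the open FTP servers
that were found, not 70 percent of all Seagate NAS units.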

Of further note:
"Seagate Central's configuration makes it easier for users to expose 
insecure FTP servers to the Internet"
"By default, the Seagate Central NAS system provides a public folder for 
sharing data, ... This public folder cannot be disabled and if the 
device administrator enables remote access to the device, it will become 
accessible to anyone on the Internet"

Finally, the "infection" is just placing the files there and then
waiting for the user to execute them on their Windows PC. It is not a
remote code execution exploit by itself.

Regards,
Adam

On 26/09/16 07:33, Wols Lists wrote:
> Just for info. I know it's not really quite this list, but I can't quite
> make out what is affected.
>
> I get the impression this is referring to NAS systems, so it's outside
> our remit. But to me, "Seagate NAS" is actually a raid-suitable disk
> drive, so it makes me wonder whether it's hacked drive firmware...
> unlikely but eminently possible ...
>
> Cheers,
> Wol
>
> ------------------------------
>
> Date: Fri, 23 Sep 2016 11:34:21 -0700
> From: Gene Wirchenko <genew@telus.net>
> Subject: "Seagate NAS hack should scare us all" (Roger A. Grimes)
>
> Roger A. Grimes, InfoWorld, 20 Sep 2016
> An under-the-radar news story proves that computers are far from the only
> devices prey to attack
> http://www.infoworld.com/article/3121338/security/seagate-nas-hack-should-scare-us-all.html
>
> opening text:
>
> No fewer than 70 percent of Internet-connected Seagate NAS hard drives have
> been compromised by a single malware program. That's a pretty startling
> figure.  Security vendor Sophos says the bitcoin-mining malware Miner-C is
> the culprit.
>
>    [At peak, seek to tweak the weak link.  This reeks of leaks that peek as
>    well.  PGN]

-- 
Adam Goryachev Website Managers www.websitemanagers.com.au

^ permalink raw reply

* Posting on RISKS - hacked NAS's
From: Wols Lists @ 2016-09-25 21:33 UTC (permalink / raw)
  To: linux-raid

Just for info. I know it's not really quite this list, but I can't quite
make out what is affected.

I get the impression this is referring to NAS systems, so it's outside
our remit. But to me, "Seagate NAS" is actually a raid-suitable disk
drive, so it makes me wonder whether it's hacked drive firmware...
unlikely but eminently possible ...

Cheers,
Wol

------------------------------

Date: Fri, 23 Sep 2016 11:34:21 -0700
From: Gene Wirchenko <genew@telus.net>
Subject: "Seagate NAS hack should scare us all" (Roger A. Grimes)

Roger A. Grimes, InfoWorld, 20 Sep 2016
An under-the-radar news story proves that computers are far from the only
devices prey to attack
http://www.infoworld.com/article/3121338/security/seagate-nas-hack-should-scare-us-all.html

opening text:

No fewer than 70 percent of Internet-connected Seagate NAS hard drives have
been compromised by a single malware program. That's a pretty startling
figure.  Security vendor Sophos says the bitcoin-mining malware Miner-C is
the culprit.

  [At peak, seek to tweak the weak link.  This reeks of leaks that peek as
  well.  PGN]

^ permalink raw reply

* Linux raid wiki - setting up a system - advice wanted :-)
From: Wols Lists @ 2016-09-25 21:16 UTC (permalink / raw)
  To: linux-raid

This is a great way for learning lots about raid :-)

I'm planning a section on setting up a new system, and I need to know
what will happen if you give entire drives to mdadm.

Does it leave the first 2 megs empty? Basically, what I'm asking is if I do

mdadm --create /dev/md/bigarray --level=<level> --raid-devices=3 /dev/sda /dev/sdb /dev/sdc

(note I am passing the entire drive, not the first partition) and then I
install grub on those drives, will I trash the array?

Cheers,
Wol

^ permalink raw reply
