From: Damien Le Moal <damien.lemoal@opensource.wdc.com>
To: Sven Oehme <oehmes@gmail.com>,
linux-btrfs@vger.kernel.org, Naohiro Aota <Naohiro.Aota@wdc.com>,
Johannes Thumshirn <Johannes.Thumshirn@wdc.com>
Subject: Re: Host managed SMR drive issue
Date: Mon, 27 Sep 2021 08:19:51 +0900 [thread overview]
Message-ID: <6db88069-e263-ae85-4f69-adb9ec69ee76@opensource.wdc.com> (raw)
In-Reply-To: <CALssuR00NvTHJJuoOFhw=4+fHARtBN2PLqTr4W06PT5VMagh_A@mail.gmail.com>
On 2021/09/25 3:25, Sven Oehme wrote:
> Hi,
>
> I am running into issues while testing a Host Managed SMR drive. When I
> try to copy or move a file to the btrfs filesystem, it just hangs. I
> tried multiple kernels, 5.12 and 5.13 as well as 5.14 all the way to
> 5.14.6, but the issue still persists.
>
> here is the setup :
>
> I am using btrfs-progs-v5.14.1
> device is a Host Managed WDC 20TB SMR drive with firmware level C421
> it's connected via an HBA 9400-8i Tri-Mode Storage Adapter, latest 20.0 FW

Beware of the Broadcom FW rev 20. We found problems with it: very slow zone
command scheduling leading to command timeouts in some cases. FW 19 does not
seem to have this issue. But that is likely not the cause of the problem here.

Is there anything of interest in dmesg ? Any IO errors ?

Naohiro, Johannes,

Any idea ?
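If nothing obvious stands out, a quick look at the kernel log and at the
drive's zone model can help narrow things down. A sketch (the device name
"sdb" below is a placeholder, not taken from the report):

```shell
#!/bin/sh
# Sketch: post-hang checks. "sdb" is a placeholder for the SMR drive.
dev=${1:-sdb}

# A host-managed drive should report "host-managed" here; "none" would
# mean the kernel does not see it as a zoned device at all.
cat /sys/block/"$dev"/queue/zoned 2>/dev/null || echo "no zoned attribute"

# Look for command timeouts or I/O errors involving the drive.
dmesg 2>/dev/null | grep -iE "timeout|i/o error|reset" | grep -i "$dev"

# grep exits non-zero when nothing matches; that is the good case here.
exit 0
```

Running it right after the hang starts, and again once the timeout fires
(if one does), shows whether the drive itself is erroring out or whether
the stall is purely on the filesystem/writeback side.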
> I am using the /dev/sd device directly, no lvm or device mapper or
> anything else in between
>
> after a few seconds, sometimes minutes, the data rate to the drive drops
> to 0 and 1 or 2 cores on my system show 100% IO wait time, but it no
> longer makes any progress
>
> the process in question has the following stack :
>
> [ 2168.589160] task:mv state:D stack: 0 pid: 3814 ppid: 3679 flags:0x00004000
> [ 2168.589162] Call Trace:
> [ 2168.589163] __schedule+0x2fa/0x910
> [ 2168.589166] schedule+0x4f/0xc0
> [ 2168.589168] schedule_timeout+0x8a/0x140
> [ 2168.589171] ? __bpf_trace_tick_stop+0x10/0x10
> [ 2168.589173] io_schedule_timeout+0x51/0x80
> [ 2168.589176] balance_dirty_pages+0x2fa/0xe30
> [ 2168.589179] ? __mod_lruvec_state+0x3a/0x50
> [ 2168.589182] balance_dirty_pages_ratelimited+0x2f9/0x3c0
> [ 2168.589185] btrfs_buffered_write+0x58e/0x7e0 [btrfs]
> [ 2168.589226] btrfs_file_write_iter+0x138/0x3e0 [btrfs]
> [ 2168.589260] ? ext4_file_read_iter+0x5b/0x180
> [ 2168.589262] new_sync_write+0x114/0x1a0
> [ 2168.589265] vfs_write+0x1c5/0x260
> [ 2168.589267] ksys_write+0x67/0xe0
> [ 2168.589270] __x64_sys_write+0x1a/0x20
> [ 2168.589272] do_syscall_64+0x40/0xb0
> [ 2168.589275] entry_SYSCALL_64_after_hwframe+0x44/0xae
> [ 2168.589277] RIP: 0033:0x7ffff7e91c27
> [ 2168.589278] RSP: 002b:00007fffffffdc48 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
> [ 2168.589280] RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 00007ffff7e91c27
> [ 2168.589281] RDX: 0000000000020000 RSI: 00007ffff79bd000 RDI: 0000000000000004
> [ 2168.589282] RBP: 00007ffff79bd000 R08: 0000000000000000 R09: 0000000000000000
> [ 2168.589283] R10: 0000000000000022 R11: 0000000000000246 R12: 0000000000000004
> [ 2168.589284] R13: 0000000000000004 R14: 00007ffff79bd000 R15: 0000000000020000
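The trace above shows the writer parked in balance_dirty_pages(), i.e.
writeback throttling: mv is being told to wait because dirty pages are not
draining to the drive. While reproducing, the global dirty/writeback
counters can confirm that writeback has stalled. A sketch using the
standard procfs files:

```shell
#!/bin/sh
# Sketch: sample the dirty/writeback page counters a few times. If
# nr_dirty stays high and nr_writeback does not drain, writeback to the
# drive has stalled and balance_dirty_pages() will keep blocking writers.
for i in 1 2 3; do
    grep -E "^(nr_dirty|nr_writeback) " /proc/vmstat
    sleep 1
done

# The thresholds that trigger throttling:
echo "dirty_ratio:            $(cat /proc/sys/vm/dirty_ratio)"
echo "dirty_background_ratio: $(cat /proc/sys/vm/dirty_background_ratio)"
```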
>
> and shows up under runnable tasks :
>
> [ 2168.593562] runnable tasks:
> [ 2168.593562]  S            task   PID         tree-key  switches  prio     wait-time        sum-exec     sum-sleep
> [ 2168.593563] -------------------------------------------------------------------------------------------------------------
> [ 2168.593565]  S        cpuhp/13    92     88923.802487        19   120      0.000000        0.292061      0.000000 2 0 /
> [ 2168.593571]  S  idle_inject/13    93       -11.997255         3    49      0.000000        0.005480      0.000000 2 0 /
> [ 2168.593577]  S    migration/13    94       814.287531       551     0      0.000000     1015.550514      0.000000 2 0 /
> [ 2168.593582]  S    ksoftirqd/13    95     88762.317130        44   120      0.000000        1.940252      0.000000 2 0 /
> [ 2168.593588]  I    kworker/13:0    96        -9.031157         5   120      0.000000        0.017423      0.000000 2 0 /
> [ 2168.593593]  I   kworker/13:0H    97      3570.961886         5   100      0.000000        0.034345      0.000000 2 0 /
> [ 2168.593603]  I    kworker/13:1   400    101650.731913       578   120      0.000000       10.110898      0.000000 2 0 /
> [ 2168.593611]  I   kworker/13:1H  1015    101649.600698        65   100      0.000000        1.443300      0.000000 2 0 /
> [ 2168.593618]  S           loop3  1994     99133.655903        70   100      0.000000        1.137468      0.000000 2 0 /
> [ 2168.593625]  S           snapd  3161        15.296181       166   120      0.000000       90.296991      0.000000 2 0 /system.slice/snapd.service
> [ 2168.593631]  S           snapd  3198        10.047573        49   120      0.000000        5.646247      0.000000 2 0 /system.slice/snapd.service
> [ 2168.593639]  S            java  2446       970.743682       301   120      0.000000      101.648659      0.000000 2 0 /system.slice/stor_tomcat.service
> [ 2168.593645]  S C1 CompilerThre  2573      1033.157689      3636   120      0.000000      615.256247      0.000000 2 0 /system.slice/stor_tomcat.service
> [ 2168.593654]  D              mv  3814      2263.816953    186734   120      0.000000    30087.917319      0.000000 2 0 /user.slice
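mv sitting in D (uninterruptible) state is consistent with writeback being
wedged. The kernel stack of a D-state task can also be re-read directly,
which is cheaper than a full sysrq-t dump when re-checking whether the
task has moved. A sketch (needs root; 3814 is the pid of the hung mv in
the report above):

```shell
#!/bin/sh
# Sketch: read the kernel stack of a D-state task directly (needs root).
# Pass a different pid as $1 to inspect another task.
pid=${1:-3814}
if [ -r "/proc/$pid/stack" ]; then
    cat "/proc/$pid/stack"
else
    echo "pid $pid gone, or /proc/$pid/stack not readable (need root)"
fi
```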
>
> any idea what is going on and how to fix this?
>
> thx.
>
--
Damien Le Moal
Western Digital Research
Thread overview: 23+ messages
2021-09-24 18:25 Host managed SMR drive issue Sven Oehme
2021-09-26 23:19 ` Damien Le Moal [this message]
2021-09-27 17:28 ` Sven Oehme
2021-09-27 22:36 ` Sven Oehme
2021-09-27 22:38 ` Sven Oehme
2021-09-27 22:48 ` Damien Le Moal
2021-09-27 22:56 ` Sven Oehme
2021-09-27 23:01 ` Damien Le Moal
2021-09-27 23:17 ` Sven Oehme
2021-09-27 23:31 ` Damien Le Moal
2021-09-27 23:34 ` Sven Oehme
2021-09-27 23:36 ` Sven Oehme
2021-09-27 23:38 ` Damien Le Moal
2021-09-27 23:51 ` Sven Oehme
2021-09-28 7:10 ` Naohiro Aota
2021-09-28 11:49 ` Sven Oehme
2021-09-30 9:55 ` Johannes Thumshirn
[not found] ` <CALssuR1wcChWwLt1wVoxhf=ufWdKtBHa7FXn-m9mkJcpPGbfOg@mail.gmail.com>
[not found] ` <PH0PR04MB7416408BFBD494211A4D86B69BAA9@PH0PR04MB7416.namprd04.prod.outlook.com>
[not found] ` <CALssuR1JZqaBLf_aCMRLm683cww66wc_p+hgCpCENMiQkVoSRg@mail.gmail.com>
[not found] ` <20211006154828.bqiwik2b6jaqxcck@naota-xeon>
[not found] ` <CALssuR3UeNmx0PnwUT8ZR0bOd9iAGjvgmv9u8yfHDnfpChKb2w@mail.gmail.com>
[not found] ` <20211007032239.iwrtygcavadvvb62@naota-xeon>
[not found] ` <CH0PR14MB50764AC8659E9378C65D6DE8F4B19@CH0PR14MB5076.namprd14.prod.outlook.com>
2021-10-11 6:32 ` Johannes Thumshirn
2021-10-11 10:23 ` David Sterba
2021-10-11 10:28 ` Johannes Thumshirn
2021-10-11 13:06 ` David Sterba
2021-09-28 6:36 ` Johannes Thumshirn
2021-09-29 10:29 ` Johannes Thumshirn