* [btrfs] WARNING: CPU: 0 PID: 6379 at fs/direct-io.c:293 dio_complete+0x1d4/0x220
[not found] ` <20171029225155.qcum5i75awrt5tzm@wfg-t540p.sh.intel.com>
@ 2017-10-30 7:20 ` Fengguang Wu
2017-10-30 7:44 ` Eryu Guan
0 siblings, 1 reply; 13+ messages in thread
From: Fengguang Wu @ 2017-10-30 7:20 UTC (permalink / raw)
To: Lukas Czerner
Cc: Jan Kara, Darrick J. Wong, Jeff Moyer, Jens Axboe, Linus Torvalds,
Linux Kernel Mailing List, linux-fsdevel
CC fsdevel.
On Sun, Oct 29, 2017 at 11:51:55PM +0100, Fengguang Wu wrote:
>Hi Linus,
>
>Up to now we see the below boot error/warnings when testing v4.14-rc6.
>
>They hit the RC release mainly due to various imperfections in 0day's
>auto bisection. So I manually list them here and CC the likely easy to
>debug ones to the corresponding maintainers in the followup emails.
>
>boot_successes: 4700
>boot_failures: 247
[...]
>WARNING:at_fs/direct-io.c:#dio_complete: 7
>WARNING:at_fs/iomap.c:#iomap_dio_complete: 3
>WARNING:at_fs/iomap.c:#iomap_dio_rw: 1
The first warning happens on btrfs and is bisected to this commit.
The other 2 warnings happen on xfs.
commit 332391a9935da939319e473b4680e173df75afcf
Author: Lukas Czerner <lczerner@redhat.com>
AuthorDate: Thu Sep 21 08:16:29 2017 -0600
Commit: Jens Axboe <axboe@kernel.dk>
CommitDate: Mon Sep 25 08:56:05 2017 -0600
fs: Fix page cache inconsistency when mixing buffered and AIO DIO
Currently when mixing buffered reads and asynchronous direct writes it
is possible to end up with the situation where we have stale data in the
page cache while the new data is already written to disk. This is
permanent until the affected pages are flushed away. Despite the fact
that mixing buffered and direct IO is ill-advised it does pose a thread
for a data integrity, is unexpected and should be fixed.
Fix this by deferring completion of asynchronous direct writes to a
process context in the case that there are mapped pages to be found in
the inode. Later before the completion in dio_complete() invalidate
the pages in question. This ensures that after the completion the pages
in the written area are either unmapped, or populated with up-to-date
data. Also do the same for the iomap case which uses
iomap_dio_complete() instead.
This has a side effect of deferring the completion to a process context
for every AIO DIO that happens on inode that has pages mapped. However
since the consensus is that this is ill-advised practice the performance
implication should not be a problem.
This was based on proposal from Jeff Moyer, thanks!
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
fs/direct-io.c | 47 ++++++++++++++++++++++++++++++++++++++++++-----
fs/iomap.c | 29 ++++++++++++++++-------------
mm/filemap.c | 6 ++++++
3 files changed, 64 insertions(+), 18 deletions(-)
The call traces are:
[ 334.461955] BTRFS info (device vdb): has skinny extents
[ 334.463231] BTRFS info (device vdb): flagging fs with big metadata feature
[ 334.469746] BTRFS info (device vdb): creating UUID tree
[ 336.190840] ------------[ cut here ]------------
[ 336.193338] WARNING: CPU: 0 PID: 6379 at fs/direct-io.c:293 dio_complete+0x1d4/0x220
[ 336.196925] Modules linked in: btrfs xor zstd_decompress zstd_compress xxhash raid6_pq dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver sr_mod cdrom sg ata_generic pata_acpi ppdev snd_pcm snd_timer snd soundcore serio_raw parport_pc pcspkr parport floppy ata_piix libata i2c_piix4 ip_tables
[ 336.203746] CPU: 0 PID: 6379 Comm: kworker/0:0 Not tainted 4.14.0-rc6 #1
[ 336.204799] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[ 336.207480] Workqueue: dio/vdb dio_aio_complete_work
[ 336.208373] task: ffff880079094a80 task.stack: ffffc90000950000
[ 336.210495] RIP: 0010:dio_complete+0x1d4/0x220
[ 336.211288] RSP: 0018:ffffc90000953e20 EFLAGS: 00010286
[ 336.213145] RAX: 00000000fffffff0 RBX: ffff8800aae8c780 RCX: ffffc90000953c70
[ 336.214232] RDX: 0000000080000000 RSI: 00000000000002b4 RDI: ffffffff81cb9ccb
[ 336.216605] RBP: ffffc90000953e48 R08: 0000000000000000 R09: ffff880079c8b6c0
[ 336.217726] R10: 0000000000000074 R11: 0000000000000000 R12: 0000000000002000
[ 336.220141] R13: 0000000000002000 R14: 0000000000000003 R15: 0000000000074000
[ 336.221232] FS: 0000000000000000(0000) GS:ffff88013fc00000(0000) knlGS:0000000000000000
[ 336.223915] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 336.224885] CR2: 000055e9eae6b750 CR3: 00000000752c7000 CR4: 00000000000006f0
[ 336.227202] Call Trace:
[ 336.227750] dio_aio_complete_work+0x19/0x20
[ 336.228536] process_one_work+0x198/0x3e0
[ 336.230396] worker_thread+0x4e/0x3e0
[ 336.231064] kthread+0x114/0x150
[ 336.232489] ? process_one_work+0x3e0/0x3e0
[ 336.233250] ? kthread_create_on_node+0x40/0x40
[ 336.234929] ? kthread_create_on_node+0x40/0x40
[ 336.235754] ret_from_fork+0x25/0x30
[ 336.237322] Code: 8b 78 30 48 83 7f 58 00 0f 84 e9 fe ff ff 4b 8d 54 27 ff 4c 89 fe 48 c1 fe 0c 48 c1 fa 0c e8 64 58 f3 ff 85 c0 0f 84 cc fe ff ff <0f> ff e9 c5 fe ff ff 8b 50 20 f6 c2 10 0f 84 e5 fe ff ff 48 8b
[ 336.240031] ---[ end trace 3d29f0720cd16ad3 ]---
[ 378.043462] generic/095 [failed, exit status 1] - output mismatch (see /lkp/benchmarks/xfstests/results//generic/095.out.bad)
[ 378.043465]
[ 121.160532] run fstests generic/117 at 2017-10-25 05:27:29
[ 123.275347] XFS (vdd): Mounting V5 Filesystem
[ 123.280579] XFS (vdd): Ending clean mount
[ 163.043220] XFS (vdd): Unmounting Filesystem
[ 163.258872] generic/117 42s
[ 163.258875]
[ 164.009187] run fstests generic/208 at 2017-10-25 05:28:12
[ 164.344046] ------------[ cut here ]------------
[ 164.345164] WARNING: CPU: 1 PID: 86 at fs/iomap.c:755 iomap_dio_complete+0xe5/0x100
[ 164.347193] Modules linked in: xfs dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver sr_mod cdrom sg ata_generic pata_acpi ppdev snd_pc
m snd_timer snd soundcore parport_pc pcspkr serio_raw floppy parport ata_piix i2c_piix4 libata ip_tables
[ 164.351698] CPU: 1 PID: 86 Comm: kworker/1:2 Not tainted 4.14.0-rc6 #1
[ 164.353111] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[ 164.361199] Workqueue: dio/vda iomap_dio_complete_work
[ 164.362418] task: ffff88007f8b8000 task.stack: ffffc90000a4c000
[ 164.363666] RIP: 0010:iomap_dio_complete+0xe5/0x100
[ 164.364923] RSP: 0018:ffffc90000a4fe08 EFLAGS: 00010286
[ 164.366065] RAX: 00000000fffffff0 RBX: ffff88007a5fba80 RCX: ffffc90000a4fc58
[ 164.367281] RDX: 0000000080000000 RSI: 00000000000002b4 RDI: ffffffff81cb9ccb
[ 164.368330] RBP: ffffc90000a4fe30 R08: 0000000000000000 R09: ffff8800750eeb58
[ 164.369503] R10: 0000000000001ffe R11: 0000000000000000 R12: 0000000002000000
[ 164.370573] R13: ffff8800b8d27780 R14: ffff8800b8869058 R15: 0000000000000000
[ 164.371674] FS: 0000000000000000(0000) GS:ffff88013fc80000(0000) knlGS:0000000000000000
[ 164.373136] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 164.387036] CR2: 00007f5ee028e000 CR3: 000000007a1ff000 CR4: 00000000000006e0
[ 164.388119] Call Trace:
[ 164.388581] iomap_dio_complete_work+0x25/0x80
[ 164.389385] process_one_work+0x198/0x3e0
[ 164.390054] worker_thread+0x4e/0x3e0
[ 164.390699] kthread+0x114/0x150
[ 164.391341] ? process_one_work+0x3e0/0x3e0
[ 164.392038] ? kthread_create_on_node+0x40/0x40
[ 164.392922] ret_from_fork+0x25/0x30
[ 164.393506] Code: e0 eb 9f 48 8b 77 18 e9 78 ff ff ff 4c 89 fa 48 03 53 18 4c 89 fe 48 c1 fe 0c 48 83 ea 01 48 c1 fa 0c e8 0f 93 f0 ff 85 c
0 74 93 <0f> ff eb 8f 4c 63 67 28 e9 54 ff ff ff 0f 1f 40 00 66 2e 0f 1f
[ 164.396264] ---[ end trace 77984c98cc2335de ]---
[ 364.297438] generic/208 200s
[ 364.297442]
[ 365.772872] XFS (vda): Unmounting Filesystem
[ 366.507546] XFS (vda): Mounting V5 Filesystem
[ 366.519283] XFS (vda): Ending clean mount
[ 370.264428] run fstests generic/095 at 2017-10-25 05:38:44
[ 370.392041] XFS (vdd): Unmounting Filesystem
[ 372.641826] XFS (vdd): Mounting V5 Filesystem
[ 372.648446] XFS (vdd): Ending clean mount
[ 373.277786] ------------[ cut here ]------------
[ 373.279820] ------------[ cut here ]------------
[ 373.279826] WARNING: CPU: 2 PID: 29166 at fs/iomap.c:1016 iomap_dio_rw+0x346/0x360
[ 373.279827] Modules linked in: xfs dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver sr_mod cdrom sg ata_generic pata_acpi ppdev snd_pcm snd_timer snd soundcore serio_raw pcspkr i2c_piix4 ata_piix parport_pc parport floppy libata ip_tables
[ 373.279842] CPU: 2 PID: 29166 Comm: fio Not tainted 4.14.0-rc6 #1
[ 373.279843] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[ 373.279844] task: ffff88008bd34a80 task.stack: ffffc900065f8000
[ 373.279846] RIP: 0010:iomap_dio_rw+0x346/0x360
[ 373.279847] RSP: 0000:ffffc900065fbd40 EFLAGS: 00010286
[ 373.279848] RAX: 00000000fffffff0 RBX: ffffc900065fbe98 RCX: ffffc900065fbb90
[ 373.279848] RDX: 0000000080000000 RSI: 00000000000002b4 RDI: ffffffff81cb9ccb
[ 373.279849] RBP: ffffc900065fbdd0 R08: 0000000000000000 R09: 00007f6fe0948000
[ 373.279850] R10: ffff88013ffd2000 R11: 0000000000000400 R12: 000000000005f400
[ 373.279850] R13: ffffc900065fbe70 R14: ffff88007a6fb7e0 R15: 0000000000000011
[ 373.279852] FS: 00007f6fb3ce1700(0000) GS:ffff88013fd00000(0000) knlGS:0000000000000000
[ 373.279852] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 373.279853] CR2: 00007f6fe088ac00 CR3: 0000000078eb7000 CR4: 00000000000006e0
[ 373.279856] Call Trace:
[ 373.279913] ? xfs_file_aio_write_checks+0x137/0x190 [xfs]
[ 373.279933] xfs_file_dio_aio_write+0xec/0x260 [xfs]
[ 373.279952] ? xfs_file_dio_aio_write+0xec/0x260 [xfs]
[ 373.279970] xfs_file_write_iter+0x98/0x150 [xfs]
[ 373.279972] __vfs_write+0xd3/0x150
[ 373.279974] vfs_write+0xc7/0x1b0
[ 373.279975] SyS_pwrite64+0x76/0x90
[ 373.279977] entry_SYSCALL_64_fastpath+0x1a/0xa5
[ 373.279979] RIP: 0033:0x7f6fcc3019d3
[ 373.279979] RSP: 002b:00007f6fb3cdeb20 EFLAGS: 00000293 ORIG_RAX: 0000000000000012
[ 373.279980] RAX: ffffffffffffffda RBX: 00007f6fb3ce1700 RCX: 00007f6fcc3019d3
[ 373.279981] RDX: 0000000000000400 RSI: 000055e925caac00 RDI: 0000000000000004
[ 373.279982] RBP: 00007f6fe095fee0 R08: 0000000000000000 R09: 00007f6fb3ce1700
[ 373.279982] R10: 000000000005f400 R11: 0000000000000293 R12: 0000000000000000
[ 373.279983] R13: 00007f6fe095fe2f R14: 0000000000000003 R15: 00007f6fe095fee0
[ 373.279984] Code: 21 00 00 f6 58 2b 81 49 c7 44 24 08 00 00 00 00 4c 89 f7 e8 ed ec ff ff e9 13 fe ff ff 48 8d 7d a0 e8 2f a9 13 00 e9 33 ff ff ff <0f> ff e9 6a fe ff ff 48 c7 c0 f4 ff ff ff e9 f2 fd ff ff 90 90
[ 373.279998] ---[ end trace 01e002a8aa01ae25 ]---
[ 373.334858] WARNING: CPU: 3 PID: 29158 at fs/iomap.c:755 iomap_dio_complete+0xe5/0x100
[ 373.336817] Modules linked in: xfs dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver sr_mod cdrom sg ata_generic pata_acpi ppdev snd_pcm snd_timer snd soundcore serio_raw pcspkr i2c_piix4 ata_piix parport_pc parport floppy libata ip_tables
[ 373.341916] CPU: 3 PID: 29158 Comm: fio Tainted: G W 4.14.0-rc6 #1
[ 373.344151] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[ 373.346702] task: ffff88008bd32540 task.stack: ffffc900065c8000
[ 373.348512] RIP: 0010:iomap_dio_complete+0xe5/0x100
[ 373.349999] RSP: 0018:ffffc900065cbd08 EFLAGS: 00010286
[ 373.351604] RAX: 00000000fffffff0 RBX: ffff88007a5611e0 RCX: ffffc900065cbb58
[ 373.353690] RDX: 0000000080000000 RSI: 00000000000002b4 RDI: ffffffff81cb9ccb
[ 373.355693] RBP: ffffc900065cbd30 R08: 0000000000000000 R09: 00007f6fe08cb000
[ 373.357775] R10: ffffc900065cb8c8 R11: 0000000000000207 R12: 0000000000000400
[ 373.359837] R13: ffffc900065cbe70 R14: ffff8800787657d8 R15: 0000000000002400
[ 373.361878] FS: 00007f6fe095aa80(0000) GS:ffff88013fd80000(0000) knlGS:0000000000000000
[ 373.364333] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 373.366071] CR2: 00007f6fcc38c1ee CR3: 000000007949a000 CR4: 00000000000006e0
[ 373.368170] Call Trace:
[ 373.369165] ? iomap_dio_rw+0x2a6/0x360
[ 373.370467] iomap_dio_rw+0x333/0x360
[ 373.371768] xfs_file_dio_aio_write+0xec/0x260 [xfs]
[ 373.373069] ? xfs_file_dio_aio_write+0xec/0x260 [xfs]
[ 373.374410] xfs_file_write_iter+0x98/0x150 [xfs]
[ 373.375820] __vfs_write+0xd3/0x150
[ 373.377096] vfs_write+0xc7/0x1b0
[ 373.378436] SyS_pwrite64+0x76/0x90
[ 373.379897] entry_SYSCALL_64_fastpath+0x1a/0xa5
[ 373.381488] RIP: 0033:0x7f6fcc3019d3
[ 373.382781] RSP: 002b:00007f6fe0957ea0 EFLAGS: 00000293 ORIG_RAX: 0000000000000012
[ 373.385064] RAX: ffffffffffffffda RBX: 000055e925cb07f0 RCX: 00007f6fcc3019d3
[ 373.386913] RDX: 0000000000000400 RSI: 000055e925ca9800 RDI: 0000000000000005
[ 373.388911] RBP: 0000000000000005 R08: 0000000000000000 R09: 00007f6fe095aa80
[ 373.390588] R10: 0000000000002400 R11: 0000000000000293 R12: 00007f6fe095aa80
[ 373.392296] R13: 00007f6fe0957ec0 R14: 0000000000000003 R15: 000055e925caf360
[ 373.394229] Code: e0 eb 9f 48 8b 77 18 e9 78 ff ff ff 4c 89 fa 48 03 53 18 4c 89 fe 48 c1 fe 0c 48 83 ea 01 48 c1 fa 0c e8 0f 93 f0 ff 85 c0 74 93 <0f> ff eb 8f 4c 63 67 28 e9 54 ff ff ff 0f 1f 40 00 66 2e 0f 1f
[ 373.398812] ---[ end trace 01e002a8aa01ae26 ]---
[ 392.887639] XFS (vdd): Unmounting Filesystem
[ 392.985715] generic/095 [failed, exit status 1] - output mismatch (see /lkp/benchmarks/xfstests/results//generic/095.out.bad)
Thanks,
Fengguang
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [btrfs] WARNING: CPU: 0 PID: 6379 at fs/direct-io.c:293 dio_complete+0x1d4/0x220
2017-10-30 7:20 ` [btrfs] WARNING: CPU: 0 PID: 6379 at fs/direct-io.c:293 dio_complete+0x1d4/0x220 Fengguang Wu
@ 2017-10-30 7:44 ` Eryu Guan
2017-10-31 0:10 ` Fengguang Wu
2017-10-31 15:13 ` Filipe Manana
0 siblings, 2 replies; 13+ messages in thread
From: Eryu Guan @ 2017-10-30 7:44 UTC (permalink / raw)
To: Fengguang Wu
Cc: Lukas Czerner, Jan Kara, Darrick J. Wong, Jeff Moyer, Jens Axboe,
Linus Torvalds, Linux Kernel Mailing List, linux-fsdevel
Hi Fengguang,
On Mon, Oct 30, 2017 at 08:20:21AM +0100, Fengguang Wu wrote:
> CC fsdevel.
>
> On Sun, Oct 29, 2017 at 11:51:55PM +0100, Fengguang Wu wrote:
> > Hi Linus,
> >
> > Up to now we see the below boot error/warnings when testing v4.14-rc6.
> >
> > They hit the RC release mainly due to various imperfections in 0day's
> > auto bisection. So I manually list them here and CC the likely easy to
> > debug ones to the corresponding maintainers in the followup emails.
> >
> > boot_successes: 4700
> > boot_failures: 247
>
> [...]
>
> > WARNING:at_fs/direct-io.c:#dio_complete: 7
> > WARNING:at_fs/iomap.c:#iomap_dio_complete: 3
> > WARNING:at_fs/iomap.c:#iomap_dio_rw: 1
>
> The first warning happens on btrfs and is bisected to this commit.
> The other 2 warnings happen on xfs.
I noticed that the warnings are triggered by generic/095 and
generic/208, they're known to generate such warnings and I think these
warnings are kind of 'known issue', please see comments above
_filter_aiodio_dmesg() in fstests/common/filter.
Please make sure your local fstests contains the following 3 commits:
ca93e26865ab common: move _filter_xfs_dmesg() to common/filter
5aa662733ab0 common: turn _filter_xfs_dmesg() into _filter_aiodio_dmesg()
228aee780f13 generic/036,208: whitelist [iomap_]dio_complete() WARNs
we filtered out such warnings in fstests on purpose so the affected
tests won't fail because of such dmesg warnings.
Thanks,
Eryu
>
> commit 332391a9935da939319e473b4680e173df75afcf
> Author: Lukas Czerner <lczerner@redhat.com>
> AuthorDate: Thu Sep 21 08:16:29 2017 -0600
> Commit: Jens Axboe <axboe@kernel.dk>
> CommitDate: Mon Sep 25 08:56:05 2017 -0600
>
> fs: Fix page cache inconsistency when mixing buffered and AIO DIO
>
> Currently when mixing buffered reads and asynchronous direct writes it
> is possible to end up with the situation where we have stale data in the
> page cache while the new data is already written to disk. This is
> permanent until the affected pages are flushed away. Despite the fact
> that mixing buffered and direct IO is ill-advised it does pose a thread
> for a data integrity, is unexpected and should be fixed.
>
> Fix this by deferring completion of asynchronous direct writes to a
> process context in the case that there are mapped pages to be found in
> the inode. Later before the completion in dio_complete() invalidate
> the pages in question. This ensures that after the completion the pages
> in the written area are either unmapped, or populated with up-to-date
> data. Also do the same for the iomap case which uses
> iomap_dio_complete() instead.
>
> This has a side effect of deferring the completion to a process context
> for every AIO DIO that happens on inode that has pages mapped. However
> since the consensus is that this is ill-advised practice the performance
> implication should not be a problem.
>
> This was based on proposal from Jeff Moyer, thanks!
>
> Reviewed-by: Jan Kara <jack@suse.cz>
> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
> Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
> Signed-off-by: Lukas Czerner <lczerner@redhat.com>
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
> fs/direct-io.c | 47 ++++++++++++++++++++++++++++++++++++++++++-----
> fs/iomap.c | 29 ++++++++++++++++-------------
> mm/filemap.c | 6 ++++++
> 3 files changed, 64 insertions(+), 18 deletions(-)
>
> The call traces are:
>
> [ 334.461955] BTRFS info (device vdb): has skinny extents
> [ 334.463231] BTRFS info (device vdb): flagging fs with big metadata feature
> [ 334.469746] BTRFS info (device vdb): creating UUID tree
> [ 336.190840] ------------[ cut here ]------------
> [ 336.193338] WARNING: CPU: 0 PID: 6379 at fs/direct-io.c:293 dio_complete+0x1d4/0x220
> [ 336.196925] Modules linked in: btrfs xor zstd_decompress zstd_compress xxhash raid6_pq dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver sr_mod cdrom sg ata_generic pata_acpi ppdev snd_pcm snd_timer snd soundcore serio_raw parport_pc pcspkr parport floppy ata_piix libata i2c_piix4 ip_tables
> [ 336.203746] CPU: 0 PID: 6379 Comm: kworker/0:0 Not tainted 4.14.0-rc6 #1
> [ 336.204799] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
> [ 336.207480] Workqueue: dio/vdb dio_aio_complete_work
> [ 336.208373] task: ffff880079094a80 task.stack: ffffc90000950000
> [ 336.210495] RIP: 0010:dio_complete+0x1d4/0x220
> [ 336.211288] RSP: 0018:ffffc90000953e20 EFLAGS: 00010286
> [ 336.213145] RAX: 00000000fffffff0 RBX: ffff8800aae8c780 RCX: ffffc90000953c70
> [ 336.214232] RDX: 0000000080000000 RSI: 00000000000002b4 RDI: ffffffff81cb9ccb
> [ 336.216605] RBP: ffffc90000953e48 R08: 0000000000000000 R09: ffff880079c8b6c0
> [ 336.217726] R10: 0000000000000074 R11: 0000000000000000 R12: 0000000000002000
> [ 336.220141] R13: 0000000000002000 R14: 0000000000000003 R15: 0000000000074000
> [ 336.221232] FS: 0000000000000000(0000) GS:ffff88013fc00000(0000) knlGS:0000000000000000
> [ 336.223915] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 336.224885] CR2: 000055e9eae6b750 CR3: 00000000752c7000 CR4: 00000000000006f0
> [ 336.227202] Call Trace:
> [ 336.227750] dio_aio_complete_work+0x19/0x20
> [ 336.228536] process_one_work+0x198/0x3e0
> [ 336.230396] worker_thread+0x4e/0x3e0
> [ 336.231064] kthread+0x114/0x150
> [ 336.232489] ? process_one_work+0x3e0/0x3e0
> [ 336.233250] ? kthread_create_on_node+0x40/0x40
> [ 336.234929] ? kthread_create_on_node+0x40/0x40
> [ 336.235754] ret_from_fork+0x25/0x30
> [ 336.237322] Code: 8b 78 30 48 83 7f 58 00 0f 84 e9 fe ff ff 4b 8d 54 27 ff 4c 89 fe 48 c1 fe 0c 48 c1 fa 0c e8 64 58 f3 ff 85 c0 0f 84 cc fe ff ff <0f> ff e9 c5 fe ff ff 8b 50 20 f6 c2 10 0f 84 e5 fe ff ff 48 8b
> [ 336.240031] ---[ end trace 3d29f0720cd16ad3 ]---
> [ 378.043462] generic/095 [failed, exit status 1] - output mismatch (see /lkp/benchmarks/xfstests/results//generic/095.out.bad)
> [ 378.043465]
>
>
> [ 121.160532] run fstests generic/117 at 2017-10-25 05:27:29
> [ 123.275347] XFS (vdd): Mounting V5 Filesystem
> [ 123.280579] XFS (vdd): Ending clean mount
> [ 163.043220] XFS (vdd): Unmounting Filesystem
> [ 163.258872] generic/117 42s
> [ 163.258875]
> [ 164.009187] run fstests generic/208 at 2017-10-25 05:28:12
> [ 164.344046] ------------[ cut here ]------------
> [ 164.345164] WARNING: CPU: 1 PID: 86 at fs/iomap.c:755 iomap_dio_complete+0xe5/0x100
> [ 164.347193] Modules linked in: xfs dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver sr_mod cdrom sg ata_generic pata_acpi ppdev snd_pc
> m snd_timer snd soundcore parport_pc pcspkr serio_raw floppy parport ata_piix i2c_piix4 libata ip_tables
> [ 164.351698] CPU: 1 PID: 86 Comm: kworker/1:2 Not tainted 4.14.0-rc6 #1
> [ 164.353111] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
> [ 164.361199] Workqueue: dio/vda iomap_dio_complete_work
> [ 164.362418] task: ffff88007f8b8000 task.stack: ffffc90000a4c000
> [ 164.363666] RIP: 0010:iomap_dio_complete+0xe5/0x100
> [ 164.364923] RSP: 0018:ffffc90000a4fe08 EFLAGS: 00010286
> [ 164.366065] RAX: 00000000fffffff0 RBX: ffff88007a5fba80 RCX: ffffc90000a4fc58
> [ 164.367281] RDX: 0000000080000000 RSI: 00000000000002b4 RDI: ffffffff81cb9ccb
> [ 164.368330] RBP: ffffc90000a4fe30 R08: 0000000000000000 R09: ffff8800750eeb58
> [ 164.369503] R10: 0000000000001ffe R11: 0000000000000000 R12: 0000000002000000
> [ 164.370573] R13: ffff8800b8d27780 R14: ffff8800b8869058 R15: 0000000000000000
> [ 164.371674] FS: 0000000000000000(0000) GS:ffff88013fc80000(0000) knlGS:0000000000000000
> [ 164.373136] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 164.387036] CR2: 00007f5ee028e000 CR3: 000000007a1ff000 CR4: 00000000000006e0
> [ 164.388119] Call Trace:
> [ 164.388581] iomap_dio_complete_work+0x25/0x80
> [ 164.389385] process_one_work+0x198/0x3e0
> [ 164.390054] worker_thread+0x4e/0x3e0
> [ 164.390699] kthread+0x114/0x150
> [ 164.391341] ? process_one_work+0x3e0/0x3e0
> [ 164.392038] ? kthread_create_on_node+0x40/0x40
> [ 164.392922] ret_from_fork+0x25/0x30
> [ 164.393506] Code: e0 eb 9f 48 8b 77 18 e9 78 ff ff ff 4c 89 fa 48 03 53 18 4c 89 fe 48 c1 fe 0c 48 83 ea 01 48 c1 fa 0c e8 0f 93 f0 ff 85 c
> 0 74 93 <0f> ff eb 8f 4c 63 67 28 e9 54 ff ff ff 0f 1f 40 00 66 2e 0f 1f
> [ 164.396264] ---[ end trace 77984c98cc2335de ]---
> [ 364.297438] generic/208 200s
> [ 364.297442]
> [ 365.772872] XFS (vda): Unmounting Filesystem
> [ 366.507546] XFS (vda): Mounting V5 Filesystem
> [ 366.519283] XFS (vda): Ending clean mount
>
>
> [ 370.264428] run fstests generic/095 at 2017-10-25 05:38:44
> [ 370.392041] XFS (vdd): Unmounting Filesystem
> [ 372.641826] XFS (vdd): Mounting V5 Filesystem
> [ 372.648446] XFS (vdd): Ending clean mount
> [ 373.277786] ------------[ cut here ]------------
> [ 373.279820] ------------[ cut here ]------------
> [ 373.279826] WARNING: CPU: 2 PID: 29166 at fs/iomap.c:1016 iomap_dio_rw+0x346/0x360
> [ 373.279827] Modules linked in: xfs dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver sr_mod cdrom sg ata_generic pata_acpi ppdev snd_pcm snd_timer snd soundcore serio_raw pcspkr i2c_piix4 ata_piix parport_pc parport floppy libata ip_tables
> [ 373.279842] CPU: 2 PID: 29166 Comm: fio Not tainted 4.14.0-rc6 #1
> [ 373.279843] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
> [ 373.279844] task: ffff88008bd34a80 task.stack: ffffc900065f8000
> [ 373.279846] RIP: 0010:iomap_dio_rw+0x346/0x360
> [ 373.279847] RSP: 0000:ffffc900065fbd40 EFLAGS: 00010286
> [ 373.279848] RAX: 00000000fffffff0 RBX: ffffc900065fbe98 RCX: ffffc900065fbb90
> [ 373.279848] RDX: 0000000080000000 RSI: 00000000000002b4 RDI: ffffffff81cb9ccb
> [ 373.279849] RBP: ffffc900065fbdd0 R08: 0000000000000000 R09: 00007f6fe0948000
> [ 373.279850] R10: ffff88013ffd2000 R11: 0000000000000400 R12: 000000000005f400
> [ 373.279850] R13: ffffc900065fbe70 R14: ffff88007a6fb7e0 R15: 0000000000000011
> [ 373.279852] FS: 00007f6fb3ce1700(0000) GS:ffff88013fd00000(0000) knlGS:0000000000000000
> [ 373.279852] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 373.279853] CR2: 00007f6fe088ac00 CR3: 0000000078eb7000 CR4: 00000000000006e0
> [ 373.279856] Call Trace:
> [ 373.279913] ? xfs_file_aio_write_checks+0x137/0x190 [xfs]
> [ 373.279933] xfs_file_dio_aio_write+0xec/0x260 [xfs]
> [ 373.279952] ? xfs_file_dio_aio_write+0xec/0x260 [xfs]
> [ 373.279970] xfs_file_write_iter+0x98/0x150 [xfs]
> [ 373.279972] __vfs_write+0xd3/0x150
> [ 373.279974] vfs_write+0xc7/0x1b0
> [ 373.279975] SyS_pwrite64+0x76/0x90
> [ 373.279977] entry_SYSCALL_64_fastpath+0x1a/0xa5
> [ 373.279979] RIP: 0033:0x7f6fcc3019d3
> [ 373.279979] RSP: 002b:00007f6fb3cdeb20 EFLAGS: 00000293 ORIG_RAX: 0000000000000012
> [ 373.279980] RAX: ffffffffffffffda RBX: 00007f6fb3ce1700 RCX: 00007f6fcc3019d3
> [ 373.279981] RDX: 0000000000000400 RSI: 000055e925caac00 RDI: 0000000000000004
> [ 373.279982] RBP: 00007f6fe095fee0 R08: 0000000000000000 R09: 00007f6fb3ce1700
> [ 373.279982] R10: 000000000005f400 R11: 0000000000000293 R12: 0000000000000000
> [ 373.279983] R13: 00007f6fe095fe2f R14: 0000000000000003 R15: 00007f6fe095fee0
> [ 373.279984] Code: 21 00 00 f6 58 2b 81 49 c7 44 24 08 00 00 00 00 4c 89
> f7 e8 ed ec ff ff e9 13 fe ff ff 48 8d 7d a0 e8 2f a9 13 00 e9 33 ff ff ff
> <0f> ff e9 6a fe ff ff 48 c7 c0 f4 ff ff ff e9 f2 fd ff ff 90 90 [
> 373.279998] ---[ end trace 01e002a8aa01ae25 ]---
> [ 373.334858] WARNING: CPU: 3 PID: 29158 at fs/iomap.c:755 iomap_dio_complete+0xe5/0x100
> [ 373.336817] Modules linked in: xfs dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver sr_mod cdrom sg ata_generic pata_acpi ppdev snd_pcm snd_timer snd soundcore serio_raw pcspkr i2c_piix4 ata_piix parport_pc parport floppy libata ip_tables
> [ 373.341916] CPU: 3 PID: 29158 Comm: fio Tainted: G W 4.14.0-rc6 #1
> [ 373.344151] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
> [ 373.346702] task: ffff88008bd32540 task.stack: ffffc900065c8000
> [ 373.348512] RIP: 0010:iomap_dio_complete+0xe5/0x100
> [ 373.349999] RSP: 0018:ffffc900065cbd08 EFLAGS: 00010286
> [ 373.351604] RAX: 00000000fffffff0 RBX: ffff88007a5611e0 RCX: ffffc900065cbb58
> [ 373.353690] RDX: 0000000080000000 RSI: 00000000000002b4 RDI: ffffffff81cb9ccb
> [ 373.355693] RBP: ffffc900065cbd30 R08: 0000000000000000 R09: 00007f6fe08cb000
> [ 373.357775] R10: ffffc900065cb8c8 R11: 0000000000000207 R12: 0000000000000400
> [ 373.359837] R13: ffffc900065cbe70 R14: ffff8800787657d8 R15: 0000000000002400
> [ 373.361878] FS: 00007f6fe095aa80(0000) GS:ffff88013fd80000(0000) knlGS:0000000000000000
> [ 373.364333] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 373.366071] CR2: 00007f6fcc38c1ee CR3: 000000007949a000 CR4: 00000000000006e0
> [ 373.368170] Call Trace:
> [ 373.369165] ? iomap_dio_rw+0x2a6/0x360
> [ 373.370467] iomap_dio_rw+0x333/0x360
> [ 373.371768] xfs_file_dio_aio_write+0xec/0x260 [xfs]
> [ 373.373069] ? xfs_file_dio_aio_write+0xec/0x260 [xfs]
> [ 373.374410] xfs_file_write_iter+0x98/0x150 [xfs]
> [ 373.375820] __vfs_write+0xd3/0x150
> [ 373.377096] vfs_write+0xc7/0x1b0
> [ 373.378436] SyS_pwrite64+0x76/0x90
> [ 373.379897] entry_SYSCALL_64_fastpath+0x1a/0xa5
> [ 373.381488] RIP: 0033:0x7f6fcc3019d3
> [ 373.382781] RSP: 002b:00007f6fe0957ea0 EFLAGS: 00000293 ORIG_RAX: 0000000000000012
> [ 373.385064] RAX: ffffffffffffffda RBX: 000055e925cb07f0 RCX: 00007f6fcc3019d3
> [ 373.386913] RDX: 0000000000000400 RSI: 000055e925ca9800 RDI: 0000000000000005
> [ 373.388911] RBP: 0000000000000005 R08: 0000000000000000 R09: 00007f6fe095aa80
> [ 373.390588] R10: 0000000000002400 R11: 0000000000000293 R12: 00007f6fe095aa80
> [ 373.392296] R13: 00007f6fe0957ec0 R14: 0000000000000003 R15: 000055e925caf360
> [ 373.394229] Code: e0 eb 9f 48 8b 77 18 e9 78 ff ff ff 4c 89 fa 48 03 53
> 18 4c 89 fe 48 c1 fe 0c 48 83 ea 01 48 c1 fa 0c e8 0f 93 f0 ff 85 c0 74 93
> <0f> ff eb 8f 4c 63 67 28 e9 54 ff ff ff 0f 1f 40 00 66 2e 0f 1f [
> 373.398812] ---[ end trace 01e002a8aa01ae26 ]---
> [ 392.887639] XFS (vdd): Unmounting Filesystem
> [ 392.985715] generic/095 [failed, exit status 1] - output mismatch (see /lkp/benchmarks/xfstests/results//generic/095.out.bad)
>
> Thanks,
> Fengguang
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [btrfs] WARNING: CPU: 0 PID: 6379 at fs/direct-io.c:293 dio_complete+0x1d4/0x220
2017-10-30 7:44 ` Eryu Guan
@ 2017-10-31 0:10 ` Fengguang Wu
2017-10-31 6:54 ` Eryu Guan
2017-10-31 15:13 ` Filipe Manana
1 sibling, 1 reply; 13+ messages in thread
From: Fengguang Wu @ 2017-10-31 0:10 UTC (permalink / raw)
To: Eryu Guan
Cc: Lukas Czerner, Jan Kara, Darrick J. Wong, Jeff Moyer, Jens Axboe,
Linus Torvalds, Linux Kernel Mailing List, linux-fsdevel
Hi Eryu,
On Mon, Oct 30, 2017 at 03:44:29PM +0800, Eryu Guan wrote:
>Hi Fengguang,
>
>On Mon, Oct 30, 2017 at 08:20:21AM +0100, Fengguang Wu wrote:
>> CC fsdevel.
>>
>> On Sun, Oct 29, 2017 at 11:51:55PM +0100, Fengguang Wu wrote:
>> > Hi Linus,
>> >
>> > Up to now we see the below boot error/warnings when testing v4.14-rc6.
>> >
>> > They hit the RC release mainly due to various imperfections in 0day's
>> > auto bisection. So I manually list them here and CC the likely easy to
>> > debug ones to the corresponding maintainers in the followup emails.
>> >
>> > boot_successes: 4700
>> > boot_failures: 247
>>
>> [...]
>>
>> > WARNING:at_fs/direct-io.c:#dio_complete: 7
>> > WARNING:at_fs/iomap.c:#iomap_dio_complete: 3
>> > WARNING:at_fs/iomap.c:#iomap_dio_rw: 1
>>
>> The first warning happens on btrfs and is bisected to this commit.
>> The other 2 warnings happen on xfs.
>
>I noticed that the warnings are triggered by generic/095 and
>generic/208, they're known to generate such warnings and I think these
>warnings are kind of 'known issue', please see comments above
>_filter_aiodio_dmesg() in fstests/common/filter.
>
>Please make sure your local fstests contains the following 3 commits:
>
>ca93e26865ab common: move _filter_xfs_dmesg() to common/filter
>5aa662733ab0 common: turn _filter_xfs_dmesg() into _filter_aiodio_dmesg()
>228aee780f13 generic/036,208: whitelist [iomap_]dio_complete() WARNs
OK.
>we filtered out such warnings in fstests on purpose so the affected
>tests won't fail because of such dmesg warnings.
We may also teach 0day robot to ignore the warning when running the
above 2 fstests.
The more generic way of filtering may be to inject a dmesg line like
THIS TEST WILL TRIGGER KERNEL WARNING, PLEASE IGNORE.
just before the specific tests startup. Then 3rd party dmesg parsing
scripts can handle such purpose-made warnings in a uniform way.
Thanks,
Fengguang
>>
>> commit 332391a9935da939319e473b4680e173df75afcf
>> Author: Lukas Czerner <lczerner@redhat.com>
>> AuthorDate: Thu Sep 21 08:16:29 2017 -0600
>> Commit: Jens Axboe <axboe@kernel.dk>
>> CommitDate: Mon Sep 25 08:56:05 2017 -0600
>>
>> fs: Fix page cache inconsistency when mixing buffered and AIO DIO
>>
>> Currently when mixing buffered reads and asynchronous direct writes it
>> is possible to end up with the situation where we have stale data in the
>> page cache while the new data is already written to disk. This is
>> permanent until the affected pages are flushed away. Despite the fact
>> that mixing buffered and direct IO is ill-advised it does pose a thread
>> for a data integrity, is unexpected and should be fixed.
>>
>> Fix this by deferring completion of asynchronous direct writes to a
>> process context in the case that there are mapped pages to be found in
>> the inode. Later before the completion in dio_complete() invalidate
>> the pages in question. This ensures that after the completion the pages
>> in the written area are either unmapped, or populated with up-to-date
>> data. Also do the same for the iomap case which uses
>> iomap_dio_complete() instead.
>>
>> This has a side effect of deferring the completion to a process context
>> for every AIO DIO that happens on inode that has pages mapped. However
>> since the consensus is that this is ill-advised practice the performance
>> implication should not be a problem.
>>
>> This was based on proposal from Jeff Moyer, thanks!
>>
>> Reviewed-by: Jan Kara <jack@suse.cz>
>> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
>> Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
>> Signed-off-by: Lukas Czerner <lczerner@redhat.com>
>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>> ---
>> fs/direct-io.c | 47 ++++++++++++++++++++++++++++++++++++++++++-----
>> fs/iomap.c | 29 ++++++++++++++++-------------
>> mm/filemap.c | 6 ++++++
>> 3 files changed, 64 insertions(+), 18 deletions(-)
>>
>> The call traces are:
>>
>> [ 334.461955] BTRFS info (device vdb): has skinny extents
>> [ 334.463231] BTRFS info (device vdb): flagging fs with big metadata feature
>> [ 334.469746] BTRFS info (device vdb): creating UUID tree
>> [ 336.190840] ------------[ cut here ]------------
>> [ 336.193338] WARNING: CPU: 0 PID: 6379 at fs/direct-io.c:293 dio_complete+0x1d4/0x220
>> [ 336.196925] Modules linked in: btrfs xor zstd_decompress zstd_compress xxhash raid6_pq dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver sr_mod cdrom sg ata_generic pata_acpi ppdev snd_pcm snd_timer snd soundcore serio_raw parport_pc pcspkr parport floppy ata_piix libata i2c_piix4 ip_tables
>> [ 336.203746] CPU: 0 PID: 6379 Comm: kworker/0:0 Not tainted 4.14.0-rc6 #1
>> [ 336.204799] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
>> [ 336.207480] Workqueue: dio/vdb dio_aio_complete_work
>> [ 336.208373] task: ffff880079094a80 task.stack: ffffc90000950000
>> [ 336.210495] RIP: 0010:dio_complete+0x1d4/0x220
>> [ 336.211288] RSP: 0018:ffffc90000953e20 EFLAGS: 00010286
>> [ 336.213145] RAX: 00000000fffffff0 RBX: ffff8800aae8c780 RCX: ffffc90000953c70
>> [ 336.214232] RDX: 0000000080000000 RSI: 00000000000002b4 RDI: ffffffff81cb9ccb
>> [ 336.216605] RBP: ffffc90000953e48 R08: 0000000000000000 R09: ffff880079c8b6c0
>> [ 336.217726] R10: 0000000000000074 R11: 0000000000000000 R12: 0000000000002000
>> [ 336.220141] R13: 0000000000002000 R14: 0000000000000003 R15: 0000000000074000
>> [ 336.221232] FS: 0000000000000000(0000) GS:ffff88013fc00000(0000) knlGS:0000000000000000
>> [ 336.223915] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [ 336.224885] CR2: 000055e9eae6b750 CR3: 00000000752c7000 CR4: 00000000000006f0
>> [ 336.227202] Call Trace:
>> [ 336.227750] dio_aio_complete_work+0x19/0x20
>> [ 336.228536] process_one_work+0x198/0x3e0
>> [ 336.230396] worker_thread+0x4e/0x3e0
>> [ 336.231064] kthread+0x114/0x150
>> [ 336.232489] ? process_one_work+0x3e0/0x3e0
>> [ 336.233250] ? kthread_create_on_node+0x40/0x40
>> [ 336.234929] ? kthread_create_on_node+0x40/0x40
>> [ 336.235754] ret_from_fork+0x25/0x30
>> [ 336.237322] Code: 8b 78 30 48 83 7f 58 00 0f 84 e9 fe ff ff 4b 8d 54 27 ff 4c 89 fe 48 c1 fe 0c 48 c1 fa 0c e8 64 58 f3 ff 85 c0 0f 84 cc fe ff ff <0f> ff e9 c5 fe ff ff 8b 50 20 f6 c2 10 0f 84 e5 fe ff ff 48 8b
>> [ 336.240031] ---[ end trace 3d29f0720cd16ad3 ]---
>> [ 378.043462] generic/095 [failed, exit status 1] - output mismatch (see /lkp/benchmarks/xfstests/results//generic/095.out.bad)
>> [ 378.043465]
>>
>>
>> [ 121.160532] run fstests generic/117 at 2017-10-25 05:27:29
>> [ 123.275347] XFS (vdd): Mounting V5 Filesystem
>> [ 123.280579] XFS (vdd): Ending clean mount
>> [ 163.043220] XFS (vdd): Unmounting Filesystem
>> [ 163.258872] generic/117 42s
>> [ 163.258875]
>> [ 164.009187] run fstests generic/208 at 2017-10-25 05:28:12
>> [ 164.344046] ------------[ cut here ]------------
>> [ 164.345164] WARNING: CPU: 1 PID: 86 at fs/iomap.c:755 iomap_dio_complete+0xe5/0x100
>> [ 164.347193] Modules linked in: xfs dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver sr_mod cdrom sg ata_generic pata_acpi ppdev snd_pc
>> m snd_timer snd soundcore parport_pc pcspkr serio_raw floppy parport ata_piix i2c_piix4 libata ip_tables
>> [ 164.351698] CPU: 1 PID: 86 Comm: kworker/1:2 Not tainted 4.14.0-rc6 #1
>> [ 164.353111] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
>> [ 164.361199] Workqueue: dio/vda iomap_dio_complete_work
>> [ 164.362418] task: ffff88007f8b8000 task.stack: ffffc90000a4c000
>> [ 164.363666] RIP: 0010:iomap_dio_complete+0xe5/0x100
>> [ 164.364923] RSP: 0018:ffffc90000a4fe08 EFLAGS: 00010286
>> [ 164.366065] RAX: 00000000fffffff0 RBX: ffff88007a5fba80 RCX: ffffc90000a4fc58
>> [ 164.367281] RDX: 0000000080000000 RSI: 00000000000002b4 RDI: ffffffff81cb9ccb
>> [ 164.368330] RBP: ffffc90000a4fe30 R08: 0000000000000000 R09: ffff8800750eeb58
>> [ 164.369503] R10: 0000000000001ffe R11: 0000000000000000 R12: 0000000002000000
>> [ 164.370573] R13: ffff8800b8d27780 R14: ffff8800b8869058 R15: 0000000000000000
>> [ 164.371674] FS: 0000000000000000(0000) GS:ffff88013fc80000(0000) knlGS:0000000000000000
>> [ 164.373136] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [ 164.387036] CR2: 00007f5ee028e000 CR3: 000000007a1ff000 CR4: 00000000000006e0
>> [ 164.388119] Call Trace:
>> [ 164.388581] iomap_dio_complete_work+0x25/0x80
>> [ 164.389385] process_one_work+0x198/0x3e0
>> [ 164.390054] worker_thread+0x4e/0x3e0
>> [ 164.390699] kthread+0x114/0x150
>> [ 164.391341] ? process_one_work+0x3e0/0x3e0
>> [ 164.392038] ? kthread_create_on_node+0x40/0x40
>> [ 164.392922] ret_from_fork+0x25/0x30
>> [ 164.393506] Code: e0 eb 9f 48 8b 77 18 e9 78 ff ff ff 4c 89 fa 48 03 53 18 4c 89 fe 48 c1 fe 0c 48 83 ea 01 48 c1 fa 0c e8 0f 93 f0 ff 85 c
>> 0 74 93 <0f> ff eb 8f 4c 63 67 28 e9 54 ff ff ff 0f 1f 40 00 66 2e 0f 1f
>> [ 164.396264] ---[ end trace 77984c98cc2335de ]---
>> [ 364.297438] generic/208 200s
>> [ 364.297442]
>> [ 365.772872] XFS (vda): Unmounting Filesystem
>> [ 366.507546] XFS (vda): Mounting V5 Filesystem
>> [ 366.519283] XFS (vda): Ending clean mount
>>
>>
>> [ 370.264428] run fstests generic/095 at 2017-10-25 05:38:44
>> [ 370.392041] XFS (vdd): Unmounting Filesystem
>> [ 372.641826] XFS (vdd): Mounting V5 Filesystem
>> [ 372.648446] XFS (vdd): Ending clean mount
>> [ 373.277786] ------------[ cut here ]------------
>> [ 373.279820] ------------[ cut here ]------------
>> [ 373.279826] WARNING: CPU: 2 PID: 29166 at fs/iomap.c:1016 iomap_dio_rw+0x346/0x360
>> [ 373.279827] Modules linked in: xfs dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver sr_mod cdrom sg ata_generic pata_acpi ppdev snd_pcm snd_timer snd soundcore serio_raw pcspkr i2c_piix4 ata_piix parport_pc parport floppy libata ip_tables
>> [ 373.279842] CPU: 2 PID: 29166 Comm: fio Not tainted 4.14.0-rc6 #1
>> [ 373.279843] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
>> [ 373.279844] task: ffff88008bd34a80 task.stack: ffffc900065f8000
>> [ 373.279846] RIP: 0010:iomap_dio_rw+0x346/0x360
>> [ 373.279847] RSP: 0000:ffffc900065fbd40 EFLAGS: 00010286
>> [ 373.279848] RAX: 00000000fffffff0 RBX: ffffc900065fbe98 RCX: ffffc900065fbb90
>> [ 373.279848] RDX: 0000000080000000 RSI: 00000000000002b4 RDI: ffffffff81cb9ccb
>> [ 373.279849] RBP: ffffc900065fbdd0 R08: 0000000000000000 R09: 00007f6fe0948000
>> [ 373.279850] R10: ffff88013ffd2000 R11: 0000000000000400 R12: 000000000005f400
>> [ 373.279850] R13: ffffc900065fbe70 R14: ffff88007a6fb7e0 R15: 0000000000000011
>> [ 373.279852] FS: 00007f6fb3ce1700(0000) GS:ffff88013fd00000(0000) knlGS:0000000000000000
>> [ 373.279852] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [ 373.279853] CR2: 00007f6fe088ac00 CR3: 0000000078eb7000 CR4: 00000000000006e0
>> [ 373.279856] Call Trace:
>> [ 373.279913] ? xfs_file_aio_write_checks+0x137/0x190 [xfs]
>> [ 373.279933] xfs_file_dio_aio_write+0xec/0x260 [xfs]
>> [ 373.279952] ? xfs_file_dio_aio_write+0xec/0x260 [xfs]
>> [ 373.279970] xfs_file_write_iter+0x98/0x150 [xfs]
>> [ 373.279972] __vfs_write+0xd3/0x150
>> [ 373.279974] vfs_write+0xc7/0x1b0
>> [ 373.279975] SyS_pwrite64+0x76/0x90
>> [ 373.279977] entry_SYSCALL_64_fastpath+0x1a/0xa5
>> [ 373.279979] RIP: 0033:0x7f6fcc3019d3
>> [ 373.279979] RSP: 002b:00007f6fb3cdeb20 EFLAGS: 00000293 ORIG_RAX: 0000000000000012
>> [ 373.279980] RAX: ffffffffffffffda RBX: 00007f6fb3ce1700 RCX: 00007f6fcc3019d3
>> [ 373.279981] RDX: 0000000000000400 RSI: 000055e925caac00 RDI: 0000000000000004
>> [ 373.279982] RBP: 00007f6fe095fee0 R08: 0000000000000000 R09: 00007f6fb3ce1700
>> [ 373.279982] R10: 000000000005f400 R11: 0000000000000293 R12: 0000000000000000
>> [ 373.279983] R13: 00007f6fe095fe2f R14: 0000000000000003 R15: 00007f6fe095fee0
>> [ 373.279984] Code: 21 00 00 f6 58 2b 81 49 c7 44 24 08 00 00 00 00 4c 89
>> f7 e8 ed ec ff ff e9 13 fe ff ff 48 8d 7d a0 e8 2f a9 13 00 e9 33 ff ff ff
>> <0f> ff e9 6a fe ff ff 48 c7 c0 f4 ff ff ff e9 f2 fd ff ff 90 90 [
>> 373.279998] ---[ end trace 01e002a8aa01ae25 ]---
>> [ 373.334858] WARNING: CPU: 3 PID: 29158 at fs/iomap.c:755 iomap_dio_complete+0xe5/0x100
>> [ 373.336817] Modules linked in: xfs dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver sr_mod cdrom sg ata_generic pata_acpi ppdev snd_pcm snd_timer snd soundcore serio_raw pcspkr i2c_piix4 ata_piix parport_pc parport floppy libata ip_tables
>> [ 373.341916] CPU: 3 PID: 29158 Comm: fio Tainted: G W 4.14.0-rc6 #1
>> [ 373.344151] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
>> [ 373.346702] task: ffff88008bd32540 task.stack: ffffc900065c8000
>> [ 373.348512] RIP: 0010:iomap_dio_complete+0xe5/0x100
>> [ 373.349999] RSP: 0018:ffffc900065cbd08 EFLAGS: 00010286
>> [ 373.351604] RAX: 00000000fffffff0 RBX: ffff88007a5611e0 RCX: ffffc900065cbb58
>> [ 373.353690] RDX: 0000000080000000 RSI: 00000000000002b4 RDI: ffffffff81cb9ccb
>> [ 373.355693] RBP: ffffc900065cbd30 R08: 0000000000000000 R09: 00007f6fe08cb000
>> [ 373.357775] R10: ffffc900065cb8c8 R11: 0000000000000207 R12: 0000000000000400
>> [ 373.359837] R13: ffffc900065cbe70 R14: ffff8800787657d8 R15: 0000000000002400
>> [ 373.361878] FS: 00007f6fe095aa80(0000) GS:ffff88013fd80000(0000) knlGS:0000000000000000
>> [ 373.364333] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [ 373.366071] CR2: 00007f6fcc38c1ee CR3: 000000007949a000 CR4: 00000000000006e0
>> [ 373.368170] Call Trace:
>> [ 373.369165] ? iomap_dio_rw+0x2a6/0x360
>> [ 373.370467] iomap_dio_rw+0x333/0x360
>> [ 373.371768] xfs_file_dio_aio_write+0xec/0x260 [xfs]
>> [ 373.373069] ? xfs_file_dio_aio_write+0xec/0x260 [xfs]
>> [ 373.374410] xfs_file_write_iter+0x98/0x150 [xfs]
>> [ 373.375820] __vfs_write+0xd3/0x150
>> [ 373.377096] vfs_write+0xc7/0x1b0
>> [ 373.378436] SyS_pwrite64+0x76/0x90
>> [ 373.379897] entry_SYSCALL_64_fastpath+0x1a/0xa5
>> [ 373.381488] RIP: 0033:0x7f6fcc3019d3
>> [ 373.382781] RSP: 002b:00007f6fe0957ea0 EFLAGS: 00000293 ORIG_RAX: 0000000000000012
>> [ 373.385064] RAX: ffffffffffffffda RBX: 000055e925cb07f0 RCX: 00007f6fcc3019d3
>> [ 373.386913] RDX: 0000000000000400 RSI: 000055e925ca9800 RDI: 0000000000000005
>> [ 373.388911] RBP: 0000000000000005 R08: 0000000000000000 R09: 00007f6fe095aa80
>> [ 373.390588] R10: 0000000000002400 R11: 0000000000000293 R12: 00007f6fe095aa80
>> [ 373.392296] R13: 00007f6fe0957ec0 R14: 0000000000000003 R15: 000055e925caf360
>> [ 373.394229] Code: e0 eb 9f 48 8b 77 18 e9 78 ff ff ff 4c 89 fa 48 03 53
>> 18 4c 89 fe 48 c1 fe 0c 48 83 ea 01 48 c1 fa 0c e8 0f 93 f0 ff 85 c0 74 93
>> <0f> ff eb 8f 4c 63 67 28 e9 54 ff ff ff 0f 1f 40 00 66 2e 0f 1f [
>> 373.398812] ---[ end trace 01e002a8aa01ae26 ]---
>> [ 392.887639] XFS (vdd): Unmounting Filesystem
>> [ 392.985715] generic/095 [failed, exit status 1] - output mismatch (see /lkp/benchmarks/xfstests/results//generic/095.out.bad)
>>
>> Thanks,
>> Fengguang
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [btrfs] WARNING: CPU: 0 PID: 6379 at fs/direct-io.c:293 dio_complete+0x1d4/0x220
2017-10-31 0:10 ` Fengguang Wu
@ 2017-10-31 6:54 ` Eryu Guan
2017-10-31 7:10 ` Fengguang Wu
2017-11-06 1:13 ` Eric Biggers
0 siblings, 2 replies; 13+ messages in thread
From: Eryu Guan @ 2017-10-31 6:54 UTC (permalink / raw)
To: Fengguang Wu
Cc: Lukas Czerner, Jan Kara, Darrick J. Wong, Jeff Moyer, Jens Axboe,
Linus Torvalds, Linux Kernel Mailing List, linux-fsdevel
On Tue, Oct 31, 2017 at 01:10:41AM +0100, Fengguang Wu wrote:
> Hi Eryu,
>
> On Mon, Oct 30, 2017 at 03:44:29PM +0800, Eryu Guan wrote:
> > Hi Fengguang,
> >
> > On Mon, Oct 30, 2017 at 08:20:21AM +0100, Fengguang Wu wrote:
> > > CC fsdevel.
> > >
> > > On Sun, Oct 29, 2017 at 11:51:55PM +0100, Fengguang Wu wrote:
> > > > Hi Linus,
> > > >
> > > > Up to now we see the below boot error/warnings when testing v4.14-rc6.
> > > >
> > > > They hit the RC release mainly due to various imperfections in 0day's
> > > > auto bisection. So I manually list them here and CC the likely easy to
> > > > debug ones to the corresponding maintainers in the followup emails.
> > > >
> > > > boot_successes: 4700
> > > > boot_failures: 247
> > >
> > > [...]
> > >
> > > > WARNING:at_fs/direct-io.c:#dio_complete: 7
> > > > WARNING:at_fs/iomap.c:#iomap_dio_complete: 3
> > > > WARNING:at_fs/iomap.c:#iomap_dio_rw: 1
> > >
> > > The first warning happens on btrfs and is bisected to this commit.
> > > The other 2 warnings happen on xfs.
> >
> > I noticed that the warnings are triggered by generic/095 and
> > generic/208, they're known to generate such warnings and I think these
> > warnings are kind of 'known issue', please see comments above
> > _filter_aiodio_dmesg() in fstests/common/filter.
> >
> > Please make sure your local fstests contains the following 3 commits:
> >
> > ca93e26865ab common: move _filter_xfs_dmesg() to common/filter
> > 5aa662733ab0 common: turn _filter_xfs_dmesg() into _filter_aiodio_dmesg()
> > 228aee780f13 generic/036,208: whitelist [iomap_]dio_complete() WARNs
>
> OK.
>
> > we filtered out such warnings in fstests on purpose so the affected
> > tests won't fail because of such dmesg warnings.
>
> We may also teach 0day robot to ignore the warning when running the
> above 2 fstests.
>
> The more generic way of filtering may be to inject a dmesg line like
>
> THIS TEST WILL TRIGGER KERNEL WARNING, PLEASE IGNORE.
>
> just before the specific tests startup. Then 3rd party dmesg parsing
> scripts can handle such purpose-made warnings in a uniform way.
fstests doesn't know, prior to the test, if the warnings the test is
going to trigger are intentional or real bugs, fstests records the dmesg
log and analyzes the log after test and reports PASS if all the warnings
are intentional (based on the whitelist filter).
But I think it's possible to insert such a message to dmesg *after* test
if fstests finds that all the warnings are intentional. Does that work
for 0day robot?
Thanks,
Eryu
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [btrfs] WARNING: CPU: 0 PID: 6379 at fs/direct-io.c:293 dio_complete+0x1d4/0x220
2017-10-31 6:54 ` Eryu Guan
@ 2017-10-31 7:10 ` Fengguang Wu
2017-11-06 1:13 ` Eric Biggers
1 sibling, 0 replies; 13+ messages in thread
From: Fengguang Wu @ 2017-10-31 7:10 UTC (permalink / raw)
To: Eryu Guan
Cc: Lukas Czerner, Jan Kara, Darrick J. Wong, Jeff Moyer, Jens Axboe,
Linus Torvalds, Linux Kernel Mailing List, linux-fsdevel
On Tue, Oct 31, 2017 at 02:54:17PM +0800, Eryu Guan wrote:
>On Tue, Oct 31, 2017 at 01:10:41AM +0100, Fengguang Wu wrote:
>> Hi Eryu,
>>
>> On Mon, Oct 30, 2017 at 03:44:29PM +0800, Eryu Guan wrote:
>> > Hi Fengguang,
>> >
>> > On Mon, Oct 30, 2017 at 08:20:21AM +0100, Fengguang Wu wrote:
>> > > CC fsdevel.
>> > >
>> > > On Sun, Oct 29, 2017 at 11:51:55PM +0100, Fengguang Wu wrote:
>> > > > Hi Linus,
>> > > >
>> > > > Up to now we see the below boot error/warnings when testing v4.14-rc6.
>> > > >
>> > > > They hit the RC release mainly due to various imperfections in 0day's
>> > > > auto bisection. So I manually list them here and CC the likely easy to
>> > > > debug ones to the corresponding maintainers in the followup emails.
>> > > >
>> > > > boot_successes: 4700
>> > > > boot_failures: 247
>> > >
>> > > [...]
>> > >
>> > > > WARNING:at_fs/direct-io.c:#dio_complete: 7
>> > > > WARNING:at_fs/iomap.c:#iomap_dio_complete: 3
>> > > > WARNING:at_fs/iomap.c:#iomap_dio_rw: 1
>> > >
>> > > The first warning happens on btrfs and is bisected to this commit.
>> > > The other 2 warnings happen on xfs.
>> >
>> > I noticed that the warnings are triggered by generic/095 and
>> > generic/208, they're known to generate such warnings and I think these
>> > warnings are kind of 'known issue', please see comments above
>> > _filter_aiodio_dmesg() in fstests/common/filter.
>> >
>> > Please make sure your local fstests contains the following 3 commits:
>> >
>> > ca93e26865ab common: move _filter_xfs_dmesg() to common/filter
>> > 5aa662733ab0 common: turn _filter_xfs_dmesg() into _filter_aiodio_dmesg()
>> > 228aee780f13 generic/036,208: whitelist [iomap_]dio_complete() WARNs
>>
>> OK.
>>
>> > we filtered out such warnings in fstests on purpose so the affected
>> > tests won't fail because of such dmesg warnings.
>>
>> We may also teach 0day robot to ignore the warning when running the
>> above 2 fstests.
>>
>> The more generic way of filtering may be to inject a dmesg line like
>>
>> THIS TEST WILL TRIGGER KERNEL WARNING, PLEASE IGNORE.
>>
>> just before the specific tests startup. Then 3rd party dmesg parsing
>> scripts can handle such purpose-made warnings in a uniform way.
>
>fstests doesn't know, prior to the test, if the warnings the test is
>going to trigger are intentional or real bugs, fstests records the dmesg
>log and analyzes the log after test and reports PASS if all the warnings
>are intentional (based on the whitelist filter).
Ah OK.
>But I think it's possible to insert such a message to dmesg *after* test
>if fstests finds that all the warnings are intentional. Does that work
>for 0day robot?
That should work, too.
Thanks,
Fengguang
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [btrfs] WARNING: CPU: 0 PID: 6379 at fs/direct-io.c:293 dio_complete+0x1d4/0x220
2017-10-30 7:44 ` Eryu Guan
2017-10-31 0:10 ` Fengguang Wu
@ 2017-10-31 15:13 ` Filipe Manana
1 sibling, 0 replies; 13+ messages in thread
From: Filipe Manana @ 2017-10-31 15:13 UTC (permalink / raw)
To: Eryu Guan
Cc: Fengguang Wu, Lukas Czerner, Jan Kara, Darrick J. Wong,
Jeff Moyer, Jens Axboe, Linus Torvalds, Linux Kernel Mailing List,
linux-fsdevel
On Mon, Oct 30, 2017 at 7:44 AM, Eryu Guan <guaneryu@gmail.com> wrote:
> Hi Fengguang,
>
> On Mon, Oct 30, 2017 at 08:20:21AM +0100, Fengguang Wu wrote:
>> CC fsdevel.
>>
>> On Sun, Oct 29, 2017 at 11:51:55PM +0100, Fengguang Wu wrote:
>> > Hi Linus,
>> >
>> > Up to now we see the below boot error/warnings when testing v4.14-rc6.
>> >
>> > They hit the RC release mainly due to various imperfections in 0day's
>> > auto bisection. So I manually list them here and CC the likely easy to
>> > debug ones to the corresponding maintainers in the followup emails.
>> >
>> > boot_successes: 4700
>> > boot_failures: 247
>>
>> [...]
>>
>> > WARNING:at_fs/direct-io.c:#dio_complete: 7
>> > WARNING:at_fs/iomap.c:#iomap_dio_complete: 3
>> > WARNING:at_fs/iomap.c:#iomap_dio_rw: 1
>>
>> The first warning happens on btrfs and is bisected to this commit.
>> The other 2 warnings happen on xfs.
>
> I noticed that the warnings are triggered by generic/095 and
> generic/208, they're known to generate such warnings and I think these
> warnings are kind of 'known issue', please see comments above
> _filter_aiodio_dmesg() in fstests/common/filter.
I've hit that warning (fs/direct-io.c:293 dio_complete+0xee/0x1ad)
once with test btrfs/070 (on a 4.14-rc6 kernel).
Only happened once and seems very hard to hit it here. Hadn't spent
time yet figuring out why it happens.
>
> Please make sure your local fstests contains the following 3 commits:
>
> ca93e26865ab common: move _filter_xfs_dmesg() to common/filter
> 5aa662733ab0 common: turn _filter_xfs_dmesg() into _filter_aiodio_dmesg()
> 228aee780f13 generic/036,208: whitelist [iomap_]dio_complete() WARNs
>
> we filtered out such warnings in fstests on purpose so the affected
> tests won't fail because of such dmesg warnings.
>
> Thanks,
> Eryu
>
>>
>> commit 332391a9935da939319e473b4680e173df75afcf
>> Author: Lukas Czerner <lczerner@redhat.com>
>> AuthorDate: Thu Sep 21 08:16:29 2017 -0600
>> Commit: Jens Axboe <axboe@kernel.dk>
>> CommitDate: Mon Sep 25 08:56:05 2017 -0600
>>
>> fs: Fix page cache inconsistency when mixing buffered and AIO DIO
>>
>> Currently when mixing buffered reads and asynchronous direct writes it
>> is possible to end up with the situation where we have stale data in the
>> page cache while the new data is already written to disk. This is
>> permanent until the affected pages are flushed away. Despite the fact
>> that mixing buffered and direct IO is ill-advised it does pose a thread
>> for a data integrity, is unexpected and should be fixed.
>>
>> Fix this by deferring completion of asynchronous direct writes to a
>> process context in the case that there are mapped pages to be found in
>> the inode. Later before the completion in dio_complete() invalidate
>> the pages in question. This ensures that after the completion the pages
>> in the written area are either unmapped, or populated with up-to-date
>> data. Also do the same for the iomap case which uses
>> iomap_dio_complete() instead.
>>
>> This has a side effect of deferring the completion to a process context
>> for every AIO DIO that happens on inode that has pages mapped. However
>> since the consensus is that this is ill-advised practice the performance
>> implication should not be a problem.
>>
>> This was based on proposal from Jeff Moyer, thanks!
>>
>> Reviewed-by: Jan Kara <jack@suse.cz>
>> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
>> Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
>> Signed-off-by: Lukas Czerner <lczerner@redhat.com>
>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>> ---
>> fs/direct-io.c | 47 ++++++++++++++++++++++++++++++++++++++++++-----
>> fs/iomap.c | 29 ++++++++++++++++-------------
>> mm/filemap.c | 6 ++++++
>> 3 files changed, 64 insertions(+), 18 deletions(-)
>>
>> The call traces are:
>>
>> [ 334.461955] BTRFS info (device vdb): has skinny extents
>> [ 334.463231] BTRFS info (device vdb): flagging fs with big metadata feature
>> [ 334.469746] BTRFS info (device vdb): creating UUID tree
>> [ 336.190840] ------------[ cut here ]------------
>> [ 336.193338] WARNING: CPU: 0 PID: 6379 at fs/direct-io.c:293 dio_complete+0x1d4/0x220
>> [ 336.196925] Modules linked in: btrfs xor zstd_decompress zstd_compress xxhash raid6_pq dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver sr_mod cdrom sg ata_generic pata_acpi ppdev snd_pcm snd_timer snd soundcore serio_raw parport_pc pcspkr parport floppy ata_piix libata i2c_piix4 ip_tables
>> [ 336.203746] CPU: 0 PID: 6379 Comm: kworker/0:0 Not tainted 4.14.0-rc6 #1
>> [ 336.204799] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
>> [ 336.207480] Workqueue: dio/vdb dio_aio_complete_work
>> [ 336.208373] task: ffff880079094a80 task.stack: ffffc90000950000
>> [ 336.210495] RIP: 0010:dio_complete+0x1d4/0x220
>> [ 336.211288] RSP: 0018:ffffc90000953e20 EFLAGS: 00010286
>> [ 336.213145] RAX: 00000000fffffff0 RBX: ffff8800aae8c780 RCX: ffffc90000953c70
>> [ 336.214232] RDX: 0000000080000000 RSI: 00000000000002b4 RDI: ffffffff81cb9ccb
>> [ 336.216605] RBP: ffffc90000953e48 R08: 0000000000000000 R09: ffff880079c8b6c0
>> [ 336.217726] R10: 0000000000000074 R11: 0000000000000000 R12: 0000000000002000
>> [ 336.220141] R13: 0000000000002000 R14: 0000000000000003 R15: 0000000000074000
>> [ 336.221232] FS: 0000000000000000(0000) GS:ffff88013fc00000(0000) knlGS:0000000000000000
>> [ 336.223915] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [ 336.224885] CR2: 000055e9eae6b750 CR3: 00000000752c7000 CR4: 00000000000006f0
>> [ 336.227202] Call Trace:
>> [ 336.227750] dio_aio_complete_work+0x19/0x20
>> [ 336.228536] process_one_work+0x198/0x3e0
>> [ 336.230396] worker_thread+0x4e/0x3e0
>> [ 336.231064] kthread+0x114/0x150
>> [ 336.232489] ? process_one_work+0x3e0/0x3e0
>> [ 336.233250] ? kthread_create_on_node+0x40/0x40
>> [ 336.234929] ? kthread_create_on_node+0x40/0x40
>> [ 336.235754] ret_from_fork+0x25/0x30
>> [ 336.237322] Code: 8b 78 30 48 83 7f 58 00 0f 84 e9 fe ff ff 4b 8d 54 27 ff 4c 89 fe 48 c1 fe 0c 48 c1 fa 0c e8 64 58 f3 ff 85 c0 0f 84 cc fe ff ff <0f> ff e9 c5 fe ff ff 8b 50 20 f6 c2 10 0f 84 e5 fe ff ff 48 8b
>> [ 336.240031] ---[ end trace 3d29f0720cd16ad3 ]---
>> [ 378.043462] generic/095 [failed, exit status 1] - output mismatch (see /lkp/benchmarks/xfstests/results//generic/095.out.bad)
>> [ 378.043465]
>>
>>
>> [ 121.160532] run fstests generic/117 at 2017-10-25 05:27:29
>> [ 123.275347] XFS (vdd): Mounting V5 Filesystem
>> [ 123.280579] XFS (vdd): Ending clean mount
>> [ 163.043220] XFS (vdd): Unmounting Filesystem
>> [ 163.258872] generic/117 42s
>> [ 163.258875]
>> [ 164.009187] run fstests generic/208 at 2017-10-25 05:28:12
>> [ 164.344046] ------------[ cut here ]------------
>> [ 164.345164] WARNING: CPU: 1 PID: 86 at fs/iomap.c:755 iomap_dio_complete+0xe5/0x100
>> [ 164.347193] Modules linked in: xfs dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver sr_mod cdrom sg ata_generic pata_acpi ppdev snd_pc
>> m snd_timer snd soundcore parport_pc pcspkr serio_raw floppy parport ata_piix i2c_piix4 libata ip_tables
>> [ 164.351698] CPU: 1 PID: 86 Comm: kworker/1:2 Not tainted 4.14.0-rc6 #1
>> [ 164.353111] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
>> [ 164.361199] Workqueue: dio/vda iomap_dio_complete_work
>> [ 164.362418] task: ffff88007f8b8000 task.stack: ffffc90000a4c000
>> [ 164.363666] RIP: 0010:iomap_dio_complete+0xe5/0x100
>> [ 164.364923] RSP: 0018:ffffc90000a4fe08 EFLAGS: 00010286
>> [ 164.366065] RAX: 00000000fffffff0 RBX: ffff88007a5fba80 RCX: ffffc90000a4fc58
>> [ 164.367281] RDX: 0000000080000000 RSI: 00000000000002b4 RDI: ffffffff81cb9ccb
>> [ 164.368330] RBP: ffffc90000a4fe30 R08: 0000000000000000 R09: ffff8800750eeb58
>> [ 164.369503] R10: 0000000000001ffe R11: 0000000000000000 R12: 0000000002000000
>> [ 164.370573] R13: ffff8800b8d27780 R14: ffff8800b8869058 R15: 0000000000000000
>> [ 164.371674] FS: 0000000000000000(0000) GS:ffff88013fc80000(0000) knlGS:0000000000000000
>> [ 164.373136] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [ 164.387036] CR2: 00007f5ee028e000 CR3: 000000007a1ff000 CR4: 00000000000006e0
>> [ 164.388119] Call Trace:
>> [ 164.388581] iomap_dio_complete_work+0x25/0x80
>> [ 164.389385] process_one_work+0x198/0x3e0
>> [ 164.390054] worker_thread+0x4e/0x3e0
>> [ 164.390699] kthread+0x114/0x150
>> [ 164.391341] ? process_one_work+0x3e0/0x3e0
>> [ 164.392038] ? kthread_create_on_node+0x40/0x40
>> [ 164.392922] ret_from_fork+0x25/0x30
>> [ 164.393506] Code: e0 eb 9f 48 8b 77 18 e9 78 ff ff ff 4c 89 fa 48 03 53 18 4c 89 fe 48 c1 fe 0c 48 83 ea 01 48 c1 fa 0c e8 0f 93 f0 ff 85 c
>> 0 74 93 <0f> ff eb 8f 4c 63 67 28 e9 54 ff ff ff 0f 1f 40 00 66 2e 0f 1f
>> [ 164.396264] ---[ end trace 77984c98cc2335de ]---
>> [ 364.297438] generic/208 200s
>> [ 364.297442]
>> [ 365.772872] XFS (vda): Unmounting Filesystem
>> [ 366.507546] XFS (vda): Mounting V5 Filesystem
>> [ 366.519283] XFS (vda): Ending clean mount
>>
>>
>> [ 370.264428] run fstests generic/095 at 2017-10-25 05:38:44
>> [ 370.392041] XFS (vdd): Unmounting Filesystem
>> [ 372.641826] XFS (vdd): Mounting V5 Filesystem
>> [ 372.648446] XFS (vdd): Ending clean mount
>> [ 373.277786] ------------[ cut here ]------------
>> [ 373.279820] ------------[ cut here ]------------
>> [ 373.279826] WARNING: CPU: 2 PID: 29166 at fs/iomap.c:1016 iomap_dio_rw+0x346/0x360
>> [ 373.279827] Modules linked in: xfs dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver sr_mod cdrom sg ata_generic pata_acpi ppdev snd_pcm snd_timer snd soundcore serio_raw pcspkr i2c_piix4 ata_piix parport_pc parport floppy libata ip_tables
>> [ 373.279842] CPU: 2 PID: 29166 Comm: fio Not tainted 4.14.0-rc6 #1
>> [ 373.279843] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
>> [ 373.279844] task: ffff88008bd34a80 task.stack: ffffc900065f8000
>> [ 373.279846] RIP: 0010:iomap_dio_rw+0x346/0x360
>> [ 373.279847] RSP: 0000:ffffc900065fbd40 EFLAGS: 00010286
>> [ 373.279848] RAX: 00000000fffffff0 RBX: ffffc900065fbe98 RCX: ffffc900065fbb90
>> [ 373.279848] RDX: 0000000080000000 RSI: 00000000000002b4 RDI: ffffffff81cb9ccb
>> [ 373.279849] RBP: ffffc900065fbdd0 R08: 0000000000000000 R09: 00007f6fe0948000
>> [ 373.279850] R10: ffff88013ffd2000 R11: 0000000000000400 R12: 000000000005f400
>> [ 373.279850] R13: ffffc900065fbe70 R14: ffff88007a6fb7e0 R15: 0000000000000011
>> [ 373.279852] FS: 00007f6fb3ce1700(0000) GS:ffff88013fd00000(0000) knlGS:0000000000000000
>> [ 373.279852] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [ 373.279853] CR2: 00007f6fe088ac00 CR3: 0000000078eb7000 CR4: 00000000000006e0
>> [ 373.279856] Call Trace:
>> [ 373.279913] ? xfs_file_aio_write_checks+0x137/0x190 [xfs]
>> [ 373.279933] xfs_file_dio_aio_write+0xec/0x260 [xfs]
>> [ 373.279952] ? xfs_file_dio_aio_write+0xec/0x260 [xfs]
>> [ 373.279970] xfs_file_write_iter+0x98/0x150 [xfs]
>> [ 373.279972] __vfs_write+0xd3/0x150
>> [ 373.279974] vfs_write+0xc7/0x1b0
>> [ 373.279975] SyS_pwrite64+0x76/0x90
>> [ 373.279977] entry_SYSCALL_64_fastpath+0x1a/0xa5
>> [ 373.279979] RIP: 0033:0x7f6fcc3019d3
>> [ 373.279979] RSP: 002b:00007f6fb3cdeb20 EFLAGS: 00000293 ORIG_RAX: 0000000000000012
>> [ 373.279980] RAX: ffffffffffffffda RBX: 00007f6fb3ce1700 RCX: 00007f6fcc3019d3
>> [ 373.279981] RDX: 0000000000000400 RSI: 000055e925caac00 RDI: 0000000000000004
>> [ 373.279982] RBP: 00007f6fe095fee0 R08: 0000000000000000 R09: 00007f6fb3ce1700
>> [ 373.279982] R10: 000000000005f400 R11: 0000000000000293 R12: 0000000000000000
>> [ 373.279983] R13: 00007f6fe095fe2f R14: 0000000000000003 R15: 00007f6fe095fee0
>> [ 373.279984] Code: 21 00 00 f6 58 2b 81 49 c7 44 24 08 00 00 00 00 4c 89
>> f7 e8 ed ec ff ff e9 13 fe ff ff 48 8d 7d a0 e8 2f a9 13 00 e9 33 ff ff ff
>> <0f> ff e9 6a fe ff ff 48 c7 c0 f4 ff ff ff e9 f2 fd ff ff 90 90 [
>> 373.279998] ---[ end trace 01e002a8aa01ae25 ]---
>> [ 373.334858] WARNING: CPU: 3 PID: 29158 at fs/iomap.c:755 iomap_dio_complete+0xe5/0x100
>> [ 373.336817] Modules linked in: xfs dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver sr_mod cdrom sg ata_generic pata_acpi ppdev snd_pcm snd_timer snd soundcore serio_raw pcspkr i2c_piix4 ata_piix parport_pc parport floppy libata ip_tables
>> [ 373.341916] CPU: 3 PID: 29158 Comm: fio Tainted: G W 4.14.0-rc6 #1
>> [ 373.344151] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
>> [ 373.346702] task: ffff88008bd32540 task.stack: ffffc900065c8000
>> [ 373.348512] RIP: 0010:iomap_dio_complete+0xe5/0x100
>> [ 373.349999] RSP: 0018:ffffc900065cbd08 EFLAGS: 00010286
>> [ 373.351604] RAX: 00000000fffffff0 RBX: ffff88007a5611e0 RCX: ffffc900065cbb58
>> [ 373.353690] RDX: 0000000080000000 RSI: 00000000000002b4 RDI: ffffffff81cb9ccb
>> [ 373.355693] RBP: ffffc900065cbd30 R08: 0000000000000000 R09: 00007f6fe08cb000
>> [ 373.357775] R10: ffffc900065cb8c8 R11: 0000000000000207 R12: 0000000000000400
>> [ 373.359837] R13: ffffc900065cbe70 R14: ffff8800787657d8 R15: 0000000000002400
>> [ 373.361878] FS: 00007f6fe095aa80(0000) GS:ffff88013fd80000(0000) knlGS:0000000000000000
>> [ 373.364333] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [ 373.366071] CR2: 00007f6fcc38c1ee CR3: 000000007949a000 CR4: 00000000000006e0
>> [ 373.368170] Call Trace:
>> [ 373.369165] ? iomap_dio_rw+0x2a6/0x360
>> [ 373.370467] iomap_dio_rw+0x333/0x360
>> [ 373.371768] xfs_file_dio_aio_write+0xec/0x260 [xfs]
>> [ 373.373069] ? xfs_file_dio_aio_write+0xec/0x260 [xfs]
>> [ 373.374410] xfs_file_write_iter+0x98/0x150 [xfs]
>> [ 373.375820] __vfs_write+0xd3/0x150
>> [ 373.377096] vfs_write+0xc7/0x1b0
>> [ 373.378436] SyS_pwrite64+0x76/0x90
>> [ 373.379897] entry_SYSCALL_64_fastpath+0x1a/0xa5
>> [ 373.381488] RIP: 0033:0x7f6fcc3019d3
>> [ 373.382781] RSP: 002b:00007f6fe0957ea0 EFLAGS: 00000293 ORIG_RAX: 0000000000000012
>> [ 373.385064] RAX: ffffffffffffffda RBX: 000055e925cb07f0 RCX: 00007f6fcc3019d3
>> [ 373.386913] RDX: 0000000000000400 RSI: 000055e925ca9800 RDI: 0000000000000005
>> [ 373.388911] RBP: 0000000000000005 R08: 0000000000000000 R09: 00007f6fe095aa80
>> [ 373.390588] R10: 0000000000002400 R11: 0000000000000293 R12: 00007f6fe095aa80
>> [ 373.392296] R13: 00007f6fe0957ec0 R14: 0000000000000003 R15: 000055e925caf360
>> [ 373.394229] Code: e0 eb 9f 48 8b 77 18 e9 78 ff ff ff 4c 89 fa 48 03 53
>> 18 4c 89 fe 48 c1 fe 0c 48 83 ea 01 48 c1 fa 0c e8 0f 93 f0 ff 85 c0 74 93
>> <0f> ff eb 8f 4c 63 67 28 e9 54 ff ff ff 0f 1f 40 00 66 2e 0f 1f [
>> 373.398812] ---[ end trace 01e002a8aa01ae26 ]---
>> [ 392.887639] XFS (vdd): Unmounting Filesystem
>> [ 392.985715] generic/095 [failed, exit status 1] - output mismatch (see /lkp/benchmarks/xfstests/results//generic/095.out.bad)
>>
>> Thanks,
>> Fengguang
--
Filipe David Manana,
“Whether you think you can, or you think you can't — you're right.”
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [btrfs] WARNING: CPU: 0 PID: 6379 at fs/direct-io.c:293 dio_complete+0x1d4/0x220
2017-10-31 6:54 ` Eryu Guan
2017-10-31 7:10 ` Fengguang Wu
@ 2017-11-06 1:13 ` Eric Biggers
2017-11-13 19:13 ` Eric Biggers
1 sibling, 1 reply; 13+ messages in thread
From: Eric Biggers @ 2017-11-06 1:13 UTC (permalink / raw)
To: Eryu Guan
Cc: Fengguang Wu, Lukas Czerner, Jan Kara, Darrick J. Wong,
Jeff Moyer, Jens Axboe, Linus Torvalds, Linux Kernel Mailing List,
linux-fsdevel
On Tue, Oct 31, 2017 at 02:54:17PM +0800, Eryu Guan wrote:
> On Tue, Oct 31, 2017 at 01:10:41AM +0100, Fengguang Wu wrote:
> > Hi Eryu,
> >
> > On Mon, Oct 30, 2017 at 03:44:29PM +0800, Eryu Guan wrote:
> > > Hi Fengguang,
> > >
> > > On Mon, Oct 30, 2017 at 08:20:21AM +0100, Fengguang Wu wrote:
> > > > CC fsdevel.
> > > >
> > > > On Sun, Oct 29, 2017 at 11:51:55PM +0100, Fengguang Wu wrote:
> > > > > Hi Linus,
> > > > >
> > > > > Up to now we see the below boot error/warnings when testing v4.14-rc6.
> > > > >
> > > > > They hit the RC release mainly due to various imperfections in 0day's
> > > > > auto bisection. So I manually list them here and CC the likely easy to
> > > > > debug ones to the corresponding maintainers in the followup emails.
> > > > >
> > > > > boot_successes: 4700
> > > > > boot_failures: 247
> > > >
> > > > [...]
> > > >
> > > > > WARNING:at_fs/direct-io.c:#dio_complete: 7
> > > > > WARNING:at_fs/iomap.c:#iomap_dio_complete: 3
> > > > > WARNING:at_fs/iomap.c:#iomap_dio_rw: 1
> > > >
> > > > The first warning happens on btrfs and is bisected to this commit.
> > > > The other 2 warnings happen on xfs.
> > >
> > > I noticed that the warnings are triggered by generic/095 and
> > > generic/208, they're known to generate such warnings and I think these
> > > warnings are kind of 'known issue', please see comments above
> > > _filter_aiodio_dmesg() in fstests/common/filter.
> > >
> > > Please make sure your local fstests contains the following 3 commits:
> > >
> > > ca93e26865ab common: move _filter_xfs_dmesg() to common/filter
> > > 5aa662733ab0 common: turn _filter_xfs_dmesg() into _filter_aiodio_dmesg()
> > > 228aee780f13 generic/036,208: whitelist [iomap_]dio_complete() WARNs
> >
> > OK.
> >
> > > we filtered out such warnings in fstests on purpose so the affected
> > > tests won't fail because of such dmesg warnings.
> >
> > We may also teach 0day robot to ignore the warning when running the
> > above 2 fstests.
> >
> > The more generic way of filtering may be to inject a dmesg line like
> >
> > THIS TEST WILL TRIGGER KERNEL WARNING, PLEASE IGNORE.
> >
> > just before the specific tests startup. Then 3rd party dmesg parsing
> > scripts can handle such purpose-made warnings in a uniform way.
>
> fstests doesn't know, prior to the test, if the warnings the test is
> going to trigger are intentional or real bugs, fstests records the dmesg
> log and analyzes the log after test and reports PASS if all the warnings
> are intentional (based on the whitelist filter).
>
> But I think it's possible to insert such a message to dmesg *after* test
> if fstests finds that all the warnings are intentional. Does that work
> for 0day robot?
>
Please don't hack around this in tests. WARN_ON() is only meant to be used to
indicate kernel bugs. It should never be reachable by userspace. I'm seeing
the same WARNING when fuzzing with syzkaller (see below). If it's supposed to
be reachable by userspace, then it *must* be removed, or perhaps replaced with
pr_warn_ratelimited() with a message that is actually understandable, like
"process 'foo' mixes buffered I/O with direct I/O".
WARNING: CPU: 2 PID: 27399 at fs/direct-io.c:293 dio_complete+0x532/0x7e0 fs/direct-io.c:293
Kernel panic - not syncing: panic_on_warn set ...
CPU: 2 PID: 27399 Comm: syz-executor0 Not tainted 4.14.0-rc7-00174-g2d6349944d96 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.3-20171021_125229-anatol 04/01/2014
Call Trace:
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x115/0x1da lib/dump_stack.c:53
panic+0x1b4/0x39c kernel/panic.c:181
__warn+0x1bf/0x1d4 kernel/panic.c:542
report_bug+0x23f/0x2c0 lib/bug.c:184
fixup_bug+0x3f/0x90 arch/x86/kernel/traps.c:178
do_trap_no_signal arch/x86/kernel/traps.c:212 [inline]
do_trap+0x260/0x390 arch/x86/kernel/traps.c:261
do_error_trap+0x11c/0x350 arch/x86/kernel/traps.c:298
do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:311
invalid_op+0x18/0x20 arch/x86/entry/entry_64.S:906
RIP: 0010:dio_complete+0x532/0x7e0 fs/direct-io.c:293
RSP: 0018:ffff880039dcf208 EFLAGS: 00010212
RAX: 0000000000010000 RBX: ffff880061a1f040 RCX: 0000000000000ebd
RDX: ffffffff816b97f2 RSI: ffffc90000696000 RDI: ffffed00073b9e36
RBP: ffff880039dcf250 R08: ffff88007ffd709c R09: ffff880037874640
R10: 00000000000002e7 R11: ffff88007ffd709d R12: 0000000000000200
R13: 0000000000000200 R14: ffff8800593cf8f0 R15: ffffffff898005d0
do_blockdev_direct_IO fs/direct-io.c:1374 [inline]
__blockdev_direct_IO+0x46b9/0x8a70 fs/direct-io.c:1400
ext4_direct_IO_write fs/ext4/inode.c:3685 [inline]
ext4_direct_IO+0x98e/0x1950 fs/ext4/inode.c:3807
generic_file_direct_write+0x1e6/0x440 mm/filemap.c:2928
__generic_file_write_iter+0x21f/0x5b0 mm/filemap.c:3107
ext4_file_write_iter+0x5c2/0x10a0 fs/ext4/file.c:264
call_write_iter include/linux/fs.h:1771 [inline]
new_sync_write fs/read_write.c:469 [inline]
__vfs_write+0x681/0x970 fs/read_write.c:482
vfs_write+0x18b/0x480 fs/read_write.c:544
SYSC_write fs/read_write.c:589 [inline]
SyS_write+0xef/0x230 fs/read_write.c:581
entry_SYSCALL_64_fastpath+0x1f/0xbe
RIP: 0033:0x4585c9
RSP: 002b:00007ff87eff2bd8 EFLAGS: 00000216 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000738020 RCX: 00000000004585c9
RDX: 0000000000000200 RSI: 0000000020010000 RDI: 0000000000000015
RBP: 0000000000000046 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000216 R12: 00000000006d1048
R13: 00000000ffffffff R14: 00007ff87eff36bc R15: 0000000000000000
Dumping ftrace buffer:
(ftrace buffer empty)
Kernel Offset: 0x8000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
Rebooting in 86400 seconds..
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [btrfs] WARNING: CPU: 0 PID: 6379 at fs/direct-io.c:293 dio_complete+0x1d4/0x220
2017-11-06 1:13 ` Eric Biggers
@ 2017-11-13 19:13 ` Eric Biggers
2017-11-13 19:16 ` Jens Axboe
0 siblings, 1 reply; 13+ messages in thread
From: Eric Biggers @ 2017-11-13 19:13 UTC (permalink / raw)
To: Eryu Guan
Cc: Fengguang Wu, Lukas Czerner, Jan Kara, Darrick J. Wong,
Jeff Moyer, Jens Axboe, Linus Torvalds, Linux Kernel Mailing List,
linux-fsdevel
On Sun, Nov 05, 2017 at 05:13:06PM -0800, Eric Biggers wrote:
> On Tue, Oct 31, 2017 at 02:54:17PM +0800, Eryu Guan wrote:
> > On Tue, Oct 31, 2017 at 01:10:41AM +0100, Fengguang Wu wrote:
> > > Hi Eryu,
> > >
> > > On Mon, Oct 30, 2017 at 03:44:29PM +0800, Eryu Guan wrote:
> > > > Hi Fengguang,
> > > >
> > > > On Mon, Oct 30, 2017 at 08:20:21AM +0100, Fengguang Wu wrote:
> > > > > CC fsdevel.
> > > > >
> > > > > On Sun, Oct 29, 2017 at 11:51:55PM +0100, Fengguang Wu wrote:
> > > > > > Hi Linus,
> > > > > >
> > > > > > Up to now we see the below boot error/warnings when testing v4.14-rc6.
> > > > > >
> > > > > > They hit the RC release mainly due to various imperfections in 0day's
> > > > > > auto bisection. So I manually list them here and CC the likely easy to
> > > > > > debug ones to the corresponding maintainers in the followup emails.
> > > > > >
> > > > > > boot_successes: 4700
> > > > > > boot_failures: 247
> > > > >
> > > > > [...]
> > > > >
> > > > > > WARNING:at_fs/direct-io.c:#dio_complete: 7
> > > > > > WARNING:at_fs/iomap.c:#iomap_dio_complete: 3
> > > > > > WARNING:at_fs/iomap.c:#iomap_dio_rw: 1
> > > > >
> > > > > The first warning happens on btrfs and is bisected to this commit.
> > > > > The other 2 warnings happen on xfs.
> > > >
> > > > I noticed that the warnings are triggered by generic/095 and
> > > > generic/208, they're known to generate such warnings and I think these
> > > > warnings are kind of 'known issue', please see comments above
> > > > _filter_aiodio_dmesg() in fstests/common/filter.
> > > >
> > > > Please make sure your local fstests contains the following 3 commits:
> > > >
> > > > ca93e26865ab common: move _filter_xfs_dmesg() to common/filter
> > > > 5aa662733ab0 common: turn _filter_xfs_dmesg() into _filter_aiodio_dmesg()
> > > > 228aee780f13 generic/036,208: whitelist [iomap_]dio_complete() WARNs
> > >
> > > OK.
> > >
> > > > we filtered out such warnings in fstests on purpose so the affected
> > > > tests won't fail because of such dmesg warnings.
> > >
> > > We may also teach 0day robot to ignore the warning when running the
> > > above 2 fstests.
> > >
> > > The more generic way of filtering may be to inject a dmesg line like
> > >
> > > THIS TEST WILL TRIGGER KERNEL WARNING, PLEASE IGNORE.
> > >
> > > just before the specific tests startup. Then 3rd party dmesg parsing
> > > scripts can handle such purpose-made warnings in a uniform way.
> >
> > fstests doesn't know, prior to the test, if the warnings the test is
> > going to trigger are intentional or real bugs, fstests records the dmesg
> > log and analyzes the log after test and reports PASS if all the warnings
> > are intentional (based on the whitelist filter).
> >
> > But I think it's possible to insert such a message to dmesg *after* test
> > if fstests finds that all the warnings are intentional. Does that work
> > for 0day robot?
> >
>
> Please don't hack around this in tests. WARN_ON() is only meant to be used to
> indicate kernel bugs. It should never be reachable by userspace. I'm seeing
> the same WARNING when fuzzing with syzkaller (see below). If it's supposed to
> be reachable by userspace, then it *must* be removed, or perhaps replaced with
> pr_warn_ratelimited() with a message that is actually understandable, like
> "process 'foo' mixes buffered I/O with direct I/O".
>
> WARNING: CPU: 2 PID: 27399 at fs/direct-io.c:293 dio_complete+0x532/0x7e0 fs/direct-io.c:293
> Kernel panic - not syncing: panic_on_warn set ...
>
> CPU: 2 PID: 27399 Comm: syz-executor0 Not tainted 4.14.0-rc7-00174-g2d6349944d96 #1
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.3-20171021_125229-anatol 04/01/2014
> Call Trace:
> __dump_stack lib/dump_stack.c:17 [inline]
> dump_stack+0x115/0x1da lib/dump_stack.c:53
> panic+0x1b4/0x39c kernel/panic.c:181
> __warn+0x1bf/0x1d4 kernel/panic.c:542
> report_bug+0x23f/0x2c0 lib/bug.c:184
> fixup_bug+0x3f/0x90 arch/x86/kernel/traps.c:178
> do_trap_no_signal arch/x86/kernel/traps.c:212 [inline]
> do_trap+0x260/0x390 arch/x86/kernel/traps.c:261
> do_error_trap+0x11c/0x350 arch/x86/kernel/traps.c:298
> do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:311
> invalid_op+0x18/0x20 arch/x86/entry/entry_64.S:906
> RIP: 0010:dio_complete+0x532/0x7e0 fs/direct-io.c:293
> RSP: 0018:ffff880039dcf208 EFLAGS: 00010212
> RAX: 0000000000010000 RBX: ffff880061a1f040 RCX: 0000000000000ebd
> RDX: ffffffff816b97f2 RSI: ffffc90000696000 RDI: ffffed00073b9e36
> RBP: ffff880039dcf250 R08: ffff88007ffd709c R09: ffff880037874640
> R10: 00000000000002e7 R11: ffff88007ffd709d R12: 0000000000000200
> R13: 0000000000000200 R14: ffff8800593cf8f0 R15: ffffffff898005d0
> do_blockdev_direct_IO fs/direct-io.c:1374 [inline]
> __blockdev_direct_IO+0x46b9/0x8a70 fs/direct-io.c:1400
> ext4_direct_IO_write fs/ext4/inode.c:3685 [inline]
> ext4_direct_IO+0x98e/0x1950 fs/ext4/inode.c:3807
> generic_file_direct_write+0x1e6/0x440 mm/filemap.c:2928
> __generic_file_write_iter+0x21f/0x5b0 mm/filemap.c:3107
> ext4_file_write_iter+0x5c2/0x10a0 fs/ext4/file.c:264
> call_write_iter include/linux/fs.h:1771 [inline]
> new_sync_write fs/read_write.c:469 [inline]
> __vfs_write+0x681/0x970 fs/read_write.c:482
> vfs_write+0x18b/0x480 fs/read_write.c:544
> SYSC_write fs/read_write.c:589 [inline]
> SyS_write+0xef/0x230 fs/read_write.c:581
> entry_SYSCALL_64_fastpath+0x1f/0xbe
> RIP: 0033:0x4585c9
> RSP: 002b:00007ff87eff2bd8 EFLAGS: 00000216 ORIG_RAX: 0000000000000001
> RAX: ffffffffffffffda RBX: 0000000000738020 RCX: 00000000004585c9
> RDX: 0000000000000200 RSI: 0000000020010000 RDI: 0000000000000015
> RBP: 0000000000000046 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000216 R12: 00000000006d1048
> R13: 00000000ffffffff R14: 00007ff87eff36bc R15: 0000000000000000
> Dumping ftrace buffer:
> (ftrace buffer empty)
> Kernel Offset: 0x8000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
> Rebooting in 86400 seconds..
Ping, this still needs to be fixed. Lukas, it looks like you added the bogus
WARN_ON_ONCE(); can you please fix it? If not you will get bug reports from
fuzzers forever.
Eric
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [btrfs] WARNING: CPU: 0 PID: 6379 at fs/direct-io.c:293 dio_complete+0x1d4/0x220
2017-11-13 19:13 ` Eric Biggers
@ 2017-11-13 19:16 ` Jens Axboe
2017-11-13 19:21 ` Linus Torvalds
0 siblings, 1 reply; 13+ messages in thread
From: Jens Axboe @ 2017-11-13 19:16 UTC (permalink / raw)
To: Eric Biggers, Eryu Guan
Cc: Fengguang Wu, Lukas Czerner, Jan Kara, Darrick J. Wong,
Jeff Moyer, Linus Torvalds, Linux Kernel Mailing List,
linux-fsdevel
On 11/13/2017 12:13 PM, Eric Biggers wrote:
> On Sun, Nov 05, 2017 at 05:13:06PM -0800, Eric Biggers wrote:
>> On Tue, Oct 31, 2017 at 02:54:17PM +0800, Eryu Guan wrote:
>>> On Tue, Oct 31, 2017 at 01:10:41AM +0100, Fengguang Wu wrote:
>>>> Hi Eryu,
>>>>
>>>> On Mon, Oct 30, 2017 at 03:44:29PM +0800, Eryu Guan wrote:
>>>>> Hi Fengguang,
>>>>>
>>>>> On Mon, Oct 30, 2017 at 08:20:21AM +0100, Fengguang Wu wrote:
>>>>>> CC fsdevel.
>>>>>>
>>>>>> On Sun, Oct 29, 2017 at 11:51:55PM +0100, Fengguang Wu wrote:
>>>>>>> Hi Linus,
>>>>>>>
>>>>>>> Up to now we see the below boot error/warnings when testing v4.14-rc6.
>>>>>>>
>>>>>>> They hit the RC release mainly due to various imperfections in 0day's
>>>>>>> auto bisection. So I manually list them here and CC the likely easy to
>>>>>>> debug ones to the corresponding maintainers in the followup emails.
>>>>>>>
>>>>>>> boot_successes: 4700
>>>>>>> boot_failures: 247
>>>>>>
>>>>>> [...]
>>>>>>
>>>>>>> WARNING:at_fs/direct-io.c:#dio_complete: 7
>>>>>>> WARNING:at_fs/iomap.c:#iomap_dio_complete: 3
>>>>>>> WARNING:at_fs/iomap.c:#iomap_dio_rw: 1
>>>>>>
>>>>>> The first warning happens on btrfs and is bisected to this commit.
>>>>>> The other 2 warnings happen on xfs.
>>>>>
>>>>> I noticed that the warnings are triggered by generic/095 and
>>>>> generic/208, they're known to generate such warnings and I think these
>>>>> warnings are kind of 'known issue', please see comments above
>>>>> _filter_aiodio_dmesg() in fstests/common/filter.
>>>>>
>>>>> Please make sure your local fstests contains the following 3 commits:
>>>>>
>>>>> ca93e26865ab common: move _filter_xfs_dmesg() to common/filter
>>>>> 5aa662733ab0 common: turn _filter_xfs_dmesg() into _filter_aiodio_dmesg()
>>>>> 228aee780f13 generic/036,208: whitelist [iomap_]dio_complete() WARNs
>>>>
>>>> OK.
>>>>
>>>>> we filtered out such warnings in fstests on purpose so the affected
>>>>> tests won't fail because of such dmesg warnings.
>>>>
>>>> We may also teach 0day robot to ignore the warning when running the
>>>> above 2 fstests.
>>>>
>>>> The more generic way of filtering may be to inject a dmesg line like
>>>>
>>>> THIS TEST WILL TRIGGER KERNEL WARNING, PLEASE IGNORE.
>>>>
>>>> just before the specific tests startup. Then 3rd party dmesg parsing
>>>> scripts can handle such purpose-made warnings in a uniform way.
>>>
>>> fstests doesn't know, prior to the test, if the warnings the test is
>>> going to trigger are intentional or real bugs, fstests records the dmesg
>>> log and analyzes the log after test and reports PASS if all the warnings
>>> are intentional (based on the whitelist filter).
>>>
>>> But I think it's possible to insert such a message to dmesg *after* test
>>> if fstests finds that all the warnings are intentional. Does that work
>>> for 0day robot?
>>>
>>
>> Please don't hack around this in tests. WARN_ON() is only meant to be used to
>> indicate kernel bugs. It should never be reachable by userspace. I'm seeing
>> the same WARNING when fuzzing with syzkaller (see below). If it's supposed to
>> be reachable by userspace, then it *must* be removed, or perhaps replaced with
>> pr_warn_ratelimited() with a message that is actually understandable, like
>> "process 'foo' mixes buffered I/O with direct I/O".
>>
>> WARNING: CPU: 2 PID: 27399 at fs/direct-io.c:293 dio_complete+0x532/0x7e0 fs/direct-io.c:293
>> Kernel panic - not syncing: panic_on_warn set ...
>>
>> CPU: 2 PID: 27399 Comm: syz-executor0 Not tainted 4.14.0-rc7-00174-g2d6349944d96 #1
>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.3-20171021_125229-anatol 04/01/2014
>> Call Trace:
>> __dump_stack lib/dump_stack.c:17 [inline]
>> dump_stack+0x115/0x1da lib/dump_stack.c:53
>> panic+0x1b4/0x39c kernel/panic.c:181
>> __warn+0x1bf/0x1d4 kernel/panic.c:542
>> report_bug+0x23f/0x2c0 lib/bug.c:184
>> fixup_bug+0x3f/0x90 arch/x86/kernel/traps.c:178
>> do_trap_no_signal arch/x86/kernel/traps.c:212 [inline]
>> do_trap+0x260/0x390 arch/x86/kernel/traps.c:261
>> do_error_trap+0x11c/0x350 arch/x86/kernel/traps.c:298
>> do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:311
>> invalid_op+0x18/0x20 arch/x86/entry/entry_64.S:906
>> RIP: 0010:dio_complete+0x532/0x7e0 fs/direct-io.c:293
>> RSP: 0018:ffff880039dcf208 EFLAGS: 00010212
>> RAX: 0000000000010000 RBX: ffff880061a1f040 RCX: 0000000000000ebd
>> RDX: ffffffff816b97f2 RSI: ffffc90000696000 RDI: ffffed00073b9e36
>> RBP: ffff880039dcf250 R08: ffff88007ffd709c R09: ffff880037874640
>> R10: 00000000000002e7 R11: ffff88007ffd709d R12: 0000000000000200
>> R13: 0000000000000200 R14: ffff8800593cf8f0 R15: ffffffff898005d0
>> do_blockdev_direct_IO fs/direct-io.c:1374 [inline]
>> __blockdev_direct_IO+0x46b9/0x8a70 fs/direct-io.c:1400
>> ext4_direct_IO_write fs/ext4/inode.c:3685 [inline]
>> ext4_direct_IO+0x98e/0x1950 fs/ext4/inode.c:3807
>> generic_file_direct_write+0x1e6/0x440 mm/filemap.c:2928
>> __generic_file_write_iter+0x21f/0x5b0 mm/filemap.c:3107
>> ext4_file_write_iter+0x5c2/0x10a0 fs/ext4/file.c:264
>> call_write_iter include/linux/fs.h:1771 [inline]
>> new_sync_write fs/read_write.c:469 [inline]
>> __vfs_write+0x681/0x970 fs/read_write.c:482
>> vfs_write+0x18b/0x480 fs/read_write.c:544
>> SYSC_write fs/read_write.c:589 [inline]
>> SyS_write+0xef/0x230 fs/read_write.c:581
>> entry_SYSCALL_64_fastpath+0x1f/0xbe
>> RIP: 0033:0x4585c9
>> RSP: 002b:00007ff87eff2bd8 EFLAGS: 00000216 ORIG_RAX: 0000000000000001
>> RAX: ffffffffffffffda RBX: 0000000000738020 RCX: 00000000004585c9
>> RDX: 0000000000000200 RSI: 0000000020010000 RDI: 0000000000000015
>> RBP: 0000000000000046 R08: 0000000000000000 R09: 0000000000000000
>> R10: 0000000000000000 R11: 0000000000000216 R12: 00000000006d1048
>> R13: 00000000ffffffff R14: 00007ff87eff36bc R15: 0000000000000000
>> Dumping ftrace buffer:
>> (ftrace buffer empty)
>> Kernel Offset: 0x8000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
>> Rebooting in 86400 seconds..
>
> Ping, this still needs to be fixed. Lukas, it looks like you added the bogus
> WARN_ON_ONCE(); can you please fix it? If not you will get bug reports from
> fuzzers forever.
I would tend to agree with you, it's annoying to dump a full stack trace
for an expected (even for a rare situation) condition. But it's not the
first one, there's also one in XFS that always triggers for test runs. I
complained about that one in the past.
--
Jens Axboe
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [btrfs] WARNING: CPU: 0 PID: 6379 at fs/direct-io.c:293 dio_complete+0x1d4/0x220
2017-11-13 19:16 ` Jens Axboe
@ 2017-11-13 19:21 ` Linus Torvalds
2017-11-13 21:56 ` Darrick J. Wong
0 siblings, 1 reply; 13+ messages in thread
From: Linus Torvalds @ 2017-11-13 19:21 UTC (permalink / raw)
To: Jens Axboe
Cc: Eric Biggers, Eryu Guan, Fengguang Wu, Lukas Czerner, Jan Kara,
Darrick J. Wong, Jeff Moyer, Linux Kernel Mailing List,
linux-fsdevel
On Mon, Nov 13, 2017 at 11:16 AM, Jens Axboe <axboe@kernel.dk> wrote:
>
> I would tend to agree with you, it's annoying to dump a full stack trace
> for an expected (even for a rare situation) condition. But it's not the
> first one, there's also one in XFS that always triggers for test runs. I
> complained about that one in the past.
Yeah, we should always consider a WARN_ON() that is triggerable from
user space to be a kernel bug.
If it's a "cannot happen", then the bug should be fixed.
If it's a "can happen, but I want to see the trace", then you just got
the trace and you're done, and the WARN_ON() should be removed.
It could possibly be replaced with a "pr_warn()" or something, so that
it still shows up as a "the user did something dodgy", but honestly,
even that is questionable. We do that for things like "we just removed
support for this, we want to see if somebody is using it"
So in no case is "let's just keep things as is" the right answer.
Linus
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [btrfs] WARNING: CPU: 0 PID: 6379 at fs/direct-io.c:293 dio_complete+0x1d4/0x220
2017-11-13 19:21 ` Linus Torvalds
@ 2017-11-13 21:56 ` Darrick J. Wong
2017-11-13 22:01 ` Linus Torvalds
0 siblings, 1 reply; 13+ messages in thread
From: Darrick J. Wong @ 2017-11-13 21:56 UTC (permalink / raw)
To: Linus Torvalds
Cc: Jens Axboe, Eric Biggers, Eryu Guan, Fengguang Wu, Lukas Czerner,
Jan Kara, Jeff Moyer, Linux Kernel Mailing List, linux-fsdevel,
xfs
[cc xfs list]
[We're discussing the WARN_ON in *dio_complete when invalidation fails]
[yes, again]
On Mon, Nov 13, 2017 at 11:21:50AM -0800, Linus Torvalds wrote:
> On Mon, Nov 13, 2017 at 11:16 AM, Jens Axboe <axboe@kernel.dk> wrote:
> >
> > I would tend to agree with you, it's annoying to dump a full stack trace
> > for an expected (even for a rare situation) condition. But it's not the
> > first one, there's also one in XFS that always triggers for test runs. I
> > complained about that one in the past.
>
> Yeah, we should always consider a WARN_ON() that is triggerable from
> user space to be a kernel bug.
>
> If it's a "cannot happen", then the bug should be fixed.
>
> If it's a "can happen, but I want to see the trace", then you just got
> the trace and you're done, and the WARN_ON() should be removed.
>
> It could possibly be replaced with a "pr_warn()" or something, so that
> it still shows up as a "the user did something dodgy", but honestly,
> even that is questionable. We do that for things like "we just removed
> support for this, we want to see if somebody is using it"
>
> So in no case is "let's just keep things as is" the right answer.
Wellll... the WARN_ON in question happens when:
a) two programs race to write to the same part of a file, one via the page
cache and the other via directio
b) the dio write completes, tries to invalidate the page cache, and fails
because the corresponding page cannot be invalidated
At this point, the page cache contents don't reflect what's on disk, so
I don't think we can quietly ignore the situation. Clearly, enough
people dislike the WARN to complain repeatedly, so perhaps we should try
to barf evidence of this situation up to userspace? The dio write
succeeded, which is why we don't turn err into ret; but now that we can
store and forward error codes through the mapping, how about we just:
errseq_set(dio->inode->i_mapping->wb_error, -EIO);
and then let the writers pick up the EIO the next time they fsync?
Though I can already imagine the complaints about writes that used to
work and suddenly start returning error codes.
--D
> Linus
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [btrfs] WARNING: CPU: 0 PID: 6379 at fs/direct-io.c:293 dio_complete+0x1d4/0x220
2017-11-13 21:56 ` Darrick J. Wong
@ 2017-11-13 22:01 ` Linus Torvalds
2017-11-14 17:17 ` Theodore Ts'o
0 siblings, 1 reply; 13+ messages in thread
From: Linus Torvalds @ 2017-11-13 22:01 UTC (permalink / raw)
To: Darrick J. Wong
Cc: Jens Axboe, Eric Biggers, Eryu Guan, Fengguang Wu, Lukas Czerner,
Jan Kara, Jeff Moyer, Linux Kernel Mailing List, linux-fsdevel,
xfs
On Mon, Nov 13, 2017 at 1:56 PM, Darrick J. Wong
<darrick.wong@oracle.com> wrote:
> Wellll... the WARN_ON in question happens when:
>
> a) two programs race to write to the same part of a file, one via the page
> cache and the other via directio
> b) the dio write completes, tries to invalidate the page cache, and fails
> because the corresponding page cannot be invalidated
>
> At this point, the page cache contents don't reflect what's on disk, so
> I don't think we can quietly ignore the situation.
Direct-IO causing cache coherency issues? Really? I'm shocked and
surptised that that could _possibly_ happen.
But why the hell would you want a back-trace for it?
IOW, if you want to warn about it, use "pr_warn_ratelimited()" or
something. What did the backtrace and "looks like a kernel oops"
really help with?
And in the end, maybe even the warning is pointless. You used
direct-IO and cached IO at the same time, and you got coherency
issues. What did you expect? directio is fundamentally broken.
Linus
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [btrfs] WARNING: CPU: 0 PID: 6379 at fs/direct-io.c:293 dio_complete+0x1d4/0x220
2017-11-13 22:01 ` Linus Torvalds
@ 2017-11-14 17:17 ` Theodore Ts'o
0 siblings, 0 replies; 13+ messages in thread
From: Theodore Ts'o @ 2017-11-14 17:17 UTC (permalink / raw)
To: Linus Torvalds
Cc: Darrick J. Wong, Jens Axboe, Eric Biggers, Eryu Guan,
Fengguang Wu, Lukas Czerner, Jan Kara, Jeff Moyer,
Linux Kernel Mailing List, linux-fsdevel, xfs
On Mon, Nov 13, 2017 at 02:01:36PM -0800, Linus Torvalds wrote:
>
> And in the end, maybe even the warning is pointless. You used
> direct-IO and cached IO at the same time, and you got coherency
> issues. What did you expect? directio is fundamentally broken.
A single warning per inode, "page cache coherency broken due to direct
I/O; userspace did a dumb thing; oh, well" is probably more than
sufficient. The reason to have the warning is so that when the user
complains about the file system bug, we can point at the "userspace
did a dumb thing" warning message.
- Ted
^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2017-11-14 17:17 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <CA+55aFxSJGeN=2X-uX-on1Uq2Nb8+v1aiMDz5H1+tKW_N5Q+6g@mail.gmail.com>
[not found] ` <20171029225155.qcum5i75awrt5tzm@wfg-t540p.sh.intel.com>
2017-10-30 7:20 ` [btrfs] WARNING: CPU: 0 PID: 6379 at fs/direct-io.c:293 dio_complete+0x1d4/0x220 Fengguang Wu
2017-10-30 7:44 ` Eryu Guan
2017-10-31 0:10 ` Fengguang Wu
2017-10-31 6:54 ` Eryu Guan
2017-10-31 7:10 ` Fengguang Wu
2017-11-06 1:13 ` Eric Biggers
2017-11-13 19:13 ` Eric Biggers
2017-11-13 19:16 ` Jens Axboe
2017-11-13 19:21 ` Linus Torvalds
2017-11-13 21:56 ` Darrick J. Wong
2017-11-13 22:01 ` Linus Torvalds
2017-11-14 17:17 ` Theodore Ts'o
2017-10-31 15:13 ` Filipe Manana
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).