public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
* generic/247 triggers warnings occasionally
@ 2015-05-29  6:32 Eryu Guan
  2015-05-29  7:13 ` Eryu Guan
  2015-05-29 15:13 ` Brian Foster
  0 siblings, 2 replies; 4+ messages in thread
From: Eryu Guan @ 2015-05-29  6:32 UTC (permalink / raw)
  To: xfs

[-- Attachment #1: Type: text/plain, Size: 2535 bytes --]

Hi all,

I've seen generic/247 trigger the following warning occasionally, on
4.1.0-rc5 kernel. And I can reproduce it back on 4.0.0 kernel.

[60397.806729] run fstests generic/247 at 2015-05-29 13:19:55
[60400.197970] ------------[ cut here ]------------
[60400.199285] WARNING: CPU: 1 PID: 13161 at fs/xfs/xfs_file.c:726 xfs_file_dio_aio_write+0x176/0x2a8 [xfs]()
[60400.201783] Modules linked in: loop btrfs xor raid6_pq ppdev parport_pc i2c_piix4 parport i2c_core pcspkr serio_raw acpi_cpufreq xfs sd_mod ata_generic pata_acpi 8139too ata_piix 8139cp libata mii virtio_pci floppy virtio virtio_ring
[60400.207266] CPU: 1 PID: 13161 Comm: dd Not tainted 4.1.0-rc5 #61
[60400.208067] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
[60400.208734]  0000000000000000 00000000726fc8c5 ffff880213f83cf8 ffffffff816d825d
[60400.209515]  0000000000000000 0000000000000000 ffff880213f83d38 ffffffff8107deaa
[60400.210260]  ffffea0007f56140 ffff880213f83e40 ffff8800d661f400 ffff8800d661f728
[60400.211207] Call Trace:
[60400.211423]  [<ffffffff816d825d>] dump_stack+0x45/0x57
[60400.211860]  [<ffffffff8107deaa>] warn_slowpath_common+0x8a/0xc0
[60400.212385]  [<ffffffff8107dfda>] warn_slowpath_null+0x1a/0x20
[60400.212909]  [<ffffffffa0145f01>] xfs_file_dio_aio_write+0x176/0x2a8 [xfs]
[60400.213507]  [<ffffffff81342978>] ? __clear_user+0x28/0x50
[60400.213981]  [<ffffffffa011291e>] xfs_file_write_iter+0x10e/0x120 [xfs]
[60400.214554]  [<ffffffff811fd721>] __vfs_write+0xd1/0x110
[60400.215004]  [<ffffffff811fddb9>] vfs_write+0xa9/0x1b0
[60400.215454]  [<ffffffff8102472c>] ? do_audit_syscall_entry+0x6c/0x70
[60400.215984]  [<ffffffff811fec65>] SyS_write+0x55/0xd0
[60400.216431]  [<ffffffff816df96e>] system_call_fastpath+0x12/0x71
[60400.216946] ---[ end trace c403c17f7a0c57a7 ]---
[60404.341600] XFS (sda5): Unmounting Filesystem
[60404.512757] XFS (sda5): Mounting V4 Filesystem
[60404.540546] XFS (sda5): Ending clean mount
[60404.609414] XFS (sda5): Unmounting Filesystem

fs/xfs/xfs_file.c:723
   723			ret = invalidate_inode_pages2_range(VFS_I(ip)->i_mapping,
   724						pos >> PAGE_CACHE_SHIFT,
   725						end >> PAGE_CACHE_SHIFT);
   726			WARN_ON_ONCE(ret);
   727			ret = 0;

It can be reproduced by running generic/247 in loop(usually within 100
loops), without any special parameters/options.

	while ./check generic/247; do : ; done

Attachments are my xfstests config file and host info requested by
http://xfs.org/index.php/XFS_FAQ#Q:_What_information_should_I_include_when_reporting_a_problem.3F

Thanks,
Eryu

[-- Attachment #2: hostinfo --]
[-- Type: text/plain, Size: 4308 bytes --]

[root@dhcp-66-86-11 ~]# uname -a
Linux dhcp-66-86-11.rhts.eng.nay.redhat.com 4.1.0-rc5 #61 SMP Thu May 28 12:28:22 CST 2015 x86_64 x86_64 x86_64 GNU/Linux
[root@dhcp-66-86-11 ~]# rpm -q xfsprogs
xfsprogs-3.2.1-6.el7.x86_64
[root@dhcp-66-86-11 ~]# grep -c proc /proc/cpuinfo
4
[root@dhcp-66-86-11 ~]# cat /proc/meminfo
MemTotal:        7912932 kB
MemFree:         7547480 kB
MemAvailable:    7599440 kB
Buffers:             944 kB
Cached:           219292 kB
SwapCached:            0 kB
Active:           170072 kB
Inactive:         106352 kB
Active(anon):      56476 kB
Inactive(anon):     8384 kB
Active(file):     113596 kB
Inactive(file):    97968 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:       8257532 kB
SwapFree:        8257532 kB
Dirty:             42556 kB
Writeback:             0 kB
AnonPages:         56020 kB
Mapped:            33008 kB
Shmem:              8552 kB
Slab:              42520 kB
SReclaimable:      18648 kB
SUnreclaim:        23872 kB
KernelStack:        2336 kB
PageTables:         3280 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    12213996 kB
Committed_AS:     260636 kB
VmallocTotal:   34359738367 kB
VmallocUsed:       26336 kB
VmallocChunk:   34359623680 kB
HardwareCorrupted:     0 kB
AnonHugePages:     12288 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:       69620 kB
DirectMap2M:     8318976 kB
[root@dhcp-66-86-11 ~]# cat /proc/mounts
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
sysfs /sys sysfs rw,seclabel,nosuid,nodev,noexec,relatime 0 0
devtmpfs /dev devtmpfs rw,seclabel,nosuid,size=3943408k,nr_inodes=985852,mode=755 0 0
securityfs /sys/kernel/security securityfs rw,nosuid,nodev,noexec,relatime 0 0
tmpfs /dev/shm tmpfs rw,seclabel,nosuid,nodev 0 0
devpts /dev/pts devpts rw,seclabel,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
tmpfs /run tmpfs rw,seclabel,nosuid,nodev,mode=755 0 0
tmpfs /sys/fs/cgroup tmpfs rw,seclabel,nosuid,nodev,noexec,mode=755 0 0
cgroup /sys/fs/cgroup/systemd cgroup rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd 0 0
pstore /sys/fs/pstore pstore rw,seclabel,nosuid,nodev,noexec,relatime 0 0
cgroup /sys/fs/cgroup/cpuset cgroup rw,nosuid,nodev,noexec,relatime,cpuset 0 0
cgroup /sys/fs/cgroup/cpu,cpuacct cgroup rw,nosuid,nodev,noexec,relatime,cpu,cpuacct 0 0
cgroup /sys/fs/cgroup/blkio cgroup rw,nosuid,nodev,noexec,relatime,blkio 0 0
cgroup /sys/fs/cgroup/memory cgroup rw,nosuid,nodev,noexec,relatime,memory 0 0
cgroup /sys/fs/cgroup/devices cgroup rw,nosuid,nodev,noexec,relatime,devices 0 0
cgroup /sys/fs/cgroup/freezer cgroup rw,nosuid,nodev,noexec,relatime,freezer 0 0
cgroup /sys/fs/cgroup/net_cls cgroup rw,nosuid,nodev,noexec,relatime,net_cls 0 0
cgroup /sys/fs/cgroup/perf_event cgroup rw,nosuid,nodev,noexec,relatime,perf_event 0 0
cgroup /sys/fs/cgroup/hugetlb cgroup rw,nosuid,nodev,noexec,relatime,hugetlb 0 0
configfs /sys/kernel/config configfs rw,relatime 0 0
/dev/sda3 / xfs rw,seclabel,relatime,attr2,inode64,noquota 0 0
selinuxfs /sys/fs/selinux selinuxfs rw,relatime 0 0
systemd-1 /proc/sys/fs/binfmt_misc autofs rw,relatime,fd=34,pgrp=1,timeout=300,minproto=5,maxproto=5,direct 0 0
debugfs /sys/kernel/debug debugfs rw,seclabel,relatime 0 0
hugetlbfs /dev/hugepages hugetlbfs rw,seclabel,relatime 0 0
mqueue /dev/mqueue mqueue rw,seclabel,relatime 0 0
/dev/sda1 /boot xfs rw,seclabel,relatime,attr2,inode64,noquota 0 0
/dev/sda6 /mnt/testarea/scratch xfs rw,context=system_u:object_r:nfs_t:s0,relatime,attr2,inode64,noquota 0 0
/dev/sda5 /mnt/testarea/test xfs rw,seclabel,relatime,attr2,inode64,noquota 0 0
[root@dhcp-66-86-11 ~]# cat /proc/partitions
major minor  #blocks  name

   8        0  314572800 sda
   8        1     512000 sda1
   8        2    8257536 sda2
   8        3   52428800 sda3
   8        4          1 sda4
   8        5   15728640 sda5
   8        6   15728640 sda6
   8        7   15728640 sda7
   8        8   15728640 sda8
   8        9   15728640 sda9
   8       10   15728640 sda10
   8       11   15728640 sda11
   8       16  104857600 sdb
   8       17  104856576 sdb1
[root@dhcp-66-86-11 ~]# lvs
[root@dhcp-66-86-11 ~]#

[-- Attachment #3: local.config --]
[-- Type: text/plain, Size: 176 bytes --]

TEST_DEV=/dev/sda5
TEST_DIR=/mnt/testarea/test				# mount point of TEST PARTITION
SCRATCH_MNT=/mnt/testarea/scratch			# mount point for SCRATCH PARTITION
SCRATCH_DEV=/dev/sda6

[-- Attachment #4: Type: text/plain, Size: 121 bytes --]

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: generic/247 triggers warnings occasionally
  2015-05-29  6:32 generic/247 triggers warnings occasionally Eryu Guan
@ 2015-05-29  7:13 ` Eryu Guan
  2015-05-29 15:13 ` Brian Foster
  1 sibling, 0 replies; 4+ messages in thread
From: Eryu Guan @ 2015-05-29  7:13 UTC (permalink / raw)
  To: xfs

[-- Attachment #1: Type: text/plain, Size: 1719 bytes --]

On Fri, May 29, 2015 at 02:32:14PM +0800, Eryu Guan wrote:
> Hi all,
> 
> I've seen generic/247 trigger the following warning occasionally, on
> 4.1.0-rc5 kernel. And I can reproduce it back on 4.0.0 kernel.
> 
[snip]
> 
> Attachments are my xfstests config file and host info requested by
> http://xfs.org/index.php/XFS_FAQ#Q:_What_information_should_I_include_when_reporting_a_problem.3F

More info on host info here.

It's a RHEL7 kvm guest running on RHEL6.6, running upstream kernel. The
disk configuration in guest xml is

<disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='none'/>
      <source file='/home/osimage/rhel7.img'/>
      <target dev='hda' bus='ide'/>
      <alias name='ide0-0-0'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
</disk>

This is xfs_info output of $TEST_DIR after the warning.

[root@dhcp-66-86-11 xfstests]# xfs_info /mnt/testarea/test/
meta-data=/dev/sda5              isize=256    agcount=4, agsize=983040 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=0        finobt=0
data     =                       bsize=4096   blocks=3932160, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
log      =internal               bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

I also attached the compressed full dmesg log.

And I do reproduce it on another bare metal host, which has 12 logical
cpus and 16G memory, tested with 12G lvm device.

Thanks,
Eryu

[-- Attachment #2: generic-247.dmesg.gz --]
[-- Type: application/gzip, Size: 20263 bytes --]

[-- Attachment #3: Type: text/plain, Size: 121 bytes --]

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: generic/247 triggers warnings occasionally
  2015-05-29  6:32 generic/247 triggers warnings occasionally Eryu Guan
  2015-05-29  7:13 ` Eryu Guan
@ 2015-05-29 15:13 ` Brian Foster
  2015-05-29 22:37   ` Dave Chinner
  1 sibling, 1 reply; 4+ messages in thread
From: Brian Foster @ 2015-05-29 15:13 UTC (permalink / raw)
  To: Eryu Guan; +Cc: xfs

On Fri, May 29, 2015 at 02:32:14PM +0800, Eryu Guan wrote:
> Hi all,
> 
> I've seen generic/247 trigger the following warning occasionally, on
> 4.1.0-rc5 kernel. And I can reproduce it back on 4.0.0 kernel.
> 
> [60397.806729] run fstests generic/247 at 2015-05-29 13:19:55
> [60400.197970] ------------[ cut here ]------------
> [60400.199285] WARNING: CPU: 1 PID: 13161 at fs/xfs/xfs_file.c:726 xfs_file_dio_aio_write+0x176/0x2a8 [xfs]()
> [60400.201783] Modules linked in: loop btrfs xor raid6_pq ppdev parport_pc i2c_piix4 parport i2c_core pcspkr serio_raw acpi_cpufreq xfs sd_mod ata_generic pata_acpi 8139too ata_piix 8139cp libata mii virtio_pci floppy virtio virtio_ring
> [60400.207266] CPU: 1 PID: 13161 Comm: dd Not tainted 4.1.0-rc5 #61
> [60400.208067] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
> [60400.208734]  0000000000000000 00000000726fc8c5 ffff880213f83cf8 ffffffff816d825d
> [60400.209515]  0000000000000000 0000000000000000 ffff880213f83d38 ffffffff8107deaa
> [60400.210260]  ffffea0007f56140 ffff880213f83e40 ffff8800d661f400 ffff8800d661f728
> [60400.211207] Call Trace:
> [60400.211423]  [<ffffffff816d825d>] dump_stack+0x45/0x57
> [60400.211860]  [<ffffffff8107deaa>] warn_slowpath_common+0x8a/0xc0
> [60400.212385]  [<ffffffff8107dfda>] warn_slowpath_null+0x1a/0x20
> [60400.212909]  [<ffffffffa0145f01>] xfs_file_dio_aio_write+0x176/0x2a8 [xfs]
> [60400.213507]  [<ffffffff81342978>] ? __clear_user+0x28/0x50
> [60400.213981]  [<ffffffffa011291e>] xfs_file_write_iter+0x10e/0x120 [xfs]
> [60400.214554]  [<ffffffff811fd721>] __vfs_write+0xd1/0x110
> [60400.215004]  [<ffffffff811fddb9>] vfs_write+0xa9/0x1b0
> [60400.215454]  [<ffffffff8102472c>] ? do_audit_syscall_entry+0x6c/0x70
> [60400.215984]  [<ffffffff811fec65>] SyS_write+0x55/0xd0
> [60400.216431]  [<ffffffff816df96e>] system_call_fastpath+0x12/0x71
> [60400.216946] ---[ end trace c403c17f7a0c57a7 ]---
> [60404.341600] XFS (sda5): Unmounting Filesystem
> [60404.512757] XFS (sda5): Mounting V4 Filesystem
> [60404.540546] XFS (sda5): Ending clean mount
> [60404.609414] XFS (sda5): Unmounting Filesystem
> 
> fs/xfs/xfs_file.c:723
>    723			ret = invalidate_inode_pages2_range(VFS_I(ip)->i_mapping,
>    724						pos >> PAGE_CACHE_SHIFT,
>    725						end >> PAGE_CACHE_SHIFT);
>    726			WARN_ON_ONCE(ret);
>    727			ret = 0;
> 
> It can be reproduced by running generic/247 in loop(usually within 100
> loops), without any special parameters/options.
> 
> 	while ./check generic/247; do : ; done
> 
> Attachments are my xfstests config file and host info requested by
> http://xfs.org/index.php/XFS_FAQ#Q:_What_information_should_I_include_when_reporting_a_problem.3F
> 

I reproduce this fairly regularly as well. The test initializes a file
with buffered writes and then runs a forward direct I/O writer and a
reverse mmap writer on the file in parallel. IIRC, the issue here is
basically the mmap writes racing with the flush/invalidate that occurs
as part of the dio write sequence. (Indeed, a quick test to acquire
mmaplock around the flush/invalidate in xfs_file_dio_aio_write() seems
to confirm).

I also vaguely recall some discussion with Dave around why we don't do
this (perhaps during review of the mmaplock bits). If I recall the
reasoning there, direct I/O in XFS is designed to allow parallel writes
to the file and thus the exclusive locks are demoted prior to the write.
This means the mapped write could still race with the DIO and lead to
whatever resulting inconsistency is possible now. So assuming my
understanding is correct there, taking the mmap lock across the
flush/invalidate just ends up hiding a symptom of a problem that isn't
really solved.

I wonder if we should at least update the comment since it somewhat
contradicts the fact that we have a known and reliable reproducer, even
if it's not the best idea to mix direct I/O and mapped writes as
generic/247 does.

Brian

> Thanks,
> Eryu

> [root@dhcp-66-86-11 ~]# uname -a
> Linux dhcp-66-86-11.rhts.eng.nay.redhat.com 4.1.0-rc5 #61 SMP Thu May 28 12:28:22 CST 2015 x86_64 x86_64 x86_64 GNU/Linux
> [root@dhcp-66-86-11 ~]# rpm -q xfsprogs
> xfsprogs-3.2.1-6.el7.x86_64
> [root@dhcp-66-86-11 ~]# grep -c proc /proc/cpuinfo
> 4
> [root@dhcp-66-86-11 ~]# cat /proc/meminfo
> MemTotal:        7912932 kB
> MemFree:         7547480 kB
> MemAvailable:    7599440 kB
> Buffers:             944 kB
> Cached:           219292 kB
> SwapCached:            0 kB
> Active:           170072 kB
> Inactive:         106352 kB
> Active(anon):      56476 kB
> Inactive(anon):     8384 kB
> Active(file):     113596 kB
> Inactive(file):    97968 kB
> Unevictable:           0 kB
> Mlocked:               0 kB
> SwapTotal:       8257532 kB
> SwapFree:        8257532 kB
> Dirty:             42556 kB
> Writeback:             0 kB
> AnonPages:         56020 kB
> Mapped:            33008 kB
> Shmem:              8552 kB
> Slab:              42520 kB
> SReclaimable:      18648 kB
> SUnreclaim:        23872 kB
> KernelStack:        2336 kB
> PageTables:         3280 kB
> NFS_Unstable:          0 kB
> Bounce:                0 kB
> WritebackTmp:          0 kB
> CommitLimit:    12213996 kB
> Committed_AS:     260636 kB
> VmallocTotal:   34359738367 kB
> VmallocUsed:       26336 kB
> VmallocChunk:   34359623680 kB
> HardwareCorrupted:     0 kB
> AnonHugePages:     12288 kB
> HugePages_Total:       0
> HugePages_Free:        0
> HugePages_Rsvd:        0
> HugePages_Surp:        0
> Hugepagesize:       2048 kB
> DirectMap4k:       69620 kB
> DirectMap2M:     8318976 kB
> [root@dhcp-66-86-11 ~]# cat /proc/mounts
> proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
> sysfs /sys sysfs rw,seclabel,nosuid,nodev,noexec,relatime 0 0
> devtmpfs /dev devtmpfs rw,seclabel,nosuid,size=3943408k,nr_inodes=985852,mode=755 0 0
> securityfs /sys/kernel/security securityfs rw,nosuid,nodev,noexec,relatime 0 0
> tmpfs /dev/shm tmpfs rw,seclabel,nosuid,nodev 0 0
> devpts /dev/pts devpts rw,seclabel,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
> tmpfs /run tmpfs rw,seclabel,nosuid,nodev,mode=755 0 0
> tmpfs /sys/fs/cgroup tmpfs rw,seclabel,nosuid,nodev,noexec,mode=755 0 0
> cgroup /sys/fs/cgroup/systemd cgroup rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd 0 0
> pstore /sys/fs/pstore pstore rw,seclabel,nosuid,nodev,noexec,relatime 0 0
> cgroup /sys/fs/cgroup/cpuset cgroup rw,nosuid,nodev,noexec,relatime,cpuset 0 0
> cgroup /sys/fs/cgroup/cpu,cpuacct cgroup rw,nosuid,nodev,noexec,relatime,cpu,cpuacct 0 0
> cgroup /sys/fs/cgroup/blkio cgroup rw,nosuid,nodev,noexec,relatime,blkio 0 0
> cgroup /sys/fs/cgroup/memory cgroup rw,nosuid,nodev,noexec,relatime,memory 0 0
> cgroup /sys/fs/cgroup/devices cgroup rw,nosuid,nodev,noexec,relatime,devices 0 0
> cgroup /sys/fs/cgroup/freezer cgroup rw,nosuid,nodev,noexec,relatime,freezer 0 0
> cgroup /sys/fs/cgroup/net_cls cgroup rw,nosuid,nodev,noexec,relatime,net_cls 0 0
> cgroup /sys/fs/cgroup/perf_event cgroup rw,nosuid,nodev,noexec,relatime,perf_event 0 0
> cgroup /sys/fs/cgroup/hugetlb cgroup rw,nosuid,nodev,noexec,relatime,hugetlb 0 0
> configfs /sys/kernel/config configfs rw,relatime 0 0
> /dev/sda3 / xfs rw,seclabel,relatime,attr2,inode64,noquota 0 0
> selinuxfs /sys/fs/selinux selinuxfs rw,relatime 0 0
> systemd-1 /proc/sys/fs/binfmt_misc autofs rw,relatime,fd=34,pgrp=1,timeout=300,minproto=5,maxproto=5,direct 0 0
> debugfs /sys/kernel/debug debugfs rw,seclabel,relatime 0 0
> hugetlbfs /dev/hugepages hugetlbfs rw,seclabel,relatime 0 0
> mqueue /dev/mqueue mqueue rw,seclabel,relatime 0 0
> /dev/sda1 /boot xfs rw,seclabel,relatime,attr2,inode64,noquota 0 0
> /dev/sda6 /mnt/testarea/scratch xfs rw,context=system_u:object_r:nfs_t:s0,relatime,attr2,inode64,noquota 0 0
> /dev/sda5 /mnt/testarea/test xfs rw,seclabel,relatime,attr2,inode64,noquota 0 0
> [root@dhcp-66-86-11 ~]# cat /proc/partitions
> major minor  #blocks  name
> 
>    8        0  314572800 sda
>    8        1     512000 sda1
>    8        2    8257536 sda2
>    8        3   52428800 sda3
>    8        4          1 sda4
>    8        5   15728640 sda5
>    8        6   15728640 sda6
>    8        7   15728640 sda7
>    8        8   15728640 sda8
>    8        9   15728640 sda9
>    8       10   15728640 sda10
>    8       11   15728640 sda11
>    8       16  104857600 sdb
>    8       17  104856576 sdb1
> [root@dhcp-66-86-11 ~]# lvs
> [root@dhcp-66-86-11 ~]#

> TEST_DEV=/dev/sda5
> TEST_DIR=/mnt/testarea/test				# mount point of TEST PARTITION
> SCRATCH_MNT=/mnt/testarea/scratch			# mount point for SCRATCH PARTITION
> SCRATCH_DEV=/dev/sda6

> _______________________________________________
> xfs mailing list
> xfs@oss.sgi.com
> http://oss.sgi.com/mailman/listinfo/xfs

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: generic/247 triggers warnings occasionally
  2015-05-29 15:13 ` Brian Foster
@ 2015-05-29 22:37   ` Dave Chinner
  0 siblings, 0 replies; 4+ messages in thread
From: Dave Chinner @ 2015-05-29 22:37 UTC (permalink / raw)
  To: Brian Foster; +Cc: Eryu Guan, xfs

On Fri, May 29, 2015 at 11:13:07AM -0400, Brian Foster wrote:
> On Fri, May 29, 2015 at 02:32:14PM +0800, Eryu Guan wrote:
> > Hi all,
> > 
> > I've seen generic/247 trigger the following warning occasionally, on
> > 4.1.0-rc5 kernel. And I can reproduce it back on 4.0.0 kernel.
> > 
> > [60397.806729] run fstests generic/247 at 2015-05-29 13:19:55
> > [60400.197970] ------------[ cut here ]------------
> > [60400.199285] WARNING: CPU: 1 PID: 13161 at fs/xfs/xfs_file.c:726 xfs_file_dio_aio_write+0x176/0x2a8 [xfs]()
....
> > fs/xfs/xfs_file.c:723
> >    723			ret = invalidate_inode_pages2_range(VFS_I(ip)->i_mapping,
> >    724						pos >> PAGE_CACHE_SHIFT,
> >    725						end >> PAGE_CACHE_SHIFT);
> >    726			WARN_ON_ONCE(ret);
> >    727			ret = 0;
> > 
> > It can be reproduced by running generic/247 in loop(usually within 100
> > loops), without any special parameters/options.
> > 
> > 	while ./check generic/247; do : ; done
> > 
> > Attachments are my xfstests config file and host info requested by
> > http://xfs.org/index.php/XFS_FAQ#Q:_What_information_should_I_include_when_reporting_a_problem.3F
> > 
> 
> I reproduce this fairly regularly as well. The test initializes a file
> with buffered writes and then runs a forward direct I/O writer and a
> reverse mmap writer on the file in parallel. IIRC, the issue here is
> basically the mmap writes racing with the flush/invalidate that occurs
> as part of the dio write sequence. (Indeed, a quick test to acquire
> mmaplock around the flush/invalidate in xfs_file_dio_aio_write() seems
> to confirm).
> 
> I also vaguely recall some discussion with Dave around why we don't do
> this (perhaps during review of the mmaplock bits). If I recall the
> reasoning there, direct I/O in XFS is designed to allow parallel writes
> to the file and thus the exclusive locks are demoted prior to the write.
> This means the mapped write could still race with the DIO and lead to
> whatever resulting inconsistency is possible now. So assuming my
> understanding is correct there, taking the mmap lock across the
> flush/invalidate just ends up hiding a symptom of a problem that isn't
> really solved.

Pretty much - the page fault just needs to happen at a slightly
different time, and we have the same problem just now it's silent
and we don't know about it. It's a data corruption vector, so I'd
much prefer to be noisy and hear about false positives than be
silent and miss it as the cause of a real corruption..

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2015-05-29 22:37 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-05-29  6:32 generic/247 triggers warnings occasionally Eryu Guan
2015-05-29  7:13 ` Eryu Guan
2015-05-29 15:13 ` Brian Foster
2015-05-29 22:37   ` Dave Chinner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox