* kernel bug in xfs_lrw.c (centos v5.5, directio, aio)
From: Nohez @ 2010-08-17 13:42 UTC
To: xfs
Hi,
I had a kernel bug today when running xfs on CentOS v5.5. I moved to
xfs from ext3 today.
The only application accessing the xfs filesystem is Sybase ASE v15.x.
Database has been configured to use directio with native kernel
asynchronous disk i/o enabled.
Let me know if there is any other information I can provide to help
with debugging.
Thanks
Nohez
# uname -a
Linux xxxxxx 2.6.18-194.11.1.el5 #1 SMP Tue Aug 10 19:05:06 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux
# modinfo xfs
filename: /lib/modules/2.6.18-194.11.1.el5/kernel/fs/xfs/xfs.ko
license: GPL
description: SGI XFS with ACLs, security attributes, large block/inode numbers, no debug enabled
author: Silicon Graphics, Inc.
srcversion: 5380928C58CAF3C9D7FF438
depends:
vermagic: 2.6.18-194.11.1.el5 SMP mod_unload gcc-4.1
module_sig: 883f3504c61e211db40e56c3525288511247550a0d2737921213759cfcfebb5d4b9c7349defb86f4f0a0c1199fc66696cad525fb805585a3d05072b47923
# grep xfs /proc/mounts
/dev/mapper/xxxxxx_syb_vol_6130p1 /sybase xfs rw,noatime,nodiratime,nobarrier,logbufs=8,logbsize=32k,noquota 0 0
<begin /var/log/messages>
Aug 17 14:26:02 xxxxxx kernel: XFS mounting filesystem dm-1
Aug 17 15:20:09 xxxxxx kernel: BUG: warning at fs/xfs/linux-2.6/xfs_lrw.c:718/xfs_write() (Tainted: G )
Aug 17 15:20:09 xxxxxx kernel:
Aug 17 15:20:09 xxxxxx kernel: Call Trace:
Aug 17 15:20:09 xxxxxx kernel: [<ffffffff88683d59>] :xfs:xfs_write+0x39f/0x69e
Aug 17 15:20:09 xxxxxx kernel: [<ffffffff800e9226>] core_sys_select+0x1f9/0x265
Aug 17 15:20:09 xxxxxx kernel: [<ffffffff886805a0>] :xfs:xfs_file_aio_write+0x65/0x6a
Aug 17 15:20:09 xxxxxx kernel: [<ffffffff800ef2b6>] aio_pwrite+0x2c/0x75
Aug 17 15:20:09 xxxxxx kernel: [<ffffffff800efd77>] aio_run_iocb+0xef/0x18a
Aug 17 15:20:09 xxxxxx kernel: [<ffffffff800f08e1>] io_submit_one+0x396/0x499
Aug 17 15:20:09 xxxxxx kernel: [<ffffffff800f0ef8>] sys_io_submit+0xbe/0x1a4
Aug 17 15:20:09 xxxxxx kernel: [<ffffffff8005d28d>] tracesys+0xd5/0xe0
Aug 17 15:20:09 xxxxxx kernel:
[... the same warning and call trace repeat 11 more times, through Aug 17 15:20:11 ...]
<end /var/log/messages>
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
* Re: kernel bug in xfs_lrw.c (centos v5.5, directio, aio)
From: Dave Chinner @ 2010-08-18 11:43 UTC
To: Nohez; +Cc: xfs
On Tue, Aug 17, 2010 at 07:12:12PM +0530, Nohez wrote:
>
> Hi,
>
> I had a kernel bug today when running xfs on CentOS v5.5. I moved to
> xfs from ext3 today.
>
> The only application accessing the xfs filesystem is Sybase ASE v15.x.
> Database has been configured to use directio with native kernel
> asynchronous disk i/o enabled.
The warning is being issued because the application is mixing
buffered IO with direct IO on the same file. i.e. data corruption
waiting to happen. This is an application bug - the responsibility
for ensuring data coherency and integrity is assumed by the
application issuing the direct IO.
This was discussed in more detail on a recent thread on this list -
you should be able to find it in the archives easily enough.
> Let me know if there is any other information I can provide to help
> with debugging.
Report it to the application vendor - it's an application bug, not
a filesystem bug.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: kernel bug in xfs_lrw.c (centos v5.5, directio, aio)
From: Nohez @ 2010-08-18 14:51 UTC
To: Dave Chinner; +Cc: xfs
Thank you for your reply.
I checked the archives and found a bug report. The function names in
the call trace report are different from the ones I have reported.
Since it's an application bug, it does not matter which filesystem
is being used. Can I assume that the data corruption can happen on
ext3 too?
Nohez
* Re: kernel bug in xfs_lrw.c (centos v5.5, directio, aio)
From: Dave Chinner @ 2010-08-18 15:55 UTC
To: Nohez; +Cc: xfs
On Wed, Aug 18, 2010 at 08:21:31PM +0530, Nohez wrote:
>
> Thank you for your reply.
>
> I checked the archives and found a bug report. The function names in
> the call trace report are different from the ones I have reported.
>
> Since it's an application bug, it does not matter which filesystem
> is being used. Can I assume that the data corruption can happen on
> ext3 too?
It depends on the way the filesystem handles parallelism when doing
direct IO. In general, though, you cannot rely on the filesystem
maintaining data coherence if you mix concurrent buffered/mmap IO
and direct IO to the same file. Filesystems do make efforts to
minimise the possibility of coherency problems, but they cannot be
prevented entirely. Hence the delegation of responsibility to the
application issuing the direct IO.
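For anyone auditing an application for this pattern, a minimal user-space sketch in Python follows. It is an illustration only, not taken from XFS or Sybase: the helper names `is_direct` and `mixed_io_inodes` are made up here, and it only sees descriptors held by one process, so it says nothing about mmap or about other processes. It groups open descriptors by inode and reports any inode that is open both with and without O_DIRECT.

```python
import fcntl
import os

# O_DIRECT is Linux-specific; fall back to 0 so the sketch still imports elsewhere.
O_DIRECT = getattr(os, "O_DIRECT", 0)


def is_direct(fd):
    """Return True if the descriptor currently has O_DIRECT set."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    return bool(O_DIRECT and flags & O_DIRECT)


def mixed_io_inodes(fds):
    """Return the (st_dev, st_ino) pairs that are open through both
    buffered and direct descriptors in the given collection."""
    modes = {}
    for fd in fds:
        st = os.fstat(fd)
        # Record, per inode, which IO modes (direct or not) were seen.
        modes.setdefault((st.st_dev, st.st_ino), set()).add(is_direct(fd))
    return {inode for inode, seen in modes.items() if seen == {True, False}}
```

An empty result only means no mixed-mode descriptors were found among those passed in; it does not prove the application is free of the problem.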
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: kernel bug in xfs_lrw.c (centos v5.5, directio, aio)
From: Eric Sandeen @ 2010-08-19 0:47 UTC
To: Dave Chinner; +Cc: Nohez, xfs@oss.sgi.com
On Aug 18, 2010, at 6:43 AM, Dave Chinner <david@fromorbit.com> wrote:
> On Tue, Aug 17, 2010 at 07:12:12PM +0530, Nohez wrote:
>>
>> Hi,
>>
>> I had a kernel bug today when running xfs on CentOS v5.5. I moved to
>> xfs from ext3 today.
>>
>> The only application accessing the xfs filesystem is Sybase ASE v15.x.
>> Database has been configured to use directio with native kernel
>> asynchronous disk i/o enabled.
>
> The warning is being issued because the application is mixing
> buffered IO with direct IO on the same file. i.e. data corruption
> waiting to happen. This is an application bug - the responsibility
> for ensuring data coherency and integrity is assumed by the
> application issuing the direct IO.
>
You know... A clearer kernel message might help a lot here...
-Eric
> This was discussed in more detail on a recent thread on this list -
> you should be able to find it in the archives easily enough.
>
>> Let me know if there is any other information I can provide to help
>> with debugging.
>
> Report it to the application vendor - it's an application bug, not
> a filesystem bug.
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com
* Re: kernel bug in xfs_lrw.c (centos v5.5, directio, aio)
From: Dave Chinner @ 2010-08-19 1:34 UTC
To: Eric Sandeen; +Cc: Nohez, xfs@oss.sgi.com
On Wed, Aug 18, 2010 at 07:47:09PM -0500, Eric Sandeen wrote:
> On Aug 18, 2010, at 6:43 AM, Dave Chinner <david@fromorbit.com> wrote:
>
> > On Tue, Aug 17, 2010 at 07:12:12PM +0530, Nohez wrote:
> >>
> >> Hi,
> >>
> >> I had a kernel bug today when running xfs on CentOS v5.5. I moved to
> >> xfs from ext3 today.
> >>
> >> The only application accessing the xfs filesystem is Sybase ASE v15.x.
> >> Database has been configured to use directio with native kernel
> >> asynchronous disk i/o enabled.
> >
> > The warning is being issued because the application is mixing
> > buffered IO with direct IO on the same file. i.e. data corruption
> > waiting to happen. This is an application bug - the responsibility
> > for ensuring data coherency and integrity is assumed by the
> > application issuing the direct IO.
> >
> You know... A clearer kernel message might help a lot here...
Yeah, probably would given we've had more reports of this in the
last month or two than we've had in the last five years. What sort
of text do you think we should add? I'd argue on the scary side,
say:
"XFS: filesystem <blah>: detected potential data corruption issue
caused by application(s) mixing concurrent buffered and direct IO to
the same inode. Inode #12345, pid 6789. Please report this issue
to your application vendor."
What do you think?
As it is, I suspect that the test for this race condition will
need to change somewhat with range-based flushing now working.
Just checking mapping->nr_pages is not sufficient anymore, I think.
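One way to read the last remark, as a toy user-space model in Python rather than kernel code: once flushing and invalidation are done per byte range, cached pages outside the range of the direct write can legitimately remain, so a whole-file page count no longer says whether the write itself overlaps cached data. The function names, the set-of-page-indices representation, and the 4096-byte page size are all assumptions made for the illustration.

```python
PAGE_SIZE = 4096  # assumed page size for the illustration


def coarse_check(cached_pages):
    """Whole-file test: any cached page at all counts as a conflict
    (the analogue of testing mapping->nr_pages != 0)."""
    return len(cached_pages) > 0


def range_check(cached_pages, offset, length):
    """Range-based test: only cached pages overlapping the byte range
    [offset, offset + length) count as a conflict."""
    if length <= 0:
        return False
    first = offset // PAGE_SIZE
    last = (offset + length - 1) // PAGE_SIZE
    return any(first <= page <= last for page in cached_pages)
```

With pages 0 and 1 cached and a direct write to page 8, `coarse_check` reports a conflict while `range_check` does not, which is the kind of false positive a range-aware test would avoid.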
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: kernel bug in xfs_lrw.c (centos v5.5, directio, aio)
From: Eric Sandeen @ 2010-08-19 1:38 UTC
To: Dave Chinner; +Cc: Nohez, xfs@oss.sgi.com
On Aug 18, 2010, at 8:34 PM, Dave Chinner <david@fromorbit.com> wrote:
> On Wed, Aug 18, 2010 at 07:47:09PM -0500, Eric Sandeen wrote:
>> On Aug 18, 2010, at 6:43 AM, Dave Chinner <david@fromorbit.com> wrote:
>>
>>> On Tue, Aug 17, 2010 at 07:12:12PM +0530, Nohez wrote:
>>>>
>>>> Hi,
>>>>
>>>> I had a kernel bug today when running xfs on CentOS v5.5. I moved to
>>>> xfs from ext3 today.
>>>>
>>>> The only application accessing the xfs filesystem is Sybase ASE v15.x.
>>>> Database has been configured to use directio with native kernel
>>>> asynchronous disk i/o enabled.
>>>
>>> The warning is being issued because the application is mixing
>>> buffered IO with direct IO on the same file. i.e. data corruption
>>> waiting to happen. This is an application bug - the responsibility
>>> for ensuring data coherency and integrity is assumed by the
>>> application issuing the direct IO.
>>>
>> You know... A clearer kernel message might help a lot here...
>
> Yeah, probably would given we've had more reports of this in the
> last month or two than we've had in the last five years. What sort
> of text do you think we should add? I'd argue on the scary side,
> say:
>
> "XFS: filesystem <blah>: detected potential data corruption issue
> caused by application(s) mixing concurrent buffered and direct IO to
> the same inode. Inode #12345, pid 6789. Please report this issue
> to your application vendor."
>
> What do you think?
>
Plenty verbose, might want to limit/throttle it, but sure. Maybe include current->comm?
-Eric
> As it is, I suspect that the test for this race condition will
> need to change somewhat with range-based flushing now working.
> Just checking mapping->nr_pages is not sufficient anymore, I think.
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com
>
* Re: kernel bug in xfs_lrw.c (centos v5.5, directio, aio)
From: Dave Chinner @ 2010-08-19 1:50 UTC
To: Eric Sandeen; +Cc: Nohez, xfs@oss.sgi.com
On Wed, Aug 18, 2010 at 08:38:33PM -0500, Eric Sandeen wrote:
> On Aug 18, 2010, at 8:34 PM, Dave Chinner <david@fromorbit.com> wrote:
>
> > On Wed, Aug 18, 2010 at 07:47:09PM -0500, Eric Sandeen wrote:
> >> On Aug 18, 2010, at 6:43 AM, Dave Chinner <david@fromorbit.com> wrote:
> >>
> >>> On Tue, Aug 17, 2010 at 07:12:12PM +0530, Nohez wrote:
> >>>>
> >>>> Hi,
> >>>>
> >>>> I had a kernel bug today when running xfs on CentOS v5.5. I moved to
> >>>> xfs from ext3 today.
> >>>>
> >>>> The only application accessing the xfs filesystem is Sybase ASE v15.x.
> >>>> Database has been configured to use directio with native kernel
> >>>> asynchronous disk i/o enabled.
> >>>
> >>> The warning is being issued because the application is mixing
> >>> buffered IO with direct IO on the same file. i.e. data corruption
> >>> waiting to happen. This is an application bug - the responsibility
> >>> for ensuring data coherency and integrity is assumed by the
> >>> application issuing the direct IO.
> >>>
> >> You know... A clearer kernel message might help a lot here...
> >
> > Yeah, probably would given we've had more reports of this in the
> > last month or two than we've had in the last five years. What sort
> > of text do you think we should add? I'd argue on the scary side,
> > say:
> >
> > "XFS: filesystem <blah>: detected potential data corruption issue
> > caused by application(s) mixing concurrent buffered and direct IO to
> > the same inode. Inode #12345, pid 6789. Please report this issue
> > to your application vendor."
> >
> > What do you think?
> >
> Plenty verbose, might want to limit/throttle it, but sure.
Rate limiting it is a good idea, anyway. How about this:
"XFS: <dev>: inode <#>: pid <#> <name>: detected potential data
corruption issue due to concurrent buffered and direct IO to the
same inode. Please report this issue to your application vendor."
> Maybe include current->comm?
Yes, I thought about that but hadn't gone looking to find out how
easy it was to get the process name.
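A rough user-space sketch in Python of the behaviour being discussed: the proposed message text with the process name filled in, behind a simple burst-per-interval limiter. This is not the kernel's rate-limiting code; `RateLimiter`, `format_warning`, `warn`, and the default interval and burst are hypothetical names and values chosen for illustration.

```python
import time


class RateLimiter:
    """Allow at most `burst` events per `interval` seconds; drop the rest."""

    def __init__(self, interval=5.0, burst=10, clock=time.monotonic):
        self.interval = interval
        self.burst = burst
        self.clock = clock
        self.window_start = None
        self.emitted = 0
        self.suppressed = 0

    def allow(self):
        now = self.clock()
        # Start a fresh window on first use or once the interval has elapsed.
        if self.window_start is None or now - self.window_start >= self.interval:
            self.window_start = now
            self.emitted = 0
        if self.emitted < self.burst:
            self.emitted += 1
            return True
        self.suppressed += 1
        return False


def format_warning(dev, inode, pid, comm):
    """Fill in the message text proposed in this thread."""
    return (
        f"XFS: {dev}: inode {inode}: pid {pid} {comm}: detected potential data "
        "corruption issue due to concurrent buffered and direct IO to the "
        "same inode. Please report this issue to your application vendor."
    )


def warn(limiter, dev, inode, pid, comm, emit=print):
    """Emit the warning unless the limiter says to suppress it."""
    if limiter.allow():
        emit(format_warning(dev, inode, pid, comm))
        return True
    return False
```

Called twelve times in quick succession with a burst of 2, as in the log at the top of this thread, `warn` would print two messages and count ten as suppressed.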
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com