* Data loss XFS with RT kernel on Debian.
@ 2014-07-05 12:41 Jan de Kruyf
2014-07-05 20:06 ` Eric Sandeen
2014-07-06 23:57 ` Dave Chinner
0 siblings, 2 replies; 7+ messages in thread
From: Jan de Kruyf @ 2014-07-05 12:41 UTC
To: xfs
Hallo,
While running a reasonably heavy job such as rsyncing a subdirectory from
one place to another, or tarring it to a pipe and untarring it at the other
end, I see the CPU usage go to practically 100%. When I reset the computer
after 5 minutes or so, the writing has not finished at all.
However on the stock Debian kernel it works without a problem.
Could I still use this combination in an industrial environment reading and
writing reasonably short text files? So far I have not experienced this
problem in normal day-to-day use. It reared its head during installation of
gnat-gpl-2014-x86_64-linux-bin from the http://libre.adacore.com/download/
page. The offending code is in the Makefile in the top directory. The xterm
output shows where it gets stuck.
Regards,
Jan de Kruijf.
Here are the details of the installation:
root@jan:~# xfs_info -V
xfs_info version 3.1.7
root@jan:~# xfs_info /usr
meta-data=/dev/sda3              isize=256    agcount=4, agsize=732416 blks
         =                       sectsz=512   attr=2
data     =                       bsize=4096   blocks=2929664, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal               bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
This combination does not work:
root@jan:~# uname -a
Linux jan 3.14-0.bpo.1-rt-amd64 #1 SMP PREEMPT RT Debian 3.14.7-1~bpo70+1
(2014-06-21) x86_64 GNU/Linux
Kernel 3.10-0.bpo.3-rt-amd64 does not work either.
But this combination works:
root@jan:~# uname -a
Linux jan 3.2.0-4-amd64 #1 SMP Debian 3.2.57-3+deb7u2 x86_64 GNU/Linux
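When comparing results across kernels like the ones listed above, it can help to record which flavour is actually booted. The following is a minimal sketch only: `is_rt_release` is a hypothetical helper, and it assumes the Debian naming seen in the `uname -a` output above, where RT flavours carry `-rt` in the release string.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a kernel release string names an RT flavour.
# Assumption: RT builds carry "-rt" in `uname -r`, as in "3.14-0.bpo.1-rt-amd64".
is_rt_release() {
    case "$1" in
        *-rt-*|*-rt) return 0 ;;
        *) return 1 ;;
    esac
}

# Report the flavour of the running kernel.
release="$(uname -r)"
if is_rt_release "$release"; then
    echo "$release: RT kernel"
else
    echo "$release: non-RT kernel"
fi
```

The check is purely string based, so it says nothing about kernels built with a different naming scheme.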
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
* Re: Data loss XFS with RT kernel on Debian.
2014-07-05 12:41 Data loss XFS with RT kernel on Debian Jan de Kruyf
@ 2014-07-05 20:06 ` Eric Sandeen
2014-07-05 22:08 ` Eric Sandeen
2014-07-06 23:57 ` Dave Chinner
1 sibling, 1 reply; 7+ messages in thread
From: Eric Sandeen @ 2014-07-05 20:06 UTC
To: Jan de Kruyf, xfs
On 7/5/14, 7:41 AM, Jan de Kruyf wrote:
> Hallo,
>
> While running a reasonably heavy job such as rsyncing a subdirectory from one place to another, or tarring it to a pipe and untarring it at the other end, I see the CPU usage go to practically 100%. When I reset the computer after 5 minutes or so, the writing has not finished at all.
> However on the stock Debian kernel it works without a problem.
>
> Could I still use this combination in an industrial environment reading and writing reasonably short text files? So far I have not experienced this problem in normal day-to-day use. It reared its head during installation of gnat-gpl-2014-x86_64-linux-bin from the http://libre.adacore.com/download/ page. The offending code is in the Makefile in the top directory. The xterm output shows where it gets stuck.
http://lwn.net/Articles/457667/
-Eric
* Re: Data loss XFS with RT kernel on Debian.
2014-07-05 20:06 ` Eric Sandeen
@ 2014-07-05 22:08 ` Eric Sandeen
0 siblings, 0 replies; 7+ messages in thread
From: Eric Sandeen @ 2014-07-05 22:08 UTC
To: Jan de Kruyf, xfs
On 7/5/14, 3:06 PM, Eric Sandeen wrote:
> On 7/5/14, 7:41 AM, Jan de Kruyf wrote:
>> Hallo,
>>
>> While running a reasonably heavy job such as rsyncing a
>> subdirectory from one place to another, or tarring it to a pipe and
>> untarring it at the other end, I see the CPU usage go to
>> practically 100%. When I reset the computer after 5 minutes or so,
>> the writing has not finished at all. However on the stock
>> Debian kernel it works without a problem.
>>
>> Could I still use this combination in an industrial environment
>> reading and writing reasonably short text files? So far I have not
>> experienced this problem in normal day-to-day use. It reared its
>> head during installation of gnat-gpl-2014-x86_64-linux-bin from the
>> http://libre.adacore.com/download/ page. The offending code is in
>> the Makefile in the top directory. The xterm output shows where it
>> gets stuck.
> http://lwn.net/Articles/457667/
Ok, sorry - that was a little short ;)
If you have some 100% cpu livelock or whatever, that does sound like
a potential bug. Perhaps some tracing or profiling can help figure
out what has gone wrong there. Maybe sysrq-t & see where the active
threads are, or even top?
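A minimal sketch of the SysRq route, assuming root access and a kernel built with magic SysRq support. `sysrq_send` is a hypothetical helper name; the trigger path is a parameter only so the function can be tried against a scratch file first.

```shell
#!/bin/sh
# Hypothetical helper: send one magic SysRq command character.
# The default target is the kernel's trigger file; writing there needs root.
sysrq_send() {
    cmd="$1"
    trigger="${2:-/proc/sysrq-trigger}"
    printf '%s' "$cmd" > "$trigger"
}

# Usage on the affected machine, as root:
#   sysrq_send t                  # 't' asks the kernel to dump the state of all tasks
#   dmesg > /tmp/sysrq-t.txt      # save the dump from the kernel log
```

The dump goes to the kernel log, and the log buffer may be too small to hold every task. If the machine is too busy to run a shell, the output may still reach the console, which is where a serial link can help.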
But if you are surprised that you lost data when you did a hard reset,
the URL above is informative.
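The hard-reset point can be made concrete with a small sketch: data that has been written but not yet flushed may be lost on a reset, so a copy that must survive should be flushed explicitly. `durable_copy` is a hypothetical helper, and it assumes a `dd` that supports `conv=fsync` (GNU coreutils does).

```shell
#!/bin/sh
# Hypothetical helper: copy SRC to DST and flush the data to disk before returning.
durable_copy() {
    src="$1"
    dst="$2"
    tmp="$dst.tmp.$$"
    # conv=fsync makes dd call fsync() on the output file before it exits.
    dd if="$src" of="$tmp" conv=fsync 2>/dev/null || { rm -f "$tmp"; return 1; }
    # A rename within one filesystem is atomic: readers see the old or the new file.
    mv -f "$tmp" "$dst" || { rm -f "$tmp"; return 1; }
    # Flush whatever is still dirty, including the directory update from the rename.
    sync
}

# Usage:
#   durable_copy /path/to/source /path/to/destination
```

This only narrows the window for loss after a reset; it does nothing about a kernel that stops making progress on writes.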
-Eric
* Re: Data loss XFS with RT kernel on Debian.
2014-07-05 12:41 Data loss XFS with RT kernel on Debian Jan de Kruyf
2014-07-05 20:06 ` Eric Sandeen
@ 2014-07-06 23:57 ` Dave Chinner
2014-07-07 7:59 ` Jan de Kruyf
1 sibling, 1 reply; 7+ messages in thread
From: Dave Chinner @ 2014-07-06 23:57 UTC
To: Jan de Kruyf; +Cc: xfs
On Sat, Jul 05, 2014 at 02:41:06PM +0200, Jan de Kruyf wrote:
> Hallo,
>
> While running a reasonably heavy job such as rsyncing a subdirectory from
> one place to another, or tarring it to a pipe and untarring it at the other
> end, I see the CPU usage go to practically 100%. When I reset the computer
> after 5 minutes or so, the writing has not finished at all.
> However on the stock Debian kernel it works without a problem.
Which says that it's a RT kernel problem, not an XFS issue. There
have been other recent reports of issues with RT kernels, and they
have proven to be core RT kernel bugs, not filesystem issues. I'd
suggest that you are likely to be seeing the same RT issues....
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: Data loss XFS with RT kernel on Debian.
2014-07-06 23:57 ` Dave Chinner
@ 2014-07-07 7:59 ` Jan de Kruyf
2014-07-07 9:43 ` Dave Chinner
0 siblings, 1 reply; 7+ messages in thread
From: Jan de Kruyf @ 2014-07-07 7:59 UTC
To: xfs
I tend to agree with Dave that this is an RT problem; however, just for your info:
Kernel 3.2 (the stock Debian kernel) works in both versions, the plain
vanilla and the RT.
On debugging this problem: would it help any of the parties involved
if I rig up a serial link and try to get a stack trace of all
processes with SysRq? Does that also cover kernel processes? Because
top tells me nothing.
(I am a bit new to this but willing to try.)
Cheers
j.
* Re: Data loss XFS with RT kernel on Debian.
2014-07-07 7:59 ` Jan de Kruyf
@ 2014-07-07 9:43 ` Dave Chinner
0 siblings, 0 replies; 7+ messages in thread
From: Dave Chinner @ 2014-07-07 9:43 UTC
To: Jan de Kruyf; +Cc: xfs
On Mon, Jul 07, 2014 at 09:59:53AM +0200, Jan de Kruyf wrote:
> I tend to agree with Dave that this is an RT problem; however, just for your info:
>
> Kernel 3.2 (the stock Debian kernel) works in both versions, the plain
> vanilla and the RT.
> On debugging this problem: would it help any of the parties involved
> if I rig up a serial link and try to get a stack trace of all
> processes with SysRq? Does that also cover kernel processes? Because
> top tells me nothing.
> (I am a bit new to this but willing to try.)
Best to start by reading the archives to see the issues and
resolutions for the problems that Austin Schuh has been having with
recent Debian RT kernels. It starts with "XFS Crash" in early March,
and there are other issues from there; the thread migrates to LKML
as core RT problems are diagnosed.
See if the patches for the issues he hit solve your problem, and if
not, the threads should give you an idea of who to talk to about the
RT issue.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: Data loss XFS with RT kernel on Debian.
@ 2014-07-07 10:39 Jan de Kruyf
0 siblings, 0 replies; 7+ messages in thread
From: Jan de Kruyf @ 2014-07-07 10:39 UTC
To: xfs
I posted the problem in the Austin Schuh thread on the RT mailing list.
I believe they know what Austin's problem is, but Thomas G. seems to be
a bit busy at the moment.
I have also started reading that thread.
In any case kernel 3.2-rt works great with good latencies, so we will
stick to that for the time being. It was just 'feature inflation' that
made me move to a later kernel. Not always a good idea.
Thanks for your prompt answers.
j.