From: Marc Lehmann <schmorp@schmorp.de>
To: Michael Monnerie <michael.monnerie@is.it-management.at>
Cc: xfs@oss.sgi.com
Subject: Re: frequent kernel BUG and lockups - 2.6.39 + xfs_fsr
Date: Tue, 9 Aug 2011 13:15:27 +0200
Message-ID: <20110809111526.GA7631@schmorp.de>
In-Reply-To: <201108091210.50204@zmi.at>
On Tue, Aug 09, 2011 at 12:10:48PM +0200, Michael Monnerie <michael.monnerie@is.it-management.at> wrote:
> First of all, please calm down. Getting personal is not bringing us
> anywhere.
Well, it's not me who's getting personal, so...?
> > Logic error - if I can corrupt an XFS without special privileges then
> > this is not a problem with xfs_fsr, but simply a kernel bug in the
> > xfs code. And a rather big one, one step below a remote exploit.
>
> No, it's not a kernel bug because as long as you don't use xfs_fsr,
> nothing will ever happen.
"As long as you don't boot, it will not crash".
xfs_fsr uses syscalls, just like other applications. According to your
(wrong) logic, if an application uses chown and this causes a kernel oops,
this is also not a kernel bug.
That's of course wrong - it's the kernel that crashes when an application
performs certain access patterns.
> (rw,nodiratime,relatime,logbufs=8,logbsize=256k,attr2,barrier,largeio,swalloc)
> and sometimes also
> ,allocsize=64m
As has been reported on this list, this option is really harmful on
current xfs - in my case, it led to xfs causing ENOSPC even when the disk
was 40% empty (~188GB).
> and I can't find evidence for fragmentation that would be harmful. Yes
Well, define "harmful" - slow logfile reads aren't what I consider
"harmful" either. It's just very very slow.
> The allocsize option helps a lot there. I looked at one webserver access
> log, it has 640MB with 99 fragments, but that's not a lot. On our
> Spamgate I see 250MB logs with 374 fragments.
Well, if it were one fragment, you could read that in 4-5 seconds; at 374
fragments, it's probably around 6-7 seconds. That's not harmful, but if you
extrapolate this to a few gigabytes and a lot of files, it becomes quite
an overhead.
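
A back-of-the-envelope sketch of that estimate, in Python. The model is
simply sequential transfer time plus one seek per additional fragment. The
throughput (55 MB/s) and the per-fragment seek cost (6 ms) are assumed
values picked for illustration, not measurements from this thread, and the
function name is hypothetical:

```python
def estimated_read_seconds(size_mb, fragments,
                           throughput_mb_s=55.0, seek_ms=6.0):
    """Rough read time: sequential transfer plus one seek per extra fragment.

    throughput_mb_s and seek_ms are assumptions, not measured values.
    """
    transfer = size_mb / throughput_mb_s
    seeks = max(fragments - 1, 0) * seek_ms / 1000.0
    return transfer + seeks


if __name__ == "__main__":
    # The 250MB log from the quoted mail, unfragmented vs. 374 fragments.
    for frags in (1, 374):
        secs = estimated_read_seconds(250, frags)
        print(f"{frags:4d} fragments: {secs:.1f} s")
```

The seek term grows linearly with the fragment count, so it is the part
that dominates once many large, heavily fragmented files are involved.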
> don't use the allocsize option there, which I changed now that I looked
That allocsize option is no longer reasonable with newer kernels, as the
kernel will reserve 64m of disk space even for 1kb files indefinitely.
> > If XFS is bad at append-only workloads, which is the most common type
> > of workload, then XFS fails to be very relevant for the real world.
>
> may be valid for your world, not mine. We have webservers, fileservers
> and database servers, all of which are not really append style, but more
> delete-and-recreate.
If you find a way of recreating files without appending to them, let me
know.
The problem with fragmentation is that it happens even with only a few
writers for "create file" workloads (which do append...).
You probably make a distinction between "writing a file fast" and "writing
a file slow", but the distinction is not a qualitative difference. On busy
servers that create a lot of files, you get fragmentation the same way
as on less busy servers that write files more slowly. There is little to no
difference in the resulting patterns.
> Well, db-servers are rather exceptional here.
Yes, append style is what makes up the vast majority of disk writes on
a normal system, db-servers excepted indeed.
> But if the numbers for fragmentation on your servers are true, you must
> have a very good test case for fragmentation prevention. Therefore it
> could be really interesting if you could grab what Dave Chinner asked
> for:
I'll keep it in mind.
> And maybe he could use it for optimizations. Is there any tool on Linux
> to record such I/O patterns?
I presume strace would do, but that's where the "lot of work" comes in. If
there is a ready-to-use tool, that would of course make it easy.
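
To give an idea of what the strace route could look like, here is a minimal
Python sketch that summarises write() calls from a strace log. It assumes
the log was captured with something like
"strace -f -tt -e trace=write -o trace.log -p PID"; the sample lines, the
regular expression and the function name are illustrative assumptions, and
this records only the syscall pattern, not the resulting on-disk layout:

```python
import re
from collections import defaultdict

# Matches lines such as:
#   1234 13:15:27.000001 write(3, "..."..., 120) = 120
# The pid and timestamp prefixes are optional; failed calls (= -1 ...) and
# unfinished/resumed lines do not match and are skipped.
WRITE_RE = re.compile(
    r'^(?:(\d+)\s+)?(?:[\d:.]+\s+)?write\((\d+),.*\)\s+=\s+(\d+)\s*$'
)


def summarize_writes(lines):
    """Return {(pid, fd): (number_of_writes, total_bytes)}."""
    stats = defaultdict(lambda: [0, 0])
    for line in lines:
        m = WRITE_RE.match(line)
        if not m:
            continue
        pid, fd, nbytes = m.group(1), int(m.group(2)), int(m.group(3))
        entry = stats[(pid, fd)]
        entry[0] += 1
        entry[1] += nbytes
    return {key: tuple(value) for key, value in stats.items()}


# Made-up sample input, only to show the expected line format.
SAMPLE = [
    '1234 13:15:27.000001 write(3, "GET / HTTP/1.1"..., 120) = 120',
    '1234 13:15:27.500000 write(3, "GET /a HTTP/1.1"..., 96) = 96',
    '1300 13:15:28.000000 write(5, "mail log line"..., 200) = 200',
    '1300 13:15:28.100000 write(5, "x", 1) = -1 EAGAIN (Resource temporarily unavailable)',
]

if __name__ == "__main__":
    for (pid, fd), (count, total) in sorted(summarize_writes(SAMPLE).items()):
        print(f"pid {pid} fd {fd}: {count} writes, {total} bytes")
```

Many small writes spread over time to several descriptiors at once is the
append pattern discussed above; the per-descriptor counts make that visible.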
--
The choice of a Deliantra, the free code+content MORPG
-----==- _GNU_ http://www.deliantra.net
----==-- _ generation
---==---(_)__ __ ____ __ Marc Lehmann
--==---/ / _ \/ // /\ \/ / schmorp@schmorp.de
-=====/_/_//_/\_,_/ /_/\_\
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs