From: Marc Lehmann <schmorp@schmorp.de>
To: Michael Monnerie <michael.monnerie@is.it-management.at>
Cc: xfs@oss.sgi.com
Subject: Re: frequent kernel BUG and lockups - 2.6.39 + xfs_fsr
Date: Tue, 9 Aug 2011 13:15:27 +0200
Message-ID: <20110809111526.GA7631@schmorp.de>
In-Reply-To: <201108091210.50204@zmi.at>
On Tue, Aug 09, 2011 at 12:10:48PM +0200, Michael Monnerie <michael.monnerie@is.it-management.at> wrote:
> First of all, please calm down. Getting personal is not bringing us
> anywhere.
Well, it's not me who's getting personal, so...?
> > Logic error - if I can corrupt an XFS without special privileges then
> > this is not a problem with xfs_fsr, but simply a kernel bug in the
> > xfs code. And a rather big one, one step below a remote exploit.
>
> No, it's not a kernel bug because as long as you don't use xfs_fsr,
> nothing will ever happen.
"As long as you don't boot, it will not crash".
xfs_fsr uses syscalls, just like other applications. According to your
(wrong) logic, if an application uses chown and this causes a kernel oops,
this is also not a kernel bug.
That's of course wrong - it's the kernel that crashes when an application
performs certain access patterns.
> (rw,nodiratime,relatime,logbufs=8,logbsize=256k,attr2,barrier,largeio,swalloc)
> and sometimes also
> ,allocsize=64m
As has been reported on this list, this option is really harmful on
current xfs - in my case, it led to xfs causing ENOSPC even when the disk
was 40% empty (~188 GB).
> and I can't find evidence for fragmentation that would be harmful. Yes
Well, define "harmful" - slow logfile reads aren't what I consider
"harmful" either. It's just very very slow.
> The allocsize option helps a lot there. I looked at one webserver access
> log, it has 640MB with 99 fragments, but that's not a lot. On our
> Spamgate I see 250MB logs with 374 fragments.
Well, if it were one fragment, you could read that in 4-5 seconds; at 374
fragments, it's probably around 6-7 seconds. That's not harmful, but if you
extrapolate this to a few gigabytes and a lot of files, it becomes quite
the overhead.
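
As a rough illustration of the estimate above, here is a minimal sketch
that models read time as transfer time plus one seek per extra fragment.
The throughput (55 MB/s) and seek cost (6 ms) are assumptions picked only
to be roughly consistent with the figures quoted here, not measurements,
and the function name is hypothetical.

```python
def estimate_read_seconds(size_mb, fragments,
                          throughput_mb_s=55.0, seek_ms=6.0):
    """Rough read time for a file stored in `fragments` extents.

    throughput_mb_s and seek_ms are ASSUMED values for a single
    rotating disk; adjust them for the hardware in question.
    """
    transfer = size_mb / throughput_mb_s          # time to stream the data
    seeks = max(fragments - 1, 0) * seek_ms / 1000.0  # one seek per extra extent
    return transfer + seeks


if __name__ == "__main__":
    for frags in (1, 374):
        secs = estimate_read_seconds(250, frags)
        print(f"250 MB in {frags} fragment(s): {secs:.1f} s")
```

Under these assumptions the 250 MB log comes out at about 4.5 s in one
fragment and about 6.8 s in 374 fragments. The seek term grows with the
number of fragments, so it adds up over many large, fragmented files.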
> don't use the allocsize option there, which I changed now that I looked
That allocsize option is no longer reasonable with newer kernels, as the
kernel will reserve 64 MB of disk space even for 1 KB files, indefinitely.
> > If XFS is bad at append-only workloads, which is the most common type
> > of workload, then XFS fails to be very relevant for the real world.
>
> may be valid for your world, not mine. We have webservers, fileservers
> and database servers, all of which are not really append style, but more
> delete-and-recreate.
If you find a way of recreating files without appending to them, let me
know.
The problem with fragmentation is that it happens even with a few writers
for "create file" workloads (which do append...).
You probably make a distinction between "writing a file fast" and "writing
a file slowly", but the distinction is not a qualitative difference. On busy
servers that create a lot of files, you get fragmentation the same way
as on less busy servers that write files more slowly. There is little to no
difference in the resulting patterns.
> Well, db-servers are rather exceptional here.
Yes, append style is what makes up the vast majority of disk writes on
a normal system, db-servers excepted indeed.
> But if the numbers for fragmentation on your servers are true, you must
> have a very good test case for fragmentation prevention. Therefore it
> could be really interesting if you could grab what Dave Chinner asked
> for:
I'll keep it in mind.
> And maybe he could use it for optimizations. Is there any tool on Linux
> to record such I/O patterns?
I presume strace would do, but that's where the "lot of work" comes in. If
there is a ready-to-use tool, that would of course make it easy.
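
To give an idea of what the strace route could look like, here is a
minimal sketch that summarizes write() calls per file descriptor from a
saved trace. It assumes the trace was captured with something like
"strace -f -tt -e trace=write -o trace.log -p PID" and without -T (which
appends timings after the return value). The function name and the log
file name are hypothetical, and the sketch does not map descriptors back
to file names.

```python
import re
from collections import defaultdict

# Matches successful calls such as:
#   1234 13:15:27.000001 write(3, "..."..., 4096) = 4096
# Failed calls ("= -1 EAGAIN ...") and pwrite64() are deliberately skipped.
WRITE_RE = re.compile(r'\bwrite\((\d+),.*\)\s*=\s*(\d+)\s*$')


def summarize_writes(lines):
    """Return {fd: (number_of_writes, total_bytes)} for successful write() calls."""
    stats = defaultdict(lambda: [0, 0])
    for line in lines:
        m = WRITE_RE.search(line)
        if m is None:
            continue
        fd, nbytes = int(m.group(1)), int(m.group(2))
        stats[fd][0] += 1
        stats[fd][1] += nbytes
    return {fd: (count, total) for fd, (count, total) in stats.items()}


if __name__ == "__main__":
    # "trace.log" is an assumed file name matching the -o option above.
    with open("trace.log") as fh:
        for fd, (count, total) in sorted(summarize_writes(fh).items()):
            print(f"fd {fd}: {count} writes, {total} bytes")
```

Many small writes spread across several descriptors at the same time would
be the interleaved-append pattern discussed in this thread. Turning that
into a replayable test case is still the "lot of work" part.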
--
The choice of a Deliantra, the free code+content MORPG
-----==- _GNU_ http://www.deliantra.net
----==-- _ generation
---==---(_)__ __ ____ __ Marc Lehmann
--==---/ / _ \/ // /\ \/ / schmorp@schmorp.de
-=====/_/_//_/\_,_/ /_/\_\
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs