From: Guido Fiala <gfiala@s.netic.de>
To: Badari Pulavarty <pbadari@gmail.com>
Cc: lkml <linux-kernel@vger.kernel.org>
Subject: Re: large files unnecessary trashing filesystem cache?
Date: Thu, 20 Oct 2005 17:23:52 +0200
Message-ID: <200510201723.52858.gfiala@s.netic.de>
In-Reply-To: <1129668484.23632.82.camel@localhost.localdomain>
On Tuesday 18 October 2005 22:48, Badari Pulavarty wrote:
> On Tue, 2005-10-18 at 22:01 +0200, Guido Fiala wrote:
> > Story:
> > Once in while we have a discussion at the vdr (video disk recorder)
> > mailing list about very large files trashing the filesystems memory cache
> > leading to unnecessary delays accessing directory contents no longer
> > cached.
> > [...]
> Is there a reason why those applications couldn't use O_DIRECT ?
>
> Thanks,
> Badari

I asked a VDR expert about this, and here is his reason why O_DIRECT is not
suitable:

O_DIRECT would be great if it were a simple option for opening files.
But as a matter of fact O_DIRECT completely changes the semantics of
file access. You have to read blocks of a defined size into memory that
is aligned to defined block boundaries. Memory provided by plain malloc()
or new is generally not aligned that way, and using it results in I/O
errors. So you end up having to rewrite the whole I/O subsystem of the
affected program. Most maintainers of non-trivial applications
strongly resist such changes, for good reasons.
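
To illustrate the constraint described above, here is a minimal sketch in C of a single O_DIRECT read. The 4096-byte alignment (ASSUMED_ALIGN) and the helper names is_aligned() and read_block_direct() are assumptions made for this example; the real alignment requirement depends on the kernel, the filesystem, and the device.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* exposes O_DIRECT in <fcntl.h> on Linux/glibc */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

#ifndef O_DIRECT
#define O_DIRECT 0 /* fallback so the sketch still compiles; this is then plain buffered I/O */
#endif

/* Assumed value for illustration only; not a universal requirement. */
#define ASSUMED_ALIGN 4096

/* Return nonzero if p lies on an ASSUMED_ALIGN boundary. */
static int is_aligned(const void *p)
{
    return ((uintptr_t)p % ASSUMED_ALIGN) == 0;
}

/*
 * Read one block of ASSUMED_ALIGN bytes at block index block_no.
 * Buffer address, transfer size, and file offset are all multiples of
 * ASSUMED_ALIGN. On success the buffer is returned through *out (the
 * caller frees it) and the number of bytes read is returned; -1 on error.
 */
static ssize_t read_block_direct(const char *path, off_t block_no, void **out)
{
    void *buf = NULL;

    /* malloc() gives no such alignment guarantee, so use posix_memalign(). */
    if (posix_memalign(&buf, ASSUMED_ALIGN, ASSUMED_ALIGN) != 0)
        return -1;
    assert(is_aligned(buf));

    int fd = open(path, O_RDONLY | O_DIRECT);
    if (fd < 0) {
        free(buf);
        return -1;
    }

    ssize_t n = pread(fd, buf, ASSUMED_ALIGN, block_no * ASSUMED_ALIGN);
    close(fd);
    if (n < 0) {
        free(buf);
        return -1;
    }

    *out = buf;
    return n;
}
```

Every read path in a program that hands arbitrary buffers and lengths to read() would have to be reworked along these lines, which is the rewrite cost the paragraph above refers to.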

If there were an O_DIRECT_EX32++ (or O_STREAMING) that did not come with
this change in semantics, it would be much easier to make the necessary
changes.

BTW: In the case of the VDR program, not even a per-process limit on
buffer cache usage would help: the same program reads huge files _and_ huge
directory trees with a lot of small files that should be cached. A
heuristic for this case has to work on a per-file basis. It needs to
detect that some files are only used in a streaming manner, with only
occasional jumps in either direction (skipping commercials, reviewing a
scene). I don't know if such a heuristic is possible or whether it would
break other things.
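
As a purely illustrative sketch of what "mostly sequential with rare jumps" could mean per file, the following C fragment counts sequential reads versus jumps. Everything here is hypothetical: struct access_stats, note_read(), looks_streaming(), and the thresholds MIN_SAMPLES and MAX_JUMP_PERCENT are invented for this example and do not describe any existing kernel mechanism.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical thresholds, chosen only for illustration. */
#define MIN_SAMPLES 16       /* do not judge a file before this many reads */
#define MAX_JUMP_PERCENT 5   /* tolerate up to 5% non-sequential reads */

/* Per-file bookkeeping; zero-initialize before first use. */
struct access_stats {
    long long next_expected;  /* offset just past the previous read */
    unsigned long sequential; /* reads that continued where the last ended */
    unsigned long jumps;      /* reads that started somewhere else */
};

/* Record one read of len bytes at offset off. */
static void note_read(struct access_stats *s, long long off, size_t len)
{
    if (off == s->next_expected)
        s->sequential++;
    else
        s->jumps++;
    s->next_expected = off + (long long)len;
}

/* True if the file has been read almost entirely front to back so far. */
static bool looks_streaming(const struct access_stats *s)
{
    unsigned long total = s->sequential + s->jumps;

    if (total < MIN_SAMPLES)
        return false;
    return s->jumps * 100 <= total * MAX_JUMP_PERCENT;
}
```

A file that passes looks_streaming() would be a candidate for dropping its pages early, while small files and directories would never collect enough samples to qualify. Whether such a rule misclassifies other workloads is exactly the open question raised above.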

PS: using posix_fadvise() helps a bit. One can keep the I/O semantics, but
you have to add a virtualisation layer for all streaming I/O. And you can't
combine posix_fadvise(POSIX_FADV_DONTNEED) with
posix_fadvise(POSIX_FADV_WILLNEED) when you possibly have jumps in your
access pattern, because you can't cancel (at least to my knowledge) the
POSIX_FADV_WILLNEED call once you see that the read-ahead is no longer
needed. It would be an interesting addition if POSIX_FADV_DONTNEED
cancelled the read of a region that had previously been requested with
POSIX_FADV_WILLNEED.
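
The posix_fadvise() approach mentioned above can be sketched roughly as follows. This is a minimal example, assuming a 64 KiB chunk size; the function name stream_and_drop() and the CHUNK constant are made up for illustration. It uses an ordinary malloc() buffer and ordinary reads, and only adds a hint after each chunk.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Assumed chunk size for illustration. */
#define CHUNK (64 * 1024)

/*
 * Read fd sequentially from offset 0 to end of file using a normal,
 * unaligned buffer. After each chunk has been consumed, tell the kernel
 * that the cached pages for that range are no longer needed.
 * Returns the total number of bytes read, or -1 on error.
 */
static long long stream_and_drop(int fd)
{
    char *buf = malloc(CHUNK);
    off_t pos = 0;

    if (buf == NULL)
        return -1;

    for (;;) {
        ssize_t n = pread(fd, buf, CHUNK, pos);
        if (n < 0) {
            free(buf);
            return -1;
        }
        if (n == 0)
            break; /* end of file */

        /* ... consume buf[0..n) here ... */

        /* Advisory only: the result is ignored because it is just a hint. */
        (void)posix_fadvise(fd, pos, n, POSIX_FADV_DONTNEED);
        pos += n;
    }

    free(buf);
    return (long long)pos;
}
```

The sketch deliberately issues no POSIX_FADV_WILLNEED: as described above, a read-ahead requested that way cannot be withdrawn if the viewer jumps elsewhere in the file.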

Ralf (forwarded by me at his request)
---
Hopefully I did "reply all" correctly this time - sorry if I accidentally
caused any trouble.
Thread overview: 25+ messages
2005-10-18 20:01 large files unnecessary trashing filesystem cache? Guido Fiala
2005-10-18 20:48 ` Badari Pulavarty
2005-10-20 15:23 ` Guido Fiala [this message]
2005-10-19 3:02 ` Andrew James Wade
2005-10-19 4:37 ` Andrew Morton
2005-10-19 5:45 ` Andrew James Wade
2005-10-19 11:01 ` gfiala
2005-10-19 11:10 ` gfiala
2005-10-19 15:54 ` Ingo Oeser
2005-10-19 19:49 ` Andrew Morton
2005-10-19 22:26 ` Paul Jackson
2005-10-20 6:28 ` Ingo Oeser
2005-10-19 4:10 ` Lee Revell
2005-10-19 15:43 ` Badari Pulavarty
2005-10-19 17:58 ` Guido Fiala
2005-10-19 18:43 ` Kyle Moffett
2005-10-19 18:52 ` Guido Fiala
[not found] <4Z5WG-1iM-19@gated-at.bofh.it>
[not found] ` <4Z6zs-27l-39@gated-at.bofh.it>
2005-10-18 21:58 ` Bodo Eggert
2005-10-18 23:05 ` Badari Pulavarty
2005-10-19 0:20 ` David Lang
2005-10-19 0:33 ` Fawad Lateef
2005-10-19 1:42 ` Bernd Eckenfels
2005-10-19 7:23 ` Bodo Eggert
2005-10-19 11:06 ` gfiala
2005-10-19 13:43 ` Avi Kivity