From: Guido Fiala <gfiala@s.netic.de>
To: Badari Pulavarty <pbadari@gmail.com>
Cc: lkml <linux-kernel@vger.kernel.org>
Subject: Re: large files unnecessary trashing filesystem cache?
Date: Thu, 20 Oct 2005 17:23:52 +0200
Message-ID: <200510201723.52858.gfiala@s.netic.de>
In-Reply-To: <1129668484.23632.82.camel@localhost.localdomain>

On Tuesday 18 October 2005 22:48, Badari Pulavarty wrote:
> On Tue, 2005-10-18 at 22:01 +0200, Guido Fiala wrote:
> > Story:
> > Once in while we have a discussion at the vdr (video disk recorder)
> > mailing list about very large files trashing the filesystems memory cache
> > leading to unnecessary delays accessing directory contents no longer
> > cached.
> > [...]
> Is there a reason why those applications couldn't use O_DIRECT ?
>
> Thanks,
> Badari

I asked a VDR expert about this, and here is the reason why O_DIRECT is
not suitable:

O_DIRECT would be great if it were a simple option for opening files.
But as a matter of fact, O_DIRECT completely changes the semantics of
file access. You have to read blocks of a defined size into memory that
is aligned to defined block boundaries. Memory provided by a normal
malloc() or new is not usable and results in I/O errors. So the result
is that you have to completely rewrite the whole I/O subsystem of the
affected program. Most maintainers of non-trivial applications are
completely resistant to such changes, for good reasons.
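
To illustrate the alignment constraint described above, here is a
minimal sketch in C. The helper names (alloc_aligned, is_aligned,
read_direct) and the 4096-byte ALIGNMENT are assumptions made for
illustration only; the real alignment requirement depends on the
device and filesystem and is not stated in this thread.

```c
#define _GNU_SOURCE            /* exposes O_DIRECT in <fcntl.h> on Linux/glibc */
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Assumed alignment; the real value depends on device and filesystem. */
#define ALIGNMENT 4096

/* Hypothetical helper: is the pointer aligned to ALIGNMENT bytes? */
static int is_aligned(const void *p)
{
    return ((uintptr_t)p % ALIGNMENT) == 0;
}

/* Hypothetical helper: allocate a buffer that is usable for O_DIRECT,
 * in place of a plain malloc(). Release it with free(). */
static void *alloc_aligned(size_t size)
{
    void *buf = NULL;
    if (posix_memalign(&buf, ALIGNMENT, size) != 0)
        return NULL;
    return buf;
}

/* Hypothetical helper: read `size` bytes from the start of `path`,
 * bypassing the page cache. Returns bytes read, or -1 on error. */
static ssize_t read_direct(const char *path, void *buf, size_t size)
{
    /* Both the buffer address and the length must respect the alignment. */
    assert(is_aligned(buf) && size % ALIGNMENT == 0);

    int fd = open(path, O_RDONLY | O_DIRECT);
    if (fd < 0)
        return -1;              /* some filesystems reject O_DIRECT */
    ssize_t n = read(fd, buf, size);
    close(fd);
    return n;
}
```

Every buffer that reaches read() has to come from something like
alloc_aligned(), and every request length has to be a multiple of the
block size, which is why an existing I/O layer cannot simply add the
flag to its open() call.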

If there were an O_DIRECT_EX32++ (or O_STREAMING) that did not come
with this change in semantics, it would be much easier to make the
necessary changes.

BTW: In the case of the VDR program, not even a per-process limit on
buffer cache usage would help: the same program reads huge files _and_
huge directory trees with a lot of small files that should be cached. A
heuristic for this case has to work on a per-file basis. It needs to
detect that some files are only used in a streaming manner, with only
occasional jumps in random directions (skipping commercials, reviewing
a scene). I don't know if such a heuristic is possible and if it would
not break other things.

PS: Using posix_fadvise() helps a bit. You can keep the normal I/O
semantics, but you have to add a virtualisation layer for all streaming
I/O. And you can't combine posix_fadvise(POSIX_FADV_DONTNEED) with
posix_fadvise(POSIX_FADV_WILLNEED) when your access pattern may contain
jumps, because you can't cancel (at least to my knowledge) the
POSIX_FADV_WILLNEED call once you see that the read-ahead is no longer
needed. It would be an interesting addition if POSIX_FADV_DONTNEED
cancelled the read of a region that had previously been requested with
POSIX_FADV_WILLNEED.
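
As a rough sketch of the posix_fadvise() approach mentioned above, the
following C function reads a file with ordinary read() calls and tells
the kernel after each chunk that the pages just consumed are no longer
needed. The function name stream_file is hypothetical and is not taken
from VDR. The sketch deliberately issues no POSIX_FADV_WILLNEED
request, so the cancellation question raised above does not come up.

```c
#define _POSIX_C_SOURCE 200809L   /* for posix_fadvise() */
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: stream through `path` sequentially using a
 * caller-supplied buffer. Returns total bytes read, or -1 on error. */
static long long stream_file(const char *path, char *buf, size_t bufsize)
{
    assert(buf != NULL && bufsize > 0);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    off_t done = 0;
    ssize_t n;
    while ((n = read(fd, buf, bufsize)) > 0) {
        /* ... the application would consume buf[0..n) here ... */

        /* Advisory hint only: the range just read will not be needed
         * again. The return value is ignored because failure of the
         * hint does not affect the correctness of the read. */
        (void)posix_fadvise(fd, done, n, POSIX_FADV_DONTNEED);
        done += n;
    }

    close(fd);
    return n < 0 ? -1 : (long long)done;
}
```

The buffer can come from plain malloc() or the stack, so the normal I/O
semantics are preserved, but every streaming read in the program has to
be routed through a wrapper like this one.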

Ralf (forwarded by me on his request)

---
Hopefully I have now used "reply all" correctly. Sorry if I accidentally
caused any trouble.
