From: Zheng Liu <gnehzuil.liu@gmail.com>
To: Lukas Czerner <lczerner@redhat.com>
Cc: linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org
Subject: Re: [RFC] fadvise: add more flags to provide a hint for block allocation
Date: Tue, 6 Mar 2012 21:56:56 +0800
Message-ID: <20120306135656.GB24695@gmail.com>
In-Reply-To: <alpine.LFD.2.00.1203060912170.5085@dhcp-27-109.brq.redhat.com>
On Tue, Mar 06, 2012 at 09:27:16AM +0100, Lukas Czerner wrote:
> On Mon, 5 Mar 2012, Zheng Liu wrote:
>
> > Hi list,
> >
> > Block allocation is a key component of a file system. Every file system tries
> > to improve performance by optimizing the block allocation of a file. But no
> > matter what the file system does, it can only guess what the user expects, so
> > it is not very accurate. fadvise(2) provides a way for the user to give a
> > hint to the file system. However, only a few flags are provided so far. So we
> > can provide more flags to tell the file system how to allocate the blocks for
> > a file.
> >
> > For example:
> > we can add these flags into fadvise(2):
> > FADV_ALLOC_READ_SEQ
> > FADV_ALLOC_READ_RANDOM
> > FADV_ALLOC_WRITE_ONCE
> > FADV_ALLOC_WRITE_APPEND
> >
> > FADV_ALLOC_READ_* are not the same as FADV_SEQUENTIAL and FADV_RANDOM.
> > FADV_ALLOC_READ_SEQ tells the file system that this file needs to be allocated
> > in sequential blocks, and FADV_ALLOC_READ_RANDOM tells the file system that
> > this file can tolerate fragmentation.
> >
> > FADV_ALLOC_WRITE_ONCE indicates that this file is written only once, so the
> > file system can allocate sequential blocks for it to improve read
> > performance. FADV_ALLOC_WRITE_APPEND indicates that data will be appended to
> > the end of this file, so the file system can reserve some blocks for it to
> > keep the file as sequential as possible.
>
> Hi Zheng,
>
> Those two flags do not make sense to me. FADV_ALLOC_WRITE_ONCE is
> actually the same as fallocate, and we certainly do not need more ways
> to do fallocate, one is more than enough.
>
> FADV_ALLOC_WRITE_APPEND seems weird. File systems already do some
> preallocation for files, so we do not fragment them as much. So
> what might be more interesting is to be able to set how much space we
> want to keep preallocated for a particular file. Strictly speaking,
> that is not something we could not achieve with fallocate, but it
> would certainly be more convenient.
>
> -Lukas
>
Hi Lukas,

I have realized that these two flags are redundant, and we don't need
them.

As we discussed previously, and following Sunil's suggestions, the key
issue is that the user provides a hint to the file system, so that the
file system knows whether a file can be stored in a corner or allocated
in non-sequential blocks. The sequential blocks are then reserved for
files that have a *_HOT* flag. Although fallocate(2) can preallocate
blocks for a file, it cannot place the file at the beginning of the
disk to obtain better performance. So maybe the file system can use
these flags to optimize the layout of a file.
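
To make the comparison with fallocate concrete, here is a minimal sketch
of what an application can already do with the existing interfaces. It
only uses posix_fallocate(3) and posix_fadvise(2) with the existing
POSIX_FADV_SEQUENTIAL advice. The helper name prepare_write_once() is
made up for illustration, and none of the proposed FADV_ALLOC_* or
*_HOT* flags appear in it because they do not exist. Note that neither
call lets the caller say where on the disk the blocks should go, which
is the gap described above.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not part of any existing API): reserve `len`
 * bytes for `fd` up front, then hint that the range will be read
 * sequentially. Both calls return 0 on success or an error number on
 * failure; they do not set errno.
 */
int prepare_write_once(int fd, off_t len)
{
	/* Preallocate the blocks now; this is the part that overlaps
	 * with the FADV_ALLOC_WRITE_ONCE idea. */
	int err = posix_fallocate(fd, 0, len);
	if (err != 0)
		return err;

	/* Existing access-pattern hint; it does not influence where
	 * the blocks are placed on disk. */
	return posix_fadvise(fd, 0, len, POSIX_FADV_SEQUENTIAL);
}
```

A caller would open the file for writing, call
prepare_write_once(fd, expected_size) once, and then write the data as
usual.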
Regards,
Zheng
> >
> > File systems can support a subset of these flags according to their design.
> > These flags provide a rich interface that lets the user control the block
> > allocation of files. The user could precisely control the allocation of their
> > files to improve the performance of applications.
> >
> > Any comments or suggestions are appreciated. Thank you.
> >
> > Regards,
> > Zheng