Re: Lazy block allocation and block_prepare_write?


From: Badari Pulavarty <pbadari@us.ibm.com>
To: Martin Jambor <jamborm@gmail.com>
Cc: linux-fsdevel@vger.kernel.org
Subject: Re: Lazy block allocation and block_prepare_write?
Date: Mon, 18 Apr 2005 20:01:24 -0700
Message-ID: <42647484.5040208@us.ibm.com>
In-Reply-To: <8e70aacf05041717546fdff3f@mail.gmail.com>

Martin Jambor wrote:

> Hi all,
> 
> I am a member of a group that implements a filesystem that allocates
> disk blocks to in-memory blocks lazily, that means, the decision is
> made just before the data are actually sent to disk. Moreover, when
> cached pages are modified, the data can be (and almost certainly will
> be) written to a different place from where it was read.
> 
> I was wondering whether we could use the generic function
> block_prepare_write at all. The function checks every buffer of the
> page and if it is not mapped, it calls a fs supplied function that is
> supposed to map the buffer, i.e. assign it a block on the device and
> set its mapped flag.
> 
> This is where we would like to give an error if there is not enough
> free disk space left but we cannot give a specific device block number
> yet. Can we make one up, such as -1? What would that do to such dark
> functions as unmap_underlying_metadata or any other? Would some other
> part of kernel break if there was a bunch of buffers assigned to the
> same spot on the disk?
> 
> On the other hand, if I understand buffer flags correctly, I need to
> be able to emulate mapping of buffers to set them dirty, or am I
> wrong?
> 
> Thanks for any insight or thoughts,

Yes, it is possible to do what you want. I am currently working on
adding "delayed allocation" support to ext3. As part of that, we
are modifying the generic helper routines to delay the allocation from
prepare time to actual writeout time (writepage).

Here is the basic idea:
=======================

The idea is to "reserve" a block at prepare/commit write time instead
of allocating it, and to do the actual allocation in writepage().
Sounds simple :)
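
To make the reserve-then-allocate split concrete, here is a minimal
user-space sketch. It is only a model under stated assumptions: the
toy_* structures and functions are hypothetical names invented for
illustration, not kernel interfaces and not the ext3 patch itself.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct toy_fs {
	unsigned long free_blocks;     /* neither reserved nor allocated */
	unsigned long reserved_blocks; /* promised to dirty data, no disk address yet */
	unsigned long next_block;      /* trivial allocator: hand out blocks in order */
};

struct toy_buffer {
	bool reserved;         /* space was promised at prepare/commit time */
	bool mapped;           /* a real block number has been assigned */
	unsigned long blocknr; /* valid only when mapped is true */
};

/* Prepare/commit time: only account for the space, so -ENOSPC is reported early. */
static int toy_reserve_block(struct toy_fs *fs, struct toy_buffer *b)
{
	if (b->mapped || b->reserved)
		return 0;
	if (fs->free_blocks == 0)
		return -ENOSPC;
	fs->free_blocks--;
	fs->reserved_blocks++;
	b->reserved = true;
	return 0;
}

/* Writeout time: turn the reservation into a real block number. */
static int toy_allocate_block(struct toy_fs *fs, struct toy_buffer *b)
{
	if (b->mapped)
		return 0;
	if (b->reserved) {
		assert(fs->reserved_blocks > 0);
		fs->reserved_blocks--;
		b->reserved = false;
	} else {
		/* No reservation was made, so space must be found now. */
		if (fs->free_blocks == 0)
			return -ENOSPC;
		fs->free_blocks--;
	}
	b->blocknr = fs->next_block++;
	b->mapped = true;
	return 0;
}
```

In this model the out-of-space error surfaces in toy_reserve_block(),
while the block number is chosen only in toy_allocate_block().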

Here are the issues:
====================

1) Currently none of the generic helper routines can handle this.
We need to add support for it, but still somehow keep the
routines generic enough for everyone's use.

2) There is no easy way for writepage() to tell reliably whether we
"reserved" a block or not. There are 2 paths to writepage():

	sys_write() -> prepare/commit()
		and later sync() ----> writepage()

	mmap() -> touch a page()
		and later --> writepage()

In order to do the correct accounting, we need to mark a page
to indicate whether we reserved a block or not. One way to do this is
to use page->private. But then all the generic
routines will fail, since they assume that page->private points to
bufferheads. So we need a better way to do this.
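
As an illustration of the accounting problem, the sketch below models
both paths in user space. It assumes, purely for the sake of the
example, a separate per-page flag bit (ACCT_PG_RESERVED) so that the
stand-in for page->private is left alone. The flag and all acct_*
names are hypothetical, and this is one conceivable marking scheme,
not a settled design.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define ACCT_PG_RESERVED 0x1UL /* hypothetical flag bit, not a real PG_* flag */

struct acct_fs {
	long free_blocks;
	long reserved_blocks;
};

struct acct_page {
	unsigned long flags;
	void *private_data; /* stands in for page->private; never touched here */
};

/* Path 1: sys_write() -> prepare/commit. Reserve once and mark the page. */
static int acct_prepare_write(struct acct_fs *fs, struct acct_page *pg)
{
	if (pg->flags & ACCT_PG_RESERVED)
		return 0;
	if (fs->free_blocks <= 0)
		return -ENOSPC;
	fs->free_blocks--;
	fs->reserved_blocks++;
	pg->flags |= ACCT_PG_RESERVED;
	return 0;
}

/* Both paths end up here; the flag says which accounting applies. */
static int acct_writepage(struct acct_fs *fs, struct acct_page *pg)
{
	if (pg->flags & ACCT_PG_RESERVED) {
		/* Came through prepare/commit: consume the reservation. */
		assert(fs->reserved_blocks > 0);
		fs->reserved_blocks--;
		pg->flags &= ~ACCT_PG_RESERVED;
		return 0;
	}
	/* Path 2: page dirtied through mmap(), nothing was reserved. */
	if (fs->free_blocks <= 0)
		return -ENOSPC;
	fs->free_blocks--;
	return 0;
}
```

Without some such marker, acct_writepage() could not tell the two
callers apart and would either count a block twice or not at all.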

3) We need to add hooks from these generic routines into
filesystem-specific calls to handle "journaling mode" requirements
(for ext3 and maybe others).
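
One possible shape for such hooks is a table of optional callbacks
that a generic writeout helper invokes around the allocation step.
The sketch below is a user-space model only: hook_ops,
hook_generic_writepage() and the demo callbacks are hypothetical
names, and the real helpers may end up looking quite different.

```c
#include <assert.h>
#include <stddef.h>

struct hook_page {
	int dirty;
	int journal_calls; /* counts journal hook invocations, for the demo only */
};

/* Hypothetical per-filesystem hook table. */
struct hook_ops {
	int (*allocate)(struct hook_page *pg);      /* required */
	int (*journal_start)(struct hook_page *pg); /* optional, may be NULL */
	int (*journal_stop)(struct hook_page *pg);  /* optional, may be NULL */
};

/* Generic helper: a filesystem without a journal leaves the hooks NULL. */
static int hook_generic_writepage(struct hook_page *pg, const struct hook_ops *ops)
{
	int err, err2;

	assert(ops != NULL && ops->allocate != NULL);

	if (ops->journal_start) {
		err = ops->journal_start(pg);
		if (err)
			return err;
	}
	err = ops->allocate(pg);
	if (!err)
		pg->dirty = 0;
	if (ops->journal_stop) {
		/* Always close what was opened, but keep the first error. */
		err2 = ops->journal_stop(pg);
		if (!err)
			err = err2;
	}
	return err;
}

/* Demo callbacks standing in for filesystem-specific code. */
static int hook_demo_allocate(struct hook_page *pg)
{
	(void)pg;
	return 0;
}

static int hook_demo_journal(struct hook_page *pg)
{
	pg->journal_calls++;
	return 0;
}
```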

So, what are your requirements? I am looking for a common
way to combine all the requirements and come up with
saner "generic" routines to handle these.

Thanks,
Badari

