linux-fsdevel.vger.kernel.org archive mirror
* Access content of file via inodes
@ 2005-04-05  1:23 Kathy KN
  2005-04-05  7:22 ` Christoph Hellwig
                   ` (2 more replies)
  0 siblings, 3 replies; 23+ messages in thread
From: Kathy KN @ 2005-04-05  1:23 UTC (permalink / raw)
  To: linux-fsdevel

Good day all,

How do I access/read the content of files using inodes, or the
blocks that belong to an inode, at the sys_link and vfs_link layer?
I used bmap to find the blocks that belong to the inode, but
getting access to the buffer_head's b_data doesn't seem to help.

Kindly advise.

Kathy

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Access content of file via inodes
  2005-04-05  1:23 Access content of file via inodes Kathy KN
@ 2005-04-05  7:22 ` Christoph Hellwig
  2005-04-05 17:53 ` Bryan Henderson
  2005-04-05 19:01 ` Jeff Mahoney
  2 siblings, 0 replies; 23+ messages in thread
From: Christoph Hellwig @ 2005-04-05  7:22 UTC (permalink / raw)
  To: Kathy KN; +Cc: linux-fsdevel

On Tue, Apr 05, 2005 at 09:23:19AM +0800, Kathy KN wrote:
> Good day all,
> 
> How do I access/read the content of the files via using inodes
> or blocks that belong to the inode, at sys_link and vfs_link layer?
> I used bmap to access the blocks that belongs to the inodes, but
> getting access to the buffer_head's b_data doesn't seem to help.

You don't.  There might not even be any blocks.



* Re: Access content of file via inodes
  2005-04-05  1:23 Access content of file via inodes Kathy KN
  2005-04-05  7:22 ` Christoph Hellwig
@ 2005-04-05 17:53 ` Bryan Henderson
  2005-04-06  1:27   ` Kathy KN (HK)
  2005-04-05 19:01 ` Jeff Mahoney
  2 siblings, 1 reply; 23+ messages in thread
From: Bryan Henderson @ 2005-04-05 17:53 UTC (permalink / raw)
  To: Kathy KN; +Cc: linux-fsdevel

>How do I access/read the content of the files via using inodes
>or blocks that belong to the inode, at sys_link and vfs_link layer?

This is tricky because many interfaces that one would expect to use an 
inode as a file handle use a dentry instead.  To read the contents of a 
file via the VFS interface, you need a file pointer (struct file), and the 
file pointer identifies the file by dentry.  So you need to create a dummy 
dentry, which you can do with d_alloc_root(), and then create the file 
pointer with dentry_open(), then read the file with vfs_read().

That's for "via inodes."  I don't know what "via blocks" means.

--
Bryan Henderson                          IBM Almaden Research Center
San Jose CA                              Filesystems


* Re: Access content of file via inodes
  2005-04-05  1:23 Access content of file via inodes Kathy KN
  2005-04-05  7:22 ` Christoph Hellwig
  2005-04-05 17:53 ` Bryan Henderson
@ 2005-04-05 19:01 ` Jeff Mahoney
  2005-04-06  1:32   ` Kathy KN (HK)
  2005-04-08  6:01   ` Kathy KN (HK)
  2 siblings, 2 replies; 23+ messages in thread
From: Jeff Mahoney @ 2005-04-05 19:01 UTC (permalink / raw)
  To: Kathy KN; +Cc: linux-fsdevel

Kathy KN wrote:
> Good day all,
> 
> How do I access/read the content of the files via using inodes
> or blocks that belong to the inode, at sys_link and vfs_link layer?
> I used bmap to access the blocks that belongs to the inodes, but
> getting access to the buffer_head's b_data doesn't seem to help.

Hi Kathy -

What you're trying to do is possible, but you need to go about it in a
different way. Ignore the buffer cache completely and use the page
cache; it's more appropriate for file contents.

You have two options:

If performance isn't critical, a simple approach would be to use your
old_dentry pointer to dentry_open a file and then vfs_read from it to a
buffer you allocate. Make sure you use get_fs/set_fs, since vfs_read
won't accept a kernel pointer otherwise.

If performance is more important or you really do only have access to an
inode, you can read from the page cache directly using inode->i_mapping
and read_cache_page. This has the advantage that you don't need to copy
the data to access it, but the disadvantage that it is more complex and
can be tricky to get right.

-Jeff

-- 
Jeff Mahoney
SuSE Labs


* Re: Access content of file via inodes
  2005-04-05 17:53 ` Bryan Henderson
@ 2005-04-06  1:27   ` Kathy KN (HK)
  2005-04-06  1:53     ` Jeff Mahoney
                       ` (2 more replies)
  0 siblings, 3 replies; 23+ messages in thread
From: Kathy KN (HK) @ 2005-04-06  1:27 UTC (permalink / raw)
  To: Bryan Henderson; +Cc: linux-fsdevel

On Apr 6, 2005 1:53 AM, Bryan Henderson <hbryan@us.ibm.com> wrote:
> >How do I access/read the content of the files via using inodes
> >or blocks that belong to the inode, at sys_link and vfs_link layer?
> 
> This is tricky because many interfaces that one would expect to use an
> inode as a file handle use a dentry instead.  To read the contents of a
> file via the VFS interface, you need a file pointer (struct file), and the
> file pointer identifies the file by dentry.  So you need to create a dummy
> dentry, which you can do with d_alloc_root(), and then create the file
> pointer with dentry_open(), then read the file with vfs_read().
> 
> That's for "via inodes."  I don't know what "via blocks" means.

Bryan,

Thanks for the description of how to read the contents of a file via
the VFS interface. I will try to write it in code and make sure that
I can read the file via the vfs_read() routine. What I meant by
"via blocks" is to find out which physical blocks the inode uses
and retrieve the content from them directly, by accessing b_data.

Kathy


* Re: Access content of file via inodes
  2005-04-05 19:01 ` Jeff Mahoney
@ 2005-04-06  1:32   ` Kathy KN (HK)
  2005-04-06  1:50     ` Jeff Mahoney
  2005-04-08  6:01   ` Kathy KN (HK)
  1 sibling, 1 reply; 23+ messages in thread
From: Kathy KN (HK) @ 2005-04-06  1:32 UTC (permalink / raw)
  To: Jeff Mahoney; +Cc: linux-fsdevel

> Hi Kathy -
> 
> If performance is more important or you really do only have access to an
> inode, you can read from the page cache directly using inode->i_mapping
> and read_cache_page. This has the advantage that you don't need to copy
> the data to access it, but the disadvantage that it is more complex and
> can be tricky to get right.

Hi Jeff,

I felt that the second suggestion seems to be more of an elegant
solution, though I need to find out how to actually do it correctly.

Thanks for the tip. If you have example code, that would be splendid.

Kathy


* Re: Access content of file via inodes
  2005-04-06  1:32   ` Kathy KN (HK)
@ 2005-04-06  1:50     ` Jeff Mahoney
  0 siblings, 0 replies; 23+ messages in thread
From: Jeff Mahoney @ 2005-04-06  1:50 UTC (permalink / raw)
  To: Kathy KN (HK); +Cc: linux-fsdevel

Kathy KN (HK) wrote:
>>Hi Kathy -
>>
>>If performance is more important or you really do only have access to an
>>inode, you can read from the page cache directly using inode->i_mapping
>>and read_cache_page. This has the advantage that you don't need to copy
>>the data to access it, but the disadvantage that it is more complex and
>>can be tricky to get right.
> 
> Hi Jeff,
> 
> I felt that the second suggestion seems to be more of an elegant
> solution, though I need to find out how to actually do it correctly.
> 
> Thanks for the tip. If you have example code, that would be splendid.

It's not the prettiest, and in fact I'm in the process of reworking it,
but code similar to what you're looking at implementing can be found in
fs/reiserfs/xattr.c; Start your analysis at reiserfs_xattr_get().

-Jeff

-- 
Jeff Mahoney
SuSE Labs


* Re: Access content of file via inodes
  2005-04-06  1:27   ` Kathy KN (HK)
@ 2005-04-06  1:53     ` Jeff Mahoney
  2005-04-06 17:57       ` Bryan Henderson
  2005-04-06  7:54     ` Anton Altaparmakov
  2005-04-06 11:33     ` Anton Altaparmakov
  2 siblings, 1 reply; 23+ messages in thread
From: Jeff Mahoney @ 2005-04-06  1:53 UTC (permalink / raw)
  To: Kathy KN (HK); +Cc: Bryan Henderson, linux-fsdevel

Kathy KN (HK) wrote:
> On Apr 6, 2005 1:53 AM, Bryan Henderson <hbryan@us.ibm.com> wrote:
>>>How do I access/read the content of the files via using inodes
>>>or blocks that belong to the inode, at sys_link and vfs_link layer?
>>This is tricky because many interfaces that one would expect to use an
>>inode as a file handle use a dentry instead.  To read the contents of a
>>file via the VFS interface, you need a file pointer (struct file), and the
>>file pointer identifies the file by dentry.  So you need to create a dummy
>>dentry, which you can do with d_alloc_root(), and then create the file
>>pointer with dentry_open(), then read the file with vfs_read().
>>
>>That's for "via inodes."  I don't know what "via blocks" means.
> 
> Bryan,
> 
> Thanks for the description on how to read the contents of a file via
> the VFS interface. I got to try to see if I can write it in codes, and make
> sure that I can read the file via the vfs_read() routine. What I meant by
> via blocks is to gain knowledge of the physical blocks used by the inodes
> and retrieve the content from it directly, by accessing b_data.

The problem with that approach is that some filesystems may store part
of the file outside of a complete block. For example, reiserfs "tails"
will respond with -ENOENT on ->bmap. For files smaller than 16k, they
are quite common.

-Jeff

-- 
Jeff Mahoney
SuSE Labs


* Re: Access content of file via inodes
  2005-04-06  1:27   ` Kathy KN (HK)
  2005-04-06  1:53     ` Jeff Mahoney
@ 2005-04-06  7:54     ` Anton Altaparmakov
  2005-04-06 11:33     ` Anton Altaparmakov
  2 siblings, 0 replies; 23+ messages in thread
From: Anton Altaparmakov @ 2005-04-06  7:54 UTC (permalink / raw)
  To: Kathy KN (HK); +Cc: Bryan Henderson, linux-fsdevel

On Wed, 6 Apr 2005, Kathy KN (HK) wrote:
> On Apr 6, 2005 1:53 AM, Bryan Henderson <hbryan@us.ibm.com> wrote:
> > >How do I access/read the content of the files via using inodes
> > >or blocks that belong to the inode, at sys_link and vfs_link layer?
> > 
> > This is tricky because many interfaces that one would expect to use an
> > inode as a file handle use a dentry instead.  To read the contents of a
> > file via the VFS interface, you need a file pointer (struct file), and the
> > file pointer identifies the file by dentry.  So you need to create a dummy
> > dentry, which you can do with d_alloc_root(), and then create the file
> > pointer with dentry_open(), then read the file with vfs_read().
> > 
> > That's for "via inodes."  I don't know what "via blocks" means.
> 
> Thanks for the description on how to read the contents of a file via
> the VFS interface. I got to try to see if I can write it in codes, and make
> sure that I can read the file via the vfs_read() routine. What I meant by
> via blocks is to gain knowledge of the physical blocks used by the inodes
> and retrieve the content from it directly, by accessing b_data.

You cannot do that safely because some file systems do not store things in 
whole blocks.  For example small files in ntfs are stored as a variable 
length and variable offset record inside the inode record on disk.  And 
compressed/encrypted files on ntfs are stored compressed/encrypted on disk 
and are decompressed/decrypted on access so there are no blocks you could 
usefully read at all.  (This is why ntfs does not implement ->bmap - it 
just makes no sense.)

Another thing is that ->bmap returns 0 for a sparse block, i.e. one that
is not allocated on disk and reads back as zeroes.  But ntfs, for example,
uses 0 as a valid block number you can read from/write to, so that is not
compatible with ->bmap either.

Best regards,

	Anton
-- 
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer / IRC: #ntfs on irc.freenode.net
WWW: http://linux-ntfs.sf.net/ & http://www-stu.christs.cam.ac.uk/~aia21/


* Re: Access content of file via inodes
  2005-04-06  1:27   ` Kathy KN (HK)
  2005-04-06  1:53     ` Jeff Mahoney
  2005-04-06  7:54     ` Anton Altaparmakov
@ 2005-04-06 11:33     ` Anton Altaparmakov
  2005-04-06 13:09       ` Jeffrey Mahoney
  2005-04-07  5:25       ` Kathy KN (HK)
  2 siblings, 2 replies; 23+ messages in thread
From: Anton Altaparmakov @ 2005-04-06 11:33 UTC (permalink / raw)
  To: Jeff Mahoney; +Cc: Kathy KN (HK), Bryan Henderson, linux-fsdevel

Jeff Mahoney wrote:
> Kathy KN (HK) wrote:
> > What I meant by via blocks is to gain knowledge of the physical
> > blocks used by the inodes and retrieve the content from it directly,
> > by accessing b_data.
> 
> The problem with that approach is that some filesystems may store part
> of the file outside of a complete block. For example, reiserfs "tails"
> will respond with -ENOENT on ->bmap. For files smaller than 16k, they
> are quite common.

This is one not true and two wrong!

Looking at reiserfs code in the current 2.6 kernel it does:

.bmap = reiserfs_aop_bmap,

Which is:

static sector_t reiserfs_aop_bmap(struct address_space *as, sector_t block)
{
        return generic_block_bmap(as, block, reiserfs_bmap);
}

And generic_block_bmap is:

sector_t generic_block_bmap(struct address_space *mapping, sector_t block,
                            get_block_t *get_block)
{
        struct buffer_head tmp;
        struct inode *inode = mapping->host;
        tmp.b_state = 0;
        tmp.b_blocknr = 0;
        get_block(inode, block, &tmp, 0);
        return tmp.b_blocknr;
}

It ignores any errors from get_block() and always returns tmp.b_blocknr.
Thus if get_block() fails, tmp.b_blocknr is 0 and hence 0 is returned,
i.e. a sparse block.  Which is complete rubbish...

And get_block in this case in reiserfs is:

static int reiserfs_bmap (struct inode * inode, sector_t block,
                          struct buffer_head * bh_result, int create)
{
    if (!file_capable (inode, block))
        return -EFBIG;

    reiserfs_write_lock(inode->i_sb);
    /* do not read the direct item */
    _get_block_create_0 (inode, block, bh_result, 0) ;
    reiserfs_write_unlock(inode->i_sb);
    return 0;
}

This will result in sparse blocks being returned whenever an error
occurs.  Not what is desired...

<rant>
The problem with ->bmap is that it cannot return error at all.  It
either returns 0 for sparse or >0 for real block.  ->bmap is the most
stupid interface I have ever seen...  )-:  If you ask me it should be
removed from the kernel without notice.  Let all applications that use
it break.  Who cares...  It can always be replaced with a sensible
interface that returns errors like -ESPARSE, -ENOTAPPLICABLE, -EIO,
-ENOMEM, etc and doesn't assume that 0 is sparse...
</rant>
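
To make the ambiguity concrete, here is a small userspace model (not
kernel code) of the call chain quoted above.  The names model_get_block,
model_bmap and model_bmap_checked are made up for illustration, and the
"checked" variant is only one guess at what an interface that can report
errors might look like:

```c
#include <errno.h>
#include <stdint.h>

typedef uint64_t sector_t;

/* Cut-down stand-in for struct buffer_head: only the field bmap looks at. */
struct model_bh {
        sector_t b_blocknr;
};

/*
 * Hypothetical get_block: logical block 0 is mapped to disk block 100,
 * block 1 is a hole (success, b_blocknr left untouched), anything else
 * fails with -EIO.
 */
int model_get_block(sector_t block, struct model_bh *bh)
{
        if (block == 0) {
                bh->b_blocknr = 100;
                return 0;
        }
        if (block == 1)
                return 0;
        return -EIO;
}

/* Same shape as generic_block_bmap(): the return value is thrown away. */
sector_t model_bmap(sector_t block)
{
        struct model_bh tmp = { .b_blocknr = 0 };

        model_get_block(block, &tmp);
        return tmp.b_blocknr;
}

/* One possible alternative: error as return value, block via pointer. */
int model_bmap_checked(sector_t block, sector_t *result)
{
        struct model_bh tmp = { .b_blocknr = 0 };
        int err = model_get_block(block, &tmp);

        if (err)
                return err;
        *result = tmp.b_blocknr;
        return 0;
}
```

With this model, model_bmap(1) (a hole) and model_bmap(2) (an I/O error)
both return 0, so the caller cannot tell them apart, while
model_bmap_checked(2, &blk) returns -EIO.  Note that even the checked
variant still reports a hole as block 0, so it would not help a
filesystem for which 0 is a valid block number.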

Best regards,

        Anton
-- 
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer / IRC: #ntfs on irc.freenode.net
WWW: http://linux-ntfs.sf.net/ & http://www-stu.christs.cam.ac.uk/~aia21/



* Re: Access content of file via inodes
  2005-04-06 11:33     ` Anton Altaparmakov
@ 2005-04-06 13:09       ` Jeffrey Mahoney
  2005-04-07  5:25       ` Kathy KN (HK)
  1 sibling, 0 replies; 23+ messages in thread
From: Jeffrey Mahoney @ 2005-04-06 13:09 UTC (permalink / raw)
  To: Anton Altaparmakov; +Cc: Kathy KN (HK), Bryan Henderson, linux-fsdevel

Anton Altaparmakov wrote:
> Jeff Mahoney wrote:
> 
>>Kathy KN (HK) wrote:
>>
>>>What I meant by via blocks is to gain knowledge of the physical
>>>blocks used by the inodes and retrieve the content from it directly,
>>>by accessing b_data.
>>
>>The problem with that approach is that some filesystems may store part
>>of the file outside of a complete block. For example, reiserfs "tails"
>>will respond with -ENOENT on ->bmap. For files smaller than 16k, they
>>are quite common.
> 
> 
> This is one not true and two wrong!
> 
> Looking at reiserfs code in the current 2.6 kernel it does:
[...]
> This will result in sparse blocks being returned whenever an error
> occurs.  Not what is desired...
> 
> <rant>
> The problem with ->bmap is that it cannot return error at all.  It
> either returns 0 for sparse or >0 for real block.  ->bmap is the most
> stupid interface I have ever seen...  )-:  If you ask me it should be
> removed from the kernel without notice.  Let all applications that use
> it break.  Who cares...  It can always be replaced with a sensible
> interface that returns errors like -ESPARSE, -ENOTAPPLICABLE, -EIO,
> -ENOMEM, etc and doesn't assume that 0 is sparse...
> </rant>

Ugh. Mea culpa. I knew reiserfs_bmap would return less than useful
results, and stopped there. I should have dug a little deeper.

-Jeff

-- 
Jeff Mahoney
SuSE Labs
jeffm@suse.com


* Re: Access content of file via inodes
  2005-04-06  1:53     ` Jeff Mahoney
@ 2005-04-06 17:57       ` Bryan Henderson
  0 siblings, 0 replies; 23+ messages in thread
From: Bryan Henderson @ 2005-04-06 17:57 UTC (permalink / raw)
  To: Jeff Mahoney; +Cc: Kathy KN (HK), linux-fsdevel

>> What I meant by via blocks is to gain knowledge of the physical blocks
>> used by the inodes and retrieve the content from it directly, by
>> accessing b_data.
>
>The problem with that approach is that some filesystems may store part
>of the file outside of a complete block.

There's an even more basic problem with this approach:  The question is 
specifically about the filesystem-type-independent layer above the VFS 
interface.  At this layer, you don't even know that there is a block 
device involved.  And if you do, you don't know that the filesystem driver 
uses the buffer cache to access it.  And if you do know that it uses the 
buffer cache, you don't know that the file data you're looking for is 
presently in the buffer cache, or how to get it there if it isn't.

If you believe in the layering at all, the only interface you can consider 
at this layer for getting at file data is VFS ->read.

--
Bryan Henderson                               San Jose California
IBM Almaden Research Center                   Filesystems


* Re: Access content of file via inodes
  2005-04-06 11:33     ` Anton Altaparmakov
  2005-04-06 13:09       ` Jeffrey Mahoney
@ 2005-04-07  5:25       ` Kathy KN (HK)
  2005-04-07  6:47         ` Jeffrey Mahoney
  1 sibling, 1 reply; 23+ messages in thread
From: Kathy KN (HK) @ 2005-04-07  5:25 UTC (permalink / raw)
  To: Anton Altaparmakov; +Cc: Jeff Mahoney, Bryan Henderson, linux-fsdevel

> Looking at reiserfs code in the current 2.6 kernel it does:
> 
> .bmap = reiserfs_aop_bmap,
> 
> Which is:
> 
> static sector_t reiserfs_aop_bmap(struct address_space *as, sector_t block)
> {
>   return generic_block_bmap(as, block, reiserfs_bmap) ;
> }
> 
> And generic_block_bmap is:
> 
> sector_t generic_block_bmap(struct address_space *mapping, sector_t block,
>                             get_block_t *get_block)
> {
>         struct buffer_head tmp;
>         struct inode *inode = mapping->host;
>         tmp.b_state = 0;
>         tmp.b_blocknr = 0;
>         get_block(inode, block, &tmp, 0);
>         return tmp.b_blocknr;
> }
> 
> It ignores any errors from get_block() and always returns tmp.b_blocknr.
> Thus if get_block() fails, tmp.b_blocknr is 0 and hence 0 is returned,
> i.e. a sparse block.  Which is complete rubbish...
> 
> And get_block in this case in reiserfs is:
> 
> static int reiserfs_bmap (struct inode * inode, sector_t block,
>                           struct buffer_head * bh_result, int create)
> {
>     if (!file_capable (inode, block))
>         return -EFBIG;
> 
>     reiserfs_write_lock(inode->i_sb);
>     /* do not read the direct item */
>     _get_block_create_0 (inode, block, bh_result, 0) ;
>     reiserfs_write_unlock(inode->i_sb);
>     return 0;
> }

Just wondering: on, say, reiserfs/reiser4, how is it possible to access
the tail which contains the data of the file?  Most of our production
boxes use reiserfs and/or reiser4.

Kathy


* Re: Access content of file via inodes
  2005-04-07  5:25       ` Kathy KN (HK)
@ 2005-04-07  6:47         ` Jeffrey Mahoney
  2005-04-07  8:09           ` Anton Altaparmakov
  0 siblings, 1 reply; 23+ messages in thread
From: Jeffrey Mahoney @ 2005-04-07  6:47 UTC (permalink / raw)
  To: Kathy KN (HK); +Cc: Anton Altaparmakov, Bryan Henderson, linux-fsdevel

Kathy KN (HK) wrote:
> Just wondering. Say, reiserfs/r4, how is it possible to access
> the tail which contain the data of the file, since most of our
> production boxes uses either reiserfs and/or reiser4.

Hi Kathy -

Using vfs_read or the page cache functions will allow you access to the
tail since they will map it in as part of the file. You'd only run into
that problem if you were trying to access the data block-by-block as you
were initially.

-Jeff

-- 
Jeff Mahoney
SuSE Labs
jeffm@suse.com


* Re: Access content of file via inodes
  2005-04-07  6:47         ` Jeffrey Mahoney
@ 2005-04-07  8:09           ` Anton Altaparmakov
  0 siblings, 0 replies; 23+ messages in thread
From: Anton Altaparmakov @ 2005-04-07  8:09 UTC (permalink / raw)
  To: Kathy KN (HK); +Cc: Jeffrey Mahoney, Bryan Henderson, linux-fsdevel

On Thu, 2005-04-07 at 02:47 -0400, Jeffrey Mahoney wrote:
> Kathy KN (HK) wrote:
> > Just wondering. Say, reiserfs/r4, how is it possible to access
> > the tail which contain the data of the file, since most of our
> > production boxes uses either reiserfs and/or reiser4.
> 
> Hi Kathy -
> 
> Using vfs_read or the page cache functions will allow you access to the
> tail since they will map it in as part of the file. You'd only run into
> that problem if you were trying to access the data block-by-block as you
> were initially.

Exactly the same for ntfs in case you care.  (-:  Btw.  Kathy, if you
want examples how to do page cache reads you could look at the ntfs
driver.  It does them all over the place (because metadata is in the
page cache).  So for example fs/ntfs/aops.c::ntfs_map_page() is (with
comments):

static inline struct page *ntfs_map_page(struct address_space *mapping,
                unsigned long index)

// mapping is the address space mapping, i.e. struct inode *->i_mapping
// index is file position you want to access >> PAGE_CACHE_SHIFT

{
        struct page *page = read_cache_page(mapping, index,
                        (filler_t*)mapping->a_ops->readpage, NULL);

// the above read_cache_page initiates the page to be read
// asynchronously so it is not finished when the call returns.

        if (!IS_ERR(page)) {

// if no synchronous error occurred:

                wait_on_page_locked(page);

// wait until the page becomes unlocked, which implies that either an
// asynchronous error occurred or that the page has been read successfully

                kmap(page);

// map the page so you can read the contents, you could do this much later
// and only use kmap_atomic() depending on when/how you need to read/write
// from/to the page

                if (PageUptodate(page) && !PageError(page))
                        return page;

// if the page is now uptodate and it does not have the error bit set
// the read was successful!

                ntfs_unmap_page(page);

// ouch.  asynchronous error occurred.  ntfs_unmap_page simply does a
// "kunmap(page)" and a "page_cache_release(page)".

                return ERR_PTR(-EIO);

// return -EIO error code encoded as a pointer.
        }
        return page;

// ouch synchronous error.  "page" contains the error code encoded as a
// pointer.
}

When you are finished accessing the page contents, simply unmap the page
if you have mapped it and do a "page_cache_release()" on the page.
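
In case the "error code encoded as a pointer" idiom is unfamiliar, the
following is a userspace model of it, a sketch rather than the kernel's
own definitions.  All model_* names are hypothetical; the idea is that
the last 4095 values of the address range are never valid pointers, so
they can carry a negative errno instead.  model_map_page() mimics the
two error paths of ntfs_map_page() with a fake page array:

```c
#include <errno.h>
#include <stdint.h>

#define MODEL_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
void *model_err_ptr(long error)
{
        return (void *)(intptr_t)error;
}

/* True if ptr lies in the range reserved for encoded errors. */
int model_is_err(const void *ptr)
{
        return (uintptr_t)ptr >= (uintptr_t)-MODEL_MAX_ERRNO;
}

/* Recover the errno value; only meaningful if model_is_err(ptr). */
long model_ptr_err(const void *ptr)
{
        return (long)(intptr_t)ptr;
}

/* Fake page: just remembers whether the "read" completed successfully. */
struct model_page {
        int uptodate;
};

/* Hypothetical reader: indexes >= 2 fail synchronously with -ENOMEM. */
struct model_page *model_read_page(struct model_page *pages,
                unsigned long index)
{
        if (index >= 2)
                return model_err_ptr(-ENOMEM);
        return &pages[index];
}

/* Same control flow as ntfs_map_page(): two distinct error paths. */
struct model_page *model_map_page(struct model_page *pages,
                unsigned long index)
{
        struct model_page *page = model_read_page(pages, index);

        if (!model_is_err(page)) {
                if (page->uptodate)
                        return page;
                /* valid pointer, but the read failed "asynchronously" */
                return model_err_ptr(-EIO);
        }
        /* synchronous failure: page already is an encoded error */
        return page;
}
```

A caller checks the result with model_is_err() first and only then
either uses the page or extracts the error with model_ptr_err().
Extracting an error from a pointer that is not an encoded error gives
a meaningless number.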

Hope this helps.

Best regards,

        Anton
-- 
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer / IRC: #ntfs on irc.freenode.net
WWW: http://linux-ntfs.sf.net/ & http://www-stu.christs.cam.ac.uk/~aia21/



* Re: Access content of file via inodes
  2005-04-05 19:01 ` Jeff Mahoney
  2005-04-06  1:32   ` Kathy KN (HK)
@ 2005-04-08  6:01   ` Kathy KN (HK)
  2005-04-08  8:17     ` Anton Altaparmakov
  1 sibling, 1 reply; 23+ messages in thread
From: Kathy KN (HK) @ 2005-04-08  6:01 UTC (permalink / raw)
  To: Jeff Mahoney; +Cc: linux-fsdevel

On Apr 6, 2005 3:01 AM, Jeff Mahoney <jeffm@suse.com> wrote:
> Kathy KN wrote:
> > Good day all,
> >
> > How do I access/read the content of the files via using inodes
> > or blocks that belong to the inode, at sys_link and vfs_link layer?
> > I used bmap to access the blocks that belongs to the inodes, but
> > getting access to the buffer_head's b_data doesn't seem to help.
> 
> Hi Kathy -
> 
> What you're trying to do is possible, but you need to go about it in a
> different way. Ignore the buffer cache completely and use the page
> cache; it's more appropriate for file contents.
> 
> You have two options:
> 
> If performance isn't critical, a simple approach would be to use your
> old_dentry pointer to dentry_open a file and then vfs_read from it to a
> buffer you allocate. Make sure you use get_fs/set_fs, since vfs_read
> won't accept a kernel pointer otherwise.
> 
> If performance is more important or you really do only have access to an
> inode, you can read from the page cache directly using inode->i_mapping
> and read_cache_page. This has the advantage that you don't need to copy
> the data to access it, but the disadvantage that it is more complex and
> can be tricky to get right.

Hi Jeff,

Is it possible to modify the cached page, and invalidate it back
to update the page cache of the new page? I did a recursive grep 
and could only find functions that let you read or grab pages in the
cache.

Kathy


* Re: Access content of file via inodes
  2005-04-08  6:01   ` Kathy KN (HK)
@ 2005-04-08  8:17     ` Anton Altaparmakov
  2005-05-27 19:13       ` Martin Jambor
  0 siblings, 1 reply; 23+ messages in thread
From: Anton Altaparmakov @ 2005-04-08  8:17 UTC (permalink / raw)
  To: Kathy KN (HK); +Cc: Jeff Mahoney, linux-fsdevel

On Fri, 2005-04-08 at 14:01 +0800, Kathy KN (HK) wrote:
> On Apr 6, 2005 3:01 AM, Jeff Mahoney <jeffm@suse.com> wrote:
> > Kathy KN wrote:
> > > Good day all,
> > >
> > > How do I access/read the content of the files via using inodes
> > > or blocks that belong to the inode, at sys_link and vfs_link layer?
> > > I used bmap to access the blocks that belongs to the inodes, but
> > > getting access to the buffer_head's b_data doesn't seem to help.
> > 
> > Hi Kathy -
> > 
> > What you're trying to do is possible, but you need to go about it in a
> > different way. Ignore the buffer cache completely and use the page
> > cache; it's more appropriate for file contents.
> > 
> > You have two options:
> > 
> > If performance isn't critical, a simple approach would be to use your
> > old_dentry pointer to dentry_open a file and then vfs_read from it to a
> > buffer you allocate. Make sure you use get_fs/set_fs, since vfs_read
> > won't accept a kernel pointer otherwise.
> > 
> > If performance is more important or you really do only have access to an
> > inode, you can read from the page cache directly using inode->i_mapping
> > and read_cache_page. This has the advantage that you don't need to copy
> > the data to access it, but the disadvantage that it is more complex and
> > can be tricky to get right.
> 
> Hi Jeff,
> 
> Is it possible to modify the cached page, and invalidate it back
> to update the page cache of the new page? I did a recursive grep 
> and could only find functions that let you read or grab pages in the
> cache.

Once you have the page (via read_cache_page() or whatever) you can
simply write to it, then do a flush_dcache_page(page), then
set_page_dirty(page) and finally do the page_cache_release().  Oh, and
don't forget to unmap the page.  Usually done straight after the
flush_dcache_page().  An example from ntfs where we get a page, memset
it to a value (val), and then mark it dirty for later write out:

        page = read_cache_page(mapping, idx,
                        (filler_t*)mapping->a_ops->readpage, NULL);
        if (IS_ERR(page)) {
                ntfs_error(vol->sb, "Failed to read first partial "
                                "page (sync error, index 0x%lx).", idx);
                return PTR_ERR(page);
        }
        wait_on_page_locked(page);
        if (unlikely(!PageUptodate(page))) {
                ntfs_error(vol->sb, "Failed to read first partial page "
                                "(async error, index 0x%lx).", idx);
                page_cache_release(page);
                return -EIO;
        }
        size = PAGE_CACHE_SIZE;
        if (idx == end)
                size = end_ofs;
        kaddr = kmap_atomic(page, KM_USER0);
        memset(kaddr + start_ofs, val, size - start_ofs);
        flush_dcache_page(page);
        kunmap_atomic(kaddr, KM_USER0);
        set_page_dirty(page);
        page_cache_release(page);

Of course you need to serialise access in some way so multiple writers
do not step on each other's toes, etc...
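
The excerpt above handles only the first partial page of a range.  As a
rough userspace sketch of the index/offset arithmetic involved (not the
ntfs code itself), the function below splits a byte range into per-page
pieces.  The model_* names are hypothetical and a 4096-byte page size is
assumed:

```c
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12     /* assumes 4096-byte pages */
#define MODEL_PAGE_SIZE  (1UL << MODEL_PAGE_SHIFT)
#define MODEL_PAGE_MASK  (MODEL_PAGE_SIZE - 1)

struct model_chunk {
        unsigned long index;    /* page index: file position >> shift */
        unsigned long start;    /* first byte to touch within the page */
        unsigned long len;      /* number of bytes to touch in the page */
};

/*
 * Split the byte range [ofs, ofs + cnt) into per-page chunks.  At most
 * max entries are written to out; the return value is the number of
 * pages the range spans.
 */
size_t model_split_range(uint64_t ofs, uint64_t cnt,
                struct model_chunk *out, size_t max)
{
        size_t n = 0;

        while (cnt > 0) {
                unsigned long start = (unsigned long)(ofs & MODEL_PAGE_MASK);
                uint64_t len = MODEL_PAGE_SIZE - start;

                if (len > cnt)
                        len = cnt;
                if (n < max) {
                        out[n].index = (unsigned long)(ofs >> MODEL_PAGE_SHIFT);
                        out[n].start = start;
                        out[n].len = (unsigned long)len;
                }
                n++;
                ofs += len;
                cnt -= len;
        }
        return n;
}
```

For each chunk you would then get the page at chunk.index, map it, touch
chunk.len bytes starting at kaddr + chunk.start, and release it as shown
in the excerpt.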

Best regards,

        Anton
-- 
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer / IRC: #ntfs on irc.freenode.net
WWW: http://linux-ntfs.sf.net/ & http://www-stu.christs.cam.ac.uk/~aia21/



* Re: Access content of file via inodes
  2005-04-08  8:17     ` Anton Altaparmakov
@ 2005-05-27 19:13       ` Martin Jambor
  2005-05-28 15:57         ` Anton Altaparmakov
  0 siblings, 1 reply; 23+ messages in thread
From: Martin Jambor @ 2005-05-27 19:13 UTC (permalink / raw)
  To: Anton Altaparmakov; +Cc: Kathy KN (HK), Jeff Mahoney, linux-fsdevel

Hi Anton,

On 4/8/05, Anton Altaparmakov <aia21@cam.ac.uk> wrote:
> On Fri, 2005-04-08 at 14:01 +0800, Kathy KN (HK) wrote:
> > Hi Jeff,
> >
> > Is it possible to modify the cached page, and invalidate it back
> > to update the page cache of the new page? I did a recursive grep
> > and could only find functions that let you read or grab pages in the
> > cache.
> 
> Once you have the page (via read_cache_page() or whatever) you can
> simply write to it, then do a flush_dcache_page(page), then
> set_page_dirty(page) and finally do the page_cache_release().  Oh, and
> don't forget to unmap the page.  Usually done straight after the
> flush_dcache_page().  An example from ntfs where we get a page, memset
> it to a value (val), and then mark it dirty for later write out:
> 
>         page = read_cache_page(mapping, idx,
>                         (filler_t*)mapping->a_ops->readpage, NULL);
>         if (IS_ERR(page)) {
>                 ntfs_error(vol->sb, "Failed to read first partial "
>                                 "page (sync error, index 0x%lx).", idx);
>                 return PTR_ERR(page);
>         }
>         wait_on_page_locked(page);
>         if (unlikely(!PageUptodate(page))) {
>                 ntfs_error(vol->sb, "Failed to read first partial page "
>                                 "(async error, index 0x%lx).", idx);
>                 page_cache_release(page);
>                 return -EIO;
>         }
>         size = PAGE_CACHE_SIZE;
>         if (idx == end)
>                 size = end_ofs;
>         kaddr = kmap_atomic(page, KM_USER0);
>         memset(kaddr + start_ofs, val, size - start_ofs);
>         flush_dcache_page(page);
>         kunmap_atomic(kaddr, KM_USER0);
>         set_page_dirty(page);
>         page_cache_release(page);
> 
> Of course you need to serialise access in some way so multiple writers
> do not step on each other's toes, etc...

What do I have to do when I need to append something to a file?
BTW, if that matters, I'll be implementing the address_space_operations
of that file as well...

Thanks very much in advance,

Martin


* Re: Access content of file via inodes
  2005-05-27 19:13       ` Martin Jambor
@ 2005-05-28 15:57         ` Anton Altaparmakov
  2005-05-28 21:44           ` Martin Jambor
  0 siblings, 1 reply; 23+ messages in thread
From: Anton Altaparmakov @ 2005-05-28 15:57 UTC (permalink / raw)
  To: Martin Jambor; +Cc: Kathy KN (HK), Jeff Mahoney, linux-fsdevel

Hi Martin,

On Fri, 27 May 2005, Martin Jambor wrote:
> On 4/8/05, Anton Altaparmakov <aia21@cam.ac.uk> wrote:
> > On Fri, 2005-04-08 at 14:01 +0800, Kathy KN (HK) wrote:
> > > Hi Jeff,
> > >
> > > Is it possible to modify the cached page, and invalidate it back
> > > to update the page cache of the new page? I did a recursive grep
> > > and could only find functions that let you read or grab pages in the
> > > cache.
> > 
> > Once you have the page (via read_cache_page() or whatever) you can
> > simply write to it, then do a flush_dcache_page(page), then
> > set_page_dirty(page) and finally do the page_cache_release().  Oh, and
> > don't forget to unmap the page.  Usually done straight after the
> > flush_dcache_page().  An example from ntfs where we get a page, memset
> > it to a value (val), and then mark it dirty for later write out:
> > 
> >         page = read_cache_page(mapping, idx,
> >                         (filler_t*)mapping->a_ops->readpage, NULL);
> >         if (IS_ERR(page)) {
> >                 ntfs_error(vol->sb, "Failed to read first partial "
> >                                 "page (sync error, index 0x%lx).", idx);
> >                 return PTR_ERR(page);
> >         }
> >         wait_on_page_locked(page);
> >         if (unlikely(!PageUptodate(page))) {
> >                 ntfs_error(vol->sb, "Failed to read first partial page "
> >                                 "(async error, index 0x%lx).", idx);
> >                 page_cache_release(page);
> >                 return -EIO;
> >         }
> >         size = PAGE_CACHE_SIZE;
> >         if (idx == end)
> >                 size = end_ofs;
> >         kaddr = kmap_atomic(page, KM_USER0);
> >         memset(kaddr + start_ofs, val, size - start_ofs);
> >         flush_dcache_page(page);
> >         kunmap_atomic(kaddr, KM_USER0);
> >         set_page_dirty(page);
> >         page_cache_release(page);
> > 
> > Of course you need to serialise access in some way so multiple writers
> > do not step on each other's toes, etc...
> 
> What do I have to do when I need to append something to a file?
> BTW, if that matters, I'll be implementing the address_space_operations
> of that file as well...

That depends on where you want to do that from and what that file is.

If it is a file accessible from user space, the size modification should be 
done under i_sem protection, which is tricky from address space operations 
but fine from other parts of the kernel.

If the file is private to you, you need to serialize access with your own 
lock, and then it does not matter where you do this from (well, unless you 
try to do it from an atomic region or with spinlocks held or something).
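
As a rough illustration of the private-lock case, here is a minimal
userspace sketch, not kernel code.  It assumes a pthread mutex standing in
for the private lock and a fixed in-memory buffer standing in for the file.
struct private_file, private_file_init() and private_file_append() are
hypothetical names made up for this example.  The only point is that the
size field is read and updated under the same lock as the data copy.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

#define MAP_CAPACITY 4096	/* hypothetical fixed size of the backing store */

/* Hypothetical stand-in for a filesystem-private file. */
struct private_file {
	pthread_mutex_t lock;	/* plays the role of the private lock */
	size_t size;		/* plays the role of i_size */
	unsigned char data[MAP_CAPACITY];
};

static void private_file_init(struct private_file *f)
{
	assert(f != NULL);
	pthread_mutex_init(&f->lock, NULL);
	f->size = 0;
	memset(f->data, 0, sizeof(f->data));
}

/*
 * Append len bytes from buf.  Returns the offset the data was written at,
 * or -1 if it does not fit.  The size changes only while the lock is held.
 */
static long private_file_append(struct private_file *f, const void *buf,
				size_t len)
{
	long ofs = -1;

	pthread_mutex_lock(&f->lock);
	if (len <= MAP_CAPACITY - f->size) {
		ofs = (long)f->size;
		memcpy(f->data + f->size, buf, len);
		f->size += len;
	}
	pthread_mutex_unlock(&f->lock);
	return ofs;
}
```

Two appends of 4 and 2 bytes to a freshly initialised file land at offsets
0 and 4 and leave the size at 6.  A mutex may sleep, which is why the same
pattern is not usable from an atomic region.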

Basically there are all sorts of considerations.

Could you explain where in the kernel and in what context you need to do 
the file append?  And whether the file is private to you or accessible 
from user space?

It would be easier to explain what you need/can do if I knew exactly what 
you are trying to do.

Best regards,

	Anton
-- 
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer / IRC: #ntfs on irc.freenode.net
WWW: http://linux-ntfs.sf.net/ & http://www-stu.christs.cam.ac.uk/~aia21/


* Re: Access content of file via inodes
  2005-05-28 15:57         ` Anton Altaparmakov
@ 2005-05-28 21:44           ` Martin Jambor
  2005-05-29  7:26             ` Anton Altaparmakov
  0 siblings, 1 reply; 23+ messages in thread
From: Martin Jambor @ 2005-05-28 21:44 UTC (permalink / raw)
  To: Anton Altaparmakov; +Cc: linux-fsdevel

Hi Anton,

On 5/28/05, Anton Altaparmakov <aia21@cam.ac.uk> wrote:
> > What do I have to do when I need to append something to a file?
> > BTW, if that matters, I'll be implementing the address_space_operations
> > of that file as well...
> 
> That depends on where you want to do that from and what that file is.

The file is a file system internal file that is used to translate
inode numbers to addresses of inodes on the disk. It cannot be
written to from userspace. When I need one more inode than I currently
have, I also need to create new mapping info.

I implemented it using the same approach that ext2 uses to modify its
directories and believe that is the correct way (by calling aops
prepare and commit methods). The only thing that puzzles me is that
ext2 does not call flush_dcache_page, which you suggested. Since it
seems to be an architecture-specific function, I have no clue
whatsoever whether I need to call it or not.

Thanks for your email and for any clarification,

Martin


* Re: Access content of file via inodes
  2005-05-28 21:44           ` Martin Jambor
@ 2005-05-29  7:26             ` Anton Altaparmakov
  2005-05-30 21:51               ` Martin Jambor
  0 siblings, 1 reply; 23+ messages in thread
From: Anton Altaparmakov @ 2005-05-29  7:26 UTC (permalink / raw)
  To: Martin Jambor; +Cc: linux-fsdevel

Hi,

On Sat, 28 May 2005, Martin Jambor wrote:
> On 5/28/05, Anton Altaparmakov <aia21@cam.ac.uk> wrote:
> > > What do I have to do when I need to append something to a file?
> > > BTW, if that matters, I'll be implementing the address_space_operations
> > > of that file as well...
> > 
> > That depends on where you want to do that from and what that file is.
> 
> The file is a file system internal file that is used to translate
> inode numbers to addresses of inodes on the disk. It cannot be
> written to from userspace. When I need one more inode than I currently
> have, I also need to create new mapping info.

Aha.  Just like NTFS then.  (-:

> I implemented it using the same approach that ext2 uses to modify its
> directories and believe that is the correct way (by calling aops
> prepare and commit methods). The only thing that puzzles me is that

Yes, that is fine.  Just note that prepare_write/commit_write need to run 
under i_sem protection.  But if you are already serializing access to the 
file some other way then you can ignore i_sem.

> ext2 does not call flush_dcache_page, which you suggested. Since it
> seems to be an architecture-specific function, I have no clue
> whatsoever whether I need to call it or not.

Well, as far as I understand it, this function makes changes to the page 
contents visible on all CPUs and to all processes.  It needs to be run 
before you unlock a page or mark it up to date; otherwise someone who then 
locks and/or reads it may read stale data from the page.  Thus I always 
call it after modifying a page.  Perhaps it is only sometimes, or even 
never, actually needed for NTFS, but I do not want to take that risk, so I 
prefer to always call it.  It certainly does no harm to call it, except 
perhaps making the code a bit slower.

> Thanks for your email and for any clarification,

No problem.

Best regards,

	Anton
-- 
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer / IRC: #ntfs on irc.freenode.net
WWW: http://linux-ntfs.sf.net/ & http://www-stu.christs.cam.ac.uk/~aia21/


* Re: Access content of file via inodes
  2005-05-29  7:26             ` Anton Altaparmakov
@ 2005-05-30 21:51               ` Martin Jambor
  2005-05-30 22:19                 ` Anton Altaparmakov
  0 siblings, 1 reply; 23+ messages in thread
From: Martin Jambor @ 2005-05-30 21:51 UTC (permalink / raw)
  To: Anton Altaparmakov; +Cc: linux-fsdevel

Hi,

On 5/29/05, Anton Altaparmakov <aia21@cam.ac.uk> wrote:
> > I implemented it using the same approach that ext2 uses to modify its
> > directories and believe that is the correct way (by calling aops
> > prepare and commit methods). The only thing that puzzles me is that
> 
> Yes, that is fine.  Just note that prepare_write/commit_write need to run
> under i_sem protection.  But if you are already serializing access to the
> file some other way then you can ignore i_sem.

Do they? Documentation/filesystems/Locking only says they want their
page locked... but thanks for telling me, I will check that.

> > ext2 does not call flush_dcache_page, which you suggested. Since it
> > seems to be an architecture-specific function, I have no clue
> > whatsoever whether I need to call it or not.
> 
> Well, as far as I understand it, this function causes changes to the page
> contents to become visible on all CPUs and from all processes and needs to
> be run before you unlock a page/mark it up to date otherwise someone who
> then locks it and/or reads it will possibly read old data from the page
> that is no longer correct. 

Some folks on IRC sent me a link to an article that explains this
coherency stuff:
http://www.informit.com/articles/article.asp?p=29961&seqNum=6&rl=1
Basically, you need to call this only if the page might afterwards be
read from userspace. Directories are not, so ext2 doesn't have to.
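
The rule of thumb above can be modelled in plain C as follows. This is a
userspace sketch under stated assumptions, not kernel code: struct
model_page, model_flush_dcache_page() and model_fill_page() are
hypothetical stand-ins, and the flush stand-in only counts calls so the
control flow can be checked. Whether a real page can be seen from
userspace is something the filesystem has to know; here it is just a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096	/* hypothetical page size for the model */

/* Hypothetical model of a page cache page; not the kernel's struct page. */
struct model_page {
	unsigned char data[MODEL_PAGE_SIZE];
	bool user_visible;	/* may userspace read this page afterwards? */
	bool dirty;		/* set once the page has been modified */
	int flushes;		/* how often the flush stand-in was called */
};

/* Stand-in for flush_dcache_page(): it only counts calls. */
static void model_flush_dcache_page(struct model_page *page)
{
	page->flushes++;
}

/*
 * Fill bytes [start, end) with val, flush only if userspace may read the
 * page later, then mark the page dirty.  Returns 0, or -1 on a bad range.
 */
static int model_fill_page(struct model_page *page, size_t start, size_t end,
			   unsigned char val)
{
	if (start > end || end > MODEL_PAGE_SIZE)
		return -1;
	memset(page->data + start, val, end - start);
	if (page->user_visible)
		model_flush_dcache_page(page);
	page->dirty = true;
	return 0;
}
```

A page modelled as a directory page (user_visible false) ends up dirty
with zero flushes, while a page modelled as regular file data gets exactly
one flush per modification.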

Thanks again,

Martin


* Re: Access content of file via inodes
  2005-05-30 21:51               ` Martin Jambor
@ 2005-05-30 22:19                 ` Anton Altaparmakov
  0 siblings, 0 replies; 23+ messages in thread
From: Anton Altaparmakov @ 2005-05-30 22:19 UTC (permalink / raw)
  To: Martin Jambor; +Cc: linux-fsdevel

Hi,

On Mon, 30 May 2005, Martin Jambor wrote:
> On 5/29/05, Anton Altaparmakov <aia21@cam.ac.uk> wrote:
> > > I implemented it using the same approach that ext2 uses to modify its
> > > directories and believe that is the correct way (by calling aops
> > > prepare and commit methods). The only thing that puzzles me is that
> > 
> > Yes, that is fine.  Just note that prepare_write/commit_write need to run
> > under i_sem protection.  But if you are already serializing access to the
> > file some other way then you can ignore i_sem.
> 
> Do they? Documentation/filesystems/Locking only says they want their
> page locked... but thanks for telling me, I will check that.

Well, I was being simplistic.  For most (I believe) fs, i_sem is used to 
protect against changes in i_size.  So both file write and truncate are 
done under i_sem.  They are the only ops that can change i_size.  (Aside: 
As a consequence of this, the lseek op of those fs is also done under 
i_sem.)
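
To make the i_size point concrete, here is a small userspace model, again 
not kernel code.  It assumes a pthread mutex in place of i_sem and a plain 
long in place of i_size.  struct model_inode and the model_*() functions 
are hypothetical names for this sketch only.  Write and truncate are the 
only places that change the size, and seeking relative to the end reads it 
under the same lock.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical model of an inode: one lock guarding the size field. */
struct model_inode {
	pthread_mutex_t sem;	/* plays the role of i_sem */
	long size;		/* plays the role of i_size */
};

static void model_inode_init(struct model_inode *ino)
{
	pthread_mutex_init(&ino->sem, NULL);
	ino->size = 0;
}

/* Write len bytes at pos; grows the size if needed, returns the new size. */
static long model_write(struct model_inode *ino, long pos, long len)
{
	long size;

	assert(pos >= 0 && len >= 0);
	pthread_mutex_lock(&ino->sem);
	if (pos + len > ino->size)
		ino->size = pos + len;
	size = ino->size;
	pthread_mutex_unlock(&ino->sem);
	return size;
}

/* Set the size to new_size, shrinking or growing the modelled file. */
static void model_truncate(struct model_inode *ino, long new_size)
{
	assert(new_size >= 0);
	pthread_mutex_lock(&ino->sem);
	ino->size = new_size;
	pthread_mutex_unlock(&ino->sem);
}

/* Position for a seek relative to the end; reads the size under the lock. */
static long model_lseek_end(struct model_inode *ino, long offset)
{
	long pos;

	pthread_mutex_lock(&ino->sem);
	pos = ino->size + offset;
	pthread_mutex_unlock(&ino->sem);
	return pos;
}
```

Because all three paths take the same lock, a seek to the end can never 
observe a size that is halfway through a write or truncate.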

> > > ext2 does not call flush_dcache_page, which you suggested. Since it
> > > seems to be an architecture-specific function, I have no clue
> > > whatsoever whether I need to call it or not.
> > 
> > Well, as far as I understand it, this function causes changes to the page
> > contents to become visible on all CPUs and from all processes and needs to
> > be run before you unlock a page/mark it up to date otherwise someone who
> > then locks it and/or reads it will possibly read old data from the page
> > that is no longer correct. 
> 
> Some folks on IRC sent me a link to an article that explains this
> coherency stuff:
> http://www.informit.com/articles/article.asp?p=29961&seqNum=6&rl=1
> Basically, you need to call this only if the page might afterwards be
> read from userspace. Directories are not, so ext2 doesn't have to.

Ah, that's a great article!  Thanks for the pointer.  I will be able to 
remove quite a few dcache flushes from ntfs now.  (-:

Thanks,

	Anton
-- 
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer / IRC: #ntfs on irc.freenode.net
WWW: http://linux-ntfs.sf.net/ & http://www-stu.christs.cam.ac.uk/~aia21/


