* Access content of file via inodes
@ 2005-04-05 1:23 Kathy KN
From: Kathy KN @ 2005-04-05 1:23 UTC (permalink / raw)
To: linux-fsdevel
Good day all,
How do I access/read the content of the files via using inodes
or blocks that belong to the inode, at sys_link and vfs_link layer?
I used bmap to access the blocks that belongs to the inodes, but
getting access to the buffer_head's b_data doesn't seem to help.
Kindly advise.
Kathy
* Re: Access content of file via inodes
From: Christoph Hellwig @ 2005-04-05 7:22 UTC
To: Kathy KN; +Cc: linux-fsdevel

On Tue, Apr 05, 2005 at 09:23:19AM +0800, Kathy KN wrote:
> Good day all,
>
> How do I access/read the content of the files via using inodes
> or blocks that belong to the inode, at sys_link and vfs_link layer?
> I used bmap to access the blocks that belongs to the inodes, but
> getting access to the buffer_head's b_data doesn't seem to help.

You don't. There might not even be any blocks.
* Re: Access content of file via inodes
From: Bryan Henderson @ 2005-04-05 17:53 UTC
To: Kathy KN; +Cc: linux-fsdevel

>How do I access/read the content of the files via using inodes
>or blocks that belong to the inode, at sys_link and vfs_link layer?

This is tricky because many interfaces that one would expect to use an
inode as a file handle use a dentry instead. To read the contents of a
file via the VFS interface, you need a file pointer (struct file), and the
file pointer identifies the file by dentry. So you need to create a dummy
dentry, which you can do with d_alloc_root(), and then create the file
pointer with dentry_open(), then read the file with vfs_read().

That's for "via inodes." I don't know what "via blocks" means.

--
Bryan Henderson                     IBM Almaden Research Center
San Jose CA                         Filesystems
* Re: Access content of file via inodes 2005-04-05 17:53 ` Bryan Henderson @ 2005-04-06 1:27 ` Kathy KN (HK) 2005-04-06 1:53 ` Jeff Mahoney ` (2 more replies) 0 siblings, 3 replies; 23+ messages in thread From: Kathy KN (HK) @ 2005-04-06 1:27 UTC (permalink / raw) To: Bryan Henderson; +Cc: linux-fsdevel On Apr 6, 2005 1:53 AM, Bryan Henderson <hbryan@us.ibm.com> wrote: > >How do I access/read the content of the files via using inodes > >or blocks that belong to the inode, at sys_link and vfs_link layer? > > This is tricky because many interfaces that one would expect to use an > inode as a file handle use a dentry instead. To read the contents of a > file via the VFS interface, you need a file pointer (struct file), and the > file pointer identifies the file by dentry. So you need to create a dummy > dentry, which you can do with d_alloc_root(), and then create the file > pointer with dentry_open(), then read the file with vfs_read(). > > That's for "via inodes." I don't know what "via blocks" means. Bryan, Thanks for the description on how to read the contents of a file via the VFS interface. I got to try to see if I can write it in codes, and make sure that I can read the file via the vfs_read() routine. What I meant by via blocks is to gain knowledge of the physical blocks used by the inodes and retrieve the content from it directly, by accessing b_data. Kathy ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Access content of file via inodes 2005-04-06 1:27 ` Kathy KN (HK) @ 2005-04-06 1:53 ` Jeff Mahoney 2005-04-06 17:57 ` Bryan Henderson 2005-04-06 7:54 ` Anton Altaparmakov 2005-04-06 11:33 ` Anton Altaparmakov 2 siblings, 1 reply; 23+ messages in thread From: Jeff Mahoney @ 2005-04-06 1:53 UTC (permalink / raw) To: Kathy KN (HK); +Cc: Bryan Henderson, linux-fsdevel Kathy KN (HK) wrote: > On Apr 6, 2005 1:53 AM, Bryan Henderson <hbryan@us.ibm.com> wrote: >>>How do I access/read the content of the files via using inodes >>>or blocks that belong to the inode, at sys_link and vfs_link layer? >>This is tricky because many interfaces that one would expect to use an >>inode as a file handle use a dentry instead. To read the contents of a >>file via the VFS interface, you need a file pointer (struct file), and the >>file pointer identifies the file by dentry. So you need to create a dummy >>dentry, which you can do with d_alloc_root(), and then create the file >>pointer with dentry_open(), then read the file with vfs_read(). >> >>That's for "via inodes." I don't know what "via blocks" means. > > Bryan, > > Thanks for the description on how to read the contents of a file via > the VFS interface. I got to try to see if I can write it in codes, and make > sure that I can read the file via the vfs_read() routine. What I meant by > via blocks is to gain knowledge of the physical blocks used by the inodes > and retrieve the content from it directly, by accessing b_data. The problem with that approach is that some filesystems may store part of the file outside of a complete block. For example, reiserfs "tails" will respond with -ENOENT on ->bmap. For files smaller than 16k, they are quite common. -Jeff -- Jeff Mahoney SuSE Labs ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Access content of file via inodes
From: Bryan Henderson @ 2005-04-06 17:57 UTC
To: Jeff Mahoney; +Cc: Kathy KN (HK), linux-fsdevel

>> What I meant by
>> via blocks is to gain knowledge of the physical blocks used by the inodes
>> and retrieve the content from it directly, by accessing b_data.
>
>The problem with that approach is that some filesystems may store part
>of the file outside of a complete block.

There's an even more basic problem with this approach: The question is
specifically about the filesystem-type-independent layer above the VFS
interface. At this layer, you don't even know that there is a block
device involved. And if you do, you don't know that the filesystem driver
uses the buffer cache to access it. And if you do know that it uses the
buffer cache, you don't know that the file data you're looking for is
presently in the buffer cache, or how to get it there if it isn't.

If you believe in the layering at all, the only interface you can consider
at this layer for getting at file data is VFS ->read.

--
Bryan Henderson                     San Jose California
IBM Almaden Research Center         Filesystems
* Re: Access content of file via inodes 2005-04-06 1:27 ` Kathy KN (HK) 2005-04-06 1:53 ` Jeff Mahoney @ 2005-04-06 7:54 ` Anton Altaparmakov 2005-04-06 11:33 ` Anton Altaparmakov 2 siblings, 0 replies; 23+ messages in thread From: Anton Altaparmakov @ 2005-04-06 7:54 UTC (permalink / raw) To: Kathy KN (HK); +Cc: Bryan Henderson, linux-fsdevel On Wed, 6 Apr 2005, Kathy KN (HK) wrote: > On Apr 6, 2005 1:53 AM, Bryan Henderson <hbryan@us.ibm.com> wrote: > > >How do I access/read the content of the files via using inodes > > >or blocks that belong to the inode, at sys_link and vfs_link layer? > > > > This is tricky because many interfaces that one would expect to use an > > inode as a file handle use a dentry instead. To read the contents of a > > file via the VFS interface, you need a file pointer (struct file), and the > > file pointer identifies the file by dentry. So you need to create a dummy > > dentry, which you can do with d_alloc_root(), and then create the file > > pointer with dentry_open(), then read the file with vfs_read(). > > > > That's for "via inodes." I don't know what "via blocks" means. > > Thanks for the description on how to read the contents of a file via > the VFS interface. I got to try to see if I can write it in codes, and make > sure that I can read the file via the vfs_read() routine. What I meant by > via blocks is to gain knowledge of the physical blocks used by the inodes > and retrieve the content from it directly, by accessing b_data. You cannot do that safely because some file systems do not store things in whole blocks. For example small files in ntfs are stored as a variable length and variable offset record inside the inode record on disk. And compressed/encrypted files on ntfs are stored compressed/encrypted on disk and are decompressed/decrypted on access so there are no blocks you could usefully read at all. (This is why ntfs does not implement ->bmap - it just makes no sense.) 
Oh and another thing is that ->bmap returns 0 for a sparse block, i.e. not allocated on disk, it is zero. But for example ntfs uses 0 as a valid block number you can read from/write to so that is not compatible with ->bmap either. Best regards, Anton -- Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @) Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK Linux NTFS maintainer / IRC: #ntfs on irc.freenode.net WWW: http://linux-ntfs.sf.net/ & http://www-stu.christs.cam.ac.uk/~aia21/ ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Access content of file via inodes
From: Anton Altaparmakov @ 2005-04-06 11:33 UTC
To: Jeff Mahoney; +Cc: Kathy KN (HK), Bryan Henderson, linux-fsdevel

Jeff Mahoney wrote:
> Kathy KN (HK) wrote:
> > What I meant by via blocks is to gain knowledge of the physical
> > blocks used by the inodes and retrieve the content from it directly,
> > by accessing b_data.
>
> The problem with that approach is that some filesystems may store part
> of the file outside of a complete block. For example, reiserfs "tails"
> will respond with -ENOENT on ->bmap. For files smaller than 16k, they
> are quite common.

This is one not true and two wrong!

Looking at reiserfs code in the current 2.6 kernel it does:

    .bmap = reiserfs_aop_bmap,

Which is:

    static sector_t reiserfs_aop_bmap(struct address_space *as,
                                      sector_t block)
    {
        return generic_block_bmap(as, block, reiserfs_bmap);
    }

And generic_block_bmap is:

    sector_t generic_block_bmap(struct address_space *mapping,
                                sector_t block, get_block_t *get_block)
    {
        struct buffer_head tmp;
        struct inode *inode = mapping->host;
        tmp.b_state = 0;
        tmp.b_blocknr = 0;
        get_block(inode, block, &tmp, 0);
        return tmp.b_blocknr;
    }

It ignores any errors from get_block() and always returns tmp.b_blocknr.
Thus if get_block() fails, tmp.b_blocknr is 0 and hence 0 is returned,
i.e. a sparse block. Which is complete rubbish...

And get_block in this case in reiserfs is:

    static int reiserfs_bmap(struct inode *inode, sector_t block,
                             struct buffer_head *bh_result, int create)
    {
        if (!file_capable(inode, block))
            return -EFBIG;

        reiserfs_write_lock(inode->i_sb);
        /* do not read the direct item */
        _get_block_create_0(inode, block, bh_result, 0);
        reiserfs_write_unlock(inode->i_sb);
        return 0;
    }

This will result in sparse blocks being returned whenever an error
occurs. Not what is desired...

<rant>
The problem with ->bmap is that it cannot return error at all. It
either returns 0 for sparse or >0 for real block. ->bmap is the most
stupid interface I have ever seen... )-: If you ask me it should be
removed from the kernel without notice. Let all applications that use
it break. Who cares... It can always be replaced with a sensible
interface that returns errors like -ESPARSE, -ENOTAPPLICABLE, -EIO,
-ENOMEM, etc and doesn't assume that 0 is sparse...
</rant>

Best regards,

    Anton
--
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer / IRC: #ntfs on irc.freenode.net
WWW: http://linux-ntfs.sf.net/ & http://www-stu.christs.cam.ac.uk/~aia21/
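The ambiguity described in the message above can be reproduced outside the kernel. The following is only a userspace sketch under stated assumptions: `toy_file`, `toy_bmap()` and `toy_map_block()` are hypothetical names invented for illustration, not kernel interfaces, and the status values stand in for the made-up error names in the rant. It shows a bmap-style lookup collapsing "hole", "error" and "valid block 0" into the same return value, next to a variant that reports status separately from the block number.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical toy "filesystem": file block i lives in disk block map[i].
 * TOY_HOLE marks a sparse block; disk block 0 is a perfectly valid block. */
#define TOY_HOLE (-1)

struct toy_file {
    const int64_t *map;   /* per-file-block disk block, or TOY_HOLE */
    uint64_t nblocks;     /* number of file blocks */
};

/* bmap-style lookup: only a block number can come back, so holes and
 * errors both collapse to 0, as in the generic_block_bmap() quoted above. */
static uint64_t toy_bmap(const struct toy_file *f, uint64_t block)
{
    if (block >= f->nblocks)
        return 0;                       /* error, reported as 0 */
    if (f->map[block] == TOY_HOLE)
        return 0;                       /* hole, reported as 0 */
    return (uint64_t)f->map[block];     /* may legitimately be 0 too */
}

/* Alternative shape: status in the return value, block number in an
 * out-parameter, so 0 is no longer overloaded. */
enum toy_status { TOY_MAPPED = 0, TOY_SPARSE = 1 };

static int toy_map_block(const struct toy_file *f, uint64_t block,
                         uint64_t *out)
{
    if (block >= f->nblocks)
        return -EFBIG;                  /* a real error code */
    if (f->map[block] == TOY_HOLE)
        return TOY_SPARSE;              /* hole, *out left untouched */
    *out = (uint64_t)f->map[block];
    return TOY_MAPPED;
}
```

With a map of `{ 0, TOY_HOLE, 7 }`, `toy_bmap()` returns 0 for file block 0 (a real block), for file block 1 (a hole) and for an out-of-range block alike, while `toy_map_block()` distinguishes all three cases.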
* Re: Access content of file via inodes 2005-04-06 11:33 ` Anton Altaparmakov @ 2005-04-06 13:09 ` Jeffrey Mahoney 2005-04-07 5:25 ` Kathy KN (HK) 1 sibling, 0 replies; 23+ messages in thread From: Jeffrey Mahoney @ 2005-04-06 13:09 UTC (permalink / raw) To: Anton Altaparmakov; +Cc: Kathy KN (HK), Bryan Henderson, linux-fsdevel Anton Altaparmakov wrote: > Jeff Mahoney wrote: > >>Kathy KN (HK) wrote: >> >>>What I meant by via blocks is to gain knowledge of the physical >>>blocks used by the inodes and retrieve the content from it directly, >>>by accessing b_data. >> >>The problem with that approach is that some filesystems may store part >>of the file outside of a complete block. For example, reiserfs "tails" >>will respond with -ENOENT on ->bmap. For files smaller than 16k, they >>are quite common. > > > This is one not true and two wrong! > > Looking at reiserfs code in the current 2.6 kernel it does: [...] > This will result in sparse blocks being returned whenever an error > occurs. Not what is desired... > > <rant> > The problem with ->bmap is that it cannot return error at all. It > either returns 0 for sparse or >0 for real block. ->bmap is the most > stupid interface I have ever seen... )-: If you ask me it should be > removed from the kernel without notice. Let all applications that use > it break. Who cares... It can always be replaced with a sensible > interface that returns errors like -ESPARSE, -ENOTAPPLICABLE, -EIO, > -ENOMEM, etc and doesn't assume that 0 is sparse... > </rant> Ugh. Mea culpa. I knew reiserfs_bmap would return less than useful results, and stopped there. I should have dug a little deeper. -Jeff -- Jeff Mahoney SuSE Labs jeffm@suse.com ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Access content of file via inodes 2005-04-06 11:33 ` Anton Altaparmakov 2005-04-06 13:09 ` Jeffrey Mahoney @ 2005-04-07 5:25 ` Kathy KN (HK) 2005-04-07 6:47 ` Jeffrey Mahoney 1 sibling, 1 reply; 23+ messages in thread From: Kathy KN (HK) @ 2005-04-07 5:25 UTC (permalink / raw) To: Anton Altaparmakov; +Cc: Jeff Mahoney, Bryan Henderson, linux-fsdevel > Looking at reiserfs code in the current 2.6 kernel it does: > > .bmap = reiserfs_aop_bmap, > > Which is: > > static sector_t reiserfs_aop_bmap(struct address_space *as, sector_t > block) { > return generic_block_bmap(as, block, reiserfs_bmap) ; > } > > And generic_block_bmap is: > > sector_t generic_block_bmap(struct address_space *mapping, sector_t > block, > get_block_t *get_block) > { > struct buffer_head tmp; > struct inode *inode = mapping->host; > tmp.b_state = 0; > tmp.b_blocknr = 0; > get_block(inode, block, &tmp, 0); > return tmp.b_blocknr; > } > > It ignores any errors from get_block() and always returns tmp.b_blocknr. > Thus is get_block() fails, tmp.b_blocknr is 0 and hence 0 is returned, > i.e. a sparse block. Which is complete rubbish... > > And get_block in this case in reiserfs is: > > static int reiserfs_bmap (struct inode * inode, sector_t block, > struct buffer_head * bh_result, int create) > { > if (!file_capable (inode, block)) > return -EFBIG; > > reiserfs_write_lock(inode->i_sb); > /* do not read the direct item */ > _get_block_create_0 (inode, block, bh_result, 0) ; > reiserfs_write_unlock(inode->i_sb); > return 0; > } Just wondering. Say, reiserfs/r4, how is it possible to access the tail which contain the data of the file, since most of our production boxes uses either reiserfs and/or reiser4. Kathy ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Access content of file via inodes
From: Jeffrey Mahoney @ 2005-04-07 6:47 UTC
To: Kathy KN (HK); +Cc: Anton Altaparmakov, Bryan Henderson, linux-fsdevel

Kathy KN (HK) wrote:
> Just wondering. Say, reiserfs/r4, how is it possible to access
> the tail which contain the data of the file, since most of our
> production boxes uses either reiserfs and/or reiser4.

Hi Kathy -

Using vfs_read or the page cache functions will allow you access to the
tail since they will map it in as part of the file. You'd only run into
that problem if you were trying to access the data block-by-block as you
were initially.

-Jeff

--
Jeff Mahoney
SuSE Labs
jeffm@suse.com
* Re: Access content of file via inodes
From: Anton Altaparmakov @ 2005-04-07 8:09 UTC
To: Kathy KN (HK); +Cc: Jeffrey Mahoney, Bryan Henderson, linux-fsdevel

On Thu, 2005-04-07 at 02:47 -0400, Jeffrey Mahoney wrote:
> Kathy KN (HK) wrote:
> > Just wondering. Say, reiserfs/r4, how is it possible to access
> > the tail which contain the data of the file, since most of our
> > production boxes uses either reiserfs and/or reiser4.
>
> Hi Kathy -
>
> Using vfs_read or the page cache functions will allow you access to the
> tail since they will map it in as part of the file. You'd only run into
> that problem if you were trying to access the data block-by-block as you
> were initially.

Exactly the same for ntfs in case you care. (-:

Btw. Kathy, if you want examples how to do page cache reads you could
look at the ntfs driver. It does them all over the place (because
metadata is in the page cache). So for example
fs/ntfs/aops.c::ntfs_map_page() is (with comments):

    static inline struct page *ntfs_map_page(struct address_space *mapping,
            unsigned long index)
    // mapping is the address space mapping, i.e. struct inode *->i_mapping
    // index is file position you want to access >> PAGE_CACHE_SHIFT
    {
        struct page *page = read_cache_page(mapping, index,
                (filler_t*)mapping->a_ops->readpage, NULL);
    // the above read_cache_page initiates the page to be read
    // asynchronously so it is not finished when the call returns.

        if (!IS_ERR(page)) {
    // if no synchronous error occurred:
            wait_on_page_locked(page);
    // wait that the page becomes unlocked, which implies that either an
    // asynchronous error occurred or that the page has been read
    // successfully
            kmap(page);
    // map the page so can read the contents, you could do this much
    // later and only use kmap_atomic() depending on when/how you need
    // to read/write from/to the page
            if (PageUptodate(page) && !PageError(page))
                return page;
    // if the page is now uptodate and it does not have the error bit
    // set the read was successful!
            ntfs_unmap_page(page);
    // ouch. asynchronous error occurred. ntfs_unmap_page simply does a
    // "kunmap(page)" and a "page_cache_release(page)".
            return ERR_PTR(-EIO);
    // return -EIO error code encoded as a pointer.
        }
        return page;
    // ouch synchronous error. "page" contains the error code encoded
    // as a pointer.
    }

When you are finished accessing the page contents, simply unmap the page
if you have mapped it and do a "page_cache_release()" on the page.

Hope this helps.

Best regards,

    Anton
--
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer / IRC: #ntfs on irc.freenode.net
WWW: http://linux-ntfs.sf.net/ & http://www-stu.christs.cam.ac.uk/~aia21/
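The "error code encoded as a pointer" convention used by ntfs_map_page() above may be easier to follow with a small userspace model. This is a sketch, not the kernel's implementation: the `model_*` helpers and `toy_map_page()` are hypothetical names, and the cutoff of 4095 is an assumption borrowed from the MAX_ERRNO value in more recent kernels. The idea is that the top few thousand addresses are never valid pointers, so a negative errno can travel through the same return value as a real pointer.

```c
#include <errno.h>
#include <stdint.h>

/* Assumed cutoff: addresses in the last 4095 bytes of the address space
 * are treated as encoded errors rather than valid pointers. */
#define MODEL_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static inline void *model_err_ptr(long error)
{
    return (void *)(intptr_t)error;
}

/* Decode the errno value from an error pointer. */
static inline long model_ptr_err(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

/* True if the pointer falls in the reserved error range. */
static inline int model_is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MODEL_MAX_ERRNO;
}

/* Hypothetical lookup in the style of ntfs_map_page(): one return value
 * carries either a usable pointer or a negative errno. Only "page" 0
 * exists in this toy model. */
static char toy_page[16] = "page contents";

static void *toy_map_page(unsigned long index)
{
    if (index != 0)
        return model_err_ptr(-EIO);
    return toy_page;
}
```

A caller checks `model_is_err()` first and only then either dereferences the result or extracts the code with `model_ptr_err()`, which mirrors the `IS_ERR(page)` test at the top of the function quoted above.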
* Re: Access content of file via inodes
From: Jeff Mahoney @ 2005-04-05 19:01 UTC
To: Kathy KN; +Cc: linux-fsdevel

Kathy KN wrote:
> Good day all,
>
> How do I access/read the content of the files via using inodes
> or blocks that belong to the inode, at sys_link and vfs_link layer?
> I used bmap to access the blocks that belongs to the inodes, but
> getting access to the buffer_head's b_data doesn't seem to help.

Hi Kathy -

What you're trying to do is possible, but you need to go about it in a
different way. Ignore the buffer cache completely and use the page
cache; it's more appropriate for file contents.

You have two options:

If performance isn't critical, a simple approach would be to use your
old_dentry pointer to dentry_open a file and then vfs_read from it to a
buffer you allocate. Make sure you use get_fs/set_fs, since vfs_read
won't accept a kernel pointer otherwise.

If performance is more important or you really do only have access to an
inode, you can read from the page cache directly using inode->i_mapping
and read_cache_page. This has the advantage that you don't need to copy
the data to access it, but the disadvantage that it is more complex and
can be tricky to get right.

-Jeff

--
Jeff Mahoney
SuSE Labs
* Re: Access content of file via inodes 2005-04-05 19:01 ` Jeff Mahoney @ 2005-04-06 1:32 ` Kathy KN (HK) 2005-04-06 1:50 ` Jeff Mahoney 2005-04-08 6:01 ` Kathy KN (HK) 1 sibling, 1 reply; 23+ messages in thread From: Kathy KN (HK) @ 2005-04-06 1:32 UTC (permalink / raw) To: Jeff Mahoney; +Cc: linux-fsdevel > Hi Kathy - > > If performance is more important or you really do only have access to an > inode, you can read from the page cache directly using inode->i_mapping > and read_cache_page. This has the advantage that you don't need to copy > the data to access it, but the disadvantage that it is more complex and > can be tricky to get right. Hi Jeff, I felt that the second suggestion seems to be more of an elegant solution, though I need to find out how to actually do it correctly. Thanks for the tip. If you have example code, that would be splendid. Kathy ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Access content of file via inodes 2005-04-06 1:32 ` Kathy KN (HK) @ 2005-04-06 1:50 ` Jeff Mahoney 0 siblings, 0 replies; 23+ messages in thread From: Jeff Mahoney @ 2005-04-06 1:50 UTC (permalink / raw) To: Kathy KN (HK); +Cc: linux-fsdevel Kathy KN (HK) wrote: >>Hi Kathy - >> >>If performance is more important or you really do only have access to an >>inode, you can read from the page cache directly using inode->i_mapping >>and read_cache_page. This has the advantage that you don't need to copy >>the data to access it, but the disadvantage that it is more complex and >>can be tricky to get right. > > Hi Jeff, > > I felt that the second suggestion seems to be more of an elegant > solution, though I need to find out how to actually do it correctly. > > Thanks for the tip. If you have example code, that would be splendid. It's not the prettiest, and in fact I'm in the process of reworking it, but code similar to what you're looking at implementing can be found in fs/reiserfs/xattr.c; Start your analysis at reiserfs_xattr_get(). -Jeff -- Jeff Mahoney SuSE Labs ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Access content of file via inodes 2005-04-05 19:01 ` Jeff Mahoney 2005-04-06 1:32 ` Kathy KN (HK) @ 2005-04-08 6:01 ` Kathy KN (HK) 2005-04-08 8:17 ` Anton Altaparmakov 1 sibling, 1 reply; 23+ messages in thread From: Kathy KN (HK) @ 2005-04-08 6:01 UTC (permalink / raw) To: Jeff Mahoney; +Cc: linux-fsdevel On Apr 6, 2005 3:01 AM, Jeff Mahoney <jeffm@suse.com> wrote: > Kathy KN wrote: > > Good day all, > > > > How do I access/read the content of the files via using inodes > > or blocks that belong to the inode, at sys_link and vfs_link layer? > > I used bmap to access the blocks that belongs to the inodes, but > > getting access to the buffer_head's b_data doesn't seem to help. > > Hi Kathy - > > What you're trying to do is possible, but you need to go about it in a > different way. Ignore the buffer cache completely and use the page > cache; it's more appropriate for file contents. > > You have two options: > > If performance isn't critical, a simple approach would be to use your > old_dentry pointer to dentry_open a file and then vfs_read from it to a > buffer you allocate. Make sure you use get_fs/set_fs, since vfs_read > won't accept a kernel pointer otherwise. > > If performance is more important or you really do only have access to an > inode, you can read from the page cache directly using inode->i_mapping > and read_cache_page. This has the advantage that you don't need to copy > the data to access it, but the disadvantage that it is more complex and > can be tricky to get right. Hi Jeff, Is it possible to modify the cached page, and invalidate it back to update the page cache of the new page? I did a recursive grep and could only find functions that let you read or grab pages in the cache. Kathy ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Access content of file via inodes 2005-04-08 6:01 ` Kathy KN (HK) @ 2005-04-08 8:17 ` Anton Altaparmakov 2005-05-27 19:13 ` Martin Jambor 0 siblings, 1 reply; 23+ messages in thread From: Anton Altaparmakov @ 2005-04-08 8:17 UTC (permalink / raw) To: Kathy KN (HK); +Cc: Jeff Mahoney, linux-fsdevel On Fri, 2005-04-08 at 14:01 +0800, Kathy KN (HK) wrote: > On Apr 6, 2005 3:01 AM, Jeff Mahoney <jeffm@suse.com> wrote: > > Kathy KN wrote: > > > Good day all, > > > > > > How do I access/read the content of the files via using inodes > > > or blocks that belong to the inode, at sys_link and vfs_link layer? > > > I used bmap to access the blocks that belongs to the inodes, but > > > getting access to the buffer_head's b_data doesn't seem to help. > > > > Hi Kathy - > > > > What you're trying to do is possible, but you need to go about it in a > > different way. Ignore the buffer cache completely and use the page > > cache; it's more appropriate for file contents. > > > > You have two options: > > > > If performance isn't critical, a simple approach would be to use your > > old_dentry pointer to dentry_open a file and then vfs_read from it to a > > buffer you allocate. Make sure you use get_fs/set_fs, since vfs_read > > won't accept a kernel pointer otherwise. > > > > If performance is more important or you really do only have access to an > > inode, you can read from the page cache directly using inode->i_mapping > > and read_cache_page. This has the advantage that you don't need to copy > > the data to access it, but the disadvantage that it is more complex and > > can be tricky to get right. > > Hi Jeff, > > Is it possible to modify the cached page, and invalidate it back > to update the page cache of the new page? I did a recursive grep > and could only find functions that let you read or grab pages in the > cache. 
Once you have the page (via read_cache_page() or whatever) you can
simply write to it, then do a flush_dcache_page(page), then
set_page_dirty(page) and finally do the page_cache_release(). Oh, and
don't forget to unmap the page. Usually done straight after the
flush_dcache_page(). An example from ntfs where we get a page, memset
it to a value (val), and then mark it dirty for later write out:

    page = read_cache_page(mapping, idx,
            (filler_t*)mapping->a_ops->readpage, NULL);
    if (IS_ERR(page)) {
        ntfs_error(vol->sb, "Failed to read first partial "
                "page (sync error, index 0x%lx).", idx);
        return PTR_ERR(page);
    }
    wait_on_page_locked(page);
    if (unlikely(!PageUptodate(page))) {
        ntfs_error(vol->sb, "Failed to read first partial page "
                "(async error, index 0x%lx).", idx);
        page_cache_release(page);
        /* page is a valid pointer here, so PTR_ERR() would not
           yield an error code; report the failed read as -EIO. */
        return -EIO;
    }
    size = PAGE_CACHE_SIZE;
    if (idx == end)
        size = end_ofs;
    kaddr = kmap_atomic(page, KM_USER0);
    memset(kaddr + start_ofs, val, size - start_ofs);
    flush_dcache_page(page);
    kunmap_atomic(kaddr, KM_USER0);
    set_page_dirty(page);
    page_cache_release(page);

Of course you need to serialise access in some way so multiple writers
do not step on each other's toes, etc...

Best regards,

    Anton
--
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer / IRC: #ntfs on irc.freenode.net
WWW: http://linux-ntfs.sf.net/ & http://www-stu.christs.cam.ac.uk/~aia21/
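The ntfs excerpt above handles only the first, partial page of a range: it starts at an offset inside the page (start_ofs) and stops either at the end of the page or at end_ofs when the range ends in that same page. A userspace sketch of the same arithmetic for a whole byte range might look like the following. It assumes 4 KiB pages, and `page_span` and `split_into_pages()` are hypothetical names used only for illustration; each resulting span is what one read_cache_page() / modify / set_page_dirty() round in the kernel would cover.

```c
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12                          /* assumed 4 KiB pages */
#define MODEL_PAGE_SIZE  ((size_t)1 << MODEL_PAGE_SHIFT)

struct page_span {
    uint64_t index;   /* page index: file position >> page shift */
    size_t start;     /* first byte to touch inside the page */
    size_t len;       /* number of bytes to touch inside the page */
};

/* Split the byte range [pos, pos + count) into per-page spans.
 * Stores at most max spans in out and returns how many spans the
 * range needs in total. */
static size_t split_into_pages(uint64_t pos, uint64_t count,
                               struct page_span *out, size_t max)
{
    size_t n = 0;

    while (count > 0) {
        /* Offset of pos inside its page, and room left in that page. */
        size_t start = (size_t)(pos & (MODEL_PAGE_SIZE - 1));
        size_t len = MODEL_PAGE_SIZE - start;

        if (len > count)
            len = (size_t)count;        /* range ends inside this page */
        if (n < max) {
            out[n].index = pos >> MODEL_PAGE_SHIFT;
            out[n].start = start;
            out[n].len = len;
        }
        n++;
        pos += len;
        count -= len;
    }
    return n;
}
```

For a range starting at byte 4000 with length 5000, this yields three spans: 96 bytes at the end of page 0, all of page 1, and the first 808 bytes of page 2.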
* Re: Access content of file via inodes 2005-04-08 8:17 ` Anton Altaparmakov @ 2005-05-27 19:13 ` Martin Jambor 2005-05-28 15:57 ` Anton Altaparmakov 0 siblings, 1 reply; 23+ messages in thread From: Martin Jambor @ 2005-05-27 19:13 UTC (permalink / raw) To: Anton Altaparmakov; +Cc: Kathy KN (HK), Jeff Mahoney, linux-fsdevel Hi Anton, On 4/8/05, Anton Altaparmakov <aia21@cam.ac.uk> wrote: > On Fri, 2005-04-08 at 14:01 +0800, Kathy KN (HK) wrote: > > Hi Jeff, > > > > Is it possible to modify the cached page, and invalidate it back > > to update the page cache of the new page? I did a recursive grep > > and could only find functions that let you read or grab pages in the > > cache. > > Once you have the page (via read_cache_page() or whatever) you can > simply write to it, then do a flush_dcache_page(page), then > set_page_dirty(page) and finally do the page_cache_release(). Oh, and > don't forget to unmap the page. Usually done straight after the > flush_dcache_page(). And example from ntfs where we get a page, memset > it to a value (val), and then mark it dirty for later write out: > > page = read_cache_page(mapping, idx, > (filler_t*)mapping->a_ops->readpage, NULL); > if (IS_ERR(page)) { > ntfs_error(vol->sb, "Failed to read first partial " > "page (sync error, index 0x%lx).", idx); > return PTR_ERR(page); > } > wait_on_page_locked(page); > if (unlikely(!PageUptodate(page))) { > ntfs_error(vol->sb, "Failed to read first partial page " > "(async error, index 0x%lx).", idx); > page_cache_release(page); > return PTR_ERR(page); > } > size = PAGE_CACHE_SIZE; > if (idx == end) > size = end_ofs; > kaddr = kmap_atomic(page, KM_USER0); > memset(kaddr + start_ofs, val, size - start_ofs); > flush_dcache_page(page); > kunmap_atomic(kaddr, KM_USER0); > set_page_dirty(page); > page_cache_release(page); > > Of course you need to serialise access in some way so multiple writers > do not step on each other's toes, etc... 
What do I have to do when I need to append something to a file?
BTW, if that matters, I'll be implementing the address_space_operations
of that file as well...

Thanks very much in advance,

Martin
* Re: Access content of file via inodes
2005-05-27 19:13 ` Martin Jambor
@ 2005-05-28 15:57 ` Anton Altaparmakov
2005-05-28 21:44 ` Martin Jambor
0 siblings, 1 reply; 23+ messages in thread
From: Anton Altaparmakov @ 2005-05-28 15:57 UTC (permalink / raw)
To: Martin Jambor; +Cc: Kathy KN (HK), Jeff Mahoney, linux-fsdevel
Hi Martin,
On Fri, 27 May 2005, Martin Jambor wrote:
> On 4/8/05, Anton Altaparmakov <aia21@cam.ac.uk> wrote:
> > On Fri, 2005-04-08 at 14:01 +0800, Kathy KN (HK) wrote:
> > > Hi Jeff,
> > >
> > > Is it possible to modify the cached page, and invalidate it back
> > > to update the page cache of the new page? I did a recursive grep
> > > and could only find functions that let you read or grab pages in the
> > > cache.
> >
> > Once you have the page (via read_cache_page() or whatever) you can
> > simply write to it, then do a flush_dcache_page(page), then
> > set_page_dirty(page) and finally do the page_cache_release(). Oh, and
> > don't forget to unmap the page. Usually done straight after the
> > flush_dcache_page(). An example from ntfs where we get a page, memset
> > it to a value (val), and then mark it dirty for later write out:
> >
> > 	page = read_cache_page(mapping, idx,
> > 			(filler_t*)mapping->a_ops->readpage, NULL);
> > 	if (IS_ERR(page)) {
> > 		ntfs_error(vol->sb, "Failed to read first partial "
> > 				"page (sync error, index 0x%lx).", idx);
> > 		return PTR_ERR(page);
> > 	}
> > 	wait_on_page_locked(page);
> > 	if (unlikely(!PageUptodate(page))) {
> > 		ntfs_error(vol->sb, "Failed to read first partial page "
> > 				"(async error, index 0x%lx).", idx);
> > 		page_cache_release(page);
> > 		return -EIO;
> > 	}
> > 	size = PAGE_CACHE_SIZE;
> > 	if (idx == end)
> > 		size = end_ofs;
> > 	kaddr = kmap_atomic(page, KM_USER0);
> > 	memset(kaddr + start_ofs, val, size - start_ofs);
> > 	flush_dcache_page(page);
> > 	kunmap_atomic(kaddr, KM_USER0);
> > 	set_page_dirty(page);
> > 	page_cache_release(page);
> >
> > Of course you need to serialise access in some way so multiple writers
> > do not step on each other's toes, etc...
>
> What do I have to do when I need to append something to a file?
> BTW, if that matters, I'll be implementing the address_space_operations
> of that file as well...
That depends on where you want to do that from and what that file is.
If it is a file accessible from user space, the size modification should
be done under i_sem protection, which is tricky from address space
operations but fine from other parts of the kernel. If it is a file
private to you, you need to serialize access with your own lock, and then
it does not matter from where you do this (well, unless you try to do it
from an atomic region/with spinlocks held or something). Basically there
are all sorts of considerations.
Could you explain where in the kernel and in what context you need to do
the file append? And whether the file is private to you or accessible
from user space? It would be easier to explain what you need/can do if I
knew exactly what you are trying to do.
Best regards,
Anton
--
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer / IRC: #ntfs on irc.freenode.net
WWW: http://linux-ntfs.sf.net/ & http://www-stu.christs.cam.ac.uk/~aia21/
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Access content of file via inodes
2005-05-28 15:57 ` Anton Altaparmakov
@ 2005-05-28 21:44 ` Martin Jambor
2005-05-29 7:26 ` Anton Altaparmakov
0 siblings, 1 reply; 23+ messages in thread
From: Martin Jambor @ 2005-05-28 21:44 UTC (permalink / raw)
To: Anton Altaparmakov; +Cc: linux-fsdevel
Hi Anton,
On 5/28/05, Anton Altaparmakov <aia21@cam.ac.uk> wrote:
> > What do I have to do when I need to append something to a file?
> > BTW, if that matters, I'll be implementing the address_space_operations
> > of that file as well...
>
> That depends on where you want to do that from and what that file is.
The file is a file system internal file that is used to translate
inode numbers to addresses of inodes on the disk. It cannot be
written to from userspace. When I need one more inode than I currently
have, I also need to create new mapping info.
I implemented it using the same approach which ext2 uses to modify its
directories and believe that is the correct way (by calling the aops
prepare and commit methods). The only thing that puzzles me is that
ext2 does not call flush_dcache_page as you suggested. Since it
seems to be an architecture-specific function, I have no clue
whatsoever whether I need to call it or not.
Thanks for your email and for any clarification,
Martin
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Access content of file via inodes
2005-05-28 21:44 ` Martin Jambor
@ 2005-05-29 7:26 ` Anton Altaparmakov
2005-05-30 21:51 ` Martin Jambor
0 siblings, 1 reply; 23+ messages in thread
From: Anton Altaparmakov @ 2005-05-29 7:26 UTC (permalink / raw)
To: Martin Jambor; +Cc: linux-fsdevel
Hi,
On Sat, 28 May 2005, Martin Jambor wrote:
> On 5/28/05, Anton Altaparmakov <aia21@cam.ac.uk> wrote:
> > > What do I have to do when I need to append something to a file?
> > > BTW, if that matters, I'll be implementing the address_space_operations
> > > of that file as well...
> >
> > That depends on where you want to do that from and what that file is.
>
> The file is a file system internal file that is used to translate
> inode numbers to addresses of inodes on the disk. It cannot be
> written to from userspace. When I need one more inode than I currently
> have, I also need to create new mapping info.
Aha. Just like NTFS then. (-:
> I implemented it using the same approach which ext2 uses to modify its
> directories and believe that is the correct way (by calling the aops
> prepare and commit methods). The only thing that puzzles me is that
Yes, that is fine. Just note that prepare/commit write need to run under
i_sem protection. But if you are already serializing access to the file
some other way then you can ignore i_sem.
> ext2 does not call flush_dcache_page as you suggested. Since it
> seems to be an architecture-specific function, I have no clue
> whatsoever whether I need to call it or not.
Well, as far as I understand it, this function causes changes to the page
contents to become visible on all CPUs and from all processes, and needs to
be run before you unlock a page/mark it up to date; otherwise someone who
then locks it and/or reads it will possibly read old data from the page
that is no longer correct. Thus I always call it after modifying a page.
Perhaps it is sometimes or even never actually needed for NTFS, but I do
not want to take that risk so I prefer to always call it. It certainly
does not do any harm to call it except perhaps making the code a bit
slower.
> Thanks for your email and for any clarification,
No problem.
Best regards,
Anton
--
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer / IRC: #ntfs on irc.freenode.net
WWW: http://linux-ntfs.sf.net/ & http://www-stu.christs.cam.ac.uk/~aia21/
^ permalink raw reply [flat|nested] 23+ messages in thread
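The append pattern discussed in this message (take the serializing lock, run a prepare step, copy the data, run a commit step that extends the file size, drop the lock) can be sketched as a user-space model. This is an illustration under stated assumptions, not kernel code: `model_file`, `model_lock()`, `model_unlock()`, `model_prepare()`, `model_commit()` and `model_append()` are hypothetical names, the `locked` flag merely stands in for i_sem or a private lock, and the fixed-size buffer stands in for the page cache.

```c
/* User-space model only: none of these names are kernel APIs. */
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_FILE_MAX 8192

struct model_file {
	unsigned char data[MODEL_FILE_MAX];
	size_t size;	/* stands in for i_size */
	int locked;	/* stands in for i_sem or a private lock */
};

void model_lock(struct model_file *f)
{
	assert(!f->locked);
	f->locked = 1;
}

void model_unlock(struct model_file *f)
{
	assert(f->locked);
	f->locked = 0;
}

/* Prepare step: the lock must be held and the range must fit. */
int model_prepare(struct model_file *f, size_t from, size_t to)
{
	assert(f->locked);
	if (from > to || to > MODEL_FILE_MAX)
		return -1;
	return 0;
}

/* Commit step: extend the size if the written range ends past it. */
void model_commit(struct model_file *f, size_t to)
{
	assert(f->locked);
	if (to > f->size)
		f->size = to;
}

/* Append len bytes at the current end; 0 on success, -1 if it won't fit. */
int model_append(struct model_file *f, const void *buf, size_t len)
{
	size_t from, to;
	int err;

	model_lock(f);
	from = f->size;		/* size is read under the lock */
	to = from + len;
	err = model_prepare(f, from, to);
	if (!err) {
		memcpy(f->data + from, buf, len);
		model_commit(f, to);	/* size changes only here */
	}
	model_unlock(f);
	return err;
}
```

Reading the current size and updating it both happen with the lock held, so two appenders can never pick the same starting offset; that is the property the i_sem (or private lock) requirement is meant to guarantee.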
* Re: Access content of file via inodes
2005-05-29 7:26 ` Anton Altaparmakov
@ 2005-05-30 21:51 ` Martin Jambor
2005-05-30 22:19 ` Anton Altaparmakov
0 siblings, 1 reply; 23+ messages in thread
From: Martin Jambor @ 2005-05-30 21:51 UTC (permalink / raw)
To: Anton Altaparmakov; +Cc: linux-fsdevel
Hi,
On 5/29/05, Anton Altaparmakov <aia21@cam.ac.uk> wrote:
> > I implemented it using the same approach which ext2 uses to modify its
> > directories and believe that is the correct way (by calling the aops
> > prepare and commit methods). The only thing that puzzles me is that
>
> Yes, that is fine. Just note that prepare/commit write need to run under
> i_sem protection. But if you are already serializing access to the file
> some other way then you can ignore i_sem.
Do they? Documentation/filesystems/Locking only says they want their
page locked... but thanks for telling me, I will check that.
> > ext2 does not call flush_dcache_page as you suggested. Since it
> > seems to be an architecture-specific function, I have no clue
> > whatsoever whether I need to call it or not.
>
> Well, as far as I understand it, this function causes changes to the page
> contents to become visible on all CPUs and from all processes, and needs to
> be run before you unlock a page/mark it up to date; otherwise someone who
> then locks it and/or reads it will possibly read old data from the page
> that is no longer correct.
Some folks on irc sent me a link to an article that explains this
coherency stuff:
http://www.informit.com/articles/article.asp?p=29961&seqNum=6&rl=1
Basically, you need to call this only if the page might afterwards be
read from userspace. Directories are not, so ext2 doesn't have to.
Thanks again,
Martin
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Access content of file via inodes
2005-05-30 21:51 ` Martin Jambor
@ 2005-05-30 22:19 ` Anton Altaparmakov
0 siblings, 0 replies; 23+ messages in thread
From: Anton Altaparmakov @ 2005-05-30 22:19 UTC (permalink / raw)
To: Martin Jambor; +Cc: linux-fsdevel
Hi,
On Mon, 30 May 2005, Martin Jambor wrote:
> On 5/29/05, Anton Altaparmakov <aia21@cam.ac.uk> wrote:
> > > I implemented it using the same approach which ext2 uses to modify its
> > > directories and believe that is the correct way (by calling the aops
> > > prepare and commit methods). The only thing that puzzles me is that
> >
> > Yes, that is fine. Just note that prepare/commit write need to run under
> > i_sem protection. But if you are already serializing access to the file
> > some other way then you can ignore i_sem.
>
> Do they? Documentation/filesystems/Locking only says they want their
> page locked... but thanks for telling me, I will check that.
Well, I was being simplistic. For most (I believe) fs, i_sem is used to
protect against changes in i_size. So both file write and truncate are
done under i_sem. They are the only ops that can change i_size. (Aside:
As a consequence of this the lseek op of those fs is also done under
i_sem.)
> > > ext2 does not call flush_dcache_page as you suggested. Since it
> > > seems to be an architecture-specific function, I have no clue
> > > whatsoever whether I need to call it or not.
> >
> > Well, as far as I understand it, this function causes changes to the page
> > contents to become visible on all CPUs and from all processes, and needs to
> > be run before you unlock a page/mark it up to date; otherwise someone who
> > then locks it and/or reads it will possibly read old data from the page
> > that is no longer correct.
>
> Some folks on irc sent me a link to an article that explains this
> coherency stuff:
> http://www.informit.com/articles/article.asp?p=29961&seqNum=6&rl=1
> Basically, you need to call this only if the page might afterwards be
> read from userspace. Directories are not, so ext2 doesn't have to.
Ah, that's a great article! Thanks for the pointer. I will be able to
remove quite a few dcache flushes from ntfs now. (-:
Thanks,
Anton
--
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer / IRC: #ntfs on irc.freenode.net
WWW: http://linux-ntfs.sf.net/ & http://www-stu.christs.cam.ac.uk/~aia21/
^ permalink raw reply [flat|nested] 23+ messages in thread
end of thread, other threads:[~2005-05-30 22:19 UTC | newest]
Thread overview: 23+ messages
2005-04-05 1:23 Access content of file via inodes Kathy KN
2005-04-05 7:22 ` Christoph Hellwig
2005-04-05 17:53 ` Bryan Henderson
2005-04-06 1:27 ` Kathy KN (HK)
2005-04-06 1:53 ` Jeff Mahoney
2005-04-06 17:57 ` Bryan Henderson
2005-04-06 7:54 ` Anton Altaparmakov
2005-04-06 11:33 ` Anton Altaparmakov
2005-04-06 13:09 ` Jeffrey Mahoney
2005-04-07 5:25 ` Kathy KN (HK)
2005-04-07 6:47 ` Jeffrey Mahoney
2005-04-07 8:09 ` Anton Altaparmakov
2005-04-05 19:01 ` Jeff Mahoney
2005-04-06 1:32 ` Kathy KN (HK)
2005-04-06 1:50 ` Jeff Mahoney
2005-04-08 6:01 ` Kathy KN (HK)
2005-04-08 8:17 ` Anton Altaparmakov
2005-05-27 19:13 ` Martin Jambor
2005-05-28 15:57 ` Anton Altaparmakov
2005-05-28 21:44 ` Martin Jambor
2005-05-29 7:26 ` Anton Altaparmakov
2005-05-30 21:51 ` Martin Jambor
2005-05-30 22:19 ` Anton Altaparmakov