* delayed extent tree test cases
From: Allison Henderson @ 2012-03-09 5:18 UTC
To: Yongqiang Yang, Ext4 Developers List

Hi Yongqiang,

I am looking for test cases to exercise your delayed extent tree patch set.
I am working on expanding your solution for extent locks, and it would be
nice to have some sanity checks as I go along to make sure I have not broken
it :) Are there any tests in particular that you used while you were
developing it?

Thx!
Allison Henderson
* Re: delayed extent tree test cases
From: Yongqiang Yang @ 2012-03-09 5:36 UTC
To: Allison Henderson; +Cc: Ext4 Developers List

On Fri, Mar 9, 2012 at 1:18 PM, Allison Henderson
<achender@linux.vnet.ibm.com> wrote:
> Hi Yongqiang,

Hi Allison,

> I am looking for test cases to exercise your delayed extent tree patch set.
> I am working on expanding your solution for extent locks, and it would be
> nice to have some sanity checks as I go along to make sure I have not broken
> it :) Are there any tests in particular that you used while you were
> developing it? Thx!

I have no particular tests for the delayed extent tree on hand. I just
tested the patch set with xfstests. If there is any help I can provide,
please tell me.

Yongqiang.

--
Best Wishes
Yongqiang Yang
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* Re: delayed extent tree test cases
From: Allison Henderson @ 2012-03-09 6:39 UTC
To: Yongqiang Yang
Cc: Ext4 Developers List, Lukas Czerner, Ted Ts'o, Mingming Cao

On 03/08/2012 10:36 PM, Yongqiang Yang wrote:
> I have no particular tests for the delayed extent tree on hand. I just
> tested the patch set with xfstests. If there is any help I can provide,
> please tell me.

Alrighty, I'll give it a run through xfstests tonight, and then maybe I can
show you what I've got so far. My first few patches are pretty much just
renaming things from delayed_extent to status_extent, since it's doing a lot
more than delayed extents now. I figured we could just merge those patches
together, since I don't think your set has been merged yet.

The next step that I am working on now is getting it to track allocated
extents. So any pointers for doing that would be helpful :) It looks like
the current code is optimized for merging extents as much as possible, and
that makes sense for delayed extents, but for allocated extents, we need to
get it to mirror the existing extents. That way we will know what extents
there are to lock before we start doing things with the current extent tree.

When I think about all the ins and outs of trying to keep the trees in sync,
I realize it may get complex, but I don't think we would want to deal with
the odd things that might come out of allowing tasks to lock a partial
extent either. Suggestions for simplifications are certainly welcome though
:)

Thx!
Allison Henderson
* Re: delayed extent tree test cases
From: Yongqiang Yang @ 2012-03-09 9:19 UTC
To: Allison Henderson
Cc: Ext4 Developers List, Lukas Czerner, Ted Ts'o, Mingming Cao

> Alrighty, I'll give it a run through xfstests tonight, and then maybe I can
> show you what I've got so far. My first few patches are pretty much just
> renaming things from delayed_extent to status_extent, since it's doing a lot
> more than delayed extents now. I figured we could just merge those patches
> together, since I don't think your set has been merged yet.

Agree! This can reduce Ted's work.

> The next step that I am working on now is getting it to track allocated
> extents. So any pointers for doing that would be helpful :) It looks like
> the current code is optimized for merging extents as much as possible, and
> that makes sense for delayed extents, but for allocated extents, we need to

Yep, it is heavily optimized for delayed extents.

> get it to mirror the existing extents. That way we will know what extents
> there are to lock before we start doing things with the current extent tree.
>
> When I think about all the ins and outs of trying to keep the trees in sync,

Actually, delayed extents are also kept in sync. This can be easily achieved
by protecting operations on the extent tree with i_data_sem.

> I realize it may get complex, but I don't think we would want to deal with
> the odd things that might come out of allowing tasks to lock a partial
> extent either. Suggestions for simplifications are certainly welcome though
> :)

I am a little confused by "partial extent" here. I am guessing you meant
that the extent rb-tree in memory is the mirror of the extent tree in the
inode, which is stored on disk. Am I right?

In my head, the extent tree used by the extent lock tracks logical extents.
For example, a process locks a range of a file and does not care about the
physical blocks. So we just need to record logical extents, without physical
block info. Then locking an extent may trigger splitting an extent, while
unlocking may trigger merging extents. Am I right?

Yongqiang.

--
Best Wishes
Yongqiang Yang
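[Editor's illustration] The split-on-lock / merge-on-unlock behaviour described in the message above can be modelled in a few lines of user-space Python. This is only a sketch under assumptions: the names `Segment` and `RangeLockMap` are hypothetical, a sorted list stands in for the rb-tree, and nothing here is taken from the actual patch set.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Segment:
    start: int            # first logical block, inclusive
    end: int              # last logical block, inclusive
    owner: Optional[str]  # None means the segment is unlocked


class RangeLockMap:
    """Hypothetical model: logical ranges only, no physical block info."""

    def __init__(self, last_block: int) -> None:
        # One unlocked segment covering the whole logical range.
        self.segs: List[Segment] = [Segment(0, last_block, None)]

    def _split_at(self, block: int) -> None:
        """Make sure some segment starts exactly at `block`."""
        for i, s in enumerate(self.segs):
            if s.start < block <= s.end:
                self.segs[i:i + 1] = [Segment(s.start, block - 1, s.owner),
                                      Segment(block, s.end, s.owner)]
                return

    def _merge_unlocked(self) -> None:
        """Coalesce neighbouring unlocked segments."""
        merged = [self.segs[0]]
        for s in self.segs[1:]:
            if s.owner is None and merged[-1].owner is None:
                merged[-1] = Segment(merged[-1].start, s.end, None)
            else:
                merged.append(s)
        self.segs = merged

    def lock(self, start: int, end: int, owner: str) -> bool:
        """Lock [start, end]; returns False if any part is already locked."""
        covered = [s for s in self.segs if s.start <= end and s.end >= start]
        if any(s.owner is not None for s in covered):
            return False
        self._split_at(start)      # locking may split an extent ...
        self._split_at(end + 1)
        for s in self.segs:
            if start <= s.start and s.end <= end:
                s.owner = owner
        return True

    def unlock(self, start: int, end: int, owner: str) -> None:
        """Unlock the owner's segments inside [start, end]."""
        for s in self.segs:
            if s.owner == owner and start <= s.start and s.end <= end:
                s.owner = None
        self._merge_unlocked()     # ... and unlocking may merge extents
```

In this model a failed `lock()` simply returns `False`; a real implementation would presumably queue the caller instead, which the thread does not specify.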
* Re: delayed extent tree test cases
From: Allison Henderson @ 2012-03-09 16:40 UTC
To: Yongqiang Yang
Cc: Ext4 Developers List, Lukas Czerner, Ted Ts'o, Mingming Cao

On 03/09/2012 02:19 AM, Yongqiang Yang wrote:
>> When I think about all the ins and outs of trying to keep the trees in
>> sync,
> Actually, delayed extents are also kept in sync. This can be easily achieved
> by protecting operations on the extent tree with i_data_sem.

Ah, sorry, I could have phrased that better. What I meant was trying to keep
the new status tree in sync with the on-disk tree, so that the status tree
mirrors the same allocated extents as the on-disk tree.

> I am a little confused by "partial extent" here. I am guessing you meant
> that the extent rb-tree in memory is the mirror of the extent tree in the
> inode, which is stored on disk. Am I right?
>
> In my head, the extent tree used by the extent lock tracks logical extents.
> For example, a process locks a range of a file and does not care about the
> physical blocks. So we just need to record logical extents, without physical
> block info. Then locking an extent may trigger splitting an extent, while
> unlocking may trigger merging extents. Am I right?

Well, initially I was doing something similar to that, where we only lock
logical ranges that may or may not be "extent aligned" with the on-disk
extents. But the concern I have is that we may end up with processes that
have the same on-disk extent locked. For example, say process A locks a
logical range of blocks 1-5 and process B locks a logical range of blocks
6-10. But if the on-disk extents are actually 1-2, 3-7 and 8-10, we have a
situation where both processes own a piece of the 3-7 extent, but they won't
know it until they get down into the on-disk extents. And it seems to me
they should really have the whole on-disk extent locked before they do any
on-disk splitting. And now we have a deadlock condition, since one of them
is going to have to give up their lock before the other can proceed. So
that's when I started thinking maybe we need to make sure that the locked
ranges are extent aligned. Does that make sense? Maybe there is something I
am overlooking that would help simplify.

Allison Henderson
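[Editor's illustration] The example in the message above (locks on blocks 1-5 and 6-10 over on-disk extents 1-2, 3-7 and 8-10) can be checked with a small Python sketch. The helper names `overlaps`, `touched_extents` and `extent_align` are hypothetical and exist only for this illustration; ranges are inclusive `(first, last)` block pairs.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # inclusive (first_block, last_block)


def overlaps(a: Range, b: Range) -> bool:
    """True if the two inclusive ranges share at least one block."""
    return a[0] <= b[1] and b[0] <= a[1]


def touched_extents(lock_range: Range, extents: List[Range]) -> List[Range]:
    """On-disk extents that a logical lock range reaches into."""
    return [e for e in extents if overlaps(lock_range, e)]


def extent_align(lock_range: Range, extents: List[Range]) -> Range:
    """Widen a lock range so it covers every extent it touches in full."""
    touched = touched_extents(lock_range, extents)
    if not touched:
        return lock_range
    return (min(lock_range[0], min(e[0] for e in touched)),
            max(lock_range[1], max(e[1] for e in touched)))


on_disk = [(1, 2), (3, 7), (8, 10)]
lock_a = (1, 5)    # process A
lock_b = (6, 10)   # process B

# The logical locks do not overlap, yet both reach into extent 3-7.
shared = [e for e in touched_extents(lock_a, on_disk)
          if e in touched_extents(lock_b, on_disk)]

# Extent-aligned versions of the same requests do overlap, so the second
# request would conflict up front rather than deep in the on-disk tree.
aligned_a = extent_align(lock_a, on_disk)
aligned_b = extent_align(lock_b, on_disk)
```

Here `shared` holds the single extent 3-7, and the aligned ranges come out as 1-7 and 3-10, which is the conflict the extent-aligned scheme is meant to surface early.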
* Re: delayed extent tree test cases
From: Yongqiang Yang @ 2012-03-11 14:12 UTC
To: Allison Henderson
Cc: Ext4 Developers List, Lukas Czerner, Ted Ts'o, Mingming Cao

> Ah, sorry, I could have phrased that better. What I meant was trying to keep
> the new status tree in sync with the on-disk tree, so that the status tree
> mirrors the same allocated extents as the on-disk tree.
>
> Well, initially I was doing something similar to that, where we only lock
> logical ranges that may or may not be "extent aligned" with the on-disk
> extents. But the concern I have is that we may end up with processes that
> have the same on-disk extent locked. For example, say process A locks a
> logical range of blocks 1-5 and process B locks a logical range of blocks
> 6-10. But if the on-disk extents are actually 1-2, 3-7 and 8-10, we have a
> situation where both processes own a piece of the 3-7 extent, but they won't
> know it until they get down into the on-disk extents. And it seems to me
> they should really have the whole on-disk extent locked before they do any
> on-disk splitting. And now we have a deadlock condition, since one of them
> is going to have to give up their lock before the other can proceed. So
> that's when I started thinking maybe we need to make sure that the locked
> ranges are extent aligned. Does that make sense?

The extent lock is provided to user space processes, not to the kernel,
right? A process acquires an extent lock so that other processes cannot
access the locked extent. In other words, the extent lock is used to protect
the data in the file, not the internal data structures of the filesystem.
What we need to guarantee is that the data in the locked extent is not
changed, while the extent tree on disk can be changed.

So maybe we just need to wait for the lock to be freed before truncate and
punch hole. Are there any other operations that change the data of a file?

> Maybe there is something I am overlooking that would help simplify.

Ok. Now we have two extent trees: the first one is used to implement extent
locking, while the second one is used to map logical blocks to physical
blocks. If we protect operations on the two trees with i_data_sem, then the
two trees are synced. For example, a process that wants to modify a tree has
to acquire i_data_sem, and then no other process can access either tree.

Maybe I am overlooking something. :-)

Yongqiang.

--
Best Wishes
Yongqiang Yang
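[Editor's illustration] The idea of keeping two trees in step by guarding both with one lock can be sketched in user space as follows. This is a loose model under stated assumptions: `InodeModel`, `range_tree` and `block_map` are hypothetical names, plain dictionaries stand in for the trees, and a `threading.Lock` stands in for i_data_sem (which in the kernel is a read/write semaphore, not a plain mutex).

```python
import threading
from typing import Dict, Tuple


class InodeModel:
    """Hypothetical stand-in for an inode that owns two extent trees."""

    def __init__(self) -> None:
        self.data_sem = threading.Lock()                  # stand-in for i_data_sem
        self.range_tree: Dict[int, int] = {}              # lblk -> length
        self.block_map: Dict[int, Tuple[int, int]] = {}   # lblk -> (length, pblk)

    def add_extent(self, lblk: int, length: int, pblk: int) -> None:
        # Both trees change inside one critical section, so no other
        # thread can observe one tree updated and the other not.
        with self.data_sem:
            self.block_map[lblk] = (length, pblk)
            self.range_tree[lblk] = length

    def remove_extent(self, lblk: int) -> None:
        with self.data_sem:
            self.block_map.pop(lblk, None)
            self.range_tree.pop(lblk, None)

    def snapshot(self) -> Tuple[Dict[int, int], Dict[int, Tuple[int, int]]]:
        # Readers take the same lock, so a snapshot is always consistent.
        with self.data_sem:
            return dict(self.range_tree), dict(self.block_map)
```

A usage example: several threads calling `add_extent()` and `remove_extent()` concurrently will always leave `snapshot()` returning two dictionaries with identical key sets, because neither tree is ever touched outside the lock.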
* Re: delayed extent tree test cases
From: Allison Henderson @ 2012-03-13 0:04 UTC
To: Yongqiang Yang
Cc: Ext4 Developers List, Lukas Czerner, Ted Ts'o, Mingming Cao

On 03/11/2012 07:12 AM, Yongqiang Yang wrote:
> The extent lock is provided to user space processes, not to the kernel,
> right? A process acquires an extent lock so that other processes cannot
> access the locked extent. In other words, the extent lock is used to protect
> the data in the file, not the internal data structures of the filesystem.
> What we need to guarantee is that the data in the locked extent is not
> changed, while the extent tree on disk can be changed.

Well, it was my impression that the purpose of extent locks is to replace
i_mutex. Maybe I don't quite understand what you mean by user space? But I
think I understand what you are saying about i_data_sem protecting the
internal structures, and extent locks protecting the read/write of data. :)
i_data_sem should protect us from the concern I pointed out earlier, so that
will certainly simplify things.

> So maybe we just need to wait for the lock to be freed before truncate and
> punch hole. Are there any other operations that change the data of a file?

So, punch hole and truncate will definitely need to lock the space they are
removing, but there are a lot of other places where i_mutex will need to be
replaced too. I had a list a while ago of all the i_mutex occurrences in
ext4. I can repost it here so we can talk about it. Replacing all of these
will probably be the last part of the extent lock project, after I get the
tree tracking allocated extents, and then the locking logic on top of that.

Ext4 functions that lock i_mutex:
  ext4_sync_file
  ext4_fallocate
  ext4_move_extents via two helper routines:
    mext_inode_double_lock and mext_inode_double_unlock
  ext4_ioctl (for the EXT4_IOC_SETFLAGS ioctl)
  ext4_quota_write
  ext4_llseek
  ext4_end_io_work
  ext4_ind_direct_IO (only while calling ext4_flush_completed_IO)

Functions called by vfs with i_mutex locked:
  ext4_setattr
  ext4_da_writepages
  ext4_rmdir
  ext4_unlink
  ext4_symlink
  ext4_link
  ext4_rename
  ext4_get_block

For these functions called by the vfs, I don't plan to go change vfs code,
but we will need to be locking them ourselves in the ext4 code if we want
them to be synchronized with the functions in the first list, as they are
today. Let me know if you see anything missing or incorrect though.

> Ok. Now we have two extent trees: the first one is used to implement extent
> locking, while the second one is used to map logical blocks to physical
> blocks. If we protect operations on the two trees with i_data_sem, then the
> two trees are synced. For example, a process that wants to modify a tree has
> to acquire i_data_sem, and then no other process can access either tree.
>
> Maybe I am overlooking something. :-)

Ok, got it :) I probably should have seen that i_data_sem would solve this.
Thank you for pointing it out though, it does simplify things a lot. Thx for
all the advice :)

Allison Henderson
* Re: delayed extent tree test cases
From: Yongqiang Yang @ 2012-03-13 3:39 UTC
To: Allison Henderson
Cc: Ext4 Developers List, Lukas Czerner, Ted Ts'o, Mingming Cao

> Well, it was my impression that the purpose of extent locks is to replace
> i_mutex. Maybe I don't quite understand what you mean by user space?

Sorry, I misunderstood that. Thank you for your explanation, and I think I
am clear now :-)

Let's get back to the concerns. There are two: sync and deadlock.

I don't think we need to sync the two trees; actually, IMHO it is impossible
to sync the two trees. Consider a write that acquires a lock on an extent
beyond the end of a file before doing the actual writing. That is, we need
to lock an extent before it appears in the extent tree on disk.

Below is the 2nd concern, quoted from your email:
======================================================
For example, say process A locks a logical range of blocks 1-5 and
process B locks a logical range of blocks 6-10. But if the on-disk
extents are actually 1-2, 3-7 and 8-10, we have a situation where both
processes own a piece of the 3-7 extent, but they won't know it until
they get down into the on-disk extents. And it seems to me they should
really have the whole on-disk extent locked before they do any on-disk
splitting. And now we have a deadlock condition, since one of them is
going to have to give up their lock before the other can proceed. So
that's when I started thinking maybe we need to make sure that the
locked ranges are extent aligned. Does that make sense?
======================================================

I don't think we need to take an extent lock right before we modify the
extent tree on disk. All operations that modify the extent tree on disk
already hold their extent lock before they acquire i_data_sem, so it is safe
for them to split an extent or do something else, because they hold the
extent lock they should hold.

To continue with your example: both processes own a piece of the 3-7 extent,
so they hold their extent locks before acquiring i_data_sem. If process A
splits the extent, for example by removing the extent it locked from the
on-disk tree, the piece of the 3-7 extent which process A did not lock is
still there. Both processes work with no problem.

Yongqiang.

--
Best Wishes
Yongqiang Yang
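[Editor's illustration] The claim in the message above, that process A can remove the blocks it has locked while the rest of the 3-7 extent stays in place for process B, can be checked with a small sketch. `remove_range` is a hypothetical helper written for this note; it models the on-disk tree as a sorted list of inclusive `(first, last)` block ranges and ignores physical block numbers.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # inclusive (first_block, last_block)


def remove_range(extents: List[Range], start: int, end: int) -> List[Range]:
    """Return the extents left after removing blocks start..end."""
    remaining: List[Range] = []
    for lo, hi in extents:
        if hi < start or lo > end:
            remaining.append((lo, hi))         # untouched by the removal
            continue
        if lo < start:
            remaining.append((lo, start - 1))  # piece left of the hole
        if hi > end:
            remaining.append((end + 1, hi))    # piece right of the hole
    return remaining


on_disk = [(1, 2), (3, 7), (8, 10)]

# Process A holds the lock on blocks 1-5 and removes exactly that range.
after_a = remove_range(on_disk, 1, 5)
```

After the call, `after_a` contains the extents 6-7 and 8-10, so every block in process B's locked range 6-10 is still mapped.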
* Re: delayed extent tree test cases
From: Allison Henderson @ 2012-03-14 6:34 UTC
To: Yongqiang Yang
Cc: Ext4 Developers List, Lukas Czerner, Ted Ts'o, Mingming Cao

On 03/12/2012 08:39 PM, Yongqiang Yang wrote:
> Let's get back to the concerns. There are two: sync and deadlock.
>
> I don't think we need to sync the two trees; actually, IMHO it is impossible
> to sync the two trees. Consider a write that acquires a lock on an extent
> beyond the end of a file before doing the actual writing. That is, we need
> to lock an extent before it appears in the extent tree on disk.

Oh, yes, this condition I handle in the extent lock logic that we haven't
discussed much yet. Basically the new extent lock logic adds a "status"
member to the extent structure that can be either "delayed", "allocated" or
"hole". The idea is that if we try to lock something in the tree that is not
there yet, it gets allocated as a "hole" extent. And on unlocking, we
destroy it only if it's a hole and nobody else is waiting to lock it. I'll
also need to put in some special splitting logic for when a "hole" becomes
"delayed" or "allocated", since we need to retain the information about who
has it locked.

I guess now that we've reasoned away the need to track the allocated extents
exactly, maybe we really don't need "allocated" and "hole". Maybe all we
really need is just "delayed" and "not delayed", or something like that.
Though if someone later decides to expand the tree's functionality, it might
help to put in that infrastructure now. At the time I had suggested tracking
the allocated extents, people saw other uses for it too. It would pretty
much speed up any operation that needs to go walk the on-disk extent tree
just to see what its state is, like extent searching, allocating new space,
etc. As useful as these things are though, maybe we should just try to
tackle one new feature at a time. :)

> I don't think we need to take an extent lock right before we modify the
> extent tree on disk. All operations that modify the extent tree on disk
> already hold their extent lock before they acquire i_data_sem, so it is safe
> for them to split an extent or do something else, because they hold the
> extent lock they should hold.
>
> To continue with your example: both processes own a piece of the 3-7 extent,
> so they hold their extent locks before acquiring i_data_sem. If process A
> splits the extent, for example by removing the extent it locked from the
> on-disk tree, the piece of the 3-7 extent which process A did not lock is
> still there. Both processes work with no problem.

Right, these points make sense. i_data_sem will save us here from having to
mirror the extents exactly. I guess with the current scheme I have, if an
operation removes an extent (allocated or delayed), it would just turn into
a "hole" extent (or maybe the "not delayed" type). I'm glad you pointed it
out though, because it hadn't crossed my mind earlier. We will need to be
careful to only change the status of any locked extent being removed, so
that we don't free our own lock :)

Allison Henderson
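[Editor's illustration] The "status" scheme described in the message above (lock something absent and it appears as a "hole"; on unlock, destroy it only if it is a hole with nobody waiting; removing a locked extent only changes its status) can be modelled roughly as below. Everything here is an assumption made for illustration: the names `StatusExtent` and `StatusTree` are hypothetical, extents are keyed by exact `(first, last)` ranges with no splitting, and handing the lock to the first waiter on unlock is a guess at one reasonable policy, not something stated in the thread.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Range = Tuple[int, int]  # inclusive (first_block, last_block)


@dataclass
class StatusExtent:
    status: str                                   # "delayed", "allocated" or "hole"
    holder: Optional[str] = None                  # task that holds the lock
    waiters: List[str] = field(default_factory=list)


class StatusTree:
    """Hypothetical model keyed by exact ranges; no split/merge logic."""

    def __init__(self) -> None:
        self.extents: Dict[Range, StatusExtent] = {}

    def lock(self, rng: Range, task: str) -> bool:
        ext = self.extents.get(rng)
        if ext is None:
            # Nothing in the tree yet: create a "hole" to carry the lock.
            ext = self.extents[rng] = StatusExtent("hole")
        if ext.holder is None:
            ext.holder = task
            return True
        ext.waiters.append(task)                  # caller would sleep here
        return False

    def unlock(self, rng: Range, task: str) -> None:
        ext = self.extents[rng]
        if ext.holder != task:
            raise ValueError("unlock by a task that does not hold the lock")
        # Assumed policy: pass the lock to the first waiter, if any.
        ext.holder = ext.waiters.pop(0) if ext.waiters else None
        if ext.status == "hole" and ext.holder is None:
            del self.extents[rng]                 # idle hole: destroy it

    def remove(self, rng: Range) -> None:
        """Model of an operation removing an allocated or delayed extent."""
        ext = self.extents.get(rng)
        if ext is None:
            return
        if ext.holder is not None or ext.waiters:
            ext.status = "hole"                   # keep the lock state alive
        else:
            del self.extents[rng]
```

The `remove()` path is the point raised at the end of the message: a locked extent only changes status, so the task doing the removal does not free its own lock.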