* [Cluster-devel] [GFS2 PATCH] GFS2: Delete directory block reservation on failure
  [not found] <103608963.1288829.1375193618752.JavaMail.root@redhat.com>

From: Bob Peterson @ 2013-07-30 14:14 UTC
To: cluster-devel.redhat.com

Hi,

This patch adds one line of code that deletes a block reservation
structure for the source directory in the event that the inode creation
operation fails. If the inode creation succeeds, the reservation will
be deleted anyway, since directory reservations are now only 1 block.

Regards,

Bob Peterson
Red Hat File Systems

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index a01b8fd..371e4e3 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -715,6 +715,7 @@ fail_free_inode:
 	free_inode_nonrcu(inode);
 	inode = NULL;
 fail_gunlock:
+	gfs2_rs_delete(dip);
 	gfs2_glock_dq_uninit(ghs);
 	if (inode && !IS_ERR(inode)) {
 		clear_nlink(inode);

* [Cluster-devel] [GFS2 PATCH] GFS2: Delete directory block reservation on failure

From: Steven Whitehouse @ 2013-07-30 14:18 UTC
To: cluster-devel.redhat.com

Hi,

On Tue, 2013-07-30 at 10:14 -0400, Bob Peterson wrote:
> Hi,
>
> This patch adds one line of code that deletes a block reservation
> structure for the source directory in the event that the inode creation
> operation fails. If the inode creation succeeds, the reservation will
> be deleted anyway, since directory reservations are now only 1 block.
>
Why would we want to do that? If the creation has failed then that gives
us no information about whether further allocations are likely to be
made for that directory,

Steve.

> Regards,
>
> Bob Peterson
> Red Hat File Systems
>
> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
> ---
> diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
> index a01b8fd..371e4e3 100644
> --- a/fs/gfs2/inode.c
> +++ b/fs/gfs2/inode.c
> @@ -715,6 +715,7 @@ fail_free_inode:
>  	free_inode_nonrcu(inode);
>  	inode = NULL;
>  fail_gunlock:
> +	gfs2_rs_delete(dip);
>  	gfs2_glock_dq_uninit(ghs);
>  	if (inode && !IS_ERR(inode)) {
>  		clear_nlink(inode);
>

* [Cluster-devel] [GFS2 PATCH] GFS2: Delete directory block reservation on failure

From: Bob Peterson @ 2013-07-30 15:42 UTC
To: cluster-devel.redhat.com

Hi,

----- Original Message -----
| On Tue, 2013-07-30 at 10:14 -0400, Bob Peterson wrote:
| > Hi,
| >
| > This patch adds one line of code that deletes a block reservation
| > structure for the source directory in the event that the inode creation
| > operation fails. If the inode creation succeeds, the reservation will
| > be deleted anyway, since directory reservations are now only 1 block.
| >
| Why would we want to do that? If the creation has failed then that gives
| us no information about whether further allocations are likely to be
| made for that directory,

It's hard to explain, but it has to do with keeping the bitmaps as
defragmented as possible in memory so that we don't slow down file block
allocations with tons of unnecessary reservation structures to go through.
Directory reservations are only for a single block anyway, and in the case
where a new inode is created successfully, the block reservation is deleted
immediately thereafter. The reason we do this is to keep the bitmaps
as tightly packed as possible so that file allocations are given priority.
Otherwise we spend a huge amount of time rejecting many possible free
blocks because of outstanding reservations left around for directories by
virtue of the fact that directories are cached and not closed like files.

For details, see:
http://git.kernel.org/cgit/linux/kernel/git/steve/gfs2-3.0-nmw.git/commit/fs/gfs2?id=af21ca8ed50f01c5278c5ded6dad6f05e8a5d2e4

However, in the unsuccessful case, today's code leaves the single-block
reservation structure out there in memory for the directory, also
fragmenting the bitmap and creating more clutter for the block allocator to
go through when finding free blocks, just like we had before the
aforementioned patch.

It seems pointless to leave the reservation around speculatively on the
hopes of future dinode allocations for that directory. Even more so in the
failure case, especially since it seems likely to fail a second and
subsequent times as well for the same reason it failed this time.

Regards,

Bob Peterson
Red Hat File Systems

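The cost Bob describes (free blocks being rejected because another inode still holds a reservation on them) can be illustrated with a toy model. This is only a sketch under stated assumptions: `toy_rgrp`, `toy_find_free` and `toy_rs_delete` are hypothetical names invented for this example and are not the GFS2 structures or functions, and the real allocator tracks reservations differently from a flat flag array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NBLOCKS 64

/* Hypothetical toy resource group; not the GFS2 data structures. */
struct toy_rgrp {
	bool in_use[NBLOCKS];   /* block is allocated */
	bool reserved[NBLOCKS]; /* block is held by some other inode's reservation */
};

/*
 * Return the first block that is both free and unreserved, or -1.
 * *rejected counts free blocks that were skipped only because an
 * outstanding reservation covered them.
 */
int toy_find_free(const struct toy_rgrp *rg, size_t *rejected)
{
	assert(rg != NULL && rejected != NULL);
	*rejected = 0;
	for (int i = 0; i < NBLOCKS; i++) {
		if (rg->in_use[i])
			continue;
		if (rg->reserved[i]) {
			(*rejected)++; /* free, but not usable by this caller */
			continue;
		}
		return i;
	}
	return -1;
}

/* Drop a single-block reservation (toy stand-in for the cleanup step). */
void toy_rs_delete(struct toy_rgrp *rg, int block)
{
	assert(rg != NULL && block >= 0 && block < NBLOCKS);
	rg->reserved[block] = false;
}
```

In this model, every stale single-block reservation adds one rejected candidate to each later search that passes over it, and dropping the reservation removes that cost. Whether the effect is significant in practice depends on how many cached directories hold reservations, which the toy does not model.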
* [Cluster-devel] [GFS2 PATCH] GFS2: Delete directory block reservation on failure

From: Steven Whitehouse @ 2013-07-30 15:53 UTC
To: cluster-devel.redhat.com

Hi,

On Tue, 2013-07-30 at 11:42 -0400, Bob Peterson wrote:
> Hi,
>
> ----- Original Message -----
> | On Tue, 2013-07-30 at 10:14 -0400, Bob Peterson wrote:
> | > Hi,
> | >
> | > This patch adds one line of code that deletes a block reservation
> | > structure for the source directory in the event that the inode creation
> | > operation fails. If the inode creation succeeds, the reservation will
> | > be deleted anyway, since directory reservations are now only 1 block.
> | >
> | Why would we want to do that? If the creation has failed then that gives
> | us no information about whether further allocations are likely to be
> | made for that directory,
>
> It's hard to explain, but it has to do with keeping the bitmaps as
> defragmented as possible in memory so that we don't slow down file block
> allocations with tons of unnecessary reservation structures to go through.
> Directory reservations are only for a single block anyway, and in the case
> where a new inode is created successfully, the block reservation is deleted
> immediately thereafter. The reason we do this is to keep the bitmaps
> as tightly packed as possible so that file allocations are given priority.
> Otherwise we spend a huge amount of time rejecting many possible free
> blocks because of outstanding reservations left around for directories by
> virtue of the fact that directories are cached and not closed like files.
>
> For details, see:
> http://git.kernel.org/cgit/linux/kernel/git/steve/gfs2-3.0-nmw.git/commit/fs/gfs2?id=af21ca8ed50f01c5278c5ded6dad6f05e8a5d2e4
>
> However, in the unsuccessful case, today's code leaves the single-block
> reservation structure out there in memory for the directory, also
> fragmenting the bitmap and creating more clutter for the block allocator to
> go through when finding free blocks, just like we had before the
> aforementioned patch.
>
> It seems pointless to leave the reservation around speculatively on the
> hopes of future dinode allocations for that directory. Even more so in the
> failure case, especially since it seems likely to fail a second and
> subsequent times as well for the same reason it failed this time.
>
> Regards,
>
> Bob Peterson
> Red Hat File Systems

Well I think we need to take a closer look at what is going on. There
are several issues here... one is whether our predictor for how many
blocks will be used is doing a good job. The answer seems to be not,
since otherwise we wouldn't have needed to cut the reservation size to a
single block as a temporary measure.

If there is really no need to use reservations with directories, then
the best solution would be just to not use them in that case at all, and
return to something closer to the old code. It makes no sense to spend a
lot of effort to reserve single blocks, as that defeats the objective of
trying to keep things in extents.

The other issue is whether we can do better with the directory
allocations in the first place. I'd very much like to see a scheme for
keeping the blocks which make up the hash table contiguous on disk, and
to add a flag to the inode which is set when this is the case. That
would allow us to read the entire hash table with a single i/o, whatever
size it was. This may be a much better approach for dealing with
directory allocations.

Also, we should look at adding a timeout, perhaps, to directory
reservations so that we can keep them for a short time, but drop them if
they become unused. We need to find some better predictors of when it is
likely that a lot of files will be created in a particular directory I
think,

Steve.

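Steve's timeout suggestion (keep a directory reservation briefly, drop it once it has gone unused) could look roughly like the following. This is a minimal sketch, not a proposal for the actual GFS2 code: `toy_resv`, `toy_resv_touch`, `toy_resv_expire` and `TOY_RESV_TIMEOUT` are hypothetical names, time is passed in as a plain tick counter rather than read from a kernel clock, and the timeout value is arbitrary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical timeout in ticks; the right value would need measurement. */
#define TOY_RESV_TIMEOUT 5UL

/* Toy reservation with a last-use timestamp; not a GFS2 structure. */
struct toy_resv {
	bool active;
	unsigned long last_used; /* tick of the most recent allocation from it */
};

/* Record that the reservation was just used (or freshly created). */
void toy_resv_touch(struct toy_resv *rs, unsigned long now)
{
	assert(rs != NULL);
	rs->active = true;
	rs->last_used = now;
}

/*
 * Drop the reservation if it has been idle for at least the timeout.
 * Returns true when the reservation was dropped by this call.
 */
bool toy_resv_expire(struct toy_resv *rs, unsigned long now)
{
	assert(rs != NULL);
	if (rs->active && now - rs->last_used >= TOY_RESV_TIMEOUT) {
		rs->active = false;
		return true;
	}
	return false;
}
```

With this shape, a directory that keeps creating files refreshes `last_used` on every allocation and keeps its reservation, while a directory that stops allocating loses it after the idle period. Where the expiry check would run (a periodic scan, or lazily when the allocator encounters the reservation) is left open here, as it is in the thread.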