[Cluster-devel] [GFS2 PATCH] GFS2: Delete directory block reservation on failure


* [Cluster-devel] [GFS2 PATCH] GFS2: Delete directory block reservation on failure
       [not found] <103608963.1288829.1375193618752.JavaMail.root@redhat.com>
@ 2013-07-30 14:14 ` Bob Peterson
  2013-07-30 14:18   ` Steven Whitehouse
  0 siblings, 1 reply; 4+ messages in thread
From: Bob Peterson @ 2013-07-30 14:14 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi,

This patch adds one line of code that deletes a block reservation
structure for the source directory in the event that the inode creation
operation fails. If the inode creation succeeds, the reservation will
be deleted anyway, since directory reservations are now only 1 block.

Regards,

Bob Peterson
Red Hat File Systems

Signed-off-by: Bob Peterson <rpeterso@redhat.com> 
---
diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index a01b8fd..371e4e3 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -715,6 +715,7 @@ fail_free_inode:
 	free_inode_nonrcu(inode);
 	inode = NULL;
 fail_gunlock:
+	gfs2_rs_delete(dip);
 	gfs2_glock_dq_uninit(ghs);
 	if (inode && !IS_ERR(inode)) {
 		clear_nlink(inode);
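
For readers without the kernel tree to hand, the cleanup pattern in the hunk above can be modelled in user space. This is only a sketch under stated assumptions: `toy_dir`, `toy_resv`, `toy_reserve()`, `toy_rs_delete()` and `toy_create()` are hypothetical stand-ins and not GFS2 code, and the block number 100 is arbitrary. The point is just that both the success path and the failure label end with the directory holding no reservation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a per-directory block reservation. */
struct toy_resv {
	unsigned long start;	/* first reserved block */
	unsigned int len;	/* number of reserved blocks */
};

/* Hypothetical stand-in for the directory inode ("dip" in the patch). */
struct toy_dir {
	struct toy_resv *resv;	/* NULL when no reservation is held */
};

/* Models the role of gfs2_rs_delete(): drop the reservation, if any. */
static void toy_rs_delete(struct toy_dir *dip)
{
	free(dip->resv);	/* free(NULL) is a no-op */
	dip->resv = NULL;
}

/* Take a single-block reservation for the directory. */
static int toy_reserve(struct toy_dir *dip, unsigned long start)
{
	if (dip->resv)
		return 0;
	dip->resv = malloc(sizeof(*dip->resv));
	if (!dip->resv)
		return -1;
	dip->resv->start = start;
	dip->resv->len = 1;
	return 0;
}

/* Create an "inode"; fail_late forces an error after reserving. */
static int toy_create(struct toy_dir *dip, bool fail_late)
{
	int error;

	assert(dip != NULL);
	error = toy_reserve(dip, 100);
	if (error)
		return error;
	if (fail_late) {
		error = -1;
		goto fail_gunlock;
	}
	/* Success: the single block is consumed, reservation goes away. */
	toy_rs_delete(dip);
	return 0;

fail_gunlock:
	/* Without this call the reservation would outlive the failure. */
	toy_rs_delete(dip);
	return error;
}
```

Removing the call under `fail_gunlock` in this model leaves `dip->resv` set after a failed create, which is the situation the patch addresses.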




* [Cluster-devel] [GFS2 PATCH] GFS2: Delete directory block reservation on failure
  2013-07-30 14:14 ` [Cluster-devel] [GFS2 PATCH] GFS2: Delete directory block reservation on failure Bob Peterson
@ 2013-07-30 14:18   ` Steven Whitehouse
  2013-07-30 15:42     ` Bob Peterson
  0 siblings, 1 reply; 4+ messages in thread
From: Steven Whitehouse @ 2013-07-30 14:18 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi,

On Tue, 2013-07-30 at 10:14 -0400, Bob Peterson wrote:
> Hi,
> 
> This patch adds one line of code that deletes a block reservation
> structure for the source directory in the event that the inode creation
> operation fails. If the inode creation succeeds, the reservation will
> be deleted anyway, since directory reservations are now only 1 block.
> 
Why would we want to do that? If the creation has failed, then that gives
us no information about whether further allocations are likely to be
made for that directory.

Steve.


* [Cluster-devel] [GFS2 PATCH] GFS2: Delete directory block reservation on failure
  2013-07-30 14:18   ` Steven Whitehouse
@ 2013-07-30 15:42     ` Bob Peterson
  2013-07-30 15:53       ` Steven Whitehouse
  0 siblings, 1 reply; 4+ messages in thread
From: Bob Peterson @ 2013-07-30 15:42 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi,

----- Original Message -----
| On Tue, 2013-07-30 at 10:14 -0400, Bob Peterson wrote:
| > Hi,
| > 
| > This patch adds one line of code that deletes a block reservation
| > structure for the source directory in the event that the inode creation
| > operation fails. If the inode creation succeeds, the reservation will
| > be deleted anyway, since directory reservations are now only 1 block.
| > 
| Why would we want to do that? If the creation has failed then that gives
| us no information about whether further allocations are likely to be
| made for that directory,

The short version is that we want to keep the in-memory view of the bitmaps
as unfragmented as possible, so that file block allocations are not slowed
down by having to step over lots of unnecessary reservation structures.
Directory reservations are only for a single block anyway, and when a new
inode is created successfully, the block reservation is deleted immediately
afterwards. We do this to keep the bitmaps as tightly packed as possible so
that file allocations are given priority. Otherwise we spend a huge amount
of time rejecting many possible free blocks because of outstanding
reservations left around for directories, since directories are cached and
not closed like files.
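
To make the cost concrete, here is a toy model of the search, not the real allocator. Everything in it is an assumption for illustration: `toy_rgrp`, `toy_find_extent()` and the 16-block resource group are made up, and the real code walks bitmaps and a reservation tree rather than two bool arrays. It only shows how stray single-block reservations turn free blocks into rejected candidates and push an extent further out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NBLOCKS 16	/* assumed size of the toy resource group */

struct toy_rgrp {
	bool in_use[TOY_NBLOCKS];	/* block is allocated on disk */
	bool reserved[TOY_NBLOCKS];	/* block is held by someone's reservation */
};

/*
 * Find the first run of "want" blocks that are neither in use nor
 * reserved. Returns the starting block, or -1 if there is no such run.
 * *rejected counts free blocks that were skipped only because of an
 * outstanding reservation.
 */
static int toy_find_extent(const struct toy_rgrp *rg, int want, int *rejected)
{
	int run = 0;

	assert(want > 0);
	*rejected = 0;
	for (int b = 0; b < TOY_NBLOCKS; b++) {
		if (rg->in_use[b]) {
			run = 0;
			continue;
		}
		if (rg->reserved[b]) {
			/* Free on disk, but we still have to step over it. */
			(*rejected)++;
			run = 0;
			continue;
		}
		if (++run == want)
			return b - want + 1;
	}
	return -1;
}
```

With blocks 2, 5 and 8 reserved in an otherwise empty group, a request for four blocks is pushed out to block 9 after three rejections; with no reservations it is satisfied at block 0.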

For details, see:
http://git.kernel.org/cgit/linux/kernel/git/steve/gfs2-3.0-nmw.git/commit/fs/gfs2?id=af21ca8ed50f01c5278c5ded6dad6f05e8a5d2e4

However, in the unsuccessful case, today's code leaves the single-block
reservation structure out there in memory for the directory, also
fragmenting the bitmap and creating more clutter for the block allocator to
go through when finding free blocks, just like we had before the
aforementioned patch.

It seems pointless to leave the reservation around speculatively in the
hope of future dinode allocations for that directory. That is even more
true in the failure case, since the next attempt seems likely to fail for
the same reason this one did.

Regards,

Bob Peterson
Red Hat File Systems


* [Cluster-devel] [GFS2 PATCH] GFS2: Delete directory block reservation on failure
  2013-07-30 15:42     ` Bob Peterson
@ 2013-07-30 15:53       ` Steven Whitehouse
  0 siblings, 0 replies; 4+ messages in thread
From: Steven Whitehouse @ 2013-07-30 15:53 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi,

Well, I think we need to take a closer look at what is going on. There
are several issues here. One is whether our predictor for how many
blocks will be used is doing a good job. The answer seems to be that it
is not, since otherwise we wouldn't have needed to cut the reservation
size to a single block as a temporary measure.

If there is really no need to use reservations with directories, then
the best solution would be just to not use them in that case at all, and
return to something closer to the old code. It makes no sense to spend a
lot of effort to reserve single blocks, as that defeats the objective of
trying to keep things in extents.

The other issue is whether we can do better with the directory
allocations in the first place. I'd very much like to see a scheme for
keeping the blocks which make up the hash table contiguous on disk, and
to add a flag to the inode which is set when this is the case. That
would allow us to read the entire hash table with a single i/o, whatever
size it was. This may be a much better approach for dealing with
directory allocations.
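
As a rough illustration of the contiguity idea, the check behind such a flag could look something like the following. This is a sketch only: `toy_count_extents()` and `toy_hash_table_is_contiguous()` are hypothetical names, the block list is assumed to be already available as an array, and nothing here reflects an existing on-disk flag.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Count the contiguous runs in a list of on-disk block numbers.
 * Each run can be fetched with one read, so the result is a rough
 * proxy for the number of I/Os needed to load the whole table.
 */
static size_t toy_count_extents(const uint64_t *blocks, size_t n)
{
	size_t extents = 0;

	for (size_t i = 0; i < n; i++) {
		/* A new run starts at index 0 or where the sequence breaks. */
		if (i == 0 || blocks[i] != blocks[i - 1] + 1)
			extents++;
	}
	return extents;
}

/* Hypothetical per-inode flag value: true only for a single extent. */
static int toy_hash_table_is_contiguous(const uint64_t *blocks, size_t n)
{
	return n > 0 && toy_count_extents(blocks, n) == 1;
}
```

A table stored at blocks 10 to 13 counts as one extent and would get the flag; one stored at 10, 11, 20, 21, 22, 40 is three extents and would not.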

Also, we should perhaps look at adding a timeout to directory
reservations, so that we can keep them for a short time but drop them if
they become unused. I think we need to find some better predictors of
when it is likely that a lot of files will be created in a particular
directory.

Steve.
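
The timeout suggestion might be sketched along these lines. All of it is assumed for illustration: `toy_timed_resv`, `toy_resv_touch()`, `toy_resv_expire()` and the idle limit of 5 ticks are invented, and time is passed in explicitly rather than read from jiffies or a timer so that the behaviour is easy to follow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_RESV_IDLE_LIMIT 5	/* assumed timeout, in arbitrary ticks */

struct toy_timed_resv {
	bool active;		/* reservation is currently held */
	uint64_t last_used;	/* tick of the most recent allocation */
};

/* Record an allocation from the reservation and re-arm the timeout. */
static void toy_resv_touch(struct toy_timed_resv *rs, uint64_t now)
{
	rs->active = true;
	rs->last_used = now;
}

/*
 * Drop the reservation if it has been idle for longer than the limit.
 * Returns true when the reservation was dropped by this call.
 */
static bool toy_resv_expire(struct toy_timed_resv *rs, uint64_t now)
{
	if (!rs->active)
		return false;
	assert(now >= rs->last_used);
	if (now - rs->last_used <= TOY_RESV_IDLE_LIMIT)
		return false;
	rs->active = false;
	return true;
}
```

A directory that keeps creating files keeps calling `toy_resv_touch()` and so keeps its reservation; one that goes quiet loses it the first time `toy_resv_expire()` runs after the idle limit has passed.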

