* [PATCH] xfs: fix extent format buffer allocation size
@ 2011-03-30 2:52 Dave Chinner
2011-03-30 9:33 ` Christoph Hellwig
2011-04-04 20:09 ` Alex Elder
0 siblings, 2 replies; 5+ messages in thread
From: Dave Chinner @ 2011-03-30 2:52 UTC (permalink / raw)
To: xfs
From: Dave Chinner <dchinner@redhat.com>
When formatting an inode item, we have to allocate a separate buffer
to hold extents when there are delayed allocation extents on the
inode and it is in extent format. The allocation size is derived
from the in-core data fork representation, which accounts for
delayed allocation extents, while the on-disk representation does
not contain any delalloc extents.
As a result of this mismatch, the allocated buffer can be far larger
than needed to hold the real extent list which, due to the fact the
inode is in extent format, is limited to the size of the literal
area of the inode. However, we can have thousands of delalloc
extents, resulting in an allocation size orders of magnitude larger
than is needed to hold all the real extents.
Fix this by limiting the size of the buffer being allocated to the
size of the literal area of the inodes in the filesystem (i.e. the
maximum size an inode fork can grow to).
Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
fs/xfs/xfs_inode_item.c | 69 ++++++++++++++++++++++++++++------------------
1 files changed, 42 insertions(+), 27 deletions(-)
diff --git a/fs/xfs/xfs_inode_item.c b/fs/xfs/xfs_inode_item.c
index 46cc401..12cdc39 100644
--- a/fs/xfs/xfs_inode_item.c
+++ b/fs/xfs/xfs_inode_item.c
@@ -198,6 +198,43 @@ xfs_inode_item_size(
}
/*
+ * xfs_inode_item_format_extents - convert in-core extents to on-disk form
+ *
+ * For either the data or attr fork in extent format, we need to endian convert
+ * the in-core extent as we place them into the on-disk inode. In this case, we
+ * ned to do this conversion before we write the extents into the log. Because
+ * we don't have the disk inode to write into here, we allocate a buffer and
+ * format the extents into it via xfs_iextents_copy(). We free the buffer in
+ * the unlock routine after the copy for the log has been made.
+ *
+ * For the data fork, there can be delayed allocation extents
+ * in the inode as well, so the in-core data fork can be much larger than the
+ * on-disk data representation of real inodes. Hence we need to limit the size
+ * of the allocation to what will fit in the inode fork, otherwise we could be
+ * asking for excessively large allocation sizes.
+ */
+STATIC void
+xfs_inode_item_format_extents(
+ struct xfs_inode *ip,
+ struct xfs_log_iovec *vecp,
+ int whichfork,
+ int type)
+{
+ xfs_bmbt_rec_t *ext_buffer;
+
+ ext_buffer = kmem_alloc(XFS_IFORK_SIZE(ip, whichfork),
+ KM_SLEEP | KM_NOFS);
+ if (whichfork == XFS_DATA_FORK)
+ ip->i_itemp->ili_extents_buf = ext_buffer;
+ else
+ ip->i_itemp->ili_aextents_buf = ext_buffer;
+
+ vecp->i_addr = ext_buffer;
+ vecp->i_len = xfs_iextents_copy(ip, ext_buffer, whichfork);
+ vecp->i_type = type;
+}
+
+/*
* This is called to fill in the vector of log iovecs for the
* given inode log item. It fills the first item with an inode
* log format structure, the second with the on-disk inode structure,
@@ -213,7 +250,6 @@ xfs_inode_item_format(
struct xfs_inode *ip = iip->ili_inode;
uint nvecs;
size_t data_bytes;
- xfs_bmbt_rec_t *ext_buffer;
xfs_mount_t *mp;
vecp->i_addr = &iip->ili_format;
@@ -320,22 +356,8 @@ xfs_inode_item_format(
} else
#endif
{
- /*
- * There are delayed allocation extents
- * in the inode, or we need to convert
- * the extents to on disk format.
- * Use xfs_iextents_copy()
- * to copy only the real extents into
- * a separate buffer. We'll free the
- * buffer in the unlock routine.
- */
- ext_buffer = kmem_alloc(ip->i_df.if_bytes,
- KM_SLEEP);
- iip->ili_extents_buf = ext_buffer;
- vecp->i_addr = ext_buffer;
- vecp->i_len = xfs_iextents_copy(ip, ext_buffer,
- XFS_DATA_FORK);
- vecp->i_type = XLOG_REG_TYPE_IEXT;
+ xfs_inode_item_format_extents(ip, vecp,
+ XFS_DATA_FORK, XLOG_REG_TYPE_IEXT);
}
ASSERT(vecp->i_len <= ip->i_df.if_bytes);
iip->ili_format.ilf_dsize = vecp->i_len;
@@ -445,19 +467,12 @@ xfs_inode_item_format(
*/
vecp->i_addr = ip->i_afp->if_u1.if_extents;
vecp->i_len = ip->i_afp->if_bytes;
+ vecp->i_type = XLOG_REG_TYPE_IATTR_EXT;
#else
ASSERT(iip->ili_aextents_buf == NULL);
- /*
- * Need to endian flip before logging
- */
- ext_buffer = kmem_alloc(ip->i_afp->if_bytes,
- KM_SLEEP);
- iip->ili_aextents_buf = ext_buffer;
- vecp->i_addr = ext_buffer;
- vecp->i_len = xfs_iextents_copy(ip, ext_buffer,
- XFS_ATTR_FORK);
+ xfs_inode_item_format_extents(ip, vecp,
+ XFS_ATTR_FORK, XLOG_REG_TYPE_IATTR_EXT);
#endif
- vecp->i_type = XLOG_REG_TYPE_IATTR_EXT;
iip->ili_format.ilf_asize = vecp->i_len;
vecp++;
nvecs++;
--
1.7.2.3
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
* Re: [PATCH] xfs: fix extent format buffer allocation size
2011-03-30 2:52 [PATCH] xfs: fix extent format buffer allocation size Dave Chinner
@ 2011-03-30 9:33 ` Christoph Hellwig
2011-03-31 5:51 ` Dave Chinner
2011-04-04 20:09 ` Alex Elder
1 sibling, 1 reply; 5+ messages in thread
From: Christoph Hellwig @ 2011-03-30 9:33 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs
> + xfs_bmbt_rec_t *ext_buffer;
> +
> + ext_buffer = kmem_alloc(XFS_IFORK_SIZE(ip, whichfork),
Shouldn't the fork size be the minimum of XFS_IFORK_SIZE and the if_bytes
value?
* Re: [PATCH] xfs: fix extent format buffer allocation size
2011-03-30 9:33 ` Christoph Hellwig
@ 2011-03-31 5:51 ` Dave Chinner
2011-03-31 6:29 ` Christoph Hellwig
0 siblings, 1 reply; 5+ messages in thread
From: Dave Chinner @ 2011-03-31 5:51 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: xfs
On Wed, Mar 30, 2011 at 05:33:34AM -0400, Christoph Hellwig wrote:
> > + xfs_bmbt_rec_t *ext_buffer;
> > +
> > + ext_buffer = kmem_alloc(XFS_IFORK_SIZE(ip, whichfork),
>
> Shouldn't the fork size be the minimum of XFS_IFORK_SIZE and the if_bytes
> value?
I thought about that, but I don't think it makes any difference. If
there are no delalloc extents, then XFS_IFORK_SIZE and ifp->if_bytes
are identical when the fork is in extent format. If there are
delalloc extents, then XFS_IFORK_SIZE() is the one we want. Hence I
don't think we need to even consider the value of ifp->if_bytes at
all here....
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: [PATCH] xfs: fix extent format buffer allocation size
2011-03-31 5:51 ` Dave Chinner
@ 2011-03-31 6:29 ` Christoph Hellwig
0 siblings, 0 replies; 5+ messages in thread
From: Christoph Hellwig @ 2011-03-31 6:29 UTC (permalink / raw)
To: Dave Chinner; +Cc: Christoph Hellwig, xfs
On Thu, Mar 31, 2011 at 04:51:14PM +1100, Dave Chinner wrote:
> On Wed, Mar 30, 2011 at 05:33:34AM -0400, Christoph Hellwig wrote:
> > > + xfs_bmbt_rec_t *ext_buffer;
> > > +
> > > + ext_buffer = kmem_alloc(XFS_IFORK_SIZE(ip, whichfork),
> >
> > Shouldn't the fork size be the minimum of XFS_IFORK_SIZE and the if_bytes
> > value?
>
> I thought about that, but I don't think it makes any difference. If
> there are no delalloc extents, then XFS_IFORK_SIZE and ifp->if_bytes
> are identical when the fork is in extent format. If there are
> delalloc extents, then XFS_IFORK_SIZE() is the one we want. Hence I
> don't think we need to even consider the value of ifp->if_bytes at
> all here....
You're right.
* Re: [PATCH] xfs: fix extent format buffer allocation size
2011-03-30 2:52 [PATCH] xfs: fix extent format buffer allocation size Dave Chinner
2011-03-30 9:33 ` Christoph Hellwig
@ 2011-04-04 20:09 ` Alex Elder
1 sibling, 0 replies; 5+ messages in thread
From: Alex Elder @ 2011-04-04 20:09 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs
On Wed, 2011-03-30 at 13:52 +1100, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
>
> When formatting an inode item, we have to allocate a separate buffer
> to hold extents when there are delayed allocation extents on the
> inode and it is in extent format. The allocation size is derived
> from the in-core data fork representation, which accounts for
> delayed allocation extents, while the on-disk representation does
> not contain any delalloc extents.
>
> As a result of this mismatch, the allocated buffer can be far larger
> than needed to hold the real extent list which, due to the fact the
> inode is in extent format, is limited to the size of the literal
> area of the inode. However, we can have thousands of delalloc
> extents, resulting in an allocation size orders of magnitude larger
> than is needed to hold all the real extents.
>
> Fix this by limiting the size of the buffer being allocated to the
> size of the literal area of the inodes in the filesystem (i.e. the
> maximum size an inode fork can grow to).
>
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
I think one of your comments helps explain how you arrived at
the fix, but it may distract a bit from explaining what
the code is doing. (See below.) But the change looks good.
Reviewed-by: Alex Elder <aelder@sgi.com>
> ---
> fs/xfs/xfs_inode_item.c | 69 ++++++++++++++++++++++++++++------------------
> 1 files changed, 42 insertions(+), 27 deletions(-)
>
> diff --git a/fs/xfs/xfs_inode_item.c b/fs/xfs/xfs_inode_item.c
> index 46cc401..12cdc39 100644
> --- a/fs/xfs/xfs_inode_item.c
> +++ b/fs/xfs/xfs_inode_item.c
> @@ -198,6 +198,43 @@ xfs_inode_item_size(
> }
>
> /*
> + * xfs_inode_item_format_extents - convert in-core extents to on-disk form
> + *
> + * For either the data or attr fork in extent format, we need to endian convert
> + * the in-core extent as we place them into the on-disk inode. In this case, we
> + * ned to do this conversion before we write the extents into the log. Because
need
> + * we don't have the disk inode to write into here, we allocate a buffer and
> + * format the extents into it via xfs_iextents_copy(). We free the buffer in
> + * the unlock routine after the copy for the log has been made.
> + *
> + * For the data fork, there can be delayed allocation extents
> + * in the inode as well, so the in-core data fork can be much larger than the
> + * on-disk data representation of real inodes. Hence we need to limit the size
> + * of the allocation to what will fit in the inode fork, otherwise we could be
> + * asking for excessively large allocation sizes.
I think the comment here is sort of oriented toward the old way
of looking at things. I don't think there's any need to justify
the use of the max fork size rather than xfs_ifork->if_bytes.
Just say that any on-disk inode in extent format will fill
no more than that much space on disk, therefore the buffer
we allocate can be limited to that size. (And perhaps as an
afterthought, delayed allocation extents are never recorded
to disk and therefore don't need space for endian-converting
them.)
> + */
> +STATIC void
> +xfs_inode_item_format_extents(
> + struct xfs_inode *ip,
> + struct xfs_log_iovec *vecp,
> + int whichfork,
> + int type)
> +{
. . .
Thread overview: 5+ messages
2011-03-30 2:52 [PATCH] xfs: fix extent format buffer allocation size Dave Chinner
2011-03-30 9:33 ` Christoph Hellwig
2011-03-31 5:51 ` Dave Chinner
2011-03-31 6:29 ` Christoph Hellwig
2011-04-04 20:09 ` Alex Elder