From: Dave Chinner <david@fromorbit.com>
To: xfs@oss.sgi.com
Subject: [PATCH 9/9] xfs: log ticket reservation underestimates the number of iclogs
Date: Mon, 15 Mar 2010 13:35:06 +1100 [thread overview]
Message-ID: <1268620506-10799-10-git-send-email-david@fromorbit.com> (raw)
In-Reply-To: <1268620506-10799-1-git-send-email-david@fromorbit.com>
From: Dave Chinner <dchinner@redhat.com>
When allocation a ticket for a transaction, the ticket is initialised with the
worst case log space usage based on the number of bytes the transaction may
consume. Part of this calculation is the number of log headers required for the
iclog space used up by the transaction.
This calculation makes an undocumented assumption that if the transaction uses
the log header space reservation on an iclog, then it consumes either the
entire iclog or it completes. That is - the transaction that is first in an
iclog is the transaction that the log header reservation is accounted to. If
the transaction is larger than the iclog, then it will use the entire iclog
itself. Document this assumption.
Further, the current calculation uses the rule that we can fit iclog_size bytes
of transaction data into an iclog. This is in correct - the amount of space
available in an iclog for transaction data is the size of the iclog minus the
space used for log record headers. This means that the calculation is out by
512 bytes per 32k of log space the transaction can consume. This is rarely an
issue because maximally sized transactions are extremely uncommon, and for 4k
block size filesystems maximal transaction reservations are about 400kb. Hence
the error in this case is less than the size of an iclog, so that makes it even
harder to hit.
However, anyone using larger directory blocks (16k directory blocks push the
maximum transaction size to approx. 900k on a 4k block size filesystem) or
larger block size (e.g. 64k blocks push transactions to the 3-4MB size) could
see the error grow to more than an iclog and at this point the transaction is
guaranteed to get a reservation underrun and shutdown the filesystem.
Fix this by adjusting the calculation to calculate the correct number of iclogs
required and account for them all up front.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
fs/xfs/xfs_log.c | 55 +++++++++++++++++++++++++++++++++++++++++------------
1 files changed, 42 insertions(+), 13 deletions(-)
diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
index 1f26a97..7c6b0cd 100644
--- a/fs/xfs/xfs_log.c
+++ b/fs/xfs/xfs_log.c
@@ -664,7 +664,10 @@ xfs_log_item_init(
/*
* Write region vectors to log. The write happens using the space reservation
* of the ticket (tic). It is not a requirement that all writes for a given
- * transaction occur with one call to xfs_log_write().
+ * transaction occur with one call to xfs_log_write(). However, it is important
+ * to note that the transaction reservation code makes an assumption about the
+ * number of log headers a transaction requires that may be violated if you
+ * don't pass all the transaction vectors in one call....
*/
int
xfs_log_write(
@@ -3156,14 +3159,16 @@ xfs_log_ticket_get(
* Allocate and initialise a new log ticket.
*/
STATIC xlog_ticket_t *
-xlog_ticket_alloc(xlog_t *log,
- int unit_bytes,
- int cnt,
- char client,
- uint xflags)
+xlog_ticket_alloc(
+ struct log *log,
+ int unit_bytes,
+ int cnt,
+ char client,
+ uint xflags)
{
- xlog_ticket_t *tic;
+ struct xlog_ticket *tic;
uint num_headers;
+ int iclog_space;
tic = kmem_zone_zalloc(xfs_log_ticket_zone, KM_SLEEP|KM_MAYFAIL);
if (!tic)
@@ -3207,16 +3212,40 @@ xlog_ticket_alloc(xlog_t *log,
/* for start-rec */
unit_bytes += sizeof(xlog_op_header_t);
- /* for LR headers */
- num_headers = ((unit_bytes + log->l_iclog_size-1) >> log->l_iclog_size_log);
+ /*
+ * for LR headers - the space for data in an iclog is the size minus
+ * the space used for the headers. If we use the iclog size, then we
+ * undercalculate the number of headers required.
+ *
+ * Furthermore - the addition of op headers for split-recs might
+ * increase the space required enough to require more log and op
+ * headers, so take that into account too.
+ *
+ * IMPORTANT: This reservation makes the assumption that if this
+ * transaction is the first in an iclog and hence has the LR headers
+ * accounted to it, then the remaining space in the iclog is
+ * exclusively for this transaction. i.e. if the transaction is larger
+ * than the iclog, it will be the only thing in that iclog.
+ * Fundamentally, this means we must pass the entire log vector to
+ * xlog_write to guarantee this.
+ */
+ iclog_space = log->l_iclog_size - log->l_iclog_hsize;
+ num_headers = (unit_bytes + iclog_space - 1) / iclog_space;
+
+ /* for split-recs - ophdrs added when data split over LRs */
+ unit_bytes += sizeof(xlog_op_header_t) * num_headers;
+
+ /* add extra header reservations if we overrun */
+ while (!num_headers ||
+ ((unit_bytes + iclog_space - 1) / iclog_space) > num_headers) {
+ unit_bytes += sizeof(xlog_op_header_t);
+ num_headers++;
+ }
unit_bytes += log->l_iclog_hsize * num_headers;
/* for commit-rec LR header - note: padding will subsume the ophdr */
unit_bytes += log->l_iclog_hsize;
- /* for split-recs - ophdrs added when data split over LRs */
- unit_bytes += sizeof(xlog_op_header_t) * num_headers;
-
/* for roundoff padding for transaction data and one for commit record */
if (xfs_sb_version_haslogv2(&log->l_mp->m_sb) &&
log->l_mp->m_sb.sb_logsunit > 1) {
@@ -3238,7 +3267,7 @@ xlog_ticket_alloc(xlog_t *log,
tic->t_trans_type = 0;
if (xflags & XFS_LOG_PERM_RESERV)
tic->t_flags |= XLOG_TIC_PERM_RESERV;
- sv_init(&(tic->t_wait), SV_DEFAULT, "logtick");
+ sv_init(&tic->t_wait, SV_DEFAULT, "logtick");
xlog_tic_reset_res(tic);
--
1.6.5
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
next prev parent reply other threads:[~2010-03-15 2:33 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-03-15 2:34 [PATCH 0/9] Log and transaction cleanups, factoring and bug fixes V2 Dave Chinner
2010-03-15 2:34 ` [PATCH 1/9] xfs: factor log item initialisation Dave Chinner
2010-03-15 2:34 ` [PATCH 2/9] xfs: Add inode pin counts to traces Dave Chinner
2010-03-15 2:35 ` [PATCH 3/9] xfs: remove stale parameter from ->iop_unpin method Dave Chinner
2010-03-15 2:35 ` [PATCH 4/9] xfs: fix reservation release commit flag in xfs_bmap_add_attrfork() Dave Chinner
2010-03-15 2:35 ` [PATCH 5/9] xfs: split out iclog writing from xfs_trans_commit() Dave Chinner
2010-03-15 2:35 ` [PATCH 6/9] xfs: clean up xfs_trans_commit logic even more Dave Chinner
2010-03-15 2:35 ` [PATCH 7/9] xfs: update and factor xfs_trans_committed() Dave Chinner
2010-03-15 2:35 ` [PATCH 8/9] xfs: Clean up xfs_trans_committed code after factoring Dave Chinner
2010-03-15 15:20 ` Christoph Hellwig
2010-03-15 2:35 ` Dave Chinner [this message]
2010-03-15 15:41 ` [PATCH 9/9] xfs: log ticket reservation underestimates the number of iclogs Christoph Hellwig
2010-03-15 15:44 ` Christoph Hellwig
2010-03-16 1:51 ` Dave Chinner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1268620506-10799-10-git-send-email-david@fromorbit.com \
--to=david@fromorbit.com \
--cc=xfs@oss.sgi.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox