* xlog_write: reservation ran out
From: Ming Lin @ 2017-04-28 20:15 UTC
  To: xfs, Dave Chinner, Christoph Hellwig
  Cc: Ceph Development, ceph-users, LIU, Fei, xiongwei.jiang, boqian.zy

Hi Dave & Christoph,

I ran into the error below during a pre-production Ceph cluster test
with an XFS backend.
Kernel version: CentOS 7.2 3.10.0-327.el7.x86_64

[146702.392840] XFS (nvme9n1p1): xlog_write: reservation summary:
  trans type  = INACTIVE (3)
  unit res    = 83812 bytes
  current res = -9380 bytes
  total reg   = 0 bytes (o/flow = 0 bytes)
  ophdrs      = 0 (ophdr space = 0 bytes)
  ophdr + reg = 0 bytes
  num regions = 0
[146702.428729] XFS (nvme9n1p1): xlog_write: reservation ran out. Need
to up reservation
[146702.436917] XFS (nvme9n1p1): xfs_do_force_shutdown(0x2) called
from line 2070 of file fs/xfs/xfs_log.c.  Return address =
0xffffffffa0651738
[146702.449969] XFS (nvme9n1p1): Log I/O Error Detected.  Shutting
down filesystem
[146702.457590] XFS (nvme9n1p1): Please umount the filesystem and
rectify the problem(s)
[146702.467903] XFS (nvme9n1p1): xfs_log_force: error -5 returned.
[146732.324308] XFS (nvme9n1p1): xfs_log_force: error -5 returned.
[146762.436923] XFS (nvme9n1p1): xfs_log_force: error -5 returned.
[146792.549545] XFS (nvme9n1p1): xfs_log_force: error -5 returned.

Each XFS filesystem is 1.7T.
The cluster was written to about 80% full, then we deleted the Ceph RBD
image, which actually deletes a lot of files in the backend XFS.

I'm going to try the quick hack below and see if it helps.

diff --git a/fs/xfs/libxfs/xfs_trans_resv.c b/fs/xfs/libxfs/xfs_trans_resv.c
index 1b754cb..b2702f5 100644
--- a/fs/xfs/libxfs/xfs_trans_resv.c
+++ b/fs/xfs/libxfs/xfs_trans_resv.c
@@ -800,7 +800,7 @@ xfs_trans_resv_calc(
        resp->tr_link.tr_logcount = XFS_LINK_LOG_COUNT;
        resp->tr_link.tr_logflags |= XFS_TRANS_PERM_LOG_RES;

-       resp->tr_remove.tr_logres = xfs_calc_remove_reservation(mp);
+       resp->tr_remove.tr_logres = xfs_calc_remove_reservation(mp) * 2;
        resp->tr_remove.tr_logcount = XFS_REMOVE_LOG_COUNT;
        resp->tr_remove.tr_logflags |= XFS_TRANS_PERM_LOG_RES;

Meanwhile, could you suggest any upstream patch that I could try?

Any help is appreciated.

Thanks,
Ming

------

$ lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                64
On-line CPU(s) list:   0-63
Thread(s) per core:    2
Core(s) per socket:    16
Socket(s):             2
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 79
Model name:            Intel(R) Xeon(R) CPU E5-2682 v4 @ 2.50GHz
Stepping:              1
CPU MHz:               2499.609
BogoMIPS:              4994.43
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              40960K
NUMA node0 CPU(s):     0-63

$ free
              total        used        free      shared  buff/cache   available
Mem:      131451952    40856592    83974824        9832     6620536    84212472
Swap:       2097148           0     2097148


