From: David Chinner <dgc@sgi.com>
To: Stephane Doyon <sdoyon@max-t.com>
Cc: xfs@oss.sgi.com, David Chinner <dgc@sgi.com>,
	nfs@lists.sourceforge.net,
	Shailendra Tripathi <stripathi@agami.com>,
	Trond Myklebust <trond.myklebust@fys.uio.no>
Subject: Re: several messages
Date: Thu, 5 Oct 2006 18:30:15 +1000
Message-ID: <20061005083015.GC19345@melbourne.sgi.com>
In-Reply-To: <Pine.LNX.4.64.0610030917060.31738@madrid.max-t.internal>

On Tue, Oct 03, 2006 at 09:39:55AM -0400, Stephane Doyon wrote:
> Sorry for insisting, but it seems to me there's still a problem in need of 
> fixing: when writing a 5GB file over NFS to an XFS file system and hitting 
> ENOSPC, it takes on the order of 22hours before my application gets an 
> error, whereas it would normally take about 2minutes if the file system 
> did not become full.
>
> Perhaps I was being a bit too "constructive" and drowned my point in 
> explanations and proposed workarounds... You are telling me that neither 
> NFS nor XFS is doing anything wrong, and I can understand your points of 
> view, but surely that behavior isn't considered acceptable?

I agree that this is a little extreme and I can't recall seeing
anything like this before, but I can see how that may happen if the
NFS client continues to try to write every dirty page after getting
an ENOSPC and each one of those writes has to wait for 500ms.

However, you did not mention what kernel version you are running.
One recent bug (introduced by a fix for deadlocks at ENOSPC) could
allow oversubscription of free space to occur in XFS, resulting in
the write being allowed to proceed (i.e. sufficient space for the
data blocks) but then failing the allocation because there weren't
enough blocks put aside for potential btree splits that occur during
allocation. If the Linux client is using sync writes on retry, then
this would trigger a 500ms sleep on every write.  That's the right
sort of ballpark for the slowness you were seeing - 5GB / 32k * 0.5s
= ~22 hours....
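
As a rough check of that arithmetic, here is a minimal sketch. It assumes
"5GB" and "32k" are binary units (5 GiB file, 32 KiB per NFS write) and that
every write incurs exactly one 500ms sleep; the names below are illustrative
only and do not come from the XFS or NFS code.

```python
# Back-of-the-envelope check of the estimate above.
# Assumptions (not stated in the thread): binary units, one sleep per write.
FILE_SIZE_BYTES = 5 * 1024**3   # 5GB file
WRITE_SIZE_BYTES = 32 * 1024    # 32k per NFS write
SLEEP_SECONDS = 0.5             # 500ms wait on each failing write


def estimated_delay_hours(file_size, write_size, sleep_s):
    """Hours spent sleeping if every write-sized chunk waits sleep_s seconds."""
    writes = file_size // write_size
    return writes * sleep_s / 3600


writes = FILE_SIZE_BYTES // WRITE_SIZE_BYTES
hours = estimated_delay_hours(FILE_SIZE_BYTES, WRITE_SIZE_BYTES, SLEEP_SECONDS)
print(f"{writes} writes, ~{hours:.1f} hours")  # 163840 writes, ~22.8 hours
```

So roughly 164,000 writes at half a second each lands in the 22-hour range
reported, if the per-write sleep is indeed what dominates.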

This got fixed in 2.6.18-rc6 - can you retry with a 2.6.18 server
and see if your problem goes away?

Cheers,

Dave.

-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group

