linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Marc MERLIN <marc@merlins.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Hocko <mhocko@kernel.org>,
	Vlastimil Babka <vbabka@suse.cz>, linux-mm <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Joonsoo Kim <iamjoonsoo.kim@lge.com>, Tejun Heo <tj@kernel.org>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Subject: Re: 4.8.8 kernel trigger OOM killer repeatedly when I have lots of RAM that should be free
Date: Tue, 29 Nov 2016 15:01:35 -0800	[thread overview]
Message-ID: <20161129230135.GM7179@merlins.org> (raw)
In-Reply-To: <20161129174019.fywddwo5h4pyix7r@merlins.org>

On Tue, Nov 29, 2016 at 09:40:19AM -0800, Marc MERLIN wrote:
> Thanks for the reply and suggestions.
> 
> On Tue, Nov 29, 2016 at 09:07:03AM -0800, Linus Torvalds wrote:
> > On Tue, Nov 29, 2016 at 8:34 AM, Marc MERLIN <marc@merlins.org> wrote:
> > > Now, to be fair, this is not a new problem, it's just varying degrees of
> > > bad and usually only happens when I do a lot of I/O with btrfs.
> > 
> > One situation where I've seen something like this happen is
> > 
> >  (a) lots and lots of dirty data queued up
> >  (b) horribly slow storage
> 
> In my case, it is a 5x 4TB HDD with 
> software raid 5 < bcache < dmcrypt < btrfs
> bcache is currently half disabled (as in I removed the actual cache) or
> too many bcache requests pile up, and the kernel dies when too many
> workqueues have piled up.
> I'm just kind of worried that since I'm going through 4 subsystems
> before my data can hit disk, that's a lot of memory allocations and
> places where data can accumulate and cause bottlenecks if the next
> subsystem isn't as fast.
> 
> But this shouldn't be "horribly slow", should it? (it does copy a few
> terabytes per day, not fast, but not horrible, about 30MB/s or so)
> 
> > Sadly, our defaults for "how much dirty data do we allow" are somewhat
> > buggered. The global defaults are in "percent of memory", and are
> > generally _much_ too high for big-memory machines:
> > 
> >     [torvalds@i7 linux]$ cat /proc/sys/vm/dirty_ratio
> >     20
> >     [torvalds@i7 linux]$ cat /proc/sys/vm/dirty_background_ratio
> >     10
> 
> I can confirm I have the same.
> 
> > says that it only starts really throttling writes when you hit 20% of
> > all memory used. You don't say how much memory you have in that
> > machine, but if it's the same one you talked about earlier, it was
> > 24GB. So you can have 4GB of dirty data waiting to be flushed out.
> 
> Correct, 24GB and 4GB.
> 
> > And we *try* to do this per-device backing-dev congestion thing to
> > make things work better, but it generally seems to not work very well.
> > Possibly because of inconsistent write speeds (ie _sometimes_ the SSD
> > does really well, and we want to open up, and then it shuts down).
> > 
> > One thing you can try is to just make the global limits much lower. As in
> > 
> >    echo 2 > /proc/sys/vm/dirty_ratio
> >    echo 1 > /proc/sys/vm/dirty_background_ratio
> 
> I will give that a shot, thank you.

And, after 5H of copying, not a single hang, or USB disconnect, or anything.
Obviously this seems to point to other problems in the code, and I have no
idea which layer is a culprit here, but reducing the buffers absolutely
helped a lot.

Thanks much,
Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/  

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  parent reply	other threads:[~2016-11-29 23:01 UTC|newest]

Thread overview: 58+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-11-21 15:43 4.8.8 kernel trigger OOM killer repeatedly when I have lots of RAM that should be free Marc MERLIN
2016-11-21 16:30 ` Marc MERLIN
2016-11-21 21:50 ` Vlastimil Babka
2016-11-21 21:56   ` Marc MERLIN
2016-11-21 23:03     ` [PATCH] block,blkcg: use __GFP_NOWARN for best-effort allocations in blkcg Tejun Heo
2016-11-22 15:47       ` Vlastimil Babka
2016-11-22 16:48         ` Tejun Heo
2016-11-22 22:13           ` Linus Torvalds
2016-11-23  8:50             ` Vlastimil Babka
2016-11-28 17:19               ` Tejun Heo
2016-11-29  7:25                 ` Michal Hocko
2016-11-29 16:38                   ` Tejun Heo
2016-11-29 16:57                     ` Vlastimil Babka
2016-11-29 17:13                       ` Michal Hocko
2016-11-29 17:17                         ` Linus Torvalds
2016-11-29 17:28                           ` Michal Hocko
2016-11-29 17:48                             ` Linus Torvalds
2016-11-22 16:00       ` Jens Axboe
2016-11-22 16:06     ` 4.8.8 kernel trigger OOM killer repeatedly when I have lots of RAM that should be free Marc MERLIN
2016-11-22 16:14       ` Vlastimil Babka
2016-11-22 16:25         ` Michal Hocko
2016-11-22 16:47           ` Marc MERLIN
2016-11-22 16:38         ` Greg Kroah-Hartman
2016-11-29 16:25           ` Michal Hocko
2016-11-29 16:43             ` Greg Kroah-Hartman
2016-11-22 19:38         ` Linus Torvalds
2016-11-23  6:34           ` Michal Hocko
2016-11-23  6:53             ` Hillf Danton
2016-11-23  7:00               ` Michal Hocko
2016-11-23  9:18             ` Vlastimil Babka
2016-11-28  7:23             ` Michal Hocko
2016-11-28 20:55               ` Marc MERLIN
2016-11-29 15:55               ` Marc MERLIN
2016-11-29 16:07                 ` Michal Hocko
2016-11-29 16:34                   ` Marc MERLIN
2016-11-29 17:07                     ` Linus Torvalds
2016-11-29 17:40                       ` Marc MERLIN
2016-11-29 18:01                         ` Linus Torvalds
2016-11-30 17:47                           ` Marc MERLIN
2016-11-30 18:14                             ` Linus Torvalds
2016-11-30 18:21                               ` Marc MERLIN
2016-11-30 18:27                               ` Jens Axboe
2016-11-30 20:30                               ` Tejun Heo
2016-12-01 13:50                                 ` Kent Overstreet
2016-12-01 18:16                                   ` Linus Torvalds
2016-12-01 18:30                                     ` Jens Axboe
2016-12-01 18:37                                       ` Linus Torvalds
2016-12-01 18:46                                         ` Jens Axboe
2016-11-29 20:11                         ` Holger Hoffstätte
2016-11-29 23:01                         ` Marc MERLIN [this message]
2016-11-30 13:58                           ` Tetsuo Handa
2017-05-02  4:12                           ` Marc MERLIN
2017-05-02  7:44                             ` Michal Hocko
2017-05-02 14:15                               ` Marc MERLIN
2017-05-02 10:44                             ` Tetsuo Handa
2016-11-29 16:15               ` Marc MERLIN
2016-11-22 21:46         ` Simon Kirby
2016-11-28  8:06           ` Vlastimil Babka

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20161129230135.GM7179@merlins.org \
    --to=marc@merlins.org \
    --cc=gregkh@linuxfoundation.org \
    --cc=iamjoonsoo.kim@lge.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=tj@kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).