From: John Weekes <lists.xen@nuclearfallout.net>
To: Daniel Stodden <daniel.stodden@citrix.com>
Cc: Ian Pratt <Ian.Pratt@eu.citrix.com>,
	"xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>,
	Jan Beulich <JBeulich@novell.com>
Subject: Re: OOM problems
Date: Wed, 17 Nov 2010 23:15:17 -0800
Message-ID: <4CE4D285.5060500@nuclearfallout.net>
In-Reply-To: <1290053337.18200.28.camel@agari.van.xensource.com>


> I think [XCP blktap] should work fine, or wouldn't ask. If not, lemme know.

k.

>> In my last bit of troubleshooting, I took O_DIRECT out of the open call
>> in tools/blktap2/drivers/block-aio.c, and preliminary testing indicates
>> that this might have eliminated the problem with corruption. I'm testing
>> further now, but could there be an issue with alignment (since the
>> kernel is apparently very strict about it with direct I/O)?
> Nope. It is, but they're 4k-aligned all over the place. You'd see syslog
> yelling quite miserably in cases like that. Keeping an eye on syslog
> (the daemon and kern facilities) is a generally good idea btw.

I've been doing that and haven't seen any unusual output so far, which I 
guess is good.
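
For reference, the alignment rule being discussed can be written down as a small check. This is only a sketch: the 4096-byte unit comes from the "4k-aligned" remark above, the real requirement depends on the filesystem and the device's block size, and the function name and parameters are hypothetical, not anything from blktap2.

```python
# Sketch of the direct I/O alignment rule discussed above.
# ASSUMPTION: 4096 is taken from the "4k-aligned" remark; the actual unit
# depends on the filesystem and the underlying device.
ALIGN = 4096


def is_aligned_for_direct_io(buf_addr, offset, length, align=ALIGN):
    """Return True if buffer address, file offset and length are all
    multiples of `align`, which is what O_DIRECT I/O generally expects."""
    return all(value % align == 0 for value in (buf_addr, offset, length))


if __name__ == "__main__":
    # A request that starts 512 bytes into a page is not 4k-aligned.
    print(is_aligned_for_direct_io(0x10000, 8192, 4096))  # aligned request
    print(is_aligned_for_direct_io(0x10000, 512, 4096))   # misaligned offset
```

A request that fails a check like this would normally be rejected by the kernel rather than corrupt data silently, which fits the observation that syslog has stayed quiet.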

>> (Removing
>> this flag also brings back in use of the page cache, of course.)
> I/O-wise it's not much different from the file:-path. Meaning it should
> have carried you directly back into the Oom realm.

Does it make a difference that it's not using "loop" and instead the CPU 
usage (and presumably some blocking) occurs in user-space? There's not 
too much information on this out there, but it seems as though the OOM 
issue might be at least somewhat loop device-specific. One document that 
references loop OOM problems that I found is this one: 
http://sources.redhat.com/lvm2/wiki/DMLoop. My initial take on it was 
that it might be saying that it mattered when these things were being 
done in the kernel, but now I'm not so certain --

".. [their method and loop] submit[s] [I/O requests] via a kernel thread 
to the VFS layer using traditional I/O calls (read, write etc.). This 
has the advantage that it should work with any file system type 
supported by the Linux VFS (including networked file systems), but has 
some drawbacks that may affect performance and scalability. This is 
because it is hard to predict what a file system may attempt to do when 
an I/O request is submitted; for example, it may need to allocate memory 
to handle the request and the loopback driver has no control over this. 
Particularly under low-memory or intensive I/O scenarios this can lead 
to out of memory (OOM) problems or deadlocks as the kernel tries to make 
memory available to the VFS layer while satisfying a request from the 
block layer."

Would there be an advantage to using blktap/blktap2 over loop, if I 
leave off O_DIRECT? Would it be faster, or anything like that?

> Just reducing the cpu count alone sounds like sth worth trying even on a
> production box, if the current state of things already tends to take the
> system down. Also, the dirty_ratio sysctl should be pretty safe to tweak
> at runtime.

That's good to hear.

>> The default for dirty_ratio is 20. I tried halving that to 10, but it
>> didn't help.
> Still too much. That's meant to be %/task. Try 2, with 1.5G that's still
> a decent 30M write cache and should block all out of 24 disks after some
> 700M, worst case. Or so I think...

Ah, ok. I was thinking that it was global. With a small per-process 
cache like that, it becomes much closer to AIO for writes, but at least 
the leftover memory could still be used for the read cache.
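
To make the numbers above concrete, here is the arithmetic as a small sketch. It follows the per-task reading of dirty_ratio given in this thread and assumes one writer per disk (24 in total) with 1.5G of memory (taken as 1536 MiB); the function names are made up for illustration.

```python
# Sketch of the dirty_ratio arithmetic from this thread.
# ASSUMPTIONS: dirty_ratio is treated as a per-task percentage (as stated
# above), memory is 1.5G taken as 1536 MiB, and each of the 24 disks has
# exactly one writing task.

def per_task_dirty_mib(total_mib, dirty_ratio_percent):
    """Dirty page budget for one task, in MiB."""
    return total_mib * dirty_ratio_percent / 100.0


def worst_case_dirty_mib(total_mib, dirty_ratio_percent, writers):
    """Total dirty memory if every writer fills its budget at once."""
    return per_task_dirty_mib(total_mib, dirty_ratio_percent) * writers


if __name__ == "__main__":
    total_mib, writers = 1536, 24
    for ratio in (20, 10, 2):
        per_task = per_task_dirty_mib(total_mib, ratio)
        worst = worst_case_dirty_mib(total_mib, ratio, writers)
        print(f"dirty_ratio={ratio}: {per_task:.1f} MiB per task, "
              f"{worst:.1f} MiB worst case")
```

Under these assumptions a ratio of 2 gives roughly 30M per task and a little over 700M across all 24 disks, while a ratio of 10 allows a worst case well above the 1.5G available, which would explain why halving the default did not help.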

-John
