From: Sergei Trofimovich <slyich@gmail.com>
To: Jason Wang <jasowang@redhat.com>
Cc: linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
Glauber Costa <glommer@parallels.com>,
"David S. Miller" <davem@davemloft.net>
Subject: Re: [PATCHv 2] tcp: properly initialize tcp memory limits part 2 (fix nfs regression)
Date: Fri, 2 Mar 2012 20:50:00 +0300
Message-ID: <20120302205000.088bf231@sf.home>
In-Reply-To: <20120302202421.753b36bc@sf.home>
> > > The change looks like a typo (division flipped to multiplication):
> > >> limit = nr_free_buffer_pages() / 8;
> > >> limit = nr_free_buffer_pages()<< (PAGE_SHIFT - 10);
> >
> > > Hi, thanks for reporting. It's not a typo. It was previously:
> > > sysctl_tcp_mem[1] << (PAGE_SHIFT - 7). Looks like we need to do the
> > > limit check before shifting the value. Please try the following patch, thanks.
>
> Still does not help. I test it by checking sha1sum of a large file over NFS
> (small files seem to work sometimes):
>
> $ strace sha1sum /gentoo/distfiles/gcc-4.6.2.tar.bz2
> ...
> open("/gentoo/distfiles/gcc-4.6.2.tar.bz2", O_RDONLY
> <HUNG>
> After a certain timeout dmesg gets odd spam:
> [ 314.848094] nfs: server vmhost not responding, still trying
> [ 314.848134] nfs: server vmhost not responding, still trying
> [ 314.848145] nfs: server vmhost not responding, still trying
> [ 314.957047] nfs: server vmhost not responding, still trying
> [ 314.957066] nfs: server vmhost not responding, still trying
> [ 314.957075] nfs: server vmhost not responding, still trying
> [ 314.957085] nfs: server vmhost not responding, still trying
> [ 314.957100] nfs: server vmhost not responding, still trying
> [ 314.958023] nfs: server vmhost not responding, still trying
> [ 314.958035] nfs: server vmhost not responding, still trying
> [ 314.958044] nfs: server vmhost not responding, still trying
> [ 314.958054] nfs: server vmhost not responding, still trying
>
> These look like bogus messages. They might be related to mishandled timings
> somewhere else or to a bug in the nfs code.
And after 120 seconds the hung task detector fires, which suggests it might
be an OOM issue. It is likely caused by the patch, as this is an amd64 box
with 2GB RAM + 4GB swap that is not running anything heavy:
[ 720.798052] INFO: task sha1sum:3811 blocked for more than 120 seconds.
[ 720.798056] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 720.798059] sha1sum D ffff88007bd11d40 0 3811 1 0x00000005
[ 720.798065] ffff880073de9c08 0000000000000082 ffff880073de9af8 ffff880073de9fd8
[ 720.798070] ffff880070db1620 ffff880073de9fd8 ffff880073de8000 0000000000004000
[ 720.798075] ffff880073de8000 ffff880073de9fd8 ffff8800790e0000 ffff880070db1620
[ 720.798079] Call Trace:
[ 720.798089] [<ffffffff810fdd53>] ? kfree+0x123/0x150
[ 720.798094] [<ffffffff8123227d>] ? nfs_access_free_entry+0x1d/0x30
[ 720.798097] [<ffffffff810fdd53>] ? kfree+0x123/0x150
[ 720.798101] [<ffffffff8123227d>] ? nfs_access_free_entry+0x1d/0x30
[ 720.798104] [<ffffffff81233cb8>] ? nfs_do_access+0x3a8/0x3d0
[ 720.798109] [<ffffffff8166525a>] schedule+0x3a/0x50
[ 720.798112] [<ffffffff8166390e>] __mutex_lock_slowpath+0xee/0x190
[ 720.798117] [<ffffffff81639228>] ? put_rpccred+0x48/0x130
[ 720.798120] [<ffffffff8166374e>] mutex_lock+0x1e/0x40
[ 720.798125] [<ffffffff81114927>] do_lookup+0x277/0x3a0
[ 720.798128] [<ffffffff811162b8>] do_last.clone.39+0x148/0x7e0
[ 720.798132] [<ffffffff81116a61>] path_openat+0xd1/0x3e0
[ 720.798136] [<ffffffff810604d1>] ? get_parent_ip+0x11/0x50
[ 720.798140] [<ffffffff81060675>] ? add_preempt_count+0x95/0xd0
[ 720.798144] [<ffffffff81666677>] ? _raw_spin_lock_irq+0x17/0x40
[ 720.798147] [<ffffffff81116e84>] do_filp_open+0x44/0xa0
[ 720.798151] [<ffffffff810605a5>] ? sub_preempt_count+0x95/0xd0
[ 720.798154] [<ffffffff81666371>] ? _raw_spin_unlock+0x11/0x40
[ 720.798158] [<ffffffff81123014>] ? alloc_fd+0xe4/0x130
[ 720.798163] [<ffffffff81106f7d>] do_sys_open+0xfd/0x1e0
[ 720.798169] [<ffffffff8100f290>] ? syscall_trace_enter+0xf0/0x1a0
[ 720.798172] [<ffffffff8110707c>] sys_open+0x1c/0x20
[ 720.798176] [<ffffffff81667219>] tracesys+0xd0/0xd5
--
Sergei