From mboxrd@z Thu Jan 1 00:00:00 1970
From: Tobias Hommel
Subject: Re: kernels > v4.12 oops/crash with ipsec-traffic: bisected to b838d5e1c5b6e57b10ec8af2268824041e3ea911: ipv4: mark DST_NOGC and remove the operation of dst_free()
Date: Wed, 19 Sep 2018 20:38:39 +0200
Message-ID: <20180919183839.2k4jw4cmyzbtgjfh@delI>
References: <3482600.6PjfSIYROA@stwm.de> <20180911103334.GY23674@gauss3.secunet.de> <2028376.H0yIdbXTXp@stwm.de> <20180911190248.hj55ultypwnnkcnx@delI> <20180912085046.GZ23674@gauss3.secunet.de> <20180912151823.z2wk7hnex4zxly3e@arbeitstier>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Wolfgang Walter , Kristian Evensen , Network Development , weiwan@google.com, edumazet@google.com
To: Steffen Klassert
Return-path:
Received: from mail.brieftier.de ([88.99.33.249]:51952 "EHLO mail.brieftier.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1731253AbeITARZ (ORCPT ); Wed, 19 Sep 2018 20:17:25 -0400
Content-Disposition: inline
In-Reply-To: <20180912151823.z2wk7hnex4zxly3e@arbeitstier>
Sender: netdev-owner@vger.kernel.org
List-ID:

> After running for about 24 hours, I now encountered another panic. This
> time it is caused by an out of memory situation. Although the trace shows
> action in the filesystem code I'm posting it here because I cannot isolate
> the error and maybe it is caused by our NULL pointer bug or by the new fix.
> I do not have a serial console attached, so I could only attach a
> screenshot of the panic to this mail.
>
> I am running v4.19-rc3 from git with the above mentioned patch applied.
> After 19 hours everything still looked fine, XfrmFwdHdrError value was at
> ~950. Overall memory usage shown by htop was at 1.2G/15.6G.
> I had htop running via ssh so I was able to see at least some status post
> mortem. Uptime: 23:50:57
> Overall memory usage was at 10.2G/15.6G and user processes were just
> using the usual amount of memory, so it looks like the kernel was eating up
> at least 9G of RAM.
>
> Maybe this information is not very helpful for debugging, but it is at
> least a warning that something might still be wrong.
>
> I'll try to gather some more information and keep you updated.

Running stable under load for more than 5 days now, I was not able to
reproduce that OOM situation. I'll leave it at that; the fix for the initial
bug is fine for me.
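In case the OOM shows up again, one quick post-mortem check for "the kernel is
eating the RAM" would be to watch the slab counters in /proc/meminfo. This is
just a sketch of the idea, assuming a standard Linux /proc; the
growing-SUnreclaim heuristic is my suggestion, not something established in
this thread:

```shell
# Snapshot kernel slab usage. If SUnreclaim keeps growing under IPsec
# load while user processes stay flat in htop, the memory is being held
# by the kernel rather than userspace.
grep -E '^(Slab|SReclaimable|SUnreclaim):' /proc/meminfo
```

Running `slabtop -o` alongside would then show which cache is growing, which
could help pin the consumer down if it happens again.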