From: "gregkh@linuxfoundation.org" <gregkh@linuxfoundation.org>
To: Steven Haigh <netwiz@crc.id.au>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>,
	linux-kernel@vger.kernel.org
Subject: Re: 4.4: INFO: rcu_sched self-detected stall on CPU
Date: Tue, 3 May 2016 08:46:56 -0700
Message-ID: <20160503154656.GA27311@kroah.com>
In-Reply-To: <c959010e-2716-118b-531e-cb1ca539e7ed@crc.id.au>

On Wed, May 04, 2016 at 01:11:46AM +1000, Steven Haigh wrote:
> On 03/05/16 06:54, gregkh@linuxfoundation.org wrote:
> > On Wed, Mar 30, 2016 at 05:04:28AM +1100, Steven Haigh wrote:
> >> Greg, please see below - this is probably more for you...
> >>
> >> On 03/29/2016 04:56 AM, Steven Haigh wrote:
> >>>
> >>> Interestingly enough, this just happened again - but on a different
> >>> virtual machine. I'm starting to wonder if this may have something to do
> >>> with the uptime of the machine - as the system that this seems to happen
> >>> to is always different.
> >>>
> >>> Destroying it and monitoring it again has so far come up blank.
> >>>
> >>> I've thrown the latest lot of kernel messages here:
> >>>      http://paste.fedoraproject.org/346802/59241532
> >>
> >> So I just did a bit of digging via the almighty Google.
> >>
> >> I started hunting for these lines, as they happen just before the stall:
> >> BUG: Bad rss-counter state mm:ffff88007b7db480 idx:2 val:-1
> >> BUG: Bad rss-counter state mm:ffff880079c638c0 idx:0 val:-1
> >> BUG: Bad rss-counter state mm:ffff880079c638c0 idx:2 val:-1
> >>
> >> I stumbled across this post on the lkml:
> >>     http://marc.info/?l=linux-kernel&m=145141546409607
> >>
> >> The patch attached seems to reference the following change in
> >> unmap_mapping_range in mm/memory.c:
> >>> -	struct zap_details details;
> >>> +	struct zap_details details = { };
> >>
> >> When I browse the GIT tree for 4.4.6:
> >> https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/tree/mm/memory.c?id=refs/tags/v4.4.6
> >>
> >> I see at line 2411:
> >> struct zap_details details;
> >>
> >> Is this something that has been missed being merged into the 4.4 tree?
> >> I'll admit my kernel knowledge is not enough to understand what the code
> >> actually does - but the similarities here seem uncanny.
> > 
> > I'm sorry, I have no idea what you are asking me about here.  Did I miss
> > a patch that should be backported?  Did I backport something
> > incorrectly?
> 
> Hi Greg + all,
> 
> I did actually find the cause of my rss-counter problems - being the
> experimental PVH functionality in Xen. It caused a number of corruptions
> both on disk and in memory. Turning this off resolved the problem.
> 
> As for the 'fix' above. It seems there was talk that zap_details should
> be defined as { } to avoid a problem in newer versions of the kernel
> that was in linux-next.
> 
> The question that I cannot answer (and I leave this open to the more
> knowledgeable on the list than I) is if that fix should also be applied
> to other trees.
> 
> So the question as I see it:
> Is this an actual bug that we're just not seeing hit in other kernel
> versions - but the newer oom reaper code from linux-next uncovered it -
> or is the code as-is in the 4.4 tree considered correct?
> 
> It could well be that the experimental code in the Xen PVH was tickling
> something that triggered the same type of issue as per the original bug
> report leading to the patch quoted above.

I would recommend working with the Xen developers on their mailing
list about this issue.  If you end up with a patch that needs to be
applied, please let me and stable@ know about it.

thanks,

greg k-h
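
For readers following the zap_details question quoted above, here is a
minimal sketch of why "= { }" can matter. It assumes a caller that assigns
only the fields it knew about when it was written, and a field that a later
change adds. The struct, its field names, and the helper functions are
hypothetical stand-ins, not the kernel's definitions. The sketch uses
"{ 0 }", which zeroes every member just as the "{ }" form in the quoted
patch does, and it uses memset() to simulate stale stack contents rather
than reading truly uninitialized memory.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct zap_details; field names are
 * illustrative and not copied from any kernel version. */
struct zap_details_demo {
	void *check_mapping;
	unsigned long first_index;
	unsigned long last_index;
	int ignore_dirty;	/* imagined field added by a later change */
};

/* Mimics an older caller that assigns only the fields it knew about. */
static void fill_known_fields(struct zap_details_demo *d)
{
	assert(d != NULL);
	d->check_mapping = NULL;
	d->first_index = 1;
	d->last_index = 2;
}

/* With an initializer, every member starts at zero, so the field the
 * caller never assigns is still well defined. */
static int flag_with_initializer(void)
{
	struct zap_details_demo details = { 0 };

	fill_known_fields(&details);
	return details.ignore_dirty != 0;
}

/* Without an initializer the unassigned field holds whatever was in
 * that storage. The memset() simulates such leftover bytes. */
static int flag_with_stale_storage(void)
{
	struct zap_details_demo details;

	memset(&details, 0xff, sizeof(details));
	fill_known_fields(&details);
	return details.ignore_dirty != 0;
}
```

Under these assumptions flag_with_initializer() returns 0, while
flag_with_stale_storage() returns 1 because the unassigned field keeps the
simulated garbage. Whether the 4.4 code can actually hit this depends on
which fields its zap_details has and which ones its callers set, and that
is the open question in the thread.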

