From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: Andrew Jones <drjones@redhat.com>, chuck.anderson@oracle.com
Cc: xen-devel <xen-devel@lists.xen.org>
Subject: Re: domU panic on nested call to arch_enter_lazy_mmu_mode()
Date: Fri, 3 May 2013 09:03:30 -0400
Message-ID: <20130503130330.GB1654@phenom.dumpdata.com>
In-Reply-To: <1476610678.2256112.1365608135270.JavaMail.root@redhat.com>

On Wed, Apr 10, 2013 at 11:35:35AM -0400, Andrew Jones wrote:
> Hi all,
> 
> A couple of years ago a thread[1] popped up here with a bug report,
> which Jeremy followed up on with this patch[2]. That patch was never
> committed though (likely because the issue was difficult to
> reproduce/test). We've got a report now of the same issue for the
> rhel6 kernel running on EC2. It's pretty certain that it's the same,
> because the reproducer steps[3] given would certainly generate the
> same call sequences shown in [1], and applying the proposed patch[2]
> to the rhel6 kernel fixes it.
> 
> Now, while the grant table code has changed some between what rhel6
> has and recent kernels, I believe the issue should still be present
> with recent kernels. However, we attempted to reproduce using a
> Fedora18 kernel (>3.8) and could not. So I'm writing to see if I'm
> missing something in my analysis - meaning upstream is no longer at
> risk of hitting this bug, and/or if Jeremy's proposed patch was
> rejected for other reasons than not being testable (or just
> forgotten). If not, then I'd suggest we repost it.

The logic behind arch_enter/leave_lazy_mmu_mode() was that they would
run in kernel context, uninterrupted. That is, the enter and the matching
leave would both complete without user-space being invoked in between
(which is, btw, the issue that Chuck spotted). A couple of bugs violated
that assumption, and they have been fixed (I can't remember the exact
ones, but a git log --grep="lazy" should give some idea).
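
To make the invariant concrete, here is a minimal user-space model of it.
This is a sketch, not the kernel code: cpu_lazy_mode, lazy_enter(),
lazy_leave() and lazy_demo() are hypothetical names, a plain static stands
in for the per-CPU state, and the model returns -1 where the real kernel
panics on a nested enter.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical user-space model; these names are not kernel API. */
enum lazy_mode { LAZY_NONE, LAZY_MMU };

/* Stands in for the per-CPU "current lazy mode" state. */
static enum lazy_mode cpu_lazy_mode = LAZY_NONE;

/* Returns 0 on success, -1 if lazy mode is already active (nested enter). */
static int lazy_enter(void)
{
        if (cpu_lazy_mode != LAZY_NONE) {
                fprintf(stderr, "nested enter: lazy mode already active\n");
                return -1;      /* the real kernel panics instead of returning */
        }
        cpu_lazy_mode = LAZY_MMU;
        return 0;
}

/* Returns 0 on success, -1 if there was no matching enter. */
static int lazy_leave(void)
{
        if (cpu_lazy_mode != LAZY_MMU) {
                fprintf(stderr, "leave without a matching enter\n");
                return -1;
        }
        cpu_lazy_mode = LAZY_NONE;
        return 0;
}

/* Walks one balanced pair with a nested enter in the middle. */
static void lazy_demo(void)
{
        assert(lazy_enter() == 0);      /* first enter succeeds */
        assert(lazy_enter() == -1);     /* nested enter is rejected */
        assert(lazy_leave() == 0);      /* leave closes the section */
        assert(lazy_leave() == -1);     /* unmatched leave is rejected */
}
```

Anything that runs between the enter and the leave and itself enters lazy
mode again corresponds to the second call in lazy_demo().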

Most of the issues were not in the Xen code but in generic code, such
as vmalloc, among others:

commit 1160c2779b826c6f5c08e5cc542de58fd1f667d5
Author: Samu Kallio <samu.kallio@aberdeencloud.com>
Date:   Sat Mar 23 09:36:35 2013 -0400

    x86, mm, paravirt: Fix vmalloc_fault oops during lazy MMU updates


But if you find this re-appearing, please do report it so we can
either track it down or use that patch (with a WARN added), so
that customers can still use the kernel while we identify
the issues.
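
As a rough illustration of the "add some WARN" idea, the user-space sketch
below tolerates a nested enter, warns once, and keeps a count for later
diagnosis. It is an assumption-laden model only: lazy_active, warned,
nested_enters and both function names are hypothetical, and it does not
reproduce what the referenced patch actually changes.

```c
#include <stdio.h>

/* Hypothetical model state; none of these names are kernel API. */
static int lazy_active;                 /* 1 while a lazy section is open */
static int warned;                      /* set after the first warning */
static unsigned long nested_enters;     /* nested enters seen so far */

/* Tolerant enter: a nested call warns once and is otherwise ignored. */
static void lazy_enter_tolerant(void)
{
        if (lazy_active) {
                nested_enters++;
                if (!warned) {
                        warned = 1;
                        fprintf(stderr, "WARNING: nested lazy MMU enter\n");
                }
                return;         /* keep running instead of panicking */
        }
        lazy_active = 1;
}

/* Leave closes the section regardless of how many enters were seen. */
static void lazy_leave_tolerant(void)
{
        lazy_active = 0;
}
```

The one-shot warning keeps the log readable under a tight fork loop, while
the counter still records how often the nesting happened.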

> 
> Thanks,
> drew
> 
> [1] http://lists.xen.org/archives/html/xen-devel/2010-12/msg00440.html
> [2] http://lists.xen.org/archives/html/xen-devel/2010-12/msg00505.html
> [3] Reproducer steps
> 1. Start an instance of Amazon EC2 instance type c1.xlarge.
>    (c1.xlarge has 8 cores)
> 
> 2. Create 7 file systems (ext3) on top of Amazon EBS volumes
> 
> 3. Mount the 7 file systems you created
> 
> 4. To increase page table operations, create the following program
> 
> --
> #include <unistd.h>
> #include <sys/types.h>
> #include <sys/wait.h>
> 
> int main(void)
> {
>         int status;
>         pid_t pid; 
>         for (;;) {
>                 pid = fork();
>                 if (pid == 0) {
>                         return 0;
>                 }
>                 wait(&status);
>         }
> }
> --
> 
> 5. Run the program pinned to CPU0
> 
> # gcc fork.c
> # taskset -c 0 ./a.out  
> 
> 
> 6. To exercise the grant table, run simultaneous writes to the 7 EBS volumes.
>   (c1.xlarge has 8 CPUs, so pin the writers to CPU1-CPU7, leaving CPU0 to the fork loop)
> 
> For instance:
> --
> for i in `seq 1 7`;
> do
>         taskset -c $i dd if=/dev/zero of=/mnt/$i/testfile bs=10M count=10000 oflag=direct &
> done
> 
