kernel-testers.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Frederic Weisbecker <fweisbec-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
To: Kevin Shanahan <kmshanah-biM/RbsGxha6c6uEtOJ/EA@public.gmane.org>
Cc: Avi Kivity <avi-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	"Rafael J. Wysocki" <rjw-KKrjLPT3xs0@public.gmane.org>,
	Linux Kernel Mailing List
	<linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	Kernel Testers List
	<kernel-testers-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	Ingo Molnar <mingo-X9Un+BFzKDI@public.gmane.org>,
	Mike Galbraith <efault-Mmb7MZpHnFY@public.gmane.org>,
	Peter Zijlstra
	<a.p.zijlstra-/NLkJaSkS4VmR6Xm/wNWPw@public.gmane.org>
Subject: Re: [Bug #12465] KVM guests stalling on 2.6.28 (bisected)
Date: Tue, 24 Mar 2009 12:47:15 +0100	[thread overview]
Message-ID: <20090324114715.GC6058@nowhere> (raw)
In-Reply-To: <20090324114409.GB6058@nowhere>

On Tue, Mar 24, 2009 at 12:44:12PM +0100, Frederic Weisbecker wrote:
> On Sat, Mar 21, 2009 at 03:30:39PM +1030, Kevin Shanahan wrote:
> > On Thu, 2009-03-19 at 07:54 +1030, Kevin Shanahan wrote:
> > > On Wed, 2009-03-18 at 11:46 +1030, Kevin Shanahan wrote:
> > > > On Wed, 2009-03-18 at 01:20 +0100, Frederic Weisbecker wrote:
> > > > > Ok, I've made a small script based on yours which could do this job.
> > > > > You will just have to set yourself a threshold of latency
> > > > > that you consider as buggy. I don't remember the latency you observed.
> > > > > About 5 secs right?
> > > > > 
> > > > > It's the "thres" variable in the script.
> > > > > 
> > > > > The resulting trace should be a mixup of the function graph traces
> > > > > and scheduler events which look like this:
> > > > > 
> > > > >  gnome-screensav-4691  [000]  6716.774277:   4691:120:S ==> [000]     0:140:R <idle>
> > > > >   xfce4-terminal-4723  [001]  6716.774303:   4723:120:R   + [001]  4289:120:S Xorg
> > > > >   xfce4-terminal-4723  [001]  6716.774417:   4723:120:S ==> [001]  4289:120:R Xorg
> > > > >             Xorg-4289  [001]  6716.774427:   4289:120:S ==> [001]     0:140:R <idle>
> > > > > 
> > > > > + is a wakeup and ==> is a context switch.
> > > > > 
> > > > > The script will loop trying some pings and will only keep the trace that matches
> > > > > the latency threshold you defined.
> > > > > 
> > > > > Tell if the following script work for you.
> > > 
> > > ...
> > > 
> > > > Either way, I'll try to get some results in my maintenance window
> > > > tonight.
> > > 
> > > Testing did not go so well. I compiled and booted
> > > 2.6.29-rc8-tip-02630-g93c4989, but had some problems with the system
> > > load when I tried to start tracing - it shot up to around 16-20 or so. I
> > > started shutting down VMs to try and get it under control, but before I
> > > got back to tracing again the machine disappeared off the network -
> > > unresponsive to ping.
> > > 
> > > When I got in this morning, there was nothing on the console, nothing in
> > > the logs to show what went wrong. I will try again, but my next chance
> > > will probably be Saturday. Stay tuned.
> > 
> > Okay, new set of traces have been uploaded to:
> > 
> >   http://disenchant.net/tmp/bug-12465/trace-3/
> > 
> > These were done on the latest tip, which I pulled down this morning:
> > 2.6.29-rc8-tip-02744-gd9937cb.
> > 
> > The system load was very high again when I first tried to trace with
> > sevarl guests running, so I ended up only having the one guest running
> > and thankfully the bug was still reproducable that way.
> > 
> > Fingers crossed this set of traces is able to tell us something.
> > 
> > Regards,
> > Kevin.
> > 
> > 
> 
> Sorry, I've been late to answer.
> As I explained in my previous mail, you trace is only
> a snapshot that happened in 10 msec.
> 
> I experimented different sizes for the ring buffer but even
> a 1 second trace require 20 Mo of memory. And a so huge trace
> would be impractical.
> 
> I think we should keep the trace filters we had previously.
> If you don't minde, could you please retest against latest -tip
> the following updated patch? Iadded the filters, fixed the python
> subshell and also flushed the buffer more nicely according to
> a recent feature in -tip:
> 
> echo > trace 
> 
> instead of switching to nop.
> You will need to pull latest -tip again.
> 
> Thanks a lot Kevin!


Ah you will also need to increase the size of your buffer.
See below:
 
> 
> #!/bin/bash
> 
> # Switch off all CPUs except for one to simplify the trace
> echo 0 > /sys/devices/system/cpu/cpu1/online
> echo 0 > /sys/devices/system/cpu/cpu2/online
> echo 0 > /sys/devices/system/cpu/cpu3/online
> 
> 
> # Make sure debugfs has been mounted
> if [ ! -d /sys/kernel/debug/tracing ]; then
>     mount -t debugfs debugfs /sys/kernel/debug
> fi
> 
> # Set up the trace parameters
> pushd /sys/kernel/debug/tracing || exit 1
> echo 0 > tracing_enabled
> echo function_graph > current_tracer
> echo funcgraph-abstime > trace_options
> echo funcgraph-proc    > trace_options
> 
> # Set here the kvm IP addr
> addr="hermes-old"
> 
> # Set here a threshold of latency in sec
> thres="5000"
> found="False"
> lat=0
> prefix=/sys/kernel/debug/tracing
> 
> echo 1 > $prefix/events/sched/sched_wakeup/enable
> echo 1 > $prefix/events/sched/sched_switch/enable
> 
> # Set the filter for functions to trace
> echo ''         > set_ftrace_filter  # clear filter functions
> echo '*sched*' >> set_ftrace_filter 
> echo '*wake*'  >> set_ftrace_filter
> echo '*kvm*'   >> set_ftrace_filter
> 
> # Reset the function_graph tracer
> echo function_graph > $prefix/current_tracer

Put a

echo 20000 > $prefix/buffer_size_kb

So that we will have enough space (hopefully).

Thanks!

> 
> while [ "$found" != "True" ]
> do
>         # Flush the previous buffer
>         echo trace > $prefix/trace
> 
>         echo 1 > $prefix/tracing_enabled
>         lat=$(ping -c 1 $addr | grep rtt | grep -Eo " [0-9]+.[0-9]+")
>         echo 0 > $prefix/tracing_enabled
> 
> 	echo $lat
> 	found=$(python -c "print float(str($lat).strip())")
>         sleep 0.01
> done
> 
> echo 0 > $prefix/events/sched/sched_wakeup/enable
> echo 0 > $prefix/events/sched/sched_switch/enable
> 
> 
> echo "Found buggy latency: $lat"
> echo "Please send the trace you will find on $prefix/trace"
> 
> 

  reply	other threads:[~2009-03-24 11:47 UTC|newest]

Thread overview: 94+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-03-14 19:11 2.6.29-rc8: Reported regressions 2.6.27 -> 2.6.28 Rafael J. Wysocki
2009-03-14 19:12 ` [Bug #12061] snd_hda_intel: power_save: sound cracks on powerdown Rafael J. Wysocki
2009-03-14 19:20 ` [Bug #12209] oldish top core dumps (in its meminfo() function) Rafael J. Wysocki
2009-03-14 19:20 ` [Bug #12337] ~100 extra wakeups reported by powertop Rafael J. Wysocki
2009-03-14 19:20 ` [Bug #12404] Oops in 2.6.28-rc9 and -rc8 -- mtrr issues / e1000e Rafael J. Wysocki
2009-03-14 19:20 ` [Bug #12208] uml is very slow on 2.6.28 host Rafael J. Wysocki
2009-03-21 14:44   ` ptrace performance (was: [Bug #12208] uml is very slow on 2.6.28 host) Michael Riepe
     [not found]     ` <49C4FD41.4030504-gM/Ye1E23mwN+BqQ9rBEUg@public.gmane.org>
2009-03-21 15:22       ` Ingo Molnar
2009-03-21 17:02         ` ptrace performance Michael Riepe
2009-03-14 19:20 ` [Bug #12411] 2.6.28: BUG in r8169 Rafael J. Wysocki
2009-03-14 19:20 ` [Bug #12465] KVM guests stalling on 2.6.28 (bisected) Rafael J. Wysocki
2009-03-15  9:03   ` Kevin Shanahan
     [not found]     ` <1237107837.27699.27.camel-9TBizaOOD0ujuAshGpSIhRCuuivNXqWP@public.gmane.org>
2009-03-15  9:18       ` Avi Kivity
     [not found]         ` <49BCC7C8.2020503-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2009-03-15  9:48           ` Ingo Molnar
     [not found]             ` <20090315094807.GB21169-X9Un+BFzKDI@public.gmane.org>
2009-03-15  9:56               ` Avi Kivity
2009-03-15 10:03                 ` Ingo Molnar
     [not found]                   ` <20090315100329.GA23577-X9Un+BFzKDI@public.gmane.org>
2009-03-15 10:13                     ` Avi Kivity
2009-03-16  9:49       ` Avi Kivity
     [not found]         ` <49BE20B2.9070804-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2009-03-16 12:46           ` Kevin Shanahan
     [not found]             ` <1237207595.4964.31.camel-9TBizaOOD0ujuAshGpSIhRCuuivNXqWP@public.gmane.org>
2009-03-16 20:07               ` Frederic Weisbecker
2009-03-16 22:55                 ` Kevin Shanahan
     [not found]                   ` <1237244137.4964.54.camel-9TBizaOOD0ujuAshGpSIhRCuuivNXqWP@public.gmane.org>
2009-03-18  0:20                     ` Frederic Weisbecker
2009-03-18  1:16                       ` Kevin Shanahan
     [not found]                         ` <1237338986.4801.11.camel-9TBizaOOD0ujuAshGpSIhRCuuivNXqWP@public.gmane.org>
2009-03-18  2:24                           ` Frederic Weisbecker
2009-03-18 21:24                         ` Kevin Shanahan
     [not found]                           ` <1237411441.5211.5.camel-9TBizaOOD0ujuAshGpSIhRCuuivNXqWP@public.gmane.org>
2009-03-21  5:00                             ` Kevin Shanahan
     [not found]                               ` <1237611639.4933.4.camel-9TBizaOOD0ujuAshGpSIhRCuuivNXqWP@public.gmane.org>
2009-03-21 14:08                                 ` Frederic Weisbecker
2009-03-24 11:44                                 ` Frederic Weisbecker
2009-03-24 11:47                                   ` Frederic Weisbecker [this message]
2009-03-25 23:40                                   ` Kevin Shanahan
2009-03-25 23:48                                     ` Frederic Weisbecker
2009-03-26 20:22                                   ` Kevin Shanahan
2009-03-14 19:20 ` [Bug #12426] TMDC Joystick no longer works in kernel 2.6.28 Rafael J. Wysocki
2009-03-14 19:20 ` [Bug #12421] GPF on 2.6.28 and 2.6.28-rc9-git3, e1000e and e1000 issues Rafael J. Wysocki
2009-03-14 19:20 ` [Bug #12612] hard lockup when interrupting cdda2wav Rafael J. Wysocki
2009-03-17  0:53   ` FUJITA Tomonori
     [not found]     ` <20090317095254P.fujita.tomonori-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
2009-03-17 14:52       ` James Bottomley
2009-03-14 19:20 ` [Bug #12619] Regression 2.6.28 and last - boot failed Rafael J. Wysocki
2009-03-14 19:20 ` [Bug #12500] r8169: NETDEV WATCHDOG: eth0 (r8169): transmit timed out Rafael J. Wysocki
2009-03-14 19:20 ` [Bug #12634] video distortion and lockup with i830 video chip and 2.6.28.3 Rafael J. Wysocki
2009-03-14 19:20 ` [Bug #12690] DPMS (LCD powersave, poweroff) don't work Rafael J. Wysocki
2009-03-14 19:20 ` [Bug #12645] DMI low-memory-protect quirk causes resume hang on Samsung NC10 Rafael J. Wysocki
2009-03-14 19:20 ` [Bug #12835] Regression in backlight detection Rafael J. Wysocki
2009-03-14 19:20 ` [Bug #12818] iwlagn broken after suspend to RAM (iwlagn: MAC is in deep sleep!) Rafael J. Wysocki
2009-03-14 19:20 ` [Bug #12798] No wake up after suspend Rafael J. Wysocki
2009-03-14 19:20 ` [Bug #12868] iproute2 and regressing "ipv6: convert tunnels to net_device_ops" Rafael J. Wysocki
  -- strict thread matches above, loose matches on Subject: below --
2009-03-21 17:01 2.6.29-rc8-git5: Reported regressions 2.6.27 -> 2.6.28 Rafael J. Wysocki
2009-03-21 17:07 ` [Bug #12465] KVM guests stalling on 2.6.28 (bisected) Rafael J. Wysocki
2009-03-21 19:50   ` Ingo Molnar
2009-03-03 19:34 2.6.29-rc6-git7: Reported regressions 2.6.27 -> 2.6.28 Rafael J. Wysocki
2009-03-03 19:41 ` [Bug #12465] KVM guests stalling on 2.6.28 (bisected) Rafael J. Wysocki
2009-03-04  3:08   ` Kevin Shanahan
     [not found]     ` <1236136099.7726.12.camel-9TBizaOOD0ujuAshGpSIhRCuuivNXqWP@public.gmane.org>
2009-03-08 10:04       ` Avi Kivity
2009-02-23 22:00 2.6.29-rc6: Reported regressions 2.6.27 -> 2.6.28 Rafael J. Wysocki
2009-02-23 22:03 ` [Bug #12465] KVM guests stalling on 2.6.28 (bisected) Rafael J. Wysocki
2009-02-24  0:59   ` Kevin Shanahan
     [not found]     ` <1235437183.4988.2.camel-9TBizaOOD0ujuAshGpSIhRCuuivNXqWP@public.gmane.org>
2009-02-24  1:37       ` Rafael J. Wysocki
2009-02-24 12:09       ` Avi Kivity
     [not found]         ` <49A3E38F.7080306-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2009-02-24 22:11           ` Kevin Shanahan
2009-02-14 20:48 2.6.29-rc5: Reported regressions 2.6.27 -> 2.6.28 Rafael J. Wysocki
2009-02-14 20:50 ` [Bug #12465] KVM guests stalling on 2.6.28 (bisected) Rafael J. Wysocki
2009-02-04 10:55 2.6.29-rc3-git6: Reported regressions 2.6.27 -> 2.6.28 Rafael J. Wysocki
2009-02-04 10:58 ` [Bug #12465] KVM guests stalling on 2.6.28 (bisected) Rafael J. Wysocki
2009-02-05 19:35   ` Kevin Shanahan
     [not found]     ` <1233862503.4823.1.camel-9TBizaOOD0ujuAshGpSIhRCuuivNXqWP@public.gmane.org>
2009-02-05 22:37       ` Rafael J. Wysocki
2009-01-19 21:41 2.6.29-rc2-git1: Reported regressions 2.6.27 -> 2.6.28 Rafael J. Wysocki
2009-01-19 21:45 ` [Bug #12465] KVM guests stalling on 2.6.28 (bisected) Rafael J. Wysocki
2009-01-20  0:12   ` Kevin Shanahan
     [not found]     ` <1232410363.4768.21.camel-9TBizaOOD0ujuAshGpSIhRCuuivNXqWP@public.gmane.org>
2009-01-20 11:35       ` Ingo Molnar
     [not found]         ` <20090120113546.GA26571-X9Un+BFzKDI@public.gmane.org>
2009-01-20 12:37           ` Avi Kivity
2009-01-20 12:42           ` Kevin Shanahan
     [not found]             ` <1232455343.4895.4.camel-9TBizaOOD0ujuAshGpSIhRCuuivNXqWP@public.gmane.org>
2009-01-20 12:56               ` Ingo Molnar
     [not found]                 ` <20090120125652.GA1457-X9Un+BFzKDI@public.gmane.org>
2009-01-20 13:07                   ` Ingo Molnar
     [not found]                     ` <20090120130714.GA11048-X9Un+BFzKDI@public.gmane.org>
2009-01-20 14:59                       ` Steven Rostedt
     [not found]                         ` <alpine.DEB.1.10.0901200957220.2681-f9ZlEuEWxVcI6MkJdU+c8EEOCMrvLtNR@public.gmane.org>
2009-01-20 15:04                           ` Ingo Molnar
     [not found]                             ` <20090120150408.GD21931-X9Un+BFzKDI@public.gmane.org>
2009-01-20 17:53                               ` Steven Rostedt
     [not found]                                 ` <alpine.DEB.1.10.0901201251180.2681-f9ZlEuEWxVcI6MkJdU+c8EEOCMrvLtNR@public.gmane.org>
2009-01-20 18:39                                   ` Ingo Molnar
2009-01-20 17:47                           ` Avi Kivity
     [not found]                             ` <49760E2D.2060109-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2009-01-21 14:25                               ` Kevin Shanahan
     [not found]                                 ` <1232547932.4895.119.camel-9TBizaOOD0ujuAshGpSIhRCuuivNXqWP@public.gmane.org>
2009-01-21 14:34                                   ` Avi Kivity
     [not found]                                     ` <49773275.3020203-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2009-01-21 14:51                                       ` Kevin Shanahan
     [not found]                                         ` <1232549502.4895.124.camel-9TBizaOOD0ujuAshGpSIhRCuuivNXqWP@public.gmane.org>
2009-01-21 14:59                                           ` Avi Kivity
     [not found]                                             ` <49773848.4080409-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2009-01-21 15:13                                               ` Steven Rostedt
2009-01-22  1:48                                               ` Steven Rostedt
2009-01-21 15:10                                       ` Steven Rostedt
2009-01-21 15:18                                       ` Ingo Molnar
2009-01-22 19:57                                         ` Kevin Shanahan
     [not found]                                           ` <1232654237.4885.8.camel-9TBizaOOD0ujuAshGpSIhRCuuivNXqWP@public.gmane.org>
2009-01-22 20:31                                             ` Ingo Molnar
     [not found]                                         ` <20090121151820.GA23813-X9Un+BFzKDI@public.gmane.org>
2009-01-26  9:55                                           ` Kevin Shanahan
2009-01-26 11:35                                             ` Peter Zijlstra
2009-01-26 15:00                                               ` Ingo Molnar
2009-01-20 14:23                   ` Kevin Shanahan
     [not found]                     ` <1232461380.4895.33.camel-9TBizaOOD0ujuAshGpSIhRCuuivNXqWP@public.gmane.org>
2009-01-20 14:25                       ` Ingo Molnar
     [not found]                         ` <20090120142515.GC10224-X9Un+BFzKDI@public.gmane.org>
2009-01-20 15:51                           ` Kevin Shanahan
     [not found]                             ` <1232466686.4895.45.camel-9TBizaOOD0ujuAshGpSIhRCuuivNXqWP@public.gmane.org>
2009-01-20 16:06                               ` Ingo Molnar
     [not found]                                 ` <20090120160613.GA32650-X9Un+BFzKDI@public.gmane.org>
2009-01-20 16:19                                   ` Peter Zijlstra
2009-01-20 14:46                       ` Frédéric Weisbecker
2009-01-20 13:04               ` Avi Kivity
     [not found]                 ` <4975CBF8.90101-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2009-01-20 17:54                   ` Kevin Shanahan
     [not found]                     ` <1232474081.4895.76.camel-9TBizaOOD0ujuAshGpSIhRCuuivNXqWP@public.gmane.org>
2009-01-20 18:42                       ` Ingo Molnar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20090324114715.GC6058@nowhere \
    --to=fweisbec-re5jqeeqqe8avxtiumwx3w@public.gmane.org \
    --cc=a.p.zijlstra-/NLkJaSkS4VmR6Xm/wNWPw@public.gmane.org \
    --cc=avi-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org \
    --cc=efault-Mmb7MZpHnFY@public.gmane.org \
    --cc=kernel-testers-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=kmshanah-biM/RbsGxha6c6uEtOJ/EA@public.gmane.org \
    --cc=linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=mingo-X9Un+BFzKDI@public.gmane.org \
    --cc=rjw-KKrjLPT3xs0@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).