Message-ID: <550C4843.6010200@windriver.com>
Date: Fri, 20 Mar 2015 10:18:11 -0600
From: Chris Friesen
To: lkml, Thomas Gleixner
Subject: weird interaction between kvm and NO_HZ_FULL?

Hi,

I'm running 3.10 (yeah, I know) and I'm playing with CONFIG_NO_HZ_FULL. I'm getting a strange result where some CPUs are able to turn off local timer interrupts and others aren't. Is there a known interaction between kvm-based VMs and CONFIG_NO_HZ_FULL?

Background:

I've got an x86-64 system with 16 cores. The kernel has boot args "isolcpus=1-15 rcu_nocbs=1-15 nohz_full=1-15".

I have all system tasks running on CPU 0, then a couple of busy-looping CPU hogs (DPDK apps) affined to CPUs 1 and 2 respectively. Then I have a 3-vCPU kvm-based VM running on CPUs 3/4/5. (Each vCPU is affined to a single host CPU.) Within the VM, vCPU0 is running system tasks and is mostly idle, while vCPUs 1/2 are running busy-looping CPU hogs.

Current issue:

Looking at the local timer interrupt counts over 10 seconds:

  CPUs 1/2 incremented by about 25,
  CPU 3 (vCPU0 in the guest, mostly idle) incremented by 57000,
  CPUs 4/5 (which are busy-looping in the guest) incremented by 10000,
  and the other CPUs increased by 2.

This is fairly reproducible.
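For anyone wanting to reproduce the per-CPU numbers, here is a minimal sketch of one way to compute them, assuming the counts come from the "LOC:" (local timer interrupts) row of /proc/interrupts. The helper names (parse_loc_counts, loc_deltas, sample) are hypothetical and are not necessarily how the figures above were gathered.

```python
import time


def parse_loc_counts(text):
    """Map CPU name -> count from the 'LOC:' row of /proc/interrupts text."""
    lines = text.splitlines()
    cpus = lines[0].split()  # header row, e.g. ['CPU0', 'CPU1', ...]
    for line in lines[1:]:
        fields = line.split()
        if fields and fields[0] == "LOC:":
            # One counter per CPU column, then a free-text description.
            counts = fields[1:1 + len(cpus)]
            return {cpu: int(n) for cpu, n in zip(cpus, counts)}
    raise ValueError("no LOC: row found")


def loc_deltas(before, after):
    """Per-CPU increase in local timer interrupts between two snapshots."""
    old = parse_loc_counts(before)
    new = parse_loc_counts(after)
    return {cpu: new[cpu] - old[cpu] for cpu in old if cpu in new}


def sample(interval=10.0, path="/proc/interrupts"):
    """Snapshot the file, sleep for `interval` seconds, snapshot again."""
    with open(path) as f:
        before = f.read()
    time.sleep(interval)
    with open(path) as f:
        after = f.read()
    return loc_deltas(before, after)
```

Calling sample() on the host and printing the returned dict gives one delta per CPU for the chosen interval. With nohz_full working, a CPU running a single busy task would be expected to show a small delta, like CPUs 1/2 above.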

Looking at the sched ftrace logs over 10 seconds:

  On CPU 1 I see it running vswitch, rcuc/1-211, and ksoftirqd/1-212.
  On CPU 5 I see it running kvm, rcuc/5-235, and ksoftirqd/5-236.
  On CPU 3 I see it running kvm-29634, kvm-29637, and (mostly) the idle task.

In all cases there doesn't seem to be significant contention. For each of CPUs 1/5 there are under 60 lines of trace output over 10 seconds.

Connecting via strace to the kvm thread on CPU 5, it seemed to be doing almost entirely userspace processing, with no syscalls in multiple seconds.

Just for fun I ran "cat /dev/zero > /dev/null" on CPU 9 and the interrupt rate remained low, though I could see it chewing all the CPU time.

I'm at a loss to explain why the timer ticks aren't being suppressed as expected on CPUs 3/4/5. Does anyone have any ideas? Is kvm doing something "odd" to mess it up?

Thanks,
Chris