From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from queue02a.mail.zen.net.uk (queue02a.mail.zen.net.uk [212.23.3.234])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id F2B5D45BE3
	for ; Fri, 16 Jan 2026 10:34:24 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=212.23.3.234
ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1768559667; cv=none;
	b=ARPjlyKjTrxusljGzm+A7UNIwHejBd0JYmXIjbmSTs3ySWFVVzmjuyyi8gruQzZELRG7ofLhzW9jeNk3IPIFD9ONsR3tyaZLVzvVAvx8cF8flnkGaIYq3lqDZkgotDnDEZecrDbs5y7FrlRK1iblKDp0NB1q4Mgr3GlzRjkv+mg=
ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1768559667; c=relaxed/simple;
	bh=phakAIGJ62+CXoMWNzrCbjR7UT7qZW60ONZtKYuJb4o=;
	h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version:
	 Content-Type:Content-Disposition:In-Reply-To;
	b=hMVU8P6cmV0SvGjOgJ3MB6zhehWhhJOKtvQKvXQ9oWc30qdeeGtqR0RuZLJPGCw8bo2qQZkHOCIhqrmfTVxXp6u3HB9PmfRctR7J2eWY4rFN/rE7oxA27CzdqRIAEIG927QpcW6Xz5bQWJFuvguqj7EvmSUFazfD2pfQX2SyMGE=
ARC-Authentication-Results:i=1; smtp.subspace.kernel.org;
	dmarc=none (p=none dis=none) header.from=dingwall.me.uk;
	spf=pass smtp.mailfrom=dingwall.me.uk; arc=none smtp.client-ip=212.23.3.234
Authentication-Results: smtp.subspace.kernel.org;
	dmarc=none (p=none dis=none) header.from=dingwall.me.uk
Authentication-Results: smtp.subspace.kernel.org;
	spf=pass smtp.mailfrom=dingwall.me.uk
Received: from [212.23.1.1] (helo=smarthost01a.sbp.mail.zen.net.uk)
	by queue02a.mail.zen.net.uk with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
	(Exim 4.95)
	(envelope-from )
	id 1vgh9r-00AT3Q-6y
	for linux-kernel@vger.kernel.org;
	Fri, 16 Jan 2026 10:34:23 +0000
Received: from [217.155.64.189] (helo=mail0.xen.dingwall.me.uk)
	by smarthost01a.sbp.mail.zen.net.uk with esmtpsa (TLS1.0) tls TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA
	(Exim 4.95)
	(envelope-from )
	id 1vgh9j-004gKl-Na;
	Fri, 16 Jan 2026 10:34:16 +0000
Received: from localhost (localhost [IPv6:::1])
	by mail0.xen.dingwall.me.uk (Postfix) with ESMTP id 93DBEDEF4CD;
	Fri, 16 Jan 2026 10:34:15 +0000 (GMT)
X-Virus-Scanned: Debian amavis at dingwall.me.uk
Received: from mail0.xen.dingwall.me.uk ([IPv6:::1])
	by localhost (mail0.xen.dingwall.me.uk [IPv6:::1]) (amavis, port 10024)
	with ESMTP id 9qOV27SLBIom; Fri, 16 Jan 2026 10:34:15 +0000 (GMT)
Received: from ghoul.dingwall.me.uk (ghoul.dingwall.me.uk [192.168.1.200])
	by dingwall.me.uk (Postfix) with ESMTP id 4D9F5DEF4CA;
	Fri, 16 Jan 2026 10:34:15 +0000 (GMT)
Received: by ghoul.dingwall.me.uk (Postfix, from userid 1000)
	id 09A97A3E; Fri, 16 Jan 2026 10:34:14 +0000 (GMT)
Date: Fri, 16 Jan 2026 10:34:14 +0000
From: James Dingwall
To: Roger Pau =?iso-8859-1?Q?Monn=E9?=
Cc: linux-kernel@vger.kernel.org
Subject: Re: xen pci passthrough stops working after xen/x86: fix initial memory balloon target
Message-ID:
References:
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To:
X-Originating-smarthost01a-IP: [217.155.64.189]
Feedback-ID: 217.155.64.189

On Fri, Jan 16, 2026 at 11:13:55AM +0100, Roger Pau Monné wrote:
> On Fri, Jan 16, 2026 at 09:27:15AM +0000, James Dingwall wrote:
> > On Thu, Jan 15, 2026 at 06:55:15PM +0100, Roger Pau Monné wrote:
> > > On Thu, Jan 15, 2026 at 02:50:12PM +0000, James Dingwall wrote:
> > > > On Thu, Jan 15, 2026 at 01:03:49PM +0100, Roger Pau Monné wrote:
> > > > > On Thu, Jan 15, 2026 at 11:23:37AM +0000, James Dingwall wrote:
> > > > > > Hi,
> > > > > >
> > > > > > We have encountered a regression with pci passthrough since the
> > > > > > Ubuntu 6.8.0-91.92 which included this commit:
> > > > >
> > > > > Hello,
> > > > >
> > > > > Thanks for the report.
> > > > > Could you also send me your kernel Kconfig, to
> > > > > see which combination of options you are using?
> > > > >
> > > >
> > > Can you confirm that the config used to build the non-working kernel
> > > also has CONFIG_XEN_UNPOPULATED_ALLOC=y?
> >
> > The config is the same for both builds and CONFIG_XEN_UNPOPULATED_ALLOC=y
> > is always set.
> >
> > > Can you also provide the output of `cat /proc/iomem` for both the
> > > working and non-working kernels?
> >
> > non-working Ubuntu-6.8.0-100.100:
> >
> > 00000000-00000fff : Reserved
> > 00001000-0009ffff : System RAM
> > 000a0000-000fffff : Reserved
> >   000f0000-000fffff : System ROM
> > 00100000-2007ffff : System RAM
> >   01000000-025fffff : Kernel code
> >   02600000-033bcfff : Kernel rodata
> >   03400000-0385613f : Kernel data
> >   03d54000-041fffff : Kernel bss
> > 20081000-73b57fff : Unusable memory
> > 76c58000-76d76fff : ACPI Tables
> > 76d77000-76ea0fff : ACPI Non-volatile Storage
> > 77fff000-77ffffff : Unusable memory
> > 80000000-87ffffff : System RAM
> > 88000000-8fffffff : Xen scratch
> > 100000000-103f7ffff : System RAM
> > 4000200000-400021ffff : 0000:01:00.0
> > 4000220000-400023ffff : 0000:01:00.0
> > 4000240000-400025ffff : 0000:01:00.1
> > 4000260000-400027ffff : 0000:01:00.1
> >
> >
> > working Ubuntu-6.8.0-100.100:
> >
> > 00000000-00000fff : Reserved
> > 00001000-0009ffff : System RAM
> > 000a0000-000fffff : Reserved
> >   000f0000-000fffff : System ROM
> > 00100000-2007ffff : System RAM
> >   01000000-025fffff : Kernel code
> >   02600000-033bcfff : Kernel rodata
> >   03400000-0385613f : Kernel data
> >   03d54000-041fffff : Kernel bss
> > 20081000-73b57fff : Unusable memory
> > 76c58000-76d76fff : ACPI Tables
> > 76d77000-76ea0fff : ACPI Non-volatile Storage
> > 77fff000-77ffffff : Unusable memory
> > 81100000-811fffff : 0000:01:00.1
> >   81100000-811fffff : igb
> > 81200000-812fffff : 0000:01:00.0
> >   81200000-812fffff : igb
> > 81300000-8137ffff : 0000:01:00.1
> > 81380000-813fffff : 0000:01:00.0
> > 81400000-81403fff : 0000:01:00.1
> >   81400000-81403fff : igb
> > 81404000-81407fff : 0000:01:00.0
> >   81404000-81407fff : igb
> > 81500000-815fffff : 0000:03:00.0
> > 81600000-816fffff : 0000:03:00.0
> >   81600000-816fffff : igc
> > 81700000-81703fff : 0000:03:00.0
> >   81700000-81703fff : igc
> > 88000000-8fffffff : Xen scratch
> > 100000000-103f7ffff : System RAM
> > 4000200000-400021ffff : 0000:01:00.0
> > 4000220000-400023ffff : 0000:01:00.0
> > 4000240000-400025ffff : 0000:01:00.1
> > 4000260000-400027ffff : 0000:01:00.1
>
> For some reason (which I still haven't figured out), the fictitious PFN
> memory layout created by Linux ends up placing a RAM region over the BAR
> MMIO space used by igc, the difference:
>
> 81200000-812fffff : 0000:01:00.0
>   81200000-812fffff : igb
> 81300000-8137ffff : 0000:01:00.1
> 81380000-813fffff : 0000:01:00.0
> 81400000-81403fff : 0000:01:00.1
>   81400000-81403fff : igb
> 81404000-81407fff : 0000:01:00.0
>   81404000-81407fff : igb
> 81500000-815fffff : 0000:03:00.0
> 81600000-816fffff : 0000:03:00.0
>   81600000-816fffff : igc
> 81700000-81703fff : 0000:03:00.0
>   81700000-81703fff : igc
> 88000000-8fffffff : Xen scratch
> 100000000-103f7ffff : System RAM
>
> VS
>
> 80000000-87ffffff : System RAM
> 88000000-8fffffff : Xen scratch
> 100000000-103f7ffff : System RAM
>
> In the non-working case there's a chunk of RAM in the space that
> covers the device MMIO BARs. I fear my balloon accounting "fix" has
> instead introduced a misaccounting in the balloon driver that causes
> Linux to attempt to balloon up memory and it ends up instantiating a
> hotplug memory region over the device MMIO BARs.
>
> I'm still confused as to how the change in balloon_add_regions() has
> an effect when CONFIG_XEN_UNPOPULATED_ALLOC=y, as it should become a
> no-op in that case, but I will debug this myself.
> >
> > Just for completeness the working build also reverts "xen/x86: fix initial
> > memory balloon target" because of a conflict in drivers/xen/balloon.c.
>
> Hm, but that's the same commit that you mentioned in the first email,
> there you said reverting:
>
> commit 74287971dbb3fe322bb316afd9e7fb5807e23bee
> Author: Roger Pau Monne
> Date: Wed May 14 10:04:26 2025 +0200
>
>     xen/x86: fix initial memory balloon target
>
> Fixes the issue. Is there an additional commit that also needs
> reverting to fix the issue? That would make more sense, as IMO that
> commit should be a no-op given your Kconfig.

Argh! This is my mistake reading the two reverts in the wrong order. The
bisect landed on "x86/xen: fix balloon target initialization for PVH dom0"
but "xen/x86: fix initial memory balloon target" had to be reverted first.
I'm sorry if that left you scratching your head.

> I don't think I will be able to get into this until Monday, sorry. In
> the meantime, does disabling the balloon driver mitigate the issue?

I can try this if it could still be relevant?

Thanks,
James
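
For anyone wanting to repeat the comparison on their own captures, the
overlap described above (System RAM in the non-working layout sitting on
top of the BARs seen in the working layout) can be checked mechanically.
The following is only a minimal sketch, not anything from the kernel tree:
the function names parse_iomem() and ram_over_bars() are made up for
illustration, and it assumes the two inputs are plain `cat /proc/iomem`
captures taken as root (without privileges the addresses read as zeros).

```python
import re

# One /proc/iomem entry: "start-end : name", optionally indented for nesting.
LINE_RE = re.compile(r"^\s*([0-9a-fA-F]+)-([0-9a-fA-F]+)\s*:\s*(.+?)\s*$")
# Entries named after a PCI device (domain:bus:dev.fn) are treated as BARs.
BDF_RE = re.compile(r"^[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]$")


def parse_iomem(text):
    """Parse /proc/iomem-style text into (start, end, name) tuples."""
    regions = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            regions.append((int(m.group(1), 16), int(m.group(2), 16), m.group(3)))
    return regions


def ram_over_bars(broken_text, working_text):
    """Pair each System RAM range from the broken capture with every
    PCI BAR range from the working capture that it overlaps."""
    ram = [r for r in parse_iomem(broken_text) if r[2] == "System RAM"]
    bars = [r for r in parse_iomem(working_text) if BDF_RE.match(r[2])]
    # Two inclusive ranges overlap when each starts before the other ends.
    return [(a, b) for a in ram for b in bars if a[0] <= b[1] and b[0] <= a[1]]


# Hypothetical hard-coded excerpts from the two listings in this thread.
broken = """\
80000000-87ffffff : System RAM
88000000-8fffffff : Xen scratch
"""
working = """\
81500000-815fffff : 0000:03:00.0
81600000-816fffff : 0000:03:00.0
  81600000-816fffff : igc
88000000-8fffffff : Xen scratch
"""

for ram, bar in ram_over_bars(broken, working):
    print(f"{ram[0]:x}-{ram[1]:x} ({ram[2]}) overlaps "
          f"{bar[0]:x}-{bar[1]:x} ({bar[2]})")
```

Feeding it the full listings instead of the excerpts should flag every
01:00.x and 03:00.0 BAR in the 811xxxxx-817xxxxx range against the
80000000-87ffffff RAM region, which matches the diagnosis quoted above.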