Date: Tue, 26 May 2026 09:10:53 +0200
From: Miroslav Lichvar
To: David Woodhouse
Cc: Richard Cochran, Wen Gu, Andrew Lunn, "David S. Miller",
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, John Stultz,
	Thomas Gleixner, Stephen Boyd, Anna-Maria Behnsen,
	Frederic Weisbecker, Shuah Khan, Peter Zijlstra,
	Thomas Weißschuh, Arnd Bergmann, Julien Ridoux, Ryan Luu,
	linux-kernel@vger.kernel.org, Marcelo Tosatti
Subject: Re: [RFC PATCH v2 0/8] timekeeping: Fix draft tracking precision and add feed-forward discipline via vmclock
References: <20260517220326.4625-1-dwmw2@infradead.org>
	<0d32da75fa88c92ac0225ef23a9045afdf2ac9fe.camel@infradead.org>
	<5323ed2a67d3a72d37f98ba04f2444841bc7bfae.camel@infradead.org>
In-Reply-To: <5323ed2a67d3a72d37f98ba04f2444841bc7bfae.camel@infradead.org>

On Mon, May 25, 2026 at 10:14:10AM +0100, David Woodhouse wrote:
> On Mon, 2026-05-25 at 10:08 +0200, Miroslav Lichvar wrote:
> > On Thu, May 21, 2026 at 10:54:41AM +0100, David Woodhouse wrote:
> > > On Thu, 2026-05-21 at 08:35 +0200, Miroslav Lichvar wrote:
> > > > Ok, but I don't see why the phase corrections of the guest need to be
> > > > in the kernel.
> > >
> > > I'm not sure I understand.
> >
> > <..clarification...>
> >
> >         /* Compute phase offset at cycle_last and set time_offset to slew */
> >         ...
> >         ntp_set_time_offset(tk->id, ref_err >> tk->tkr_mono.shift);
> >
> > Ah, I see. Thanks.
>
> But that's just using ->time_offset which has *always* been in the
> kernel.

time_offset is an input of the kernel PLL. My concern is that the PLL
is fed directly by ptp_vmclock, ignoring everything else. There is no
setting of the PLL time constant or the flags, no configuration of the
step threshold, or any other options that a more advanced
implementation might have. To me it feels like a bad shortcut.
I think this part of the loop should be in userspace, properly using
the adjtimex() API. The feed-forward part (copying frequency settings
of the host) is still possible.

> There's nothing fundamental in the actual *timekeeping* here that
> hasn't already been in the guest kernel for decades; I'm just fixing a
> few arithmetic errors in the core code, and then *driving* it more
> precisely using its existing parameters (tick_length, time_offset).

Fixing arithmetic errors is great. The driving part is what I'm
concerned about, like where it is and what it is driving.

> > > Right. This *is* the software fallback, because the hardware scaling
> > > and offset aren't sufficient even if we only care about x86 where the
> > > former is supported.
> >
> > IMHO it's a solution done at a wrong layer.
>
> Understood. What do you believe is the better solution?

I think a better solution is scaling of the clocksource, i.e. a layer
below the realtime clock. An additional multiplier applied in HW or
SW. That would address the problem for all system clocks, not just the
realtime clock. adjtimex() changes are applied on top of that, they
are not in conflict.

> Aside from the case of actually using NTP or a PHC to discipline the
> kernel's CLOCK_REALTIME, the use cases I'm trying to enable are:
>
>  • (Micro)VM guest is *given* the TSC→realtime relationship in a virt
>    enlightenment, gets an interrupt whenever it changes. Can react to
>    that interrupt and steer the kernel's timekeeping as quickly as any
>    userspace dæmon could do anything.
>
>  • Dedicated virtual hosting environment needs to discipline the *TSC*
>    directly against external references (PHC, 1PPS) in order to provide
>    said virt enlightenment directly to guests and allow for accurate
>    migration. This environment does not care about the host's actual
>    CLOCK_REALTIME; that's basically cosmetic for logging purposes.
>
>  • Multi-purpose environment has a standard ntpd/chrony setup, wants
>    QEMU to be able to provide the same virt enlightenment based on
>    the kernel's own timekeeping.

Which of those couldn't be done with the clocksource scaling and/or
adjtimex() if all the necessary information was available to
userspace?

-- 
Miroslav Lichvar