From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Wed, 27 May 2026 09:46:50 +0200
From: Miroslav Lichvar
To: David Woodhouse
Cc: Richard Cochran, Wen Gu, Andrew Lunn, "David S.
Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni, John Stultz,
 Thomas Gleixner, Stephen Boyd, Anna-Maria Behnsen, Frederic Weisbecker,
 Shuah Khan, Peter Zijlstra, Thomas Weißschuh, Arnd Bergmann,
 Julien Ridoux, Ryan Luu, linux-kernel@vger.kernel.org, Marcelo Tosatti
Subject: Re: [RFC PATCH v2 0/8] timekeeping: Fix drift tracking precision
 and add feed-forward discipline via vmclock
References: <0d32da75fa88c92ac0225ef23a9045afdf2ac9fe.camel@infradead.org>
 <5323ed2a67d3a72d37f98ba04f2444841bc7bfae.camel@infradead.org>
 <69a953d665738f5021e511c44e193dd832ba009b.camel@infradead.org>
In-Reply-To: <69a953d665738f5021e511c44e193dd832ba009b.camel@infradead.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, May 26, 2026 at 11:00:28AM +0100, David Woodhouse wrote:
> Let us assume that userspace, either from vmclock or direct discipline
> of the arch counter against external sources, has:
>  • Reference time T.
>  • Arch counter value at time T.
>  • Period of a single arch counter tick.
>
> This translates fairly directly into the kernel's tick_length and
> time_offset. But *only* if you know cycle_interval, ntp_error and other
> details. Which is why my timekeeping_set_reference() takes the
> information in that form, and then translates it within the core
> timekeeping.
>
> If you can show me how to do that with adjtimex(), that would be great.

tick_length can be set by the adjtimex() modes ADJ_FREQUENCY (in scaled
units of 1/65536 ppm up to 500 ppm) and ADJ_TICK (in microseconds per
1/USER_HZ tick).

time_offset can be set by the ADJ_OFFSET mode.
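
To make the units above concrete, here is a minimal sketch of how a
frequency offset in ppm could be split into a coarse ADJ_TICK value and
a fine ADJ_FREQUENCY value. It only does the arithmetic and does not
call adjtimex(). USER_HZ = 100 is an assumption (the usual Linux value),
and the helper names split_frequency() and combined_ppm() are
hypothetical, made up for this illustration.

```python
# Sketch only: unit conversions for the adjtimex() ADJ_TICK and
# ADJ_FREQUENCY modes. No system call is made.

USER_HZ = 100                                   # assumed, typical on Linux
NOMINAL_TICK_US = 1_000_000 // USER_HZ          # 10000 us per 1/USER_HZ tick
PPM_PER_TICK_US = 1_000_000 // NOMINAL_TICK_US  # 1 us of tick is 100 ppm
FREQ_SCALE = 65536                              # ADJ_FREQUENCY units per ppm


def split_frequency(ppm):
    """Split a frequency offset in ppm into (tick, freq).

    tick carries the coarse part in whole microseconds, freq carries the
    remainder in units of 1/65536 ppm. After rounding, the remainder is
    within +/-50 ppm, well inside the 500 ppm ADJ_FREQUENCY limit.
    """
    tick_delta = round(ppm / PPM_PER_TICK_US)
    residual_ppm = ppm - tick_delta * PPM_PER_TICK_US
    return NOMINAL_TICK_US + tick_delta, round(residual_ppm * FREQ_SCALE)


def combined_ppm(tick, freq):
    """Inverse of split_frequency(): total frequency offset in ppm."""
    return (tick - NOMINAL_TICK_US) * PPM_PER_TICK_US + freq / FREQ_SCALE


if __name__ == "__main__":
    tick, freq = split_frequency(123.4)
    print(tick, freq, combined_ppm(tick, freq))
```

The two values would go into the tick and freq fields of struct timex
with ADJ_TICK | ADJ_FREQUENCY set in modes.
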
The PLL needs to be enabled first by setting the STA_PLL status
(ADJ_STATUS mode) and also the STA_FREQHOLD flag needs to be set to
avoid changing the PLL frequency.

The ntp_error and other details need to be exposed to userspace. Maybe
in the same API that will be used for reporting the time and frequency
offsets between system clocks.

> As chrony introduces a change on the host, QEMU propagates that to the
> guest (the vmclock: line is from QEMU), and the guest adjusts
> accordingly. And then converges *really* slowly, as even setting the
> time constant to 0 gives a half-life for time_offset of about 11
> seconds.

A simple linear slew would be better for this. The offset is accurate,
there is no need for filtering.

> Given the simplicity of the 'bad shortcut', and the fact that we do
> want the kernel to follow the reference at *boot* time, I do think I'd
> like to have a mode for microvms which optionally *allows* the kernel
> to continue to track the reference for itself rather than having an
> extra userspace tool that literally just polling on /dev/vmclock in
> order to feed precisely that same information back into the kernel
> directly.

Setting the values on boot in the kernel makes sense to me. There is no
loop involved. It follows the setting of the system clock from the RTC.

> > I think a better solution is scaling of the clocksource, i.e. a layer
> > below the realtime clock. An additional multiplier applied in HW or
> > SW. That would address the problem for all system clocks, not just the
> > realtime clock. adjtimex() changes are applied on top of that, they
> > are not in conflict.
>
> But we literally already have a way to 'scale' the counter in order to
> derive CLOCK_MONOTONIC/CLOCK_REALTIME: the kernel's timekeeping code.
> Currently driven *only* by NTP/adjtimex().

I see that as a different purpose than guest migrations. A migrated
guest should have its clocksource frequency corrected while the clock
is controlled by NTP/PTP.
If this mechanism was shared, that would not be possible.

> Are you suggesting that the actual clocksource driver in the kernel for
> e.g. CSID_ARM_ARCH_COUNTER should *scale* the results it returns,
> instead of giving raw counter reads? So we have some NTP-like process
> to adjust each clocksource, in *addition* to the core kernel
> timekeeping?

Not so much NTP-like. There would be no mult dithering or phase
adjustments, only frequency.

> And then those skewed clocksource values are only
> meaningful under a seqlock like the existing kernel timekeeper values
> are valid under the tk_data.seq seqlock?

I guess you are implying here this SW-fallback scaling would have a
significant impact on the performance. Could it not be applied at the
same time as the normal multiplier in the conversion to nanoseconds?

> And would we have a separate way to get real value, to use for
> CLOCK_MONOTONIC_RAW?

All system clocks should be scaled, that's my point.

-- 
Miroslav Lichvar