From: Chao Gao <chao.gao@intel.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: "Kevin Tian" <kevin.tian@intel.com>,
"Ashok Raj" <ashok.raj@intel.com>, "Wei Liu" <wl@xen.org>,
"Andrew Cooper" <andrew.cooper3@citrix.com>,
"Jun Nakajima" <jun.nakajima@intel.com>,
"xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>,
"Thomas Gleixner" <tglx@linutronix.de>,
"Borislav Petkov" <bp@suse.de>,
"Roger Pau Monné" <roger.pau@citrix.com>
Subject: Re: [Xen-devel] [PATCH v8 14/16] x86/microcode: Synchronize late microcode loading
Date: Mon, 5 Aug 2019 21:16:24 +0800 [thread overview]
Message-ID: <20190805131623.GC1685@gao-cwp> (raw)
In-Reply-To: <14b8b6ba-4b3c-dada-dfb2-65b815ca621f@suse.com>
On Mon, Aug 05, 2019 at 10:43:16AM +0000, Jan Beulich wrote:
>On 01.08.2019 12:22, Chao Gao wrote:
>> @@ -234,6 +267,35 @@ bool microcode_update_cache(struct microcode_patch *patch)
>> }
>>
>> /*
>> + * Wait for a condition to be met with a timeout (us).
>> + */
>> +static int wait_for_condition(int (*func)(void *data), void *data,
>> + unsigned int timeout)
>> +{
>> + while ( !func(data) )
>> + {
>> + if ( !timeout-- )
>> + {
>> + printk("CPU%u: Timeout in %s\n", smp_processor_id(), __func__);
>
>I don't think __func__ is helpful here for problem analysis. Instead
>I think you would want to log either func or __builtin_return_address(0).
Will do.
>
>> @@ -283,37 +345,105 @@ static int microcode_update_cpu(const struct microcode_patch *patch)
>> return err;
>> }
>>
>> -static long do_microcode_update(void *patch)
>> +static int do_microcode_update(void *patch)
>> {
>> - unsigned int cpu;
>> + unsigned int cpu = smp_processor_id();
>> + unsigned int cpu_nr = num_online_cpus();
>> + int ret;
>>
>> - /* store the patch after a successful loading */
>> - if ( !microcode_update_cpu(patch) && patch )
>> + /* Mark loading an ucode is in progress */
>> + cmpxchg(&loading_state, LOADING_EXITED, LOADING_ENTERED);
>
>Why cmpxchg(), especially when you don't check whether the store
>has actually happened? (Same further down then.)
loading_state is set to "LOADING_EXITED" by the caller. So the first CPU
coming here would store "LOADING_ENTERED" to it. And we don't need
special handling for the CPU that sets the state, so the return value
isn't checked.
Here are my considerations about how to set the state:
1) We cannot set the state in the caller. Because we would use this
state to block #NMI handling. Setting the state in the caller may
break stop_machine_run mechanism with the patch 16.
2) The first CPU reaching here should set the state (it means we cannot
choose one CPU, e.g. BSP, to do it). With this, #NMI handling is
disabled system-wise before any CPU calls in. Otherwise, if there is
an exception for a CPU, it may be still in #NMI handling, when its
sibling thread starts loading an ucode.
3) A simple assignment on every CPU is problematic in some cases.
For example, if one CPU comes in after other CPUs timed out and left,
it might set the state to "LOADING_ENTERED" and be blocked in #NMI
handling forever with patch 16.
That's why I choose to use a cmpxhg().
Regarding the cmpxchg() in error-handling below, it can be replaced by
a simple assignment. But I am not sure whether it is better to use
cmpxchg() considering cache line bouncing.
>
>> + cpumask_set_cpu(cpu, &cpu_callin_map);
>> + ret = wait_for_condition(wait_cpu_callin, (void *)(unsigned long)cpu_nr,
>> + MICROCODE_CALLIN_TIMEOUT_US);
>> + if ( ret )
>> {
>> - spin_lock(µcode_mutex);
>> - microcode_update_cache(patch);
>> - spin_unlock(µcode_mutex);
>> - patch = NULL;
>> + cmpxchg(&loading_state, LOADING_ENTERED, LOADING_ABORTED);
>> + return ret;
>> }
>>
>> - if ( microcode_ops->end_update )
>> - microcode_ops->end_update();
>> + /*
>> + * Load microcode update on only one logical processor per core, or in
>> + * AMD's term, one core per compute unit. The one with the lowest thread
>> + * id among all siblings is chosen to perform the loading.
>> + */
>> + if ( (cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu))) )
>> + {
>> + static unsigned int panicked = 0;
>> + bool monitor;
>> + unsigned int done;
>> + unsigned long tick = 0;
>>
>> - cpu = cpumask_next(smp_processor_id(), &cpu_online_map);
>> - if ( cpu < nr_cpu_ids )
>> - return continue_hypercall_on_cpu(cpu, do_microcode_update, patch);
>> + ret = microcode_ops->apply_microcode(patch);
>> + if ( !ret )
>> + {
>> + unsigned int cpu2;
>>
>> - /* Free the patch if no CPU has loaded it successfully. */
>> - if ( patch )
>> - microcode_free_patch(patch);
>> + atomic_inc(&cpu_updated);
>> + /* Propagate revision number to all siblings */
>> + for_each_cpu(cpu2, per_cpu(cpu_sibling_mask, cpu))
>> + per_cpu(cpu_sig, cpu2).rev = this_cpu(cpu_sig).rev;
>> + }
>>
>> - return 0;
>> + /*
>> + * The first CPU reaching here will monitor the progress and emit
>> + * warning message if the duration is too long (e.g. >1 second).
>> + */
>> + monitor = !atomic_inc_return(&cpu_out);
>> + if ( monitor )
>> + tick = rdtsc_ordered();
>> +
>> + /* Waiting for all cores or computing units finishing update */
>> + done = atomic_read(&cpu_out);
>> + while ( panicked && done != nr_cores )
>
>Don't you mean "!panicked" here, or || instead of && ? Otherwise I don't
>see how the loop would ever be entered.
Sorry. It should be !panicked.
Other comments are reasonable and I will follow your suggestions.
Thanks
Chao
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
next prev parent reply other threads:[~2019-08-05 13:13 UTC|newest]
Thread overview: 49+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-08-01 10:22 [Xen-devel] [PATCH v8 00/16] improve late microcode loading Chao Gao
2019-08-01 10:22 ` [Xen-devel] [PATCH v8 01/16] misc/xen-ucode: Upload a microcode blob to the hypervisor Chao Gao
2019-08-01 10:22 ` [Xen-devel] [PATCH v8 02/16] x86/microcode: always collect_cpu_info() during boot Chao Gao
2019-08-01 10:35 ` Andrew Cooper
2019-08-02 13:23 ` Jan Beulich
2019-08-01 10:22 ` [Xen-devel] [PATCH v8 03/16] microcode/intel: extend microcode_update_match() Chao Gao
2019-08-02 13:29 ` Jan Beulich
2019-08-05 5:58 ` Chao Gao
2019-08-05 9:27 ` Jan Beulich
2019-08-05 11:51 ` Chao Gao
2019-08-05 11:50 ` Jan Beulich
2019-08-01 10:22 ` [Xen-devel] [PATCH v8 04/16] microcode/amd: fix memory leak Chao Gao
2019-08-02 13:39 ` Jan Beulich
2019-08-01 10:22 ` [Xen-devel] [PATCH v8 05/16] microcode/amd: distinguish old and mismatched ucode in microcode_fits() Chao Gao
2019-08-02 13:41 ` Jan Beulich
2019-08-01 10:22 ` [Xen-devel] [PATCH v8 06/16] microcode: introduce a global cache of ucode patch Chao Gao
2019-08-02 14:46 ` Jan Beulich
2019-08-05 7:02 ` Chao Gao
2019-08-05 9:31 ` Jan Beulich
2019-08-01 10:22 ` [Xen-devel] [PATCH v8 07/16] microcode: clean up microcode_resume_cpu Chao Gao
2019-08-02 14:57 ` Jan Beulich
2019-08-01 10:22 ` [Xen-devel] [PATCH v8 08/16] microcode: remove struct ucode_cpu_info Chao Gao
2019-08-02 15:03 ` Jan Beulich
2019-08-01 10:22 ` [Xen-devel] [PATCH v8 09/16] microcode: remove pointless 'cpu' parameter Chao Gao
2019-08-02 15:15 ` Jan Beulich
2019-08-01 10:22 ` [Xen-devel] [PATCH v8 10/16] microcode/amd: call svm_host_osvw_init() in common code Chao Gao
2019-08-02 15:21 ` Jan Beulich
2019-08-05 7:05 ` Chao Gao
2019-08-01 10:22 ` [Xen-devel] [PATCH v8 11/16] microcode: pass a patch pointer to apply_microcode() Chao Gao
2019-08-02 15:25 ` Jan Beulich
2019-08-01 10:22 ` [Xen-devel] [PATCH v8 12/16] microcode: split out apply_microcode() from cpu_request_microcode() Chao Gao
2019-08-02 15:40 ` Jan Beulich
2019-08-05 7:36 ` Chao Gao
2019-08-05 9:38 ` Jan Beulich
2019-08-05 12:09 ` Chao Gao
2019-08-01 10:22 ` [Xen-devel] [PATCH v8 13/16] microcode: unify loading update during CPU resuming and AP wakeup Chao Gao
2019-08-05 10:19 ` Jan Beulich
2019-08-01 10:22 ` [Xen-devel] [PATCH v8 14/16] x86/microcode: Synchronize late microcode loading Chao Gao
2019-08-05 10:43 ` Jan Beulich
2019-08-05 13:16 ` Chao Gao [this message]
2019-08-05 13:28 ` Jan Beulich
2019-08-05 12:07 ` Jan Beulich
2019-08-01 10:22 ` [Xen-devel] [PATCH v8 15/16] microcode: remove microcode_update_lock Chao Gao
2019-08-01 10:22 ` [Xen-devel] [PATCH v8 16/16] microcode: block #NMI handling when loading an ucode Chao Gao
2019-08-05 12:11 ` Jan Beulich
2019-08-07 7:59 ` Chao Gao
2019-08-07 8:44 ` Jan Beulich
2019-08-01 18:15 ` [Xen-devel] [PATCH v8 00/16] improve late microcode loading Andrew Cooper
2019-08-02 10:59 ` Roger Pau Monné
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190805131623.GC1685@gao-cwp \
--to=chao.gao@intel.com \
--cc=JBeulich@suse.com \
--cc=andrew.cooper3@citrix.com \
--cc=ashok.raj@intel.com \
--cc=bp@suse.de \
--cc=jun.nakajima@intel.com \
--cc=kevin.tian@intel.com \
--cc=roger.pau@citrix.com \
--cc=tglx@linutronix.de \
--cc=wl@xen.org \
--cc=xen-devel@lists.xenproject.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.