Re: [Xen-devel] [PATCH v11 7/7] microcode: reject late ucode loading if any core is parked

From: Chao Gao <chao.gao@intel.com>
To: Jan Beulich <jbeulich@suse.com>
Cc: "Sergey Dyasli" <sergey.dyasli@citrix.com>,
	"Ashok Raj" <ashok.raj@intel.com>, "Wei Liu" <wl@xen.org>,
	"Andrew Cooper" <andrew.cooper3@citrix.com>,
	xen-devel@lists.xenproject.org,
	"Roger Pau Monné" <roger.pau@citrix.com>
Subject: Re: [Xen-devel] [PATCH v11 7/7] microcode: reject late ucode loading if any core is parked
Date: Mon, 30 Sep 2019 14:43:25 +0800	[thread overview]
Message-ID: <20190930064323.GA10598@gao-cwp> (raw)
In-Reply-To: <8c00cb17-60bb-1580-320a-e94e52da74d2@suse.com>

On Fri, Sep 27, 2019 at 01:19:16PM +0200, Jan Beulich wrote:
>On 26.09.2019 15:53, Chao Gao wrote:
>> If a core with all of its thread being parked, late ucode loading
>> which currently only loads ucode on online threads would lead to
>> differing ucode revisions in the system. In general, keeping ucode
>> revision consistent would be less error-prone. To this end, if there
>> is a parked thread doesn't have an online sibling thread, late ucode
>> loading is rejected.
>> 
>> Two threads are on the same core or computing unit iff they have
>> the same phys_proc_id and cpu_core_id/compute_unit_id. Based on
>> phys_proc_id and cpu_core_id/compute_unit_id, an unique core id
>> is generated for each thread. And use a bitmap to reduce the
>> number of comparison.
>> 
>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>> ---
>> Alternatively, we can mask the thread id off apicid and use it
>> as the unique core id. It needs to introduce new field in cpuinfo_x86
>> to record the mask for thread id. So I don't take this way.
>
>It feels a little odd that you introduce a "custom" ID, but it
>should be fine without going this alternative route. (You
>wouldn't need a new field though, I think, as we've got the
>x86_num_siblings one already.)
>
>What I continue to be unconvinced of is for the chosen approach
>to be better than briefly unparking a thread on each core, as
>previously suggested.

It isn't so easy to go the same way as set_cx_pminfo().

1. NMI handler on parked threads is changed to a nop. To load ucode in
NMI handler, we have to switch back to normal NMI handler in
default_idle(). But it conflicts with what the comments in play_dead()
implies: it is not safe to call normal NMI handler after
cpu_exit_clear().

2. A precondition of unparking a thread on each core, we need to find
out exactly all parked cores and wake up one thread of each of them.
Then in theory, what this patch does is only part of unparking a thread
on each core.

I don't mean they are hard to address. But we need to take care of them.
Given that, IMO, this patch is much straightforward.

>
>> --- a/xen/arch/x86/microcode.c
>> +++ b/xen/arch/x86/microcode.c
>> @@ -573,6 +573,64 @@ static int do_microcode_update(void *patch)
>>      return ret;
>>  }
>>  
>> +static unsigned int unique_core_id(unsigned int cpu, unsigned int socket_shift)
>> +{
>> +    unsigned int core_id = cpu_to_cu(cpu);
>> +
>> +    if ( core_id == INVALID_CUID )
>> +        core_id = cpu_to_core(cpu);
>> +
>> +    return (cpu_to_socket(cpu) << socket_shift) + core_id;
>> +}
>> +
>> +static int has_parked_core(void)
>> +{
>> +    int ret = 0;
>
>I don't think you need the initializer here.
>
>> +    if ( park_offline_cpus )
>
>    if ( !park_offline_cpus )
>        return 0;
>
>would allow one level less of indentation of the main part of
>the function body.
>
>> +    {
>> +        unsigned int cpu, max_bits, core_width;
>> +        unsigned int max_sockets = 1, max_cores = 1;
>> +        struct cpuinfo_x86 *c = cpu_data;
>> +        unsigned long *bitmap;
>
+
>> +        for_each_present_cpu(cpu)
>> +        {
>> +            if ( x86_cpu_to_apicid[cpu] == BAD_APICID )
>> +                continue;
>> +
>> +            /* Note that cpu_to_socket() get an ID starting from 0. */
>> +            if ( cpu_to_socket(cpu) + 1 > max_sockets )
>
>Instead of "+ 1", why not >= ?
>
>> +                max_sockets = cpu_to_socket(cpu) + 1;
>> +
>> +            if ( c[cpu].x86_max_cores > max_cores )
>> +                max_cores = c[cpu].x86_max_cores;
>
>What guarantees .x86_max_cores to be valid? Onlining a hot-added
>CPU is a two step process afaict, XENPF_cpu_hotadd followed by
>XENPF_cpu_online. In between the CPU would be marked present
>(and cpu_add() would also have filled x86_cpu_to_apicid[cpu]),
>but cpu_data[cpu] wouldn't have been filled yet afaict. This
>also makes the results of the cpu_to_*() unreliable that you use
>in unique_core_id().

Indeed. I agree.

>
>However, if we assume sufficient similarity between CPU
>packages (as you've done elsewhere in this series iirc), this

Yes.

>may not be an actual problem. But it wants mentioning in a code
>comment, I think. Plus at the very least you depend on the used
>cpu_data[] fields to not contain unduly large values (and hence
>you e.g. depend on cpu_data[] not gaining an initializer,
>setting the three fields of interest to their INVALID_* values,
>as currently done by identify_cpu()).

Can we skip those threads whose socket ID is invalid and initialize
the three fields in cpu_add()?
Or maintain a bitmap for parked threads to help distinguish them from
real offlined threads, and go through parked threads here?

Thanks
Chao

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel