xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
From: Yi Sun <yi.y.sun@linux.intel.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: xen-devel@lists.xenproject.org
Subject: Re: [PATCH] dom_ids array implementation.
Date: Thu, 27 Apr 2017 17:30:37 +0800	[thread overview]
Message-ID: <20170427093037.GD17458@yi.y.sun> (raw)
In-Reply-To: <5901B06B0200007800154A09@prv-mh.provo.novell.com>

On 17-04-27 00:48:43, Jan Beulich wrote:
> >>> On 27.04.17 at 04:38, <yi.y.sun@linux.intel.com> wrote:
> > On 17-04-26 04:04:15, Jan Beulich wrote:
> >> >>> On 20.04.17 at 07:38, <yi.y.sun@linux.intel.com> wrote:
> >> > @@ -221,12 +210,17 @@ static void free_socket_resources(unsigned int socket)
> >> >       */
> >> >      for ( i = 0; i < PSR_SOCKET_MAX_FEAT; i++ )
> >> >      {
> >> > -        if ( !info->features[i] )
> >> > -            continue;
> >> > -
> >> >          xfree(info->features[i]);
> >> >          info->features[i] = NULL;
> >> >      }
> >> > +
> >> > +    spin_lock(&info->ref_lock);
> >> > +    memset(info->cos_ref, 0, MAX_COS_REG_CNT * sizeof(unsigned int));
> >> > +    spin_unlock(&info->ref_lock);
> >> > +
> >> > +    spin_lock_irqsave(&info->dom_ids_lock, flag);
> >> > +    memset(info->dom_ids, 0, PSR_DOM_IDS_NUM * sizeof(uint32_t));
> >> 
> >> bitmap_clear()
> >> 
> >> I'm also not convinced you need to acquire either of the two locks
> >> here - you're cleaning up the socket after all, so nothing can be
> >> running on it anymore.
> >> 
> > Can domain destroy happens at the same time when a socket is offline?
> 
> Well, yes and no - it depends on what path exactly you sit here.
> Large parts of CPU onlining/offlining happen in stop-machine
> context, which would exclude domain destruction going on in
> parallel.
> 
The 'free_socket_resources' may be called when 'CPU_UP_CANCELED' or 'CPUU_DEAD'
happens. For 'CPUU_DEAD', stop-machine is executed. But for 'CPU_UP_CANCELED',
I cannot see this. 'CPU_UP_CANCELED' happens when cpu up fails and
'free_socket_resources' is called only when the cpu is the last one on socket.
So for 'CPU_UP_CANCELED' case, 'free_socket_resources' should not be called.
So, I think you are right and we can remove the spin_lock protections here.

> >> > +        /*
> >> > +         * If old_bit is 0, that means this is the first time the domain is
> >> > +         * switched to this socket or domain's COS ID has not been set since
> >> > +         * the socket is online. So, the domain's COS ID on this socket should
> >> > +         * be default value, 0. If not, that means this socket has been offline
> >> > +         * and the domain's COS ID has been set when the socket was online. So,
> >> > +         * this COS ID is invalid and we have to restore it to 0.
> >> > +         */
> >> > +        if ( d->arch.psr_cos_ids &&
> >> > +             old_bit == 0 &&
> >> > +             d->arch.psr_cos_ids[socket] != 0 )
> >> 
> >> Why don't you replicate the other two conditions in the if() trying to
> >> avoid taking the lock? (Especially if above you indeed intend to use
> >> a helper function, abstracting the full condition into another one
> >> would be very desirable.)
> >> 
> > Ok, will move the two conditions to above 'if()', like below.
> > 
> > if ( likely(test_bit(d->domain_id, info->dom_ids)) ||
> >      !d->arch.psr_cos_ids ||
> >      !d->arch.psr_cos_ids[socket] )
> > 
> > Accordingly, the later codes should be:
> > 
> > spin_lock(&info->dom_ids_lock);
> > set_bit(d->domain_id, info->dom_ids);
> > d->arch.psr_cos_ids[socket] = 0;
> > spin_unlock(&info->dom_ids_lock);
> 
> Then you didn't fully understand: The test_and_ portion _cannot_
> be moved out of the locked region, but a simple test_bit() can be
> replicated prior to taking the lock.
> 
Oh, sorry, I should use test_and_ here to check the bit again which may be
changed before the lock. Thanks for pointing it out!

> >> > +            d->arch.psr_cos_ids[socket] = 0;
> >> > +
> >> > +        spin_unlock(&info->dom_ids_lock);
> >> 
> >> And then, as a whole: As indicated before, ideally you'd keep this
> >> out of the context switch path altogether. What are the alternatives?
> >> 
> > To restore domains' "psr_cos_ids[socket]" to default when socket offline
> > happens, we have three time windows:
> > 1. When socket is offline, in "free_socket_resources()";
> > 2. When socket is online, in "psr_cpu_init()";
> > 3. When context switch happens, in "psr_ctxt_switch_to()".
> > 
> > Option 1 and 2 have same effect and option 1 is more natural than 2. So, we can
> > do this restore action at "1" or "3".
> > 
> > I have two alternatives below. Please help to check which you think is better:
> > 1. The first version of the patch iterates valid domain list to restore them one
> > by one. Per your comments, it may take much time. That is the reason I submitted
> > this patch to spread out the restore action of all domains. If you think
> > "psr_cos_ids[socket]" restore action happens in context switch path is not good,
> > can we use a tasklet in "free_socket_resources()" to iterate the domain list and
> > restore their "psr_cos_ids"?
> 
> If that tasklet (a) doesn't again take overly long and (b) is
> guaranteed to finish before the same socket may come back online
> again, then yes. Otherwise both the iterate-over-all-domains and
> the in-context-switch approaches have downsides, but the latter
> would then seem preferable (because it only affects performance
> without risking the system's health). The question is whether some
> 3rd method can't be found.
> 
I have another solution now. We may move the psr_cos_ids[socket] restore action
into 'psr_get_val' and only set the bit of 'dom_ids[]' in 'psr_set_val'.
1. When socket is offline, the dom_ids[] is cleared.
2. When socket is online, we have four places to use psr_cos_ids[socket]:
   a. psr_ctxt_, we can use test_bit() atomically check if the bit is set. If
      not, that means the d->arch.psr_cos_ids[socket] is invalid at this time.
      Then, we use 0 to set ASSOC register. But we don't restore psr_cos_ids
      here and do not set dom_ids[]. So, we do not need the spin_lock.
   b. psr_get_val, we use test_bit() to check if the bit is 0 and the
      d->arch.psr_cos_ids[socket] is not 0. If yes, that means this domain's
      cos id has not been restored yet. So we restore it to 0.
   c. psr_set_val, we set the bit in dom_ids[] and set
      d->arch.psr_cos_ids[socket]. As, psr_set_val cannot happen when
      psr_get_val is called, so no protection is needed.
   d. psr_free_cos, clear the bit and free d->arch.psr_cos_ids. This place
      cannot happen at the same time that the above three functions called.
      So, no protection needed.

Per above analysis, we do not need lock protection. So, the CPU serialization
issue can be solved. How do you think?

> > 2. Or, can we use a tasklet in "psr_ctxt_switch_to()" to do above work? The side
> > effect is that the domain's COS ID used in this switch is not right. The valid
> > COS ID may be set in next context switch.
> 
> I think this would complicate things while at the same time making
> them worse.
> 
> Jan
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> https://lists.xen.org/xen-devel

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

  reply	other threads:[~2017-04-27  9:30 UTC|newest]

Thread overview: 114+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-04-01 13:53 [PATCH v10 00/25] Enable L2 Cache Allocation Technology & Refactor psr.c Yi Sun
2017-04-01 13:53 ` [PATCH v10 01/25] docs: create Cache Allocation Technology (CAT) and Code and Data Prioritization (CDP) feature document Yi Sun
2017-04-01 13:53 ` [PATCH v10 02/25] x86: refactor psr: remove L3 CAT/CDP codes Yi Sun
2017-04-01 13:53 ` [PATCH v10 03/25] x86: refactor psr: implement main data structures Yi Sun
2017-04-03 15:50   ` Jan Beulich
2017-04-05  3:12     ` Yi Sun
2017-04-05  8:20       ` Jan Beulich
2017-04-05  8:45         ` Yi Sun
2017-04-01 13:53 ` [PATCH v10 04/25] x86: move cpuid_count_leaf from cpuid.c to processor.h Yi Sun
2017-04-01 13:53 ` [PATCH v10 05/25] x86: refactor psr: L3 CAT: implement CPU init and free flow Yi Sun
2017-04-05 15:10   ` Jan Beulich
2017-04-06  5:49     ` Yi Sun
2017-04-06  8:32       ` Jan Beulich
2017-04-06  9:22         ` Yi Sun
2017-04-06  9:34           ` Jan Beulich
2017-04-06 10:02             ` Yi Sun
2017-04-06 14:02               ` Jan Beulich
2017-04-07  5:17                 ` Yi Sun
2017-04-07  8:48                   ` Jan Beulich
2017-04-07  9:08                     ` Yi Sun
2017-04-07  9:46                       ` Jan Beulich
2017-04-10  3:27                         ` Yi Sun
2017-04-10 12:43                           ` Yi Sun
2017-04-01 13:53 ` [PATCH v10 06/25] x86: refactor psr: L3 CAT: implement Domain init/free and schedule flows Yi Sun
2017-04-05 15:23   ` Jan Beulich
2017-04-06  6:01     ` Yi Sun
2017-04-06  8:34       ` Jan Beulich
2017-04-01 13:53 ` [PATCH v10 07/25] x86: refactor psr: L3 CAT: implement get hw info flow Yi Sun
2017-04-05 15:37   ` Jan Beulich
2017-04-06  6:05     ` Yi Sun
2017-04-06  8:36       ` Jan Beulich
2017-04-06 11:16         ` Yi Sun
2017-04-06 14:04           ` Jan Beulich
2017-04-07  5:39             ` Yi Sun
2017-04-01 13:53 ` [PATCH v10 08/25] x86: refactor psr: L3 CAT: implement get value flow Yi Sun
2017-04-05 15:51   ` Jan Beulich
2017-04-06  6:10     ` Yi Sun
2017-04-06  8:40       ` Jan Beulich
2017-04-06 11:13         ` Yi Sun
2017-04-06 14:08           ` Jan Beulich
2017-04-07  5:40             ` Yi Sun
2017-04-01 13:53 ` [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework Yi Sun
2017-04-11 15:01   ` Jan Beulich
2017-04-12  5:53     ` Yi Sun
2017-04-12  9:09       ` Jan Beulich
2017-04-12 12:23         ` Yi Sun
2017-04-12 12:42           ` Jan Beulich
2017-04-13  8:11             ` Yi Sun
2017-04-13  9:41               ` Jan Beulich
2017-04-13 10:49                 ` Yi Sun
2017-04-13 10:58                   ` Jan Beulich
2017-04-13 11:11                     ` Yi Sun
2017-04-13 11:26                       ` Yi Sun
2017-04-13 11:31                       ` Jan Beulich
2017-04-13 11:44                         ` Yi Sun
2017-04-13 11:50                           ` Jan Beulich
2017-04-18 10:55                             ` Yi Sun
2017-04-18 11:46                               ` Jan Beulich
2017-04-19  8:22                                 ` Yi Sun
2017-04-19  9:00                                   ` Jan Beulich
2017-04-20  2:14                                     ` Yi Sun
2017-04-20  9:43                                       ` Jan Beulich
2017-04-20 13:02                                         ` Lars Kurth
2017-04-20 13:21                                           ` Jan Beulich
2017-04-20 16:52                                             ` Lars Kurth
2017-04-21  6:11                                               ` Jan Beulich
2017-04-21  1:13                                             ` Konrad Rzeszutek Wilk
2017-04-21  6:18                                       ` Jan Beulich
2017-04-24  6:40                                         ` Yi Sun
2017-04-24  6:55                                           ` Jan Beulich
2017-04-25  7:15                                             ` Yi Sun
2017-04-25  8:24                                               ` Jan Beulich
2017-04-25  8:40                                                 ` Yi Sun
2017-04-20  5:38   ` [PATCH] dom_ids array implementation Yi Sun
2017-04-26 10:04     ` Jan Beulich
2017-04-27  2:38       ` Yi Sun
2017-04-27  6:48         ` Jan Beulich
2017-04-27  9:30           ` Yi Sun [this message]
2017-04-27  9:39             ` Jan Beulich
2017-04-27 12:03               ` Yi Sun
2017-04-01 13:53 ` [PATCH v10 10/25] x86: refactor psr: L3 CAT: set value: assemble features value array Yi Sun
2017-04-11 15:11   ` Jan Beulich
2017-04-12  5:55     ` Yi Sun
2017-04-12  9:13       ` Jan Beulich
2017-04-12 12:26         ` Yi Sun
2017-04-01 13:53 ` [PATCH v10 11/25] x86: refactor psr: L3 CAT: set value: implement cos finding flow Yi Sun
2017-04-11 15:17   ` Jan Beulich
2017-04-01 13:53 ` [PATCH v10 12/25] x86: refactor psr: L3 CAT: set value: implement cos id picking flow Yi Sun
2017-04-11 15:20   ` Jan Beulich
2017-04-01 13:53 ` [PATCH v10 13/25] x86: refactor psr: L3 CAT: set value: implement write msr flow Yi Sun
2017-04-11 15:25   ` Jan Beulich
2017-04-12  6:04     ` Yi Sun
2017-04-01 13:53 ` [PATCH v10 14/25] x86: refactor psr: CDP: implement CPU init and free flow Yi Sun
2017-04-01 13:53 ` [PATCH v10 15/25] x86: refactor psr: CDP: implement get hw info flow Yi Sun
2017-04-01 13:53 ` [PATCH v10 16/25] x86: refactor psr: CDP: implement get value flow Yi Sun
2017-04-11 15:39   ` Jan Beulich
2017-04-12  6:05     ` Yi Sun
2017-04-12  9:14       ` Jan Beulich
2017-04-01 13:53 ` [PATCH v10 17/25] x86: refactor psr: CDP: implement set value callback functions Yi Sun
2017-04-11 16:03   ` Jan Beulich
2017-04-12  6:14     ` Yi Sun
2017-04-01 13:53 ` [PATCH v10 18/25] x86: L2 CAT: implement CPU init and free flow Yi Sun
2017-04-12 15:18   ` Jan Beulich
2017-04-13  8:12     ` Yi Sun
2017-04-13  8:16       ` Jan Beulich
2017-04-01 13:53 ` [PATCH v10 19/25] x86: L2 CAT: implement get hw info flow Yi Sun
2017-04-01 13:53 ` [PATCH v10 20/25] x86: L2 CAT: implement get value flow Yi Sun
2017-04-01 13:53 ` [PATCH v10 21/25] x86: L2 CAT: implement set " Yi Sun
2017-04-12 15:23   ` Jan Beulich
2017-04-01 13:53 ` [PATCH v10 22/25] tools: L2 CAT: support get HW info for L2 CAT Yi Sun
2017-04-12 15:24   ` Jan Beulich
2017-04-01 13:53 ` [PATCH v10 23/25] tools: L2 CAT: support show cbm " Yi Sun
2017-04-01 13:53 ` [PATCH v10 24/25] tools: L2 CAT: support set " Yi Sun
2017-04-01 13:53 ` [PATCH v10 25/25] docs: add L2 CAT description in docs Yi Sun

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170427093037.GD17458@yi.y.sun \
    --to=yi.y.sun@linux.intel.com \
    --cc=JBeulich@suse.com \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).