All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrea Parri <parri.andrea@gmail.com>
To: Michael Kelley <mikelley@microsoft.com>
Cc: "linux-hyperv@vger.kernel.org" <linux-hyperv@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	KY Srinivasan <kys@microsoft.com>,
	Haiyang Zhang <haiyangz@microsoft.com>,
	Stephen Hemminger <sthemmin@microsoft.com>,
	Wei Liu <wei.liu@kernel.org>,
	Nuno Das Neves <Nuno.Das@microsoft.com>
Subject: Re: [RFC PATCH 1/2] Drivers: hv: vmbus: Re-balance channel interrupts across CPUs at CPU hotplug
Date: Sat, 6 Jun 2020 23:49:09 +0200	[thread overview]
Message-ID: <20200606214909.GA28369@andrea> (raw)
In-Reply-To: <SN6PR2101MB10562BB7B20846C8C711F2DDD78B0@SN6PR2101MB1056.namprd21.prod.outlook.com>

> > @@ -515,6 +519,8 @@ static void vmbus_add_channel_work(struct work_struct *work)
> >  	if (ret != 0) {
> >  		pr_err("unable to add child device object (relid %d)\n",
> >  			newchannel->offermsg.child_relid);
> > +		if (hv_is_perf_channel(newchannel))
> > +			free_chn_counts(&newchannel->device_obj->chn_cnt);
> 
> You could drop the "if" condition and just always call free_chn_counts() since
> it will do the right thing for non-perf channels where the memory was never
> allocated.

Well, AFAICT, the above would do the "right" thing for non-perf channels
without calling kfree() twice.  ;-)  It would also serve as a hard-coded
"reminder" of the fact that there is no alloc_chn_counts() for them.  No
strong opinions though, will drop for the next submission...


> > +static void filter_vp_index(struct hv_device *hv_dev, cpumask_var_t cpu_msk)
> > +{
> > +	/*
> > +	 * The selection of the target CPUs is performed in two domains,
> > +	 * the device domain and the connection domain.  At each domain,
> > +	 * in turn, invalid CPUs are filtered out at two levels, the CPU
> 
> I would drop the word "invalid".  You are filtering out CPUs that meet the
> criteria in the sentence starting after the colon below.

Agreed, will drop.


> > +static void balance_vp_index(struct vmbus_channel *chn,
> > +			     struct hv_device *hv_dev, cpumask_var_t cpu_msk)
> > +{
> > +	u32 cur_cpu = chn->target_cpu, tgt_cpu = cur_cpu;
> > +
> > +	if (chn->state != CHANNEL_OPENED_STATE) {
> > +		/*
> > +		 * The channel may never have been opened or it may be in
> > +		 * a closed/closing state; if so, do not touch the target
> > +		 * CPU of this channel.
> > +		 */
> > +		goto update_chn_cnts;
> > +	}
> > +
> > +	/*
> > +	 * The channel was open, and it will remain open until we release
> > +	 * channel_mutex, cf. the use of channel_mutex and channel->state
> > +	 * in vmbus_disconnect_ring() -> vmbus_close_internal().
> > +	 */
> > +
> > +	if (!hv_is_perf_channel(chn)) {
> > +		/*
> > +		 * Only used by the CPU hot removal path, remark that
> > +		 * the connect CPU can not go offline.
> 
> To be super explicit in the comment:  If the channel is not a
> performance channel, then it does not participate in the balancing,
> and we move it back to interrupting the VMBUS_CONNECT_CPU for
> lack of a better choice.  Because non-perf channels are initially set to 
> interrupt the VMBUS_CONNECT_CPU, the only way a non-perf channel
> could be found in this state (i.e., set to a CPU other than
> VMBUS_CONNECT_CPU) is a manual change through the sysfs interface.

The comment was indeed rather meant to make explicit a "please go read
the caller, carefully..." where, among other things, at least parts of
the remarks you pointed out above are spelled out.  But I won't be the
one stingy with comments!  ;-)  Will apply, thanks for the suggestion.


> > +void vmbus_balance_vp_indexes_at_cpuhp(unsigned int cpu, bool add)
> > +{
> > +	struct vmbus_channel *chn, *sc;
> > +	cpumask_var_t cpu_msk;
> > +
> > +	lockdep_assert_cpus_held();
> > +	lockdep_assert_held(&vmbus_connection.channel_mutex);
> > +
> > +	WARN_ON(!cpumask_test_cpu(cpu, cpu_online_mask));
> > +
> > +	/* See the header comment for vmbus_send_modifychannel(). */
> > +	if (vmbus_proto_version < VERSION_WIN10_V4_1)
> > +		return;
> > +
> > +	if (!alloc_cpumask_var(&cpu_msk, GFP_KERNEL))
> > +		return;
> > +
> > +	reset_chn_counts(&vmbus_connection.chn_cnt);
> > +
> > +	list_for_each_entry(chn, &vmbus_connection.chn_list, listentry) {
> > +		struct hv_device *dev = chn->device_obj;
> > +
> > +		/*
> > +		 * The device may not have been allocated/assigned to
> > +		 * the primary channel yet; if so, do not balance the
> > +		 * channels associated to this device.  If dev != NULL,
> > +		 * the synchronization on channel_mutex ensures that
> > +		 * the device's chn_cnt has been (properly) allocated
> > +		 * *and* initialized, cf. vmbus_add_channel_work().
> > +		 */
> > +		if (dev == NULL)
> > +			continue;
> > +
> > +		/*
> > +		 * By design, non-"perf" channels do not take part in
> > +		 * the balancing process.
> > +		 *
> > +		 * The user may have assigned some non-"perf" channel
> > +		 * to this CPU.  To satisfy the user's request to hot
> > +		 * remove the CPU, we will re-assign such channels to
> > +		 * the connect CPU; cf. balance_vp_index().
> > +		 */
> > +		if (!hv_is_perf_channel(chn)) {
> > +			if (add)
> > +				continue;
> > +			/*
> > +			 * Assume that the non-"perf" channel has no
> > +			 * sub-channels.
> > +			 */
> 
> The "if" statement below could use a bit further explanation to help
> the reader. :-)  If this non-perf channel is assigned to some CPU other
> than the one we are hot-removing, then we execute the "continue"
> statement and leave its target CPU unchanged.  But if it is assigned
> to the CPU we are hot removing, then we need to move the channel
> to some other CPU.
> 
> I'm also not clear on how the above comment about having no
> sub-channels is relevant.  Maybe a bit more explanation would
> help.

That comment was meant to simply stress the fact that we can "continue"
without looping over/checking the sub-channels, because we know that the
channel in question has no sub-channels.  ;-)  (BTW, for the very same
reason, we have no "if (!hv_is_perf_channel(sc))..." in the loop below.)

So, maybe something like:

	/*
	 * If this non-"perf" channel is assigned to some...
	 * [your text/explanation above].
	 */
	if (chn->target_cpu != cpu) {
		/*
		 * Non-"perf" channels have no sub-channels:
		 * no need to loop over sc_list.
		 */
		continue;
	}

??
	 

> > +			if (chn->target_cpu != cpu)
> > +				continue;
> > +		} else {
> > +			reset_chn_counts(&dev->chn_cnt);
> > +		}
> > +
> > +		cpumask_copy(cpu_msk, cpu_online_mask);
> > +		if (!add)
> > +			cpumask_clear_cpu(cpu, cpu_msk);
> > +		balance_vp_index(chn, dev, cpu_msk);
> > +
> > +		list_for_each_entry(sc, &chn->sc_list, sc_list) {
> > +			cpumask_copy(cpu_msk, cpu_online_mask);
> > +			if (!add)
> > +				cpumask_clear_cpu(cpu, cpu_msk);
> > +			balance_vp_index(sc, dev, cpu_msk);
> > +		}
> > +	}
> > +
> > +	free_cpumask_var(cpu_msk);
> > +}


> >  int hv_synic_init(unsigned int cpu)
> >  {
> > +	/*
> > +	 * The CPU has been hot added: try to re-balance the channels
> > +	 * across the online CPUs (including "this" CPU), provided that
> > +	 * the VMBus is connected; in part., avoid the re-balancing at
> > +	 * the very first CPU initialization.
> > +	 *
> > +	 * See also inline comments in hv_synic_cleanup().
> > +	 */
> > +	if (vmbus_connection.conn_state == CONNECTED) {
> > +		mutex_lock(&vmbus_connection.channel_mutex);
> > +		vmbus_balance_vp_indexes_at_cpuhp(cpu, true);
> 
> Does the target CPU have its bit in cpu_online_mask set at the time this
> is executed?  reset_chn_counts() does a for_each_online_cpu() loop, and
> we want to make sure the count for this CPU gets reset to zero.

Yes, it does:  We're here (in CPUHP_AP_ONLINE_DYN) near the end of the
"CPU-online" process; IIUC, the bit in question is set earlier in this
process/before the CPU reaches CPUHP_AP_ONLINE_IDLE.

So, yeah, I think I would agree in saying that that:

  WARN_ON(!cpumask_test_cpu(cpu, cpu_online_mask));

(in vmbus_balance_vp_indexes_at_cpuhp()) is more about "paranoid" (for
future changes/uses) than anything else.  I'm keeping that for now.


> > @@ -980,6 +980,9 @@ static void vmbus_device_release(struct device *device)
> >  	mutex_lock(&vmbus_connection.channel_mutex);
> >  	hv_process_channel_removal(channel);
> >  	mutex_unlock(&vmbus_connection.channel_mutex);
> > +
> > +	if (hv_is_perf_channel(channel))
> > +		free_chn_counts(&hv_dev->chn_cnt);
> 
> Again, can drop the 'if' statement.

As mentioned above, either way works for me.  Will drop it for the next
version.

Thanks,
  Andrea

  reply	other threads:[~2020-06-06 21:49 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-26 22:32 [RFC PATCH 0/2] VMBus channel interrupts re-balancing Andrea Parri (Microsoft)
2020-05-26 22:32 ` [RFC PATCH 1/2] Drivers: hv: vmbus: Re-balance channel interrupts across CPUs at CPU hotplug Andrea Parri (Microsoft)
2020-06-02 23:47   ` Michael Kelley
2020-06-06 21:49     ` Andrea Parri [this message]
2020-05-26 22:32 ` [RFC PATCH 2/2] Drivers: hv: vmbus: Re-balance channel interrupts across CPUs at device hotplug Andrea Parri (Microsoft)
2020-06-07 15:42 ` [RFC PATCH 0/2] VMBus channel interrupts re-balancing Andrea Parri

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200606214909.GA28369@andrea \
    --to=parri.andrea@gmail.com \
    --cc=Nuno.Das@microsoft.com \
    --cc=haiyangz@microsoft.com \
    --cc=kys@microsoft.com \
    --cc=linux-hyperv@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mikelley@microsoft.com \
    --cc=sthemmin@microsoft.com \
    --cc=wei.liu@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.