From: Vitaly Kuznetsov <vkuznets@redhat.com>
To: Dexuan Cui <decui@microsoft.com>
Cc: KY Srinivasan <kys@microsoft.com>,
Haiyang Zhang <haiyangz@microsoft.com>,
"devel\@linuxdriverproject.org" <devel@linuxdriverproject.org>,
"linux-kernel\@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 6/6] Drivers: hv: vmbus: do a fair round robin when selecting an outgoing channel
Date: Fri, 24 Apr 2015 10:59:16 +0200 [thread overview]
Message-ID: <87sibpgavv.fsf@vitty.brq.redhat.com> (raw)
In-Reply-To: <fffd7d5f88334c308029dd0c0bf9e9d9@SIXPR30MB031.064d.mgd.msft.net> (Dexuan Cui's message of "Fri, 24 Apr 2015 08:42:11 +0000")
Dexuan Cui <decui@microsoft.com> writes:
>> -----Original Message-----
>> From: Vitaly Kuznetsov [mailto:vkuznets@redhat.com]
>> Sent: Tuesday, April 21, 2015 22:28
>> To: KY Srinivasan
>> Cc: Haiyang Zhang; devel@linuxdriverproject.org; linux-
>> kernel@vger.kernel.org; Dexuan Cui
>> Subject: [PATCH 6/6] Drivers: hv: vmbus: do a fair round robin when
>> selecting an outgoing channel
>>
>> vmbus_get_outgoing_channel() implements the following algorithm for
>> selecting
>> an outgoing channel (despite the comment before the function saying it
>> distributes the load equally):
>
> Yeah, I also found the issue.
>
>> 1) If we have no subchannels return the primary channel;
>> 2) If primary->next_oc is grater than primary->num_sc reset the primary-
>> >next_oc
>> to 0 and return the primary channel;
>> 3) Aim for the primary->next_oc subchannel, increment primary->next_oc;
>> 4) Loop through all opened subchannels. If we see a channel which has
>> target_cpu == current_cpu return it. If we reached the primary->next_oc'th
>> open subchannel return it;
>> 5) Return the primary channel.
>> The implementation also skips the subchannel No. 0 unless it matches the
>> current
>> cpu as we assign i to 1 in the initialization.
>>
>> This is not a fair round robin as subchannels in the beginning of the list are
>> more likely to be returned and checking for current cpu aslo creates
>
> I suppose the current algorithm is trying to make use of cache locality?
> KY may share more information.
>
>> additional
>> complexity. Simplify the vmbus_get_outgoing_channel() function, make it
>> do what the comment before it says.
>
> Hi Vitaly,
> It looks your algorithm also has an issue:
> Assuming primary->num_sc == 3 (SC1, SC2, SC3)
> 1st time: we choose SC1 and primary->next_oc is set to 1.
> 2nd time: we choose SC2 and primary->next_oc is set to 2.
> 3rd time: we choose SC3 and primary->next_oc is set to 3.
> 4th time and later: since i varies among 1~3 and can't be bigger than 3,
> we always choose the primary channel.
You're right, it seems 'if (primary->next_oc > primary->num_sc)' is
off-by-one, it should be >=
>
> BTW, IMO it's not easy to achieve complete fairness because
> vmbus_get_outgoing_channel() can run simultaneously on different
> CPUs (so primary->next_oc can be modified at the same time by multiple
> CPUs), and we believe it should be lockless.
> Maybe atomic_t can help(?)
This thing bothered me a bit but then I realized that we don't actually
care - doing a mistake once is better than suffering from the slowness
of locks/atomics. I'd suggest we keep it this way (with the fix
mentioned above).
Thanks!
>
> Thanks,
> -- Dexuan
>
>>
>> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
>> ---
>> drivers/hv/channel_mgmt.c | 27 +++++++++------------------
>> 1 file changed, 9 insertions(+), 18 deletions(-)
>>
>> diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
>> index daa6417..df82442 100644
>> --- a/drivers/hv/channel_mgmt.c
>> +++ b/drivers/hv/channel_mgmt.c
>> @@ -827,39 +827,30 @@ cleanup:
>> struct vmbus_channel *vmbus_get_outgoing_channel(struct
>> vmbus_channel *primary)
>> {
>> struct list_head *cur, *tmp;
>> - int cur_cpu;
>> struct vmbus_channel *cur_channel;
>> - struct vmbus_channel *outgoing_channel = primary;
>> - int next_channel;
>> - int i = 1;
>> + int i = 0;
>>
>> if (list_empty(&primary->sc_list))
>> - return outgoing_channel;
>> + return primary;
>>
>> - next_channel = primary->next_oc++;
>> -
>> - if (next_channel > (primary->num_sc)) {
>> + if (primary->next_oc > primary->num_sc) {
>> primary->next_oc = 0;
>> - return outgoing_channel;
>> + return primary;
>> }
>>
>> - cur_cpu = hv_context.vp_index[get_cpu()];
>> - put_cpu();
>> list_for_each_safe(cur, tmp, &primary->sc_list) {
>> + i++;
>> cur_channel = list_entry(cur, struct vmbus_channel, sc_list);
>> if (cur_channel->state != CHANNEL_OPENED_STATE)
>> continue;
>>
>> - if (cur_channel->target_vp == cur_cpu)
>> - return cur_channel;
>> -
>> - if (i == next_channel)
>> + if (i > primary->next_oc) {
>> + primary->next_oc = i;
>> return cur_channel;
>> -
>> - i++;
>> + }
>> }
>>
>> - return outgoing_channel;
>> + return primary;
>> }
>> EXPORT_SYMBOL_GPL(vmbus_get_outgoing_channel);
>>
>> --
>> 1.9.3
--
Vitaly
prev parent reply other threads:[~2015-04-24 8:59 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-04-21 14:27 [PATCH 0/6] Drivers: hv: vmbus: fair round robin algorithm for vmbus_get_outgoing_channel() Vitaly Kuznetsov
2015-04-21 14:27 ` [PATCH 1/6] Drivers: hv: vmbus: unify calls to percpu_channel_enq() Vitaly Kuznetsov
2015-04-21 14:27 ` [PATCH 2/6] Drivers: hv: vmbus: briefly comment num_sc and next_oc Vitaly Kuznetsov
2015-04-21 14:27 ` [PATCH 3/6] Drivers: hv: vmbus: decrease num_sc on subchannel removal Vitaly Kuznetsov
2015-04-24 7:02 ` Dexuan Cui
2015-04-24 8:40 ` Vitaly Kuznetsov
2015-04-21 14:27 ` [PATCH 4/6] Drivers: hv: vmbus: move init_vp_index() call to vmbus_process_offer() Vitaly Kuznetsov
2015-04-24 8:38 ` Dexuan Cui
2015-04-24 8:46 ` Vitaly Kuznetsov
2015-04-21 14:27 ` [PATCH 5/6] Drivers: hv: vmbus: distribute subchannels among all vcpus Vitaly Kuznetsov
2015-04-24 8:40 ` Dexuan Cui
2015-04-24 9:05 ` Vitaly Kuznetsov
2015-04-24 16:46 ` KY Srinivasan
2015-04-27 13:30 ` Vitaly Kuznetsov
2015-04-27 18:09 ` KY Srinivasan
2015-04-21 14:27 ` [PATCH 6/6] Drivers: hv: vmbus: do a fair round robin when selecting an outgoing channel Vitaly Kuznetsov
2015-04-24 8:42 ` Dexuan Cui
2015-04-24 8:59 ` Vitaly Kuznetsov [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87sibpgavv.fsf@vitty.brq.redhat.com \
--to=vkuznets@redhat.com \
--cc=decui@microsoft.com \
--cc=devel@linuxdriverproject.org \
--cc=haiyangz@microsoft.com \
--cc=kys@microsoft.com \
--cc=linux-kernel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.