From mboxrd@z Thu Jan  1 00:00:00 1970
From: Xiao Guangrong <guangrong.xiao@gmail.com>
Subject: Re: [PATCH v3 2/5] util: introduce threaded workqueue
Date: Tue, 27 Nov 2018 16:29:05 +0800
Message-ID: <fb9e053d-629d-d468-613d-2c695bf686ea@gmail.com>
References: <20181126184919.GA6688@flamenco>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Cc: kvm@vger.kernel.org, mst@redhat.com, mtosatti@redhat.com,
	Xiao Guangrong <xiaoguangrong@tencent.com>, dgilbert@redhat.com,
	peterx@redhat.com, qemu-devel@nongnu.org, quintela@redhat.com,
	wei.w.wang@intel.com, jiang.biao2@zte.com.cn, pbonzini@redhat.com
To: "Emilio G. Cota" <cota@braap.org>
Return-path: <qemu-devel-bounces+gceq-qemu-devel2=m.gmane.org@nongnu.org>
In-Reply-To: <20181126184919.GA6688@flamenco>
Content-Language: en-US
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel/>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
Errors-To: qemu-devel-bounces+gceq-qemu-devel2=m.gmane.org@nongnu.org
Sender: "Qemu-devel"
	<qemu-devel-bounces+gceq-qemu-devel2=m.gmane.org@nongnu.org>
List-Id: kvm.vger.kernel.org


On 11/27/18 2:49 AM, Emilio G. Cota wrote:
> On Mon, Nov 26, 2018 at 16:06:37 +0800, Xiao Guangrong wrote:
>>>> +    /* after the user fills the request, the bit is flipped. */
>>>> +    uint64_t request_fill_bitmap QEMU_ALIGNED(SMP_CACHE_BYTES);
>>>> +    /* after handles the request, the thread flips the bit. */
>>>> +    uint64_t request_done_bitmap QEMU_ALIGNED(SMP_CACHE_BYTES);
>>>
>>> Use DECLARE_BITMAP, otherwise you'll get type errors as David
>>> pointed out.
>>
>> If we do it, the field becomes a pointer... that complicates the
>> thing.
> 
> Not necessarily, see below.
> 
> On Mon, Nov 26, 2018 at 16:18:24 +0800, Xiao Guangrong wrote:
>> On 11/24/18 8:17 AM, Emilio G. Cota wrote:
>>> On Thu, Nov 22, 2018 at 15:20:25 +0800, guangrong.xiao@gmail.com wrote:
>>>> +static uint64_t get_free_request_bitmap(Threads *threads, ThreadLocal *thread)
>>>> +{
>>>> +    uint64_t request_fill_bitmap, request_done_bitmap, result_bitmap;
>>>> +
>>>> +    request_fill_bitmap = atomic_rcu_read(&thread->request_fill_bitmap);
>>>> +    request_done_bitmap = atomic_rcu_read(&thread->request_done_bitmap);
>>>> +    bitmap_xor(&result_bitmap, &request_fill_bitmap, &request_done_bitmap,
>>>> +               threads->thread_requests_nr);
>>>
>>> This is not wrong, but it's a big ugly. Instead, I would:
>>>
>>> - Introduce bitmap_xor_atomic in a previous patch
>>> - Use bitmap_xor_atomic here, getting rid of the rcu reads
>>
>> Hmm, however, we do not need atomic xor operation here... that should be slower than
>> just two READ_ONCE calls.
> 
> If you use DECLARE_BITMAP, you get an in-place array. On a 64-bit
> host, that'd be
> 	unsigned long foo[1]; /* [2] on 32-bit */
> 
> Then again on 64-bit hosts, bitmap_xor_atomic would reduce
> to 2 atomic reads:
> 
> static inline void bitmap_xor_atomic(unsigned long *dst,
> const unsigned long *src1, const unsigned long *src2, long nbits)
> {
>      if (small_nbits(nbits)) {
>          *dst = atomic_read(src1) ^ atomic_read(src2);
>      } else {
>          slow_bitmap_xor_atomic(dst, src1, src2, nbits);

We don't need to do the xor operation in place, i.e., we just fetch the
bitmaps into local variables and do the xor locally.
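
To illustrate what I mean, here is a rough, standalone sketch (not the patch
itself). It uses C11 atomics as a stand-in for QEMU's atomic helpers, and the
struct and function names are made up for the example:

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-thread state; not the patch's struct. */
struct thread_local_sketch {
    _Atomic uint64_t request_fill_bitmap;
    _Atomic uint64_t request_done_bitmap;
};

/*
 * Take one snapshot of each bitmap into a local variable, then xor the
 * two snapshots. No atomic read-modify-write is involved, only two loads.
 * A set bit in the result means the fill and done bits differ at that index.
 */
static uint64_t snapshot_xor_bitmap(struct thread_local_sketch *t)
{
    uint64_t fill = atomic_load_explicit(&t->request_fill_bitmap,
                                         memory_order_relaxed);
    uint64_t done = atomic_load_explicit(&t->request_done_bitmap,
                                         memory_order_relaxed);

    return fill ^ done;
}
```

The relaxed ordering is an assumption made for the sketch; the real code may
need stronger ordering depending on how the request data is published.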

So we would need additional complexity to handle the !small_nbits(nbits)
case ... but, as you said, it is really not a big deal; it is just a couple
of lines of code.
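
For reference, the multi-word case could look roughly like the following. This
is only a guess at the shape of such a helper, written with C11 atomics; the
name bitmap_xor_snapshot and the BITS_PER_LONG_SKETCH macro are hypothetical
and do not exist in QEMU:

```c
#include <limits.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical macro for this sketch: number of bits in one bitmap word. */
#define BITS_PER_LONG_SKETCH (sizeof(unsigned long) * CHAR_BIT)

/*
 * Xor two bitmaps word by word into dst. Each source word is read with a
 * single atomic load, but the result is not a consistent snapshot across
 * words: another thread may flip bits between two iterations.
 */
static void bitmap_xor_snapshot(unsigned long *dst,
                                _Atomic unsigned long *src1,
                                _Atomic unsigned long *src2,
                                size_t nbits)
{
    size_t nwords = (nbits + BITS_PER_LONG_SKETCH - 1) / BITS_PER_LONG_SKETCH;

    for (size_t i = 0; i < nwords; i++) {
        dst[i] = atomic_load_explicit(&src1[i], memory_order_relaxed) ^
                 atomic_load_explicit(&src2[i], memory_order_relaxed);
    }
}
```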

However, since only 64 indexes are allowed, using a u64 is more
straightforward and easier to understand. :)