From: Mike Christie <mchristi@redhat.com>
To: Xiubo Li <lixiubo@cmss.chinamobile.com>, nab@linux-iscsi.org
Cc: agrover@redhat.com, iliastsi@arrikto.com, namei.unix@gmail.com,
sheng@yasker.org, linux-scsi@vger.kernel.org,
target-devel@vger.kernel.org, linux-kernel@vger.kernel.org,
Jianfei Hu <hujianfei@cmss.chinamobile.com>
Subject: Re: [PATCH v6 2/2] tcmu: Add global data block pool support
Date: Mon, 1 May 2017 13:40:32 -0500 [thread overview]
Message-ID: <59078120.3000200@redhat.com> (raw)
In-Reply-To: <25e7247f-116f-aaa5-e895-b503c0a893f9@cmss.chinamobile.com>
On 04/30/2017 06:29 AM, Xiubo Li wrote:
> [...]
>>> +static struct page *tcmu_try_get_block_page(struct tcmu_dev *udev,
>>> uint32_t dbi)
>>> +{
>>> + struct page *page;
>>> + int ret;
>>> +
>>> + mutex_lock(&udev->cmdr_lock);
>>> + page = tcmu_get_block_page(udev, dbi);
>>> + if (likely(page)) {
>>> + mutex_unlock(&udev->cmdr_lock);
>>> + return page;
>>> + }
>>> +
>>> + /*
>>> + * Normally it shouldn't be here:
>>> + * Only when the userspace has touched the blocks which
>>> + * are out of the tcmu_cmd's data iov[], and will return
>>> + * one zeroed page.
>>
>> Is it a userspace bug when this happens? Do you know when it is
>> occcuring?
> Since the UIO will map the whole ring buffer to the user space at the
> beginning, and the userspace is allowed and legal to access any block
> within the limits of the mapped ring area.
>
> But actually when this happens, it normally will be one bug of the
> userspace. Without this checking the kernel will output many page fault
> dump traces.
>
> Maybe here outputing some warning message is a good idea, and will be
> easy to debug for userspace.
Yeah.
>
>
> [...]
>>> @@ -1388,6 +1509,81 @@ static ssize_t tcmu_cmd_time_out_store(struct
>>> config_item *item, const char *pag
>>> .tb_dev_attrib_attrs = NULL,
>>> };
>>> +static int unmap_thread_fn(void *data)
>>> +{
>>> + struct tcmu_dev *udev;
>>> + loff_t off;
>>> + uint32_t start, end, block;
>>> + struct page *page;
>>> + int i;
>>> +
>>> + while (1) {
>>> + DEFINE_WAIT(__wait);
>>> +
>>> + prepare_to_wait(&unmap_wait, &__wait, TASK_INTERRUPTIBLE);
>>> + schedule();
>>> + finish_wait(&unmap_wait, &__wait);
>>> +
>>> + mutex_lock(&root_udev_mutex);
>>> + list_for_each_entry(udev, &root_udev, node) {
>>> + mutex_lock(&udev->cmdr_lock);
>>> +
>>> + /* Try to complete the finished commands first */
>>> + tcmu_handle_completions(udev);
>>> +
>>> + /* Skip the udevs waiting the global pool or in idle */
>>> + if (udev->waiting_global || !udev->dbi_thresh) {
>>> + mutex_unlock(&udev->cmdr_lock);
>>> + continue;
>>> + }
>>> +
>>> + end = udev->dbi_max + 1;
>>> + block = find_last_bit(udev->data_bitmap, end);
>>> + if (block == udev->dbi_max) {
>>> + /*
>>> + * The last bit is dbi_max, so there is
>>> + * no need to shrink any blocks.
>>> + */
>>> + mutex_unlock(&udev->cmdr_lock);
>>> + continue;
>>> + } else if (block == end) {
>>> + /* The current udev will goto idle state */
>>> + udev->dbi_thresh = start = 0;
>>> + udev->dbi_max = 0;
>>> + } else {
>>> + udev->dbi_thresh = start = block + 1;
>>> + udev->dbi_max = block;
>>> + }
>>> +
>>> + /* Here will truncate the data area from off */
>>> + off = udev->data_off + start * DATA_BLOCK_SIZE;
>>> + unmap_mapping_range(udev->inode->i_mapping, off, 0, 1);
>>> +
>>> + /* Release the block pages */
>>> + for (i = start; i < end; i++) {
>>> + page = radix_tree_delete(&udev->data_blocks, i);
>>> + if (page) {
>>> + __free_page(page);
>>> + atomic_dec(&global_db_count);
>>> + }
>>> + }
>>> + mutex_unlock(&udev->cmdr_lock);
>>> + }
>>> +
>>> + /*
>>> + * Try to wake up the udevs who are waiting
>>> + * for the global data pool.
>>> + */
>>> + list_for_each_entry(udev, &root_udev, node) {
>>> + if (udev->waiting_global)
>>> + wake_up(&udev->wait_cmdr);
>>> + }
>>
>> To avoid starvation, I think you want a second list/fifo that holds the
>> watiers. In tcmu_get_empty_block if the list is not empty, record how
>> many pages we needed and then add the device to the list and wait in
>> tcmu_queue_cmd_ring.
>>
>> Above if we freed enough pages for the device at head then wake up the
>> device.
>>
>> I think you also need a wake_up call in the completion path in case the
>> initial call could not free enough pages. It could probably check if the
>> completion was going to free enough pages for a waiter and then call
>> wake.
>>
> Yes, I meant to introduce this later after this series to not let the
> patches too
> complex to review.
>
> If you agree I will do this later, or in V7 series ?
Yeah, I am ok with adding it after the initial patches go in.
next prev parent reply other threads:[~2017-05-01 18:40 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-04-26 6:25 [PATCH v6 0/2] tcmu: Dynamic growing data area support lixiubo
2017-04-26 6:25 ` [PATCH v6 1/2] tcmu: Add dynamic growing data area feature support lixiubo
2017-04-30 5:48 ` Mike Christie
2017-04-30 10:22 ` [PATCH v6 1/2] tcmu: Add dynamic growing data area featuresupport Xiubo Li
2017-05-01 18:37 ` Mike Christie
2017-04-26 6:25 ` [PATCH v6 2/2] tcmu: Add global data block pool support lixiubo
2017-04-26 10:07 ` kbuild test robot
2017-04-26 10:07 ` kbuild test robot
2017-04-30 7:31 ` Mike Christie
2017-04-30 11:29 ` Xiubo Li
2017-05-01 18:40 ` Mike Christie [this message]
2017-05-02 5:26 ` Nicholas A. Bellinger
2017-05-02 5:25 ` [PATCH v6 0/2] tcmu: Dynamic growing data area support Nicholas A. Bellinger
2017-05-02 9:25 ` 答复: " lixiubo
2017-05-02 9:25 ` lixiubo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=59078120.3000200@redhat.com \
--to=mchristi@redhat.com \
--cc=agrover@redhat.com \
--cc=hujianfei@cmss.chinamobile.com \
--cc=iliastsi@arrikto.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-scsi@vger.kernel.org \
--cc=lixiubo@cmss.chinamobile.com \
--cc=nab@linux-iscsi.org \
--cc=namei.unix@gmail.com \
--cc=sheng@yasker.org \
--cc=target-devel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.