From: "\"Zhou, Wenjian/周文剑\"" <zhouwj-fnst@cn.fujitsu.com>
To: Minoru Usui <min-usui@ti.jp.nec.com>,
"kexec@lists.infradead.org" <kexec@lists.infradead.org>
Subject: Re: [PATCH v1] Improve the performance of --num-threads -d 31
Date: Mon, 15 Feb 2016 13:36:56 +0800 [thread overview]
Message-ID: <56C163F8.3070809@cn.fujitsu.com> (raw)
In-Reply-To: <56C134C1.7050005@cn.fujitsu.com>
On 02/15/2016 10:15 AM, "Zhou, Wenjian/周文剑" wrote:
> Hello Usui,
>
> Thanks very much for your comments.
> And sorry for the late reply.
>
> See below.
>
> On 02/08/2016 01:00 PM, Minoru Usui wrote:
>> Hello,
>>
>>> -----Original Message-----
>>> From: kexec [mailto:kexec-bounces@lists.infradead.org] On Behalf Of Minoru Usui
>>> Sent: Thursday, February 04, 2016 8:52 AM
>>> To: Zhou Wenjian <zhouwj-fnst@cn.fujitsu.com>; kexec@lists.infradead.org
>>> Subject: Re: [PATCH v1] Improve the performance of --num-threads -d 31
>>>
>>> Hi, Zhou
>>>
>>> I have some comments.
>>> I'm sorry if I have misunderstood your code.
>>>
>>>> -----Original Message-----
>>>> From: kexec [mailto:kexec-bounces@lists.infradead.org] On Behalf Of Zhou Wenjian
>>>> Sent: Monday, February 01, 2016 3:22 PM
>>>> To: kexec@lists.infradead.org
>>>> Subject: [PATCH v1] Improve the performance of --num-threads -d 31
>>>>
>>>> v1:
>>>> 1. change page_flag.ready's value to enum
>>>> 2. change the patch description
>>>> 3. cleanup some codes
>>>> 4. fix a bug in cyclic mode
>>>>
>>>> multi-threads implementation will introduce extra cost when handling
>>>> each page. The origin implementation will also do the extra work for
>>>> filtered pages. So there is a big performance degradation in
>>>> --num-threads -d 31.
>>>> The new implementation won't do the extra work for filtered pages any
>>>> more. So the performance of -d 31 is close to that of serial processing.
>>>>
>>>> The new implementation is just like the following:
>>>> * The basic idea is producer producing page and consumer writing page.
>>>> * Each producer have a page_flag_buf list which is used for storing
>>>> page's description.
>>>> * The size of page_flag_buf is little so it won't take too much memory.
>>>> * And all producers will share a page_data_buf array which is
>>>> used for storing page's compressed data.
>>>> * The main thread is the consumer. It will find the next pfn and write
>>>> it into file.
>>>> * The next pfn is smallest pfn in all page_flag_buf.
>>>>
>>>> Signed-off-by: Zhou Wenjian <zhouwj-fnst@cn.fujitsu.com>
>>>> ---
>>>> makedumpfile.c | 258 ++++++++++++++++++++++++++++++++++++++-------------------
>>>> makedumpfile.h | 31 ++++---
>>>> 2 files changed, 193 insertions(+), 96 deletions(-)
>>>>
>>>> diff --git a/makedumpfile.c b/makedumpfile.c
>>>> index fa0b779..0ecd065 100644
>>>> --- a/makedumpfile.c
>>>> +++ b/makedumpfile.c
>>>> @@ -3483,7 +3483,8 @@ initial_for_parallel()
>>>> unsigned long page_data_buf_size;
>>>> unsigned long limit_size;
>>>> int page_data_num;
>>>> - int i;
>>>> + struct page_flag *current;
>>>> + int i, j;
>>>>
>>>> len_buf_out = calculate_len_buf_out(info->page_size);
>>>>
>>>> @@ -3562,8 +3563,10 @@ initial_for_parallel()
>>>> - MAP_REGION * info->num_threads) * 0.6;
>>>>
>>>> page_data_num = limit_size / page_data_buf_size;
>>>> + info->num_buffers = 3 * info->num_threads;
>>>>
>>>> - info->num_buffers = MIN(NUM_BUFFERS, page_data_num);
>>>> + info->num_buffers = MAX(info->num_buffers, NUM_BUFFERS);
>>>> + info->num_buffers = MIN(info->num_buffers, page_data_num);
>>>>
>>>> DEBUG_MSG("Number of struct page_data for produce/consume: %d\n",
>>>> info->num_buffers);
>>>> @@ -3588,6 +3591,36 @@ initial_for_parallel()
>>>> }
>>>>
>>>> /*
>>>> + * initial page_flag for each thread
>>>> + */
>>>> + if ((info->page_flag_buf = malloc(sizeof(void *) * info->num_threads))
>>>> + == NULL) {
>>>> + MSG("Can't allocate memory for page_flag_buf. %s\n",
>>>> + strerror(errno));
>>>> + return FALSE;
>>>> + }
>>>> + memset(info->page_flag_buf, 0, sizeof(void *) * info->num_threads);
>>>> +
>>>> + for (i = 0; i < info->num_threads; i++) {
>>>> + if ((info->page_flag_buf[i] = malloc(sizeof(struct page_flag))) == NULL) {
>>>
>>> Fist element of struct page_flag in circular list is allocated by malloc(),
>>> but other elements are allocated by calloc().(see below)
>>> I think both elements should be allocated by calloc().
>>>
>
> Yes, you are right.
> I have made a mistake.
>
>>>> + MSG("Can't allocate memory for page_flag_buf. %s\n",
>>>> + strerror(errno));
>>>> + return FALSE;
>>>> + }
>>>> + current = info->page_flag_buf[i];
>>>> +
>>>> + for (j = 1; j < NUM_BUFFERS; j++) {
>>>> + if ((current->next = calloc(0, sizeof(struct page_flag))) == NULL) {
>>>> + MSG("Can't allocate memory for data of page_data_buf. %s\n",
>>>> + strerror(errno));
>>>> + return FALSE;
>>>> + }
>>>
>>>
>>> First argument of calloc() should be 1, not 0.
>>> And there is typo in error message.
>>> Allocated element is not page_data_buf.
>>>
>
> I agree.
>
>>>> + current = current->next;
>>>> + }
>>>> + current->next = info->page_flag_buf[i];
>>>> + }
>>>> +
>>>> + /*
>>>> * initial fd_memory for threads
>>>> */
>>>> for (i = 0; i < info->num_threads; i++) {
>>>> @@ -3612,7 +3645,8 @@ initial_for_parallel()
>>>> void
>>>> free_for_parallel()
>>>> {
>>>> - int i;
>>>> + int i, j;
>>>> + struct page_flag *current;
>>>>
>>>> if (info->threads != NULL) {
>>>> for (i = 0; i < info->num_threads; i++) {
>>>> @@ -3655,6 +3689,19 @@ free_for_parallel()
>>>> free(info->page_data_buf);
>>>> }
>>>>
>>>> + if (info->page_flag_buf != NULL) {
>>>> + for (i = 0; i < info->num_threads; i++) {
>>>> + for (j = 0; j < NUM_BUFFERS; j++) {
>>>> + if (info->page_flag_buf[i] != NULL) {
>>>> + current = info->page_flag_buf[i];
>>>> + info->page_flag_buf[i] = current->next;
>>>> + free(current);
>>>> + }
>>>> + }
>>>> + }
>>>> + free(info->page_flag_buf);
>>>> + }
>>>> +
>>>> if (info->parallel_info == NULL)
>>>> return;
>>>>
>>>> @@ -7076,10 +7123,10 @@ kdump_thread_function_cyclic(void *arg) {
>>>> void *retval = PTHREAD_FAIL;
>>>> struct thread_args *kdump_thread_args = (struct thread_args *)arg;
>>>> struct page_data *page_data_buf = kdump_thread_args->page_data_buf;
>>>> + struct page_flag *page_flag_buf = kdump_thread_args->page_flag_buf;
>>>> struct cycle *cycle = kdump_thread_args->cycle;
>>>> - int page_data_num = kdump_thread_args->page_data_num;
>>>> mdf_pfn_t pfn;
>>>> - int index;
>>>> + int index = kdump_thread_args->thread_num;
>>>> int buf_ready;
>>>> int dumpable;
>>>> int fd_memory = 0;
>>>> @@ -7125,47 +7172,47 @@ kdump_thread_function_cyclic(void *arg) {
>>>> kdump_thread_args->thread_num);
>>>> }
>>>>
>>>> - while (1) {
>>>> - /* get next pfn */
>>>> - pthread_mutex_lock(&info->current_pfn_mutex);
>>>> - pfn = info->current_pfn;
>>>> - info->current_pfn++;
>>>> - pthread_mutex_unlock(&info->current_pfn_mutex);
>>>> + /*
>>>> + * filtered page won't take anything
>>>> + * unfiltered zero page will only take a page_flag_buf
>>>> + * unfiltered non-zero page will take a page_flag_buf and a page_data_buf
>>>> + */
>>>> + while (page_flag_buf->pfn < kdump_thread_args->end_pfn) {
>>>
>>> At first, page_flag_buf->pfn is not initialized.
>>> I think this block should be replaced with the following code.
>>>
>>> ===
>>> do {
>>> :
>>> } while(page_flag_buf->pfn < kdump_thread_args->end_pfn)
>>> ===
>>
>> I'm sorry, above suggestion is meaningless in terms of page_flag_buf->pfn is uninitialized.
>> It should be replaced like following.
>>
>> ===
>> while (1) {
>> :
>> while (buf_ready == FALSE) {
>> :
>> if (pfn >= kdump_thread_args->end_pfn) {
>> :
>> goto finish;
>> }
>> :
>> }
>> :
>> }
>> finish:
>> ===
>>
>
> page_flag_buf is allocated by calloc().
> The page_flag_buf->pfn's value is 0.
> So I think it is not necessary to modify the code.
>
>> Thanks,
>> Minoru Usui
>>
>>
>>>> + buf_ready = FALSE;
>>>>
>>>> - if (pfn >= kdump_thread_args->end_pfn)
>>>> - break;
>>>> + while (page_data_buf[index].used != 0 ||
>>>> + pthread_mutex_trylock(&page_data_buf[index].mutex) != 0)
>>>> + index = (index + 1) % info->num_buffers;
>>>>
>>>> - index = -1;
>>>> - buf_ready = FALSE;
>>>> + page_data_buf[index].used = 1;
>>>
>>> "1" is a magic number.
>>> It should be changed TRUE or FALSE.
>>>
>
> I see.
>
>>>> while (buf_ready == FALSE) {
>>>> pthread_testcancel();
>>>> -
>>>> - index = pfn % page_data_num;
>>>> -
>>>> - if (pfn - info->consumed_pfn > info->num_buffers)
>>>> + if (page_flag_buf->ready == FLAG_READY)
>>>> continue;
>>>
>>> At first, page_flag_buf->ready is uninitialized, too.
>>> Should it be initialized in head part of this function, even if FLAG_UNUSED is defined 0?
>>>
>>>
>
> The same topic as the page_flag_buf is allocated by calloc().
>
>>>>
>>>> - if (page_data_buf[index].ready != 0)
>>>> - continue;
>>>> -
>>>> - pthread_mutex_lock(&page_data_buf[index].mutex);
>>>> + /* get next pfn */
>>>> + pthread_mutex_lock(&info->current_pfn_mutex);
>>>> + pfn = info->current_pfn;
>>>> + info->current_pfn++;
>>>> + page_flag_buf->ready = FLAG_FILLING;
>>>> + pthread_mutex_unlock(&info->current_pfn_mutex);
>>>>
>>>> - if (page_data_buf[index].ready != 0)
>>>> - goto unlock;
>>>> + page_flag_buf->pfn = pfn;
>>>
>>> It set FLAG_FILLING to page_flag_buf->ready before setting pfn to page_flag_buf->pfn.
>>> But consumer gets page_flag_buf->pfn after checking page_flag_buf->ready != FLAG_UNUSED
>>> in getting minimum pfn of each thread block.
>>> Should it set page_flag_buf->pfn first?
>>>
>
> Have you noticed the following code in the consumer?
> <cut>
> if (current_pfn == info->page_flag_buf[consuming]->pfn)
> break;
> <cut>
>
> The consumer will check if the pfn is changed after the page_flag_buf->ready turns to be FLAG_READY.
> So it's not important whether setting page_flag_buf->pfn first or not.
>
> In the other hand, even setting page_flag_buf->pfn first, if the pfn is not dumpable, the producer
> will also reset the page_flag_buf->pfn.
>
>>>>
>>>> - buf_ready = TRUE;
>>>> -
>>>> - page_data_buf[index].pfn = pfn;
>>>> - page_data_buf[index].ready = 1;
>>>> + if (pfn >= kdump_thread_args->end_pfn) {
>>>> + page_data_buf[index].used = 0;
>>>> + page_flag_buf->ready = FLAG_READY;
>>>> + info->current_pfn--;
>>>> + break;
>>>> + }
>>>
>>> This block decrements info->current_pfn without info->current_pfn_mutex.
>>> I think this block should be moved into previous pthread_mutex_lock(info->current_pfn_mutex) block, so it can remove.
>>>
>
> Why do you think it should have current_pfn_mutex?
>
> If pfn >= kdump_thread_args->end_pfn, info->current_pfn will always larger than
> kdump_thread_args->end_pfn. info->current_pfn-- won't affect anything.
>
> The decrement operation is for cyclic mode.
>
Sorry, it seems I was wrong.
It can't work well in cyclic mode.
I will fix it in the next version.
--
Thanks
Zhou
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
next prev parent reply other threads:[~2016-02-15 5:38 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-02-01 6:22 [PATCH v1] Improve the performance of --num-threads -d 31 Zhou Wenjian
2016-02-03 23:52 ` Minoru Usui
2016-02-08 5:00 ` Minoru Usui
2016-02-15 2:15 ` "Zhou, Wenjian/周文剑"
2016-02-15 5:36 ` "Zhou, Wenjian/周文剑" [this message]
2016-02-23 2:16 ` Minoru Usui
2016-02-23 3:52 ` "Zhou, Wenjian/周文剑"
2016-02-23 7:46 ` Minoru Usui
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=56C163F8.3070809@cn.fujitsu.com \
--to=zhouwj-fnst@cn.fujitsu.com \
--cc=kexec@lists.infradead.org \
--cc=min-usui@ti.jp.nec.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.