From: "Yordan Karadzhov (VMware)" <y.karadz@gmail.com>
To: Steven Rostedt <rostedt@goodmis.org>,
Yordan Karadzhov <ykaradzhov@vmware.com>
Cc: linux-trace-devel@vger.kernel.org
Subject: Re: [PATCH 1/5] kernel-shark: Add new dataloading method to be used by the NumPu interface
Date: Thu, 30 May 2019 17:25:51 +0300 [thread overview]
Message-ID: <c25067b7-2eac-863a-406d-9e3ede76cfb0@gmail.com> (raw)
In-Reply-To: <20190529133605.214b766f@oasis.local.home>
On 29.05.19 г. 20:36 ч., Steven Rostedt wrote:
> On Thu, 23 May 2019 18:18:08 +0300
> Yordan Karadzhov <ykaradzhov@vmware.com> wrote:
>
>> The new function loads the content of the trace data file into a
>> table / matrix, made of columns / arrays of data having various integer
>> types. Later those arrays will be wrapped as NumPy arrays.
>>
>> Signed-off-by: Yordan Karadzhov <ykaradzhov@vmware.com>
>> ---
>> kernel-shark/src/libkshark.c | 155 +++++++++++++++++++++++++++++++++++
>> kernel-shark/src/libkshark.h | 7 ++
>> 2 files changed, 162 insertions(+)
>>
>> diff --git a/kernel-shark/src/libkshark.c b/kernel-shark/src/libkshark.c
>> index 175279c..ac634fd 100644
>> --- a/kernel-shark/src/libkshark.c
>> +++ b/kernel-shark/src/libkshark.c
>> @@ -957,6 +957,161 @@ ssize_t kshark_load_data_records(struct kshark_context *kshark_ctx,
>> return -ENOMEM;
>> }
>>
>> +static bool data_matrix_alloc(size_t n_rows, uint64_t **offset_array,
>> + uint8_t **cpu_array,
>> + uint64_t **ts_array,
>> + uint16_t **pid_array,
>> + int **event_array)
>> +{
>> + if (offset_array)
>> + *offset_array = NULL;
>> +
>> + if (cpu_array)
>> + *cpu_array = NULL;
>> +
>> + if (ts_array)
>> + *ts_array = NULL;
>> +
>> + if (pid_array)
>> + *pid_array = NULL;
>> +
>> + if (event_array)
>> + *event_array = NULL;
>> +
>> + if (offset_array) {
>
> The way you can do this, is remove the above and have:
>
>> + *offset_array = calloc(n_rows, sizeof(**offset_array));
>> + if (!*offset_array)
>
> return false;
>
>> + goto free_all;
>> + }
>> +
>> + if (cpu_array) {
>> + *cpu_array = calloc(n_rows, sizeof(**cpu_array));
>> + if (!*cpu_array)
>
> goto free_offset;
>
>> + goto free_all;
>> + }
>> +
>> + if (ts_array) {
>> + *ts_array = calloc(n_rows, sizeof(**ts_array));
>> + if (!*ts_array)
>
> goto free_cpu;
>
>> + goto free_all;
>> + }
>> +
>> + if (pid_array) {
>> + *pid_array = calloc(n_rows, sizeof(**pid_array));
>> + if (!*pid_array)
>
> goto free_ts;
>
>> + goto free_all;
>> + }
>> +
>> + if (event_array) {
>> + *event_array = calloc(n_rows, sizeof(**event_array));
>> + if (!*event_array)
>
> goto free_pid;
>
>> + goto free_all;
>> + }
>> +
>> + return true;
>> +
>
> We can have a helper function:
>
> static inline void free_ptr(void *ptr)
> {
> if (ptr)
> free(*(void **)ptr);
> }
>
> free_pid:
> free_ptr(pid_array);
> free_ts:
> free_ptr(ts_array);
> free_cpu:
> free_ptr(cpu_array);
> free_offset:
> free_ptr(offset_array);
>
> Then have the print here.
>
>> + free_all:
>> + fprintf(stderr, "Failed to allocate memory during data loading.\n");
>
> return false;
>
> This is the way it's usually done in the Linux kernel.
>
Great! Really elegant solution.
>> +
>> + if (offset_array)
>> + free(*offset_array);
>> +
>> + if (cpu_array)
>> + free(*cpu_array);
>> +
>> + if (ts_array)
>> + free(*ts_array);
>> +
>> + if (pid_array)
>> + free(*pid_array);
>> +
>> + if (event_array)
>> + free(*event_array);
>> +
>> + return false;
>> +}
>> +
>> +/**
>> + * @brief Load the content of the trace data file into a table / matrix made
>> + * of columns / arrays of data. The user is responsible for freeing the
>> + * elements of the outputted array
>> + *
>> + * @param kshark_ctx: Input location for the session context pointer.
>> + * @param offset_array: Output location for the array of record offsets.
>> + * @param cpu_array: Output location for the array of CPU Ids.
>> + * @param ts_array: Output location for the array of timestamps.
>> + * @param pid_array: Output location for the array of Process Ids.
>> + * @param event_array: Output location for the array of Event Ids.
>> + *
>> + * @returns The size of the outputted arrays in the case of success, or a
>> + * negative error code on failure.
>> + */
>> +size_t kshark_load_data_matrix(struct kshark_context *kshark_ctx,
>> + uint64_t **offset_array,
>> + uint8_t **cpu_array,
>> + uint64_t **ts_array,
>> + uint16_t **pid_array,
>> + int **event_array)
>> +{
>> + enum rec_type type = REC_ENTRY;
>> + struct rec_list **rec_list;
>> + size_t count, total = 0;
>> + bool status;
>> + int n_cpus;
>> +
>> + total = get_records(kshark_ctx, &rec_list, type);
>> + if (total < 0)
>> + goto fail;
>> +
>> + n_cpus = tracecmd_cpus(kshark_ctx->handle);
>> +
>> + status = data_matrix_alloc(total, offset_array,
>> + cpu_array,
>> + ts_array,
>> + pid_array,
>> + event_array);
>
> BTW, have you looked into how much memory this takes up in a large
> trace?
One record takes 24 bytes. So it will allocate total*24 bytes (or less
if some of the input arguments is NULL).
>
>> + if (!status)
>> + goto fail_free;
>> +
>> + for (count = 0; count < total; count++) {
>> + int next_cpu;
>> +
>> + next_cpu = pick_next_cpu(rec_list, n_cpus, type);
>> + if (next_cpu >= 0) {
>> + struct kshark_entry *e = &rec_list[next_cpu]->entry;
>
> Hmm, this looks like we are taking an address of a field and then
> freeing it down below. Looking at the definition of rec_list, this is
> currently OK. But this coding style is not robust because of the tight
> dependency to how rec_list is defined. If that ever changes it will be
> hard to find code like this to update it.
>
> A more robust way to do this is:
>
> struct rec_list *rec = rec_list[next_cpu];
> struct kshark_entry *e = &rec->entry;
>
>> +
>> + if (offset_array)
>> + (*offset_array)[count] = e->offset;
>> +
>> + if (cpu_array)
>> + (*cpu_array)[count] = e->cpu;
>> +
>> + if (ts_array)
>> + (*ts_array)[count] = e->ts;
>> +
>> + if (pid_array)
>> + (*pid_array)[count] = e->pid;
>> +
>> + if (event_array)
>> + (*event_array)[count] = e->event_id;
>> +
>> + rec_list[next_cpu] = rec_list[next_cpu]->next;
>
> Then here:
>
> free(rec);
>
> That way there's not a dependency here with the data structure layout
> of rec_list and the freeing of the entry.
>
> -- Steve
>
Thanks a lot! Sending updated version.
Y.
>
>> + free(e);
>> + }
>> + }
>> +
>> + /* There should be no entries left in rec_list. */
>> + free_rec_list(rec_list, n_cpus, type);
>> + return total;
>> +
>> + fail_free:
>> + free_rec_list(rec_list, n_cpus, type);
>> +
>> + fail:
>> + fprintf(stderr, "Failed to allocate memory during data
>> loading.\n");
>> + return -ENOMEM;
>> +}
>> +
>> static const char *kshark_get_latency(struct tep_handle *pe,
>> struct tep_record *record)
>> {
>> diff --git a/kernel-shark/src/libkshark.h
>> b/kernel-shark/src/libkshark.h index c218b61..92ade41 100644
>> --- a/kernel-shark/src/libkshark.h
>> +++ b/kernel-shark/src/libkshark.h
>> @@ -149,6 +149,13 @@ ssize_t kshark_load_data_entries(struct
>> kshark_context *kshark_ctx, ssize_t kshark_load_data_records(struct
>> kshark_context *kshark_ctx, struct tep_record ***data_rows);
>>
>> +size_t kshark_load_data_matrix(struct kshark_context *kshark_ctx,
>> + uint64_t **offset_array,
>> + uint8_t **cpu_array,
>> + uint64_t **ts_array,
>> + uint16_t **pid_array,
>> + int **event_array);
>> +
>> ssize_t kshark_get_task_pids(struct kshark_context *kshark_ctx, int
>> **pids);
>> void kshark_close(struct kshark_context *kshark_ctx);
>
next prev parent reply other threads:[~2019-05-30 14:25 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-05-23 15:18 [PATCH 0/5] Nodifications needed by the NumPy interface Yordan Karadzhov
2019-05-23 15:18 ` [PATCH 1/5] kernel-shark: Add new dataloading method to be used by the NumPu interface Yordan Karadzhov
2019-05-29 17:36 ` Steven Rostedt
2019-05-30 14:25 ` Yordan Karadzhov (VMware) [this message]
2019-05-23 15:18 ` [PATCH 2/5] kernel-shark: Use full paths for non-standard library headers Yordan Karadzhov
2019-05-23 15:18 ` [PATCH 3/5] kernel-shark: Use the shared versions of the trace-cmd libraries Yordan Karadzhov
2019-05-23 15:18 ` [PATCH 4/5] trace-cmd: Install public interface headers Yordan Karadzhov
2019-05-23 15:18 ` [PATCH 5/5] kernel-shark: " Yordan Karadzhov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=c25067b7-2eac-863a-406d-9e3ede76cfb0@gmail.com \
--to=y.karadz@gmail.com \
--cc=linux-trace-devel@vger.kernel.org \
--cc=rostedt@goodmis.org \
--cc=ykaradzhov@vmware.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).