public inbox for intel-gfx@lists.freedesktop.org
 help / color / mirror / Atom feed
From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
To: "Goel, Akash" <akash.goel@intel.com>, intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH v4] tools/intel_guc_logger: Utility for capturing GuC firmware logs in a file
Date: Wed, 12 Oct 2016 14:32:17 +0100	[thread overview]
Message-ID: <0fcf3442-1697-e83a-55c1-f57c19fe9c22@linux.intel.com> (raw)
In-Reply-To: <080dacb9-dff0-4963-af40-c2d47b497c8c@intel.com>


On 10/10/2016 15:31, Goel, Akash wrote:
>
>
> On 10/10/2016 7:22 PM, Tvrtko Ursulin wrote:
>>
>> On 10/10/2016 11:59, akash.goel@intel.com wrote:
>>> From: Akash Goel <akash.goel@intel.com>
>>>
>>> This patch provides a test utility which helps capture GuC firmware
>>> logs and
>>> then dump them to file.
>>> The logs are pulled from a debugfs file
>>> '/sys/kernel/debug/dri/guc_log' and
>>> by default stored into a file 'guc_log_dump.dat'. The name, 
>>> including the
>>> location, of the output file can be changed through a command line
>>> argument.
>>>
>>> The utility goes into an infinite loop where it waits for the arrival
>>> of new
>>> logs and as soon as new set of logs are produced it captures them in
>>> its local
>>> buffer which is then flushed out to the file on disk.
>>> Any time when logging needs to be ended, User can stop this utility
>>> (CTRL+C).
>>>
>>> Before entering into a loop, it first discards whatever logs are
>>> present in
>>> the debugfs file.
>>> This way User can first launch this utility and then start a
>>> workload/activity
>>> for which GuC firmware logs are to be actually captured and keep
>>> running the
>>> utility for as long as its needed, like once the workload is over this
>>> utility
>>> can be forcefully stopped.
>>>
>>> If the logging wasn't enabled on GuC side by the Driver at boot time,
>>> utility
>>> will first enable the logging and later on when it is stopped (CTRL+C)
>>> it will
>>> also pause the logging on GuC side.
>>>
>>> v2:
>>> - Use combination of alarm system call & SIGALRM signal to run the
>>> utility
>>>    for required duration. (Tvrtko)
>>> - Fix inconsistencies, do minor cleanup and refactoring. (Tvrtko)
>>>
>>> v3:
>>> - Fix discrepancy for the output file command line option and update 
>>> the
>>>    Usage/help string.
>>>
>>> v4:
>>> - Update the exit condition for flusher thread, now will exit only 
>>> after
>>>    the capture loop is over and not when the flag to stop logging is 
>>> set.
>>>    This handles a corner case, due to which the dump of last captured
>>> buffer
>>>    was getting missed.
>>> - Add a newline character at the end of assert messages.
>>> - Avoid the assert for the case, which occurs very rarely, when there
>>> are no
>>>    bytes read from the relay file.
>>>
>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
>>> Signed-off-by: Akash Goel <akash.goel@intel.com>
>>> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> (v3)
>>> ---
>>>   tools/Makefile.sources   |   1 +
>>>   tools/intel_guc_logger.c | 438
>>> +++++++++++++++++++++++++++++++++++++++++++++++
>>>   2 files changed, 439 insertions(+)
>>>   create mode 100644 tools/intel_guc_logger.c
>>>
>>> diff --git a/tools/Makefile.sources b/tools/Makefile.sources
>>> index 2bb6c8e..be58871 100644
>>> --- a/tools/Makefile.sources
>>> +++ b/tools/Makefile.sources
>>> @@ -19,6 +19,7 @@ tools_prog_lists =        \
>>>       intel_gpu_time        \
>>>       intel_gpu_top        \
>>>       intel_gtt        \
>>> +    intel_guc_logger        \
>>>       intel_infoframes    \
>>>       intel_l3_parity        \
>>>       intel_lid        \
>>> diff --git a/tools/intel_guc_logger.c b/tools/intel_guc_logger.c
>>> new file mode 100644
>>> index 0000000..159a54e
>>> --- /dev/null
>>> +++ b/tools/intel_guc_logger.c
>>> @@ -0,0 +1,438 @@
>>> +
>>> +#define _GNU_SOURCE  /* For using O_DIRECT */
>>> +#include <inttypes.h>
>>> +#include <stdio.h>
>>> +#include <fcntl.h>
>>> +#include <sched.h>
>>> +#include <sys/ioctl.h>
>>> +#include <string.h>
>>> +#include <errno.h>
>>> +#include <signal.h>
>>> +#include <stdlib.h>
>>> +#include <unistd.h>
>>> +#include <sys/poll.h>
>>> +#include <sys/wait.h>
>>> +#include <sys/types.h>
>>> +#include <sys/syscall.h>
>>> +#include <sys/mman.h>
>>> +#include <assert.h>
>>> +#include <pthread.h>
>>> +
>>> +#include "igt.h"
>>> +
>>> +#define MB(x) ((uint64_t)(x) * 1024 * 1024)
>>> +#ifndef PAGE_SIZE
>>> +  #define PAGE_SIZE 4096
>>> +#endif
>>> +/* Currently the size of GuC log buffer is 19 pages & so is the size
>>> of relay
>>> + * subbuffer. If the size changes in future, then this define also
>>> needs to be
>>> + * updated accordingly.
>>> + */
>>> +#define SUBBUF_SIZE (19*PAGE_SIZE)
>>> +/* Need large buffering from logger side to hide the DISK IO latency,
>>> Driver
>>> + * can only store 8 snapshots of GuC log buffer in relay.
>>> + */
>>> +#define NUM_SUBBUFS 100
>>> +
>>> +#define RELAY_FILE_NAME  "guc_log"
>>> +#define DEFAULT_OUTPUT_FILE_NAME  "guc_log_dump.dat"
>>> +#define CONTROL_FILE_NAME "i915_guc_log_control"
>>> +
>>> +char *read_buffer;
>>> +char *out_filename;
>>> +int poll_timeout = 2; /* by default 2ms timeout */
>>> +pthread_mutex_t mutex;
>>> +pthread_t flush_thread;
>>> +int verbosity_level = 3; /* by default capture logs at max 
>>> verbosity */
>>> +uint32_t produced, consumed;
>>> +uint64_t total_bytes_written;
>>> +int num_buffers = NUM_SUBBUFS;
>>> +int relay_fd, outfile_fd = -1;
>>> +uint32_t test_duration, max_filesize;
>>> +pthread_cond_t underflow_cond, overflow_cond;
>>> +bool stop_logging, discard_oldlogs, capturing_stopped;
>>> +
>>> +static void guc_log_control(bool enable_logging)
>>> +{
>>> +    int control_fd;
>>> +    char data[19];
>>> +    uint64_t val;
>>> +    int ret;
>>> +
>>> +    control_fd = igt_debugfs_open(CONTROL_FILE_NAME, O_WRONLY);
>>> +    igt_assert_f(control_fd >= 0, "couldn't open the guc log control
>>> file\n");
>>> +
>>> +    val = enable_logging ? ((verbosity_level << 4) | 0x1) : 0;
>>> +
>>> +    ret = snprintf(data, sizeof(data), "0x%" PRIx64, val);
>>> +    igt_assert(ret > 2 && ret < sizeof(data));
>>> +
>>> +    ret = write(control_fd, data, ret);
>>> +    igt_assert_f(ret > 0, "couldn't write to the log control file\n");
>>> +
>>> +    close(control_fd);
>>> +}
>>> +
>>> +static void int_sig_handler(int sig)
>>> +{
>>> +    igt_info("received signal %d\n", sig);
>>> +
>>> +    stop_logging = true;
>>> +}
>>> +
>>> +static void pull_leftover_data(void)
>>> +{
>>> +    unsigned int bytes_read = 0;
>>> +    int ret;
>>> +
>>> +    do {
>>> +        /* Read the logs from relay buffer */
>>> +        ret = read(relay_fd, read_buffer, SUBBUF_SIZE);
>>> +        if (!ret)
>>> +            break;
>>> +
>>> +        igt_assert_f(ret > 0, "failed to read from the guc log 
>>> file\n");
>>> +        igt_assert_f(ret == SUBBUF_SIZE, "invalid read from relay
>>> file\n");
>>> +
>>> +        bytes_read += ret;
>>> +
>>> +        if (outfile_fd >= 0) {
>>> +            ret = write(outfile_fd, read_buffer, SUBBUF_SIZE);
>>> +            igt_assert_f(ret == SUBBUF_SIZE, "couldn't dump the logs
>>> in a file\n");
>>> +            total_bytes_written += ret;
>>> +        }
>>> +    } while(1);
>>> +
>>> +    igt_debug("%u bytes flushed\n", bytes_read);
>>> +}
>>> +
>>> +static int num_filled_bufs(void)
>>> +{
>>> +    return (produced - consumed);
>>> +}
>>> +
>>> +static void pull_data(void)
>>> +{
>>> +    char *ptr;
>>> +    int ret;
>>> +
>>> +    pthread_mutex_lock(&mutex);
>>> +    while (num_filled_bufs() >= num_buffers) {
>>> +        igt_debug("overflow, will wait, produced %u, consumed %u\n",
>>> produced, consumed);
>>> +        /* Stall the main thread in case of overflow, as there are no
>>> +         * buffers available to store the new logs, otherwise there
>>> +         * could be corruption if both threads work on the same 
>>> buffer.
>>> +         */
>>> +        pthread_cond_wait(&overflow_cond, &mutex);
>>> +    };
>>> +    pthread_mutex_unlock(&mutex);
>>> +
>>> +    ptr = read_buffer + (produced % num_buffers) * SUBBUF_SIZE;
>>> +
>>> +    /* Read the logs from relay buffer */
>>> +    ret = read(relay_fd, ptr, SUBBUF_SIZE);
>>> +    igt_assert_f(ret >= 0, "failed to read from the guc log file\n");
>>> +    igt_assert_f(!ret || ret == SUBBUF_SIZE, "invalid read from relay
>>> file\n");
>>> +
>>> +    if (ret) {
>>> +        pthread_mutex_lock(&mutex);
>>> +        produced++;
>>> +        pthread_cond_signal(&underflow_cond);
>>> +        pthread_mutex_unlock(&mutex);
>>> +    } else {
>>> +        /* Occasionally (very rare) read from the relay file 
>>> returns no
>>> +         * data, albeit the polling done prior to read call indicated
>>> +         * availability of data.
>>> +         */
>>> +        igt_debug("no data read from the relay file\n");
>>
>> This worries me. Relayfs kernel side is quite simple which would suggest
>> there is maybe a bug in the logger? And it is very important that we
>> trust logger correctly capturing data. Because people will spend a lot
>> of time debugging issues from the logs and it would be bad to have to
>> chase irregularities caused by a problem in the logger.
>>
> Actually I see this issue rarely and only after logging some GBs of 
> data. And most likely it won't have any bearing on the correctness of 
> captured data, the missed snapshot shall be pulled in the next 
> iteration of capture loop. Also no-overwrite mode of relay is being used.
>
> As per my hunch, it can't be ruled out that this is not a relayfs issue.
>
>> Could you please try and identify why exactly is this happening?
> Will try to investigate both from relayfs side as well as logger side, 
> but it would take some time.
> This may not be considered a blocker.
>

As you reported off list, after some deep investigation there seems to 
be no potential for lost packets. "Empty wakeup" is just a consequence 
of unsynchronized nature of relayfs.

In the light of that I am happy with the logger, so:

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

  reply	other threads:[~2016-10-12 13:32 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-09-06 10:43 [PATCH] tools/intel_guc_logger: Utility for capturing GuC firmware logs in a file akash.goel
2016-09-06 13:17 ` Tvrtko Ursulin
2016-09-06 15:33   ` Goel, Akash
2016-09-06 15:52     ` Tvrtko Ursulin
2016-09-07  8:10       ` Goel, Akash
2016-09-07  8:44         ` Chris Wilson
2016-09-07  9:37           ` Tvrtko Ursulin
2016-09-07  9:55             ` Chris Wilson
2016-09-07 15:27               ` [PATCH v2] " akash.goel
2016-09-07 15:39                 ` Tvrtko Ursulin
2016-09-08 10:44                   ` [PATCH v3] " akash.goel
2016-10-10 10:59                     ` [PATCH v4] " akash.goel
2016-10-10 13:52                       ` Tvrtko Ursulin
2016-10-10 14:31                         ` Goel, Akash
2016-10-12 13:32                           ` Tvrtko Ursulin [this message]
2016-10-25  9:01                             ` Petri Latvala
2016-09-07 10:45             ` [PATCH] " Goel, Akash

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=0fcf3442-1697-e83a-55c1-f57c19fe9c22@linux.intel.com \
    --to=tvrtko.ursulin@linux.intel.com \
    --cc=akash.goel@intel.com \
    --cc=intel-gfx@lists.freedesktop.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox