From: Adrian Hunter <adrian.hunter@intel.com>
To: "Wangnan (F)" <wangnan0@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>,
	David Ahern <dsahern@gmail.com>,
	Yunlong Song <yunlong.song@huawei.com>,
	a.p.zijlstra@chello.nl, paulus@samba.org, mingo@redhat.com,
	linux-kernel@vger.kernel.org, namhyung@kernel.org,
	ast@kernel.org, masami.hiramatsu.pt@hitachi.com,
	kan.liang@intel.com, jolsa@kernel.org, bp@alien8.de,
	jean.pihet@linaro.org, rric@kernel.org, xiakaixu@huawei.com,
	hekuang@huawei.com
Subject: Re: [PATCH] perf record: Add snapshot mode support for perf's regular events
Date: Wed, 25 Nov 2015 11:05:24 +0200
Message-ID: <565579D4.2030108@intel.com>
In-Reply-To: <565574B9.5090109@huawei.com>

On 25/11/15 10:43, Wangnan (F) wrote:
> 
> 
> On 2015/11/25 16:27, Adrian Hunter wrote:
>> On 25/11/15 09:47, Wangnan (F) wrote:
>>>
>>> On 2015/11/25 15:22, Adrian Hunter wrote:
>>>> On 25/11/15 05:50, Wangnan (F) wrote:
>>>>> On 2015/11/24 23:20, Arnaldo Carvalho de Melo wrote:
>>>>>> Em Tue, Nov 24, 2015 at 08:06:41AM -0700, David Ahern escreveu:
>>>>>>> On 11/24/15 7:00 AM, Yunlong Song wrote:
>>>>>>>> +static int record__write(struct record *rec, void *bf, size_t size)
>>>>>>>> +{
>>>>>>>> +    if (rec->memory.size && memory_enabled) {
>>>>>>>> +        if (perf_memory__write(&rec->memory, bf, size) < 0) {
>>>>>>>> +            pr_err("failed to write memory data, error: %m\n");
>>>>>>>> +            return -1;
>>>>>>>> +        }
>>>>>>>> +    } else {
>>>>>>>> +        if (perf_data_file__write(rec->session->file, bf, size) < 0) {
>>>>>>>> +            pr_err("failed to write perf data, error: %m\n");
>>>>>>>> +            return -1;
>>>>>>>> +        }
>>>>>>>> +        rec->bytes_written += size;
>>>>>>>>         }
>>>>>>>>
>>>>>>>> -    rec->bytes_written += size;
>>>>>>>>         return 0;
>>>>>>>>     }
>>>>>>>>
>>>>>>>> @@ -86,6 +214,8 @@ static int record__mmap_read(struct record *rec, int idx)
>>>>>>>>         if (old == head)
>>>>>>>>             return 0;
>>>>>>>>
>>>>>>>> +    memory_enabled = 1;
>>>>>>>> +
>>>>>>>>         rec->samples++;
>>>>>>>>
>>>>>>>>         size = head - old;
>>>>>>>> @@ -113,6 +243,7 @@ static int record__mmap_read(struct record *rec, int idx)
>>>>>>>>         md->prev = old;
>>>>>>>>         perf_evlist__mmap_consume(rec->evlist, idx);
>>>>>>>>     out:
>>>>>>>> +    memory_enabled = 0;
>>>>>>>>         return rc;
>>>>>>>>     }
>>>>>>>>
>>>>>>> So you are basically ignoring all samples until SIGUSR2 is received. That
>>>>>> No, he is not, it's just that his code is difficult to follow, has to be
>>>>>> rewritten, but he is ignoring just PERF_RECORD_SAMPLE events, so it
>>>>>> will..
>>>>>>
>>>>>>> means the resulting data file will have limited history of task events for
>>>>>> ... have a complete history of task events, since PERF_RECORD_FORK, etc
>>>>>> are not being ignored.
>>>>>>
>>>>>> No?
>>>>> Actually, we are discussing this problem.
>>>>>
>>>>> For such tracking events (PERF_RECORD_FORK...), we have the dummy event, so
>>>>> it is possible for us to receive tracking events from a separate
>>>>> channel; therefore we don't have to parse every event to pick those
>>>>> events out. Instead, we can process tracking events differently, and then
>>>>> more interesting things can be done. For example, squashing those tracking
>>>>> events if they take too much memory...
>>>>>
>>>>> Furthermore, there's another problem being discussed: if the userspace
>>>>> ring buffer is byte-based, parsing events is unavoidable. Without parsing
>>>>> events we are unable to find the new 'head' pointer when overwriting.
>>>> Have you considered trying to find the head by trial-and-error at the time
>>>> you make the snapshot i.e. look at the first 8 bytes (event records are 8
>>>> byte aligned) and see if it is a valid record header, if not try the next 8
>>>> bytes.  When you find a real event record it should parse without error and
>>>> the subsequent events should all parse without error too, all the way to
>>>> the tail.  Then you can use timestamps and compare the events byte-by-byte
>>>> to avoid overlaps between 2 snapshots.
>>> It seems this will not work. Now that we have the BPF output event, a BPF
>>> program can output anything through that event. Even if we put a magic
>>> number at the head of each event, we can't prevent a BPF output event from
>>> emitting that magic, unless we introduce some 'escape' method to keep BPF
>>> output events from emitting certain data patterns. So although it might
>>> work in real life, this solution is logically incorrect. Or am I missing
>>> something?
>> When you find the head, all the events will parse correctly.  It seems to me
>> highly unlikely that would happen if you guessed the head wrongly.
>> It is only incorrect if it gives the wrong result.
> 
> Right, that is why I said it might work in real life. However, I think we
> had better try to provide a logically correct solution.
> Also, 'guessing' implies some sort of intelligence; how do we
> deal with guessing errors? Simply drop them?

It is not "intelligence"; it is a linear search.  If it gives more than one
answer, it is a fatal error.  You can mitigate that by adding more
validation of the event records.

But it is only a suggestion.
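
To make the suggestion concrete, here is a rough sketch of the linear
search.  It is only an illustration, not code from perf: struct rec_header
is a hypothetical stand-in for the 8-byte record header, REC_TYPE_MAX is an
assumed bound on valid types, and the buffer is assumed to be already
linearized so that the tail is at the end.  "More than one answer" is
interpreted here as two distinct record chains that both parse all the way
to the tail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical mirror of an 8-byte event record header. */
struct rec_header {
	uint32_t type;
	uint16_t misc;
	uint16_t size;	/* total record size, header included */
};

#define REC_ALIGN	8u
#define REC_TYPE_MAX	64u	/* assumed upper bound on valid record types */

/* Minimal validation; more checks here reduce the chance of a false match. */
static bool header_ok(const struct rec_header *h, size_t remaining)
{
	return h->type != 0 && h->type < REC_TYPE_MAX &&
	       h->size >= sizeof(*h) && h->size % REC_ALIGN == 0 &&
	       h->size <= remaining;
}

/*
 * Walk records starting at 'off'.  Returns true only if every record
 * validates and the walk ends exactly at 'len' (the tail).  If 'mark' is
 * non-NULL, each record start on the chain is flagged in it.
 */
static bool chain_reaches_tail(const uint8_t *buf, size_t len, size_t off,
			       bool *mark)
{
	assert(off % REC_ALIGN == 0);

	while (off < len) {
		struct rec_header h;

		if (len - off < sizeof(h))
			return false;
		memcpy(&h, buf + off, sizeof(h));
		if (!header_ok(&h, len - off))
			return false;
		if (mark)
			mark[off / REC_ALIGN] = true;
		off += h.size;
	}
	return true;
}

/*
 * Linear search for the head.  Returns the head offset, -1 if no offset
 * parses to the tail, or -2 if a second, distinct chain also parses to
 * the tail (treated as a fatal ambiguity).
 */
static long find_head(const uint8_t *buf, size_t len)
{
	size_t slots = len / REC_ALIGN;
	bool *on_chain = calloc(slots ? slots : 1, sizeof(*on_chain));
	long head = -1;

	if (!on_chain)
		return -1;

	for (size_t off = 0; off + sizeof(struct rec_header) <= len;
	     off += REC_ALIGN) {
		if (on_chain[off / REC_ALIGN])
			continue;	/* already part of the accepted chain */
		if (!chain_reaches_tail(buf, len, off, NULL))
			continue;
		if (head >= 0) {
			head = -2;	/* more than one answer */
			break;
		}
		head = (long)off;
		chain_reaches_tail(buf, len, off, on_chain);
	}
	free(on_chain);
	return head;
}
```

A caller would stop the snapshot on -2 rather than pick one of the
candidates.  Tightening header_ok() (for example, checking misc bits or
per-type minimum sizes) is where the extra validation would go.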

> And what's your opinion on the bucket-based ring buffer? With that
> design we only need to maintain a ring buffer of pointers. It should
> be much simpler. The only drawback I can imagine is the waste of memory,
> because we have to allocate buckets pessimistically. Do you think
> that method has other problems I haven't considered?

The drawback is that you have to copy all the events all the time instead of
letting the kernel ring buffer wrap around without any userspace involvement
until you make a snapshot.
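
For reference, a minimal sketch of what a bucket-based ring of pointers
might look like, to show where the copy happens.  This is an assumption
about the design, not code from the patch: struct bucket_ring, BUCKET_SIZE
and the bucket_ring_*() helpers are all made-up names, and every bucket is
allocated at an assumed worst-case event size.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define BUCKET_SIZE 256u	/* assumed worst-case event size */

/* Hypothetical userspace ring: fixed-size buckets reached through pointers. */
struct bucket_ring {
	uint8_t **slot;	/* ring of pointers to preallocated buckets */
	size_t *len;	/* bytes used in each bucket */
	size_t nr;	/* number of buckets */
	size_t next;	/* bucket that the next push overwrites */
	size_t used;	/* number of buckets holding an event */
};

static void bucket_ring_free(struct bucket_ring *r)
{
	if (r->slot)
		for (size_t i = 0; i < r->nr; i++)
			free(r->slot[i]);
	free(r->slot);
	free(r->len);
	memset(r, 0, sizeof(*r));
}

static int bucket_ring_init(struct bucket_ring *r, size_t nr)
{
	memset(r, 0, sizeof(*r));
	if (nr == 0)
		return -1;
	r->slot = calloc(nr, sizeof(*r->slot));
	r->len = calloc(nr, sizeof(*r->len));
	if (!r->slot || !r->len)
		goto fail;
	r->nr = nr;
	/* Pessimistic allocation: every bucket is worst-case sized. */
	for (size_t i = 0; i < nr; i++) {
		r->slot[i] = malloc(BUCKET_SIZE);
		if (!r->slot[i])
			goto fail;
	}
	return 0;
fail:
	bucket_ring_free(r);
	return -1;
}

/* Every event is copied into a bucket, overwriting the oldest when full. */
static int bucket_ring_push(struct bucket_ring *r, const void *ev, size_t size)
{
	if (size > BUCKET_SIZE)
		return -1;
	memcpy(r->slot[r->next], ev, size);
	r->len[r->next] = size;
	r->next = (r->next + 1) % r->nr;
	if (r->used < r->nr)
		r->used++;
	return 0;
}

/* Returns the i-th oldest stored event, or NULL if i is out of range. */
static const uint8_t *bucket_ring_get(const struct bucket_ring *r, size_t i,
				      size_t *size)
{
	size_t start = r->used < r->nr ? 0 : r->next;
	size_t idx;

	if (i >= r->used)
		return NULL;
	idx = (start + i) % r->nr;
	*size = r->len[idx];
	return r->slot[idx];
}
```

The memcpy() in bucket_ring_push() runs for every event, whether or not a
snapshot is ever taken, which is the cost described above.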


