1 of 23 performance problems at 30 to 40 percent system load

public inbox for linux-rt-users@vger.kernel.org

* 1 of 23 performance problems at 30 to 40 percent system load
From: Joël Krähemann @ 2015-07-17 20:05 UTC
  To: linux-rt-users

Hi all,

My name is Joël Krähemann, I'm developing:

http://gsequencer.org

and I'm using:

Linux debian 4.0.5 #1 SMP PREEMPT Sat Jul 11 16:32:49 CEST 2015 x86_64 GNU/Linux

I currently run into performance problems on my system at 30 to 40 %
system load; all 8 virtual CPUs show the same average load.

* Raising the priority with `chrt` doesn't give the desired throughput.
* `taskset` reports ff as the default affinity mask.
* `cpufreq-set -g performance` brings a small improvement for the first few seconds.
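
A minimal sketch, not part of the original message, of how the values
behind `taskset` and `chrt` can be read from inside a process. It uses
Python's standard `os` scheduler interface, which is Linux-only; the
helper names `cpus_to_mask` and `describe_self` are hypothetical.

```python
import os


def cpus_to_mask(cpus) -> str:
    """Encode an iterable of CPU indices as a hex bitmask (CPU n -> bit n)."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return format(mask, "x")


def describe_self() -> dict:
    """Report affinity, policy and RT priority of the calling process (Linux only)."""
    policies = {
        os.SCHED_OTHER: "SCHED_OTHER",
        os.SCHED_FIFO: "SCHED_FIFO",
        os.SCHED_RR: "SCHED_RR",
    }
    # Strip the optional reset-on-fork flag before looking up the policy.
    policy = os.sched_getscheduler(0) & ~getattr(os, "SCHED_RESET_ON_FORK", 0)
    return {
        "affinity": cpus_to_mask(os.sched_getaffinity(0)),
        "policy": policies.get(policy, str(policy)),
        "rt_priority": os.sched_getparam(0).sched_priority,
    }


if __name__ == "__main__":
    print(cpus_to_mask(range(8)))  # prints ff: all 8 CPUs allowed
    if hasattr(os, "sched_getaffinity"):
        print(describe_self())
```

A mask of ff only says that the scheduler may place the process on any
of the 8 CPUs; it does not by itself say anything about throughput.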

This is my CPU:

Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                8
On-line CPU(s) list:   0-7
Thread(s) per core:    2
Core(s) per socket:    4
Socket(s):             1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 58
Model name:            Intel(R) Core(TM) i7-3740QM CPU @ 2.70GHz
Stepping:              9
CPU MHz:               3275.648
CPU max MHz:           3700.0000
CPU min MHz:           1200.0000
BogoMIPS:              5387.58
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              6144K
NUMA node0 CPU(s):     0-7

GSequencer uses many threads but doesn't stay in realtime. What am I
doing wrong? Or how can I get more performance out of the system?

On the code side, I could imagine that varying the frequency
segmentation would bring better throughput, by modifying
AGS_THREAD_DEFAULT_JIFFIE, AGS_THREAD_MAX_PRECISION and related values.

But for now I am looking for documentation about Linux kernel performance counters.

Best regards,
Joël Krähemann
--
To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

* Re: 1 of 23 performance problems at 30 to 40 percent system load
From: linuxball @ 2015-07-20 12:43 UTC
  To: jkraehemann-guest, linux-rt-users

Hi Joël,

see remarks in the context below.

Best regards,

Wolfgang

On 17.07.2015 22:05, Joël Krähemann wrote:
> Hi all,
>
> My name is Joël Krähemann, I'm developing:
>
> http://gsequencer.org
>
> and I'm using:
>
> Linux debian 4.0.5 #1 SMP PREEMPT Sat Jul 11 16:32:49 CEST 2015 x86_64 GNU/Linux
From the output it seems that the kernel you are using is NOT an RT
kernel (otherwise "uname -v" should say "#1 SMP PREEMPT RT ..."). If you
want to use an RT kernel, you should build the kernel with
PREEMPT_RT_FULL enabled (the option is provided by the PREEMPT_RT patch
set).
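
A minimal sketch, not part of the original reply, of the check described
above: look for "PREEMPT RT" in the version string that "uname -v"
prints. The function name `is_rt_kernel` is hypothetical, and accepting
the spelling "PREEMPT_RT" as well is an assumption about how some newer
kernels report it.

```python
import platform


def is_rt_kernel(version: str) -> bool:
    """Return True if a `uname -v` style string advertises PREEMPT RT."""
    # Treat "PREEMPT_RT" like "PREEMPT RT", then look for the two
    # adjacent tokens. A plain "PREEMPT" does not count.
    tokens = version.replace("_", " ").split()
    return any(
        first == "PREEMPT" and second == "RT"
        for first, second in zip(tokens, tokens[1:])
    )


if __name__ == "__main__":
    # platform.version() returns the same field as `uname -v` on Linux.
    print(is_rt_kernel(platform.version()))
```

For the string quoted above, "#1 SMP PREEMPT Sat Jul 11 16:32:49 CEST
2015", the function returns False.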
> I currently run into performance problems on my system at 30 to 40 %
> system load; all 8 virtual CPUs show the same average load.
What do you mean when you talk about "performance problems"? How do they 
manifest?

> * Raising the priority with `chrt` doesn't give the desired throughput.
As you certainly know, an RT kernel generally has worse throughput (but
better response times / lower latencies for time-critical threads) than
a generic kernel.
> * `taskset` reports ff as the default affinity mask.
> * `cpufreq-set -g performance` brings a small improvement for the first few seconds.
>
> GSequencer uses many threads but doesn't stay in realtime. What am I
> doing wrong? Or how can I get more performance out of the system?
How do you define performance? If you want more throughput, then try a
generic kernel. If you want shorter response times (lower latencies) for
selected RT threads, then use an RT kernel and tweak the priorities of
the respective user and kernel threads in the processing chain. Make
sure that you don't get priority inversion from an improper RT priority
assignment: the kernel IRQ threads involved should have the same or a
higher RT priority than the threads that process the data delivered by
those IRQ threads. A good example for audio applications is
http://subversion.ffado.org/wiki/IrqPriorities which shows priority
settings for JACK audio applications.
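
A minimal sketch, not part of the original reply, of the priority rule
stated above: along the processing chain, no thread should have a higher
RT priority than the thread that feeds it. The thread names and priority
numbers are made up for illustration and are not taken from the FFADO
page; higher numbers mean higher priority, as with SCHED_FIFO.

```python
from typing import List, Tuple

Stage = Tuple[str, int]  # (thread name, RT priority)


def find_priority_inversions(chain: List[Stage]) -> List[Tuple[str, str]]:
    """Return (producer, consumer) pairs where the consumer outranks the producer.

    `chain` is ordered from the data source (e.g. an IRQ thread) to the
    final consumer. Equal priorities are accepted.
    """
    problems = []
    for (up_name, up_prio), (down_name, down_prio) in zip(chain, chain[1:]):
        if down_prio > up_prio:
            problems.append((up_name, down_name))
    return problems


if __name__ == "__main__":
    # Hypothetical audio chain: sound card IRQ -> JACK server -> application.
    good = [("irq/snd", 90), ("jackd", 80), ("app-audio", 70)]
    bad = [("irq/snd", 50), ("jackd", 80), ("app-audio", 70)]
    print(find_priority_inversions(good))  # prints []
    print(find_priority_inversions(bad))   # prints [('irq/snd', 'jackd')]
```

The check only compares neighbouring stages, which is enough for a
linear chain like this one.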
> On the code side, I could imagine that varying the frequency
> segmentation would bring better throughput, by modifying
> AGS_THREAD_DEFAULT_JIFFIE, AGS_THREAD_MAX_PRECISION and related values.
>
> But for now I am looking for documentation about Linux kernel performance counters.

* Re: 1 of 23 performance problems at 30 to 40 percent system load
From: Joël Krähemann @ 2015-07-21  1:47 UTC
  To: linuxball; +Cc: linux-rt-users

Hi

The problem was actually caused by a memory leak that slowed down the
computer. This is fixed for now. For me, performance means you can do
your work within a given period of time.
So good performance means staying in realtime.
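
A minimal sketch, not part of the original message, of this definition
for an audio application: each buffer must be processed within the time
it takes to play it back. The buffer size, sample rate and measured
durations are assumptions chosen for illustration, not GSequencer's
actual settings.

```python
from typing import Iterable


def period_seconds(buffer_frames: int, sample_rate: int) -> float:
    """Time budget for one buffer: its playback duration in seconds."""
    return buffer_frames / sample_rate


def count_overruns(durations: Iterable[float], deadline: float) -> int:
    """Count processing cycles that took longer than the deadline."""
    return sum(1 for d in durations if d > deadline)


if __name__ == "__main__":
    deadline = period_seconds(1024, 48000)  # about 21.3 ms per buffer
    measured = [0.010, 0.025, 0.021]        # hypothetical cycle times in seconds
    print(count_overruns(measured, deadline))  # prints 1
```

With these numbers, staying in realtime means an overrun count of zero.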

cheers,
Joël


* Re: 1 of 23 performance problems at 30 to 40 percent system load
From: Joël Krähemann @ 2015-09-14 14:47 UTC
  To: jkraehemann-guest; +Cc: linux-rt-users

Hi

Today I was thinking about performance. Every system has its
properties, e.g. acoustics has its range within a bandwidth, and light
has its own oscillation range as well. These are all basically
mathematical systems to which the rule of performance can be applied.
You might underload a system to make certain properties visible, like
the amplitude of a wave. Within finite computation with recursive
precision you have to use the harmonic oscillation within the system.

Underload as a term would be defined as follows: an optimal system at
its highest load shall slow down when it is underloaded. In contrast,
when it doesn't run at its highest load, it would be over-clocked.

With regard to threads and the kernel, Linux would preferably be able to
handle more computation capacity at different loads. So concurrency
safety will be handled at the compiler level.

As for me, I don't code the kernel and don't have to deal with bits or
frequencies. Or at least only in a more abstract manner: bits are the
things I switch on and off, and frequency is the synchronization rate I
do ...
However, I really like maths. I hope you enjoy the show:

http://gsequencer.org

By the way, electrons are able to tunnel, which isn't handled by the
specifications above.

Bests,
Joël Krähemann




