* Hitting tcmalloc bug even with patch applied
@ 2015-04-27 12:06 Alexandre DERUMIER
2015-04-27 13:21 ` Alexandre DERUMIER
2015-04-27 20:24 ` Milosz Tanski
0 siblings, 2 replies; 14+ messages in thread
From: Alexandre DERUMIER @ 2015-04-27 12:06 UTC (permalink / raw)
To: ceph-devel, Somnath Roy
Hi,
I'm hitting the tcmalloc bug even with the patch applied.
It mainly occurs when I benchmark with fio using a lot of jobs (20 - 40 jobs).
Do I need to tune something in the OSD environment variables?
I double-checked it with:
# g++ -o gperftest gperftest.c -ltcmalloc
# export TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=67108864
# ./gperftest
Tcmalloc OK! Internal and Env cache size are same:67108864
perf top
-------
10.04% libtcmalloc.so.4.1.2 [.] tcmalloc::ThreadCache::ReleaseToCentralCache
8.19% libtcmalloc.so.4.1.2 [.] tcmalloc::CentralFreeList::FetchFromSpans
3.89% libtcmalloc.so.4.1.2 [.] tcmalloc::CentralFreeList::ReleaseToSpans
2.04% libtcmalloc.so.4.1.2 [.] tcmalloc::CentralFreeList::ReleaseListToSpans
1.79% libtcmalloc.so.4.1.2 [.] operator new
1.25% ceph-osd [.] ConfFile::load_from_buffer
1.21% libtcmalloc.so.4.1.2 [.] operator delete
1.14% [kernel] [k] _raw_spin_lock
1.08% libstdc++.so.6.0.19 [.] std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string
1.04% [kernel] [k] __schedule
1.00% libpthread-2.17.so [.] pthread_mutex_trylock
0.90% [kernel] [k] native_write_msr_safe
0.89% [kernel] [k] __switch_to
0.79% [kernel] [k] _raw_spin_lock_irqsave
0.73% [kernel] [k] copy_user_enhanced_fast_string
Regards,
Alexandre
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Hitting tcmalloc bug even with patch applied
2015-04-27 12:06 Hitting tcmalloc bug even with patch applied Alexandre DERUMIER
@ 2015-04-27 13:21 ` Alexandre DERUMIER
2015-04-27 14:53 ` Milosz Tanski
2015-04-27 20:24 ` Milosz Tanski
1 sibling, 1 reply; 14+ messages in thread
From: Alexandre DERUMIER @ 2015-04-27 13:21 UTC (permalink / raw)
To: ceph-devel, Somnath Roy
It seems that starting the OSD with:
TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=128M /usr/bin/ceph-osd
fixes it.
I don't know if that is the right way?
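For reference, the gperftest check in the first message passed the limit as a plain byte count (67108864, i.e. 64M). Nothing in this thread confirms that tcmalloc parses a suffix such as the M in 128M, so a cautious option is to convert the size to bytes before exporting it. A minimal sketch, assuming a POSIX shell; the to_bytes helper is hypothetical and not part of Ceph or gperftools:

```shell
#!/bin/sh
# Hypothetical helper: convert a size with an optional K/M/G suffix
# (powers of 1024) into a plain byte count.
to_bytes() {
    case "$1" in
        *K) echo $(( ${1%K} * 1024 )) ;;
        *M) echo $(( ${1%M} * 1024 * 1024 )) ;;
        *G) echo $(( ${1%G} * 1024 * 1024 * 1024 )) ;;
        *)  echo "$1" ;;   # already a plain number, pass it through
    esac
}

to_bytes 64M    # prints 67108864, the value used in the gperftest check
to_bytes 128M   # prints 134217728
```

The OSD could then be started with TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=$(to_bytes 128M) /usr/bin/ceph-osd, which passes 134217728.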
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Hitting tcmalloc bug even with patch applied
2015-04-27 13:21 ` Alexandre DERUMIER
@ 2015-04-27 14:53 ` Milosz Tanski
2015-04-27 14:57 ` Mark Nelson
2015-04-27 16:02 ` Somnath Roy
0 siblings, 2 replies; 14+ messages in thread
From: Milosz Tanski @ 2015-04-27 14:53 UTC (permalink / raw)
To: Alexandre DERUMIER, ceph-devel, Somnath Roy
On 4/27/15 9:21 AM, Alexandre DERUMIER wrote:
> It seems that starting the OSD with:
>
> TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=128M /usr/bin/ceph-osd
>
> fixes it.
>
> I don't know if that is the right way?
Do you know what the default is if you don't specify it?
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Hitting tcmalloc bug even with patch applied
2015-04-27 14:53 ` Milosz Tanski
@ 2015-04-27 14:57 ` Mark Nelson
2015-04-27 15:25 ` Alexandre DERUMIER
2015-04-27 16:02 ` Somnath Roy
1 sibling, 1 reply; 14+ messages in thread
From: Mark Nelson @ 2015-04-27 14:57 UTC (permalink / raw)
To: Milosz Tanski, Alexandre DERUMIER, ceph-devel, Somnath Roy
Looks like the default is 16MB:
http://gperftools.googlecode.com/svn/trunk/doc/tcmalloc.html
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Hitting tcmalloc bug even with patch applied
2015-04-27 14:57 ` Mark Nelson
@ 2015-04-27 15:25 ` Alexandre DERUMIER
2015-04-27 16:00 ` Milosz Tanski
0 siblings, 1 reply; 14+ messages in thread
From: Alexandre DERUMIER @ 2015-04-27 15:25 UTC (permalink / raw)
To: Mark Nelson; +Cc: Milosz Tanski, ceph-devel, Somnath Roy
>>Looks like the default is 16MB:
>>http://gperftools.googlecode.com/svn/trunk/doc/tcmalloc.html
I don't know whether setting it to 128MB has a performance impact (CPU, memory garbage collection?).
I'll try testing different values between 16MB and 128MB.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Hitting tcmalloc bug even with patch applied
2015-04-27 15:25 ` Alexandre DERUMIER
@ 2015-04-27 16:00 ` Milosz Tanski
0 siblings, 0 replies; 14+ messages in thread
From: Milosz Tanski @ 2015-04-27 16:00 UTC (permalink / raw)
To: Alexandre DERUMIER, Mark Nelson; +Cc: ceph-devel, Somnath Roy
On 4/27/15 11:25 AM, Alexandre DERUMIER wrote:
>>> Looks like the default is 16MB:
>>> http://gperftools.googlecode.com/svn/trunk/doc/tcmalloc.html
>
> I don't know whether setting it to 128MB has a performance impact (CPU, memory garbage collection?).
> I'll try testing different values between 16MB and 128MB.
It looks like the MongoDB people have run into a similar issue in a particular (degenerate) case. Here's the ticket link for reference: https://jira.mongodb.org/browse/SERVER-16551
^ permalink raw reply [flat|nested] 14+ messages in thread
* RE: Hitting tcmalloc bug even with patch applied
2015-04-27 14:53 ` Milosz Tanski
2015-04-27 14:57 ` Mark Nelson
@ 2015-04-27 16:02 ` Somnath Roy
1 sibling, 0 replies; 14+ messages in thread
From: Somnath Roy @ 2015-04-27 16:02 UTC (permalink / raw)
To: Milosz Tanski, Alexandre DERUMIER, ceph-devel
I doubt the tcmalloc traces will go away permanently with the env variable set; depending on your workload, they may come back. Giving more memory to the thread cache will definitely help.
BTW, in case of any confusion: the patch was meant to make the env variable take effect (and so alleviate the traces), not to eliminate the traces.
Thanks & Regards
Somnath
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Hitting tcmalloc bug even with patch applied
2015-04-27 12:06 Hitting tcmalloc bug even with patch applied Alexandre DERUMIER
2015-04-27 13:21 ` Alexandre DERUMIER
@ 2015-04-27 20:24 ` Milosz Tanski
2015-04-27 20:33 ` Mark Nelson
1 sibling, 1 reply; 14+ messages in thread
From: Milosz Tanski @ 2015-04-27 20:24 UTC (permalink / raw)
To: Alexandre DERUMIER, ceph-devel, Somnath Roy
On 4/27/15 8:06 AM, Alexandre DERUMIER wrote:
> perf top
> -------
> 10.04% libtcmalloc.so.4.1.2 [.] tcmalloc::ThreadCache::ReleaseToCentralCache
> 8.19% libtcmalloc.so.4.1.2 [.] tcmalloc::CentralFreeList::FetchFromSpans
> 3.89% libtcmalloc.so.4.1.2 [.] tcmalloc::CentralFreeList::ReleaseToSpans
> 2.04% libtcmalloc.so.4.1.2 [.] tcmalloc::CentralFreeList::ReleaseListToSpans
> 1.79% libtcmalloc.so.4.1.2 [.] operator new
> 1.25% ceph-osd [.] ConfFile::load_from_buffer
> 1.21% libtcmalloc.so.4.1.2 [.] operator delete
> 1.14% [kernel] [k] _raw_spin_lock
> 1.08% libstdc++.so.6.0.19 [.] std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string
> 1.04% [kernel] [k] __schedule
> 1.00% libpthread-2.17.so [.] pthread_mutex_trylock
> 0.90% [kernel] [k] native_write_msr_safe
> 0.89% [kernel] [k] __switch_to
> 0.79% [kernel] [k] _raw_spin_lock_irqsave
> 0.73% [kernel] [k] copy_user_enhanced_fast_string
>
This is obviously going to be more painful, but can you capture a profile of one OSD process using perf record -p $OSD_PID? Ideally one with a call graph and one without.
That would help us investigate further. We can see which parts of those tcmalloc functions are the biggest offenders in terms of time. We can also see whether there is a new/delete pattern in the OSD code that somehow triggers this degenerate case.
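The two captures asked for above could be taken along these lines. This is only a sketch: the 30-second window (sleep 30), the output file names, and the perf_record_cmds helper are assumptions, not something specified in this thread. The helper prints the commands so they can be reviewed before running them.

```shell
#!/bin/sh
# Hypothetical helper: print the two perf invocations for one OSD process.
# $1 = PID of the ceph-osd process to profile.
perf_record_cmds() {
    pid="$1"
    # Flat profile: samples only, no call stacks.
    echo "perf record -p $pid -o perf.flat.data -- sleep 30"
    # Same capture with call graphs enabled (-g).
    echo "perf record -g -p $pid -o perf.callgraph.data -- sleep 30"
}

# OSD_PID should hold the PID of the OSD; "PID" is only a placeholder.
perf_record_cmds "${OSD_PID:-PID}"
```

Piping the output to sh runs both captures one after the other; perf report -i perf.callgraph.data can then be used to browse the result.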
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Hitting tcmalloc bug even with patch applied
2015-04-27 20:24 ` Milosz Tanski
@ 2015-04-27 20:33 ` Mark Nelson
2015-04-28 13:58 ` Chaitanya Huilgol
0 siblings, 1 reply; 14+ messages in thread
From: Mark Nelson @ 2015-04-27 20:33 UTC (permalink / raw)
To: Milosz Tanski, Alexandre DERUMIER, ceph-devel, Somnath Roy
On 04/27/2015 03:24 PM, Milosz Tanski wrote:
> This is obviously going to be more painful, but can you capture a profile of one OSD process using perf record -p $OSD_PID? Ideally one with a call graph and one without.
>
> That would help us investigate further. We can see which parts of those tcmalloc functions are the biggest offenders in terms of time. We can also see whether there is a new/delete pattern in the OSD code that somehow triggers this degenerate case.
If you are on a newish (3.11+) kernel with libunwind compiled into perf,
I've found that DWARF call graphs are much more detailed. The sampling
frequency may need to be lowered to make it work well; -F 100 or
something like that, perhaps.
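A sketch of what that might look like, assuming perf was built with DWARF unwinding support. The 30-second window and the perf_dwarf_cmd helper are hypothetical, and -F 100 is only the starting point suggested above:

```shell
#!/bin/sh
# Hypothetical helper: print a perf invocation that records DWARF call graphs.
# $1 = PID of the OSD, $2 = sampling frequency in Hz (defaults to 100).
perf_dwarf_cmd() {
    echo "perf record --call-graph dwarf -F ${2:-100} -p $1 -- sleep 30"
}

# OSD_PID should hold the PID of the OSD; "PID" is only a placeholder.
perf_dwarf_cmd "${OSD_PID:-PID}" 100
```

In DWARF mode each sample carries a copy of part of the user stack, so the data file grows quickly, which is one reason to lower the frequency.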
^ permalink raw reply [flat|nested] 14+ messages in thread
* RE: Hitting tcmalloc bug even with patch applied
2015-04-27 20:33 ` Mark Nelson
@ 2015-04-28 13:58 ` Chaitanya Huilgol
2015-04-28 17:37 ` Milosz Tanski
0 siblings, 1 reply; 14+ messages in thread
From: Chaitanya Huilgol @ 2015-04-28 13:58 UTC (permalink / raw)
To: Mark Nelson, Milosz Tanski, Alexandre DERUMIER, ceph-devel,
Somnath Roy
Hi,
The default cache size is 32M; the tcmalloc documentation is outdated.
As Somnath mentioned, the tcmalloc fix makes the env variable effective; without this fix, the library does not use the exported value of 'TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES'.
The degenerate case is hit less frequently with a higher cache size, but we still encounter the issue.
We are not sure what leads to this; the hypotheses so far are:
- A change in the OSD's memory allocation profile causing tcmalloc to bring different-size segments into the thread cache
- A change in load on the shard threads (the distribution is uneven), with load landing on less active threads when I/O starts on a different pool; this may cause tcmalloc to move memory to these threads
If you want to test with an increased cache value, you can export it in the /etc/init/ceph-osd.conf upstart script.
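After restarting an OSD with the variable exported, one way to confirm that the process actually received it is to read its environment from /proc. A minimal sketch, assuming Linux with /proc mounted and permission to read the target process; the tcmalloc_env_of helper is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: list the TCMALLOC_* variables a process was started with.
# /proc/PID/environ holds NUL-separated KEY=VALUE pairs, so convert the
# separators to newlines before filtering.
tcmalloc_env_of() {
    tr '\0' '\n' < "/proc/$1/environ" | grep '^TCMALLOC_'
}

# Only run the check when OSD_PID has been set to a real ceph-osd PID.
if [ -n "${OSD_PID:-}" ]; then
    tcmalloc_env_of "$OSD_PID"
fi
```

This shows only that the variable reached the process. Whether the library honours it is a separate question, which is what the gperftest check is for.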
Regards,
Chaitanya
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Hitting tcmalloc bug even with patch applied
2015-04-28 13:58 ` Chaitanya Huilgol
@ 2015-04-28 17:37 ` Milosz Tanski
2015-04-28 18:04 ` Vijayendra Shamanna
0 siblings, 1 reply; 14+ messages in thread
From: Milosz Tanski @ 2015-04-28 17:37 UTC (permalink / raw)
To: Chaitanya Huilgol
Cc: Mark Nelson, Alexandre DERUMIER, ceph-devel, Somnath Roy
On Tue, Apr 28, 2015 at 9:58 AM, Chaitanya Huilgol
<Chaitanya.Huilgol@sandisk.com> wrote:
>
> Hi,
>
> The default cache size is 32M; the tcmalloc documentation is outdated.
> As Somnath mentioned, the tcmalloc fix makes the env variable effective; without this fix, the library does not use the exported value of 'TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES'.
> The degenerate case is hit less frequently with a higher cache size, but we still encounter the issue.
> We are not sure what leads to this; the hypotheses so far are:
> - A change in the OSD's memory allocation profile causing tcmalloc to bring different-size segments into the thread cache
> - A change in load on the shard threads (the distribution is uneven), with load landing on less active threads when I/O starts on a different pool; this may cause tcmalloc to move memory to these threads
Actually reading this (older) documentation:
http://gperftools.googlecode.com/svn/trunk/doc/tcmalloc.html#Sizing_Thread_Cache_Free_Lists
It describes the problem of sizing the thread free lists and the
pitfalls involved. With asymmetric alloc/free, e.g. cross-thread
alloc/free, you're basically guaranteed to see worst-case behavior:
you don't benefit from the thread cache, but you still pay the price
for it (maintaining it / always freeing to the global pool). This would
be common when the IO threads are separate from the network threads
(the IO thread allocates space, the network thread sends it and frees
it).
Am I correct, Chaitanya? Is that what you're talking about in the second
statement?
That's why I was hoping Alexandre would be able to provide us with
some callgraphs that indicate where these free/delete calls are
originating from.
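To make the pattern concrete, here is a minimal sketch of the asymmetric
case, assuming a single producer that only allocates and a single
consumer that only frees. HandoffQueue and run_cross_thread are made-up
names for illustration, not anything from the Ceph tree. Linked against
tcmalloc and run under perf, something like this should exercise the
thread-cache-to-central-list path, though I have not measured it.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>

// Hypothetical hand-off queue between an "IO" thread and a "network" thread.
struct HandoffQueue {
  std::mutex m;
  std::condition_variable cv;
  std::queue<char*> q;
  bool done = false;

  void push(char* p) {
    {
      std::lock_guard<std::mutex> l(m);
      q.push(p);
    }
    cv.notify_one();
  }

  void close() {
    {
      std::lock_guard<std::mutex> l(m);
      done = true;
    }
    cv.notify_one();
  }

  // Returns nullptr once the queue is closed and drained.
  char* pop() {
    std::unique_lock<std::mutex> l(m);
    cv.wait(l, [this] { return !q.empty() || done; });
    if (q.empty()) return nullptr;
    char* p = q.front();
    q.pop();
    return p;
  }
};

// Allocates every buffer on the calling thread and frees every buffer on a
// second thread. Returns the number of buffers the second thread freed.
std::size_t run_cross_thread(std::size_t count, std::size_t size) {
  HandoffQueue hq;
  std::size_t freed = 0;
  std::thread consumer([&] {
    while (char* p = hq.pop()) {
      delete[] p;  // freed on a thread that never allocates
      ++freed;
    }
  });
  for (std::size_t i = 0; i < count; ++i) {
    hq.push(new char[size]);  // allocated on a thread that never frees
  }
  hq.close();
  consumer.join();  // join makes the final value of freed visible here
  return freed;
}
```

Calling run_cross_thread(100000, 4096) from main() and comparing a perf
profile against a variant that frees on the allocating thread would be
one way to check whether the hypothesis holds.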
>
>
> If you want to test with increased cache value, you can export this value in the /etc/init/ceph-osd.conf upstart script
>
> Regards,
> Chaitanya
>
> -----Original Message-----
> From: ceph-devel-owner@vger.kernel.org [mailto:ceph-devel-owner@vger.kernel.org] On Behalf Of Mark Nelson
> Sent: 28 April 2015 02:03
> To: Milosz Tanski; Alexandre DERUMIER; ceph-devel; Somnath Roy
> Subject: Re: Hitting tcmalloc bug even with patch applied
>
>
>
> On 04/27/2015 03:24 PM, Milosz Tanski wrote:
> >
> >
> > On 4/27/15 8:06 AM, Alexandre DERUMIER wrote:
> >> Hi,
> >>
> >> I'm hitting the tcmalloc bug even with the patch applied.
> >> It mainly occurs when I try to bench fio with a lot of jobs (20 - 40
> >> jobs)
> >>
> >> Do I need to tune something in the osd environment variables?
> >>
> >>
> >> I double check it with
> >>
> >> #g++ -o gperftest gperftest.c -ltcmalloc
> >> # export TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=67108864
> >> # ./gperftest
> >> Tcmalloc OK! Internal and Env cache size are same:67108864
> >>
> >>
> >> perf top
> >> -------
> >> 10.04% libtcmalloc.so.4.1.2 [.] tcmalloc::ThreadCache::ReleaseToCentralCache
> >> 8.19% libtcmalloc.so.4.1.2 [.] tcmalloc::CentralFreeList::FetchFromSpans
> >> 3.89% libtcmalloc.so.4.1.2 [.] tcmalloc::CentralFreeList::ReleaseToSpans
> >> 2.04% libtcmalloc.so.4.1.2 [.] tcmalloc::CentralFreeList::ReleaseListToSpans
> >> 1.79% libtcmalloc.so.4.1.2 [.] operator new
> >> 1.25% ceph-osd [.] ConfFile::load_from_buffer
> >> 1.21% libtcmalloc.so.4.1.2 [.] operator delete
> >> 1.14% [kernel] [k] _raw_spin_lock
> >> 1.08% libstdc++.so.6.0.19 [.] std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string
> >> 1.04% [kernel] [k] __schedule
> >> 1.00% libpthread-2.17.so [.] pthread_mutex_trylock
> >> 0.90% [kernel] [k] native_write_msr_safe
> >> 0.89% [kernel] [k] __switch_to
> >> 0.79% [kernel] [k] _raw_spin_lock_irqsave
> >> 0.73% [kernel] [k] copy_user_enhanced_fast_string
> >>
> >
> > This is obviously going to be more painful, but can you perform a capture for one OSD process using perf record -p $OSD_PID? Ideally one with a callgraph and one without.
> >
> > That would help us investigate further. We can see which parts of those tcmalloc functions are the biggest offenders in terms of time. We can also see if there's a new/delete pattern in the OSD code that is somehow triggering this degenerate case.
>
> If you're on a newish (3.11+) kernel that has libunwind compiled into perf, I've found that dwarf callgraphs are much more detailed. The frequency may need to be lowered to make it work well, perhaps -F 100 or so.
>
> >
> >>
> >>
> >> Regards,
> >>
> >> Alexandre
--
Milosz Tanski
CTO
16 East 34th Street, 15th floor
New York, NY 10016
p: 646-253-9055
e: milosz@adfin.com
^ permalink raw reply [flat|nested] 14+ messages in thread
* RE: Hitting tcmalloc bug even with patch applied
2015-04-28 17:37 ` Milosz Tanski
@ 2015-04-28 18:04 ` Vijayendra Shamanna
2015-04-30 6:15 ` Haomai Wang
0 siblings, 1 reply; 14+ messages in thread
From: Vijayendra Shamanna @ 2015-04-28 18:04 UTC (permalink / raw)
To: Milosz Tanski, Chaitanya Huilgol
Cc: Mark Nelson, Alexandre DERUMIER, ceph-devel, Somnath Roy
Hi Milosz,
The OSD op worker threads that handle requests are part of a sharded thread pool. We observed that the distribution across these shards was a bit uneven. When we last checked, most of the new/delete calls were originating from the Index Manager code in the read path.
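A rough way to put a number on that unevenness, as a sketch only: it assumes ops are mapped to shards by hashing a key modulo the shard count, which may not match what the OSD actually does. shard_for, count_per_shard and imbalance are hypothetical helpers, not Ceph functions.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical mapping of an op key to a shard. num_shards must be > 0.
std::size_t shard_for(const std::string& key, std::size_t num_shards) {
  return std::hash<std::string>{}(key) % num_shards;
}

// Counts how many keys land on each shard.
std::vector<std::size_t> count_per_shard(const std::vector<std::string>& keys,
                                         std::size_t num_shards) {
  std::vector<std::size_t> counts(num_shards, 0);
  for (const auto& k : keys) {
    ++counts[shard_for(k, num_shards)];
  }
  return counts;
}

// Ratio of the busiest shard to the mean load. 1.0 means perfectly even;
// a value equal to the shard count means one shard takes everything.
double imbalance(const std::vector<std::size_t>& counts) {
  std::size_t total = 0;
  std::size_t busiest = 0;
  for (std::size_t c : counts) {
    total += c;
    if (c > busiest) busiest = c;
  }
  if (total == 0) return 1.0;  // no load at all is treated as even
  return static_cast<double>(busiest) * counts.size() / total;
}
```

Feeding it the keys seen during a benchmark run and watching how imbalance() changes when I/O moves to a different pool would show whether the skew lines up with the slowdown.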
Thanks,
Viju
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Hitting tcmalloc bug even with patch applied
2015-04-28 18:04 ` Vijayendra Shamanna
@ 2015-04-30 6:15 ` Haomai Wang
2015-04-30 17:36 ` Gregory Farnum
0 siblings, 1 reply; 14+ messages in thread
From: Haomai Wang @ 2015-04-30 6:15 UTC (permalink / raw)
To: Vijayendra Shamanna
Cc: Milosz Tanski, Chaitanya Huilgol, Mark Nelson, Alexandre DERUMIER,
ceph-devel, Somnath Roy
Hmm, I think we need to pay a lot of attention to this problem, especially
for fast storage device backends.
I think the solution lies in ceph itself rather than in tcmalloc or
jemalloc optimization.
I'm not sure which part uses the most memory: frequently used objects
like ObjectContext/OpTracker, or the buffers in bufferlist. Could
tcmalloc or jemalloc provide malloc/free usage statistics that include
the callstack, so we can see where the memory bottleneck is?
On the other hand, maybe ceph needs to consider doing memory management
itself for the main or most frequent memory users.
--
Best Regards,
Wheat
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Hitting tcmalloc bug even with patch applied
2015-04-30 6:15 ` Haomai Wang
@ 2015-04-30 17:36 ` Gregory Farnum
0 siblings, 0 replies; 14+ messages in thread
From: Gregory Farnum @ 2015-04-30 17:36 UTC (permalink / raw)
To: Haomai Wang
Cc: Vijayendra Shamanna, Milosz Tanski, Chaitanya Huilgol,
Mark Nelson, Alexandre DERUMIER, ceph-devel, Somnath Roy
On Wed, Apr 29, 2015 at 11:15 PM, Haomai Wang <haomaiwang@gmail.com> wrote:
> Hmm, I think we need to pay a lot of attention to this problem, especially
> for fast storage device backends.
>
> I think the solution lies in ceph itself rather than in tcmalloc or
> jemalloc optimization.
>
> I'm not sure which part uses the most memory: frequently used objects
> like ObjectContext/OpTracker, or the buffers in bufferlist.
We've talked very briefly about using pools for ObjectContext and
OpTracker objects which might help, but the bulk of the memory
allocation is usually going to be for the message bufferlists, which
are variable-length. :(
(Strictly speaking so are a lot of the others which include string
renditions of the object name, but that can be worked around at least
a bit more than the actual message contents can.)
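For the fixed-size objects, the pooling idea could look roughly like the
sketch below. OpRecord is a hypothetical stand-in (not the real
ObjectContext or OpTracker types), and the pool is deliberately not
thread-safe; a real version would need a pool per thread or a lock, and
it does nothing for the variable-length bufferlists.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a small, frequently allocated object.
struct OpRecord {
  int id = 0;
};

// Minimal single-type pool: released objects are parked on a free list and
// handed out again instead of going back to the allocator.
// Not thread-safe.
template <typename T>
class ObjectPool {
 public:
  // Reuses an idle object if one exists, otherwise allocates a new one.
  std::unique_ptr<T> acquire() {
    if (free_.empty()) {
      ++allocations_;
      return std::unique_ptr<T>(new T());
    }
    std::unique_ptr<T> obj = std::move(free_.back());
    free_.pop_back();
    *obj = T();  // reset to a default state before reuse
    return obj;
  }

  // Parks the object on the free list for later reuse.
  void release(std::unique_ptr<T> obj) { free_.push_back(std::move(obj)); }

  // Number of times the pool had to go to the allocator.
  std::size_t allocations() const { return allocations_; }

  // Number of objects currently parked on the free list.
  std::size_t idle() const { return free_.size(); }

 private:
  std::vector<std::unique_ptr<T>> free_;
  std::size_t allocations_ = 0;
};
```

With a pool per worker thread, an acquire/release pair stays on one
thread and never reaches tcmalloc's central lists once the pool is warm.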
> Could tcmalloc or jemalloc provide malloc/free usage statistics that
> include the callstack, so we can see where the memory bottleneck is?
You can do this with tcmalloc, and it's even integrated into Ceph with
the various heap commands. :)
>
> On the other hand, maybe ceph needs to consider doing memory management
> itself for the main or most frequent memory users.
^ permalink raw reply [flat|nested] 14+ messages in thread