* Single host VM limit when using RBD
From: Matthew Anderson @ 2013-01-17 8:37 UTC
To: ceph-devel@vger.kernel.org

I've run into a limit on the maximum number of RBD-backed VMs that I'm able to run on a single host. I have 20 VMs (21 RBD volumes open) running on a single host, and when booting the 21st machine I get the error below from libvirt/QEMU. I'm able to shut down a VM and start another in its place, so there seems to be a hard limit on the number of volumes I'm able to have open. I did some googling, and error 11 from pthread_create seems to mean 'resource unavailable', so I'm probably running into a thread limit of some sort. I did try increasing the kernel's threads-max setting, but nothing changed. I moved a few VMs to a different, empty host and they start with no issues at all.

This machine has 4 OSDs running on it in addition to the 20 VMs. Kernel 3.7.1, Ceph 0.56.1 and QEMU 1.3.0. There is currently 65GB of 96GB RAM free and no swap.

Can anyone suggest where the limit might be, or anything I can do to narrow down the problem?

Thanks
-Matt

-------------------------

Error starting domain: internal error Process exited while reading console log output: char device redirected to /dev/pts/23
Thread::try_create(): pthread_create failed with error 11
common/Thread.cc: In function 'void Thread::create(size_t)' thread 7f4eb5a65960 time 2013-01-17 02:32:58.096437
common/Thread.cc: 110: FAILED assert(ret == 0)
 ceph version 0.56.1 (e4a541624df62ef353e754391cbbb707f54b16f7)
 1: (()+0x2aaa8f) [0x7f4eb2de8a8f]
 2: (SafeTimer::init()+0x95) [0x7f4eb2cd2575]
 3: (librados::RadosClient::connect()+0x72c) [0x7f4eb2c689dc]
 4: (()+0xa0290) [0x7f4eb5b27290]
 5: (()+0x879dd) [0x7f4eb5b0e9dd]
 6: (()+0x87c1b) [0x7f4eb5b0ec1b]
 7: (()+0x87ae1) [0x7f4eb5b0eae1]
 8: (()+0x87d50) [0x7f4eb5b0ed50]
 9: (()+0xb37b2) [0x7f4eb5b3a7b2]
 10: (()+0x1e83eb) [0x7f4eb5c6f3eb]
 11: (()+0x1ab54a) [0x7f4eb5c3254a]
 12: (main()+0x9da) [0x7f4eb5c72a3a]
 13: (__libc_start_main()+0xfd) [0x7f4eb1ab4cdd]
 14: (()+0x710b9) [0x7f4eb5af80b9]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
terminate called after

Traceback (most recent call last):
  File "/usr/share/virt-manager/virtManager/asyncjob.py", line 96, in cb_wrapper
    callback(asyncjob, *args, **kwargs)
  File "/usr/share/virt-manager/virtManager/asyncjob.py", line 117, in tmpcb
    callback(*args, **kwargs)
  File "/usr/share/virt-manager/virtManager/domain.py", line 1090, in startup
    self._backend.create()
  File "/usr/lib/python2.7/dist-packages/libvirt.py", line 620, in create
    if ret == -1: raise libvirtError ('virDomainCreate() failed', dom=self)
libvirtError: internal error Process exited while reading console log output: char device redirected to /dev/pts/23
Thread::try_create(): pthread_create failed with error 11
common/Thread.cc: In function 'void Thread::create(size_t)' thread 7f4eb5a65960 time 2013-01-17 02:32:58.096437
common/Thread.cc: 110: FAILED assert(ret == 0)
 ceph version 0.56.1 (e4a541624df62ef353e754391cbbb707f54b16f7)
 1: (()+0x2aaa8f) [0x7f4eb2de8a8f]
 2: (SafeTimer::init()+0x95) [0x7f4eb2cd2575]
 3: (librados::RadosClient::connect()+0x72c) [0x7f4eb2c689dc]
 4: (()+0xa0290) [0x7f4eb5b27290]
 5: (()+0x879dd) [0x7f4eb5b0e9dd]
 6: (()+0x87c1b) [0x7f4eb5b0ec1b]
 7: (()+0x87ae1) [0x7f4eb5b0eae1]
 8: (()+0x87d50) [0x7f4eb5b0ed50]
 9: (()+0xb37b2) [0x7f4eb5b3a7b2]
 10: (()+0x1e83eb) [0x7f4eb5c6f3eb]
 11: (()+0x1ab54a) [0x7f4eb5c3254a]
 12: (main()+0x9da) [0x7f4eb5c72a3a]
 13: (__libc_start_main()+0xfd) [0x7f4eb1ab4cdd]
 14: (()+0x710b9) [0x7f4eb5af80b9]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
terminate called after

--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
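The "error 11" in the log above is a raw errno value returned by pthread_create. As a small illustration that is not part of the original thread, the following Python sketch maps a numeric errno to its symbolic name and message. The helper name `describe_errno` is hypothetical; on Linux, value 11 is expected to correspond to EAGAIN, which pthread_create uses to report that a resource or limit prevented the thread from being created.

```python
import errno
import os


def describe_errno(code: int) -> str:
    """Return 'NAME: message' for a numeric errno value on this platform."""
    # errno.errorcode maps numbers to symbolic names; values are OS-specific.
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name}: {os.strerror(code)}"


if __name__ == "__main__":
    # 11 is the number printed in "pthread_create failed with error 11".
    print(describe_errno(11))
```

Because EAGAIN only says that some limit was hit, it does not identify which one; the replies in this thread suggest several candidates.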
* Re: Single host VM limit when using RBD
From: Andrey Korolyov @ 2013-01-17 8:42 UTC
To: Matthew Anderson
Cc: ceph-devel@vger.kernel.org

Hi Matthew,

This looks like a low value in /proc/sys/kernel/threads-max.
* RE: Single host VM limit when using RBD
From: Matthew Anderson @ 2013-01-17 8:47 UTC
To: 'Andrey Korolyov'
Cc: ceph-devel@vger.kernel.org

Hi Andrey,

I did try your suggestion beforehand and it doesn't appear to fix the issue.

[root@KVM04 ~]# cat /proc/sys/kernel/threads-max
2549635
[root@KVM04 ~]# echo 5549635 > /proc/sys/kernel/threads-max
[root@KVM04 ~]# virsh start EX03
error: Failed to start domain EX03
error: internal error Process exited while reading console log output: char device redirected to /dev/pts/23
Thread::try_create(): pthread_create failed with error 11
common/Thread.cc: In function 'void Thread::create(size_t)' thread 7f5ec9706960 time 2013-01-17 16:46:50.935681
common/Thread.cc: 110: FAILED assert(ret == 0)
 ceph version 0.56.1 (e4a541624df62ef353e754391cbbb707f54b16f7)
 1: (()+0x2aaa8f) [0x7f5ec6a89a8f]
 2: (SafeTimer::init()+0x95) [0x7f5ec6973575]
 3: (librados::RadosClient::connect()+0x72c) [0x7f5ec69099dc]
 4: (()+0xa0290) [0x7f5ec97c8290]
 5: (()+0x879dd) [0x7f5ec97af9dd]
 6: (()+0x87c1b) [0x7f5ec97afc1b]
 7: (()+0x87ae1) [0x7f5ec97afae1]
 8: (()+0x87d50) [0x7f5ec97afd50]
 9: (()+0xb37b2) [0x7f5ec97db7b2]
 10: (()+0x1e83eb) [0x7f5ec99103eb]
 11: (()+0x1ab54a) [0x7f5ec98d354a]
 12: (main()+0x9da) [0x7f5ec9913a3a]
 13: (__libc_start_main()+0xfd) [0x7f5ec5755cdd]
 14: (()+0x710b9) [0x7f5ec97990b9]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
terminate called after
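To judge whether threads-max is really the binding limit, it can help to compare it against the number of tasks that currently exist. The sketch below is an illustration only and not from the thread: it assumes a Linux procfs, reads /proc/sys/kernel/threads-max, and takes the system-wide task count from the fourth field of /proc/loadavg (the number after the slash). The function names are hypothetical.

```python
from pathlib import Path


def parse_loadavg_tasks(text: str) -> int:
    """Return the total number of scheduling entities from /proc/loadavg text."""
    fields = text.split()
    # Fourth field looks like "3/1873": runnable tasks / existing tasks.
    _running, total = fields[3].split("/")
    return int(total)


def read_int(path: str) -> int:
    """Read a single integer from a procfs-style file."""
    return int(Path(path).read_text().strip())


def thread_headroom(threads_max: int, current_tasks: int) -> int:
    """Tasks that could still be created before reaching threads-max."""
    return threads_max - current_tasks


if __name__ == "__main__":
    try:
        limit = read_int("/proc/sys/kernel/threads-max")
        tasks = parse_loadavg_tasks(Path("/proc/loadavg").read_text())
    except OSError as exc:
        # Not on Linux, or procfs is not mounted.
        print(f"procfs not readable here: {exc}")
    else:
        print(f"threads-max={limit} tasks={tasks} "
              f"headroom={thread_headroom(limit, tasks)}")
```

With a threads-max in the millions, as in the session above, a large positive headroom would suggest that a different limit is being hit.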
* Re: Single host VM limit when using RBD
From: Dan Mick @ 2013-01-17 18:36 UTC
To: Matthew Anderson
Cc: Andrey Korolyov, ceph-devel@vger.kernel.org

How about RLIMIT_NPROC, or memory exhaustion?
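RLIMIT_NPROC is a per-user limit, and on Linux it counts threads as well as processes, so many QEMU instances running under one account could plausibly reach it. As a hedged illustration that is not from the thread, the sketch below parses the "Max processes" row of a /proc/<pid>/limits file. The function name is hypothetical, and the example reads the limits of the script's own process; to check a VM, the PID of a running QEMU process would have to be substituted.

```python
from pathlib import Path
from typing import Optional, Tuple


def parse_max_processes(limits_text: str) -> Tuple[Optional[int], Optional[int]]:
    """Return (soft, hard) RLIMIT_NPROC from /proc/<pid>/limits text.

    A value of None stands for "unlimited".
    """
    prefix = "Max processes"
    for line in limits_text.splitlines():
        if line.startswith(prefix):
            # Remaining columns: soft limit, hard limit, units.
            soft, hard = line[len(prefix):].split()[:2]

            def to_int(value: str) -> Optional[int]:
                return None if value == "unlimited" else int(value)

            return to_int(soft), to_int(hard)
    raise ValueError("no 'Max processes' line found")


if __name__ == "__main__":
    try:
        # "self" is this script; use a QEMU PID to inspect a VM process.
        text = Path("/proc/self/limits").read_text()
    except OSError as exc:
        print(f"procfs not readable here: {exc}")
    else:
        print("RLIMIT_NPROC (soft, hard):", parse_max_processes(text))
```

The soft value is the one that is enforced; it can be compared with the number of threads owned by the user that the QEMU processes run as.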
* Re: Single host VM limit when using RBD
From: Jim Schutt @ 2013-01-17 18:55 UTC
To: Dan Mick
Cc: Matthew Anderson, Andrey Korolyov, ceph-devel@vger.kernel.org

On 01/17/2013 11:36 AM, Dan Mick wrote:
> How about RLIMIT_NPROC, or memory exhaustion?

Also, check /proc/sys/kernel/pid_max. I've solved a similar pthread_create problem by increasing this to 256k, up from 32k.

-- Jim
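On Linux every thread takes an ID from the same number space as process IDs, so pid_max caps the total number of threads on the host. The following sketch is an illustration under that assumption and is not from the thread: it sums the "Threads:" field of every /proc/<pid>/status file and compares the total with /proc/sys/kernel/pid_max. The function names are hypothetical.

```python
import os
from pathlib import Path


def thread_count_from_status(status_text: str) -> int:
    """Extract the thread count from the text of a /proc/<pid>/status file."""
    for line in status_text.splitlines():
        if line.startswith("Threads:"):
            return int(line.split()[1])
    raise ValueError("no 'Threads:' line found")


def count_all_threads(proc_root: str = "/proc") -> int:
    """Sum thread counts over all numeric (per-process) entries in procfs."""
    total = 0
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            text = Path(proc_root, entry, "status").read_text()
            total += thread_count_from_status(text)
        except (OSError, ValueError):
            # The process exited while we were scanning; skip it.
            continue
    return total


if __name__ == "__main__":
    try:
        pid_max = int(Path("/proc/sys/kernel/pid_max").read_text())
        used = count_all_threads()
    except OSError as exc:
        print(f"procfs not readable here: {exc}")
    else:
        print(f"pid_max={pid_max} threads in use={used}")
```

If the number of threads in use is close to pid_max, raising pid_max as Jim describes would be a reasonable thing to try on a host that runs both OSDs and many VMs.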
Thread overview: 5+ messages
[not found] <38A500831D3DE24B90BD200D6C8701351BB3AA15@Exchange2010-2.corit.local>
2013-01-17 8:37 ` Single host VM limit when using RBD Matthew Anderson
2013-01-17 8:42 ` Andrey Korolyov
2013-01-17 8:47 ` Matthew Anderson
2013-01-17 18:36 ` Dan Mick
2013-01-17 18:55 ` Jim Schutt