From: LC Bruzenak
Subject: need debug suggestions on system freeze
Date: Fri, 09 May 2008 14:58:45 -0500
Message-ID: <1210363125.7060.72.camel@homeserver>
To: Linux Audit
List-Id: linux-audit@redhat.com

I need some suggestions for debugging an issue I'm having.

I have a Dell Vostro laptop I've been using successfully for a while (details below). It has some user apps running but doesn't seem overburdened. I am running mls policy in permissive mode.

However, recently the following happens:

PART 1 (prelude relay disabled):

* audit is enabled, there are 2 audisp plugins (prelude and af_unix).
* The audispd.conf q_depth = 128
* I go to our project source directory and start an "svn up"
* In another window as root I "tail -f /var/syslog/messages":

...
May 9 13:51:40 comms audispd: queue is full - dropping event
May 9 13:51:40 comms audispd: queue is full - dropping event
May 9 13:51:40 comms audispd: queue is full - dropping event
May 9 13:51:40 comms audispd: queue is full - dropping event
May 9 13:51:40 comms audispd: queue is full - dropping event
May 9 13:51:40 comms audispd: queue is full - dropping event
May 9 13:51:40 comms audispd: queue is full - dropping event
May 9 13:51:40 comms audispd: queue is full - dropping event
May 9 13:51:40 comms audispd: queue is full - dropping event
May 9 13:51:40 comms audispd: queue is full - dropping event
May 9 13:51:40 comms audispd: queue is full - dropping event
May 9 13:51:40 comms audispd: queue is full - dropping event
May 9 13:51:40 comms audispd: queue is full - dropping event
May 9 13:51:40 comms audispd: queue is full - dropping event
May 9 13:51:40 comms audispd: queue is full - dropping event
May 9 13:51:42 comms auditd[3629]: Audit daemon rotating log files with keep option
May 9 13:52:10 comms prelude-manager: WARNING: Failover enabled: connection error with 192.168.31.120:4690: Connection timed out

* Very soon after this the machine locks up. The above is the last entry in the messages log. Only the "caps lock" and some other "lock" icon on the keyboard (but not scroll lock) flash, I have no inbound network connection, and the screen is blank. I cannot get to a terminal with . The only option is a power cycle.
* After reboot, if I "service auditd stop" and then repeat the svn stuff, there is no freeze and there are no messages. I suspect it is something with file traversals and the audit dispatcher/prelude. It also happened once when doing a "rm -rf " on a directory with many files under my home directory.
* I purposely have a lot of audit logs left in the directory:

[root@hugo ~]# ls -1 /var/log/audit | wc -l
90

* I purposely have the prelude parent manager (relay-to machine) disabled.
* The machine was not exceptionally busy in userland according to the "top" I had running in another window. Here is the header from that (the "top" process was running, all others sleeping):

top - 13:52:40 up 19 min, 3 users, load average: 0.14, 0.16, 0.12
Tasks: 156 total, 1 running, 155 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.2%us, 0.3%sy, 0.0%ni, 99.5%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 2060944k total, 912324k used, 1148620k free, 210276k buffers
Swap: 6835000k total, 0k used, 6835000k free, 240760k cached

* The freeze-up happens faster (I believe) if I leave the audispd.conf q_depth = 80 (default).

Details:

[root@hugo ~]# uname -a
Linux hugo 2.6.25-14.fc9.x86_64 #1 SMP Thu May 1 06:06:21 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux

[root@hugo ~]# rpm -qa | grep audit-
audit-libs-1.7.2-6.fc9.i386
audit-1.7.2-6.fc9.x86_64
audit-libs-1.7.2-6.fc9.x86_64
...

* I have lots of audit rules, plan to add more:

[root@hugo ~]# auditctl -l | wc -l
84

* Disk is not full:

[root@hugo ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
                      108G  7.2G  101G   7% /
/dev/sda1             190M   20M  161M  11% /boot
tmpfs                1007M   48K 1007M   1% /dev/shm

PART 2:

So - then I enabled the relaying prelude-manager. The svn update got farther, and I thought maybe the disabled relay had been the cause of the original problem. However, I saw this first in the messages log:

...
May 9 15:31:36 comms audispd: queue is full - dropping event
May 9 15:31:36 comms audispd: queue is full - dropping event
May 9 15:31:36 comms audispd: queue is full - dropping event
May 9 15:31:36 comms audispd: queue is full - dropping event
May 9 15:31:38 comms audispd: queue is full - dropping event
May 9 15:31:38 comms audispd: queue is full - dropping event
May 9 15:31:38 comms audispd: queue is full - dropping event
May 9 15:31:38 comms audispd: queue is full - dropping event
May 9 15:31:38 comms auditd[3682]: Audit daemon rotating log files with keep option
May 9 15:31:43 comms auditd[3682]: Audit daemon rotating log files with keep option
May 9 15:31:48 comms auditd[3682]: Audit daemon rotating log files with keep option
May 9 15:31:53 comms auditd[3682]: Audit daemon rotating log files with keep option

Then the same freeze-up happens as described above.

Any suggestions or other data I can provide to help debug?
In the meantime I will increase the audispd.conf q_depth and retest.

Thx,
LCB.

--
LC (Lenny) Bruzenak
lenny@magitekltd.com
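
P.S. When retesting with a different q_depth, it may help to compare how many "queue is full" messages appear per second in each run. Below is a minimal sketch of a tally for that, assuming the syslog line layout shown in the excerpts above (month, day, time, host, then "audispd: queue is full - dropping event"). The function name count_drops and the regular expression are my own illustration and are not part of the audit tools.

```python
import re
from collections import Counter

# Assumed line layout, taken from the excerpts in this message:
#   May 9 13:51:40 comms audispd: queue is full - dropping event
DROP_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s+\S+\s+"
    r"audispd: queue is full - dropping event"
)


def count_drops(lines):
    """Map each syslog timestamp to the number of audispd drop messages."""
    counts = Counter()
    for line in lines:
        match = DROP_RE.match(line.strip())
        if match:
            # Collapse runs of spaces so "May  9" and "May 9" count together.
            stamp = " ".join(match.group("stamp").split())
            counts[stamp] += 1
    return counts
```

Passing an open messages file (or any list of log lines) to count_drops gives a per-second count, so two runs with different q_depth values can be compared by how many events were dropped before the freeze. It only counts messages that reached the log, so it says nothing about events lost after the machine locks up.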