* Re: Coredump detected executing rasdaemon 0.8.3 using "rasdaemon -f -r" on host with AMD CPU with Ubuntu 24
From: Samaresh Singh @ 2025-08-07 7:14 UTC
To: Mauro Carvalho Chehab; +Cc: linux-edac@vger.kernel.org
Hi Mauro,
Thanks for your response.
As you requested, I ran the compiled binary under GDB. The backtrace below shows the crash in "ras-events.c" at line 862, in function add_event_handler.
A) Running the compiled binary under GDB and looking at the backtrace after the abort
++++
(gdb) r -f -r
Starting program: /home/samaresh.s/prac/ras1/rasdaemon/rasdaemon -f -r
Enable debuginfod for this session? (y or [n]) y
Debuginfod has been enabled.
To make this setting permanent, add 'set debuginfod enabled on' to .gdbinit.
Downloading separate debug info for system-supplied DSO at 0x7ffff7fc3000
Downloading separate debug info for /lib/x86_64-linux-gnu/libsqlite3.so.0
Downloading separate debug info for /lib/x86_64-linux-gnu/libtraceevent.so.1
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
*** buffer overflow detected ***: terminated
Program received signal SIGABRT, Aborted.
Download failed: Invalid argument. Continuing without source file ./nptl/./nptl/pthread_kill.c.
__pthread_kill_implementation (no_tid=0, signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:44
warning: 44 ./nptl/pthread_kill.c: No such file or directory
(gdb) bt
#0 __pthread_kill_implementation (no_tid=0, signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:44
#1 __pthread_kill_internal (signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:78
#2 __GI___pthread_kill (threadid=<optimized out>, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3 0x00007ffff7c4527e in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#4 0x00007ffff7c288ff in __GI_abort () at ./stdlib/abort.c:79
#5 0x00007ffff7c297b6 in __libc_message_impl (fmt=fmt@entry=0x7ffff7dce765 "*** %s ***: terminated\n")
at ../sysdeps/posix/libc_fatal.c:134
#6 0x00007ffff7d36c19 in __GI___fortify_fail (msg=msg@entry=0x7ffff7dce74c "buffer overflow detected")
at ./debug/fortify_fail.c:24
#7 0x00007ffff7d365d4 in __GI___chk_fail () at ./debug/chk_fail.c:28
#8 0x00007ffff7d37a47 in __GI___read_chk (fd=fd@entry=3, buf=buf@entry=0x55555558984c, nbytes=nbytes@entry=8192,
buflen=buflen@entry=6804) at ./debug/read_chk.c:24
#9 0x00005555555644e0 in read (__nbytes=8192, __buf=0x55555558984c, __fd=3)
at /usr/include/x86_64-linux-gnu/bits/unistd.h:28
#10 add_event_handler (ras=ras@entry=0x5555555886b0, pevent=pevent@entry=0x555555589190, page_size=page_size@entry=8192,
group=group@entry=0x5555555734c9 "ras", event=event@entry=0x55555557339a "mc_event",
func=0x555555566700 <ras_mc_event_handler>, id=0, filter_str=0x0) at ras-events.c:862
#11 0x0000555555565bc2 in handle_ras_events (record_events=<optimized out>, enable_ipmitool=0) at ras-events.c:985
#12 0x000055555556326a in main (argc=3, argv=0x7fffffffe268) at rasdaemon.c:212
++++
B) Stepping through ras-events.c with a breakpoint set at line 837
+++
837 fd = open_trace(ras, fname, O_RDONLY);
(gdb) n
838 if (fd < 0) {
(gdb) p fd
$1 = 3
(gdb) n
852 page = malloc(page_size);
(gdb)
853 if (!page) {
(gdb) p page
$2 = 0x5555555892e0 "trace_total_info"
(gdb) n
862 rc = read(fd, page + size, page_size);
(gdb)
863 if (rc < 0) {
(gdb) p rc
$3 = 1388
(gdb) n
869 size += rc;
(gdb) n
870 } while (rc > 0);
(gdb)
862 rc = read(fd, page + size, page_size);
(gdb)
*** buffer overflow detected ***: terminated
Program received signal SIGABRT, Aborted.
Download failed: Invalid argument. Continuing without source file ./nptl/./nptl/pthread_kill.c.
__pthread_kill_implementation (no_tid=0, signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:44
warning: 44 ./nptl/pthread_kill.c: No such file or directory
++++
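From the session above, it looks like the second read() still asks for page_size bytes even though size bytes of the buffer are already in use. A minimal sketch of a bounded loop, assuming the intent is to read at most one page into the buffer (read_bounded is a hypothetical helper, not rasdaemon's code and not necessarily how upstream would fix it):

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: fill buf with at most cap bytes from fd.
 * Each read() asks only for the space still free, so the request
 * never exceeds the allocation. Returns bytes read, or -1 on error. */
static ssize_t read_bounded(int fd, char *buf, size_t cap)
{
    size_t size = 0;

    while (size < cap) {
        ssize_t rc = read(fd, buf + size, cap - size);

        if (rc < 0) {
            if (errno == EINTR)
                continue;   /* interrupted by a signal, retry */
            return -1;
        }
        if (rc == 0)
            break;          /* end of file */
        size += (size_t)rc;
    }
    return (ssize_t)size;
}
```

With page_size = 8192 and a first read of 1388 bytes, the second call in this sketch would request 8192 - 1388 = 6804 bytes instead of 8192.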
Hope to hear soon.
Sincerely
Samaresh Singh
From: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Sent: Wednesday, August 6, 2025 1:27 AM
To: Samaresh Singh <samaresh.s@partner.samsung.com>
Cc: linux-edac@vger.kernel.org; mchehab@kernel.org
Subject: Re: Coredump detected executing rasdaemon 0.8.3 using "rasdaemon -f -r" on host with AMD CPU with Ubuntu 24
On Tue, 5 Aug 2025 21:44:50 +0000
Samaresh Singh <samaresh.s@partner.samsung.com> wrote:
> Hi,
>
> I compiled rasdaemon 0.8.3 using the code at https://github.com/mchehab/rasdaemon/ on a machine with AMD CPUs and Ubuntu 24 as OS.
> The details about the CPU are as follows:
> +++
> processor : 167
> vendor_id : AuthenticAMD
> cpu family : 25
> model : 17
> model name : AMD EPYC 9634 84-Core Processor
> stepping : 1
> microcode : 0xa101148
> cpu MHz : 1500.000
> cache size : 1024 KB
> physical id : 0
> siblings : 168
> core id : 78
> cpu cores : 84
> apicid : 157
> initial apicid : 157
> fpu : yes
> fpu_exception : yes
> cpuid level : 16
> wp : yes
>
> root@msl-ssg-cx02:/usr/local/var/lib/rasdaemon# uname -a
> Linux msl-ssg-cx02.msl.lab 6.8.0-71-generic #71-Ubuntu SMP PREEMPT_DYNAMIC Tue Jul 22 16:52:38 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
> root@msl-ssg-cx02:/usr/local/var/lib/rasdaemon# cat /etc/lsb-release
> DISTRIB_ID=Ubuntu
> DISTRIB_RELEASE=24.04
> DISTRIB_CODENAME=noble
> DISTRIB_DESCRIPTION="Ubuntu 24.04.1 LTS"
> +++++
>
> I had installed all the required dependencies and configured the build using "--enable-sqlite3 --enable-aer --enable-mce". The compilation succeeded without any errors or warnings, but when I ran "rasdaemon -f -r" I got the core dump shown below:
> +++
> root@msl-ssg-cx02:/usr/local/var/lib/rasdaemon# rasdaemon --version
> rasdaemon 0.8.3
> root@msl-ssg-cx02:/usr/local/var/lib/rasdaemon# rasdaemon -f -r
> *** buffer overflow detected ***: terminated
> Aborted (core dumped)
> ++++
>
> Has this been reported by other folks?
That's new to me.
> Can you please look into this and let us know what could be causing this?
> If you have already observed it, are there any plans to provide a fix for this issue?
You need to use the core dump to identify which line of code caused
the issue. If your machine uses systemd, there is a utility
(coredumpctl) that lets you inspect it. There is a description
of how to do it at:
https://wiki.archlinux.org/title/Core_dump
Once you have identified the source, I suggest you take a look at the
code if it belongs to rasdaemon. If it came from somewhere else, e.g.
from one of rasdaemon's dependencies, then you may need to open a bug
against the specific package on Ubuntu, provided rasdaemon is passing
the data to it correctly.
Regards,
Mauro
>
> Hope to hear soon.
>
> Sincerely
> Samaresh K. Singh
> Memory Solutions Lab Team
> Samsung
Thanks,
Mauro