* Coredump detected executing rasdaemon 0.8.3 using "rasdaemon -f -r" on host with AMD CPU with Ubuntu 24
From: Samaresh Singh @ 2025-08-05 21:44 UTC
  To: linux-edac@vger.kernel.org; +Cc: mchehab@kernel.org

Hi,

  I compiled rasdaemon 0.8.3 from the source at https://github.com/mchehab/rasdaemon/ on a machine with AMD CPUs running Ubuntu 24.04.
The CPU details are as follows:
+++
processor       : 167
vendor_id       : AuthenticAMD
cpu family      : 25
model           : 17
model name      : AMD EPYC 9634 84-Core Processor
stepping        : 1
microcode       : 0xa101148
cpu MHz         : 1500.000
cache size      : 1024 KB
physical id     : 0
siblings        : 168
core id         : 78
cpu cores       : 84
apicid          : 157
initial apicid  : 157
fpu             : yes
fpu_exception   : yes
cpuid level     : 16
wp              : yes

root@msl-ssg-cx02:/usr/local/var/lib/rasdaemon# uname -a
Linux msl-ssg-cx02.msl.lab 6.8.0-71-generic #71-Ubuntu SMP PREEMPT_DYNAMIC Tue Jul 22 16:52:38 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
root@msl-ssg-cx02:/usr/local/var/lib/rasdaemon# cat /etc/lsb-release 
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=24.04
DISTRIB_CODENAME=noble
DISTRIB_DESCRIPTION="Ubuntu 24.04.1 LTS"
+++++
  
I installed all the required dependencies and ran configure with "-enable-sqlite3 -enable-aer -enable-mce".  Compilation succeeded without any errors or warnings, but when I ran "rasdaemon -f -r" I got the core dump shown below:
+++
root@msl-ssg-cx02:/usr/local/var/lib/rasdaemon# rasdaemon --version
rasdaemon 0.8.3
root@msl-ssg-cx02:/usr/local/var/lib/rasdaemon# rasdaemon -f -r
*** buffer overflow detected ***: terminated
Aborted (core dumped)
++++

Has this been reported by other folks?  Can you please look into this and let us know what could be causing this?  
If you have already observed it, are there any plans to provide a fix for this issue?

Hope to hear soon.

Sincerely
Samaresh K. Singh
Memory Solutions Lab Team
Samsung


* Re: Coredump detected executing rasdaemon 0.8.3 using "rasdaemon -f -r" on host with AMD CPU with Ubuntu 24
From: Mauro Carvalho Chehab @ 2025-08-06  8:27 UTC
  To: Samaresh Singh; +Cc: linux-edac@vger.kernel.org, mchehab@kernel.org

Em Tue, 5 Aug 2025 21:44:50 +0000
Samaresh Singh <samaresh.s@partner.samsung.com> escreveu:

> Hi,
> 
>   I compiled rasdaemon 0.8.3 from the source at https://github.com/mchehab/rasdaemon/ on a machine with AMD CPUs running Ubuntu 24.04.
> The CPU details are as follows:
> +++
> processor       : 167
> vendor_id       : AuthenticAMD
> cpu family      : 25
> model           : 17
> model name      : AMD EPYC 9634 84-Core Processor
> stepping        : 1
> microcode       : 0xa101148
> cpu MHz         : 1500.000
> cache size      : 1024 KB
> physical id     : 0
> siblings        : 168
> core id         : 78
> cpu cores       : 84
> apicid          : 157
> initial apicid  : 157
> fpu             : yes
> fpu_exception   : yes
> cpuid level     : 16
> wp              : yes
> 
> root@msl-ssg-cx02:/usr/local/var/lib/rasdaemon# uname -a
> Linux msl-ssg-cx02.msl.lab 6.8.0-71-generic #71-Ubuntu SMP PREEMPT_DYNAMIC Tue Jul 22 16:52:38 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
> root@msl-ssg-cx02:/usr/local/var/lib/rasdaemon# cat /etc/lsb-release 
> DISTRIB_ID=Ubuntu
> DISTRIB_RELEASE=24.04
> DISTRIB_CODENAME=noble
> DISTRIB_DESCRIPTION="Ubuntu 24.04.1 LTS"
> +++++
>   
> I installed all the required dependencies and ran configure with "-enable-sqlite3 -enable-aer -enable-mce".  Compilation succeeded without any errors or warnings, but when I ran "rasdaemon -f -r" I got the core dump shown below:
> +++
> root@msl-ssg-cx02:/usr/local/var/lib/rasdaemon# rasdaemon --version
> rasdaemon 0.8.3
> root@msl-ssg-cx02:/usr/local/var/lib/rasdaemon# rasdaemon -f -r
> *** buffer overflow detected ***: terminated
> Aborted (core dumped)
> ++++
> 
> Has this been reported by other folks? 

That's new to me.


> Can you please look into this and let us know what could be causing this?  
> If you have already observed it, are there any plans to provide a fix for this issue?

You need to use the core dump to identify which line of code caused
the issue. If your machine is using systemd, there is a utility
(coredumpctl) that allows you to inspect it. A description of how
to do it is at:

https://wiki.archlinux.org/title/Core_dump

Once you have identified where the crash comes from, I suggest you
take a look at the code if it belongs to rasdaemon. If it came from
somewhere else, e.g. from one of rasdaemon's dependencies, then you
may need to file a bug against that specific package on Ubuntu,
provided rasdaemon is passing the data to it correctly.

Regards,
Mauro

> 
> Hope to hear soon.
> 
> Sincerely
> Samaresh K. Singh
> Memory Solutions Lab Team
> Samsung



Thanks,
Mauro


* Re: Coredump detected executing rasdaemon 0.8.3 using "rasdaemon -f -r" on host with AMD CPU with Ubuntu 24
From: Samaresh Singh @ 2025-08-07  7:14 UTC
  To: Mauro Carvalho Chehab; +Cc: linux-edac@vger.kernel.org

Hi Mauro,

  Thanks for your response.  
  As you suggested, I ran the compiled binary under GDB. The stack trace below shows the crash at line 862 of ras-events.c, in the function add_event_handler().

A) Running the compiled binary under GDB and looking at the backtrace after the crash
++++

(gdb) r -f -r
Starting program: /home/samaresh.s/prac/ras1/rasdaemon/rasdaemon -f -r

Enable debuginfod for this session? (y or [n]) y
Debuginfod has been enabled.
To make this setting permanent, add 'set debuginfod enabled on' to .gdbinit.
Downloading separate debug info for system-supplied DSO at 0x7ffff7fc3000
Downloading separate debug info for /lib/x86_64-linux-gnu/libsqlite3.so.0                                                  
Downloading separate debug info for /lib/x86_64-linux-gnu/libtraceevent.so.1                                               
[Thread debugging using libthread_db enabled]                                                                              
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
*** buffer overflow detected ***: terminated

Program received signal SIGABRT, Aborted.
Download failed: Invalid argument.  Continuing without source file ./nptl/./nptl/pthread_kill.c.
__pthread_kill_implementation (no_tid=0, signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:44
warning: 44        ./nptl/pthread_kill.c: No such file or directory
(gdb) bt
#0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:44
#1  __pthread_kill_internal (signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:78
#2  __GI___pthread_kill (threadid=<optimized out>, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3  0x00007ffff7c4527e in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#4  0x00007ffff7c288ff in __GI_abort () at ./stdlib/abort.c:79
#5  0x00007ffff7c297b6 in __libc_message_impl (fmt=fmt@entry=0x7ffff7dce765 "*** %s ***: terminated\n")
    at ../sysdeps/posix/libc_fatal.c:134
#6  0x00007ffff7d36c19 in __GI___fortify_fail (msg=msg@entry=0x7ffff7dce74c "buffer overflow detected")
    at ./debug/fortify_fail.c:24
#7  0x00007ffff7d365d4 in __GI___chk_fail () at ./debug/chk_fail.c:28
#8  0x00007ffff7d37a47 in __GI___read_chk (fd=fd@entry=3, buf=buf@entry=0x55555558984c, nbytes=nbytes@entry=8192, 
    buflen=buflen@entry=6804) at ./debug/read_chk.c:24
#9  0x00005555555644e0 in read (__nbytes=8192, __buf=0x55555558984c, __fd=3)
    at /usr/include/x86_64-linux-gnu/bits/unistd.h:28
#10 add_event_handler (ras=ras@entry=0x5555555886b0, pevent=pevent@entry=0x555555589190, page_size=page_size@entry=8192, 
    group=group@entry=0x5555555734c9 "ras", event=event@entry=0x55555557339a "mc_event", 
    func=0x555555566700 <ras_mc_event_handler>, id=0, filter_str=0x0) at ras-events.c:862
#11 0x0000555555565bc2 in handle_ras_events (record_events=<optimized out>, enable_ipmitool=0) at ras-events.c:985
#12 0x000055555556326a in main (argc=3, argv=0x7fffffffe268) at rasdaemon.c:212
++++

B) Stepping through ras-events.c with a breakpoint, starting at line 837
+++
837                        fd = open_trace(ras, fname, O_RDONLY);
(gdb) n
838                        if (fd < 0) {
(gdb) p fd
$1 = 3
(gdb) n
852                        page = malloc(page_size);
(gdb) 
853                        if (!page) {
(gdb) p page
$2 = 0x5555555892e0 "trace_total_info"
(gdb) n
862                                       rc = read(fd, page + size, page_size);
(gdb) 
863                                       if (rc < 0) {
(gdb) p rc
$3 = 1388
(gdb) n
869                                       size += rc;
(gdb) n
870                        } while (rc > 0);
(gdb) 
862                                       rc = read(fd, page + size, page_size);
(gdb) 
*** buffer overflow detected ***: terminated

Program received signal SIGABRT, Aborted.
Download failed: Invalid argument.  Continuing without source file ./nptl/./nptl/pthread_kill.c.
__pthread_kill_implementation (no_tid=0, signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:44
warning: 44        ./nptl/pthread_kill.c: No such file or directory

++++
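
Reading the two traces together: the first read() returned 1388 bytes, and the second call again asks for the full page_size (8192) at page + size, while only 8192 - 1388 = 6804 bytes remain in the buffer. That matches nbytes=8192 and buflen=6804 in frame #8, which appears to be why the fortified read() wrapper aborts. A minimal sketch of a read loop that only requests the remaining space is below. read_bounded() is a hypothetical helper name used for illustration; it is not code from rasdaemon and not a proposed patch:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not taken from rasdaemon: read from fd into buf
 * until EOF or until cap bytes have been stored, whichever comes first.
 * Returns the number of bytes stored, or -1 on a read error.
 */
static ssize_t read_bounded(int fd, char *buf, size_t cap)
{
	size_t size = 0;

	assert(buf != NULL);

	while (size < cap) {
		/* Ask only for the space that is still free in buf. */
		ssize_t rc = read(fd, buf + size, cap - size);

		if (rc < 0) {
			if (errno == EINTR)
				continue;	/* interrupted by a signal, retry */
			return -1;
		}
		if (rc == 0)
			break;		/* end of file */
		size += (size_t)rc;
	}
	return (ssize_t)size;
}
```

In terms of the loop at line 862, this would correspond to passing page_size - size instead of page_size as the third argument of read(), assuming the buffer really is page_size bytes long.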

Hope to hear soon.

Sincerely
Samaresh Singh
