* RT File Logging
@ 2016-10-01 0:20 Sabar Siddhartha Dasgupta
2016-10-03 18:02 ` Austin Schuh
2016-10-03 18:44 ` Brian Silverman
0 siblings, 2 replies; 4+ messages in thread
From: Sabar Siddhartha Dasgupta @ 2016-10-01 0:20 UTC
To: linux-rt-users
Hi all,
I am working with the 4.4.12-rt19 kernel patch.
I have a realtime application that has separate processes running on
separate cores taking in data from the network, computing on that
data, and then logging results. I am attempting to log on the order of
10 KB of data per 1 ms tick to a file.
The logging process has access to all of the incoming data in shared
memory. Right now, I am using sqlite3 and sqlite3async to buffer the
database to memory in one thread of the logging process and then
commit the in-memory instance to file every second with a call to
sqlite3async_run().
The problem is that during part of the sqlite3async_run() execution,
the sqlite3_step() call that writes to the in-memory database buffer
stalls and violates my 1 ms timing guarantee.
This question may not be relevant here, but I am still not sure if the
error is happening because of how threaded processes work in a
realtime environment or because of how sqlite3async works. As far as I
can tell, sqlite3async is supposed to be able to buffer the database
in memory using the sqlite3 virtual file system and then handle the
actual file write with a background thread (as detailed here:
https://www.sqlite.org/asyncvfs.html). I have tried changing the
scheduling priority and nice value of each thread, to no avail.
Any help or suggestions would be greatly appreciated (or a pointer to
the right forum if this is not the right place).
Best,
Sabar
* Re: RT File Logging
From: Austin Schuh @ 2016-10-03 18:02 UTC
To: Sabar Siddhartha Dasgupta, rt-users
We've seen cases where disk IO ties up the interrupt line and causes
latency. Your application might also be lower priority than the IRQ
for the disk. My best advice for debugging latency events is to use
the kernel tracers to figure out what is happening when you have an
issue. Unfortunately, it is a big hammer. It takes a lot of work to
understand what is in the resulting logs, and enabling tracing can
cause other issues, but it lets you actually observe what is
happening.
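As a lighter first step before a full trace, the application can time each logging call against its budget and drop a marker into the ftrace buffer when it overruns, which makes the interesting spot easier to find in the trace output. This is only a sketch: the tracefs path is an assumption about where tracefs is mounted (older setups use /sys/kernel/debug/tracing), writing to it normally needs root, and `timed_call` and `mark` are hypothetical helper names.

```python
import time

# Assumption: tracefs is mounted here; adjust for your system.
TRACE_MARKER = "/sys/kernel/tracing/trace_marker"


def mark(message):
    """Best-effort write into the ftrace buffer; False if it is unavailable."""
    try:
        with open(TRACE_MARKER, "w") as marker:
            marker.write(message + "\n")
        return True
    except OSError:
        # Not root, tracefs not mounted, or tracing not built in.
        return False


def timed_call(fn, budget_s=0.001):
    """Run fn() and report whether it exceeded the time budget (1 ms default)."""
    start = time.monotonic()
    result = fn()
    elapsed = time.monotonic() - start
    overran = elapsed > budget_s
    if overran:
        mark(f"deadline overrun: {elapsed * 1e6:.0f} us")
    return result, elapsed, overran
```

Wrapping the per-tick database write in `timed_call` shows which ticks overrun and by how much; the marker then lines those ticks up with whatever the kernel tracers recorded around the same time.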
Good luck!
Austin
* Re: RT File Logging
From: Brian Silverman @ 2016-10-03 18:44 UTC
To: Sabar Siddhartha Dasgupta; +Cc: rt-users
Have you looked for any internal locks sqlite3 is using? Keeping in
mind I have no experience with that library whatsoever, it sounds like
you may be running into sqlite3async_run holding a lock while it does
IO, which causes sqlite3_step to block waiting for it to finish. Even
if the disk IO thread doesn't do much with the lock held, you could be
running into priority inversion
(https://en.wikipedia.org/wiki/Priority_inversion has a description
and some solutions).
You might have fewer problems if you split anything realtime (like
reading your data from shared memory) and all the sqlite3 things out
into separate processes. Pipes and POSIX shared memory queues should
work.
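A rough sketch of that split, assuming a line-oriented record format and a hypothetical `samples` table (neither comes from the original application): the producer only writes to a pipe, and a separate logger process owns every SQLite call.

```python
import os
import sqlite3
import subprocess
import sys
import tempfile

# Logger process: the only place that touches SQLite.
LOGGER_SCRIPT = r"""
import sqlite3, sys
db = sqlite3.connect(sys.argv[1])
db.execute("CREATE TABLE IF NOT EXISTS samples (tick INTEGER, value REAL)")
rows = ((int(t), float(v)) for t, v in (line.split() for line in sys.stdin))
db.executemany("INSERT INTO samples VALUES (?, ?)", rows)
db.commit()
db.close()
"""


def run_demo(n_ticks=100):
    """Feed n_ticks records through a pipe and return how many were logged."""
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "log.db")
        logger = subprocess.Popen(
            [sys.executable, "-c", LOGGER_SCRIPT, db_path],
            stdin=subprocess.PIPE,
            text=True,
        )
        # Producer side: stands in for the realtime loop.
        for tick in range(n_ticks):
            logger.stdin.write(f"{tick} {tick * 0.5}\n")
        logger.stdin.close()  # EOF tells the logger to commit and exit
        logger.wait()
        db = sqlite3.connect(db_path)
        count = db.execute("SELECT COUNT(*) FROM samples").fetchone()[0]
        db.close()
        return count
```

Note that a write to a full pipe still blocks, so a real realtime producer would probably want a non-blocking descriptor or a bounded queue that drops or counts records when the logger falls behind.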
If you're still having problems, it might be worth asking the sqlite3
people. Specifically, how they do locking sounds relevant.
Best of luck,
Brian
* Re: RT File Logging
From: Sabar Siddhartha Dasgupta @ 2016-10-07 7:08 UTC
To: Brian Silverman; +Cc: rt-users
Thank you for the help!
It ended up being a problem with sqlite3's journaling system waiting
for the writer to unlock the OS buffers before creating a rollback
file. Changing the journal configuration to keep the journal in memory
rather than on disk made the journaling way faster and fixed the problem.
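For anyone finding this later, a minimal sketch of that kind of change, assuming it is done with SQLite's `journal_mode` pragma (the table name and the `open_log_db` helper are made up for illustration, and this uses Python's sqlite3 module rather than the C API):

```python
import os
import sqlite3
import tempfile


def open_log_db(path):
    """Open a file-backed database whose rollback journal is kept in RAM."""
    db = sqlite3.connect(path)
    # The pragma returns the journal mode that is actually in effect.
    mode = db.execute("PRAGMA journal_mode = MEMORY").fetchone()[0]
    db.execute("CREATE TABLE IF NOT EXISTS samples (tick INTEGER, value REAL)")
    return db, mode


def demo():
    """Insert 1000 rows in one transaction; return (journal mode, row count)."""
    with tempfile.TemporaryDirectory() as tmp:
        db, mode = open_log_db(os.path.join(tmp, "log.db"))
        with db:  # commits on success, rolls back on exception
            db.executemany(
                "INSERT INTO samples VALUES (?, ?)",
                [(tick, tick * 0.5) for tick in range(1000)],
            )
        count = db.execute("SELECT COUNT(*) FROM samples").fetchone()[0]
        db.close()
        return mode, count
```

The trade-off is durability: with the journal held in memory, a crash in the middle of a transaction can leave the database file corrupt, so this fits logs that can tolerate that risk.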
I will keep the potential priority problems in mind for the future!
Thanks again,
Sabar