All of lore.kernel.org
 help / color / mirror / Atom feed
* Help with understanding throttle.finish()
@ 2016-04-28 18:15 Willem Jan Withagen
  2016-05-06 15:29 ` Could be due to a race condition... " Willem Jan Withagen
  0 siblings, 1 reply; 11+ messages in thread
From: Willem Jan Withagen @ 2016-04-28 18:15 UTC (permalink / raw)
  To: Ceph Development

Hi,

I'm running a rather simple setup on my FreeBSD port.

function TEST_simple() {
     # run the most simple config, and run a bechmark on it.
     local dir=$1

     run_mon $dir a || return 1
     run_osd $dir 0 || return 1

     #
     # default values should work
     #
     ceph tell osd.0 bench || return 1

}

This in the end crashes with:
     8059eec00 -1 FileStore: sync_entry timed out after 600 seconds.
exactly 10 minutes after startup.
This trhread does just about exactly nothing, it initialises the time, 
and then traps after 10 minutes.
# grep 8059eec00  testdir/osd-bench/osd.0.log
2016-04-28 19:51:44.444689 8059eec00 -1 FileStore: sync_entry timed out 
after 600 seconds.
2016-04-28 19:51:44.487104 8059eec00 -1 os/filestore/FileStore.cc: In 
function 'virtual void SyncEntryTimeout::finish(int)' thread 8059eec00 time

I've augmented just about every line of code in src/os/filestore/File*.cc

And I get as far as
	throttle.flush()
It seems to work the first time it gets called. That is, it returns.

However the second time is "stalls" in the call
	 throttle.put(to_put_bytes);
after a number of itterations thru the while-loop.

Below the code of Throttle::Flush and an sequence of the thread doing 
the blocking call
from the log where I've added mode debugging code.

Does anybody have a clue as to continue from here?
   Why does throttle.put not return?
   Is it just throttleing too hard and not reorganising the queue?

--WjW

Code:
std::pair<uint64_t, uint64_t> JournalThrottle::flush(uint64_t mono_id)
{
   uint64_t to_put_bytes = 0;
   uint64_t to_put_ops = 0;
   std::pair<uint64_t, uint64_t> ret;
   {
     locker l(lock);
     while (!journaled_ops.empty() &&
            journaled_ops.front().first <= mono_id) {
       to_put_bytes += journaled_ops.front().second;
       to_put_ops++;
       journaled_ops.pop_front();
     }
   }
   throttle.put(to_put_bytes);
   ret = make_pair(to_put_ops, to_put_bytes);
   return ret;
}

Thread log:
2016-04-28 19:41:43.231410 8059ec900 20 filestore(testdir/osd-bench/0) 
sync_entry(1:1) waiting for max_interval 5.000000
2016-04-28 19:41:43.231438 8059ec900 20 filestore(testdir/osd-bench/0) 
sync_entry(1:1) started at: 2016-04-28 19:41:43.231436
2016-04-28 19:41:44.395763 8059ec900 20 filestore(testdir/osd-bench/0) 
sync_entry(1:1) ended at: 2016-04-28 19:41:44.395758
2016-04-28 19:41:44.395781 8059ec900 20 filestore(testdir/osd-bench/0) 
sync_entry(1:1) waited for 1
2016-04-28 19:41:44.395789 8059ec900 20 filestore(testdir/osd-bench/0) 
sync_entry(1:1) woke after 1.164433
2016-04-28 19:41:44.395795 8059ec900 10 filestore(testdir/osd-bench/0) 
sync_entry(1:1) 3726
2016-04-28 19:41:44.395801 8059ec900 10 filestore(testdir/osd-bench/0) 
sync_entry(1:1) 3730
2016-04-28 19:41:44.395806 8059ec900 10 filestore(testdir/osd-bench/0) 
sync_entry(1:1) 3733
2016-04-28 19:41:44.400570 8059ec900 10 filestore(testdir/osd-bench/0) 
sync_entry(1:1) 3735 starting apply_manager.commit_start()
2016-04-28 19:41:44.400584 8059ec900 10 journal commit_start 
max_applied_seq 32, open_ops 0
2016-04-28 19:41:44.400588 8059ec900 10 journal commit_start blocked, 
all open_ops have completed
2016-04-28 19:41:44.400591 8059ec900 10 journal commit_start committing 
32, still blocked
2016-04-28 19:41:44.400596 8059ec900 10 journal FileJournal commit_start
2016-04-28 19:41:44.400599 8059ec900 10 journal commit_start 
FULL_NOTFULL.  All good.
2016-04-28 19:41:44.400602 8059ec900 10 journal 
apply_manager.commit_start() returning 1
2016-04-28 19:41:44.400605 8059ec900 10 filestore(testdir/osd-bench/0) 
sync_entry(1:1) 3738 done(true) apply_manager.commit_start()
2016-04-28 19:41:44.400637 8059ec900 10 filestore(testdir/osd-bench/0) 
sync_entry(1:1) 3743
2016-04-28 19:41:44.400643 8059ec900 10 filestore(testdir/osd-bench/0) 
sync_entry(1:1) 3745
2016-04-28 19:41:44.400653 8059ec900 10 filestore(testdir/osd-bench/0) 
sync_entry(1:1) 3748
2016-04-28 19:41:44.400673 8059ec900 10 filestore(testdir/osd-bench/0) 
sync_entry(1:1) 3750
2016-04-28 19:41:44.400680 8059ec900 10 filestore(testdir/osd-bench/0) 
sync_entry(1:1) 3752
2016-04-28 19:41:44.400685 8059ec900 15 filestore(testdir/osd-bench/0) 
sync_entry(1:1) committing 32
2016-04-28 19:41:44.400701 8059ec900 10 journal commit_started 
committing 32, unblocking
2016-04-28 19:41:44.400728 8059ec900 20 filestore dbobjectmap: seq is 5
2016-04-28 19:41:44.517112 8059ec900 15 
genericfilestorebackend(testdir/osd-bench/0) syncfs: doing a full sync 
(syncfs(2) if possible)
2016-04-28 19:41:45.471870 8059ec900 10 filestore(testdir/osd-bench/0) 
sync_entry(1:1) commit took 1.071257, interval was 2.240513
2016-04-28 19:41:45.471885 8059ec900 10 journal commit_finish 238 thru 
32 journal: 0x8058e5400
2016-04-28 19:41:45.473683 8059ec900 15 journal committed_thru:1804 locked.
2016-04-28 19:41:45.473697 8059ec900 10 throttle flush while 
!journaled_ops.empty()
2016-04-28 19:41:45.473701 8059ec900 10 throttle flush while 
!journaled_ops.empty()
2016-04-28 19:41:45.473705 8059ec900 10 throttle flush while 
!journaled_ops.empty()
2016-04-28 19:41:45.473707 8059ec900 10 throttle flush while 
!journaled_ops.empty()
2016-04-28 19:41:45.473710 8059ec900 10 throttle flush while 
!journaled_ops.empty()
2016-04-28 19:41:45.473747 8059ec900 10 throttle flush while 
!journaled_ops.empty()
2016-04-28 19:41:45.473750 8059ec900 10 throttle flush while 
!journaled_ops.empty()
2016-04-28 19:41:45.473753 8059ec900 10 throttle flush while 
!journaled_ops.empty()
2016-04-28 19:41:45.473756 8059ec900 10 throttle flush while 
!journaled_ops.empty()
2016-04-28 19:41:45.473758 8059ec900 10 throttle flush while 
!journaled_ops.empty()
2016-04-28 19:41:45.473761 8059ec900 10 throttle flush while 
!journaled_ops.empty()
2016-04-28 19:41:45.473763 8059ec900 10 throttle flush while 
!journaled_ops.empty()
2016-04-28 19:41:45.473767 8059ec900 10 throttle flush while 
!journaled_ops.empty()
2016-04-28 19:41:45.473772 8059ec900 10 throttle flush while 
!journaled_ops.empty()
2016-04-28 19:41:45.473775 8059ec900 10 throttle flush while 
!journaled_ops.empty()
2016-04-28 19:41:45.473777 8059ec900 10 throttle flush while 
!journaled_ops.empty()
2016-04-28 19:41:45.473779 8059ec900 10 throttle flush while 
!journaled_ops.empty()
2016-04-28 19:41:45.473781 8059ec900 10 throttle flush while 
!journaled_ops.empty()
2016-04-28 19:41:45.473784 8059ec900 10 throttle flush while 
!journaled_ops.empty()
2016-04-28 19:41:45.473786 8059ec900 10 throttle flush while 
!journaled_ops.empty()
2016-04-28 19:41:45.473788 8059ec900 10 throttle flush while 
!journaled_ops.empty()
2016-04-28 19:41:45.473797 8059ec900 10 throttle flush while 
!journaled_ops.empty()
2016-04-28 19:41:45.473800 8059ec900 10 throttle flush while 
!journaled_ops.empty()
2016-04-28 19:41:45.473803 8059ec900 10 throttle flush while 
!journaled_ops.empty()
2016-04-28 19:41:45.473806 8059ec900 10 throttle flush while 
!journaled_ops.empty()
2016-04-28 19:41:45.473809 8059ec900 10 throttle flush while 
!journaled_ops.empty()
2016-04-28 19:41:45.473811 8059ec900 10 throttle flush while 
!journaled_ops.empty()
2016-04-28 19:41:45.473814 8059ec900 10 throttle flush while 
!journaled_ops.empty()
2016-04-28 19:41:45.473817 8059ec900 10 throttle flush while 
!journaled_ops.empty()
2016-04-28 19:41:45.473819 8059ec900 10 throttle flush while 
!journaled_ops.empty()
2016-04-28 19:41:45.473822 8059ec900 10 throttle flush:63 calling 
throttle.put()


^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2016-05-13 12:23 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-04-28 18:15 Help with understanding throttle.finish() Willem Jan Withagen
2016-05-06 15:29 ` Could be due to a race condition... " Willem Jan Withagen
2016-05-06 18:41   ` Gregory Farnum
2016-05-06 21:04     ` Willem Jan Withagen
2016-05-09  8:44       ` Willem Jan Withagen
2016-05-11  9:46         ` Willem Jan Withagen
2016-05-11 13:43           ` Gregory Farnum
2016-05-12 21:02             ` Willem Jan Withagen
2016-05-12 22:22               ` Samuel Just
2016-05-12 22:29                 ` Willem Jan Withagen
2016-05-12 22:51                 ` Willem Jan Withagen

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.