From: Willem Jan Withagen <wjw@digiware.nl>
To: "Xinze Chi (信泽)" <xmdxcxz@gmail.com>
Cc: Ceph Development <ceph-devel@vger.kernel.org>
Subject: Re: FreeBSD Building and Testing
Date: Mon, 21 Dec 2015 21:14:18 +0100
Message-ID: <56785D9A.2020701@digiware.nl>
In-Reply-To: <CANE=7sVw1zhxYzT9q6WKkUV5tCioLuO0J_ZKCkdKXR5jFDHsKg@mail.gmail.com>

On 21-12-2015 01:45, Xinze Chi (信泽) wrote:
> Sorry for the delayed reply. Please give this a try:
> https://github.com/ceph/ceph/commit/ae4a8162eacb606a7f65259c6ac236e144bfef0a.

Tried this one first:
============================================================================
Testsuite summary for ceph 10.0.1
============================================================================
# TOTAL: 120
# PASS:  100
# SKIP:  0
# XFAIL: 0
# FAIL:  20
# XPASS: 0
# ERROR: 0
============================================================================

So that certainly helps.
I have not yet analyzed the log files, but it seems we are getting
somewhere.
I had to manually kill a rados process in:
  | |     |             \-+- 09792 wjw /bin/sh ../test-driver ./test/ceph_objectstore_tool.py
  | |     |               \-+- 09807 wjw python ./test/ceph_objectstore_tool.py (python2.7)
  | |     |                 \--- 11406 wjw /usr/srcs/Ceph/wip-freebsd-wjw/ceph/src/.libs/rados -p rep_pool -N put REPobject1 /tmp/data.9807/-REPobject1__head

But 2 mon-osd's were also running, and perhaps one did not belong
to that test, so they could have been in each other's way.

Found some failures in the OSDs at:

./test-suite.log:osd/ECBackend.cc: 201: FAILED assert(res.errors.empty())
./test-suite.log:osd/ECBackend.cc: 201: FAILED assert(res.errors.empty())

struct OnRecoveryReadComplete :
   public GenContext<pair<RecoveryMessages*, ECBackend::read_result_t& > &> {
   ECBackend *pg;
   hobject_t hoid;
   set<int> want;
   OnRecoveryReadComplete(ECBackend *pg, const hobject_t &hoid)
     : pg(pg), hoid(hoid) {}
   void finish(pair<RecoveryMessages *, ECBackend::read_result_t &> &in) {
     ECBackend::read_result_t &res = in.second;
     // FIXME???
     assert(res.r == 0);
201:    assert(res.errors.empty());
     assert(res.returned.size() == 1);
     pg->handle_recovery_read_complete(
       hoid,
       res.returned.back(),
       res.attrs,
       in.first);
   }
};

Given the FIXME???, the code here could be fishy.

I would say that just this patch would be sufficient.
The second patch also looks like it could be useful, since it
lowers the bar on what is being tested. And when alignment is only
required because of (a)iovec processing, 4096 will likely suffice.
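
For reference, the numbers in the journal crash log quoted below can be
re-checked with a small sketch. It assumes the alignment test looks at the
raw buffer address and the length; the helper and constant names here are
hypothetical and only mirror the labels in the log, they are not Ceph's
bufferlist implementation.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not Ceph's API): true when addr is a multiple of
// align. align must be a power of two.
constexpr bool is_aligned(std::uint64_t addr, std::uint64_t align) {
  return (addr & (align - 1)) == 0;
}

// Hypothetical helper: true when len is a whole number of align-sized units.
constexpr bool is_n_align_sized(std::uint64_t len, std::uint64_t align) {
  return len % align == 0;
}

// Values copied from the quoted journal log.
constexpr std::uint64_t kBufAddr = 0x805319000ULL;  // buffer::ptr raw address
constexpr std::uint64_t kBufLen = 131072;           // bl length
constexpr std::uint64_t kBlockSize = 131072;        // block_size, header.alignment
constexpr std::uint64_t kMinBlockSize = 4096;       // CEPH_MINIMUM_BLOCK_SIZE

// Re-checks the facts the log reports; aborts if any of them does not hold.
inline void check_journal_log_values() {
  // log: bl.is_aligned(block_size) 0
  assert(!is_aligned(kBufAddr, kBlockSize));
  // the same address is aligned to 4096 bytes
  assert(is_aligned(kBufAddr, kMinBlockSize));
  // log: bl.is_n_align_sized(CEPH_MINIMUM_BLOCK_SIZE) 1
  assert(is_n_align_sized(kBufLen, kMinBlockSize));
}
```

Calling check_journal_log_values() from a main() passes silently: the
buffer is 4096-aligned but not 131072-aligned, which is consistent with a
check against 4096 succeeding where the check against block_size fails.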

Thank you very much for the help.

--WjW


> 2015-12-21 0:10 GMT+08:00 Willem Jan Withagen <wjw@digiware.nl>:
>> Hi,
>>
>> Most of Ceph is getting there, in the most crude and rough state.
>> So below is a status update on what is not working for me yet.
>>
>> Especially help with the alignment problem in os/FileJournal.cc would be
>> appreciated... It would allow me to run ceph-osd and run more tests to
>> completion.
>>
>> What would happen if I comment out this test, and ignore the fact that
>> things might be unaligned?
>> Is it a performance/paging issue?
>> Or is data going to be corrupted?
>>
>> --WjW
>>
>> PASS: src/test/run-cli-tests
>> ============================================================================
>> Testsuite summary for ceph 10.0.0
>> ============================================================================
>> # TOTAL: 1
>> # PASS:  1
>> # SKIP:  0
>> # XFAIL: 0
>> # FAIL:  0
>> # XPASS: 0
>> # ERROR: 0
>> ============================================================================
>>
>> gmake test:
>> ============================================================================
>> Testsuite summary for ceph 10.0.0
>> ============================================================================
>> # TOTAL: 119
>> # PASS:  95
>> # SKIP:  0
>> # XFAIL: 0
>> # FAIL:  24
>> # XPASS: 0
>> # ERROR: 0
>> ============================================================================
>>
>> The following notes can be made about this:
>> 1) the run-cli-tests run to completion because I excluded the RBD tests
>> 2) gmake test has the following tests FAIL:
>> FAIL: unittest_erasure_code_plugin
>> FAIL: ceph-detect-init/run-tox.sh
>> FAIL: test/erasure-code/test-erasure-code.sh
>> FAIL: test/erasure-code/test-erasure-eio.sh
>> FAIL: test/run-rbd-unit-tests.sh
>> FAIL: test/ceph_objectstore_tool.py
>> FAIL: test/test-ceph-helpers.sh
>> FAIL: test/cephtool-test-osd.sh
>> FAIL: test/cephtool-test-mon.sh
>> FAIL: test/cephtool-test-mds.sh
>> FAIL: test/cephtool-test-rados.sh
>> FAIL: test/mon/osd-crush.sh
>> FAIL: test/osd/osd-scrub-repair.sh
>> FAIL: test/osd/osd-scrub-snaps.sh
>> FAIL: test/osd/osd-config.sh
>> FAIL: test/osd/osd-bench.sh
>> FAIL: test/osd/osd-reactivate.sh
>> FAIL: test/osd/osd-copy-from.sh
>> FAIL: test/libradosstriper/rados-striper.sh
>> FAIL: test/test_objectstore_memstore.sh
>> FAIL: test/ceph-disk.sh
>> FAIL: test/pybind/test_ceph_argparse.py
>> FAIL: test/pybind/test_ceph_daemon.py
>> FAIL: ../qa/workunits/erasure-code/encode-decode-non-regression.sh
>>
>> Most of the failures are because ceph-osd crashed consistently on:
>> -1 journal  bl.is_aligned(block_size) 0 bl.is_n_align_sized(CEPH_MINIMUM_BLOCK_SIZE) 1
>> -1 journal  block_size 131072 CEPH_MINIMUM_BLOCK_SIZE 4096 CEPH_PAGE_SIZE 4096 header.alignment 131072 bl buffer::list(len=131072, buffer::ptr(0~131072 0x805319000 in raw 0x805319000 len 131072 nref 1))
>> os/FileJournal.cc: In function 'void FileJournal::align_bl(off64_t, bufferlist &)' thread 805217400 time 2015-12-19 13:43:06.706797
>> os/FileJournal.cc: 1045: FAILED assert(0 == "bl should be align")
>>
>> This has been bugging me for a few days already, but I haven't found an
>> easy way to debug it, either by running it live in gdb or post-mortem.
>>
>> Further:
>> A) unittest_erasure_code_plugin fails because a different error code
>> is returned when dlopen-ing a nonexistent library.
>> load dlopen(.libs/libec_invalid.so): Cannot open ".libs/libec_invalid.so"
>> load dlsym(.libs/libec_missing_version.so, __erasure_code_init): Undefined symbol "__erasure_code_init"
>> test/erasure-code/TestErasureCodePlugin.cc:88: Failure
>> Value of: instance.factory("missing_version", g_conf->erasure_code_dir, profile, &erasure_code, &cerr)
>>    Actual: -2
>> Expected: -18
>> load dlsym(.libs/libec_missing_entry_point.so, __erasure_code_init): Undefined symbol "__erasure_code_init"
>> erasure_code_init(fail_to_initialize,.libs): (3) No such process
>> load __erasure_code_init()did not register fail_to_register
>> load: example erasure_code_init(example,.libs): (17) File exists
>> load: example
>> [  FAILED  ] ErasureCodePluginRegistryTest.all (330 ms)
>>
>> B) ceph-detect-init/run-tox.sh fails because I still need to add
>> FreeBSD to the tests.
>>
>> C) ./gtest/include/gtest/internal/gtest-port.h:1358:: Condition has_owner_ && pthread_equal(owner_, pthread_self()) failed. The current thread is not holding the mutex @0x161ef20
>> ./test/run-rbd-unit-tests.sh: line 9: 78053 Abort trap (core dumped) unittest_librbd
>>
>> I think I found some commit comments about this, in either trac or git,
>> saying that FreeBSD is not able to do things to its own thread. Got to
>> look into this.
>>
>> D) Fix some of the other python code to work as expected.
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html

