All of lore.kernel.org
 help / color / mirror / Atom feed
* A plea for more tooling usage
@ 2014-05-06 13:48 Daniel Hofmann
  2014-05-06 15:19 ` Sage Weil
  2014-05-08 10:37 ` John Spray
  0 siblings, 2 replies; 11+ messages in thread
From: Daniel Hofmann @ 2014-05-06 13:48 UTC (permalink / raw)
  To: ceph-devel

Preamble: you might want to read the decent formatted version of this mail at:
> https://gist.github.com/daniel-j-h/06f16015c89bec4fbb42


Motivation
----------

I tried to build Ceph with the Clang (3.4) compiler. The only issue
preventing the build to finish was a "VLA of non-POD type" usage which
I fixed here:

https://github.com/ceph/ceph/pull/1768


While the build was resuming I was wondering why no one stumpled upon this
issue in the first place. Has nobody ever tried to compile Ceph with Clang?

So I did what I'm always doing to quick-check a project's code quality:
I set the compiler to Clang. I set flags to -Weverything.

The idea is to gradually disable warnings one by one that are not interesting.

Instead of doing this by hand I used the following flags that I
accumulated over the last weeks or month to disable a handful of not so
important warnings:

> -Weverything -Wno-padded -Wno-c++11-long-long -Wno-variadic-macros
> -Wno-c++11-extensions -Wno-global-constructors -Wno-exit-time-destructors
> -Wno-sign-conversion -Wno-weak-vtables -Wno-disabled-macro-expansion
> -Wno-c99-extensions -Wno-shadow -Wno-documentation-unknown-command -Wno-undef
> -Wno-documentation -Wno-duplicate-enum -Wno-switch-enum -Wno-unused-parameter
> -Wno-packed

Let's build Ceph and have a look at what Clang's -Weverything catches:

    ./autogen.sh
    ./configure --without-libxfs
    make -j 1 2> diagnostics.txt

Many warnings are duplicates because of header includes and the like.
Let's remove duplicates and only keep unique diagnostics:

    $ egrep "\[-.*\]" diagnostics.txt | sort | uniq > uniq.txt
    $ wc -l uniq.txt 
    12286 uniq.txt

Wow. 12286 unique diagnostics. Interested in what showed up?


How bad is it
-------------

I ended up going through the diagnostic categories, because I had a few hours
to spare. The methodology I used was the following:

Check for diagnostic categories, prioritize them, and after that remove all
the occurences in the original diagnostics file by:

    $ sed -i '/Wflag/d' uniq.txt

In the following I'll give you a quick summary of the most important
diagnostics. After this there are still some diagnostics left.

    $ wc -l uniq.txt 
    3362 uniq.txt

This can be explained by diagnostics that are not important at all, such as
unused exception parameter in a catch clause.

But there are also conversions between types that have to be studied one by
one to check if a loss in precision or comparison of types with different
signedness could lead to dangerous behavior.


Note: Clang's diagnostics can be customized, too:

http://clang.llvm.org/docs/UsersManual.html#formatting-of-diagnostics


So let's look through the categories.


Undefined Behavior
------------------

> warning: 'va_start' has undefined behavior with reference types [-Wvarargs]

    $ grep Wvarargs uniq.txt | wc -l
    1

Here's the paragraph from the C++11 Standard, 18.10 §3:
> If the parameter parmN is declared with a function, array, or reference type, or with a type that is not compatible with the type that results when passing an argument for which there is no parameter, the behavior is undefined.

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3337.pdf


Conditional Uninitialized
-------------------------

> warning: variable '%' may be uninitialized when used here [-Wconditional-uninitialized]

    $ grep Wconditional-uninitialized uniq.txt | wc -l
    4


Unreachable Code
----------------

> warning: will never be executed [-Wunreachable-code]

    $ grep Wunreachable-code uniq.txt | wc -l
    155


Extensions to the C++ language
------------------------------

There's this vast amount of code that is definitely not conforming to
standard C++. Some extensions are used, such as Variable Length Arrays (VLAs,
which were the reason for my issue in the first place), some GCC specificas.

You probably have to decide if you want to go this route or if you want to do
it the portable and standard C++ way.

VLAs are a good example where the standard C++ way of doing it is in no way
harder to accomplish. See the following explanation:

http://clang.llvm.org/compatibility.html#vla


> warning: anonymous types declared in an anonymous union are an extension [-Wnested-anon-types]

    $ grep Wnested-anon-types uniq.txt  | wc -l
    28

> warning: zero size arrays are an extension [-Wzero-length-array]

    $ grep Wzero-length-array uniq.txt | wc -l
    8134

> warning: anonymous structs are a GNU extension [-Wgnu-anonymous-struct]

    $ grep Wgnu-anonymous-struct uniq.txt  | wc -l
    2

> warning: '%' may not be nested in a struct due to flexible array member [-Wflexible-array-extensions]

    $ grep Wflexible-array-extensions uniq.txt | wc -l
    2

> warning: variable length array used [-Wvla]
> warning: variable length arrays are a C99 feature [-Wvla-extension]

    $ grep Wvla uniq.txt | wc -l
    178

> warning: cast between pointer-to-function and pointer-to-object is an extension [-Wpedantic]
> warning: use of non-standard escape character '\%' [-Wpedantic]

    $ grep Wpedantic uniq.txt | wc -l
    15

> warning: use of GNU old-style field designator extension [-Wgnu-designator]

    $ grep Wgnu-designator uniq.txt | wc -l
    43

> warning: anonymous unions are a C11 extension [-Wc11-extensions]

    $ grep Wc11-extensions uniq.txt | wc -l
    3


Miscellaneous:
--------------

> warning: '%' hides overloaded virtual function [-Woverloaded-virtual]

    $ grep Woverloaded-virtual uniq.txt | wc -l
    3

> warning: using namespace directive in global context in header [-Wheader-hygiene]

    $ grep Wheader-hygiene uniq.txt | wc -l
    87

> warning: extra ';' after member function definition [-Wextra-semi]

    $ grep Wextra-semi uniq.txt | wc -l
    135

> warning: struct '%' was previously declared as a class [-Wmismatched-tags]

    $ grep Wmismatched-tags uniq.txt | wc -l
    93

> warning: missing field '%' initializer [-Wmissing-field-initializers]

    $ grep Wmissing-field-initializer uniq.txt | wc -l
    3

> warning: private field '%' is not used [-Wunused-private-field]

    $ grep Wunused-private-field uniq.txt | wc -l
    11

> warning: function '%' could be declared with attribute 'noreturn' [-Wmissing-noreturn]

    $ grep Wmissing-noreturn uniq.txt | wc -l
    23

> warning: token pasting of ',' and __VA_ARGS__ is a GNU extension [-Wgnu-zero-variadic-macro-arguments]

    $ grep Wgnu-zero-variadic-macro-arguments uniq.txt | wc -l
    4



Style
-----

While fixing my initial VLA issue I skimmed through the relevant code
in src/rgw/rgw_main.cc.

What I noticed is that already in this file there's no consistent style.
There's a "using namespace std;" at the top, then in the file there's an
alternate usage between vector<string> and std::vector<std::string>.

It also seems to me that the memory management is often done by hand, e.g.
in line 377 "string *objs = new string[num_objs];" and later the "delete []"
instead of just using a std::vector<std::string>.


Summary
-------

Using Clang-3.4's -Weverything reveals critical issues.

What I hope for Ceph's general code quality is to improve by using the tools
at hand. This means especially:

Do the same check with an up to date Clang (that's one of the reasons I'm not
pointing out all the issues or the concrete source code positions), look
through them in detail, such as conversions and left-out diagnostics, and fix
as many as critical issues as possible.

You probably could even use Clang's -fsanitize flag to check for issues at runtime.
http://clang.llvm.org/docs/UsersManual.html#controlling-code-generation

Also keep in mind that this check was only done on the userspace code in the
ceph/ceph repository.


Thanks,
Daniel J. Hofmann
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: A plea for more tooling usage
  2014-05-06 13:48 A plea for more tooling usage Daniel Hofmann
@ 2014-05-06 15:19 ` Sage Weil
  2014-05-06 15:35   ` Milosz Tanski
  2014-05-15 23:34   ` Thorsten Behrens
  2014-05-08 10:37 ` John Spray
  1 sibling, 2 replies; 11+ messages in thread
From: Sage Weil @ 2014-05-06 15:19 UTC (permalink / raw)
  To: Daniel Hofmann; +Cc: ceph-devel

On Tue, 6 May 2014, Daniel Hofmann wrote:
> Preamble: you might want to read the decent formatted version of this mail at:
> > https://gist.github.com/daniel-j-h/06f16015c89bec4fbb42
> 
> 
> Motivation
> ----------
> 
> I tried to build Ceph with the Clang (3.4) compiler. The only issue
> preventing the build to finish was a "VLA of non-POD type" usage which
> I fixed here:
> 
> https://github.com/ceph/ceph/pull/1768

Merged, thanks!

Daniel, this is great.  We've been clean on gcc for a long time but I (at 
least) didn't realize there was a huge set of potentially important issues 
that clang warns about that gcc does not.  We definitely want to clean 
these up and get the right set of warnings whitelisted.  Alfredo is 
setting up a jenkins job for this so we can follow it going forward, but 
there is clearly a pretty big initial effort needed to get the existing 
warnings cleaned up.  If anybody is interested in helping with that 
effort, pull requests are very welcome!  :)

sage


> 
> While the build was resuming I was wondering why no one stumpled upon this
> issue in the first place. Has nobody ever tried to compile Ceph with Clang?
> 
> So I did what I'm always doing to quick-check a project's code quality:
> I set the compiler to Clang. I set flags to -Weverything.
> 
> The idea is to gradually disable warnings one by one that are not interesting.
> 
> Instead of doing this by hand I used the following flags that I
> accumulated over the last weeks or month to disable a handful of not so
> important warnings:
> 
> > -Weverything -Wno-padded -Wno-c++11-long-long -Wno-variadic-macros
> > -Wno-c++11-extensions -Wno-global-constructors -Wno-exit-time-destructors
> > -Wno-sign-conversion -Wno-weak-vtables -Wno-disabled-macro-expansion
> > -Wno-c99-extensions -Wno-shadow -Wno-documentation-unknown-command -Wno-undef
> > -Wno-documentation -Wno-duplicate-enum -Wno-switch-enum -Wno-unused-parameter
> > -Wno-packed
> 
> Let's build Ceph and have a look at what Clang's -Weverything catches:
> 
>     ./autogen.sh
>     ./configure --without-libxfs
>     make -j 1 2> diagnostics.txt
> 
> Many warnings are duplicates because of header includes and the like.
> Let's remove duplicates and only keep unique diagnostics:
> 
>     $ egrep "\[-.*\]" diagnostics.txt | sort | uniq > uniq.txt
>     $ wc -l uniq.txt 
>     12286 uniq.txt
> 
> Wow. 12286 unique diagnostics. Interested in what showed up?
> 
> 
> How bad is it
> -------------
> 
> I ended up going through the diagnostic categories, because I had a few hours
> to spare. The methodology I used was the following:
> 
> Check for diagnostic categories, prioritize them, and after that remove all
> the occurences in the original diagnostics file by:
> 
>     $ sed -i '/Wflag/d' uniq.txt
> 
> In the following I'll give you a quick summary of the most important
> diagnostics. After this there are still some diagnostics left.
> 
>     $ wc -l uniq.txt 
>     3362 uniq.txt
> 
> This can be explained by diagnostics that are not important at all, such as
> unused exception parameter in a catch clause.
> 
> But there are also conversions between types that have to be studied one by
> one to check if a loss in precision or comparison of types with different
> signedness could lead to dangerous behavior.
> 
> 
> Note: Clang's diagnostics can be customized, too:
> 
> http://clang.llvm.org/docs/UsersManual.html#formatting-of-diagnostics
> 
> 
> So let's look through the categories.
> 
> 
> Undefined Behavior
> ------------------
> 
> > warning: 'va_start' has undefined behavior with reference types [-Wvarargs]
> 
>     $ grep Wvarargs uniq.txt | wc -l
>     1
> 
> Here's the paragraph from the C++11 Standard, 18.10 ?3:
> > If the parameter parmN is declared with a function, array, or reference type, or with a type that is not compatible with the type that results when passing an argument for which there is no parameter, the behavior is undefined.
> 
> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3337.pdf
> 
> 
> Conditional Uninitialized
> -------------------------
> 
> > warning: variable '%' may be uninitialized when used here [-Wconditional-uninitialized]
> 
>     $ grep Wconditional-uninitialized uniq.txt | wc -l
>     4
> 
> 
> Unreachable Code
> ----------------
> 
> > warning: will never be executed [-Wunreachable-code]
> 
>     $ grep Wunreachable-code uniq.txt | wc -l
>     155
> 
> 
> Extensions to the C++ language
> ------------------------------
> 
> There's this vast amount of code that is definitely not conforming to
> standard C++. Some extensions are used, such as Variable Length Arrays (VLAs,
> which were the reason for my issue in the first place), some GCC specificas.
> 
> You probably have to decide if you want to go this route or if you want to do
> it the portable and standard C++ way.
> 
> VLAs are a good example where the standard C++ way of doing it is in no way
> harder to accomplish. See the following explanation:
> 
> http://clang.llvm.org/compatibility.html#vla
> 
> 
> > warning: anonymous types declared in an anonymous union are an extension [-Wnested-anon-types]
> 
>     $ grep Wnested-anon-types uniq.txt  | wc -l
>     28
> 
> > warning: zero size arrays are an extension [-Wzero-length-array]
> 
>     $ grep Wzero-length-array uniq.txt | wc -l
>     8134
> 
> > warning: anonymous structs are a GNU extension [-Wgnu-anonymous-struct]
> 
>     $ grep Wgnu-anonymous-struct uniq.txt  | wc -l
>     2
> 
> > warning: '%' may not be nested in a struct due to flexible array member [-Wflexible-array-extensions]
> 
>     $ grep Wflexible-array-extensions uniq.txt | wc -l
>     2
> 
> > warning: variable length array used [-Wvla]
> > warning: variable length arrays are a C99 feature [-Wvla-extension]
> 
>     $ grep Wvla uniq.txt | wc -l
>     178
> 
> > warning: cast between pointer-to-function and pointer-to-object is an extension [-Wpedantic]
> > warning: use of non-standard escape character '\%' [-Wpedantic]
> 
>     $ grep Wpedantic uniq.txt | wc -l
>     15
> 
> > warning: use of GNU old-style field designator extension [-Wgnu-designator]
> 
>     $ grep Wgnu-designator uniq.txt | wc -l
>     43
> 
> > warning: anonymous unions are a C11 extension [-Wc11-extensions]
> 
>     $ grep Wc11-extensions uniq.txt | wc -l
>     3
> 
> 
> Miscellaneous:
> --------------
> 
> > warning: '%' hides overloaded virtual function [-Woverloaded-virtual]
> 
>     $ grep Woverloaded-virtual uniq.txt | wc -l
>     3
> 
> > warning: using namespace directive in global context in header [-Wheader-hygiene]
> 
>     $ grep Wheader-hygiene uniq.txt | wc -l
>     87
> 
> > warning: extra ';' after member function definition [-Wextra-semi]
> 
>     $ grep Wextra-semi uniq.txt | wc -l
>     135
> 
> > warning: struct '%' was previously declared as a class [-Wmismatched-tags]
> 
>     $ grep Wmismatched-tags uniq.txt | wc -l
>     93
> 
> > warning: missing field '%' initializer [-Wmissing-field-initializers]
> 
>     $ grep Wmissing-field-initializer uniq.txt | wc -l
>     3
> 
> > warning: private field '%' is not used [-Wunused-private-field]
> 
>     $ grep Wunused-private-field uniq.txt | wc -l
>     11
> 
> > warning: function '%' could be declared with attribute 'noreturn' [-Wmissing-noreturn]
> 
>     $ grep Wmissing-noreturn uniq.txt | wc -l
>     23
> 
> > warning: token pasting of ',' and __VA_ARGS__ is a GNU extension [-Wgnu-zero-variadic-macro-arguments]
> 
>     $ grep Wgnu-zero-variadic-macro-arguments uniq.txt | wc -l
>     4
> 
> 
> 
> Style
> -----
> 
> While fixing my initial VLA issue I skimmed through the relevant code
> in src/rgw/rgw_main.cc.
> 
> What I noticed is that already in this file there's no consistent style.
> There's a "using namespace std;" at the top, then in the file there's an
> alternate usage between vector<string> and std::vector<std::string>.
> 
> It also seems to me that the memory management is often done by hand, e.g.
> in line 377 "string *objs = new string[num_objs];" and later the "delete []"
> instead of just using a std::vector<std::string>.
> 
> 
> Summary
> -------
> 
> Using Clang-3.4's -Weverything reveals critical issues.
> 
> What I hope for Ceph's general code quality is to improve by using the tools
> at hand. This means especially:
> 
> Do the same check with an up to date Clang (that's one of the reasons I'm not
> pointing out all the issues or the concrete source code positions), look
> through them in detail, such as conversions and left-out diagnostics, and fix
> as many as critical issues as possible.
> 
> You probably could even use Clang's -fsanitize flag to check for issues at runtime.
> http://clang.llvm.org/docs/UsersManual.html#controlling-code-generation
> 
> Also keep in mind that this check was only done on the userspace code in the
> ceph/ceph repository.
> 
> 
> Thanks,
> Daniel J. Hofmann
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: A plea for more tooling usage
  2014-05-06 15:19 ` Sage Weil
@ 2014-05-06 15:35   ` Milosz Tanski
  2014-05-08 11:37     ` Daniel Hofmann
  2014-05-15 23:34   ` Thorsten Behrens
  1 sibling, 1 reply; 11+ messages in thread
From: Milosz Tanski @ 2014-05-06 15:35 UTC (permalink / raw)
  To: Sage Weil; +Cc: Daniel Hofmann, ceph-devel

Indeed clangs warnings are much exhaustive. We've also had good
success with static analyzer based on clang
(http://clang-analyzer.llvm.org/). We've found a few logic issues in
our application using it as well... esp. a few bugs that where corner
cases had to due to very specific arguments passed cross function (and
back). I don't know how it relates to Coverity is complementary or
more of a replacement.

In terms of the sanitize settings one has to be careful with which
ones are enabled as some of them (thread sanitizer, memory sanitizer)
give you a pretty big performance hit. Not as bad as valgrind (5x to
20x slowdown) but more like 2x to 5x slowdown. It's best to use to
build specialized debug builds to run through integration testing.

P.S: Sorry for the dup-email yet again. Ugh @ gmail.

On Tue, May 6, 2014 at 11:19 AM, Sage Weil <sage@inktank.com> wrote:
> On Tue, 6 May 2014, Daniel Hofmann wrote:
>> Preamble: you might want to read the decent formatted version of this mail at:
>> > https://gist.github.com/daniel-j-h/06f16015c89bec4fbb42
>>
>>
>> Motivation
>> ----------
>>
>> I tried to build Ceph with the Clang (3.4) compiler. The only issue
>> preventing the build to finish was a "VLA of non-POD type" usage which
>> I fixed here:
>>
>> https://github.com/ceph/ceph/pull/1768
>
> Merged, thanks!
>
> Daniel, this is great.  We've been clean on gcc for a long time but I (at
> least) didn't realize there was a huge set of potentially important issues
> that clang warns about that gcc does not.  We definitely want to clean
> these up and get the right set of warnings whitelisted.  Alfredo is
> setting up a jenkins job for this so we can follow it going forward, but
> there is clearly a pretty big initial effort needed to get the existing
> warnings cleaned up.  If anybody is interested in helping with that
> effort, pull requests are very welcome!  :)
>
> sage
>
>
>>
>> While the build was resuming I was wondering why no one stumpled upon this
>> issue in the first place. Has nobody ever tried to compile Ceph with Clang?
>>
>> So I did what I'm always doing to quick-check a project's code quality:
>> I set the compiler to Clang. I set flags to -Weverything.
>>
>> The idea is to gradually disable warnings one by one that are not interesting.
>>
>> Instead of doing this by hand I used the following flags that I
>> accumulated over the last weeks or month to disable a handful of not so
>> important warnings:
>>
>> > -Weverything -Wno-padded -Wno-c++11-long-long -Wno-variadic-macros
>> > -Wno-c++11-extensions -Wno-global-constructors -Wno-exit-time-destructors
>> > -Wno-sign-conversion -Wno-weak-vtables -Wno-disabled-macro-expansion
>> > -Wno-c99-extensions -Wno-shadow -Wno-documentation-unknown-command -Wno-undef
>> > -Wno-documentation -Wno-duplicate-enum -Wno-switch-enum -Wno-unused-parameter
>> > -Wno-packed
>>
>> Let's build Ceph and have a look at what Clang's -Weverything catches:
>>
>>     ./autogen.sh
>>     ./configure --without-libxfs
>>     make -j 1 2> diagnostics.txt
>>
>> Many warnings are duplicates because of header includes and the like.
>> Let's remove duplicates and only keep unique diagnostics:
>>
>>     $ egrep "\[-.*\]" diagnostics.txt | sort | uniq > uniq.txt
>>     $ wc -l uniq.txt
>>     12286 uniq.txt
>>
>> Wow. 12286 unique diagnostics. Interested in what showed up?
>>
>>
>> How bad is it
>> -------------
>>
>> I ended up going through the diagnostic categories, because I had a few hours
>> to spare. The methodology I used was the following:
>>
>> Check for diagnostic categories, prioritize them, and after that remove all
>> the occurences in the original diagnostics file by:
>>
>>     $ sed -i '/Wflag/d' uniq.txt
>>
>> In the following I'll give you a quick summary of the most important
>> diagnostics. After this there are still some diagnostics left.
>>
>>     $ wc -l uniq.txt
>>     3362 uniq.txt
>>
>> This can be explained by diagnostics that are not important at all, such as
>> unused exception parameter in a catch clause.
>>
>> But there are also conversions between types that have to be studied one by
>> one to check if a loss in precision or comparison of types with different
>> signedness could lead to dangerous behavior.
>>
>>
>> Note: Clang's diagnostics can be customized, too:
>>
>> http://clang.llvm.org/docs/UsersManual.html#formatting-of-diagnostics
>>
>>
>> So let's look through the categories.
>>
>>
>> Undefined Behavior
>> ------------------
>>
>> > warning: 'va_start' has undefined behavior with reference types [-Wvarargs]
>>
>>     $ grep Wvarargs uniq.txt | wc -l
>>     1
>>
>> Here's the paragraph from the C++11 Standard, 18.10 ?3:
>> > If the parameter parmN is declared with a function, array, or reference type, or with a type that is not compatible with the type that results when passing an argument for which there is no parameter, the behavior is undefined.
>>
>> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3337.pdf
>>
>>
>> Conditional Uninitialized
>> -------------------------
>>
>> > warning: variable '%' may be uninitialized when used here [-Wconditional-uninitialized]
>>
>>     $ grep Wconditional-uninitialized uniq.txt | wc -l
>>     4
>>
>>
>> Unreachable Code
>> ----------------
>>
>> > warning: will never be executed [-Wunreachable-code]
>>
>>     $ grep Wunreachable-code uniq.txt | wc -l
>>     155
>>
>>
>> Extensions to the C++ language
>> ------------------------------
>>
>> There's this vast amount of code that is definitely not conforming to
>> standard C++. Some extensions are used, such as Variable Length Arrays (VLAs,
>> which were the reason for my issue in the first place), some GCC specificas.
>>
>> You probably have to decide if you want to go this route or if you want to do
>> it the portable and standard C++ way.
>>
>> VLAs are a good example where the standard C++ way of doing it is in no way
>> harder to accomplish. See the following explanation:
>>
>> http://clang.llvm.org/compatibility.html#vla
>>
>>
>> > warning: anonymous types declared in an anonymous union are an extension [-Wnested-anon-types]
>>
>>     $ grep Wnested-anon-types uniq.txt  | wc -l
>>     28
>>
>> > warning: zero size arrays are an extension [-Wzero-length-array]
>>
>>     $ grep Wzero-length-array uniq.txt | wc -l
>>     8134
>>
>> > warning: anonymous structs are a GNU extension [-Wgnu-anonymous-struct]
>>
>>     $ grep Wgnu-anonymous-struct uniq.txt  | wc -l
>>     2
>>
>> > warning: '%' may not be nested in a struct due to flexible array member [-Wflexible-array-extensions]
>>
>>     $ grep Wflexible-array-extensions uniq.txt | wc -l
>>     2
>>
>> > warning: variable length array used [-Wvla]
>> > warning: variable length arrays are a C99 feature [-Wvla-extension]
>>
>>     $ grep Wvla uniq.txt | wc -l
>>     178
>>
>> > warning: cast between pointer-to-function and pointer-to-object is an extension [-Wpedantic]
>> > warning: use of non-standard escape character '\%' [-Wpedantic]
>>
>>     $ grep Wpedantic uniq.txt | wc -l
>>     15
>>
>> > warning: use of GNU old-style field designator extension [-Wgnu-designator]
>>
>>     $ grep Wgnu-designator uniq.txt | wc -l
>>     43
>>
>> > warning: anonymous unions are a C11 extension [-Wc11-extensions]
>>
>>     $ grep Wc11-extensions uniq.txt | wc -l
>>     3
>>
>>
>> Miscellaneous:
>> --------------
>>
>> > warning: '%' hides overloaded virtual function [-Woverloaded-virtual]
>>
>>     $ grep Woverloaded-virtual uniq.txt | wc -l
>>     3
>>
>> > warning: using namespace directive in global context in header [-Wheader-hygiene]
>>
>>     $ grep Wheader-hygiene uniq.txt | wc -l
>>     87
>>
>> > warning: extra ';' after member function definition [-Wextra-semi]
>>
>>     $ grep Wextra-semi uniq.txt | wc -l
>>     135
>>
>> > warning: struct '%' was previously declared as a class [-Wmismatched-tags]
>>
>>     $ grep Wmismatched-tags uniq.txt | wc -l
>>     93
>>
>> > warning: missing field '%' initializer [-Wmissing-field-initializers]
>>
>>     $ grep Wmissing-field-initializer uniq.txt | wc -l
>>     3
>>
>> > warning: private field '%' is not used [-Wunused-private-field]
>>
>>     $ grep Wunused-private-field uniq.txt | wc -l
>>     11
>>
>> > warning: function '%' could be declared with attribute 'noreturn' [-Wmissing-noreturn]
>>
>>     $ grep Wmissing-noreturn uniq.txt | wc -l
>>     23
>>
>> > warning: token pasting of ',' and __VA_ARGS__ is a GNU extension [-Wgnu-zero-variadic-macro-arguments]
>>
>>     $ grep Wgnu-zero-variadic-macro-arguments uniq.txt | wc -l
>>     4
>>
>>
>>
>> Style
>> -----
>>
>> While fixing my initial VLA issue I skimmed through the relevant code
>> in src/rgw/rgw_main.cc.
>>
>> What I noticed is that already in this file there's no consistent style.
>> There's a "using namespace std;" at the top, then in the file there's an
>> alternate usage between vector<string> and std::vector<std::string>.
>>
>> It also seems to me that the memory management is often done by hand, e.g.
>> in line 377 "string *objs = new string[num_objs];" and later the "delete []"
>> instead of just using a std::vector<std::string>.
>>
>>
>> Summary
>> -------
>>
>> Using Clang-3.4's -Weverything reveals critical issues.
>>
>> What I hope for Ceph's general code quality is to improve by using the tools
>> at hand. This means especially:
>>
>> Do the same check with an up to date Clang (that's one of the reasons I'm not
>> pointing out all the issues or the concrete source code positions), look
>> through them in detail, such as conversions and left-out diagnostics, and fix
>> as many as critical issues as possible.
>>
>> You probably could even use Clang's -fsanitize flag to check for issues at runtime.
>> http://clang.llvm.org/docs/UsersManual.html#controlling-code-generation
>>
>> Also keep in mind that this check was only done on the userspace code in the
>> ceph/ceph repository.
>>
>>
>> Thanks,
>> Daniel J. Hofmann
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>>
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



-- 
Milosz Tanski
CTO
10 East 53rd Street, 37th floor
New York, NY 10022

p: 646-253-9055
e: milosz@adfin.com

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: A plea for more tooling usage
  2014-05-06 13:48 A plea for more tooling usage Daniel Hofmann
  2014-05-06 15:19 ` Sage Weil
@ 2014-05-08 10:37 ` John Spray
  1 sibling, 0 replies; 11+ messages in thread
From: John Spray @ 2014-05-08 10:37 UTC (permalink / raw)
  To: Daniel Hofmann; +Cc: Ceph Development

Couple of notes for anyone following along on ubuntu precise or
another older system:
 * make sure you're using clang >=3.3 to get support for suppressing
all the warning flags in Daniel's command line.
 * libcrypto++ 5.6.1 (the version in ubuntu) doesn't compile with
clang, unless you hack

I was looking through the warnings out of interest and made a couple notes.

On Tue, May 6, 2014 at 2:48 PM, Daniel Hofmann <daniel@trvx.org> wrote:
>> warning: anonymous types declared in an anonymous union are an extension [-Wnested-anon-types]
>
>     $ grep Wnested-anon-types uniq.txt  | wc -l
>     28

This is mostly ceph_osd_op in rados.h - code is fine but nonstandard.
We can avoid this particular warning by explicitly typedeffing the
struct types before using them, but would still have the warning that
anonymous unions are themselves nonstandard.  Anonymous unions are
nice...

>
>> warning: zero size arrays are an extension [-Wzero-length-array]
>
>     $ grep Wzero-length-array uniq.txt | wc -l
>     8134

This is mostly dout_impl, I think -- it is using an array whose size
is defined as 0 or -1 conditionally, apparently in order to detect out
of bounds severity numbers.

I wonder if there is a neater way to do this - I am not a preprocessor guru.

>> warning: '%' may not be nested in a struct due to flexible array member [-Wflexible-array-extensions]
>
>     $ grep Wflexible-array-extensions uniq.txt | wc -l
>     2

This is true (ceph_frag_tree in ceph_mds_reply_inode)... I don't see a
nice way around it.  Perhaps its a useful enough language extension
that we should continue to use it.

>> warning: cast between pointer-to-function and pointer-to-object is an extension [-Wpedantic]
>> warning: use of non-standard escape character '\%' [-Wpedantic]
>
>     $ grep Wpedantic uniq.txt | wc -l
>     15

The pointer cast thing is overly pedantic in the cases we're using
dlsym (which returns a void*), object pointer cast to fn pointer is
illegal in the language standard but guaranteed to work in POSIX.

The \% thing is easily fixed.

>> warning: use of GNU old-style field designator extension [-Wgnu-designator]
>
>     $ grep Wgnu-designator uniq.txt | wc -l
>     43

fuse_ll.cc:fuse_ll_oper and config.cc:g_default_file_layout.

Can easily just remove the field designators and initialize as {val1,
val2} but that's much less readable :-/  I don't think nice struct
initialization becomes a thing until C++11.

>> warning: using namespace directive in global context in header [-Wheader-hygiene]
>
>     $ grep Wheader-hygiene uniq.txt | wc -l
>     87

We should be able to solve these easily with a fairly mechanical procedure:
 * Remove the 'using's from the headers
 * Change type references in headers to use appropriate prefix (mostly std::)
 * Add in the 'using's to any .cc files that relied on them

>> warning: struct '%' was previously declared as a class [-Wmismatched-tags]
>
>     $ grep Wmismatched-tags uniq.txt | wc -l
>     93

At least some of these (especially Inode in libcephfs.h) appear to
have the intention of exposing POD classes in C-compatible headers.
We should probably change the C++ side to also use struct for the
things that are going to be exposed to C land.

>> warning: missing field '%' initializer [-Wmissing-field-initializers]
>
>     $ grep Wmissing-field-initializer uniq.txt | wc -l
>     3

Hmm, I only saw one of these:
test/osd/TestRados.cc:260:36: warning: missing field 'ec_pool_valid'
initializer [-Wmissing-field-initializers]

...but I'm on an older clang (3.3) so perhaps that explains it.

>> warning: private field '%' is not used [-Wunused-private-field]
>
>     $ grep Wunused-private-field uniq.txt | wc -l
>     11

This is a handy warning indeed, it appears to be accurately pointing
out unused fields.

John

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: A plea for more tooling usage
  2014-05-06 15:35   ` Milosz Tanski
@ 2014-05-08 11:37     ` Daniel Hofmann
  2014-05-08 12:55       ` Milosz Tanski
  0 siblings, 1 reply; 11+ messages in thread
From: Daniel Hofmann @ 2014-05-08 11:37 UTC (permalink / raw)
  To: Milosz Tanski, Sage Weil; +Cc: ceph-devel

Using the Static Analyzer is orthogonal to enabling more diagnostics.

I uploaded my latest report results (only the summary, the specific reports take around 5 GB) here:

> http://trvx.org/~daniel/ceph-devel/report.html

Quite interesting -- hopefully you soon get this from a Jenkins build, as discussed on IRC.



On May 6, 2014 5:35:00 PM CEST, Milosz Tanski <milosz@adfin.com> wrote:
>Indeed clangs warnings are much exhaustive. We've also had good
>success with static analyzer based on clang
>(http://clang-analyzer.llvm.org/). We've found a few logic issues in
>our application using it as well... esp. a few bugs that where corner
>cases had to due to very specific arguments passed cross function (and
>back). I don't know how it relates to Coverity is complementary or
>more of a replacement.
>
>In terms of the sanitize settings one has to be careful with which
>ones are enabled as some of them (thread sanitizer, memory sanitizer)
>give you a pretty big performance hit. Not as bad as valgrind (5x to
>20x slowdown) but more like 2x to 5x slowdown. It's best to use to
>build specialized debug builds to run through integration testing.
>
>P.S: Sorry for the dup-email yet again. Ugh @ gmail.
>
>On Tue, May 6, 2014 at 11:19 AM, Sage Weil <sage@inktank.com> wrote:
>> On Tue, 6 May 2014, Daniel Hofmann wrote:
>>> Preamble: you might want to read the decent formatted version of
>this mail at:
>>> > https://gist.github.com/daniel-j-h/06f16015c89bec4fbb42
>>>
>>>
>>> Motivation
>>> ----------
>>>
>>> I tried to build Ceph with the Clang (3.4) compiler. The only issue
>>> preventing the build to finish was a "VLA of non-POD type" usage
>which
>>> I fixed here:
>>>
>>> https://github.com/ceph/ceph/pull/1768
>>
>> Merged, thanks!
>>
>> Daniel, this is great.  We've been clean on gcc for a long time but I
>(at
>> least) didn't realize there was a huge set of potentially important
>issues
>> that clang warns about that gcc does not.  We definitely want to
>clean
>> these up and get the right set of warnings whitelisted.  Alfredo is
>> setting up a jenkins job for this so we can follow it going forward,
>but
>> there is clearly a pretty big initial effort needed to get the
>existing
>> warnings cleaned up.  If anybody is interested in helping with that
>> effort, pull requests are very welcome!  :)
>>
>> sage
>>
>>
>>>
>>> While the build was resuming I was wondering why no one stumpled
>upon this
>>> issue in the first place. Has nobody ever tried to compile Ceph with
>Clang?
>>>
>>> So I did what I'm always doing to quick-check a project's code
>quality:
>>> I set the compiler to Clang. I set flags to -Weverything.
>>>
>>> The idea is to gradually disable warnings one by one that are not
>interesting.
>>>
>>> Instead of doing this by hand I used the following flags that I
>>> accumulated over the last weeks or month to disable a handful of not
>so
>>> important warnings:
>>>
>>> > -Weverything -Wno-padded -Wno-c++11-long-long -Wno-variadic-macros
>>> > -Wno-c++11-extensions -Wno-global-constructors
>-Wno-exit-time-destructors
>>> > -Wno-sign-conversion -Wno-weak-vtables
>-Wno-disabled-macro-expansion
>>> > -Wno-c99-extensions -Wno-shadow -Wno-documentation-unknown-command
>-Wno-undef
>>> > -Wno-documentation -Wno-duplicate-enum -Wno-switch-enum
>-Wno-unused-parameter
>>> > -Wno-packed
>>>
>>> Let's build Ceph and have a look at what Clang's -Weverything
>catches:
>>>
>>>     ./autogen.sh
>>>     ./configure --without-libxfs
>>>     make -j 1 2> diagnostics.txt
>>>
>>> Many warnings are duplicates because of header includes and the
>like.
>>> Let's remove duplicates and only keep unique diagnostics:
>>>
>>>     $ egrep "\[-.*\]" diagnostics.txt | sort | uniq > uniq.txt
>>>     $ wc -l uniq.txt
>>>     12286 uniq.txt
>>>
>>> Wow. 12286 unique diagnostics. Interested in what showed up?
>>>
>>>
>>> How bad is it
>>> -------------
>>>
>>> I ended up going through the diagnostic categories, because I had a
>few hours
>>> to spare. The methodology I used was the following:
>>>
>>> Check for diagnostic categories, prioritize them, and after that
>remove all
>>> the occurences in the original diagnostics file by:
>>>
>>>     $ sed -i '/Wflag/d' uniq.txt
>>>
>>> In the following I'll give you a quick summary of the most important
>>> diagnostics. After this there are still some diagnostics left.
>>>
>>>     $ wc -l uniq.txt
>>>     3362 uniq.txt
>>>
>>> This can be explained by diagnostics that are not important at all,
>such as
>>> unused exception parameter in a catch clause.
>>>
>>> But there are also conversions between types that have to be studied
>one by
>>> one to check if a loss in precision or comparison of types with
>different
>>> signedness could lead to dangerous behavior.
>>>
>>>
>>> Note: Clang's diagnostics can be customized, too:
>>>
>>>
>http://clang.llvm.org/docs/UsersManual.html#formatting-of-diagnostics
>>>
>>>
>>> So let's look through the categories.
>>>
>>>
>>> Undefined Behavior
>>> ------------------
>>>
>>> > warning: 'va_start' has undefined behavior with reference types
>[-Wvarargs]
>>>
>>>     $ grep Wvarargs uniq.txt | wc -l
>>>     1
>>>
>>> Here's the paragraph from the C++11 Standard, 18.10 ?3:
>>> > If the parameter parmN is declared with a function, array, or
>reference type, or with a type that is not compatible with the type
>that results when passing an argument for which there is no parameter,
>the behavior is undefined.
>>>
>>> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3337.pdf
>>>
>>>
>>> Conditional Uninitialized
>>> -------------------------
>>>
>>> > warning: variable '%' may be uninitialized when used here
>[-Wconditional-uninitialized]
>>>
>>>     $ grep Wconditional-uninitialized uniq.txt | wc -l
>>>     4
>>>
>>>
>>> Unreachable Code
>>> ----------------
>>>
>>> > warning: will never be executed [-Wunreachable-code]
>>>
>>>     $ grep Wunreachable-code uniq.txt | wc -l
>>>     155
>>>
>>>
>>> Extensions to the C++ language
>>> ------------------------------
>>>
>>> There's this vast amount of code that is definitely not conforming
>to
>>> standard C++. Some extensions are used, such as Variable Length
>Arrays (VLAs,
>>> which were the reason for my issue in the first place), some GCC
>specificas.
>>>
>>> You probably have to decide if you want to go this route or if you
>want to do
>>> it the portable and standard C++ way.
>>>
>>> VLAs are a good example where the standard C++ way of doing it is in
>no way
>>> harder to accomplish. See the following explanation:
>>>
>>> http://clang.llvm.org/compatibility.html#vla
>>>
>>>
>>> > warning: anonymous types declared in an anonymous union are an
>extension [-Wnested-anon-types]
>>>
>>>     $ grep Wnested-anon-types uniq.txt  | wc -l
>>>     28
>>>
>>> > warning: zero size arrays are an extension [-Wzero-length-array]
>>>
>>>     $ grep Wzero-length-array uniq.txt | wc -l
>>>     8134
>>>
>>> > warning: anonymous structs are a GNU extension
>[-Wgnu-anonymous-struct]
>>>
>>>     $ grep Wgnu-anonymous-struct uniq.txt  | wc -l
>>>     2
>>>
>>> > warning: '%' may not be nested in a struct due to flexible array
>member [-Wflexible-array-extensions]
>>>
>>>     $ grep Wflexible-array-extensions uniq.txt | wc -l
>>>     2
>>>
>>> > warning: variable length array used [-Wvla]
>>> > warning: variable length arrays are a C99 feature
>[-Wvla-extension]
>>>
>>>     $ grep Wvla uniq.txt | wc -l
>>>     178
>>>
>>> > warning: cast between pointer-to-function and pointer-to-object is
>an extension [-Wpedantic]
>>> > warning: use of non-standard escape character '\%' [-Wpedantic]
>>>
>>>     $ grep Wpedantic uniq.txt | wc -l
>>>     15
>>>
>>> > warning: use of GNU old-style field designator extension
>[-Wgnu-designator]
>>>
>>>     $ grep Wgnu-designator uniq.txt | wc -l
>>>     43
>>>
>>> > warning: anonymous unions are a C11 extension [-Wc11-extensions]
>>>
>>>     $ grep Wc11-extensions uniq.txt | wc -l
>>>     3
>>>
>>>
>>> Miscellaneous:
>>> --------------
>>>
>>> > warning: '%' hides overloaded virtual function
>[-Woverloaded-virtual]
>>>
>>>     $ grep Woverloaded-virtual uniq.txt | wc -l
>>>     3
>>>
>>> > warning: using namespace directive in global context in header
>[-Wheader-hygiene]
>>>
>>>     $ grep Wheader-hygiene uniq.txt | wc -l
>>>     87
>>>
>>> > warning: extra ';' after member function definition [-Wextra-semi]
>>>
>>>     $ grep Wextra-semi uniq.txt | wc -l
>>>     135
>>>
>>> > warning: struct '%' was previously declared as a class
>[-Wmismatched-tags]
>>>
>>>     $ grep Wmismatched-tags uniq.txt | wc -l
>>>     93
>>>
>>> > warning: missing field '%' initializer
>[-Wmissing-field-initializers]
>>>
>>>     $ grep Wmissing-field-initializer uniq.txt | wc -l
>>>     3
>>>
>>> > warning: private field '%' is not used [-Wunused-private-field]
>>>
>>>     $ grep Wunused-private-field uniq.txt | wc -l
>>>     11
>>>
>>> > warning: function '%' could be declared with attribute 'noreturn'
>[-Wmissing-noreturn]
>>>
>>>     $ grep Wmissing-noreturn uniq.txt | wc -l
>>>     23
>>>
>>> > warning: token pasting of ',' and __VA_ARGS__ is a GNU extension
>[-Wgnu-zero-variadic-macro-arguments]
>>>
>>>     $ grep Wgnu-zero-variadic-macro-arguments uniq.txt | wc -l
>>>     4
>>>
>>>
>>>
>>> Style
>>> -----
>>>
>>> While fixing my initial VLA issue I skimmed through the relevant
>code
>>> in src/rgw/rgw_main.cc.
>>>
>>> What I noticed is that already in this file there's no consistent
>style.
>>> There's a "using namespace std;" at the top, then in the file
>there's an
>>> alternate usage between vector<string> and std::vector<std::string>.
>>>
>>> It also seems to me that the memory management is often done by
>hand, e.g.
>>> in line 377 "string *objs = new string[num_objs];" and later the
>"delete []"
>>> instead of just using a std::vector<std::string>.
>>>
>>>
>>> Summary
>>> -------
>>>
>>> Using Clang-3.4's -Weverything reveals critical issues.
>>>
>>> What I hope for Ceph's general code quality is to improve by using
>the tools
>>> at hand. This means especially:
>>>
>>> Do the same check with an up to date Clang (that's one of the
>reasons I'm not
>>> pointing out all the issues or the concrete source code positions),
>look
>>> through them in detail, such as conversions and left-out
>diagnostics, and fix
>>> as many as critical issues as possible.
>>>
>>> You probably could even use Clang's -fsanitize flag to check for
>issues at runtime.
>>>
>http://clang.llvm.org/docs/UsersManual.html#controlling-code-generation
>>>
>>> Also keep in mind that this check was only done on the userspace
>code in the
>>> ceph/ceph repository.
>>>
>>>
>>> Thanks,
>>> Daniel J. Hofmann
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe
>ceph-devel" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>
>>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel"
>in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: A plea for more tooling usage
  2014-05-08 11:37     ` Daniel Hofmann
@ 2014-05-08 12:55       ` Milosz Tanski
  2014-05-08 13:19         ` Daniel Hofmann
  2014-05-08 14:22         ` Daniel Hofmann
  0 siblings, 2 replies; 11+ messages in thread
From: Milosz Tanski @ 2014-05-08 12:55 UTC (permalink / raw)
  To: Daniel Hofmann; +Cc: Sage Weil, ceph-devel

Daniel,

It is orthogonal. Having said that I find the issues raised by them of
higher value (logic errors, bad comparisons) then warnings about
unused private fields, or various extensions that make the code easier
to write (0 len array, anon union).

A a side note Daniel it looks like the  links from the report
frontpage are broken. All the links hard code a uri that points to
localhost. I believe there's a command line argument to the tool that
generates static pages (without needing to run their webserver).

Thanks for putting this up,
- Milosz

On Thu, May 8, 2014 at 7:37 AM, Daniel Hofmann <daniel@trvx.org> wrote:
> Using the Static Analyzer is orthogonal to enabling more diagnostics.
>
> I uploaded my latest report results (only the summary, the specific reports take around 5 GB) here:
>
>> http://trvx.org/~daniel/ceph-devel/report.html
>
> Quite interesting -- hopefully you soon get this from a Jenkins build, as discussed on IRC.
>
>
>
> On May 6, 2014 5:35:00 PM CEST, Milosz Tanski <milosz@adfin.com> wrote:
>>Indeed clangs warnings are much exhaustive. We've also had good
>>success with static analyzer based on clang
>>(http://clang-analyzer.llvm.org/). We've found a few logic issues in
>>our application using it as well... esp. a few bugs that where corner
>>cases had to due to very specific arguments passed cross function (and
>>back). I don't know how it relates to Coverity is complementary or
>>more of a replacement.
>>
>>In terms of the sanitize settings one has to be careful with which
>>ones are enabled as some of them (thread sanitizer, memory sanitizer)
>>give you a pretty big performance hit. Not as bad as valgrind (5x to
>>20x slowdown) but more like 2x to 5x slowdown. It's best to use to
>>build specialized debug builds to run through integration testing.
>>
>>P.S: Sorry for the dup-email yet again. Ugh @ gmail.
>>
>>On Tue, May 6, 2014 at 11:19 AM, Sage Weil <sage@inktank.com> wrote:
>>> On Tue, 6 May 2014, Daniel Hofmann wrote:
>>>> Preamble: you might want to read the decent formatted version of
>>this mail at:
>>>> > https://gist.github.com/daniel-j-h/06f16015c89bec4fbb42
>>>>
>>>>
>>>> Motivation
>>>> ----------
>>>>
>>>> I tried to build Ceph with the Clang (3.4) compiler. The only issue
>>>> preventing the build to finish was a "VLA of non-POD type" usage
>>which
>>>> I fixed here:
>>>>
>>>> https://github.com/ceph/ceph/pull/1768
>>>
>>> Merged, thanks!
>>>
>>> Daniel, this is great.  We've been clean on gcc for a long time but I
>>(at
>>> least) didn't realize there was a huge set of potentially important
>>issues
>>> that clang warns about that gcc does not.  We definitely want to
>>clean
>>> these up and get the right set of warnings whitelisted.  Alfredo is
>>> setting up a jenkins job for this so we can follow it going forward,
>>but
>>> there is clearly a pretty big initial effort needed to get the
>>existing
>>> warnings cleaned up.  If anybody is interested in helping with that
>>> effort, pull requests are very welcome!  :)
>>>
>>> sage
>>>
>>>
>>>>
>>>> While the build was resuming I was wondering why no one stumpled
>>upon this
>>>> issue in the first place. Has nobody ever tried to compile Ceph with
>>Clang?
>>>>
>>>> So I did what I'm always doing to quick-check a project's code
>>quality:
>>>> I set the compiler to Clang. I set flags to -Weverything.
>>>>
>>>> The idea is to gradually disable warnings one by one that are not
>>interesting.
>>>>
>>>> Instead of doing this by hand I used the following flags that I
>>>> accumulated over the last weeks or month to disable a handful of not
>>so
>>>> important warnings:
>>>>
>>>> > -Weverything -Wno-padded -Wno-c++11-long-long -Wno-variadic-macros
>>>> > -Wno-c++11-extensions -Wno-global-constructors
>>-Wno-exit-time-destructors
>>>> > -Wno-sign-conversion -Wno-weak-vtables
>>-Wno-disabled-macro-expansion
>>>> > -Wno-c99-extensions -Wno-shadow -Wno-documentation-unknown-command
>>-Wno-undef
>>>> > -Wno-documentation -Wno-duplicate-enum -Wno-switch-enum
>>-Wno-unused-parameter
>>>> > -Wno-packed
>>>>
>>>> Let's build Ceph and have a look at what Clang's -Weverything
>>catches:
>>>>
>>>>     ./autogen.sh
>>>>     ./configure --without-libxfs
>>>>     make -j 1 2> diagnostics.txt
>>>>
>>>> Many warnings are duplicates because of header includes and the
>>like.
>>>> Let's remove duplicates and only keep unique diagnostics:
>>>>
>>>>     $ egrep "\[-.*\]" diagnostics.txt | sort | uniq > uniq.txt
>>>>     $ wc -l uniq.txt
>>>>     12286 uniq.txt
>>>>
>>>> Wow. 12286 unique diagnostics. Interested in what showed up?
>>>>
>>>>
>>>> How bad is it
>>>> -------------
>>>>
>>>> I ended up going through the diagnostic categories, because I had a
>>few hours
>>>> to spare. The methodology I used was the following:
>>>>
>>>> Check for diagnostic categories, prioritize them, and after that
>>remove all
>>>> the occurences in the original diagnostics file by:
>>>>
>>>>     $ sed -i '/Wflag/d' uniq.txt
>>>>
>>>> In the following I'll give you a quick summary of the most important
>>>> diagnostics. After this there are still some diagnostics left.
>>>>
>>>>     $ wc -l uniq.txt
>>>>     3362 uniq.txt
>>>>
>>>> This can be explained by diagnostics that are not important at all,
>>such as
>>>> unused exception parameter in a catch clause.
>>>>
>>>> But there are also conversions between types that have to be studied
>>one by
>>>> one to check if a loss in precision or comparison of types with
>>different
>>>> signedness could lead to dangerous behavior.
>>>>
>>>>
>>>> Note: Clang's diagnostics can be customized, too:
>>>>
>>>>
>>http://clang.llvm.org/docs/UsersManual.html#formatting-of-diagnostics
>>>>
>>>>
>>>> So let's look through the categories.
>>>>
>>>>
>>>> Undefined Behavior
>>>> ------------------
>>>>
>>>> > warning: 'va_start' has undefined behavior with reference types
>>[-Wvarargs]
>>>>
>>>>     $ grep Wvarargs uniq.txt | wc -l
>>>>     1
>>>>
>>>> Here's the paragraph from the C++11 Standard, 18.10 ?3:
>>>> > If the parameter parmN is declared with a function, array, or
>>reference type, or with a type that is not compatible with the type
>>that results when passing an argument for which there is no parameter,
>>the behavior is undefined.
>>>>
>>>> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3337.pdf
>>>>
>>>>
>>>> Conditional Uninitialized
>>>> -------------------------
>>>>
>>>> > warning: variable '%' may be uninitialized when used here
>>[-Wconditional-uninitialized]
>>>>
>>>>     $ grep Wconditional-uninitialized uniq.txt | wc -l
>>>>     4
>>>>
>>>>
>>>> Unreachable Code
>>>> ----------------
>>>>
>>>> > warning: will never be executed [-Wunreachable-code]
>>>>
>>>>     $ grep Wunreachable-code uniq.txt | wc -l
>>>>     155
>>>>
>>>>
>>>> Extensions to the C++ language
>>>> ------------------------------
>>>>
>>>> There's this vast amount of code that is definitely not conforming
>>to
>>>> standard C++. Some extensions are used, such as Variable Length
>>Arrays (VLAs,
>>>> which were the reason for my issue in the first place), some GCC
>>specificas.
>>>>
>>>> You probably have to decide if you want to go this route or if you
>>want to do
>>>> it the portable and standard C++ way.
>>>>
>>>> VLAs are a good example where the standard C++ way of doing it is in
>>no way
>>>> harder to accomplish. See the following explanation:
>>>>
>>>> http://clang.llvm.org/compatibility.html#vla
>>>>
>>>>
>>>> > warning: anonymous types declared in an anonymous union are an
>>extension [-Wnested-anon-types]
>>>>
>>>>     $ grep Wnested-anon-types uniq.txt  | wc -l
>>>>     28
>>>>
>>>> > warning: zero size arrays are an extension [-Wzero-length-array]
>>>>
>>>>     $ grep Wzero-length-array uniq.txt | wc -l
>>>>     8134
>>>>
>>>> > warning: anonymous structs are a GNU extension
>>[-Wgnu-anonymous-struct]
>>>>
>>>>     $ grep Wgnu-anonymous-struct uniq.txt  | wc -l
>>>>     2
>>>>
>>>> > warning: '%' may not be nested in a struct due to flexible array
>>member [-Wflexible-array-extensions]
>>>>
>>>>     $ grep Wflexible-array-extensions uniq.txt | wc -l
>>>>     2
>>>>
>>>> > warning: variable length array used [-Wvla]
>>>> > warning: variable length arrays are a C99 feature
>>[-Wvla-extension]
>>>>
>>>>     $ grep Wvla uniq.txt | wc -l
>>>>     178
>>>>
>>>> > warning: cast between pointer-to-function and pointer-to-object is
>>an extension [-Wpedantic]
>>>> > warning: use of non-standard escape character '\%' [-Wpedantic]
>>>>
>>>>     $ grep Wpedantic uniq.txt | wc -l
>>>>     15
>>>>
>>>> > warning: use of GNU old-style field designator extension
>>[-Wgnu-designator]
>>>>
>>>>     $ grep Wgnu-designator uniq.txt | wc -l
>>>>     43
>>>>
>>>> > warning: anonymous unions are a C11 extension [-Wc11-extensions]
>>>>
>>>>     $ grep Wc11-extensions uniq.txt | wc -l
>>>>     3
>>>>
>>>>
>>>> Miscellaneous:
>>>> --------------
>>>>
>>>> > warning: '%' hides overloaded virtual function
>>[-Woverloaded-virtual]
>>>>
>>>>     $ grep Woverloaded-virtual uniq.txt | wc -l
>>>>     3
>>>>
>>>> > warning: using namespace directive in global context in header
>>[-Wheader-hygiene]
>>>>
>>>>     $ grep Wheader-hygiene uniq.txt | wc -l
>>>>     87
>>>>
>>>> > warning: extra ';' after member function definition [-Wextra-semi]
>>>>
>>>>     $ grep Wextra-semi uniq.txt | wc -l
>>>>     135
>>>>
>>>> > warning: struct '%' was previously declared as a class
>>[-Wmismatched-tags]
>>>>
>>>>     $ grep Wmismatched-tags uniq.txt | wc -l
>>>>     93
>>>>
>>>> > warning: missing field '%' initializer
>>[-Wmissing-field-initializers]
>>>>
>>>>     $ grep Wmissing-field-initializer uniq.txt | wc -l
>>>>     3
>>>>
>>>> > warning: private field '%' is not used [-Wunused-private-field]
>>>>
>>>>     $ grep Wunused-private-field uniq.txt | wc -l
>>>>     11
>>>>
>>>> > warning: function '%' could be declared with attribute 'noreturn'
>>[-Wmissing-noreturn]
>>>>
>>>>     $ grep Wmissing-noreturn uniq.txt | wc -l
>>>>     23
>>>>
>>>> > warning: token pasting of ',' and __VA_ARGS__ is a GNU extension
>>[-Wgnu-zero-variadic-macro-arguments]
>>>>
>>>>     $ grep Wgnu-zero-variadic-macro-arguments uniq.txt | wc -l
>>>>     4
>>>>
>>>>
>>>>
>>>> Style
>>>> -----
>>>>
>>>> While fixing my initial VLA issue I skimmed through the relevant
>>code
>>>> in src/rgw/rgw_main.cc.
>>>>
>>>> What I noticed is that already in this file there's no consistent
>>style.
>>>> There's a "using namespace std;" at the top, then in the file
>>there's an
>>>> alternate usage between vector<string> and std::vector<std::string>.
>>>>
>>>> It also seems to me that the memory management is often done by
>>hand, e.g.
>>>> in line 377 "string *objs = new string[num_objs];" and later the
>>"delete []"
>>>> instead of just using a std::vector<std::string>.
>>>>
>>>>
>>>> Summary
>>>> -------
>>>>
>>>> Using Clang-3.4's -Weverything reveals critical issues.
>>>>
>>>> What I hope for Ceph's general code quality is to improve by using
>>the tools
>>>> at hand. This means especially:
>>>>
>>>> Do the same check with an up to date Clang (that's one of the
>>reasons I'm not
>>>> pointing out all the issues or the concrete source code positions),
>>look
>>>> through them in detail, such as conversions and left-out
>>diagnostics, and fix
>>>> as many as critical issues as possible.
>>>>
>>>> You probably could even use Clang's -fsanitize flag to check for
>>issues at runtime.
>>>>
>>http://clang.llvm.org/docs/UsersManual.html#controlling-code-generation
>>>>
>>>> Also keep in mind that this check was only done on the userspace
>>code in the
>>>> ceph/ceph repository.
>>>>
>>>>
>>>> Thanks,
>>>> Daniel J. Hofmann
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe
>>ceph-devel" in
>>>> the body of a message to majordomo@vger.kernel.org
>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>
>>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel"
>>in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>



-- 
Milosz Tanski
CTO
10 East 53rd Street, 37th floor
New York, NY 10022

p: 646-253-9055
e: milosz@adfin.com

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: A plea for more tooling usage
  2014-05-08 12:55       ` Milosz Tanski
@ 2014-05-08 13:19         ` Daniel Hofmann
  2014-05-08 14:22         ` Daniel Hofmann
  1 sibling, 0 replies; 11+ messages in thread
From: Daniel Hofmann @ 2014-05-08 13:19 UTC (permalink / raw)
  To: Milosz Tanski; +Cc: Sage Weil, ceph-devel

As I said, I only uploaded the summary.
The reports take around 5 GB of space.

Please run the analyzer yourself for the individual reports.

On May 8, 2014 2:55:28 PM CEST, Milosz Tanski <milosz@adfin.com> wrote:
>Daniel,
>
>It is orthogonal. Having said that I find the issues raised by them of
>higher value (logic errors, bad comparisons) then warnings about
>unused private fields, or various extensions that make the code easier
>to write (0 len array, anon union).
>
>A a side note Daniel it looks like the  links from the report
>frontpage are broken. All the links hard code a uri that points to
>localhost. I believe there's a command line argument to the tool that
>generates static pages (without needing to run their webserver).
>
>Thanks for putting this up,
>- Milosz
>
>On Thu, May 8, 2014 at 7:37 AM, Daniel Hofmann <daniel@trvx.org> wrote:
>> Using the Static Analyzer is orthogonal to enabling more diagnostics.
>>
>> I uploaded my latest report results (only the summary, the specific
>reports take around 5 GB) here:
>>
>>> http://trvx.org/~daniel/ceph-devel/report.html
>>
>> Quite interesting -- hopefully you soon get this from a Jenkins
>build, as discussed on IRC.
>>
>>
>>
>> On May 6, 2014 5:35:00 PM CEST, Milosz Tanski <milosz@adfin.com>
>wrote:
>>>Indeed clangs warnings are much exhaustive. We've also had good
>>>success with static analyzer based on clang
>>>(http://clang-analyzer.llvm.org/). We've found a few logic issues in
>>>our application using it as well... esp. a few bugs that where corner
>>>cases had to due to very specific arguments passed cross function
>(and
>>>back). I don't know how it relates to Coverity is complementary or
>>>more of a replacement.
>>>
>>>In terms of the sanitize settings one has to be careful with which
>>>ones are enabled as some of them (thread sanitizer, memory sanitizer)
>>>give you a pretty big performance hit. Not as bad as valgrind (5x to
>>>20x slowdown) but more like 2x to 5x slowdown. It's best to use to
>>>build specialized debug builds to run through integration testing.
>>>
>>>P.S: Sorry for the dup-email yet again. Ugh @ gmail.
>>>
>>>On Tue, May 6, 2014 at 11:19 AM, Sage Weil <sage@inktank.com> wrote:
>>>> On Tue, 6 May 2014, Daniel Hofmann wrote:
>>>>> Preamble: you might want to read the decent formatted version of
>>>this mail at:
>>>>> > https://gist.github.com/daniel-j-h/06f16015c89bec4fbb42
>>>>>
>>>>>
>>>>> Motivation
>>>>> ----------
>>>>>
>>>>> I tried to build Ceph with the Clang (3.4) compiler. The only
>issue
>>>>> preventing the build to finish was a "VLA of non-POD type" usage
>>>which
>>>>> I fixed here:
>>>>>
>>>>> https://github.com/ceph/ceph/pull/1768
>>>>
>>>> Merged, thanks!
>>>>
>>>> Daniel, this is great.  We've been clean on gcc for a long time but
>I
>>>(at
>>>> least) didn't realize there was a huge set of potentially important
>>>issues
>>>> that clang warns about that gcc does not.  We definitely want to
>>>clean
>>>> these up and get the right set of warnings whitelisted.  Alfredo is
>>>> setting up a jenkins job for this so we can follow it going
>forward,
>>>but
>>>> there is clearly a pretty big initial effort needed to get the
>>>existing
>>>> warnings cleaned up.  If anybody is interested in helping with that
>>>> effort, pull requests are very welcome!  :)
>>>>
>>>> sage
>>>>
>>>>
>>>>>
>>>>> While the build was resuming I was wondering why no one stumpled
>>>upon this
>>>>> issue in the first place. Has nobody ever tried to compile Ceph
>with
>>>Clang?
>>>>>
>>>>> So I did what I'm always doing to quick-check a project's code
>>>quality:
>>>>> I set the compiler to Clang. I set flags to -Weverything.
>>>>>
>>>>> The idea is to gradually disable warnings one by one that are not
>>>interesting.
>>>>>
>>>>> Instead of doing this by hand I used the following flags that I
>>>>> accumulated over the last weeks or month to disable a handful of
>not
>>>so
>>>>> important warnings:
>>>>>
>>>>> > -Weverything -Wno-padded -Wno-c++11-long-long
>-Wno-variadic-macros
>>>>> > -Wno-c++11-extensions -Wno-global-constructors
>>>-Wno-exit-time-destructors
>>>>> > -Wno-sign-conversion -Wno-weak-vtables
>>>-Wno-disabled-macro-expansion
>>>>> > -Wno-c99-extensions -Wno-shadow
>-Wno-documentation-unknown-command
>>>-Wno-undef
>>>>> > -Wno-documentation -Wno-duplicate-enum -Wno-switch-enum
>>>-Wno-unused-parameter
>>>>> > -Wno-packed
>>>>>
>>>>> Let's build Ceph and have a look at what Clang's -Weverything
>>>catches:
>>>>>
>>>>>     ./autogen.sh
>>>>>     ./configure --without-libxfs
>>>>>     make -j 1 2> diagnostics.txt
>>>>>
>>>>> Many warnings are duplicates because of header includes and the
>>>like.
>>>>> Let's remove duplicates and only keep unique diagnostics:
>>>>>
>>>>>     $ egrep "\[-.*\]" diagnostics.txt | sort | uniq > uniq.txt
>>>>>     $ wc -l uniq.txt
>>>>>     12286 uniq.txt
>>>>>
>>>>> Wow. 12286 unique diagnostics. Interested in what showed up?
>>>>>
>>>>>
>>>>> How bad is it
>>>>> -------------
>>>>>
>>>>> I ended up going through the diagnostic categories, because I had
>a
>>>few hours
>>>>> to spare. The methodology I used was the following:
>>>>>
>>>>> Check for diagnostic categories, prioritize them, and after that
>>>remove all
>>>>> the occurences in the original diagnostics file by:
>>>>>
>>>>>     $ sed -i '/Wflag/d' uniq.txt
>>>>>
>>>>> In the following I'll give you a quick summary of the most
>important
>>>>> diagnostics. After this there are still some diagnostics left.
>>>>>
>>>>>     $ wc -l uniq.txt
>>>>>     3362 uniq.txt
>>>>>
>>>>> This can be explained by diagnostics that are not important at
>all,
>>>such as
>>>>> unused exception parameter in a catch clause.
>>>>>
>>>>> But there are also conversions between types that have to be
>studied
>>>one by
>>>>> one to check if a loss in precision or comparison of types with
>>>different
>>>>> signedness could lead to dangerous behavior.
>>>>>
>>>>>
>>>>> Note: Clang's diagnostics can be customized, too:
>>>>>
>>>>>
>>>http://clang.llvm.org/docs/UsersManual.html#formatting-of-diagnostics
>>>>>
>>>>>
>>>>> So let's look through the categories.
>>>>>
>>>>>
>>>>> Undefined Behavior
>>>>> ------------------
>>>>>
>>>>> > warning: 'va_start' has undefined behavior with reference types
>>>[-Wvarargs]
>>>>>
>>>>>     $ grep Wvarargs uniq.txt | wc -l
>>>>>     1
>>>>>
>>>>> Here's the paragraph from the C++11 Standard, 18.10 ?3:
>>>>> > If the parameter parmN is declared with a function, array, or
>>>reference type, or with a type that is not compatible with the type
>>>that results when passing an argument for which there is no
>parameter,
>>>the behavior is undefined.
>>>>>
>>>>> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3337.pdf
>>>>>
>>>>>
>>>>> Conditional Uninitialized
>>>>> -------------------------
>>>>>
>>>>> > warning: variable '%' may be uninitialized when used here
>>>[-Wconditional-uninitialized]
>>>>>
>>>>>     $ grep Wconditional-uninitialized uniq.txt | wc -l
>>>>>     4
>>>>>
>>>>>
>>>>> Unreachable Code
>>>>> ----------------
>>>>>
>>>>> > warning: will never be executed [-Wunreachable-code]
>>>>>
>>>>>     $ grep Wunreachable-code uniq.txt | wc -l
>>>>>     155
>>>>>
>>>>>
>>>>> Extensions to the C++ language
>>>>> ------------------------------
>>>>>
>>>>> There's this vast amount of code that is definitely not conforming
>>>to
>>>>> standard C++. Some extensions are used, such as Variable Length
>>>Arrays (VLAs,
>>>>> which were the reason for my issue in the first place), some GCC
>>>specificas.
>>>>>
>>>>> You probably have to decide if you want to go this route or if you
>>>want to do
>>>>> it the portable and standard C++ way.
>>>>>
>>>>> VLAs are a good example where the standard C++ way of doing it is
>in
>>>no way
>>>>> harder to accomplish. See the following explanation:
>>>>>
>>>>> http://clang.llvm.org/compatibility.html#vla
>>>>>
>>>>>
>>>>> > warning: anonymous types declared in an anonymous union are an
>>>extension [-Wnested-anon-types]
>>>>>
>>>>>     $ grep Wnested-anon-types uniq.txt  | wc -l
>>>>>     28
>>>>>
>>>>> > warning: zero size arrays are an extension [-Wzero-length-array]
>>>>>
>>>>>     $ grep Wzero-length-array uniq.txt | wc -l
>>>>>     8134
>>>>>
>>>>> > warning: anonymous structs are a GNU extension
>>>[-Wgnu-anonymous-struct]
>>>>>
>>>>>     $ grep Wgnu-anonymous-struct uniq.txt  | wc -l
>>>>>     2
>>>>>
>>>>> > warning: '%' may not be nested in a struct due to flexible array
>>>member [-Wflexible-array-extensions]
>>>>>
>>>>>     $ grep Wflexible-array-extensions uniq.txt | wc -l
>>>>>     2
>>>>>
>>>>> > warning: variable length array used [-Wvla]
>>>>> > warning: variable length arrays are a C99 feature
>>>[-Wvla-extension]
>>>>>
>>>>>     $ grep Wvla uniq.txt | wc -l
>>>>>     178
>>>>>
>>>>> > warning: cast between pointer-to-function and pointer-to-object
>is
>>>an extension [-Wpedantic]
>>>>> > warning: use of non-standard escape character '\%' [-Wpedantic]
>>>>>
>>>>>     $ grep Wpedantic uniq.txt | wc -l
>>>>>     15
>>>>>
>>>>> > warning: use of GNU old-style field designator extension
>>>[-Wgnu-designator]
>>>>>
>>>>>     $ grep Wgnu-designator uniq.txt | wc -l
>>>>>     43
>>>>>
>>>>> > warning: anonymous unions are a C11 extension [-Wc11-extensions]
>>>>>
>>>>>     $ grep Wc11-extensions uniq.txt | wc -l
>>>>>     3
>>>>>
>>>>>
>>>>> Miscellaneous:
>>>>> --------------
>>>>>
>>>>> > warning: '%' hides overloaded virtual function
>>>[-Woverloaded-virtual]
>>>>>
>>>>>     $ grep Woverloaded-virtual uniq.txt | wc -l
>>>>>     3
>>>>>
>>>>> > warning: using namespace directive in global context in header
>>>[-Wheader-hygiene]
>>>>>
>>>>>     $ grep Wheader-hygiene uniq.txt | wc -l
>>>>>     87
>>>>>
>>>>> > warning: extra ';' after member function definition
>[-Wextra-semi]
>>>>>
>>>>>     $ grep Wextra-semi uniq.txt | wc -l
>>>>>     135
>>>>>
>>>>> > warning: struct '%' was previously declared as a class
>>>[-Wmismatched-tags]
>>>>>
>>>>>     $ grep Wmismatched-tags uniq.txt | wc -l
>>>>>     93
>>>>>
>>>>> > warning: missing field '%' initializer
>>>[-Wmissing-field-initializers]
>>>>>
>>>>>     $ grep Wmissing-field-initializer uniq.txt | wc -l
>>>>>     3
>>>>>
>>>>> > warning: private field '%' is not used [-Wunused-private-field]
>>>>>
>>>>>     $ grep Wunused-private-field uniq.txt | wc -l
>>>>>     11
>>>>>
>>>>> > warning: function '%' could be declared with attribute
>'noreturn'
>>>[-Wmissing-noreturn]
>>>>>
>>>>>     $ grep Wmissing-noreturn uniq.txt | wc -l
>>>>>     23
>>>>>
>>>>> > warning: token pasting of ',' and __VA_ARGS__ is a GNU extension
>>>[-Wgnu-zero-variadic-macro-arguments]
>>>>>
>>>>>     $ grep Wgnu-zero-variadic-macro-arguments uniq.txt | wc -l
>>>>>     4
>>>>>
>>>>>
>>>>>
>>>>> Style
>>>>> -----
>>>>>
>>>>> While fixing my initial VLA issue I skimmed through the relevant
>>>code
>>>>> in src/rgw/rgw_main.cc.
>>>>>
>>>>> What I noticed is that already in this file there's no consistent
>>>style.
>>>>> There's a "using namespace std;" at the top, then in the file
>>>there's an
>>>>> alternate usage between vector<string> and
>std::vector<std::string>.
>>>>>
>>>>> It also seems to me that the memory management is often done by
>>>hand, e.g.
>>>>> in line 377 "string *objs = new string[num_objs];" and later the
>>>"delete []"
>>>>> instead of just using a std::vector<std::string>.
>>>>>
>>>>>
>>>>> Summary
>>>>> -------
>>>>>
>>>>> Using Clang-3.4's -Weverything reveals critical issues.
>>>>>
>>>>> What I hope for Ceph's general code quality is to improve by using
>>>the tools
>>>>> at hand. This means especially:
>>>>>
>>>>> Do the same check with an up to date Clang (that's one of the
>>>reasons I'm not
>>>>> pointing out all the issues or the concrete source code
>positions),
>>>look
>>>>> through them in detail, such as conversions and left-out
>>>diagnostics, and fix
>>>>> as many as critical issues as possible.
>>>>>
>>>>> You probably could even use Clang's -fsanitize flag to check for
>>>issues at runtime.
>>>>>
>>>http://clang.llvm.org/docs/UsersManual.html#controlling-code-generation
>>>>>
>>>>> Also keep in mind that this check was only done on the userspace
>>>code in the
>>>>> ceph/ceph repository.
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Daniel J. Hofmann
>>>>> --
>>>>> To unsubscribe from this list: send the line "unsubscribe
>>>ceph-devel" in
>>>>> the body of a message to majordomo@vger.kernel.org
>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>>
>>>>>
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe
>ceph-devel"
>>>in
>>>> the body of a message to majordomo@vger.kernel.org
>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: A plea for more tooling usage
  2014-05-08 12:55       ` Milosz Tanski
  2014-05-08 13:19         ` Daniel Hofmann
@ 2014-05-08 14:22         ` Daniel Hofmann
  1 sibling, 0 replies; 11+ messages in thread
From: Daniel Hofmann @ 2014-05-08 14:22 UTC (permalink / raw)
  To: Milosz Tanski; +Cc: Sage Weil, ceph-devel

As I said, I only uploaded the summary.
The reports take around 5 GB of space.

Please run the analyzer yourself for the detailed individual reports.

On May 8, 2014 2:55:28 PM CEST, Milosz Tanski <milosz@adfin.com> wrote:
>Daniel,
>
>It is orthogonal. Having said that I find the issues raised by them of
>higher value (logic errors, bad comparisons) then warnings about
>unused private fields, or various extensions that make the code easier
>to write (0 len array, anon union).
>
>A a side note Daniel it looks like the  links from the report
>frontpage are broken. All the links hard code a uri that points to
>localhost. I believe there's a command line argument to the tool that
>generates static pages (without needing to run their webserver).
>
>Thanks for putting this up,
>- Milosz
>
>On Thu, May 8, 2014 at 7:37 AM, Daniel Hofmann <daniel@trvx.org> wrote:
>> Using the Static Analyzer is orthogonal to enabling more diagnostics.
>>
>> I uploaded my latest report results (only the summary, the specific
>reports take around 5 GB) here:
>>
>>> http://trvx.org/~daniel/ceph-devel/report.html
>>
>> Quite interesting -- hopefully you soon get this from a Jenkins
>build, as discussed on IRC.
>>
>>
>>
>> On May 6, 2014 5:35:00 PM CEST, Milosz Tanski <milosz@adfin.com>
>wrote:
>>>Indeed clangs warnings are much exhaustive. We've also had good
>>>success with static analyzer based on clang
>>>(http://clang-analyzer.llvm.org/). We've found a few logic issues in
>>>our application using it as well... esp. a few bugs that where corner
>>>cases had to due to very specific arguments passed cross function
>(and
>>>back). I don't know how it relates to Coverity is complementary or
>>>more of a replacement.
>>>
>>>In terms of the sanitize settings one has to be careful with which
>>>ones are enabled as some of them (thread sanitizer, memory sanitizer)
>>>give you a pretty big performance hit. Not as bad as valgrind (5x to
>>>20x slowdown) but more like 2x to 5x slowdown. It's best to use to
>>>build specialized debug builds to run through integration testing.
>>>
>>>P.S: Sorry for the dup-email yet again. Ugh @ gmail.
>>>
>>>On Tue, May 6, 2014 at 11:19 AM, Sage Weil <sage@inktank.com> wrote:
>>>> On Tue, 6 May 2014, Daniel Hofmann wrote:
>>>>> Preamble: you might want to read the decent formatted version of
>>>this mail at:
>>>>> > https://gist.github.com/daniel-j-h/06f16015c89bec4fbb42
>>>>>
>>>>>
>>>>> Motivation
>>>>> ----------
>>>>>
>>>>> I tried to build Ceph with the Clang (3.4) compiler. The only
>issue
>>>>> preventing the build to finish was a "VLA of non-POD type" usage
>>>which
>>>>> I fixed here:
>>>>>
>>>>> https://github.com/ceph/ceph/pull/1768
>>>>
>>>> Merged, thanks!
>>>>
>>>> Daniel, this is great.  We've been clean on gcc for a long time but
>I
>>>(at
>>>> least) didn't realize there was a huge set of potentially important
>>>issues
>>>> that clang warns about that gcc does not.  We definitely want to
>>>clean
>>>> these up and get the right set of warnings whitelisted.  Alfredo is
>>>> setting up a jenkins job for this so we can follow it going
>forward,
>>>but
>>>> there is clearly a pretty big initial effort needed to get the
>>>existing
>>>> warnings cleaned up.  If anybody is interested in helping with that
>>>> effort, pull requests are very welcome!  :)
>>>>
>>>> sage
>>>>
>>>>
>>>>>
>>>>> While the build was resuming I was wondering why no one stumpled
>>>upon this
>>>>> issue in the first place. Has nobody ever tried to compile Ceph
>with
>>>Clang?
>>>>>
>>>>> So I did what I'm always doing to quick-check a project's code
>>>quality:
>>>>> I set the compiler to Clang. I set flags to -Weverything.
>>>>>
>>>>> The idea is to gradually disable warnings one by one that are not
>>>interesting.
>>>>>
>>>>> Instead of doing this by hand I used the following flags that I
>>>>> accumulated over the last weeks or month to disable a handful of
>not
>>>so
>>>>> important warnings:
>>>>>
>>>>> > -Weverything -Wno-padded -Wno-c++11-long-long
>-Wno-variadic-macros
>>>>> > -Wno-c++11-extensions -Wno-global-constructors
>>>-Wno-exit-time-destructors
>>>>> > -Wno-sign-conversion -Wno-weak-vtables
>>>-Wno-disabled-macro-expansion
>>>>> > -Wno-c99-extensions -Wno-shadow
>-Wno-documentation-unknown-command
>>>-Wno-undef
>>>>> > -Wno-documentation -Wno-duplicate-enum -Wno-switch-enum
>>>-Wno-unused-parameter
>>>>> > -Wno-packed
>>>>>
>>>>> Let's build Ceph and have a look at what Clang's -Weverything
>>>catches:
>>>>>
>>>>>     ./autogen.sh
>>>>>     ./configure --without-libxfs
>>>>>     make -j 1 2> diagnostics.txt
>>>>>
>>>>> Many warnings are duplicates because of header includes and the
>>>like.
>>>>> Let's remove duplicates and only keep unique diagnostics:
>>>>>
>>>>>     $ egrep "\[-.*\]" diagnostics.txt | sort | uniq > uniq.txt
>>>>>     $ wc -l uniq.txt
>>>>>     12286 uniq.txt
>>>>>
>>>>> Wow. 12286 unique diagnostics. Interested in what showed up?
>>>>>
>>>>>
>>>>> How bad is it
>>>>> -------------
>>>>>
>>>>> I ended up going through the diagnostic categories, because I had
>a
>>>few hours
>>>>> to spare. The methodology I used was the following:
>>>>>
>>>>> Check for diagnostic categories, prioritize them, and after that
>>>remove all
>>>>> the occurences in the original diagnostics file by:
>>>>>
>>>>>     $ sed -i '/Wflag/d' uniq.txt
>>>>>
>>>>> In the following I'll give you a quick summary of the most
>important
>>>>> diagnostics. After this there are still some diagnostics left.
>>>>>
>>>>>     $ wc -l uniq.txt
>>>>>     3362 uniq.txt
>>>>>
>>>>> This can be explained by diagnostics that are not important at
>all,
>>>such as
>>>>> unused exception parameter in a catch clause.
>>>>>
>>>>> But there are also conversions between types that have to be
>studied
>>>one by
>>>>> one to check if a loss in precision or comparison of types with
>>>different
>>>>> signedness could lead to dangerous behavior.
>>>>>
>>>>>
>>>>> Note: Clang's diagnostics can be customized, too:
>>>>>
>>>>>
>>>http://clang.llvm.org/docs/UsersManual.html#formatting-of-diagnostics
>>>>>
>>>>>
>>>>> So let's look through the categories.
>>>>>
>>>>>
>>>>> Undefined Behavior
>>>>> ------------------
>>>>>
>>>>> > warning: 'va_start' has undefined behavior with reference types
>>>[-Wvarargs]
>>>>>
>>>>>     $ grep Wvarargs uniq.txt | wc -l
>>>>>     1
>>>>>
>>>>> Here's the paragraph from the C++11 Standard, 18.10 ?3:
>>>>> > If the parameter parmN is declared with a function, array, or
>>>reference type, or with a type that is not compatible with the type
>>>that results when passing an argument for which there is no
>parameter,
>>>the behavior is undefined.
>>>>>
>>>>> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3337.pdf
>>>>>
>>>>>
>>>>> Conditional Uninitialized
>>>>> -------------------------
>>>>>
>>>>> > warning: variable '%' may be uninitialized when used here
>>>[-Wconditional-uninitialized]
>>>>>
>>>>>     $ grep Wconditional-uninitialized uniq.txt | wc -l
>>>>>     4
>>>>>
>>>>>
>>>>> Unreachable Code
>>>>> ----------------
>>>>>
>>>>> > warning: will never be executed [-Wunreachable-code]
>>>>>
>>>>>     $ grep Wunreachable-code uniq.txt | wc -l
>>>>>     155
>>>>>
>>>>>
>>>>> Extensions to the C++ language
>>>>> ------------------------------
>>>>>
>>>>> There's this vast amount of code that is definitely not conforming
>>>to
>>>>> standard C++. Some extensions are used, such as Variable Length
>>>Arrays (VLAs,
>>>>> which were the reason for my issue in the first place), some GCC
>>>specificas.
>>>>>
>>>>> You probably have to decide if you want to go this route or if you
>>>want to do
>>>>> it the portable and standard C++ way.
>>>>>
>>>>> VLAs are a good example where the standard C++ way of doing it is
>in
>>>no way
>>>>> harder to accomplish. See the following explanation:
>>>>>
>>>>> http://clang.llvm.org/compatibility.html#vla
>>>>>
>>>>>
>>>>> > warning: anonymous types declared in an anonymous union are an
>>>extension [-Wnested-anon-types]
>>>>>
>>>>>     $ grep Wnested-anon-types uniq.txt  | wc -l
>>>>>     28
>>>>>
>>>>> > warning: zero size arrays are an extension [-Wzero-length-array]
>>>>>
>>>>>     $ grep Wzero-length-array uniq.txt | wc -l
>>>>>     8134
>>>>>
>>>>> > warning: anonymous structs are a GNU extension
>>>[-Wgnu-anonymous-struct]
>>>>>
>>>>>     $ grep Wgnu-anonymous-struct uniq.txt  | wc -l
>>>>>     2
>>>>>
>>>>> > warning: '%' may not be nested in a struct due to flexible array
>>>member [-Wflexible-array-extensions]
>>>>>
>>>>>     $ grep Wflexible-array-extensions uniq.txt | wc -l
>>>>>     2
>>>>>
>>>>> > warning: variable length array used [-Wvla]
>>>>> > warning: variable length arrays are a C99 feature
>>>[-Wvla-extension]
>>>>>
>>>>>     $ grep Wvla uniq.txt | wc -l
>>>>>     178
>>>>>
>>>>> > warning: cast between pointer-to-function and pointer-to-object
>is
>>>an extension [-Wpedantic]
>>>>> > warning: use of non-standard escape character '\%' [-Wpedantic]
>>>>>
>>>>>     $ grep Wpedantic uniq.txt | wc -l
>>>>>     15
>>>>>
>>>>> > warning: use of GNU old-style field designator extension
>>>[-Wgnu-designator]
>>>>>
>>>>>     $ grep Wgnu-designator uniq.txt | wc -l
>>>>>     43
>>>>>
>>>>> > warning: anonymous unions are a C11 extension [-Wc11-extensions]
>>>>>
>>>>>     $ grep Wc11-extensions uniq.txt | wc -l
>>>>>     3
>>>>>
>>>>>
>>>>> Miscellaneous:
>>>>> --------------
>>>>>
>>>>> > warning: '%' hides overloaded virtual function
>>>[-Woverloaded-virtual]
>>>>>
>>>>>     $ grep Woverloaded-virtual uniq.txt | wc -l
>>>>>     3
>>>>>
>>>>> > warning: using namespace directive in global context in header
>>>[-Wheader-hygiene]
>>>>>
>>>>>     $ grep Wheader-hygiene uniq.txt | wc -l
>>>>>     87
>>>>>
>>>>> > warning: extra ';' after member function definition
>[-Wextra-semi]
>>>>>
>>>>>     $ grep Wextra-semi uniq.txt | wc -l
>>>>>     135
>>>>>
>>>>> > warning: struct '%' was previously declared as a class
>>>[-Wmismatched-tags]
>>>>>
>>>>>     $ grep Wmismatched-tags uniq.txt | wc -l
>>>>>     93
>>>>>
>>>>> > warning: missing field '%' initializer
>>>[-Wmissing-field-initializers]
>>>>>
>>>>>     $ grep Wmissing-field-initializer uniq.txt | wc -l
>>>>>     3
>>>>>
>>>>> > warning: private field '%' is not used [-Wunused-private-field]
>>>>>
>>>>>     $ grep Wunused-private-field uniq.txt | wc -l
>>>>>     11
>>>>>
>>>>> > warning: function '%' could be declared with attribute
>'noreturn'
>>>[-Wmissing-noreturn]
>>>>>
>>>>>     $ grep Wmissing-noreturn uniq.txt | wc -l
>>>>>     23
>>>>>
>>>>> > warning: token pasting of ',' and __VA_ARGS__ is a GNU extension
>>>[-Wgnu-zero-variadic-macro-arguments]
>>>>>
>>>>>     $ grep Wgnu-zero-variadic-macro-arguments uniq.txt | wc -l
>>>>>     4
>>>>>
>>>>>
>>>>>
>>>>> Style
>>>>> -----
>>>>>
>>>>> While fixing my initial VLA issue I skimmed through the relevant
>>>code
>>>>> in src/rgw/rgw_main.cc.
>>>>>
>>>>> What I noticed is that already in this file there's no consistent
>>>style.
>>>>> There's a "using namespace std;" at the top, then in the file
>>>there's an
>>>>> alternate usage between vector<string> and
>std::vector<std::string>.
>>>>>
>>>>> It also seems to me that the memory management is often done by
>>>hand, e.g.
>>>>> in line 377 "string *objs = new string[num_objs];" and later the
>>>"delete []"
>>>>> instead of just using a std::vector<std::string>.
>>>>>
>>>>>
>>>>> Summary
>>>>> -------
>>>>>
>>>>> Using Clang-3.4's -Weverything reveals critical issues.
>>>>>
>>>>> What I hope for Ceph's general code quality is to improve by using
>>>the tools
>>>>> at hand. This means especially:
>>>>>
>>>>> Do the same check with an up to date Clang (that's one of the
>>>reasons I'm not
>>>>> pointing out all the issues or the concrete source code
>positions),
>>>look
>>>>> through them in detail, such as conversions and left-out
>>>diagnostics, and fix
>>>>> as many as critical issues as possible.
>>>>>
>>>>> You probably could even use Clang's -fsanitize flag to check for
>>>issues at runtime.
>>>>>
>>>http://clang.llvm.org/docs/UsersManual.html#controlling-code-generation
>>>>>
>>>>> Also keep in mind that this check was only done on the userspace
>>>code in the
>>>>> ceph/ceph repository.
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Daniel J. Hofmann
>>>>> --
>>>>> To unsubscribe from this list: send the line "unsubscribe
>>>ceph-devel" in
>>>>> the body of a message to majordomo@vger.kernel.org
>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>>
>>>>>
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe
>ceph-devel"
>>>in
>>>> the body of a message to majordomo@vger.kernel.org
>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: A plea for more tooling usage
  2014-05-06 15:19 ` Sage Weil
  2014-05-06 15:35   ` Milosz Tanski
@ 2014-05-15 23:34   ` Thorsten Behrens
  2014-05-27 22:19     ` Thorsten Behrens
  1 sibling, 1 reply; 11+ messages in thread
From: Thorsten Behrens @ 2014-05-15 23:34 UTC (permalink / raw)
  To: Sage Weil; +Cc: Daniel Hofmann, ceph-devel

[-- Attachment #1: Type: text/plain, Size: 1201 bytes --]

Sage Weil wrote:
> If anybody is interested in helping with that effort, pull requests
> are very welcome!  :)
> 
Here goes: https://github.com/ceph/ceph/pull/1814

I'm not sure though I like what I did, this macro magic is slightly
over-verbose, and even clang error messages are not overly helpful,
should one get something wrong around there.

As hinted at in the patch, something like boost::program_options would
be nice, but that's a chunk of work & I'd rather hear feedback first
this way or the other. For the boost option, not sure if verbose error
reporting (the if (oss) parts) would translate over easily, I presume
that is deemed important?

Another pending fix in my fork, is consolidating the spelling of long
args in the code (majority uses dash as word separator, minority
underscore). That conflicts badly with the ceph_argparse_flag() change
of course, thus unsubmitted & curious about your feedback for the
while. Again, retaining acceptance of both spellings is probably
mandatory? ;)

Cheers,

-- 

Thorsten Behrens

SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg; GF: Jeff
Hawn, Jennifer Guild, Felix Imendörffer, HRB 16746 (AG Nürnberg)

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 966 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: A plea for more tooling usage
  2014-05-15 23:34   ` Thorsten Behrens
@ 2014-05-27 22:19     ` Thorsten Behrens
  2014-05-28 21:21       ` Josh Durgin
  0 siblings, 1 reply; 11+ messages in thread
From: Thorsten Behrens @ 2014-05-27 22:19 UTC (permalink / raw)
  To: Danny Al-Gaaf; +Cc: ceph-devel

[-- Attachment #1: Type: text/plain, Size: 677 bytes --]

I wrote:
> Sage Weil wrote:
> > If anybody is interested in helping with that effort, pull requests
> > are very welcome!  :)
> > 
[snip]

> As hinted at in the patch, something like boost::program_options would
> be nice, but that's a chunk of work & I'd rather hear feedback first
> this way or the other. For the boost option, not sure if verbose error
> reporting (the if (oss) parts) would translate over easily, I presume
> that is deemed important?
> 
Hi guys,

any further opinion on that? If no one feels strongly, I think I'll
just go ahead & hack something up with program_options, and see if
that looks more pleasant. :)

Cheers,

-- Thorsten

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 966 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: A plea for more tooling usage
  2014-05-27 22:19     ` Thorsten Behrens
@ 2014-05-28 21:21       ` Josh Durgin
  0 siblings, 0 replies; 11+ messages in thread
From: Josh Durgin @ 2014-05-28 21:21 UTC (permalink / raw)
  To: Thorsten Behrens, Danny Al-Gaaf; +Cc: ceph-devel

On 05/27/2014 03:19 PM, Thorsten Behrens wrote:
> I wrote:
>> Sage Weil wrote:
>>> If anybody is interested in helping with that effort, pull requests
>>> are very welcome!  :)
>>>
> [snip]
>
>> As hinted at in the patch, something like boost::program_options would
>> be nice, but that's a chunk of work & I'd rather hear feedback first
>> this way or the other. For the boost option, not sure if verbose error
>> reporting (the if (oss) parts) would translate over easily, I presume
>> that is deemed important?
>>
> Hi guys,
>
> any further opinion on that? If no one feels strongly, I think I'll
> just go ahead & hack something up with program_options, and see if
> that looks more pleasant. :)

It'd be interesting if you could use boost::program_options just
for the binary-specific flags.

Trying to use it for all options, including those from
common/config_opts.h, is a lot more effort. There was some work towards
that a while ago, though the branch is no longer there:

http://comments.gmane.org/gmane.comp.file-systems.ceph.devel/11139

Josh

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2014-05-28 21:21 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-05-06 13:48 A plea for more tooling usage Daniel Hofmann
2014-05-06 15:19 ` Sage Weil
2014-05-06 15:35   ` Milosz Tanski
2014-05-08 11:37     ` Daniel Hofmann
2014-05-08 12:55       ` Milosz Tanski
2014-05-08 13:19         ` Daniel Hofmann
2014-05-08 14:22         ` Daniel Hofmann
2014-05-15 23:34   ` Thorsten Behrens
2014-05-27 22:19     ` Thorsten Behrens
2014-05-28 21:21       ` Josh Durgin
2014-05-08 10:37 ` John Spray

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.