100% code coverage is near-meaningless - but is there a good measure to use?

vampatori@feddit.uk · 1 year ago

100% code coverage is near-meaningless - but is there a good measure to use?

RandomBit@lemmy.sdf.org · 1 year ago

“When a measure becomes a target, it ceases to be a good measure”. — Goodhart’s law

Zoe Codez@lemmy.digital-alchemy.app · 1 year ago

There are tools to detail the code coverage of your tests. I’ve worked with Istanbul in the past, and it helped point out parts of the code that could use more attention.

https://istanbul.js.org/

SorteKanin@feddit.dk · 1 year ago

> But is there any accepted means of formally measuring a system and ensuring that some level of test quality exists?

Formally? No, this is basically impossible by Rice’s Theorem. There is not even a guarantee that if you have 100% test coverage, the program is good (the tests could be flawed).

This is just a natural limitation of Turing completeness. You can’t decide these properties while also having full computational power. In order to decide such things, you need a less powerful model of computation (something not Turing complete) that can be analyzed more thoroughly and with more guarantees.
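To make the "100% coverage, flawed tests" point concrete, here is a minimal sketch. The function `safe_ratio` and its test are hypothetical, made up purely for illustration. Every line and both branches are executed by the test, yet the bug survives because the chosen inputs happen to hide it.

```python
def safe_ratio(a: float, b: float) -> float:
    """Hypothetical function: meant to return a / b, or 0.0 when b is zero."""
    if b == 0:
        return 0.0
    return a * b  # bug: should be a / b


def test_safe_ratio() -> None:
    # Both branches run, so line and branch coverage are 100%...
    assert safe_ratio(1, 0) == 0.0
    # ...but 1 * 1 equals 1 / 1, so this input cannot tell the bug apart.
    assert safe_ratio(1, 1) == 1.0


test_safe_ratio()  # passes despite the bug
```

A coverage tool would report this as fully covered; only a better-chosen input such as `safe_ratio(6, 3)` exposes the problem.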

fades@beehaw.org · 1 year ago

So true lol. Mgmt just announced a directive at my work last week that code must have 95-100% coverage.

Meanwhile they hire contractors from India that write the dumbest, most useless tests possible. I’ve worked with many great Indian devs, but the contractors we use today all seem like a step down in quality. More work for me, I guess.

MagicShel@programming.dev · edit-2 1 year ago

~~Pit~~ Mutation testing is useful. It basically tests how effective your tests are and tells you missed conditions that aren’t being tested.

For Java: https://pitest.org

Edit: corrected to the more general name instead of a specific implementation.
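For anyone who hasn't seen the idea before, here is a hand-rolled sketch of what a mutation testing tool automates. This is not how Pitest or any other tool is implemented; all names (`is_adult`, `weak_suite`, `mutant_killed`, and so on) are hypothetical. A "mutant" is a copy of the code with one small change, and a test suite "kills" it if at least one test fails against it.

```python
from typing import Callable

Predicate = Callable[[int], bool]


def is_adult(age: int) -> bool:
    """Hypothetical code under test."""
    return age >= 18


def is_adult_mutant(age: int) -> bool:
    """The same code with one mutation applied: >= became >."""
    return age > 18


def weak_suite(fn: Predicate) -> bool:
    # Covers every line of is_adult, but never checks the boundary.
    return fn(30) is True and fn(5) is False


def strong_suite(fn: Predicate) -> bool:
    # Adds the boundary case that distinguishes >= from >.
    return weak_suite(fn) and fn(18) is True


def mutant_killed(suite: Callable[[Predicate], bool],
                  original: Predicate, mutant: Predicate) -> bool:
    # The suite must pass on the original, otherwise the result means nothing.
    assert suite(original), "suite must pass on the unmutated code"
    # A mutant is killed when the suite fails against it.
    return not suite(mutant)


print(mutant_killed(weak_suite, is_adult, is_adult_mutant))
print(mutant_killed(strong_suite, is_adult, is_adult_mutant))
```

The weak suite lets the mutant survive, which is the signal that a condition is not really being tested; the strong suite kills it. Real tools generate many such mutants automatically and report the fraction killed.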

mattburkedev@programming.dev · 1 year ago

The most extreme examples of the problem are tests with no assertions. Fortunately these are uncommon in most code bases.

Every enterprise I’ve consulted for that had code coverage requirements was full of elaborate mock-heavy tests with a single Assert.NotNull at the end. Basically just testing that you wrote the right mocks!

Deely@programming.dev · 1 year ago

Yeah. All the same. Create a lazy metric, get lazy and useless results.

xthexder@l.sw0.com · 1 year ago

I’d never heard of mutation testing before either, and it seems really interesting. It reminds me of fuzzing, except applied to the code instead of the input. Maybe a little impractical for some codebases with long build times though. Still, I’ll have to give it a try for a future project. It looks like there are several tools for mutation testing C/C++.

The most useful tests I write are generally regression tests. Every time I find a bug, I’ll replicate it in a test case, then fix the bug. I think this is just basic Test-Driven-Development practice, but it’s very useful to verify that your tests actually fail when they should. Mutation/Pit testing seems like it addresses that nicely.
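A small sketch of that workflow, with a made-up bug (the version-comparison functions are hypothetical): first reproduce the bug in a test and confirm the test fails against the buggy code, then fix the code and confirm the same test passes.

```python
from typing import Callable


def newer_buggy(a: str, b: str) -> bool:
    """Hypothetical buggy version check: compares version strings as text."""
    return a > b  # "1.10" sorts before "1.9" as a string


def newer_fixed(a: str, b: str) -> bool:
    """Fixed version: compares the numeric components."""
    def key(version: str) -> tuple[int, ...]:
        return tuple(int(part) for part in version.split("."))
    return key(a) > key(b)


def regression_test(newer: Callable[[str, str], bool]) -> bool:
    # Reproduces the reported bug: 1.10 should count as newer than 1.9.
    return newer("1.10", "1.9") is True


# Step 1: the test must fail on the buggy code, or it proves nothing.
assert not regression_test(newer_buggy)
# Step 2: after the fix, the same test passes and guards against a relapse.
assert regression_test(newer_fixed)
```

Seeing the test fail before the fix is the manual version of what mutation testing checks automatically.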

Sleepkever@lemm.ee · 1 year ago

We are running the above PIT tests with an extra (Gradle-based) build plugin so that it only runs mutations for the changed lines in that pull request. That drastically reduces runtime and still ensures that new code is covered to the mutation test level we want. Maybe something similar can be done for C or C++ projects.

robotdna@lemmy.world · 1 year ago

Does something like this exist for Python?

v_krishna@lemmy.ml · 1 year ago

https://mutatest.readthedocs.io/en/latest/
