Coding quality
I value software quality. Higher quality software has fewer defects, better security, and better performance, which leads to happier users who can work more effectively. Code reviews are an effective method for improving software quality. McConnell (2004) suggests that unit testing finds approximately 25% of defects, function testing 35%, integration testing 45%, and code review 55-60%. While this means that none of these methods are good enough on their own and that they should be combined, clearly code review is an essential tool here.
This library is therefore developed using several techniques: a consistent coding style, low complexity, docstrings, reviews, and unit tests. Such conventions help to improve quality and make the code cleaner and easier to understand, and they also help to trace future bugs and spot syntax errors.
library
The file structure of the generated package looks like:
path/to/hgboost/
├── .editorconfig
├── .gitignore
├── .pre-commit-config.yml
├── .prospector.yml
├── CHANGELOG.rst
├── docs
│   ├── conf.py
│   ├── index.rst
│   └── ...
├── LICENSE
├── MANIFEST.in
├── NOTICE
├── hgboost
│   ├── __init__.py
│   ├── __version__.py
│   └── hgboost.py
├── README.md
├── requirements.txt
├── setup.cfg
├── setup.py
└── tests
    ├── __init__.py
    └── test_hgboost.py
Style
This library is compliant with the PEP 8 standard. PEP stands for Python Enhancement Proposal; PEP 8 is the style guide that sets a baseline for the readability of Python code. Each public function contains a docstring that follows the NumPy docstring standard.
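As an illustration of the docstring convention, here is a minimal sketch of a NumPy-style docstring. The function scale_values is hypothetical and is not part of hgboost; it only shows the section layout (Parameters, Returns, Examples).

```python
def scale_values(values, factor=1.0):
    """Multiply each value by a constant factor.

    Hypothetical helper, used only to show the NumPy docstring layout.

    Parameters
    ----------
    values : list of float
        Input values to scale.
    factor : float, default: 1.0
        Multiplier applied to every value.

    Returns
    -------
    list of float
        Scaled values, in the same order as the input.

    Examples
    --------
    >>> scale_values([1.0, 2.0], factor=2.0)
    [2.0, 4.0]
    """
    return [v * factor for v in values]
```

Documentation tools and editors can parse these sections, so the same text serves both the source code reader and the generated documentation.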
Complexity
This library has been developed using measures that help decrease technical debt.
Version 0.1.0 of the hgboost library scored, according to the code analyzer: VALUE, for which values > 0 are good and 10 is the maximum score.
Developing software with low(er) technical debt may take extra development time, but has many advantages:
Higher quality code
Easier to maintain
Less prone to bugs and errors
Higher security
Unit tests
The use of unit tests is essential to guarantee consistent output from the developed functions.
The following checks are covered by tests.test_hgboost():
The input parameters are checked.
The output values are checked, including whether they are encoded properly.
The parameters are checked for being handled correctly.
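The kind of check listed above can be sketched as follows. This is a minimal illustration and not the actual hgboost test code: the function validate_test_size and its error message are hypothetical, and only the pattern (accept valid input, reject invalid input with a readable message) is the point.

```python
def validate_test_size(test_size):
    """Hypothetical input check: accept None or a fraction strictly between 0 and 1."""
    if test_size is None:
        return None
    is_number = isinstance(test_size, (int, float)) and not isinstance(test_size, bool)
    if not is_number or not 0 < test_size < 1:
        raise ValueError(
            f"test_size must be None or a float between 0 and 1, got {test_size!r}"
        )
    return float(test_size)


def test_valid_inputs_are_accepted():
    # Both states used in the parameter grid (None and 0.2) should pass.
    assert validate_test_size(None) is None
    assert validate_test_size(0.2) == 0.2


def test_invalid_input_gives_clear_error():
    # An out-of-range value should raise an error that names the parameter.
    try:
        validate_test_size(1.5)
    except ValueError as err:
        assert "test_size" in str(err)
    else:
        raise AssertionError("expected a ValueError")
```

Functions named test_* in a file such as tests/test_hgboost.py are collected automatically by pytest.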
For each method, i.e. xgboost(_reg), catboost(_reg), and lightboost(_reg), I verified correct behavior using all combinations of input parameters.
Each input parameter can take multiple states, so I created the set below, which results in 2304 different combinations.
All errors are captured by hgboost and a (hopefully) understandable error message is given.
- Parameters combined:
max_evals = [None, 10]
cvs = [None, 5, 11]
val_sizes = [None, 0.2]
test_sizes = [None, 0.2]
methods = ['xgb_clf', 'xgb_clf_multi']
pos_labels = [None, 0, 2, 'value not in y']
top_cv_evals = [None, 1, 20]
thresholds = [None, 0.5]
eval_metrics = [None, 'f1']
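The count of 2304 follows from multiplying the number of states per parameter (2 x 3 x 2 x 2 x 2 x 4 x 3 x 2 x 2). A minimal sketch that reproduces it with itertools.product is shown here; the dictionary name param_grid is my own choice, and how the real test iterates over the combinations is an assumption.

```python
from itertools import product

# Parameter states as listed above.
param_grid = {
    "max_evals": [None, 10],
    "cvs": [None, 5, 11],
    "val_sizes": [None, 0.2],
    "test_sizes": [None, 0.2],
    "methods": ["xgb_clf", "xgb_clf_multi"],
    "pos_labels": [None, 0, 2, "value not in y"],
    "top_cv_evals": [None, 1, 20],
    "thresholds": [None, 0.5],
    "eval_metrics": [None, "f1"],
}

# Cartesian product: one tuple per unique combination of parameter states.
combinations = list(product(*param_grid.values()))
n_combinations = len(combinations)
print(n_combinations)  # 2304

# Each tuple can be turned back into keyword-style settings for a test run.
first_setting = dict(zip(param_grid.keys(), combinations[0]))
```

Running every combination is slow, which is consistent with the roughly 20 minute test session shown in the pytest output.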
pytest tests\test_hgboost.py
====================================== test session starts ======================================
platform win32 -- Python 3.6.10, pytest-5.4.0, py-1.8.1, pluggy-0.13.1
collected 3 items
tests\test_hgboost.py ... [100%]
======================================= warnings summary ========================================
tests/test_hgboost.py::test_plot
=========================== 3 passed, 1 warning in 1254.97s (0:20:54) ===========================