A variable selection package driving Netica with Python


Bayesian Networks (BNs) are a useful method for probabilistic modelling of environmental systems. BN performance is sensitive to the number of variables included in the model framework, so selecting the optimum set of variables to include in a BN (“variable selection”) is a key part of the BN modelling process. Although variable selection is addressed in the wider BN and machine learning literature, it remains largely absent from environmental BN applications to date, due in large part to a lack of software designed to work with available BN packages. CVNetica_VS is an open-source Python module that extends the functionality of Netica, a commonly used commercial BN software package, to perform variable selection. CVNetica_VS uses wrapper-based variable selection and cross-validation to search for the optimum variable set to use in a BN. The software will help make the development of BNs in environmental applications more objective and more automated.
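The general idea of wrapper-based variable selection with cross-validation can be illustrated with a minimal sketch. It is not the CVNetica_VS or Netica API: a discrete naive Bayes classifier stands in for the BN, the search is assumed to be a greedy forward search (the search strategy is not specified above), and all function names, variable names (`rain`, `slope`, `noise`), and the synthetic data are hypothetical.

```python
import random
from collections import Counter, defaultdict


def fit_naive_bayes(rows, labels, features):
    """Count class and per-feature value frequencies (a very simple stand-in for a BN)."""
    class_counts = Counter(labels)
    cond_counts = defaultdict(Counter)  # (feature, class) -> Counter of values
    for row, y in zip(rows, labels):
        for f in features:
            cond_counts[(f, y)][row[f]] += 1
    return class_counts, cond_counts


def predict(model, row, features):
    """Return the most probable class; with no features this is the majority class."""
    class_counts, cond_counts = model
    total = sum(class_counts.values())
    best_class, best_score = None, -1.0
    for y, n_y in class_counts.items():
        score = n_y / total
        for f in features:
            # Laplace smoothing; assumes binary-valued features.
            score *= (cond_counts[(f, y)][row[f]] + 1) / (n_y + 2)
        if score > best_score:
            best_class, best_score = y, score
    return best_class


def cv_accuracy(rows, labels, features, k=5):
    """k-fold cross-validated accuracy for one candidate variable set."""
    correct = 0
    for fold in range(k):
        train = [i for i in range(len(rows)) if i % k != fold]
        test = [i for i in range(len(rows)) if i % k == fold]
        model = fit_naive_bayes([rows[i] for i in train],
                                [labels[i] for i in train], features)
        correct += sum(predict(model, rows[i], features) == labels[i] for i in test)
    return correct / len(rows)


def forward_select(rows, labels, candidates, k=5):
    """Greedy wrapper search: add the variable that most improves the CV score."""
    selected = []
    best = cv_accuracy(rows, labels, selected, k)  # baseline: class prior only
    while True:
        trials = [(cv_accuracy(rows, labels, selected + [f], k), f)
                  for f in candidates if f not in selected]
        if not trials:
            break
        score, feature = max(trials)
        if score <= best:  # stop when no candidate improves the score
            break
        selected.append(feature)
        best = score
    return selected, best


# Synthetic data (hypothetical): the label follows "rain", flipped 10% of the time.
rng = random.Random(0)
rows = [{"rain": rng.randint(0, 1), "slope": rng.randint(0, 1),
         "noise": rng.randint(0, 1)} for _ in range(300)]
labels = [row["rain"] ^ (rng.random() < 0.1) for row in rows]

selected, best = forward_select(rows, labels, ["rain", "slope", "noise"])
print(selected, round(best, 3))
```

The wrapper approach scores each candidate variable set by refitting and cross-validating the model itself, which is more expensive than filter methods but evaluates variables in the context in which they will be used.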

Journal of Environmental Modelling & Software, 115