Publication details

Research Report

Introduction to Feature Selection Toolbox 3 – The C++ Library for Subset Search, Data Modeling and Classification

Somol Petr, Vácha Pavel, Mikeš Stanislav, Hora Jan, Pudil Pavel, Žid Pavel

: ÚTIA (Praha 2010)

: Research Report 2287

: CEZ:AV0Z10750506

: 1M0572, GA MŠk, 2C06019, GA MŠk

Keywords: feature selection, software library, subset search, attribute selection, variable selection, optimization, machine learning, classification, pattern recognition

: http://fst.utia.cz/download/FST3_Introduction_UTIA_TR2287.pdf

(eng): We introduce a new standalone, widely applicable software library for feature selection (also known as attribute or variable selection), capable of reducing problem dimensionality in order to maximize the accuracy of data models and the performance of automatic decision rules, and to reduce data acquisition cost. The library can be used in research as well as in industry: less experienced users can experiment with the provided methods and apply them to real-life problems, while experts can implement their own criteria or search schemes within the toolbox framework. In this paper we first provide a concise survey of existing feature selection approaches. We then focus on a selected group of methods with good general performance, as well as on tools that go beyond the limits of existing libraries. We build a feature selection framework around them and design an object-based generic software library. Finally, we describe the key design points and properties of the library.

: BD