Dataset for: A Statistical Approach to Combining Multi-Source Information in One-Class Classifiers
Dataset posted on 08.06.2017 by Katherine Simonson, R. Derek West, Ross Hansen, Thomas LaBruyere III, Mark VanBenthem
A new method is introduced for combining information from multiple sources to support one-class classification. The contributing sources may represent measurements taken by different sensors of the same physical entity, repeated measurements by a single sensor, or numerous features computed from a single measured image or signal. The approach utilizes the theory of statistical hypothesis testing, and applies Fisher’s approach to combining p-values, modified to handle non-independent sources. Classifier outputs take the form of fused p-values, which may be used to gauge the consistency of unknown entities with one or more class hypotheses. The approach enables rigorous assessment of classification uncertainties, and allows for traceability of classifier decisions back to the constituent sources, both of which are important for high-consequence decision support. Application of the technique is illustrated on two challenge problems, one for skin segmentation and the other for terrain labelling. The method is seen to be particularly effective for relatively small training samples.
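As an illustration of the p-value fusion step mentioned above, the following is a minimal sketch of Fisher's classical method for combining p-values. It assumes the sources are independent, so it does not reproduce the modification for non-independent sources that the description refers to. The function name `fisher_combine` and the example inputs are hypothetical and are not taken from the dataset.

```python
import math


def fisher_combine(p_values):
    """Fuse p-values with Fisher's classical method.

    Assumption: the p-values come from independent sources. The
    adjustment for dependent sources described in the abstract is
    NOT implemented here.
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")

    k = len(p_values)
    # Fisher's statistic is X = -2 * sum(ln p_i). Under the null
    # hypothesis it follows a chi-square distribution with 2k
    # degrees of freedom. We work with X / 2 directly.
    half_x = -sum(math.log(p) for p in p_values)

    # For an even number of degrees of freedom (2k), the chi-square
    # upper-tail probability has a closed form:
    #   P(X > x) = exp(-x/2) * sum_{j=0}^{k-1} (x/2)^j / j!
    term = 1.0
    total = 1.0
    for j in range(1, k):
        term *= half_x / j
        total += term
    return math.exp(-half_x) * total


if __name__ == "__main__":
    # Hypothetical p-values from three sources for one unknown entity.
    fused = fisher_combine([0.20, 0.08, 0.35])
    print(f"fused p-value: {fused:.4f}")
```

A small fused p-value indicates that the entity is inconsistent with the class hypothesis under test. Because the fused value is a function of the per-source p-values, a decision can be traced back to the sources that contributed most to it.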