ADAPTIVE FEATURE EXTRACTION AND COST-SENSITIVE NEURAL LEARNING FOR ROBUST CLASSIFICATION IN HIGH-IMBALANCE DOMAINS: BRIDGING BLIND SOURCE SEPARATION AND FRAUD DETECTION
Keywords:
Class Imbalance, Independent Component Analysis, Neural Networks, Fraud Detection
Abstract
The proliferation of high-dimensional datasets has exacerbated the challenge of detecting rare but critical events, such as financial fraud or distinct medical pathologies. Traditional machine learning algorithms, optimized for overall accuracy, frequently exhibit a bias toward the majority class, rendering them ineffective in high-imbalance scenarios. This article proposes and evaluates a novel hybrid architecture: the Independent Component Analysis–Cost-Sensitive Neural Network (ICA-CSNN). By integrating the blind source separation capabilities of ICA with the decision-boundary flexibility of neural networks, we aim to isolate latent "fraud" signals from background noise before classification. We first employ Information-Maximization (Infomax) ICA to transform correlated input features into statistically independent components, effectively disentangling the underlying data structure. Subsequently, these components serve as inputs to a Multi-Layer Perceptron (MLP) trained using a cost-sensitive backpropagation algorithm that disproportionately penalizes minority-class errors. Drawing upon seminal work in signal processing and recent advances in imbalance learning, we demonstrate that this dual-stage approach significantly improves sensitivity and F-measure scores compared to traditional methods. Our analysis suggests that enforcing statistical independence in the feature space aids the neural network in converging upon optimal decision boundaries, even when the minority class represents less than 1% of the data. The findings have direct implications for the design of robust automated detection systems in cybersecurity and healthcare.
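The two stages described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`infomax_ica`, `train_cost_sensitive_mlp`), the cost parameters (`cost_pos`, `cost_neg`), the single hidden layer, and all hyperparameter values are hypothetical. The ICA stage uses the natural-gradient form of the Infomax update with a logistic nonlinearity, which assumes heavy-tailed (super-Gaussian) sources and a full-rank feature covariance. The classifier stage assumes binary labels in {0, 1} and realizes cost sensitivity as a class-weighted cross-entropy.

```python
import numpy as np

def sigmoid(z):
    # Clipping avoids overflow in exp for extreme activations.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))

def infomax_ica(X, lr=0.1, n_iter=500):
    """Stage 1: natural-gradient Infomax ICA with a logistic nonlinearity.

    X has shape (n_samples, n_features). Returns (S, W, mean), where
    S = (X - mean) @ W.T holds the estimated independent components.
    """
    mean = X.mean(axis=0)
    Xc = X - mean
    # Symmetric whitening: decorrelate and rescale to unit variance.
    evals, evecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    V = evecs @ np.diag(1.0 / np.sqrt(evals)) @ evecs.T
    Z = Xc @ V.T
    n, d = Z.shape
    B = np.eye(d)  # unmixing matrix in the whitened space
    for _ in range(n_iter):
        U = Z @ B.T
        Y = sigmoid(U)
        # Batch-averaged natural-gradient step: dB = (I + (1 - 2y) u^T) B
        B += lr * (np.eye(d) + (1.0 - 2.0 * Y).T @ U / n) @ B
    W = B @ V
    return Xc @ W.T, W, mean

def train_cost_sensitive_mlp(S, y, cost_pos=10.0, cost_neg=1.0,
                             hidden=8, lr=0.2, n_iter=1000, seed=0):
    """Stage 2: one-hidden-layer MLP trained by backpropagation on a
    class-weighted cross-entropy. Returns a predict_proba function."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.5, (S.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.5, hidden)
    b2 = 0.0
    # Per-sample misclassification cost, normalised to sum to 1 so the
    # learning rate is comparable across cost settings.
    cost = np.where(y == 1, cost_pos, cost_neg).astype(float)
    cost /= cost.sum()
    for _ in range(n_iter):
        H = np.tanh(S @ W1 + b1)                    # hidden activations
        p = sigmoid(H @ W2 + b2)                    # predicted P(y = 1)
        d_out = cost * (p - y)                      # cost-scaled output error
        d_hid = np.outer(d_out, W2) * (1.0 - H**2)  # backpropagate through tanh
        W2 -= lr * (H.T @ d_out)
        b2 -= lr * d_out.sum()
        W1 -= lr * (S.T @ d_hid)
        b1 -= lr * d_hid.sum(axis=0)

    def predict_proba(S_new):
        return sigmoid(np.tanh(S_new @ W1 + b1) @ W2 + b2)

    return predict_proba
```

In this sketch, a call such as `S, W, mean = infomax_ica(X)` followed by `predict = train_cost_sensitive_mlp(S, y, cost_pos=50.0)` would chain the two stages; unseen rows would be projected with `(X_new - mean) @ W.T` before calling `predict`. Setting `cost_pos` above `cost_neg` scales the gradient contribution of minority-class errors, which is one common way to realize a cost-sensitive backpropagation rule; the abstract does not specify the exact cost scheme used.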