In AI fraud detection for financial institutions, PMI-CPMAI–aligned practices place strong emphasis on data quality, completeness, and relevance as the foundation of model reliability and regulatory compliance. Because the team has access to various internal and external data sources, the appropriate method is to perform a comprehensive data audit and cleansing process.
A data audit systematically examines each source for accuracy, consistency, timeliness, coverage of key fraud patterns, and alignment with business and regulatory needs. It checks for missing values, duplicates, inconsistencies across systems, and potential bias (e.g., underrepresentation of certain customer segments or regions). Cleansing then addresses the identified issues through deduplication, normalization, imputation where appropriate, and removal of unusable or misleading records. This process ensures that the data used to train and operate the AI solution truly reflects real-world transactions and fraud behaviors, supporting trustworthy and explainable outcomes.
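As a rough illustration of the audit and cleansing steps described above, the following Python sketch profiles a small batch of transaction records and then cleans it. Everything in it is an assumption made for illustration: the field names (`txn_id`, `amount`, `currency`, `region`), the rule that a record without an identifier or amount is unusable, and the sample data are hypothetical and do not come from PMI-CPMAI material or any specific institution. Imputation is deliberately left out; missing categorical values are kept as `None` so that a later, domain-specific step can decide how to handle them.

```python
from collections import Counter

# Hypothetical schema; a real transaction feed will have different fields.
REQUIRED_FIELDS = ("txn_id", "amount", "currency", "region")


def audit(records):
    """Summarize missing values, duplicate IDs, and per-region coverage."""
    missing = Counter()
    ids = Counter()
    regions = Counter()
    for rec in records:
        for field in REQUIRED_FIELDS:
            if rec.get(field) in (None, ""):
                missing[field] += 1
        if rec.get("txn_id"):
            ids[rec["txn_id"]] += 1
        if rec.get("region"):
            # Normalize before counting so "eu " and "EU" are one segment.
            regions[rec["region"].strip().upper()] += 1
    duplicates = sum(n - 1 for n in ids.values() if n > 1)
    return {
        "rows": len(records),
        "missing": dict(missing),
        "duplicates": duplicates,
        "regions": dict(regions),
    }


def cleanse(records):
    """Drop unusable rows, deduplicate by txn_id, and normalize text fields."""
    seen = set()
    cleaned = []
    for rec in records:
        txn_id = rec.get("txn_id")
        amount = rec.get("amount")
        if not txn_id or amount is None:
            continue  # assumed rule: no identifier or no amount means unusable
        if txn_id in seen:
            continue  # duplicate of an earlier record
        seen.add(txn_id)
        cleaned.append({
            "txn_id": txn_id,
            "amount": float(amount),  # assumes amounts are numeric or numeric strings
            "currency": (rec.get("currency") or "").strip().upper() or None,
            "region": (rec.get("region") or "").strip().upper() or None,
        })
    return cleaned


# Hypothetical sample batch combining typical quality problems.
sample = [
    {"txn_id": "t1", "amount": "10.50", "currency": "usd", "region": "eu "},
    {"txn_id": "t1", "amount": "10.50", "currency": "usd", "region": "eu "},
    {"txn_id": "t2", "amount": None, "currency": "USD", "region": "US"},
    {"txn_id": "t3", "amount": 99, "currency": "", "region": "us"},
]

report = audit(sample)
clean = cleanse(sample)
```

The audit report is produced before any record is changed, so the team can review the extent of each problem (and any thin coverage of a region or segment) before deciding which cleansing rules are appropriate.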
Limiting data to internal sources only (option B) may unnecessarily reduce coverage and predictive power, especially when reputable external data (e.g., watchlists, consortium data) can enhance detection. Integrating data “as is” (option C) violates good AI governance and greatly increases the risk of poor model performance and regulatory concerns. Using pretrained models without tailoring (option D) ignores the need for alignment with the institution’s own data and fraud patterns. Therefore, the method that directly addresses the objectives is conducting a comprehensive data audit and cleansing process.