For a financial services firm building an AI model for fraud detection, the accuracy and trustworthiness of transaction data are critical. PMI-CPMAI’s guidance on AI data governance stresses the need to understand where data comes from, how it flows, and what transformations it undergoes before it is used for model training or inference. This is precisely what data lineage tools are designed to support.
Data lineage enables teams to trace data back to its original source, see each processing step (cleansing, aggregation, enrichment), and verify that transformations conform to defined business and regulatory rules. In regulated sectors like finance, this traceability is essential for audits, model validation, and demonstrating that AI decisions (such as fraud flags) are based on accurate, well-governed data. Technologies like blockchain (option C) or batch cleansing (option D) may have roles in specific architectures, but neither shows where a value originated or how it was transformed along the way: blockchain makes records tamper-evident without verifying that what was recorded is correct, and batch cleansing fixes known defects without documenting the path the data took. PMI-CPMAI's approach to AI governance places primary emphasis on visibility, traceability, and control over the data lifecycle.
A federated database system (option B) addresses how data is accessed across systems; it does not by itself make that data more accurate. By contrast, data lineage tools directly support identifying and validating data sources and checking whether the data remains accurate after passing through multiple processing hops. Therefore, in line with PMI-CPMAI data governance practices, option A is the most effective method listed to help ensure data accuracy.
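To make the idea of tracing data across multiple hops concrete, here is a minimal Python sketch of lineage tracking. It is an illustration only, not how any particular lineage tool works: the class names (`LineageStep`, `TrackedDataset`), the source name `core_banking.transactions`, the transformation functions, and the 10,000 threshold are all hypothetical and chosen for this example.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]


@dataclass
class LineageStep:
    """One recorded hop: which transformation ran and how row counts changed."""
    name: str
    rows_in: int
    rows_out: int


@dataclass
class TrackedDataset:
    """Rows plus the source they came from and every step applied so far."""
    source: str
    rows: List[Record]
    history: List[LineageStep] = field(default_factory=list)

    def apply(self, name: str,
              transform: Callable[[List[Record]], List[Record]]) -> "TrackedDataset":
        # Run the transformation and append a lineage entry; the original
        # dataset is left unchanged so earlier states stay inspectable.
        new_rows = transform(self.rows)
        step = LineageStep(name, len(self.rows), len(new_rows))
        return TrackedDataset(self.source, new_rows, self.history + [step])

    def trace(self) -> List[str]:
        # Human-readable path from the original source to the current state.
        lines = [f"source: {self.source}"]
        lines += [f"{s.name}: {s.rows_in} -> {s.rows_out} rows" for s in self.history]
        return lines


def drop_missing_amounts(rows: List[Record]) -> List[Record]:
    # Cleansing step (hypothetical rule): discard rows with no amount.
    return [r for r in rows if r.get("amount") is not None]


def add_high_value_flag(rows: List[Record]) -> List[Record]:
    # Enrichment step (hypothetical rule): mark amounts above 10,000.
    return [{**r, "high_value": r["amount"] > 10_000} for r in rows]


raw = TrackedDataset(
    source="core_banking.transactions",  # hypothetical source system
    rows=[
        {"id": 1, "amount": 250.0},
        {"id": 2, "amount": None},
        {"id": 3, "amount": 18_000.0},
    ],
)

features = (
    raw.apply("cleanse", drop_missing_amounts)
       .apply("enrich", add_high_value_flag)
)

for line in features.trace():
    print(line)
```

The trace shows that one of the three source rows was dropped during cleansing, which is the kind of question an auditor or model validator would ask before trusting a fraud flag. Production lineage tools typically capture this metadata automatically from pipelines rather than through hand-written wrappers like this one.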