The correct answer is D because low latency and optimized inference speed are critical for real-time applications. A system that delivers real-time service quotes must respond within milliseconds, or at most a few seconds, which makes latency a primary concern when choosing the model.
From AWS Bedrock documentation:
"When selecting a foundation model for real-time applications, inference speed and latency are key evaluation metrics to ensure responsive user experiences."
Why the other options are incorrect:
A. Model size affects capability and cost, but a given size does not by itself guarantee low latency.
B. Training data quality is important for accuracy, but it doesn't address real-time performance requirements.
C. GPU availability matters in infrastructure planning, not in model selection for latency optimization.
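To treat latency as a concrete evaluation metric rather than a vague preference, candidate models can be benchmarked with a small timing harness. The sketch below is illustrative only: `invoke_stub` is a hypothetical stand-in for a real model call (for example, boto3's `bedrock-runtime` `InvokeModel` API), so the simulated delays do not reflect any actual model.

```python
import time
import random
import statistics

def invoke_stub(prompt: str) -> str:
    """Hypothetical stand-in for a real inference call (e.g. Bedrock InvokeModel).
    Sleeps 10-30 ms to simulate variable inference latency."""
    time.sleep(random.uniform(0.01, 0.03))
    return f"quote for: {prompt}"

def measure_latency(invoke, prompt: str, runs: int = 20) -> dict:
    """Time repeated invocations and report p50/p95 latency in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        invoke(prompt)
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    return {
        "p50_ms": statistics.median(samples),
        "p95_ms": samples[int(0.95 * (len(samples) - 1))],
    }

stats = measure_latency(invoke_stub, "auto insurance quote, driver age 30")
print(f"p50={stats['p50_ms']:.1f} ms  p95={stats['p95_ms']:.1f} ms")
```

Running the same harness against each candidate model (with identical prompts) makes the latency comparison directly relevant to the real-time quoting use case; tail latency (p95) often matters more than the median for user-facing responsiveness.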
Referenced AWS AI/ML Documents and Study Guides:
Amazon Bedrock Model Selection Guide – Real-time Use Case Considerations
AWS ML Specialty Guide – Foundation Model Performance Criteria