Option B is the correct evaluation configuration because it enables end-to-end assessment of both retrieval and generation quality while supporting direct comparison of chunking strategies and foundation models. Amazon Bedrock evaluation jobs support RAG workflows by measuring how well retrieved context contributes to accurate, high-quality model outputs.
A retrieve-and-generate evaluation job assesses the complete RAG pipeline, not just the retrieval step. This is essential for medical information use cases, where both the relevance of retrieved content and the correctness of generated responses directly affect user safety and trust. Including multiple chunking strategies in the evaluation dataset allows side-by-side comparison under identical prompts and conditions.
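As a concrete illustration of holding prompts constant across strategies, the sketch below writes an identical prompt set for each chunking-strategy variant so that score differences can be attributed to chunking rather than to the questions asked. The record fields, file names, and strategy names are placeholders for illustration, not the exact Amazon Bedrock evaluation dataset schema.

```python
import json

# Shared prompts with vetted reference answers (placeholder content).
PROMPTS = [
    {
        "prompt": "What is the recommended adult dosage described in the formulary?",
        "referenceResponse": "State the approved adult dosage from the formulary entry.",
    },
    # ... additional medical-domain prompts with reviewed reference answers
]

# Each strategy corresponds to a knowledge base indexed with that chunking approach.
CHUNKING_STRATEGIES = ["fixed_512_tokens", "semantic", "hierarchical"]

for strategy in CHUNKING_STRATEGIES:
    # One JSONL file per strategy; every file contains the same prompts, so results
    # differ only because of how the source documents were chunked and indexed.
    with open(f"eval_dataset_{strategy}.jsonl", "w") as f:
        for record in PROMPTS:
            f.write(json.dumps(record) + "\n")
```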
Custom precision-at-k metrics measure how effectively the retrieval component surfaces relevant chunks, while an LLM-as-a-judge metric provides qualitative scoring of generated responses. Using a numeric scale enables consistent, repeatable evaluation and supports automated quality gates. Amazon Bedrock supports LLM-based evaluators to score dimensions such as accuracy, completeness, and relevance.
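For reference, precision-at-k itself reduces to a small computation over retrieved chunk identifiers and ground-truth relevance labels. A minimal sketch (the identifiers and labels are placeholders) follows:

```python
from typing import Sequence, Set

def precision_at_k(retrieved_chunk_ids: Sequence[str],
                   relevant_chunk_ids: Set[str],
                   k: int) -> float:
    """Fraction of the top-k retrieved chunks that are labeled relevant.

    Dividing by k (rather than by the number actually retrieved) penalizes
    retrievals that return fewer than k chunks.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = retrieved_chunk_ids[:k]
    hits = sum(1 for chunk_id in top_k if chunk_id in relevant_chunk_ids)
    return hits / k

# Example: 2 of the top 3 retrieved chunks are relevant -> precision@3 ~= 0.67
print(precision_at_k(["c1", "c7", "c9"], {"c1", "c9", "c4"}, k=3))
```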
Using the same evaluator model to assess outputs from both FMs keeps scoring consistent, so any evaluator bias applies equally to both models and does not skew the comparison. This configuration allows the company to define quantitative thresholds that must be met before deployment, enabling automated promotion through CI/CD pipelines.
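A minimal sketch of such a quality gate is shown below; the metric names, threshold values, and hard-coded scores are assumptions for illustration rather than Bedrock-defined artifacts. In a pipeline, the scores would typically be parsed from the evaluation job's results in Amazon S3.

```python
# Deployment thresholds the candidate configuration must meet (illustrative values).
THRESHOLDS = {
    "precision_at_5": 0.80,       # retrieval quality floor
    "judge_accuracy": 4.0,        # LLM-as-a-judge score on a 1-5 scale
    "judge_completeness": 3.5,
}

def passes_quality_gate(scores: dict[str, float]) -> bool:
    """Return True only if every gated metric meets or exceeds its threshold."""
    failures = {}
    for name, minimum in THRESHOLDS.items():
        observed = scores.get(name, 0.0)
        if observed < minimum:
            failures[name] = (observed, minimum)
    for name, (observed, minimum) in failures.items():
        print(f"FAIL {name}: observed {observed} < required {minimum}")
    return not failures

if __name__ == "__main__":
    # Hard-coded for illustration; a real pipeline would load these from the
    # evaluation job output before deciding whether to promote the build.
    candidate_scores = {
        "precision_at_5": 0.86,
        "judge_accuracy": 4.2,
        "judge_completeness": 3.9,
    }
    if not passes_quality_gate(candidate_scores):
        raise SystemExit(1)  # non-zero exit blocks promotion in CI/CD
```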
Option A evaluates retrieval only and cannot assess generation quality. Option C introduces manual review, which does not scale and delays deployment. Option D separates retrieval and generation evaluation, making it harder to correlate chunking strategies with final output quality.
Therefore, Option B best meets the requirements for systematic evaluation, comparison, and quality enforcement in an Amazon Bedrock–based RAG system.