The correct answer is C. Continued pre-training (also known as domain-adaptive pre-training) continues training an already pre-trained model on unlabeled domain-specific data. This adapts the LLM to a specific domain without requiring labeled datasets, making it ideal when the goal is to enhance the model's understanding of technical language and terminology.
From AWS documentation:
"Continued pre-training allows an LLM to ingest large volumes of domain-specific text without labels to improve contextual understanding in a particular area. This is effective when adapting a foundation model to new knowledge without altering the model architecture."
Explanation of other options:
A. Full training refers to building a model from scratch, which is extremely resource-intensive and unnecessary if a strong base model already exists.
B. Supervised fine-tuning requires labeled examples (such as prompt–completion pairs), which the scenario states are not available.
D. Retrieval-Augmented Generation (RAG) retrieves external information at inference time to augment the prompt; it is not a training technique that consumes unlabeled data (see the sketch after this list).
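To make the contrast in option D concrete, the sketch below shows retrieval happening entirely at inference time: retrieved text is prepended to the prompt and sent to an unmodified foundation model, so no model weights are updated. The retrieve_relevant_docs helper and the model ID are hypothetical placeholders.

```python
import boto3

bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

def retrieve_relevant_docs(query: str) -> list[str]:
    """Hypothetical retriever: in practice this would query a vector store or
    a Bedrock knowledge base. No training data or labels are involved."""
    return ["<domain document snippet 1>", "<domain document snippet 2>"]

def answer_with_rag(question: str) -> str:
    context = "\n\n".join(retrieve_relevant_docs(question))
    prompt = f"Use the following context to answer.\n\n{context}\n\nQuestion: {question}"
    # The Converse API sends the augmented prompt to an unchanged foundation
    # model; nothing about the model itself is retrained.
    response = bedrock_runtime.converse(
        modelId="anthropic.claude-3-haiku-20240307-v1:0",  # placeholder model ID
        messages=[{"role": "user", "content": [{"text": prompt}]}],
    )
    return response["output"]["message"]["content"][0]["text"]

print(answer_with_rag("What does term X mean in our domain?"))
```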
Referenced AWS AI/ML Documents and Study Guides:
AWS Bedrock Model Customization Documentation – Continued Pre-training
Amazon SageMaker JumpStart – Domain Adaptation Techniques
AWS Machine Learning Specialty Study Guide – Foundation Model Customization Section