The verified answer is B. Pre-training. The scenario describes the first major training stage for a large language model, where the model starts with randomly initialized weights and learns broad language patterns from a very large corpus of unlabeled text through self-supervised learning. AWS describes large language models as very large deep learning models that are pre-trained on vast amounts of data. In this stage, the model is not yet being customized for a narrow business task; instead, it learns general-purpose linguistic and semantic representations from large-scale text.
AWS machine learning guidance explains that pre-training teaches the model broad linguistic and semantic patterns, including grammar, context, world knowledge, and some reasoning ability, by predicting tokens under self-supervised objectives such as masked language modeling or causal (next-token) language modeling. That exactly matches the question’s description: random weight initialization followed by fitting the model to a large web dataset using a language-modeling objective. This is the core training process used to create a foundation model or LLM before later adaptation.
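The idea of "random initialization followed by a language-modeling objective" can be illustrated with a toy sketch. The example below is only an assumption-laden miniature, not how AWS or any production LLM is trained: it uses a hypothetical bigram table in place of a transformer, a one-sentence stand-in corpus in place of web-scale data, and illustrative names (`init_weights`, `next_token_loss`, `train_epoch`) that do not come from the source. It shows the causal objective: predict each next token from the previous one and lower the cross-entropy loss.

```python
import math
import random

def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def init_weights(vocab_size, seed=0):
    # Random initialization: the model starts with no knowledge of the text.
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 0.1) for _ in range(vocab_size)]
            for _ in range(vocab_size)]

def next_token_loss(weights, token_ids):
    # Causal language-modeling objective: mean cross-entropy of the
    # next token given the previous token (a bigram stand-in for an LLM).
    pairs = list(zip(token_ids[:-1], token_ids[1:]))
    total = 0.0
    for prev, nxt in pairs:
        probs = softmax(weights[prev])
        total += -math.log(probs[nxt])
    return total / len(pairs)

def train_epoch(weights, token_ids, lr=0.5):
    # One pass of stochastic gradient descent over all (prev, next) pairs.
    for prev, nxt in zip(token_ids[:-1], token_ids[1:]):
        probs = softmax(weights[prev])
        for j in range(len(probs)):
            # Gradient of cross-entropy w.r.t. logit j: p_j - 1[j == target].
            grad = probs[j] - (1.0 if j == nxt else 0.0)
            weights[prev][j] -= lr * grad

# Tiny stand-in corpus (an assumption; real pre-training uses vast text data).
corpus = "the model reads text and the model predicts the next token".split()
vocab = sorted(set(corpus))
token_to_id = {tok: i for i, tok in enumerate(vocab)}
token_ids = [token_to_id[t] for t in corpus]

weights = init_weights(len(vocab))
loss_before = next_token_loss(weights, token_ids)
for _ in range(50):
    train_epoch(weights, token_ids)
loss_after = next_token_loss(weights, token_ids)

print(f"loss before training: {loss_before:.3f}")
print(f"loss after training:  {loss_after:.3f}")
```

No labels are supplied anywhere: the training targets are simply the next tokens of the text itself, which is what makes the objective self-supervised.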
Option A. Fine-tuning is incorrect because AWS defines fine-tuning as training a pre-trained model on a new dataset rather than training from scratch. Fine-tuning adapts an existing model to a specific task or domain and usually requires less data and less training time than pre-training.
Option C. Model selection is incorrect because model selection refers to choosing an appropriate model or architecture, not training the model weights from random initialization.
Option D. Deployment is also incorrect because AWS describes deployment as the stage where a trained model is made available for inference or predictions, such as through SageMaker inference endpoints.
Therefore, because the company is training an LLM from randomly initialized weights on a large web corpus using a language-modeling objective, the stage is pre-training.