AWS Generative AI Developer Pro (AIP-C01) Exam Questions 2026

Contains 1200+ exam questions to pass the exam on the first attempt. SkillCertPro offers real exam questions for practice for all major IT certifications. For the full set of 1250 questions, go to https://skillcertpro.com/product/aws-generative-ai-developer-pro-aip-c01-exam-questions/

SkillCertPro offers detailed explanations for each question, which helps you understand the concepts better. It is recommended to score above 85% in SkillCertPro exams before attempting the real exam. SkillCertPro updates exam questions every 2 weeks. You will get lifetime access and lifetime free updates. SkillCertPro assures a 100% pass guarantee on the first attempt.

Below are 10 free sample questions.

Question 1: A company is building a multi-agent GenAI platform that uses Strands Agents and a second, non-AWS agent framework. The agents must call several internal REST APIs and databases as tools. These tool calls are short-lived and stateless, but traffic is highly spiky. The team wants a single implementation of each tool that can be reused across all agent frameworks by using the Model Context Protocol (MCP), and they want to minimize infrastructure management and keep access patterns consistent across agents.

Which solution will BEST meet these requirements with the LEAST operational overhead?

A. Have each agent framework call the internal REST APIs directly and use framework-specific SDKs to normalize responses, without introducing an MCP server or MCP client libraries.
B. Create Amazon Bedrock agents with action groups that call the internal REST APIs, and have other agent frameworks call these Bedrock agents when they need tool functionality.
C. Run an MCP server as a long-lived containerized service on an Amazon ECS cluster with Fargate. Configure each agent framework to call the MCP server over a private Application Load Balancer endpoint.
D. Implement an MCP server as an AWS Lambda function that wraps the internal REST APIs. Expose the MCP endpoint through Amazon API Gateway and have all agent frameworks call it by using MCP client libraries.

Answer: D

Explanation:
Recommended Architecture for MCP-Based Tool Access
The scenario involves lightweight, stateless tool calls, highly variable traffic, and a requirement for consistent, reusable access across multiple agent frameworks using the Model Context Protocol (MCP). A Lambda-based MCP server, which wraps the internal REST APIs and is exposed through API Gateway, best fits these requirements. This approach:
- Aligns with MCP guidance for stateless, lightweight tools
- Leverages AWS Lambda's automatic scaling to handle variable traffic
- Minimizes operational and infrastructure management overhead
- Enables consistent access patterns across different agent frameworks via MCP client libraries

Alternative approaches are less suitable:
- Running the MCP server on ECS: better suited for complex or long-running tools and introduces unnecessary operational overhead for this use case.
- Allowing each framework to call REST APIs directly: eliminates the benefits of MCP and requires duplicating integration logic across frameworks.
- Using Bedrock agents with action groups: couples tool access to a single foundation model service and does not provide a general MCP-based abstraction usable across all frameworks.
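To make the Lambda-plus-API-Gateway pattern concrete, the following Python sketch shows a Lambda handler that exposes internal REST APIs as tools behind a single endpoint. The tool names, internal URLs, and request shape are assumptions for illustration only; a real MCP server would implement the actual MCP message format, typically through an MCP server library rather than hand-rolled JSON.

import json
import urllib.request

# Hypothetical internal tool endpoints; in practice these would come from configuration.
TOOL_ENDPOINTS = {
    "get_order_status": "https://internal.example.com/orders/status",
    "lookup_customer": "https://internal.example.com/customers/lookup",
}

def lambda_handler(event, context):
    """Illustrative tool dispatcher invoked through Amazon API Gateway.

    Assumes the agent-side client sends a JSON body such as
    {"tool": "get_order_status", "arguments": {"order_id": "123"}}.
    """
    body = json.loads(event.get("body") or "{}")
    tool_name = body.get("tool")
    arguments = body.get("arguments", {})

    endpoint = TOOL_ENDPOINTS.get(tool_name)
    if endpoint is None:
        return {"statusCode": 400, "body": json.dumps({"error": f"unknown tool {tool_name}"})}

    # Forward the tool call to the internal REST API and return its response unchanged,
    # so every agent framework sees the same tool behavior.
    request = urllib.request.Request(
        endpoint,
        data=json.dumps(arguments).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        payload = response.read().decode("utf-8")

    return {"statusCode": 200, "body": payload}

Because the handler is stateless, Lambda's automatic scaling absorbs the spiky traffic without any cluster to manage.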
Question 2: A company is building a customer-support chat application on Amazon Bedrock. The application receives a mix of very simple FAQ-style questions and complex multi-step troubleshooting requests. Currently, all traffic is routed to a large, high-cost FM, which meets quality requirements but is driving up inference costs. The company has validated that a smaller, cheaper FM produces acceptable answers for most simple queries, but they want complex queries to continue using the larger FM. Client applications must keep calling a single HTTPS endpoint without needing to know which model is used for a given request.

Which solution will MOST cost-effectively coordinate the different FMs while requiring the LEAST change to client applications?

A. Configure provisioned throughput on the large FM and enable cross-Region inference for that model to handle all traffic with lower latency and higher throughput, keeping a single API Gateway integration.
B. Expose three separate Amazon API Gateway endpoints, each mapped to a different Bedrock FM. Update client applications to choose the appropriate endpoint based on custom heuristics in the client code.
C. Create a model-routing service behind a single API Gateway endpoint that uses Amazon Bedrock Intelligent Prompt Routing or a router agent (for example, with Strands Agents) to classify each request by complexity and invoke either a smaller or a larger FM accordingly.
D. Use AWS Step Functions to run a workflow that always invokes the small FM first. If the response confidence is low, invoke the large FM in a second state. Return both responses to the client so the client can decide which to display.

Answer: C

Explanation:
Recommended Approach: Centralized Model Coordination with Dynamic Routing
The optimal solution is to introduce a centralized model coordination layer that implements dynamic routing based on query characteristics. This can be achieved by using Amazon Bedrock Intelligent Prompt Routing or a router agent that classifies each incoming request as simple or complex, and then routes it to the most appropriate foundation model (FM):
- Smaller, lower-cost models for simple queries
- Larger, more capable models for complex queries
All routing is exposed through a single API Gateway endpoint, ensuring minimal or no changes are required on the client side.

Key benefits:
- Cost optimization by using less expensive models whenever possible
- Centralized routing logic, simplifying management and governance
- Minimal client impact, as clients continue calling a single endpoint

Why other approaches are less suitable:
- Multiple API endpoints with client-side routing: requires widespread client changes and makes routing logic harder to manage centrally.
- Step Functions chaining multiple models for every request: increases both cost and latency, and still fails to fully abstract routing decisions away from clients.
- Scaling a single large model (provisioned throughput and cross-Region inference): improves performance but does not reduce per-request costs and does not provide coordination across multiple foundation models.
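As an illustration of the router-service option, the following Python sketch classifies a request and calls either a smaller or a larger Bedrock model through the Converse API. The model IDs and the keyword-based classifier are placeholder assumptions; a production router would rely on Amazon Bedrock Intelligent Prompt Routing or an FM-based classifier instead of keyword matching.

import boto3

bedrock_runtime = boto3.client("bedrock-runtime")

# Placeholder model IDs; substitute the models validated for your workload.
SMALL_MODEL_ID = "amazon.titan-text-lite-v1"
LARGE_MODEL_ID = "anthropic.claude-3-5-sonnet-20240620-v1:0"

COMPLEX_HINTS = ("troubleshoot", "error", "steps", "not working", "configure")

def classify(question: str) -> str:
    """Naive complexity heuristic used only for illustration."""
    text = question.lower()
    return "complex" if any(hint in text for hint in COMPLEX_HINTS) else "simple"

def answer(question: str) -> str:
    model_id = LARGE_MODEL_ID if classify(question) == "complex" else SMALL_MODEL_ID
    response = bedrock_runtime.converse(
        modelId=model_id,
        messages=[{"role": "user", "content": [{"text": question}]}],
    )
    # Clients always receive a plain answer and never see which model was used.
    return response["output"]["message"]["content"][0]["text"]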
Question 3: A news aggregation company uses Amazon Bedrock to generate personalized article summaries. The team is testing three FM configurations (different model families and temperature settings) and wants to pick one before updating production. They have a curated dataset of 10,000 historical user prompts with reference summaries created by editors. The team needs to systematically compare response quality across all three configurations and also understand token usage and latency per configuration so they can choose the best price-performance option. They want to minimize custom evaluation code and be able to repeat the evaluation when new FM versions become available.

Which solution will BEST meet these requirements?

A. Import each FM configuration into Amazon SageMaker AI endpoints, enable SageMaker Model Monitor for data quality and model quality, and run a short synthetic traffic test against each endpoint. Use the Model Monitor reports to choose the configuration with the best combination of latency and accuracy.
B. Upload the curated prompt dataset and corresponding reference summaries to Amazon S3. Use Amazon Bedrock Model Evaluations to run a multi-model evaluation for the three FM configurations by using an LLM-as-a-judge. Use the evaluation reports to compare quality metrics, and run a controlled load test for each configuration while monitoring Amazon CloudWatch token and latency metrics to select the best price-performance configuration.
C. Deploy each FM configuration behind a separate Amazon API Gateway stage and send a small percentage of live production traffic to each configuration as a canary test. After two weeks, export API Gateway access logs to Amazon S3 and manually compare response samples, average latency, and token usage.
D. Export the prompt dataset to a local workstation and invoke each FM configuration directly from a custom Python script. Log all responses and timing information to a CSV file and ask subject matter experts to score a random subset of outputs by hand to decide which configuration performs best.

Answer: B

Explanation:
Recommended Approach: Amazon Bedrock Model Evaluations
The best approach is to use Amazon Bedrock Model Evaluations with a curated prompt and reference dataset stored in Amazon S3 to run a multi-model evaluation across all three foundation model (FM) configurations. This approach provides:
- Standardized quality metrics and side-by-side comparison reports
- Repeatable evaluations that can be re-run whenever models or parameters change
- Minimal custom code, accelerating experimentation and decision-making
To complete the analysis, the evaluation results can be combined with latency and token usage metrics collected from controlled test runs in Amazon CloudWatch. This delivers a comprehensive view of cost, performance, and quality trade-offs.

Why alternative approaches are less suitable:
- Routing live traffic via canary tests: focuses on incremental production rollout rather than controlled, pre-deployment multi-model evaluation, and requires significant manual analysis.
- Running local scripts with manual scoring: is not scalable, difficult to repeat, and prone to inconsistency.
- Re-deploying configurations as SageMaker endpoints with Model Monitor: adds unnecessary operational complexity and is designed for production drift and quality monitoring, not systematic comparison of model configurations prior to deployment.
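To complement the managed evaluation reports, a controlled test run can capture per-configuration token usage and latency directly from the Converse API response, which Bedrock also publishes to CloudWatch. The sketch below is illustrative; the model IDs, temperatures, and prompt list are assumptions.

import boto3

bedrock_runtime = boto3.client("bedrock-runtime")

# Hypothetical configurations under test (model ID plus temperature).
CONFIGURATIONS = [
    {"name": "config-a", "modelId": "amazon.titan-text-express-v1", "temperature": 0.2},
    {"name": "config-b", "modelId": "anthropic.claude-3-haiku-20240307-v1:0", "temperature": 0.2},
    {"name": "config-c", "modelId": "anthropic.claude-3-5-sonnet-20240620-v1:0", "temperature": 0.7},
]

def measure(prompts):
    """Invoke each configuration over the same prompts and aggregate tokens and latency."""
    results = []
    for config in CONFIGURATIONS:
        input_tokens = output_tokens = latency_ms = 0
        for prompt in prompts:
            response = bedrock_runtime.converse(
                modelId=config["modelId"],
                messages=[{"role": "user", "content": [{"text": prompt}]}],
                inferenceConfig={"temperature": config["temperature"]},
            )
            input_tokens += response["usage"]["inputTokens"]
            output_tokens += response["usage"]["outputTokens"]
            latency_ms += response["metrics"]["latencyMs"]
        results.append({
            "configuration": config["name"],
            "inputTokens": input_tokens,
            "outputTokens": output_tokens,
            "avgLatencyMs": latency_ms / max(len(prompts), 1),
        })
    return results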
Question 4: A company is building an internal RAG-based knowledge assistant by using Amazon Bedrock Knowledge Bases and Claude. Before rolling out widely, the GenAI team wants a repeatable evaluation process that:
– Scores responses for answer relevance, factual accuracy against retrieved context, logical coherence, and helpfulness.
– Detects problematic behaviors such as hallucinations and harmful or biased content.
– Can be run automatically on a curated prompt dataset without involving end users.
The team has only used accuracy and BLEU for previous ML projects and wants to move beyond these traditional metrics.

Which approach will BEST meet these requirements?

A. Deploy the assistant to a pilot group and configure Amazon CloudWatch to track latency, Time to First Token (TTFT), token usage, and error rates. Consider configurations with the lowest latency and error rates to be the highest quality.
B. Run an A/B test in production with two FM configurations using CloudWatch Evidently. Log click-through rates and time-on-page for each configuration and pick the configuration with the highest engagement.
C. Create a prompt dataset with user questions, reference answers, and reference contexts in Amazon S3. Use Amazon Bedrock Model Evaluations in retrieve-and-generate (RAG) mode to run automated evaluations that compute metrics such as correctness, completeness, faithfulness to context, answer and context relevance, logical coherence, and harmfulness. Periodically review a sample of model outputs with human evaluators to calibrate the metrics.
D. Generate synthetic Q&A pairs from the FM and compute ROUGE and BLEU scores between generated answers and the reference answers. Select the configuration with the highest average ROUGE and BLEU scores.

Answer: C

Explanation:
Recommended Evaluation Framework for GenAI Applications
Using Amazon Bedrock Model Evaluations with a prompt dataset that includes reference answers (and, for RAG use cases, reference contexts) enables the team to compute GenAI-specific evaluation metrics, including:
- Correctness and completeness
- Answer relevance and context relevance
- Faithfulness to retrieved source text
- Logical coherence
- Harmfulness and safety indicators
When these automated evaluations are combined with periodic human review, they form a comprehensive assessment framework that goes well beyond traditional machine learning metrics such as accuracy, ROUGE, or BLEU.

Why other approaches fall short:
- Methods based solely on n-gram overlap metrics
- Approaches focused only on operational metrics (latency, throughput, cost)
- Evaluations based on high-level engagement signals
These alternatives do not explicitly measure critical GenAI dimensions such as grounding, hallucination rates, and safety, and therefore do not satisfy the stated evaluation requirements.
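A prerequisite for either RAG-mode or LLM-as-a-judge evaluation is a curated dataset in S3 that pairs each question with a reference answer and, for RAG, the reference context. The Python sketch below only illustrates assembling such records as JSON Lines and uploading them; the field names are illustrative placeholders, not the exact schema Bedrock Model Evaluations expects, which should be taken from the current AWS documentation.

import json
import boto3

s3 = boto3.client("s3")

# Illustrative records; field names are placeholders, not the official evaluation schema.
records = [
    {
        "question": "How many vacation days do new employees receive?",
        "reference_answer": "New employees receive 20 vacation days per year.",
        "reference_context": "Section 4.2 of the HR policy grants 20 vacation days annually.",
    },
    {
        "question": "What is the approval limit for travel expenses?",
        "reference_answer": "Travel expenses above 1,000 USD require manager approval.",
        "reference_context": "Travel policy: expenses over 1,000 USD need manager sign-off.",
    },
]

# Write one JSON object per line (JSON Lines) and upload to the evaluation bucket.
body = "\n".join(json.dumps(record) for record in records)
s3.put_object(
    Bucket="example-genai-evaluation-bucket",  # hypothetical bucket name
    Key="datasets/rag-eval/prompts.jsonl",
    Body=body.encode("utf-8"),
)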
Question 5: A company exposes a generative text-summarization service through Amazon API Gateway. The API invokes an AWS Lambda function that calls an Amazon Bedrock model by using the AWS SDK. Recently, intermittent Bedrock failures and timeouts have caused Lambda invocations to pile up, leading to cascading timeouts in upstream clients. The company wants to safeguard the workflow so that when the FM starts failing repeatedly, the system quickly stops sending new requests to the model for a cooldown period, returns a standardized fallback response, and automatically resumes normal behavior after the model becomes healthy again. The company wants to implement this control logic in AWS without requiring changes to client applications.

Which solution will BEST meet these requirements?

A. Place an AWS Step Functions state machine between API Gateway and the Lambda function. Use a Task state that calls a Lambda function to check a circuit-breaker flag in Amazon DynamoDB, conditionally invoke the Bedrock-calling Lambda, and update failure counters. Use Choice and Wait states to open the circuit after repeated failures, return a fallback response without calling the model while the circuit is open, and periodically probe the model to close the circuit when it recovers.
B. Keep the existing API Gateway to Lambda to Bedrock flow. Increase the Lambda function timeout and memory, enable maximum automatic retries in the AWS SDK, and configure an Amazon CloudWatch alarm to notify operators when error rates increase so they can manually disable the API if needed.
C. Enable Amazon Bedrock cross-Region inference for the model so traffic is automatically distributed across Regions, and rely on the SDK's default retry behavior to mitigate transient errors.
D. Insert an Amazon SQS queue between API Gateway and the Lambda function so that requests are buffered. Configure the Lambda consumer with a dead-letter queue (DLQ) and use exponential backoff retries when calling Bedrock.

Answer: A

Explanation:
Recommended Approach: Circuit Breaker for Safeguarded AI Workflows
Implementing a circuit breaker using AWS Step Functions, Lambda, and DynamoDB provides a robust mechanism to protect AI workflows:
- The state machine checks the breaker state before invoking the Bedrock model.
- The breaker opens after repeated failures, preventing further requests from hitting the failing model.
- During a cooldown period, a standardized fallback response is returned.
- The system periodically probes the model and closes the breaker once the model is healthy, restoring normal operation.

Why other approaches are insufficient:
- Increasing timeouts and retries or adding SQS buffering does not prevent new requests from reaching a failing foundation model.
- Enabling cross-Region inference improves availability but does not provide automatic, application-level control over when the model is invoked.
This architecture ensures controlled, resilient access to foundation models and reduces operational risk during failures.
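The following Python sketch illustrates the breaker-state check and failure counting that the Task-state Lambda functions would perform against DynamoDB. The table and attribute names, failure threshold, and cooldown period are illustrative assumptions; in the Step Functions design, Choice and Wait states would wrap logic like this.

import time
import boto3

dynamodb = boto3.resource("dynamodb")
breaker_table = dynamodb.Table("bedrock-circuit-breaker")  # hypothetical table name

FAILURE_THRESHOLD = 5    # consecutive failures before the circuit opens
COOLDOWN_SECONDS = 120   # how long the circuit stays open before a probe

def get_breaker_state(service_name: str) -> dict:
    """Read the breaker record for a downstream dependency (for example, a Bedrock model)."""
    item = breaker_table.get_item(Key={"service": service_name}).get("Item")
    return item or {"service": service_name, "state": "CLOSED", "failures": 0, "opened_at": 0}

def allow_request(state: dict) -> bool:
    """Allow calls when the circuit is closed, or when the cooldown has elapsed (half-open probe)."""
    if state["state"] == "CLOSED":
        return True
    return time.time() - float(state["opened_at"]) >= COOLDOWN_SECONDS

def record_result(service_name: str, succeeded: bool) -> None:
    """Update failure counters; open the circuit when failures cross the threshold."""
    state = get_breaker_state(service_name)
    if succeeded:
        state.update({"state": "CLOSED", "failures": 0, "opened_at": 0})
    else:
        state["failures"] = int(state["failures"]) + 1
        if state["failures"] >= FAILURE_THRESHOLD:
            state.update({"state": "OPEN", "opened_at": int(time.time())})
    breaker_table.put_item(Item=state)

When allow_request returns False, the workflow skips the Bedrock call and returns the standardized fallback response instead.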
Question 6: A company has deployed a customer-facing chat application that uses an FM through Amazon Bedrock. The GenAI team wants to continuously improve prompts and model selection based on real user experience. Product managers want production users to be able to rate each answer (for example, 1-5 stars) and optionally leave comments. The ratings must not slow down the chat response, and the team needs a structured, queryable dataset that they can analyze by model version and prompt template.

Which solution BEST supports continuous, user-centered evaluation of FM responses with minimal additional operational overhead?

A. Add a "Rate this answer" button in the UI that calls an Amazon API Gateway endpoint. Use an AWS Lambda function to validate the payload and write feedback records to an Amazon DynamoDB table that stores user ID, model ID, prompt template ID, rating, comment, and timestamp. Build internal dashboards and analyses directly from the DynamoDB table.
B. Configure Amazon Augmented AI (Amazon A2I) to send a sample of all model responses to a private labeling workforce for review. Use the reviewers' labels as the primary signal to refine prompts and choose models.
C. Modify the chat UI to append user feedback directly into the next prompt as a free-text comment so that the FM can self-correct. Use Amazon CloudWatch metric streams to export token usage and latency metrics for later analysis.
D. Rely on Amazon Bedrock model invocation logs and Amazon CloudWatch Logs Insights to infer user satisfaction from prompt and response patterns, such as response length and error rates, without collecting explicit ratings from users.

Answer: A

Explanation:
Recommended Approach: Dedicated User Feedback Interface
The best solution introduces a dedicated feedback interface backed by API Gateway, Lambda, and DynamoDB, enabling:
- Direct capture of user ratings and comments without affecting chat latency
- Storage of feedback in a structured, queryable format, keyed by model and prompt identifiers
- Support for continuous, user-centered evaluation of foundation model performance

Why alternative approaches are less suitable:
- Relying solely on logs and operational metrics does not capture explicit user judgments.
- Using Amazon A2I focuses on expert labeling workflows rather than end-user feedback, adding unnecessary operational complexity.
This architecture ensures that real user feedback drives model evaluation and improvement in a scalable and low-latency manner.
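A minimal sketch of such a feedback Lambda is shown below. The table name, payload fields, and validation rules are assumptions chosen to match the fields listed in option A; a real implementation would add authentication and input sanitization.

import json
import time
import boto3

dynamodb = boto3.resource("dynamodb")
feedback_table = dynamodb.Table("chat-response-feedback")  # hypothetical table name

def lambda_handler(event, context):
    """Validate a rating payload posted through API Gateway and store it in DynamoDB."""
    body = json.loads(event.get("body") or "{}")

    rating = body.get("rating")
    if not isinstance(rating, int) or not 1 <= rating <= 5:
        return {"statusCode": 400,
                "body": json.dumps({"error": "rating must be an integer from 1 to 5"})}

    item = {
        "userId": body.get("userId", "anonymous"),
        "timestamp": int(time.time() * 1000),  # sort key: milliseconds since epoch
        "modelId": body.get("modelId", "unknown"),
        "promptTemplateId": body.get("promptTemplateId", "unknown"),
        "rating": rating,
        "comment": body.get("comment", ""),
    }
    feedback_table.put_item(Item=item)

    # Returning quickly keeps the feedback path off the chat response's critical path.
    return {"statusCode": 200, "body": json.dumps({"stored": True})}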
Question 7: A financial services company is building an internal virtual assistant by using Amazon Bedrock. The assistant must hold natural multi-turn conversations with employees, answer free-form questions about internal policies, and help automate routine tasks such as drafting emails and walking users through approval workflows. The assistant will operate only with text in English and does not need to generate images or video. The company wants to choose a single Bedrock FM that is explicitly optimized for conversational interactions, question answering, and workflow automation without requiring additional model training.

Which FM should the company use as the primary model for this assistant?

A. Jurassic-2 (AI21 Labs) multilingual LLM on Amazon Bedrock
B. Stable Diffusion (Stability.ai) image model on Amazon Bedrock
C. Amazon Titan text model on Amazon Bedrock
D. Claude (Anthropic) text model on Amazon Bedrock

Answer: D

Explanation:
Recommended Foundation Model for an Internal Text-Only Assistant
For a text-only internal assistant that must support:
- Natural multi-turn conversations
- Question answering (Q&A)
- Workflow automation
the most appropriate choice is Claude on Amazon Bedrock, as it is specifically designed for conversational LLM use cases, Q&A, and workflow automation.

Why alternative models are less suitable:
- Amazon Titan: focuses on text generation, summarization, Q&A, and embeddings, but is not explicitly optimized for conversational assistants or workflow automation.
- Jurassic-2: primarily centered on multilingual text generation, which does not match the English-only requirement or the emphasis on assistant-style interactions.
- Stable Diffusion: an image-generation model, and therefore does not support the required text-based conversational or workflow capabilities.
This makes Claude on Bedrock the ideal choice for the described internal assistant scenario.

Question 8: A company has built an internal "ask the docs" assistant by using Amazon Bedrock Knowledge Bases. The knowledge base indexes documents stored in Amazon S3, uses an Amazon Titan embedding model, and stores vectors in Amazon OpenSearch Service. The documents include policies from multiple departments (HR, Finance, IT). Each S3 object already has custom metadata keys such as 'department', 'last_updated', and 'data_classification'. Users report that:
– Queries like "latest travel policy" sometimes return outdated policies.
– Results often mix documents from multiple departments even when the question clearly relates to a single department.
The development team wants to improve search precision and context awareness by leveraging document metadata. They want to avoid changing application code that calls the knowledge base and want to keep metadata out of the embedded text to reduce token usage.

Which solution BEST meets these requirements with the LEAST operational overhead?

A. Configure the Amazon Bedrock Knowledge Base ingestion to read S3 object metadata and map fields such as 'department' and 'last_updated' as metadata columns by using a 'metadata.json' configuration file. Use these metadata fields for filtering and relevance scoring during retrieval.
B. Add a second Amazon OpenSearch Service index dedicated to metadata fields, and implement a two-step retrieval where the application first queries the metadata index for department and date filters, then queries the existing vector index only for the document IDs returned by the first query.
C. Modify the preprocessing Lambda function to prepend the department, last_updated timestamp, and data_classification as human-readable text at the top of each document before chunking and embedding, so that the model can infer relevance from this information.
D. Create a new Amazon DynamoDB table to store metadata keyed by document ID. After the knowledge base returns candidate chunks, have the application call DynamoDB to look up metadata for each result and discard results that do not match the user's department or recency requirements.

Answer: A

Explanation:
Recommended Approach: Using Structured Metadata in Bedrock Knowledge Bases
The most efficient solution is to leverage structured metadata that already exists on the S3 objects for the knowledge base and vector store: configure ingestion to map S3 object metadata into metadata columns via a metadata.json configuration. This enables queries to filter and rank by fields such as department or last_updated without modifying application code or embedding metadata into the text.

Key advantages over alternative approaches:
- Prepending metadata into text: relies on the model to infer structure, consumes more tokens, and limits precise metadata-based filtering.
- Adding DynamoDB or OpenSearch for metadata: adds extra services and synchronization logic, requires custom retrieval flows, and increases operational overhead unnecessarily.
Using built-in metadata support in Bedrock Knowledge Bases and the underlying vector store provides efficient, scalable, and precise query capabilities with minimal operational complexity.
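For context, Bedrock Knowledge Bases pick up per-document metadata from a sidecar file stored alongside each source object (for example, travel-policy.pdf.metadata.json), and retrieval calls can then filter on those attributes. The Python sketch below is a minimal illustration; the bucket, knowledge base ID, attribute values, and filter are assumptions, and the exact sidecar schema should be confirmed against the current AWS documentation.

import json
import boto3

s3 = boto3.client("s3")
agent_runtime = boto3.client("bedrock-agent-runtime")

# Illustrative metadata sidecar for s3://example-policy-bucket/hr/travel-policy.pdf
sidecar = {
    "metadataAttributes": {
        "department": "HR",
        "last_updated": "2025-06-01",
        "data_classification": "internal",
    }
}
s3.put_object(
    Bucket="example-policy-bucket",  # hypothetical bucket
    Key="hr/travel-policy.pdf.metadata.json",
    Body=json.dumps(sidecar).encode("utf-8"),
)

# After ingestion, retrieval can filter on metadata without changing the embedded text.
response = agent_runtime.retrieve(
    knowledgeBaseId="EXAMPLEKBID",  # hypothetical knowledge base ID
    retrievalQuery={"text": "What is the latest travel policy?"},
    retrievalConfiguration={
        "vectorSearchConfiguration": {
            "numberOfResults": 5,
            "filter": {"equals": {"key": "department", "value": "HR"}},
        }
    },
)
for result in response["retrievalResults"]:
    print(result["content"]["text"][:120])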
Question 9: A data engineering team generates embeddings for 20 million archived chat transcripts each night by using an Amazon Titan embeddings model on Amazon Bedrock. An AWS Glue job reads the transcripts from Amazon S3 and, for each record, calls the bedrock-runtime InvokeModel API synchronously. The job increasingly fails to finish within the 6-hour maintenance window. Amazon CloudWatch metrics show frequent throttling and that most of the runtime is spent waiting on many small Bedrock invocations. The embeddings are only needed by the next morning, and the team wants to avoid paying for provisioned throughput that would sit idle during the day.

Which solution will MOST effectively increase FM throughput while minimizing operational overhead for this nightly job?

A. Increase the size and number of workers in the AWS Glue job so that more transcripts are processed in parallel while continuing to call InvokeModel once per transcript.
B. Refactor the Glue job to write all transcript prompts as JSONL files to Amazon S3 and trigger Amazon Bedrock Batch Inference jobs. Read the resulting embeddings from S3 after the batch jobs complete.
C. Enable Amazon Bedrock cross-Region inference with a global inference profile so that the Glue job can send InvokeModel requests to multiple Regions for the same Titan embeddings model.
D. Wrap the existing InvokeModel calls in an AWS Step Functions state machine that uses a Map state to fan out to many parallel AWS Lambda functions, each calling the same Titan embeddings model.

Answer: B

Explanation:
Recommended Approach: Increasing Throughput for a Nightly Embeddings Pipeline
The most effective way to increase throughput for a nightly, non-interactive embeddings pipeline is to adopt a batch processing pattern: write prompts to S3 and use Amazon Bedrock Batch Inference. This allows Bedrock to process large batches efficiently, reducing per-request overhead and maximizing token processing throughput with minimal operational effort.

Why alternative approaches are less effective:
- Adding more Glue workers or fanning out Lambda invocations: still relies on individual InvokeModel calls, remains constrained by Bedrock invocation quotas, and does not efficiently handle bulk processing.
- Cross-Region inference: improves availability and latency but adds unnecessary complexity for offline batch workloads.
Adopting a batch-based workflow ensures high throughput, scalability, and efficient use of Bedrock resources with minimal operational overhead.
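A batch inference run is submitted as a model invocation job that reads JSONL records from S3 and writes results back to S3. The sketch below shows the general shape of the call; the job name, role ARN, model ID, and S3 URIs are illustrative assumptions, and the exact per-record input format for the chosen embeddings model should be taken from the AWS documentation.

import boto3

bedrock = boto3.client("bedrock")

# Submit a batch job over JSONL prompt files previously written to S3 by the Glue job.
response = bedrock.create_model_invocation_job(
    jobName="nightly-transcript-embeddings",                      # hypothetical job name
    roleArn="arn:aws:iam::123456789012:role/BedrockBatchRole",    # hypothetical role
    modelId="amazon.titan-embed-text-v2:0",
    inputDataConfig={
        "s3InputDataConfig": {"s3Uri": "s3://example-transcripts/batch-input/"}
    },
    outputDataConfig={
        "s3OutputDataConfig": {"s3Uri": "s3://example-transcripts/batch-output/"}
    },
)
job_arn = response["jobArn"]

# Poll the job status; the embeddings only need to be ready by the next morning.
status = bedrock.get_model_invocation_job(jobIdentifier=job_arn)["status"]
print(job_arn, status)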
Question 10: A software company is building a retrieval-augmented question-answering application for its customer support portal. The application will use Amazon Bedrock FMs to answer questions over tens of millions of PDFs and HTML articles stored in Amazon S3. The team wants semantic search with metadata-based filtering and prefers a fully managed vector store that automatically handles chunking, embedding generation, and index management without requiring the team to operate search clusters.

Which solution will BEST meet these requirements with the LEAST operational overhead?

A. Provision an Amazon OpenSearch Service domain with vector search enabled. Use AWS Lambda to read documents from S3, chunk the content, call an Amazon Titan embeddings model via Amazon Bedrock, and index the embeddings and metadata into a custom OpenSearch index.
B. Use Amazon Aurora PostgreSQL with the pgvector extension to store document text, metadata, and embeddings. Run a scheduled AWS Lambda function to generate embeddings with Amazon Bedrock and load them into Aurora for similarity search with SQL queries.
C. Store documents and embeddings in an Amazon DynamoDB table, with an attribute that holds the embedding vector. Use PartiQL queries with filter expressions over the embedding attribute to approximate similarity search and apply metadata filters.
D. Create an Amazon Bedrock Knowledge Base that uses the S3 bucket as a data source, configure an Amazon Titan embeddings model, and use the default serverless Amazon OpenSearch Service vector store with standard or semantic chunking and metadata fields for filtering.

Answer: D

Explanation:
Recommended Approach: Managed RAG with Bedrock Knowledge Bases
The most effective design is to use Amazon Bedrock Knowledge Bases with:
- An S3 data source
- The default serverless OpenSearch Service vector store
This architecture provides a fully managed Retrieval-Augmented Generation (RAG) layer by:
- Automating document chunking and embedding generation using a selected model (for example, Amazon Titan)
- Handling vector storage and retrieval
- Supporting metadata fields and hybrid search
- Eliminating the need for the team to operate or maintain search infrastructure

Why alternative approaches are less suitable:
- Custom OpenSearch or Aurora pgvector pipelines: functionally feasible but require significantly more operational work for design, scaling, and maintenance.
- DynamoDB as a vector store: does not support native vector similarity search and cannot scale efficiently for semantic retrieval over tens of millions of documents.
This solution delivers managed, scalable, and feature-rich RAG capabilities with minimal operational overhead.
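Once a knowledge base like this is in place, the application can answer portal questions with a single managed call that retrieves relevant chunks and generates a grounded answer. The sketch below is illustrative; the knowledge base ID and model ARN are placeholder assumptions.

import boto3

agent_runtime = boto3.client("bedrock-agent-runtime")

response = agent_runtime.retrieve_and_generate(
    input={"text": "How do I reset my portal password?"},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": "EXAMPLEKBID",  # hypothetical knowledge base ID
            "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-haiku-20240307-v1:0",
        },
    },
)

# The generated answer is grounded in the retrieved chunks; citations point back to S3 sources.
print(response["output"]["text"])
for citation in response.get("citations", []):
    for reference in citation.get("retrievedReferences", []):
        print(reference["location"]["s3Location"]["uri"])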