Professional Machine Learning Engineer
The GCP Professional Machine Learning Engineer certification exam validates expertise in designing, building, and productionizing ML solutions on Google Cloud, covering low-code development, cross-team collaboration, scaling, serving, and pipeline automation.
Who Should Take This
It is intended for data scientists, ML engineers, and cloud architects with three or more years of industry experience, including hands-on work developing models, engineering data pipelines, and implementing MLOps on Google Cloud. The exam demonstrates mastery of end-to-end ML system design, scaling, and operationalization.
What's Covered
1. Designing ML solutions using AutoML, BigQuery ML, and pre-built AI APIs for common use cases without custom model development.
2. Managing data and model governance, version control, and collaboration workflows using Vertex AI Feature Store, Model Registry, and ML metadata tracking.
3. Training custom models using Vertex AI Custom Training with TensorFlow, PyTorch, or JAX; implementing hyperparameter tuning, distributed training, and experiment tracking.
4. Deploying models to Vertex AI Prediction endpoints; implementing online and batch prediction; optimizing serving infrastructure with autoscaling, GPUs, and TPUs.
5. Building ML pipelines with Vertex AI Pipelines and Kubeflow; implementing CI/CD for ML; automating model retraining, evaluation, and deployment workflows.
6. Implementing model monitoring for data drift, prediction drift, and feature attribution; configuring alerts and automated retraining triggers for production ML systems.
Exam Structure
Question Types
- Multiple Choice
- Multiple Select
Scoring Method
Pass/fail. Google does not publish a scaled score or passing percentage.
Delivery Method
Kryterion testing center or online proctored
Prerequisites
None required. Associate Cloud Engineer recommended.
Recertification
2 years
What's Included in AccelaStudy® AI
Course Outline
73 learning goals
Domain 1: Architecting Low-Code ML Solutions
3 topics
Develop ML models using BigQuery ML
- Implement BigQuery ML models for regression and classification tasks using CREATE MODEL statements with linear regression, logistic regression, and XGBoost model types, specifying hyperparameters, training options, and data splits directly in SQL.
- Analyze BigQuery ML model type selection tradeoffs among k-means clustering, matrix factorization, ARIMA_PLUS time series, and DNN architectures by evaluating data characteristics, interpretability needs, and prediction requirements for unsupervised and forecasting tasks.
- Analyze BigQuery ML model performance using ML.EVALUATE, ML.CONFUSION_MATRIX, ML.ROC_CURVE, and ML.FEATURE_INFO functions to assess accuracy, identify feature importance, and determine model readiness for production deployment.
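The CREATE MODEL and ML.EVALUATE workflow above can be sketched by composing the SQL as strings. This is an illustrative sketch only; the dataset, table, and model names (`my_dataset.churn_model`, `my_dataset.customers`, the `churned` label) are hypothetical.

```python
# Illustrative sketch: composing BigQuery ML DDL as SQL strings.
# All dataset/table/model names below are hypothetical.

def create_model_sql(model_name, model_type, label, source_table, options=None):
    """Build a BigQuery ML CREATE MODEL statement for a supervised task."""
    opts = {"model_type": f"'{model_type}'", "input_label_cols": f"['{label}']"}
    for key, value in (options or {}).items():
        opts[key] = repr(value) if isinstance(value, str) else str(value)
    option_list = ",\n  ".join(f"{k} = {v}" for k, v in opts.items())
    return (
        f"CREATE OR REPLACE MODEL `{model_name}`\n"
        f"OPTIONS (\n  {option_list}\n)\n"
        f"AS SELECT * FROM `{source_table}`"
    )

sql = create_model_sql(
    "my_dataset.churn_model", "logistic_reg", "churned",
    "my_dataset.customers",
    options={"data_split_method": "AUTO_SPLIT", "max_iterations": 20},
)
print(sql)

# Evaluation is then a plain SELECT over the trained model:
eval_sql = "SELECT * FROM ML.EVALUATE(MODEL `my_dataset.churn_model`)"
```

Submitted through the BigQuery console or a client library, statements of this shape train and evaluate a model entirely in SQL.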
Use pre-built ML APIs
- Implement Vision AI and Document AI solutions for image classification, object detection, OCR, and document parsing by selecting appropriate pre-trained models and configuring API requests with confidence thresholds.
- Analyze Natural Language AI, Translation AI, Speech-to-Text, Text-to-Speech, and Video AI capabilities by evaluating API throughput limits, streaming versus batch tradeoffs, language coverage, and model version quality to select optimal configurations.
- Analyze pre-built API suitability versus custom model training by evaluating accuracy requirements, data specificity, latency constraints, and cost tradeoffs to determine when custom models are justified over pre-built APIs.
Use AutoML and Vertex AI for low-code ML
- Implement AutoML training workflows on Vertex AI for tabular, image, text, and video data types by configuring dataset imports, training budgets, optimization objectives, and model export formats.
- Analyze AutoML model evaluation results including precision-recall curves, confusion matrices, and feature attributions to identify model limitations and determine whether AutoML performance meets production requirements.
- Design decision frameworks for selecting among BigQuery ML, pre-built APIs, AutoML, and custom training approaches based on data volume, model complexity, team expertise, latency requirements, and total cost of ownership.
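A decision framework like the one above can be reduced to a small rule function. This is a deliberately coarse sketch, not an official Google decision tree; the input traits and the 1,000-row threshold are illustrative assumptions.

```python
# Hypothetical decision helper mirroring the selection criteria above.
# Thresholds and input traits are illustrative, not official guidance.
def choose_ml_approach(labeled_rows, task_is_generic,
                       needs_custom_architecture, team_writes_sql_only):
    """Return a suggested GCP ML approach given coarse project traits."""
    if task_is_generic:
        return "pre-built API"      # e.g. Vision AI, Translation AI
    if needs_custom_architecture:
        return "custom training"    # Vertex AI Custom Training
    if team_writes_sql_only and labeled_rows > 0:
        return "BigQuery ML"        # SQL-first modeling on warehouse data
    if labeled_rows >= 1000:
        return "AutoML"             # Vertex AI AutoML
    return "collect more data"

print(choose_ml_approach(50_000, False, False, True))  # BigQuery ML
```

In practice each branch also weighs latency, cost, and team expertise, but the ordering (generic task first, custom architecture second) captures the usual tradeoff.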
Domain 2: Collaborating Within and Across Teams to Manage Data and Models
3 topics
Explore and preprocess data
- Implement data exploration and analysis workflows using BigQuery SQL queries, statistical profiling, and schema validation to understand data distributions, identify quality issues, and assess feature relevance for ML tasks.
- Implement data preprocessing pipelines using Dataflow (Apache Beam) and Dataprep for large-scale transformations including missing value imputation, normalization, encoding categorical variables, and handling imbalanced datasets.
- Implement feature engineering strategies including feature crosses, embedding lookups, temporal aggregations, and text tokenization to transform raw data into ML-ready feature representations using Dataflow and BigQuery.
- Implement data validation using TensorFlow Data Validation (TFDV) to generate statistics, detect schema anomalies, identify training-serving skew, and establish data quality gates in ML pipelines.
- Design data preprocessing strategies that balance batch and streaming approaches, optimize feature engineering impact on model performance, and establish data split governance across training, validation, and test sets for production ML systems.
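The core transformations named above (imputation, normalization, categorical encoding) can be sketched in plain Python; in a real pipeline the same logic would run as Dataflow transforms or BigQuery SQL over full tables.

```python
import math

def impute_mean(values):
    """Replace missing (None) values with the mean of observed values."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]

def zscore(values):
    """Standardize a numeric column to zero mean and unit variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(var) or 1.0  # guard against constant columns
    return [(v - mean) / std for v in values]

def one_hot(categories):
    """Encode categorical values as one-hot vectors over a sorted vocabulary."""
    vocab = sorted(set(categories))
    return [[1 if c == v else 0 for v in vocab] for c in categories]

ages = impute_mean([25, None, 35])       # missing value becomes 30.0
encoded = one_hot(["cat", "dog", "cat"])
```

The key production concern is that these statistics (means, vocabularies) must be computed on training data only and reused at serving time, or training-serving skew results.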
Manage datasets and models
- Implement Vertex AI Feature Store for centralized feature management including feature group creation, online and offline serving configurations, point-in-time lookups, and feature sharing across ML projects.
- Implement Vertex AI Model Registry for model versioning, metadata tracking, model aliases, and lifecycle stage management to maintain organized model inventories across development, staging, and production environments.
- Analyze experiment tracking results using Vertex AI Experiments to compare metrics, parameters, and artifacts across training runs, evaluate statistical significance of performance differences, and select optimal model configurations for promotion.
- Design model governance strategies that integrate Feature Store, Model Registry, and experiment tracking to ensure reproducibility, traceability, and compliance across cross-functional ML teams and projects.
Build and maintain ML pipelines for data and model management
- Implement Vertex AI Pipelines using the Kubeflow Pipelines SDK to define pipeline components, configure input/output artifacts, and orchestrate multi-step ML workflows with dependency management.
- Implement Cloud Composer (Apache Airflow) orchestration for complex ML workflows including DAG authoring, sensor-based triggers, cross-service task operators, and retry policies for end-to-end data pipeline management.
- Design pipeline orchestration strategies that evaluate Vertex AI Pipelines, Kubeflow Pipelines on GKE, and Cloud Composer capabilities to establish the optimal orchestration approach aligned with workflow complexity, team expertise, and long-term operational requirements.
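The dependency management every orchestrator above provides can be sketched as a minimal runner that executes steps in topological order. Step names are made up, and real orchestrators add retries, caching, and cycle detection on top of this core idea.

```python
# Minimal orchestration sketch: run steps only after their upstream
# dependencies, as Vertex AI Pipelines or Cloud Composer would.
def run_pipeline(steps, deps):
    """steps: name -> callable; deps: name -> list of upstream step names."""
    done, order = set(), []

    def run(name):
        if name in done:
            return
        for upstream in deps.get(name, []):  # resolve dependencies first
            run(upstream)
        steps[name]()
        done.add(name)
        order.append(name)

    for name in steps:
        run(name)
    return order

log = []
steps = {
    "deploy": lambda: log.append("deploy"),
    "ingest": lambda: log.append("ingest"),
    "train": lambda: log.append("train"),
}
deps = {"train": ["ingest"], "deploy": ["train"]}
result = run_pipeline(steps, deps)  # ingest before train before deploy
```

Choosing an orchestrator is largely about who owns this DAG logic: Vertex AI Pipelines keeps it serverless and artifact-aware, Composer gives general-purpose scheduling.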
Domain 3: Scaling Prototypes into ML Models
3 topics
Build ML models on Vertex AI
- Implement custom training jobs on Vertex AI using TensorFlow, PyTorch, and JAX with pre-built containers, specifying machine types, accelerator configurations, and training scripts for scalable model development.
- Implement custom container training on Vertex AI by building Docker images with framework dependencies, configuring Artifact Registry for container storage, and defining custom training specifications for specialized environments.
- Implement distributed training strategies using Vertex AI with data parallelism (MirroredStrategy, MultiWorkerMirroredStrategy), model parallelism, and parameter server configurations for training large-scale models across multiple workers and accelerators.
- Analyze distributed training architecture tradeoffs between data parallelism, model parallelism, and pipeline parallelism strategies to optimize training throughput, convergence speed, and resource utilization for large model development.
- Design ML model architecture selection frameworks that evaluate framework suitability (TensorFlow, PyTorch, JAX), training paradigm, model complexity, and production serving requirements to guide prototype-to-production transitions.
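The synchronous data parallelism described above (MirroredStrategy-style) can be sketched numerically: each worker computes gradients on its own shard, the gradients are averaged (the all-reduce step), and every worker applies the same update. The toy model and data here are assumptions for illustration.

```python
# Sketch of synchronous data parallelism: per-shard gradients are averaged
# (all-reduce) before a single shared parameter update. Toy model: y = w * x.
def local_gradient(w, shard):
    """Gradient of mean squared error on one worker's data shard."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)

def data_parallel_step(w, shards, lr=0.01):
    grads = [local_gradient(w, s) for s in shards]  # per-worker compute
    avg = sum(grads) / len(grads)                   # all-reduce (average)
    return w - lr * avg                             # identical update everywhere

# Two "workers", each holding a shard of data generated from y = 2x.
shards = [[(1.0, 2.0), (2.0, 4.0)], [(3.0, 6.0), (4.0, 8.0)]]
w = 0.0
for _ in range(200):
    w = data_parallel_step(w, shards)
# w converges toward the true slope, 2.0
```

Model and pipeline parallelism instead split the network or its layers across devices, which changes what is communicated (activations rather than gradients).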
Train and tune ML models
- Implement hyperparameter tuning using Vertex AI Vizier with search algorithms (grid, random, Bayesian optimization), defining parameter search spaces, optimization metrics, and early stopping conditions for efficient tuning.
- Implement GPU and TPU training configurations on Vertex AI by selecting appropriate accelerator types, configuring mixed-precision training, managing TPU pod slices, and optimizing data input pipelines for accelerator utilization.
- Analyze transfer learning approach suitability by evaluating pre-trained models from TensorFlow Hub and Model Garden, comparing fine-tuning strategies with frozen layers versus full retraining, and assessing domain adaptation effectiveness for target tasks.
- Analyze training performance bottlenecks by evaluating GPU/TPU utilization, data pipeline throughput, convergence patterns, and memory constraints to identify and resolve training efficiency issues.
- Design training infrastructure strategies that balance accelerator selection (GPU vs TPU), spot/preemptible instance usage, training budget constraints, and time-to-completion requirements for cost-effective model development.
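Random search with an early-stopping rule, one of the strategies Vizier offers, can be sketched as follows. The objective function here is a stand-in for a real validation metric, and the search bounds are arbitrary.

```python
import random

# Sketch of random-search hyperparameter tuning with a simple patience-based
# early stop, in the spirit of Vertex AI Vizier. The objective is a stand-in.
def objective(lr, batch_size):
    """Pretend validation loss, minimized near lr=0.1, batch_size=64."""
    return (lr - 0.1) ** 2 + ((batch_size - 64) / 64) ** 2

def random_search(trials=50, patience=10, seed=0):
    rng = random.Random(seed)
    best, best_params, stale = float("inf"), None, 0
    for _ in range(trials):
        params = {"lr": rng.uniform(0.001, 1.0),
                  "batch_size": rng.choice([16, 32, 64, 128])}
        loss = objective(**params)
        if loss < best:
            best, best_params, stale = loss, params, 0
        else:
            stale += 1
            if stale >= patience:  # early stop: no recent improvement
                break
    return best, best_params

best_loss, best_params = random_search()
```

Bayesian optimization improves on this by modeling the objective surface to pick the next trial, which matters most when each trial is an expensive training job.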
Evaluate ML models
- Implement model evaluation workflows using Vertex AI Model Evaluation to compute precision, recall, F1-score, AUC-ROC, mean absolute error, and root mean squared error across model slices and data segments.
- Analyze model fairness using the What-If Tool and Vertex AI Model Evaluation to detect bias across protected attributes, evaluate disparate impact metrics, and assess equalized odds across demographic slices for compliance and ethical deployment.
- Analyze evaluation metric selection tradeoffs for different ML task types including classification thresholds, regression error bounds, and ranking metrics to choose evaluation criteria aligned with business objectives.
- Design model validation strategies that combine offline evaluation, online A/B testing, shadow deployments, and champion-challenger frameworks to ensure production model quality meets business service level objectives.
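The classification metrics listed above follow directly from confusion-matrix counts, as this minimal sketch shows for binary labels.

```python
def classification_metrics(y_true, y_pred):
    """Precision, recall, and F1 for binary labels (1 = positive class)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"precision": precision, "recall": recall, "f1": f1}

# tp=2, fp=1, fn=1 -> precision = recall = f1 = 2/3
m = classification_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
```

Vertex AI Model Evaluation computes the same quantities per model slice, which is what surfaces segment-level weaknesses a global metric hides.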
Domain 4: Serving and Scaling Models
3 topics
Serve models with Vertex AI Prediction
- Implement Vertex AI online prediction endpoints by uploading models, configuring machine types, deploying model versions, and setting up request routing with traffic splitting for real-time inference serving.
- Implement Vertex AI batch prediction jobs by configuring input sources (BigQuery, Cloud Storage), output destinations, machine types, and batch sizes for large-scale offline inference workloads.
- Analyze custom prediction routine design tradeoffs on Vertex AI by evaluating pre-processing and post-processing container architectures, health check strategies, model loading patterns, and latency impacts for specialized inference workflows.
- Analyze model optimization technique tradeoffs including TensorFlow Lite quantization (post-training versus quantization-aware), weight pruning, and knowledge distillation by evaluating accuracy degradation, size reduction, and latency improvement for deployment targets.
- Design serving architecture strategies that balance online and batch prediction patterns, custom containers versus pre-built serving, and model optimization impact on accuracy to establish deployment configurations meeting organizational latency, throughput, and cost requirements.
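Post-training quantization, one of the optimization techniques above, maps float weights onto a small integer range and accepts a bounded round-trip error. A minimal symmetric int8 sketch, with made-up weights:

```python
# Sketch of symmetric post-training int8 quantization: map float weights to
# int8 with a single per-tensor scale, then measure the round-trip error.
def quantize_int8(weights):
    scale = max(abs(w) for w in weights) / 127 or 1.0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    return [v * scale for v in q]

weights = [0.5, -1.2, 0.03, 0.9]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_err = max(abs(a - b) for a, b in zip(weights, restored))
# max_err is bounded by half a quantization step (scale / 2)
```

This is why quantization shrinks models roughly 4x (int8 vs float32) at the cost of a small, measurable accuracy hit; quantization-aware training recovers some of that loss.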
Scale ML serving infrastructure
- Implement autoscaling configurations for Vertex AI prediction endpoints by setting minimum and maximum replica counts, target CPU utilization thresholds, and scale-down delay parameters for elastic inference capacity.
- Analyze traffic management strategies for model endpoints by evaluating traffic splitting percentages, gradual rollout configurations, and canary deployment patterns to determine safe model version transition approaches with controlled blast radius.
- Analyze A/B testing results for model serving by evaluating traffic splitting configurations, prediction log data, and statistical significance of model performance differences to determine production-readiness of candidate model versions.
- Analyze GPU and TPU provisioning strategies for serving endpoints by evaluating accelerator type selection, multi-model serving configurations on shared accelerators, and quota management approaches for cost-effective high-throughput inference.
- Analyze serving infrastructure scaling patterns to optimize autoscaling parameters, accelerator utilization, and cold-start latency while balancing prediction throughput against infrastructure cost constraints.
- Design serving cost optimization strategies that balance committed use discounts, preemptible resources, multi-region deployment, and caching layers to minimize total cost of ownership for production ML inference.
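The target-utilization autoscaling described above reduces to simple arithmetic: scale the replica count so per-replica load returns to the target, clamped to the configured bounds. A sketch with illustrative defaults:

```python
import math

# Sketch of target-utilization autoscaling arithmetic: choose a replica count
# that brings per-replica load back to the target, clamped to min/max replicas.
def desired_replicas(current_replicas, current_utilization,
                     target_utilization=0.6, min_replicas=1, max_replicas=10):
    raw = current_replicas * current_utilization / target_utilization
    return max(min_replicas, min(max_replicas, math.ceil(raw)))

# 4 replicas running at 90% against a 60% target -> scale out to 6
print(desired_replicas(4, 0.9))
```

Scale-down delay parameters exist precisely because this formula, applied instantly in both directions, would thrash on bursty traffic.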
Manage model lifecycle in production
- Implement Vertex AI Model Monitoring to detect data drift, prediction drift, and feature attribution skew using statistical tests, configuring alert thresholds, and establishing monitoring schedules for deployed models.
- Analyze automated retraining trigger effectiveness by evaluating model monitoring alert thresholds, scheduled interval frequency, and data freshness criteria to determine optimal retraining cadence balancing model freshness against computational cost.
- Implement CI/CD pipelines for ML using Cloud Build triggers, Vertex AI Pipelines, and model validation gates to automate the build, test, and deployment lifecycle for ML models across environments.
- Analyze model degradation patterns including concept drift, data drift, and upstream data pipeline changes to determine optimal retraining frequency, monitoring granularity, and rollback decision criteria.
- Design CI/CD pipeline optimization strategies for ML by establishing deployment frequency targets, change failure rate thresholds, recovery time objectives, and model validation coverage standards to improve the ML delivery lifecycle.
- Design model lifecycle management strategies that integrate monitoring, automated retraining, versioned deployments, and rollback procedures into a unified production ML governance framework across organizational teams.
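One common statistical test behind the drift detection described above is the Population Stability Index, which compares a serving-time feature histogram to the training baseline bucket by bucket. The histograms and alert threshold below are illustrative.

```python
import math

# Sketch of drift detection via Population Stability Index (PSI). A score
# above ~0.2 is a common (informal) threshold for flagging drift.
def psi(baseline_counts, current_counts, eps=1e-6):
    """PSI over aligned histogram buckets of one feature."""
    b_total, c_total = sum(baseline_counts), sum(current_counts)
    score = 0.0
    for b, c in zip(baseline_counts, current_counts):
        b_frac = max(b / b_total, eps)  # eps avoids log(0) on empty buckets
        c_frac = max(c / c_total, eps)
        score += (c_frac - b_frac) * math.log(c_frac / b_frac)
    return score

stable = psi([100, 200, 300], [110, 190, 310])   # small score -> no alert
shifted = psi([100, 200, 300], [300, 200, 100])  # large score -> alert
```

Vertex AI Model Monitoring wires tests like this to alert thresholds and schedules, so a breach can trigger the retraining pipelines discussed above.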
Domain 5: Automating and Orchestrating ML Pipelines
4 topics
Design ML pipeline architectures
- Implement pipeline component design using the Kubeflow Pipelines SDK with typed inputs and outputs, component specifications, container operations, and reusable component libraries for modular ML workflow construction.
- Analyze DAG orchestration pattern tradeoffs for ML pipelines including conditional execution, parallel branches, dynamic pipeline generation, and loop constructs to select optimal workflow structures for training and evaluation complexity.
- Implement pipeline artifact management using Vertex ML Metadata to track datasets, models, metrics, and lineage across pipeline runs for reproducibility and provenance auditing.
- Analyze pipeline architecture patterns to evaluate component granularity, caching effectiveness, resource allocation per step, and failure isolation strategies for reliable and efficient ML pipeline execution.
- Design pipeline architecture strategies that define component boundaries, artifact contracts, and versioning policies to maximize reuse across ML projects while maintaining isolation and independent deployability.
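The lineage tracking Vertex ML Metadata provides can be sketched as a store that records which artifacts each step consumed and produced, then walks upstream for a provenance audit. All step and artifact names here are illustrative.

```python
# Sketch of artifact lineage tracking in the spirit of Vertex ML Metadata.
class LineageStore:
    def __init__(self):
        self.producers = {}  # artifact -> (producing step, its input artifacts)

    def record(self, step, inputs, outputs):
        for out in outputs:
            self.producers[out] = (step, list(inputs))

    def provenance(self, artifact):
        """All upstream artifacts that transitively fed this one."""
        upstream, stack = set(), [artifact]
        while stack:
            entry = self.producers.get(stack.pop())
            if entry:
                for inp in entry[1]:
                    if inp not in upstream:
                        upstream.add(inp)
                        stack.append(inp)
        return upstream

store = LineageStore()
store.record("ingest", [], ["raw_data"])
store.record("featurize", ["raw_data"], ["features"])
store.record("train", ["features"], ["model_v1"])
print(store.provenance("model_v1"))  # raw_data and features
```

This is the query that answers audit questions like "which dataset versions produced the model now in production?"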
Automate ML workflows
- Implement Vertex AI Pipelines SDK workflows by compiling pipeline definitions, configuring pipeline parameters, submitting pipeline runs, and managing pipeline run artifacts and execution history.
- Analyze custom pipeline component design tradeoffs by evaluating lightweight Python function components versus container-based components, dependency isolation strategies, and interface contracts for reusable specialized ML operations.
- Implement pipeline scheduling using Cloud Scheduler cron triggers, Pub/Sub event-driven triggers, and Cloud Functions to automate recurring pipeline execution and data-arrival-based pipeline activation.
- Analyze event-driven ML pipeline trigger architectures using Pub/Sub, Cloud Functions, and Eventarc by evaluating event routing patterns, delivery guarantees, idempotency requirements, and failure handling for reliable pipeline activation.
- Design pipeline automation reliability strategies by establishing trigger monitoring, execution success rate targets, scheduling governance, and event processing SLOs to prevent and resolve pipeline automation failures at scale.
- Design end-to-end ML automation strategies that coordinate data ingestion, feature computation, training, evaluation, and deployment pipelines into a unified continuous training and delivery system.
Monitor and optimize ML operations
- Implement pipeline monitoring using Cloud Logging, Cloud Monitoring, and Vertex AI pipeline run dashboards to track step execution times, failure rates, resource consumption, and pipeline SLA compliance.
- Analyze cost tracking and resource optimization for ML pipelines by evaluating budget alert effectiveness, per-component resource usage patterns, and machine type and accelerator allocation efficiency to identify cost reduction opportunities.
- Design MLOps maturity advancement roadmaps by evaluating current automation levels (manual, ML pipeline, CI/CD pipeline, automated retraining, full MLOps), prioritizing capability gaps, and establishing incremental adoption plans for target operational maturity.
- Analyze pipeline performance bottlenecks by profiling step execution durations, identifying I/O-bound and compute-bound components, and evaluating caching hit rates to optimize end-to-end pipeline throughput.
- Design MLOps operational excellence strategies that define SLOs for pipeline reliability, resource efficiency targets, cost governance models, and continuous improvement processes for production ML systems.
Ensure responsible AI practices
- Implement model cards and documentation practices to record model purpose, training data characteristics, evaluation results, intended use cases, and known limitations for transparency and accountability in ML systems.
- Analyze bias detection results using Vertex AI Model Evaluation fairness metrics, slice-based analysis, and counterfactual testing to quantify algorithmic bias severity, assess remediation options, and determine deployment risk across protected demographic attributes.
- Implement explainability solutions using Vertex Explainable AI with feature attributions (Sampled Shapley, Integrated Gradients, XRAI) to provide interpretable model predictions for stakeholder trust and regulatory compliance.
- Design explainability strategies that select among feature attribution techniques (Sampled Shapley, Integrated Gradients, XRAI), evaluate their applicability to different model architectures, and establish interpretability standards across organizational ML systems.
- Design AI governance frameworks that integrate model cards, bias monitoring, explainability requirements, human review processes, and organizational accountability structures aligned with Google AI Principles and regulatory standards.
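Sampled Shapley, one of the attribution methods named above, averages each feature's marginal contribution over random feature orderings relative to a baseline input. A minimal sketch with a made-up model (for a linear model the sampled values match the exact Shapley values):

```python
import random

# Sketch of Sampled Shapley attribution: average each feature's marginal
# contribution over random orderings, relative to a baseline. Model is made up.
def sampled_shapley(predict, instance, baseline, samples=200, seed=0):
    rng = random.Random(seed)
    n = len(instance)
    attributions = [0.0] * n
    for _ in range(samples):
        order = list(range(n))
        rng.shuffle(order)               # random feature ordering
        current = list(baseline)
        prev = predict(current)
        for i in order:
            current[i] = instance[i]     # "switch on" feature i
            value = predict(current)
            attributions[i] += value - prev  # marginal contribution
            prev = value
    return [a / samples for a in attributions]

linear = lambda x: 3 * x[0] + 2 * x[1] - x[2]
attr = sampled_shapley(linear, [1.0, 1.0, 1.0], [0.0, 0.0, 0.0])
# For this linear model the attributions are exactly [3.0, 2.0, -1.0]
```

Integrated Gradients and XRAI trade this model-agnostic sampling for gradient access, which is why method choice depends on the model architecture.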
Hands-On Labs
Practice in a simulated cloud console or Python code sandbox — no account needed. Each lab runs entirely in your browser.
Certification Benefits
Industry Recognition
Google Cloud certifications are highly valued in AI-focused organizations. Google is a pioneer in machine learning research (TensorFlow, Transformers, TPUs), and this certification validates expertise in Vertex AI and GCP's industry-leading ML infrastructure and toolchain.
Scope
Included Topics
- All domains and task statements in the Google Cloud Professional Machine Learning Engineer certification exam guide: Domain 1 Architecting Low-Code ML Solutions (12%), Domain 2 Collaborating Within and Across Teams to Manage Data and Models (16%), Domain 3 Scaling Prototypes into ML Models (18%), Domain 4 Serving and Scaling Models (26%), and Domain 5 Automating and Orchestrating ML Pipelines (28%).
- Professional-level ML engineering decisions for low-code ML development, data and model management, model training and evaluation, model serving and scaling, and ML pipeline automation on Google Cloud Platform.
- Complex scenario-based tradeoff analysis involving ML architecture design, training infrastructure optimization, serving cost management, pipeline orchestration strategies, and responsible AI governance on GCP.
- Key GCP services for ML engineers: Vertex AI (AutoML, Custom Training, Prediction, Pipelines, Feature Store, Model Registry, Model Monitoring, Explainable AI, Vizier, Model Evaluation), BigQuery ML, Dataflow, Dataprep, Cloud Composer, Kubeflow Pipelines, TensorFlow, PyTorch, JAX, Vision AI, Natural Language AI, Speech-to-Text, Text-to-Speech, Translation AI, Video AI, Document AI, Cloud TPU, Cloud GPU, Artifact Registry, Cloud Storage, Pub/Sub, Cloud Functions, Cloud Build, Cloud Logging, Cloud Monitoring.
Not Covered
- Deep enterprise strategy content unrelated to ML engineering operating models and automation outcomes expected by the Professional ML Engineer exam.
- Provider-agnostic tooling detail that does not map to GCP native services and integration patterns used in the exam objectives.
- Research-level machine learning theory not connected to practical model development, serving, and operations on Google Cloud.
- Exact short-lived pricing terms and transient promotional details not suitable for durable technical domain specifications.
Official Exam Page
Learn more at Google Cloud
Ready to master PMLE?
Adaptive learning that maps your knowledge and closes your gaps.
Subscribe to Access