What is Vertex AI?

Sharon Rajendra Manmothe
Dec 2, 2024
9 min read

When we think of Google, the first thing that comes to mind is its dominance in the search engine space. However, Google has made substantial contributions to the data science industry, consistently delivering state-of-the-art products and solutions to help users unlock the full potential of their data.

One of Google’s standout contributions is Vertex AI, a platform launched in 2021 to simplify the machine learning (ML) process on an enterprise scale.

In this tutorial, we will explore how to get started with Google’s Vertex AI platform and use it to manage the various stages of the ML lifecycle. By the end, we’ll have a deployed model ready to generate predictions for a classification task.

A Comprehensive Solution for the Machine Learning Lifecycle

The machine learning lifecycle is a multi-step process that encompasses the following stages:

Data Preparation, Ingestion, and Exploration
Feature Engineering and Selection
Model Training and Tuning
Deployment and Model Monitoring

Each stage comes with its own set of tools, techniques, and challenges. Typically, implementing ML solutions requires a team of specialists with expertise in diverse areas.

This is where Vertex AI by Google Cloud stands out—it unifies and streamlines the entire ML lifecycle within a single, cohesive platform.

Simplifying Machine Learning for Everyone

Vertex AI is designed to cater to users of all skill levels, democratizing machine learning in the following ways:

1. AutoML for Beginners

No Coding Required: With AutoML, users with minimal technical expertise can create high-quality ML models.
Automated Workflows: From data preparation to hyperparameter tuning, everything is handled seamlessly under the hood.
Quick Results: Users simply upload their data and follow a few simple steps to build a model.

2. Custom Model Training for Experts

Flexibility and Freedom: Experienced data scientists can train models using their preferred frameworks, such as TensorFlow, PyTorch, or XGBoost.
Advanced Capabilities: Vertex AI provides robust tools to handle complex ML workflows, allowing experts to push the boundaries of what’s possible.

3. Simplified Deployment for Everyone

Real-Time APIs: Easily deploy models to serve real-time predictions, integrating them into live applications.
Batch Predictions: Perform large-scale inference tasks efficiently.

Unified ML Workflow in Action

Vertex AI’s holistic approach not only saves time but also bridges the gap between technical experts and business users, enabling organizations to scale their ML initiatives efficiently.

In the sections to follow, we will dive deeper into the features and practical applications of Vertex AI, walking through how it simplifies the end-to-end machine learning lifecycle.

Stay tuned as we unravel the full potential of Vertex AI—a tool that’s reshaping how enterprises leverage machine learning!

What Are Google Cloud Services?

Before diving into the specifics of Vertex AI, it’s essential to understand the broader ecosystem it belongs to—Google Cloud Services. This suite of cloud computing solutions provides an extensive range of tools to support storage, networking, databases, analytics, and machine learning tasks.

Google Cloud Services seamlessly integrate with Vertex AI, offering a unified platform for managing the end-to-end machine learning workflow. Let’s explore some of the key services that complement Vertex AI:

Key Google Cloud Services

1. Data Storage and Management

Efficient machine learning requires robust data storage and management capabilities, and Google Cloud offers powerful tools to handle this:

Cloud Storage
- Acts as the central repository for raw data, making it easily accessible to Vertex AI.
- Stores datasets required for model training and analysis, ensuring scalability and security.
BigQuery
- A high-performance, serverless data warehouse designed for large-scale datasets.
- Supports SQL querying and analytics at scale. BigQuery tables can serve as a data source for Vertex AI, and BigQuery ML can train models directly from SQL.

2. Compute Resources

Machine learning workloads, especially at scale, demand significant computational power. Google Cloud offers versatile compute solutions:

Compute Engine
- Provides customizable virtual machines (VMs) to handle resource-intensive ML tasks.
- Vertex AI can utilize these VMs for custom model training, allowing you to scale resources as needed.
Vertex AI Pipelines
- Orchestrates complex ML workflows across multiple compute resources, improving efficiency and traceability.
- Facilitates automation, ensuring your ML processes are optimized for both time and cost.

Getting Started with Google Cloud Services

To explore these services and begin your journey with Vertex AI, follow these steps:

Visit the Homepage
- Navigate to cloud.google.com to explore the available services.
- If you’re new, click “Get started for free” to activate a free trial and access introductory credits.
Access the Cloud Console
- Go directly to console.cloud.google.com for hands-on access to Google Cloud tools.
- This is your control center for managing projects, accessing services, and monitoring resource usage.

Together, these services cover the supporting pieces of a Vertex AI workflow: Cloud Storage and BigQuery hold the data, Compute Engine supplies the machines, and Vertex AI Pipelines ties the individual steps together.

Setting Up Your Google Cloud Console for Vertex AI

Before you begin your journey with Vertex AI, you’ll need to set up your Google Cloud Console. Follow this step-by-step guide to configure the console and prepare it for your Vertex AI projects. Note that you should be prepared to spend approximately $25–30 for this tutorial, using the most affordable configurations.

Step 1: Visit the Google Cloud Console

Navigate to the Google Cloud Console. You’ll typically land on the welcome page, which displays your workspace name (e.g., ibexprogramming.com).

Step 2: Create a Project

Projects in Google Cloud are organizational units used to manage resources for specific tasks. Here’s how to create one:

Go to the Console
- On the welcome page, look for the option to create a new project.
Create the Project
- Follow the prompts to name your project and configure basic settings.
- Once created, select your project. You’ll notice its name displayed on the top bar of the page.

Step 3: Set Up a Billing Account

Vertex AI requires billing information to enable its services. Don’t worry—you’ll only be charged for paid resources you use.

Visit the Billing Console
- Head over to the Google Cloud Billing Console.
Create an Account
- If you don’t have an existing billing account, click on “Create account” and follow the setup instructions.
Link Your Billing Account
- Once your billing account is created, link it to your project to enable payment for any resources Vertex AI might use.

Step 4: Access Vertex AI

View All Products
- Return to the Google Cloud Console and click on “View all products” at the bottom of the page.
Find and Pin Vertex AI
- Use the search function (Ctrl + F) to locate Vertex AI in the list of services.
- Pin it to your menu for quick access.
Enable the Vertex AI API
- Click on Vertex AI in the menu to navigate to its dashboard.
- If prompted, enable the Vertex AI API by clicking “Enable.”
- Alternatively, if no prompt appears, use the “Enable all API permissions” option to ensure Vertex AI services are activated.
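
The API can also be enabled from code instead of the console. The sketch below assumes the gcloud CLI is installed and authenticated, and that aiplatform.googleapis.com is the service name of the Vertex AI API. The helper names and the project ID are placeholders invented for illustration.

```python
import subprocess


def build_enable_command(project_id: str,
                         service: str = "aiplatform.googleapis.com") -> list[str]:
    """Return the gcloud argv that enables one service for one project."""
    if not project_id:
        raise ValueError("project_id must not be empty")
    return ["gcloud", "services", "enable", service, "--project", project_id]


def enable_vertex_ai(project_id: str) -> None:
    """Run the command; raises CalledProcessError if gcloud reports a failure."""
    subprocess.run(build_enable_command(project_id), check=True)


# Hypothetical project ID, used only to show the resulting command line.
print(" ".join(build_enable_command("my-project-id")))
```

Calling enable_vertex_ai with your own project ID runs the command. Nothing is executed by the block itself beyond printing the command line.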

Step 5: Ready to Upload Your Dataset

With your project created, billing account linked, and Vertex AI API enabled, your console is now set up for your machine learning tasks. Next, you’ll upload a dataset to Vertex AI to begin the process of building and deploying models.

Uploading a Dataset in Vertex AI

Adding a dataset is one of the foundational steps in building machine learning models using Vertex AI. For simplicity, we’ll use a local CSV file in this tutorial. Let’s walk through the process step by step.

Step 1: Download a Sample Dataset

For this example, we’ll use the Dry Bean Dataset from the UCI Machine Learning Repository.

About the Dataset

Description: Contains numeric measurements of 13,611 instances of beans.

Task: Classify beans into seven categories:

Seker
Barbunya
Bombay
Cali
Dermason
Horoz
Sira

Save the Dataset

Download the dataset as a ZIP file from the UCI repository.

Extract the file, and locate the Excel (.xlsx) file in the ZIP archive.

Save the Excel file in a suitable directory on your local machine (e.g., in a data folder within your working directory).

Step 2: Read and Convert the Dataset

To use the dataset with Vertex AI, we need to convert it from Excel format to CSV. Here’s how to do it:

Import Required Libraries

First, ensure you have the necessary libraries installed. The pandas library is used for data manipulation, and openpyxl is required for reading Excel files:

pip install pandas openpyxl

Code for Reading and Saving the Dataset

from pathlib import Path

import pandas as pd

# Set the file paths
cwd = Path.cwd()
data_path = cwd / "data" / "Dry_Bean_Dataset.xlsx"

# Read the Excel file
beans = pd.read_excel(data_path)

# Display the shape of the dataset
print(beans.shape)  # Output: (13611, 17)

# Save the dataset as a CSV file
csv_path = cwd / "data" / "dry_bean.csv"
beans.to_csv(csv_path, index=False)

Step 3: Validate the CSV File

Ensure the CSV file has been saved correctly by navigating to the directory and opening the file. The dataset should now be ready for upload into Vertex AI.
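
The same check can be done in code rather than by opening the file. This is a minimal sketch using only the standard library. The function name is invented here, and the label column name "Class" is an assumption about the file's header that you should confirm against your own copy.

```python
import csv
from collections import Counter
from pathlib import Path


def summarize_csv(path, label_column):
    """Return (row_count, column_count, label_counts) for a CSV with a header row."""
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle)
        columns = reader.fieldnames or []
        if label_column not in columns:
            raise KeyError(f"column {label_column!r} not found in {columns}")
        labels = Counter()
        rows = 0
        for row in reader:
            rows += 1
            labels[row[label_column]] += 1
    return rows, len(columns), labels


# Same location the conversion step wrote to; skipped if the file is absent.
csv_path = Path.cwd() / "data" / "dry_bean.csv"
if csv_path.exists():
    rows, cols, labels = summarize_csv(csv_path, label_column="Class")
    print(rows, cols)      # compare with the shape printed during conversion
    print(sorted(labels))  # the distinct class labels found in the file
```

If the row and column counts match the shape printed during conversion and the label list shows seven classes, the file is ready for upload.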

Next Steps: Uploading the Dataset to Vertex AI

In the next sections, we’ll create a Cloud Storage bucket, upload this CSV file to it, and register the file as a dataset in Vertex AI so it is ready for model training.

Create a Cloud Storage Bucket

To manage and store raw data for machine learning tasks in Vertex AI, we need to set up a Google Cloud Storage Bucket. This serves as a centralized repository for files like our dataset. Follow these steps to create and configure your bucket.

Step 1: Link a Billing Account

To use storage services, your Google Cloud project must have a billing account linked. Here's how to do that:

Navigate to your Google Cloud Console.
Go to the Billing section from the navigation menu.
Link the billing account you created earlier to your current project.

Step 2: Create a Cloud Storage Bucket

Buckets must have globally unique names, so take care when naming them. Follow these steps:

In the Cloud Console, locate Storage in the navigation menu and click on Buckets.
Click the “Create” button.
Provide a unique name for your bucket.
- For example: my-vertex-ai-dataset-storage.
Select the default options for the remaining fields (e.g., region, storage class) and click “Continue” until the bucket is created.

Congratulations! Your bucket is now ready to store data.
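
Because a rejected name only shows up when you submit the form, it can help to check a candidate name first. The sketch below encodes a subset of Cloud Storage's naming rules (length, allowed characters, and the reserved "goog" prefix); it does not cover names containing dots and cannot tell whether a name is already taken. The function names are invented for illustration, and create_bucket assumes the google-cloud-storage package and valid credentials.

```python
import re

# Lowercase letters, digits, hyphens and underscores; must start and end
# with a letter or digit; 3 to 63 characters overall.
_BUCKET_NAME = re.compile(r"[a-z0-9][a-z0-9_-]{1,61}[a-z0-9]")


def looks_like_valid_bucket_name(name: str) -> bool:
    """Check a candidate name against a subset of the naming rules."""
    if not _BUCKET_NAME.fullmatch(name):
        return False
    if name.startswith("goog") or "google" in name:
        return False
    return True


def create_bucket(name: str, location: str = "us-central1"):
    """Create the bucket; the request fails if the name is already taken."""
    if not looks_like_valid_bucket_name(name):
        raise ValueError(f"unlikely to be accepted as a bucket name: {name!r}")
    from google.cloud import storage  # pip install google-cloud-storage
    return storage.Client().create_bucket(name, location=location)


print(looks_like_valid_bucket_name("my-vertex-ai-dataset-storage"))  # True
print(looks_like_valid_bucket_name("My_Bucket"))                     # False
```

The location default mirrors the region used later in this tutorial; pass whichever region you selected in the console.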

Ingest a Local CSV into Vertex AI

With your Cloud Storage Bucket created and the CSV file prepared in the previous step, let’s upload the file into Vertex AI.

Step 1: Upload the CSV to the Cloud Bucket

Open the Storage > Buckets section in the Google Cloud Console.
Click on your bucket’s name to open it.
Use the Upload Files button to upload your dry_bean.csv file into the bucket.
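
The upload can also be scripted, which is convenient when the file changes often. A minimal sketch, assuming the google-cloud-storage package is installed and credentials are available in the environment; the helper names are invented here.

```python
from pathlib import Path


def gcs_uri(bucket_name: str, object_name: str) -> str:
    """Build the gs:// path for an object in a Cloud Storage bucket."""
    return f"gs://{bucket_name}/{object_name.lstrip('/')}"


def upload_file(local_path, bucket_name: str, object_name: str = "") -> str:
    """Upload one local file and return its gs:// URI."""
    from google.cloud import storage  # pip install google-cloud-storage

    local_path = Path(local_path)
    object_name = object_name or local_path.name
    blob = storage.Client().bucket(bucket_name).blob(object_name)
    blob.upload_from_filename(str(local_path))
    return gcs_uri(bucket_name, object_name)


# Example bucket name from the bucket-creation step.
print(gcs_uri("my-vertex-ai-dataset-storage", "dry_bean.csv"))
# gs://my-vertex-ai-dataset-storage/dry_bean.csv
```

The returned URI is the value Vertex AI asks for when you point a dataset at a file in Cloud Storage.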

Step 2: Create a Dataset in Vertex AI

Go to the Vertex AI Dashboard.
Navigate to the Datasets tab in the left-hand menu.
Click + Create Dataset.

Step 3: Configure the Dataset

Dataset Name: Provide a unique name for the dataset, such as dry-bean-classification.
Data Type: Choose Tabular (since we’re working with structured data).
Source Options: Select Upload from Cloud Storage.
- Enter the Cloud Storage path for your uploaded CSV file (e.g., gs://my-vertex-ai-dataset-storage/dry_bean.csv).
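
The configuration above has an SDK counterpart. The sketch below assumes the google-cloud-aiplatform package and valid credentials, and uses TabularDataset.create to register the CSV. The two helper functions are invented for illustration, and the project and location values are whatever you chose during setup.

```python
def split_gcs_uri(uri: str) -> tuple[str, str]:
    """Split 'gs://bucket/path/file.csv' into (bucket, object path)."""
    prefix = "gs://"
    if not uri.startswith(prefix):
        raise ValueError(f"expected a gs:// URI, got {uri!r}")
    bucket, _, obj = uri[len(prefix):].partition("/")
    if not bucket or not obj:
        raise ValueError(f"URI must name both a bucket and an object: {uri!r}")
    return bucket, obj


def create_tabular_dataset(display_name: str, csv_uri: str,
                           project: str, location: str):
    """Register a CSV in Cloud Storage as a Vertex AI tabular dataset."""
    split_gcs_uri(csv_uri)  # fail early on a malformed path
    from google.cloud import aiplatform  # pip install google-cloud-aiplatform

    aiplatform.init(project=project, location=location)
    return aiplatform.TabularDataset.create(
        display_name=display_name, gcs_source=csv_uri
    )


print(split_gcs_uri("gs://my-vertex-ai-dataset-storage/dry_bean.csv"))
# ('my-vertex-ai-dataset-storage', 'dry_bean.csv')
```

Validating the path before calling the service catches the most common mistake, a missing gs:// prefix or file name, without waiting for a remote error.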

Step 4: Analyze the Dataset

Once uploaded, Vertex AI will analyze the dataset. You will see the Analyze tab displaying metadata like:

Number of rows and columns.
Data distribution and basic statistics.

Configuring JupyterLab and Compute Resources in Vertex AI Workbench

Vertex AI Workbench provides a powerful, integrated development environment for building, training, and deploying machine learning models. Here’s how you can set it up for your project:

Step 1: Creating a Workbench

Navigate to the Workbench tab in the Vertex AI Dashboard.
Click on the + New Notebook button.
Provide a name for your notebook, such as vertexai-tutorial-notebook.
Configure the hardware:
- Machine Type: Choose a smaller instance like n1-standard-2 for cost efficiency (~$0.12/hour).
- Disk Size: Default options are sufficient for small datasets.
Set the idle shutdown duration to 10 minutes to ensure the environment automatically stops if left idle.
Click Create to initialize the workbench.

The notebook will take a few minutes to provision. Once ready, the OPEN JUPYTERLAB button will appear.
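
To see why the idle shutdown setting matters, it helps to put numbers on the hourly rate. The sketch below is simple arithmetic using the approximate $0.12/hour figure quoted above as its default. It ignores disk, storage, and training charges, so treat the result as a rough lower bound rather than a bill estimate. The function name is invented here.

```python
def estimate_notebook_cost(hours_per_day: float, days: int,
                           hourly_rate: float = 0.12) -> float:
    """Rough compute cost in USD for the hours a notebook instance is running."""
    if hours_per_day < 0 or days < 0 or hourly_rate < 0:
        raise ValueError("inputs must be non-negative")
    return round(hours_per_day * days * hourly_rate, 2)


print(estimate_notebook_cost(2, 5))   # two hours a day for five days
print(estimate_notebook_cost(24, 5))  # left running around the clock
```

The gap between the two figures is the cost of forgetting to stop the instance, which is what the idle shutdown timer guards against.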

Step 2: Setting Up JupyterLab

Click OPEN JUPYTERLAB to launch the environment.
Create a new Python 3 notebook:
- Go to File > New Notebook.
- Rename the notebook to something meaningful (e.g., vertex_ai_setup).

Step 3: Install the Required SDK

In the first cell of your notebook, install or upgrade the Vertex AI SDK for Python, which is published as the google-cloud-aiplatform package:

!pip3 install --upgrade --quiet google-cloud-aiplatform

This SDK allows you to interact with Vertex AI services directly.

Step 4: Check the Active Project Configuration

Verify your active project configuration using the gcloud command-line tool:

!gcloud config list

The output will display details like the current project ID, region, and compute configurations:

[compute]
region = us-central1
[core]
account = your-email@example.com
project = vertexai-tutorial-423010

Step 5: Save Configuration Details

From the output, save the following details as Python variables:

PROJECT_ID = 'vertexai-tutorial-423010'
BUCKET_URI = 'gs://your-unique-bucket-name'
REGION = 'us-central1'

Note: Replace BUCKET_URI with the path to your actual Google Cloud Storage bucket.
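
Instead of copying the values by hand, the configuration output can be parsed, since it follows an INI-style layout of sections and key = value pairs. A minimal sketch using the standard library's configparser; the function name is invented here and the sample text reuses the placeholder values from this tutorial.

```python
import configparser


def parse_gcloud_config(text: str) -> dict:
    """Extract the project and region from `gcloud config list` style output."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {
        "project": parser.get("core", "project", fallback=None),
        "region": parser.get("compute", "region", fallback=None),
    }


# Placeholder output; in practice this would be the text printed by gcloud.
sample = """
[compute]
region = us-central1
[core]
account = your-email@example.com
project = vertexai-tutorial-423010
"""

settings = parse_gcloud_config(sample)
PROJECT_ID = settings["project"]
REGION = settings["region"]
print(PROJECT_ID, REGION)
# vertexai-tutorial-423010 us-central1
```

A missing section or key yields None rather than an exception, so check both values before passing them on. One way to obtain the text inside a notebook, assuming gcloud is on the path, is subprocess.run(["gcloud", "config", "list"], capture_output=True, text=True).stdout.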

Step 6: Initialize the SDK

Import the aiplatform module and initialize it using the saved configurations:

from google.cloud import aiplatform as ai

ai.init(project=PROJECT_ID, location=REGION, staging_bucket=BUCKET_URI)

This sets up the Vertex AI SDK for use in your project.

Step 7: Manage the Workbench Instance

When you finish working or need a break, stop the notebook instance to avoid unnecessary costs:

Return to the Vertex AI Workbench tab.
Locate your active notebook instance.
Click Stop to shut it down.

Tip: Always monitor your active instances to manage expenses effectively.

With JupyterLab configured, you're ready to explore Vertex AI's capabilities further, including custom training and AutoML workflows!

