name: gke-creation
description: Guides the user through creating GKE clusters using pre-defined templates (Standard, Autopilot, GPU/AI).
GKE Cluster Creation Skill
This skill helps users create Google Kubernetes Engine (GKE) clusters by providing a set of best-practice templates and guiding them through the customization process.
core_behavior
- Template Selection:
- Present the available templates to the user if they haven't specified one.
- Explain the trade-offs (e.g., Cost vs. Availability, Autopilot vs. Standard).
- Customization:
- Once a template is selected, present the default configuration (JSON/YAML).
- Ask the user for essential missing information: project_id, location, cluster_name.
- Ask if they want to modify optional fields (e.g., machineType, nodeCount, network).
- Validation:
- Ensure project_id, location, and cluster_name are set.
- Ensure the configuration matches the create_cluster MCP tool schema.
- Execution:
- Call the create_cluster MCP tool with the final configuration.
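The validation step above can be sketched as a small check that reports which required fields are still missing before any tool call is made. This is a minimal illustration, not part of the skill itself: the function name `missing_required_fields` and the flat `params` dictionary are assumptions made for the example.

```python
# Required inputs named in the Customization and Validation steps.
REQUIRED_FIELDS = ("project_id", "location", "cluster_name")


def missing_required_fields(params: dict) -> list:
    """Return the required fields that are absent or blank in params."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = params.get(field)
        # Treat None, empty strings, and whitespace-only strings as missing.
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


if __name__ == "__main__":
    answers = {"project_id": "my-proj", "location": "", "cluster_name": "dev-cluster"}
    # Any field listed here should be asked for before calling create_cluster.
    print(missing_required_fields(answers))
```

Returning the list of missing names, rather than a boolean, lets the model ask for exactly the fields that are absent.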
best_practices
When guiding the user or generating configurations, adhere to the following GKE cluster creation best practices:
Security
- Private Clusters: Default to private clusters with a private control plane and restricted public endpoints to minimize attack surface.
- VPC-Native Networking: Use VPC-native clusters to enable alias IP ranges, which allows pod-level firewall rules and better network security.
- Workload Identity: Prefer Workload Identity for securely granting GKE workloads access to Google Cloud services instead of using static service account keys.
- Shielded GKE Nodes: Enable Shielded GKE Nodes to protect against rootkits and bootkits.
- Least Privilege (RBAC): Apply strict Role-Based Access Control, granting users and workloads only the minimum privileges they need.
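As a rough sketch, several of the security defaults above map onto fields of the GKE API Cluster resource (privateClusterConfig, ipAllocationPolicy, workloadIdentityConfig, shieldedNodes). The helper below fills them in only where the user has not set a value. The function name `with_security_defaults` is hypothetical, and it is an assumption that the create_cluster MCP tool schema accepts these fields unchanged; RBAC is configured inside the cluster and is not covered here.

```python
import copy


def with_security_defaults(cluster: dict, project_id: str) -> dict:
    """Return a copy of cluster with security-related fields filled in.

    Values the caller already set are kept; only missing keys are added.
    """
    hardened = copy.deepcopy(cluster)
    defaults = {
        # Private nodes: nodes get internal IP addresses only.
        "privateClusterConfig": {"enablePrivateNodes": True},
        # VPC-native networking via alias IP ranges.
        "ipAllocationPolicy": {"useIpAliases": True},
        # Workload Identity pool derived from the project ID.
        "workloadIdentityConfig": {"workloadPool": f"{project_id}.svc.id.goog"},
        # Shielded GKE Nodes.
        "shieldedNodes": {"enabled": True},
    }
    for key, value in defaults.items():
        hardened.setdefault(key, value)
    return hardened


if __name__ == "__main__":
    print(with_security_defaults({"name": "dev-cluster"}, "my-proj"))
```

Using `setdefault` keeps any explicit user choice intact, so the defaults act as a starting point rather than an override.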
Cost Optimization
- Autoscaling: Enable Cluster Autoscaler and Horizontal Pod Autoscaler to adjust resources based on demand.
- Right-Sizing: Choose the appropriate machine types and node counts. Consider Spot VMs for fault-tolerant, non-critical workloads.
High Availability & Reliability
- Regional Clusters: Use regional clusters for production environments so that the control plane is replicated across multiple zones. (Note: a Standard regional cluster creates nodes in three zones by default.)
- Pod Disruption Budgets: Recommend setting Pod Disruption Budgets for application stability during node maintenance.
- Release Channels: Subscribe to a release channel (e.g., Regular or Stable) for automated and safer cluster upgrades.
templates
1. Standard Zonal (Cost-Effective Dev/Test)
Best for: Development, testing, non-critical workloads.
{
"name": "projects/{PROJECT_ID}/locations/{ZONE}/clusters/{CLUSTER_NAME}",
"initialNodeCount": 1,
"nodeConfig": {
"machineType": "e2-medium",
"diskSizeGb": 50,
"oauthScopes": [
"https://www.googleapis.com/auth/devstorage.read_only",
"https://www.googleapis.com/auth/logging.write",
"https://www.googleapis.com/auth/monitoring",
"https://www.googleapis.com/auth/service.management.readonly",
"https://www.googleapis.com/auth/servicecontrol",
"https://www.googleapis.com/auth/trace.append"
]
}
}
2. Standard Regional (High Availability)
Best for: Production workloads requiring high availability. Note: Creates 3 nodes (one per zone in the region) by default.
{
"name": "projects/{PROJECT_ID}/locations/{REGION}/clusters/{CLUSTER_NAME}",
"initialNodeCount": 1,
"nodeConfig": {
"machineType": "e2-standard-4",
"diskSizeGb": 100,
"oauthScopes": ["https://www.googleapis.com/auth/cloud-platform"]
}
}
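The node count in the note above follows from simple arithmetic: in a regional cluster, initialNodeCount applies per zone, so the cluster starts with that count multiplied by the number of zones. A minimal sketch follows; the function name `total_initial_nodes` is hypothetical, and the default of three zones is taken from the note rather than queried from any API.

```python
def total_initial_nodes(initial_node_count: int, zone_count: int = 3) -> int:
    """Nodes created at startup: the per-zone count times the number of zones."""
    if initial_node_count < 0 or zone_count < 1:
        raise ValueError("initial_node_count must be >= 0 and zone_count >= 1")
    return initial_node_count * zone_count


if __name__ == "__main__":
    # Standard Regional template: initialNodeCount of 1 across three zones.
    print(total_initial_nodes(1))
    # The zonal template runs in a single zone.
    print(total_initial_nodes(1, zone_count=1))
```

This is useful when explaining cost to the user: raising initialNodeCount by one in a three-zone regional cluster adds three nodes, not one.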
3. Autopilot (Operations-Free)
Best for: Most workloads where you don't want to manage nodes.
{
"name": "projects/{PROJECT_ID}/locations/{REGION}/clusters/{CLUSTER_NAME}",
"autopilot": {
"enabled": true
}
}
4. GPU Inference (L4)
Best for: AI/ML Inference, small model serving.
Note: Requires g2-standard-4 quota.
{
"name": "projects/{PROJECT_ID}/locations/{REGION}/clusters/{CLUSTER_NAME}",
"initialNodeCount": 1,
"nodeConfig": {
"machineType": "g2-standard-4",
"accelerators": [
{
"acceleratorCount": "1",
"acceleratorType": "nvidia-l4"
}
],
"diskSizeGb": 100,
"oauthScopes": ["https://www.googleapis.com/auth/cloud-platform"]
}
}
5. AI Hypercompute (A3 HighGPU)
Best for: Large Model Training/Inference. Note: High cost and strict quota requirements.
{
"name": "projects/{PROJECT_ID}/locations/{REGION}/clusters/{CLUSTER_NAME}",
"initialNodeCount": 1,
"nodeConfig": {
"machineType": "a3-highgpu-8g",
"accelerators": [
{
"acceleratorCount": "8",
"acceleratorType": "nvidia-h100-80gb-hbm3"
}
],
"diskSizeGb": 200,
"oauthScopes": ["https://www.googleapis.com/auth/cloud-platform"]
}
}
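The templates above carry a full resource path in name, while the skill's instructions call for a parent argument plus a short cluster.name when invoking create_cluster. One way to bridge the two, sketched under the assumption that the placeholders have already been filled in, is to split the path. The helper name `build_create_cluster_args` is hypothetical; only the parent and cluster argument names come from this document.

```python
import copy
import re

# Matches projects/<project>/locations/<location>/clusters/<short name>.
_NAME_PATTERN = re.compile(
    r"^(?P<parent>projects/[^/]+/locations/[^/]+)/clusters/(?P<short>[^/]+)$"
)


def build_create_cluster_args(template: dict) -> dict:
    """Split a filled-in template into the parent and cluster arguments."""
    match = _NAME_PATTERN.match(template.get("name", ""))
    if match is None:
        raise ValueError(
            "name must look like projects/P/locations/L/clusters/C"
        )
    cluster = copy.deepcopy(template)
    # The parent argument defines the scope, so name keeps only the short name.
    cluster["name"] = match.group("short")
    return {"parent": match.group("parent"), "cluster": cluster}


if __name__ == "__main__":
    filled = {
        "name": "projects/my-proj/locations/us-west1-b/clusters/dev-cluster",
        "initialNodeCount": 1,
    }
    print(build_create_cluster_args(filled))
```

The template is copied rather than modified in place, so the same template can be reused for another cluster.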
instructions
- ALWAYS ask for the project_id if it is not in the context.
- ALWAYS ask for the location (Region or Zone).
- ALWAYS ask for a unique cluster_name.
- CHECK if the user wants Access to Google Cloud APIs (the default cloud-platform scope is usually best for modern GKE).
- WARN the user about cost if they select GPU or Regional clusters.
- USE the create_cluster MCP tool to create the cluster. The parent argument is projects/{PROJECT_ID}/locations/{LOCATION} and the cluster argument is the JSON object.
- IMPORTANT: When calling create_cluster, the cluster.name should be the short name (e.g., my-cluster), NOT the full resource path, because the parent argument defines the scope.
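The cost-warning rule above could be approximated with a check like the following. It is only a sketch. The helper names are hypothetical, the warning texts are illustrative, and the region test is a heuristic that assumes zone names carry one more hyphen than region names (for example, us-west1-b versus us-west1).

```python
def is_regional(location: str) -> bool:
    """Heuristic: 'us-west1' (one hyphen) is a region, 'us-west1-b' is a zone."""
    return location.count("-") == 1


def cost_warnings(cluster: dict, location: str) -> list:
    """Collect cost warnings for GPU node configs and regional locations."""
    warnings = []
    accelerators = cluster.get("nodeConfig", {}).get("accelerators", [])
    if accelerators:
        warnings.append(
            "GPU nodes can be expensive; confirm budget and quota first."
        )
    if is_regional(location):
        warnings.append(
            "Regional clusters run nodes in several zones, which multiplies node cost."
        )
    return warnings


if __name__ == "__main__":
    gpu_cluster = {"nodeConfig": {"accelerators": [{"acceleratorType": "nvidia-l4"}]}}
    for message in cost_warnings(gpu_cluster, "us-central1"):
        print(message)
```

An empty list means no warning is needed, which is the case for the Standard Zonal template in a single zone.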
example_usage
User: "I want to create a GKE cluster."
Model: "I can help with that. What kind of cluster do you need?
- Standard Zonal: Good for dev/test.
- Standard Regional: High availability.
- Autopilot: Fully managed.
- GPU Enabled: For AI/ML workloads."
User: "Standard Zonal, please."
Model: "Great. I'll need a few details:
- Project ID
- Zone (e.g., us-central1-a)
- Cluster Name"
User: "Project my-proj, zone us-west1-b, name dev-cluster."
Model: "Here is the configuration I will use:
[JSON view]
Do you want to proceed?"

