Node Groups and Node Pools in Depth: How Managed Kubernetes Runs Your Workloads

Every Pod in Kubernetes eventually lands on a node—a worker machine with kubelet, container runtime, and kube-proxy. Managed clusters do not give you a vague “fleet of servers”; they give you node groups (AWS EKS) or node pools (Azure AKS, Google GKE): homogeneous pools of VMs with shared settings for instance type, OS image, subnets, labels, taints, and autoscaling. Understanding that layer is what separates “my Deployment won’t schedule” from a five-minute fix.

In short

A node group / node pool is the cloud’s unit of worker capacity: same SKU, same bootstrap, same network placement, often one scaling boundary for Cluster Autoscaler. You split pools for system vs apps, spot vs on-demand, GPU, or Windows. Pods reach a pool via scheduling (resources, labels, affinity, taints/tolerations). Do not confuse cloud pools with Karpenter NodePools—same word, different object. See Karpenter in depth after you are solid on classic groups.

Nodes, groups, and pools: vocabulary

In plain Kubernetes, a Node is an API object representing one worker. The control plane (API server, scheduler, controllers) usually runs separately; what you scale for applications are worker nodes.

Hyperscaler managed Kubernetes wraps workers in a higher-level construct:

  • AWS EKS: Managed node groups (MNG) or self-managed node groups backed by EC2 Auto Scaling Groups (ASGs). Fargate profiles are a different compute model (no SSH-able nodes).
  • Azure AKS: Node pools (system pool + user pools; optional Windows, GPU, spot).
  • Google GKE: Node pools, zonal or regional, with autoscaling and surge upgrades.

Documentation and Terraform modules often say “node group” on AWS and “node pool” elsewhere. In conversation, teams use both to mean: a cohort of workers that share infrastructure DNA.

If the control plane vs worker split is still fuzzy, read Kubernetes architecture in simple terms first.

What a node group actually controls

A group is not just “three m5.large instances.” It bundles decisions that affect every Pod scheduled there:

Dimension | Why it matters | Examples
Instance type / SKU | CPU, memory, GPU, local SSD, architecture (x86 vs arm64) | m6i.large, g5.xlarge, Standard_D4s_v5
Capacity type | Cost vs interruption risk | On-demand, spot, reserved capacity
OS image | Kernel, cgroup driver, GPU drivers, compliance | EKS-optimized AMI, AKS Ubuntu, GKE Container-Optimized OS
Networking | AZ spread, IP exhaustion, load balancer placement | Private subnets per AZ, max pods per node (prefix delegation)
Identity | What the kubelet/instance may call in the cloud API | EKS node IAM role, AKS managed identity, GKE service account
Kubernetes surface | Scheduler input | Labels, node.kubernetes.io/instance-type, taints, allocatable resources
Scaling | How many nodes exist under load | ASG min/max/desired, cluster autoscaler tags, surge settings

When you “add a GPU pool,” you are really saying: launch machines of family G with NVIDIA drivers, taint them nvidia.com/gpu=true:NoSchedule, and let only GPU workloads tolerate that taint.

EKS: managed node groups vs self-managed

On Amazon EKS, most teams start with managed node groups:

  • AWS handles the ASG, launch template alignment, and rolling AMI updates for the group.
  • You choose subnets, instance types, disk size, labels, taints, and the node IAM role.
  • Nodes register into the cluster with predictable naming and managed draining hooks on scale-in.

Self-managed node groups are ASGs you own entirely—more knobs (custom bootstrap, mixed instances policy nuances), more operational burden. Teams pick self-managed when they need behavior MNG does not expose yet, or when integrating legacy automation.

Fargate is not a node group in the EC2 sense: AWS runs each Pod on its own AWS-managed compute, selected by a Fargate profile. You give up node SSH and DaemonSets in exchange for serverless ops. Useful for small bursty services; costly and limiting when you depend on cluster-wide agents (some observability, storage, or CNI patterns).

# eksctl sketch: one on-demand app pool + one spot pool
# --without-nodegroup skips eksctl's default node group so only the two pools below exist
eksctl create cluster --name prod --region ap-south-1 --without-nodegroup

eksctl create nodegroup --cluster prod --region ap-south-1 --name app-ondemand \
  --node-type m6i.large --nodes 2 --nodes-min 2 --nodes-max 10 \
  --node-labels workload=general --asg-access

# If your eksctl version rejects --node-taints, declare the taint in a ClusterConfig file instead
eksctl create nodegroup --cluster prod --region ap-south-1 --name app-spot \
  --spot --instance-types m6i.large,m6a.large,m5.large \
  --nodes 0 --nodes-min 0 --nodes-max 20 \
  --node-labels workload=spot,capacity-type=spot \
  --node-taints spot=true:NoSchedule

AKS and GKE: node pool mental model

AKS creates a default system node pool for CoreDNS, metrics-server, and other cluster add-ons; many teams taint it CriticalAddonsOnly=true:NoSchedule so application Pods stay off it. User workloads should land on separate user node pools so application scale-out does not starve system Pods. Windows pools need a matching nodeSelector (kubernetes.io/os: windows) on Pod specs, plus tolerations if the pool is tainted.

GKE node pools are regional or zonal; autoscaling and surge upgrades are first-class. Multi-pool clusters commonly isolate:

  • Default / general — burstable app tier
  • Spot / preemptible — batch and fault-tolerant services
  • High-memory or GPU — ML inference or training helpers

For a service-by-service map across clouds, see AWS, GCP, and Azure services mapping.

From cloud pool to Kubernetes Node object

When an instance boots, bootstrap (eksctl, cloud-init, or managed bootstrap) starts kubelet and registers with the API server. The Node object then carries:

  • Capacity vs allocatable — kube-reserved and system-reserved shrink what Pods may request.
  • Labels — zone (topology.kubernetes.io/zone), instance type, OS, and your custom labels from the group template.
  • Taints — repel Pods unless they tolerate the taint (pool isolation).
  • Conditions — Ready, disk pressure, memory pressure; the scheduler ignores NotReady nodes.

kubectl get nodes -L node.kubernetes.io/instance-type,eks.amazonaws.com/nodegroup
kubectl describe node ip-10-0-12-34.ap-south-1.compute.internal
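The gap between capacity and allocatable is plain subtraction. The following is a minimal sketch of that arithmetic for memory; the function name and the reservation figures are illustrative assumptions, not the defaults of any particular cloud image.

```python
def allocatable_mib(capacity_mib: int, kube_reserved_mib: int,
                    system_reserved_mib: int, eviction_threshold_mib: int) -> int:
    """Memory that Pods may request on a node.

    Allocatable = capacity - kube-reserved - system-reserved - hard eviction threshold.
    """
    reserved = kube_reserved_mib + system_reserved_mib + eviction_threshold_mib
    return max(capacity_mib - reserved, 0)


# Hypothetical 8 GiB node; the three reservation values are made up for illustration.
print(allocatable_mib(capacity_mib=8192, kube_reserved_mib=574,
                      system_reserved_mib=100, eviction_threshold_mib=100))
```

Comparing this number with the Allocatable section of kubectl describe node shows how much of each instance your Pods can actually claim.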

Pending Pods with a “0/X nodes are available” event usually mean one of three things: no pool has enough free CPU/memory, labels/affinity do not match, or taints block the workload. That is a scheduling problem, not necessarily a broken cluster. See Kubernetes troubleshooting playbook.
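To make the three causes concrete, here is a simplified model of how a per-node rejection summary comes about. It is a sketch, not the real scheduler: the Node and Pod classes, their fields, and the reason strings are hypothetical, and only CPU, nodeSelector, and taint keys are considered.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    free_cpu_m: int                               # unrequested CPU in millicores
    labels: dict = field(default_factory=dict)
    taints: set = field(default_factory=set)      # keys of NoSchedule taints


@dataclass
class Pod:
    cpu_request_m: int
    node_selector: dict = field(default_factory=dict)
    tolerated: set = field(default_factory=set)   # taint keys the Pod tolerates


def rejection_reason(pod: Pod, node: Node):
    """Return why this node rejects the Pod, or None if the Pod fits."""
    if node.taints - pod.tolerated:
        return "untolerated taint"
    if any(node.labels.get(k) != v for k, v in pod.node_selector.items()):
        return "node selector mismatch"
    if node.free_cpu_m < pod.cpu_request_m:
        return "insufficient cpu"
    return None


def summarize(pod: Pod, nodes: list):
    """Count fitting nodes and group the rest by rejection reason."""
    reasons = Counter(rejection_reason(pod, n) for n in nodes)
    fits = reasons.pop(None, 0)
    return fits, dict(reasons)


nodes = [
    Node("system-1", 1500, {"workload": "system"}, {"CriticalAddonsOnly"}),
    Node("app-1", 200, {"workload": "general"}),
    Node("spot-1", 4000, {"workload": "spot"}, {"spot"}),
]
pod = Pod(cpu_request_m=500, node_selector={"workload": "general"})
fits, reasons = summarize(pod, nodes)
print(f"{fits}/{len(nodes)} nodes are available: {reasons}")
```

Here the only untainted, matching node is short on CPU, so the fix is capacity in the general pool rather than a change to the Pod spec.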

Why teams split multiple groups / pools

One giant pool is seductive and usually wrong at production scale. Common split patterns:

Pool | Purpose | Typical knobs
System | Cluster add-ons, ingress, monitoring agents | Smaller stable on-demand; taints excluding app Deployments
General apps | Stateless APIs, workers | Balanced instance type; CA min ≥ 2 for HA
Spot / preemptible | Batch, queue consumers, dev/test burst | Mixed instance types; interruption handling; PDBs
GPU | ML inference, CUDA jobs | G-family SKUs; GPU device plugin; dedicated taints
Arm64 | Cost-optimized compatible images | Graviton or Ampere pools; multi-arch container builds
Windows | .NET Framework, legacy IIS-style apps | Separate pool; nodeSelector for Windows

Each extra pool adds IAM, patching, and autoscaler configuration surface. The art is enough isolation without operator fatigue—often three to five pools cover 80% of estates.

Taints, tolerations, and node selectors

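Taints on a node (or applied at pool creation) say “do not schedule here unless tolerated.” Tolerations on a Pod say “I am allowed on tainted nodes.” A toleration only permits placement; it does not attract the Pod to the pool. Node affinity / nodeSelector decide which of the permitted nodes the Pod targets, so an isolated pool usually needs both a toleration and a selector.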

# Pool created with: spot=true:NoSchedule
# Deployment fragment:
spec:
  template:
    spec:
      tolerations:
        - key: spot
          operator: Equal
          value: "true"
          effect: NoSchedule
      nodeSelector:
        capacity-type: spot

Platform teams often taint GPU and spot pools by default so a misconfigured Deployment cannot land on expensive or interruptible hardware silently.
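The matching rule between a taint and a toleration can be sketched in a few lines. This is a simplified model of the documented semantics (Equal vs Exists, empty effect matching every effect), not the scheduler's code; the class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Taint:
    key: str
    value: str
    effect: str                  # NoSchedule, PreferNoSchedule, or NoExecute


@dataclass(frozen=True)
class Toleration:
    key: Optional[str]           # None together with Exists matches every key
    operator: str = "Equal"      # "Equal" or "Exists"
    value: str = ""
    effect: str = ""             # empty matches every effect


def tolerates(tol: Toleration, taint: Taint) -> bool:
    """True if this single toleration matches this single taint."""
    if tol.effect and tol.effect != taint.effect:
        return False
    if tol.key is None:
        return tol.operator == "Exists"
    if tol.key != taint.key:
        return False
    if tol.operator == "Exists":
        return True
    return tol.value == taint.value


def schedulable(taints: list, tolerations: list) -> bool:
    """Every hard taint must be matched by at least one toleration.

    PreferNoSchedule is a soft preference, so it never blocks placement here.
    """
    hard = [t for t in taints if t.effect in ("NoSchedule", "NoExecute")]
    return all(any(tolerates(tol, t) for tol in tolerations) for t in hard)


spot_taint = Taint("spot", "true", "NoSchedule")
spot_toleration = Toleration(key="spot", operator="Equal", value="true", effect="NoSchedule")
print(schedulable([spot_taint], [spot_toleration]))  # the Deployment fragment above
print(schedulable([spot_taint], []))                 # a Pod with no tolerations
```

Note that schedulable only answers whether the taint permits the Pod; the nodeSelector in the fragment above is still what steers the Pod to the spot pool.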

Autoscaling: who scales the pool?

Two layers confuse newcomers:

  1. Pod-level: HPA / KEDA increase Deployment replicas when CPU, memory, or queue lag demand it (KEDA in depth).
  2. Node-level: Cluster Autoscaler (CA) increases ASG/MIG/VMSS size when Pods are unschedulable due to insufficient capacity, and shrinks underutilized nodes (respecting PDBs and safe eviction).
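For the Pod-level layer, the core HPA calculation is a ratio of the observed metric to its target. The sketch below shows only that formula and leaves out the tolerance band, stabilization windows, and min/max replica clamping; the function name is hypothetical.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float,
                     target_metric: float) -> int:
    """desired = ceil(current_replicas * current_metric / target_metric)."""
    return math.ceil(current_replicas * current_metric / target_metric)


# 4 replicas averaging 90% CPU against a 60% target.
print(desired_replicas(4, 90, 60))
```

The extra replicas this produces are what may then go Pending and hand the problem to the node-level layer.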

CA is node-group-aware: each group is typically one homogeneous instance profile. If your only pool is m6i.2xlarge and a Pod needs 500m CPU, CA may still add another full 2xlarge—coarse bin-packing. That limitation is why many EKS teams adopt Karpenter, which provisions per-Pod-shaped instances via NodePool CRDs (Kubernetes objects, not AWS ASG names).
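The coarse bin-packing is easy to see with numbers. The sketch below packs pending CPU requests onto identical new nodes with a first-fit-decreasing heuristic; it illustrates the effect and is not CA's actual estimator. The 7910m allocatable figure for an 8 vCPU node is an assumed value.

```python
def nodes_to_add(pending_cpu_requests_m: list, node_allocatable_cpu_m: int):
    """Pack pending Pod CPU requests onto identical new nodes.

    Returns (nodes_needed, unused_millicores_across_new_nodes).
    """
    free_per_node = []  # remaining millicores on each new node
    for request in sorted(pending_cpu_requests_m, reverse=True):
        if request > node_allocatable_cpu_m:
            raise ValueError(f"request {request}m does not fit on any node in this group")
        for i, free in enumerate(free_per_node):
            if free >= request:
                free_per_node[i] -= request
                break
        else:
            # No existing new node has room, so the group grows by one whole node.
            free_per_node.append(node_allocatable_cpu_m - request)
    return len(free_per_node), sum(free_per_node)


# One pending Pod asking for 500m on a group of 8 vCPU nodes (assumed 7910m allocatable).
print(nodes_to_add([500], 7910))
```

A single 500m Pod still costs a whole node, and most of that node sits idle until more Pods arrive.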

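Naming collision: an EKS “node group” and an AKS or GKE “node pool” are cloud-provider constructs backed by VM scaling groups, while a Karpenter “NodePool” is a Kubernetes custom resource that describes what Karpenter may provision. In architecture reviews, say “EKS managed node group” vs “Karpenter NodePool” explicitly.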

Cluster Autoscaler prerequisites (EKS-focused)

For CA to scale an EKS managed node group, the ASG must be discoverable and correctly tagged. Typical requirements include:

  • IAM permissions for CA to describe and modify ASGs.
  • Tags such as k8s.io/cluster-autoscaler/enabled and cluster ownership tags on the ASG (exact keys evolved across versions—match your CA manifest docs).
  • Pod resource requests set; CA simulates scheduling from those requests, so Pods without them never look like they need a new node.
  • No conflicting manual ASG desired capacity fights with CA (use min/max on the group, let CA set desired).

If HPA adds Pods but nodes never appear, check Pending Pod events, CA logs, ASG max size, and subnet IP capacity—not only the Deployment.
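Two of those checks, discovery tags and ASG max size, are simple enough to express as a pre-flight function. This is a sketch under assumptions: the tag keys follow the common auto-discovery convention and may differ in your CA version, and the inputs are plain values you would fetch from the AWS API yourself.

```python
def scale_up_blockers(asg_tags: dict, desired: int, max_size: int,
                      cluster_name: str) -> list:
    """List reasons why Cluster Autoscaler could not grow this ASG."""
    # Assumed auto-discovery tag keys; confirm against your CA manifest.
    required = [
        "k8s.io/cluster-autoscaler/enabled",
        f"k8s.io/cluster-autoscaler/{cluster_name}",
    ]
    blockers = [f"missing tag: {key}" for key in required if key not in asg_tags]
    if desired >= max_size:
        blockers.append(f"desired {desired} already at max {max_size}")
    return blockers


tags = {"k8s.io/cluster-autoscaler/enabled": "true"}
for problem in scale_up_blockers(tags, desired=10, max_size=10, cluster_name="prod"):
    print(problem)
```

An empty list does not prove scale-up will work (IAM and subnet IPs can still fail), but a non-empty one explains a stuck Pending Pod quickly.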

Lifecycle: create, upgrade, drain, delete

Node groups are living infrastructure:

  • Create — Rolling launch across subnets/AZs; nodes join Ready; DaemonSets must schedule (CNI, kube-proxy, observability).
  • Upgrade — Kubernetes version skew policies: control plane first, then node groups (surge pools on GKE; MNG rolling updates on EKS). Watch PodDisruptionBudgets during cordon/drain.
  • Scale-in — CA or manual drain → evict Pods → terminate instance. Stateful workloads without PDBs cause pain here.
  • Delete pool — Ensure no Pods still target labels/taints only that pool provided; migrate workloads first.

kubectl cordon ip-10-0-1-23...
kubectl drain ip-10-0-1-23... --ignore-daemonsets --delete-emptydir-data
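Drain goes through the eviction API, which refuses an eviction that would violate a PodDisruptionBudget. A minimal sketch of that check for a minAvailable budget follows; the function name is hypothetical and maxUnavailable or percentage budgets are left out.

```python
def eviction_allowed(healthy_pods: int, min_available: int) -> bool:
    """An eviction is allowed only if the budget still holds after one Pod is removed."""
    return healthy_pods - 1 >= min_available


# Three healthy replicas with minAvailable: 2 can lose one Pod to a drain.
print(eviction_allowed(healthy_pods=3, min_available=2))
# Two healthy replicas with minAvailable: 2 cannot, so the drain waits and retries.
print(eviction_allowed(healthy_pods=2, min_available=2))
```

This is why a Deployment with one replica and minAvailable: 1 can stall a node upgrade indefinitely.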

For storage-heavy Pods, draining order and volume attachment matter—tie back to PV, PVC, and StorageClass and ensure CSI DaemonSets run on every pool that needs volumes (CRI and CSI deep dive).

Networking and IP planning

Each node consumes IP addresses from the subnet (and possibly secondary ENIs for high pod density). A classic production failure mode:

  • Large instance types × many Pods × small subnets → AWS cannot launch nodes even when CA requests scale-out.
  • Cross-AZ imbalance when one subnet is full—Pods pending for “topology spread” reasons.

Plan subnets per AZ, consider prefix delegation or custom networking on EKS, and align max pods per node with CNI docs. Network design belongs with AWS network architecture design thinking, not only the Kubernetes team.
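A back-of-the-envelope calculation shows how quickly a subnet fills. The sketch assumes the AWS VPC CNI in its default secondary-IP mode (no prefix delegation), uses the ENI limits of m5.large (3 ENIs, 10 IPv4 addresses each), and takes the worst case where every node holds all of its ENI addresses; the function names are hypothetical.

```python
def max_pods(enis: int, ips_per_eni: int) -> int:
    """Max Pods per node in secondary-IP mode: ENIs * (IPs per ENI - 1) + 2."""
    return enis * (ips_per_eni - 1) + 2


def nodes_per_subnet(prefix_len: int, enis: int, ips_per_eni: int) -> int:
    """Worst-case number of nodes one subnet can host."""
    usable = 2 ** (32 - prefix_len) - 5   # AWS reserves 5 addresses in every subnet
    worst_case_per_node = enis * ips_per_eni
    return usable // worst_case_per_node


print(max_pods(enis=3, ips_per_eni=10))                        # m5.large
print(nodes_per_subnet(prefix_len=24, enis=3, ips_per_eni=10))
```

Under these assumptions a /24 holds only a handful of fully loaded nodes, which is the situation where CA requests scale-out and the launch fails for lack of addresses.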

Security and compliance on worker pools

  • Least-privilege node IAM — Nodes pull images, attach volumes, maybe call S3—nothing broad like admin policies.
  • IMDSv2, encrypted EBS, and hardened AMIs on the launch template.
  • SSH access — Prefer SSM Session Manager over open port 22; break-glass only.
  • Pod isolation — Node pools are not multi-tenant security boundaries alone; use namespaces, NetworkPolicies, and admission policy (cluster RBAC).

Regulated environments sometimes mandate dedicated pools per sensitivity tier (PCI workloads on dedicated subnets and instance types).

Infrastructure as code patterns

Node groups are almost always declared in Terraform, Pulumi, eksctl, or cloud consoles—not hand-built VMs:

  • Terraform aws_eks_node_group — Subnets, scaling config, labels, taints, launch template overrides.
  • eksctl ClusterConfig — GitOps-friendly YAML for multiple nodeGroups and IAM with IRSA.
  • AKS azurerm_kubernetes_cluster_node_pool — Separate resource per pool.
  • GKE google_container_node_pool — Per-pool autoscaling and management settings.

Store group definitions in Git, review changes like application code, and pin Kubernetes versions explicitly—see Terraform IaC for everyone and GitOps principles.

Spot and mixed-instance groups

Spot pools cut cost dramatically for fault-tolerant workloads. Operational requirements:

  • Multiple instance types in the ASG (capacity-optimized allocation on AWS).
  • Interruption handling—graceful termination notices, shorter terminationGracePeriodSeconds where safe.
  • PDBs and multiple replicas so one node loss is not an outage.
  • Optional dedicated on-demand pool as fallback (higher CA priority or Karpenter weights).

FinOps angle: spot savings only count if workloads actually tolerate eviction. Pair spot pools with per-pool chargeback tags so the savings show up in cost reports (see FinOps in plain English).

Debugging checklist

  1. kubectl get nodes -o wide — Enough Ready nodes? Correct zones?
  2. kubectl describe pod <pending-pod> — Insufficient cpu/memory, taints, affinity, or volume topology?
  3. Cloud console / CLI — ASG at max? Launch failures (capacity, IAM, subnet IPs)?
  4. CA logs — “scale-up failed,” “unschedulable pod didn’t trigger scale-up” (often missing requests or taint mismatch).
  5. DaemonSets — Missing CNI or CSI on new pool because selectors exclude it?

Label nodes consistently at pool creation (workload=gpu, environment=prod) so incidents are grep-friendly in metrics and logs.

Production pitfalls

  • Single pool for everything — No blast-radius isolation; system Pods compete with batch jobs.
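  • No resource requests — Scheduler and CA fly blind: Pods without requests pile onto existing nodes and never trigger scale-up, while inflated requests make nodes look “full” when CPU graphs show idle waste.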
  • Taint without toleration — Silent Pending Pods after new pool rollout.
  • CSI/CNI not on new pool — Storage mounts fail only on scaled-out nodes.
  • Ignoring max pods / IPs — Scale-out stalls with cloud API errors unrelated to Kubernetes.
  • Running CA and Karpenter on same ASG — Fighting controllers; migrate deliberately (Karpenter migration section).
  • Oversized instance type per group — Paying for 8 vCPU nodes hosting 200m CPU Pods because CA can only add whole nodes of the group’s instance type.

How this fits your platform stack

Node groups are the capacity contract between cloud infrastructure and the Kubernetes scheduler. Application teams feel them as labels, taints, and whether scale-out succeeds. Platform teams own subnet design, IAM, patching, pool count, and autoscaling glue.

A sensible learning path on this site:

  1. Architecture — what a node is
  2. This post — how clouds package nodes into groups/pools
  3. Day-one practices — requests, limits, probes
  4. KEDA — scale Pods on events
  5. Karpenter — scale nodes per Pod shape

Further reading

Node groups are unglamorous infrastructure—and the layer where most “Kubernetes is broken” tickets actually start. Name your pools clearly, tag them for cost, taint them on purpose, and wire autoscaling so Pods and nodes scale together.