GitHub CI/CD in Depth: From First Workflow to Production Delivery
Continuous integration proves every change integrates safely; continuous delivery keeps the mainline always releasable; continuous deployment ships to production automatically when policy allows. GitHub Actions is GitHub’s event-driven automation engine—workflows in YAML that build, test, scan, and deploy from the same repository your team already reviews in pull requests.
In short
Model CI/CD as feedback loops on Git events: workflows trigger on push/PR/tag; jobs run on runners; steps execute shell or marketplace actions. Protect main with required checks, scope secrets per environment, prefer OIDC over long-lived cloud keys, cache dependencies, pin third-party actions, and separate “build artifact” from “promote to prod” so GitOps or manual gates can own the last mile.
CI, CD, and where GitHub sits
Teams blur the acronym, but the distinctions matter for pipeline design:
| Term | What it optimizes | Typical GitHub signal |
|---|---|---|
| CI (continuous integration) | Merge conflicts and defects surface early—build, unit tests, lint on every PR | pull_request, push to feature branches |
| CD (continuous delivery) | Main is always deployable; release is a business decision (button or tag) | Green main + artifact to staging; prod deploy may need approval |
| CD (continuous deployment) | Every green merge reaches production automatically | push to main triggers prod workflow when policy allows |
GitHub is not only hosting: pull requests are the change-management unit; branch protection enforces reviews and status checks; Actions runs the automation; Environments add deployment gates and scoped secrets. That colocation—code, review, and pipeline in one place—is why many platform teams standardize on GitHub for application repos even when clusters are managed elsewhere.
If Git fundamentals are rusty, start with Git & GitHub in depth. When delivery shifts to declared cluster state, read GitOps principles for the reconcile layer Actions usually hands off to.
GitHub Actions: the object model
Think in four nested layers:
- Workflow: one YAML file under `.github/workflows/`, bound to a repository (or org) and triggered by events.
- Job: a unit of work that shares a runner; jobs in the same workflow can run in parallel or depend on each other (`needs`).
- Step: ordered commands inside a job, either `run` (shell) or `uses` (an action).
- Action: reusable step bundle (JavaScript, Docker, or composite shell).
Runners are the machines that execute jobs. GitHub-hosted runners (Ubuntu, Windows, macOS) are ephemeral VMs; self-hosted runners are your VMs or Kubernetes pods for private networks, GPUs, or compliance. Labels (runs-on: ubuntu-latest, runs-on: [self-hosted, linux, gpu]) route jobs to the right pool.
Events: what starts a workflow
The on key is your contract with the repo. Common triggers:
| Event | Use when | Caveat |
|---|---|---|
| `push` | CI on branches, deploy on main, tag releases | Forks: secrets are not exposed to workflows from fork PRs by default |
| `pull_request` | PR validation before merge | Use `pull_request` for untrusted code; `pull_request_target` only with extreme care |
| `workflow_dispatch` | Manual runs, ops playbooks | Inputs validate in YAML; audit who clicked Run |
| `schedule` (cron) | Nightly scans, drift checks | UTC only; in public repositories, scheduled workflows are disabled after 60 days without repository activity |
| `release` | Publish on GitHub Release | Pairs with semantic versioning and changelog automation |
| `workflow_call` | Reusable workflow from another repo/workflow | Pass inputs/outputs explicitly; version with tags or SHA |
```yaml
# .github/workflows/ci.yml
name: CI
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]
  workflow_dispatch:
concurrency:
  group: ci-${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true
```
Concurrency groups cancel superseded runs—valuable on busy monorepos so an old PR push does not waste minutes after a newer one. Path filters (paths / paths-ignore) skip workflows when only docs change.
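As a rough sketch of the path filters mentioned above, a trigger block can skip runs when a change touches only documentation. The `docs/**` directory and the Markdown glob are assumptions about the repository layout, not something this article prescribes.

```yaml
# Sketch: skip CI when a change touches only documentation.
# The docs/** and **/*.md patterns are assumed paths; adjust them to your repo.
on:
  pull_request:
    branches: [main]
    paths-ignore:
      - "docs/**"
      - "**/*.md"
  push:
    branches: [main]
    paths-ignore:
      - "docs/**"
      - "**/*.md"
```

One caveat to plan for: when a workflow skipped by a path filter is also a required status check, the check stays pending and blocks the merge, so path filters and required checks need to be designed together.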
A production-shaped pipeline (build → test → publish)
```yaml
name: Build and test
on:
  pull_request:
  push:
    branches: [main]
permissions:
  contents: read
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-go@v5
        with:
          go-version: "1.22"
          cache: true
      - run: go test ./... -race -count=1
  build-image:
    needs: test
    if: github.event_name == 'push' && github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write
      id-token: write # OIDC to cloud/registry
    steps:
      - uses: actions/checkout@v4
      - uses: docker/setup-buildx-action@v3
      - uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - uses: docker/build-push-action@v6
        with:
          push: true
          tags: ghcr.io/${{ github.repository }}:${{ github.sha }}
          cache-from: type=gha
          cache-to: type=gha,mode=max
```
Patterns worth copying:
- Default-deny permissions at workflow level; elevate per job.
- PR runs test only; main push publishes immutable artifacts tagged with `github.sha`.
- BuildKit + GHA cache speeds Docker layers without a bespoke registry cache.
- OIDC (`id-token: write`) enables passwordless federation to AWS, GCP, Azure; see the OIDC section below.
Jobs: parallelism, matrices, and dependencies
needs: [test] builds a DAG. strategy.matrix fans out dimensions—OS, language version, service container:
```yaml
jobs:
  integration:
    strategy:
      fail-fast: false
      matrix:
        postgres: [14, 16]
    runs-on: ubuntu-latest
    services:
      postgres:
        image: postgres:${{ matrix.postgres }}
        env:
          POSTGRES_PASSWORD: postgres
          POSTGRES_DB: app # create the database that DATABASE_URL points at
        ports: ["5432:5432"]
        # Wait until Postgres accepts connections before the steps start.
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
    steps:
      - uses: actions/checkout@v4
      - run: npm ci && npm run test:integration
        env:
          DATABASE_URL: postgres://postgres:postgres@localhost:5432/app
```
Job outputs pass data downstream: a value declared under jobs.build.outputs.image_tag is read in dependent jobs as needs.build.outputs.image_tag. Conditionals (if:) skip deploy on draft PRs or Dependabot branches.
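A minimal sketch of that output hand-off, assuming a step id of `meta` and an output named `image_tag` (both illustrative names, not taken from a real pipeline):

```yaml
# Sketch: pass an image tag from one job to the next.
jobs:
  build:
    runs-on: ubuntu-latest
    outputs:
      # Expose the step output at job level so dependent jobs can read it.
      image_tag: ${{ steps.meta.outputs.image_tag }}
    steps:
      - id: meta
        run: echo "image_tag=${GITHUB_SHA}" >> "$GITHUB_OUTPUT"
  deploy:
    needs: build
    # Skip draft pull requests; on push events the draft field is absent.
    if: github.event.pull_request.draft != true
    runs-on: ubuntu-latest
    steps:
      - run: echo "Deploying tag ${{ needs.build.outputs.image_tag }}"
```

The output only exists in jobs that list `build` under `needs`; jobs without that dependency see an empty value.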
Caching, artifacts, and reproducibility
| Mechanism | Stores | Best for |
|---|---|---|
| `actions/cache` | Dependency dirs (npm, pip, Go modules) | Faster CI; key on lockfile hash |
| `actions/upload-artifact` | Build outputs between jobs/workflows | Test reports, binaries, Terraform plans |
| Container/registry tags | Immutable images by digest or SHA | Deployable unit promoted across environments |
Artifacts expire (retention configurable); registries are the system of record for releases. Pin images with digest (image@sha256:…) in Kubernetes manifests—same discipline as Docker hardening.
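The first two rows of the table might look like this in a job. It is a sketch that assumes an npm project whose tests write into a `reports/` directory; the directory name and the 14-day retention are placeholders.

```yaml
# Sketch: cache npm downloads keyed on the lockfile, then keep the test report.
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/cache@v4
        with:
          path: ~/.npm
          # A new lockfile hash produces a new key, so stale caches are not reused as-is.
          key: npm-${{ runner.os }}-${{ hashFiles('**/package-lock.json') }}
          restore-keys: |
            npm-${{ runner.os }}-
      - run: npm ci
      - run: npm test
      - uses: actions/upload-artifact@v4
        if: always() # upload the report even when tests fail
        with:
          name: test-report
          path: reports/ # assumed output directory
          retention-days: 14
```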
Secrets, variables, and environments
Never commit credentials. GitHub offers layers:
- Repository secrets: available to workflows in that repo (encrypted at rest).
- Organization secrets: shared with selected repos; good for shared tooling tokens.
- Environment secrets: scoped to `production`, `staging`, etc., optionally requiring reviewers.
- Variables: non-secret configuration (region names, feature flags); also at repo/org/environment level.
Reference as ${{ secrets.API_TOKEN }} or ${{ vars.AWS_REGION }}. Logs mask secret values when printed accidentally—still avoid echoing them.
```yaml
jobs:
  deploy:
    runs-on: ubuntu-latest
    environment:
      name: production
      url: https://app.example.com
    steps:
      - uses: actions/checkout@v4 # deploy.sh lives in the repository
      - run: ./deploy.sh
        env:
          API_TOKEN: ${{ secrets.API_TOKEN }}
```
Environments add deployment branches/tags rules, required reviewers, wait timers, and environment-specific secrets—your “change advisory board in YAML.” Pair with deployment protection rules and GitHub’s deployment API for tracking what is live.
OIDC: federated identity to the cloud
Long-lived AWS access keys stored in GitHub secrets are rarely rotated and tend to over-privilege CI. OpenID Connect lets GitHub mint a short-lived token that AWS IAM, GCP Workload Identity Federation, or Microsoft Entra ID trusts, scoped to repo, branch, or environment.
```yaml
# AWS example (job permissions + configure-aws-credentials)
jobs:
  deploy:
    runs-on: ubuntu-latest
    permissions:
      id-token: write # lets the job request an OIDC token
      contents: read
    steps:
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/github-actions-deploy
          aws-region: ap-south-1
```
Trust policies should pin sub claims—e.g. only repo:org/app:environment:production may assume the prod role. Same pattern for EKS, S3, Terraform state buckets, and parameter stores. This is the default posture for platform teams in regulated environments.
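To make the `sub` pinning concrete, here is one possible trust policy, written as a CloudFormation snippet so it stays in YAML. It assumes the GitHub OIDC provider is already registered in the account; the account ID, the `org/app` repository, and the resource name are placeholders.

```yaml
# Sketch: an IAM role that only the production environment of one repo can assume.
Resources:
  GitHubActionsDeployRole: # hypothetical logical name
    Type: AWS::IAM::Role
    Properties:
      RoleName: github-actions-deploy
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal:
              Federated: "arn:aws:iam::123456789012:oidc-provider/token.actions.githubusercontent.com"
            Action: "sts:AssumeRoleWithWebIdentity"
            Condition:
              StringEquals:
                # Audience set by the AWS credentials action.
                "token.actions.githubusercontent.com:aud": "sts.amazonaws.com"
                # Only jobs that target the production environment of org/app match.
                "token.actions.githubusercontent.com:sub": "repo:org/app:environment:production"
```

Using `StringEquals` rather than a wildcard match keeps other branches, forks, and environments from assuming the role.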
Reusable workflows and composite actions
DRY across dozens of repos:
- Reusable workflow: `on: workflow_call` in `.github/workflows/reusable-ci.yml`; callers use `uses: org/repo/.github/workflows/reusable-ci.yml@v2` with `secrets: inherit` or explicit secret mapping.
- Composite action: `runs.using: composite` bundles shell steps; lives in `action.yml` inside the repo or a dedicated `actions` repo.
- JavaScript/Docker actions: for complex tooling; pin to commit SHA, not only `@v4`, for supply-chain safety.
```yaml
# Caller
jobs:
  ci:
    uses: my-org/platform-pipelines/.github/workflows/golang-ci.yml@v3
    with:
      go-version: "1.22"
    secrets: inherit
```
Version reusable workflows with tags or SHAs; treat breaking changes like library semver.
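For completeness, the called side could look roughly like the following. The file contents are hypothetical: only the path and the `go-version` input are implied by the caller above, and the steps are an assumed minimal Go pipeline.

```yaml
# .github/workflows/golang-ci.yml in the shared repository (hypothetical contents)
name: Go CI (reusable)
on:
  workflow_call:
    inputs:
      go-version:
        description: Go toolchain version to install
        type: string
        required: false
        default: "1.22"
permissions:
  contents: read
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-go@v5
        with:
          go-version: ${{ inputs.go-version }}
      - run: go test ./...
```

Declaring typed inputs with defaults lets callers omit `with:` entirely, which keeps adoption cheap while still leaving one place to change the pipeline.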
Delivery strategies on GitHub
| Pattern | How | Trade-off |
|---|---|---|
| Trunk-based + CD | Merge to main → workflow deploys | Requires strong tests, flags, fast rollback |
| GitHub Flow | PR CI → merge → deploy staging/prod | Simple; environments gate prod |
| Release tags | on: push: tags: ['v*'] builds release artifact | Explicit versioning; slower cadence |
| GitOps handoff | CI builds image; PR updates manifest repo or kustomize overlay | Cluster pulls desired state; see GitOps |
For Kubernetes, CI often ends at “image in registry + updated manifest commit.” Argo CD or Flux reconciles the cluster—avoid giving CI cluster-admin kubectl apply on every merge unless you accept push-based drift.
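One way to sketch the GitOps handoff is a job that runs after the image build and proposes the new tag to a manifest repository. Everything specific here is an assumption: the `my-org/app-manifests` repository, the `overlays/staging` path, the `MANIFEST_REPO_TOKEN` secret, and the availability of `kustomize` and `gh` on the runner.

```yaml
# Sketch: CI proposes the new image tag; the GitOps controller applies it after merge.
jobs:
  update-manifest:
    needs: build-image # the publishing job from the earlier pipeline
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          repository: my-org/app-manifests # hypothetical manifest repo
          token: ${{ secrets.MANIFEST_REPO_TOKEN }} # hypothetical token with write access
      - name: Point the staging overlay at the new image
        working-directory: overlays/staging # hypothetical path
        run: kustomize edit set image ghcr.io/my-org/app=ghcr.io/my-org/app:${{ github.sha }}
      - name: Open a pull request with the change
        env:
          GH_TOKEN: ${{ secrets.MANIFEST_REPO_TOKEN }}
        run: |
          git config user.name "github-actions[bot]"
          git config user.email "github-actions[bot]@users.noreply.github.com"
          git switch -c "bump-${GITHUB_SHA}"
          git commit -am "Deploy ${GITHUB_SHA} to staging"
          git push origin "bump-${GITHUB_SHA}"
          gh pr create --fill
```

The job never talks to the cluster; its only credential is write access to the manifest repo, which is the point of the split.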
Branch protection and required checks
Automation without policy is only a suggestion. On main:
- Require pull request reviews (CODEOWNERS for sensitive paths).
- Require status checks to pass, using the exact job names from workflows (e.g. `test`, `build-image`).
- Require branches to be up to date before merge (or use a merge queue where your plan and repository type support it).
- Block force pushes; optional signed commits.
Rulesets defined at the organization level apply the same rules across many repos, replacing the practice of copying branch protection settings into each repository.
Security and supply chain
CI runs arbitrary code on every PR—treat it as a high-value attack surface.
- Pin actions to full commit SHAs: `uses: actions/checkout@b4ffde65…` mitigates tag-moving attacks.
- Least-privilege `GITHUB_TOKEN`: set the default to read-only; grant `packages: write` only on publish jobs.
- Avoid `pull_request_target` for building untrusted fork code with base-repo secrets; use `pull_request` + limited permissions instead.
- Dependabot: version update PRs for actions, npm, Docker base images.
- Code scanning (CodeQL): static analysis on default branch and PRs.
- Secret scanning + push protection: block commits containing cloud keys.
- Artifact attestations / SLSA: emerging patterns to prove build provenance.
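The Dependabot item above can be sketched as a small configuration file. The three ecosystems match the ones named in the list; the weekly interval and the root directory are assumptions to adjust per repository.

```yaml
# .github/dependabot.yml
version: 2
updates:
  # Keeps pinned action versions (including SHA pins) current.
  - package-ecosystem: github-actions
    directory: "/"
    schedule:
      interval: weekly
  # Base images referenced in Dockerfiles at the repo root.
  - package-ecosystem: docker
    directory: "/"
    schedule:
      interval: weekly
  - package-ecosystem: npm
    directory: "/"
    schedule:
      interval: weekly
```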
Align with shift-left practices from DevOps life & business value: failing a PR on CVE thresholds is cheaper than explaining a prod incident.
Self-hosted runners and isolation
Use self-hosted runners when you need private RFC1918 networks, license servers, larger disks, or GPU builds. Hardening checklist:
- One runner pool per trust zone (prod deploy runners ≠ PR CI runners).
- Ephemeral runners (auto-deregister after one job) reduce persistent compromise.
- Do not mount host Docker socket into untrusted PR jobs without gVisor/Kata or separate VMs.
- Restrict which repos/workflows may use production-tagged runners via runner groups.
GitHub-hosted runners start clean each job—prefer them for open-source and standard builds when network isolation allows.
GitHub Packages and registries
GitHub Container Registry (ghcr.io) stores OCI images beside code, with permissions tied to the repo and GITHUB_TOKEN. npm, Maven, NuGet, and RubyGems registries live under GitHub Packages with similar auth. Promotion flow: CI pushes ghcr.io/org/app:sha; deploy workflow or GitOps updates digest in manifests; prod never deploys a floating :latest tag without an explicit policy.
Observability and operations
- Workflow run UI: per-step logs, annotations from `::error::` workflow commands.
- `gh run list` / `gh run watch`: terminal-first ops (see Git & GitHub).
- Notifications: failed workflows on default branch → Slack/email via integrations or custom workflow.
- Metrics: export via API or third-party; track queue time, failure rate, minutes consumed (FinOps).
When deploys fail, correlate Git SHA, workflow run URL, and application release in APM—the same triad incident guides use in incident response.
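A hedged sketch of two of the items above: a step that raises an `::error::` annotation, and a follow-up job that posts the run URL and SHA when the default branch fails. The `config/app.yml` check, the `CHAT_WEBHOOK_URL` secret, and the `{"text": ...}` payload shape (as used by Slack-style incoming webhooks) are all assumptions.

```yaml
jobs:
  checks:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Fail with an annotation if the config file is missing
        run: |
          if [ ! -f config/app.yml ]; then
            echo "::error file=config/app.yml::config/app.yml is missing"
            exit 1
          fi
  notify:
    needs: checks
    # Run only when an earlier job failed on the main branch.
    if: failure() && github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    steps:
      - name: Post the failing run to a chat webhook
        env:
          WEBHOOK_URL: ${{ secrets.CHAT_WEBHOOK_URL }} # hypothetical secret
          RUN_URL: ${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}
        run: |
          curl -fsS -X POST -H 'Content-Type: application/json' \
            -d "{\"text\": \"CI failed on main: ${RUN_URL} (${GITHUB_SHA})\"}" \
            "$WEBHOOK_URL"
```

Including both the run URL and the SHA in the message gives responders two of the three identifiers they need to correlate with the application release.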
Cost, quotas, and performance
GitHub Actions bills by minutes (included minutes on plans, then usage). Linux minutes are cheaper than macOS; larger runners cost more. Tactics:
- Path filters and concurrency cancellation on PR floods.
- Dependency caching; split slow integration into nightly `schedule` workflows.
- Reusable workflows to avoid copy-paste drift and duplicate experimentation.
- Right-size self-hosted runners for steady high volume vs per-minute cloud billing.
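Moving a slow suite to a nightly run, as suggested above, might look like this. The 02:00 UTC time and the 30-minute timeout are arbitrary choices; the test command reuses the `test:integration` script from the earlier matrix example.

```yaml
name: Nightly integration
on:
  schedule:
    - cron: "0 2 * * *" # 02:00 UTC every day
  workflow_dispatch: # allow a manual run when investigating a failure
jobs:
  integration:
    runs-on: ubuntu-latest
    timeout-minutes: 30 # cap the spend if the suite hangs
    steps:
      - uses: actions/checkout@v4
      - run: npm ci && npm run test:integration
```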
How Actions compares to Jenkins and GitLab CI
| Dimension | GitHub Actions | Jenkins | GitLab CI |
|---|---|---|---|
| Config location | In-repo YAML | Controller + Jenkinsfile (often in repo) | In-repo .gitlab-ci.yml |
| Trigger model | GitHub-native events | Webhooks, polling, manual | GitLab-native events |
| Runner model | Hosted + self-hosted | Agents/executors you operate | Shared or self-hosted runners |
| Strength | PR + policy integration on github.com | Plugins, air-gapped control | All-in-one DevOps platform |
| Ops burden | Low for hosted runners | Controller HA, plugins, upgrades | Depends on GitLab deployment |
Many enterprises run more than one system—Actions for application repos, Tekton or Jenkins for legacy, GitLab CI for specific business units. Standardize on artifact + OIDC + promotion patterns so the engine is swappable. For Jenkins architecture, Jenkinsfiles, and controller hardening, see Jenkins CI/CD in depth.
Infrastructure and Terraform in CI
IaC pipelines add plan/apply semantics. Typical PR job: terraform fmt -check, validate, plan posted as comment; apply only on merge to main with environment protection. Remote state in S3/GCS with locking prevents concurrent applies. Read Terraform & IaC for everyone for state anatomy; wire AWS/GCP roles via OIDC, not static keys in secrets.
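A sketch of the PR half of that pipeline, under several assumptions: Terraform code lives in an `infra/` directory, a read-only `github-actions-plan` role exists for OIDC, and the plan is kept as an artifact (posting it as a PR comment is left out here).

```yaml
name: Terraform plan
on:
  pull_request:
    paths: ["infra/**"] # hypothetical directory
permissions:
  contents: read
  id-token: write
jobs:
  plan:
    runs-on: ubuntu-latest
    defaults:
      run:
        working-directory: infra
    steps:
      - uses: actions/checkout@v4
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/github-actions-plan # hypothetical read-only role
          aws-region: ap-south-1
      - uses: hashicorp/setup-terraform@v3
      - run: terraform fmt -check -recursive
      - run: terraform init -input=false
      - run: terraform validate
      - run: terraform plan -input=false -out=tfplan
      - uses: actions/upload-artifact@v4
        with:
          name: tfplan
          path: infra/tfplan # artifact paths are relative to the workspace root
```

Plan files can contain sensitive values, so treat the artifact like a secret-bearing file and keep its retention short. The apply job would run on merge to main behind a protected environment and is not shown.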
Local debugging: act and dry runs
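nektos/act runs workflows locally in Docker; the result approximates GitHub’s hosted environment but is not identical to it. Use it to iterate on shell steps fast. For expression debugging, pass ${{ toJson(github) }} to a throwaway step through an environment variable and echo that variable (remove before merge); interpolating the expression straight into a quoted shell string breaks as soon as the payload contains a quote character. workflow_dispatch inputs help reproduce prod-only paths without fake commits.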
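The context dump can be sketched as a throwaway job. Routing the JSON through an environment variable keeps the shell from interpreting quotes inside the payload; the job and variable names are arbitrary.

```yaml
jobs:
  debug:
    runs-on: ubuntu-latest
    steps:
      - name: Dump the github context (remove before merge)
        env:
          GITHUB_CONTEXT: ${{ toJson(github) }}
        run: echo "$GITHUB_CONTEXT"
```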
Suggested lab progression
- Add a workflow that runs tests on every PR to a new repo.
- Enable branch protection requiring that check on `main`.
- Add caching for your package manager; measure minute delta.
- Build and push an image to GHCR on `main` only.
- Create `staging` and `production` environments; require reviewer on prod.
- Replace AWS access keys with OIDC role assumption.
- Extract a reusable workflow shared across two sample repos.
- Enable Dependabot for actions and base images; fix one security PR.
- Hand off deploy to GitOps: CI opens manifest PR; controller syncs cluster.
- Capstone: document rollback—revert commit or redeploy previous image digest.
Common pitfalls
- Secrets in logs or fork PRs: rotate credentials; tighten `pull_request` permissions.
- Floating action tags: supply-chain risk; pin SHAs for third-party actions.
- God-token `GITHUB_TOKEN`: write-all at workflow top level.
- Deploy from CI with cluster-admin: bypasses GitOps audit trail.
- No concurrency on monorepo PRs: wasted spend and confusing check status.
- Required check name drift: rename job breaks branch protection until updated.
- Caching wrong paths: poisoned cache after dependency upgrades; bust keys on lockfile change.
- Treating green CI as production SLO: add synthetic checks and observability after deploy.
How this connects to platform engineering
Platform teams productize pipelines: golden workflows, OIDC roles per account, standard environments, and documentation so product squads do not reinvent Dockerfile and deploy YAML. The outcome leaders care about—shorter lead time, lower change-fail rate—is the same DORA vocabulary in DevOps life & business value, implemented with GitHub as the control point for code and checks.
Next steps: deepen Git collaboration (Git & GitHub), cluster delivery (GitOps, Kubernetes workloads), and calm operations when pipelines green-light a bad change (incident response).
Further reading
- GitHub Docs — Actions, environments, OIDC, security hardening
- GitHub Blog — merge queue, larger runners, artifact attestations
- DORA / Accelerate — metrics for delivery performance
- OpenGitOps — delivery vs reconciliation split
- NIST SSDF — secure software development practices mapped to CI gates