Platform & culture · 2 May 2026 · Guide · By Babulal Tamang

GitOps
Kubernetes
Flux
Argo CD
Platform

GitOps Best Principles: From OpenGitOps to Production Habits That Stick

GitOps is not “Kubernetes plus a Git repo.” It is an operating model: declare how the world should look, store that declaration where humans collaborate, and let automation reconcile reality with the declaration continuously, auditably, and safely. The best teams treat those ideas as principles, not a product checkbox.

In short

Keep desired state declarative and in Git; let controllers pull and reconcile continuously; govern changes with the same PR culture as application code; never store secrets in plain text; and measure success by healthy sync plus workload SLOs, not by whether Argo CD has a green icon on Tuesday.

What GitOps is (and is not)

GitOps describes how cloud-native systems (usually Kubernetes, but the pattern travels to VMs, serverless, and data platforms) are operated when the approved configuration lives in version control and automated agents apply it. The term spread after Weaveworks articulated the pattern; today Flux (including its OCI artifact sources), Argo CD, and vendor-managed controllers implement the same ideas with different ergonomics.

GitOps does not replace CI, observability, incident response, or security programs. It complements them:

CI builds, tests, scans, and publishes artifacts (images, charts, policy bundles)—see GitHub CI/CD in depth for pipeline patterns on GitHub.
GitOps decides which approved version should run where, records that decision in Git, and keeps clusters aligned.
Runtime (metrics, traces, logs) tells you whether the declared state is actually healthy for users.

Confusing “we installed Argo CD” with “we practice GitOps” is the most common failure mode. Tools enable principles; they do not substitute for them.

The OpenGitOps principles (community baseline)

The OpenGitOps project documents four principles teams use as a shared vocabulary. Treat them as non-negotiable defaults; everything later in this article is how mature organizations implement those defaults under real constraints.

Declarative — A system’s desired state is expressed as data, not as a one-off runbook of imperative commands.
Versioned and immutable — Desired state is stored in Git (or an OCI registry with digest-pinned artifacts) so every change has history, identity, and rollback.
Automatically pulled — Software agents pull desired state toward the cluster; humans do not SSH in to “deploy Friday’s build.”
Continuously reconciled — Agents compare live state to declared state on a loop and act until they match policy—or surface why they cannot.

If your process violates any of those four, you may still ship software—but you are not getting the auditability, drift detection, or recovery semantics GitOps is bought for.
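To make the four principles concrete, here is a minimal in-memory sketch of one reconciliation pass. It is not how Flux or Argo CD are implemented; the `Action`, `plan`, and `reconcile` names are hypothetical, and resources are plain dictionaries keyed by name rather than real Kubernetes objects.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    verb: str  # "create", "update", or "delete"
    name: str  # resource name, used here as the only identity key


def plan(desired, live, prune=False):
    """Compare declared state with live state and list the actions needed to converge."""
    actions = []
    for name, spec in sorted(desired.items()):
        if name not in live:
            actions.append(Action("create", name))
        elif live[name] != spec:
            actions.append(Action("update", name))
    if prune:
        # Only remove undeclared resources when pruning is explicitly enabled.
        for name in sorted(live.keys() - desired.keys()):
            actions.append(Action("delete", name))
    return actions


def reconcile(desired, live, prune=False):
    """Run one pass against the in-memory `live` dict and return the actions taken."""
    actions = plan(desired, live, prune)
    for action in actions:
        if action.verb == "delete":
            del live[action.name]
        else:
            live[action.name] = dict(desired[action.name])
    return actions
```

Calling `reconcile` repeatedly is the "continuously reconciled" part: a second pass over converged state returns an empty list, and any later manual change to `live` shows up as a new action on the next pass.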

Push deploys vs pull-based GitOps

Many pipelines “deploy” by running kubectl apply from CI with a long-lived kubeconfig. That can work, but it is a push model: CI holds powerful credentials and must reach the API server on every release. GitOps controllers use a pull model: the cluster (or a cluster-adjacent agent) watches Git and applies changes it is allowed to make.

Aspect	Push from CI	Pull-based GitOps
Credentials	CI often needs broad cluster admin	Short-lived or scoped credentials inside the cluster; Git read access only
Drift	Manual hotfixes may never return to Git	Reconciliation detects and optionally reverts drift
Audit	“Pipeline run #4521 applied YAML”	“Commit `abc123` is live in prod” maps to PR and reviewers
Blast radius	Compromised CI can touch every cluster it knows	Compromise must beat Git + controller RBAC + policy gates
Air-gapped / private API	CI must reach API server	Cluster pulls from Git/OCI inside allowed network paths

Hybrid setups are normal: CI builds and bumps image tags in Git; the controller syncs the bump. The boundary to protect is “production shape is declared in Git, not applied only from a laptop.”

Seven production principles (beyond the install wizard)

1. Declarative desired state — reviewable diffs, not shell archaeology

Imperative scripts (“run these fifty commands in order”) hide intent and break when timing or API defaults shift. GitOps favors declarative manifests: Kubernetes YAML, Helm, Kustomize overlays, Jsonnet, or packaged OCI artifacts that describe what should exist.

Declarative models make pull requests legible: reviewers see the world after merge, not a story about commands someone ran. The same diff becomes evidence for auditors and on-call engineers retracing an incident (“what changed at 14:03 UTC?”).

Best practice: ban one-off kubectl edit in production unless break-glass policy requires it—and require a follow-up commit that matches reality within hours.

2. Git (or OCI) as the system of record — one truth, many clusters

Git is the source of truth for what is allowed to run—not a mirror of “what someone clicked in a console last Tuesday.” Commits attach author, time, and intent (messages, linked tickets). Branches and tags model promotion: main for dev, release branches or directories for staging and production.

This principle fails the moment teams “fix prod” by hand without backporting. Drift erodes trust; the rule that scales is: if it is not in Git, it does not belong in the cluster.

Best practices for repo layout:

Separate concerns: application manifests vs platform addons vs cluster bootstrap—different blast radius, different reviewers (CODEOWNERS).
Environment overlays: Kustomize bases + overlays/staging / overlays/prod or Helm values per env—avoid copy-paste YAML forks.
Pin versions: image digests, chart versions, and policy bundle tags—not floating :latest in production paths.
App-of-apps / app-of-clusters: one root Application (Argo CD) or Kustomization (Flux) that composes children so onboarding a new service is adding a folder, not relearning wiring.
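The "pin versions" rule above can be checked mechanically in CI. The sketch below is one hypothetical way to do it: it treats an image reference as pinned only when it ends in a `sha256` digest, so tags such as `:latest` or `:1.27` are reported. The function names are illustrative and not part of any tool.

```python
import re

# A pinned reference ends in "@sha256:" followed by 64 lowercase hex characters.
_DIGEST = re.compile(r"@sha256:[0-9a-f]{64}$")


def is_pinned(image):
    """Return True when the image reference carries a sha256 digest."""
    return bool(_DIGEST.search(image))


def unpinned_images(images):
    """Return the references a production path should reject."""
    return [image for image in images if not is_pinned(image)]
```

A tag is a mutable pointer, so even a specific-looking tag can change under you; the digest is what makes the declared state reproducible. Teams that accept tags in dev paths could run this check only against production directories.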

3. Automated delivery through pull-based reconciliation

A controller inside (or trusted beside) the cluster watches Git or an OCI registry and pulls desired state. It diffs live cluster resources against the declaration and applies the changes needed to converge. CI’s job ends at “artifact is vetted; manifest bump is merged,” not at holding cluster-admin tokens.

Best practice: scope controller service accounts with least-privilege RBAC: only namespaces and API groups that team owns. Platform controllers get wider roles; tenant controllers stay fenced.
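A toy model can show what "fenced" means for a tenant controller. This is not the Kubernetes authorizer: the `Rule` type and `allowed` function are hypothetical, there are no wildcards, and every rule is bound to exactly one namespace.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    namespace: str
    api_groups: frozenset
    resources: frozenset
    verbs: frozenset


def allowed(rules, namespace, api_group, resource, verb):
    """Return True only if some rule explicitly grants this exact request."""
    return any(
        rule.namespace == namespace
        and api_group in rule.api_groups
        and resource in rule.resources
        and verb in rule.verbs
        for rule in rules
    )


# Hypothetical tenant controller: Deployments in its own namespace, nothing else.
TENANT_RULES = [
    Rule(
        namespace="team-a",
        api_groups=frozenset({"apps"}),
        resources=frozenset({"deployments"}),
        verbs=frozenset({"get", "list", "create", "patch"}),
    )
]
```

The design choice is deny by default: anything not listed, such as RBAC objects or another team's namespace, is refused without needing an explicit deny rule.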

4. Continuous reconciliation and explicit drift policy

Reconciliation is not a one-time deploy. The controller re-queues work so partial applies, failed hooks, or manual edits are detected. You must decide policy upfront:

Auto-heal: revert unauthorized drift (good for strict prod; surprising if someone was debugging).
Alert-only: flag drift but do not delete human changes until reviewed.
Admission block: OPA, Kyverno, or validating webhooks stop non-Git sources from creating resources at all.

This is where GitOps meets SRE: desired state is a contract; sync failures, prune errors, and webhook denials are symptoms that the contract cannot be met—worthy of SLOs and paging, not only of a UI badge.
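The three drift policies differ only in what happens after drift is found. The sketch below models that choice with hypothetical names (`DriftPolicy`, `handle_drift`, `admit`) and dictionary-based state; the trusted identity `gitops-controller` is an assumption for illustration.

```python
from enum import Enum


class DriftPolicy(Enum):
    AUTO_HEAL = "auto-heal"
    ALERT_ONLY = "alert-only"


def find_drift(desired, live):
    """Return names whose live spec is missing or differs from the declaration."""
    return sorted(name for name, spec in desired.items() if live.get(name) != spec)


def handle_drift(desired, live, policy):
    """Always report drift; overwrite live state only under AUTO_HEAL."""
    drifted = find_drift(desired, live)
    alerts = ["drift detected: " + name for name in drifted]
    if policy is DriftPolicy.AUTO_HEAL:
        for name in drifted:
            live[name] = dict(desired[name])
    return alerts


def admit(request_source, trusted_sources=frozenset({"gitops-controller"})):
    """Admission-block variant: refuse writes that do not come from a trusted agent."""
    return request_source in trusted_sources
```

Note that both policies raise the alert. Auto-heal without a signal hides the fact that someone needed to touch production by hand, which is usually the more interesting finding.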

5. Everything reviewable — the same PR culture as application code

Infrastructure and platform changes flow through merge requests with reviewers, CI checks (lint, kubeconform, policy tests), and optional manual approval for regulated paths. Branch protection, required checks, and separation of duties map cleanly onto “who may merge to env/prod.”

Best practices:

Small, frequent manifest changes beat monthly “mega YAML” PRs nobody reads.
Policy-as-code in CI and admission: fail PRs early; fail applies at the API server if someone bypasses CI.
Break-glass exists in every org; the GitOps discipline is to record the outcome quickly—revert or promote a hotfix branch—so the system of record catches reality.
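"Who may merge to env/prod" can be expressed as a path-ownership check. The sketch below is a simplified, hypothetical model loosely shaped like a CODEOWNERS file: rules are (pattern, owners) pairs, the last matching rule is assumed to win, and a change is mergeable only when every touched path has an approving owner. It does not reproduce any hosting platform's exact matching rules.

```python
from fnmatch import fnmatchcase


def owners_for(path, rules):
    """Return the owners from the last rule whose pattern matches the path."""
    matched = []
    for pattern, owners in rules:
        if fnmatchcase(path, pattern):
            matched = owners
    return matched


def may_merge(changed_paths, approvers, rules):
    """Require at least one approving owner for every changed path."""
    approved = set(approvers)
    return all(approved & set(owners_for(path, rules)) for path in changed_paths)


# Hypothetical layout: platform owns everything, SRE owns production paths.
RULES = [
    ("*", ["@platform"]),
    ("clusters/prod/*", ["@sre"]),
]
```

A path that matches no rule has no owners and therefore blocks the merge, which is a deliberate fail-closed choice in this sketch.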

6. Security and secrets — defaults, not afterthoughts

GitOps amplifies good hygiene and punishes plaintext secrets in repos.

No plaintext secrets in Git. Use Sealed Secrets (encrypted before commit), External Secrets Operator, cloud KMS references, or workload identity federation.
RBAC and tenancy limit which namespaces or clusters a controller may touch; align with team boundaries and multi-cluster fleet design.
Supply chain: sign commits or artifacts where policy requires; verify OCI signatures; allow only approved registries via admission policy.
Policy-as-code (OPA Gatekeeper, Kyverno): “no privileged containers,” “labels required,” “resource limits mandatory”—enforced at apply time, not in a wiki.
Bootstrap discipline: the cluster and GitOps agent themselves need an audited birth story—often a small seed applied once, then everything else via Git.
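The example policies quoted above ("no privileged containers," "labels required," "resource limits mandatory") are simple enough to sketch as a function over a Pod-shaped dictionary. In practice these rules would live in OPA Gatekeeper or Kyverno; the Python below only illustrates the logic, and the required label keys `app` and `team` are assumptions.

```python
def violations(pod, required_labels=("app", "team")):
    """Return human-readable policy violations for a Pod-like dict."""
    found = []
    labels = pod.get("metadata", {}).get("labels", {})
    for key in required_labels:
        if key not in labels:
            found.append("missing label: " + key)
    for container in pod.get("spec", {}).get("containers", []):
        name = container.get("name", "<unnamed>")
        if container.get("securityContext", {}).get("privileged", False):
            found.append("privileged container: " + name)
        if not container.get("resources", {}).get("limits"):
            found.append("no resource limits: " + name)
    return found
```

Running the same check in CI and at admission gives the two layers described earlier: the PR fails early, and the API server still refuses the object if someone bypasses CI.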

7. Observability — sync health is part of the product

A rollout is not done when CI is green; it is done when the controller reports healthy sync and workloads meet SLOs. Dashboards and alerts should cover:

Application golden signals (latency, errors, saturation).
GitOps signals: sync lag, failed hooks, prune failures, manifest render errors, CRD version skew.
Change correlation: live commit SHA / Helm revision visible next to incident timelines.

On-call runbooks should answer in one glance: which commit is live? which manifest failed validation? which dependency could not be fetched?
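As a rough illustration of treating sync health as a signal, the sketch below computes sync lag and a "rollout done" verdict from a hypothetical `SyncStatus` record. The field names and the ten-minute paging threshold are assumptions, not values reported by any particular controller.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class SyncStatus:
    desired_sha: str      # commit at the head of the tracked path in Git
    applied_sha: str      # commit the controller last applied
    merged_at: datetime   # when desired_sha landed in Git
    healthy: bool         # controller-reported workload health


def sync_lag(status, now):
    """Time the cluster has been behind Git; zero once the desired commit is applied."""
    if status.applied_sha == status.desired_sha:
        return timedelta(0)
    return now - status.merged_at


def rollout_done(status, slo_ok):
    """A rollout counts as done only when it is synced, healthy, and within SLO."""
    return status.applied_sha == status.desired_sha and status.healthy and slo_ok


def should_page(status, now, max_lag=timedelta(minutes=10)):
    """Page when the cluster has trailed Git for longer than the allowed lag."""
    return sync_lag(status, now) > max_lag
```

Keeping `applied_sha` in the same record as health is what lets a runbook answer "which commit is live?" without a second lookup.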

Where CI ends and GitOps begins

Developer → PR (app code) → CI: test, scan, build image → push image to registry
                                    ↓
              Release bot or human → PR (GitOps repo): bump image tag / chart version
                                    ↓
              Merge → GitOps controller pulls → reconcile cluster → workloads roll out
                                    ↓
              Observability confirms: sync OK + SLOs OK

Keeping that seam crisp prevents “CI owns production” anti-patterns while still using CI for everything expensive or non-declarative (unit tests, SAST, integration tests against ephemeral envs).
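The "release bot bumps the image tag" step in the flow above is essentially a text edit that ends up in a pull request. Here is a minimal sketch of that edit using only a regular expression; the `bump_image` function is hypothetical, it assumes one `image:` field per line, and a real bot would more likely use a YAML-aware tool.

```python
import re


def bump_image(manifest, repository, new_ref):
    """Rewrite the tag or digest on every `image:` line that points at `repository`.

    `new_ref` includes its separator, for example ":1.4.2" or "@sha256:<digest>".
    """
    pattern = re.compile(
        r"^(?P<prefix>[ \t]*(?:-[ \t]*)?image:[ \t]*)"
        + re.escape(repository)
        + r"(?:[:@]\S+)?[ \t]*$",
        re.MULTILINE,
    )
    # A function replacement avoids backslash handling in the replacement text.
    return pattern.sub(lambda m: m.group("prefix") + repository + new_ref, manifest)
```

The bot's output is a diff, not a deployment: it still goes through review and merge, and the controller does the applying.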

Progressive delivery and operations extras

GitOps describes steady state; progressive delivery describes how you move between states safely. The best teams combine both:

Sync waves / ordering — CRDs and namespaces before workloads; databases before apps that depend on them.
Health checks and rollback — Argo Rollouts, Flagger, or service-mesh traffic splitting for canaries; automatic rollback when metrics regress.
Prune policy — explicit about whether removing a manifest deletes prod resources (dangerous) or only dev (expected).
Multi-cluster fleets — one repo path per region or tenant; promotion by merge from clusters/eu to clusters/us or by tag—not by snowflake kubectl sessions.
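Ordering by wave, the first item in the list above, reduces to "sort, then apply group by group." The sketch below assumes each resource carries a hypothetical integer `wave` field, with lower numbers applied first; it does not model how any specific controller reads or waits on waves.

```python
from itertools import groupby


def waves(resources):
    """Group resource names into ordered waves; lower wave numbers come first."""
    ordered = sorted(resources, key=lambda r: (r["wave"], r["name"]))
    return [
        [resource["name"] for resource in group]
        for _, group in groupby(ordered, key=lambda r: r["wave"])
    ]
```

A controller following this plan would apply one inner list, wait for it to become healthy, and only then move to the next, which is how CRDs and namespaces end up in place before the workloads that need them.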

Production readiness checklist

Use this as a blunt self-audit before calling GitOps “done”:

Question	Healthy answer
Can we rebuild prod from Git alone?	Yes, including addons, ingress, and observability agents
What happens if someone kubectl-patches prod?	Detected within minutes; policy defines heal vs alert
Where do secrets live?	Never plaintext in Git; rotation documented
Who can merge to prod paths?	Named owners; branch protection enforced
How do we roll back?	Revert commit or roll tag; controller converges automatically
What pages on-call?	Sync failure + user-facing SLO burn—not only pod restarts
How do new services onboard?	Template + docs; no bespoke cluster SSH setup

Common pitfalls (principles vs practice)

Tool install without declarative discipline. You automate confusion faster if every squad invents its own folder structure and naming.
Monorepo without ownership. Use CODEOWNERS, path-based policies, and platform review for shared bases.
Secrets in Git “temporarily.” Temporaries become breaches; use sealed or external secrets from day one.
Ignoring bootstrap and lifecycle. Who upgrades the controller? Who rotates Git deploy keys? Document it.
Forklift push pipelines labeled GitOps. If CI still applies YAML with cluster-admin, you may only have Git storage—not reconciliation semantics.
Environment skew as copy-paste. Prefer overlays and shared bases over three nearly identical repos that diverge silently.
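The overlay idea behind that last pitfall can be sketched as a recursive merge: one shared base, plus a small per-environment patch. This is a deliberate simplification and not Kustomize's or Helm's actual merge behavior; in particular, lists are replaced wholesale here, and `apply_overlay` is a hypothetical name.

```python
import copy


def apply_overlay(base, overlay):
    """Return a new dict where overlay values win and nested dicts merge recursively."""
    merged = copy.deepcopy(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = apply_overlay(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged
```

Because each environment only states what differs, a change to the base reaches every environment at once, and the per-environment diff stays small enough to review.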

How this fits DevOps and platform engineering

GitOps is a concrete pattern inside the broader DevOps aim: short feedback loops, shared ownership, and measurable delivery. Platform teams offer a paved road (golden templates, cluster blueprints, policy bundles) so product squads get GitOps benefits without every engineer becoming a kubectl expert overnight.

If you read DevOps history, GitOps is the chapter where infrastructure as code meets continuous reconciliation and the pull request becomes the control plane for the cloud. New to Git? See Git & GitHub in depth. New to Terraform for cloud foundations? See Anyone Can Terraform. Running workloads already? Pair this with Kubernetes architecture, cluster RBAC, and the troubleshooting playbook.
