Introduction to DevOps

A beginner's guide to DevOps principles, practices, and the tools that bring them to life

May 26, 2026
Updated regularly

DevOps is a set of practices, cultural philosophies, and tools that combine software development (Dev) and IT operations (Ops) to shorten the development lifecycle and deliver high-quality software continuously.

What is DevOps?

Before DevOps, development and operations teams worked in silos. Developers wrote code, threw it "over the wall" to operations, and ops struggled to deploy and maintain it. DevOps breaks down that wall.

What are Silos?

A silo is an organizational pattern where a team works in isolation — with its own goals, tools, processes, and definition of success — largely disconnected from adjacent teams.

In a siloed setup, a typical software company looked like this:

[ Dev Team ]        [ QA Team ]        [ Ops Team ]
 Writes code   →    Tests code    →    Deploys & runs it
 Goal: ship          Goal: find          Goal: stability
 features fast       bugs                (resist change)

Each team optimized for its own KPIs, which often conflicted:

  • Dev was measured on features shipped, so they pushed code frequently.
  • QA was measured on bugs caught, so they slowed releases down with long test cycles.
  • Ops was measured on uptime, so they resisted frequent deployments that could introduce instability.

The result: releases became rare, high-risk events. When something broke in production, teams pointed fingers because no one had full ownership of the outcome.

    Concrete signs of silo thinking:

  • "That's not my job" — a developer ignores a production alert because monitoring belongs to Ops.
  • Long handoff queues — a ticket sits for days waiting for another team to pick it up.
  • Duplicated tooling — Dev uses one set of scripts to build locally, Ops uses a completely different process in production.
  • Blame culture — post-mortems focus on who caused the outage rather than what process allowed it.

    Core idea: automate and collaborate so that code goes from a developer's laptop to production reliably, quickly, and repeatedly.

    The Three Ways (Principles)

    DevOps thinking is often framed around three foundational principles:

  • Flow — Optimize the left-to-right flow of work from Dev → Ops → Customer. Remove bottlenecks, reduce batch sizes, and limit work-in-progress.
  • Feedback — Create fast feedback loops at every stage so problems are caught and fixed close to where they originate.
  • Continual Learning — Foster a culture of experimentation, blameless post-mortems, and learning from failures.

    The DevOps Lifecycle

    Plan → Code → Build → Test → Release → Deploy → Operate → Monitor
      ^___________________________________________________|
                        (continuous loop)
    
    | Stage   | What happens                                                      |
    |---------|-------------------------------------------------------------------|
    | Plan    | Define features, track work (Jira, Linear, GitHub Issues)         |
    | Code    | Write source code, peer review via pull requests                  |
    | Build   | Compile or package the application                                |
    | Test    | Automated unit, integration, and end-to-end tests                 |
    | Release | Tag a version, produce an artifact (Docker image, binary)         |
    | Deploy  | Push the artifact to an environment (staging, production)         |
    | Operate | Run and maintain the live system                                  |
    | Monitor | Observe metrics, logs, and alerts; feed insights back to Plan     |

    Key Practices

    Continuous Integration (CI)

    Developers merge code changes to a shared branch frequently — at least once a day. Each merge triggers an automated build and test run.

    Why it matters: Catch integration bugs early, when they are cheap to fix.

    Push code → CI server picks it up → Build → Run tests → Pass/Fail notification
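
The flow above can be sketched in a few lines of Python. This is only an illustration of the fail-fast idea, not how any particular CI server is implemented; `run_pipeline`, `Stage`, and the stage names are hypothetical.

```python
from typing import Callable, List, Tuple

# A stage is a (name, callable) pair; the callable returns True on success.
Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> Tuple[bool, List[str]]:
    """Run stages in order and stop at the first failure (fail fast).

    Returns (passed, log), where log records the outcome of each stage run.
    """
    log: List[str] = []
    for name, step in stages:
        ok = step()
        log.append(f"{name}: {'pass' if ok else 'fail'}")
        if not ok:
            return False, log  # later stages are skipped
    return True, log


if __name__ == "__main__":
    # Hypothetical stages standing in for a real build and test run.
    stages = [
        ("build", lambda: True),
        ("unit tests", lambda: 1 + 1 == 2),
        ("integration tests", lambda: True),
    ]
    passed, log = run_pipeline(stages)
    print("\n".join(log))
    print("PIPELINE", "PASSED" if passed else "FAILED")
```

Stopping at the first failing stage keeps feedback fast: a broken build is reported without waiting for the test stages to run.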
    

    Continuous Delivery (CD)

    Every change that passes CI is automatically packaged and made ready for deployment to any environment at the push of a button.

    Continuous Deployment

    One step further than Continuous Delivery — passing changes are deployed to production automatically, with no manual approval.

    CI (pass) → staging deploy → smoke test → production deploy
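
The difference between Continuous Delivery and Continuous Deployment comes down to a single manual gate, which the sketch below tries to make explicit. The `Change` fields and the `mode` values are assumptions made for illustration, not part of any real tool.

```python
from dataclasses import dataclass


@dataclass
class Change:
    """A change that has gone through the pipeline. All fields are illustrative."""
    ci_passed: bool
    smoke_test_passed: bool
    manually_approved: bool = False


def should_deploy_to_production(change: Change, mode: str) -> bool:
    """Decide whether a change reaches production.

    mode is "delivery" (a human must approve the release) or
    "deployment" (passing changes go out automatically).
    """
    if mode not in ("delivery", "deployment"):
        raise ValueError(f"unknown mode: {mode!r}")
    if not (change.ci_passed and change.smoke_test_passed):
        return False  # failing changes never ship in either mode
    if mode == "delivery":
        return change.manually_approved
    return True


if __name__ == "__main__":
    change = Change(ci_passed=True, smoke_test_passed=True)
    print("delivery:  ", should_deploy_to_production(change, "delivery"))
    print("deployment:", should_deploy_to_production(change, "deployment"))
```

In both modes the automated checks are identical; only the final approval step differs.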
    

    Infrastructure as Code (IaC)

    Servers, networks, and databases are defined in code files and version-controlled like application code.

    Benefits:

  • Reproducible environments
  • Eliminates "works on my machine" drift
  • Changes are reviewable and auditable

    Popular tools: Terraform, Pulumi, AWS CloudFormation
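
The declarative idea behind IaC can be illustrated with a toy "plan" step: describe the desired state as data, compare it with what currently exists, and list the changes needed. This is a simplified sketch and not how Terraform, Pulumi, or CloudFormation work internally; the resource names and fields are made up.

```python
from typing import Dict, List

Resources = Dict[str, dict]  # resource name -> configuration


def plan(desired: Resources, actual: Resources) -> List[str]:
    """Compare desired and actual state and list the actions needed."""
    actions: List[str] = []
    for name in sorted(desired.keys() | actual.keys()):
        if name not in actual:
            actions.append(f"create {name}")
        elif name not in desired:
            actions.append(f"delete {name}")
        elif desired[name] != actual[name]:
            actions.append(f"update {name}")
    return actions


if __name__ == "__main__":
    # Hypothetical desired state, as it might be written in a versioned file.
    desired = {
        "web-server": {"size": "small", "count": 2},
        "database": {"engine": "postgres", "storage_gb": 20},
    }
    # Hypothetical state of what is currently running.
    actual = {
        "web-server": {"size": "small", "count": 1},
        "old-cache": {"engine": "redis"},
    }
    for action in plan(desired, actual):
        print(action)
```

Because the desired state is plain data in a file, a change to it can be reviewed in a pull request before anything is applied.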

    Monitoring and Observability

    You can't improve what you can't measure. Observability is built on three pillars:

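    | Pillar  | What it tells you                                           | Tools                          |
    |---------|-------------------------------------------------------------|--------------------------------|
    | Metrics | Numeric measurements over time (CPU, request rate, error %) | Prometheus, Datadog            |
    | Logs    | Timestamped event records                                   | Loki, CloudWatch, ELK Stack    |
    | Traces  | End-to-end path of a single request                         | Jaeger, Zipkin, OpenTelemetry  |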
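
To make the three pillars concrete, here is a small in-memory sketch that records a metric, a log line, and a trace span for one request, using only the standard library. The `Telemetry` class and its field names are hypothetical; real systems would send this data to tools like the ones listed above.

```python
import json
import time
import uuid
from collections import Counter
from typing import Dict, List


class Telemetry:
    """In-memory stand-in for a metrics store, a log sink, and a tracer."""

    def __init__(self) -> None:
        self.metrics: Counter = Counter()  # metrics: numeric values aggregated over time
        self.logs: List[str] = []          # logs: timestamped event records
        self.spans: List[Dict] = []        # traces: timed steps linked by a trace ID

    def handle_request(self, path: str) -> str:
        """Pretend to serve a request and record all three signals for it."""
        trace_id = uuid.uuid4().hex
        start = time.time()

        self.metrics[f"requests_total:{path}"] += 1
        self.logs.append(json.dumps({
            "ts": start, "level": "info", "msg": "request handled",
            "path": path, "trace_id": trace_id,
        }))
        self.spans.append({
            "trace_id": trace_id, "name": "handle_request",
            "duration_s": time.time() - start,
        })
        return trace_id


if __name__ == "__main__":
    telemetry = Telemetry()
    telemetry.handle_request("/checkout")
    telemetry.handle_request("/checkout")
    print(dict(telemetry.metrics))
    print(telemetry.logs[0])
```

Putting the same `trace_id` in the log line and the span is what lets you jump from a log entry to the full path of the request it belongs to.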

    DORA Metrics

    DORA (DevOps Research and Assessment) metrics are four widely used measurements for assessing how well a software team is delivering. They were established through years of research by the DORA team (now part of Google Cloud) and published in the State of DevOps reports. The benchmark figures quoted in this section are representative; the exact thresholds shift from one report year to the next.

    The key insight: high performers excel at all four simultaneously — speed and stability are not a trade-off, they reinforce each other.

    | Metric                       | What it measures                                  | Elite benchmark                |
    |------------------------------|---------------------------------------------------|--------------------------------|
    | Deployment Frequency         | How often you deploy to production                | On-demand (multiple times/day) |
    | Lead Time for Changes        | Time from commit to running in production         | Less than 1 hour               |
    | Change Failure Rate          | % of deployments that cause a production incident | 0–5%                           |
    | Mean Time to Recovery (MTTR) | How fast you restore service after an incident    | Less than 1 hour               |

    Deployment Frequency

    Measures how often your team ships to production. Low frequency usually signals large, risky batches — the bigger the release, the harder it is to isolate what broke.

    Low performer: monthly or less
    Elite performer: multiple deploys per day
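
As a rough sketch, deployment frequency can be computed from a list of production deployment timestamps. The function name and the choice to average over the observed window (first to last deployment day, inclusive) are assumptions of this example.

```python
from datetime import datetime
from typing import List


def deployments_per_day(deploy_times: List[datetime]) -> float:
    """Average number of production deployments per day over the observed window."""
    if not deploy_times:
        return 0.0
    first, last = min(deploy_times), max(deploy_times)
    days = (last.date() - first.date()).days + 1  # count both end days
    return len(deploy_times) / days


if __name__ == "__main__":
    # Made-up deployment timestamps.
    deploys = [
        datetime(2026, 5, 1, 9, 0),
        datetime(2026, 5, 1, 15, 0),
        datetime(2026, 5, 2, 10, 0),
        datetime(2026, 5, 4, 10, 0),
    ]
    print(f"{deployments_per_day(deploys):.2f} deployments per day")
```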

    Lead Time for Changes

    The elapsed time from a developer committing code to that code running in production. Long lead times point to slow CI pipelines, manual approval gates, or infrequent merges.

    Low performer: 1 month to 6 months
    Elite performer: less than 1 hour
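
A minimal sketch of the calculation, assuming you already have a commit time and a deployment time for each change. The median is used here so that one unusually slow change does not dominate the result; that is a choice made for this example.

```python
from datetime import datetime, timedelta
from statistics import median
from typing import List, Tuple


def median_lead_time(changes: List[Tuple[datetime, datetime]]) -> timedelta:
    """Median time between commit and production deployment.

    Each item is a (committed_at, deployed_at) pair.
    """
    if not changes:
        raise ValueError("no changes to measure")
    seconds = [
        (deployed - committed).total_seconds()
        for committed, deployed in changes
    ]
    return timedelta(seconds=median(seconds))


if __name__ == "__main__":
    # Made-up (committed_at, deployed_at) pairs.
    changes = [
        (datetime(2026, 5, 1, 9, 0), datetime(2026, 5, 1, 9, 30)),
        (datetime(2026, 5, 1, 10, 0), datetime(2026, 5, 1, 12, 0)),
        (datetime(2026, 5, 2, 8, 0), datetime(2026, 5, 2, 18, 0)),
    ]
    print("median lead time:", median_lead_time(changes))
```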

    Change Failure Rate

    The percentage of production deployments that result in a degraded service or require a hotfix/rollback. A high rate signals insufficient testing, missing feature flags, or poor deployment practices.

    Low performer: 46–60%
    Elite performer: 0–5%
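
The arithmetic is simple once each deployment is labeled as having caused a failure or not. The sketch below assumes that labeling has already been done (for example, by linking incidents to deployments); the function name is hypothetical.

```python
from typing import List


def change_failure_rate(deploy_caused_failure: List[bool]) -> float:
    """Percentage of deployments flagged as causing a failure in production."""
    if not deploy_caused_failure:
        return 0.0
    failures = sum(deploy_caused_failure)  # True counts as 1
    return 100.0 * failures / len(deploy_caused_failure)


if __name__ == "__main__":
    # Made-up history: 20 deployments, 2 of which needed a rollback or hotfix.
    history = [False] * 18 + [True] * 2
    print(f"change failure rate: {change_failure_rate(history):.1f}%")
```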

    Mean Time to Recovery (MTTR)

    How quickly you restore normal service after a production incident. Teams with low MTTR invest in observability, runbooks, and on-call processes so they can diagnose and fix fast.

    Low performer: 1 week to 1 month
    Elite performer: less than 1 hour
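
A minimal sketch of the MTTR calculation, assuming each incident has a start time and a resolution time. The function name and the sample data are made up for illustration.

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def mean_time_to_recovery(incidents: List[Tuple[datetime, datetime]]) -> timedelta:
    """Average of (resolved_at - started_at) across incidents."""
    if not incidents:
        raise ValueError("no incidents to measure")
    total = sum(
        (resolved - started for started, resolved in incidents),
        timedelta(),  # start value so that timedeltas can be summed
    )
    return total / len(incidents)


if __name__ == "__main__":
    # Made-up (started_at, resolved_at) pairs.
    incidents = [
        (datetime(2026, 5, 3, 14, 0), datetime(2026, 5, 3, 14, 30)),
        (datetime(2026, 5, 10, 2, 0), datetime(2026, 5, 10, 3, 30)),
    ]
    print("MTTR:", mean_time_to_recovery(incidents))
```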

    Tracking DORA Metrics

    You can derive these from your existing tools:

  • Git history — commit timestamps and deployment tags give lead time and frequency
  • Incident trackers — PagerDuty, OpsGenie give MTTR and change failure rate
  • Dedicated platforms: LinearB aggregates Git, CI/CD, and issue tracker data to surface DORA metrics automatically
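
As a small illustration of the Git history route, the sketch below parses text in the shape produced by `git log --format="%H %cI"` (commit hash followed by the committer date in strict ISO 8601). It does not run Git itself, and the hashes and dates in the sample are invented.

```python
from datetime import datetime
from typing import Dict


def parse_commit_times(git_log_output: str) -> Dict[str, datetime]:
    """Parse lines of '<hash> <ISO 8601 committer date>' into a mapping."""
    commits: Dict[str, datetime] = {}
    for line in git_log_output.strip().splitlines():
        commit_hash, iso_date = line.split(maxsplit=1)
        commits[commit_hash] = datetime.fromisoformat(iso_date.strip())
    return commits


if __name__ == "__main__":
    # Invented sample output; real hashes are 40 hexadecimal characters.
    sample = (
        "abc123 2026-05-01T10:00:00+00:00\n"
        "def456 2026-05-01T12:30:00+02:00\n"
    )
    for commit_hash, committed_at in parse_commit_times(sample).items():
        print(commit_hash, committed_at.isoformat())
```

Pairing these commit times with the timestamps of your deployment tags gives the inputs needed for lead time and deployment frequency.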

    Common DevOps Tools

    Version Control

  • Git — the universal standard
  • GitHub / GitLab / Bitbucket — hosting, code review, and CI/CD integration

    CI/CD Pipelines

  • GitHub Actions — tightly integrated with GitHub repos
  • GitLab CI — built into GitLab, YAML-based pipelines
  • CircleCI / Jenkins — flexible, self-hosted or cloud

    Containerization

  • Docker — package an app and its dependencies into a portable image
  • Kubernetes — orchestrate containers at scale across clusters

    Infrastructure as Code

  • Terraform — cloud-agnostic provisioning
  • Ansible — configuration management and automation

    Cloud Platforms

  • AWS, GCP, Azure — managed infrastructure, databases, networking, and more

    DevOps vs. Traditional IT

    | Aspect           | Traditional          | DevOps                   |
    |------------------|----------------------|--------------------------|
    | Release cadence  | Monthly / quarterly  | Daily / on-demand        |
    | Team structure   | Dev and Ops siloed   | Cross-functional teams   |
    | Failure response | Blame-oriented       | Blameless post-mortems   |
    | Infra changes    | Manual, undocumented | Code-reviewed, automated |
    | Feedback loop    | Weeks                | Minutes                  |

    A Simple CI/CD Example

    A minimal GitHub Actions workflow that installs dependencies and runs the tests of a Node.js app on every push and pull request:

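    # .github/workflows/ci.yml
    name: CI

    # Run on every push and on every pull request
    on: [push, pull_request]

    jobs:
      build-and-test:
        runs-on: ubuntu-latest
        steps:
          # Check out the repository contents
          - uses: actions/checkout@v4
          # Install Node.js 20
          - uses: actions/setup-node@v4
            with:
              node-version: 20
          # Install the exact dependency versions from the lockfile
          - run: npm ci
          # Run the project's test script
          - run: npm test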
    

    This single file gives you:

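  • An automated install and test run on every push and pull request
  • Test results on every PR (to block merging on failure, mark the check as required in the repository's branch protection rules)
  • A visible green/red status on every commit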

    Where to Go Next

    Once comfortable with the basics, explore these areas in depth:

  • Containers: Docker Essentials
  • Orchestration: Kubernetes Essentials
  • Pipelines: GitHub Actions · GitLab CI
  • Cloud: Amazon ECS · Amazon EC2