Introduction to DevOps

DevOps is a set of practices, cultural philosophies, and tools that combine software development (Dev) and IT operations (Ops) to shorten the development lifecycle and deliver high-quality software continuously.

What is DevOps?

Before DevOps, development and operations teams worked in silos. Developers wrote code, threw it "over the wall" to operations, and ops struggled to deploy and maintain it. DevOps breaks down that wall.

What are Silos?

A silo is an organizational pattern where a team works in isolation — with its own goals, tools, processes, and definition of success — largely disconnected from adjacent teams.

In a siloed setup, a typical software company looked like this:

[ Dev Team ]        [ QA Team ]        [ Ops Team ]
 Writes code   →    Tests code    →    Deploys & runs it
 Goal: ship          Goal: find          Goal: stability
 features fast       bugs                (resist change)

Each team optimized for its own KPIs, which often conflicted:

Dev was measured on features shipped, so they pushed code frequently.

QA was measured on bugs caught, so they slowed releases down with long test cycles.

Ops was measured on uptime, so they resisted frequent deployments that could introduce instability.

The result: releases became rare, high-risk events. When something broke in production, teams pointed fingers because no one had full ownership of the outcome.

Concrete signs of silo thinking:

"That's not my job" — a developer ignores a production alert because monitoring belongs to Ops.

Long handoff queues — a ticket sits for days waiting for another team to pick it up.

Duplicated tooling — Dev uses one set of scripts to build locally, Ops uses a completely different process in production.

Blame culture — post-mortems focus on who caused the outage rather than what process allowed it.

Core idea: automate and collaborate so that code goes from a developer's laptop to production reliably, quickly, and repeatedly.

The Three Ways (Principles)

DevOps thinking is often framed around three foundational principles:

Flow — Optimize the left-to-right flow of work from Dev → Ops → Customer. Remove bottlenecks, reduce batch sizes, and limit work-in-progress.

Feedback — Create fast feedback loops at every stage so problems are caught and fixed close to where they originate.

Continual Learning — Foster a culture of experimentation, blameless post-mortems, and learning from failures.

The DevOps Lifecycle

Plan → Code → Build → Test → Release → Deploy → Operate → Monitor
  ^___________________________________________________|
                    (continuous loop)

Stage	What happens
Plan	Define features, track work (Jira, Linear, GitHub Issues)
Code	Write source code, peer review via pull requests
Build	Compile or package the application
Test	Automated unit, integration, and end-to-end tests
Release	Tag a version, produce an artifact (Docker image, binary)
Deploy	Push the artifact to an environment (staging, production)
Operate	Run and maintain the live system
Monitor	Observe metrics, logs, and alerts; feed insights back to Plan

Key Practices

Continuous Integration (CI)

Developers merge code changes to a shared branch frequently — at least once a day. Each merge triggers an automated build and test run.

Why it matters: Catch integration bugs early, when they are cheap to fix.

Push code → CI server picks it up → Build → Run tests → Pass/Fail notification
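The flow above can be sketched as a tiny fail-fast pipeline runner. This is an illustration only, not how any particular CI server is implemented; the `run_pipeline` helper and the stage names are hypothetical.

```python
from typing import Callable, List, Tuple

# A stage is a name plus a callable that returns True on success.
Stage = Tuple[str, Callable[[], bool]]

def run_pipeline(stages: List[Stage]) -> Tuple[bool, List[str]]:
    """Run stages in order and stop at the first failure (fail fast)."""
    log: List[str] = []
    for name, step in stages:
        ok = step()
        log.append(f"{name}: {'pass' if ok else 'fail'}")
        if not ok:
            return False, log  # later stages never run
    return True, log

# Hypothetical stages standing in for real build and test commands.
stages = [
    ("build", lambda: True),
    ("unit tests", lambda: False),
    ("integration tests", lambda: True),
]

passed, log = run_pipeline(stages)
print("PASS" if passed else "FAIL", log)
```

Stopping at the first failing stage keeps feedback fast: the developer hears about the broken unit test without waiting for the slower integration suite.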

Continuous Delivery (CD)

Every change that passes CI is automatically packaged and kept ready to deploy to any environment at the push of a button. The decision to release to production stays with a person.

Continuous Deployment

One step further than Continuous Delivery — passing changes are deployed to production automatically, with no manual approval.

CI (pass) → staging deploy → smoke test → production deploy
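The difference between Continuous Delivery and Continuous Deployment comes down to a single gate before production. A minimal sketch, assuming a hypothetical `promote` function with an `auto_deploy` flag (neither is part of any real tool):

```python
from typing import Callable, List

def promote(ci_passed: bool,
            smoke_test: Callable[[], bool],
            auto_deploy: bool,
            approved: bool = False) -> List[str]:
    """Return the steps performed for one change."""
    steps: List[str] = []
    if not ci_passed:
        return steps  # a change that fails CI goes nowhere

    steps.append("staging deploy")
    if not smoke_test():
        steps.append("smoke test failed")
        return steps
    steps.append("smoke test passed")

    # Continuous Deployment: no human gate.
    # Continuous Delivery: wait for an explicit approval.
    if auto_deploy or approved:
        steps.append("production deploy")
    else:
        steps.append("awaiting approval")
    return steps

print(promote(True, lambda: True, auto_deploy=True))
print(promote(True, lambda: True, auto_deploy=False))
```

With `auto_deploy=False` the change stops at "awaiting approval" until someone passes `approved=True`, which is the Continuous Delivery behaviour.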

Infrastructure as Code (IaC)

Servers, networks, and databases are defined in code files and version-controlled like application code.

Benefits:

Reproducible environments

Eliminates "works on my machine" drift

Changes are reviewable and auditable

Popular tools: Terraform, Pulumi, AWS CloudFormation
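The idea behind declarative IaC is to compare the desired state in code with what actually exists and derive the changes needed. The sketch below illustrates that idea only; it is not the algorithm any of the tools above use, and the `plan` function and resource names are hypothetical.

```python
from typing import Dict, List

# Resource name -> its configuration.
Resources = Dict[str, Dict[str, object]]

def plan(desired: Resources, actual: Resources) -> List[str]:
    """List the actions needed to make `actual` match `desired`."""
    actions: List[str] = []
    for name in sorted(desired.keys() | actual.keys()):
        if name not in actual:
            actions.append(f"create {name}")
        elif name not in desired:
            actions.append(f"delete {name}")
        elif desired[name] != actual[name]:
            actions.append(f"update {name}")
    return actions

# Desired state, as it would be written in a version-controlled file.
desired = {"web": {"size": "small", "count": 3}, "db": {"engine": "postgres"}}
# Actual state, as it would be read back from the environment.
actual = {"web": {"size": "small", "count": 2}, "cache": {"engine": "redis"}}

print(plan(desired, actual))
```

Because the plan is computed from files, it can be shown in a pull request and reviewed before anything is applied.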

Monitoring and Observability

You can't improve what you can't measure. Observability is built on three pillars:

Pillar	What it tells you	Tools
Metrics	Numeric measurements over time (CPU, request rate, error %)	Prometheus, Datadog
Logs	Timestamped event records	Loki, CloudWatch, ELK Stack
Traces	End-to-end path of a single request	Jaeger, Zipkin, OpenTelemetry
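To make the three pillars concrete, the sketch below records a metric, a log line, and a trace span for the same request using only the Python standard library. It is a toy, not a replacement for the tools in the table; `handle_request` and the in-memory stores are hypothetical.

```python
import json
import time
import uuid
from collections import Counter

metrics = Counter()   # metrics: numeric counts aggregated over time
log_lines = []        # logs: timestamped event records
spans = []            # traces: timed steps that share a trace_id

def handle_request(path: str) -> str:
    """Serve one fake request and record all three signals for it."""
    trace_id = uuid.uuid4().hex
    start = time.perf_counter()
    status = 200 if path == "/health" else 404  # stand-in for real work

    metrics[("requests_total", status)] += 1
    log_lines.append(json.dumps(
        {"ts": time.time(), "trace_id": trace_id, "path": path, "status": status}
    ))
    spans.append({
        "trace_id": trace_id,
        "name": "handle_request",
        "duration_s": time.perf_counter() - start,
    })
    return trace_id

handle_request("/health")
handle_request("/missing")
print(dict(metrics))
print(log_lines[0])
```

The shared `trace_id` is what lets you jump from a spike in a metric to the matching log lines and then to the trace of one slow request.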

DORA Metrics

DORA (DevOps Research and Assessment) metrics are four widely used measurements for assessing how well a software team is delivering. They were established through years of research by the DORA team (now part of Google Cloud) and published in the State of DevOps reports. The benchmark thresholds shift from one annual report to the next, so treat the figures below as indicative rather than fixed.

The key insight: high performers excel at all four simultaneously. Speed and stability are not a trade-off; they reinforce each other.

Metric	What it measures	Elite benchmark
Deployment Frequency	How often you deploy to production	On-demand (multiple times/day)
Lead Time for Changes	Time from commit to running in production	Less than 1 hour
Change Failure Rate	% of deployments that cause a production incident	0–5%
Mean Time to Recovery (MTTR)	How fast you restore service after an incident	Less than 1 hour

Deployment Frequency

Measures how often your team ships to production. Low frequency usually signals large, risky batches — the bigger the release, the harder it is to isolate what broke.

Low performer: monthly or less

Elite performer: multiple deploys per day
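As a rough illustration, deployment frequency can be computed from a list of production deploy dates. The `deployments_per_week` helper and the sample dates are hypothetical; a real calculation would read the dates from your deployment tool.

```python
from datetime import date
from typing import List

def deployments_per_week(deploy_dates: List[date]) -> float:
    """Average production deployments per week over the observed span."""
    if not deploy_dates:
        return 0.0
    # Count both the first and the last day of the span.
    span_days = (max(deploy_dates) - min(deploy_dates)).days + 1
    return len(deploy_dates) / (span_days / 7)

# Hypothetical sample: four deploys across a 14-day window.
deploys = [date(2024, 1, 1), date(2024, 1, 3), date(2024, 1, 3), date(2024, 1, 14)]
print(deployments_per_week(deploys))
```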

Lead Time for Changes

The elapsed time from a developer committing code to that code running in production. Long lead times point to slow CI pipelines, manual approval gates, or infrequent merges.

Low performer: 1 month to 6 months

Elite performer: less than 1 hour
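A minimal sketch of the calculation, assuming you already have (commit time, deploy time) pairs for each change. The `median_lead_time` helper and the sample data are hypothetical. The median is used here because a single slow change would otherwise skew the result.

```python
from datetime import datetime, timedelta
from statistics import median
from typing import List, Tuple

def median_lead_time(changes: List[Tuple[datetime, datetime]]) -> timedelta:
    """Median time from commit to running in production."""
    seconds = [
        (deployed - committed).total_seconds()
        for committed, deployed in changes
    ]
    return timedelta(seconds=median(seconds))

# Hypothetical sample: (committed_at, deployed_at) per change.
changes = [
    (datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 9, 40)),   # 40 minutes
    (datetime(2024, 1, 1, 10, 0), datetime(2024, 1, 1, 13, 0)),  # 3 hours
    (datetime(2024, 1, 2, 8, 0), datetime(2024, 1, 2, 8, 50)),   # 50 minutes
]
print(median_lead_time(changes))
```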

Change Failure Rate

The percentage of production deployments that result in a degraded service or require a hotfix/rollback. A high rate signals insufficient testing, missing feature flags, or poor deployment practices.

Low performer: 46–60%

Elite performer: 0–5%
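The arithmetic is a simple ratio. The sketch below assumes each deployment record carries a hypothetical `caused_incident` flag; how you set that flag (rollback, hotfix, linked incident) is a team decision.

```python
from typing import Dict, List

def change_failure_rate(deployments: List[Dict[str, bool]]) -> float:
    """Percentage of deployments that led to a degraded service."""
    if not deployments:
        return 0.0
    failed = sum(1 for d in deployments if d["caused_incident"])
    return 100.0 * failed / len(deployments)

# Hypothetical sample: 20 deployments, one of which needed a rollback.
deployments = [{"caused_incident": False}] * 19 + [{"caused_incident": True}]
print(change_failure_rate(deployments))
```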

Mean Time to Recovery (MTTR)

How quickly you restore normal service after a production incident. Teams with low MTTR invest in observability, runbooks, and on-call processes so they can diagnose and fix fast.

Low performer: 1 week to 1 month

Elite performer: less than 1 hour
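As a sketch, MTTR is the average of (resolved minus started) across incidents. The `mean_time_to_recovery` helper and the sample incidents are hypothetical; in practice the timestamps would come from your incident tracker.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

def mean_time_to_recovery(incidents: List[Tuple[datetime, datetime]]) -> timedelta:
    """Average time from incident start to service restored."""
    if not incidents:
        raise ValueError("no incidents to average")
    total = sum(
        (resolved - started for started, resolved in incidents),
        timedelta(),  # start value so timedeltas can be summed
    )
    return total / len(incidents)

# Hypothetical sample: (started_at, resolved_at) per incident.
incidents = [
    (datetime(2024, 2, 1, 14, 0), datetime(2024, 2, 1, 14, 30)),  # 30 minutes
    (datetime(2024, 2, 9, 3, 0), datetime(2024, 2, 9, 4, 30)),    # 90 minutes
]
print(mean_time_to_recovery(incidents))
```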

Tracking DORA Metrics

You can derive these from your existing tools:

Git history — commit timestamps and deployment tags give lead time and frequency

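Incident trackers — PagerDuty and Opsgenie record incident start and resolution times, which give MTTR; linking incidents back to the deployments that caused them gives change failure rate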

Dedicated platforms — LinearB aggregates Git, CI/CD, and issue tracker data to surface DORA metrics automatically

Common DevOps Tools

Version Control

Git — the universal standard

GitHub / GitLab / Bitbucket — hosting, code review, and CI/CD integration

CI/CD Pipelines

GitHub Actions — tightly integrated with GitHub repos

GitLab CI — built into GitLab, YAML-based pipelines

CircleCI / Jenkins — flexible alternatives: CircleCI is primarily cloud-hosted, Jenkins is open source and self-hosted

Containerization

Docker — package an app and its dependencies into a portable image

Kubernetes — orchestrate containers at scale across clusters

Infrastructure as Code

Terraform — cloud-agnostic provisioning

Ansible — configuration management and automation

Cloud Platforms

AWS, GCP, Azure — managed infrastructure, databases, networking, and more

DevOps vs. Traditional IT

Aspect	Traditional	DevOps
Release cadence	Monthly / quarterly	Daily / on-demand
Team structure	Dev and Ops siloed	Cross-functional teams
Failure response	Blame-oriented	Blameless post-mortems
Infra changes	Manual, undocumented	Code-reviewed, automated
Feedback loop	Weeks	Minutes

A Simple CI/CD Example

A minimal GitHub Actions workflow that installs dependencies and runs the tests of a Node.js app on every push and pull request:

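# .github/workflows/ci.yml
name: CI

# Run on every push and on every pull request
on: [push, pull_request]

jobs:
  build-and-test:
    runs-on: ubuntu-latest
    steps:
      # Check out the repository at the triggering commit
      - uses: actions/checkout@v4
      # Install Node.js 20
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      # Clean install of the dependencies pinned in the lockfile
      - run: npm ci
      # Run the test script defined in package.json
      - run: npm test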

This single file gives you:

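An automated install and test run on every push and pull request

Test results on every PR, which can block merging once a branch protection rule marks this check as required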

A visible green/red status on every commit

Where to Go Next

Once comfortable with the basics, explore these areas in depth:

Containers → Docker Essentials

Orchestration → Kubernetes Essentials

Pipelines → GitHub Actions · GitLab CI

Cloud → Amazon ECS · Amazon EC2
