Five Infrastructure Policies That Keep Your Cloud From Burning Money and Security
A developer needs SSH access to a production server for a quick debugging session. They open port 22 to 0.0.0.0/0 so they can connect from home. The debugging finishes, the ticket closes, and that security group rule stays open for three months. Nobody notices. Then the cloud bill arrives with a surprise of its own: someone spun up an m5.24xlarge instance in the dev account, running 24/7, with no tags, named test123.
This is not a hypothetical. This pattern repeats across teams every quarter. The fix is not more manual review or stricter access control. The fix is policy written as code, checked automatically before any resource gets created.
Infrastructure policies fall into five categories. Each solves a specific class of problems. Understanding them helps you decide which ones to prioritize in your pipeline.
Security: The Non-Negotiable Baseline
Security policies are the most critical because violations have immediate, visible consequences. The classic example is blocking security group rules that open SSH or database ports to 0.0.0.0/0. Developers often open these for temporary access and forget to close them. A policy that rejects such rules at pipeline time prevents production resources from being accidentally exposed to the internet.
Other common security policies include:
- Requiring encryption at rest for S3 buckets and EBS volumes
- Blocking outdated TLS versions on load balancers
- Enforcing HTTPS-only traffic on all public endpoints
- Requiring VPC flow logs for production environments
- Mandating IAM roles instead of long-lived access keys
Security policies typically have a hard fail behavior: if the check fails, the pipeline stops and the resource is not created. There is no warning mode for a port open to the entire internet.
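As a rough illustration of the open-port check described above, here is a minimal Python sketch. It assumes the planned ingress rules are available as plain dictionaries with from_port, to_port, and cidr keys; that shape, the function names, and the port list are hypothetical and not taken from any particular tool.

```python
import ipaddress

# Ports treated as sensitive in this sketch: SSH plus two common database ports.
# The exact set is an assumption; extend it for your own stack.
SENSITIVE_PORTS = {22, 3306, 5432}


def is_open_to_world(cidr: str) -> bool:
    """Return True if the CIDR covers every address (prefix length 0)."""
    return ipaddress.ip_network(cidr, strict=False).prefixlen == 0


def find_public_sensitive_rules(rules: list[dict]) -> list[str]:
    """Return one message per ingress rule that exposes a sensitive port."""
    violations = []
    for rule in rules:
        # A rule may cover a port range, so test each sensitive port against it.
        exposed = [
            port for port in sorted(SENSITIVE_PORTS)
            if rule["from_port"] <= port <= rule["to_port"]
        ]
        if exposed and is_open_to_world(rule["cidr"]):
            violations.append(f"ports {exposed} open to {rule['cidr']}")
    return violations
```

In a pipeline, a non-empty result would translate into a non-zero exit code so that the run stops before anything is created.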
Cost: Preventing Accidental Bank Breaks
Cloud resources are expensive when left unchecked. A single developer can accidentally provision an instance type that costs as much as a team member's monthly salary. Cost policies put guardrails around spending without requiring manual approval for every resource.
Typical cost policies include:
- Blocking expensive instance types (like m5.24xlarge or r5.metal) in non-production environments
- Limiting the number of EBS volumes or GPUs per account
- Requiring spot instances for fault-tolerant workloads
- Setting maximum storage sizes for databases
- Enforcing auto-stop schedules for development environments
Cost policies help teams stay budget-aware, especially when many developers have cloud access. Without them, one person's convenience can become the team's surprise bill.
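The instance-type guardrail from the list above could be sketched roughly as follows. The deny list reuses the two instance types mentioned in this article; the environment names and the function name are assumptions made for illustration.

```python
from typing import Optional

# Hypothetical deny list for non-production environments.
BLOCKED_IN_NON_PROD = {"m5.24xlarge", "r5.metal"}


def check_instance_cost(instance_type: str, environment: str) -> Optional[str]:
    """Return a violation message, or None if the instance type is allowed."""
    if environment != "production" and instance_type in BLOCKED_IN_NON_PROD:
        return f"{instance_type} is not allowed in {environment}"
    return None
```

A deny list is the simplest form of this check. An allow list of approved instance types per environment is stricter, but it needs maintenance whenever new types are introduced.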
Tagging: The Metadata That Keeps Operations Running
Tagging sounds boring until you need to figure out who owns a resource that has been running for six months. Tags like owner, environment, cost-center, and project are essential for tracking costs, automating cleanup, and debugging incidents.
Tagging policies enforce that every resource has the required tags at creation time. For example:
- Every resource must have an owner tag with a valid email address
- Every resource must have an environment tag: dev, staging, or production
- Every resource must have a cost-center tag matching the team's budget code
When a resource fails tagging policy, the pipeline can either reject it or create it with a warning and a scheduled cleanup. The important thing is that untagged resources do not silently accumulate. Tagging policies prevent the "orphan resource" problem where billing teams find mystery resources running for months with no clear owner.
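A tagging check along these lines might look like the sketch below. It assumes tags arrive as a plain dictionary. The email pattern is deliberately loose. The cost-center rule only checks that the tag is present, because validating it against real budget codes would require a lookup that is out of scope here.

```python
import re

ALLOWED_ENVIRONMENTS = {"dev", "staging", "production"}

# Loose shape check (something@something.tld); not full RFC validation.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def check_tags(tags: dict[str, str]) -> list[str]:
    """Return a list of tagging problems; an empty list means the tags pass."""
    problems = []
    if not EMAIL_PATTERN.match(tags.get("owner", "")):
        problems.append("owner tag must be a valid email address")
    if tags.get("environment") not in ALLOWED_ENVIRONMENTS:
        problems.append("environment tag must be dev, staging, or production")
    if not tags.get("cost-center"):
        # Assumption: presence only; matching real budget codes needs a lookup.
        problems.append("cost-center tag is required")
    return problems
```

Returning a list rather than raising lets the caller decide whether the problems become a rejection or a warning with scheduled cleanup.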
Naming: Consistency for Humans and Automation
Resource names matter more than most teams realize. A bucket named test123 and another named data-barang are hard to search, hard to automate against, and hard to troubleshoot. Naming policies enforce consistent patterns so that operations teams and automation tools can find resources quickly.
Common naming policies include:
- All S3 buckets must start with the project name
- All security groups must have a prefix indicating environment
- All RDS instances must follow the pattern {project}-{env}-{function}
- All IAM roles must include the service name and permission level
Naming policies are often combined with tagging policies. Together, they ensure that every resource is identifiable, searchable, and manageable at scale. Without them, you end up with a cloud account that looks like a junk drawer.
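The {project}-{env}-{function} pattern can be expressed as a regular expression. The sketch below assumes lowercase alphanumeric segments without hyphens and the three environment names used elsewhere in this article; both are assumptions, and a real convention may allow more.

```python
import re
from typing import Optional

# Assumed convention: lowercase alphanumeric segments, no hyphens inside a segment.
NAME_PATTERN = re.compile(
    r"(?P<project>[a-z0-9]+)-(?P<env>dev|staging|production)-(?P<function>[a-z0-9]+)"
)


def parse_resource_name(name: str) -> Optional[dict]:
    """Return the name's parts if it follows the pattern, otherwise None."""
    match = NAME_PATTERN.fullmatch(name)
    return match.groupdict() if match else None
```

Because the check returns the parsed parts, automation can reuse them, for example to cross-check the env segment against the environment tag.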
Compliance: Translating External Rules Into Code
Compliance policies handle requirements from external regulations like PCI DSS, HIPAA, SOC 2, or GDPR. These are not optional. They translate legal and regulatory requirements into automated checks that run before any resource is deployed.
Examples of compliance policies:
- All production databases must use encryption at rest
- All access to production resources must be logged in a central audit trail
- All data must be stored in approved geographic regions
- All backups must be encrypted and stored in a separate account
- All API access must use multi-factor authentication
Compliance policies are often the hardest to negotiate because they come from outside the engineering team. But encoding them as code makes them consistent, auditable, and much easier to enforce than manual checklists.
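Two of the example rules above, encryption at rest and approved regions, can be sketched as a single check. The region list and the dictionary keys (environment, storage_encrypted, region) are hypothetical; the actual approved regions come from your regulatory requirements, not from this example.

```python
# Hypothetical approved regions; the real list comes from your compliance team.
APPROVED_REGIONS = {"eu-west-1", "eu-central-1"}


def check_database_compliance(db: dict) -> list[str]:
    """Return compliance findings for a planned database; empty means compliant."""
    findings = []
    if db.get("environment") == "production" and not db.get("storage_encrypted", False):
        findings.append("production database must use encryption at rest")
    if db.get("region") not in APPROVED_REGIONS:
        findings.append(f"region {db.get('region')} is not in the approved list")
    return findings
```

Keeping each finding as a readable message helps when the output doubles as audit evidence.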
How These Policies Interact
These five categories do not operate in isolation. A single EC2 instance gets checked against multiple policies at once: security group rules, instance type, tags, naming pattern, and compliance requirements. A good pipeline runs all these checks before the resource is created, not after.
The following diagram shows how the five policy categories relate to each other and to the deployment pipeline:
The order matters too. Security and compliance checks should run first because violations in those categories are non-negotiable. Cost and tagging checks can follow. Naming checks are usually the least critical but still worth enforcing for operational sanity.
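One way to picture the ordering is a small runner that sorts policies by category before evaluating them. This is a sketch only: the Policy type, the resource dictionary, and the exact priority numbers are assumptions, including the choice to run security ahead of compliance within the first tier.

```python
from typing import Callable, NamedTuple

# Lower number runs first. Security and compliance lead, naming comes last.
CATEGORY_ORDER = {"security": 0, "compliance": 1, "cost": 2, "tagging": 3, "naming": 4}


class Policy(NamedTuple):
    category: str
    name: str
    check: Callable[[dict], bool]  # returns True when the resource passes


def evaluate(resource: dict, policies: list[Policy]) -> list[str]:
    """Run every policy in category order and return the ones that failed."""
    ordered = sorted(policies, key=lambda p: CATEGORY_ORDER[p.category])
    return [f"{p.category}: {p.name}" for p in ordered if not p.check(resource)]
```

This version collects every failure so the developer sees all problems in one run. A stricter variant could stop at the first security or compliance failure instead.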
Practical Checklist for Getting Started
If you are new to infrastructure policies, start small. Pick one category and automate one check. Here is a sequence that works for most teams:
- Week one: Add a security policy that blocks public SSH access. Fail the pipeline hard.
- Week two: Add a tagging policy that requires owner and environment tags. Start with a warning, then move to hard fail after two weeks.
- Week three: Add a cost policy that blocks expensive instance types in dev accounts. Warn on violation, escalate to the team lead.
- Week four: Add naming conventions for the most common resource types in your account.
- Month two: Review compliance requirements and encode the top three as automated checks.
The goal is not to write every policy at once. The goal is to build momentum by solving the most painful problems first.
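The "start with a warning, then move to hard fail" rollout can be automated so that nobody has to remember to flip the switch. A minimal sketch, assuming each policy records the date it was introduced; the function name and the 14-day default are illustrative.

```python
from datetime import date, timedelta


def enforcement_mode(introduced: date, today: date, grace_days: int = 14) -> str:
    """Return "warn" during the grace period and "fail" once it has elapsed."""
    if today < introduced + timedelta(days=grace_days):
        return "warn"
    return "fail"
```

Passing today as an argument, instead of calling date.today() inside the function, keeps the behavior easy to test.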
What Matters Most
Security and compliance policies protect you from external threats and legal exposure. Cost policies protect your budget. Tagging and naming policies protect your operational sanity. All five categories work together to turn infrastructure management from a manual, error-prone process into an automated, consistent one.
Start with the policy that hurts the most today. For most teams, that is either the security group wide open to the internet or the mystery resource running up the bill. Automate that check, then move to the next. Over time, your pipeline becomes a safety net that catches mistakes before they become incidents.