
Workload Rightsizing

Workload rightsizing is essential for ensuring Kubernetes workloads run efficiently while minimizing costs. Randoli simplifies the process by providing granular container-level recommendations that help optimize CPU and memory allocations based on actual usage patterns.

This guide explains how Randoli enables data-driven rightsizing, helping teams reduce waste, improve workload performance, and tie optimizations directly to cost savings.

Why Does Workload Rightsizing Matter?

Workloads in Kubernetes often suffer from over-provisioning or under-provisioning, leading to unnecessary cloud costs or unstable performance. Traditional rightsizing approaches rely on manual monitoring and static configurations, making it difficult to balance efficiency and stability.

Randoli automates this process by analyzing workload usage over time and providing precise, actionable recommendations at different levels of visibility—from individual workloads to organization-wide insights.

note

Rightsizing itself doesn't directly reduce costs unless paired with actions like scaling nodes or resizing clusters. It provides the foundation for cost-saving decisions.

How Randoli Helps - Key Features

Randoli evaluates each container inside a pod, rather than applying a one-size-fits-all approach at the pod level. This ensures that resources are optimized at the most granular level.

1. Container-Level Recommendations

Instead of adjusting entire pods, Randoli analyzes each container individually to ensure precise rightsizing recommendations.

➡️ Example: In the mongodb container, CPU requests were initially set to 500m, but actual usage was only 103m (21%). The recommended adjustment reduces requests to 143m, optimizing resource allocation without affecting workload stability.

2. Efficiency Scores and Cost Savings Impact

Each workload is assigned an efficiency score (separately for CPU and memory), showing how well it utilizes allocated resources. Randoli also provides estimated cost savings when recommendations are applied.

➡️ Example: If a workload requests 500m CPU but consistently uses only 8.2m, the efficiency is under 2% based on the efficiency formula, meaning it's heavily over-provisioned and a prime candidate for rightsizing.

Important

Efficiency scores may sometimes exceed 100%.

For example: A workload requests 500m CPU but spikes to 600m under peak load. The calculated efficiency is 120%, indicating under-provisioning. In such cases, increasing resource requests may be necessary to prevent performance degradation.
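The efficiency figures in this section are consistent with a simple ratio of usage to request. The sketch below assumes that ratio; the exact formula Randoli uses is not stated in this guide, and the function name `efficiency_percent` is purely illustrative.

```python
def efficiency_percent(usage_millicores: float, request_millicores: float) -> float:
    """Efficiency as usage divided by request, in percent (assumed formula)."""
    if request_millicores <= 0:
        raise ValueError("request must be positive")
    return usage_millicores / request_millicores * 100


# Figures taken from the examples in this guide.
mongodb = efficiency_percent(103, 500)  # request 500m, usage 103m
peak = efficiency_percent(600, 500)     # request 500m, peak usage 600m

print(f"mongodb container: {mongodb:.0f}%")  # over-provisioned
print(f"peak-load example: {peak:.0f}%")     # above 100%: under-provisioned
```

Under this assumption, any value above 100% means usage exceeded the request during the measured period.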

3. Recommendations Based on Historical Data

Rather than reacting to short-term spikes, Randoli analyzes usage trends over the past 8 days to generate reliable recommendations. This prevents unnecessary adjustments caused by temporary fluctuations.

➡️ Example: A backend service that uses high CPU only during peak traffic hours will be optimized based on sustained trends, avoiding unnecessary scaling changes.
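To see why a multi-day window dampens short spikes, here is a minimal sketch that compares a high percentile of the samples with the single highest sample. A plain nearest-rank percentile is used only for illustration; the sample values are hypothetical and this is not a description of Randoli's internal algorithm.

```python
import math


def percentile(samples, q):
    """Nearest-rank percentile of a non-empty list, with q in (0, 100]."""
    ordered = sorted(samples)
    rank = math.ceil(q * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# Hypothetical CPU usage samples in millicores: steady load plus one brief spike.
usage = [100] * 95 + [110] * 4 + [900]

p90 = percentile(usage, 90)  # reflects sustained usage
peak = max(usage)            # dominated by the single spike

print(f"p90={p90}m peak={peak}m")
```

Sizing from the peak would request nine times more CPU than the workload needs almost all of the time, which is the kind of overreaction a trend-based approach avoids.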

How to apply Rightsizing Recommendations

Randoli provides data-driven rightsizing recommendations to help teams optimize CPU and memory requests based on actual workload usage.

However, how you apply these recommendations depends on your role and the level of visibility you have within your organization.

The Randoli console provides three views for implementing rightsizing:

  • Workload-Level Rightsizing – Adjust container requests & limits to optimize workload efficiency.
  • Cluster-Level Rightsizing – Manage infrastructure rightsizing, node scaling, and cluster resource efficiency.
  • Organization-Level Insights – View potential cost savings across all clusters in a single view, to make informed financial decisions.

Find Recommendations based on your Role

Quickly jump to the relevant section based on your responsibilities:

| Role | View type | Access Required |
| --- | --- | --- |
| DevOps/Platform/SRE Engineer | Workload Level | Adjust container-level requests & limits |
| Cluster Admins | Cluster Level | Manage infrastructure scaling (e.g., autoscaler, node adjustments) |
| FinOps Teams | Organization Level | View cost-saving potential and financial impact |

1️⃣ Workload-Level Rightsizing (For Engineers & DevOps)

Where to find it:

Navigate to Workloads → Cost Optimization to see rightsizing recommendations at a container level for that specific workload.

Why this view matters:

  • Granular Workload Insights: See CPU & memory efficiency per container to detect under/over-provisioning.
  • Direct Infrastructure Integration: Apply YAML-based recommendations via GitOps or kubectl, aligning with existing workflows.
  • Prevent Performance & Cost Issues: Avoid OOMKills from under-provisioning and reduce unnecessary cloud costs from over-provisioning by tuning resource requests and limits based on real historical usage data over 8 days.
  • Faster Troubleshooting: Correlate workload inefficiencies with real-time recommendations to improve incident response.

Steps to view/apply recommendations:

Step 1: Go to the Cost Optimization Tab
  • Find the workload you want in the list in the Workloads section.
  • Available filters for quick search:
    • Cluster (e.g., aks-us-east-001)
    • Namespace (e.g., database)
    • Workload Type (Deployment, StatefulSet, etc.)
    • Pre-defined Issue Tags (Out of memory, Image Pull failed, etc.)
    • Workload Status (Running, CrashLoopBackOff, etc.)
  • Select a specific workload (e.g. mongodb-app-arbiter)
  • Click on the Cost Optimization tab to access rightsizing recommendations.

Step 2: View the CPU/Memory Efficiency Scores
  • View the CPU & Memory Efficiency Scores to assess whether the workload is over or under-provisioned.
  • A low efficiency score (e.g., 1% CPU efficiency) suggests excessive over-provisioning.
  • A high efficiency score (e.g., 120% memory efficiency) indicates under-provisioning, requiring an increase in requests/limits.
  • Learn more about the impact of efficiency score in rightsizing.

Step 3: Analyze Usage vs Requests Trends (8-day historical data)
  • Randoli provides 8-day historical usage trends to prevent short-term spikes from influencing recommendations.
  • Compare Requested vs. Actual Usage:
    • Requests (Blue Line): Current allocated resources.
    • Usage (Red Line): Actual consumption.
  • Identify inefficiencies:
    • Large gap between request and usage: Indicates over-provisioning.
    • Usage frequently exceeding requests: Suggests under-provisioning, leading to performance risks.
  • Spot trends in CPU and memory consumption before making adjustments.

Step 4: Check the Current vs Recommended Values
  • Randoli shows the current and recommended CPU and memory request/limit values side by side.
  • Estimated Cost Savings are displayed based on these optimizations.

Step 5: Apply the Changes in Kubernetes Manifests
  • Randoli generates an updated YAML snippet with recommended values.
  • You can apply these updates in two ways:
    • GitOps Workflow: Commit the changes to your repo if using GitOps-based deployments.
    • Manual Update via kubectl: Apply the YAML changes directly in your Kubernetes cluster.

note

Ensure that the new values align with the workload's behavior before deployment.
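For the manual kubectl route, one option is a strategic merge patch that changes only the resource requests of a single container. The sketch below builds such a patch. The helper name `resources_patch` is hypothetical, the CPU value 143m comes from the mongodb example earlier in this guide, and the memory value is a placeholder rather than a Randoli recommendation.

```python
import json


def resources_patch(container, cpu_request, memory_request):
    """Build a strategic merge patch that sets requests for one container."""
    return {
        "spec": {"template": {"spec": {"containers": [
            {
                "name": container,  # containers are merged by name
                "resources": {
                    "requests": {"cpu": cpu_request, "memory": memory_request}
                },
            }
        ]}}}
    }


# "256Mi" is a placeholder; use the value shown in the console.
patch = resources_patch("mongodb", "143m", "256Mi")
patch_json = json.dumps(patch)
print(patch_json)
```

The printed JSON can be passed to `kubectl patch <kind> <name> -n <namespace> --type strategic -p '<json>'`. In a GitOps workflow, the same values would instead be committed to the manifest in the repository.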


✅ Next Step: Monitor efficiency scores after applying changes to ensure optimal performance.

2️⃣ Cluster-Level Rightsizing (For Platform/Infra Engineers, Cluster Admins)

Where to find it:

Navigate to Clusters → Cost Analysis to manage node-level efficiency and scaling.

Why this view matters:

  • Cluster-Wide Cost & Resource Visibility: Get a holistic view of CPU, memory, and node efficiency across the cluster. Helps in identifying underutilized nodes and optimizing capacity planning.
  • Node-Level Efficiency Insights: Understand how much of the allocated compute (CPU & memory) is actually being used versus idle. This helps in determining whether nodes can be scaled down or repurposed.
  • Reduce Wasted Cloud Spend: Detect and remove idle or underutilized nodes that contribute to unnecessary infrastructure costs.
  • Align Workload & Cluster Scaling: Ensures that workload rightsizing at the container level translates into actual cost savings by reducing unnecessary nodes, rather than just reallocating existing resources inefficiently.

Steps to view/apply recommendations:

Step 1: Go to the Cost Analysis Tab
  • Select a cluster from the Clusters section (e.g., aks-ca-central-alpha-001).
  • Navigate to the Cost Analysis tab to access insights into node-level efficiency and costs.

Step 2: Analyze Compute Costs and Node Utilization
  • Total Compute Cost (Total & Idle): View the overall monthly spend and idle cost.
  • Node Count Trend (Spot & On demand instances): Identify fluctuations in node usage over the last 7 days.
  • CPU & Memory Usage: Compare allocated vs. used resources to detect over-provisioned nodes.

Step 3: Identify Idle Costs & Over-Provisioned Resources
  • Idle Cost Breakdown: Understand how much of the total cost is tied to unused capacity.
  • Namespace-Level Cost Breakdown: See which namespaces contribute most to inefficiencies.
  • Workload-Level Cost Breakdown: Identify specific workloads consuming excessive resources.

Step 4: Optimize Node Scaling & Rightsizing
  • Review Node Scaling Strategy:
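    • If workloads are optimized but nodes remain underutilized, consider scaling down or repurposing those nodes, for example with the Kubernetes Cluster Autoscaler or Karpenter.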
  • Validate Changes:
    • Compare post-optimization efficiency metrics.
    • Adjust based on sustained trends rather than short-term fluctuations.

✅ Next Step: Monitor the impact on workload stability and ensure autoscaling settings are optimized.

3️⃣ Organization-Level Cost Savings (For FinOps & Cost Management Teams)

Where to find it:

Navigate to Cost Savings → Rightsizing tab to view cost-saving insights at scale.

Why this view matters:

  • Portfolio-Wide Cost Insights: Provides a unified view of rightsizing opportunities across all clusters, allowing FinOps teams to track potential savings at an organizational level.
  • Prioritize High-Cost Workloads: Identifies the top workloads with significant cost-saving potential, helping teams focus on high-impact optimizations.
  • Detect Underutilized & Dormant Resources: Flags workloads with low efficiency or prolonged idle time, enabling data-driven decisions on scaling down or decommissioning resources.
  • Improve Budget Planning & Forecasting: Aligns resource allocation with financial objectives, helping teams set realistic budgets based on actual workload efficiency trends.
  • Enhance Collaboration with Engineering Teams: Provides structured insights that FinOps teams can share with engineers to ensure rightsizing changes align with operational needs.

Steps to view/apply recommendations:

Step 1: Access Cost Savings Dashboard
  • Navigate to Cost Savings → Rightsizing.
  • View Total Potential Savings across all workloads.
  • Use filters to refine results by:
    • Cluster
    • Namespace
    • Workload name

Step 2: Identify Low-Efficiency Workloads
  • Sort workloads by CPU or Memory Efficiency to find inefficient ones.
  • Workloads with low efficiency but high potential cost savings are top candidates for optimization.
  • Example: mongodb-app-arbiter shows 0.5% CPU efficiency and has a potential to save $755 - $772/year after rightsizing.
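The prioritization described in this step can be sketched as a filter followed by a sort. In the example below, only the mongodb-app-arbiter figures come from this guide; the other workloads, the `Workload` type, and the 20% efficiency cutoff are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    cpu_efficiency: float      # percent
    yearly_savings_low: float  # USD, lower bound of the savings range


workloads = [
    Workload("mongodb-app-arbiter", 0.5, 755.0),  # figures from this guide
    Workload("checkout-api", 62.0, 40.0),         # hypothetical
    Workload("batch-report", 4.0, 310.0),         # hypothetical
]


def candidates(items, max_efficiency=20.0):
    """Low-efficiency workloads, ordered by largest potential savings first."""
    low = [w for w in items if w.cpu_efficiency < max_efficiency]
    return sorted(low, key=lambda w: w.yearly_savings_low, reverse=True)


ranked = [w.name for w in candidates(workloads)]
print(ranked)
```

Sorting by the lower bound of the savings range is a conservative choice; sorting by the upper bound would be equally reasonable.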

Step 3: Decide on Rightsizing vs. Decommissioning
  • Rightsizing: Apply optimized CPU/memory allocations to improve efficiency.
  • Decommissioning: If a workload is idle or unused, consider scaling to zero or deleting it.
  • Example: If a workload has shown 0% CPU usage for weeks, consider scaling it to zero or decommissioning it.


✅ Next Step: Coordinate with engineering teams to align rightsizing efforts with business objectives and SLAs, then implement the cost-saving actions.

How to Interpret Resource Usage Graphs

Over-Provisioned (Large Gap Between Usage & Requests)

  • Workload is using far fewer resources than allocated.
  • Recommendation: Reduce requests to match actual usage.
  • Example: A workload requests 2 CPUs (2000m) but consistently uses only 0.5 CPU (500m).

Under-Provisioned (Usage Exceeds Requests)

  • Workload is consuming more than allocated, causing performance issues.
  • Recommendation: Increase requests to prevent CPU throttling or OOMKilled errors.
  • Example: An API service with 512Mi memory requests but consistently using 750Mi.

Optimally Sized (Usage & Requests Close to Each Other)

  • Efficient resource allocation with minimal waste.
  • No immediate action needed.
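The three cases above can be expressed as a small classifier over the usage-to-request ratio. The thresholds below (under 50% counts as over-provisioned, above 100% as under-provisioned) are illustrative assumptions, not values defined by Randoli.

```python
def classify(usage, request, low=0.5, high=1.0):
    """Classify a workload by its usage-to-request ratio (assumed thresholds)."""
    ratio = usage / request
    if ratio > high:
        return "under-provisioned"
    if ratio < low:
        return "over-provisioned"
    return "optimally sized"


# The first two cases use the examples from this section.
print(classify(500, 2000))  # 2 CPUs requested, 0.5 CPU used
print(classify(750, 512))   # 512Mi requested, 750Mi used
print(classify(450, 500))   # hypothetical workload close to its request
```

Usage and request must be in the same unit (millicores for CPU, Mi for memory) for the ratio to be meaningful.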

Cost Savings Estimates

Randoli provides cost-saving estimates based on rightsizing recommendations, helping teams understand the financial impact of resource optimizations.

Savings Range

The platform presents savings as a range (e.g., $40–$50 per month) because workload resource usage fluctuates, affecting real-world costs.

The savings range accounts for different scenarios:

  • Minimum Savings: When resource usage remains closer to requests.
  • Maximum Savings: When resource consumption is near or exceeds limits.

Why use a Range?

Unlike static cost estimates (a fixed value), savings ranges provide realistic predictions by accounting for variations in workload behavior. This approach ensures informed decision-making while preventing overly aggressive downsizing that could impact performance.

note

Savings estimates reflect potential reductions in resource requests, not direct workload cost reductions. Real savings depend on scaling down unused infrastructure, such as nodes.

How Rightsizing Recommendations are Calculated

Randoli uses a structured, data-driven approach to generate accurate recommendations that align resource allocation with actual usage patterns.

Data Sources

  • Historical metrics from the last 8 days to avoid reacting to temporary spikes.
  • CPU and memory usage trends to track inefficiencies over time.

Backend Process

  1. Analyze Trends: Usage data is collected and processed through Prometheus.
  2. Apply VPA Parameters: Recommendations are determined using the Vertical Pod Autoscaler (VPA) recommender.
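To inspect similar data yourself, an 8-day usage percentile can be requested from the Prometheus HTTP API. The sketch below only builds the request URL. The base URL, the 0.9 quantile, and the 5-minute step are assumptions, and the queries Randoli actually runs are not documented here. `container_cpu_usage_seconds_total` is the standard cAdvisor CPU metric.

```python
from urllib.parse import urlencode


def cpu_p90_query_url(base_url, namespace, container, window="8d"):
    """URL for an instant query: p90 of the CPU usage rate over the window."""
    selector = f'namespace="{namespace}",container="{container}"'
    promql = (
        "quantile_over_time(0.9, "
        f"rate(container_cpu_usage_seconds_total{{{selector}}}[5m])[{window}:5m])"
    )
    return f"{base_url}/api/v1/query?" + urlencode({"query": promql})


# Hypothetical Prometheus address; namespace and container from this guide.
url = cpu_p90_query_url("http://prometheus:9090", "database", "mongodb")
print(url)
```

The result is in CPU cores per series, so multiply by 1000 to compare it with millicore requests.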

VPA Considerations

Randoli recommendations follow VPA defaults, which enforce:

  • Minimum CPU: 25m
  • Minimum Memory: 256Mi

If recommendations seem too aggressive, refer to the Usage vs Requests graph to verify alignment with real-world workload behavior.

➡️ Example: If a workload's CPU usage frequently spikes above the request value, increasing the CPU request allocation might be necessary to maintain performance.

Limitations and Considerations of Randoli Recommendations

Consider the following factors when applying recommendations:

1. Minimum Thresholds of VPA

  • VPA enforces a minimum CPU of 25m and memory of 256Mi.
  • Workloads requiring lower values may need manual adjustments based on observed usage.
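The effect of these floors can be sketched as a simple clamp. The constants mirror the minimums stated above; the function name and the 40Mi memory figure are hypothetical.

```python
MIN_CPU_MILLICORES = 25  # minimum CPU stated in this guide
MIN_MEMORY_MIB = 256     # minimum memory stated in this guide


def apply_minimums(cpu_millicores, memory_mib):
    """Raise a raw recommendation to the stated minimum thresholds."""
    return (
        max(cpu_millicores, MIN_CPU_MILLICORES),
        max(memory_mib, MIN_MEMORY_MIB),
    )


# A tiny workload (8.2m CPU, hypothetical 40Mi memory) is lifted to the floor.
print(apply_minimums(8.2, 40))
# A larger recommendation passes through unchanged.
print(apply_minimums(143, 512))
```

For very small workloads, the floor rather than observed usage determines the recommendation, which is why manual adjustment may be needed.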

2. Node-Level Costs

  • Rightsizing alone does not guarantee cost savings—reducing workload requests only saves costs if unused infrastructure (nodes) is scaled down.
  • Use tools like Kubernetes Cluster Autoscaler or Karpenter to optimize node resizing based on rightsizing actions.
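A rough way to check whether a request reduction can free a node is to compare the node count needed before and after. The sketch below is a CPU-only lower bound that ignores memory, system overhead, and scheduling constraints. The replica counts and the 4000m node size are hypothetical; the 500m and 143m requests come from the mongodb example.

```python
import math


def nodes_needed(total_request_millicores, node_allocatable_millicores):
    """Lower bound on node count from CPU requests alone."""
    return math.ceil(total_request_millicores / node_allocatable_millicores)


NODE_CPU = 4000  # hypothetical allocatable CPU per node, in millicores

# 40 replicas: the reduction is large enough to remove nodes.
before_large = nodes_needed(40 * 500, NODE_CPU)
after_large = nodes_needed(40 * 143, NODE_CPU)

# 2 replicas: requests shrink, but the node count does not change.
before_small = nodes_needed(2 * 500, NODE_CPU)
after_small = nodes_needed(2 * 143, NODE_CPU)

print(before_large, after_large, before_small, after_small)
```

In the second case the freed capacity stays on a node that is still billed, so the estimated savings only materialize once other workloads use that capacity or the node is removed.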

3. Continuous Monitoring

  • Rightsizing should be an ongoing process, not a one-time adjustment.
  • Use efficiency scores, graphs, and cost analysis metrics to periodically validate the impact of changes.

Conclusion

Randoli streamlines workload rightsizing by providing data-driven, actionable recommendations at the container level, ensuring Kubernetes environments remain efficient, cost-effective, and scalable.

Reduce Over-Provisioning → Free up unused resources.
Avoid Under-Provisioning → Maintain workload stability.
Improve Cost Efficiency → Align rightsizing with real cost savings.

By combining rightsizing with infrastructure scaling strategies, teams can achieve sustainable cost reductions without compromising performance.