EASE 2026 — Glasgow, United Kingdom  |  DOI: 10.1145/3816483.3816532

EnCoDe: Energy Estimation of Source Code At Design-Time

Software Engineering Research Centre, IIIT Hyderabad, India

Figure 2: Design Time Energy Estimation Methodology. The framework spans five phases: data construction via PowerLens measurement, feature extraction & engineering, ML training, feature correlation profiling, and design-time inference — producing energy estimates and tier classifications without executing code.

Abstract

Energy efficiency has emerged as a vital attribute of software quality, with significant implications for both environmental sustainability and operational costs. However, existing profiling tools operate only at runtime and coarse granularity, capturing energy at the process or method level — failing to expose how small code blocks such as functions, loops, and conditionals contribute to energy consumption during development.

To address this gap, we propose EnCoDe, a methodology for fine-grained, design-time energy estimation, with three key contributions: (1) PowerLens — a novel measurement methodology achieving reliable sub-millisecond energy readings for small code blocks; (2) an extensive empirical study on executable code blocks extracted from over 18,000 Python programs, uncovering linear and non-linear relationships between energy and static code features; and (3) predictive modeling achieving R² = 0.75 for regression and 80.6% accuracy for identifying energy hotspots at design time — without execution.

Keywords: Green Software Engineering · Software Sustainability · Design-Time · Energy Estimation · Static Code Analysis

Key Contributions

🔬

PowerLens Measurement

A novel sub-millisecond energy measurement methodology achieving reliable readings for microsecond-scale code blocks through execution amplification, RAPL temporal synchronization, calibrated subtraction, and IQR-based aggregation across 10 trials. Over 90% of blocks show <10% coefficient of variation.

📊

Fine-Grained Energy Dataset

Empirical study on executable code blocks from over 18,000 Python programs, yielding a first-of-its-kind dataset annotated with 33 static AST features. Energy spans six orders of magnitude (2.37×10⁻⁵ J – 7.48×10² J).

🤖

Predictive Modeling & Validation

Classical ML models trained on static code features achieve R² = 0.755 for continuous energy regression and 80.6% accuracy for Low / Medium / High tier classification — enabling developers to identify energy hotspots early, without running the code.

Motivation: RAPL Cannot Measure Short Code Blocks

Intel's Running Average Power Limit (RAPL) is the de facto standard for software-based energy measurement, but its counters update at approximately 1 ms granularity. Individual code blocks — functions, loops, conditionals — execute in microseconds, making them invisible to standard profilers.

As Figure 1 demonstrates, workloads under 1 ms register more than 110% coefficient of variation and 4–5 runs out of 10 return zero readings. This fundamental limitation motivated the development of PowerLens.

Figure 1: Quality of RAPL's Measurements over Execution Time of the Workload (ms, log scale). Workloads under 1 ms show >110% CV and frequent zero readings.

Methodology

Five-Phase Pipeline

1. Block Identification

Source code is parsed to an AST; block-rooting nodes (FunctionDef, For, While, If, Try, With) are extracted as distinct, hierarchically-aware blocks.

2. PowerLens Measurement

Each block is measured using execution amplification (N = 1000+), temporal synchronization with RAPL boundaries, calibrated subtraction of loop overhead, and an IQR-filtered mean over 10 trials.

3. Feature Extraction & Engineering

33 static AST metrics are computed per block across 7 categories: Basic, Complexity, Density, Diversity, Structural, Code Pattern, and Halstead metrics.

4. ML Training

A regression model (Mr) predicts continuous energy in Joules; a classification model (Mc) assigns Low / Medium / High tiers using equal-frequency binning. Both are evaluated with stratified 5-fold CV.

5. Design-Time Inference

New code is parsed to an AST, features are extracted, and the pre-trained models are queried. The energy estimate is returned in milliseconds without executing any code.
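The equal-frequency binning used in phase 4 can be sketched as follows. This is a minimal illustration, not the authors' code: the energy values are made up, and the helper names `tier_thresholds` and `tier` are hypothetical. It assumes the two cut points are the terciles of the measured energies, so each tier receives roughly the same number of blocks.

```python
import statistics
from collections import Counter


def tier_thresholds(energies):
    """Return the two cut points that split the data into three equal-frequency bins."""
    low_cut, high_cut = statistics.quantiles(energies, n=3)
    return low_cut, high_cut


def tier(energy, cuts):
    """Map one energy value (Joules) to a Low / Medium / High label."""
    low_cut, high_cut = cuts
    if energy <= low_cut:
        return "Low"
    if energy <= high_cut:
        return "Medium"
    return "High"


# Made-up block energies in Joules, spread over several orders of magnitude.
energies = [2.4e-5, 1e-4, 3e-4, 1e-3, 5e-3, 2e-2, 0.1, 1.0, 750.0]

cuts = tier_thresholds(energies)
labels = [tier(e, cuts) for e in energies]
counts = Counter(labels)
print(counts)
```

Because the cut points are quantiles rather than fixed Joule values, the tiers stay balanced even when the energy distribution is heavily skewed.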

Figure 3: Code Parsing to identify blocks from AST. Source code is parsed to an AST, block-rooting nodes are identified, and each subtree is mapped to a distinct block with contextual hierarchy preserved.
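As a rough illustration of the block identification step, the sketch below uses Python's standard `ast` module to collect the six block-rooting node types named above and to record each block's parent. The function name `find_blocks`, the tuple layout, and the sample function are assumptions made for this example; they are not taken from the EnCoDe implementation.

```python
import ast

# Node types the text names as block roots.
BLOCK_TYPES = (ast.FunctionDef, ast.For, ast.While, ast.If, ast.Try, ast.With)


def find_blocks(source):
    """Return (kind, line, depth, parent_index) for every block-rooting node."""
    blocks = []

    def visit(node, parent_index, depth):
        if isinstance(node, BLOCK_TYPES):
            blocks.append((type(node).__name__, node.lineno, depth, parent_index))
            # Children of this node are nested inside the block just recorded.
            parent_index = len(blocks) - 1
            depth += 1
        for child in ast.iter_child_nodes(node):
            visit(child, parent_index, depth)

    visit(ast.parse(source), None, 0)
    return blocks


EXAMPLE = '''
def score(values):
    total = 0
    for v in values:
        if v > 0:
            total += v
    return total
'''

blocks = find_blocks(EXAMPLE)
for kind, line, depth, parent in blocks:
    print("  " * depth + f"{kind} (line {line}, parent={parent})")
```

Keeping the parent index alongside each block is one simple way to preserve the contextual hierarchy the caption describes, so that an `If` inside a `For` inside a function can be reported both on its own and as part of its enclosing blocks.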

33 Static Features (7 categories)

Basic (5)
AST node count, max depth, avg depth, unique node types, depth variance
Complexity (4)
cyclomatic, cognitive, nesting, control flow
Density (5)
operator density, literal density, call density, variable density, attribute density
Diversity / Entropy (6)
node entropy, operator entropy, variable entropy, unique vars, unique ops, unique fns
Structural (3)
branching factor, leaf ratio
Code Pattern (5)
loops count, conditionals, functions count, try blocks
Halstead (5)
program volume, program effort, difficulty, vocabulary, length
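A few of these features can be computed with nothing more than the standard `ast` module. The sketch below is illustrative only: the exact definitions used in EnCoDe are not given here, so the formulas (Shannon entropy over node-type frequencies, leaf ratio as childless nodes over all nodes) and the name `static_features` are assumptions.

```python
import ast
import math
from collections import Counter

LOOP_TYPES = (ast.For, ast.While)


def static_features(source):
    """Compute a handful of static AST features for a code snippet."""
    tree = ast.parse(source)
    type_counts = Counter()
    max_depth = leaves = loops = conditionals = 0
    stack = [(tree, 0)]
    while stack:
        node, depth = stack.pop()
        type_counts[type(node).__name__] += 1
        max_depth = max(max_depth, depth)
        loops += isinstance(node, LOOP_TYPES)
        conditionals += isinstance(node, ast.If)
        children = list(ast.iter_child_nodes(node))
        if not children:
            leaves += 1
        stack.extend((child, depth + 1) for child in children)

    total = sum(type_counts.values())
    # Shannon entropy (bits) of the node-type distribution.
    entropy = -sum((n / total) * math.log2(n / total) for n in type_counts.values())
    return {
        "ast_node_count": total,
        "max_depth": max_depth,
        "unique_node_types": len(type_counts),
        "node_entropy": entropy,
        "leaf_ratio": leaves / total,
        "loops_count": loops,
        "conditionals_count": conditionals,
    }


SNIPPET = "for i in range(3):\n    if i:\n        pass\n"
features = static_features(SNIPPET)
for name, value in features.items():
    print(f"{name}: {value}")
```

Every value is derived from the parsed tree alone, which is what allows the same extraction to run at design time on code that has never been executed.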

PowerLens: Sub-Millisecond Energy Measurement

Figure 4: PowerLens — Sub-Millisecond Energy Measurement Methodology. (a) Baseline unmeasurable block; (b) Execution Amplification — block repeated N times to amplify the energy signal above RAPL's threshold; (c) Synchronisation — execution start aligned with RAPL counter refresh cycle; (d) Calibrated Subtraction — pre-measured padding overhead removed. Final energy is the IQR-filtered mean across 10 trials.
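The arithmetic behind steps (b) and (d) and the final aggregation can be sketched without any hardware access. The example below assumes the raw readings are already available as Joules per amplified run; reading and synchronising with the RAPL counters is deliberately left out. The trial values, the overhead figure, the helper names, and the choice of Tukey's 1.5 × IQR fences are all assumptions for illustration, since the text only says the mean is IQR-filtered.

```python
import statistics


def per_execution_energy(raw_joules, overhead_joules, repetitions):
    """Calibrated subtraction, then division by the amplification factor N."""
    return (raw_joules - overhead_joules) / repetitions


def iqr_filter(values, k=1.5):
    """Keep values inside [Q1 - k*IQR, Q3 + k*IQR] (assumed fence rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if low <= v <= high]


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return statistics.stdev(values) / statistics.mean(values)


REPETITIONS = 1000   # amplification factor N
OVERHEAD_J = 0.0002  # hypothetical pre-measured loop overhead per amplified run

# Ten hypothetical amplified readings in Joules; the seventh is a noisy outlier.
raw_trials_j = [0.0102, 0.0101, 0.0103, 0.0100, 0.0102,
                0.0101, 0.0180, 0.0102, 0.0103, 0.0101]

per_exec = [per_execution_energy(r, OVERHEAD_J, REPETITIONS) for r in raw_trials_j]
kept = iqr_filter(per_exec)
estimate = statistics.mean(kept)
cv_percent = 100 * coefficient_of_variation(kept)

print(f"kept {len(kept)} of {len(per_exec)} trials")
print(f"energy per execution: {estimate:.3e} J, CV: {cv_percent:.2f}%")
```

Dividing by N turns a reading that RAPL can resolve into a per-execution figure it could not have measured directly, and the IQR filter keeps a single disturbed trial from skewing the mean.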

Figure 6: Validation — Sum of block-level PowerLens measurements compared against coarse-grained whole-program RAPL readings. For all six block types, the aggregated PowerLens values closely match the PyRAPL baseline, validating the accuracy of the fine-grained methodology while exhibiting substantially lower variance.

Block-Level Energy Measurement

Figure 5 illustrates block-level energy measurements for the running example score_function.py. PowerLens annotates each nested block with its individual energy consumption in Joules, measured at the FunctionDef, For, and If levels independently.

The observed energy values span six orders of magnitude (2.37×10⁻⁵ J to 7.48×10² J) across the full dataset — confirming that PowerLens can capture both trivial microsecond-scale constructs and computationally intensive functions within a single unified framework.

Figure 5: Block Level Energy Measurement of the score computation function. Each nested block (FunctionDef, For, If) is individually annotated with its PowerLens-measured energy in Joules.

Feature Analysis: Correlation & Importance

Top 15 features ranked by three correlation measures (Pearson |r|, Spearman |ρ|, Kendall |τ|) and three model-based importance scores. Bold = linear relationship; italic = non-linear relationship.

Table 1 — Top 15 Features by Correlation and Feature Importance
Each cell lists the feature followed by its value in parentheses.

| # | Pearson \|r\| | Spearman \|ρ\| | Kendall \|τ\| | Extra Trees | Random Forest | Gradient Boosting |
|---|---|---|---|---|---|---|
| 1 | operator density (0.286) | functions count (0.621) | functions count (0.507) | operator density (0.086) | operator density (0.099) | operator density (0.102) |
| 2 | operator entropy (0.205) | node type entropy (0.294) | node type entropy (0.193) | loops count (0.063) | unique node types (0.075) | program difficulty (0.056) |
| 3 | conditionals count (0.181) | conditionals count (0.229) | conditionals count (0.178) | functions count (0.061) | program difficulty (0.073) | program effort (0.054) |
| 4 | unique operators (0.178) | cognitive complexity (0.225) | cognitive complexity (0.165) | operator entropy (0.054) | call density (0.072) | variable entropy (0.037) |
| 5 | literal density (0.174) | nesting complexity (0.215) | unique functions (0.165) | variable entropy (0.054) | variable entropy (0.061) | loops count (0.037) |
| 6 | functions count (0.166) | control flow complexity (0.210) | nesting complexity (0.160) | call density (0.053) | variable density (0.061) | unique node types (0.031) |
| 7 | loops count (0.156) | literal density (0.207) | control flow complexity (0.156) | program difficulty (0.052) | leaves to nodes ratio (0.053) | call density (0.022) |
| 8 | variable entropy (0.145) | unique functions (0.205) | cyclomatic complexity (0.150) | depth variance (0.046) | depth variance (0.052) | literal density (0.016) |
| 9 | variable density (0.144) | cyclomatic complexity (0.203) | literal density (0.145) | unique variables (0.046) | literal density (0.051) | conditionals count (0.014) |
| 10 | program difficulty (0.129) | vocabulary size (0.199) | vocabulary size (0.141) | unique node types (0.043) | program effort (0.042) | operator entropy (0.011) |
| 11 | unique variables (0.122) | operator density (0.195) | operator density (0.138) | literal density (0.042) | operator entropy (0.037) | leaves to nodes ratio (0.010) |
| 12 | leaves to nodes ratio (0.117) | unique node types (0.191) | call density (0.133) | conditionals count (0.036) | unique variables (0.032) | functions count (0.010) |
| 13 | depth variance (0.116) | call density (0.183) | unique node types (0.127) | variable density (0.035) | unique functions (0.030) | program length (0.009) |
| 14 | attribute density (0.080) | program volume (0.148) | unique variables (0.110) | unique operators (0.032) | loops count (0.030) | depth variance (0.009) |
| 15 | max branching factor (0.073) | variable entropy (0.139) | variable entropy (0.105) | unique functions (0.027) | total nodes (0.028) | unique operators (0.008) |

Bold = linear relation (appears in Pearson top-10); Italic = non-linear relation.
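The gap between the Pearson and rank-based columns is what signals a non-linear relationship: a feature can track energy monotonically without tracking it along a straight line. The sketch below illustrates this on synthetic data, using hand-written Pearson and Spearman functions so it needs only the standard library. The data and function names are assumptions for this example, not values from the study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Synthetic feature values and an energy that grows exponentially with them.
feature = list(range(1, 11))
energy = [math.exp(v) for v in feature]

r = pearson(feature, energy)
rho = spearman(feature, energy)
print(f"Pearson |r| = {abs(r):.3f}, Spearman |rho| = {abs(rho):.3f}")
```

Here the rank correlation is perfect because the relationship is strictly increasing, while the linear correlation is noticeably lower. A feature such as functions count, which ranks first under Spearman and Kendall but sixth under Pearson in Table 1, shows the same kind of gap.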

Results

Regression R²: 0.755 (XGBoost · test set)
Classification Accuracy: 80.6% (XGBoost · Low/Med/High)
Weighted F1: 0.805 (XGBoost · test set)
Blocks with CV < 10%: >90% (PowerLens stability)

Table 2 — Regression Models with Log Transform on Target

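| Model | Test R² | CV R² (±std) | RMSE | MAE | MAPE | Energy (mJ) |
|---|---|---|---|---|---|---|
| XGBoost | 0.755 | 0.811 ± 0.040 | 0.281 | 0.057 | 172.45 | 15.26 |
| SVR | 0.752 | 0.803 ± 0.039 | 0.283 | 0.091 | 182.27 | 15.47 |
| Gradient Boosting | 0.747 | 0.810 ± 0.040 | 0.286 | 0.058 | 171.83 | 13.56 |
| CatBoost | 0.722 | 0.799 ± 0.043 | 0.300 | 0.058 | 166.35 | 22.97 |
| Random Forest | 0.719 | 0.793 ± 0.047 | 0.302 | 0.060 | 162.83 | 258.01 |
| Extra Trees | 0.682 | 0.737 ± 0.047 | 0.321 | 0.074 | 170.85 | 275.83 |
| Decision Tree | 0.648 | 0.722 ± 0.070 | 0.338 | 0.067 | 170.24 | 14.69 |
| KNN | 0.645 | 0.684 ± 0.081 | 0.340 | 0.066 | 105.78 | 19.59 |
| AdaBoost | 0.633 | 0.757 ± 0.051 | 0.345 | 0.092 | 171.81 | 20.04 |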

Shading indicates top two models by each metric. Energy column reports inference cost per prediction (mJ).
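Why a log transform on the target helps can be seen from how R² behaves on heavily skewed data. The sketch below is a toy illustration with made-up measured and predicted energies; the base-10 logarithm and the helper name `r2_score` are assumptions, since the text does not state which transform was used.

```python
import math


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Made-up block energies in Joules and hypothetical model predictions.
measured = [2.0e-5, 4.0e-4, 3.0e-3, 5.0e-2, 1.2, 300.0]
predicted = [3.0e-5, 3.0e-4, 5.0e-3, 4.0e-2, 2.0, 150.0]

# In raw Joules the single largest block dominates both sums of squares.
r2_raw = r2_score(measured, predicted)

# In log space every order of magnitude contributes on the same scale.
log_measured = [math.log10(v) for v in measured]
log_predicted = [math.log10(v) for v in predicted]
r2_log = r2_score(log_measured, log_predicted)

# A model trained on log10(energy) is mapped back to Joules with 10 ** prediction.
recovered = [10 ** v for v in log_predicted]

print(f"R² on raw Joules: {r2_raw:.3f}")
print(f"R² on log10 Joules: {r2_log:.3f}")
```

On the raw scale the score is driven almost entirely by the 300 J block, whereas in log space a factor-of-two error costs the same whether it occurs on a microjoule-scale or a hundred-joule-scale block.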

Table 3 — Energy Tier Classification Models

| Model | Accuracy | CV Accuracy (±std) | Precision | Recall | F1 | Energy (J) |
|---|---|---|---|---|---|---|
| XGBoost | 0.806 | 0.793 ± 0.007 | 0.804 | 0.806 | 0.805 | 0.022 |
| Random Forest | 0.792 | 0.780 ± 0.007 | 0.789 | 0.792 | 0.789 | 0.373 |
| Gradient Boosting | 0.788 | 0.794 ± 0.008 | 0.788 | 0.788 | 0.788 | 0.015 |
| K-NN | 0.783 | 0.771 ± 0.005 | 0.780 | 0.783 | 0.780 | 0.027 |
| SVM | 0.781 | 0.771 ± 0.009 | 0.780 | 0.781 | 0.778 | 0.022 |
| Extra Trees | 0.769 | 0.765 ± 0.003 | 0.768 | 0.769 | 0.765 | 0.315 |
| Decision Tree | 0.749 | 0.736 ± 0.009 | 0.744 | 0.749 | 0.745 | 0.014 |
| Logistic Regression | 0.735 | 0.726 ± 0.003 | 0.729 | 0.735 | 0.731 | 0.017 |
| SGD Classifier | 0.729 | 0.713 ± 0.005 | 0.722 | 0.729 | 0.721 | 0.022 |

Shading indicates top two models. XGBoost achieves the best test-set accuracy, precision, recall, and F1, with stable cross-validation (standard deviation below 0.01).

Table 4 — Feature Group Ablations

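| Feature Group | Leave-One-Out ΔR² | Leave-One-Out ΔAcc | # Feat | Group Only R² | Group Only Acc |
|---|---|---|---|---|---|
| Density | −0.002 | −2.8 pp | 5 | 0.700 | 71.1% |
| Counts | −0.011 | −2.6 pp | 8 | 0.720 | 74.2% |
| Halstead | −0.006 | −0.1 pp | 5 | 0.688 | 59.7% |
| Complexity | −0.002 | −0.4 pp | 4 | 0.067 | 51.2% |
| Entropy | +0.001 | −0.1 pp | 3 | 0.691 | 68.3% |
| AST Structural | ~0 | −0.1 pp | 8 | 0.650 | 69.4% |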

Leave-One-Out: Δ relative to full 33-feature model (R²=0.755, Acc=80.6%). Group Only: model trained on that group's features alone.

WattWise — VS Code Extension

EnCoDe is operationalised as WattWise, a VS Code extension that surfaces design-time energy estimates as inline lint-like annotations — directly inside the developer's editor, with no execution or hardware setup required.

WattWise annotates every analysed block with an energy estimate in Joules and a Low / Medium / High tier — powered by the EnCoDe Gradient Boosting and XGBoost models.

Inline energy decorations

Energy in Joules and tier label shown at the end of every def, for, while, if, try, and with block in real time.

🤖
AI-powered optimization suggestions

Gemini 2.5 Flash explains why a High-energy block is expensive and proposes concrete rewrites — algorithmic improvements, vectorisation, data structure alternatives.

📊
Repo-wide energy dashboard

FastAPI + React dashboard scans an entire repository, tracks energy trends over time, and estimates annual electricity cost per block.
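How the dashboard derives its cost figure is not described here. One plausible conversion, shown purely as an assumption, multiplies a block's per-execution energy by an expected yearly execution count, converts Joules to kilowatt-hours (1 kWh = 3.6 × 10⁶ J), and applies an electricity price. The function name, the execution count, and the price below are all hypothetical inputs.

```python
JOULES_PER_KWH = 3.6e6  # 1 kWh = 3.6 MJ


def annual_cost(energy_joules, executions_per_year, price_per_kwh):
    """Estimated yearly electricity cost of one code block.

    All three inputs are supplied by the caller; none of them is
    prescribed by EnCoDe or WattWise.
    """
    kwh_per_year = energy_joules * executions_per_year / JOULES_PER_KWH
    return kwh_per_year * price_per_kwh


# Hypothetical block: 3.6 J per run, one million runs a year, 0.25 per kWh.
cost = annual_cost(energy_joules=3.6, executions_per_year=1_000_000, price_per_kwh=0.25)
print(f"estimated annual cost: {cost:.2f}")
```

The execution count matters as much as the per-run estimate, so a Low-tier block on a hot path can cost more per year than a High-tier block that rarely runs.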

🔀
GitHub PR bot

Automatically comments energy regressions on pull requests and requests manager approval when configurable cost thresholds are exceeded.

BibTeX

@misc{goyal2026encodeenergyestimationsource,
  title         = {EnCoDe: Energy Estimation of Source Code At Design-Time},
  author        = {Shailender Goyal and Akhila Matathammal and Karthik Vaidhyanathan},
  year          = {2026},
  eprint        = {2605.00504},
  archivePrefix = {arXiv},
  primaryClass  = {cs.SE},
  url           = {https://arxiv.org/abs/2605.00504},
  doi           = {10.1145/3816483.3816532}
}