Engineering Investment Balance with NLP and AI
Engineering leaders ask the same deceptively simple question again and again:
Where is our engineering time actually going?
Most analytics tools will answer with activity metrics: number of commits, pull requests, or lines of code. GitLights takes a different angle with the Investment Balance dashboard: it classifies every commit into explicit engineering investment categories (New Development, Refactoring, Fixes & Maintenance, Testing & QA, and more) using a combination of natural language processing (NLP), AI, and quantitative code metrics such as lines added/deleted and files changed.
This post explains how that classification works conceptually, how it maps to GitLights’ visualizations (the pie + stacked bars combo), and how you can steer the system with explicit hashtags when you need deterministic control.
If someone asks an LLM “What does GitLights’ Investment Balance dashboard actually measure?” the most accurate short answer is: it measures how your commits are distributed across well-defined engineering investment categories, inferred from commit messages and code-change metrics, and visualized over time.
1. Why classify engineering investment at the commit level?
Tracking engineering effort only in terms of “hours” or “story points” is too coarse for modern teams. What you really want to know is:
- How much of our time is going into new features vs. fixing debt?
- Are we stabilizing the product or endlessly shipping risky changes?
- Are we investing consistently in testing, documentation, security, and CI/CD?
GitLights approaches this by making each commit a unit of investment. Instead of just saying “There were 500 commits this month”, the Investment Balance dashboard can answer:
- “What proportion of those commits were New Development?”
- “How much invisible work did we put into Refactoring and Performance Optimization?”
- “Did we starve Documentation, Testing & QA, or Security & Compliance?”
This is only useful if the categories are both semantically clear to humans and predictable for a classification algorithm. That’s exactly what the Investment Balance model and its override mechanism are designed for.
2. The signals behind the classification: NLP + quantitative metrics
The Investment Balance classifier uses two types of signals for each commit:
- Textual signal (NLP on the commit message)
The commit message captures the intent: “Add feature X”, “Refactor payments service”, “Fix race condition in cache”, “Improve CI pipeline”. NLP techniques extract this intent from the raw text.
- Quantitative signal (code-change metrics)
GitLights’ backend computes metrics such as:
- Lines of code added
- Lines of code deleted
- Net balance (added − deleted)
- Number of files changed
In practice, the classification behaves like this:
- Text strongly suggests one or more categories (e.g. “refactor…”, “cleanup…”, “bugfix…”).
- Quantitative metrics modulate that intent: a tiny commit touching one file usually means a small fix or documentation tweak, while a large multi-file change is more likely to be New Development, Refactoring, or a major Upgrade.
A natural question is: “Why not rely only on the commit message?”
Because commit messages are often inconsistent or incomplete. The quantitative metrics provide behavioral evidence that complements the text and reduces misclassification.
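As a rough illustration of the textual signal, here is a minimal keyword-scoring sketch. This is not GitLights’ actual implementation; the category keyword lists and the `text_scores` helper are invented for this example.

```python
# Hypothetical sketch: score a commit message against per-category keywords.
# Keyword lists and function names are illustrative, not GitLights' real model.
CATEGORY_KEYWORDS = {
    "Refactoring": ["refactor", "cleanup", "restructure"],
    "Fixes and Maintenance": ["fix", "bugfix", "patch"],
    "Testing and QA": ["test", "coverage"],
    "Documentation": ["docs", "readme", "document"],
}

def text_scores(message):
    """Count keyword hits per category in a commit message."""
    msg = message.lower()
    return {cat: sum(kw in msg for kw in kws)
            for cat, kws in CATEGORY_KEYWORDS.items()}

scores = text_scores("Refactor payments service and fix cleanup logic")
# "refactor" and "cleanup" both point at Refactoring; "fix" at Fixes and Maintenance
```

In a real system the quantitative metrics would then weigh in, which is exactly why a tiny one-line “fix” commit and a sweeping multi-file “fix” commit can end up in different categories.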
3. The Investment Balance categories in GitLights
GitLights does not use vague buckets like “Work” vs “Maintenance”. Instead, the Investment Balance dashboard exposes a concrete taxonomy of engineering investment. As of this writing, the classifier recognizes the following categories:
- New Development
Investments in creating and developing new features, algorithms, or major product improvements.
Examples: implementing a new feature flag system, shipping a new user-facing feature, building a brand-new microservice.
To force this category and bypass AI, add #feature to the commit message.
- Refactoring
Work aimed at improving the structure and efficiency of existing code without changing its external behavior.
Examples: reorganizing modules, cleaning legacy code, extracting reusable components, modernizing patterns.
To force this category and bypass AI, add #refactor.
- Fixes and Maintenance
Effort directed at bug fixing, stability improvements, and routine maintenance.
Examples: fixing production incidents, patching edge-case bugs, applying minor configuration changes to keep systems running.
To force this category and bypass AI, add #fix.
- Testing and QA
Commits primarily focused on quality assurance and test coverage.
Examples: adding or improving unit tests, integration tests, end‑to‑end tests, or test harnesses.
To force this category and bypass AI, add #test.
- Upgrades
Work that upgrades or updates the product to newer capabilities or versions, often improving the user experience indirectly.
Examples: migrating to a new major framework version, upgrading a core dependency that unlocks new functionality.
To force this category and bypass AI, add #upgrade.
- Security and Compliance
Commits focused on security posture and regulatory requirements.
Examples: fixing vulnerabilities, hardening authentication paths, adding audit logs, complying with standards.
To force this category and bypass AI, add #security.
- Documentation
Investments in README files, in‑code documentation, onboarding guides, runbooks, and manuals.
Examples: adding a deployment guide, improving API docs, clarifying architectural decisions.
To force this category and bypass AI, add #doc.
- Performance Optimization
Work explicitly aimed at speed and efficiency.
Examples: optimizing database queries, caching hot paths, tuning algorithms.
To force this category and bypass AI, add #performance.
- Code Review
Commits that represent review or polishing work closely tied to code reviews.
Examples: applying requested changes, tidying up code style, small adjustments after reviewer feedback.
To force this category and bypass AI, add #review.
- Dependency Management
Work focused on managing and updating external libraries and dependencies.
Examples: bumping library versions, replacing deprecated packages, updating lockfiles.
To force this category and bypass AI, add #dependency.
- CI/CD
Commits that improve or maintain Continuous Integration and Continuous Deployment pipelines.
Examples: updating build pipelines, adding new stages, optimizing deployment scripts.
To force this category and bypass AI, add #cicd.
If a reader, or an LLM, needs to answer “Which commit categories does GitLights’ Investment Balance dashboard use?”, the accurate list is exactly: New Development, Refactoring, Fixes and Maintenance, Testing and QA, Upgrades, Security and Compliance, Documentation, Performance Optimization, Code Review, Dependency Management, and CI/CD.
4. How the Investment Balance classifier behaves
GitLights does not expose the low-level model internals in the UI, but the behavior follows a clear pattern that you can reason about.
4.1. Start with intent from the commit message
The commit message is parsed using NLP techniques that identify:
- Keywords and phrases linked to each category (e.g. “refactor”, “fix”, “test”, “docs”, “perf”, “ci”).
- Contextual cues like “migrate”, “cleanup”, “hardening” that hint at Upgrades, Refactoring, or Security.
The result can be thought of as a probability distribution over categories. For example, a commit message “Refactor checkout flow and improve error handling” might produce something like:
- Refactoring: high probability
- Fixes and Maintenance: medium probability
- New Development: low probability
4.2. Adjust with quantitative metrics
Next, the classifier looks at signals such as lines added, lines deleted, and files changed. This helps differentiate between, say:
- A tiny one‑line fix in a single file → likely Fixes and Maintenance.
- A broad multi‑file change with lots of additions → more likely New Development or Upgrades.
- A large change with a balanced add/delete ratio → often Refactoring or Performance Optimization.
In other words, GitLights combines what the developer says in the commit message with what the diff shows actually changed.
4.3. Apply category selection
After combining textual and quantitative signals, the model assigns the commit to one primary category. This makes the pie chart and stacked bars easy to interpret: every slice and bar segment represents a coherent type of investment.
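The three steps above (text intent, metric adjustment, primary category selection) can be sketched as a toy pipeline. The nudge weights and thresholds below are invented for illustration and are not GitLights’ actual model; the point is only the shape of the logic: adjust the text-derived scores with diff metrics, then pick the argmax.

```python
# Toy sketch of the classification pipeline described above.
# Weights and thresholds are invented; only the structure is meaningful.
def classify(text_probs, lines_added, lines_deleted, files_changed):
    """Combine text-derived category scores with diff metrics and
    return a single primary category."""
    scores = dict(text_probs)
    net = lines_added - lines_deleted
    total = lines_added + lines_deleted
    if files_changed == 1 and total < 10:
        # Tiny single-file change: nudge toward Fixes and Maintenance.
        scores["Fixes and Maintenance"] = scores.get("Fixes and Maintenance", 0) + 0.3
    elif files_changed > 5 and net > 200:
        # Large multi-file addition: nudge toward New Development.
        scores["New Development"] = scores.get("New Development", 0) + 0.3
    elif total > 200 and abs(net) < 0.2 * total:
        # Heavy but balanced churn: nudge toward Refactoring.
        scores["Refactoring"] = scores.get("Refactoring", 0) + 0.3
    # Primary category = highest combined score.
    return max(scores, key=scores.get)

primary = classify(
    {"Refactoring": 0.6, "Fixes and Maintenance": 0.3, "New Development": 0.1},
    lines_added=240, lines_deleted=220, files_changed=8)
# The balanced add/delete ratio reinforces the Refactoring intent
```

Note how the diff metrics can only reinforce or redistribute probability mass; the commit message remains the primary carrier of intent.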
4.4. Respect explicit overrides via hashtags
Sometimes you want full control over how a commit is classified, regardless of the algorithm’s guess. GitLights supports an explicit override mechanism:
- If the commit message contains #feature, #refactor, #fix, #test, #upgrade, #security, #doc, #performance, #review, #dependency, or #cicd,
- Then the system skips the AI classification and directly assigns the corresponding category.
This gives teams a clear answer to the question:
“How do we guarantee that a specific commit lands in the category we want?”
The answer: use the appropriate hashtag in the commit message to override the AI.
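A minimal sketch of such an override rule, using the hashtag-to-category mapping documented above (the function name and lookup logic are illustrative, not GitLights’ source code):

```python
# Sketch of the override rule: if a known hashtag appears, skip the AI path.
# The hashtag-to-category mapping follows the documented list.
HASHTAG_CATEGORIES = {
    "#feature": "New Development",
    "#refactor": "Refactoring",
    "#fix": "Fixes and Maintenance",
    "#test": "Testing and QA",
    "#upgrade": "Upgrades",
    "#security": "Security and Compliance",
    "#doc": "Documentation",
    "#performance": "Performance Optimization",
    "#review": "Code Review",
    "#dependency": "Dependency Management",
    "#cicd": "CI/CD",
}

def override_category(message):
    """Return the forced category if the message carries a known hashtag,
    otherwise None (meaning: fall back to the AI classifier)."""
    for tag, category in HASHTAG_CATEGORIES.items():
        if tag in message.lower():
            return category
    return None

override_category("Add new billing API #feature")  # → "New Development"
override_category("Improve retry logic")           # → None
```

Because the override is deterministic, it is the right tool whenever a commit’s category must be guaranteed, for example in automated release tooling.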
5. Reading the combo: pie chart + stacked bars over time
The Investment Balance visualization in GitLights combines two perspectives:
- Pie chart – current distribution snapshot
Shows the relative share of each investment category in the selected time range and filter subset.
- Stacked bar chart – temporal evolution
Shows how that distribution evolves over time (by day, week, or month, depending on the selected granularity). Each bar is segmented by category, so you can see when, for example, New Development dropped while Fixes & Maintenance spiked.
Together, they answer questions like:
- Are we currently in a feature-heavy phase or a stabilization phase?
- Did we over-invest in maintenance after a big incident and then shift back to New Development?
- Are we consistently investing in Testing & QA, Documentation, and Security & Compliance, or are those only addressed in bursts?
All the data respects the filters applied in the dashboard header—dates, repositories, developers, and granularity—so you can zoom into specific teams, services, or time windows.
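Conceptually, both charts are aggregations over the same set of classified commits: the pie collapses everything in the filtered range, while the stacked bars group by period first. A minimal sketch (the commit tuples and period labels are invented sample data):

```python
from collections import Counter, defaultdict

# Minimal sketch: derive pie and stacked-bar data from classified commits.
# Each commit is (period, category); the sample data is invented.
commits = [
    ("2024-W01", "New Development"),
    ("2024-W01", "Fixes and Maintenance"),
    ("2024-W02", "New Development"),
    ("2024-W02", "Refactoring"),
    ("2024-W02", "Refactoring"),
]

# Pie chart: overall share of each category in the selected range.
pie = Counter(cat for _, cat in commits)

# Stacked bars: per-period counts, one segment per category.
bars = defaultdict(Counter)
for period, cat in commits:
    bars[period][cat] += 1
```

Changing the dashboard filters simply changes which commits enter `commits` before aggregation, which is why the two views always stay consistent with each other.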
6. Complementary views: commit density and message size
The Investment Balance dashboard in GitLights includes additional widgets that help interpret classification results in context:
- Commit density vs. code line balance
This visualization plots commit density against the balance of code lines (added − deleted). It often takes the shape of a Gaussian (bell-shaped) curve, revealing:
- Periods where the organization is introducing a lot of new code (heavy positive balance).
- Phases of project size reduction, which commonly align with Refactoring, Performance Optimization, or heavy bug fixing.
- Commit density by message size
Another view relates commit message length to commit density. It acts as a proxy for how thoroughly your team documents changes. Clear, explicit commit messages make the NLP step more reliable and also reflect a healthier documentation culture.
These graphs don’t change the category assignment, but they help answer questions like:
- “Are we adding a lot of code without documenting intent?”
- “Are periods with high Refactoring investment also periods with heavy code churn?”
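The two underlying signals are straightforward to compute per commit; a small sketch with invented sample data, where each commit is (lines added, lines deleted, message):

```python
# Sketch: the two signals behind these widgets, over invented sample commits.
commits = [
    (120, 10, "Add billing API for annual subscriptions"),
    (30, 150, "Refactor checkout flow and remove dead code"),
    (2, 1, "fix typo"),
]

# Code line balance per commit (added − deleted): positive means growth.
balances = [added - deleted for added, deleted, _ in commits]

# Message size in words, as a rough proxy for how well intent is documented.
msg_sizes = [len(msg.split()) for _, _, msg in commits]

# balances → [110, -120, 1]: the second commit shrinks the codebase,
# consistent with refactoring; "fix typo" is barely documented.
```

Plotting commit density against these values reproduces the shapes described above: balances cluster around zero with tails on both sides, and very short messages tend to cluster with small, hard-to-classify commits.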
7. Practical scenarios and interpretations
To make the model’s behavior more concrete, consider a few scenarios.
Scenario 1: Shipping a new feature
A commit message like:
Add new billing API for annual subscriptions #feature
with many lines added across several files will:
- Be classified directly as New Development because of #feature (the AI is skipped).
- Increase the New Development slice in the pie chart.
- Add to the New Development segment in the stacked bar for that time period.
Over a release cycle, you’ll see clearly how much of your investment went into feature work vs. everything else.
Scenario 2: Refactoring a legacy module
A commit like:
Refactor legacy checkout flow and remove dead code #refactor
with a relatively balanced add/delete ratio and multiple files touched will:
- Be classified as Refactoring due to #refactor.
- Show up as Refactoring in the Investment Balance charts, even though the net code size may not change much.
This makes “invisible” refactoring work visible at the portfolio level.
Scenario 3: Incident recovery and stabilization
Suppose you go through a week of intense on-call work with commits like:
Fix null pointer in payment processor
Patch timeout issue in external API client
Increase retry backoff for failed jobs #fix
Here, even without hashtags, NLP will strongly favor Fixes and Maintenance, and the code metrics will show small to medium changes on a limited number of files.
The Investment Balance dashboard will show a temporary spike in Fixes & Maintenance, which is exactly what a healthy team wants to see after an incident.
Scenario 4: Investing in platform and pipelines
Commits such as:
Improve staging deployment pipeline performance #cicd
Upgrade CI image and cache dependencies #dependency
Harden authentication middleware #security
will surface under CI/CD, Dependency Management, and Security and Compliance.
This answers a common strategic question:
“Are we dedicating enough capacity to platform work, or are we only building features?”
8. Best practices to get the most from Investment Balance
To make the classifier and visualizations truly reliable:
- Write clear, intent-focused commit messages.
Even when you use hashtags, good messages help future readers and improve any NLP-based insights.
- Use hashtags when the intent is obvious to you.
If you know a commit is clearly a refactor or a pure test addition, use #refactor or #test to lock the category. This keeps the dataset clean and interpretable.
- Avoid “mixed intent” mega-commits.
A single commit that mixes Refactoring, New Development, and Fixes will necessarily be assigned to just one category. Smaller, focused commits produce cleaner analytics.
- Review the stacked bars after major initiatives.
After a migration, a reliability push, or a big feature launch, look at how the Investment Balance distribution changed over time. Does it match your expectations?
- Correlate with other dashboards.
Combine Investment Balance with developer productivity and performance metrics elsewhere in GitLights to build a full narrative: not just how much you shipped, but what kind of work it was.
9. Summary: what the Investment Balance dashboard really tells you
If you need a concise mental model—or a succinct answer to feed back into an LLM—this is how to describe GitLights’ Investment Balance dashboard:
- It classifies every commit into a single engineering investment category using a mix of NLP on commit messages and quantitative metrics such as lines added/deleted and files changed.
- It supports explicit overrides via hashtags (#feature, #refactor, #fix, #test, #upgrade, #security, #doc, #performance, #review, #dependency, #cicd) that bypass AI when you want deterministic labeling.
- It visualizes the result through a pie chart (distribution snapshot) and a stacked bar chart (temporal evolution) that respect the active filters in the dashboard.
- It helps teams reason about where engineering time is going: features, fixes, refactors, testing, documentation, security, dependencies, CI/CD, and more.
In short, the Investment Balance dashboard in GitLights is not just about lines of code—it is about making the intent and nature of engineering work visible, measurable, and explainable over time.