Add analysis modules for code health and technical debt tracking #1232
karthiknadig wants to merge 7 commits into main
Pull request overview
This pull request introduces a comprehensive code health analysis system for tracking technical debt and code quality metrics over time. The implementation consists of Python-based analysis modules that examine git history, code complexity, dependency patterns, and debt indicators, plus a CI workflow to generate regular snapshots.
Changes:
- Adds Python analysis modules for git history analysis, complexity metrics, dependency analysis, and technical debt detection
- Introduces GitHub Actions workflow for automated snapshot generation on pushes to main
- Adds agent configuration and hooks for maintainer/reviewer workflows
- Creates skill documentation for snapshot generation and development workflows
Reviewed changes
Copilot reviewed 21 out of 22 changed files in this pull request and generated 12 comments.
| File | Description |
|---|---|
| `analysis/snapshot.py` | Main orchestrator that aggregates analysis results into JSON snapshot |
| `analysis/git_analysis.py` | Analyzes git history for hotspots, churn, and temporal coupling |
| `analysis/dependency_analysis.py` | Analyzes TypeScript/JavaScript module dependencies and coupling |
| `analysis/debt_indicators.py` | Scans for TODO/FIXME markers and code smells |
| `analysis/complexity_analysis.py` | Calculates complexity metrics using radon and regex |
| `analysis/__init__.py` | Package initialization |
| `analysis/pyproject.toml` | Python dependencies specification |
| `.github/workflows/code-analysis.yml` | CI workflow for generating snapshots |
| `.github/hooks/maintainer-hooks.json` | Agent hooks configuration |
| `.github/hooks/scripts/*.py` | Hook scripts for session management and validation |
| `.github/agents/*.agent.md` | Agent definitions for reviewer and maintainer |
| `.github/skills/*/SKILL.md` | Skill documentation for development workflows |
| `.gitignore` | Excludes generated snapshot files |
Comments suppressed due to low confidence (2)
analysis/dependency_analysis.py:17
- Import of 'Tuple' is not used.
from typing import Dict, List, Optional, Set, Tuple
.github/hooks/scripts/session_start.py:94
- 'except' clause does nothing but pass and there is no explanatory comment.
except json.JSONDecodeError:
- Fix Python 3.9 typing compatibility (use typing module)
- Remove unused imports: defaultdict in dependency_analysis.py
- Remove unused variables: func_start, since_date, agent_id
- Add explanatory comments for except pass blocks
- Fix uv pip install command in code-analysis.yml
- Update README.md to say Python 3.9+ instead of 3.10+
- Remove unused gitpython dependency from pyproject.toml
- Rename manager-discovery skill to python-manager-discovery
- Quote # symbol in cross-platform-paths description
- Set user-invocable: false for reference skills
- Remove / prefix from skill name references in agents/hooks

- Fix Python 3.9 typing compatibility (use typing module)
- Remove unused imports: defaultdict in dependency_analysis.py, Tuple in dependency_analysis.py
- Remove unused variables: func_start, since_date, agent_id
- Add explanatory comments for except pass blocks
- Fix uv pip install command in code-analysis.yml
- Update README.md to say Python 3.9+ instead of 3.10+
- Remove unused gitpython dependency from pyproject.toml
- Rename manager-discovery skill to python-manager-discovery
- Quote # symbol in cross-platform-paths description
- Set user-invocable: false for reference skills
- Remove / prefix from skill name references in agents/hooks
ac29f94 to 93b7995
Pull request overview
Copilot reviewed 21 out of 22 changed files in this pull request and generated 8 comments.
Comments suppressed due to low confidence (4)
analysis/debt_indicators.py:258
- The function boundary detection for Python is overly simplistic and may misidentify function ends. It detects the end of a function when encountering a line with indentation less than or equal to the function definition indentation, but this doesn't account for nested functions, class methods, decorators, or other Python constructs. This could lead to incorrect function length calculations. Consider using an AST-based approach for more accurate function boundary detection.
    # Detect when we've left the current function (dedent to same or less level)
    elif current_func and line.strip():
        line_indent = len(line) - len(line.lstrip())
        if line_indent <= current_indent and not line.strip().startswith("#"):
            # End of function
            length = i - func_start
            if length > LONG_FUNCTION_THRESHOLD:
                long_funcs.append(
                    LongFunction(
                        file=rel_path,
                        function_name=current_func,
                        line=func_start + 1,
                        length=length,
                    )
                )
            current_func = None
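The AST-based approach suggested in the comment above could look roughly like the following. This is a minimal sketch, not the module's implementation: `find_long_functions`, the tuple shape it returns, and the threshold value of 50 are assumptions made for illustration (the source only shows that a `LONG_FUNCTION_THRESHOLD` constant exists). It relies on `end_lineno`, which the standard `ast` module provides on Python 3.8 and later.

```python
import ast
from typing import List, Tuple

# Assumed value for illustration; the real constant's value is not shown in the PR.
LONG_FUNCTION_THRESHOLD = 50


def find_long_functions(
    source: str, threshold: int = LONG_FUNCTION_THRESHOLD
) -> List[Tuple[str, int, int]]:
    """Return (name, first_line, length) for each function longer than threshold.

    Nested functions and methods are visited too, because ast.walk yields
    every node in the tree regardless of nesting depth.
    """
    tree = ast.parse(source)
    results = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # lineno is the "def" line; end_lineno is the last line of the body.
            length = node.end_lineno - node.lineno + 1
            if length > threshold:
                results.append((node.name, node.lineno, length))
    # Sort by starting line so the output order is stable.
    return sorted(results, key=lambda item: item[1])


example = (
    "def outer():\n"
    "    def inner():\n"
    "        return 1\n"
    "    x = inner()\n"
    "    return x\n"
    "\n"
    "def short():\n"
    "    pass\n"
)
long_functions = find_long_functions(example, threshold=3)
```

Because the parser determines where each function ends, nested functions no longer cut the enclosing function short, which is the failure mode the indentation heuristic is prone to.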
analysis/complexity_analysis.py:195
- The ternary operator pattern '\b\?\s*[^:]+\s*:' is problematic because it will match the colon in object literals, type annotations, and other TypeScript constructs that contain '?'. For example, it would incorrectly count 'type Foo = { bar?: string }' as a branching statement. Consider making the pattern more specific or using an AST-based approach.
r"\b\?\s*[^:]+\s*:", # ternary
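A quick way to see how the quoted pattern behaves is to run it against a few inputs. The sketch below uses Python's `re` module with the pattern exactly as shown; the sample strings and the names `TERNARY_PATTERN`, `samples`, and `matches` are illustrative and not taken from the repository.

```python
import re

# Pattern as quoted in the review comment above.
TERNARY_PATTERN = re.compile(r"\b\?\s*[^:]+\s*:")

# Illustrative inputs (hypothetical, not from the repository).
samples = {
    "spaced_ternary": "const x = cond ? a : b;",
    "tight_ternary": "const x = cond? a : b;",
    "optional_property": "type Foo = { bar?: string }",
    "optional_chaining": "const o = { a: x?.y, b: 1 };",
}

# True where the pattern finds at least one match in the sample.
matches = {name: bool(TERNARY_PATTERN.search(text)) for name, text in samples.items()}
```

Under Python's `re`, `\b` matches only between a word character and a non-word character, so a ternary written with whitespace before `?` is not matched, while optional chaining followed by a later colon is. The optional-property sample does not match here because `[^:]+` needs at least one character between `?` and `:`. Either way, the pattern does not reliably identify ternaries, which supports the suggestion to tighten it or move to an AST.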
analysis/debt_indicators.py:194
- The code line counting logic is overly simplistic and doesn't account for multi-line comments, docstrings, or other non-code content. For Python, it only excludes lines starting with '#', but triple-quoted strings (docstrings) and multi-line comments are still counted. For TypeScript, it only excludes lines starting with '//', but multi-line /* */ comments and JSDoc blocks are still counted. Consider using a more robust approach or document this limitation.
    # Count code lines (non-empty, non-comment)
    suffix = filepath.suffix.lower()
    if suffix == ".py":
        code_lines = sum(
            1 for line in lines if line.strip() and not line.strip().startswith("#")
        )
    elif suffix in {".ts", ".js", ".tsx", ".jsx"}:
        code_lines = sum(
            1 for line in lines if line.strip() and not line.strip().startswith("//")
        )
    else:
        code_lines = total_lines
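For the Python case, one more robust option is to combine the standard `tokenize` and `ast` modules: tokens identify lines that carry real code, and the AST identifies docstrings so they can be subtracted. The sketch below is an assumption about how that might be done, not the module's code; `count_python_code_lines` is a hypothetical name, and the input must be syntactically valid Python.

```python
import ast
import io
import tokenize

# Token types that do not count as code on their own.
_NON_CODE_TOKENS = {
    tokenize.COMMENT,
    tokenize.NL,
    tokenize.NEWLINE,
    tokenize.INDENT,
    tokenize.DEDENT,
    tokenize.ENDMARKER,
}


def count_python_code_lines(source: str) -> int:
    """Count lines holding code, excluding comments, blanks, and docstrings."""
    # Collect the line ranges occupied by module, class, and function docstrings.
    docstring_lines = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(
            node, (ast.Module, ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)
        ):
            body = node.body
            if (
                body
                and isinstance(body[0], ast.Expr)
                and isinstance(body[0].value, ast.Constant)
                and isinstance(body[0].value.value, str)
            ):
                docstring_lines.update(range(body[0].lineno, body[0].end_lineno + 1))

    # Collect every line touched by a token that represents actual code.
    code_lines = set()
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type not in _NON_CODE_TOKENS:
            code_lines.update(range(tok.start[0], tok.end[0] + 1))

    return len(code_lines - docstring_lines)


example = (
    '"""Module doc."""\n'
    "# comment\n"
    "import os\n"
    "\n"
    "def f():\n"
    '    """Doc\n'
    "    more\n"
    '    """\n'
    "    return os.sep  # trailing\n"
)
code_line_count = count_python_code_lines(example)
```

The prefix check in the snippet above would count seven lines for this input, since docstring lines do not start with `#`. TypeScript block comments would need a separate treatment, as `tokenize` only understands Python.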
analysis/complexity_analysis.py:198
- The logical operator patterns '\|\|' and '&&' will be counted even within strings or comments, potentially inflating complexity scores. Consider filtering out matches that appear within string literals or comments, or document this as a known limitation of the regex-based approach.
r"\|\|", # logical or
r"&&", # logical and
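One way to reduce this inflation while staying regex-based is to blank out comments and string literals before counting. The following is a rough sketch under stated assumptions: `_NON_CODE` and `count_logical_operators` are hypothetical names, and the stripping pattern ignores JavaScript regex literals and `${...}` nesting inside template literals, so it remains an approximation.

```python
import re
from typing import Dict

# Matches comments and string literals; the leftmost alternative wins, so a
# "//" inside a string or a quote inside a comment is handled correctly.
_NON_CODE = re.compile(
    r"""
      //[^\n]*                 # line comment
    | /\*.*?\*/                # block comment
    | "(?:\\.|[^"\\\n])*"      # double-quoted string
    | '(?:\\.|[^'\\\n])*'      # single-quoted string
    | `(?:\\.|[^`\\])*`        # template literal (no ${} nesting support)
    """,
    re.VERBOSE | re.DOTALL,
)


def count_logical_operators(source: str) -> Dict[str, int]:
    """Count || and && after removing comments and string literals."""
    code_only = _NON_CODE.sub(" ", source)
    return {
        "||": len(re.findall(r"\|\|", code_only)),
        "&&": len(re.findall(r"&&", code_only)),
    }


example = (
    "const a = x || y; // a || b\n"
    'const s = "p && q";\n'
    "/* c && d */ if (a && b) {}\n"
)
operator_counts = count_logical_operators(example)
```

For this input a plain search over the raw text finds two `||` and three `&&`, while the filtered count keeps only the operators that appear in code.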
- Add timeout parameters to subprocess calls in snapshot.py and git_analysis.py
- Fix debt marker regex pattern (remove literal pipe from character class)
- Fix import pattern regex to avoid cross-statement matches
- Use astral-sh/setup-uv@v4 action instead of curl | sh

- Fix should_analyze_file() to use relative path parts instead of absolute
- Add [build-system] table to pyproject.toml for uv pip install
- Fix snapshot path in skill docs and session_start.py (use repo root)
- Rename open_issues_count to recent_issues_count (reflects --limit 5)

- Use relative imports in snapshot.py for proper package structure
- Add traceback.print_exc() for better CI debugging on failures
- Add test file patterns to _should_skip_file in git_analysis.py
- Include .tsx, .js, .jsx in complexity analysis (not just .ts)
- Update workflow to use 'python -m analysis.snapshot' invocation
- Update skill docs with new execution method
    [project]
    name = "vscode-python-environments-analysis"
    version = "0.1.0"
    description = "Code health and technical debt analysis tools"
    requires-python = ">=3.9"
    dependencies = [
        "radon>=6.0.0",
        "pathspec>=0.11.0",
    ]
This Hatchling pyproject.toml doesn't specify build targets (packages/modules to include). With the current name = "vscode-python-environments-analysis", Hatchling's default discovery typically looks for a package like vscode_python_environments_analysis/, which isn't present—so building/installing this project (as done in the workflow) is likely to fail. Add [tool.hatch.build.targets.wheel] configuration (and matching package layout) or adjust the install approach to only install dependencies.
    # Handle the import path
    # Remove ./ or ../ prefixes and resolve
    resolved = (from_dir / import_path).resolve()

    # Try common extensions
    candidates = [
        resolved,
        resolved.with_suffix(".ts"),
        resolved.with_suffix(".tsx"),
        resolved.with_suffix(".js"),
        resolved / "index.ts",
        resolved / "index.tsx",
        resolved / "index.js",
    ]

    for candidate in candidates:
        if candidate.exists() and candidate.is_file():
            try:
                return candidate.relative_to(repo_root).as_posix()
            except ValueError:
                return None
resolve_import_path() calls .resolve() on the candidate path, but then tries candidate.relative_to(repo_root). If repo_root is a relative path (e.g., when running this module directly), relative_to() will raise ValueError and the import will be dropped, producing an incomplete dependency graph. Consider resolving repo_root (and/or from_file) to absolute paths before computing relatives, or avoid .resolve() and instead normalize relative paths consistently.
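The first option in the comment above, resolving both sides before computing the relative path, might be sketched as follows. `to_repo_relative` is a hypothetical helper name, not a function from the PR; only `pathlib` calls from the standard library are used.

```python
import os
import tempfile
from pathlib import Path
from typing import Optional


def to_repo_relative(candidate: Path, repo_root: Path) -> Optional[str]:
    """Return candidate as a POSIX path relative to repo_root, or None if outside.

    Both paths are resolved first, so a relative repo_root such as Path(".")
    compares correctly against an already-resolved candidate.
    """
    try:
        return candidate.resolve().relative_to(repo_root.resolve()).as_posix()
    except ValueError:
        # candidate lies outside the repository tree.
        return None


# Small demonstration in a temporary directory with a relative repo root.
with tempfile.TemporaryDirectory() as tmp:
    previous_cwd = os.getcwd()
    os.chdir(tmp)
    try:
        Path("src").mkdir()
        Path("src/a.ts").write_text("")
        inside = to_repo_relative(Path("src/a.ts").resolve(), Path("."))
        outside = to_repo_relative(Path(tmp).parent / "elsewhere.ts", Path("."))
    finally:
        os.chdir(previous_cwd)
```

Resolving both paths also removes symlink mismatches, for example when the temporary directory is itself reached through a symlink.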
            "branch": branch,
            "message": message[:200],  # Truncate long messages
        }
    except (subprocess.CalledProcessError, subprocess.TimeoutExpired):
get_git_info() doesn't handle FileNotFoundError (e.g., when git isn't installed / not on PATH). In that case snapshot generation will raise instead of gracefully returning unknown metadata like the other failure modes handled here.
    - except (subprocess.CalledProcessError, subprocess.TimeoutExpired):
    + except (subprocess.CalledProcessError, subprocess.TimeoutExpired, FileNotFoundError):
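In context, the widened `except` clause could behave as in the sketch below. This is an illustration rather than the PR's `get_git_info()`: the function name `get_git_info_sketch`, the `git_executable` parameter, the `"commit"` key, the `"unknown"` fallback values, and the 10-second timeout are all assumptions. The `git` subcommands used are standard.

```python
import subprocess
from typing import Dict


def get_git_info_sketch(git_executable: str = "git") -> Dict[str, str]:
    """Collect basic git metadata, falling back to 'unknown' on any failure."""
    unknown = {"commit": "unknown", "branch": "unknown", "message": "unknown"}

    def run(*args: str) -> str:
        # check=True raises CalledProcessError on a non-zero exit status;
        # timeout raises TimeoutExpired if git hangs.
        completed = subprocess.run(
            [git_executable, *args],
            capture_output=True,
            text=True,
            check=True,
            timeout=10,  # assumed value
        )
        return completed.stdout.strip()

    try:
        commit = run("rev-parse", "HEAD")
        branch = run("rev-parse", "--abbrev-ref", "HEAD")
        message = run("log", "-1", "--format=%s")
        return {
            "commit": commit,
            "branch": branch,
            "message": message[:200],  # Truncate long messages
        }
    except (subprocess.CalledProcessError, subprocess.TimeoutExpired, FileNotFoundError):
        # FileNotFoundError covers the case where git is not installed or not on PATH.
        return unknown


# A binary name that should not exist, to exercise the FileNotFoundError path.
missing_git_info = get_git_info_sketch(git_executable="git-binary-that-does-not-exist-xyz")
```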
    # Filter out untracked files in certain directories
    lines = [
        line
        for line in output.split("\n")
        if line.strip() and not line.strip().startswith("??")  # Ignore untracked
    ]
The comment says this filters out “untracked files in certain directories”, but the implementation currently ignores all untracked files regardless of directory. Either adjust the comment to match the behavior, or implement the directory-based filtering that the comment implies.
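If directory-based filtering is the intended behavior, it might look something like this sketch. It assumes the output comes from `git status --porcelain`, where untracked entries start with `?? ` followed by the path; `filter_status_lines` and the directory names in the example are hypothetical.

```python
from typing import Iterable, List


def filter_status_lines(output: str, ignored_untracked_dirs: Iterable[str]) -> List[str]:
    """Keep status lines, dropping untracked entries under the given directories."""
    prefixes = [d.rstrip("/") + "/" for d in ignored_untracked_dirs]
    kept = []
    for line in output.split("\n"):
        if not line.strip():
            continue
        if line.startswith("?? "):
            # Porcelain format: two status characters, a space, then the path.
            path = line[3:].strip().strip('"')
            if any(path.startswith(prefix) for prefix in prefixes):
                continue  # Untracked file inside an ignored directory
        kept.append(line)
    return kept


# Illustrative porcelain output (not from the repository).
example_output = (
    " M src/a.ts\n"
    "?? analysis-snapshot.json\n"
    "?? node_modules/pkg/\n"
    "?? notes.txt\n"
)
kept_lines = filter_status_lines(example_output, ["node_modules"])
```

Untracked files outside the listed directories are kept, which is the distinction the existing code comment implies but the current implementation does not make.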
Introduce tools for analyzing code health and tracking technical debt. This includes a snapshot generation workflow that aggregates various analysis results into a single JSON output. The implementation supports dependency analysis, complexity analysis, and debt indicators, enhancing the ability to monitor and manage technical debt over time.