Code Complexity Refactoring Backlog

This document tracks functions with cognitive complexity > 15 that need refactoring.

Critical Priority (Complexity >= 40)

1. DevToContentAnalyzer._determine_content_type - Complexity: 3 ✅

File: src/devto_mirror/ai_optimization/content_analyzer.py:639
Original Complexity: 45
Current Complexity: 3
Target Complexity: ≤ 15

Refactoring Strategy:

Extract content type detection logic into separate helper methods
Create a mapping/strategy pattern for content type detection
Reduce nested conditionals using early returns
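
The mapping and early-return ideas above can be sketched as follows. This is a minimal illustration, not the project's actual implementation: the rule list, the content type names, and the `tags`/`title` fields are all assumptions.

```python
from typing import Callable

# Hypothetical rules: each pairs a content type with a predicate over the
# article's lowercase tags and title. The real analyzer's categories and
# signals may differ.
ContentRule = tuple[str, Callable[[set[str], str], bool]]

CONTENT_TYPE_RULES: list[ContentRule] = [
    ("tutorial", lambda tags, title: "tutorial" in tags or title.startswith("how to")),
    ("discussion", lambda tags, title: "discuss" in tags),
    ("news", lambda tags, title: "news" in tags),
]


def determine_content_type(article: dict) -> str:
    """Return the first matching content type, or "article" as a fallback."""
    # Guard clause: handle missing input up front instead of nesting.
    if not article:
        return "article"

    tags = {str(tag).lower() for tag in article.get("tags", [])}
    title = str(article.get("title", "")).lower()

    # One flat loop over the rule table replaces a long if/elif chain.
    for content_type, matches in CONTENT_TYPE_RULES:
        if matches(tags, title):
            return content_type
    return "article"
```

Adding a content type then means adding one entry to the rule table, which leaves the function's complexity unchanged.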

2. DevToSchemaGenerator.generate_article_schema - Complexity: 7 ✅

File: src/devto_mirror/ai_optimization/schema_generator.py:252
Original Complexity: 55
Current Complexity: 7
Target Complexity: ≤ 15

Refactoring Completed:

Extracted author extraction into _extract_author_info()
Extracted date handling into _extract_dates() with _normalize_iso_date() helper
Extracted image handling into _extract_images()
Extracted tags/keywords into _extract_keywords()
Extracted engagement metrics into _extract_engagement_metrics()
Extracted content metrics into _extract_content_metrics()
Refactored main method as orchestrator calling helper methods
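
A rough sketch of the orchestrator shape described above. The helper names come from this list, but their bodies, the standalone-function form, and the article field names (`user`, `published_at`, `edited_at`, `tags`) are assumptions made for illustration.

```python
def _extract_author_info(article: dict) -> dict:
    # Assumed field layout: article["user"]["name"] or ["username"].
    user = article.get("user") or {}
    name = user.get("name") or user.get("username")
    return {"author": {"@type": "Person", "name": name}} if name else {}


def _extract_dates(article: dict) -> dict:
    dates = {}
    if article.get("published_at"):
        dates["datePublished"] = article["published_at"]
    if article.get("edited_at"):
        dates["dateModified"] = article["edited_at"]
    return dates


def _extract_keywords(article: dict) -> dict:
    tags = article.get("tags") or []
    return {"keywords": ", ".join(tags)} if tags else {}


def generate_article_schema(article: dict) -> dict:
    """Orchestrator: build the base schema, then merge each helper's fragment."""
    schema = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": article.get("title", ""),
    }
    # Each helper returns a (possibly empty) dict, so the main method
    # contains no branching of its own.
    for extract in (_extract_author_info, _extract_dates, _extract_keywords):
        schema.update(extract(article))
    return schema
```

Because every helper returns a dict fragment, each one can be unit-tested in isolation and the orchestrator stays a flat sequence of calls.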

High Priority (Complexity 20-39)

3. DevToMetadataEnhancer._determine_content_type - Complexity: 24

File: src/devto_mirror/ai_optimization/metadata_enhancer.py:119
Current Complexity: 24
Target Complexity: ≤ 15

Refactoring Strategy:

Similar to item #1, extract into helper methods
Consider sharing logic with DevToContentAnalyzer._determine_content_type

4. DevToContentAnalyzer.extract_api_metrics - Complexity: 20 ✓ COMPLETED

File: src/devto_mirror/ai_optimization/content_analyzer.py:92
Original Complexity: 20
Target Complexity: ≤ 15
Status: ✓ Refactored

Refactoring Approach:

Created _validate_numeric_metric() helper method to centralize validation logic
Replaced 5 repetitive nested conditionals with configuration-driven approach
Used metric configuration list to define validation rules (key, min_value)
Single loop iterates through configuration, eliminating nested conditionals

Changes:

Added _validate_numeric_metric(value, min_value) method for validation
Refactored extract_api_metrics() to use metric configuration
Maintained backward compatibility; all existing tests pass

5. DevToMetadataEnhancer._add_article_meta_tags - Complexity: 8 ✅

File: src/devto_mirror/ai_optimization/metadata_enhancer.py:79
Original Complexity: 19
Current Complexity: 8
Target Complexity: ≤ 15

Refactoring Applied:

Extracted _extract_author_name() helper for author extraction
Extracted _extract_published_date() helper for date extraction
Extracted _ensure_iso_timezone() helper for timezone normalization
Simplified main method to orchestrate helper calls

6. DevToAISitemapGenerator._determine_content_type - Complexity: 18

File: src/devto_mirror/ai_optimization/sitemap_generator.py:312
Current Complexity: 18
Target Complexity: ≤ 15

Refactoring Strategy:

Share implementation with other _determine_content_type methods
Consider creating a base class or mixin for content type determination
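
One possible shape for the shared implementation is a mixin, sketched below. `ContentTypeMixin`, its tag mapping, and the `tags` field are hypothetical, and the two classes are empty stand-ins named after the real ones, not the project's code.

```python
class ContentTypeMixin:
    """Hypothetical mixin that holds the shared content type logic."""

    # Assumed tag-to-type mapping; the real detection signals may differ.
    CONTENT_TYPE_BY_TAG = {
        "tutorial": "tutorial",
        "discuss": "discussion",
        "news": "news",
    }
    DEFAULT_CONTENT_TYPE = "article"

    def _determine_content_type(self, article: dict) -> str:
        for tag in article.get("tags") or []:
            content_type = self.CONTENT_TYPE_BY_TAG.get(str(tag).lower())
            if content_type:
                return content_type
        return self.DEFAULT_CONTENT_TYPE


class DevToMetadataEnhancer(ContentTypeMixin):
    """Stand-in stub: inherits the shared method unchanged."""


class DevToAISitemapGenerator(ContentTypeMixin):
    """Stand-in stub: extends the mapping without copying the method."""

    CONTENT_TYPE_BY_TAG = {**ContentTypeMixin.CONTENT_TYPE_BY_TAG, "showdev": "showcase"}
```

With this layout the complexity of `_determine_content_type` is paid once, and a class that needs different rules overrides only the mapping.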

7. _fetch_article_pages - Complexity: 18 ✅

File: scripts/generate_site.py:81
Original Complexity: 18
Final Complexity: ≤ 15
Status: COMPLETED

Refactoring Applied:

Created src/devto_mirror/api_client.py module with helper functions
Extracted session creation logic to create_devto_session()
Extracted retry logic to fetch_page_with_retry()
Extracted date filtering to filter_new_articles()
Migrated API client logic from scripts/ to src/devto_mirror/ package
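
The retry and date-filtering helpers can be sketched without any network code. The function names match the list above, but the signatures, the injected `fetch` callable, and the `published_at` field are assumptions; the real module presumably wraps an HTTP session instead.

```python
import time
from datetime import datetime
from typing import Callable


def fetch_page_with_retry(
    fetch: Callable[[int], list], page: int, retries: int = 3, backoff: float = 0.5
) -> list:
    """Call fetch(page), retrying with exponential backoff on any exception."""
    for attempt in range(retries):
        try:
            return fetch(page)
        except Exception:
            # Re-raise once the last attempt has failed.
            if attempt == retries - 1:
                raise
            time.sleep(backoff * 2 ** attempt)
    return []  # only reached when retries is 0


def _parse_timestamp(value: str) -> datetime:
    # Translate a trailing "Z" so older Python versions can parse it.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def filter_new_articles(articles: list, last_run: str) -> list:
    """Keep articles published strictly after last_run.

    Assumes all timestamps are either timezone-aware or all naive.
    """
    cutoff = _parse_timestamp(last_run)
    return [a for a in articles if _parse_timestamp(a["published_at"]) > cutoff]
```

Passing the fetch function in as a parameter keeps the retry loop testable with a fake that fails a fixed number of times.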

Medium Priority (Complexity 16-19)

8. GitHubPagesCrawlerAnalyzer.analyze_robots_txt - Complexity: 17

File: scripts/analyze_github_pages_crawlers.py:32
Current Complexity: 17
Target Complexity: ≤ 15

Refactoring Strategy:

Extract robots.txt parsing into separate method
Extract permission analysis into helper function
Simplify user-agent handling logic
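
As one way to split parsing from permission analysis, the standard library's `urllib.robotparser` could handle the parsing step. The two function names below are hypothetical, and the existing script may parse robots.txt by hand for reasons not captured here.

```python
from urllib.robotparser import RobotFileParser


def parse_robots_txt(content: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(content.splitlines())
    return parser


def analyze_permissions(parser: RobotFileParser, user_agents: list, path: str = "/") -> dict:
    """Map each user agent to whether it may fetch the given path."""
    return {agent: parser.can_fetch(agent, path) for agent in user_agents}
```

A short usage example with an inline robots.txt:

```python
robots = "User-agent: GPTBot\nDisallow: /\n\nUser-agent: *\nAllow: /\n"
permissions = analyze_permissions(parse_robots_txt(robots), ["GPTBot", "Googlebot"])
# permissions == {"GPTBot": False, "Googlebot": True}
```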

9. DevToMetadataEnhancer.add_source_attribution_metadata - Complexity: 5 ✅

File: src/devto_mirror/ai_optimization/metadata_enhancer.py:283
Original Complexity: 17
Current Complexity: 5
Target Complexity: ≤ 15

Refactoring Applied:

Extracted _build_canonical_metadata() for canonical URL processing
Extracted _build_api_metadata() for API data processing
Extracted _add_engagement_metrics() for metric processing
Used configuration-driven approach for engagement metrics

10. DevToContentAnalyzer.extract_code_languages - Complexity: 16 ✅

File: src/devto_mirror/ai_optimization/content_analyzer.py:212
Current Complexity: ≤ 15 (Refactored)
Target Complexity: ≤ 15

Refactoring Strategy:

Extract language detection patterns
Extract confidence scoring logic
Simplify nested loops
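
A small sketch of pulling the detection pattern out into module-level data. It only reads the language label on fenced code blocks and uses occurrence counts as a crude stand-in for confidence scoring; the pattern, the alias table, and the ranking rule are assumptions, not the project's logic.

```python
import re
from collections import Counter

# Matches the language label after an opening backtick or tilde fence.
FENCE_PATTERN = re.compile(r"^(?:`{3}|~{3})[ \t]*([A-Za-z0-9_+#-]+)", re.MULTILINE)

# Hypothetical alias table that folds short labels into canonical names.
LANGUAGE_ALIASES = {"py": "python", "js": "javascript", "ts": "typescript", "sh": "shell"}


def extract_code_languages(markdown: str) -> list:
    """Return detected languages, most frequent first."""
    counts = Counter()
    # findall() yields one label per opening fence; closing fences carry
    # no label, so they never match.
    for raw in FENCE_PATTERN.findall(markdown):
        name = raw.lower()
        counts[LANGUAGE_ALIASES.get(name, name)] += 1
    return [language for language, _ in counts.most_common()]
```

Keeping the pattern and aliases as data leaves the function body as a single flat loop.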

11. DevToSchemaGenerator (class) - ✅ COMPLETED

File: src/devto_mirror/ai_optimization/schema_generator.py:20
Original Complexity: 17 → Current Complexity: 7
Target Complexity: ≤ 15 ✓

Refactoring Applied:

Extracted _extract_author_info() helper method (complexity: 8)
Extracted _extract_dates() and _ensure_iso_format() helper methods (complexity: 5, 6)
Extracted _extract_image() helper method (complexity: 5)
Extracted _extract_tags() helper method (complexity: 4)
Extracted _calculate_word_count() helper method (complexity: 2)
Extracted _extract_engagement_metrics() helper method (complexity: 12)
Removed duplicate code blocks from generate_article_schema
Reduced generate_article_schema complexity from 55 to 14
Average class complexity: 6.08

Status Summary

Total Functions: 11
Completed: 8
Remaining: 3 (items 3, 6, and 8)
Critical Priority (>= 40): 2 of 2 completed ✅
High Priority (20-39): 3 of 5 completed
Medium Priority (16-19): 3 of 4 completed

Refactoring Guidelines

Maximum Cognitive Complexity: 15 per function
Extract Methods: Break down complex functions into smaller, focused helpers
Early Returns: Use guard clauses to reduce nesting
Strategy Pattern: Use dictionaries/mappings instead of long if-else chains
Single Responsibility: Each function should do one thing well
Code Reuse: Look for duplicated logic across similar functions
