Add documentation for PR merge time analysis tools #2789
Open: Copilot wants to merge 5 commits into `dev` from `copilot/calculate-pr-merge-time`
Commits:
- 73be4d9 Initial plan (Copilot)
- cc4b1b9 Add documentation for PR merge time analysis scripts (Copilot)
- 4119323 Update calculate_pr_merge_time_mcp.py (moumighosh)
- 1a6b0b3 Update calculate_pr_merge_time_mcp.py (moumighosh)
- fb02186 Update calculate_pr_merge_time.py (moumighosh)
```diff
@@ -0,0 +1,94 @@
+# Pull Request Merge Time Analysis
+
+This repository contains scripts to calculate the average time to merge for pull requests in the Microsoft Authentication Library Common for Android repository.
+
+## Scripts
+
+### 1. `calculate_pr_merge_time.py`
+
+This is the main script that queries the GitHub API directly to fetch pull request data and calculate merge time statistics.
+
+**Features:**
+- Fetches merged pull requests from the repository
+- Calculates time difference between PR creation and merge
+- Provides comprehensive statistics including:
+  - Average merge time
+  - Median merge time
+  - Fastest and slowest merge times
+  - Standard deviation
+  - Sample of recent merged PRs
+
+**Usage:**
+```bash
+python3 calculate_pr_merge_time.py
+```
+
+**Requirements:**
+- Python 3.6+
+- `requests` library (`pip install requests`)
+- Optional: GitHub personal access token (set as `GITHUB_TOKEN` environment variable for higher API rate limits)
+
+**Setting up GitHub Token:**
+```bash
+export GITHUB_TOKEN=your_personal_access_token_here
+python3 calculate_pr_merge_time.py
+```
+
+### 2. `calculate_pr_merge_time_mcp.py`
+
+This script is designed to work with pre-fetched pull request data (e.g., from GitHub MCP server tools or GitHub API responses saved to a file).
+
+**Usage:**
+```bash
+python3 calculate_pr_merge_time_mcp.py <pr_data.json>
+```
+
+Where `pr_data.json` contains pull request data from the GitHub API.
+
+## Example Output
+
+```
+================================================================================
+Pull Request Merge Time Statistics for AzureAD/microsoft-authentication-library-common-for-android
+================================================================================
+
+Total PRs Analyzed: 100
+
+Average Time to Merge: 2 days, 5 hours
+(53.24 hours)
+
+Median Time to Merge: 1 day, 18 hours
+(42.15 hours)
+
+Fastest Merge: 1 hour, 28 minutes
+(1.47 hours)
+
+Slowest Merge: 15 days, 3 hours
+(363.12 hours)
+
+Standard Deviation: 72.34 hours
+
+================================================================================
+
+Sample of Recent Merged Pull Requests:
+--------------------------------------------------------------------------------
+PR #2782: 1 hour, 28 minutes
+Title: Manual merging working/release/23.0.0 to release/23.0.0
+
+PR #2780: 1 day, 15 hours
+Title: [DEV] [Cherry-pick September Hotfix] Don't set browser pkg name
+
+... and 98 more PRs
+================================================================================
+```
+
+## Configuration
+
+Both scripts are pre-configured to analyze the `AzureAD/microsoft-authentication-library-common-for-android` repository. To analyze a different repository, modify the `owner` and `repo` variables in the `main()` function of the respective script.
+
+## Notes
+
+- The scripts only analyze **merged** pull requests (PRs that were closed without merging are excluded)
+- By default, the scripts analyze up to 100 most recently updated pull requests
+- API rate limits apply when using the GitHub API without authentication (60 requests per hour)
+- With a GitHub personal access token, the rate limit increases to 5,000 requests per hour
```
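The notes above say that only merged pull requests count and that the second script reads pre-fetched data. A minimal sketch of that filtering and averaging step follows. It assumes the input is a list of PR objects shaped like the GitHub REST API response, with `created_at` and `merged_at` fields; the exact layout of `pr_data.json` is not shown in this diff. `SAMPLE_PRS`, `parse_ts`, `merge_hours`, and `summarize` are hypothetical names and data used only for illustration.

```python
import statistics
from datetime import datetime, timezone

# Hypothetical sample data; real input would come from the GitHub API or a saved JSON file.
SAMPLE_PRS = [
    {"number": 1, "created_at": "2024-01-01T00:00:00Z", "merged_at": "2024-01-01T06:00:00Z"},
    {"number": 2, "created_at": "2024-01-02T00:00:00Z", "merged_at": None},  # closed without merging
    {"number": 3, "created_at": "2024-01-03T00:00:00Z", "merged_at": "2024-01-04T00:00:00Z"},
]


def parse_ts(value):
    """Parse a 'YYYY-MM-DDTHH:MM:SSZ' timestamp as an aware UTC datetime."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def merge_hours(prs):
    """Return the merge time in hours for each merged PR, skipping unmerged ones."""
    hours = []
    for pr in prs:
        if not pr.get("merged_at"):
            continue  # closed without merging, so excluded from the statistics
        delta = parse_ts(pr["merged_at"]) - parse_ts(pr["created_at"])
        hours.append(delta.total_seconds() / 3600)
    return hours


def summarize(prs):
    """Compute count, mean, and median merge time; empty dict if nothing was merged."""
    hours = merge_hours(prs)
    if not hours:
        return {}
    return {
        "count": len(hours),
        "average_hours": statistics.mean(hours),
        "median_hours": statistics.median(hours),
    }


if __name__ == "__main__":
    print(summarize(SAMPLE_PRS))
```

With the sample data, PR 2 is dropped because its `merged_at` is empty, so the summary covers two PRs with merge times of 6 and 24 hours.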
```diff
@@ -0,0 +1,281 @@
+#!/usr/bin/env python3
+"""
+Script to calculate the average time to merge for pull requests in a GitHub repository.
+
+This script queries the GitHub API to fetch closed pull requests that were merged,
+calculates the time difference between when each PR was created and when it was merged,
+and computes the average merge time.
+
+Usage:
+    python3 calculate_pr_merge_time.py
+
+Requirements:
+    - Python 3.6+
+    - requests library (pip install requests)
+    - GitHub personal access token (optional, for higher API rate limits)
+      Set as GITHUB_TOKEN environment variable
+"""
+
+import os
+import sys
+import json
+from datetime import datetime
+from typing import List, Dict, Optional
+import statistics
+
+try:
+    import requests
+except ImportError:
+    print("Error: 'requests' library not found.")
+    print("Please install it using: pip install requests")
+    sys.exit(1)
+
+
+class PRMergeTimeCalculator:
+    """Calculate average merge time for GitHub pull requests."""
+
+    def __init__(self, owner: str, repo: str, token: Optional[str] = None):
+        """
+        Initialize the calculator.
+
+        Args:
+            owner: Repository owner (username or organization)
+            repo: Repository name
+            token: GitHub personal access token (optional)
+        """
+        self.owner = owner
+        self.repo = repo
+        self.token = token
+        self.base_url = "https://api.github.com"
+        self.headers = {
+            "Accept": "application/vnd.github.v3+json"
+        }
+        if self.token:
+            self.headers["Authorization"] = f"token {self.token}"
+
+    def fetch_merged_pull_requests(self, max_prs: int = 100) -> List[Dict]:
+        """
+        Fetch merged pull requests from the repository.
+
+        Args:
+            max_prs: Maximum number of PRs to fetch (default: 100)
+
+        Returns:
+            List of pull request data dictionaries
+        """
+        url = f"{self.base_url}/repos/{self.owner}/{self.repo}/pulls"
+        params = {
+            "state": "closed",
+            "per_page": min(max_prs, 100),
+            "sort": "updated",
+            "direction": "desc"
+        }
+
+        merged_prs = []
+        page = 1
+
+        while len(merged_prs) < max_prs:
+            params["page"] = page
+            try:
+                response = requests.get(url, headers=self.headers, params=params, timeout=30)
+                response.raise_for_status()
+                prs = response.json()
+
+                if not prs:
+                    break
+
+                # Filter only merged PRs
+                for pr in prs:
+                    if pr.get("merged_at"):
+                        merged_prs.append(pr)
+                        if len(merged_prs) >= max_prs:
+                            break
+
+                page += 1
+
+                # Check rate limit
+                if "X-RateLimit-Remaining" in response.headers:
+                    remaining = int(response.headers["X-RateLimit-Remaining"])
+                    if remaining < 10:
+                        print(f"Warning: Only {remaining} API requests remaining")
+
+            except requests.exceptions.RequestException as e:
+                print(f"Error fetching pull requests: {e}")
+                break
+
+        return merged_prs
+
+    def calculate_merge_time(self, pr: Dict) -> Optional[float]:
+        """
+        Calculate the time to merge for a single pull request.
+
+        Args:
+            pr: Pull request data dictionary
+
+        Returns:
+            Merge time in hours, or None if calculation fails
+        """
+        try:
+            created_at = datetime.strptime(pr["created_at"], "%Y-%m-%dT%H:%M:%SZ")
+            merged_at = datetime.strptime(pr["merged_at"], "%Y-%m-%dT%H:%M:%SZ")
+
+            time_diff = merged_at - created_at
+            hours = time_diff.total_seconds() / 3600
+
+            return hours
+        except (KeyError, ValueError, TypeError) as e:
+            print(f"Error calculating merge time for PR #{pr.get('number', 'unknown')}: {e}")
+            return None
+
+    def format_duration(self, hours: float) -> str:
+        """
+        Format duration in hours to a human-readable string.
+
+        Args:
+            hours: Duration in hours
+
+        Returns:
+            Formatted string (e.g., "2 days, 5 hours")
+        """
+        days = int(hours // 24)
+        remaining_hours = int(hours % 24)
+        minutes = int((hours * 60) % 60)
+
+        parts = []
+        if days > 0:
+            parts.append(f"{days} day{'s' if days != 1 else ''}")
+        if remaining_hours > 0 or not parts:
+            parts.append(f"{remaining_hours} hour{'s' if remaining_hours != 1 else ''}")
+        if minutes > 0 and days == 0:
+            parts.append(f"{minutes} minute{'s' if minutes != 1 else ''}")
+
+        return ", ".join(parts)
+
+    def calculate_statistics(self, max_prs: int = 100, verbose: bool = True) -> Dict:
+        """
+        Calculate merge time statistics for pull requests.
+
+        Args:
+            max_prs: Maximum number of PRs to analyze
+            verbose: Whether to print detailed information
+
+        Returns:
+            Dictionary containing statistics
+        """
+        if verbose:
+            print(f"Fetching merged pull requests from {self.owner}/{self.repo}...")
+
+        merged_prs = self.fetch_merged_pull_requests(max_prs)
+
+        if not merged_prs:
+            print("No merged pull requests found.")
+            return {}
+
+        if verbose:
+            print(f"Found {len(merged_prs)} merged pull requests.")
+            print("Calculating merge times...\n")
+
+        merge_times = []
+        pr_details = []
+
+        for pr in merged_prs:
+            merge_time = self.calculate_merge_time(pr)
+            if merge_time is not None and merge_time >= 0:
+                merge_times.append(merge_time)
+                pr_details.append({
+                    "number": pr["number"],
+                    "title": pr["title"],
+                    "merge_time_hours": merge_time
+                })
+
+        if not merge_times:
+            print("Could not calculate merge times for any pull requests.")
+            return {}
+
+        # Calculate statistics
+        avg_time = statistics.mean(merge_times)
+        median_time = statistics.median(merge_times)
+        min_time = min(merge_times)
+        max_time = max(merge_times)
+
+        if len(merge_times) > 1:
+            stdev_time = statistics.stdev(merge_times)
+        else:
+            stdev_time = 0
+
+        stats = {
+            "total_prs_analyzed": len(merge_times),
+            "average_hours": avg_time,
+            "median_hours": median_time,
+            "min_hours": min_time,
+            "max_hours": max_time,
+            "stdev_hours": stdev_time,
+            "pr_details": pr_details
+        }
+
+        if verbose:
+            self.print_statistics(stats)
+
+        return stats
+
+    def print_statistics(self, stats: Dict):
+        """Print statistics in a formatted manner."""
+        print("=" * 80)
+        print(f"Pull Request Merge Time Statistics for {self.owner}/{self.repo}")
+        print("=" * 80)
+        print(f"\nTotal PRs Analyzed: {stats['total_prs_analyzed']}")
+        print(f"\nAverage Time to Merge: {self.format_duration(stats['average_hours'])}")
+        print(f" ({stats['average_hours']:.2f} hours)")
+        print(f"\nMedian Time to Merge: {self.format_duration(stats['median_hours'])}")
+        print(f" ({stats['median_hours']:.2f} hours)")
+        print(f"\nFastest Merge: {self.format_duration(stats['min_hours'])}")
+        print(f" ({stats['min_hours']:.2f} hours)")
+        print(f"\nSlowest Merge: {self.format_duration(stats['max_hours'])}")
+        print(f" ({stats['max_hours']:.2f} hours)")
+        print(f"\nStandard Deviation: {stats['stdev_hours']:.2f} hours")
+        print("\n" + "=" * 80)
+
+        # Show sample PRs
+        print("\nSample of Recent Merged Pull Requests:")
+        print("-" * 80)
+        for pr in stats['pr_details'][:10]:
+            print(f" PR #{pr['number']}: {self.format_duration(pr['merge_time_hours'])}")
+            print(f" Title: {pr['title'][:70]}{'...' if len(pr['title']) > 70 else ''}")
+
+        if len(stats['pr_details']) > 10:
+            print(f"\n ... and {len(stats['pr_details']) - 10} more PRs")
+        print("=" * 80)
+
+
+def main():
+    """Main function to run the script."""
+    # Configuration
+    owner = "AzureAD"
+    repo = "microsoft-authentication-library-common-for-android"
+
+    # Get GitHub token from environment variable (optional)
+    token = os.environ.get("GITHUB_TOKEN")
+
+    if not token:
+        print("Note: No GITHUB_TOKEN environment variable found.")
+        print("Running without authentication (lower API rate limits).")
+        print("To increase rate limits, set GITHUB_TOKEN environment variable.\n")
+
+    # Create calculator instance
+    calculator = PRMergeTimeCalculator(owner, repo, token)
+
+    # Calculate and display statistics
+    try:
+        calculator.calculate_statistics(max_prs=100, verbose=True)
+    except KeyboardInterrupt:
+        print("\n\nOperation cancelled by user.")
+        sys.exit(0)
+    except Exception as e:
+        print(f"\nError: {e}")
+        import traceback
+        traceback.print_exc()
+        sys.exit(1)
+
+
+if __name__ == "__main__":
+    main()
```
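To check how `format_duration` truncates values, its body can be lifted into a standalone function and run against the figures quoted in the example output. The sketch below copies the logic from the diff unchanged; only the free-function form is an assumption made for illustration.

```python
def format_duration(hours: float) -> str:
    """Standalone copy of the method's logic: truncate to whole days, hours, minutes."""
    days = int(hours // 24)
    remaining_hours = int(hours % 24)
    minutes = int((hours * 60) % 60)

    parts = []
    if days > 0:
        parts.append(f"{days} day{'s' if days != 1 else ''}")
    # Hours are shown when non-zero, or when nothing else has been added yet.
    if remaining_hours > 0 or not parts:
        parts.append(f"{remaining_hours} hour{'s' if remaining_hours != 1 else ''}")
    # Minutes are only shown for durations shorter than one day.
    if minutes > 0 and days == 0:
        parts.append(f"{minutes} minute{'s' if minutes != 1 else ''}")
    return ", ".join(parts)


if __name__ == "__main__":
    print(format_duration(53.24))   # 2 days, 5 hours
    print(format_duration(1.47))    # 1 hour, 28 minutes
    print(format_duration(363.12))  # 15 days, 3 hours
    print(format_duration(0.5))     # 0 hours, 30 minutes
```

The last case shows a quirk of the current logic: a merge time under one hour is printed with a leading "0 hours", because the hours part is always emitted when no days part exists.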
This hardcoded datetime format assumes GitHub API timestamps always end with 'Z'. The MCP variant uses `datetime.fromisoformat()` with `.replace('Z', '+00:00')` for more robust parsing. Consider using the same approach here for consistency and better handling of ISO 8601 formats.
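A minimal sketch of the suggested change is below. The MCP variant is not shown in this diff, so the code follows only the description in the comment above; `parse_github_timestamp` and `merge_time_hours` are hypothetical helper names. Note that `datetime.fromisoformat()` was added in Python 3.7, so adopting it would raise the minimum version above the 3.6+ stated in the requirements.

```python
from datetime import datetime


def parse_github_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp, accepting a trailing 'Z' or an explicit UTC offset."""
    # Before Python 3.11, fromisoformat() rejects the 'Z' suffix, so map it to '+00:00'.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def merge_time_hours(created_at: str, merged_at: str) -> float:
    """Return the time between creation and merge in hours."""
    delta = parse_github_timestamp(merged_at) - parse_github_timestamp(created_at)
    return delta.total_seconds() / 3600


if __name__ == "__main__":
    print(merge_time_hours("2024-01-01T00:00:00Z", "2024-01-01T01:30:00Z"))  # 1.5
```

Because both values parse to timezone-aware datetimes, the subtraction stays correct even if one timestamp carries an explicit offset instead of 'Z'.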