Commit 62cc8a4

Add issue deduplicator workflow (#4628)
It's a bit hand-holdy in that it pre-downloads the issue list, but that keeps Codex running in read-only, no-network mode.
1 parent f895d4c commit 62cc8a4

3 files changed, 108 additions and 0 deletions

.github/prompts/issue-deduplicator.txt

Lines changed: 19 additions & 0 deletions
@@ -0,0 +1,19 @@
+You are an assistant that triages new GitHub issues by identifying potential duplicates.
+
+You will receive the following JSON files located in the current working directory:
+- `codex-current-issue.json`: JSON object describing the newly created issue (fields: number, title, body).
+- `codex-existing-issues.json`: JSON array of recent issues (each element includes number, title, body, createdAt).
+
+Instructions:
+- Load both files as JSON and review their contents carefully.
+- Compare the current issue against the existing issues to find up to five that appear to describe the same underlying problem or request.
+- Only consider an issue a potential duplicate if there is a clear overlap in symptoms, feature requests, reproduction steps, or error messages.
+- Prioritize newer issues when similarity is comparable.
+- Ignore pull requests and issues whose similarity is tenuous.
+- When unsure, prefer returning fewer matches.
+
+Output requirements:
+- Respond with a JSON array of issue numbers (integers), ordered from most likely duplicate to least.
+- Include at most five numbers.
+- If you find no plausible duplicates, respond with `[]`.
+- Do not emit any additional commentary, text, or keys beyond the JSON array.
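The output requirements above describe a contract that a consumer may want to check rather than trust. The following is a minimal sketch, not part of this commit, of how the model's reply could be validated before use. The helper name `parseDuplicateNumbers` and its `currentIssueNumber` parameter are hypothetical, and dropping the current issue's own number is an assumption that the prompt does not state.

```javascript
// Hypothetical helper (not in the commit): checks a raw model reply against
// the prompt's output requirements and returns a cleaned list of issue numbers.
function parseDuplicateNumbers(raw, currentIssueNumber) {
  let parsed;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return []; // Not valid JSON: treat as "no duplicates".
  }
  if (!Array.isArray(parsed)) {
    return []; // The prompt asks for a bare JSON array.
  }

  const seen = new Set();
  const result = [];
  for (const value of parsed) {
    // Keep only positive integers, as the prompt requires issue numbers.
    if (!Number.isInteger(value) || value <= 0) continue;
    // Assumption: an issue should never be reported as its own duplicate.
    if (value === currentIssueNumber || seen.has(value)) continue;
    seen.add(value);
    result.push(value);
    if (result.length === 5) break; // "Include at most five numbers."
  }
  return result;
}
```

A check like this keeps the original ordering (most likely duplicate first) while discarding anything that does not fit the requested shape.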
Lines changed: 88 additions & 0 deletions
@@ -0,0 +1,88 @@
+name: Issue Deduplicator
+
+on:
+  issues:
+    types:
+      # - opened - disabled while testing
+      - labeled
+
+jobs:
+  gather-duplicates:
+    name: Identify potential duplicates
+    if: ${{ github.event.action == 'opened' || (github.event.action == 'labeled' && github.event.label.name == 'codex-deduplicate') }}
+    runs-on: ubuntu-latest
+    permissions:
+      contents: read
+    outputs:
+      codex_output: ${{ steps.codex.outputs.final_message }}
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Prepare Codex inputs
+        env:
+          GH_TOKEN: ${{ github.token }}
+        run: |
+          set -eo pipefail
+
+          CURRENT_ISSUE_FILE=codex-current-issue.json
+          EXISTING_ISSUES_FILE=codex-existing-issues.json
+
+          gh issue list --repo "${{ github.repository }}" \
+            --json number,title,body,createdAt \
+            --limit 1000 \
+            --state all \
+            --search "sort:created-desc" \
+            | jq '.' \
+            > "$EXISTING_ISSUES_FILE"
+
+          printf '%s' '${{ toJson(github.event.issue) }}' \
+            | jq '{number, title, body}' \
+            > "$CURRENT_ISSUE_FILE"
+
+      - id: codex
+        uses: openai/codex-action@main
+        with:
+          openai_api_key: ${{ secrets.CODEX_OPENAI_API_KEY }}
+          prompt_file: .github/prompts/issue-deduplicator.txt
+          require_repo_write: false
+
+  comment-on-issue:
+    name: Comment with potential duplicates
+    needs: gather-duplicates
+    if: ${{ needs.gather-duplicates.result != 'skipped' }}
+    runs-on: ubuntu-latest
+    permissions:
+      contents: read
+      issues: write
+    steps:
+      - name: Comment on issue
+        uses: actions/github-script@v7
+        env:
+          CODEX_OUTPUT: ${{ needs.gather-duplicates.outputs.codex_output }}
+        with:
+          github-token: ${{ github.token }}
+          script: |
+            let numbers;
+            try {
+              numbers = JSON.parse(process.env.CODEX_OUTPUT);
+            } catch (error) {
+              core.info(`Codex output was not valid JSON. Raw output: ${process.env.CODEX_OUTPUT}`);
+              return;
+            }
+
+            const lines = ['Potential duplicates detected:', ...numbers.map((value) => `- #${value}`)];
+
+            await github.rest.issues.createComment({
+              owner: context.repo.owner,
+              repo: context.repo.repo,
+              issue_number: context.payload.issue.number,
+              body: lines.join("\n"),
+            });
+
+      - name: Remove codex-deduplicate label
+        if: ${{ always() && github.event.action == 'labeled' && github.event.label.name == 'codex-deduplicate' }}
+        env:
+          GH_TOKEN: ${{ github.token }}
+        run: |
+          gh issue edit "${{ github.event.issue.number }}" --remove-label codex-deduplicate || true
+          echo "Attempted to remove label: codex-deduplicate"
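
As written, the comment step appears to post the "Potential duplicates detected:" header even when Codex returns `[]`, and it would throw if the parsed value were not an array. A minimal sketch of one way to guard against both cases follows. The function name `buildDuplicateComment` is hypothetical and not part of this commit; a caller inside `actions/github-script` would skip `createComment` when it returns `null`.

```javascript
// Hypothetical helper (not in the commit): builds the comment body from the
// parsed Codex output, or returns null when there is nothing worth posting.
function buildDuplicateComment(numbers) {
  // Guard against non-array JSON (e.g. an object) and against an empty list.
  if (!Array.isArray(numbers) || numbers.length === 0) {
    return null;
  }
  // Same line format as the workflow's inline script.
  const lines = ['Potential duplicates detected:', ...numbers.map((value) => `- #${value}`)];
  return lines.join('\n');
}

// Example: the body that would be posted for two candidate duplicates.
const exampleBody = buildDuplicateComment([12, 7]);
```

Keeping the formatting in a pure function like this also makes it possible to test the comment text without calling the GitHub API.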

.github/workflows/issue-labeler.yml

Lines changed: 1 addition & 0 deletions
@@ -28,6 +28,7 @@ jobs:
         with:
           openai_api_key: ${{ secrets.CODEX_OPENAI_API_KEY }}
           prompt_file: .github/prompts/issue-labeler.txt
+          require_repo_write: false
 
   apply-labels:
     name: Apply labels from Codex output
