`Recent Statistic.R` queries the Weblate API to collate data on new translations and translations marked for edit.
This code is unnecessarily complicated, making it difficult to read and maintain.
This task is to refactor the code, in particular extracting code into functions that could be used in other scripts that query the Weblate API.
Improvements could include:
- Structure the code into clear, reusable functions (see the sketch after this list), e.g.
  - `get_auth_handle(API_TOKEN)`: get a handle to pass to `curl_fetch_memory()`
  - `fetch_results(endpoint, handle)`: get the results for a specific endpoint (all pages)
- Instead of manually computing page counts, loop until the API returns no more results (the length of the returned results is zero). The results for each page can be saved in a list and combined after breaking from the loop.
- Instead of processing each page of results separately, process the results for all pages together, after they have been combined.
- Replace loops with vectorized code where possible, e.g. L86-92 could use `match()` as in L153:155.
- Consider defining helper functions for repeated code, e.g. `resolve_language_codes(codes, Language_Statistics)`, to make the code cleaner.
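As a rough illustration, a minimal sketch of these helpers could look like the following. This assumes the script keeps using {curl} and {jsonlite}; the function names follow the suggestions above, and the column names in `resolve_language_codes()` are placeholders:

```r
# Minimal sketch only: assumes {curl} and {jsonlite} are already dependencies
# of the script; pagination follows the "stop when a page is empty" approach
# described above.

get_auth_handle <- function(api_token) {
  # Handle carrying the Weblate API token, to pass to curl_fetch_memory()
  handle <- curl::new_handle()
  curl::handle_setheaders(handle, Authorization = paste("Token", api_token))
  handle
}

fetch_results <- function(endpoint, handle) {
  # Collect the results from every page of an endpoint
  pages <- list()
  page <- 1L
  repeat {
    response <- curl::curl_fetch_memory(paste0(endpoint, "?page=", page),
                                        handle = handle)
    parsed <- jsonlite::fromJSON(rawToChar(response$content),
                                 simplifyVector = FALSE)
    if (length(parsed$results) == 0) break  # no more results: stop paging
    pages[[page]] <- parsed$results
    page <- page + 1L
  }
  do.call(c, pages)  # combine the per-page results into one list
}

resolve_language_codes <- function(codes, Language_Statistics) {
  # Vectorized lookup via match(); the column names here are placeholders
  Language_Statistics$Language[match(codes, Language_Statistics$Code)]
}
```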
Since the script is run daily via GitHub Actions, we want to minimize the (total) number of dependencies. So:
- Use base R functions to remove the dependence on {stringr} (ref: Try not to depend on stringr #9); some illustrative replacements are sketched below.
- Avoid adding new dependencies when refactoring.
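For illustration, some common {stringr} calls and their base R equivalents (the exact calls used in `Recent Statistic.R` may differ):

```r
# Illustrative base R equivalents for common {stringr} calls
grepl("pattern", x)                     # stringr::str_detect(x, "pattern")
sub("pattern", "replacement", x)        # stringr::str_replace(x, "pattern", "replacement")
gsub("pattern", "replacement", x)       # stringr::str_replace_all(x, "pattern", "replacement")
regmatches(x, regexpr("pattern", x))    # stringr::str_extract(x, "pattern")
trimws(x)                               # stringr::str_trim(x)
```

Note that {stringr} uses ICU regular expressions while base R defaults to POSIX/PCRE, so patterns may need small adjustments.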
Put each function in a separate R file in the root of the repository, so that the functions can be used across scripts with

```r
helpers <- list.files("R", pattern = "\\.R$", full.names = TRUE)
lapply(helpers, source)
```

roxygen-style documentation would be helpful.
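For example, a roxygen-style header for the hypothetical `fetch_results()` sketched above might look like:

```r
#' Fetch all pages of results from a Weblate API endpoint
#'
#' @param endpoint URL of the API endpoint, without pagination parameters.
#' @param handle A curl handle created by get_auth_handle(), carrying the
#'   API token.
#' @return A list combining the `results` entries from all pages.
fetch_results <- function(endpoint, handle) {
  # ... (see sketch above)
}
```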
PRs should be made to the main branch.