Refactor code to obtain recent statistics from Weblate API #56

@hturner

Description

Recent Statistic.R queries the Weblate API to collate data on new translations and translations marked for edit.

This code is unnecessarily complicated, making it difficult to read and maintain.

This task is to refactor the code, in particular extracting code into functions that could be used in other scripts that query the Weblate API.

Improvements could include:

  1. Structure code into clear, reusable functions, e.g.
    get_auth_handle(API_TOKEN): get handle to pass to curl_fetch_memory()
    fetch_results(endpoint, handle): get results for a specific endpoint (all pages)
  2. Instead of manually computing page counts, loop until the API returns no more results (length of returned results is zero). The results for each page can be saved in a list and combined after breaking from the loop.
  3. Instead of processing each page of results, process the results for all pages together, after the results have been combined.
  4. Replace loops with vectorized code where possible, e.g. L86-92 could use match(), as in L153-155.
  5. Consider defining helper functions for repeated code, e.g. resolve_language_codes(codes, Language_Statistics) to make the code cleaner.
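
Points 1 to 3 above could be sketched roughly as follows. This is only an illustration under several assumptions, not the existing script's code: the token is assumed to be sent as an "Authorization: Token ..." header, each response is assumed to be JSON with a "results" field, pages are assumed to be selected with a "page" query parameter on an endpoint URL that has no query string yet, and fetch_page() together with the extra "fetch" argument are hypothetical additions that let the loop be tried without network access.

```r
# Build a curl handle carrying the API token.
# Assumption: the token goes in an "Authorization: Token <token>" header.
get_auth_handle <- function(API_TOKEN) {
  handle <- curl::new_handle()
  curl::handle_setheaders(handle, Authorization = paste("Token", API_TOKEN))
  handle
}

# Hypothetical helper: fetch one page and return its parsed "results" field.
# Assumption: the response body is JSON with a "results" element.
fetch_page <- function(url, handle) {
  response <- curl::curl_fetch_memory(url, handle = handle)
  parsed <- jsonlite::fromJSON(rawToChar(response$content),
                               simplifyVector = FALSE)
  parsed$results
}

# Collect the results from all pages of an endpoint.
# Loops until a page comes back with no results, storing each page in a
# list and combining the pages only after breaking from the loop.
# The `fetch` argument is an assumption added here so the loop can be
# exercised with a stand-in function.
fetch_results <- function(endpoint, handle, fetch = fetch_page) {
  pages <- list()
  page <- 1
  repeat {
    # Assumption: `endpoint` has no query string of its own.
    url <- paste0(endpoint, "?page=", page)
    results <- fetch(url, handle)
    if (length(results) == 0) break
    pages[[page]] <- results
    page <- page + 1
  }
  # Combine the per-page lists into one list (NULL if nothing was returned).
  do.call(c, pages)
}
```

Any further processing would then operate once on the combined list returned by fetch_results(), rather than inside the paging loop.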

Since the script is run daily via GitHub Actions, we want to minimize the (total) number of dependencies. So:

Put each function in a separate R file in the R/ directory at the root of the repository, so that the functions can be used across scripts with

helpers <- list.files("R", pattern = "\\.R$", full.names = TRUE)
lapply(helpers, source)

roxygen-style documentation would be helpful.
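
As an illustration of what a documented helper file might look like, here is a minimal sketch of resolve_language_codes() that uses match() instead of a loop. The column names "code" and "name" are hypothetical; the actual structure of Language_Statistics in the script may differ.

```r
#' Resolve language codes to language names
#'
#' @param codes Character vector of language codes to look up.
#' @param Language_Statistics Data frame with one row per language.
#'   Assumed here to have columns `code` and `name` (hypothetical names).
#' @return Character vector the same length as `codes`, with `NA` for any
#'   code that is not found.
resolve_language_codes <- function(codes, Language_Statistics) {
  # match() gives the row index of each code in one vectorized call.
  idx <- match(codes, Language_Statistics$code)
  Language_Statistics$name[idx]
}

# Example with made-up data
Language_Statistics <- data.frame(
  code = c("fr", "de", "es"),
  name = c("French", "German", "Spanish"),
  stringsAsFactors = FALSE
)
resolve_language_codes(c("es", "fr", "xx"), Language_Statistics)
# "Spanish" "French" NA
```

Because match() returns NA for codes that are not present, unknown codes come through as NA rather than raising an error, which the calling script would need to handle.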

PRs should be made to the main branch.

Projects

Status

Done
