Commit b020b7a

Using filesystem functions rather than GitHub APIs
The previous setup led to race conditions between the runs of the relevant actions and the availability of all the files via github.io.
1 parent aea0b34 commit b020b7a

5 files changed: +160 -195 lines changed

script/README.md

Lines changed: 7 additions & 12 deletions
@@ -1,30 +1,25 @@
 # Display minutes' information

-At the moment, the minutes generated for W3C calls (using the "standard" `scribe` tool) are spread on W3C's date space. Unfortunately, there is no single place where all the minutes could be seen altogether to see their original agenda, or to see the resolutions that have been passed. This small utility is built on top of the standard W3C minutes to make this possible. It relies on the assumption that the list of all minutes can be located via an API and, on top of those it generates two files:
+At the moment, the minutes generated for W3C calls (using the "standard" `scribe` tool) are spread over W3C's date space. Unfortunately, there is no single place where all the minutes can be seen together, along with their original agendas and the resolutions that have been passed. This small utility is built on top of the standard W3C minutes to make this possible. It relies on the assumption that all the minutes can be found in the same (GitHub) repository where this tool resides; it generates two files:

-- `index.html`: a file listing all the minutes, grouped by years and the possibility to look at the agenda of all those
+- `index.html`: a file listing all the minutes, grouped by years, with the possibility to look at the agenda of each of them (via a `<details>` element)
 - `resolutions.html`: a file listing all the formal resolutions passed and documented, grouped by years.
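
The per-year listing with collapsible agendas described for `index.html` could be produced along the following lines. This is only a minimal sketch: the `MinutesEntry` type and the `renderYear` function are hypothetical names, and the actual `main.ts` and its templates are not part of this diff, so the real markup may well differ.

```typescript
// Hypothetical shape of one minutes entry; the tool's real type may differ.
interface MinutesEntry {
    fname: string;   // link target of the minutes file
    date: Date;      // date of the call
    toc: string[];   // agenda items, already formatted as HTML fragments
}

// Render one year of minutes as a section, with one <details> element per call.
function renderYear(year: number, entries: MinutesEntry[]): string {
    const items = entries.map((entry) => {
        // ISO date (UTC) used as the visible label of the entry
        const day = entry.date.toISOString().slice(0, 10);
        const agenda = entry.toc.map((item) => `<li>${item}</li>`).join("");
        return `<details><summary><a href="${entry.fname}">${day}</a></summary><ol>${agenda}</ol></details>`;
    });
    return `<section><h2>${year}</h2>${items.join("")}</section>`;
}
```

The agenda stays hidden until the reader opens the corresponding `<details>` element, which keeps a long list of calls compact.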

 The script can be installed in a GitHub repository and can be used to automatically generate those two files in a GitHub page using an action script (see below).

-***At the moment*** the only way of using this script is to (manually…) regroup all the minutes in a directory within the GitHub repository and use the GitHub API to access to the minute references. It is hoped that, eventually, there will be a W3C API entry to extract that information (e.g., by accessing the WG Calendar entries). The script should be easily adaptable for new APIs. The script has been developed for the [Publishing Maintenance Working Group](https://www.w3.org/groups/wg/pm) and has been deployed on the [WG repository](https://github.com/w3c/pm-wg).
+***At the moment*** the only way of using this script is to (manually…) regroup all the minutes in a directory within the GitHub repository. It is hoped that, eventually, there will be a W3C API entry to extract that information (e.g., by accessing the WG Calendar entries). The script should be easily adaptable to accessing the minutes via an API. The script has been developed for the [Publishing Maintenance Working Group](https://www.w3.org/groups/wg/pm) and has been deployed on the [WG repository](https://github.com/w3c/pm-wg).

 ## Technical details

-The script is in typescript, and has been developed and deployed using [deno](https://deno.land). (It is meant to be compatible with [node.js+typescript](https://nodejs.org), though.) It consists of three scripts
+The script is in TypeScript, and has been developed and deployed using [deno](https://deno.land). (It is meant to be compatible with [node.js+tsc](https://nodejs.org), though.) It consists of three files:

-- `main.ts`: deploys the information on minutes by filling in the dedicated "slots" in two template files: `templates/index_template.html` and `templates/resolutions_template.html`. The results are stored in the `index.html` and `resolutions.html` files, respectively. Note that the `main.ts` file includes a `const wg = "pm-wg";` declaration referring to a (w3c) repository storing the minutes. This variable should be adapted to another repository, providing that the directory setup is identical.
-- `tools.ts`: set of functions to access, and retrieve, table of content and resolutions from a minute file. It is based on the specificities the minutes files as generated by the standard `scribe` tool.
-
-The file contains the `APIUrl` and `HTMLUrl` strings used to access the API for the minutes and the format
-for the final minute URLs. The first constant also depends on a particular structure of the data in the repository (i.e., the minutes are stored in a `minutes` directory).
-
-Both constants rely, at the moment, on GitHub for the exact format. Those two constants, and the internals of the `getMinutes` method, are GitHub specific, and should be adapted if another API is used.
+- `main.ts`: deploys the information on minutes by filling in the dedicated "slots" in two template files: `templates/index_template.html` and `templates/resolutions_template.html`. The results are stored in the `index.html` and `resolutions.html` files, respectively. Note that the `main.ts` file includes a `const directory = "../minutes";` declaration referring to the (relative) path of the directory containing the minutes. This variable should be adapted for another repository, provided that the directory setup is roughly identical.
+- `data.ts`: set of functions to access, and retrieve, the table of contents and the resolutions from the minutes files. It is based on the specificities of the minutes files as generated by the standard `scribe` tool.
 - `minidom.ts`: a thin layer on top of the HTML DOM with a few handy shortcut functions. The exact choice of the DOM implementation package is also "hidden" in this module, and can be updated if needed.
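
As an illustration of what `data.ts` does once the individual files have been read, the sketch below isolates the three post-processing steps visible in its listing in this commit: dropping the files that could not be processed, sorting the rest with the most recent first, and grouping them by year. It is a simplified sketch rather than the module itself; `Entry`, `keepFulfilled`, `sortNewestFirst` and `groupByYear` are hypothetical names introduced here, and the real code works on the richer `DisplayedData` type.

```typescript
// Hypothetical, trimmed-down entry: only the fields needed for sorting and grouping.
interface Entry {
    fname: string;
    date: Date;
}

// Drop the rejected results and unwrap the values of the fulfilled ones.
// The explicit type predicate lets TypeScript narrow the result type.
function keepFulfilled<T>(results: PromiseSettledResult<T>[]): T[] {
    return results
        .filter((result): result is PromiseFulfilledResult<T> => result.status === "fulfilled")
        .map((result) => result.value);
}

// Newest first; a copy is sorted so that the input array is left untouched.
function sortNewestFirst(entries: Entry[]): Entry[] {
    return [...entries].sort((a, b) => b.date.getTime() - a.date.getTime());
}

// Group entries by the (local-time) year of their date, preserving the input order.
function groupByYear(entries: Entry[]): Map<number, Entry[]> {
    const groups = new Map<number, Entry[]>();
    for (const entry of entries) {
        const year = entry.date.getFullYear();
        const bucket = groups.get(year) ?? [];
        bucket.push(entry);
        groups.set(year, bucket);
    }
    return groups;
}
```

In such a setup, the array passed to `keepFulfilled` would be the outcome of `await Promise.allSettled(...)` over the per-file promises, so that one unreadable minutes file does not prevent the others from being listed. Since a `Map` iterates in insertion order, grouping an already sorted array yields the years from newest to oldest.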

 ### Installation in GitHub pages

-The tool can also be used to make the file generation automatic via a GitHub workflow. For reference, here is the workflow used on the aforementioned repository (the GitHub options for `pages` deployment must be set to "GitHub actions"):
+The tool can also be used to make the file generation automatic via a GitHub workflow. For reference, here is the workflow used on the PM WG repository (the GitHub options for `pages` deployment must be set to "GitHub actions"):

 ```yml
 # Relies on the standard GitHub action setup to deploy to GitHub Pages

script/lib/data.ts

Lines changed: 135 additions & 0 deletions
@@ -0,0 +1,135 @@
+import { MiniDOM } from './minidom.ts';
+import * as fs from 'node:fs/promises';
+
+type FileName = string;
+
+/** The data extracted from the minutes that is supposed to be displayed */
+interface DisplayedData {
+    /** The file name of the minutes */
+    fname: FileName;
+    /** Date of the minutes */
+    date: Date;
+    /** Array of TOC entries (can be HTML) */
+    toc: string[];
+    /** Array of resolution entries (can be HTML) */
+    res: string[];
+}
+
+/** The data regrouped per year */
+export type GroupedData = Map<number, DisplayedData[]>;
+
+/** File names that must be ignored in the directory of minutes, if present */
+const ignoredFiles: string[] = ["index.html", "resolutions.html"];
+
+/**
+ * Get the file names of all the minutes stored in a directory.
+ * The returned file names are prefixed with the directory path.
+ *
+ *
+ * @param directory - path of the directory containing the minutes
+ * @returns the paths of the minutes files, without the ignored files
+ */
+async function getMinutes(directory: string): Promise<FileName[]> {
+    return (await fs.readdir(directory))
+        .filter(file => !ignoredFiles.includes(file))
+        .map(file => `${directory}/${file}`);
+}
+
+
+/**
+ * Get all TOCs and resolutions with the respective date; one block each that can
+ * be displayed in the generated HTML
+ *
+ * @param minutes - paths of the minutes files
+ * @returns the data extracted from each readable file, most recent first
+ */
+async function getAllData(minutes: FileName[]): Promise<DisplayedData[]> {
+    /*
+     * Extract a list of <li> entries from the content of a minutes file, using a CSS selector.
+     */
+    const extractListEntries = (fname: FileName, content: MiniDOM, selector: string): string[] => {
+        // There is only one cleanup operation for now, but it could be extended if needed
+        const cleanupData = (nav: string): string => {
+            return nav
+                // References should not be relative
+                .replace(/href="#/g, `target="_blank" href="${fname}#`)
+                ;
+        };
+
+        const resLines: NodeListOf<Element> = content.querySelectorAll(selector);
+
+        if (resLines.length === 0) {
+            return [];
+        } else {
+            return Array.from(resLines)
+                // Get the HTML line corresponding to an 'li' element
+                .map((line: Element) => line.innerHTML)
+                // Clean up each line before returning it
+                .map(cleanupData);
+        }
+    }
+
+    // Get the data for a single entry; the Promises are collected in an array
+    // for a parallel execution via Promise.allSettled.
+    const retrieveDisplayData = async (fname: FileName): Promise<DisplayedData> => {
+        // Get the minutes file as a text
+        const response = await fs.readFile(fname, "utf-8");
+        // Parse the (HTML) text into a MiniDOM
+        const content = new MiniDOM(response);
+
+        // Find the date of the minutes
+        const date_title: string | null | undefined = content.querySelector("header h2:first-of-type")?.textContent;
+        const date = new Date(date_title ?? "1970-01-01");
+
+        return {
+            fname: fname,
+            date: date,
+            toc: extractListEntries(fname, content, "#toc ol li"),
+            res: extractListEntries(fname, content, "#ResolutionSummary ol li"),
+        };
+    }
+
+    // Gather all the Promises for a parallel execution
+    const promises: Promise<DisplayedData>[] = minutes.map(retrieveDisplayData);
+
+    // Some of the promises might have failed, so we need to filter those out.
+    // But we want to display everything we can...
+    const results: PromiseSettledResult<DisplayedData>[] = await Promise.allSettled(promises);
+    const output = results
+        .filter((result): result is PromiseFulfilledResult<DisplayedData> => result.status === "fulfilled")
+        .map((result) => result.value);
+
+    // Sorting the output by date (most recent first) before returning it
+    return output.sort((a, b) => {
+        if (a.date > b.date) return -1;
+        if (a.date < b.date) return 1;
+        else return 0;
+    });
+}
+
+
+/**
+ * Main entry point to get the data grouped by year. The data themselves are arrays of strings, in HTML format.
+ *
+ * @param directory - path of the directory containing the minutes
+ * @returns the displayed data of all the minutes, keyed by year
+ */
+export async function getGroupedData(directory: string): Promise<GroupedData> {
+    const groupDisplayedDataByYear = (data: DisplayedData[]): GroupedData => {
+        const groups: GroupedData = new Map<number, DisplayedData[]>();
+        for (const entry of data) {
+            const year = entry.date.getFullYear();
+            if (!groups.has(year)) {
+                groups.set(year, []);
+            }
+            groups.get(year)?.push(entry);
+        }
+        return groups;
+    }
+
+    // Get the references to all the minutes
+    const minutes: FileName[] = await getMinutes(directory);
+    // For each of the minutes, get the content.
+    const display: DisplayedData[] = await getAllData(minutes);
+    return groupDisplayedDataByYear(display);
+}

script/lib/minidom.ts

Lines changed: 1 addition & 1 deletion
@@ -3,7 +3,7 @@ import { JSDOM } from 'npm:jsdom';

 /**
  * A thin layer on top of the regular DOM Document. Necessary to "hide" the differences
- * between DOM implementations, such as JSDOM and Deno's DOM WASM.
+ * between DOM implementations such as JSDOM and Deno's DOM WASM.
  * Higher layers should not depend on these.
  *
  * The class also includes some handy shorthands to make the code cleaner…

script/lib/tools.ts

Lines changed: 0 additions & 164 deletions
This file was deleted.
