88[ ![ Backers] [ backers-badge ]] [ collective ]
99[ ![ Chat] [ chat-badge ]] [ chat ]
1010
11- [ micromark] [ ] extension to support GFM [ literal autolinks] [ spec ] .
11+ [ micromark] [ ] extensions to support GFM [ literal autolinks] [ spec ] .
1212
1313## Contents
1414
1919* [ API] ( #api )
2020 * [ ` gfmAutolinkLiteral ` ] ( #gfmautolinkliteral )
2121 * [ ` gfmAutolinkLiteralHtml ` ] ( #gfmautolinkliteralhtml )
22+ * [ Bugs] ( #bugs )
2223* [ Authoring] ( #authoring )
2324* [ HTML] ( #html )
2425* [ CSS] ( #css )
3233
3334## What is this?
3435
35- This package contains extensions that add support for the autolink syntax
36+ This package contains extensions that add support for the extra autolink syntax
3637enabled by GFM to [ ` micromark ` ] [ micromark ] .
3738
3839GitHub employs different algorithms to autolink: one at parse time and one at
@@ -43,27 +44,33 @@ But also because issues/PRs/comments omit (perhaps by accident?) the second
4344algorithm for ` www. ` , ` http:// ` , and ` https:// ` links (but not for email links).
4445
4546As this is a syntax extension, it focuses on the first algorithm.
46- The second algorithm is performed by [ ` mdast-util-gfm-autolink-literal ` ] [ util ] .
47+ The second algorithm is performed by
48+ [ ` mdast-util-gfm-autolink-literal ` ] [ mdast-util-gfm-autolink-literal ] .
4749The ` html ` part of this micromark extension does not operate on an AST and hence
4850can’t perform the second algorithm.
4951
52+ The implementation of autolink literal on github.com is currently buggy.
53+ The bugs have been reported on [ ` cmark-gfm ` ] [ cmark-gfm ] .
54+ This micromark extension matches github.com except for its bugs.
55+
5056## When to use this
5157
52- These tools are all low-level.
53- In many cases, you want to use [ ` remark-gfm ` ] [ plugin ] with remark instead.
58+ This project is useful when you want to support autolink literals in markdown.
59+
60+ You can use these extensions when you are working with [ ` micromark ` ] [ micromark ] .
61+ To support all GFM features, use
62+ [ ` micromark-extension-gfm ` ] [ micromark-extension-gfm ] instead.
5463
55- Even when you want to use ` micromark ` , you likely want to use
56- [ ` micromark-extension-gfm ` ] [ micromark-extension-gfm ] to support all GFM
57- features.
58- That extension includes this extension.
64+ When you need a syntax tree, combine this package with
65+ [ ` mdast-util-gfm-autolink-literal ` ] [ mdast-util-gfm-autolink-literal ] .
5966
60- When working with ` mdast-util-from-markdown ` , you must combine this package with
61- [ ` mdast-util-gfm-autolink-literal ` ] [ util ] .
67+ All these packages are used in [ ` remark-gfm ` ] [ remark-gfm ] , which focuses on
68+ making it easier to transform content by abstracting these internals away.
6269
6370## Install
6471
6572This package is [ ESM only] [ esm ] .
66- In Node.js (version 12.20+, 14.14+, 16.0+, or 18.0 +), install with [ npm] [ ] :
73+ In Node.js (version 16+), install with [ npm] [ ] :
6774
6875``` sh
6976npm install micromark-extension-gfm-autolink-literal
@@ -108,47 +115,56 @@ Yields:
108115
109116## API
110117
111- This package exports the identifiers ` gfmAutolinkLiteral ` and
112- ` gfmAutolinkLiteralHtml ` .
118+ This package exports the identifiers
119+ [ ` gfmAutolinkLiteral ` ] [ api-gfm-autolink-literal ] and
120+ [ ` gfmAutolinkLiteralHtml ` ] [ api-gfm-autolink-literal-html ] .
113121There is no default export.
114122
115- The export map supports the endorsed [ ` development ` condition] [ condition ] .
123+ The export map supports the [ ` development ` condition] [ development ] .
116124Run ` node --conditions development module.js ` to get instrumented dev code.
117125Without this condition, production code is loaded.
118126
119127### ` gfmAutolinkLiteral `
120128
121- Syntax extension for micromark (passed in ` extensions ` ).
129+ Extension for ` micromark ` that can be passed in ` extensions ` to enable GFM
130+ autolink literal syntax ([ ` Extension ` ] [ micromark-extension ] ).
122131
123132### ` gfmAutolinkLiteralHtml `
124133
125- HTML extension for micromark (can be passed in ` htmlExtensions ` ).
134+ Extension for ` micromark ` that can be passed in ` htmlExtensions ` to support
135+ GFM autolink literals when serializing to HTML
136+ ([ ` HtmlExtension ` ] [ micromark-html-extension ] ).
126137
127- ## Authoring
138+ ## Bugs
128139
129- When authoring markdown, it ’s recommended * not * to use this construct .
130- It is fragile (easy to get wrong) and not pretty to readers (it’s presented as
131- just a URL, there is no descriptive text) .
132- Instead, use link (resource) or link (label) :
140+ GitHub’s own algorithm to parse autolink literals contains three bugs.
141+ A smaller bug is left unfixed in this project for consistency.
142+ Two main bugs are not present in this project.
143+ The issues relating to autolink literals are:
133144
134- ``` markdown
135- Instead of https://example.com (worst), use <https://example.com> (better),
136- or [link (resource)](https://example.com) or [link (reference)][ref] (best).
145+ * [ GFM autolink extension (` www. ` , ` https?:// ` parts): links don’t work when
146+ after bracket] ( https://github.com/github/cmark-gfm/issues/278 ) \
147+ fixed here ✅
148+ * [ GFM autolink extension (` www. ` part): uppercase does not match on
149+ issues/PRs/comments] ( https://github.com/github/cmark-gfm/issues/280 ) \
150+ fixed here ✅
151+ * [ GFM autolink extension (` www. ` part): the word ` www `
152+ matches] ( https://github.com/github/cmark-gfm/issues/279 ) \
153+ present here for consistency
137154
138- [ref]: https://example.com
139- ```
155+ ## Authoring
140156
141- When authoring markdown where the source does not matter (such as comments to
142- some page), it can be useful to quickly paste URLs, and this will mostly work.
157+ It is recommended to use labels, either with a resource or a definition,
158+ instead of autolink literals, as those allow relative URLs and descriptive
159+ text to explain the URL in prose.
143160
144161## HTML
145162
146- GFM autolink literals, similar to normal CommonMark autolinks (such as
147- ` <https://example.com> ` ), relate to the ` <a> ` element in HTML.
148- See [ * § 4.5.1 The ` a ` element* ] [ html ] in the HTML spec for more info.
149- When an email autolink is used, the string ` mailto: ` is prepended before the
150- email, when generating the ` href ` attribute of the hyperlink.
151- When a ` www ` autolink is used, the string ` http:// ` is prepended.
163+ GFM autolink literals relate to the ` <a> ` element in HTML.
164+ See [ * § 4.5.1 The ` a ` element* ] [ html-a ] in the HTML spec for more info.
165+ When an email autolink is used, the string ` mailto: ` is prepended when
166+ generating the ` href ` attribute of the hyperlink.
167+ When a www autolink is used, the string ` http:// ` is prepended.
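
The prefixing rules above can be sketched in a few lines of JavaScript.
This is only an illustration: `hrefForLiteralSketch` and its `kind` values are hypothetical names, not part of this package’s API, and the sketch leaves out the URL sanitizing that happens when serializing.

```js
// Hypothetical helper for illustration; not exported by this package.
// `kind` is assumed to be 'email', 'www', or 'protocol'.
function hrefForLiteralSketch(kind, literal) {
  // Email literals get a `mailto:` prefix.
  if (kind === 'email') return 'mailto:' + literal
  // `www.` literals have no protocol of their own, so `http://` is added.
  if (kind === 'www') return 'http://' + literal
  // `http://` and `https://` literals already carry a protocol.
  return literal
}

console.log(hrefForLiteralSketch('www', 'www.example.com'))
// → http://www.example.com
```

Note that only the `href` changes: the text shown to readers stays the literal as it was written.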
152168
153169## CSS
154170
@@ -183,64 +199,107 @@ a:not([href]) {
183199
184200## Syntax
185201
186- Autolink literals are very complex to parse.
187- They form with, roughly, the following BNF:
202+ Autolink literals form with, roughly, the following BNF:
188203
189204``` bnf
190- ; Restriction: not allowed to be in unbalanced braces.
191- autolink ::= www-autolink | http-autolink | email-autolink
205+ gfm_autolink_literal ::= http_autolink | www_autolink | email_autolink
192206
193- ; Restriction: the code before must be `www-autolink-before`.
194- www-autolink ::= 3( "w" | "W" ) "." [ domain [ path ] ]
195- www-autolink-before ::= eof | eol | space-or-tab | "(" | "*" | "_" | "~"
207+ ; Restriction: the code before must be `www_autolink_before`.
208+ ; Restriction: the code after `.` must not be eof.
209+ www_autolink ::= 3('w' | 'W') '.' [domain [path]]
210+ www_autolink_before ::= eof | eol | space_or_tab | '(' | '*' | '_' | '[' | ']' | '~'
196211
197- ; Restriction: the code before must be `http-autolink-before `.
198- ; Restriction: the code after the protocol must be `http-autolink-protocol-after `.
199- http-autolink ::= ( "h" | "H" ) 2( "t" | "T" ) ( "p" | "P" ) [ "s" | "S" ] ":" 2"/" domain [ path ]
200- http-autolink-before ::= code - ascii-alpha
201- http-autolink-protocol-after ::= code - eof - eol - ascii-control - unicode-whitespace - unicode-punctuation
212+ ; Restriction: the code before must be `http_autolink_before`.
213+ ; Restriction: the code after the protocol must be `http_autolink_protocol_after`.
214+ http_autolink ::= ('h' | 'H') 2('t' | 'T') ('p' | 'P') ['s' | 'S'] ':' 2'/' domain [path]
215+ http_autolink_before ::= byte - ascii_alpha
216+ http_autolink_protocol_after ::= byte - eof - eol - ascii_control - unicode_whitespace - unicode_punctuation
202217
203- ; Restriction: the code before must be `email-autolink-before `.
204- ; Restriction: `ascii-digit ` may not occur in the last label part of the label.
205- email-autolink ::= 1*( "+" | "-" | "." | "_" | ascii-alphanumeric ) "@" 1*( 1*label-segment label-dot-cont ) 1*label-segment
206- email-autolink-before ::= code - ascii-alpha - "/"
218+ ; Restriction: the code before must be `email_autolink_before`.
219+ ; Restriction: `ascii_digit` may not occur in the last label part of the label.
220+ email_autolink ::= 1*('+' | '-' | '.' | '_' | ascii_alphanumeric) '@' 1*(1*label_segment label_dot_cont) 1*label_segment
221+ email_autolink_before ::= byte - ascii_alpha - '/'
207222
208223; Restriction: `_` may not occur in the last two domain parts.
209- domain ::= 1*( url-ampt-cont | domain-punct-cont | "-" | code - eof - ascii-control - unicode-whitespace - unicode-punctuation )
224+ domain ::= 1*(url_ampt_cont | domain_punct_cont | '-' | byte - eof - ascii_control - unicode_whitespace - unicode_punctuation)
210225; Restriction: must not be followed by `punct`.
211- domain-punct-cont ::= "." | "_"
226+ domain_punct_cont ::= '.' | '_'
212227; Restriction: must not be followed by `char-ref`.
213- url-ampt-cont ::= "&"
228+ url_ampt_cont ::= '&'
214229
215230; Restriction: a counter `balance = 0` is increased for every `(`, and decreased for every `)`.
216- ; Restriction: `)` must not be `paren-at-end `.
217- path ::= 1*( url-ampt-cont | path-punctuation-cont | "(" | ")" | code - eof - eol - space-or-tab )
231+ ; Restriction: `)` must not be `paren_at_end`.
232+ path ::= 1*(url_ampt_cont | path_punctuation_cont | '(' | ')' | byte - eof - eol - space_or_tab)
218233; Restriction: must not be followed by `punct`.
219- path-punctuation-cont ::= trailing-punctuation - "<"
234+ path_punctuation_cont ::= trailing_punctuation - '<'
220235; Restriction: must be followed by `punct` and `balance` must be less than `0`.
221- paren-at-end ::= ")"
236+ paren_at_end ::= ')'
222237
223- label-segment ::= label-dash-underscore-cont | ascii-alpha | ascii-digit
238+ label_segment ::= label_dash_underscore_cont | ascii_alpha | ascii_digit
224239; Restriction: if followed by `punct`, the whole email autolink is invalid.
225- label-dash-underscore-cont ::= "-" | "_"
240+ label_dash_underscore_cont ::= '-' | '_'
226241; Restriction: must not be followed by `punct`.
227- label-dot-cont ::= "."
242+ label_dot_cont ::= '.'
243+
244+ punct ::= *trailing_punctuation ( byte - eof - eol - space_or_tab - '<' )
245+ char_ref ::= *ascii_alpha ';' path_end
246+ trailing_punctuation ::= '!' | '"' | '\'' | ')' | '*' | ',' | '.' | ':' | ';' | '<' | '?' | '_' | '~'
247+ ```
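
The `balance` and `trailing_punctuation` restrictions on `path` can be easier to follow as code.
The following is a simplified sketch, not this package’s tokenizer: `trimPathEndSketch` is a hypothetical helper that takes an already collected path and drops trailing punctuation and unbalanced closing parens from its end.
It assumes character references (`url_ampt_cont`) are handled elsewhere.

```js
// Characters listed as `trailing_punctuation` in the grammar above.
const trailingPunctuation = new Set([
  '!', '"', "'", ')', '*', ',', '.', ':', ';', '<', '?', '_', '~'
])

// Hypothetical helper for illustration; not exported by this package.
function trimPathEndSketch(path) {
  let end = path.length

  while (end > 0) {
    const char = path[end - 1]

    if (char === ')') {
      // Count `(` as +1 and `)` as -1 over the part that is still kept.
      let balance = 0
      for (const c of path.slice(0, end)) {
        if (c === '(') balance++
        else if (c === ')') balance--
      }
      // A balanced closing paren belongs to the path: stop trimming.
      if (balance >= 0) break
      end--
    } else if (trailingPunctuation.has(char)) {
      // Other trailing punctuation is not part of the link.
      end--
    } else {
      break
    }
  }

  return path.slice(0, end)
}

console.log(trimPathEndSketch('/search?q=Markup+(business)))'))
// → /search?q=Markup+(business)
```

Recounting the balance on every closing paren is quadratic in the worst case; that is fine for a sketch, but a real tokenizer would track the counter while reading.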
248+
249+ The grammar for GFM autolink literal is very relaxed: basically anything
250+ except for whitespace is allowed after a prefix.
251+ To use whitespace and other characters that are otherwise impossible in URLs,
252+ you can use percent encoding:
253+
254+ ``` markdown
255+ https://example.com/alpha%20bravo
256+ ```
257+
258+ Yields:
259+
260+ ``` html
261+ <p><a href="https://example.com/alpha%20bravo">https://example.com/alpha%20bravo</a></p>
262+ ```
263+
264+ There are several cases where incorrect encoding of URLs would, in other
265+ languages, result in a parse error.
266+ In markdown, there are no errors, and URLs are normalized.
267+ In addition, many characters are percent encoded
268+ ([ ` sanitizeUri ` ] [ micromark-util-sanitize-uri ] ).
269+ For example:
270+
271+ ``` markdown
272+ www.a👍b%
273+ ```
228274
229- punct ::= *trailing-punctuation ( code - eof - eol - space-or-tab - "<" )
230- char-ref ::= *ascii-alpha ";" path-end
231- trailing-punctuation ::= "!" | "\"" | "'" | ")" | "*" | "," | "." | ":" | ";" | "<" | '?' | '_' | '~'
275+ Yields:
276+
277+ ``` html
278+ <p><a href="http://www.a%F0%9F%91%8Db%25">www.a👍b%</a></p>
232279```
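
To make the normalization in the example above concrete, here is a minimal sketch of such percent encoding.
It is an approximation written for this explanation, not the code of `sanitizeUri`: the function name `percentEncodeSketch` and the exact set of characters treated as safe are assumptions.

```js
// Hypothetical helper for illustration; not this package's API.
function percentEncodeSketch(value) {
  // Split by code point so astral characters such as emoji stay whole.
  const chars = Array.from(value)
  let result = ''

  for (let index = 0; index < chars.length; index++) {
    const char = chars[index]
    const code = char.charCodeAt(0)
    const next = (chars[index + 1] || '') + (chars[index + 2] || '')

    if (char === '%' && /^[0-9A-Fa-f]{2}$/.test(next)) {
      // Already a valid percent escape: keep it as-is.
      result += char
    } else if (/^[!#$&-;=?-Z_a-z~]$/.test(char)) {
      // Assumed set of characters that are safe in a URL.
      result += char
    } else if (char.length === 1 && code >= 0xd800 && code <= 0xdfff) {
      // Lone surrogate: `encodeURIComponent` would throw, so encode U+FFFD.
      result += '%EF%BF%BD'
    } else {
      // Everything else (non-ASCII, spaces, a stray `%`) is percent encoded.
      result += encodeURIComponent(char)
    }
  }

  return result
}

console.log('http://' + percentEncodeSketch('www.a👍b%'))
// → http://www.a%F0%9F%91%8Db%25
```

Because valid escapes such as `%20` are kept, running the sketch over an already encoded URL leaves it unchanged.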
233280
281+ There is a big difference between how www and protocol literals work
282+ compared to how email literals work.
283+ The first two are done when parsing, and work like anything else in
284+ markdown.
285+ But email literals are handled afterwards: when everything is parsed, we
286+ look back at the events to figure out if there were email addresses.
287+ This particularly affects how they interleave with character escapes and
288+ character references.
289+
234290## Types
235291
236292This package is fully typed with [ TypeScript] [ ] .
237293It exports no additional types.
238294
239295## Compatibility
240296
241- This package is at least compatible with all maintained versions of Node.js.
242- As of now, that is Node.js 12.20+, 14.14+, 16.0+, and 18.0+.
243- It also works in Deno and modern browsers.
297+ Projects maintained by the unified collective are compatible with all maintained
298+ versions of Node.js.
299+ As of now, that is Node.js 16+.
300+ Our projects sometimes work with older versions, but this is not guaranteed.
301+
302+ These extensions work with ` micromark ` version 3+.
244303
245304## Security
246305
@@ -250,12 +309,14 @@ construct always produces safe links.
250309
251310## Related
252311
253- * [ ` syntax-tree/mdast-util-gfm-autolink-literal ` ] [ util ]
254- — support GFM autolink literals in mdast
255- * [ ` syntax-tree/mdast-util-gfm ` ] [ mdast-util-gfm ]
256- — support GFM in mdast
257- * [ ` remarkjs/remark-gfm ` ] [ plugin ]
258- — support GFM in remark
312+ * [ ` micromark-extension-gfm ` ] [ micromark-extension-gfm ]
313+ — support all of GFM
314+ * [ ` mdast-util-gfm-autolink-literal ` ] [ mdast-util-gfm-autolink-literal ]
315+ — support GFM autolink literals in mdast
316+ * [ ` mdast-util-gfm ` ] [ mdast-util-gfm ]
317+ — support all of GFM in mdast
318+ * [ ` remark-gfm ` ] [ remark-gfm ]
319+ — support all of GFM in remark
259320
260321## Contribute
261322
@@ -317,20 +378,32 @@ abide by its terms.
317378
318379[ typescript ] : https://www.typescriptlang.org
319380
320- [ condition ] : https://nodejs.org/api/packages.html#packages_resolving_user_conditions
381+ [ development ] : https://nodejs.org/api/packages.html#packages_resolving_user_conditions
321382
322- [ util ] : https://github.com/syntax-tree/mdast-util-gfm-autolink-literal
383+ [ micromark ] : https://github.com/micromark/micromark
384+
385+ [ micromark-extension-gfm ] : https://github.com/micromark/micromark-extension-gfm
323386
324- [ plugin ] : https://github.com/remarkjs/remark-gfm
387+ [ micromark-util-sanitize-uri ] : https://github.com/micromark/micromark/tree/main/packages/micromark-util-sanitize-uri
388+
389+ [ micromark-extension ] : https://github.com/micromark/micromark#syntaxextension
390+
391+ [ micromark-html-extension ] : https://github.com/micromark/micromark#htmlextension
392+
393+ [ mdast-util-gfm ] : https://github.com/syntax-tree/mdast-util-gfm
394+
395+ [ mdast-util-gfm-autolink-literal ] : https://github.com/syntax-tree/mdast-util-gfm-autolink-literal
396+
397+ [ remark-gfm ] : https://github.com/remarkjs/remark-gfm
325398
326399[ spec ] : https://github.github.com/gfm/#autolinks-extension-
327400
328- [ html ] : https://html.spec.whatwg.org/multipage/text-level-semantics.html#the-a-element
401+ [ html-a ] : https://html.spec.whatwg.org/multipage/text-level-semantics.html#the-a-element
329402
330403[ css ] : https://github.com/sindresorhus/github-markdown-css
331404
332- [ micromark ] : https://github.com/micromark/micromark
405+ [ cmark-gfm ] : https://github.com/github/cmark-gfm
333406
334- [ micromark-extension- gfm] : https://github.com/micromark/micromark-extension-gfm
407+ [ api- gfm-autolink-literal ] : #gfmautolinkliteral
335408
336- [ mdast-util- gfm] : https://github.com/syntax-tree/mdast-util-gfm
409+ [ api- gfm-autolink-literal-html ] : #gfmautolinkliteralhtml