This repository was archived by the owner on Nov 27, 2025. It is now read-only.
README.org: 21 additions & 19 deletions
@@ -1,15 +1,17 @@
1
-
#+title: org-mode-hs
2
-
3
-
This repository provides a parser and exporters for Org Mode documents. The Org document is parsed into an AST similar to =org-element='s, and the exporters are highly configurable using HTML, Markdown or LaTeX templates.
4
1
5
2
#+begin_quote
6
3
Post Mortem Note: I've learned a lot in my Org journey (which included writing this parser), but I have stopped using Org for anything other than tangling the Emacs config. Honestly, it's quite a messy language once you dig into its idiosyncrasies, its faults and its details. And it was not exactly planned to be used in the outside world.
7
4
8
5
I've come to appreciate simpler and more standardized languages, and if I were to recommend something it would be [[https://commonmark.org/][CommonMark]] with all the [[https://github.com/jgm/commonmark-hs/tree/master/commonmark-extensions][extensions]] enabled.
9
6
#+end_quote
10
7
8
+
* org-mode-hs
9
+
10
+
This repository provides a parser and exporters for Org Mode documents. The Org document is parsed into an AST similar to =org-element='s, and the exporters are highly configurable using HTML, Markdown or LaTeX templates.
11
+
12
+
11
13
12
-
* Table of contents :TOC:
14
+
** Table of contents :TOC:
13
15
- [[#org-cli-horg][org-cli (horg)]]
14
16
- [[#usage][Usage]]
15
17
- [[#installation-from-source][Installation from source]]
@@ -21,11 +23,11 @@ I've come to appreciate simpler and more standardized languages, and if I were t
The Pandoc exporter does not output Pandoc formats directly, but rather, it generates a JSON AST that can be fed into Pandoc. You can pipe this JSON into Pandoc to convert to Markdown or any other supported format:
It can happen that the JSON API version is incompatible with your installed version of Pandoc. The CI compiles =horg= with the latest Pandoc API available at build time, so using the latest released version of the Pandoc binary has a good chance of fixing the problem.
Please note that this library and some of its dependencies are not on Hackage yet, so you need to clone this repository first.
64
66
65
-
** Customizing templates
67
+
*** Customizing templates
66
68
You can use the =horg init-templates= command to populate a =.horg= directory in the current directory with the default templates, which you can then modify.
67
69
68
70
Detailed documentation on how the templates work is TODO.
69
71
70
-
* org-parser library
71
-
** How to test and play with it
72
-
*** Testing the parser in =ghci=
72
+
** org-parser library
73
+
*** How to test and play with it
74
+
**** Testing the parser in =ghci=
73
75
74
76
This assumes you have =cabal= installed.
75
77
@@ -85,10 +87,10 @@ You can write the contents to be parsed between =[text|= and =|]=. More generall
85
87
86
88
Where =[parser to parse]= can be basically any of the functions from =Org.Parser.Document=, =Org.Parser.Elements= or =Org.Parser.Objects= whose types are wrapped by the =OrgParser= or =Marked OrgParser= monads. You don't need to import those modules yourself as they are already imported in the ~test~ namespace.
87
89
88
-
*** Unit tests
90
+
**** Unit tests
89
91
You can view the unit tests under [[org-parser/test][org-parser/test]]. They aim to cover as many corner cases as possible against org-element, so you can take a look there to see what already works, and how well it works.
90
92
91
-
** Progress
93
+
*** Progress
92
94
In terms of the spec (see below the table for other features), the following components are implemented:
@@ -148,7 +150,7 @@ In the spec terms (see below the table for other features), the following compon
148
150
| Markup | X | X |
149
151
(Thanks @tecosaur for the table)
150
152
151
-
*** Going beyond what is listed in the spec
153
+
**** Going beyond what is listed in the spec
152
154
153
155
~org-element-parse-buffer~ does not parse /everything/ that will eventually be parsed or processed when exporting a document written in Org-mode. Examples of Org features that are not handled by the parser alone (and so aren't described in the spec) include: content from keywords like =#+title:=, which is parsed "later" by the exporter itself; references in lines of =src= or =example= blocks and link resolution, which are done in a post-processing step; and =#+include:= keywords, =TODO= keywords and radio links, which are handled in a pre-processing step.
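As a rough illustration of the "parsed later" idea, here is a minimal sketch in which the parser stores a keyword as a generic node and a separate pass gives it meaning. The =Element= type and the =documentTitle= function are hypothetical names invented for this example; they are not the real =org-parser= or =org-exporters= API.

#+begin_src haskell
import Data.Maybe (listToMaybe)

-- Hypothetical, heavily simplified AST (not the real org-parser types).
data Element
  = Keyword String String   -- key and raw value, e.g. Keyword "title" "My notes"
  | Paragraph String
  deriving (Eq, Show)

-- A later pass, standing in for exporter-side processing: it interprets
-- the first generic "title" keyword as the document title.
documentTitle :: [Element] -> Maybe String
documentTitle els = listToMaybe [v | Keyword "title" v <- els]
#+end_src

In this sketch the parser itself knows nothing about titles. Only the later pass decides that a keyword with the key =title= is special, which is the kind of split described above.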
154
156
@@ -164,7 +166,7 @@ Since the aspects listed above are genuine /org-mode features/, and not optional
164
166
| Per-file TODO keywords | not yet (on the way, some work is done) |
165
167
| Macro definitions and substitution | not yet (on the way, some work is done) |
166
168
167
-
** Comparison to Pandoc
169
+
*** Comparison to Pandoc
168
170
The main difference between =org-parser= and the Pandoc Org Reader is that this one parses into an AST that is more similar to org-element's AST, while Pandoc's parses into the =Pandoc= AST, which cannot express all Org elements directly. This has the effect that some Org features are either unsupported by the reader or "projected" onto =Pandoc= in ways that bundle less information about the Org source. In contrast, this parser aims to represent Org documents more faithfully before "projecting" them into formats like HTML or the Pandoc AST itself. So you can expect more Org-specific features to be parsed, and hopefully more accurate parsing in general.
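To make the idea of "projecting" concrete, here is a toy sketch. Both types below are hypothetical and far simpler than the real =org-parser= or =Pandoc= ASTs, and the example is not a claim about how Pandoc treats any particular Org element. It only shows how two distinct nodes in a richer AST can collapse into the same node in a smaller one.

#+begin_src haskell
-- Hypothetical rich, Org-like inline AST (an assumption for illustration).
data OrgInline
  = OrgText String
  | OrgTimestamp Bool String   -- active flag and the raw date text
  deriving (Eq, Show)

-- Hypothetical smaller target AST with no dedicated timestamp node.
data GenericInline
  = Str String
  deriving (Eq, Show)

-- Projection onto the smaller AST: a timestamp can only be rendered as
-- plain text, so the fact that it was a timestamp is no longer recorded.
project :: OrgInline -> GenericInline
project (OrgText s) = Str s
project (OrgTimestamp active date)
  | active    = Str ("<" ++ date ++ ">")
  | otherwise = Str ("[" ++ date ++ "]")
#+end_src

Here =project (OrgTimestamp True "2024-01-01")= and =project (OrgText "<2024-01-01>")= give the same result, so a consumer of the smaller AST cannot tell them apart. Keeping the richer AST until the last step avoids this.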
169
171
170
172
Also, if you are a developer mainly interested in rendering Org documents to HTML, Pandoc is a very big library to depend upon, with very long build times (at least on my computer, sadly).
@@ -181,10 +183,10 @@ This single paragraph is broken into three by Pandoc, because it looks for a new
181
183
182
184
Another noteworthy difference is that =haskell-org-parser= uses a different parsing library, ~megaparsec~. Pandoc uses the older ~parsec~, but also bundles many features in its own library.
183
185
184
-
* org-exporters library
186
+
** org-exporters library
185
187
This library provides functions for post-processing of the Org AST and exporting to various formats with =ondim=.
186
188
187
-
** Defining a new export backend
189
+
*** Defining a new export backend
188
190
Basically:
189
191
- Use the [[https://github.com/lucasvreis/ondim][~ondim~ library]] to create an Ondim template system for the desired format, if it does not already exist.
190
192
- Import ~Org.Exporters.Common~ and create an ~ExportBackend~ for your format.