bb

Brian's Playground

This repository contains things I've tinkered with. Some of it may be interesting, some of it may be incomplete. The contents are provided "as is".

`xhtml.dyalog` namespace

Contains utilities to:

- convert HTML to XHTML that can subsequently be parsed by ⎕XML
- search and extract elements from the result of ⎕XML
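For orientation, here is a minimal sketch of the matrix that ⎕XML produces, since both search utilities operate on it. The sample document and the name `m` are made up for illustration. ⎕XML returns one row per element and five columns (level, element name, content, attributes, markup description).

```apl
      m ← ⎕XML '<a><b class="x">hi</b></a>'   ⍝ sample XML, illustration only
      ⍴m        ⍝ shape: one row per element, five columns
      m[;1]     ⍝ level of each element (the root element is level 0)
      m[;2]     ⍝ element names
      m[;3]     ⍝ character content
      m[;4]     ⍝ attributes, as name/value pairs per element
```

The level column is what the `levels` part of an `Xfind` specification refers to.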

`HTMLtoXHTML`

`xhtml ← xhtml.HTMLtoXHTML html`

- `html` is a character vector containing HTML
- `xhtml` is a matrix form of the XHTML

`HTMLtoXHTML` assumes that the HTML is reasonably well formed (e.g. open tags have corresponding closing tags). It handles most, but probably not all, of the HTML-specific cases in which an element does not require a closing tag.

`Xfind`

`boolvec ← xml xhtml.Xfind spec`

- `xml` is an XML matrix (could be XHTML, but doesn't have to be)
- `spec` is a delimited-string search specification (the first character is the delimiter) in the form `/levels/elements/content/attribute/value` where:
  - `levels`, if non-empty, specifies the level(s) to consider in the search. For example, `3` specifies level 3 elements only, `3-` level 3 and lower (to 0), `3+` level 3 and higher, and `3-5` levels 3 through 5
  - `elements` is a space-delimited list of elements to select
  - `content` is case-insensitive content to search for using ⍷
  - `attribute` is a case-sensitive attribute name to match exactly
  - `value` is a case-insensitive attribute value to search for using ⍷. If no attribute is specified, all attributes are searched.
- `boolvec` is a Boolean vector marking matching elements

Examples:


      xml xhtml.Xfind '//table//class/results' ⍝ find all <table> elements with a class attribute containing 'results'

      xml xhtml.Xfind '/2////foobar' ⍝ find all level 2 elements with any attribute containing 'foobar'

      xml xhtml.Xfind '/3+/th td/bloof' ⍝ find all level 3 or higher <th> or <td> elements containing 'bloof'
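To make the specification format concrete, the sketch below splits a spec into its five fields using the first character as the delimiter. This is only an illustration of the format, not necessarily how `Xfind` parses it internally. The variable names are hypothetical, and the code assumes the default `⎕ML` so that dyadic `⊂` is partitioned enclose.

```apl
      spec ← '//table//class/results'
      fields ← 1↓¨(spec=⊃spec)⊂spec      ⍝ start a new field at each delimiter, then drop the delimiter
      (levels elements content attribute value) ← fields   ⍝ this spec has exactly five fields
      ⍝ elements is 'table', attribute is 'class', value is 'results'
      ⍝ levels and content are empty, so they place no restriction on the search
```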

`Xsel`

`elements ← xml xhtml.Xsel boolvec`

- `xml` is an XML matrix (could be XHTML, but doesn't have to be)
- `boolvec` is a Boolean vector with as many elements as there are rows in `xml`
- `elements` is a nested vector of the elements marked by `boolvec`, together with their descendants
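Because `Xfind` returns an ordinary Boolean vector, the results of several searches can be combined with `∧`, `∨` and `~` before being passed to `Xsel`. The sketch below assumes `xml` is an existing XML matrix. The second specification (an `id` attribute containing 'draft') and the variable names are hypothetical, chosen only to show a selection that a single spec cannot express.

```apl
      res  ← xml xhtml.Xfind '//table//class/results'   ⍝ <table>s whose class contains 'results'
      skip ← xml xhtml.Xfind '//table//id/draft'        ⍝ hypothetical: <table>s whose id contains 'draft'
      mask ← res ∧ ~skip                                ⍝ keep the first set, exclude the second
      +/mask                                            ⍝ number of matching elements
      tables ← xml xhtml.Xsel mask                      ⍝ extract the matches and their descendants
```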

Typical Use Case

In general, you'll convert some HTML to XHTML and then search for and extract the elements of interest to you. For example:


      resp ← HttpCommand.Get 'someurl.com/somefile.html' ⍝ make a request 
      'request failed' ⎕SIGNAL (0 200≢resp.(rc HttpStatus))/777 ⍝ check that it succeeded
      h ← resp.Data ⍝ grab the response data
      x ← xhtml.HTMLtoXHTML h ⍝ convert to XHTML
      mytables ← x xhtml.Xsel x xhtml.Xfind '//table//class/results' ⍝ extract all the <table> elements with a class attribute containing "results"

