performance optimization #3

@stevengj

I haven't done any benchmarking yet, but it seems likely that the current algorithm will be fairly slow. It can probably be made much faster if needed.

For the "plain text" of the document, PATTERN does a regex match one character at a time via the final (.) pattern. (And for each regex match there is a bunch of type-unstable code that executes.) One simple improvement would be to update the regex so that it can match a long run of plain text in a single match.
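
To make the idea concrete, here is a rough sketch of the difference. The two regexes below are hypothetical stand-ins, not the package's actual PATTERN (which isn't reproduced in this issue): they assume a toy syntax in which a backslash followed by lowercase letters is a control word and everything else is plain text. The helper name `scan` is also made up for illustration.

```julia
# Hypothetical stand-ins for PATTERN; the real regex is not shown in this issue.
# Assumed toy syntax: "\" followed by lowercase letters is a control word,
# anything else is plain text.
per_char = r"\\([a-z]+)|(.)"s            # plain text: one character per match
per_run  = r"\\([a-z]+)|([^\\]+|.)"s     # plain text: a whole run per match,
                                         # falling back to (.) for a stray backslash

# Walk the document with `pattern`, collecting the plain text captured in
# group 2 and counting how many regex matches were needed to do so.
function scan(pattern::Regex, doc::AbstractString)
    io = IOBuffer()
    nmatches = 0
    for m in eachmatch(pattern, doc)
        nmatches += 1
        text = m.captures[2]             # `nothing` when a control word matched
        text === nothing || print(io, text)
    end
    return String(take!(io)), nmatches
end

doc = "Hello \\b world\\i !"             # i.e. the text: Hello \b world\i !

text_char, n_char = scan(per_char, doc)
text_run,  n_run  = scan(per_run, doc)

println("per-character: ", n_char, " matches; per-run: ", n_run, " matches")
println("same output: ", text_char == text_run)
```

Both patterns extract the same plain text, but the per-run version needs one match per stretch of text rather than one per character, so whatever per-match overhead exists (including the type-unstable code) is paid far less often. How much this helps in practice would still need benchmarking.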

There might be other ways to improve performance. Relying on StringEncodings/iconv for translating a few bytes at a time (we have to flush before every print because Unicode and Windows-codepage encodings are intermixed) is surely inefficient, and also type-unstable because the code page is in the type of the encoder stream. But I don't really want to implement a Julia-native code-page conversion routine myself.

Labels: enhancement