Skip to content

iis-research-team/RuMathBERT

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

8 Commits
 
 
 
 
 
 
 
 

Repository files navigation

RuMathBERT

MathBert for Russian language

Dataset for training is available here: https://drive.google.com/file/d/11tSWlA_TU0eESvAxo31-Ir5yi3LPbVLy/view?usp=sharing

WordPiece tokenizer is available here: https://drive.google.com/drive/folders/1wwHMSLUsdf5ZwHPxXLVY8F0VaXfGD2Q_?usp=sharing

The OPTs.json file containing 98226 formula operator trees is available here: https://drive.google.com/file/d/1jIwokyfs1kLdqBTux2lqZytIuOrwynqJ/view?usp=sharing

The file structure is as follows:

[
  {
    "nodes": [
      {
        "id": <node id>,
        "type": "<node type: function, variable, number, etc>",
        "value": "<node value",
        "description": "<additional description for better readability>"
      },
      ...
    ],
    "edges": [
      [<source id>, <target id>],
      ...
    ],
    "latex_string": "<latex expression>"
  },
  ...
]

About

MathBert for Russian language

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 2

  •  
  •  

Languages