hamac03/VietnameseCalligraphyRecognition

Datasets

  • Vicalligraphy: A dataset collected from the Internet, consisting of Vietnamese calligraphy images in various writing styles.
  • ViSynth1M: A synthetic dataset containing 1,000,000 scene text images.
  • ViCalligraphySynth: A synthetic dataset containing 10,000 generated Vietnamese calligraphy images, created using 5 Vietnamese calligraphy fonts. It is designed to improve OCR models' ability to recognize calligraphic text with diverse font styles and layouts.
  • SupportSamples: Samples generated from the 5 Vietnamese calligraphy fonts, used to compare confusable words and select the most similar one.

ABINet, SRN, PARSeq, SVTR, ViTSTR

We use PaddleOCR to train and evaluate five models: ABINet, SRN, PARSeq, SVTR, and ViTSTR.

Installation

Ensure you have Python 3.7 or later installed. Then install PaddlePaddle (GPU build):

pip install paddlepaddle-gpu==2.6.1

If you run into any errors, follow the official guide: PaddleOCR Quick Start
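
Depending on your environment, you may also need the Python dependencies of the bundled PaddleOCR code. Assuming the standard PaddleOCR layout with a requirements.txt at its root, they can be installed with:

pip install -r PaddleOCR/requirements.txt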

Training

Run the following command to train a model:

python tools/train.py -c path/to/config/file

Configuration files for models can be found in: PaddleOCR/config/ViCalligraphy/
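
For example, to train one of the models (the config file name below is illustrative; use an actual file from that directory):

python tools/train.py -c PaddleOCR/config/ViCalligraphy/rec_svtr_vicalligraphy.yml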

Evaluation

To evaluate a trained model, use:

python tools/eval.py -c path/to/config/file -o Global.pretrained_model=path/to/pretrained/model

Checkpoints for models can be found in: PaddleOCR/output/rec/
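
For example, to evaluate a trained model (both paths below are illustrative; use the actual config file and checkpoint directory):

python tools/eval.py -c PaddleOCR/config/ViCalligraphy/rec_svtr_vicalligraphy.yml -o Global.pretrained_model=PaddleOCR/output/rec/svtr_vicalligraphy/best_accuracy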

VietOCR

Train and evaluate VietOCR

Installation

Install using pip:

pip install vietocr

Quick Start

Follow the notebook vietocr/ViCalligraphy.ipynb to see how to use the model.

Our model weights: vietocr/weights/transformerocr.pth
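
A minimal inference sketch using the VietOCR Python API, assuming the default vgg_transformer architecture (adjust the config name if the notebook uses a different one) and the weight file above:

from PIL import Image
from vietocr.tool.config import Cfg
from vietocr.tool.predictor import Predictor

# Load the base config and point it at the fine-tuned weights.
config = Cfg.load_config_from_name('vgg_transformer')  # assumed architecture
config['weights'] = 'vietocr/weights/transformerocr.pth'
config['device'] = 'cuda:0'  # or 'cpu'

predictor = Predictor(config)

# Recognize a single calligraphy image (path is a placeholder).
img = Image.open('path/to/calligraphy_image.jpg')
print(predictor.predict(img))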

SMTR

We use OpenOCR to train and evaluate SMTR.

Quick Start

Dependencies:

  • PyTorch version >= 1.13.0
  • Python version >= 3.7

conda create -n openocr python==3.8
conda activate openocr
# install gpu version torch
conda install pytorch==2.2.0 torchvision==0.17.0 torchaudio==2.2.0 pytorch-cuda=11.8 -c pytorch -c nvidia
# or cpu version
conda install pytorch torchvision torchaudio cpuonly -c pytorch

After installing the dependencies, install OpenOCR following its official instructions, or recreate our conda environment instead:

conda env create -f OpenOCR/environment.yml

Usage:

python tools/infer_rec.py --c ./configs/rec/svtrv2/repsvtr_ch.yml --o Global.infer_img=/path/to/image_folder_or_file

Our config file: OpenOCR/configs/smtr/config.yml
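
For example, to run inference with our SMTR config from inside the OpenOCR directory (the image path is a placeholder):

python tools/infer_rec.py --c configs/smtr/config.yml --o Global.infer_img=/path/to/calligraphy_image.jpg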

Our checkpoint here.

CCD

Installation

Python == 3.7

conda install pytorch==1.10.0 torchvision==0.11.0 torchaudio==0.10.0 cudatoolkit=11.3 -c pytorch -c conda-forge
pip install -r CCD/CCD_Ha/requirement.txt

Or using conda env:

conda env create -f CCD/environment.yml

Fine-tuning

The character-based and stroke-based models differ only in the inference step; therefore, during fine-tuning we follow the training procedure of the character-based model.

cd CCD/CCD_Ha/
CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch --nproc_per_node=4 train_finetune.py --config path/to/config/file

Our configuration files: CCD/CCD_Ha/Dino/configs/
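
For example, from CCD/CCD_Ha/ (the config file name is illustrative; use an actual file from that directory):

CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch --nproc_per_node=4 train_finetune.py --config Dino/configs/ViCalligraphy_finetune.yaml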

Character-based models:

Stroke-based models:

Testing

# Character-based
cd CCD/CCD_Ha
CUDA_VISIBLE_DEVICES=0 python test.py --config path/to/config/file

# Stroke-based (Stroke-level Decomposition)
cd CCD/CCD_stroke
CUDA_VISIBLE_DEVICES=0 python test.py --config path/to/config/file

Our checkpoint files: CCD/CCD_Ha/saved_models/

Character-based models:

Stroke-based models:

Demo App

Installation

Python = 3.7

pip install streamlit==1.23.1 streamlit-drawable-canvas

Run app

streamlit run DemoSTR/app.py --server.port 8501 --server.address 0.0.0.0

Demo
