
Detect-Anything

  1. YOLO-World
  2. EfficientViT-SAM
  3. LaMa
  4. Stable Diffusion 2 Inpainting

Getting Started

Installation

Download the pretrained weights from the links in the Weights section below and save them in the weights directory.

Use uv to create a new virtual environment and install the required packages.

uv venv

.venv\Scripts\activate (Windows) or source .venv/bin/activate (Linux/macOS)

uv pip install -r pyproject.toml

Weights

Download the weights from the following links and save them in the weights directory.

curl -LJO https://huggingface.co/smartywu/big-lama/resolve/main/big-lama.zip
unzip big-lama.zip

EfficientViT-SAM-XL1: https://huggingface.co/han-cai/efficientvit-sam/resolve/main/xl1.pt
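
As with the big-lama checkpoint above, the file can be fetched with curl; running the command from inside the weights directory is an assumption based on the instruction above.

curl -LJO https://huggingface.co/han-cai/efficientvit-sam/resolve/main/xl1.pt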

Running the Project

uv run app.py

Core Models

YOLO-World

YOLO-World is an open-vocabulary object detection model with high efficiency. On the challenging LVIS dataset, YOLO-World achieves 35.4 AP with 52.0 FPS on V100, which outperforms many state-of-the-art methods in terms of both accuracy and speed.
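
To illustrate how the open-vocabulary prompting works, here is a minimal sketch using the Ultralytics YOLOWorld wrapper; the checkpoint name, image path, and class list are placeholders, and this project may load YOLO-World through a different interface.

from ultralytics import YOLOWorld

# Load a pretrained YOLO-World checkpoint (name is a placeholder).
model = YOLOWorld("yolov8s-world.pt")

# Open-vocabulary detection: the classes are given as free-form text at run time.
model.set_classes(["person", "backpack", "dog"])

# Run detection and print the resulting boxes in (x1, y1, x2, y2) pixel coordinates.
results = model.predict("example.jpg", conf=0.25)
print(results[0].boxes.xyxy)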

EfficientViT-SAM

EfficientViT-SAM is a new family of accelerated segment anything models. Thanks to the lightweight and hardware-efficient core building block, it delivers 48.9× measured TensorRT speedup on A100 GPU over SAM-ViT-H without sacrificing performance.
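
A minimal sketch of box-prompted segmentation with the XL1 checkpoint; the import paths and the create_sam_model / EfficientViTSamPredictor names follow the upstream mit-han-lab/efficientvit demo code and may differ from how this project wires the model, and the weight path and box coordinates are placeholders.

import numpy as np
import torch
from efficientvit.sam_model_zoo import create_sam_model
from efficientvit.models.efficientvit.sam import EfficientViTSamPredictor

# Build the XL1 variant and load the downloaded checkpoint (path is a placeholder).
sam = create_sam_model(name="xl1", weight_url="weights/xl1.pt")
sam = sam.cuda().eval() if torch.cuda.is_available() else sam.eval()

predictor = EfficientViTSamPredictor(sam)
predictor.set_image(np.zeros((480, 640, 3), dtype=np.uint8))  # replace with a real RGB image array

# Prompt with a bounding box (x1, y1, x2, y2), e.g. one produced by YOLO-World.
masks, scores, _ = predictor.predict(box=np.array([100, 100, 300, 300]), multimask_output=False)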

LaMa

LaMa is an advanced image inpainting method that significantly improves the restoration of large missing areas, complex geometric structures, and high-resolution images through the use of fast Fourier convolutions and high receptive field perceptual loss.
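
The key ingredient is the fast Fourier convolution: a convolution applied to the image's Fourier representation mixes information across the whole image, so every layer sees a global receptive field. The toy sketch below shows only that spectral branch, not LaMa's actual implementation, which combines it with a local convolutional branch and trains with the high receptive field perceptual loss.

import torch
import torch.nn as nn

class SpectralConv(nn.Module):
    """Toy spectral transform: a pointwise conv applied in the frequency domain."""

    def __init__(self, channels):
        super().__init__()
        # Real and imaginary parts are stacked along the channel axis.
        self.conv = nn.Conv2d(channels * 2, channels * 2, kernel_size=1)

    def forward(self, x):
        b, c, h, w = x.shape
        freq = torch.fft.rfft2(x, norm="ortho")            # complex, shape (b, c, h, w//2+1)
        freq = torch.cat([freq.real, freq.imag], dim=1)    # to a real-valued tensor
        freq = torch.relu(self.conv(freq))                 # one update mixes global information
        real, imag = freq.chunk(2, dim=1)
        return torch.fft.irfft2(torch.complex(real, imag), s=(h, w), norm="ortho")

x = torch.randn(1, 8, 64, 64)
print(SpectralConv(8)(x).shape)  # torch.Size([1, 8, 64, 64])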

Stable Diffusion 2 Inpainting

Stable Diffusion 2 Inpainting is a diffusion-based image inpainting model that generates plausible, high-quality content for masked regions, and is widely used for object removal, content filling, and similar scenarios.
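
A minimal sketch using the Hugging Face diffusers pipeline for the stabilityai/stable-diffusion-2-inpainting checkpoint; this project may wrap the pipeline differently, and the prompt and file paths are placeholders.

import torch
from PIL import Image
from diffusers import StableDiffusionInpaintPipeline

# Load the SD2 inpainting checkpoint (downloads from the Hugging Face Hub on first use).
pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-inpainting",
    torch_dtype=torch.float16,
).to("cuda")

image = Image.open("input.png").convert("RGB").resize((512, 512))
mask = Image.open("mask.png").convert("L").resize((512, 512))  # white pixels mark the region to repaint

result = pipe(prompt="a wooden bench in a park", image=image, mask_image=mask).images[0]
result.save("inpainted.png")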

Citation

@article{cheng2024yolow,
  title={YOLO-World: Real-Time Open-Vocabulary Object Detection},
  author={Cheng, Tianheng and Song, Lin and Ge, Yixiao and Liu, Wenyu and Wang, Xinggang and Shan, Ying},
  journal={arXiv preprint arXiv:2401.17270},
  year={2024}
}

@misc{zhang2024efficientvitsam,
  title={EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss},
  author={Zhuoyang Zhang and Han Cai and Song Han},
  year={2024},
  eprint={2402.05008},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}

@article{suvorov2021resolution,
  title={Resolution-robust Large Mask Inpainting with Fourier Convolutions},
  author={Suvorov, Roman and Logacheva, Elizaveta and Mashikhin, Anton and Remizova, Anastasia and Ashukha, Arsenii and Silvestrov, Aleksei and Kong, Naejin and Goka, Harshith and Park, Kiwoong and Lempitsky, Victor},
  journal={arXiv preprint arXiv:2109.07161},
  year={2021}
}

Note

LaMa refinement: the refiner only runs when refine is set to True in lama/configs/prediction/default.yaml; it is disabled by default.

refine: False # refiner will only run if this is True