A lightweight Python prototype that transforms a single-object photo or a short text prompt into a basic 3D mesh (.obj/.stl), complete with a quick visualization.
- Dual-Mode Input: Accepts either a photo (`.jpg`/`.png`) of an object or a text prompt (e.g., “A small toy car”).
- Open-Source Pipelines: Leverages HunyuanDiT for text-to-image and Hunyuan3D-2mini for image-to-mesh.
- Fast Prototyping: Two simple function calls, `t2i(prompt)` and `i2m(image)`, to generate your 3D model.
- Export Formats: Outputs a `.obj` (or `.stl`) file compatible with tools like MeshLab, Blender, or 3D slicers.
- Extendable: Easy to integrate background removal (`rembg`), alternative shape generators (e.g., Shap-E), or custom post-processing.
- Python 3.8 or higher
- CUDA-enabled GPU (12 GB VRAM recommended)
- virtualenv for environment isolation
- Hugging Face account (for model downloads)
1. Clone the repository

   ```bash
   git clone https://github.com/Tencent/Hunyuan3D-2.git
   cd Hunyuan3D-2
   ```

2. Set up a virtual environment

   ```bash
   python3 -m venv venv
   source venv/bin/activate
   ```

3. Install Python dependencies

   ```bash
   pip install --upgrade pip
   pip install -r requirements.txt
   pip install torch torchvision diffusers transformers trimesh
   ```

4. Install HunyuanDiT & Hunyuan3D-mini

   ```bash
   pip install huggingface_hub
   pip install git+https://github.com/Tencent/HunyuanDiT.git
   # Or, if you prefer the Diffusers format:
   pip install diffusers
   ```
```bash
python main.py --input_image path/to/photo.jpg --mode photo
```
- Preprocessing (optional): Background removal using `hy3dgen.rembg` for cleaner silhouettes.
- 3D Generation: Feeds the RGB image into the `Hunyuan3DDiTFlowMatchingPipeline`.
- Output: `output.obj` mesh saved to the project root.
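The photo-mode flow above can be sketched as a minimal driver script. This is a sketch under the README's assumptions: the `BackgroundRemover` import and the `from_pretrained` call mirror the Hunyuan3D-2 quickstart, while `build_parser` and `photo_to_mesh` are illustrative names, not functions from the repo.

```python
import argparse

def build_parser():
    # Flag names mirror the CLI examples in this README.
    p = argparse.ArgumentParser(description="Photo/text to 3D mesh")
    p.add_argument("--input_image", help="path to a .jpg/.png photo")
    p.add_argument("--prompt", help="text prompt, e.g. 'A small toy car'")
    p.add_argument("--mode", choices=["photo", "text"], default="photo")
    p.add_argument("--output", default="output.obj")
    return p

def photo_to_mesh(image_path, output_path="output.obj"):
    # Heavy imports stay local so the CLI can be inspected without a GPU.
    from PIL import Image
    from hy3dgen.rembg import BackgroundRemover
    from hy3dgen.shapegen import Hunyuan3DDiTFlowMatchingPipeline

    img = Image.open(image_path).convert("RGB")
    img = BackgroundRemover()(img)  # optional: cleaner silhouette
    pipe = Hunyuan3DDiTFlowMatchingPipeline.from_pretrained(
        'tencent/Hunyuan3D-2mini',
        subfolder='hunyuan3d-dit-v2-mini-turbo',
        device='cuda'
    )
    mesh = pipe(img, num_inference_steps=5)[0]
    mesh.export(output_path)
    return output_path

if __name__ == "__main__":
    args, _ = build_parser().parse_known_args()
    if args.mode == "photo" and args.input_image:
        photo_to_mesh(args.input_image, args.output)
```

Keeping the model imports inside `photo_to_mesh` means `--help` and argument validation work even on machines without CUDA.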
```bash
python main.py --prompt "A small toy car" --mode text
```
- Text → Image: `HunyuanDiTPipeline` generates a 512×512 PIL image of your prompt.
- Image → Mesh: Flow-matching model creates the 3D mesh.
- Output: `output.obj` ready for viewing or 3D printing.
```python
from hy3dgen.text2image import HunyuanDiTPipeline
from hy3dgen.shapegen import Hunyuan3DDiTFlowMatchingPipeline

# Text → 2D
t2i = HunyuanDiTPipeline(
    'Tencent-Hunyuan/HunyuanDiT-v1.2-Diffusers-Distilled',
    device='cuda'
)
img = t2i("A shiny red apple on white")

# 2D → 3D
i2m = Hunyuan3DDiTFlowMatchingPipeline.from_pretrained(
    'tencent/Hunyuan3D-2mini',
    subfolder='hunyuan3d-dit-v2-mini-turbo',
    device='cuda'
)
mesh = i2m(img, num_inference_steps=5)[0]

# Export mesh
mesh.export("apple.obj")
```
- HunyuanDiT: Multi-resolution diffusion transformer for text-to-image [GitHub]
- Hunyuan3D-2mini: 0.6B parameter shape generator using flow matching [Hugging Face]
- diffusers & transformers: Model loading and inference [Hugging Face]
- trimesh: Mesh handling and export
- rembg (optional): Background removal for photos
- Model Selection: Chose HunyuanDiT for high-quality 2D generation and Hunyuan3D-mini for fast, Python-native mesh synthesis.
- Simplicity: Designed two-step pipelines (`t2i` & `i2m`) for minimal boilerplate.
- Extendability:
  - Integrate `rembg` to refine object silhouettes.
  - Explore combining with Shap-E for direct text-to-3D.
  - Add surface smoothing, UV mapping, and texture baking.
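The surface-smoothing idea above can be prototyped without extra dependencies. Below is a minimal uniform Laplacian smoothing sketch over a plain vertex/face list; the code is illustrative, not from the repo, and a real pipeline would more likely use trimesh's built-in smoothing filters.

```python
def laplacian_smooth(vertices, faces, iterations=1, lam=0.5):
    """Move each vertex toward the average of its edge neighbors.

    vertices: list of (x, y, z) tuples
    faces: list of (i, j, k) vertex-index triples
    lam: step size in [0, 1]; 0 = no change, 1 = full neighbor average
    """
    # Build vertex adjacency from the edges of each triangle.
    neighbors = {i: set() for i in range(len(vertices))}
    for i, j, k in faces:
        for a, b in ((i, j), (j, k), (k, i)):
            neighbors[a].add(b)
            neighbors[b].add(a)

    verts = [list(v) for v in vertices]
    for _ in range(iterations):
        new = []
        for idx, v in enumerate(verts):
            nbrs = neighbors[idx]
            if not nbrs:
                new.append(v[:])  # isolated vertex: leave in place
                continue
            avg = [sum(verts[n][c] for n in nbrs) / len(nbrs) for c in range(3)]
            new.append([v[c] + lam * (avg[c] - v[c]) for c in range(3)])
        verts = new
    return [tuple(v) for v in verts]
```

Uniform Laplacian smoothing shrinks the mesh slightly with each pass, which is usually acceptable for cleaning up generation noise before 3D printing.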