TRELLIS: 대규모 3D 생성 AI 모델

7 May 2025 ~ 7 May 2025

문서 목표

TRELLIS의 개념과 구조를 이해한다.
설치 및 사용법을 단계별로 따라 한다.
사전학습 모델, 데이터셋, 트레이닝 방법을 확인한다.
실전 활용 팁과 참고자료를 얻는다.

TRELLIS란?

TRELLIS는 Microsoft가 개발한 대규모 3D 생성 AI 모델입니다. 텍스트 또는 이미지 프롬프트를 입력받아 다양한 3D 자산(Radiance Fields, 3D Gaussians, Meshes 등)을 생성할 수 있습니다.

SLAT(Structured LATent): 다양한 3D 형식으로 디코딩 가능한 통합 구조적 잠재 표현
Rectified Flow Transformer: 강력한 3D 생성 백본
2B 파라미터, 50만 개 대규모 3D 데이터셋으로 학습
유연한 출력 포맷, 로컬 3D 편집 지원

프로젝트 페이지: https://trellis3d.github.io
논문: arXiv:2412.01506
데모: HuggingFace Spaces

🌟 주요 특징

고품질: 복잡한 형태와 텍스처의 다양한 3D 자산 생성
다양성: 텍스트/이미지 입력, Radiance Fields, 3D Gaussians, Meshes 등 다양한 출력
유연한 편집: 생성된 3D 자산의 변형, 부분 편집 가능

기능	설명
입력	텍스트, 이미지
출력	Radiance Field, 3D Gaussian, Mesh 등
사전학습 모델	최대 2B 파라미터, 다양한 모델 제공
데이터셋	TRELLIS-500K (50만 3D 오브젝트)
편집	로컬/부분 편집, 다양한 변형

📦 설치 방법

사전 준비

OS: Linux 권장 (Windows는 미완전 지원)
GPU: NVIDIA 16GB 이상 (A100, A6000, RTX 4090 등)
Python: 3.8 이상(3.10 권장)
CUDA Toolkit: 11.8 또는 12.2(Windows 기준 12.4 실행 확인): CUDA Toolkit 12.4
Build Tools for Visual Studio 2022: Build Tools for Visual Studio 2022

설치 단계

레포지토리 클론

  git clone --recurse-submodules https://github.com/microsoft/TRELLIS.git
  cd TRELLIS

파이썬 가상환경 생성

  python -m venv "trellis-venv"
  .\trellis-venv\Scripts\activate

pip, wheel, setuptools 업데이트

  python -m pip install --upgrade pip wheel
  python -m pip install setuptools=75.8.2

의존성 1 생성 (requirements_1.txt)

 --extra-index-url https://download.pytorch.org/whl/cu124
 torch==2.5.1+cu124
 torchvision==0.20.1+cu124
 xformers==0.0.28.post3

의존성 2 생성 (requirements_2.txt)

 pillow==10.4.0
 imageio==2.37.0
 imageio-ffmpeg==0.6.0
 tqdm==4.67.1
 easydict==1.13
 opencv-python-headless==4.11.0.86
 scipy==1.15.2
 ninja==1.11.1.4
 rembg==2.0.65
 onnxruntime==1.21.0
 trimesh==4.6.6
 xatlas==0.0.10
 pyvista==0.44.2
 pymeshfix==0.17.0
 igraph==0.11.8
 transformers==4.50.3
 open3d==0.19.0

 git+https://github.com/EasternJournalist/utils3d.git@9a4eb15e4021b67b12c460c7057d642626897ec8

 https://github.com/bdashore3/flash-attention/releases/download/v2.7.1.post1/flash_attn-2.7.1.post1+cu124torch2.5.1cxx11abiFALSE-cp310-cp310-win_amd64.whl; sys_platform == 'win32' and python_version == '3.10'
 https://github.com/bdashore3/flash-attention/releases/download/v2.7.1.post1/flash_attn-2.7.1.post1+cu124torch2.5.1cxx11abiFALSE-cp311-cp311-win_amd64.whl; sys_platform == 'win32' and python_version == '3.11'
 https://github.com/bdashore3/flash-attention/releases/download/v2.7.1.post1/flash_attn-2.7.1.post1+cu124torch2.5.1cxx11abiFALSE-cp312-cp312-win_amd64.whl; sys_platform == 'win32' and python_version == '3.12'
 flash-attn; sys_platform != 'win32'

 -e extensions/vox2seq

 --find-links https://nvidia-kaolin.s3.us-east-2.amazonaws.com/torch-2.5.1_cu124.html
 kaolin

 git+https://github.com/NVlabs/nvdiffrast.git
 git+https://github.com/JeffreyXiang/diffoctreerast.git
 git+https://github.com/autonomousvision/mip-splatting.git#subdirectory=submodules/diff-gaussian-rasterization/

 spconv-cu120==2.3.6

 gradio==4.44.1
 gradio_litmodel3d==0.0.1
 pydantic==2.10.6

의존성 설치

  python -m pip install -r requirements_1.txt
  python -m pip install -r requirements_2.txt

환경변수 설정 및 실행

 set ATTN_BACKEND=flash-attn
 set TORCH_CUDA_ARCH_LIST=8.9
 set SPCONV_ALGO=native
 set XFORMERS_FORCE_DISABLE_TRITON=1

 python ./app.py

참고: 사용 중인 GPU에 따라 환경변수 set TORCH_CUDA_ARCH_LIST=8.9의 값이 달라집니다. CUDA GPU Compute Capability를 참조하세요.

TIP: 설치가 오래 걸릴 수 있습니다. 에러 발생 시 의존성을 하나씩 설치하거나, 이슈에서 해결 방법을 확인하세요.

🤖 사전학습 모델

모델	설명	파라미터 수	다운로드
TRELLIS-image-large	이미지→3D 대형 모델	1.2B	Download
TRELLIS-text-base	텍스트→3D 기본 모델	342M	Download
TRELLIS-text-large	텍스트→3D 대형 모델	1.1B	Download
TRELLIS-text-xlarge	텍스트→3D 초대형 모델	2.0B	Download

이미지 조건 모델이 일반적으로 더 좋은 품질을 제공합니다.

모델 로딩 예시

from trellis.pipelines import TrellisImageTo3DPipeline

pipeline = TrellisImageTo3DPipeline.from_pretrained("JeffreyXiang/TRELLIS-image-large")
pipeline.cuda()

💡 사용법 (최소 예제)

from PIL import Image
from trellis.pipelines import TrellisImageTo3DPipeline

pipeline = TrellisImageTo3DPipeline.from_pretrained("JeffreyXiang/TRELLIS-image-large")
pipeline.cuda()
image = Image.open("assets/example_image/T.png")
outputs = pipeline.run(image, seed=1)
# outputs['gaussian'], outputs['radiance_field'], outputs['mesh'] 등 다양한 3D 결과

결과: 3D Gaussian, Radiance Field, Mesh 비디오(mp4) 및 파일(glb, ply) 생성
자세한 예제: example.py

🌐 웹 데모 실행

Gradio 기반 웹 데모 제공:

python .\app.py

Hugging Face에서 온라인 데모도 체험 가능: Live Demo

📚 데이터셋: TRELLIS-500K

Objaverse(XL), ABO, 3D-FUTURE, HSSD, Toys4k 등에서 50만 개 3D 오브젝트 수집
미적 품질 기반 필터링, 다양한 3D 형식 포함
상세 정보: DATASET.md

🏋️‍♂️ 트레이닝

train.py로 트레이닝 실행, 구성 파일(configs/)로 하이퍼파라미터 설정
싱글/멀티 노드, 체크포인트 재시작, 프로파일링 등 다양한 옵션 지원

트레이닝 명령 예시

python train.py \
  --config configs/vae/slat_vae_dec_mesh_swin8_B_64l8_fp16.json \
  --output_dir outputs/slat_vae_dec_mesh_swin8_B_64l8_fp16_1node \
  --data_dir /path/to/your/dataset1,/path/to/your/dataset2

TIP: 분산 트레이닝, 체크포인트 재시작, dry-run 등 다양한 옵션은 공식 문서 참고

⚖️ 라이선스

코드 및 모델 대부분: MIT License
일부 서브모듈: 별도 라이선스 (diffoctreerast, Flexicubes 등)
상세: LICENSE

📜 인용

이 프로젝트가 도움이 되었다면 논문을 인용해 주세요.

@article{xiang2024structured,
    title   = {Structured 3D Latents for Scalable and Versatile 3D Generation},
    author  = {Xiang, Jianfeng and Lv, Zelong and Xu, Sicheng and Deng, Yu and Wang, Ruicheng and Zhang, Bowen and Chen, Dong and Tong, Xin and Yang, Jiaolong},
    journal = {arXiv preprint arXiv:2412.01506},
    year    = {2024}
}

✅ 체크리스트

설치 환경 및 의존성 확인
사전학습 모델 다운로드 및 로딩
최소 예제 실행 성공
데이터셋 구조 및 활용법 이해
트레이닝/추론 옵션 숙지

TIP: 최신 업데이트, FAQ, 버그 리포트 등은 GitHub Issues에서 확인하세요.