```bash
pip install -r requirements.txt
ollama pull gemma3:4b
ollama pull embeddinggemma:300m
ollama pull llava:7b
python api_server_integrated.py
```

That's it!
Alternatively, use the startup scripts. On Linux/Mac:

```bash
chmod +x start_server.sh
./start_server.sh
```

On Windows, run `start_server.bat`.

Once running, the server exposes:

- API Server: http://localhost:8000
- Interactive Docs: http://localhost:8000/docs
- Alternative Docs: http://localhost:8000/redoc
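Before wiring the server into other code, it can help to verify it is actually reachable. A minimal sketch using only the standard library (the `/health` endpoint is the one documented below; `server_is_up` is a hypothetical helper name):

```python
import urllib.request
import urllib.error

def server_is_up(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if the API server answers on its /health endpoint."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False

if __name__ == "__main__":
    print(server_is_up("http://localhost:8000"))
```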
```python
import requests

# Summarization
response = requests.post(
    "http://localhost:8000/summarize",
    json={
        "question": "What is microgravity?",
        "top_k_texts": ["Microgravity is..."]
    },
)
print(response.json()["answer"])

# Text Embedding
response = requests.post(
    "http://localhost:8000/embed/text",
    data={"text": "Hello world"},
)
print(f"Dimension: {response.json()['dimension']}")
```

```bash
# Health Check
curl http://localhost:8000/health

# Summarization
curl -X POST "http://localhost:8000/summarize" \
  -H "Content-Type: application/json" \
  -d '{
    "question": "What is microgravity?",
    "top_k_texts": ["Microgravity is a condition..."]
  }'

# Text Embedding
curl -X POST "http://localhost:8000/embed/text" \
  -F "text=Hello world"

# Image Embedding
curl -X POST "http://localhost:8000/embed/image" \
  -F "file=@image.jpg"

# Audio Embedding
curl -X POST "http://localhost:8000/embed/audio" \
  -F "file=@audio.mp3"
```

- POST `/summarize` - Summarize retrieved document chunks
- POST `/embed/text` - Generate text embeddings
- POST `/embed/image` - Generate image embeddings
- POST `/embed/audio` - Generate audio embeddings
- GET `/health` - Check server health
- GET `/` - API information
- GET `/docs` - Interactive API documentation
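The endpoints above can be wrapped in a small client class. A sketch using only the standard library — the request shape for `/summarize` is taken from the usage examples; the class and method names are illustrative, and the embed endpoints take multipart form data rather than JSON, so they would need a separate helper:

```python
import json
import urllib.request

class IntegratedAPIClient:
    """Tiny client sketch for the JSON endpoints above."""

    def __init__(self, base_url: str = "http://localhost:8000"):
        self.base_url = base_url.rstrip("/")

    def _post_json(self, path: str, payload: dict) -> dict:
        # Build a POST request with a JSON body and parse the JSON reply.
        req = urllib.request.Request(
            f"{self.base_url}{path}",
            data=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)

    def summarize(self, question: str, top_k_texts: list) -> dict:
        return self._post_json(
            "/summarize",
            {"question": question, "top_k_texts": top_k_texts},
        )
```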
```bash
# Check if Ollama is installed
ollama --version

# If not installed, download from:
# https://ollama.ai/download

# List installed models
ollama list

# Pull missing models
ollama pull gemma3:4b
ollama pull embeddinggemma:300m
ollama pull llava:7b
```

Edit `api_server_integrated.py` and change:

```python
uvicorn.run(app, host="0.0.0.0", port=8001)  # Change to 8001
```

Install FFmpeg:

- Windows: Download from https://ffmpeg.org/download.html
- Mac: `brew install ffmpeg`
- Linux: `sudo apt-get install ffmpeg`
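Since FFmpeg is only needed for the audio pipeline, a quick programmatic check can confirm it is on the PATH before sending audio requests (a minimal sketch; `ffmpeg_available` is an illustrative helper name):

```python
import shutil

def ffmpeg_available() -> bool:
    """Return True if an ffmpeg binary is found on the PATH."""
    return shutil.which("ffmpeg") is not None

if __name__ == "__main__":
    print("FFmpeg found" if ffmpeg_available() else "FFmpeg missing")
```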
This unified API combines:
✅ Content Summarization
- Answer questions using retrieved documents
- Powered by Gemma 3 4B model
✅ Multi-Modal Embeddings
- Text embeddings
- Image embeddings (via LLaVA vision model)
- Audio embeddings (via Whisper transcription)
✅ Automatic Ollama Management
- No need to run `ollama serve` manually
- Automatic startup and cleanup
✅ Production Ready
- Comprehensive logging
- Error handling
- Health checks
- CORS enabled
- API documentation
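Because text, image, and audio inputs all map into embedding vectors, results from different modalities can be compared directly. A sketch of the standard cosine-similarity comparison — this assumes the embedding endpoints return vectors as lists of floats, which is an assumption about the response format, not something the examples above confirm:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Identical vectors score 1.0; orthogonal vectors score 0.0.
print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))
```

Note that cross-modal comparison is only meaningful if both vectors come from the same embedding space (here, all modalities are routed through text before embedding, per the feature list above).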
- ✅ Server running? → Test with `test_integrated_api.py`
- ✅ Tests passing? → Check docs at `/docs`
- ✅ Ready to integrate? → See `README_INTEGRATED_API.md` for examples
- ✅ Deploying? → See deployment section in README
- Performance: GPU-enabled Ollama models run faster
- Monitoring: Check `api_server.log` for detailed logs
- Development: Use `/docs` for interactive testing
- Integration: All endpoints return JSON
Made for NASA Space Apps