- Clone the repository

  ```shell
  git clone [email protected]:nekomeowww/poc-audio-inference.git
  cd poc-audio-inference
  ```

- Install dependencies

  ```shell
  corepack enable # Make sure you have corepack enabled
  pnpm install
  ```

- Start the development server

  ```shell
  pnpm dev
  ```

- Open http://localhost:5173 in your browser.
- The backend should listen on http://localhost:8080 and should already be configured with a proxy for the frontend.
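The frontend-to-backend proxy mentioned above is commonly wired up in the frontend's Vite config. The following is only a sketch of what such wiring may look like — the `/api` path prefix and the config location are assumptions, not taken from this repository:

```typescript
// apps/streaming-web/vite.config.ts (sketch; the path prefix is an assumption)
import { defineConfig } from 'vite'

export default defineConfig({
  server: {
    proxy: {
      // Forward API requests from the dev server (5173) to the backend (8080)
      '/api': {
        target: 'http://localhost:8080',
        changeOrigin: true,
      },
    },
  },
})
```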
```text
.
├── apps                  # Frontend applications
│   └── streaming-web
├── infra                 # Infrastructure related
│   └── go
│       └── operator      # Kubernetes streaming-backend-operator
├── packages              # Shared packages
│   └── backend-shared
├── services              # Backend services
│   ├── inference-server
│   └── streaming-backend
├── cspell.config.yaml    # Spell check config
├── eslint.config.mjs     # ESLint config
├── package.json          # Workspace global dependencies
├── pnpm-workspace.yaml   # Monorepo config for pnpm
├── pnpm-lock.yaml
├── tsconfig.json         # TypeScript config for monorepo
├── vitest.workspace.ts   # Unit test related
└── README.md
```

- Start the Docker Compose services

  ```shell
  docker compose up -d
  ```

- Copy `.env.example` to `.env.local` and configure the environment variables.
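For illustration, a minimal `.env.local` might look like the sketch below. `REDIS_URL` is the only variable shown elsewhere in this document and its value here is a placeholder; consult `.env.example` for the actual list:

```text
# .env.local (sketch; values are placeholders)
REDIS_URL=redis://localhost:6379
```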
You can always start any of the needed apps, packages or services by running one of:

- `pnpm -F @audio-inference/web dev` for the frontend
- `pnpm -F @audio-inference/backend dev` for the backend
- `cd services/inference-server && pixi run start` for the inference server
```shell
docker buildx build --platform linux/arm64 --load . -f ./apps/streaming-web/Dockerfile -t test.nekomeowww.local/streaming-audio/web:0.0.1
docker buildx build --platform linux/arm64 --load . -f ./services/streaming-backend/Dockerfile -t test.nekomeowww.local/streaming-audio/backend:0.0.1
docker buildx build --platform linux/arm64 --load . -f ./services/inference-server/Dockerfile -t test.nekomeowww.local/streaming-audio/inference-server:0.0.1
```

> [!NOTE]
> For x86_64 (amd64), use `--platform linux/amd64` instead:
>
> ```shell
> docker buildx build --platform linux/amd64 --load . -f ./apps/streaming-web/Dockerfile -t test.nekomeowww.local/streaming-audio/web:0.0.1
> docker buildx build --platform linux/amd64 --load . -f ./services/streaming-backend/Dockerfile -t test.nekomeowww.local/streaming-audio/backend:0.0.1
> docker buildx build --platform linux/amd64 --load . -f ./services/inference-server/Dockerfile -t test.nekomeowww.local/streaming-audio/inference-server:0.0.1
> ```
```shell
docker run -dit -p 8080:80 test.nekomeowww.local/streaming-audio/web:0.0.1
docker run -dit -p 8081:8081 -e REDIS_URL='URL of Redis' test.nekomeowww.local/streaming-audio/backend:0.0.1
docker run -dit -p 8082:8082 test.nekomeowww.local/streaming-audio/inference-server:0.0.1
```

We have pre-defined kind configurations in the `infra/go/operator/hack` directory to help you get started.
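For reference, a kind configuration for this setup might look roughly like the sketch below. The three workers match the nodes labelled later, and the port mappings match the NodePorts 30101–30103 patched further down; the actual file at `infra/go/operator/hack/kind-config.yaml` is authoritative and the details here are assumptions:

```yaml
# Sketch of a kind cluster config (not the repository's actual file)
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane
    extraPortMappings:
      # Expose the NodePort services (web/backend/inference-server) on the host
      - containerPort: 30101
        hostPort: 30101
      - containerPort: 30102
        hostPort: 30102
      - containerPort: 30103
        hostPort: 30103
  - role: worker
  - role: worker
  - role: worker
```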
> [!NOTE]
> Install kind if you haven't already. You can install it using the following command:
>
> ```shell
> go install sigs.k8s.io/kind@latest
> ```

To create a kind cluster with the configurations defined in `infra/go/operator/hack/kind-config.yaml`, run the following command:
```shell
kind create cluster --config infra/go/operator/hack/kind-config.yaml --name kind-streaming-backend
```
> [!NOTE]
> You can check the nodes with the following command:
>
> ```shell
> kubectl get nodes
> ```

Since `streaming-backend-inference-server.yaml` specifies and simulates the resource allocation of `nvidia.com/gpu`, we need to prepare the cluster with the GPU resources.
> [!NOTE]
> This is not a real GPU, and it is not even essential for the workload; you can remove the resource constraint by modifying the deployment file:
>
> ```diff
>   resources:
>     limits:
> -     nvidia.com/gpu: "1"
>     requests:
> -     nvidia.com/gpu: "1"
> ```

Label the worker nodes as simulated GPU nodes:

```shell
kubectl label node kind-worker run.ai/simulated-gpu-node-pool=default
kubectl label node kind-worker2 run.ai/simulated-gpu-node-pool=default
kubectl label node kind-worker3 run.ai/simulated-gpu-node-pool=default
```

Install the fake GPU operator:

```shell
helm repo add fake-gpu-operator https://fake-gpu-operator.storage.googleapis.com
helm repo update
helm upgrade -i gpu-operator fake-gpu-operator/fake-gpu-operator --namespace gpu-operator --create-namespace
```

Load the images into the kind cluster:

```shell
kind load docker-image test.nekomeowww.local/streaming-audio/web:0.0.1 --name kind-streaming-backend
kind load docker-image test.nekomeowww.local/streaming-audio/backend:0.0.1 --name kind-streaming-backend
kind load docker-image test.nekomeowww.local/streaming-audio/inference-server:0.0.1 --name kind-streaming-backend
```

Apply the deployments:

```shell
kubectl apply -f deploy/kubernetes-yaml/envs/local/streaming-backend-web.yaml --server-side
kubectl apply -f deploy/kubernetes-yaml/envs/local/streaming-backend-backend.yaml --server-side
kubectl apply -f deploy/kubernetes-yaml/envs/local/streaming-backend-inference-server.yaml --server-side
```

Expose the deployments through NodePort services:

```shell
kubectl expose deployment/web --type=NodePort --name web-nodeport
# Modify the nodePort to 30101, matching the extra port mappings in infra/go/operator/hack/kind-config.yaml
kubectl patch service web-nodeport --type='json' --patch='[{"op": "replace", "path": "/spec/ports/0/nodePort", "value":30101}]'

kubectl expose deployment/backend --type=NodePort --name backend-nodeport
# Modify the nodePort to 30102, matching the extra port mappings in infra/go/operator/hack/kind-config.yaml
kubectl patch service backend-nodeport --type='json' --patch='[{"op": "replace", "path": "/spec/ports/0/nodePort", "value":30102}]'

kubectl expose deployment/inference-server --type=NodePort --name inference-server-nodeport
# Modify the nodePort to 30103, matching the extra port mappings in infra/go/operator/hack/kind-config.yaml
kubectl patch service inference-server-nodeport --type='json' --patch='[{"op": "replace", "path": "/spec/ports/0/nodePort", "value":30103}]'
```

Glossary:

- stub: Generating a set of stub `.mjs` and `.d.ts` files for a package, so that consuming packages can resolve it without a complicated watch setup.
- workspace: A monorepo workspace that contains multiple packages, services or apps.
- filter: Please refer to the pnpm filter documentation for more information.
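The "stub" idea from the glossary can be illustrated with a tiny hypothetical generator. This is not the repository's actual tooling — the file layout, names, and re-export target are assumptions:

```typescript
// Hypothetical sketch of the "stub" idea: write a minimal .mjs/.d.ts pair
// that re-exports a package's source entry, so dependent packages can resolve
// the import without a complicated watch/build setup.
import { mkdirSync, writeFileSync, readFileSync } from 'node:fs'
import { tmpdir } from 'node:os'
import { join } from 'node:path'

function writeStubs(pkgDir: string, entry = '../src/index'): void {
  const dist = join(pkgDir, 'dist')
  mkdirSync(dist, { recursive: true })
  // Runtime stub: defers to the real source module.
  writeFileSync(join(dist, 'index.mjs'), `export * from '${entry}'\n`)
  // Type stub: the same re-export, so TypeScript resolves the types too.
  writeFileSync(join(dist, 'index.d.ts'), `export * from '${entry}'\n`)
}

const pkgDir = join(tmpdir(), 'example-pkg')
writeStubs(pkgDir)
console.log(readFileSync(join(pkgDir, 'dist', 'index.mjs'), 'utf8'))
```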