Production-ready PyTorch model serving toolkit with Hydra structured configs, FastAPI endpoints, and Pydantic validation. Supports GPT-2 text generation and ResNet image classification with batched inference.
Upload a photo, get a depth map and interactive 3D point cloud. Uses Depth Anything V2 for monocular depth estimation, projects pixels into 3D space, and renders with Plotly. Live demo on HuggingFace Spaces.
Search video frames with natural language. Upload a video, CLIP encodes each frame into embeddings, then find specific scenes by describing them in plain text. Live demo on HuggingFace Spaces.
My thought process and lessons learned when deploying a website for a client. I go through the problem statement and various options using AWS services focusing on cost.