FAISS, Up Close: Fast Similarity Search For The Vector Age
Every modern AI product has one quiet workhorse: finding the nearest neighbors of a vector fast. FAISS is the library many of us reach for when the dataset gets large and latency matters. Built at Met...
Tags: ANN, cuVS, faiss, GPU, Meta AI, vector search
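To make the core operation concrete, here is a minimal brute-force k-nearest-neighbor search in plain NumPy. It computes the same exact L2 result that FAISS's flat index returns; the function name `knn_l2` and all parameters are illustrative, not FAISS APIs — FAISS's value is doing this with SIMD/GPU kernels and approximate indexes at scale.

```python
import numpy as np

def knn_l2(xq, xb, k):
    """Exact k-NN by squared L2 distance (the operation FAISS accelerates).

    xq: (nq, d) query vectors; xb: (nb, d) database vectors.
    Returns (distances, indices), each of shape (nq, k).
    """
    # Pairwise squared distances via the expansion |q - b|^2 = |q|^2 - 2 q.b + |b|^2
    d2 = (
        (xq ** 2).sum(axis=1, keepdims=True)
        - 2.0 * xq @ xb.T
        + (xb ** 2).sum(axis=1)
    )
    idx = np.argsort(d2, axis=1)[:, :k]            # k smallest distances per query
    dist = np.take_along_axis(d2, idx, axis=1)
    return dist, idx

rng = np.random.default_rng(0)
xb = rng.random((1000, 64)).astype(np.float32)     # database vectors
xq = rng.random((5, 64)).astype(np.float32)        # query vectors
dist, idx = knn_l2(xq, xb, k=4)
print(idx.shape)  # (5, 4): 4 neighbor ids for each of the 5 queries
```

A quick sanity check on exactness: querying database vectors against the database itself should return each vector as its own nearest neighbor with distance ~0.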
BitNet: 1-bit LLMs Land With Practical Inference on CPUs and GPUs
BitNet from Microsoft Research is the official C++ inference stack for native 1-bit large language models, centered on BitNet b1.58. The repo ships fast, lossless ternary kernels for CPUs, a CUDA W2A8...
Tags: 1-bit LLM, BitNet, CPU, GGUF, GPU, inference, llama.cpp, quantization, T-MAC
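The "1.58-bit" in BitNet b1.58 refers to ternary weights in {-1, 0, 1}. A common way to get there, and the scheme described for b1.58, is absmean quantization: scale each weight matrix by its mean absolute value, then round and clip to the ternary set. The NumPy sketch below illustrates that idea under those assumptions; `ternary_quantize` is a hypothetical helper, not code from the BitNet repo, whose kernels implement this far more efficiently in C++/CUDA.

```python
import numpy as np

def ternary_quantize(w, eps=1e-5):
    """Absmean ternary quantization (sketch of the BitNet b1.58 idea).

    Scale by the mean absolute weight, then round-and-clip to {-1, 0, 1}.
    Returns the int8 ternary matrix and the single float scale used to
    dequantize: w ~ wq * scale.
    """
    scale = np.abs(w).mean() + eps               # one scale per weight matrix
    wq = np.clip(np.rint(w / scale), -1, 1)      # ternary values in {-1, 0, 1}
    return wq.astype(np.int8), float(scale)

rng = np.random.default_rng(1)
w = rng.normal(size=(4, 8)).astype(np.float32)   # toy weight matrix
wq, scale = ternary_quantize(w)
# A linear layer then needs only additions/subtractions plus one
# multiply by `scale`: y ~ (x @ wq) * scale
print(sorted(np.unique(wq).tolist()))
```

Because the matmul against `wq` involves only -1/0/+1, multiplications collapse to adds and subtracts, which is what makes cheap CPU inference on ternary models possible.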
Google Cloud and Docker Supercharge AI App Deployments with Compose-to-Cloud Run Integration
Deploying AI applications from local development to production environments is now easier than ever, thanks to a groundbreaking collaboration between Google Cloud and Docker. The seamless integration ...
Tags: Agentic applications, AI deployment, Cloud Run, Compose Specification, Docker Compose, GPU, Serverless
Mistral Compute: Democratizing Advanced AI Infrastructure for Everyone
What if anyone, from startups to nations, could access the powerful infrastructure needed to build next-generation AI? Mistral AI is making this a reality with Mistral Compute, a platform designed to...
Tags: AI infrastructure, cloud computing, data sovereignty, enterprise AI, Europe, GPU, open science, sustainability