[Image: RAG implementation diagram]

Implementing RAG from scratch with Python, Qdrant, and Docling

Everyone talks about RAG, but few have actually built one. Let’s break the spell and implement a semantic search system step by step using Python and Qdrant.
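As a taste of what the full article builds, here is a stdlib-only sketch of the retrieval core of semantic search: embed the documents and the query, then rank by cosine similarity. The hashed bag-of-words embedder is a toy stand-in for a real embedding model, and in the full pipeline Qdrant would store and search the vectors; all names and the sample corpus below are illustrative.

```python
import hashlib
import math
from collections import Counter

def embed(text: str, dim: int = 64) -> list[float]:
    """Toy hashed bag-of-words embedding; a stand-in for a real embedding model."""
    vec = [0.0] * dim
    for token, count in Counter(text.lower().split()).items():
        bucket = int(hashlib.md5(token.encode()).hexdigest(), 16) % dim
        vec[bucket] += count
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]  # unit-normalized, so dot product = cosine

def search(query: str, corpus: list[str], top_k: int = 2) -> list[str]:
    # Rank documents by cosine similarity to the query embedding.
    q = embed(query)
    ranked = sorted(
        corpus,
        key=lambda doc: sum(a * b for a, b in zip(q, embed(doc))),
        reverse=True,
    )
    return ranked[:top_k]

docs = [
    "Qdrant is a vector database built for similarity search",
    "Docling converts PDFs into structured text for indexing",
    "Bananas are rich in potassium",
]
print(search("vector database for similarity search", docs, top_k=1))
```

A production version swaps `embed` for a real model and the linear scan for a Qdrant collection, but the ranking logic is the same.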

November 29, 2025 · 5 min · TechLife

Optimize LLM Costs with ScyllaDB Semantic Caching

Key Highlights

- Semantic caching reduces LLM costs and latency by storing frequent queries and their responses.
- ScyllaDB’s Vector Search enables efficient semantic caching for large-scale LLM applications.
- Combining LLM APIs with ScyllaDB’s low-latency database optimizes performance and cost.

The increasing adoption of Large Language Models (LLMs) across applications has raised significant concerns about cost and latency. As LLMs grow in complexity and size, the need for efficient, cost-effective solutions becomes more pressing. This approach reflects broader industry trends toward optimizing AI workloads and reducing operational overhead. ScyllaDB’s semantic caching offers a promising solution to these challenges, allowing developers to reduce the number of LLM calls and improve response times. ...

November 27, 2025 · 2 min · TechLife