
Blog

Recipes – A Pattern for Common Code Transformations

I did a thing. A very silly, very meta thing. I vibe-coded a CLI tool that summarizes YouTube videos, recorded myself making the tool, and then used the tool to summarize the video of me making the tool. And now, dear reader, you are reading a blog post that was largely generated from that summary.

But the real star of the show isn't the tool, or the video, it's the Recipe Pattern – a way to encapsulate repetitive coding work into a one-off, reusable doc.

Recipe Bot

Supercharging LLM Classifications with Logprobs

I was just reading the classification chapter of Jay Alammar and Maarten Grootendorst's excellent book Hands-On Large Language Models. I felt inspired to extend their work and show yet another cool trick you can do with LLM-based text classification. In their work they demonstrated how an LLM can be used as a "hard classifier" to determine the sentiment of movie reviews. By "hard" I mean that it gives a concrete answer, "positive" or "negative". However, we can do one better! Using "this one simple trick"™ we can make a "soft" classifier that returns the probabilities of each class rather than a concrete single choice. This makes it possible to tune the classifier – you can set a threshold in the probabilities so that classifications are optimally aligned with a training set.
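As a rough sketch of the idea (not the code from the book or from the full post), here is how log-probabilities for the two label tokens could be turned into class probabilities and then thresholded. The function names, the `threshold` parameter, and the example log-probabilities are all assumptions made for illustration; in practice the log-probabilities would come from whatever your LLM API returns for the first answer token.

```python
import math

def class_probabilities(label_logprobs):
    """Turn per-label log-probabilities into probabilities that sum to 1."""
    # Subtract the max before exponentiating for numerical stability.
    peak = max(label_logprobs.values())
    unnormalized = {label: math.exp(lp - peak) for label, lp in label_logprobs.items()}
    total = sum(unnormalized.values())
    return {label: value / total for label, value in unnormalized.items()}

def classify(label_logprobs, positive_label="positive",
             negative_label="negative", threshold=0.5):
    """Return (label, probabilities); 'positive' only if its probability meets the threshold."""
    probs = class_probabilities(label_logprobs)
    label = positive_label if probs[positive_label] >= threshold else negative_label
    return label, probs

# Hypothetical log-probabilities for the label tokens of one review (illustrative only).
example_logprobs = {"positive": math.log(0.6), "negative": math.log(0.2)}

default_label, probs = classify(example_logprobs)               # threshold of 0.5
strict_label, _ = classify(example_logprobs, threshold=0.8)     # stricter threshold
```

With these made-up numbers the same review flips from "positive" to "negative" once the threshold is raised, which is the knob a hard classifier does not give you.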

Soft Classification

Fire Yourself First: The E-Myth Approach to Iterative AI App Development

I've always been interested in entrepreneurship, so, early on in my career, I asked my financial advisor for book recommendations about startups. He handed me "The E-Myth" by Michael Gerber – a book about... building food service franchises? In the heat of the dot-com explosion, this wasn't exactly the startup guide I was hoping for, but its core message stuck with me and turned out to be surprisingly relevant to the problems I hear about regularly when talking to people about building reliable LLM applications.

Fire Yourself

Roaming RAG – RAG without the Vector Database

Let's face it, RAG can be a big pain to set up, and even more of a pain to get right.

There are a lot of moving parts. First you have to set up retrieval infrastructure. This typically means setting up a vector database, and building a pipeline to ingest the documents, chunk them, convert them to vectors, and index them. In the LLM application, you have to pull in the appropriate snippets from documentation and present them in the prompt so that they make sense to the model. And things can go wrong. If the assistant isn't providing sensible answers, you've got to figure out if it's the fault of the prompt, the chunking, or the embedding model.

If your RAG application is serving documentation, then there might be an easy alternative. Rather than setting up a traditional RAG pipeline, put the LLM assistant to work. Let it navigate through the documentation and find the answers. I call this "Roaming" RAG, and in this post I'll show you how it's done.
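To make "let it navigate" a little more concrete, here is one minimal sketch of what navigation could look like, assuming the documentation is a single Markdown file: show the model an outline of headings, and give it a tool that expands one section on request. The helper names (`split_sections`, `outline`, `expand_section`) and the sample document are hypothetical and are not taken from the full post, which may do this differently.

```python
import re

def split_sections(markdown_text):
    """Map each heading title to the body text beneath it (up to the next heading)."""
    sections = {}
    current = None
    for line in markdown_text.splitlines():
        match = re.match(r"^(#+)\s+(.*)$", line)
        if match:
            current = match.group(2).strip()
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return {title: "\n".join(body).strip() for title, body in sections.items()}

def outline(sections):
    """Headings only: a compact view that could be placed in the prompt."""
    return "\n".join(f"- {title}" for title in sections)

def expand_section(sections, title):
    """What a tool call might return when the model asks to open one section."""
    return sections.get(title, f"No section titled {title!r}")

# Hypothetical documentation used only for this illustration.
docs = """# Install
Run the installer.

# Configure
Set the API key in the config file.
"""
sections = split_sections(docs)
```

Here `outline(sections)` yields the two headings, and `expand_section(sections, "Configure")` returns only that section's text, so the model reads the table of contents first and pulls in details as it needs them.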

Roaming RAG

Cut the Chit-Chat with Artifacts

Most chat applications are leaving something important on the table when it comes to user experience. Users are not satisfied with just chit-chatting with an AI assistant. Users want to work on something with the help of the assistant. This is where the prevailing conversational experience falls short.

Asset-Aware Assistant

Bridging the Gap Between Keyword and Semantic Search with SPLADE

In information retrieval, we often find ourselves between two tools: keyword search and semantic search. Each has strengths and limitations. What if we could combine the best of both?

By the end of this post, you will:

  • Understand the challenges of keyword and semantic search
  • Learn about SPLADE, an approach that bridges these methods
  • See a practical implementation of SPLADE to enhance search

If you've struggled with inaccurate search results or wanted a more transparent search system, this post is for you. Let's explore how SPLADE can change your approach to information retrieval.
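As a small preview of why this is more transparent, SPLADE represents a query or document as a sparse set of weighted vocabulary terms, which can include related terms that never appear in the text, and relevance is scored with a dot product over shared terms. The sketch below shows only that scoring step; the term weights are invented for illustration and were not produced by a SPLADE model, and the function names are hypothetical.

```python
def sparse_score(query_weights, doc_weights):
    """Dot product over the terms the query and document have in common."""
    return sum(weight * doc_weights.get(term, 0.0)
               for term, weight in query_weights.items())

def matched_terms(query_weights, doc_weights):
    """Per-term contributions, which is what makes the score inspectable."""
    return {term: weight * doc_weights[term]
            for term, weight in query_weights.items() if term in doc_weights}

# Invented weights: the query "car" is assumed to have been expanded with "vehicle".
query = {"car": 1.0, "vehicle": 0.5}
doc_a = {"car": 2.0, "engine": 1.0}
doc_b = {"vehicle": 1.0, "truck": 2.0}

score_a = sparse_score(query, doc_a)
score_b = sparse_score(query, doc_b)
```

In this toy setup `doc_b` never mentions "car" but still gets a nonzero score through the expansion term "vehicle", and `matched_terms` shows exactly which terms were responsible.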