The Sassy AI Devil’s Advocate

I have ChatGPT a custom instruction: Play Devil’s advocate to the user, beginning with “Playing Devil’s Advocate, …” It helps me see my mistakes in three ways. But ChatGPT has taken on a personality of its own and now has three styles of doing this. How about… – It suggests a useful alternative. Are you sure…? – It thinks you’re wrong and warns you of risks. Yeah, right… – It knows you’re wrong and rubs it in. (Jeeves, the butler, would be proud.) Here are some examples. ...

Features actually used in an LLM playground

At Straive, only a few people have direct access to ChatGPT and similar large language models. We use a portal, LLM Foundry to access LLMs. That makes it easier to prevent and track data leaks. The main page is a playground to explore models and prompts. Last month, I tracked which features were used the most. A. Attaching files was the top task. (The numbers show how many times each feature was clicked.) People usually use local files as context when working with LLMs. ...

Things I Learned - 12 Jan 2025

This week, I learned: Measuring developer productivity with the DX Core 4 is a framework for measuring developer productivity. It encapsulates other frameworks like DORA, SPACE, and DevEx. Can LLMs write better code if you keep asking them to “write better code? A delightful exploration of how Claude 3.5 Sonnet keeps optimizing and adding features to improve code. My takeaway: repeatedly applying a prompt gives us interesting new directions to explore. Wednesday comes from Wōdnesdæg - named after Odin (or Woden). CLIProxyAPI seems a good way to allow any CLI coding agent (Codex, Claude Code, etc.) to work with any provider (e.g. Gemini, OpenRouter, etc.) The documentation needs a few more examples, but it’s usable. mise x github:router-for-me/CLIProxyAPI -- cli-proxy-api starts a local server that proxies requests. Create a config.yaml, update the keys, and configure your coding agent, e.g. Codex to use it. It’s also a good way to see what prompts are being sent by the various harnesses. smolagents is a new agents library from HuggingFace. It seems simple enough to use. whisper-flow does real-time speech transcription! Switchboard-1 is a labelled audio corpus with ~260 hours of speech. It has ~2,400 calls among 500+ speakers in the US. Cloudflare tunnel is like ngrok but more permanent. It’s a bit more complex, too. But given CloudFlare’s liberal free tier, it’s a good, viable option for long-term local hosting. John Wheeler: “We live on an island surrounded by a sea of ignorance. As our island of knowledge grows, so does the shore of our ignorance.” A great way to understand how ignorance actually grows as you learn more. justhtml is a fast enough pure Python fully HTML5 compliant library. For a faster, mostly compliant solution, html5-parser with lxml works. There is little reason to use Redis. There are several clones you can use. Databases in 2024: A Year in Review Microsoft’s Garnet KeyDB (only Linux) ValKey (only source) DragonFly (only Linux) ReDict (only Linux) Every few years, something comes along trying to replace relational databases and SQL, and gets absorbed. YouTube Key value stores. People soon realize they need more features, e.g. indices. MapReduce systems. Most MapReduce vendors put SQL on top of SQL. Then the Hadoop market crashed. (But HDFS, S3, distributed storage systems are a good idea) Document Databases. JSON. SQL absorbed that. SQLite 3.45+ supports even JSONB. DuckDB, of course, has JSON. Column Databases. Again, these introduced SQL. Graph Databases. SQL:2023 introduced graph queries via SQL/PGQ (Property Graph Queries). DuckPGQ beats Neo4J Array Databases. SQL:2023 adds SQL/MDA which allows for matrix operations. But specialized databases might make sense in this category. Vector Databases. Every DB is adding support for this. TheAgentCompany is a benchmark of real-world tasks like: Arranging a meeting room Analyze a spreadsheet Add a Gitlab wiki page Salvatore Sanfilippo (antirez - Redis) finds DeepSeek v3 comparable with Claude 3.5 Sonnet. YouTube He also passed a paper and his code to compare them. A useful prompt. YouTube