Things I Learned - 28 Apr 2024

This week, I learned: Tough prompt to test: Gr brx vshdn Fdhvdu flskhu? is a quick way to assess LLM capability. Ref Cheap cloud GPU services thread on Twitter lists: Runpod (17) Vast.ai (17) Modal Labs (8) fly.io (4) LightningAI (4) Colab (4) AkashNet (4) Lambda Labs (4) ShadeFormAI (3) Mac Mini (3) Tensor Dock (2) Hetzner (2) BrevDev (2) JSR lets you publish Deno packages that can be imported by npm via. It also auto-evaluates documentation and scores it! via Snowflake Arctic Cookbook explains how mixture of experts models work A long list of LLM courses online Embeddings can be averaged. So, to embed large documents, average the embeddings of their chunks! OpenAI suggests this.

A quick way to assess LLM capabilities

Simon Willison initiated this very interesting Twitter thread that asks, “What prompt can instantly tell us how good an LLM model is?” The Sally-Anne Test is a popular test that asks: Sally hides a marble in her basket and leaves the room. While she is away, Anne moves the marble from Sally’s basket to her own box. When Sally returns, where will she look for her marble?" ...

Things I Learned - 21 Apr 2024

This week, I learned: Effort engine introduces “effort” as a parametrizable way to speed up LLMs with a quality trade-off. Works on Mistral for now. Many arts demand devotion. Devoting unrestricted time is part of that. 16 hours of practice a day is not uncommon. Sessions don’t start and end on time. Instruments take a lot longer to learn than vocal music. The instrument needs to become an extension of you. Tests and homework have a purpose. It helps people figure out whether they’ve learnt. So: Write tests that make people think! Like DuckDB workshop Share a list of exercises that people can explore People need to explicitly be INVITED, and potentially IN PERSON, before they will engage with something new. For example, no one posted to [email protected] until the VIA Talks session where we got them to post. For example, having one day at IITM mandatory (especially early in the course) gets online students familiar with TAs. They understand that TAs actually help, at high quality. That they can use Discord. What makes Delhi students more assertive? How can we inculcate that in others? jsr-io/migrations is a great example of database migrations. Shape Detection API in the browser detects QR codes, face bounding boxes, Browsers also natively support blurring and face tracking. via Lessons after half a billion GPT tokens for GPT-4: Vague instructions are better than over-specifying Avoid libraries like Langchain. APIs are stabler 1 token = 3 characters is good enough GPT4 doesn’t hallucinate much, except it does a poor job of saying “I don’t know” or “There’s no such data” (the null hypothesis) Keep the output down to 10 items or so if you’re listing. For longer lists, have it explicitly enumerate Don’t worry about niches. Just wait for GPT5 #WRITE GPT clearly prefers 42 as a random number. #WRITE fal.ai “animates” pictures, creating videos. It made one from my talk. I morphed into various somewhat similar people rapidly in a 2-second span. Very promising, and far from good. llmsherpa extracts PDFs using LLMs. It has errors but it preserves hierarchy, extracts tables well, and retains image coordinates. Via +91 90031 35354 ~Vetrivel PS www.web.sp.am is a content farm that’s getting hit by OpenAI. Highlights how easy it is to create content farms, and therefore “easy” it can be to introduce bias into LLMs. OpenAI supports batching requests. Didn’t know that. Marvin provides Python decorators to create AI functions. Pretty intuitive! Outlines generates structured test with LLMs. It uses the ⭐ logit_bias trick to limit choices in output. See get_choice() Lemur from Assembly.ai does real time call transcription and summary W3C is exploring ways to allow web pages to train LLMs, to flag content as AI generated, etc. Data Provenance Explorer lists open datasets used to train LLMs. Summarize.tech summarizes YouTube videos. #WRITE Stable Audio 2.0 generates 3 min of music from a prompt. I tried Bollywood Tamil film background music. Dark, soulful and Horror movie background. Drums starts darkly. Build up to a crescendo of intense chaos.. Great that it managed, but not great music. Somewhat stereotyped. I need to learn how to prompt better. BTW, Udio is another such. Harpa.ai is a well designed Chrome extension / plugin that can chat with or automate any page. Due to in-context learning, giving 100s of examples in the prompt can teach LLMs to jailbreak. Ref With RAG on search becoming big, search APIs are growing. serper.dev, you.com, searxng being examples.

When picking a number between 1-100, do #LLMs pick randomly? Or pick like a human? Leniolabs_ found #ChatGPT prefers 42. Gramener re-ran the experiment. Things have changed a bit. Now, 47 is the new favorite. But Claude 3 Haiku latched on to 42 as its favorite. Gemini’s favorite is 72. See https://sanand0.github.io/llmrandom/ They all avoid multiples of 10 (10, 20, …), repeated digits (11, 22, …), single digits (1, 2, …) and prefer 7-endings (27, 37, …). These are clearly human #biases – avoiding regular / round numbers and seeking 7 as “random”. ...

Things I Learned - 14 Apr 2024

This week, I learned: Prashant Pandey: we need to prepare before every meeting. Something to teach VS Code Select any code and command Explain this to understand the code %something in command bar searches ACROSS files for a term. Exactly like Ctrl+Shift+F Copilot has an Inline Chat: Start in Terminal (that needed me to unbind Ctrl+I in bash to work) Ctrl+2 opens a second window on the side. Ctrl+1 goes back to the first window Terminal: Open Detected Link lets you scroll through detected (file) links in terminal Terminal sticky scroll is transparent. (But Terminal stick scroll isn’t working for me.) Copilot uses last 10 commit messages, Jupyter notebook kernel state (variables) as additional context 1.88: supports locked scrolling to sync scrolling of side-by-side windows fsspec is used by csvbase, Pandas, etc. to implement file system protocols like s3fs, gcfs, etc. SQLime is a SQLite client / playground on the browser! Do nothing. Then do less Humans have a bias against inaction. Hence a strategic advantage. What can you cancel today? Humans have a bias against subtraction or removal. That too is a strategic advantage. What can you remove today? Humans have a bias against constraints. That’s a strategic advantage. What constraint can you embrace? No Yay! When declining something, add it your calendar so that when the time comes you can say yeah I got this time back

Things I Learned - 07 Apr 2024

This week, I learned: CSS nesting is now available in browsers Cold starts in AWS Lambda: serverless functions stay alive for 5-7 min. All languages are fast but Docker is slow. More npm packages slow start dramatically. WiFi only works when it’s raining because a tree was obstructing the signal but was weighed down when raining! Good reasons why finding a technical co-founder won’t work. You want a unicorn to passionately trust YOUR idea after 2 meetings. Why should THEY risk money for YOUR idea? You’re the money guy. RAISE the money for YOUR idea! How passionate are you about software? And you want to build one now? This is a subtle vulnerability. ChatGPT hallucinated pip install huggingface-cli. Sosomeone created the package and got 30,000 downloads! Video-Llava is a video LLM MusicCNN-embeddings provides embeddings for music genre classification How I write podcast. Paul Graham essays Write simply. It helps communicate. (Don’t concise if communication worsens.). It forces you to make the idea better Do lame stuff. Else you won’t start. Low standards drive creativity The more to delete, the better your writing. Read your piece. Highlight what feels poor. Fix it. Ask friends to highlight what’s BORING? UNCONVINCING? Delete the first, brainstorm the second. Or ask, what’s the 10% to cut and 10% to keep. Write about stuff you don’t know above the. Writing GENERATES ideas Write about what’s BUS. GENERAL and SURPRISING. (Laughter is a sign of comprehension.) Do HARD things to cultivate taste. Spend more time with people who generate ideas in you. Ravi chithappa. Ram. Ankor. Ganes. Books! Build taste. I have a taste for picking technologies. Data visualization. Retrospect. Write down what you like and dislike. Copy what you REALLY like. Guilty pleasures. A benefit of lower standards is that it let’s you pick the path less travelled. ITERATE. Discuss ideas. Iterate. Acknowledge. ITERATE.

This is the coolest data visualization I’ve seen in a long time. It makes you think about human behaviour. Please try and GUESS why the AirBnB occupancy rates shoot up in the red areas on Apr 7 before you read the comments! LinkedIn