This week, I learned:
- Fine-tuning makes economic sense only if the input tokens SAVED are at least twice the output tokens on each call.
- Docker container memory usage on WSL2, from `docker stats`:
- frolvlad/alpine-glibc:alpine-3.17: 540KB
- ubuntu: 1MB (python3: +5MB)
- nikolaik/python-nodejs:python3.10-nodejs18-bullseye: 1.4MB (python3: +5MB)
- python:3-alpine: 612KB (python3: +7.5MB)
- python:3: 500KB (python3: +11.2MB)
- continuumio/miniconda3: 7.6MB (+6.5MB)
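
A minimal sketch of how figures like the ones above could be turned into comparable numbers. It assumes the output of `docker stats --no-stream --format "{{.Name}}\t{{.MemUsage}}"`, which prints cells such as `540KiB / 7.6GiB`. The helper names and the sample text are hypothetical, not the commands actually used for these measurements.

```python
import re

# Multipliers for the binary units that `docker stats` prints (e.g. "540KiB").
UNITS = {"B": 1, "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3}


def parse_mem(mem_usage: str) -> int:
    """Convert the 'used' half of a cell like '540KiB / 7.6GiB' to bytes."""
    used = mem_usage.split("/")[0].strip()
    match = re.fullmatch(r"([\d.]+)\s*(B|KiB|MiB|GiB)", used)
    if match is None:
        raise ValueError(f"Unrecognized memory value: {used!r}")
    number, unit = match.groups()
    return round(float(number) * UNITS[unit])


def parse_stats(text: str) -> dict[str, int]:
    """Parse tab-separated 'name<TAB>mem usage' lines into {name: bytes}."""
    result = {}
    for line in text.strip().splitlines():
        name, mem = line.split("\t")
        result[name] = parse_mem(mem)
    return result


# Hypothetical sample output, not a real measurement.
sample = "alpine-test\t540KiB / 7.6GiB\nubuntu-test\t1MiB / 7.6GiB"
stats = parse_stats(sample)
```

Running the same parse before and after starting `python3` inside a container would give the kind of "+5MB" deltas noted above.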
- Discussion with Vinu Yamunan
- DataBuck by FirstEigen. Autolysis plus monitoring
- Quality council has the data stewards (the maintainers of each dataset) coming together with the users on a weekly basis to understand what quality problems the users are facing. Data owners join in at a lower frequency to get an understanding
- #TODO Automate rules for data quality in our projects and intranet
- Convert a config rule into business language. Explain SQL. These are good use cases for LLMs
- Graph DBs are powerful for flexible data structures, but query generation needs AI or expertise. Check Cypher, the Neo4j query language
- Explore storing SAME data in relational DBs AND in graph DBs / document DBs for different use cases
- Dallas rocketry challenge. Build a rocket that can take an egg to 800 feet exactly and land without breaking it
- Discussion with Karthik A
- #TODO Ask IIT students to do internship tasks. Use Advent of Code as a qualifying criterion
- Tata Motors unionized DB admins for longevity. No one can take their jobs. It hires people who LIKE their jobs
- Rust gives me typing. It’s very efficient. Pola.rs is interesting but Pandas is good enough.
- Explore alerts from CCTV feeds. Karthik sends email alerts with pictures for:
- “Is the machine on or off”? for productivity
- “Are people not wearing helmets?” for safety at Cummins
- #TODO Integrate with WhatsApp. Use LLMs with function calling for responses
- Use expiring links (to pictures or content). It increases engagement
- Check Deno licensing. Is there a commercial clause? #ANS No - it’s MIT license
- Centre of excellence for zero emission tech at IIT. Karthik is part of it
- Explore auth0. 7000 users are free
- `tomllib` is part of the Python 3.11 standard library!
- If Copilot writes code we don’t understand, we are screwed. Hence expertise matters
- Discussion with Vikas Kedia
- #TODO Plan an AMA
- The mind becomes lazy with financial success. Vikas is treating his podcast as a startup
- Hire a professional videographer for your content
- RoI in financial markets is the highest. Programming is high too, but financial services (FS) is even better
- “Performative power” – when you’re forced to perform, you get better ideas
- Observable 2.0 is an open source static site generator for data
- Python dataclasses
- SORA is OpenAI’s video generation model, and is stunning!
- If Appa comes to Singapore even for a week, he will feel better and can boast to his friends. At over 90, it may be better to move Appa to where I am since many of his friends would be no more and shops, doctors, etc can be managed and getting an independent house nearby is not hard.
- There is an SEZ in Gujarat where Indians can invest like in Mauritius without forex restrictions
- Shubha: Media sites are moving away from Vickrey auctions to first-price auctions for ads. That’s because they send the auction price forward to a search engine and the winning second-price value can lose even though the owner is willing to pay more. Second-price auctions don’t work unless ALL bidders are in the SAME auction. Ad networks are a hierarchy of auctions!
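
A toy sketch of why second-price auctions break down across a hierarchy, as described in the note above. The bidders, bid values, and function names are hypothetical. It assumes a two-stage setup where an ad network's clearing price is forwarded as its bid into an upstream auction.

```python
def run_auction(bids: dict[str, float], pricing: str) -> tuple[str, float]:
    """Return (winner, clearing price). `pricing` is 'first' or 'second'."""
    ranked = sorted(bids.items(), key=lambda item: item[1], reverse=True)
    winner, top_bid = ranked[0]
    # With a single bidder there is no second price, so fall back to the bid.
    if pricing == "first" or len(ranked) == 1:
        return winner, top_bid
    return winner, ranked[1][1]


def hierarchy_winner(pricing: str) -> str:
    """Two-stage auction: a network auction feeds its price upstream."""
    # Stage 1: advertiser A is willing to pay 10, B only 4.
    network_winner, forwarded_price = run_auction({"A": 10.0, "B": 4.0}, pricing)
    # Stage 2: the forwarded price competes with C, who bids 6 directly.
    final_winner, _ = run_auction({network_winner: forwarded_price, "C": 6.0}, pricing)
    return final_winner


second_price_result = hierarchy_winner("second")  # A forwards 4 and loses to C
first_price_result = hierarchy_winner("first")    # A forwards 10 and wins
```

Under second-price rules, A's forwarded price is B's bid of 4, so C wins with 6 even though A was willing to pay 10. Under first-price rules, A's full bid reaches the upstream auction.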
- Gemini 1.5 launched.
- Fly.io offers GPU hosting, and auto-stops machines when they have nothing to do.
- Feeding embeddings into a random forest is very effective for classification – much better than comparing them by dot product.
- To deploy apps with OAuth + templating support in a small Docker container, use Caddy
- Deno has native TypeScript, browser APIs, and compiles to multiple OSs
- Ruff is a MUCH faster flake8
- Two-pass generation is a clever technique to get multiple SEQUENTIAL answers in a single API request. For example, the schema `{'code', 'optimized_code'}` will generate `code` and then optimize it.
- Unions in function calling allow flexible multi-step prompts in a single API call.
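
A minimal sketch of what the two-pass schema and a union might look like as JSON Schema function definitions. The function name, descriptions, and the `step` field are hypothetical. It assumes the model emits fields in schema order (typical, but not guaranteed) and that the provider accepts `anyOf` in function parameters.

```python
import json

# Two-pass schema: `optimized_code` comes after `code`, so the model can
# build on its own first draft within the same response.
two_pass_function = {
    "name": "write_code",  # hypothetical function name
    "description": "Write code, then an optimized version of the same code.",
    "parameters": {
        "type": "object",
        "properties": {
            "code": {"type": "string", "description": "First-draft code."},
            "optimized_code": {
                "type": "string",
                "description": "An optimized rewrite of `code`.",
            },
        },
        "required": ["code", "optimized_code"],
    },
}

# Union: the model picks ONE of several shapes for `step` (hypothetical field),
# which lets a single call either ask a follow-up question or give an answer.
union_parameters = {
    "type": "object",
    "properties": {
        "step": {
            "anyOf": [
                {
                    "type": "object",
                    "properties": {"question": {"type": "string"}},
                    "required": ["question"],
                },
                {
                    "type": "object",
                    "properties": {"answer": {"type": "string"}},
                    "required": ["answer"],
                },
            ]
        }
    },
    "required": ["step"],
}

# Serialized form, ready to pass as the function list in an API request.
payload = json.dumps([two_pass_function], indent=2)
```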