Things I Learned - 28 Jul 2024

This week, I learned:
Speech editing in audio files is a thing. Examples: Speech Editing Toolkit and Descript.
GPT 4o Mini is almost as good as GPT 4o in the LMSYS leaderboard. Llama 3.1 405B and Mistral Large 2 are yet to be evaluated.
If LLMs can generate any text, and text can describe the real world, we can rapidly generate “artifacts” in formats such as:
  3D Printable Models:
    STL (Stereolithography): Defines the surface geometry of 3D objects using triangular facets.
    OBJ (Wavefront OBJ): Describes 3D geometry including vertices, textures, and normals.
    X3D: An XML-based file format for representing 3D computer graphics.
  Vector Graphics:
    SVG (Scalable Vector Graphics): Defines vector-based graphics in XML format, useful for illustrations, diagrams, and user interface elements.
  CAD Drawings:
    DXF (Drawing Exchange Format): Represents CAD data, including shapes, lines, and curves, used in engineering and architecture.
  Circuit Designs:
    KiCad: An open-source software suite for Electronic Design Automation (EDA), whose Eeschema (schematic) and Pcbnew (PCB layout) tools store circuit designs in their own file formats.
  Blueprints and Architectural Designs:
    GML (Geography Markup Language): Encodes geographical features and spatial information.
    CityGML: A specific GML application schema for modeling and exchanging 3D city models.
  Molecular Structures:
    PDB (Protein Data Bank): Describes the three-dimensional structures of molecules.
    CML (Chemical Markup Language): An XML-based standard for representing molecular data.
  Robotics and Automation:
    URDF (Unified Robot Description Format): Defines the physical configuration of a robot, including joints, links, and sensors.
    COLLADA (Collaborative Design Activity): An XML-based schema to describe digital assets for 3D applications, often used in robotics.
  Geospatial Data:
    KML (Keyhole Markup Language): Used for geographic data visualization, primarily in Google Earth.
    GeoJSON: A format for encoding a variety of geographic data structures using JSON.
  Mathematical Markup:
    MathML (Mathematical Markup Language): Describes mathematical notation and captures both its structure and content.
  Music and Sound:
    MusicXML: Encodes sheet music in a structured format that can be easily shared between different music notation software.
  Documents and Text:
    DocBook: A semantic markup language for technical documentation.
    Markdown: A lightweight markup language with plain text formatting syntax.
  Biological Data:
    SBML (Systems Biology Markup Language): Represents computational models of biological processes.
    PhyloXML: An XML format for representing phylogenetic trees.
  Game Development:
    FBX (Filmbox): A file format for 3D animation that can hold information about the geometry, textures, and animations.
    VRML (Virtual Reality Modeling Language): Describes interactive 3D objects and worlds.
  Data Visualization:
    ChartML: Encodes charts and graphs in a structured format.
    D3.js (Data-Driven Documents): Uses HTML, SVG, and CSS to bring data to life with interactive visualizations.
  Building Information Modeling (BIM):
    IFC (Industry Foundation Classes): Describes building and construction data.
  Textiles and Fabrics:
    LoomML: Represents the design and structure of woven fabrics.
  Augmented Reality and Virtual Reality:
    ARML (Augmented Reality Markup Language): Defines how augmented reality applications should behave and what content they should display.
    VRML (Virtual Reality Modeling Language): For describing interactive 3D objects and worlds.
  Medical Imaging and Health Data:
    DICOM (Digital Imaging and Communications in Medicine): Encodes medical imaging data.
    HL7 (Health Level 7): A set of standards for the exchange of information between medical applications.
  Simulation Data:
    FMI (Functional Mock-up Interface): Represents and exchanges dynamic simulation models.
    SBML (Systems Biology Markup Language): For computational models of biological processes.
  Sound and Audio:
    MML (Music Markup Language): For encoding music notation and performance information.
    SoundFont: A file format for defining musical instrument sounds.
  Animation and Visual Effects:
    BVH (Biovision Hierarchy): Encodes motion capture data.
    Alembic: A computer graphics interchange framework primarily for exchanging animation and visual effects data.
  Textile Patterns:
    WIF (Weaving Information File): Describes weaving patterns and structures.
    Knitting Markup Language: Encodes knitting patterns in a structured format.
  Scientific Data:
    CDF (Common Data Format): Used for storing scientific data.
    NetCDF (Network Common Data Form): Supports the creation, access, and sharing of array-oriented scientific data.
  Photography and Imaging:
    XMP (Extensible Metadata Platform): Used for embedding metadata in digital images and other media files.
  Construction and Engineering:
    LandXML: For civil engineering and land surveying data.
    gbXML (Green Building XML): Facilitates the transfer of building data for analysis of energy and environmental performance.
  Packaging and Retail:
    BPL (Barcode Product Labeling): Encodes information for product packaging and labeling.
    GS1 XML: Used for electronic business messaging, including product identification and tracking.
  Typography and Font Design:
    UFO (Unified Font Object): A format for storing font data.
    SFNT (Spline Font): Encodes scalable font information.
  Product Data Management:
    PLMXML (Product Lifecycle Management XML): Used for sharing product data across PLM systems.
GPT 4o Mini can be fine-tuned!
Awesome PaaS lists self-hosted deployment platforms. Piku, which is similar to Dokku, is promising.
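To make the "text can describe artifacts" idea above a bit more concrete, here is a minimal sketch in Python using SVG, one of the formats listed. The `svg_text` string is a hand-written stand-in for what a model might return (no LLM is called), and `validate_svg` is a hypothetical helper name. It only checks that the text is well-formed XML with an `<svg>` root, which is a sanity check and not full SVG validation.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

# Hand-written stand-in for model output; no LLM is called here.
svg_text = f"""<svg xmlns="{SVG_NS}" width="120" height="120" viewBox="0 0 120 120">
  <rect x="10" y="10" width="100" height="100" fill="none" stroke="black"/>
  <circle cx="60" cy="60" r="30" fill="steelblue"/>
</svg>"""


def validate_svg(text: str) -> list[str]:
    """Parse the text as XML and return the local names of the root's children.

    Raises ValueError if the text is not well-formed XML or the root is not <svg>.
    """
    try:
        root = ET.fromstring(text)
    except ET.ParseError as exc:
        raise ValueError(f"not well-formed XML: {exc}") from exc
    if root.tag != f"{{{SVG_NS}}}svg":
        raise ValueError(f"unexpected root element: {root.tag}")
    # Tags look like "{namespace}rect"; keep only the part after the brace.
    return [child.tag.split("}", 1)[-1] for child in root]


shapes = validate_svg(svg_text)
print(shapes)
```

The same pattern (generate text, parse it, reject it if it fails) should carry over to the other text-based formats in the list, each with its own parser.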

Loved this Rocky Aur Rani Kii Prem Kahaani scene where Ranveer asks, “Can we call the Chinese Chinese?” We can't even say "behndi"? Aunty, I'm from Delhi. How can I not say behndi, behndi!? What times have we come to? You can't call fat people fat, you can't call black people black, you can't call old people old. I'm scared to even open my mouth! You tell me, can we call the Chinese Chinese? ...

Things I Learned - 21 Jul 2024

This week, I learned:
GPT For Work has a set of useful spreadsheet LLM functions.
Xata offers a free PostgreSQL tier with a REST API.
Mamba now uses mambaforge as the default installation, i.e. conda-forge is the default and only channel!
  Update: 6 Jun 2025. Mambaforge is sunset as of 29 Jul 2024. Conda-forge now uses Miniforge as the standard installer (Ref: conda-forge.org). Users should switch to Miniforge instead.
nginx supports a load-balancing method, least_conn, which is far better than the default round-robin.
#IMPOSSIBLE LLMs cannot provide a bounding box of objects in images. (Maybe Florence 2 can.)
  Update: Mar 2025. Gemini has good timestamps and bounding boxes.
Models gently grow in capability. It helps to maintain an impossibility list that steadily gets invalidated. Ref
GitHub Copilot internals walks through how Copilot constructs its prompts.
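A rough way to see why least_conn can beat round-robin is to simulate both on requests of uneven length. The sketch below is a toy model in Python, not nginx's implementation: `simulate`, the backend names, and the request durations are all made up for illustration. In nginx itself the method is enabled with the `least_conn;` directive inside an `upstream` block.

```python
from itertools import cycle


def simulate(durations, backends, policy):
    """Send one request per tick and return the peak number of
    concurrent connections seen on any single backend.

    policy is "round_robin" or "least_conn" (toy versions of both).
    """
    open_until = {b: [] for b in backends}  # finish tick of each open connection
    rotation = cycle(backends)
    peak = 0
    for tick, duration in enumerate(durations):
        # Close connections that have finished by this tick.
        for b in backends:
            open_until[b] = [t for t in open_until[b] if t > tick]
        if policy == "round_robin":
            chosen = next(rotation)  # fixed rotation, ignores load
        else:
            # Fewest open connections wins; ties go to the first backend listed.
            chosen = min(backends, key=lambda b: len(open_until[b]))
        open_until[chosen].append(tick + duration)
        peak = max(peak, len(open_until[chosen]))
    return peak


# Hypothetical workload: long and short requests alternate.
durations = [10, 1, 10, 1, 10, 1]
backends = ["a", "b"]
rr_peak = simulate(durations, backends, "round_robin")
lc_peak = simulate(durations, backends, "least_conn")
print(rr_peak, lc_peak)
```

With this workload, round-robin sends every long request to the same backend, which ends up holding 3 connections at once, while the least-connections policy never exceeds 2 on either backend.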

Things I Learned - 14 Jul 2024

This week, I learned:
Carlton’s TDS session
  Always create a new venv via VS Code when starting a training session. Helps reproduce issues (though I could use Colab instead).
  Create an empty .ipynb notebook and double-click it. That’s another way (though slower) to open a Jupyter notebook.
Shane Parrish Knowledge Project podcast. Three generations of wealth
  There is a big difference between liking animals and being a vet. Between liking education and being a teacher.
  Even if no one reads your writing, you benefit from the writing.
  Emotional crises like 9/11 or Covid are far easier for markets to recover from.
Hidden Brain podcast. Why trying too hard can backfire on you
  Sometimes conscious thinking gets in the way of our automated responses. Sports, music, and dance are great examples.
  Instead, SURRENDER to something outside of you. Like playing with kids.
  Exercise also sends blood away from the brain. Drugs. ChatGPT.
  It’s called wu wei in Chinese philosophy.
A quick check on the pricing of text to speech models
  OpenAI TTS: $15/1M chars Ref
  Deepgram Aura: $15/1M chars Ref
  Elevenlabs Scale: $165/1M chars Ref
  Google TTS Neural2: $16/1M chars Ref
  Azure AI Speech: $15/1M chars Ref
  AWS Polly Neural TTS: $16/1M chars Ref
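To compare the text-to-speech prices above on a concrete job, a small Python sketch helps. The per-million-character prices are copied from the list. The 500,000-character workload and the `tts_cost` helper are assumptions for illustration, and the calculation ignores free tiers, minimums, and volume discounts.

```python
# Prices in USD per 1M characters, copied from the list above.
PRICE_PER_MILLION_CHARS = {
    "OpenAI TTS": 15,
    "Deepgram Aura": 15,
    "Elevenlabs Scale": 165,
    "Google TTS Neural2": 16,
    "Azure AI Speech": 15,
    "AWS Polly Neural TTS": 16,
}


def tts_cost(num_chars: int, price_per_million: float) -> float:
    """Return the cost in USD of synthesizing num_chars characters."""
    return num_chars / 1_000_000 * price_per_million


# Hypothetical workload: a 500,000-character book.
book_chars = 500_000
for provider, price in sorted(PRICE_PER_MILLION_CHARS.items(), key=lambda kv: kv[1]):
    print(f"{provider}: ${tts_cost(book_chars, price):.2f}")
```

At these rates the book costs $7.50 to $8.00 on five of the services and $82.50 on Elevenlabs Scale, so Elevenlabs is roughly ten times the price of the others.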

I'll leave tomorrow's problems to tomorrow's me

What a delightful idea.
I’ll leave tomorrow’s problems to tomorrow’s me. – Saitama, One Punch Man
Saitama is now one of my favorite heroes. Right up there with Atticus Finch and Juror #8. Very few people can articulate such a wonderful philosophy as effectively. The closest was Calvin. Of course, it’s not a perfect system. But they do say, “Sometimes, the best way to get something is to stop trying to get it.”

Things I Learned - 07 Jul 2024

This week, I learned: Predibase uses LoRAX to serve multiple fine-tuned variants of a base model on a single GPU via adapters. Ref
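The general idea behind adapter-based serving can be sketched in a few lines of Python. This is a toy illustration of LoRA-style adapters, not LoRAX's actual code: the 2x2 base matrix, the adapter names, and the `forward` helper are all hypothetical. One base weight matrix W is held once, and each fine-tune contributes only a small low-rank pair (B, A), so the output for a request is W x + B (A x).

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


# One shared base weight matrix (2x2), held in memory once.
BASE_W = [[1.0, 0.0],
          [0.0, 1.0]]

# Hypothetical rank-1 adapters: each stores only B (2x1) and A (1x2).
ADAPTERS = {
    "support-bot": {"B": [[1.0], [0.0]], "A": [[0.0, 2.0]]},
    "sql-gen": {"B": [[0.0], [1.0]], "A": [[3.0, 0.0]]},
}


def forward(x, adapter_name):
    """Compute W x + B (A x): shared base output plus a low-rank update."""
    base_out = matvec(BASE_W, x)
    adapter = ADAPTERS[adapter_name]
    low_rank = matvec(adapter["B"], matvec(adapter["A"], x))
    return [b + d for b, d in zip(base_out, low_rank)]


x = [1.0, 1.0]
out_support = forward(x, "support-bot")
out_sql = forward(x, "sql-gen")
print(out_support, out_sql)
```

Both calls reuse the same `BASE_W`, and only the small adapter matrices differ per request, which is why many fine-tunes can share one copy of the base model.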