Tomorrow, we’ll be vibe-analyzing data at a Hasgeek Fifth Elephant workshop.
It’s a follow-up to my DataHack Summit talk “RIP Data Scientists”. In that talk, I showed how many data science tasks can be automated. In this workshop, the audience will do it themselves.
Slides: https://sanand0.github.io/talks/2025-09-16-vibe-analysis/ (minimal because… well, it’s “vibe analysis”. We’ll code as we go.)
Here are datasets I’ll suggest to the audience:
- India Census 2011: https://www.kaggle.com/datasets/danofer/india-census
- MovieLens movies: https://grouplens.org/datasets/movielens/32m/
- IMDb movies: https://datasets.imdbws.com/
- Occupational Employment and Wage Statistics (OEWS): https://www.bls.gov/oes/tables.htm
- Global AI Job Market & Salary Trends 2025: https://www.kaggle.com/datasets/bismasajjad/global-ai-job-market-and-salary-trends-2025
- Flight Delay Dataset: https://www.kaggle.com/datasets/shubhamsingh42/flight-delay-dataset-2018-2024
- London House Price Data: https://www.kaggle.com/datasets/jakewright/house-price-data
- Exchange Rates to USD: https://www.kaggle.com/datasets/robikscube/exhange-rates-to-usd-from-imforg-updated-daily
- Thailand Road Accidents (2019-2022): https://www.kaggle.com/datasets/thaweewatboy/thailand-road-accident-2019-2022
… but if you’d like stories from any interesting recent datasets (10K - 10M rows, easy to download), please suggest them in the comments. 🙏