Will people accept AI performance evaluations?
Anish Agarwal raised this question a few weeks ago, mentioning that people find it hard to accept being evaluated by AI.
But I believe LLMs are great for evaluation. We need to get comfortable AND familiar with them.
So I’m introducing a project next week for my students:
- USE AN LLM to automatically analyze data. Given a dataset, write a program that will use LLMs to create an analysis report.
- CONVINCE IT to give you marks. Write the code and report so that the LLM rewards you.
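
To make the first task concrete, here is a minimal sketch, assuming the dataset is a CSV and that you supply your own `ask_llm` function (a hypothetical callable that takes a prompt string and returns the model's reply). The names `summarize_csv`, `build_prompt`, and `analyze` are illustrative and not part of the project spec. The idea is to send the LLM a compact summary rather than the raw data.

```python
import csv
import io
import statistics


def summarize_csv(text):
    """Summarize each column: numeric stats, or a distinct-value count for text."""
    rows = list(csv.DictReader(io.StringIO(text)))
    summary = {"rows": len(rows), "columns": {}}
    for name in (rows[0].keys() if rows else []):
        # Skip blank or missing cells before deciding the column type.
        values = [r[name] for r in rows if r[name] not in ("", None)]
        if not values:
            summary["columns"][name] = {"type": "empty"}
            continue
        try:
            nums = [float(v) for v in values]
        except ValueError:
            # At least one value is not a number, so treat the column as text.
            summary["columns"][name] = {"type": "text", "distinct": len(set(values))}
        else:
            summary["columns"][name] = {
                "type": "numeric",
                "min": min(nums),
                "max": max(nums),
                "mean": statistics.mean(nums),
            }
    return summary


def build_prompt(summary):
    """Turn the summary into a plain-text prompt asking for a report."""
    lines = [f"The dataset has {summary['rows']} rows."]
    for name, info in summary["columns"].items():
        details = ", ".join(f"{k}={v}" for k, v in info.items())
        lines.append(f"- {name}: {details}")
    lines.append("Write a short analysis report with key findings and caveats.")
    return "\n".join(lines)


def analyze(text, ask_llm):
    """ask_llm is an assumption: any function mapping a prompt to a reply."""
    return ask_llm(build_prompt(summarize_csv(text)))
```

You could call `analyze(open("data.csv").read(), ask_llm)` with whichever LLM client you prefer wrapped inside `ask_llm`. Keeping the LLM call behind a plain function also makes the program easy to test without network access.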
Here’s the project: https://github.com/sanand0/tools-in-data-science-public/blob/tds-2023-t3-project2-wip/project-2-automated-analysis.md
This is a WORK IN PROGRESS. I’d love your feedback.
What would you CHANGE? What would you LEARN from this?