I’ve been using AI in my Tools in Data Science course for over two years – both teaching AI and using AI to teach.

I told GitHub Copilot (prompt) to go through my transcripts, blog posts, code, and things I learned since 2024, and list every experiment of mine in AI education, rating each on importance and novelty.

Here is the full list of my experiments.

1. Teach using exams and prompts, not content

  • Use exams to teach. The typical student is busy. They want grades, not learning. They’ll write the exams, but not read the content. So, I moved the course material into the questions. If they can answer the question, great. Skip the content.
  • Use AI to generate the content. I used to write content. Then I linked to the best content online – it’s better than mine. Now, AI drafts comics, interactive explainers, and simulators. My job is to pick good topics and good formats.
  • Give them prompts directly. Skip the content! I generated it with prompts anyway. Give students the prompts directly. They can use better AI models, revise the prompts, and learn how to learn with AI.
  • Add an “Ask AI” button. Make it easy for students to use ChatGPT. Stop pretending that real-world problem solving is closed-book and solo.
  • Make test cases teach, not just grade. Automate the testing (with code or AI). Good test cases show students the kinds of mistakes they may make, teaching them, not just grading them. That’s great for teachers to analyze, too.
  • Test first, then teach from the mistakes. Let them solve problems first. Then teach them, focusing on what failed. AI does the work; humans handle what AI can’t. This lets us teach really useful skills based on real mistakes.
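The "test cases that teach" idea above can be sketched as a tiny harness. Everything here is hypothetical: `normalize` is a made-up exercise, and each check pairs an assertion with the misconception a failure usually signals, so a wrong answer explains itself instead of just scoring zero.

```python
# Sketch: test cases that teach (hypothetical exercise: normalize scores).
# Each check carries a diagnostic hint shown only on failure.

def normalize(scores):
    """A reference solution for the made-up exercise."""
    total = sum(scores)
    return [s / total for s in scores]

CHECKS = [
    # (input, expected output, hint shown on failure)
    ([1, 1, 2], [0.25, 0.25, 0.5],
     "Basic case failed: are you dividing by the sum of all scores?"),
    ([5], [1.0],
     "Single-item case failed: a lone score should normalize to 1.0."),
    ([0, 0, 4], [0.0, 0.0, 1.0],
     "Zeros failed: zero scores should stay zero, not be dropped."),
]

def run_checks(fn):
    """Return (passed_count, hints) so the harness teaches, not just grades."""
    hints = []
    for inp, expected, hint in CHECKS:
        if fn(list(inp)) != expected:
            hints.append(hint)
    return len(CHECKS) - len(hints), hints
```

The returned hints double as teaching material: aggregate them across a class and you see exactly which misconceptions to lecture on.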

2. Make cheating pointless through design, not detection

  • Allow copying, collaboration, and hacking. In real work, nobody gets bonus points for working alone or re-inventing the wheel. Collaboration, using available resources well, verifying inputs, disclosed shortcuts – all are rewarded.
  • Reward originality without punishing collaboration. Blanket anti-copying rules assume that all similarity is bad. A more AI-native approach is to allow learning from others openly, but give extra credit for genuine variation, initiative, and novel improvement.
  • Give each student a unique variant. If everyone sees the same problem with the same visible answer path, answer-sharing becomes the dominant strategy. Deterministic but unique variants shift the game from leaking answers to actually solving the problem.
  • Make process logs part of the evidence. When outputs can be copied or AI-generated, the trace becomes more valuable than the final artifact. Logs, verification notes, session recordings, and agent traces show whether the student can actually orchestrate the work.
  • Use repo-grounded vivas for authenticity. If you really want to know whether a student owns their project, ask them questions drawn from their own repo and make them change something live. That is much harder to fake than polished submitted output.
  • Use structural similarity, not string matching. Strip docstrings, tokenize, MinHash. Students who rename variables are still caught; students who genuinely collaborated produce detectable clusters rather than suspicious pairs.
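The per-student variant idea above takes a few lines: seed a PRNG from the student ID plus question ID, so every student gets a different but fully reproducible problem. The question itself ("q1", a two-number product) is a made-up placeholder.

```python
# Sketch: deterministic per-student variants. Seeding from
# student_id:question_id makes each variant unique per student
# but stable across regrades.
import hashlib
import random

def variant(student_id: str, question_id: str) -> dict:
    seed = hashlib.sha256(f"{student_id}:{question_id}".encode()).hexdigest()
    rng = random.Random(seed)  # str seeds are deterministic in Python 3
    a, b = rng.randint(10, 99), rng.randint(10, 99)
    return {"prompt": f"Compute {a} * {b}.", "answer": a * b}
```

Because the grader can recompute any student's variant from the IDs alone, there is nothing to store and nothing to leak: sharing an answer only helps someone with the same seed.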
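A minimal sketch of the structural-similarity check above: tokenize the submission, collapse identifiers and string literals so renaming can't hide copying, then compare token 3-gram sets. Exact Jaccard is shown for clarity; MinHash approximates the same score cheaply at class scale.

```python
# Sketch: structural similarity for Python submissions.
# Renamed variables collapse to "NAME", so cosmetic edits don't lower the score.
import io
import keyword
import token
import tokenize

SKIP = {token.NEWLINE, token.NL, token.INDENT, token.DEDENT,
        token.COMMENT, token.ENDMARKER}

def normalized_tokens(source: str) -> list:
    """Token stream with identifiers, strings, and layout normalized away."""
    out = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in SKIP:
            continue
        if tok.type == token.NAME and not keyword.iskeyword(tok.string):
            out.append("NAME")   # any identifier looks the same
        elif tok.type == token.STRING:
            out.append("STR")    # docstrings and literals collapse too
        else:
            out.append(tok.string)
    return out

def shingles(tokens, k=3):
    """Overlapping k-grams of the token stream."""
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}

def jaccard(a: set, b: set) -> float:
    union = a | b
    return len(a & b) / len(union) if union else 1.0
```

Two submissions that differ only in variable names score 1.0; genuinely independent solutions land much lower, and clusters of high pairwise scores surface collaboration groups rather than isolated "suspicious pairs".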

3. Test skills that matter in an AI world

  • Teach what AI still cannot do well. Syntax and routine execution are declining in value. Judgment, debugging, orchestration, validation, integration, and taste are rising. The curriculum should move upward, not cling to the parts AI is already eating.
  • Use hard, messy problems to build real resilience. Some questions should be intentionally tricky, partly wrong, hidden in the UI, or out of syllabus. The students who find and solve them anyway are demonstrating exactly the adaptability that real work demands. Smooth progression alone doesn’t build that.
  • Test live, hands-on AI skills. Don’t just lecture about embeddings, vision, structured outputs, or hallucinations. Put students in live API-driven tasks where they have to use these things under time pressure and genuine uncertainty.
  • Grade students on designing AI workflows. In many real settings, the important skill is not “give the answer” but “design the chain of steps that gets to the answer reliably.” That includes tools, prompts, datasets, quality checks, fallbacks, and output formats.
  • Use game-like tasks to teach agentic work. Mazes, escape rooms, and API games force state tracking, exploration strategy, and backtracking — exactly the behaviors agentic systems require. They’re not gimmicks; they’re the syllabus.
  • Test prompt attacks and defenses. Security and adversarial literacy should not be abstract topics. Make students jailbreak, defend, manipulate, and harden model behavior. That turns “prompt security” from a lecture topic into a measurable skill.
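The game-like tasks above reduce to a small loop worth showing explicitly. This is an illustrative toy, not one of the course's actual games: an agent walks a hidden grid maze, tracking visited cells and popping back when it hits a dead end, which is exactly the state-tracking and backtracking behavior the bullet describes.

```python
# Sketch: depth-first exploration of a made-up maze.
# 'S' start, 'G' goal, '#' wall. The agent keeps a path stack and a
# visited set, and backtracks when no unvisited neighbor is open.
MAZE = [
    "S.#",
    "#.#",
    "#.G",
]

def solve(maze):
    rows, cols = len(maze), len(maze[0])
    start = next((r, c) for r in range(rows) for c in range(cols)
                 if maze[r][c] == "S")
    path, visited = [start], {start}
    while path:
        r, c = path[-1]
        if maze[r][c] == "G":
            return path                       # goal reached
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and maze[nr][nc] != "#" and (nr, nc) not in visited):
                visited.add((nr, nc))
                path.append((nr, nc))         # explore forward
                break
        else:
            path.pop()                        # dead end: backtrack
    return None                               # no route exists
```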

4. Make assessment more like real work

5. Use AI to build the course, not just teach inside it

6. Build the infrastructure for AI in education

  • Break rubrics into binary sub-criteria; reason before judging. Open-ended project grading becomes more auditable when you decompose it into binary yes/no criteria and ask the model to explain its reasoning before delivering a verdict. High or suspicious scores get re-evaluated with stronger guardrails.
  • Give every student shared, budgeted AI access. If AI access depends on personal subscriptions, the institution is quietly grading wealth, not skill. Shared governed access makes AI a course capability, not a private advantage.
  • Let AI handle routine support; keep humans for judgment. AI handles repetitive, searchable, first-pass questions. Humans handle ambiguity, reassurance, escalation, and final accountability. Neither alone is the right model at scale.
  • Turn recurring answers into canonical Q&A cards. Once the same question appears three times, it should stop living in somebody’s head or an old thread. Convert it into a canonical artifact that both humans and bots can cite consistently.
  • Govern with green/amber/red review levels. Not every decision needs the same scrutiny. Auto-ship the low-risk, spot-check the medium-risk, always human-review the high-stakes. This is how you scale without losing trust.
  • Roll out in shadow mode first. High-stakes academic workflows should not be launched with fingers crossed. Run the AI system quietly in parallel with human judgment and learn before turning it loose.
  • Turn policy into executable checks. A policy that cannot be operationalized at scale is mostly theater. If you can translate rules and rubrics into machine-checkable form, governance becomes consistent rather than person-dependent.
  • Make the course publicly inspectable. Openness raises the bar. It invites scrutiny, reuse, criticism, and improvement, and it turns the course into a visible institutional experiment rather than a sealed classroom.
  • Use reasoning models only for the borderline cases. Cheap screening first, expensive verification for the high-stakes or suspicious. Increasing reasoning effort on even a small model can flip an evaluator from sloppy to reliable — the cost curve makes this the natural operating model.
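The binary-rubric bullet above can be sketched concretely. The criteria and the `judge` callable are hypothetical; in practice each check would be an LLM call instructed to write its reasoning before committing to yes/no, which is what makes the grade auditable.

```python
# Sketch: decompose an open-ended rubric into binary sub-criteria.
# judge(question, submission) -> (reasoning_text, bool); a stand-in
# for a reason-first LLM call.
RUBRIC = [
    ("has_readme", "Does the repo explain how to run the project?"),
    ("has_tests", "Is there at least one automated test?"),
    ("cites_data", "Is the data source documented?"),
]

def grade(submission: dict, judge) -> dict:
    results = {}
    for key, question in RUBRIC:
        reasoning, verdict = judge(question, submission)
        results[key] = {"reasoning": reasoning, "pass": verdict}
    score = sum(r["pass"] for r in results.values()) / len(RUBRIC)
    return {"score": score, "criteria": results}
```

Every sub-verdict carries its own reasoning trace, so a suspicious total can be re-run criterion by criterion instead of re-arguing one opaque holistic score.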
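The green/amber/red governance and the cheap-screening-first cascade compose naturally. The thresholds, stakes labels, and scorer stand-ins below are all made up for illustration: confident cheap verdicts auto-ship, borderline ones escalate to an expensive reasoning model, and high-stakes items always queue for a human.

```python
# Sketch: tiered review routing. Thresholds are illustrative;
# cheap_score / strong_scorer stand in for real evaluator calls.

def route(cheap_score: float, stakes: str) -> str:
    """Return the review level for one submission."""
    if stakes == "high":
        return "red"      # always human-reviewed
    if 0.4 <= cheap_score <= 0.7:
        return "amber"    # borderline: escalate to a reasoning model
    return "green"        # confident cheap verdict: auto-ship, spot-check

def review(cheap_score, stakes, strong_scorer, human_queue):
    level = route(cheap_score, stakes)
    if level == "green":
        return cheap_score
    if level == "amber":
        return strong_scorer()          # pay for reasoning only here
    human_queue.append(cheap_score)     # red: parked for a person
    return None
```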

7. Analyze and research learning exhaust

8. Upgrade the human role

  • Make judgment and taste explicit learning goals. AI makes average output cheap. The premium moves to selecting what is worth doing, recognizing quality, and knowing what to reject. That is a teachable skill, not a vague aspiration.
  • Teach directional feedback as a skill. You do not always need detailed corrections to steer AI. The higher-order skill is to say “more concrete,” “less jargon,” “optimize for faculty adoption,” or “make this defensible.” That is learnable, and often more effective than micromanaging.
  • Teach faculty to manage agents, not just chat with them. Institutional AI does not scale on prompting alone. People need to learn specs, budgets, kill switches, and escalation rules — orchestration literacy, not just chatbot familiarity.
  • Use AI as a personalized coach. The model is not just an answer engine. It can become a research guide, curiosity amplifier, and next-step recommender tailored to the individual learner’s gaps and goals.
  • Let non-coders build interactive learning tools. AI lowers the cost of making timelines, maps, biographies, and interactive explainers. That opens AI-native pedagogy far beyond computer science into humanities and social sciences.
  • Teach students to run many AI attempts in parallel. One of the biggest AI-native workflow shifts is from single-path effort to portfolio thinking — run several attempts, compare them, and converge faster. That is a teachable habit, not an obvious default.
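The parallel-attempts habit above is a one-function pattern. `generate` and `score` are stand-ins for a real model call and a real quality check; the point is the shape of the workflow, not the specifics.

```python
# Sketch: portfolio thinking — fan out several attempts concurrently,
# then keep the highest-scoring draft instead of polishing one path.
from concurrent.futures import ThreadPoolExecutor

def best_of(prompts, generate, score):
    """Run one attempt per prompt in parallel; return the best draft."""
    with ThreadPoolExecutor(max_workers=len(prompts)) as pool:
        drafts = list(pool.map(generate, prompts))
    return max(drafts, key=score)
```

Threads suit this because model calls are I/O-bound; the same shape works with async clients or a batch API.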