Generate realistic fake data for testing hypotheses and analysis.

STEP 1. List columns that would be present in such data, briefly describing how the data might be distributed.
STEP 2. Think about who the audience might be and their objective / key questions. Generate 5 actionable hypotheses they might want to test to achieve this objective.
STEP 3. Write and run a Python program that generates 2,000 rows of realistic fake data in which these hypotheses hold at a statistically significant level. Real data has extreme or unexpected distributions, breaks in patterns, surprising correlations, standout entities (people, places, products, segments) that defy norms, unusual or high-variance groups, underutilization, phase transitions, tipping points, hidden populations, etc. Make sure the fake data reflects whichever of these characteristics are relevant.
STEP 3a. [OPTIONAL] Test each hypothesis and show the results.
STEP 4. Let me download the output as a CSV file.
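
For illustration only, STEPS 3, 3a and 4 could be sketched roughly as follows. This is a minimal sketch, not a prescribed implementation: the column names (customer_id, plan, monthly_spend), the planted effect (higher spend for the "pro" plan), the 1% outlier group, and the normal approximation used for the p-value are all assumptions made up for this example. It uses only the Python standard library.

```python
import csv
import io
import math
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

N_ROWS = 2000
PLANS = ["free", "basic", "pro"]  # hypothetical segments, not from the prompt


def make_row(i):
    """Build one fake customer row with a planted effect and rare outliers."""
    plan = random.choices(PLANS, weights=[0.6, 0.3, 0.1])[0]
    # Right-skewed spend (log-normal), with a planted uplift for "pro".
    mu = 3.0 + (0.5 if plan == "pro" else 0.0)
    spend = random.lognormvariate(mu, 0.6)
    # Hidden population: roughly 1% of rows spend 20x more than expected.
    if random.random() < 0.01:
        spend *= 20
    return {"customer_id": i, "plan": plan, "monthly_spend": round(spend, 2)}


rows = [make_row(i) for i in range(1, N_ROWS + 1)]


def welch_z(a, b):
    """Welch statistic; p-value via a normal approximation (assumes large samples)."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    z = (statistics.mean(a) - statistics.mean(b)) / se
    p = 2 * (1 - statistics.NormalDist().cdf(abs(z)))
    return z, p


# STEP 3a: test the planted hypothesis on log spend, which tames the skew.
pro = [math.log(r["monthly_spend"]) for r in rows if r["plan"] == "pro"]
other = [math.log(r["monthly_spend"]) for r in rows if r["plan"] != "pro"]
z, p_value = welch_z(pro, other)
print(f"pro vs. other (log spend): z = {z:.2f}, p = {p_value:.3g}")

# STEP 4: serialize to CSV; an in-memory buffer keeps the sketch side-effect free.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["customer_id", "plan", "monthly_spend"])
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
```

To produce a downloadable file instead, the same writer can be pointed at open("fake_data.csv", "w", newline="") (the file name is hypothetical). Planting each effect in the generator and then re-testing it on the generated rows is what confirms that a hypothesis actually holds in the output.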