Stanford Seminar - Replication strategies for more robust human simulation
08 Jun 2024
Using LLMs for Social Scientific Research
- LLMs can advance social scientific inquiry and simulate human behavior.
- LLMs can be used to understand how people make decisions, interact with each other, and form opinions.
- Using LLMs for social scientific research poses challenges, such as sampling bias and variation in model outputs.
- Appropriate interfaces and standards are needed for using LLMs in social scientific research.
Validity and Reproducibility of LLM-Generated Findings
- Concerns about the validity and reproducibility of social science findings generated using LLMs.
- Some studies embrace transparency and reproducibility by providing prompting materials and input data.
- Open-source models and data are advocated to understand biases and ensure reproducibility.
- Need to assess distinct threats to reproducing social science with AI models.
- Prior work focused on estimating bias and sampling problems in other new research settings.
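One way to act on the transparency points above is to store a small manifest alongside every simulated study. The sketch below is only an illustration under stated assumptions: `RunManifest`, its fields, the model name, and the prompt text are all hypothetical and do not come from the seminar.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class RunManifest:
    """Hypothetical record of what someone would need to rerun one simulated study."""
    model_name: str         # placeholder identifier, not a real model
    model_version: str
    prompt_template: str    # the exact prompting material that was used
    temperature: float
    n_samples: int
    seed: int
    input_data_sha256: str  # hash of the input data, not the data itself


def hash_bytes(data: bytes) -> str:
    """SHA-256 hex digest of raw input data."""
    return hashlib.sha256(data).hexdigest()


def fingerprint(manifest: RunManifest) -> str:
    """Stable hash of the whole manifest, so two runs can be checked for identical setup."""
    canonical = json.dumps(asdict(manifest), sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# All values below are made up for illustration.
manifest = RunManifest(
    model_name="example-open-model",
    model_version="v0",
    prompt_template="You are deciding how much to donate to {charity}.",
    temperature=0.7,
    n_samples=100,
    seed=0,
    input_data_sha256=hash_bytes(b"participant_id,condition\n1,control\n"),
)

print(json.dumps(asdict(manifest), indent=2))
print("fingerprint:", fingerprint(manifest))
```

Publishing the manifest and its fingerprint with the results would let a later re-replication confirm that it used the same prompts, settings, and input data, or state exactly which of them it changed.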
Threats to Robust Social Scientific Replication and Simulations
- LLM-specific threats:
  - Prompt sensitivity: Idiosyncrasies in crafting prompts affect generalizability.
  - Stochasticity: Inherent randomness impacts consistency and reliability.
  - Memorization: Reproducing artifacts of training data leads to biased simulations.
- Sensitivity probes:
  - Perturbation: Observing effects of small changes to prompts or parameters.
  - Data augmentation: Assessing sensitivity to variations in input data.
  - Model comparison: Comparing results across different LLMs or datasets.
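The model-comparison probe listed above can be sketched as follows. This is a minimal illustration, not the speakers' method: the "models" are seeded random stubs with made-up choice probabilities, and `make_stub_model` and `share_choosing_a` are hypothetical names. In practice each stub would be replaced by a call to a different LLM.

```python
import random
from statistics import mean


def make_stub_model(p_choose_a: float, seed: int):
    """Stand-in for an LLM (assumption): answers 'A' with a fixed probability."""
    rng = random.Random(seed)

    def respond(prompt: str) -> str:
        return "A" if rng.random() < p_choose_a else "B"

    return respond


def share_choosing_a(model, prompt: str, n: int) -> float:
    """Fraction of n sampled responses that choose option A."""
    return mean(1.0 if model(prompt) == "A" else 0.0 for _ in range(n))


# Hypothetical models with arbitrary, made-up tendencies.
models = {
    "stub-model-1": make_stub_model(0.55, seed=1),
    "stub-model-2": make_stub_model(0.75, seed=2),
    "stub-model-3": make_stub_model(0.95, seed=3),
}

prompt = "Choose charity A or charity B. Answer with a single letter."
shares = {name: share_choosing_a(m, prompt, n=500) for name, m in models.items()}

# The spread across models is one rough indicator of how model-dependent a finding is.
spread = max(shares.values()) - min(shares.values())

for name, share in shares.items():
    print(f"{name}: share choosing A = {share:.2f}")
print(f"spread across models = {spread:.2f}")
```

A large spread would suggest that a reported effect belongs to one model rather than to the simulated population.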
Perturbation and Iteration
- Perturbation: Systematically varying prompts and settings to assess sensitivity.
- Dimensions of perturbation: study protocol, settings, prompting strategies, model version.
- Iteration: Drawing multiple samples to understand distributional characteristics.
- Re-replication: Replicating existing replications to assess consistency.
- Perturbation and iteration can be combined to understand the sampling distribution of perturbed results.
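Combining perturbation with iteration, as described above, amounts to sampling repeatedly within every cell of a grid of perturbed settings. The sketch below assumes a stub in place of a real LLM call. `stub_donation_model`, the prompt variants, and the temperatures are all hypothetical choices made for illustration.

```python
import itertools
import random
from statistics import mean, stdev

# Perturbation dimensions (illustrative values, not taken from the talk).
PROMPT_VARIANTS = [
    "How much would you donate to this charity?",
    "Please state the amount you would give to this charity.",
    "You have $100. What donation do you make to this charity?",
]
TEMPERATURES = [0.2, 0.7, 1.0]
N_ITERATIONS = 50  # samples drawn per perturbed setting


def stub_donation_model(prompt: str, temperature: float, rng: random.Random) -> float:
    """Hypothetical stand-in for an LLM call; returns a donation clipped to [0, 100]."""
    base = 40.0 + 5.0 * (len(prompt) % 3)        # arbitrary prompt-dependent shift
    noise = rng.gauss(0.0, 10.0 * temperature)   # more randomness at higher temperature
    return min(100.0, max(0.0, base + noise))


def run_grid(seed: int = 0) -> dict:
    """Iterate within each (prompt, temperature) cell; return (mean, sd) per cell."""
    rng = random.Random(seed)
    results = {}
    for prompt, temp in itertools.product(PROMPT_VARIANTS, TEMPERATURES):
        samples = [stub_donation_model(prompt, temp, rng) for _ in range(N_ITERATIONS)]
        results[(prompt, temp)] = (mean(samples), stdev(samples))
    return results


results = run_grid()
cell_means = [m for m, _ in results.values()]

for (prompt, temp), (m, sd) in results.items():
    print(f"T={temp:<4} mean={m:6.2f} sd={sd:5.2f}  {prompt}")
print(f"range of cell means: {min(cell_means):.2f} to {max(cell_means):.2f}")
```

The within-cell standard deviation reflects stochasticity, while the range of cell means reflects sensitivity to the perturbations. Reporting both gives a fuller picture than a single point estimate.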
Replication and Re-replication in Scientific Research
- Replication: Repeating a study to confirm or refute the original findings.
- Re-replication: Replicating an existing replication to assess consistency.
- Re-replication is not as common in social science as replication and meta-analysis.
- Implications of replications and re-replications in social science.
Example: Overhead Aversion in Donations to Charities
- Original study by Gneezy et al. (2014) on overhead aversion in donations.
- Simulation study using a language model to replicate the original study.
- The model exhibited overhead aversion, but more strongly than the human participants did.
- Questions about whether the model's behavior is a compelling replication.
Probing and Exploring the Space of Settings
- Study using a language model to simulate a social science experiment.
- Model's choices compared to human data from the original study.
- The model's overall patterns resemble the human data, but its point estimates are more extreme.
- Perturbing prompting and settings produces substantial variations in model output.
- Probing and exploring settings reveal important information about result sensitivity.
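The comparison described above, between a human estimate and model estimates across perturbed settings, might be summarized along these lines. Every number below is a placeholder invented for illustration. None of them are results from the seminar or from the original study, and `summarize` is a hypothetical helper.

```python
from statistics import mean

# PLACEHOLDER values only: not data from the talk or the original study.
human_effect = 0.30
model_effects_by_setting = {
    "baseline prompt, temperature 0.2": 0.85,
    "reworded prompt, temperature 0.2": 0.60,
    "baseline prompt, temperature 1.0": 0.70,
    "reworded prompt, temperature 1.0": 0.45,
}


def summarize(human: float, model_effects: dict) -> dict:
    """Compare one human effect estimate with model estimates from several settings."""
    values = list(model_effects.values())
    lo, hi = min(values), max(values)
    return {
        # Does the model reproduce the direction of the effect everywhere?
        "same_sign_in_all_settings": all((v > 0) == (human > 0) for v in values),
        # How much do the model's point estimates move under perturbation?
        "model_range": (lo, hi),
        # Is the human estimate inside the range the model produces?
        "human_within_model_range": lo <= human <= hi,
        # Average distance between model and human point estimates.
        "mean_abs_gap": mean(abs(v - human) for v in values),
    }


summary = summarize(human_effect, model_effects_by_setting)
for key, value in summary.items():
    print(f"{key}: {value}")
```

With these placeholder values the direction of the effect matches in every setting, yet the human estimate falls outside the model's range. That is the kind of pattern the bullets describe: a qualitative resemblance with more extreme point estimates.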