Result Figure 1. Comparison of DeepPersona with Cultural Prompting and OpenCharacter across six countries. Lower values on four statistical metrics (KS, Wasserstein, JS Divergence, Mean Difference) indicate closer alignment with real human survey responses. DeepPersona consistently achieves the lowest distances (highlighted in green), showing superior realism and cultural fidelity.
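For readers unfamiliar with the four distances named in the caption above, the following is a minimal sketch of how they could be computed between real and simulated answers to a single ordinal survey question. It is illustrative only and not taken from the DeepPersona codebase: the function name `distribution_distances`, the `support` parameter, the base-2 logarithm for JS divergence, and the use of the absolute difference of means are all assumptions, and the exact definitions behind the reported numbers may differ.

```python
import math
from collections import Counter


def _pmf(responses, support):
    """Empirical probability of each answer option in `support`."""
    counts = Counter(responses)
    n = len(responses)
    return [counts.get(v, 0) / n for v in support]


def _cdf(pmf):
    """Running sum of a probability mass function."""
    total, out = 0.0, []
    for p in pmf:
        total += p
        out.append(total)
    return out


def _kl(p, q):
    """KL divergence in bits; terms with p == 0 contribute nothing."""
    return sum(a * math.log2(a / b) for a, b in zip(p, q) if a > 0)


def distribution_distances(real, simulated, support):
    """Compare two samples of ordinal answers (hypothetical helper).

    `support` is the sorted list of possible answer values, e.g. [1, 2, 3, 4, 5].
    """
    p, q = _pmf(real, support), _pmf(simulated, support)
    cp, cq = _cdf(p), _cdf(q)

    # Kolmogorov-Smirnov statistic: largest gap between the two empirical CDFs.
    ks = max(abs(a - b) for a, b in zip(cp, cq))

    # 1-D Wasserstein-1 distance: area between the two CDFs.
    wasserstein = sum(
        abs(cp[i] - cq[i]) * (support[i + 1] - support[i])
        for i in range(len(support) - 1)
    )

    # Jensen-Shannon divergence (assumed base 2, so it lies in [0, 1]).
    m = [(a + b) / 2 for a, b in zip(p, q)]
    js = 0.5 * _kl(p, m) + 0.5 * _kl(q, m)

    # Assumed definition: absolute difference of the sample means.
    mean_diff = abs(sum(real) / len(real) - sum(simulated) / len(simulated))

    return {"ks": ks, "wasserstein": wasserstein, "js": js, "mean_diff": mean_diff}


if __name__ == "__main__":
    real_answers = [1, 1, 2, 2]        # toy data, not survey results
    simulated_answers = [2, 2, 3, 3]   # toy data, not model output
    print(distribution_distances(real_answers, simulated_answers, [1, 2, 3]))
```

All four values are zero when the simulated answers match the real distribution exactly and grow as the two distributions drift apart, which is why lower is better in the figure.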
Result Figure 2. Comparison across countries shows that DeepPersona consistently attains the lowest KS, Wasserstein, JS Divergence, and Mean Difference values, indicating superior alignment with real human personality distributions and enhanced psychological fidelity.
Result Figure 3. Intrinsic comparisons show that DeepPersona substantially outperforms prior baselines (PersonaHub and OpenCharacter) across all metrics. It achieves the highest mean number of attributes (50.92), indicating far greater persona depth, along with markedly higher uniqueness (4.12) and actionability potential (5.00), reflecting richer diversity and stronger utility for downstream personalization tasks.
Result Figure 4. Comparison of persona modeling methods on World Values Survey responses from Germany. Lower scores across KS, Wasserstein, JS Divergence, and Mean Difference indicate better alignment with real human distributions. DeepPersona consistently achieves the lowest values across all backbone models (DeepSeek-v3, GPT-4o-mini, GPT-4.1, Gemini-2.5-Flash), confirming its robustness, generalizability, and superior fidelity in simulating authentic social behaviors.
Result Figure 5. Radar plots compare DeepPersona with PersonaHub and OpenCharacter across ten personalization metrics, including Personalization Fit, Attribute Coverage, Depth, Justification, and Engagement. Evaluations were conducted under multiple responder–evaluator settings (GPT-4.1-mini, GPT-4.1, GPT-4o, and Gemini-2.5-Flash). DeepPersona consistently achieves the highest overall scores, demonstrating stronger personalization fit, richer contextual grounding, and higher response diversity and actionability across all model configurations.
A Toolkit, Not Just a Dataset
Brief Introduction