Unique Bilingual Persona Corpus for OpenAssistant (EN/ZH, 134+ Samples from GPT-4 Dialogues) #3769
JohnnyHuang1980s
started this conversation in
Ideas
Hi OpenAssistant Team,
My name is John Huang, and I am a senior real estate professional based in Sydney, Australia.
Over the past two months, I’ve engaged in over 100 deep cognitive conversations with GPT-4, generating a unique corpus of around 134 structured bilingual (EN-ZH) persona dialogue samples.
These conversations reflect:
I've selected 6 publicly shareable samples and published them here:
https://github.com/JohnnyHuang1980s/Bilingual-Persona-Corpus-6-Samples.git
This is just a fraction. I'm able to generate thousands more such high-quality training samples in response to specific goals or modeling needs. While I have no programming background, I believe this could contribute to OpenAssistant’s efforts in alignment, co-presence, and real human-level data modeling.
I’d love to explore possibilities for collaboration with your research or dataset team.
Feel free to reach me here or by email:
[email protected]
Thanks for your time and all the great work you're doing!
Best regards,
John Huang
Sydney, Australia