
While reading the latest WSJ interview with the OpenAI CTO, I was 🤔 curious to know what she meant by 😱 “publicly available data” when the interviewer asked about the training data for Sora (their text-to-video generator service). Specifically, I wanted to know whether they use data (like pictures and videos) from public social media accounts to train their models for DALL-E and Sora.
So I asked Sora’s 👪 family member ChatGPT what “publicly available data” really means and whether it includes data from public social media accounts. According to ChatGPT, yes, it does 😶. However, whether OpenAI actually uses this data remains unanswered unless OpenAI discloses it. All we can say is that it is possible OpenAI uses data from public social media accounts to train its models.
I couldn’t find much information on this, so let me know if you come across anything interesting. ✍️