Artificial intelligence firm Anthropic hits out at copyright lawsuit filed by music publishing corporations, claiming the content ingested into its models falls under ‘fair use’ and that any licensing regime created to manage its use of copyrighted material in training data would be too complex and costly to work in practice.
GenAI tools ‘could not exist’ if firms are made to pay copyright
The fact that the “AI” can spit out whole passages verbatim when given the right prompts suggests that there is a big problem here and they haven’t a clue how to fix it.
It’s not “learning” anything other than the probable order of words.
I really hate this reduction of GPT models. Is the model probabilistic? Absolutely. But it isn’t simply learning a comprehensible probability of words; it is generating a massively complex conditional probability sequence for words. Largely, humans might be said to do the same thing. We make a best guess at the sequence of words we decide to use based on conditional probabilities across a myriad of conditions (including the semantics of the thing we want to say).
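To make "conditional probability" concrete, here is a toy sketch. It is only an illustration and not how GPT models are built: it conditions on a single previous word using raw counts, whereas a GPT-style model conditions on the whole preceding context through a learned network. The corpus and the function names `train_bigram` and `next_word_probs` are made up for the example.

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Count how often each word follows each preceding word."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts

def next_word_probs(counts, prev):
    """Return P(next | prev) as a dict; empty if prev was never seen."""
    followers = counts.get(prev, Counter())
    total = sum(followers.values())
    return {w: c / total for w, c in followers.items()} if total else {}

# Hypothetical toy corpus, chosen only for illustration.
corpus = "the cat sat on the mat and the cat ate"
model = train_bigram(corpus)
probs = next_word_probs(model, "the")
print(probs)  # "cat" follows "the" twice, "mat" once
```

Even this one-word version is a conditional distribution rather than a fixed "order of words". The point above is that the real thing conditions on vastly more than the previous word.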
Completely agree. And that should be the focal point of the issue.
Sam Altman is correctly stating that AI is not possible without using copyrighted materials. And I don’t think there’s anything wrong with that.
His mistake is not redirecting the conversation. He should be talking about the efforts they’re making to stop their machine from reproducing copyrighted works. Not whether or not they should be allowed to use it in the first place.
What about these:
https://arxiv.org/abs/2310.02207
https://notes.aimodels.fyi/researchers-discover-emergent-linear-strucutres-llm-truth/
https://notes.aimodels.fyi/self-rag-improving-the-factual-accuracy-of-large-language-models-through-self-reflection/