Which of the following sounds more reasonable?

  • I shouldn’t have to pay for the content that I use to tune my LLM and algorithm.

  • We shouldn’t have to pay for the content we use to train and teach an AI.

By calling it AI, the corporations are able to advocate for a position that’s blatantly pro-corporate and anti-writer/artist, and to trick people into supporting it under the guise of technological progress.

  • pensivepangolin@lemmy.world · 1 year ago

    I think it’s the same reason the CEOs of these corporations are clamoring about their own products being doomsday devices: it gives them massive power over crafting regulatory policy, letting them make sure it’s favorable to their business interests.

    Even more frustrating when you realize, and feel free to correct me if I’m wrong, that these new “AI” programs and LLMs aren’t really novel in terms of theoretical approach: the real revolution is the amount of computing power and data we can now throw at them.

    • eerongal@ttrpg.network · 1 year ago

      Even more frustrating when you realize, and feel free to correct me if I’m wrong, that these new “AI” programs and LLMs aren’t really novel in terms of theoretical approach: the real revolution is the amount of computing power and data we can now throw at them.

      This is 100% true. LLMs, neural networks, Markov chains, gradient descent, and so on down the line are nothing particularly new; they’ve collectively been studied academically for 30+ years. It’s only recently that we’ve been able to throw huge amounts of data, computing capacity, and model-tweaking time at them to achieve results that would have been unthinkable 10-ish years ago.

      There have been efficiency gains, breakthroughs, and tweaks over that time too, but that’s to be expected. Largely, though, it’s the sheer raw size/scale that has only recently become achievable (see the sketch below).
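
      To make that concrete, here’s a toy word-level Markov chain text generator, one of the ancestors of today’s language models, in a few lines of Python. This is just a minimal sketch; the tiny hardcoded corpus stands in for the web-scale data that makes modern models impressive:

      ```python
      import random
      from collections import defaultdict

      # Toy corpus; modern models replace this with terabytes of scraped text.
      corpus = "the cat sat on the mat and the dog sat on the rug".split()

      # Build a word-level Markov chain: map each word to the words
      # observed to follow it.
      chain = defaultdict(list)
      for current_word, next_word in zip(corpus, corpus[1:]):
          chain[current_word].append(next_word)

      # Generate text by repeatedly sampling a plausible next word.
      word = random.choice(corpus)
      output = [word]
      for _ in range(10):
          followers = chain.get(word)
          if not followers:
              break  # dead end: no observed follower for this word
          word = random.choice(followers)
          output.append(word)

      print(" ".join(output))
      ```

      Swap the toy corpus for web-scale data and the lookup table for billions of learned parameters, and the conceptual leap to a modern LLM is smaller than the marketing suggests.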

  • itsnotlupus@lemmy.world · 1 year ago

    I’ll note that there are plenty of models out there that aren’t LLMs and that are also being trained on large datasets gathered from public sources.

    Image generation models, music generation models, etc.
    Heck, it doesn’t even need to be about generation. Music recognition and image recognition models can also be trained on the same sorts of datasets, and arguably come with similar IP rights questions.

    It’s definitely a broader topic than just LLMs, and attempting to exhaustively enumerate the flavors of AIs/models/whatever that should be part of this discussion is fairly futile given the fast-evolving nature of the field.

    • themarty27@lemmy.sdf.org · 1 year ago

      Still, all those models are, even conceptually, far removed from AI. They would more properly be called Machine Learning Models (MLMs).

      • itsnotlupus@lemmy.world · 1 year ago

        The term AI was coined many decades ago to encompass a broad set of difficult problems, many of which have become less difficult over time.

        There’s a natural temptation to remove solved problems from the set of AI problems, so playing chess is no longer AI, diagnosing diseases through a set of expert system rules is no longer AI, processing natural language is no longer AI, and maybe training and using large models is no longer AI nowadays.

        Maybe we do this because we view intelligence as a fundamentally magical property, and anything that has been fully described has necessarily lost all its magic in the process.
        But that means that “AI” can never be used to label anything that actually exists, only to gesture broadly at the horizon of what might come.

  • Iceblade@lemmy.world · 1 year ago

    IMO content created by either AI or LLMs should carry a special license and be considered AI public domain (unless the operator can prove they own all the content the model was trained on). Commercial content based on material under this license would be subject to a flat percentage levy on the product price, earmarked for a fund distributing the proceeds to human creators (coders, writers, musicians, etc.), roughly as sketched below.
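
    As a rough sketch of the arithmetic only (the 5% rate and the names below are made-up placeholders, not part of the proposal):

    ```python
    # Hypothetical rate purely for illustration; the proposal is the
    # mechanism (flat levy -> creator fund), not any specific number.
    AI_CONTENT_LEVY_RATE = 0.05

    def creator_fund_levy(product_price: float) -> float:
        """Levy owed to the human-creator fund for a commercial product
        built on AI-public-domain content."""
        return product_price * AI_CONTENT_LEVY_RATE

    print(creator_fund_levy(30.00))  # a $30 product would owe $1.50
    ```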

    • kklusz@lemmy.world · 1 year ago

      What about LLM generated content that was then edited by a human? Surely authors shouldn’t lose copyright over an entire book just because they enlisted the help of LLMs for the first draft.

      • Cethin@lemmy.zip · 1 year ago

        If you take open-source code licensed under the GNU GPL and modify it, the result retains the GPL. Your argument is like saying it’s fine to take a book and just change some words, and claim it’s totally not plagiarism.

  • lolpostslol@kbin.social · 1 year ago

    It’s just a happy coincidence for them; they call it AI because calling it “a search engine that steals stuff instead of linking to it and blends different sources together to look smarter” wouldn’t be as interesting to clueless financial markets people.