• Buffalox@lemmy.world · 2 days ago

    No, the test is not training; that's a weird thing to claim. The switch is what is being tested, and you disregard that two other tests have shown similar results: an actual decline in critical thinking and problem-solving.

    • FauxLiving@lemmy.world · 20 hours ago

      Here is the paper: https://ai-project-website.github.io/AI-assistance-reduces-persistence/

      "No, the test is not training; that's a weird thing to claim."

      The control group solved 12 questions manually and then the 3 test questions manually. The AI group solved 0 questions manually before the 3 test questions. One group had 12 more manual math tasks to prepare for the manual math test; the other group had 0 and also had to context-switch.

      The AI-assisted group was dealt a context switch, which results in a pretty severe performance loss. A context switch causes a performance loss of around 40% according to this paper, which was peer-reviewed, published by the APA, and is also the most cited paper on the topic: https://www.apa.org/pubs/journals/releases/xhp274763.pdf

      The AI-assisted group also did not have 12 questions to adjust to the new context, like the control group did. If they wanted to wipe out the context-switching performance loss, they should have kept asking questions to see if, after 12 questions, the AI-assisted group had similar performance.

      "The switch is what is being tested, and you disregard that two other tests have shown similar results."

      No, they did not switch what was tested. Here is an image from the actual paper.

      They were given 12 tasks, with one group using AI and the other doing mental math, and then 3 tasks doing mental math. One group had 12 more tasks' worth of preparation than the other.

      Nothing, not even the article in the OP, says that they did math and swapped to reading for the test.

      They did 3 different experiments; in each, they gave 12 tasks, then disabled the AI for one group and gave 3 more tasks as a test. At no point did they ask 12 math questions and then finish with 3 reading questions, or vice versa. They did 2 experiments using math tasks and 1 experiment using reading comprehension tasks.

      So one group had 15 math tasks and one group had 12 ‘how to ask an AI’ tasks and then 3 math questions.

      They also did not control for context-switching losses, which are a well-documented effect (see the APA paper). The proper control would be to continue asking questions so the AI group also had 12 manual math tasks before the test.

      There’s a reason that this is published on arXiv and not in a peer-reviewed journal. Designing a poor quality experiment doesn’t tell you anything useful even if you do multiple different versions of the same experiment.

      This paper demonstrates a lack of a proper control group, specifically a failure to control for context switching performance loss.

      • Buffalox@lemmy.world · 14 hours ago

        The picture you posted contradicts your claims. The two groups are getting the same questions, but one has AI assistance and the other does not.
        Again you fail to show anything to support your claims.

        • Sockenklaus@sh.itjust.works · 4 hours ago

          No, what they meant is: The control group had 12 questions to get into the flow of solving math problems and then solved three more math problems for good measure.

          The AI group, on the other hand, got into the flow of formulating math problems for ChatGPT and then had to actually solve three math problems themselves.

          Their critique is that solving math problems yourself and prompting ChatGPT to solve math problems are not necessarily comparable tasks and require different skill sets, so disabling AI after 12 tasks meant the AI group had to switch context and therefore had worse performance.

          If you want to analyze the AI group's general problem-solving ability, you should give them twelve more tasks after disabling AI, so they get used to this new type of task (solving math problems yourself vs. prompting them to the AI) before measuring their performance.

          • Buffalox@lemmy.world · 4 hours ago

            "The AI group, on the other hand, got into the flow of formulating math problems for ChatGPT and then had to actually solve three math problems themselves."

            That’s what the friggin test is about! So of course they did.

        • FauxLiving@lemmy.world · 11 hours ago

          I also wrote text.

          If you’re just going to cherry pick a single point and dismiss everything else then we’re done here.

    • Womble@piefed.world · 2 days ago

      The switch is what is being tested, yes, but it is not clear whether what is being measured in the switch is "AI fried their brains" rather than "context switching in the middle of a test". If they wanted to make that point, it would be useful to run the maths test with a calculator group who also got it yanked halfway through. That way we would be able to see what proportion of the effect is over-dependence on AI removing critical thinking, and what amount is having your methods disrupted mid-task.
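      The arithmetic behind that proposed design can be sketched as follows (all numbers hypothetical, purely illustrative): a calculator arm that also loses its tool mid-test would absorb the generic disruption effect, so the remaining gap between the calculator arm and the AI arm would isolate any AI-specific decline.

```python
# Hypothetical illustration of the proposed three-arm design.
# SWITCH_PENALTY models generic "tool yanked mid-test" disruption;
# AI_SKILL_DECLINE models the AI-specific effect the study wants to
# measure. Both numbers are made up for the sketch.

SWITCH_PENALTY = 0.4      # generic mid-task disruption (hypothetical)
AI_SKILL_DECLINE = 0.2    # AI-specific decline, the effect under study (hypothetical)

def expected_score(baseline, switched, ai_decline=0.0):
    """Multiplicative toy model of test-phase performance."""
    score = baseline
    if switched:
        score *= (1.0 - SWITCH_PENALTY)
    return score * (1.0 - ai_decline)

manual_arm = expected_score(10.0, switched=False)                          # no tool, no switch
calculator_arm = expected_score(10.0, switched=True)                       # calculator yanked mid-test
ai_arm = expected_score(10.0, switched=True, ai_decline=AI_SKILL_DECLINE)  # AI yanked mid-test

# The calculator arm absorbs the disruption effect, so the remaining
# gap (calculator_arm - ai_arm) is the AI-specific part.
print(manual_arm, calculator_arm, ai_arm)  # 10.0 6.0 4.8
```

      With only two arms (manual vs. AI), the disruption and skill-decline terms are confounded; the third arm is what makes them separable.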

      • Buffalox@lemmy.world · 1 day ago

        The calculator test might be good for comparison, and I'm pretty sure that, given the same amount of time, a group allowed to use a calculator for half the test would solidly outperform a group not using calculators at all.

        I was in 5th grade in 1975, and we were the first class to get calculators in 5th grade, which became the standard for many years after.
        I have never heard complaints about students being less capable of understanding basic math problems because they use calculators, although the idea of using calculators in schools was heavily debated. It's similar to people not getting worse at spelling from using a dictionary.

    • iglou@programming.dev · 2 days ago

      Not training, no, but warm-up. And no, it is not about critical thinking; it's about reading comprehension and calculation.