Can Artificial Intelligence Be Relied upon to Teach Mathematics? Researchers Answer

Artificial intelligence has entered almost every field, most notably education. Many AI tools have been developed to enhance the educational experience: some are used in classrooms, and others are used by students to help with homework and to understand course material. With the rapid development of artificial intelligence, there are aspirations to use it to teach some subjects, including mathematics.

But one of the main problems with using AI in education is hallucination, which refers to AI models generating content that is false or has no basis in reality. Mathematics in particular leaves a lot of room for such errors.

Math errors can do real harm. Students might memorize incorrect solutions, and teachers who use ChatGPT and other generative AI models to write tests or lesson plans can run into serious trouble. A teacher can at least review the output of an AI tool before presenting it to students; the danger is greater when students are asked to learn directly from an AI tool.

Some experts are trying to combat these errors through a process they call “AI desensitization,” with the goal of helping AI tools like ChatGPT become more widely adopted in math teaching.

What method have experts used to mitigate AI hallucinations in mathematics?

In a study published in May 2024, two researchers from the University of California, Berkeley School of Education documented how they reduced ChatGPT’s errors in algebra to nearly zero. They did not achieve the same result with statistics, where their technique still left a 13% error rate.

In this study, Berkeley computer scientist Zachary Pardos and one of his students, Shreya Bhandari, first asked ChatGPT to explain how to solve an algebra equation. They found that ChatGPT gave detailed information about the steps it used to arrive at the answer, and they didn’t have to ask it to explain the steps of the solution. However, these lengthy answers didn’t help with accuracy. ChatGPT’s answers were wrong a third of the time, meaning that ChatGPT would get a D if it were a student.

The researchers took advantage of the fact that ChatGPT does not give the same answer every time. They asked it to answer the same math problem 10 times in a row, and its response differed each time.

The researchers grouped matching final answers together and evaluated their accuracy. They found that the chatbot gave the correct answer seven times out of ten, with different details in the solution each time, meaning that it relied on different methods to reach its answers.
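The repeated-sampling idea can be illustrated with a minimal Python sketch. It assumes that final answers are grouped by exact string match and that the most frequent one is kept (a majority vote); the article does not spell out the selection rule, so that part is an assumption. The names `self_consistent_answer` and `scripted_model` are hypothetical, and `scripted_model` simply replays canned answers in place of a real chatbot call.

```python
from collections import Counter
from typing import Callable, List, Tuple


def self_consistent_answer(
    ask: Callable[[str], str], problem: str, n_samples: int = 10
) -> Tuple[str, float]:
    """Ask the same problem n_samples times and keep the most common final answer.

    Returns the winning answer and the share of samples that agreed with it.
    Assumption: answers are compared by exact string match after stripping
    surrounding whitespace.
    """
    answers = [ask(problem).strip() for _ in range(n_samples)]
    votes = Counter(answers)
    best, count = votes.most_common(1)[0]
    return best, count / n_samples


def scripted_model(answers: List[str]) -> Callable[[str], str]:
    """Hypothetical stand-in for a chatbot: replays a fixed list of answers."""
    remaining = iter(answers)

    def ask(problem: str) -> str:
        return next(remaining)

    return ask


if __name__ == "__main__":
    # Seven matching answers and three stray ones, mirroring the
    # "seven times out of ten" pattern described in the article.
    demo = scripted_model(["x = 4"] * 7 + ["x = 3", "x = 5", "x = 3"])
    print(self_consistent_answer(demo, "Solve 2x + 1 = 9"))
```

The design choice here is that no single response is trusted: a wrong answer only wins if the model repeats the same mistake more often than it produces the correct result.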

After adopting this method of asking the chatbot the same question repeatedly, the researchers found that ChatGPT was very good at algebra. Its error rate fell from 25% to 0% in basic high school algebra, from 47% to 2% in middle school algebra, and from 27% to 2% in college algebra.

But when the researchers applied this method, which they call “self-consistency,” to statistics problems, it didn’t work as well. ChatGPT’s error rate dropped from 29% to 13%, which is still far too high for students who might use it to learn math.
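To put the reported figures side by side, the following short Python sketch computes the drop in percentage points and the relative reduction for each subject. The numbers are the ones quoted in this article; the function names and the dictionary layout are illustrative only.

```python
# (before, after) error rates in percent, as reported in the article.
ERROR_RATES = {
    "high school algebra": (25, 0),
    "middle school algebra": (47, 2),
    "college algebra": (27, 2),
    "statistics": (29, 13),
}


def point_drop(before: float, after: float) -> float:
    """Absolute reduction in percentage points."""
    return before - after


def relative_drop(before: float, after: float) -> float:
    """Fraction of the original errors that were eliminated."""
    return (before - after) / before


if __name__ == "__main__":
    for subject, (before, after) in ERROR_RATES.items():
        print(
            f"{subject}: {point_drop(before, after)} points, "
            f"{relative_drop(before, after):.0%} of errors removed"
        )
```

Viewed this way, the technique removed nearly all of the algebra errors but only a little over half of the statistics errors.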

So, can ChatGPT be relied upon at this time to teach mathematics?

The second part of the study asked whether the solutions provided by ChatGPT could help students learn math better than traditional teaching. The researchers asked 274 adults online to solve the same math problems they had posed to the chatbot, and randomly selected a third of them to see ChatGPT’s solutions as hints if they needed them.

The participants then took another test. The performance of adults who saw ChatGPT’s hints improved by 17%, compared with a 12% improvement among adults who got hints written by their undergraduate math teachers. Those who were not given any hints scored about the same on both tests.

These results led the study’s authors to predict that large language models could be improved for teaching mathematics. Chatbots based on advanced large language models, such as ChatGPT, could take in an entire chapter of a textbook and present it to students in place of the teacher.

However, neither ChatGPT nor any other chatbot can currently be used to teach mathematics or any other subject independently, without human intervention. These tools still sometimes provide incorrect solutions that can harm students’ understanding of the material, especially students in the primary grades.

masrawysat
