Post by account_disabled on Sept 14, 2023 10:56:21 GMT
The rise of large language models (LLMs) like GPT-4, which generate text with great fluency and confidence, is certainly striking. But the over-packaging is no less striking. Microsoft researchers claim that OpenAI's GPT-4 model shows "glimmers of artificial general intelligence (AGI)." Sorry, Microsoft, but that's not the case.
Presumably, Microsoft's claim is not accounting for the so-called 'hallucination' phenomenon, in which the model produces incorrect text without hesitation.
Additionally, GPT is bad at games like chess and Go, is not good at math, and can produce code riddled with errors and bugs. That does not mean all LLMs, or GPT itself, are pure hype. Not at all. It does mean, however, that we need a certain sense of balance in the discussion surrounding generative AI, and that the exaggeration should be dialed way down.
According to an article in IEEE Spectrum, some experts, including OpenAI co-founder and chief scientist Ilya Sutskever, believe that adding reinforcement learning with human feedback can eliminate LLM hallucinations. But others, including Yann LeCun, chief AI scientist at Meta, and Geoffrey Hinton, the "godfather of deep learning" who recently left Google, argue that current large language models are fundamentally flawed. Both believe that LLMs lack the non-linguistic knowledge essential to understanding the reality that language describes.
“For everything from playing games to writing code, reinforcement learning models that are small, fast, and cheap to run easily outperform LLMs with hundreds of billions of parameters,” Diffblue CEO Matthew Lodge told InfoWorld.
So maybe you're looking for gold in the wrong places?
Just a game?
As Lodge says, we may be pushing generative AI into areas where reinforcement learning does much better. Games are a prime example. Watch the video of chess International Master Levy Rozman playing chess against ChatGPT: ChatGPT makes absurd moves, such as capturing its own pieces, and even plays illegal moves. Meanwhile, Stockfish, the open source chess software that (in its classical configuration) does not use neural networks at all, beat ChatGPT in 10 moves. It is a good illustration of how far LLMs fall short of their hype.
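To see the illegal-move problem concretely, you can validate an LLM's proposed moves against a real rules engine. Here is a minimal sketch using the python-chess library; llm_propose_move is a hypothetical stand-in for whatever chat API you would call, not part of any real SDK.

# pip install chess
import chess

def llm_propose_move(board: chess.Board) -> str:
    """Hypothetical stand-in for a chat-API call. Here it simply
    returns a deliberately illegal move to exercise the check below."""
    return "Ke5"  # the king cannot jump to e5 from the starting position

board = chess.Board()
san = llm_propose_move(board)
try:
    move = board.parse_san(san)  # raises ValueError if illegal or malformed
    board.push(move)
except ValueError:
    # This branch is exactly where ChatGPT's "absurd moves" land:
    # the text looks like chess, but the rules engine rejects it.
    print(f"LLM proposed an illegal move: {san!r}")

The point of the sketch is that legality is checkable in microseconds by a rules engine, yet a fluent text generator has no such constraint built in.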
Google's AlphaGo is based on reinforcement learning. Reinforcement learning generates and tries multiple candidate solutions to a problem, then uses the results to improve the next proposal, and this process is repeated thousands of times to converge on the best result. In AlphaGo, the AI tries various moves and predicts whether each move is good and whether it is likely to win from that position. It uses this feedback to follow winning sequences and to generate further candidate moves. This process is called probabilistic search. The method is very effective at game play: AlphaGo defeated several top professional Go players. AlphaGo is not perfect, but it far outperforms today's best LLMs at the game.
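The generate-evaluate-improve loop described above is easy to sketch. Below is a toy version in Python: it is not AlphaGo (which combines Monte Carlo tree search with learned policy and value networks), just a minimal illustration of iterating on candidates and keeping whatever scores best; the score function is a made-up objective for demonstration.

import random

def score(candidate: list[float]) -> float:
    """Toy objective: higher is better. In AlphaGo the analogue is a
    value estimate of how likely a position is to lead to a win."""
    return -sum((x - 0.5) ** 2 for x in candidate)

def propose(candidate: list[float]) -> list[float]:
    """Generate a variation of the current best candidate."""
    return [x + random.gauss(0, 0.1) for x in candidate]

best = [random.random() for _ in range(5)]
for _ in range(10_000):          # "repeated thousands of times"
    trial = propose(best)
    if score(trial) > score(best):  # keep proposals that do better
        best = trial

print(best)  # converges toward [0.5, 0.5, 0.5, 0.5, 0.5]

The design point is the feedback loop itself: every trial is scored, and only improvements survive, which is what lets small, cheap models grind their way to strong results.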
Probability vs. Accuracy
When presented with evidence that LLMs are significantly inferior to other types of AI at these tasks, proponents answer that LLMs will “get better in the future.” But as Lodge noted, “For this argument to be true, we need to understand why LLMs are better at this type of work, and that’s difficult.” It is difficult because no one can predict what output GPT-4 will produce for a given prompt; the model cannot be explained by humans. That, Lodge argues, is why prompt engineering is pointless. He also pointed out that it is hard for AI researchers to prove that LLMs' emergent properties exist at all, and harder still to predict them.
Perhaps the best counterargument is inductive: GPT-4 is larger than GPT-3 and better at some language tasks, so wouldn't an even larger model be better still? Not necessarily. According to Lodge, “The problem is that GPT-4 struggles at the same tasks that GPT-3 struggled with.” One of them is mathematics: GPT-4 is better at addition than GPT-3 but is still poor at other operations, including multiplication.
Making the language model larger does not magically solve these chronic problems, and even OpenAI has said that a bigger model is not the answer. The reason, as discussed on the OpenAI Forum, lies in the fundamental characteristic of LLMs mentioned earlier: “Large language models are inherently probabilistic and operate by generating highly probable outputs based on patterns observed in the training data. For math and physics problems, there is usually only one correct answer, and the probability of generating this one answer can be very low.”
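You can see this core issue in miniature by sampling from a toy next-token distribution. The sketch below is purely illustrative; the probabilities are invented for demonstration and are not taken from any real model.

import random

# Invented next-token distribution for the prompt "7 * 8 =".
# A real model assigns probabilities over tens of thousands of tokens;
# the point is only that the single correct answer need not dominate.
next_token_probs = {"56": 0.30, "54": 0.25, "58": 0.20, "48": 0.15, "63": 0.10}

def sample(probs: dict[str, float]) -> str:
    """Draw one token in proportion to its probability."""
    return random.choices(list(probs), weights=list(probs.values()), k=1)[0]

trials = 10_000
correct = sum(sample(next_token_probs) == "56" for _ in range(trials))
print(f"Sampled the correct answer about {correct / trials:.0%} of the time")
# A calculator is right 100% of the time; a sampler is right only as
# often as the correct token's probability mass allows.

A deterministic tool always returns the one correct answer; a probabilistic generator returns it only as often as its training-data patterns make it likely, which is exactly the gap the OpenAI Forum quote describes.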