ChatGPT Competitors: Amazon Jumps Into the Fray With Generative AI That Beats GPT-3.5
Just over two months ago, OpenAI made ChatGPT available to the general public, thrusting the AI-powered chatbot into the centre of popular conversation and igniting discussions about how it can change business, education, and other areas.
Then, Google and Chinese internet behemoth Baidu debuted their chatbots to demonstrate to the public that their so-called “generative AI” (technology that can create conversational text, visuals, and more) was also ready for general use. Now Amazon’s new language models exceed GPT-3.5’s accuracy (75.17%) on the ScienceQA benchmark by 16 percentage points, also surpassing human performance.
The ScienceQA benchmark is a sizable collection of annotated science questions spanning several modalities, comprising almost 21,000 multimodal multiple-choice questions (MCQs). Recent technological advances enable large language models (LLMs) to work effectively on tasks demanding complex reasoning. This is achieved with the chain-of-thought (CoT) prompting approach, which elicits logically sequential intermediate steps that show how to arrive at an answer.
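For intuition, here is a minimal sketch of what a CoT prompt looks like; the questions and worked steps below are illustrative examples, not items drawn from ScienceQA.

```python
# A minimal chain-of-thought (CoT) prompt. The exemplar answer spells out
# its intermediate reasoning steps, nudging the model to reason step by
# step before committing to an answer. The questions are illustrative only.
cot_prompt = """\
Q: A train travels 60 km in 1.5 hours. What is its average speed?
A: Average speed is distance divided by time.
   60 km / 1.5 h = 40 km/h. The answer is 40 km/h.

Q: A cyclist covers 45 km in 3 hours. What is her average speed?
A:"""

# In practice this string would be sent to an LLM completion endpoint; the
# model is expected to emit its own intermediate steps ("45 km / 3 h =
# 15 km/h") before stating the final answer.
print(cot_prompt)
```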
However, the majority of recent CoT research examines only the language modality; to study CoT reasoning across modalities, researchers frequently employ the Multimodal-CoT paradigm. Multimodality involves multiple kinds of input, such as language and images. Even when the inputs span modalities, Multimodal-CoT decomposes multi-step problems into intermediate reasoning steps that lead to the final response.
One of the most popular ways of performing Multimodal-CoT is to aggregate the data from all modalities into a single modality before asking an LLM to perform CoT. This approach has drawbacks, however, chief among them the significant amount of information lost when converting data between formats. The alternative is to fine-tune small language models that perform CoT reasoning in multimodality by fusing language and visual features.
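As a rough sketch of the single-modality workaround described above (and of where the information loss occurs), consider reducing the image to a caption before querying a text-only LLM; `caption_image` and `query_llm` here are hypothetical stand-ins, not real APIs.

```python
# Sketch of the "aggregate everything into one modality" approach: the
# image is reduced to a text caption, which is concatenated with the
# question before a text-only LLM is queried. Any visual detail the
# captioner omits is lost at this step.
def caption_image(image_path: str) -> str:
    # Hypothetical stand-in for an image-captioning model.
    return "A diagram of the water cycle with arrows rising from the sea."

def query_llm(prompt: str) -> str:
    # Hypothetical stand-in for a text-only LLM call.
    return "Water rises from the sea as vapour. The answer is evaporation."

def single_modality_cot(image_path: str, question: str) -> str:
    caption = caption_image(image_path)
    prompt = f"Context: {caption}\nQuestion: {question}\nLet's think step by step."
    return query_llm(prompt)

print(single_modality_cot("water_cycle.png", "Which process moves water into the air?"))
```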
The framework separates the reasoning process into two stages: generating a rationale and inferring the answer. Because vision is incorporated into both phases, the model’s rationales are better grounded, which in turn makes its judgements about the final answer more accurate.
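A minimal sketch of that two-stage flow, assuming `generate_rationale` and `infer_answer` stand in for the two passes of the same fine-tuned vision-language model:

```python
# Two-stage Multimodal-CoT as described above: stage 1 produces a rationale
# from the (text, vision) inputs; stage 2 appends that rationale to the
# original text input and infers the final answer, again with vision. Both
# functions are hypothetical stand-ins for fine-tuned model calls.
def generate_rationale(text: str, vision_features) -> str:
    return "The arrows point from the sea to the clouds, so water rises as vapour."

def infer_answer(text_with_rationale: str, vision_features) -> str:
    return "evaporation"

def multimodal_cot(question: str, vision_features) -> str:
    rationale = generate_rationale(question, vision_features)   # stage 1
    augmented = f"{question}\nRationale: {rationale}"           # splice rationale in
    return infer_answer(augmented, vision_features)             # stage 2

print(multimodal_cot("Which process moves water into the air?", vision_features=None))
```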
It is the first study of its kind to examine CoT reasoning across different modalities. The method, as described by Amazon researchers, exhibits state-of-the-art performance on the ScienceQA benchmark, beating GPT-3.5’s accuracy by 16 percentage points and surpassing human performance.
How does it perform better?
Multimodal-CoT uses the same model architecture for its rationale-generation and answer-inference stages; only the inputs and outputs differ. In the rationale-generation stage, for instance, a vision-language model is fed input from both the visual and linguistic domains.
In the answer-inference stage, the rationale is appended to the original language input to form the language input for the next stage. Put simply, a Transformer encoder produces a textual representation of the input text; this representation is combined with the visual representation, and the result is fed to the Transformer decoder.
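A rough PyTorch sketch of this encode-fuse-decode step; the dimensions, the cross-attention, and the sigmoid gate below are illustrative assumptions, not the exact published architecture.

```python
import torch
import torch.nn as nn

d_model, n_text, n_patches = 256, 16, 49
text_embeds = torch.randn(1, n_text, d_model)      # embedded question tokens
vision_feats = torch.randn(1, n_patches, d_model)  # e.g. projected ViT patch features

# Encode the text with a small Transformer encoder.
enc_layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
text_enc = nn.TransformerEncoder(enc_layer, num_layers=2)(text_embeds)

# Cross-attention: textual representations query the vision features.
cross_attn = nn.MultiheadAttention(d_model, num_heads=8, batch_first=True)
attended_vision, _ = cross_attn(query=text_enc, key=vision_feats, value=vision_feats)

# Gated fusion: a sigmoid gate decides, per position, how much visual
# signal to mix into the textual representation.
gate = torch.sigmoid(nn.Linear(2 * d_model, d_model)(
    torch.cat([text_enc, attended_vision], dim=-1)))
fused = (1 - gate) * text_enc + gate * attended_vision

# The fused sequence serves as encoder memory for a Transformer decoder,
# which would generate the rationale (stage 1) or the answer (stage 2).
dec_layer = nn.TransformerDecoderLayer(d_model, nhead=8, batch_first=True)
decoder = nn.TransformerDecoder(dec_layer, num_layers=2)
target = torch.randn(1, 8, d_model)                # embedded target tokens (toy)
print(decoder(tgt=target, memory=fused).shape)     # torch.Size([1, 8, 256])
```

The gating reflects the idea that vision should inform, rather than overwrite, the language representation; in training, the target tokens would come from teacher-forced rationale or answer sequences.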
The researchers conducted several experiments on ScienceQA to check the effectiveness of their methodology, concluding that their approach outperforms the previous state of the art, GPT-3.5, on the benchmark by 16 percentage points. In a nutshell, Amazon researchers investigated the problem of eliciting Multimodal-CoT reasoning and addressed it by proposing a two-stage architecture that integrates visual and language representations while running Multimodal-CoT. The model produces informative rationales that help it determine the final answers.
In their work, the Amazon researchers show how the use of visual features aids the generation of more persuasive rationales, which in turn contribute to more precise answer inference. They show that models with under one billion parameters beat GPT-3.5 on the ScienceQA benchmark by 16 percentage points using Multimodal-CoT. Their error analysis suggests that future research may benefit from utilising more effective visual features, injecting commonsense knowledge, and applying filtering techniques to enhance CoT reasoning.
Don’t Chat With ChatGPT: Amazon’s Employee Warning
Recently, Amazon issued a warning to its staff members regarding ChatGPT, the chatbot created by OpenAI. According to an internal chat group seen by Business Insider, Amazon employees have been regularly using ChatGPT, an AI-powered chatbot that can resolve difficult problems in a matter of seconds, for both research and problem-solving. However, Amazon has advised its employees not to share any confidential information with the chatbot.
A corporate counsel at Amazon reportedly forewarned staff about ChatGPT after observing responses from the chatbot that closely resembled confidential Amazon data. This is crucial, the lawyer reportedly added, because employee inputs could be used as training data for future versions of ChatGPT, and Amazon would not want the application’s output to include or resemble its confidential information.
Amazon has warned its staff against sharing sensitive information with ChatGPT because such inputs may be collected as training data for future iterations. Employees have nonetheless found the chatbot useful, from producing training materials and answering customer-service questions to reviewing code. However, there is a chance that ChatGPT will return false or fabricated information, and although OpenAI, the company that created ChatGPT, may add new capabilities later, errors in generated code could lead to the chatbot being misused.
Additionally, with rumours that Google is developing its own ChatGPT competitor, similar questions arise about the security of data shared with such services. Given all of the above, Amazon employees should use ChatGPT with additional caution, since it has the potential to leak sensitive information or provide inaccurate data.
Edited by Prakriti Arora