ChatGPT v Bard v Bing v Claude 2 v Aria v human-expert. How good are AI chatbots at scientific writing?
September 14, 2023 · Declared Dead · Future Internet
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Edisa Lozić, Benjamin Štular
arXiv ID
2309.08636
Category
cs.CL: Computation & Language
Cross-listed
cs.AI, cs.CY, cs.ET, cs.HC
Citations
49
Venue
Future Internet
Last Checked
4 months ago
Abstract
Historical emphasis on writing mastery has shifted with advances in generative AI, especially in scientific writing. This study analysed six AI chatbots for scholarly writing in humanities and archaeology. Using methods that assessed factual correctness and scientific contribution, ChatGPT-4 showed the highest quantitative accuracy, closely followed by ChatGPT-3.5, Bing, and Bard. However, Claude 2 and Aria scored considerably lower. Qualitatively, all AIs exhibited proficiency in merging existing knowledge, but none produced original scientific content. Interestingly, our findings suggest ChatGPT-4 might represent a plateau in large language model size. This research emphasizes the unique, intricate nature of human research, suggesting that AI's emulation of human originality in scientific writing is challenging. As of 2023, while AI has transformed content generation, it struggles with original contributions in humanities. This may change as AI chatbots continue to evolve into LLM-powered software.
Similar Papers
In the same crypt: Computation & Language
Old Age
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Old Age
XLNet: Generalized Autoregressive Pretraining for Language Understanding
The Ethereal
Effective Approaches to Attention-based Neural Machine Translation
Old Age
A large annotated corpus for learning natural language inference
Old Age
HellaSwag: Can a Machine Really Finish Your Sentence?
Died the same way: Ghosted
R.I.P. Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P. Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P. Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P. Ghosted