LARGE LANGUAGE MODELS

Large language models (LLMs) are neural networks trained on vast text corpora to learn statistical patterns of language and then generate or analyze text. They rely on transformer architectures with self-attention, which allow the model to weigh relationships among all tokens in a sequence and capture long-range dependencies. Training uses next-token prediction or masked language modeling on internet-scale datasets, producing systems that can generalize across many tasks without task-specific supervision.
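
To make the self-attention step concrete, here is a minimal sketch of scaled dot-product self-attention in NumPy. The toy dimensions and random weight matrices are illustrative assumptions rather than parameters of any real model; production transformers add multiple attention heads, learned output projections and causal masking on top of this core operation.

    # Minimal sketch of scaled dot-product self-attention (illustrative only).
    import numpy as np

    def self_attention(x, w_q, w_k, w_v):
        """x: (seq_len, d_model); w_q/w_k/w_v: (d_model, d_head)."""
        q = x @ w_q                                  # queries
        k = x @ w_k                                  # keys
        v = x @ w_v                                  # values
        scores = q @ k.T / np.sqrt(k.shape[-1])      # pairwise token affinities
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)   # softmax over keys
        return weights @ v                           # each token mixes information from all tokens

    rng = np.random.default_rng(0)
    seq_len, d_model, d_head = 5, 16, 8              # toy sizes, not from any real model
    x = rng.standard_normal((seq_len, d_model))
    out = self_attention(x,
                         rng.standard_normal((d_model, d_head)),
                         rng.standard_normal((d_model, d_head)),
                         rng.standard_normal((d_model, d_head)))
    print(out.shape)                                 # (5, 8)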

A key property is in-context learning: without changing their weights, LLMs can adapt to new tasks given a few examples in the prompt, effectively performing meta-learning. Scaling laws show that performance improves predictably with model size, data volume and compute, motivating ever-larger models. At sufficient scale, models display emergent abilities such as chain-of-thought reasoning, code synthesis and multi-step question answering that are weak or absent in smaller systems.
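
As a rough illustration of how such scaling laws are expressed, the sketch below evaluates a power-law loss curve of the form L(N, D) = E + A / N^alpha + B / D^beta, where N is the parameter count and D is the number of training tokens. The constants are illustrative assumptions chosen only to show the shape of the relationship, not fitted values from any published study.

    # Illustrative scaling-law curve: loss falls predictably as parameters and data grow.
    def predicted_loss(n_params, n_tokens,
                       E=1.7, A=400.0, B=410.0, alpha=0.34, beta=0.28):
        # All constants here are made-up placeholders for demonstration.
        return E + A / n_params**alpha + B / n_tokens**beta

    for n, d in [(1e8, 2e9), (1e9, 2e10), (1e10, 2e11)]:
        print(f"N={n:.0e}, D={d:.0e} -> predicted loss {predicted_loss(n, d):.3f}")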

Despite strong capabilities, LLMs hallucinate, producing plausible but incorrect statements because they rely on learned statistical patterns rather than grounded world models. Safety research targets harmful content, biases, privacy risks and prompt injection attacks, using alignment techniques such as instruction tuning and reinforcement learning from human feedback. Evaluation is challenging because benchmarks can become saturated or contaminated by training data, so newer work emphasizes robustness, fairness and real-world usefulness.
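
One concrete piece of that alignment pipeline is the pairwise preference loss commonly used to train reward models for reinforcement learning from human feedback: the reward assigned to the human-preferred response should exceed the reward assigned to the rejected one. The sketch below shows that objective; the numeric scores are hypothetical reward-model outputs, not from any real system.

    # Pairwise preference loss for reward-model training (Bradley-Terry style objective).
    import math

    def preference_loss(reward_chosen, reward_rejected):
        # -log sigmoid(r_chosen - r_rejected): small when the model already ranks
        # the human-preferred response higher, large when the ranking disagrees.
        return -math.log(1.0 / (1.0 + math.exp(-(reward_chosen - reward_rejected))))

    print(preference_loss(2.3, 0.4))   # ~0.14: ranking agrees with the human label
    print(preference_loss(0.1, 1.8))   # ~1.87: ranking disagrees, so the loss is large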

Recent efforts explore retrieval augmentation, tool use and modular architectures to integrate external knowledge, improve factuality and enable planning. Ongoing research also examines data efficiency, multilingual performance, and the environmental and social impacts of large-scale training.
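
A minimal sketch of the retrieval-augmentation idea is shown below: embed the query, rank stored passages by cosine similarity, and prepend the best matches to the prompt before generation. The embed function here is a stand-in that derives a pseudo-random vector from each text's hash; a real system would use a trained embedding model and a vector index.

    # Toy retrieval augmentation: retrieve similar passages and prepend them to the prompt.
    import hashlib
    import numpy as np

    def embed(text, dim=64):
        # Stand-in embedding: a deterministic pseudo-random unit vector per text.
        seed = int.from_bytes(hashlib.sha256(text.encode()).digest()[:4], "big")
        v = np.random.default_rng(seed).standard_normal(dim)
        return v / np.linalg.norm(v)

    def retrieve(query, passages, k=2):
        q = embed(query)
        # Cosine similarity reduces to a dot product because the vectors are unit length.
        return sorted(passages, key=lambda p: -float(embed(p) @ q))[:k]

    passages = [
        "Transformers use self-attention to relate all tokens in a sequence.",
        "Scaling laws relate loss to model size, data, and compute.",
        "Retrieval augmentation injects external documents into the prompt.",
    ]
    question = "How do transformers handle long-range dependencies?"
    context = retrieve(question, passages)
    prompt = "Context:\n" + "\n".join(context) + "\n\nQuestion: " + question
    print(prompt)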