arXiv cs.AI AI News
AINews has published 367 articles sourced from arXiv cs.AI, tracking recurring developments and primary announcements. Key themes include conflict-aware guidance, diffusion models, and flow models.
Overview
Published articles: 367
Latest update: May 22, 2026
Quality score: 9
Topic diversity: 1505
Primary host: arxiv.org
Latest coverage from arXiv cs.AI
For years, inference-time guided sampling has faced a critical bottleneck: when a model must satisfy multiple constraints simultaneously—like a drug molecule needing high target af…
The data engineering world has hit a wall. Traditional AI agents tasked with building data infrastructure rely on a brute-force loop: write code, run it, parse error logs, fix bugs…
The industrial sector has been quietly suffering from a 'latency disaster' as AI agents, tasked with querying sensor data, work orders, fault libraries, and predictive models, repe…
AINews has learned that Mahjax, a novel GPU-accelerated mahjong simulator, has been officially released. Built on Google's JAX framework, it is purpose-designed for reinforcement l…
For decades, negotiation theory has been a discipline built on case studies, intuition, and the hard-won wisdom of practitioners. The fundamental tension—balancing empathy for the …
For the past two years, the AI agent ecosystem has been trapped in a single-metric arms race. Benchmarks like GAIA, SWE-bench, and ToolBench each measure a narrow slice of agent pe…
A new research breakthrough has demonstrated that neural networks can be trained to produce high-quality numerical embeddings for logical statements using triplet loss, a technique…
A groundbreaking research framework, OSCToM (Opponent-Structured Counterfactual Theory of Mind), is redefining how we measure AI's ability to understand others' mental states. Unli…
The industrial design world has long suffered from a 'semantic gap': the stress distributions, thermal fields, and flow streamlines output by CAE simulations must be manually trans…
The AI community is witnessing a fundamental shift in how intelligent agents can operate in dynamic, real-world environments. SOLAR, a novel autonomous agent architecture, directly…
Natural language to SQL (NL2SQL) has long faced an awkward reality: while large language models can grasp human intent, their error rates on multi-table joins, aggregate functions,…
Large language model training has become a high-stakes gamble. Aggressive learning rates, parameter stress at scale, and runtime anomalies routinely cause training runs to diverge …
For years, the promise of Personal Health Records (PHRs) has been hollow: patients own their data but cannot understand it. A landmark study, analyzing 2,257 authentic user queries…
For years, the document intelligence field has suffered a glaring disconnect: academia releases ever-more-powerful understanding models, while production teams struggle to maintain…
The current state of large language model (LLM) development is plagued by a fundamental irony: we feed models terabytes of data but understand almost nothing about how individual d…
PopuLoRA represents a paradigm shift in how large language models (LLMs) can autonomously improve their reasoning capabilities. Traditional self-play methods, where a single model …
The fundamental limitation of current AI world models is their tendency to learn superficial semantic correlations—mapping inputs to outputs—rather than the underlying causal laws …
GRID represents a paradigm shift in how security knowledge graphs are built. For years, the cybersecurity industry has struggled to transform the vast, unstructured flow of threat …
The AI industry has been locked in a race to expand context windows, with models like GPT-4 Turbo boasting 128K tokens and Gemini 1.5 Pro reaching 1 million. Yet a deeper, more ins…
MetaKGEnrich represents a fundamental shift in how AI systems handle their own limitations. Instead of relying on human-curated datasets or expensive retraining, this pipeline equi…
LinAlg-Bench, a rigorous new benchmark for mathematical reasoning, has delivered a sobering verdict on the current generation of large language models. By testing 10 frontier model…
Multimodal AI has long faced a fundamental trade-off: chain-of-thought (CoT) reasoning dramatically improves embedding quality, but the computational cost of generating full reason…
The discovery of new materials has long been a bottleneck in industries from energy storage to microelectronics. Traditional approaches—high-throughput screening of known databases…
A landmark study on LLM-based negotiation agents has uncovered a startling asymmetry: these models can infer an opponent's hidden preferences — such as whether they value price ove…