New Model Releases A

  • arXiv cs.CL (Computation and Language) · EN New Model Releases extract
    Freeing the Law with LOCUS: A Local Ordinance Corpus for the United States
    LOCUS releases a US local-ordinance corpus for legal AI
    Deep Learning Meta Retrieval-Augmented Generation (RAG) Reinforcement Learning
    Progress in legal AI depends on authoritative legal text at scale, yet US local ordinances—a consequential layer of American law—are largely missing from machine-readable corpora. The authors build LOCUS, a corpus of US local ordinances, to broaden legal-AI research data.
    Read original (arXiv cs.CL (Computation and Language)) ↗
  • arXiv cs.AI (Artificial Intelligence) · EN Training & Fine-tuning extract
    UBP2: Uncertainty-Balanced Preference Planning for Efficient Preference-based Reinforcement Learning
    UBP2: uncertainty-balanced planning for efficient preference-based RL
    Meta Neural Network Reinforcement Learning
    Preference-based RL learns reward models from pairwise behavior comparisons, bypassing explicit reward design, but existing methods often rely on passive data collection. UBP2 introduces uncertainty-balanced preference planning to actively select comparisons and learn efficiently from fewer preferences.
    Read original (arXiv cs.AI (Artificial Intelligence)) ↗
  • arXiv cs.LG (Machine Learning) · EN New Model Releases extract
    Optimal scenario design for climate emulation
    Optimal scenario design improves climate emulation surrogates
    AI Agents Deep Learning
    As deep learning for physical systems grows, efforts to improve generalizability have focused on architectures embedding physical constraints. This work instead studies optimal scenario design for machine-learning surrogate models of climate, improving generalization and predictive accuracy.
    Read original (arXiv cs.LG (Machine Learning)) ↗
  • arXiv cs.LG (Machine Learning) · EN Multimodal extract
    Does VLA Even Know the Basics? Measuring Commonsense and World Knowledge Retention in Vision-Language-Action Models
    Measuring commonsense and knowledge retention in VLA models
    AI Agents Computer Vision Fine-tuning Robotics Software Engineering
    Embodied Vision-Language-Action (VLA) models are typically obtained by fine-tuning powerful pretrained VLMs on robotics data, yet how much commonsense and factual knowledge they retain is unclear. This work measures that retention, revealing how much fine-tuning erodes prior world knowledge.
    Read original (arXiv cs.LG (Machine Learning)) ↗
  • arXiv cs.AI (Artificial Intelligence) · EN Multimodal extract
    A Multi-Domain Benchmark for Detecting AI-Generated Text-Rich Images from GPT-Image-2
    A multi-domain benchmark to detect GPT-Image-2 text-rich images
    Computer Vision GPT OpenAI Retrieval-Augmented Generation (RAG)
    Text-rich images often hold privacy-sensitive, transactional, or decision-relevant information. As multimodal generators synthesize realistic text and layouts, this work builds a multi-domain benchmark for detecting AI-generated text-rich images from GPT-Image-2, assessing detector reliability.
    Read original (arXiv cs.AI (Artificial Intelligence)) ↗
  • arXiv cs.AI (Artificial Intelligence) · EN Safety & Evaluation extract
    X+Slides: Benchmarking Audience-Conditioned Slide Generation
    X+Slides benchmarks audience-conditioned slide generation
    Neural Network Retrieval-Augmented Generation (RAG) Reinforcement Learning
    Automatically generating slide decks from documents is an important LLM application, but existing benchmarks mainly assess completeness and technical depth. X+Slides introduces a benchmark for audience-conditioned slide generation, evaluating how well decks adapt to their intended audience.
    Read original (arXiv cs.AI (Artificial Intelligence)) ↗
  • arXiv cs.LG (Machine Learning) · EN New Model Releases extract
    SCAN: Enhance Time Series Anomaly Detection via Multi-Scale Neighborhood-Centered Clustering
    SCAN boosts time-series anomaly detection via neighborhood clustering
    Reinforcement Learning
    Time-series anomaly detection is crucial across applications. Reconstruction-based methods dominate the field but tend to over-generalize, reconstructing anomalies well enough that they go undetected. SCAN uses multi-scale neighborhood-centered clustering to curb this over-generalization and improve detection.
    Read original (arXiv cs.LG (Machine Learning)) ↗
  • arXiv cs.AI (Artificial Intelligence) · EN Multimodal extract
    OneCanvas: 3D Scene Understanding via Panoramic Reprojection
    OneCanvas enables VLM 3D scene understanding via panoramic reprojection
    Computer Vision Embeddings Neural Network Robotics Software Engineering
    Existing 3D scene understanding in VLMs relies on complex, model-specific geometry encoders or large training budgets for spatial reasoning. OneCanvas instead uses panoramic reprojection, letting VLMs reason about 3D scenes efficiently without dedicated geometry encoders or heavy training.
    Read original (arXiv cs.AI (Artificial Intelligence)) ↗
  • arXiv cs.LG (Machine Learning) · EN New Model Releases extract
    Acceleration of an algebraic multigrid pressure solver using graph neural networks
    Graph neural networks accelerate an algebraic multigrid pressure solver
    Neural Network
    Solving the pressure-Poisson equation is the main bottleneck in incompressible unstructured flow solvers, as traditional linear solvers are sensitive to mesh irregularities. This work uses graph neural networks to accelerate an algebraic multigrid pressure solver, improving solve efficiency.
    Read original (arXiv cs.LG (Machine Learning)) ↗
  • arXiv cs.LG (Machine Learning) · EN New Model Releases extract
    Transformer Geometry Observatory TGO-I: Spectral Geometry Observatory
    TGO-I: a spectral geometry observatory for Vision Transformers
    Computer Vision Reinforcement Learning Transformer
    Despite the wide adoption and success of Vision Transformers, understanding of their dimensional and representational geometry remains limited. The Transformer Geometry Observatory (TGO-I) studies ViTs through spectral geometry, observing and analyzing the structure of their representation spaces.
    Read original (arXiv cs.LG (Machine Learning)) ↗
  • arXiv cs.AI (Artificial Intelligence) · EN Developer Tools extract
    A Taxonomy of Mental Health and Technology Needs for Alzheimer's and Dementia Caregivers
    A taxonomy of mental-health and tech needs for dementia caregivers
    Deep Learning Reinforcement Learning
    Family members caring for people with Alzheimer's and related dementias form the foundation of long-term care worldwide; in 2023 over 11 million U.S. relatives provided unpaid care. This work presents a taxonomy of caregivers' mental-health and technology needs to guide supportive design.
    Read original (arXiv cs.AI (Artificial Intelligence)) ↗
  • arXiv cs.AI (Artificial Intelligence) · EN Safety & Evaluation extract
    TxBench-PP: Analyzing AI Agent Performance on Small-Molecule Preclinical Pharmacology
    TxBench-PP evaluates AI agents on preclinical pharmacology
    AI Agents Claude GPT Reinforcement Learning from Human Feedback (RLHF) Software Engineering
    AI agents promise to accelerate drug discovery by compressing interpretation and decision loops, but deployment needs trusted evaluation on realistic tasks. TxBench-PP is a benchmark analyzing AI agent performance on small-molecule preclinical pharmacology, assessing their practical reliability.
    Read original (arXiv cs.AI (Artificial Intelligence)) ↗
  • arXiv cs.AI (Artificial Intelligence) · EN New Model Releases extract
    Machine Unlearning for the XGBoost Model with Network Intrusion Datasets
    Machine unlearning for XGBoost on network intrusion data
    Deep Learning
    Machine unlearning removes specific data points from trained models without full retraining, but most research targets neural networks. This work studies machine unlearning for the XGBoost gradient-boosted tree model using network intrusion datasets, extending unlearning beyond deep models.
    Read original (arXiv cs.AI (Artificial Intelligence)) ↗
  • arXiv cs.CL (Computation and Language) · EN Safety & Evaluation extract
    RECOM: A Validity Discrimination Tradeoff in Automatic Metrics for Open Ended Reddit Question Answering
    RECOM analyzes validity vs discrimination in automatic metrics
    Neural Network Software Engineering
    Automatic metrics are the default for evaluating LLM-generated text, yet a metric is quietly asked to do two jobs: tell genuine content alignment from surface coincidence (validity) and discriminate quality. Using open-ended Reddit QA, RECOM analyzes this validity–discrimination trade-off.
    Read original (arXiv cs.CL (Computation and Language)) ↗
  • Publickey · JA New Model Releases extract
    Public preview begins for "AWS FinOps Agent," an AI that tells you where your AWS cost problems lie
    AWS launches a public preview of 'AWS FinOps Agent' for cost analysis
    Amazon Web Services has begun a public preview of the 'AWS FinOps Agent,' an AI agent that answers questions about AWS costs and investigates the causes of cost anomalies. It is aimed at supporting FinOps operations. Details of feature scope and accuracy come from the article and the announcement and have not been independently verified.
    Read original (Publickey) ↗
  • arXiv cs.AI (Artificial Intelligence) · EN New Model Releases extract
    The More the Merrier: Combining Properties for ABox Abduction under Repair Semantics for EL⊥
    Combining properties for ABox abduction under repair semantics
    Abduction explains missing entailments from a knowledge base by proposing a hypothesis that would make them hold. This work studies ABox abduction under repair semantics for the EL description logic, combining multiple properties to produce stronger explanatory hypotheses.
    Read original (arXiv cs.AI (Artificial Intelligence)) ↗
  • arXiv cs.LG (Machine Learning) · EN Safety & Evaluation extract
    When AUC Misleads: Polarization-Aware Evaluation of Deepfake Detectors under Domain Shift
    Polarization-aware evaluation of deepfake detectors under domain shift
    Generative AI Retrieval-Augmented Generation (RAG) Reinforcement Learning
    Advances in diffusion models and face-swapping enable highly realistic deepfakes and real-world harm. This work shows AUC can mislead when evaluating detectors under domain shift, and proposes a polarization-aware evaluation that better reflects deepfake detector performance across domains.
    Read original (arXiv cs.LG (Machine Learning)) ↗
  • arXiv cs.CL (Computation and Language) · EN Funding & M&A extract
    Dango: A Strictly L1-Only Large Language Model for Studying Second Language Acquisition
    Dango: an L1-only 1.8B LLM for studying second-language acquisition
    The authors introduce Dango, a 1.8B-parameter language model designed for controlled studies of L1-to-L2 (Japanese-to-English) transfer in second language acquisition (SLA). Because it is trained strictly on L1 data, Dango enables controlled experiments on transfer phenomena that prior SLA model studies could not support.
    Read original (arXiv cs.CL (Computation and Language)) ↗
  • arXiv cs.AI (Artificial Intelligence) · EN Safety & Evaluation extract
    Beyond Safe Data: Pretraining-Stage Alignment with Regular Safety Reflection
    Pretraining-stage alignment via regular safety reflection
    Fine-tuning Inference Reinforcement Learning
    To achieve deeper safety alignment for LLMs, recent work pushes safety interventions earlier into pretraining, mainly by filtering unsafe data or rewriting it into safe forms. Going beyond safe data, this work embeds regular safety reflection during pretraining to instill more fundamental alignment.
    Read original (arXiv cs.AI (Artificial Intelligence)) ↗
  • arXiv cs.AI (Artificial Intelligence) · EN New Model Releases extract
    Essential Subspace Merging for Multi-Task Learning
    Essential subspace merging for multi-task model merging
    Inference Neural Network
    Model merging integrates the capabilities of several models fine-tuned from the same pretrained checkpoint into one, enabling multi-task learning. This work proposes Essential Subspace Merging, which extracts and merges each task's essential subspace to reduce interference and preserve multi-task performance.
    Read original (arXiv cs.AI (Artificial Intelligence)) ↗
  • arXiv cs.CL (Computation and Language) · EN Safety & Evaluation extract
    IndicContextEval: A Benchmark for Evaluating Context Utilisation in Audio Large Language Models Across 8 Indic Languages
    IndicContextEval: audio-LLM context use across 8 Indic languages
    Meta Neural Network Software Engineering Speech Processing
    Audio LLMs can condition speech recognition on textual prompts such as domain descriptions or entity lists, but whether they truly use this context is unclear. IndicContextEval is a benchmark evaluating context utilisation in audio large language models across eight Indic languages.
    Read original (arXiv cs.CL (Computation and Language)) ↗
  • arXiv cs.LG (Machine Learning) · EN Safety & Evaluation extract
    Complementary Attention Head Pruning for Efficient Transformers
    Complementary attention-head pruning for efficient Transformers
    Natural Language Processing (NLP) Reinforcement Learning Transformer
    Transformers' success stems from architectural scaling, which inflates parameter counts and hinders deployment in resource-constrained settings. This work proposes complementary attention head pruning, removing heads so that retained ones stay complementary, preserving accuracy while improving efficiency.
    Read original (arXiv cs.LG (Machine Learning)) ↗
  • arXiv cs.LG (Machine Learning) · EN Developer Tools extract
    OpenAnt: LLM-Powered Vulnerability Discovery Through Code Decomposition, Adversarial Verification, and Dynamic Testing
    OpenAnt: LLM-powered vulnerability discovery via code decomposition
    Automated vulnerability discovery in large codebases is hard: static analysis yields high false positives while dynamic methods like fuzzing lack coverage. OpenAnt is an LLM-powered approach combining code decomposition, adversarial verification, and dynamic testing to surface real vulnerabilities.
    Read original (arXiv cs.LG (Machine Learning)) ↗
  • arXiv cs.AI (Artificial Intelligence) · EN New Model Releases extract
    OrthoReg: Orthogonal Regularization for Hybrid Symbolic-Neural Dynamical Systems
    OrthoReg: orthogonal regularization for symbolic-neural dynamical systems
    Neural Network Reinforcement Learning
    Dynamical systems are fundamental to modeling the natural world, but modeling them trades off interpretable hand-specified mechanistic models against flexible yet opaque neural ones. OrthoReg introduces orthogonal regularization to disentangle symbolic and neural components in hybrid dynamical systems.
    Read original (arXiv cs.AI (Artificial Intelligence)) ↗
  • arXiv cs.AI (Artificial Intelligence) · EN New Model Releases extract
    Human-AI Coevolution Dynamics: A Formal Theory of Social Intelligence Emergence Through Long-Term Interaction
    A formal theory of human-AI coevolution and social intelligence
    Conversational AI has advanced in language generation, personalization, and long-context interaction, but most methods model social behavior through isolated components. This work offers a formal theory of human-AI coevolution dynamics, explaining how social intelligence emerges through long-term interaction.
    Read original (arXiv cs.AI (Artificial Intelligence)) ↗
  • arXiv cs.LG (Machine Learning) · EN New Model Releases extract
    INDEQS: Informed Neural controlled Differential EQuationS
    INDEQS: informed neural controlled differential equations for forecasting
    Neural Network Reinforcement Learning
    Neural Controlled Differential Equations provide a powerful continuous-time framework for time-series forecasting, but standard graph-based extensions struggle to learn spatial structure. INDEQS introduces informed neural controlled differential equations to better capture structure and improve forecasting.
    Read original (arXiv cs.LG (Machine Learning)) ↗
  • arXiv cs.AI (Artificial Intelligence) · EN Training & Fine-tuning extract
    ProductConsistency: Improving Product Identity Preservation in Instruction-Based Image Editing via SFT and RL
    ProductConsistency preserves product identity in instruction-based editing
    Fine-tuning Machine Learning Reinforcement Learning
    Instruction-based image editing enables complex edits from natural language, but in product-centric scenarios preserving product features and branding is hard. ProductConsistency uses supervised fine-tuning and reinforcement learning to improve product identity preservation during instruction-based editing.
    Read original (arXiv cs.AI (Artificial Intelligence)) ↗
  • arXiv cs.LG (Machine Learning) · EN New Model Releases extract
    Structure Over Nonlinearity: Explicit Interaction Architectures for Dynamical Learning
    Explicit interaction architectures for dynamical learning
    Most learning architectures for dynamical systems rely on generic nonlinear function approximation, often needing high complexity to capture structured behavior. Favoring structure over nonlinearity, this work proposes explicit interaction architectures that model variable interactions directly for efficient dynamical learning.
    Read original (arXiv cs.LG (Machine Learning)) ↗
  • arXiv cs.LG (Machine Learning) · EN New Model Releases extract
    Context-Aware Optimization of Follow-Up Intervals for Type 2 Diabetes Care Using Markov Decision Processes
    Optimizing type-2 diabetes follow-up intervals with MDPs
    Reinforcement Learning
    Chronic disease management relies on regular patient-provider interactions to track progression and control. For Type 2 Diabetes, guidelines prescribe fixed follow-up intervals. This work uses Markov decision processes to optimize follow-up intervals in a context-aware way, tailoring scheduling to each patient.
    Read original (arXiv cs.LG (Machine Learning)) ↗
  • arXiv cs.AI (Artificial Intelligence) · EN New Model Releases extract
    ARIADNE: Agnostic Routing for Inference-time Adapter DyNamic sElection
    ARIADNE: agnostic routing for inference-time adapter selection
    Embeddings Fine-tuning Inference Llama Retrieval-Augmented Generation (RAG)
    Widespread parameter-efficient fine-tuning yields ecosystems where one backbone pairs with many task-specialized adapters. ARIADNE provides agnostic routing for inference-time dynamic adapter selection, choosing the right adapter per input without model-specific assumptions.
    Read original (arXiv cs.AI (Artificial Intelligence)) ↗