Safety & Evaluation A

Showing 151–180 of 290
  • arXiv cs.LG (Machine Learning) · EN New Model Releases extract
    Kolmogorov Regression for Robust Diffusion Policies
    Kolmogorov regression yields robust diffusion policies
    Inference Neural Network Reinforcement Learning
    Finite-dimensional diffusion policies suffer temporal drift from discretization that degrades long-horizon performance. The paper introduces a backward Kolmogorov equation that lifts diffusion policies into a Cameron-Martin space to make them more robust.
    Read original (arXiv cs.LG (Machine Learning)) ↗
  • arXiv cs.LG (Machine Learning) · EN New Model Releases extract
    A Diffusion Approximation for Temporal-Difference Learning with Linear Features under Markovian Noise
    A diffusion approximation for TD learning under Markovian noise
    The classical continuous-time description of temporal-difference learning with linear features is an ODE capturing asymptotic mean dynamics but neglecting stochasticity. This work provides a diffusion approximation for TD learning under Markovian noise to capture those fluctuations.
    Read original (arXiv cs.LG (Machine Learning)) ↗
  • arXiv cs.LG (Machine Learning) · EN Safety & Evaluation extract
    A Convex Quasilinearization Method for Solving Nonlinear PDEs with Physics-Informed Neural Networks
    Convex quasilinearization solves nonlinear PDEs with PINNs
    Neural Network Reinforcement Learning
    For the forward solution of nonlinear PDEs, the method uses Bellman-Kalaba quasilinearization to reduce the nonlinear problem to a sequence of linear subproblems, each discretized by collocation and solved with physics-informed neural networks.
    Read original (arXiv cs.LG (Machine Learning)) ↗
  • arXiv cs.LG (Machine Learning) · EN Safety & Evaluation extract
    Evaluating Open-Source LLMs for Multi-Label ATT&CK Technique Classification on CTI Reports
    Evaluating open-source LLMs for ATT&CK multi-label CTI classification
    Neural Network Retrieval-Augmented Generation (RAG) Reinforcement Learning
    The paper evaluates open-source LLMs on multi-label classification of cyber threat intelligence (CTI) reports using MITRE ATT&CK techniques. Summary is title-based and neutral; details and figures are as presented by the source and not independently verified.
    Read original (arXiv cs.LG (Machine Learning)) ↗
  • arXiv cs.CL (Computation and Language) · EN Safety & Evaluation extract
    The Measurement Gap in the Automation of EU Law: Benchmarking Doctrinal Legal Reasoning under the EU AI Act
    Benchmarking doctrinal legal reasoning under the EU AI Act
    Neural Network
    LLMs produce legal text of at least median quality, yet no benchmark evaluates doctrinal legal reasoning, the interpretive core of legal work. The paper benchmarks doctrinal reasoning under the EU AI Act and discusses the measurement gap in legal automation.
    Read original (arXiv cs.CL (Computation and Language)) ↗
  • arXiv cs.AI (Artificial Intelligence) · EN Training & Fine-tuning extract
    WEQA: Wearable hEalth Question Answering with Query-Adaptive Agentic Reasoning
    WEQA: query-adaptive agentic reasoning for wearable health QA
    Deep Learning Neural Network Software Engineering
    The paper proposes WEQA, a framework for question answering over wearable health sensor data using query-adaptive agentic reasoning, arguing that diverse sensor modalities and user intents cannot be handled by a fixed reasoning workflow.
    Read original (arXiv cs.AI (Artificial Intelligence)) ↗
  • arXiv cs.CL (Computation and Language) · EN New Model Releases extract
    Your AI Travel Agent Would Book You a Bullfight: An Agentic Benchmark for Implicit Animal Welfare in Frontier AI Models
    An agentic benchmark for implicit animal welfare in frontier AI
    AI Agents Claude DeepSeek Gemini GPT
    AI agents are shifting from advisors to actors that book travel and run procurement. Existing animal-welfare benchmarks grade only text answers, so this work introduces an agentic benchmark testing whether implicit animal-welfare reasoning transfers to agent actions in frontier models.
    Read original (arXiv cs.CL (Computation and Language)) ↗
  • arXiv cs.AI (Artificial Intelligence) · EN Safety & Evaluation extract
    Towards Understanding and Measuring COGNITIVE ATROPHY in LLM Behaviour
    Formalizing 'cognitive atrophy' as a process-level measure of LLM behaviour
    Neural Network
    The paper formalizes 'cognitive atrophy,' a process-level behavioural measure of AI-mediated mental-health support, capturing whether interactions help users keep reflecting, coping, and deciding, a dimension distinct from safety and static response quality.
    Read original (arXiv cs.AI (Artificial Intelligence)) ↗
  • arXiv cs.CL (Computation and Language) · EN New Model Releases extract
    Unintended Effects of Geographic Conditioning in Large Language Models
    Unintended regional biases from geographic conditioning in LLMs
    Claude Llama Meta Neural Network Reinforcement Learning
    Conversational AI localizes responses using user metadata, yet the regional biases this hidden context introduces remain poorly understood. The paper analyzes the unintended effects of geographic conditioning on large language model outputs.
    Read original (arXiv cs.CL (Computation and Language)) ↗
  • arXiv cs.LG (Machine Learning) · EN Inference & Efficiency extract
    Embedded Machine Learning for Microcontroller-Class Edge Devices: Data, Feature, Evaluation, and Deployment Pipelines
    A pipeline survey of embedded ML for microcontroller-class devices
    Inference Machine Learning Quantization
    Embedded machine learning moves inference from the cloud to resource-constrained devices. This practice-oriented synthesis lays out data, feature, evaluation and deployment pipelines for an embedded ML workflow on microcontroller-class platforms.
    Read original (arXiv cs.LG (Machine Learning)) ↗
  • arXiv cs.CL (Computation and Language) · EN Developer Tools extract
    Structural Role Injection in Handlebars-Templated LLM Prompts: Triple-Brace Interpolation, Delimiter Family, and the Limits of HTML Auto-Escaping
    Structural role injection in Handlebars-templated LLM prompts
    Claude GPT Llama Machine Learning Microsoft
    LLM apps build prompts from templates, with Handlebars the default in Microsoft Semantic Kernel. While double-brace expressions HTML-escape values, triple-brace interpolation inserts them raw. The paper studies structural role injection and the limits of HTML auto-escaping.
    Read original (arXiv cs.CL (Computation and Language)) ↗
  • arXiv cs.CL (Computation and Language) · EN New Model Releases extract
    HistoRAG: Embedding Historical Methodology in Retrieval-Augmented Generation Through Critical Technical Practice
    HistoRAG embeds historical methodology into RAG via critical practice
    Embeddings Retrieval-Augmented Generation (RAG) Reinforcement Learning Software Engineering
    RAG grounds model outputs in external evidence, but its dominant evaluations and defaults are oriented toward factual question answering. HistoRAG embeds historical methodology into retrieval-augmented generation through critical technical practice for interpretive historical studies.
    Read original (arXiv cs.CL (Computation and Language)) ↗
  • arXiv cs.AI (Artificial Intelligence) · EN Safety & Evaluation extract
    IsabeLLM: Automated Theorem Proving Applied to Formally Verifying Consensus
    IsabeLLM: automated theorem proving to formally verify consensus
    Retrieval-Augmented Generation (RAG)
    The paper presents IsabeLLM, applying AI-based automated theorem proving to formally verify blockchain consensus, aiming to automate much of the expertise-intensive verification workload and make formal verification more accessible.
    Read original (arXiv cs.AI (Artificial Intelligence)) ↗
  • NVIDIA Developer Blog · EN Infrastructure & Hardware extract
    How to Optimize Transformer-Based Models for Low-Precision Training
    NVIDIA guide on optimizing transformer models for low-precision training
    Generative AI NVIDIA Transformer
    An NVIDIA technical post explains techniques for optimizing transformer-based models during low-precision training. The export raw_excerpt was blocked (cookie/query data), so this summary is based only on the title and source; specific methods and figures are unverified.
    Read original (NVIDIA Developer Blog) ↗
  • arXiv cs.LG (Machine Learning) · EN Inference & Efficiency extract
    S4oP: Operator-level Pruning of Structured State Space Models for Resource-Constrained Devices
    S4oP prunes structured state space models at the operator level
    Fine-tuning Inference Reinforcement Learning
    Structured state space models such as S4 and S4D capture long-range dependencies but are hard to deploy on constrained devices. S4oP introduces operator-level pruning to enable efficient deployment of SSMs on time- and resource-constrained hardware.
    Read original (arXiv cs.LG (Machine Learning)) ↗
  • arXiv cs.AI (Artificial Intelligence) · EN Training & Fine-tuning extract
    EAGG: Embodiment-Aligned Grasp Generation via Geometry-Aware Graph Conditioning
    EAGG: embodiment-aligned grasp generation via graph conditioning
    Fine-tuning Retrieval-Augmented Generation (RAG)
    The paper presents EAGG, an embodiment-aligned grasp generator that represents each end-effector with a topology-aware graph and embodiment-specific conditioning, aiming to generalize grasp generation across objects and diverse robot embodiments.
    Read original (arXiv cs.AI (Artificial Intelligence)) ↗
  • Google DeepMind Blog · EN Agents & Tool Use extract
    Securing the future of AI agents
    DeepMind outlines an AI Control Roadmap to secure AI agents
    AI Agents
    Google DeepMind presents an AI Control Roadmap for securing the future of AI agents, combining traditional safeguards with real-time monitoring to protect internal systems. The framework lays out layered defenses against agent misuse and unsafe behavior as agents proliferate.
    Read original (Google DeepMind Blog) ↗
  • arXiv cs.LG (Machine Learning) · EN Safety & Evaluation extract
    Edge Flow: A Tractable and Predictive Continuous-Time Model for Gradient Descent at the Edge of Stability
    Edge Flow: a tractable continuous-time model for GD at the edge of stability
    Deep Learning
    Gradient descent in deep learning can operate at the edge of stability, where the loss Hessian's top eigenvalue hovers near the stability threshold. Classical tools fail there, so Edge Flow offers a tractable, predictive continuous-time model of this regime.
    Read original (arXiv cs.LG (Machine Learning)) ↗
  • arXiv cs.AI (Artificial Intelligence) · EN Safety & Evaluation extract
    Agentic AI-based Framework for Mitigating Premature Diagnostic Handoff and Silent Hallucination in Healthcare Applications
    A multi-agent framework against premature handoff and silent hallucination
    AI Agents Llama
    The paper proposes a multi-agent framework for healthcare that mitigates premature diagnostic handoff and silent clinical hallucinations, replacing LLM-as-a-judge routing with deterministic orchestration constraints and adding two safety mechanisms.
    Read original (arXiv cs.AI (Artificial Intelligence)) ↗
  • arXiv cs.LG (Machine Learning) · EN Safety & Evaluation extract
    NoiseTilt: Noise-Tilted Reverse Kernels for Diffusion Reward Alignment
    NoiseTilt injects reward gradients via the noise term in diffusion
    Inference
    NoiseTilt (NTRK) is a reward-guided diffusion sampler that injects reward gradients through the noise term, leaving the score kernel unchanged and needing only a single sample per step, improving reward alignment of pretrained diffusion models.
    Read original (arXiv cs.LG (Machine Learning)) ↗
  • arXiv cs.CL (Computation and Language) · EN Safety & Evaluation extract
    PseudoBench: Measuring How Agentic Auto-Research Fuels Pseudoscience
    PseudoBench measures how agentic auto-research fuels pseudoscience
    AI Agents Deep Learning
    As LLM-based agents enter autonomous scientific research, resisting pseudoscience matters. PseudoBench is an adversarial benchmark measuring how such agents may rapidly generate plausible yet misleading studies that contaminate academic literature.
    Read original (arXiv cs.CL (Computation and Language)) ↗
  • arXiv cs.CL (Computation and Language) · EN Safety & Evaluation extract
    Compositional Skill Routing for LLM Agents: Decompose, Retrieve, and Compose
    Compositional skill routing for LLM agents: decompose, retrieve, compose
    AI Agents Model Context Protocol (MCP) Neural Network Reinforcement Learning
    LLM agents rely on reusable tool specifications (skills), but real tasks require composing multiple skills. The paper formalizes compositional skill routing: decomposing a complex query into atomic sub-tasks, retrieving relevant skills, and composing them.
    Read original (arXiv cs.CL (Computation and Language)) ↗
  • arXiv cs.LG (Machine Learning) · EN Training & Fine-tuning extract
    Uncertainty Quantification for Flow-Based Vision-Language-Action Models
    Uncertainty quantification for flow-based vision-language-action models
    Computer Vision Fine-tuning Retrieval-Augmented Generation (RAG) Reinforcement Learning
    Vision-language-action models combine vision-language backbones with expressive generative action heads trained via flow matching on large robotic datasets. Despite strong performance, the paper studies uncertainty quantification for these flow-based VLA models.
    Read original (arXiv cs.LG (Machine Learning)) ↗
  • NVIDIA Developer Blog · EN Infrastructure & Hardware extract
    NVIDIA Blackwell Tops MLPerf Training 6.0 with Industry-Leading Scale and Performance
    NVIDIA says Blackwell tops MLPerf Training 6.0 benchmark
    Generative AI Machine Learning NVIDIA Software Engineering
    NVIDIA announced that its Blackwell GPU architecture topped the MLPerf Training 6.0 benchmark with what it calls industry-leading scale and performance. Summarized neutrally from the title; the export excerpt was blocked (cookie/query data), so figures are vendor claims, not independently verified.
    Read original (NVIDIA Developer Blog) ↗
  • arXiv cs.CL (Computation and Language) · EN Agents & Tool Use extract
    ProvenanceGuard: Source-Aware Factuality Verification for MCP-Based LLM Agents
    ProvenanceGuard: source-aware factuality verification for MCP agents
    AI Agents Model Context Protocol (MCP) Software Engineering
    Tool-using LLM agents use the Model Context Protocol to answer from heterogeneous sources like search, APIs, databases and clinical records. ProvenanceGuard provides source-aware factuality verification to catch provenance-sensitive failure modes that standard metrics miss.
    Read original (arXiv cs.CL (Computation and Language)) ↗
  • arXiv cs.LG (Machine Learning) · EN Safety & Evaluation extract
    LoopCoder-v2: Only Loop Once for Efficient Test-Time Computation Scaling
    LoopCoder-v2: loop once for efficient test-time compute scaling
    Deep Learning Software Engineering Transformer
    Looped transformers scale latent computation by repeating shared blocks, but sequential looping raises latency and KV-cache memory with loop count. Building on parallel loop transformers, LoopCoder-v2 makes loop count a practical knob for efficient test-time computation scaling.
    Read original (arXiv cs.LG (Machine Learning)) ↗
  • arXiv cs.CL (Computation and Language) · EN Safety & Evaluation extract
    LegalHalluLens: Typed Hallucination Auditing and Calibrated Multi-Agent Debate for Trustworthy Legal AI
    LegalHalluLens audits typed legal-AI hallucinations with calibrated debate
    Retrieval-Augmented Generation (RAG)
    Legal-AI systems hallucinate at aggregate rates near 52%, but averages hide where and how errors concentrate. LegalHalluLens is an auditing framework pairing typed hallucination auditing with calibrated multi-agent debate to give compliance officers actionable signals for trustworthy legal AI.
    Read original (arXiv cs.CL (Computation and Language)) ↗
  • arXiv cs.LG (Machine Learning) · EN New Model Releases extract
    Fast Nonparametric Conditional Independence Testing via Two-Stage Regression
    Fast nonparametric conditional independence testing via two-stage regression
    Algorithms & Theory Reinforcement Learning from Human Feedback (RLHF)
    Conditional independence testing is fundamental to statistics and causal inference. The paper proposes a fast nonparametric conditional independence test based on two-stage regression, aiming to improve computational efficiency and power.
    Read original (arXiv cs.LG (Machine Learning)) ↗
  • arXiv cs.AI (Artificial Intelligence) · EN New Model Releases extract
    LLM Consumer Behavior Theory: Foundations of a Novel Research Field
    LLM Consumer Behavior Theory: a new field for agentic markets
    AI Agents Natural Language Processing (NLP) Retrieval-Augmented Generation (RAG)
    The paper introduces LLM Consumer Behavior Theory, a proposed field analyzing consumer behavior in agentic markets where LLMs make consumption decisions on behalf of users, drawing on classical and behavioral economics alongside NLP.
    Read original (arXiv cs.AI (Artificial Intelligence)) ↗
  • arXiv cs.CL (Computation and Language) · EN New Model Releases extract
    VoidPadding: Let [VOID] Handle Padding in Masked Diffusion Language Models so that [EOS] Can Focus on Semantic Termination
    VoidPadding lets [VOID] handle padding so [EOS] focuses on termination
    Deep Learning Inference Retrieval-Augmented Generation (RAG) Reinforcement Learning
    In masked diffusion language models, padding and semantic termination roles get entangled. VoidPadding introduces a [VOID] token to handle padding so that [EOS] can focus on signaling semantic termination, improving generation behavior.
    Read original (arXiv cs.CL (Computation and Language)) ↗