Safety & Evaluation (Page 6 of 10)｜AI/Tech News Trends

arXiv cs.LG (Machine Learning) · 2026-06-16 EN New Model Releases extract

Kolmogorov Regression for Robust Diffusion Policies

Kolmogorov regression yields robust diffusion policies

Inference Neural Network Reinforcement Learning

Finite-dimensional diffusion policies suffer temporal drift from discretization that degrades long-horizon performance. The paper introduces a backward Kolmogorov equation that lifts diffusion policies into a Cameron-Martin space to make them more robust.

Read original (arXiv cs.LG (Machine Learning)) ↗

arXiv cs.LG (Machine Learning) · 2026-06-16 EN New Model Releases extract

A Diffusion Approximation for Temporal-Difference Learning with Linear Features under Markovian Noise

A diffusion approximation for TD learning under Markovian noise

The classical continuous-time description of temporal-difference learning with linear features is an ODE capturing asymptotic mean dynamics but neglecting stochasticity. This work provides a diffusion approximation for TD learning under Markovian noise to capture those fluctuations.

Read original (arXiv cs.LG (Machine Learning)) ↗

arXiv cs.LG (Machine Learning) · 2026-06-16 EN Safety & Evaluation extract

A Convex Quasilinearization Method for Solving Nonlinear PDEs with Physics-Informed Neural Networks

Convex quasilinearization solves nonlinear PDEs with PINNs

Neural Network Reinforcement Learning

For the forward solution of nonlinear PDEs, the method uses Bellman-Kalaba quasilinearization to reduce the nonlinear problem to a sequence of linear subproblems, each discretized by collocation and solved with physics-informed neural networks.

Read original (arXiv cs.LG (Machine Learning)) ↗

arXiv cs.LG (Machine Learning) · 2026-06-16 EN Safety & Evaluation extract

Evaluating Open-Source LLMs for Multi-Label ATT&CK Technique Classification on CTI Reports

Evaluating open-source LLMs for ATT&CK multi-label CTI classification

Neural Network Retrieval-Augmented Generation (RAG) Reinforcement Learning

The paper evaluates open-source LLMs on multi-label classification of cyber threat intelligence (CTI) reports using MITRE ATT&CK techniques. Summary is title-based and neutral; details and figures are as presented by the source and not independently verified.

Read original (arXiv cs.LG (Machine Learning)) ↗

arXiv cs.CL (Computation and Language) · 2026-06-16 EN Safety & Evaluation extract

The Measurement Gap in the Automation of EU Law: Benchmarking Doctrinal Legal Reasoning under the EU AI Act

Benchmarking doctrinal legal reasoning under the EU AI Act

Neural Network

LLMs produce legal text of at least median quality, yet no benchmark evaluates doctrinal legal reasoning, the interpretive core of legal work. The paper benchmarks doctrinal reasoning under the EU AI Act and discusses the measurement gap in legal automation.

Read original (arXiv cs.CL (Computation and Language)) ↗

arXiv cs.AI (Artificial Intelligence) · 2026-06-16 EN Training & Fine-tuning extract

WEQA: Wearable hEalth Question Answering with Query-Adaptive Agentic Reasoning

WEQA: query-adaptive agentic reasoning for wearable health QA

Deep Learning Neural Network Software Engineering

The paper proposes WEQA, a framework for question answering over wearable health sensor data using query-adaptive agentic reasoning, arguing that diverse sensor modalities and user intents cannot be handled by a fixed reasoning workflow.

Read original (arXiv cs.AI (Artificial Intelligence)) ↗

arXiv cs.CL (Computation and Language) · 2026-06-16 EN New Model Releases extract

Your AI Travel Agent Would Book You a Bullfight: An Agentic Benchmark for Implicit Animal Welfare in Frontier AI Models

An agentic benchmark for implicit animal welfare in frontier AI

AI Agents Claude DeepSeek Gemini GPT

AI agents are shifting from advisors to actors that book travel and run procurement. Existing animal-welfare benchmarks grade only text answers, so this work introduces an agentic benchmark testing whether implicit animal-welfare reasoning transfers to agent actions in frontier models.

Read original (arXiv cs.CL (Computation and Language)) ↗

arXiv cs.AI (Artificial Intelligence) · 2026-06-16 EN Safety & Evaluation extract

Towards Understanding and Measuring COGNITIVE ATROPHY in LLM Behaviour

Formalizing 'cognitive atrophy' as a process-level measure of LLM behaviour

Neural Network

The paper formalizes 'cognitive atrophy,' a process-level behavioural measure of AI-mediated mental-health support, capturing whether interactions help users keep reflecting, coping, and deciding, a dimension distinct from safety and static response quality.

Read original (arXiv cs.AI (Artificial Intelligence)) ↗

arXiv cs.CL (Computation and Language) · 2026-06-16 EN New Model Releases extract

Unintended Effects of Geographic Conditioning in Large Language Models

Unintended regional biases from geographic conditioning in LLMs

Claude Llama Meta Neural Network Reinforcement Learning

Conversational AI localizes responses using user metadata, yet the regional biases this hidden context introduces remain poorly understood. The paper analyzes the unintended effects of geographic conditioning on large language model outputs.

Read original (arXiv cs.CL (Computation and Language)) ↗

arXiv cs.LG (Machine Learning) · 2026-06-16 EN Inference & Efficiency extract

Embedded Machine Learning for Microcontroller-Class Edge Devices: Data, Feature, Evaluation, and Deployment Pipelines

A pipeline survey of embedded ML for microcontroller-class devices

Inference Machine Learning Quantization

Embedded machine learning moves inference from the cloud to resource-constrained devices. This practice-oriented synthesis lays out data, feature, evaluation and deployment pipelines for an embedded ML workflow on microcontroller-class platforms.

Read original (arXiv cs.LG (Machine Learning)) ↗

arXiv cs.CL (Computation and Language) · 2026-06-16 EN Developer Tools extract

Structural Role Injection in Handlebars-Templated LLM Prompts: Triple-Brace Interpolation, Delimiter Family, and the Limits of HTML Auto-Escaping

Structural role injection in Handlebars-templated LLM prompts

Claude GPT Llama Machine Learning Microsoft

LLM apps build prompts from templates, with Handlebars the default in Microsoft Semantic Kernel. While double-brace expressions HTML-escape values, triple-brace interpolation inserts them raw. The paper studies structural role injection and the limits of HTML auto-escaping.

Read original (arXiv cs.CL (Computation and Language)) ↗

arXiv cs.CL (Computation and Language) · 2026-06-16 EN New Model Releases extract

HistoRAG: Embedding Historical Methodology in Retrieval-Augmented Generation Through Critical Technical Practice

HistoRAG embeds historical methodology into RAG via critical practice

Embeddings Retrieval-Augmented Generation (RAG) Reinforcement Learning Software Engineering

RAG grounds model outputs in external evidence, but its dominant evaluations and defaults are oriented toward factual question answering. HistoRAG embeds historical methodology into retrieval-augmented generation through critical technical practice for interpretive historical studies.

Read original (arXiv cs.CL (Computation and Language)) ↗

arXiv cs.AI (Artificial Intelligence) · 2026-06-16 EN Safety & Evaluation extract

IsabeLLM: Automated Theorem Proving Applied to Formally Verifying Consensus

IsabeLLM: automated theorem proving to formally verify consensus

Retrieval-Augmented Generation (RAG)

The paper presents IsabeLLM, applying AI-based automated theorem proving to formally verify blockchain consensus, aiming to automate much of the expertise-intensive verification workload and make formal verification more accessible.

Read original (arXiv cs.AI (Artificial Intelligence)) ↗

NVIDIA Developer Blog · 2026-06-16 EN Infrastructure & Hardware extract

How to Optimize Transformer-Based Models for Low-Precision Training

NVIDIA guide on optimizing transformer models for low-precision training

Generative AI NVIDIA Transformer

An NVIDIA technical post explains techniques for optimizing transformer-based models during low-precision training. The export raw_excerpt was blocked (cookie/query data), so this summary is based only on the title and source; specific methods and figures are unverified.

Read original (NVIDIA Developer Blog) ↗

arXiv cs.LG (Machine Learning) · 2026-06-16 EN Inference & Efficiency extract

S4oP: Operator-level Pruning of Structured State Space Models for Resource-Constrained Devices

S4oP prunes structured state space models at the operator level

Fine-tuning Inference Reinforcement Learning

Structured state space models such as S4 and S4D capture long-range dependencies but are hard to deploy on constrained devices. S4oP introduces operator-level pruning to enable efficient deployment of SSMs on time- and resource-constrained hardware.

Read original (arXiv cs.LG (Machine Learning)) ↗

arXiv cs.AI (Artificial Intelligence) · 2026-06-16 EN Training & Fine-tuning extract

EAGG: Embodiment-Aligned Grasp Generation via Geometry-Aware Graph Conditioning

EAGG: embodiment-aligned grasp generation via graph conditioning

Fine-tuning Retrieval-Augmented Generation (RAG)

The paper presents EAGG, an embodiment-aligned grasp generator that represents each end-effector with a topology-aware graph and embodiment-specific conditioning, aiming to generalize grasp generation across objects and diverse robot embodiments.

Read original (arXiv cs.AI (Artificial Intelligence)) ↗

Google DeepMind Blog · 2026-06-16 EN Agents & Tool Use extract

Securing the future of AI agents

DeepMind outlines an AI Control Roadmap to secure AI agents

AI Agents

Google DeepMind presents an AI Control Roadmap for securing the future of AI agents, combining traditional safeguards with real-time monitoring to protect internal systems. The framework lays out layered defenses against agent misuse and unsafe behavior as agents proliferate.

Read original (Google DeepMind Blog) ↗

arXiv cs.LG (Machine Learning) · 2026-06-16 EN Safety & Evaluation extract

Edge Flow: A Tractable and Predictive Continuous-Time Model for Gradient Descent at the Edge of Stability

Edge Flow: a tractable continuous-time model for GD at the edge of stability

Deep Learning

Gradient descent in deep learning can operate at the edge of stability, where the loss Hessian's top eigenvalue hovers near the stability threshold. Classical tools fail there, so Edge Flow offers a tractable, predictive continuous-time model of this regime.

Read original (arXiv cs.LG (Machine Learning)) ↗

arXiv cs.AI (Artificial Intelligence) · 2026-06-16 EN Safety & Evaluation extract

Agentic AI-based Framework for Mitigating Premature Diagnostic Handoff and Silent Hallucination in Healthcare Applications

A multi-agent framework against premature handoff and silent hallucination

AI Agents Llama

The paper proposes a multi-agent framework for healthcare that mitigates premature diagnostic handoff and silent clinical hallucinations, replacing LLM-as-a-judge routing with deterministic orchestration constraints and adding two safety mechanisms.

Read original (arXiv cs.AI (Artificial Intelligence)) ↗

arXiv cs.LG (Machine Learning) · 2026-06-16 EN Safety & Evaluation extract

NoiseTilt: Noise-Tilted Reverse Kernels for Diffusion Reward Alignment

NoiseTilt injects reward gradients via the noise term in diffusion

Inference

NoiseTilt (NTRK) is a reward-guided diffusion sampler that injects reward gradients through the noise term, leaving the score kernel unchanged and needing only a single sample per step, improving reward alignment of pretrained diffusion models.

Read original (arXiv cs.LG (Machine Learning)) ↗

arXiv cs.CL (Computation and Language) · 2026-06-16 EN Safety & Evaluation extract

PseudoBench: Measuring How Agentic Auto-Research Fuels Pseudoscience

PseudoBench measures how agentic auto-research fuels pseudoscience

AI Agents Deep Learning

As LLM-based agents enter autonomous scientific research, resisting pseudoscience matters. PseudoBench is an adversarial benchmark measuring how such agents may rapidly generate plausible yet misleading studies that contaminate academic literature.

Read original (arXiv cs.CL (Computation and Language)) ↗

arXiv cs.CL (Computation and Language) · 2026-06-16 EN Safety & Evaluation extract

Compositional Skill Routing for LLM Agents: Decompose, Retrieve, and Compose

Compositional skill routing for LLM agents: decompose, retrieve, compose

AI Agents Model Context Protocol (MCP) Neural Network Reinforcement Learning

LLM agents rely on reusable tool specifications (skills), but real tasks require composing multiple skills. The paper formalizes compositional skill routing: decomposing a complex query into atomic sub-tasks, retrieving relevant skills, and composing them.

Read original (arXiv cs.CL (Computation and Language)) ↗

arXiv cs.LG (Machine Learning) · 2026-06-16 EN Training & Fine-tuning extract

Uncertainty Quantification for Flow-Based Vision-Language-Action Models

Uncertainty quantification for flow-based vision-language-action models

Computer Vision Fine-tuning Retrieval-Augmented Generation (RAG) Reinforcement Learning

Vision-language-action models combine vision-language backbones with expressive generative action heads trained via flow matching on large robotic datasets. Despite strong performance, the paper studies uncertainty quantification for these flow-based VLA models.

Read original (arXiv cs.LG (Machine Learning)) ↗

NVIDIA Developer Blog · 2026-06-16 EN Infrastructure & Hardware extract

NVIDIA Blackwell Tops MLPerf Training 6.0 with Industry-Leading Scale and Performance

NVIDIA says Blackwell tops MLPerf Training 6.0 benchmark

Generative AI Machine Learning NVIDIA Software Engineering

NVIDIA announced that its Blackwell GPU architecture topped the MLPerf Training 6.0 benchmark with what it calls industry-leading scale and performance. Summarized neutrally from the title; the export excerpt was blocked (cookie/query data), so figures are vendor claims, not independently verified.

Read original (NVIDIA Developer Blog) ↗

arXiv cs.CL (Computation and Language) · 2026-06-16 EN Agents & Tool Use extract

ProvenanceGuard: Source-Aware Factuality Verification for MCP-Based LLM Agents

ProvenanceGuard: source-aware factuality verification for MCP agents

AI Agents Model Context Protocol (MCP) Software Engineering

Tool-using LLM agents use the Model Context Protocol to answer from heterogeneous sources like search, APIs, databases and clinical records. ProvenanceGuard provides source-aware factuality verification to catch provenance-sensitive failure modes that standard metrics miss.

Read original (arXiv cs.CL (Computation and Language)) ↗

arXiv cs.LG (Machine Learning) · 2026-06-16 EN Safety & Evaluation extract

LoopCoder-v2: Only Loop Once for Efficient Test-Time Computation Scaling

LoopCoder-v2: loop once for efficient test-time compute scaling

Deep Learning Software Engineering Transformer

Looped transformers scale latent computation by repeating shared blocks, but sequential looping raises latency and KV-cache memory with loop count. Building on parallel loop transformers, LoopCoder-v2 makes loop count a practical knob for efficient test-time computation scaling.

Read original (arXiv cs.LG (Machine Learning)) ↗

arXiv cs.CL (Computation and Language) · 2026-06-16 EN Safety & Evaluation extract

LegalHalluLens: Typed Hallucination Auditing and Calibrated Multi-Agent Debate for Trustworthy Legal AI

LegalHalluLens audits typed legal-AI hallucinations with calibrated debate

Retrieval-Augmented Generation (RAG)

Legal-AI systems hallucinate at aggregate rates near 52%, but averages hide where and how errors concentrate. LegalHalluLens is an auditing framework pairing typed hallucination auditing with calibrated multi-agent debate to give compliance officers actionable signals for trustworthy legal AI.

Read original (arXiv cs.CL (Computation and Language)) ↗

arXiv cs.LG (Machine Learning) · 2026-06-16 EN New Model Releases extract

Fast Nonparametric Conditional Independence Testing via Two-Stage Regression

Fast nonparametric conditional independence testing via two-stage regression

Algorithms & Theory Reinforcement Learning from Human Feedback (RLHF)

Conditional independence testing is fundamental to statistics and causal inference. The paper proposes a fast nonparametric conditional independence test based on two-stage regression, aiming to improve computational efficiency and power.

Read original (arXiv cs.LG (Machine Learning)) ↗

arXiv cs.AI (Artificial Intelligence) · 2026-06-16 EN New Model Releases extract

LLM Consumer Behavior Theory: Foundations of a Novel Research Field

LLM Consumer Behavior Theory: a new field for agentic markets

AI Agents Natural Language Processing (NLP) Retrieval-Augmented Generation (RAG)

The paper introduces LLM Consumer Behavior Theory, a proposed field analyzing consumer behavior in agentic markets where LLMs make consumption decisions on behalf of users, drawing on classical and behavioral economics alongside NLP.

Read original (arXiv cs.AI (Artificial Intelligence)) ↗

arXiv cs.CL (Computation and Language) · 2026-06-16 EN New Model Releases extract

VoidPadding: Let [VOID] Handle Padding in Masked Diffusion Language Models so that [EOS] Can Focus on Semantic Termination

VoidPadding lets [VOID] handle padding so [EOS] focuses on termination

Deep Learning Inference Retrieval-Augmented Generation (RAG) Reinforcement Learning

In masked diffusion language models, padding and semantic termination roles get entangled. VoidPadding introduces a [VOID] token to handle padding so that [EOS] can focus on signaling semantic termination, improving generation behavior.

Read original (arXiv cs.CL (Computation and Language)) ↗