Safety & Evaluation A
-
Kolmogorov Regression for Robust Diffusion Policies
Kolmogorov regression yields robust diffusion policies
Finite-dimensional diffusion policies suffer temporal drift from discretization that degrades long-horizon performance. The paper introduces a backward Kolmogorov equation that lifts diffusion policies into a Cameron-Martin space to make them more robust.
-
A Diffusion Approximation for Temporal-Difference Learning with Linear Features under Markovian Noise
A diffusion approximation for TD learning under Markovian noise
The classical continuous-time description of temporal-difference learning with linear features is an ODE capturing asymptotic mean dynamics but neglecting stochasticity. This work provides a diffusion approximation for TD learning under Markovian noise to capture those fluctuations.
-
A Convex Quasilinearization Method for Solving Nonlinear PDEs with Physics-Informed Neural Networks
Convex quasilinearization solves nonlinear PDEs with PINNs
For the forward solution of nonlinear PDEs, the method uses Bellman-Kalaba quasilinearization to reduce the nonlinear problem to a sequence of linear subproblems, each discretized by collocation and solved with physics-informed neural networks.
-
Evaluating Open-Source LLMs for Multi-Label ATT&CK Technique Classification on CTI Reports
Evaluating open-source LLMs for ATT&CK multi-label CTI classification
The paper evaluates open-source LLMs on multi-label classification of cyber threat intelligence (CTI) reports using MITRE ATT&CK techniques. Summary is title-based and neutral; details and figures are as presented by the source and not independently verified.
-
The Measurement Gap in the Automation of EU Law: Benchmarking Doctrinal Legal Reasoning under the EU AI Act
Benchmarking doctrinal legal reasoning under the EU AI Act
LLMs produce legal text of at least median quality, yet no benchmark evaluates doctrinal legal reasoning, the interpretive core of legal work. The paper benchmarks doctrinal reasoning under the EU AI Act and discusses the measurement gap in legal automation.
-
WEQA: Wearable hEalth Question Answering with Query-Adaptive Agentic Reasoning
WEQA: query-adaptive agentic reasoning for wearable health QA
The paper proposes WEQA, a framework for question answering over wearable health sensor data using query-adaptive agentic reasoning, arguing that diverse sensor modalities and user intents cannot be handled by a fixed reasoning workflow.
-
Your AI Travel Agent Would Book You a Bullfight: An Agentic Benchmark for Implicit Animal Welfare in Frontier AI Models
An agentic benchmark for implicit animal welfare in frontier AI
AI agents are shifting from advisors to actors that book travel and run procurement. Existing animal-welfare benchmarks grade only text answers, so this work introduces an agentic benchmark testing whether implicit animal-welfare reasoning transfers to agent actions in frontier models.
-
Towards Understanding and Measuring COGNITIVE ATROPHY in LLM Behaviour
Formalizing 'cognitive atrophy' as a process-level measure of LLM behaviour
The paper formalizes 'cognitive atrophy,' a process-level behavioural measure of AI-mediated mental-health support, capturing whether interactions help users keep reflecting, coping, and deciding, a dimension distinct from safety and static response quality.
-
Unintended Effects of Geographic Conditioning in Large Language Models
Unintended regional biases from geographic conditioning in LLMs
Conversational AI localizes responses using user metadata, yet the regional biases this hidden context introduces remain poorly understood. The paper analyzes the unintended effects of geographic conditioning on large language model outputs.
-
Embedded Machine Learning for Microcontroller-Class Edge Devices: Data, Feature, Evaluation, and Deployment Pipelines
A pipeline survey of embedded ML for microcontroller-class devices
Embedded machine learning moves inference from the cloud to resource-constrained devices. This practice-oriented synthesis lays out data, feature, evaluation, and deployment pipelines for an embedded ML workflow on microcontroller-class platforms.
-
Structural Role Injection in Handlebars-Templated LLM Prompts: Triple-Brace Interpolation, Delimiter Family, and the Limits of HTML Auto-Escaping
Structural role injection in Handlebars-templated LLM prompts
LLM apps build prompts from templates, with Handlebars the default in Microsoft Semantic Kernel. While double-brace expressions HTML-escape values, triple-brace interpolation inserts them raw. The paper studies structural role injection and the limits of HTML auto-escaping.
-
HistoRAG: Embedding Historical Methodology in Retrieval-Augmented Generation Through Critical Technical Practice
HistoRAG embeds historical methodology into RAG via critical practice
RAG grounds model outputs in external evidence, but its dominant evaluations and defaults are oriented toward factual question answering. HistoRAG embeds historical methodology into retrieval-augmented generation through critical technical practice for interpretive historical studies.
-
IsabeLLM: Automated Theorem Proving Applied to Formally Verifying Consensus
IsabeLLM: automated theorem proving to formally verify consensus
The paper presents IsabeLLM, applying AI-based automated theorem proving to formally verify blockchain consensus, aiming to automate much of the expertise-intensive verification workload and make formal verification more accessible.
-
How to Optimize Transformer-Based Models for Low-Precision Training
NVIDIA guide on optimizing transformer models for low-precision training
An NVIDIA technical post explains techniques for optimizing transformer-based models during low-precision training. The export raw_excerpt was blocked (cookie/query data), so this summary is based only on the title and source; specific methods and figures are unverified.
-
S4oP: Operator-level Pruning of Structured State Space Models for Resource-Constrained Devices
S4oP prunes structured state space models at the operator level
Structured state space models such as S4 and S4D capture long-range dependencies but are hard to deploy on constrained devices. S4oP introduces operator-level pruning to enable efficient deployment of SSMs on time- and resource-constrained hardware.
-
EAGG: Embodiment-Aligned Grasp Generation via Geometry-Aware Graph Conditioning
EAGG: embodiment-aligned grasp generation via graph conditioning
The paper presents EAGG, an embodiment-aligned grasp generator that represents each end-effector with a topology-aware graph and embodiment-specific conditioning, aiming to generalize grasp generation across objects and diverse robot embodiments.
-
Securing the future of AI agents
DeepMind outlines an AI Control Roadmap to secure AI agents
Google DeepMind presents an AI Control Roadmap for securing the future of AI agents, combining traditional safeguards with real-time monitoring to protect internal systems. The framework lays out layered defenses against agent misuse and unsafe behavior as agents proliferate.
-
Edge Flow: A Tractable and Predictive Continuous-Time Model for Gradient Descent at the Edge of Stability
Edge Flow: a tractable continuous-time model for GD at the edge of stability
Gradient descent in deep learning can operate at the edge of stability, where the loss Hessian's top eigenvalue hovers near the stability threshold. Classical tools fail there, so Edge Flow offers a tractable, predictive continuous-time model of this regime.
-
Agentic AI-based Framework for Mitigating Premature Diagnostic Handoff and Silent Hallucination in Healthcare Applications
A multi-agent framework against premature handoff and silent hallucination
The paper proposes a multi-agent framework for healthcare that mitigates premature diagnostic handoff and silent clinical hallucinations, replacing LLM-as-a-judge routing with deterministic orchestration constraints and adding two safety mechanisms.
-
NoiseTilt: Noise-Tilted Reverse Kernels for Diffusion Reward Alignment
NoiseTilt injects reward gradients via the noise term in diffusion
NoiseTilt (NTRK) is a reward-guided diffusion sampler that injects reward gradients through the noise term, leaving the score kernel unchanged and needing only a single sample per step, improving reward alignment of pretrained diffusion models.
-
PseudoBench: Measuring How Agentic Auto-Research Fuels Pseudoscience
PseudoBench measures how agentic auto-research fuels pseudoscience
As LLM-based agents enter autonomous scientific research, resisting pseudoscience matters. PseudoBench is an adversarial benchmark measuring how such agents may rapidly generate plausible yet misleading studies that contaminate academic literature.
-
Compositional Skill Routing for LLM Agents: Decompose, Retrieve, and Compose
Compositional skill routing for LLM agents: decompose, retrieve, compose
LLM agents rely on reusable tool specifications (skills), but real tasks require composing multiple skills. The paper formalizes compositional skill routing: decomposing a complex query into atomic sub-tasks, retrieving relevant skills, and composing them.
-
Uncertainty Quantification for Flow-Based Vision-Language-Action Models
Uncertainty quantification for flow-based vision-language-action models
Vision-language-action models combine vision-language backbones with expressive generative action heads trained via flow matching on large robotic datasets. These models perform strongly, and the paper studies how to quantify the uncertainty of flow-based VLA models.
-
NVIDIA Blackwell Tops MLPerf Training 6.0 with Industry-Leading Scale and Performance
NVIDIA says Blackwell tops MLPerf Training 6.0 benchmark
NVIDIA announced that its Blackwell GPU architecture topped the MLPerf Training 6.0 benchmark with what it calls industry-leading scale and performance. Summarized neutrally from the title; the export excerpt was blocked (cookie/query data), so figures are vendor claims, not independently verified.
-
ProvenanceGuard: Source-Aware Factuality Verification for MCP-Based LLM Agents
ProvenanceGuard: source-aware factuality verification for MCP agents
Tool-using LLM agents use the Model Context Protocol to answer from heterogeneous sources like search, APIs, databases, and clinical records. ProvenanceGuard provides source-aware factuality verification to catch provenance-sensitive failure modes that standard metrics miss.
-
LoopCoder-v2: Only Loop Once for Efficient Test-Time Computation Scaling
LoopCoder-v2: loop once for efficient test-time compute scaling
Looped transformers scale latent computation by repeating shared blocks, but sequential looping raises latency and KV-cache memory with loop count. Building on parallel loop transformers, LoopCoder-v2 makes loop count a practical knob for efficient test-time computation scaling.
-
LegalHalluLens: Typed Hallucination Auditing and Calibrated Multi-Agent Debate for Trustworthy Legal AI
LegalHalluLens audits typed legal-AI hallucinations with calibrated debate
Legal-AI systems hallucinate at aggregate rates near 52%, but averages hide where and how errors concentrate. LegalHalluLens is an auditing framework pairing typed hallucination auditing with calibrated multi-agent debate to give compliance officers actionable signals for trustworthy legal AI.
-
Fast Nonparametric Conditional Independence Testing via Two-Stage Regression
Fast nonparametric conditional independence testing via two-stage regression
Conditional independence testing is fundamental to statistics and causal inference. The paper proposes a fast nonparametric conditional independence test based on two-stage regression, aiming to improve computational efficiency and power.
-
LLM Consumer Behavior Theory: Foundations of a Novel Research Field
LLM Consumer Behavior Theory: a new field for agentic markets
The paper introduces LLM Consumer Behavior Theory, a proposed field analyzing consumer behavior in agentic markets where LLMs make consumption decisions on behalf of users, drawing on classical and behavioral economics alongside NLP.
-
VoidPadding: Let [VOID] Handle Padding in Masked Diffusion Language Models so that [EOS] Can Focus on Semantic Termination
VoidPadding lets [VOID] handle padding so [EOS] focuses on termination
In masked diffusion language models, the roles of padding and semantic termination get entangled. VoidPadding introduces a [VOID] token to handle padding so that [EOS] can focus on signaling semantic termination, improving generation behavior.