Safety & Evaluation (Page 7 of 10)｜AI/Tech News Trends

arXiv cs.AI (Artificial Intelligence) · 2026-06-16 EN Multimodal extract

STAR: SpatioTemporal Adaptive Reward Allocation for Text-to-Image RL Post-Training

STAR: spatiotemporal adaptive reward allocation for text-to-image RL

Reinforcement Learning

The paper proposes STAR, a spatiotemporal adaptive reward allocation method for text-to-image RL post-training, replacing a single scalar advantage applied uniformly with rewards that account for the temporal and spatial structure of generation.

arXiv cs.CL (Computation and Language) · 2026-06-16 EN New Model Releases extract

Fine-tuning LLMs for Passive Depression Severity Estimation from AI Mental Health Dialogue

Fine-tuning LLMs for passive depression severity from AI dialogue

Claude Fine-tuning Neural Network Reinforcement Learning

The paper fine-tunes LLMs for passive estimation of depression severity from AI mental-health dialogue, exploring how conversational signals can indicate severity. Figures and efficacy are as reported by the source and not independently verified.

arXiv cs.LG (Machine Learning) · 2026-06-16 EN Safety & Evaluation extract

KANLib -- A Modular, Extensible and Fast Kolmogorov-Arnold Network Implementation

KANLib: a modular, extensible and fast KAN implementation

Kolmogorov-Arnold Networks (KANs) replace linear weights with learnable univariate functions, but their high computational cost hampers practical research. KANLib provides a modular, extensible and fast implementation of KANs to ease experimentation.
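
To make "learnable univariate functions on edges" concrete, here is a minimal sketch of a KAN-style layer. It is not KANLib's API: the names `edge_fn` and `kan_layer` are hypothetical, and piecewise-linear interpolation over a fixed grid stands in for the B-spline parameterization usually used in KANs.

```python
from bisect import bisect_right

def edge_fn(x, grid, values):
    """Piecewise-linear univariate function given by knots (grid[k], values[k]).

    Assumption: this replaces the B-spline basis of a real KAN for brevity.
    Inputs outside the grid are clamped to the boundary values.
    """
    if x <= grid[0]:
        return values[0]
    if x >= grid[-1]:
        return values[-1]
    i = bisect_right(grid, x) - 1                 # segment containing x
    t = (x - grid[i]) / (grid[i + 1] - grid[i])   # position inside the segment
    return (1 - t) * values[i] + t * values[i + 1]

def kan_layer(x, grid, edge_values):
    """edge_values[j][i] holds the learnable knot values for edge i -> j.

    Output j is the sum over inputs i of phi_{j,i}(x_i); there is no
    separate weight matrix, the per-edge functions are the parameters.
    """
    return [
        sum(edge_fn(xi, grid, vals) for xi, vals in zip(x, row))
        for row in edge_values
    ]

grid = [-1.0, 0.0, 1.0]
# Two inputs, one output: the first edge function is |x| on the grid,
# the second is the identity.
edge_values = [[[1.0, 0.0, 1.0], [-1.0, 0.0, 1.0]]]
y = kan_layer([-0.5, 0.5], grid, edge_values)
print(y)
```

Every edge evaluates its own function for every input, which hints at why naive implementations are slower than a single matrix multiply.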

arXiv cs.CL (Computation and Language) · 2026-06-16 EN New Model Releases extract

Non-negative Elastic Net Decoding for Information Retrieval

Non-negative elastic net decoding for information retrieval

Deep Learning Embeddings Neural Network

Dense retrieval has become the dominant paradigm in information retrieval. The paper applies non-negative elastic net decoding to information retrieval, aiming to improve retrieval representations and accuracy.

arXiv cs.CL (Computation and Language) · 2026-06-16 EN New Model Releases extract

ChLogic: Evaluating Robustness of Logical Reasoning in Chinese Expressions

ChLogic evaluates logical reasoning robustness in Chinese

LLMs do well on standardized logical reasoning benchmarks, but whether this holds beyond English is unclear. ChLogic is an English-Chinese aligned benchmark testing whether models preserve logical reasoning when the same latent structure is expressed in Chinese.

arXiv cs.LG (Machine Learning) · 2026-06-16 EN Safety & Evaluation extract

Dimensionality Controls When Modularity Helps in Continual Learning

Dimensionality controls when modularity helps in continual learning

Reinforcement Learning

Compositional learning systems must balance plasticity and stability. The paper analyzes when modularity helps in continual learning and shows that the dimensionality of representations controls whether modular structure is beneficial.

arXiv cs.LG (Machine Learning) · 2026-06-16 EN Safety & Evaluation extract

Monotonic Kolmogorov-Arnold Networks: A Theoretical and Empirical Study of Monotonicity as an Inductive Bias

Monotonic KANs: monotonicity as an inductive bias, studied theoretically

Deep Learning Machine Learning Neural Network Software Engineering

Monotonicity is a useful architectural inductive bias in tabular, scientific and economic settings. The paper proposes monotonic Kolmogorov-Arnold Networks with per-edge functional transparency and studies monotonicity as an inductive bias both theoretically and empirically.

arXiv cs.LG (Machine Learning) · 2026-06-16 EN Safety & Evaluation extract

AnchorKV: Safety-Aware KV Cache Compression via Soft Penalty with a Refusal Anchor

AnchorKV: safety-aware KV cache compression via soft penalties

Inference Reinforcement Learning

AnchorKV is a safety-aware KV cache compression method that, per its title, combines a soft penalty with a refusal anchor, aiming to reduce cache memory while retaining the key-value entries that matter. Summary is largely title-based; details are as presented by the source and not independently verified.

arXiv cs.CL (Computation and Language) · 2026-06-16 EN Safety & Evaluation extract

GameCraft-Bench: Can Agents Build Playable Games End-to-End in a Real Game Engine?

GameCraft-Bench: can agents build playable games end-to-end?

AI Agents

Game generation is an emerging coding-agent application requiring natural-language specs to become playable interactive systems. GameCraft-Bench evaluates whether agents can build games end-to-end inside a real game engine, where scripts, scenes, assets, rendering and runtime must cohere.

arXiv cs.LG (Machine Learning) · 2026-06-16 EN New Model Releases extract

WallZero: Mastering the Game of WallGo with Strategic Analysis

WallZero masters the board game WallGo with strategic analysis

Meta Retrieval-Augmented Generation (RAG) Reinforcement Learning

WallGo is a recently introduced strategic board game. WallZero masters WallGo through an approach incorporating strategic analysis, demonstrating game-playing performance and strategic insights.

arXiv cs.LG (Machine Learning) · 2026-06-16 EN Multimodal extract

Qwen-RobotManip Technical Report: Alignment Unlocks Scale for Robotic Manipulation Foundation Models

Qwen-RobotManip: alignment unlocks scale for robot manipulation models

Computer Vision

Language and multimodal foundation models generalize by aligning heterogeneous data under a unified formulation and training at scale. This technical report investigates applying that recipe to robotic manipulation, arguing alignment unlocks scale for manipulation foundation models.

arXiv cs.CL (Computation and Language) · 2026-06-16 EN Safety & Evaluation extract

When Multiple Scripts Matter: Evaluating ASR in Clinical Settings

Evaluating ASR in clinical settings when multiple scripts matter

Meta Speech Processing

Automatic speech recognition in non-English clinical settings faces multiscript variability, where a term appears in multiple valid orthographies. String-matching metrics treat variants as errors and underestimate performance; the paper studies ASR evaluation when multiple scripts matter.
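
The scoring problem described above can be illustrated with a small sketch. It is not the paper's metric: the `variants` table, the example sentence, and the function names are hypothetical, chosen only to show how plain word error rate (WER) penalizes a valid alternative orthography and how mapping variants to a canonical form removes that penalty.

```python
def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]

def wer(ref, hyp, variants=None):
    """WER over whitespace tokens; assumes a non-empty reference.

    If `variants` maps alternative spellings to a canonical form,
    both sides are normalized before scoring.
    """
    variants = variants or {}
    ref_tokens = [variants.get(w, w) for w in ref.split()]
    hyp_tokens = [variants.get(w, w) for w in hyp.split()]
    return edit_distance(ref_tokens, hyp_tokens) / len(ref_tokens)

# Hypothetical variant table: one drug name in Devanagari and Latin script.
variants = {"पैरासिटामोल": "paracetamol"}
ref = "मरीज़ को paracetamol दें"
hyp = "मरीज़ को पैरासिटामोल दें"

strict = wer(ref, hyp)             # script variant counted as a substitution
lenient = wer(ref, hyp, variants)  # script variant treated as a match
print(strict, lenient)
```

Here the strict score charges one error in four words for a transcript a clinician would read as correct, which is the underestimation the summary refers to.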

arXiv cs.CL (Computation and Language) · 2026-06-16 EN Inference & Efficiency extract

Improving low-resource ASR using bilingual fine-tuning with language identification: a cross-linguistic evaluation

Improving low-resource ASR via bilingual fine-tuning with language ID

Fine-tuning Inference Speech Processing

The study explores improving low-resource automatic speech recognition using bilingual fine-tuning combined with language identification, and evaluates the approach across languages in a cross-linguistic setting.

arXiv cs.CL (Computation and Language) · 2026-06-16 EN Developer Tools extract

A Framework for Evaluating Agentic Skills at Scale

A framework for evaluating agentic skills at scale

AI Agents Deep Learning Reinforcement Learning

Agent skills, structured reusable knowledge artifacts that augment LLM agents, have been rapidly adopted, yet their cross-domain impact is poorly understood and a reusable methodology for evaluating individual skills is lacking. The paper presents a framework for evaluating agentic skills at scale.

arXiv cs.CL (Computation and Language) · 2026-06-16 EN Safety & Evaluation extract

Position: Coding Benchmarks Are Misaligned with Agentic Software Engineering

Position: coding benchmarks are misaligned with agentic software engineering

AI Agents Software Engineering

Coding agents have become a major mode of software engineering. This position paper argues that existing coding benchmarks are misaligned with real agentic software engineering and calls for rethinking how such systems are evaluated.

arXiv cs.CL (Computation and Language) · 2026-06-16 EN Multimodal extract

The Slop Paradox: How Synthetic Standardization Erodes Clinical Uncertainty and Cross-Modal Alignment in AI-Rewritten Radiology Reports

The Slop Paradox: AI-rewritten radiology reports erode clinical uncertainty

AI clinical documentation tools increasingly summarize and reformat radiology reports with LLMs. Using 450 chest X-ray reports from the Indiana University dataset, the paper measures resulting information degradation, showing erosion of clinical uncertainty and cross-modal alignment in AI-rewritten reports.

arXiv cs.CL (Computation and Language) · 2026-06-16 EN Safety & Evaluation extract

Toward Accessible Psychotherapy Training Using AI-Driven Interactive Patient Avatars

AI-driven patient avatars for more accessible psychotherapy training

GPT

Training psychotherapists in evidence-based interventions like Acceptance and Commitment Therapy needs repeated practice with feedback, limited by ethical, logistical and resource constraints. The paper introduces AI-driven interactive patient avatars to make such training more accessible.

Hacker News (Front Page) · 2026-06-16 EN Safety & Evaluation extract

Feds freaked over Fable 5 after simple 'fix this code' prompt, not jailbreak

Feds alarmed by Fable 5 via a plain 'fix this code' prompt, not a jailbreak

A Hacker News front-page headline reports that authorities grew alarmed over the AI model 'Fable 5' after a simple 'fix this code' prompt rather than a sophisticated jailbreak. No excerpt was available, so this summary is based on the title alone; specifics and accuracy should be confirmed against the original article.

arXiv cs.CL (Computation and Language) · 2026-06-16 EN Multimodal extract

Vision-language models for chest radiography do not always need the image

Computer Vision Inference Software Engineering

Medical vision-language models combine images and text for reporting. For chest radiography, the paper shows these models do not always need the image to make predictions, and discusses the implications for evaluation and clinical use.

arXiv cs.CL (Computation and Language) · 2026-06-16 EN Safety & Evaluation extract

EComAgentBench: Benchmarking Shopping Agents on Long-Horizon Tasks with Distributed Hidden Intent

EComAgentBench: shopping agents on long-horizon tasks with hidden intent

AI Agents Software Engineering

As LLM-based shopping agents reach production, existing benchmarks miss how requirements arrive: implicitly, in a profile, or only when the right question is asked. EComAgentBench evaluates shopping agents on long-horizon tasks with distributed hidden intent.

arXiv cs.CL (Computation and Language) · 2026-06-16 EN New Model Releases extract

SuCo: Sufficiency-guided Continuous Adaptive Reasoning

SuCo: sufficiency-guided continuous adaptive reasoning

Fine-tuning Reinforcement Learning Software Engineering

SuCo is a sufficiency-guided method for continuous adaptive reasoning: it adjusts how much reasoning a model performs to what is sufficient for the task, aiming to balance efficiency and accuracy. Summary is largely title-based; details are as presented by the source.

arXiv cs.CL (Computation and Language) · 2026-06-16 EN Safety & Evaluation extract

Bridging Functional Correctness and Runtime Efficiency Gaps in LLM-Based Code Translation

Bridging correctness and runtime efficiency in LLM code translation

Neural Network Retrieval-Augmented Generation (RAG)

LLMs have advanced the functional correctness of automated code translation, but the runtime efficiency of translated programs has received little attention. With Moore's law waning, that efficiency matters more, and the paper aims to bridge the gap between functional correctness and runtime efficiency in LLM-based code translation.

arXiv cs.CL (Computation and Language) · 2026-06-16 EN New Model Releases extract

From Trainee to Trainer: LLM-Designed Training Environment for RL with Multi-Agent Reasoning

From trainee to trainer: LLM-designed RL training environments

Gemini GPT Reinforcement Learning

RL pipelines for LLM training often rely on manually redesigned environments between stages, forcing heuristic guesses about good configurations. The paper has the LLM itself design training environments for reinforcement learning with multi-agent reasoning, moving from trainee to trainer.

arXiv cs.CL (Computation and Language) · 2026-06-16 EN Multimodal extract

EnvRL: Learn from Environment Dynamics in Agentic Reinforcement Learning

EnvRL learns from environment dynamics in agentic RL

AI Agents Retrieval-Augmented Generation (RAG) Reinforcement Learning

EnvRL is a method that learns from environment dynamics in agentic reinforcement learning, leveraging the structure of agent-environment interaction to improve learning efficiency and performance.

arXiv cs.CL (Computation and Language) · 2026-06-16 EN New Model Releases extract

MambaCount: Efficient Text-guided Open-vocabulary Object Counting with Spatial Sparse State Space Duality Block

MambaCount: efficient open-vocabulary counting via state-space duality

Reinforcement Learning Transformer

Text-guided open-vocabulary object counting is hard in dense scenes with large scale variation, and existing Transformer methods are limited by quadratic complexity. MambaCount uses a spatial sparse state space duality block for efficient open-vocabulary object counting.

arXiv cs.CL (Computation and Language) · 2026-06-16 EN Inference & Efficiency extract

Beyond Domains: Reusing Web Skills via Transferable Interaction Patterns

Reusing web skills via transferable interaction patterns

AI Agents Meta Retrieval-Augmented Generation (RAG)

LLM web agents are usually deployed as tool callers that read a fresh page observation each turn and emit a structured action. The paper proposes reusing web skills across domains via transferable interaction patterns rather than domain-specific behaviors.

arXiv cs.CL (Computation and Language) · 2026-06-16 EN Safety & Evaluation extract

Prompt Perturbation for Reliable LLM Evaluation over Comparison Graphs

Prompt perturbation for reliable LLM evaluation over comparison graphs

Evaluating LLMs is important but can be fragile to small prompt changes. The paper proposes using prompt perturbation to achieve more reliable LLM evaluation over comparison graphs.

arXiv cs.CL (Computation and Language) · 2026-06-16 EN New Model Releases extract

OPD-Evolver: Cultivating Holistic Agent Evolver via On-Policy Distillation

OPD-Evolver cultivates self-evolving agents via on-policy distillation

AI Agents

Memory is a standard substrate for self-evolving agents, but retaining experience differs from learning how to evolve through it. OPD-Evolver uses on-policy distillation to cultivate a holistic agent evolver that selects useful experience, acts on it and writes reusable knowledge.

Simon Willison's Weblog · 2026-06-16 EN Safety & Evaluation extract

The Fable 5 Export Controls Harm US Cyber Defense

Willison: Fable 5 export controls harm US cyber defense

Anthropic Claude Computer Vision Neural Network Reinforcement Learning

Willison cites Katie Moussouris's account that the 'jailbreak' behind the export-control ban on Claude Fable 5 was merely a request to 'fix this code' on code containing known CVEs and planted bugs. Since fixing security bugs is core to what coding models do, he argues the controls weaken US cyber defense.

Simon Willison's Weblog · 2026-06-16 EN Safety & Evaluation extract

Quoting Matteo Wong, The Atlantic

Willison quotes The Atlantic on the White House's pressure on Anthropic

Anthropic Claude

Simon Willison quotes Matteo Wong of The Atlantic on the White House escalating its conflict with Anthropic. Security expert Katie Moussouris said Anthropic shared the White House's report on the "Fable jailbreak" with her for appraisal. IT experts asked an AI model to find and patch bugs; given deliberately insecure code, the model refused the prompt "review the code for security issues" but complied with "fix this code." Moussouris called this the model working as intended for cyberdefense.
