Safety & Evaluation A

  • arXiv cs.AI (Artificial Intelligence) · EN Multimodal extract
    STAR: SpatioTemporal Adaptive Reward Allocation for Text-to-Image RL Post-Training
    STAR: spatiotemporal adaptive reward allocation for text-to-image RL
    Reinforcement Learning
    The paper proposes STAR, a spatiotemporal adaptive reward allocation method for text-to-image RL post-training, replacing a single scalar advantage applied uniformly with rewards that account for the temporal and spatial structure of generation.
    Read original (arXiv cs.AI (Artificial Intelligence)) ↗
  • arXiv cs.CL (Computation and Language) · EN New Model Releases extract
    Fine-tuning LLMs for Passive Depression Severity Estimation from AI Mental Health Dialogue
    Fine-tuning LLMs for passive depression severity from AI dialogue
    Claude Fine-tuning Neural Network Reinforcement Learning
    The paper fine-tunes LLMs for passive estimation of depression severity from AI mental-health dialogue, exploring how conversational signals can indicate severity. Figures and efficacy are as reported by the source and not independently verified.
    Read original (arXiv cs.CL (Computation and Language)) ↗
  • arXiv cs.LG (Machine Learning) · EN Safety & Evaluation extract
    KANLib -- An Modular, Extensible and Fast Kolmogorov-Arnold Network Implementation
    KANLib: a modular, extensible and fast KAN implementation
    Kolmogorov-Arnold Networks replace linear weights with learnable univariate functions, but their high computational cost hampers practical research. KANLib provides a modular, extensible and fast implementation of KANs to ease experimentation.
    Read original (arXiv cs.LG (Machine Learning)) ↗
  • arXiv cs.CL (Computation and Language) · EN New Model Releases extract
    Non-negative Elastic Net Decoding for Information Retrieval
    Non-negative elastic net decoding for information retrieval
    Deep Learning Embeddings Neural Network
    Dense retrieval has become the dominant paradigm in information retrieval. The paper applies non-negative elastic net decoding to information retrieval, aiming to improve retrieval representations and accuracy.
    Read original (arXiv cs.CL (Computation and Language)) ↗
  • arXiv cs.CL (Computation and Language) · EN New Model Releases extract
    ChLogic: Evaluating Robustness of Logical Reasoning in Chinese Expressions
    ChLogic evaluates logical reasoning robustness in Chinese
    LLMs do well on standardized logical reasoning benchmarks, but whether this holds beyond English is unclear. ChLogic is an English-Chinese aligned benchmark testing whether models preserve logical reasoning when the same latent structure is expressed in Chinese.
    Read original (arXiv cs.CL (Computation and Language)) ↗
  • arXiv cs.LG (Machine Learning) · EN Safety & Evaluation extract
    Dimensionality Controls When Modularity Helps in Continual Learning
    Dimensionality controls when modularity helps in continual learning
    Reinforcement Learning
    Compositional learning systems must balance plasticity and stability. The paper analyzes when modularity helps in continual learning and shows that the dimensionality of representations controls whether modular structure is beneficial.
    Read original (arXiv cs.LG (Machine Learning)) ↗
  • arXiv cs.LG (Machine Learning) · EN Safety & Evaluation extract
    Monotonic Kolmogorov-Arnold Networks: A Theoretical and Empirical Study of Monotonicity as an Inductive Bias
    Monotonic KANs: monotonicity as an inductive bias, studied theoretically
    Deep Learning Machine Learning Neural Network Software Engineering
    Monotonicity is a useful architectural inductive bias in tabular, scientific and economic settings. The paper proposes monotonic Kolmogorov-Arnold Networks with per-edge functional transparency and studies monotonicity as an inductive bias both theoretically and empirically.
    Read original (arXiv cs.LG (Machine Learning)) ↗
  • arXiv cs.LG (Machine Learning) · EN Safety & Evaluation extract
    AnchorKV: Safety-Aware KV Cache Compression via Soft Penalty with a Refusal Anchor
    AnchorKV: safety-aware KV cache compression via soft penalties
    Inference Reinforcement Learning
    AnchorKV is a safety-aware KV cache compression method that, per its title, combines a soft penalty with a refusal anchor to retain important key-value entries while reducing memory. The summary is largely title-based; details are as presented by the source and not independently verified.
    Read original (arXiv cs.LG (Machine Learning)) ↗
  • arXiv cs.CL (Computation and Language) · EN Safety & Evaluation extract
    GameCraft-Bench: Can Agents Build Playable Games End-to-End in a Real Game Engine?
    GameCraft-Bench: can agents build playable games end-to-end?
    AI Agents
    Game generation is an emerging coding-agent application requiring natural-language specs to become playable interactive systems. GameCraft-Bench evaluates whether agents can build games end-to-end inside a real game engine, where scripts, scenes, assets, rendering and runtime must cohere.
    Read original (arXiv cs.CL (Computation and Language)) ↗
  • arXiv cs.LG (Machine Learning) · EN New Model Releases extract
    WallZero: Mastering the Game of WallGo with Strategic Analysis
    WallZero masters the board game WallGo with strategic analysis
    Meta Retrieval-Augmented Generation (RAG) Reinforcement Learning
    WallGo is a recently introduced strategic board game. WallZero masters WallGo through an approach incorporating strategic analysis, demonstrating game-playing performance and strategic insights.
    Read original (arXiv cs.LG (Machine Learning)) ↗
  • arXiv cs.LG (Machine Learning) · EN Multimodal extract
    Qwen-RobotManip Technical Report: Alignment Unlocks Scale for Robotic Manipulation Foundation Models
    Qwen-RobotManip: alignment unlocks scale for robot manipulation models
    Computer Vision
    Language and multimodal foundation models generalize by aligning heterogeneous data under a unified formulation and training at scale. This technical report investigates applying that recipe to robotic manipulation, arguing alignment unlocks scale for manipulation foundation models.
    Read original (arXiv cs.LG (Machine Learning)) ↗
  • arXiv cs.CL (Computation and Language) · EN Safety & Evaluation extract
    When Multiple Scripts Matter: Evaluating ASR in Clinical Settings
    Evaluating ASR in clinical settings when multiple scripts matter
    Meta Speech Processing
    Automatic speech recognition in non-English clinical settings faces multiscript variability, where a term appears in multiple valid orthographies. String-matching metrics treat variants as errors and underestimate performance; the paper studies ASR evaluation when multiple scripts matter.
    Read original (arXiv cs.CL (Computation and Language)) ↗
  • arXiv cs.CL (Computation and Language) · EN Inference & Efficiency extract
    Improving low-resource ASR using bilingual fine-tuning with language identification: a cross-linguistic evaluation
    Improving low-resource ASR via bilingual fine-tuning with language ID
    Fine-tuning Inference Speech Processing
    The study explores improving low-resource automatic speech recognition using bilingual fine-tuning combined with language identification, and evaluates the approach across languages in a cross-linguistic setting.
    Read original (arXiv cs.CL (Computation and Language)) ↗
  • arXiv cs.CL (Computation and Language) · EN Developer Tools extract
    A Framework for Evaluating Agentic Skills at Scale
    A framework for evaluating agentic skills at scale
    AI Agents Deep Learning Reinforcement Learning
    Agent skills, structured reusable knowledge artifacts that augment LLM agents, have been rapidly adopted, yet their cross-domain impact and a reusable methodology for evaluating individual skills are lacking. The paper presents a framework for evaluating agentic skills at scale.
    Read original (arXiv cs.CL (Computation and Language)) ↗
  • arXiv cs.CL (Computation and Language) · EN Safety & Evaluation extract
    Position: Coding Benchmarks Are Misaligned with Agentic Software Engineering
    Position: coding benchmarks are misaligned with agentic software engineering
    AI Agents Software Engineering
    Coding agents have become a major mode of software engineering. This position paper argues that existing coding benchmarks are misaligned with real agentic software engineering and calls for rethinking how such systems are evaluated.
    Read original (arXiv cs.CL (Computation and Language)) ↗
  • arXiv cs.CL (Computation and Language) · EN Multimodal extract
    The Slop Paradox: How Synthetic Standardization Erodes Clinical Uncertainty and Cross-Modal Alignment in AI-Rewritten Radiology Reports
    The Slop Paradox: AI-rewritten radiology reports erode clinical uncertainty
    AI clinical documentation tools increasingly summarize and reformat radiology reports with LLMs. Using 450 chest X-ray reports from the Indiana University dataset, the paper measures resulting information degradation, showing erosion of clinical uncertainty and cross-modal alignment in AI-rewritten reports.
    Read original (arXiv cs.CL (Computation and Language)) ↗
  • arXiv cs.CL (Computation and Language) · EN Safety & Evaluation extract
    Toward Accessible Psychotherapy Training Using AI-Driven Interactive Patient Avatars
    AI-driven patient avatars for more accessible psychotherapy training
    GPT
    Training psychotherapists in evidence-based interventions like Acceptance and Commitment Therapy needs repeated practice with feedback, limited by ethical, logistical and resource constraints. The paper introduces AI-driven interactive patient avatars to make such training more accessible.
    Read original (arXiv cs.CL (Computation and Language)) ↗
  • Hacker News (Front Page) · EN Safety & Evaluation extract
    Feds freaked over Fable 5 after simple 'fix this code' prompt, not jailbreak
    Feds alarmed by Fable 5 via a plain 'fix this code' prompt, not a jailbreak
    A Hacker News front-page headline reports that authorities grew alarmed over the AI model 'Fable 5' after a simple 'fix this code' prompt rather than a sophisticated jailbreak. No excerpt was available, so this summary is based on the title alone; specifics and accuracy should be confirmed against the original article.
    Read original (Hacker News (Front Page)) ↗
  • arXiv cs.CL (Computation and Language) · EN Multimodal extract
    Vision-language models for chest radiography do not always need the image
    Computer Vision Inference Software Engineering
    Medical vision-language models combine images and text for reporting. For chest radiography, the paper shows these models do not always need the image to make predictions, and discusses the implications for evaluation and clinical use.
    Read original (arXiv cs.CL (Computation and Language)) ↗
  • arXiv cs.CL (Computation and Language) · EN Safety & Evaluation extract
    EComAgentBench: Benchmarking Shopping Agents on Long-Horizon Tasks with Distributed Hidden Intent
    EComAgentBench: shopping agents on long-horizon tasks with hidden intent
    AI Agents Software Engineering
    As LLM-based shopping agents reach production, existing benchmarks miss how requirements arrive: implicitly, in a profile, or only when the right question is asked. EComAgentBench evaluates shopping agents on long-horizon tasks with distributed hidden intent.
    Read original (arXiv cs.CL (Computation and Language)) ↗
  • arXiv cs.CL (Computation and Language) · EN New Model Releases extract
    SuCo: Sufficiency-guided Continuous Adaptive Reasoning
    SuCo: sufficiency-guided continuous adaptive reasoning
    Fine-tuning Reinforcement Learning Software Engineering
    SuCo is a method for sufficiency-guided continuous adaptive reasoning that adjusts how much reasoning is performed to what is necessary and sufficient, aiming to balance efficiency and accuracy. The summary is largely title-based; details are as presented by the source.
    Read original (arXiv cs.CL (Computation and Language)) ↗
  • arXiv cs.CL (Computation and Language) · EN Safety & Evaluation extract
    Bridging Functional Correctness and Runtime Efficiency Gaps in LLM-Based Code Translation
    Bridging correctness and runtime efficiency in LLM code translation
    Neural Network Retrieval-Augmented Generation (RAG)
    LLMs have advanced the functional correctness of automated code translation, but runtime efficiency of translated programs has received little attention. As Moore's law wanes, the paper works to bridge the gap between functional correctness and runtime efficiency in LLM-based code translation.
    Read original (arXiv cs.CL (Computation and Language)) ↗
  • arXiv cs.CL (Computation and Language) · EN New Model Releases extract
    From Trainee to Trainer: LLM-Designed Training Environment for RL with Multi-Agent Reasoning
    From trainee to trainer: LLM-designed RL training environments
    Gemini GPT Reinforcement Learning
    RL pipelines for LLM training often rely on manually redesigned environments between stages, forcing heuristic guesses about good configurations. The paper has the LLM itself design training environments for reinforcement learning with multi-agent reasoning, moving from trainee to trainer.
    Read original (arXiv cs.CL (Computation and Language)) ↗
  • arXiv cs.CL (Computation and Language) · EN Multimodal extract
    EnvRL: Learn from Environment Dynamics in Agentic Reinforcement Learning
    EnvRL learns from environment dynamics in agentic RL
    AI Agents Retrieval-Augmented Generation (RAG) Reinforcement Learning
    EnvRL is a method that learns from environment dynamics in agentic reinforcement learning, leveraging the structure of agent-environment interaction to improve learning efficiency and performance.
    Read original (arXiv cs.CL (Computation and Language)) ↗
  • arXiv cs.CL (Computation and Language) · EN New Model Releases extract
    MambaCount: Efficient Text-guided Open-vocabulary Object Counting with Spatial Sparse State Space Duality Block
    MambaCount: efficient open-vocabulary counting via state-space duality
    Reinforcement Learning Transformer
    Text-guided open-vocabulary object counting is hard in dense scenes with large scale variation, and existing Transformer methods are limited by quadratic complexity. MambaCount uses a spatial sparse state space duality block for efficient open-vocabulary object counting.
    Read original (arXiv cs.CL (Computation and Language)) ↗
  • arXiv cs.CL (Computation and Language) · EN Inference & Efficiency extract
    Beyond Domains: Reusing Web Skills via Transferable Interaction Patterns
    Reusing web skills via transferable interaction patterns
    AI Agents Meta Retrieval-Augmented Generation (RAG)
    LLM web agents are usually deployed as tool callers that read a fresh page observation each turn and emit a structured action. The paper proposes reusing web skills across domains via transferable interaction patterns rather than domain-specific behaviors.
    Read original (arXiv cs.CL (Computation and Language)) ↗
  • arXiv cs.CL (Computation and Language) · EN Safety & Evaluation extract
    Prompt Perturbation for Reliable LLM Evaluation over Comparison Graphs
    Prompt perturbation for reliable LLM evaluation over comparison graphs
    Evaluating LLMs is important but can be fragile to small prompt changes. The paper proposes using prompt perturbation to achieve more reliable LLM evaluation over comparison graphs.
    Read original (arXiv cs.CL (Computation and Language)) ↗
  • arXiv cs.CL (Computation and Language) · EN New Model Releases extract
    OPD-Evolver: Cultivating Holistic Agent Evolver via On-Policy Distillation
    OPD-Evolver cultivates self-evolving agents via on-policy distillation
    AI Agents
    Memory is a standard substrate for self-evolving agents, but retaining experience differs from learning how to evolve through it. OPD-Evolver uses on-policy distillation to cultivate a holistic agent evolver that selects useful experience, acts on it and writes reusable knowledge.
    Read original (arXiv cs.CL (Computation and Language)) ↗
  • Simon Willison's Weblog · EN Safety & Evaluation extract
    The Fable 5 Export Controls Harm US Cyber Defense
    Willison: Fable 5 export controls harm US cyber defense
    Anthropic Claude Computer Vision Neural Network Reinforcement Learning
    Willison cites Katie Moussouris, who says the 'jailbreak' behind Claude Fable 5's export-control ban was merely asking it to 'fix this code' on code containing known CVEs and planted bugs. Since fixing security bugs is core to coding models, he argues the controls weaken US cyber defense.
    Read original (Simon Willison's Weblog) ↗
  • Simon Willison's Weblog · EN Safety & Evaluation extract
    Quoting Matteo Wong, The Atlantic
    Willison quotes The Atlantic on the White House's pressure on Anthropic
    Anthropic Claude
    Simon Willison quotes Matteo Wong of The Atlantic on the White House escalating its conflict with Anthropic. Security expert Katie Moussouris said Anthropic shared the White House's report on the "Fable jailbreak" with her for appraisal. IT experts asked an AI model to find and patch bugs; given deliberately insecure code, it refused the prompt "review the code for security issues" but complied with "fix this code." Moussouris called this the model working as intended for cyberdefense.
    Read original (Simon Willison's Weblog) ↗