Developer Tools B

  • arXiv cs.LG (Machine Learning) · EN New Model Releases extract
    SCAN: Enhance Time Series Anomaly Detection via Multi-Scale Neighborhood-Centered Clustering
    SCAN boosts time-series anomaly detection via neighborhood clustering
    Reinforcement Learning
    Time-series anomaly detection is crucial across applications, and reconstruction-based methods dominate but suffer from over-generalization that reconstructs anomalies too well. SCAN uses multi-scale neighborhood-centered clustering to curb this over-generalization and improve detection.
    Read original (arXiv cs.LG (Machine Learning)) ↗
  • arXiv cs.LG (Machine Learning) · EN New Model Releases extract
    Transformer Geometry Observatory TGO-I: Spectral Geometry Observatory
    TGO-I: a spectral geometry observatory for Vision Transformers
    Computer Vision Reinforcement Learning Transformer
    Despite the wide adoption and success of Vision Transformers, understanding of their dimensional and representational geometry remains limited. The Transformer Geometry Observatory (TGO-I) studies ViTs through spectral geometry, observing and analyzing the structure of their representation spaces.
    Read original (arXiv cs.LG (Machine Learning)) ↗
  • arXiv cs.AI (Artificial Intelligence) · EN Developer Tools extract
    A Taxonomy of Mental Health and Technology Needs for Alzheimer's and Dementia Caregivers
    A taxonomy of mental-health and tech needs for dementia caregivers
    Deep Learning Reinforcement Learning
    Family members caring for people with Alzheimer's and related dementias form the foundation of long-term care worldwide; in 2023 over 11 million U.S. relatives provided unpaid care. This work presents a taxonomy of caregivers' mental-health and technology needs to guide supportive design.
    Read original (arXiv cs.AI (Artificial Intelligence)) ↗
  • arXiv cs.AI (Artificial Intelligence) · EN Training & Fine-tuning extract
    STARE: Surprisal-Guided Token-Level Advantage Reweighting for Policy Entropy Stability
    STARE reweights token advantages to stabilize policy entropy
    Algorithms & Theory Retrieval-Augmented Generation (RAG) Reinforcement Learning
    Reinforcement learning with verifiable rewards, such as GRPO, dominates post-training for complex LLM reasoning but often suffers policy entropy collapse. STARE introduces surprisal-guided token-level advantage reweighting to stabilize policy entropy and preserve exploration during training.
    Read original (arXiv cs.AI (Artificial Intelligence)) ↗
  • arXiv cs.LG (Machine Learning) · EN Infrastructure & Hardware extract
    A Human-in-the-Loop Bayesian Optimization Framework for Constraint-Aware Bioprocess Development
    Human-in-the-loop Bayesian optimization for bioprocess development
    Reinforcement Learning
    This work extends Pareto Front Guided Sampling (PFGS), a human-in-the-loop Bayesian optimization framework, by reformulating Gaussian-process surrogate quantities as objectives. It enables constraint-aware bioprocess development, blending expert input with efficient search for optimal conditions.
    Read original (arXiv cs.LG (Machine Learning)) ↗
  • arXiv cs.AI (Artificial Intelligence) · EN Training & Fine-tuning extract
    Mechanism-Guided Selective Unlearning for RLVR-Induced Reasoning
    MAST selectively unlearns RLVR-induced reasoning with less damage
    Fine-tuning Reinforcement Learning
    The authors propose MAST (Mechanism-Aligned Selective Targeting), a mechanism-guided method for unlearning RLVR-induced reasoning with substantially less collateral damage than standard full-parameter updates, removing targeted reasoning while preserving other capabilities.
    Read original (arXiv cs.AI (Artificial Intelligence)) ↗
  • arXiv cs.LG (Machine Learning) · EN Developer Tools extract
    Generalised Eigenvalue Geometry of Semantic Adversarial Attacks
    Generalised eigenvalue geometry of semantic adversarial attacks
    Algorithms & Theory Embeddings Neural Network
    Recent work shows semantically equivalent paraphrases can fool financial sentiment classifiers: a paraphrase stays close to the original under a strong reference embedding yet flips the prediction. This paper analyzes such semantic adversarial attacks through generalised eigenvalue geometry.
    Read original (arXiv cs.LG (Machine Learning)) ↗
  • arXiv cs.AI (Artificial Intelligence) · EN New Model Releases extract
    The More the Merrier: Combining Properties for ABox Abduction under Repair Semantics for EL⊥
    Combining properties for ABox abduction under repair semantics
    Abduction explains missing entailments from a knowledge base by proposing a hypothesis that would make them hold. This work studies ABox abduction under repair semantics for the EL⊥ description logic, combining multiple properties to produce stronger explanatory hypotheses.
    Read original (arXiv cs.AI (Artificial Intelligence)) ↗
  • arXiv cs.LG (Machine Learning) · EN Developer Tools extract
    Learning to Annotate Delayed and False AEB Events: A Practical System for Extreme Class Imbalance and Asymmetric Label Noise
    Annotating rare delayed and false AEB events under class imbalance
    AI Agents Neural Network Reinforcement Learning
    Optimizing Autonomous Emergency Braking relies on accurately annotated real-world triggers, especially rare but critical delayed and false AEB events that expose defects. This work presents a practical system that learns to annotate such events under extreme class imbalance and asymmetric label noise.
    Read original (arXiv cs.LG (Machine Learning)) ↗
  • Hugging Face Blog · EN Developer Tools extract
    MolmoMotion: Language-guided 3D motion forecasting
    MolmoMotion: a language-guided approach to 3D motion forecasting
    On the Hugging Face blog, the Allen Institute for AI (AI2) introduces MolmoMotion, a method that forecasts 3D motion guided by natural-language instructions. This summary is title-based because no excerpt was retrieved; method details and any performance claims are per the source and not independently verified.
    Read original (Hugging Face Blog) ↗
  • arXiv cs.AI (Artificial Intelligence) · EN Developer Tools extract
    Compute Efficiency and Serial Runtime Tradeoffs for Stochastic Momentum Methods
    Compute efficiency vs serial runtime in stochastic momentum methods
    Deep Learning Reinforcement Learning
    Stochastic momentum methods such as heavy ball, Nesterov momentum, and accelerated SGD are widely used in training, but their stochastic benefits depend on two distinct quantities. This work analyzes the trade-offs between compute efficiency and serial runtime for these methods, offering guidance.
    Read original (arXiv cs.AI (Artificial Intelligence)) ↗
  • arXiv cs.AI (Artificial Intelligence) · EN Training & Fine-tuning extract
    User as Engram: Internalizing Per-User Memory as Local Parametric Edits
    User as Engram: per-user memory as local parametric edits
    Retrieval-Augmented Generation (RAG) Software Engineering
    Personal memory in a language model involves two problems, content and reasoning skill, which the brain keeps apart: a sparse, local hippocampal engram for each episode and a slowly learned neocortical skill. Inspired by this, the work internalizes per-user memory as local parametric edits to the model.
    Read original (arXiv cs.AI (Artificial Intelligence)) ↗
  • arXiv cs.LG (Machine Learning) · EN Inference & Efficiency extract
    The Reward Was in Your Data All Along: Correcting Flow Matching with Discriminator-Guided RL
    Discriminator-guided RL corrects flow matching using your data
    Inference Neural Network Reinforcement Learning
    Score- and flow-matching models often rely on preference-based RL both to align with subjective preferences and, surprisingly, to recover certain properties. This work argues the reward was in the data all along, correcting flow matching with discriminator-guided reinforcement learning.
    Read original (arXiv cs.LG (Machine Learning)) ↗
  • arXiv cs.LG (Machine Learning) · EN Safety & Evaluation extract
    Complementary Attention Head Pruning for Efficient Transformers
    Complementary attention-head pruning for efficient Transformers
    Natural Language Processing (NLP) Reinforcement Learning Transformer
    Transformers' success stems from architectural scaling, which inflates parameter counts and hinders deployment in resource-constrained settings. This work proposes complementary attention head pruning, removing heads so that retained ones stay complementary, preserving accuracy while improving efficiency.
    Read original (arXiv cs.LG (Machine Learning)) ↗
  • arXiv cs.LG (Machine Learning) · EN Developer Tools extract
    OpenAnt: LLM-Powered Vulnerability Discovery Through Code Decomposition, Adversarial Verification, and Dynamic Testing
    OpenAnt: LLM-powered vulnerability discovery via code decomposition
    Automated vulnerability discovery in large codebases is hard: static analysis yields high false positives while dynamic methods like fuzzing lack coverage. OpenAnt is an LLM-powered approach combining code decomposition, adversarial verification, and dynamic testing to surface real vulnerabilities.
    Read original (arXiv cs.LG (Machine Learning)) ↗
  • arXiv cs.LG (Machine Learning) · EN Training & Fine-tuning extract
    On Local Population-Risk Certificates
    Local population-risk certificates for model updates
    Reinforcement Learning from Human Feedback (RLHF)
    This paper develops local certificates for population-risk increments around a current model. For a local candidate set, the certificate provides a two-sided confidence bound on the change in population risk, giving theoretical guarantees on the risk impact of local model updates.
    Read original (arXiv cs.LG (Machine Learning)) ↗
  • arXiv cs.LG (Machine Learning) · EN Multimodal extract
    ChronoSurv: A Clinical Pathway-Guided Graph Framework for Multimodal Survival Analysis
    ChronoSurv: clinical-pathway graph framework for survival analysis
    Neural Network
    Accurate survival prediction is essential for personalized treatment in head and neck cancer but is challenging given heterogeneous, high-dimensional multimodal clinical data. ChronoSurv is a clinical pathway-guided graph framework that integrates multimodal data to improve survival analysis.
    Read original (arXiv cs.LG (Machine Learning)) ↗
  • arXiv cs.CL (Computation and Language) · EN Safety & Evaluation extract
    Urdu Katib Handwritten Dataset: A Historical Document Dataset for Offline Urdu Handwritten Text Recognition with CRNN-Based Baseline Evaluation
    Urdu Katib: a historical dataset for offline Urdu handwriting recognition
    Neural Network Retrieval-Augmented Generation (RAG)
    Automatic handwritten text recognition is challenging, especially for cursive scripts. This work introduces the Urdu Katib Handwritten Dataset, a historical-document dataset for offline Urdu handwritten text recognition, providing resources to advance recognition research on cursive scripts.
    Read original (arXiv cs.CL (Computation and Language)) ↗
  • arXiv cs.LG (Machine Learning) · EN New Model Releases extract
    INDEQS: Informed Neural controlled Differential EQuationS
    INDEQS: informed neural controlled differential equations for forecasting
    Neural Network Reinforcement Learning
    Neural Controlled Differential Equations provide a powerful continuous-time framework for time-series forecasting, but standard graph-based extensions struggle to learn spatial structure. INDEQS introduces informed neural controlled differential equations to better capture structure and improve forecasting.
    Read original (arXiv cs.LG (Machine Learning)) ↗
  • arXiv cs.LG (Machine Learning) · EN Developer Tools extract
    Giskard: Byzantine Robust and Confidential Aggregation for Large-Scale Decentralized Learning
    Giskard: Byzantine-robust, confidential aggregation for decentralized learning
    Deep Learning Machine Learning
    Handling confidentiality and Byzantine behavior simultaneously in decentralized learning is hard. This work presents Giskard, a method enabling Byzantine-robust and confidential aggregation for large-scale decentralized learning, where clients train models without exposing their data.
    Read original (arXiv cs.LG (Machine Learning)) ↗
  • arXiv cs.CL (Computation and Language) · EN Infrastructure & Hardware extract
    Written by AI, Managed by AI: Semantic Space Control and Index Sickness Elimination Across 391 Consecutive Sessions
    Semantic-space control sustains 391 consecutive AI-run sessions
    Neural Network Reinforcement Learning
    The prevailing intuition about conceptual drift in long-horizon LLM collaboration is that adding more formal constraints buys more reliable outputs. Across 391 consecutive sessions, this work studies semantic space control and the elimination of 'index sickness' in workflows written and managed by AI.
    Read original (arXiv cs.CL (Computation and Language)) ↗
  • arXiv cs.AI (Artificial Intelligence) · EN Developer Tools extract
    Analysing drivers and interdependencies in European electricity markets using XAI
    XAI analyzes drivers and interdependencies in European power markets
    Neural Network Reinforcement Learning
    Electricity markets are complex systems with strong nonlinearities, high-dimensional interactions, and growing cross-regional interdependence. Deep neural networks predict these markets well but remain opaque, so this work uses explainable AI to analyze the drivers and interdependencies in European electricity markets.
    Read original (arXiv cs.AI (Artificial Intelligence)) ↗
  • arXiv cs.LG (Machine Learning) · EN Inference & Efficiency extract
    Wasserstein Policy Learning for Distributional Outcomes
    Wasserstein policy learning for distributional outcomes
    Deep Learning Inference
    Offline policy learning is gaining attention in causal inference, aiming to learn an individualized treatment rule mapping covariates to treatments that maximizes empirical outcomes. This work proposes Wasserstein policy learning for distributional outcomes, accounting for the full outcome distribution.
    Read original (arXiv cs.LG (Machine Learning)) ↗
  • arXiv cs.AI (Artificial Intelligence) · EN Agents & Tool Use extract
    Towards an Agent-First Web: Redesigning the Web for AI Agents
    Towards an agent-first web: redesigning the web for AI agents
    AI Agents Meta Reinforcement Learning
    The Web was built on a three-decade assumption that its primary content consumer is human, which permeates every layer of its access model. This work argues for an agent-first web, redesigning the Web for AI agents and rethinking access, structure, and interaction for an agent-driven era.
    Read original (arXiv cs.AI (Artificial Intelligence)) ↗
  • arXiv cs.LG (Machine Learning) · EN New Model Releases extract
    Structure Over Nonlinearity: Explicit Interaction Architectures for Dynamical Learning
    Explicit interaction architectures for dynamical learning
    Most learning architectures for dynamical systems rely on generic nonlinear function approximation, often needing high complexity to capture structured behavior. Favoring structure over nonlinearity, this work proposes explicit interaction architectures that model variable interactions directly for efficient dynamical learning.
    Read original (arXiv cs.LG (Machine Learning)) ↗
  • arXiv cs.LG (Machine Learning) · EN New Model Releases extract
    Context-Aware Optimization of Follow-Up Intervals for Type 2 Diabetes Care Using Markov Decision Processes
    Optimizing type-2 diabetes follow-up intervals with MDPs
    Reinforcement Learning
    Chronic disease management relies on regular patient-provider interactions to track progression and control. For Type 2 Diabetes, guidelines prescribe fixed follow-up intervals. This work uses Markov decision processes to optimize follow-up intervals in a context-aware way, tailoring scheduling to each patient.
    Read original (arXiv cs.LG (Machine Learning)) ↗
  • arXiv cs.LG (Machine Learning) · EN Developer Tools extract
    Model-Free Reinforcement Learning Control for Resilient Cyber-Physical Systems
    Model-free RL control for resilient cyber-physical systems
    Reinforcement Learning
    This paper compares model-free controllers on a nonlinear system under cyberattacks, including false data injection and denial-of-service attacks. Four RL reward types are analyzed for accuracy, cost, and robustness, informing controller design for resilient cyber-physical systems.
    Read original (arXiv cs.LG (Machine Learning)) ↗
  • arXiv cs.CL (Computation and Language) · EN Developer Tools extract
    Which Sections of a Research Paper Best Reveal Its Research Methods? Evidence from Library and Information Science
    Which paper sections best reveal research methods?
    Deep Learning Meta Neural Network
    Research methods are essential carriers of knowledge contribution in academic papers, and automatically classifying them can support knowledge services. Using library and information science as evidence, this work examines which sections of a paper best reveal its research methods to aid such classification.
    Read original (arXiv cs.CL (Computation and Language)) ↗
  • arXiv cs.AI (Artificial Intelligence) · EN Developer Tools extract
    RODS: Reward-Driven Online Data Synthesis for Multi-Turn Tool-Use Agents
    RODS: reward-driven online data synthesis for tool-use agents
    AI Agents Inference Reinforcement Learning
    Multi-turn tool-use RL is bottlenecked by the rapid depletion of informative samples in static datasets. Observing that GRPO's gradient signal concentrates on certain tasks, RODS performs reward-driven online data synthesis to continually supply informative samples for multi-turn tool-use agents.
    Read original (arXiv cs.AI (Artificial Intelligence)) ↗
  • arXiv cs.LG (Machine Learning) · EN Safety & Evaluation extract
    Adaptive Speech-to-Spike Encoding for Spiking Neural Networks
    Adaptive speech-to-spike encoding for spiking neural networks
    Deep Learning Google Neural Network Speech Processing
    The mismatch between continuous acoustic signals and discrete event-driven processing is a fundamental bottleneck for neuromorphic speech processing. Rather than fixed spike encoders, this work proposes adaptive speech-to-spike encoding for spiking neural networks, improving downstream performance.
    Read original (arXiv cs.LG (Machine Learning)) ↗