Developer Tools B

  • Hacker News (Front Page) · EN Developer Tools extract
    Arch Linux Now Believes Malware Incident Under Control: More Than 1,500 Packages
    Arch Linux says 1,500+ package malware incident is contained
    Arch Linux now believes a malware incident affecting more than 1,500 packages is under control, an episode that highlights ongoing supply-chain security risks in software package ecosystems.
  • Simon Willison's Weblog · EN Developer Tools extract
    Statement on the US government directive to suspend access to Fable 5 and Mythos 5
    Willison on the US directive to suspend Fable 5 and Mythos 5
    Anthropic Claude
    Simon Willison comments on the US government's national-security export-control directive suspending all foreign-national access to Fable 5 and Mythos 5, calling the move extraordinary and questioning its rationale and impact.
  • Simon Willison's Weblog · EN New Model Releases extract
    OpenAI WebRTC Audio Session, now with document context
    Simon Willison adds document context to his OpenAI WebRTC audio tool
    GPT OpenAI
    Simon Willison updated his browser tool for OpenAI's WebRTC realtime audio API. It now supports the newer realtime voice model touting GPT-5-class reasoning, and lets users paste document text as context for spoken conversations about it.
  • Microsoft Research Blog · EN Developer Tools extract
    Ire identifies another LOTUSLITE specimen
    Microsoft's Project Ire AI flags LOTUSLITE malware missed by EDR tools
    Microsoft
    Microsoft Research reports its autonomous malware-analysis agent Project Ire reverse-engineered a new specimen and identified LOTUSLITE traits that most major EDR tools failed to detect, underscoring AI's expanding role in threat analysis.
  • arXiv cs.CL (Computation and Language) · EN Multimodal extract
    Gaze Heads: How VLMs Look at What They Describe
    'Gaze heads' in VLMs track and steer described image regions
    Computer Vision Deep Learning Software Engineering
    The paper identifies a small set of attention heads, dubbed gaze heads, that track the image region a vision-language model is currently describing. Intervening on the top ~100 of them can steer the model to describe any chosen region.
  • arXiv cs.CL (Computation and Language) · EN Developer Tools extract
    Persona-Pruner: Sculpting Lightweight Models for Role-Playing
    Persona-Pruner sculpts lightweight role-playing language models
    Reinforcement Learning
    Persona-Pruner is a pruning approach that sculpts lightweight language models specialized for role-playing. It aims to retain consistent, persona-driven interaction while reducing model size.
  • arXiv cs.LG (Machine Learning) · EN New Model Releases extract
    A Complexity Measure for Active Learning in Multi-group Mean Estimation
    A complexity measure for active multi-group mean estimation
    The paper studies active learning for multi-group mean estimation framed as a d-armed bandit minimizing max-risk. It introduces a complexity measure characterizing the difficulty of adaptive budget allocation.
  • arXiv cs.AI (Artificial Intelligence) · EN Multimodal extract
    CottonLeafVision: An Explainable and Robust Deep Learning Framework for Cotton Leaf Disease Classification
    CottonLeafVision: explainable, robust deep learning for cotton leaf disease
    Deep Learning Neural Network Reinforcement Learning
    Cotton underpins the textile industry, so accurate detection of cotton leaf disease is crucial for economic stability. The paper proposes CottonLeafVision, an explainable and robust deep learning framework for classifying cotton leaf diseases.
  • arXiv cs.LG (Machine Learning) · EN Inference & Efficiency extract
    HumP-KD: A Hybrid Uncertainty-Aware Multi-Stage Progressive Knowledge Distillation Framework for Efficient Fire Classification
    HumP-KD: uncertainty-aware distillation for efficient fire classification
    Machine Learning Meta Neural Network Transformer
    HumP-KD is a hybrid, uncertainty-aware multi-stage progressive knowledge distillation framework for fire classification. It targets models that are simultaneously accurate and efficient for real-time use.
  • arXiv cs.LG (Machine Learning) · EN New Model Releases extract
    Optimal Hidden-Target Learning for Online Inventory Optimization on General Convex Sets
    Optimal hidden-target learning for online inventory optimization
    The work casts online inventory optimization as online convex optimization with memory, where carryover makes the feasible set history-dependent. It develops an optimal hidden-target learning method on general convex sets.
  • arXiv cs.CL (Computation and Language) · EN New Model Releases extract
    AgentSpec: Understanding Embodied Agent Scaffolds Through Controlled Composition
    AgentSpec dissects embodied agent scaffolds via controlled composition
    AI Agents Machine Learning Reinforcement Learning
    AgentSpec studies scaffolded LLM agents that combine reasoning, memory, reflection, and action through controlled composition. It aims to isolate how each component contributes to overall performance.
  • arXiv cs.AI (Artificial Intelligence) · EN New Model Releases extract
    Giving AI a Headache: Acoustic Adversarial Attacks to Computer Vision Applications
    Acoustic adversarial attacks that disrupt computer vision systems
    Computer Vision Deep Learning Reinforcement Learning
    As AI automates real-world computer vision applications such as autonomous vehicle control, this paper demonstrates acoustic adversarial attacks that can disrupt CV systems, highlighting a new physical, sound-based attack surface.
  • arXiv cs.AI (Artificial Intelligence) · EN New Model Releases extract
    Abstracting Cross-Domain Action Sequences into Interpretable Workflows
    Abstracting cross-domain action sequences into interpretable workflows
    Deep Learning Inference Microsoft Reinforcement Learning
    Time-stamped interaction logs objectively record digital app usage, but their granularity and noise obscure meaningful insights into work. The paper proposes abstracting cross-domain action sequences into interpretable workflows.
  • arXiv cs.LG (Machine Learning) · EN Training & Fine-tuning extract
    Graph Structured Combinatorial Semi-Bandit with Nonlinear Reward Associations through Separable Signals
    Graph-structured combinatorial semi-bandits with nonlinear rewards
    Neural Network Retrieval-Augmented Generation (RAG) Reinforcement Learning
    The paper addresses combinatorial semi-bandit identification of optimal structures under nonlinear reward associations. It leverages separable signals to reduce sampling and computational cost.
  • arXiv cs.LG (Machine Learning) · EN Safety & Evaluation extract
    Which Directions Matter? Sparse Design for Affine Robust Optimization
    Sparse design identifies which directions matter in robust optimization
    Machine Learning Retrieval-Augmented Generation (RAG)
    The work studies which uncertainty directions a model must cover in affine robust optimization defined by a finite dictionary and budget. It proposes a sparse design selecting the directions that matter.
  • arXiv cs.AI (Artificial Intelligence) · EN Developer Tools extract
    Listening with Attention: Entropy-Guided Explainability for Transformer-Based Audio Models
    Entropy-guided explainability for Transformer-based audio models
    Speech Processing Transformer
    Transformer-based ASR models like Whisper are accurate but hard to interpret, and existing XAI methods lack faithfulness and temporal precision. The paper proposes an entropy-guided explainability approach for such audio models.
  • arXiv cs.AI (Artificial Intelligence) · EN Industry Adoption extract
    When Good Verifiers Go Bad: Self-Improving VLMs Can Regress on New Tasks
    When good verifiers go bad: self-improving VLMs can regress on new tasks
    Neural Network Reinforcement Learning from Human Feedback (RLHF)
    Verifier-driven self-DPO, where a frozen verifier scores candidates to form preference pairs, is a common recipe for self-improving vision-language models. The paper shows that under this setup VLMs can regress on new tasks when the verifier misbehaves.
  • arXiv cs.CL (Computation and Language) · EN Developer Tools extract
    Characterizing Cultural Localization in AI-Generated Stories
    Characterizing cultural localization in AI-generated stories
    Retrieval-Augmented Generation (RAG)
    The paper assesses how well AI generates culturally localized stories. It characterizes the ways cultural localization appears in generated narratives.
  • arXiv cs.LG (Machine Learning) · EN Safety & Evaluation extract
    Neither Parallel Nor Sequential: How DiffusionGemma Actually Commits Tokens
    How DiffusionGemma actually commits tokens, neither parallel nor sequential
    Deep Learning Mixture of Experts (MoE)
    Diffusion language models are marketed as parallel decoders, yet their real token-commit order is rarely measured. Instrumenting DiffusionGemma, the paper shows it is neither purely parallel nor sequential.
  • arXiv cs.AI (Artificial Intelligence) · EN Inference & Efficiency extract
    Moonlight in Latent Space: Chirality and Structural Correspondence Between Beethoven's Op. 27 No. 2 and Machine Learning Mechanisms
    Structural correspondence between Beethoven's Moonlight Sonata and ML
    Embeddings Machine Learning Neural Network Natural Language Processing (NLP) Reinforcement Learning
    Through computational analysis, this paper argues that the three movements of Beethoven's Moonlight Sonata (Op. 27 No. 2) instantiate three distinct machine learning architectures by structural correspondence rather than mere analogy.
  • arXiv cs.AI (Artificial Intelligence) · EN New Model Releases extract
    Expert-Driven Survival Machines: Improving Stratification and Interpretability in Multiple Clinical Cohorts
    Expert-driven survival machines for stratification across clinical cohorts
    Mixture of Experts (MoE) Neural Network Reinforcement Learning
    Survival prediction is central for healthcare providers and clinical researchers. The paper introduces expert-driven survival machines that improve risk stratification and interpretability across multiple clinical cohorts.
  • arXiv cs.AI (Artificial Intelligence) · EN Training & Fine-tuning extract
    A Comparative Study of Deep Learning Architectures for Multi-Horizon Behavioural Forecasting for Mobile Health
    Comparing deep learning for multi-horizon behavioural forecasting in mHealth
    Deep Learning Fine-tuning Machine Learning Neural Network Transformer
    Wearables and smartphones generate rich behavioural time series for proactive health interventions, yet systematic comparisons of forecasting architectures are lacking. The paper benchmarks deep learning architectures for multi-horizon behavioural forecasting in mobile health.
  • arXiv cs.CL (Computation and Language) · EN New Model Releases extract
    LoSoNA: A Benchmark for Local Social Norm Adaptation in Group Conversations
    LoSoNA benchmarks local social norm adaptation in group chats
    AI Agents Claude Gemini Software Engineering
    Online group chats follow local conversational norms that are rarely stated explicitly. LoSoNA is a benchmark measuring whether LLM-based agents can recognize and adapt to these local social norms.
  • arXiv cs.AI (Artificial Intelligence) · EN Multimodal extract
    AudioDER: A Deduplication-Enhanced Reasoning Dataset for Post-Training Large Audio-Language Models
    AudioDER: a deduplication-enhanced reasoning dataset for audio LLMs
    Neural Network Retrieval-Augmented Generation (RAG) Reinforcement Learning Software Engineering Speech Processing
    Large audio-language models perform well on audio understanding yet still struggle with reasoning. The paper introduces AudioDER, a deduplication-enhanced reasoning dataset for post-training large audio-language models.
  • arXiv cs.AI (Artificial Intelligence) · EN Safety & Evaluation extract
    When Errors Become Narratives: A Longitudinal Taxonomy of Silent Failures in a Production LLM Agent Runtime
    A longitudinal taxonomy of silent failures in a production LLM agent runtime
    Meta
    LLM agents increasingly run as long-lived autonomous runtimes that schedule jobs, call tools, maintain memory, and push results to humans. This longitudinal study of one persistent system presents a taxonomy of its silent failures.
  • arXiv cs.CL (Computation and Language) · EN Safety & Evaluation extract
    Persuasion Index: A Theory-Guided Framework for Persuasion Analysis
    Persuasion Index: a theory-guided framework for persuasion analysis
    Identifying persuasive rhetorical cues matters for detecting manipulation, AI safety, and health communication. The paper proposes Persuasion Index, a theory-guided framework for persuasion analysis.
  • arXiv cs.AI (Artificial Intelligence) · EN Safety & Evaluation extract
    StreamMemBench: Streaming Evaluation of Agent Memory for Future-Oriented Assistance
    StreamMemBench: streaming evaluation of agent memory for assistance
    A core role of personal-agent memory is turning stored information and prior interactions into future-oriented assistance. StreamMemBench provides a streaming evaluation of agent memory using cues from what the agent observes and how users interact.
  • arXiv cs.AI (Artificial Intelligence) · EN New Model Releases extract
    Regional Climate Model Emulation with Diffusion Approaches: What is the Added Value of Generative Machine Learning?
    Added value of diffusion-based generative ML for climate model emulation
    Deep Learning Machine Learning Neural Network Reinforcement Learning
    Emulators cheaply reproduce regional climate models' downscaling, linking global-model predictors to high-resolution fields. The paper assesses the added value of diffusion-based generative machine learning for regional climate model emulation.
  • arXiv cs.LG (Machine Learning) · EN Safety & Evaluation extract
    CANN-EUCLID: unsupervised constitutive artificial neural network model discovery from full-field data
    CANN-EUCLID: unsupervised constitutive model discovery from full-field data
    Neural Network
    CANNs offer interpretable material model discovery but have relied on stress-supervised data. CANN-EUCLID enables unsupervised constitutive model discovery directly from full-field measurement data.
  • arXiv cs.LG (Machine Learning) · EN New Model Releases extract
    ORCA: A Platform for Open-Source Dexterity Research
    ORCA: an open-source platform for dexterity research
    Neural Network Retrieval-Augmented Generation (RAG) Robotics
    Two-finger grippers dominate manipulation research but are limited by their form factor. ORCA is an open-source platform to support research on more dexterous robotic manipulation.
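
Note on the "When Good Verifiers Go Bad" item above: its summary describes verifier-driven self-DPO as a frozen verifier scoring candidates to form preference pairs. The sketch below illustrates only that pair-forming step, under stated assumptions. The function name `build_preference_pair`, the `verifier` callable signature, and the `min_margin` threshold are hypothetical names introduced for illustration; they are not taken from the paper, and picking the highest- and lowest-scored candidates is one plausible selection rule, not necessarily the authors'.

```python
from typing import Callable, List, Optional, Tuple


def build_preference_pair(
    prompt: str,
    candidates: List[str],
    verifier: Callable[[str, str], float],  # hypothetical: (prompt, candidate) -> score
    min_margin: float = 0.0,                # hypothetical: required score gap
) -> Optional[Tuple[str, str]]:
    """Return (chosen, rejected) from verifier scores, or None if no usable pair."""
    if len(candidates) < 2:
        return None

    # Score every candidate once with the frozen verifier.
    scored = [(verifier(prompt, c), c) for c in candidates]

    # Highest-scored candidate becomes "chosen", lowest-scored becomes "rejected".
    best_score, chosen = max(scored, key=lambda pair: pair[0])
    worst_score, rejected = min(scored, key=lambda pair: pair[0])

    # Skip pairs the verifier cannot separate; they carry no preference signal.
    if best_score - worst_score <= min_margin:
        return None
    return chosen, rejected


def toy_verifier(prompt: str, candidate: str) -> float:
    # Toy stand-in for a real verifier: longer answers score higher.
    return float(len(candidate))
```

Because the pair is defined entirely by the verifier's scores, any systematic error in the verifier flows directly into the preference data, which is the failure mode the item's summary points to.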