Training & Fine-tuning

  • arXiv cs.AI (Artificial Intelligence) · EN Safety & Evaluation extract
    ClinHallu: A Benchmark for Diagnosing Stage-Wise Hallucinations in Medical MLLM Reasoning
    ClinHallu: a stage-wise hallucination diagnosis benchmark for medical MLLMs
    Fine-tuning · Machine Learning · Software Engineering
    ClinHallu is a benchmark for diagnosing where hallucinations originate in medical multimodal LLM reasoning, decomposing traces into visual recognition, knowledge recall, and reasoning integration. It provides 7,031 validated instances and uses stage-replacement interventions to localize error sources.
  • arXiv cs.LG (Machine Learning) · EN Training & Fine-tuning extract
    Graph Structured Combinatorial Semi-Bandit with Nonlinear Reward Associations through Separable Signals
    Graph-structured combinatorial semi-bandits with nonlinear rewards
    Neural Network · Retrieval-Augmented Generation (RAG) · Reinforcement Learning
    The paper studies how to identify optimal structures in graph-structured combinatorial semi-bandits when reward associations are nonlinear. It uses separable signals to reduce sampling and computational cost.
  • arXiv cs.AI (Artificial Intelligence) · EN Training & Fine-tuning extract
    From Self-Supervised Speech Models to Mixture-of-Experts for Robust Anti-Spoofing
    Self-supervised speech models plus MoE for robust anti-spoofing
    Mixture of Experts (MoE) · Speech Processing
    Advances in speech generation make synthetic speech more natural and spoofing detection harder. The paper combines self-supervised speech models with a mixture-of-experts design to build more robust anti-spoofing systems.
  • arXiv cs.AI (Artificial Intelligence) · EN Industry Adoption extract
    When Good Verifiers Go Bad: Self-Improving VLMs Can Regress on New Tasks
    When good verifiers go bad: self-improving VLMs can regress on new tasks
    Neural Network · Reinforcement Learning from Human Feedback (RLHF)
    Verifier-driven self-DPO, where a frozen verifier scores candidates to form preference pairs, is a common recipe for self-improving vision-language models. The paper shows that under this setup VLMs can regress on new tasks when the verifier misbehaves.
  • arXiv cs.AI (Artificial Intelligence) · EN Training & Fine-tuning extract
    A Comparative Study of Deep Learning Architectures for Multi-Horizon Behavioural Forecasting for Mobile Health
    Comparing deep learning for multi-horizon behavioural forecasting in mHealth
    Deep Learning · Fine-tuning · Machine Learning · Neural Network · Transformer
    Wearables and smartphones generate rich behavioural time series for proactive health interventions, yet systematic comparisons of forecasting architectures are lacking. The paper benchmarks deep learning architectures for multi-horizon behavioural forecasting in mobile health.
  • arXiv cs.LG (Machine Learning) · EN New Model Releases extract
    Cluster LOCO: Feature Importance For Interpreting Clusters
    Cluster LOCO gives feature importance to interpret clusters
    Algorithms & Theory
    Clustering is widely used but its outputs are hard to interpret and audit. Cluster LOCO provides feature-importance scores to explain what distinguishes each cluster.
  • arXiv cs.CL (Computation and Language) · EN Training & Fine-tuning extract
    BayLing-Duplex: Native Full-Duplex Speech Dialogue with a Single Autoregressive LLM
    BayLing-Duplex: native full-duplex speech dialogue from one LLM
    Deep Learning · Fine-tuning · Llama · Reinforcement Learning from Human Feedback (RLHF) · Speech Processing
    BayLing-Duplex enables native full-duplex speech interaction with a single autoregressive LLM, letting it listen and speak simultaneously. It handles natural phenomena such as overlap and hesitation.
  • arXiv cs.AI (Artificial Intelligence) · EN Training & Fine-tuning extract
    Dense Coordinate-List Fine-Tuning Induces a Controllable Interference Surface in Vision-Language Models
    Dense coordinate-list fine-tuning induces a controllable interference surface
    Computer Vision · Fine-tuning · Reinforcement Learning from Human Feedback (RLHF) · Software Engineering
    Fine-tuning vision-language models to emit dense coordinate lists improves grounding but alters how they serialize, repeat, and terminate structured output. The paper shows this induces a controllable interference surface in VLMs.
  • arXiv cs.AI (Artificial Intelligence) · EN Training & Fine-tuning extract
    A Fixed-Point Neural Operator for Size- and Functional-Transferable Hamiltonian Prediction
    A fixed-point neural operator for transferable Hamiltonian prediction
    Fine-tuning · Inference · Machine Learning · Neural Network
    Predicting the Kohn-Sham Hamiltonian with ML can accelerate density functional theory while retaining orbitals and energy levels. The paper proposes a fixed-point neural operator for size- and functional-transferable Hamiltonian prediction.