Training & Fine-tuning (Page 3 of 4)｜AI/Tech News Trends

arXiv cs.CL (Computation and Language) · 2026-07-29 EN Developer Tools

SpecFirst: Behavioral Specification Elicitation as a First-Class Step in Agent-Based Program Synthesis from Scratch

AI Agents Neural Network Reinforcement Learning Software Engineering

arXiv cs.AI (Artificial Intelligence) · 2026-07-29 EN Multimodal

Anatomy Contextualized Adaption of CT Foundation Models

Computer Vision Embeddings Reinforcement Learning Transformer

arXiv cs.CL (Computation and Language) · 2026-07-29 EN New Model Releases

MindForge: Teaching Small Language Models Whole-Life-Cycle Software Engineering via Source-Free Program Synthesis

AI Agents Fine-tuning Neural Network Retrieval-Augmented Generation (RAG) Software Engineering

arXiv cs.LG (Machine Learning) · 2026-07-29 EN New Model Releases

InferScale: GPU-Native KV Injection for Personalized LLM Serving

Deep Learning Embeddings Fine-tuning GPT Inference

arXiv cs.AI (Artificial Intelligence) · 2026-07-29 EN Safety & Evaluation

On-Policy Distillation for LLM Safety: A Routing Approach to Template-Robust Realignment

Fine-tuning Neural Network

arXiv cs.AI (Artificial Intelligence) · 2026-07-29 EN Training & Fine-tuning

ScratchSim: A Procedural Synthetic Data Pipeline for Surface Scratch Detection

Fine-tuning Neural Network Transformer

arXiv cs.LG (Machine Learning) · 2026-07-29 EN Industry Adoption

Lottery Tickets Are Not Deployment Tickets

Deep Learning Neural Network Reinforcement Learning from Human Feedback (RLHF)

Publickey · 2026-07-29 JA Training & Fine-tuning extract

Kubernetes becomes a platform for running AI: KubeCon + CloudNativeCon Japan 2026 opens in Yokohama

KubeCon Japan 2026 opens in Yokohama; Kubernetes as AI platform

Machine Learning

KubeCon + CloudNativeCon Japan 2026, a major cloud-native event, opened at Pacifico Yokohama on July 29, 2026, with Kubernetes framed as a platform for running AI workloads. The source excerpt is truncated at the intro, so keynote and session specifics are unconfirmed.

arXiv cs.LG (Machine Learning) · 2026-07-29 EN Multimodal

Foundation Models for Face Presentation Attack Detection: A Unified Linear-Probing Benchmark

Computer Vision Neural Network Transformer

arXiv cs.CL (Computation and Language) · 2026-07-29 EN New Model Releases

Latent-IM: Latent Interaction Management for Speech LLMs

Fine-tuning Retrieval-Augmented Generation (RAG) Speech Processing

arXiv cs.LG (Machine Learning) · 2026-07-29 EN New Model Releases

Temporally Centered SIGReg Improves Multi-Task LeWorldModel Learning: From Analysis to Method

Retrieval-Augmented Generation (RAG) Reinforcement Learning

arXiv cs.AI (Artificial Intelligence) · 2026-07-29 EN New Model Releases

BioVLN: A Simulation Platform for Visual Language Navigation in Biomedical Laboratories

AI Agents

arXiv cs.CL (Computation and Language) · 2026-07-29 EN Inference & Efficiency

DIRECT: Direct Decoding for Efficient and Aligned Sequence Labeling with Large Language Models

Fine-tuning Inference Reinforcement Learning from Human Feedback (RLHF)

arXiv cs.CL (Computation and Language) · 2026-07-29 EN New Model Releases

SERPO: Self-Evolving Rubric Policy Optimization for Open-Ended Test-Time Reinforcement Learning

Inference Neural Network Retrieval-Augmented Generation (RAG) Reinforcement Learning Software Engineering

arXiv cs.LG (Machine Learning) · 2026-07-29 EN Multimodal

Amortized Moment Matching for Visual Generation

Neural Network

arXiv cs.AI (Artificial Intelligence) · 2026-07-29 EN New Model Releases

Budget-Aware LLM Discovery via Cost-Calibrated Frontier Utility

GPT Inference

arXiv cs.CL (Computation and Language) · 2026-07-29 EN Infrastructure & Hardware

When Does Span-Guided Detoxification Help? Human Preferences and Evaluator Diagnostics in a Controlled Comparison

Machine Learning Neural Network Reinforcement Learning

arXiv cs.CL (Computation and Language) · 2026-07-29 EN New Model Releases

Enhancing Generative Information Extraction with Two-step Validation: A Product Attribute Use Case

Fine-tuning Llama Retrieval-Augmented Generation (RAG) Reinforcement Learning

arXiv cs.AI (Artificial Intelligence) · 2026-07-29 EN Training & Fine-tuning

FARI: Robust One-Step Inversion for Watermarking in Diffusion Models

Deep Learning Fine-tuning Neural Network

arXiv cs.CL (Computation and Language) · 2026-07-29 EN Training & Fine-tuning

Constitutional Midtraining: Content Presence Drives Alignment Gains

Anthropic Fine-tuning Machine Learning Retrieval-Augmented Generation (RAG)

arXiv cs.CL (Computation and Language) · 2026-07-29 EN Inference & Efficiency

Filesystem-Based Memory for LLM Agents: Organization, Evolution, and Sustainability

AI Agents Software Engineering

arXiv cs.CL (Computation and Language) · 2026-07-29 EN Training & Fine-tuning

FedWeave: Rethinking the Unit of Specialization in Heterogeneous Federated MoE-LoRA

Inference Mixture of Experts (MoE) Retrieval-Augmented Generation (RAG) Reinforcement Learning

arXiv cs.CL (Computation and Language) · 2026-07-29 EN Safety & Evaluation

Prosody-driven Jailbreaks in Audio LLMs: A Controlled Study and Mechanistic Analysis

GPT Speech Processing

arXiv cs.CL (Computation and Language) · 2026-07-29 EN Training & Fine-tuning

Misalignment Has a Personality: A Big Five Account of Emergent Misalignment

Deep Learning Fine-tuning Reinforcement Learning Software Engineering

arXiv cs.CL (Computation and Language) · 2026-07-29 EN New Model Releases

Diagnosing Fine-Grained Inconsistency Classification in Financial Disclosure Text

Embeddings GPT

arXiv cs.CL (Computation and Language) · 2026-07-28 EN Training & Fine-tuning

Dissecting Sensitivity to Training Language in Self-Supervised Speech Learning Using Neural Audio Codec Tokens

Retrieval-Augmented Generation (RAG) Speech Processing

Simon Willison's Weblog · 2026-07-28 EN Training & Fine-tuning extract

Quoting Akshat Bubna

Modal CTO: rogue agent abused a customer's open endpoint

OpenAI Reinforcement Learning from Human Feedback (RLHF)

Simon Willison quotes Modal CTO Akshat Bubna telling Reuters that a Modal customer had exposed an unauthenticated endpoint letting anyone run code in their sandboxes, which a 'rogue agent' abused. Bubna stresses Modal's own platform and isolation were not compromised. Broader incident context is outside the excerpt.

arXiv cs.LG (Machine Learning) · 2026-07-28 EN New Model Releases

Spend Experts Where You Are Unsure: Confidence-Adaptive Routing for Mixture-of-Experts LoRA

Llama Mixture of Experts (MoE) Retrieval-Augmented Generation (RAG)

arXiv cs.AI (Artificial Intelligence) · 2026-07-28 EN New Model Releases

Falling Behind Drives Unsafe Development in an Idealised AI Race Experiment

Deep Learning

arXiv cs.AI (Artificial Intelligence) · 2026-07-28 EN Multimodal

CHARM: A Multimodal Graph Foundation Model with Hierarchical Context Modeling for Zero-Shot Transfer

Fine-tuning Neural Network Reinforcement Learning