Industry Adoption C

  • arXiv cs.CL (Computation and Language) · EN New Model Releases extract
    Understanding the Behaviors of Environment-aware Information Retrieval
    Paper: RL adapts LLM query formulation per retriever
    Deep Learning Embeddings Retrieval-Augmented Generation (RAG) Reinforcement Learning
    An arXiv paper presents a systematic analysis of how LLMs can learn, via reinforcement learning, to adapt their query formulation strategies to different retrievers in retrieval-augmented generation.
    Read original (arXiv cs.CL (Computation and Language)) ↗
  • Lobste.rs (AI tagged) · EN Developer Tools extract
    Building llm-driven “ai” still requires domain knowledge
    Building LLM-driven tools still hinges on capturing domain knowledge
    Software Engineering
    A developer shares lessons from building an LLM-driven tool that answers user questions via a customer API. Much of the work lies in capturing and writing down domain knowledge. That is easier than with earlier generations of AI, because the knowledge need not be rigidly structured, yet it is exactly where prior efforts foundered.
    Read original (Lobste.rs (AI tagged)) ↗
  • arXiv cs.LG (Machine Learning) · EN Multimodal extract
    Gen-VCoT: Generative Visual Chain-of-Thought Reasoning via Diffusion-Based RGB Intermediate Representations
    Gen-VCoT uses generated RGB visual intermediates for multimodal reasoning
    Machine Learning
    Gen-VCoT replaces text-only chain-of-thought with generated RGB intermediates, staging visual grounding (SAM), depth (Marigold), and semantic reasoning (Qwen2-VL) under an adaptive router. It improves spatial (+25%) and depth (+50%) questions but can hurt simple factual ones; text CoT still wins on CLEVR, suggesting task-dependent representations.
    Read original (arXiv cs.LG (Machine Learning)) ↗
  • arXiv cs.LG (Machine Learning) · EN New Model Releases extract
    GD²PO: Mitigating Multi-Reward Conflicts via Group-Dynamic reward-Decoupled Policy Optimization
    GD²PO eases multi-reward conflicts in LLM RL via dynamic reward decoupling
    Algorithms & Theory Reinforcement Learning Reinforcement Learning from Human Feedback (RLHF)
    As RL post-training for LLMs adopts multi-dimensional rewards, conflicting signals across reward groups can cancel each other out and hinder training. GD²PO decouples rewards into groups and, inspired by DAPO, dynamically filters near-zero-advantage rollouts, reducing conflicts and improving RL training efficiency.
    Read original (arXiv cs.LG (Machine Learning)) ↗
  • arXiv cs.CL (Computation and Language) · EN Developer Tools extract
    Can LLM Agents Infer World Models? Evidence from Agentic Automata Learning
    Can LLM agents infer world models? Evidence from automata learning
    AI Agents Algorithms & Theory Deep Learning Neural Network Reinforcement Learning
    This arXiv paper proposes agentic automata learning to assess how well tool-calling LLM agents can uncover hidden environments through interaction. An agent must infer a hidden deterministic finite automaton (DFA) via membership and equivalence queries, yielding a scalable testbed with controlled task complexity. Evaluating state-of-the-art LLMs, the authors find performance drops sharply as DFA size grows, with reasoning models markedly stronger.
    Read original (arXiv cs.CL (Computation and Language)) ↗
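To make the query setup in this summary concrete, here is a minimal sketch of a hidden DFA that answers membership and equivalence queries. It is not the paper's testbed: the class name `DFATeacher`, the example language (binary strings with an even number of 1s), and the bounded equivalence check up to `max_len` are all assumptions made for illustration. A true equivalence query is exact, whereas this one only searches short words for a counterexample.

```python
from itertools import product


class DFATeacher:
    """Hypothetical hidden DFA that answers a learner's queries."""

    def __init__(self, alphabet, transitions, start, accepting):
        self.alphabet = alphabet
        self.transitions = transitions  # maps (state, symbol) -> next state
        self.start = start
        self.accepting = accepting

    def membership(self, word):
        """Membership query: does the hidden DFA accept this word?"""
        state = self.start
        for symbol in word:
            state = self.transitions[(state, symbol)]
        return state in self.accepting

    def equivalence(self, hypothesis, max_len=6):
        """Bounded equivalence query (an approximation, by assumption).

        `hypothesis` is a callable word -> bool. Returns the first word of
        length <= max_len on which it disagrees with the hidden DFA, or
        None if no such word is found.
        """
        for n in range(max_len + 1):
            for letters in product(self.alphabet, repeat=n):
                word = "".join(letters)
                if hypothesis(word) != self.membership(word):
                    return word
        return None


# Assumed example language: binary strings with an even number of 1s.
teacher = DFATeacher(
    alphabet="01",
    transitions={
        ("even", "0"): "even",
        ("even", "1"): "odd",
        ("odd", "0"): "odd",
        ("odd", "1"): "even",
    },
    start="even",
    accepting={"even"},
)

# A wrong hypothesis ("accept everything") yields a counterexample word.
counterexample = teacher.equivalence(lambda w: True)
# A correct hypothesis yields no counterexample within the bound.
no_counterexample = teacher.equivalence(lambda w: w.count("1") % 2 == 0)
```

In this framing, an agent would alternate between the two query types, using each counterexample to refine its hypothesis. The number of states in the hidden DFA is the natural knob for task difficulty.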
  • arXiv cs.CL (Computation and Language) · EN Training & Fine-tuning extract
    SkillWiki: A Living Knowledge Infrastructure for Agent Skills
    SkillWiki: a living knowledge infrastructure for agent skills
    While knowledge is managed via Wikipedia and software via GitHub, agent skills still lack infrastructure for large-scale production, governance, and evolution. SkillWiki is a living knowledge infrastructure turning heterogeneous knowledge into reusable skill assets linked to their originating evidence. It presents the full skill lifecycle, from knowledge ingestion to provenance-aware exploration, governance, and execution-driven evolution, with a live demo and source code available.
    Read original (arXiv cs.CL (Computation and Language)) ↗
  • arXiv cs.CL (Computation and Language) · EN Infrastructure & Hardware extract
    daVinci-kernel: Co-Evolving Skill Selection, Summarization, and Utilization via RL for GPU Kernel Optimization
    daVinci-kernel: an RL framework co-evolving skills for GPU kernel tuning
    AI Agents Fine-tuning Reinforcement Learning
    GPU kernel optimization assumes correctness and targets execution efficiency. The authors present daVinci-kernel, an RL framework coupling skill discovery and exploitation via a dynamically evolving skill library. Three agents share one LLM backbone: a Selection Agent retrieving techniques via BM25 and LLM reranking, a Policy Agent generating CUDA/Triton kernels, and a Summary Agent distilling rollouts into reusable skills. Skills are added only after execution verification confirms speedups.
    Read original (arXiv cs.CL (Computation and Language)) ↗
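The last point of this summary, that skills enter the library only after execution verification confirms a speedup, can be sketched as a simple admission gate. This is an illustration under stated assumptions, not the authors' implementation: the function name `maybe_add_skill`, its parameters, the timing inputs, and the `min_speedup` threshold are hypothetical.

```python
def maybe_add_skill(library, skill, baseline_ms, candidate_ms,
                    outputs_match, min_speedup=1.0):
    """Admit a skill only if the kernel is correct and measurably faster.

    All names here are hypothetical. `baseline_ms` and `candidate_ms` are
    assumed to be measured runtimes, and `outputs_match` is assumed to come
    from comparing the candidate kernel's output against a reference.
    """
    if not outputs_match:
        return False  # an incorrect kernel is never a reusable skill
    speedup = baseline_ms / candidate_ms
    if speedup <= min_speedup:
        return False  # no verified improvement, so nothing is stored
    library.append({"skill": skill, "speedup": speedup})
    return True


skill_library = []
added = maybe_add_skill(skill_library, "tile loads through shared memory",
                        baseline_ms=10.0, candidate_ms=5.0, outputs_match=True)
slower = maybe_add_skill(skill_library, "unroll inner loop",
                         baseline_ms=10.0, candidate_ms=12.0, outputs_match=True)
wrong = maybe_add_skill(skill_library, "skip bounds check",
                        baseline_ms=10.0, candidate_ms=1.0, outputs_match=False)
```

Gating on measured results keeps unverified or harmful techniques out of the library, which matters when later rollouts retrieve from it.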
  • ITmedia AI+ · JA New Model Releases extract
    Java app updates sped up from one month to three days: how "IBM Bob" goes beyond being just a source-code generation AI
    IBM unveils 'IBM Bob', an AI that speeds Java app modernization
    IBM's new AI tool 'IBM Bob' reportedly cut Java application modernization from one month to three days at early adopters. According to the article, what sets it apart is that it goes beyond mere source-code generation.
    Read original (ITmedia AI+) ↗
  • Cohere Blog · EN Funding & M&A extract
    Cohere triples UK footprint with new London office to support R&D growth
    Cohere triples its UK footprint with a new London R&D office
    Neural Network Reinforcement Learning
    Cohere announced it will move to a larger London office at 100 New Oxford Street, nearly tripling its UK footprint. The expansion backs the city's AI talent and R&D base and supports growing demand for secure, enterprise-grade sovereign AI across the UK and Europe.
    Read original (Cohere Blog) ↗
  • OpenAI Blog · EN Industry Adoption extract
    Introducing the OpenAI Partner Network
    OpenAI launches Partner Network, investing $150M to speed enterprise AI
    OpenAI
    OpenAI introduced its Partner Network, committing $150M to help global partners accelerate enterprise AI adoption, deployment, and transformation. The program aims to broaden OpenAI's reach into enterprise markets through a structured partner ecosystem.
    Read original (OpenAI Blog) ↗
  • Sakana AI Blog (ja) · JA New Model Releases extract
    Sakana AI begins offering its first commercial product, "Sakana Marlin"
    Sakana AI launches Marlin, its first commercial autonomous research assistant
    AI Agents Algorithms & Theory Inference Neural Network Reinforcement Learning
    Sakana AI has launched Sakana Marlin, its first commercial product: an autonomous research assistant for business. Given a research theme, it works autonomously for up to about eight hours, forming hypotheses and gathering and verifying information, then outputs structured summary slides and a report spanning dozens of pages. Built on the firm's long-horizon reasoning technology, it aims to act as a 'virtual CSO'. It is self-serve and available the same day, with plans from free pay-per-use to Enterprise.
    Read original (Sakana AI Blog (ja)) ↗
  • Simon Willison's Weblog · EN Developer Tools extract
    Statement on the US government directive to suspend access to Fable 5 and Mythos 5
    Willison on the US directive to suspend Fable 5 and Mythos 5
    Anthropic Claude
    Simon Willison comments on the US government's national-security export-control directive suspending all foreign-national access to Fable 5 and Mythos 5, calling the move extraordinary and questioning its rationale and impact.
    Read original (Simon Willison's Weblog) ↗
  • ITmedia AI+ · JA Industry Adoption extract
    I tried making a YouTube video with the latest AI, "Fable 5": astonished by better-than-expected results, but there is a major weakness too
    Hands-on: making a YouTube video with the new Fable 5 AI
    A hands-on test of using the new Fable 5 AI to produce a YouTube video. The author is impressed by the surprisingly high quality of the output but also flags a significant weakness in the workflow.
    Read original (ITmedia AI+) ↗
  • Anthropic News · EN Industry Adoption extract
    TCS and Anthropic partner to bring Claude to regulated industries
    Anthropic partners with TCS to bring Claude to regulated industries
    Anthropic Claude Neural Network Reinforcement Learning
    Anthropic announced a partnership with Tata Consultancy Services. TCS will deploy Claude to 50,000 employees across 56 countries, build Claude-powered products for finance, healthcare and the public sector, and join the Claude Partner Network.
    Read original (Anthropic News) ↗
  • arXiv cs.CL (Computation and Language) · EN New Model Releases extract
    AgentSpec: Understanding Embodied Agent Scaffolds Through Controlled Composition
    AgentSpec dissects embodied agent scaffolds via controlled composition
    AI Agents Machine Learning Reinforcement Learning
    AgentSpec studies scaffolded LLM agents that combine reasoning, memory, reflection, and action through controlled composition. It aims to isolate how each component contributes to overall performance.
    Read original (arXiv cs.CL (Computation and Language)) ↗
  • arXiv cs.AI (Artificial Intelligence) · EN Developer Tools extract
    Listening with Attention: Entropy-Guided Explainability for Transformer-Based Audio Models
    Entropy-guided explainability for Transformer-based audio models
    Speech Processing Transformer
    Transformer-based ASR models like Whisper are accurate but hard to interpret, and existing XAI methods lack faithfulness and temporal precision. The paper proposes an entropy-guided explainability approach for such audio models.
    Read original (arXiv cs.AI (Artificial Intelligence)) ↗
  • arXiv cs.AI (Artificial Intelligence) · EN Industry Adoption extract
    When Good Verifiers Go Bad: Self-Improving VLMs Can Regress on New Tasks
    When good verifiers go bad: self-improving VLMs can regress on new tasks
    Neural Network Reinforcement Learning from Human Feedback (RLHF)
    Verifier-driven self-DPO, where a frozen verifier scores candidates to form preference pairs, is a common recipe for self-improving vision-language models. The paper shows that under this setup VLMs can regress on new tasks when the verifier misbehaves.
    Read original (arXiv cs.AI (Artificial Intelligence)) ↗
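The recipe described in this summary, a frozen verifier scoring candidates to form preference pairs, can be sketched as follows. This is a minimal illustration, not the paper's code: the function name `build_preference_pair`, the `min_margin` parameter, and the toy length-based verifier are assumptions. The toy verifier is deliberately flawed to show how a misbehaving scorer produces a misleading pair.

```python
def build_preference_pair(candidates, verifier, min_margin=0.0):
    """Form one (chosen, rejected) pair from verifier scores.

    `verifier` is any callable candidate -> float and is never updated,
    mirroring the "frozen verifier" setup. Returns None when the top and
    bottom scores are too close to express a preference.
    """
    scored = sorted(((verifier(c), c) for c in candidates),
                    key=lambda pair: pair[0])
    low_score, rejected = scored[0]
    high_score, chosen = scored[-1]
    if high_score - low_score <= min_margin:
        return None
    return {"chosen": chosen, "rejected": rejected}


# Assumed toy verifier that rewards longer answers regardless of correctness.
def length_verifier(answer):
    return float(len(answer))


pair = build_preference_pair(
    ["4", "The answer is probably 5, as explained at length."],
    length_verifier,
)
tie = build_preference_pair(["ab", "cd"], length_verifier)
```

Here the pair prefers the longer answer only because of how the verifier scores, so training on such pairs would reinforce the verifier's bias rather than task quality.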
  • arXiv cs.LG (Machine Learning) · EN Infrastructure & Hardware extract
    A Statistical and Machine Learning Framework for Operational Threshold Detection and Deployable Dispatch Controller Development in Hydrogen Multi-Energy Systems
    ML framework for threshold detection in hydrogen multi-energy systems
    Machine Learning Reinforcement Learning
    The study presents a statistical and machine learning framework characterizing a hydrogen-based multi-energy system. It targets operational threshold detection and deployable dispatch controller development.
    Read original (arXiv cs.LG (Machine Learning)) ↗
  • arXiv cs.AI (Artificial Intelligence) · EN Safety & Evaluation extract
    When Errors Become Narratives: A Longitudinal Taxonomy of Silent Failures in a Production LLM Agent Runtime
    A longitudinal taxonomy of silent failures in a production LLM agent runtime
    Meta
    LLM agents increasingly run as long-lived autonomous runtimes that schedule jobs, call tools, maintain memory, and push results to humans. This longitudinal study of one persistent system presents a taxonomy of its silent failures.
    Read original (arXiv cs.AI (Artificial Intelligence)) ↗
  • arXiv cs.AI (Artificial Intelligence) · EN Safety & Evaluation extract
    VISTA: View-Consistent Self-Verified Training for GUI Grounding
    VISTA: view-consistent self-verified training for GUI grounding
    Reinforcement Learning Software Engineering
    Applying GRPO to GUI grounding samples rollouts from a single screenshot, so groups often end up all-failure or all-success and yield a weak learning signal. VISTA introduces view-consistent, self-verified training to stabilize GUI grounding.
    Read original (arXiv cs.AI (Artificial Intelligence)) ↗
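The weak-signal problem in this summary follows from how group-relative advantages are computed. The sketch below assumes the commonly described GRPO normalization (reward minus group mean, divided by group standard deviation); the function name, the use of population standard deviation, and the `eps` value are assumptions, and the code does not implement VISTA itself.

```python
from statistics import mean, pstdev


def group_advantages(rewards, eps=1e-6):
    """Group-relative advantages: (r - mean) / (std + eps).

    Population standard deviation and eps=1e-6 are assumed choices.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


all_failure = group_advantages([0, 0, 0, 0])   # every advantage is 0.0
all_success = group_advantages([1, 1, 1, 1])   # every advantage is 0.0
mixed = group_advantages([1, 0, 0, 0])         # non-zero advantages
```

When every rollout in a group earns the same reward, each advantage is zero and the group contributes no gradient. Only mixed groups carry a learning signal.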
  • arXiv cs.AI (Artificial Intelligence) · EN Inference & Efficiency extract
    Securing the Future of IoMT in the Post-Quantum Era: An Edge-Native Federated Learning Approach
    Edge-native federated learning to secure IoMT in the post-quantum era
    Deep Learning
    Internet of Medical Things devices handle sensitive health data under tight resource constraints, making security and privacy critical, while federated learning adds complexity. The paper proposes an edge-native federated learning approach to secure IoMT in the post-quantum era.
    Read original (arXiv cs.AI (Artificial Intelligence)) ↗
  • NVIDIA Developer Blog · EN Industry Adoption extract
    Deploy Long-Context Reasoning and Agentic Workflows with MiniMax M3 on NVIDIA Accelerated Infrastructure
    NVIDIA details deploying MiniMax M3 for long-context agentic workflows
    Generative AI NVIDIA Retrieval-Augmented Generation (RAG)
    NVIDIA's developer blog explains how to deploy MiniMax M3 on NVIDIA accelerated infrastructure for long-context reasoning and agentic workflows, addressing fragmented enterprise AI pipelines spanning text, vision, and other modalities.
    Read original (NVIDIA Developer Blog) ↗
  • arXiv cs.LG (Machine Learning) · EN Safety & Evaluation extract
    Beyond the Training Distribution: Evaluating Predictions Under Distribution Shift and Selection Bias
    Evaluating predictions under distribution shift and selection bias
    Algorithms & Theory Machine Learning
    Knowing how a model will perform in a new environment before deployment helps prevent harm. The paper evaluates predictions under two common sources of degradation: distribution shift and selection bias.
    Read original (arXiv cs.LG (Machine Learning)) ↗