Industry Adoption C

  • arXiv cs.CL (Computation and Language) · EN New Model Releases extract
    GateMem: Benchmarking Memory Governance in Multi-Principal Shared-Memory Agents
    GateMem: benchmarking memory governance in shared-memory agents
    AI Agents Neural Network
    Memory benchmarks for LLM agents largely assume single-user settings, leaving shared-memory governance untested. GateMem benchmarks memory governance, such as access control and management, in multi-principal shared-memory agents.
    Read original (arXiv cs.CL (Computation and Language)) ↗
  • arXiv cs.CL (Computation and Language) · EN Safety & Evaluation extract
    ForecastBench-Sim: A Simulated-World Forecasting Benchmark
    ForecastBench-Sim: a simulated-world forecasting benchmark
    Reinforcement Learning Software Engineering
    Forecasting benchmarks for general-purpose AI usually inherit real-world events, making evaluation hard to control. ForecastBench-Sim introduces a simulated-world forecasting benchmark, enabling controlled assessment of AI forecasting ability.
    Read original (arXiv cs.CL (Computation and Language)) ↗
  • ITmedia AI+ · JA Developer Tools extract
    Hitachi steps up its collaboration with OpenAI: legacy system modernization with "Codex," plus cyber defense
    Hitachi deepens OpenAI tie-up, using Codex to modernize legacy systems
    OpenAI
    Hitachi is expanding its partnership with OpenAI, pairing the code-analysis AI "Codex" with its own systems-development expertise. It aims to establish an AI-driven workflow that runs from visualizing upstream specifications based on existing code through to migration testing, and it also cites cybersecurity defense as a use case.
    Read original (ITmedia AI+) ↗
  • Preferred Networks Tech Blog · JA Training & Fine-tuning extract
    Using PLaMo-3.0-Prime-β in day-to-day LLM development
    Preferred Networks shows PLaMo-3.0-Prime-β in real LLM development
    Deep Learning
    Preferred Networks continues developing its large language model PLaMo and shares how to use the latest PLaMo-3.0-Prime-β in real development work. Beyond training large models, it covers the many surrounding tasks involved in building high-performance LLMs in practice.
    Read original (Preferred Networks Tech Blog) ↗
  • Cohere Blog · EN Inference & Efficiency extract
    LLM Serving Fairness: No more noisy neighbours
    Cohere ensures fair compute sharing across LLM serving tenants
    Deep Learning Inference Meta Neural Network Reinforcement Learning
    Cohere details how it ensures every tenant gets a fair share of compute in LLM serving, tackling the 'noisy neighbour' problem where one user monopolizes resources. The design allocates capacity fairly across tenants to deliver stable, predictable multi-tenant performance.
    Read original (Cohere Blog) ↗
  • ITmedia AI+ · JA Industry Adoption extract
    Self-service refueling was actually being approved manually by staff!? Can Cosmo Oil's "AI monitoring" save disappearing gas stations?
    Cosmo Oil and ELEMENTS build AI to approve self-service refueling
    At self-service gas stations in Japan, staff still manually approve refueling after a safety check. Cosmo Oil Marketing and ELEMENTS have jointly developed a monitoring system in which AI judges whether to permit refueling, aiming to support that task. The article cites labor shortages and a declining number of service stations as background. Details are per the article and the companies.
    Read original (ITmedia AI+) ↗
  • arXiv cs.AI (Artificial Intelligence) · EN Industry Adoption extract
    Visual Verification Enables Inference-time Steering and Autonomous Policy Improvement
    VERITAS steers and self-improves robot policies at inference time
    Inference Reinforcement Learning
    The paper proposes VERITAS, a generator-verifier framework pairing a pre-trained generalist robot policy with a gradient-free visual verifier that evaluates actions at inference time, improving performance without extra training and enabling self-improvement.
    Read original (arXiv cs.AI (Artificial Intelligence)) ↗
  • arXiv cs.CL (Computation and Language) · EN Safety & Evaluation extract
    Analyzing and Encoding the Al-Mawrid Arabic-English Dictionary with the ISO Lexical Markup Framework and TEI Lex-0
    Encoding the Al-Mawrid Arabic-English dictionary with LMF and TEI Lex-0
    The paper presents a methodology to systematically digitize and encode the legacy print Al-Mawrid Arabic-English dictionary using the ISO Lexical Markup Framework (LMF) and TEI Lex-0, addressing a gap in Arabic lexical infrastructure by producing a standardized computational lexicon.
    Read original (arXiv cs.CL (Computation and Language)) ↗
  • arXiv cs.AI (Artificial Intelligence) · EN Developer Tools extract
    All Smoke, No Alarm: Oracle Signals in Agent-Authored Test Code
    Study finds agent-authored test code often lacks real verification logic
    AI Agents Claude OpenAI
    The paper examines test code generated by AI coding agents in open-source pull requests, arguing that test files lacking explicit assertions verify no behavior, so presence-based quality gates overestimate verification strength.
    Read original (arXiv cs.AI (Artificial Intelligence)) ↗
  • arXiv cs.LG (Machine Learning) · EN Developer Tools extract
    Memory as a Wasting Asset: Pricing Flash Endurance for Embodied Agents, and the Limits of Doing So
    Pricing flash endurance as a wasting asset for embodied agents
    AI Agents
    A robot's flash endurance is a non-renewable stock: each persisted write spends one of a few thousand program/erase cycles and never refills. The paper frames flash endurance as a wasting asset, proposes pricing it for embodied agents, and examines the limits of doing so.
    Read original (arXiv cs.LG (Machine Learning)) ↗
  • arXiv cs.AI (Artificial Intelligence) · EN New Model Releases extract
    Knowledge Reutilization in Meta-Reinforcement Learning
    A meta-knowledge reutilization framework for meta-RL across agents
    AI Agents Inference Meta Reinforcement Learning
    The paper proposes a meta-knowledge reutilization framework for meta-reinforcement learning that learns task-level knowledge on a dynamics-simplified agent and transfers it to heterogeneous agents, using a Bayesian non-parametric prior to organize latent task modes.
    Read original (arXiv cs.AI (Artificial Intelligence)) ↗
  • arXiv cs.LG (Machine Learning) · EN Inference & Efficiency extract
    Ternary Mamba: Grouped Quantization-Aware Training of W1.58A16 State Space Models
    Ternary Mamba: grouped QAT for W1.58A16 state space models
    Inference Quantization Retrieval-Augmented Generation (RAG) Transformer
    Ternary Mamba applies grouped quantization-aware training to Mamba state space models with ternary (W1.58) weights and 16-bit activations, targeting efficient low-bit training and inference of sequence models while preserving accuracy.
    Read original (arXiv cs.LG (Machine Learning)) ↗
  • arXiv cs.CL (Computation and Language) · EN Safety & Evaluation extract
    PseudoBench: Measuring How Agentic Auto-Research Fuels Pseudoscience
    PseudoBench measures how agentic auto-research fuels pseudoscience
    AI Agents Deep Learning
    As LLM-based agents enter autonomous scientific research, resisting pseudoscience matters. PseudoBench is an adversarial benchmark measuring how such agents may rapidly generate plausible yet misleading studies that contaminate academic literature.
    Read original (arXiv cs.CL (Computation and Language)) ↗
  • arXiv cs.LG (Machine Learning) · EN Developer Tools extract
    INI-VPINN: A Variational Physics-Informed Neural Network with Implicit Neumann and Interface Handling for Multi-Material Domains with Geometric Singularities
    INI-VPINN: a variational PINN for multi-material domains
    Deep Learning Neural Network
    INI-VPINN is a weak-form physics-informed neural network that naturally incorporates Neumann boundary and interface conditions into a variational formulation, targeting multi-material domains with geometric singularities.
    Read original (arXiv cs.LG (Machine Learning)) ↗
  • arXiv cs.LG (Machine Learning) · EN Infrastructure & Hardware extract
    Predictive Analytics in E-Commerce for Customer Behavior Forecasting using hybrid Ret-DNN with XGBoost Model
    Hybrid Ret-DNN with XGBoost for e-commerce behavior forecasting
    Deep Learning Neural Network
    E-commerce platforms struggle to understand customer behavior and predict future purchases. The study proposes predictive analytics using a hybrid Ret-DNN combined with an XGBoost model to forecast customer behavior.
    Read original (arXiv cs.LG (Machine Learning)) ↗
  • arXiv cs.CL (Computation and Language) · EN New Model Releases extract
    Dynamic Rollout Editing for Reducing Overthinking in RL-Trained Reasoning Models
    Dynamic rollout editing reduces overthinking in RL reasoning models
    Neural Network Reinforcement Learning Software Engineering
    Long chain-of-thought reasoning helps, but models often keep generating unnecessary reasoning after reaching a correct answer. Framing this as overthinking in GRPO-style RL post-training, the paper proposes dynamic rollout editing to reduce it.
    Read original (arXiv cs.CL (Computation and Language)) ↗
  • ITmedia AI+ · JA New Model Releases extract
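    OpenAI's advanced AI finds 10,000 vulnerabilities at SoftBank; Masayoshi Son calls it "a grave crisis"; diagnostic service to be offered to Japan's critical infrastructure companies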
    SoftBank unveils OpenAI-powered Patching-as-a-Service security offering
    GPT OpenAI
    On June 16, SoftBank Group announced "Patching as a Service," a cybersecurity offering built on OpenAI technologies such as "GPT-5.5 Cyber." It simulates attacks on corporate systems to find vulnerabilities, then covers remediation end to end, from proposing a plan through implementation. SoftBank says it will prioritize select firms supporting Japan's critical infrastructure, while chairman Masayoshi Son stressed the gravity of the cyber threat.
    Read original (ITmedia AI+) ↗
  • arXiv cs.CL (Computation and Language) · EN Multimodal extract
    EnvRL: Learn from Environment Dynamics in Agentic Reinforcement Learning
    EnvRL learns from environment dynamics in agentic RL
    AI Agents Retrieval-Augmented Generation (RAG) Reinforcement Learning
    EnvRL is a method that learns from environment dynamics in agentic reinforcement learning, leveraging the structure of agent-environment interaction to improve learning efficiency and performance.
    Read original (arXiv cs.CL (Computation and Language)) ↗
  • arXiv cs.CL (Computation and Language) · EN Safety & Evaluation extract
    Prompt Perturbation for Reliable LLM Evaluation over Comparison Graphs
    Prompt perturbation for reliable LLM evaluation over comparison graphs
    Evaluating LLMs is important but can be fragile to small prompt changes. The paper proposes using prompt perturbation to achieve more reliable LLM evaluation over comparison graphs.
    Read original (arXiv cs.CL (Computation and Language)) ↗
  • OpenAI Blog · EN Safety & Evaluation extract
    Predicting model behavior before release by simulating deployment
    OpenAI unveils Deployment Simulation to predict model behavior pre-release
    OpenAI
    OpenAI introduced Deployment Simulation, a method to predict an AI model's behavior before deployment by using real conversation data to simulate responses, aiming to improve safety and evaluation accuracy. The claims are OpenAI's own and not independently verified.
    Read original (OpenAI Blog) ↗
  • ITmedia AI+ · JA Industry Adoption extract
    How Osaka Gas and others use Notion × AI to eliminate 2,000 wasted hours a month: putting "unused information" to work
    Osaka Gas cuts 2,000 hours/month via Notion-plus-AI knowledge reuse
    Two companies, including Osaka Gas, sharply reduced the burden of hunting for documents by combining Notion with AI. The case reports savings of 2,000 hours per month, shows how buried information can be turned into organizational knowledge assets, and highlights how to build systems that prevent over-reliance on individuals.
    Read original (ITmedia AI+) ↗
  • ITmedia AI+ · JA Industry Adoption extract
    We tried out how far generative AI × 3D CAD can go
    Testing generative AI with 3D CAD using Autodesk Fusion's Assistant
    Generative AI is expanding beyond text, images, and video into 3D CAD, with environments emerging that draft 3D models from natural-language prompts alone. The article tries Autodesk Fusion's Autodesk Assistant to model a plastic bottle, illustrating both the promise and current limits of pairing generative AI with 3D CAD.
    Read original (ITmedia AI+) ↗
  • ITmedia AI+ · JA Industry Adoption extract
    30 billion yen with "no ROI questions asked": for SMBC, the force behind Olive and Trunk, the essence of new business lies in "withdrawal"
    SMBC plans 50B yen in generative-AI investment; key to ventures is exit
    Sumitomo Mitsui Financial Group grew its Olive and Trunk services and unveiled a 50-billion-yen generative-AI investment plan. A decade ago the bank lagged rivals in mobile; it has since become an organization that repeatedly ships new ventures, and it locates the essence of new business in knowing when to exit.
    Read original (ITmedia AI+) ↗
  • arXiv cs.CL (Computation and Language) · EN Training & Fine-tuning extract
    The Value Axis: Language Models Encode Whether They're on the Right Track
    LLMs encode a 'value axis' tracking if their strategy works
    Fine-tuning Reinforcement Learning Reinforcement Learning from Human Feedback (RLHF)
    Researchers built a 'value axis' for Qwen3-8B that captures whether its current strategy is likely to reach its goal. The axis separates high- and low-confidence rollouts, backtracking, and correct vs. corrupted code; steering it up suppresses self-correction while steering down induces exploration. DPO can raise the internal value of rewarded behaviors.
    Read original (arXiv cs.CL (Computation and Language)) ↗
  • arXiv cs.LG (Machine Learning) · EN Multimodal extract
    ROVE: Unlocking Human Interventions for Humanoid Manipulation via Reinforcement Learning
    ROVE: RL that learns humanoid manipulation from imperfect interventions
    Computer Vision Machine Learning Reinforcement Learning
    ROVE is an RL framework for post-training humanoid Vision-Language-Action models from imperfect human interventions. It pairs a human-in-the-loop data pipeline with Optimistic Value Estimation to prioritize high-value behaviors in mixed-quality trajectories, and adds cross-embodiment human videos to robustify value estimation.
    Read original (arXiv cs.LG (Machine Learning)) ↗
  • arXiv cs.AI (Artificial Intelligence) · EN Inference & Efficiency extract
    A Causal Model of Theory of Mind in Conflict for Artificial Intelligence
    A structural causal model for when AI should engage theory of mind in conflict
    Inference
    Theory of mind (ToM), ascribing mental states to others for prediction and inference, is widely assumed essential for human-machine integration. Existing AI-ToM models address how to mentalize but leave the question of when to do so largely unaddressed. The paper asks under what situational and agent-level conditions ToM engagement is causally warranted in conflict, presenting a structural causal model as a directed acyclic graph that treats ToM as a mechanism activated by conditions rather than an always-on capacity.
    Read original (arXiv cs.AI (Artificial Intelligence)) ↗
  • arXiv cs.LG (Machine Learning) · EN Developer Tools extract
    CrossMaps: Confidence-Aware Open-Vocabulary Semantic Mapping for Rover Navigation
    CrossMaps: confidence-aware open-vocabulary semantic mapping for rovers
    Embeddings
    Rovers rely on perception to maintain spatial maps encoding objects and sensor quality (range reliability, lighting artifacts, data density) to guide fusion, embedding updates, and navigation under partial observability. The paper presents CrossMaps, a real-time confidence-aware open-vocabulary semantic mapping pipeline that builds language-queryable maps from RGB-D data, extending VLMaps-style approaches with multi-scale CLIP embeddings, confidence-aware fusion, and a dual-memory architecture.
    Read original (arXiv cs.LG (Machine Learning)) ↗
  • arXiv cs.CL (Computation and Language) · EN Inference & Efficiency extract
    Exploring Extrinsic and Intrinsic Properties for Effective Reasoning with Code Interpreter
    Study probes extrinsic and intrinsic traits of code-interpreter reasoning
    Fine-tuning Inference Retrieval-Augmented Generation (RAG) Reinforcement Learning
    This paper studies reasoning with a Code Interpreter (CI) in LLMs from two angles: extrinsic properties (crucial tokens) and intrinsic properties (code-specific cognitive behaviors). It reports that stronger CI reasoning models show more crucial tokens and more of these behaviors (especially verification, backtracking, and backward chaining), and it explores leveraging both at inference and training time. Summarized neutrally from the abstract.
    Read original (arXiv cs.CL (Computation and Language)) ↗
  • arXiv cs.AI (Artificial Intelligence) · EN Industry Adoption extract
    Beyond Models: Reflections on Engineering AI-enabled Systems in a Project-Based Course
    Reflections on teaching the engineering of AI-enabled systems in a course
    Algorithms & Theory Machine Learning Neural Network Reinforcement Learning Software Engineering
    This paper reflects on a project-based master's course at the University of Bremen on engineering AI-enabled systems. It argues that machine learning courses emphasize model development while students lack experience in architectural design, deployment, and monitoring, and reports on the course's design and implementation.
    Read original (arXiv cs.AI (Artificial Intelligence)) ↗
  • arXiv cs.CL (Computation and Language) · EN Industry Adoption extract
    Does Traversal Order Matter? A Systematic Study of Tree Traversal Methods in Transformer Grammars
    Paper compares tree traversal orders in Transformer Grammars
    Reinforcement Learning Transformer
    An arXiv paper systematically studies tree linearization orders in Transformer Grammars, exploring breadth-first and a novel Production-Rule Traversal alongside the conventional depth-first approach. Summarized neutrally from the abstract.
    Read original (arXiv cs.CL (Computation and Language)) ↗