New Model Releases

  • ITmedia AI+ · JA New Model Releases extract
    RIKEN decides on the name "理究" (Rikyu) for its AI for Science supercomputer: where does the name come from?
    RIKEN names its AI-for-Science supercomputer 'Rikyu' (理究)
    RIKEN, Japan's national research institute, has decided to name its supercomputer dedicated to AI for Science 'Rikyu' (理究). The announcement also explains the origin and reasoning behind the chosen name.
    Read original (ITmedia AI+) ↗
  • ITmedia AI+ · JA New Model Releases extract
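    GMO subsidiary becomes Unitree's authorized distributor in Japan, offering end-to-end support for humanoid robots from deployment to maintenance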
    GMO unit becomes Unitree's official robot distributor in Japan
    Robotics
    GMO AI & Robotics, a GMO Internet Group subsidiary, has signed an official distributor agreement with China's Unitree Robotics for the Japanese market. It will offer end-to-end support, from deployment to maintenance, aiming to accelerate humanoid robot adoption across Japan.
    Read original (ITmedia AI+) ↗
  • ITmedia AI+ · JA New Model Releases extract
    "Record" on-screen operations and let AI do the work: Codex gets a new "Record & Replay" feature
    OpenAI adds 'Record & Replay' to Codex to automate recorded UI steps
    OpenAI
    OpenAI has added a new 'Record & Replay' feature to its Codex coding agent. Users record on-screen operations, and the AI then reproduces those steps to carry out the task automatically, according to ITmedia.
    Read original (ITmedia AI+) ↗
  • ITmedia AI+ · JA New Model Releases extract
    Gartner sounds the alarm: as privacy-law enforcement ramps up, what should CISOs review?
    Gartner warns US privacy-law fines topped $3.4B in 2025
    Gartner reports that US state authorities imposed about $3.425 billion in privacy-law violation fines in 2025, exceeding the combined total of the previous five years. It expects enforcement to keep accelerating through 2028, urging CISOs to reconsider their privacy and compliance posture.
    Read original (ITmedia AI+) ↗
  • ITmedia AI+ · JA New Model Releases extract
    ChatGPT ad tests begin in Japan too: how can the ads be hidden?
    OpenAI begins testing ads in ChatGPT in Japan
    GPT OpenAI
    OpenAI's Japan arm announced it has started testing ad displays within ChatGPT in Japan. The article explains how the ads appear and how users can hide them.
    Read original (ITmedia AI+) ↗
  • Simon Willison's Weblog · EN New Model Releases extract
    Datasette Apps: Host custom HTML applications inside Datasette
    Datasette Apps lets you host custom HTML apps inside Datasette
    Machine Learning Neural Network
    Simon Willison introduced Datasette Apps, letting developers host custom HTML/JS applications inside a Datasette instance. The apps can read Datasette's databases, enabling lightweight, data-backed web apps served directly from the data exploration tool itself.
    Read original (Simon Willison's Weblog) ↗
  • arXiv cs.LG (Machine Learning) · EN Multimodal extract
    UNIEGO: Proxies as Mediators for Unified Egocentric Video Representation Learning
    UNIEGO: unified egocentric video encoder via multi-teacher distillation
    Neural Network
    UNIEGO is a unified egocentric video encoder trained via a hierarchical multi-teacher distillation framework. Representation-specific proxy models translate knowledge from teachers spanning multiple viewpoints, modalities, and foundation models into a single egocentric space, while remaining deployable from egocentric video alone.
    Read original (arXiv cs.LG (Machine Learning)) ↗
  • arXiv cs.LG (Machine Learning) · EN New Model Releases extract
    Predictability as a Fine-Grained Measure for Privacy
    Privacy via predictability, a fine-grained privacy measure
    The paper introduces 'privacy via predictability,' a fine-grained privacy framework that explicitly incorporates an attacker's core prior knowledge. It aims to ease the costly privacy-accuracy tradeoff imposed by the worst-case guarantees of differential privacy.
    Read original (arXiv cs.LG (Machine Learning)) ↗
  • arXiv cs.LG (Machine Learning) · EN Safety & Evaluation extract
    Multi-Task Bayesian In-Context Learning
    Multi-task Bayesian inference via in-context learning
    Inference Meta Reinforcement Learning Transformer
    The paper studies multi-task Bayesian in-context learning, using in-context learning to perform Bayesian predictive inference across tasks. It targets the intractability of exact inference and the cost or restrictiveness of scalable approximations, aiming for uncertainty quantification and data efficiency.
    Read original (arXiv cs.LG (Machine Learning)) ↗
  • arXiv cs.LG (Machine Learning) · EN Inference & Efficiency extract
    Execution-State Capsules: Graph-Bound Execution-State Checkpoint and Restore for Low-Latency, Small-Batch, On-Device Physical-AI Serving
    Execution-State Capsules: checkpoint/restore for on-device AI serving
    AI Agents Meta NVIDIA Retrieval-Augmented Generation (RAG) Speech Processing
    The paper introduces Execution-State Capsules, a graph-bound mechanism to checkpoint and restore execution state for low-latency, small-batch, on-device physical-AI serving. It targets scenarios beyond the high-throughput, high-concurrency regime that paged or radix KV caches mainly serve.
    Read original (arXiv cs.LG (Machine Learning)) ↗
  • arXiv cs.AI (Artificial Intelligence) · EN New Model Releases extract
    LedgerAgent: Structured State for Policy-Adherent Tool-Calling Agents
    LedgerAgent: structured state for policy-adherent tool-calling agents
    AI Agents Inference Retrieval-Augmented Generation (RAG)
    Policy-adherent tool-calling agents in customer-service domains must track task state across turns while following rules. LedgerAgent introduces structured state to help such agents stay consistent and policy-compliant.
    Read original (arXiv cs.AI (Artificial Intelligence)) ↗
  • arXiv cs.CL (Computation and Language) · EN Safety & Evaluation extract
    StylisticBias: A Few Human Visual Cues Drive Most Social Biases in MLLMs
    StylisticBias: few visual cues drive most social bias in MLLMs
    Machine Learning Reinforcement Learning
    StylisticBias investigates the visual cues that shape how multimodal large language models judge people. The study finds that a small set of human visual cues drives most of the social biases exhibited by MLLMs, which are increasingly deployed in consequential settings.
    Read original (arXiv cs.CL (Computation and Language)) ↗
  • arXiv cs.AI (Artificial Intelligence) · EN New Model Releases extract
    DeepSWIP: Quotient-WMC Counterfactuals for Neural Probabilistic Logic Programs
    DeepSWIP: quotient-WMC counterfactuals for neural probabilistic logic programs
    Inference Reinforcement Learning
    Neurosymbolic systems such as DeepProbLog combine neural perception with probabilistic logic, but standard inference has limits. DeepSWIP introduces quotient-WMC counterfactuals to enable counterfactual reasoning in neural probabilistic logic programs.
    Read original (arXiv cs.AI (Artificial Intelligence)) ↗
  • arXiv cs.AI (Artificial Intelligence) · EN Developer Tools extract
    Sovereign Execution Brokers: Enforcing Certificate-Bound Authority in Agentic Control Planes
    Sovereign Execution Brokers for agentic control planes
    AI Agents Neural Network
    Autonomous agents are increasingly wired into cloud, deployment, and data-control workflows, straining production security. This work proposes sovereign execution brokers that enforce certificate-bound authority within agentic control planes.
    Read original (arXiv cs.AI (Artificial Intelligence)) ↗
  • arXiv cs.AI (Artificial Intelligence) · EN New Model Releases extract
    FlowEdit: Associative Memory for Lifelong Pronunciation Adaptation in Flow-Matching TTS
    FlowEdit: associative memory for lifelong pronunciation adaptation in TTS
    Embeddings Inference Speech Processing
    Flow-matching text-to-speech achieves strong zero-shot quality but stays static after deployment. FlowEdit uses associative memory to enable lifelong pronunciation adaptation without full retraining.
    Read original (arXiv cs.AI (Artificial Intelligence)) ↗
  • arXiv cs.AI (Artificial Intelligence) · EN Safety & Evaluation extract
    Multi-LCB: Extending LiveCodeBench to Multiple Programming Languages
    Multi-LCB: extending LiveCodeBench to multiple programming languages
    Reinforcement Learning Software Engineering
    LiveCodeBench has become a widely adopted benchmark for evaluating large language models on code. Multi-LCB extends it to multiple programming languages to assess multilingual code generation.
    Read original (arXiv cs.AI (Artificial Intelligence)) ↗
  • arXiv cs.LG (Machine Learning) · EN New Model Releases extract
    Probe-and-Refine Tuning of Repository Guidance for Coding Agents
    Probe-and-Refine: tuning repository guidance for coding agents
    AI Agents Fine-tuning Retrieval-Augmented Generation (RAG) Software Engineering
    The paper presents Probe-and-Refine, a method for tuning the repository guidance (such as AGENTS.md files) that LLM-based coding agents rely on. It targets the higher-level operational knowledge (file layout, test workflows, and error-prone patterns) that is not contained in the code itself.
    Read original (arXiv cs.LG (Machine Learning)) ↗
  • arXiv cs.AI (Artificial Intelligence) · EN Agents & Tool Use extract
    Efficient and Sound Probabilistic Verification for AI Agents
    Efficient and sound probabilistic verification for AI agents
    AI Agents Deep Learning Inference Neural Network
    Securing AI agents that operate in complex digital environments has become critical, motivating runtime verification. This paper presents an efficient and sound probabilistic verification approach for AI agents.
    Read original (arXiv cs.AI (Artificial Intelligence)) ↗
  • arXiv cs.AI (Artificial Intelligence) · EN Safety & Evaluation extract
    FreeStyle: Free Control of Style-Content Dual-Reference Generation from Community LoRA Mining
    FreeStyle: dual-reference style-content control via community LoRA mining
    Retrieval-Augmented Generation (RAG)
    Style-content dual-reference generation aims to synthesize an image that preserves the structure of a content reference while adopting the style of a style reference. FreeStyle leverages community LoRA mining to give free control over both style and content.
    Read original (arXiv cs.AI (Artificial Intelligence)) ↗
  • arXiv cs.AI (Artificial Intelligence) · EN Training & Fine-tuning extract
    Calibration Without Comprehension: Diagnosing the Limits of Fine-Tuning LLMs for Vulnerability Detection in Systems Software
    Diagnosing whether fine-tuned LLMs comprehend software vulnerabilities
    Fine-tuning Neural Network Reinforcement Learning
    It is unclear whether LLMs that score well on vulnerability benchmarks truly reason about security or merely pattern-match. This work diagnoses the limits of fine-tuning LLMs for vulnerability detection in systems software.
    Read original (arXiv cs.AI (Artificial Intelligence)) ↗
  • arXiv cs.AI (Artificial Intelligence) · EN Safety & Evaluation extract
    Contagion Networks: Evaluator Bias Propagation in Multi-Agent LLM Systems
    Contagion Networks: evaluator bias propagation in multi-agent LLMs
    AI Agents DeepSeek Reinforcement Learning
    When large language models act as evaluators in multi-agent systems, their systematic evaluation biases can spread through the system. This work analyzes how such evaluator bias propagates across agents.
    Read original (arXiv cs.AI (Artificial Intelligence)) ↗
  • arXiv cs.CL (Computation and Language) · EN New Model Releases extract
    Beyond Global Replanning: Hierarchical Recovery for Cross-Device Agent Systems
    Hierarchical recovery for cross-device agent systems
    AI Agents Neural Network Reinforcement Learning
    The paper proposes a hierarchical recovery mechanism for cross-device agent systems, moving beyond coarse-grained global replanning. It targets real-world computer-use tasks that span multiple applications and devices and must coordinate heterogeneous environments under dynamic runtime failures.
    Read original (arXiv cs.CL (Computation and Language)) ↗
  • arXiv cs.AI (Artificial Intelligence) · EN New Model Releases extract
    Optimal Order of Multi-Agent and General Many-Body Systems
    Optimal order of multi-agent and general many-body systems
    AI Agents Retrieval-Augmented Generation (RAG)
    This paper develops a general framework for analyzing multi-agent systems with feedback loops between agents, as well as general many-body systems, and characterizes their optimal order.
    Read original (arXiv cs.AI (Artificial Intelligence)) ↗
  • OpenAI Blog · EN New Model Releases extract
    New usage analytics and updated spend controls for enterprises
    OpenAI adds usage analytics and spend controls to ChatGPT Enterprise
    GPT OpenAI
    OpenAI introduced new usage analytics and updated spend controls for ChatGPT Enterprise, helping organizations track and manage AI costs while scaling with confidence. Admins gain visibility into per-team consumption and can set limits to optimize spend.
    Read original (OpenAI Blog) ↗
  • arXiv cs.CL (Computation and Language) · EN Multimodal extract
    Scalable Training of Spatially Grounded 2D Vision-Language Models for Radiology
    RefRad2D: training spatially grounded radiology VLMs at scale
    Computer Vision Fine-tuning Neural Network Software Engineering
    The paper studies how to train spatially grounded vision-language models for radiology without manual spatial annotations. It introduces RefRad2D, a large-scale bilingual (German/English) dataset of 1.2M CT and MR image-text pairs derived from clinical practice, with VQA and spatial grounding subsets.
    Read original (arXiv cs.CL (Computation and Language)) ↗
  • arXiv cs.LG (Machine Learning) · EN Funding & M&A extract
    Sparsity, Superposition, and Forgetting: A Mechanistic Study of Representation Retention in Continual Learning
    A mechanistic study of forgetting in continual learning
    Reinforcement Learning
    The paper presents a mechanistic study of representation retention in continual learning, using a controlled toy-world framework to make the drivers of forgetting observable and testable. It examines how sparsity and superposition relate to forgetting, isolating mechanisms that real datasets usually entangle.
    Read original (arXiv cs.LG (Machine Learning)) ↗
  • arXiv cs.LG (Machine Learning) · EN New Model Releases extract
    Neural network surrogates with uncertainty quantification for inverse problems in partial differential equations
    NN surrogates with uncertainty quantification for PDE inverse problems
    Inference Neural Network Reinforcement Learning
    The paper develops neural network surrogates with uncertainty quantification for inverse problems in partial differential equations. It targets the inference of unknown model parameters from noisy or incomplete observations, where traditional numerical methods are costly, particularly in Bayesian settings.
    Read original (arXiv cs.LG (Machine Learning)) ↗
  • arXiv cs.LG (Machine Learning) · EN New Model Releases extract
    Pseudo-Feature Padding: A Lightweight Defense Against False Data Injection in Power Grids
    Pseudo-Feature Padding: a defense against grid false-data injection
    Neural Network Reinforcement Learning
    The paper proposes Pseudo-Feature Padding, a lightweight defense against false data injection attacks in power grids. It targets the vulnerability of deep neural network detectors in cyber-physical systems, where attackers can craft inputs to evade detection during critical operations.
    Read original (arXiv cs.LG (Machine Learning)) ↗
  • arXiv cs.LG (Machine Learning) · EN New Model Releases extract
    Direct Advantage Estimation for Scalable and Sample-efficient Deep Reinforcement Learning
    Scalable, sample-efficient direct advantage estimation for deep RL
    Algorithms & Theory Reinforcement Learning
    The paper improves Direct Advantage Estimation (DAE) for scalable and sample-efficient deep reinforcement learning. It addresses DAE's reliance on full environment observability and the computational overhead of modeling transition probabilities, which limit its use in realistic settings.
    Read original (arXiv cs.LG (Machine Learning)) ↗
  • arXiv cs.AI (Artificial Intelligence) · EN New Model Releases extract
    DataMagic: Transforming Tabular Data into Data Insight Video
    DataMagic: turning tabular data into data-insight videos
    Neural Network Retrieval-Augmented Generation (RAG) Reinforcement Learning
    Data videos combine dynamic charts, voice narration, and synchronized animation to convey insights. DataMagic automatically transforms tabular data into such data-insight videos.
    Read original (arXiv cs.AI (Artificial Intelligence)) ↗