Infrastructure & Hardware B

  • NVIDIA Developer Blog · EN Infrastructure & Hardware extract
    NVIDIA Blackwell Tops MLPerf Training 6.0 with Industry-Leading Scale and Performance
    NVIDIA says Blackwell tops MLPerf Training 6.0 benchmark
    Generative AI Machine Learning NVIDIA Software Engineering
    NVIDIA announced that its Blackwell GPU architecture topped the MLPerf Training 6.0 benchmark with what it calls industry-leading scale and performance. Summarized neutrally from the title because the excerpt was blocked (cookie/query data); any figures are vendor claims and not independently verified.
  • arXiv cs.LG (Machine Learning) · EN Training & Fine-tuning extract
    Catastrophic Forgetting is Low-Rank: A Function-Space Theory for Continual Adaptation
    Catastrophic forgetting is low-rank: a function-space theory
    Fine-tuning Reinforcement Learning
    Catastrophic forgetting in continual adaptation is usually analyzed through parameter drift or replay, neither of which reveals which output directions are vulnerable. The paper gives a function-space account in the neural tangent kernel (NTK) regime, showing that training on a new task induces low-rank drift in old-task predictions through the cross-task kernel.
  • arXiv cs.LG (Machine Learning) · EN New Model Releases extract
    Recursive Scaling in Masked Diffusion Models
    Recursive scaling in masked diffusion models
    Deep Learning Inference Transformer
    Masked diffusion models (MDMs) have recently emerged as a generative approach. The paper investigates recursive scaling in MDMs, offering insights into their behavior and efficiency.
  • arXiv cs.CL (Computation and Language) · EN Safety & Evaluation extract
    LegalHalluLens: Typed Hallucination Auditing and Calibrated Multi-Agent Debate for Trustworthy Legal AI
    LegalHalluLens audits typed legal-AI hallucinations with calibrated debate
    Retrieval-Augmented Generation (RAG)
    Legal-AI systems hallucinate at aggregate rates near 52%, but averages hide where and how errors concentrate. LegalHalluLens is an auditing framework pairing typed hallucination auditing with calibrated multi-agent debate to give compliance officers actionable signals for trustworthy legal AI.
  • arXiv cs.LG (Machine Learning) · EN Infrastructure & Hardware extract
    Multiple cyclicity and Wavelet Decomposition with Channel Correlation for Long-term Time Series Forecasting
    Multiple cyclicity and wavelet decomposition for long-term forecasting
    Neural Network Reinforcement Learning
    Cyclicity and trend are key components of time series, but prior work often neglects real-world inter-channel correlations. The paper combines multiple cyclicity with wavelet decomposition and channel correlation to improve long-term time series forecasting.
  • arXiv cs.LG (Machine Learning) · EN Inference & Efficiency extract
    SoftMoE: Soft Differentiable Routing for Mixture-of-Experts in LLMs
    SoftMoE: soft differentiable routing for mixture-of-experts in LLMs
    Inference Mixture of Experts (MoE)
    Sparse mixture-of-experts architectures scale LLM parameters but their discrete routing complicates training. SoftMoE introduces soft, differentiable routing for mixture-of-experts in LLMs to enable more stable and efficient expert selection.
  • arXiv cs.LG (Machine Learning) · EN Infrastructure & Hardware extract
    Predictive Analytics in E-Commerce for Customer Behavior Forecasting using hybrid Ret-DNN with XGBoost Model
    Hybrid Ret-DNN with XGBoost for e-commerce behavior forecasting
    Deep Learning Neural Network
    E-commerce platforms struggle to understand customer behavior and predict future purchases. The study proposes predictive analytics using a hybrid Ret-DNN combined with an XGBoost model to forecast customer behavior.
  • arXiv cs.LG (Machine Learning) · EN Safety & Evaluation extract
    Monotonic Kolmogorov-Arnold Networks: A Theoretical and Empirical Study of Monotonicity as an Inductive Bias
    Monotonic KANs: monotonicity as an inductive bias, studied theoretically
    Deep Learning Machine Learning Neural Network Software Engineering
    Monotonicity is a useful architectural inductive bias in tabular, scientific and economic settings. The paper proposes monotonic Kolmogorov-Arnold Networks with per-edge functional transparency and studies monotonicity as an inductive bias both theoretically and empirically.
  • arXiv cs.LG (Machine Learning) · EN Infrastructure & Hardware extract
    Meta-classification of one-class classification models using ranking correlation and nearest neighbor
    Meta-classification of one-class models via ranking correlation and kNN
    Algorithms & Theory Machine Learning Meta
    ML has been applied widely, but applying ML to ML models is underexplored. Treating models as approximable by one-class classification (OCC), the paper proposes meta-classification of OCC models using ranking correlation and nearest-neighbor methods.
  • arXiv cs.CL (Computation and Language) · EN Training & Fine-tuning extract
    Perceptual compensation for tonal context in self-supervised speech models
    Embeddings Retrieval-Augmented Generation (RAG) Speech Processing
    The study examines the extent to which self-supervised speech models exhibit perceptual compensation for tonal context, analyzing how context effects seen in human speech perception are reflected in the models' learned representations.
  • arXiv cs.CL (Computation and Language) · EN Safety & Evaluation extract
    EComAgentBench: Benchmarking Shopping Agents on Long-Horizon Tasks with Distributed Hidden Intent
    EComAgentBench: shopping agents on long-horizon tasks with hidden intent
    AI Agents Software Engineering
    As LLM-based shopping agents reach production, existing benchmarks miss how requirements arrive: implicitly, in a profile, or only when the right question is asked. EComAgentBench evaluates shopping agents on long-horizon tasks with distributed hidden intent.
  • Simon Willison's Weblog · EN Safety & Evaluation extract
    The Fable 5 Export Controls Harm US Cyber Defense
    Willison: Fable 5 export controls harm US cyber defense
    Anthropic Claude Computer Vision Neural Network Reinforcement Learning
    Willison cites Katie Moussouris, who says the 'jailbreak' behind the export-control ban on Claude Fable 5 was merely asking the model to 'fix this code' on code containing known CVEs and planted bugs. Because fixing security bugs is core to what coding models do, he argues the controls weaken US cyber defense.
  • ITmedia AI+ · JA Infrastructure & Hardware extract
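    Rapidly Growing Power Demand from AI Infrastructure: Is 'Watt-Bit Coupling' the Bright Spot? Sakura Internet President Tanaka in Dialogue with TEPCO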
    Sakura's Tanaka and TEPCO discuss 'watt-bit' coupling for AI power demand
    As AI infrastructure drives surging electricity demand, how should data centers and the power grid adapt? In a keynote at Interop Tokyo 2026 in Makuhari, TEPCO Holdings senior fellow Hiroshi Okamoto and Sakura Internet president Kunihiro Tanaka held a dialogue, exploring the potential of 'watt-bit' coupling that links computing resources with power supply.
  • NVIDIA Developer Blog · EN Training & Fine-tuning extract
    Fine-Tuning Biological Foundation Models with LoRA Using NVIDIA BioNeMo Recipes
    NVIDIA details LoRA fine-tuning of biological foundation models via BioNeMo
    Fine-tuning NVIDIA
    An NVIDIA developer blog post explains how to efficiently fine-tune biological foundation models—pretrained on large protein or genomic sequence corpora, such as the ESM2 protein language model—using LoRA, illustrated with the company's BioNeMo Recipes. A technical piece on applying foundation models in computational biology.
  • arXiv cs.LG (Machine Learning) · EN Infrastructure & Hardware extract
    Exact Posterior Score Estimation for Solving Linear Inverse Problems
    Exact closed-form posterior score for linear inverse problems
    Inference Reinforcement Learning
    The paper derives the exact posterior score in closed form for linear Gaussian inverse problems under general Gaussian interpolants, showing that posterior sampling reduces to a denoising problem at an operator-dependent shifted pivot with anisotropic noise. It turns this into a training objective, Exact Posterior Score (EPS), that preserves standard denoising structure and can be trained from scratch or fine-tuned.
  • arXiv cs.LG (Machine Learning) · EN Infrastructure & Hardware extract
    Hierarchical Advantage Weighting for Online RL Fine-Tuning of VLAs from Sparse Episode Outcomes
    HABC: hierarchical advantage weighting for RL fine-tuning of VLAs
    Fine-tuning Reinforcement Learning
    Online RL fine-tuning of pretrained VLA policies yields only one binary outcome per episode, yet actor updates need per-transition signals. The authors argue a single scalar conflates viability and efficiency and that mixing autonomous and intervention segments misassigns credit. Their method, Hierarchical Advantage-Weighted Behavior Cloning (HABC), trains separate critic heads for the two objectives on different data subsets.
  • arXiv cs.CL (Computation and Language) · EN Safety & Evaluation extract
    Selection Without Signal, Recovery Through Expression: A Measurement Study of Post-Hoc Falsification Operators for Frozen Small Code Models
    Measurement study of post-hoc falsification operators for code models
    Fine-tuning Neural Network Retrieval-Augmented Generation (RAG)
    Per its title, this paper presents a measurement study of post-hoc 'falsification operators' applied to frozen (non-retrained) small code models, framed around selection without signal and recovery through expression. The raw excerpt was blocked by a content filter, so this summary is based on the title alone and stays deliberately neutral.
  • arXiv cs.AI (Artificial Intelligence) · EN Infrastructure & Hardware extract
    Stable Menus of Public Goods: AI-Enabled Progress
    Study tests AI research workflows on an open economics problem
    Retrieval-Augmented Generation (RAG)
    Using an open problem from an EC 2025 paper as a testbed, the paper studies AI-for-economics research workflows. It reports that prompting with human intuition and multi-turn interaction can help, while finding an LLM slightly less effective than a first-year PhD student on the task.
  • arXiv cs.LG (Machine Learning) · EN Inference & Efficiency extract
    Decoupling Inference from State Updates in Low-Latency Feature Engines via Probabilistic Thinning
    Probabilistic thinning decouples inference from state updates in streams
    Inference Machine Learning Neural Network Retrieval-Augmented Generation (RAG)
    Streaming data systems increasingly underpin ML workflows maintaining many continuously updated aggregations. In production, each event triggers read-modify-write operations to storage, making high-frequency state updates a dominant source of latency, contention, and cost. This work decouples inference from persistence via probabilistic thinning: every event is scored, but durable updates fire only for informative events, using approximate disk-backed statistics with no in-memory control plane.
  • arXiv cs.LG (Machine Learning) · EN Infrastructure & Hardware extract
    Phantoms and Disclosures: a Causal Framework for Auditing Synthetic Data
    A causal auditing framework to detect synthetic-data privacy disclosures
    Generative AI Inference Reinforcement Learning
    Generative AI and LLMs have made synthetic data a popular privacy-preserving substitute for sensitive datasets, yet it can memorize and reproduce private training data. The authors propose a customizable empirical framework distinguishing "true disclosures" (direct reproduction of user data) from "phantom disclosures" (incidental generation). Using training/holdout partitioning and statistical hypothesis testing, it checks whether disclosures match strict privacy baselines like zero-learning.
  • NVIDIA Developer Blog · EN Developer Tools extract
    Boosting MoE Training Throughput with Advanced Fusion Kernels
    NVIDIA details advanced fusion kernels to boost MoE training throughput
    Deep Learning Generative AI Machine Learning Mixture of Experts (MoE) NVIDIA
    On its developer blog, NVIDIA explains advanced fusion-kernel techniques aimed at boosting training throughput for Mixture-of-Experts (MoE) models. Noting that MoE has rapidly become a foundational component of modern large-scale AI systems, the post outlines kernel-level optimizations for more efficient training.
  • arXiv cs.LG (Machine Learning) · EN Safety & Evaluation extract
    Upper Bounds on the Generalization Error of Deep Learning Models via Local Robustness and Stability
    Tighter deep-learning generalization bounds via local robustness
    Deep Learning Neural Network Reinforcement Learning
    Robustness-based generalization bounds are often vacuous in practice. The authors trace much of the looseness to the robustness term itself, which is usually treated as a global measure and is especially loose for the 0-1 loss. They propose a bound that scales the robustness term by the number of stable and unstable samples across input sub-regions, yielding tighter estimates.
  • arXiv cs.CL (Computation and Language) · EN Infrastructure & Hardware extract
    Revisiting the Systematicity in Negation in the Era of In-Context Learning
    Revisiting LLM systematicity in negation via in-context learning
    An arXiv paper analyzes how large language models understand negation from two angles: behavioral and representational systematicity. It reports that, via demonstrations and in-context learning, LLMs can handle negation to some degree, and examines the limits of that systematicity. Neutral, abstract-based summary.
  • arXiv cs.LG (Machine Learning) · EN Training & Fine-tuning extract
    Deep Q-Learning on Hölder Spaces
    Bellman-target regularity analysis motivates a tensor-product DeepONet
    Reinforcement Learning
    This work studies the operator-theoretic core of Q-learning in continuous-time stochastic control with continuous states and actions. Under uniform ellipticity and Hölder-regular coefficients, a Bellman update smooths the state while leaving Lipschitz dependence on the action, motivating a tensor-product DeepONet and yielding approximation and resource bounds with a stiffness-complexity trade-off.
  • arXiv cs.CL (Computation and Language) · EN Safety & Evaluation extract
    How Much Can We Trust LLM Search Agents? Measuring Endorsement Vulnerability to Web Content Manipulation
    Paper: framework measures LLM search-agent endorsement risk
    AI Agents Claude Gemini GPT Speech Processing
    An arXiv paper introduces SearchGEO, a controlled framework for measuring endorsement corruption in LLM-based web-search agents, combining a web-evidence manipulation pipeline and a five-mode attack taxonomy across multiple backends. Summarized neutrally from the abstract.
  • arXiv cs.CL (Computation and Language) · EN Multimodal extract
    Scaling LLM Reasoning from Minimal Labels: A Semi-Supervised Framework with a Lightweight Verifier
    Paper: semi-supervised LLM reasoning from minimal labels
    Neural Network Software Engineering
    An arXiv paper presents a semi-supervised framework that scales LLM reasoning from minimal supervision, using a lightweight reasoning-correctness classifier to turn verification into a data-creation mechanism. Summarized neutrally from the abstract.
  • arXiv cs.CL (Computation and Language) · EN New Model Releases extract
    Multi-Turn Reflective Masking Elicits Reasoning in Mask Diffusion Models
    Reflective Masking elicits iterative reasoning in mask diffusion models
    Retrieval-Augmented Generation (RAG) Software Engineering
    The paper introduces Reflective Masking, a lightweight post-training method that lets mask diffusion models iteratively revisit and revise prior outputs via multi-turn masking, plus a History Reference component. Claims reflect the abstract and are not independently verified.
  • arXiv cs.CL (Computation and Language) · EN Safety & Evaluation extract
    FraudSMSWalker: Benchmarking Agentic Large Language Models for SMS-to-Webpage Fraud Detection
    FraudSMSWalker benchmark targets URL-masked SMS-to-webpage fraud
    AI Agents Meta Neural Network Reinforcement Learning
    The paper introduces FraudSMSWalker, a controlled benchmark for URL-masked SMS-to-webpage fraud judgment. It contains 699 bilingual chains (332 fraudulent, 367 benign) across ten scenarios, withholding raw URLs, hosts, and reputation metadata so models cannot rely on reputation shortcuts, and evaluates nine web agents.
  • NVIDIA Developer Blog · EN Multimodal extract
    Pretrained to Imagine, Fine-Tuned to Act: The Rise of World-Action Models
    NVIDIA explains the rise of World-Action Models for robotics
    Computer Vision Generative AI NVIDIA Reinforcement Learning Robotics
    NVIDIA's technical blog surveys World-Action Models (WAMs)—robot policies pretrained to "imagine" via world modeling, then fine-tuned to act. It relates them to Vision-Language-Action (VLA) models built on pretrained VLM backbones for robotics.
  • arXiv cs.CL (Computation and Language) · EN Safety & Evaluation extract
    VeriGraph: Towards Verifiable Data-Analytic Agents
    VeriGraph: a traceable neuro-symbolic framework for verifiable data agents
    AI Agents Neural Network Software Engineering
    This arXiv paper introduces VeriGraph, a traceable neuro-symbolic reasoning framework for verifiable data-analytic agents. The authors note that LLM agents' reliance on linear text trajectories makes reasoning hard to audit, entangling deterministic computations over raw data with semantic deductions over natural-language claims. VeriGraph instead has agents build an explicit heterogeneous evidence directed acyclic graph (DAG) during execution.