Infrastructure & Hardware B
-
Essential Subspace Merging for Multi-Task Learning
Essential subspace merging for multi-task model merging
Model merging integrates the capabilities of several models fine-tuned from the same pretrained checkpoint into one, enabling multi-task learning. This work proposes Essential Subspace Merging, which extracts and merges each task's essential subspace to reduce interference and preserve multi-task performance.
-
How Musicians Can Get Paid for Training AI
IEEE Spectrum explores how musicians can be paid for AI training use
IEEE Spectrum examines how musicians can be compensated when their music is used to train AI, covering attribution and payment for training-data use. This summary is based on the title only, because the article excerpt could not be retrieved; the specific mechanisms are as described in the article and have not been independently verified.
-
AdsMind: A Physics-Grounded Multi-Agent System for Self-Correcting Discovery of Adsorption Configurations on Heterogeneous Catalyst Surfaces
AdsMind: physics-grounded multi-agent search for adsorption configs
Identifying the lowest-energy surface-adsorbate configuration is critical for modeling heterogeneous catalysis, but exhaustive ab initio exploration is prohibitive. AdsMind is a physics-grounded multi-agent system that self-corrects to efficiently discover adsorption configurations.
-
Written by AI, Managed by AI: Semantic Space Control and Index Sickness Elimination Across 391 Consecutive Sessions
Semantic-space control sustains 391 consecutive AI-run sessions
The prevailing intuition for conceptual drift in long-horizon LLM collaboration is to trade more formal constraints for more reliable outputs. Across 391 consecutive sessions, this work studies semantic space control and 'index sickness' elimination in workflows written and managed by AI.
-
Quantifying and Auditing LLM Evaluation via Positive-Unlabeled Learning
Auditing LLM-as-judge bias via positive-unlabeled learning
LLMs are increasingly used as judges for scalable evaluation, yet LLM-as-a-Judge systems show systematic biases decoupled from semantic quality, notably verbosity bias. This work uses positive-unlabeled learning to quantify and audit LLM evaluation, helping detect and correct such biases.
-
Adaptive Speech-to-Spike Encoding for Spiking Neural Networks
Adaptive speech-to-spike encoding for spiking neural networks
The mismatch between continuous acoustic signals and discrete event-driven processing is a fundamental bottleneck for neuromorphic speech processing. Rather than fixed spike encoders, this work proposes adaptive speech-to-spike encoding for spiking neural networks, improving downstream performance.
-
FoMoE: Breaking the Full-Replica Barrier with a Federation of MoEs
FoMoE breaks the full-replica barrier with a federation of MoEs
Pretraining LLMs typically demands large-scale infrastructure with tightly coupled accelerators. As model and data scale grow, FoMoE proposes a federation of Mixture-of-Experts that avoids replicating the full model across devices, breaking the full-replica barrier and easing infrastructure constraints.
-
Spotlight: Synergizing Seed Exploration and Spot GPUs for DiT RL Post-Training
Spotlight cuts DiT RL post-training cost with spot GPUs
Reinforcement learning post-training of Diffusion Transformers is prohibitively expensive, needing thousands of high-end GPUs. Spotlight synergizes seed exploration with cheap, preemptible spot GPUs to substantially reduce the cost of DiT RL post-training.
-
Enhancing Multilingual Reasoning via Steerable Model Merging
Enhancing multilingual reasoning via steerable model merging
Model merging effectively composes the capabilities of a multilingual model and a reasoning model, achieving promising generalization on multilingual reasoning by aligning their feature spaces. This work introduces steerable model merging to control the composition and further boost multilingual reasoning.
-
TRAP: Benchmark for Task-completion and Resistance to Active Privacy-extraction
TRAP benchmarks agents on task completion and privacy resistance
Agents are increasingly deployed in document-intensive workflows where sensitive private information is routine input; for example, booking a flight requires passport numbers. TRAP is a benchmark evaluating agents on both task completion and resistance to active privacy-extraction attempts.
-
G-IdiomAlign: A Gloss-Pivoted Benchmark for Cross-Lingual Idiom Alignment
G-IdiomAlign: a gloss-pivoted cross-lingual idiom benchmark
Idioms resist literal cross-lingual mapping because they are non-compositional. G-IdiomAlign anchors each idiom to an English Wiktionary gloss and adds a high-confidence reference alignment set. Two protocols (multiple-choice idiom equivalence and gloss-contrastive generation) isolate the effect of explicit glosses.
-
Decoupling Search from Reasoning: A Vendor-Agnostic Grounding Architecture for LLM Agents
Decoupling search from reasoning: a vendor-agnostic grounding architecture
Production LLM agents increasingly depend on real-time search but get locked into vendor-specific grounding. This work decouples search from reasoning with a vendor-agnostic grounding architecture, letting search backends be swapped while preserving reasoning quality.
-
Graph-ESBMC-PLC: Formal Verification of Graphical PLCopen XML Ladder Diagram Programs Using SMT-Based Model Checking
Graph-ESBMC-PLC: SMT-based verification of PLCopen ladder diagrams
PLCopen XML defines encodings for IEC 61131-3 Ladder Diagrams. Graph-ESBMC-PLC applies SMT-based model checking to formally verify graphical PLCopen XML Ladder Diagram programs, supporting correctness checking of industrial control software.
-
Approximate Structured Diffusion for Sequence Labelling
Approximate structured diffusion for sequence labelling
Sequence labelling is a core NLP task. This work proposes an approximate structured diffusion approach that models label dependencies while keeping sequence labelling efficient.
-
Morpheus: A Morphology-Aware Neural Tokenizer and Word Embedder for Turkish
Morpheus: a morphology-aware neural tokenizer and embedder for Turkish
Turkish is agglutinative, with meaning carried by morphemes that subword tokenizers fail to capture. Morpheus is a morphology-aware neural tokenizer and word embedder designed to improve Turkish language processing.
-
LLM Serving Fairness: No more noisy neighbours
Cohere ensures fair compute sharing across LLM serving tenants
Cohere details how it ensures every tenant gets a fair share of compute in LLM serving, tackling the 'noisy neighbour' problem where one user monopolizes resources. The design allocates capacity fairly across tenants to deliver stable, predictable multi-tenant performance.
-
Building AI Agents for AR Glasses and XR Devices with NVIDIA XR AI
NVIDIA unveils XR AI to build AI agents for AR glasses and XR devices
NVIDIA introduced NVIDIA XR AI, a framework for developers to build AI agents for AR glasses and wearable XR devices. It targets the gap between ready hardware and the work of integrating live, real-time AI experiences. Capabilities are per NVIDIA's own announcement; third-party verification is pending.
-
Build Your Own Transaction Foundation Model for Financial Intelligence
NVIDIA details building a transaction foundation model for finance
NVIDIA's developer blog walks through how to build your own transaction foundation model aimed at financial-intelligence use cases such as fraud detection and risk analysis. The article excerpt was unavailable, so this summary is based on the title and source; specifics and claimed benefits come from NVIDIA's own post and have not been independently verified.
-
Tesla, Waymo, and NVIDIA Draw Attention in Generative AI × Autonomous Driving: How Does the 'Physical AI' Each Company Is Pursuing Differ?
How Tesla, Waymo and NVIDIA differ on 'physical AI' for driving
ITmedia surveys 'physical AI', a strategic focus area for Japan's government, through the lens of autonomous driving. The article reviews how advances in generative AI are reshaping the competition and compares the latest moves and differing approaches of Tesla, Waymo and NVIDIA.
-
Adaptive Volumetric Mechanical Property Fields Invariant to Resolution
AdaVoMP predicts resolution-invariant mechanical property fields for 3D
Reliable physics simulation needs Young's modulus, Poisson's ratio and density, which most 3D assets lack. AdaVoMP predicts dense, spatially varying values of these properties for input 3D objects in a way invariant to resolution across representations.
-
Finite-Time Queue Peak Laws in Stochastic Networks: Logarithmic Scaling After Geometric Thresholds
Finite-time queue-peak laws show log scaling after geometric thresholds
The paper studies finite-horizon queue peaks in generalized switches, where many queues share constrained service resources. Under a uniform interior-slack load condition, it derives laws showing logarithmic scaling of peaks after geometric thresholds.
-
Build On-Device AI Companions with the NVIDIA ACE Game Agent SDK and Unreal Engine 5 Plugins
NVIDIA unveils ACE Game Agent SDK and UE5 plugins for on-device AI
NVIDIA announced the ACE Game Agent SDK and Unreal Engine 5 plugins for developers to build on-device AI companions for in-game characters, that is, AI agents that run locally on the device rather than in the cloud. The article excerpt could not be retrieved, so this summary is based on the title and the NVIDIA developer blog framing; specific figures and performance claims are unverified.
-
Towards Understanding and Measuring COGNITIVE ATROPHY in LLM Behaviour
Formalizing 'cognitive atrophy' as a process-level measure of LLM behaviour
The paper formalizes 'cognitive atrophy,' a process-level behavioural measure of AI-mediated mental-health support, capturing whether interactions help users keep reflecting, coping, and deciding, a dimension distinct from safety and static response quality.
-
Unintended Effects of Geographic Conditioning in Large Language Models
Unintended regional biases from geographic conditioning in LLMs
Conversational AI localizes responses using user metadata, yet the regional biases this hidden context introduces remain poorly understood. The paper analyzes the unintended effects of geographic conditioning on large language model outputs.
-
Ternary Mamba: Grouped Quantization-Aware Training of W1.58A16 State Space Models
Ternary Mamba: grouped QAT for W1.58A16 state space models
Ternary Mamba applies grouped quantization-aware training to Mamba state space models with ternary (W1.58) weights and 16-bit activations, targeting efficient low-bit training and inference of sequence models while preserving accuracy.
-
HistoRAG: Embedding Historical Methodology in Retrieval-Augmented Generation Through Critical Technical Practice
HistoRAG embeds historical methodology into RAG via critical practice
RAG grounds model outputs in external evidence, but its dominant evaluations and defaults are oriented toward factual question answering. HistoRAG embeds historical methodology into retrieval-augmented generation through critical technical practice for interpretive historical studies.
-
How to Optimize Transformer-Based Models for Low-Precision Training
NVIDIA guide on optimizing transformer models for low-precision training
An NVIDIA technical post explains techniques for optimizing transformer-based models during low-precision training. The article excerpt could not be retrieved, so this summary is based only on the title and source; specific methods and figures are unverified.
-
Tensor-based second-order causal discovery
Tensor-based second-order causal discovery (TSCD)
To uncover causal dependencies among variables, the paper proposes TSCD, a tensor-based second-order causal discovery algorithm whose input is a tensor formed from covariance matrices of observational and interventional data, assuming linear structural equations.
-
Agentic AI-based Framework for Mitigating Premature Diagnostic Handoff and Silent Hallucination in Healthcare Applications
A multi-agent framework against premature handoff and silent hallucination
The paper proposes a multi-agent framework for healthcare that mitigates premature diagnostic handoff and silent clinical hallucinations, replacing LLM-as-a-judge routing with deterministic orchestration constraints and adding two safety mechanisms.
-
ConTex: Reformulating Counterfactual Generation For Time Series Forecasting
ConTex reformulates counterfactual generation for time-series forecasting
Decision-making with deep time-series forecasting needs not just accurate predictions but actionable insight, which current architectures lack. ConTex reformulates counterfactual generation to indicate how present conditions must change to shift a predicted outcome toward a desired future.