Infrastructure & Hardware｜AI/Tech News Trends

Simon Willison's Weblog · 2026-08-02 EN New Model Releases extract

condense-json 1.0

Simon Willison ships condense-json 1.0 for compact JSON

Simon Willison released condense-json 1.0, a small Python library that shrinks JSON by swapping repeated strings for short references into a replacements map (for example, a short key that stands for a recurring phrase). Now a year and a half old, the library graduates to a stable 1.0 with sensible, non-disruptive fixes.
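To make the replacements-map idea concrete, here is a minimal sketch of the general technique. It is not the condense-json API or its output format: the function names `condense` and `expand` and the `"$ref"` marker are hypothetical, chosen only for illustration, and the sketch handles exact string matches only.

```python
import json


def condense(value, replacements):
    """Recursively swap strings found in `replacements` for short references.

    `replacements` maps a short key to a recurring phrase. The {"$ref": key}
    marker is a hypothetical format used only in this sketch.
    """
    if isinstance(value, dict):
        return {k: condense(v, replacements) for k, v in value.items()}
    if isinstance(value, list):
        return [condense(v, replacements) for v in value]
    if isinstance(value, str):
        for key, phrase in replacements.items():
            if value == phrase:
                return {"$ref": key}
    return value


def expand(value, replacements):
    """Reverse `condense`: turn each reference back into its full phrase."""
    if isinstance(value, dict):
        if set(value) == {"$ref"}:
            return replacements[value["$ref"]]
        return {k: expand(v, replacements) for k, v in value.items()}
    if isinstance(value, list):
        return [expand(v, replacements) for v in value]
    return value


phrase = "The quick brown fox jumps over the lazy dog"
doc = {"items": [{"note": phrase}, {"note": phrase}, {"note": "unique"}]}
replacements = {"1": phrase}

condensed = condense(doc, replacements)
restored = expand(condensed, replacements)

print(json.dumps(condensed))
print(restored == doc)
```

The saving only pays off when a phrase recurs often enough to outweigh the cost of shipping the map itself. This sketch would also misread an input object whose only key is `"$ref"`, which a real format would need to escape.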

Read original (Simon Willison's Weblog) ↗

Data Center Dynamics · 2026-08-02 EN Infrastructure & Hardware extract

Meta boosts AI data center capex, forecasts $130-145bn spend

Meta raises AI data center capex outlook to $130-145bn

Meta

Meta raised its 2026 capex forecast for AI data centers to $130-145 billion, underscoring the rising cost of scaling generative-AI infrastructure. Shares fell as free cash flow declined, highlighting how heavy AI spending weighs on near-term returns.

Read original (Data Center Dynamics) ↗

Simon Willison's Weblog · 2026-08-01 EN New Model Releases extract

Ten advances in mathematics and theoretical computer science

Simon Willison weighs in on OpenAI and Anthropic's math results

Anthropic Claude GPT OpenAI

Simon Willison discussed OpenAI's 'ten advances in mathematics and theoretical computer science,' noting that just days earlier Anthropic had reported similar discoveries. The post reflects on a growing trend of frontier AI models contributing to open problems in mathematics.

Read original (Simon Willison's Weblog) ↗

Data Center Dynamics · 2026-08-01 EN Infrastructure & Hardware extract

Sponsored: Tackling complexity in AI data centers: leveraging fully integrated solutions

Tackling AI data-center complexity with integrated solutions (sponsored)

Retrieval-Augmented Generation (RAG)

A sponsored piece argues that rising densities and a growing variety of equipment and services are driving complexity in AI data centers. It contends the market needs fully integrated solutions to manage this complexity rather than piecemeal components.

Read original (Data Center Dynamics) ↗

Simon Willison's Weblog · 2026-07-31 EN Infrastructure & Hardware extract

deepseek-ai/DeepSeek-V4-Flash-0731

DeepSeek releases new V4-family model, DeepSeek-V4-Flash-0731

DeepSeek Gemini

Simon Willison highlighted DeepSeek-V4-Flash-0731, the latest release in DeepSeek's V4 family, described as having substantially enhanced capabilities. The model, positioned as fast and lightweight, underscores the continued momentum of open-weight AI model development.

Read original (Simon Willison's Weblog) ↗

NVIDIA Developer Blog · 2026-07-31 EN Infrastructure & Hardware extract

Co-Designing AI Model Attention for Fast, Interactive Long-Context Inference

NVIDIA details co-designed attention for fast long-context inference

Generative AI Inference NVIDIA

NVIDIA describes co-designing model attention with hardware to speed up interactive long-context inference. As agentic and long-context workloads grow, attention takes a larger share of inference time, and the approach targets that bottleneck for faster serving.
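A rough back-of-the-envelope sketch shows why attention's share grows with context length. It is not NVIDIA's method or measurement: the model shape (7B parameters, 32 layers, width 4096) is an assumed example, and the estimate counts only arithmetic per decoded token (about 2 FLOPs per weight for the dense layers, about 4 x layers x width x context for attention), ignoring memory bandwidth, KV-cache layout, and variants such as grouped-query attention.

```python
def attention_share(context_len, n_params=7e9, n_layers=32, d_model=4096):
    """Rough fraction of per-token decode FLOPs spent in attention.

    All default values describe a hypothetical 7B-parameter dense model.
    """
    # Dense weight matmuls: roughly 2 FLOPs per parameter per token.
    dense_flops = 2 * n_params
    # Attention per layer: QK^T scores plus the weighted sum over V,
    # each about 2 * d_model * context_len FLOPs.
    attn_flops = 4 * n_layers * d_model * context_len
    return attn_flops / (dense_flops + attn_flops)


for n in (1_000, 10_000, 100_000, 1_000_000):
    print(f"{n:>9,} tokens: {attention_share(n):.1%}")
```

Under these assumptions the dense cost per token stays fixed while the attention cost scales linearly with context, so attention goes from a small fraction at a few thousand tokens to the dominant term at hundreds of thousands.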

Read original (NVIDIA Developer Blog) ↗

Simon Willison's Weblog · 2026-07-31 EN New Model Releases extract

smevals - a small eval suite for evaluating models, prompts, and harnesses

Simon Willison introduces 'smevals,' a small eval suite for models

Claude GPT Machine Learning Neural Network Software Engineering

Simon Willison introduced smevals, a small evaluation suite for testing models, prompts, and harnesses. Built in collaboration with Jesse Vincent's Prime Radiant applied AI research lab, the framework aims to help answer questions about the capabilities of different AI models.

Read original (Simon Willison's Weblog) ↗

arXiv cs.CL (Computation and Language) · 2026-07-31 EN Infrastructure & Hardware

TokTier: Exact Stateful Tokenization for Agentic LLM Serving

AI Agents GPT

Read original (arXiv cs.CL (Computation and Language)) ↗

arXiv cs.AI (Artificial Intelligence) · 2026-07-31 EN Infrastructure & Hardware

ExtractBench: A Benchmark for Schema-Guided Enterprise Document Extraction

AI Agents Llama Meta

Read original (arXiv cs.AI (Artificial Intelligence)) ↗

arXiv cs.LG (Machine Learning) · 2026-07-31 EN Infrastructure & Hardware

Sign compression for Muon: SignMuon, MuonSign, and the Limits of Error Feedback

GPT

Read original (arXiv cs.LG (Machine Learning)) ↗

arXiv cs.AI (Artificial Intelligence) · 2026-07-31 EN Infrastructure & Hardware

Development of FDD-ON: an Ontology for VAV HVAC System Fault Detection and Diagnostics

Retrieval-Augmented Generation (RAG)

Read original (arXiv cs.AI (Artificial Intelligence)) ↗

arXiv cs.AI (Artificial Intelligence) · 2026-07-31 EN Infrastructure & Hardware

CENDRe: Concept Extraction with Natural Domain Representations

Neural Network Reinforcement Learning

Read original (arXiv cs.AI (Artificial Intelligence)) ↗

arXiv cs.LG (Machine Learning) · 2026-07-31 EN Infrastructure & Hardware

TOOD: Task-Aware Out-of-Distribution Score Calibration for Continual Learners

Reinforcement Learning

Read original (arXiv cs.LG (Machine Learning)) ↗

arXiv cs.AI (Artificial Intelligence) · 2026-07-31 EN Infrastructure & Hardware

TraceViT: Grounded Trace Supervision for Visual Abstract Reasoning

Neural Network

Read original (arXiv cs.AI (Artificial Intelligence)) ↗

arXiv cs.AI (Artificial Intelligence) · 2026-07-31 EN Infrastructure & Hardware

COntExt: Towards Context-Aware Ontology Extension from Operational Metrics

Algorithms & Theory Neural Network

Read original (arXiv cs.AI (Artificial Intelligence)) ↗

Data Center Dynamics · 2026-07-31 EN Infrastructure & Hardware extract

AWS reports fastest growth since 2021, Amazon annual capex to hit $220bn on AI memory costs

AWS posts fastest growth since 2021; Amazon capex to hit $220bn

Neural Network

AWS reported its fastest growth since 2021 as cloud sales boomed, according to DatacenterDynamics. Amazon is ramping up data center construction, with annual capital expenditure expected to reach $220bn, driven in part by rising AI memory costs.

Read original (Data Center Dynamics) ↗

arXiv cs.AI (Artificial Intelligence) · 2026-07-31 EN New Model Releases

AMTFV: Agentic Mathematical Tool-Flow Verification for LLM Self-Correction

DeepSeek Gemini GPT Software Engineering

Read original (arXiv cs.AI (Artificial Intelligence)) ↗

arXiv cs.AI (Artificial Intelligence) · 2026-07-31 EN New Model Releases

From Code Review to Code Critique: Intent, Drift, and Spotlight for AI-Generated Diffs at Scale

AI Agents Meta Neural Network

Read original (arXiv cs.AI (Artificial Intelligence)) ↗

NVIDIA Developer Blog · 2026-07-31 EN Developer Tools extract

NVIDIA Video Codec SDK 13.1: Zero-Copy Transcode, AV1 B-Frames, and Frame-Accurate Seek

NVIDIA ships Video Codec SDK 13.1 with zero-copy transcode, AV1 B-frames

Computer Vision NVIDIA

NVIDIA released Video Codec SDK 13.1, adding zero-copy transcoding, AV1 B-frame support, and frame-accurate seeking. The update responds to growing demand for high-quality video across industries, from immersive streaming to media pipelines.

Read original (NVIDIA Developer Blog) ↗

arXiv cs.LG (Machine Learning) · 2026-07-31 EN Training & Fine-tuning

MoPET: Parameter-Efficient Mixture-of-Experts for Unified Medical Image Classification

Deep Learning Fine-tuning Mixture of Experts (MoE) Retrieval-Augmented Generation (RAG)

Read original (arXiv cs.LG (Machine Learning)) ↗

arXiv cs.AI (Artificial Intelligence) · 2026-07-31 EN Multimodal

QR-Structured Thermal Triggers for Targeted Semantic Attacks on Infrared Vision-Language Models

Computer Vision Deep Learning Software Engineering

Read original (arXiv cs.AI (Artificial Intelligence)) ↗

Data Center Dynamics · 2026-07-31 EN Infrastructure & Hardware extract

Why ‘next wave’ data center markets are at the heart of Europe's fight for data sovereignty

Why 'next wave' data-center markets anchor Europe's data-sovereignty fight

DatacenterDynamics argues that emerging 'next wave' data center markets are central to Europe's push for data sovereignty. Amid policy and investment moves to keep data within the region, the piece analyzes how these growing markets shape the sovereignty debate.

Read original (Data Center Dynamics) ↗

Data Center Dynamics · 2026-07-31 EN Infrastructure & Hardware extract

Musk confirms fourth SpaceXAI data center in Memphis, company starts removing 'illegal' gas turbines

Musk confirms fourth SpaceXAI data center in Memphis; company begins removing 'illegal' turbines

Elon Musk confirmed a fourth SpaceXAI data center in Memphis, and the company has begun removing gas turbines described as 'illegal,' per DatacenterDynamics. The moves underscore the environmental and permitting tensions accompanying the rapid buildout of AI compute.

Read original (Data Center Dynamics) ↗

Data Center Dynamics · 2026-07-31 EN Infrastructure & Hardware extract

Amazon, Duke Energy accused of evading Clean Air Act at under-development data center in Hamlet, North Carolina

Amazon, Duke Energy accused of evading Clean Air Act at NC data center

Machine Learning

Amazon and Duke Energy were accused of evading the Clean Air Act at a data center under development in Hamlet, North Carolina, DatacenterDynamics reported. The complaint centers on separate applications to deploy a combined 649 diesel generators at the site.

Read original (Data Center Dynamics) ↗

Data Center Dynamics · 2026-07-31 EN Infrastructure & Hardware extract

CenterPoint Energy raises investment plan by $1.2bn as data center load pipeline grows

CenterPoint Energy adds $1.2bn to plan as data-center load grows

CenterPoint Energy raised its investment plan by $1.2bn as its data center load pipeline expands. The utility expects to energize up to 8GW of data center demand in the Greater Houston area by 2029, reflecting surging power needs from AI computing.

Read original (Data Center Dynamics) ↗

Data Center Dynamics · 2026-07-31 EN Infrastructure & Hardware extract

GCM expands range of high-performance heat sinks to cater for liquid-cooled data centers

GCM expands high-performance heat sinks for liquid-cooled data centers

GCM expanded its range of high-performance heat sinks aimed at liquid-cooled data centers. The lineup targets the growing cooling demands of increasingly power-dense AI servers, supporting more efficient thermal management in modern facilities.

Read original (Data Center Dynamics) ↗

arXiv cs.AI (Artificial Intelligence) · 2026-07-31 EN Infrastructure & Hardware

Beyond Component Testing: Validating Agentic AI Systems

Neural Network Retrieval-Augmented Generation (RAG)

Read original (arXiv cs.AI (Artificial Intelligence)) ↗

arXiv cs.LG (Machine Learning) · 2026-07-31 EN Inference & Efficiency

OnlineCache: Learning Dynamic Caching Policies with Error Correction for Efficient Diffusion Inference

Inference Retrieval-Augmented Generation (RAG) Reinforcement Learning

Read original (arXiv cs.LG (Machine Learning)) ↗

arXiv cs.CL (Computation and Language) · 2026-07-31 EN Inference & Efficiency

Studying quantization trade-offs for efficient inference deployment in machine translation

Deep Learning Inference Quantization