DeepSeek-R1 Enhances GPU Kernel Generation with Inference Time Scaling

Felix Pinkston Feb 13, 2025 18:01 NVIDIA's DeepSeek-R1 model uses inference-time scaling to improve ...

AI energy efficiency monitoring ranks low among enterprise users, survey by inference CPU specialists finds

January 19, 2025

Swimlane survey finds many companies aren't keeping on top of AI power needs. Nearly three quarters are aware of the dramatic ...

Strategies to Optimize Large Language Model (LLM) Inference Performance

by admin

August 22, 2024

Iris Coleman Aug 22, 2024 01:00 NVIDIA experts share strategies to optimize large language model (LLM) ...

NVIDIA Enhances TensorRT Model Optimizer v0.15 with Improved Inference Performance

by admin

August 16, 2024

Zach Anderson Aug 16, 2024 03:03 NVIDIA releases TensorRT Model Optimizer v0.15, offering enhanced inference performance ...

NVIDIA NVLink and NVSwitch Enhance Large Language Model Inference

by admin

August 13, 2024

Felix Pinkston Aug 13, 2024 07:49 NVIDIA's NVLink and NVSwitch technologies boost large language model ...

Character.AI Enhances AI Inference Efficiency, Reduces Costs by 33X

by admin

June 21, 2024

Character.AI, a full-stack AI company, has unveiled a series of groundbreaking advancements in AI inference ...

GPU Alternative d-Matrix Raises $110 Million for AI Inference

by admin

September 6, 2023

Microsoft's venture group is among d-Matrix's backers, investing in making in-memory compute for AI and LLM inference. Image: Shuo/Adobe Stock ...

Tag: Inference