Amazon AWS releases Inferentia 2 chip to accelerate large-scale model inference

2024-12-26 07:13
Amazon AWS has released the Inferentia 2 chip, which triples compute performance and quadruples total accelerator memory over its predecessor. Inferentia 2 supports distributed inference and can serve models with up to 175 billion parameters, making it a strong contender for large-scale model inference.
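The need for distributed inference at this scale follows from simple arithmetic: the weights of a 175-billion-parameter model do not fit on a single accelerator. A minimal sketch, assuming FP16 weights (2 bytes per parameter) and a hypothetical 32 GB per-device memory budget (illustrative figures, not Inferentia 2 specifications):

```python
# Back-of-envelope: why a 175B-parameter model needs distributed inference.
# Assumptions (illustrative, not Inferentia 2 specs):
#   - weights stored in FP16, i.e. 2 bytes per parameter
#   - 32 GB of usable memory per accelerator
params = 175e9
bytes_per_param = 2  # FP16
model_gb = params * bytes_per_param / 1e9  # weights alone, no activations/KV cache

per_device_gb = 32
min_devices = -(-model_gb // per_device_gb)  # ceiling division

print(f"Weights: {model_gb:.0f} GB -> at least {min_devices:.0f} accelerators")
```

Even ignoring activations and the KV cache, 350 GB of FP16 weights must be sharded across many devices, which is exactly the workload distributed inference support targets.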