Large-scale distributed linear algebra with tensor processing units.

Adam G M Lewis¹,², Jackson Beall¹,², Martin Ganahl¹,²

¹Simulation & Optimization Team, Sandbox AQ, Palo Alto, CA 94301.

Proceedings of the National Academy of Sciences of the United States of America

August 8, 2022

Summary

This summary is machine-generated.

Google Tensor Processing Units (TPUs), application-specific chips designed for machine learning, can be repurposed as supercomputers for dense linear algebra. With distributed algorithms, they scale well on matrix multiplication and on tasks such as QR decomposition, solving linear systems, and matrix polar factorization.

Keywords:

ASICs, TPUs, distributed computing, linear algebra, scientific computation

Area of Science:

High-performance computing
Applied mathematics
Computer architecture

Background:

Tensor Processing Units (TPUs) are specialized hardware accelerators developed for machine learning.
Dense linear algebra operations are fundamental to many scientific and engineering disciplines.
Repurposing existing hardware can offer cost-effective solutions for specialized computational needs.

Purpose of the Study:

To investigate the feasibility of using Google TPUs for large-scale dense linear algebra computations.
To evaluate the performance and scalability of TPUs in this new application domain.
To demonstrate the effectiveness of curated algorithms on TPU architecture for linear algebra tasks.

Main Methods:

Repurposing Google TPUs, originally designed for machine learning, into supercomputers for dense linear algebra.
Utilizing TPUs' fast intercore interconnects (ICIs), 2D network topology, and high-bandwidth memory (HBM).
Developing and applying distributed matrix multiplication algorithms optimized for TPU architecture.
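
The idea of distributing a matrix product over a 2D grid of cores can be sketched as follows. The summary does not name the algorithm used, so this example assumes SUMMA, a standard scheme for 2D processor grids, and simulates the grid on a single machine with NumPy. The function name `summa_matmul` and the `grid` parameter are hypothetical, chosen for illustration only.

```python
import numpy as np

def summa_matmul(A, B, grid=2):
    """Simulate SUMMA on a grid x grid array of cores, each owning one block."""
    n = A.shape[0]
    assert A.shape == B.shape == (n, n) and n % grid == 0
    b = n // grid  # side length of each core's local block

    def block(M, i, j):
        return M[i * b:(i + 1) * b, j * b:(j + 1) * b]

    # Each simulated core (i, j) keeps a local accumulator for its block of C.
    C_local = {(i, j): np.zeros((b, b), dtype=A.dtype)
               for i in range(grid) for j in range(grid)}

    for k in range(grid):
        for i in range(grid):
            for j in range(grid):
                # Step k: core (i, k) would broadcast its A block along row i,
                # and core (k, j) would broadcast its B block down column j.
                a_panel = block(A, i, k)
                b_panel = block(B, k, j)
                # Local product: the work a matrix-multiply unit would perform.
                C_local[i, j] += a_panel @ b_panel

    # Gather the blocks only to check the result; a real run keeps C distributed.
    return np.block([[C_local[i, j] for j in range(grid)] for i in range(grid)])

rng = np.random.default_rng(0)
A = rng.standard_normal((8, 8)).astype(np.float32)
B = rng.standard_normal((8, 8)).astype(np.float32)
C = summa_matmul(A, B, grid=4)
print("max abs difference from A @ B:", np.abs(C - A @ B).max())
```

In this scheme each core performs one local block product per step while exchanging only block-sized panels with its row and column neighbors, which is why a 2D interconnect is a natural fit.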

Main Results:

Distributed matrix multiplication on TPUs becomes compute-bound, with the matrix-multiply units (MXUs) dominating runtime, which enables good scaling and high performance.
A 2,048-core TPU pod can perform matrix multiplication of size [Formula: see text] in approximately 2 minutes using float32 precision.
Curated algorithms allow other dense linear algebra tasks, including QR decomposition, solving linear systems, and computing matrix functions (e.g., the matrix polar factorization), to scale effectively.
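
To illustrate why a task like the matrix polar factorization can ride on fast matrix multiplication, here is a minimal single-machine sketch. The summary does not say which iteration the authors use; this example assumes the Newton-Schulz iteration, a textbook method that needs only matrix products. The function name `polar_newton_schulz` and the fixed iteration count are hypothetical choices for illustration.

```python
import numpy as np

def polar_newton_schulz(A, iters=30):
    """Return (U, H) with A ~= U @ H, U orthogonal, H symmetric positive definite."""
    # Scale by the Frobenius norm so every singular value is at most 1;
    # the iteration converges for singular values in (0, sqrt(3)).
    X = A / np.linalg.norm(A)
    for _ in range(iters):
        # X <- 1.5 X - 0.5 X X^T X: two matmuls per step, no inverse needed.
        X = 1.5 * X - 0.5 * X @ (X.T @ X)
    H = X.T @ A
    H = 0.5 * (H + H.T)  # symmetrize away rounding error
    return X, H

# Build a well-conditioned test matrix with singular values between 1 and 3.
rng = np.random.default_rng(0)
Q1, _ = np.linalg.qr(rng.standard_normal((6, 6)))
Q2, _ = np.linalg.qr(rng.standard_normal((6, 6)))
A = Q1 @ np.diag(np.linspace(1.0, 3.0, 6)) @ Q2.T

U, H = polar_newton_schulz(A)
print("orthogonality error:", np.linalg.norm(U.T @ U - np.eye(6)))
print("reconstruction error:", np.linalg.norm(U @ H - A))
```

Because every step is a matrix product, an iteration of this kind can reuse whatever distributed multiplication routine is available. Ill-conditioned inputs would need more iterations or a better initial scaling than this sketch provides.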

Conclusions:

Google TPUs can be successfully repurposed as high-performance supercomputers for dense linear algebra.
The architecture of TPUs, particularly their interconnects and memory, is well-suited for computationally intensive linear algebra operations.
This repurposing opens new avenues for accelerating scientific computations using machine learning hardware.