February 05, 2024
Here’s what caught our eye last week in AI.
Feature matching is the task of finding points in one image that correspond to points in another image of the same scene. It is challenging because the two images may be taken from very different viewpoints and at different times. This new paper proposes MESA (Matching Everything by Segmenting Anything), a method that first finds matching areas with the help of the Segment Anything Model (SAM), and then finds matching points within each area pair using existing feature matching techniques.
The authors argue that SAM's segmentation outputs should not be used as area matches directly; instead, they process them with graphical models to establish reliable area correspondences.
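To make the coarse-to-fine idea concrete, here is a minimal sketch (not the authors' implementation) of the second stage: point matching restricted to keypoints that fall inside a pair of corresponding areas. The function name, the boolean masks standing in for SAM segments, and the brute-force nearest-neighbour matcher are all illustrative assumptions.

```python
import numpy as np

def match_within_areas(kpts1, desc1, mask1, kpts2, desc2, mask2):
    """Nearest-neighbour descriptor matching restricted to one matched area pair.

    kpts* : (N, 2) integer arrays of (row, col) keypoint locations
    desc* : (N, D) float arrays of keypoint descriptors
    mask* : boolean images marking the corresponding area in each view
    """
    # Keep only the keypoints that fall inside the matched areas.
    in1 = [i for i, (r, c) in enumerate(kpts1) if mask1[r, c]]
    in2 = [j for j, (r, c) in enumerate(kpts2) if mask2[r, c]]
    if not in1 or not in2:
        return []
    matches = []
    for i in in1:
        # Nearest neighbour in descriptor space, searched only inside the paired area.
        dists = np.linalg.norm(desc2[in2] - desc1[i], axis=1)
        matches.append((i, in2[int(np.argmin(dists))]))
    return matches

# Toy usage: random keypoints and descriptors, one mask standing in for a SAM area.
rng = np.random.default_rng(0)
kpts = rng.integers(0, 32, size=(20, 2))
desc = rng.normal(size=(20, 128))
mask = np.zeros((32, 32), dtype=bool)
mask[:16, :16] = True
print(match_within_areas(kpts, desc, mask, kpts, desc, mask))
```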
Deep learning model architectures can be tuned to run faster on specific hardware. This paper investigates the performance of General Matrix Multiplications (GEMMs) in transformer models on NVIDIA GPUs, during both training and inference. Why focus on GEMMs? Because, according to the authors, GEMMs make up most of the computation in transformers (68% to 94%, depending on model size). The authors point out several characteristics of NVIDIA GPUs that can hurt GEMM efficiency, such as tile quantization and wave quantization, and derive a set of rules for sizing transformer models efficiently. Applying these rules to the GPT architecture yields speedups of up to 39% while maintaining the same accuracy.
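As a back-of-the-envelope illustration of what tile and wave quantization mean (the tile size and SM count below are assumptions for the sketch, not figures from the paper): a GEMM output is computed in fixed-size tiles, so output dimensions that are not multiples of the tile size waste part of the last tile row or column, and a tile count that does not divide evenly across the GPU's streaming multiprocessors leaves the final wave underfilled.

```python
import math

# Illustrative model of tile and wave quantization (assumed numbers, not the
# paper's measurements): 128x128 output tiles and 108 SMs, roughly an A100.
TILE = 128
NUM_SMS = 108

def tile_utilization(m, n):
    """Fraction of the computed tile area that is useful GEMM output."""
    tiles_m, tiles_n = math.ceil(m / TILE), math.ceil(n / TILE)
    return (m * n) / (tiles_m * TILE * tiles_n * TILE)

def wave_utilization(m, n):
    """Fraction of the scheduled waves of tiles that does useful work."""
    tiles = math.ceil(m / TILE) * math.ceil(n / TILE)
    waves = math.ceil(tiles / NUM_SMS)
    return tiles / (waves * NUM_SMS)

print(tile_utilization(1024, 1024))  # 1.00: dimensions divide evenly into tiles
print(tile_utilization(1025, 1024))  # ~0.89: one extra row forces a whole extra tile row
print(wave_utilization(1536, 1536))  # ~0.67: 144 tiles -> 2 waves on 108 SMs, second wave underfilled
```

In this toy model, picking layer dimensions that keep both ratios close to 1 is the spirit of the sizing rules the authors propose.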
Interested in future weekly updates? Stay up to date by joining our Slack Community!