Dimensionality's Effect on Variance in Scaled Attention
Background

This project investigated how the dimensionality of vectors relates to the distribution of their dot products, a key piece in understanding the behavior of the attention mechanism in machine learning models. Using Python's NumPy and Matplotlib libraries, I created an animation that shows how the variance of dot-product values changes as vector dimensionality increases.
The code defines a range of vector dimensions from 2 to 2000, generates random vectors, and computes dot products between them. An animated histogram shows how the distribution of these dot products changes as dimensionality increases. One important insight from this dynamic view is that the variance of the dot product grows with the number of dimensions: for vectors whose components are independent with zero mean and unit variance, the variance of the dot product equals the dimension d. Left unchecked, this linear growth pushes dot products toward ever more extreme values in high dimensions.
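The core of the experiment can be sketched in a few lines. This is a minimal, non-animated version assuming standard-normal vector components; the particular dimensions and sample count here are illustrative, not the exact values from the animation.

```python
import numpy as np

rng = np.random.default_rng(0)

# For each dimension d, draw pairs of random vectors with i.i.d.
# standard-normal components and measure the variance of their dot products.
dims = [2, 50, 200, 1000, 2000]
n_samples = 10_000

variances = {}
for d in dims:
    a = rng.standard_normal((n_samples, d))
    b = rng.standard_normal((n_samples, d))
    dots = np.einsum("ij,ij->i", a, b)  # row-wise dot products
    variances[d] = dots.var()
    # Theory: Var(a . b) = d when components have zero mean and unit variance.
    print(f"d={d:5d}  empirical var={variances[d]:9.1f}  theory={d}")
```

The printed variances track the dimension almost exactly, which is the linear growth the animated histogram makes visible as a widening distribution.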
By employing an animated visualization, I aimed to show how high-dimensional spaces affect the statistical properties of dot products. This is directly relevant to scaled dot-product attention, where the query-key dot products are divided by the square root of the key dimension precisely to counteract this growth in variance: without the scaling, large-magnitude scores saturate the softmax, yielding near-zero gradients and hindering learning.
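To connect this back to attention, the following sketch applies the 1/sqrt(d) scaling used in scaled dot-product attention and checks that it holds the variance near 1 regardless of dimensionality. The dimensions and sample count are again illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Dividing each dot product by sqrt(d), as in scaled dot-product attention,
# normalizes its variance to about 1 for any dimension d.
scaled_vars = {}
for d in (64, 512, 2000):
    q = rng.standard_normal((10_000, d))
    k = rng.standard_normal((10_000, d))
    scaled = np.einsum("ij,ij->i", q, k) / np.sqrt(d)
    scaled_vars[d] = scaled.var()
    print(f"d={d:5d}  var of scaled dot product={scaled_vars[d]:.3f}")
```

Keeping the scores on this fixed scale keeps the softmax inputs in a range where its gradients stay useful, which is the mitigation the writeup refers to.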