Sinusoidal positional encodings inform the Transformer about sequence order, which its self-attention layers cannot infer on their own. Understanding how these encodings behave visually can provide valuable intuition. In this section, we implement the positional encoding function and visualize the resulting vectors. We will use Python with NumPy for numerical computation and Plotly for interactive visualizations, which are well suited to web-based course materials.

## Implementing the Positional Encoding Function

First, let's translate the mathematical formulas for sinusoidal positional encoding into code. Recall the formulas:

$$ PE_{(pos, 2i)} = \sin(pos / 10000^{2i/d_{model}}) $$

$$ PE_{(pos, 2i+1)} = \cos(pos / 10000^{2i/d_{model}}) $$

where $pos$ is the position in the sequence, $i$ is the index of the sin/cos dimension pair within the embedding vector, and $d_{model}$ is the dimensionality of the embedding.

Here's a Python function using NumPy to generate these encodings:

```python
import numpy as np

def get_positional_encoding(max_seq_len, d_model):
    """
    Generates sinusoidal positional encodings.

    Args:
        max_seq_len: Maximum sequence length.
        d_model: Dimensionality of the model embedding.

    Returns:
        A numpy array of shape (max_seq_len, d_model) containing
        the positional encodings.
    """
    if d_model % 2 != 0:
        raise ValueError("d_model must be an even number to accommodate sin/cos pairs.")

    # Initialize the positional encoding matrix
    pos_encoding = np.zeros((max_seq_len, d_model))

    # Create a column vector of positions [0, 1, ..., max_seq_len-1]
    position = np.arange(max_seq_len)[:, np.newaxis]  # Shape: (max_seq_len, 1)

    # Calculate the division term: 1 / (10000^(2i / d_model))
    # Corresponds to i = 0, 1, ..., d_model/2 - 1
    div_term = np.exp(np.arange(0, d_model, 2) * -(np.log(10000.0) / d_model))  # Shape: (d_model/2,)

    # Apply sin to even indices (2i)
    pos_encoding[:, 0::2] = np.sin(position * div_term)

    # Apply cos to odd indices (2i + 1)
    pos_encoding[:, 1::2] = np.cos(position * div_term)

    return pos_encoding

# Example Usage:
max_len = 50    # Maximum sequence length
d_model = 128   # Embedding dimension (must be even)

positional_encodings = get_positional_encoding(max_len, d_model)
print(f"Shape of generated positional encodings: {positional_encodings.shape}")
# Output: Shape of generated positional encodings: (50, 128)
```

This function takes the maximum sequence length and the model's embedding dimension as input. It computes sine values for the even indices and cosine values for the odd indices, with `div_term` setting the frequency of each sin/cos pair. The result is a matrix in which each row corresponds to a position in the sequence and each column corresponds to a dimension of the positional encoding vector.

## Visualizing Positional Encodings

Visualizing this matrix helps in understanding the structure of the encodings. A heatmap is an effective way to see how the encoding values change across positions and dimensions.
We'll generate encodings for a sequence length of 50 and an embedding dimension of 128.

[Heatmap: "Sinusoidal Positional Encodings"; x-axis: Embedding Dimension Index (i); y-axis: Position in Sequence (pos); color scale: PE Value]

Heatmap visualizing sinusoidal positional encodings for a sequence of length 50 and embedding dimension 128. Each row represents a position, and each column represents a dimension index.
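The vertical banding in the heatmap comes from the geometric spread of frequencies across dimension pairs: the sin/cos pair at index $i$ oscillates along the position axis with wavelength $2\pi \cdot 10000^{2i/d_{model}}$. A short calculation, reusing the constants from the function above, shows how wide that range is:

```python
import numpy as np

d_model = 128
two_i = np.arange(0, d_model, 2)

# Wavelength (in positions) of the sin/cos pair at dimension pair index i:
# 2*pi * 10000^(2i / d_model)
wavelengths = 2 * np.pi * 10000.0 ** (two_i / d_model)

# The first pair completes a full cycle every ~6.28 positions (2*pi),
# while the last pair barely moves over a 50-position sequence.
print(f"Shortest wavelength: {wavelengths[0]:.2f} positions")
print(f"Longest wavelength:  {wavelengths[-1]:.0f} positions")
```

The shortest wavelength is $2\pi$ positions, which explains the rapid color oscillation in the leftmost columns, while the longest is tens of thousands of positions, so the rightmost columns appear nearly constant over this 50-position window.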
Color intensity indicates the encoding value.

## Analyzing the Visualization

From the heatmap, several properties discussed earlier become visually apparent:

- **Unique encoding per position:** Each row (position) has a distinct pattern of colors, representing its unique encoding vector. This uniqueness is what allows the model to distinguish between different positions.
- **Varying frequencies:** Observe the wavelengths along the dimension axis (x-axis). The leftmost columns (low dimension indices, small $i$) change with high frequency: colors shift rapidly as you move down the position axis, encoding fine-grained positional information. The rightmost columns (high dimension indices, large $i$) change with low frequency, encoding coarser positional information over longer distances.
- **Smooth transitions:** The sinusoidal functions ensure smooth transitions between the encodings of adjacent positions.
- **Bounded values:** All values lie within $[-1, 1]$ because they are outputs of sine and cosine.

Let's further examine the uniqueness by plotting the encoding vectors for a few specific positions (positions 0, 10, and 25) across all dimensions.

[Line plot: "Positional Encoding Vectors for Specific Positions"; x-axis: Embedding Dimension Index (i); y-axis: Encoding Value; one trace each for positions 0, 10, and 25]

Line plots comparing the 128-dimensional positional encoding vectors for positions 0, 10, and 25. The distinct shape of each line highlights the unique encoding assigned to each sequence position.

These visualizations confirm that sinusoidal positional encodings provide a distinct signal for each position, varying smoothly across dimensions with different frequencies. This positional signal is added to the input token embeddings, allowing the subsequent self-attention layers to take the order of sequence elements into account.

In the next chapter, we will assemble these components, along with the multi-head attention mechanism, into the full Transformer encoder and decoder stacks.
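To make the last step concrete, here is a minimal sketch of how the generated encodings would typically be combined with token embeddings before the encoder. The random embeddings and the batch dimensions are hypothetical stand-ins for a real embedding layer's output; NumPy broadcasting adds the same `(seq_len, d_model)` encoding matrix to every sequence in the batch:

```python
import numpy as np

batch_size, seq_len, d_model = 2, 10, 128

# Hypothetical token embeddings standing in for a real embedding layer's output
rng = np.random.default_rng(0)
token_embeddings = rng.normal(size=(batch_size, seq_len, d_model))

# Positional encodings, computed with the same formula as get_positional_encoding
position = np.arange(seq_len)[:, np.newaxis]
div_term = np.exp(np.arange(0, d_model, 2) * -(np.log(10000.0) / d_model))
pos_encoding = np.zeros((seq_len, d_model))
pos_encoding[:, 0::2] = np.sin(position * div_term)
pos_encoding[:, 1::2] = np.cos(position * div_term)

# Broadcasting: (batch, seq, d_model) + (seq, d_model) -> (batch, seq, d_model)
encoder_input = token_embeddings + pos_encoding
print(encoder_input.shape)  # (2, 10, 128)
```

Because the encodings depend only on position and dimension, they are computed once and reused for every batch; in practice the precomputed matrix is simply sliced to the current sequence length before the addition.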