I often teach introductory machine learning to software engineers who have a lot of programming experience but little experience with ML. One of the biggest challenges for engineers who are new to ML is something that rarely gets discussed: working with multi-dimensional arrays.
The NumPy library has dozens of ndarray (“n-dimensional array”) functions. I was working on a problem where I needed to concatenate two 2D arrays by rows. The np.concatenate() function does the job, but it is more general than I needed: it accepts any number of arrays and has optional axis, out, dtype, and casting parameters.
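As a minimal sketch of how the axis parameter behaves for 2D arrays, the example below uses small made-up arrays (the names by_rows and by_cols are just for illustration). With axis=0 the column counts must match, and with axis=1 the row counts must match.

```python
import numpy as np

a = np.arange(12, dtype=np.float64).reshape(4, 3)  # 4x3
b = np.arange(6, dtype=np.float64).reshape(2, 3)   # 2x3
e = np.ones((4, 2))                                # 4x2

by_rows = np.concatenate((a, b), axis=0)  # b goes underneath a
by_cols = np.concatenate((a, e), axis=1)  # e goes to the right of a

print(by_rows.shape)  # (6, 3)
print(by_cols.shape)  # (4, 5)
```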
Sometimes I prefer to implement helper functions from scratch, probably because I worked for many years in C/C++/C# environments where implementing from scratch was the standard approach for multi-dimensional array functions. Unfortunately, when working with Python, a from-scratch implementation of an array function is usually much slower than a library implementation, because explicit Python loops run in the interpreter while NumPy functions run compiled code.
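One rough way to see the speed difference is to time both approaches with the standard timeit module. The sketch below is only an illustration: loop_concat_rows is a hypothetical loop-based helper, the array sizes are arbitrary, and the actual timings will vary by machine and NumPy version.

```python
import timeit
import numpy as np

def loop_concat_rows(a, b):
    # Hypothetical helper: copy element by element using Python loops.
    # Assumes a and b are 2D with the same number of columns.
    a_rows, a_cols = a.shape
    b_rows, _ = b.shape
    result = np.zeros((a_rows + b_rows, a_cols))
    for i in range(a_rows):
        for j in range(a_cols):
            result[i, j] = a[i, j]
    for i in range(b_rows):
        for j in range(a_cols):
            result[i + a_rows, j] = b[i, j]
    return result

rng = np.random.default_rng(0)
a = rng.random((200, 50))  # arbitrary sizes for the demo
b = rng.random((100, 50))

# Total time for 5 calls of each version.
t_loop = timeit.timeit(lambda: loop_concat_rows(a, b), number=5)
t_lib = timeit.timeit(lambda: np.concatenate((a, b), axis=0), number=5)

print(f"loop:    {t_loop:.6f} s")
print(f"library: {t_lib:.6f} s")
```

Both versions produce the same array, so the only thing being compared is how long the copying takes.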
The moral of the story is that implementing ML systems has many layers. It’s difficult just to get a system to work, but there are also many subtle issues, such as deciding when to implement from scratch and when to use library functions.
I was in Manhattan, New York City, for several days not too long ago. I spent one full day walking around from the Brooklyn Bridge up to Central Park. I was struck by the way buildings are concatenated together, producing incredible urban density. People who live in such conditions should read about the Rat Utopia experiment by John Calhoun.
# concat_demo.py

import numpy as np

def my_concat_rows(a, b):
    (a_rows, a_cols) = a.shape  # mxn
    (b_rows, b_cols) = b.shape  # rxn
    result = np.zeros((a_rows + b_rows, a_cols))
    for i in range(a_rows):
        for j in range(a_cols):
            result[i][j] = a[i][j]  # or result[i,j] = a[i,j]
    for i in range(b_rows):  # 0, 1, ..
        for j in range(a_cols):
            result[i+a_rows][j] = b[i][j]
    return result

a = np.array([[1.1, 1.2, 1.3],
              [2.1, 2.2, 2.3],
              [3.1, 3.2, 3.3],
              [4.1, 4.2, 4.3]])  # 4x3

b = np.array([[5.1, 5.2, 5.3],
              [6.1, 6.2, 6.3]])  # 2x3

print("\na = ")
print(a)
print("\nb = ")
print(b)

c = np.concatenate((a,b), axis=0)  # by rows
print("\nc = ")
print(c)

d = my_concat_rows(a, b)
print("\nd = ")
print(d)
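A possible middle ground is a from-scratch helper that replaces the nested loops with slice assignment, so the copying is done inside NumPy. The sketch below is one way to do that, not the approach used in the demo; the name concat_rows_sliced and the column-count check are my own additions for illustration.

```python
import numpy as np

def concat_rows_sliced(a, b):
    # Hypothetical variant of a row-concatenation helper.
    # Assumes a and b are 2D arrays; rejects mismatched column counts.
    if a.shape[1] != b.shape[1]:
        raise ValueError("a and b must have the same number of columns")
    a_rows = a.shape[0]
    result = np.zeros((a_rows + b.shape[0], a.shape[1]))
    result[:a_rows] = a  # copy all rows of a in one step
    result[a_rows:] = b  # copy all rows of b below them
    return result

a = np.array([[1.1, 1.2, 1.3],
              [2.1, 2.2, 2.3]])  # 2x3
b = np.array([[5.1, 5.2, 5.3]])  # 1x3

d = concat_rows_sliced(a, b)
print(d)
```

The design keeps the explicit preallocate-then-copy structure of a from-scratch function, which can make the logic easier to explain to engineers coming from C-family languages, while avoiding per-element Python loops.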