I often teach introductory machine learning to software engineers who have a lot of programming experience but little experience with ML. One of the biggest challenges for engineers who are new to ML is something that rarely gets talked about: working with multi-dimensional arrays.

The NumPy library has dozens of ndarray (“n-dimensional array”) functions. I was working on a problem where I needed to concatenate (merge) two 2D arrays by rows. The np.concatenate() function does this, but it is overkill for my purposes in the sense that it has several optional parameters (axis, out, dtype, casting) that I didn't need.
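For reference, here is a minimal sketch of how the axis parameter of np.concatenate() behaves on two small 2D arrays. The array values are made up for illustration and are not the ones used in my demo.

```python
import numpy as np

a = np.array([[1.0, 2.0],
              [3.0, 4.0]])   # 2x2
b = np.array([[5.0, 6.0]])   # 1x2

# axis=0 stacks b underneath a, so the column counts must match
rows = np.concatenate((a, b), axis=0)    # 3x2

# axis=1 appends columns, so the row counts must match;
# b.T is 2x1, which lines up with the 2 rows of a
cols = np.concatenate((a, b.T), axis=1)  # 2x3

print(rows.shape)
print(cols.shape)
```

The other optional parameters (out, dtype, casting) can be left at their defaults for a simple row-wise merge like this one.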

Sometimes I prefer to implement helper functions from scratch, probably because I worked for many years in C/C++/C# environments where implementing from scratch was the standard approach for multi-dimensional array functions. Unfortunately, when working with Python, a from-scratch implementation of an array function is usually much slower than a library implementation, because explicit Python loops run in the interpreter one element at a time while NumPy functions run in compiled code.
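One rough way to see the speed difference is to time both approaches with the standard timeit module. This is only a sketch: loop_concat_rows() is a hypothetical helper written for this comparison, the array sizes are arbitrary, and the actual timings will vary by machine.

```python
import timeit
import numpy as np

def loop_concat_rows(a, b):
    # hypothetical element-by-element helper, written only for this comparison
    a_rows, a_cols = a.shape
    b_rows, _ = b.shape
    result = np.zeros((a_rows + b_rows, a_cols))
    for i in range(a_rows):
        for j in range(a_cols):
            result[i, j] = a[i, j]
    for i in range(b_rows):
        for j in range(a_cols):
            result[i + a_rows, j] = b[i, j]
    return result

rng = np.random.default_rng(0)
a = rng.random((200, 50))  # arbitrary sizes, chosen so the test runs quickly
b = rng.random((100, 50))

# total time for 10 calls of each version
t_loop = timeit.timeit(lambda: loop_concat_rows(a, b), number=10)
t_lib = timeit.timeit(lambda: np.concatenate((a, b), axis=0), number=10)

print(f"loop version:   {t_loop:.6f} sec for 10 calls")
print(f"np.concatenate: {t_lib:.6f} sec for 10 calls")
```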

The moral of the story is that implementing ML systems has many layers: it’s difficult just to get a system to work, but there are also many subtle issues, such as deciding whether to implement a function from scratch or use a library function.

*I was in Manhattan, New York City, for several days not too long ago. I spent one full day walking around from the Brooklyn Bridge up to Central Park. I was struck by the way buildings are concatenated together, producing incredible urban density. People who live in such conditions should read about the Rat Utopia experiment by John Calhoun.*

Demo code:

# concat_demo.py

import numpy as np

def my_concat_rows(a, b):
    # append the rows of b below the rows of a
    (a_rows, a_cols) = a.shape  # mxn
    (b_rows, b_cols) = b.shape  # rxn
    if a_cols != b_cols:
        raise ValueError("a and b must have the same number of columns")
    result = np.zeros((a_rows + b_rows, a_cols))
    for i in range(a_rows):
        for j in range(a_cols):
            result[i][j] = a[i][j]  # or result[i,j] = a[i,j]
    for i in range(b_rows):  # 0, 1, ..
        for j in range(a_cols):
            result[i+a_rows][j] = b[i][j]
    return result

a = np.array([[1.1, 1.2, 1.3],
              [2.1, 2.2, 2.3],
              [3.1, 3.2, 3.3],
              [4.1, 4.2, 4.3]])  # 4x3
b = np.array([[5.1, 5.2, 5.3],
              [6.1, 6.2, 6.3]])  # 2x3

print("\na = ")
print(a)
print("\nb = ")
print(b)

c = np.concatenate((a,b), axis=0)  # by rows
print("\nc = ")
print(c)

d = my_concat_rows(a, b)  # from-scratch version
print("\nd = ")
print(d)
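
A possible middle ground between the two versions in the demo is to keep the from-scratch structure (allocate the result, then copy a and b into it) but replace the nested loops with slice assignment, so the copying is done by NumPy rather than by Python loops. This is a sketch, not part of the original demo, and concat_rows_sliced() is a hypothetical name.

```python
import numpy as np

def concat_rows_sliced(a, b):
    # hypothetical helper: same idea as the loop version, without the loops
    if a.ndim != 2 or b.ndim != 2:
        raise ValueError("a and b must both be 2D arrays")
    if a.shape[1] != b.shape[1]:
        raise ValueError("a and b must have the same number of columns")
    a_rows = a.shape[0]
    # np.result_type picks a dtype that can hold the values of both inputs
    result = np.empty((a_rows + b.shape[0], a.shape[1]),
                      dtype=np.result_type(a, b))
    result[:a_rows, :] = a   # copy all rows of a in one step
    result[a_rows:, :] = b   # copy all rows of b below them
    return result

a = np.array([[1.1, 1.2, 1.3],
              [2.1, 2.2, 2.3]])  # 2x3
b = np.array([[5.1, 5.2, 5.3]])  # 1x3

e = concat_rows_sliced(a, b)
print(e)
```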
