I gave a technical talk titled “Simple Unsupervised Anomaly Detection Using a PyTorch Transformer Autoencoder” at the Fall 2022 Machine Learning, Artificial Intelligence, and Data Science (MLADS) conference. MLADS is an internal event at the large tech company I work for, so it wasn’t open to the public. The bottom line is that I learned a lot, enjoyed the event, and made some valuable connections.
The event ran November 14-17, 2022 in Redmond, Washington.
In my talk, I started by describing unsupervised anomaly detection using a standard neural autoencoder. Next I explained Transformer Architecture (TA) as briefly as possible. Then I showed an example of anomaly detection using TA. I used one of my standard examples where the data represents employees and looks like:
 1  0.24  1 0 0  0.2950  0 0 1
-1  0.39  0 0 1  0.5120  0 1 0
 1  0.63  0 1 0  0.7580  1 0 0
-1  0.36  1 0 0  0.4450  0 1 0
 1  0.27  0 1 0  0.2860  0 0 1
. . .
The fields are sex (male = -1, female = 1), age (divided by 100), city (one of three, one-hot encoded), income (divided by 100,000), and job type (one of three, one-hot encoded). The point is that the technique works with any type of data: Boolean, integer, float/real, categorical.
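To make the idea concrete, here is a minimal sketch of reconstruction-error anomaly detection on the five rows shown above. It is not the code from my talk. The class name `TransformerAutoencoder`, the use of a plain `nn.Linear` layer as a stand-in numeric embedding, and every hyperparameter (`d_model`, `n_heads`, `latent`, learning rate, epoch count) are assumptions chosen only to keep the example short.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# The five example rows: sex, age, city (3), income, job type (3).
data = torch.tensor([
    [ 1, 0.24, 1, 0, 0, 0.2950, 0, 0, 1],
    [-1, 0.39, 0, 0, 1, 0.5120, 0, 1, 0],
    [ 1, 0.63, 0, 1, 0, 0.7580, 1, 0, 0],
    [-1, 0.36, 1, 0, 0, 0.4450, 0, 1, 0],
    [ 1, 0.27, 0, 1, 0, 0.2860, 0, 0, 1],
], dtype=torch.float32)

class TransformerAutoencoder(nn.Module):
    # Hypothetical architecture for illustration only.
    def __init__(self, n_features=9, d_model=4, n_heads=2, latent=6):
        super().__init__()
        self.n_features = n_features
        self.d_model = d_model
        # Assumed stand-in for a numeric embedding: expand the 9 input
        # values into 9 "tokens" of d_model values each.
        self.embed = nn.Linear(n_features, n_features * d_model)
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, dim_feedforward=16,
            dropout=0.0, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=1)
        self.to_latent = nn.Linear(n_features * d_model, latent)
        self.decode = nn.Linear(latent, n_features)

    def forward(self, x):
        z = self.embed(x).reshape(-1, self.n_features, self.d_model)
        z = self.encoder(z)                      # self-attention over the 9 tokens
        z = z.reshape(-1, self.n_features * self.d_model)
        z = torch.tanh(self.to_latent(z))        # compressed representation
        return self.decode(z)                    # reconstruction of the input

model = TransformerAutoencoder()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()

# Train the model to reproduce its own input.
losses = []
model.train()
for epoch in range(200):
    optimizer.zero_grad()
    loss = loss_fn(model(data), data)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

# Score each row by its mean squared reconstruction error.
model.eval()
with torch.no_grad():
    errors = ((model(data) - data) ** 2).mean(dim=1)

# Rows sorted from highest error (most anomalous) to lowest.
ranked = torch.argsort(errors, descending=True)
for i in ranked.tolist():
    print(f"row {i}  error = {errors[i].item():.6f}")
```

The rows printed first are the ones the model reconstructs worst, which makes them the candidates for a closer look. With only five rows the ranking means very little; the sketch is just meant to show the mechanics.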
There were about 40 people in the physical audience and about 80 people watching online. The presentation was recorded, and historically most people watch the recording in the days following the event.
I enjoy interacting with my work colleagues in person. There are a lot of very smart people and I always pick up interesting new ideas. When I present in person, I can pick up cues from attendees’ body language and voice characteristics, which helps me give a better presentation. But the main value I get from presenting at internal work conferences is making connections with people who have interesting problems where deep neural techniques can be useful.
In addition to my technical talk, I sat on a panel discussion titled “Meet a Data Scientist”. The panel had five employees (including me) with wildly varying backgrounds and job types. I posed a question to my fellow panelists: “What skills do you think are required for the Data Scientist job role?” I was fully expecting to hear SQL, R, and Python (for use with a library such as scikit-learn or PyTorch) and maybe a few other skills. But to my complete surprise, not one of the other four panelists felt that SQL, R, or Python was an essential skill for a Data Scientist. Hmmm. For the work I do, I can’t imagine a Data Scientist not having at least a moderate knowledge of SQL, R, and Python.
When I give a technical presentation, I have to think carefully about what I wear. If I dress too nicely, I might lose credibility with hardcore tech guys. If I dress too casually, I might lose credibility with people who interact with traditional, conservative external customers such as banks and medical companies. Here are three images from a search for Buryat traditional clothing. Quite elaborate. I like the fancy hats. Buryats are an indigenous group in Siberia. There are roughly 500,000 Buryats.