Eugene Yan is an Applied Scientist at Amazon.
This interview was conducted with CareerFair and dives deeper into an average (and non-average) day in the Applied Scientist role, as well as what working at Amazon is like.
Hi Eugene, what do you do?
I work at the intersection of machine learning & product to build pragmatic machine learning systems that serve customers. I also write about effective data science, machine learning, and career growth.
Currently, I’m an Applied Scientist at Amazon helping users read more, and get more out of reading. We build book recommendation systems and contribute to efforts in discovery (e.g., search). Previously, I led the data science team at Lazada (acquired by Alibaba in 2016) and worked on e-commerce ML systems (e.g., ranking, automation, fraud detection).
What does an average day look like?
First, a disclaimer: Even for people with the same title (i.e., applied scientist), the average day will look different. It will also vary with the project’s lifecycle, such as research, prototyping, development, and maintenance. My role mostly involves prototyping and development.
My day usually has these buckets of activities:
- Stand-up: The team checks in on what’s being worked on, what’s blocked, and who needs help.
- Data science/coding: This includes (i) literature research, (ii) exploring and preparing data, (iii) offline experiments, (iv) building prototypes (and giving demos), (v) writing and reviewing production code, and (vi) launching A/B tests.
- Writing: I write documents (e.g., one-pagers, design docs) to share ideas and get feedback. I also document methodology, decisions, and experiment results for future reference. Writing down ideas and findings makes them easier to scale.
- Reading: Reading papers helps me to be a more effective data scientist. Thus, I try to read at least an hour a day. The content includes internal/external articles and papers. (I have a bias towards papers on applied machine learning.)
- Meetings: Not the most enjoyable activity for me. Nonetheless, meetings are essential for coordination and communication. A 30-min meeting beats days (or weeks) of email back and forth.
What does a non-average day look like?
I’m struggling with this question as most days don’t seem average. Nonetheless, here are some exceptional events that may come up:
- High-severity incidents: These include critical system failures, sometimes with customer-facing impact. Fortunately, this seldom happens for our team (i.e., less than a handful of times a year).
- Migrating legacy systems: All code eventually becomes legacy code. In a previous role, I was involved in a massive migration (cloud providers, data and machine learning systems) that required us to drop everything and focus solely on the migration.
- Attending conferences: Many great conferences are now online, which is convenient. Attending them means balancing work with the conference.
What’s your favourite part about the job?
I really enjoy working with data. Through data (e.g., search logs, clickstreams, transactions), we understand our customers and how they interact with our platform and products. The data reveals interesting patterns in human behaviour. For example, consumption changes due to life-stages (e.g., becoming a parent) and socio-economic events (e.g., COVID-19, work from home). By understanding our customers better, we can serve them better.
Another aspect I enjoy is the amount of leverage working in a consumer tech company (e.g., Lazada, Amazon) provides. Our team can build and deploy machine learning systems to help customers around the world. It scales well too. Most of the system doesn’t need to change from country to country. Some necessary changes include using local data and adapting to local regulations (e.g., privacy). I get a huge kick from seeing customers benefit from our work (we see this through metrics and anecdotes).
What’s your least favourite part about being a Data Scientist?
I’m still learning about how to manage this, but sometimes, I spend more time than I would like writing documents and in meetings. Nonetheless, it’s essential for socialising ideas and getting buy-in and feedback. I just wish I was more effective and faster at it.
Occasionally, stakeholders suggest solutions that are way more complex than they need to be. I blame the overhyping of tech and machine learning in the media. When this happens, our team patiently tries to understand their perspective and educate them. Nonetheless, it takes considerable time and effort and distracts us from work that helps customers.
Lastly, because my work revolves around data, I’m also constrained by access to high-quality data. Delays happen now and then. Sometimes, it’s a minor lack of permissions which takes a few hours to a few days to resolve. Other times, we find that our system isn’t tracking a specific field and we need to update our trackers and wait a few months, or backfill the data.
Do you think more people in tech would benefit from having a humanities background? You studied Psychology as an undergrad. What are some ways in which that has helped you in Data Science?
Having a humanities background is associated with certain traits: open-mindedness, critical thinking, better problem framing, research skills, and the ability to communicate with laymen. I think such traits would benefit everyone, not just tech folks. While a humanities degree helps with cultivating these traits, there are plenty of other ways to develop them: they can also come from having the opportunity to work on diverse, challenging problems, good role models, and work experience.
Other than the traits mentioned above, my Psychology degree taught me how to analyse qualitative and quantitative data. It also taught me about statistics (and how to be skeptical of it). In addition, I learned about how people perceive, think, and behave; this helps when I’m building customer-facing machine learning systems.
How is working at a big company like Amazon different to working at a startup like Lazada? Do you enjoy one more than the other?
After a year at Amazon, I see Amazon as a group of start-ups (rather than a big company). For example, each AWS service seems to operate like a start-up. In that sense, my experience so far has been similar to working at Lazada. We’re constantly experimenting, shipping, and getting feedback from customers. Nonetheless, being a global company, Amazon does provide slightly more leverage in terms of customer impact.
I enjoy, and work best in, a role that’s between commando and soldier. My experiences at both Lazada and Amazon have allowed me to do this, which plays to my strengths.
How much do you think strong communication plays a role in being a successful data scientist? How can younger professionals cultivate that skill?
Communication is one of the most important skills, if not the most important skill, for an effective data scientist. Initially, I didn’t think this way. But I reached out to several mentors asking what the most important skill for a data scientist was and guess what: it was communication. Thus, I focused on improving my communication and saw gains in my effectiveness within a year.
I think the best way to improve communication is through practice. At the start, it’s useful to read about the fundamentals of good writing and speaking—this arms us with knowledge from the experts. But to really get better, we need to practice.
How can we practice? Offer to write documents at work. This can be in the form of proposals, design documents, or internal newsletters. Or write about personal projects or what we learn on a blog. To practice speaking, offer to share at meet-ups or conferences about work-related or personal projects. With everything online now, it’s much easier.
This interview has been edited for relevance and was first published here.
Eugene Yan works at the intersection of machine learning & product to build ML systems. He’s currently an Applied Scientist at Amazon. Previously, he led the data science teams at Lazada and uCare.ai. He writes on how to be effective at data science, machine learning, and career at eugeneyan.com and tweets at @eugeneyan. Follow him on Twitter or subscribe to his weekly newsletter to learn more about this space.
Disclaimer: This article was written by a contributor. All content is written by and reflects the personal perspective of the writer.