Data engineering may not be a ‘new’ field per se, but it has seen tremendous growth in the past decade and has steadily risen in importance within organizations of all stripes. As demand has increased, people with a variety of technical and mathematical backgrounds have been drawn to the profession. Given the requirements of the role, it’s not surprising that many software engineers have also taken up the gauntlet and transitioned to data engineering jobs. In this post, we’ll discuss some of the requirements of the data engineering profession, how they relate to software engineering, and some tips for making the transition.
Want to learn more about the move from software engineer to data engineer?
Check out Chapter 52 of the free O’Reilly eBook 97 Things Every Data Engineer Should Know, by John Salinas, a software engineer turned data engineer at USAA.
Why switch from software to data engineering?
Data engineering is appealing as a career option for many reasons. After all, it’s stimulating, cutting edge, and in high demand (which translates to higher salaries and job security). But all of this is true of software engineering too. So what’s the appeal?
While it’s true that average salaries are a bit higher for data engineers, this is far from the only benefit. As data engineer John Salinas put it, “The move from software engineering to data engineering is rewarding and exciting.” First, you get to retain all of the elements that make software engineering a great career, such as solving technical challenges in ways that maximize simplicity for end-users.
But, as Salinas also put it, “you expand your craft to include analytical and data-related problems.” This includes looking at the business from a highly dynamic perspective. While software certainly needs to be updated and redesigned, data infrastructure has to adapt on a daily, even hourly, basis to a shifting landscape of data inputs and business needs. You also get to design new paradigms that will ultimately become the data backbone of new software applications.
Data science is a hybrid field that combines the skills of a statistician, a business expert, and a programmer. The programmer’s skills may well be the hardest of the three to acquire, and software developers also bring an understanding of data structures that you won’t get in a college stats class, which typically doesn’t cover arrays, data frames, stacks, or queues. All this knowledge is necessary to make the leap from comprehending structured data stored in a tabular database to the more ethereal concepts of Big Data.
Skills for the transition
A software developer already has a good chunk of the skills that make a great data engineer in their toolkit, such as:
With that said, there’s still a bit of a learning curve, even for a software engineer. For instance, you do need to bring an understanding of statistics. If you’re building a data pipeline, do you know what data to clip? It’s critical to know, from a statistical standpoint, what constitutes an outlier. Answering that question may also require domain expertise.
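To make the clipping question concrete, here is a minimal sketch of one common statistical convention: treating values more than 1.5 times the interquartile range beyond the quartiles as outliers (Tukey's fences) and clipping them to the fence. The function names, the 1.5 multiplier, and the sample readings are illustrative assumptions rather than anything prescribed in this post; the right threshold for a real pipeline usually depends on the domain.

```python
import statistics


def iqr_bounds(values, k=1.5):
    """Return (lower, upper) fences: k * IQR beyond the first and third quartiles.

    k=1.5 is a widely used default, but it is an assumption here, not a rule.
    """
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def clip_outliers(values, k=1.5):
    """Clip every value into the [lower, upper] range computed by iqr_bounds."""
    lower, upper = iqr_bounds(values, k)
    return [min(max(v, lower), upper) for v in values]


# Hypothetical sensor readings with one suspicious spike at the end.
readings = [10, 12, 11, 13, 12, 11, 10, 250]
clipped = clip_outliers(readings)
print(clipped)
```

In this sketch the spike is pulled down to the upper fence while the ordinary readings pass through unchanged. Whether a value like 250 is a sensor glitch to clip or a real event to keep is exactly the kind of call that needs domain knowledge.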
Here are a few areas where you can check your knowledge and skills and, if necessary, ramp up or brush up:
Good luck with your journey from software engineer to data engineer! If, along the way, you want to learn how to make data engineering much easier, and sidestep such complicated processes as ETL, learn about the data fabric.