
Data Has Always Been Software Engineering
Working in data has always been about software engineering. It just took the field a while to realise it and admit it openly. LLMs are the icing on the cake, but only if the cake is properly engineered.
The data field was born very close to academia, from tools like SPSS, Stata, R, or Matlab, where you’d build models that couldn’t be deployed, analyses that couldn’t be reproduced, and pipelines that couldn’t be maintained. It has taken decades of pressure and real-world demands for the field to adjust, live up to expectations, and come to terms with what it mostly is: software engineering.
But we are there now!
Today, software engineering practices have been incorporated into the data realm: version control, testing, isolated environments, DevOps with CI/CD, Infrastructure as Code, and so on. Even Python has been getting serious, borrowing from statically typed languages through type hints and validation libraries like Pydantic, and doing proper dependency management with Poetry or uv.
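To make the typing and testing point a bit more tangible, here is a minimal sketch of a typed, validated record and a small transformation that can be unit tested. It uses only the standard library (a frozen dataclass standing in for what Pydantic would do with less boilerplate), and the `Order` model, its fields, and the helper functions are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Order:
    """One validated row of a hypothetical orders dataset."""

    order_id: int
    amount_eur: float
    ordered_on: date

    def __post_init__(self) -> None:
        # Fail fast on bad data instead of letting it flow downstream.
        if self.order_id <= 0:
            raise ValueError(f"order_id must be positive, got {self.order_id}")
        if self.amount_eur < 0:
            raise ValueError(f"amount_eur must be non-negative, got {self.amount_eur}")


def parse_order(raw: dict[str, str]) -> Order:
    """Convert a raw string record (for example a CSV row) into a typed Order."""
    return Order(
        order_id=int(raw["order_id"]),
        amount_eur=float(raw["amount_eur"]),
        ordered_on=date.fromisoformat(raw["ordered_on"]),
    )


def total_revenue(orders: list[Order]) -> float:
    """Sum the order amounts of already validated orders."""
    return sum(order.amount_eur for order in orders)
```

Because the parsing and the aggregation are plain functions over a typed record, they can be covered by ordinary unit tests and run in CI like any other code.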
And thanks to this development, after more than a decade in the field, it finally feels right to work with data! The tools just work, they are pleasant to use, and with the right setup and practices, productivity explodes.
Seen on the internet. Python won actually.
LLMs: The Icing on the Cake
And LLMs are the icing on the cake of this evolution. The superpowers.
If you are working with a well-engineered data platform where everything is code, end to end (infrastructure, integrations, modeling, data serving, visualization, and DevOps), then an LLM can generate a new pipeline that follows your existing patterns, or scaffold a new data product with amazing accuracy and proper conventions. Of course, you need to review and iterate, but you are rocket-fuelled. And what holds for a data platform holds for basically any platform…
However, if you have a patchwork platform where code is scattered, some parts live in GUI-based tools with no proper API or documentation, logic is buried in drag-and-drop workflows, and tribal knowledge fills the gaps… well, AI won’t multiply much.
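To make the contrast concrete, here is a rough sketch of what "following your existing patterns" could mean when the platform is code. It assumes a hypothetical convention in which every pipeline step is a plain function registered under a name, and a pipeline is just an ordered list of those names. The registry, the decorator, and the two example steps are invented for illustration and are not taken from any particular framework.

```python
from typing import Callable, Iterable

Row = dict[str, object]
Step = Callable[[Iterable[Row]], Iterable[Row]]

# Hypothetical convention: every pipeline step registers itself here by name.
STEPS: dict[str, Step] = {}


def step(name: str) -> Callable[[Step], Step]:
    """Decorator that registers a function as a named pipeline step."""

    def register(func: Step) -> Step:
        if name in STEPS:
            raise ValueError(f"step {name!r} is already registered")
        STEPS[name] = func
        return func

    return register


@step("drop_nulls")
def drop_nulls(rows: Iterable[Row]) -> Iterable[Row]:
    # Keep only rows where every value is present.
    return [row for row in rows if all(value is not None for value in row.values())]


@step("normalise_country")
def normalise_country(rows: Iterable[Row]) -> Iterable[Row]:
    # Trim whitespace and upper-case the country code.
    return [{**row, "country": str(row["country"]).strip().upper()} for row in rows]


def run_pipeline(step_names: list[str], rows: Iterable[Row]) -> list[Row]:
    """Apply the named steps in order and return the resulting rows."""
    for name in step_names:
        rows = STEPS[name](rows)
    return list(rows)
```

With a convention like this, a new step is one more decorated function in the same shape as its neighbours, which is the kind of legible pattern an LLM (or a new colleague) can pick up and extend. Logic buried in a drag-and-drop workflow offers no such pattern to copy.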
Managing Change, Not Perfection
And I’m not saying every company needs a perfectly engineered data platform from day one. Software engineering has never been about perfection. It’s about managing change.
I’m just saying that with proper engineering foundations, change becomes cheap. AI can help you move faster, but only if the system is legible, structured, and built for evolution.
The data field isn’t turning into software engineering.
It always was.
And the sooner you get to this realisation, the better!
Cover photo by Sandie Clarke on Unsplash
