Petter Hultin Gustafsson

Why your data projects never make any money

Don’t be hard on yourself: only 13% make it into production.



What’s your ML ROI?


Machine learning is great. It opens doors to new ventures and new challenges to solve, and it helps already established companies reinvent their businesses. But are you really doing that? And if so, at what cost? Is your revenue bump justified given the huge increase in development costs?


50% of companies spend between 30 and 90 days getting a single model into production. Meanwhile, data scientists are among the priciest employees, together with their sparring partners, the data engineers.


Imagine


If your “normal” software engineering team were pushing features at 90-day intervals in 2020, that would be completely unacceptable. The same should go for data science. The tooling available today allows any data-driven company to mock up, prototype and deploy new ideas at an almost daily rate. And let’s face it: if your company is in the business of making gold out of data, then your data engineers, data scientists and ML engineers could, would and should be your most valuable asset. They should really be titled product engineers, as their work is directly consumed by your users.



Making gold


For any company with a self-improvement plan for its employees, this is really an easy fix (and you might want to double that budget and distribute it yesterday). Because let’s be honest: is your business dependent on the absolute latest ML research to bring value to your customers? I think not. In that case, there are enough tools and information online to make all your key players true innovators in the field. This is how you do it:


1. Data scientists need to learn the craft of preprocessing, which in most cases means learning distributed computation and a mainstream tool like Spark. This can be quite an uphill battle, but then again, do you need to do it perfectly from the start to gain business value? Nope. They also need to understand the art of software engineering. They don’t need to be full-blown CI/CD experts, but understanding load balancing, maintainable code and operating systems is a good idea. Use your data engineers and software engineers as mentors, work through a couple of cases, and there you go.


2. DevOps and software engineers need to learn the crafts of both data science and data engineering. But let’s face it, they probably know a lot about data engineering already, and data science, at the business-value level, is not rocket science for someone who is already holding your whole ship together.


As you can see, the knowledge transfer process does not need to be long or tedious, nor does it demand any external knowledge. It’s all about bringing together the teams and competencies that are already (or should be) working towards the same goal. With MLOps you can have full control and have your whole team work in the same infrastructure, making it possible to do in minutes what used to take months. Give your data scientists and developers a chance to really become a team and collaborate.

