How to decide which programming language to use for data products?
🐍 Rumour has it that Elon wants to rebuild Twitter entirely on Python. The data leader faces this decision daily: What about Julia or Rust for our latest data product? Should we try Scala for pipelining? Should everything be SQL?
This decision is also triggered top-down. Senior management reads about the new programming language used in AI, or about how a hot new startup built its product on a modern but obscure stack.
Data leaders should make this decision using two variables:
Which profiles do I have in my data organisation? What can I easily acquire in the market?
You should continue with the same stack and profiles if you have an established team of Java and Scala developers.
But what if you need to hire from the market?
For data, the answer to this is obvious. The ecosystem is entirely behind Python and SQL. There are modern alternatives, but they will take years to build a community and mature, robust libraries.
If we look at the job market, the answer is Python. It consistently ranks as the number one or two programming language across surveys and community websites such as Stack Overflow.
It has also become the programming language students learn in their first undergraduate semester. Gone are the days of Java as the default teaching language in academia.
Exploring new programming languages and running proofs of concept in the flavour of the month are always exciting and sound exercises, but your main concerns should be reliability, robustness and maintenance.
If your whole team leaves and you have to hire from scratch, is it easier to find a group of Python experts or a group of Julia experts?
The same logic should apply to libraries and software. Many experts tell me they dislike dbt, but the ecosystem and the talent are built around dbt. It is a sound decision to continue building on top of it.
What have been your experiences exploring new tech stacks or libraries? Were you successful and satisfied with the outcome? Do you prefer to stick with the tried and true?