The KU Leuven Library aka Universiteitsbibliotheek (pc: Vishwas Katti)

One year in academia: a data scientist’s (working) guide to staying sane and productive

Hussain Kazmi

--

In late 2020, my research proposal on energy data science was approved by the Flemish Research Council (FWO). At that time, after having finished my PhD in 2019, I was mostly busy building the data science stack for smart grids at i.LECO as a senior data scientist, while also teaching energy data science at KU Leuven. The allure of academic independence was simply too great though and, after only a few days of consultation with friends and family, I took the offer to become a full-time FWO research fellow at KU Leuven.

A year later, I reflect on some of the biggest differences I have noticed between working at a well-funded research group as an independent postdoc researcher, and a well-funded start-up as a senior data scientist, both in the energy sector.

  1. Collaboration. Undoubtedly the greatest joy of academia is the endless possibility for collaboration, both within the university and outside of it. Borders do not mean much for academic cooperation, and I ended 2021 collaborating with colleagues from tens of countries on three different continents. In industry, such collaborations are naturally much rarer, especially where IP concerns exist. Conflict-of-interest issues also make it harder to contribute to open-source projects while working for a company. In academia, these concerns are very often just not there.
  2. Independence. The university is an amazing place to work with international experts on topics you have quite some freedom in choosing. The best part? You get to decide which experts and topics you want to work with and learn from. As one climbs the academic ladder, this independence becomes something to cherish every day. Not so much in industry, where one can influence the make-up of their team at senior levels, but the work itself is largely dictated by overarching objectives (unless you are the CEO, of course, and even then the constraints are arguably more stringent).
  3. Organisational inertia. Not everything is bright and shiny in academia though. There is a tremendous amount of inertia and, even when there is ample funding, it is often very difficult to get some things pushed through bureaucracy because of archaic rules. In fact, on multiple occasions, I have been told not to think of these rules from an engineer's perspective, but rather to accept them as something that just is. These rules are naturally there for a reason, but things such as sub-contracting repetitive work, which I would not think twice about while working at a start-up, have turned out to be virtually impossible at the university. In fairness, this is more a big-organisation issue than specifically a university one.
  4. Data science in research vs. in the wild. Academic energy data science is, in many ways, fundamentally different from what is done or needed in industry. Grad search, as opposed to grid search, means that difficult machine learning and optimization problems can be tackled by simply throwing graduate students at the problem (compute power is optional). The focus is more often than not on batch processing of data, which really does not capture many of the issues that arise with the streaming data commonly encountered in industrial energy data science challenges. DevOps, or MLOps, and good software engineering principles are very much still not a thing in academic settings, even though scaling data science techniques to vast energy networks is a very relevant application area in practice. This is exacerbated by the fact that many academic groups struggle to get access to actual data, which limits the real-world applicability of their results. In industry, the opposite problem exists, often due to a lack of theoretical knowledge: in the last few years, I have seen some truly upsetting things, such as people trying to brute-force combinatorial problems because someone on the internet said so.
  5. Sparse and delayed rewards. In academic research, feedback is sparse and rewards often come with a long delay. As a result, it becomes virtually impossible to say what is good enough in terms of research output. Is publishing 1 journal paper in a year OK? How about 2? Or 4, or 8? What about citations? How do authorship order and venue of publication alter the calculus? These are relevant questions, especially before one has tenure, but their vague nature feeds into both anxiety and rampant impostor syndrome in academia. In industry, especially in functional start-ups, you would rather quickly find out how well you’re performing based on performance reviews. Somewhat counterintuitively, star performers in both academia and industry usually get saddled with higher workloads. In start-ups / industry, you can always try to link your work output to your compensation. This is much less straightforward (or common) in academia, especially in Europe, where there are usually no real performance-related bonuses.
  6. Publish and/or perish. Much has been written about the publish-or-perish mentality in academia, where the sparse, delayed rewards mean that many academics are in a perpetual rush to get one more paper out. In industry, this mentality of rushing through as many releases as possible would be (rightly) perceived as madness, because each release is expected to address customer needs (or pretend to, at least). In other words, what you release has more immediate bearing than how many releases there are in total. This makes me think that rewards in academia may also be misplaced, in addition to being delayed and sparse.
  7. The joys of publishing. One thing that really bothers me is how joyless the publication process can become at some point. Many early-stage PhD students say that they hate writing papers. This is unfortunate, since it is only through communicating science that we can create an ecosystem for independent research to thrive in. However, there is a long, long lead time between the point where one conceives an interesting research question and carries out the research, and the point where it has actually gone through the multiple iterations of the peer review process to get published. When this lead time is long and involves numerous revisions, the joy of discovery is largely decoupled from the act of publication, which becomes a chore. For me personally, writing the first draft is a joy, and the first revision is bearable, but anything beyond that is pure misery, especially because by then I will invariably have moved on to a fresh, new, shiny research project. At least for conferences, there is still the joy of discovering a new city (or airport if you’re unlucky with the timing), but Covid-19 has largely put an end to this for now. In industry, especially in agile teams, you would normally expect much more immediate impact from your work. At the same time, in industry, one does not (often) have the luxury of tiring of a product after its first release, i.e. it must often be iteratively improved for many release cycles. This can naturally get just as repetitive as polishing the seventh draft of a journal paper.
  8. Mentoring. I have noticed quite a few differences between mentoring students at the university (both at MSc and PhD level) and mentoring colleagues at a company. The biggest is arguably the high turnover at the university, especially for MSc students doing their thesis. A few of these students stick around to do a PhD, but most leave to work in industry immediately after defending their thesis, and so you only get a limited window to mentor them. Very often, the research these students start also remains unfinished for this very reason. Mentoring PhD students is, in many ways, harder still. The time limitation is less stringent (a PhD in Belgium is usually funded for 4 years), but the PhD candidate must develop a research mindset in addition to learning the tools of the trade and making an original contribution to the field. Squeezing all of these into four years can be tricky. In my experience in industry, mentoring is simpler, as it focuses a lot more on teaching junior colleagues the tools of the trade and (some) background knowledge, but certainly less on developing a research mindset or making an original contribution to the field.

Despite all this and the tribulations of the pandemic, academia offers a unique place and opportunity to carry out independent research. But, if you are in academic research, do not forget to take a moment to celebrate the next milestone, be it a paper or grant or something else. Good companies make sure that they do this. Academics should too.

--

Hussain Kazmi

Data science, mostly for energy. Postdoc at KU Leuven. Tech-optimist and sci-fi aficionado.