International Women's Day event panellist: Meet Bianca Peterson

Today, our spotlight shines on Bianca Peterson, a senior data scientist at Fathom Data. Bianca serves as the co-chair of CODATA-RDA Schools of Research Data Science and chairs the RDA Interest Group: CODATA-RDA Research Data Science Schools for Low and Middle Income Countries. In addition, Bianca is a Carpentries instructor and trainer, as well as a member of the WITIN Advisory Board. She will also be a panellist at our Women’s Day Event on 8 March 2024.

In short

📽️ What is your all time favourite movie and what makes it so special?

Frozen! The music and animation in this movie are amazing, and the humour is delightful. But most of all, this is a story where the damsel in distress is not rescued by a knight in shining armour. Instead, the hero is female!

🎞️ Frozen Trailer

📚 Can you describe your background and your current role?

What line of work are you in?

Consulting (Data Science).

Give a short summary of what you do in this role

I’m a Senior Data Scientist and also the head of training at Fathom Data. Although my main focus is to do training in computational skills (e.g. Unix, Git, R, Python, SQL, report writing, etc.), I also do data analytics. We do anything and everything related to data - from data capturing to modelling, applications, forecasting, and reporting. Our data engineering team is excellent with data migration and building pipelines for clients working with big data.

What did you study?

B.Sc. Microbiology & Biochemistry; B.Sc. Hons., M.Sc. and Ph.D. in Environmental Sciences (specifically Molecular Biology).

How did you get involved in the tech space?

During my PhD I had sequence data that I needed to analyse, but didn’t have any coding experience. I met Anelda through the Genomics study group that she was running and she suggested that I attend a Software Carpentry workshop. And so my Data Science journey began!

What software do you use on a daily basis in your job?

Unix, Git, R, VS Code, Slack, and Asana.

How did you learn to use these tools?

I was introduced to Unix, Git, and R at the Software Carpentry workshop, and I learned about them again at the first CODATA-RDA School of Research Data Science in Trieste, Italy, in 2016. I only really mastered these tools once I started using them in my daily work. I learned to use VS Code, Slack, and Asana at Fathom Data.

Which tool has the biggest impact on your ability to succeed at work?

Git for version control of my code and documents. It’s non-negotiable and it’s a life saver!

Do you think women are well represented in your line of work?

When I joined Fathom Data in 2022, I was the only female. The previous female employees had changed jobs just before I joined. Since then, I’ve recruited more women and currently 30% of the employees are female. I think that females sometimes feel intimidated by Data Science and prefer to choose alternative careers. Via my training I’m trying to inspire females and show them that Data Science is for everyone.

🌱 Tell us more about your community of practice

Describe the purpose of this community from your point of view.

The aim of these schools is to build core data science skills and introduce open tools and resources for researchers. These are essential skills that you need to work effectively. Our vision is to grow a network of host institutions, instructors, assistants and helpers to increase data-related research capacity in Low and Middle Income Countries.

How did you get involved?

I attended the very first CODATA-RDA School of Research Data Science (SoRDS) in 2016 as a student, while I was still doing my PhD, and became a helper the following year. I later connected with Dr. Martie van Deventer and Prof. Marlene Holmner to organise the very first South African Data Science Summer School in 2020, after which I joined the CODATA-RDA SoRDS as a co-chair. Since then, I have co-organised and co-taught at many schools in Trieste (Italy) and South Africa.

Why do you find it useful to be a part of this community?

The school was invaluable to me, as it helped me to embark on my journey to becoming a Data Scientist. I established life-long friendships and built a network of people with whom I can collaborate, and we all want to pay it forward. I still get great feedback from our participants, who say that the school was life changing and enabled them to do their own data analyses.

Is your community accessible and welcoming to women? If so, how?

Yes, absolutely! These schools are open to any post-graduate students or early career researchers, regardless of their gender.

How can other people become a part of this community?

On our website we have a “Get Involved” section which specifies how people can become helpers, instructors, future hosts, or supporters.

What other communities of practice are you a part of?

  • RDA Interest Group: CODATA-RDA Research Data Science Schools for Low and Middle Income Countries
  • Carpentries instructor and trainer
  • WITIN Advisory Board Member

💡 What advice do you have for women in Humanities and Social Sciences (HSS) eager to grow their computational skills?

There are abundant free and open resources for upskilling, but without a roadmap, you can easily get lost. The best approach is to start improving a single thing. For example, continue using Excel, but at least make sure that the data is tidy. Then try to control the versions using Git. Eventually, you can try to make a plot in R or Python. My advice is: Just start somewhere.

In hindsight, I would say that the following tools are essential for academic success (and I wish I had learned about them when I was a student):

  • UNIX: to navigate in a terminal and use commands to easily process files in batches.
  • Git: for version control of code and/or files (also serves as a backup so that you don’t need to worry about your laptop getting stolen or breaking).
  • R or Python: for data cleaning, analysis, and visualization.
  • SQL: to retrieve data from databases, especially when you work with big tabular datasets.
  • LaTeX: for writing manuscripts.
  • Mendeley or Zotero: for managing references.
  • Markdown: for reproducible report writing (this is like a digital lab book where you can write notes, add code and images, or even make plots with code).
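
To make the "tidy data first" advice above a little more concrete, here is a minimal Python sketch. It is an illustration only, not something taken from Bianca's own training material. The sample data, the column names, and the helper `wide_to_tidy` are all hypothetical. The sketch assumes a spreadsheet exported as CSV with one column per year, and reshapes it so that each row holds a single observation.

```python
import csv
import io

# Hypothetical "wide" spreadsheet export: one column per year.
WIDE_CSV = """site,2021,2022
A,10,12
B,7,9
"""


def wide_to_tidy(text, id_column, variable_name, value_name):
    """Reshape wide CSV text into tidy rows (one observation per row)."""
    reader = csv.DictReader(io.StringIO(text))
    tidy_rows = []
    for row in reader:
        for column, value in row.items():
            if column == id_column:
                continue  # the identifier is repeated on every tidy row instead
            tidy_rows.append(
                {
                    id_column: row[id_column],
                    variable_name: column,
                    value_name: int(value),
                }
            )
    return tidy_rows


# Illustrative names only: "site", "year", and "count" are assumptions.
tidy = wide_to_tidy(WIDE_CSV, "site", "year", "count")
for row in tidy:
    print(row)
```

Once the data has this shape, saving it as a plain CSV file and tracking that file with Git is a natural next step, and most plotting tools in R or Python can work with it directly.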