Author: David Tang

Reflecting on programming in 2023

The year is coming to an end, which makes this a good point to look back. This is a short rewind of 2023 from a programming perspective.

I had the opportunity to incorporate a couple of new frameworks in my work this year.

Read on for what I have experienced, and my takeaways.

1. Ploomber

This is a project management tool for a data science workflow.

It is similar to other frameworks such as Apache Airflow. It lets you structure a project as a DAG (directed acyclic graph), where each task runs only after the tasks it depends on.
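To make the DAG idea concrete, here is a minimal sketch in plain Python. It is not Ploomber's API, and the task names are made up for illustration. It only shows how a set of dependencies determines a valid run order, using the standard library's graphlib module (Python 3.9+).

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
pipeline = {
    "load": set(),
    "clean": {"load"},
    "features": {"clean"},
    "train": {"features"},
    "report": {"train", "clean"},
}


def execution_order(dag):
    """Return one valid run order where every task follows its dependencies."""
    return list(TopologicalSorter(dag).static_order())


order = execution_order(pipeline)
print(order)
```

Because the graph has no cycles, such an order always exists. If two tasks depended on each other, TopologicalSorter would raise a CycleError instead.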

Ploomber logo. Using an otter as a mascot is an automatic win for me.

According to the developers, it is better suited than Airflow for ML projects, as it is targeted more towards the development or exploratory stage. What it does not aim to do is data orchestration or running regular production tasks.

Advantages of Ploomber:

 

  • low technical burden (just a package that can be installed in conda)
  • can include unit tests, debugging
  • keeps track of any new changes in project files
  • easily rerun parts of the projects, where scripts have been edited
  • can incorporate various languages, such as SQL and R, within the pipeline
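The "rerun only what changed" point is the one I find most useful, so here is a rough sketch of the idea. This is my own simplified illustration, not how Ploomber is implemented: given a hypothetical pipeline and the set of tasks whose scripts were edited, it works out which tasks are now out of date.

```python
def tasks_to_rerun(dag, edited):
    """Return the edited tasks plus everything downstream of them.

    dag maps each task to the set of tasks it depends on (its upstream).
    """
    stale = set(edited)
    changed = True
    # Keep sweeping until no new task is marked stale.
    while changed:
        changed = False
        for task, upstream in dag.items():
            if task not in stale and upstream & stale:
                stale.add(task)
                changed = True
    return stale


# Hypothetical pipeline, for illustration only.
pipeline = {
    "load": set(),
    "clean": {"load"},
    "features": {"clean"},
    "train": {"features"},
    "report": {"train", "clean"},
}

print(sorted(tasks_to_rerun(pipeline, {"features"})))
# → ['features', 'report', 'train']
```

Editing the features script leaves load and clean untouched, so their outputs can be reused.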

Disadvantages of Ploomber:

  • probably not needed for scripting and quick prototyping

Takeaway: Data projects are quickly evolving, and we need to be able to pivot flexibly at times. It is useful to have an equally flexible pipeline in place to take care of making sure upstream and downstream scripts & outputs are kept up-to-date.

2. WSL2 and Bash scripting

This was more of a lesson for me in managing environments in a virtual machine.

WSL2 = Windows Subsystem for Linux 2. It basically adds a virtualised Linux command-line system that you can access from within Windows itself.

I am currently running Ubuntu with Bash on WSL2. So far, I find it sufficient for doing everything data science related from the command line. Git works nicely within WSL2 as well, and I can integrate everything via my favourite IDE (integrated development environment), Visual Studio Code.

WSL2 – Best of both worlds, Unix and Windows

Pros:

  • can run pretty much anything, as if within a native Linux environment
  • don’t need to fiddle with installing similar commands into the Windows command line (cmd)
  • WSL2 is a one-liner setup (wsl --install) on the Windows command line now, with newer versions of Windows

Cons:

  • sometimes the virtualisation gets a bit abstract
  • need to be mindful to connect/execute within the right command line (WSL2 vs Windows native)
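One small guard against running something in the wrong shell is to have a script check where it is running. The sketch below relies on a common heuristic rather than an official API: WSL kernels usually include "microsoft" in their release string. The function name is_wsl is my own.

```python
import platform


def is_wsl(release=None):
    """Heuristic check: WSL kernels usually report 'microsoft' in the release string."""
    if release is None:
        # Ask the running system when no release string is supplied.
        release = platform.uname().release
    return "microsoft" in release.lower()


if is_wsl():
    print("Running inside WSL")
else:
    print("Not running inside WSL")
```

The optional release argument is only there so the check can be tried out with example strings without needing a WSL machine.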

Takeaway: With big data/cloud computing, we’ll need to deal with remote machines at times. It can get quite abstract but beats running everything on a local machine at times. Getting comfortable with navigating Unix systems is handy.

3. Handlebars

This is a framework that was a bit of a tryout for me. It is based in the web development space, and not so much typical analytics programming.

What is this? It’s a template compiler. Basically, it compiles templates into JavaScript functions that produce HTML. Apparently, some code and speed optimisations happen under the hood as well.

Handlebars logo. Kind of reminds me of Mr. Potato Head for some reason.

I had quite a bit of fun with using ‘moustaches’ in my code too!

If you don’t know what they are, they are these {{curly braces}}. They are pretty versatile and can do a little bit more than your usual declared variable. For example, one can use moustaches either to load whole templates of website sections (partials) or to call helper functions.
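To show the basic idea behind moustaches, here is a toy sketch in Python. It is not Handlebars and covers none of its features such as partials or helpers. It only mimics the simplest case, where each {{name}} is swapped for an HTML-escaped value. The render function and the template are made up for illustration.

```python
import re
from html import escape


def render(template, context):
    """Replace each {{name}} with the HTML-escaped value from context."""

    def substitute(match):
        key = match.group(1)
        # Missing keys render as an empty string.
        return escape(str(context.get(key, "")))

    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)


page = render(
    "<h1>{{title}}</h1><p>Hello, {{ name }}!</p>",
    {"title": "My blog", "name": "A & B"},
)
print(page)
# → <h1>My blog</h1><p>Hello, A &amp; B!</p>
```

The template stays readable as plain HTML, while the data is filled in separately.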

I don’t have much by way of pros & cons on this one. I’ll probably need a bit more time to explore it.

With frontend web development, there is an abundance of frameworks, which can be a bit overwhelming! It feels like I’m spoilt for choice just for JavaScript alone (e.g. Vue, Angular, React).

Takeaway: Programming is just a tool and a means to an end. I see the frontend as a means to communicate and put ideas out there. I am pretty sure it will come in handy at some point!

Closing

I am really thankful to be able to geek out on these frameworks. It is like getting a new tool, testing it out, and actually building something from it.

Perhaps, at this point, just looking at a few frameworks might not have amounted to something tangible in the real world. I’m sure it will all add up at a later time.

“The best way to predict the future is to create it” -Peter Drucker

Hello World!

Welcome to this blog and stay for the code!

Preface & Intro

If you are reading this, welcome to the opening post of this Imperial blog! 🥳

I have just got it approved, and what better way to celebrate than to post something.

 

Just a 2-minute introduction here:

I’m a research assistant in the Department of Epidemiology and Biostatistics. Before this, I was a clinician in a third-world setting for a good many years, and then moved on to do the MSc Health Data Analytics and Machine Learning at Imperial College.

 

Aspiration

Recently, I have been feeling stuck in a bit of a ‘rut’. I am starting this blog to share general posts on topics including, but not limited to:

    • experiences with learning programming
    • programming or coding in academia
    • testing out packages or libraries in R or Python

Hopefully, this will help me process or reflect better on my journey. I’m thinking about it as moving sideways for a while. I don’t intend to come across as self-serving. But, as a side effect, I see that this writing can at least help someone in the future. I’d love to get better at writing to an audience and delivering value.

I am still figuring things out around here, but will be coming back every now and then to update things!

Shoutout

If you happen to read any posts, please drop a comment below. Seriously, I’d love it, even if it’s to let me know you are here!

If you are curious about who I am, follow this page here.