Strange Things Are Afoot...

Updates for 2023

January 04, 2023 · 5 mins read

…In 2023

When I started this blog almost a year ago, I was coming to the end of a work contract, had a significantly reduced workload, had an injured knee stopping me from going out climbing (my usual stand-in activity for work), and I was looking for a way of keeping myself busy that would be beneficial to my professional development. I am a bit of a computer nerd (apparently) and have always found using my domain expertise to educate and inspire a fulfilling experience, so building and hosting a data science blog seemed like the logical solution.

About two months after starting the blog, I landed a new job. A couple of months later, my other half and I bought our first house. So, on top of all the “fun” that has entailed, I am involved in a handful of pioneering clinical studies, am working on two arms of the 100,000 Genomes Project, and have several academic research collaborations to tend to. I have also been lucky enough to be embedded in the University of Sheffield Bioinformatics Core Facility, providing me with more opportunities to teach bioinformatics and data analysis through lecturing, delivering small group-based teaching, and one-to-one tutoring of graduate students and university staff. The blog has naturally played second fiddle to these other commitments.

During the short break over the Christmas holidays, I have had some time to think about the shape the blog has taken during the last year. While I have learned a lot about content creation in this time, I have also noted that some of my original goals for the blog have been overlooked and have identified a couple of things that could be improved upon.


“Strange things are afoot at the Circle K” by Jeff Boyes

One thing I realised is that the longer I have been in my current role, the less frequent my posts have been. This was something I wanted to avoid. Part of the problem has been that I have tended towards writing tutorials, and that these have primarily been aimed at beginners. Generally speaking, written tutorials are an inefficient medium in terms of composition time versus volume of information delivery. This is doubly true when writing for a naive audience.

A second thing that I realised is that I have developed a huge list of blog post and data science project ideas over the last year or so but have barely scratched the surface of the list. I also realise that I will never get through even a small fraction of the list unless I find a cunning way of tackling them more effectively alongside my current (and growing) workload.

Something that also hasn’t escaped my attention is that my posts have generally gotten longer and have frequently been split into multiple parts, despite having tried to set personal limits on the maximum read time. This was coupled with the recent realisation that there are many topics I want to cover that will make for very lengthy written tutorials, and some that will be practically impossible to deliver in that format.

A final thing I have noted is that I have tended towards writing tutorials in R. This is probably because I use it a lot at work and it’s easier for me not to have to switch to another language on any given day. Given that I also program in Bash, Python and SQL, am well versed in Nextflow, and am proficient in the use of several powerful command-line tools such as Git, I would like to make more content demonstrating these skills.

Having gone through this process of critical appraisal, I have concluded that a data science blog shouldn’t just be a repository for tutorials but should also offer insight into both the data analysis process and author mindset. I have also concluded that seeing more of my ideas come to fruition is a priority, as is covering more advanced topics that will stretch my knowledgebase and help identify areas that I might need to brush up on.

What’s New?

The biggest news is that I have started a YouTube channel. My hope is that I can use it to implement a bunch of changes to address the issues discussed above. Anyone who reached this article via my blog webpage might have already spotted the new link to the channel in the site navigation bar. For those of you who got here via an alternative route such as LinkedIn, the channel can be found here. I haven’t yet uploaded any videos at the time of writing (4th January 2023) but the first one should be up within the next few days.

Here’s a quick look at what the major changes to the blog will entail:

1. Tutorial Videos

My plan is to use screen capture videos for the majority of tutorials, both those aimed at beginners and those for which writing a post would be unnecessarily difficult or time-consuming. This new format should greatly increase information delivery bandwidth and will theoretically be much quicker for me to put together.

2. Conventional Blog Posts

The new and improved video tutorial format should mean that I have more time to use the blog space as an actual blog containing reviews, personal insights, and posts on more succinct topics that make the most sense as an article.

3. Project Screencasts

The thing I am personally most excited for is the prospect of screencasting some data science projects. What I am thinking of doing here is regularly getting hold of a real-world data set that I haven’t previously worked with, spending around 30 minutes exploring the data and narrative ideas, then analysing the data in an unscripted screencast using either R or Python over the course of an hour or so. Hopefully this will provide a mutually beneficial opportunity for me to get through some of my backlog and tackle some more stimulating problems while giving insight into the process, mistakes and troubleshooting that happen during real-life data analysis work in both these languages.

4. Access to My Code

I will make the code I write during each screencast available through my GitHub after the respective video goes live. Hopefully this will allow people to experiment and see how my scripts work. I also plan on including the notes and ideas that come out of my initial brainstorming so that anyone can continue an analysis from where I left off in any direction they choose.

5. Opportunities for Interaction and Feedback

One thing I had hoped for during the last year was for more people to get in touch and either request topics or give me some feedback, be it positive or negative. I am hoping that the YouTube format will give me more opportunities for audience interaction via the comments section, and that more people will use the new-format blog posts to get in touch via the contacts page to discuss ideas and possibilities I present throughout the posts.

6. A New URL

The last major change I should mention is that I will be changing the site URL very shortly to www.lewisdoesdata.com now that it houses more than just a blog. Other than the link to the YouTube channel, navigating the site should remain largely the same.

That’s pretty much the full round-up. I hope that you’re as excited for the 2023 overhaul as I am. If not, see point 5 above and shout up.

Catch you later 🤙

. . . . .

Thanks for reading. I hope you enjoyed the article and that it helps you to get a job done more quickly or inspires you to further your data science journey. Please do let me know if there’s anything you want me to cover in future posts.

Happy Data Analysis!

. . . . .

Disclaimer: All views expressed on this site are exclusively my own and do not represent the opinions of any entity whatsoever with which I have been, am now or will be affiliated.