Future Blog Post

less than 1 minute read

Published: January 01, 2199

This post will show up by default. To disable scheduling of future posts, edit _config.yml and set future: false.

Building Efficient Data Pipelines with Polars: A Step-by-Step Guide

4 minute read

Published: December 09, 2024

Data pipelines are at the heart of modern data engineering. They allow you to automate data ingestion, cleaning, transformation, and feature generation—making your datasets ready for analysis or machine learning tasks.

Understanding Cron vs Anacron: Which Scheduler Fits Your Needs?

1 minute read

Published: November 24, 2024

Scheduling tasks on Linux is essential for automating system maintenance, backups, or repetitive jobs. Two popular tools for this are Cron and Anacron. While they might seem similar, they serve different purposes and are optimized for different environments. Let’s break it down.

Docker Best Practices

13 minute read

Published: September 10, 2024

Docker is simple. Production is not. Most Docker problems don’t come from Docker itself; they come from small shortcuts that compound over time. In this post, we’ll cover practical Docker best practices that keep images small, builds fast, and deployments predictable.

Dockerised REDCap Installation for Research Data Management

3 minute read

Published: August 14, 2024

Running REDCap locally on your PC shouldn’t be fragile or painful. In this post, I walk through how I Dockerised REDCap to create a portable, reproducible, and easy-to-maintain research data platform on my laptop.

Franklin Okech