Comparing the seaborn objects interface with ggplot2

The seaborn objects interface, introduced in version 0.12.0, is a new plotting system based on the Grammar of Graphics, similar to R’s ggplot2. It offers a more consistent and flexible API built from composable classes for transforming and plotting data. Plots can be specified and customized end to end without dropping down to the matplotlib level, which suits complex figures with multiple layers and mark types. Although the interface is still experimental and incomplete, it provides a modular, Pythonic API informed by ggplot2’s design philosophy. In this post, I replicate the plots from the ggplot2 book using the seaborn objects interface.


Comparing dplyr with polars

Data manipulation is one of the most common and essential tasks in data analysis. Whether you are working with tabular data, time series, spatial data, or any other kind of data, you need to be able to perform operations such as filtering, grouping, aggregating, joining, and reshaping. In the R world, the dplyr package is one of the most popular and powerful tools for data manipulation. It provides a consistent and expressive syntax based on tidyverse principles, which makes it easy to write readable and maintainable code. Many other packages extend its functionality, allowing you to work with large and diverse data sources.


Hanukkah of Data 2023

Are you ready for a thrilling data adventure? Noah’s Rug is an awesome data game that challenges you to solve puzzles using a fictional dataset. Every day of Hanukkah, you can light a candle and discover a piece of a mysterious tapestry. After the last day, you can test your skills in speedrun mode. You can use any tools you want to explore the data and find the answers. This is a collection of my solutions with some explanations.


Introduction to Bayesian Data Analysis

Bayesian statistics is a powerful and elegant framework for data analysis that allows us to incorporate prior knowledge, uncertainty, and learning from data into our statistical reasoning. In this entry, I will introduce the conceptual foundations of Bayesian data analysis and show how it can help us solve challenging problems that classical statistics may struggle with.
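As a small illustration of how prior knowledge and data combine, the following sketch uses the conjugate Beta-Binomial model, a standard textbook case. The prior parameters and the observed counts are invented for the example, and the helper names are hypothetical.

```python
def beta_binomial_update(prior_a, prior_b, successes, failures):
    """Return the Beta posterior parameters after observing binomial data.

    With a Beta(a, b) prior on a success probability, observing s successes
    and f failures gives a Beta(a + s, b + f) posterior.
    """
    return prior_a + successes, prior_b + failures


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


# Assumed prior: Beta(2, 2), a mild belief that the rate is near 0.5.
prior = (2.0, 2.0)

# Assumed data: 7 successes and 3 failures.
posterior = beta_binomial_update(*prior, successes=7, failures=3)

print("prior mean:    ", beta_mean(*prior))
print("posterior mean:", beta_mean(*posterior))
```

The posterior mean lands between the prior mean (0.5) and the observed proportion (0.7), which is the "learning from data" idea in its simplest form: more data pulls the estimate further toward the observed rate.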
