Recent Posts

Lots of fun with Postgres and Python timezone shenanigans!

There are few things developers love more than having to handle timezones. One of them is having to handle timezones in different environments!

Lately I had to deal with some timezone operations across Python and Postgres, so I decided to document here the quirks of the two systems and how I try to avoid them.

TIMESTAMP WITH TIME ZONE does NOT store a timezone

This is something I knew already but it irks me every time I remember it exists.
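A minimal sketch of the idea in plain Python, with no database involved: it only mimics the behavior by normalizing an aware datetime to UTC, which is what a TIMESTAMP WITH TIME ZONE column keeps. The sample date and the +02:00 offset are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# A timestamp written with a +02:00 offset (hypothetical sample value).
original = datetime(2021, 6, 1, 12, 0, tzinfo=timezone(timedelta(hours=2)))

# Mimic what TIMESTAMP WITH TIME ZONE keeps: the instant, normalized to UTC.
# The original +02:00 offset is not part of the stored value.
stored = original.astimezone(timezone.utc)

print(stored.isoformat())  # 2021-06-01T10:00:00+00:00
print(stored == original)  # True: same instant in time
print(stored.utcoffset())  # 0:00:00: the source offset is gone
```

The instant survives the round trip, but the zone it was written in does not. If that zone matters, it has to be stored separately.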

...

Implement a CHIP-8 emulator in Python

For quite some time I entertained the idea of implementing an emulator. My knowledge of low-level programming is mostly theoretical, and this would be a good chance to learn more and to experiment with optimizations I rarely encounter in my usual machine learning tasks (which are based on libraries like NumPy and scikit-learn that already take care of the heavier operations).

The Game Boy is an obvious candidate: it is a console I had as a kid, it is well documented, and there are many existing implementations, including one in Python.

...

Insert data into Postgres. Fast.

Ingesting data into Postgres is a common task in my job as a data engineer, and also in my side projects.

Along the way I learned a few tricks that I’m going to discuss here, in particular for ingesting data from Python and merging it with existing rows.

Before starting, I have to say that the fastest way to insert data into a Postgres DB is the COPY command. It has a counterpart, \copy, in the psql CLI tool, which is useful for invoking it from a remote client.
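As a rough sketch of what COPY can look like from Python, the snippet below serializes rows to an in-memory CSV file and streams it with COPY ... FROM STDIN. It assumes a psycopg2 connection and its `copy_expert` cursor method. The helper names `rows_to_csv_buffer` and `copy_rows`, and the table and column names in the usage note, are hypothetical and not taken from the post.

```python
import csv
import io


def rows_to_csv_buffer(rows):
    """Serialize an iterable of row tuples into an in-memory CSV file."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    buffer.seek(0)  # rewind so the consumer reads from the start
    return buffer


def copy_rows(conn, table, columns, rows):
    """Stream rows into `table` with COPY ... FROM STDIN.

    Assumption: `conn` is an open psycopg2 connection, and `table` and
    `columns` are trusted identifiers (they are interpolated as-is).
    """
    sql = f"COPY {table} ({', '.join(columns)}) FROM STDIN WITH (FORMAT csv)"
    with conn.cursor() as cur:
        cur.copy_expert(sql, rows_to_csv_buffer(rows))
    conn.commit()
```

With a live connection this would be called as something like `copy_rows(conn, "events", ["id", "name"], [(1, "a"), (2, "b")])`. The CSV helper can be tried on its own without a database.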

...

Generate a grammar quiz in 300+ languages using simple NLP

In this article I’ll explain how I populated the database that powers grammarquiz, a grammar quiz app that I created for the Kotoeba initiative. The code of the application is freely available.

The backstory

As you may already know, Tatoeba is a database of sentences translated into different languages. At this time (early 2021) the database is almost 10 million sentences strong and keeps growing. The dataset can be downloaded and used under an open license, similar to Wikipedia or OpenStreetMap, which makes it very interesting for users who, like me, have an interest in NLP and languages.

...