Jacopo Farina's blog

Recent Posts

Lots of fun with Postgres and Python timezone shenanigans!

There are few things developers love more than having to handle timezones. One of them is having to handle timezones in different environments! Lately I had to deal with some timezone operations across Python and Postgres, and decided to document the shenanigans and quirks of the two systems here, along with how I try to avoid them. TIMESTAMP WITH TIME ZONE does NOT store a timezone. This is something I knew already but it irks me every time I remember it exists....
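As a rough illustration of that last point, here is a minimal Python sketch. It does not talk to Postgres at all: it only mimics the documented behaviour that a TIMESTAMP WITH TIME ZONE value is normalised to UTC when stored. The fixed +02:00 offset labelled "CEST" and the variable names are assumptions made for this example.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical stand-in for a summer wall-clock time in Berlin (UTC+2).
cest = timezone(timedelta(hours=2), "CEST")
local = datetime(2023, 7, 1, 12, 0, tzinfo=cest)

# Normalising to UTC keeps the instant but drops the original offset,
# which is analogous to what ends up stored in a timestamptz column.
stored = local.astimezone(timezone.utc)

# The two values still describe the same instant in time...
same_instant = stored == local

# ...but the stored value only knows about UTC, not about "CEST".
original_zone = local.tzname()
stored_zone = stored.tzname()

print(local.isoformat(), original_zone)
print(stored.isoformat(), stored_zone)
```

If the original zone matters, it has to be kept separately, for example in its own column.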

Implement a CHIP-8 emulator in Python

For quite some time I entertained the idea of implementing an emulator. My knowledge of low-level programming is mostly theoretical and this would be a good chance to learn more, and also to experiment with optimizations I rarely encounter in my usual machine learning tasks (which rely on libraries like NumPy and scikit-learn that already take care of the heavier operations). The Game Boy is an obvious candidate, being a console I had as a kid, well documented and for which there are many existing implementations including a Python one....

Making a fully static map, part 3: Text search

NOTE: a complete interactive demo of the final result is here. In the previous post of this series, we saw how to generate vector tiles starting from an OpenStreetMap PBF extract using Tilemaker. After the article I refined the process and wrote a Python tool to automate it, adding the possibility to index named objects like streets and shops. Usually, such a search would be performed using a geocoding service that can handle the full text search with all the nuances like alternative spellings, typos and ambiguities....

Making a fully static map, part 2: Vector tiles

UPDATE: I created a Python tool to automate this process, including a refined style and packaging. I suggest using it. In the previous post of this series we saw that an extract of the data from OpenStreetMap can be easily transformed into a set of raster tiles, essentially fragments of the map at different levels of zoom, arranged in a structure that enables a library like Leaflet.js to fetch them as needed when the user zooms and pans on the map....
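To make the tile structure more concrete, here is a small Python sketch of the widely used XYZ ("slippy map") convention, in which each tile is addressed by zoom level, column and row. The teaser does not say which layout the post uses, so the scheme, the `.png` extension and the function names `tile_for` and `tile_path` are assumptions for illustration only.

```python
import math


def tile_for(lat_deg, lon_deg, zoom):
    """Return the (x, y) tile indices containing a point, XYZ scheme."""
    n = 2 ** zoom  # number of tiles per side at this zoom level
    x = int((lon_deg + 180.0) / 360.0 * n)
    lat = math.radians(lat_deg)
    # Web Mercator projection of the latitude, scaled to tile rows.
    y = int((1.0 - math.asinh(math.tan(lat)) / math.pi) / 2.0 * n)
    return x, y


def tile_path(lat_deg, lon_deg, zoom):
    """Build a relative path like 'z/x/y.png' (assumed file layout)."""
    x, y = tile_for(lat_deg, lon_deg, zoom)
    return f"{zoom}/{x}/{y}.png"


# At zoom 0 the whole world fits in a single tile.
print(tile_path(0.0, 0.0, 0))
# Each additional zoom level doubles the number of tiles per side.
print(tile_path(0.0, 0.0, 1))
```

A map library can then request only the files covering the current viewport, which is what allows the tiles to be served as plain static files.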

Making a fully static map, part 1: Generate raster tiles from QGIS

In this article we are going to implement an interactive map that can be included in a fully static website. By fully static I mean that the map does not rely on any external service or backend: it is just a bunch of files served directly by nginx (like this blog) or even a CDN. This approach is generally cheaper and simpler to operate, maintain and migrate, without depending on external services whose terms of use may change....

Lessons learned using Postgres in production

On April 12th, 2022 I will give a talk at PyCon Berlin about how we use Postgres in a data science project at Flixbus. These are the slides for this presentation; you can contact me on the conference Discord, Twitter, GitHub or in person at the venue. Download the presentation...

Render a building in 3D from OpenStreetMap data

For quite some time I have had an interest in GIS and rendering, and after experimenting with the two separately I decided to finally try and render geographical data from OpenStreetMap in 3D, focusing on a small scale never bigger than a city. In this article I will go through the process of generating a triangle mesh from a building shape, rendering and exporting it in a format suitable for Blender or game engines like Godot....

Insert data into Postgres. Fast.

The task of ingesting data into Postgres is a common one in my job as a data engineer, and also in my side projects. As such, I learned a few tricks that I’m going to discuss here, in particular related to ingesting data from Python and merging it with existing rows. Before starting, I have to say the fastest way to insert data into a Postgres DB is the COPY command, which has a counterpart \copy on the psql CLI tool that is useful to invoke it remotely....

Generate a grammar quiz in 300+ languages using simple NLP

In this article I’ll explain how I populated the database that powers grammarquiz, a grammar quiz app that I created for the Kotoeba initiative. The code of the application is freely available. The backstory: as you may already know, Tatoeba is a database of sentences translated into different languages. The database is at this time (early 2021) almost 10 million sentences strong and keeps growing. The dataset can be downloaded and used with an open license, similar to Wikipedia or OpenStreetMap, which makes it very interesting for users who, like me, have an interest in NLP and languages....

Correlating sleep duration and flashcard performance

For a few years I have regularly used Anki, a flashcard system, to memorize and remember information, in particular German words. This daily activity sometimes feels like a breeze and sometimes more like an endless chore. It requires focus, and some days I have the feeling that I’m forgetting words and concepts that normally are a piece of cake. Anki provides reports about the usage, which in my case don’t show any particular pattern (e....

Add punctuation to text using Skorch

Note: The whole code for this article is freely available on GitHub. A few weeks ago I saw a talk about Skorch, a library that wraps a PyTorch neural network to use it as a Scikit-learn model. That is amazing: I can take an existing product based on, say, a random forest, and replace only the model without refactoring anything else, since the fit and predict functions have the usual interface. On the other hand, I can use the powerful tools offered by Scikit-learn, like the grid search for hyperparameters and make_pipeline to apply encoders....

Create an animated heatmap from a Google location data Takeout export

I love to go around by bike, and Berlin offers a good choice of paths to explore. However, after some years in the city I realized there were areas I had never visited and routes I took so often that they became boring. Out of curiosity I tried to process my own location history to map the places I visited more often and, to tell the routine commute habit apart, visualize the time of the day of a visit....