Jacopo Farina's blog

Recent Posts

Visualize the functioning of supervised learning models – part 4: Neural networks

After trying regression using k-neighbours, linear and SVR models, I wanted to conclude using neural networks. I took the five deep learning courses by Andrew Ng on Coursera to get a grasp of these models, and decided to use Keras. This library makes defining, training and applying a model quite easy, once one has an idea of what to use. Artificial neural networks: the naming is suggestive and one may think the goal is to replicate the human brain, but a neural network is more akin to a generic and very flexible mathematical function....

Gensim: a generator is not an iterator

When using Gensim word2vec on a dataset stored in a database, I was pleased to see that the library accepts an iterator to represent the corpus, which allows processing bigger-than-memory datasets. So I wrote a generator function to stream text directly from the database, and came across a strange message: TypeError: You can't pass a generator as the sentences argument. Try an iterator. Looking at the Gensim code, this is intended and there is a good reason for it: while Gensim is fine with iterating over the dataset, it may need to iterate over it more than once....
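The difference can be shown without Gensim at all. The following is a minimal sketch, not code from the article: `RestartableCorpus`, `generator_corpus` and `fetch_rows` are hypothetical names, and a plain list stands in for the database. A generator can be consumed only once, while an object whose `__iter__` method starts a new pass each time can be read as often as needed.

```python
class RestartableCorpus:
    """Iterable that starts a fresh pass over the data on every iteration."""

    def __init__(self, fetch_rows):
        # fetch_rows is a hypothetical callable that returns a new iterable
        # of text rows each time it is called (e.g. by re-running a query).
        self.fetch_rows = fetch_rows

    def __iter__(self):
        for row in self.fetch_rows():
            yield row.split()


def generator_corpus(rows):
    """Plain generator: it can be consumed only once."""
    for row in rows:
        yield row.split()


rows = ["the cat sat", "the dog barked"]  # stand-in for a database table

gen = generator_corpus(rows)
first_pass = list(gen)
second_pass = list(gen)  # the generator is already exhausted here

corpus = RestartableCorpus(lambda: rows)
pass_one = list(corpus)
pass_two = list(corpus)  # a new pass yields the same sentences again
```

An object of this kind should be suitable wherever a library needs several passes over the data, since each `for` loop over it calls `__iter__` again.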

Visualize the functioning of supervised learning models – part 3: K-neighbours and decision trees

After trying regression using linear and SVR models, I wanted to try two other methods offered by scikit-learn, based on different principles: K-nearest neighbors and decision trees. Nearest Neighbors: K-nearest neighbors is straightforward in this case; with K=1 it turns the picture into a mosaic (a Voronoi diagram based on the sampled points). [Figure: Nearest neighbor model with K=1 and 1000 samples.] Increasing the value of K, the model uses more points to predict the color of each pixel, averaging them and consequently smoothing the zones:...
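The averaging effect of K can be sketched in a few lines of plain Python. This is an illustration under assumptions, not the scikit-learn implementation used in the article: `knn_predict` and the three sample pixels are made up for the example.

```python
import math


def knn_predict(samples, query, k):
    """Average the colors of the k sampled pixels closest to `query`.

    samples: list of ((x, y), (r, g, b)) pairs
    query:   (x, y) coordinates of the pixel to predict
    """
    nearest = sorted(samples, key=lambda s: math.dist(s[0], query))[:k]
    return tuple(
        sum(color[channel] for _, color in nearest) / len(nearest)
        for channel in range(3)
    )


# Three hypothetical sampled pixels: one red, one green, one blue.
samples = [
    ((0, 0), (255, 0, 0)),
    ((10, 0), (0, 255, 0)),
    ((0, 10), (0, 0, 255)),
]

pixel_k1 = knn_predict(samples, (1, 1), k=1)  # copies the closest sample
pixel_k2 = knn_predict(samples, (4, 1), k=2)  # blends the two closest samples
```

With K=1 every pixel copies the color of its closest sample, which produces the mosaic; with a larger K neighbouring colors are blended.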

Visualize the functioning of supervised learning models – part 2 – SVR and GridSearchCV

In the previous article we used a linear regression model to predict the color of an image pixel given a sample of other pixels, then used a hand-written function to enrich the coordinates and add non-linearity, seeing how it improves the result. Without it, we can only get a gradient image. Again, all the code is visible in the notebook. I had fun playing with the enrichment function, but scikit-learn offers kernel methods out of the box....
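As an illustration of what enriching the coordinates can mean, here is a hypothetical enrichment function. The actual function in the notebook is not shown in this excerpt, so the name `enrich` and the choice of terms are assumptions. A linear model fitted on these features can represent curves in the original (x, y) space, not only a plain gradient.

```python
import math


def enrich(x, y):
    """Hypothetical enrichment: add non-linear terms to the raw coordinates."""
    return [x, y, x * y, x ** 2, y ** 2, math.sin(x), math.sin(y)]


features = enrich(2.0, 3.0)
```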

Visualize the functioning of supervised learning models

ConvnetJS offers a demo of a neural network which paints an image by learning to predict the color of a pixel given its coordinates. I liked the idea as it is immediate and visually appealing, and decided to create a visual comparison of various supervised learning models applied to this toy problem. In this and further articles I will review the results. All the code is in this Jupyter notebook. Given an image with the 3 RGB channels, a train and test dataset can be created just by sampling random pixels....
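The sampling step could look like the sketch below. It is not taken from the notebook: `sample_pixels` is a hypothetical helper, the image is a small synthetic nested list instead of a real picture, and the 15/5 split is arbitrary.

```python
import random


def sample_pixels(image, n_samples, seed=0):
    """Pick n_samples distinct random pixels.

    image: list of rows, each row a list of (r, g, b) tuples.
    Returns (X, y): X holds [x, y] coordinates, y the matching colors.
    """
    rng = random.Random(seed)
    height, width = len(image), len(image[0])
    coords = [(col, row) for row in range(height) for col in range(width)]
    chosen = rng.sample(coords, n_samples)  # sampling without replacement
    X = [[col, row] for col, row in chosen]
    y = [image[row][col] for col, row in chosen]
    return X, y


# Synthetic 8x6 image: red grows with x, green grows with y.
image = [[(x * 10, y * 10, 0) for x in range(8)] for y in range(6)]

X, y = sample_pixels(image, 20)
X_train, X_test = X[:15], X[15:]
y_train, y_test = y[:15], y[15:]
```

Since the pixels are drawn without replacement, the train and test sets share no coordinates.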

Fleximatcher, a library to help parse natural language

Note: this is an old article, and while the software is still available and I still think the idea is nice, you probably want to take a look at this. The software described here is probably more flexible, but harder to use. Some months ago, I stumbled across this amazing article about transforming an arbitrary English text into a patent application. The underlying pattern library allows, among other nice things, finding patterns like “*The [an adjective] [a noun] and the [a noun]*” easily, looking up hypernyms (“is a” relationships between expressed concepts, for example “animal” is a hypernym of “cat”) in WordNet, and conjugating verbs in various languages, including Italian....

Create a simple infrared webcam using a Raspberry Pi, Pi noir and Flask

Some months ago I bought a Raspberry Pi B+ and a Pi noir sensor, with the idea of using it as a small server and taking IR pictures. Once the sensor is attached, it is possible to take pictures with the raspistill command: raspistill -o picture.jpg. It has many parameters to regulate exposure, color and image format, but this usage is perfect for most cases. Since the human eye cannot see within the infrared spectrum (wavelengths above 750 nm), and consequently computers are not equipped to represent it, the camera has to remap colors to make room for it....

Eliza, in Italian

“Terminator” means apprentice in Polish, and since that is not a particularly catchy name, the film was released in Poland under the title “electronic murderer”. The creation of thinking machines has always been portrayed as imminent by science fiction cinema. One of my earliest memories of computing in general is reading, as a child, a description of Eliza. It is a program created in 1966 that tries to imitate a chat with a psychologist, and it is probably the first chatbot in history....

Seam Carving in Pure Java

Today I tried to implement seam carving in pure Java, without dependencies. In a few words, it’s a method to automatically reduce the size of an image while keeping as much detail as possible. How is this accomplished? It’s done by calculating the importance of each pixel with some metric (for example, the gradient magnitude) and removing the least important pixels first. The result is made clear by this example from Wikipedia:...
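The idea can be sketched as follows. This is a simplified illustration in Python, not the Java implementation from the article, and it assumes a grayscale image stored as a nested list. The function names are made up for the example. The energy is the gradient magnitude computed with central differences, and the cheapest connected vertical seam is found with dynamic programming.

```python
def energy(gray):
    """Gradient magnitude per pixel (central differences, clamped borders)."""
    h, w = len(gray), len(gray[0])
    result = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx = gray[y][min(x + 1, w - 1)] - gray[y][max(x - 1, 0)]
            dy = gray[min(y + 1, h - 1)][x] - gray[max(y - 1, 0)][x]
            result[y][x] = (dx * dx + dy * dy) ** 0.5
    return result


def find_vertical_seam(e):
    """Return one column index per row: the connected top-to-bottom path
    with the lowest total energy."""
    h, w = len(e), len(e[0])
    cost = [row[:] for row in e]
    # Each cell adds the cheapest of the (up to) three cells above it.
    for y in range(1, h):
        for x in range(w):
            cost[y][x] += min(cost[y - 1][max(x - 1, 0):min(x + 2, w)])
    # Backtrack from the cheapest cell in the bottom row.
    x = min(range(w), key=lambda i: cost[h - 1][i])
    seam = [x]
    for y in range(h - 1, 0, -1):
        lo, hi = max(x - 1, 0), min(x + 2, w)
        x = min(range(lo, hi), key=lambda i: cost[y - 1][i])
        seam.append(x)
    seam.reverse()
    return seam


def remove_seam(gray, seam):
    """Drop one pixel per row, making the image one column narrower."""
    return [row[:x] + row[x + 1:] for row, x in zip(gray, seam)]


# Tiny test image: a bright vertical stripe on a dark background.
gray = [
    [0, 0, 9, 0],
    [0, 0, 9, 0],
    [0, 0, 9, 0],
]
seam = find_vertical_seam(energy(gray))
narrower = remove_seam(gray, seam)
```

Repeating the find-and-remove step shrinks the width one column at a time; the same procedure on the transposed image would shrink the height.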