Real Estate Database & Visualization - Elastic Stack

Project Description

In January 2019, the French government released a database (.csv files) of all real estate transactions from 2014 to 2019, including the location, sale price, and much more.

To run advanced analysis on this data, I decided to load it into Elasticsearch and enrich it with geolocations (I derived geo-coordinates from the addresses, using the « National Address Database » open-source project together with Logstash and Elasticsearch). Finally, I used Kibana to analyse the data through dataviz and map visualisations.

This project uses Filebeat to read / stream the files, Logstash to transform the data and integrate it into Elasticsearch, Elasticsearch to store and search the data, and Kibana for dataviz.

Finally, it uses Amazon Elasticsearch Service to run the ES database.

01.  Filebeat - Logstash

As the input files are .csv, I use Filebeat to read and stream the data to Logstash. Logstash then aggregates and processes the data to transform or enrich the input. As output, Logstash sends the data to the Amazon Elasticsearch cluster.
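
To make the transform step more concrete, here is a minimal Python sketch of the kind of per-row conversion a Logstash filter stage could perform. It is not the project's actual Logstash configuration: the column names, the comma delimiter, the date format, and the sample row are all assumptions made for illustration.

```python
import csv
import io
from datetime import datetime

# Hypothetical sample input. The real column names, delimiter and
# date format of the government files may differ.
SAMPLE = (
    "sale_date,price,street,postcode,city\n"
    "03/01/2018,250000,12 RUE DE LA PAIX,75002,PARIS\n"
)


def transform(row: dict) -> dict:
    """Turn raw CSV strings into typed fields ready for indexing."""
    return {
        # Assumed dd/mm/yyyy input, converted to ISO 8601 for Elasticsearch.
        "sale_date": datetime.strptime(row["sale_date"], "%d/%m/%Y").date().isoformat(),
        "price": float(row["price"]),
        "street": row["street"].strip(),
        "postcode": row["postcode"].strip(),
        "city": row["city"].strip(),
    }


documents = [transform(row) for row in csv.DictReader(io.StringIO(SAMPLE))]
print(documents[0])
```

In the real pipeline this work is done by Logstash on the events streamed by Filebeat; the sketch only illustrates the idea of converting string columns into typed fields before indexing.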

02.  Elasticsearch - AWS

The data is loaded from Logstash into an AWS Elasticsearch instance. We use the logstash-output-amazon_es plugin to reach our Amazon instance.
We also set up custom indices through our mapping file, in particular so that we can use geo-coordinates and map dataviz.
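
As an illustration of what such a mapping could contain, the sketch below builds a mapping body as a Python dictionary. The field names are hypothetical and not taken from the project's mapping file; the important part is the `geo_point` type, which Elasticsearch needs in order to treat a field as coordinates. The sketch assumes the typeless mapping format of Elasticsearch 7 and later; older versions nest `properties` under a type name.

```python
import json

# Hypothetical mapping: field names are illustrative assumptions,
# only the "geo_point" type is essential for map visualisations.
REAL_ESTATE_MAPPING = {
    "mappings": {
        "properties": {
            "sale_date": {"type": "date"},
            "price": {"type": "float"},
            "street": {"type": "text"},
            "postcode": {"type": "keyword"},
            "city": {"type": "keyword"},
            "location": {"type": "geo_point"},
        }
    }
}


def to_geo_point(lat: float, lon: float) -> dict:
    """Build a value in the object form accepted by a geo_point field."""
    return {"lat": lat, "lon": lon}


print(json.dumps(REAL_ESTATE_MAPPING, indent=2))
```

Without an explicit mapping, a `lat` / `lon` object would likely be indexed as two plain numeric fields, which cannot be used for map visualisations.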

In this project, we load two kinds of data into Elasticsearch:

First, we load the « National Address Database » to have a list of all existing addresses and their coordinates in France.

Then we use this data to enrich our real estate data by querying Elasticsearch from Logstash. This lets us add geo-coordinates to our real estate database. Finally, we load the enriched real estate data into Elasticsearch.
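
The enrichment step can be pictured with the following Python sketch. It is an approximation of the logic, not the project's Logstash filter: the field names (`street`, `postcode`, `lat`, `lon`), the query shape, and the sample values are assumptions. The first function builds an Elasticsearch query body to find the best matching address; the second copies the coordinates of a match onto the transaction.

```python
from typing import Optional


def build_address_query(street: str, postcode: str) -> dict:
    """Query body: full-text match on the street, exact filter on the postcode."""
    return {
        "size": 1,  # only the best match is needed
        "query": {
            "bool": {
                "must": [{"match": {"street": street}}],
                "filter": [{"term": {"postcode": postcode}}],
            }
        },
    }


def enrich(transaction: dict, address_hit: Optional[dict]) -> dict:
    """Return a copy of the transaction with a location, if an address was found."""
    enriched = dict(transaction)
    if address_hit is not None:
        enriched["location"] = {"lat": address_hit["lat"], "lon": address_hit["lon"]}
    return enriched


# Made-up values, for illustration only.
transaction = {"street": "12 RUE DE LA PAIX", "postcode": "75002", "price": 250000.0}
hit = {"street": "12 Rue de la Paix", "postcode": "75002", "lat": 48.869, "lon": 2.331}

print(build_address_query(transaction["street"], transaction["postcode"]))
print(enrich(transaction, hit))
```

Transactions whose address cannot be matched are kept unchanged in this sketch, so they can still be indexed and analysed without a position on the map.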

03.  Kibana

Finally, we use the Kibana instance provided by AWS to analyse our data.
From Kibana, we can then create several kinds of analytics, including a display of the results on a map, based on the lon-lat coordinates.
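
For readers curious about what sits behind a map visualisation, the sketch below builds an Elasticsearch aggregation body that groups documents into geohash cells and computes the average price per cell. This is roughly the kind of request a map view could rely on, not a dump of what Kibana actually sends; the field names `location` and `price` are assumptions.

```python
def price_grid_aggregation(precision: int = 5) -> dict:
    """Aggregation body: average price per geohash cell."""
    return {
        "size": 0,  # no individual documents, only aggregated buckets
        "aggs": {
            "cells": {
                "geohash_grid": {"field": "location", "precision": precision},
                "aggs": {"avg_price": {"avg": {"field": "price"}}},
            }
        },
    }


print(price_grid_aggregation())
```

A higher `precision` gives smaller cells, which suits a zoomed-in map but produces more buckets.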

This project is fully available on GitHub! Check out the sources!