Introduction

Data analytics is the science of extracting patterns, trends, and actionable information from large sets of data. It involves improving how you make sense of that data before acting on it; you can then segment and filter the data to extract insights that give the organization a competitive advantage.

You can improve your capacity to analyze this data at multiple stages, from collection to organizing and processing. Whereas data once required a large team of skilled analysts to be made useful, today there are a number of enterprise-level tools for running high-speed analytics on massive amounts of data.

Overview

The system we are about to discuss is a complete end-to-end data analytics system, built fully in-house by combining several technologies. It covers the whole event journey:

  • Capturing the user event data
  • Persistent storage of the data
  • Real-time processing
  • Batch processing
  • Generating reports
  • Analysing trends
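The stages listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the system's actual code: the names Event, EventStore, and Pipeline are hypothetical, the store is an in-memory list standing in for persistent storage, and the "real-time" and "batch" paths are reduced to simple counters.

```python
import json
from collections import Counter
from dataclasses import dataclass, asdict


@dataclass
class Event:
    """Hypothetical user event; the field names are assumptions."""
    user_id: str
    name: str
    timestamp: float  # Unix seconds


class EventStore:
    """Hypothetical append-only store that keeps JSON lines in memory."""

    def __init__(self):
        self._lines = []

    def append(self, event):
        self._lines.append(json.dumps(asdict(event)))

    def read_all(self):
        return [Event(**json.loads(line)) for line in self._lines]


class Pipeline:
    def __init__(self, store):
        self.store = store
        self.live_counts = Counter()  # updated as each event arrives

    def capture(self, event):
        self.store.append(event)            # storage step
        self.live_counts[event.name] += 1   # real-time processing step

    def batch_counts_per_user(self):
        # Batch processing step: rescan everything that was stored.
        counts = Counter()
        for event in self.store.read_all():
            counts[event.user_id] += 1
        return counts

    def report(self):
        # Reporting step: combine the real-time and batch views.
        return {
            "events_by_name": dict(self.live_counts),
            "events_by_user": dict(self.batch_counts_per_user()),
        }


pipeline = Pipeline(EventStore())
pipeline.capture(Event("u1", "page_view", 1.0))
pipeline.capture(Event("u2", "page_view", 2.0))
pipeline.capture(Event("u1", "click", 3.0))
report = pipeline.report()
print(report)
```

The point of the split is that the real-time path only touches the event in hand, while the batch path can recompute any aggregate later from what was stored.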

System Architecture

The architecture can be divided into two groups: one, called Bhrigu, for capturing and storing the events, and the other, called Big Query, for analyzing the data. The entire system is built using the following open-source technologies:

Bhrigu Architecture

System Architecture Diagram

Bhrigu
Big Query
  • Mist for communicating with spark cluster
  • Python for creating server functionality
  • Flask for web framework
  • Bootstrap for beautiful client UI
  • Chartjs for visualization of trends
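As a rough illustration of how the Python server side might hand trend data to Chartjs, the sketch below counts events per day and shapes the result as a labels-plus-datasets object, which is the general layout Chartjs charts take for their data. The function name trend_payload and the choice of per-day counts are assumptions made for this example; the actual reports and the Mist and Flask wiring are not shown.

```python
import json
from collections import Counter
from datetime import datetime, timezone


def trend_payload(timestamps, label="events per day"):
    """Count events per UTC day and shape the result for a chart.

    Hypothetical helper: the input is a list of Unix timestamps in seconds.
    """
    per_day = Counter(
        datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%d")
        for ts in timestamps
    )
    days = sorted(per_day)  # ISO dates sort chronologically as strings
    return {
        "labels": days,
        "datasets": [{"label": label, "data": [per_day[d] for d in days]}],
    }


# Two events on the first day, one on the second.
payload = trend_payload([0, 3600, 86400])
print(json.dumps(payload))
```

A web handler could return this dictionary as JSON, and the client page would pass it to the chart as its data.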