Saturday 22:14
523 Tools in the
Data Science Stack.
Get an overview. Talk to us.

Idea

We maintain the Liip Data Science Stack to help you orient yourself in a highly crowded area and find the right tools in one place. Because we love open source, we have sorted the open source tools to the top. Enjoy browsing the website or, if you are a geek, just download the JSON file.
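
For readers who download the JSON file, here is a minimal Python sketch of how the "open source first" ordering could be reproduced. The field names (`name`, `category`, `open_source`) and the four sample records are assumptions for illustration only; the actual schema of the file is not described on this page.

```python
import json

# Hypothetical sample data: the real file's schema may differ.
SAMPLE_JSON = """
[
  {"name": "Trifacta", "category": "Datacleaning", "open_source": false},
  {"name": "Openrefine", "category": "Datacleaning", "open_source": true},
  {"name": "Loggly", "category": "Alerting/Logging", "open_source": false},
  {"name": "Graylog", "category": "Alerting/Logging", "open_source": true}
]
"""


def open_source_first(tools):
    """Return the tools with open source entries first."""
    # sorted() is stable, so entries with the same flag keep their input order.
    return sorted(tools, key=lambda tool: not tool.get("open_source", False))


def group_by_category(tools):
    """Group tool names by category, preserving the order of the input list."""
    grouped = {}
    for tool in tools:
        grouped.setdefault(tool["category"], []).append(tool["name"])
    return grouped


tools = json.loads(SAMPLE_JSON)
stack = group_by_category(open_source_first(tools))
for category, names in stack.items():
    print(category, names)
```

With a real download, `SAMPLE_JSON` would be replaced by the contents of the file, and the key names adjusted to whatever the file actually uses.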

Next Step

Often just knowing about a myriad of tools won't help you much if you can't connect them to the business question. Don't worry - you are not lost. Our team will help you to select the right approach and methodology for your question. Transform your data into insights and action starting from today.

Mailing List

Stay up to date on the current developments. Subscribe to get a quarterly email containing all the updates we made to the stack. We won't send you any spam or advertising for our services. Pinky promise.

Search

Search for a technology in the stack.

Data Sources

Where does your data usually come from? For us, it's mainly websites and apps with sophisticated event tracking. Yet for some projects the data has to be scraped, or comes from social media outlets or IoT devices.

Scraping

Scrapy Cluster
Scrapy as a cluster
X-ray
Very fast and simple scraper
Scrapy
Powerful Python Scraping Framework
Nutch
High Scalability Crawler
PhantomJS
Headless Browser perfect for scraping

Social Media

Quintly
Cross-platform social media analytics
Buffer
Content planning + analytics
HootSuite
Content curation + analytics
BrandWatch
Campaign monitoring in social media
Klout
User rating platform

Website Analytics

Ahoy
Simple JS tracker
Piwik
On-premise alternative with SaaS option
Countly
Self-hosted analytics
Awstats
Perl-based
Openweb Analytics
Similar to Piwik, runs on PHP

Tag Management

7Tag
From the people that developed Piwik
Google Tag Manager
Google Tag Manager
Tealium Tag Manager
Enterprise Level Tag Manager
Tag Commander
Tag Manager
Qubit Tag manager
Tag Manager with containers

IoT

Raspberry Pi
Standard
Photon
IoT with Wi-Fi
NodeRed
Flow-based programming for IoT
Flowhub
Flow-based programming for the full stack
UbiDots
Data Collection and Analysis

Heatmaps

CrazyEgg
Heatmap Analytics
Inspectlet
Analytics with focus on heatmaps
Session Cam
Focus on heatmaps integration
Hotjar
Focus on visual analytics. New approach.
Mouseflow
Live heatmap analytics

Mobile Analytics

Appcelerator
Native apps analytics
Tapstream
App analytics
AppsFlyer
Mobile Analytics
Upsight
Omnichannel marketing tracking
App Trace
App store analytics

Data Processing

How can we initially clean or transform the data? How and where can we store the logs that those events create? And where can we get additional valuable data?

ETL

RubyETL
ETL for Ruby
SQOOP
Data scooping
Drill
Transform on the fly
Nifi
Process and Distribute Data
Kettle
ETL from Pentaho

Datacleaning

Openrefine
Easy data cleansing
Wrangler
Open source version of Data Wrangler
Trifacta
Commercial version of Wrangler
Papa Parse
CSV importer for huge files

Alerting/Logging

Graylog
Open source logging service
Logstash
From ELK Stack
Fluentd
Open Source Data collector
Goaccess
Realtime weblog analyzer
Loggly
Like Graylog but paid

MessageQueue

Heron
Realtime stream processing from Twitter
Streamalert
Serverless, realtime data analysis framework
Celery
Distributed Task Queue
Impala
Realtime Stream Query
Nats
Open source for cloud and realtime

Database

What options are out there to store the data? How can we search through it? How can we connect big data sources like Hadoop efficiently with existing applications?

Database

LevelDB
Key/Value
Beringei
In-memory time series DB from Facebook
TimescaleDB
Built on top of Postgres
SQLite
File-based/prototype DB
Postgres
SQL

In Memory/Search

Elastic
General flexible fast solution
Lucene
Mature Search engine
Solr
Builds on Lucene
Lunr
In-browser search engine
Compass
Java search engine

Hadoop Ecosystem

Hbase
Distributed Database/Filesystem
Cassandra
Distributed database
Hive
Data Warehouse
Oozie
Workflow scheduler
Openstack
Open source software for creating private and public clouds.

Analysis / ML

Which stats packages are available to analyze the data? Which frameworks are out there to do machine learning, deep learning, computer vision, natural language processing?

Deep Learning

TFLearn
High-level deep learning (now part of TensorFlow)
Brainstorm
Successor of PyBrain
NeuroJS
JS Deep learning library
Tensor2Tensor
Open-source system for training deep learning models in TensorFlow
MobileNet
ML on Apple devices

Stats Software

IPython
IPython Notebooks
RMarkdown
Like IPython Notebooks, for R
Rattle
Graphical UI for R models
Weka
Industry-strength ML toolkit in Java
R
Large open source stats framework mainly used in academia

General (focus Python)

Lore
Maintainable ML approachable for Software Engineers.
Tuber
YouTube analytics in R
DMTK
Distributed Machine Learning Toolkit from Microsoft
Sklearn-Pandas
Scikit-Learn for pandas data frames
Dask-learn
Scikit-Learn parallel computing

Assistant

Lyrebird
Voice synth from samples
Tada
AI Assistant for Google Analytics
Statsbot
AI Assistant for Google Analytics
Kitt.ai
NLU / Chatflow

Computer Vision

Openface
State of the art open source face detection
Facenet
Deep Learning face detection
Video Intelligence
Video Intelligence
SciKit Image
A collection of algorithms for image processing in Python
Google Computer Vision
SaaS for image recognition

NLP

CoreNLP
Integrated NLP Toolkit
SAGA
GATE Sentiment plugin
SEAS
GATE Sentiment plugin
LDAjs
Topic Modeling
FuzzyWuzzy
Fuzzy string matching in Python

ChatBots Framework

Rasa
Turn natural language into structured data
P-Brain
Natural Language for Bots
Botpress
Open source framework for bots
Telegram Bot
Bot Framework for Telegram
Botframework
Microsoft Bot Framework

Speech

Speech-Recognition-Tensorflow
Open Source Listen-Attend-Spell implementation
ESpeak-NG
Open Source speech synthesis for 102 languages
Spitch
Swiss ASR provider
OpenSTT
Initiative for open source STT
Speech Recognition
Meta-library for Python

Visualization / Dashboard

What happens with the results? What options do we have to visually communicate them? How do we turn those visualizations into dashboards or whole applications? Which additional ways of communicating with the user besides reports and emails are out there?

General

Datamaps
SVG Map visualization
Glue
Linked Data viz in python
Altair
Vega-Lite for Python
Britecharts
Charting library from Eventbrite
Folium
Python map visualization with Leaflet.js

Dashboards

JsonDash
Make Dashboards with Flask and JSON
ReDash
Dashboard with Analytics
Vida
Flexible dashboard
Screenly
Manage multiple dashboards
Plotly
Dashboards with a lot of bindings

Javascript

Cubism
Time series viz in JS
Vega
Declarative viz in JS
Odyssey
Stories visualization in JS
DC.js
Multi Dimensional Charting
Crossfilter
Multi Dimensional Filtering

Business Intelligence

What solutions are out there that try to integrate data sourcing, data storage, analysis and visualization in one package? What BI solutions are out there for big data? Are there platforms or solutions that offer a more flexible, data-scientist approach?

Business Intelligence

OpenMining
BI in Python Notebooks
Blazer
Simple SQL to Graph
Bdash
Simple Mac BI application
Kibana
Fast-moving BI on top of Elastic
Airpal
BI from Airbnb

BI on Hadoop

Arcadia Data
BI on Hadoop
Datameer
BI on Hadoop
Kyvos Insights
BI on Hadoop
Attivio
BI on Hadoop
Zoomdata
BI on Hadoop

Data Science Platforms

Hue
Cloudera Hue solution for the Hadoop stack
Rapidminer
Java data modeling and prediction
Orange
Data modeling and prediction
Graphlab
Integrated data science
Knime
BI for data scientists

Discover More

Send us a message

or an email to thomas.ebermann@liip.ch