The DataLion Blog

DataLion at the Digital Challenge 2018

In mid-June, numerous specialists and interested parties from agencies, institutes and companies got together in Munich in order to discuss the latest developments in digitization.

We were very excited to be there as well: our CEO and founder Dr. Benedikt Köhler introduced the new world of data science. In a live example, he demonstrated how one can use Foursquare to avoid being marked as a tourist when visiting an unfamiliar city.

The talks and presentations were up to date and covered a wide range of topics. Participants saw examples of AI and Blockchain applications, as well as best and worst cases of influencer marketing.

Johannes Ceh’s keynote topped off the conference by highlighting the opportunities and risks of digitization. The networking afterwards lived up to expectations, and the talks were discussed thoroughly at the After Show Party.

The Five Most Creative Music Visualizations

“Data is power”: by now an established fact. “Music is power”: a universal truth. So what do we get when we creatively combine the plethora of available data on music with analysis techniques and powerful software? We gain knowledge, insight and tremendous inspiration. We have gathered some really interesting and inspiring interactive and non-interactive visualizations on the evolution of music as well as on current distinctive trends. Enjoy!

Using Billboard Top 100 data from 1958 to 2016, The Pudding’s Matt Daniels has visualized the evolution of musical taste month by month: every top 5 song in the U.S. from 1958 to 2016. Headphones required: it’s a “soundalized” visualization!

Kaylin Pavlik, on the other hand, used R and data from the Billboard Year-End Hot 100 in her blog post “50 Years of Pop Music” to offer a more quantitative insight into the evolution of pop in the U.S. from 1965 to 2015.

Colin Morris (again of The Pudding) puts one hypothesis to the test: Are Pop Lyrics Getting More Repetitive? The author used the Lempel-Ziv algorithm to measure how repetitive lyrics are via compression: the more compressible a song, the more repetitive its lyrics. So it seems Rihanna’s lyrics aren’t quite as poetic as Frank Sinatra’s used to be! Wait, we did see that coming, didn’t we?
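To get a feel for the idea of measuring repetition by compression, here is a minimal Python sketch. It is not the author's code: it assumes zlib's DEFLATE (which combines LZ77 with Huffman coding) as a stand-in for the Lempel-Ziv variant used in the article, and the two text snippets are made up for illustration.

```python
import zlib


def compression_ratio(text: str) -> float:
    """Return compressed size / original size (lower means more repetitive)."""
    raw = text.encode("utf-8")
    if not raw:
        raise ValueError("text must not be empty")
    # Assumption: zlib (DEFLATE = LZ77 + Huffman) stands in for Lempel-Ziv.
    return len(zlib.compress(raw, 9)) / len(raw)


# Hypothetical toy "lyrics", invented for this example only.
repetitive = "na na na na hey hey hey goodbye " * 20
varied = (
    "The quick brown fox jumps over the lazy dog while seven wizards "
    "quietly judge twelve boxing kangaroos from a vexed jukebox in Prague."
)

print(f"repetitive: {compression_ratio(repetitive):.2f}")
print(f"varied:     {compression_ratio(varied):.2f}")
```

The repeated chorus shrinks to a small fraction of its size, while the varied sentence barely compresses at all. Very short texts carry a fixed compression overhead, so ratios are only comparable between texts of similar length.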

On a more international level, Spotify and the open source platform CARTO have created the “Musical Map of the World”. The interactive map uses Spotify’s data from cities all over the world to give each of them its own “musical character”. Want to know what Germany sounds like? Click on the country to listen to its distinctive music!

Brady Fowler of Decibels and Decimals is no friend of classifying music by genre. So he used Spotify data and Python igraph to visualize the connections between artists and their music. The beautiful graph is the result of clustering the artists into groups based on the listening habits of Spotify users.

Did you find the visualizations as interesting as we did? What would you add to the list? We are looking forward to your input.

DataLion out and about in Munich

We are proud to be giving talks at two events in just one week. It all starts on Monday, the 11th of June, at the Travel Industry Club. Our CEO and founder Dr. Benedikt Köhler will be talking about how the travel industry can make use of smart data. The event takes place at Schweiger’s Kochschule and starts at 6:30 pm. You can find more information here.

AI, AR/VR, Blockchain, Influencer Marketing, GDPR – all these current trends will be discussed during the “Digital Challenge 2018” on Thursday, the 14th of June. DataLion is right in the middle of it, with Dr. Benedikt Köhler talking about Alien Data Science (no worries, E.T. will not be showing up ;)). The location is the Freiheizhalle, starting at 9 am. Read more about it here.

Best 4 free math books for deepening your machine learning skills

The best things in life aren’t things – but free books. At least if you want to spend the next few weeks taking your machine learning to another level.
We’ve selected four great books that each help you understand one important aspect of machine learning in a very profound way. Thanks to the Open Access initiative, all of these works are available for free:

The Elements of Statistical Learning by Trevor Hastie, Robert Tibshirani and Jerome Friedman

If you need a refresher on your machine learning methods, ESL is the book to go to. From Lasso to boosted trees and ensemble learning – this beautifully typeset work covers all the bases for your professional life as a data scientist.

Here’s the latest (12th) printing from January 2017 as a PDF download.

Convex Optimization by Stephen Boyd and Lieven Vandenberghe

If you have spent some years in machine learning, the probability is very high that you’ve stumbled upon convex optimization problems. The theory and methods of convex optimization have been around for a long time. But until a few decades ago, they were thought to be mostly of theoretical value. Today, convex optimization is, for example, an important part of Deep Learning, and many smart things around us are powered by these algorithms.

Here you can download the full book by the Stanford professor for free – and there’s a lot of additional material on the website and even an online course.
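As a small taste of what such a problem looks like in practice, here is a minimal Python sketch of plain gradient descent on a one-dimensional convex function. It is not taken from the book; the function, the step size and the iteration count are all assumptions chosen for illustration.

```python
def gradient_descent(gradient, start, step=0.1, iterations=200):
    """Repeatedly move against the gradient, starting from `start`."""
    x = start
    for _ in range(iterations):
        x -= step * gradient(x)
    return x


def objective_gradient(x):
    """Gradient of the hypothetical convex objective f(x) = (x - 3)**2 + 1."""
    return 2 * (x - 3)


x_min = gradient_descent(objective_gradient, start=-10.0)
print(f"minimum found near x = {x_min:.6f}")
```

Because the objective is convex, there is exactly one minimum (at x = 3 here), so the iteration cannot get stuck in a merely local optimum. That guarantee is what makes convex problems so pleasant to work with.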

Group Representations in Probability and Statistics by Persi Diaconis

This book goes back to Diaconis’ lecture notes for his course on this topic at Harvard in the 1980s. There are a lot of situations where data scientists have to deal with rankings, e.g. consumers having to rank products in a survey. These mathematical problems can be solved by applying group theory.

But wait, there’s more: As Diaconis is also a magician, the shuffling of cards also plays quite a role in this work.

The book is available at Project Euclid.
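To make the link between rankings and group theory a bit more concrete: a ranking of n products is a permutation, and permutations can be composed and inverted like elements of a group. The following Python sketch is our own illustration, not an excerpt from the book. It treats rankings as tuples of hypothetical product indices and counts the product pairs on which two respondents disagree (the Kendall tau distance).

```python
from itertools import combinations


def compose(p, q):
    """Return the permutation that applies q first, then p."""
    return tuple(p[i] for i in q)


def invert(p):
    """Return the inverse permutation: maps each item to its position in p."""
    inverse = [0] * len(p)
    for position, item in enumerate(p):
        inverse[item] = position
    return tuple(inverse)


def kendall_tau_distance(rank_a, rank_b):
    """Count the item pairs that the two rankings put in opposite order."""
    pos_a, pos_b = invert(rank_a), invert(rank_b)
    return sum(
        1
        for x, y in combinations(range(len(rank_a)), 2)
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0
    )


# Hypothetical survey answers: each tuple lists product ids from best to worst.
alice = (0, 1, 2, 3)
bob = (1, 0, 2, 3)
carol = (3, 2, 1, 0)

print(kendall_tau_distance(alice, bob))
print(kendall_tau_distance(alice, carol))
print(compose(bob, invert(bob)))
```

Alice and Bob disagree on a single pair, Alice and Carol on all six, and composing a ranking with its inverse returns the identity permutation.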

Mining of Massive Datasets by Jure Leskovec, Anand Rajaraman and Jeff Ullman

And finally, for something more accessible: This book is already a classic on big data techniques for processing large data sets. If you haven’t already, you should really take a look at the 2nd edition of the book, which also covers mining large graphs and map-reduce programming. You can also take a look at the first chapters of the upcoming 3rd edition.

Here’s the book’s homepage with a lot of additional information.

I hope there are some useful additions to your reading list here. Looking forward to your feedback about the books. What was especially useful? Which books are missing from this list?

Craft Beer & Data Tasting

After the great success of our first “Festbier Tasting”, a side event of the Bits & Pretzels startup conference, we are looking forward to hosting another data-driven beer tasting to celebrate our first trade-fair appearance at Research & Results, the world’s largest marketing research fair.

This time, we will explore some of the most fascinating Bavarian craft beers. Together with beer sommelier Stefan Hermansdorfer, we will taste six very different craft beer varieties from Munich and the surrounding region.

Of course, all our tasting results – from appearance and mouth-feel to aroma and taste – will be visualized in real-time on a DataLion live dashboard:

We are looking forward to this event, and a few free tickets are still available: Register for free at Eventbrite.

When: Wednesday, October 25, 2017
Time: 8pm – 10pm
Where: DataLion GmbH, Herzog-Wilhelm-Straße 1, 80331 Munich (at Karlsplatz/Stachus)

Data-Driven Beer Research @ DataLion

Monday evening was a special evening at the DataLion office in Munich:

We hosted a beer tasting event as an official side-event of the Bits & Pretzels Founders Conference and Startup Night.

Together with beer sommelier Stefan Hermansdorfer, we tasted six different beers from Munich and the surrounding area that were all in the Märzen or Festbier style (Hacker-Pschorr, Augustiner, Hofbräu, Tilmans Das Helle, Eittinger and Giesinger). One of them even had a lion on the label. And because we all love data, we also created a short survey for describing and scoring the beers in terms of appearance, aroma and, of course, taste.

All results were displayed on a DataLion real-time dashboard in our conference room: We chose radar charts for our live visualization because they make it easy to compare the different taste profiles of the beers at a glance. Every time one of our guests rated a beer, the results updated to show the new rankings.
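The bookkeeping behind such a live view can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the DataLion implementation: the class name `TastingBoard`, the beer names and the scores are all hypothetical, and every rating is assumed to score the same set of dimensions.

```python
class TastingBoard:
    """Keeps running score totals so averages can be refreshed after every vote."""

    def __init__(self):
        self._totals = {}  # beer -> {dimension: summed score}
        self._votes = {}   # beer -> number of ratings received

    def add_rating(self, beer, scores):
        """Record one guest's scores, e.g. {"aroma": 3, "taste": 5}."""
        totals = self._totals.setdefault(beer, {})
        for dimension, value in scores.items():
            totals[dimension] = totals.get(dimension, 0) + value
        self._votes[beer] = self._votes.get(beer, 0) + 1

    def profile(self, beer):
        """Average score per dimension: the values one radar chart would plot."""
        votes = self._votes[beer]
        return {dim: total / votes for dim, total in self._totals[beer].items()}

    def ranking(self, dimension):
        """Beers ordered from highest to lowest average on one dimension."""
        return sorted(
            self._votes,
            key=lambda beer: self.profile(beer)[dimension],
            reverse=True,
        )


board = TastingBoard()
board.add_rating("Beer A", {"appearance": 4, "aroma": 3, "taste": 5})
board.add_rating("Beer A", {"appearance": 2, "aroma": 3, "taste": 3})
board.add_rating("Beer B", {"appearance": 3, "aroma": 4, "taste": 3})
print(board.profile("Beer A"))
print(board.ranking("taste"))
```

Each `profile` corresponds to one polygon in a radar chart, and calling `ranking` after every new vote gives the refreshed order.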

You can see the live dashboard at this link. If you’re interested in live dashboards that are connected to a quick survey for your business, just leave a message at our contact page.

Big Data Design Thinking

Solving problems and developing innovations for Big Data with Design Thinking

Data is the new oil. But while the use of oil was relatively clearly defined, there are many more possibilities in data. It’s an endless story.

Almost all companies are facing the challenge of having tons of data from very heterogeneous sources. But often, they lack a clear vision of what the data could be used for.

What are new business models that can be fueled with the data? Which data products can be defined and sold by the company? The solution is: Design Thinking.

Continue reading “Big Data Design Thinking”

Time Series, data import, drag & drop – version 1.6 is live now

New Month – new update! Our software engineering lions worked until last night on the latest new features and hunted for the last remaining bugs. Their hunt was so successful that we can finally launch the new DataLion version 1.6.

Here are some of the most important new features and improvements:

Drag and drop for even more customized dashboards

The new software release allows you to create even more customized dashboards: All charts can be moved freely on the dashboard, and you can also adjust the size of the charts by drag and drop. When you hover over the lower right corner of a chart, an arrow appears that lets you drag the chart larger or smaller.

Easy import with our new wizard

Previous versions of DataLion required administration rights to import data. Now, you can import your CSV files into DataLion as a normal user, recode the values in the files and create dashboards with your own data. In our free version, you can test this feature with CSV files containing up to 500 rows and up to 100 columns. Try importing your data now!
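If you want to check a file against the free-version limits and try out a recoding before uploading, a few lines of Python are enough. This sketch is not part of DataLion: the function name `import_csv`, the recode mapping format and the sample data are assumptions, and it assumes the 500-row limit counts data rows without the header.

```python
import csv
import io

MAX_ROWS = 500     # free-version limit quoted above
MAX_COLUMNS = 100  # free-version limit quoted above


def import_csv(csv_text, recode=None):
    """Parse CSV text, enforce the size limits and recode selected values."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    columns = reader.fieldnames or []
    if len(rows) > MAX_ROWS or len(columns) > MAX_COLUMNS:
        raise ValueError(
            f"limit exceeded: {len(rows)} rows, {len(columns)} columns"
        )
    recode = recode or {}
    # Replace a value only if its column has a mapping that contains it.
    return [
        {col: recode.get(col, {}).get(val, val) for col, val in row.items()}
        for row in rows
    ]


# Hypothetical survey export, invented for this example.
sample = "respondent,gender\n1,1\n2,2\n"
labels = {"gender": {"1": "female", "2": "male"}}
print(import_csv(sample, recode=labels))
```

Values without an entry in the mapping, such as the respondent ids here, pass through unchanged.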

Automated data import and dynamic variables

These two new features complement each other perfectly. In version 1.6 you can set up an automated data import so your data is updated at regular intervals (e.g. every Sunday at 2am). All analyses and reports are also updated automatically for you.

By using the new dynamic variables, real-time analyses become even more convenient. With just a few clicks, you can, for example, look at the results for the last 10 days and save them in a dynamic report that will always refer to the correct time frame.
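Conceptually, a dynamic time frame like “the last 10 days” is a filter that is re-evaluated against the current date every time the report is opened. The Python sketch below illustrates that idea only. It is not how DataLion implements dynamic variables, and the function name and record layout are assumptions.

```python
from datetime import date, timedelta


def last_n_days(records, n, today=None):
    """Keep records dated within the n days ending today (both ends inclusive)."""
    today = today or date.today()
    start = today - timedelta(days=n - 1)
    return [record for record in records if start <= record["date"] <= today]


# Hypothetical measurements, invented for this example.
records = [
    {"date": date(2017, 5, 1), "visits": 120},
    {"date": date(2017, 5, 20), "visits": 95},
    {"date": date(2017, 5, 25), "visits": 143},
]

# Passing `today` explicitly makes the example reproducible.
recent = last_n_days(records, n=10, today=date(2017, 5, 29))
print([record["visits"] for record in recent])
```

Leaving out the `today` argument makes the window follow the calendar, which is what keeps a saved report pointing at the correct time frame.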

New timeline features

Time-series analyses are becoming more and more important for many of our users, so we have improved the timeline chart even more. It now lets you define which columns in the data contain timestamps or dates and then switch between different time intervals (e.g., weeks, months, and years) with one click.

In addition, you can now also compare target groups or evaluate different variables over time.
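Switching between time intervals boils down to grouping the same dates into coarser or finer buckets. Here is a minimal Python sketch of that grouping. It is an illustration with made-up dates, not DataLion code, and it assumes ISO calendar weeks for the weekly view.

```python
from collections import Counter
from datetime import date


def bucket(day, interval):
    """Return the label of the week, month or year that a date falls into."""
    if interval == "year":
        return f"{day.year}"
    if interval == "month":
        return f"{day.year}-{day.month:02d}"
    if interval == "week":
        # Assumption: weeks follow the ISO 8601 calendar (Monday start).
        iso_year, iso_week, _ = day.isocalendar()
        return f"{iso_year}-W{iso_week:02d}"
    raise ValueError(f"unknown interval: {interval}")


def count_by_interval(days, interval):
    """Count how many dates fall into each bucket of the chosen interval."""
    return dict(Counter(bucket(day, interval) for day in days))


# Hypothetical event dates, invented for this example.
events = [date(2017, 1, 2), date(2017, 1, 3), date(2017, 1, 9), date(2017, 2, 1)]

for interval in ("week", "month", "year"):
    print(interval, count_by_interval(events, interval))
```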

Software tour

Especially for new users, we have created a quick tour through the most important features of the DataLion software. To access the tour, click “Start Tour” in your login menu in the top right corner (you have to be logged in).

Create presentations in record time

With our new advanced export features, you can export not just individual charts, but also entire dashboards and reports. This allows you to create your presentations in record time.

Further updates in 1.6

  • Visualize Google Analytics data in real-time with DataLion
  • Add text and images to your dashboards and reports
  • Change the look and feel of your dashboard with themes
  • “Rainbow” mode: Apply different colors to the bars in bar or column charts
  • New chart types: gauge and multidimensional scaling (See live example here)

We are looking forward to your feedback on the new features and improvements!

DataLion nominated for official selection of the I-COM 2017 data startup competition

The international marketing and data community will meet in Porto at the I-COM global summit “Data 2017 – the year of change” from April 24 to 27. We’re very proud that the high-caliber jury has nominated us as one of the 10 finalists for the official selection of the Data Startup Competition.

On April 25 we’ll pitch together with many exciting and innovative startups like Sentiance from Belgium or Catalyx from the UK for the Unilever-sponsored startup award.

But we’ll also be around on the other days and are looking forward to many inspiring conversations on data science, machine learning, dashboards and visualization.

If you’re interested in our impressions of this event, which has a very comprehensive agenda ranging from artificial intelligence and targeting to attribution modeling and the role of data in gourmet restaurants, you can follow our Twitter and Instagram channels, where we’ll be covering the conference.

Visualizing the Blockchain: The 7 most beautiful Bitcoin visualizations

What is the Blockchain?

The idea of money and currencies as flows in networks is not new. German sociologist Georg Simmel put forth similar ideas in his Philosophy of Money. With Bitcoin we finally have a currency that not only links people (or: nodes) together in financial transactions, but also makes the whole network transparent.

Experts argue that the currency is just one of many applications that can be built on the Blockchain algorithm and database powering Bitcoin. This way of linking nodes can be used to weave authentication layers in all sorts of networked applications e.g. the Internet of Things.

The best Bitcoin and Blockchain visualizations

Open and transparent data, available through various interfaces or APIs, combined with a networked data structure should be a jackpot for data visualizers and information designers. So we took a look around to find the most impressive ways of visualizing the Bitcoin transaction flows. Some of the following visualizations even come with recipes.

1. Bitnodes

The first visualization “Bitnodes” by Addy Yeow shows the distribution of Bitcoin nodes across the globe. It uses a Bitcoin crawler implemented in Python that is also available on GitHub.

2. Network Map

Even more impressive is this visualization by the same author of all the Bitcoin nodes and the node density. Although the network structure is not clearly visible, it really suggests that there is a new “universe” evolving in finance.

3. Daily Blockchain

The following visualization uses the Open Source vivagraph.js library to display the networked nature of Bitcoin. You can see Bitcoin transactions happening in real-time and the evolving hubs of the Bitcoin network.

4. Interaqt

This one by luchendricks goes in the same direction – it is also a visualization of live Bitcoin transactions. Here the size of the nodes represents the volume of the transaction. Every node also carries a link to all information on the transaction on bitcoin.info.

5. Wizbit

To show the global nature of Bitcoin, the following live-map by Wizbit not only displays the transactions but also the latest discovered Bitcoin blocks (wow, someone in the USA just found a block of 2479 Bitcoins which is worth about 580,000 USD). The WebGL globe visualization is also available on GitHub.

6. Big Bang

This visualization “The Bitcoin Big Bang” by Elliptic is one of the most beautiful visualizations of Bitcoin history at the moment.

7. Blockseer

Blockseer is more of a visual research tool than a creative visualization of the Bitcoin universe. You can visualize transactions and blocks in a detailed tree diagram.

8. Follow the Cryptocurrency

For those of you that want to try creating your own visualizations with the public Bitcoin APIs, here’s a very interesting interactive notebook “Follow the Cryptocurrency” by Dato that shows all the different steps of accessing, analyzing and interpreting Bitcoin data.

9. Bitcoin Tree

Here are some examples we did at DataLion to query and visualize the blockchain with R.

10. Blockchain Tweets

To round it all off, here’s a sneak preview from our new tool “DataLion Social” (currently in beta) that lets you perform quick semantic analyses of online communities or tribes like the Blockchain community in almost no time.

By the way, if you’re interested in joining the beta phase of DataLion Social, here’s more information.