I haven’t done a “tools” post in a while, so this blog post will just be a whirlwind introductory tour of tools and applications I have explored recently and my thoughts on them.
The interesting trend that links many (but not all) of these apps and services below together is that a lot of them are starting to embed machine learning into their feature sets.
From the automatic slide designs that PowerPoint 2016 recommends when you drag in photos, to Google Sheets automatically suggesting visualizations based on your dataset, to data wrangling and cleaning tools trying to guess what transformations you want to make based on what you highlight, machine learning is indeed coming!
The tools I will cover are listed below.
- PowerPoint 2016 – designer and Morph feature
- Office Sway
- Office Mix
- OpenRefine add-ons (VIB-BITS diff)
- Trifacta Wrangler
- Talend data preparation
- Excel add-ons (fuzzy match and Power Query) + Google Docs (add-ins)
- R rattle
- Rapid Miner
Due to the length of the post, I will split this blog post into two, with the first one focusing on presentation tools and visualization tools. A later post will consider data cleaning or data wrangling and machine learning tools.
It’s interesting to note that the competition, Google Slides, offers a similar feature through its “Explore” function.
The other nice feature that caught my eye in the new PowerPoint is a new type of transition called “Morph”. It lets you add smooth animations by duplicating a slide and making slight changes to the copy; Morph will then transition smoothly to the altered slide.
Tired of normal PowerPoint? The next two tools are also from Microsoft. The first is Microsoft Sway.
There has been no shortage of apps and services trying to disrupt the PowerPoint style of presentation. Some try to be radically different, like Prezi, while most are more like web-based versions of PowerPoint with collaboration and easier importing of content from web services such as YouTube and Twitter. Google Slides of course is the paradigm example.
These days though, native PowerPoint has most of these features already, so it’s curious that Microsoft has experimented with a tool called Microsoft Sway that lies somewhere in between those two types.
Microsoft Sway is described as a “digital storytelling tool” but is closer to the latter than the former.
In terms of layout it’s more flexible than traditional PowerPoint, allowing you to create presentations that scroll horizontally, vertically or in slideshow mode. You can search for images, videos or other content via OneDrive, Bing, Flickr, YouTube etc.
Instead of slides, Sway has “cards”, which can be grouped in many ways for display, such as stacking them, comparing them, or showing them in grid or slideshow style.
So far, none of this is really groundbreaking.
The somewhat more interesting part is the ability to click “Remix”, after which Sway will try to remix your cards in other styles. It makes a good attempt, but I prefer PowerPoint 2016’s “Design ideas” feature.
Similar to the latest PowerPoint, you can also type in a topic and it will try to generate a presentation for you. It seems to depend heavily on Wikipedia to do so. Below is an outline created for a topic search on my institution.
Notice that Sway gives me suggestions for what to search (e.g. it knows the president’s name) and also offers suitable images and videos to use.
Not super groundbreaking but nifty if you are new at research.
Hence the rise of data visualization desktop tools, which are reaching a level where you can create quite amazing visualizations with little or no coding skill.
A lot of these are also starting to embed machine smarts…
Tableau is an extremely powerful tool, and I probably only use the bare minimum of what it can do and I’m constantly surprised by the tool’s growing capabilities.
Tableau touts “self-service analytics”, which lets users extract data and “self serve” it in any way they like. The “Show Me” function is perhaps one of the nicer functions in Tableau, as it suggests visualizations based on the types of data fields you have. I admit I tend to explore data by semi-randomly selecting data fields and looking at what “Show Me” recommends, or by letting it remind me of the types of data fields I need to generate, say, a “packed bubble” visualization or a “heat map”.
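To make the idea of recommending charts from field types concrete, here is a minimal sketch in Python. It is not Tableau's actual logic: the function name `suggest_charts`, the three field kinds and every rule below are my own assumptions, chosen only to show how a recommendation can be driven by counting field types.

```python
from collections import Counter


def suggest_charts(field_types):
    """Suggest chart types from a list of field kinds.

    field_types: list of strings, each "dimension" (categorical),
    "measure" (numeric) or "date". The rules are illustrative only
    and are NOT the rules Tableau really uses.
    """
    counts = Counter(field_types)
    dims = counts["dimension"]
    measures = counts["measure"]
    dates = counts["date"]

    suggestions = []
    if dates >= 1 and measures >= 1:
        suggestions.append("line chart")      # something measured over time
    if dims >= 1 and measures == 1:
        suggestions.append("bar chart")       # one number per category
    if dims >= 1 and measures >= 1:
        suggestions.append("packed bubbles")  # category sized by a number
    if dims >= 2 and measures >= 1:
        suggestions.append("heat map")        # two categories form a grid
    if measures >= 2:
        suggestions.append("scatter plot")    # number against number
    return suggestions


print(suggest_charts(["dimension", "measure"]))
print(suggest_charts(["measure", "measure"]))
```

Reading the rules the other way round gives the "reminder" use: to get a heat map under these made-up rules you would need at least two categorical fields and one numeric field.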
Tableau works with a wide variety of file formats and, it seems, with pretty much all commonly used servers.
Want to do more statistical analysis? Tableau integrates with R (a popular programming language for data analysis and machine learning), which is helpful for me because I have started to learn R lately.
With such a large user base, the Tableau community is very active and I find my questions posted to the forums tend to get answered in a day or less.
If you are playing with Tableau, there’s a free public version. Be careful though: this version requires that you store your data on the web, so if it’s sensitive data you don’t want out on the net, don’t use it.
If you work in an academic institution, you may have access to Tableau Desktop via an academic option. Like SAS and SPSS, Tableau may have realized the best way to build brand loyalty is to give away its software to students at educational institutions.
Microsoft Power BI Desktop
Microsoft’s entry in the data visualization software market is named Power BI Desktop. It’s the newest of the three big names (the others are Tableau and Qlik), and as such Power BI Desktop has a lot of catching up to do.
However, Microsoft does have quite a few advantages. When I tried it a year ago it was quite raw, but it is ramping up quickly.
Some may also find Power BI Desktop more familiar to use given its Microsoft roots. Over time, I found Tableau Desktop seemingly more flexible and more powerful in what it could do, but this may reflect my lack of experience with Power BI. Also, with Microsoft adding new features every month, the feature gap is closing fast.
One nice thing about Power BI is that you can easily add snazzy new visualizations by going to the custom visualizations gallery and downloading them.
For example, have you heard of or played with the impressive SandDance visualization? Want that in Power BI? You can.
Just go to the Office Store for Power BI visualizations, search for the SandDance visualization and add it.
Qlik Sense Desktop
As I said, my experience with this tool is extremely limited. But it does seem capable and could be a worthy tool.
Two years ago, I was on a mailing list on LibQual and saw an email from Chandler Christoffel sharing his interesting visualization of comments received in LibQual.
This was my first introduction to the free open source tool Raw Graphs, and with Chandler’s kind help, I managed to do something similar for my own comments.
So what is Raw Graphs? It is a free, web-based, open source tool capable of generating 21 less commonly seen visualizations (read: not doable in default Excel).
It’s a pretty simple tool: upload your data, select the visualization you want, and then assign the right fields to the labels, colors, sizes etc. needed to generate it.
You can do further customization if you like, but if not, it generates visualizations in SVG format that you can edit in Photoshop.
Gephi – network visualization tool
My familiarity with network visualization tools is far less than with other tools, but for a long while I was aware that Gephi is a very popular one. For Excel there is NodeXL but I haven’t tried that yet.
For librarians, when we think about network visualizations the obvious one is visualizing bibliometric networks from data we extract from Web of Science or Scopus.
It isn’t particularly easy to do this with Gephi, but following the instructions here, I managed my first author-keyword network using papers written by authors in my institution.
Obviously, this needs a lot of work, but it does show nodes for popular keywords (red) and the authors who publish papers with those keywords (green).
The key seems to be this though. While it’s possible to use Gephi directly on Scopus data (see this tutorial for example), perhaps easier would be to use the free online tools at Sciencescape to generate the network file to be used in Gephi.
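For anyone curious what such a network file boils down to, below is a minimal Python sketch of an author-keyword edge list. The sample records, field names and helper functions are all hypothetical; a real Scopus or Web of Science export uses different column names and separators and would need parsing first. The output uses Source, Target and Weight columns, which, as far as I know, is the layout Gephi's spreadsheet import expects for an edge table.

```python
import csv
import io
from collections import Counter

# Hypothetical records standing in for a parsed bibliographic export.
papers = [
    {"authors": ["Tan, A.", "Lee, B."],
     "keywords": ["open access", "bibliometrics"]},
    {"authors": ["Tan, A."],
     "keywords": ["Bibliometrics"]},
]


def author_keyword_edges(papers):
    """Count how often each (author, keyword) pair occurs across papers."""
    weights = Counter()
    for paper in papers:
        for author in paper["authors"]:
            for keyword in paper["keywords"]:
                # Lower-case keywords so "Bibliometrics" and
                # "bibliometrics" become the same node.
                weights[(author, keyword.lower())] += 1
    return weights


def edges_to_csv(weights):
    """Render the edges as CSV text with Source, Target, Weight columns."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    for (author, keyword), weight in sorted(weights.items()):
        writer.writerow([author, keyword, weight])
    return buffer.getvalue()


edges = author_keyword_edges(papers)
print(edges_to_csv(edges))
```

The weight is simply the number of papers in which an author used a keyword, so heavier edges mark an author's recurring topics.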
Still, I find that if what you want is something simple, CWTS’s VOSviewer seems to be the ticket.
I personally find VOSviewer fairly easy to use, as easy as such tools can be anyway.
Obviously you will need to know what terms like bibliographic coupling, link strength etc mean, but VOSviewer keeps things as simple as possible.
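As a quick illustration of those two terms: two papers are bibliographically coupled when they cite at least one reference in common, and a natural link strength is the number of references they share. The Python sketch below uses made-up papers and reference IDs and plain counting; VOSviewer's own counting and normalization options may well differ.

```python
from itertools import combinations

# Hypothetical papers mapped to the set of references each one cites.
references = {
    "paper_a": {"r1", "r2", "r3"},
    "paper_b": {"r2", "r3", "r4"},
    "paper_c": {"r5"},
}


def coupling_strengths(references):
    """Return {(paper1, paper2): shared reference count} for coupled pairs."""
    links = {}
    for first, second in combinations(sorted(references), 2):
        shared = len(references[first] & references[second])
        if shared:  # pairs with no shared references get no link at all
            links[(first, second)] = shared
    return links


links = coupling_strengths(references)
print(links)
```

Here paper_a and paper_b share r2 and r3, so they are linked with strength 2, while paper_c shares nothing with the others and stays unconnected.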
VOSviewer first asks which mapping approach you are going to take.
Google Sheets and the Explore function
Would it surprise you to know both features are in Google Sheets? You need to invoke them using the easily missed “Explore” function, hidden at the bottom right.
In the example above, I uploaded a Google Analytics data file and, after I clicked on Explore, besides offering me some graphs it also let me type questions in natural language to get answers. Some suggested types of questions you can try are already shown, e.g. finding correlations, medians etc.
It’s still not very smart at interpreting what I want but it’s possibly going to get better as more people use it.
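If you want to sanity-check the answers Explore gives for questions like medians and correlations, the same numbers are easy to compute yourself. Below is a minimal Python sketch using made-up session and pageview figures (not my actual Google Analytics data), with the median from the standard library and a hand-written Pearson correlation.

```python
import statistics
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length number lists."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


# Made-up daily figures, purely for illustration.
sessions = [120, 150, 90, 200, 170]
pageviews = [300, 390, 210, 520, 430]

median_sessions = statistics.median(sessions)
correlation = pearson(sessions, pageviews)

print("Median sessions:", median_sessions)
print("Correlation between sessions and pageviews:", round(correlation, 3))
```

A correlation close to 1 here just says that days with more sessions also had more pageviews, which is the sort of relationship Explore surfaces automatically.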
This was a really long post, with a diverse mix of tools, services and apps.
A lot of the machine learning type features, where the system provides recommendations and advice based on the data you have, seem to have started appearing in 2016, so things are still very fluid right now and we should expect more and more new services and tools to have such features. God knows when library-provided software will follow suit. 🙂
Hopefully, my post gave you some ideas on what can be done and inspiration to try out some of these tools that you think might be useful.