In a guest post for OJB, Barbara Maseda looks at how the media has used text-as-data to cover State of the Union addresses over the last decade. Continue reading “Text-as-data journalism? Highlights from a decade of SOTU speech coverage”
Over the last few months, I have been talking to journalists about their trials and tribulations with textual sources, trying to get as detailed a picture as possible of their processes, namely:
- how and in what format they obtain the text,
- how they find newsworthy information in the documents,
- using what tools,
- for what kinds of stories,
…among other details.
What I’ve found so far is fascinating: from tech-savvy reporters who write their own code when they need to analyze a text collection, to old-school investigative journalists convinced that printing and highlighting are the most reliable and effective options — and many shades of approaches in between. Continue reading “What do journalists do with large amounts of text?”
Because he sends me an email every December, Nic Newman has a tag all of his own on this blog. So as this year’s email lands in my inbox here’s my annual reply around what I’ve noticed in the last 12 months — along with some inevitably doomed predictions of what might happen in the next year… Continue reading “What changed in 2017 — and what we can expect in 2018 (maybe)”
The prices of my 3 data journalism ebooks (Data Journalism Heist, Finding Stories in Spreadsheets and Scraping for Journalists) have been cut to $5 on Leanpub in the lead up to Christmas. And if you want to get all 3, you can also get the data journalism books bundle on Leanpub for more than half price over the same period, at $13. Get them while the offer lasts!
This week I’m rounding off the first semester of classes on the new MA in Data Journalism with a session on artificial intelligence (AI) and machine learning. Machine learning is a subset of AI — and an area which holds enormous potential for journalism, both as a tool and as a subject for journalistic scrutiny.
So I thought I would share part of the class here, showing some examples of how the 3 types of machine learning (supervised, unsupervised, and reinforcement) have already been used for journalistic purposes, and using those examples to explain each type along the way. Continue reading “Data journalism’s AI opportunity: the 3 different types of machine learning & how they have already been used”
The event featured speakers from the regional press, hyperlocal publishers, web startups, nonprofits, and national broadcasters in the UK and Ireland, with talks covering investigative journalism, automated factchecking, robot journalism, the Internet of Things, and networked, collaborative data journalism. You can read a report on the conference at Journalism.co.uk. Continue reading “Here are all the presentations from Data Journalism UK 2017”
Earlier this year I announced a new MA in Data Journalism. Now I am announcing a shorter, part-time version of the course.
The PGCert in Data Journalism takes place over 8 months and includes 3 modules from the full MA:
- Data Journalism;
- Law, Regulation and Institutions (including security); and
- Specialist Journalism, Investigations and Coding
Today I will be introducing my MA Data Journalism students to SQL (Structured Query Language), a language used widely in data journalism to query databases, datasets and APIs.
I’ll be partly using the mapping tool Carto as a way to get started with SQL, and thought I would share my tutorial here (especially as the SQL tool has been harder to find since its recent redesign).
So, here’s how you can get started using SQL in Carto — and where to find that pesky SQL option. Continue reading “How to: get started with SQL in Carto and create filtered maps”
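For readers who have not seen SQL before, here is a minimal sketch of the kind of filtering query the tutorial is about. It is not taken from the tutorial itself: it uses Python's built-in sqlite3 module rather than Carto, and the table name, column names and figures are all invented for illustration.

```python
import sqlite3

# Hypothetical table and figures, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE crimes (area TEXT, category TEXT, incidents INTEGER)")
conn.executemany(
    "INSERT INTO crimes VALUES (?, ?, ?)",
    [
        ("Birmingham", "burglary", 120),
        ("Birmingham", "robbery", 45),
        ("Coventry", "burglary", 60),
        ("Coventry", "robbery", 20),
    ],
)

# SELECT picks columns, WHERE filters rows, ORDER BY sorts the result.
query = """
    SELECT area, incidents
    FROM crimes
    WHERE category = ?
    ORDER BY incidents DESC
"""
burglary_rows = conn.execute(query, ("burglary",)).fetchall()
print(burglary_rows)
conn.close()
```

The same SELECT / WHERE pattern is what drives a filtered map: only the rows that match the condition are kept.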
MA Data Journalism students Carmen Aguilar Garcia and Victoria Oliveres attended the Information is Beautiful awards this week and spoke to some of the nominees and winners. In a guest post for OJB they give a rundown of the highlights, plus insights from data visualisation pioneers Nadieh Bremer, Duncan Clark and Alessandro Zotta.
Nadieh Bremer was one of the major winners at the Information is Beautiful Awards 2017, winning in both the Science & Technology and Unusual categories for Why Are so Many Babies Born around 8:00 A.M.? (with Zan Armstrong and Jennifer Christiansen) and Data Sketches in Twelve Installments (with Shirley Wu).
In a special guest post Anders Eriksen from the #bord4 editorial development and data journalism team at Norwegian news website Bergens Tidende talks about how they manage large data projects.
Do you really know how you ended up with those results after analyzing the data from Public Source?
Well, often we did not. This is what we knew:
- We had downloaded some data in Excel format.
- We did some magic cleaning of the data in Excel.
- We did some manual alterations of wrong or wrongly formatted data.
- We sorted, grouped, pivoted, and eureka! We had a story!
Then we got a new and updated batch of the same data. Or the editor wanted to check how we ended up with those numbers, that story. Continue reading “How one Norwegian data team keeps track of their data journalism projects”
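One way to avoid the problem described above is to put every cleaning and grouping step in a script, so that the whole analysis can be rerun on a new batch or shown to an editor. The sketch below is only an illustration of that idea, not the Bergens Tidende team's actual workflow: the data, the column names and the cleaning rules are all hypothetical.

```python
import csv
import io
from collections import defaultdict

# Hypothetical raw export; in practice this would be the downloaded file.
RAW_CSV = """municipality,year,amount
Bergen,2016,"1 200"
bergen ,2016,300
Oslo,2016,n/a
Oslo,2016,2 500
"""


def clean_row(row):
    """Apply every cleaning rule in code so each alteration is recorded."""
    name = row["municipality"].strip().title()
    amount_text = row["amount"].replace(" ", "")
    if not amount_text.isdigit():
        return None  # drop rows with a missing or malformed amount
    return {"municipality": name, "amount": int(amount_text)}


def totals_by_municipality(csv_text):
    """Load, clean and group the data in one repeatable step."""
    totals = defaultdict(int)
    dropped = 0
    for row in csv.DictReader(io.StringIO(csv_text)):
        cleaned = clean_row(row)
        if cleaned is None:
            dropped += 1
            continue
        totals[cleaned["municipality"]] += cleaned["amount"]
    return dict(totals), dropped


totals, dropped = totals_by_municipality(RAW_CSV)
print(totals, dropped)
```

When an updated batch arrives, only the input changes; the cleaning rules and the count of dropped rows stay visible in the script rather than in someone's memory of what was done in Excel.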
A few weeks ago I posted a list of 9 great newsletters about data. The post generated so many suggestions of other newsletters that I thought I’d gather them together in a follow-up post. So, here are 9 more newsletters about data journalism, data science, and data visualisation.
1. Graphic Content
Graphic Content is a regular email newsletter — and Tumblr blog — from the head of data and transparency at the Institute for Government, Gavin Freeguard.
The format is simple: a collection of links to some of the most interesting data visualisation, data journalism and ‘meta data’ (other links about data) that day. You can subscribe to the newsletter here.
2. Hacks/Hackers
Hacks/Hackers is a global network of meetups for journalists (hacks) and developers (hackers) interested in the potential of data for newsgathering and storytelling.
The network also has a weekly email, which recently reached its 100th issue. It rounds up events around the world in the week ahead, jobs, funding and useful links. You can subscribe to it on their blog.
3. Best in Visual Storytelling
Rachel Schallom emailed to let me know about her weekly visual journalism newsletter Best in Visual Storytelling, “which isn’t 100% about data, but includes a ton of data-driven projects.” It arrives on Mondays. The sign-up form is here.
4. Data Elixir
The first of four newsletters suggested by Jeremy Singer-Vine, whose newsletter Data Is Plural featured in the original post, Data Elixir is “a weekly newsletter of curated data science news and resources from around the web” on Tuesdays, from Lon Riesberg. It’s already passed 150 issues.
5. Data Science Weekly
Surpassing that, Data Science Weekly recently hit its 200th issue. It focuses on data science, with news, articles and jobs. The archive covers everything from predicting NFL plays to tutorials on creating a bar chart.
6. Data & Society
Data & Society is a research institute “focused on the social and cultural issues arising from data-centric technological development.”
If you’re interested in the more critical/academic side of data journalism, their newsletter provides updates on their research, events, and other useful links.
7. The Data Science Community newsletter
NYU’s Center for Data Science publishes its own newsletter focused on the data science community and “featuring data science news delivered with humor & snark plus an always popular Tweet of the Week”. The emphasis here is on breadth with lots of detail on each link.
8. data.world Data Digest
Gabriela Swider from data.world – a new platform for sharing and analysing data – got in touch to recommend their Data Digest, which highlights a few of the most interesting datasets on the platform every Friday. Subscribe here.
9. Naked Data
And rounding off the list on a high is Jason Norwood-Young’s newsletter Naked Data — recommended by Anastasia Valeeva. “Sign up for a weekly roundup of the best data journalism projects, news, tech and happenings from around the world,” promises the sign up page. There’s a lot here beyond the usual suspects, and it’s well curated.
If you know of any newsletters not mentioned here or in the previous post, please let me know!
We’ve confirmed the line up for this year’s Data Journalism UK conference on December 5 — and I’m pretty excited about it.
We’ve managed to pack in networked data journalism and investigations, automation and the internet of things, and some practical sessions too, with my new MA Data Journalism students pitching in to help.
Tickets are available here including early bird and afternoon-only options, but you’ll need to be quick — the event sold out last year.
Here’s more detail on the running order…
Networked data journalism
Kicking off the day is Megan Lucero who has been leading the Bureau of Investigative Journalism’s project Bureau Local.
The former Times data journalist will talk about what they’ve learned one year into the project, which was established with £500,000 from Google’s Digital News Innovation Fund.
Also aiming to stimulate data journalism at a local level is the BBC’s new Shared Data Unit, based here in Birmingham.
Peter Sherlock, who heads up the team, will be talking about the first few months of that project as the unit takes on its first secondees from partners in local media.
On the day that we held the last Data Journalism UK conference, Johnston Press announced that they were forming a new investigations unit. Project lead Aasma Day will be here this year to talk about what has happened since.
There’s a terrific first panel of investigative journalists, including the winner of this year’s Paul Foot award, Emma Youle, and The Ferret’s Peter Geoghegan.
And Karrie Kehoe will be speaking about how she works on computational investigations at the Irish broadcaster RTÉ.
Automation and factchecking
Two more recipients of funding from the Google Digital News Initiative are speaking in the afternoon. Urbs Media CEO Alan Renwick has worked with publishers such as Thomson Regional Newspapers, Mirror Group, TES and DMGT, and was Strategy Director at regional group Local World.
And Mevan Babakar from FullFact will be speaking about their project to automate factchecking.
Joining them will be CW Anderson, the editor of the book Remaking The News, currently working on a forthcoming book about data journalism, and former Guardian media and technology reporter Mercedes Bunz, co-author of ‘The Internet of Things‘.
We’ll have practical sessions at different points in the day, with attendees invited to nominate skills they would like covered.
Trinity Mirror data journalist Rob Grant will be doing a session on R for journalists and I’ll be doing a session on handling big data, based on a story that involved analysing 37 million rows of crime data.
You can book tickets on the Eventbrite page.