OpenCalais joins DocumentCloud initiative

Thomson Reuters today announced that it has joined forces with two dozen world-leading publishers on the DocumentCloud initiative. Funded by the John S. and James L. Knight Foundation with this year’s largest grant, DocumentCloud is a unique online resource that will provide public access to news reporters’ original source materials. It will debut in a beta version by the end of this year.

Specifically, Thomson Reuters has agreed to contribute high-volume use of its OpenCalais service to the DocumentCloud initiative. OpenCalais uses natural language processing (NLP) to “read” a document, instantly identifying and tagging the relevant people, places, companies, facts and events for improved search and navigation. This will make it easy for users to explore connections between newsmakers, corporations and events across documents and across the full collection of source materials.

“By using OpenCalais to tag documents, DocumentCloud will enable journalists, researchers and scholars to find otherwise hidden connections between people, companies and concepts across a body of documents,” said Barak Pridor, CEO, ClearForest, the Thomson Reuters company that produces the OpenCalais service. “DocumentCloud will also enable the public to get the ‘back story’ behind the news and see for themselves the information that reporters use to get at the facts. It’s an incredibly powerful tool for democracy and an important step in the ongoing evolution of citizen journalism.”

DocumentCloud is the brainchild of Scott Klein, editor of online development at ProPublica; Eric Umansky, senior editor of ProPublica; Aron Pilhofer, editor of interactive news technologies for The New York Times; and Ben Koski, software engineer in The New York Times’ interactive news technology group. It was inspired by The New York Times’ DocViewer software, which created a searchable database of the more than 11,000 pages of Hillary Clinton’s public schedule during her eight years as first lady in the White House.

Other publishers and non-profit organizations that have joined in supporting the DocumentCloud initiative include the ACLU National Security Project, Arizona Republic, The Atlantic, Center for Democracy and Technology / OpenCRS, Centre for Investigative Journalism (City University London), Center for Investigative Reporting / California Watch, Center for Public Integrity, Chicago Tribune, Dallas Morning News, Gotham Gazette, The Investigative Reporting Workshop at American University, The National Security Archive, The New York Times, New Yorker, MinnPost, MSNBC, Mother Jones, PBS NewsHour, ProPublica, St. Petersburg Times, Sunlight Foundation, Talking Points Memo, Voice of San Diego, Washington Post and WNYC.

Matthew Buckland: Publisher

