= Working Group on Open Data in Linguistics =

Purpose
1. Promote the idea and definition of open data, as specified at opendefinition.org, in linguistics and in relation to language data.

2. Act as a central point of reference and support for those interested in open linguistic data.

3. Facilitate communication between researchers from different communities that use, distribute, or maintain open linguistic data.

4. Serve as a mediator between providers and users of technical infrastructure.

5. Build and maintain an index of open linguistic data sources and tools that link existing resources.

6. Assemble best-practice guidelines and use cases concerning creating, using and distributing data.

7. Gather information on legal issues surrounding linguistic data and make it available to the community.

Blog
http://linguistics.okfn.org/

Meetings and Workshops
We usually meet at intervals of 6 to 8 weeks, either in person or via Skype. Aside from group meetings, we organize workshops.

Upcoming:
 * 2013, Nov (Etherpad, Doodle)

Previous Meetings

 * 2013, Oct 14: telco 02:00 CET (Etherpad)
 * 2013, Sep 23: 2nd Workshop on Linked Data in Linguistics (LDL-2013): Representing and Linking Lexicons, Terminologies and other Language Data, Pisa, Italy, held in conjunction with the 6th International Conference on Generative Approaches to the Lexicon (GL2013)
 * 2013, Aug 15-18: Linked Data in Linguistic Typology theme session at the 10th Biennial Conference of the Association for Linguistic Typology (ALT 10)
 * 2013, July 11: telco 02:15pm CET (wg/linguistics/minutes/20130711)
 * 2013, May 15: telco 06:15pm CET (wg/linguistics/minutes/20130515)
 * 2013, March 20th: telco 02:00pm CET (wg/linguistics/minutes/20130320)
 * 2012, Sep 23rd - 25th: workshop on Multilingual Linked Open Data for Enterprises (MLODE, http://sabre2012.infai.org/mlode), Leipzig, Germany
 * 2012, July 23rd: telco (wg/linguistics/minutes/20120723)
 * 2012, July 11th: OWLG lunch @ ACL 2012
 * 2012, May 9th: real life meeting @ LREC Istanbul (wg/linguistics/minutes/20120524)
 * 2012, Apr 30th: telco (wg/linguistics/minutes/20120430)
 * 2012, Mar 9th: real life meeting @ LDL 2012 (wg/linguistics/minutes/20120309)
 * 2012, Mar 7th - 9th: workshop on Linked Data in Linguistics (LDL 2012), Frankfurt/M.
 * 2012, Jan 25th: telco (wg/linguistics/minutes/20120115)
 * 2011, Dec 14th: telco (wg/linguistics/minutes/20111214)
 * 2011, Oct 24th: real-life meeting @ ISWC 2011, Bonn (wg/linguistics/minutes/20111024, from Etherpad)
 * 2011, Jun 30th: workshop at the OKCon 2011
 * 2011, May 27th: telco (minutes)
 * 2011, Jan 18th: real-life meeting in Berlin (minutes)
 * 2010, Dec 1st: real life meeting in Berlin
 * 2010, Oct 26th: real-life meeting in Berlin
 * 2010, Oct 19th: telco

Members
(Incomplete; please add yourself.)
 * Armelle Boussidan, CNRS Lyon
 * Felix Burkhardt, Telekom Innovation Laboratories, Germany
 * Christian Chiarcos, Applied Computational Linguistics, University Frankfurt am Main, Germany
 * Gerard de Melo, ICSI Berkeley
 * Judith Eckle-Kohler, Technische Universität Darmstadt
 * Sebastian Hellmann, Universität Leipzig
 * Nancy Ide, Professor of Computer Science at Vassar College and Technical Director of the American National Corpus project
 * Steven Moran, LMU Munich
 * Sebastian Nordhoff, MPI for Evolutionary Anthropology
 * Jonathan Pool, PanLex project, The Long Now Foundation
 * Cornelius Puschmann, University of Düsseldorf
 * Pablo Mendes, Freie Universität Berlin
 * Zoltán Varjú, linguist advisor, Weblib LLC, Hungary
 * Richard Littauer, University of Edinburgh
 * John McCrae, Bielefeld University
 * Igor Korsakov, IKorsa.net, St.Petersburg, Russia
 * Charalampos Bratsas, Aristotle University of Thessaloniki, OKFN Greece Coordinator, Greece
 * Mark Selent, Cleveland State University
 * Hugh Paterson III, User Experience Consultant, West Coast, U.S.A.
 * Ritesh Kumar, Asst. Professor of Linguistics (Computational Linguistics), Dr. B.R. Ambedkar University, Agra, India

Possible Projects

 * Collecting use cases and howtos, and developing best-practice recommendations for making linguistic data open, e.g., on
   * legal issues
   * technical issues
   * ontology building for linguistic resources
 * Maintaining a registry of collections of open corpora, dictionaries and other linguistic resources on CKAN
 * Developing a Linked Open Data (sub)cloud of linguistic resources, cf. Linguistics Linked Open Data cloud page
 * Developing a workflow repository and platform. Cf. Workflows page

Participate

 * Open Linguistics Mailing List: http://lists.okfn.org/mailman/listinfo/open-linguistics
 * Wiki page: http://wiki.okfn.org/wg/linguistics
 * Etherpad: http://okfnpad.org/OWLG

Tasks
As the community grows, we need to identify contact persons for specific tasks. These people are listed on the tasks and contact persons page. Anyone interested is also welcome to help with these administrative tasks. At the moment, wiki/website/blog curators are urgently needed.

Beyond that, everyone is encouraged to
 * invite other prospective members
 * ask around for other people to invite
 * discuss WG purpose, projects and ideas to be pursued by the group
 * participate in the projects identified above
 * collect case studies and best-practice recommendations, e.g., on
   * legal issues (licenses, copyright, etc.)
   * technical issues (howtos, format recommendations, etc.)
   * ontology building for linguistic resources (howtos, format recommendations, etc.)
   * workflows (ideas for workflow repository and platform development)
 * register data at CKAN and contribute to the Linguistics Linked Open Data cloud

Resources

 * Bibliography
 * Joint publications

Links

 * Copyright issues in the PanLex project: “Content ownership” section at http://panlex.org/tech/doc/deriv/sourcing.shtml
