Wg/linguistics/Legal Issues

This page is deprecated and its content has been added to legal issues.

=Copyright=
Copyright is especially problematic for corpus linguistics. The copyright status of aggregated and processed corpora is often unclear.

=Privacy=
Researchers often want as much metadata as possible about their subjects (age, sex, competence in other languages). Such metadata may raise privacy issues and cannot always be recorded.

=Ethics=
Linguistic and cultural data from developing countries are often felt to need protection from the Western entertainment industry (Disney etc.). The standard scenario is an animated motion picture of a sacred or taboo ritual dance shown in cinemas worldwide, possibly containing mythical knowledge reserved for shamans or similar. This is why many field workers insist on their data being CC-NC. It is important to acknowledge this fear regardless of whether one thinks it is well-founded.

=Licenses=
Always include a license/copyright statement with your data. State your intentions clearly.

If you have a choice, go for a license that complies with the Open Definition (http://opendefinition.org). This means that CC-noncommercial and CC-noderivatives are ruled out.

If for some reason you cannot follow the preceding advice, it is better to go for CC-NC/ND than to give no license information at all. But only a little better. CC-NC is about as bad as a traditional assertion of copyright, as it severely hampers the ability of your data to travel.
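
The rule of thumb above (NC and ND clauses are ruled out) could be checked mechanically when preparing a data release. The following is a minimal sketch only: the function name `violates_open_definition` is hypothetical, it assumes Creative Commons short codes such as "CC-BY-NC", and it encodes nothing beyond the NC/ND rule stated on this page. It is not a full check of conformance with the Open Definition.

```python
def violates_open_definition(license_code: str) -> bool:
    """Return True if a Creative Commons short code carries an NC or ND clause.

    Hypothetical helper: it only applies the rule stated on this page
    (noncommercial and noderivatives are ruled out) and knows nothing
    about non-CC licenses.
    """
    # Normalize separators so "cc by nc", "CC_BY_NC" and "CC-BY-NC" all match.
    parts = license_code.upper().replace("_", "-").replace(" ", "-").split("-")
    return "NC" in parts or "ND" in parts


if __name__ == "__main__":
    for code in ["CC-BY", "CC-BY-SA", "CC-BY-NC", "CC-BY-ND", "CC-BY-NC-SA"]:
        if violates_open_definition(code):
            print(f"{code}: ruled out (NC or ND clause)")
        else:
            print(f"{code}: not ruled out by the NC/ND rule")
```

A check like this only flags the two clauses named here; a license that passes it still needs to be compared against the definition itself.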