Search Engines and Finding a Threshold Test for Plagiarism

Pheh Hoon LIM

Abstract


Nowadays, electronic search engines have replaced visits to the library for information, allowing in-print, out-of-print and orphan works to be conjured up from cyberspace for ready perusal. Such “allowed” borrowing of content on the internet has created a whole new cut-and-paste culture, making it easy for writers to freely download material and mix and match during the creative process. On the other side of the coin, copying and plagiarism, which may have been rife but were more difficult to track in a hard-copy world of text and books, are becoming easier to detect. The substantive legal test for copyright infringement has not been much affected by the sheer ease of copying digital material: it remains a question of whether a substantial amount of a particular work has been copied, whether the quantum is gauged qualitatively or quantitatively. Plagiarism, however, has never been subject to a comparable substantive test, probably because on its own it does not give rise to any legal action. A charge of plagiarism is nonetheless not without consequences.

Persons denounced as plagiarists face sanctions (severe in many academic environments) and social stigma. A recent example is the furore in New Zealand over Witi Ihimaera’s latest published work of historical fiction, The Trowenna Sea. The incident highlighted the finely tuned forensic role that search engines such as Turnitin and Google now play in today’s knowledge society, both in compiling and in unearthing historical material from all kinds of sources and eras. At the same time, every act of reusing a ‘free’ historical fact or picturesque turn of phrase without fastidious attribution now risks being detected and negatively publicised. This demonstrates the need for a clearer answer for writers, academics and students as to what is acceptable and what is not, even when one borrows and builds on material in the public domain. Search engines undeniably play a useful forensic role, but they have brought to the fore an urgent need for new and clear standards of good practice. Bearing in mind that search engines are heuristic devices, technology aids the detection of plagiarism only to the extent of helping to determine a set of experience-based rules for optimal solutions. While efficient controls are needed, the process should be worked out on the basis that it remains open to an aggrieved party to pursue a copyright claim if any borrowing (attributed or otherwise) amounts to a “substantial part” and thus infringement.

In this paper, the author first explores the nature of plagiarism and its intersection with copyright infringement, and then tackles three related issues. The first is whether there should be a threshold test for plagiarism in terms of the quantum of unacknowledged borrowing permitted before a plagiarism alert is triggered, and whether that test could be the same as copyright’s infringement threshold, namely substantial taking (whether assessed quantitatively or qualitatively). The second assumes that there should be a standard and questions the appropriateness of allowing technology to nudge the knowledge society towards greater accountability and zero tolerance for borrowing. From a purely pragmatic point of view, if it became the norm that, to avoid a claim of plagiarism, all writers must attribute each and every descriptive snippet of information they reuse, literary works might in future become overbalanced, with more footnotes or endnotes than text. The third issue is whether students and academics should be held to a higher standard (effectively zero tolerance) than other writers such as journalists, political speech writers or historical novelists.