Thesis/Full Text
Faculty of Science and Agriculture
First and foremost I would like to thank my supervisors, Ken Eustace and Geoff Fellows, for their support and guidance throughout this marvellous challenge. Special thanks go to Ken for volunteering his class to participate in this research, and to all the members of the ITC213 class who participated.
Thanks to honours coordinator Edward Stow, and acting honours coordinator Yeslam Al-Saggaf. Yeslam’s kindness and support throughout the uncertain first half of the year kept me on track. His inspiring PhD also proved a shining example of academic writing. It has proved an inexhaustible guide to structuring and writing my thesis.
I would like to thank the SCI401 tutors, Anne Lloyd, Barley Dalgarno and Jason Howarth, for broadening my understanding of research.
Thanks to “Butler B”, the residents of building 273, in particular Shannon Eves, for the support and laughter throughout the year. “It’s All Good.”
Many thanks to Deborah Buckley, whose company throughout the year was most welcome. I hope you learned as much from me as I did from you.
Finally, and most importantly, to my loving family, whose patience, kindness, and endless support allowed me to complete another invaluable year of study. Mum, Dad, thank you.
User-generated content is having an ever-increasing influence and presence on the Internet. Wiki communities, in particular Wikipedia, have gained widespread attention and criticism. This research explores criticisms and strengths of wiki communities, and methods to reconcile the two. Wiki communities have made monumental achievements in terms of breadth, depth, and apparent quality of content produced; however, critics are quick to point out that quality can vary greatly, and that none of the content is validated or academically useful. This research tests wiki software in an educational setting to determine indicators of quality, and how trust is created within the community. The results give insight into the use of wiki systems in educational settings, suggest possible methods of improving the validity of content created within wiki communities, and provide groundwork for further research in the area.
Introduction
The Internet has seen the astonishing growth of blogging, RSS, and podcasting as forms of user-generated content. Blogs are replacing traditional news sites, and online discussion and interaction, through the popularity of sites such as digg and Slashdot, are changing the way we find, judge and trust information. Wikis have continued this trend in user-built interactive information universes. Wikipedia, a free, open-content (public user created) encyclopaedia, has popularised the concept of a wiki, with many projects adopting MediaWiki (shown in figure 1.1), the software used by Wikimedia for Wikipedia, or creating their own custom wiki systems.
Examples of important wikis are
- Wikipedia and related projects (Wiktionary, Wikibooks, Wikicommons, Wikisource, Wikinews etc.)
- c2.com (Ward's Wiki) - The first wiki, hosting the Portland Pattern Repository and material on Extreme Programming
- Georgia Institute of Technology (CoWeb) - used by students of classes at Georgia Tech.
- New York Times Digital - Used by project teams within the company
- Motorola Systems-on-Chip Design Technology (TWiki) used for project management, communication, documentation, article writing, and group scheduling
- Encarta Encyclopaedia, who introduced a wiki-like extension to their encyclopaedia
The Internet community, as an intellectual group, promotes distribution of information made as correct as possible by one's ability, but care must be taken, as that level of ability may be low. Google search and PageRank help indicate the best pages, improving the apparent average quality, as writers link to sources they trust. When reading blogs, the standard is set by the best blogs, because nobody reads the average blog (Graham 2005).
(Graham 2005)
Open source and wikis break the traditional model of publishing; rather than authors publishing in their own spaces, and competing for an audience, authors contribute to the same space, attempting to improve the collective writing of the community. To paraphrase Graham (2005): people contribute what they like, the good stuff stays, the bad gets removed.
Herein lies the major criticism of wikis in general: that quality cannot "evolve" from this process. There is no guarantee of the accuracy of content, and there is no formal process of validation, by which content is said to be correct. Rather, a continual process is used, where content is constantly being validated and edited, and accuracy is transitory.
This research seeks to understand these criticisms, and discern how to improve the content that wiki and open-content communities have worked to create, making it more authoritative and widely usable in the academic world.
Definition of Selected Terms
Apache a popular open source web (HTTP) server.
blog a publicly accessible journal published online using specialised software, where entries are presented in reverse chronological order, usually written by a single author, or a small group. Most blogging software supports RSS, which allows readers to subscribe to a blog, and automatically receive updates.
CamelCaps a method of joining words together by capitalising each word before removing spaces between words. Commonly used by programmers, and in some wiki systems.
CGI Common Gateway Interface. A technology used by web servers to allow the server to communicate with an external application, allowing the application to respond to a user request.
click-stream a list of links or web pages a user follows while browsing the Internet
CSS Cascading Style Sheets. A document used to store formatting information for an HTML page. CSS facilitates the separation of content and formatting.
CSU Charles Sturt University.
digg a social bookmarking website presenting science and technology news. Users may vote news items up and down, varying the popularity of the item.
extreme programming an incremental software development methodology, emphasising the need for the software and developers to be adaptable, to be able to respond quickly to changes during the development lifetime.
gift culture a community where goods and services are given away in exchange for favours or respect.
GNU Recursive acronym for GNU's Not Unix. A free operating system and related tools and applications.
group-think the act of conforming to the shared opinion of a community, without a significant attempt to consider alternatives
Hacker an enthusiast (Raymond 2006). Specifically, in this document, a software developer. N.B. While this is the original meaning of the word, today it is often corrupted to mean someone who breaks security.
hook a software construct whereby a module may request to be called to handle an event
HTML Hyper-Text Mark-up Language. A document format used for writing and formatting web pages.
HyperCard a powerful and flexible programming environment written by Apple Computers.
IP address Internet Protocol address. A unique number assigned to all devices (typically computers) connected to the Internet. This number facilitates the forwarding of information on the Internet to the correct destination.
ISP Internet Service Provider. An organisation that provides Internet access.
JavaScript a scripting language commonly used for performing simple tasks within a web browser.
MIME type Multipurpose Internet Mail Extensions type. A part of an Internet standard used for specifying information (typically file) formats. The MIME type specifies a content type and subtype. The major types are application, audio, image, message, model, multipart, text, and video.
MediaWiki database-backed wiki software developed closely with Wikipedia and its community. Probably the most popular and recognisable wiki engine.
MySQL a popular open source database management system.
namespace in MediaWiki a namespace is an abstract virtual container allowing articles to be grouped such that articles from different namespaces with the same name do not conflict. In MediaWiki namespaces are used for separating different types of content, such as help content, personal content, templates, and images. Namespaces are identified by a phrase placed before a colon in the full identifier of an article, e.g. Help:FAQ.
NPOV Neutral Point of View. A Wikipedia policy stating that 'all articles must be written from a neutral point of view, that is, they must represent all significant views fairly and without bias' (Wikipedia Contributors 2006k).
open-content works not produced for profit and released for distribution and improvement by others at no cost. Such works are often written collaboratively.
open source open-content source code publishing, where the source materials used in generating the end product are also released. Most commonly refers to open source software, where the source code is released along with the finished product.
PageRank A method for determining a numerical approximation of the reputation of a web page, used by Google search for ranking search results.
Perl the specification for a high-level interpreted programming language, sharing features with C and AWK, well suited to processing text files.
perl a software implementation of the Perl specification.
PHP PHP: Hypertext Preprocessor. A popular open-source programming language commonly used for writing web applications.
PIM Personal Information Manager. Software that combines features such as notes or todos, calendars or communications (email/instant messaging/telephone/fax), as an organisation aid to the user.
podcast a collection of files (typically audio or video) distributed on the Internet using the enclosures feature of RSS to "push" the files out to subscribers. Podcatcher or aggregator software allows users to subscribe to RSS "feeds" which signal the software to download new files as they become available.
RCS Revision Control System. Software used to manage multiple versions of files (such as documentation or program source code). Such a system typically allows a user to review or revert to previous versions, as well as track changes and related meta-data (contributing user, date etc.)
RSS Really Simple Syndication (most common meaning). A specially formatted file published on the Internet, containing a series of entries. These textual entries usually contain a summary of available content, such as blog items, news items, or podcast items. End user software is used to automatically collect up-to-date versions of these files, and present the contained summaries to the user. An "enclosures" feature allows the inclusion of a file (typically audio or video) with each entry.
seeding creating the initial set of pages in a wiki, providing an initial structure and guidelines for users.
Slashdot A popular technology news site, with a large and active community. The Slash software used on the site contains a moderation system used to rate and filter the often hundreds of comments posted in reply to each news item.
social bookmarking web based collaborative repository of Internet bookmarks (URLs or links). Such repositories typically support some sort of rating or commenting mechanism to help visitors find and manage bookmarks.
Special Page a set of dynamic pages in MediaWiki, facilitating functions such as deleting pages, searching, moving pages, logging in and out, and various administration functions.
spider or crawler. Automated software, typically used by search engines, that finds and downloads web pages, using links in pages already downloaded to find new pages to download.
SQL Structured Query Language. A programming language designed to provide an interface to database management systems.
user sub-pages Sub-pages are a feature of the MediaWiki software where pages may be created logically "beneath" another page. For example, a page titled Animals may have a sub-page called Animals/Dogs. A user sub-page is a sub-page beneath a user's personal page in the "User" namespace.
wiki 1. a website allowing collaborative authoring, where users may add, edit and remove text (or possibly other media) in a single central repository of "pages". 2. software that facilitates such functions.
Wikimedia A not-for-profit organisation co-founded by Jimmy Wales. Wikimedia maintains several web sites including Wikipedia, Wikinews and Wikibooks.
Wikipedia A free open-content multi-lingual encyclopedia run by the Wikimedia Foundation.
WYSIWYG What You See Is What You Get. A phrase used to describe the ideal in document editing, that the content will appear on the printed page (or other final format) as it does on screen (during editing).
Outline of Chapters
This research is presented in five chapters. Chapter two reviews literature in the areas of wikis, and online trust and reputation. It serves to introduce the field to the reader, to explore what is known in these areas, and to identify voids that research has yet to explore. These voids propose topics for the research detailed in later chapters.
Chapter three details the nature of the research being performed. The first half of chapter three outlines assumptions, limitations, the research questions being studied, and the methods for achieving the goals of the research. The second half explains and expands the technical aspects of the tools used in the research, as well as the reasons for their selection.
Chapter four presents the data, analysis and results of the research, discussions on how the data is interpreted, and explanations of their relevance and importance.
Chapter five summarises the process taken in this research, and provides a broad discussion and conclusions based on the research. It summarises the results of chapter four, and provides interpretations and limitations of these findings. Finally it suggests avenues for further research.
Literature Review
Introduction
The purpose of this chapter is to provide a summary of the existing literature. It will give the reader a good understanding of the concepts involved in this research.
The chapter will begin by introducing wikis, their history, and development. The topic will be expanded further by explaining the principles behind wiki systems, and some ideals behind their use. The manners in which wikis have been used in various settings will be explored, followed by a survey of the communities who use them. Wiki systems will be compared to other forms of online communication, and finally several common wiki systems will be reviewed.
The chapter will then enter into a discussion of trust, and how it is important online and within wikis. This will be expanded to a discussion of reputation, and several existing online reputation systems will be reviewed.
The chapter will conclude with an introduction to the concept of information attention.
Wikis
Leuf and Cunningham (2001), p. 14 (cited in Schwall 2003) define a wiki from a technological standpoint as
From a conceptual view the Wikipedia Contributors (2006e) describe a wiki as:
There are several major defining points generally accepted to be featured in a wiki system:
- Comprises a server component, storing and serving pages, as well as a client component (usually a web browser) providing an interface for viewing and changing content
- All/most/some content is editable by its users. The amount of editable content varies from wiki to wiki, to the extent that a system may be called a wiki, where wiki software is used, but content is not made editable to a community.
- Editing is open to all members of a community. This too varies from wiki to wiki. In some editing is completely open, in others only a small group of authorised contributors are permitted.
Several minor points are also important:
- Pages can link to each other, as well as to pages that do not yet exist
- The editor uses a simple markup which is converted by the server software to standard HTML pages
- The markup language and page layout is simple, so as not to distract from content being the primary feature
- More than just a technological solution, a successful wiki comprises an active community of editors and content-checkers
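The linking and markup points above can be illustrated with a minimal sketch. It assumes a CamelCaps convention for page names, and the URL paths (/view/, /edit/), the function name render, and the trailing "?" link for unwritten pages are hypothetical choices made for illustration; they are not taken from any particular wiki engine.

```python
import html
import re

# Assumed convention: a page name is two or more capitalised word parts
# joined without spaces, e.g. FrontPage.
WIKI_WORD = re.compile(r"\b(?:[A-Z][a-z]+){2,}\b")


def render(wikitext, existing_pages):
    """Convert one paragraph of plain wikitext to an HTML fragment."""

    def link(match):
        name = match.group(0)
        if name in existing_pages:
            # The page exists: link to it for viewing.
            return f'<a href="/view/{name}">{name}</a>'
        # The page has not been written yet: offer a link to create it.
        return f'{name}<a href="/edit/{name}">?</a>'

    # Escape first so that reader-supplied text cannot inject raw HTML.
    return "<p>" + WIKI_WORD.sub(link, html.escape(wikitext)) + "</p>"


print(render("See FrontPage and NewIdea.", {"FrontPage"}))
```

The writer types only plain text; the server decides which words are links and whether the target page exists yet.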
Wiki History
Ward's Wiki
The first wiki software was a script developed by Ward Cunningham in 1995 (Cunningham 1995a, 2003, 2006a, Sparks 2006), based on some ideas previously explored with HyperCard (Cunningham 2006c). There is some debate over the originality of Ward's idea. Several earlier systems with similar characteristics predate Cunningham's wiki (WardsWiki Community 2005); however, it is generally accepted that Ward had no knowledge of these, and these earlier systems did not implement the concept at the scale, or with the radical openness, that Ward used.
Ward's system was developed as a supplement to the Portland Pattern Repository (a set of computer language and programming patterns). WardsWiki (as it became known) was built to provide an alternative documentation system to word processors, intended to support programmers. The new system supported revision control, was easy to use, and was highly automated (Schwall 2003).
(Cunningham 1995b)
This system provided a simple and convenient way for the community to communicate and share information in a fashion that would be useful for later reference by an external visitor. Cunningham built and further developed this software on a set of principles promoting openness and collaboration (see 2.2.2.1).
The full name for Ward's idea was WikiWikiWeb. The name wiki-wiki comes from the Hawaiian word meaning quick (Schwall 2003), chosen in preference to "quick web" so as not to conflict with other products such as QuickBasic (Cunningham 2003). The words in the name were concatenated in CamelCaps in the same way as hypertext links were identified by his system. The abbreviation wiki first came from the name of the script, published at http://c2.com/cgi/wiki. Lower case letters were used in accordance with Unix conventions for file naming (Cunningham 2003). Leuf and Cunningham (2001) used "Wiki" to refer to the concept, and "wiki" to refer to the implementation (a specific system, similar to the differences between "Perl" and "perl"). Although the original capitalisation is to write "wiki" in all lower case, today it has evolved into a noun, and is used as a proper noun when referring to specific systems (Schwall 2003).
Wikipedia
The wiki concept gained widespread attention in 2001, when Jimmy Wales and Larry Sanger launched Wikipedia (Ma 2006a, Szybalski 2005). A successor of the Nupedia project, Wikipedia implemented the same openness as WardsWiki, open for all to edit, but without the strict editorial control of Nupedia.
Nupedia was founded by Jimmy Wales in 2000, after he persuaded the owners of the company Bomis (of which Wales was one) to fund his idea of a free-content encyclopedia (Sanger 2005). The encyclopedia would ask academics and other experts in their fields to voluntarily contribute articles for incorporation into the encyclopedia. These articles would go through a seven-stage review process, and be approved by an academic advisory board before being publicly posted to the site.
After almost a year of development Nupedia had produced little more than 20 articles. This slow pace was identified as an ongoing problem, and Larry Sanger was assigned to address it (Sanger 2005). Larry proposed several ideas that were deemed too expensive, before proposing the idea of a wiki-like system. Due to the low cost of setting up a wiki, it was trialled, and in January 2001, announced to the (approximately 2000) members of the Nupedia mailing list. Many of these members started directing their energies to this new Wikipedia, and with a steady increase in membership over the months, the project started growing by tens of articles a day (Sanger 2005).
Work was ongoing in Nupedia to improve the process, by reducing the process to a simple submission-acceptance/rejection model, designed to take articles from Wikipedia, validate them, and publish in a separate "authoritative" repository (Sanger 2002). As participation continually declined in favour of Wikipedia, the project eventually ground to a halt, and was officially ended in 2003 (Sanger 2005).
Wikipedia today is still the most popular wiki, ranked the 17th most popular web site globally (Alexa Internet 2006); however, the popularity of wiki software has spawned many smaller wikis. Wikia houses the largest collection of wikis, the largest of which are Uncyclopedia (12.2M words), The Psychology Wiki (6.9M words), Wookieepedia (5.8M words) and Memory Alpha (4.8M words) (Wikia Inc. 2006). Other popular wikis exist, such as the Emacs wiki (http://www.emacswiki.org/cgi-bin/wiki), and the WordPress wiki, now called codex (http://codex.wordpress.org/Main_Page, originally at http://wiki.wordpress.org/). Many wikis exist, like WardsWiki, to support programmers, very often for use in open source projects such as OpenTTD (http://wiki.openttd.org/) and Bugzilla (http://wiki.mozilla.org/Bugzilla).
Criticism of Wikipedia
Criticism of Wikipedia stems primarily from its promotion as an encyclopedia. Using this label "carries a powerful connotation of reliability" (Orlowski 2005), something which the Wikipedia community cannot guarantee. Building an encyclopedia is an admirable and inspiring goal that drives contributors, leading to Wikipedia's success today.
(McHenry 2006)
The principal criticism of Wikipedia as an encyclopedia is that there are no limits on who may edit content, and that there is no formal peer review process. The freedom of editability is one of Wikipedia's greatest advantages, and has been the factor that has allowed Wikipedia to grow at such a phenomenal speed. Nupedia employed a strict editorial process of peer review that ultimately brought development to a crawl, whereas Wikipedia abandoned such limitations completely. This open editing is one of the philosophies of the community, allowing people to contribute anonymously, and refine/fix other contributors' work.
(http://c2.com/cgi/wiki?WhyWikiWorks cited in Leuf & Cunningham 2001 and Schwall 2003)
Critics say that allowing "any fool" to edit the encyclopedia is a great detriment to the encyclopedia, allowing poor quality content to enter into the encyclopedia. The community however disagrees; poor work is removed or repaired by the community.
(Sanger 2002)
Larry Sanger (Sanger 2004) has criticised Wikipedia as being "anti-elitist", lacking respect for the authority of experts who edit Wikipedia. This may be a cultural factor, with many Wikipedia editors being young technically oriented enthusiasts, who also tend to be anti-authoritarian (Raymond 2003).
Wikipedia also suffers from a bias, due to the demographics of the editorship (see 2.3.2.1).
The Wikipedia community acknowledges many problems with the project (Wikipedia Contributors 2006b) and has a detailed set of apologetics (Wikipedia Contributors 2006c) rebutting many of the criticisms. There is some effort to deal with the more major criticisms; however, the major focus is still to build content.
Wiki Philosophy
The wiki idea was born out of Cunningham's HyperCard application, and has since exploded to form vast communities of editors, a popular editing model, wiki supporters and critics. This section will review several perspectives of wiki systems.
Cunningham's Principles
Ward Cunningham originally designed his wiki based on a set of principles (Cunningham 2006b, Ma 2005, Wagner 2004), now regarded as central to any wiki system:
- Open - Should a page be found to be incomplete or poorly organized, any reader can edit it as they see fit.
- Incremental - Pages can cite other pages, including pages that have not been written yet.
- Organic - The structure and text content of the site are open to editing and evolution.
- Mundane - A small number of (irregular) text conventions will provide access to the most useful page markup.
- Universal - The mechanisms of editing and organizing are the same as those of writing so that any writer is automatically an editor and organizer.
- Overt - The formatted (and printed) output will suggest the input required to reproduce it.
- Unified - Page names will be drawn from a flat space so that no additional context is required to interpret them.
- Precise - Pages will be titled with sufficient precision to avoid most name clashes, typically by forming noun phrases.
- Tolerant - Interpretable (even if undesirable) behavior is preferred to error messages.
- Observable - Activity within the site can be watched and reviewed by any other visitor to the site.
- Convergent - Duplication can be discouraged or removed by finding and citing similar or related content.
The following four important concepts can be identified from Cunningham's principles.
Openness
- (Open, Organic)- editable by anyone
This is the single greatest feature of the wiki concept. The ability for any visitor to edit content is what allows wikis to grow so quickly. The barrier to entry of editing is kept very low. This feature however is the one that draws the greatest criticism. There is no authentication or checking the credentials of the editor. Anything changed becomes visible to the world immediately. The negative implications of this are reduced by the Observable concept below.
Some wikis and wiki software only partially implement this feature. Some require registration for editing, some only to create new pages. Others are completely private, requiring registration or invitation to view content.
Ease
- (Mundane, Universal, Overt, Tolerant)- markup is easy to learn, and familiar to users. Users cannot cause an error by entering incorrect markup
This feature also keeps the barrier for entry to editing low, allowing novice users to comfortably use the software with only a very gentle learning curve.
Software is also designed not to produce errors, but rather to interpret the wikitext as best it can. This ensures a novice user cannot "break" a page by entering incorrect text. The content should still be viewable, though the formatting may not be correct.
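A minimal sketch of this tolerant behaviour follows. It assumes a single markup rule, in which text between a pair of doubled apostrophes is emphasised; the function name render_tolerant is hypothetical and the sketch is not the parser of any real wiki engine. An unmatched marker is left in the output as literal text instead of raising an error.

```python
import html


def render_tolerant(line):
    """Render ''emphasis'' markers, leaving any unpaired marker as plain text."""
    # Escape HTML but keep apostrophes so the markers survive.
    parts = html.escape(line, quote=False).split("''")
    # An even number of parts means an odd number of markers: the last
    # marker has no partner, so put it back as literal text.
    if len(parts) % 2 == 0:
        parts[-2:] = [parts[-2] + "''" + parts[-1]]
    out = []
    for i, part in enumerate(parts):
        # Odd-indexed parts sit between a matched pair of markers.
        out.append(f"<em>{part}</em>" if i % 2 == 1 else part)
    return "".join(out)


print(render_tolerant("a ''well formed'' line"))
print(render_tolerant("a ''badly formed line"))
```

In the second call the formatting is not what the writer intended, but the content remains fully visible and editable.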
Precision
- (Unified, Precise, Convergent) - Page names kept simple, clear, concise and understandable by everyone. Duplication is reduced by assigning discrete topic names, and allowing articles to reference other articles.
This makes it easy for readers to find information, and provides a clear indication to writers as to the expected content of an article.
Cunningham recommends article names be contained in a flat namespace, i.e. all pages are equal, there are no sub-pages, where one page can be part of, and hidden by another page. This makes a search mechanism effective, and reduces the complexity of the software. A search for LCD in a hierarchical scheme might result in Science/Electronics/LCD and Science/Mathematics/Fractions/LCD. A flat namespace also reduces naming conflicts such as Science/Mathematics/Fractions/LCD and Science/Mathematics/ElementaryArithmetic/LCD. A flat namespace forces full names to be used such as LiquidCrystalDisplay and LowestCommonDenominator.
Keeping articles focused on discrete topics allows readers to locate and access articles simply and without confusion. This and the linking between pages allows a user to explore a topic by following links to related articles that interest them.
MediaWiki, for example, allows the creation of hierarchical namespaces; however, communities such as Wikipedia generally avoid use of this feature for general articles, with exceptions made for technical aspects of the wiki such as templates and user sub-pages.
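The LCD example above can be sketched in a few lines. The page lists and the helper find_by_leaf are hypothetical and exist only to illustrate the ambiguity; they do not describe how any wiki engine stores or searches pages.

```python
# Hierarchical titles: the final path component alone is ambiguous.
hierarchical = [
    "Science/Electronics/LCD",
    "Science/Mathematics/Fractions/LCD",
]

# Flat titles: the full name is the only name, so each one is unique.
flat = {"LiquidCrystalDisplay", "LowestCommonDenominator"}


def find_by_leaf(paths, leaf):
    """Return every hierarchical page whose last path component is `leaf`."""
    return [p for p in paths if p.split("/")[-1] == leaf]


matches = find_by_leaf(hierarchical, "LCD")
print(len(matches), "pages are called LCD:", matches)
print("LiquidCrystalDisplay" in flat)
```

In the hierarchical scheme a reader who types LCD must still choose between two pages, while in the flat scheme each title can be looked up directly.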
Observance
- (Observable)- Changes can be observed by any other visitor
This builds into the wiki a level of safety from poor edits. It allows contributors to check what changes were made, and decide if they are an improvement or not. A user could also restore a page to a previous version if a change was deemed not to improve the article.
Some wikis will also store along with the changes, the contributor who made the change, adding a level of accountability. This is thought to improve quality by making users accountable for their own work, encouraging them to make positive contributions (David 2004). Such software typically allows users to view another user's (or their own) contributions, by viewing a history of every change made by that user. This is thought by some to be detrimental, and a deviation from an ego-less spirit (Challborn & Reimann 2005), whereby the content is the most important part of the wiki, and any distraction from that is detrimental.
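The observable history described above can be sketched as a simple data structure. The class and method names (Revision, Page, revert_to, contributions) are hypothetical, and the sketch assumes that a revert is stored as a new revision so that no earlier change is ever erased.

```python
from dataclasses import dataclass, field


@dataclass
class Revision:
    author: str  # who made the change (the accountability record)
    text: str    # full page text after the change


@dataclass
class Page:
    revisions: list = field(default_factory=list)

    def edit(self, author, text):
        """Record a change; earlier revisions are never overwritten."""
        self.revisions.append(Revision(author, text))

    @property
    def current(self):
        """Text of the latest revision, or an empty string for a new page."""
        return self.revisions[-1].text if self.revisions else ""

    def revert_to(self, index, author):
        """Restore an earlier version by saving it again as a new revision."""
        self.edit(author, self.revisions[index].text)

    def contributions(self, author):
        """Indexes of every revision made by one contributor."""
        return [i for i, r in enumerate(self.revisions) if r.author == author]


page = Page()
page.edit("alice", "Good text")
page.edit("mallory", "spam")
page.revert_to(0, "bob")
print(page.current, page.contributions("mallory"))
```

Because the poor edit stays in the history, any visitor can still see who made it and who repaired it.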
Stallman's Encyclopaedia
Richard Stallman published a vision of a 'Free Universal Encyclopaedia and Learning Resource' (Stallman 1999). These ideas eventually helped shape Wikipedia and its values. The guidelines set out are aimed to ensure the usefulness, and continued success of such an encyclopedia (Stallman 1999).
Many of these values have survived in the modern wiki and Wikipedia. Stallman proposed the encyclopedia should be written by anyone in principle, although most writers would be teachers and outstanding students. Progress will, and should, be made in small steps. Few people have the time to make large contributions, but 'enough ... small contributions can cover the whole range of knowledge' (Stallman 1999). Stallman writes the project would not be a short one, and would take many years to create, but members should keep the original vision, and encourage others to join and contribute.
To ensure the encyclopedia remains free, and always accessible, the encyclopedia must be available to anyone with Internet access, using only free software to display or otherwise access the encyclopedia. The encyclopedia should be available for copying verbatim, including use for translation, and modification.
Finally, there should be no central control of quality, or committee governing creation of the encyclopedia. Stallman writes that such an authority would be too easy to politicize or corrupt. Reviews may be made by 'various groups which will earn respect by their own policies and actions' (Stallman 1999). Peer reviews and endorsements should be encouraged. Such reviews through more traditional means would boost the credibility of the encyclopedia's content. Such endorsements of a work however, would apply only to that version of the work. 'In a world where no one is infallible, this is the best we can do' (Stallman 1999).
Uses
Document and Thread Writing Mode
Leuf and Cunningham (2001) note that there are two formats in which to write within a wiki.
- Document Mode
- When a wiki page represents a single concise article, focusing on a specific topic, and usually written in an encyclopedic or third person tone. Typically viewed as community property, free to be updated by other users.
- Thread Mode
- Where the wiki page consists of many personal opinions or comments. Such comments are usually signed by the authors with their username. Replies may be posted to such comments, by placing the reply underneath the original comment. Such comments usually remain the "property" of the original author. It is generally considered rude to edit a personal comment, other than obvious spelling or grammar mistakes, or to move it intact, to another more appropriate part of the wiki (Wikipedia Community 2006b).
Individual Uses
Cunningham and Leuf identify the usefulness of wikis as a personal or individual tool. In this form, it might be used as:
- A PIM: A replacement for post-it notes. Notes are contained in a repository and are easily searchable, categorizable and sortable.
- A notebook, logbook, brainstorming: an unlimited free form notebook. Its associative ability (linking) adds value to the notes.
- address book/Internet link manager
- Collection manager (videos, books etc.): An unstructured 'database'. Leuf and Cunningham (2001), p.86 show how such a wiki can be used to easily answer specific questions by searching notes.
- An "Anywhere Resource": An online wiki that can be accessed from anywhere with an Internet connection.
- Document manager: An editing environment with built-in versioning system
It is worthy of note that this dissertation was written entirely using MediaWiki, using a custom designed extension for citation management, and custom CSS elements to make the visual format suitable for printing in the style dictated by CSU. With the exception of scripts used to collect and process data for presentation in chapter 4, all writing occurred in the wiki, including drafting, collaborative proofing, spell-checking (using the SpellBound Firefox Extension), bibliographic management, note taking, and printing the main body of the text.
Collaborative Uses
The more commonly accepted use of wiki software is as a collaborative tool between a group of people. Such settings may include special-interest groups, academic groups and corporate groups.
These three groups have different requirements of a wiki. Corporate users typically need some form of security infrastructure, such as a firewall, preventing users from outside the company accessing the wiki. Academic and special interest groups may desire to publish information on their wikis, but limit editing to only authorised users.
Actual uses of wikis vary widely. Leuf and Cunningham (2001) explain the planning process that goes into selecting a wiki system, planning its content, and seeding the wiki (providing initial structure and instructional guides (Leuf & Cunningham 2001, tip 4.3)).
Leuf and Cunningham (2001) note several applications of wiki software in a collaborative environment:
- Resource collections: well annotated collection of documents, images, quotes or other data
- Collaborative FAQ: an evolving Q&A site where well-answered questions are filtered into an appropriate section for easy use by visitors
- Project management: An effective central location for communication and planning for a project group. Threaded discussions may be left visible to see reasoning in the decision-making process.
- Web site management: A novel publishing medium, using a wiki but excluding threaded mode discussions, where only authorised users can edit pages. Used in this manner, a wiki is designed to be informative to the general public.
- Online guestbook: Another limited variation, where a small wiki is used to allow visitors to post comments to a web page.
Collaborative wikis may be short or long term. Special wikis, or project wikis may only exist for the period of the project, product, or other activity. Other wikis may be designed for general use over an extended time, such as resource collections, document management wikis, or FAQs, which may run over many years.
Open-content Communities
Open source software presents the best example of open-content communities. The term "open source", coined in 1998 (Open Source Initiative 2006), most commonly refers to open source software, which is software where the human readable source code is released to the general public. Users are normally given permission to use and modify such software (fixing problems or adding features), and after doing so, will often contribute those changes back to the author.
When users contribute their improvements back to the software community, all members of the community benefit. Where it is the norm to contribute improvements back to the community, the software develops very quickly. One motivation for contributing is the utilitarian progression of the community. The other motivation comes from such societies in effect forming gift cultures: by contributing to the community, members receive an improved reputation and a boosted ego (Raymond 2000).
(Graham 2005)
The Darwinian "bottom up" approach of online communities has potential that most publishers lack (Graham 2005): the ability to quickly adapt to changes in the industry or to severe problems. 'Companies ensure quality through rules to prevent employees from [making mistakes]. But you don't need that when the audience can communicate with one another' (Graham 2005).
Open-source loosely employs the Delphi method. The Delphi method is 'a method for structuring a group communication process so that the process is effective in allowing a group of individuals, as a whole, to deal with a complex problem.' (Linstone & Turoff 2002). This process is known to produce solutions to complex problems with amazing accuracy (Surowiecki 2004).
The Delphi method is reformulated by Raymond (2001) in the context of open source as "Linus's Law", 'given a large enough beta-tester and co-developer base, almost every problem will be characterized quickly and the fix obvious to someone'. That is to say, with a large enough community, there is enough talent distributed in the group, that for any problem, there is a subgroup who will find it easy to solve, and will do so quickly.
Wiki communities
Wiki communities have originated out of developer communities, or "hacker" (in the sense of a software designer or problem solver, see 1.1) communities, and other IT-literate communities who find interest in such open communities as wikis. As such, most members share the attitudes of hacker communities, of 'freedom and voluntary mutual help' (Raymond 2006).
(Leuf & Cunningham 2001, tip 10.1, pg. 323)
As Raymond (2006) suggests, an individual must truly believe in the ideals of the community to become an accepted member. Large wikis (and to some extent, smaller wikis) however, have room for casual members, who may only visit or contribute occasionally. For these members it can be confusing to understand how to appropriately contribute to a wiki, both technically, in terms of contributing positively, and socially, in terms of which changes are good for the community, and follow policy or group norms.
(Leuf & Cunningham 2001, pg. 324)
Leuf and Cunningham (2001) argue that a wiki is nothing without an active community. It is therefore important to encourage new members, both to grow a wiki and to replace retiring members. The evidence that most content in Wikipedia is contributed by casual visitors and infrequent contributors (Swartz 2006) suggests that encouraging visitors to contribute is of even greater importance.
(Leuf & Cunningham 2001, tip 10.3, pg 325)
Wikipedia Community
The English Wikipedia has 2.2 million registered users (Wikimedia Foundation 2006a). These people at some stage spent the 30 seconds it takes to register an account to edit under. This, however, is not a good measure of the size of the community. Many users may be inactive, and similarly, many non-registered users contribute without creating an account. Perhaps a better indicator is the number of administrators of the English Wikipedia, which stands at 1009. These are active and regular contributors to the project, who have gained a level of trust in the community such that they are granted an extended set of rights. Administrators in Wikipedia are able to delete pages, lock pages from being edited, and block users from editing, and are given more powerful methods of reverting edits (Wikipedia Contributors 2006f). Administrators are required in the community to field requests from users to perform the above-mentioned actions when necessary. They typically perform clean-up operations, such as reverting vandalism and fixing categories, as well as participating in discussion regarding issues within the project and project policies. To handle these requests effectively, the administrator pool must be large enough to support the volume of requests from the community.
Anthony (2006) measures the number of Wikipedia editors at about 38,000 members, by counting users who have made more than 5 edits. Anthony also supports Swartz (2006) in showing that administrators do not generally make substantial contributions of content to the encyclopedia. They are too busy being "janitors".
The Wikipedia community has defined a total of about 42 policies currently in use, in six categories (Wikipedia Contributors 2006j, 2006l):
- Behavioural
- Content
- Enforcement
- Deletion
- Legal and copyright
- Miscellaneous
distilled into five central "pillars" or summaries (Wikipedia Contributors 2006i):
- Wikipedia is an encyclopedia
- Wikipedia has a neutral point of view
- Wikipedia is free content
- Wikipedia has a code of conduct
- Wikipedia does not have firm rules
These policies are freely editable, like most other pages, by any visitor to the site, and as such, are said to represent consensus of the community, for the reason that if they did not, they would be frequently edited to reflect differing views. To highlight this, the following text is displayed with each policy:
(Wikipedia Contributors 2006d)
One of the most highly prized policies is the Neutral Point of View (NPOV) policy. This policy states that 'all articles must be written from a neutral point of view, that is, they must represent all significant views fairly and without bias' (Wikipedia Contributors 2006k).
Also defined by the community are a lesser set of policies called guidelines. These are defined similarly to policies, but without as wide an acceptance. The most cited is the "Be bold in updating pages" guideline, encouraging users to make changes without excessive hesitation, citing "Wikis develop faster when people fix problems, correct grammar, add facts, make sure the language is precise, and so on" (Wikipedia Contributors 2006a).
Wikipedia has several policies and guidelines advising editors to keep ego and personal agendas out of editing (Wikipedia Contributors 2006h, 2006m). However, with many editors being of the hacker mentality (see 2.2.2.4.1), ego is an important motivator for contributing (Raymond 2001). Challborn and Reimann (2005) note that contribution tracking mechanisms in MediaWiki detract from the ideals of 'ego-less spirit-of-wiki purists'. These mechanisms, however, are a vital tool in combating vandalism, and likewise in research into wiki editors and behaviours.
Collaborative Models/Rival Technologies
The wiki began as a simple concept, and started with a very simple design. Ward's wiki, like most wikis, has developed and become an increasingly complex system (Cunningham 2006a), with the addition of features such as revision tracking, security/user accounts, notifications, and syntax enhancements. The concept and usability, however, remain for the most part unchanged. The simple ability for users to change content has remained while additional features have accumulated around it.
Wikis are a form of communication, facilitated through a document repository. As a communication tool, a wiki is suited to certain tasks. Wikis are best suited to openly distributing and collaborating on textual documents, in groups of any size, where security is not a great concern. Wikis are not a push medium such as email. Checking for new content requires visiting the repository. This limits their ability as a direct form of communication without the incorporation of another technology such as email notifications or RSS updates.
Email
Email usually operates in a one-to-one mode, but by using cc fields, or a more structured distribution list software, email becomes a one-to-many medium. Email usually has no centralised repository, requiring each participant to organise and filter content themselves. It allows no facility for editing or annotating messages. Email becomes impractical as a document management system with a large group of people as changes must be managed and merged manually. (Leuf & Cunningham 2001)
Shared File/Folder Access
Usually takes the form of a document on a file server editable by all members of a group. Access to the file is often very transparent, such as a network file share. Any document is effectively editable by one user at a time, as changes from multiple users must be resolved manually. Email messages are often used in conjunction to discuss changes, and to alert other editors of updates for review. (Leuf & Cunningham 2001)
Blog
Weblogs originally started as a log of the web, with each entry linking to other sites. They have evolved to become simply a log on the web (Wagner 2004). Weblogs are usually maintained by a single individual or a small group of editors. Content is not editable by anyone other than these authors. Most blogging software facilitates user feedback in the form of short comments. These comments can provide feedback or suggestions on content, but their usefulness for iterative editing is limited.
Forum
A common form of many-to-many conversation. Allows threaded posting in a central space. Members are able to view and post freely, and sometimes all content is available for reading by any Internet user. (Wagner 2004)
Static Web-Site
A one-to-many form of communication. It provides no mechanism for alerting for updates, and no mechanism for feedback. A publisher will often publish an email address as a channel for feedback. Other web site owners may comment about, or provide feedback via their own web sites. This however, requires the original author to find this comment manually. (Wagner 2004)
Wiki Systems
Hundreds of wiki engines exist (WardsWiki Community 2006b, Wikipedia Community 2006a) designed for different uses, different environments, and different audiences. Leuf and Cunningham (2001) detail the difference in syntax of several wiki engines, including the original WardsWiki, the then current WikiWikiWeb, TWiki, Swiki/CoWeb, and Zwiki, as well as describing the use of some of these through case studies of their use in educational and business settings.
MediaWiki
MediaWiki (see figure 1.1) is probably the most recognisable wiki engine, due to its use for the popular Wikipedia, as well as its sister projects run by Wikimedia, and the commercial Wikia. MediaWiki is developed closely with the needs of Wikimedia and the Wikipedia community (MediaWiki.org Wiki Contributors 2006a). Its use on Wikipedia requires the code-base to be stable, and perform well enough to serve about 10,000 pages a second on the Wikimedia server farm. Started in 2001 (Dill 2001) as a replacement for the Perl-implemented UseModWiki (MediaWiki.org Wiki Contributors 2006b), the MediaWiki engine is written in PHP, using a MySQL back end. MediaWiki provides several features not commonly seen in other wiki engines.
The MediaWiki software allows content to be divided between several namespaces. The default namespace is where the bulk of the content is usually placed. In the case of Wikipedia, this is where the encyclopedic articles are. The project namespace (the name usually follows the name of the wiki) is usually where details about the project, its goals and policies are written. A help namespace provides a space for instructions guiding users on contributing to the project. Each namespace has a partner "talk" namespace, giving every page in every namespace a partner discussion page, where the article's contents can be discussed and debated without soiling the actual article. This allows Wikipedia's contributors to discuss or dispute facts, or organise themselves, without interfering with the readers of the encyclopedia. A template namespace is used for writing sections of wiki markup that can be included on other pages. These are commonly used for adding banners to pages, such as the "This is a current event" banner, informing the user that the article is likely to be frequently updated.
MediaWiki supports user accounts, letting each user define their own set of preferences for using the software, allowing the software to track individual contributions, and letting users define watchlists. Users can use watchlists to select a group of pages they are interested in, requesting the software to alert them when one of those pages is changed by another user.
MediaWiki also supports uploading of files (limited to certain MIME types by default), and a powerful wiki-syntax allowing complex layouts and visual styles. The MediaWiki code-base is commonly reported to be poorly structured, as is the MediaWiki documentation; however, these have been steadily improving (WardsWiki Community 2006a).
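The pairing of content and talk namespaces can be illustrated with a short sketch. It assumes MediaWiki's default namespace numbering, in which content namespaces carry even numbers and each partner talk namespace carries the next odd number. That numbering is not described in this chapter, and the function names are hypothetical.

```python
# A subset of MediaWiki's default namespace numbers, assumed here for
# illustration. "Project" is displayed under the wiki's own name on a live site.
NAMESPACES = {
    0: "",               # main (default) namespace: articles
    1: "Talk",
    4: "Project",
    5: "Project talk",
    10: "Template",
    11: "Template talk",
    12: "Help",
    13: "Help talk",
}


def talk_namespace(ns: int) -> int:
    """Return the talk namespace paired with the given namespace number."""
    return ns if ns % 2 == 1 else ns + 1


def talk_page_title(ns: int, title: str) -> str:
    """Build the full title of the discussion page partnered with a page."""
    prefix = NAMESPACES[talk_namespace(ns)]
    return f"{prefix}:{title}"


print(talk_page_title(0, "Wiki"))
print(talk_page_title(10, "Current event"))
```

Because the talk page is derived from the page's own namespace and title, every page gains a discussion space without any per-page configuration.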
c2Wiki
Cunningham's wiki was the original wiki implementation hosted at c2.com. The simple single-file Perl CGI script uses a flat file database for storing wiki text and search indexes. The c2Wiki has remained largely unchanged. In particular the layout and editing remain the same, although several features have been added. The software now allows users to assign themselves a username used to track their edits, allows pages to be deleted, keeps a history of all changes to a page, shows if a page is a new page or deleted in RecentChanges and allows edits to be marked as a minor edit when only a small change is made. Some of these features, however, have been disabled due to spam abuse (Cunningham 2006a).
It is interesting to note that WardsWiki was not designed around a technical model, rather Cunningham based it on a set of concepts about how the content of the wiki should be treated and how people should interact with the software. (see 2.2.2.1)
UseModWiki
UseMod closely follows the style of WardsWiki. It is a single-file, Perl-implemented wiki using a flat file database. Its interface is simple like that of the original wiki (UseModWiki Community 2005). UseModWiki includes many desired features, such as recent change lists, page diffs (showing the differences between two versions of a page), sub-pages (allowing a page to act like a separate wiki), interwiki links (allowing easy linking to other structured sites such as wikis), page redirects, edit conflict detection and page locking. Its simple implementation (a single Perl file with a single data directory) makes it easy to set up.
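The idea of a page diff can be sketched in a few lines. This is an illustration only: it uses the difflib module from Python's standard library rather than UseModWiki's own Perl code, and the function name and sample page text are invented for the example.

```python
import difflib


def page_diff(old_text: str, new_text: str) -> list[str]:
    """Return a line-based unified diff between two revisions of a page."""
    return list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile="revision 1",
            tofile="revision 2",
            lineterm="",
        )
    )


old = "Wikis are editable.\nAnyone can contribute."
new = "Wikis are editable.\nAnyone with a browser can contribute."

# Removed lines are prefixed with "-", added lines with "+".
for line in page_diff(old, new):
    print(line)
```

A line-based diff suits wiki text because edits are usually small relative to the page, so a reviewer only needs to read the changed lines.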
TWiki
TWiki represents a different type of wiki. Promoted as an "Enterprise Collaboration Platform", TWiki provides an extensive set of powerful features, such as WYSIWYG editing, fine-grained user control, calendaring, charts, database integration, slide-show presentations and spreadsheets (TWiki Developers 2006). TWiki is implemented in Perl, using simple file-based storage, or using the GNU Revision Control System (RCS). TWiki's pages are edited via a form where the user may edit the plain wikitext. Features such as calendars are controlled using special markup in this wikitext. A beta version of a WYSIWYG editor is also released with TWiki for editing pages.
Trust
For the purposes of this research, trust is defined as:
(Shneiderman, 2000 cited in Preece 2000, p.192).
Trust plays an important role in decision making. Jøsang, Ismail and Boyd (2005) point out the distinction between reliability trust (above), and decisional trust, where decisional trust is the 'extent to which one party is willing to depend [on another party or tool or process]' (Jøsang, Ismail & Boyd 2005, pg. 4).
When interacting in the physical world, we rely on a wide range of cues to determine trustworthiness. These cues are mostly absent in online situations, therefore requiring that substitutes be present for trust to form (Jøsang, Ismail & Boyd 2005). Feng, Lazar and Preece (2004) explain how video and audio communication can be almost as good as face to face communication for generating trust, while text based communication scores poorly.
Trust Within Wikipedia
Shneiderman's definition above shares a few elements with a discussion by Preece (2000), citing three conditions necessary for trust. The conditions are summarised as follows:
- There must be a high probability of future interaction.
- During interactions, members must be able to identify other individuals.
- There must be a record of past interactions.
Wikipedia partially applies these three conditions. Its rapidly increasing popularity (Alexa Internet Inc 2006) would seem to suggest that people are not only revisiting the site, but influencing others to visit and revisit.
Wikipedia allows users to create an account, giving that user a unique "name" under which to edit. There are however some flaws. Anonymous users are allowed to edit pages. Until December 2005, anonymous users could also create new pages; however, this ability was removed to reduce the workload of editors checking articles (Wikimedia Foundation 2005). "Anonymous" users are nonetheless identified by IP address, tracing them back to their ISP. There is debate (Wikipedia Community 2005) as to whether "anonymous" edits are more traceable than edits by registered users. It is possible for users to create multiple accounts, and as IP addresses are only shown (to the public) for editors that are not logged in, it is argued that such use of "sock puppets" ('an additional username used by a Wikipedian who edits under more than one name' Wikipedia Community 2006c) creates a greater level of anonymity than users identified by IP address.
MediaWiki does, however, keep a history of every edit made to every article, which makes reverting vandalism easy and allows the identification of which user made which change.
There are no simple and convenient measures of trust between members, such as the numeric rating eBay gives; however, if one member has reason to doubt the reliability of another member, the means exist to investigate their previous performance.
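The way a complete edit history supports both reverting and attribution can be sketched as follows. This is a simplified model, not MediaWiki's PHP implementation; the class names, method names and sample edits are hypothetical. It assumes a revert is stored as a new revision carrying the text of an earlier one.

```python
from dataclasses import dataclass, field


@dataclass
class Revision:
    user: str  # account name, or an IP address for an anonymous edit
    text: str


@dataclass
class Page:
    revisions: list[Revision] = field(default_factory=list)

    def edit(self, user: str, text: str) -> None:
        """Save a new revision; earlier revisions are never overwritten."""
        self.revisions.append(Revision(user, text))

    def current(self) -> str:
        """Return the text of the latest revision, or an empty string."""
        return self.revisions[-1].text if self.revisions else ""

    def revert_to(self, user: str, index: int) -> None:
        """Restore an earlier revision by saving its text as a new revision."""
        self.edit(user, self.revisions[index].text)

    def authors(self) -> list[str]:
        """List who made each change, oldest first."""
        return [r.user for r in self.revisions]


page = Page()
page.edit("Alice", "Wikis are collaborative web sites.")
page.edit("203.0.113.7", "WIKIS ARE RUBBISH")  # vandalism by an anonymous user
page.revert_to("Bob", 0)                       # restores the earlier text

print(page.current())
print(page.authors())
```

Because nothing is ever deleted from the history, the vandalised revision remains on record against the address that made it, which is what allows past behaviour to be investigated later.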
Osterloh, Rota and Wartburg (2001) show how trust between contributors in open source projects is related to the concept of "swift trust". Swift trust defines a form of trust developed in temporary teams assembled for a common task. As a form of describing shared norms, Osterloh, Rota and Wartburg (2001) argue that swift trust is a suitable explanation of how open source projects develop trust. Members of such communities accept and adhere to "norms of cooperation", and in turn, expect others to. Osterloh, Rota and Wartburg (2001) state that good participation does not necessarily come from interpersonal trust, but rather a perpetual "highly active, proactive, enthusiastic, generative style of action" present in the community (Meyerson, Weick, & Kramer, 1996, p. 180, cited in Osterloh, Rota & Wartburg 2001).
(Resnick & Zeckhauser 2002)
Trust of Wikipedia
High profile cases of libel and vandalism have drawn public attention and scrutiny to Wikipedia's vulnerabilities. McHenry (2004, 2006) presents criticism which represents the public distrust of Wikipedia. He identifies problems with Wikipedia's processes, such as the unproven model of content creation, internal politics, the writing style accepted, and the lack of quality control. Orlowski (2005) and Seigenthaler (2005) detail a high profile example of a problem found in Wikipedia's content, the case of the John Seigenthaler article, which for 132 days contained libellous information against him, contributed by an anonymous author.
Ma (2006b) explains how Wikipedia's similarities to the open source model may affect its perception. While some people buy brand name products, whose quality is guaranteed to a point by the author, some opt for open source products, which have been peer reviewed by a wider range of reviewers, with a wide range of biases and preferences, rather than a single small group of employees. The scale of Wikipedia may however break down this model.
Systemic Bias
Ma (2006b) identifies a systemic bias in the editors of Wikipedia. She reports that Wikipedia users are generally male, technically-inclined, formally educated, speak English, and are from an industrialized nation. These demographics pose some problems.
Wikipedia's goal to provide 'every single person [with] free access to the sum of all human knowledge' (Wikimedia Foundation 2006b) requires two things: first, that the content is made accessible to all people, and second, that all knowledge is collected.
The collection of "all human knowledge" is an unachievable ideal, albeit a romantic one that has served the foundation well so far, and Wikipedia's systemic bias has hindered progress towards it. In the first years of Wikipedia, the number of articles on The Lord of the Rings eclipsed the number of articles on all of Africa, because that is what interested the community more at the time (Sanger ?, (Kapor 2006)). More recently, Ma (2006b) compares the three articles on the Kashmir earthquake, Hurricane Katrina, and the Indian Ocean earthquake, showing that neither the scale of the incident (number of deaths) nor the time of the event influenced the detail of the article, but rather the locality of the event in relation to the bulk of contributors. Similarly, she shows that although malaria is a more severe condition than allergies, the article on allergies is more detailed because more Wikipedia contributors suffer from allergies than from malaria.
The second aspect of Wikimedia's goal is that the information is made accessible. This not only means that everyone has access to the content, but that they can read and understand it. Although Wikipedia employs its NPOV policy, the fact remains that any "consensus" reached on Wikipedia, is a consensus of the subset of people who contribute to Wikipedia. Wikipedia's article on combating systemic bias (Wikipedia Community 2006d) shows how intellectuality, religious ideals, and social status, among other aspects, influence the audience for which articles are written. It suggests editors expose themselves to foreign media (newspapers from the locations about which they are writing), to achieve a more balanced view on the facts.
Reputation Systems
Reputation systems exist to record, manage and summarise data, and to present to the user a metric of another user's standing within the community. These systems are required to allow users to generate trust despite the lack of real-world cues. They allow any two users, with no prior interaction, or who may only ever interact once, to generate trust.
Slashdot
Slashdot is a news site dedicated to computing and technology, with a very active community of commentators, participating in long and detailed discussions on each news item (Lampe & Resnick 2004). The site adopts what is in essence a blog format, with editors posting about two dozen stories each day (from a pool of stories submitted by members), with each story viewable in its own page along with a threaded discussion. Each story typically draws a few hundred comments, so to prevent information overload and improve the readability and quality, Slashdot implements a comment moderation system (Lampe & Resnick 2004, Malda 1999). The ever-increasing number of comments complicated this problem. The system went through several evolutions, ultimately producing a distributed moderation system, in keeping with four goals:
- Promote Quality, Discourage Crap
- Make Slashdot as readable as possible for as many people as possible
- Do not require a huge amount of time from any single moderator
- Do not allow a single moderator a 'reign of terror'
The current system potentially allows any registered member to make a limited number of moderations, in keeping with defined eligibility rules. Each time a moderation is made on a comment, it changes the score of the comment by one, within the range of -1 to 5. Moderations on a member's comments influence that member's karma, Slashdot's measure of reputation. Karma is displayed as "Good" or "Bad" (although the internal measure is actually more complex). Members with positive karma are able to moderate, and have higher initial scores for any comments they submit.
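The scoring rules described above can be sketched in code. This is a simplified model, not Slashdot's actual implementation: karma is reduced to a single integer, the initial scores of 1 and 2 are assumed values, and the class and function names are hypothetical. The limit on the number of moderations per member is omitted.

```python
from dataclasses import dataclass

MIN_SCORE, MAX_SCORE = -1, 5  # score range given in the text


@dataclass
class Member:
    name: str
    karma: int = 0  # assumed integer stand-in for the internal measure

    def can_moderate(self) -> bool:
        """Only members with positive karma are eligible to moderate."""
        return self.karma > 0


@dataclass
class Comment:
    author: Member
    score: int


def initial_score(author: Member) -> int:
    """Assumed values: 1 by default, 2 for authors with positive karma."""
    return 2 if author.karma > 0 else 1


def moderate(moderator: Member, comment: Comment, up: bool) -> None:
    """Move the score by one within the allowed range and adjust karma."""
    if not moderator.can_moderate():
        raise PermissionError(f"{moderator.name} is not eligible to moderate")
    delta = 1 if up else -1
    comment.score = max(MIN_SCORE, min(MAX_SCORE, comment.score + delta))
    # Assumption: the author's karma moves even when the score is clamped.
    comment.author.karma += delta


alice = Member("alice", karma=3)
bob = Member("bob")
post = Comment(bob, initial_score(bob))
moderate(alice, post, up=True)
print(post.score, bob.karma)
```

Clamping the score keeps any single comment from being pushed indefinitely in one direction, while the karma adjustment carries the outcome of each moderation over to the author's future comments.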
eBay
eBay, an online auction site, challenges the traditional model of buying and selling. The traditional model, of established shop-fronts, and plazas, requires very little trust on the part of the buyer. The buyer has a vast array of cues as to the reputability and commitment of a dealer, for example, the condition of the shop, and the other patrons in the shop. The buyer can also see the exact item they are purchasing, with no doubt as to its specifications or general quality (damage etc.) (Resnick et al. 2006).
These cues are, however, not available when buying online. The only cues available are the site you are viewing, and the information the seller publishes about themselves and their product, any or all of which may be false. eBay's model introduces a reliable third party to vouch for the seller's reliability, or lack thereof. eBay's mechanisms promote trust and positive behaviours among its members (Dellarocas 2001).
eBay publishes reputation information for each member. This reputation information is created whenever a buyer and seller complete a transaction. Each participant provides feedback about the other, in the form of a positive/negative/neutral indication, as well as a short textual comment. A member's feedback history is made available to all other members, showing textual feedback, as well as the sum of positive, negative and neutral feedback. (Dellarocas 2002)
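The feedback record described above can be modelled with a small data structure. This is a sketch based only on the description in this section, not eBay's actual system; the class names, method names and sample feedback are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass, field

RATINGS = ("positive", "neutral", "negative")


@dataclass
class Feedback:
    rating: str   # one of RATINGS
    comment: str  # short textual comment


@dataclass
class MemberProfile:
    name: str
    history: list[Feedback] = field(default_factory=list)

    def leave_feedback(self, rating: str, comment: str) -> None:
        """Record one piece of feedback from a completed transaction."""
        if rating not in RATINGS:
            raise ValueError(f"unknown rating: {rating}")
        self.history.append(Feedback(rating, comment))

    def summary(self) -> dict[str, int]:
        """Sum the positive, neutral and negative feedback received."""
        counts = Counter(f.rating for f in self.history)
        return {r: counts[r] for r in RATINGS}


seller = MemberProfile("seller42")
seller.leave_feedback("positive", "Fast postage, item as described.")
seller.leave_feedback("positive", "Good communication.")
seller.leave_feedback("negative", "Item arrived damaged.")
print(seller.summary())
```

Keeping the individual comments alongside the totals mirrors the two views described above: a quick numeric summary, and a full history for a member who wants to investigate further.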
digg
digg is a social bookmarking news site. digg's users submit stories to an "upcoming stories" pool. Users may then "digg" an item, effectively adding it to their list of "favourites", which increases the popularity of the item. Alternatively a user may "bury" an item, indicating the story is spam, or of significantly low quality. With enough "diggs" the item will gain high visibility on the home page, the most prominently viewed page on the site (digg Inc. 2006).
MacManus (2006c) and Torkington (2006) explain how digg suffers from group-think. As a higher-rated (or more popular) story reaches a greater number of readers (possibly putting it on the front page), more will comment on the item, and more will rely on these comments rather than visiting the original source. This is exemplified by the Steve Mallett controversy (Torkington 2006) in January 2006. Mallett was falsely accused by a digg poster of using digg's own web page design for his own sites. The claim was not sufficiently examined by the community and within 3 hours 300 mostly negative comments were posted, even though the article (one click away) was clearly not as negative (MacManus 2006c).
digg has since been extended with features to help solve this problem, by allowing members to tag a story as inaccurate (MacManus 2006b).
reddit
reddit is a lesser-known social bookmarking site, using a similar binary system where a submission may be marked as "hot" or "cold". The higher the rating an item receives, the closer to the top of the queue it will move (reddit.com 2006).
Rating articles also has two further effects: it influences the reputability of the poster, and it trains a personal filter. Each "redditor" has their own karma value. This karma value is incremented or decremented by one every time one of their posts is marked as hot or cold respectively, allowing reddit to learn what types of article each user likes and dislikes. This enables, through the use of a personal filter, reddit to provide recommendations to each user. By personalising results, and removing emphasis from a single communal list, reddit hopes to reduce the group-think that digg suffers (MacManus 2006a).
Attention
Attention is an action performed by humans whereby one's brain is partially or fully focussed on another person, or an informational product of that person. This can include such actions as listening (in person, or a recording), reading, learning, carrying out a request, waiting for or waiting on, or empathising with someone. Goldhaber (2006) sums up attention as 'temporarily (and thereby permanently) allowing another [person] to shape how your mind works'. That is to say, while paying attention, one's mind is altered to focus on, think like, or think about the target of attention. Memory and learning make permanent this effect, by remembering the action of paying attention, or remembering what was learned while doing so.
Goldhaber (2006) likens the transfer of attention to an economy, where the currency is the limited resource of attention. Varian (1995) describes the "Information economy", where information is distributed and traded electronically. Today however, with the abundance of information, and services such as search engines that allow us to access information effectively, information is no longer a scarce resource upon which to base an economy. Today attention, although it cannot be bought, forms an economy where it has the ability to affect one's thoughts, actions, and movement of money.
Attention management is becoming an important issue in business today. People today command much more information than ever before, and are exposed to more than they can comprehend. It is becoming increasingly important that our attention be managed to remain productive (Davenport & Berk 2002). Howard Rheingold (1993 cited in Goldhaber 1997a) says 'Rule Number One is to pay attention. Rule Number Two might be: Attention is a limited resource, so pay attention to where you pay attention.' Attention leads to action, so to ensure the best actions in business, attention is a major resource that needs to be managed.
Although the amount of attention transferred cannot be measured, there are many approximations (Goldhaber 1997b). Steve Gillmor and Seth Goldstein founded AttentionTrust.org to build technologies around attention data, and to raise awareness of the value of attention data (Iskold 2006). AttentionTrust, and the more recent RootMarkets, have together developed software that records a person's online attention data, currently in the limited form of the user's click-stream. This data can be stored locally on the user's computer, or remotely in a secure online attention vault.
Attention vaults have an ever-expanding variety of potential applications. RootMarkets' attention vaults are designed to allow web services to analyse attention data, producing varying forms of personalised data. These applications may include personalised search, recommendations, alerts, news, shopping, and other forms of personalised information filters. Individual technology companies have already started using these forms of filters. Amazon produces personalised recommendations, Google has launched Search History to provide personalised search results, and reddit uses users' ratings to suggest content. Attention vaults have the potential to collect richer data than simple click-streams, and to more successfully mine larger sets of attention data (Iskold 2006).
Conclusion
This chapter followed the evolution of wikis, from their origins as a HyperCard application, to their implementation in the Portland Pattern Repository, to their popularisation through Wikipedia. Through their rapid popularisation, wikis have also accumulated a number of criticisms, against both the principles of wiki systems and the Wikipedia community. Stallman's vision (Stallman 1999) provides important goals and guidelines for the Wikipedia community. Various uses of wikis were identified, apart from their common perception as an encyclopedia. Wikis' uses in personal information management and small groups were presented, and various writing styles explored.
The review of literature shows the range of applications and audiences that have used wikis. The typical wiki community was studied, including its thinking styles and values, and the Wikipedia community was explored in detail. It was found that these communities often derive from open-source communities, and are composed of technically minded people. Continued exploration was made into how wikis, as a technology, fit into the collaborative tools landscape, by comparing and contrasting wikis with other tools. From this, it was found that wikis can complement or replace a wide range of technologies. In a final survey of wikis, several common wiki systems were analysed.
The chapter entered into a discussion on trust, and how it applies to Wikipedia, both to the perceptions of the Wikipedia community, and to the external view of Wikipedia. Significant criticisms of the lack of authority and reliability were found. Problems with Wikipedia as a globally available resource were identified as Wikipedia's biases were explored. Following on from trust, the concept of reputation was explored. Various existing systems, their mechanisms, and social benefits and consequences were explored. The literature review concluded with an overview of attention, and how it can be employed to generate measures of reputation through personal recommendations.
The chapter explored the literature surrounding Wikipedia, the only widely documented wiki community. Wikipedia is the largest, and best known wiki community, but only one of hundreds or thousands of successful wikis. This prompted several questions:
- How do these smaller wikis behave differently from Wikipedia?
- How do closed wikis (company wikis, for example) behave as compared to open wikis?
- How do wikis behave when aiming for a more specific goal as compared to building an "encyclopedia of everything"?
- Does including first hand or subjective data alter the dynamics of wiki communities?
Further questions are raised from the many criticisms of Wikipedia. These criticisms derive from the lack of authoritative analysis and approval of content, which is made difficult by the dynamic nature of content.
- Are there any mechanisms by which the community (or a subset) can provide some level of authority?
- Can a process be developed to allow an authoritative body to validate pages with minimal disruption to the current process?
Design and Justification
The last chapter reviewed the literature in the field, and identified some gaps in the knowledge and possible questions for study. This chapter builds on those questions, defining the research methods used to produce the results presented in later chapters.
The research here focused on one question identified from the literature review. The question asked was as follows.
This research tested a few simple mechanisms in the attempt to provide an answer to this question.
One proposed method of making the validation of articles easier is to identify good articles to validate, thus preventing poor articles from ever beginning, and ultimately failing, the validation process. This research focused on two such mechanisms: user ratings and attention data. The following general questions were investigated by this research, using the methods presented in this chapter:
- Can ratings made by users be reliably used to identify quality articles?
- When users are asked to rate an article, what exactly are they rating?
- Can attention data be used in a wiki environment to estimate the quality of an article?
In order to answer these questions, this research uses a range of research methodologies to process different types of data, and to triangulate results from these different sources to provide a clearer picture (Preece 2000). Considering the little research done in the field, this research looks at a wiki in a naturalistic setting, focusing on aspects of quality arising from a real-world wiki community.
The community studied was one created for this research, formed from the students of a "computer supported collaborative work" subject at CSU. Using this group of participants also allowed observations to be made of their participation in an authentic learning setting regarding the following questions from the literature review:
- How do smaller wikis behave differently from Wikipedia?
- How do wikis differ when the Wikipedia style limitation prohibiting first hand data or subjective experiences is removed?
And also the question:
- How effective are wikis in educational settings?
Justification for the paradigm and methodology
Myers (1997) and Straub, Gefen and Boudreau (2005) explain the two major forms of research, qualitative and quantitative research.
- Quantitative research was originally designed to study natural phenomena, and deals with numeric data collection. Methods include surveys, experiments, and various formal methods.
- Qualitative research was designed for studying social and cultural phenomena using methods such as case studies, action research and ethnography, dealing with descriptive data.
This research uses both forms of research in analysing different sets of data. Preece (2000) presents an expanded set of research approaches for application in the study of online communities, and summarises the methods into the matrix of evaluation types in table 3.1.
| Evaluation Type | Qualitative Data | Quantitative Data |
|---|---|---|
| Subjective | Ethnographic data, for example, interviews, observations, artifacts are interpreted by ethnographers | Questionnaires, for example, take subjective input, then express it using numeric rating scales |
| Objective | For example, content analysis categorises user comments seeking to identify patterns and frequencies | For example, usage logs generate data that is statistically analysed |
- Table 3.1 "matrix of evaluation types" (Preece 2000, pg. 305)
This research combined two formal data collection methods with several informal methods. The literature review, presented in chapter two, identified opportunities for research, allowing questions to be selected for further investigation and a suitable methodology to be built. This research continued by implementing a wiki and designing a rating system to be trialled in a naturalistic setting, while log data was gathered. Building from the results of this wiki experiment, a survey was conducted. The survey results were used to validate results from the experiment, and answer questions arising from the experiment. This process is summarised in figure 3.1.
Preece (2000) details five approaches to research in online communities. These are reviews, surveys, observations, experiments, and data logging. This research employs three of these methods, as explained below.
Data collection occurred in two phases: firstly, by collecting web server logs from a running wiki extended with a rating system, and secondly, from a follow-up survey distributed to the users of the wiki. The server logs allowed participants' behaviour to be analysed, while the survey aided in interpreting results from the server logs, and helped to answer any remaining questions. Observations from the wiki, and the development process, were also recorded.
The wiki experiment employs data logging to collect data. From this data, users' actions can be quantified for statistical analysis. The "wiki experiment" is not necessarily considered a formal experiment. It is considered to be a pre-experimental method, using a single group of participants and few controls (Tanner 2002). It does not, however, fit the definition presented by Preece, as the experiment itself does not employ the manipulation of controls to directly test hypotheses. In fact, every effort was made to avoid placing controls upon the community, in order to monitor how the community developed on its own, and how members behaved and interacted freely within it.
Surveys were employed in the final phase of data collection, where participants were asked to answer a questionnaire comprising questions arising from observations and results from the wiki experiment phase. This aimed to determine users' perceptions, opinions, feelings and motives regarding their participation in the wiki.
Observation was used in both the design phase and wiki experiment phase. Observation was used in an informal manner, but was useful for providing an extra layer of detail. Observation allowed subjective data to be elicited where there was otherwise no formal data collection. In the design phase observation was used to gain and present a general understanding of the process of writing a MediaWiki extension, which is otherwise poorly documented. In the wiki experiment phase, it allowed the researcher, as an active member of the wiki, to gain a better understanding of the dynamics of the community.
Returning to Preece's matrix of data and evaluation types above, this research covers each of the four categories, as shown in table 3.2.
| Evaluation Type | Qualitative Data | Quantitative Data |
|---|---|---|
| Subjective | Observations during development and reviewing comments from survey data | Survey data, captured using Likert scales |
| Objective | Analysing comments from survey data | Studying Wiki Logs |
- Table 3.2 "data collection in this research grouped according to Preece's matrix of evaluation types"
Implementation of the wiki, including a rating mechanism, involves the use of systems development procedures. Nunamaker, Chen and Purdin (1991) explain that systems development can be used as a tool in research for developing an artefact to be studied, either through its use as a proof-of-concept following a theory building exercise, or as the focus of an experiment, such as a naturalistic trial. This research uses a wiki system as the main tool in the experiment. This system and its implementation will be detailed later in this chapter.
Research procedures
This research determines how a rating system might be used in a wiki environment, its effects on the community, and its usefulness as a measure of article quality. To achieve this, a wiki community was established and monitored, by implementing a wiki extended with a simple page rating mechanism. The rating mechanism allowed users to rate an article within the wiki, using a standard 5-star rating. A group of students enrolled in an undergraduate IT subject was invited to the wiki.
Students, as part of their study, were asked to undertake assignments enforcing typical wiki usage through the following guidelines:
- write 1000 words (approximately 2 articles) or word equivalent (words could be split across more articles, by contributing to other partial articles)
- rate other articles you read
- be creative, let ideas flow, making students search and rate, and produce content
These activities were designed to encourage an organic (as per Cunningham's ideals, see 2.2.2.1) site, where students may otherwise not have been used to studying this way.
At the completion of the experiment, a follow-up survey was issued. This data was used to aid in the interpretation of the results from the wiki logs.
The following data was collected:
- Article Content - The content created by users of the wiki. This includes any textual content, discussions, comments, and historic revisions of articles.
- Ratings - As determined by users of the wiki through the embedded rating system.
- Server logs - allowing user behaviours to be monitored, to see how users interact with the system, and to determine time spent using the system, including attention data.
- Survey results - General demographics, users' computer self-efficacy, and user thoughts on the use of the wiki and rating system.
- Observations throughout the development of the rating mechanism
- Observations from the live wiki community
Once data was collected, articles were analysed to determine an objective measure of quality. These measurements provided a baseline measure of the quality of articles, for comparison with system generated ratings.
Ethical considerations
This research involved human participants contributing to a wiki site, and the analysis of their contributions and actions when using that wiki. This creates the possibility for participants to be adversely affected by this research. The possibility of this was kept to an absolute minimum.
Study of this wiki was done with permission from the server administrator and the CSU Ethics Committee. Students from the ITC213 subject were required to participate in this wiki as part of their class work. The requirements of this participation were kept general, and in keeping with the subject objectives, keeping disruption to a minimum. No requirement was made for students to use their real identity within the system (although their identity within the system must be revealed to the subject assessors for marking). No personally identifiable information within the system was used for this research, and all data was de-identified before analysis. No identifying data or large samples of raw data were released as part of this research.
Design
To facilitate the data collection, a live wiki server was implemented. The wiki system chosen was MediaWiki, for several reasons. MediaWiki is one of the more common wiki systems in use. It is well known by many Internet users, and provided the necessary features to allow student work to be tracked to allow a lecturer to assess students' work. MediaWiki runs on a standard Apache/PHP/MySQL stack, a common and trusted set of applications installed on many web servers by default. Netcraft Ltd. (2006) report Apache to have about 60% market share, and Seguy (2006) reports PHP to have about 35% market share in September 2006. The MediaWiki installation is an easy automated process that detects its environment and configures itself accordingly (shown in figure 3.2). Due to its popularity, and despite the poor state of the official documentation, there is an abundance of support and development information available which is useful for setting up and maintaining the engine. The MediaWiki engine is also designed with extension in mind, by supporting an increasing set of hooks, providing a convenient interface for adding functionality to the engine.
MediaWiki Extension
A PHP extension was written and applied to the MediaWiki installation to add a page rating mechanism.
This rating mechanism added a set of five stars to each page in the User, Project, Image, Template, Help, Category and Main namespaces, as well as their associated talk namespaces (see 2.2.3.1). The stars allowed users, with a single click, to rate a page from one to five. This rating was stored and averaged by the system, and presented to the user both by colouring the stars (to indicate the rating) and as a three-digit numeric rating. A minimum of three ratings from different users was required before a rating was shown. Until this number of ratings was reached, the rating stars and text were highlighted with the intention of drawing the user's attention to rating the page.
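The display rule described above can be sketched as follows. This is a minimal illustration in Python rather than the PHP of the actual extension; the function name `displayed_rating`, the dictionary input, and the choice of two decimal places for the "three-digit" figure are all assumptions made for the example.

```python
from statistics import mean

MIN_RATERS = 3  # minimum number of distinct raters, as stated in the text


def displayed_rating(ratings_by_user):
    """Return the average rating as text, or None while it is still hidden.

    ratings_by_user maps a (hypothetical) user identifier to that user's
    current rating of one page, so each user is counted only once.
    """
    values = list(ratings_by_user.values())
    for value in values:
        if value not in (1, 2, 3, 4, 5):
            raise ValueError("ratings must be whole stars from 1 to 5")
    if len(values) < MIN_RATERS:
        # Too few raters: the extension would highlight the stars instead.
        return None
    # Assumed format: one digit before and two after the decimal point.
    return f"{mean(values):.2f}"
```

Keying the input by user reflects the requirement that the three ratings come from different users; a page rated three times by the same person would still show no figure.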
The extension is a single PHP file of about 600 lines, accompanied by a set of small images providing the stars for the rating mechanism. This PHP file was activated with the addition of a single "require" instruction added to the MediaWiki configuration file, instructing the software to load the extension whenever the software was executed.
The extension made several "hooks" into the MediaWiki software. These were three parser hooks, an OutputPageBeforeHTML hook, and a SkinTemplateSetupPageCss hook.
The SkinTemplateSetupPageCss hook instructed MediaWiki to add CSS code to its HTML output. The CSS added graphical formatting instructions for the rating mechanism. The OutputPageBeforeHTML instructed MediaWiki to call upon the extension after any page is generated, but before it is returned to the user. This allows the extension to add the rating mechanism to the top of the page.
The three Parser hooks informed MediaWiki that three new tags should now be allowed in wikitext, and that when found, the extension should be called to process them. The three tags allow dynamic content to be added to a page: one shows the rating for a specified page, one shows the number of ratings for a specified page, and the third shows a list of pages and their ratings, sorted and filtered by several factors, and displayed using a variety of formats (see Appendix 1).
The stars, when clicked, instruct the browser to visit a page. Requesting this page causes the rating to be recorded for the page the user was viewing and clears the MediaWiki page cache for that page, which allows it to be updated with the new rating. The browser is then instructed to return to the updated page. The links triggered when the user clicks on a star carry the rel="nofollow" attribute. This instructs spiders not to follow these links (Google Inc. 2005), as doing so would trigger the rating mechanism, causing false ratings.
The extension makes use of two database tables, ratings and ratings_cache. The ratings table stores each rating made within the system, using three fields: the ID of the page, the ID of the user making the rating, and the numeric rating itself. The ratings_cache table summarises the rating for each page revision by storing the page ID, the calculated page rating, and the total number of ratings made for that page. The record for a page is updated whenever a rating is made for that page. If a user rates a revision twice, the previous rating is ignored, replaced by the most recent rating. The ratings_cache table is not necessary, but allows the system to provide prompt responses for tasks such as displaying the top ten ranked pages. This task would otherwise require the calculation of ratings for every page, an increasingly time-consuming task as the wiki grows.
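The two-table arrangement can be illustrated with a small sketch. It uses Python's built-in SQLite module as a stand-in for the MySQL database used by MediaWiki, and keys both tables by page rather than by revision for brevity. The column names and the functions `rate` and `top_pages` are assumptions for illustration, not the extension's actual schema or code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ratings (
    page_id INTEGER NOT NULL,
    user_id INTEGER NOT NULL,
    rating  INTEGER NOT NULL CHECK (rating BETWEEN 1 AND 5),
    PRIMARY KEY (page_id, user_id)   -- one current rating per user per page
);
CREATE TABLE ratings_cache (
    page_id      INTEGER PRIMARY KEY,
    avg_rating   REAL NOT NULL,
    rating_count INTEGER NOT NULL
);
""")


def rate(conn, page_id, user_id, rating):
    """Record a rating, replacing any earlier one, then refresh the cache row."""
    conn.execute(
        "INSERT OR REPLACE INTO ratings (page_id, user_id, rating) VALUES (?, ?, ?)",
        (page_id, user_id, rating),
    )
    conn.execute(
        """INSERT OR REPLACE INTO ratings_cache (page_id, avg_rating, rating_count)
           SELECT page_id, AVG(rating), COUNT(*) FROM ratings
           WHERE page_id = ? GROUP BY page_id""",
        (page_id,),
    )
    conn.commit()


def top_pages(conn, limit=10):
    """Read the ranking from the cache alone, without touching every rating."""
    return conn.execute(
        "SELECT page_id, avg_rating, rating_count FROM ratings_cache "
        "ORDER BY avg_rating DESC, rating_count DESC LIMIT ?",
        (limit,),
    ).fetchall()
```

The trade-off is the usual one for a summary table: each rating costs one extra write, while a "top ten" listing becomes a single indexed read instead of an aggregate over all ratings.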
Code in the extension is called in two ways. Normally functions are called by the MediaWiki engine, however to provide the rating submission, as well as returning of image and CSS files, code in the main scope of the program (not in any function) interrupted the normal execution of the MediaWiki engine.
Returning files is a fairly simple operation. If a request is detected, the plug-in would return the file and end the execution of the script, before the MediaWiki engine can load.
Recording ratings however required access to the database. Although PHP provides standard functions for accessing a MySQL database, the obvious solution to this problem is to use the MediaWiki functions. Using these functions allows access to the standard set of queries typical for retrieving data from the MediaWiki database. The MediaWiki documentation however is not comprehensive enough to explain how the database components are loaded. To solve this problem, if database access was required, the entire MediaWiki engine was loaded, making available the required functions. This solution is inefficient, but sufficient to solve the problem given the small scale on which the extension is intended to be implemented.
Writing to the database using the manually loaded engine however uncovered another problem. The MediaWiki engine does not perform write queries until the end of the execution. Normally this would not cause a problem, however because the MediaWiki engine was not run to completion, it never executed these queries. The poorly documented Database->immediateCommit() function was used to force the engine to flush data out to the database so that the script could be safely terminated.
Wiki
MediaWiki 1.7.1 was the version implemented (the most up to date stable version at the time of installation). It was configured for public access, and seeded with a set of pages explaining the purpose of the wiki, instructions for editing the wiki, and the set of assignment tasks to be completed by the students.
The standard installation script was run, and after providing the information required, the script initialised the database, and created a configuration file (see figure 3.2 for an example output). The configuration file was modified to allow users to upload files of any type (manual monitoring of the wiki ensured this was not abused), and to enable the rating mechanism.
The seed pages provided communal areas for communication and finding information relevant to the students' tasks. Certain key pages were omitted, with the intention that students should create these based on instructions provided. Example pages were also added, as a guide to students when writing their own pages.
Over the course of the experiment, the wiki was monitored, and kept tidy by the researcher (who filled the "janitorial" role, see 2.2.2.4.2), following the guidelines set out by the Wikipedia Contributors (2006g).
With no set due date for completion of POD exercises, contributions to the wiki were made at fairly regular intervals. It was observed that most students would post an entire article with a single edit. It seemed that students preferred to draft their writing in an off-line system (perhaps a word processor). There were however a few exceptions, where students would post a paragraph at a time. There was little modification to text however, after its original submission.
The exception to this contribution style was the POD Group pages, pages students were specifically asked to use to collaborate between members.
POD Exercise 1, see Appendix 3.1.1
These pages were not created when seeding the wiki; they were left for students to create. Much editing of these pages was observed, usually by several members of the group. These pages were frequently updated with often minor presentation changes, with students following each other's leads and fixing each other's mistakes. It seemed these pages were perceived as common property, whereas where a single student posted their writing on a page of its own, other students would not venture to interfere.
POD Exercises
Participants in the wiki were students from the ITC213 undergraduate class for spring 2006. ITC213, or Computer Supported Collaborative Work, teaches a range of social and technical topics surrounding online communities. The class was volunteered by the lecturer and coordinator Ken Eustace. The subject is a very hands-on subject where students collaborate using various collaborative tools (Charles Sturt University 2006).
Students in the class are placed into Pools of Online Dialogue (POD) groups, and in those groups, complete fortnightly collaborative exercises (Eustace 2006). This research was designed to provide activities for the first two POD exercises.
The POD exercises were designed with three goals in mind:
- Provide a suitable exercise for learning and assessment
- Generate valuable content for the wiki
- Promote typical wiki behaviour
A wiki does not generally have a set of tasks that each user completes; rather there may be a to-do list that volunteers may complete, with no requirement to do so. For the participants of this research however, it was required that there be a compulsory element to the activities to ensure participation, and to facilitate the academic assessment of the students' work.
The wiki's long term plan is to be used by academics studying instructional gaming. The design of the exercises attempted to influence the content being created to provide useful springboards for research by academics. It encouraged students to think critically about how games related to theory in areas such as communication or social interaction.
The set exercises attempted to set, where possible, few limitations regarding the type of interaction within the wiki. This was to try to accurately simulate a wiki community, where there are generally no such limitations. Students were required to contribute 500 words for each of the two activities, however the distribution of these 500 words was not limited. The words could be "spent" adding to existing articles, or pooled with a group of people to generate a larger article.
Survey
A follow-up survey was conducted after initial analysis of wiki data. This survey was designed to provide information useful in interpreting the wiki data, and answering any resulting questions. The survey was designed to achieve the following goals:
- Determine general demographics
- Determine computer self-efficacy of participants
- Discover participants' previous experience with online tools
- Determine what the participant has learned about wikis
- Probe for factors affecting levels of participation in the wiki, including:
- Difficulties with POD exercises, or online game
- Difficulties submitting to the wiki, and perceptions of the wiki
- Determine usefulness of rating mechanism:
- When rating "friends" vs. "fiends"
- In terms of self meta-moderation - or how confident participants are about their own rating submissions
- What factors were more important in rating? Content, quality writing, or graphical appearance?
- Did the rating mechanism make sense? Was it understandable and useful?
A full listing of the survey questions is provided in Appendix 2.
Conclusion
This chapter defined the methods used in this research, and detailed the design of the software and wiki, including observations made during those processes. The next chapter will present an analysis of the results collected using the methods presented here.
Analysis of Data
Two formal data sources were used for analysis in this research: firstly, Apache logs collected from the wiki community, and secondly, data from a follow-up survey of the participants. This chapter will present an analysis of this data.
Many figures quoted in this chapter have companion commands used to generate them. As these methods may be useful in future research, these are indicated by subscript numbers throughout the chapter, and commands are included for reference at the end of the chapter.
Data Analysis
Data gathering consisted of two stages; firstly Apache server logs were collected for a wiki system, and secondly a survey was completed by the users of the wiki. Considering these two forms of data is important in understanding users' behaviour. Apache logs give an objective view of behaviour, a detailed listing of every action, whereas the survey results allow the painting of a more detailed picture by bringing in an understanding of the intent, thoughts and feelings behind the behaviour. Presented here are the results of the analysis of the two data sources, and discussion of their interpretation in terms of the research questions presented in chapter 3.
Apache Logs
Data Overview
Data was collected from log-files generated by the Apache server, and from the MediaWiki database. The log was processed and stored in the database in a format facilitating simple SQL queries to be run against the data.
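As a rough illustration of this kind of log processing, the sketch below parses one Apache access-log line with a regular expression and labels the request by its MediaWiki `action` parameter. It is not the parser used in this research: the pattern, the function names, the `/wiki/` path in the sample line, and the mapping from URL parameters to labels are all assumptions, and only four of the labels that appear in table 4.2 are covered.

```python
import re
from urllib.parse import parse_qs, urlsplit

# Assumed layout: the leading fields of Apache's common/combined log format.
LOG_RE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<url>\S+)[^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
)

# Hypothetical mapping from MediaWiki's action parameter to log labels.
ACTION_LABELS = {"view": "VIEW", "edit": "EDIT", "submit": "SAVE", "history": "HIST"}


def parse_line(line):
    """Return the named fields of a log line, or None if the pattern fails."""
    match = LOG_RE.match(line)
    return match.groupdict() if match else None


def classify(url):
    """Label a request by its action parameter; a missing action means a view."""
    query = parse_qs(urlsplit(url).query)
    action = query.get("action", ["view"])[0]
    return ACTION_LABELS.get(action, "OTHER")


sample = ('192.0.2.1 - - [31/Jul/2006:10:15:02 +1000] '
          '"GET /wiki/index.php?title=Main_Page&action=edit HTTP/1.1" 200 5120')
record = parse_line(sample)
label = classify(record["url"])  # "EDIT" for this sample line
```

Returning `None` for unmatched lines, rather than raising an error, makes it straightforward to count parse failures separately from usable records.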
The wiki was installed and initialised on the Internet Special Projects Group (ISPG) server on 4/Jul/2006. The ISPG server, maintained by Geoff Fellows, hosts several applications for research projects, as well as teaching resources. Over the following weeks, the wiki was seeded with its initial set of pages. Data recovered from ISPG contained 225401 lines of logs (see table 4.1), recorded between 31/Jul/2006 and 20/Sep/2006, of which 15162 pertained to KakapoWiki (the server hosts several projects). From this number were removed requests for files (3234) (not to the MediaWiki engine itself), as well as CSS files and JavaScript files requested through the engine (2119 and 762 respectively). A further 2436 lines generated from automated (non-human) requests to the wiki were also ignored. Five lines failed to be matched by the regular expression being used to parse log file records. None of these five pertained to the wiki.
| Counter | Lines | Description |
|---|---|---|
| not_kakapo | 210239 | For other projects hosted on ISPG |
| lines_recorded | 6606 | Total lines added to the database |
| file_request | 3234 | Requests for files through the rating system |
| re_fail | 5 | Lines that the regular expression failed to match |
| css_request1 | 1420 | requests for CSS files |
| total_lines | 225401 | Total number of log records processed |
| css_request2 | 699 | more requests for CSS files |
| js_request | 762 | requests for JavaScript files |
| stats_logger | 2436 | automated requests |
- Table 4.1 "report from log file parser, log2db.py"
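The counters in table 4.1 can be reconciled with the figures quoted in the text using a few lines of arithmetic. The snippet below is only a consistency check on the reported numbers; the variable names are chosen for the example.

```python
# Counters as reported in table 4.1 (total_lines is kept separately).
report = {
    "not_kakapo": 210239,
    "lines_recorded": 6606,
    "file_request": 3234,
    "re_fail": 5,
    "css_request1": 1420,
    "css_request2": 699,
    "js_request": 762,
    "stats_logger": 2436,
}
total_lines = 225401

# Every processed line falls into exactly one counter.
assert sum(report.values()) == total_lines

# Lines not attributed to other projects: the 15162 quoted in the text.
kakapo_lines = total_lines - report["not_kakapo"]

# CSS requests quoted in the text as a single figure (2119).
css_requests = report["css_request1"] + report["css_request2"]

# Removing files, CSS, JavaScript, automated requests and the five
# unmatched lines leaves the usable entries stored in the database.
usable = (kakapo_lines - report["file_request"] - css_requests
          - report["js_request"] - report["stats_logger"] - report["re_fail"])
assert usable == report["lines_recorded"]
```

One detail this makes visible is that the 15162 figure, taken as the total minus `not_kakapo`, also counts the five lines the regular expression could not match.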
A total of 6606 usable entries (table 4.1) were retrieved from the Apache logs (non-automated queries to the engine). These entries show users' actions or behaviours while interacting with the wiki. These entries are broken down in table 4.2.
| action | total | sup_count | student_count |
|---|---|---|---|
| DEL | 2 | 2 | 0 |
| DIFF | 42 | 36 | 6 |
| EDIT | 786 | 250 | 536 |
| HIST | 460 | 198 | 262 |
| LOGIN | 242 | 12 | 230 |
| LOGOUT | 32 | 14 | 18 |
| MOVE | 27 | 7 | 20 |
| RATE | 81 | 39 | 42 |
| REG | 59 | 0 | 59 |
| SAVE | 688 | 332 | 356 |
| SPEC | 467 | 165 | 302 |
| UL | 37 | 14 | 23 |
| UWATCH | 2 | 1 | 1 |
| VIEW | 3677 | 853 | 2824 |
| WATCH | 4 | 1 | 3 |
| total | 6606 | 1924 | 4682 |
- Table 4.2 "Results from SQL Query of the Apache logs₁"

| log_action | COUNT(*) |
|---|---|
| delete | 2 |
| move | 4 |
| upload | 10 |
- Table 4.3 "Results from SQL query from the MediaWiki database₂"
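A breakdown with the same columns as table 4.2 could be produced by a grouped query along the following lines. This is a sketch only, not the command used in this research: the table name `log`, the `is_sup` flag (taken here to mark requests from the researchers and supervisors, which is an assumption about what `sup_count` means), and the use of SQLite in place of MySQL are all introduced for the example.

```python
import sqlite3


def action_breakdown(rows):
    """Count requests per action, split by a supervisor flag.

    rows is an iterable of (action, is_sup) pairs, where is_sup is 1 for a
    request assumed to come from a supervisor or researcher, else 0.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE log (action TEXT NOT NULL, is_sup INTEGER NOT NULL)")
    conn.executemany("INSERT INTO log (action, is_sup) VALUES (?, ?)", rows)
    result = conn.execute(
        """SELECT action,
                  COUNT(*)        AS total,
                  SUM(is_sup)     AS sup_count,
                  SUM(1 - is_sup) AS student_count
           FROM log
           GROUP BY action
           ORDER BY action"""
    ).fetchall()
    conn.close()
    return result
```

Summing a 0/1 flag inside each group gives both sub-counts in a single pass, so the two columns always add up to the total for that action.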
The Log data (table 4.2) counts the number of requests made to the Apache server. From the logs 15 types of requests could be identified.
DEL counts the number of requests to delete items from the wiki. Only the researchers had the ability to remove objects. It is standard practice in MediaWiki installations that only administrators (or sysops as they are known) have delete privileges. A count of successful requests (table 4.3) confirms the log figure as accurate.
DIFF shows the number of times users have viewed the differences between two versions of a page. Most commonly this is to view the most recent change, but at other times it is to see how the page has changed over time.
EDIT represents the number of times the wikitext has been viewed by clicking the edit button. Some of these result in page SAVEs, while others just satisfy a user's curiosity about how a page is marked up.
HIST represents a view of a page's history. Viewing this page may result in subsequent actions. A user may choose to view a DIFF of two versions of the page, or they may VIEW an older version of the page. HIST also includes viewing of a user's list of contributions, and viewing a list of recent changes to the wiki.
LOGIN shows when a user attempts to provide their correct username and password to log in to the system. As shown below, login attempts result in at least two log records.
LOGOUT counts when a user instructs the system to dissociate further actions from the user's current account. It is typically performed when the user has finished working, or another user wishes to work at the same workstation.
MOVE shows when a user clicks the move link above a page. Only four of these requests were completed during the data recording period (table 4.3), two each by students and researchers.
RATE indicates when a rating star is clicked by a user to record or modify their rating of a page. A rating will result in another VIEW as the browser returns the viewed page again with the new rating. There were 63₃ ratings stored in the database. This indicates that 18 of the 81 logged ratings were adjustments to previous ratings (where a user changed a rating). There were 37₄ ratings by users who had no involvement with a page's content. These ratings are considered unbiased ratings, as the rater has no influence on the content.
REG monitors when a user registers their username with the wiki. This action may result in multiple records. The researchers are not included in the recorded data as they had registered before data collection started. A total of 26₅ users were registered in the wiki, including the three researchers.
SAVE follows an EDIT or another SAVE, and indicates that either the save or preview button was pressed after editing wikitext. Several previews may be made before saving. Alternatively, the page may be saved without modification, or no save may be made after modifying the wikitext. Saving the page without a change (a "null edit") causes MediaWiki to re-parse the page, updating any components derived from templates that may have changed. While this would show up in the logs, it is not included as a change to the page in MediaWiki. The MediaWiki database reports 360₆ edits to 112₇ pages during the data collection period. These edits are when a page is changed (i.e. not a "null edit") and committed (not previewed) to the database.
SPEC counts when a special page is viewed. Special pages include viewing a list of all pages in the database, a list of all files uploaded, version information of the wiki, searching, viewing statistics, and viewing a list of users. Logging in and out and moving pages also involve the use of special pages, but these were counted separately.
UL counts attempts to upload an image or other file. A successful attempt will result in several records. In total 10 files (table 4.3) were uploaded during the data collection period.
WATCH indicates when a user clicks the watch tab above a page. This is not the only way to add a page to a watchlist, as a user has the option to add a page when saving changes to that page.
UWATCH shows when a user clicks the unwatch tab above a page to remove a page from their watchlist.
VIEW counts users' views of a standard content page. This count also includes views of old versions of a page, and views of some special pages. The MediaWiki database reports 2770₈ page views. This count does not include repeated sequential views of a page, but increments only when a page is parsed and HTML is generated from the wikitext. Parsed pages are stored in a cache and, unless manually expired, remain valid for 24 hours. For each user, MediaWiki maintains a separate set of cache entries.
Explanation of differences in counts
Some values in the log data are inflated when compared with the MediaWiki database. Log data represents actions attempted, while the MediaWiki database represents actions successfully completed. Similarly, some actions require multiple steps. To illustrate, logging-in is a two-step process.
The following example shows four stages:
- The user visiting a page, not logged in
- The user having clicked the login link
- The user after correctly entering their username and password
- The user viewing the original page again, now logged in
Figure 4.1 shows examples from the Apache logs for the four events listed above.
137.166.81.96 - - [08/Aug/2006:14:14:08 +1000] ispg.csu.edu.au "GET /kakapowiki/
index.php/KakapoWiki:ITC213/200670/POD_Activities HTTP/1.1" 200 4276 "http://isp
g.csu.edu.au/kakapowiki/index.php/KakapoWiki:ITC213/200670/POD_Activities" "Mozi
lla/5.0 (X11; U; Linux i686; en-US; rv:1.8.0.5) Gecko/20060731 Ubuntu/dapper-sec
urity Firefox/1.5.0.5" "kakapowikidbUserName=TrevorP; kakapowikidb_session=70da8
f9646996952454cd2da00a79360; kakapowikidbLoggedOut=20060808041401"
137.166.81.96 - - [08/Aug/2006:14:14:10 +1000] ispg.csu.edu.au "GET /kakapowiki/
index.php?title=Special:Userlogin&returnto=KakapoWiki:ITC213/200670/POD_Activiti
es HTTP/1.1" 200 2392 "http://ispg.csu.edu.au/kakapowiki/index.php/KakapoWiki:IT
C213/200670/POD_Activities" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.0.5)
Gecko/20060731 Ubuntu/dapper-security Firefox/1.5.0.5" "kakapowikidbUserName=Tr
evorP; kakapowikidb_session=70da8f9646996952454cd2da00a79360; kakapowikidbLogged
Out=20060808041401"
137.166.81.96 - - [08/Aug/2006:14:14:12 +1000] ispg.csu.edu.au "POST /kakapowiki
/index.php?title=Special:Userlogin&action=submitlogin&type=login&returnto=Kakapo
Wiki:ITC213/200670/POD_Activities HTTP/1.1" 200 2132 "http://ispg.csu.edu.au/kak
apowiki/index.php?title=Special:Userlogin&returnto=KakapoWiki:ITC213/200670/POD_
Activities" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.0.5) Gecko/20060731
Ubuntu/dapper-security Firefox/1.5.0.5" "kakapowikidbUserName=TrevorP; kakapowik
idb_session=70da8f9646996952454cd2da00a79360; kakapowikidbLoggedOut=200608080414
01"
137.166.81.96 - - [08/Aug/2006:14:14:15 +1000] ispg.csu.edu.au "GET /kakapowiki/
index.php/KakapoWiki:ITC213/200670/POD_Activities HTTP/1.1" 200 4467 "http://isp
g.csu.edu.au/kakapowiki/index.php?title=Special:Userlogin&action=submitlogin&typ
e=login&returnto=KakapoWiki:ITC213/200670/POD_Activities" "Mozilla/5.0 (X11; U;
Linux i686; en-US; rv:1.8.0.5) Gecko/20060731 Ubuntu/dapper-security Firefox/1.5
.0.5" "kakapowikidbUserName=TrevorP; kakapowikidb_session=70da8f9646996952454cd2
da00a79360; kakapowikidbLoggedOut=20060808041401; kakapowikidbUserID=1"
* These have been modified, removing irrelevant cookies used by other CSU
services.
- Figure 4.1 "A log-file sample of the login sequence"
Data, such as that in figure 4.1, was processed by a Python script, which extracted the important fields and inserted them into a database. Table 4.4 shows the same four records after being processed and placed in the database.
| userid | date | action | namespace | title | url |
|---|---|---|---|---|---|
| 0 | 2006-08-08 14:14:08 | VIEW | 4 | ITC213/200670/POD_Activities | /kakapowiki/index.php/KakapoWiki:ITC213/200670/POD_Activities |
| 0 | 2006-08-08 14:14:10 | LOGIN | -1 | Userlogin | /kakapowiki/index.php?title=Special:Userlogin&returnto=KakapoWiki:ITC213/200670/POD_Activities |
| 0 | 2006-08-08 14:14:12 | LOGIN | -1 | Userlogin | /kakapowiki/index.php?title=Special:Userlogin&action=submitlogin&type=login&returnto=KakapoWiki:ITC213/200670/POD_Activities |
| 1 | 2006-08-08 14:14:15 | VIEW | 4 | ITC213/200670/POD_Activities | /kakapowiki/index.php/KakapoWiki:ITC213/200670/POD_Activities |
- Table 4.4 "A sample log of a login event formatted into a database table"
Examination of the data in figure 4.1 or table 4.4 shows how a single event can manifest itself as multiple records in the log. As a further illustration, if the user had not correctly entered their username and/or password, there would be another line where the user was given an error message and asked to try again.
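The parsing step can be sketched as follows. This is a minimal illustration, not the log2db.py script itself: the regular expression, the helper names `parse_line` and `classify`, the partial classification rules (only LOGIN and VIEW), and the use of the `kakapowikidbUserID` cookie as the user id are all assumptions based on the record layout visible in figure 4.1.

```python
import re
from datetime import datetime
from urllib.parse import urlsplit, parse_qs

# Assumed record layout, following figure 4.1:
# ip ident user [time] vhost "request" status bytes "referer" "agent" "cookies"
LOG_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] (?P<vhost>\S+) '
    r'"(?P<method>[A-Z]+) (?P<url>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)" "(?P<cookies>[^"]*)"$'
)


def classify(url):
    """Hypothetical, partial classifier: only LOGIN and VIEW are recognised."""
    query = parse_qs(urlsplit(url).query)
    title = query.get("title", [""])[0]
    if title == "Special:Userlogin":
        return "LOGIN"
    return "VIEW"


def parse_line(line):
    """Return the fields of interest, or None if the line does not match."""
    match = LOG_RE.match(line.strip())
    if match is None:
        return None  # a real parser would count this as a regular expression failure
    # Split the cookie string into name/value pairs.
    cookies = dict(
        part.strip().split("=", 1)
        for part in match["cookies"].split(";")
        if "=" in part
    )
    when = datetime.strptime(match["time"], "%d/%b/%Y:%H:%M:%S %z")
    return {
        # Assumption: a missing user id cookie means the user is not logged in.
        "userid": int(cookies.get("kakapowikidbUserID", 0)),
        "date": when.strftime("%Y-%m-%d %H:%M:%S"),
        "action": classify(match["url"]),
        "url": match["url"],
    }


# The second record of figure 4.1, joined back into a single line.
sample = (
    '137.166.81.96 - - [08/Aug/2006:14:14:10 +1000] ispg.csu.edu.au '
    '"GET /kakapowiki/index.php?title=Special:Userlogin'
    '&returnto=KakapoWiki:ITC213/200670/POD_Activities HTTP/1.1" '
    '200 2392 "http://ispg.csu.edu.au/kakapowiki/index.php/'
    'KakapoWiki:ITC213/200670/POD_Activities" '
    '"Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.0.5) Gecko/20060731 '
    'Ubuntu/dapper-security Firefox/1.5.0.5" '
    '"kakapowikidbUserName=TrevorP; '
    'kakapowikidb_session=70da8f9646996952454cd2da00a79360; '
    'kakapowikidbLoggedOut=20060808041401"'
)
record = parse_line(sample)
```

Under these assumptions, the sample line yields a LOGIN record for user id 0, in line with the second row of table 4.4.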
Analysis
This part of the analysis was performed in two sections. First, the system-generated ratings were studied, followed by an analysis of attention data.
Ratings
The primary goal of this experiment was to monitor a simple rating system in a wiki community, and users' behaviours and reactions to the system, where the participants were undergraduate students. Apache logs provide a rich source of objective measurements of behaviour; however, they do not reveal intent or the thoughts and feelings behind the behaviour. The next section will analyse survey results, which attempt to cover this topic.
Apache logs indicated a total of 81 ratings, from one to five, submitted by users to the rating system. The distribution of ratings was uneven, with 78% of ratings occurring in the top 40% of the one to five scale (figure 4.2).
The context of these ratings must be analysed. The system allowed users to adjust their rating by re-entering their selection via a click on the rating stars. The rating system was designed to accept this rating, replacing that user's original rating for that revision of the page. These re-ratings account for 18, or 22%, of ratings. Another factor to be conscious of is that the same user can both modify and rate the same page. The rating system was designed with a simple mechanism to reduce the possibility that this could be abused. Users were not excluded from rating pages they had edited, because although the user had edited the page, their influence on that page is reduced by future edits. A user's influence may be removed completely if the text they add is completely removed by a future edit. For this reason also, a rating's importance diminishes as a page is edited, and as the quality of a page increases or decreases with each edit. To solve this simply, ratings were associated with a revision, not the page itself. Once a page had been updated, old ratings were only used to supplement ratings for the current version until a quorum of ratings had been contributed for the current article. As each user is only allowed one vote, a biased vote can be quickly corrected by other raters. In this system however, with the low number of ratings, it is unlikely this situation ever occurred.
Table 4.5 summarises ratings based on two determining factors: whether the rating is a correction or alteration of a previous rating (that is, a rating by the same user on the same revision of a page), and whether the user has ever modified the page they are rating, possibly resulting in a biased rating.
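The revision-based mechanism can be sketched as follows. This is an illustration under stated assumptions rather than the code used in the wiki: the quorum size, the use of a simple mean, the order in which older ratings are drawn upon, and the names `QUORUM` and `page_rating` are all hypothetical.

```python
from statistics import mean

QUORUM = 3  # assumed value; the quorum size is not stated in the text


def page_rating(ratings_by_revision, quorum=QUORUM):
    """Estimate a page's rating from per-revision ratings.

    ratings_by_revision is a list ordered from oldest to newest revision.
    Each item maps a user name to that user's latest rating (1 to 5) of
    that revision, so a re-rating replaces the user's earlier rating.
    """
    if not ratings_by_revision:
        return None
    pool = list(ratings_by_revision[-1].values())
    # Supplement with ratings of older revisions, newest first,
    # only until the quorum is reached.
    for older in reversed(ratings_by_revision[:-1]):
        if len(pool) >= quorum:
            break
        needed = quorum - len(pool)
        pool.extend(list(older.values())[:needed])
    return mean(pool) if pool else None


# Illustrative data only, not taken from the study.
history = [
    {"alice": 2, "bob": 3},  # revision 1
    {"carol": 4},            # revision 2
    {"dave": 5},             # revision 3 (current)
]
estimate = page_rating(history)
```

In this sketch, once the current revision has collected a quorum of its own ratings, ratings of earlier revisions no longer contribute.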
| Ratings | Possibly Biased | Unbiased | total |
|---|---|---|---|
| Final Ratings | 32 | 31 | 63 |
| Corrections | 12 | 6 | 18 |
| total | 44 | 37 | 81 |
- Table 4.5 "Summary of ratings usefulness"
In total, 31 usable ratings were generated. This number excludes corrections to existing ratings, and ratings where the rating user had edited the page at any time. Comparing ratings with page views gives an idea of users' willingness to rate articles. A user will rate 2.2-2.9% (depending on the page count used) of pages they view, generating 0.84-1.1% usable ratings per page view.
This filtering results in a similar imbalance of ratings to the unfiltered results, with 87% of ratings in the top 40% of rating values, that is, a rating of four or five (figure 4.3).
Both the filtered and unfiltered sets of ratings (figures 4.2 and 4.3) show a regular occurrence of high ratings. This is found also to be true in eBay ratings, where members are very positive towards trading partners. Resnick and Zeckhauser (2002) suggest this is in keeping with social norms to treat others well. Dellarocas (2001) however puts it down to a "culture of praise", where members feel that the correct thing to do is to be kind and forgiving to other members. In doing so, users may also avoid rating bad pages.
Comparison between the two sets of data represented by figures 4.2 and 4.3 shows that the "potentially biased" unfiltered ratings are more evenly distributed than the "unbiased" or filtered ratings. Analysis of the available data found that 41₉ "biased" ratings (on pages that a user both rated and edited) were made after the user had edited the page they rated, while only 18₁₀ were made before the user edited the page.
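The percentage ranges can be reproduced with a short calculation. The figures for ratings and logged views come from tables 4.2 and 4.5; the database page view count of 2770 is an assumption, chosen as the reading of the MediaWiki figure that reproduces the upper bounds quoted above.

```python
RATINGS_LOGGED = 81    # RATE requests in the Apache logs (table 4.2)
USABLE_RATINGS = 31    # unbiased final ratings (table 4.5)
VIEWS_LOGGED = 3677    # VIEW requests in the Apache logs (table 4.2)
VIEWS_DATABASE = 2770  # assumed page view count from the MediaWiki database


def percent(part, whole):
    """Express part as a percentage of whole."""
    return 100.0 * part / whole


rates = {
    "ratings per logged view": percent(RATINGS_LOGGED, VIEWS_LOGGED),
    "ratings per database view": percent(RATINGS_LOGGED, VIEWS_DATABASE),
    "usable ratings per logged view": percent(USABLE_RATINGS, VIEWS_LOGGED),
    "usable ratings per database view": percent(USABLE_RATINGS, VIEWS_DATABASE),
}

for label, value in rates.items():
    print(f"{label}: {value:.2f}%")
```

The lower bound of each range uses the larger (logged) view count, and the upper bound uses the smaller (database) count.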
Rated pages were manually assessed by the researcher to gain a baseline quality measurement (summarised in figure 4.4). Pages were judged on the potential usefulness of the information on a page (to an internal or external viewer), and the information content of the page. The scale used was a 1-5 rating, with intervals of 0.5. Judging usefulness meant that short pages generally received low scores. To be deemed useful, a page required a substantial amount of information relevant to the community, and required the content be coherent and sensible. A better rating could be sought, either by building a more strict marking scheme, or consulting an expert in the field of instructional gaming, however the ratings generated are believed to be a suitable starting point for analysis.
The manual ratings (figure 4.4) show a more even distribution than the system-generated ratings (figures 4.2 and 4.3), with the mode slightly lower than that of the system-generated ratings, and a minimum score of 1.5. These scores were compared with the system-generated ratings to see how effectively the rating system performed in the wiki.
Comparison of ("unbiased") system ratings and manual ratings (figure 4.5) shows little statistical correlation between the two. To the eye however, the main body of the graph shows more correlation. Removing all the system ratings of 5 makes this the main feature, and increases the correlation from 0.178 to 0.412. Overall however, the ratings generated by the rating mechanism in this environment were largely ineffective.
While some results may be improved with a better manual rating, it was believed, after reviewing the results available, that any improvement in correlation gained would be minimal.
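A comparison of this kind can be sketched as below. The text does not name the correlation coefficient used, so Pearson's r is assumed here; the function names and the handling of the "remove all system ratings of 5" step are illustrative only.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return covariance / sqrt(spread_x * spread_y)


def without_top_scores(pairs, top=5):
    """Drop (system, manual) pairs whose system rating equals the top score."""
    return [(system, manual) for system, manual in pairs if system != top]


def correlation(pairs):
    """Correlate system ratings with manual ratings for a list of pairs."""
    system, manual = zip(*pairs)
    return pearson(system, manual)
```

Calling `correlation(pairs)` and then `correlation(without_top_scores(pairs))` on the list of (system rating, manual rating) pairs would give the before and after values for a comparison like the one described.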
Attention Data
A secondary focus of the experiment was to study attention data. Attention data gives a measure of the amount of attention (measured in time) spent on a task (rating) or object (page). Attention data was collected from the Apache logs and processed by a script that identified each user's actions individually and, by processing them chronologically, determined the amount of time between actions.
A simple measure to study the usefulness of attention data is to study the time a user spends reading a page before they rate it. Given that ratings generated by the rating system appear invalid, manual ratings will be used for comparison. Time in this analysis is measured in seconds per character. This gives a standard measurement regardless of the length of a page.
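The attention measurement can be sketched as follows. The record layout (user id, timestamp, action, title), the function names, and the sample values are assumptions for illustration; the script used in the study is not reproduced here.

```python
from collections import defaultdict
from datetime import datetime


def dwell_times(records):
    """Compute the time between each action and the same user's next action.

    records is an iterable of (userid, timestamp, action, title) tuples in
    any order. Returns a list of (userid, title, action, seconds) tuples,
    where seconds is the gap until that user's following action.
    """
    by_user = defaultdict(list)
    for record in records:
        by_user[record[0]].append(record)
    result = []
    for userid, user_records in by_user.items():
        user_records.sort(key=lambda r: r[1])  # chronological order
        for current, following in zip(user_records, user_records[1:]):
            gap = (following[1] - current[1]).total_seconds()
            result.append((userid, current[3], current[2], gap))
    return result


def seconds_per_character(seconds, page_length):
    """Normalise a reading time by the length of the page in characters."""
    if page_length <= 0:
        raise ValueError("page length must be positive")
    return seconds / page_length


# Illustrative records only, not taken from the study.
sample = [
    (7, datetime(2006, 8, 8, 14, 1, 30), "RATE", "Page_A"),
    (7, datetime(2006, 8, 8, 14, 0, 0), "VIEW", "Page_A"),
]
gaps = dwell_times(sample)
```

In this sketch the last action of each user has no following action and therefore produces no measurement.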
The attention data for the time spent rating pages (figure 4.6) shows a slight correlation between time taken to rate and the manual rating assigned to a page. Values 3, 3.5, 4, and 5 (the values into which most of the manual ratings fell) show an almost linear sequence, tending downwards; that is, users spent less time reading better pages before rating them. There is, however, not enough data to reasonably show this beyond chance occurrence.
Attention data for the time spent viewing pages was gathered and filtered to attempt to remove two types of events: first, page views where the user immediately went on to another page, and second, views where the user had visited the page and ended their viewing, either by closing the browser, or by leaving the browser open at that page for an extended period of time. Only the middle half of values were analysed, removing the top and bottom quarters. Pages with too few views were also excluded to reduce outlying values. After filtering, the data for the time spent viewing pages (figure 4.7) shows only a slight statistical correlation of -0.344 when compared to manual ratings.
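The filtering can be sketched as below. The trimming rule (dropping a quarter of the values, rounded down, from each end of the sorted list) and the minimum view threshold are assumptions; the text does not state how the quarters were computed or how many views counted as too few.

```python
MIN_VIEWS = 4  # hypothetical threshold for "too few views"


def middle_half(values):
    """Keep the middle half of the values, dropping the top and bottom quarters."""
    ordered = sorted(values)
    quarter = len(ordered) // 4
    return ordered[quarter:len(ordered) - quarter]


def pages_with_enough_views(times_by_page, min_views=MIN_VIEWS):
    """Drop pages that have fewer than min_views recorded viewing times."""
    return {
        page: times
        for page, times in times_by_page.items()
        if len(times) >= min_views
    }
```

Removing the bottom quarter targets views where the user moved on immediately, while removing the top quarter targets sessions left open or abandoned.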
Counting page views (figure 4.8) also gives a rough measure of attention. This simple count shows a very high number (641) of views of pages manually assigned a rating of 5. When considering there were only three pages rated at five this count is surprising. Closer inspection finds that these three pages are the main page of the wiki, and two revisions of the assignment questions. The main page is a high visibility page, and the default entry point for anyone visiting the wiki. The assignment questions were important to the participants of the wiki, as they contained information important to their success in their studies. After removing this anomaly, the distribution of views (figure 4.8) compared to the number of pages in each group (figure 4.4) shows no useful information.
Survey
Given the behavioural findings in the last section, this research sought to explain these behaviours. Two major findings in the logs were the low number of ratings, and the fact that a large number of ratings were identified as having a potential for bias. The questionnaire sought to understand the students' level of participation in the wiki, and what factors influenced it. It sought to understand how students viewed the rating system, how they used it, and possible biases. The survey asked about the students' confidence in completing the POD exercises, and in learning about a game and its concepts. It asked how the participant felt submitting their work to a public wiki, as well as gauging their understanding of the wiki. The survey also asked for some basic demographics, and assessed the participants' computer self-efficacy and their experience with online tools.
The questionnaire was conducted online and contained 57 questions, consisting of 51 numeric ratio entry questions, 1 short answer and 5 extended response boxes.
Thirteen students were asked to participate in the wiki, and 11 of them completed the survey. The 85% response rate is due to strong encouragement from the lecturer of the subject, and to the survey being substituted for what was originally an assessable task. Since the results of the survey were designed to provide guidance in interpreting the log results, the low sample size of 11, given the population size of 13, was not seen as a problem.
The ages of participants ranged from 20 to 32, with 20 the mode, an average of 24.4, and a median of 22. Participants were eight male and three female, with a roughly equal split of internal and distance education students, 55% internal, 45% external. Every respondent was completing an information technology or information science based degree. These results were expected for an information technology focused subject at CSU.
The survey sought to understand participants' confidence and ability with computing tasks. Students were asked to report their confidence in performing a number of tasks. Allowable responses followed a standard five-point Likert scale, and were as follows:
- strongly disagree
- disagree
- undecided
- agree
- strongly agree
Results received from students were processed, and ordered by average response in table 4.6.
| Average | Median | Min | Max | Question |
|---|---|---|---|---|
| 4.8 | 5.0 | 4 | 5 | I feel confident using a personal computer |
| 4.8 | 5.0 | 4 | 5 | I feel confident copying and moving files to removable media (floppy disc, CD, or USB thumb/flash drive) |
| 4.8 | 5.0 | 4 | 5 | I feel confident deleting files when they're no longer needed |
| 4.8 | 5.0 | 4 | 5 | I feel confident checking email |
| 4.7 | 5.0 | 4 | 5 | I feel confident sending email |
| 4.6 | 5.0 | 4 | 5 | I feel confident learning how to use new software |
| 4.6 | 5.0 | 3 | 5 | I feel confident using instant messaging |
| 4.6 | 5.0 | 4 | 5 | I feel confident using an online forum |
| 4.6 | 5.0 | 4 | 5 | I feel confident installing software |
| 4.5 | 5.0 | 3 | 5 | I feel confident starting a computer program |
| 4.5 | 5.0 | 4 | 5 | I feel confident using word processing software to format a letter or essay |
| 4.5 | 5.0 | 4 | 5 | I feel confident using help features in software |
| 4.5 | 5.0 | 4 | 5 | I feel confident locating information on the Internet |
| 4.3 | 4.0 | 3 | 5 | I feel confident understanding HTML |
| 4.2 | 5.0 | 3 | 5 | I feel confident writing web pages |
| 4.2 | 4.0 | 3 | 5 | I feel confident writing a blog |
| 4.1 | 4.0 | 3 | 5 | I feel confident troubleshooting hardware problems |
| 4.0 | 4.0 | 3 | 5 | I feel confident troubleshooting software problems |
| 3.6 | 4.0 | 2 | 5 | I feel confident using an object-oriented programming language |
| 3.5 | 4.0 | 2 | 5 | I feel confident using a functional programming language |
| 3.4 | 4.0 | 1 | 5 | I feel confident writing computer applications |
| 3.3 | 3.0 | 2 | 5 | I feel confident using a scripting programming language |
- Table 4.6 "Ranked list of students' computer self-efficacy responses"
Computer self-efficacy results (table 4.6) show a highly confident group of participants. The students, who should be in at least their second year of study, have most likely been exposed to most of the skills questioned, and these results confirm this. The results show that some participants are not confident writing web pages or blogs, performing troubleshooting, or using programming languages. An understanding of programming languages would make editing wiki-text obvious to the user; however, understanding how HTML generates a web page, and knowing how to format documents with a word processor, should be sufficient skills to easily write wiki-text.
Participants were asked about their previous experience and exposure to wikis. No respondent agreed that they knew what a wiki was before participating in this research, however three of the participants reported they were unsure. When asked if they had contributed to a wiki before, three were uncertain, while the remainder were confident that they had not. Realising that many people do not understand that there is a collaborative model behind Wikipedia, participants were also asked if they had ever used Wikipedia. Most (9) participants were unsure, while the remainder reported they had not.
To understand how effectively students learned to use the wiki system, a short wiki self-efficacy questionnaire was completed by students.
| Average | Median | Min | Max | Question |
|---|---|---|---|---|
| 4.5 | 4.0 | 4 | 5 | I now feel confident adding text to a wiki |
| 4.5 | 4.0 | 4 | 5 | I now feel confident using headings in a wiki |
| 4.3 | 4.0 | 3 | 5 | I now feel confident participating in discussions in a wiki |
| 4.3 | 4.0 | 4 | 5 | I now feel confident creating articles |
| 4.3 | 4.0 | 4 | 5 | I now feel confident understanding which is the correct name space to use |
| 4.2 | 4.0 | 3 | 5 | I now feel confident making links in a wiki |
| 4.2 | 4.0 | 3 | 5 | I now feel confident using lists (numbered lists or unnumbered bullets) in a wiki |
| 4.2 | 4.0 | 3 | 5 | I now feel confident uploading files to a wiki |
| 3.9 | 4.0 | 3 | 5 | I now feel confident adding a signature to a wiki discussion |
- Table 4.7 "Ranked list of students' wiki self-efficacy responses"
Wiki self-efficacy results (table 4.7) show that all participants were confident using the most important features of the wiki. One respondent was unsure about each of participating in discussions, making links, and using lists. Three respondents were unsure when uploading files, and four reported they were unsure how to sign comments in discussion. It was found that discussion was rarely used in the wiki. Uploading files and making lists were features taught and encouraged, but not required. Making links was a skill required of students in completing the POD tasks.
When participants were asked if they had rated any articles, only six reported they had done so. As reported above, some members may be hesitant to rate items in a small community where the benefits of rating are unclear. Rashid et al. (2006) found that the number of ratings contributed can be increased by illustrating certain benefits to the user. When asked if they understood how to use the rating system, most responded positively, three were unsure, and only three were highly confident. On average, those who did not rate were slightly more confident in understanding the rating system than those who did rate.
The survey attempted to identify if there may have been any bias in ratings. When asked to honestly assess the accuracy of their ratings, participants were largely uncertain. One respondent felt their ratings were not accurate, while only four showed any confidence in their own ratings. Participants had less confidence in other users' ratings. Eight respondents displayed neither agreement nor disagreement with the statement that other people's ratings were accurate.
The exercises completed in the subject were intended to allow students to get to know each other, to work together, and to generate friendship between peers. It was therefore pertinent to ask if participants felt they were likely to rate their friends' articles more favourably. A range of responses was received, from strong agreement (2) to strong disagreement (1). Overall two disagreed with the statement, while five expressed agreement.
Overall, users of the wiki found ratings to be only slightly useful: one respondent found the system not useful, while five found it only slightly useful.
The survey asked students what factors they felt were important when rating articles. The three factors presented were visual quality, factual quality, and writing style. Students reported factual quality to be the most important factor, with all students expressing either agreement or strong agreement with the statement. The next most important factor identified was the visual quality. For this statement, a wider spread of responses was received from disagree to strongly agree. Writing style returned a similar spread of responses, with a slightly lower average, but still with general agreement. Very few students responded as unsure to any of these statements (one each for visual quality and writing style).
Respondents were given the opportunity to respond freely if they felt other factors were important. Two respondents commented that language used in articles should be easy to understand, with one emphasising the importance of clear descriptions. Two respondents expressed that organisation of information should be well thought out. Two respondents also said that articles should be written to allow for a wide audience.
(Anonymous respondent)
One student provided a view on how ratings impact viewers.
(Anonymous respondent)
Participants were questioned about the Pools of Online Dialogue (POD) exercises they were asked to complete using the wiki. Although there was no strong agreement with the statement, eight reported the POD exercises easy, while the remainder were undecided. Most respondents were confident in understanding the game they chose. Eight responded positively to the statement, with one of those a strong agreement. The remainder were undecided with the exception of one who found understanding the game a challenge. When asked what was found difficult about completing the POD exercises, the most common response was unfamiliarity with computer games, with three such responses. Two respondents reported finding a game to study a challenge; however, no participant explained the reason for this. One participant also cited poor personal organisation and time management as an issue, while another found it a challenge getting to know the other POD members they were assigned to work with.
Students at CSU have been increasingly encouraged to make use of communication technology in recent years. Among other services, CSU provides subject forums, which are secure (available only to authorised members) web-based forums, where students may communicate with their peers and lecturer. The wiki however presents a much more open form of communication, where students are required to place their own work in a globally available space, where their work may be critiqued by a wide audience. The survey sought to discover if this had an influence on students' behaviour. Students were asked to rate how comfortable they were contributing to the wiki. All students responded positively, with the exception of one who declined to respond. Of those who responded, three students indicated they were strongly confident.
Students were asked to describe further their feelings working in such an open environment. Students were provided with a short list of words, excited, enthusiastic, proud, confident, indifferent, nervous, unsure, confused, and shy, and asked to provide three words, either from the list, or of their own, that describe how they felt contributing content to the wiki.
In total eight responses were received, including about 19 single word answers, and three longer descriptions. These longer descriptions were simplified to single words, or short phrases, creating 23 words or phrases in total. These words and phrases were studied and grouped into five categories, representing different sentiments. These groupings were "strongly confident", "positive response", "neutral", "relief", and "doubting" (see table 4.8).
| Category | Number of Responses | Percentage |
|---|---|---|
| strongly confident | 11 | 48% |
| positive response | 5 | 22% |
| neutral | 3 | 13% |
| relief | 2 | 9% |
| doubting | 2 | 9% |
| total | 23 | 100% |
- Table 4.8 "Distribution between categories of students' feelings when using wikis"
In the strongly confident category were a total of 11 responses expressing very positive and energetic feelings towards using the wiki (table 4.8). Among the words in this category were "proud" (4 times), "confident" (3 times), "enthusiastic" (2 times), "excited" and "helpful". The positive response category can be summarised by the response "happy once completed". In total five responses fell into this category, the remaining being "good", "unabashed", "achievement", and "good experience". Three neutral responses were received, with two students reporting they felt "indifferent", with the third reporting feeling "blazae". Two students reported feelings of "relief" after submitting to the wiki. Finally two students reported not feeling confident about participating in the wiki, replying with "unsure" and "nervous".
While some students used three words to describe similar feelings, there were also instances of varied feelings in combination. Three students expressed a mix of a sense of achievement and of relief, while another expressed enjoyment, while still being unsure about their contributions.
Students were asked to elaborate on their feelings by explaining their choice of words. A mixed response was received. Two responses expressed a lack of special enthusiasm for the wiki as a medium, and that the only reason for participation was the completion of their required class-work. "It just felt like placing an article online for people to read", and is "[not] something to go crazy about". Another responded "it was an assignment. I probably wouldn't do it out of my own interest in Wikis".
Two respondents explained their relief came from having completed another university assignment task. One respondent however, as well as expressing relief, explained they were proud to have contributed to their POD group, despite being unsure how their work would be received by others.
One reportedly excited and enthusiastic student was particularly interested in the outcome of the wiki, stating the activities were "something different", and that they were "keen to see how the Wiki would turn out". Two other students were enthusiastic and interested in the medium. One pointed out the importance of online communities, and that "to share our ideas and arguments is obviously a good thing". Another was pleased to have a space where they could "contribute [their] ideas and knowledge, sharing with other people", and likewise, to "learn different ideas and information from [other people]". "I felt enthusiastic, proud and confident when I posted my contributions".
Throughout the written feedback, there were several general comments about using the wiki. One student explained how they overcame problems when trying to understand and write wiki-text:
(Anonymous respondent)
This comment shows a level of problem-solving ability, similar to that of hacker cultures (Raymond 2006), a personality believed to be common among wiki contributors.
Several responses, all positive in nature, were received regarding the personal learning experience of using the wiki. One expressed pride over using a wiki in this subject for the first time, while another rejoiced in the "new learning methods" and the knowledge they gained through using online tools.
One student summed these sentiments up in their response:
(Anonymous respondent)
This response exemplifies several important features of wiki communities. Stallman (1999) explained how small contributions to wikis are important, and how wikis work as a communication and brainstorming technology, where users can contribute ideas and learn from others.
Commands
This section summarises the commands used to generate figures, as indicated in subscript throughout the chapter.
| 1 | SELECT `action`, (SELECT COUNT(*) from `log` where `log`.`action`=`actions`.`action`) as `total`, (SELECT COUNT(*) from `log` where `log`.`action`=`actions`.`action` and `log`.`userid` in (1,2,3)) as `sup_count` , (SELECT COUNT(*) from `log` where `log`.`action`=`actions`.`action` and `log`.`userid` not in (1,2,3)) as `student_count` from (SELECT `action` FROM `log` group by `action`) as `actions` UNION SELECT 'total', (SELECT COUNT(*) FROM `log`), (SELECT COUNT(*) FROM `log` where `userid` in (1,2,3)), (SELECT COUNT(*) FROM `log` where `userid` not in (1,2,3)); |
| 2 | SELECT log_action, COUNT(*) FROM logging where log_timestamp>20060731145656 group by log_action |
| 3 | SELECT COUNT(*) FROM ratings; |
| 4 | SELECT COUNT(*) FROM ratings, revision, page where page_oldid=rev_id and rev_page=page_id and user_id not in (SELECT rev_user FROM revision as r where r.rev_page=page_id) |
| 5 | SELECT COUNT(*) FROM `user` |
| 6 | SELECT * FROM revision WHERE rev_timestamp>20060731145656 and rev_user_text != 'MediaWiki default'; |
| 7 | SELECT COUNT(*) from (SELECT count(*) FROM revision WHERE rev_user_text != 'MediaWiki default' group by rev_page) as `pages`; |
| 8 | SELECT SUM(page_counter) FROM page |
| 9 | select count(*) from (select userid, timestamp, namespace, title, min(rev_timestamp) as edittime, url from log, page, revision where action='RATE' and page_title=title and page_namespace=namespace and rev_page=page_id and rev_user=userid group by date, url) as log where timestamp>edittime |
| 10 | select count(*) from (select userid, timestamp, namespace, title, max(rev_timestamp) as edittime, url from log, page, revision where action='RATE' and page_title=title and page_namespace=namespace and rev_page=page_id and rev_user=userid group by date, url) as log where timestamp<edittime |
- Table 4.9 "Commands used to generate figures"
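To illustrate how command 4 separates ratings made by non-editors from ratings made by a page's own authors, the following sketch runs the same query logic against a toy in-memory SQLite database. It is only a sketch: the table layouts and sample rows are assumptions made for this example, and only the column names that appear in command 4 are taken from Table 4.9. The real MediaWiki and rating extension tables contain many more columns.

```python
import sqlite3

# Toy schema holding only the columns that command 4 refers to (assumed layout).
SCHEMA = """
CREATE TABLE page     (page_id INTEGER PRIMARY KEY);
CREATE TABLE revision (rev_id INTEGER PRIMARY KEY, rev_page INTEGER, rev_user INTEGER);
CREATE TABLE ratings  (user_id INTEGER, page_oldid INTEGER);
"""

# Same logic as command 4 in Table 4.9: count ratings whose author never
# edited the page that the rated revision belongs to.
COMMAND_4 = """
SELECT COUNT(*) FROM ratings, revision, page
WHERE page_oldid = rev_id AND rev_page = page_id
  AND user_id NOT IN (SELECT rev_user FROM revision AS r WHERE r.rev_page = page_id)
"""


def count_non_editor_ratings(revisions, ratings):
    """revisions: (rev_id, rev_page, rev_user) tuples; ratings: (user_id, page_oldid) tuples."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    # Derive the page rows from the pages mentioned in the revisions.
    conn.executemany("INSERT INTO page VALUES (?)", sorted({(r[1],) for r in revisions}))
    conn.executemany("INSERT INTO revision VALUES (?, ?, ?)", revisions)
    conn.executemany("INSERT INTO ratings VALUES (?, ?)", ratings)
    (count,) = conn.execute(COMMAND_4).fetchone()
    conn.close()
    return count


# Made-up sample: page 1 was edited by users 5 and 6, page 2 by user 7.
SAMPLE_REVISIONS = [(10, 1, 5), (11, 1, 6), (20, 2, 7)]
# Users 5 and 7 each rate a page they edited; users 7 and 6 each rate a page they did not.
SAMPLE_RATINGS = [(5, 11), (7, 10), (6, 20), (7, 20)]

if __name__ == "__main__":
    print(count_non_editor_ratings(SAMPLE_REVISIONS, SAMPLE_RATINGS))
```

In the made-up sample, two of the four ratings come from users who never edited the rated page, so only those two are counted.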
Conclusion
This chapter presented, analysed and discussed the data collected throughout this research, from both the wiki experiment and the survey. The next chapter will draw conclusions, make recommendations, and present topics for further research.
Discussion and Conclusion
Chapter 4 presented an analysis of the data collected throughout this research: the behavioural analysis gained from log file data, and the survey results used to better understand users' behaviour. This chapter will review the project, draw conclusions from the analysis, and place those conclusions within the available literature, showing how this research contributes to the field. The chapter will also explain the limitations of the research conducted, and identify paths for future research in the area.
Review
This research began by reviewing literature in the areas of online communities and wikis, with the aim of discovering gaps in the literature that open possibilities for research. The literature review explored the history of the wiki concept, and followed the growth of the popular Wikipedia, along with its criticisms. Many of these criticisms came out of the fact that Wikipedia calls itself an encyclopedia, and cannot provide the expected level of reliability while using a radically open-editing model.
Several theoretical viewpoints of wikis were presented, along with models behind the design, and visions for the future of wikis. A wide range of uses of wiki technology were explored; personal and collaborative uses were reviewed, along with the various writing, collaboration, and communication styles used. This led into a discussion on open-content communities, focussing specifically on the Wikipedia community and how the community functions under a self-governance model. Finally on wikis, the literature review compared wiki systems with other technologies, and explored in detail several common wiki implementations.
The literature review concluded with a survey of trust, reputation and attention, exploring the mechanics of trust, and several failures of trust in Wikipedia. Reputation was defined, and several content reputation systems surveyed. Attention was introduced, and an explanation of the use of attention data was provided.
Several gaps were identified during the literature review, leading to several questions. Chapter three selected some of these questions as the focus of this research, and developed methods to answer them. The primary goal was to see if there were any methods by which a wiki community can provide some level of authority. One proposed solution is to have an authoritative group review articles; however, this in itself is a huge task. This research tests methods of rating articles, so only good articles need to be validated. The two methods studied were user ratings, and attention data.
A MediaWiki system was installed, and extended with a simple rating mechanism. Over almost two months, users' actions on the wiki were recorded. At the end of the data collection period, the logs from the wiki were analysed, to determine the effectiveness of the system generated ratings, and of the attention data collected. A survey was then distributed to participants of the wiki. The survey sought to elicit from the students their perceptions of the wiki and the rating system, to better explain the behaviours observed in the wiki.
Limitations
The research presented here was completed in partial fulfilment of an Honours research project, and had to meet the requirements thereof. A limit of one school year is a non-negotiable requirement. Ethics approval also had to be sought before research involving human participants could commence.
This research focused on a small group of undergraduate students, and showed how the dynamics of a small group differ from the expected behaviour of a larger wiki. The unexpectedly low amount of data makes for weak answers to the research questions; however, the research exemplifies the limitations of small wikis, and provides valuable lessons for future wiki designers.
The findings presented in this research may not be successfully extrapolated to larger wikis, or wikis used outside of an educational setting.
Findings
Two methods were trialled in this research in a small wiki. While neither provided accurate results in this setting, further research needs to be done with larger and established wikis. Further research may also develop alternate methods to test.
Initial analysis of the data highlighted the care with which MediaWiki log data must be analysed. Discrepancies were found between Apache log data and the MediaWiki statistics available. These are explained in chapter four as being caused by failed attempts at an action, or actions that require more than one step to complete. Such requests inflate the Apache logs, requiring careful filtering of log data.
Analysis of ratings did find that the vast majority of ratings were positive, with at least 78% of ratings found in the top 40% of the rating scale. Dellarocas (2001) and Resnick and Zeckhauser (2002) also found an overwhelming number of positive ratings when studying eBay ratings.
Logs showed that while between 0.84% and 1.1% of page views resulted in ratings, 22% of these were corrections made on previous ratings, and only 49% of these ratings were usable as reliable results. The remainder of ratings were made by users who had also edited the pages they rated.
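The three groups of ratings described above (corrections, ratings by users who had edited the page, and the remaining usable ratings) can be separated mechanically. The sketch below shows one way this might be done. The input format and the function name `classify_ratings` are hypothetical, and the rule that a repeat rating counts as a correction even when the rater is also an editor is an assumption of this example, not a description of the analysis performed in chapter four.

```python
from collections import Counter


def classify_ratings(ratings, editors_by_page):
    """Label each rating as 'correction', 'self-rating' or 'usable'.

    ratings: chronological list of (user, page, score) tuples.
    editors_by_page: dict mapping a page to the set of users who edited it.
    """
    seen = set()      # (user, page) pairs that have already been rated
    labels = []
    for user, page, _score in ratings:
        if (user, page) in seen:
            # The same user re-rating the same page revises an earlier rating.
            labels.append("correction")
        elif user in editors_by_page.get(page, set()):
            # The rater is one of the page's own authors.
            labels.append("self-rating")
        else:
            labels.append("usable")
        seen.add((user, page))
    return Counter(labels)


# Made-up sample data, for illustration only.
SAMPLE_RATINGS = [
    ("ann", "A", 4),
    ("bob", "A", 5),   # bob edited page A
    ("ann", "A", 3),   # ann revises her earlier rating
    ("cat", "B", 5),
    ("bob", "B", 4),
]
SAMPLE_EDITORS = {"A": {"bob"}, "B": {"dan"}}

if __name__ == "__main__":
    counts = classify_ratings(SAMPLE_RATINGS, SAMPLE_EDITORS)
    total = sum(counts.values())
    for label, n in counts.items():
        print(f"{label}: {n} ({100 * n / total:.0f}%)")
```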
In small education wikis, no.
In this trial, pages generated by users were assessed manually by the researcher. The analysis involved a comparison of the manually assessed ratings with the system generated ratings. Little correlation was found between the two, and thus it was concluded that, in the wiki community trialled, user ratings were not a reliable indicator of article quality.
Bias towards other members was identified as a factor that possibly reduced the accuracy of user ratings. The participant survey determined that this may be a significant factor. This factor may reduce in significance as the size of the community increases, and the likelihood of interaction between friends lowers.
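A comparison of this kind can be expressed as a rank correlation between the two sets of ratings. The sketch below computes Spearman's rank correlation coefficient in plain Python. The choice of Spearman's coefficient is an assumption made for illustration, since this section does not state which measure was used, and the sample values are invented.

```python
def average_ranks(values):
    """Rank values from 1, giving tied values the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one series is constant")
    return cov / (var_x * var_y) ** 0.5


def spearman(manual, system):
    """Spearman's rho: the Pearson correlation of the two rank sequences."""
    if len(manual) != len(system) or len(manual) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    return pearson(average_ranks(manual), average_ranks(system))


# Invented ratings for five pages: researcher's score vs mean user rating.
MANUAL = [2, 4, 3, 5, 1]
SYSTEM = [4.5, 4.0, 5.0, 4.5, 4.0]

if __name__ == "__main__":
    print(f"Spearman's rho: {spearman(MANUAL, SYSTEM):.2f}")
```

A coefficient near 1 or -1 would indicate that the two orderings of pages largely agree or disagree, while a value near 0 would be consistent with the finding of little correlation.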
Current evidence suggests factual quality is most important.
The participant survey tested three possible answers to this question. Results show that all students sampled felt that factual quality is the most important factor. Visual quality and writing style were considered slightly important; however, a varied response showed they are not universally considered important. There was not enough data from the Apache logs to allow this question to be tested based on users' behaviour.
In small wikis, no, although analysis suggests attention data may be more useful in larger wikis.
Analysis of data showed some correlation between time spent rating pages, and their quality. Comparisons between time taken to view, and article quality showed little correlation. A known problem with using attention data this way is that users may visit a good quality page that they have seen before, and quickly move on to another page. In a larger wiki with more pages, the dynamics of this behaviour may change.
Using Apache logs as an attention metric also has its limitations. Client-side monitoring would provide richer data, giving the researcher a better idea of whether the user was actually reading a page, had simply left the page open in the browser while working on another task, or had closed the browser entirely.
Major stakeholders of wiki pages (i.e. the principal authors of a page) will often monitor that page for changes. Any change to that page will be reviewed quickly, and often fixed (if the edit was poor), or appended with ideas brought about by the new edit. This creates a very active community, where ideas are exchanged quickly. This behaviour was not observed in this smaller wiki.
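One simple way to derive a viewing-time estimate from server logs is to treat the gap between a client's consecutive requests as the time spent on the earlier page. The sketch below illustrates that idea only. It is not the procedure used in chapter four: the input format, the function name `dwell_times` and the 30-minute idle cut-off are all assumptions of this example.

```python
from collections import defaultdict


def dwell_times(requests, idle_cap=1800):
    """Estimate viewing times from request timestamps.

    requests: (client, timestamp_in_seconds, url) tuples, in any order.
    idle_cap: gaps longer than this many seconds are discarded, on the
              assumption that the user had stopped reading (hypothetical value).
    Returns a dict mapping each url to a list of estimated viewing times.
    """
    by_client = defaultdict(list)
    for client, timestamp, url in requests:
        by_client[client].append((timestamp, url))

    times = defaultdict(list)
    for visits in by_client.values():
        visits.sort()
        # Pair each request with the same client's next request. A client's
        # final request has no successor, so it yields no estimate.
        for (timestamp, url), (next_timestamp, _next_url) in zip(visits, visits[1:]):
            gap = next_timestamp - timestamp
            if gap <= idle_cap:
                times[url].append(gap)
    return dict(times)


# Made-up requests: client "a" views P1 for 40 seconds, then leaves P2 open.
SAMPLE_REQUESTS = [
    ("a", 0, "/P1"),
    ("a", 40, "/P2"),
    ("a", 5000, "/P1"),
    ("b", 10, "/P1"),
    ("b", 25, "/P3"),
]

if __name__ == "__main__":
    for url, seconds in dwell_times(SAMPLE_REQUESTS).items():
        print(url, seconds)
```

The sketch also shows why such estimates are coarse: a short gap cannot distinguish a reader who skimmed a familiar page from one who lost interest, and the last page of every visit is never measured.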
There was very little collaboration between users on most pages. Observation suggests students may perceive pages where the entirety of text was written by another student as belonging to that student, and are hesitant to edit it.
It was expected that opening up a wiki would increase editing, by allowing opinion into a wiki, making users feel more free to contribute. No significant exchange of ideas was observed in the wiki.
A possible explanation is that the number of participants remained below critical mass, the point at which a wiki has enough members and momentum that any change will be responded to by another user. There was not enough interaction seen in the wiki trial for the allowing of subjective information to have any effect.
Observation of the wiki found little interaction between participants. One of the exercises required of students was to create a shared page, where each student was required to participate by introducing themselves. On these pages, a significant amount of interaction was observed. While it was rare to see any interaction where students posted their personal assignment tasks, most shared pages experienced a high level of collaboration, and evolved quickly. These pages saw students adopt a shared presentation style, and in some cases, a form of group identity, where the group adopted a shared name.
The participant survey elicited mixed feelings towards the wiki as a learning tool. Some saw it as just a form of publishing, whereas others identified the potential for shared learning and collaboration.
The Georgia Institute of Technology, with their CoWeb wiki system, saw some benefits of wiki use by students (Leuf & Cunningham 2001). In the teaching environment studied in this research it was seen that, in the right conditions, students form a group identity and shared respect. Students also find the wiki an enjoyable and efficient medium for discussion, learning, and sharing of ideas.
The need for further research
Observations made during this research highlight the behavioural differences between small and large wikis. Wikipedia is a common target of study, due to the need to solve problems with its reliability. There is comparatively little study done on small and medium wikis. There is a need for future work to study the dynamics and interactions of smaller wikis, and to determine a software tool-set that best fits their needs, so that they can be most effectively used. Further work is also needed to understand how best to design and manage small wikis in educational settings.
This research observed a high number of potentially "biased" ratings. These ratings were made by participants who rated the same pages they edited. Further work can determine, firstly, whether user ratings produce reliable recommendations in larger wikis and, secondly, whether the high number of such ratings is a feature only of small wikis, and whether such ratings are indeed inaccurate.
Results of this research showed a high number of positive ratings. This has also been observed in communities such as eBay (Dellarocas 2001, Resnick & Zeckhauser 2002); however, Malda (1999) through his work has created a system (and community) that generates more accurate ratings. Are there any features of the software system, its mechanisms, or the community that can account for these differences?
Wikis comprise a community of editors, and over time, networks of friends develop. Do these social networks affect the way members perceive friends' articles, and ultimately ratings of such articles?
Analysis in this research showed inconsistencies between the manual ratings, and the system generated ratings. Is there a need for some form of training so that users understand how articles should be critiqued and assigned an accurate rating?
Established wikis form a culture, and with it a set of norms, expectations and guidelines (written or otherwise), that new members learn and follow when joining the community. Is training required for members in private (corporate) and educational wikis, to act as a surrogate for this culture and cultivate an effective wiki in its early stages?
Conclusion
This chapter provided an overview of literature reviewed in this research, and revisited the research questions. It summarised the processes followed in this research, and their limitations. The chapter then answered the research questions presented in chapter three. Finally, it concluded by providing several paths for future research focusing on small scale and educational wikis and their communities.
Bibliography
Alexa Internet 2006, 'Top 500', last edited 28 August, viewed 28 August 2006, <http://www.alexa.com/data/details/main?q=&url=http://www.wikipedia.org>.
Alexa Internet Inc 2006, 'Related Info for wikipedia.org', last edited 24 March, viewed 24 March 2006, <http://www.alexa.com/data/details/traffic_details?&range=2y&size=large&compare_sites=&y=t&url=http://www.wikipedia.org>.
Anthony, S 2006, 'Contribution Patterns Among Active Wikipedians: Finding and Keeping Content Creators', last edited 5 August, viewed 20 September 2006, <http://upload.wikimedia.org/wikipedia/wikimania2006/7/71/SA1_slides.pdf>.
Challborn, C & Reimann, T 2005, 'Wiki Products: A comparison', electronic version, The International Review of Research in Open and Distance Learning, viewed 28 August 2006, <http://www.irrodl.org/index.php/irrodl/article/view/229/312>.
Charles Sturt University 2006, 2006 Undergraduate Handbook, electronic version, viewed 8 November 2006, <http://www.csu.edu.au/handbook/subjects/ITC213.html>.
Cunningham, W 1995a, 'c2 mail history', last edited 16 March, viewed 28 August 2006, <http://c2.com/wiki/mail-history.txt>.
Cunningham, W 1995b, 'Invitation To The Patterns List', last edited 1 May, viewed 28 August 2006, <http://c2.com/cgi/wiki?InvitationToThePatternsList>.
Cunningham, W 2003, 'Correspondence on the Etymology of Wiki', last edited November, viewed 28 August 2006, <http://c2.com/doc/etymology.html>.
Cunningham, W 2006a, 'c2:Wiki History', last edited 17 August, viewed 28 August 2006, <http://c2.com/cgi/wiki?WikiHistory>.
Cunningham, W 2006b, 'Wiki Design Principles', last edited 16 August, viewed 28 August 2006, <http://c2.com/cgi/wiki?WikiDesignPrinciples>.
Cunningham, W 2006c, 'Wiki Wiki Hyper Card', last edited 18 August, viewed 28 August 2006, <http://c2.com/cgi/wiki?WikiWikiHyperCard>.
Davenport, TH & Berk, JC 2002, 'The Strategy and Structure of Firms in the Attention Economy', electronic version, Ivey Business Journal, vol. 66, num 4, viewed 12 October 2006, <http://www.iveybusinessjournal.com/article.asp?intArticle_ID=377>.
David, S 2004, 'Opening the sources of accountability', last edited November, First Monday, vol.9 no.11, viewed 12 October 2006, <http://www.firstmonday.org/issues/issue9_11/david/index.html>.
Dellarocas, C 2001, Analyzing the Economic Efficiency of eBay-like Online Reputation Reporting Mechanisms, electronic version, ACM Conference on Electronic Commerce EC'01, Tampa, Florida, USA, viewed 14 February 2006, <http://ccs.mit.edu/dell/papers/ec01.pdf>.
Dellarocas, C 2002, 'Efficiency and Robustness of Mediated Online Feedback Mechanisms - The Case of eBay', last edited June, viewed 6 October 2006, <http://www.stanford.edu/group/SITE/Dellarocas.pdf>.
digg Inc. 2006, 'Frequently Asked Questions', viewed 12 October 2006, <http://digg.com/faq>.
Dill, M 2001, 'Welcome to Wikipedia', last edited 31 August, viewed 12 October 2006, <http://sourceforge.net/forum/forum.php?forum_id=108877>.
Eustace, K 2006, ITC213 Subject Outline, electronic version, Charles Sturt University, viewed 9 November 2006, <http://ispg.csu.edu.au/subjects/cscw/pods/instructions>.
Feng, J, Lazar, J & Preece, J 2004, 'Empathy and online interpersonal trust', viewed 15 February 2006, <http://www.ifsm.umbc.edu/~preece/Papers/trust_paper_BIT2.pdf>.
Goldhaber, MH. 1997a, 'Attention Shoppers!', electronic version, Wired Magazine, 5.12, viewed 12 October 2006, <http://www.wired.com/wired/archive/5.12/es_attention.html>.
Goldhaber, MH. 1997b, 'The Attention Economy: The Natural Economy of the Net', last edited 7 April, First Monday, Vol.2 No.4, viewed 12 October 2006, <http://www.firstmonday.org/issues/issue2_4/goldhaber/>.
Goldhaber, MH. 2006, 'The Real Nature of the Emerging Attention Economy', last edited 8 March, talk at O'Reilly "E-tech" Conference on "the Attention Economy", viewed 12 October 2006, <http://www.well.com/user/mgoldh/MMIG.pdf>.
Google Inc. 2005, 'Preventing comment spam', last edited 18 January, viewed 19 October 2006, <http://googleblog.blogspot.com/2005/01/preventing-comment-spam.html>.
Graham, P 2005, 'What Business Can Learn from Open Source', talk at O'Reilly Open Source Convention, viewed 24 March 2006, <http://paulgraham.com/opensource.html>.
Iskold, A 2006, 'From attention economy to attention architecture', last edited 19 September, viewed 12 October 2006, <http://alexiskold.wordpress.com/2006/09/19/from-attention-economy-to-attention-architecture/>.
Jøsang, A, Ismail, R & Boyd, C 2005, 'A Survey of Trust and Reputation Systems for Online Service Provision', electronic version, Decision Support Systems, to appear, <http://security.dstc.edu.au/papers/JIB2005-DSS.pdf>.
Kapor, M 2006, 'I'd Like to Have an Argument: Inspiration from Wikipedia about Collaborative Advocacy and Politics', Proceedings of Wikimania 2006, viewed 12 October 2006, <http://wikimania2006.wikimedia.org/wiki/Proceedings:MK2>.
Lampe, C & Resnick, P 2004, Slash(dot) and Burn: Distributed Moderation in a Large Online Conversation Space, electronic version, in Proceedings of the 2004 Conference on Human Factors in Computing Systems, CHI 2004, Vienna, Austria, April 24 - 29, 2004, Dykstra-Erickson, E & Tscheligi, M (ed.), Association for Computing Machinery, viewed 7 March 2006, <http://www.si.umich.edu/~presnick/papers/chi04/index.html>.
Leuf, B & Cunningham, W 2001, The Wiki Way - Quick Collaboration on the Web.
Linstone, HA. & Turoff, M 2002, The Delphi Method:Techniques and Applications, electronic version, Addison-Wesley, Reading, MA, viewed 20 September 2006, <http://www.is.njit.edu/pubs/delphibook/ch1.html>.
Ma, CPS 2005, 'What Makes WikiPedia So Special - The Social Cultural Economical Implications of the Wikipedia', viewed 19 May 2006, <http://scholar.google.com/scholar?hl=en&lr=&q=cache:-YNB4TInBKgJ:cathyma.net/wikipedia/cathyma_wikipedia.pdf+wiki+trust>.
Ma, CPS 2006a, 'The Social, Cultural and Economic Implications of the Wikipedia', viewed 27 August 2006, <http://web.hku.hk/~cathyma/asset/wiki_trust_overview_cathyma.pdf>.
Ma, CPS 2006b, 'Trust and Wikipedia: The roles of social capitals on participatory knowledge production', <http://wikimania2006.wikimedia.org/wiki/Proceedings:CM1>.
MacManus, R 2006a, 'Collaborative filtering: comparing Reddit's karma system to Digg', last edited 16 January, ZDNet Web 2.0 Explorer, viewed 12 October 2006, <http://blogs.zdnet.com/web2explorer/?p=99>.
MacManus, R 2006b, 'Interview with Digg's Kevin Rose, Pt 2: On Personalization and Fighting Spam', last edited 2 February, ZDNet Web 2.0 Explorer, viewed 12 October 2006, <http://blogs.zdnet.com/web2explorer/?p=109>.
MacManus, R 2006c, 'More evidence of GroupThink at Digg.com', last edited 9 January, ZDNet Web 2.0 Explorer, viewed 12 October 2006, <http://blogs.zdnet.com/web2explorer/?p=96>.
Malda, R 1999, 'Slashdot Moderation', last edited 9 September, viewed 18 December 2005, <http://slashdot.org/moderation.shtml>.
McHenry, R 2004, 'The Faith-Based Encyclopedia', last edited 15 November, Technology Commerce Society Daily, viewed 26 September 2006, <http://www.tcsdaily.com/article.aspx?id=111504A>.
McHenry, R 2006, 'The Faith-Based Encyclopedia Blinks', Technology Commerce Society Daily, viewed 6 September 2006, <http://www.tcsdaily.com/article.aspx?id=121305E>.
MediaWiki.org Wiki Contributors 2006a, 'MediaWiki FAQ', last edited 11 October, viewed 12 October 2006, <http://www.mediawiki.org/wiki/Help:FAQ>.
MediaWiki.org Wiki Contributors 2006b, 'Project of the Month', last edited 23 August, viewed 12 October 2006, <http://www.mediawiki.org/wiki/POTM>.
Myers, MD 1997, 'Qualitative Research in Information Systems', electronic version, MISQ Discovery, 20 May, pp.241-242, viewed 13 September 2006, <http://www.misq.org/discovery/MISQD_isworld/>.
Netcraft Ltd. 2006, 'October 2006 Web Server Survey', last edited 6 October, viewed 19 October 2006, <http://news.netcraft.com/archives/2006/10/06/october_2006_web_server_survey.html>.
Nunamaker, JF, Chen, M & Purdin, TDM 1991, 'Systems Development in Information Systems Research', Journal of Management Information Systems, vol 7 no 3, pp.89-106.
Open Source Initiative 2006, 'History of the OSI', viewed 18 October 2006, <http://www.opensource.org/docs/history.php>.
Orlowski, A 2005, 'There's no Wikipedia entry for 'moral responsibility'', last edited 12 December, The Register, viewed 6 September 2006, <http://www.theregister.co.uk/2005/12/12/wikipedia_no_responsibility/page2.html>.
Osterloh, M, Rota, S & Wartburg, MV 2001, 'Open Source-New Rules in Software Development', viewed 25 September 2006, <http://scholar.google.com/url?sa=U&q=http://www.iou.unizh.ch/orga/downloads/OpenSourceAoM.pdf>.
Preece, J 2000, Online Communities - Designing Usability, Supporting Sociability, John Wiley & Sons, West Sussex, England.
Rashid, AM, Ling, K, Tassone, RD, Resnick, P, Kraut, R & Riedl, J 2006, 'Motivating Participation by Displaying the Value of Contribution', Proceedings of ACM CHI 2006 Conference on Human Factors in Computing Systems, pp.955-958, viewed 10 March 2006, <http://www.si.umich.edu/~presnick/papers/CHI06/>.
Raymond, ES 2000, Homesteading the Noosphere, electronic version, viewed 20 September 2006, <http://www.catb.org/~esr/writings/cathedral-bazaar/homesteading/>.
Raymond, ES 2001, The Cathedral and the Bazaar, electronic version, O'Reilly Media, viewed 24 March 2006, <http://www.catb.org/~esr/writings/cathedral-bazaar/cathedral-bazaar/cathedral-bazaar.ps>.
Raymond, ES (ed.) 2003, 'Politics', A Portrait of J. Random Hacker, The Jargon File, viewed 6 September 2006, <http://www.catb.org/jargon/html/politics.html>.
Raymond, ES 2006, How To Become A Hacker, electronic version, <http://www.catb.org/~esr/faqs/hacker-howto.html>.
reddit.com 2006, 'frequently asked questions', viewed 12 October 2006, <http://reddit.com/help/faq>.
Resnick, P & Zeckhauser, R 2002, 'Trust Among Strangers in Internet Transactions: Empirical Analysis of eBay's Reputation System', electronic version, Baye, MR. (ed.), Advances in Applied Microeconomics, 11, viewed 12 October 2006, <http://www.si.umich.edu/~presnick/papers/ebayNBER/index.html>.
Resnick, P, Zeckhauser, R, Swanson, J & Lockwood, K 2006, 'The Value of Reputation on eBay: A Controlled Experiment', electronic version, Experimental Economics, Volume 9, Number 2, pp.79-101, viewed 28 August 2006, <http://www.si.umich.edu/~presnick/papers/postcards/index.html>.
Sanger, L 2002, 'Wikipedia and why it matters', last edited 16 January, talk delivered at the Stanford University Computer Systems Laboratory EE380 Colloquium, viewed 28 August 2006, <http://meta.wikimedia.org/wiki/Wikipedia_and_why_it_matters>.
Sanger, L 2004, 'Why Wikipedia Must Jettison Its Anti-Elitism', Kuro5hin, viewed 6 September 2006, <http://www.kuro5hin.org/story/2004/12/30/142458/25>.
Sanger, L 2005, 'The Early History of Nupedia and Wikipedia: A Memoir', last edited 18 April, Slashdot, viewed 28 August 2006, <http://features.slashdot.org/features/05/04/18/164213.shtml>.
Schwall, J 2003, 'The wiki phenomenon', viewed 27 August 2006, <http://www.schwall.de/dl/20030828_the_wiki_way.pdf>.
Seguy, D 2006, 'PHP statistics for August 2006', last edited 4 September, viewed 19 October 2006, <http://www.nexen.net/chiffres_cles/phpversion/php_statistics_for_august_2006.php#global>.
Seigenthaler, J 2005, 'Seigenthaler and Wikipedia:A Case Study on the Veracity of the "Wiki" concept Seigenthaler's Op-Eds', last edited 1 October, Project for Excellence in Journalism, viewed 12 October 2006, <http://www.journalism.org/node/1673>.
Sparks, M 2006, 'Wards Wiki Tenth Anniversary', last edited 11 July, viewed 28 August 2006, <http://c2.com/cgi/wiki?WardsWikiTenthAnniversary>.
Stallman, R 1999, 'The Free Universal Encyclopedia and Learning Resource', viewed 19 October 2006, <http://www.gnu.org/encyclopedia/free-encyclopedia.html>.
Straub, DW, Gefen, D & Boudreau, MC 2005, 'Quantitative Research', electronic version, Pries-Heje, D.AaJ. (ed.), pp.221-238, viewed 13 September 2006, <http://dstraub.cis.gsu.edu:88/quant/>.
Surowiecki, J 2004, The Wisdom of Crowds: Why the Many Are Smarter Than the Few and How Collective Wisdom Shapes Business, Economies, Societies and Nations, DoubleDay Books.
Swartz, A 2006, 'Who Writes Wikipedia?', last edited 4 September, viewed 20 September 2006, <http://www.aaronsw.com/weblog/whowriteswikipedia>.
Szybalski, A 2005, 'Why it's not a wiki world (yet)', last edited 14 March, viewed 27 August 2006, <http://www.stanford.edu/~andyszy/papers/wiki_world.pdf>.
Tanner, K 2002, Experimental research designs, in Research methods for students, academics and professionals: Information management and systems, Williamson, K (ed.), Centre for Information Studies, Charles Sturt University, Wagga Wagga, NSW, Australia.
Torkington, N 2006, 'Digging The Madness of Crowds', last edited 19 January, O'Reilly Radar, viewed 12 October 2006, <http://radar.oreilly.com/archives/2006/01/digging_the_madness_of_crowds_1.html>.
TWiki Developers 2006, 'TWiki Home Page', viewed 12 October 2006, <http://twiki.org/>.
UseModWiki Community 2005, 'UseModWiki Home Page', last edited 16 June, viewed 12 October 2006, <http://www.usemod.com/cgi-bin/wiki.pl>.
Varian, HR. 1995, 'The Information Economy: How much will two bits be worth in the digital marketplace?', electronic version, Scientific American, September 1995, pp.200-201, viewed 12 October 2006, <http://www.ischool.berkeley.edu/~hal/pages/sciam.html>.
Wagner, C 2004, 'Wiki- A Technology for Conversational Knowledge Management and Group Collaboration', electronic version, Communications of the Association for Information Systems, Volume13, 2004, pp.265-289, viewed 16 May 2006, <http://e-learning.pbwiki.com/f/CAIS%20Article%202004%20published.pdf>.
WardsWiki Community 2005, 'Wiki Wiki Origin', last edited 7 November, viewed 12 October 2006, <http://c2.com/cgi/wiki?WikiWikiOrigin>.
WardsWiki Community 2006a, 'C2:MediaWiki', last edited 16 February, viewed 12 October 2006, <http://c2.com/cgi/wiki?MediaWiki>.
WardsWiki Community 2006b, 'Wiki Engines', last edited 9 October, viewed 12 October 2006, <http://c2.com/cgi/wiki?WikiEngines>.
Wikia Inc. 2006, 'Wikia Statistics - Words', last edited 15 September, viewed 16 September 2006, <http://wikia.com/wikistats/EN/TablesDatabaseWords.htm>.
Wikimedia Foundation 2005, 'Wikipedia tightens editorial control', last edited 6 December, viewed 24 March 2006, <http://wikimediafoundation.org/wiki/Press_releases/Wikipedia_tightens_editorial_control>.
Wikimedia Foundation 2006a, 'Statistics - Wikipedia, the free encyclopedia', last edited 20 September, viewed 20 September 2006, <http://en.wikipedia.org/wiki/Special:Statistics>.
Wikimedia Foundation 2006b, 'Wikimedia Foundation Home Page', last edited 24 August, viewed 12 October 2006, <http://wikimediafoundation.org/wiki/Home>.
Wikipedia Community 2005, 'Wikipedia: Block all anonymous edits', last edited 7 December, viewed 24 March 2006, <http://en.wikipedia.org/w/index.php?title=Wikipedia:Block_all_anonymous_edits&oldid=30387810>.
Wikipedia Community 2006a, 'List of wiki software', last edited 11 October, viewed 12 October 2006, <http://en.wikipedia.org/wiki/List_of_wiki_software>.
Wikipedia Community 2006b, 'Wikipedia - User page', last edited 12 October, viewed 12 October 2006, <http://en.wikipedia.org/wiki/Wikipedia:User_page>.
Wikipedia Community 2006c, 'Wikipedia:Sock puppetry', last edited 26 March, viewed 26 March 2006, <http://en.wikipedia.org/w/index.php?title=Wikipedia:Sock_puppetry&oldid=45515363>.
Wikipedia Community 2006d, 'WikiProject Countering systemic bias', last edited 11 October, viewed 12 October 2006, <http://en.wikipedia.org/wiki/Wikipedia:WikiProject_Countering_systemic_bias>.
Wikipedia Contributors 2006a, 'Be bold in updating pages', viewed 24 March 2006, <http://en.wikipedia.org/w/index.php?title=Wikipedia:Be_bold_in_updating_pages&oldid=45361150>.
Wikipedia Contributors 2006b, 'Criticism of Wikipedia', viewed 6 September 2006, <http://en.wikipedia.org/wiki/Criticism_of_Wikipedia>.
Wikipedia Contributors 2006c, 'Replies to common objections', viewed 6 September 2006, <http://en.wikipedia.org/wiki/Wikipedia:Replies_to_common_objections>.
Wikipedia Contributors 2006d, 'Template:Policy', last edited 25 July, viewed 20 September 2006, <http://en.wikipedia.org/wiki/Template:Policy>.
Wikipedia Contributors 2006e, 'Wiki', last edited 31 July, viewed 31 July 2006, <http://en.wikipedia.org/w/index.php?title=Wiki&oldid=66792603>.
Wikipedia Contributors 2006f, 'Wikipedia:Administrators', last edited 18 September, viewed 20 September 2006, <http://en.wikipedia.org/wiki/Wikipedia:Administrators>.
Wikipedia Contributors 2006g, 'Wikipedia:Don't bite the newcomers', last edited 7 November, viewed 9 November 2006, <http://en.wikipedia.org/w/index.php?title=Wikipedia:Please_do_not_bite_the_newcomers&oldid=86176581>.
Wikipedia Contributors 2006h, 'Wikipedia:Don't disrupt Wikipedia to illustrate a point', last edited 8 September, viewed 20 September 2006, <http://en.wikipedia.org/wiki/WP:POINT>.
Wikipedia Contributors 2006i, 'Wikipedia:Five pillars', last edited 20 September, viewed 20 September 2006, <http://en.wikipedia.org/wiki/Wikipedia:Five_pillars>.
Wikipedia Contributors 2006j, 'Wikipedia:List of policies', last edited 16 September, viewed 20 September 2006, <http://en.wikipedia.org/wiki/Wikipedia:List_of_policies>.
Wikipedia Contributors 2006k, 'Wikipedia:Neutral point of view', last edited 20 September, viewed 20 September 2006, <http://en.wikipedia.org/wiki/Wikipedia:Neutral_point_of_view>.
Wikipedia Contributors 2006l, 'Wikipedia:Policies and guidelines', last edited 20 September, viewed 20 September 2006, <http://en.wikipedia.org/wiki/Wikipedia:Policies_and_guidelines>.
Wikipedia Contributors 2006m, 'Wikipedia:What Wikipedia is not', last edited 19 September, viewed 20 September 2006, <http://en.wikipedia.org/wiki/WP:WIN#Wikipedia_is_not_a_soapbox>.
Appendices
Appendix 1. RatingExtension
Details
- Version 0.3.1
- Date 8 Aug 2006
- Tested on MediaWiki 1.7.1, PHP 5.0.5 (apache2handler)
- Tested on MediaWiki 1.6.6, PHP 5.0.5 (apache2handler)
Features
- Allows 5 star rating for each page in a wiki
- When a page revision has too few ratings, ratings from previous revisions are used to make up the minimum
- When a page has too few ratings, the rating display changes colour to prompt readers to rate it
- NoFollow support
- Tags to allow insertion of ratings into a page, or a list of "top 5" ratings etc.
- Rating cache, to speed up <ratings> usage in large wikis.
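The fallback to previous ratings can be illustrated in isolation. The following is a minimal sketch rather than the extension's own code: it assumes the votes for each revision are already held in memory, newest revision first, and the function name fallbackRating and its parameters are hypothetical. The default threshold of 3 mirrors $countLimit in the source listing.

```php
<?php
// Hypothetical helper (not part of the extension).
// $revisions: list of vote lists, newest revision first, e.g. [[5], [4, 4, 2]].
// Returns the average rating, or null when fewer than $minCount votes exist
// across all revisions.
function fallbackRating(array $revisions, $minCount = 3)
{
  $count = 0;   // votes gathered so far
  $sum = 0.0;   // weighted sum of the gathered votes
  foreach ($revisions as $index => $votes) {
    $n = count($votes);
    $avg = $n > 0 ? array_sum($votes) / $n : 0.0;
    // Enough votes on the current revision: use its plain average.
    if ($index === 0 && $n >= $minCount) {
      return $avg;
    }
    if ($count + $n < $minCount) {
      // Still short: take every vote of this revision.
      $count += $n;
      $sum += $n * $avg;
    } else {
      // This revision fills the gap: borrow only as many votes as are
      // missing, each valued at the revision's average.
      $sum += ($minCount - $count) * $avg;
      return $sum / $minCount;
    }
  }
  return null;
}

printf("%.2f\n", fallbackRating(array(array(5), array(4, 4, 2))));
```

In this example the current revision has a single vote of 5, so two votes' worth of the previous revision's average are borrowed, giving a rating of about 3.89.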
Usage
Enabling the plugin will display star ratings above each page. 5 stars are displayed, which may be clicked by a user to indicate their opinion of the page. Stars are arranged left to right, with the first star having a value of 1, and the last a value of 5. Each user (identified by login, or by IP address) may have one vote on any revision of a page, and may change their vote at any time.
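The one-vote-per-user rule follows from how votes are stored: the source listing writes each vote with REPLACE INTO against a table whose primary key is the pair of revision id and user. A minimal sketch of the same idea, using a plain array in place of the database; the function name recordVote and the key format are assumptions made for illustration only.

```php
<?php
// Hypothetical in-memory stand-in for the `ratings` table: one entry per
// (revision id, user) pair, so a repeat vote replaces the earlier one.
function recordVote(array $votes, $oldid, $user, $rating)
{
  $key = (int)$oldid . '|' . $user;  // composite key: revision + user
  $votes[$key] = (int)$rating;
  return $votes;
}

$votes = array();
$votes = recordVote($votes, 42, '192.0.2.7', 3);  // anonymous user, keyed by IP
$votes = recordVote($votes, 42, '192.0.2.7', 5);  // same user changes the vote
$votes = recordVote($votes, 42, '17', 4);         // logged-in user, keyed by id
echo count($votes), "\n";  // two distinct voters on revision 42
```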
The extension also enables the use of three new wiki tags:
- <rating>
- <ratingcount>
- <ratings>
The rating and rating count tags display a numeric value, indicating the page rating, or the number of ratings for a selected page. The page is indicated by placing the name of the page between an opening and closing tag.
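Note that the displayed value is not the raw average vote. In the source listing, formatRating rescales the average from the 1 to 5 range of the stars onto a 0 to 5 display range, and getRatingHTML then fills each star in quarter steps. A small sketch of both steps, assuming a numeric average is already available; the function names displayValue and starQuarters are hypothetical.

```php
<?php
// Rescale an average vote (1..5) to the displayed 0..5 value.
// An average of 0 (no valid rating) is shown as 0.
function displayValue($averageVote)
{
  return $averageVote > 0 ? ($averageVote - 1) * 1.25 : 0;
}

// Number of filled quarters (0..4) for star number $star (1..5),
// given a displayed value on the 0..5 scale.
function starQuarters($displayed, $star)
{
  $x = $star - 1;  // lower bound of this star on the 0..5 scale
  if ($displayed >= $x + 1) return 4;
  if ($displayed >= $x + 0.75) return 3;
  if ($displayed >= $x + 0.5) return 2;
  if ($displayed >= $x + 0.25) return 1;
  return 0;
}

$shown = displayValue(3);  // an average vote of 3 is displayed as 2.5
echo starQuarters($shown, 2), starQuarters($shown, 3), starQuarters($shown, 4), "\n";
```

With an average vote of 3, the second star is full, the third is half filled and the fourth is empty.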
The ratings tag displays a list of pages, and their ratings or rating counts. Each parameter is placed on its own line between the opening and closing tags, in the form parameter=value. The ratings tag supports the following parameters:
- namespace - limits the namespaces of displayed pages (numeric value). May be included multiple times to select multiple namespaces. Defaults to all namespaces.
- sortby (rating, name or count) - selects the column to sort the list on. Defaults to rating.
- sortby (ascending or descending) - sorts in ascending or descending order. Defaults to descending for rating and count, and ascending for name.
- displaycount (true or false) - shows the number of ratings for each page. Defaults to false.
- displayrating (true or false) - shows the rating for each page. Defaults to true.
- pattern - the text to be returned for each result. Supports five placeholders: $1 => page_name (title with underscores), $2 => page name (title with spaces), $3 => rating, $4 => count, $5 => page revision id. Overrides displaycount and displayrating.
- limit - limits the maximum number of results. Defaults to 10. 0 = unlimited.
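Because a parameter such as namespace or sortby may appear more than once, each parameter name maps to a list of values. The sketch below shows one way to parse the tag body on that basis. It is a standalone illustration using explode(), not the extension's splitParameters function, and the name parseTagParameters is hypothetical.

```php
<?php
// Hypothetical standalone parser for the body of a <ratings> tag.
// Each line has the form name=value; repeated names collect into a list.
function parseTagParameters($input)
{
  $parameters = array();
  foreach (explode("\n", $input) as $line) {
    $pair = explode('=', $line, 2);   // split on the first '=' only
    if (count($pair) != 2) {
      continue;                       // skip lines without a value
    }
    $parameters[trim($pair[0])][] = trim($pair[1]);
  }
  return $parameters;
}

$body = implode("\n", array(
  'namespace=0',
  'namespace=1',
  'sortby=rating',
  'sortby=descending',
  'pattern=[[$1|$2]] with a rating of $3',
  'limit=1',
));
print_r(parseTagParameters($body));
```

Splitting on the first equals sign only means a pattern value may itself contain further equals signs without being truncated.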
Examples
<rating>Main Page</rating>
displays
4.52
<ratingcount>Main Page</ratingcount>
displays
23
Most popular page: <ratings>
namespace=0
namespace=1
namespace=2
namespace=3
sortby=rating
sortby=descending
pattern=[[$1|$2]] with a rating of $3
limit=1
</ratings>
displays
Most popular page: Main Page with a rating of 4.52
Installation
Media
Download Rating.tar.gz and extract to the 'extensions/' directory. (Extensions directory should now contain a 'rating' subdirectory containing several png files)
Database
Run the following SQL code on your database to create the ratings tables. Ensure you have selected the correct database using the USE command.
CREATE TABLE IF NOT EXISTS `ratings` (
  `page_oldid` int unsigned NOT NULL,
  `user_id` varchar(15) NOT NULL,
  `page_rating` int unsigned NOT NULL,
  PRIMARY KEY (`page_oldid`, `user_id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 COLLATE=latin1_general_ci;
CREATE TABLE IF NOT EXISTS `ratings_cache` (
  `page_oldid` int unsigned NOT NULL,
  `page_rating` float unsigned NOT NULL,
  `page_rating_count` int unsigned NOT NULL,
  PRIMARY KEY (`page_oldid`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 COLLATE=latin1_general_ci;
Extension
Copy and paste the source code into a file named 'rating.php', and place it in the 'extensions/' directory of your MediaWiki installation.
Insert the following line at the end of 'LocalSettings.php' (before the '?>')
require("extensions/rating.php");
Verification
To check that the extension is installed properly, visit your wiki's version page, e.g. Special:Version.
You should see the following items:
- Extensions:
  - Parser hooks:
    - Rating Extension (version 0.3.1), Adds rating mechanism, by Trevor Peacock
- Extension functions:
  - RatingExtension
- Parser extension tags:
  - <rating>, <ratingcount> and <ratings>
- Hooks:
  - OutputPageBeforeHTML: InsertRating
  - SkinTemplateSetupPageCss: RatingCss
Releases
Todo
- Add ratings into search output
RoadMap
0.4 No Date
- Add ratings into search output
History
0.3.1 - Date 8 Aug 2006
- Ratings list shows link rather than image for Image namespace articles
0.3 - Date 2 Aug 2006
- Add rating cache, to speed up <ratings> usage in large wikis.
0.2 - Date 31 July 2006
- Fix sql injection possibilities
- Add tags to allow insertion of ratings into a page, or a list of "top 5" ratings etc.
- NoFollow support
- rating tag
- ratingcount tag
- ratings tag
0.1 - Date 30 July 2006
- Allows 5 star rating for each page in a wiki
- When a page revision has too few ratings, ratings from previous revisions are used to make up the minimum
- When a page has too few ratings, the rating display changes colour to prompt readers to rate it
Source Code
<?php
################################################################################
# Setup #
################################################################################
# This section defines parameters of the plugin #
################################################################################
#-------------------------------------------------------------------------------
# Sets the maximum number of ratings that can be returned in a ratings tag #
# This helps keep performance for large wikis acceptable #
#-------------------------------------------------------------------------------
$renderLimit=100;
#-------------------------------------------------------------------------------
# Sets the minimum number of ratings required for the rating to be valid #
# If the number of ratings are less, rating is 0 and bolded #
#-------------------------------------------------------------------------------
$countLimit=3;
#-------------------------------------------------------------------------------
# Defines namespaces for which ratings are enabled #
#-------------------------------------------------------------------------------
$allowedNamespaces=array(
  0=>true, //Default
  1=>true, //Talk
  2=>true, //User
  3=>true, //User_talk
  4=>true, //Project
  5=>true, //Project_talk
  6=>true, //Image
  7=>true, //Image_talk
#  8=>true, //MediaWiki
#  9=>true, //MediaWiki_talk
  10=>true, //Template
  11=>true, //Template_talk
  12=>true, //Help
  13=>true, //Help_talk
  14=>true, //Category
  15=>true //Category_talk
);
################################################################################
# Core Code #
################################################################################
# This section handles the core functions of this extension #
# * Serving Files #
# * Recording Ratings #
# * Displaying Ratings #
################################################################################
#-------------------------------------------------------------------------------
# If html request requests file, serve file #
# Files are requested by adding the ?file= GET parameter #
#-------------------------------------------------------------------------------
if(isset($_GET['file']))
  doFile();
#-------------------------------------------------------------------------------
# If html request contains rate information, process it #
# Rating information indicated by ?rate= GET parameter #
# #
# MediaWiki engine is started by imitating the code in index.php #
# This ensures database functions are working #
#-------------------------------------------------------------------------------
if(isset($_GET['rate']))
{
  #Init
  #----------------
  require_once( 'includes/Setup.php' );
  require_once( "includes/Wiki.php" );
  $mediaWiki = new MediaWiki();
  $action = $wgRequest->getVal( 'action', 'view' );
  $title = $wgRequest->getVal( 'title' );
  $wgTitle = $mediaWiki->checkInitialQueries($title, $action, $wgOut,
    $wgRequest, $wgContLang);
  #Record Rating
  #----------------
  $dbr =& wfGetDB( DB_WRITE );
  $sql = "REPLACE INTO `ratings` (`page_oldid`, `user_id`, `page_rating`) ".
    "VALUES (".intval($_GET['oldid']).", '".
    ($wgUser->getID()?$wgUser->getID():$wgUser->getName()).
    "', ".intval($_GET['rate']).")";
  $res=wfQuery($sql, DB_WRITE, "");
  calculateRating($_GET['oldid']);
  #Return to referring page and exit
  #----------------
  $wgTitle->invalidateCache();
  $dbr->immediateCommit();
  header( 'Location: '.$_SERVER["HTTP_REFERER"] ) ;
  die();
}
#-------------------------------------------------------------------------------
# Initialize extension #
# Sets up credit information, and hooks #
#-------------------------------------------------------------------------------
if(isset($wgScriptPath))
{
  $wgExtensionCredits["parserhook"][]=array(
    'name' => 'Rating Extension',
    'version' => '0.3.1',
    'url' => 'http://wiki.peacocktech.com/wiki/RatingExtension',
    'author' => '[http://about.peacocktech.com/trevorp/ Trevor Peacock]',
    'description' => 'Adds rating mechanism' );
  $wgHooks['OutputPageBeforeHTML'][] = 'InsertRating';
  $wgHooks['SkinTemplateSetupPageCss'][] = 'RatingCss';
  $wgExtensionFunctions[] = "RatingExtension";
}
function RatingExtension()
{
  global $wgParser;
  $wgParser->setHook("rating", "renderRating");
  $wgParser->setHook("ratingcount", "renderRatingCount");
  $wgParser->setHook("ratings", "renderRatings");
}
#-------------------------------------------------------------------------------
# Render the <rating> tag #
# Shows the rating of the page specified by the text between the tags #
#-------------------------------------------------------------------------------
function renderRating($input, $argv, &$parser)
{
  $rating=getRatingCache(getOldIDFromTitle(trim($input)));
  return number_format($rating['rating'], 2);
}
#-------------------------------------------------------------------------------
# Render the <ratingcount> tag #
# Shows the number of ratings of the page specified by the text between #
# the tags #
#-------------------------------------------------------------------------------
function renderRatingCount($input, $argv, &$parser)
{
  $rating=getRatingCache(getOldIDFromTitle(trim($input)));
  #a count is a whole number, so format it without decimal places
  return number_format($rating['count'], 0);
}
#-------------------------------------------------------------------------------
# Displays a list of ratings based on the parameters given between the tags #
#-------------------------------------------------------------------------------
function renderRatings($input, $argv, &$parser)
{
  $parameters=splitParameters($input);
  #sort column and order come from the sortby parameter(s)
  $sortdata=getSortData(isset($parameters['sortby'])?
    $parameters['sortby']:array());
  $ratings=getRatingList(isset($parameters['namespace'])?
    $parameters['namespace']:array(), $sortdata['column'], $sortdata['order'],
    isset($parameters['limit'])?$parameters['limit'][0]:10);
  $output='';
  $displayCount=(isset($parameters['displaycount'])?
    $parameters['displaycount'][0]:'false')!='false';
  $displayRating=(isset($parameters['displayrating'])?
    $parameters['displayrating'][0]:'true')!='false';
  $pattern=isset($parameters['pattern'])?$parameters['pattern'][0]:
    '[[$1|$2]]'.($displayRating?' ($3)':'').($displayCount?' ($4 ratings)':'');
  $limit=isset($parameters['limit'])?$parameters['limit'][0]:10;
  foreach($ratings as $rating)
  {
    #a leading colon makes Image namespace entries render as links
    $output.=str_replace(array('$1', '$2', '$3', '$4', '$5'),
      array(($rating['namespace']==6?':':'').$rating['title'],
        strtr($rating['title'], '_', ' '),
        number_format($rating['rating'], 2),
        $rating['count'], $rating['oldid']), $pattern)."\n\n";
    if(--$limit==0) break;
  }
  return renderWikiText(trim($output), $parser);
}
#-------------------------------------------------------------------------------
# Insert rating to top of page #
#-------------------------------------------------------------------------------
function InsertRating($parserOutput, &$text) {
  global $wgArticle, $allowedNamespaces;
  #skip namespaces for which ratings are not enabled
  if(empty($allowedNamespaces[$wgArticle->getTitle()->getNamespace()]))
    return true;
  $oldid=getOldID($wgArticle);
  if($oldid)
    $text='<div id="ratingsection">'.
      getRatingHTML(getRatingCache($oldid), $oldid).'</div>'.$text;
  #returning true lets any other hooks run
  return true;
}
#-------------------------------------------------------------------------------
# Add CSS into skin for rating #
#-------------------------------------------------------------------------------
function RatingCss(&$css) {
  global $wgScriptPath;
  $css = "/*<![CDATA[*/".
    " @import \"$wgScriptPath/?file=rating.css\"; ".
    "/*]]>*/";
  return true;
}
################################################################################
# Supporting Functions #
################################################################################
# These functions support the core functions of the extension #
################################################################################
#-------------------------------------------------------------------------------
# Processes sortby parameters to determine how to sort ratings #
#-------------------------------------------------------------------------------
function getSortData($sort)
{
  $column='rating';
  $order=-1;
  foreach($sort as $item)
  {
    switch($item)
    {
      case 'rating':
        $column='rating'; break;
      case 'name':
        $column='title'; break;
      case 'count':
        $column='count'; break;
      case 'ascending':
        $order=SORT_ASC; break;
      case 'descending':
        $order=SORT_DESC; break;
    }
  }
  #no explicit order given: pick the default for the chosen column
  if($order==-1)
  {
    switch($column)
    {
      case 'rating':
        $order=SORT_DESC; break;
      case 'title':
        $order=SORT_ASC; break;
      case 'count':
        $order=SORT_DESC; break;
      default:
        $order=SORT_DESC;
    }
  }
  return array('column'=>$column, 'order'=>$order);
}
#-------------------------------------------------------------------------------
# Processes the given text using the mediawiki parser engine #
#-------------------------------------------------------------------------------
function renderWikiText($input, &$parser)
{
  return $parser->parse($input, $parser->mTitle, $parser->mOptions,
    false, false)->getText();
}
#-------------------------------------------------------------------------------
# Returns the revision id (oldid) of the page with the given string title #
#-------------------------------------------------------------------------------
function getOldIDFromTitle($title)
{
  $title=Title::newFromText($title);
  if($title==null)
  {
    echo "NoArticle";
    return 0;
  }
  $oldid=getOldID(new Article($title));
  if(!$oldid)
  {
    #no revision found for this title
    return 0;
  }
  return $oldid;
}
#-------------------------------------------------------------------------------
# Completes ratings_cache table for any revisions without ratings #
#-------------------------------------------------------------------------------
function doFillCache()
{
  global $allowedNamespaces;
  $namespacestring='';
  foreach(array_keys($allowedNamespaces) as $space)
  {
    if(is_numeric($space))
    {
      $namespacestring.=', '.$space;
    }
  }
  $namespacestring=substr($namespacestring, 2);
  $sql="SELECT `page_latest` AS `oldid` FROM `page` WHERE".
    ' `page_namespace` IN ('.$namespacestring.') AND'.' `page_latest` NOT IN '.
    '(SELECT `page_oldid` FROM `ratings_cache`);';
  $articles=runQuery2($sql);
  foreach($articles as $article)
    getRating($article->oldid);
}
#-------------------------------------------------------------------------------
# Returns a list of pages in the specified namespaces #
# Optionally it may return sorting rating information for results #
#-------------------------------------------------------------------------------
function getPageList($namespace=array(), $ratings=false, $sort='title',
  $order=SORT_ASC, $limit=100)
{
  global $renderLimit, $allowedNamespaces;
  #limit comes from user-supplied tag parameters; force it to an integer
  #before it is placed in the SQL
  $limit=intval($limit);
  $limit=($limit>$renderLimit?$renderLimit:$limit);
  $namespacestring='';
  $filterByNamespace=false;
  foreach($namespace as $space)
  {
    if(is_numeric($space) && isset($allowedNamespaces[$space]))
    {
      $namespacestring.=', '.$space;
      $filterByNamespace=true;
    }
  }
  if($filterByNamespace)
    $namespacestring=substr($namespacestring, 2);
  $sql="SELECT `page_title` AS `title`, `page_namespace` AS `namespace`,
    `page_latest` AS `oldid` FROM `page`".($filterByNamespace?
    ' WHERE `page_namespace` IN ('.$namespacestring.')':"").';';
  if($ratings)
    $sql="SELECT `page_title` AS `title`, `page_namespace` AS `namespace`, ".
      "`page_latest` AS `oldid`, `page_oldid`, `page_rating` AS `rating`, ".
      "`page_rating_count` AS `count` FROM `page`, `ratings_cache` ".
      "WHERE `page_latest`=`page_oldid`".($filterByNamespace?
      ' AND `page_namespace` IN ('.$namespacestring.')':"").
      ' ORDER BY '.$sort.($order==SORT_ASC?'':'').
      ($order==SORT_DESC?' DESC':'').' LIMIT '.$limit.';';
  return runQuery2($sql);
}
#-------------------------------------------------------------------------------
# Returns a list of pages and their ratings for all pages in the specified #
# namespaces #
#-------------------------------------------------------------------------------
function getRatingList($namespace=array(), $sort='title', $order=SORT_ASC,
  $limit=100)
{
  global $wgNamespaceNamesEn, $renderLimit;
  doFillCache();
  $limit=($limit>$renderLimit?$renderLimit:$limit);
  $articles=getPageList($namespace, true, $sort, $order, $limit);
  $ratings=array();
  foreach($articles as $article)
  {
    $rating=formatRating(array('count'=>$article->count,
      'rating'=>$article->rating));
    $ratings[]=array('title' => $wgNamespaceNamesEn[$article->namespace].
      ($article->namespace==0?'':':').$article->title,
      'rating' => $rating['rating'], 'count' => $rating['count'],
      'oldid'=>$article->oldid, 'title_name'=>$article->title,
      'namespace_name'=>$wgNamespaceNamesEn[$article->namespace],
      'namespace'=>$article->namespace);
  }
  return $ratings;
}
#-------------------------------------------------------------------------------
# Splits parameters from the wikitext. #
# Each parameter should be on its own line in the format parameter = value #
#-------------------------------------------------------------------------------
function splitParameters($input)
{
  $parameters=array();
  #explode() is used in place of split(), which was deprecated in PHP 5.3
  #and removed in PHP 7; no regular expression is needed here
  foreach(explode("\n", $input) as $parameter)
  {
    $parameter=explode('=', $parameter, 2);
    if(count($parameter)==2)
    {
      foreach($parameter as $key => $val)
        $parameter[$key]=trim($val);
      if(isset($parameters[$parameter[0]]))
        $parameters[$parameter[0]][]=$parameter[1];
      else
        $parameters[$parameter[0]]=array($parameter[1]);
    }
  }
  return $parameters;
}
#-------------------------------------------------------------------------------
# Use the mediawiki engine to run the given sql code and return an object #
# containing the first result of the query #
#-------------------------------------------------------------------------------
function runQuery($sql)
{
  $dbr =& wfGetDB( DB_SLAVE );
  $res=wfQuery($sql, DB_SLAVE, "");
  if(wfNumRows($res)>0)
    return $dbr->fetchObject( $res );
  else
    return null;
}
#As runQuery, but returns an array containing every row of the result
function runQuery2($sql)
{
  $dbr =& wfGetDB( DB_SLAVE );
  $res=wfQuery($sql, DB_SLAVE, "");
  $array=array();
  while($item=$dbr->fetchObject( $res ))
    $array[]=$item;
  return $array;
}
#-------------------------------------------------------------------------------
# Return the oldid of the current page #
# If oldid=0 (most current revision) take the latest oldid from the database #
# for the current article #
#-------------------------------------------------------------------------------
function getOldID($article)
{
  $oldid=$article->getOldIDFromRequest();
  if($oldid!=0)
    return $oldid;
  $dbr =& wfGetDB( DB_SLAVE );
  $sql="SELECT `page_latest` AS `oldid` FROM `page` ".
    "WHERE `page_id`=".$article->getID().";";
  $res=wfQuery($sql, DB_SLAVE, "");
  $row=$dbr->fetchObject( $res );
  if($row->oldid)
    return $row->oldid;
  return null;
}
#-------------------------------------------------------------------------------
# Performs a SQL query to fetch a rating for page oldid #
#-------------------------------------------------------------------------------
function getRatingData($oldid)
{
  $sql="SELECT COUNT(*) AS `count`, AVG(`page_rating`) AS `rating` ".
    "FROM ratings WHERE `page_oldid`=".intval($oldid)." GROUP BY `page_oldid`;";
  return runQuery($sql);
}
#-------------------------------------------------------------------------------
# Returns the ID for the revision before $revision #
#-------------------------------------------------------------------------------
function getPreviousRevisionID( $revision ) {
  $dbr =& wfGetDB( DB_SLAVE );
  return $dbr->selectField( 'revision', 'rev_id',
    'rev_page=(SELECT `rev_page` from `revision` WHERE `rev_id`='.
    intval( $revision ).')'.' AND rev_id<' . intval( $revision ) .
    ' ORDER BY rev_id DESC' );
}
#-------------------------------------------------------------------------------
# Fetch and calculate a rating for page oldid #
# If there are not enough ratings for the current revision, cycle through #
# older revisions to gather a minimum number of ratings #
# Updates cache table with calculated values #
#-------------------------------------------------------------------------------
function calculateRating($oldid)
{
  global $wgTitle, $countLimit;
  $origid=$oldid;
  $ratingdata=getRatingData($oldid);
  $finalrating='?';
  $currentcount=number_format($ratingdata->count, 0);
  #If there are not enough ratings for the current revision
  if($ratingdata->count<$countLimit)
  {
    $count=$ratingdata->count;
    $rating=$count*$ratingdata->rating;
    #cycle older revisions looking for more ratings
    while($oldid=getPreviousRevisionID($oldid))
    {
      $ratingdata=getRatingData($oldid);
      #If still not enough ratings
      if($count+$ratingdata->count<$countLimit)
      {
        $count+=$ratingdata->count;
        $rating+=$ratingdata->count*$ratingdata->rating;
      }
      else #found enough ratings
      {
        $rating+=($countLimit-$count)*$ratingdata->rating;
        $count=$countLimit;
        $finalrating=$rating/$count;
        $oldid=false;
      }
    }
  }
  else
    $finalrating=$ratingdata->rating;
  $dbr =& wfGetDB( DB_WRITE );
  $sql = "REPLACE INTO `ratings_cache` (`page_oldid`, `page_rating`, ".
    "`page_rating_count`) VALUES (".intval($origid).", ".
    (is_numeric($finalrating)?$finalrating:0).", ".$currentcount.")";
  $res=wfQuery($sql, DB_WRITE, "");
  return array('rating'=>$finalrating, 'count'=>$currentcount);
}
#-------------------------------------------------------------------------------
# Formats rating array for insertion to page rating section #
#-------------------------------------------------------------------------------
function formatRating($rating)
{
  $finalrating=$rating['rating'];
  $currentcount=$rating['count'];
  #format rating data
  #rescale the average vote from the 1..5 star range onto 0..5
  if(is_numeric($finalrating) && $finalrating>0)
    $finalrating=($finalrating-1)*1.25;
  $ratingarray=array('display'=>
    (is_numeric($finalrating)?number_format($finalrating, 2):$finalrating).
    " ($currentcount ratings)",
    'count'=>$currentcount,
    'rating'=>(is_numeric($finalrating)?$finalrating:0));
  return $ratingarray;
}
#-------------------------------------------------------------------------------
# Calculates rating from database for specified revision #
# Updates rating cache table #
#-------------------------------------------------------------------------------
function getRating($oldid)
{
  return formatRating(calculateRating($oldid));
}
#-------------------------------------------------------------------------------
# Fetches the ratings from cache table #
# Calculates rating if it is not found in cache #
#-------------------------------------------------------------------------------
function getRatingCache($oldid)
{
  $sql='SELECT * FROM `ratings_cache` WHERE `page_oldid`='.intval($oldid).';';
  $row=runQuery($sql);
  #test the fetched row itself, before it is converted to an array
  if($row==null)
    $rating=calculateRating($oldid);
  else
    $rating=array('rating'=>$row->page_rating,
      'count'=>$row->page_rating_count);
  return formatRating($rating);
}
#-------------------------------------------------------------------------------
# Given the rating array and the page oldid, generate HTML code to be #
# displayed #
#-------------------------------------------------------------------------------
function getRatingHTML($rating, $oldid)
{
  global $wgTitle, $wgScriptPath, $countLimit;
  $html='';
  #generate stars
  for($x=0;$x<=4;$x++)
  {
    $html.='<a href="'.
      $wgTitle->getFullURL('oldid='.$oldid.'&rate='.($x+1)).'" rel="nofollow">'.
      '<img src="?file=Star';
    #the 'b' suffix selects the highlighted star used when ratings are too few
    if($rating['rating']>=$x+1) #larger than current star : filled
      $html.='4'.($rating['count']<$countLimit?'b':'');
    elseif($rating['rating']>=$x+0.75) #3/4 current star : 3/4 filled
      $html.='3'.($rating['count']<$countLimit?'b':'');
    elseif($rating['rating']>=$x+0.5) #1/2 current star : 1/2 filled
      $html.='2'.($rating['count']<$countLimit?'b':'');
    elseif($rating['rating']>=$x+0.25) #1/4 current star : 1/4 filled
      $html.='1'.($rating['count']<$countLimit?'b':'');
    else #less than current star : empty
      $html.='0';
    $html.='.png" align="bottom" /></a>'."\n";
  }
  #add text rating
  $html.=' '.$rating['display'];
  $html=($rating['count']<$countLimit?'<b>'.$html.'</b>':$html);
  return $html;
}
#-------------------------------------------------------------------------------
# Return a file and exit. #
# File determined by ?file= GET parameter #
#-------------------------------------------------------------------------------
function doFile()
{
  switch ($_GET['file'])
  {
    #Star .png files
    case "Star0.png":
    case "Star1.png":
    case "Star1b.png":
    case "Star2.png":
    case "Star2b.png":
    case "Star3.png":
    case "Star3b.png":
    case "Star4.png":
    case "Star4b.png":
      header("Content-type: image/png");
      #readfile() writes the file to the output itself; echoing its return
      #value would append the byte count to the image data
      readfile('extensions/rating/'.$_GET['file']);
      die();
    #extension css styling
    case "rating.css":
      header("Content-type: text/css");
?>
#ratingsection {
  float: right;
  margin-top: -3.7em;
  padding: 3px;
}
#ratingsection b {
  color: red;
  padding: 4px;
}
<?php
      die();
  }
}
Appendix 2. Survey
- What is your age?
- What is your gender?
- What is your primary mode of study? (Internal/Distance)
- Which course are you studying?
- Rate your ability with computers. I feel confident ...
  - using a personal computer
  - starting a computer program
  - using word processing software to format a letter or essay
  - copying and moving files to removable media (floppy disc, CD, or USB thumb/flash drive)
  - deleting files when they're no longer needed
  - learning how to use new software
  - using help features in software
  - locating information on the Internet
  - checking email
  - sending email
  - using instant messaging
  - using an online forum
  - writing a blog
  - writing web pages
  - understanding HTML
  - installing software
  - troubleshooting software problems
  - troubleshooting hardware problems
  - using a scripting programming language
  - using a functional programming language
  - using an object-oriented programming language
  - writing computer applications
- Before completing this subject I ...
  - knew what a wiki was
  - had read a wiki without knowing at the time it was a wiki
  - had read a wiki knowing it was a wiki
  - had read from Wikipedia
  - had contributed to a wiki
- I now feel confident ...
  - adding text to a wiki
  - making links in a wiki
  - participating in discussions in a wiki
  - adding a signature to a wiki discussion
  - using headings in a wiki
  - using lists (numbered lists or unnumbered bullets) in a wiki
  - creating articles
  - understanding which is the correct name space to use
  - uploading files to a wiki
- A rating system was added to Kakapo wiki. Please rate the following statements:
  - Did you understand how to use the rating system?
  - Did you rate any articles?
- Please rate the following statements.
  - Do you think your ratings were accurate?
  - Do you think other people's ratings were accurate?
  - Did you find the ratings useful?
  - Do you think you were inclined to rate your friends' articles more favourably?
  - Visual quality of the article is important when rating
  - The factual quality of the article is important when rating
  - The writing style of the article is important when rating
- Do you feel there is another important factor when rating? (free response)
- Please rate the following statement.
  - I found the POD exercises easy
- Describe two things you found difficult
- Please rate the following statements.
  - I found understanding the game I chose easy
  - I felt comfortable submitting my words to the wiki
- Choose three words that describe how you felt placing text on the wiki (e.g. excited, enthusiastic, proud, confident, indifferent, nervous, unsure, confused, shy)
- Explain why you felt this way
- What general comments would you like to make?
Appendix 3. POD Exercises
POD Activity 1
Exercise 1
Visit http://ispg.csu.edu.au/kakapowiki/ and create an account. Read the help and learn how a wiki works and how to use it.
Experiment by writing your own user page.
N.B. It's best to be logged in when editing, so the marker can be sure that you actually posted the content that you want to be marked on.
Also, create an article for your POD group. Have each member add a short description of themselves, including your full name, a link to your user page, and whether you are a distance or internal student. Use this page to communicate with the world what your group is doing.
Exercise 2
As a group, research and choose a Multiplayer Online Game or Massively Multiplayer Online Game that you can all play. Sign up and start learning about the game. You can if you wish choose the same game as another POD group, or choose a different game. You may like to keep a journal or notes as you learn about the game. You may need them for Exercise 2. For your journal/notes, you may wish to use your user page in the wiki.
To find a game to play, try the following:
Alter the filters to find one that suits, e.g. choose "Browser" for Client type if you don't want to install any extra software, or choose a 2D or text-based game if you don't have a fast computer.
- http://www.free-games.com.au/Free_Online_Multiplayer_Games/
- http://en.wikipedia.org/wiki/List_of_MMORPGs
- http://en.wikipedia.org/wiki/Category:Browser-based_games
Exercise 3
Each person must write at least 500 words of content in the wiki.
Examples of content are:
- An article on the game you chose as a group
- An article on a theory or a principle used in online games, e.g.
- Synthetic economies
- Free-form games
- Characterisation / Alternate online identity
- Online Romance / Relationships
- Addiction
- Motivations of playing
- comradeship / support / help and friendship
- An article on an online game or online collaborative space you are interested in or have used
Keep the following points in mind:
- Creating your user page or POD page does not count. Your 500 words must be in the form of a standard article.
- Your 500 words may be in one article, or spread across multiple articles.
- More than one person can contribute to an article. You might like to write a single 2000 word article as a group.
- You must reference any information you use in the article.
Resources:
POD Activity 2
Choose a theory or principle exemplified in the game you played and write an article on it. Include a section detailing your experience, how you discovered/learned about it, and your views on it.
Again you may choose the same topic as another person, in which case you must still contribute 500 words, including your own personal view.
Articles should contain an expanded definition of the topic with references. This may be followed by several personal experiences.
Ensure the section detailing your experiences is clearly marked as a personal experience, and separate from the main article.









