Building and Using Your Own Corpus and Concordance

"The use of corpora and concordance is now an area of considerable interest"

Building Your Own Corpus

There are two types of corpus:

1) A corpus of a specific genre of text, e.g. academic articles, business letters or newspaper feature articles.
Building a specific corpus only requires finding the relevant files and downloading them. The size required for each corpus depends on how many examples are needed. Finding certain kinds of article can be easy with certain web sites; or
2) A general corpus, which includes texts from a wide variety of different genres.
Building a general corpus means looking to the world's largest corpus, the Internet. To use the Internet as a corpus, a search engine with wide coverage is needed, one which searches within pages as well as meta-tags.

Making Your Own Concordance

Having decided to use the Internet as a corpus, the next step is to use it to make a concordance. The word "concordance" has two different meanings, a word-count list and a list of examples of use, and both have their own applications in language learning.
To produce a word-count concordance from a corpus, it helps to use a concordancing program.

Using Your Corpus and Concordance

A word-count concordance can be used with a specific corpus, even though most concordancing programs are limited in the kinds of concordance they can produce from the corpus. The main pedagogical use of a word-count concordance is in course and materials preparation. It indicates which words need to be taught for a specific corpus and helps in finding representative texts to use as teaching tools.

An example-of-use concordance can be used in language learning by showing students the examples and asking them to induce the rules from each concordance. This encourages students to realize the benefit of inducing their own rules from language data. Using teacher-chosen examples of use as a teaching technique is a way for the teacher to steer students towards the learning point. However, most of the language data that students meet has not been vetted by teachers, so it is valuable for them to learn from many types of example and not only from unambiguous illustrations of a language point.

An investigation by the writers shows a valuable use of concordances built by the students themselves, without help from teachers or concordancing software, using only programs that they are familiar with. One possible application is students' use of self-selected concordances for self-correction in writing. Another is for teachers to raise awareness by asking questions about a word, so that students build their own concordance and notice patterns in the data for themselves.

Corpora and concordances therefore have a range of uses in language learning, from the standard use of teacher-created concordances in the classroom, through awareness-raising questions, to students' self-correction. In conclusion, using concordances and corpora helps both students and teachers to learn about language learning itself.

'The boss was the same old boss'

boss 2
the 2
old 1
same 1
was 1
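
The word-count list above can be reproduced with a few lines of Python. This is only a minimal sketch and not any particular concordancing program: the function name `word_count_concordance` is my own, and I am assuming that words are lowercased and sorted by count first and then alphabetically.

```python
import re
from collections import Counter


def word_count_concordance(text):
    """Return (word, count) pairs, most frequent first, ties in alphabetical order."""
    # Lowercase the text so that "The" and "the" are counted together.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


for word, count in word_count_concordance("The boss was the same old boss"):
    print(word, count)
```

For a real corpus the same function could be given the full text of a file instead of one sentence.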

To produce a word-count concordance from a corpus, a concordancing program is helpful, although some other programs can also create concordances.

2.06 PM, 28 March 2009

Using Concordance Programs in the Modern Foreign Languages Classroom

A "concordance", according to the Collins Cobuild English Dictionary, is:

“An alphabetical list of the words in a book or a set of books which also says where each word can be found and often how it is used.”

In "Using concordance programs in the modern foreign languages classroom", Graham Davies explains that concordances, including word counts, are used in creating glossaries and dictionaries and are an extremely useful tool for teachers in language learning.

A concordance is a list of occurrences of a word taken from a piece of authentic language, with the word displayed in the center of the page and shown with parts of the contexts in which it occurs.


Concordance 1 on the word "sin":

1. Thus from my lips, by yours, my  sin  is purged.
2.           Then have my lips the  sin  that they have took.
3.                                  Sin  from thy lips? O trespass sweetly urged!
4.                      Give me my  sin  again.



Text used as basis for the concordance, with the keyword in bold:

Ay, pilgrim, lips that they must use in prayer.
O, then, dear saint, let lips do what hands do;
They pray, grant thou, lest faith turn to despair.
Saints do not move, though grant for prayers’ sake.
Then move not, while my prayer’s effect I take.
Thus from my lips, by yours, my sin is purged.
Then have my lips the sin that they have took.
Sin from thy lips? O trespass sweetly urged!
Give me my sin again.

A computer-generated concordance

Now look at that same concordance, displayed with fuller context (here between 75 and 80 characters each side, including blank spaces; the keyword is shown in square brackets):

1. move not, while my prayer’s effect I take. Thus from my lips, by yours, my [sin] is purged. JULIET Then have my lips the sin that they have took. ROMEO

2. Thus from my lips, by yours, my sin is purged. JULIET Then have my lips the [sin] that they have took. ROMEO Sin from thy lips? O trespass sweetly urged!

3. is purged. JULIET Then have my lips the sin that they have took. ROMEO [Sin] from thy lips? O trespass sweetly urged! Give me my sin again.

4. they have took. ROMEO Sin from thy lips? O trespass sweetly urged! Give me my [sin] again.

The KWIC and the fuller context display are both useful, depending on what you want to do with the material.
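
A KWIC (keyword in context) display like the ones above can be sketched in a few lines of Python. This is only a rough illustration and not the program that produced the concordances in this post: the function name `kwic` and the `width` parameter (the number of characters of context kept on each side) are my own assumptions.

```python
import re


def kwic(text, keyword, width=30):
    """Return one line per occurrence of keyword, with the keyword centered."""
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(text):
        # Take up to `width` characters of context on each side of the keyword.
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Pad both sides so that the keyword always starts in the same column.
        lines.append(f"{left:>{width}} {match.group(0)} {right:<{width}}")
    return lines


passage = (
    "Thus from my lips, by yours, my sin is purged. "
    "Then have my lips the sin that they have took. "
    "Sin from thy lips? O trespass sweetly urged! "
    "Give me my sin again."
)

for line in kwic(passage, "sin"):
    print(line)
```

Making `width` larger, for example 75 or 80, gives the fuller context display, while a small value gives the short KWIC lines.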

So there you have the basic ingredients for any concordance: a text base and a procedure. Here the procedure was manual and it gave us an extremely limited concordance (only four citations), and the meanings of the word "sin" that appear in it are rooted in the poetic world of Romeo and Juliet. Below, in contrast, is a concordance on the same keyword, based this time on a 25-citation sample created by a concordancer from contemporary texts, including British and American books, ephemera, newspapers, magazines, radio transcripts and transcriptions of ordinary conversations.

Concordance 2 on the word "sin":

Uses of concordances for language teachers

· The teacher can use a concordance to find examples of authentic usage to demonstrate features of vocabulary, typical collocations, a point of grammar or even the structure of a text

· The teacher can generate exercises based on examples drawn from a variety of corpora, for example gap-filling exercises and tests.

· Students can work out rules of grammar or usage and lexical features for themselves by searching for key words in context. Depending on their level, they can be invited to question some of the rules, based on their observation of patterns in authentic language.

· Students can be more active in their vocabulary learning: depending on their level, they can be invited to discover new meanings, to observe habitual collocations, to relate words to syntax, or to be critical of dictionary entries.

· Students can be invited to reflect on language use in general, based on their own explorations of a corpus of data, thus turning themselves into budding researchers.
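
As a small illustration of the second point in the list above (generating gap-filling exercises), here is a minimal Python sketch. It is only my own example under simple assumptions: the function name `make_gap_fill` is hypothetical, the corpus is given as a list of sentences, and every occurrence of the keyword is replaced by a blank.

```python
import re


def make_gap_fill(sentences, keyword, gap="_____"):
    """Return the sentences that contain keyword, with the keyword blanked out."""
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    items = []
    for sentence in sentences:
        # Keep only sentences that actually contain the keyword.
        if pattern.search(sentence):
            items.append(pattern.sub(gap, sentence))
    return items


sentences = [
    "Thus from my lips, by yours, my sin is purged.",
    "O trespass sweetly urged!",
    "Give me my sin again.",
]

for number, item in enumerate(make_gap_fill(sentences, "sin"), start=1):
    print(f"{number}. {item}")
```

The teacher would still need to check the results by hand, because not every sentence that contains the keyword makes a good exercise item.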

Concordance software and corpora

Concordances for Windows

Concordance by R.J.C. Watt of Dundee University makes both a full concordance and a KWIC concordance (which Watt calls “Fast Concordance”). The “Fast Concordance” is really fast. The “Full Concordance” is, of course, a bit slower, and making a full concordance of a very large corpus will require a lot of computer power and patience. But a full concordance of Sir Walter Scott’s Ivanhoe (about 200,000 words) took about 5 minutes on a Pentium 166MHz machine with 64MB of RAM.


How big a corpus one needs also depends on what it is to be used for. Basically, the corpus must be big enough to contain enough occurrences of the language elements we want to study. For comparison: Cobuild uses a corpus of about 200 million words of written and spoken UK, US and NZ English in dictionary compilation. Birmingham University’s Bank of English corpus comprises about 500 million words, and is well suited for linguistic research. Letting our students loose on such vast masses of text is, in most cases, likely to create more confusion than clarity. Less will often do. But, of course, if confronted with a really ardent advocate of misguided ideas of what is correct usage and what is not, a failure to find examples of the misguided expressions in a corpus of 400-500 million words just might make an impression on him/her. Chris Tribble argues that a specialist micro corpus of about 25,000-30,000 words will be quite adequate for most educational purposes. On the other hand, see Tribble and Jones (1997:11): “We tend to think that a word like crime is a common word but it actually occurs only about 20 times in every one million words of a 'balanced' collection of texts such as the Longman-Lancaster corpus”. Later we’ll show examples of what can be done with a corpus of about 50,000 German words.
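
To see why corpus size matters, the figure quoted from Tribble and Jones (about 20 occurrences of "crime" per million words) can be scaled to the corpus sizes mentioned above. The short Python sketch below is only a rough calculation: the function name `expected_occurrences` is my own, and it assumes the word occurs at the same rate in every corpus, which is a simplification.

```python
def expected_occurrences(per_million, corpus_size):
    """Scale a frequency per million words to a corpus of corpus_size words."""
    return per_million * corpus_size / 1_000_000


CRIME_PER_MILLION = 20  # figure quoted from Tribble and Jones (1997:11)

# Corpus sizes taken from the paragraph above.
corpus_sizes = {
    "micro corpus of 30,000 words": 30_000,
    "corpus of 50,000 words": 50_000,
    "corpus of 200 million words": 200_000_000,
    "corpus of 500 million words": 500_000_000,
}

for name, size in corpus_sizes.items():
    count = expected_occurrences(CRIME_PER_MILLION, size)
    print(f"{name}: about {count:g} occurrences")
```

On these assumptions a 30,000-word micro corpus would be expected to contain the word less than once, which is why a small corpus works best when it is specialized for the words we want to study.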

Preparing for working with concordances for teachers and students

Teachers need to prepare themselves and create discussion topics for the class, so that there is interaction between the students and the teacher. Besides a learning task and a discussion topic, the text for the concordance also needs to be prepared. The discussion topics need to cover a critical mass of ideas, control of contextual information, scrupulous treatment of the original materials, and the degree of editorial control needed. Preparing the students to learn from concordances is also important. The points to cover include the obvious thing that you forgot to mention, independence from authority, the discussion topic and the hard work of learning from raw data, the dynamics and pacing of group work at the computer, and helping students to move on (transferable skills).

2.07 PM, 28 March 2009

EBSCOhost is one of the online databases that Universiti Kebangsaan Malaysia subscribes to in order to help students find articles or other information that they are looking for. Rather than spending hours searching the internet, it is more effective to search the online database, as long as there is an internet connection. Searching this site can be a bit difficult unless we search with a specific title, author, journal name or subject term. The subject that we search for in EBSCOhost also needs to be precise, to avoid getting the wrong information.

Lisa Net (CSA Illumina) is also an online database that helps users find the information they are looking for. Through Lisa Net, the user can search using three methods: Quick Search, Advanced Search or Search Tools. Finding information on this site is a bit tricky if the user does not know how to use it. To help users find information easily, the site provides search tips, a technology search area and a date range for the information that the user wants to find.

For the article summaries, I chose a topic that I am familiar with: "Computer-Assisted Language Learning (CALL)". From EBSCOhost and Lisa Net, I found two articles on CALL: "Intelligent computer assisted language learning as cognitive science: the choice of syntactic frameworks for language tutoring" by Matthews C and "A Software Development Approach for Computer Assisted Language Learning" by Steve Cushion. Both articles relate to CALL, one in terms of cognitive science and the other in terms of software development.

The first article I summarize is "A Software Development Approach for Computer Assisted Language Learning" by Steve Cushion. I found this article in the EBSCOhost online database.

Computer Assisted Language Learning is a field in which computers and language both play an important part, and it belongs to the area of computational linguistics. The article argues that the field is highly interdisciplinary and that different currents of psychology have influenced the history of CALL projects. The section on error analysis and description looks at the use of regular parsers in a CALL context, including grammar and spelling checkers, even though learner errors are not the same as native-speaker errors. The section on feedback discusses the problems that learners face, including human interaction, general learning and the general problem of language generation. The section on student modelling in parser-based CALL explores problems that also arise in Intelligent Tutoring Systems for areas other than a second language.

I found the second article in the Lisa Net online database. Its title is "Intelligent computer assisted language learning as cognitive science: the choice of syntactic frameworks for language tutoring" by Matthews C. However, the record that I found only shows an outline of the article.

Part of a special issue based on the workshop SCIAL'93 (Cognitive Science, Computer Science and Language Learning), held Oct 93 in Clermont-Ferrand, France, which considered the development of interactive language learning environments. The development of Intelligent Computer Assisted Language Learning (ICALL) has often tended to proceed in an ad hoc fashion. Questions relating to ICALL should be asked and answered within a principled framework. Considers which grammar framework might best form the basis of the syntactic component of an ICALL system. Compares 2 frameworks, Definite Clause Grammars and Principles and Parameters Theory, with respect to a number of criteria of adequacy. Although there are various reasons for preferring the latter framework, the article illustrates the types of questions that ICALL should be raising rather than providing any precise answers.

:: 3rd Posting :: Google Scholar :: Eric Digest ::

Hello.... The first site is a metasearch engine owned by Copernic Inc. that helps in finding information from directories and deep content sites. It formats the search words properly and helps in finding specific information about the things that we want to know. Moreover, it compiles the results in a virtual database, eliminates duplicates, and displays them in a uniform manner according to how well they match the information that we are looking for.

Google Scholar

Google Scholar helps in finding scholarly literature. Through this site we can find peer-reviewed papers, theses, books, abstracts and articles from academic publishers, professional societies, preprint repositories, universities and other scholarly organizations. Google Scholar's features are: search diverse sources from one convenient place; find papers, abstracts and citations; locate the complete paper through your library or on the web; and learn about key papers in any area of research.

Eric Digest

Eric Digest provides access to the ERIC Digests (education articles) produced by the former ERIC Clearinghouse system. The Digests are:

- short reports (1,000 - 1,500 words) on topics of prime current interest in education. There are a large variety of topics covered including teaching, learning, libraries, charter schools, special education, higher education, home schooling, and many more.

- targeted specifically for teachers, administrators, policymakers, and other practitioners, but generally useful to the broad educational community.

- designed to provide an overview of information on a given topic, plus references to items providing more detailed information.

- produced by the former 16 subject-specialized ERIC Clearinghouses, and reviewed by experts and content specialists in the field.

- funded by the Office of Educational Research and Improvement (OERI), of the U.S. Department of Education (ED).

- The full-text ERIC Digest database contains over 3000 Digests, with the latest updates being added to this site in July 2005.

Another of the sites helps in finding images, video sites, shopping sites, news and research databases. It also provides sections for news, music, movies, maps and business. This site works like the other search engines in helping to find information, but besides its search engine it also has its own pages that combine links to all the sites we want to visit.


The difference between all of them is that every search engine has its own way of helping users find what they want. The general search engines help the user find all sorts of information, from documents and pictures to news. Google Scholar and Eric Digest help the user find more specific information. Google Scholar helps in finding scholarly literature, whereas Eric Digest is a good site for the education field, with information that helps teachers teach in class.


The similarity between all of the search engines is that they help users find what they want and make it easy to locate certain information without needing to browse the internet for hours. Google Scholar and Eric Digest are similar in that both help in finding a particular kind of information: scholarly literature in one case and educational methods in the other.

:: 2nd Posting :: Blogging can assist language learners to improve and enhance their writing skill ::

To improve our writing skill, we need to practise writing. Blogging assists language learners because they have to think critically while writing on their blogs. Their thinking involves how to create an interesting topic to discuss in the blog, how to make it lively, how to make readers thrilled to read more of what the blog contains and, most important of all, the writing skill itself.

Even though writing seems out of date to certain teenagers, it brings a lot of advantages in terms of improving writing skills and building grammar.

Blogging appears to be helping teens become more productive writers. This is a promising finding that has important implications for schools. A survey recently conducted by the Pew Internet and American Life Project explored the links between the formal writing that teens do for school and the informal, electronic communication they exchange through email and text messaging.

The report also confirms my findings about the ambiguity in computer usage at home and at school. Educators seem perplexed when it comes to assigning blogging assignments because many students do not have computers at home. But computer ownership increases daily as prices go down and the need or want of these machines rises. Teens who use a computer at home for their non-school writing believe computers have a greater impact on the amount of writing they produce than on the overall quality of their writing.

Another blog post that I found, "Cultivate Your Writing Skills to Improve Your Blog", also discusses how blogging enhances writing skill.

Blogging is about communicating first and foremost, and to get started communicating you only need to get your idea across.

On the other hand, if you want to be remembered, in part you need to be distinctive. You need to be better than every other mediocre writer with good ideas. You need to find your own voice. And some of the great writing advice you can get will help you find yourself - cutting out the excess and leaving the worthy.

I think the best way of taking advantage of your blogging schedule to improve your writing skills is to write your post and then edit it. Writing on its own won’t improve your skills; it will only display them as they are. Improving your writing skills, I think, means improving your editing and rewriting skills. Once the ideas are in place, improving and rewriting your language to better communicate those ideas just takes practice.

Mental Illness

Reading for Information. For this 1st blog post, I want to talk about what mental illness is. I’m one of many people who love to explore different topics to increase my knowledge about things that are not usually part of my subject of study. When we read, it does not matter whether it is accidental reading or reading to find information; it depends on the person to find what they want on the internet.

People might think that mental illness is caused by a person’s mental state being unstable at the time, or by personal problems. However, mental illness can also have physical causes. From what I’ve learned in this article, mental illnesses are ‘medical conditions that disrupt a person’s thinking, feeling, mood, ability to relate to others, and daily functioning’. This shows that such medical conditions affect a person’s way of thinking, feeling and mood and, most importantly, their daily behaviour.

Mental illness can affect a person of any age, race, region or income. The depression is felt not only by people who have a difficult life but also by people who have everything, from a healthy body to a prosperous life. The types of mental illness that I’ve learned about from reading this article are major depression, schizophrenia, bipolar disorder, obsessive compulsive disorder (OCD), panic disorder, post traumatic stress disorder (PTSD), and borderline personality disorder. People might say that mental illness takes a long time to recover from, but as long as it can be cured, there’s hope, right?

The article also states some important facts about mental illness and how to recover from it. For example, mental illnesses usually strike individuals in the prime of their lives, often during adolescence and young adulthood. Moreover, all ages are susceptible, but the young and the old are especially vulnerable. This shows that the consequences of mental illness are bigger for the young and the old than for people of mature age.

The article also describes how mental illness can be treated, and I now know more about how to deal with a person who has a mental illness. The best approach is early identification and treatment, which is of vital importance, together with ensuring access to the treatment and recovery supports that are proven effective. Recovery from mental illness is accelerated, and the further harm related to the course of the illness is minimized, if we take this problem seriously.

Finally, the exercises that could be formed from this article basically involve classifying the meaning of mental illness, whom it affects, the types of mental illness, and facts about mental illness.

That's it for now... please enjoy the video about mental illness...