KWIC : Java Glossary

KWIC
KWIC (Keyword In Context). These massive printed indexes were popular in the days of fanfold paper, punch cards, batch programs and 12 cm (4.72 in) thick printed listings. A KWIC index presents an entire body of data as a concordance of keywords: the data items are sorted in alphabetical order by keyword, and an item containing more than one keyword generates a line for each keyword. The keywords align vertically down the middle of the page, with some context on either side showing how they were used. You might think of them as rudimentary precursors to search engines.

For example, suppose your raw data consisted of sentences like this:

Lions like to eat gazelles and wildebeest.
Crocodiles eat wildebeest once a year when the wildebeest cross the rivers.
Gazelles are popular icons for car and computer companies since they suggest swiftness.
White tigers live mostly in India.
Wild lions live mostly in Africa.
If you were interested in building a concordance (aka index) of the animals mentioned in the text, you could prepare a KWIC index showing every keyword of interest in every context in which it occurs:
Example KWIC Index
Concordance of Animals
Crocodiles eat wildebeest once a year when the wildebeest cross the rivers.

Lions like to eat gazelles and wildebeest.
Gazelles are popular icons for car and computer companies since they suggest swiftness.

Lions like to eat gazelles and wildebeest.
Wild lions live mostly in Africa.

White tigers live mostly in India.

Lions like to eat gazelles and wildebeest.
Crocodiles eat wildebeest once a year when the wildebeest cross the rivers.
Crocodiles eat wildebeest once a year when the wildebeest cross the rivers.
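
An index like the one above could be generated by a short program. The following is only a minimal sketch in Java, not the method any historical KWIC generator used. The class name KwicSketch, its Entry helper and the methods buildIndex and format are hypothetical names invented for this illustration. It assumes keywords are single alphabetic words matched without regard to case, and it lines the keywords up in one column by padding the left context with spaces.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration class; none of these names come from an existing library.
class KwicSketch {
    /** One index line: the text left of the keyword, the keyword as written, the text right of it. */
    static final class Entry {
        final String left;
        final String keyword;
        final String right;

        Entry(String left, String keyword, String right) {
            this.left = left;
            this.keyword = keyword;
            this.right = right;
        }
    }

    /** Creates one Entry per occurrence of a keyword; keywords must be given in lower case. */
    static List<Entry> buildIndex(List<String> sentences, Set<String> keywords) {
        List<Entry> entries = new ArrayList<>();
        Pattern word = Pattern.compile("[A-Za-z]+"); // assumption: words are runs of ASCII letters
        for (String sentence : sentences) {
            Matcher m = word.matcher(sentence);
            while (m.find()) {
                if (keywords.contains(m.group().toLowerCase(Locale.ROOT))) {
                    entries.add(new Entry(sentence.substring(0, m.start()),
                            m.group(), sentence.substring(m.end())));
                }
            }
        }
        // Alphabetical by keyword, ignoring case. List.sort is stable, so equal keywords keep input order.
        entries.sort(Comparator.comparing((Entry e) -> e.keyword.toLowerCase(Locale.ROOT)));
        return entries;
    }

    /** Pads each left context with spaces so that every keyword starts in the same column. */
    static List<String> format(List<Entry> entries) {
        int width = 0;
        for (Entry e : entries) {
            width = Math.max(width, e.left.length());
        }
        List<String> lines = new ArrayList<>();
        for (Entry e : entries) {
            StringBuilder sb = new StringBuilder();
            for (int i = e.left.length(); i < width; i++) {
                sb.append(' ');
            }
            lines.add(sb.append(e.left).append(e.keyword).append(e.right).toString());
        }
        return lines;
    }

    public static void main(String[] args) {
        List<String> sentences = Arrays.asList(
                "Lions like to eat gazelles and wildebeest.",
                "Crocodiles eat wildebeest once a year when the wildebeest cross the rivers.",
                "Gazelles are popular icons for car and computer companies since they suggest swiftness.",
                "White tigers live mostly in India.",
                "Wild lions live mostly in Africa.");
        Set<String> animals = new HashSet<>(Arrays.asList(
                "crocodiles", "gazelles", "lions", "tigers", "wildebeest"));
        for (String line : format(buildIndex(sentences, animals))) {
            System.out.println(line);
        }
    }
}
```

Run on the five sample sentences, this prints one line per keyword occurrence, nine in all. The lines are long, so view the output in a fixed-width font wide enough to avoid wrapping, much as the old listings relied on wide fanfold paper.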
 
The sample index above is just to give you the general idea. There are many variations in the details of the presentation and sorting. Today we use search engines and databases to look up just the information we are interested in. The problem with KWIC indexes is that they are enormous and consume entire forests to print. People reading them usually use only a tiny fraction of the entire hard copy. It is more efficient to key in the keyword of interest and have the contexts of interest composed on the spot, the way Google does. The original line printers could not easily create highlighting effects, so you had to use alignment to help people pick out the keywords in context. Today you would use bold or colours.
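
Composing the contexts on the spot for a single keyword, with bold in place of alignment, might look like the following sketch. It is only an illustration, not how any search engine works internally. The class KwicLookup and its lookup method are hypothetical names. It assumes the output is destined for an HTML page and that the sentences contain no characters that would need HTML escaping.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration class; the names are invented for this sketch.
class KwicLookup {
    /**
     * Returns one line per whole-word, case-insensitive occurrence of keyword,
     * with that occurrence wrapped in <b>...</b>. No HTML escaping is done (assumption:
     * the sentences contain no <, > or & characters).
     */
    static List<String> lookup(List<String> sentences, String keyword) {
        Pattern p = Pattern.compile("\\b" + Pattern.quote(keyword) + "\\b",
                Pattern.CASE_INSENSITIVE);
        List<String> hits = new ArrayList<>();
        for (String sentence : sentences) {
            Matcher m = p.matcher(sentence);
            while (m.find()) {
                hits.add(sentence.substring(0, m.start())
                        + "<b>" + m.group() + "</b>"
                        + sentence.substring(m.end()));
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> sentences = Arrays.asList(
                "Lions like to eat gazelles and wildebeest.",
                "Crocodiles eat wildebeest once a year when the wildebeest cross the rivers.",
                "Wild lions live mostly in Africa.");
        for (String hit : lookup(sentences, "wildebeest")) {
            System.out.println(hit);
        }
    }
}
```

Nothing is precomputed or printed in bulk here: only the contexts for the requested keyword are produced, which is the saving over a complete hard-copy index.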

available on the web at:

http://mindprod.com/jgloss/kwic.html