CSV (Comma-Separated Value). A file of ASCII (American Standard Code for Information Interchange)
fields separated by commas. Microsoft Word, Microsoft Excel, and SQL (Structured Query Language) can often import some variant of this format:
"orange, Valencia", lemon, lime
"""extra virgin"" olive", palm, date
Usually fields containing embedded spaces or commas are enclosed in " marks, but
there are other conventions. Quotes (") inside quoted fields are doubled.
Europeans often use ; and Perl aficionados use tab to separate fields instead of
commas. Sybase SQL import uses ' instead of ".
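The quoting rules above can be sketched in a few lines of Java. This is only an illustration, not the CSVReader or CSVWriter classes mentioned in this entry: the class name CsvLineSketch and the methods parseLine and quote are hypothetical names made up for the example. It assumes one record per line (no line breaks inside quoted fields), " as the quote character, and a separator passed in by the caller so the ; and tab variants can be handled too.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration only; not the CSVReader/CSVWriter API. */
class CsvLineSketch {

    /** Splits one line into fields, honouring quoted fields and doubled quotes. */
    static List<String> parseLine(String line, char separator) {
        List<String> fields = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        boolean inQuotes = false;   // currently inside a "..." field
        boolean wasQuoted = false;  // current field was enclosed in quotes
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c != '"') {
                    field.append(c);
                } else if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                    field.append('"');  // doubled quote means one literal quote
                    i++;
                } else {
                    inQuotes = false;   // closing quote
                }
            } else if (c == separator) {
                fields.add(wasQuoted ? field.toString() : field.toString().trim());
                field.setLength(0);
                wasQuoted = false;
            } else if (c == '"' && field.toString().trim().isEmpty()) {
                field.setLength(0);     // drop padding before the opening quote
                inQuotes = true;
                wasQuoted = true;
            } else if (wasQuoted && Character.isWhitespace(c)) {
                // ignore padding between a closing quote and the separator
            } else {
                field.append(c);
            }
        }
        fields.add(wasQuoted ? field.toString() : field.toString().trim());
        return fields;
    }

    /** Encloses a field in quotes when it contains a space, separator or quote. */
    static String quote(String field, char separator) {
        boolean needsQuotes = field.indexOf(separator) >= 0
                || field.indexOf('"') >= 0
                || field.indexOf(' ') >= 0;
        if (!needsQuotes) {
            return field;
        }
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    public static void main(String[] args) {
        // The two sample lines from this entry.
        System.out.println(parseLine("\"orange, Valencia\", lemon, lime", ','));
        System.out.println(parseLine("\"\"\"extra virgin\"\" olive\", palm, date", ','));
        // Round trip: quoting a field that itself contains quotes.
        System.out.println(quote("\"extra virgin\" olive", ','));
    }
}
```

Unquoted fields are trimmed here so that the spaces after the commas in the sample lines do not end up in the data; that is a choice made for this sketch, and other CSV dialects keep such spaces.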
I wrote CSVReader and CSVWriter, which are available with
complete Java source. They are full-featured and
configurable. As well as the read/write classes, there are 20 utilities to let you
do such things as sort, align, pack, etc., either with the library or with the sample utilities. I find these utilities
useful for massaging data, e.g. data obtained by screen scraping, into tidied form without having to write any Java code.
For simple key=value you might use the built-in Properties
mechanism instead. Unfortunately, it has a complex system of encoding awkward characters incompatible with
- This is the advantage I appreciate most. You can use a suite of standard utilities to manipulate them. If
you put your data in this format, it is amazing how much you can do with the CMP (Canadian Mind Products) CSV utilities without writing
a line of code: align in columns, change case, condense, dedup, dump, convert to entities, convert entities to
UTF-8, pack, replace, reshape, sort, convert tabs to commas, create CSV files using flat files and a template,
convert to a search replace script, convert a flat file to CSV, convert CSV to a flat file, convert a CSV file
to an HTML (Hypertext Markup Language) table, convert an HTML table to CSV, create a CSV file from groups of lines.
- They are human readable.
- They are compact, at least relative to XML (Extensible Markup Language).
- They are widely supported, albeit with a variety of separator characters.
- You can embed comments in them.
- You can safely view and edit them with an ordinary text editor.
- You can easily convert them to HTML tables and back.
- If they are damaged, you can manually repair them.
- They don’t have a way of encoding unprintable characters or binary.
- They can be encoded with many different charsets. There is nothing in the file itself to tell you which