HTTP : Java Glossary

HTTP
HTTP (Hypertext Transfer Protocol). A protocol used on the Internet by web browsers to transport text and graphics. It focuses on grabbing a page at a time, rather than setting up a session. Applets also use it to download jars, classes and resources. Browsers use it to download files and images, not just HTML (Hypertext Markup Language) text.
Message Headers From Browser To Server
Message Headers From Server To Browser
Language and Charset
GET vs POST
Sample Code
Under the Hood
response codes
Speeding Up HTTP
Learning More
Links

Message Headers From Browser To Server

Some of the acronyms you will encounter in deciphering HTTP headers include: HTTP, MSIE (Microsoft Internet Explorer), CLR (Common Language Runtime). Fields in the headers let browsers and servers communicate. You set them up with URLConnection.setRequestProperty or more specialised methods. For example:

HTTP Headers that Browsers Send Servers
Field Typical Value Meaning
User-Agent: Which browser is being used/simulated. Typical values:

Java.exe default ⇒ Java/1.7.0_02 Last revised/verified: 2011-08-30

Firefox ⇒ Mozilla/5.0 (Windows NT 6.1; WOW64; rv:8.0.1) Gecko/20100101 Firefox/8.0.1 Last revised/verified: 2011-12-13

Opera ⇒ Opera/9.80 (Windows NT 6.1; U; en) Presto/2.10.229 Version/11.60 Last revised/verified: 2011-12-13

SeaMonkey ⇒ Mozilla/5.0 (Windows NT 6.1; WOW64; rv:7.0.1) Gecko/20110928 Firefox/7.0.1 SeaMonkey/2.4.1 Last revised/verified: 2011-12-13

Safari ⇒ Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/534.52.7 (KHTML, like Gecko) Version/5.1.2 Safari/534.52.7 Last revised/verified: 2011-12-13

Avant ⇒ Mozilla/5.0 (Windows NT 6.1; WOW64; rv:8.0.1) Gecko/20111121 Firefox/8.0.1 Last revised/verified: 2011-12-13

Chrome ⇒ Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.7 (KHTML, like Gecko) Chrome/16.0.912.63 Safari/535.7 Last revised/verified: 2011-12-13

IE (Internet Explorer) 9 ⇒ Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0) Last revised/verified: 2011-12-13

IE 8 ⇒ Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; WOW64; Trident/4.0; Avant Browser; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC (Personal Computer) 6.0; .NET4.0C; .NET4.0E) Last revised/verified: 2011-02-03

IE 7 ⇒ Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0; SLCC1; .NET CLR 2.0.50727; Media Center PC 5.0; .NET CLR 3.0.04506)

Netscape ⇒ Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.8.1.5pre) Gecko/20070710 firefox/2.0.0.4 Navigator/9.0b2

Host: localhost:8081 destination url, server:port.
Accept: application/xhtml+voice+xml;version=1.2, application/x-xhtml+voice+xml;version=1.2, application/x-shockwave-flash,text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,video/x-mng,image/png,image/jpeg,image/gif;q=0.2,text/css,*/*;q=0.1
or
text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2
MIME (Multipurpose Internet Mail Extensions) types the browser is willing to accept. The encoding of this field is described in RFC 2616 section 14 and in the more friendly w3.org version. Roughly, the q numbers define your preference: the higher the number, the higher the preference. The default is 1. The q applies to the preceding MIME type. You set this with URLConnection.setRequestProperty( "Accept", …); not "accept" as the Sun docs erroneously suggest.
Accept-Language: en Language the browser is willing to accept.
Accept-Charset: windows-1252, utf-8, utf-16, iso-8859-1;q=0.6, *;q=0.1 Character set encodings the browser is willing to accept.
Accept-Encoding: deflate, gzip, x-gzip, identity, *;q=0 compression schemes the browser is willing to accept.
  • deflate: zlib format defined in RFC 1950 plus the deflate compression mechanism described in RFC 1951. This is a stripped down gzip without the header.
  • gzip, alias x-gzip: Java-style gzip RFC 1952 Lempel-Ziv coding with a 32-bit CRC (Cyclic Redundancy Check).
  • compress, alias x-compress: UNIX compress.
  • identity: as-is, no compression. Use it in the Accept-Encoding header, but not the Content-Encoding header. Just leave out the Content-Encoding if it is identity.
Referer: http://mindprod.com/jgloss/http.html the web page that contained the link that triggered this request.
If-Modified-Since: Mon, 06 Feb 2006 01:24:23 GMT (Greenwich Mean Time) Only bother with the request if the file has changed since this date, otherwise the browser already has a copy in cache. If the file has not changed, you will get a 304 not modified response code.
Connection: Keep-Alive requests that the server keep the socket open for further messages. This is the default in HTTP 1.1, so you don’t need to use it.
Keep-Alive: 300 requests that the server keep the socket open 300 seconds for further messages.
Pragma: no-cache requests getting a fresh copy from the server, rather than from a cache.
Content-Type: application/x-www-form-urlencoded MIME type of the payload to the server.
Content-Length: 114 length in encoded bytes of the payload to the server.
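
To make the q numbers in fields such as Accept concrete, here is a minimal sketch of how a receiver might split such a value into types and preferences. The class name AcceptHeaderSketch and its parse method are made up for illustration; they are not part of the JDK or of the com.mindprod.http package, and the parsing is deliberately naive (it ignores parameters other than q).

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: splits an Accept-style header value into type -> q preference. */
class AcceptHeaderSketch {

    static Map<String, Double> parse(String header) {
        Map<String, Double> prefs = new LinkedHashMap<>();
        for (String item : header.split(",")) {
            String[] parts = item.trim().split(";");
            String type = parts[0].trim();
            if (type.isEmpty()) {
                continue; // skip empty items such as a trailing comma
            }
            double q = 1.0; // default preference when no q parameter is given
            for (int i = 1; i < parts.length; i++) {
                String param = parts[i].trim();
                if (param.startsWith("q=")) {
                    // the q belongs to the type it is attached to
                    q = Double.parseDouble(param.substring(2).trim());
                }
            }
            prefs.put(type, q);
        }
        return prefs;
    }

    public static void main(String[] args) {
        // Prints each type with its preference, in the order listed in the header.
        System.out.println(parse("text/html;q=0.9, text/plain;q=0.8, image/png, */*;q=0.1"));
    }
}
```

Types without an explicit q come out with the default preference of 1.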

Beware using HttpURLConnection.setFollowRedirects( false ); This reportedly causes trouble in recent JDKs. Even when it is set true, it will not automatically follow responses with: <META HTTP-EQUIV="Refresh".
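
As a rough illustration of setting the fields in the table above, the following sketch prepares a request with URLConnection.setRequestProperty without sending anything. The class name RequestHeaderSketch and the ExampleFetcher agent string are invented for the example. It also uses setInstanceFollowRedirects, which affects only the one connection, as a possible alternative to the static setFollowRedirects.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

/** Hypothetical example class: prepares, but does not send, a request with custom header fields. */
class RequestHeaderSketch {

    static HttpURLConnection prepare(String address) throws IOException {
        // openConnection only creates the connection object; nothing goes out on the wire yet.
        HttpURLConnection conn = (HttpURLConnection) new URL(address).openConnection();
        conn.setRequestMethod("GET");
        // Made-up agent string, for illustration only.
        conn.setRequestProperty("User-Agent", "Mozilla/5.0 (compatible; ExampleFetcher/1.0)");
        conn.setRequestProperty("Accept", "text/html, image/gif, image/jpeg, */*; q=.2");
        conn.setRequestProperty("Accept-Language", "en");
        // Only ask for gzip or deflate if you are prepared to decompress the reply yourself.
        conn.setRequestProperty("Accept-Encoding", "gzip, deflate");
        // Per-connection setting, leaving the JVM-wide static default alone.
        conn.setInstanceFollowRedirects(true);
        return conn;
    }

    public static void main(String[] args) throws IOException {
        HttpURLConnection conn = prepare("http://localhost:8081/");
        System.out.println(conn.getRequestProperty("Accept"));
    }
}
```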

Message Headers From Server To Browser

You read these with .getHeaderField or more specialised methods after the connection has been made.
HTTP Headers that Servers Send Browsers
Field Typical Value Meaning
Server: Apache/2.0.55 (NETWARE) mod_perl/1.99_12 Perl/v5.8.4 Which server software is being used.
Accept-Ranges: bytes Informs the browser that the server supports downloading just parts of files, down to single-byte granularity.
Location: http://mindprod.com/index.html If the URL (Uniform Resource Locator) has been redirected/moved, this is the new URL to use instead. You can tell if it is permanently or temporarily redirected by looking at the response code.
Keep-Alive: timeout=15, max=99 how long to keep this socket open for more messages.
Connection: Keep-Alive requests that the browser keep the socket open for further messages.
Content-Type: image/png MIME type of the payload from the server. Also used to encode the charset encoding, e.g. Content-Type: text/html; charset=utf-8
Content-Encoding: gzip gzip or x-gzip or deflate, or not present if no compression.
Content-Disposition: attachment;filename="smile.png" Server suggests a filename to save this download under.
Content-Length: 842 length in encoded bytes of the payload from the server.
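
A client that asked for compression has to undo it based on the Content-Encoding field. Here is a minimal sketch of that decision using the JDK's java.util.zip classes. The class ContentEncodingSketch and its helper methods are hypothetical names for this example, and the gzip helper exists only so the demonstration can run without a server.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;
import java.util.zip.InflaterInputStream;

/** Hypothetical example class: picks a decompressor from the Content-Encoding value. */
class ContentEncodingSketch {

    /** contentEncoding is the header value, or null when the header is absent. */
    static InputStream decode(InputStream raw, String contentEncoding) throws IOException {
        if (contentEncoding == null) {
            return raw; // no header means no compression
        }
        switch (contentEncoding.trim().toLowerCase(Locale.ROOT)) {
            case "identity":
                return raw;
            case "gzip":
            case "x-gzip":
                return new GZIPInputStream(raw);
            case "deflate":
                return new InflaterInputStream(raw); // expects the zlib wrapper of RFC 1950
            default:
                throw new IOException("Unsupported Content-Encoding: " + contentEncoding);
        }
    }

    /** Reads a stream to the end. */
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }

    /** Stand-in for a server: gzips some bytes so there is something to decode. */
    static byte[] gzip(byte[] plain) throws IOException {
        ByteArrayOutputStream packed = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(packed)) {
            gz.write(plain);
        }
        return packed.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] body = gzip("<html>hello</html>".getBytes(StandardCharsets.UTF_8));
        InputStream in = decode(new ByteArrayInputStream(body), "gzip");
        System.out.println(new String(readAll(in), StandardCharsets.UTF_8));
    }
}
```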

In the real world, the conversations between browser/client and server are much more complicated and slipshod than you might suppose. Each query often results in a flurry of permanent and temporary redirects back and forth. Each element on an HTML page must be requested independently. Sometimes servers will send back a fail error code, then send the page anyway. Or they will send a 404 with an OK text response code. Sometimes servers refuse HEAD requests, but accept the equivalent GET. Sometimes servers send back https: in response to an http: request. Sometimes servers give you a totally different page from the one you requested and don’t tell you the one you wanted is no longer available. Sometimes servers redirect to localhost, or send back gibberish messages. Sometimes a server won’t send you a page if you have recently requested it. They expect you to have cached it. Browsers just do their best to muddle through. When you start emulating browsers with code, you get pretty flaky programs.

Language and Charset

You might wonder, where does the server encode the language and character set? The character set can travel in the Content-Type HTTP header, but oddly both are often embedded in the HTML document itself:
<!-- embedding language and charset inside an HTML document -->
<meta http-equiv="Content-Language" content="en">
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
Embedding this information makes it easier for web page authors to control, even if it makes finding the information slightly more difficult for the browser.
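
A client hunting for the character set might therefore look in two places: the Content-Type header value, and failing that, the meta tag inside the page. The sketch below shows one naive way to do that with a regular expression. CharsetSniffSketch is an invented name, the ISO-8859-1 fallback is an assumption of this example, and real browsers use far more elaborate detection.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical example class: finds a charset name in a header value or in HTML text. */
class CharsetSniffSketch {

    // Matches charset=utf-8, charset="utf-8" and charset = 'utf-8', ignoring case.
    private static final Pattern CHARSET =
            Pattern.compile("charset\\s*=\\s*[\"']?([\\w.:-]+)", Pattern.CASE_INSENSITIVE);

    /** Returns the first charset name found in text, or null if there is none. */
    static String find(String text) {
        if (text == null) {
            return null;
        }
        Matcher m = CHARSET.matcher(text);
        return m.find() ? m.group(1) : null;
    }

    /** Prefers the HTTP header; falls back to the embedded meta tag, then to an assumed default. */
    static String charsetOf(String contentTypeHeader, String html) {
        String fromHeader = find(contentTypeHeader);
        if (fromHeader != null) {
            return fromHeader;
        }
        String fromMeta = find(html);
        return fromMeta != null ? fromMeta : "ISO-8859-1"; // assumed fallback for this sketch
    }

    public static void main(String[] args) {
        String page = "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=iso-8859-1\">";
        System.out.println(charsetOf("text/html", page));
        System.out.println(charsetOf("text/html; charset=utf-8", page));
    }
}
```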

GET vs POST

In a GET, the parameters are embedded in the URL sent to the host, with various header fields following the URL. The first parameter starts with a ?. Subsequent ones start with &. = separates the keyword from the value. Here is what a typical message sent to the server looks like:
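
The following sketch is not the original listing. It assembles the text of a GET message by hand to show where the ?, & and = go. The class GetRequestSketch, the /search path and the q and lang parameters are all invented for the example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical example class: builds the text of a GET request by hand. */
class GetRequestSketch {

    /** Turns parameters into ?key=value&key=value, URL-encoding keys and values. */
    static String query(Map<String, String> params) throws UnsupportedEncodingException {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : params.entrySet()) {
            sb.append(sb.length() == 0 ? '?' : '&'); // first parameter gets ?, the rest get &
            sb.append(URLEncoder.encode(e.getKey(), "UTF-8"));
            sb.append('=');
            sb.append(URLEncoder.encode(e.getValue(), "UTF-8"));
        }
        return sb.toString();
    }

    /** Request line, a couple of header fields, then the blank line that ends the header. */
    static String message(String host, String path, Map<String, String> params)
            throws UnsupportedEncodingException {
        return "GET " + path + query(params) + " HTTP/1.1\r\n"
                + "Host: " + host + "\r\n"
                + "Accept: text/html, */*; q=.2\r\n"
                + "\r\n";
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", "java http"); // made-up parameters
        params.put("lang", "en");
        System.out.print(message("localhost:8081", "/search", params));
    }
}
```

Note that URLEncoder encodes a space as +, which is the form encoding convention for query parameters.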

A POST is like a GET, with optional embedded parameters in the URL sent to the host. In addition it has a message tacked on the end after a blank line like this:
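
Again as a sketch rather than the original listing, this builds the text of a POST message so you can see the blank line and the payload after it. PostRequestSketch, the /submit path and the form fields are made up. Content-Length counts the encoded bytes of the payload, as described in the header table.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical example class: builds the text of a POST request by hand. */
class PostRequestSketch {

    /** body must already be form-encoded, e.g. key=value&key=value. */
    static String message(String host, String path, String body) {
        int length = body.getBytes(StandardCharsets.UTF_8).length; // length in encoded bytes
        return "POST " + path + " HTTP/1.1\r\n"
                + "Host: " + host + "\r\n"
                + "Content-Type: application/x-www-form-urlencoded\r\n"
                + "Content-Length: " + length + "\r\n"
                + "\r\n" // blank line separates the header from the message
                + body;
    }

    public static void main(String[] args) {
        // Made-up form fields, already encoded.
        System.out.println(message("localhost:8081", "/submit", "user=alice&comment=hello+world"));
    }
}
```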

Sample Code

This code for doing GET and POST is from the com.mindprod.http package. You can download the whole package.

Code to do a GET

Code to do a POST

Code to do a HEAD

Code to do a Fetch (generic GET)

Code to do a PROBE (check if page present)

Base class for GET, POST, and PROBE

Code to READ

Read the response either as bytes (readBytesBlocking) or converted to a String

Under the Hood

What happens when your Java-based browser requests a page?
  1. URL.openConnection just sets up a place to build the HTTP header. It does no communicating with the outside world.
  2. HttpURLConnection.connect() requests sending the header to the server. This request triggers opening a TCP/IP (Transmission Control Protocol/Internet Protocol) socket connection to the server. This is done by sending a SYN connection request packet. The server sends back a SYN+ACK. Then the client sends an ACK, upon which may be piggybacked some data.
  3. This triggers sending the GET header composed of all the header fields set up before the .connect call. The GET request header includes a list of the encodings and compression algorithms the browser would like in response. .connect does not return until the HTTP header is safely sent out over the wire. .connect can take a long time to return since it waits for the other end to respond, or a timeout.
  4. The browser calls HttpURLConnection.getResponseCode to see if the request went ok. This blocks until the server responds with an HTTP response header.
  5. Then the browser calls HttpURLConnection.getInputStream and reads the text of the message from the server containing the requested web page. Using the standard TCP/IP protocol flow-control features, the server sends data only as fast as the browser can read it.
  6. The browser then scans the web page for the URLs of embedded images and puts out GET requests for them.
  7. Then various images usually come back from the server on the original socket. The browser could elect to request each image on its own socket so they can arrive simultaneously.
The stream is made purely of printable characters. The server can detect the start of a new GET request by looking for line terminators.
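
The first five steps can be sketched in a few lines of Java. To keep the example self-contained it starts a throwaway server on the local machine with the JDK's com.sun.net.httpserver.HttpServer. The class FetchStepsSketch, its method names and the five second timeouts are assumptions for illustration, not code from the com.mindprod.http package.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;
import java.nio.charset.StandardCharsets;

/** Hypothetical example class: fetches one page, following the numbered steps. */
class FetchStepsSketch {

    /** Returns the response code and body as "code:body". */
    static String fetch(String address) throws IOException {
        // Step 1: build the request; nothing is sent yet.
        HttpURLConnection conn = (HttpURLConnection) new URL(address).openConnection();
        conn.setRequestProperty("Accept", "text/html, */*; q=.2");
        conn.setConnectTimeout(5000); // assumed timeouts, in milliseconds
        conn.setReadTimeout(5000);
        // Steps 2 and 3: connect() opens the TCP connection; the request header
        // is sent no later than the getResponseCode call below.
        conn.connect();
        try {
            // Step 4: blocks until the response header arrives.
            int code = conn.getResponseCode();
            // Step 5: read the body of the response.
            ByteArrayOutputStream body = new ByteArrayOutputStream();
            try (InputStream in = conn.getInputStream()) {
                byte[] buf = new byte[4096];
                int n;
                while ((n = in.read(buf)) != -1) {
                    body.write(buf, 0, n);
                }
            }
            return code + ":" + new String(body.toByteArray(), StandardCharsets.UTF_8);
        } finally {
            conn.disconnect();
        }
    }

    /** Starts a local server on a free port that answers every request with the given page. */
    static HttpServer startTinyServer(String page) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        byte[] bytes = page.getBytes(StandardCharsets.UTF_8);
        server.createContext("/", exchange -> {
            exchange.getResponseHeaders().set("Content-Type", "text/html; charset=utf-8");
            exchange.sendResponseHeaders(200, bytes.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(bytes);
            }
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws IOException {
        HttpServer server = startTinyServer("<html><body>hello</body></html>");
        try {
            int port = server.getAddress().getPort();
            System.out.println(fetch("http://127.0.0.1:" + port + "/"));
        } finally {
            server.stop(0);
        }
    }
}
```

Steps 6 and 7, scanning the page for image URLs and requesting them, are left out; a real browser would repeat the same sequence for each embedded resource.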

Speeding Up HTTP

There are several things you might consider to speed up HTTP transmissions.

Learning More

RFC 1945 HTTP 1.0 specification.

RFC 2045 MIME Part One: Format of Internet Message Bodies, specifies the various headers used to describe the structure of MIME messages.

RFC 2046 MIME Part Two: Media Types, describes the general structure of the MIME media typing system and defines an initial set of media types.

RFC 2047 MIME Part Three: Message Header Extensions for non-ASCII (American Standard Code for Information Interchange) text.

RFC 4288 and RFC 4289 MIME Part Four: Registration Procedures.

RFC 2049 MIME Part Five: Conformance Criteria and Examples, provides some illustrative examples of MIME message formats.

RFC 2183 the Content-Disposition header field.

RFC 2616 HTTP 1.1 specification.

RFC 2617 for details on how to send username and password in HTTP headers to restrict access.

Oracle’s Javadoc on URLConnection class
Oracle’s Javadoc on HttpURLConnection class
Avant
browser
CGI
Chrome
Details on HTTP headers
File I/O Amanuensis: to see how to write code that reads and writes via HTTP-CGI
Firefox
Flock
forms: see the raw socket information exchanged
HTTP Client
HTTP response codes: numeric codes in the HTTP response header
IE
MIME
network properties
Opera
remote file access
RFC
Safari
SeaMonkey
send big files
TCP/IP

You can get the freshest copy of this page from: http://mindprod.com/jgloss/http.html
Please email feedback, error reports and suggestions to improve this page to Roedy Green.