How web browsers work

This is a very simplified view of how a web browser works, but it should be enough to explain how web tracking works.

When you click a link, for example https://somesite.com/some-page, your browser first looks up the address of somesite.com. It then sends a message, using a protocol called “HTTPS” [1], asking that address for the contents of “some-page”. That message includes some information about you and your browser (which operating system you are using, which browser, and various other bits and pieces which are supposed to help the web server).
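
To make that message a bit more concrete, here is a rough sketch in Python of what a simple request for that link might look like. The function name `build_request` and the User-Agent string are made up for illustration, and a real browser sends many more headers (and encrypts the whole thing when using HTTPS). The address lookup itself is left out so the example runs without a network connection.

```python
from urllib.parse import urlsplit


def build_request(url):
    """Return (host, request_text) for a simple GET request to `url`."""
    parts = urlsplit(url)
    host = parts.hostname          # "somesite.com": the name the browser looks up
    path = parts.path or "/"       # "/some-page": what we ask the server for
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        # Hypothetical value: real browsers describe their OS and version here.
        "User-Agent: ExampleBrowser/1.0 (Linux x86_64)",
        "Accept: text/html",
        "Connection: close",
        "",                        # a blank line marks the end of the headers
        "",
    ]
    return host, "\r\n".join(lines)


host, request = build_request("https://somesite.com/some-page")
print("Look up the address of:", host)
print(request)
```

The User-Agent line is the “information about you and your browser” mentioned above: it is sent along with every request, whether you want it to be or not.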

The web server figures out what you mean by “some-page” and then sends a document back to your browser. That document is normally written in HTML [2] and tells the browser what “content” to display. This is basically the words that appear on screen, along with some simple codes for things like “this is a heading”, “this is a paragraph”, “this is a section” and so on.
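
As an illustration of those “simple codes”, the sketch below feeds a tiny, made-up HTML document to Python's standard `html.parser` module and prints each piece of text alongside the tag that wraps it. The `PAGE` contents and the `Outline` class are invented for this example; a real browser does far more than this.

```python
from html.parser import HTMLParser

# A made-up document: a heading, a paragraph, and a paragraph inside a section.
PAGE = """<html><body>
<h1>Some page</h1>
<p>Hello, this is a paragraph.</p>
<section><p>Another paragraph, inside a section.</p></section>
</body></html>"""


class Outline(HTMLParser):
    """Record (tag, text) pairs for every piece of visible text."""

    def __init__(self):
        super().__init__()
        self.stack = []   # tags that are currently open
        self.items = []   # (innermost tag, text) pairs, in document order

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)

    def handle_endtag(self, tag):
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if text and self.stack:
            self.items.append((self.stack[-1], text))


outline = Outline()
outline.feed(PAGE)
for tag, text in outline.items:
    print(f"{tag}: {text}")
```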

However, an HTML document by itself doesn’t do much, so the HTML contains references to other files to help it out. The most common of these are stylesheets (CSS [3] files, which define the appearance, colours and fonts of the page) and JavaScript files (which are programming code that can make your page interactive). Plus, HTML documents don’t contain images. Instead they contain a reference, so the browser sends another message to the web server saying “load up image1.jpg”, and inserts the image at that point in the page.
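
Here is a small sketch of that step: given a made-up page, it lists the extra files the browser would have to ask for. The file names `/style.css` and `/app.js` and the `ResourceFinder` class are assumptions for illustration (only `image1.jpg` comes from the text above), and real pages can reference files in many more ways than these three.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

PAGE_URL = "https://somesite.com/some-page"
# A made-up page that references a stylesheet, a script and an image.
PAGE = """<html><head>
<link rel="stylesheet" href="/style.css">
<script src="/app.js"></script>
</head><body>
<h1>Some page</h1>
<img src="image1.jpg" alt="An example image">
</body></html>"""


class ResourceFinder(HTMLParser):
    """Collect the other files an HTML document refers to."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.resources = []  # (kind, absolute URL) pairs

    def add(self, kind, reference):
        # References are often relative, so resolve them against the page URL.
        self.resources.append((kind, urljoin(self.base_url, reference)))

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "link" and attrs.get("rel") == "stylesheet" and attrs.get("href"):
            self.add("css", attrs["href"])
        elif tag == "script" and attrs.get("src"):
            self.add("js", attrs["src"])
        elif tag == "img" and attrs.get("src"):
            self.add("image", attrs["src"])


finder = ResourceFinder(PAGE_URL)
finder.feed(PAGE)
for kind, url in finder.resources:
    print(f"{kind}: {url}")
```

Each URL in that list means one more message to a web server, and each message carries the same browser information as the first one.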

Whenever the web server receives one of these messages, it figures out which file it needs to find (whether that’s HTML, CSS, JS, an image or whatever) and sends it back. But the web server can also drop small pieces of information, known as cookies, onto your machine, which are useful for storing information about you. Your browser then sends those cookies back with later messages to the same site. For example, if you’re logged in, then it can record a token saying who you are. If you always like your products listed from “Low to High price” then that’s probably stored in a cookie. Cookies are legitimately useful bits of information.
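
The cookie round trip can be sketched with Python's standard `http.cookies` module. The cookie names `session` and `sort`, and their values, are invented for this example, and the “browser” here is just a dictionary; real browsers also track which site, path and expiry date each cookie belongs to.

```python
from http.cookies import SimpleCookie

# 1. The server attaches cookies to its response (names and values are made up).
response_cookies = SimpleCookie()
response_cookies["session"] = "abc123"         # a token saying who you are
response_cookies["session"]["path"] = "/"
response_cookies["session"]["httponly"] = True
response_cookies["sort"] = "price-asc"         # your "Low to High price" preference

set_cookie_headers = [morsel.OutputString() for morsel in response_cookies.values()]
for header in set_cookie_headers:
    print("Set-Cookie:", header)

# 2. The browser stores them (a plain dict stands in for the browser's cookie store).
jar = {morsel.key: morsel.value for morsel in response_cookies.values()}

# 3. On the next request to the same site, the browser sends them back.
cookie_header = "; ".join(f"{name}={value}" for name, value in jar.items())
print("Cookie:", cookie_header)

# 4. The server reads them and recognises you.
incoming = SimpleCookie()
incoming.load(cookie_header)
print("Sort order requested:", incoming["sort"].value)
```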

So that’s the basics of how a web page is shown … it’s pretty simple. Tomorrow is where we get into the consequences of that simple scheme.

Disclaimer: All the information here is greatly simplified. Don’t write in and complain that I’ve got it wrong. Pretty please?

  1. Hypertext Transfer Protocol Secure
  2. HyperText Markup Language
  3. Cascading Style Sheets