Viewing One Website Is Promiscuous
Published: 09/14/2009
There was a time when using a computer meant that you sat down in front of a keyboard, did your work, saved the documents to disk or floppy, and then walked away. None of that attaching things to e-mails, clicking on a hyperlink, or sending data through established connections to another computer somewhere else on a "network." In other words, you worked on a computer that was completely stand-alone and offline.
Those days are over.
Using a computer that's disconnected from the Internet is almost unimaginable these days, just as a cell phone without a network connection seems completely unreasonable. We now turn the computer on and expect an established Internet connection so we can look at a website, log into web-based e-mail, or do other work that relies on an underlying network connection.
However, Internet communications happen quickly and many subtle things transpire without the user knowing or being able to approve of them. As an analogy, when you go to the auto dealership to buy a new car, the salesmen are also slipping in the extended warranty, over-priced paint sealant, rust-proofing upgrade, and fabric protection add-ons without you knowing it. On the web, you're paying the price with slower webpage load time, possible malware downloads, and potential identity theft issues.
Opening a simple website - what happens under the hood
Making web applications work seamlessly across a network that's physically distributed over the planet requires an enormous amount of automation. As a generalization, for someone to view a website the following must happen in this order:
1) The user opens a web browser such as Mozilla Firefox or Microsoft Internet Explorer.
2) The user types in the address of a website she wishes to see, such as www.google.com.
3) The web browser application makes a request to the OS using API calls to establish a network connection to the requested website.
4) The OS sends a name query request to its assigned DNS server to determine the IP address of www.google.com.
5) Assuming the DNS server responds with the address of www.google.com, the OS then sends a connection establishment request to the server representing www.google.com. Because HTTP runs over TCP, this is a TCP three-way handshake.
6) Once the TCP connection is established, the web browser sends (through the OS's network stack) an "HTTP GET" request for the root of the website (this usually defaults to a file such as index.html, which is like the "intro page" of the site). HTTP is the protocol ("language") used for client-to-website communication.
7) The web server sends the page over to the user's computer. The browser downloads this file and places it in its designated cache store somewhere on the disk. It then reads through the file and starts rendering the page contents in the prescribed layout.
8) If the index file references additional objects (such as images) which make up the webpage, the web browser makes additional "HTTP GET" requests for them. Pictures, advertisement images, style sheets, and script files are all individual objects that together make up the complete web page; cookies travel alongside them in the HTTP headers. Remember that images and page text are separate items. They are not put into "one file" like office documents, so the web browser (and server) manages all of these as individual pieces that have to be put together like a puzzle.
9) Once all the objects are downloaded into the browser cache area on the disk, the browser is able to fully render the website as laid out by the invisible specifications embedded within the web page document.
10) If there are any scripts (typically JavaScript, or JScript in Microsoft's implementation) defined within the web page document, these are run automatically. Scripts are automated tasks that perform many functions, such as dynamically loading image objects specific to the web browser version, loading advertising content from third-party sites, writing session-specific cookie information, redirecting the connection to another site, downloading executable objects to your disk, etc. Generally speaking, scripts run invisibly, without asking for the user's permission.
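Steps 4 through 6 can be sketched in a few lines of Python. This is only an illustration of the sequence, not how any particular browser is implemented: the function names build_get_request and fetch are made up for this example, and a real browser adds many more headers, caching, and error handling.

```python
import socket


def build_get_request(host: str, path: str = "/") -> bytes:
    """Compose a minimal HTTP/1.1 GET request (step 6)."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",  # ask the server to close when it is done
        "",
        "",  # the blank line marks the end of the request headers
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch(host: str, path: str = "/", port: int = 80, timeout: float = 5.0) -> bytes:
    """Resolve a name, connect, send a GET, and return the raw response."""
    # Step 4: ask the OS resolver (and its DNS server) for an IP address.
    ip_address = socket.gethostbyname(host)
    # Step 5: create_connection() performs the TCP three-way handshake.
    with socket.create_connection((ip_address, port), timeout=timeout) as conn:
        # Step 6: send the HTTP GET over the established connection.
        conn.sendall(build_get_request(host, path))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)
```

Calling fetch("www.google.com") on a connected machine would return the raw response headers and body as bytes. Nothing in the block touches the network until fetch is actually called.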
As hinted above, simply going to a single website address does not mean that all the content comes from that one source. Instead, it's very possible that many of the images and much of the animated Flash content are downloaded from third-party websites. This is especially common when advertisements are involved, as these are pulled from commercial ad networks which aggregate feeds from multiple marketers and stream the material directly to the end-user's computer. In virtually all cases, the website whose address the user typed in does not control the specific ad content coming from the third-party locations.
This is where the web gets dangerous because all this happens within a few seconds for a single web page. One click does it all.
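To make the third-party fan-out concrete, here is a rough Python sketch that scans a page for referenced objects and lists every hostname other than the one the user asked for. The class and function names are invented for this example and the sample HTML is made up for illustration. It also treats any different hostname as "third-party", which is a simplification, since a site's own CDN would be counted too.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class ObjectCollector(HTMLParser):
    """Collect URLs from src attributes and from <link href=...> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("img", "script", "iframe", "embed") and attrs.get("src"):
            # urljoin() turns relative paths into absolute URLs.
            self.urls.append(urljoin(self.base_url, attrs["src"]))
        elif tag == "link" and attrs.get("href"):
            self.urls.append(urljoin(self.base_url, attrs["href"]))


def third_party_hosts(page_url, html):
    """Return the sorted hostnames referenced by the page, minus its own."""
    collector = ObjectCollector(page_url)
    collector.feed(html)
    first_party = urlparse(page_url).hostname
    hosts = {urlparse(url).hostname for url in collector.urls}
    return sorted(hosts - {first_party})


# Hypothetical page fragment; it is not the real espn.go.com markup.
SAMPLE_HTML = """
<html><body>
<img src="/local/logo.png">
<script src="http://a.espncdn.com/prod/scripts/swfobject/2.2/swfobject.js"></script>
<img src="http://ec.atdmt.com/images/pixel.gif">
</body></html>
"""

print(third_party_hosts("http://espn.go.com/", SAMPLE_HTML))
```

For the sample fragment this prints the two outside hostnames, a.espncdn.com and ec.atdmt.com, while the relative image path resolves to the page's own host and is left out.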
A real-life example
Through various technical trickeries, malicious code writers potentially serve their virus-laden wares by embedding them in advertisements or by hijacking sites that serve ad content. Let's take a look at a popular website such as espn.com. The following represents a detailed session capture and the transactions which happen underneath the hood when a user opens the front page of the website (this particular examination reflects the specific characteristics of the site and content as of this writing; it's very important to understand that the content may dynamically change from one moment to the next).
Here's the capture file for those of you who know how to read a packet trace and want to follow along with Wireshark.
For every different website that the browser has to pull an object from, the client performs a DNS request to resolve the server to a numerical IP address, then performs a TCP handshake directly to the resolved IP address, and then sends an HTTP GET request and downloads the files from the server. For example:
- Client queries DNS server for the IP address of www.espn.com.
- DNS server replies with IP address 199.181.132.250.
- Client performs TCP three-way handshake with 199.181.132.250.
- Client performs an HTTP GET for the "/" (root) of the web server at 199.181.132.250.
- In this particular case for this domain name, the web server responds with an HTTP 301 "Moved Permanently" message indicating the client should redirect to espn.go.com instead.
- Client queries DNS server for the IP address of espn.go.com.
- DNS server replies with IP address 198.105.194.105.
- Client performs TCP handshake with 198.105.194.105.
- Client performs an HTTP GET for the "/" (index root page) of the web server at 198.105.194.105.
- The web server starts sending the website's index page to the client. As the browser parses the page, it sees that the page references objects outside of the espn.go.com domain, so the client has to perform additional DNS queries and HTTP GET requests to fetch the files needed to complete the page rendering.
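The redirect in the sequence above is driven by two pieces of the server's reply: the status code and the Location header. The following Python sketch shows how a client might pull them out of a raw response. The function name parse_redirect is invented for this example, and the sample response is a hypothetical reconstruction, not bytes copied from the capture file.

```python
def parse_redirect(raw_response: bytes):
    """Return (status_code, redirect_target_or_None) for a raw HTTP response."""
    # Everything before the first blank line is the status line plus headers.
    head, _, _ = raw_response.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    # Status line looks like: HTTP/1.1 301 Moved Permanently
    status_code = int(lines[0].split(" ", 2)[1])
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    if status_code in (301, 302, 303, 307, 308):
        return status_code, headers.get("location")
    return status_code, None


# Hypothetical response resembling what www.espn.com returned.
SAMPLE_RESPONSE = (
    b"HTTP/1.1 301 Moved Permanently\r\n"
    b"Location: http://espn.go.com/\r\n"
    b"Content-Length: 0\r\n"
    b"\r\n"
)

print(parse_redirect(SAMPLE_RESPONSE))
```

A browser that receives such a reply starts over with the new hostname, which is why the trace shows a second DNS query and a second TCP handshake before any page content arrives.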
The following summarizes the content the client requested from each server outside of espn.go.com:
a.espncdn.com (CNAME: a.espncdn.com.edgesuite.net » CNAME: a1831.g.akamai.net » 204.2.179.75, 204.2.179.34)
/combiner/c?css=base.200907221727.css,modules.200907241146.css,frontpage_scoreboard.200907221318.css,insider.200907150953.css,espn360.200907150953.css,sn_icon_sprite.200907150953.css,master_sprite.200907231718.css
/combiner/c?js=jquery-1.3.2.js,plugins/json2.js,plugins/teacrypt.js,plugins/jquery.metadata.js,plugins/jquery.bgiframe.js,plugins/jquery.easing.1.3.js,plugins/jquery.hoverIntent.js,plugins/jquery.jcarousel.js,plugins/jquery.tinysort.js,ui/1.7.2/ui.core.js,ui/1.7.2/ui.tabs.js,espn.core2.200908261423.js,frontpage_scoreboard.200908281651.js,registration/myEspn.200908251118.js,registration/staticLogin.200909041014.js,insider/espn.insider.200906091431.js,tsscoreboard.20090612.js,flashObjWrapper.200907151202.js
/prod/scripts/swfobject/2.2/swfobject.js
/prod/assets/carousel_getflash_111808.gif
/photo/2009/0913/nfl_qb3_576.jpg
/photo/2009/0913/nfl_day1qbs_134.jpg
/photo/2009/0913/nfl_u_mcnabbinjury1_134.jpg
/media/motion/2009/0913/dm_090913_ten_clijsters_schighlight_thumdnail_wbig.jpg
/photo/2009/0911/boston_smh_launch_134.jpg
/prod/assets/main_search.gif
/webslices/icons/12x12.png
/i/columnists/reilly_rick_35fp.jpg
/i/columnists/simmons_bill_35fp.jpg
/prod/assets/clear.png
/photo/2009/0913/ten_a_clijsterstroph1_134.jpg
/media/motion/2009/0913/FEDSbuck913909_thumdnail_wbig.jpg
/icons/watch.png
/media/motion/2009/0913/dm_090913_ncf_big12_final_verdict_thumdnail_wbig.jpg
/winnercomm/outdoors/fishing/2009/WT-TROUT_134.jpg
/icons/in.gif
/streak/SFC09_playnow.jpg
/photo/2009/0913/sn_g_phillips01_35.jpg
/insertfiles/javascript/wa/sOmni.js?xhr=1
/insertfiles/javascript/wa/analytics.js?xhr=1
/streak/sfc09_new_logo.gif
/streak/loader_w.gif
/insertfiles/javascript/wa/sOmni.js
/streak/gradient_back.jpg
/streak/checks.gif
a1.espncdn.com (CNAME: a.espncdn.com.edgesuite.net » CNAME: a1831.g.akamai.net » 204.2.179.34, 204.2.179.75)
/prod/assets/bg_v2/bg_frontpage_elements.jpg
/prod/assets/bg_v2/bg_frontpage_red.jpg
/prod/assets/empire_trans_background_right.png
/prod/assets/myespn_trans_bg.png
/prod/assets/icon_externallink_gray.png
/prod/assets/empire_sep.png
a2.espncdn.com (CNAME: a.espncdn.com.edgesuite.net » CNAME: a1831.g.akamai.net » 204.2.179.75, 204.2.179.34)
/prod/assets/master-07092009.png
/prod/assets/bg944-070909.png
/prod/assets/frontpage_scoreboard/background_bar.png
/prod/assets/empire_trans_background_left.png
/prod/assets/empire_trans_background_middle.png
/prod/assets/gradient_back.jpg
/prod/assets/nav_trans_background_fp.png
/prod/assets/icon_video_ie6.png
/prod/assets/sn_icon_sprite_40.png
ads.pointroll.com (72.32.153.176)
/PortalServe/?pid=848233R65720090819231723&flash=6&time=1|1:14|-7&redir=http://log.go.com/log?srvc%3dsz%26guid%3dC0E45016-C631-4225-AB6B-866A9802D9B5%26drop%3d0%26addata%3d3345:53156:552561:53156%26a%3d1%26goto%3d$CTURL$&r=0.4291133528394331
/PortalServe/?pid=848105K50720090819175107&flash=6&time=1|1:14|-7&redir=http://log.go.com/log?srvc%3dsz%26guid%3d7333E1E2-CE1C-4363-B749-73E3393DB397%26drop%3d0%26addata%3d1331:53156:552560:53156%26a%3d1%26goto%3d$CTURL$&r=0.6454309718492637
view.atdmt.com (65.203.229.42)
/AVE/view/169962512/direct/01/&0.4291133528394331
/AVE/view/169959917/direct/01/&0.6454309718492637
speed.pointroll.com (CNAME: speed.pointroll.com.edgesuite.net » CNAME: a1343.g.akamai.net » 209.170.75.224, 209.170.75.232)
/PointRoll/Media/Banners/Nike/701579/TackleTakeOverESPN_924x50.jpg?PRAd=1255697&PRCID=1255697&PRplcmt=848233&PRPID=848233
/PointRoll/Media/Banners/Nike/701578/TackleTakeOver_300x250.jpg?PRAd=1255696&PRCID=1255696&PRplcmt=848105&PRPID=848105
assets.espn.go.com (CNAME: assets.espn.go.com.edgesuite.net » CNAME: a1589.g.akamai.net » 204.2.179.17, 204.2.179.8)
/icons/video.png
/icons/watch.png
/icons/listen.png
/insertfiles/javascript/wa/b_oo_engine.js
ec.atdmt.com (CNAME: atlasdmt.vo.msecnd.net » 65.54.81.64, 65.54.81.57)
/images/pixel.gif
adsatt.espn.go.com (CNAME: adimages.go.com.edgesuite.net » CNAME: a1412.g.akamai.net » 209.170.75.169, 209.170.75.179)
/ad/sponsors/nissan/Sep_2009/niss-300x100-0001.jpg
/ad/sponsors/ESPN_Internet_Ventures/Aug_2009/es13-300x100-0508.jpg
/ad/sponsors/gatorade/Aug_2009/gato-298x40-0021.jpg
/ad/sponsors/utilities/adinsert.js
ad.doubleclick.net (CNAME: dart-ad.l.doubleclick.net » CNAME: ad.3ad.doubleclick.net » 209.62.176.52)
/ad/N3340.espn/B3731500.26;sz=1x1;ord=2009.09.14.01.14.47?
cdn1.eyewonder.com (CNAME: cdn1.eyewonder.com.edgesuite.net » CNAME: a1956.g.akamai.net » 204.2.179.75, 204.2.179.66)
/200125/758106/1096030/ewtrack.gif?ewbust=2009.09.14.01.14.47
m1.2mdn.net (CNAME: m1.2mdn.net.edgesuite.net » CNAME: a509.cd.akamai.net » 209.62.187.45, 209.62.187.43)
/viewad/1361549/143-DL_1x1_tracking_pixel.gif
3ps.go.com (68.71.209.52)
/DynamicJSAd?js=1&itype=Pop1&srvc=sz&url=/
log.go.com (CNAME: log.wip.go.com » 68.71.209.70)
/log?ft=j&srvc=sz&addata=2206:73821:473959:73821|858:65::|0:73822:473960:73822|1331:53156:552560:53156|1034:183:227005:65|1149:65::|1643:53156:545118:53156|3345:53156:552561:53156|3346:65::|3347:65::|3348:65::|3358:53156:542703:53156|3359:53156:549472:53156&method=GET&cap=&svr=espn.go.com&host=espn.go.com&guid=D522A4BB-73F3-4458-97FF-B206EF584D5A&sf=cnt_codes:SZ11
content.dl-rms.com (CNAME: content.dl-rms.com.edgesuite.net » CNAME: a1666.x.akamai.net » 204.2.179.57)
/rms/11107/nodetag.js
/dt/s/11107/s.js
broadband.espn.go.com (68.71.208.110)
/espn360/util/espn360user?callback=jsonp1252916090169
streak.espn.go.com (68.71.208.175)
/format/modules/front09v2/streakModule
w88.go.com (CNAME: go.com.112.2o7.net » 66.235.138.2, 66.235.138.18, 66.235.138.19, 66.235.138.44, 66.235.139.54, 66.235.139.118, 66.235.139.121, 66.235.139.152)
/b/ss/wdgespcom,wdgespge/1/H.17/s02273497489981?AQB=1&ndh=1&t=14/8/2009%201%3A14%3A57%201%20420&ns=espn&cdp=2&pageName=espn%3Ahome%3Afrontpage&g=http%3A//espn.go.com/&cc=USD&ch=espn%3Ahome&server=espn.go.com&events=event3%2Cevent38&products=ads%3Blogin_cta_unreg_reg_registernow%3B%3B%3Bevent38%3D1%2Cads%3B3345%3A53156%3A552561%3A53156%3B%3B%3Bevent38%3D1%2Cads%3B1331%3A53156%3A552560%3A53156%3B%3B%3Bevent38%3D1%2Cads%3B2009_insdr_mod_front_xxx_xxx%3B%3B%3Bevent38%3D1%2Cads%3B3358%3A53156%3A542703%3A53156%3B%3B%3Bevent38%3D1&c1=espn&h1=espn%3Ahome%3Afrontpage&c2=D%3DSWID&c4=index&c5=espn%3Ahome%3Afrontpage&c6=New&v7=%3Aunknown%3Aanonymous%3Aanonymous%3Apremium-no%3A&v9=en&c11=anonymous%3Apremium-no&v11=index%3Aespn%3Ahome&v13=espn%3Ahome%3Afrontpage&c17=en&c21=unknown&c22=unknown&c24=First%20Visit&c29=anonymous&c30=false&s=1024x768&c=16&j=1.5&v=Y&k=Y&bw=792&bh=450&ct=lan&hp=N&AQE=1
/b/ss/wdgespcom,wdgespge/1/H.17/s02273497489981?AQB=1&pccr=true&vidn=2556FDC10515A53A-4000017200003EDA&&ndh=1&t=14/8/2009%201%3A14%3A57%201%20420&ns=espn&cdp=2&pageName=espn%3Ahome%3Afrontpage&g=http%3A//espn.go.com/&cc=USD&ch=espn%3Ahome&server=espn.go.com&events=event3%2Cevent38&products=ads%3Blogin_cta_unreg_reg_registernow%3B%3B%3Bevent38%3D1%2Cads%3B3345%3A53156%3A552561%3A53156%3B%3B%3Bevent38%3D1%2Cads%3B1331%3A53156%3A552560%3A53156%3B%3B%3Bevent38%3D1%2Cads%3B2009_insdr_mod_front_xxx_xxx%3B%3B%3Bevent38%3D1%2Cads%3B3358%3A53156%3A542703%3A53156%3B%3B%3Bevent38%3D1&c1=espn&h1=espn%3Ahome%3Afrontpage&c2=D%3DSWID&c4=index&c5=espn%3Ahome%3Afrontpage&c6=New&v7=%3Aunknown%3Aanonymous%3Aanonymous%3Apremium-no%3A&v9=en&c11=anonymous%3Apremium-no&v11=index%3Aespn%3Ahome&v13=espn%3Ahome%3Afrontpage&c17=en&c21=unknown&c22=unknown&c24=First%20Visit&c29=anonymous&c30=false&s=1024x768&c=16&j=1.5&v=Y&k=Y&bw=792&bh=450&ct=lan&hp=N&AQE=1
games-ak.espn.go.com (CNAME: espn.go.com.edgesuite.net » CNAME: a200.g.akamai.net » 204.2.179.11, 204.2.179.51)
/s/minigames/i/page1/logo_fantasy_pigskin_80.png
Totals:
73 HTTP GET requests.
21 DNS queries.
19 servers contacted (excluding DNS) located in:
- Burbank, CA
- Cambridge, MA
- Englewood, CO
- New York, NY
- Orem, UT
- San Antonio, TX
- Seattle, WA
In summary, simply going to espn.com (which redirects you to espn.go.com as its official home) automatically causes the browser to contact the nineteen other hostnames listed above to download content behind the scenes. Many of these requests are for general web page images, but some are specifically for advertising content and possibly other material of unknown nature, which could potentially be hazardous.
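The host lines in the listing above follow a regular pattern (hostname, then an optional CNAME chain, then one or more IP addresses), so they can be tallied mechanically. The Python sketch below parses three lines copied from the listing. The regular expression, the dictionary keys, and the "ends with akamai.net" check are assumptions made for this example, not part of the original analysis.

```python
import re

# hostname, a space, then everything inside the outer parentheses
HOST_LINE = re.compile(r"^(?P<host>\S+) \((?P<chain>.+)\)$")


def parse_host_line(line):
    """Split one host summary line into host, CNAME chain, and addresses."""
    match = HOST_LINE.match(line.strip())
    if match is None:
        return None  # request-path lines do not match the pattern
    cnames, addresses = [], []
    for part in match.group("chain").split("»"):
        part = part.strip()
        if part.startswith("CNAME:"):
            cnames.append(part[len("CNAME:"):].strip())
        else:
            addresses = [ip.strip() for ip in part.split(",")]
    return {"host": match.group("host"), "cnames": cnames, "addresses": addresses}


SAMPLE_LINES = [
    "a.espncdn.com (CNAME: a.espncdn.com.edgesuite.net » CNAME: a1831.g.akamai.net » 204.2.179.75, 204.2.179.34)",
    "ads.pointroll.com (72.32.153.176)",
    "ec.atdmt.com (CNAME: atlasdmt.vo.msecnd.net » 65.54.81.64, 65.54.81.57)",
]

records = [parse_host_line(line) for line in SAMPLE_LINES]
# Count hosts whose CNAME chain ends up on an akamai.net name.
via_akamai = sum(
    1 for rec in records if any(c.endswith("akamai.net") for c in rec["cnames"])
)
print(len(records), "hosts parsed,", via_akamai, "served via akamai.net")
```

Run over the full listing, the same approach would show how many of the hostnames resolve through a content delivery network rather than a server the site operates directly.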
Some safety nets
Ideally, web servers would be hardened with much better security configurations, malware writers would stop using ad networks as electronic disease carriers, and websites would stop riddling each webpage request with dozens of dependent object requests for images, cookies, JavaScript functions, Java applets, distracting Flash animations, etc. Until then, many security-conscious users choose to block these by default and approve them on a case-by-case basis when they feel that a particular website and its related third-party domains are trustworthy from both a security and aesthetics perspective.
A popular solution is to use Mozilla Firefox with the NoScript and Adblock Plus add-ons (additional plug-in software components that add functionality to the base browser). However, by default this makes some sites slightly-to-completely inoperable until scripting is enabled for the particular third-party domains on which the original site depends. The user ultimately has to choose whether to permit scripting or not (either permanently or only for the current session).
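The block-by-default idea with permanent and per-session approvals can be modeled in a few lines of Python. This is a toy model only: the ScriptPolicy class and its method names are invented here, and it says nothing about how NoScript itself is implemented.

```python
class ScriptPolicy:
    """Block scripts from every host unless the user has approved it."""

    def __init__(self, permanent=None):
        self.permanent = set(permanent or [])  # survives across sessions
        self.session = set()                   # cleared when the session ends

    def allow_permanently(self, host):
        self.permanent.add(host)

    def allow_for_session(self, host):
        self.session.add(host)

    def end_session(self):
        self.session.clear()

    def is_allowed(self, host):
        return host in self.permanent or host in self.session


policy = ScriptPolicy(permanent=["espn.go.com"])
policy.allow_for_session("a.espncdn.com")

print(policy.is_allowed("espn.go.com"))         # approved permanently
print(policy.is_allowed("a.espncdn.com"))       # approved for this session
print(policy.is_allowed("ad.doubleclick.net"))  # never approved, so blocked
```

After policy.end_session(), the session-only approval for a.espncdn.com is gone while the permanent one for espn.go.com remains, which mirrors the choice the user makes for each domain.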
If Internet Explorer must be used as a browser, then using Privoxy as a local, client-side proxy at least filters out some common ads in the default configuration. IE will have to be configured to route through a proxy at the loopback address (127.0.0.1) on port 8118.
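The same loopback proxy setting applies to any HTTP client, not only IE. As a hedged illustration, the Python sketch below builds a urllib opener that would route requests through a proxy at 127.0.0.1 port 8118. It assumes Privoxy is listening at that address, and the function name build_privoxy_opener is made up for this example.

```python
import urllib.request

# Assumes a local Privoxy instance listening on the loopback address.
PRIVOXY_PROXY = "http://127.0.0.1:8118"


def build_privoxy_opener(proxy_url=PRIVOXY_PROXY):
    """Return an opener that sends HTTP and HTTPS requests via the proxy."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


opener = build_privoxy_opener()
# opener.open("http://espn.go.com/") would now pass through the proxy;
# the call is left commented out so the sketch runs without a network.
```

Building the opener does not contact the proxy. A request only goes out when opener.open() is called, and it fails if nothing is listening on port 8118.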
The Internet is in a constant state of change, and part of the growing pain is in defending networks to protect the client, the server, and the communication between them. Seemingly trustworthy websites such as banking or news media sites can be compromised by increasingly sophisticated crackers who embed their secret sauce into sites you may visit daily. There may be nothing obvious on the surface to indicate a hijacked site, and the malicious content may lurk there for some time until the site owners find out (if they ever do).