The Domino Effect of LinkedIn’s DNS Outage

On Wednesday night, users trying to access the LinkedIn site were redirected to servers owned by Confluence Labs. The outage lasted a couple of hours, and a post mortem on Thursday attributed the issue to human error at Network Solutions, the provider of LinkedIn’s DNS.

The immediate effects of the outage are obvious: upset LinkedIn users, lost ad revenue, alarmist media coverage, possibly some brand damage, and a reminder of last year’s security leak. Sadly, the incident did not only affect the LinkedIn website; it also triggered web performance problems on websites using LinkedIn’s Share Plugin.

Web Performance Impact on Other Websites

During the LinkedIn DNS downtime, our monitoring agents observed failures on other websites such as The Atlantic, The Daily Beast, and The Economist. The problems lasted about 6.5 hours, from 6 PM PT Wednesday to 2:30 AM PT Thursday.

The impacted sites included the LinkedIn Social Plugin in their webpages through a blocking JavaScript reference to “platform.linkedin.com”. Unfortunately, this type of code prevents the browser from rendering the rest of the HTML until the script from LinkedIn has been loaded and executed, or the request has failed. When the LinkedIn DNS problem started, the JavaScript calls to “platform.linkedin.com” resulted in an endless chain of redirects.

Before the outage, under normal performance, this is the waterfall of the LinkedIn tags on a website:

During the outage, platform.linkedin.com instead responded with multiple 302 redirects to itself.

The recursive redirects caused serious performance degradation, as the browser could not render the rest of the webpage HTML until it gave up on the redirect chain. On top of that, the Confluence Labs servers could not handle the load of requests intended for LinkedIn from all the users hitting them, resulting in slow response times.
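To make the failure mode concrete, here is a minimal sketch of a client following redirects until it gives up. It is an illustration only, not a model of any particular browser: the `follow_redirects` function, the injected `fetch` callable, the `self_redirecting` stand-in, and the cap of 20 redirects are all assumptions made for this example, and no real network request is made.

```python
from typing import Callable, Optional, Tuple

# A response is modelled as (status_code, Location header or None).
Response = Tuple[int, Optional[str]]


def follow_redirects(
    url: str,
    fetch: Callable[[str], Response],
    max_redirects: int = 20,  # assumed cap, chosen for illustration
) -> Tuple[str, int, bool]:
    """Follow 3xx responses until a non-redirect arrives or the cap is hit.

    Returns (last_url, requests_made, gave_up).
    """
    requests_made = 0
    while True:
        status, location = fetch(url)
        requests_made += 1
        is_redirect = 300 <= status < 400 and location is not None
        if not is_redirect:
            return url, requests_made, False
        if requests_made > max_redirects:
            # The cap is exhausted: stop instead of looping forever.
            return url, requests_made, True
        url = location


def self_redirecting(url: str) -> Response:
    # Hypothetical stand-in for the outage behaviour: a 302 back to the same URL.
    return 302, url


def healthy(url: str) -> Response:
    # Hypothetical stand-in for normal behaviour: the script is served directly.
    return 200, None


outage = follow_redirects("http://platform.linkedin.com/", self_redirecting)
normal = follow_redirects("http://platform.linkedin.com/", healthy)
print(outage)  # gave_up is True after max_redirects + 1 requests
print(normal)  # a single request, gave_up is False
```

Under these assumptions, every blocked page view costs the full run of round trips to an already overloaded server before rendering can continue, which is why a redirect loop hurts far more than a fast failure.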

When looking at the performance of “platform.linkedin.com”, you can see that LinkedIn response times spiked.

In the Web Performance community this kind of event is referred to as a SPoF, or “Single Point of Failure”. The most unfortunate part of a SPoF caused by a vendor tag is that the site’s end users will blame the site itself for the issue. End users have no idea what is happening behind the scenes, or which social or ad-serving tag is causing the slowness.

You can see the impact on the overall page performance below:

The Fragility of the Web and the Security Nightmares

The incident clearly shows how fragile and delicate the web can be. Third party tags can easily impact performance of a website and they can become gateways to users on thousands of sites.

Imagine for a moment that LinkedIn’s DNS or servers had been hijacked (as the media first assumed). The effect on the web would have been devastating. Not only would visitors to linkedin.com have been impacted, but so would any visitor to a webpage that embedded the LinkedIn tag. The hijacker would have had the ability to take down a large portion of the web, deliver malicious software to millions of people, or steal who knows what information.

Lessons Learned

So what can websites do to deal with such situations?

Before you put a tag on the webpage, make sure you know who you are doing business with, understand how secure and scalable their system is, and, most importantly, ensure you are protected by an SLA and a contract with that vendor.
Ensure third-party tags load asynchronously, so they don’t block the content. When possible, mitigate security risks by placing them in iframes served from a different domain than the webpage. There are plenty of articles on this topic, or you could simply rely on a Tag Management System. This way, your team can quickly remove unresponsive tags and react immediately to third-party issues.
Monitor, monitor, monitor. Always know how your site and providers are performing, and have an alerting mechanism in place to notify you of performance degradation.
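
As a rough illustration of the alerting idea in the last point, the sketch below compares recent response times for a third-party host against a baseline and flags degradation. The `is_degraded` function, its thresholds, and all sample values are hypothetical; a real monitoring setup would tune these to its own data.

```python
from statistics import median
from typing import Optional, Sequence


def is_degraded(
    baseline_ms: Sequence[float],
    recent_ms: Sequence[Optional[float]],
    slow_factor: float = 3.0,       # assumed threshold: 3x the baseline median
    max_failure_rate: float = 0.2,  # assumed threshold: 20% failed checks
) -> bool:
    """Return True if recent checks look degraded relative to the baseline.

    baseline_ms must be non-empty. In recent_ms, None marks a failed check.
    """
    if not recent_ms:
        return False  # nothing measured yet, so nothing to alert on
    successes = [s for s in recent_ms if s is not None]
    failure_rate = 1 - len(successes) / len(recent_ms)
    if not successes or failure_rate > max_failure_rate:
        return True
    return median(successes) > slow_factor * median(baseline_ms)


# Hypothetical samples, in milliseconds, for a third-party tag host.
baseline = [100.0, 110.0, 90.0]
print(is_degraded(baseline, [105.0, 95.0, 120.0]))       # normal variation
print(is_degraded(baseline, [900.0, 1000.0, 1100.0]))    # response time spike
print(is_degraded(baseline, [100.0, None, None, 95.0]))  # half the checks failed
```

Using the median rather than the mean keeps a single slow sample from triggering an alert, while the separate failure-rate check catches outages in which requests never complete at all.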

Lastly, kudos to the LinkedIn, Confluence Labs, and Network Solutions teams for their quick reactions and transparency during the resolution and post mortem following – an example every company should follow.

Mehdi – Catchpoint
