
The Internet Report

Ep. 9: Outages Become a Night Owl’s Nuisance, and the COVID-19 Impact on Submarine Cables

By Angelique Medina
34 min read

Summary


Watch on YouTube - The Internet Report - Ep. 9: May 18 – May 24, 2020

Welcome back to the Internet Report! On this week’s episode, we cover our usual check-up of ISPs, cloud and collaboration app outages, and we discuss several major middle-of-the-night outages that affected services from providers such as Google and Virgin Media—proving, definitively, that outages never sleep. We’re also joined by TeleGeography’s Alan Mauldin to discuss submarine cables, terrestrial networks, international Internet infrastructure and more.


Finally, don’t forget to leave a comment here or on Twitter, tagging @ThousandEyes and using the hashtag #TheInternetReport.

ThousandEyes T-shirt Offer



Catch up on past episodes of The Internet Report here.

Listen on Transistor - The Internet Report - Ep. 9: May 18 – May 24, 2020

Follow Along with the Transcript

Angelique Medina:
Welcome to the Internet Report. I'm Angelique Medina and I'm joined by my co-host, Archana Kesavan.

Archana Kesavan:
Hey, guys.

Angelique Medina:
And we have a great guest for you today. I'm joined by Alan Mauldin. He comes to us from TeleGeography.

Alan Mauldin:
Hi, happy to be here today.

Angelique Medina:
Alan is a research director there, so he manages the company's infrastructure research group, focusing primarily on submarine cables, terrestrial networks, international Internet infrastructure and bandwidth demand modeling. He also advises clients with due diligence analysis, feasibility studies and business plan development for projects around the world. And we're going to talk to him today about some of the recent articles and research that he's done around submarine cables. So really excited about that, and in terms of what we're going to cover today, so I'm going to go ahead and walk through with Archana, some of the events that have happened last week, out on the Internet. So, overall, it was a pretty quiet week, but we did see a few blips, a few outages, and we'll walk through those today.

Angelique Medina:
So, first thing, before we get into that, just a reminder to everyone to subscribe to the report, if you haven't already. We're on YouTube, we're also on pretty much every podcast platform that you can think of. So, go ahead and do that. So you can get updated when we put out new shows every week, and also just a reminder, we introduced this last week, but we have a virtual summit coming up on June 18th, called the State of the Internet. We have a lot of great speakers that are already lined up, including David Belson of the Internet Society. We have one of the researchers at Verizon Media, and on the content delivery network side, so the EdgeCast side of the business.

Angelique Medina:
And we also have, really excited about, is Geoff Huston of APNIC. He's always a great speaker, and he's going to do a talk looking at the future of the Internet. So, that's really something that you should tune in to watch. If you have an idea of something you'd like to see at the event, or you want to submit a talk, you can reach out to us at InternetReport@thousandeyes.com, and also there's a little link here, you can go ahead and register. Registration is now open.

Angelique Medina:
So with that, let's just briefly touch on some of the overall kind of outage trends. So last week we saw a very slight increase from the previous week, in terms of outages across different provider types. So ISPs, cloud service providers, UCaaS providers and so on. In terms of ISPs, just a very slight increase over the previous week. And overall though, this is very much in keeping with some of the numbers that we saw previous to March. So these are kind of more normal levels that we see, and of course, networks have, service providers have outages all the time, things break. Alan will talk a little bit about cable breaks. This stuff just happens, so overall things are looking pretty good.

Figure 1: Network Outages - Week of May 18, 2020

Angelique Medina:
So, some specifics on what went down. So ... Just go here. We're going to look at a few instances. So this one happened last week, on Wednesday. So this was May 20th, and this was a fairly large outage in Google's network, and it happened on the East Coast. So it was primarily in the New York area, and it impacted users connecting to Google services, during that period. But it took place over an off-peak time. So this was, let's see, this was around 3:00 AM.

Archana Kesavan:
Eastern Time.

Angelique Medina:
Eastern time. Yeah.

Archana Kesavan:
The patterns that you're starting to see here, you see the first spike, which is kind of the more interesting of the few, the outages in Google's network that you're seeing. This was probably the more impactful one, just from the number of interfaces that were impacted here. But as you expand that timeline, you start seeing these smaller outages within Google. Again, the interesting piece here is, this was again, off-peak time, depending on where the outage was.

Angelique Medina:
Yeah.

Archana Kesavan:
Right? So this was … Go ahead.

Angelique Medina:
Yeah, so this took place in Hong Kong, this one here that was a little bit smaller. And this was also at a time that was off-peak for the local users.

Archana Kesavan:
It was like 11:30 PM at night for Hong Kong. Not sure really...

Angelique Medina:
If you're a night owl, that's not off-peak, but.

Archana Kesavan:
But the next one there, I think, I believe that was Moscow and that was around 5:00 PM PST, 3:00 AM Moscow time. So again, following that same trend of an off-peak outage. Probably the reason we didn't hear much about it in the news, was regionally it was really an off-peak time.

Angelique Medina:
Yeah, absolutely. I mean, that's a key thing, because in some instances we'll see outages that are pretty widespread or big. But because of when they took place, they haven't really impacted users, so there's not a lot of chatter about them. So that's good. Didn't seem to impact overall kind of user experience of Google services. We also did see an outage event that took place in the UK, within one of the sort of, I guess, sister networks to Virgin Media. So their parent company, Liberty Global, has UPC, which is kind of more of their backbone part of their network. And then, they also have Virgin Media.

Archana Kesavan:
Virgin Media.

Angelique Medina:
And this particular outage took place, I think it was like...

Archana Kesavan:
This was also around midnight. It was on the 19th here, for us in the US. And so, the timeline that you see here is in PST. So, it was midnight and it's kind of the same pattern, right? Both of these outages, the Google one that we just saw and then the Virgin Media or Liberty Global one, were around that off-peak, midnight time. This one lasted almost five hours, though.

Angelique Medina:
Yeah. Yeah again, I mean, it took place over a period that if it happened during the day, for those local users, would have probably, there would have been more visibility on that. But, it was in the middle of the night and you can see here, users from around the globe. So we have Europe as well as India, even parts of the United States connecting across a variety of transit providers. So through Zayo, Tata, Level 3, when they're entering the backbone network. So UPC, there's an outage event and that's impacting reachability of Virgin Media, their network and other services.

Angelique Medina:
So again, it had a kind of variation in terms of the number of interfaces involved. So, a little bit higher here. We still saw some outage activity during these points here. So over multiple hours, and then had a big spike here. Still some problems in their network during this period, and then this was kind of the final peak.

Archana Kesavan:
Yeah. Some of the ... This, as you guys, if you've been hearing us for a while now. This is kind of the higher level, bird's eye view of Internet health. Right? There's actually filters down and specific tests that are being monitored to specific services. And one of the services that we were monitoring, the five-hour dip that we saw was from there actually. We had 100% packet loss within Virgin Media's network. It lasted for about five hours, around the same time period ...

Angelique Medina:
Yeah, and one of the reasons why you'll see certain periods of the outage being surfaced here versus other times, is to eliminate a lot of just noise. You may have instances in which just one interface on a router just has a little blip, and we're not going to surface that as an outage, because there would just be too much of it. So there has to be a certain threshold that's met, at a particular time, through a particular network in order to kind of get surfaced here as an outage. So.
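The thresholding idea described above can be sketched in a few lines. This is only an illustration under stated assumptions, not ThousandEyes' actual detection logic: the `surface_outages` function, the `MIN_INTERFACES` value, and the sample network names are all hypothetical. It groups interface failures by network and time window, then surfaces only the groups where enough distinct interfaces were affected.

```python
from collections import defaultdict

# Hypothetical threshold: the minimum number of distinct interfaces that must
# fail in one network, within one time window, before we call it an outage.
MIN_INTERFACES = 3


def surface_outages(failures, min_interfaces=MIN_INTERFACES):
    """Return {(network, window): interface_count} for groups meeting the threshold.

    failures: iterable of (network, window, interface_id) tuples.
    """
    affected = defaultdict(set)
    for network, window, interface_id in failures:
        # A set keeps repeated reports of the same interface from inflating the count.
        affected[(network, window)].add(interface_id)
    return {
        key: len(interfaces)
        for key, interfaces in affected.items()
        if len(interfaces) >= min_interfaces
    }


# Made-up sample data: three interfaces fail together in one network,
# while another network sees a single-interface blip.
sample = [
    ("net-a", "00:00", "if1"),
    ("net-a", "00:00", "if2"),
    ("net-a", "00:00", "if3"),
    ("net-b", "00:00", "if9"),
]
print(surface_outages(sample))
```

With this sample, only the `net-a` group is surfaced. The single-interface blip in `net-b` is treated as noise.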

Archana Kesavan:
Yeah.

Angelique Medina:
All right, and then finally there was a transit provider, in this case Hurricane Electric, that had a pretty widespread kind of incident on Friday of last week. So this was the 22nd. So for kind of...

Archana Kesavan:
You're going to see outages...

Angelique Medina:
Yeah, there's a little bit of kind of smaller scale stuff that you see here, but then you see this prolonged period in which there's kind of varying degrees of effect in their network. So early in the incident, and kind of go here, we can look and see it's kind of, we see some Midwest locations. Kansas, Omaha, Colorado and if we go a little bit further along, we can see...

Archana Kesavan:
Moves to the West Coast a little bit.

Angelique Medina:
Yeah, well here, it's still kind of moving. We see some West Coast as well, and at some points it was really just, really at its peak. We just saw, for example, West Coast as well as Tokyo.

Archana Kesavan:
Tokyo.

Angelique Medina:
As well. So it seemed to kind of, throughout this event, go between West Coast where it mostly impacted their network, and then kind of Midwest and West Coast again, and also Tokyo at various points as well. So, and this impacted the reachability of all kinds of services, Amazon, Microsoft and others.

Archana Kesavan:
Hurricane Electric is a pretty well-connected, pretty big transit provider. So, anything like this prolonged outage, with this intensity, is bound to affect other services that pass through their network. I think this was happening around, in the morning, right? PDT.

Angelique Medina:
That's right. So it started around 5:50-ish or so, and I think concluded around 7:50 AM. I mean, that's definitely not off-peak for that particular, for the West Coast.

Archana Kesavan:
Yeah.

Angelique Medina:
So, that would have been noticed, if you had a provider you were connected to that also then peered with Hurricane Electric, and your traffic transited through them. Then there may have been some impact in terms of the reachability of some services. So that was the other major incident that we saw last week, but again, these things happen. We see major events happen throughout the year, regardless of other kind of externalities, other things that are happening in terms of users. So, this is still kind of in the range of normal.

Archana Kesavan:
Mm-hmm (affirmative).

Angelique Medina:
So with that, I wanted to kind of switch gears here and talk a little bit about some of the work that Alan's done at TeleGeography. And he's written multiple articles recently, talking about submarine cable trends. So, specifically around the impact of COVID-19 on cable-sea operations and project roll-outs. But it's been, what was this? Was this back in March that you put that out?

Alan Mauldin:
Yeah, it's about two months ago, I think.

Angelique Medina:
Yeah. Yeah. So what's changed?

Alan Mauldin:
Yeah, so whenever the lockdowns first started happening, there was concern about the ability for cables to stay in service, and cables to continue being deployed and upgraded throughout the world. And so, how was it going to impact it? And so we would look at these different areas and see what would the impact possibly be. So, the real challenge was the travel restrictions and the quarantines, the ability to move people on and off ships, the challenges to keep them healthy on the ships. Also, the challenge is to have people in a factory working very closely together to make the cable, right?

Alan Mauldin:
So there was a factory that did shut down, actually two factories that did shut down for about a month or so, actually, it's now open again though, which is good news. So, there could be some small delays in cables that are going to be built and deployed. There have also been some challenges with the installation, in terms of getting the permits needed to access the waters, and the ports that you have to get to, to put the cable into the water, right? So.

Angelique Medina:
Right.

Alan Mauldin:
Those are being managed so far, as I understand it. Most of the challenges, it's been a lot of work among parties to make things go smoothly. But so far things have been going I think pretty well, as we understand it. The other big issue is trying to maintain the cables, right? As you mentioned before, cables do break all the time. On average, there's about one fault every three days somewhere in the world, right?

Angelique Medina:
Wow.

Alan Mauldin:
It's not sharks, that's the myth as we all know, right? It's anchors, it's fishermen, it's earthquakes, typhoons, things like this that are causing the damage to cables.

Archana Kesavan:
It's more than fat-fingering, you mean.

Alan Mauldin:
Pardon me?

Archana Kesavan:
It's more than fat fingering and creating an outage, it's actually naturally.

Alan Mauldin:
It's a physical thing that really...

Angelique Medina:
Damage, yeah.

Alan Mauldin:
To the cables, yeah. So...

Angelique Medina:
But do we know that, so you mentioned some recent breaks. Has there been any delay in fixing some of these breaks, as a result of maybe more red tape or kind of process that's been slowed down?

Alan Mauldin:
Well we have heard, from people in the industry who are involved in doing this, they've been able to perform repairs pretty well so far. Able to keep the same crew on board. The crew has been safe. They don't have COVID. They're keeping them all safe. The ship's actually very safe for them, if it's a secured environment and they've been able to access the water where repairs have been required. There's been a few faults just in the past week, that we can mention. There was a fault on the Europe India Gateway cable, which was near the coast of Morocco. That was on May 20th. It's been fixed now. Off the coast of Europe...

Archana Kesavan:
Not to interrupt, but what's kind of the average time frame to fix a cable fault like this, based on your ...

Alan Mauldin:
A few weeks, depending on the ability to locate the fault, how severe the fault is, the weather. Multiple things play a role in trying to get a cable repaired-

Angelique Medina:
So during that time ... Sorry to interrupt again. What would be the impact on an average user?

Archana Kesavan:
Yeah.

Alan Mauldin:
Well, oftentimes the user wouldn't even know a difference at all. If the network provider has procured enough capacity on enough diverse paths, they can accommodate a failure. It could just be a small blip. There was also one off the coast of Africa, on the ACE cable. It's had four faults this year, right? Many countries on the West Coast of Africa have fewer cables. Thus, the loss of one cable could have a greater impact on the user experience in those countries.

Alan Mauldin:
The AAG Cable, which has been really fault prone, has had two faults in the past month. The fault last week was off the coast of Vietnam. And so, it won't be repaired until June 2nd apparently. So that's a little bit longer to fix that one, but once again, cables do break quite frequently, and so having a lot of cables is important and putting capacity in many different paths is absolutely vital. And that's what everybody does. You can't just rely on two. You need three, four, as many as you can get.
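The point about spreading capacity across many cables can be illustrated with a small worked example. This is a minimal sketch with made-up numbers: the cable names, capacities, demand figure, and both helper functions are hypothetical and do not describe any real operator's planning model. It checks whether demand still fits after losing any single cable, and how much capacity remains if the largest cables fail together.

```python
def survives_single_fault(capacity_by_cable, demand):
    """Return True if demand still fits after losing any one cable."""
    total = sum(capacity_by_cable.values())
    return all(total - cap >= demand for cap in capacity_by_cable.values())


def worst_case_remaining(capacity_by_cable, simultaneous_faults=1):
    """Capacity left if the largest `simultaneous_faults` cables fail together."""
    caps = sorted(capacity_by_cable.values(), reverse=True)
    return sum(caps[simultaneous_faults:])


# Hypothetical plans with the same total capacity (20 units), split differently.
two_cables = {"cable-1": 10, "cable-2": 10}
four_cables = {"cable-1": 5, "cable-2": 5, "cable-3": 5, "cable-4": 5}
demand = 12  # hypothetical demand, in the same units as the capacities

print(survives_single_fault(two_cables, demand))   # one fault leaves 10 units
print(survives_single_fault(four_cables, demand))  # one fault leaves 15 units
print(worst_case_remaining(four_cables, simultaneous_faults=2))
```

In this toy case, the two-cable plan cannot carry the demand after one fault, while the four-cable plan can. With two simultaneous faults, even the four-cable plan falls short of the demand, which is consistent with wanting as many diverse paths as possible.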

Angelique Medina:
Right.

Archana Kesavan:
Right.

Angelique Medina:
So from a provisioning standpoint, as well as from a redundancy standpoint, that's sort of factored in, as these projects are rolled out and you want to have redundant cabling, you want to over-provision so you just have that buffer in place, in case one of the cables is damaged in some fashion.

Alan Mauldin:
Yeah, absolutely. I mean, since cables do break often, the operators are quite aware of what could happen and build their networks accordingly, knowing that the cables do tend to have faults.

Archana Kesavan:
And Alan, you mentioned that COVID hasn't necessarily, like maybe it's a small bit of a delay, right? In terms of maintenance, but you also mentioned that in terms of some of the new cable lines that were in progress, it's still moving, with maybe a little bit of delay. Is there anything that's significant there from a delay perspective, that you think can actually impact the end user, if it didn't come up at the right time?

Alan Mauldin:
Yeah, I don't think so right now. At least with what we're seeing, the delays are in terms of weeks and months so far, it seems, not in terms of a whole year. Cables are delayed oftentimes for other reasons. Getting the permits is a huge problem even without COVID, and now with COVID, governments are kind of shut down as well. That's a big problem as well, and weather can be a problem as well.

Alan Mauldin:
But I think for the end user, it's important to realize that most of the cables that are currently in service, they have a lot of capacity that can still be activated and added without these new cables. So you're seeing upgrades still taking place. It's a little harder these days, you have to get the equipment on site, to get people there to install. The upgrades can be a challenge apparently, but there's still sufficient capacity. And even if cables were delayed over a year, it would still be fine.

Archana Kesavan:
Still be okay.

Archana Kesavan:
Is there any ... As you're talking about the permits and permits can always be delayed. Is there any kind of priority that's being given by governments, now that the Internet is kind of in the forefront? In the backdrop of COVID being so critical, is there any prioritization that's happening to give permits out faster at this time?

Alan Mauldin:
Yeah, so the industry has done a lot of work. In particular, the International Cable Protection Committee has issued a call to action, which is trying to highlight to governments around the world that cables are an essential service, that the workers who are working on them are essential employees. I think that's apparently had an impact in helping to raise the profile of cables among governments around the world.

Archana Kesavan:
Yep. Okay.

Angelique Medina:
And it sounds like in your recent article where you unpack this, you had mentioned the projects in question are, that these are not things that are necessarily planned imminently. These are things that are going to be rolling out maybe in a year or even more. So it sounds like the projects are kind of staged in such a way, that there's actually quite a lot of buffer between now and when the completion date is for some of these cables to get rolled out. So there may be an opportunity to kind of make up, if there are delays or if there's any issue.

Alan Mauldin:
Yeah, that's a very good point. Cables take a long time to build. To do the planning, the surveying, to build the cable, to deploy the cable, it's a long process. So for example, just last week there was a cable announced, it's called the 2Africa Project. It's a massive 37,000 kilometer cable that's going to go on both coasts of Africa, to Europe and the Middle East as well. This cable was announced, just as I said, last week. It's going to be in service during 2023, 2024. So it's a long rollout period here. So there's plenty of time for these projects to adjust, given the changes we're seeing with COVID and other things that are slowing projects down potentially.

Archana Kesavan:
That makes a lot of sense.

Angelique Medina:
Yeah. I mean, one thing that we've also talked about is just, in terms of the investors in these cable projects, right? I mean, carriers traditionally have, in some cases, formed alliances and rolled out these cables. But we're seeing, and maybe kind of give us a sense of kind of the timelines here, but content providers have been much more actively involved in kind of sponsoring and rolling out these cable projects.

Alan Mauldin:
Yeah, absolutely. That's where the biggest change is probably in the last 10 years or so, is the role of content providers. It's really Google, Facebook and, to some degree, Amazon and Microsoft as well, who have taken the decision to not just buy capacity on the cables, but to actually be involved in building them and trying to join a consortium to build a cable, or in the case of Google, build your own cable yourself. It started in 2010 when Google invested in the cable from Japan to the US called Unity. And since then, you've seen other companies getting involved in this. And really, the focus is on the major transoceanic routes. The Atlantic, the Pacific, and within Asia.

Angelique Medina:
So you primarily see it on the Pacific, it's between the West Coast and Asia that you're seeing more of a focus for the content providers?

Alan Mauldin:
Sure, it's primarily for the inter-data center links, and has been the focus of most. But, as we're seeing with the 2Africa cable, that involves Facebook. Also Equiano, it's a Google cable also going along the coast of Africa as well. So the investment is not just on these major routes, it's going to other areas as well and it's growing very, very, very quickly.

Archana Kesavan:
Alan, one of the questions, as you're talking through these content providers investing in these cables. The question is, do they load share like a major cable link, or are they making investments to have the private connectivity across data centers?

Alan Mauldin:
Well in terms of how they're building it, I mean, they want to have capacity on multiple different cables as well. And so, the goal of the investment is really to acquire a large enough capacity, usually an entire fiber pair, and how they choose to use that is up to them. But a lot of the cables are being built now in other parts of the world because they can't get fiber pairs. There are no free pairs on current cables, so if they do a new cable, they can get more access to pairs. Also, newer cables can carry more pairs, and more pairs leads to a lower unit cost. So it's really about trying to keep the cost down, to boost the capacity as high as possible, and also to go to certain spots where they want to go, which is where they have their data centers.

Archana Kesavan:
Got it, got it. One of the trends that we've seen across the big cloud providers, AWS for instance or Google, is this whole monetization of their backbone that they have kind of been pushing forward. They have services that you obviously get a little bit more priority, and it basically means you get better performance because you're using their backbone, you know? So it almost feels like all of the investments that they're making in the submarine cables, in terms of getting their backend and infrastructure up and going, they have to monetize it some way. So it kind of makes sense that over time, we're going to see more services that you pay extra for, using their backbone for better performance.

Angelique Medina:
Yeah. So, any final thoughts on sort of the state of submarine cables and operations, Alan?

Alan Mauldin:
I think it's just, what we've seen in the past two months has just been really a testament to how important and reliable these cables are, that are lying on the bottom of the ocean floor. There have not been any major outages or major problems caused by this global pandemic. So I think that, ultimately, it's been a test of the industry and the industry so far has risen to the challenge. So I think it's an encouraging story ultimately, and highlights the important role that cables do play for trying to connect all of us globally together.

Angelique Medina:
Yeah, absolutely ... Great. Well, that's a great place to conclude for today. So Archana, why don't you share with readers, or excuse me, listeners, where they can get this free working from home T-shirt?

Archana Kesavan:
Yeah. Email us or leave a review on all of those podcast channels Angelique mentioned before, Spotify, Stitcher, wherever you get your podcasts, and email us with your address and size, and we'll send you our latest T-shirt. And, with that, we'll close today's show and we'll see you next week.
