CloudFlare is having issues with its DNS services.
This may impact sites and services globally.
We are starting to see some of these services return to normal, but we caution that there may be further impact.
More information at https://www.cloudflarestatus.com/
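For anyone who wants to check whether a particular domain is currently resolving, a minimal sketch along the lines below compares answers from Cloudflare's public resolver (1.1.1.1) with another public resolver. This is only an illustrative check: it assumes the third-party dnspython package is installed, and example.com is a placeholder domain.

# Illustrative check only: compare DNS answers from Cloudflare's public
# resolver with another public resolver to see whether a domain resolves.
# Assumes the third-party "dnspython" package; example.com is a placeholder.
import dns.resolver

def lookup(nameserver, domain):
    resolver = dns.resolver.Resolver(configure=False)
    resolver.nameservers = [nameserver]
    try:
        answer = resolver.resolve(domain, "A", lifetime=5)
        return [record.to_text() for record in answer]
    except Exception as exc:  # timeouts, SERVFAIL, NXDOMAIN, ...
        return f"lookup failed: {exc}"

domain = "example.com"
print("Cloudflare (1.1.1.1):", lookup("1.1.1.1", domain))
print("Google     (8.8.8.8):", lookup("8.8.8.8", domain))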
One of our network upstreams has announced network maintenance.
Please note that routing changes may cause a momentary traffic dip during this time frame.
Our own traffic should automatically fail over to secondary ISPs.
Date and time for maintenance window
Start date and time: 2020-05-05 03:00 UTC
End date and time: 2020-05-05 04:00 UTC
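If you want to see whether the routing change touches your own traffic during the window, a simple reachability probe is usually enough. The sketch below is only an illustration using the Python standard library: it measures TCP connect time once per second, so a momentary dip shows up as timeouts or latency spikes. The hostname and port are placeholders.

# Illustration only: probe TCP connect latency once per second during the
# maintenance window; a momentary traffic dip shows up as timeouts or spikes.
# The hostname and port below are placeholders.
import socket
import time

HOST, PORT = "your-service.example.net", 443  # placeholder endpoint

while True:
    start = time.monotonic()
    try:
        with socket.create_connection((HOST, PORT), timeout=2):
            elapsed_ms = (time.monotonic() - start) * 1000
            print(time.strftime("%H:%M:%S"), f"connect {elapsed_ms:.1f} ms")
    except OSError as exc:
        print(time.strftime("%H:%M:%S"), f"FAILED: {exc}")
    time.sleep(1)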
At 04:30 we rebooted one of our core routers to rule out a fault on our end.
We will reboot some other equipment to troubleshoot issues.
We apologize for the disruption.
Update: One core switch was also rebooted manually at 05:50–05:52 (short dip).
We have adjusted network traffic as a stopgap for the flapping network connection.
One of our upstream transits suffered an outage at 19:05. This triggered network re-routing and may have been felt as a momentary network dip.
Traffic has been re-routed to failovers. We’re monitoring the networks.
When: 19:05 – 19:10 (upstream has failed over)
Impact: ~5 min of re-routing, traffic dip
Update: Some more network dips occurred during the evening. The network upstream suspected of causing high CPU usage in our cores has been isolated. We’re continuing to monitor the situation.
When: 20:30 and 23:30 (roughly 2–3 min per incident)
At 22:00 we experienced some routing issues. We are currently investigating the cause.
Cause: One of our upstream transits lost BGP connectivity / flapped
Impact: 5–10 min of routing table rebuilds
Update (22:30): One of our transit operators was having issues. Connectivity recovered after a few minutes.
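From the outside, a transit flap like this is usually visible as a change in the forwarding path. The sketch below is a rough illustration (not our internal tooling): it runs traceroute periodically and reports whenever the hop sequence changes. It assumes a Linux host with the traceroute utility installed; the target hostname is a placeholder.

# Rough illustration (not our internal tooling): run traceroute periodically
# and report when the hop sequence changes, which is how a BGP flap or
# re-route typically looks from the outside.
# Assumes a Linux host with "traceroute" installed; the target is a placeholder.
import subprocess
import time

TARGET = "your-service.example.net"  # placeholder

def current_path(target):
    proc = subprocess.run(
        ["traceroute", "-n", "-w", "2", target],
        capture_output=True, text=True, timeout=120,
    )
    hops = []
    for line in proc.stdout.splitlines():
        fields = line.split()
        # Hop lines start with the hop number; skip any header text.
        if len(fields) > 1 and fields[0].isdigit():
            hops.append(fields[1])  # responding address, or "*" on timeout
    return hops

previous = None
while True:
    path = current_path(TARGET)
    if previous is not None and path != previous:
        print(time.strftime("%H:%M:%S"), "path changed:", previous, "->", path)
    previous = path
    time.sleep(30)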
- 2 x E5-2620v3 Xeon Six-Core CPU
- 2.4GHz Processor Speed
- 128GB RAM
- 2 x 900GB 10k SAS 2.5″ Hard Drives
- PERC H730 RAID
- iDRAC8 Enterprise
- Up to 2 x 2.5″ Hot-Swap Drives
Contact us for a customized setup
At 04:00 we pushed an urgent patch to our routers. This caused BGP sessions to reset.
Traffic was disrupted momentarily until the sessions were re-established.
We’ve applied important security updates to some of our switches in Västberga.
Each switch requires less than 5 minutes to reboot.
Impact: 08:50 – 08:55, 09:50 – 09:55
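Switch reboots of this kind are normally done one device at a time, waiting for each switch to answer on its management port again before touching the next. The sketch below shows only that wait step; the hostnames are hypothetical and the reboot itself is left as a comment, since it depends on the switch platform.

# Simplified sketch of the wait step in a rolling reboot: move on only after
# the switch answers on its management port again.
# Hostnames are hypothetical; the reboot command itself is platform specific.
import socket
import time

SWITCHES = ["sw1.vastberga.example", "sw2.vastberga.example"]  # hypothetical

def wait_until_reachable(host, port=22, timeout=300):
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=3):
                return True
        except OSError:
            time.sleep(5)
    return False

for switch in SWITCHES:
    # ... trigger the reboot here (platform specific, e.g. via SSH) ...
    print("waiting for", switch, "to come back")
    if not wait_until_reachable(switch):
        raise SystemExit(switch + " did not come back within 5 minutes")
    print(switch, "is reachable again")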
Problem: Upstream/ISP router rebooted. Suspected DDoS. The operator is restarting services and restoring connectivity.
Future resolution: Replacing upstream router.
We’ve now bypassed the ISP with connectivity issues and re-routed traffic to a different ISP.
Fault: Optical fiber outage causing network disruption of internet traffic.
Status: Partially remedied (failover path in use).
Work is underway to locate the fiber fault and repair it.
Data storage is not affected by the outage thanks to multipathing (see the path check sketch below).
At 10:40 both the primary and backup paths were recovered, and full redundancy has been restored.
Further investigation into the failed optical fiber will follow.
Work is being undertaken to improve and speed up the fail-over of fiber redundancy for inter-DC connectivity.
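Regarding the multipathing mentioned above: on a Linux host using device-mapper multipath, the state of the individual storage paths can be inspected with multipath -ll. The sketch below is a rough heuristic only, not our storage tooling: it scans that output for paths reported as failed. It assumes multipath-tools is installed and usually needs root.

# Rough heuristic only (not our storage tooling): scan "multipath -ll" output
# for paths reported as failed, so a lost fiber path is noticed even while
# I/O continues on the surviving paths.
# Assumes a Linux host with multipath-tools installed; usually needs root.
import subprocess

out = subprocess.run(
    ["multipath", "-ll"], capture_output=True, text=True, check=True
).stdout

failed = [line.strip() for line in out.splitlines() if "failed" in line]
if failed:
    print("Degraded paths detected:")
    for line in failed:
        print(" ", line)
else:
    print("No failed paths reported.")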
At 09:00 a switch failure occurred in our core infrastructure, which burnt out optical fiber transceivers.
Remediation: By 09:15 the fault had been found and all network paths had been migrated to secondary fiber connections.
The faulty switch has been replaced and redundancy restored. We’re planning to perform maintenance work (future scheduled window) to implement new protocols for automated fail-over of fiber paths.