September 2, 2013 - Expiration of legacy Intermediate Certificate Authority Certificate - A legacy intermediate Issuing CA certificate expires at approximately 10:00 PM PDT on 9/2/2013. Access to VPN resources using the respective VPN thick client and native personal certificates may be impacted if certain environmental conditions exist. Please follow this link for more information if you are experiencing VPN client access issues.
April 27, 2013 - At 5:43 AM PDT SecureAuth Operations was alerted to a low disk space issue on one of the primary Web Servers. An investigation into the cause of the disk condition determined that the index service had created a catalog file on the Web Server’s primary OS drive. Because of this low disk space condition, the Web Services could not publish some content. The index service has been reconfigured, and the disk space alert levels will be changed to send notifications earlier to avoid any additional problems. All systems have been tested and are operational.
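The earlier disk space notifications mentioned above could be sketched roughly as follows. This is only an illustration, not the monitoring configuration actually used: the threshold values and the function names `classify_free_space` and `check_path` are hypothetical, since the notice does not state the real alert levels.

```python
import shutil

# Hypothetical thresholds: the notice does not state the actual alert levels.
WARN_FREE_PCT = 20.0   # notify early, while there is still time to act
CRIT_FREE_PCT = 10.0   # services may soon be unable to write to disk


def classify_free_space(total_bytes, free_bytes,
                        warn_pct=WARN_FREE_PCT, crit_pct=CRIT_FREE_PCT):
    """Return 'ok', 'warning', or 'critical' for the given free-space figures."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    free_pct = 100.0 * free_bytes / total_bytes
    if free_pct < crit_pct:
        return "critical"
    if free_pct < warn_pct:
        return "warning"
    return "ok"


def check_path(path):
    """Classify the free space on the volume that contains `path`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return classify_free_space(usage.total, usage.free)
```

Raising the warning threshold is what makes the notification arrive earlier: the alert fires while more free space remains.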
SecureAuth SMS Services
March 5th, 2015 -
At approximately 7:45 PM PST on Thursday, March 5th, SecureAuth hosted services started experiencing DNS issues that have led to a service outage for the Telephony, SMS and Certificate delivery methods of authentication. Other methods of authentication are not affected. We are currently investigating the cause of this outage and will send a follow-up email once the Phone and SMS services have been restored.
We apologize for the service interruption and appreciate your patience during this outage. The resolution will be posted following our investigation into the root cause of the outage.
February 19, 2015
- At approximately 1:00 PM PST on Thursday, February 19th, the SecureAuth cloud services experienced an outage impacting SMS and Telephony services.
Service requests could not be processed because the production database was unavailable due to the failure of the MS SQL Server fail-over cluster environment. SecureAuth engineering was notified of the loss of connection to the SQLAgent and SQLServer services at 1:29 PM PST. A review of the MS SQL cluster environment at that time determined that we could manually migrate services to re-enable SQL Services. The SQL Service application processes were immediately migrated and were confirmed operational again at 1:39 PM PST.
Detail of Issue:
There are two SQL Server nodes in an active/passive cluster managed by Microsoft Cluster Services (MSCS). An MSCS service identified a node connection failure and attempted to move services to the passive node. However, the intermittent loss of virtual network connectivity between the cluster nodes caused the fail-over process itself to fail. The intermittent loss of network connectivity has been attributed to a network adapter driver, which caused an issue with the Microsoft Failover Cluster Virtual Adapter. This adapter is responsible for determining the node’s network configuration and connectivity with other nodes in the cluster.
As a result of this outage, we have taken several actions to improve the reliability of the cluster and prevent a recurrence. The actions taken are as follows:
1) The virtual network adapter firmware and driver were updated on the MS SQL Server nodes as recommended by the vendor. No new cluster fail-over error events have occurred since this update was completed.
2) The MS Cluster disk timeout has been increased from the default of 60 seconds to the Microsoft recommended setting of 120 seconds.
3) We have identified events logged on the systems that indicate non-fatal loss of network connectivity, and we now monitor these events in addition to the main MS SQL Server services.
4) The alerting services provider is being configured to send alerts via email and text message for MS SQL Server cluster fail-over related service error events to avoid delayed notifications going forward.
October 16, 2014
- Between 12:45 AM and 7:30 AM PDT, some of our customers reported that some of their users were unable to receive their OTPs via the SMS option. After an internal investigation found that our systems were running as expected, a support ticket was opened with our Service Provider (Telesign). Their evaluation also concluded that all services were functioning normally. Telesign then inquired upstream with the Carrier Operators about specific non-deliveries, and it was determined that Sprint and Google Voice, among other smaller US operators, were having network issues. T-Mobile users were also affected, which can be attributed to the fact that carriers often share towers. Although the cause of the service interruption had nothing to do with SecureAuth or Telesign uptime, we understand the impact of the disruption and are working with Telesign to streamline the information gathering process so that we can provide more details regarding carrier issues.
September 12, 2013 - SecureAuth SMS Services will be updated starting at 11:00 PM PDT on September 12, 2013 and completing at 1:00 AM PDT on September 13, 2013. There may be short periods of delayed SMS delivery during this window, but we do not anticipate any major delays or outages. UPDATE: The configuration changes were completed, and SMS Services have been tested and are fully functional.
September 3, 2014 - SecureAuth SMS Services experienced slowness on Wednesday, September 3 at approximately 7:50 AM PDT and again on Friday, September 5 at 6:30 AM PDT. Customers may have experienced excessively long wait times or an error message when using the SMS one-time password option. The issue has been resolved and all systems are functioning normally.
Root Cause:
Large numbers of simultaneous SMS requests, and the subsequent SMS provider acknowledgement updates to the SMS table, caused excessive CPU and disk I/O load on the SQL Server cluster servicing the hosted services database. The root cause of this condition was a configuration issue on the SMS table that forced an index scan to be executed instead of an index seek, causing significant resource usage. As resource usage increased, the SQL Server’s ability to process SMS requests slowed.
Additionally, the end user authentication workflow depended on processing the SMS provider’s acknowledgement response and the subsequent table update before refreshing the screen to the Pin Pad where end users enter their one-time password. This dependency further exacerbated the resource condition by extending the length of time the web session was needed between the on-premises appliance and the hosted web services.
Steps taken to resolve this issue:
1. A table configuration enhancement will be applied on 9/12/2014 to the web services database SMS table. It will significantly improve the performance of the update process and free up resources so that the SMS services can scale appropriately. Note: All other tables were reviewed and do not require any additional configuration changes, which is why the other services remained available and were not affected during the SMS outage period.
2. The dependency has been removed by decoupling the synchronous communication between the web services and the database. In effect, this allows the hosted web services to complete the end user’s request and close the web session with the on-premises appliance without waiting for the SMS provider’s acknowledgement response.
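The scan-versus-seek distinction in the root cause above can be illustrated with a small, self-contained sketch. It uses SQLite from the Python standard library purely as a stand-in, not the MS SQL Server cluster described in the notice, and the table layout, column names and index name are all hypothetical. SQLite reports a full pass over the table as SCAN and an index lookup as SEARCH, which roughly correspond to a scan and a seek.

```python
import sqlite3

# Hypothetical lookup: find one SMS row by the provider's message id,
# as an acknowledgement update would need to do.
LOOKUP = "SELECT status FROM sms WHERE message_id = ?"


def query_plan(conn, sql, params=()):
    """Return SQLite's query plan for `sql` as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)  # column 3 holds the plan detail


def build_demo():
    """Create an in-memory table with a hypothetical SMS layout."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sms (id INTEGER PRIMARY KEY, message_id TEXT, status TEXT)"
    )
    conn.executemany(
        "INSERT INTO sms (message_id, status) VALUES (?, ?)",
        [(f"msg-{i}", "sent") for i in range(1000)],
    )
    return conn


def compare_plans():
    """Return the plan for the same lookup before and after adding an index."""
    conn = build_demo()
    before = query_plan(conn, LOOKUP, ("msg-500",))  # no usable index: full scan
    conn.execute("CREATE INDEX idx_sms_message_id ON sms (message_id)")
    after = query_plan(conn, LOOKUP, ("msg-500",))   # index lookup
    conn.close()
    return before, after
```

Calling `compare_plans()` returns the two plan descriptions. Without the index, every lookup touches every row, so the cost grows with table size and with the number of concurrent requests, which matches the load pattern described above.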
June 20th, 2013 - Intermittent SMS delivery issues in the US across multiple mobile carriers - SecureAuth customers may have experienced SMS delivery delays or outages starting Thursday, June 20th at approximately 6:00 AM PDT. These issues were caused by problems with the SMS network routes that connect Telesign, our SMS service provider, to the mobile operators/carriers. Telesign updated the SMS routes that were causing these delays and failures, resolving the issue at 8:32 AM PDT, and continues to monitor and work with the mobile carriers to ensure that no additional routes require updates.
SecureAuth Telephony Services:
SecureAuth Push Notification Services:
SecureAuth Certificate Services
August 22, 2014 - SecureAuth Hosted Services experienced a partial outage on Friday morning, during which some customers experienced CSSL certificate validation errors or errors while attempting to revoke a certificate. Normal certificate services were not affected. The primary cause was a disk space issue on one of the Certificate Authorities. That issue was resolved and all systems are functioning normally.
January 18th, 2014 - SecureAuth Services experienced a temporary outage during which some customers experienced certificate signing issues. This issue was resolved and all systems are functioning as expected. Root Cause: The outage was caused by a local firewall change that blocked TCP port 80 on one of the SecureAuth certificate authority distribution point web servers. A new anti-virus application installed earlier in the week made the change. Resolution: The local firewall on the affected certificate authority distribution point web server was modified to allow the appropriate communications. Going forward: Future functionality test plans will include additional connectivity verification to prevent this and similar service or communication issues.
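A connectivity verification of the kind described above could look something like the following sketch. It only checks that a TCP connection can be opened and is not the test plan actually used; the function name `port_is_reachable` and the host in the usage note are hypothetical.

```python
import socket


def port_is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened.

    A blocked or closed port (for example, one filtered by a local
    firewall rule) raises OSError or times out, and False is returned.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `port_is_reachable("cdp.example.com", 80)` would be run against each distribution point web server after a software installation or firewall change, with the placeholder host replaced by the real server name.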