Queries Not Running and App Not Loading
Incident Report for Sisense for Cloud Data Teams
Postmortem

The following is the incident report for the Sisense incident that occurred on October 13th, 2020 between 12:40 PM PDT and 8:16 PM PDT. This issue affected all customers using the Sisense for Cloud Data Teams web application to varying degrees. For brief periods between 12:40 PM and 12:48 PM PDT and between 4:23 PM and 5:01 PM PDT, customers may have lost access to the application. More commonly, customers were unable to run new queries during the outage period. We understand the effect this had on our customers and sincerely apologize. We have taken a number of steps to prevent this issue from recurring, as detailed further below.

ISSUE SUMMARY

Our investigation found that a database used in running customer queries was unable to accept new writes, effectively becoming read-only. Further investigation revealed that the database was overloaded by high contention for access to shared memory. Resetting and failing over the database cleared the contention, but it quickly reappeared because bloated indices were consuming excessive shared memory.

Once the database became effectively read-only, customers weren’t able to run new queries because the new query requests couldn’t be written to the database.

To restore application availability, the database was restarted, the affected tables were re-indexed, and all associated services were restored. This restart and re-index cycle took place three times during the incident, as multiple tables were affected and could not all be resolved in a single pass. Once complete, all customers were able to access Sisense for Cloud Data Teams and run queries as normal.
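
As an illustration only, the re-index step might be scripted along the following lines. This report does not name the database engine or the commands that were run; the sketch assumes PostgreSQL, whose `REINDEX TABLE` command rebuilds every index on a table (the `CONCURRENTLY` option requires PostgreSQL 12 or later). The function name and the table names are hypothetical.

```python
import re
from typing import Iterable, List

# Conservative pattern for unquoted SQL identifiers, so that table names
# can be interpolated into a statement without quoting concerns.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def reindex_statements(tables: Iterable[str], concurrently: bool = True) -> List[str]:
    """Build one REINDEX statement per table (assumes PostgreSQL syntax)."""
    keyword = "REINDEX TABLE CONCURRENTLY" if concurrently else "REINDEX TABLE"
    statements = []
    for table in tables:
        if not _IDENTIFIER.fullmatch(table):
            raise ValueError(f"unsafe table name: {table!r}")
        statements.append(f"{keyword} {table};")
    return statements


if __name__ == "__main__":
    # Hypothetical table names, for illustration only.
    for statement in reindex_statements(["query_requests", "query_results"]):
        print(statement)
```

Generating the statements separately from executing them lets an operator review the list of affected tables before anything runs against the database.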

REMEDIATION

We are confident that we have identified appropriate corrections and are equipped to handle any similar outages in the future. The team is committed to creating the most reliable data platform possible. In response to this issue, we’re making key improvements to our infrastructure:

  • Tuning our autovacuum settings to clear dead rows more frequently.
  • Adding monitoring around index bloat, which will alert on-call engineers before similar situations can occur in the future.
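
The two improvements above can be sketched roughly as follows. This is a minimal illustration, not the configuration or monitoring actually deployed. It assumes PostgreSQL, where autovacuum processes a table once its dead rows exceed `autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * row count` (documented defaults: 50 and 0.2). The bloat check, its 2.0 ratio limit, the `IndexStats` type, and the index names are all hypothetical.

```python
from dataclasses import dataclass
from typing import List

# PostgreSQL's documented defaults for the two autovacuum settings used below.
DEFAULT_THRESHOLD = 50
DEFAULT_SCALE_FACTOR = 0.2


def autovacuum_trigger_point(row_count: int,
                             threshold: int = DEFAULT_THRESHOLD,
                             scale_factor: float = DEFAULT_SCALE_FACTOR) -> float:
    """Dead rows that must accumulate before autovacuum processes a table."""
    return threshold + scale_factor * row_count


@dataclass
class IndexStats:
    name: str
    actual_bytes: int    # measured on-disk size of the index
    expected_bytes: int  # estimated size if the index were freshly rebuilt


def bloated_indexes(stats: List[IndexStats], max_ratio: float = 2.0) -> List[str]:
    """Return names of indexes larger than max_ratio times their estimate."""
    return [s.name for s in stats
            if s.expected_bytes > 0
            and s.actual_bytes / s.expected_bytes > max_ratio]


if __name__ == "__main__":
    rows = 10_000_000
    # A lower scale factor makes autovacuum run after far fewer dead rows.
    print(autovacuum_trigger_point(rows))
    print(autovacuum_trigger_point(rows, scale_factor=0.01))

    # Hypothetical measurements, for illustration only.
    sample = [IndexStats("queries_pkey", 900, 300),
              IndexStats("users_pkey", 110, 100)]
    print(bloated_indexes(sample))
```

On a large table, the default scale factor lets millions of dead rows accumulate before cleanup starts, which is why lowering it clears dead rows more frequently. A ratio-based alert is one simple way to flag an index for attention before its bloat causes contention.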

If you have any questions, please reach out to our Solutions Team at supportdt@sisense.com or via live chat.

Posted Oct 15, 2020 - 12:15 PDT

Resolved
All systems are now operational. The Sisense Cloud Support Team can be reached at supportdt@sisense.com.
Posted Oct 13, 2020 - 18:15 PDT
Monitoring
The Sisense application is now loading for users and queries are running normally. Engineers are monitoring to ensure continued functionality. The Sisense Cloud Support Team can be reached at supportdt@sisense.com.
Posted Oct 13, 2020 - 17:53 PDT
Update
Engineers are actively investigating the issue.
Posted Oct 13, 2020 - 17:29 PDT
Update
The Sisense application is loading and queries are running as expected. Engineers are continuing to investigate. The Sisense Cloud Support Team can be reached at supportdt@sisense.com.
Posted Oct 13, 2020 - 17:05 PDT
Update
The Sisense application is not loading while engineers restart backend services. The Sisense Cloud Support Team can be reached at supportdt@sisense.com.
Posted Oct 13, 2020 - 16:28 PDT
Update
Engineers are currently investigating the issue with queries not loading and charts missing from dashboards. Updates will be posted on the status page https://status.periscopedata.com/
Posted Oct 13, 2020 - 16:12 PDT
Update
Engineers are continuing to investigate the issue. The Sisense Cloud Support Team can be reached at supportdt@sisense.com or via live chat.
Posted Oct 13, 2020 - 15:45 PDT
Investigating
Queries in the Sisense application are not running for some users. Engineers are actively investigating the issue. The Sisense Cloud Support Team can be reached at supportdt@sisense.com or via live chat.
Posted Oct 13, 2020 - 15:17 PDT
This incident affected: Sisense for Cloud Data Teams.