Resolved -
This incident has now been resolved. Errored replicas from this timeframe will be reprocessed.
Apr 25, 07:42 UTC
Identified -
Since April 24 at 7:28 PM UTC, new replicas in training have been encountering errors due to a TTS issue. We're actively coordinating with the TTS provider on a resolution. Once the issue is resolved, affected replicas will be reprocessed and completed.
Note that this impacts only new replicas in training. Existing replicas are not impacted and remain fully operational.
Apr 25, 07:30 UTC
Resolved -
This incident has been resolved.
Apr 18, 09:26 UTC
Monitoring -
Median startup time is around 3 seconds, which is slower than expected. The issue has been traced to a malfunctioning container hosted by one of our GPU providers. A patch has been implemented and is being monitored.
Apr 18, 05:43 UTC