Platform maintenance and workflow recycles

There are several cases in which Workflows must shut down and restart. We call this a recycle. This article describes recycle behavior and the architecture you should use when building Workflows that run infrequently or run for a long time.

Platform Maintenance

Platform updates are generally pushed to our Cloud environments once a week. All scheduled updates are published at http://status.flowgear.net/ and you're able to subscribe to notifications for these changes.

During maintenance, we progressively roll out updates for each tenant cluster, one node at a time. Each node is given time (up to 5 minutes) for its Workflows to shut down gracefully.

Following a recycle, Always On Workflows will be brought back online within 2 - 15 minutes.

Daily recycle

Runtimes restart once every 24 hours. The specific recycle time depends on the region in which the tenant is provisioned and is generally configured to be early morning for that region. Create a support ticket to confirm or adjust the recycle time for your tenant if needed.

During a recycle, all Always On Workflows will shut down and then start again (usually within 5 minutes). The ability to service new workloads (e.g. API invokes) is not disrupted, although there will be some caching overhead on the first invokes on a freshly started runtime.

Hourly Workflow continuation

To recover memory allocated by a Site, Flowgear will perform a continuation of an Always On Workflow hourly, provided the following conditions are met:

  • The Workflow is not being debugged in the designer
  • The execution cursor of the Workflow is currently within a Trigger Node
  • The Trigger Node has invoked at least once (i.e. Nodes downstream from the Trigger have executed, implying at least one iteration of the Workflow)
  • The Workflow has been running for at least a specified amount of time (currently this is configured to 1 hour)

In such instances the Workflow will be stopped and immediately restarted. We call this a continuation rather than a recycle because:

  • It restarts immediately rather than waiting for our Always On monitor to restart it
  • It restarts on the same node in the cluster (load balancing is bypassed)
  • It restarts from within the runtime node (our API, cache and other resources are bypassed)
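The eligibility conditions listed earlier can be read as a single predicate. The following is a minimal Python sketch of that logic for illustration only. WorkflowState, its fields, and eligible_for_continuation are hypothetical names, not part of any Flowgear API, and the one-hour threshold simply mirrors the currently configured value mentioned above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical constant: mirrors the "currently configured to 1 hour" threshold.
MIN_RUNTIME = timedelta(hours=1)


@dataclass
class WorkflowState:
    """Hypothetical snapshot of an Always On Workflow (not a Flowgear type)."""
    being_debugged: bool        # open in the designer's debugger
    cursor_in_trigger: bool     # execution cursor is inside a Trigger Node
    trigger_invocations: int    # times the Trigger Node has fired
    started_at: datetime        # when this run of the Workflow started


def eligible_for_continuation(state: WorkflowState, now: datetime) -> bool:
    """Return True only when all four documented conditions hold."""
    return (
        not state.being_debugged
        and state.cursor_in_trigger
        and state.trigger_invocations >= 1
        and now - state.started_at >= MIN_RUNTIME
    )
```

All four conditions must hold at once, so a Workflow that is partway through processing (cursor outside the Trigger Node) is never interrupted by a continuation.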

A Workflow continuation can be identified by examining the Mode element of the InitilisationXml Property on the Start Node, which will contain the value AutoStart (Continuation).

Designing infrequently triggered workflows

Due to the recycle architecture described above, Workflows that run infrequently (e.g. once a day) are not guaranteed to fire if a recycle occurs at precisely the time they are set for.

If it is essential that a Workflow fires only once within a narrow time window, use a Trigger like Day Scheduler to configure a window of approximately 1 hour with an interval of between 1 and 5 minutes.

After the Day Scheduler, read a Key/Value to determine whether the Workflow has already fired for the specific period (e.g. day). If it has not, set the Key/Value and begin execution of the Workflow.

If you require this behavior on multiple Always On Workflows, consider factoring this logic into a sub-Workflow that, given a Workflow ID, returns a boolean indicating whether the Workflow should run (i.e. whether or not it has already run for the current day).
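The run-once check described above can be sketched in Python as follows. This illustrates the logic only and is not how a Workflow is built in the designer. The dict stands in for the Key/Value store, and should_run and the "last-run:" key format are hypothetical names chosen for this example.

```python
from datetime import date


def should_run(store: dict, workflow_id: str, today: date) -> bool:
    """Return True the first time this is called for a Workflow on a given day.

    `store` is a stand-in for the Key/Value store (an assumption for this
    sketch); any persistent key/value mechanism would play the same role.
    """
    key = f"last-run:{workflow_id}"   # hypothetical key naming scheme
    period = today.isoformat()        # e.g. "2024-01-31" identifies the day

    if store.get(key) == period:
        return False                  # already fired for this period

    store[key] = period               # mark as fired, then let the caller proceed
    return True


# Usage: the scheduler ticks every few minutes inside a one-hour window,
# but only the first tick of the day is allowed through.
store = {}
print(should_run(store, "wf-123", date(2024, 1, 31)))
print(should_run(store, "wf-123", date(2024, 1, 31)))
```

Because the scheduler fires repeatedly within the window, a tick missed during a recycle is simply picked up by the next one, while the stored value prevents a second run on the same day.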

See Sample Workflow for an example of how to implement such a Workflow.

Designing long-running Workflows

As Workflows may be required to shut down at any time, it is important to implement a pattern that enables them to resume from their last position when they re-invoke.

We recommend using the ETL pattern on long-running Workflows to achieve this. Specifically, using Reduce ensures that they do not unnecessarily re-process data they have already processed.
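To illustrate why skipping already-processed data makes a Workflow safe to restart, here is a minimal Python sketch of the general idea. It is not the implementation of the Reduce Node. The fingerprint and process_pending functions are hypothetical, and the seen set is assumed to be persisted between runs in a real system.

```python
import hashlib
import json


def fingerprint(record: dict) -> str:
    """Stable hash of a record's content, used to recognise repeats."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def process_pending(records, seen: set, handle) -> int:
    """Handle only records not seen before; return how many were handled.

    `seen` is assumed to survive restarts (persisted elsewhere). A record is
    marked as seen only after `handle` succeeds, so an interrupted run
    resumes from the first unprocessed record.
    """
    processed = 0
    for record in records:
        fp = fingerprint(record)
        if fp in seen:
            continue            # already processed in an earlier run
        handle(record)
        seen.add(fp)
        processed += 1
    return processed
```

If a recycle interrupts the run partway through, the next invocation receives the same source data but handles only the records whose fingerprints are not yet recorded.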