Platform maintenance and workflow recycles

There are several cases in which Workflows must shut down and restart. We call this a recycle. This article describes that behavior and explains how to design Workflows that execute infrequently or run for a long time.

Platform Maintenance

Platform updates are published to our Cloud environments many times a month. All scheduled updates are published at http://status.flowgear.net/ and users are welcome to subscribe to notifications for these changes.

During maintenance, sets of engine servers are shut down on a rolling basis. When this occurs, Workflows are given up to 5 minutes to shut down gracefully.

Within 15 minutes, Always On Workflows will be restarted on an updated engine instance.

Daily recycle

Runtimes restart once every 24 hours. The specific recycle time depends on the region the tenant is provisioned in and is generally set to early morning for that region. Create a support ticket to confirm or adjust the recycle time for your tenant.

During a recycle, all Always On Workflows will shut down and then start again (usually within five minutes). There is no disruption to the ability to service new workload (e.g. API invokes) although there will be some caching overhead on first invokes on a freshly started runtime.

Hourly Workflow continuation

To conserve memory allocated by a Site, Flowgear will perform a continuation of an Always On Workflow hourly provided the following conditions are met:

  • The workflow is not being debugged in the designer
  • The workflow is currently executing a Trigger Node
  • The Trigger Node has invoked at least once (i.e. Nodes downstream from the Trigger have executed implying at least one iteration of the workflow)
  • The workflow has been running for at least a specified amount of time (currently this is configured to 1 hour)

In such instances, the Workflow will be stopped and immediately restarted. We call this a continuation rather than a recycle because:

  • It restarts immediately rather than by waiting for our Always On monitor to restart it
  • It restarts on the same engine server (load balancing is bypassed)
  • It restarts from within the engine server (our API, cache and other resources are bypassed)

A Workflow continuation can be identified by examining the Mode element of the InitilisationXml Property on the Start Node, which will contain the value AutoStart (Continuation).
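The four continuation conditions listed above can be read as a single predicate. The following is a minimal sketch of that logic only. The WorkflowState type and its field names are hypothetical and are not part of any Flowgear API; the 1 hour threshold is the value the text describes as currently configured.

```python
from dataclasses import dataclass
from datetime import timedelta


# Hypothetical snapshot of a running Always On Workflow.
# These field names are illustrative; they do not come from Flowgear.
@dataclass
class WorkflowState:
    being_debugged: bool          # open in the designer debugger
    executing_trigger_node: bool  # currently waiting in a Trigger Node
    trigger_invocations: int      # completed iterations since start
    running_for: timedelta        # time since the Workflow started


# Assumed threshold, taken from "currently this is configured to 1 hour".
MIN_RUNTIME = timedelta(hours=1)


def eligible_for_continuation(
    state: WorkflowState, min_runtime: timedelta = MIN_RUNTIME
) -> bool:
    """Return True only when all four conditions from the list hold."""
    return (
        not state.being_debugged
        and state.executing_trigger_node
        and state.trigger_invocations >= 1
        and state.running_for >= min_runtime
    )
```

A Workflow that has run for two hours but whose Trigger has never invoked would not qualify, because no iteration has completed yet.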

Designing infrequently triggered workflows

Due to the recycle architecture described above, workflows that run infrequently (e.g. once a day) are not guaranteed to fire if a recycle occurs at precisely the time they are set for.

If it is essential that a Workflow fires only once within a very narrow time window, use a Trigger like Day Scheduler to configure a window of approximately 1 hour with an interval of between 1 and 5 minutes.

After the Day Scheduler, read a Key/Value to determine whether the Workflow has already fired for the specific period (e.g. day). If it has not, set the Key/Value and begin execution of the Workflow.

If you require this behavior on multiple Always On Workflows, consider factoring this logic into a sub-Workflow that, given a Workflow ID, returns a boolean indicating whether the Workflow should run (i.e. whether or not it has already run for the current day).
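The run-once guard described above can be sketched outside of Flowgear as follows. This is an illustration of the logic only: the should_run function, the key format and the in-memory dictionary standing in for the Key/Value store are all assumptions, not Flowgear features.

```python
from datetime import datetime


def should_run(workflow_id: str, now: datetime, store: dict[str, str]) -> bool:
    """Return True the first time this is called for a Workflow on a given day.

    `store` is a hypothetical stand-in for a Key/Value store; any
    persistent mapping with the same get/set behavior would do.
    """
    key = f"last-run:{workflow_id}"   # assumed key naming scheme
    today = now.date().isoformat()    # the "period" is one calendar day
    if store.get(key) == today:
        return False                  # already fired for this period
    store[key] = today                # mark the period, then allow the run
    return True


# Usage: the scheduler ticks every 5 minutes inside a 1 hour window.
store: dict[str, str] = {}
ticks = [datetime(2024, 1, 15, 2, m) for m in range(0, 60, 5)]
fired = [t for t in ticks if should_run("nightly-export", t, store)]
# Only the first tick of the window triggers the Workflow body.
```

If a recycle swallows the first tick, a later tick in the same window still fires because the key has not yet been set for that day.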

See Sample Workflow for an example of how to implement such a Workflow.

Designing long-running Workflows

As Workflows may be required to shut down at any time, it is important to implement a pattern that enables them to resume from their last position when they re-invoke.

We recommend using the ETL pattern on long-running Workflows to achieve this. Specifically, use of Reduce will ensure that they do not unnecessarily re-process data they've already processed.
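The idea behind this recommendation can be illustrated generically: record a fingerprint of each item once it has been handled, and on the next run filter the source down to items that are new or changed. The sketch below is not Flowgear's Reduce Node. The function names, the dictionary used as the stored state and the "id" key are assumptions made for illustration.

```python
import hashlib
import json
from typing import Callable


def fingerprint(record: dict) -> str:
    """Stable hash of a record's content, used to detect changes."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def reduce_records(records: list[dict], seen: dict[str, str], key: str = "id") -> list[dict]:
    """Return only the records that are new or changed relative to `seen`."""
    return [r for r in records if seen.get(str(r[key])) != fingerprint(r)]


def process_all(
    records: list[dict],
    seen: dict[str, str],
    handle: Callable[[dict], None],
    key: str = "id",
) -> None:
    """Handle pending records, checkpointing after each one.

    `seen` stands in for persistent state; if the run is interrupted,
    the next run skips everything that was already checkpointed.
    """
    for record in reduce_records(records, seen, key):
        handle(record)
        seen[str(record[key])] = fingerprint(record)
```

Because the checkpoint is written after each record rather than at the end of the batch, a shutdown partway through costs at most one repeated record on the next run.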