Reducer

Removes XPath matches that have previously been processed. Use this Processor to remove data that has not changed since the last integration, for example master data. This reduces the amount of data processed downstream, which improves processing time and reduces bandwidth use.

Warning: This Node has been replaced by Reduce.

Warning
This Node should only be used as an aid to reduce the amount of master data to be processed, i.e. to process only the delta since the last execution. Using this Node to prevent duplication of transactional data is strongly discouraged.

Revision History

3.0.0.1 Added retries on SQL transactions
3.0.0.2 Added deduplication of request. Added more error info.

Properties

Group

Type: String Input
An identifier, unique to the Site, that is used to group the type of data being processed, for example "Customers".

Action

Type: List Input
The processing action to be taken by the Reducer

ReduceCommit - Strip out XPaths that have previously passed through the Reducer and immediately commit the remaining items so they will not be returned on the next iteration.
Reduce - Strip out XPaths that have previously passed through the Reducer, without committing anything.
Commit - Commit the XPaths provided. No data is returned in ReducedXml.

Reset

Type: Boolean Input
When True, indicates that the cache of processed XPaths should be reset.

Namespaces

Type: Multiline Text Input
A list of namespaces (one per line of text), in the form :

XPath

Type: String Input
The XPath expression that returns isolated entities (for example, a single Customer record out of a list of Customers)

KeyXPath

Type: String Input
The XPath to a unique identifier on the node specified with the XPath property. When a commit is done, the node overwrites the last record stored for its key. The key must be present for all nodes and cannot be empty.

SourceXml

Type: Xml Input
Input XML on which the specified XPath will be matched

ReducedXml

Type: Xml Output
The XML document in SourceXml with all previously processed XPaths stripped out. Note that this property is always empty when Action is Commit.

Remarks

The Reducer eliminates data that has already been processed by comparing a hash of each individual entity in an XML document with a persisted cache. Entities whose hash is already in the cache are not returned in ReducedXml.
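The comparison described above can be sketched in a few lines of Python. This is an illustration only, not the Node's implementation: the function names, the use of SHA-256, the canonicalization step and the in-memory set standing in for the persisted cache are all assumptions. It needs Python 3.8 or later for ET.canonicalize.

```python
import hashlib
import xml.etree.ElementTree as ET


def entity_hash(entity):
    """Hash one entity. SHA-256 over canonical XML is an assumed choice."""
    # Canonicalize first so insignificant whitespace does not change the hash.
    canonical = ET.canonicalize(
        ET.tostring(entity, encoding="unicode"), strip_text=True
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def reduce_xml(source_xml, xpath, seen):
    """Strip entities whose hash is in `seen`.

    Returns (reduced_xml, fresh_hashes). Nothing is added to `seen` here;
    the caller decides when to commit the fresh hashes.
    """
    root = ET.fromstring(source_xml)
    # ElementTree evaluates paths relative to an element, so wrap the root in
    # a stand-in document node. This lets a path such as "Levels/Stock" match.
    doc = ET.Element("doc")
    doc.append(root)
    parent_of = {child: parent for parent in doc.iter() for child in parent}

    fresh = set()
    for entity in doc.findall(xpath):
        digest = entity_hash(entity)
        if digest in seen:
            parent_of[entity].remove(entity)  # already processed: strip it
        else:
            fresh.add(digest)
    return ET.tostring(root, encoding="unicode"), fresh


sample = "<Levels><Stock><Code>PROD-001</Code><Quantity>1</Quantity></Stock></Levels>"
cache = set()                                   # stands in for the persisted cache
reduced, fresh = reduce_xml(sample, "Levels/Stock", cache)
cache |= fresh                                  # commit
reduced_again, _ = reduce_xml(sample, "Levels/Stock", cache)
```

On the first pass the Stock record is kept in `reduced`; on the second pass, after the hashes have been committed, the same record is stripped from `reduced_again`.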

The primary use case for this connector is the reduction of master data where a large number of records are being integrated. It can also prevent duplicate integration of transactional data where the source system has no way to avoid returning the same transaction, although the Warning at the top of this page strongly discourages that use.

There are two design patterns for use of the Reducer.

Immediate commit pattern

  • Request data from source
  • Use the Reducer with Action set to ReduceCommit
  • Process the data that is returned in ReducedXml

This pattern should be used whenever you are certain you will never need the same data again once it has been received.

Deferred commit pattern

  • Request data from source
  • Invoke the Reducer with Action set to Reduce
  • Process the data
  • For any records (transactions or master data) that integrate successfully, invoke the Reducer again with Action set to Commit, passing the same XML structure as was passed to the first Reducer

This pattern should be used whenever you may need to request the same data from the source again because it failed to integrate.
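The difference between the two patterns can be illustrated with a small in-memory model. Everything here is hypothetical: the InMemoryReducer class, its method names and the integrate function are stand-ins invented for the example, and records are plain strings rather than XML.

```python
import hashlib


class InMemoryReducer:
    """Hypothetical stand-in for the Node; the real cache is persisted."""

    def __init__(self):
        self.committed = set()

    @staticmethod
    def _digest(record):
        return hashlib.sha256(record.encode("utf-8")).hexdigest()

    def reduce(self, records):
        # Action = Reduce: filter only, nothing is remembered yet.
        return [r for r in records if self._digest(r) not in self.committed]

    def commit(self, records):
        # Action = Commit: remember the records, return nothing.
        self.committed.update(self._digest(r) for r in records)

    def reduce_commit(self, records):
        # Action = ReduceCommit: filter, then remember whatever remains.
        remaining = self.reduce(records)
        self.commit(remaining)
        return remaining


def integrate(record):
    """Hypothetical downstream step that fails for one record."""
    return "bad" not in record


# Deferred commit: only records that integrated successfully are committed.
reducer = InMemoryReducer()
batch = ["customer-1", "customer-bad", "customer-3"]
pending = reducer.reduce(batch)
succeeded = [r for r in pending if integrate(r)]
reducer.commit(succeeded)

# The failed record is still returned when the same batch is requested again.
retry = reducer.reduce(batch)
print(retry)  # ['customer-bad']
```

With ReduceCommit in place of the separate Reduce and Commit calls, the failed record would have been committed along with the others and would not be returned on the next run.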

Using KeyXPath

Consider stock records that run through the Reducer every time the stock level changes. First:

<Levels>
	<Stock>
		<Code>PROD-001</Code>
		<Quantity>1</Quantity>
	</Stock>
</Levels>

And then second:

<Levels>
	<Stock>
		<Code>PROD-001</Code>
		<Quantity>2</Quantity>
	</Stock>
</Levels>

Now, if the stock level drops back to 1, the record will not be returned by the Reducer again because the hash for that record has been marked as seen.

To prevent this from happening, the KeyXPath must be specified. If the XPath is Levels/Stock, then the KeyXPath should be Code.

This enables the Reducer to store the state keyed against the value resolved from KeyXPath, which ensures that a record is re-processed even if it returns to an earlier state.
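A minimal Python sketch of keyed state, using the stock example above. It is an assumption about how such a cache could behave, not the Node's actual code: the function names, the dictionary used as the cache and the hashing scheme are invented for illustration, and each call commits immediately (as ReduceCommit would). It needs Python 3.8 or later for ET.canonicalize.

```python
import hashlib
import xml.etree.ElementTree as ET


def entity_hash(entity):
    # Canonicalize first so insignificant whitespace does not change the hash.
    canonical = ET.canonicalize(
        ET.tostring(entity, encoding="unicode"), strip_text=True
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def changed_keys(source_xml, xpath, key_xpath, state):
    """Return the keys whose entity differs from the state last stored for that key."""
    root = ET.fromstring(source_xml)
    # Wrap the root so that a path such as "Levels/Stock" matches from the top.
    doc = ET.Element("doc")
    doc.append(root)

    changed = []
    for entity in doc.findall(xpath):
        key = (entity.findtext(key_xpath) or "").strip()
        if not key:
            raise ValueError("KeyXPath must resolve to a non-empty value for every node")
        digest = entity_hash(entity)
        if state.get(key) != digest:
            state[key] = digest  # overwrite the last record stored for this key
            changed.append(key)
    return changed


def levels(quantity):
    """Build the sample document with the given stock quantity."""
    return (
        "<Levels><Stock><Code>PROD-001</Code>"
        f"<Quantity>{quantity}</Quantity></Stock></Levels>"
    )


state = {}  # stands in for the persisted cache, keyed by Code
results = [
    changed_keys(levels(q), "Levels/Stock", "Code", state) for q in (1, 2, 1, 1)
]
print(results)  # [['PROD-001'], ['PROD-001'], ['PROD-001'], []]
```

Because only the latest hash is kept per key, the return to quantity 1 is reported as a change, while the final repeat of quantity 1 is not.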