This class provides context and methods for aggregating records. More...

Inherits object.

Public Member Functions
def	__init__
	Construct an aggregator instance. More...

def	isNullAggregation
	Return True if no aggregation will be performed, either because the aggregationInfo was None or all aggregation params within it were 0.

def	next
	Return the next aggregated record, if any. More...

Detailed Description

This class provides context and methods for aggregating records.

The caller should construct an instance of Aggregator and then call the next() method repeatedly to get each aggregated record.

This is an example aggregationInfo dict: { 'hours': 1, 'minutes': 15, 'fields': [ ('timestamp', 'first'), ('gym', 'first'), ('consumption', 'sum') ], }

Constructor & Destructor Documentation

def __init__	(	self,
		aggregationInfo,
		inputFields,
		timeFieldName = `None`,
		sequenceIdFieldName = `None`,
		resetFieldName = `None`,
		filterInfo = `None`
	)

Construct an aggregator instance.

Params:

aggregationInfo: a dictionary that contains the following entries
- fields: a list of pairs. Each pair is a field name and an aggregation function (e.g. sum). The function will be used to aggregate multiple values during the aggregation period.
- aggregation period: 0 or more of unit=value fields; allowed units are: [years months] | [weeks days hours minutes seconds milliseconds microseconds] NOTE: years and months are mutually-exclusive with the other units. See getEndTime() and _aggregate() for more details. Example1: years=1, months=6, Example2: hours=1, minutes=30, If none of the period fields are specified or if all that are specified have values of 0, then aggregation will be suppressed, and the given inputFile parameter value will be returned.

inputFields: The fields from the source. This is a list of tuples as returned from the 'fields' member of a RecordStreamIFace. The tuples are: (fieldName, type, special). The type can be 'float', 'int', 'string', 'datetime' or 'bool'. And the special can be '', 'T' (for timestamp), 'S' (for sequenceId), or 'R' (for reset).

timeFieldName: name of the field to use as the time field. If None, then the time field will be queried from the reader.

sequenceIdFieldName: name of the field to use as the sequenecId. If None, then the time field will be queried from the reader.

resetFieldName: name of the field to use as the reset field. If None, then the time field will be queried from the reader.

filterInfo: a structure with rules for filtering records out

If the input file contains a time field, sequence id field or reset field that were not specified in aggregationInfo fields, those fields will be added automatically with the following rules:

The order will be R, S, T, rest of the fields
The aggregation function for all will be to pick the first: lambda x: x[0]

Member Function Documentation

def next	(	self,
		record,
		curInputBookmark
	)

Return the next aggregated record, if any.

Parameters:

record: The input record (values only) from the input source, or None if the input has reached EOF (this will cause this method to force completion of and return any partially aggregated time period) curInputBookmark: The bookmark to the next input record retval: (outputRecord, inputBookmark)

outputRecord: the aggregated record inputBookmark: a bookmark to the last position from the input that contributed to this aggregated record.

If we don't have any aggregated records yet, returns (None, None)

The caller should generally do a loop like this: while True: inRecord = reader.getNextRecord() bookmark = reader.getBookmark()

(aggRecord, aggBookmark) = aggregator.next(inRecord, bookmark)

reached EOF?

if inRecord is None and aggRecord is None: break

if aggRecord is not None: proessRecord(aggRecord, aggBookmark)

This method makes use of the self._slice member variable to build up the values we need to aggregate. This is a dict of lists. The keys are the field indices and the elements of each list are the values for that field. For example:

self._siice = { 0: [42, 53], 1: [4.0, 5.1] }

The documentation for this class was generated from the following file:

nupic/data/aggregator.py

Public Member Functions

Detailed Description

Constructor & Destructor Documentation

Member Function Documentation

Parameters:

reached EOF?