Input / Output¶
FileRecordStream¶
-
class
nupic.data.file_record_stream.
FileRecordStream
(streamID, write=False, fields=None, missingValues=None, bookmark=None, includeMS=True, firstRecord=None)¶ CSV file based RecordStream implementation
-
appendRecord
(record, inputBookmark=None)¶ Saves the record in the underlying csv file.
record: a list of Python objects that will be string-ified
Returns: nothing
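The "string-ified" behaviour can be illustrated with a minimal standalone sketch in Python 3 that uses only the standard `csv` module. The helper name `append_record` and the choice of `str()` for every value are assumptions made for illustration; this is not the library's own code, and it does not model how FileRecordStream treats missing values or timestamps.

```python
import csv
import io


def append_record(csv_file, record):
    """Write one record as a CSV row, converting every value with str()."""
    writer = csv.writer(csv_file, lineterminator="\n")
    writer.writerow([str(value) for value in record])


# An in-memory buffer stands in for the underlying CSV file.
buffer = io.StringIO()
append_record(buffer, [1, 2.5, "spam", None])
append_record(buffer, ["a,b", True, 3, 4])
print(buffer.getvalue())
```

Values that contain the delimiter are quoted by the `csv` writer, so a field such as `a,b` survives a round trip through the file.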
-
appendRecords
(records, inputRef=None, progressCB=None)¶ Saves multiple records in the underlying storage.
Params:
records: array of records, as in ‘appendRecord’
inputRef: reference to the corresponding input (not applicable in the case of file storage)
progressCB: callback to report progress
Returns: nothing
-
clearStats
()¶ Resets stats collected so far.
-
getBookmark
()¶ Returns an anchor to the current position in the data. Passing this anchor to a constructor makes the current position the first record returned.
-
getDataRowCount
()¶ Returns: count of data rows in dataset (excluding header lines)
-
getError
()¶ Returns errors saved in the stream.
-
getFieldNames
()¶ Returns an array of field names associated with the data.
-
getFields
()¶ Returns a sequence of nupic.data.fieldmeta.FieldMetaInfo name/type/special tuples for each field in the stream.
-
getLastRecords
(numRecords)¶ Returns a tuple (successCode, recordsArray), where:
successCode: True if the stream had enough records to return, False otherwise
recordsArray: an array of the last numRecords records available when the call was made. Records appended while getLastRecords is executing will not be returned until the next call to either getNextRecord() or getLastRecords()
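The shape of the returned tuple can be sketched against a plain Python list. The function name `last_records` is hypothetical, and returning whatever records are available when there are too few is an assumption; the description above does not say what recordsArray holds in that case.

```python
def last_records(records, num_records):
    """Return (success_code, tail) for the last num_records items of a list."""
    # Slicing with a negative start never raises; it yields fewer items
    # when the list is shorter than num_records.
    tail = records[-num_records:] if num_records > 0 else []
    return (len(tail) == num_records, tail)


enough = last_records([1, 2, 3], 2)
too_few = last_records([1, 2, 3], 5)
print(enough)
print(too_few)
```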
-
getNextRecord
(useCache=True)¶ Returns the next available data record from the file.
retval: a data row (a list or tuple) if available; None if there are no more records in the table (End of Stream, EOS); an empty sequence (list or tuple) when timing out while waiting for the next record.
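Because the return value distinguishes three cases, a caller has to test for None before testing for emptiness; a plain truthiness check would conflate end of stream with a timeout. The sketch below shows one way to write such a loop. `FakeStream`, `drain`, and the `max_timeouts` parameter are hypothetical stand-ins for illustration; only the `getNextRecord` name and its three kinds of return value come from the description above.

```python
class FakeStream(object):
    """Hypothetical stand-in that replays a fixed list of return values."""

    def __init__(self, results):
        self._results = iter(results)

    def getNextRecord(self, useCache=True):
        # Once the scripted values run out, report end of stream.
        return next(self._results, None)


def drain(stream, max_timeouts=3):
    """Collect rows until end of stream or until too many timeouts occur."""
    rows = []
    timeouts = 0
    while True:
        record = stream.getNextRecord()
        if record is None:
            break  # end of stream: no more records
        if len(record) == 0:
            timeouts += 1  # empty sequence: timed out waiting for a record
            if timeouts >= max_timeouts:
                break
            continue
        rows.append(record)
    return rows, timeouts


rows, timeouts = drain(FakeStream([[1, 2], [], [3, 4], None]))
print(rows, timeouts)
```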
-
getNextRecordIdx
()¶ Returns the index of the record that will be read next from getNextRecord()
-
getRecordsRange
(bookmark=None, range=None)¶ Returns a range of records, starting from the bookmark. If ‘bookmark’ is None, records are read starting from the first one available. If ‘range’ is None, all available records will be returned (caution: this could be a lot of records and require a lot of memory).
-
getStats
()¶ Parses the file using a dedicated reader and collects field stats. This parsing happens only if the user of FileRecordStream invokes the getStats method.
Returns: a dictionary of stats. In the current implementation, min and max fields are supported. An example of the returned dictionary:
{
'min' : [f1_min, f2_min, None, None, fn_min],
'max' : [f1_max, f2_max, None, None, fn_max]
}
(where fx_min/fx_max are set for scalar fields, or None if not)
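To make the layout of that dictionary concrete, here is a minimal standalone sketch that builds the same structure from rows held in memory. The function `collect_min_max` and its `scalar_columns` argument are hypothetical; how the library itself decides which fields are scalar is not covered here.

```python
def collect_min_max(rows, scalar_columns):
    """Build {'min': [...], 'max': [...]} with None for non-scalar columns.

    rows: list of equal-length lists
    scalar_columns: set of column indices that hold numeric values
    """
    if not rows:
        return {"min": [], "max": []}
    width = len(rows[0])
    mins = [None] * width
    maxs = [None] * width
    for row in rows:
        for i in scalar_columns:
            value = row[i]
            if mins[i] is None or value < mins[i]:
                mins[i] = value
            if maxs[i] is None or value > maxs[i]:
                maxs[i] = value
    return {"min": mins, "max": maxs}


stats = collect_min_max(
    [[3, "a", 1.5], [1, "b", 2.5], [2, "c", 0.5]],
    scalar_columns={0, 2},
)
print(stats)
```

Each list has one entry per field, in field order, so the stats for a field can be looked up by the field's index.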
-
isCompleted
()¶ Returns True if all records are already in the stream, or False if more records are expected.
-
next
()¶ Implement the iterator protocol
-
recordsExistAfter
(bookmark)¶ Returns True iff there are records left after the bookmark.
-
rewind
()¶ Puts the stream back at the beginning of the file.
-
seekFromEnd
(numRecords)¶ Seeks to numRecords from the end and returns a bookmark to the new position.
-
setAutoRewind
(autoRewind)¶ Controls whether getNext() should automatically rewind the source when EOF is reached.
autoRewind: True = getNext() will automatically rewind the source on EOF; False = getNext() will not automatically rewind the source on EOF
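The effect of the flag can be sketched with a small in-memory reader. `RewindingReader` and its method names are hypothetical and only mimic the behaviour described above: with auto-rewind off, reading past the end yields None; with it on, reading wraps around to the first row.

```python
class RewindingReader(object):
    """Hypothetical in-memory reader that can wrap around at end of data."""

    def __init__(self, rows, auto_rewind=False):
        self._rows = list(rows)
        self._index = 0
        self._auto_rewind = auto_rewind

    def set_auto_rewind(self, auto_rewind):
        self._auto_rewind = auto_rewind

    def get_next(self):
        if self._index >= len(self._rows):
            # At end of data: stop, unless auto-rewind is enabled
            # and there is something to rewind to.
            if not self._auto_rewind or not self._rows:
                return None
            self._index = 0
        row = self._rows[self._index]
        self._index += 1
        return row


reader = RewindingReader(["first", "second"])
without_rewind = [reader.get_next() for _ in range(3)]
reader.set_auto_rewind(True)
with_rewind = [reader.get_next() for _ in range(3)]
print(without_rewind, with_rewind)
```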
-
setCompleted
(completed=True)¶ Marks the stream completed (True or False)
-
setError
(error)¶ Saves specified error in the stream.
-
setTimeout
(timeout)¶ Set the read timeout
-