NuPIC
0.2.7.dev0
Numenta Platform for Intelligent Computing
A scalar encoder encodes a numeric (floating point) value into an array of bits.
Public Member Functions
def | __init__ |
w – number of bits to set in output; minval – minimum input value; maxval – maximum input value (input is strictly less if periodic == True)
def | getDecoderOutputFieldTypes |
[Encoder class virtual method override]
def | getBucketIndices |
See method description in base.py.
def | encodeIntoArray |
See method description in base.py.
def | decode |
See the function description in base.py.
def | getBucketValues |
See the function description in base.py.
def | getBucketInfo |
See the function description in base.py.
def | topDownCompute |
See the function description in base.py.
def | closenessScores |
See the function description in base.py.
Public Member Functions inherited from the Encoder base class (base.py)
def | getWidth |
Should return the output width, in bits.
def | encodeIntoArray |
Encodes inputData and puts the encoded value into the numpy output array, which is a 1-D array of length returned by getWidth().
def | setLearning |
Set whether learning is enabled.
def | setFieldStats |
This method is called by the model to set the statistics like min and max for the underlying encoders if this information is available.
def | encode |
Convenience wrapper for encodeIntoArray.
def | getScalarNames |
Return the field names for each of the scalar values returned by getScalars.
def | getDecoderOutputFieldTypes |
Returns a sequence of field types corresponding to the elements in the decoded output field array.
def | setStateLock |
Setting this to true freezes the state of the encoder. This is separate from the learning state, which affects changing parameters.
def | getEncoderList |
def | getScalars |
Returns a numpy array containing the sub-field scalar value(s) for each sub-field of the inputData.
def | getEncodedValues |
Returns the input in the same format as is returned by topDownCompute().
def | getBucketIndices |
Returns an array containing the sub-field bucket indices for each sub-field of the inputData.
def | scalarsToStr |
Return a pretty print string representing the return values from getScalars() and getScalarNames().
def | getDescription |
This returns a list of tuples, each containing (name, offset).
def | getFieldDescription |
Return the offset and length of a given field within the encoded output.
def | encodedBitDescription |
Return a description of the given bit in the encoded output.
def | pprintHeader |
Pretty-print a header that labels the sub-fields of the encoded output.
def | pprint |
Pretty-print the encoded output using ascii art.
def | decode |
Takes an encoded output and does its best to work backwards and generate the input that would have generated it.
def | decodedToStr |
Return a pretty print string representing the return value from decode().
def | getBucketValues |
Returns a list of items, one for each bucket defined by this encoder.
def | getBucketInfo |
Returns a list of EncoderResult namedtuples describing the inputs for each sub-field that correspond to the bucket indices passed in 'buckets'.
def | topDownCompute |
Returns a list of EncoderResult namedtuples describing the top-down best guess inputs for each sub-field given the encoded output.
def | closenessScores |
Compute closeness scores between the expected scalar value(s) and actual scalar value(s).
def | getDisplayWidth |
Calculate width of display for bits plus blanks between fields.
def | formatBits |
Copy one array to another, inserting blanks between fields (for display). If leftpad is one, then there is a dummy value at element 0 of the arrays, and we should start our counting from 1 rather than 0.
A scalar encoder encodes a numeric (floating point) value into an array of bits.
The output is 0's except for a contiguous block of 1's. The location of this contiguous block varies continuously with the input value.
The encoding is linear. If you want a nonlinear encoding, just transform the scalar (e.g. by applying a logarithm function) before encoding. It is not recommended to bin the data as a pre-processing step, e.g. "1" = $0 - $.20, "2" = $.21-$0.80, "3" = $.81-$1.20, etc. as this removes a lot of information and prevents nearby values from overlapping in the output. Instead, use a continuous transformation that scales the data (a piecewise transformation is fine).
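As a rough illustration of the advice above, the sketch below applies a continuous logarithmic transform to raw values before they would be handed to a linear encoder. The helper name `log_transform` and its `floor` guard are hypothetical and are not part of NuPIC.

```python
import math

def log_transform(value, floor=1e-6):
    """Map a non-negative scalar onto a log10 scale before linear encoding.

    `floor` is an assumed guard that keeps the logarithm defined for
    zero or very small inputs; it is not a NuPIC parameter.
    """
    return math.log10(max(value, floor))

# Prices spanning three orders of magnitude become evenly spaced values,
# so a linear encoder gives each decade the same share of output bits.
prices = [0.10, 1.00, 10.00, 100.00]
transformed = [log_transform(p) for p in prices]
print(transformed)
```

Because the transform is continuous, nearby prices still map to nearby values, which binning would not preserve.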
w – The number of bits that are set to encode a single value (the "width" of the output signal). Restriction: w must be odd to avoid centering problems.
minval – The minimum value of the input signal.
maxval – The upper bound of the input signal.
periodic – If true, then the input value "wraps around" such that minval = maxval. For a periodic value, the input must be strictly less than maxval; otherwise maxval is a true upper bound.
There are three mutually exclusive parameters that determine the overall size of the output. Only one of these should be specified to the constructor:
n – The number of bits in the output. Must be greater than or equal to w.
radius – Two inputs separated by more than the radius have non-overlapping representations. Two inputs separated by less than the radius will in general overlap in at least some of their bits. You can think of this as the radius of the input.
resolution – Two inputs separated by greater than, or equal to, the resolution are guaranteed to have different representations.
Note: radius and resolution are specified w.r.t the input, not output. w is specified w.r.t. the output.
Example: day of week.
w = 3
minval = 1 (Monday)
maxval = 8 (Monday)
periodic = true
n = 14 [equivalently: radius = 1.5 or resolution = 0.5]
The following values would encode midnight, the start of the day:
Monday (1) -> 11000000000001
Tuesday (2) -> 01110000000000
Wednesday (3) -> 00011100000000
...
Sunday (7) -> 00000000000111
Since the resolution is 12 hours, we can also encode noon, as
Monday noon -> 11100000000000
Monday midnight -> 01110000000000
Tuesday noon -> 00111000000000
etc.
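The day-of-week example can be reproduced with a small standalone sketch. This is not the NuPIC implementation: the function name `encode_periodic` and the rule for locating the centre bit are assumptions chosen to match the bit patterns listed above.

```python
def encode_periodic(value, minval, maxval, n, w):
    """Return an n-character bit string with w contiguous 1's that wrap around."""
    if not (minval <= value < maxval):
        raise ValueError("periodic input must satisfy minval <= value < maxval")
    half = (w - 1) // 2  # bits on each side of the centre bit (w is odd)
    # Assumed mapping: the input range is spread evenly over all n bits.
    center = int((value - minval) * n / float(maxval - minval))
    bits = ["0"] * n
    for offset in range(-half, half + 1):
        bits[(center + offset) % n] = "1"  # modulo implements the wrap-around
    return "".join(bits)

# Day of week: w = 3, minval = 1, maxval = 8, n = 14
print(encode_periodic(1, 1, 8, 14, 3))    # Monday       -> 11000000000001
print(encode_periodic(2, 1, 8, 14, 3))    # Tuesday      -> 01110000000000
print(encode_periodic(1.5, 1, 8, 14, 3))  # Monday noon  -> 11100000000000
```

The modulo on the bit index is what lets Monday's block straddle the end and the start of the array.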
It may not be natural to specify "n", especially with non-periodic data. For example, consider encoding an input with a range of 1-10 (inclusive) using an output width of 5. If you specify resolution = 1, this means that inputs of 1 and 2 have different outputs, though they overlap, but 1 and 1.5 might not have different outputs. This leads to a 14-bit representation like this:
1 -> 11111000000000 (14 bits total)
2 -> 01111100000000
...
10 -> 00000000011111
[resolution = 1; n=14; radius = 5]
You could specify resolution = 0.5, which gives
1 -> 11111000... (22 bits total)
1.5 -> 011111.....
2.0 -> 0011111....
[resolution = 0.5; n=22; radius=2.5]
You could specify radius = 1, which gives
1 -> 111110000000.... (50 bits total)
2 -> 000001111100....
3 -> 000000000011111...
...
10 -> .....000011111
[radius = 1; resolution = 0.2; n=50]
An N/M encoding can also be used to encode a binary value, where we want more than one bit to represent each state. For example, we could have: w = 5, minval = 0, maxval = 1, radius = 1 (which is equivalent to n=10)
0 -> 1111100000
1 -> 0000011111
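For the non-periodic case, a minimal standalone sketch is shown below. It is not NuPIC code: `encode_linear` is a hypothetical helper, and it assumes that the n - w possible shifts of the block of 1's are spread evenly across the input range, which reproduces the 14-bit example for the range 1-10 and the 10-bit binary example above.

```python
def encode_linear(value, minval, maxval, n, w):
    """Return an n-character bit string with one contiguous block of w 1's."""
    if not (minval <= value <= maxval):
        raise ValueError("input must satisfy minval <= value <= maxval")
    # Assumption: the block can start at bits 0 .. n - w, spread evenly
    # over the input range, so one step of the block is one resolution.
    resolution = float(maxval - minval) / (n - w)
    first = int(round((value - minval) / resolution))
    return "0" * first + "1" * w + "0" * (n - w - first)

# Range 1-10 with w = 5 and n = 14
print(encode_linear(1, 1, 10, 14, 5))   # 11111000000000
print(encode_linear(10, 1, 10, 14, 5))  # 00000000011111

# Binary value with w = 5 and n = 10
print(encode_linear(0, 0, 1, 10, 5))    # 1111100000
print(encode_linear(1, 0, 1, 10, 5))    # 0000011111
```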
range = maxval - minval
h = (w-1)/2 (half-width)
resolution = radius / w
n = w * range/radius (periodic)
n = w * range/radius + 2 * h (non-periodic)
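The relations above can be evaluated directly, as in the sketch below. It follows the formulas exactly as written here; the helper name `derive_sizes` and the use of `math.ceil` to round n up to a whole number of bits are assumptions, and the real encoder may size its output differently in some cases.

```python
import math

def derive_sizes(w, minval, maxval, radius, periodic):
    """Return (n, resolution) computed from w, the input range and radius."""
    value_range = maxval - minval
    half = (w - 1) // 2                  # h, the half-width
    resolution = float(radius) / w
    n = w * value_range / float(radius)
    if not periodic:
        n += 2 * half                    # extra bits at both ends of the output
    return int(math.ceil(n)), resolution

# Day of week: w = 3, minval = 1, maxval = 8, radius = 1.5, periodic
print(derive_sizes(3, 1, 8, 1.5, True))    # (14, 0.5)

# Range 1-10: w = 5, radius = 2.5, non-periodic
print(derive_sizes(5, 1, 10, 2.5, False))  # (22, 0.5)
```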
def __init__(self, w, minval, maxval, periodic=False, n=0, radius=DEFAULT_RADIUS, resolution=DEFAULT_RESOLUTION, name=None, verbosity=0, clipInput=False, forced=False)
w – number of bits to set in output
minval – minimum input value
maxval – maximum input value (input is strictly less if periodic == True)
Exactly one of n, radius, resolution must be set. "0" is a special value that means "not set".
n – number of bits in the representation (must be > w)
radius – inputs separated by more than, or equal to this distance will have non-overlapping representations
resolution – inputs separated by more than, or equal to this distance will have different representations
name – an optional string which will become part of the description
clipInput – if true, non-periodic inputs smaller than minval or greater than maxval will be clipped to minval/maxval
forced – if true, skip some safety checks (for compatibility reasons), default false
See class documentation for more information.
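The rule that exactly one of n, radius and resolution may be set (with 0 meaning "not set"), and the clipping behaviour described for clipInput, can be sketched as follows. Both helpers, `pick_size_parameter` and `clip_input`, are hypothetical illustrations and are not taken from the NuPIC source.

```python
def pick_size_parameter(n=0, radius=0, resolution=0):
    """Return (name, value) of the single size parameter that is set.

    0 is treated as "not set", mirroring the constructor convention.
    """
    candidates = (("n", n), ("radius", radius), ("resolution", resolution))
    given = [(name, value) for name, value in candidates if value != 0]
    if len(given) != 1:
        raise ValueError("exactly one of n, radius, resolution must be set")
    return given[0]

def clip_input(value, minval, maxval):
    """Clamp a non-periodic input into the closed interval [minval, maxval]."""
    return min(max(value, minval), maxval)

print(pick_size_parameter(n=14))  # ('n', 14)
print(clip_input(12, 1, 10))      # 10
print(clip_input(-3, 1, 10))      # 1
```

Calling `pick_size_parameter()` with no arguments, or with two non-zero values, raises `ValueError` in this sketch.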