Pre-Processing

class src.lib.analysis.preprocessing.PreProcessing

Bases: object

Pre-processing performs initial operations to fix or adapt the data received from the API. Examples include limiting the amount of data, since a series can span years that the analysis does not need, and defining the correct column for the closing price of each entry.

define_closure()

Defines the column used for the closing price. This is necessary because, depending on the data source or the configuration, the closing price may be stored in different columns. The new column is named “Close Final”.
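As an illustration of the idea only, the selection could look like the sketch below. The function name define_closure_sketch, the candidate column names "Adj Close" and "Close", and their priority order are assumptions made for this example; the class may pick the source column differently.

```python
import pandas as pd


def define_closure_sketch(ohlc: pd.DataFrame,
                          preferred=("Adj Close", "Close")) -> pd.DataFrame:
    """Copy the first available candidate column into "Close Final".

    Hypothetical helper: the candidate names and their order are
    assumptions, not taken from the PreProcessing class.
    """
    for column in preferred:
        if column in ohlc.columns:
            result = ohlc.copy()  # leave the caller's dataframe untouched
            result["Close Final"] = result[column]
            return result
    raise KeyError(f"none of {preferred} found in the dataframe columns")


sample = pd.DataFrame({"Close": [10.0, 10.5], "Adj Close": [9.8, 10.3]})
closed = define_closure_sketch(sample)
```

Here "Adj Close" wins because it is listed first; a source that only provides "Close" would fall through to the second candidate.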

define_past_time()

Tags all initially available data as Real data. This tag matters to the dataset because any data produced by prediction (future) is tagged as Predict data.

Parameters: None – No parameters are used by this method.
Returns: Nothing is returned; the result is written directly to the ohlc_dataset dataframe by adding a new column named Data Type.
Return type: None
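A minimal sketch of this tagging step is shown here on a standalone dataframe. The sample values and the exact label string "Real" are assumptions based on the description above; in the class itself the column is added to its own ohlc_dataset attribute.

```python
import pandas as pd

# Stand-in for the class attribute; the values are made up for the example.
ohlc_dataset = pd.DataFrame(
    {"Close Final": [10.0, 10.5, 10.2]},
    index=pd.to_datetime(["2024-01-03", "2024-01-04", "2024-01-05"]),
)

# Assigning a scalar broadcasts it to every existing row, so all
# historical entries receive the same tag.
ohlc_dataset["Data Type"] = "Real"
```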

extend_time_range(length: int)

Populates the Pandas dataframe with the subsequent dates to be used for predictions. Weekends are skipped; holidays are not taken into account, so they are not skipped.

Parameters: length (int) – Length of the list of dates.
Returns: Nothing is returned; the ohlc_dataset_prediction dataframe receives a new index of consecutive dates, with weekends skipped.
Return type: None
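One way to produce such a list of dates with pandas is sketched below. The helper name next_business_days is hypothetical, and using pd.bdate_range is an assumption about a suitable technique, not a statement of how the method is implemented. Like the behaviour described above, the default business-day frequency skips Saturdays and Sundays but knows nothing about holidays.

```python
import pandas as pd


def next_business_days(last_date, length: int) -> pd.DatetimeIndex:
    """Return `length` business days that follow `last_date`.

    Hypothetical helper for illustration only.
    """
    # Move to the first business day strictly after the last known date.
    start = pd.Timestamp(last_date) + pd.offsets.BDay(1)
    # Generate Monday-to-Friday dates; holidays are not excluded.
    return pd.bdate_range(start=start, periods=length)


# 2024-01-04 is a Thursday, so the range crosses a weekend.
future_index = next_business_days("2024-01-04", 3)

# Empty dataframe indexed by the future dates, standing in for
# ohlc_dataset_prediction.
ohlc_dataset_prediction = pd.DataFrame(index=future_index)
```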

truncate_range(length: int = 0, shift_last: int = 0)

Limits the data to the given length, always keeping the latest data available.

Note

The dataframe to be trimmed might contain work days only, with weekends excluded. In that case, if a year of data is needed, the input should be about 250 rather than 360.

Parameters: length (int) – Number of entries to be included in the analysis. Data outside this range is truncated.
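The effect of length can be sketched with DataFrame.tail, which keeps the most recent rows. The helper name truncate_sketch is hypothetical, and treating the default length of 0 as "no truncation" is an assumption. The shift_last parameter is not described here, so it is left out of the sketch.

```python
import pandas as pd


def truncate_sketch(df: pd.DataFrame, length: int = 0) -> pd.DataFrame:
    """Keep only the latest `length` rows of a date-sorted dataframe.

    Assumption: a length of 0 (the default) leaves the data unchanged.
    """
    if length <= 0:
        return df
    return df.tail(length)


# Ten business days of made-up data, from 2024-01-01 to 2024-01-12.
history = pd.DataFrame(
    {"Close Final": range(10)},
    index=pd.bdate_range("2024-01-01", periods=10),
)
trimmed = truncate_sketch(history, length=4)
```

Because the index holds business days only, four rows cover four trading days, which is the reason the note above suggests about 250 entries for one year of data.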
