hub / github.com/microsoft/qlib / setup_data

Method setup_data

qlib/data/dataset/handler.py:634–664

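Set up the data; intended for cases where initialization is run multiple times.

Parameters
----------
init_type : str
    One of the `IT_*` types defined on the class.
enable_cache : bool
    Defaults to false. If `enable_cache` == True, the processed data will be saved on disk, and the handler will load the cached data from disk directly the next time `init` is called.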

(self, init_type: str = IT_FIT_SEQ, **kwargs)

Source from the content-addressed store, hash-verified

632      IT_LS = "load_state"  # The state of the object has been load by pickle
633
634      def setup_data(self, init_type: str = IT_FIT_SEQ, **kwargs):
635          """
636          Set up the data in case of running initialization for multiple time
637
638          Parameters
639          ----------
640          init_type : str
641              The type `IT_*` listed above.
642          enable_cache : bool
643              default value is false:
644
645              - if `enable_cache` == True:
646
647                  the processed data will be saved on disk, and handler will load the cached data from the disk directly
648                  when we call `init` next time
649          """
650          # init raw data
651          super().setup_data(**kwargs)
652
653          with TimeInspector.logt("fit & process data"):
654              if init_type == DataHandlerLP.IT_FIT_IND:
655                  self.fit()
656                  self.process_data()
657              elif init_type == DataHandlerLP.IT_LS:
658                  self.process_data()
659              elif init_type == DataHandlerLP.IT_FIT_SEQ:
660                  self.fit_process_data()
661              else:
662                  raise NotImplementedError(f"This type of input is not supported")
663
664          # TODO: Be able to cache handler data. Save the memory for data processing
665
666      def _get_df_by_key(self, data_key: DATA_KEY_TYPE = DataHandlerABC.DK_I) -> pd.DataFrame:
667          if data_key == self.DK_R and self.drop_raw:
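The dispatch on `init_type` in the listing can be illustrated with a minimal, self-contained sketch. It does not import qlib: `BaseHandlerSketch`, `HandlerSketch`, and `trace` are hypothetical stand-ins that only record which steps run, and the string values of `IT_FIT_SEQ` and `IT_FIT_IND` are assumptions (only `IT_LS = "load_state"` is visible in the listing). The timing context manager is omitted.

```python
class BaseHandlerSketch:
    """Hypothetical stand-in for the parent handler; it only records the raw-data load."""

    def __init__(self):
        self.calls = []

    def setup_data(self, **kwargs):
        self.calls.append("load_raw")


class HandlerSketch(BaseHandlerSketch):
    # Only IT_LS's value appears in the listing; the other two strings are assumed.
    IT_FIT_SEQ = "fit_seq"
    IT_FIT_IND = "fit_ind"
    IT_LS = "load_state"

    def fit(self):
        self.calls.append("fit")

    def process_data(self):
        self.calls.append("process_data")

    def fit_process_data(self):
        self.calls.append("fit_process_data")

    def setup_data(self, init_type=IT_FIT_SEQ, **kwargs):
        # Raw data is always (re)loaded first, as in the listing.
        super().setup_data(**kwargs)
        if init_type == HandlerSketch.IT_FIT_IND:
            # Fit first, then process, as two separate steps.
            self.fit()
            self.process_data()
        elif init_type == HandlerSketch.IT_LS:
            # State was restored (e.g. by pickle), so skip fitting and only process.
            self.process_data()
        elif init_type == HandlerSketch.IT_FIT_SEQ:
            # Fitting and processing are handled by a single combined call.
            self.fit_process_data()
        else:
            raise NotImplementedError(f"Unsupported init_type: {init_type!r}")


def trace(init_type):
    """Return the ordered list of steps triggered by one setup_data call."""
    handler = HandlerSketch()
    handler.setup_data(init_type)
    return handler.calls


if __name__ == "__main__":
    for it in (HandlerSketch.IT_FIT_SEQ, HandlerSketch.IT_FIT_IND, HandlerSketch.IT_LS):
        print(it, "->", trace(it))
```

Note that in both the listing and this sketch the raw data is loaded before `init_type` is checked, so an unsupported value raises only after the parent `setup_data` has already run.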

Callers

nothing calls this directly

Calls 5

fit · Method · 0.95
process_data · Method · 0.95
fit_process_data · Method · 0.95
logt · Method · 0.80
setup_data · Method · 0.45

Tested by

no test coverage detected