wmgifford committed on
Commit 61fc549 · 1 Parent(s): 81b3e95

r2.1 release updates

Files changed (1)
  1. README.md +97 -151
README.md CHANGED
@@ -18,25 +18,26 @@ library_name: granite-tsfm
  </p>

  TinyTimeMixers (TTMs) are compact pre-trained models for Multivariate Time-Series Forecasting, open-sourced by IBM Research.
- **With model sizes starting from 1M params, TTM (accepted in NeurIPS 24) introduces the notion of the first-ever “tiny” pre-trained models for Time-Series Forecasting.**


- TTM outperforms several popular benchmarks demanding billions of parameters in zero-shot and few-shot forecasting. TTMs are lightweight
  forecasters, pre-trained on publicly available time series data with various augmentations. TTM provides state-of-the-art zero-shot forecasts and can easily be
- fine-tuned for multi-variate forecasts with just 5% of the training data to be competitive. Refer to our [paper](https://arxiv.org/pdf/2401.03955.pdf) for more details.

- **The current open-source version supports point forecasting use-cases specifically ranging from minutely to hourly resolutions
- (Ex. 10 min, 15 min, 1 hour.).**

- **Note that zeroshot, fine-tuning and inference tasks using TTM can easily be executed in 1 GPU machine or in laptops too!!**
-
-
- **TTM-R2 comprises TTM variants pre-trained on larger pretraining datasets (~700M samples).** We have another set of TTM models released under `TTM-R1` trained on ~250M samples
- which can be accessed from [here](https://huggingface.co/ibm-granite/granite-timeseries-ttm-r1). In general, `TTM-R2` models perform better than `TTM-R1` models as they are
- trained on larger pretraining dataset. In standard benchmarks, TTM-R2 outperform TTM-R1 by over 15%. However, the choice of R1 vs R2 depends on your target data distribution. Hence requesting users to try both
- R1 and R2 variants and pick the best for your data.


  ## Model Description
@@ -47,15 +48,12 @@ we opt for the approach of constructing smaller pre-trained models, each focusin
  yielding more accurate results. Furthermore, this approach ensures that our models remain extremely small and exceptionally fast,
  facilitating easy deployment without demanding a ton of resources.

- Hence, in this model card, we release several pre-trained
- TTMs that can cater to many common forecasting settings in practice.
-
- Each pre-trained model will be released in a different branch name in this model card. Kindly access the required model using our
  getting started [notebook](https://github.com/IBM/tsfm/blob/main/notebooks/hfdemo/ttm_getting_started.ipynb) mentioning the branch name.

- ## Model Releases:

- Given the variety of models included, please use the [[get_model]](https://github.com/ibm-granite/granite-tsfm/blob/main/tsfm_public/toolkit/get_model.py) utility to automatically select the required model based on your input context length and forecast length requirement.

  There are several models available in different branches of this model card. The naming scheme follows the following format:
  `<context length>-<prediction length>-<frequency prefix tuning indicator>-<pretraining metric>-<release number>`
@@ -70,22 +68,87 @@ There are several models available in different branches of this model card. The
  - release number ("r2" or "r2.1"): Indicates the model release; the release indicates which data was used to train the model. See "training data" below for more details on the data included in the particular training datasets.

-

- ## Model Capabilities with example scripts

- The below model scripts can be used for any of the above TTM models. Please update the HF model URL and branch name in the `from_pretrained` call appropriately to pick the model of your choice.

- - Getting Started [[colab]](https://colab.research.google.com/github/ibm-granite/granite-tsfm/blob/main/notebooks/hfdemo/ttm_getting_started.ipynb)
  - Zeroshot Multivariate Forecasting [[Example]](https://github.com/ibm-granite/granite-tsfm/blob/main/notebooks/hfdemo/ttm_getting_started.ipynb)
  - Finetuned Multivariate Forecasting:
  - Channel-Independent Finetuning [[Example 1]](https://github.com/ibm-granite/granite-tsfm/blob/main/notebooks/hfdemo/ttm_getting_started.ipynb) [[Example 2]](https://github.com/ibm-granite/granite-tsfm/blob/main/notebooks/hfdemo/tinytimemixer/ttm_m4_hourly.ipynb)
  - Channel-Mix Finetuning [[Example]](https://github.com/ibm-granite/granite-tsfm/blob/main/notebooks/tutorial/ttm_channel_mix_finetuning.ipynb)
- - **New Releases (extended features released on October 2024)**
- - Finetuning and Forecasting with Exogenous/Control Variables [[Example]](https://github.com/ibm-granite/granite-tsfm/blob/main/notebooks/tutorial/ttm_with_exog_tutorial.ipynb)
  - Finetuning and Forecasting with static categorical features [Example: To be added soon]
  - Rolling Forecasts - Extend forecast lengths via rolling capability. Rolling beyond 2*forecast_length is not recommended. [[Example]](https://github.com/ibm-granite/granite-tsfm/blob/main/notebooks/hfdemo/ttm_rolling_prediction_getting_started.ipynb)
  - Helper scripts for optimal Learning Rate suggestions for Finetuning [[Example]](https://github.com/ibm-granite/granite-tsfm/blob/main/notebooks/tutorial/ttm_with_exog_tutorial.ipynb)

  ## Benchmarks

@@ -101,15 +164,7 @@ adoption in resource-constrained environments. For more details, refer to our [p
  - TTM-A referred in the paper maps to the 1536 context models.

  The pre-training dataset used in this release differs slightly from the one used in the research
- paper, which may lead to minor variations in model performance as compared to the published results. Please refer to our paper for more details.
-
- **Benchmarking Scripts: [here](https://github.com/ibm-granite/granite-tsfm/tree/main/notebooks/hfdemo/tinytimemixer/full_benchmarking)**
-
- ## Recommended Use
- 1. Users have to externally standard scale their data independently for every channel before feeding it to the model (Refer to [TSP](https://github.com/IBM/tsfm/blob/main/tsfm_public/toolkit/time_series_preprocessor.py), our data processing utility for data scaling.)
- 2. The current open-source version supports only minutely and hourly resolutions(Ex. 10 min, 15 min, 1 hour.). Other lower resolutions (say weekly, or monthly) are currently not supported in this version, as the model needs a minimum context length of 512 or 1024.
- 3. Enabling any upsampling or prepending zeros to virtually increase the context length for shorter-length datasets is not recommended and will
- impact the model performance.

@@ -117,128 +172,19 @@ paper, which may lead to minor variations in model performance as compared to th

  For more details on TTM architecture and benchmarks, refer to our [paper](https://arxiv.org/pdf/2401.03955.pdf).

- TTM-1 currently supports 2 modes:

  - **Zeroshot forecasting**: Directly apply the pre-trained model on your target data to get an initial forecast (with no training).

  - **Finetuned forecasting**: Finetune the pre-trained model with a subset of your target data to further improve the forecast.

- **Since, TTM models are extremely small and fast, it is practically very easy to finetune the model with your available target data in few minutes
- to get more accurate forecasts.**

  The current release supports multivariate forecasting via both channel independence and channel-mixing approaches.
  Decoder Channel-Mixing can be enabled during fine-tuning for capturing strong channel-correlation patterns across
- time-series variates, a critical capability lacking in existing counterparts.
-
- In addition, TTM also supports exogenous infusion and static categorical data infusion.
-
-
- ### Model Sources
-
- - **Repository:** https://github.com/ibm-granite/granite-tsfm/tree/main/tsfm_public/models/tinytimemixer
- - **Paper:** https://arxiv.org/pdf/2401.03955.pdf
-
-
- ### Blogs and articles on TTM:
- - Refer to our [wiki](https://github.com/ibm-granite/granite-tsfm/wiki)
-
-
- ## Uses
-
- Automatic Model selection
- ```
- def get_model(
-     model_path,
-     model_name: str = "ttm",
-     context_length: int = None,
-     prediction_length: int = None,
-     freq_prefix_tuning: bool = None,
-     **kwargs,
- ):
-
-     TTM Model card offers a suite of models with varying context_length and forecast_length combinations.
-     This wrapper automatically selects the right model based on the given input context_length and prediction_length abstracting away the internal
-     complexity.
-
-     Args:
-         model_path (str):
-             HF model card path or local model path (Ex. ibm-granite/granite-timeseries-ttm-r1)
-         model_name (*optional*, str)
-             model name to use. Allowed values: ttm
-         context_length (int):
-             Input Context length. For ibm-granite/granite-timeseries-ttm-r1, we allow 512 and 1024.
-             For ibm-granite/granite-timeseries-ttm-r2 and ibm/ttm-research-r2, we allow 512, 1024 and 1536
-         prediction_length (int):
-             Forecast length to predict. For ibm-granite/granite-timeseries-ttm-r1, we can forecast upto 96.
-             For ibm-granite/granite-timeseries-ttm-r2 and ibm/ttm-research-r2, we can forecast upto 720.
-             Model is trained for fixed forecast lengths (96,192,336,720) and this model add required `prediction_filter_length` to the model instance for required pruning.
-             For Ex. if we need to forecast 150 timepoints given last 512 timepoints using model_path = ibm-granite/granite-timeseries-ttm-r2, then get_model will select the
-             model from 512_192_r2 branch and applies prediction_filter_length = 150 to prune the forecasts from 192 to 150. prediction_filter_length also applies loss
-             only to the pruned forecasts during finetuning.
-         freq_prefix_tuning (*optional*, bool):
-             Future use. Currently do not use this parameter.
-         kwargs:
-             Pass all the extra fine-tuning model parameters intended to be passed in the from_pretrained call to update model configuration.
- ```
-
- ```
- # Load Model from HF Model Hub mentioning the branch name in revision field
-
- model = TinyTimeMixerForPrediction.from_pretrained(
-     "https://huggingface.co/ibm-granite/granite-timeseries-ttm-r2", revision="main"
- )
-
- or
-
- from tsfm_public.toolkit.get_model import get_model
- model = get_model(
-     model_path="https://huggingface.co/ibm-granite/granite-timeseries-ttm-r2",
-     context_length=512,
-     prediction_length=96
- )
-
- # Do zeroshot
- zeroshot_trainer = Trainer(
-     model=model,
-     args=zeroshot_forecast_args,
- )
-
- zeroshot_output = zeroshot_trainer.evaluate(dset_test)
-
- # Freeze backbone and enable few-shot or finetuning:
-
- # freeze backbone
- for param in model.backbone.parameters():
-     param.requires_grad = False
-
- finetune_model = get_model(
-     model_path="https://huggingface.co/ibm-granite/granite-timeseries-ttm-r2",
-     context_length=512,
-     prediction_length=96,
-     # pass other finetune params of decoder or head
-     head_dropout=0.2
- )
-
- finetune_forecast_trainer = Trainer(
-     model=model,
-     args=finetune_forecast_args,
-     train_dataset=dset_train,
-     eval_dataset=dset_val,
-     callbacks=[early_stopping_callback, tracking_callback],
-     optimizers=(optimizer, scheduler),
- )
- finetune_forecast_trainer.train()
- fewshot_output = finetune_forecast_trainer.evaluate(dset_test)
-
- ```

  ## Training Data
@@ -279,8 +225,8 @@ The r2.1 TTM models (denoted by branches with suffix r2.1) were trained on the a


  ## Citation
- Kindly cite the following paper, if you intend to use our model or its associated architectures/approaches in your
- work

  **BibTeX:**

@@ -295,13 +241,13 @@ work

  ## Model Card Authors

- Vijay Ekambaram, Arindam Jati, Pankaj Dayama, Wesley M. Gifford, Sumanta Mukherjee, Chandra Reddy and Jayant Kalagnanam


- ## IBM Public Repository Disclosure:

  All content in this repository including code has been provided by IBM under the associated
  open source software license and IBM is under no obligation to provide enhancements,
  updates, or support. IBM developers produced this code as an
  open source project (not as an IBM product), and IBM makes no assertions as to
- the level of quality nor security, and will not be maintaining this code going forward.
 
  </p>

  TinyTimeMixers (TTMs) are compact pre-trained models for Multivariate Time-Series Forecasting, open-sourced by IBM Research.
+ **With model sizes starting from 1M params, TTM introduces the notion of the first-ever “tiny” pre-trained models for Time-Series Forecasting. The paper describing TTM was accepted at [NeurIPS 24](https://proceedings.neurips.cc/paper_files/paper/2024/hash/874a4d89f2d04b4bcf9a2c19545cf040-Abstract-Conference.html).**


+ TTM outperforms other models demanding billions of parameters in several popular zero-shot and few-shot forecasting benchmarks. TTMs are lightweight
  forecasters, pre-trained on publicly available time series data with various augmentations. TTM provides state-of-the-art zero-shot forecasts and can easily be
+ fine-tuned for multi-variate forecasts with just 5% of the training data to be competitive. **Note that zero-shot, fine-tuning, and inference tasks using TTM can easily be executed on a single GPU or on a laptop.**

+ TTM r2 comprises TTM variants pre-trained on larger pretraining datasets (~700M samples). The TTM r2.1 release increases the pretraining dataset size to ~1B samples. The prior model releases, TTM r1, were trained on ~250M samples and can be accessed [here](https://huggingface.co/ibm-granite/granite-timeseries-ttm-r1). In general, TTM r2 models perform better than TTM r1 models as they are
+ trained on a larger pretraining dataset. In standard benchmarks, TTM r2 outperforms TTM r1 by over 15%. However, the choice of r1 vs. r2 depends on your target data distribution, so users should try both variants and pick the best model for their data.
+ The TTM r2 releases support point forecasting use-cases ranging from minutely to hourly resolutions
+ (e.g., 10 min, 15 min, 1 hour). With the TTM r2.1 release, we add support for daily and weekly resolutions.

+ ### Links

+ - **Paper:** [NeurIPS 2024](https://proceedings.neurips.cc/paper_files/paper/2024/hash/874a4d89f2d04b4bcf9a2c19545cf040-Abstract-Conference.html), [arXiv](https://arxiv.org/pdf/2401.03955.pdf)
+ - **Repository:** https://github.com/ibm-granite/granite-tsfm
+ - **PyPI project:** https://pypi.org/project/granite-tsfm/
+ - **Model architecture:** https://github.com/ibm-granite/granite-tsfm/tree/main/tsfm_public/models/tinytimemixer
+ - **Time Series Cookbook:** https://github.com/ibm-granite-community/granite-timeseries-cookbook


  ## Model Description
  yielding more accurate results. Furthermore, this approach ensures that our models remain extremely small and exceptionally fast,
  facilitating easy deployment without demanding a ton of resources.

+ Hence, in this model card, we release several pre-trained TTMs that can cater to many common forecasting settings in practice.
+ Each pre-trained model will be released in a different branch name in this model card. Given the variety of models included, we recommend using the [`get_model()`](https://github.com/ibm-granite/granite-tsfm/blob/main/tsfm_public/toolkit/get_model.py) utility to automatically select the required model based on your input context length, forecast length, and other requirements. You can also directly access a specific model using our
  getting started [notebook](https://github.com/IBM/tsfm/blob/main/notebooks/hfdemo/ttm_getting_started.ipynb) mentioning the branch name.

+ ## Model Releases

  There are several models available in different branches of this model card. The naming scheme follows the following format:
  `<context length>-<prediction length>-<frequency prefix tuning indicator>-<pretraining metric>-<release number>`

  - release number ("r2" or "r2.1"): Indicates the model release; the release indicates which data was used to train the model. See "training data" below for more details on the data included in the particular training datasets.
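
For illustration, the branch name maps directly to the `revision` argument of `from_pretrained`. A minimal sketch (the `"main"` branch follows the loading example shown in the previous revision of this card; the import location may differ across granite-tsfm versions, and the alternative branch name shown is hypothetical):

```
from tsfm_public import TinyTimeMixerForPrediction  # adjust the import to your granite-tsfm version

# "main" is the default branch; other branches follow the naming scheme above
# (e.g. a hypothetical "512-96-r2.1" for a 512 context length and 96 prediction length).
model = TinyTimeMixerForPrediction.from_pretrained(
    "ibm-granite/granite-timeseries-ttm-r2",
    revision="main",
)
```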
+ ### Example recipes and notebooks
+ The scripts below can be used for any of the above TTM models. Please update the HF model URL and branch name in the `from_pretrained` call appropriately to pick the model of your choice. Please note that a few of the notebooks directly use the [`get_model()`](https://github.com/ibm-granite/granite-tsfm/blob/main/tsfm_public/toolkit/get_model.py) utility to select the model.

+ - Getting started [[Recipe]](https://github.com/ibm-granite-community/granite-timeseries-cookbook/blob/main/recipes/Time_Series/Time_Series_Getting_Started.ipynb) [[colab]](https://colab.research.google.com/github/ibm-granite/granite-tsfm/blob/main/notebooks/hfdemo/ttm_getting_started.ipynb)
+ - Getting started with IBM watsonx [[Recipe]](https://github.com/ibm-granite-community/granite-timeseries-cookbook/blob/main/recipes/Time_Series/Getting_Started_with_WatsonX_AI_SDK.ipynb)
  - Zeroshot Multivariate Forecasting [[Example]](https://github.com/ibm-granite/granite-tsfm/blob/main/notebooks/hfdemo/ttm_getting_started.ipynb)
  - Finetuned Multivariate Forecasting:
  - Channel-Independent Finetuning [[Example 1]](https://github.com/ibm-granite/granite-tsfm/blob/main/notebooks/hfdemo/ttm_getting_started.ipynb) [[Example 2]](https://github.com/ibm-granite/granite-tsfm/blob/main/notebooks/hfdemo/tinytimemixer/ttm_m4_hourly.ipynb)
  - Channel-Mix Finetuning [[Example]](https://github.com/ibm-granite/granite-tsfm/blob/main/notebooks/tutorial/ttm_channel_mix_finetuning.ipynb)
+ - TTM r2 release (extended features released in October 2024):
+ - Finetuning and Forecasting with Exogenous/Control Variables [[Recipe 1]](https://github.com/ibm-granite-community/granite-timeseries-cookbook/blob/main/recipes/Time_Series/Few-shot_Finetuning_and_Evaluation.ipynb) [[Recipe 2]](https://github.com/ibm-granite-community/granite-timeseries-cookbook/blob/main/recipes/Time_Series/Bike_Sharing_Finetuning_with_Exogenous.ipynb)
  - Finetuning and Forecasting with static categorical features [Example: To be added soon]
  - Rolling Forecasts - Extend forecast lengths via rolling capability. Rolling beyond 2*forecast_length is not recommended. [[Example]](https://github.com/ibm-granite/granite-tsfm/blob/main/notebooks/hfdemo/ttm_rolling_prediction_getting_started.ipynb)
  - Helper scripts for optimal Learning Rate suggestions for Finetuning [[Example]](https://github.com/ibm-granite/granite-tsfm/blob/main/notebooks/tutorial/ttm_with_exog_tutorial.ipynb)
+ - TTM r2.1 release:
+ - GIFT-Eval benchmark [[notebook]](https://github.com/SalesforceAIResearch/gift-eval/blob/main/notebooks/ttm.ipynb)

+ ### Usage guidelines
+ 1. Users have to externally standard scale their data independently for every channel before feeding it to the model (refer to [`TimeSeriesPreprocessor`](https://github.com/IBM/tsfm/blob/main/tsfm_public/toolkit/time_series_preprocessor.py), our data processing utility for data scaling); see the sketch below these guidelines.
+ 2. The current open-source version supports only minutely and hourly resolutions (e.g., 10 min, 15 min, 1 hour). Other lower resolutions (say, monthly or yearly) are currently not supported in this version, as the model needs a minimum context length of 512 or 1024. With the r2.1 release, we now also support daily and weekly resolutions.
+ 3. Enabling any upsampling or prepending zeros to virtually increase the context length for shorter-length datasets is not recommended and will impact the model performance.
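
A minimal sketch of guideline 1 using plain scikit-learn scaling (the file name and channel names are hypothetical; in practice the `TimeSeriesPreprocessor` utility linked above can handle this step for you):

```
import pandas as pd
from sklearn.preprocessing import StandardScaler

df = pd.read_csv("my_timeseries.csv")        # hypothetical dataset, one column per channel
channels = ["temperature", "load"]           # hypothetical channel (target) columns

# Fit the scaler on the training split only, independently for every channel,
# then apply the same transform to all data fed to TTM.
train_df = df.iloc[: int(0.8 * len(df))]
scaler = StandardScaler().fit(train_df[channels])
df[channels] = scaler.transform(df[channels])
```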
+ ### Automatic model selection
+ Automatic model selection based on context length, prediction length, and other requirements can be done through the `get_model()` function. For reference, the signature of the function is provided below:
+ ```
+ def get_model(
+     model_path: str,
+     model_name: str = "ttm",
+     context_length: Optional[int] = None,
+     prediction_length: Optional[int] = None,
+     freq_prefix_tuning: bool = False,
+     freq: Optional[str] = None,
+     prefer_l1_loss: bool = False,
+     prefer_longer_context: bool = True,
+     force_return: Optional[str] = None,
+     return_model_key: bool = False,
+     **kwargs,
+ ) -> Union[str, PreTrainedModel]:
+     """TTM Model card offers a suite of models with varying `context_length` and `prediction_length` combinations.
+     This wrapper automatically selects the right model based on the given input `context_length` and
+     `prediction_length`, abstracting away the internal complexity.
+
+     Args:
+         model_path (str): HuggingFace model card path or local model path (Ex. ibm-granite/granite-timeseries-ttm-r2)
+         model_name (str, optional): Model name to use. Current allowed values: [ttm]. Defaults to "ttm".
+         context_length (int, optional): Input Context length or history. Defaults to None.
+         prediction_length (int, optional): Length of the forecast horizon. Defaults to None.
+         freq_prefix_tuning (bool, optional): If True, it will prefer TTM models that are trained with frequency prefix
+             tuning configuration. Defaults to False.
+         freq (str, optional): Resolution or frequency of the data. Defaults to None. Allowed values are as
+             per the `DEFAULT_FREQUENCY_MAPPING`.
+         prefer_l1_loss (bool, optional): If True, it will prefer choosing models that were trained with L1 loss or
+             mean absolute error loss. Defaults to False.
+         prefer_longer_context (bool, optional): If True, it will prefer selecting a model with a longer context/history.
+             Defaults to True.
+         force_return (str, optional): This is used to force get_model() to return a TTM model even when the provided
+             configurations don't match the existing TTMs. It gets the closest TTM possible. Allowed values are
+             ["zeropad"/"rolling"/"random_init_small"/"random_init_medium"/"random_init_large"/`None`].
+             "zeropad" = Returns a pre-trained TTM that has a context length higher than the input context length, hence,
+             the user must apply zero-padding to use the returned model.
+             "rolling" = Returns a pre-trained TTM that has a prediction length lower than the requested prediction length,
+             hence, the user must apply a rolling technique to use the returned model to forecast to the desired length.
+             The `RecursivePredictor` class can be utilized in this scenario.
+             "random_init_small" = Returns a randomly initialized small TTM which must be trained before performing inference.
+             "random_init_medium" = Returns a randomly initialized medium TTM which must be trained before performing inference.
+             "random_init_large" = Returns a randomly initialized large TTM which must be trained before performing inference.
+             `None` = `force_return` is disabled. Raises an error if no suitable model is found.
+             Defaults to None.
+         return_model_key (bool, optional): If True, only the TTM model name will be returned, instead of the actual model.
+             This does not download the model, and only returns the name of the suitable model. Defaults to False.
+
+     Returns:
+         Union[str, PreTrainedModel]: Returns the Model, or the model name.
+     """
+ ```
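
For example, a minimal sketch of the call (the import path follows the example shown in the previous revision of this card; 512/96 are illustrative values):

```
from tsfm_public.toolkit.get_model import get_model

model = get_model(
    model_path="ibm-granite/granite-timeseries-ttm-r2",
    context_length=512,     # history length supplied to the model
    prediction_length=96,   # desired forecast horizon
)
# return_model_key=True would instead return only the name of the selected variant,
# without downloading the model.
```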
  ## Benchmarks

  - TTM-A referred in the paper maps to the 1536 context models.

  The pre-training dataset used in this release differs slightly from the one used in the research
+ paper, which may lead to minor variations in model performance as compared to the published results. Please refer to our paper for more details. Benchmarking scripts can be found [here](https://github.com/ibm-granite/granite-tsfm/tree/main/notebooks/hfdemo/tinytimemixer/full_benchmarking).

  For more details on TTM architecture and benchmarks, refer to our [paper](https://arxiv.org/pdf/2401.03955.pdf).

+ TTM currently supports two modes:

  - **Zeroshot forecasting**: Directly apply the pre-trained model on your target data to get an initial forecast (with no training).

  - **Finetuned forecasting**: Finetune the pre-trained model with a subset of your target data to further improve the forecast.

+ Since TTM models are extremely small and fast, it is practically very easy to finetune the model with your available target data in a few minutes to get more accurate forecasts.

  The current release supports multivariate forecasting via both channel independence and channel-mixing approaches.
  Decoder Channel-Mixing can be enabled during fine-tuning for capturing strong channel-correlation patterns across
+ time-series variates, a critical capability lacking in existing counterparts. In addition, TTM also supports exogenous infusion and static categorical data infusion.

+ The r2.1 release builds upon the above, adding improved accuracy for shorter context lengths and daily/weekly resolutions, combined with a larger pre-training dataset.
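
The two modes look roughly as follows; a minimal sketch adapted from the example removed earlier in this commit (`dset_train`, `dset_val`, `dset_test` and the `TrainingArguments` values are placeholders you must supply for your own data):

```
from transformers import Trainer, TrainingArguments
from tsfm_public.toolkit.get_model import get_model

model = get_model("ibm-granite/granite-timeseries-ttm-r2", context_length=512, prediction_length=96)

# Zero-shot: evaluate the pre-trained model directly on your test split.
zeroshot_trainer = Trainer(model=model, args=TrainingArguments(output_dir="ttm_zeroshot"))
zeroshot_output = zeroshot_trainer.evaluate(dset_test)

# Few-shot / fine-tuning: freeze the backbone and train only the decoder/head.
finetune_model = get_model(
    "ibm-granite/granite-timeseries-ttm-r2",
    context_length=512,
    prediction_length=96,
    head_dropout=0.2,  # extra kwargs are forwarded to from_pretrained
)
for param in finetune_model.backbone.parameters():
    param.requires_grad = False

finetune_trainer = Trainer(
    model=finetune_model,
    args=TrainingArguments(output_dir="ttm_finetune", num_train_epochs=10, learning_rate=1e-3),
    train_dataset=dset_train,
    eval_dataset=dset_val,
)
finetune_trainer.train()
fewshot_output = finetune_trainer.evaluate(dset_test)
```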
  ## Training Data


  ## Citation
+ Please cite the following paper if you intend to use our model or its associated architectures/approaches in your
+ work.

  **BibTeX:**


  ## Model Card Authors

+ Vijay Ekambaram, Arindam Jati, Pankaj Dayama, Wesley M. Gifford, Tomoya Sakai, Sumanta Mukherjee, Chandra Reddy and Jayant Kalagnanam


+ ## IBM Public Repository Disclosure

  All content in this repository including code has been provided by IBM under the associated
  open source software license and IBM is under no obligation to provide enhancements,
  updates, or support. IBM developers produced this code as an
  open source project (not as an IBM product), and IBM makes no assertions as to
+ the level of quality nor security, and will not be maintaining this code going forward.