**The kNN Euclidean Forecast SMA**

The kNN Euclidean Forecast SMA is a predictive model designed to forecast the price of the next bar in a time series. The model relies on the k-nearest neighbors (kNN) algorithm, where k is the number of nearest neighbors used for prediction. The general idea is to create training records by taking a sequence of consecutive prices, using a portion of them as predictor values and the price that follows as the prediction value. Sliding this sequence along the series yields many training records from a single time series.
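The construction of training records described above can be sketched in a few lines of Python. This is only an illustration, not the indicator's own code: the function name `make_training_records` and the parameter `n_predictors` are hypothetical, and the sketch assumes every record uses the same number of consecutive prices as predictors.

```python
def make_training_records(prices, n_predictors):
    """Slide a window over the series; each record pairs
    n_predictors consecutive prices with the price that followed them."""
    records = []
    for start in range(len(prices) - n_predictors):
        predictors = prices[start:start + n_predictors]
        target = prices[start + n_predictors]
        records.append((predictors, target))
    return records


# Six prices with three predictors per record give three training records.
prices = [10.0, 11.0, 12.0, 11.5, 12.5, 13.0]
records = make_training_records(prices, n_predictors=3)
for predictors, target in records:
    print(predictors, "->", target)
```

A series of n prices yields n minus `n_predictors` records, which is why even a short chart history produces a usable training set.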

**Step-by-Step kNN Process**

1. Optimize and Determine k-Value: The first step is to determine the k-value, the number of nearest neighbors to use for prediction. Finding a good k-value is essential because it directly affects the accuracy of the model. A Python script that uses machine learning can be employed to analyze backtested historical data and identify the best k-value.

2. Calculate the Distance: Once the k-value is determined, the distance between the instance and every training sample is calculated. For the one-dimensional distance, we simply take the absolute difference between the instance value x and the training value v, | x - v |. The Euclidean distance is sqrt((x_1 - v_1)^2 + ... + (x_n - v_n)^2), where x_i are the instance's predictor values and v_i are the corresponding values from the training sample. With a single predictor value this reduces to sqrt((x - v)^2), which equals | x - v |.

3. Rank and Determine Nearest Neighbors: The distances are then ranked in ascending order, and the k training samples with the smallest distances are selected as the nearest neighbors. These neighbors are the instances most similar to the current one.

4. Gather Nearest Neighbor Values: The values of the nearest neighbors are gathered to use them for the prediction process.

5. Use Average of Nearest Neighbors: The prediction value of the instance is calculated as the average of the values of the nearest neighbors, so every selected neighbor contributes equally to the result.

6. Incorporate Prediction in Forecast Line: The prediction value is used in the forecast line calculations. Because the model runs recursively, each predicted point is fed back in and becomes part of the prediction calculation for subsequent instances.
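The six steps above can be sketched end to end in Python. This is a minimal illustration under stated assumptions, not the indicator's implementation: all function names (`knn_predict`, `knn_forecast`, `choose_k`) are hypothetical, the k-value is chosen by a simple walk-forward check on mean absolute error (the text does not specify how the optimizer scores candidates), and ties in distance are broken by position in the series.

```python
import math


def euclidean_distance(a, b):
    # Step 2: sqrt of summed squared differences; equals |x - v| for one value.
    return math.sqrt(sum((x - v) ** 2 for x, v in zip(a, b)))


def knn_predict(history, n_predictors, k):
    """Predict the value that follows the last n_predictors values of history."""
    if len(history) <= n_predictors:
        raise ValueError("history is too short to build any training record")
    instance = history[-n_predictors:]
    scored = []
    for start in range(len(history) - n_predictors):
        predictors = history[start:start + n_predictors]
        target = history[start + n_predictors]
        scored.append((euclidean_distance(instance, predictors), target))
    scored.sort(key=lambda pair: pair[0])                # Step 3: rank distances
    nearest = [target for _, target in scored[:k]]       # Step 4: gather values
    return sum(nearest) / len(nearest)                   # Step 5: average


def knn_forecast(prices, n_predictors, k, steps):
    """Step 6: forecast recursively, feeding each prediction back in."""
    history = list(prices)
    forecast = []
    for _ in range(steps):
        prediction = knn_predict(history, n_predictors, k)
        forecast.append(prediction)
        history.append(prediction)
    return forecast


def choose_k(prices, n_predictors, candidates, n_test):
    """Step 1 (assumed scoring): keep the k with the lowest one-step
    absolute error over the last n_test bars; ties keep the earlier candidate."""
    best_k, best_error = None, math.inf
    for k in candidates:
        error = sum(
            abs(knn_predict(prices[:t], n_predictors, k) - prices[t])
            for t in range(len(prices) - n_test, len(prices))
        )
        if error < best_error:
            best_k, best_error = k, error
    return best_k


prices = [1.0, 2.0, 1.0, 2.0, 1.0, 2.0, 1.0, 2.0]
k = choose_k(prices, n_predictors=2, candidates=[1, 2, 3], n_test=2)
print(k, knn_forecast(prices, n_predictors=2, k=k, steps=3))
```

On this perfectly alternating series the nearest neighbors always match the current pattern exactly, so the recursive forecast simply continues the alternation. Real price data will rarely be this clean, which is where the choice of k starts to matter.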

**Step-by-Step Forecast Construction**

1. The forecast calculation starts when the last bar of the chart is reached (barstate.islast).

2. The code sets up variables and arrays to process the forecast for each bar.

3. It defines a calculationWindow array that holds historical prices from the last forecastWindowLength * 2 + evaluationWindowLength bars. The independentVariable array is used to store corresponding bar indexes for linear regression (if applicable).

4. The code then creates a referenceWindow, which contains the first forecastWindowLength - 1 elements from the calculationWindow.

5. Next, the algorithm goes through an evaluation process, comparing the similarity or dissimilarity of the referenceWindow with subsequent windows of evaluationWindowLength elements in the calculationWindow.

6. The correlation between the referenceWindow and each evaluation window is calculated from their covariance (correlation is the covariance normalized by the standard deviations of the two windows).

7. Based on the chosen forecastMode ("Similarity" or "Dissimilarity"), the algorithm maximizes or minimizes the correlation values to find the best match (highest correlation for "Similarity" or lowest correlation for "Dissimilarity").

8. The variable windowOffset is used to keep track of the position of the best-matching window.

9. Once the best match is found, the algorithm calculates the forecasted value for each bar in the forecastWindow.

10. The forecastConstructionMethod determines how the forecast values are constructed:

- "Cumulative": The forecast value is added to the current value at each bar in the forecastWindow.

- "Mean": The forecast value is added to the average of the referenceWindow.

- "Linreg" (Linear Regression): A linear regression model is used to predict future values based on historical data and the bar index.
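The window matching and forecast construction steps above can be sketched in Python as follows. This is a simplified illustration and makes several assumptions that the description does not confirm: the reference window is taken to be the most recent `window_length` prices, the "forecast value" is taken to be the bar-to-bar change that followed the best-matching window, and the parameter names are hypothetical stand-ins for the script's inputs. The "Linreg" method is left out of the sketch.

```python
import math


def pearson(a, b):
    """Correlation: covariance normalized by both standard deviations."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a flat window has no defined correlation; treat as 0
    return cov / math.sqrt(var_a * var_b)


def best_window_offset(prices, window_length, forecast_length, mode="Similarity"):
    """Return the start of the window that best (or worst) matches the reference."""
    last_start = len(prices) - window_length - forecast_length
    if last_start < 0:
        raise ValueError("not enough prices for the requested window lengths")
    reference = prices[-window_length:]
    best_offset, best_corr = None, None
    for offset in range(last_start + 1):
        corr = pearson(reference, prices[offset:offset + window_length])
        if best_corr is None:
            better = True
        elif mode == "Similarity":
            better = corr > best_corr   # maximize correlation
        else:
            better = corr < best_corr   # "Dissimilarity": minimize correlation
        if better:
            best_offset, best_corr = offset, corr
    return best_offset


def build_forecast(prices, window_length, forecast_length,
                   mode="Similarity", method="Cumulative"):
    offset = best_window_offset(prices, window_length, forecast_length, mode)
    end = offset + window_length
    follow = prices[end - 1:end + forecast_length]
    # Bar-to-bar changes that followed the matched window.
    changes = [follow[i + 1] - follow[i] for i in range(forecast_length)]
    if method == "Cumulative":
        forecast, level = [], prices[-1]
        for change in changes:
            level += change          # add each change to the running value
            forecast.append(level)
        return forecast
    if method == "Mean":
        reference = prices[-window_length:]
        base = sum(reference) / len(reference)
        return [base + change for change in changes]
    raise ValueError("method must be 'Cumulative' or 'Mean' in this sketch")


prices = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
print(build_forecast(prices, window_length=3, forecast_length=2))
print(build_forecast(prices, window_length=3, forecast_length=2, method="Mean"))
```

In "Cumulative" mode the historical changes are replayed from the last price, so the forecast continues from where the chart ends. In "Mean" mode each change is offset from the reference window's average instead, which gives a forecast that does not accumulate.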

**Strengths and Weaknesses**

Strengths:

- Simple and easy-to-understand indicator.

- Works well with a small number of features or dimensions.

- Handles non-linear relationships effectively.

- Can be helpful in noisy data, as it relies on local information.

- Could be extended with deep learning neural networks in future updates.

Weaknesses:

- Sensitive to the choice of k, which may impact prediction accuracy.

- Does not use indicators within feature groups for its training sets.

- It can be computationally expensive with large datasets and higher values of k.

- Assumes equal importance of all features, which might not always hold true.

- Prone to the "curse of dimensionality" when dealing with high-dimensional data.

TL;DR:

Use this indicator to forecast future prices. The forecast can be calculated with machine learning, using either one-dimensional or Euclidean distance. The chart script also lets you change the target timeframe, switch between Euclidean distance and one-dimensional calculations, and choose whether or not to use machine learning predicted values. The 1-hour candle timeframe is suggested to minimize false signals. The value of k (the number of nearest neighbors to consider), the number of evaluation bars, and the number of prediction bars can all be optimized.

Disclaimer: The information presented in this publication is for educational purposes only and should not be considered as financial advice. Always conduct thorough research and seek advice from financial professionals before making any investment decisions.