How Do Protein Levels Drive Black Sea Wheat Prices?
How do Wheat protein level variations impact prices in the Black Sea?
When it comes to buying or trading Wheat, protein levels are one of the main price drivers. Because protein levels are volatile, they can put traders and purchasing managers in stressful situations. For instance, if you are a purchasing manager and cannot find Wheat whose quality specs align with your internal standards, you could face difficulties during the production process and, ultimately, financial losses. It is not all grim, however: if you are trading, disparities between protein levels and pricing can offer arbitrage opportunities and a means of risk mitigation.
Know when to adapt your Wheat sourcing strategy
Data provided by SGS, the world’s leading inspection, verification, testing and certification company
Methodology
To determine whether variations in protein level impact Wheat prices in the Black Sea, this study cross-references Black Sea Wheat prices from November 2018 to May 2020 with their respective protein level variations, using Pearson, Spearman, and Mutual Information correlation analysis to identify correlated combinations. A pool of these combinations, chosen with a varying delta in protein levels, is then tested for collinearity. Finally, a distance analysis determines the price variation dynamics, giving insight into predictability and acceleration patterns that could inform relative trading.
Figure 1: Price Russia Wheat CPT Russia per Protein Level (Jul 18 to Jun 19)
This analysis compares 15 Wheat and Milling Wheat products of varying protein quality from Russia, Romania, and Ukraine:
• Russia Wheat CPT 10.5 Russia
• Russia Wheat CPT 11 Russia
• Russia Wheat CPT 11.5 Russia
• Russia Wheat CPT 12 Russia
• Russia Wheat CPT 12.5 Russia
• Russia Wheat CPT 13.5 Russia
• Russia Wheat CPT 14.5 Russia
• Russia Milling Wheat 10.5 FOB Russia
• Russia Milling Wheat 11.5 FOB Russia
• Russia Milling Wheat 12.5 FOB Russia
• Russia Milling Wheat 13.5 FOB Russia
• Romania Bulgaria Milling Wheat FOB
• Ukraine Milling Wheat 11.5 FOB Ukraine
• Ukraine Wheat CFR Indonesia
• Ukraine Wheat CFR Malaysia
Step 1: Extraction of protein quality data for the different origins
Step 2: Correlation analysis for different distributions
Step 3: Cointegration analysis for ticker combinations
Step 4: Distance analysis of clustered curves
Step 5: Results analysis for protein correlations
Correlation Analysis
The first step in understanding the relationship between the different protein qualities across origins and products is to compute correlation coefficients. Pearson and Spearman correlations are both applied to account for different data distributions: Pearson measures linear co-movement, while Spearman works on ranks and captures any monotonic relationship. Mutual Information scores are also applied because they treat the two curves as clusters and score their overall similarity. For Pearson and Spearman, the closer the coefficient is to 1, the stronger the correlation between the two series; for Mutual Information, a higher score means the two series share more information.
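The three scores can be sketched in a few lines of standard-library Python. This is only an illustration, not the study's actual implementation: the function names, the equal-width binning used for the Mutual Information estimate, and the price values are all assumptions made for the example.

```python
import math
from collections import Counter


def pearson(x, y):
    """Pearson correlation: linear co-movement of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """Average 1-based ranks, so tied values share the same rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson applied to the ranks of each series."""
    return pearson(ranks(x), ranks(y))


def mutual_information(x, y, bins=4):
    """Histogram estimate of mutual information in nats (assumed binning)."""
    def binned(v):
        lo, hi = min(v), max(v)
        width = (hi - lo) / bins or 1.0  # guard against a constant series
        return [min(int((a - lo) / width), bins - 1) for a in v]

    bx, by = binned(x), binned(y)
    n = len(x)
    pxy, px, py = Counter(zip(bx, by)), Counter(bx), Counter(by)
    return sum(
        (c / n) * math.log((c / n) / ((px[i] / n) * (py[j] / n)))
        for (i, j), c in pxy.items()
    )


# Invented prices for two protein grades; the higher grade trades at a premium.
low_protein = [190, 192, 195, 193, 198, 202, 205, 204]
high_protein = [p + 15 for p in low_protein]

print(pearson(low_protein, high_protein))
print(spearman(low_protein, high_protein))
print(mutual_information(low_protein, high_protein))
```

Because the invented high-protein series is just the low-protein one plus a fixed premium, Pearson and Spearman both come out at 1; real tickers would show the spread of values reported in the figures.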
Figure 2: High Correlation Coefficients for different Protein Qualities Products combinations
Figure 3: Low Correlation Coefficients for different Protein Qualities Products combinations
The Pearson and Spearman columns in Figures 2 and 3 indicate that products with the same origin are more strongly correlated (fig. 2) than products with the same protein percentage (fig. 3). As for protein levels, the more they differ, the lower the correlation.
The Mutual Information column reveals that combinations with the same origin, destination, and incoterm but differing protein levels have a lower score (fig. 2), whereas combinations with different origins, destinations, and incoterms but the same protein percentage have a relatively higher score compared with Pearson and Spearman (fig. 3).
In other words, the prices of Wheat products with the same protein percentage but different origins and destinations will not necessarily evolve in the same way. This suggests that, for similar Wheat products, price variations in different markets are independent of one another. Therefore, one cannot look solely at protein levels to derive price evolutions.
Analyze 100,000 Wheat samples results (2016-2020)
Cointegration Analysis
Cointegration tests whether two time series move together over the long run, which this study uses as a measure of collinearity to highlight similarities (or the lack thereof). The test sets a null hypothesis of no cointegration against an alternative hypothesis of cointegration, and the chosen significance level (the accepted margin of uncertainty, in percent) determines which critical value applies. The test returns a t-value, a p-value, and three critical values.
Figure 4: Table of Cointegration hypothesis test results for Combinations of Russia Wheat CPT 10.5 Russia with Russia Wheat CPT 11.5, 12.5, 13.5, and 14.5 Russia
The ticker combinations used in the cointegration analysis have different protein levels. The gap between them widens from 1 to 4 percentage points (lines 1 to 4).
If there is indeed collinearity, the returned p-value must be below the chosen significance level and the t-value must be smaller (more negative) than the corresponding critical value. However, this is not the case for any of the tests. Hence, the null hypothesis of no cointegration cannot be rejected, and none of the combinations shows collinearity.
As such, this test corroborates the small correlations found for these ticker combinations with the Pearson and Spearman coefficients. It also means that combinations with high correlations are not collinear. Therefore, it is not possible to predict the price evolution of one wheat protein product from another, even when the two are the best-correlated pair or close to it.
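To make the decision rule concrete, here is a minimal sketch of a two-step Engle-Granger style test in standard-library Python. It is an assumption that the study used this family of test; the sketch omits lag augmentation and does not compute a p-value. The synthetic series and the 5% critical value of about -3.34 (an approximate asymptotic value for two series with a constant) are illustrative only.

```python
import math
import random


def ols(x, y):
    """Least-squares fit y = a + b * x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum(
        (xi - mx) ** 2 for xi in x
    )
    return my - b * mx, b


def engle_granger_t(x, y):
    """Two-step Engle-Granger statistic without lag augmentation.

    Step 1: regress y on x and keep the residuals e.
    Step 2: Dickey-Fuller regression (e[t] - e[t-1]) = rho * e[t-1] + u[t];
            the t-statistic of rho is returned.
    """
    a, b = ols(x, y)
    e = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    lagged = e[:-1]
    diff = [e[t] - e[t - 1] for t in range(1, len(e))]
    sxx = sum(v * v for v in lagged)
    rho = sum(l * d for l, d in zip(lagged, diff)) / sxx
    resid = [d - rho * l for l, d in zip(lagged, diff)]
    sigma2 = sum(r * r for r in resid) / (len(diff) - 1)
    return rho / math.sqrt(sigma2 / sxx)


# Synthetic example: a random walk and a second series tied to it.
rng = random.Random(0)
walk = [0.0]
for _ in range(299):
    walk.append(walk[-1] + rng.gauss(0, 1))
tied = [5 + 2 * w + rng.gauss(0, 0.5) for w in walk]

CRIT_5PCT = -3.34  # assumed approximate asymptotic 5% critical value
t_stat = engle_granger_t(walk, tied)
# Reject "no cointegration" only when the statistic is below the critical value.
print(t_stat, t_stat < CRIT_5PCT)
```

The synthetic pair is built to be cointegrated, so its statistic falls well below the critical value. For the wheat tickers in Figure 4 the reported statistics do not, which is why the null hypothesis stands.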
Distance Analysis
The advantage of distance analysis is that it takes each data point at a specific time and searches the nearby points of the other curve for the one that is least ‘expensive’ to match. In other words, it tries to find the point nearest to it in both value and timestamp. It is another way to find similarities, based on the proximity of points in each comparison. Distances are calculated by generating the path between two curves using the dynamic time warping (DTW) method.
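A textbook version of DTW fits in a short function. The sketch below is a generic implementation with an absolute-difference cost, not necessarily the variant used for the figures, and the two toy series are invented for illustration.

```python
def dtw(a, b):
    """Return (total_cost, path) for the cheapest alignment of a and b.

    cost[i][j] holds the cheapest way to align a[:i] with b[:j]; the path
    is a list of (index_in_a, index_in_b) pairs from start to end.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1]
            )

    # Walk back from the end, always moving to the cheapest predecessor
    # (the diagonal wins ties, which keeps the path as straight as possible).
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        moves = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(moves, key=lambda move: move[0])
    path.reverse()
    return cost[n][m], path


# Toy series: "late" repeats "early" with a one-step delay.
early = [1.0, 2.0, 3.0, 4.0, 4.0]
late = [1.0, 1.0, 2.0, 3.0, 4.0]

same_cost, same_path = dtw(early, early)
lag_cost, lag_path = dtw(early, late)
print(same_path)
print(lag_path)
```

Aligning a series with itself gives a straight diagonal path, the ideal case. The delayed copy still aligns at zero cost, but only by bending the path away from the diagonal, which is the kind of distortion the distance graphs make visible.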
Figure 5: Ideal Distance Analysis Path
Figure 5 compares Russian Wheat products with different protein qualities, as they previously yielded the best correlation results. The idealized path is linear on the graph, which means that the prices of both products over the same period are matched at the least cost, i.e., they are not identical but follow the same dynamics.
Figure 6: Distance Analysis for Russian Wheat 10.5 CPT with 11.5, 12.5, 13.5, and 14.5 protein levels
The four graphs in Figure 6 reveal a pattern: the smaller the difference in protein levels, the more closely the tickers’ distance path fits the ideal one, as with the path between Russian Wheat CPT 10.5 and Russian Wheat CPT 11.5. In other words, the closeness of the prices of these products shows that they are very similar and probably correlated.
On the other hand, when the difference in protein quality level is high (e.g., Russia Wheat CPT 10.5 Russia vs. Russia Wheat CPT 14.5 Russia), the fit between the distance path and the ideal one is much worse. Moreover, we can observe significant distortions in specific periods. This indicates that even though the correlation coefficients show the products to be correlated, they show little to no similarity under distance analysis.
Figure 7: Highlighting the cliffs and plateaux for high-difference protein levels
Looking more closely at the distortions in the paths for high-difference protein level combinations (figure 7), we can observe long ‘plateaux’ (highlighted in yellow) and high ‘cliffs’ (highlighted in red). Although these features also appear in lower-difference combinations, they do not reach the same scale there. We therefore understand that different products have different dynamics linked to their protein quality level. Furthermore, we can interpret the cliffs and plateaux as lags. The cliff highlighted in red in figure 7 shows that prices for Russian 14.5 Protein Wheat increased very rapidly until July 2019, and the Russian 10.5 Protein prices only caught up later the same month, along the plateau that follows. As we compare the 10.5 protein quality Wheat with higher protein levels (thus increasing the difference), the distance path moves further from the ideal one. This reinforces the idea that prices are more likely to be similar if the origin and protein levels of the two products are close.
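One way to quantify such lags is to scan a warping path for stretches that move along a single axis. The helper below is a hypothetical sketch: its name, the `min_length` threshold, the example path, and the mapping of ‘plateau’ and ‘cliff’ to the two axes (which depends on how the figure is oriented) are all assumptions.

```python
def stalled_runs(path, min_length=2):
    """Find stretches where a warping path moves along one axis only.

    path is a list of (i, j) index pairs. Returns (kind, fixed_index, length)
    tuples: "plateau" when the first series stays on one point while the
    second advances, "cliff" for the opposite case. Length counts steps.
    """
    runs = []
    kind, fixed, length = None, None, 0
    for (i0, j0), (i1, j1) in zip(path, path[1:]):
        if i1 == i0 and j1 > j0:
            step = ("plateau", i0)
        elif j1 == j0 and i1 > i0:
            step = ("cliff", j0)
        else:
            step = None  # diagonal move: both series advance together
        if step is not None and step == (kind, fixed):
            length += 1
            continue
        if kind is not None and length >= min_length:
            runs.append((kind, fixed, length))
        if step is None:
            kind, fixed, length = None, None, 0
        else:
            kind, fixed = step
            length = 1
    if kind is not None and length >= min_length:
        runs.append((kind, fixed, length))
    return runs


# Invented path with one three-step plateau and one two-step cliff.
example_path = [
    (0, 0), (1, 1), (1, 2), (1, 3), (1, 4), (2, 5), (3, 5), (4, 5), (5, 6),
]
example_runs = stalled_runs(example_path)
print(example_runs)
```

With daily prices, the length of a run is roughly the number of days one ticker waits for the other, so longer runs in the high-delta combinations would correspond to the larger lags described above.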
Conclusion
Is there a correlation between the prices of different Wheat products with different origins and protein qualities?
In this study, the first step in finding the relationships between wheat products of different protein levels was to look for correlations between their prices. The results showed that products with the same origin were highly correlated. On the other hand, the correlations between the price evolutions of the same product in different markets were insignificant.
Furthermore, the results of the cointegration analysis indicated non-collinearity in all tests performed, showing that Wheat products of different protein levels have no mutual predictive power, despite the strong correlations found earlier.
Lastly, the distance analysis of the highly correlated combinations suggested that the larger the difference between protein levels, the more the prices of one ticker lagged the other and struggled to catch up. This also highlighted important differences in the tickers’ price dynamics over time.
Combinations with a higher protein level delta showed more significant differences in price dynamics than those with a smaller delta, meaning that different protein levels drive price dynamics differently.
In conclusion, this study showed strong correlations between Wheat products with very similar protein levels. However, these combinations proved to have no predictive value, as there was no collinearity between tickers, even for the low-delta, high-correlation ones. Furthermore, the distance analysis demonstrated that price differences were not stable over time, and that price lags existed even for the smaller protein level differences.
Looking back at the correlation coefficient analysis, the low correlation found for similar products shipped to different markets raises the question of how wheat protein products’ prices behave across markets.