Foresight of Macro Environment in Agribusiness: Dynamic Relationships of Food Consumption and Agricultural Production (Analysis of the Relations between Agricultural Production and Domestic Consumption)

The article is devoted to an important economic element of agribusiness foresight: the analysis of general economic conditions, and specifically the search for links between agricultural production as a whole and domestic consumption of basic food products. The main research methods were LASSO regression (a linear model with a regularizer) and the Gradient Boosting Machine (GBM) algorithm. Time series from 1991 to 2018, transformed into annual percentage changes, were used as the initial data. First, a preliminary selection of features was made; then the two corresponding models were built with the selected hyperparameters. As an additional target variable, a production series was created with a time lag of one period relative to the other series. The degrees of linear and nonlinear relationship are determined for various types of food, and differences between the crop production and animal husbandry aggregates are revealed. General recommendations are given within the framework of the policy for developing the institutional environment of agribusiness and its foresight systems.


Introduction
Modern agribusiness management functions go beyond simplified planning and forecasting and approach full-fledged foresight systems [1, p. 48-54] for decision-making across a wide range of business activities. Further digitalization and the spread of the Internet of Things (including in the means of production) help to reveal the internal potential of the enterprise for effective activity [2]. At the same time, given the economic difficulties and uncertainty, especially in the functioning of the domestic agro-industrial complex, it is equally important to have a long-term view of economic development both in retrospect (ex post) and looking ahead (ex ante). This requires a fundamental analysis, along different parameters, of the environment in which economic entities operate.
Food consumption is important from two general business perspectives. The first is that it reflects the level of food security, which depends not only on covering domestic needs with food but also on the availability of that food to the general public. The second concerns the development of domestic consumption as such, since growth of domestic consumption is a cornerstone of economic growth in the country. The leading role here belongs to the agricultural sector, since food is, in fact, the main result of its functioning. Thus, the urgent task of the study is to identify the interactions between food consumption and agricultural production within the national economic system and to assess their strength.

Methods and Materials
Data on the consumption of basic foodstuffs in the Russian Federation in physical terms are used as the initial features (independent variables), and the agricultural production index, also for the country as a whole, is used as the target variable (Tables 1 and 2). To simplify handling of these data later in the text, each series is assigned a letter code. Table 2 presents the values of the agricultural production indices in the Russian Federation. This yields an initial data frame consisting of 29 observations (years), 9 features or independent variables (types of main products) and a target variable in the form of the agricultural production index. To avoid spurious correlation and regression between the time series, we transform the features into series of year-on-year percentage changes (growth). The target variable is, in fact, already a percentage change (growth index); to obtain the gain, 100 was subtracted from it. We then checked the feasibility of using a time lag of Y at the input (a distributed-lag model) and at the output, as an additional target variable. The series was shifted by one period so that the future values Yt+1 (Y lag) correspond to the independent variables of the current year.
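The transformation described above can be sketched in a few lines of pandas. This is only an illustration under stated assumptions: the numbers are invented, the column codes B and C stand in for two of the food groups, and the production index is assumed to be expressed with the previous year equal to 100.

```python
import pandas as pd

# Hypothetical consumption levels in physical terms for two food groups (codes B and C).
levels = pd.DataFrame(
    {"B": [100.0, 110.0, 99.0, 108.9], "C": [50.0, 50.0, 55.0, 49.5]},
    index=[1991, 1992, 1993, 1994],
)
# Hypothetical agricultural production index (assumed: previous year = 100).
index_y = pd.Series([95.0, 104.0, 101.0, 98.0], index=levels.index, name="Y")

# Features: annual percentage change; the first year has no predecessor and becomes NaN.
features = levels.pct_change() * 100

# Target: the index is already a year-on-year ratio, so subtracting 100 gives the gain.
y = index_y - 100

# Lagged target: shift back by one period so that the row for year t holds Y of year t+1.
y_next = y.shift(-1).rename("Y_t+1")

# Keep only the years for which features, Y and Y_t+1 are all available.
frame = pd.concat([features, y, y_next], axis=1).dropna()
print(frame)
```

Note that both the differencing and the shift cost one observation each, which matters for a sample of fewer than thirty years.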

Table 3. Indicators transformed into a series of percentage changes
Variables for the model were selected as follows. The first step is to remove the variable with the minimum standard deviation; in our case it is K (bread products). Next, we construct the Pearson correlation matrix (Table 4). It is acceptable to select features whose correlation with the target variable is at least 0.5. However, paired scatter plots show that some variables with a lower correlation have nonlinear relationships with the target. We also note that no significant autocorrelation is observed in the Y series. This is additionally confirmed by ARIMA analysis, which identifies it as an integrated series with a second-order moving average, ARIMA(0, 1, 2) [4]. In addition, the correlation of the independent variables with the lagged series Yt+1 is less pronounced than with the original Y. At this stage, only the variables A, H and Yt+1 are removed from the set of inputs of the proposed model; Yt+1 is, however, kept as an additional target variable. The remaining features also exhibit multicollinearity (cross-correlation > 0.5). Two models that are robust to these conditions were chosen for comparison: LASSO regression and the Gradient Boosting Machine (GBM) algorithm. LASSO is resistant to multicollinearity and can be used to select meaningful features. GBM is based on sequential training of decision trees; it helps to capture nonlinear relationships and also ranks features by importance.
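The screening steps can be illustrated with a short sketch. It assumes synthetic data in place of Tables 3 and 4: the columns B, C and K and all numeric values are invented, with K built to be nearly constant and B and C built to be collinear, so that each rule has something to act on.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 28
base = rng.normal(0, 5, n)  # hypothetical common driver of the synthetic series

# Synthetic stand-ins for percentage-change features (not the paper's data).
data = pd.DataFrame({
    "B": base + rng.normal(0, 1, n),
    "C": base + rng.normal(0, 1, n),   # strongly collinear with B by construction
    "K": rng.normal(0, 0.1, n),        # nearly constant series
})
y = pd.Series(0.8 * base + rng.normal(0, 1, n), name="Y")

# Step 1: drop the feature with the minimum standard deviation.
lowest_sd = data.std().idxmin()
screened = data.drop(columns=lowest_sd)

# Step 2: Pearson correlation of each remaining feature with the target.
corr_with_y = screened.corrwith(y)
linear_candidates = corr_with_y[corr_with_y.abs() >= 0.5].index.tolist()

# Step 3: flag feature pairs whose mutual correlation exceeds 0.5 (multicollinearity).
pair_corr = screened.corr(method="pearson")
cols = list(pair_corr.columns)
collinear_pairs = [
    (a, b)
    for i, a in enumerate(cols)
    for b in cols[i + 1:]
    if abs(pair_corr.loc[a, b]) > 0.5
]

print("dropped:", lowest_sd)
print("linear candidates:", linear_candidates)
print("collinear pairs:", collinear_pairs)
```

A correlation threshold only detects linear association, which is why features below 0.5 may still deserve a look in scatter plots before being discarded.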

Results and Discussion
The models are not intended for forecasting, so no separate hold-out test sample was set aside; instead, cross-validation (5 splits) was performed on the entire data set. The main task remains to study the effect of the input consumption variables on the dynamics of agricultural production. Model fitting was performed with the scikit-learn and XGBoost libraries in the Python environment [5,6].
The calculations yielded the following parameters (Table 5). In the linear model of Y with a regularizer, the only coefficients greater than zero are those of C and D, and C is the most significant. However, the quality of the model as measured by R² is very low: it explains only 6% of the variability of Y. The fit of the Yt+1 model is even worse and is not worth further consideration.
GBM shows more interesting results. The ranking of features is the opposite of that in the previous model, perhaps not least because nonlinear patterns in the data are captured. The most significant features are E, B and G, followed by F, C and D. The situation is similar in the model with the time lag. More specifically, within coinciding periods, the following products have the strongest connection with the dynamics of agricultural production: milk and dairy products, sugar, and vegetables and gourds. For these groups the connection with the next year's production remains at a similar level; it increases for meat and meat products and for eggs and egg products, while for fruits and berries it practically disappears.
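A minimal sketch of such a 5-fold evaluation is given below. It rests on several assumptions: the data are synthetic, the hyperparameter values are placeholders rather than the selected ones, and scikit-learn's GradientBoostingRegressor is used as a stand-in for the XGBoost model so that the example depends on a single library.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import Lasso
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in: 28 yearly observations, 6 consumption features.
X = rng.normal(0, 5, size=(28, 6))
# Hypothetical target with one linear and one nonlinear dependence plus noise.
y = 0.5 * X[:, 1] + 3.0 * np.sin(X[:, 3]) + rng.normal(0, 1, 28)

# Five splits over the whole sample, with no separate hold-out set.
cv = KFold(n_splits=5)

models = {
    # Scaling inside the pipeline keeps each fold's statistics separate.
    "lasso": make_pipeline(StandardScaler(), Lasso(alpha=0.1)),
    "gbm": GradientBoostingRegressor(
        n_estimators=100, max_depth=2, learning_rate=0.05, random_state=0
    ),
}

# R^2 per fold for each model.
cv_r2 = {
    name: cross_val_score(model, X, y, cv=cv, scoring="r2")
    for name, model in models.items()
}

for name, scores in cv_r2.items():
    print(f"{name}: mean R^2 = {scores.mean():.2f}, per fold = {np.round(scores, 2)}")
```

With so few observations each fold holds only five or six years, so per-fold R² values can vary widely and are best read together rather than individually.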
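The two kinds of feature ranking compared here, LASSO coefficients and GBM importances, can be obtained as in the following sketch. The data are synthetic and the dependence of the target on C (linear) and E (quadratic) is an assumption made purely for illustration; scikit-learn's GradientBoostingRegressor again stands in for XGBoost, and the hyperparameters are placeholders.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
names = ["B", "C", "D", "E", "F", "G"]

# Synthetic features; the target depends on C linearly and on E quadratically.
X = pd.DataFrame(rng.normal(0, 1, size=(28, len(names))), columns=names)
y = 2.0 * X["C"] + X["E"] ** 2 + rng.normal(0, 0.5, 28)

# LASSO: coefficients on standardized features; weak features are shrunk to exactly zero.
X_std = StandardScaler().fit_transform(X)
lasso = Lasso(alpha=0.1).fit(X_std, y)
lasso_coef = pd.Series(lasso.coef_, index=names)

# GBM: impurity-based importances, normalized to sum to one.
gbm = GradientBoostingRegressor(n_estimators=100, max_depth=2, random_state=0).fit(X, y)
gbm_importance = pd.Series(gbm.feature_importances_, index=names).sort_values(
    ascending=False
)

print("LASSO coefficients:")
print(lasso_coef.round(2))
print("GBM importances:")
print(gbm_importance.round(2))
```

A purely quadratic effect has almost no linear correlation with the target, so a linear model can assign it a zero coefficient while a tree ensemble still ranks it highly; this is one way the two rankings can diverge.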
In other words, the consumption of livestock products shows an inertial reaction, which is a consequence of the characteristics of the livestock industry [7, p. 5-19]. Thus, in the current economic situation, domestic consumption is not a strong driver of agricultural production for most products. Consumption in most of the food time series did not itself show noticeable variability or growth trends, and if the macroeconomic background for consumers deteriorates, it is likely to decline.
At the institutional level, the management strategy for the agro-industrial complex, as part of the implementation of state development programs, requires adjustment [8, p. 82-87], including in light of the above. Support for exports of highly processed products should play an important role here. A more intensive implementation of digital elements in the supply chain, integrated foresight and the integration of science, technology and innovation [9] will help increase the efficiency of agribusiness. The flip side of the process will be explosive growth of data requiring processing and interpretation [10]. Improving the systems of long-term planning and decision-making based on the analysis of big data and risks is impossible without the involvement of the academic environment and the expert community [11, p. 265-266], as well as the strengthening of local public-private partnerships. This is especially true in the context of technological sanctions.
The results of this work also outline further research tasks that may involve a deeper study of domestic consumption, the influence of price factors, and export and import analysis. It is also useful to assess the role of production of the non-food component in agriculture, to comprehensively study the complex system of mutual influence of related areas of the agricultural sector at the regional and federal levels.

Conclusion
The article presents a regression analysis of the relationship between the consumption of basic food products in physical terms and the dynamics of agricultural production from 1991 to 2018. All time series were transformed into annual percentage changes, and a preliminary selection of features was made. As an additional response, the production series shifted by a time lag was also considered. A linear model with the LASSO regularizer and GBM were used. It was determined that most of the relationships are not linear in nature. The effect varies by food group; for example, livestock products show greater inertia over time. It can be concluded that the impact of domestic consumption on the dynamics of agricultural production is insufficient and requires additional attention within the development policy for the institutional environment of agribusiness as a whole and its foresight systems.