The Role Of Data In Financial Modeling And Risk Management

 

Much emphasis has been placed on developing accurate and robust financial models, whether for pricing, trading, or risk management. However, a crucial yet often overlooked component of any quantitative system is the reliability of the underlying data. In this post, we explore some issues with financial data and how to address them.
 

How to Deal with Missing Financial Data?

In the financial industry, data enables managers to make informed decisions and manage risk effectively. Despite this importance, financial data is often missing or incomplete. Financial data can be difficult to obtain due to a lack of standardization and regulatory requirements. Incomplete or inaccurate data can lead to flawed analysis, incorrect decisions, and increased risk.

Reference [1] studied missing data in firm fundamentals and proposed a method for imputing the missing values.
 

Findings

-Missing financial data affects more than 70% of firms, representing approximately half of total market capitalization.

-The authors find that missing firm fundamentals exhibit complex, systematic patterns rather than occurring randomly, making traditional ad-hoc imputation methods unreliable.

-They propose a novel imputation method that utilizes both time-series and cross-sectional dependencies in the data to estimate missing values.

-The method accommodates general systematic patterns of missingness and generates a fully observed panel of firm fundamentals.

-The paper demonstrates that addressing missing data properly has significant implications for estimating risk premia, identifying cross-sectional anomalies, and improving portfolio construction.

-The issue of missing data extends beyond firm fundamentals to other financial domains such as analyst forecasts (I/B/E/S), ESG ratings, and other large financial datasets.

-The problem is expected to be even more pronounced in international data and with the rapid expansion of Big Data in finance.

-The authors emphasize that as data sources grow in volume and complexity, developing robust imputation methods will become increasingly critical.

In summary, the paper provides foundational principles and general guidelines for handling missing data, offering a framework that can be applied to a wide range of financial research and practical applications.
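
To make the idea of combining time-series and cross-sectional information concrete, here is a minimal Python sketch. It is not the estimator of Reference [1]; it is a simple stand-in that blends a firm's last observed value with the cross-sectional median for the same period. The function name impute_panel, the weight_ts parameter, and the toy panel are all hypothetical.

```python
from statistics import median


def impute_panel(panel, weight_ts=0.5):
    """Fill None entries in a firm-by-period panel (illustrative only).

    panel: dict mapping firm name -> list of values (None = missing),
           with all lists having the same length.
    weight_ts: hypothetical blending weight on the time-series estimate.
    """
    n_periods = len(next(iter(panel.values())))

    # Cross-sectional estimate: median of the firms observed in each period.
    cross = []
    for t in range(n_periods):
        observed = [row[t] for row in panel.values() if row[t] is not None]
        cross.append(median(observed) if observed else None)

    filled = {}
    for firm, row in panel.items():
        out, last = [], None
        for t, value in enumerate(row):
            if value is not None:
                last = value  # remember the most recent observed value
                out.append(value)
                continue
            # Time-series estimate: the firm's most recent observed value.
            candidates = [(last, weight_ts), (cross[t], 1.0 - weight_ts)]
            usable = [(v, w) for v, w in candidates if v is not None and w > 0]
            if usable:
                total = sum(w for _, w in usable)
                out.append(sum(v * w for v, w in usable) / total)
            else:
                out.append(None)  # nothing to borrow from; leave the gap
        filled[firm] = out
    return filled


# Toy panel: three firms, three periods, one missing value each.
panel = {
    "A": [1.0, None, 3.0],
    "B": [2.0, 2.0, None],
    "C": [None, 4.0, 5.0],
}
filled = impute_panel(panel)
```

In this toy panel, firm A's missing middle value becomes 2.0, the average of its last observation (1.0) and that period's cross-sectional median (3.0). A blend like this ignores the systematic missingness patterns highlighted in the paper, so it should be read as an illustration only.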

We think that the proposed data imputation methods can be applied not only to fundamental data but also to financial derivatives data, such as options.

Reference

[1] Svetlana Bryzgalova, Sven Lerner, Martin Lettau, and Markus Pelger, Missing Financial Data, SSRN 4106794
 

Predicting Realized Volatility Using High-Frequency Data: Is More Data Always Better?

A common belief in strategy design is that ‘more data is better.’ But is this always true? Reference [2] examined how the quantity of data affects the prediction of realized volatility. Specifically, it focused on the accuracy of volatility forecasts as a function of data sampling frequency. The study used Brent crude oil futures data, with GARCH models as the volatility forecasting method.
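
For readers less familiar with GARCH, the sketch below shows the variance recursion of GARCH(1,1), the simplest member of the family. We assume this specification for illustration only; the parameter values and returns are hypothetical, and in practice the parameters would be estimated by maximum likelihood rather than fixed by hand.

```python
def garch11_forecast(returns, omega, alpha, beta):
    """One-step-ahead conditional variance from a GARCH(1,1) recursion.

    sigma2[t+1] = omega + alpha * r[t]**2 + beta * sigma2[t]
    """
    if alpha + beta >= 1.0:
        raise ValueError("alpha + beta must be < 1 for a finite long-run variance")
    # Start the recursion at the unconditional (long-run) variance.
    sigma2 = omega / (1.0 - alpha - beta)
    for r in returns:
        sigma2 = omega + alpha * r * r + beta * sigma2
    return sigma2


# Hypothetical daily returns and parameters, for illustration only.
returns = [0.01, -0.02, 0.015, -0.005]
forecast_var = garch11_forecast(returns, omega=1e-6, alpha=0.1, beta=0.85)
forecast_vol = forecast_var ** 0.5
```

Changing the sampling frequency changes the return series fed into this recursion, which is why the estimated parameters and the resulting forecasts can differ across frequencies.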
 

Findings

-The research explores whether increased data availability through higher-frequency sampling leads to improved forecast precision.

-The study employs several GARCH models using Brent crude oil futures data to assess how sampling frequency influences forecasting performance.

-In-sample results show that higher sampling frequencies improve model fit, indicated by lower AIC/BIC values and higher log-likelihood scores.

-Out-of-sample analysis reveals a more complex picture: higher sampling frequencies do not consistently reduce forecast errors.

-Regression analysis demonstrates that variations in forecast errors are only marginally explained by sampling frequency changes.

-Both linear and polynomial regressions yield similar results, with low adjusted R² values and weak correlations between frequency and error metrics.

-The findings challenge the prevailing assumption that higher-frequency data necessarily enhance forecast precision.

-The study concludes that lower-frequency sampling may sometimes yield better forecasts, depending on model structure and data quality.

-The paper emphasizes the need to balance the benefits and drawbacks of high-frequency data collection in volatility prediction.

-It calls for further research across different assets, markets, and modeling approaches to identify optimal sampling frequencies.
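
The in-sample and out-of-sample measures mentioned in the findings can be written down in a few lines. The sketch below uses the standard definitions of AIC, BIC, and root mean squared error; the input numbers are hypothetical and are not taken from the study.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: lower means better in-sample fit."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: penalizes parameters by log(n_obs)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


def rmse(forecasts, realized):
    """Root mean squared error between forecasts and realized values."""
    squared = [(f - r) ** 2 for f, r in zip(forecasts, realized)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical fit: log-likelihood of -100 with 3 parameters on 100 observations.
in_sample_aic = aic(-100.0, 3)
in_sample_bic = bic(-100.0, 3, 100)

# Hypothetical volatility forecasts versus realized values.
out_of_sample_rmse = rmse([0.20, 0.25, 0.22], [0.18, 0.30, 0.21])
```

Lower AIC and BIC indicate a better in-sample fit, while a lower RMSE indicates better out-of-sample accuracy. The two need not agree, which is the point the study makes.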

In short, increasing the data sampling frequency improves in-sample fit. However, it does not consistently improve out-of-sample prediction accuracy, and in some cases lower-frequency sampling produces better forecasts.

This result is surprising, and the author offers some explanations for this counterintuitive outcome. In our view, financial time series are usually noisy, so using more data is not necessarily better because it can amplify the noise.
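
One way to see how finer sampling can amplify noise is a small simulation. The sketch below is not the study's experiment: it assumes a random-walk log price observed with independent measurement noise, and every parameter value is hypothetical. It computes realized variance from the same noisy series sampled at every tick, every 60th tick, and every 300th tick.

```python
import random


def realized_variance(log_prices, step):
    """Sum of squared log returns, sampling every `step`-th observation."""
    sampled = log_prices[::step]
    return sum((b - a) ** 2 for a, b in zip(sampled, sampled[1:]))


random.seed(0)
n_ticks = 23_400       # hypothetical: one observation per second in a session
true_sigma = 1e-4      # hypothetical per-tick volatility of the clean price
noise_sigma = 5e-4     # hypothetical observation noise added to each tick

# Clean log price: a driftless random walk.
clean, level = [], 0.0
for _ in range(n_ticks):
    level += random.gauss(0.0, true_sigma)
    clean.append(level)

# Observed log price: clean price plus independent noise.
observed = [p + random.gauss(0.0, noise_sigma) for p in clean]

true_variance = n_ticks * true_sigma ** 2
rv_by_step = {step: realized_variance(observed, step) for step in (1, 60, 300)}
```

Under these assumptions, the tick-by-tick estimate is dominated by the noise term and lands far above true_variance, while the sparser sampling grids stay much closer to it.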

Another important insight from the article is the importance of performing out-of-sample testing, as the results can differ from, and sometimes even contradict, the in-sample outcomes.

Reference

[2] Hervé N. Mugemana, Evaluating the impact of sampling frequency on volatility forecast accuracy, 2024, Inland Norway University of Applied Sciences
 

Closing Thoughts

Both studies underscore the central role of high-quality data in financial modeling, trading, and risk management. Whether it is the frequency at which data are sampled or the completeness of firm-level fundamentals, the integrity of input data directly determines the reliability of forecasts, model calibration, and investment decisions. As financial markets become increasingly data-driven, the ability to collect, process, and validate information with precision will remain a defining edge for both researchers and practitioners.

