algotrade-education/Group12

CS408 Computational Finance Project Report

Abstract

This study explores a potential trend-reversal signal in financial markets based on candlestick patterns and a Simple Moving Average (SMA) filter. After a sequence of three consecutive bearish candles (downtrend), a bullish candle whose close price exceeds both the SMA and the high of the preceding candle is taken as a reversal signal. Conversely, after three consecutive bullish candles (uptrend), a bearish candle closing below both the SMA and the low of the prior candle is treated as a reversal signal. We backtest this rule on the VN30F230X futures contract (HOSE) using high-frequency data, evaluate performance metrics (cumulative returns, Sharpe ratio, maximum drawdown, win rate), and optimize the key parameters to assess feasibility and profitability.

Introduction

Algorithmic backtesting is essential for validating trading strategies under historical conditions before live deployment. This project focuses on a pattern‑based reversal strategy applied to the VN30F230X (VN30 Index Futures) on the Ho Chi Minh Stock Exchange. We use minute‑level data from January 3, 2023, to September 13, 2023, implement a custom Python backtester with realistic margin and fee handling, and compare strategy performance against benchmark metrics to draw insights on its effectiveness in an emerging market context.

  • Problem statement/Hypothesis: A reversal strategy based on 3-candle trends combined with an SMA filter can be profitable for VN30F230X futures.
  • Method: Backtesting using historical high-frequency data with parameter optimization.
  • Results: The strategy loses money with the initial parameters but becomes profitable after optimization, which suggests room for further improvement.

Trading (Algorithm) Hypotheses

  • Long entry (reversal of a downtrend): After three consecutive bearish candles (close < open), a bullish candle with close > SMA and close > the previous candle's high signals a long entry.
  • Short entry (reversal of an uptrend): After three consecutive bullish candles (close > open), a bearish candle with close < SMA and close < the previous candle's low signals a short entry.
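The two rules above can be expressed as vectorized conditions on an OHLC table. The following is a minimal sketch, not the project's actual code: it assumes a pandas DataFrame with columns named open, high, low, close and sma (hypothetical names), and the function name reversal_signals is illustrative.

```python
import pandas as pd


def reversal_signals(ohlc: pd.DataFrame) -> pd.Series:
    """Return +1 (long entry), -1 (short entry) or 0 for each candle.

    Assumes columns named open, high, low, close and sma (hypothetical names).
    """
    bearish = ohlc["close"] < ohlc["open"]
    bullish = ohlc["close"] > ohlc["open"]

    def previous_three(flags: pd.Series) -> pd.Series:
        # True where the three candles before the current one all satisfy `flags`.
        return (
            flags.shift(1, fill_value=False)
            & flags.shift(2, fill_value=False)
            & flags.shift(3, fill_value=False)
        )

    long_entry = (
        previous_three(bearish)
        & bullish
        & (ohlc["close"] > ohlc["sma"])
        & (ohlc["close"] > ohlc["high"].shift(1))
    )
    short_entry = (
        previous_three(bullish)
        & bearish
        & (ohlc["close"] < ohlc["sma"])
        & (ohlc["close"] < ohlc["low"].shift(1))
    )

    signal = pd.Series(0, index=ohlc.index, dtype="int64")
    signal[long_entry] = 1
    signal[short_entry] = -1
    return signal


# Three falling candles followed by a bullish breakout candle.
candles = pd.DataFrame(
    {
        "open": [10.0, 9.0, 8.0, 7.0],
        "high": [10.2, 9.1, 8.1, 9.6],
        "low": [8.9, 7.9, 6.9, 6.9],
        "close": [9.0, 8.0, 7.0, 9.5],
        "sma": [9.5, 9.0, 8.5, 8.4],
    }
)
print(reversal_signals(candles).tolist())  # [0, 0, 0, 1]
```

Comparisons against missing values (the first candles, or candles before the SMA window is full) evaluate to False, so no signal is produced there.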

Data

  • Data source: Algotrade internship database (via SQL) and yfinance (for reference).
  • Data type: Tick data, later aggregated into OHLC candles.
  • Data period: January 3, 2023 to September 13, 2023.
  • How to get the input data?
    • Run python src/data_collecting.py, which executes the SQL query against the Algotrade DB.
    • Example query:
      SELECT datetime, tickersymbol, price
      FROM "quote"."matched"
      WHERE tickersymbol LIKE 'VN30F23%' 
        AND datetime >= '2023-01-01 00:00:00';
  • How to store the output data?
    • Raw tick data → src/ticks.csv (CSV)
    • Processed in‑sample / out‑of‑sample data and optimized parameters → JSON
    • Trades from backtesting → src/trades.pkl (Pickle)

Data collection

  • We extracted per‑trade records (timestamp, symbol, price) from the Algotrade database using SQL and stored them as a local CSV (src/ticks.csv). By default, the script collects data from 2023‑03‑01 to 2023‑09‑13; adjust start_date/end_date in src/data_collecting.py as needed.

Data Processing

  1. Load ticks.csv and parse timestamps as datetime.
  2. Resample to OHLC at the interval defined in src/params.json.
  3. Compute the SMA over the close price with window sma_window (from params.json).
  4. Split chronologically into:
    • In‑sample (70%) for parameter tuning
    • Out‑of‑sample (30%) for validation
  5. Export enriched OHLC datasets to src/in‑sample.json and src/out‑sample.json.

Implementation

Environment setup:

  pip install pandas
  pip install numpy
  pip install psycopg[binary]
  pip install mplfinance

Data Collection

python src/data_collecting.py

This fetches the tick data with the following SQL query:

  SELECT datetime, tickersymbol, price
  FROM "quote"."matched"
  WHERE tickersymbol LIKE 'VN30F23%'
    AND datetime >= '2023-01-01 00:00:00';

The tick data are stored in the src/ticks.csv file in the following format:

datetime                   tickersymbol   price

2023-01-03 08:53:35.786065	VN30F2301  1003.9

2023-01-03 08:53:38.830273	VN30F2301  1003.0

Data Processing

This section describes the procedure used to prepare the historical high-frequency data of the VN30F230X futures contract for subsequent backtesting and analysis. The objective is to transform raw tick-level data into a structured and consistent format suitable for time-series modeling and strategy evaluation.

Data Acquisition and Parameter Configuration

Historical tick data were imported from a CSV file (ticks.csv), which contains time-stamped records of individual trades, including the ticker symbol and transaction price. In parallel, strategy-related parameters—such as resampling frequency and moving average window size—were loaded from a JSON configuration file (params.json). Default values were applied in cases where specific parameters were not provided.
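Loading parameters with fallback defaults might look like the sketch below. The default values are the initial parameters listed in this README; the function name load_params and the merge behavior (file values override defaults key by key) are assumptions, not taken from the project's code.

```python
import json
from pathlib import Path

# Initial parameters as listed in this README.
DEFAULT_PARAMS = {"sma_window": 50, "take_profit": 3, "stop_loss": -1, "time_frame": 1}


def load_params(path=None):
    """Return the defaults, overridden by any keys found in the JSON file at `path`."""
    params = dict(DEFAULT_PARAMS)
    if path is not None and Path(path).exists():
        with open(path) as fh:
            params.update(json.load(fh))
    return params


print(load_params())  # defaults only, because no file was given
```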

Dataset Partitioning

The complete dataset was divided chronologically into two distinct subsets:

  • In-sample dataset (70%): used for strategy design, parameter tuning, and model training.

  • Out-of-sample dataset (30%): reserved for performance validation to assess generalization under unseen market conditions.
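A chronological 70/30 split can be sketched as follows. This is an illustrative version that cuts by row count on a time-sorted DataFrame; whether the project splits by rows or by calendar time is not stated here, so treat that choice as an assumption.

```python
import pandas as pd


def chronological_split(df: pd.DataFrame, in_sample_ratio: float = 0.7):
    """Split a time-indexed frame into (in_sample, out_of_sample) without shuffling."""
    df = df.sort_index()
    cut = int(len(df) * in_sample_ratio)
    return df.iloc[:cut], df.iloc[cut:]


frame = pd.DataFrame(
    {"price": range(10)},
    index=pd.date_range("2023-01-03", periods=10, freq="D"),
)
in_sample, out_of_sample = chronological_split(frame)
print(len(in_sample), len(out_of_sample))  # 7 3
```

Because the cut is positional on sorted data, every out-of-sample row is later than every in-sample row, which avoids look-ahead leakage during tuning.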

Data Cleaning and Transformation

For both subsets, several preprocessing steps were applied:

  • Timestamps were converted into datetime objects and set as the index to enable time-based operations.

  • Non-numeric values in the price field were coerced and removed.

  • A filtering mechanism was employed to exclude contracts whose expiration month was inconsistent with the calendar month of the data point, ensuring temporal coherence.

  • The datasets were sorted in ascending order of time to maintain chronological integrity.
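The cleaning steps above could be sketched as below. It assumes the last two characters of the ticker symbol encode the expiry month (as in VN30F2301 for January 2023); that reading of the symbol, and the function name clean_ticks, are assumptions made for illustration.

```python
import pandas as pd


def clean_ticks(ticks: pd.DataFrame) -> pd.DataFrame:
    """Parse timestamps, drop non-numeric prices, keep only the contract
    whose expiry month equals the tick's calendar month, and sort by time."""
    df = ticks.copy()
    df["datetime"] = pd.to_datetime(df["datetime"])
    df["price"] = pd.to_numeric(df["price"], errors="coerce")
    df = df.dropna(subset=["price"])

    # Assumption: symbols look like VN30F<YY><MM>, so the last two characters are the month.
    expiry_month = df["tickersymbol"].str[-2:].astype(int)
    df = df[expiry_month == df["datetime"].dt.month]

    return df.set_index("datetime").sort_index()


raw = pd.DataFrame(
    {
        "datetime": [
            "2023-01-03 08:53:38.830273",
            "2023-01-03 08:53:35.786065",
            "2023-01-03 09:00:00.000000",
            "2023-01-03 09:01:00.000000",
        ],
        "tickersymbol": ["VN30F2301", "VN30F2301", "VN30F2302", "VN30F2301"],
        "price": ["1003.0", "1003.9", "1010.0", "bad"],
    }
)
cleaned = clean_ticks(raw)
print(cleaned["price"].tolist())  # [1003.9, 1003.0]
```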

Resampling to OHLC Format

The tick data were aggregated into OHLC (Open, High, Low, Close) format using a fixed resampling interval, which is pre-defined in src/params.json. This transformation reduces data granularity and enables the use of technical indicators commonly employed in quantitative strategies.
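With pandas, this aggregation can be done with resample(...).ohlc(). The sketch below assumes time_frame is expressed in minutes and that intervals without any trade are dropped; both are assumptions about the project's behavior.

```python
import pandas as pd


def ticks_to_ohlc(prices: pd.Series, time_frame_minutes: int) -> pd.DataFrame:
    """Aggregate a time-indexed price series into OHLC candles."""
    ohlc = prices.resample(f"{time_frame_minutes}min").ohlc()
    # Intervals with no ticks produce all-NaN rows; drop them (assumption).
    return ohlc.dropna()


tick_index = pd.to_datetime(
    [
        "2023-01-03 09:00:10",
        "2023-01-03 09:00:40",
        "2023-01-03 09:01:05",
        "2023-01-03 09:03:30",
    ]
)
tick_prices = pd.Series([1003.9, 1003.0, 1004.5, 1002.0], index=tick_index)
bars = ticks_to_ohlc(tick_prices, 1)
print(len(bars))  # 3 candles: 09:00, 09:01 and 09:03 (09:02 had no ticks)
```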

Feature Construction

A Simple Moving Average (SMA) was computed over the close price with a rolling window size determined by the user-defined parameter sma_window, which is pre-defined in src/params.json. This indicator is widely used for trend identification and signal generation in systematic trading.
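A rolling mean over the close column is enough to build this feature. The sketch below uses a hypothetical column name sma and leaves the first sma_window - 1 rows as NaN, since the window is not yet full there.

```python
import pandas as pd


def add_sma(ohlc: pd.DataFrame, sma_window: int) -> pd.DataFrame:
    """Return a copy of `ohlc` with an `sma` column computed over the close price."""
    out = ohlc.copy()
    out["sma"] = out["close"].rolling(window=sma_window).mean()
    return out


closes = pd.DataFrame({"close": [1.0, 2.0, 3.0, 4.0, 5.0]})
with_sma = add_sma(closes, sma_window=3)
print(with_sma["sma"].tolist())  # [nan, nan, 2.0, 3.0, 4.0]
```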

Data Export

The resulting OHLC datasets, enriched with the SMA feature, were exported to structured JSON files (in-sample.json and out-sample.json) for use in the backtesting engine. This format facilitates efficient access and indexing during the simulation phase.

By applying the above data processing pipeline, the raw market data were transformed into a clean, resampled, and feature-enhanced format suitable for rigorous strategy evaluation within the computational finance framework.

Run the following command to execute the data processing pipeline:

python src/data_processing.py

The optional --params flag makes the script read its parameters from src/params.json. We omit it for now so that the program runs with the initial parameters.

Back Testing

Trading Rules

The trading logic is built around simple price action patterns in conjunction with the SMA as a trend filter. The strategy identifies potential long entry signals when three consecutive bearish candles (i.e., close < open) are immediately followed by a breakout candle, whose close price exceeds both its SMA and the high of the preceding candle. Conversely, short entry signals are triggered under the inverse condition: three consecutive bullish candles followed by a breakdown below both the SMA and the prior candle’s low.

Each trade is executed at the close price of the breakout candle. Once entered, positions are monitored at each subsequent time step for exit conditions:

  • Take Profit: the position is exited when unrealized profit reaches or exceeds a pre-specified number of index points, which is pre-defined in src/params.json.
  • Stop Loss: the position is force-closed when losses reach a threshold, which is pre-defined in src/params.json.

All trades are assumed to be closed at the end of the trading session (i.e., when the trading date changes), simulating an intraday-only strategy and eliminating overnight risk.
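The exit checks described above can be sketched as a single function evaluated at each time step. The function name, the order of the checks, and the convention that stop_loss is a negative number of index points (as in params.json) are assumptions for illustration.

```python
from datetime import date


def exit_reason(side, entry_price, price, take_profit, stop_loss, entry_date, current_date):
    """Return why an open position should be closed, or None to keep holding.

    side is +1 for long and -1 for short; take_profit and stop_loss are in index
    points, with stop_loss negative (e.g. -1).
    """
    if current_date != entry_date:
        return "session_end"  # intraday only: never hold across trading dates
    pnl_points = (price - entry_price) * side
    if pnl_points >= take_profit:
        return "take_profit"
    if pnl_points <= stop_loss:
        return "stop_loss"
    return None


day = date(2023, 1, 3)
print(exit_reason(+1, 1000.0, 1003.0, 3, -1, day, day))  # take_profit
print(exit_reason(-1, 1000.0, 1001.0, 3, -1, day, day))  # stop_loss
```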

Position Sizing and Capital Constraints

The strategy is capital-aware and incorporates realistic margin mechanics. A margin deposit is required for each trade, computed as:

Deposit = (Entry Price × Multiplier × Margin Ratio) ÷ Acceptable Risk (AR)

where the multiplier is set to 100,000 (reflecting the VN30F contract), and margin ratio and AR are fixed at 17.5% and 0.8 respectively. A trade is only executed if sufficient available capital is present. The model updates available and total capital dynamically, based on realized P&L, margin usage, and trading fees (0.47 index points per round trip).
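As a worked example of the deposit formula, the sketch below plugs in the constants stated above. The function names are illustrative, and converting the fee to VND by multiplying index points by the contract multiplier is an assumption.

```python
MULTIPLIER = 100_000      # VND per index point, as stated above
MARGIN_RATIO = 0.175      # 17.5%
ACCEPTABLE_RISK = 0.8     # AR
FEE_POINTS = 0.47         # index points per round trip


def required_deposit(entry_price: float) -> float:
    """Deposit = (Entry Price x Multiplier x Margin Ratio) / AR, in VND."""
    return entry_price * MULTIPLIER * MARGIN_RATIO / ACCEPTABLE_RISK


def can_open(available_capital: float, entry_price: float) -> bool:
    """A trade is only taken when the available capital covers the deposit."""
    return available_capital >= required_deposit(entry_price)


def round_trip_fee() -> float:
    """Fee per round trip in VND (assumes points convert via the multiplier)."""
    return FEE_POINTS * MULTIPLIER


print(round(required_deposit(1000.0)))  # 21875000
print(can_open(100_000_000, 1000.0))    # True
```

At an entry price of 1,000 index points the deposit comes to 21,875,000 VND per contract, so a 100,000,000 VND account can cover the deposit for at most four contracts at once.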

Run backtest

After processing the data, run the backtest with the following command:

python src/backtest.py data_file_name

Replace data_file_name with the dataset to test: in-sample.json or out-sample.json.

Configuration

The backtesting configuration is the same one used for optimization. It is defined in src/params.json as follows:

{
    "sma_window": 50,
    "take_profit": 3,
    "stop_loss": -1,
    "time_frame": 1
}
  • sma_window is the SMA window for data processing.
  • time_frame is the time frame of each candle for data processing.
  • take_profit and stop_loss are thresholds for the backtesting process

In-sample Backtesting

  • Parameters: The sma_window, take_profit, stop_loss, time_frame are initially set to 50, 3, -1, 1.
  • Data: In-sample dataset (in-sample.json). Run the in-sample backtesting with the following command:
python src/backtest.py in-sample.json

The optional --params flag makes the script read its parameters from src/params.json. We omit it for now so that the program runs with the initial parameters.

In-sample Backtesting Result

To get the result, run the following command:

python src/evaluate.py

The initial results are as follows:

  • Initial Capital: 100'000'000 VND
  • Final Capital: 80'193'000 VND
  • Total Profit: -19'807'000 VND
  • Asset Over Time: [chart omitted]
  • Holding Period Return (HPR): -19.81%
  • Maximum Drawdown (MDD): 20.23%
  • Daily-based Sharpe Ratio: -4.95
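The three metrics reported above can be computed from a daily series of account values roughly as follows. This is a sketch under stated assumptions: the Sharpe ratio here is annualized with 252 trading days and a zero risk-free rate, which may differ from what src/evaluate.py does.

```python
import numpy as np
import pandas as pd


def holding_period_return(equity: pd.Series) -> float:
    """Total return from the first to the last value of the equity series."""
    return equity.iloc[-1] / equity.iloc[0] - 1


def max_drawdown(equity: pd.Series) -> float:
    """Largest peak-to-trough decline, as a fraction of the running peak."""
    running_peak = equity.cummax()
    return ((running_peak - equity) / running_peak).max()


def daily_sharpe(equity: pd.Series, periods_per_year: int = 252, risk_free: float = 0.0) -> float:
    """Annualized Sharpe ratio from daily returns (annualization is an assumption)."""
    daily_returns = equity.pct_change().dropna()
    return np.sqrt(periods_per_year) * (daily_returns.mean() - risk_free) / daily_returns.std()


equity = pd.Series([100.0, 110.0, 99.0, 120.0])
print(round(holding_period_return(equity), 4))  # 0.2
print(round(max_drawdown(equity), 4))           # 0.1
```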

Optimization

  • Description: We randomly generate sets of parameters and, for each set, run data processing, backtesting, and evaluation in sequence. We keep the set that yields the highest HPR.
  • Parameters to optimize: As mentioned above, the parameters to optimize are defined in src/params.json, as follows:
{
    "sma_window": 50,
    "take_profit": 3,
    "stop_loss": -1,
    "time_frame": 1
}
  • Hyper-parameters of the optimization process: Each parameter is drawn at random from the following range:
    • sma_window: 10 to 100
    • time_frame: 1 to 30
    • take_profit: 2.0 to 10.0
    • stop_loss: -5.0 to -0.5
  • Run the following command for parameter optimization:
python src/optimize.py --seed 12345

12345 is the random seed we used for generating parameters; you can specify another number.

This process takes about 1-2 hours on a standard laptop. We have already run the optimization and saved the best parameter set found in src/params.json.
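The random search described above can be sketched as follows. The evaluate callback is hypothetical: in the real pipeline it would run data processing, backtesting, and evaluation for one parameter set and return the HPR. Drawing integers for sma_window and time_frame and rounding the thresholds to two decimals are also assumptions.

```python
import random


def sample_params(rng: random.Random) -> dict:
    """Draw one parameter set from the ranges listed above."""
    return {
        "sma_window": rng.randint(10, 100),
        "time_frame": rng.randint(1, 30),
        "take_profit": round(rng.uniform(2.0, 10.0), 2),
        "stop_loss": round(rng.uniform(-5.0, -0.5), 2),
    }


def random_search(evaluate, n_trials: int, seed: int):
    """Return (best_params, best_hpr) over n_trials random draws."""
    rng = random.Random(seed)  # seeding makes the search reproducible
    best_params, best_hpr = None, float("-inf")
    for _ in range(n_trials):
        params = sample_params(rng)
        hpr = evaluate(params)
        if hpr > best_hpr:
            best_params, best_hpr = params, hpr
    return best_params, best_hpr


def toy_evaluate(params: dict) -> float:
    # Stand-in objective for demonstration only; not a real backtest.
    return -abs(params["sma_window"] - 33)


best, score = random_search(toy_evaluate, n_trials=50, seed=12345)
print(best, score)
```

Running the search twice with the same seed yields the same parameter sets, which is why the seed is exposed on the command line.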

Optimization Result

After optimization, we put the best set of parameters in src/params.json. If you accidentally start the optimization and do not want to wait, you can safely stop it and paste the following contents into src/params.json to run with the optimal parameters.

{
    "sma_window": 33,
    "take_profit": 3.09,
    "stop_loss": -4.99,
    "time_frame": 3
}

After that, re-run the data processing and backtesting on in-sample data with the following 2 commands:

python src/data_processing.py --params
python src/backtest.py in-sample.json --params

This time, we add the flag for the programs to run on the optimized parameters.

To get the result, run the following command:

python src/evaluate.py

The results after optimization are as follows:

  • Initial Capital: 100'000'000 VND
  • Final Capital: 108'672'000 VND
  • Total Profit: 8'672'000 VND
  • Asset Over Time: [chart omitted]
  • Holding Period Return (HPR): 8.67%
  • Maximum Drawdown (MDD): 2.29%
  • Daily-based Sharpe Ratio: 1.92

Out-of-sample Backtesting

  • Parameters: The sma_window, take_profit, stop_loss, time_frame are set to the optimized values in src/params.json.
  • Data: Out-sample dataset (out-sample.json). Run the out-sample backtesting with the following command:
python src/backtest.py out-sample.json --params

We also add the flag here to run the out-sample backtesting on the optimized parameters.

Out-sample Backtesting Result

To get the result, run the following command:

python src/evaluate.py

The out-of-sample results are as follows:

  • Initial Capital: 100'000'000 VND
  • Final Capital: 103'324'000 VND
  • Total Profit: 3'324'000 VND
  • Asset Over Time: [chart omitted]
  • Holding Period Return (HPR): 3.32%
  • Maximum Drawdown (MDD): 1.36%
  • Daily-based Sharpe Ratio: 1.45

Conclusion

The initial backtest of the 3-candle reversal strategy with default parameters yielded negative returns on the VN30F230X futures contract (-19.81% HPR). After optimizing the SMA window, time frame, take profit, and stop loss parameters on in-sample data, the strategy became profitable (8.67% HPR, Sharpe Ratio 1.92). Validation on the out-of-sample data was consistent with this, producing a positive HPR of 3.32% and a Sharpe Ratio of 1.45, both lower than the in-sample results. This suggests the optimized strategy has some predictive value, but the gap between in-sample and out-of-sample performance indicates it may be overfitted or sensitive to changing market conditions. Further refinement or additional filters could improve robustness.

About

Project of Group 12 - CS408 - APCS, HCMUS - 2025
