Lab 004: Proprietary Capital Systems

Lab 004 is the bigger sibling to Arasaka Neural Bastion.

Arasaka started as a machine learning and forex research project: data, models, automation scripts, MQL5 integration, and the question of whether a system could be disciplined enough to reject weak signals. Lab 004 takes that idea further. It is less of a demo and more of an operating system for the whole research loop.

It currently runs online with GBP 1,000 allocated to the live trading experiment.

That does not make it a finished trading product. It makes it real enough that the engineering matters more. A backtest can be wrong quietly. A notebook can be optimistic forever. A live system has to wake up on schedule, fetch data, survive broker constraints, reconcile positions, write logs, and keep enough evidence around for me to know what happened later.

This is a technical project note, not financial advice. Lab 004 is an experimental system for research, engineering, and controlled live testing.

Why Build It?

Arasaka proved that the basic research direction was interesting. But it also showed the limit of having a project shaped mostly around model experiments.

The next version needed more structure:

a clean data pipeline
explicit market-regime handling
repeatable feature generation
model training and inference paths
strategy construction
backtesting
portfolio risk aggregation
paper execution
broker reconciliation
live deployment discipline

Lab 004 is my attempt to make those responsibilities visible and separate.

The project is organized as a Python package called lab004-propcap, short for Trueblood Labs Lab 004 - Proprietary Capital Systems. The name sounds dramatic, but the architecture is deliberately plain: each module owns one part of the pipeline.

The Module Stack

The local project is split into modules that move from broker access through live operation:

m0_broker: MetaTrader 5 connection and broker-facing utilities
m1_data: bar ingestion
m2_clean: cleaning and validation
m3_features: feature generation
m4_regimes: market-regime detection
m5_signals: model training, routing, inference, and signal generation
m6_strategy: position construction and strategy rules
m7_backtest: backtest engine
m8_risk: portfolio aggregation and risk reporting
m9_paper: paper execution and reconciliation
m10_live: live-operation boundary

The important design choice is that model output is not the system. Model output is one stage in a longer chain. By the time a signal becomes a position, it has passed through regime logic, strategy filters, risk logic, broker constraints, and reconciliation.

That is the right shape for this kind of project because the easiest way to fool yourself is to let a model score look like a complete decision.
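A minimal sketch of that chain, to show the shape rather than Lab 004's actual code. The Candidate class, the gate functions, the regime labels, and the 0.6 threshold are all hypothetical. The only value taken from the project is the 0.01 lot size seen in the paper report.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Candidate:
    """A trade idea moving through the chain (hypothetical structure)."""
    symbol: str
    score: float                      # raw model output, assumed to lie in [0, 1]
    regime: str                       # assumed labels: "trend", "range", "high_vol"
    target_lots: float = 0.0
    rejected_by: Optional[str] = None
    passed: List[str] = field(default_factory=list)


Gate = Callable[[Candidate], bool]


def regime_gate(c: Candidate) -> bool:
    # Assumed rule: no model is trusted in the high-volatility regime.
    return c.regime != "high_vol"


def strategy_gate(c: Candidate) -> bool:
    # Assumed threshold: scores below 0.6 count as weak signals.
    return c.score >= 0.6


def risk_gate(c: Candidate) -> bool:
    # Fixed minimum size; 0.01 lots mirrors the lot_size in the paper report.
    c.target_lots = 0.01
    return True


GATES: List[Tuple[str, Gate]] = [
    ("regime", regime_gate),
    ("strategy", strategy_gate),
    ("risk", risk_gate),
]


def run_chain(c: Candidate, gates: List[Tuple[str, Gate]] = GATES) -> Candidate:
    """Apply gates in order; the first failure zeroes the position."""
    for name, gate in gates:
        if not gate(c):
            c.rejected_by = name
            c.target_lots = 0.0
            return c
        c.passed.append(name)
    return c


strong = run_chain(Candidate("EURUSD", score=0.9, regime="trend"))
risky = run_chain(Candidate("USDJPY", score=0.95, regime="high_vol"))
print(strong.symbol, strong.target_lots, strong.rejected_by)
print(risky.symbol, risky.target_lots, risky.rejected_by)
```

The point of recording rejected_by is that a high score alone never produces a position, and every rejection names the stage that caused it.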

The Trading Universe

The current H1 universe contains six forex pairs:

EURUSD
GBPUSD
USDJPY
AUDUSD
USDCAD
USDCHF

The system stores data in Parquet, builds features per symbol, produces per-symbol positions, then aggregates risk at the portfolio level. The H1 cadence makes the system active enough to be interesting without requiring low-latency infrastructure.

That cadence also fits the operational model: scheduled hourly checks and daily train/evaluate tasks.

Risk Snapshot

One of the latest local risk reports I inspected was risk_h1_fe0849461c, generated on March 23, 2026. It reported a six-symbol H1 portfolio with:

portfolio_sharpe_like: 1.8679
portfolio_total_pnl_net: 14.5940
portfolio_max_drawdown: -1.4732
portfolio_method: equal_weight

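Those numbers are useful, but they need context. A separate project assessment already identified a major issue: raw PnL values are not normalized across symbols before aggregation. That matters because JPY pairs are quoted on a different price scale. One pip on USDJPY is 0.01, while one pip on the other five pairs is 0.0001, so the same move in pips shows up as a raw price difference about 100 times larger. Without normalization, an equal-weight aggregate can be dominated by USDJPY, and the portfolio statistics above cannot be read at face value.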
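A small worked example of the scale problem, as a sketch only. It assumes PnL is recorded as a raw price difference and uses pips as the common unit. The PIP_SIZE table and both helper functions are hypothetical, and pips are just one candidate unit. They fix the quoting-scale mismatch, but a pip still has a different money value per pair, so returns or account-currency PnL may be a better final choice.

```python
from typing import Dict

# Conventional pip sizes: 0.01 for JPY-quoted pairs, 0.0001 for the rest.
PIP_SIZE: Dict[str, float] = {
    "EURUSD": 0.0001,
    "GBPUSD": 0.0001,
    "USDJPY": 0.01,
    "AUDUSD": 0.0001,
    "USDCAD": 0.0001,
    "USDCHF": 0.0001,
}


def to_pips(symbol: str, raw_pnl: float) -> float:
    """Convert a raw price-difference PnL into pips for that symbol."""
    return raw_pnl / PIP_SIZE[symbol]


def equal_weight(values: Dict[str, float]) -> float:
    """Plain average across symbols, matching an equal_weight method."""
    return sum(values.values()) / len(values)


# Made-up trades: a 10-pip gain on EURUSD and a 10-pip loss on USDJPY.
raw = {"EURUSD": 0.0010, "USDJPY": -0.10}
in_pips = {symbol: to_pips(symbol, pnl) for symbol, pnl in raw.items()}

print("raw equal-weight:", equal_weight(raw))
print("pip equal-weight:", round(equal_weight(in_pips), 6))
```

In raw units the USDJPY loss swamps the EURUSD gain and the aggregate looks clearly negative. In pips the two trades cancel, which is the economically sensible reading of "one win and one loss of the same size".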

[Figure: Lab 004 risk snapshot showing six-symbol coverage, latest risk report metrics, and live allocation]

This is exactly why I like the project. It does not just produce nice metrics. It creates enough evidence to criticize its own metrics.
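To make that criticism concrete, it helps to see how little is hidden inside metrics of this kind. The sketch below is an assumption about how fields like portfolio_sharpe_like and portfolio_max_drawdown could be computed from a per-period PnL series. The report does not define them, so treat the function names and the choices here (no annualization, no risk-free rate) as illustrative.

```python
import statistics
from typing import Sequence


def sharpe_like(pnl: Sequence[float]) -> float:
    """Mean over sample standard deviation of per-period PnL.

    Not annualized and with no risk-free rate, hence "Sharpe-like".
    Returns 0.0 when there is too little data or no variation.
    """
    if len(pnl) < 2:
        return 0.0
    sd = statistics.stdev(pnl)
    return statistics.mean(pnl) / sd if sd > 0 else 0.0


def max_drawdown(pnl: Sequence[float]) -> float:
    """Largest peak-to-trough fall in cumulative PnL, as a value <= 0.

    The running peak starts at 0.0, so a loss from the starting
    equity also counts as drawdown.
    """
    equity = 0.0
    peak = 0.0
    worst = 0.0
    for step in pnl:
        equity += step
        peak = max(peak, equity)
        worst = min(worst, equity - peak)
    return worst


# Made-up per-period PnL, already in one common unit.
series = [1.0, -2.0, 0.5, 1.5]
print("sharpe_like:", round(sharpe_like(series), 4))
print("max_drawdown:", max_drawdown(series))
```

Both functions inherit whatever unit the input series is in, which is why the normalization issue flows straight through into the headline numbers.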

From Paper to Live

The paper execution layer is where the system becomes more serious. A recent paper report showed execute_orders: true and lot_size: 0.01, along with per-symbol fields: the action taken (for example HOLD), the current position, the target position, the stop-loss value, and a check that positions stayed inside the expected set.

That layer forces the system to answer operational questions:

What position do we want?
What position do we currently have?
Did we attempt an order?
Did it execute?
Did the broker position match the target after the run?
Were there errors?
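The target-versus-broker question can be sketched as a plain comparison of two position maps. This is an illustration under stated assumptions, not the m9_paper implementation: ReconResult and reconcile are hypothetical names, positions are signed lots (positive long, negative short), and the broker side is a hand-written dict rather than a real MetaTrader 5 query.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ReconResult:
    symbol: str
    target: float    # signed lots the strategy wants
    actual: float    # signed lots the broker reports
    matched: bool


def reconcile(
    targets: Dict[str, float],
    broker_positions: Dict[str, float],
    tolerance: float = 1e-9,
) -> List[ReconResult]:
    """Compare target and broker positions symbol by symbol.

    A symbol missing from either side is treated as flat (0.0), so an
    unexpected broker position shows up as a mismatch instead of
    being silently ignored.
    """
    results = []
    for symbol in sorted(set(targets) | set(broker_positions)):
        target = targets.get(symbol, 0.0)
        actual = broker_positions.get(symbol, 0.0)
        matched = abs(target - actual) <= tolerance
        results.append(ReconResult(symbol, target, actual, matched))
    return results


# Made-up state after a run.
targets = {"EURUSD": 0.01, "GBPUSD": 0.0}
broker = {"EURUSD": 0.01, "USDJPY": -0.01}

for r in reconcile(targets, broker):
    status = "OK" if r.matched else "MISMATCH"
    print(f"{r.symbol}: target={r.target} actual={r.actual} {status}")
```

Taking the union of both symbol sets is the design choice that matters here. Iterating only over targets would never notice a position the system did not ask for.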

The live version builds on that same discipline. A live account with GBP 1,000 is not large in capital terms, but it is large in seriousness terms. It means every assumption has somewhere to land.

What the V2 Assessment Taught Me

The most useful project document so far is not a victory lap. It is the V2 upgrade assessment.

It identified several things that matter more than another small threshold tweak:

normalize portfolio and backtest PnL before comparing symbols
cover the high-volatility regime instead of routing too much of the market to no model
harden broker execution and post-trade reconciliation
improve model observability and training diagnostics
make the market calendar session-aware

That is the grown-up version of the project. More models are tempting. Better measurement is more valuable.
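One measurement implied by the regime item is routing coverage: what share of bars actually reaches a model. The sketch below assumes a simple regime-to-model routing table. The regime labels, model names, and the routing_coverage function are hypothetical and are not taken from m4_regimes or m5_signals.

```python
from collections import Counter
from typing import Dict, Sequence

# Hypothetical routing table: regimes with no entry get no model.
ROUTES: Dict[str, str] = {
    "low_vol": "model_low_vol",
    "normal": "model_normal",
}


def routing_coverage(regimes: Sequence[str], routes: Dict[str, str]) -> dict:
    """Report the fraction of bars routed to a model and which regimes are not."""
    counts = Counter(regimes)
    total = sum(counts.values())
    if total == 0:
        return {"covered_fraction": 0.0, "unrouted": {}}
    unrouted = {name: n for name, n in counts.items() if name not in routes}
    covered = total - sum(unrouted.values())
    return {"covered_fraction": covered / total, "unrouted": unrouted}


# Made-up regime labels for ten consecutive bars.
labels = ["normal"] * 6 + ["high_vol"] * 3 + ["low_vol"]
report = routing_coverage(labels, ROUTES)
print(report["covered_fraction"], report["unrouted"])
```

A number like this turns "too much of the market goes to no model" into something that can be tracked from one run to the next.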

The Difference From Arasaka

Arasaka was about proving the direction. Lab 004 is about operationalizing the loop.

Arasaka asks:

Can machine learning produce useful forex signals?

Lab 004 asks:

Can a complete system ingest data, build signals, test them, size them, execute them, reconcile them, and explain itself afterwards?

That second question is harder, but more useful. It forces me to treat the project as a system of responsibilities rather than a pile of promising experiments.

Where It Goes Next

The next work is not glamorous, which usually means it is important:

normalize PnL into a common unit before portfolio aggregation
add richer walk-forward model diagnostics
add stronger broker capability discovery
improve reconciliation retries and mismatch alerts
separate normal market closure from unexpected missing bars
make live monitoring easier to inspect at a glance
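As an illustration of the closure-versus-gap item, here is a rough sketch that splits missing H1 bars into expected weekend closure and unexpected gaps. It assumes a simplified forex week, open from Sunday 22:00 UTC to Friday 22:00 UTC. Real session boundaries shift with daylight saving and vary by broker, and holidays are ignored entirely. Both function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable, List, Tuple


def is_market_open(ts: datetime) -> bool:
    """Simplified FX week in UTC: Sunday 22:00 through Friday 22:00 (assumed)."""
    weekday = ts.weekday()  # Monday=0 ... Sunday=6
    if weekday == 5:        # Saturday: closed all day
        return False
    if weekday == 4:        # Friday: closes at 22:00
        return ts.hour < 22
    if weekday == 6:        # Sunday: opens at 22:00
        return ts.hour >= 22
    return True


def classify_missing_bars(
    bar_times: Iterable[datetime], start: datetime, end: datetime
) -> Tuple[List[datetime], List[datetime]]:
    """Walk hourly slots in [start, end) and split missing ones into
    (expected_closed, unexpected) according to is_market_open."""
    seen = set(bar_times)
    expected_closed: List[datetime] = []
    unexpected: List[datetime] = []
    slot = start
    while slot < end:
        if slot not in seen:
            if is_market_open(slot):
                unexpected.append(slot)
            else:
                expected_closed.append(slot)
        slot += timedelta(hours=1)
    return expected_closed, unexpected


# Friday 20 March 2026, 20:00 UTC, with only the first bar present.
start = datetime(2026, 3, 20, 20, tzinfo=timezone.utc)
end = start + timedelta(hours=4)
closed, gaps = classify_missing_bars([start], start, end)
print("expected closure:", [t.isoformat() for t in closed])
print("unexpected gaps:", [t.isoformat() for t in gaps])
```

Only the second list should ever raise an alert. The first is the market being closed, which is normal.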

The lesson from Lab 004 is that the edge is not only in the model. It is in the whole chain of disciplined decisions around the model.

That is what makes it the bigger sibling to Arasaka. It is not trying to look smarter. It is trying to be harder to fool.