PJ38 Umanou - Strategy Options Comparison

Background and Validation History

Baseline State at the End of Phase 8

LambdaRank V17b is a ranking model trained on binary labels (win=1 / loss=0). Walk-forward CV (36-month sliding window × 4 folds) gave the following results:

Track	AUC	EV Recovery	Oana ROI
Dirt	0.8236	~85%	200.2%
Turf	0.8302	~77%	128.7%
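
The walk-forward layout can be sketched as follows. This is a minimal illustration, not the project's actual splitter: only the 36-month training window and the fold count of 4 come from the description above, while the 3-month test window, the 2020-01 start date, and the function names are assumptions.

```python
from datetime import date


def add_months(d: date, months: int) -> date:
    """Shift a first-of-month date by a number of months."""
    idx = d.year * 12 + (d.month - 1) + months
    return date(idx // 12, idx % 12 + 1, 1)


def walk_forward_folds(start: date, train_months: int = 36,
                       test_months: int = 3, n_folds: int = 4):
    """Yield (train_start, train_end, test_end) using half-open intervals.

    Training covers [train_start, train_end), testing covers
    [train_end, test_end). test_months=3 is an assumed value.
    """
    for k in range(n_folds):
        train_start = add_months(start, k * test_months)
        train_end = add_months(train_start, train_months)
        test_end = add_months(train_end, test_months)
        yield train_start, train_end, test_end


folds = list(walk_forward_folds(date(2020, 1, 1)))
for i, (tr_s, tr_e, te_e) in enumerate(folds, 1):
    print(f"fold {i}: train [{tr_s}, {tr_e})  test [{tr_e}, {te_e})")
```

Each fold trains strictly on data that precedes its test window, so later races never leak into the model that is scored on earlier ones.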

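Oana-axis sanrenpuku (oana) and umaren (PL picks) are profitable, but the tansho strategy kept falling short of break-even expected value. Model accuracy (AUC) is adequate, but "predicting the finishing order" and "picking profitable horses" are different problems. We tested three directions for closing this gap.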

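Three Improvement Directions (Why A/B/C?)

Option	Hypothesis	Approach	Validation Cost
A	The current model is sufficient. Dropping tansho and concentrating on oana + umaren yields positive expected value	No change (baseline for comparison)	Zero
B	Binary labels give "1st vs 2nd" and "1st vs 10th" the same weight. Switching to odds-weighted graded labels might improve ranking accuracy for high-payout horses	Retrain LambdaRank with graded label_gain (3 variants: soft / exp / linear)	Medium (retraining)
C	Leave the model unchanged and add a post-prediction filter that keeps only horses with a large gap between "rated highly by the model" and "rated low by the market" (value_score = P_model / P_market); this might improve ROI	Post-inference filter only (no additional training)	Low (filter only)

Option B improves the model itself, while Option C improves how the model's output is used. Testing these orthogonal approaches side by side shows whether the remaining headroom lies on the model side or on the operations side.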

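Option B: label_gain Variants Explained

What is label_gain?

LightGBM's LambdaRank is a learning-to-rank algorithm that optimizes NDCG (Normalized Discounted Cumulative Gain). At the core of the NDCG computation is the gain assigned to each label value, which the label_gain parameter controls.

With the baseline (binary labels: win=1 / loss=0), gain = [0, 1] only. In other words, the model distinguishes nothing beyond win or loss.
Switching to graded labels (graded relevance: 0-4) enables weighted rank learning that reflects finishing position and odds.

NDCG formula: multiply gain(label) by the position discount 1/log2(rank+1), sum over ranks, and normalize by the value obtained under the ideal ordering.
The larger the label_gain for a label, the larger the reward for ranking that item near the top, and hence the larger the penalty for mis-ranking it.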
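
The NDCG computation described above can be sketched in a few lines. This is an illustrative re-implementation rather than LightGBM's internal code (which also supports truncation at a cutoff position); the example race and its labels are hypothetical.

```python
import math


def dcg(labels_in_rank_order, label_gain):
    """DCG: sum of gain(label) / log2(rank + 1), with rank starting at 1."""
    return sum(label_gain[lab] / math.log2(rank + 1)
               for rank, lab in enumerate(labels_in_rank_order, start=1))


def ndcg(labels_in_rank_order, label_gain):
    """Normalize DCG by the DCG of the ideal (label-descending) ordering."""
    ideal = dcg(sorted(labels_in_rank_order, reverse=True), label_gain)
    return dcg(labels_in_rank_order, label_gain) / ideal if ideal > 0 else 0.0


# Hypothetical race: labels of five horses in the order the model ranked them.
predicted_order = [2, 4, 0, 1, 0]

for name, gains in {"linear": [0, 1, 2, 3, 4],
                    "soft": [0, 1, 2, 4, 8],
                    "exp": [0, 1, 3, 7, 15]}.items():
    print(name, round(ndcg(predicted_order, gains), 4))
```

The same mis-ordering (the label-4 horse ranked second) scores lower under a steeper gain list, which is how label_gain changes what the model is pushed to get right.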

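Differences Between the Three Variants

Variant	label_gain	Label 1	Label 2	Label 3	Label 4	Label 4 vs 3 gap	Characteristics
linear	[0,1,2,3,4]	1	2	3	4	+1 (uniform)	Every one-step label difference carries the same gain: the gap between labels 4 and 3 equals the gap between labels 2 and 1
soft	[0,1,2,4,8]	1	2	4	8	+4 (2x)	Higher labels matter more. gain(4) is twice gain(3): a gentle concentration
exp	[0,1,3,7,15]	1	3	7	15	+8 (more than 2x)	Extreme concentration on the top label. gain(4) is more than twice gain(3): strong emphasis on winners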
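
A minimal sketch of how the three variants could be passed to LightGBM. The `label_gain` lists are copied from the table; `objective`, `metric`, and `label_gain` are standard LightGBM parameters, but the helper name and the training call in the comments are assumptions about how the retraining might be wired up, not the project's actual code.

```python
# label_gain lists copied from the variant table; everything else is assumed.
LABEL_GAIN_VARIANTS = {
    "linear": [0, 1, 2, 3, 4],
    "soft": [0, 1, 2, 4, 8],
    "exp": [0, 1, 3, 7, 15],
}


def lambdarank_params(variant: str) -> dict:
    """Build a LightGBM parameter dict for one label_gain variant."""
    return {
        "objective": "lambdarank",
        "metric": "ndcg",
        "label_gain": LABEL_GAIN_VARIANTS[variant],
    }


params = lambdarank_params("soft")
# Usage sketch (requires lightgbm; X, y, and group are placeholders):
#   import lightgbm as lgb
#   train_set = lgb.Dataset(X, label=y, group=group)  # group = horses per race
#   booster = lgb.train(params, train_set)
print(params)
```

The list needs one entry per label value, so labels 0-4 require exactly the five-element lists shown.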

[Figure: Gain curve comparison across the linear, soft, and exp variants]

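Labeling Logic (Binary → Graded)

The baseline (Option A) uses a binary is_winner = 1 or 0. Option B assigns odds-weighted graded labels (0-4):

Label	Condition	Meaning
4	1st place × high odds (top 25%)	High-payout winner = most valuable
3	1st place × mid-to-low odds	Favorite wins = important, but the payout is low
2	2nd-3rd place	Near miss = in the top-two / place (top-three) range
1	4th-5th place	On the board = marginally relevant
0	6th place or lower	Irrelevant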
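
The label table above can be sketched as a small function. The population over which "top 25%" is taken is not stated, so this sketch assumes the cutoff is the 75th percentile of winners' odds in the training data; the function name, argument names, and sample odds are hypothetical.

```python
from statistics import quantiles


def graded_label(finish_pos: int, win_odds: float, high_odds_cutoff: float) -> int:
    """Map a finishing position (and odds, for winners) to a label in 0-4."""
    if finish_pos == 1:
        return 4 if win_odds >= high_odds_cutoff else 3
    if finish_pos <= 3:
        return 2
    if finish_pos <= 5:
        return 1
    return 0


# Hypothetical winners' odds; the third quartile serves as the assumed cutoff.
winner_odds = [2.1, 3.5, 4.0, 6.8, 9.9, 15.2, 28.0, 41.3]
cutoff = quantiles(winner_odds, n=4)[2]

print(graded_label(1, 41.3, cutoff))  # high-odds winner
print(graded_label(1, 2.1, cutoff))   # favorite wins
print(graded_label(4, 12.0, cutoff))  # on the board
```

Computing the cutoff on training data only keeps the labels free of information from the test window.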

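Even with identical labels, different label_gain values change how training is weighted.
Example: the gain at stake when a label-4 horse is confused with a label-3 horse is 15-7=8 for exp, 8-4=4 for soft, and 4-3=1 for linear.
→ exp is the most sensitive to missing the top horse, while linear treats errors at every label level equally.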
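
LambdaRank-style objectives weight each pair of items by how much the ranking metric changes when the pair is swapped. The unnormalized version of that quantity can be sketched as below to make the 8 / 4 / 1 comparison concrete; the function name and the choice of ranks 1 and 2 are illustrative assumptions.

```python
import math


def swap_delta_dcg(gain_a, gain_b, rank_a, rank_b):
    """Absolute DCG change when the items at rank_a and rank_b trade places."""
    disc_a = 1.0 / math.log2(rank_a + 1)
    disc_b = 1.0 / math.log2(rank_b + 1)
    return abs((gain_a - gain_b) * (disc_a - disc_b))


variants = {
    "linear": [0, 1, 2, 3, 4],
    "soft": [0, 1, 2, 4, 8],
    "exp": [0, 1, 3, 7, 15],
}

# A label-4 horse sits at rank 2 behind a label-3 horse at rank 1.
costs = {name: swap_delta_dcg(g[4], g[3], 2, 1) for name, g in variants.items()}
for name, cost in costs.items():
    print(f"{name}: {cost:.3f}")
```

Because the position discount is the same for all three variants, the costs differ exactly by the gain gaps: soft is 4 times linear and exp is 8 times linear.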

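Interpreting the Experimental Results

Dirt

soft is strongest: oana ROI 223% / umaren ROI 206%.
The moderate gain gaps strike a balance: high-payout horses receive reasonable emphasis while overall ranking accuracy is maintained. exp has the highest AUC (0.8379), but it concentrates so heavily on 1st place that the ordering of top-two finishers tends to break down.

Turf

exp is strongest: umaren ROI 194%. Turf races are more often decided by favorites, which favors the winner-focused exp.
soft, on the other hand, loses money with an oana ROI of 89%. Turf longshots are not learned well enough under soft's gain allocation. → The variant would have to differ by surface (dirt/turf), and that added combinatorial complexity becomes an operational risk.

Decision: Option B is on hold (to be revisited after accumulating shadow-run data)

The variant must be fixed per surface (dirt=soft / turf=exp), but a 4-fold WF-CV does not provide enough statistical evidence
Turf soft oana ROI of 89% means a risk of losses: choosing the wrong variant is costly
First accumulate real race data in a shadow run and confirm the stability of each surface × variant combination, then deploy to production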

Strategy Overview

Option A: Baseline (Current Model)

LambdaRank V17b binary labels. Drop tansho, focus on umaren + sanrenpuku with oana-axis strategy (min_odds=15, axis_rank=4, partner_rank=5, EV sort, 3R/day limit).

Model: LambdaRank V17b (binary)
Effort: None (use as-is)
Risk: Low

Option B: Value Model (Retrained)

LambdaRank with odds-weighted graded relevance labels (0-4). Tested 3 gain variants: exp [0,1,3,7,15], soft [0,1,2,4,8], linear [0,1,2,3,4].

Model: LambdaRank (graded relevance)
Effort: Medium (retrain)
Risk: Medium

Option C: Value Filter (Post-hoc)

Apply value_score = P_model / P_market as post-prediction filter to baseline model. Sweep multiple thresholds on umaren and tansho.

Model: LambdaRank V17b + VS filter
Effort: Low (filter only)
Risk: Low

Key Finding

Option B "soft" variant improves dirt ROI to 223%/206% (oana/umaren). Option C's value filter has no effect on oana (axis candidates already have high VS) and only a marginal effect on umaren (dirt +14pp, turf +10pp at vs>=3.0). Option C's tansho standout: dirt ROI 280% (vs>=2.0, ev>=2.5, rank<=2).

Main Comparison: Oana + Umaren ROI

DIRT Track

Strategy	Variant	AUC	Oana ROI	Oana Bets	Oana Hits	Oana Profit	Umaren ROI	Umaren Bets	Umaren Hits	Umaren Profit	Combined Profit
Option A	Baseline	0.8236	200.2%	759	36 (4.7%)	+76,040	145.2%	418	30 (7.2%)	+18,890	+94,930
Option B	exp	0.8379	208.4%	836	25 (4.2%)	+90,590	191.9%	387	10 (5.2%)	+35,560	+126,150
Option B	soft	0.8163	223.3%	815	23 (4.4%)	+100,500	206.2%	410	10 (6.1%)	+43,540	+144,040
Option B	linear	0.8212	199.1%	731	20 (4.4%)	+72,410	183.3%	431	9 (6.5%)	+35,920	+108,330
Option C	umaren vs>=3.0	0.8236	200.2%	759	36 (4.7%)	+76,040	158.9%	382	30 (7.9%)	+22,490	+98,530

TURF Track

Strategy	Variant	AUC	Oana ROI	Oana Bets	Oana Hits	Oana Profit	Umaren ROI	Umaren Bets	Umaren Hits	Umaren Profit	Combined Profit
Option A	Baseline	0.8302	128.7%	807	29 (3.6%)	+23,200	159.2%	348	24 (6.9%)	+20,600	+43,800
Option B	exp	0.8209	126.1%	817	20 (3.8%)	+21,320	194.2%	403	11 (6.5%)	+37,950	+59,270
Option B	soft	0.8345	89.1%	815	14 (2.1%)	-8,920	133.3%	361	10 (3.9%)	+12,030	+3,110
Option B	linear	0.8166	112.9%	759	14 (3.2%)	+9,780	113.2%	366	8 (5.7%)	+4,840	+14,620
Option C	umaren vs>=3.0	0.8302	128.7%	807	29 (3.6%)	+23,200	168.7%	318	23 (7.2%)	+21,840	+45,040
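
The ROI and profit columns are consistent with a flat stake of 100 yen per bet. That stake is an inference from the table figures, not something stated in the report, and the helper name is hypothetical. Under that assumption the arithmetic can be checked as follows:

```python
def roi_and_profit(n_bets: int, total_payout: float, stake_per_bet: int = 100):
    """Return (ROI in percent, profit in yen) for flat-stake betting."""
    staked = n_bets * stake_per_bet
    return 100.0 * total_payout / staked, total_payout - staked


# Dirt baseline oana row: 759 bets and +76,040 profit imply this payout
# under the assumed 100-yen flat stake.
payout = 759 * 100 + 76_040
roi, profit = roi_and_profit(759, payout)
print(f"ROI {roi:.1f}%  profit {profit:+,}")  # ROI 200.2%  profit +76,040
```

ROI here means total payout divided by total stake, so 100% is break-even and the "pp" differences quoted elsewhere are differences between two such percentages.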

Combined Profit Comparison (yen)

Track	A: Base	B: exp	B: soft	B: linear	C: vs>=3
DIRT	94,930	126,150	144,040	108,330	98,530
TURF	43,800	59,270	3,110	14,620	45,040

Option B: Value Model Detail

Graded relevance labels based on winning horse odds tier. Higher odds winners get higher relevance (label_gain), teaching the model to prioritize value upsets.

Cross-Validation Metrics (4-fold WF-CV)

Track	Variant	Label Gain	Mean AUC	Mean EV Rate	Fold 1 AUC	Fold 2 AUC	Fold 3 AUC	Fold 4 AUC
DIRT	exp	[0,1,3,7,15]	0.8379	116.6%	0.8395	0.8441	0.8171	0.8507
DIRT	soft	[0,1,2,4,8]	0.8163	118.7%	0.7668	0.8267	0.8225	0.8491
DIRT	linear	[0,1,2,3,4]	0.8212	125.8%	0.8418	0.8350	0.8176	0.7902
TURF	exp	[0,1,3,7,15]	0.8209	102.0%	0.8035	0.8174	0.8391	0.8234
TURF	soft	[0,1,2,4,8]	0.8345	103.5%	0.8275	0.8189	0.8393	0.8524
TURF	linear	[0,1,2,3,4]	0.8166	101.4%	0.7634	0.8141	0.8394	0.8493

Key Insight: Track-Specific Best Variant

DIRT: "soft" [0,1,2,4,8] is best

Oana ROI: 223.3% (+23pp vs baseline)
Umaren ROI: 206.2% (+61pp vs baseline)
Combined Profit: +144,040 (+52% vs baseline)
AUC: 0.8163 (-0.007 vs baseline)
Note: Lower AUC, higher profit

TURF: "exp" [0,1,3,7,15] is best

Oana ROI: 126.1% (-3pp vs baseline)
Umaren ROI: 194.2% (+35pp vs baseline)
Combined Profit: +59,270 (+35% vs baseline)
AUC: 0.8209 (-0.009 vs baseline)
Note: Umaren carries the gain

Warning: "soft" Fails on Turf

Soft variant: Dirt=great, Turf=disaster

Turf Oana ROI: 89.1% (losing money)
Turf Combined Profit: +3,110 (-93% vs baseline)

This shows the best variant differs by track. A production system would need dirt=soft, turf=exp (or stick with baseline for turf).

Option C: Value Filter Detail

Post-prediction filter: value_score = P_model / P_market, where P_market is the market-implied probability. Higher VS means the model sees more value than the market.
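
A minimal sketch of the value_score computation for one race. How the project derives the market probability is not specified; this sketch assumes it is 1/odds renormalized to sum to 1 within the race (removing the bookmaker margin), and the function names and example numbers are hypothetical.

```python
def market_probs(win_odds: list[float]) -> list[float]:
    """Implied win probabilities: 1/odds, renormalized to sum to 1 per race."""
    raw = [1.0 / o for o in win_odds]
    total = sum(raw)
    return [r / total for r in raw]


def value_scores(p_model: list[float], win_odds: list[float]) -> list[float]:
    """value_score = P_model / P_market for each horse in one race."""
    return [pm / pk for pm, pk in zip(p_model, market_probs(win_odds))]


# Hypothetical three-horse race.
odds = [2.0, 4.0, 4.0]
p_model = [0.25, 0.5, 0.25]
vs = value_scores(p_model, odds)
print(vs)  # [0.5, 2.0, 1.0]
```

A score above 1 means the model rates the horse higher than the market does; the thresholds in this section (2.0 to 3.0) select only horses where that gap is large.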

Oana (Sanrenpuku): No Effect

All VS thresholds (0.8 to 3.0) produce identical results for oana on both tracks. Axis candidates (high predicted rank) are inherently high VS since they are longshots by construction (oana_min_odds=15).

Umaren Value Filter Sweep

Track	VS Threshold	Bets	Hits	Hit Rate	ROI	Profit	vs Baseline
DIRT	Baseline (no filter)	418	30	7.2%	145.2%	+18,890	-
DIRT	vs >= 2.5	410	30	7.3%	148.0%	+19,690	+800
DIRT	vs >= 3.0	382	30	7.9%	158.9%	+22,490	+3,600
TURF	Baseline (no filter)	348	24	6.9%	159.2%	+20,600	-
TURF	vs >= 2.0	342	24	7.0%	162.0%	+21,200	+600
TURF	vs >= 3.0	318	23	7.2%	168.7%	+21,840	+1,240

Tansho Value Standouts

Although Option A proposes dropping tansho, the VS filter reveals a high-ROI niche:

Track	Filter	Bets	Hits	Hit Rate	ROI	Profit	Mean Odds
DIRT	Baseline (ev>=1.5, rank<=3)	268	17	6.3%	123.5%	+6,300	24.2
DIRT	vs>=2.5, ev>=2.0	120	7	5.8%	158.5%	+7,020	34.8
DIRT	vs>=2.0, ev>=2.5, rank<=2	41	5	12.2%	280.5%	+7,400	39.1
TURF	Baseline (ev>=1.5, rank<=3)	303	27	8.9%	138.4%	+11,640	19.0
TURF	vs>=3.0, ev>=2.0	92	7	7.6%	178.2%	+7,190	30.3
TURF	vs>=2.0, ev>=2.5, rank<=2	73	9	12.3%	215.5%	+8,430	23.3

The vs>=2.0, ev>=2.5, rank<=2 filter is very selective (41-73 bets/year) but yields an ROI of 215-280%. Consider keeping tansho as a small satellite with this filter only.
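
The satellite filter is a conjunction of three thresholds and can be sketched as below. The `Candidate` record and its field names are hypothetical stand-ins for whatever the prediction pipeline emits, and how the EV ratio is computed is not specified in this report.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    horse: str
    pred_rank: int      # 1 = the model's top pick in the race
    value_score: float  # P_model / P_market
    ev_ratio: float     # expected value per unit stake (definition assumed upstream)


def is_satellite_tansho(c: Candidate, min_vs: float = 2.0,
                        min_ev: float = 2.5, max_rank: int = 2) -> bool:
    """True when a horse passes all three satellite tansho thresholds."""
    return (c.pred_rank <= max_rank
            and c.value_score >= min_vs
            and c.ev_ratio >= min_ev)


# Hypothetical candidates from one day's card.
candidates = [
    Candidate("A", pred_rank=1, value_score=2.4, ev_ratio=3.0),
    Candidate("B", pred_rank=3, value_score=5.0, ev_ratio=4.0),  # rank too low
    Candidate("C", pred_rank=2, value_score=1.9, ev_ratio=3.0),  # VS too low
]
picks = [c.horse for c in candidates if is_satellite_tansho(c)]
print(picks)  # ['A']
```

Keeping the thresholds as arguments makes it easy to retune them later without touching the filter logic.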

Drawdown & Risk (Baseline)

Phase 1 reliability analysis of Option A baseline. Option B drawdowns not yet measured (would require per-bet simulation).

DIRT Drawdown

Max Drawdown: -14,700
Max Loss Streak: 27 races
Duration (oana): May-Aug 2023
Race Win Rate: 14.5% (21/145)
Bootstrap 95% CI: [84.2%, 355.6%]
Umaren Max DD: -26,120
Umaren Max Streak: 42 races
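
Maximum drawdown and the longest losing streak can both be derived from a per-race profit series. The sketch below shows one common way to compute them; it is not necessarily the exact definition used in the Phase 1 analysis, and the sample series is hypothetical.

```python
def max_drawdown(profits: list[float]) -> float:
    """Largest peak-to-trough drop of the cumulative P&L (returned as <= 0)."""
    peak = cum = worst = 0.0
    for p in profits:
        cum += p
        peak = max(peak, cum)
        worst = min(worst, cum - peak)
    return worst


def max_loss_streak(profits: list[float]) -> int:
    """Longest run of consecutive races without a positive result."""
    longest = current = 0
    for p in profits:
        current = current + 1 if p <= 0 else 0
        longest = max(longest, current)
    return longest


# Hypothetical per-race net results in yen.
per_race = [-500, -500, 4200, -500, -500, -500, 1800, -500]
print(max_drawdown(per_race), max_loss_streak(per_race))  # -1500.0 3
```

The peak starts at zero, so losses before the first hit count as drawdown from the initial bankroll.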

TURF Drawdown

Max Drawdown: -20,700
Max Loss Streak: 29 races
Duration (oana): Apr-Dec 2023
Race Win Rate: 10.5% (17/162)
Bootstrap 95% CI: [59.4%, 209.1%]
Umaren Max DD: -8,900
Umaren Max Streak: 24 races
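
A percentile bootstrap for ROI can be sketched as follows. The report does not say whether bets or races were resampled or how many replicates were drawn, so resampling individual bets with 2,000 replicates is an assumption, as are the function name and the sample data.

```python
import random


def bootstrap_roi_ci(stakes, payouts, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for ROI (%), resampling bets with replacement."""
    rng = random.Random(seed)
    n = len(stakes)
    rois = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        staked = sum(stakes[i] for i in idx)
        rois.append(100.0 * sum(payouts[i] for i in idx) / staked)
    rois.sort()
    lo = rois[int((alpha / 2) * n_boot)]
    hi = rois[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi


# Hypothetical record: 200 bets of 100 yen, 8 hits paying 4,000 yen each.
stakes = [100] * 200
payouts = [4000] * 8 + [0] * 192
lo, hi = bootstrap_roi_ci(stakes, payouts)
print(f"point ROI 160.0%, 95% CI [{lo:.1f}%, {hi:.1f}%]")
```

With only a handful of large hits driving the result, intervals like this are wide, which matches the width of the CIs reported for both tracks.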

Top-N Hit Removal Sensitivity

How many top hits must we remove before profit goes negative? Tests if profits depend on a few lucky hits.

DIRT Oana - Top Hit Removal

Removed	ROI	Profit	Status
0	200.2%	+76,040	Profitable
1	146.3%	+35,130	Profitable
2	114.0%	+10,650	Profitable
3	94.8%	-3,960	Losing

Survives removal of the top 2 hits. Turns slightly negative (ROI 94.8%) once the top 3 are removed.

TURF Oana - Top Hit Removal

Removed	ROI	Profit	Status
0	128.7%	+23,200	Profitable
1	105.7%	+4,580	Profitable
2	86.6%	-10,810	Losing

Survives removal of top 1 hit only. More fragile than dirt.
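
The sensitivity test can be sketched as below. The table figures are consistent with treating a removed hit as a losing bet (the stake stays in the denominator), so the sketch follows that convention; the 100-yen stake, the function name, and the sample payouts are assumptions.

```python
def roi_after_removing_top_hits(payouts, stake_per_bet=100, max_removed=3):
    """ROI (%) and profit after zeroing the k largest payouts, for k = 0..max_removed.

    The bets stay in the denominator: a removed hit is treated as a losing bet.
    """
    staked = len(payouts) * stake_per_bet
    ordered = sorted(payouts, reverse=True)
    rows = []
    for k in range(max_removed + 1):
        total = sum(ordered[k:])
        rows.append((k, 100.0 * total / staked, total - staked))
    return rows


# Hypothetical record: 100 bets with three hits.
sample = [30_000, 12_000, 5_000] + [0] * 97
rows = roi_after_removing_top_hits(sample)
for k, roi, profit in rows:
    status = "Profitable" if profit > 0 else "Losing"
    print(f"{k}\t{roi:.1f}%\t{profit:+,}\t{status}")
```

The smallest k at which profit turns negative is a quick measure of how much the strategy depends on a few large payouts.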

Monthly P&L (Baseline)

DIRT - Oana Cumulative P&L

Dirt Oana: Jan-Feb drawdown, Mar breakout (+29K single month), Nov mega-hit (+35K). Final: +76,040

TURF - Oana Cumulative P&L

Turf Oana: Jan-Mar underwater, Apr mega-spike (+43K, 553% ROI). Gradual erosion Oct-Nov. Final: +23,200

PL Calibration (Baseline Model)

The model underestimates actual top-3 rates in every probability bin, by roughly 1.3-4.4x on dirt and 1.9-3.5x on turf. This systematic under-confidence is what makes the betting strategies profitable.

DIRT Calibration

Pred Prob Bin	Mean Pred	Actual Top3	Ratio
0.001-0.025	1.2%	5.2%	4.4x
0.025-0.036	3.2%	4.2%	1.3x
0.036-0.045	4.1%	6.8%	1.7x
0.045-0.053	4.9%	7.1%	1.5x
0.053-0.061	5.7%	11.9%	2.1x
0.061-0.070	6.6%	18.3%	2.8x
0.070-0.080	7.5%	27.4%	3.6x
0.080-0.094	8.7%	33.3%	3.8x
0.094-0.119	10.5%	43.2%	4.1x
0.119-0.600	17.5%	59.2%	3.4x

TURF Calibration

Pred Prob Bin	Mean Pred	Actual Top3	Ratio
0.003-0.015	1.1%	2.0%	1.9x
0.015-0.023	1.9%	4.4%	2.4x
0.023-0.032	2.8%	5.4%	2.0x
0.032-0.042	3.7%	9.9%	2.7x
0.042-0.054	4.8%	15.7%	3.3x
0.054-0.070	6.1%	20.3%	3.3x
0.070-0.089	7.9%	25.8%	3.3x
0.089-0.112	10.0%	34.5%	3.5x
0.112-0.151	13.0%	45.0%	3.5x
0.151-0.684	23.3%	66.7%	2.9x
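
Tables of this shape can be produced by sorting predictions, cutting them into equal-count bins, and comparing the mean prediction with the observed top-3 rate in each bin. The bin edges above look like deciles, but that is an inference; the function name and sample data below are hypothetical, and predictions are assumed to be strictly positive.

```python
def calibration_table(pred, hit_top3, n_bins=10):
    """Equal-count bins over sorted predictions.

    Returns rows of (bin_lo, bin_hi, mean_pred, actual_rate, ratio).
    """
    pairs = sorted(zip(pred, hit_top3))
    n = len(pairs)
    rows = []
    for b in range(n_bins):
        chunk = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        if not chunk:
            continue
        mean_pred = sum(p for p, _ in chunk) / len(chunk)
        actual = sum(h for _, h in chunk) / len(chunk)
        rows.append((chunk[0][0], chunk[-1][0], mean_pred, actual, actual / mean_pred))
    return rows


# Hypothetical predictions and top-3 outcomes (1 = finished in the top 3).
table = calibration_table([0.02, 0.04, 0.06, 0.08], [0, 0, 1, 1], n_bins=2)
for lo, hi, mean_pred, actual, ratio in table:
    print(f"{lo:.3f}-{hi:.3f}\t{mean_pred:.1%}\t{actual:.1%}\t{ratio:.1f}x")
```

A ratio of 1.0x in every bin would indicate a perfectly calibrated model; ratios above 1 mean the predicted probability is lower than the observed rate.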

Final Recommendation

Recommended Production Strategy

Primary: Option A (Baseline) for immediate deployment. Backtested ROI is above 100% on both tracks with a known risk profile, but the bootstrap 95% CIs still include 100% (lower bounds: dirt 84.2%, turf 59.4%).
Enhancement: Option C umaren vs>=3.0 filter — zero effort, marginal ROI improvement (dirt +14pp, turf +10pp). Apply today.
Satellite: Option C tansho (vs>=2.0, ev>=2.5, rank<=2) — very selective (41-73 bets/year) but ROI 215-280%. Don't drop tansho entirely; keep this niche.
Phase 2: Option B "soft" for dirt only — if shadow run confirms, swap dirt model to soft variant (+52% combined profit). Keep turf on baseline or exp.

Risk Assessment Matrix

Strategy	Expected ROI, dirt (oana/umaren)	Expected ROI, turf (oana/umaren)	Confidence	Implementation Effort	Downside Risk
A: Baseline	200/145%	129/159%	High	None	CI lower: 84%/59%
A+C: Base + VS filter	200/159%	129/169%	High	Trivial	Same as A (filter only removes)
B-soft (dirt only)	223/206%	N/A (use A)	Medium	Retrain	Lower AUC; evidence from a single WF-CV run
B-exp (turf only)	N/A (use A)	126/194%	Medium	Retrain	Oana ROI slightly below base

Implementation Roadmap

Week 1 (Now): Deploy Option A + VS filter (umaren vs>=3.0). Begin shadow.

Week 2-4: Add tansho satellite (vs>=2.0, ev>=2.5, rank<=2). Collect 50+ bets.

Month 2-3: Shadow-run Option B soft (dirt) + exp (turf). Compare to baseline live.

Month 3+: If B shadow > A live by >20pp ROI, swap. Otherwise stay on A+C.

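Phase 9 Work Summary and Selection Results

Work Overview

Starting from the LambdaRank V17b baseline as of the end of Phase 8, we tested three strategic directions for raising ROI further. All tests used the same walk-forward CV (2020-2023, 36-month sliding window × 4 folds), and dirt and turf were evaluated independently as separate models.

Option	Description	Validation Result	Decision
A	Keep the current model (baseline for comparison)	Dirt oana ROI 200% / Turf oana ROI 129%: profitable on both surfaces in backtest, although the bootstrap 95% CI lower bounds fall below 100% (dirt 84.2%, turf 59.4%)	Adopt immediately
B	Retrain LambdaRank with graded label_gain (3 variants: soft / exp / linear)	On dirt, soft is strongest (oana ROI 223%), but on turf soft loses money (oana ROI 89%). The variant must be fixed per surface, which makes the combination risk high	On hold (after shadow run)
C	Post-inference filter using value_score (= P_model / P_market)	The umaren VS≥3.0 filter improves dirt by +14pp and turf by +10pp. Satellite tansho (VS≥2.0, EV≥2.5, rank≤2) reaches dirt ROI 280% / turf ROI 215%	Adopt immediately

Selection Results: Measures Adopted Immediately

Satellite tansho filter: pred_rank≤2 & value_score≥2.0 & ev_ratio≥2.5
The old filter (pred_rank==1, ev_ratio≥1.2, 4≤odds≤15) is retired in favor of a highly selective VS-based filter. It places only 41-73 bets per year, but ROI is 215-280%.
Umaren VS≥3.0 filter: among the Plackett-Luce picks, keep only pairs in which the larger of the two horses' value_score is at least 3.0. A zero-cost improvement that requires no additional training.
Oana-axis sanrenpuku: no change. Axis candidates already have high VS, so the VS filter adds nothing.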
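
The umaren pair rule (keep a pair when the larger of its two value_score values is at least 3.0) can be sketched as below. The function name, the horse-number keys, and the sample scores are hypothetical; the actual implementation lives in `ml/race_signal_today.py` and may differ.

```python
def filter_umaren_pairs(pairs, value_score, min_max_vs=3.0):
    """Keep (horse_a, horse_b) pairs whose larger value_score is >= min_max_vs."""
    return [(a, b) for a, b in pairs
            if max(value_score[a], value_score[b]) >= min_max_vs]


# Hypothetical race: value_score keyed by horse number, plus candidate pairs.
value_score = {3: 1.2, 7: 3.4, 9: 0.9}
pl_pairs = [(3, 7), (3, 9), (7, 9)]
kept = filter_umaren_pairs(pl_pairs, value_score)
print(kept)  # [(3, 7), (7, 9)]
```

Using the maximum rather than the minimum means one strongly undervalued horse is enough to keep a pair, which is why the filter removes relatively few bets.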

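Implemented Changes

File	Change
`ml/race_signal_today.py`	Added value_score computation / changed the satellite tansho filter / added the umaren VS≥3.0 filter / updated the Telegram notification text ("サテライト単勝" for satellite tansho, "馬連 PL+VS≥3.0" for umaren) / changed the signal log ticket_type to `tansho_satellite` / added Rank and VS columns to console output
`Vault: PJ38 index.md`	Added the Phase 9 completion section (3-strategy portfolio, Option A/B/C results, checklist)
`Vault: ADR-2026-05-20`	Decision record (rationale for adopting Option C immediately and putting Option B on hold)
`Vault: MEMORY.md`	Updated the PJ38 entry (Phase 9 strategy optimization complete, 3-strategy portfolio information)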

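Roadmap

Timing	Action	Decision Criteria
Immediate (from 2026-05-20)	Run in production with Option A + C filters. Automatic execution via launchd every Saturday and Sunday at 09:00	-
Up to 4 weeks (5/24 to 6/21)	Satellite tansho shadow run. Accumulate 50 or more bets	Continue if ROI ≥ 150%; retune thresholds if ROI < 100%
Months 2-3	Run the Option B (dirt=soft / turf=exp) shadow run in parallel	Consider switching production if at least +20pp over baseline
Month 3 onward	Deploy Option B to production or reject it, based on the shadow-run results	100 or more bets × ROI ≥ baseline + 20pp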
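
The final criterion in the roadmap can be written as a small decision helper. The thresholds (100 bets, +20pp) come from the table; the function name and the three return labels are hypothetical.

```python
def option_b_decision(shadow_bets: int, shadow_roi: float, baseline_roi: float,
                      min_bets: int = 100, min_uplift_pp: float = 20.0) -> str:
    """Apply the roadmap rule: enough bets AND ROI >= baseline + 20pp."""
    if shadow_bets < min_bets:
        return "keep collecting"
    if shadow_roi >= baseline_roi + min_uplift_pp:
        return "promote"
    return "reject"


print(option_b_decision(60, 240.0, 200.0))   # keep collecting
print(option_b_decision(120, 225.0, 200.0))  # promote
print(option_b_decision(120, 210.0, 200.0))  # reject
```

Checking the sample size first prevents a strong but short shadow run from triggering a production switch.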

Final Portfolio Composition (as of Phase 9)

DIRT

Oana-axis sanrenpuku: ROI 200%
Umaren PL + VS≥3.0: ROI 159%
Satellite tansho: ROI 280%

TURF

Oana-axis sanrenpuku: ROI 129%
Umaren PL + VS≥3.0: ROI 169%
Satellite tansho: ROI 215%
