NSE Options Data Analysis Pipeline

Client: AI | Published: 12.02.2026

NSE FO Bhavcopy Filter Pipeline – Summary

**1. Objective:** A fully vectorised Python system that downloads NSE F&O bhavcopy ZIPs for a user‑defined date range, processes them in‑memory, applies a strict multi‑stage filter chain, and outputs only fully qualified option contracts as an Excel file.

**2. Key Features:**
2.1 **Flexible date input** (e.g., 1 Jan 2026, 2026-01-01) with automatic adjustment: start → Sunday of its week, end → Saturday of its week (no confirmation).
2.2 **Robust download engine**: HTTP session with retries, polite delay, skips missing files.
2.3 **Column standardisation**: maps NSE CSV variants to ten fixed columns: TckrSymb, XpryDt, StrkPric, OptnTp, OpnPric, HghPric, LwPric, ClsPric, TtlTradgVol, NewBrdLotQty.
2.4 **Week grouping**: calendar weeks (Sun→Sat). W0 = week containing end date, W1–W5 = earlier weeks.

**3. Filter Pipeline (exact order):**
3.1 **W0 Blank Filter**: if any of the ten columns is blank on any trading day in W0 → eliminate entire option.
3.2 **W0 Price Filter**: a day fails if any of OpnPric/HghPric/LwPric = 0 OR all four prices are equal. Grace: ≤1 failing day → pass, else eliminate.
3.3 **W1 Match & Filter**: for each W0‑qualified option:
 - If present in W1 → apply same Blank + Price filters on W1 data; pass both → OLD, else eliminate.
 - If absent → NEW (skip W1 filters).
3.4 **Volume Filter**:
 - OLD: Vol(W0) > Vol(W1) (strict).
 - NEW: Vol(W0) > 0.
3.5 **Expiry Filter**: remove options expiring in W0 or next week (≤ next Saturday after W0).
3.6 **Multi‑Week Volume Progression**: if W2 data exists, require strictly increasing volume from older to newer weeks (W5→W4→…→W0).
3.7 **Pivot Calculation**: traditional pivot levels (P, R1–R5, S1–S2) from W0 high/low/close; compute % differences between consecutive levels.
3.8 **Pivot Scenario Thresholds**: user‑configurable T1–T4:
 - S1 < 0 → Scenario 3 (check R5→P% ≤ T4).
 - S2 ≤ 0 → Scenario 1 (R5→P% ≤ T1, P→S1% ≤ T2).
 - S2 > 0 → Scenario 2 (R5→P% ≤ T1, P→S1% ≤ T2, S1→S2% ≤ T3).
3.9 **Affordability Filter**: Affordability = S1 × LotQty. If S1 < 0 → auto‑pass; else pass if Affordability ≤ user_limit (default 8000).
3.10 **Final Output**: Excel file qualified_options.xlsx with columns: TckrSymb, XpryDt, StrkPric, OptnTp, HghPric, LwPric, ClsPric, NewBrdLotQty, TtlTradgVol, %VolIncrease, Appearance, Affordability.

**4. Mandatory Technical Requirements:**
4.1 100% vectorised: no row loops (no for, iterrows, etc.).
4.2 In‑memory processing: no physical intermediate files.
4.3 Command‑line + interactive: via argparse; if no args, prompts for dates/thresholds.
4.4 Error handling: missing files skipped, column validation, clear console logs.

**5. Skills Required:** Python, pandas (vectorised groupby/merge), numpy, requests, in‑memory ZIP/CSV, datetime, argparse, openpyxl. Experience with NSE FO data and pivot calculations preferred.

**6. Deliverable:** Single Python script implementing the entire pipeline exactly as specified.
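For illustration, the date adjustment in 2.1 and the week labelling in 2.4 could be sketched roughly as follows. The function names `align_to_weeks` and `week_index` are hypothetical and not part of the specification; the sketch assumes trade dates arrive as a pandas Series.

```python
import pandas as pd

def align_to_weeks(start, end):
    """Snap start back to the Sunday of its week and end forward to the Saturday."""
    start = pd.Timestamp(start).normalize()
    end = pd.Timestamp(end).normalize()
    # Timestamp.weekday(): Monday=0 ... Sunday=6,
    # so (weekday + 1) % 7 is the number of days since the last Sunday.
    start_sun = start - pd.Timedelta(days=(start.weekday() + 1) % 7)
    # (5 - weekday) % 7 is the number of days until the coming Saturday.
    end_sat = end + pd.Timedelta(days=(5 - end.weekday()) % 7)
    return start_sun, end_sat

def week_index(dates, end_sat):
    """Return 0 for dates in W0 (the week ending on end_sat), 1 for W1, and so on."""
    dates = pd.Series(pd.to_datetime(dates))
    return (end_sat - dates).dt.days // 7
```

Called as `align_to_weeks("1 Jan 2026", "2026-02-12")`, this would widen the range to whole Sunday-to-Saturday weeks, and `week_index` then labels every trading day without a row loop.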
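As a rough sketch of the price filter in 3.2, the per-day fail flag and the one-day grace rule can be expressed with a grouped `transform` instead of a row loop. The helper name `price_filter`, the `KEY` list and the temporary `_fail` column are assumptions made for this example; it also assumes one row per contract per trading day and that the blank filter (3.1) has already run.

```python
import pandas as pd

# Columns assumed to identify one option contract.
KEY = ["TckrSymb", "XpryDt", "StrkPric", "OptnTp"]

def price_filter(week_df, max_fail_days=1):
    """Keep contracts with at most max_fail_days failing days in the week."""
    ohl = week_df[["OpnPric", "HghPric", "LwPric"]]
    # Fail condition 1: any of open/high/low is zero.
    any_zero = ohl.eq(0).any(axis=1)
    # Fail condition 2: open, high and low all equal the close (all four equal).
    all_equal = ohl.eq(week_df["ClsPric"], axis=0).all(axis=1)
    flagged = week_df.assign(_fail=any_zero | all_equal)
    # Count failing days per contract and broadcast the count to every row.
    fail_days = flagged.groupby(KEY)["_fail"].transform("sum")
    return week_df[fail_days <= max_fail_days]
```

The same function could be reused on W1 data for step 3.3, since the specification applies identical rules there.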
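The affordability rule in 3.9 might look like the sketch below. It assumes the commonly used classic pivot formulas P = (H + L + C) / 3 and S1 = 2P − H, which the specification refers to only as "traditional", and it assumes the input already holds one row per contract with the W0 high, low and close. The name `affordability_filter` is hypothetical.

```python
import pandas as pd

def affordability_filter(w0, user_limit=8000):
    """Keep contracts whose S1 is negative or whose S1 * lot size fits the limit."""
    # Classic pivot point and first support level (assumed formulas).
    pivot = (w0["HghPric"] + w0["LwPric"] + w0["ClsPric"]) / 3
    s1 = 2 * pivot - w0["HghPric"]
    afford = s1 * w0["NewBrdLotQty"]
    # Negative S1 passes automatically; otherwise compare against the limit.
    keep = (s1 < 0) | (afford <= user_limit)
    return w0.assign(P=pivot, S1=s1, Affordability=afford)[keep]
```

The resulting `Affordability` column corresponds to the one listed in the final output (3.10); how W0 high, low and close are aggregated from daily rows is left open here because the specification does not state it.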