AI Ticker Tagging

Ticker tagging is the process of identifying which US-listed equities a news article materially references. Top Tier Newswire runs AI-assisted tagging on every story that enters the database.

How it works

The same fine-tuned language model that scores sentiment also returns the set of tickers a headline-plus-summary materially mentions. "Apple beats Q3 estimates" tags AAPL. "Tech sector roundup" with no specific names tags nothing.

The model is anchored to the SEC's company_tickers.json registry of US-listed symbols, plus an internal alias map covering brand names that differ from registered names (Google → GOOGL, Facebook → META, Block → XYZ, Instacart → CART, etc.). The company-name normalisation pipeline strips corporate-suffix noise ("Inc", "Corp", "Holdings"), handles ADR variants, and applies a manually curated known-traps list that corrects tickers the fine-tune consistently gets wrong, such as Samsung tagged as SN, Baidu tagged as BABA, or Snapchat tagged as SN.
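
As a rough illustration of the normalisation steps described above, here is a minimal Python sketch. It is not the production pipeline: the function names, the suffix list, and the sample registry entry are hypothetical, and the entry shape (a "title" and a "ticker" field per record) is an assumption about company_tickers.json. The source lists only the wrong pairings in the known-traps list, so the corrections shown here (BIDU, SNAP, and dropping Samsung) are assumptions as well. ADR handling is omitted.

```python
import re
from typing import Dict, Optional, Tuple

# Illustrative alias map built from the pairs named above; the real map is internal.
ALIASES: Dict[str, str] = {
    "google": "GOOGL",
    "facebook": "META",
    "block": "XYZ",
    "instacart": "CART",
}

# Corporate-suffix tokens treated as noise (hypothetical subset).
SUFFIXES = {"inc", "corp", "corporation", "holdings"}


def normalise_name(name: str) -> str:
    """Lower-case, replace punctuation with spaces, strip trailing suffixes."""
    tokens = re.sub(r"[^\w\s]", " ", name.lower()).split()
    while len(tokens) > 1 and tokens[-1] in SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


def build_registry(raw: dict) -> Dict[str, str]:
    """Index registry records by normalised title (assumed record shape)."""
    return {normalise_name(e["title"]): e["ticker"] for e in raw.values()}


def resolve(name: str, registry: Dict[str, str]) -> Optional[str]:
    """Aliases take priority over the registry lookup."""
    key = normalise_name(name)
    return ALIASES.get(key) or registry.get(key)


# Hypothetical known-traps table: (normalised mention, wrong ticker) -> fix.
# None means the tag is dropped. All three fixes are assumptions.
KNOWN_TRAPS: Dict[Tuple[str, str], Optional[str]] = {
    ("baidu", "BABA"): "BIDU",
    ("snapchat", "SN"): "SNAP",
    ("samsung", "SN"): None,
}


def correct_trap(mention: str, ticker: str) -> Optional[str]:
    """Return the corrected ticker, or the input ticker if no trap applies."""
    return KNOWN_TRAPS.get((normalise_name(mention), ticker), ticker)


# One sample record in the assumed company_tickers.json shape.
SAMPLE = {"0": {"cik_str": 320193, "ticker": "AAPL", "title": "Apple Inc."}}
registry = build_registry(SAMPLE)
```

With this sample, resolve("Block, Inc.", registry) goes through the alias map, while resolve("Apple", registry) falls through to the registry entry. Checking aliases first is a design choice of the sketch, made so that a brand name wins over a coincidental registry match.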

Grounding gate

A deterministic post-step drops any ticker the model emitted that lacks textual evidence in the source. This catches a common failure mode of fine-tuned taggers: hallucinating mega-cap names (AAPL, MSFT, NVDA) onto loosely related stories purely from theme association. A ticker passes only if its symbol appears literally in the text, or if its canonical company name or one of its aliases does.
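
The gate described above could look roughly like the following Python sketch. The function names and the names_by_ticker structure are hypothetical. Two details are assumptions rather than documented behaviour: the symbol match is case-sensitive and bounded by non-alphanumeric characters (so the ticker ALL would not match the word "all"), and name matching is case-insensitive with the same boundaries (so "Pineapple" does not ground AAPL).

```python
import re
from typing import Dict, Iterable, List


def _contains(needle: str, haystack: str) -> bool:
    """True if needle occurs in haystack, not embedded in a longer token."""
    pattern = rf"(?<![A-Za-z0-9]){re.escape(needle)}(?![A-Za-z0-9])"
    return re.search(pattern, haystack) is not None


def is_grounded(ticker: str, text: str, names: Iterable[str]) -> bool:
    # Rule 1: the symbol itself appears literally (case-sensitive, assumed).
    if _contains(ticker, text):
        return True
    # Rule 2: a canonical name or alias appears (case-insensitive, assumed).
    lowered = text.lower()
    return any(_contains(name.lower(), lowered) for name in names)


def grounding_gate(
    tickers: List[str], text: str, names_by_ticker: Dict[str, List[str]]
) -> List[str]:
    """Keep only the tickers with textual evidence, preserving order."""
    return [
        t for t in tickers if is_grounded(t, text, names_by_ticker.get(t, ()))
    ]


# Hypothetical name table for the example.
NAMES = {"AAPL": ["Apple"], "MSFT": ["Microsoft"], "NVDA": ["Nvidia"]}

kept = grounding_gate(
    ["AAPL", "MSFT", "NVDA"], "Apple beats Q3 estimates", NAMES
)
```

Here kept contains only AAPL, because neither MSFT nor NVDA has a symbol or name in the text. The step is purely string-based, so it stays deterministic no matter what the model emits.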

ETF gating

Broad-market and sector ETFs (SPY, QQQ, XLK, XLE, etc.) require explicit textual evidence ("S&P 500", "tech sector roundup", "treasuries"). Single-stock stories don't tag the constituent ETF even when the company is in the index; otherwise every AAPL story would also tag SPY, which is noise.
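
A minimal Python sketch of this rule follows. The evidence table is hypothetical: the source does not say which phrases unlock which ETF, so the phrase-to-ticker pairings below are assumptions chosen for illustration, as are the function and variable names.

```python
from typing import Dict, List, Tuple

# Hypothetical evidence phrases per gated ETF (lower-case, assumed pairings).
ETF_EVIDENCE: Dict[str, Tuple[str, ...]] = {
    "SPY": ("s&p 500",),
    "QQQ": ("nasdaq-100", "nasdaq 100"),
    "XLK": ("tech sector",),
    "XLE": ("energy sector",),
}


def etf_gate(tickers: List[str], text: str) -> List[str]:
    """Drop gated ETFs whose evidence phrases are absent from the text."""
    lowered = text.lower()
    kept = []
    for ticker in tickers:
        phrases = ETF_EVIDENCE.get(ticker)
        # Tickers not in the table are not ETFs under this gate and pass through.
        if phrases is None or any(p in lowered for p in phrases):
            kept.append(ticker)
    return kept


single_stock = etf_gate(["AAPL", "SPY"], "Apple beats Q3 estimates")
index_story = etf_gate(["SPY"], "S&P 500 closes at a record high")
```

In this example single_stock keeps AAPL and drops SPY, while index_story keeps SPY because the index is named in the text.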

Why this matters

Clean tagging is what makes everything else work. Bad tagging contaminates your watchlist, breaks the sentiment-aggregation rollups, and corrupts the AI Top Trades evidence packs (which pull headlines by ticker). We deliberately trade recall for precision, preferring a missed tag to a false one.