Working with Responses

GenSearch responses are delivered as markdown text with inline citations that link back to source documents in the AlphaSense content library. This guide covers the three citation formats you will encounter, how to parse them programmatically, and how to extract structured data from the markdown response.


The 3 Citation Formats

Every citation in a GenSearch response is a markdown hyperlink pointing to research.alpha-sense.com with deep-link query parameters. The API uses three distinct formats depending on context.

| Format | Example | When Used |
|---|---|---|
| Short name + link | [[38 • S1/A]](https://research.alpha-sense.com?docid=...&page=29) | Most common inline citation |
| Number-only + link | [[311]](https://research.alpha-sense.com?docid=...&page=33) | Repeated references, tables |
| Full metadata + link | [[312] S1/A • Roblox Corp • 22 Feb 21 • "Amended Prospectus"](https://research.alpha-sense.com?docid=...&page=33) | First mention, full provenance |

Example in context

Apple reported Q4 2025 revenue of $94.9 billion, a 6% year-over-year increase
[[38 • 10-K]](https://research.alpha-sense.com?docid=V00001234&page=29).
The growth was primarily driven by strong performance in the Services segment
[[38]](https://research.alpha-sense.com?docid=V00001234&page=31), which reached
an all-time high of $25.0 billion
[[312] 10-K • Apple Inc. • 30 Oct 25 • "Annual Report"](https://research.alpha-sense.com?docid=V00001234&page=45).

URL Anatomy

Each citation URL contains query parameters that deep-link into the AlphaSense research platform.

| Parameter | Description | Example |
|---|---|---|
| docid | Unique document identifier in the AlphaSense content library | V00001234 |
| page | The page number within the document where the cited content appears | 29 |
| stmt | Statement highlight identifier; pinpoints the exact statement on the page | stmt_5a3b |
| hl | Highlight terms; search keywords highlighted in the document view | revenue+growth |

A fully-qualified citation URL looks like this:

https://research.alpha-sense.com?docid=V00001234&page=29&stmt=stmt_5a3b&hl=revenue+growth

Not all parameters are present in every URL. The docid and page parameters appear in virtually every citation, while stmt and hl are included when available.
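
The parameters in the table above can be read with Python's standard urllib.parse module. The following is a minimal sketch; the helper name parse_citation_url is illustrative and not part of the GenSearch API, and it treats every parameter as optional because stmt and hl are not always present.

```python
from typing import Optional
from urllib.parse import urlparse, parse_qs


def parse_citation_url(url: str) -> dict[str, Optional[str]]:
    """Return the docid, page, stmt, and hl parameters of a citation URL.

    Hypothetical helper: parameters missing from the URL map to None.
    """
    params = parse_qs(urlparse(url).query)
    return {
        name: params.get(name, [None])[0]
        for name in ("docid", "page", "stmt", "hl")
    }


info = parse_citation_url(
    "https://research.alpha-sense.com"
    "?docid=V00001234&page=29&stmt=stmt_5a3b&hl=revenue+growth"
)
print(info)
```

Note that parse_qs decodes the + in the hl value as a space, so the highlight terms come back as "revenue growth". All values are returned as strings, including page.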


Regex Patterns

Use the following regular expressions to extract citations from GenSearch markdown responses.

import re

# -------------------------------------------------------------------
# Short name: [[38 • S1/A]](url)
# Captures: (1) number, (2) source name, (3) URL
# -------------------------------------------------------------------
short_pattern = r'\[\[(\d+)\s*•\s*([^\]]+)\]\]\((https?://[^\)]+)\)'

# -------------------------------------------------------------------
# Number-only: [[311]](url)
# Captures: (1) number, (2) URL
# -------------------------------------------------------------------
number_pattern = r'\[\[(\d+)\]\]\((https?://[^\)]+)\)'

# -------------------------------------------------------------------
# Full metadata: [[312] S1/A • Roblox Corp • 22 Feb 21 • "Amended Prospectus"](url)
# Captures: (1) number, (2) metadata string, (3) URL
# -------------------------------------------------------------------
full_pattern = r'\[\[(\d+)\]\s*([^\]]*)\]\((https?://[^\)]+)\)'


# --- Usage example ---
sample = (
    'Revenue grew 6% YoY [[38 • 10-K]](https://research.alpha-sense.com'
    '?docid=V00001234&page=29) and Services hit a record '
    '[[311]](https://research.alpha-sense.com?docid=V00001234&page=31).'
)

short_matches = re.findall(short_pattern, sample)
number_matches = re.findall(number_pattern, sample)

for num, source, url in short_matches:
    print(f"Citation {num}: {source.strip()} -> {url}")

for num, url in number_matches:
    print(f"Citation {num}: -> {url}")
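
One thing to watch when running the patterns separately: the metadata group in full_pattern may be empty, so that pattern also matches number-only citations and can count them twice. A possible alternative is a single combined expression that classifies each citation in document order. The sketch below is one way to do that; CITATION_RE and classify_citations are illustrative names, not part of the GenSearch API.

```python
import re

# Hypothetical combined pattern: one alternation covers all three formats.
CITATION_RE = re.compile(
    r'\[\[(?P<number>\d+)'
    r'(?:\s*•\s*(?P<short>[^\]]+)\]'   # short name, e.g. [[38 • 10-K]]
    r'|\]\s*(?P<meta>[^\]]*))'         # number-only or full metadata
    r'\]\((?P<url>https?://[^)]+)\)'
)


def classify_citations(markdown: str) -> list[tuple[str, str, str]]:
    """Return (kind, number, url) for each citation, in document order."""
    found = []
    for m in CITATION_RE.finditer(markdown):
        if m.group("short") is not None:
            kind = "short"
        elif m.group("meta").strip():
            kind = "full"
        else:
            kind = "number"
        found.append((kind, m.group("number"), m.group("url")))
    return found


text = (
    'Revenue grew [[38 • 10-K]](https://research.alpha-sense.com?docid=V00001234&page=29), '
    'Services hit a record [[311]](https://research.alpha-sense.com?docid=V00001234&page=31), '
    'see [[312] 10-K • Apple Inc. • 30 Oct 25 • "Annual Report"]'
    '(https://research.alpha-sense.com?docid=V00001234&page=45).'
)
for kind, number, url in classify_citations(text):
    print(kind, number, url)
```

Because each citation is matched exactly once, no de-duplication step is needed and the results keep the order in which citations appear in the response.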

Rendering Citations

To display GenSearch responses in your UI, convert the markdown citation syntax into clickable HTML anchor tags. The following example transforms all three citation formats, such as [[N • Source]](url), into styled links that open the AlphaSense document viewer.

import html
import re


def _anchor(number: str, url: str, title: str) -> str:
    """Build one citation anchor, escaping values used inside attributes."""
    return (
        f'<a href="{html.escape(url)}" target="_blank" '
        f'rel="noopener noreferrer" class="citation" '
        f'title="{html.escape(title)}">[{number}]</a>'
    )


def render_citations_html(markdown: str) -> str:
    """Convert GenSearch citation syntax to HTML anchor tags.

    Handles all three citation formats:
    - [[N • Source]](url) -> <a href="url" ...>[N]</a>
    - [[N]](url) -> <a href="url" ...>[N]</a>
    - [[N] metadata](url) -> <a href="url" ...>[N]</a>

    URLs and metadata are HTML-escaped because they contain characters
    such as & and " that would otherwise break the attributes.
    """
    # Short name format
    result = re.sub(
        r'\[\[(\d+)\s*•\s*[^\]]+\]\]\((https?://[^\)]+)\)',
        lambda m: _anchor(m.group(1), m.group(2), f"Source {m.group(1)}"),
        markdown,
    )

    # Number-only format
    result = re.sub(
        r'\[\[(\d+)\]\]\((https?://[^\)]+)\)',
        lambda m: _anchor(m.group(1), m.group(2), f"Source {m.group(1)}"),
        result,
    )

    # Full metadata format (runs last; its pattern would also match number-only)
    result = re.sub(
        r'\[\[(\d+)\]\s*([^\]]*)\]\((https?://[^\)]+)\)',
        lambda m: _anchor(m.group(1), m.group(3), m.group(2).strip()),
        result,
    )

    return result

Markdown Structure

GenSearch responses follow a consistent markdown structure. Understanding this structure helps you parse, render, and extract data programmatically.

Headers

Responses use standard markdown headers to organize content into sections. Deep Research mode produces the most structured output, often with multiple heading levels.

## Revenue Overview

Apple's fiscal Q4 2025 revenue reached $94.9 billion [[38 • 10-K]](...).

### Services Segment

The Services segment generated $25.0 billion [[311]](...), representing a
new all-time quarterly record.

## Margin Analysis

Gross margin expanded to 46.2% [[38 • 10-K]](...), driven by the higher-margin
Services mix.
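
To navigate a structured response, it can help to list its headings together with their levels. The following is a minimal sketch; heading_outline is an illustrative name, and it assumes headings are written in the `#`-prefixed style shown above. It does not skip `#` lines that appear inside code blocks.

```python
import re

# One to six leading '#' characters, whitespace, then the heading text.
HEADING_RE = re.compile(r'^(#{1,6})\s+(.*?)\s*$')


def heading_outline(markdown: str) -> list[tuple[int, str]]:
    """Return (level, title) for each heading, in order of appearance."""
    outline = []
    for line in markdown.splitlines():
        m = HEADING_RE.match(line)
        if m:
            outline.append((len(m.group(1)), m.group(2)))
    return outline


sample = """## Revenue Overview

Apple's fiscal Q4 2025 revenue reached $94.9 billion.

### Services Segment

The Services segment generated $25.0 billion.

## Margin Analysis
"""
for level, title in heading_outline(sample):
    # Indent sub-headings relative to the top-level '##' sections.
    print("  " * (level - 2) + title)
```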

Bullet Points

Key findings are often presented as bullet lists, especially in Fast and Think Longer modes.

Key takeaways from Apple's Q4 2025 earnings:

- Revenue: $94.9 billion, up 6% YoY [[38 • 10-K]](...)
- Services: $25.0 billion, all-time quarterly record [[311]](...)
- Gross margin: 46.2%, up 80bps YoY [[38 • 10-K]](...)
- iPhone revenue: $46.2 billion, up 3% YoY [[38 • 10-K]](...)

Tables

Financial comparisons are frequently rendered as markdown tables.

| Metric | Q4 2025 | Q4 2024 | Change |
|---|---|---|---|
| Revenue | $94.9B | $89.5B | +6% |
| Gross Margin | 46.2% | 45.4% | +80bps |
| Services | $25.0B | $22.3B | +12% |

Citation Sections

Some Deep Research responses include a dedicated sources section at the end of the response listing all cited documents.


Structured Data Extraction

Use the following utilities to parse financial tables and bullet points from GenSearch markdown responses.

import re


def extract_tables(markdown: str) -> list[list[list[str]]]:
    """Extract all markdown tables as lists of rows.

    Returns a list of tables, where each table is a list of rows,
    and each row is a list of cell values (strings). The header row
    is the first row of each table; separator rows are dropped.
    """
    tables: list[list[list[str]]] = []
    current_table: list[list[str]] = []

    for line in markdown.splitlines():
        stripped = line.strip()
        if stripped.startswith("|") and stripped.endswith("|"):
            # Skip separator rows (e.g., |---|---|---|)
            if re.match(r'^\|[\s\-:|]+\|$', stripped):
                continue
            cells = [
                cell.strip()
                for cell in stripped.split("|")[1:-1]  # drop leading/trailing empty
            ]
            current_table.append(cells)
        else:
            # A non-table line ends the table currently being collected
            if current_table:
                tables.append(current_table)
                current_table = []

    if current_table:
        tables.append(current_table)

    return tables


def extract_bullet_points(markdown: str) -> list[str]:
    """Extract all bullet points from the markdown.

    Leading whitespace is ignored, so nested bullets are returned
    alongside top-level ones.
    """
    bullets: list[str] = []
    for line in markdown.splitlines():
        stripped = line.strip()
        if stripped.startswith("- ") or stripped.startswith("* "):
            bullets.append(stripped[2:].strip())
    return bullets


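def extract_financial_values(text: str) -> dict[str, str]:
    """Extract key-value pairs from text like 'Revenue: $94.9 billion'.

    Matches dollar amounts (with an optional billion/million/B/M suffix)
    and percentages such as 'Gross margin: 46.2%'.
    Returns a dict mapping metric names to their values.
    """
    pattern = (
        r'([A-Za-z][A-Za-z ]*):\s*'  # metric name, kept to a single line
        r'(\$\d+(?:,\d{3})*(?:\.\d+)?\s*(?:billion|million|B|M)?\b'  # dollar amount
        r'|\d+(?:\.\d+)?\s*%)'  # or a percentage
    )
    results: dict[str, str] = {}
    for match in re.finditer(pattern, text, re.IGNORECASE):
        key = match.group(1).strip()
        value = match.group(2).strip()
        results[key] = value
    return results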
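
Since extract_tables returns each table as a list of rows with the header row first, a natural follow-up step is to key each data row by its column header. The sketch below shows one way to do that; rows_to_records is a hypothetical helper, and the sample rows are the table from the Tables section above.

```python
def rows_to_records(table: list[list[str]]) -> list[dict[str, str]]:
    """Turn one table (header row first) into a list of dicts keyed by header.

    Hypothetical helper: expects the row-list shape that extract_tables
    returns, and skips rows whose cell count differs from the header's.
    """
    if not table:
        return []
    header, *rows = table
    return [dict(zip(header, row)) for row in rows if len(row) == len(header)]


table = [
    ["Metric", "Q4 2025", "Q4 2024", "Change"],
    ["Revenue", "$94.9B", "$89.5B", "+6%"],
    ["Gross Margin", "46.2%", "45.4%", "+80bps"],
    ["Services", "$25.0B", "$22.3B", "+12%"],
]
records = rows_to_records(table)
for record in records:
    print(record["Metric"], record["Change"])
```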

Edge Cases

When working with GenSearch responses, handle these edge cases to build a robust integration.

Empty results (NO_DOCS error code)

If GenSearch cannot find relevant documents for a query, the polling response returns an error with code NO_DOCS. No markdown is generated.

{
"data": {
"genSearch": {
"conversation": {
"id": "conv_abc123",
"markdown": null,
"progress": 1.0,
"error": {
"code": "NO_DOCS"
}
}
}
}
}

How to handle: Display a message to the user suggesting they broaden their query or adjust search terms. Check the error field before attempting to parse markdown.
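
The check described above could look like the following sketch. It assumes the data.genSearch.conversation shape from the example response; read_conversation is an illustrative name, and error codes other than NO_DOCS are not covered here.

```python
from typing import Optional


def read_conversation(payload: dict) -> tuple[Optional[str], Optional[str]]:
    """Return (markdown, error_code) from a polling response.

    Assumes the data.genSearch.conversation shape shown above. The error
    field is checked first, so markdown is never parsed when it is set.
    """
    conversation = payload["data"]["genSearch"]["conversation"]
    error = conversation.get("error")
    if error:
        return None, error.get("code")
    return conversation.get("markdown"), None


payload = {
    "data": {
        "genSearch": {
            "conversation": {
                "id": "conv_abc123",
                "markdown": None,
                "progress": 1.0,
                "error": {"code": "NO_DOCS"},
            }
        }
    }
}
markdown, error_code = read_conversation(payload)
if error_code == "NO_DOCS":
    print("No matching documents found. Try broadening your query.")
```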

Partial results

While progress is less than 1.0, the markdown field may contain incomplete content. Partial markdown may have:

  • Unfinished sentences at the end of the text
  • Open markdown formatting (e.g., an unclosed bold ** or table row)
  • Citations that reference documents not yet fully processed

How to handle: If you display partial results during polling, consider appending a loading indicator. Only treat the response as final when progress reaches 1.0.
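
As a rough sketch of that handling, the function below appends a placeholder while progress is below 1.0 and returns the markdown unchanged once it is final. The name display_text and the "_Generating..._" indicator are assumptions for illustration; substitute whatever loading element your UI uses.

```python
from typing import Optional


def display_text(markdown: Optional[str], progress: float) -> str:
    """Return text for the UI, marking partial content as still loading."""
    text = markdown or ""
    if progress < 1.0:
        # Hypothetical indicator appended only while the response is partial.
        return text.rstrip() + "\n\n_Generating..._"
    return text


print(display_text("Revenue grew 6% year over", 0.4))
print(display_text("Revenue grew 6% year over year.", 1.0))
```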

Truncated responses

Very long Deep Research responses may reach internal length limits. When this occurs, the response is complete (progress is 1.0) but the content may end abruptly.

How to handle: Check whether the markdown ends with a complete sentence or section. If it appears truncated, inform the user that the response reached the maximum length and suggest breaking the question into smaller, more focused queries.
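
There is no truncation flag described in this guide, so detection has to be heuristic. The sketch below inspects the last non-empty line; looks_truncated is an illustrative name, and the rules (sentence punctuation, closed table rows, a closing parenthesis from a citation link) are assumptions that will misclassify some responses, for example bullets that end without punctuation or a citation.

```python
def looks_truncated(markdown: str) -> bool:
    """Heuristically guess whether a completed response ends abruptly."""
    lines = [line.strip() for line in markdown.splitlines() if line.strip()]
    if not lines:
        return False
    last = lines[-1]
    if last.startswith("#"):
        return True  # a heading with no content after it
    if last.startswith("|"):
        return not last.endswith("|")  # table row missing its closing pipe
    # Sentence punctuation, a closing quote, or the ")" that ends a citation link
    return not last.endswith((".", "!", "?", ")", '"'))


print(looks_truncated("Gross margin expanded to 46.2%, driven by the"))
print(looks_truncated("Gross margin expanded to 46.2%."))
```

Apply a check like this only when progress is 1.0; while polling, incomplete endings are expected.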


Utility Functions

The following copy-paste ready functions cover the most common parsing tasks when working with GenSearch responses.

import re
from urllib.parse import urlparse, parse_qs


def extract_citations(markdown: str) -> list[dict]:
    """Extract all citations from a GenSearch markdown response.

    Returns a list of dicts, each containing:
    - number (str): The citation number
    - source (str): The source name or metadata (empty string if number-only)
    - url (str): The full citation URL
    - docid (str | None): The document ID extracted from the URL
    - page (str | None): The page number extracted from the URL

    Results are grouped by format (short name, number-only, full metadata),
    not returned in document order.
    """
    citations = []
    seen = set()

    # One pass per format; `seen` de-duplicates (number, url) pairs
    # Format 1: [[N • Source]](url)
    for m in re.finditer(r'\[\[(\d+)\s*•\s*([^\]]+)\]\]\((https?://[^\)]+)\)', markdown):
        key = (m.group(1), m.group(3))
        if key not in seen:
            seen.add(key)
            parsed = urlparse(m.group(3))
            params = parse_qs(parsed.query)
            citations.append({
                "number": m.group(1),
                "source": m.group(2).strip(),
                "url": m.group(3),
                "docid": params.get("docid", [None])[0],
                "page": params.get("page", [None])[0],
            })

    # Format 2: [[N]](url)
    for m in re.finditer(r'\[\[(\d+)\]\]\((https?://[^\)]+)\)', markdown):
        key = (m.group(1), m.group(2))
        if key not in seen:
            seen.add(key)
            parsed = urlparse(m.group(2))
            params = parse_qs(parsed.query)
            citations.append({
                "number": m.group(1),
                "source": "",
                "url": m.group(2),
                "docid": params.get("docid", [None])[0],
                "page": params.get("page", [None])[0],
            })

    # Format 3: [[N] metadata](url)
    for m in re.finditer(r'\[\[(\d+)\]\s*([^\]]+)\]\((https?://[^\)]+)\)', markdown):
        key = (m.group(1), m.group(3))
        if key not in seen:
            seen.add(key)
            parsed = urlparse(m.group(3))
            params = parse_qs(parsed.query)
            citations.append({
                "number": m.group(1),
                "source": m.group(2).strip(),
                "url": m.group(3),
                "docid": params.get("docid", [None])[0],
                "page": params.get("page", [None])[0],
            })

    return citations


def extract_sections(markdown: str) -> dict[str, str]:
    """Extract sections from a GenSearch markdown response.

    Splits the markdown on heading lines (## or ###) and returns
    a dict mapping each heading to its content. Any content before
    the first heading is stored under the empty-string key, and a
    repeated heading overwrites the earlier entry.
    """
    sections: dict[str, str] = {}
    current_heading = ""
    current_content: list[str] = []

    for line in markdown.splitlines():
        if line.startswith("## ") or line.startswith("### "):
            if current_heading or current_content:
                sections[current_heading] = "\n".join(current_content).strip()
            current_heading = line.lstrip("#").strip()
            current_content = []
        else:
            current_content.append(line)

    # Capture the last section
    if current_heading or current_content:
        sections[current_heading] = "\n".join(current_content).strip()

    return sections


def strip_citations(markdown: str) -> str:
    """Remove all citation markers from markdown, leaving clean text.

    Strips all three citation formats:
    - [[N • Source]](url) -> ""
    - [[N]](url) -> ""
    - [[N] metadata](url) -> ""
    """
    # Remove short name citations
    text = re.sub(r'\[\[\d+\s*•\s*[^\]]+\]\]\(https?://[^\)]+\)', '', markdown)
    # Remove number-only citations
    text = re.sub(r'\[\[\d+\]\]\(https?://[^\)]+\)', '', text)
    # Remove full metadata citations
    text = re.sub(r'\[\[\d+\]\s*[^\]]*\]\(https?://[^\)]+\)', '', text)
    # Clean up extra whitespace left behind, including a stray space
    # before punctuation (e.g. "increase ." -> "increase.")
    text = re.sub(r' +', ' ', text)
    text = re.sub(r' ([.,;:!?])', r'\1', text)
    return text.strip()
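
Once citations are extracted, a common next step is to see which pages of each source document a response relies on. The sketch below groups page numbers by docid; pages_by_document is a hypothetical helper that expects dicts with the docid and page keys produced by extract_citations, and the sample input is hand-written for illustration.

```python
from collections import defaultdict


def pages_by_document(citations: list[dict]) -> dict[str, list[int]]:
    """Map each docid to the sorted, de-duplicated pages cited from it.

    Hypothetical helper: skips entries with a missing docid or a page
    value that is missing or not a plain integer string.
    """
    pages: dict[str, set[int]] = defaultdict(set)
    for citation in citations:
        docid, page = citation.get("docid"), citation.get("page")
        if docid and page and page.isdigit():
            pages[docid].add(int(page))
    return {docid: sorted(nums) for docid, nums in pages.items()}


citations = [
    {"number": "38", "docid": "V00001234", "page": "29"},
    {"number": "311", "docid": "V00001234", "page": "31"},
    {"number": "38", "docid": "V00001234", "page": "29"},
    {"number": "7", "docid": None, "page": None},
]
print(pages_by_document(citations))
```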

Related Resources
  • GenSearch Modes: Learn about the fast, thinkLonger, and deepResearch modes, their credit costs, and when to use each.
  • Credits and Limits: Understand how credits are consumed and what rate limits apply.