GenSearch & Deep Research Troubleshooting
Overview
The AlphaSense Enterprise Insight platform provides two AI-powered research capabilities: Deep Research (DR), which performs multi-step iterative research generation across planning, retrieval, and citation stages; and GenSearch, which delivers LLM-assisted search answers grounded in retrieved documents. Both features share a common dependency stack including hybrid search, reranking, embedding inference, and citation services, but operate through separate orchestration pipelines.
This document outlines common failure scenarios for Deep Research and GenSearch and provides troubleshooting steps for each.
Failure Scenarios
1. Deep Research Request Fails Mid-Run or Never Completes
Triage:
A user submits a Deep Research request and it fails partway through execution, runs indefinitely without completing, or stops early with empty output. Because DR spans multiple sequential steps across planning, retrieval, and citation services, a failure in any hop (a pod restart, queue backlog, or provider rate limit during a long-running run) can stall or abort the request.
Troubleshooting:
Capture the repro details before investigating: prompt, user, timestamp, and approximate duration before failure.
Verify that Deep Research and its dependencies are in Running state:
kubectl get pods -n research | grep -E "deep-research|assistant-planning|hybrid-search-service-deepresearch|citation-service-deepresearch|text-embedding|nlu-service-deepresearch"
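A minimal sketch for spotting unhealthy pods in output like the command above produces. The sample output and pod names are illustrative; in practice, pipe the live `kubectl get pods -n research` output into the function.

```shell
# Flag any pod whose STATUS is not Running (or Completed).
flag_unhealthy_pods() {
  # Expects the standard `kubectl get pods` columns: NAME READY STATUS RESTARTS AGE
  awk 'NR > 1 && $3 != "Running" && $3 != "Completed" { print $1, $3 }'
}

# Illustrative output; replace with: kubectl get pods -n research | flag_unhealthy_pods
sample_pods='NAME READY STATUS RESTARTS AGE
deep-research-7f9c 1/1 Running 0 3d
assistant-planning-5b2d 0/1 CrashLoopBackOff 12 3d
citation-service-deepresearch-9a1e 1/1 Running 0 3d'

printf '%s\n' "$sample_pods" | flag_unhealthy_pods
```

Any pod printed here is the first candidate for the log checks and restarts described below.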
Check deep-research logs for step boundaries, the failing hop, and any rate limit errors:
kubectl logs -n research deployment/deep-research --tail=500
If logs point to a planning failure, check the planner service:
kubectl logs -n research deployment/assistant-planning --tail=300
If logs point to a retrieval or citation failure, check downstream services:
kubectl logs -n research deployment/hybrid-search-service-deepresearch --tail=300
kubectl logs -n research deployment/citation-service-deepresearch --tail=300
If the failing service is in an error state, restart it:
kubectl rollout restart deployment/deep-research -n research
If the service is saturated, scale it up or tighten its retry configuration to prevent runaway latency. If a model or provider routing issue is identified, route traffic to an alternate provider if one is available.
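The log checks above can be sketched as a quick scan of a saved log for rate-limit and timeout errors, to confirm which hop is failing. The error strings and log format below are common patterns, not guaranteed formats; adjust the regex to what the deep-research logs actually emit.

```shell
# Count lines that look like provider rate-limit or timeout errors in a saved log.
count_provider_errors() {
  grep -c -i -E 'rate.?limit|429|timed? ?out' "$1" || true
}

# Illustrative saved log (in practice: kubectl logs -n research deployment/deep-research > /tmp/deep-research.log)
cat > /tmp/deep-research.log <<'EOF'
2024-05-01T12:00:01Z step=planning status=ok
2024-05-01T12:03:12Z step=retrieval error=provider rate limit exceeded (429)
2024-05-01T12:03:14Z step=retrieval retrying after backoff
2024-05-01T12:05:40Z step=citation error=request timed out
EOF

count_provider_errors /tmp/deep-research.log
```

A nonzero count concentrated in one step (planning, retrieval, or citation) points at the hop to investigate next.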
2. GenSearch Returns No Answer or Irrelevant Results
Triage:
A user submits a search query and receives an empty answer, an irrelevant response, or an answer with missing or incorrect citations. This typically indicates a failure in the retrieval, reranking, or query-parsing stage: the pipeline did not fetch enough relevant passages to ground the answer, or downstream embedding or reranking inference errors degraded result quality.
Troubleshooting:
Capture the repro details before investigating: query text, affected user, expected vs actual output, timestamp, and example documents that should have been cited.
Verify that GenSearch and its dependencies are in Running state:
kubectl get pods -n research | grep -E "graphql-assistant|hybrid-search|reranker|citation|nlu|text-embedding|feature-storage"
Check the GenSearch service logs for upstream errors, timeout chains, or empty retrieval sets:
kubectl logs -n research deployment/graphql-assistant --tail=400
Check retrieval and reranking services for errors:
kubectl logs -n platform-vector-search deployment/hybrid-search-service-gensearch --tail=300
kubectl logs -n platform-vector-search deployment/hybrid-search-service-deepresearch --tail=300
kubectl logs -n research deployment/hybrid-snippet-reranker --tail=300
If a specific component is saturated, scale it:
kubectl patch scaledobject hybrid-snippet-reranker -n research --type='merge' -p '{"spec":{"minReplicaCount": <N>}}'
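When scanning the GenSearch logs for empty retrieval sets, a filter like the following can help. The `retrieved=0` field name is an assumption for illustration; match whatever structured field the graphql-assistant logs actually emit for retrieval counts.

```shell
# Print the request IDs of queries whose retrieval stage returned zero passages.
find_empty_retrievals() {
  awk '/retrieved=0/ { print $1 }' "$1"
}

# Illustrative saved log (in practice: kubectl logs -n research deployment/graphql-assistant > /tmp/graphql-assistant.log)
cat > /tmp/graphql-assistant.log <<'EOF'
req-101 query="earnings guidance" retrieved=14 status=ok
req-102 query="segment margins" retrieved=0 status=empty_answer
req-103 query="capex outlook" retrieved=9 status=ok
EOF

find_empty_retrievals /tmp/graphql-assistant.log
```

Requests flagged here failed at retrieval rather than generation, which narrows the investigation to hybrid search and embedding services rather than the LLM.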
3. Citations Missing (Deep Research and GenSearch)
Triage:
Output is generated successfully but citations fail to attach, are incomplete, or reference incorrect documents. This applies to both Deep Research and GenSearch citation pipelines.
Troubleshooting:
Check citation service logs for errors and any document metadata issues.
For Deep Research:
kubectl logs -n research deployment/citation-service-deepresearch --tail=300
For GenSearch:
kubectl logs -n research deployment/citation-service --tail=300
If permission checks are failing during citation resolution, validate that the affected user has access to the documents being cited (see entitlement scenario below).
If the citation service is in an error state, restart it:
kubectl rollout restart deployment/citation-service -n research
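Before restarting, it can be useful to pull the document IDs that failed permission checks out of a saved citation log, so they can be cross-checked against the user's entitlements. The `doc_id` and `permission_denied` field names are illustrative assumptions; adapt the patterns to the actual citation-service log format.

```shell
# Extract unique document IDs that failed permission checks during citation resolution.
failed_citation_docs() {
  grep 'permission_denied' "$1" | sed -n 's/.*doc_id=\([^ ]*\).*/\1/p' | sort -u
}

# Illustrative saved log (in practice: kubectl logs -n research deployment/citation-service > /tmp/citation-service.log)
cat > /tmp/citation-service.log <<'EOF'
cite req=55 doc_id=DOC-881 status=attached
cite req=55 doc_id=DOC-992 status=permission_denied
cite req=56 doc_id=DOC-992 status=permission_denied
EOF

failed_citation_docs /tmp/citation-service.log
```

If the same documents appear repeatedly, the problem is more likely entitlements than the citation service itself, and a restart will not help.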
4. High Latency or Timeouts (Deep Research and GenSearch)
Triage:
Requests for Deep Research or GenSearch are taking significantly longer than expected, timing out, or causing downstream backpressure. This typically originates from p99 latency spikes in the reranker or embedding inference services, retry storms, or provider rate limits.
Troubleshooting:
Check litellm logs for timeout errors:
kubectl -n research logs deployment/litellm --tail=300
If the reranker or embedding service is the bottleneck, scale replicas:
kubectl scale deployment/hybrid-snippet-reranker -n research --replicas=<N>
If provider rate limits are causing retries, work with engineering to tune backoff and retry configuration.
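To see whether slowness is broad or confined to a few provider calls, per-request latencies can be pulled from a saved litellm log and compared against a threshold. The `latency_ms` field is an assumed log format for illustration.

```shell
# Print requests whose latency exceeds a threshold (in ms).
slow_requests() {
  awk -v t="$2" 'match($0, /latency_ms=[0-9]+/) {
    v = substr($0, RSTART + 11, RLENGTH - 11)  # strip the "latency_ms=" prefix
    if (v + 0 > t) print $1, v
  }' "$1"
}

# Illustrative saved log (in practice: kubectl -n research logs deployment/litellm > /tmp/litellm.log)
cat > /tmp/litellm.log <<'EOF'
req-1 provider=openai latency_ms=840 status=ok
req-2 provider=openai latency_ms=31020 status=timeout
req-3 provider=anthropic latency_ms=1210 status=ok
EOF

slow_requests /tmp/litellm.log 30000
```

A handful of extreme outliers suggests provider-side rate limiting or retries; uniformly elevated latencies point at a saturated reranker or embedding service.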
5. Works for Some Users but Not Others (Entitlement Issues)
Triage:
Deep Research or GenSearch requests succeed for most users but fail or return degraded results for specific users. This is typically caused by an entitlement or permission mismatch: the affected user cannot access the documents used for grounding, which produces empty retrieval sets or incomplete answers for that user.
Troubleshooting:
Entitlement issues can be complex to diagnose accurately.
Check litellm logs for entitlement-related errors:
kubectl -n research logs deployment/litellm --tail=300
Open a support ticket with AlphaSense Support for assistance. The following steps collect initial data that makes the support investigation more effective:
Confirm whether the problem affects one or multiple users, and whether the issue is consistent or intermittent.
Reproduce the issue using impersonation and compare results against a known-good user with similar entitlements to isolate whether the issue is user-specific.
Check entitlement-related errors in logs for the affected user.
Validate which documents were retrieved and confirm the affected user should have access to them. If entitlement data is out of sync, re-index or re-sync entitlements as needed (refer to the Solr–DDS Entitlement Mismatch runbook).
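The impersonation comparison above can be made concrete by diffing the document IDs retrieved for the affected user against a known-good user. The export files here are illustrative (one doc ID per line, sorted); how the IDs are exported depends on your logging setup.

```shell
# Illustrative exports of retrieved document IDs for each user, sorted one per line.
printf '%s\n' DOC-1 DOC-2 DOC-3 > /tmp/good-user-docs.txt
printf '%s\n' DOC-1 DOC-3 > /tmp/affected-user-docs.txt

# Documents the known-good user retrieved but the affected user did not
# (comm requires both inputs to be sorted).
comm -23 /tmp/good-user-docs.txt /tmp/affected-user-docs.txt
```

The documents printed are the ones to check for entitlement sync issues per the Solr-DDS Entitlement Mismatch runbook.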
When escalating, provide:
- Query or prompt and affected user
- Timestamp and request IDs
- Logs from the failing hop
- List of expected cited documents
Validation Steps
After applying any resolution:
- Verify that all affected pods are in Running state with no recent restarts
- Submit a representative Deep Research prompt and confirm it completes successfully end-to-end
- Submit a GenSearch query and confirm a grounded answer with correct citations is returned
- If the issue was user-specific, impersonate the affected user and confirm the request now succeeds
- If a service was restarted or scaled, monitor logs for healthy operation before closing the incident
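The restart check in the validation steps can be sketched as a filter over standard `kubectl get pods` output; the sample below is illustrative.

```shell
# Print any pod with a nonzero RESTARTS count.
flag_restarted_pods() {
  # $4 is the RESTARTS column; "+ 0" coerces values like "2 (5m ago)" to a number.
  awk 'NR > 1 && $4 + 0 > 0 { print $1, $4 }'
}

# Illustrative output; replace with: kubectl get pods -n research | flag_restarted_pods
sample='NAME READY STATUS RESTARTS AGE
deep-research-7f9c 1/1 Running 0 3d
hybrid-snippet-reranker-2c4a 1/1 Running 2 3d'

printf '%s\n' "$sample" | flag_restarted_pods
```

An empty result over the fix window, together with a successful end-to-end DR prompt and GenSearch query, is a reasonable signal to close the incident.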