Search Troubleshooting
Overview
The AlphaSense Enterprise product lets customers ingest private documents and use the full feature set of the AlphaSense platform on them. The platform also supports searching documents available on the AlphaSense mothership platform.
In this document, we will refer to these two main capabilities as:
- Private Document Search: The ability to search and access private documents ingested by the customer.
- Public Document Search: The ability to search and access documents available on the AlphaSense mothership platform.
This document also covers another common search use case, the ticker box, which can return both public and private documents.
The screenshot below shows the different search boxes.
We will walk through common failures and provide troubleshooting steps for both private and public document search functionalities, ensuring a seamless search experience for users.
Search Use Cases
Private Document Search
The diagram below illustrates the basic steps involved in performing a search using the keyword box.
Our goal is to eventually provide more detailed steps that simplify troubleshooting and generate diagnostic reports. Until then, this section serves as a guide to troubleshooting the problems encountered along the way.
Public Document Search
The diagram below illustrates the basic steps involved in performing a public doc search using the keyword box.
The main difference in this flow is that traffic will be routed to AlphaSense mothership instead of the Apache Solr data nodes in the cluster.
Search using the Ticker box
The ticker search box lets users search for documents by company ticker. The list of tickers is refreshed at a regular interval (default: 1h). Ticker search is supported in both the public and private document search flows.
Failure Scenarios
Troubleshooting app-search
Description:
app-search is a Node.js web service that exposes document search functionality via WebSockets. It is the first step in the search flow. Errors in this step can cause blank results or other undefined behavior, such as long loading times.
Triage:
- A blank text search doesn't return any results
- In a web browser, the Network tab of the developer tools shows service errors. Example screenshot below
Troubleshoot:
- Check pod health
kubectl get pods -l app=app-search -n applications
- Errors in this step should be visible from the pod log. Example query in Loki:
sum by (message, app) (count_over_time({app="app-search"} |~ "(?i)(error).* " != "status code 401" | regexp `(?i)(?P<message>(error).*).` [1h]))
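When Loki is not reachable, roughly the same aggregation can be run on raw log lines, for example on saved output of kubectl logs. The Python sketch below is an illustration only: the function name count_error_messages is hypothetical, and the regular expression is a simplified form of the one in the Loki query above (it keeps everything from the first case-insensitive "error" to the end of the line).

```python
import re
from collections import Counter

# Simplified version of the Loki regexp: capture from the first
# case-insensitive "error" to the end of the line.
ERROR_RE = re.compile(r"(?i)(?P<message>error.*)")


def count_error_messages(lines, exclude="status code 401"):
    """Count error messages per distinct text, skipping lines that contain `exclude`."""
    counts = Counter()
    for line in lines:
        if exclude in line:
            continue  # mirrors the != "status code 401" filter
        match = ERROR_RE.search(line)
        if match:
            counts[match.group("message").strip()] += 1
    return counts
```

Calling count_error_messages(open("app-search.log")) on a saved log file (the file name is an assumption) returns a Counter whose most_common() method lists the most frequent error messages first.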
Troubleshooting doc-search-realtime
Description:
doc-search-realtime is the component responsible for breaking down the user's search query into queries to the document storage backend and for verifying that the user is entitled to access the documents.
Triage:
- Press F12 in a web browser to open the developer tools and go to the Network tab
- Filter for calls containing doc-search and check whether any of them returned an error
Troubleshoot:
- Check pod status
kubectl get pods -n applications -l 'app in (doc-search,doc-search-realtime,doc-search-realtime-solr9)'
- Check the doc-search - SLO dashboard on Grafana
- Check the pod logs
- Generic errors:
{pod=~"doc-search.+"} |~ "(?i)error"
- Timeout errors:
{pod=~"doc-search.+"} |~ "(?i)error" |= "timeout"
Example result:
[2024-02-27 11:34:28,459] ERROR [723b56bacbbaaace] [node-fetch/1.0 (+https://github.com/bitinn/node-fetch)] [] [executor-46] [SearchResultNonPagingCallback.java:66] - Solr router query failed (possible timeout) ShardSource type: IMAGE_DOCS Exception: Can't retrieve data from the Solr: http://se-solrcloud-router.platform-search.svc.cluster.local/solr
- Solr-related errors:
{pod=~"doc-search.+"} |~ "(?i)error" |= "solr"
Example result:
[2024-01-16 14:26:08,158] ERROR [984a74486e9328f4] [OTHER/CURL/8.0.1] [] [executor-6458] [SearchResultNonPagingCallback.java:66] - Solr router query failed (possible timeout) ShardSource type: ALL_DOCS Exception: Can't retrieve data from the Solr: http://se-solrcloud-router.platform-search:80/solr
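The three Loki filters above overlap: a single log line can match more than one of them. The Python sketch below reproduces the filter logic on a plain string to make that visible. The function name matching_filters and the tag names are hypothetical; the matching rules follow the queries above (case-insensitive "error", then case-sensitive "timeout" and "solr", since the |= operator in LogQL is case-sensitive).

```python
import re

# Same case-insensitive match as |~ "(?i)error" in the queries above.
ERROR_RE = re.compile(r"(?i)error")


def matching_filters(line):
    """Return the set of filter names (hypothetical tags) that a log line matches."""
    if not ERROR_RE.search(line):
        return set()
    tags = {"generic"}
    if "timeout" in line:  # case-sensitive, like |= "timeout"
        tags.add("timeout")
    if "solr" in line:  # case-sensitive, like |= "solr"
        tags.add("solr")
    return tags
```

Applied to either example result above, the function returns all three tags: the message contains "(possible timeout)", and the router URL contains the lowercase string "solr". A timeout reported by doc-search can therefore also point at the Solr router layer described in the next section.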
Troubleshooting se-solrcloud-router
Description:
Documents ingested into the AlphaSense platform are ultimately stored in Apache Solr clusters. se-solrcloud-router is the service that routes queries from doc-search to the correct se-solrcloud-k8 data node (called a shard), aggregates the results from multiple data nodes, and sends the response back to doc-search.
Triage:
Failures in this layer are indicated in the doc-search logs. Example:
doc-search-realtime [2023-12-08 16:12:45,141] ERROR [ddbbb85b6366ff6f] [axios/1.6.2] [] [executor-47] [SearchService.java:78] - Failed to search docs in shard: ALL_DOCS org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://se-solrcloud-router.platform-search:80/solr: No shards to route with given route parameters!
at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:681)
at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:266)
at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:248)
at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:214)
at org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:1035)
at com.alphasense.solr.dao5.repository.AbstractSolrRepository.findByQuery(AbstractSolrRepository.java:192)
... 8 common frames omitted
Wrapped by: com.alphasense.solr.dao5.repository.SolrDaoException: Can't retrieve data from the Solr: http://se-solrcloud-router.platform-search:80/solr
at com.alphasense.solr.dao5.repository.AbstractSolrRepository.findByQuery(AbstractSolrRepository.java:206)
at com.alphasense.search.service.SearchService.requestAndConvertResponse(SearchService.java:89)
at com.alphasense.search.service.SearchService.search(SearchService.java:75)
at com.alphasense.search.service.SearchService.lambda$executeQueries$0(SearchService.java:102)
at com.alphasense.utils.tracing.util.TracingUtil.lambda$wrap$3(TracingUtil.java:88)
at com.alphasense.logging.pool.MdcUtil$1.run(MdcUtil.java:38)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.base/java.lang.Thread.run(Thread.java:833)
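In a trace like the one above, the useful part is the first line: it names the server that returned the error and the reason. The Python sketch below pulls those two values out of a log excerpt. The function name parse_remote_solr_error is hypothetical, and the pattern assumes the "Error from server at <url>: <reason>" wording seen in the example.

```python
import re
from urllib.parse import urlparse

# Matches the "Error from server at <url>: <reason>" part of the log line.
REMOTE_ERROR_RE = re.compile(r"Error from server at (?P<url>\S+?): (?P<reason>.+)")


def parse_remote_solr_error(log_text):
    """Return the host and reason of the first remote Solr error, or None."""
    for line in log_text.splitlines():
        match = REMOTE_ERROR_RE.search(line)
        if match:
            return {
                "host": urlparse(match.group("url")).hostname,
                "reason": match.group("reason").strip(),
            }
    return None
```

For the example above, the host is se-solrcloud-router.platform-search and the reason is "No shards to route with given route parameters!", which points at the router rather than at doc-search itself.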
Troubleshoot:
- Check pod health
kubectl get pods -l app=se-solrcloud-router-solr9 -n search-engine
- Check the Router - Server Level dashboard on Grafana
- Check logs:
- se-solrcloud-router log:
{pod=~"se-solrcloud-router-solr9.+"}
- solr-queue-indexer log:
{pod=~"solr-queue-indexer.+"}
Troubleshooting se-solr-node
Description:
Documents ingested into the AlphaSense platform are stored in Apache Solr clusters.
Triage:
Failures in this layer are indicated in the se-solrcloud-router or solr-queue-indexer logs. Example:
ERROR [ac4fca60-e765-4c9f-8a84-734438d3f474] [] [] [listener-high-priority52] [BaseCloudSolrClient.java:960] - Request to collection [lys-sd_1.0_36] failed due to (400) org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://se-solrcloud-k8-4.se-solrcloud-k8-headless.platform-search:8983/solr/lys-sd_1.0_36: ERROR: [doc=EC-c91bc706-8ad3-4b31-9271-460d3a4e9755-TESTDOC] unknown field 'AnalystPerspective', retry=0 commError=false errorCode=400
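A node-level error line like the one above identifies the affected collection, the HTTP status, and the data node that rejected the request. The Python sketch below extracts those three values. The function name parse_node_error is hypothetical, the patterns assume the wording of the example, and treating the first DNS label of the host as the pod name is an assumption based on the headless-service address in the example.

```python
import re
from urllib.parse import urlparse

COLLECTION_RE = re.compile(
    r"Request to collection \[(?P<collection>[^\]]+)\] failed due to \((?P<status>\d+)\)"
)
SERVER_RE = re.compile(r"Error from server at (?P<url>\S+?): ")


def parse_node_error(line):
    """Return collection, HTTP status and node name from a Solr node error line, or None."""
    collection = COLLECTION_RE.search(line)
    server = SERVER_RE.search(line)
    if not (collection and server):
        return None
    host = urlparse(server.group("url")).hostname
    return {
        "collection": collection.group("collection"),
        "status": int(collection.group("status")),
        # Assumption: the first DNS label is the pod name.
        "node": host.split(".")[0],
    }
```

For the example above, this yields the collection lys-sd_1.0_36, status 400, and node se-solrcloud-k8-4, which narrows the pod health and log checks to a single data node.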
Troubleshoot:
- Check pod health
kubectl get pods -n search-engine
- Check the Solr Shards - Server Level dashboard on Grafana
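Because the pod health check in this section lists every pod in the namespace, it can help to filter the output down to pods that need attention. The Python sketch below works on the JSON that kubectl get pods produces with the -o json flag. The function name unhealthy_pods is hypothetical, and "unhealthy" is defined here, as an assumption, as any pod whose phase is not Running or that has a container that is not ready.

```python
def unhealthy_pods(pod_list):
    """Return names of pods that are not Running or have containers that are not ready.

    `pod_list` is the parsed JSON output of `kubectl get pods -o json`.
    """
    unhealthy = []
    for pod in pod_list.get("items", []):
        status = pod.get("status", {})
        containers = status.get("containerStatuses", [])
        # A pod with no reported container statuses is treated as not ready.
        all_ready = bool(containers) and all(c.get("ready") for c in containers)
        if status.get("phase") != "Running" or not all_ready:
            unhealthy.append(pod["metadata"]["name"])
    return unhealthy
```

One way to use it is to save the output of kubectl get pods -n search-engine -o json to a file, load it with json.load, and pass the result to unhealthy_pods. The same function applies to the pod checks in the earlier sections.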