Google Files New Patent On Personal History-Based Search via @sejournal, @martinibuster

Google recently filed a new patent for a way to provide search results based on a user’s browsing and email history. The patent outlines a new way to search within the context of a search engine, within an email interface, and through a voice-based assistant (referred to in the patent as a voice-based dialog system).

A problem that many people have is that they can remember what they saw but they can’t remember where they saw it or how they found it. The new patent, titled Generating Query Answers From A User’s History, addresses that problem by letting people ask for content they previously saw in a webpage or an email using everyday language, such as “What was that article I read last week about chess?”

Traditional search engines don’t let users easily search their own browsing or email history using natural language. The invention takes a user’s spoken or typed question, recognizes that the question is asking for previously viewed content, and then retrieves search results from the user’s personal history (such as their browser history or emails). To narrow those results, it applies filters such as date, topic, or device used.

What’s novel about the invention is the system’s ability to understand vague or fuzzy natural language queries and match them to a user’s specific past interactions, including showing the version of a page as it looked when the user originally saw it (a cached version of the web page).

Query Classification (Intent) And Filtering

Query Classification

The system first determines whether the intent of the user’s spoken or typed query is to retrieve previously accessed information. This process is called query classification and involves analyzing the phrasing of the query to detect the intent. The system compares parts of the query to known patterns associated with history-seeking questions and uses techniques like semantic analysis and similarity thresholds to identify if the user’s intent is to seek something they’d seen before, even when the wording is vague or conversational.

The similarity threshold is an interesting part of the invention. The system measures how closely what the user says or types resembles known history-seeking phrases, and treats the query as history-seeking when that similarity passes the threshold. It’s not looking for an exact match but rather a close match.
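To make the idea of a similarity threshold concrete, here is a minimal Python sketch. It is not the patent's method: the pattern list, the 0.6 threshold, and the function names are all assumptions made for illustration, and character-level similarity from the standard library's difflib stands in for the semantic analysis the patent describes.

```python
from difflib import SequenceMatcher

# Hypothetical history-seeking phrases; the patent does not publish its patterns.
HISTORY_PATTERNS = [
    "what was that article i read",
    "i'm looking for a recipe i saw",
    "find the email i got",
    "show me the page i visited",
]

# Assumed cutoff, chosen only for this illustration.
SIMILARITY_THRESHOLD = 0.6


def best_similarity(query: str) -> float:
    """Return the highest similarity between the start of the query and any pattern."""
    text = query.lower().strip()
    scores = []
    for pattern in HISTORY_PATTERNS:
        # Compare only the part of the query that lines up with the pattern.
        prefix = text[: len(pattern)]
        scores.append(SequenceMatcher(None, prefix, pattern).ratio())
    return max(scores)


def is_history_seeking(query: str) -> bool:
    """Classify the query as history-seeking if it is close enough to a pattern."""
    return best_similarity(query) >= SIMILARITY_THRESHOLD
```

With these assumed patterns, “What was that article I read last week about chess?” is classified as history-seeking, while an ordinary query such as “weather in paris tomorrow” is not.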

Filtering

The next part is filtering, and it happens after the system has identified the history-seeking intent. It then applies filters such as the topic, time, or device to limit the search to content from the user’s personal history that matches those criteria.

The time filter is a way to constrain the search to within a specific time frame that’s mentioned or implied in the search query. This helps the system narrow down the search results to what the user is trying to find. So if a user speaks phrases like “last week” or “a few days ago” then it knows to restrict the query to those respective time frames.

An interesting quality of the time filter is that it’s applied with a level of fuzziness, which means it’s not exact. So when a person asks the voice assistant to find something from the past week, the system may not limit itself to a literal seven-day window but may instead widen the search to a longer period, such as the past two weeks.

The patent describes the fuzzy quality of the time filter:

“For example, the browser history collection… may include a list of web pages that were accessed by the user. The search engine… may obtain documents from the index… based on the filters from the formatted query.

For example, if the formatted query… includes a date filter (e.g., “last week”) and a topic filter (e.g., “chess story”), the search engine… may retrieve only documents from the collection… that satisfy these filters, i.e., documents that the user accessed in the previous week that relate to a “chess story.”

In this example, the search engine… may apply fuzzy time ranges to the “last week” filter to account for inaccuracies in human memory. In particular, while “last week” literally refers to the seven calendar days of the previous week, the search engine… may search for documents over a wider range, e.g., anytime in the past two weeks.”
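The fuzzy time range in that passage can be sketched in a few lines of Python. The phrase table, the doubling factor, and the function names below are assumptions for illustration. The patent only gives the example of “last week” being widened to the past two weeks and does not specify a general rule.

```python
from datetime import date, timedelta
from typing import Tuple

# Assumed widening factor, based on the patent's "last week" -> two weeks example.
FUZZ_FACTOR = 2

# Hypothetical mapping from time phrases to their literal length in days.
LITERAL_DAYS = {
    "yesterday": 1,
    "a few days ago": 3,
    "last week": 7,
    "last month": 30,
}


def fuzzy_range(phrase: str, today: date) -> Tuple[date, date]:
    """Return a widened (start, end) window for a time phrase."""
    days = LITERAL_DAYS[phrase] * FUZZ_FACTOR
    return today - timedelta(days=days), today


def accessed_in_range(accessed: date, phrase: str, today: date) -> bool:
    """Check whether an access date falls inside the widened window."""
    start, end = fuzzy_range(phrase, today)
    return start <= accessed <= end
```

Under these assumptions, a page viewed ten days ago would still match a “last week” query, which is the kind of allowance for imperfect memory the patent describes.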

Once a query is classified as asking for something that was previously seen, the system identifies details in the user’s phrasing that are indicative of topic, date or time, source, device, sender, or location and uses them as filters to search the user’s personal history.

Each filter helps narrow the scope of the search to match what the user is trying to recall. For example, a topic filter (“turkey recipe”) targets the subject of the content; a time filter (“last week”) restricts results to when the content was accessed; a source filter (“WhiteHouse.gov”) limits the search to specific websites; a device filter (“on my phone”) restricts results to content viewed on a particular device; a sender filter (“from grandma”) helps locate emails or shared content; and a location filter (“at work”) restricts results to content accessed in a particular physical place.

By combining these context-sensitive filters, the system mimics the way people naturally remember content in order to help users retrieve exactly what they’re looking for, even when their query is vague or incomplete.
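As a rough illustration of how several filters might be pulled from one query and combined, consider the following Python sketch. Everything in it is hypothetical: the HistoryItem record, the cue phrases, the topic list, and the simple substring matching are stand-ins, since the patent does not describe its parsing at this level of detail.

```python
import re
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class HistoryItem:
    """Hypothetical record of one previously viewed page or email."""
    title: str
    device: str
    sender: str = ""


# Hypothetical cue phrases and topics, for illustration only.
DEVICE_CUES = {"on my phone": "phone", "on my laptop": "laptop"}
TOPIC_TERMS = ["turkey recipe", "chess story"]


def extract_filters(query: str) -> Dict[str, str]:
    """Pick out device, sender, and topic hints from the query text."""
    text = query.lower()
    filters: Dict[str, str] = {}
    for cue, device in DEVICE_CUES.items():
        if cue in text:
            filters["device"] = device
    sender_match = re.search(r"\bfrom (\w+)", text)
    if sender_match:
        filters["sender"] = sender_match.group(1)
    for topic in TOPIC_TERMS:
        if topic in text:
            filters["topic"] = topic
    return filters


def matches(item: HistoryItem, filters: Dict[str, str]) -> bool:
    """Return True only if the item satisfies every filter that was found."""
    if "device" in filters and item.device != filters["device"]:
        return False
    if "sender" in filters and item.sender.lower() != filters["sender"]:
        return False
    if "topic" in filters:
        title = item.title.lower()
        # Substring check: every topic word must appear somewhere in the title.
        if not all(word in title for word in filters["topic"].split()):
            return False
    return True


def search_history(query: str, history: List[HistoryItem]) -> List[HistoryItem]:
    """Apply the extracted filters to the user's personal history."""
    filters = extract_filters(query)
    return [item for item in history if matches(item, filters)]
```

In this sketch, a query like “I’m looking for a turkey recipe I read on my phone” yields a topic filter and a device filter, and only history items that satisfy both are returned.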

Scope Of Search: What Is Searched

The next part of the patent defines the scope of the search, which is limited to predefined sources such as browser history, cached versions of web pages, or emails. Rather than searching the entire web, the system focuses only on the user’s personal history, making the results more relevant to what the user is trying to recall.

Cached Versions Of Previously Viewed Content

Another interesting feature described in the patent is web page caching. Caching refers to saving a copy of a web page as it appeared when the user originally viewed it. This enables the system to show the user that specific version of the page in search results, rather than the current version, which may have changed or been removed.

The cached version acts like a snapshot in time, making it easier for the user to recognize or remember the content they are looking for. This is especially useful when the user doesn’t remember precise details like the name of the page or where they found it, but would recognize it if they saw it again. By showing the version that the user actually saw, the system makes the search experience more aligned with how people remember things.
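One way to picture the snapshot idea is a small store that keeps a copy of each page at the moment it was viewed and later returns the copy the user actually saw. The Python sketch below is an assumption-laden illustration: the Snapshot and SnapshotStore names, their fields, and the lookup rule are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List, Optional


@dataclass
class Snapshot:
    """Hypothetical cached copy of a page as it appeared when viewed."""
    url: str
    viewed_at: datetime
    content: str


class SnapshotStore:
    """Keeps every viewed version of a page, keyed by URL."""

    def __init__(self) -> None:
        self._by_url: Dict[str, List[Snapshot]] = {}

    def record_view(self, url: str, viewed_at: datetime, content: str) -> None:
        """Save a copy of the page at the time the user viewed it."""
        self._by_url.setdefault(url, []).append(Snapshot(url, viewed_at, content))

    def version_seen(self, url: str, around: datetime) -> Optional[Snapshot]:
        """Return the latest snapshot taken at or before `around`, if any."""
        candidates = [s for s in self._by_url.get(url, []) if s.viewed_at <= around]
        return max(candidates, key=lambda s: s.viewed_at, default=None)
```

If a page was viewed in early May and changed before a second view in June, asking for the version seen in mid May returns the earlier copy rather than the current one.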

Potential Applications Of The Patent Invention

The system described in the patent can be applied in several real-world contexts where users may want to retrieve content they’ve previously seen:

Search Engines

The patent refers multiple times to the use of this technique in the context of a search engine that retrieves results not from the public web, but from the user’s personal history, such as previously visited web pages and emails. While the system is designed to search only content the user has previously accessed, the patent notes that some implementations may also include additional documents relevant to the query, even if the user hasn’t viewed them before.

Email Clients

The system treats previously accessed emails as part of the searchable history. For example, it can return an old email like “Grandma’s turkey meatballs” based on vague, natural language queries.

Voice Assistants

The patent includes examples of “a voice-based search” where users speak conversational queries like “I’m looking for a turkey recipe I read on my phone.” The system handles speech recognition and interprets intent to retrieve relevant results from personal history.

Read the entire patent here:

Generating query answers from a user’s history

Featured Image by Shutterstock/JHVEPhoto
