Correlations

An interesting type of data analysis is identifying dependencies between variables. For two continuous variables we can, for example, compute their correlation as an estimate of how strongly they depend on each other. Each concept in Event Registry can be seen as a time series in which the value for a particular day is the number of collected articles that mention the concept. Given any input time series, we can then compute which concepts correlate the most with it.
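As a rough illustration of the idea above, the following sketch computes the Pearson correlation between two daily count series. It is only an assumption made for illustration: this page does not state which correlation measure Event Registry uses, and the `pearson` helper and the sample counts are hypothetical.

```typescript
// Pearson correlation between two equally long daily count series.
// Illustrative only: the actual correlations are computed by Event Registry.
function pearson(xs: number[], ys: number[]): number {
    if (xs.length !== ys.length || xs.length < 2) {
        throw new Error("series must have the same length (at least 2)");
    }
    const n = xs.length;
    const meanX = xs.reduce((a, b) => a + b, 0) / n;
    const meanY = ys.reduce((a, b) => a + b, 0) / n;
    let cov = 0;
    let varX = 0;
    let varY = 0;
    for (let i = 0; i < n; i++) {
        const dx = xs[i] - meanX;
        const dy = ys[i] - meanY;
        cov += dx * dy;
        varX += dx * dx;
        varY += dy * dy;
    }
    const denom = Math.sqrt(varX * varY);
    // A constant series has zero variance; report 0 rather than NaN.
    return denom === 0 ? 0 : cov / denom;
}

// Made-up daily article counts for two hypothetical concepts.
const conceptA = [10, 20, 30, 40, 50];
const conceptB = [12, 22, 29, 41, 52];
const similarity = pearson(conceptA, conceptB);
```

A value close to 1 means the two series rise and fall together, a value close to -1 means they move in opposite directions, and a value near 0 suggests no linear relationship.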

To find what correlates the most with a time series in Event Registry, we can use the GetTopCorrelations class.

import { EventRegistry, GetCounts, GetTopCorrelations, QueryArticles } from "eventregistry";
const er = new EventRegistry({apiKey: "YOUR_API_KEY"});
const corr = new GetTopCorrelations(er);

Step 1: Providing input data

Depending on what you want to use as the input time series, you have three options: (a) loading the time series of a concept or category, (b) loading a time series based on an article query, or (c) providing your own data.

Input time series based on a concept/category from Event Registry

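To load the time series of a concept or a category, create a GetCounts instance for its URI and pass it to loadInputDataWithCounts().

er.getConceptUri("Obama").then((conceptUri) => {
    // Request the daily article counts for the resolved concept URI
    const counts = new GetCounts(conceptUri);
    // Use those counts as the input time series
    return corr.loadInputDataWithCounts(counts);
});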

Input time series based on an article query

You can also form an article query using a different set of conditions. The number of matching articles per day also forms a time series that can be used as the input. The example below finds all articles that mention the keyword "iphone" and uses the resulting time series as the input data.

const query = new QueryArticles({ keywords: "iphone" });
corr.loadInputDataWithQuery(query);

Input time series based on the user's input

You can also provide your own input data by calling the setCustomInputData() method. The argument is expected to be an array of [date, count] pairs.

// Each entry is a [date, count] pair; further entries are elided here
corr.setCustomInputData([["2015-01-01", 213], ["2015-01-02", 13], ["2015-01-03", 423], /* ... */]);
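If your own data starts out as individual records rather than daily totals, you first need to aggregate it into date and count pairs. The sketch below shows one way to do that. The `toDailyCounts` helper, the `DailyCount` type, and the sample timestamps are hypothetical and not part of the SDK, and the sketch assumes ISO 8601 timestamps. Days with no records are simply omitted; whether such days must be listed explicitly is not stated on this page.

```typescript
// A [date, count] pair, e.g. ["2015-01-01", 213].
type DailyCount = [string, number];

// Group ISO 8601 timestamps by calendar day and count the records per day.
function toDailyCounts(timestamps: string[]): DailyCount[] {
    const counts = new Map<string, number>();
    for (const ts of timestamps) {
        const day = ts.slice(0, 10); // "YYYY-MM-DD" prefix of the timestamp
        counts.set(day, (counts.get(day) ?? 0) + 1);
    }
    // ISO dates sort chronologically when compared as plain strings.
    return Array.from(counts.entries()).sort((a, b) =>
        a[0] < b[0] ? -1 : a[0] > b[0] ? 1 : 0
    );
}

// Made-up records, for example the times at which support tickets were opened.
const inputData = toDailyCounts([
    "2015-01-02T10:00:00Z",
    "2015-01-01T09:00:00Z",
    "2015-01-02T12:30:00Z",
]);
// inputData could then be passed on: corr.setCustomInputData(inputData);
```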

Step 2: Computing top correlations

Once the input data has been provided, we can compute what correlates the most with it. Depending on your interests, you can compute the correlations with either concepts or categories.

To compute the top correlations with concepts, call the getTopConceptCorrelations() method:

const conceptInfo = corr.getTopConceptCorrelations({
    conceptType: ["person", "org"],
    exactCount: 10,
    approxCount: 100,
});

The method arguments are as follows:

candidateConceptsQuery: optional. An instance of QueryArticles that can be used to limit the space of concept candidates
candidatesPerType: if candidateConceptsQuery is provided, this number of concepts for each valid type will be returned as candidates
conceptType: optional. A string or an array containing the concept types that are valid candidates on which to compute top correlations. Valid values are person, org, loc and/or wiki
exactCount: the number of returned concepts for which the exact value of the correlation is computed
approxCount: the number of returned concepts for which only an approximate value of the correlation is computed
returnInfo: specifies the details about the concepts that should be returned in the output result
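Because conceptType accepts either a single string or an array, it can help to normalize and check the value before building the argument object. The sketch below is one way to do so. The `ConceptType` type and the `normalizeConceptTypes` helper are hypothetical and not part of the SDK; only the four valid values are taken from the list above.

```typescript
// Hypothetical type covering the concept types listed above; not part of the SDK.
type ConceptType = "person" | "org" | "loc" | "wiki";

const VALID_CONCEPT_TYPES: ConceptType[] = ["person", "org", "loc", "wiki"];

// Accept a string or an array, and reject values outside the documented set.
function normalizeConceptTypes(value: string | string[]): ConceptType[] {
    const list = Array.isArray(value) ? value : [value];
    for (const t of list) {
        if (VALID_CONCEPT_TYPES.indexOf(t as ConceptType) === -1) {
            throw new Error(`invalid concept type: ${t}`);
        }
    }
    return list as ConceptType[];
}

// Argument object in the same shape as the call shown above.
const conceptArgs = {
    conceptType: normalizeConceptTypes(["person", "org"]),
    exactCount: 10,
    approxCount: 100,
};
```

Validating locally means a typo such as "organisation" fails immediately instead of being sent as part of a request.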

Alternatively, you can compute the list of categories that correlate the most with the input data. For this purpose, call the getTopCategoryCorrelations() method:

const categoryInfo = corr.getTopCategoryCorrelations({
    exactCount: 10,
    approxCount: 100,
});

The method arguments are as follows:

exactCount: the number of returned categories for which the exact value of the correlation is computed
approxCount: the number of returned categories for which only an approximate value of the correlation is computed
returnInfo: specifies the details about the categories that should be returned in the output result
