InCites Benchmarking & Analytics: About InCites Data

InCites Benchmarking & Analytics is a research analytics tool.

Web of Science Core Collection is the data source for InCites Benchmarking & Analytics.  This guide describes the scope, update schedules and data elements that are the foundation of the resource.

Product Subscriptions

What is the full range of content available?

Web of Science Core Collection:

  • Science Citation Index Expanded: 1900-
  • Social Sciences Citation Index: 1900-
  • Arts & Humanities Citation Index: 1975-
  • Conference Proceedings Citation Index (Science): 1990-
  • Conference Proceedings Citation Index (Social Sciences & Humanities): 1990-
  • Book Citation Index (Science): 2005-
  • Book Citation Index (Social Sciences & Humanities): 2005-
  • Emerging Sources Citation Index: 2015-
  • Current Chemical Reactions: 1985-
  • Index Chemicus: 1993-

InCites Benchmarking & Analytics:

The full InCites dataset extends back to 1980 and includes content indexed in:

  • Science Citation Index Expanded
  • Social Sciences Citation Index
  • Arts & Humanities Citation Index
  • Conference Proceedings Citation Index (SCI & SSH)
  • Book Citation Index (SCI & SSH)

All document types from each source index are included.

How does my subscription affect my access?

Web of Science Core Collection: Your institution may subscribe to only a subset of the indexes listed above, and the backfile depth will also vary by institution. You can only discover source records within your subscription limits.

InCites Benchmarking & Analytics: All users can analyze the full InCites dataset, regardless of WoS Core Collection subscription limits.


Update Schedules

How frequently is each tool updated?

Web of Science Core Collection: New content is added to WoS Core Collection on a daily basis. As new records are added to the database, the number of results for a search could increase.

InCites Benchmarking & Analytics:

InCites is updated monthly. Prior to each release, data are extracted from WoS Core Collection to prepare for InCites. This means that the data cutoff date is earlier than the data release date. For example, the InCites dataset update on August 12, 2017 includes WoS Core Collection content indexed through June 30, 2017.

Baselines are recalculated with each InCites update.

Hover over the info icon in any filter panel to review the current dates for InCites.


Citation Counts

Why are there different citation counts for the same article?

Web of Science Core Collection:

Times cited counts could increase daily as new citing articles are added to the database.

The WoS Core Collection times cited count for an item includes citations from all of the editions (see full list in Product Subscriptions).

InCites Benchmarking & Analytics:

InCites only includes citations coming from the InCites source editions (see Product Subscriptions). This means that citations coming from Emerging Sources Citation Index are NOT reflected in times cited counts in InCites.

Times cited counts are fixed when the WoS Core Collection data are extracted for InCites. This is necessary to calculate consistent, accurate baselines for normalized metrics.
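
The gap between the two counts can be illustrated with a minimal Python sketch. The edition codes, the one-edition-per-citing-record layout, and the function name are assumptions made for illustration only; they are not part of any Web of Science or InCites interface.

```python
# Hypothetical edition codes for the InCites source editions listed in
# Product Subscriptions. "ESCI" (Emerging Sources Citation Index) is left out.
INCITES_EDITIONS = {
    "SCIE", "SSCI", "AHCI", "CPCI-S", "CPCI-SSH", "BKCI-S", "BKCI-SSH",
}


def times_cited(citing_editions, allowed=None):
    """Count citing records, optionally keeping only those from allowed editions."""
    if allowed is None:
        return len(citing_editions)
    return sum(1 for edition in citing_editions if edition in allowed)


# Illustrative data: the edition of each record that cites one article.
citing = ["SCIE", "SCIE", "ESCI", "SSCI", "ESCI"]

wos_count = times_cited(citing)                        # every edition counts
incites_count = times_cited(citing, INCITES_EDITIONS)  # ESCI citations are dropped
print(wos_count, incites_count)
```

In this made-up example the same article shows five citations when all editions are counted and three when only the InCites source editions are counted.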


Custom Datasets

How do these differences affect custom dataset creation?

WoS Core Collection includes content published prior to 1980, and it is updated more frequently than InCites. This means that some records may be too old or too new to be included in the InCites dataset for analysis.

All Emerging Sources Citation Index records are outside of the scope of InCites.

When you search WoS Core Collection, you will only retrieve results that are within your subscription access.

When you upload a file of document identifiers or use the Save to InCites feature in WoS Core Collection, only the records that are also in the InCites dataset are added to the custom dataset.
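
The scope rules above can be sketched as a simple filter. This is only an illustration under stated assumptions: the `Record` fields, the edition codes, and the use of publication year and indexing date as the tests are hypothetical, and the cutoff date is taken from the example in Update Schedules.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical codes for the InCites source editions; ESCI is deliberately absent.
INCITES_EDITIONS = {
    "SCIE", "SSCI", "AHCI", "CPCI-S", "CPCI-SSH", "BKCI-S", "BKCI-SSH",
}
INCITES_START_YEAR = 1980          # the InCites dataset extends back to 1980
DATA_CUTOFF = date(2017, 6, 30)    # cutoff from the Update Schedules example


@dataclass
class Record:
    identifier: str     # hypothetical document identifier
    pub_year: int       # publication year
    edition: str        # hypothetical edition code
    indexed_on: date    # date the record was indexed in WoS Core Collection


def in_incites_scope(record, cutoff=DATA_CUTOFF):
    """Return True if the record is neither too old, too new, nor out of scope."""
    return (
        record.pub_year >= INCITES_START_YEAR
        and record.edition in INCITES_EDITIONS
        and record.indexed_on <= cutoff
    )


uploaded = [
    Record("doc-1", 1975, "SCIE", date(2000, 1, 1)),   # too old
    Record("doc-2", 2016, "ESCI", date(2016, 5, 1)),   # outside InCites scope
    Record("doc-3", 2017, "SSCI", date(2017, 7, 15)),  # indexed after the cutoff
    Record("doc-4", 2010, "SCIE", date(2010, 3, 1)),   # kept
]

custom_dataset = [r.identifier for r in uploaded if in_incites_scope(r)]
print(custom_dataset)
```

Only `doc-4` survives in this made-up upload, which is why a custom dataset can contain fewer records than the file or search it was built from.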


Bibliographic Data Elements

Do you include all document types?

Web of Science Core Collection: Content sources for the Web of Science Core Collection are indexed cover to cover, meaning every scholarly item is indexed and all significant publication types are included.

InCites Benchmarking & Analytics: All document types from Web of Science Core Collection are included in InCites for analysis.

How do you handle author names?

Web of Science Core Collection: A complete list of authors is always captured for all publications in Web of Science Core Collection, including surname and initials, and given name from 2008 to the present. Authors may also be associated with ResearcherID or ORCID profiles.

InCites Benchmarking & Analytics: All author names from Web of Science publications can be used in InCites analyses.

How do you treat institution names?

In addition to all author names, all author affiliations are captured from each publication, including (where available on the source publication) organization name, city, state or province, postal code, country or territory. In InCites, the full organization name is displayed and searchable. Since 2008 all author names are associated with their affiliated institutions as listed with the publication.

The policy of including all affiliations is particularly important for multi-authored papers which may contain hundreds of different affiliations, all of which are searchable and displayable. This ability to comprehensively identify an institution’s publications is a key benefit of InCites when compared to other databases of scholarly literature which may only capture some of the affiliations and may not accurately capture all name variants.

Address Unification: Care is taken to unify variant institution names from Web of Science addresses, such as previous names, affiliated sub-organizations and spelling variants. More than 7,000 institutions have undergone the unification process, and work is ongoing to extend it to more organizations. The unification process combines background research by Clarivate Analytics staff with feedback from organizations.

Organization Types: Each unified organization is assigned an organization type by Clarivate Analytics to facilitate filtering by broad grouping.


Research Area Schemas

Why use Research Area Schemas?

Research area schemas, alongside baselines, are important to place bibliometric data into context. A citation count of a paper in isolation is a relatively meaningless number. But by looking at it in the context of peer publications, one can understand the performance, see if it is above or below average and by how much. Through benchmarking, data becomes actionable knowledge.

It is necessary to understand performance within the context of subject areas because publication rates and citation behavior can vary considerably by discipline, by document type and over time. For example, mathematics papers are usually cited at a relatively low rate, but the citations can persist over a long period of time, whereas molecular biology papers are typically cited more frequently and the citations tail off after a few years as the research is superseded. Understanding these underlying trends and comparing the publications of interest to publications in the same subject area, year and document type gives more meaningful results.
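
A minimal Python sketch shows why the same raw citation count can mean different things in different fields. The papers, the counts, and the simple ratio to a group average are all illustrative assumptions; this is not presented as the way InCites calculates its baselines or normalized metrics.

```python
from collections import defaultdict
from statistics import mean

# Made-up papers: (id, subject area, year, document type, times cited).
papers = [
    ("m1", "Mathematics", 2015, "Article", 2),
    ("m2", "Mathematics", 2015, "Article", 4),
    ("m3", "Mathematics", 2015, "Article", 6),
    ("b1", "Molecular Biology", 2015, "Article", 10),
    ("b2", "Molecular Biology", 2015, "Article", 20),
    ("b3", "Molecular Biology", 2015, "Article", 30),
]

# Group citation counts by subject area, year and document type,
# then take the average of each group as its baseline.
groups = defaultdict(list)
for _paper_id, area, year, doc_type, cites in papers:
    groups[(area, year, doc_type)].append(cites)
baselines = {key: mean(values) for key, values in groups.items()}


def relative_impact(area, year, doc_type, cites):
    """Ratio of a paper's citations to the average of its peer group."""
    return cites / baselines[(area, year, doc_type)]


math_ratio = relative_impact("Mathematics", 2015, "Article", 6)
biology_ratio = relative_impact("Molecular Biology", 2015, "Article", 6)
print(math_ratio, biology_ratio)
```

With these made-up numbers, six citations is 1.5 times the average for the mathematics group but only 0.3 times the average for the molecular biology group.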

What schemas are available, and how do I decide?

There are 17 different research area schemas available in InCites. Three are exclusive to Clarivate and are described below.

A further 14 are based on mapping Clarivate data to external subject classification systems. These schemas are designed to enable the use of bibliometric indicators in the context of a regional research evaluation program, for example the Research Excellence Framework in the United Kingdom. Alternatively, the Organization for Economic Cooperation and Development (OECD) subject classification schema is a valuable tool for looking at national-level bibliometric indicators in the context of demographic and financial data provided by the OECD. Typically, schemas based on external subject classifications are developed in partnership with research evaluation bodies in that region. They may be based on journal classifications or the mapping of Web of Science Core Collection categories. Please see the InCites Help file for additional details on these schemas.

Which schema to use will depend on the objectives of the analysis. Typically if looking at small sets of publications, such as the output of a single department or individual author, it is advisable to use the higher precision of a narrow subject classification such as the Web of Science schema. This approach may be useful to overcome differences between things such as applied and theoretical research of the same topic. However, if you wish to understand the overall subject mix of an organization or a country, using a broader schema may be more appropriate. 

Web of Science: The narrowest categorization. The Web of Science schema comprises 252 subject categories in science, social sciences, arts and humanities. The schema is created by assigning each journal to one or more subject categories. Broad disciplines such as physics are represented as smaller subfields, for example “Physics, Applied” and “Physics, Nuclear.” This narrow definition of subject is an important characteristic of the schema, as citation behavior may vary significantly among subfields. The Web of Science subject schema is generally considered the best for detailed bibliometric analysis because its granularity enables the user to objectively measure performance against papers that are similar in scope and citation characteristics. However, because it is often not possible to assign a journal to a single category, categories can overlap in coverage, which may complicate an analysis. Each published item inherits all subject categories assigned to the parent journal. Coverage of books and conferences follows the same definitions of subject area.
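
The effect of overlapping categories on document totals can be sketched in a few lines of Python. The journal names, the paper identifiers, and the category assignments below are hypothetical.

```python
from collections import Counter

# Hypothetical journals, each assigned to one or more subject categories.
journal_categories = {
    "Journal A": ["Physics, Applied"],
    "Journal B": ["Physics, Applied", "Physics, Nuclear"],
}

# Hypothetical papers and the journal each one was published in.
papers = [("p1", "Journal A"), ("p2", "Journal B"), ("p3", "Journal B")]

# Each paper inherits every category assigned to its parent journal.
category_counts = Counter()
for _paper_id, journal in papers:
    for category in journal_categories[journal]:
        category_counts[category] += 1

total_papers = len(papers)
sum_over_categories = sum(category_counts.values())
print(dict(category_counts), total_papers, sum_over_categories)
```

Here three papers produce five category assignments, so adding up per-category document counts overstates the number of distinct papers.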

Essential Science Indicators: A broad categorization. The Essential Science Indicators schema comprises 22 subject areas in science and social sciences and is based on journal assignments. Arts & Humanities journals are not included. Each journal is found in only one of the 22 subject areas, so there is no overlap between categories, which simplifies analysis.

GIPP: A very broad categorization. The GIPP schema comprises six broad disciplines but covers all fields of scholarly research. The GIPP schema is based on an aggregation of the Web of Science subject categories and contains significant overlap between disciplines. 

Are there any other considerations when using these schemas?

Research Area Schema Selection and Total Results 

Each Research Area schema has its own mapping to the research areas and journals established in the Web of Science Core Collection. For that reason, document totals within the results table will not necessarily match the total displayed when the Web of Science schema is selected. You can see how categories relate to those in Web of Science Core Collection by viewing the mappings included in each of the Research Area descriptions.

Reclassification of Papers in Multidisciplinary and Medical Journals

Clarivate reassigns publications in multidisciplinary journals such as Nature and Science to their most relevant subject area. While these journals publish articles on a wide array of topics, individual articles in those journals focus on one area of research. By using the information found in the cited references of each publication it is possible, in most cases, to algorithmically reassign them to a subject area. In cases where it is not possible to accurately reassign the publications (for example when the article does not have cited references) the articles are left as multidisciplinary.
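
The general idea can be sketched in Python under a strong assumption: the rule used here, picking the most frequent subject category among the cited references, is a stand-in chosen for illustration, and the source does not describe the actual algorithm. The function name and the category lists are also hypothetical.

```python
from collections import Counter


def reassign(cited_reference_categories, fallback="Multidisciplinary Sciences"):
    """Pick a subject area from the categories of a paper's cited references.

    Assumed rule: the most frequent category wins. A paper with no cited
    references keeps the fallback category. Ties are broken arbitrarily
    (the category seen first wins).
    """
    if not cited_reference_categories:
        return fallback
    counts = Counter(cited_reference_categories)
    category, _count = counts.most_common(1)[0]
    return category


# Made-up cited-reference categories for two papers in a multidisciplinary journal.
with_references = reassign(["Cell Biology", "Cell Biology", "Genetics & Heredity"])
without_references = reassign([])
print(with_references, "|", without_references)
```

The first paper is moved to the category that dominates its reference list, while the second, having no cited references, stays multidisciplinary.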

This reclassification process allows articles to be appropriately compared with articles of similar citation characteristics and topic focus. The reclassification is applied to articles in the categories of “Multidisciplinary Sciences” and “Medicine, General and Internal” in the Web of Science Core Collection (and therefore any subject schema that is based on aggregations of Web of Science categories) and the “Multidisciplinary” field in the Essential Science Indicators schema.