COSMOS

The largest job posting dataset available

Explore the universe of job postings

The Revelio Labs COSMOS dataset is the only job posting dataset that sources from 270 thousand company websites, all major job boards, and staffing firm job boards. It is the most complete job posting dataset on the market, with over 2 billion current and historical job postings covering 5.25 million companies. The unified job postings data is deduplicated, parsed, and enriched for a variety of use cases:

  • Predicting company performance
  • Competitive analysis
  • Talent analytics
  • Talent sourcing

Never-before-seen enrichments

COSMOS Job Postings is part of the Revelio Labs suite of workforce datasets, which also includes workforce dynamics, sentiment, and layoff notices. Each posting is mapped to an occupation, seniority level, geography, and other job characteristics. We also enrich each posting with variables unavailable anywhere else (sample file):

  • Expected hire(s) per posting
  • Salary and benefits
  • Activities performed
  • Skills required
  • Suitability of job to remote work
  • Tags for contract work, internships, etc.

Light speed updates

COSMOS uses the latest technology to automatically identify and collect job postings from company websites, without human intervention. The process is extremely robust to website changes, which allows it to support an enormous number of companies. When site changes inevitably occur, they are detected and remediated within hours, minimizing disruption for customers.

COSMOS vs Competitors

| COSMOS | Competitors |
| --- | --- |
| Source from over 270K company websites, major aggregators, staffing firms | Source from no more than 120K sources |
| 2B current and historical job postings | No more than 1B current and historical job postings |
| 5.25M companies | No more than 3.5M companies |
| 195 countries | 150 countries - translated in 9 languages |
| Minimal restatements | Frequent restatements |
| Down times from company website changes remediated within hours | Down times from company website changes can last weeks |
| Dynamic deduplication model | Rigid deduplication formula |
| Posting text is fully parsed: qualifications, skills, responsibilities, company information, etc. | Only high-level metadata is parsed: job title, location, etc. |
| Expected Hires: All postings are weighted based on expected hires | No acknowledgement that not all postings lead to hires and that there are some postings that represent many openings |
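
To illustrate why weighting by expected hires matters, here is a minimal sketch that compares a raw posting count with an expected-hires-weighted total. The records and the field names `company` and `expected_hires` are hypothetical and are not the COSMOS schema; the actual weighting model is not described here.

```python
from collections import defaultdict

# Hypothetical records; field names are illustrative, not the COSMOS schema.
postings = [
    {"company": "A", "expected_hires": 0.4},  # a posting unlikely to lead to a hire
    {"company": "A", "expected_hires": 3.0},  # one posting covering several openings
    {"company": "B", "expected_hires": 1.0},
]


def demand_by_company(rows):
    """Return (raw posting count, expected-hires-weighted total) per company."""
    counts = defaultdict(int)
    weighted = defaultdict(float)
    for row in rows:
        counts[row["company"]] += 1
        weighted[row["company"]] += row["expected_hires"]
    return dict(counts), dict(weighted)


counts, weighted = demand_by_company(postings)
print(counts)
print(weighted)
```

In this toy data, company A has two postings, but its weighted total differs from two because one posting is discounted and the other stands for several openings.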

Fully Mapped

Companies

Using various company entity identifiers, we map each publicly reported position at a company to a Revelio Labs Company ID (RCID), an ID in our proprietary company universe. Each subsidiary company has its own RCID that is tied to its parent company’s RCID. We also map each company to:

  • Company identifiers (CUSIP, ISIN, SEDOL, Ticker, GVKEY, etc.)
  • Industries (NAICS, SIC, GICS, etc.)
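
As a rough sketch of how a subsidiary-to-parent link like this could be used, the snippet below rolls a subsidiary's RCID up to its top-level parent. The `parent_of` table, the numeric IDs, and the assumption that hierarchies can be more than one level deep are all hypothetical; the actual RCID data model may differ.

```python
# Hypothetical mapping of each RCID to its parent's RCID
# (None marks a top-level company). The IDs are made up.
parent_of = {101: None, 202: 101, 303: 202, 404: None}


def ultimate_parent(rcid, parent_of):
    """Follow parent links until reaching a company with no parent."""
    seen = set()
    while parent_of.get(rcid) is not None:
        if rcid in seen:
            # Guard against malformed data that loops back on itself.
            raise ValueError(f"cycle detected at RCID {rcid}")
        seen.add(rcid)
        rcid = parent_of[rcid]
    return rcid


top_of_303 = ultimate_parent(303, parent_of)
top_of_404 = ultimate_parent(404, parent_of)
```

Rolling postings up to the top-level ID in this way would let subsidiary activity be aggregated under the parent company, while each subsidiary's own RCID still allows separate analysis.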

Suite of taxonomies

Our taxonomies draw on billions of unique job titles, descriptions, skills, and activities seen in public workforce data to create a universal job architecture that enables comparisons across companies.

Data Consumption

As with all of our products, the data can be accessed in a variety of ways:

  • Data feeds
  • API
  • Dashboard
  • Reports