COSMOS
The largest job posting dataset available
Explore the universe of job postings
The Revelio Labs COSMOS dataset is the only job posting dataset to source from 440 thousand company websites, all major job boards, and staffing firm job boards. It is the most complete job posting dataset on the market, with over 4.1 billion current and historical job postings covering 6.6 million companies. The unified job postings data is deduplicated, parsed, and enriched for a variety of use cases:
- Predicting company performance
- Competitive analysis
- Talent analytics
- Talent sourcing
Never-before-seen enrichments
COSMOS Job Postings is part of the Revelio Labs suite of workforce datasets, which also includes workforce dynamics, sentiment, and layoff notices. Each posting is mapped to occupation, seniority level, geography, and other job characteristics. We also enrich each posting with variables unavailable anywhere else (sample file):
- Expected hire(s) per posting
- Salary and benefits
- Activities performed
- Skills required
- Suitability of job to remote work
- Tags for contract work, internships, etc.
Light speed updates
COSMOS uses the latest technology to automatically identify and collect job postings from company websites, without human intervention. The process is extremely robust to website changes, allowing for support of an enormous number of companies. When site changes inevitably occur, these changes are detected and corrected immediately, minimizing disruption for customers.
COSMOS vs Competitors
COSMOS | Competitors |
Source from over 440K company websites, major aggregators, staffing firms | Source from no more than 120K sources |
4.1B current and historical job postings | No more than 1B current and historical job postings |
6.6M companies | No more than 3.5M companies |
195 countries | 150 countries - translated in 9 languages |
Minimal restatements | Frequent restatements |
Downtime from company website changes is remediated within hours | Downtime from company website changes can last weeks |
Dynamic deduplication model | Rigid deduplication formula |
Posting text is fully parsed: qualifications, skills, responsibilities, company information, etc. | Only high-level metadata is parsed: job title, location, etc. |
Expected Hires: All postings are weighted based on expected hires | No accounting for the fact that some postings never lead to a hire while others represent many openings |
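To illustrate the last row, here is a minimal sketch of what weighting by expected hires could look like in practice. The record layout, company names, and values are hypothetical and are not taken from the COSMOS schema; the point is only that summing an expected-hires weight gives a different picture of demand than counting raw postings.

```python
from collections import defaultdict

# Hypothetical records: (company, expected_hires).
# Names and values are illustrative only, not real COSMOS data.
postings = [
    ("Acme", 1.0),     # a typical posting for a single opening
    ("Acme", 0.25),    # a posting unlikely to result in a hire
    ("Globex", 12.0),  # one posting that stands for many openings
]

def weighted_demand(rows):
    """Sum expected hires per company instead of counting postings."""
    totals = defaultdict(float)
    for company, expected_hires in rows:
        totals[company] += expected_hires
    return dict(totals)

demand = weighted_demand(postings)
```

In this made-up sample, a raw count would show Acme with two postings and Globex with one, while the weighted view ranks Globex well ahead.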
Fully Mapped
Companies
Using various company entity identifiers, we map each publicly reported position at a company to a Revelio Labs Company ID (RCID), an ID in our proprietary company universe. Each subsidiary has its own RCID, which is tied to its parent company’s RCID. We also map each company to:
- Company identifiers (CUSIP, ISIN, SEDOL, Ticker, GVKEY, etc.)
- Industries (NAICS, SIC, GICS, etc.)
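As a rough sketch of how a subsidiary-to-parent link might be used, the snippet below rolls a subsidiary's RCID up to its top-level parent. The mapping, the ID values, and the function name are hypothetical, and it assumes links can chain across more than one level; the actual delivery format may differ.

```python
# Hypothetical subsidiary -> parent links keyed by RCID.
# The IDs are made up for illustration.
PARENT_OF = {101: 100, 102: 100, 103: 101}

def ultimate_parent(rcid, parent_of):
    """Follow parent links until reaching an RCID that has no parent."""
    seen = set()
    # The `seen` set guards against accidental cycles in the link table.
    while rcid in parent_of and rcid not in seen:
        seen.add(rcid)
        rcid = parent_of[rcid]
    return rcid

top = ultimate_parent(103, PARENT_OF)
```

A rollup like this lets postings from subsidiaries be aggregated under a single parent company when a consolidated view is needed.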
Suite of taxonomies
Our taxonomies distill billions of unique job titles, descriptions, skills, and activities seen in public workforce data into a universal job architecture, enabling comparisons across companies.
Data Consumption
As with all of our products, the data can be accessed in a variety of ways:
- Data feeds
- API
- Dashboard
- Reports



