mdatool
Healthcare Data Dictionary for the Modern Data Stack

The healthcare data dictionary for dbt, Snowflake, Databricks, and BigQuery. 100,000+ ISO-11179 standard terms, free SQL tools, and AI data modeling.

© 2026 mdatool. All rights reserved.

Provider Data

Provider Data Management: NPI, Credentialing, Network, and Provider Data Warehouse Design

The complete guide to healthcare provider data management for data architects and engineers. Covers NPI numbers, provider credentialing, network participation, taxonomy codes, and production-ready provider master data model design for Snowflake, Databricks, and BigQuery.

mdatool Team·June 14, 2026·16 min read
Tags: provider data, NPI, credentialing, provider network, taxonomy code, network adequacy, Snowflake, Databricks, BigQuery, provider master data, healthcare data engineering

Provider data is one of the most operationally complex domains in healthcare analytics. Providers move between organizations, merge practices, update credentials, join and leave networks, and maintain relationships with dozens of facilities and payers simultaneously. A single physician has one individual NPI but may bill under the NPIs of several group practices, hold privileges at multiple hospitals, and participate in dozens of insurance networks, all while their demographic and credentialing information changes continuously.

For health plans, provider networks are the product. For hospitals, provider data drives clinical operations. For payers and regulators, provider data quality is a compliance requirement. This guide covers everything a healthcare data architect, data modeler, or data engineer needs to know about provider data — from NPI numbers and credentialing to production-ready master data model design for Snowflake, Databricks, and BigQuery.


What Is Healthcare Provider Data?

A healthcare provider is any individual or organization licensed to deliver clinical services and eligible to bill insurance payers for those services. Provider data encompasses the complete set of administrative, credentialing, network, and performance information maintained about providers by health plans, hospitals, credentialing organizations, and government agencies.

Provider data falls into four primary categories:

Identity and demographics capture who the provider is — their name, NPI number, tax identification number, practice addresses, phone numbers, and organizational affiliations. The National Provider Identifier is the universal provider identifier mandated by HIPAA for all healthcare administrative transactions.

Credentials and licensure document the provider's qualifications — medical school, residency training, board certifications, state licenses, DEA registration, and malpractice history. Provider credentialing is the formal verification process that health plans and hospitals use to confirm these credentials before granting network participation or clinical privileges.

Network participation defines which insurance plans a provider has contracted with, what services are covered under each contract, what reimbursement rates apply, and whether the provider is currently accepting new patients. Provider network status is one of the most queried fields in healthcare operations — driving claims adjudication, member-facing provider directories, and network adequacy filings.

Performance and quality captures how the provider performs on clinical quality metrics, patient experience measures, cost efficiency scores, and utilization patterns. Provider quality scores and provider star ratings are increasingly used in tiered network designs and value-based payment programs.

Core Provider Data Elements

Every healthcare data team working with provider data needs to understand these fundamental fields:

The National Provider Identifier (npi) is the 10-digit unique identification number assigned to every healthcare provider in the United States by CMS under HIPAA. Type 1 NPIs are assigned to individual practitioners and Type 2 NPIs to organizational providers. The NPI is required on all HIPAA standard transactions including claims, eligibility, and remittance.

The provider taxonomy code (prvdr_tax) is the NUCC 10-character alphanumeric code identifying the provider type, classification, and specialization in a standardized hierarchy. Taxonomy codes drive fee schedule assignment, claims editing, and network adequacy measurement.

The provider specialty code (prvdr_spclty_cd) identifies the provider's clinical specialty and drives network adequacy analysis. Health plans must demonstrate sufficient specialist availability within defined access standards for each member ZIP code.

The provider tax ID (prvdr_tax_id) is the federal tax identification number used for claims payment and 1099 tax reporting. For individual providers this may be a Social Security Number and for organizations an Employer Identification Number.

The provider network status (prvdr_ntwk_sts) indicates whether the provider is actively contracted and participating in a specific health plan network. It drives claims adjudication, benefit tier application, and member directory accuracy.

The provider credentialing status (prvdr_credntl_sts) tracks where a provider is in the credentialing workflow — from application received through committee approval to active network participation.

The provider panel status (prvdr_pnl_sts) indicates whether the provider is accepting new patients — a critical network adequacy data element that CMS requires health plans to maintain accurately and update within 30 days of any change.

The provider exclusion indicator (prvdr_excl_ind) flags providers listed on the OIG List of Excluded Individuals and Entities — health plans must screen against this list monthly to avoid contracting with excluded providers.
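
The monthly exclusion screening described above can be sketched as a set lookup. This is a minimal illustration, not a compliance-grade implementation: `RosterProvider` and `flag_excluded` are hypothetical names, and the sketch assumes the exclusion extract has already been reduced to a list of NPI strings in which some entries may be blank or all zeros. Records without a usable NPI would still need name-based matching, which is not shown here.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class RosterProvider:
    """Hypothetical roster record; only the NPI is used for matching."""
    npi: str
    last_name: str


def flag_excluded(roster: Iterable[RosterProvider],
                  excluded_npis: Iterable[str]) -> Dict[str, str]:
    """Return a Y/N value per roster NPI, suitable for a prvdr_excl_ind column."""
    # Drop blank and all-zero NPIs so they can never produce a false match.
    usable = {npi for npi in excluded_npis if npi and npi.strip("0")}
    return {p.npi: ("Y" if p.npi in usable else "N") for p in roster}


roster = [RosterProvider("1234567893", "SMITH"),
          RosterProvider("1111111112", "JONES")]
flags = flag_excluded(roster, ["1234567893", "0000000000", ""])
```

Providers flagged `Y` would typically be routed to a compliance reviewer rather than terminated automatically, since an NPI match still needs human confirmation.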

NPI Registry and NPPES Data

The National Plan and Provider Enumeration System is the CMS database that assigns and maintains NPI numbers for all healthcare providers. NPPES data is publicly available and serves as the authoritative source for provider identity information. Use our NPI Lookup tool to search any provider by NPI number or name.

NPPES contains these key data elements for every registered provider:

  • NPI number and NPI type (Type 1 individual / Type 2 organization)
  • Provider name (legal name and credential suffix)
  • Practice location addresses (up to 50 locations per provider)
  • Mailing address for correspondence
  • Taxonomy codes (primary and secondary specializations)
  • Enumeration date and last update date
  • Deactivation status for providers who have surrendered their NPI

Healthcare data teams download the monthly NPPES full replacement file to maintain current provider identity data, validate provider NPIs submitted on claims, and enrich internal provider records with standardized demographic and taxonomy information from the authoritative CMS source.
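
Before reconciling against NPPES, a structural check can reject malformed NPIs submitted on claims. The NPI check digit uses the Luhn algorithm applied to the number prefixed with 80840. The sketch below shows that check only; the function name `is_valid_npi` is an illustrative choice, and passing it does not prove the NPI was ever issued or is still active.

```python
def is_valid_npi(npi: str) -> bool:
    """Return True if npi is 10 ASCII digits and passes the Luhn check with the 80840 prefix."""
    if len(npi) != 10 or not (npi.isascii() and npi.isdigit()):
        return False
    digits = [int(ch) for ch in "80840" + npi]
    total = 0
    # Walk right to left; double every second digit, starting with the
    # digit immediately left of the check digit.
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0
```

A check like this belongs at ingestion, so that typos are caught before an NPI is used as a join key against the NPPES file.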

Provider Credentialing Process

Provider credentialing is the formal verification of a provider's qualifications before granting network participation or clinical privileges. NCQA standards require health plans to complete initial credentialing within 180 days of application and recredential providers every three years.

The credentialing process involves primary source verification of:

  • Medical school graduation from an accredited institution
  • Residency and fellowship completion at accredited programs
  • Board certification status from recognized specialty boards
  • State licensure status in all states of practice
  • DEA registration for providers with prescribing authority
  • Malpractice insurance coverage and claims history
  • Hospital privileges and facility affiliations
  • OIG exclusion and SAM.gov debarment screening
  • National Practitioner Data Bank query results

Healthcare data teams build credentialing workflow systems that track provider credentialing application status through each verification step, calculate provider recredentialing due dates three years from initial approval, generate advance notice workflows at 180, 90, and 30 days before expiration, and maintain audit trails required for NCQA credentialing accreditation surveys.
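
The due-date and advance-notice logic above can be sketched in a few lines. This is a minimal example that assumes the three-year cycle runs from the committee approval date and that a February 29 approval rolls to February 28. The names `add_years`, `recredentialing_schedule`, and `NOTICE_DAYS` are illustrative.

```python
from datetime import date, timedelta

NOTICE_DAYS = (180, 90, 30)  # advance-notice offsets named in the text


def add_years(d: date, years: int) -> date:
    """Add calendar years, mapping Feb 29 to Feb 28 in non-leap target years."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def recredentialing_schedule(cmte_appr_dt: date) -> dict:
    """Return the recredentialing due date and the dates on which notices fire."""
    due = add_years(cmte_appr_dt, 3)
    return {
        "recredntl_due_dt": due,
        "notice_dts": [due - timedelta(days=n) for n in NOTICE_DAYS],
    }


schedule = recredentialing_schedule(date(2024, 1, 15))
```

The resulting due date maps to the RECREDNTL_DUE_DT column in the credentialing table later in this guide.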

Provider Network Design

A health plan provider network is the set of contracted providers who have agreed to deliver covered services to members at negotiated reimbursement rates. Network design decisions — which specialties to include, how many providers per service area, what reimbursement rates to offer — directly determine member access to care and plan financial performance.

Network adequacy is the regulatory requirement that health plans maintain sufficient provider availability within defined distance and time standards for each member ZIP code in the service area. CMS sets maximum time and distance standards for Medicare Advantage plans that vary by specialty and county type. As an illustration, a plan may need to demonstrate that members can reach primary care within 15 miles and 30 minutes and specialists within 30 miles and 60 minutes.

Tiered networks create multiple provider tiers within a single network, with members paying lower cost sharing for Tier 1 preferred providers who have demonstrated superior quality and cost efficiency. Provider star ratings and provider cost efficiency scores determine tier placement.

Narrow networks include a smaller subset of providers with favorable quality and cost profiles, enabling lower premiums in exchange for more limited provider choice. Narrow network design requires careful adequacy analysis to ensure the reduced provider set still meets regulatory access standards.

Healthcare data teams build network management systems that track provider network effective dates and termination dates, calculate network adequacy metrics by specialty and service area ZIP code, identify access gaps requiring additional contracting, and maintain provider accepting patients status for directory accuracy.
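
A distance-based adequacy metric can be approximated with great-circle distance between member and provider coordinates. The sketch below assumes members and providers are already geocoded to (latitude, longitude) pairs. It measures straight-line miles only, so drive-time standards would need a routing service that is not shown. The function names and the Earth radius constant are choices made for this illustration.

```python
from math import asin, cos, radians, sin, sqrt
from typing import Sequence, Tuple

Point = Tuple[float, float]  # (latitude, longitude) in decimal degrees
EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in miles between two coordinates."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))


def pct_members_with_access(members: Sequence[Point],
                            providers: Sequence[Point],
                            max_miles: float) -> float:
    """Percentage of members with at least one provider within max_miles."""
    if not members:
        return 0.0
    covered = sum(
        1 for m in members
        if any(haversine_miles(*m, *p) <= max_miles for p in providers)
    )
    return 100.0 * covered / len(members)
```

In practice this calculation is run per specialty and per county or ZIP code, and the nested loop is replaced by a spatial index or a warehouse-side geospatial join once volumes grow.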

Provider Master Data Model Design

Provider master data management requires a hub-and-spoke architecture that maintains the provider identity record separately from the multiple relationships and attributes that can change independently — network participation, facility affiliations, credentials, and performance metrics.

Below is a production-ready provider master data model generated by the mdatool AI Data Modeling tool:

Core Provider Identity Table

-- Snowflake DDL — generated with mdatool AI Data Modeling
CREATE TABLE DIM_PROVIDER (
  PRVDR_KEY           INTEGER         NOT NULL,   -- surrogate key
  PRVDR_NPI           VARCHAR(10),                -- NPI (Type 1 or 2)
  PRVDR_TIN           VARCHAR(10),                -- tax identification number
  PRVDR_FIRST_NM      VARCHAR(100),               -- first name
  PRVDR_LAST_NM       VARCHAR(100),               -- last name
  PRVDR_ORG_NM        VARCHAR(255),               -- organization name
  PRVDR_DBA_NM        VARCHAR(255),               -- doing business as name
  PRVDR_TYP_CD        VARCHAR(20),                -- provider type code
  PRVDR_TAX           VARCHAR(10),                -- primary taxonomy code
  PRVDR_SPCLTY_CD     VARCHAR(10),                -- specialty code
  PRVDR_DEG_CD        VARCHAR(20),                -- degree code (MD/DO/NP)
  PRVDR_LIC_TYP_CD    VARCHAR(20),                -- license type code
  PRVDR_STATE_CD      CHAR(2),                    -- primary state
  PRVDR_ZIP_CD        VARCHAR(10),                -- primary ZIP code
  PRVDR_CNTY          VARCHAR(50),                -- primary county
  PRVDR_GRP_NPI       VARCHAR(10),                -- group NPI (Type 2)
  PRVDR_BRD_CERT_IND  CHAR(1),                    -- board certified indicator
  EFF_START_DT        DATE            NOT NULL,   -- SCD2 effective start
  EFF_END_DT          DATE,                       -- SCD2 effective end
  CURR_ROW_IND        BOOLEAN         NOT NULL DEFAULT TRUE,
  LOAD_DT             TIMESTAMP_NTZ   NOT NULL DEFAULT CURRENT_TIMESTAMP,
  CONSTRAINT PK_DIM_PROVIDER PRIMARY KEY (PRVDR_KEY)
);
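
The EFF_START_DT, EFF_END_DT, and CURR_ROW_IND columns imply Type 2 slowly changing dimension handling. As a rough sketch of that pattern, the function below closes the current row and opens a new version when a tracked attribute changes. It assumes exactly one current row per provider, an end date set to the day before the change, and a caller-supplied surrogate key. `apply_scd2_change` and `next_key` are hypothetical names, and a production load would normally do this with a MERGE statement in the warehouse.

```python
from datetime import date, timedelta
from typing import Any, Dict, List

Row = Dict[str, Any]


def apply_scd2_change(rows: List[Row], new_attrs: Row,
                      change_dt: date, next_key: int) -> List[Row]:
    """Return a new version history for one provider after an attribute change."""
    current = next(r for r in rows if r["CURR_ROW_IND"])
    # No-op when none of the tracked attributes actually changed.
    if all(current.get(k) == v for k, v in new_attrs.items()):
        return rows
    closed = {**current,
              "EFF_END_DT": change_dt - timedelta(days=1),
              "CURR_ROW_IND": False}
    opened = {**current, **new_attrs,
              "PRVDR_KEY": next_key,
              "EFF_START_DT": change_dt,
              "EFF_END_DT": None,
              "CURR_ROW_IND": True}
    return [closed if r is current else r for r in rows] + [opened]


history = [{"PRVDR_KEY": 1, "PRVDR_NPI": "1234567893", "PRVDR_ZIP_CD": "60601",
            "EFF_START_DT": date(2024, 1, 1), "EFF_END_DT": None,
            "CURR_ROW_IND": True}]
updated = apply_scd2_change(history, {"PRVDR_ZIP_CD": "60614"}, date(2025, 6, 1), next_key=2)
```

Because each version gets its own PRVDR_KEY, fact rows keep pointing at the provider attributes that were in effect when the fact was recorded.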

Provider Network Participation Table

-- One row per provider per network per contract period
CREATE TABLE FACT_PROVIDER_NETWORK (
  PRVDR_NTWK_KEY      INTEGER         NOT NULL,   -- surrogate key
  PRVDR_KEY           INTEGER         NOT NULL,   -- FK to DIM_PROVIDER
  PLAN_KEY            INTEGER         NOT NULL,   -- FK to DIM_PLAN
  NTWK_EFF_DT         DATE            NOT NULL,   -- network effective date
  NTWK_TERM_DT        DATE,                       -- network termination date
  PRVDR_NTWK_STS      VARCHAR(10),                -- active/pending/terminated
  PRVDR_IN_NTWK_IND   CHAR(1),                    -- in network indicator
  PRVDR_ACCPT_PT_IND  CHAR(1),                    -- accepting patients
  PRVDR_PNL_STS       VARCHAR(10),                -- panel open/closed
  PRVDR_CNTRCT_TYP_CD VARCHAR(10),                -- contract type code
  PRVDR_REIMB_RT      DECIMAL(10,4),              -- reimbursement rate
  LOAD_DT             TIMESTAMP_NTZ   NOT NULL DEFAULT CURRENT_TIMESTAMP,
  CONSTRAINT PK_FACT_PROVIDER_NETWORK
    PRIMARY KEY (PRVDR_NTWK_KEY)
);
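
Claims adjudication typically needs to know whether a provider was in network on the date of service. The sketch below shows one way to evaluate that against rows shaped like FACT_PROVIDER_NETWORK. It assumes NTWK_TERM_DT is inclusive and that a null termination date means the contract is open-ended; both are conventions to confirm against the actual contract data. The function names are illustrative.

```python
from datetime import date
from typing import Any, Dict, Iterable, List, Optional


def is_in_network(ntwk_eff_dt: date, ntwk_term_dt: Optional[date],
                  service_dt: date) -> bool:
    """True when service_dt falls inside the participation period."""
    if service_dt < ntwk_eff_dt:
        return False
    return ntwk_term_dt is None or service_dt <= ntwk_term_dt


def active_plan_keys(rows: Iterable[Dict[str, Any]], service_dt: date) -> List[int]:
    """PLAN_KEY values for which the provider participates on service_dt."""
    return sorted(
        r["PLAN_KEY"] for r in rows
        if is_in_network(r["NTWK_EFF_DT"], r["NTWK_TERM_DT"], service_dt)
    )


participation = [
    {"PLAN_KEY": 10, "NTWK_EFF_DT": date(2024, 1, 1), "NTWK_TERM_DT": date(2024, 12, 31)},
    {"PLAN_KEY": 20, "NTWK_EFF_DT": date(2024, 7, 1), "NTWK_TERM_DT": None},
]
```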

Provider Credentialing Table

-- Tracks credentialing lifecycle for each provider
CREATE TABLE FACT_PROVIDER_CREDENTIAL (
  PRVDR_CRED_KEY      INTEGER         NOT NULL,
  PRVDR_KEY           INTEGER         NOT NULL,   -- FK to DIM_PROVIDER
  CREDNTL_TYP_CD      VARCHAR(20),                -- credential type
  CREDNTL_STS         VARCHAR(20),                -- credentialing status
  APPL_RCVD_DT        DATE,                       -- application received date
  CMTE_APPR_DT        DATE,                       -- committee approval date
  CREDNTL_EFF_DT      DATE,                       -- credential effective date
  RECREDNTL_DUE_DT    DATE,                       -- recredentialing due date
  SANCT_IND           CHAR(1),                    -- sanction indicator
  EXCL_IND            CHAR(1),                    -- exclusion indicator
  DEBAR_IND           CHAR(1),                    -- debarment indicator
  LOAD_DT             TIMESTAMP_NTZ   NOT NULL DEFAULT CURRENT_TIMESTAMP,
  CONSTRAINT PK_FACT_PROVIDER_CREDENTIAL
    PRIMARY KEY (PRVDR_CRED_KEY)
);

Provider Performance Table

-- Annual provider quality and efficiency scores
CREATE TABLE FACT_PROVIDER_PERFORMANCE (
  PRVDR_PERF_KEY      INTEGER         NOT NULL,
  PRVDR_KEY           INTEGER         NOT NULL,   -- FK to DIM_PROVIDER
  PLAN_KEY            INTEGER         NOT NULL,   -- FK to DIM_PLAN
  MEAS_YR             SMALLINT        NOT NULL,   -- measurement year
  PRVDR_QLTY_SCR      DECIMAL(5,2),               -- quality score
  PRVDR_STAR_RTG      DECIMAL(3,1),               -- star rating (1-5)
  PRVDR_CST_EFF_SCR   DECIMAL(5,2),               -- cost efficiency score
  PRVDR_HEDIS_SCR     DECIMAL(5,2),               -- HEDIS composite score
  PRVDR_ATTR_MBR_CNT  INTEGER,                    -- attributed member count
  LOAD_DT             TIMESTAMP_NTZ   NOT NULL DEFAULT CURRENT_TIMESTAMP,
  CONSTRAINT PK_FACT_PROVIDER_PERFORMANCE
    PRIMARY KEY (PRVDR_PERF_KEY)
);

Generate this complete schema instantly using the mdatool AI Data Modeling tool — select Provider Network as your domain, Star Schema as your architecture, and your target platform. Production-ready DDL with ISO-11179 standard column names in 30 seconds.

Common Provider Data Analytics Use Cases

Healthcare data teams use provider data across a wide range of analytical programs:

Network Adequacy Analysis measures whether health plan members can access required provider types within defined distance and time standards. Analytics calculate member-to-provider distances by ZIP code using provider practice location data, identify geographic access gaps, and produce CMS-required network adequacy filings.

Provider Directory Maintenance ensures member-facing provider directories accurately reflect current network participation, accepting patient status, practice locations, and specialty information. CMS requires Medicare Advantage plans to update directory information within 30 days of any change and attest to accuracy quarterly.

Credentialing Workflow Management tracks provider applications through each verification step, calculates recredentialing due dates, generates advance notice communications, and produces NCQA accreditation audit documentation.

Provider Performance Reporting aggregates claims data to calculate provider-level quality rates, cost efficiency scores, and utilization patterns for value-based contract performance evaluation and tiered network placement decisions.

Fraud Detection and Program Integrity screens provider rosters against OIG exclusion lists, identifies billing patterns inconsistent with provider specialty or practice location, and detects providers with anomalous claim volumes or code distributions.

Value-Based Care Attribution assigns members to primary care providers based on plurality of qualifying primary care visits, supporting shared savings calculations, quality measure attribution, and care gap notification workflows.
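
Plurality-based attribution can be illustrated with a simple visit count. The sketch below assumes the input has already been filtered to qualifying primary care visits and breaks ties by most recent visit, then by NPI. Those tie-break rules are an assumption made for this example, since real programs define their own.

```python
from collections import Counter, defaultdict
from datetime import date
from typing import Dict, Iterable, Tuple

Visit = Tuple[str, str, date]  # (member_id, provider_npi, visit_date)


def attribute_members(visits: Iterable[Visit]) -> Dict[str, str]:
    """Map each member to the provider with the plurality of their visits."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    latest: Dict[Tuple[str, str], date] = {}
    for member_id, npi, visit_dt in visits:
        counts[member_id][npi] += 1
        key = (member_id, npi)
        if key not in latest or visit_dt > latest[key]:
            latest[key] = visit_dt
    # Rank by visit count, then most recent visit, then NPI for determinism.
    return {
        m: max(c, key=lambda n: (c[n], latest[(m, n)], n))
        for m, c in counts.items()
    }


sample_visits = [
    ("M1", "1234567893", date(2025, 1, 5)),
    ("M1", "1234567893", date(2025, 4, 9)),
    ("M1", "1111111112", date(2025, 6, 1)),
    ("M2", "1234567893", date(2025, 1, 1)),
    ("M2", "1111111112", date(2025, 3, 1)),
]
attribution = attribute_members(sample_visits)
```

The output feeds the attributed member count per provider, such as PRVDR_ATTR_MBR_CNT in the performance table above.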

Provider Data Quality Challenges

Provider data quality is notoriously difficult to maintain. Healthcare data teams must address these common challenges:

Provider identity resolution is the most complex provider data quality problem. Although an individual practitioner is assigned a single Type 1 NPI, the same physician may bill under both individual and group NPIs, appear under name variations across source systems, and have different addresses in different payer databases. Master provider indexes using probabilistic matching on NPI, name, date of birth, and license number resolve these identities for accurate analytics.
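
To make the matching idea concrete, here is a deliberately simplified scoring function. It is a weighted heuristic standing in for a real probabilistic matcher, not an implementation of one. The field names (`npi`, `name`, `dob`, `license_no`) and the 0.6/0.2/0.2 weights are hypothetical and would need tuning against labeled match data.

```python
from difflib import SequenceMatcher
from typing import Dict, Optional

Record = Dict[str, Optional[str]]


def normalize(name: str) -> str:
    """Uppercase, strip commas and periods, and collapse whitespace."""
    cleaned = name.upper().replace(",", " ").replace(".", " ")
    return " ".join(cleaned.split())


def match_score(a: Record, b: Record) -> float:
    """Score in [0, 1] estimating whether two records describe the same provider."""
    # An exact NPI match is treated as decisive in this sketch.
    if a.get("npi") and a.get("npi") == b.get("npi"):
        return 1.0
    name_sim = SequenceMatcher(
        None, normalize(a.get("name") or ""), normalize(b.get("name") or "")
    ).ratio()
    score = 0.6 * name_sim
    if a.get("dob") and a.get("dob") == b.get("dob"):
        score += 0.2
    if a.get("license_no") and a.get("license_no") == b.get("license_no"):
        score += 0.2
    return score
```

Pairs scoring above an agreed threshold would be merged automatically, with a middle band sent to manual review. Those thresholds are a policy decision rather than something this sketch determines.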

Network participation lag occurs when providers join or leave networks but directory systems are not updated in a timely manner. Inaccurate network status data results in members receiving incorrect benefit tier information at point of service, leading to member complaints and regulatory findings.

Credentialing data gaps arise when credential expiration dates pass without renewal, license sanctions are not identified through ongoing monitoring, or primary source verification is not completed within required timeframes. Healthcare data teams implement automated monitoring workflows that flag expiring credentials and active sanctions for immediate review.

Taxonomy code accuracy affects fee schedule assignment and network adequacy measurement. Providers with incorrect primary taxonomy codes may be paid at wrong rates, included in incorrect specialty counts for adequacy analysis, or excluded from member directories for their actual specialty.

Provider Data Tools

mdatool provides several free tools specifically designed for provider data work:

  • NPI Lookup — Search any provider by NPI number, name, organization, or specialty instantly
  • AI Data Modeling — Generate a complete provider master data model for Snowflake, BigQuery, or Databricks in 30 seconds
  • Data Model Canvas — Visualize your provider schema as an interactive ER diagram
  • Naming Auditor — Audit provider data column names against ISO-11179 healthcare naming standards
  • SQL Linter — Validate provider analytics SQL before it reaches production
  • DDL Converter — Convert provider schema DDL between Snowflake, BigQuery, Databricks, and other platforms

Frequently Asked Questions

What is the difference between a Type 1 and Type 2 NPI? A Type 1 NPI is assigned to individual healthcare practitioners — physicians, nurse practitioners, physician assistants, and other licensed individual providers. A Type 2 NPI is assigned to organizational providers — hospitals, medical groups, clinics, skilled nursing facilities, and other entities that deliver care. Individual providers bill under their Type 1 NPI when working independently and may also bill under a Type 2 group NPI when working within a group practice.

What is provider credentialing and how long does it take? Provider credentialing is the formal process of verifying a provider's qualifications, training, licensure, and professional standing before granting network participation or clinical privileges. The process involves primary source verification of medical education, training, board certifications, licenses, malpractice history, and exclusion status. Initial credentialing typically takes 60 to 90 days from completed application to committee approval. NCQA standards require health plans to complete credentialing within 180 days and recredential providers every three years.

What is the OIG exclusion list and why does it matter? The OIG List of Excluded Individuals and Entities is maintained by the Office of Inspector General and identifies providers and organizations excluded from participation in Medicare, Medicaid, and other federal healthcare programs due to fraud, abuse, patient harm, or other disqualifying conduct. Health plans and healthcare organizations are legally required to screen against the OIG exclusion list and may not employ or contract with excluded providers for services billed to federal programs. Monthly screening against the LEIE is a standard compliance requirement.

How do you maintain provider directory accuracy? Provider directory accuracy requires systematic processes including provider attestation portals where providers confirm their information regularly, automated outreach for status updates when information has not been verified recently, comparison against NPPES for demographic accuracy, and audit programs that verify directory information against actual appointment availability. CMS requires Medicare Advantage plans to update directories within 30 days of any change and to conduct quarterly attestation of directory accuracy.

What is network adequacy and how is it measured? Network adequacy is the requirement that a health plan maintain sufficient provider availability to ensure members can access covered services within defined distance and time standards. CMS measures Medicare Advantage network adequacy by calculating the percentage of members who live within required distance and time thresholds from contracted providers for each specialty type. Health plans must demonstrate adequacy for primary care, specialists, hospitals, and other provider types using member ZIP code data and provider practice location coordinates.

What is provider attribution and why does it matter? Provider attribution is the assignment of health plan members to a specific primary care provider for quality measurement, shared savings calculations, and care management coordination. In the Medicare Shared Savings Program, CMS assigns Medicare fee-for-service beneficiaries to accountable care organizations based on the plurality of primary care services they receive, and similar plurality rules are used in other value-based programs. Accurate attribution is essential for fair performance measurement: providers should only be accountable for outcomes of members they actually managed.

How are provider taxonomy codes used in claims adjudication? Provider taxonomy codes identify the provider specialty and service category and are used in claims adjudication to validate that the procedure codes billed are appropriate for the provider type, apply the correct fee schedule for the specialty and service setting, and route claims to the appropriate clinical review process for medical necessity determination. Incorrect taxonomy codes on claims can result in payment at wrong rates, inappropriate claim edits, or incorrect medical necessity determinations.

mdatool Team

The mdatool team builds free engineering tools for healthcare data architects, analysts, and engineers working across payer, provider, and life sciences data.
