Data Dictionary
Documentation of all data fields, enums, and methodology notes for PoweredByWho data exports.
studies.csv
Each row represents one published opinion research study or poll.
| Field | Type | Description |
|---|---|---|
| id | uuid | Unique identifier for the study. |
| title | text | Title of the study or poll. |
| source_url | url | Link to the original source document. |
| archive_url | url \| null | Link to an archived copy (e.g. Wayback Machine). |
| pollster | text \| null | Organization that conducted the fieldwork. |
| sponsor | text \| null | Organization that commissioned or funded the study. |
| sponsor_category | enum | Category of the sponsor. See Sponsor Category enum. |
| field_start | date \| null | Date fieldwork began (YYYY-MM-DD). |
| field_end | date \| null | Date fieldwork ended (YYYY-MM-DD). |
| publish_date | date \| null | Date the study was published. |
| geography_state | text \| null | Two-letter U.S. state code (e.g. VA, TX). |
| geography_county | text \| null | County name, if study covers a specific county. |
| geography_city | text \| null | City name, if study covers a specific city. |
| geography_other | text \| null | Free-text geography description for non-standard areas. |
| mode | text \| null | Survey mode (e.g. online, phone, mixed). |
| population | text \| null | Target population (e.g. registered voters, adults, likely voters). |
| sample_n | integer \| null | Total sample size. |
| moe | float \| null | Margin of error in percentage points (e.g. 3.5 means +/- 3.5 points). |
| extraction_confidence | enum | Confidence in the accuracy of extracted data. See Extraction Confidence enum. |
| review_status | enum | Review workflow status. Only 'published' items appear in public exports. |
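
Many fields in this table are nullable, so a loader has to decide how nulls are represented. The sketch below assumes that null values are exported as empty cells, which this dictionary does not state; check an actual export before relying on it. The sample rows, the short placeholder IDs, and the helper names `parse_nullable` and `load_studies` are hypothetical.

```python
import csv
import io
from datetime import date

# Hypothetical excerpt of studies.csv; only a few columns are shown and the
# short ids stand in for real UUIDs.
SAMPLE = """id,title,field_start,field_end,sample_n,moe
s1,Example poll,2024-03-01,2024-03-05,800,3.5
s2,Another poll,,,,
"""

def parse_nullable(value, convert):
    """Return None for an empty cell (assumed null encoding), else convert(value)."""
    return None if value == "" else convert(value)

def load_studies(fh):
    """Read study rows and convert nullable date and numeric fields."""
    rows = []
    for raw in csv.DictReader(fh):
        rows.append({
            "id": raw["id"],
            "title": raw["title"],
            "field_start": parse_nullable(raw["field_start"], date.fromisoformat),
            "field_end": parse_nullable(raw["field_end"], date.fromisoformat),
            "sample_n": parse_nullable(raw["sample_n"], int),
            "moe": parse_nullable(raw["moe"], float),
        })
    return rows

studies = load_studies(io.StringIO(SAMPLE))
```

Keeping nulls as `None` rather than 0 avoids treating a missing sample size or margin of error as a real value in later calculations.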
questions.csv
Each row represents one question extracted from a study. Join to studies on study_id.
| Field | Type | Description |
|---|---|---|
| id | uuid | Unique identifier for the question. |
| study_id | uuid | Foreign key to the parent study. |
| question_text_verbatim | text | Exact question wording as it appeared in the source. |
| question_type | enum | Type of question. See Question Type enum. |
| topic_tags | text[] | Array of topic tags. See Topic Tags list. |
| cluster_id | uuid \| null | Foreign key to a question cluster for cross-study comparison. |
| sort_order | integer | Order of the question within the study. |
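
The join from questions to studies on study_id can be sketched with the standard library alone. The sample rows and the function name `join_questions_to_studies` are hypothetical, and the short IDs stand in for real UUIDs; this is one possible approach, not a prescribed one.

```python
import csv
import io

# Hypothetical excerpts of the two exports.
STUDIES = """id,title,pollster
s1,Example county poll,Example Research
"""
QUESTIONS = """id,study_id,question_text_verbatim,question_type,sort_order
q2,s1,Second question?,concern,2
q1,s1,First question?,support_oppose,1
"""

def join_questions_to_studies(studies_fh, questions_fh):
    """Attach the parent study's title and pollster to each question row."""
    studies_by_id = {row["id"]: row for row in csv.DictReader(studies_fh)}
    joined = []
    for question in csv.DictReader(questions_fh):
        study = studies_by_id.get(question["study_id"])
        if study is None:
            continue  # no matching study in this export; skip the question
        joined.append({
            **question,
            "study_title": study["title"],
            "pollster": study["pollster"],
        })
    # Restore the order in which questions appeared within each study.
    joined.sort(key=lambda r: (r["study_id"], int(r["sort_order"])))
    return joined

joined = join_questions_to_studies(io.StringIO(STUDIES), io.StringIO(QUESTIONS))
```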
results.csv
Each row represents one response option for a question. Includes both topline ("Overall") and crosstab results. Join to questions on question_id.
| Field | Type | Description |
|---|---|---|
| id | uuid | Unique identifier for the result row. |
| question_id | uuid | Foreign key to the parent question. |
| subgroup_label | text | Subgroup label. 'Overall' for topline results, otherwise describes the crosstab subgroup. |
| response_option | text | The response choice (e.g. 'Support', 'Oppose', 'Strongly support'). |
| pct_value | float \| null | Percentage value (e.g. 52.3 means 52.3%). |
| base_n | integer \| null | Number of respondents for this subgroup. |
| sort_order | integer | Order of the result within its question and subgroup. |
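
Because topline and crosstab rows share one file, a common first step is to group a question's rows by subgroup_label and pick out 'Overall'. The sketch below assumes null percentages are exported as empty cells; the sample rows and the function name `results_by_subgroup` are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Hypothetical excerpt of results.csv; short ids stand in for real UUIDs.
RESULTS = """id,question_id,subgroup_label,response_option,pct_value,base_n,sort_order
r1,q1,Overall,Support,52.3,800,1
r2,q1,Overall,Oppose,40.1,800,2
r3,q1,Age 18-34,Support,61.0,210,1
r4,q1,Age 18-34,Oppose,,210,2
"""

def results_by_subgroup(fh, question_id):
    """Return {subgroup_label: {response_option: pct_value or None}} for one question."""
    rows = [r for r in csv.DictReader(fh) if r["question_id"] == question_id]
    rows.sort(key=lambda r: int(r["sort_order"]))
    table = defaultdict(dict)
    for r in rows:
        # Assumption: an empty cell means the percentage was not reported.
        pct = float(r["pct_value"]) if r["pct_value"] else None
        table[r["subgroup_label"]][r["response_option"]] = pct
    return dict(table)

table = results_by_subgroup(io.StringIO(RESULTS), "q1")
topline = table["Overall"]  # topline rows use the 'Overall' label
```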
projects.csv
Each row represents one data center project tracked by PoweredByWho.
| Field | Type | Description |
|---|---|---|
| id | uuid | Unique identifier for the project. |
| canonical_name | text | Project name used for display. |
| developer | text \| null | Company developing the data center. |
| operator | text \| null | Company that will operate the data center, if different from developer. |
| status_current | enum | Current project status. See Project Status enum. |
| status_current_date | date \| null | Date when the current status was last confirmed. |
| state | text | Two-letter U.S. state code. |
| county | text \| null | County name. |
| city | text \| null | City name. |
| lat | float \| null | Latitude coordinate. |
| lng | float \| null | Longitude coordinate. |
| geography_precision | enum | Precision of the coordinates. See Geography Precision enum. |
| capacity_mw | float \| null | Planned or actual power capacity in megawatts. |
| capacity_mw_source_confidence | enum \| null | Confidence in the MW figure. See MW Confidence enum. |
| source_url | url \| null | Link to the primary source for this project. |
| archive_url | url \| null | Archived copy of the source. |
project_events.csv
Timeline events associated with projects. Join to projects on project_id.
| Field | Type | Description |
|---|---|---|
| id | uuid | Unique identifier for the event. |
| project_id | uuid | Foreign key to the parent project. |
| event_type | text | Type of event (e.g. 'permit_filed', 'public_hearing', 'construction_start'). |
| event_date | date \| null | Date of the event. |
| source_url | url \| null | Link to the source for this event. |
| archive_url | url \| null | Archived copy of the event source. |
| extraction_confidence | enum | Confidence in the extracted event data. See Extraction Confidence enum. |
| notes_verbatim | text \| null | Verbatim notes extracted from the source. |
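
Since event_date is nullable, building a per-project timeline needs a rule for undated events. The sketch below places them last, which is a design choice of this example rather than anything the export prescribes. It also assumes nulls are exported as empty cells; the sample rows and the function name `timelines` are hypothetical.

```python
import csv
import io
from collections import defaultdict
from datetime import date

# Hypothetical excerpt of project_events.csv; short ids stand in for real UUIDs.
EVENTS = """id,project_id,event_type,event_date
e1,p1,public_hearing,2024-06-10
e2,p1,permit_filed,2024-02-01
e3,p1,construction_start,
"""

def timelines(fh):
    """Group events by project_id, ordered by date with undated events last."""
    by_project = defaultdict(list)
    for r in csv.DictReader(fh):
        when = date.fromisoformat(r["event_date"]) if r["event_date"] else None
        by_project[r["project_id"]].append((when, r["event_type"]))
    for events in by_project.values():
        # The first key element pushes None dates to the end; date.min is only
        # a placeholder so that the tuple comparison never sees None.
        events.sort(key=lambda e: (e[0] is None, e[0] or date.min))
    return dict(by_project)

project_timelines = timelines(io.StringIO(EVENTS))
```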
Enums & Allowed Values
Sponsor Category
| Value | Label | Description |
|---|---|---|
| industry | Industry | Data center companies, tech firms, or industry trade groups. |
| advocacy | Advocacy | Issue advocacy organizations, community groups, or political organizations. |
| media | Media | News organizations that commission polls. |
| academic | Academic | Universities or research institutions. |
| government | Government | Government agencies or officials. |
| unknown | Unknown | Sponsor could not be determined from available information. |
Question Type
| Value | Label | Description |
|---|---|---|
| support_oppose | Support/Oppose | Questions asking whether respondents support or oppose a project or policy. |
| favorability | Favorability | Questions asking about favorable/unfavorable views. |
| concern | Concern | Questions measuring level of concern about impacts. |
| tradeoff | Tradeoff | Questions presenting tradeoffs (e.g. jobs vs. environmental impact). |
| awareness | Awareness | Questions measuring awareness of projects or issues. |
| other | Other | Questions that do not fit other categories. |
Project Status
| Value | Label | Description |
|---|---|---|
| announced | Announced | Project has been publicly announced, but no permits have been filed. |
| proposed | Proposed / In Permitting | Project is in the permitting or planning process. |
| under_construction | Under Construction | Construction is underway. |
| operational | Operational | Data center is operating. |
| canceled | Canceled | Project has been canceled or abandoned. |
| on_hold | On Hold | Project is paused or delayed. |
Extraction Confidence
| Value | Label | Description |
|---|---|---|
| high | High | Data was clearly stated in the source and extraction is very likely correct. |
| medium | Medium | Data required some interpretation but is likely correct. |
| low | Low | Data was ambiguous or required significant interpretation. |
Geography Precision
| Value | Label | Description |
|---|---|---|
| site | Site Address | Coordinates are for a specific site address. |
| city | City Centroid | Coordinates are approximate, placed at the city center. |
| county | County Centroid | Coordinates are approximate, placed at the county center. |
Topic Tags
Questions may have zero or more of the following topic tags:
Caveats & Methodology
Coverage
PoweredByWho does not claim to be a comprehensive database of all data center opinion research. Coverage is limited to studies and projects discovered through a curated allowlist of public sources. New sources are added over time.
Data Extraction
Study data is extracted from source documents using a combination of automated processing and human review. The extraction_confidence field indicates the reliability of the extraction. All items are reviewed before being published.
Question Verbatim Text
Question text is reproduced verbatim from the source document to the extent possible. Some formatting may be lost in extraction. Always refer to the original source for the authoritative wording.
Project Coordinates
Project latitude and longitude may be approximate. The geography_precision field indicates whether coordinates are for a specific site, a city centroid, or a county centroid. Do not rely on these coordinates for precise location purposes.
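For analyses that need actual site locations, one option is to keep only rows whose geography_precision is 'site' and drop centroid placements. This is a minimal sketch that assumes null coordinates are exported as empty cells; the sample rows and the function name `site_level_points` are hypothetical.

```python
import csv
import io

# Hypothetical excerpt of projects.csv; short ids stand in for real UUIDs.
PROJECTS = """id,canonical_name,lat,lng,geography_precision
p1,Example Campus,38.9,-77.4,site
p2,Another Project,39.0,-77.5,county
p3,No Coordinates,,,city
"""

def site_level_points(fh):
    """Return (name, lat, lng) only for projects with site-level coordinates."""
    points = []
    for r in csv.DictReader(fh):
        if r["geography_precision"] != "site":
            continue  # city and county values are centroids, not real sites
        if not r["lat"] or not r["lng"]:
            continue  # coordinates missing (assumed empty-cell null encoding)
        points.append((r["canonical_name"], float(r["lat"]), float(r["lng"])))
    return points

points = site_level_points(io.StringIO(PROJECTS))
```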
Capacity Figures
MW capacity figures are sourced from public reporting and may represent planned, permitted, or operational capacity depending on the project stage. The capacity_mw_source_confidence field indicates the reliability of the figure.
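Because a capacity figure can mean different things at different project stages, totals are easier to interpret when kept separate by status_current. The sketch below sums capacity_mw per status and counts rows with no figure, assuming nulls are exported as empty cells; the sample rows and the function name `capacity_by_status` are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Hypothetical excerpt of projects.csv; short ids stand in for real UUIDs.
PROJECTS = """id,canonical_name,status_current,capacity_mw
p1,Project A,operational,100
p2,Project B,proposed,250.5
p3,Project C,proposed,
p4,Project D,operational,50
"""

def capacity_by_status(fh):
    """Return (total MW per status, count of rows without a MW figure per status)."""
    totals = defaultdict(float)
    missing = defaultdict(int)
    for r in csv.DictReader(fh):
        status = r["status_current"]
        if r["capacity_mw"]:
            totals[status] += float(r["capacity_mw"])
        else:
            missing[status] += 1  # no figure reported; do not count it as 0 MW
    return dict(totals), dict(missing)

totals, missing = capacity_by_status(io.StringIO(PROJECTS))
```

Reporting the missing count alongside each total makes it visible when a sum understates capacity because some projects have no published figure.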
Updates
Data is refreshed from sources approximately twice daily. New studies and projects are added as they are discovered and pass editorial review.
Citation
When citing data from PoweredByWho, please reference both PoweredByWho as the aggregator and the original source study or document. Each study detail page includes the source URL for direct citation.