Fair use of database content can feel like a puzzle — what can you copy, and when will copying trigger a copyright claim?
Are you building products, pulling facts for research, or drafting policy? The U.S. framework pivots on Section 107’s four-factor test. It weighs purpose, nature, amount taken, and market effect. Courts look at transformation, the “heart” taken, and whether your work displaces a market.
Want clear steps? Start by documenting purpose, minimizing what you extract, and checking licenses. The Copyright Office’s Fair Use Index and PACER provide case summaries and opinions to back decisions.
Quick promise: this article gives practical guidance and resources so your teams answer tough questions with confidence and ship data-driven projects with fewer surprises.
What “database content” means in U.S. copyright law
When you open a collection, ask: are you handling raw facts or expressive works? That question decides the legal path. Facts and most metadata are not copyrightable under U.S. law. You can often copy a factual number or date without copyright issues.
But creativity changes things. A photograph, essay, chart, or a crafted caption is a copyrighted work. Copying those items copies expression, not mere fact. Contracts can also restrict how you may use factual entries—even when copyright does not.
Facts versus expression: why raw data often isn’t protected
Telephone-style lists and numeric records typically report facts. Courts treat them as information, not authorship. If you retype a fact, copyright usually won’t block you. Still, check licenses and terms before commercial use.
When data becomes a copyrighted work
A curated list or an unusual arrangement can get “thin” protection for selection and structure. Minimal creativity — a unique layout or a distinctive caption — can cross the threshold into an original work.
- Ask first: are you copying facts or expression?
- Photographs and essays in a set remain individual works.
- Schema and arrangement may carry limited copyright protection.
- Document which fields are factual and which are expressive.
The legal backbone: Section 107 and the four-factor test
What test will courts apply when your team copies data for a report or product? Section 107 of the Copyright Act names four factors that judges weigh case‑by‑case. Follow them in order and document your reasoning.
Purpose and character
Ask: does your use add new meaning, critique, or analysis? Transformation favors a positive outcome. Educational or noncommercial aims help, but they do not guarantee a win.
Nature of the work
Factual materials prompt more leeway than highly creative works. Cite whether entries are raw data or a copyrighted work with expressive selection.
Amount and substantiality
Take only what your task needs. Avoid copying the “heart” even if it’s a small excerpt.
Market effect
Consider whether your feature would substitute for existing licenses or cut future revenue. Courts often treat this factor as decisive.
- Do this: write a factor‑by‑factor memo before launch.
- Do this: cite leading decisions and the Fair Use Index when you can.
- Do this: limit quantity, prove necessity, and record research steps.
| Factor | Key question | Practical step |
|---|---|---|
| Purpose | Transformative? | Summarize added meaning |
| Nature | Factual or creative? | Flag expressive fields |
| Market | Substitute risk? | Model impact on licensing |
Databases versus data: separating structure from contents
Think of the collection as two layers: the container and the entries inside. The container is the structure—fields, order, labels—and the entries are the raw facts that fill it. Treat these layers differently when you plan a project.

Thin copyright in selection, coordination, and arrangement
A compilation can carry a thin copyright for creative selection and arrangement. Curating which records to include can reflect judgment that earns protection.
If you copy a layout or replicate selection wholesale, you risk copying that protected arrangement. Extract facts into a new schema to lower risk.
Metadata, schemas, and creative choices that carry rights
Field names, taxonomies, and ordering are often expressive choices. Those elements can be a form of copyright protection or other rights.
- Separate container from contents: protect structure, but many rows are plain facts.
- Map needed fields and relations—replicate only what is functionally necessary.
- Contracts can govern access to factual materials even when copyright does not.
| Aspect | Likely status | Practical step |
|---|---|---|
| Selection & arrangement | Thin copyright | Create your own ordering |
| Raw facts | Usually free | Extract into new structure |
| Schemas & labels | May be expressive | Rename and document choices |
Fair use of database content in practice
Small choices—resolution, quantity, context—shape whether a lift is defensible. Keep decisions narrow. Record why each excerpt is needed. Short notes help later reviews.
Research and scholarship: quoting, sampling, and analysis
In research, quote tiny samples of expressive fields. Show only what proves your point. Pair excerpts with your own models or code to transform raw figures into analysis.
News and commentary: context that transforms raw facts
For data journalism, add visualization or critique. A chart that reframes numbers changes purpose. Thumbnails or low-resolution photographs can illustrate method without full reposting.
Teaching and libraries: access with proportion and purpose
In classrooms, link to licensed sources through your library instead of reposting materials. Use excerpts only as necessary and explain why each sample matters.
- Limit exports to needed fields and rows.
- Keep a log of examples, quantities, and rationale.
- Credit sources—citation alone won’t replace a fair use analysis.
| Scenario | Practical step | Why it helps |
|---|---|---|
| Academic research | Sample expressive fields; document necessity | Shows transformation and minimality |
| News reporting | Visualize and critique raw figures | Alters purpose; reduces substitution risk |
| Teaching & library | Link licensed access; use thumbnails | Respects licenses and preserves access |
Bottom line: follow Section 107 factors in every project. Treat each excerpt as an example you must justify.
When licenses and terms overrule your fair use plans
Which written terms can strip away your legal options before you ever copy a record? Many sites post Terms that act as contracts. Clicking “I agree” often creates binding obligations that limit what you may do next.
Click-through contracts and EULAs that bind your use
Did you click a checkbox? That moment can impose limits on copying, scraping, and reposting. Read the license and the EULA. Note deletion, audit, or IP assignment clauses.
Institutional licensing and access-controlled reposting
Enterprise licenses often bar exports to external systems. Open reposting is usually banned. Access-controlled reposting may be allowed with conditions. Check with your library or contracts team before you proceed.
Linking safely when copying is restricted
When in doubt, link to the record instead of copying materials into your system. Linking preserves access and avoids violating a contract or copyright owner’s terms.
- Did you click “I agree”? Track that contract in your project plan.
- Confirm license scope, user counts, and permitted work with your library.
- Keep screenshots minimal; verify that the terms permit them.
- If a license conflicts with statutory rights, the license usually governs permitted use.
- When planned use exceeds license limits, ask the copyright owner for permission.
| Restriction | Typical clause | Practical step |
|---|---|---|
| Copying rows or fields | Export/scrape prohibited | Request permission or link |
| Reposting materials | Open reposting banned | Use access-controlled viewer or ask library |
| Audit & deletion | Retention limits shown | Log terms and renewal dates |
Public domain and open licenses: lawful shortcuts to clarity
Searching for sources you can deploy with minimal legal friction?
Public domain materials give you a fast path. Works created by the U.S. federal government are public domain by statute. That means many NASA photos and federal reports are free to copy and transform—still verify exceptions and attribution rules.
U.S. federal works and public domain materials
Prefer public domain when timelines are tight and risk tolerance is low. Use CC0 datasets to remove hurdles. Even when attribution is not required, note provenance in your registry.
Creative Commons and rights statements you can rely on
Creative Commons licenses let you know exactly what you may do. Check ND, NC, and SA flags before integration. Filter sources with usage-rights tools—Google Advanced Image Search and government portals are good starting points.
- Keep a verified list of open resources and portals your team trusts.
- Record license version and URL at ingestion for every work you ingest.
- Combine public-domain data with your proprietary models to build clear commercial advantages.
| Source type | Typical status | Practical step |
|---|---|---|
| U.S. federal sites (NASA, USA.gov) | Public domain | Verify page notes; log URL |
| CC0 datasets | No restrictions | Ingest with attribution best practice |
| Creative Commons (CC BY/NC/SA) | Conditional reuse | Follow license terms; document version |
| Filtered image search | Varies by source | Use filters; confirm original license |
Building a defensible workflow for data-driven projects
Start every project with a one-page memo. State the goal, the transformation you will perform, and why each material is necessary. That memo becomes your first line of defense when questions arise.
Document intent, transformation, and necessity
Write clear intent: what you will change and why. Tie each excerpt to analysis or commentary. Cite MIT: crediting sources avoids plagiarism but does not cure infringement.
Minimize quantity: fields, rows, and resolution
Pre-specify exact needs. List fields, row counts, and image resolution. Use sampling, thumbnails, or summaries to limit what you pull.
Record sources, citations, and permissions
Log every source URL, license, and terms with dates and authorized users. Keep version notes in your registry.
Plan for rights conflicts before publication
Route gray-area uses to counsel early. Add license and rights checks to your pull request and release checklist. Prefer links over uploads when library licenses restrict redistribution.
- One-page memo: goal, transformation, necessity.
- Pre-specify: fields, rows, resolution.
- Minimize: sample or thumbnail first.
- Log: sources, license URLs, dates, users.
- Escalate: counsel before publication.
| Step | Action | Why it helps |
|---|---|---|
| Intent memo | Write 1 page | Documents purpose and necessity |
| Pre-specify data | List exact fields/rows | Limits quantity and substitution risk |
| Release checks | Include license & rights signoff | Prevents last-minute legal surprises |
Risk signals courts watch for in database cases
Courts flag a few concrete behaviors that predict trouble—spot them early.
Judges focus on market harm first. They ask whether your release replaces paid access or licenses. If it does, that signal favors a negative decision.
Substitution risk and loss of licensing markets
Does your product satisfy the same user need as the source? If yes, expect courts to weigh lost sales heavily. Bulk exports that mirror structure and selection look like substitution. At scale, similar projects can erode licensing revenue fast.
Copying the “heart” of a compilation
Even small excerpts can be fatal when they capture the core curated records. Lifting the heart of a work increases risk more than copying marginal rows. Republished material that satisfies identical demand is a red flag.
- Does the release substitute for the source? High risk.
- Bulk exports that reproduce selection or layout suggest copying.
- Prefer analysis, models, or annotations over straight aggregation.
- When multiple signals stack, get permission or redesign.
| Signal | Why it matters | Action |
|---|---|---|
| Substitution | Lost licensing revenue | Limit downloads; add paywall |
| Heart copied | Core curated value taken | Redact key records; summarize |
| Mirrored structure | Suggests replication beyond facts | Create new schema; rename fields |
Real-world patterns from the Fair Use Index
What patterns emerge when judges catalog data rulings in a public index? The Index tracks Supreme, circuit, and district court decisions and lists categories like internet and digitization. It links to full opinions on Google Scholar, Justia, Westlaw, LEXIS, and PACER.

How courts weigh transformation in data contexts
Courts look for added analysis, context, or a new function. If your project changes how material works, that helps. Necessity matters—take only what the new purpose demands.
Pull citations from the Index. Read full opinions on Google Scholar or PACER for nuance. That step shows judges you considered section factors and copyright law.
When entire works or datasets still pass muster
Some decisions permit whole works when completeness is essential to the new work. Internet and digitization cases give many such examples. Track which materials were deemed excessive versus essential.
- 1. Favor transformation that adds analysis or critique.
- 2. Document necessity and minimality for each excerpt.
- 3. Catalog examples internally to speed future reviews.
| Pattern | Why it matters | Action |
|---|---|---|
| Transformation | Shifts purpose | Annotate and analyze |
| Necessity | Limits quantity | Record exact fields |
| Completeness | Sometimes required | Note citation & source |
Next steps to use database content with confidence today
Ready to move from caution to practice with your next data project? Write a one‑page memo that states purpose and the transformation you will do. Keep it short. Keep it specific.
Then limit exports to the smallest slice needed. Prefer links and viewers rather than full uploads. Check license and terms. If anything is unclear, ask your library or counsel fast.
Prefer public domain or CC0 sources when speed and low risk matter. Log sources, rights, and approvals in a shared tracker. If scope grows, rerun your analysis before launch.
For edge cases, contact the copyright owner or seek alternative resources. Consult the Fair Use Index and read cited decisions for similar fact patterns. Bake these steps into product workflows and keep the playbook fresh.