This is how Show HN should work. Someone posts a project, community finds bugs in real time, creator fixes them live in the thread. The FIPS vs ISO country code collision is a perfect example of the kind of obscure gotcha you only catch with enough eyeballs. Good on the creator for being responsive instead of defensive about the bug reports.
Hi there, thanks for linking this! My GitHub and website both link to and use this source! I just thought putting it in a SQL database and making the entire 1990-2025 archive queryable was needed, since I couldn't find one anywhere :)
It is a lot of fun and rewarding to do this! I've done it several times for medium-sized datasets, like Wikipedia dumps, the entire geospatial dataset to mapreduce it (pgsql). The Wikipedia one was great: I had it set up to query things like "show me all ammunition manufactured after 1950 that is between .30 and .40" and it could return results nearly instantly. The Wikimedia dumps keep the infoboxes and relations intact, so you can do queries like this easily.
Do you have a write-up of this somewhere? When I last looked at the Wikipedia dumps, they looked like a mess to parse. How were you getting structured information?
If you are patching fields or bugs in the database (country codes, for example), would it be possible to share that database with us as well, so we can build on top of it?
This is actually an excellent dataset to test GraphRAG capabilities.
Also, a world simulation game, embodied with real data and real changes, can be built based off this data.
Hey there, yeah, definitely. I maintain .txt change logs for all data modifications. To be clear, no information is added or altered — the Factbook content is exactly what the CIA published. The parsing process structures the raw text into fields (removing formatting artifacts, sectioning headers, and deduplicating noise lines), but the actual data values are untouched. What I've added on top are lookup tables that map the CIA's FIPS 10-4 codes to ISO Alpha-2/3 and a unified MasterCountryID, so the different code systems can be joined and queried together.
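For anyone wondering what such a lookup table buys you, here is a minimal sketch using Python's built-in sqlite3. The table and column names (country_codes, facts, FIPS, ISOAlpha2, ISOAlpha3) and the two sample rows are assumptions for illustration, not the project's actual schema; only MasterCountryID is a name taken from the comment above.

```python
import sqlite3
from typing import Optional

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical crosswalk: one row per entity, all code systems side by side.
    CREATE TABLE country_codes (
        MasterCountryID INTEGER PRIMARY KEY,
        Name TEXT, FIPS TEXT, ISOAlpha2 TEXT, ISOAlpha3 TEXT
    );
    -- Hypothetical fact table keyed by the FIPS code the Factbook itself uses.
    CREATE TABLE facts (FIPS TEXT, Year INTEGER, Field TEXT, Value TEXT);

    INSERT INTO country_codes VALUES
        (1, 'Germany', 'GM', 'DE', 'DEU'),
        (2, 'Gambia',  'GA', 'GM', 'GMB');
    INSERT INTO facts VALUES
        ('GM', 2002, 'Capital', 'Berlin'),
        ('GA', 2002, 'Capital', 'Banjul');
""")

def capital_by_iso(iso2: str, year: int) -> Optional[str]:
    """Look up a fact by ISO code, joining through the crosswalk to FIPS."""
    row = conn.execute(
        """SELECT f.Value
           FROM facts f
           JOIN country_codes c ON c.FIPS = f.FIPS
           WHERE c.ISOAlpha2 = ? AND f.Year = ? AND f.Field = 'Capital'""",
        (iso2, year),
    ).fetchone()
    return row[0] if row else None

print(capital_by_iso("DE", 2002))  # Berlin
print(capital_by_iso("GM", 2002))  # Banjul
```

The raw data stays keyed by the code the source uses, and every other code system is reached through the join.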
Hi. Nice project. One issue though; if you go to the Factbook for any year[1], the link to the entry for “Germany”[2] will take you to the entry for the Gambia for every year I have checked. I have not noticed any other countries where that happens.
Hi there, I have located the root cause and will be fixing the issue:
Root cause: CIA uses FIPS codes (CanonicalCode), which differ from ISO Alpha-2 for many countries. Templates and SQL queries prioritized CanonicalCode over ISOAlpha2, so URL codes like /archive/2025/AU matched the wrong country.
Australia (AU) -> American Samoa (AS = CIA FIPS for Australia)
Singapore (SG) -> Senegal (SG = CIA FIPS for Senegal)
Germany (DE) -> Gambia (GM = CIA FIPS for Germany)
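A rough sketch of why lookup order matters when the same two-letter code exists in both systems. The resolve function and the COUNTRIES list are hypothetical, not the site's code; the FIPS and ISO values are the real codes for these four countries.

```python
from typing import Optional

# (name, FIPS 10-4, ISO Alpha-2). Singapore and Senegal swap codes between systems.
COUNTRIES = [
    ("Singapore", "SN", "SG"),
    ("Senegal", "SG", "SN"),
    ("Germany", "GM", "DE"),
    ("Gambia", "GA", "GM"),
]

def resolve(code: str, prefer: str = "iso") -> Optional[str]:
    """Resolve a two-letter URL code, trying the preferred system first."""
    code = code.upper()
    fips = {f: name for name, f, _ in COUNTRIES}
    iso = {i: name for name, _, i in COUNTRIES}
    first, second = (iso, fips) if prefer == "iso" else (fips, iso)
    return first.get(code) or second.get(code)

print(resolve("SG", prefer="fips"))  # Senegal
print(resolve("SG", prefer="iso"))   # Singapore
```

Whichever system is preferred, the URLs have to be generated with the same system that the route resolves, otherwise the collision just moves somewhere else.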
Found the root cause. The "World" entity (population ~8 billion) was being included alongside all individual countries in the sum, doubling the total. Thank you again!
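A tiny sketch of that double-counting and one way to guard against it. The numbers are toy values, and the AGGREGATES set and function name are assumptions, not the site's code.

```python
# Toy figures, chosen so the aggregate equals the sum of its members.
populations = {"World": 100, "Country A": 60, "Country B": 40}

# Hypothetical list of aggregate entities that must not be summed with countries.
AGGREGATES = {"World"}

def total_population(rows: dict) -> int:
    """Sum individual entities only, skipping aggregate rows such as 'World'."""
    return sum(value for name, value in rows.items() if name not in AGGREGATES)

print(sum(populations.values()))      # 200 (double-counted)
print(total_population(populations))  # 100
```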
Found the problem, the total regex doesn't handle magnitude suffixes:
2018: total: 17,856,024 → parses as 17856024 (correct raw count)
2020: total: 18.17 million → parses as 18.17 (WRONG - drops "million")
2025: total: 39.3 million → parses as 39.3 (WRONG)
So the chart jumps from ~18 million down to ~18, making it wrong. The fix is to handle "million/billion/trillion" after total.
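A minimal sketch of that fix, assuming the field text looks like the three samples above. The regex, the multiplier table, and the parse_total name are illustrative, not the project's actual parser. Decimal is used so that "18.17 million" scales to an exact integer.

```python
import re
from decimal import Decimal
from typing import Optional

MULTIPLIERS = {
    "thousand": 10**3,
    "million": 10**6,
    "billion": 10**9,
    "trillion": 10**12,
}

# A number with optional thousands separators and decimals, then an optional magnitude word.
TOTAL_RE = re.compile(
    r"total:\s*(\d[\d,]*(?:\.\d+)?)\s*(thousand|million|billion|trillion)?",
    re.IGNORECASE,
)

def parse_total(text: str) -> Optional[int]:
    """Return the total as a raw count, or None if no total is found."""
    match = TOTAL_RE.search(text)
    if match is None:
        return None
    number = Decimal(match.group(1).replace(",", ""))
    suffix = (match.group(2) or "").lower()
    # int() truncates any leftover fraction toward zero.
    return int(number * MULTIPLIERS.get(suffix, 1))

print(parse_total("total: 17,856,024"))     # 17856024
print(parse_total("total: 18.17 million"))  # 18170000
print(parse_total("total: 39.3 million"))   # 39300000
```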
Hey there, will add the feature. Wasn't sure if people's computers could handle it all in one, lol, but will make it available in the data export page.
Ohh, that is a great idea, especially since we already have the political field in SQL! I will start working on some of this and update the website this week. Thank you for the awesome suggestions!
That’s a bit of a canary is it not? You don’t need to say that and wouldn’t know to say that unless you had worked in the space or wanted us to think you did :)
To clarify, I am a shill for fly.io and wanted to get you to spend more money by scaling it up. The site loaded instantaneously on the first try, so fast I thought it was local.
I'll start working on this now! Thank you for sending it! It will be interesting to see if I can incorporate them into the globes or when the country info pops up!
Hi there, I have located the root cause and sent out a bug fix.
Root cause: The CIA World Factbook, published by the Central Intelligence Agency, uses the U.S. Government's FIPS 10-4 country codes, which differ from the ISO 3166-1 Alpha-2 codes used by the rest of the world. Of the 281 entities in our database, 173 have different FIPS and ISO codes. Our lookups matched FIPS codes first, so when codes collided between the two systems, the wrong country was loaded. Fixed all 13 queries and 6 templates to always prefer ISO over FIPS.
Examples fixed:
Australia (ISO=AU) was loading American Samoa (FIPS=AQ, but Australia's FIPS=AS collides with American Samoa's ISO=AS)
Singapore (ISO=SG) was loading Senegal (FIPS=SG)
Germany (ISO=DE) was loading Gambia (FIPS=GM = Germany's FIPS, ISO=GM = Gambia)
Bahamas (ISO=BS) was loading Burkina Faso (FIPS=BF = Bahamas' FIPS, ISO=BF = Burkina Faso)
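A small sketch of how such mismatches and collisions could be audited from a crosswalk. The ROWS sample and the function names are assumptions for illustration; the codes listed are the real FIPS 10-4 and ISO Alpha-2 values for these entities.

```python
# (name, FIPS 10-4, ISO Alpha-2)
ROWS = [
    ("Australia", "AS", "AU"),
    ("American Samoa", "AQ", "AS"),
    ("Singapore", "SN", "SG"),
    ("Senegal", "SG", "SN"),
    ("Germany", "GM", "DE"),
    ("Gambia", "GA", "GM"),
    ("Bahamas", "BF", "BS"),
    ("Burkina Faso", "UV", "BF"),
    ("France", "FR", "FR"),
]

def differing(rows):
    """Entities whose FIPS code is not the same as their ISO code."""
    return [name for name, fips, iso in rows if fips != iso]

def collisions(rows):
    """Cases where one entity's FIPS code is another entity's ISO code."""
    by_iso = {iso: name for name, _, iso in rows}
    found = []
    for name, fips, _ in rows:
        other = by_iso.get(fips)
        if other is not None and other != name:
            found.append((name, fips, other))
    return found

print(len(differing(ROWS)))  # 8
for name, code, other in collisions(ROWS):
    print(f"{name}: FIPS {code} is the ISO code of {other}")
```

Running a check like this over the full table would list every pair that can be confused, rather than only the ones readers happen to click on.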
This is one of the hardest sites I’ve ever tried to read.
The pages are dense blocks of tiny gray serif text with default line height and almost no visual hierarchy. It feels like gray text on gray blobs. It is exhausting to scan and read.
In 2026, this should not be an issue. We have clear standards. The Web Content Accessibility Guidelines (WCAG) exist for a reason. Basic accessibility best practices have been documented for years.
The issues are not subtle. Small text, low contrast, and long unbroken paragraphs are not design preferences. They are barriers. They make the content harder to read for everyone, especially people with visual or cognitive challenges.
This is fixable. Increase the base font size. Improve contrast ratios. Add meaningful spacing. Use clear headings and structure. These are foundational usability principles.
Accessibility is not extra polish. It is baseline quality. Right now, the site is unnecessarily hard to read. That is a design problem, not a content problem.
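For reference, the WCAG 2.x contrast ratio is easy to compute yourself. This is a sketch of the published formula; the hex colors below are placeholders, not values measured from the site.

```python
def _linear(channel: int) -> float:
    """Convert an 8-bit sRGB channel to linear light, per the WCAG 2.x definition."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

def relative_luminance(hex_color: str) -> float:
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)

def contrast_ratio(foreground: str, background: str) -> float:
    """Ratio from 1 (identical colors) to 21 (black on white)."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)

# WCAG AA asks for at least 4.5:1 for normal-size body text.
for fg in ("#000000", "#777777"):
    ratio = contrast_ratio(fg, "#ffffff")
    print(fg, round(ratio, 2), "passes AA" if ratio >= 4.5 else "fails AA")
```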
Your points about accessibility are fair, and I agree that readability and contrast matter a lot.
That said, I had a different experience. I found the site readable and fairly easy to navigate once I understood the underlying structure of the data. The content is dense, but that seems inherent to the subject matter rather than purely a design issue. For me, it strikes a reasonable balance between overly sparse, scroll-heavy modern layouts and extremely compressed ones.
That doesn't mean improvements couldn't be made, especially around contrast, but I don't think the current design is unusable. It may simply work better for some reading styles than others.
In 2026, tools like WAVE, Lighthouse, and a real screen reader should be part of any website design process. They catch issues early. A stitch in time saves nine.
I know you may not be a designer. That’s fine. Starting with a solid, off-the-shelf CSS framework can get you much closer to Web Content Accessibility Guidelines (WCAG) compliance from day one. It sets a baseline so you’re not reinventing solved problems.
Building from scratch is absolutely valid. It’s cool, even. But right now it reads less like an intentional design choice and more like missing fundamentals.
I’m not trying to be a dick, the project has potential! A few design improvements would make it usable for a lot more people.
Hmm. It's kind of weird, because I think I actually used it in the 1990s, probably shortly before Wikipedia emerged. Since Wikipedia, I don't think I have used the CIA World Factbook much at all, so I guess that partly explains why the website is now defunct. But I am a tiny bit sad that it is gone, if only as a piece of 1990s nostalgia. I think we need to be careful: yes, Wikipedia has that information, but we lose websites this way. That is a potential danger, because we end up with more and more of a monopoly, which is rarely good. (OK, Wikipedia may be an exception, but it also has intrinsic quality issues; it is still excellent in many ways but not perfect.) We may get tunnel vision the more websites vanish. Just look at the AI-slop autogenerated "content" or "affiliate" links you see in a Google search, if anyone is still using that.
Glad I was able to get the original Factbook data that other archivists have gathered over the years: Project Gutenberg (plain text), Wayback Machine (HTML zips and factbook.jsons), and one from the agency's websites.
This is pretty basic but kinda neat. A good way to browse the Factbooks like a website. Definitely could use more features, but imo superior to flipping through a PDF.
Hi, thanks for this! Not sure if you're aware that clicking Australia goes to American Samoa. I encountered a similar issue with some others (Bahamas -> Burkina Faso).
The data from the CIA World Factbook is in the public domain (being a U.S. Government work) and is free for anyone to use. The ETL scripts and data tools available in the GitHub repository are open source and licensed under the MIT License. However, the web application itself is proprietary software, with all rights reserved.
"A cache for datasets for the country profiles from the World Factbook in the original (1:1) format from the cia.gov website"
https://github.com/factbook/cache.factbook.json
Thanks..
I will add them to the GitHub :)
[1] https://cia-factbook-archive.fly.dev/archive/2002
[2] https://cia-factbook-archive.fly.dev/archive/2002/GM
Just deployed a new bug fix.
Thanks for bringing this to my attention!
One small bug though: https://cia-factbook-archive.fly.dev/analysis/compare?a=IN&b...
The second dropdown switches to "Comoros" instead of "China" even after selection, though the URL says CN for China.
One thing; you're supposed to write "Cannot confirm or deny my affiliation with the CIA"
https://www.cia.gov/resources/cia-maps/
https://cia-factbook-archive.fly.dev/maps?page=1
Then when you actually are in Australia, if you click back to 2001 or earlier it changes to 'Ashmore and Cartier Islands'
I didn't discover this until I saw the recent post about its deactivation.
https://wave.webaim.org/report#/https://cia-factbook-archive...
Cheers!