showerst 1 day ago [-]
I've been working on legislative data for 15 years now, both on open source scrapers with OpenStates and on a commercial product targeted at professionals (a competitor to those in the article).
We tried for years with OpenStates to run a free legislative tracking product before eventually having it partner with a commercial provider who was willing to contribute the resources to keep it alive and help out with the open source pieces (shout out to Plural, nice folks).
Believe me when I say that this space is a classic nerd tar pit. It looks like a relatively easy problem: a few hundred scrapers, search, and some basic CRM functionality, and you're off to the races.
The problem is that behind the scenes the data is very complicated, and the sources constantly change and break in goofy ways. You need to be running hundreds of scrapers constantly (many of them against Akamai or Cloudflare), and working around new source website bugs or procedural edge cases every week. It doesn't scale like product or web search, where you can just ignore broken pages; the penalty for missing things is too high. Tuning your workflow so people find what they need without getting buried is tough, because there are tens of thousands of bills a session about things people think they care about, like "AI" or "taxes". On top of that, the low- or zero-budget clientele is often that mix of high expectations and low domain knowledge that makes them a big support burden.
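To make that "can't just skip broken pages" point concrete, here is a minimal hypothetical sketch of the loop in Python. Every name in it is invented, and real pipelines like OpenStates' are far more involved; the point is only that a parse failure has to page a human rather than be silently dropped.

    # Hypothetical sketch: a parse failure must alert a human, not be skipped.
    import urllib.request

    def fetch(url: str) -> str:
        # Fetch a page; network errors propagate to the caller.
        with urllib.request.urlopen(url, timeout=30) as resp:
            return resp.read().decode("utf-8", errors="replace")

    def parse_bills(html: str) -> list[dict]:
        # Stand-in parser: in practice this is bespoke per-state HTML scraping.
        if "<table" not in html:  # page structure changed -> treat as breakage
            raise ValueError("expected bill table missing")
        return [{"id": "HB 1", "title": "Example"}]  # placeholder result

    def run_scraper(state: str, url: str) -> list[dict]:
        try:
            return parse_bills(fetch(url))
        except Exception as exc:
            # Unlike web-search crawling, a broken source can't be silently
            # dropped: missing a bill is worse than missing a run.
            print(f"[ALERT] {state} scraper broke: {exc!r} - needs a human")
            raise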
FiscalNote burned 750 million dollars in VC money on this and just went under this week, granted with a series of spectacular own-goals.
I wish this author the best of luck, and if you want to team up on scrapers please give us a shout. But please be aware that you're promising the moon, and try to build a model that will be financially and effort-sustainable. Keeping this stuff going is a _slog_. I'm really hoping that someone can bring the professional-level tools to normal people.
orf 23 hours ago [-]
> Fiscalnote burned 750 million dollars in VC money on this
What percentage of that went towards solving the actual problem?
3/4 of a billion dollars is enough to pay many, many people $50 an hour to sit at a screen 24/7 and refresh any number of websites you want.
pbronez 6 hours ago [-]
Classic example of a problem we need to solve with policy instead of technology. The law isn’t even that hard:
1) Government authored/sponsored materials are not considered “published” until they are available in both human and machine readable digital formats at no cost to the reader (see the sketch after this list for one reading of “machine readable”).
2) Publishers of such materials
2.1) must not keep a record of individual usage beyond that technically necessary to provide access to the publication
2.2) notwithstanding 2.1, must take reasonable steps to secure digital publications from abuse that creates an undue administrative burden
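As a hypothetical illustration of requirement 1, here's a minimal record a publisher might emit; the field names are invented for this sketch, not drawn from any real statutory schema:

    # Hypothetical minimal record for a machine-readable bill publication.
    import json
    from dataclasses import dataclass, asdict

    @dataclass
    class PublishedBill:
        jurisdiction: str   # e.g. "US-OR"
        identifier: str     # e.g. "HB 2001"
        title: str
        full_text_url: str  # human-readable version, free of charge
        full_text: str      # machine-readable body, same content
        published_at: str   # ISO 8601 timestamp

    bill = PublishedBill(
        jurisdiction="US-OR", identifier="HB 0000", title="Example Act",
        full_text_url="https://example.gov/hb0000",
        full_text="Section 1. ...", published_at="2026-01-01T00:00:00Z")

    # Per 2.1, the server may log that this was served, but not who fetched
    # it beyond what's technically necessary to provide access.
    print(json.dumps(asdict(bill), indent=2))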
aaronbrethorst 1 day ago [-]
I have some experience in this space and I want to strongly encourage the author to reconsider their free-as-in-beer model.
Yes, your target users don’t have a lot of money, but they also deserve a sense of whether or not you’re going to keep maintaining this project. Additionally, they are generally NOT technical and will not have the skills necessary to set up or maintain this platform.
Without a paid offering, they will have to run the software themselves and will not have any clarity about your long-term commitment to the project. Feel free to reach out to me. My email address is in my profile.
skyberrys 20 hours ago [-]
Hey, this is a good point about the value of something being free that I hadn't thought of before. If it's open on GitHub, though, couldn't another person fork it and continue along if the original author vanishes?
aaronbrethorst 18 hours ago [-]
Well, sure. But what if another person doesn't show up to continue along? How does a non-technical customer validate that the fork is going to fit their needs and does what they say?
tptacek 23 hours ago [-]
All civic tech is awesome, but a word of advice (may or may not be applicable in your particular neck of the woods, but definitely is in mine):
Public comment is one of the least effective mechanisms for influencing policy, at least at the margin. You can drastically amplify your influence with a simple change: move from public to private commentary, directly and personally addressing your state and local reps. They all have email addresses and I think it's more likely than not you'll be surprised when they (and it'll actually be them, not some factotum) respond to your email.
This would stop working if everybody did it, but I live in an unusually (famously, in fact) engaged municipality and have been unreasonably successful at influencing policy and the evidence I see is that almost nobody does this.
There's probably a civic tech thingy to do here. Though I'd also be mindful of the appearance of canvassing. My experience is that decisionmakers very quickly clock canvassing efforts, and then mentally bucket input into "low-effort" and "high-effort", often in a way that amplifies smaller interests.
I also think you can probably get a long way just by doing a better job than your policy adversaries at presenting information. Another thing I've noticed reps quickly clocking: the commenters who clearly have never read a budget, or who don't know the difference between an Enterprise Fund and the General Fund. These are also problems tech can solve, by digesting and contextualizing data so people can present informed (or informed-sounding) arguments.
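As a toy illustration of that last point, here's a sketch that digests a budget table into the kind of fund-level facts a commenter could cite; the data and numbers are entirely invented:

    # Turn a (fictional) two-year budget into citable fund-level deltas.
    BUDGET = [  # (fund, FY2024 $M, FY2025 $M) -- invented numbers
        ("General Fund", 100.0, 104.0),
        ("Water Enterprise Fund", 40.0, 52.0),
    ]
    for fund, prev, curr in BUDGET:
        change = 100.0 * (curr - prev) / prev
        print(f"{fund}: {prev:.1f} -> {curr:.1f} ({change:+.1f}%)")

With even that much digestion, a commenter can say "the Water Enterprise Fund grew 30% while the General Fund grew 4%" instead of guessing.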
pacbard 19 hours ago [-]
Regarding public comments, I don't believe a good politician will make a snap decision at the dais following public comments. Most of them will have received the meeting agenda in advance and formed an opinion about how they are going to vote and the questions they are going to ask. If this is the case, public comment is just a waste of time for them, as they won't really get swayed by it. At most, they will mention a point that a public commenter made to support something that they were already going to do.
Emailing them privately in advance of the meeting will give them the opportunity to think about your input and, in some cases, reply and engage with you about the policy. It might not change their mind, but it will definitely help them see others' perspectives on their upcoming decision.
tptacek 19 hours ago [-]
In the best case, you're betting on some subset of whatever board showing up not having counted the votes they need for their preferred outcome. If something's on the agenda, it's because the board wanted it on the agenda!
pimlottc 1 day ago [-]
This is great but confused me at first because it's slightly misusing the term "civic tech". It's generally used pretty broadly to include all government and gov-adjacent technology. Public monitoring and engagement tools are a part of it but that's just one piece. Civic tech includes actual government projects like Healthcare.gov and IRS Direct File (RIP); organizing platforms like MoveOn.org and ActBlue; and volunteer programs like Code for America and U.S. Digital Response.
The line at the bottom of the page does a better job of describing what specifically this project is:
"FireStriker is a free civic engagement and legislative intelligence platform for community organizations, unions, PACs, and activists."
caesil 1 day ago [-]
Kind of sad to see AI water use as the first listed issue motivating this.
The word "fake" draws attention but I think the article obscures two real problems:
Training is missing from the analysis entirely (as someone else noted)
Inference water use is indeed minimal per prompt, no argument there, but training the old GPT-3 consumed roughly 5.4 million liters of water; LLaMA 3, roughly 22 million. These are huge events happening multiple times a year across the industry, and folding them into national averages seems like the same statistical simplification the article criticizes everyone else for doing.
"Small nationally" ≠ "fine locally"
The Dalles, Oregon is the clearest example. In 2012, Google used 12% of the city's water supply. Today it consumes a third, around 1.19 million gallons per day, and a sixth data center in the same area comes online in 2026.
The city is now pursuing a $260 million reservoir expansion into a national forest (!), where 95% of the projected new water demand will be industrial, not residential. Residents are looking at a potential 99% rate increase by 2036 to fund infrastructure that may exist primarily to serve one company. Apparently the city fought a 13-month legal battle just to keep those numbers secret. That's a community being reshaped around a single tenant.
Hays County, Texas residents sharing the Edwards Aquifer with incoming data centers voted to block one. Memphis is watching xAI draw 5 million gallons per day. Bloomberg found two-thirds of new U.S. data centers since 2022 are sited in high water-stress zones. Arizona cities have already passed ordinances capping data center water use.
This looks to me like a problem in the making. AI water use isn't a national crisis for now, but local impacts are already real, training costs are systematically underreported, and the five-year trajectory in water-stressed regions deserves serious attention.
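Taking the figures in this comment at face value (they're estimates, and the real numbers are often unreported), a quick back-of-envelope comparison against The Dalles' reported daily draw:

    # Back-of-envelope check using the numbers cited in this comment.
    LITERS_PER_GALLON = 3.785

    gpt3_training_l   = 5.4e6   # liters for GPT-3 training, per above
    llama3_training_l = 22e6    # liters for LLaMA 3 training, per above
    dalles_google_gpd = 1.19e6  # gallons/day, The Dalles figure above

    dalles_lpd = dalles_google_gpd * LITERS_PER_GALLON  # ~4.5M liters/day
    for name, liters in [("GPT-3", gpt3_training_l),
                         ("LLaMA 3", llama3_training_l)]:
        print(f"{name} training ~ {liters / dalles_lpd:.1f} days of "
              f"The Dalles' reported Google water use")

That works out to roughly 1.2 and 4.9 days respectively: a rounding error spread across a nation, but very visible when it lands on one town.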
tadfisher 24 hours ago [-]
There is so much nuance and context missing that I can't see this as anything other than astroturfing.
I can get behind "AI water use is not a serious concern" if all you are talking about is selling inference, and you're comparing some sort of usage metric (e.g. "water use per request"). Water and power use for inference is on the level of other heavy Internet products like video streaming or cloud compute.
There is a lot I can't ignore, though. Model training is incredibly demanding, so much so that OpenAI was trying to get $1 trillion in investment to practically double the number of data centers in the United States by 2030. That is a serious concern when we have to make decisions between, say, consumer water availability and tech investment in water-scarce areas like Arizona and New Mexico.
In Oregon, there are some unique problems with Amazon's water deals in Umatilla, where they are increasing the nitrate concentration of the local groundwater through evaporative cooling and refusing to pay for on-site treatment.
I can go on about other environmental harms, but I think you should take a more nuanced look at the issue. Having ChatGPT summarize a news article is not an unreasonable demand compared to other compute activities, but AI in general is driving compute demand so high that the general public is forced to reckon with a problem that's been there since the beginning: the expansion, operation and use of the Internet has physical environmental consequences.
deaux 17 hours ago [-]
It's completely trivial when you compare it to livestock agriculture's water usage (and even more trivial when looking at overall environmental damage), yet mysteriously 99% of people clamoring against GenAI's water usage have never opened their mouths about that. Or ever skipped a hamburger due to it. It's a joke. There are a hundred other good reasons to criticize GenAI, go pick one of those. The truth is that people are embarrassed of the real main reasons they dislike GenAI, yet they shouldn't be. Unlike this one, they're perfectly valid reasons.
I've known this for a long time and ironically this very article proves me right. The article is straight LLM slop, right from the mouth of GPT, Gemini or Claude. It's overt. Out of all 30 articles on the current HN frontpage, this one is by far the most LLM-written - I took the time to skim through each of them. This tells you all you need to know. With this in mind it's poor behavior to accuse the GP, of all people, of astroturfing.
knowaveragejoe 1 day ago [-]
Okay. There are other criticisms of datacenter buildout that make this kind of product valuable. Moving on.
csunoser 1 day ago [-]
I am more sympathetic to the OG.
There are many good criticisms of data centers. And yet the water issue always comes up first. Must we spew falsehoods just so our political message is catchy? I suppose yes - in times of war/politics, the laws/truths are silent. But it doesn't have to be so here.
ImPostingOnHN 1 day ago [-]
> the water issue always comes up first
I've never had it come up first. Neat how 2 people can have 2 opposite experiences based on their different life paths.
Anyways: Between our 2 opposite experiences, it might as well be totally random, so I don't think the ordering of concerns is that important. Better to focus on substance, like the concerns themselves.
waterfisher 1 day ago [-]
I appreciate the impetus behind this. But I'm unsure whether this warrants an HN post. The post text is AI-written, and there's no information on technical details--just a kind of vague problem statement. Nor was I able to find any code for the project elsewhere on the site.
dlipovetsky 1 day ago [-]
The author might like knowing about a similar effort targeting Automated License Plate Readers (ALPR) that was discussed here a few months ago: https://news.ycombinator.com/item?id=46290916. It relies on automated scraping + human confirmation. Louis Rossmann describes how it works in https://www.youtube.com/watch?v=W420BOqga_s
Not sure you should be free. You want to be sustainable but not for profit.
Charging the customer is the small guy's weapon to keep going.
The big guys can also do that, but they can subsidize loss-making departments, sell data, sell stock, etc.
Make it GPLv3 open source if you want to offer a free version.
croisillon 24 hours ago [-]
next time just post the prompt
skyberrys 20 hours ago [-]
I like the idea of building free apps for civic engagement, but I'm not sure how to use it or help your project along. Maybe I should ask Gemini or Claude what to do with it. I think making civic tech free is a good idea; I want more vibe-coded projects to be released for free. I agree a link to its GitHub is helpful for the group of people on Hacker News. I'm more likely to contribute code or documentation than to actually be a target audience for using it.
pbiggar 22 hours ago [-]
Tech for Palestine runs a (free) incubator for civic tech. Would encourage the OP to apply! https://techforpalestine.org/incubator
104 points, huh? HNers really love 100% LLM-written slop nowadays, it seems. Kudos on getting through to the end.
Or maybe the explanation is that nearly no one actually read it, that seems the more likely one.
up2isomorphism 1 day ago [-]
I am always skeptical about making anything useful "free", because unless there is no cost associated with it, "free" is a fake term that only means someone else absorbs the cost. There are cases where this makes sense, but I'm not sure "civic tech" is one of them.
https://firestriker.org/blog/building-firestriker-why-im-mak...
--
related, author is a friend with a less active HN account https://news.ycombinator.com/user?id=blakeofwilliam