
Songs on the Security of Networks
a blog by Michał "rysiek" Woźniak

Fighting Disinformation: We're Solving The Wrong Problems

This post was written for and originally published by the Institute of Network Cultures as part of the Dispatches from Ukraine: Tactical Media Reflections and Responses publication. It also benefited from copy editing by Chloë Arkenbout, and proofreading by Laurence Scherz.


Tackling disinformation and misinformation is a problem that is important, timely, hard… and, in no way new. Throughout history, different forms of propaganda, manipulation, and biased reporting have been present and deployed — consciously or not; maliciously or not — to steer political discourse and to goad public outrage. The issue has admittedly become more urgent lately and we do need to do something about it. I believe, however, that so far we’ve been focusing on the wrong parts of it.

Consider the term “fake news” itself. It feels like a new invention even though its literal use was first recorded in 1890. On its face it means “news that is untrue”, but of course, it has been twisted and abused to claim that certain factual reporting is false or manufactured — to the point where its very use might suggest that the person using it is not being entirely forthright.

That’s the crux of it; in a way, “fake” is in the eye of the beholder.

Matter of trust

While it is possible to define misinformation and disinformation, any such definition necessarily relies on things that are not easy (or possible) to quickly verify: a news item’s relation to truth, and its authors’ or distributors’ intent.

This is especially valid within any domain that deals with complex knowledge that is highly nuanced, especially when stakes are high and emotions heat up. Public debate around COVID-19 is a chilling example. Regardless of how much “own research” anyone has done, for those without an advanced medical and scientific background it eventually boiled down to the question of “who do you trust”. Some trusted medical professionals, some didn’t (and still don’t).

As the world continues to assess the harrowing consequences of the pandemic, it is clear that the misinformation around and disinformation campaigns about it had a real cost, expressed in needless human suffering and lives lost.

It is tempting, therefore, to call for censorship or other sanctions against misinformation and disinformation peddlers. And indeed, in many places legislation is already in place that punishes them with fines or jail time. These places include Turkey and Russia, and it will surprise no one that media organizations are sounding alarms about them.

The Russian case is especially relevant here. On the one hand, the Russian state insists on calling their war of aggression against Ukraine a “special military operation” and blatantly lies about losses sustained by the Russian armed forces, and about war crimes committed by them. On the other hand, Kremlin appoints itself the arbiter of truth and demands that any news organizations in Russia propagate these lies on its behalf — using “anti-fake news” laws as leverage.

Disinformation peddlers are not just trying to push specific narratives. The broader aim is to discredit the very idea that there can at all exist any reliable, trustworthy information source. After all, if nothing is trustworthy, the disinformation peddlers themselves are as trustworthy as it gets. The target is trust itself.

And so we apparently find ourselves in an impossible position:

On one hand, the global pandemic, a war in Eastern Europe, and the climate crisis are all complex, emotionally charged high-stakes issues that can easily be exploited by peddlers of misinformation and disinformation, which thus become existential threats that urgently need to be dealt with.

On the other hand, in many ways, the cure might be worse than the disease. “Anti-fake news” laws can, just like libel laws, enable malicious actors to stifle truthful but inconvenient reporting, to the detriment of the public debate, and the debating public. Employing censorship to fight disinformation and misinformation is fraught with peril.

I believe that we are looking for solutions to the wrong aspects of the problem. Instead of trying to legislate misinformation and disinformation away, we should be looking closely at how it is possible that they spread so fast (and who benefits from this). We should be finding ways to fix the media funding crisis; and we should be making sure that future generations receive the mental tools that would allow them to cut through biases, hoaxes, rhetorical tricks, and logical fallacies weaponized to wage information wars.

Compounding the problem

The reason why misinformation and disinformation spread so fast is that our most commonly used communication tools have been built in a way that promotes that kind of content over fact-checked, long-form, nuanced reporting.

According to the Washington Post, “Facebook programmed the algorithm that decides what people see in their news feeds to use the reaction emoji as signals to push more emotional and provocative content — including content likely to make them angry.”

When this is combined with the fact that “[Facebook’s] data scientists confirmed in 2019 that posts that sparked [the] angry reaction emoji were disproportionately likely to include misinformation, toxicity and low-quality news”, you get a tool fine-tuned to spread misinformation and disinformation. What’s worse, the more people get angry at a particular post, the more it spreads. The more angry commenters point out how false it is, the more the algorithm promotes it to others.
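The mechanism is easy to sketch. Below is a deliberately simplified toy model, not Facebook’s actual code (the weights are my own illustrative choice, though reporting suggested the angry reaction was at one point worth several times a like): reactions are weighted and summed, and whatever makes people angry floats to the top.

```python
# Toy feed-ranking model (illustrative only, NOT Facebook's actual code).
# Reactions are weighted and summed; "angry" carries extra weight, so
# outrage-inducing posts get promoted over merely well-liked ones.
REACTION_WEIGHTS = {"like": 1, "love": 1, "angry": 5}  # hypothetical weights

def rank_feed(posts):
    """Order posts by weighted engagement score, highest first."""
    def score(post):
        return sum(REACTION_WEIGHTS.get(reaction, 1) * count
                   for reaction, count in post["reactions"].items())
    return sorted(posts, key=score, reverse=True)

feed = rank_feed([
    {"id": "nuanced-report", "reactions": {"like": 100}},
    {"id": "outrage-bait",   "reactions": {"like": 10, "angry": 40}},
])
# outrage-bait scores 10 + 5 * 40 = 210, beating the nuanced report's 100,
# even though far more people actually liked the report.
```

Note that in such a model every angry comment pointing out that a post is false only adds engagement, feeding the very score that spreads it further.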

One could call this the “outrage dividend”, and disinformation benefits especially handsomely from it. It is related to “yellow journalism”, the type of journalism where newspapers present little or no legitimate, well-researched news, relying instead on eye-catching headlines to drive sales. The difference is that tabloids of the early 20th century didn’t get the additional boost from a global communication system effectively designed to promote this kind of content.

I am not saying Facebook intentionally designed its platform to become the best tool a malicious disinformation actor could dream of. This might have been (and probably was) an innocent mistake, an unintended consequence of the way the post-promoting algorithm was supposed to work.

But in large systems, even tiny mistakes compound to become huge problems, especially over time. And Facebook happens to be a gigantic system that has been with us for almost two decades. In the immortal words of fictional Senator Soaper: “To err is human, but to really foul things up you need a computer.”

Of course, the solution is not as simple as just telling Facebook and other social media platforms not to do this. What we need (among other things) is algorithmic transparency, so that we can reason about how and why exactly a particular piece of content gets promoted.

More importantly, we also need to decentralize our online areas of public debate. The current situation in which we consume (and publish) most of our news through two or three global companies, who effectively have full control over our feeds and over our ability to reach our audiences, is untenable. Monopolized, centralized social media is a monoculture where mind viruses can spread unchecked.

It’s worth noting that these monopolistic monocultures (in both the policy and software sense) are a very enticing target for anyone who would be inclined to maliciously exploit the algorithm’s weaknesses. The post-promoting algorithm is, after all, just software, and all software has bugs. If you find a way to game the system, you get to reach an enormous audience. It should then come as no surprise that most vaccine hoaxes on social media can be traced back to only 12 people.

Centralization obviously also relates to the ability of billionaires to just buy a social network wholesale or the inability (or unwillingness) of mainstream social media platforms to deal with abuse and extremism. They all stem from the fact that a handful of for-profit companies control the daily communication of several billion people. This is too few companies to wield that kind of power, especially when they demonstrably wield it so badly.

Alternatives do already exist. Fediverse, a decentralized social network, does not have a single company controlling it (and no shady algorithm deciding who gets to see which posts), and does not have to come up with a single set of rules for everyone on it (an impossible task, as former Twitter CEO, Jack Dorsey, admits). Its decentralized nature (there are thousands of servers run by different people and groups, with different rules) means that it’s easier to deal with abuse. And since it’s not controlled by a single for-profit company there is no incentive to keep bad actors in so as not to risk an outflow of users (and thus a drop in stock prices).

So we can start by at least setting up a presence in the Fediverse right now (following thousands of users who migrated there after Elon Musk’s Twitter bid). And, we can push for centralized social media walled gardens to be forced to open their protocols, so that their owners no longer can keep us hostage. Just like the ability to move a number between mobile providers makes it easier for us to switch while keeping in touch with our contacts, the ability to communicate across different social networks would make it easier to transition out of the walled gardens without losing our audience.

Media funding

As far as funding is concerned, entities spreading disinformation have at least three advantages over reliable media and fact-checking organizations.

First, they can be bankrolled by actors who do not care if they turn a profit. Second, they don’t have to spend any money on actual reporting, research, fact-checking, and everything else that is both required and costly in an honest news outlet. Third, as opposed to a lot of nuanced long-form journalism, disinformation benefits greatly from the aforementioned “outrage dividend” — it is easier for disinformation to get the clicks and generate ad revenue.

Meanwhile, honest media organizations are squeezed from every possible side. Not the least by the very platforms that gate-keep their reach, or provide (and pay for) ads on their websites.

Many organizations, including small public grant-funded outlets, find themselves in a position where they feel they have to pay Facebook for “reach”; to promote their posts on its platform. They don’t benefit from the outrage dividend, after all.

In other words, money that would otherwise go into paying journalists working for a small, often embattled media organization, gets funneled to one of the biggest tech companies in the world, which consciously built their system as a “roach motel” — easy to get in, very hard to get out once you start using it — and now exploits that to extract payments for “reach”. An economist might call it “monopolistic rent-seeking”.

Meanwhile, the biggest ad network operator, Google, uses their similar near-monopoly position to extract an ever larger share of ad revenues, leaving less and less on the table for media organizations that rely on them for their ads.

All this means that as time goes by it gets progressively harder to publish quality fact-checked news. This is again tied to centralization giving a few Big Tech companies the ability to control global information flow and extract rents from that.

A move to non-targeted, contextual ads might be worth a shot — some studies show that targeted advertising offers quite limited gains compared to other forms of advertising. At the same time, cutting out the rent-seeking middle man leaves a larger slice of the pie on the table for publishers. More public funding (perhaps funded by a tax levied on the mega-platforms) is also an idea worth considering.

Media education

Finally, we need to make sure our audiences can understand what they’re reading, along with the fact that somebody might have vested interests in writing a post or an article in a particular way. We cannot have that without robust media literacy education in schools.

Logic and rhetoric have long been banished from most public schools as, apparently, they are not useful for finding a job. Logical fallacies are barely (if at all) covered. At the same time both misinformation and disinformation rely heavily on logical fallacies. I will not be at all original when I say that school curricula need to emphasize critical thinking, but it still needs to be said.

We also need to update the way we teach to fit the current world. Education is still largely built around the idea that information is scarce and the main difficulty is acquiring it (hence its focus on memorizing facts and figures). Meanwhile, for at least a decade, information has been plentiful, and the difficulty lies in filtering it and figuring out which information sources to trust.

Solving the right problem, together

“Every complex problem has a solution which is simple, direct, plausible — and wrong”, observed H. L. Mencken. This describes well the push for seemingly simple solutions to the misinformation and disinformation crisis, like legislation making disinformation (however defined) “illegal”.

News and fact-checking communities have limited resources. We cannot afford to spend them on ineffective solutions — and much less on in-fighting about proposals that are both highly controversial and recognized broadly as dangerous.

To really deal with this crisis we need to recognize centralization — of social media, of ad networks, of media ownership, of power over our daily communication, and in many other areas related to news publishing — and poor media literacy among the public as crucial underlying causes that need to be tackled.

Once we do, we have options. Those mentioned in this text are just rough ideas; there are bound to be many more. But we need to start by focusing on the right parts of the problem.

Dealing with SEO Link Spam E-mails

Disclaimer: I am not a lawyer. I am not your lawyer. None of this is legal advice. All of this might also be a horribly bad idea.

Ah, SEO link spam e-mails. If you have a blog that’s been online longer than, say, three years, you know what I’m talking about:

Hey,

I read your article at <link-to-a-blogpost-of-mine> talking about <actually-not-the-topic-of-the-blogpost>. I think your readers would benefit from a link to <link-to-an-irrelevant-or-trivial-piece>.

Would you consider linking to our article?

For a long time I just ignored these, flagging them as spam and moving on. Obviously I am not going to link to some marketing crap that’s there only to drive up the SEO of some random site.

But then that one spammer showed up in my mailbox, and he was persistent. Several e-mails and follow-ups within a month. I decided I needed a better strategy.

What if I told them to pay for a link being placed on my blog?

I asked for input on fedi, and after quite a few useful suggestions and comments, I drafted what is now my standard template to deal with these kinds of requests.

The Template

Hey,

thanks for reaching out. My going rate for a link placed on my blog is $500 USD; I get to decide where and how I place it, and within what content. It will be placed in a regular blogpost, reachable by search engines, on the blog in question. It will stay up for at least a year. No other guarantees are made.

I require payment of half of the sum ($250, non-refundable) before I prepare the specific placement offer, for you to accept or reject. The placement, context and meaning of the link in the placement offer shall be determined at my sole and absolute discretion. There is no representation or warranty whatsoever as to whether the link is placed in a way that would imply an endorsement, or even fail to be an explicit or implied disparagement.

Once provided, the placement offer is final, and if rejected, I understand you are no longer interested in placing a link on my blog. At that point the initial payment is considered payment for my time and expertise in preparing the placement offer.

Once you accept the placement offer, I will put the link on-line within 10 business days, and I will expect payment in full within 20 business days of it going online. After that period interest will accrue at 12% p.a., compounded annually.

Please be advised that any further communication from anyone at <company-name-or-domain-spam-e-mail-was-sent-from> or in relation to <domain-of-the-link-being-peddled> that is neither a clear rejection of this deal nor acceptance of the terms as outlined herein (and discussion about invoicing or accounting technicalities) will accrue a $50 processing fee. Any further communication from anyone at <company-name-or-domain-spam-e-mail-was-sent-from> or in relation to <domain-of-the-link-being-peddled>, including communication apparently unrelated to the matter at hand, amounts to acceptance of these terms, regardless of when it takes place and who the sender is. Any and all disputes must be subject only to the law of my jurisdiction (Iceland) and handled solely in the courts therein.

Do let me know if you have any specific invoicing/accounting requirements. I am looking forward to doing business with you.

The Point

The point, obviously, is to limit the amount of SEO link spam e-mails I have to deal with. But of course if somebody decides to take me up on the offer, I am happy to pocket the $500 to publish a blogpost about how they just paid $500 for the privilege of being made fun of, by me.

Yes, I will link to where they ask, yes it will be reachable by search engines, but also: yes, the link might have the rel="sponsored nofollow" attribute set.

This is also somewhat the point of this very blogpost. Each and every SEO link spam e-mail claims that the sender “has read my site”. Well, if they did, they are now surely aware of what’s in store.

Finally, most SEO link spam e-mails mention you can “unsubscribe” by replying to them. I never “subscribed” to any of them in the first place, so that just feels wrong. More importantly though, I simply don’t trust the spammers to actually respect my request to be removed from their contacts database.

I do however trust that once they are informed that any further communication would cost them $50, they might not want to communicate further.

The Outcome

I have used the template several times over the last few months. I have not once heard back from any of the spammers that got served with it, and the overall amount of SEO link spam e-mails I receive seems to have gone down measurably — which might or might not be related to my use of the template, of course.

The Future

I would love to be able to charge SEO link spam e-mail senders even for the first e-mail they send me. So I am thinking of adding some kind of EULA to that effect to my blog.

I hate EULAs; I find the assumption that some terms are binding even if the visitor has not explicitly agreed to them (nor read them) to be asinine. But if that’s the world we live in, I might as well use it to make SEO link spam a bit more costly.

The Outrage Dividend

I would like to propose a new term: outrage dividend.

Outrage dividend is the boost in reach that content which elicits strong emotional responses often gets on social media and other content sharing platforms.

This boost can be related to human nature — an outrage-inducing article will get shared more. It can also be caused by the particular set-up of the platform a given piece of content is shared on — Facebook’s post-promoting algorithm was designed to be heavily biased to promote posts that get the “angry” reaction.

A tale of two media outlets

Imagine two media organizations.

A Herald is a reliable media organization, with great fact-checking, in-depth reporting, and so on. Their articles are nuanced, well-argued, and usually stay away from sensationalism and clickbaity titles.

B Daily is (for want of a better term) a disinformation peddler. They don’t care about facts, as long as their sensationalist, clickbaity articles get the clicks, and ad revenue rolls in.

Thanks to the outrage dividend, content produced by B Daily will get more clicks (and more ad revenue), as more people will engage with it simply because it’s exploiting our human nature; but it will also be put in front of more eyeballs because it causes people to be angry, and anger gets a boost (at least on Facebook).

Outrage Dividend’s compound interest

It gets worse: not only is B Daily’s content cheaper to produce (no actual reporting, no fact-checking, etc.), not only does it get promoted more on the platform due to the particular angry reaction it causes in people, but every time it gets fact-checked or debunked, that’s more engagement, and so even more reach.

Meanwhile, A Herald not only has to pay for expensive experts to do fact-checking, for reporters to do reporting, and so on, but also they feel they need to pay for reach, because their nuanced, in-depth, well-reasoned pieces get fewer clicks as they get promoted less by the platform’s algorithms.

Relation to tabloids / yellow journalism

There obviously is a relation here to yellow journalism and tabloids. I think it’s fair to say that these types of outlets use or exploit the outrage dividend for profit, basically basing their business model on it.

Of course, tabloid newspapers of (say) the early 20th century did benefit from the human side of the outrage dividend (which made them possible and profitable in the first place). But the rise of global, centralized platforms like Facebook, with content-promoting algorithms that can apparently be gamed in order to reach effectively unlimited audiences, has made the rift between how hard it is for nuanced content to reach a broad audience and how easy it is to spread disinformation and misinformation really problematic.

With all this in mind I think we need to seriously consider ways outrage dividend could be countered, and what options (technological, legislative, or other) are available for that.

FLOSS developers and open web activists are people too

I can’t believe I have to spell this out, but:
free/libre/open-source software developers and open web activists selflessly running independent services online are people too.

It seems this idea is especially difficult to grasp for researchers (including, apparently, whoever reviews and green-lights their studies). The latest kerfuffle with the Princeton-Radboud Study on Privacy Law Implementation shows this well.

“Not a human subject study”

The idea of that study seems simple enough: get a list of “popular” websites (according to the research-oriented Tranco list), send e-mails to e-mail addresses expected to be monitored for privacy-related requests (like privacy@example.com), and use that to assess the state of CCPA and GDPR implementation. Sounds good!

There were, however, quite a few problems with this:

Imagine you’re running a small independent social media site and you get a lawyery-sounding e-mail about a privacy regulation you might not even have heard about, that ends with:

I look forward to your reply without undue delay and at most within 45 days of this email, as required by Section 1798.130 of the California Civil Code.

Should you reach out to a lawyer? That can easily get costly, fast. Is it okay to ignore it? That could end in an even costlier lawsuit. And so, now you’re losing sleep over something that sounds serious, but turns out to be a researcher’s idea of “not a human subject study”.

Humanity-erasure

The study’s FAQ consistently mentions “websites”, and “contacting websites”, and so on, as if there were no people involved in running them nor in answering these e-mails. Consider this gem (emphasis mine):

What happens if a website ignores an email that is part of this study?

We are not aware of any adverse consequences for a website declining to respond to an email that is part of this study. We will not send a follow-up email about an email that a website has not responded to, and we will not name websites when describing email responses in our academic research.

Sadly, nobody told this to the volunteer admin of a small social media site, who is perhaps still worrying (or even spending money on a lawyer) over this. But don’t worry, the Princeton University Institutional Review Board has determined that the “study does not constitute human subjects research”. So it’s all good!

This is not the first time such humanity-erasure happens, either. Some time ago, researchers at the University of Minnesota conducted a study that involved submitting intentionally buggy patches to the Linux kernel.

They insisted that they were “studying the patching process”, but somehow missed the fact that that process involved real humans, many of whom volunteered time and effort to work on the Linux kernel. The developers were not amused.

Eventually, the researchers had to issue an apology for their lack of empathy and consideration for Linux kernel developers and their wasted time.

Tangent: taking “open” seriously

This is a bit tangential, but to me all this seems to be connected to a broader problem of people not treating communities focused on (broadly speaking) openness seriously.

In the case of the Princeton study, several Fediverse instance admins were affected. The University of Minnesota study affected Linux kernel developers. In both cases their effort (maintaining independent social media sites; developing a freely-licensed piece of software) was not recognized as serious or important – even if its product (like the Linux kernel) perhaps was.

I see this often in other contexts: people complain about Big Tech and “the platforms” a lot, but any mention of the Fediverse as a viable alternative (both in terms of the service itself and in terms of the funding model) is more often than not met with patronizing dismissal. We’ve been seeing the same for years regarding free software, too.

Meanwhile, a proven abuser like Facebook can pull a Meta and everyone will dutifully debate how insightful and deep a move this is.

Oh, the humanity!

It is quite disconcerting that researchers seem unable to recognize the humanity of FLOSS developers or admins of small, independent websites or services. It is even more disturbing that, apparently, this tends to fly under the radar of review boards tasked with establishing if something is or isn’t a human-subject study.

And it is disgraceful to abuse scarce resources (such as time and energy) available to volunteer admins and FLOSS developers in order to run such inconsiderate research. It alienates a privacy-conscious, deeply invested community at a time when research into privacy and digital human rights is more important than ever.

Blockchain-based consensus systems are an energy-waste ratchet

A lot has already been written about different aspects of why most distributed blockchain-based consensus systems are just… bad. And yet we are still able to find new such reasons. At least I think this is a new one. I have not seen it mentioned anywhere so far.

Distributed blockchain-based consensus systems, as they are currently implemented, are an energy-waste ratchet.

I am specifically talking about systems like Bitcoin and Ethereum, and any other system that:

  • is distributed;
  • lets their users control some kind of “assets” by tying these to their “wallets” until they spend them;
  • uses blockchain for consensus.

What’s in a wallet

When you have any assets on any such system, they are associated with some form of a wallet. That boils down to a file containing the private key, often password-protected, which needs to be stored somewhere safe. It is also necessary to have that file and the associated password in order to do anything with your assets.

We are, however, human, and as humans we are bad both at remembering passwords, and at keeping digital files safe for long periods of time. Passwords get forgotten. Harddrives fail or are thrown away.

And when that happens, there is no way to retrieve the assets in question. They’re lost, forever.
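A minimal sketch of this failure mode (toy crypto for illustration only; real wallet formats use more robust authenticated encryption schemes): the wallet file is the private key masked with a keystream derived from the password, so recovering the key, and thus the assets, requires both the file and the password.

```python
import hashlib
import os

# Toy "wallet" (illustration only; real wallets use authenticated encryption):
# the private key is XOR-ed with a keystream derived from the password.
def encrypt_wallet(private_key: bytes, password: str):
    salt = os.urandom(16)
    stream = hashlib.scrypt(password.encode(), salt=salt,
                            n=2**14, r=8, p=1, dklen=len(private_key))
    return salt, bytes(a ^ b for a, b in zip(private_key, stream))

def decrypt_wallet(salt: bytes, ciphertext: bytes, password: str) -> bytes:
    stream = hashlib.scrypt(password.encode(), salt=salt,
                            n=2**14, r=8, p=1, dklen=len(ciphertext))
    return bytes(a ^ b for a, b in zip(ciphertext, stream))

key = os.urandom(32)
salt, wallet_file = encrypt_wallet(key, "correct horse battery staple")

# With the file and the password, the key (and the assets) are accessible:
assert decrypt_wallet(salt, wallet_file, "correct horse battery staple") == key
# Forget the password (or lose the file) and the key is gone for good;
# a wrong guess yields garbage, and brute-forcing scrypt is infeasible:
assert decrypt_wallet(salt, wallet_file, "wrong guess") != key
```

There is no recovery path by design: no password reset, no support hotline, no one to appeal to.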

A wasteful ratchet

As time goes by and more people lose access to their wallets, more assets will be irretrievably lost. This is a one-way street, or in other words: a ratchet.

All those assets, now lost (like tears in… rain), nevertheless still took energy (sometimes an insane amount!) to mine or mint. Even if someone considers it worth it to use that energy on mining or minting in the first place, we can probably agree that for assets that get irretrievably lost, that energy has simply been wasted.

Mining capacity doesn’t go away with lost assets, though – and so that mining capacity (steadily growing, most of the time) is used to support transactions in a network with more and more assets that remain forever inaccessible.

Blockchain-based consensus systems inevitably waste energy on creating worthless, lost cryptoassets. With time, the amount of lost cryptoassets can only grow.

To make matters worse, for systems that are supply-limited (like Bitcoin) that also means that at some point the amount of lost cryptoassets will exceed the amount of still accessible ones.
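The ratchet is easy to simulate. The numbers below are illustrative assumptions, not real-world estimates: a fixed supply of 21 million units and a constant 1% of still-accessible assets lost each year.

```python
# Toy model of the energy-waste ratchet (illustrative parameters only):
# each year a small, constant fraction of still-accessible assets is lost
# forever to forgotten passwords and failed drives.
def simulate_lost_share(total_supply: float, annual_loss_rate: float,
                        years: int) -> list:
    """Return, for each year, the fraction of the fixed supply lost forever."""
    accessible = total_supply
    lost_shares = []
    for _ in range(years):
        accessible *= (1 - annual_loss_rate)
        lost_shares.append(1 - accessible / total_supply)
    return lost_shares

shares = simulate_lost_share(total_supply=21_000_000,
                             annual_loss_rate=0.01, years=100)

# The ratchet only turns one way: the lost share never decreases...
assert all(a <= b for a, b in zip(shares, shares[1:]))
# ...and with a fixed supply, lost assets eventually overtake accessible
# ones (in this toy model, after roughly 70 years).
assert shares[69] > 0.5
```

The actual loss rate is anyone’s guess, but whatever its value, the monotonicity holds: lost assets only ever accumulate.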

Why I like the Contract-Based Dependency Management idea

About a week ago, @tomasino published a post on his contract-based dependency management idea (aka CBDM), and I would be lying if I said I didn’t like it.

Not only does it provide a better model for dependency management than SemVer or any other versioning scheme, but it also:

  • provides strong incentive for developers to maintain extensive test suites for their own software;
  • provides strong incentive for developers to help developers of their project’s dependencies maintain extensive test suites, too;
  • provides very clear and unambiguous information on whether or not some functionality or behaviour of the dependency is officially supported by the dependency’s developers;
  • provides very clear and unambiguous information if some functionality or behaviour of the dependency has changed;
  • makes it very, very clear who done goofed if a dependency upgrade breaks a dependent project.

What’s CBDM?

The basic idea boils down to this: when deciding if a given version of a given dependency is compatible with a dependent piece of software, instead of relying on version numbers – rely on tests that actually verify the functionality and behaviour that piece of software actually depends on.

In other words, when considering updating dependencies of a project, don’t look at version numbers, but look at tests of the dependency (and their results).

Tomasino’s post goes into more detail and is well-worth a read.

What’s wrong with version numbers?

Version numbers are notoriously unreliable in predicting whether something will break after an upgrade. That’s the whole point of SemVer – to try to make them more reliable.

The problem is that it’s impossible to express, in just a few numbers, all the dimensions in which a piece of software might change. More importantly, certain changes might be considered irrelevant or minor by the developers, but might break projects that depend on some specific peculiarity.

Cue specifications, and endless debates about whether or not a particular change breaks the specification.

How could CBDM work in practice?

Let’s say I’m developing a piece of software, call it AProject. It depends on a library, say: LibBee. LibBee developers are Gentlefolk Scholars, and therefore LibBee has quite extensive test coverage.

As the developer of AProject I specify the dependency not as:

LibBee ver x.y.z

…but as:

LibBee, (list of upstream tests I need to be unchanged, and to pass)

(Bear with me here and let’s, for the moment, wave away the question of how exactly this list of upstream tests is specified.)

This list does not need to contain all of LibBee’s tests – in fact, it should not contain all of them, as that would effectively pin the current exact version of LibBee (assuming full coverage; we’ll get back to that). However, it should contain tests covering all of LibBee’s functionality and behaviour that AProject relies on.

This set of tests becomes a contract. As long as this contract is fulfilled by any newer (or older) version of LibBee I know it should be safe for it to be upgraded without breaking AProject.

What if a LibBee upgrade breaks AProject anyway?

I say “should”, because people make mistakes. If upgrading LibBee breaks AProject even though the contract is fulfilled (that is, all specified tests have not been modified, and are passing), there is basically only one possibility: AProject relied on some functionality or behaviour that was not in the contract.

That makes it very clear who is responsible for that unexpected breakage: I am. I failed to make sure the contract contained everything I needed. Thus a long and frustrating blame game between myself and LibBee’s developers is avoided. I add the additional test to the contract, and deal with the breakage as in any other case of a breaking change in a dependency.

AProject just got a better, more thorough dependency contract, and I didn’t waste any time (mine or the LibBee developers’) blaming anyone for my own omission.

Win-win!

What if the needed upstream test does not exist?

If a test does not exist upstream for a particular functionality or behaviour of LibBee that I rely on, it makes all the sense in the world for me to write it, and submit it as a merge request to LibBee.

When that merge request gets accepted by LibBee’s developers, it clearly means that functionality or behaviour is supported (and now also tested) upstream. I can now add that test to AProject’s dependency contract. LibBee just got an additional test contributed and has more extensive test coverage, for free. My project has a more complete contract and I can be less anxious about dependency upgrades.

Win-win!

What if the needed test is rejected?

If LibBee developers reject my merge request, that is a very clear message that AProject relies on some functionality or behaviour that is not officially supported.

I can either decide to roll with it, still add that test to the contract, and keep the test itself in AProject to check each new version of LibBee when upgrading; or I can decide that this is too risky, and re-write AProject to not rely on that unsupported functionality or behaviour.

Either way, I know what I am getting into, and LibBee’s developers know I won’t be blaming them if they change that particular aspect of the library – after all, I’ve been warned, and have a test to prove it.

You guessed it: win-win!

Abolish version numbers, then?

No, not at all. They’re still useful, even if just to know that a dependency has been upgraded. In fact, they probably should be used alongside a test-based dependency contract, allowing for a smooth transition from version-based dependency management to CBDM.

Version numbers work fine on a human level, and with SemVer they carry some reasonably well-defined information. They are just not expressive enough to rely on them for dependency management. Anyone who has ever maintained a large project with a lot of dependencies will agree.

Where’s the catch?

There’s always one, right?

The difficult part, I think, is figuring out three things:

  1. How does one “identify a test”?
  2. What does it mean that “a test has not changed”?
  3. How to “specify a test” in a dependency contract?

The answers to 1. and 2. will almost certainly depend on the programming language (and perhaps the testing framework used), and will in turn largely define the answer to 3.

One rough idea would be:

  1. A test is identified by its name (basically every unit testing framework provides a way to “name” tests, often requiring them to be named).
  2. If the code of the test changes in any way, the test is deemed to have changed. It probably makes sense to consider some linting first, so that whitespace changes don’t invalidate the contracts of all dependent projects.
  3. If a test is identified by its name, using that name is the sanest option.

I really think the idea has a lot of merit. Software development is becoming more and more test-driven (which is great!) – why not use that to solve dependency hell, too?

How (not) to talk about hackers in media

The Polish version of this entry was originally published by Oko Press.

Excessive use by the media of the words “hacker”, “hacking”, “hack”, and the like, whenever a story concerns information security, online break-ins, leaks, or cyberattacks, is problematic:

  1. It makes it hard to inform the public accurately about the causes of a given event, and thus makes it all but impossible to have an informed debate about it.
  2. It demonizes a creative community of tinkerers, artists, IT researchers, and information security experts.

Uninformed public debate

The first problem is laid bare by the recent compromise of a private e-mail account belonging to Michał Dworczyk, the Polish PM’s top aide.

Headlines like “Hacker attack against Dworczyk” or “Government hacked” put Mr Dworczyk and the government in the position of innocent victims, who got “attacked” by some assumed but unknown (and thus terrifying) “hackers”, who then seem to be the ones responsible.

How would the public debate change if the headlines were instead “Sensitive data leaked from an official’s insecure private account” or “Private e-mail accounts used for official government business”? Perhaps the focus would move to Mr Dworczyk’s outright reckless negligence (he did not even have 2-factor authentication enabled). Perhaps we would be talking about why government officials conduct official business using private e-mail accounts – are they trying to hide anything?

These questions are not hypothetical: after the leak became public, the Polish government immediately blamed “Russian hackers”.

The problem is bigger than that, though. Every time an Internet-connected device turns out not to have been made secure by the manufacturer (from light bulbs, through cars, all the way to sex toys), the media write about “hacking” and “hackers”, instead of focusing on the suppliers of the faulty, insecure product. In effect, energy and ink are wasted on debating “how to protect ourselves from hackers”.

On the one hand, this doesn’t help with solving the actual issues at hand (government officials not using secure government infrastructure, politicians not using most basic security settings, hardware manufacturers selling insecure products).

On the other: laws are written and enacted (like the Computer Fraud and Abuse Act in the USA) which treat tech-savvy, talented, and curious individuals as dangerous criminals and terrorists. As a result, security researchers who responsibly inform companies about security issues they find end up charged with “hacking crimes”.

Hacker community

A large part of these talented, tech-savvy people would call themselves “hackers”, though not all hackers are necessarily tech-savvy. A hacker is a curious person, someone who thinks outside the box, likes to transgress and to share knowledge: “information wants to be free”.

A hacker need not be an IT professional. MacGyver or Leonardo da Vinci are great examples of hackers; so is the Polish artist Julian Antonisz. They espouse creative problem solving and the drive to share and help others.

The Polish hacker community (like communities in other places) revolves around hackerspaces. Most of them are official, registered organizations (usually foundations or associations) with members, boards, and a registered address. Polish hackers have taken part in public debates, pressed thousands of medical visors and sent them (for free) to medical professionals fighting the pandemic, and organized hundreds of hours of cybersecurity training for anyone interested. They have even become the subject of a sociology paper.

Globally, hackers are just as active: they take part in public consultations, 3D-print missing parts for medical ventilators, or help Arab Spring protesters deal with Internet blocks.

It’s difficult to say when the hacker movement started – no doubt Ada Lovelace is a member, after all – but MIT’s Tech Model Railroad Club is often mentioned as an important place and time (the late 1940s and early 1950s) for the birth of modern hacker culture. Yes, the first modern hackers were model rail hobbyists. At that time, in communist Poland, we called such people “tinkerers”.

As soon as personal computers and the Internet started becoming popular, so did hacker culture (while also becoming somewhat fuzzy). The first hackerspaces emerged: spaces where hackers could dive into their hobbies and share knowledge. Places to sit with a laptop and focus, with WiFi, power, and coffee. Sometimes there’s a server room. Often – a wood- or metalworking workshop, 3D printers, an electronics workshop, a laser cutter. Bigger ones (like the Warsaw Hackerspace) have heavier equipment, like lathes.

Hackerspaces are an informal, global network of locations where members of the community, lost in an unfamiliar city, can get access to power and the Internet, and find friendly faces. Gradually some hackerspaces started associating into bigger hacker organizations, like the Chaos Computer Club in Germany. Related movements also sprang up: the free software movement, the free culture movement.

Eventually, Fab Labs and Makerspaces became a thing. These focus more on the practical, creative side of the hacker movement.

Borders here are blurry; many Fab Labs and Makerspaces do not self-identify as part of the hacker movement. In general, Makerspaces focus less on the hacker ethic and more on making things. They also tend to be less interested in electronics and programming. Fab Labs, in turn, are makerspaces less focused on building a community, and more on creating a fabrication laboratory available commercially to anyone who’s interested (and willing to pay).

Hacker ethic

There is no single, globally recognized definition of the hacker ethic – but there are certain common elements that pop up on almost any relevant list:

  • knowledge empowers, access to it should not be stifled (“information wants to be free”);
  • authority is always suspect, so is centralization (of knowledge, power, control, etc.);
  • the quality of a hacker is not judged based on skin color, gender, age, etc., but based on knowledge and skill;
  • practice is more important than theory.

Hackers are often keenly aware of the difference between something being illegal, and something being unethical. Illegal and unethical actions are way less interesting than illegal but ethical actions.

Hence hackers’ support for journalists and NGOs.

Hence tools like the Tor Project, SecureDrop, Signal, or Aleph, broadly used by journalistic organizations around the world, but started and developed by members of the hacker community.

And hence actions of groups like Telecomix, ranging from helping Tunisians and Egyptians circumvent Internet blockages, to swiping server logs proving that companies from the USA were helping the Syrian government censor the Internet and spy on Syrian citizens.

Why did Telecomix decide to publish these server logs? Because the Syrian government’s actions, and the actions of the co-operating Americans, were utterly unethical, and they used technology in ways not acceptable to hackers: blocking access to knowledge and stifling opposition. Hacker ethics in action.

Hackers and burglars

As with any ethical question, making value-judgments about such actions is not a black-and-white affair. The line between a hacker and a cybercriminal is fuzzy, and roughly defined by that not-entirely-clear hacker ethic. But that still does not make it okay to outright equate all hackers to cybercriminals.

A good synonym for the verb “hack” (in the hacker culture context) is “tinker”. Usually that means something completely innocent, like fixing one’s bicycle or installing new shelves in the garage. And while “tinkering” with somebody else’s door lock does sound quite shady, we still won’t say: “someone tinkered into my apartment and stole my TV set.”

There are hacker-burglars, just like there are tinkerer-burglars. And yet if a tinkerer breaks in somewhere, we’d call them a burglar. When a tinkerer steals something from someone, we’d call them a thief.

It would be absurd to claim some large robbery was perpetrated by a “gang of tinkerers” just because tools were used in the process.

We would not call a group of kids who break into the teachers’ lounge by forcing the lock with a screwdriver “tinkerers”.

And finally, we would also not speak of “tinkerers” when referring to a criminal group financed, equipped, and trained by a nation state which guides the group’s efforts.

And yet, somehow, we are not bothered by headlines like: “300 Lithuanian sites hacked by Russian hackers” or quotes along the lines of: “13-year-old boy hacked into school computer system to get answers to his homework.”

There is an important difference between an organized crime group (whether it is active on-line or off-line is a separate matter) and a state espionage unit. The Chinese thirteen-year-old has nothing in common with Russian cyber-spies, and these in turn don’t have much in common with a criminal gang demanding ransom on-line. Calling all of them “hackers” is neither informative, nor helpful.

Reality bytes

Outside of computer slang, the verb “hack” means “to chop, to cut roughly”. At some point at MIT the word started to be used as a noun meaning “a practical joke”, “a prank”, especially when referring to pranks which required inventiveness and dedication. In hacker culture it gained one additional meaning: “a perhaps not very elegant, but effective and ingenious solution to a problem”.

The “problem” could be the wrong voltage in the model railway tracks, or the Internet being blocked in Tunisia, or… no public access to a library of scientific papers. And since “information wants to be free”, somebody should fix that.

That, however, can easily be interpreted as a “cyberattack” – thanks to the aforementioned laws written in order to “defend from hackers”. This led to the persecution of Aaron Swartz: hacker, activist, co-founder of Reddit, creator of SecureDrop, and co-creator of the RSS format. After his death, JSTOR decided to make their library a bit more open to the public.

Had the hacker movement not been demonized so much, perhaps law enforcement agencies would treat that case differently, and Aaron would still be alive.


Frequently Asked Questions

How should people who break into individual and corporate systems with malicious intent be called?

“Crackers” or “cybercriminals”, if we’re talking about criminal break-ins. “Vandals” (perhaps with an adjective, like “digital”, “internet”, etc.), if we’re talking about breaking in and defacing a website – especially if it did not require high technical skill (as in the case of the notorious admin1 password on the Polish Prime Minister’s website during ACTA). “(Cyber)spies”, if we’re talking about attacks perpetrated or financed by, or otherwise connected to, nation state governments.

When in doubt, one can always call them “attackers”, “malicious actors”, etc.

Technical note: often there was no actual break-in at all! For example, in the case of the “young hackers” who allegedly “broke into” the servers of a Polish provider of cloud services for schools, the perpetrators “overloaded the servers, temporarily making it difficult to continue on-line classes”. It’s not that different from a group of people staging a sit-in in front of the school entrance – hardly a break-in!

When to actually call someone a hacker

In the same situations in which we would be inclined to call them a “tinkerer” if the event were not related to computers. This really is a very good model.

“[Tinkerers] broke into the glass case with school announcements and posted unsavory messages” – that doesn’t sound right, even if these vandals do call themselves “tinkerers”. So, also not: “[Hackers] broke into a website and defaced it.”

“[Tinkerers] manufactured 50,000 anti-COVID face shields and sent them to hospitals and other medical institutions” – that works. So, also: “hackers manufactured…”

“[Tinkerers] broke into a minister’s apartment” makes no sense at all. Neither does “hackers broke into a minister’s e-mail account”: you want “unknown perpetrators”, “attackers suspected of working with foreign intelligence services”, etc.

What are hackathons?

Hackathons are events where technically-skilled people try to solve certain problems or achieve some goal in a strictly limited time. Hackathons can be charity-focused (like Random Hacks of Kindness or Polish SocHack a few years ago), or focused on creating technological startups (like the Startup Weekend).

What is hacking, really?

Hacking is simply tinkering, although it does suggest that computers are being used (usually, but not always). No, really. You can check for yourself at your local hackerspace.

Isn’t the fight for this word already lost? Wouldn’t it be easier to just find a new word for this community?

We tried – “hacktivist” and “digital activist” did not come out of nowhere. But they immediately started being co-opted to mean “cybercriminal”, for example here:

“Activists or hacktivists are threat actors motivated by some political, economic, or social cause, from highlighting human rights abuse to internet copyright infringement and from alerting an organization for its vulnerabilities to declaring online war with people or groups whose ideologies they do not agree with”

There are examples of words that have been reclaimed by their communities. The LGBTQ+ movement successfully reclaimed several words that used to be slurs used against homosexual people (nobody in mainstream media would today use the f-word!). Similarly, the Black community in the USA successfully reclaimed the n-word.

Finally, and perhaps most importantly: why should we give up on this word without a fight at all? This is what we call ourselves, this is how this community refers to itself – are we not worthy of a completely basic measure of respect? Why should we silently accept being lumped in with criminals and spies, only because some people find it easier to type “hacker” than to figure out what actually happened in a particular case?

Breaking radio silence

After a long while (almost 5 years!), the blog is finally back online. And yes, I did at long last come to terms with the word “blog”. Also, the title was changed to what a major translation service spat out when fed the Icelandic information security law.

I admit it took way too much time for me to finally start working on bringing the blog back, and then again too much time to actually get it done. I probably did overthink stuff massively. As I am prone to do.

But hey, at least we can now have…

Nice things

There are Atom and RSS feeds; I am also considering adding a JSON feed. There is a Contents page with tag- and language-based filtering available, all implemented without a single line of JavaScript and no external includes. All the content from the old site is preserved and old URLs are redirected to new URLs (if the URLs got changed).

Care was taken to make the site usable with screen readers, and to be readable and useful even with CSS completely blocked. Go ahead, check how the site looks with CSS disabled! The one page where it is very difficult to make the pure markup nice and easy to use is the Contents page, due to its CSS-based interactivity, but even that page is not horrid, I hope, and I am eager to improve it.

I am also sure there is plenty I could improve for screen readers and other assistive technologies. Feedback welcome.

Plans

Eventually, I am planning to add a Tor Onion Service (with Alt-Svc or Onion-Location headers), Gemini site, and PDF/EPUB versions of each article. You can already get a source Markdown version for each post, see just below a post’s title, on the right.

The whole thing is a static site, so it won’t break due to a PHP version upgrade – which, as embarrassing as it is to admit, was the reason why the site went dark all those years ago. This also means I can add more interesting stuff later: put it behind Fasada, easily deploy Samizdat, or generate a zipfile to download and browse off-line for anyone who is so inclined.

I would like the site to become a bit of a showcase of different ways websites can be made resilient against web censorship. I don’t expect rys.io to be blocked anywhere, but making it such a showcase could perhaps help admins of other websites, more likely to be blocked, figure out ways to stay available for their readers.

You can read a bit more about the site (theme, header graphic, etc.) on the About page.

Blast from the past

After pondering this for quite a while, I decided to bring back all of the content that was available on the blog until it went under. All old content is tagged as ancient.

For some posts bringing them back was an obvious decision:

Subjectively on Anti-ACTA in Poland
A subjective historical record of the Anti-ACTA campaign in Poland, referenced by quite a few other sites.
Why I find -ND unnecessary and harmful
The No Derivatives versions of Creative Commons licenses are quite problematic. Here’s why.
How information sharing uproots conservative business models
Copyright was never really about authors’ rights. If the Internet is incompatible with copyright-based business models, it’s the business models that need to adapt.
Blurry line between private service and public infrastructure
The question of when does a private service become de facto public infrastructure (and what should be done about it) is exactly the question that needs answering now in the context of Big Tech.

Others are perhaps interesting in the context of the Fediverse, especially considering they were published years before the Fediverse was even a thing:

Breaking the garden walls
This was written with Diaspora and pre-Pump.io Identi.ca in mind, and it’s interesting to see how the Fediverse basically solves the first two steps mentioned in that post.
Diaspora-Based Comment System
A decade ago I advocated for a decentralized-social-media-based comment system for blogs, way before it was cool; the idea has since been implemented as ActivityPub plugins for WordPress and for Drupal.
Social blogosphere
Another take on the idea of decentralized social media enabled blogs.

Some are braindumps: summaries of experience I gained from particular workshops or through my activism. They might still be useful, although parts of them might not have aged all that well:

Border conditions for preserving subjectivity in the digital era
Summary of a workshop about subjectivity (that is: being a subject, not an object, of actions; having agency) online.
HOWTO: effectively argue against Internet censorship ideas
Eight years ago the Internet censorship landscape was similar, yet different in many interesting ways. Still, it is a useful snapshot of an activist’s perspective at a particular point in time.
Public consultations and anonymity
How do pseudonymity and anonymity work within a public consultation process? Can they bring value to it, even though they make accountability more difficult?

But then… then there are the other posts. The silly ones, or those published before I figured out this whole blogging thing (today they would be toots on the fedi instead). I struggled with those, but in the end decided to keep them for the histerical (sic!) record.

A lot of effort went into this site. I hope you enjoy reading it as much as I enjoyed creating it!

Centralisation is a danger to democracy

A version of this post was originally published on Redecentralized and VSquare.

After the violent events at the US Capitol, social media monopolists are finally waking up to the reality that centralisation is dangerous; with power over the daily communication of hundreds of millions of users comes responsibility perhaps too big even for Big Tech.

For years Facebook and Twitter were unwilling to enforce their own rules against those inciting violence, for fear of upsetting a substantial part of their userbase. Now, by banning the accounts of Donald Trump and peddlers of the QAnon conspiracy theory, they are hoping to put the genie back in the bottle and go back to business as usual.

Not only is this too little, too late – it needs to be understood as an admission of complicity.

After all, nothing really changed in President Trump’s rhetoric, or in the wild substance of QAnon conspiracy theories. Social media monopolists were warned for years that promoting this kind of content would lead to bloodshed (and it already has in the past).

Could it be that after the electoral shake-up what used to be an asset became a liability?

A “difficult position”

I have participated in many a public forum on Internet governance, and whenever anyone pointed out that social platforms like Facebook need to do more as far as content moderation is concerned, Facebook would complain that it’s difficult in their huge network, since regulation and cultures are so different across the world.

They’re not wrong! But while their goal was to stifle further regulation, they were in fact making a very good argument for decentralisation.

After all the very reason they are in this “difficult position” is their business decision to insist on providing centrally-controlled global social media platforms, trying to push the round peg of a myriad of cultures into a square hole of a single moderation policy.

Social media behemoths argued for years that democratically elected governments should not regulate them according to the will of the people, because it is incompatible with their business models!

Meanwhile, they were ignoring calls to stifle the spread of violent white supremacy, making money hand over fist by outright promoting extremist content (something their own research confirms).

Damage done to the social fabric itself is, unsurprisingly, just an externality.

Damned if you do, damned if you don’t

Of course, major social media platforms banning anyone immediately raises concerns about censorship (and those abusing these social networks to spread a message of hate and division know how to use this argument well). Do we want to live in a world where a handful of corporate execs control the de facto online public space for political and social debate?

Obviously we don’t. This is too much power, and power corrupts. But the question isn’t really about how these platforms should wield their power — the question is whether these platforms should have such power in the first place.

And the answer is a resounding “no”.

Universe of alternatives

There is another way. The Fediverse is a decentralised social network.

Imagine if Twitter and Facebook worked the way e-mail providers do: you can have an account on any instance (as servers are called on the Fediverse), and different instances talk to each other – if you have an account on, say, mastodon.social, you can still talk to users over at pleroma.soykaf.com or almost any other compatible instance.

Individual instances are run by different people or communities, using different software, and each has their own rules.

These rules are enforced using moderation tools, some of which are simply not possible in a centralised network. Not only are moderators able to block or silence particular accounts, but also block (or, “defederate from”) whole instances which cater to abusive users — which is inconceivable if the whole network is a single “instance”.

Additionally, each user has the ability to block or silence threads, abusive users, or whole instances, too. All this means that the response to abusive users can be fine-tuned. Because Fediverse communities run their own instances, they care about keeping any abuse or discrimination at bay, and they have the agency to do just that.

Local rules instead of global censorship

White supremacy and alt-right trolling were a problem on the Fediverse, too. Services like Gab tried to become part of it, and individual bad actors were setting up accounts on other instances.

They were, however, decisively repudiated by a combination of better moderation tools, communities being clear about what is and what is not acceptable on their instances, and moderators and admins being unapologetic about blocking abusive users or defederating from instances that are problematic.

This talk by technology writer and researcher Derek Caelin provides a pretty good overview of this (along with quite a lot of data); I can only recommend watching it in full.

Now, alt-right trolls and white supremacists are all but limited to a corner of the Fediverse almost nobody else talks to. While it does not prevent a dedicated group from talking hatefully among themselves on their own instance (like Gab), it does isolate them, makes radicalising new users harder, and protects others from potential abuse. They are also, of course, welcome to create accounts on other instances, provided that they behave themselves.

All that despite there not being a central authority to enforce the rules. Turns out not many people like talking to or platforming fascists.

Way forward

Instead of trying to come up with a single centrally-mandated set of rules — forcing it on everyone and acting surprised when that inevitably fails — it is time to recognise that different communities have different sensibilities, and members of these communities better understand the context and can best enforce their rules.

On an individual level, you can join the Fediverse. Collectively, we should break down the walls of mainstream social media, regulate them, and make monetising toxic engagement spilling into public discourse as onerous as dumping toxic waste into a river.

In the end, even the monopolists are slowly recognising that moderation in a global centralised network is impossible and that there is a need for more regulation. Perhaps everyone else should, too.

Needless haystacks

This is an ancient post, published more than 4 years ago.
As such, it might not anymore reflect the views of the author or the state of the world. It is provided as historical record.

I find that in most situations where any mishap is involved, especially with any large institutions in the picture, Hanlon’s razor tends to apply, and is a good working model to base assumptions on.

This has been the case with most Internet censorship debates in Poland, for instance. Assuming malice really wasn’t helping to get our point across.

Of needles and haystacks

This is why I am flabbergasted by the NSA’s (and the rest of the gang’s, too) insistence on gathering as much data as they can. Sure, to most regular Jacks or Jills, “you need the haystack to find the needle” might sound about right. A more observant person might, however, do a double-take: “wait, what?” When I’m searching for a needle, the last thing I want or need is an ever-larger haystack. Something’s fishy.

Then they might go the extra mile and dig a bit, finding out that the NSA’s data has no real impact on anti-terrorism efforts. Maybe they’ll even dig out a 2007 Stratfor report on the “obstacles to the capture of Osama”, pointing out things like:

[T]he Taliban and al Qaeda so far have used their home-field advantage to establish better intelligence networks in the area than the Americans.

And:

One big problem with this, according to sources, was that most of these case officers were young, inexperienced and ill-suited to the mission.

Or this gem:

This lack of seasoned, savvy and gritty case officers is complicated by the fact that, operationally, al Qaeda practices better security than do the Americans.

And while one of the sections of the report is indeed entitled “Needle in a Haystack”, it doesn’t exactly support the “we need the whole haystack” narrative of the NSA and its ilk. Because this narrative simply makes no sense. Why? Because math.

When we’re talking about searching large datasets for something, we need to account for false positives and false negatives. The larger the dataset, the larger a problem they become. But don’t take my word for it: Floyd Rudmin wrote a great analysis of this back in 2006:

Suppose that NSA’s system is really, really, really good, really, really good, with an accuracy rate of .90, and a misidentification rate of .00001, which means that only 3,000 innocent people are misidentified as terrorists. With these suppositions, then the probability that people are terrorists given that NSA’s system of surveillance identifies them as terrorists is only p=0.2308, which is far from one and well below flipping a coin. NSA’s domestic monitoring of everyone’s email and phone calls is useless for finding terrorists.

That’s right. Even if we assume amazingly good accuracy, the agency has a better chance of catching a terrorist by flipping a coin than by actually using the data they gather.
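Rudmin’s figure is straightforward to reproduce with Bayes’ theorem. A minimal sketch follows; the population of 300 million and the assumed 1,000 actual terrorists are my reading of his setup (300 million at a 0.00001 false-positive rate gives the 3,000 misidentified innocents he mentions, and 1,000 terrorists at a 0.90 hit rate yields his p = 0.2308):

```python
# A sketch of the base-rate calculation behind Rudmin's figure.
# Assumed inputs (my reading of his setup, not stated verbatim):
# population 300,000,000; actual terrorists 1,000;
# hit rate 0.90; false-positive rate 0.00001.

def posterior(population, targets, hit_rate, false_positive_rate):
    """P(target | flagged), via Bayes' theorem."""
    true_positives = targets * hit_rate
    false_positives = (population - targets) * false_positive_rate
    return true_positives / (true_positives + false_positives)

p = posterior(300_000_000, 1_000, 0.90, 0.00001)
print(round(p, 4))  # 0.2308 -- worse than a coin flip
```

The 900 real hits are simply drowned out by roughly 3,000 false alarms; no amount of extra haystack fixes that ratio.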

Unknown knowns and competent incompetence

That’s exactly why I am flabbergasted: usually that would be the point where I’d call upon Hanlon’s razor. But we have just assumed that NSA is really, really competent in what they’re doing, and what they’re doing is, in no small part, math.

So either they are very, very competent and understand that mass surveillance cannot work the way the NSA claims it is supposed to; or they are not competent enough to know this, in which case they lack the most basic skills needed to work with the datasets they have. They can’t have it both ways!

The third way

The scary possibility is that the NSA knows this full well, and yet they still gather the data. Why would they do that? Well, while it might not be all that useful for catching terrorists, it might be a game-changer in areas where the numbers are different. Again, Floyd Rudmin puts it best:

Also, mass surveillance of the entire population is logically plausible if NSA’s domestic spying is not looking for terrorists, but looking for something else, something that is not so rare as terrorists. For example, the May 19 Fox News opinion poll of 900 registered voters found that 30% dislike the Bush administration so much they want him impeached. If NSA were monitoring email and phone calls to identify pro-impeachment people, and if the accuracy rate were .90 and the error rate were .01, then the probability that people are pro-impeachment given that NSA surveillance system identified them as such, would be p=.98, which is coming close to certainty (p=1.00).
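The same Bayes calculation shows why the numbers flip when the target group is common rather than rare. This is a sketch under my reading of Rudmin’s figures (a 30% base rate, a 0.90 hit rate, a 0.01 false-positive rate); it yields roughly 0.975, in line with the p=.98 he quotes:

```python
# Bayes' theorem for Rudmin's impeachment scenario: the target
# group (pro-impeachment voters) makes up 30% of the population,
# so true positives vastly outnumber false alarms.
# Rates taken from the quote: hit rate 0.90, false-positive rate 0.01.

def posterior(base_rate, hit_rate, false_positive_rate):
    """P(target | flagged) for a target group with the given base rate."""
    true_positives = base_rate * hit_rate
    false_positives = (1 - base_rate) * false_positive_rate
    return true_positives / (true_positives + false_positives)

p = posterior(0.30, 0.90, 0.01)
print(round(p, 3))  # 0.975 -- close to certainty
```

Nothing about the surveillance system improved between this and the terrorist case quoted earlier; only the base rate of the thing being searched for changed.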

So are the NSA and other security agencies too incompetent to understand that mass surveillance is useless for its stated purpose, or are they competent enough to understand it, while the real purpose is just a bit different?

Neither possibility makes me feel safer. Or be safer, for that matter.