myprotegeai
9 months ago
A company recently demoed to me that they have the ability to see the work history, credit report, and bank balance of a visitor that visits a site with some tracking code, in under 500ms. They use this information for a product that qualifies leads for sales teams, so the sales team knows who is a waste of time to go after and who isn't.
Creeps me the fuck out, and the owners seem to have no ethical qualms about buying, selling, and using this data.
next_xibalba
9 months ago
None of it is accurate and almost all of it is modeled from sparse, low quality training sets. Banks are not selling PII’ed account balance data to shady aggregators.
To me, the more interesting and outrageous story is how many aggregators are able to sell garbage data so successfully.
bbarnett
9 months ago
> Banks are not selling PII’ed a...
You know how some banks have a service which tells you how you spend your money? With graphs, 20% on power, 15% on food, etc?
That service is provided by a third party, who is given the data anonymized. A unique id number assigned. Yet it's trivial to deanonymize, and that's what happens.
All that is required is one buy with a points card, an airmiles card, and you are forever relinked to your data. It's how points cards make cash on the side, how air miles do. Exact time, date, amount, location of purchase is a great sync method.
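To make the "sync method" concrete, here is a minimal sketch of that kind of linkage, assuming two made-up datasets: an "anonymized" bank feed keyed by an opaque id, and a loyalty-card log keyed by a real name. Every field name and value below is hypothetical, and real pipelines are surely messier, but the core is just a join on (time, amount, location).

```python
from collections import defaultdict

# Hypothetical "anonymized" bank feed: names replaced by an opaque id.
bank_rows = [
    {"anon_id": "u-1", "ts": "2024-03-01T12:04", "amount": 23.17, "store": "S42"},
    {"anon_id": "u-1", "ts": "2024-03-02T09:30", "amount": 5.00, "store": "S07"},
    {"anon_id": "u-2", "ts": "2024-03-01T12:05", "amount": 23.17, "store": "S42"},
]

# Hypothetical loyalty-card log: one of the same purchases, tied to an identity.
loyalty_rows = [
    {"member": "Alice Example", "ts": "2024-03-01T12:04", "amount": 23.17, "store": "S42"},
]


def link(bank, loyalty):
    """Map anon_id -> member wherever (ts, amount, store) matches exactly one id."""
    index = defaultdict(set)
    for row in bank:
        index[(row["ts"], row["amount"], row["store"])].add(row["anon_id"])

    links = {}
    for row in loyalty:
        ids = index.get((row["ts"], row["amount"], row["store"]), set())
        if len(ids) == 1:  # only accept unambiguous matches
            links[next(iter(ids))] = row["member"]
    return links


links = link(bank_rows, loyalty_rows)
print(links)  # {'u-1': 'Alice Example'}
```

One matching purchase is enough: every other row carrying `u-1` is now attributable to the same person.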
If you pay for your phone with any form of traceable payment, they know who you are, your address, etc. From this, immense data is gleaned, such as lot value, neighborhood, and so on. Companies can even get your current location and geofence you, being alerted if you move in/out of a certain location.
Mobile phone companies sell this data/service via an easy API. Companies relink a phone from the app level via IMEI and number, which is sold to aggregators along with phone data (contacts, etc). The telco API links to real identity.
Once linked, forever linked.
Most people love free apps, and give up messages/SMS, contacts, and more to save a dollar on an app. From this, immense relationship data is gleaned, including likely employer and social circle.
Even if you are careful with your app permissions, certainly many acquaintances of yours aren't, so you get linked to their social circle, often with contact name/address.
This is just the simple stuff.
Source: I've dealt with these companies.
hammock
9 months ago
>Banks are not selling PII’ed account balance data to shady aggregators.
But is Plaid?
And banks do sell account balance data, they also sell credit and debit transaction history
Seattle3503
9 months ago
> But is Plaid?
Or any of those budgeting apps that integrate with your bank account.
prasadjoglekar
9 months ago
That's probably the signal. But as one of the parent posters said, the # of folks who use such budgeting apps is quite small. For advertising, small samples are useless, so this data has to be modeled to the full US population.
For that, they use this very biased training set. And almost always the independent variables used for modeling are 7-10 standard demographics.
dml2135
9 months ago
Seems like Plaid would be f’d six ways til Sunday if it got out that they were selling consumer data to 3rd parties, no? A huge part of their business model is based on trust and doing that would completely burn it.
hammock
9 months ago
dml2135
9 months ago
Sorry, maybe “third party” isn’t the correct term. Let me try to lay out my point a bit more clearly:
Plaid’s business model is — Company A needs a consumer’s data from Bank B. Plaid takes the consumer’s banking credentials, gets the data, and sells it to Company A.
At no point in this process does Plaid go and sell this data to another unrelated Company C. The lawsuit cited was about Plaid not sufficiently explaining its position between Company A and Bank B to the consumer. It was not about Plaid going and selling the data to the highest bidder.
salawat
9 months ago
How do you think they made money? It certainly wasn't from licensing their SDK that intentionally spoofed 3rd party banks in a way that deliberately misled users into assuming they were logging in with the bank directly instead of handing Plaid an access token that allows them to exfiltrate arbitrary transaction histories.
Any time you hear yourself utter the words "Wouldn't x be f'd if word got out that y"... You need to stop and consider that there is an entire industry around reputation management, and PR crisis management that is leverageable by the deep pocketed in order to keep their name out of news items, and that the favorite acquisition of the absurdly deep pocketed is the media outlet/platform.
Think. The world is full of scummy people looking to make a buck, and a much smaller number who worry about doing so honestly. Until you meet one of the rare ones who falls on their sword for their ideals, never assume the guy on the other side of the table is one until proven through deed.
dml2135
9 months ago
They make money through the fees they charge companies that pay for their service, so that they can get banking data from their consumers. Those fees are not cheap, so I do imagine they are doing most of the work to sustain the business right now.
I’m not saying “you should trust Plaid with your data” — absolutely, 100% not that. I imagine that’s how I’m being interpreted, hence all the downvotes.
What I’m saying is that at the present time, it does not seem to me that Plaid would be incentivized to do something that they explicitly say they are not doing. Plaid’s business model is, trust us to get your customers data and deliver it to you, and only you, safely. Selling it to Bob down the street on top of that would threaten their primary business model. And today, that primary business model is doing very well! So why threaten it?
Now, someday in the future, maybe that business model has stagnated, and line still needs to go up, so someone may get greedy and that may change. In fact, this is even likely to happen! But there will be signals that it is coming.
Even re: the issue of misleading users that they are not their bank — after they got slapped down on that one, their strategy changed. There is a new set of regulations around disclosure around these things, and Plaid is pushing them pretty hard. My guess is they had some hand in drafting these regs and are hoping to use a higher regulatory burden to build a moat against competitors.
But honestly, I’m kind of surprised at the lack of nuance in understanding how Plaid works, especially here on HN.
mike22
9 months ago
The value prop of Plaid, Yodlee, et al is that they can do this with one(-ish) API surface for tens of thousands of financial institutions. In their efforts to ensure Bob down the street won’t be sold any data, they do treat each customer (of the API, not the end users they pull data on behalf of) as an isolated tenant.
mystified5016
9 months ago
Pretty much no corporation in the last 40 years has suffered the consequences of their actions. Boeing has killed how many people and it's taking an act of Congress to even start talking about some consequences later, maybe.
nickff
9 months ago
Arthur Andersen went under after its accounting negligence: https://en.m.wikipedia.org/wiki/Arthur_Andersen
A few food companies have failed due to poor quality control: https://www.thestreet.com/retail/another-popular-ice-cream-b...
In fact, many companies go bankrupt every year: https://en.m.wikipedia.org/wiki/Bankruptcy_in_the_United_Sta...
gruez
9 months ago
>Pretty much no corporation in the last 40 years has suffered the consequences of their actions.
There's hundreds of regulatory actions taken by governments per year. That's "consequence" by definition.
brewdad
9 months ago
Fines of a few percent of the revenues generated aren’t enough of a deterrent.
hedvig23
9 months ago
That logic suffices as truth to you?
ethbr1
9 months ago
> None of it is accurate and almost all of it is modeled from sparse, low quality training sets. Banks are not selling PII’ed account balance data to shady aggregators.
Part of the problem though is that much of this data is persistent, across order-of-human-lifetime.
How often does your employer salary history have to be obtained to be useful? Maybe once every 10 years?
I have zero faith that in jurisdictions without national laws prohibiting it (and laws that prevent usage of extra-national data) that's not happening.
inkyoto
9 months ago
> Banks are not selling PII’ed account balance data to shady aggregators.
Banks might not be directly selling the transaction history, but they report the customer transaction history to Equifax and similar credit scoring agencies. Equifax certainly does onsell that to shady credit companies, which has happened to me twice, with letters in both cases stating in the footer, in a very small font size and in a very pale hue of grey, «provided by Equifax».
myprotegeai
9 months ago
Maybe they are using garbage data, but at least for the credit checks, he was running them on-demand at $0.75 a pop. He also mentioned browser fingerprint databases that he has purchased. Half of his job seemed to be processing and importing different databases that he had purchased.
pkphilip
9 months ago
I use an app called PayTM for online payments. It shows me notifications that I have rent pending on a flat which i rent when I have NEVER used it to pay rent ever. It also shows me that I have pending electricity bills. It also picks up and shows me data on how much credit card payment is due when I have never used it to pay credit card bills.
All of this information can come only through cooperation between banks, credit reporting companies, utilities etc.
Grimblewald
9 months ago
Any ideas on how I can make my metrics tank so I stop being marketed to so aggressively?
MavisBacon
9 months ago
Second. Had to get a spam blocker because I was getting like 5-10 calls/day from “debt consolidation” companies, which is a significant distraction.
The spam blocker is pretty powerful though, you aren’t getting past it unless you are in my contacts or have a # flagged as affiliated with a reputable business
ruined
9 months ago
free startup idea: trolley-solver-as-a-service.
integrate something like this with license plate data, property records, person recognition, and realtime location. when a self-driving automobile detects that it's out of control and unable to avoid imminent liability, it can make a cost-benefit analysis of each prospective casualty by querying an API that provides an avoidance score for each consumer and property in the vicinity. based on this score the client automobile will be able to identify a route of least liability. consumers may be encouraged to integrate with these services by assigning unidentified things a score of zero.
Grimblewald
9 months ago
Don't give them ideas. Given the things we've been seeing, you just know some nepo-CEO somewhere will read this and think it is A) their idea, and B) brilliant.
datavirtue
9 months ago
Damn, that was harsh.
ruined
9 months ago
hire me a patent lawyer
vundercind
9 months ago
The first time I saw a session replay of all the mouse movements and input of a user on their own fucking computer that some marketing website-spyware had recorded was the moment I decided the Internet was a mistake.
mason55
9 months ago
Pretty much every analytics product does this now. Amplitude, Statsig, Posthog, etc.
Not saying it’s a good thing but assume that most websites are recording your session at this point.
datavirtue
9 months ago
Another way for my mouse jiggler to add value.
jerlam
9 months ago
An intern at my company built a proof-of-concept of this within a month, under a mistaken direction to build "analytics tools". When the intern presented this to the team, everyone was horrified and we never brought it up again after the intern left.
rexarex
9 months ago
You mean the free product Microsoft Clarity that everyone uses?
vundercind
9 months ago
Nah, it was some smallish company’s SAAS thingy. This was maybe 2015.
a13n
9 months ago
fullstory
vundercind
9 months ago
It was already common then, I gather—the ex-developer-product-owner guy who showed it to me (in the course of doing something else) didn’t seem to think it was remarkable, just an assumed capability. I don’t recall the name of the product, but it’d record all the input and page content for an entire session, you could watch it play back like a video. Exactly like standing over someone’s shoulder while they used their computer. Creepy as fuck, but some genius renamed “spyware” to “telemetry” and that was enough to get every developer on board because we’re super insecure and will jump at the chance to pretend we’re building Mars rovers or something else real while we make yet another “app” the world doesn’t need (I suppose that’s why that label was so successful at changing attitudes, anyway)
jonhohle
9 months ago
Isn’t this how heatmaps were generated as far back as the late 2000s?
vundercind
9 months ago
Click-mapping came earlier, and there may have been a few places doing mouse-movement and cross-page-load session tracking for some sessions, but I don’t think it was a “just turn it on and leave it on” thing for even most large sites. And a lot of early heat maps came from user studies, which is the right way to do that.
[edit] also, that just happened to be the first time I’d seen a single session represented that way, rather than aggregates. Again, it wasn’t some brand-new thing then, it’d been around long enough to have multiple companies offering it as a service, not just an internal tool at a couple giants.
Grimblewald
9 months ago
time to make plugins that send fake mouse data, and have that draw nothing but hyper-realistic phalli.
datavirtue
9 months ago
Put this in your AI and smoke it.
XCSme
9 months ago
Are surveillance cameras in shops any different?
barryrandall
9 months ago
You can usually see the cameras, and many places require that you notify people they're being recorded.
sensanaty
9 months ago
We had one of these, Hotjar I think. To their (smallest possible) credit, there's 0 legible text in the replays, you basically only see the rough UI outlines and everything else is redacted. Wouldn't be surprised if it featured a keylogger though.
I asked our data team what the fuck they need this level of tracking for, and they said "wasn't us, it was marketing that requested it".
So I ask many of the marketing people, and they just say "oh we thought it could be useful!" Without actually clarifying the "how" or "why".
I removed that shit with a quickness after that, and no one's complained so far (duh)
I love the GDPR if nothing else because it scares the - excuse the vulgarity and ableism - retarded decision makers into not doing idiotic shit like this. For any kind of bullshit like this I just bring up GDPR as a shield these days and none of it goes through
ipdashc
9 months ago
> So I ask many of the marketing people, and they just say "oh we thought it could be useful!" Without actually clarifying the "how" or "why".
This stuff bugs me so much; it all feels so cargo-culty. Even ignoring privacy, I wonder how much money and computing power is burned on buying and collecting data that nobody needs and that doesn't actually serve any significant business purpose.
m463
9 months ago
What if it was your daughter?
22 years old, height proportional to weight, poor decision making skills.
What about your son?
I've seen this offered to young kids paying rent:
"Flex lets you pay rent on a schedule that works better for your monthly budget and frees up your cash flow."
"Help you pay rent on time. Improve your cash flow. Build your credit history."
zoltrix303
9 months ago
I had a similar experience once where a vendor demoed their tracking tech for advertising. This was in France (before GDPR) and they had partnered with many apps (weather apps and such) to access user locations. I don't remember the size of their target but it was a big chunk of the French population. They showed a map of Paris showing the day of a particular user from leaving their home: which route they took, how long they stood in front of which store, how long they spent inside others, etc. My boss at the time found the whole thing very exciting...
Mountain_Skies
9 months ago
While out hiking one day, I started thinking about buying a small ladder for the kitchen. When I got home that evening, I started seeing ads for ladders even though I had not searched for ladders, spoken to anyone about ladders, or even texted anyone about them. It was just a thought I had while hiking. Was it a coincidence or something else?
Finally figured it out a day later when reviewing my hike on the Fitbit app. At the end of my hike I forgot to shut off route tracking. On my way home, I had stopped by Walmart to grab a few things and while there, looked at their ladders. I could see on the app the path I took through the store, including when I stopped for a few minutes in front of the ladders. That was enough data to trigger ads for ladders for the next couple of days.
We leak data about ourselves constantly without realizing how much we're doing it or where it ends up going. Lots of it is also circumstantial and makes me wonder what erroneous ideas some of these databases might have accumulated over the years and who gets to see that "information". What happens if you walk through a part of town where there's an activist rally for "We Love Kitten Torture" going on? Do you forever get tagged in a bunch of databases as an animal torturer?
squigz
9 months ago
We don't leak data about ourselves. Companies specifically collect data about us, and then do whatever they want with it.
luckylion
9 months ago
"A visitor" as in "any visitor"? Or rather "a visitor", i.e. a specific one, about whom they already possess all this data and it's just a look up?
The latter I absolutely believe. The former I'd file under sci-fi marketing tales that anyone with some amount of knowledge about web technologies wouldn't fall for.
Justsignedup
9 months ago
Overheard a convo from our sales team "I reached out to a few people, just waiting for them to do more than 5 seconds of Google searching of us before we reach back out"
A4ET8a8uTh0
9 months ago
Wait.. physical site like a store or a web site? Not that either would make it that much better than the other, but you got me really curious.
riahi
9 months ago
This sounds like they are somehow identifying the user and querying theworknumber.
You can get a ton from a worknumber query.
anjel
9 months ago
Soon to be combined with palantir face recognition tech. No need to chip your citizenry!
raxxorraxor
9 months ago
They do get info on those that willingly share, but not the other ones.
Problem is that people share so much that those that do not start to stand out and might get penalized as well.
belter
9 months ago
"Jeffrey Epstein’s Island Visitors Exposed by Data Broker" - https://www.wired.com/story/jeffrey-epstein-island-visitors-...
tonetegeatinst
9 months ago
What data broker would even sell this data?
nipponese
9 months ago
Name the company please.
ranger_danger
9 months ago
Nothing like this exists for data on the general public and it would be illegal anyways. Either one of you is not aware of what that product actually is, or is being intentionally deceitful and spreading FUD.
bitnasty
9 months ago
Ever heard of the national public data breach?
advisedwang
9 months ago
https://support.microsoft.com/en-us/topic/national-public-da... does not mention work history, credit reports, or bank balances.
mixmastamyk
9 months ago
The Experian breaches did. ADP sells recurring payroll as well. Shouldn’t be too hard to cross reference.
juanani
9 months ago
[dead]
whycombinater
9 months ago
Just beat them to death.
Jury nullification.
Or vote, or whatever the site rules permit, good luck with that.
bofadeez
9 months ago
Sounds like vaporware. Might be possible for a negligibly small % of visitors. And even then cold outreach is not very effective.
drdaeman
9 months ago
It's basically the same as the classic approach of correlating salaries with ZIP codes, just with more parameters. Which sort of works statistically, because there are correlations, but is nothing more than a hallucination at individual visitor scale.
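A toy illustration of the gap between "works statistically" and "works for one visitor", using entirely made-up incomes for two hypothetical ZIP codes. The model here is just the per-ZIP mean, which is an assumption for the sketch and not what any particular vendor does.

```python
from statistics import mean

# Made-up incomes per ZIP code, purely illustrative.
incomes_by_zip = {
    "11111": [30_000, 35_000, 40_000, 250_000],
    "22222": [80_000, 85_000, 90_000, 95_000],
}


def zip_estimate(zip_code):
    """The 'model': predict the ZIP-level mean for everyone in that ZIP."""
    return mean(incomes_by_zip[zip_code])


def mean_abs_error(zip_code):
    """Average miss when the ZIP-level estimate is applied to each individual."""
    estimate = zip_estimate(zip_code)
    return mean(abs(income - estimate) for income in incomes_by_zip[zip_code])


for zip_code in incomes_by_zip:
    print(zip_code, zip_estimate(zip_code), mean_abs_error(zip_code))
```

Both ZIPs get nearly the same estimate, but in the first one the typical individual is off by almost as much as the estimate itself.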
bofadeez
9 months ago
That seems more realistic. But even if a marketer theoretically had access to atomic level detail on every single prospect, there's not much they can do to manufacture demand.
Humans are kind of smart and resistant to manipulation. Especially the ones with money.
drdaeman
9 months ago
> Humans are kind of smart and resistant to manipulation. Especially the ones with money.
I'm not sure. I think gaming/gambling industry having a concept of "whales" kind of disproves this.