How Data Analytics Will Change Thanks to SSI

Day/Session: Thursday 5B

Convener: Matt Norton & Chris Matichuk

Notes-taker(s): Alec Laws

Discussion notes, key understandings, outstanding questions, observations, and, if appropriate to this discussion: action items, next steps:

A huge part of the corporate world, generating value for companies:

- analytics

- data science

- big data

What does this look like when consumers own their data?

Contextual comfort: people still segment by demographics. Why not segment instead by how willing/comfortable people are to share data?

- what notif

- what to share/send them

People are more comfortable sharing data if they know what value the data provides to the business.

Why would you sign in with FB but not Instagram? Why can some companies share data?

Hypothesis: analytics transforms so that the company is more transparent about the value the data provides, and benefits the consumer in the moment (in context)

Ocean Protocol: siloed data for analytics (not SSI compliant)

Overlays: data capture schema is generic across the trust framework, with different overlays on the schema

- overlays are strong for SSI

- but ocean is ready to go

Q. Do they do the same thing, or are they complementary?

- complementary; needs a conversation between Sovrin/SSI ID and Ocean

- a lot of PI companies will give money/tokens for data

- does NOT like this: so little money that the time spent thinking about the risks makes it irrelevant

- avoid this discussion completely "it's a dumb idea"

- want to figure out the value that costs the person the least, but has the highest value to the business

- a lot of agreement here

SSI is about giving power back to user

- data is held by company to deliver value

- if you hold yourself, you have the full view

take the data and make it make sense to the user

- understand (really) what they share, and what value they get

Q. What if I don't have access to data of enough quality to make a decision?

- can't build a model (statistically relevant)

- must treat as a research project, different from the EULA that was agreed to

- not using for the EULA purpose

- can't get data to build model and solve problem (in reasonable amount of time)


- can't imagine when data purpose isn't explicitly stated

- SSI difference: db/system is internal; data is collected and mined, but not shared with partner (hotel/car) providers

    - without being explicit about opt in to the user

    - no more profile db with all info

GDPR exists everywhere

- same as PII (in CA/US/etc)

what is the use case in the moment?

Ocean/overlays are tech solutions, what about FB data?

- shift in economic incentive

- i.e. decentralized FB

- why does the user sign up (research trial)

"jobs to be done", hire company to accomplish a job

what am I hiring a data collector to do? just for one research model

- or are you a trust broker to downstream systems?

- do i trust you to share

Once data is collected/owned by user

- Willing to give company data and let them help you discover services?

clinical trial -> FDA requires data to be held for 10 years

- locker mechanism before it's pushed out decentralized

- if data is collected the proper way, it may automatically be pushed to a public data store

overlay essential to do that

Are we moving away from seeing monetizing data as a bad thing?

GDPR did not appear because we don't share data. We want to share data

- enables services, personal, usable in life

- therefore we accept ToS/EULA etc.

6000/50 person models

- statistics vs marketing

PII not all useful, labels/categories are

- want to know as many labels as possible

criteria search on hypothetical global data store

-> do you want to appear in the results?

-> yes -> pushed to result

-> no -> removed from the set
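The search-then-consent flow above can be sketched roughly as follows. This is only an illustration of the idea from the discussion, not a design anyone proposed: the `Holder` class, the `criteria_search` function, and the `wants_to_appear` callback are all hypothetical names, and the "global data store" is just an in-memory list.

```python
from dataclasses import dataclass


@dataclass
class Holder:
    pseudonym: str  # hypothetical opaque identifier, never returned to the searcher
    labels: dict    # labels/categories only, e.g. {"region": "CA"}


def criteria_search(store, criteria, wants_to_appear):
    """Return the labels of holders who match the criteria AND agree to appear."""
    results = []
    for holder in store:
        matches = all(holder.labels.get(k) == v for k, v in criteria.items())
        if not matches:
            continue
        # "do you want to appear in the results?" is asked per holder
        if wants_to_appear(holder, criteria):
            results.append(dict(holder.labels))  # yes -> pushed to result
        # no -> removed from the set (nothing is returned for this holder)
    return results


# Hypothetical data: three holders, two of whom would say yes if asked.
store = [
    Holder("a1", {"region": "CA", "age_band": "30-39"}),
    Holder("b2", {"region": "CA", "age_band": "40-49"}),
    Holder("c3", {"region": "US", "age_band": "30-39"}),
]
opted_in = {"a1", "c3"}

results = criteria_search(
    store,
    {"region": "CA"},
    lambda holder, criteria: holder.pseudonym in opted_in,
)
```

Only labels leave the store in this sketch; the pseudonym stays behind, which matches the later point that the buyer should never get name/address.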

i.e. Google pays for a 3M result set

smaller costs less

maybe every 'observation' costs a fixed value

pay premium for more PII associated with labels

- ex when did you default on the loan

- pay more for PII associated
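A minimal sketch of the pricing idea floated here: a fixed price per observation, plus a premium for each PII field attached to the labels. The function name `quote_cents` and the prices (1 cent per observation, 5 cents per PII field) are made-up placeholders, not figures from the session.

```python
def quote_cents(observations, pii_fields, base_cents=1, pii_premium_cents=5):
    """Price a result set in whole cents.

    Assumptions (hypothetical, for illustration only):
    - every observation costs the same fixed base price
    - each PII field attached to an observation adds a flat premium
    """
    if observations < 0 or pii_fields < 0:
        raise ValueError("observations and pii_fields must be non-negative")
    return observations * (base_cents + pii_fields * pii_premium_cents)


large_labels_only = quote_cents(3_000_000, pii_fields=0)  # the 3M result set
small_labels_only = quote_cents(1_000, pii_fields=0)      # smaller costs less
small_with_pii = quote_cents(1_000, pii_fields=1)         # premium for PII
```

Whether any PII should be purchasable at all is contested in the notes that follow; the sketch only shows how the premium would scale if it were.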

depends if consumer consents to how that data is used?

- if data is anonymized?

- you should never get name/address/PIIs

keep schema very basic, 'entry overlay' with prepopulated anon values

- everything else is free form

- PII auto encrypted, but associated with features

- if no overlay, same as PII

are we talking about buying data from orgs that collect data in SSI? that is ocean

- vs new model with SSI

SSI is about presenting credentials?

distributed trusted framework is good at identity, but broader through trusted relationships

idea that id is base of credential, this is problem with id space

- identity is based on what we do (with some certifying authorities)

- real ID is all of these data points

part of service layers is decentralized compute

- same problems because other computers use your info

- does this leak

-> homomorphic encryption

clinical trials, you have relationship with study, pairwise with study

- you can't participate in multiple trials once you start

- but data starts by going into each trial's 'locker'

lose transparency without reputation

- can tell who issued, but not how 'good/bad' they are

- tech gives proof data came from source, nothing to do with trust

this is the trust framework on top of digital tech

rwot, p2p degrees of trust

- missing from math model, subjectivity of model

- you capture trust at a time

- address is hard to prove due to conflicting data points

    - increase trust by having something to lose (life/money/rep)

    - some actors are inherently altruistic (but it's a minority)

    - most people are neutral (good or bad)

    - what is correct incentive (is this what trust is?)

- adding math doesn't solve problem with stake

the relationship/rep of the person you are interacting with is essential to how the org uses/shares the data

- as an issuer you must prove worth

    - this is implicit (common sense), but hard to capture in tech

    - how to show that entity can be trusted

how can individual/community affect rep of entity?

- ex yelp

- solved by broker between parties?

value for data is the important piece. but doesn't need to be monetary

how does mutual contractual protection help?

- rep is important/subjective/complicated 

in healthcare, they care about bottom line, drugs on shelves

- only hold data because of regulator (FDA)

- don’t want the data at all

- but they have a central id system (pims)

    - change it there, not at big pharma

with SSI, decay of companies who make money by selling their data

- Credit bureau needs to go, or need to be redefined?

    - scraping info without consent is wrong

    - they enable commerce (loans, etc)

    - companies need to evolve

if data only used for credit score, that’s fine

- but data has been exposed

- need better way, but not to go away

if we are for this, someone will be against it

"SSI will change everything"? don't think it will

- still need to give info to do business

- data shared removes anonymity

need to be able to prove ownership of data to user? ex cc number/cvc/exp. not necessarily

don't need tech perfection. let's be more pragmatic

- identity is about correlation, let user own it

- but must be usable

- correlation must be wanted and known by user

- possession vs knowledge based auth

    - possession + knowledge

    - must participate in transaction (authenticate)

the more expensive correlation is, the harder it is to establish rep

- need artifacts to base rep on

under SSI, data alone doesn't prove anything. need credential + proof

cc, printed private key on public card

online, do you know email address?

SSI moves it back to a real secret
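The contrast between "knowing a number printed on a card" and "holding a real secret" can be sketched with a challenge-response exchange. This is a loose stand-in, not how SSI proofs actually work: SSI stacks generally use asymmetric signatures, while this sketch uses a keyed MAC (HMAC) from the Python standard library, so the verifier shares the key here. The function names `respond` and `verify` are hypothetical.

```python
import hashlib
import hmac
import secrets


def respond(secret_key: bytes, challenge: bytes) -> bytes:
    """Holder side: prove possession of the key for THIS challenge only."""
    return hmac.new(secret_key, challenge, hashlib.sha256).digest()


def verify(secret_key: bytes, challenge: bytes, response: bytes) -> bool:
    """Verifier side: recompute the MAC and compare in constant time."""
    expected = hmac.new(secret_key, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)


# Assumption for this sketch: the key was shared between holder and verifier
# out of band. With real signatures the verifier would only hold a public key.
secret_key = secrets.token_bytes(32)

first_challenge = secrets.token_bytes(16)
first_response = respond(secret_key, first_challenge)
accepted = verify(secret_key, first_challenge, first_response)

# An eavesdropper replaying the old response against a fresh challenge fails,
# unlike a card number, which works again for anyone who has seen it once.
second_challenge = secrets.token_bytes(16)
replay_accepted = verify(secret_key, second_challenge, first_response)
```

The secret itself never crosses the wire; only a response bound to one challenge does, which is the property the card number lacks.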