CNNs learn powerful features
...so much so that sometimes, we can exploit them!
Remember Gradient Descent?
To attack a network, do gradient ascent instead!
Fast Gradient Sign Method (FGSM)
$$ x_{adv} = x + \alpha\ \mathrm{sign}(\nabla_x L(f_\theta(x), y)) $$
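A minimal FGSM sketch in PyTorch, assuming a classifier `model`, a loss `loss_fn` (e.g. cross-entropy), and an input batch `(x, y)` with pixels in $[0, 1]$; the final clamp reflects that pixel-range assumption, not the formula itself.

```python
import torch

def fgsm(model, loss_fn, x, y, alpha):
    """One gradient-ascent step on the loss w.r.t. the input."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = loss_fn(model(x_adv), y)
    loss.backward()
    with torch.no_grad():
        # Step in the direction of the gradient's sign to increase the loss.
        x_adv = x_adv + alpha * x_adv.grad.sign()
    return x_adv.clamp(0, 1).detach()  # assumes inputs live in [0, 1]
```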
Projected Gradient Descent (PGD)
Iterated FGSM, clipping the result after each step back to $[x-\epsilon, x+\epsilon]$
$$x_{adv}^{t+1} = \mathrm{Clip}_{x, \epsilon} \Bigg(x_{adv}^{t} + \alpha\ \mathrm{sign}(\nabla_x L(f_\theta(x_{adv}^{t}), y))\Bigg) $$
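A sketch of the corresponding PGD loop under the same assumptions as the FGSM sketch above; each iteration is an FGSM step of size `alpha`, followed by a projection back into the $\epsilon$-ball around the original `x`.

```python
import torch

def pgd(model, loss_fn, x, y, epsilon, alpha, steps=10):
    """Iterated FGSM with projection onto [x - epsilon, x + epsilon]."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = loss_fn(model(x_adv), y)
        loss.backward()
        with torch.no_grad():
            x_adv = x_adv + alpha * x_adv.grad.sign()
            # Project back into the epsilon-ball, then into valid pixel range.
            x_adv = torch.min(torch.max(x_adv, x - epsilon), x + epsilon)
            x_adv = x_adv.clamp(0, 1)
        x_adv = x_adv.detach()
    return x_adv
```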
Attacks can be:
1. targeted or untargeted
2. black-box or white-box
3. evasion or poisoning
4. digital or physically realizable
What ethical implications might these attacks have for AI developers?
AI Ethics
Foundations for Ethical AI Development
Ethics
The study of moral principles governing behavior,
encompassing concepts of right and wrong.
AI Ethics
AI Ethics addresses the responsible development
and deployment of AI systems.
Why should we care?
Impact on society: prevent harm
Foster trust in AI systems
Avoid negative publicity
Legal consequences!
Key Principles
Transparency
Accountability
Privacy
Fairness
Key Principles
Transparency
Making AI algorithms and
decisions understandable
Privacy
Safeguarding user data
against leaks or threats
Accountability
Holding AI systems and
developers responsible
Fairness
Ensuring unbiased outcomes
for all user groups
Transparency
Openness in AI algorithms and decision-making processes
Making AI algorithms and decisions understandable
Especially important for non-linear classifiers, whose decisions are hard to interpret
Accountability
Holding AI systems/developers accountable
for the consequences of their decisions
Implementing mechanisms for tracing decisions
Related to explainability
When developing AI systems...
Ensure all stakeholders truly understand how your AI system works
Fairness
Ensuring unbiased outcomes for all user groups
Mitigating bias in training data and algorithms
Privacy
Mitigating risks associated with the misuse of personal data
Safeguarding user data against leaks or threats
This is my research area
Privacy
Imagine that a dataset containing your info leaks
It identifies you as having a certain disease
What would your insurance company do?
Privacy
Legal Safeguards
HIPAA, GDPR
Technical Safeguards
De-identification, Encryption, Secure Access Protocols
2006, The case of AOL
AOL released 20 million search queries of 650,000 users
Intended for research purposes
Identifying information (name, address, etc.) removed from data
What do you think happened next?
2006, The case of AOL
An NYT reporter tracked down Mrs. Thelma Arnold
of Georgia within weeks
Search queries included items like "landscapers in Lilburn, GA", "60s single men", "numb fingers"
Class action lawsuit, CTO resigns
2008, Netflix Recommendation Challenge
Netflix released movie ratings of 450,000 users
$1 Million prize
Any guesses?
2008, Netflix Recommendation Challenge
Combined with public IMDb ratings, users were re-identified
Netflix canceled the planned follow-up contest
2014, NYC Yellow Cab data
Trip data for 173 million rides
MD5 hash used on license & medallion numbers to anonymize drivers
How would you re-identify drivers?
2014, NYC Yellow Cab data
Medallion numbers were finite and structured
Entire dataset de-anonymized in 2 hours
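A toy sketch of why the hashing failed: MD5 is fast to compute and the medallion space is tiny, so all possible hashes can be precomputed and the "anonymized" IDs inverted by lookup. The digit-letter-digit-digit pattern below (e.g. 5X55) is one of the real medallion formats; a full attack enumerates each format in turn.

```python
import hashlib
from itertools import product
from string import ascii_uppercase, digits

# Precompute the MD5 of every medallion of the form digit-letter-digit-digit
# (10 * 26 * 10 * 10 = 26,000 candidates): a rainbow table built in seconds.
table = {}
for d1, ch, d2, d3 in product(digits, ascii_uppercase, digits, digits):
    medallion = f"{d1}{ch}{d2}{d3}"
    table[hashlib.md5(medallion.encode()).hexdigest()] = medallion

# Every "anonymized" ID in the released data is now a dictionary lookup.
assert table[hashlib.md5(b"5X55").hexdigest()] == "5X55"
```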
Privacy in the Genomic Era
Re-identification $\dots$, Venkatesaramani et al., Science Advances 2021
Genomic sequencing is increasingly common
23andMe, AncestryDNA
You may ask:
So what? Isn't it anonymized?
May be possible to link DNA sequences to other public info about the individual...
Any guesses?
Linking DNA to Face Images
Predict face features from an image
Use Naïve Bayes to predict most likely genome
Turns out, that works better than expected...
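A toy sketch of the Naïve Bayes linking step; the trait names and probability tables below are illustrative placeholders, not values from the paper. Each genome in the "anonymized" pool is scored by how well its genotypes explain the phenotypes predicted from the face image.

```python
import numpy as np

# Illustrative P(phenotype | genotype) tables: rows index the genotype
# (0/1/2 copies of the minor allele), columns index the phenotype category.
p_pheno_given_snp = {
    "eye_color":  np.array([[0.7, 0.3], [0.5, 0.5], [0.2, 0.8]]),
    "hair_color": np.array([[0.6, 0.4], [0.4, 0.6], [0.1, 0.9]]),
}

def log_score(genome, phenotypes):
    # Naive Bayes: traits are treated as conditionally independent, so the
    # joint log-likelihood is a sum of per-trait terms.
    return sum(
        np.log(p_pheno_given_snp[trait][genome[trait], obs])
        for trait, obs in phenotypes.items()
    )

pool = [  # the "anonymized" genomes, one genotype per trait-linked SNP
    {"eye_color": 0, "hair_color": 1},
    {"eye_color": 2, "hair_color": 2},
]
observed = {"eye_color": 1, "hair_color": 1}  # predicted from the face image
best_match = max(pool, key=lambda g: log_score(g, observed))
print(best_match)
```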
What can we do?
Adversarial Examples!
Adversarial Examples!
Add noise to face images
Fools phenotype predictors -
misclassifies skin color, hair color, etc.
Very little noise needed
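As a sketch of that defense, the `fgsm()` helper from earlier can be reused with a very small step size; `phenotype_model`, `face_image`, and `hair_color_label` are hypothetical names standing in for the paper's setup.

```python
# Hypothetical usage: a barely visible perturbation (alpha = 2/255) pushes
# the phenotype classifier away from the true hair-color label.
x_protected = fgsm(phenotype_model, loss_fn, face_image,
                   hair_color_label, alpha=2 / 255)
```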