CNNs learn powerful features



...so much so that sometimes, we can exploit them!

Remember Gradient Descent?


To attack a network, do gradient ascent on the input instead!



Fast Gradient Sign Method (FGSM)

$$ x_{adv} = x + \alpha\ \mathrm{sign}(\nabla_x L(f_\theta(x), y)) $$
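
In code, FGSM is a single gradient-ascent step on the input. A minimal PyTorch sketch, assuming a classifier `model`, images scaled to [0, 1], and an illustrative step size:

```python
import torch
import torch.nn.functional as F

def fgsm(model, x, y, alpha=8/255):
    """One gradient-ascent step on the input, in the direction sign(dL/dx)."""
    x = x.clone().detach().requires_grad_(True)   # x: images in [0, 1], y: true labels
    loss = F.cross_entropy(model(x), y)
    loss.backward()                                # populates x.grad
    x_adv = x + alpha * x.grad.sign()              # step that *increases* the loss
    return x_adv.clamp(0, 1).detach()              # keep pixels in the valid range
```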




Projected Gradient Descent (PGD)

Repeated FGSM, but clip the result back into $[x-\epsilon, x+\epsilon]$ after each step

$$x_{adv}^{t+1} = \mathrm{Clip}_{x, \epsilon} \Bigg(x_{adv}^{t} + \alpha\ \mathrm{sign}(\nabla_x L(f_\theta(x_{adv}^{t}), y))\Bigg) $$
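
A sketch of the iterated version, again in PyTorch and under the same assumptions ($\epsilon$, $\alpha$, and the step count are illustrative defaults):

```python
import torch
import torch.nn.functional as F

def pgd(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """Iterated FGSM; after each step, project back into the
    L-infinity ball of radius eps around the original x."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps)  # Clip_{x, eps}
        x_adv = x_adv.clamp(0, 1)                               # valid pixel range
    return x_adv
```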

Attacks can be:



    1. targeted or untargeted

    2. black-box or white-box

    3. evasion or poisoning

    4. digital or physically realizable

What ethical implications might these attacks have for AI developers?

AI Ethics


Foundations for Ethical AI Development

AI Ethics



The study of moral principles governing behavior,
encompassing concepts of right and wrong.



AI ethics addresses the responsible development
and deployment of AI systems.

Why should we care?



Impact on society: prevent harm


Foster trust in AI systems


Avoid negative publicity


Legal consequences!


Key Principles



Transparency


Accountability


Privacy


Fairness


Key Principles



Transparency: Making AI algorithms and decisions understandable


Accountability: Holding AI systems and developers responsible


Privacy: Safeguarding user data against leaks or threats


Fairness: Ensuring unbiased outcomes for all user groups

Transparency



Openness in AI algorithms and decision-making processes


Making AI algorithms and decisions understandable


Especially important for non-linear classifiers (e.g., deep networks), whose decisions are hard to interpret

Accountability



Holding AI systems/developers accountable
for the consequences of their decisions


Implementing mechanisms for tracing decisions


Related to explainability

When developing AI systems...



Ensure all stakeholders truly understand how your AI system works

Fairness



Ensuring unbiased outcomes for all user groups


Mitigating bias in training data and algorithms


Privacy



Mitigating risks associated with the misuse of personal data


Safeguarding user data against leaks or threats



This is my research area

Privacy



Imagine that a dataset containing your info leaks


It identifies you as having a certain disease


What would your insurance company do?


Privacy



Legal Safeguards

HIPAA, GDPR


Technical Safeguards

De-identification, Encryption, Secure Access Protocols


2006, The case of AOL



AOL released 20 million search queries from 650,000 users


Intended for research purposes


Identifying information (name, address, etc.) removed from data


What do you think happened next?


2006, The case of AOL



An NYT reporter tracked down Mrs. Thelma Arnold
of Georgia within weeks


Search queries included items like "landscapers in Lilburn, GA", "60s single men", "numb fingers"


Class action lawsuit, CTO resigns

2008, Netflix Recommendation Challenge



Netflix released the movie ratings of 450,000 users


$1 Million prize


Any guesses?


2008, Netflix Recommendation Challenge





Combined with public IMDb ratings, users were re-identified


Netflix canceled the planned follow-up contest


2014, NYC Yellow Cab data



Trip data for 173 million rides


MD5 hashes of license & medallion numbers used to anonymize drivers


How would you re-identify drivers?


2014, NYC Yellow Cab data





Medallion numbers were finite and structured


Entire dataset de-anonymized in 2 hours
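
A sketch of why this fails: the medallion space is so small that every possible value can be hashed up front and used as a reverse-lookup table. The pattern below (digit, letter, digit, digit) is one of the documented medallion formats; treat the code as illustrative.

```python
import hashlib
from itertools import product
from string import ascii_uppercase, digits

def build_lookup():
    """Hash every possible medallion of the form digit-letter-digit-digit
    (e.g. "9A99") and map hash -> original medallion."""
    lookup = {}
    for d1, letter, d2, d3 in product(digits, ascii_uppercase, digits, digits):
        medallion = f"{d1}{letter}{d2}{d3}"
        lookup[hashlib.md5(medallion.encode()).hexdigest()] = medallion
    return lookup

hash_to_medallion = build_lookup()   # ~26,000 entries, built in well under a second
# "Anonymized" value from the released trip data -> original medallion:
# medallion = hash_to_medallion.get(hashed_value_from_released_data)
```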


Privacy in the Genomic Era

Re-identification $\dots$, Venkatesaramani et al., Science Advances 2021



Genomic sequencing is increasingly common


23andMe, AncestryDNA



Privacy in the Genomic Era



You may ask:
So what? Isn't it anonymized?

May be possible to link DNA sequences to other public info about the individual...


Any guesses?

Linking DNA to Face Images



Predict face features from an image


Use Naïve Bayes to predict the most likely genome




Turns out, that works better than expected...
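
A toy sketch of the matching idea: score each candidate genome by the naive-Bayes likelihood of the traits predicted from the face image. All trait names and probabilities below are made up for illustration; a real attack would rely on genotype-to-phenotype models.

```python
import math

def best_genome(predicted_traits, genome_trait_probs):
    """Naive-Bayes matching: score each genome by the log-likelihood of the
    predicted traits, assuming trait independence and a uniform prior."""
    def log_likelihood(trait_probs):
        return sum(math.log(trait_probs[t].get(v, 1e-6))   # smooth unseen values
                   for t, v in predicted_traits.items())
    return max(genome_trait_probs, key=lambda g: log_likelihood(genome_trait_probs[g]))

# Traits predicted from the face image (illustrative):
predicted = {"hair_color": "brown", "eye_color": "blue"}
# P(trait value | genome) for each candidate genome (illustrative):
genomes = {
    "genome_A": {"hair_color": {"brown": 0.7, "black": 0.3},
                 "eye_color": {"blue": 0.6, "brown": 0.4}},
    "genome_B": {"hair_color": {"black": 0.9, "brown": 0.1},
                 "eye_color": {"brown": 0.8, "blue": 0.2}},
}
print(best_genome(predicted, genomes))   # -> genome_A
```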


What can we do?


Adversarial Examples!



Add noise to face images


Fools phenotype predictors:
misclassifies skin color, hair color, etc.


Very little noise needed
