Sensitive Information - SSN Detection #2208
Create new list for sensitive information
Add SSN detector to the list
…ue to allow returning nils
SensitiveInformationDetectors = initMapFromVelesPlugins([]velesPlugin{
	{ssn.NewDetector(), "sensitiveinformation/ssn", 0},
})
I created a new collection for the sensitiveinformation plugins.
 type Detector struct {
 	// The maximum length of the sensitive information.
-	maxLen uint32
+	MaxLen uint32
I modified the Detector struct so that all of its fields are exported (public).
		Sensitivity: sensitiveinformation.SensitivityLevelModerate,
	},
	Likelihood: sensitiveinformation.LikelihoodLikely,
	Raw:        bytes.Clone(b),
Up for discussion:
Assigning the incoming b byte slice directly to the Raw property breaks tests. This happens because slices share underlying memory and any subsequent modifications to b by the detector will also alter the Raw value.
Given that we need to clone the bytes anyway, maybe we could store strings instead of byte slices in the SensitiveInformation struct?
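To illustrate the aliasing issue described above, here is a minimal, self-contained sketch. The `finding` type and the `aliased`/`cloned` helpers are hypothetical stand-ins, not the actual SensitiveInformation struct or detector code from this PR; the only real API used is `bytes.Clone` (available since Go 1.20).

```go
package main

import (
	"bytes"
	"fmt"
)

// finding is a hypothetical stand-in for the SensitiveInformation struct.
type finding struct {
	Raw []byte
}

// aliased stores the caller's slice directly, so Raw shares its backing array.
func aliased(b []byte) finding { return finding{Raw: b} }

// cloned stores an independent copy of the bytes.
func cloned(b []byte) finding { return finding{Raw: bytes.Clone(b)} }

func main() {
	buf := []byte("123-45-6789")
	a := aliased(buf)
	c := cloned(buf)

	// Simulate the caller reusing its buffer after the finding was created.
	copy(buf, "XXX-XX-XXXX")

	fmt.Printf("aliased: %s\n", a.Raw) // prints "aliased: XXX-XX-XXXX"
	fmt.Printf("cloned:  %s\n", c.Raw) // prints "cloned:  123-45-6789"
}
```

Converting with `string(b)` would likewise produce an independent, immutable copy, which is why storing a string is a plausible alternative if callers never need to mutate the value.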
func NewDetector() veles.Detector {
	return simpleregex.Detector{
		MaxLen: maxSecretLength,
		Re:     ssnRe,
Given the format of SSNs is pretty distinctive, we decided against using additional KeywordsRe filtering.
Open to changing our mind.
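As a rough illustration of regex-only matching without a keyword filter, here is a small sketch. The pattern `hypotheticalSSNRe` and the helper `findSSNs` are assumptions made up for this example; the actual `ssnRe` in the PR is not shown here and may well be stricter.

```go
package main

import (
	"fmt"
	"regexp"
)

// hypotheticalSSNRe is an illustrative pattern only (NOT the PR's ssnRe):
// three digits, two digits, four digits, separated by hyphens, with word
// boundaries so that longer digit runs are not partially matched.
var hypotheticalSSNRe = regexp.MustCompile(`\b\d{3}-\d{2}-\d{4}\b`)

// findSSNs returns every substring of data that matches the pattern.
func findSSNs(data []byte) []string {
	var out []string
	for _, m := range hypotheticalSSNRe.FindAll(data, -1) {
		out = append(out, string(m))
	}
	return out
}

func main() {
	input := []byte("ssn: 123-45-6789, phone: 555-123-4567")
	fmt.Println(findSSNs(input)) // prints "[123-45-6789]"
}
```

The phone number is not reported because its 3-3-4 grouping does not fit the 3-2-4 shape. If false positives turn out to be a problem in practice, a keyword filter could be layered on top later.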
This PR introduces a first detector for sensitive information. It uses sensitiveInformation/common/simpleregex to detect Social Security Numbers. As it is the first entry using the sensitive information simple regex, I had to introduce some changes and patterns. I'll highlight them below in code comments.