Welcome to In Depth Defense. In Depth Defense LLC is a privately owned Information Security Consulting company owned and operated by Mark Baggett. In Depth Defense specializes in Penetration Testing and Incident Response. At this time In Depth Defense is not accepting any new client work, but we are happy to speak to you and point you to other resources in the community.

Mark Baggett has been active in Information Security for 18+ years. I've served in a variety of roles from software developer to CISO. You can find archives of older blog entries below and read my newer posts on http://www.pauldotcom.com, http://isc.sans.edu and http://pen-testing.sans.org








Monday, May 28, 2007

What are the last 4 digits of your SSN?

Note to readers: This very old blog entry still gets a fairly high amount of traffic. Rarely, but sometimes, I get nasty emails from people telling me that I am "teaching people to steal identities". If we assume that I am the only person in the world capable of reading the instructions published on the Social Security Administration's (SSA) website, which explain the same thing I have here, then I agree with you. But if that were true, then people not smart enough to understand the SSA's website would probably be stumped by this as well, so we are still safe. If I were the only one who could understand that website, that might make me the smartest person in the world!! (In which case, who are you to question my wisdom?) I asked my wife about the possibility that I was the smartest person in the world and she confirmed for me that I am not (she is still laughing for some reason).
The purpose of this site is to make people aware of the danger of simply sharing the last 4 digits of your SSN. When a company asks you for your last 4 digits, tell them no and send them a copy of this site. Your SSN is a form of identification (like an account name), not authentication (like a password). The problem is that many places today use a very predictable piece of data to authenticate who you are (i.e. your SSN as a 'password'). Now that you know it is predictable (which won't change whether this blog exists or not), you can fight to change the way organizations use it (which could change with an educated populace). Screaming fire in a crowded movie theater is a good thing when the building is on fire. Don't shoot the messenger.

Follow me on Twitter for information security and hacker news:  @MarkBaggett

Original blog entry:
“What are the last 4 digits of your SSN?” Nowadays, it seems to be accepted as a standard question to validate your identity. But throw in “What is your date of birth?” and “What is your birth place?” and you may have given away your identity. I don't think it would be uncommon to find those three questions asked together in many cognitive password reset systems. Last week I answered the question with my bank, and it made me wonder how predictable my SSN is given the rest of the information my bank has on me. I did a little research and sure enough, it seems feasible to me that with a few pieces of info and your last four, an attacker could reasonably predict your SSN. The number of permutations is certainly low enough to make a brute force attack feasible.

First of all, let's clarify something. The question “Where were you born?” is probably a good stand-in for the question that actually needs to be asked, which is “In what state did you apply for a SSN?” And “What is your birth date?” is not as accurate as “What date did you apply for a SSN?” But from a non-scientific poll of people I have asked, it seems that you're probably close enough.

Now let's look at predicting your SSN…

Your SSN is in the following format: AAA-BB-CCCC. AAA is a number that represents the state in which you applied for the SSN. These numbers are well documented and available on the Social Security Administration's website. For example, were you born in Nevada? Your SSN starts with 530. New Mexico? 525. Most states have a range of a few numbers. But let's say you were issued your SSN in New Mexico and you gave me your last 4; with no other information, only the two middle digits are unknown, so it will take at most 99 guesses to guarantee I predict your SSN. In 1973 these numbers became even more closely tied to your geography. Now all numbers are issued by the central office in Baltimore based upon the ZIP code of the submitter. So those numbers can be predicted based upon the date your SSN was applied for and/or your ZIP code. But brute forcing 99 whole possibilities, that could take a while. Perhaps it's even easier than that.
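The arithmetic behind that figure can be sketched in a few lines of Python. This is only a counting exercise: the function name and its parameters are my own illustrative choices, and it assumes the middle group runs from 01 to 99 and the serial from 0001 to 9999.

```python
def candidate_count(area_numbers: int, serial_known: bool = True) -> int:
    """Count the SSNs consistent with what is already known.

    area_numbers: how many first-three-digit values are plausible (hypothetical input).
    serial_known: True if the last four digits have been given away.
    """
    groups = 99                             # middle two digits: 01-99
    serials = 1 if serial_known else 9999   # last four digits: 0001-9999
    return area_numbers * groups * serials


print(candidate_count(1))                      # one area number, last 4 known
print(candidate_count(1, serial_known=False))  # one area number, last 4 unknown
```

Giving away the last four shrinks the space for a single area number from 989,901 possibilities to 99, which is the whole point of this post.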

The second group of digits (BB) is handed out in a semi-sequential, but still chronological, order. Therefore, with the correct insight into which numbers were issued at what time, you could predict this information. A good explanation of how these numbers are issued is in the “GROUP NUMBER” section on this site. http://www.usrecordsearch.com/ssn.htm
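Here is a minimal sketch of that "semi-sequential" order. It assumes the issuing sequence described in the GROUP NUMBER section linked above (odd 01-09, then even 10-98, then even 02-08, then odd 11-99); that sequence is not spelled out in this post, and the function names are my own.

```python
def group_issue_order() -> list[int]:
    """Return the 99 group numbers in the assumed order of issue."""
    odd_low = list(range(1, 10, 2))     # 01, 03, 05, 07, 09
    even_high = list(range(10, 99, 2))  # 10, 12, ..., 98
    even_low = list(range(2, 9, 2))     # 02, 04, 06, 08
    odd_high = list(range(11, 100, 2))  # 11, 13, ..., 99
    return odd_low + even_high + even_low + odd_high


def groups_issued_so_far(high_group: int) -> list[int]:
    """Groups that could already be in use once high_group is the latest one."""
    order = group_issue_order()
    return order[: order.index(high_group) + 1]


print(len(group_issue_order()))      # 99
print(len(groups_issued_so_far(4)))  # 52
```

Under that assumption, an area number whose latest group is 04 has only ever used 52 of the 99 possible middle values, and a rough application date narrows it further.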

So what would it take to build a database of middle numbers and when they were issued? Well, it looks like the SSA has already done that for us and published it on their website. They have what they refer to as the “High group number”. Every month they list the highest middle digits issued so far for each of the geographic codes. The numbers can be found here…

http://www.ssa.gov/employer/ssnvhighgroup.htm

So in April 2006 the middle digits for the first three state codes (born in New Hampshire) were:
001 (First 3 digits) = 04 (Middle 2 digits)
002 = 02
003 = 02

Then in May of 2006 they became
001=04
002=04 (the next group according to their sequence)
003=02

In October of 2006 geographic code 003 began issuing numbers with 04 as the middle two:
001=04
002=04
003=04

In May of 2007 geographic code 001 began issuing numbers with 06 as the middle two:
001=06
002=04
003=04

Today the history of "high groups" only dates back to November 2003 on the main website. But 4 years seems to be long enough to determine how quickly the digits in various geographical areas change. That information, combined with data from other public sources such as the number of births in a state in a given year, would be helpful in establishing a prediction database.
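As a rough sketch of what such a prediction database might look like, the snapshots quoted above can be stored per geographic code and queried by date. The `HISTORY` structure and `high_group_as_of` function are hypothetical names of mine; the only data is the four months listed in this post.

```python
from bisect import bisect_right

# (year, month) -> high group, per geographic code, copied from the figures above.
HISTORY = {
    "001": [((2006, 4), 4), ((2006, 5), 4), ((2006, 10), 4), ((2007, 5), 6)],
    "002": [((2006, 4), 2), ((2006, 5), 4), ((2006, 10), 4), ((2007, 5), 4)],
    "003": [((2006, 4), 2), ((2006, 5), 2), ((2006, 10), 4), ((2007, 5), 4)],
}


def high_group_as_of(area, year, month):
    """Return the latest recorded high group at or before the given month, or None."""
    records = HISTORY[area]
    dates = [date for date, _ in records]
    i = bisect_right(dates, (year, month))  # number of snapshots on or before the date
    return records[i - 1][1] if i else None


print(high_group_as_of("003", 2006, 8))  # 2
print(high_group_as_of("001", 2007, 6))  # 6
print(high_group_as_of("002", 2005, 1))  # None (before the first snapshot)
```

With a full monthly history loaded instead of four snapshots, the same lookup would say which middle digits were current in a given area at a given time.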

Reading these descriptions, it is obvious that numbers are issued chronologically based upon the geography of the requester. So how difficult would it be for a computer to either accurately predict the number or come close enough that a brute force attack is reasonable?


Here is a table of states and SSN geographic codes

001-003 NH    400-407 KY    530     NV
004-007 ME    408-415 TN    531-539 WA
008-009 VT    416-424 AL    540-544 OR
010-034 MA    425-428 MS    545-573 CA
035-039 RI    429-432 AR    574     AK
040-049 CT    433-439 LA    575-576 HI
050-134 NY    440-448 OK    577-579 DC
135-158 NJ    449-467 TX    580     VI (Virgin Islands)
159-211 PA    468-477 MN    581-584 PR (Puerto Rico)
212-220 MD    478-485 IA    585     NM
221-222 DE    486-500 MO    586     PI (Pacific Islands*)
223-231 VA    501-502 ND    587-588 MS
232-236 WV    503-504 SD    589-595 FL
237-246 NC    505-508 NE    596-599 PR (Puerto Rico)
247-251 SC    509-515 KS    600-601 AZ
252-260 GA    516-517 MT    602-626 CA
261-267 FL    518-519 ID    627-645 TX
268-302 OH    520     WY    646-647 UT
303-317 IN    521-524 CO    648-649 NM
318-361 IL    525     NM
362-386 MI    526-527 AZ
387-399 WI    528-529 UT

* Guam, American Samoa, Philippine Islands, Northern Mariana Islands

650-699 unassigned, for future use
700-728 Railroad workers through 1963, then discontinued
729-799 unassigned, for future use
800-999 not valid SSNs. Some sources have claimed that numbers
above 900 were used when some state programs were converted
to federal control, but current SSA documents claim no
numbers above 799 have ever been used.
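For anyone who wants to work with the table rather than read it, here is a small sketch that turns the state rows above into a lookup. The names `TABLE`, `parse_table`, `AREA_TO_STATE` and `areas_for` are my own; the data is just the table, retyped.

```python
TABLE = """
001-003 NH  004-007 ME  008-009 VT  010-034 MA  035-039 RI  040-049 CT
050-134 NY  135-158 NJ  159-211 PA  212-220 MD  221-222 DE  223-231 VA
232-236 WV  237-246 NC  247-251 SC  252-260 GA  261-267 FL  268-302 OH
303-317 IN  318-361 IL  362-386 MI  387-399 WI  400-407 KY  408-415 TN
416-424 AL  425-428 MS  429-432 AR  433-439 LA  440-448 OK  449-467 TX
468-477 MN  478-485 IA  486-500 MO  501-502 ND  503-504 SD  505-508 NE
509-515 KS  516-517 MT  518-519 ID  520 WY      521-524 CO  525 NM
526-527 AZ  528-529 UT  530 NV      531-539 WA  540-544 OR  545-573 CA
574 AK      575-576 HI  577-579 DC  580 VI      581-584 PR  585 NM
586 PI      587-588 MS  589-595 FL  596-599 PR  600-601 AZ  602-626 CA
627-645 TX  646-647 UT  648-649 NM
"""


def parse_table(text):
    """Map every geographic code (as an int) to its state abbreviation."""
    tokens = text.split()
    areas = {}
    for span, state in zip(tokens[0::2], tokens[1::2]):
        low, _, high = span.partition("-")
        # A single code such as "530" has no upper bound, so reuse the lower one.
        for number in range(int(low), int(high or low) + 1):
            areas[number] = state
    return areas


AREA_TO_STATE = parse_table(TABLE)


def areas_for(state):
    """List every geographic code the table assigns to a state."""
    return sorted(n for n, s in AREA_TO_STATE.items() if s == state)


print(AREA_TO_STATE[530])         # NV
print(areas_for("NM"))            # [525, 585, 648, 649]
print(len(areas_for("NM")) * 99)  # 396
```

Note that by this table New Mexico has four codes, not just 525, so the 99 guesses mentioned earlier apply per code: 4 x 99 = 396 if you know nothing about which code applies.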


http://www.usrecordsearch.com/ssn.htm
http://en.wikipedia.org/wiki/Social_Security_number
http://www.ssa.gov/employer/ssnvhighgroup.htm