SocialPond

Things about society.

Friday, August 26, 2016

ACS 2014 1-year PUMS for database/IT professional

Continuing with the ACS PUMS database project, the task this time is to import the 2014 1-year ACS PUMS product.

The data was imported successfully, but only after overcoming a major hurdle: problems with the data dictionary published by Census. Besides editing errors, there are a few structural issues that would need a structural rule or guidance from Census to ensure consistency in future operations. We point these out as we list all the issues encountered.

Here are all the issues encountered while processing the 2014 1-year ACS PUMS data dictionary file, PUMSDataDict14.txt, tagged Oct. 27, 2015:
    Variable: OCPIP
        '101 101% or more' -> '101 .101% or more'
    Variable: INTP
        ->  '-09999..-00001 .Loss of $1 to $9999 (Rounded and bottom-coded)'
    Variable: NWAB
        Split 'NWAB 1 (UNEDITED - See "Employment Status Recode" (ESR))' into two lines after the '1'.
        * If Census made this format standard (an optional comment enclosed in '()' after the
          variable name and length), it could be incorporated into the program.
        * For now, since multi-line variable descriptions are allowed, splitting is easier.
    Variable: NWAV, NWLA, NWLK, NWRE
        Similar to NWAB
    Variable: NWLK
        The description of value 'b' occupies multiple lines.
        * If Census made this format standard (additional lines led by spaces, followed by
          text with a leading '.'), it could be incorporated into the program.

    Variable: RAP
        -> '00001..99999 .$1 to $99999 (Rounded)'
    Variable: RETP
        -> '000001..999999 .$1 to $999999 (Rounded and top-coded)'
    Variable: WKHP
        Value 'bb . ...' occupies two lines.
    Variable: ESP
        At first sight, this looks like a multi-line value description. However, the line
        '.Living with two parents:' is actually a qualifier for the values that follow.
        * The minimum effort from Census would be to make this a standard:
          a value description ending with ':' is intended as a qualifier for the upcoming values.
    Variable: NAICSP
        -> '311811 .MFG-RETAIL BAKERIES'
        -> '3399ZM .MFG-MISCELLANEOUS MANUFACTURING, N.E.C.'
        -> '5191ZM .INF-OTHER INFORMATION SERVICES, EXCEPT LIBRARIES AND
              ARCHIVES, AND INTERNET PUBLISHING AND BROADCASTING AND WEB
              SEARCH PORTALS'
    Variable: OCCP
        'OCCP4' -> 'OCCP 4'
    Variable: POVPIP
        '501501 percent or more' -> '501 .501 percent or more'
    Variable: RACAS
        Should be RACASN
    Variable: RACPI
        Extra line before RACPI
    Variable: VPS
        Similar to ESP above
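
To make the value-line conventions suggested above concrete, here is a minimal Python sketch. It is not the program used for this import. It assumes, based only on the corrected lines listed above, that a value line looks like '<code> .<description>' or '<low>..<high> .<description>', that any other non-blank line continues the previous description, and that a description ending with ':' is a qualifier for the values that follow. The names VALUE_RE and parse_value_lines are hypothetical.

```python
import re

# Assumed value-line shape, inferred from the corrected lines in this post:
#   <code> .<description>        e.g. '101 .101% or more'
#   <low>..<high> .<description> e.g. '00001..99999 .$1 to $99999 (Rounded)'
VALUE_RE = re.compile(
    r"^\s*(?P<low>\S+?)(?:\.\.(?P<high>\S+))?\s+\.(?P<desc>.*)$"
)


def parse_value_lines(lines):
    """Parse the value lines of one variable into a list of dicts.

    Rules (assumptions, not a Census standard):
      * a line matching VALUE_RE starts a new value;
      * a non-matching line ending with ':' is a qualifier that applies
        to the values after it, until the next qualifier;
      * any other non-matching line continues the previous description.
    """
    values = []
    qualifier = None
    for raw in lines:
        line = raw.rstrip()
        if not line.strip():
            continue  # skip blank lines
        match = VALUE_RE.match(line)
        if match:
            low = match.group("low")
            values.append({
                "low": low,
                "high": match.group("high") or low,
                "desc": match.group("desc").strip(),
                "qualifier": qualifier,
            })
            continue
        # No code on this line: drop an optional leading '.' and classify it.
        text = line.strip().lstrip(".").strip()
        if text.endswith(":"):
            qualifier = text.rstrip(":")
        elif values:
            values[-1]["desc"] += " " + text
        else:
            raise ValueError(f"continuation line with no value before it: {raw!r}")
    return values


if __name__ == "__main__":
    sample = [
        "    .Living with two parents:",
        "    1 .Example value (made up for illustration)",
        "    5191ZM .INF-OTHER INFORMATION SERVICES, EXCEPT LIBRARIES AND",
        "          ARCHIVES, AND INTERNET PUBLISHING AND BROADCASTING AND WEB",
        "          SEARCH PORTALS",
    ]
    for value in parse_value_lines(sample):
        print(value)
```

Under these rules, unedited lines such as '101 101% or more' or '501501 percent or more' do not match the pattern at all, so a loader could flag them instead of silently treating them as continuation text.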

After these lines were adjusted for the current program, processing went smoothly.
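
The variable header convention proposed under NWAB (an optional '()' comment after the variable name and length) could be handled along the following lines. This is only a sketch under that assumption, not the program used here; HEADER_RE and parse_header are hypothetical names, and the pattern is inferred solely from the header examples quoted in this post.

```python
import re

# Assumed header shape: <NAME> <LENGTH> [(<comment>)]
# e.g. 'OCCP 4' or 'NWAB 1 (UNEDITED - See "Employment Status Recode" (ESR))'
HEADER_RE = re.compile(
    r"^(?P<name>[A-Za-z]\w*)\s+(?P<length>\d+)"
    r"(?:\s+\((?P<comment>.*)\))?\s*$"
)


def parse_header(line):
    """Return (name, length, comment) for a variable header line.

    comment is None when the header has no parenthesized comment.
    Raises ValueError for lines that do not fit the assumed shape,
    such as the fused 'OCCP4'.
    """
    match = HEADER_RE.match(line)
    if match is None:
        raise ValueError(f"not a variable header: {line!r}")
    return match.group("name"), int(match.group("length")), match.group("comment")


if __name__ == "__main__":
    print(parse_header('NWAB 1 (UNEDITED - See "Employment Status Recode" (ESR))'))
    print(parse_header("OCCP 4"))
```

With a rule like this, the NWAB-style headers would not need to be split by hand, and a fused header such as 'OCCP4' would be rejected early rather than being read as a different variable name.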
