SocialPond

Things about society.

Sunday, January 28, 2018

Population migration derived from ACS 2011 5-year PUMS dataset



This is a data release of working-age population migration based on the ACS 2011 5-year PUMS. This article provides the same information as in my previous article, Population migration derived from ACS 2011 5-year PUMS dataset.

The released spreadsheet shows the population that moved from each US state or foreign country into each US state, including in-state moves. It should be emphasized that, since these numbers are based on sampling, they are for reference only. To get a sense of the possible error, the MOE (margin of error) should be consulted. The spreadsheet can be accessed via Google Drive.

As an example, the spreadsheet shows that, from 2007 to 2011, on average about 46, 117, and 15 people per year with a doctoral degree moved into Nebraska from France, China, and Jamaica, respectively.


Tuesday, January 23, 2018

Educational Attainment of Nebraska's Working Age Population via 2016-12 ACS PUMS

This is a data release for Nebraska's working age population. The working age is defined as 22 to 64, inclusive. The data is based on the PUMS (Public Use Microdata Sample) data released by the US Census Bureau's American Community Survey.

The table below presents the number of people between the ages of 22 and 64 at each level of educational attainment.

Ed. Attainment    Population    Low (90% MOE)    Hi (90% MOE)    Percent
1. LssHsDgr       87,691        85,034           90,348          8.5%
2. HsDgrEqv       241,063       236,468          245,658         23.4%
3. SomeCllg       250,798       245,393          256,203         24.4%
4. AssctDgr       116,124       112,680          119,568         11.3%
5. BchlrDgr       234,181       228,870          239,492         22.7%
6. Mstr           71,874        69,336           74,412          7.0%
7. FP             16,806        15,641           17,971          1.6%
8. Drs            11,309        10,230           12,388          1.1%

Detailed data with sampling weights can be downloaded from Google Drive. The weights allow users to aggregate the presented educational attainment levels.
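As a rough illustration of how categories could be combined, here is a minimal Python sketch. It assumes (this is not confirmed by the released file) that each category comes with a full-sample estimate plus 80 replicate estimates, one per ACS PUMS replicate weight, and it uses the standard ACS PUMS replicate-weight formula: the standard error is the square root of 4/80 times the sum of squared differences between each replicate estimate and the full-sample estimate, and the 90% MOE is 1.645 times that. The function names and the input layout are hypothetical.

```python
import math

N_REPLICATES = 80  # ACS PUMS ships 80 replicate weights per record


def moe90(estimate, replicate_estimates):
    """90% margin of error from the ACS PUMS replicate-weight formula."""
    if len(replicate_estimates) != N_REPLICATES:
        raise ValueError("expected 80 replicate estimates")
    variance = 4.0 / N_REPLICATES * sum(
        (rep - estimate) ** 2 for rep in replicate_estimates
    )
    return 1.645 * math.sqrt(variance)


def combine(categories):
    """Add up categories given as (estimate, replicate_estimates) pairs.

    The MOE of the combined category must be recomputed from the summed
    replicate estimates; the MOEs themselves cannot simply be added.
    """
    total = sum(estimate for estimate, _ in categories)
    replicates = [
        sum(reps[i] for _, reps in categories) for i in range(N_REPLICATES)
    ]
    return total, replicates
```

For example, to merge two degree levels into one, pass both to `combine` and then call `moe90` on the result to get the MOE of the merged level.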


Migration of Nebraska Working Age Population via 2016-12 ACS PUMS


This is a data release concerning Nebraska's working age population. The working age is defined as 22 to 64 years old, inclusive.

The table below shows the estimated net number of people between the ages of 22 and 64 who moved into Nebraska per year between 2012 and 2016.

Ed. Attainment      Net (In) Migration    Low (90% MOE)    Hi (90% MOE)
1. No HS Dgr        -203                  -1056            650
2. HS Graduated     447                   -680             1574
3. Some College     294                   -1111            1699
4. Associate Dgr    366                   -545             1277
5. Bachelor Dgr     -953                  -2253            347
6. Graduate Dgr     -637                  -1547            273
* 20180124: Numbers verified.

Detailed data with sampling weights can be downloaded from Google Drive.


Monday, January 22, 2018

Importing ACS 2016-12 5-Year PUMS data


The US Census Bureau released the 2016-12 ACS 5-year PUMS data on Jan. 18, 2018. I have spent a few days trying to import the data into my database.

For those who follow my blog, you may be curious about what happened with the dictionary file this time. For the most part, everything went through, except that the Census Bureau decided to add more section titles. In the past, there were only two sections: one for housing records and one for person records. This time, the Census Bureau subdivided each section into more sections with sub-section titles. It also added a section title that denotes the end of all definitions, which is very good, even though I simply deleted it rather than adding code to detect it. If the Census Bureau continues to supply that title, it would be worth modifying my code to make use of it.

Well, it all sounds too good to be true, and it is. This time, my hurdle isn't the dictionary file; it is the size of the data. I suppose that, due to the increase in the US population, there are more data records this time around, and they exceed the capacity of my software. When I was checking the integrity of my import, I noticed that data were missing. It took me a while to figure out what was going on. Once I found the reason, I wasn't sure how to proceed. I finally decided to import the extra data manually into a separate database, knowing that I will have extra difficulty when trying to use the data: I will likely need to handle those data manually when retrieving them, too.

For now, this is probably the best way to handle it, even though it will create extra work when doing analysis. Taking other routes now would not only take time but also bring uncertainties that could take even more time to resolve.

Well, my plate is full. After performing the targeted analysis, I will need to spend time figuring out a new approach. At this point, my structure does not allow storing data in more than one unit; doing so would break my current code, and I would have to be manually involved when doing analysis. Allowing multiple units means changing the structure, which will affect a lot of code that has already been written. The other option is to look for software that can handle a larger unit. That could mean spending money on commercial-grade software or trying out other open source software, which can mean redoing a major part of my importing code. I am worried about the speed too. At this point, I can import the whole 5-year ACS data in, say, a few hours. Depending on how good the new software is, this could change by orders of magnitude.

Well. We will see.


Wednesday, December 20, 2017

Educational Attainment of Nebraska's Working Age Population via 2011-07 ACS PUMS

For this data release, the working age is defined as 22 to 64, inclusive.

This is a quick release of Nebraska's data; the full data for every state will follow.

Last year I summarized the data manually: basically, I wrote database queries with a temporary mapping table that translates the ACS educational attainment levels to our desired levels. This year, I formalized some features in the database and tried to build queries based on those formalized features.

The Nebraska data is presented in the following table:
Ed. Attainment Level                Head Count    90% MOE    Possible Range
1. No High School Diploma           84,417        3,198      81,219 to 87,614
2. Has High School Diploma          258,046       4,404      253,641 to 262,450
3. Some College Exp. - No Degree    255,970       5,053      250,917 to 261,022
4. Associate's Degree               109,736       2,995      106,740 to 112,731
5. Bachelor's Degree                212,931       3,753      209,178 to 216,683
6. Master's Degree                  59,012        2,146      56,865 to 61,158
7. First Professional Degree        17,250        1,020      16,230 to 18,269
8. Doctor's Degree                  9,335         975        8,359 to 10,310

The released data can be accessed through Google Drive here. The released data includes all the estimates needed to combine different degree levels. For data related to other states, please follow this link.

Related articles: 
Nebraska Brain Drain Migration and Ed. Attainment, 2015 United States ACS 


Wednesday, December 13, 2017

Migration of Nebraska Working Age Population via 2011-07 ACS PUMS


This is simply a release of migration data for Nebraska's working age population based on the ACS (American Community Survey) 2011-2007 5-year PUMS (Public Use Microdata Sample) data released by the US Census Bureau.

Last year, after publishing the series of articles about population migration in the US, I re-examined what I did and spent time revising the approach to use more R code rather than manually preparing and running SQL queries. This year, after comparing my R processes for 2015-2011 and 2010-2006, I decided to restructure the R code in an attempt to extract most of the common code to be shared; hopefully, this will reduce the time spent maintaining the code in the future. I intend to create R code to replace last year's process for educational attainment too.

For this data release, the working age is defined as 22 to 64, inclusive.

Education Degree                    Net (In) Migration    90% MOE    Possible Range
1. No High School Diploma           2,130                 1,037.3    1092 to 3167
2. Has High School Diploma          35                    1,247.1    -1213 to 1282
3. Some College Exp. - No Degree    1,501                 1,493.2    7 to 2994
4. Associate's Degree               153                   824.8      -672 to 977
5. Bachelor's Degree                89                    1,260.9    -1172 to 1349
6. Graduate Degree                  -1,733                935.0      -2669 to -798

As can be seen from the table above, for each year of that five-year period, an estimated net 1,733 people with a graduate degree moved out of Nebraska. Since the number is derived from sampling, with 90% certainty the true number lies between 798 and 2,669. So it is very likely (90% certainty) that Nebraska lost between 798 and 2,669 people with a graduate degree every year during the five-year period.

For the population with a bachelor's degree, the net migration pattern isn't as clear-cut as for those with a graduate degree since, with 90% certainty, the net can vary from 1,172 moving out to 1,349 moving in.
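The reading of the table described above can be written down as a small Python sketch: the possible range is the estimate plus or minus the 90% MOE, and the direction of net migration is only clear when that range does not cross zero. The function names are made up for illustration, and the ranges computed from the rounded table values can differ from the published ranges by a unit or so.

```python
def moe_range(estimate, moe):
    """Return the (low, high) range implied by an estimate and its MOE."""
    return estimate - moe, estimate + moe


def direction(estimate, moe):
    """Classify net migration at the confidence level of the MOE."""
    low, high = moe_range(estimate, moe)
    if low > 0:
        return "net in"
    if high < 0:
        return "net out"
    return "unclear"  # the range includes zero


# Values taken from the table above.
print(direction(-1733, 935.0))  # graduate degree
print(direction(89, 1260.9))    # bachelor's degree
```

The graduate degree row falls entirely below zero, while the bachelor's degree row straddles zero, which is why only the former supports a firm statement about people moving out.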


The released data file can be accessed here through Google Drive. The released data includes all the weights needed to combine education categories if so desired. For data concerning other states, please follow this link.


Related articles: 
Nebraska Brain Drain Migration and Ed. Attainment, 2015 United States ACS 


Sunday, December 10, 2017

ACS 2011-07 5-Year PUMS data


This brief is based on the notes I took when I imported the ACS 2011-07 5-year PUMS file.

As briefly noted at the end of my article describing my process of importing the ACS 2016 1-year PUMS data, I did spend time automating the process further. I believe I also tested the new process. But when it was time to import the ACS 2011-07 PUMS, I had largely forgotten the proper steps, and it took a bit of time to re-familiarize myself with the process. Because of the additional automation, the amount of notes has been reduced tremendously.

These are brief notes that may help people who take the same route as I did. The data dictionary for this data product was provided as a .pdf file. With my PDF file reader, I copied and pasted it into a text file.

With a bit of imagination, I was able to use regular expressions (RE) to help format the text file into what is acceptable to my program. The steps include removing leading spaces, eliminating page numbers, adding an empty line before each variable definition, etc.
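The clean-up steps above could look roughly like the following Python sketch. The two patterns are assumptions about the pasted text, not the actual expressions I used: a page number is taken to be a line containing only digits, and a variable definition is taken to be an upper-case name followed by a numeric field width (for example `ADJINC 7`).

```python
import re

# Assumed layout of the pasted dictionary text (hypothetical patterns).
PAGE_NUMBER = re.compile(r"^\d+$")
VARIABLE_DEF = re.compile(r"^[A-Z][A-Z0-9]*\s+\d+$")


def clean_dictionary(text):
    """Strip surrounding whitespace, drop page-number lines, and put an
    empty line before each variable definition."""
    cleaned = []
    for raw in text.splitlines():
        line = raw.strip()
        if PAGE_NUMBER.match(line):
            continue  # a bare number is treated as a page number
        if VARIABLE_DEF.match(line) and cleaned and cleaned[-1] != "":
            cleaned.append("")  # separate definitions with one blank line
        cleaned.append(line)
    return "\n".join(cleaned)
```

The patterns would need adjusting to whatever the PDF reader actually produces; for instance, a value code that sits alone on a line would be mistaken for a page number.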

Besides what is mentioned above, the major problem with this file is the multi-line 'Note:'. The notes before the following variables spanned multiple lines: ADJINC, WGTP, AGS, ST, NATIVITY, PAOC, and VPS. I think I will consider adding the handling of multi-line notes to my dictionary verification code.
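One possible way to handle a multi-line note is to fold its continuation lines into a single line before verification. The Python sketch below assumes that a note starts with `Note:` and runs until the next empty line; that boundary rule and the function name are hypothetical, and this is not code from my verification program.

```python
def fold_notes(lines):
    """Join the continuation lines of each 'Note:' into one line."""
    folded = []
    in_note = False
    for line in lines:
        if line.startswith("Note:"):
            folded.append(line)
            in_note = True
        elif in_note and line.strip():
            # Non-empty line inside a note: append it to the note line.
            folded[-1] += " " + line.strip()
        else:
            # An empty line ends the note; other lines pass through as-is.
            in_note = False
            folded.append(line)
    return folded
```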

A few problems were found with the current program, but they were fixed.

The data were imported smoothly. The database was created on Dec. 1, 2017 for both the person and housing data elements.



Wednesday, December 06, 2017

Adult College Enrollment - IPEDS data


As described in my previous articles, I have been working on importing all IPEDS (Integrated Postsecondary Education Data System) data into my database. After spending years on this project, I was able to import about 15 years' worth of IPEDS data into the database with verification. There are still a few files that would be hard to handle properly without major effort; these basically contain long lines with embedded newline characters.

That being said, I was eager to run a test case with these IPEDS data. As fate would have it, my previous articles were about adult college enrollment, and it happens that college enrollment can be approximated from the IPEDS enrollment data.


For this article, the IPEDS fall (semester) enrollment data were first examined via my R interface code, which allows searching and checking definitions across data years. In this particular case, the R code revealed that, in 2009, the 'first professional' enrollment level disappeared from the level definitions. Examining the IPEDS documentation verified that, from 2009 on, first professional enrollment is to be reported in the graduate enrollment level. Since I am somewhat familiar with the IPEDS data collection, I knew that the enrollment age data were not mandatory for even-numbered years. Had I not known, a quick piece of R code checking the total for each year would have revealed that.

Since IPEDS data are tagged only with the college id (unitid), extra steps were needed to tag the data with attributes of the college. These attributes are made available through the so-called 'Institutional Characteristics' survey. With this survey, colleges can be tagged with control (public/private), level (4- or 2-year college), location (state, address, ...). For this project, it was found that, in 2011, three institutions did not report appropriate information for the 'Institutional Characteristics' survey. Luckily, information for two of them was available from other years. To preserve as much data as possible, we fixed those two with information from other years and coded the third with a special code so that we can include it if we so desire. For this article, we include all institutions covered by IPEDS, including institutions located in US territories and miscellaneous islands. To list a few, this includes AS (American Samoa), GU (Guam), PR (Puerto Rico), MH (Marshall Islands), etc.
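The tagging step described above amounts to a lookup by unitid with a fallback. Here is a minimal Python sketch of the idea. The record layout, the field names other than `unitid`, the `"unknown"` special code, and the choice of the nearest available year as the fallback are all assumptions made for illustration; the actual work was done in my database.

```python
UNKNOWN = "unknown"  # hypothetical special code for institutions with no info


def lookup_characteristics(unitid, year, characteristics):
    """Find an institution's attributes for a year, falling back to the
    nearest other year, then to the special code.

    characteristics maps (unitid, year) to a dict of attributes.
    """
    record = characteristics.get((unitid, year))
    if record is not None:
        return record
    other_years = [
        (abs(y - year), y) for (u, y) in characteristics if u == unitid
    ]
    if other_years:
        _, nearest = min(other_years)
        return characteristics[(unitid, nearest)]
    return {"control": UNKNOWN, "level": UNKNOWN, "state": UNKNOWN}


def tag_enrollment(rows, characteristics):
    """Attach institutional attributes to each enrollment row."""
    return [
        {**row, **lookup_characteristics(row["unitid"], row["year"], characteristics)}
        for row in rows
    ]
```

Keeping the unmatched institution under a distinct code, rather than dropping it, is what makes it possible to include or exclude it later.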


With the previous adult college enrollment article in mind, undergraduate enrollment from IPEDS was considered the better approximation to the enrollment from the ACS data.


Examining the IPEDS age data, I noticed that not all data were collected with equal age spans. For example, data are collected with age categories like 18 to 19, 22 to 24, 25 to 29, etc. Presenting age data directly with these age categories results in the following chart, which can trick the reader into thinking that there is a bump in the age distribution; that certainly does not look like the age distribution presented in my previous article.

Age distribution using IPEDS age categories

A better approach would be to use the average head count per year of age for each age category instead. Better yet, you can assign that average to each age in the category to provide a better representation along the age axis.
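The averaging can be sketched in a few lines of Python. Each category's head count is divided by the number of ages it spans, and the result is assigned to every age in the category. The counts below are made-up numbers for illustration, not IPEDS data, and open-ended categories (such as the oldest group) would need an assumed span.

```python
def per_age_average(categories):
    """Spread each category's head count evenly over its ages.

    categories is a list of (first_age, last_age, head_count) tuples.
    Returns a dict mapping each single age to its average head count.
    """
    per_age = {}
    for first_age, last_age, head_count in categories:
        span = last_age - first_age + 1
        for age in range(first_age, last_age + 1):
            per_age[age] = head_count / span
    return per_age


# Made-up head counts, using unequal age spans like those in IPEDS.
example = [(18, 19, 200), (20, 21, 180), (22, 24, 150), (25, 29, 100)]
for age, count in sorted(per_age_average(example).items()):
    print(age, count)
```

With this normalization, a five-year category no longer looks larger than a two-year category simply because it covers more ages.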

In this article, the average assigned to each category is used. To approximate the college enrollment data in my previous ACS-based article, we present the age distribution of total enrollment, the sum of full-time and part-time students. As shown in the following graph, the curve exhibits a familiar monotonic decrease after the primary peak around college graduation.

Age distribution for total fall enrollment using average for each age category

Since the IPEDS data also allow separating full-time from part-time enrollment, it is worth the effort to examine these characteristics too. The overall (sum over states) full-time distribution can be seen in the chart below.
Age distribution for full-time fall college enrollment using average for each age category

A typical age distribution for a state (NE) can be seen below. For most states, the only difference is whether the age group 18 to 19 or the age group 20 to 21 is the highest. The full-time fall enrollment age distribution for Utah, however, shows a very different distribution (see the chart below). This may be related to the Mormon missionary program, but more evidence from other surveys or data elements would be needed.
A typical state (NE) age distribution for full-time enrollment using average for each category

Full-Time fall enrollment age distribution for the state of Utah
The IPEDS universal total for part-time fall enrollment can be seen in the chart below. Compared to full-time and total enrollment, it clearly shows a distinctive age distribution. For some states, the part-time fall enrollment is similar to that of the IPEDS universal total, as shown below (NE). There is, however, another set of states (e.g. CA, FL, GA, etc.) that show quite a different age distribution pattern. Part-time students in these states seem to take a break from school (to work?) and come back to enroll later.
IPEDS universal age distribution for part-time fall enrollment

Part-Time fall enrollment for the State of Nebraska

Part-Time fall enrollment for the State of California

Examining the IPEDS universe part-time enrollment trend by year, we noticed that there were more younger students in recent years. Presenting the same data in percentages shows that, proportionally, older adults were taking a smaller share of part-time enrollment in recent years.
Age distribution of part-time students in percents
