vignettes/duplicate_datasets.Rmd
The following datasets are currently supported by cchsflow:
Starting with the 2009-2010 CCHS survey cycle, Statistics Canada began releasing more than one type of data file for the same survey period. cchsflow includes two of these types: two-year data files and one-year data files. A two-year data file is a combined file that contains respondents from both years, along with the common variables that were asked in both years. A one-year data file contains respondents from a single year, along with the common variables and the variables that were optional in that year. The two-year data files do not contain optional variables that were asked in only one year.
This means that the respondents from the CCHS PUMF 2010 dataset are also included in the CCHS PUMF 2009-2010 dataset. The CCHS PUMF 2009-2010 dataset, however, does not include the optional variables found in the CCHS PUMF 2010 dataset. The same applies to the CCHS PUMF 2011-2012 and CCHS PUMF 2012 datasets, and to the CCHS PUMF 2013-2014 and CCHS PUMF 2014 datasets. cchsflow supports both types of data files so that, as the number of supported variables grows, optional variables can be captured and harmonized across as many cycles as possible.
Take care that your research does not include the same respondent more than once, which happens if a one-year data file is combined with the two-year data file that covers the same year. If your research focuses primarily on common variables, we recommend using the two-year data files (i.e., CCHS 2009-2010, 2011-2012, and 2013-2014). If your research requires optional variables that are not included in the two-year data files, we recommend using the one-year data files (i.e., CCHS 2010, CCHS 2012, and CCHS 2014).
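
The check described above can be sketched in base R. This is a minimal illustration, not part of cchsflow: the helper `find_duplicate_cycles()` is hypothetical, and the dataset names (such as `cchs2010_p`) are assumed labels for the one-year and two-year files. Substitute whatever names you use for your own data.

```r
# Hypothetical lookup table (assumed names): each one-year file maps to the
# two-year file that already contains the same respondents.
overlapping_cycles <- c(
  "cchs2010_p" = "cchs2009_2010_p",
  "cchs2012_p" = "cchs2011_2012_p",
  "cchs2014_p" = "cchs2013_2014_p"
)

# Hypothetical helper: given the names of the datasets you plan to combine,
# return the one-year/two-year pairs that would repeat respondents.
find_duplicate_cycles <- function(selected) {
  # One-year files present in the selection
  one_year <- intersect(names(overlapping_cycles), selected)
  # Keep only those whose matching two-year file is also selected
  clashes <- one_year[overlapping_cycles[one_year] %in% selected]
  overlapping_cycles[clashes]
}

# Example: the 2010 file overlaps with the 2009-2010 file, the 2012 file does not
# overlap with anything else in this selection.
planned <- c("cchs2009_2010_p", "cchs2010_p", "cchs2012_p")
duplicates <- find_duplicate_cycles(planned)
if (length(duplicates) > 0) {
  message(
    "Respondents would be repeated: ",
    paste(names(duplicates), "is contained in", duplicates, collapse = "; ")
  )
}
```

A selection that uses only two-year files, or only one-year files, returns an empty result, which matches the recommendation to pick one file type per survey period.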