3.1 Sources
Although census data collection and processing have to meet high
quality standards, it is very difficult to eliminate all potential errors.
There are two kinds of population coverage error. Population undercoverage
refers to the exclusion of persons who should have been enumerated, and
population overcoverage refers to the inclusion of persons who were enumerated
more than once (generally twice). Overcoverage also includes persons who were
enumerated but should not have been. However, this second type of overcoverage
is considered negligible; consequently, it is not measured.
Undercoverage can occur in the first stage of the census if the
list of dwellings used for the dwelling universe is incomplete. This risk is
higher, for example, if a dwelling is under construction. Conversely,
overcoverage can occur if a dwelling is listed twice.
Coverage error can also occur during the field data collection
stage. Respondent error is responsible for coverage error when the person
completing the census form omits someone whose usual place of residence,
according to census rules, is the dwelling concerned; this is undercoverage.
The person may also include someone whose usual place of residence is not the
dwelling concerned; there is overcoverage if this person has already been
enumerated at their usual place of residence or somewhere else. In most cases,
it is easy to determine a person’s usual place of residence. However, as stated
in the previous section, the process is sometimes more complex, and special
rules have been developed for determining an individual’s usual place of
residence. The rules are spelled out in the census questionnaire, but the list
is long, and there can be comprehension difficulties. Coverage error may result
when the rules are not consulted or are incorrectly applied. The idea of using
Census Day as the reference date for determining usual residence may also be
misunderstood, which can lead to coverage error.
Coverage errors may also be committed during the processing stage
at any point where records for persons or households are added to or removed
from the census database. Records can be deleted by mistake. Questionnaires may
be linked to the wrong record or returned too late to be included.
Even though efforts are made to enumerate the homeless population,
the risk of undercoverage is high. Some other living arrangements are also
susceptible to coverage error. For example, young adults newly away from home
may be either undercovered, because neither their roommates nor their parents
include them in the census questionnaire, or overcovered, because they are
included in both census questionnaires. Persons who maintain a second residence
because of their employment can also cause coverage error.
Users should also be aware of the
extent to which Indian reserves and Indian settlements participated in the 2016
Census. In some cases, enumeration was not permitted by the community or was
interrupted before it could be completed. These geographic areas (14 in all in
2016) are considered incompletely enumerated Indian reserves and settlements.
There are no 2016 data for incompletely enumerated Indian reserves and
settlements, and those areas are not included in the totals. Similar problems
have occurred in previous censuses. For example, 22 Indian reserves and
settlements were incompletely enumerated in the 2006 Census, and 31 in the 2011
Census. Of those reserves and settlements, 20 participated in the 2016 Census.
The demographic estimates for the 14 incompletely enumerated Indian
reserves and settlements are based on a model. However, since no reliable
source is available to verify the assumptions in the model, the estimates must
be used with caution. For more information, see Section 12.2.
3.2 Control
Potential sources of coverage error were recognized during the
planning stage of the 2016 Census, and the following measures were taken to
minimize the associated risks:
- Collection unit (CU) boundaries were carefully defined and mapped to ensure
that no geographic areas were left out or included twice.
- List/leave areas: The enumerator’s manual contained instructions on how to
enumerate a CU so as to minimize the risk of missing dwellings. The total
number of dwellings from the 2011 Census was provided to field operations
supervisors to help them identify significant changes. In addition, when the
listing operation resulted in a substantial difference in the number of
dwellings relative to the 2011 Census, the listing was checked. Lastly,
specific quality control procedures were applied to the CU to evaluate and
correct any changes made in the listing.
- Mail-out areas: Mail-out was based on a list of addresses from Statistics
Canada’s Address Register. This list was updated regularly, and listing
activities were carried out mainly in the fastest-growing areas. These
listing activities were carried out continuously, but more intensively in the
two years preceding the census. As a result of the listing operations, nearly
30% of the addresses in the mail-out areas were checked. The work of
enumerators was closely monitored. Some collective dwellings had to be checked
by field staff to verify their occupancy status before the collection stage;
those found to be occupied were identified and included in the census.
- Special procedures were developed for the enumeration of persons who have
difficulty responding (e.g., people who are fluent in neither English nor
French, or are illiterate) and persons located in specific parts of large
cities where response or coverage was poor in the past.
- Special procedures were defined for the enumeration of the population
residing on Indian reserves.
- Advertisements informed Canadians about the census and indicated what to do
if they did not receive a questionnaire.
- The Census Help Line (CHL) was available to answer any questions about the census, including
questions about coverage.
- There was a “Whom to include” section in the questionnaire so respondents
could determine which persons should be included. Also, almost 70% of the
responses to the 2016 Census were obtained through the Internet, and the
electronic questionnaire included additional verification questions when
respondents reported a dwelling as unoccupied or non-existent, or when they
had a problem determining whether a person should be included.
- In the questionnaire, respondents were asked to indicate whether there were
people who had not been listed because they were not sure they should be
included. The electronic questionnaire provided guidance so respondents could
make the right decision. In other cases, a telephone follow-up was
subsequently carried out with the respondent to determine whether the persons
in question should be listed in the questionnaire.
- Telephone follow-up was carried out after questionnaires were reviewed for
coverage inconsistencies or to verify household status, including
questionnaires containing only foreign residents or persons temporarily
present.
- Non-response follow-up included a dwelling coverage check.
These procedures, along with appropriate staff training,
supervisory checks and quality controls during the collection and processing
stages, helped to reduce the number of coverage errors.
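The dwelling-count check described for list/leave areas can be illustrated
with a minimal sketch. It assumes that each CU has a dwelling count from the
new listing and a count from the 2011 Census, and that a listing is sent for
review when the relative change exceeds a threshold. The function name, the
CU identifiers, the counts and the 20% threshold are all hypothetical; the
rule actually used in the field is not specified here.

```python
def flag_for_recheck(listed_2016, counted_2011, threshold=0.20):
    """Return the CU ids whose dwelling count changed by more than `threshold`.

    The threshold is a hypothetical value chosen for illustration only.
    """
    flagged = []
    for cu_id, new_count in listed_2016.items():
        old_count = counted_2011.get(cu_id)
        if not old_count:
            # No usable 2011 baseline (missing or zero): always review.
            flagged.append(cu_id)
            continue
        relative_change = abs(new_count - old_count) / old_count
        if relative_change > threshold:
            flagged.append(cu_id)
    return flagged


# Hypothetical dwelling counts per collection unit.
listed = {"CU-001": 105, "CU-002": 160, "CU-003": 48}
baseline = {"CU-001": 100, "CU-002": 120, "CU-003": 80}
print(flag_for_recheck(listed, baseline))
```

In this sketch both large increases and large decreases are flagged, since
either could indicate missed or double-listed dwellings.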
3.3 Definitions
Algebraic definitions of coverage errors are presented in this
section. Let $T$ denote the total or the “actual” number of persons targeted
by the Census of Population. Let $C$ denote the published census count of
persons in the target population. The error associated with using $C$ instead
of $T$ is as follows:
$$N = T - C$$
This error, denoted as $N$, is the net population coverage error.
Let $U$ denote population undercoverage, the number of persons not included
in $C$ who should have been.
The census count $C$ is composed of two elements:
$$C = E + I$$
Where:
$E$ is the number of persons enumerated. This is the number of persons
who were listed on a census questionnaire.
$I$ is the number of persons imputed. This is an estimate of the number of
persons missed because their dwelling was classified as occupied but
non-response or misclassified as unoccupied, therefore for which no follow-up
was done. For more information on whole household imputation (WHI), see
Section 3.6 of the Sampling and Weighting Technical Report, Census of
Population, 2016, Catalogue no. 98-306-X.
Undercoverage compared with the published census count is therefore what remains of the
persons who should have been listed on a census questionnaire and who were not
taken into account by the WHI. In other words, it does not include the estimate
of the number of persons who were not enumerated either because no completed
census questionnaire was returned for the dwelling (non-response dwelling) or
because the dwelling was misclassified as unoccupied (classification error) and
did not receive a questionnaire.
The concept of undercoverage before the WHI also exists. This is
what is referred to as Census of Population collection undercoverage. For more
information, see Section 12.1.
Let $O$ denote population overcoverage, the number of excess enumerations
included in $C$ that should not have been. $O$ has two components. One is the
excess enumerations of persons enumerated more than once. Coverage studies
focus on these excess enumerations. The second is persons who were enumerated
but who were not in the census target population. For example, foreign
residents visiting Canada who are listed on a census questionnaire as usual
residents of a dwelling should not be included in $C$. Fictitious persons are
another example. According to previous studies, the number of persons who are
enumerated but are not in the census target population is generally very small
and can be ignored. Consequently, census coverage does not measure this
component of coverage error.
Since $U$ refers to persons who were not enumerated but should be included in
$C$ and since $O$ denotes enumerations that should not be included in $C$, the
difference between $T$ and $C$ is $U$ less $O$. That is:
$$N = T - C = U - O$$
The actual number of persons in the census target population is
therefore:
$$T = C + U - O$$
In practice, for reasons of cost and timeliness of the data
produced, an estimate of $T$ is given by $\hat{T}$, based on sample studies,
where:
$$\hat{T} = C + \hat{U} - \hat{O}$$
$\hat{U}$ is an estimate of the number of persons not included in $C$ who
should have been, and $\hat{O}$ is an estimate of the number of persons
included in $C$ who should not have been. We can assume that overcoverage from
persons included in $C$ who are not in the census target population is zero,
since it is negligible. Consequently, $\hat{O}$ is simply an estimate of the
number of duplicate enumerations. The purpose of census coverage studies is to
determine the values of $\hat{U}$ and $\hat{O}$.
In summary, the actual population $T$ is composed of the census count $C$ and
the net undercoverage $N = U - O$. This is referred to as net undercoverage
because $U$ is generally larger than $O$ in the context of the current census
in Canada. However, the opposite is possible, whereby $N$ would be negative.
$C$ consists of $E$ plus the number of persons added in WHI, and this
imputation $I$ targets persons living in non-response dwellings or in occupied
dwellings misclassified as unoccupied.
Census population coverage errors can generally be
expressed as rates relative to the actual population. The undercoverage rate
$R_U$ is $U$ as a percentage of $T$. The overcoverage rate $R_O$ is $O$ as a
percentage of $T$. The net undercoverage rate $R_N$ is the difference between
$U$ and $O$ as a percentage of the census target population. These three rates
can be estimated by $\hat{R}_U$, $\hat{R}_O$ and $\hat{R}_N$, as follows:
$$\hat{R}_U = 100 \times \frac{\hat{U}}{\hat{T}}$$
$$\hat{R}_O = 100 \times \frac{\hat{O}}{\hat{T}}$$
$$\hat{R}_N = 100 \times \frac{\hat{U} - \hat{O}}{\hat{T}} = \hat{R}_U - \hat{R}_O$$
A positive net undercoverage rate $\hat{R}_N$ indicates that the undercoverage
rate is higher than the overcoverage rate. That is, the number of people not
included in the published census count $C$ is higher than the number of excess
enumerations. That is generally the case for all Canadian censuses. For some
domains of interest, however, negative net undercoverage is sometimes observed.
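As a worked illustration of the definitions in this section, the sketch below
computes the estimated population and the three coverage rates from a census
count, an undercoverage estimate and an overcoverage estimate. The class name,
the attribute names and all input figures are hypothetical and chosen only to
make the arithmetic easy to follow; they are not census results.

```python
from dataclasses import dataclass


@dataclass
class CoverageEstimates:
    census_count: float   # published census count
    undercoverage: float  # estimated persons missed by the census
    overcoverage: float   # estimated excess (duplicate) enumerations

    @property
    def net_undercoverage(self) -> float:
        # Net undercoverage = undercoverage minus overcoverage.
        return self.undercoverage - self.overcoverage

    @property
    def population(self) -> float:
        # Estimated target population = census count + net undercoverage.
        return self.census_count + self.net_undercoverage

    def rates(self) -> dict:
        # All rates are percentages of the estimated target population.
        total = self.population
        return {
            "undercoverage": 100 * self.undercoverage / total,
            "overcoverage": 100 * self.overcoverage / total,
            "net_undercoverage": 100 * self.net_undercoverage / total,
        }


# Hypothetical inputs for a small domain.
estimates = CoverageEstimates(
    census_count=950_000, undercoverage=60_000, overcoverage=10_000
)
print(estimates.population)
print(estimates.rates())
```

With these made-up inputs the estimated population is 1,000,000, so the
undercoverage, overcoverage and net undercoverage rates are 6%, 1% and 5%
respectively. A negative net rate would result if the overcoverage estimate
exceeded the undercoverage estimate.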
3.4 Evaluation
Two postcensal studies were carried out to estimate the 2016 Census
population coverage error. The Reverse Record Check (RRC) provided estimates
for population undercoverage, while the Census Overcoverage Study (COS)
estimated population overcoverage. As previously mentioned, the Dwelling
Classification Survey (DCS) does not contribute to census coverage error
estimates since census counts are already adjusted to take DCS results into
account.
The RRC and COS were conducted
subsequent to field collection and census processing operations. Preliminary
estimates of 2016 Census population coverage error were released on
March 29, 2018. Following an in-depth validation exercise with the
Demography Division and the provincial and territorial statistical focal
points, final estimates were released on September 27, 2018. The data
were released at the same time as the new official demographic estimates
reflecting the update of the base population to the 2016 Census. Census
population counts adjusted for net population undercoverage constituted the
updated estimates of the base population.
A brief description of the
methodology used in the two census coverage studies is presented below:
Reverse Record Check (RRC)
In the RRC, a random sample of individuals
representing the 2016 Census target population was selected from frames
independent of the census. These frames are described in Section 7.1. The 2016
RRC sample consisted of 67,872 persons in the provinces and 2,595 persons in
the territories. The 2016 Census database was then searched to determine
whether these persons had indeed been enumerated.
Where necessary, interviews were conducted, mostly via
computer-assisted telephone interviewing (CATI) from the regional offices
(ROs), to collect information for use in additional searches of the 2016 Census
database. An interview was completed for 82.1% of the 15,584 cases sent to the
ROs. The sampling weight was adjusted for non-response. Specifically, the total
sampling weight of non-respondents was redistributed among the respondents,
within groups of respondents whose response probability was most similar to
that of the non-respondents.
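The non-response adjustment can be sketched as a weighting-class adjustment.
The sketch assumes that every sampled person has already been assigned to a
group of similar response probability; how those groups are formed is not
covered here. Within each group, respondents' weights are inflated so that
they also carry the weight of the group's non-respondents. The function name,
the record layout and the sample values are hypothetical.

```python
from collections import defaultdict


def adjust_for_nonresponse(records):
    """Redistribute non-respondent weight to respondents within each group.

    Each record is a dict with keys 'group', 'weight' and 'responded'.
    Returns new records for respondents only, with adjusted weights.
    """
    total_weight = defaultdict(float)
    respondent_weight = defaultdict(float)
    for rec in records:
        total_weight[rec["group"]] += rec["weight"]
        if rec["responded"]:
            respondent_weight[rec["group"]] += rec["weight"]

    for group in total_weight:
        if respondent_weight[group] == 0:
            # No respondent can carry this group's weight; in practice the
            # group would have to be merged with a similar one.
            raise ValueError(f"group {group!r} has no respondents")

    adjusted = []
    for rec in records:
        if rec["responded"]:
            factor = total_weight[rec["group"]] / respondent_weight[rec["group"]]
            adjusted.append({**rec, "weight": rec["weight"] * factor})
    return adjusted


# Hypothetical sample: two response-probability groups.
sample = [
    {"group": "A", "weight": 10.0, "responded": True},
    {"group": "A", "weight": 10.0, "responded": True},
    {"group": "A", "weight": 20.0, "responded": False},
    {"group": "B", "weight": 5.0, "responded": True},
    {"group": "B", "weight": 5.0, "responded": False},
]
adjusted = adjust_for_nonresponse(sample)
print([rec["weight"] for rec in adjusted])
```

The adjustment preserves the total sampling weight: the adjusted respondent
weights sum to the same total as the original weights of all sampled persons.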
The estimate of population undercoverage is based on the number of
persons in the RRC sample who were classified as “missed.” These persons were
part of the target population for the 2016 Census, but no evidence of
enumeration could be found in the 2016 Census Response Database. Nationally,
4,821 persons in the RRC sample were classified as missed in the provinces and
1,128 in the territories.
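One common way to turn a classified sample into a population estimate is a
weighted total: each sampled person classified as missed contributes their
adjusted sampling weight. The sketch below illustrates that idea only; the
exact estimator used by the RRC is not given in this section, and the record
layout, status labels and weights are hypothetical.

```python
def estimate_undercoverage(sample):
    """Sum the weights of sampled persons classified as 'missed'."""
    return sum(person["weight"] for person in sample if person["status"] == "missed")


# Hypothetical classified sample with adjusted sampling weights.
classified = [
    {"status": "missed", "weight": 500.0},
    {"status": "enumerated", "weight": 480.0},
    {"status": "missed", "weight": 520.0},
    {"status": "out_of_scope", "weight": 510.0},
]
print(estimate_undercoverage(classified))
```

Here the two sampled persons classified as missed represent an estimated
1,020 persons missed in the population.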
Census Overcoverage Study (COS)
Overcoverage was measured by matching the final 2016 Census
database to itself, and then matching the final 2016 Census database and a list
of persons who should have been enumerated according to administrative data
sources. Matching relied on probabilistic linkage, which identifies matches
that are close but not exact. A sample of potential
duplicates was selected for each linkage, and demographic characteristics and
names were examined to identify true cases of overcoverage.
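The idea of matching a database to itself to find close but inexact matches
can be sketched as follows. This is a deliberately simplified stand-in, not
the linkage method used in the COS: it compares only records that share a
birth date and scores name similarity with Python's standard
`difflib.SequenceMatcher`. The field names, the sample records and the 0.85
threshold are hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations


def name_similarity(a, b):
    """Similarity between two names, from 0.0 (different) to 1.0 (identical)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def potential_duplicates(records, threshold=0.85):
    """Return (id, id, score) for record pairs that look like the same person."""
    pairs = []
    for left, right in combinations(records, 2):
        # Blocking step: only compare records with the same birth date.
        if left["birth_date"] != right["birth_date"]:
            continue
        score = name_similarity(left["name"], right["name"])
        if score >= threshold:
            pairs.append((left["id"], right["id"], score))
    return pairs


# Hypothetical records; the first two differ by a single letter in the name.
people = [
    {"id": 1, "name": "Marie Tremblay", "birth_date": "1990-04-12"},
    {"id": 2, "name": "Marie Trembley", "birth_date": "1990-04-12"},
    {"id": 3, "name": "John Smith", "birth_date": "1990-04-12"},
    {"id": 4, "name": "Marie Tremblay", "birth_date": "1985-01-30"},
]
for left_id, right_id, score in potential_duplicates(people):
    print(left_id, right_id, round(score, 2))
```

Only records 1 and 2 are flagged in this example. As in the study, flagged
pairs are only potential duplicates; deciding whether a pair is a true case of
overcoverage still requires examining names and demographic characteristics.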